A comparison of normalization methods for high density oligonucleotide array data based on variance and bias
When running experiments that involve multiple high density oligonucleotide arrays, it is important to remove sources of variation between arrays of non-biological origin. Normalization is a process for reducing this variation. It is common to see non-linear relations between arrays and the standard...
Published in: | Bioinformatics (Oxford, England), 2003-01, Vol.19 (2), p.185-193 |
---|---|
Main Authors: | BOLSTAD, B. M ; IRIZARRY, R. A ; ASTRAND, M ; SPEED, T. P |
Format: | Article |
Language: | English |
Subjects: | |
Citations: | Items that cite this one |
Online Access: | Get full text |
cited_by | cdi_FETCH-LOGICAL-c500t-2e5e5d22e6e1c10a681514b00d00fb8530740e8035d6492debe8d4a56f1360143 |
---|---|
cites | |
container_end_page | 193 |
container_issue | 2 |
container_start_page | 185 |
container_title | Bioinformatics (Oxford, England) |
container_volume | 19 |
creator | BOLSTAD, B. M ; IRIZARRY, R. A ; ASTRAND, M ; SPEED, T. P |
description | When running experiments that involve multiple high density oligonucleotide arrays, it is important to remove sources of variation between arrays of non-biological origin. Normalization is a process for reducing this variation. It is common to see non-linear relations between arrays and the standard normalization provided by Affymetrix does not perform well in these situations.
We present three methods of performing normalization at the probe intensity level. These methods are called complete data methods because they make use of data from all arrays in an experiment to form the normalizing relation. These algorithms are compared to two methods that make use of a baseline array: a one number scaling based algorithm and a method that uses a non-linear normalizing relation by comparing the variability and bias of an expression measure. Two publicly available datasets are used to carry out the comparisons. The simplest and quickest complete data method is found to perform favorably.
Software implementing all three of the complete data normalization methods is available as part of the R package Affy, which is a part of the Bioconductor project http://www.bioconductor.org.
Additional figures may be found at http://www.stat.berkeley.edu/~bolstad/normalize/index.html |
doi_str_mv | 10.1093/bioinformatics/19.2.185 |
format | article |
fullrecord | <record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_miscellaneous_72973847</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>72973847</sourcerecordid><originalsourceid>FETCH-LOGICAL-c500t-2e5e5d22e6e1c10a681514b00d00fb8530740e8035d6492debe8d4a56f1360143</originalsourceid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>198655441</pqid></control><display><type>article</type><title>A comparison of normalization methods for high density oligonucleotide array data based on variance and bias</title><source>Oxford Open</source><creator>BOLSTAD, B. M ; IRIZARRY, R. A ; ASTRAND, M ; SPEED, T. P</creator><identifier>ISSN: 1367-4803</identifier><identifier>EISSN: 1367-4811</identifier><identifier>DOI: 10.1093/bioinformatics/19.2.185</identifier><identifier>PMID: 12538238</identifier><identifier>CODEN: BOINFP</identifier><language>eng</language><publisher>Oxford: Oxford University Press</publisher><ispartof>Bioinformatics (Oxford, England), 2003-01, Vol.19 (2), p.185-193</ispartof><rights>Copyright Oxford University Press(England) Jan 22, 2003</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c500t-2e5e5d22e6e1c10a681514b00d00fb8530740e8035d6492debe8d4a56f1360143</citedby></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>314,780,784,27924,27925</link.rule.ids><backlink>$$Uhttp://pascal-francis.inist.fr/vibad/index.php?action=getRecordDetail&idt=14573510$$DView record in Pascal Francis$$Hfree_for_read</backlink><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/12538238$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>BOLSTAD, B. M</au><au>IRIZARRY, R. A</au><au>ASTRAND, M</au><au>SPEED, T. P</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>A comparison of normalization methods for high density oligonucleotide array data based on variance and bias</atitle><jtitle>Bioinformatics (Oxford, England)</jtitle><addtitle>Bioinformatics</addtitle><date>2003-01-22</date><risdate>2003</risdate><volume>19</volume><issue>2</issue><spage>185</spage><epage>193</epage><pages>185-193</pages><issn>1367-4803</issn><eissn>1367-4811</eissn><coden>BOINFP</coden><cop>Oxford</cop><pub>Oxford University Press</pub><pmid>12538238</pmid><doi>10.1093/bioinformatics/19.2.185</doi><tpages>9</tpages><oa>free_for_read</oa></addata></record> |
fulltext | fulltext |
identifier | ISSN: 1367-4803 |
ispartof | Bioinformatics (Oxford, England), 2003-01, Vol.19 (2), p.185-193 |
issn | 1367-4803 ; 1367-4811 |
language | eng |
recordid | cdi_proquest_miscellaneous_72973847 |
source | Oxford Open |
subjects | Algorithms ; Biological and medical sciences ; Calibration ; Fundamental and applied biological sciences. Psychology ; General aspects ; Mathematics in biology. Statistical analysis. Models. Metrology. Data processing in biology (general aspects) ; Models, Genetic ; Molecular Probes ; Nonlinear Dynamics ; Oligonucleotide Array Sequence Analysis - instrumentation ; Oligonucleotide Array Sequence Analysis - methods ; Oligonucleotide Array Sequence Analysis - standards ; Quality Control ; Sequence Analysis, DNA - methods ; Sequence Analysis, DNA - standards ; Stochastic Processes |
title | A comparison of normalization methods for high density oligonucleotide array data based on variance and bias |
url | http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-24T17%3A55%3A07IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=A%20comparison%20of%20normalization%20methods%20for%20high%20density%20oligonucleotide%20array%20data%20based%20on%20variance%20and%20bias&rft.jtitle=Bioinformatics%20(Oxford,%20England)&rft.au=BOLSTAD,%20B.%20M&rft.date=2003-01-22&rft.volume=19&rft.issue=2&rft.spage=185&rft.epage=193&rft.pages=185-193&rft.issn=1367-4803&rft.eissn=1367-4811&rft.coden=BOINFP&rft_id=info:doi/10.1093/bioinformatics/19.2.185&rft_dat=%3Cproquest_cross%3E72973847%3C/proquest_cross%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c500t-2e5e5d22e6e1c10a681514b00d00fb8530740e8035d6492debe8d4a56f1360143%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=198655441&rft_id=info:pmid/12538238&rfr_iscdi=true |