Comparison of pre-processing methods for Infinium HumanMethylation450 BeadChip array
Published in: | Bioinformatics (Oxford, England), 2017-10, Vol.33 (20), p.3151-3157 |
---|---|
Main Authors: | , , , |
Format: | Article |
Language: | English |
Subjects: | |
Summary: | Microarrays are widely used to quantify DNA methylation because they are economical, require only small quantities of input DNA and focus on well-characterized regions of the genome. However, pre-processing of methylation microarray data is challenging because of confounding factors that include background fluorescence, dye bias and the impact of germline polymorphisms. Therefore, we present insights and a data-driven framework for those seeking the optimal pre-processing method.
Here, we show that Dasen is the optimal pre-processing methodology for the Infinium HumanMethylation450 BeadChip array in prostate cancer, a frequently employed platform for tumour methylome profiling in both the TCGA and ICGC consortia. We evaluated the impact of 11 pre-processing methods on batch effects, replicate variabilities, sensitivities and sample-to-sample correlations across 809 independent prostate cancer samples, including 150 reported for the first time in this study. Overall, Dasen is the most effective for removing artefacts and detecting biological differences associated with tumour aggressivity. Relative to the raw dataset, it shows a reduction in replicate variances of 67% and 76% for β- and M-values, respectively. Our study provides a unique pre-processing benchmark for the community with an emphasis on biological implications.
All software used in this study is publicly available as detailed in the article.
Contact: paul.boutros@oicr.on.ca
Supplementary data are available at Bioinformatics online. |
---|---|
ISSN: | 1367-4803 1367-4811 |
DOI: | 10.1093/bioinformatics/btx372 |