
Controlling technical variation amongst 6693 patient microarrays of the randomized MINDACT trial

Bibliographic Details
Published in: Communications Biology 2020-07, Vol. 3 (1), p. 397, Article 397
Main Authors: Jacob, Laurent; Witteveen, Anke; Beumer, Inès; Delahaye, Leonie; Wehkamp, Diederik; van den Akker, Jeroen; Snel, Mireille; Chan, Bob; Floore, Arno; Bakx, Niels; Brink, Guido; Poncet, Coralie; Bogaerts, Jan; Delorenzi, Mauro; Piccart, Martine; Rutgers, Emiel; Cardoso, Fatima; Speed, Terence; van ’t Veer, Laura; Glas, Annuska
Format: Article
Language: English
Description
Summary: Gene expression data obtained in large studies hold great promise for discovering disease signatures or subtypes through data analysis. They are also prone to technical variation, whose removal is essential to avoid spurious discoveries. Because this variation is not always known and can be confounded with biological signals, its removal is a challenging task. Here we provide a step-wise procedure and comprehensive analysis of the MINDACT microarray dataset. The MINDACT trial enrolled 6693 breast cancer patients and prospectively validated the gene expression signature MammaPrint for outcome prediction. The study also yielded a full-transcriptome microarray for each tumor. We show for the first time in such a large dataset how technical variation can be removed while retaining expected biological signals. Because of its unprecedented size, we hope the resulting adjusted dataset will be an invaluable tool to discover or test gene expression signatures and to advance our understanding of breast cancer. Laurent Jacob et al. develop a workflow and analytical pipeline to remove technical variation from the MINDACT microarray dataset. Their method preserves biological signals, and the normalized datasets can be repurposed to discover other biomarkers and signatures for breast cancer.
ISSN: 2399-3642
DOI: 10.1038/s42003-020-1111-1