Loading…

A merged microarray meta-dataset for transcriptionally profiling colorectal neoplasm formation and progression

Transcriptional profiling of pre- and post-malignant colorectal cancer (CRC) lesions enable temporal monitoring of molecular events underlying neoplastic progression. However, the most widely used transcriptomic dataset for CRC, TCGA-COAD, is devoid of adenoma samples, which increases reliance on an...

Full description

Saved in:
Bibliographic Details
Published in:Scientific data 2021-08, Vol.8 (1), p.214-11, Article 214
Main Authors: Rohr, Michael, Beardsley, Jordan, Nakkina, Sai Preethi, Zhu, Xiang, Aljabban, Jihad, Hadley, Dexter, Altomare, Deborah
Format: Article
Language:English
Subjects:
Citations: Items that this one cites
Items that cite this one
Online Access:Get full text
Tags: Add Tag
No Tags, Be the first to tag this record!
Description
Summary:Transcriptional profiling of pre- and post-malignant colorectal cancer (CRC) lesions enable temporal monitoring of molecular events underlying neoplastic progression. However, the most widely used transcriptomic dataset for CRC, TCGA-COAD, is devoid of adenoma samples, which increases reliance on an assortment of disparate microarray studies and hinders consensus building. To address this, we developed a microarray meta-dataset comprising 231 healthy, 132 adenoma, and 342 CRC tissue samples from twelve independent studies. Utilizing a stringent analytic framework, select datasets were downloaded from the Gene Expression Omnibus, normalized by frozen robust multiarray averaging and subsequently merged. Batch effects were then identified and removed by empirical Bayes estimation (ComBat). Finally, the meta-dataset was filtered for low variant probes, enabling downstream differential expression as well as quantitative and functional validation through cross-platform correlation and enrichment analyses, respectively. Overall, our meta-dataset provides a robust tool for investigating colorectal adenoma formation and malignant transformation at the transcriptional level with a pipeline that is modular and readily adaptable for similar analyses in other cancer types. Measurement(s) transcriptome • colorectal cancer Technology Type(s) microarray Factor Type(s) gene expression Sample Characteristic - Organism Homo sapiens Machine-accessible metadata file describing the reported data: https://doi.org/10.6084/m9.figshare.14589006
ISSN:2052-4463
2052-4463
DOI:10.1038/s41597-021-00998-5