Gene expression profiling of breast cancer survivability by pooled cDNA microarray analysis using logistic regression, artificial neural networks and decision trees
Microarray technology can acquire information about thousands of genes simultaneously. We analyzed published breast cancer microarray databases to predict five-year recurrence and compared the performance of three data mining algorithms, artificial neural networks (ANN), decision trees (DT) and lo...
Published in: | BMC bioinformatics 2013-03, Vol.14 (1), p.100-100, Article 100 |
---|---|
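Main Authors: | Chou, Hsiu-Ling; Yao, Chung-Tay; Su, Sui-Lun; Lee, Chia-Yi; Hu, Kuang-Yu; Terng, Harn-Jing; Shih, Yun-Wen; Chang, Yu-Tien; Lu, Yu-Fen; Chang, Chi-Wen; Wahlqvist, Mark L; Wetter, Thomas; Chu, Chi-Ming |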
Format: | Article |
Language: | English |
cited_by | cdi_FETCH-LOGICAL-c558t-4d874a86fd7a217e417e2d86931cd19de772619450f8e25cee5c0548c393a3fd3 |
---|---|
cites | cdi_FETCH-LOGICAL-c558t-4d874a86fd7a217e417e2d86931cd19de772619450f8e25cee5c0548c393a3fd3 |
container_end_page | 100 |
container_issue | 1 |
container_start_page | 100 |
container_title | BMC bioinformatics |
container_volume | 14 |
creator | Chou, Hsiu-Ling; Yao, Chung-Tay; Su, Sui-Lun; Lee, Chia-Yi; Hu, Kuang-Yu; Terng, Harn-Jing; Shih, Yun-Wen; Chang, Yu-Tien; Lu, Yu-Fen; Chang, Chi-Wen; Wahlqvist, Mark L; Wetter, Thomas; Chu, Chi-Ming |
description | Microarray technology can acquire information about thousands of genes simultaneously. We analyzed published breast cancer microarray databases to predict five-year recurrence and compared the performance of three data mining algorithms, artificial neural networks (ANN), decision trees (DT) and logistic regression (LR), and of two composite models, DT-ANN and DT-LR. Four breast cancer microarray datasets from the Gene Expression Omnibus were pooled to predict five-year breast cancer relapse. After data compilation, 757 subjects, 5 clinical variables and 13,452 genetic variables were aggregated. The bootstrap method, the Mann-Whitney U test and 20-fold cross-validation were used to identify the 100 candidate genes with the most significant p-values. The predictive power of the DT, LR and ANN models was assessed using accuracy and the area under the ROC curve. The associated genes were evaluated using Cox regression.
The DT models exhibited the lowest predictive power and the poorest extrapolation when applied to the test samples. The ANN models displayed the best predictive power and the best extrapolation. The 21 most-associated genes, as determined by integration of each model, were analyzed using Cox regression and were associated with a 3.53-fold (95% CI: 2.24-5.58) increased risk of five-year breast cancer recurrence.
The 21 selected genes can predict breast cancer recurrence. Among these genes, CCNB1, PLK1 and TOP2A are in the cell cycle G2/M DNA damage checkpoint pathway. Oncologists who understand the gene expression profiles associated with breast cancer recurrence can offer this genetic information to patients. |
doi_str_mv | 10.1186/1471-2105-14-100 |
format | article |
fullrecord | <record><control><sourceid>gale_pubme</sourceid><recordid>TN_cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_3614553</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><galeid>A534517829</galeid><sourcerecordid>A534517829</sourcerecordid><originalsourceid>FETCH-LOGICAL-c558t-4d874a86fd7a217e417e2d86931cd19de772619450f8e25cee5c0548c393a3fd3</originalsourceid><addsrcrecordid>eNqFkl1rFDEUhgdR7IfeeyUBbyw4Nd-TvRGWqrVQFPy4DtnkzJg6O9kmmbX7f_yhZtp16Yogw5CQPO87Z855q-oZwaeEKPma8IbUlGBRE14TjB9Uh7ujh_f2B9VRSlcYk0Zh8bg6oExgKTk-rH6dwwAIblYRUvJhQKsYWt_7oUOhRYsIJmVkzWAhojTGtV-bRbnOG7TYoFUIPThk336co6W3MZgYzQaZwfSb5BMa0-TTh86n7C2K0G2_8gqZmH3rrTc9GmCMt0v-GeKPVOQOObD-tpwcAdKT6lFr-gRPt-tx9e39u69nH-rLT-cXZ_PL2gqhcs2darhRsnWNoaQBXl7qlJwxYh2ZOWgaKsmMC9wqoMICCIsFV5bNmGGtY8fVmzvf1bhYgrMw5FKZXkW_NHGjg_F6_2bw33UX1ppJwoVgxeDl1iCG6xFS1kufLPS9GSCMSRMhiCSUSvl_lHHFleRMFfTFX-hVGGNp8kRRyhoiyh_sqM70oP3QhlKinUz1XDAuyuzpRJ3-gyqPgzLBMEAZPuwLTvYEhclwkzszpqQvvnzeZ_EdW6KQUoR21zqC9RRYPSVST4ksu3KIi-T5_ZbvBH8Syn4DdEvnPw</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>1322371539</pqid></control><display><type>article</type><title>Gene expression profiling of breast cancer survivability by pooled cDNA microarray analysis using logistic regression, artificial neural networks and decision trees</title><source>PubMed (Medline)</source><source>Publicly Available Content (ProQuest)</source><creator>Chou, Hsiu-Ling ; Yao, Chung-Tay ; Su, Sui-Lun ; Lee, Chia-Yi ; Hu, Kuang-Yu ; Terng, Harn-Jing ; Shih, Yun-Wen ; Chang, Yu-Tien ; Lu, Yu-Fen ; Chang, Chi-Wen ; Wahlqvist, Mark L ; Wetter, Thomas ; Chu, Chi-Ming</creator><creatorcontrib>Chou, Hsiu-Ling ; Yao, Chung-Tay ; Su, Sui-Lun ; Lee, Chia-Yi ; Hu, Kuang-Yu ; Terng, Harn-Jing ; Shih, Yun-Wen ; Chang, Yu-Tien ; Lu, Yu-Fen ; Chang, Chi-Wen ; Wahlqvist, Mark L ; Wetter, Thomas ; Chu, Chi-Ming</creatorcontrib><description>Microarray technology can acquire 
information about thousands of genes simultaneously. We analyzed published breast cancer microarray databases to predict five-year recurrence and compared the performance of three data mining algorithms, artificial neural networks (ANN), decision trees (DT) and logistic regression (LR), and of two composite models, DT-ANN and DT-LR. Four breast cancer microarray datasets from the Gene Expression Omnibus were pooled to predict five-year breast cancer relapse. After data compilation, 757 subjects, 5 clinical variables and 13,452 genetic variables were aggregated. The bootstrap method, the Mann-Whitney U test and 20-fold cross-validation were used to identify the 100 candidate genes with the most significant p-values. The predictive power of the DT, LR and ANN models was assessed using accuracy and the area under the ROC curve. The associated genes were evaluated using Cox regression.
The DT models exhibited the lowest predictive power and the poorest extrapolation when applied to the test samples. The ANN models displayed the best predictive power and the best extrapolation. The 21 most-associated genes, as determined by integration of each model, were analyzed using Cox regression and were associated with a 3.53-fold (95% CI: 2.24-5.58) increased risk of five-year breast cancer recurrence.
The 21 selected genes can predict breast cancer recurrence. Among these genes, CCNB1, PLK1 and TOP2A are in the cell cycle G2/M DNA damage checkpoint pathway. Oncologists can offer the genetic information for patients when understanding the gene expression profiles on breast cancer recurrence.</description><identifier>ISSN: 1471-2105</identifier><identifier>EISSN: 1471-2105</identifier><identifier>DOI: 10.1186/1471-2105-14-100</identifier><identifier>PMID: 23506640</identifier><language>eng</language><publisher>England: BioMed Central Ltd</publisher><subject>Algorithms ; Analysis ; Anopheles ; Bioinformatics ; Breast cancer ; Breast Neoplasms - genetics ; Cancer ; Cancer therapies ; Comparative analysis ; Conversion ; Data mining ; Databases, Genetic ; Decision Trees ; DNA damage ; DNA microarrays ; DNA, Complementary - genetics ; Female ; Gene expression ; Gene Expression Profiling ; Genes ; Genetic aspects ; Hospitals ; Humans ; Logistic Models ; Logistics ; Medical research ; Neural networks ; Neural Networks (Computer) ; Oligonucleotide Array Sequence Analysis ; Physiological aspects ; Prognosis ; Recurrence ; Sample Size ; Standard deviation ; Statistical methods ; Studies ; Survival Analysis ; Variables</subject><ispartof>BMC bioinformatics, 2013-03, Vol.14 (1), p.100-100, Article 100</ispartof><rights>COPYRIGHT 2013 BioMed Central Ltd.</rights><rights>2013 Chou et al.; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</rights><rights>Copyright © 2013 Chou et al.; licensee BioMed Central Ltd. 
2013 Chou et al.; licensee BioMed Central Ltd.</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c558t-4d874a86fd7a217e417e2d86931cd19de772619450f8e25cee5c0548c393a3fd3</citedby><cites>FETCH-LOGICAL-c558t-4d874a86fd7a217e417e2d86931cd19de772619450f8e25cee5c0548c393a3fd3</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://www.ncbi.nlm.nih.gov/pmc/articles/PMC3614553/pdf/$$EPDF$$P50$$Gpubmedcentral$$Hfree_for_read</linktopdf><linktohtml>$$Uhttps://www.proquest.com/docview/1322371539?pq-origsite=primo$$EHTML$$P50$$Gproquest$$Hfree_for_read</linktohtml><link.rule.ids>230,314,723,776,780,881,25732,27903,27904,36991,36992,44569,53769,53771</link.rule.ids><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/23506640$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><creatorcontrib>Chou, Hsiu-Ling</creatorcontrib><creatorcontrib>Yao, Chung-Tay</creatorcontrib><creatorcontrib>Su, Sui-Lun</creatorcontrib><creatorcontrib>Lee, Chia-Yi</creatorcontrib><creatorcontrib>Hu, Kuang-Yu</creatorcontrib><creatorcontrib>Terng, Harn-Jing</creatorcontrib><creatorcontrib>Shih, Yun-Wen</creatorcontrib><creatorcontrib>Chang, Yu-Tien</creatorcontrib><creatorcontrib>Lu, Yu-Fen</creatorcontrib><creatorcontrib>Chang, Chi-Wen</creatorcontrib><creatorcontrib>Wahlqvist, Mark L</creatorcontrib><creatorcontrib>Wetter, Thomas</creatorcontrib><creatorcontrib>Chu, Chi-Ming</creatorcontrib><title>Gene expression profiling of breast cancer survivability by pooled cDNA microarray analysis using logistic regression, artificial neural networks and decision trees</title><title>BMC bioinformatics</title><addtitle>BMC Bioinformatics</addtitle><description>Microarray technology can acquire information about thousands of genes simultaneously. 
We analyzed published breast cancer microarray databases to predict five-year recurrence and compared the performance of three data mining algorithms, artificial neural networks (ANN), decision trees (DT) and logistic regression (LR), and of two composite models, DT-ANN and DT-LR. Four breast cancer microarray datasets from the Gene Expression Omnibus were pooled to predict five-year breast cancer relapse. After data compilation, 757 subjects, 5 clinical variables and 13,452 genetic variables were aggregated. The bootstrap method, the Mann-Whitney U test and 20-fold cross-validation were used to identify the 100 candidate genes with the most significant p-values. The predictive power of the DT, LR and ANN models was assessed using accuracy and the area under the ROC curve. The associated genes were evaluated using Cox regression.
The DT models exhibited the lowest predictive power and the poorest extrapolation when applied to the test samples. The ANN models displayed the best predictive power and the best extrapolation. The 21 most-associated genes, as determined by integration of each model, were analyzed using Cox regression and were associated with a 3.53-fold (95% CI: 2.24-5.58) increased risk of five-year breast cancer recurrence.
The 21 selected genes can predict breast cancer recurrence. Among these genes, CCNB1, PLK1 and TOP2A are in the cell cycle G2/M DNA damage checkpoint pathway. Oncologists can offer the genetic information for patients when understanding the gene expression profiles on breast cancer recurrence.</description><subject>Algorithms</subject><subject>Analysis</subject><subject>Anopheles</subject><subject>Bioinformatics</subject><subject>Breast cancer</subject><subject>Breast Neoplasms - genetics</subject><subject>Cancer</subject><subject>Cancer therapies</subject><subject>Comparative analysis</subject><subject>Conversion</subject><subject>Data mining</subject><subject>Databases, Genetic</subject><subject>Decision Trees</subject><subject>DNA damage</subject><subject>DNA microarrays</subject><subject>DNA, Complementary - genetics</subject><subject>Female</subject><subject>Gene expression</subject><subject>Gene Expression Profiling</subject><subject>Genes</subject><subject>Genetic aspects</subject><subject>Hospitals</subject><subject>Humans</subject><subject>Logistic Models</subject><subject>Logistics</subject><subject>Medical research</subject><subject>Neural networks</subject><subject>Neural Networks (Computer)</subject><subject>Oligonucleotide Array Sequence Analysis</subject><subject>Physiological aspects</subject><subject>Prognosis</subject><subject>Recurrence</subject><subject>Sample Size</subject><subject>Standard deviation</subject><subject>Statistical methods</subject><subject>Studies</subject><subject>Survival 
Analysis</subject><subject>Variables</subject><issn>1471-2105</issn><issn>1471-2105</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2013</creationdate><recordtype>article</recordtype><sourceid>PIMPY</sourceid><recordid>eNqFkl1rFDEUhgdR7IfeeyUBbyw4Nd-TvRGWqrVQFPy4DtnkzJg6O9kmmbX7f_yhZtp16Yogw5CQPO87Z855q-oZwaeEKPma8IbUlGBRE14TjB9Uh7ujh_f2B9VRSlcYk0Zh8bg6oExgKTk-rH6dwwAIblYRUvJhQKsYWt_7oUOhRYsIJmVkzWAhojTGtV-bRbnOG7TYoFUIPThk336co6W3MZgYzQaZwfSb5BMa0-TTh86n7C2K0G2_8gqZmH3rrTc9GmCMt0v-GeKPVOQOObD-tpwcAdKT6lFr-gRPt-tx9e39u69nH-rLT-cXZ_PL2gqhcs2darhRsnWNoaQBXl7qlJwxYh2ZOWgaKsmMC9wqoMICCIsFV5bNmGGtY8fVmzvf1bhYgrMw5FKZXkW_NHGjg_F6_2bw33UX1ppJwoVgxeDl1iCG6xFS1kufLPS9GSCMSRMhiCSUSvl_lHHFleRMFfTFX-hVGGNp8kRRyhoiyh_sqM70oP3QhlKinUz1XDAuyuzpRJ3-gyqPgzLBMEAZPuwLTvYEhclwkzszpqQvvnzeZ_EdW6KQUoR21zqC9RRYPSVST4ksu3KIi-T5_ZbvBH8Syn4DdEvnPw</recordid><startdate>20130319</startdate><enddate>20130319</enddate><creator>Chou, Hsiu-Ling</creator><creator>Yao, Chung-Tay</creator><creator>Su, Sui-Lun</creator><creator>Lee, Chia-Yi</creator><creator>Hu, Kuang-Yu</creator><creator>Terng, Harn-Jing</creator><creator>Shih, Yun-Wen</creator><creator>Chang, Yu-Tien</creator><creator>Lu, Yu-Fen</creator><creator>Chang, Chi-Wen</creator><creator>Wahlqvist, Mark L</creator><creator>Wetter, Thomas</creator><creator>Chu, Chi-Ming</creator><general>BioMed Central Ltd</general><general>BioMed 
Central</general><scope>CGR</scope><scope>CUY</scope><scope>CVF</scope><scope>ECM</scope><scope>EIF</scope><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>ISR</scope><scope>3V.</scope><scope>7QO</scope><scope>7SC</scope><scope>7X7</scope><scope>7XB</scope><scope>88E</scope><scope>8AL</scope><scope>8AO</scope><scope>8FD</scope><scope>8FE</scope><scope>8FG</scope><scope>8FH</scope><scope>8FI</scope><scope>8FJ</scope><scope>8FK</scope><scope>ABUWG</scope><scope>AEUYN</scope><scope>AFKRA</scope><scope>ARAPS</scope><scope>AZQEC</scope><scope>BBNVY</scope><scope>BENPR</scope><scope>BGLVJ</scope><scope>BHPHI</scope><scope>CCPQU</scope><scope>DWQXO</scope><scope>FR3</scope><scope>FYUFA</scope><scope>GHDGH</scope><scope>GNUQQ</scope><scope>HCIFZ</scope><scope>JQ2</scope><scope>K7-</scope><scope>K9.</scope><scope>L7M</scope><scope>LK8</scope><scope>L~C</scope><scope>L~D</scope><scope>M0N</scope><scope>M0S</scope><scope>M1P</scope><scope>M7P</scope><scope>P5Z</scope><scope>P62</scope><scope>P64</scope><scope>PIMPY</scope><scope>PQEST</scope><scope>PQQKQ</scope><scope>PQUKI</scope><scope>Q9U</scope><scope>7TM</scope><scope>7X8</scope><scope>5PM</scope></search><sort><creationdate>20130319</creationdate><title>Gene expression profiling of breast cancer survivability by pooled cDNA microarray analysis using logistic regression, artificial neural networks and decision trees</title><author>Chou, Hsiu-Ling ; Yao, Chung-Tay ; Su, Sui-Lun ; Lee, Chia-Yi ; Hu, Kuang-Yu ; Terng, Harn-Jing ; Shih, Yun-Wen ; Chang, Yu-Tien ; Lu, Yu-Fen ; Chang, Chi-Wen ; Wahlqvist, Mark L ; Wetter, Thomas ; Chu, 
Chi-Ming</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c558t-4d874a86fd7a217e417e2d86931cd19de772619450f8e25cee5c0548c393a3fd3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2013</creationdate><topic>Algorithms</topic><topic>Analysis</topic><topic>Anopheles</topic><topic>Bioinformatics</topic><topic>Breast cancer</topic><topic>Breast Neoplasms - genetics</topic><topic>Cancer</topic><topic>Cancer therapies</topic><topic>Comparative analysis</topic><topic>Conversion</topic><topic>Data mining</topic><topic>Databases, Genetic</topic><topic>Decision Trees</topic><topic>DNA damage</topic><topic>DNA microarrays</topic><topic>DNA, Complementary - genetics</topic><topic>Female</topic><topic>Gene expression</topic><topic>Gene Expression Profiling</topic><topic>Genes</topic><topic>Genetic aspects</topic><topic>Hospitals</topic><topic>Humans</topic><topic>Logistic Models</topic><topic>Logistics</topic><topic>Medical research</topic><topic>Neural networks</topic><topic>Neural Networks (Computer)</topic><topic>Oligonucleotide Array Sequence Analysis</topic><topic>Physiological aspects</topic><topic>Prognosis</topic><topic>Recurrence</topic><topic>Sample Size</topic><topic>Standard deviation</topic><topic>Statistical methods</topic><topic>Studies</topic><topic>Survival Analysis</topic><topic>Variables</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Chou, Hsiu-Ling</creatorcontrib><creatorcontrib>Yao, Chung-Tay</creatorcontrib><creatorcontrib>Su, Sui-Lun</creatorcontrib><creatorcontrib>Lee, Chia-Yi</creatorcontrib><creatorcontrib>Hu, Kuang-Yu</creatorcontrib><creatorcontrib>Terng, Harn-Jing</creatorcontrib><creatorcontrib>Shih, Yun-Wen</creatorcontrib><creatorcontrib>Chang, Yu-Tien</creatorcontrib><creatorcontrib>Lu, Yu-Fen</creatorcontrib><creatorcontrib>Chang, Chi-Wen</creatorcontrib><creatorcontrib>Wahlqvist, Mark 
L</creatorcontrib><creatorcontrib>Wetter, Thomas</creatorcontrib><creatorcontrib>Chu, Chi-Ming</creatorcontrib><collection>Medline</collection><collection>MEDLINE</collection><collection>MEDLINE (Ovid)</collection><collection>MEDLINE</collection><collection>MEDLINE</collection><collection>PubMed</collection><collection>CrossRef</collection><collection>Gale In Context: Science</collection><collection>ProQuest Central (Corporate)</collection><collection>Biotechnology Research Abstracts</collection><collection>Computer and Information Systems Abstracts</collection><collection>Health & Medical Collection</collection><collection>ProQuest Central (purchase pre-March 2016)</collection><collection>Medical Database (Alumni Edition)</collection><collection>Computing Database (Alumni Edition)</collection><collection>ProQuest Pharma Collection</collection><collection>Technology Research Database</collection><collection>ProQuest SciTech Collection</collection><collection>ProQuest Technology Collection</collection><collection>ProQuest Natural Science Collection</collection><collection>Hospital Premium Collection</collection><collection>Hospital Premium Collection (Alumni Edition)</collection><collection>ProQuest Central (Alumni) (purchase pre-March 2016)</collection><collection>ProQuest Central (Alumni)</collection><collection>ProQuest One Sustainability</collection><collection>ProQuest Central</collection><collection>Advanced Technologies & Aerospace Collection</collection><collection>ProQuest Central Essentials</collection><collection>Biological Science Collection</collection><collection>ProQuest Central</collection><collection>Technology Collection</collection><collection>ProQuest Natural Science Collection</collection><collection>ProQuest One Community College</collection><collection>ProQuest Central</collection><collection>Engineering Research Database</collection><collection>Health Research Premium Collection</collection><collection>Health Research Premium Collection 
(Alumni)</collection><collection>ProQuest Central Student</collection><collection>SciTech Premium Collection</collection><collection>ProQuest Computer Science Collection</collection><collection>Computer Science Database</collection><collection>ProQuest Health & Medical Complete (Alumni)</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Biological Sciences</collection><collection>Computer and Information Systems Abstracts Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><collection>Computing Database</collection><collection>Health & Medical Collection (Alumni Edition)</collection><collection>Medical Database</collection><collection>Biological Science Database</collection><collection>ProQuest advanced technologies & aerospace journals</collection><collection>ProQuest Advanced Technologies & Aerospace Collection</collection><collection>Biotechnology and BioEngineering Abstracts</collection><collection>Publicly Available Content (ProQuest)</collection><collection>ProQuest One Academic Eastern Edition (DO NOT USE)</collection><collection>ProQuest One Academic</collection><collection>ProQuest One Academic UKI Edition</collection><collection>ProQuest Central Basic</collection><collection>Nucleic Acids Abstracts</collection><collection>MEDLINE - Academic</collection><collection>PubMed Central (Full Participant titles)</collection><jtitle>BMC bioinformatics</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Chou, Hsiu-Ling</au><au>Yao, Chung-Tay</au><au>Su, Sui-Lun</au><au>Lee, Chia-Yi</au><au>Hu, Kuang-Yu</au><au>Terng, Harn-Jing</au><au>Shih, Yun-Wen</au><au>Chang, Yu-Tien</au><au>Lu, Yu-Fen</au><au>Chang, Chi-Wen</au><au>Wahlqvist, Mark L</au><au>Wetter, Thomas</au><au>Chu, Chi-Ming</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Gene expression profiling of breast 
cancer survivability by pooled cDNA microarray analysis using logistic regression, artificial neural networks and decision trees</atitle><jtitle>BMC bioinformatics</jtitle><addtitle>BMC Bioinformatics</addtitle><date>2013-03-19</date><risdate>2013</risdate><volume>14</volume><issue>1</issue><spage>100</spage><epage>100</epage><pages>100-100</pages><artnum>100</artnum><issn>1471-2105</issn><eissn>1471-2105</eissn><abstract>Microarray technology can acquire information about thousands of genes simultaneously. We analyzed published breast cancer microarray databases to predict five-year recurrence and compared the performance of three data mining algorithms of artificial neural networks (ANN), decision trees (DT) and logistic regression (LR) and two composite models of DT-ANN and DT-LR. The collection of microarray datasets from the Gene Expression Omnibus, four breast cancer datasets were pooled for predicting five-year breast cancer relapse. After data compilation, 757 subjects, 5 clinical variables and 13,452 genetic variables were aggregated. The bootstrap method, Mann-Whitney U test and 20-fold cross-validation were performed to investigate candidate genes with 100 most-significant p-values. The predictive powers of DT, LR and ANN models were assessed using accuracy and the area under ROC curve. The associated genes were evaluated using Cox regression.
The DT models exhibited the lowest predictive power and the poorest extrapolation when applied to the test samples. The ANN models displayed the best predictive power and the best extrapolation. The 21 most-associated genes, as determined by integration of each model, were analyzed using Cox regression and were associated with a 3.53-fold (95% CI: 2.24-5.58) increased risk of five-year breast cancer recurrence.
The 21 selected genes can predict breast cancer recurrence. Among these genes, CCNB1, PLK1 and TOP2A are in the cell cycle G2/M DNA damage checkpoint pathway. Oncologists can offer the genetic information for patients when understanding the gene expression profiles on breast cancer recurrence.</abstract><cop>England</cop><pub>BioMed Central Ltd</pub><pmid>23506640</pmid><doi>10.1186/1471-2105-14-100</doi><tpages>1</tpages><oa>free_for_read</oa></addata></record> |
fulltext | fulltext |
identifier | ISSN: 1471-2105 |
ispartof | BMC bioinformatics, 2013-03, Vol.14 (1), p.100-100, Article 100 |
issn | 1471-2105 1471-2105 |
language | eng |
recordid | cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_3614553 |
source | PubMed (Medline); Publicly Available Content (ProQuest) |
subjects | Algorithms; Analysis; Anopheles; Bioinformatics; Breast cancer; Breast Neoplasms - genetics; Cancer; Cancer therapies; Comparative analysis; Conversion; Data mining; Databases, Genetic; Decision Trees; DNA damage; DNA microarrays; DNA, Complementary - genetics; Female; Gene expression; Gene Expression Profiling; Genes; Genetic aspects; Hospitals; Humans; Logistic Models; Logistics; Medical research; Neural networks; Neural Networks (Computer); Oligonucleotide Array Sequence Analysis; Physiological aspects; Prognosis; Recurrence; Sample Size; Standard deviation; Statistical methods; Studies; Survival Analysis; Variables |
title | Gene expression profiling of breast cancer survivability by pooled cDNA microarray analysis using logistic regression, artificial neural networks and decision trees |
url | http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-26T01%3A13%3A14IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-gale_pubme&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Gene%20expression%20profiling%20of%20breast%20cancer%20survivability%20by%20pooled%20cDNA%20microarray%20analysis%20using%20logistic%20regression,%20artificial%20neural%20networks%20and%20decision%20trees&rft.jtitle=BMC%20bioinformatics&rft.au=Chou,%20Hsiu-Ling&rft.date=2013-03-19&rft.volume=14&rft.issue=1&rft.spage=100&rft.epage=100&rft.pages=100-100&rft.artnum=100&rft.issn=1471-2105&rft.eissn=1471-2105&rft_id=info:doi/10.1186/1471-2105-14-100&rft_dat=%3Cgale_pubme%3EA534517829%3C/gale_pubme%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c558t-4d874a86fd7a217e417e2d86931cd19de772619450f8e25cee5c0548c393a3fd3%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=1322371539&rft_id=info:pmid/23506640&rft_galeid=A534517829&rfr_iscdi=true |