Gene expression profiling of breast cancer survivability by pooled cDNA microarray analysis using logistic regression, artificial neural networks and decision trees

Microarray technology can acquire information about thousands of genes simultaneously. We analyzed published breast cancer microarray databases to predict five-year recurrence and compared the performance of three data mining algorithms of artificial neural networks (ANN), decision trees (DT) and lo...

Bibliographic Details
Published in: BMC bioinformatics 2013-03, Vol.14 (1), p.100-100, Article 100
Main Authors: Chou, Hsiu-Ling; Yao, Chung-Tay; Su, Sui-Lun; Lee, Chia-Yi; Hu, Kuang-Yu; Terng, Harn-Jing; Shih, Yun-Wen; Chang, Yu-Tien; Lu, Yu-Fen; Chang, Chi-Wen; Wahlqvist, Mark L; Wetter, Thomas; Chu, Chi-Ming
Format: Article
Language: English
cited_by cdi_FETCH-LOGICAL-c558t-4d874a86fd7a217e417e2d86931cd19de772619450f8e25cee5c0548c393a3fd3
cites cdi_FETCH-LOGICAL-c558t-4d874a86fd7a217e417e2d86931cd19de772619450f8e25cee5c0548c393a3fd3
container_end_page 100
container_issue 1
container_start_page 100
container_title BMC bioinformatics
container_volume 14
creator Chou, Hsiu-Ling
Yao, Chung-Tay
Su, Sui-Lun
Lee, Chia-Yi
Hu, Kuang-Yu
Terng, Harn-Jing
Shih, Yun-Wen
Chang, Yu-Tien
Lu, Yu-Fen
Chang, Chi-Wen
Wahlqvist, Mark L
Wetter, Thomas
Chu, Chi-Ming
description Microarray technology can acquire information about thousands of genes simultaneously. We analyzed published breast cancer microarray databases to predict five-year recurrence and compared the performance of three data mining algorithms, artificial neural networks (ANN), decision trees (DT) and logistic regression (LR), and two composite models, DT-ANN and DT-LR. Four breast cancer microarray datasets from the Gene Expression Omnibus were pooled to predict five-year breast cancer relapse. After data compilation, 757 subjects, 5 clinical variables and 13,452 genetic variables were aggregated. The bootstrap method, the Mann-Whitney U test and 20-fold cross-validation were used to identify the 100 candidate genes with the most significant p-values. The predictive power of the DT, LR and ANN models was assessed using accuracy and the area under the ROC curve. The associated genes were evaluated using Cox regression. The DT models exhibited the lowest predictive power and the poorest extrapolation when applied to the test samples. The ANN models displayed the best predictive power and the best extrapolation. The 21 most-associated genes, as determined by integrating the models, were analyzed using Cox regression and were associated with a 3.53-fold (95% CI: 2.24-5.58) increased risk of five-year breast cancer recurrence. The 21 selected genes can predict breast cancer recurrence. Among these genes, CCNB1, PLK1 and TOP2A are in the cell cycle G2/M DNA damage checkpoint pathway. Oncologists can offer this genetic information to patients once the gene expression profiles associated with breast cancer recurrence are understood.
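
The abstract names the Mann-Whitney U test for gene screening and the area under the ROC curve for model assessment. The two are closely related: for a binary outcome, the AUC of a score equals the U statistic divided by the number of positive-negative pairs. The Python sketch below illustrates that relationship on made-up data. It is not the authors' pipeline: the function names, the gene names and the toy values are all hypothetical, and it ranks genes by how far their AUC is from 0.5 rather than by p-value, which is only a stand-in for the p-value ranking the abstract describes.

```python
from itertools import product


def mann_whitney_u(pos, neg):
    """U statistic for pos vs. neg: pairs with pos > neg count 1, ties count 0.5."""
    u = 0.0
    for p, n in product(pos, neg):
        if p > n:
            u += 1.0
        elif p == n:
            u += 0.5
    return u


def auc_from_scores(scores, labels):
    """AUC of `scores` for binary `labels` (1 = recurrence, 0 = no recurrence)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    # AUC is the U statistic normalised by the number of (pos, neg) pairs.
    return mann_whitney_u(pos, neg) / (len(pos) * len(neg))


def rank_genes(expression, labels, top_k):
    """Return the top_k gene names ordered by |AUC - 0.5| (assumed screening rule)."""
    scored = []
    for gene, values in expression.items():
        auc = auc_from_scores(values, labels)
        scored.append((abs(auc - 0.5), gene))
    # Largest separation first; ties broken alphabetically for determinism.
    scored.sort(key=lambda t: (-t[0], t[1]))
    return [gene for _, gene in scored[:top_k]]


if __name__ == "__main__":
    # Hypothetical toy data: six subjects, three made-up genes.
    labels = [1, 1, 1, 0, 0, 0]
    expression = {
        "gene_a": [5, 6, 7, 1, 2, 3],
        "gene_b": [1, 5, 3, 2, 6, 4],
        "gene_c": [1, 2, 3, 5, 6, 7],
    }
    print(rank_genes(expression, labels, top_k=2))
```

In a real analysis the screening step would sit inside each cross-validation fold, so that genes are never selected using the samples later used to score the model.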
doi_str_mv 10.1186/1471-2105-14-100
format article
fullrecord <record><control><sourceid>gale_pubme</sourceid><recordid>TN_cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_3614553</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><galeid>A534517829</galeid><sourcerecordid>A534517829</sourcerecordid><originalsourceid>FETCH-LOGICAL-c558t-4d874a86fd7a217e417e2d86931cd19de772619450f8e25cee5c0548c393a3fd3</originalsourceid><addsrcrecordid>eNqFkl1rFDEUhgdR7IfeeyUBbyw4Nd-TvRGWqrVQFPy4DtnkzJg6O9kmmbX7f_yhZtp16Yogw5CQPO87Z855q-oZwaeEKPma8IbUlGBRE14TjB9Uh7ujh_f2B9VRSlcYk0Zh8bg6oExgKTk-rH6dwwAIblYRUvJhQKsYWt_7oUOhRYsIJmVkzWAhojTGtV-bRbnOG7TYoFUIPThk336co6W3MZgYzQaZwfSb5BMa0-TTh86n7C2K0G2_8gqZmH3rrTc9GmCMt0v-GeKPVOQOObD-tpwcAdKT6lFr-gRPt-tx9e39u69nH-rLT-cXZ_PL2gqhcs2darhRsnWNoaQBXl7qlJwxYh2ZOWgaKsmMC9wqoMICCIsFV5bNmGGtY8fVmzvf1bhYgrMw5FKZXkW_NHGjg_F6_2bw33UX1ppJwoVgxeDl1iCG6xFS1kufLPS9GSCMSRMhiCSUSvl_lHHFleRMFfTFX-hVGGNp8kRRyhoiyh_sqM70oP3QhlKinUz1XDAuyuzpRJ3-gyqPgzLBMEAZPuwLTvYEhclwkzszpqQvvnzeZ_EdW6KQUoR21zqC9RRYPSVST4ksu3KIi-T5_ZbvBH8Syn4DdEvnPw</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>1322371539</pqid></control><display><type>article</type><title>Gene expression profiling of breast cancer survivability by pooled cDNA microarray analysis using logistic regression, artificial neural networks and decision trees</title><source>PubMed (Medline)</source><source>Publicly Available Content (ProQuest)</source><creator>Chou, Hsiu-Ling ; Yao, Chung-Tay ; Su, Sui-Lun ; Lee, Chia-Yi ; Hu, Kuang-Yu ; Terng, Harn-Jing ; Shih, Yun-Wen ; Chang, Yu-Tien ; Lu, Yu-Fen ; Chang, Chi-Wen ; Wahlqvist, Mark L ; Wetter, Thomas ; Chu, Chi-Ming</creator><creatorcontrib>Chou, Hsiu-Ling ; Yao, Chung-Tay ; Su, Sui-Lun ; Lee, Chia-Yi ; Hu, Kuang-Yu ; Terng, Harn-Jing ; Shih, Yun-Wen ; Chang, Yu-Tien ; Lu, Yu-Fen ; Chang, Chi-Wen ; Wahlqvist, Mark L ; Wetter, Thomas ; Chu, Chi-Ming</creatorcontrib><description>Microarray technology can acquire 
information about thousands of genes simultaneously. We analyzed published breast cancer microarray databases to predict five-year recurrence and compared the performance of three data mining algorithms of artificial neural networks (ANN), decision trees (DT) and logistic regression (LR) and two composite models of DT-ANN and DT-LR. The collection of microarray datasets from the Gene Expression Omnibus, four breast cancer datasets were pooled for predicting five-year breast cancer relapse. After data compilation, 757 subjects, 5 clinical variables and 13,452 genetic variables were aggregated. The bootstrap method, Mann-Whitney U test and 20-fold cross-validation were performed to investigate candidate genes with 100 most-significant p-values. The predictive powers of DT, LR and ANN models were assessed using accuracy and the area under ROC curve. The associated genes were evaluated using Cox regression. The DT models exhibited the lowest predictive power and the poorest extrapolation when applied to the test samples. The ANN models displayed the best predictive power and showed the best extrapolation. The 21 most-associated genes, as determined by integration of each model, were analyzed using Cox regression with a 3.53-fold (95% CI: 2.24-5.58) increased risk of breast cancer five-year recurrence. The 21 selected genes can predict breast cancer recurrence. Among these genes, CCNB1, PLK1 and TOP2A are in the cell cycle G2/M DNA damage checkpoint pathway. 
Oncologists can offer the genetic information for patients when understanding the gene expression profiles on breast cancer recurrence.</description><identifier>ISSN: 1471-2105</identifier><identifier>EISSN: 1471-2105</identifier><identifier>DOI: 10.1186/1471-2105-14-100</identifier><identifier>PMID: 23506640</identifier><language>eng</language><publisher>England: BioMed Central Ltd</publisher><subject>Algorithms ; Analysis ; Anopheles ; Bioinformatics ; Breast cancer ; Breast Neoplasms - genetics ; Cancer ; Cancer therapies ; Comparative analysis ; Conversion ; Data mining ; Databases, Genetic ; Decision Trees ; DNA damage ; DNA microarrays ; DNA, Complementary - genetics ; Female ; Gene expression ; Gene Expression Profiling ; Genes ; Genetic aspects ; Hospitals ; Humans ; Logistic Models ; Logistics ; Medical research ; Neural networks ; Neural Networks (Computer) ; Oligonucleotide Array Sequence Analysis ; Physiological aspects ; Prognosis ; Recurrence ; Sample Size ; Standard deviation ; Statistical methods ; Studies ; Survival Analysis ; Variables</subject><ispartof>BMC bioinformatics, 2013-03, Vol.14 (1), p.100-100, Article 100</ispartof><rights>COPYRIGHT 2013 BioMed Central Ltd.</rights><rights>2013 Chou et al.; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</rights><rights>Copyright © 2013 Chou et al.; licensee BioMed Central Ltd. 
2013 Chou et al.; licensee BioMed Central Ltd.</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c558t-4d874a86fd7a217e417e2d86931cd19de772619450f8e25cee5c0548c393a3fd3</citedby><cites>FETCH-LOGICAL-c558t-4d874a86fd7a217e417e2d86931cd19de772619450f8e25cee5c0548c393a3fd3</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://www.ncbi.nlm.nih.gov/pmc/articles/PMC3614553/pdf/$$EPDF$$P50$$Gpubmedcentral$$Hfree_for_read</linktopdf><linktohtml>$$Uhttps://www.proquest.com/docview/1322371539?pq-origsite=primo$$EHTML$$P50$$Gproquest$$Hfree_for_read</linktohtml><link.rule.ids>230,314,723,776,780,881,25732,27903,27904,36991,36992,44569,53769,53771</link.rule.ids><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/23506640$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><creatorcontrib>Chou, Hsiu-Ling</creatorcontrib><creatorcontrib>Yao, Chung-Tay</creatorcontrib><creatorcontrib>Su, Sui-Lun</creatorcontrib><creatorcontrib>Lee, Chia-Yi</creatorcontrib><creatorcontrib>Hu, Kuang-Yu</creatorcontrib><creatorcontrib>Terng, Harn-Jing</creatorcontrib><creatorcontrib>Shih, Yun-Wen</creatorcontrib><creatorcontrib>Chang, Yu-Tien</creatorcontrib><creatorcontrib>Lu, Yu-Fen</creatorcontrib><creatorcontrib>Chang, Chi-Wen</creatorcontrib><creatorcontrib>Wahlqvist, Mark L</creatorcontrib><creatorcontrib>Wetter, Thomas</creatorcontrib><creatorcontrib>Chu, Chi-Ming</creatorcontrib><title>Gene expression profiling of breast cancer survivability by pooled cDNA microarray analysis using logistic regression, artificial neural networks and decision trees</title><title>BMC bioinformatics</title><addtitle>BMC Bioinformatics</addtitle><description>Microarray technology can acquire information about thousands of genes simultaneously. 
We analyzed published breast cancer microarray databases to predict five-year recurrence and compared the performance of three data mining algorithms of artificial neural networks (ANN), decision trees (DT) and logistic regression (LR) and two composite models of DT-ANN and DT-LR. The collection of microarray datasets from the Gene Expression Omnibus, four breast cancer datasets were pooled for predicting five-year breast cancer relapse. After data compilation, 757 subjects, 5 clinical variables and 13,452 genetic variables were aggregated. The bootstrap method, Mann-Whitney U test and 20-fold cross-validation were performed to investigate candidate genes with 100 most-significant p-values. The predictive powers of DT, LR and ANN models were assessed using accuracy and the area under ROC curve. The associated genes were evaluated using Cox regression. The DT models exhibited the lowest predictive power and the poorest extrapolation when applied to the test samples. The ANN models displayed the best predictive power and showed the best extrapolation. The 21 most-associated genes, as determined by integration of each model, were analyzed using Cox regression with a 3.53-fold (95% CI: 2.24-5.58) increased risk of breast cancer five-year recurrence. The 21 selected genes can predict breast cancer recurrence. Among these genes, CCNB1, PLK1 and TOP2A are in the cell cycle G2/M DNA damage checkpoint pathway. 
Oncologists can offer the genetic information for patients when understanding the gene expression profiles on breast cancer recurrence.</description><subject>Algorithms</subject><subject>Analysis</subject><subject>Anopheles</subject><subject>Bioinformatics</subject><subject>Breast cancer</subject><subject>Breast Neoplasms - genetics</subject><subject>Cancer</subject><subject>Cancer therapies</subject><subject>Comparative analysis</subject><subject>Conversion</subject><subject>Data mining</subject><subject>Databases, Genetic</subject><subject>Decision Trees</subject><subject>DNA damage</subject><subject>DNA microarrays</subject><subject>DNA, Complementary - genetics</subject><subject>Female</subject><subject>Gene expression</subject><subject>Gene Expression Profiling</subject><subject>Genes</subject><subject>Genetic aspects</subject><subject>Hospitals</subject><subject>Humans</subject><subject>Logistic Models</subject><subject>Logistics</subject><subject>Medical research</subject><subject>Neural networks</subject><subject>Neural Networks (Computer)</subject><subject>Oligonucleotide Array Sequence Analysis</subject><subject>Physiological aspects</subject><subject>Prognosis</subject><subject>Recurrence</subject><subject>Sample Size</subject><subject>Standard deviation</subject><subject>Statistical methods</subject><subject>Studies</subject><subject>Survival 
Analysis</subject><subject>Variables</subject><issn>1471-2105</issn><issn>1471-2105</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2013</creationdate><recordtype>article</recordtype><sourceid>PIMPY</sourceid><recordid>eNqFkl1rFDEUhgdR7IfeeyUBbyw4Nd-TvRGWqrVQFPy4DtnkzJg6O9kmmbX7f_yhZtp16Yogw5CQPO87Z855q-oZwaeEKPma8IbUlGBRE14TjB9Uh7ujh_f2B9VRSlcYk0Zh8bg6oExgKTk-rH6dwwAIblYRUvJhQKsYWt_7oUOhRYsIJmVkzWAhojTGtV-bRbnOG7TYoFUIPThk336co6W3MZgYzQaZwfSb5BMa0-TTh86n7C2K0G2_8gqZmH3rrTc9GmCMt0v-GeKPVOQOObD-tpwcAdKT6lFr-gRPt-tx9e39u69nH-rLT-cXZ_PL2gqhcs2darhRsnWNoaQBXl7qlJwxYh2ZOWgaKsmMC9wqoMICCIsFV5bNmGGtY8fVmzvf1bhYgrMw5FKZXkW_NHGjg_F6_2bw33UX1ppJwoVgxeDl1iCG6xFS1kufLPS9GSCMSRMhiCSUSvl_lHHFleRMFfTFX-hVGGNp8kRRyhoiyh_sqM70oP3QhlKinUz1XDAuyuzpRJ3-gyqPgzLBMEAZPuwLTvYEhclwkzszpqQvvnzeZ_EdW6KQUoR21zqC9RRYPSVST4ksu3KIi-T5_ZbvBH8Syn4DdEvnPw</recordid><startdate>20130319</startdate><enddate>20130319</enddate><creator>Chou, Hsiu-Ling</creator><creator>Yao, Chung-Tay</creator><creator>Su, Sui-Lun</creator><creator>Lee, Chia-Yi</creator><creator>Hu, Kuang-Yu</creator><creator>Terng, Harn-Jing</creator><creator>Shih, Yun-Wen</creator><creator>Chang, Yu-Tien</creator><creator>Lu, Yu-Fen</creator><creator>Chang, Chi-Wen</creator><creator>Wahlqvist, Mark L</creator><creator>Wetter, Thomas</creator><creator>Chu, Chi-Ming</creator><general>BioMed Central Ltd</general><general>BioMed 
Central</general><scope>CGR</scope><scope>CUY</scope><scope>CVF</scope><scope>ECM</scope><scope>EIF</scope><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>ISR</scope><scope>3V.</scope><scope>7QO</scope><scope>7SC</scope><scope>7X7</scope><scope>7XB</scope><scope>88E</scope><scope>8AL</scope><scope>8AO</scope><scope>8FD</scope><scope>8FE</scope><scope>8FG</scope><scope>8FH</scope><scope>8FI</scope><scope>8FJ</scope><scope>8FK</scope><scope>ABUWG</scope><scope>AEUYN</scope><scope>AFKRA</scope><scope>ARAPS</scope><scope>AZQEC</scope><scope>BBNVY</scope><scope>BENPR</scope><scope>BGLVJ</scope><scope>BHPHI</scope><scope>CCPQU</scope><scope>DWQXO</scope><scope>FR3</scope><scope>FYUFA</scope><scope>GHDGH</scope><scope>GNUQQ</scope><scope>HCIFZ</scope><scope>JQ2</scope><scope>K7-</scope><scope>K9.</scope><scope>L7M</scope><scope>LK8</scope><scope>L~C</scope><scope>L~D</scope><scope>M0N</scope><scope>M0S</scope><scope>M1P</scope><scope>M7P</scope><scope>P5Z</scope><scope>P62</scope><scope>P64</scope><scope>PIMPY</scope><scope>PQEST</scope><scope>PQQKQ</scope><scope>PQUKI</scope><scope>Q9U</scope><scope>7TM</scope><scope>7X8</scope><scope>5PM</scope></search><sort><creationdate>20130319</creationdate><title>Gene expression profiling of breast cancer survivability by pooled cDNA microarray analysis using logistic regression, artificial neural networks and decision trees</title><author>Chou, Hsiu-Ling ; Yao, Chung-Tay ; Su, Sui-Lun ; Lee, Chia-Yi ; Hu, Kuang-Yu ; Terng, Harn-Jing ; Shih, Yun-Wen ; Chang, Yu-Tien ; Lu, Yu-Fen ; Chang, Chi-Wen ; Wahlqvist, Mark L ; Wetter, Thomas ; Chu, 
Chi-Ming</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c558t-4d874a86fd7a217e417e2d86931cd19de772619450f8e25cee5c0548c393a3fd3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2013</creationdate><topic>Algorithms</topic><topic>Analysis</topic><topic>Anopheles</topic><topic>Bioinformatics</topic><topic>Breast cancer</topic><topic>Breast Neoplasms - genetics</topic><topic>Cancer</topic><topic>Cancer therapies</topic><topic>Comparative analysis</topic><topic>Conversion</topic><topic>Data mining</topic><topic>Databases, Genetic</topic><topic>Decision Trees</topic><topic>DNA damage</topic><topic>DNA microarrays</topic><topic>DNA, Complementary - genetics</topic><topic>Female</topic><topic>Gene expression</topic><topic>Gene Expression Profiling</topic><topic>Genes</topic><topic>Genetic aspects</topic><topic>Hospitals</topic><topic>Humans</topic><topic>Logistic Models</topic><topic>Logistics</topic><topic>Medical research</topic><topic>Neural networks</topic><topic>Neural Networks (Computer)</topic><topic>Oligonucleotide Array Sequence Analysis</topic><topic>Physiological aspects</topic><topic>Prognosis</topic><topic>Recurrence</topic><topic>Sample Size</topic><topic>Standard deviation</topic><topic>Statistical methods</topic><topic>Studies</topic><topic>Survival Analysis</topic><topic>Variables</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Chou, Hsiu-Ling</creatorcontrib><creatorcontrib>Yao, Chung-Tay</creatorcontrib><creatorcontrib>Su, Sui-Lun</creatorcontrib><creatorcontrib>Lee, Chia-Yi</creatorcontrib><creatorcontrib>Hu, Kuang-Yu</creatorcontrib><creatorcontrib>Terng, Harn-Jing</creatorcontrib><creatorcontrib>Shih, Yun-Wen</creatorcontrib><creatorcontrib>Chang, Yu-Tien</creatorcontrib><creatorcontrib>Lu, Yu-Fen</creatorcontrib><creatorcontrib>Chang, Chi-Wen</creatorcontrib><creatorcontrib>Wahlqvist, Mark 
L</creatorcontrib><creatorcontrib>Wetter, Thomas</creatorcontrib><creatorcontrib>Chu, Chi-Ming</creatorcontrib><collection>Medline</collection><collection>MEDLINE</collection><collection>MEDLINE (Ovid)</collection><collection>MEDLINE</collection><collection>MEDLINE</collection><collection>PubMed</collection><collection>CrossRef</collection><collection>Gale In Context: Science</collection><collection>ProQuest Central (Corporate)</collection><collection>Biotechnology Research Abstracts</collection><collection>Computer and Information Systems Abstracts</collection><collection>Health &amp; Medical Collection</collection><collection>ProQuest Central (purchase pre-March 2016)</collection><collection>Medical Database (Alumni Edition)</collection><collection>Computing Database (Alumni Edition)</collection><collection>ProQuest Pharma Collection</collection><collection>Technology Research Database</collection><collection>ProQuest SciTech Collection</collection><collection>ProQuest Technology Collection</collection><collection>ProQuest Natural Science Collection</collection><collection>Hospital Premium Collection</collection><collection>Hospital Premium Collection (Alumni Edition)</collection><collection>ProQuest Central (Alumni) (purchase pre-March 2016)</collection><collection>ProQuest Central (Alumni)</collection><collection>ProQuest One Sustainability</collection><collection>ProQuest Central</collection><collection>Advanced Technologies &amp; Aerospace Collection</collection><collection>ProQuest Central Essentials</collection><collection>Biological Science Collection</collection><collection>ProQuest Central</collection><collection>Technology Collection</collection><collection>ProQuest Natural Science Collection</collection><collection>ProQuest One Community College</collection><collection>ProQuest Central</collection><collection>Engineering Research Database</collection><collection>Health Research Premium Collection</collection><collection>Health Research Premium 
Collection (Alumni)</collection><collection>ProQuest Central Student</collection><collection>SciTech Premium Collection</collection><collection>ProQuest Computer Science Collection</collection><collection>Computer Science Database</collection><collection>ProQuest Health &amp; Medical Complete (Alumni)</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Biological Sciences</collection><collection>Computer and Information Systems Abstracts – Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><collection>Computing Database</collection><collection>Health &amp; Medical Collection (Alumni Edition)</collection><collection>Medical Database</collection><collection>Biological Science Database</collection><collection>ProQuest advanced technologies &amp; aerospace journals</collection><collection>ProQuest Advanced Technologies &amp; Aerospace Collection</collection><collection>Biotechnology and BioEngineering Abstracts</collection><collection>Publicly Available Content (ProQuest)</collection><collection>ProQuest One Academic Eastern Edition (DO NOT USE)</collection><collection>ProQuest One Academic</collection><collection>ProQuest One Academic UKI Edition</collection><collection>ProQuest Central Basic</collection><collection>Nucleic Acids Abstracts</collection><collection>MEDLINE - Academic</collection><collection>PubMed Central (Full Participant titles)</collection><jtitle>BMC bioinformatics</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Chou, Hsiu-Ling</au><au>Yao, Chung-Tay</au><au>Su, Sui-Lun</au><au>Lee, Chia-Yi</au><au>Hu, Kuang-Yu</au><au>Terng, Harn-Jing</au><au>Shih, Yun-Wen</au><au>Chang, Yu-Tien</au><au>Lu, Yu-Fen</au><au>Chang, Chi-Wen</au><au>Wahlqvist, Mark L</au><au>Wetter, Thomas</au><au>Chu, Chi-Ming</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Gene 
expression profiling of breast cancer survivability by pooled cDNA microarray analysis using logistic regression, artificial neural networks and decision trees</atitle><jtitle>BMC bioinformatics</jtitle><addtitle>BMC Bioinformatics</addtitle><date>2013-03-19</date><risdate>2013</risdate><volume>14</volume><issue>1</issue><spage>100</spage><epage>100</epage><pages>100-100</pages><artnum>100</artnum><issn>1471-2105</issn><eissn>1471-2105</eissn><abstract>Microarray technology can acquire information about thousands of genes simultaneously. We analyzed published breast cancer microarray databases to predict five-year recurrence and compared the performance of three data mining algorithms of artificial neural networks (ANN), decision trees (DT) and logistic regression (LR) and two composite models of DT-ANN and DT-LR. The collection of microarray datasets from the Gene Expression Omnibus, four breast cancer datasets were pooled for predicting five-year breast cancer relapse. After data compilation, 757 subjects, 5 clinical variables and 13,452 genetic variables were aggregated. The bootstrap method, Mann-Whitney U test and 20-fold cross-validation were performed to investigate candidate genes with 100 most-significant p-values. The predictive powers of DT, LR and ANN models were assessed using accuracy and the area under ROC curve. The associated genes were evaluated using Cox regression. The DT models exhibited the lowest predictive power and the poorest extrapolation when applied to the test samples. The ANN models displayed the best predictive power and showed the best extrapolation. The 21 most-associated genes, as determined by integration of each model, were analyzed using Cox regression with a 3.53-fold (95% CI: 2.24-5.58) increased risk of breast cancer five-year recurrence. The 21 selected genes can predict breast cancer recurrence. Among these genes, CCNB1, PLK1 and TOP2A are in the cell cycle G2/M DNA damage checkpoint pathway. 
Oncologists can offer the genetic information for patients when understanding the gene expression profiles on breast cancer recurrence.</abstract><cop>England</cop><pub>BioMed Central Ltd</pub><pmid>23506640</pmid><doi>10.1186/1471-2105-14-100</doi><tpages>1</tpages><oa>free_for_read</oa></addata></record>
fulltext fulltext
identifier ISSN: 1471-2105
ispartof BMC bioinformatics, 2013-03, Vol.14 (1), p.100-100, Article 100
issn 1471-2105
1471-2105
language eng
recordid cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_3614553
source PubMed (Medline); Publicly Available Content (ProQuest)
subjects Algorithms
Analysis
Anopheles
Bioinformatics
Breast cancer
Breast Neoplasms - genetics
Cancer
Cancer therapies
Comparative analysis
Conversion
Data mining
Databases, Genetic
Decision Trees
DNA damage
DNA microarrays
DNA, Complementary - genetics
Female
Gene expression
Gene Expression Profiling
Genes
Genetic aspects
Hospitals
Humans
Logistic Models
Logistics
Medical research
Neural networks
Neural Networks (Computer)
Oligonucleotide Array Sequence Analysis
Physiological aspects
Prognosis
Recurrence
Sample Size
Standard deviation
Statistical methods
Studies
Survival Analysis
Variables
title Gene expression profiling of breast cancer survivability by pooled cDNA microarray analysis using logistic regression, artificial neural networks and decision trees
url http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-26T01%3A13%3A14IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-gale_pubme&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Gene%20expression%20profiling%20of%20breast%20cancer%20survivability%20by%20pooled%20cDNA%20microarray%20analysis%20using%20logistic%20regression,%20artificial%20neural%20networks%20and%20decision%20trees&rft.jtitle=BMC%20bioinformatics&rft.au=Chou,%20Hsiu-Ling&rft.date=2013-03-19&rft.volume=14&rft.issue=1&rft.spage=100&rft.epage=100&rft.pages=100-100&rft.artnum=100&rft.issn=1471-2105&rft.eissn=1471-2105&rft_id=info:doi/10.1186/1471-2105-14-100&rft_dat=%3Cgale_pubme%3EA534517829%3C/gale_pubme%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c558t-4d874a86fd7a217e417e2d86931cd19de772619450f8e25cee5c0548c393a3fd3%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=1322371539&rft_id=info:pmid/23506640&rft_galeid=A534517829&rfr_iscdi=true