Learning Rough Set Classifiers from Gene Expressions and Clinical Data
Biological research is currently undergoing a revolution. With the advent of microarray technology the behavior of thousands of genes can be measured simultaneously. This capability opens a wide range of research opportunities in biology, but the technology generates a vast amount of data that canno...
Published in: | Fundamenta informaticae 2002-11, Vol.53 (2), p.155-183 |
---|---|
Main Authors: | Midelfart, Herman; Komorowski, Jan; Nørsett, Kristin; Yadetie, Fekadu; Sandovik, Arne K.; Lægreid, Astrid |
Format: | Article |
Language: | English |
cited_by | |
---|---|
cites | |
container_end_page | 183 |
container_issue | 2 |
container_start_page | 155 |
container_title | Fundamenta informaticae |
container_volume | 53 |
creator | Midelfart, Herman; Komorowski, Jan; Nørsett, Kristin; Yadetie, Fekadu; Sandovik, Arne K.; Lægreid, Astrid |
description | Biological research is currently undergoing a revolution. With the
advent of microarray technology the behavior of thousands of genes can be
measured simultaneously. This capability opens a wide range of research
opportunities in biology, but the technology generates a vast amount of data
that cannot be handled manually. Computational analysis is thus a prerequisite
for the success of this technology, and research and development of
computational tools for microarray analysis are of great importance. One
application of microarray technology is cancer studies where supervised
learning may be used for predicting tumor subtypes and clinical parameters. We
present a general Rough Set approach for classification of tumor samples
analyzed with microarrays. This approach is tested on a data set of gastric
tumors, and we develop classifiers for six clinical parameters. One major
obstacle in training classifiers from microarray data is that the number of
objects is much smaller than the number of attributes. We therefore introduce a
feature selection method based on bootstrapping for selecting genes that
discriminate significantly between the classes, and study the performance of
this method. Moreover, the efficacy of several learning and discretization
methods implemented in the ROSETTA system [18] is examined. Their performance
is compared to that of linear and quadratic discriminant analysis. The
classifiers are also biologically validated. One of the best classifiers is
selected for each clinical parameter, and the connection between the genes used
in these classifiers and the parameters is compared to the established knowledge
in the biomedical literature. |
doi_str_mv | 10.3233/FUN-2002-53204 |
format | article |
fullrecord | <record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_miscellaneous_27187901</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sage_id>10.3233_FUN-2002-53204</sage_id><sourcerecordid>27187901</sourcerecordid><originalsourceid>FETCH-LOGICAL-c259t-c4aa4f99d80a43787151274803c65b4b226fea415125c9534b5b76c3034f6aed3</originalsourceid><addsrcrecordid>eNp1kE1LAzEURYMoWKtb11m5kan5nEmWUtsqFAW16_AmTWrKNFOTGdB_79S6dfXgcu6DexC6pmTCGed389VzwQhhheSMiBM0oqqShSoVPUUjQktdMF2qc3SR85YQQjXXIzRfOkgxxA1-bfvNB35zHZ42kHPwwaWMfWp3eOGiw7OvfXJD3saMIa4HKsRgocEP0MElOvPQZHf1d8doNZ-9Tx-L5cviaXq_LCyTuiusABBe67UiIHilKiopq4Qi3JayFjVjpXcgDqm0WnJRy7oqLSdc-BLcmo_RzfHvPrWfvcud2YVsXdNAdG2fDauG0ZrQAZwcQZvanJPzZp_CDtK3ocQcdJlBlznoMr-6hsLtsZBh48y27VMchvxH_wAU-Gj7</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>27187901</pqid></control><display><type>article</type><title>Learning Rough Set Classifiers from Gene Expressions and Clinical Data</title><source>SAGE:Jisc Collections:SAGE Journals Read and Publish 2023-2024:2025 extension (reading list)</source><creator>Midelfart, Herman ; Komorowski, Jan ; Nørsett, Kristin ; Yadetie, Fekadu ; Sandovik, Arne K. ; Lægreid, Astrid</creator><creatorcontrib>Midelfart, Herman ; Komorowski, Jan ; Nørsett, Kristin ; Yadetie, Fekadu ; Sandovik, Arne K. ; Lægreid, Astrid</creatorcontrib><description>Biological research is currently undergoing a revolution. With the
advent of microarray technology the behavior of thousands of genes can be
measured simultaneously. This capability opens a wide range of research
opportunities in biology, but the technology generates a vast amount of data
that cannot be handled manually. Computational analysis is thus a prerequisite
for the success of this technology, and research and development of
computational tools for microarray analysis are of great importance. One
application of microarray technology is cancer studies where supervised
learning may be used for predicting tumor subtypes and clinical parameters. We
present a general Rough Set approach for classification of tumor samples
analyzed with microarrays. This approach is tested on a data set of gastric
tumors, and we develop classifiers for six clinical parameters. One major
obstacle in training classifiers from microarray data is that the number of
objects is much smaller than the number of attributes. We therefore introduce a
feature selection method based on bootstrapping for selecting genes that
discriminate significantly between the classes, and study the performance of
this method. Moreover, the efficacy of several learning and discretization
methods implemented in the ROSETTA system [18] is examined. Their performance
is compared to that of linear and quadratic discriminant analysis. The
classifiers are also biologically validated. One of the best classifiers is
selected for each clinical parameter, and the connection between the genes used
in these classifiers and the parameters is compared to the established knowledge
in the biomedical literature.</description><identifier>ISSN: 0169-2968</identifier><identifier>EISSN: 1875-8681</identifier><identifier>DOI: 10.3233/FUN-2002-53204</identifier><language>eng</language><publisher>London, England: SAGE Publications</publisher><ispartof>Fundamenta informaticae, 2002-11, Vol.53 (2), p.155-183</ispartof><rights>IOS Press. All rights reserved</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>314,776,780,27901,27902</link.rule.ids></links><search><creatorcontrib>Midelfart, Herman</creatorcontrib><creatorcontrib>Komorowski, Jan</creatorcontrib><creatorcontrib>Nørsett, Kristin</creatorcontrib><creatorcontrib>Yadetie, Fekadu</creatorcontrib><creatorcontrib>Sandovik, Arne K.</creatorcontrib><creatorcontrib>Lægreid, Astrid</creatorcontrib><title>Learning Rough Set Classifiers from Gene Expressions and Clinical Data</title><title>Fundamenta informaticae</title><description>Biological research is currently undergoing a revolution. With the
advent of microarray technology the behavior of thousands of genes can be
measured simultaneously. This capability opens a wide range of research
opportunities in biology, but the technology generates a vast amount of data
that cannot be handled manually. Computational analysis is thus a prerequisite
for the success of this technology, and research and development of
computational tools for microarray analysis are of great importance. One
application of microarray technology is cancer studies where supervised
learning may be used for predicting tumor subtypes and clinical parameters. We
present a general Rough Set approach for classification of tumor samples
analyzed with microarrays. This approach is tested on a data set of gastric
tumors, and we develop classifiers for six clinical parameters. One major
obstacle in training classifiers from microarray data is that the number of
objects is much smaller than the number of attributes. We therefore introduce a
feature selection method based on bootstrapping for selecting genes that
discriminate significantly between the classes, and study the performance of
this method. Moreover, the efficacy of several learning and discretization
methods implemented in the ROSETTA system [18] is examined. Their performance
is compared to that of linear and quadratic discriminant analysis. The
classifiers are also biologically validated. One of the best classifiers is
selected for each clinical parameter, and the connection between the genes used
in these classifiers and the parameters is compared to the established knowledge
in the biomedical literature.</description><issn>0169-2968</issn><issn>1875-8681</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2002</creationdate><recordtype>article</recordtype><recordid>eNp1kE1LAzEURYMoWKtb11m5kan5nEmWUtsqFAW16_AmTWrKNFOTGdB_79S6dfXgcu6DexC6pmTCGed389VzwQhhheSMiBM0oqqShSoVPUUjQktdMF2qc3SR85YQQjXXIzRfOkgxxA1-bfvNB35zHZ42kHPwwaWMfWp3eOGiw7OvfXJD3saMIa4HKsRgocEP0MElOvPQZHf1d8doNZ-9Tx-L5cviaXq_LCyTuiusABBe67UiIHilKiopq4Qi3JayFjVjpXcgDqm0WnJRy7oqLSdc-BLcmo_RzfHvPrWfvcud2YVsXdNAdG2fDauG0ZrQAZwcQZvanJPzZp_CDtK3ocQcdJlBlznoMr-6hsLtsZBh48y27VMchvxH_wAU-Gj7</recordid><startdate>20021101</startdate><enddate>20021101</enddate><creator>Midelfart, Herman</creator><creator>Komorowski, Jan</creator><creator>Nørsett, Kristin</creator><creator>Yadetie, Fekadu</creator><creator>Sandovik, Arne K.</creator><creator>Lægreid, Astrid</creator><general>SAGE Publications</general><scope>AAYXX</scope><scope>CITATION</scope><scope>7SC</scope><scope>8FD</scope><scope>JQ2</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope></search><sort><creationdate>20021101</creationdate><title>Learning Rough Set Classifiers from Gene Expressions and Clinical Data</title><author>Midelfart, Herman ; Komorowski, Jan ; Nørsett, Kristin ; Yadetie, Fekadu ; Sandovik, Arne K. 
; Lægreid, Astrid</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c259t-c4aa4f99d80a43787151274803c65b4b226fea415125c9534b5b76c3034f6aed3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2002</creationdate><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Midelfart, Herman</creatorcontrib><creatorcontrib>Komorowski, Jan</creatorcontrib><creatorcontrib>Nørsett, Kristin</creatorcontrib><creatorcontrib>Yadetie, Fekadu</creatorcontrib><creatorcontrib>Sandovik, Arne K.</creatorcontrib><creatorcontrib>Lægreid, Astrid</creatorcontrib><collection>CrossRef</collection><collection>Computer and Information Systems Abstracts</collection><collection>Technology Research Database</collection><collection>ProQuest Computer Science Collection</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><jtitle>Fundamenta informaticae</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Midelfart, Herman</au><au>Komorowski, Jan</au><au>Nørsett, Kristin</au><au>Yadetie, Fekadu</au><au>Sandovik, Arne K.</au><au>Lægreid, Astrid</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Learning Rough Set Classifiers from Gene Expressions and Clinical Data</atitle><jtitle>Fundamenta informaticae</jtitle><date>2002-11-01</date><risdate>2002</risdate><volume>53</volume><issue>2</issue><spage>155</spage><epage>183</epage><pages>155-183</pages><issn>0169-2968</issn><eissn>1875-8681</eissn><abstract>Biological research is currently undergoing a revolution. With the
advent of microarray technology the behavior of thousands of genes can be
measured simultaneously. This capability opens a wide range of research
opportunities in biology, but the technology generates a vast amount of data
that cannot be handled manually. Computational analysis is thus a prerequisite
for the success of this technology, and research and development of
computational tools for microarray analysis are of great importance. One
application of microarray technology is cancer studies where supervised
learning may be used for predicting tumor subtypes and clinical parameters. We
present a general Rough Set approach for classification of tumor samples
analyzed with microarrays. This approach is tested on a data set of gastric
tumors, and we develop classifiers for six clinical parameters. One major
obstacle in training classifiers from microarray data is that the number of
objects is much smaller than the number of attributes. We therefore introduce a
feature selection method based on bootstrapping for selecting genes that
discriminate significantly between the classes, and study the performance of
this method. Moreover, the efficacy of several learning and discretization
methods implemented in the ROSETTA system [18] is examined. Their performance
is compared to that of linear and quadratic discriminant analysis. The
classifiers are also biologically validated. One of the best classifiers is
selected for each clinical parameter, and the connection between the genes used
in these classifiers and the parameters is compared to the established knowledge
in the biomedical literature.</abstract><cop>London, England</cop><pub>SAGE Publications</pub><doi>10.3233/FUN-2002-53204</doi><tpages>29</tpages></addata></record> |
fulltext | fulltext |
identifier | ISSN: 0169-2968 |
ispartof | Fundamenta informaticae, 2002-11, Vol.53 (2), p.155-183 |
issn | 0169-2968; 1875-8681 |
language | eng |
recordid | cdi_proquest_miscellaneous_27187901 |
source | SAGE:Jisc Collections:SAGE Journals Read and Publish 2023-2024:2025 extension (reading list) |
title | Learning Rough Set Classifiers from Gene Expressions and Clinical Data |
url | http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-31T07%3A49%3A32IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Learning%20Rough%20Set%20Classifiers%20from%20Gene%20Expressions%20and%20Clinical%20Data&rft.jtitle=Fundamenta%20informaticae&rft.au=Midelfart,%20Herman&rft.date=2002-11-01&rft.volume=53&rft.issue=2&rft.spage=155&rft.epage=183&rft.pages=155-183&rft.issn=0169-2968&rft.eissn=1875-8681&rft_id=info:doi/10.3233/FUN-2002-53204&rft_dat=%3Cproquest_cross%3E27187901%3C/proquest_cross%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c259t-c4aa4f99d80a43787151274803c65b4b226fea415125c9534b5b76c3034f6aed3%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=27187901&rft_id=info:pmid/&rft_sage_id=10.3233_FUN-2002-53204&rfr_iscdi=true |
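
The abstract in this record mentions a feature selection method based on bootstrapping that picks genes discriminating significantly between classes, but the record does not describe the procedure itself. The following is only a minimal sketch of one generic way such a filter could look, not the authors' method. It assumes a two-class problem and a stratified bootstrap, and it keeps a gene when the percentile interval of the bootstrapped class-mean difference excludes zero. The function name `bootstrap_select` and the parameters `n_boot`, `alpha`, and `seed` are hypothetical.

```python
import random
from statistics import mean


def bootstrap_select(X, y, n_boot=200, alpha=0.05, seed=0):
    """Return indices of genes whose bootstrap percentile interval for the
    difference of class means excludes zero.

    Illustrative criterion only (an assumption, not taken from the paper).
    X is a list of samples, each a list of expression values; y holds 0/1 labels.
    """
    rng = random.Random(seed)
    idx0 = [i for i, label in enumerate(y) if label == 0]
    idx1 = [i for i, label in enumerate(y) if label == 1]
    n_genes = len(X[0])
    diffs = [[] for _ in range(n_genes)]

    for _ in range(n_boot):
        # Resample with replacement inside each class (stratified bootstrap),
        # so every resample contains both classes.
        s0 = [rng.choice(idx0) for _ in idx0]
        s1 = [rng.choice(idx1) for _ in idx1]
        for g in range(n_genes):
            d = mean(X[i][g] for i in s1) - mean(X[i][g] for i in s0)
            diffs[g].append(d)

    # Positions of the lower and upper percentile bounds in the sorted list.
    lo_k = int((alpha / 2) * n_boot)
    hi_k = int((1 - alpha / 2) * n_boot) - 1

    selected = []
    for g in range(n_genes):
        d = sorted(diffs[g])
        # Keep the gene if the whole interval lies on one side of zero.
        if d[lo_k] > 0 or d[hi_k] < 0:
            selected.append(g)
    return selected
```

Called on a small matrix in which one gene differs strongly between the two classes and another has identical values in both, the sketch would be expected to keep only the first gene. With far more genes than samples, as the abstract notes for microarray data, a filter of this kind would normally be run inside each cross-validation fold to avoid selection bias.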