Inference of combinatorial Boolean rules of synergistic gene sets from cancer microarray datasets

Motivation: Gene set analysis has become an important tool for the functional interpretation of high-throughput gene expression datasets. Moreover, pattern analyses based on inferred gene set activities of individual samples have shown the ability to identify more robust disease signatures than indi...

Bibliographic Details
Published in: Bioinformatics 2010-06, Vol.26 (12), p.1506-1512
Main Authors: Park, Inho, Lee, Kwang H., Lee, Doheon
Format: Article
Language: English
cited_by cdi_FETCH-LOGICAL-c554t-510159c4e890d6a16da5bfdafcec42cae21bb9a63b6a198c10ee41a130aa86823
cites cdi_FETCH-LOGICAL-c554t-510159c4e890d6a16da5bfdafcec42cae21bb9a63b6a198c10ee41a130aa86823
container_end_page 1512
container_issue 12
container_start_page 1506
container_title Bioinformatics
container_volume 26
creator Park, Inho
Lee, Kwang H.
Lee, Doheon
description Motivation: Gene set analysis has become an important tool for the functional interpretation of high-throughput gene expression datasets. Moreover, pattern analyses based on inferred gene set activities of individual samples have shown the ability to identify more robust disease signatures than individual gene-based pattern analyses. Although a number of approaches have been proposed for gene set-based pattern analysis, the combinatorial influence of deregulated gene sets on disease phenotype classification has not been studied sufficiently. Results: We propose a new approach for inferring combinatorial Boolean rules of gene sets for a better understanding of cancer transcriptome and cancer classification. To reduce the search space of the possible Boolean rules, we identify small groups of gene sets that synergistically contribute to the classification of samples into their corresponding phenotypic groups (such as normal and cancer). We then measure the significance of the candidate Boolean rules derived from each group of gene sets; the level of significance is based on the class entropy of the samples selected in accordance with the rules. By applying the present approach to publicly available prostate cancer datasets, we identified 72 significant Boolean rules. Finally, we discuss several identified Boolean rules, such as the rule of glutathione metabolism (down) and prostaglandin synthesis regulation (down), which are consistent with known prostate cancer biology. Availability: Scripts written in Python and R are available at http://biosoft.kaist.ac.kr/~ihpark/. The refined gene sets and the full list of the identified Boolean rules are provided in the Supplementary Material. Contact: khlee@biosoft.kaist.ac.kr; dhlee@biosoft.kaist.ac.kr Supplementary information: Supplementary data are available at Bioinformatics online.
doi_str_mv 10.1093/bioinformatics/btq207
format article
publisher Oxford: Oxford University Press
pmid 20410052
date 2010-06-15
tpages 7
rights 2015 INIST-CNRS
oa free_for_read
toplevel peer_reviewed
 online_resources
fulltext fulltext
identifier ISSN: 1367-4803
ispartof Bioinformatics, 2010-06, Vol.26 (12), p.1506-1512
issn 1367-4803
1460-2059
1367-4811
language eng
recordid cdi_proquest_miscellaneous_746307517
source PubMed (Medline); Oxford Journals Open Access Collection
subjects Algorithms
Biological and medical sciences
Boolean algebra
Cancer
Classification
Combinatorial analysis
Entropy
Fundamental and applied biological sciences. Psychology
Gene Expression Regulation, Neoplastic
Gene Regulatory Networks
General aspects
Genes
Genes, Neoplasm
Humans
Male
Mathematics in biology. Statistical analysis. Models. Metrology. Data processing in biology (general aspects)
Neoplasms - genetics
Oligonucleotide Array Sequence Analysis - methods
Pattern analysis
Prostate
Prostatic Neoplasms - genetics
title Inference of combinatorial Boolean rules of synergistic gene sets from cancer microarray datasets
url http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-16T17%3A27%3A24IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Inference%20of%20combinatorial%20Boolean%20rules%20of%20synergistic%20gene%20sets%20from%20cancer%20microarray%20datasets&rft.jtitle=Bioinformatics&rft.au=Park,%20Inho&rft.date=2010-06-15&rft.volume=26&rft.issue=12&rft.spage=1506&rft.epage=1512&rft.pages=1506-1512&rft.issn=1367-4803&rft.eissn=1460-2059&rft_id=info:doi/10.1093/bioinformatics/btq207&rft_dat=%3Cproquest_cross%3E746307517%3C/proquest_cross%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c554t-510159c4e890d6a16da5bfdafcec42cae21bb9a63b6a198c10ee41a130aa86823%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=1671338700&rft_id=info:pmid/20410052&rfr_iscdi=true