
bcGST—an interactive bias-correction method to identify over-represented gene-sets in boutique arrays

Abstract Motivation Gene annotation and pathway databases such as Gene Ontology and Kyoto Encyclopaedia of Genes and Genomes are important tools in Gene-Set Test (GST) that describe gene biological functions and associated pathways. GST aims to establish an association relationship between a gene-se...

Bibliographic Details
Published in: Bioinformatics 2019-04, Vol.35 (8), p.1350-1357
Main Authors: Wang, Kevin Y X, Menzies, Alexander M, Silva, Ines P, Wilmott, James S, Yan, Yibing, Wongchenko, Matthew, Kefford, Richard F, Scolyer, Richard A, Long, Georgina V, Tarr, Garth, Mueller, Samuel, Yang, Jean Y H
Format: Article
Language: English
cited_by cdi_FETCH-LOGICAL-c397t-82d58419c40239eb59cbd7f322c2f86679450d2c5e12b0b7d3288fa591e548903
cites cdi_FETCH-LOGICAL-c397t-82d58419c40239eb59cbd7f322c2f86679450d2c5e12b0b7d3288fa591e548903
container_end_page 1357
container_issue 8
container_start_page 1350
container_title Bioinformatics
container_volume 35
creator Wang, Kevin Y X
Menzies, Alexander M
Silva, Ines P
Wilmott, James S
Yan, Yibing
Wongchenko, Matthew
Kefford, Richard F
Scolyer, Richard A
Long, Georgina V
Tarr, Garth
Mueller, Samuel
Yang, Jean Y H
description Motivation: Gene annotation and pathway databases such as Gene Ontology and Kyoto Encyclopaedia of Genes and Genomes describe gene biological functions and associated pathways, and are important tools in gene-set tests (GST). A GST aims to establish an association between a gene-set of interest and an annotation; specifically, it tests for over-representation of genes in an annotation term. One implicit assumption of GST is that the gene expression platform captures the complete genome or a very large proportion of it. However, this assumption is satisfied neither for the increasingly popular boutique arrays nor for custom-designed gene expression profiling platforms. For these platforms, conventional GST is no longer appropriate because of the gene-set selection bias induced during their construction. Results: We propose bcGST, a bias-corrected GST that introduces bias-correction terms into the contingency table needed for calculating Fisher’s Exact Test. The adjustment works by estimating the proportion of genes captured on the array with respect to the genome, in order to assist the filtering of annotation terms that would otherwise be falsely included or excluded. We illustrate the practicality of bcGST and its stability through multiple differential gene expression analyses in melanoma and The Cancer Genome Atlas cancer studies. Availability and implementation: The bcGST method is made available as a Shiny web application at http://shiny.maths.usyd.edu.au/bcGST/. Supplementary information: Supplementary data are available at Bioinformatics online.
doi_str_mv 10.1093/bioinformatics/bty783
format article
fullrecord <record><control><sourceid>proquest_TOX</sourceid><recordid>TN_cdi_proquest_miscellaneous_2105070477</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><oup_id>10.1093/bioinformatics/bty783</oup_id><sourcerecordid>2105070477</sourcerecordid><originalsourceid>FETCH-LOGICAL-c397t-82d58419c40239eb59cbd7f322c2f86679450d2c5e12b0b7d3288fa591e548903</originalsourceid><addsrcrecordid>eNqNkM1O3DAUhS1EVWDaRyjykk3g-i-2lxUqPxJSF6XryHZuBleTeGo7SLPjIfqEPAmpBpC66-r-6DvnSIeQLwzOGVhx4WOK05Dy6GoM5cLXnTbigBwz2ULDQdnDZRetbqQBcUROSvkFoJiU8iM5EsCZaltzTNY-XP-4f3764yYap4rZhRofkfroShNSzrjcaaIj1ofU05po7HGqcdjR9Ii5ybjNWJYP9nSNEzYFa1mcqE9zjb9npC5ntyufyIfBbQp-fp0r8vPq2_3lTXP3_fr28utdE4TVtTG8V0YyGyRwYdErG3yvB8F54INpW22lgp4HhYx78LoX3JjBKctQSWNBrMjZ3neb05JeajfGEnCzcROmuXScgQINUusFVXs05FRKxqHb5ji6vOsYdH877v7tuNt3vOhOXyNmP2L_rnordQFgD6R5-5-eLxbpkQw</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2105070477</pqid></control><display><type>article</type><title>bcGST—an interactive bias-correction method to identify over-represented gene-sets in boutique arrays</title><source>Oxford University Press Open Access</source><creator>Wang, Kevin Y X ; Menzies, Alexander M ; Silva, Ines P ; Wilmott, James S ; Yan, Yibing ; Wongchenko, Matthew ; Kefford, Richard F ; Scolyer, Richard A ; Long, Georgina V ; Tarr, Garth ; Mueller, Samuel ; Yang, Jean Y H</creator><contributor>Kelso, Janet</contributor><creatorcontrib>Wang, Kevin Y X ; Menzies, Alexander M ; Silva, Ines P ; Wilmott, James S ; Yan, Yibing ; Wongchenko, Matthew ; Kefford, Richard F ; Scolyer, Richard A ; Long, Georgina V ; Tarr, Garth ; Mueller, Samuel ; Yang, Jean Y H ; Kelso, Janet</creatorcontrib><description>Abstract Motivation Gene annotation and pathway databases such as Gene Ontology and Kyoto Encyclopaedia of Genes and Genomes are important tools in Gene-Set Test (GST) that describe gene biological functions and associated pathways. 
GST aims to establish an association relationship between a gene-set of interest and an annotation. Importantly, GST tests for over-representation of genes in an annotation term. One implicit assumption of GST is that the gene expression platform captures the complete or a very large proportion of the genome. However, this assumption is neither satisfied for the increasingly popular boutique array nor the custom designed gene expression profiling platform. Specifically, conventional GST is no longer appropriate due to the gene-set selection bias induced during the construction of these platforms. Results We propose bcGST, a bias-corrected GST by introducing bias-correction terms in the contingency table needed for calculating the Fisher’s Exact Test. The adjustment method works by estimating the proportion of genes captured on the array with respect to the genome in order to assist filtration of annotation terms that would otherwise be falsely included or excluded. We illustrate the practicality of bcGST and its stability through multiple differential gene expression analyses in melanoma and the Cancer Genome Atlas cancer studies. Availability and implementation The bcGST method is made available as a Shiny web application at http://shiny.maths.usyd.edu.au/bcGST/. Supplementary information Supplementary data are available at Bioinformatics online.</description><identifier>ISSN: 1367-4803</identifier><identifier>EISSN: 1460-2059</identifier><identifier>EISSN: 1367-4811</identifier><identifier>DOI: 10.1093/bioinformatics/bty783</identifier><identifier>PMID: 30215668</identifier><language>eng</language><publisher>England: Oxford University Press</publisher><subject>Computational Biology ; Gene Expression Profiling ; Gene Ontology ; Genome ; Molecular Sequence Annotation ; Software</subject><ispartof>Bioinformatics, 2019-04, Vol.35 (8), p.1350-1357</ispartof><rights>The Author(s) 2018. Published by Oxford University Press. All rights reserved. 
For permissions, please e-mail: journals.permissions@oup.com 2018</rights><rights>The Author(s) 2018. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c397t-82d58419c40239eb59cbd7f322c2f86679450d2c5e12b0b7d3288fa591e548903</citedby><cites>FETCH-LOGICAL-c397t-82d58419c40239eb59cbd7f322c2f86679450d2c5e12b0b7d3288fa591e548903</cites><orcidid>0000-0003-2615-6102</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>314,780,784,1604,27924,27925</link.rule.ids><linktorsrc>$$Uhttps://dx.doi.org/10.1093/bioinformatics/bty783$$EView_record_in_Oxford_University_Press$$FView_record_in_$$GOxford_University_Press</linktorsrc><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/30215668$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><contributor>Kelso, Janet</contributor><creatorcontrib>Wang, Kevin Y X</creatorcontrib><creatorcontrib>Menzies, Alexander M</creatorcontrib><creatorcontrib>Silva, Ines P</creatorcontrib><creatorcontrib>Wilmott, James S</creatorcontrib><creatorcontrib>Yan, Yibing</creatorcontrib><creatorcontrib>Wongchenko, Matthew</creatorcontrib><creatorcontrib>Kefford, Richard F</creatorcontrib><creatorcontrib>Scolyer, Richard A</creatorcontrib><creatorcontrib>Long, Georgina V</creatorcontrib><creatorcontrib>Tarr, Garth</creatorcontrib><creatorcontrib>Mueller, Samuel</creatorcontrib><creatorcontrib>Yang, Jean Y H</creatorcontrib><title>bcGST—an interactive bias-correction method to identify over-represented gene-sets in boutique arrays</title><title>Bioinformatics</title><addtitle>Bioinformatics</addtitle><description>Abstract Motivation Gene annotation and pathway databases such as Gene Ontology 
and Kyoto Encyclopaedia of Genes and Genomes are important tools in Gene-Set Test (GST) that describe gene biological functions and associated pathways. GST aims to establish an association relationship between a gene-set of interest and an annotation. Importantly, GST tests for over-representation of genes in an annotation term. One implicit assumption of GST is that the gene expression platform captures the complete or a very large proportion of the genome. However, this assumption is neither satisfied for the increasingly popular boutique array nor the custom designed gene expression profiling platform. Specifically, conventional GST is no longer appropriate due to the gene-set selection bias induced during the construction of these platforms. Results We propose bcGST, a bias-corrected GST by introducing bias-correction terms in the contingency table needed for calculating the Fisher’s Exact Test. The adjustment method works by estimating the proportion of genes captured on the array with respect to the genome in order to assist filtration of annotation terms that would otherwise be falsely included or excluded. We illustrate the practicality of bcGST and its stability through multiple differential gene expression analyses in melanoma and the Cancer Genome Atlas cancer studies. Availability and implementation The bcGST method is made available as a Shiny web application at http://shiny.maths.usyd.edu.au/bcGST/. 
Supplementary information Supplementary data are available at Bioinformatics online.</description><subject>Computational Biology</subject><subject>Gene Expression Profiling</subject><subject>Gene Ontology</subject><subject>Genome</subject><subject>Molecular Sequence Annotation</subject><subject>Software</subject><issn>1367-4803</issn><issn>1460-2059</issn><issn>1367-4811</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2019</creationdate><recordtype>article</recordtype><recordid>eNqNkM1O3DAUhS1EVWDaRyjykk3g-i-2lxUqPxJSF6XryHZuBleTeGo7SLPjIfqEPAmpBpC66-r-6DvnSIeQLwzOGVhx4WOK05Dy6GoM5cLXnTbigBwz2ULDQdnDZRetbqQBcUROSvkFoJiU8iM5EsCZaltzTNY-XP-4f3764yYap4rZhRofkfroShNSzrjcaaIj1ofU05po7HGqcdjR9Ii5ybjNWJYP9nSNEzYFa1mcqE9zjb9npC5ntyufyIfBbQp-fp0r8vPq2_3lTXP3_fr28utdE4TVtTG8V0YyGyRwYdErG3yvB8F54INpW22lgp4HhYx78LoX3JjBKctQSWNBrMjZ3neb05JeajfGEnCzcROmuXScgQINUusFVXs05FRKxqHb5ji6vOsYdH877v7tuNt3vOhOXyNmP2L_rnordQFgD6R5-5-eLxbpkQw</recordid><startdate>20190415</startdate><enddate>20190415</enddate><creator>Wang, Kevin Y X</creator><creator>Menzies, Alexander M</creator><creator>Silva, Ines P</creator><creator>Wilmott, James S</creator><creator>Yan, Yibing</creator><creator>Wongchenko, Matthew</creator><creator>Kefford, Richard F</creator><creator>Scolyer, Richard A</creator><creator>Long, Georgina V</creator><creator>Tarr, Garth</creator><creator>Mueller, Samuel</creator><creator>Yang, Jean Y H</creator><general>Oxford University Press</general><scope>CGR</scope><scope>CUY</scope><scope>CVF</scope><scope>ECM</scope><scope>EIF</scope><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7X8</scope><orcidid>https://orcid.org/0000-0003-2615-6102</orcidid></search><sort><creationdate>20190415</creationdate><title>bcGST—an interactive bias-correction method to identify over-represented gene-sets in boutique arrays</title><author>Wang, Kevin Y X ; Menzies, Alexander M ; Silva, Ines P ; Wilmott, James S ; Yan, Yibing ; Wongchenko, Matthew ; 
Kefford, Richard F ; Scolyer, Richard A ; Long, Georgina V ; Tarr, Garth ; Mueller, Samuel ; Yang, Jean Y H</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c397t-82d58419c40239eb59cbd7f322c2f86679450d2c5e12b0b7d3288fa591e548903</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2019</creationdate><topic>Computational Biology</topic><topic>Gene Expression Profiling</topic><topic>Gene Ontology</topic><topic>Genome</topic><topic>Molecular Sequence Annotation</topic><topic>Software</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Wang, Kevin Y X</creatorcontrib><creatorcontrib>Menzies, Alexander M</creatorcontrib><creatorcontrib>Silva, Ines P</creatorcontrib><creatorcontrib>Wilmott, James S</creatorcontrib><creatorcontrib>Yan, Yibing</creatorcontrib><creatorcontrib>Wongchenko, Matthew</creatorcontrib><creatorcontrib>Kefford, Richard F</creatorcontrib><creatorcontrib>Scolyer, Richard A</creatorcontrib><creatorcontrib>Long, Georgina V</creatorcontrib><creatorcontrib>Tarr, Garth</creatorcontrib><creatorcontrib>Mueller, Samuel</creatorcontrib><creatorcontrib>Yang, Jean Y H</creatorcontrib><collection>Medline</collection><collection>MEDLINE</collection><collection>MEDLINE (Ovid)</collection><collection>MEDLINE</collection><collection>MEDLINE</collection><collection>PubMed</collection><collection>CrossRef</collection><collection>MEDLINE - Academic</collection><jtitle>Bioinformatics</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Wang, Kevin Y X</au><au>Menzies, Alexander M</au><au>Silva, Ines P</au><au>Wilmott, James S</au><au>Yan, Yibing</au><au>Wongchenko, Matthew</au><au>Kefford, Richard F</au><au>Scolyer, Richard A</au><au>Long, Georgina V</au><au>Tarr, Garth</au><au>Mueller, Samuel</au><au>Yang, Jean Y H</au><au>Kelso, 
Janet</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>bcGST—an interactive bias-correction method to identify over-represented gene-sets in boutique arrays</atitle><jtitle>Bioinformatics</jtitle><addtitle>Bioinformatics</addtitle><date>2019-04-15</date><risdate>2019</risdate><volume>35</volume><issue>8</issue><spage>1350</spage><epage>1357</epage><pages>1350-1357</pages><issn>1367-4803</issn><eissn>1460-2059</eissn><eissn>1367-4811</eissn><abstract>Abstract Motivation Gene annotation and pathway databases such as Gene Ontology and Kyoto Encyclopaedia of Genes and Genomes are important tools in Gene-Set Test (GST) that describe gene biological functions and associated pathways. GST aims to establish an association relationship between a gene-set of interest and an annotation. Importantly, GST tests for over-representation of genes in an annotation term. One implicit assumption of GST is that the gene expression platform captures the complete or a very large proportion of the genome. However, this assumption is neither satisfied for the increasingly popular boutique array nor the custom designed gene expression profiling platform. Specifically, conventional GST is no longer appropriate due to the gene-set selection bias induced during the construction of these platforms. Results We propose bcGST, a bias-corrected GST by introducing bias-correction terms in the contingency table needed for calculating the Fisher’s Exact Test. The adjustment method works by estimating the proportion of genes captured on the array with respect to the genome in order to assist filtration of annotation terms that would otherwise be falsely included or excluded. We illustrate the practicality of bcGST and its stability through multiple differential gene expression analyses in melanoma and the Cancer Genome Atlas cancer studies. 
Availability and implementation The bcGST method is made available as a Shiny web application at http://shiny.maths.usyd.edu.au/bcGST/. Supplementary information Supplementary data are available at Bioinformatics online.</abstract><cop>England</cop><pub>Oxford University Press</pub><pmid>30215668</pmid><doi>10.1093/bioinformatics/bty783</doi><tpages>8</tpages><orcidid>https://orcid.org/0000-0003-2615-6102</orcidid><oa>free_for_read</oa></addata></record>
fulltext fulltext_linktorsrc
identifier ISSN: 1367-4803
ispartof Bioinformatics, 2019-04, Vol.35 (8), p.1350-1357
issn 1367-4803
1460-2059
1367-4811
language eng
recordid cdi_proquest_miscellaneous_2105070477
source Oxford University Press Open Access
subjects Computational Biology
Gene Expression Profiling
Gene Ontology
Genome
Molecular Sequence Annotation
Software
title bcGST—an interactive bias-correction method to identify over-represented gene-sets in boutique arrays
url http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-08T00%3A22%3A45IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_TOX&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=bcGST%E2%80%94an%20interactive%20bias-correction%20method%20to%20identify%20over-represented%20gene-sets%20in%20boutique%20arrays&rft.jtitle=Bioinformatics&rft.au=Wang,%20Kevin%20Y%20X&rft.date=2019-04-15&rft.volume=35&rft.issue=8&rft.spage=1350&rft.epage=1357&rft.pages=1350-1357&rft.issn=1367-4803&rft.eissn=1460-2059&rft_id=info:doi/10.1093/bioinformatics/bty783&rft_dat=%3Cproquest_TOX%3E2105070477%3C/proquest_TOX%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c397t-82d58419c40239eb59cbd7f322c2f86679450d2c5e12b0b7d3288fa591e548903%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=2105070477&rft_id=info:pmid/30215668&rft_oup_id=10.1093/bioinformatics/bty783&rfr_iscdi=true
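
The description in this record says that conventional GST tests for over-representation with Fisher's Exact Test and implicitly assumes that the platform covers most of the genome. The Python sketch below illustrates only that conventional one-sided test, and how the choice of background (array versus genome) changes its result. It is not the authors' bcGST correction, whose correction terms are not given in this record. The function name and every gene count are hypothetical and chosen purely for illustration.

```python
from math import comb


def overrepresentation_pvalue(n_universe, n_term, n_selected, n_overlap):
    """One-sided Fisher's exact test for over-representation.

    Returns P(X >= n_overlap), where X is hypergeometric: the number of
    term-annotated genes among n_selected genes drawn without replacement
    from a universe of n_universe genes, n_term of which carry the term.
    """
    upper = min(n_term, n_selected)
    total = comb(n_universe, n_selected)
    tail = sum(
        comb(n_term, k) * comb(n_universe - n_term, n_selected - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


# Hypothetical counts (assumptions, not taken from the paper):
# 100 genes of interest, 15 of them annotated to the term.
# Background = the boutique array itself: 800 genes, 80 annotated.
p_array = overrepresentation_pvalue(800, 80, 100, 15)
# Background = the whole genome: 20000 genes, 200 annotated.
p_genome = overrepresentation_pvalue(20000, 200, 100, 15)

print(f"array background:  p = {p_array:.3g}")
print(f"genome background: p = {p_genome:.3g}")
```

With these made-up counts the term is enriched on the array by design (10% of array genes versus 1% of the genome), so using the genome as background gives a much smaller p-value than using the array. This is one way to see why the background assumption matters on a boutique platform.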