Robust adjustment of sequence tag abundance
The majority of next-generation sequencing technologies effectively sample small amounts of DNA or RNA that are amplified (i.e. copied) before sequencing. The amplification process is not perfect, leading to extreme bias in sequenced read counts. We present a novel procedure to account for amplifica...
Published in: | Bioinformatics 2014-03, Vol.30 (5), p.601-605 |
---|---|
Main Authors: | Baumann, Douglas D; Doerge, Rebecca W |
Format: | Article |
Language: | English |
cited_by | |
---|---|
cites | cdi_FETCH-LOGICAL-c370t-2942a4e287afd87309a001bb70ce641e90ba44ecd2d38e4ceeb4cb244e6427963 |
container_end_page | 605 |
container_issue | 5 |
container_start_page | 601 |
container_title | Bioinformatics |
container_volume | 30 |
creator | Baumann, Douglas D; Doerge, Rebecca W |
description | The majority of next-generation sequencing technologies effectively sample small amounts of DNA or RNA that are amplified (i.e. copied) before sequencing. The amplification process is not perfect, leading to extreme bias in sequenced read counts. We present a novel procedure to account for amplification bias and demonstrate its effectiveness in mitigating gene length dependence when estimating true gene expression.
We tested the proposed method on simulated and real data. Simulations indicated that our method captures true gene expression more effectively than classic censoring-based approaches and leads to power gains in differential expression testing, particularly for shorter genes with high transcription rates. We applied our method to an unreplicated Arabidopsis RNA-seq dataset resulting in disparate gene ontologies arising from gene set enrichment analyses.
R code to perform the RASTA procedures is freely available on the web at www.stat.purdue.edu/~doerge/. |
doi_str_mv | 10.1093/bioinformatics/btt575 |
format | article |
fullrecord | <record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_miscellaneous_1786182381</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>1503001034</sourcerecordid><originalsourceid>FETCH-LOGICAL-c370t-2942a4e287afd87309a001bb70ce641e90ba44ecd2d38e4ceeb4cb244e6427963</originalsourceid><addsrcrecordid>eNqNkEtLw0AUhQdRbK3-BCVLQWLvvDKTpRRfUBBE18PM5EZSmkzNTBb-e1NaC650dR-cc-_hI-SSwi2Fks9dE5quDn1rU-Pj3KUklTwiU8oLlQtN6fGhBz4hZzGuAECCLE7JhAkKmmo5JTevwQ0xZbZajaXFLmWhziJ-Dth5zJL9yKwbusqO0zk5qe064sW-zsj7w_3b4ilfvjw-L-6WuecKUs5KwaxAppWtK604lBaAOqfAYyEoluCsEOgrVnGNwiM64R0bV4Vgqiz4jFzv7m76MOaIybRN9Lhe2w7DEA1VuqCacU3_lkouNANJxT-kwMecwLdSuZP6PsTYY202fdPa_stQMFv65jd9s6M_-q72LwbXYnVw_eDm3ybKhFE</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>1503001034</pqid></control><display><type>article</type><title>Robust adjustment of sequence tag abundance</title><source>Oxford Journals Open Access Collection</source><source>PubMed Central</source><creator>Baumann, Douglas D ; Doerge, Rebecca W</creator><creatorcontrib>Baumann, Douglas D ; Doerge, Rebecca W</creatorcontrib><description>The majority of next-generation sequencing technologies effectively sample small amounts of DNA or RNA that are amplified (i.e. copied) before sequencing. The amplification process is not perfect, leading to extreme bias in sequenced read counts. We present a novel procedure to account for amplification bias and demonstrate its effectiveness in mitigating gene length dependence when estimating true gene expression.
We tested the proposed method on simulated and real data. Simulations indicated that our method captures true gene expression more effectively than classic censoring-based approaches and leads to power gains in differential expression testing, particularly for shorter genes with high transcription rates. We applied our method to an unreplicated Arabidopsis RNA-seq dataset resulting in disparate gene ontologies arising from gene set enrichment analyses.
R code to perform the RASTA procedures is freely available on the web at www.stat.purdue.edu/∼doerge/.</description><identifier>ISSN: 1367-4803</identifier><identifier>EISSN: 1367-4811</identifier><identifier>EISSN: 1460-2059</identifier><identifier>DOI: 10.1093/bioinformatics/btt575</identifier><identifier>PMID: 24108185</identifier><language>eng</language><publisher>England</publisher><subject>Amplification ; Arabidopsis ; Arabidopsis - genetics ; Bias ; Bioinformatics ; Contact ; Expressed Sequence Tags ; Gene expression ; Gene Expression Profiling - methods ; Gene sequencing ; Genes ; High-Throughput Nucleotide Sequencing - methods ; Sequence Analysis, RNA - methods ; Simulation</subject><ispartof>Bioinformatics, 2014-03, Vol.30 (5), p.601-605</ispartof><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><cites>FETCH-LOGICAL-c370t-2942a4e287afd87309a001bb70ce641e90ba44ecd2d38e4ceeb4cb244e6427963</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>314,780,784,27924,27925</link.rule.ids><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/24108185$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><creatorcontrib>Baumann, Douglas D</creatorcontrib><creatorcontrib>Doerge, Rebecca W</creatorcontrib><title>Robust adjustment of sequence tag abundance</title><title>Bioinformatics</title><addtitle>Bioinformatics</addtitle><description>The majority of next-generation sequencing technologies effectively sample small amounts of DNA or RNA that are amplified (i.e. copied) before sequencing. The amplification process is not perfect, leading to extreme bias in sequenced read counts. We present a novel procedure to account for amplification bias and demonstrate its effectiveness in mitigating gene length dependence when estimating true gene expression.
We tested the proposed method on simulated and real data. Simulations indicated that our method captures true gene expression more effectively than classic censoring-based approaches and leads to power gains in differential expression testing, particularly for shorter genes with high transcription rates. We applied our method to an unreplicated Arabidopsis RNA-seq dataset resulting in disparate gene ontologies arising from gene set enrichment analyses.
R code to perform the RASTA procedures is freely available on the web at www.stat.purdue.edu/∼doerge/.</description><subject>Amplification</subject><subject>Arabidopsis</subject><subject>Arabidopsis - genetics</subject><subject>Bias</subject><subject>Bioinformatics</subject><subject>Contact</subject><subject>Expressed Sequence Tags</subject><subject>Gene expression</subject><subject>Gene Expression Profiling - methods</subject><subject>Gene sequencing</subject><subject>Genes</subject><subject>High-Throughput Nucleotide Sequencing - methods</subject><subject>Sequence Analysis, RNA - methods</subject><subject>Simulation</subject><issn>1367-4803</issn><issn>1367-4811</issn><issn>1460-2059</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2014</creationdate><recordtype>article</recordtype><recordid>eNqNkEtLw0AUhQdRbK3-BCVLQWLvvDKTpRRfUBBE18PM5EZSmkzNTBb-e1NaC650dR-cc-_hI-SSwi2Fks9dE5quDn1rU-Pj3KUklTwiU8oLlQtN6fGhBz4hZzGuAECCLE7JhAkKmmo5JTevwQ0xZbZajaXFLmWhziJ-Dth5zJL9yKwbusqO0zk5qe064sW-zsj7w_3b4ilfvjw-L-6WuecKUs5KwaxAppWtK604lBaAOqfAYyEoluCsEOgrVnGNwiM64R0bV4Vgqiz4jFzv7m76MOaIybRN9Lhe2w7DEA1VuqCacU3_lkouNANJxT-kwMecwLdSuZP6PsTYY202fdPa_stQMFv65jd9s6M_-q72LwbXYnVw_eDm3ybKhFE</recordid><startdate>20140301</startdate><enddate>20140301</enddate><creator>Baumann, Douglas D</creator><creator>Doerge, Rebecca W</creator><scope>CGR</scope><scope>CUY</scope><scope>CVF</scope><scope>ECM</scope><scope>EIF</scope><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7X8</scope><scope>7QO</scope><scope>7TM</scope><scope>8FD</scope><scope>FR3</scope><scope>P64</scope><scope>7SC</scope><scope>JQ2</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope></search><sort><creationdate>20140301</creationdate><title>Robust adjustment of sequence tag abundance</title><author>Baumann, Douglas D ; Doerge, Rebecca 
W</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c370t-2942a4e287afd87309a001bb70ce641e90ba44ecd2d38e4ceeb4cb244e6427963</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2014</creationdate><topic>Amplification</topic><topic>Arabidopsis</topic><topic>Arabidopsis - genetics</topic><topic>Bias</topic><topic>Bioinformatics</topic><topic>Contact</topic><topic>Expressed Sequence Tags</topic><topic>Gene expression</topic><topic>Gene Expression Profiling - methods</topic><topic>Gene sequencing</topic><topic>Genes</topic><topic>High-Throughput Nucleotide Sequencing - methods</topic><topic>Sequence Analysis, RNA - methods</topic><topic>Simulation</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Baumann, Douglas D</creatorcontrib><creatorcontrib>Doerge, Rebecca W</creatorcontrib><collection>Medline</collection><collection>MEDLINE</collection><collection>MEDLINE (Ovid)</collection><collection>MEDLINE</collection><collection>MEDLINE</collection><collection>PubMed</collection><collection>CrossRef</collection><collection>MEDLINE - Academic</collection><collection>Biotechnology Research Abstracts</collection><collection>Nucleic Acids Abstracts</collection><collection>Technology Research Database</collection><collection>Engineering Research Database</collection><collection>Biotechnology and BioEngineering Abstracts</collection><collection>Computer and Information Systems Abstracts</collection><collection>ProQuest Computer Science Collection</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><jtitle>Bioinformatics</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Baumann, Douglas 
D</au><au>Doerge, Rebecca W</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Robust adjustment of sequence tag abundance</atitle><jtitle>Bioinformatics</jtitle><addtitle>Bioinformatics</addtitle><date>2014-03-01</date><risdate>2014</risdate><volume>30</volume><issue>5</issue><spage>601</spage><epage>605</epage><pages>601-605</pages><issn>1367-4803</issn><eissn>1367-4811</eissn><eissn>1460-2059</eissn><abstract>The majority of next-generation sequencing technologies effectively sample small amounts of DNA or RNA that are amplified (i.e. copied) before sequencing. The amplification process is not perfect, leading to extreme bias in sequenced read counts. We present a novel procedure to account for amplification bias and demonstrate its effectiveness in mitigating gene length dependence when estimating true gene expression.
We tested the proposed method on simulated and real data. Simulations indicated that our method captures true gene expression more effectively than classic censoring-based approaches and leads to power gains in differential expression testing, particularly for shorter genes with high transcription rates. We applied our method to an unreplicated Arabidopsis RNA-seq dataset resulting in disparate gene ontologies arising from gene set enrichment analyses.
R code to perform the RASTA procedures is freely available on the web at www.stat.purdue.edu/∼doerge/.</abstract><cop>England</cop><pmid>24108185</pmid><doi>10.1093/bioinformatics/btt575</doi><tpages>5</tpages><oa>free_for_read</oa></addata></record> |
fulltext | fulltext |
identifier | ISSN: 1367-4803 |
ispartof | Bioinformatics, 2014-03, Vol.30 (5), p.601-605 |
issn | 1367-4803 1367-4811 1460-2059 |
language | eng |
recordid | cdi_proquest_miscellaneous_1786182381 |
source | Oxford Journals Open Access Collection; PubMed Central |
subjects | Amplification; Arabidopsis; Arabidopsis - genetics; Bias; Bioinformatics; Contact; Expressed Sequence Tags; Gene expression; Gene Expression Profiling - methods; Gene sequencing; Genes; High-Throughput Nucleotide Sequencing - methods; Sequence Analysis, RNA - methods; Simulation |
title | Robust adjustment of sequence tag abundance |
url | http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-19T18%3A46%3A19IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Robust%20adjustment%20of%20sequence%20tag%20abundance&rft.jtitle=Bioinformatics&rft.au=Baumann,%20Douglas%20D&rft.date=2014-03-01&rft.volume=30&rft.issue=5&rft.spage=601&rft.epage=605&rft.pages=601-605&rft.issn=1367-4803&rft.eissn=1367-4811&rft_id=info:doi/10.1093/bioinformatics/btt575&rft_dat=%3Cproquest_cross%3E1503001034%3C/proquest_cross%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c370t-2942a4e287afd87309a001bb70ce641e90ba44ecd2d38e4ceeb4cb244e6427963%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=1503001034&rft_id=info:pmid/24108185&rfr_iscdi=true |