Addressing challenges in the production and analysis of illumina sequencing data

Advances in DNA sequencing technologies have made it possible to generate large amounts of sequence data very rapidly and at substantially lower cost than capillary sequencing. These new technologies have specific characteristics and limitations that require either consideration during project desig...

Bibliographic Details
Published in: BMC genomics 2011-07, Vol.12 (1), p.382-382, Article 382
Main Authors: Kircher, Martin, Heyn, Patricia, Kelso, Janet
Format: Article
Language: English
cited_by cdi_FETCH-LOGICAL-b615t-1592dbc6237c55c8354d025ee8a9fa06e2d7db0c54022b8cf90723e5db4d47c43
cites cdi_FETCH-LOGICAL-b615t-1592dbc6237c55c8354d025ee8a9fa06e2d7db0c54022b8cf90723e5db4d47c43
container_end_page 382
container_issue 1
container_start_page 382
container_title BMC genomics
container_volume 12
creator Kircher, Martin
Heyn, Patricia
Kelso, Janet
description Advances in DNA sequencing technologies have made it possible to generate large amounts of sequence data very rapidly and at substantially lower cost than capillary sequencing. These new technologies have specific characteristics and limitations that require either consideration during project design, or which must be addressed during data analysis. Specialist skills, both at the laboratory and the computational stages of project design and analysis, are crucial to the generation of high quality data from these new platforms. The Illumina sequencers (including the Genome Analyzers I/II/IIe/IIx and the new HiScan and HiSeq) represent a widely used platform providing parallel readout of several hundred million immobilized sequences using fluorescent-dye reversible-terminator chemistry. Sequencing library quality, sample handling, instrument settings and sequencing chemistry have a strong impact on sequencing run quality. The presence of adapter chimeras and adapter sequences at the end of short-insert molecules, as well as increased error rates and short read lengths complicate many computational analyses. We discuss here some of the factors that influence the frequency and severity of these problems and provide solutions for circumventing these. Further, we present a set of general principles for good analysis practice that enable problems with sequencing runs to be identified and dealt with.
doi_str_mv 10.1186/1471-2164-12-382
format article
fullrecord <record><control><sourceid>gale_doaj_</sourceid><recordid>TN_cdi_doaj_primary_oai_doaj_org_article_a800c47960fe4460b296f10aaadfbb52</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><galeid>A265517499</galeid><doaj_id>oai_doaj_org_article_a800c47960fe4460b296f10aaadfbb52</doaj_id><sourcerecordid>A265517499</sourcerecordid><originalsourceid>FETCH-LOGICAL-b615t-1592dbc6237c55c8354d025ee8a9fa06e2d7db0c54022b8cf90723e5db4d47c43</originalsourceid><addsrcrecordid>eNp1ksuL1TAUxosozji6dyUFF66qSZpXN8Ll4mNgQBe6DieP9mZIk2vSDsx_b2vHy1zQRUg458uP7zyq6jVG7zGW_AOmAjcEc9pg0rSSPKkuT6Gnj94X1YtSbhHCQhL2vLogWCJMEbusvu-sza4UH4faHCAEFwdXah_r6eDqY052NpNPsYZolwPhvvhSp772Icyjj1AX92t20awACxO8rJ71EIp79XBfVT8_f_qx_9rcfPtyvd_dNJpjNjWYdcRqw0krDGNGtoxaRJhzEroeEHfECquRYRQRoqXpOyRI65jV1FJhaHtVXW9cm-BWHbMfId-rBF79CaQ8KMiTN8EpkAgZKjqOekcpR5p0vMcIAGyvNSML6-PGOs56dNa4OGUIZ9DzTPQHNaQ71WLeMi4WwH4DaJ_-AzjPmDSqdTpqnY7CRC3DWyjvHmzktPS0TGr0xbgQILo0FyUlZ52Q3ap8uykHWOrzsU8L1axqtSOcMSxo1y0qtKlMTqVk158MYaTW_fmXhTePO3H68Hdh2t-wPcDa</addsrcrecordid><sourcetype>Open Website</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>886597892</pqid></control><display><type>article</type><title>Addressing challenges in the production and analysis of illumina sequencing data</title><source>Publicly Available Content Database</source><source>PubMed Central</source><creator>Kircher, Martin ; Heyn, Patricia ; Kelso, Janet</creator><creatorcontrib>Kircher, Martin ; Heyn, Patricia ; Kelso, Janet</creatorcontrib><description>Advances in DNA sequencing technologies have made it possible to generate large amounts of sequence data very rapidly and at substantially lower cost than capillary sequencing. These new technologies have specific characteristics and limitations that require either consideration during project design, or which must be addressed during data analysis. 
Specialist skills, both at the laboratory and the computational stages of project design and analysis, are crucial to the generation of high quality data from these new platforms. The Illumina sequencers (including the Genome Analyzers I/II/IIe/IIx and the new HiScan and HiSeq) represent a widely used platform providing parallel readout of several hundred million immobilized sequences using fluorescent-dye reversible-terminator chemistry. Sequencing library quality, sample handling, instrument settings and sequencing chemistry have a strong impact on sequencing run quality. The presence of adapter chimeras and adapter sequences at the end of short-insert molecules, as well as increased error rates and short read lengths complicate many computational analyses. We discuss here some of the factors that influence the frequency and severity of these problems and provide solutions for circumventing these. Further, we present a set of general principles for good analysis practice that enable problems with sequencing runs to be identified and dealt with.</description><identifier>ISSN: 1471-2164</identifier><identifier>EISSN: 1471-2164</identifier><identifier>DOI: 10.1186/1471-2164-12-382</identifier><identifier>PMID: 21801405</identifier><language>eng</language><publisher>England: BioMed Central Ltd</publisher><subject>Correspondence ; DNA Primers - genetics ; DNA sequencing ; Gene Library ; Genomes ; Genomic libraries ; Humans ; Image Processing, Computer-Assisted ; Nucleotide sequencing ; Physiological aspects ; Polymerase chain reaction ; Quality Control ; Sequence Analysis, DNA - methods ; Statistics as Topic - methods</subject><ispartof>BMC genomics, 2011-07, Vol.12 (1), p.382-382, Article 382</ispartof><rights>COPYRIGHT 2011 BioMed Central Ltd.</rights><rights>Copyright ©2011 Kircher et al; licensee BioMed Central Ltd. 
2011 Kircher et al; licensee BioMed Central Ltd.</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-b615t-1592dbc6237c55c8354d025ee8a9fa06e2d7db0c54022b8cf90723e5db4d47c43</citedby><cites>FETCH-LOGICAL-b615t-1592dbc6237c55c8354d025ee8a9fa06e2d7db0c54022b8cf90723e5db4d47c43</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://www.ncbi.nlm.nih.gov/pmc/articles/PMC3163567/pdf/$$EPDF$$P50$$Gpubmedcentral$$Hfree_for_read</linktopdf><linktohtml>$$Uhttps://www.ncbi.nlm.nih.gov/pmc/articles/PMC3163567/$$EHTML$$P50$$Gpubmedcentral$$Hfree_for_read</linktohtml><link.rule.ids>230,314,723,776,780,881,27901,27902,36990,53766,53768</link.rule.ids><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/21801405$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><creatorcontrib>Kircher, Martin</creatorcontrib><creatorcontrib>Heyn, Patricia</creatorcontrib><creatorcontrib>Kelso, Janet</creatorcontrib><title>Addressing challenges in the production and analysis of illumina sequencing data</title><title>BMC genomics</title><addtitle>BMC Genomics</addtitle><description>Advances in DNA sequencing technologies have made it possible to generate large amounts of sequence data very rapidly and at substantially lower cost than capillary sequencing. These new technologies have specific characteristics and limitations that require either consideration during project design, or which must be addressed during data analysis. Specialist skills, both at the laboratory and the computational stages of project design and analysis, are crucial to the generation of high quality data from these new platforms. 
The Illumina sequencers (including the Genome Analyzers I/II/IIe/IIx and the new HiScan and HiSeq) represent a widely used platform providing parallel readout of several hundred million immobilized sequences using fluorescent-dye reversible-terminator chemistry. Sequencing library quality, sample handling, instrument settings and sequencing chemistry have a strong impact on sequencing run quality. The presence of adapter chimeras and adapter sequences at the end of short-insert molecules, as well as increased error rates and short read lengths complicate many computational analyses. We discuss here some of the factors that influence the frequency and severity of these problems and provide solutions for circumventing these. Further, we present a set of general principles for good analysis practice that enable problems with sequencing runs to be identified and dealt with.</description><subject>Correspondence</subject><subject>DNA Primers - genetics</subject><subject>DNA sequencing</subject><subject>Gene Library</subject><subject>Genomes</subject><subject>Genomic libraries</subject><subject>Humans</subject><subject>Image Processing, Computer-Assisted</subject><subject>Nucleotide sequencing</subject><subject>Physiological aspects</subject><subject>Polymerase chain reaction</subject><subject>Quality Control</subject><subject>Sequence Analysis, DNA - methods</subject><subject>Statistics as Topic - 
methods</subject><issn>1471-2164</issn><issn>1471-2164</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2011</creationdate><recordtype>article</recordtype><sourceid>DOA</sourceid><recordid>eNp1ksuL1TAUxosozji6dyUFF66qSZpXN8Ll4mNgQBe6DieP9mZIk2vSDsx_b2vHy1zQRUg458uP7zyq6jVG7zGW_AOmAjcEc9pg0rSSPKkuT6Gnj94X1YtSbhHCQhL2vLogWCJMEbusvu-sza4UH4faHCAEFwdXah_r6eDqY052NpNPsYZolwPhvvhSp772Icyjj1AX92t20awACxO8rJ71EIp79XBfVT8_f_qx_9rcfPtyvd_dNJpjNjWYdcRqw0krDGNGtoxaRJhzEroeEHfECquRYRQRoqXpOyRI65jV1FJhaHtVXW9cm-BWHbMfId-rBF79CaQ8KMiTN8EpkAgZKjqOekcpR5p0vMcIAGyvNSML6-PGOs56dNa4OGUIZ9DzTPQHNaQ71WLeMi4WwH4DaJ_-AzjPmDSqdTpqnY7CRC3DWyjvHmzktPS0TGr0xbgQILo0FyUlZ52Q3ap8uykHWOrzsU8L1axqtSOcMSxo1y0qtKlMTqVk158MYaTW_fmXhTePO3H68Hdh2t-wPcDa</recordid><startdate>20110729</startdate><enddate>20110729</enddate><creator>Kircher, Martin</creator><creator>Heyn, Patricia</creator><creator>Kelso, Janet</creator><general>BioMed Central Ltd</general><general>BioMed Central</general><general>BMC</general><scope>CGR</scope><scope>CUY</scope><scope>CVF</scope><scope>ECM</scope><scope>EIF</scope><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7X8</scope><scope>5PM</scope><scope>DOA</scope></search><sort><creationdate>20110729</creationdate><title>Addressing challenges in the production and analysis of illumina sequencing data</title><author>Kircher, Martin ; Heyn, Patricia ; Kelso, Janet</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-b615t-1592dbc6237c55c8354d025ee8a9fa06e2d7db0c54022b8cf90723e5db4d47c43</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2011</creationdate><topic>Correspondence</topic><topic>DNA Primers - genetics</topic><topic>DNA sequencing</topic><topic>Gene Library</topic><topic>Genomes</topic><topic>Genomic libraries</topic><topic>Humans</topic><topic>Image Processing, Computer-Assisted</topic><topic>Nucleotide sequencing</topic><topic>Physiological 
aspects</topic><topic>Polymerase chain reaction</topic><topic>Quality Control</topic><topic>Sequence Analysis, DNA - methods</topic><topic>Statistics as Topic - methods</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Kircher, Martin</creatorcontrib><creatorcontrib>Heyn, Patricia</creatorcontrib><creatorcontrib>Kelso, Janet</creatorcontrib><collection>Medline</collection><collection>MEDLINE</collection><collection>MEDLINE (Ovid)</collection><collection>MEDLINE</collection><collection>MEDLINE</collection><collection>PubMed</collection><collection>CrossRef</collection><collection>MEDLINE - Academic</collection><collection>PubMed Central (Full Participant titles)</collection><collection>DOAJ Directory of Open Access Journals</collection><jtitle>BMC genomics</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Kircher, Martin</au><au>Heyn, Patricia</au><au>Kelso, Janet</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Addressing challenges in the production and analysis of illumina sequencing data</atitle><jtitle>BMC genomics</jtitle><addtitle>BMC Genomics</addtitle><date>2011-07-29</date><risdate>2011</risdate><volume>12</volume><issue>1</issue><spage>382</spage><epage>382</epage><pages>382-382</pages><artnum>382</artnum><issn>1471-2164</issn><eissn>1471-2164</eissn><abstract>Advances in DNA sequencing technologies have made it possible to generate large amounts of sequence data very rapidly and at substantially lower cost than capillary sequencing. These new technologies have specific characteristics and limitations that require either consideration during project design, or which must be addressed during data analysis. Specialist skills, both at the laboratory and the computational stages of project design and analysis, are crucial to the generation of high quality data from these new platforms. 
The Illumina sequencers (including the Genome Analyzers I/II/IIe/IIx and the new HiScan and HiSeq) represent a widely used platform providing parallel readout of several hundred million immobilized sequences using fluorescent-dye reversible-terminator chemistry. Sequencing library quality, sample handling, instrument settings and sequencing chemistry have a strong impact on sequencing run quality. The presence of adapter chimeras and adapter sequences at the end of short-insert molecules, as well as increased error rates and short read lengths complicate many computational analyses. We discuss here some of the factors that influence the frequency and severity of these problems and provide solutions for circumventing these. Further, we present a set of general principles for good analysis practice that enable problems with sequencing runs to be identified and dealt with.</abstract><cop>England</cop><pub>BioMed Central Ltd</pub><pmid>21801405</pmid><doi>10.1186/1471-2164-12-382</doi><tpages>1</tpages><oa>free_for_read</oa></addata></record>
fulltext fulltext
identifier ISSN: 1471-2164
ispartof BMC genomics, 2011-07, Vol.12 (1), p.382-382, Article 382
issn 1471-2164
1471-2164
language eng
recordid cdi_doaj_primary_oai_doaj_org_article_a800c47960fe4460b296f10aaadfbb52
source Publicly Available Content Database; PubMed Central
subjects Correspondence
DNA Primers - genetics
DNA sequencing
Gene Library
Genomes
Genomic libraries
Humans
Image Processing, Computer-Assisted
Nucleotide sequencing
Physiological aspects
Polymerase chain reaction
Quality Control
Sequence Analysis, DNA - methods
Statistics as Topic - methods
title Addressing challenges in the production and analysis of illumina sequencing data
url http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-22T00%3A25%3A38IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-gale_doaj_&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Addressing%20challenges%20in%20the%20production%20and%20analysis%20of%20illumina%20sequencing%20data&rft.jtitle=BMC%20genomics&rft.au=Kircher,%20Martin&rft.date=2011-07-29&rft.volume=12&rft.issue=1&rft.spage=382&rft.epage=382&rft.pages=382-382&rft.artnum=382&rft.issn=1471-2164&rft.eissn=1471-2164&rft_id=info:doi/10.1186/1471-2164-12-382&rft_dat=%3Cgale_doaj_%3EA265517499%3C/gale_doaj_%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-b615t-1592dbc6237c55c8354d025ee8a9fa06e2d7db0c54022b8cf90723e5db4d47c43%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=886597892&rft_id=info:pmid/21801405&rft_galeid=A265517499&rfr_iscdi=true