Comparison of the two up-to-date sequencing technologies for genome assembly: HiFi reads of Pacific Biosciences Sequel II system and ultralong reads of Oxford Nanopore

Bibliographic Details
Published in: Gigascience 2020-12, Vol.9 (12)
Main Authors: Lang, Dandan, Zhang, Shilai, Ren, Pingping, Liang, Fan, Sun, Zongyi, Meng, Guanliang, Tan, Yuntao, Li, Xiaokang, Lai, Qihua, Han, Lingling, Wang, Depeng, Hu, Fengyi, Wang, Wen, Liu, Shanlin
Format: Article
Language: English
cited_by cdi_FETCH-LOGICAL-c402t-f86013e83231cb8b1c2fe44ea4a32a9566fc2ce8c6d5d189cd6c1e46f1ef730a3
cites cdi_FETCH-LOGICAL-c402t-f86013e83231cb8b1c2fe44ea4a32a9566fc2ce8c6d5d189cd6c1e46f1ef730a3
container_end_page
container_issue 12
container_start_page
container_title Gigascience
container_volume 9
creator Lang, Dandan
Zhang, Shilai
Ren, Pingping
Liang, Fan
Sun, Zongyi
Meng, Guanliang
Tan, Yuntao
Li, Xiaokang
Lai, Qihua
Han, Lingling
Wang, Depeng
Hu, Fengyi
Wang, Wen
Liu, Shanlin
description Abstract. Background: The availability of reference genomes has revolutionized the study of biology. Multiple competing technologies have been developed to improve the quality and robustness of genome assemblies during the past decade. The 2 widely used long-read sequencing providers, Pacific Biosciences (PacBio) and Oxford Nanopore Technologies (ONT), have recently updated their platforms: PacBio enables high-throughput HiFi reads with base-level resolution of >99%, and ONT generated reads as long as 2 Mb. We applied the 2 up-to-date platforms to a single rice individual and then compared the 2 assemblies to investigate the advantages and limitations of each. Results: The results showed that ONT ultralong reads delivered higher contiguity, producing a total of 18 contigs of which 10 were assembled into a single chromosome compared to 394 contigs and 3 chromosome-level contigs for the PacBio assembly. The ONT ultralong reads also prevented assembly errors caused by long repetitive regions, for which we observed a total of 44 genes of false redundancies and 10 genes of false losses in the PacBio assembly, leading to over- or underestimation of the gene families in those long repetitive regions. We also noted that the PacBio HiFi reads generated assemblies with considerably fewer errors at the level of single nucleotides and small insertions and deletions than those of the ONT assembly, which generated an average 1.06 errors per kb and finally engendered 1,475 incorrect gene annotations via altered or truncated protein predictions. Conclusions: It shows that both PacBio HiFi reads and ONT ultralong reads had their own merits. Further genome reference constructions could leverage both techniques to lessen the impact of assembly errors and subsequent annotation mistakes rooted in each.
doi_str_mv 10.1093/gigascience/giaa123
format article
creationdate 2020-12-15
eissn 2047-217X
pmid 33319909
publisher United States: Oxford University Press
rights The Author(s) 2020. Published by Oxford University Press GigaScience.
orcid https://orcid.org/0000-0001-7912-0459
https://orcid.org/0000-0002-7801-2066
https://orcid.org/0000-0001-8118-8313
oa free_for_read
linktohtml https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7736813/
linktopdf https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7736813/pdf/
fulltext fulltext
identifier ISSN: 2047-217X
ispartof Gigascience, 2020-12, Vol.9 (12)
issn 2047-217X
2047-217X
language eng
recordid cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_7736813
source PubMed Central; Oxford Academic Journals (Open Access)
subjects Annotations
Assemblies
Assembly
Chromosomes
Errors
Gene families
Gene sequencing
Genes
Genome
Genomes
High-Throughput Nucleotide Sequencing
Humans
Molecular Sequence Annotation
Nanopores
Nucleotides
Platforms
Sequence Analysis, DNA
Technical Note
title Comparison of the two up-to-date sequencing technologies for genome assembly: HiFi reads of Pacific Biosciences Sequel II system and ultralong reads of Oxford Nanopore
url http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-20T00%3A09%3A50IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_pubme&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Comparison%20of%20the%20two%20up-to-date%20sequencing%20technologies%20for%20genome%20assembly:%20HiFi%20reads%20of%20Pacific%20Biosciences%20Sequel%20II%20system%20and%20ultralong%20reads%20of%20Oxford%20Nanopore&rft.jtitle=Gigascience&rft.au=Lang,%20Dandan&rft.date=2020-12-15&rft.volume=9&rft.issue=12&rft.issn=2047-217X&rft.eissn=2047-217X&rft_id=info:doi/10.1093/gigascience/giaa123&rft_dat=%3Cproquest_pubme%3E2715816037%3C/proquest_pubme%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c402t-f86013e83231cb8b1c2fe44ea4a32a9566fc2ce8c6d5d189cd6c1e46f1ef730a3%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=2715816037&rft_id=info:pmid/33319909&rft_oup_id=10.1093/gigascience/giaa123&rfr_iscdi=true