Deciphering DNA nucleotide sequences and their rotation dynamics with interpretable machine learning integrated C₃N nanopores

Bibliographic Details
Published in: Nanoscale 2023-11, Vol.15 (44), p.18080-18092
Main Authors: Jena, Milan Kumar; Mittal, Sneha; Manna, Surya Sekhar; Pathak, Biswarup
Format: Article
Language: English
cited_by
cites cdi_FETCH-LOGICAL-c581-95845877df3e9a0da3248bb58512850a8a86984033f70b1979e8f7b26618fc323
container_end_page 18092
container_issue 44
container_start_page 18080
container_title Nanoscale
container_volume 15
creator Jena, Milan Kumar
Mittal, Sneha
Manna, Surya Sekhar
Pathak, Biswarup
description A solid-state nanopore combined with the quantum transport method has garnered substantial attention and intrigue for DNA sequencing due to its potential for providing rapid and accurate sequencing results, which could have numerous applications in disease diagnosis and personalized medicine. However, the intricate and multifaceted nature of the experimental protocol poses a formidable challenge in attaining precise single nucleotide analysis. Here, we report a machine learning (ML) framework combined with the quantum transport method to accelerate high-throughput single nucleotide recognition with C₃N nanopores. The optimized eXtreme Gradient Boosting Regression (XGBR) algorithm has predicted the fingerprint transmission of each unknown nucleotide and their rotation dynamics with root mean square error scores as low as 0.07. Interpretability of ML black box models with the game theory-based SHapley Additive exPlanation method has provided a quasi-explanation for the model working principle and the complex relationship between electrode-nucleotide coupling and transmission. Moreover, a comprehensive ML classification of nucleotides based on binary, ternary, and quaternary combinations shows maximum accuracy and F1 scores of 100%. The results suggest that ML in tandem with a nanopore device can potentially alleviate the experimental hurdles associated with quantum tunneling and facilitate fast and high-precision DNA sequencing.
doi_str_mv 10.1039/D3NR03771A
format article
fullrecord <record><control><sourceid>pubmed_cross</sourceid><recordid>TN_cdi_crossref_primary_10_1039_D3NR03771A</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>37916991</sourcerecordid><originalsourceid>FETCH-LOGICAL-c581-95845877df3e9a0da3248bb58512850a8a86984033f70b1979e8f7b26618fc323</originalsourceid><addsrcrecordid>eNpFkF1LwzAYhYMobk5v_AGSa6Ga9G2b5HJsfsGYILsvafp2jXRpTTJkV_51HdN5dQ6ch3PxEHLN2R1noO7nsHxjIASfnpBxyjKWAIj09NiLbEQuQnhnrFBQwDkZgVC8UIqPydccjR1a9Nat6Xw5pW5rOuyjrZEG_NiiMxiodjWNLVpPfR91tL2j9c7pjTWBftrYUusi-sFj1FWHdKNNax3SDrV3--P9vPY6Yk1nFOiSOu36ofcYLslZo7uAV785IavHh9XsOVm8Pr3MpovE5JInKpdZLoWoG0ClWa0hzWRV5TLnqcyZlloWSmYMoBGs4koolI2o0qLgsjGQwoTcHm6N70Pw2JSDtxvtdyVn5V5i-S_xB745wMO22mB9RP-swTcuJ21b</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype></control><display><type>article</type><title>Deciphering DNA nucleotide sequences and their rotation dynamics with interpretable machine learning integrated C 3 N nanopores</title><source>Royal Society of Chemistry</source><creator>Jena, Milan Kumar ; Mittal, Sneha ; Manna, Surya Sekhar ; Pathak, Biswarup</creator><creatorcontrib>Jena, Milan Kumar ; Mittal, Sneha ; Manna, Surya Sekhar ; Pathak, Biswarup</creatorcontrib><description>A solid-state nanopore combined with the quantum transport method has garnered substantial attention and intrigue for DNA sequencing due to its potential for providing rapid and accurate sequencing results, which could have numerous applications in disease diagnosis and personalized medicine. However, the intricate and multifaceted nature of the experimental protocol poses a formidable challenge in attaining precise single nucleotide analysis. Here, we report a machine learning (ML) framework combined with the quantum transport method to accelerate high-throughput single nucleotide recognition with C N nanopores. 
The optimized eXtreme Gradient Boosting Regression (XGBR) algorithm has predicted the fingerprint transmission of each unknown nucleotide and their rotation dynamics with root mean square error scores as low as 0.07. Interpretability of ML black box models with the game theory-based SHapley Additive exPlanation method has provided a quasi-explanation for the model working principle and the complex relationship between electrode-nucleotide coupling and transmission. Moreover, a comprehensive ML classification of nucleotides based on binary, ternary, and quaternary combinations shows maximum accuracy and F1 scores of 100%. The results suggest that ML in tandem with a nanopore device can potentially alleviate the experimental hurdles associated with quantum tunneling and facilitate fast and high-precision DNA sequencing.</description><identifier>ISSN: 2040-3364</identifier><identifier>EISSN: 2040-3372</identifier><identifier>DOI: 10.1039/D3NR03771A</identifier><identifier>PMID: 37916991</identifier><language>eng</language><publisher>England</publisher><ispartof>Nanoscale, 2023-11, Vol.15 (44), p.18080-18092</ispartof><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><cites>FETCH-LOGICAL-c581-95845877df3e9a0da3248bb58512850a8a86984033f70b1979e8f7b26618fc323</cites><orcidid>0000-0002-8363-1291 ; 0000-0002-8496-109X ; 0000-0002-9972-9947 ; 0000-0003-2567-4274</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>314,780,784,27924,27925</link.rule.ids><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/37916991$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><creatorcontrib>Jena, Milan Kumar</creatorcontrib><creatorcontrib>Mittal, Sneha</creatorcontrib><creatorcontrib>Manna, Surya Sekhar</creatorcontrib><creatorcontrib>Pathak, Biswarup</creatorcontrib><title>Deciphering DNA 
nucleotide sequences and their rotation dynamics with interpretable machine learning integrated C 3 N nanopores</title><title>Nanoscale</title><addtitle>Nanoscale</addtitle><description>A solid-state nanopore combined with the quantum transport method has garnered substantial attention and intrigue for DNA sequencing due to its potential for providing rapid and accurate sequencing results, which could have numerous applications in disease diagnosis and personalized medicine. However, the intricate and multifaceted nature of the experimental protocol poses a formidable challenge in attaining precise single nucleotide analysis. Here, we report a machine learning (ML) framework combined with the quantum transport method to accelerate high-throughput single nucleotide recognition with C N nanopores. The optimized eXtreme Gradient Boosting Regression (XGBR) algorithm has predicted the fingerprint transmission of each unknown nucleotide and their rotation dynamics with root mean square error scores as low as 0.07. Interpretability of ML black box models with the game theory-based SHapley Additive exPlanation method has provided a quasi-explanation for the model working principle and the complex relationship between electrode-nucleotide coupling and transmission. Moreover, a comprehensive ML classification of nucleotides based on binary, ternary, and quaternary combinations shows maximum accuracy and F1 scores of 100%. 
The results suggest that ML in tandem with a nanopore device can potentially alleviate the experimental hurdles associated with quantum tunneling and facilitate fast and high-precision DNA sequencing.</description><issn>2040-3364</issn><issn>2040-3372</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2023</creationdate><recordtype>article</recordtype><recordid>eNpFkF1LwzAYhYMobk5v_AGSa6Ga9G2b5HJsfsGYILsvafp2jXRpTTJkV_51HdN5dQ6ch3PxEHLN2R1noO7nsHxjIASfnpBxyjKWAIj09NiLbEQuQnhnrFBQwDkZgVC8UIqPydccjR1a9Nat6Xw5pW5rOuyjrZEG_NiiMxiodjWNLVpPfR91tL2j9c7pjTWBftrYUusi-sFj1FWHdKNNax3SDrV3--P9vPY6Yk1nFOiSOu36ofcYLslZo7uAV785IavHh9XsOVm8Pr3MpovE5JInKpdZLoWoG0ClWa0hzWRV5TLnqcyZlloWSmYMoBGs4koolI2o0qLgsjGQwoTcHm6N70Pw2JSDtxvtdyVn5V5i-S_xB745wMO22mB9RP-swTcuJ21b</recordid><startdate>20231116</startdate><enddate>20231116</enddate><creator>Jena, Milan Kumar</creator><creator>Mittal, Sneha</creator><creator>Manna, Surya Sekhar</creator><creator>Pathak, Biswarup</creator><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><orcidid>https://orcid.org/0000-0002-8363-1291</orcidid><orcidid>https://orcid.org/0000-0002-8496-109X</orcidid><orcidid>https://orcid.org/0000-0002-9972-9947</orcidid><orcidid>https://orcid.org/0000-0003-2567-4274</orcidid></search><sort><creationdate>20231116</creationdate><title>Deciphering DNA nucleotide sequences and their rotation dynamics with interpretable machine learning integrated C 3 N nanopores</title><author>Jena, Milan Kumar ; Mittal, Sneha ; Manna, Surya Sekhar ; Pathak, Biswarup</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c581-95845877df3e9a0da3248bb58512850a8a86984033f70b1979e8f7b26618fc323</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2023</creationdate><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Jena, Milan Kumar</creatorcontrib><creatorcontrib>Mittal, 
Sneha</creatorcontrib><creatorcontrib>Manna, Surya Sekhar</creatorcontrib><creatorcontrib>Pathak, Biswarup</creatorcontrib><collection>PubMed</collection><collection>CrossRef</collection><jtitle>Nanoscale</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Jena, Milan Kumar</au><au>Mittal, Sneha</au><au>Manna, Surya Sekhar</au><au>Pathak, Biswarup</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Deciphering DNA nucleotide sequences and their rotation dynamics with interpretable machine learning integrated C 3 N nanopores</atitle><jtitle>Nanoscale</jtitle><addtitle>Nanoscale</addtitle><date>2023-11-16</date><risdate>2023</risdate><volume>15</volume><issue>44</issue><spage>18080</spage><epage>18092</epage><pages>18080-18092</pages><issn>2040-3364</issn><eissn>2040-3372</eissn><abstract>A solid-state nanopore combined with the quantum transport method has garnered substantial attention and intrigue for DNA sequencing due to its potential for providing rapid and accurate sequencing results, which could have numerous applications in disease diagnosis and personalized medicine. However, the intricate and multifaceted nature of the experimental protocol poses a formidable challenge in attaining precise single nucleotide analysis. Here, we report a machine learning (ML) framework combined with the quantum transport method to accelerate high-throughput single nucleotide recognition with C N nanopores. The optimized eXtreme Gradient Boosting Regression (XGBR) algorithm has predicted the fingerprint transmission of each unknown nucleotide and their rotation dynamics with root mean square error scores as low as 0.07. Interpretability of ML black box models with the game theory-based SHapley Additive exPlanation method has provided a quasi-explanation for the model working principle and the complex relationship between electrode-nucleotide coupling and transmission. 
Moreover, a comprehensive ML classification of nucleotides based on binary, ternary, and quaternary combinations shows maximum accuracy and F1 scores of 100%. The results suggest that ML in tandem with a nanopore device can potentially alleviate the experimental hurdles associated with quantum tunneling and facilitate fast and high-precision DNA sequencing.</abstract><cop>England</cop><pmid>37916991</pmid><doi>10.1039/D3NR03771A</doi><tpages>13</tpages><orcidid>https://orcid.org/0000-0002-8363-1291</orcidid><orcidid>https://orcid.org/0000-0002-8496-109X</orcidid><orcidid>https://orcid.org/0000-0002-9972-9947</orcidid><orcidid>https://orcid.org/0000-0003-2567-4274</orcidid></addata></record>
fulltext fulltext
identifier ISSN: 2040-3364
ispartof Nanoscale, 2023-11, Vol.15 (44), p.18080-18092
issn 2040-3364
2040-3372
language eng
recordid cdi_crossref_primary_10_1039_D3NR03771A
source Royal Society of Chemistry
title Deciphering DNA nucleotide sequences and their rotation dynamics with interpretable machine learning integrated C₃N nanopores
url http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-27T08%3A02%3A40IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-pubmed_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Deciphering%20DNA%20nucleotide%20sequences%20and%20their%20rotation%20dynamics%20with%20interpretable%20machine%20learning%20integrated%20C%203%20N%20nanopores&rft.jtitle=Nanoscale&rft.au=Jena,%20Milan%20Kumar&rft.date=2023-11-16&rft.volume=15&rft.issue=44&rft.spage=18080&rft.epage=18092&rft.pages=18080-18092&rft.issn=2040-3364&rft.eissn=2040-3372&rft_id=info:doi/10.1039/D3NR03771A&rft_dat=%3Cpubmed_cross%3E37916991%3C/pubmed_cross%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c581-95845877df3e9a0da3248bb58512850a8a86984033f70b1979e8f7b26618fc323%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_id=info:pmid/37916991&rfr_iscdi=true