
Precision information extraction for rare disease epidemiology at scale

The United Nations recently made a call to address the challenges of an estimated 300 million persons worldwide living with a rare disease through the collection, analysis, and dissemination of disaggregated data. Epidemiologic Information (EI) regarding prevalence and incidence data of rare disease...

Bibliographic Details
Published in: Journal of translational medicine 2023-02, Vol.21 (1), p.157-157, Article 157
Main Authors: Kariampuzha, William Z, Alyea, Gioconda, Qu, Sue, Sanjak, Jaleal, Mathé, Ewy, Sid, Eric, Chatelaine, Haley, Yadaw, Arjun, Xu, Yanji, Zhu, Qian
Format: Article
Language: English
Subjects:
cited_by cdi_FETCH-LOGICAL-c563t-28ee34b42f8a7e6d49601801b2d505d8094d1b77ae5dd88df1825bc37034a8d93
cites cdi_FETCH-LOGICAL-c563t-28ee34b42f8a7e6d49601801b2d505d8094d1b77ae5dd88df1825bc37034a8d93
container_end_page 157
container_issue 1
container_start_page 157
container_title Journal of translational medicine
container_volume 21
creator Kariampuzha, William Z
Alyea, Gioconda
Qu, Sue
Sanjak, Jaleal
Mathé, Ewy
Sid, Eric
Chatelaine, Haley
Yadaw, Arjun
Xu, Yanji
Zhu, Qian
description The United Nations recently made a call to address the challenges of an estimated 300 million persons worldwide living with a rare disease through the collection, analysis, and dissemination of disaggregated data. Epidemiologic Information (EI) regarding prevalence and incidence data of rare diseases is sparse and current paradigms of identifying, extracting, and curating EI rely upon time-intensive, error-prone manual processes. With these limitations, a clear understanding of the variation in epidemiology and outcomes for rare disease patients is hampered. This challenges the public health of rare disease patients through a lack of information necessary to prioritize research, policy decisions, therapeutic development, and health system allocations. In this study, we developed a newly curated epidemiology corpus for Named Entity Recognition (NER), a deep learning framework, and a novel rare disease epidemiologic information pipeline named EpiPipeline4RD consisting of a web interface and RESTful API. For the corpus creation, we programmatically gathered a representative sample of rare disease epidemiologic abstracts, utilized weakly-supervised machine learning techniques to label the dataset, and manually validated the labeled dataset. For the deep learning framework development, we fine-tuned our dataset and adapted the BioBERT model for NER. We measured the performance of our BioBERT model for epidemiology entity recognition quantitatively with precision, recall, and F1 and qualitatively through a comparison with Orphanet. We demonstrated the ability for our pipeline to gather, identify, and extract epidemiology information from rare disease abstracts through three case studies. We developed a deep learning model to extract EI with overall F1 scores of 0.817 and 0.878, evaluated at the entity-level and token-level respectively, and which achieved comparable qualitative results to Orphanet's collection paradigm. Additionally, case studies of the rare diseases Classic homocystinuria, GRACILE syndrome, and Phenylketonuria demonstrated the adequate recall of abstracts with epidemiology information, high precision of epidemiology information extraction through our deep learning model, and the increased efficiency of EpiPipeline4RD compared to a manual curation paradigm. EpiPipeline4RD demonstrated high performance of EI extraction from rare disease literature to augment manual curation processes. This automated information curation paradigm will not only effectively empower development of the NIH Genetic and Rare Diseases Information Center (GARD), but also support the public health of the rare disease community.
doi_str_mv 10.1186/s12967-023-04011-y
format article
fullrecord <record><control><sourceid>gale_doaj_</sourceid><recordid>TN_cdi_doaj_primary_oai_doaj_org_article_97c68f5ddac148e2af07b15e158462a3</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><galeid>A739066453</galeid><doaj_id>oai_doaj_org_article_97c68f5ddac148e2af07b15e158462a3</doaj_id><sourcerecordid>A739066453</sourcerecordid><originalsourceid>FETCH-LOGICAL-c563t-28ee34b42f8a7e6d49601801b2d505d8094d1b77ae5dd88df1825bc37034a8d93</originalsourceid><addsrcrecordid>eNptUstuFDEQtBCIPOAHOKCRuHCZ4PfjghRFJESKBAc4Wx67Z_FqZrzYsxH793h2Q8gi5INb7aqSu7oQekPwBSFafiiEGqlaTFmLOSak3T1Dp4Qr0wqt5PMn9Qk6K2WNMeWCm5fohEktBGH8FN18zeBjiWlq4tSnPLp5qeHXnJ3fl7XZZJehCbGAK9DAJgYYYxrSate4uSneDfAKvejdUOD1w32Ovl9_-nb1ub37cnN7dXnXeiHZ3FINwHjHaa-dAhm4kZhoTDoaBBZBY8MD6ZRyIELQOvREU9F5pjDjTgfDztHtQTckt7abHEeXdza5aPeNlFfW5Tn6AaxRXuq-6jhPuAbqeqw6IoAIzSV1rGp9PGhttt0IwcNUhx6ORI9fpvjDrtK9NUZRyXgVeP8gkNPPLZTZjrF4GAY3QdoWS5UmtNosaIW--we6Tts8VasWlBaEGC3_olbVUrssZFnDImovFTNYSi6Wf1_8B1XPshafJuhj7R8R6IHgcyolQ_84I8F2iZI9RMnWKNl9lOyukt4-deeR8ic77DdSiMNO</addsrcrecordid><sourcetype>Open Website</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2788511986</pqid></control><display><type>article</type><title>Precision information extraction for rare disease epidemiology at scale</title><source>Publicly Available Content (ProQuest)</source><source>PubMed Central</source><creator>Kariampuzha, William Z ; Alyea, Gioconda ; Qu, Sue ; Sanjak, Jaleal ; Mathé, Ewy ; Sid, Eric ; Chatelaine, Haley ; Yadaw, Arjun ; Xu, Yanji ; Zhu, Qian</creator><creatorcontrib>Kariampuzha, William Z ; Alyea, Gioconda ; Qu, Sue ; Sanjak, Jaleal ; Mathé, Ewy ; Sid, Eric ; Chatelaine, Haley ; Yadaw, Arjun ; Xu, Yanji ; Zhu, Qian</creatorcontrib><description>The United Nations recently made a call to address the challenges of an estimated 300 million persons worldwide living with a rare disease through the collection, analysis, and dissemination of disaggregated data. 
Epidemiologic Information (EI) regarding prevalence and incidence data of rare diseases is sparse and current paradigms of identifying, extracting, and curating EI rely upon time-intensive, error-prone manual processes. With these limitations, a clear understanding of the variation in epidemiology and outcomes for rare disease patients is hampered. This challenges the public health of rare diseases patients through a lack of information necessary to prioritize research, policy decisions, therapeutic development, and health system allocations. In this study, we developed a newly curated epidemiology corpus for Named Entity Recognition (NER), a deep learning framework, and a novel rare disease epidemiologic information pipeline named EpiPipeline4RD consisting of a web interface and Restful API. For the corpus creation, we programmatically gathered a representative sample of rare disease epidemiologic abstracts, utilized weakly-supervised machine learning techniques to label the dataset, and manually validated the labeled dataset. For the deep learning framework development, we fine-tuned our dataset and adapted the BioBERT model for NER. We measured the performance of our BioBERT model for epidemiology entity recognition quantitatively with precision, recall, and F1 and qualitatively through a comparison with Orphanet. We demonstrated the ability for our pipeline to gather, identify, and extract epidemiology information from rare disease abstracts through three case studies. We developed a deep learning model to extract EI with overall F1 scores of 0.817 and 0.878, evaluated at the entity-level and token-level respectively, and which achieved comparable qualitative results to Orphanet's collection paradigm. 
Additionally, case studies of the rare diseases Classic homocystinuria, GRACILE syndrome, Phenylketonuria demonstrated the adequate recall of abstracts with epidemiology information, high precision of epidemiology information extraction through our deep learning model, and the increased efficiency of EpiPipeline4RD compared to a manual curation paradigm. EpiPipeline4RD demonstrated high performance of EI extraction from rare disease literature to augment manual curation processes. This automated information curation paradigm will not only effectively empower development of the NIH Genetic and Rare Diseases Information Center (GARD), but also support the public health of the rare disease community.</description><identifier>ISSN: 1479-5876</identifier><identifier>EISSN: 1479-5876</identifier><identifier>DOI: 10.1186/s12967-023-04011-y</identifier><identifier>PMID: 36855134</identifier><language>eng</language><publisher>England: BioMed Central Ltd</publisher><subject>Acidosis, Lactic ; Case studies ; Cholestasis ; Datasets ; Deep learning ; Epidemiology ; Homocystinuria ; Humans ; Information processing ; Information Storage and Retrieval ; Labeling ; Machine learning ; Mathematical models ; Phenylketonuria ; Public Health ; Rare diseases ; Rare Diseases - diagnosis ; Rare Diseases - epidemiology ; Supervision</subject><ispartof>Journal of translational medicine, 2023-02, Vol.21 (1), p.157-157, Article 157</ispartof><rights>2023. This is a U.S. Government work and not under copyright protection in the US; foreign copyright protection may apply.</rights><rights>COPYRIGHT 2023 BioMed Central Ltd.</rights><rights>2023. This work is licensed under http://creativecommons.org/licenses/by/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.</rights><rights>This is a U.S. 
Government work and not under copyright protection in the US; foreign copyright protection may apply 2023</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c563t-28ee34b42f8a7e6d49601801b2d505d8094d1b77ae5dd88df1825bc37034a8d93</citedby><cites>FETCH-LOGICAL-c563t-28ee34b42f8a7e6d49601801b2d505d8094d1b77ae5dd88df1825bc37034a8d93</cites><orcidid>0000-0002-4858-6333</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://www.ncbi.nlm.nih.gov/pmc/articles/PMC9972634/pdf/$$EPDF$$P50$$Gpubmedcentral$$Hfree_for_read</linktopdf><linktohtml>$$Uhttps://www.proquest.com/docview/2788511986?pq-origsite=primo$$EHTML$$P50$$Gproquest$$Hfree_for_read</linktohtml><link.rule.ids>230,314,723,776,780,881,25732,27903,27904,36991,36992,44569,53769,53771</link.rule.ids><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/36855134$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><creatorcontrib>Kariampuzha, William Z</creatorcontrib><creatorcontrib>Alyea, Gioconda</creatorcontrib><creatorcontrib>Qu, Sue</creatorcontrib><creatorcontrib>Sanjak, Jaleal</creatorcontrib><creatorcontrib>Mathé, Ewy</creatorcontrib><creatorcontrib>Sid, Eric</creatorcontrib><creatorcontrib>Chatelaine, Haley</creatorcontrib><creatorcontrib>Yadaw, Arjun</creatorcontrib><creatorcontrib>Xu, Yanji</creatorcontrib><creatorcontrib>Zhu, Qian</creatorcontrib><title>Precision information extraction for rare disease epidemiology at scale</title><title>Journal of translational medicine</title><addtitle>J Transl Med</addtitle><description>The United Nations recently made a call to address the challenges of an estimated 300 million persons worldwide living with a rare disease through the collection, analysis, and dissemination of disaggregated data. 
Epidemiologic Information (EI) regarding prevalence and incidence data of rare diseases is sparse and current paradigms of identifying, extracting, and curating EI rely upon time-intensive, error-prone manual processes. With these limitations, a clear understanding of the variation in epidemiology and outcomes for rare disease patients is hampered. This challenges the public health of rare diseases patients through a lack of information necessary to prioritize research, policy decisions, therapeutic development, and health system allocations. In this study, we developed a newly curated epidemiology corpus for Named Entity Recognition (NER), a deep learning framework, and a novel rare disease epidemiologic information pipeline named EpiPipeline4RD consisting of a web interface and Restful API. For the corpus creation, we programmatically gathered a representative sample of rare disease epidemiologic abstracts, utilized weakly-supervised machine learning techniques to label the dataset, and manually validated the labeled dataset. For the deep learning framework development, we fine-tuned our dataset and adapted the BioBERT model for NER. We measured the performance of our BioBERT model for epidemiology entity recognition quantitatively with precision, recall, and F1 and qualitatively through a comparison with Orphanet. We demonstrated the ability for our pipeline to gather, identify, and extract epidemiology information from rare disease abstracts through three case studies. We developed a deep learning model to extract EI with overall F1 scores of 0.817 and 0.878, evaluated at the entity-level and token-level respectively, and which achieved comparable qualitative results to Orphanet's collection paradigm. 
Additionally, case studies of the rare diseases Classic homocystinuria, GRACILE syndrome, Phenylketonuria demonstrated the adequate recall of abstracts with epidemiology information, high precision of epidemiology information extraction through our deep learning model, and the increased efficiency of EpiPipeline4RD compared to a manual curation paradigm. EpiPipeline4RD demonstrated high performance of EI extraction from rare disease literature to augment manual curation processes. This automated information curation paradigm will not only effectively empower development of the NIH Genetic and Rare Diseases Information Center (GARD), but also support the public health of the rare disease community.</description><subject>Acidosis, Lactic</subject><subject>Case studies</subject><subject>Cholestasis</subject><subject>Datasets</subject><subject>Deep learning</subject><subject>Epidemiology</subject><subject>Homocystinuria</subject><subject>Humans</subject><subject>Information processing</subject><subject>Information Storage and Retrieval</subject><subject>Labeling</subject><subject>Machine learning</subject><subject>Mathematical models</subject><subject>Phenylketonuria</subject><subject>Public Health</subject><subject>Rare diseases</subject><subject>Rare Diseases - diagnosis</subject><subject>Rare Diseases - 
epidemiology</subject><subject>Supervision</subject><issn>1479-5876</issn><issn>1479-5876</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2023</creationdate><recordtype>article</recordtype><sourceid>PIMPY</sourceid><sourceid>DOA</sourceid><recordid>eNptUstuFDEQtBCIPOAHOKCRuHCZ4PfjghRFJESKBAc4Wx67Z_FqZrzYsxH793h2Q8gi5INb7aqSu7oQekPwBSFafiiEGqlaTFmLOSak3T1Dp4Qr0wqt5PMn9Qk6K2WNMeWCm5fohEktBGH8FN18zeBjiWlq4tSnPLp5qeHXnJ3fl7XZZJehCbGAK9DAJgYYYxrSate4uSneDfAKvejdUOD1w32Ovl9_-nb1ub37cnN7dXnXeiHZ3FINwHjHaa-dAhm4kZhoTDoaBBZBY8MD6ZRyIELQOvREU9F5pjDjTgfDztHtQTckt7abHEeXdza5aPeNlFfW5Tn6AaxRXuq-6jhPuAbqeqw6IoAIzSV1rGp9PGhttt0IwcNUhx6ORI9fpvjDrtK9NUZRyXgVeP8gkNPPLZTZjrF4GAY3QdoWS5UmtNosaIW--we6Tts8VasWlBaEGC3_olbVUrssZFnDImovFTNYSi6Wf1_8B1XPshafJuhj7R8R6IHgcyolQ_84I8F2iZI9RMnWKNl9lOyukt4-deeR8ic77DdSiMNO</recordid><startdate>20230228</startdate><enddate>20230228</enddate><creator>Kariampuzha, William Z</creator><creator>Alyea, Gioconda</creator><creator>Qu, Sue</creator><creator>Sanjak, Jaleal</creator><creator>Mathé, Ewy</creator><creator>Sid, Eric</creator><creator>Chatelaine, Haley</creator><creator>Yadaw, Arjun</creator><creator>Xu, Yanji</creator><creator>Zhu, Qian</creator><general>BioMed Central Ltd</general><general>BioMed 
Central</general><general>BMC</general><scope>CGR</scope><scope>CUY</scope><scope>CVF</scope><scope>ECM</scope><scope>EIF</scope><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>3V.</scope><scope>7T5</scope><scope>7X7</scope><scope>7XB</scope><scope>88E</scope><scope>8FI</scope><scope>8FJ</scope><scope>8FK</scope><scope>ABUWG</scope><scope>AFKRA</scope><scope>AZQEC</scope><scope>BENPR</scope><scope>CCPQU</scope><scope>DWQXO</scope><scope>FYUFA</scope><scope>GHDGH</scope><scope>H94</scope><scope>K9.</scope><scope>M0S</scope><scope>M1P</scope><scope>PIMPY</scope><scope>PQEST</scope><scope>PQQKQ</scope><scope>PQUKI</scope><scope>PRINS</scope><scope>7X8</scope><scope>5PM</scope><scope>DOA</scope><orcidid>https://orcid.org/0000-0002-4858-6333</orcidid></search><sort><creationdate>20230228</creationdate><title>Precision information extraction for rare disease epidemiology at scale</title><author>Kariampuzha, William Z ; Alyea, Gioconda ; Qu, Sue ; Sanjak, Jaleal ; Mathé, Ewy ; Sid, Eric ; Chatelaine, Haley ; Yadaw, Arjun ; Xu, Yanji ; Zhu, Qian</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c563t-28ee34b42f8a7e6d49601801b2d505d8094d1b77ae5dd88df1825bc37034a8d93</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2023</creationdate><topic>Acidosis, Lactic</topic><topic>Case studies</topic><topic>Cholestasis</topic><topic>Datasets</topic><topic>Deep learning</topic><topic>Epidemiology</topic><topic>Homocystinuria</topic><topic>Humans</topic><topic>Information processing</topic><topic>Information Storage and Retrieval</topic><topic>Labeling</topic><topic>Machine learning</topic><topic>Mathematical models</topic><topic>Phenylketonuria</topic><topic>Public Health</topic><topic>Rare diseases</topic><topic>Rare Diseases - diagnosis</topic><topic>Rare Diseases - 
epidemiology</topic><topic>Supervision</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Kariampuzha, William Z</creatorcontrib><creatorcontrib>Alyea, Gioconda</creatorcontrib><creatorcontrib>Qu, Sue</creatorcontrib><creatorcontrib>Sanjak, Jaleal</creatorcontrib><creatorcontrib>Mathé, Ewy</creatorcontrib><creatorcontrib>Sid, Eric</creatorcontrib><creatorcontrib>Chatelaine, Haley</creatorcontrib><creatorcontrib>Yadaw, Arjun</creatorcontrib><creatorcontrib>Xu, Yanji</creatorcontrib><creatorcontrib>Zhu, Qian</creatorcontrib><collection>Medline</collection><collection>MEDLINE</collection><collection>MEDLINE (Ovid)</collection><collection>MEDLINE</collection><collection>MEDLINE</collection><collection>PubMed</collection><collection>CrossRef</collection><collection>ProQuest Central (Corporate)</collection><collection>Immunology Abstracts</collection><collection>Health &amp; Medical Collection</collection><collection>ProQuest Central (purchase pre-March 2016)</collection><collection>Medical Database (Alumni Edition)</collection><collection>Hospital Premium Collection</collection><collection>Hospital Premium Collection (Alumni Edition)</collection><collection>ProQuest Central (Alumni) (purchase pre-March 2016)</collection><collection>ProQuest Central (Alumni)</collection><collection>ProQuest Central</collection><collection>ProQuest Central Essentials</collection><collection>ProQuest Central</collection><collection>ProQuest One Community College</collection><collection>ProQuest Central Korea</collection><collection>Health Research Premium Collection</collection><collection>Health Research Premium Collection (Alumni)</collection><collection>AIDS and Cancer Research Abstracts</collection><collection>ProQuest Health &amp; Medical Complete (Alumni)</collection><collection>Health &amp; Medical Collection (Alumni Edition)</collection><collection>Medical Database</collection><collection>Publicly Available Content 
(ProQuest)</collection><collection>ProQuest One Academic Eastern Edition (DO NOT USE)</collection><collection>ProQuest One Academic</collection><collection>ProQuest One Academic UKI Edition</collection><collection>ProQuest Central China</collection><collection>MEDLINE - Academic</collection><collection>PubMed Central (Full Participant titles)</collection><collection>DOAJ Directory of Open Access Journals</collection><jtitle>Journal of translational medicine</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Kariampuzha, William Z</au><au>Alyea, Gioconda</au><au>Qu, Sue</au><au>Sanjak, Jaleal</au><au>Mathé, Ewy</au><au>Sid, Eric</au><au>Chatelaine, Haley</au><au>Yadaw, Arjun</au><au>Xu, Yanji</au><au>Zhu, Qian</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Precision information extraction for rare disease epidemiology at scale</atitle><jtitle>Journal of translational medicine</jtitle><addtitle>J Transl Med</addtitle><date>2023-02-28</date><risdate>2023</risdate><volume>21</volume><issue>1</issue><spage>157</spage><epage>157</epage><pages>157-157</pages><artnum>157</artnum><issn>1479-5876</issn><eissn>1479-5876</eissn><abstract>The United Nations recently made a call to address the challenges of an estimated 300 million persons worldwide living with a rare disease through the collection, analysis, and dissemination of disaggregated data. Epidemiologic Information (EI) regarding prevalence and incidence data of rare diseases is sparse and current paradigms of identifying, extracting, and curating EI rely upon time-intensive, error-prone manual processes. With these limitations, a clear understanding of the variation in epidemiology and outcomes for rare disease patients is hampered. 
This challenges the public health of rare diseases patients through a lack of information necessary to prioritize research, policy decisions, therapeutic development, and health system allocations. In this study, we developed a newly curated epidemiology corpus for Named Entity Recognition (NER), a deep learning framework, and a novel rare disease epidemiologic information pipeline named EpiPipeline4RD consisting of a web interface and Restful API. For the corpus creation, we programmatically gathered a representative sample of rare disease epidemiologic abstracts, utilized weakly-supervised machine learning techniques to label the dataset, and manually validated the labeled dataset. For the deep learning framework development, we fine-tuned our dataset and adapted the BioBERT model for NER. We measured the performance of our BioBERT model for epidemiology entity recognition quantitatively with precision, recall, and F1 and qualitatively through a comparison with Orphanet. We demonstrated the ability for our pipeline to gather, identify, and extract epidemiology information from rare disease abstracts through three case studies. We developed a deep learning model to extract EI with overall F1 scores of 0.817 and 0.878, evaluated at the entity-level and token-level respectively, and which achieved comparable qualitative results to Orphanet's collection paradigm. Additionally, case studies of the rare diseases Classic homocystinuria, GRACILE syndrome, Phenylketonuria demonstrated the adequate recall of abstracts with epidemiology information, high precision of epidemiology information extraction through our deep learning model, and the increased efficiency of EpiPipeline4RD compared to a manual curation paradigm. EpiPipeline4RD demonstrated high performance of EI extraction from rare disease literature to augment manual curation processes. 
This automated information curation paradigm will not only effectively empower development of the NIH Genetic and Rare Diseases Information Center (GARD), but also support the public health of the rare disease community.</abstract><cop>England</cop><pub>BioMed Central Ltd</pub><pmid>36855134</pmid><doi>10.1186/s12967-023-04011-y</doi><tpages>1</tpages><orcidid>https://orcid.org/0000-0002-4858-6333</orcidid><oa>free_for_read</oa></addata></record>
fulltext fulltext
identifier ISSN: 1479-5876
ispartof Journal of translational medicine, 2023-02, Vol.21 (1), p.157-157, Article 157
issn 1479-5876
1479-5876
language eng
recordid cdi_doaj_primary_oai_doaj_org_article_97c68f5ddac148e2af07b15e158462a3
source Publicly Available Content (ProQuest); PubMed Central
subjects Acidosis, Lactic
Case studies
Cholestasis
Datasets
Deep learning
Epidemiology
Homocystinuria
Humans
Information processing
Information Storage and Retrieval
Labeling
Machine learning
Mathematical models
Phenylketonuria
Public Health
Rare diseases
Rare Diseases - diagnosis
Rare Diseases - epidemiology
Supervision
title Precision information extraction for rare disease epidemiology at scale
url http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-27T02%3A10%3A44IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-gale_doaj_&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Precision%20information%20extraction%20for%20rare%20disease%20epidemiology%20at%20scale&rft.jtitle=Journal%20of%20translational%20medicine&rft.au=Kariampuzha,%20William%20Z&rft.date=2023-02-28&rft.volume=21&rft.issue=1&rft.spage=157&rft.epage=157&rft.pages=157-157&rft.artnum=157&rft.issn=1479-5876&rft.eissn=1479-5876&rft_id=info:doi/10.1186/s12967-023-04011-y&rft_dat=%3Cgale_doaj_%3EA739066453%3C/gale_doaj_%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c563t-28ee34b42f8a7e6d49601801b2d505d8094d1b77ae5dd88df1825bc37034a8d93%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=2788511986&rft_id=info:pmid/36855134&rft_galeid=A739066453&rfr_iscdi=true