Pathogen metadata platform: software for accessing and analyzing pathogen strain information

Pathogen metadata includes information about where and when a pathogen was collected and the type of environment it came from. Along with genomic nucleotide sequence data, this metadata is growing rapidly and becoming a valuable resource not only for research but for biosurveillance and public healt...

Bibliographic Details
Published in: BMC bioinformatics 2016-09, Vol.17 (1), p.379-379, Article 379
Main Authors: Chang, Wenling E, Peterson, Matthew W, Garay, Christopher D, Korves, Tonia
Format: Article
Language: English
cited_by cdi_FETCH-LOGICAL-c528t-83fbbf63f1beaeac865a72e3bd8b35a8883d4b627e8c0d1b294c6a98c3e47aae3
cites cdi_FETCH-LOGICAL-c528t-83fbbf63f1beaeac865a72e3bd8b35a8883d4b627e8c0d1b294c6a98c3e47aae3
container_end_page 379
container_issue 1
container_start_page 379
container_title BMC bioinformatics
container_volume 17
creator Chang, Wenling E
Peterson, Matthew W
Garay, Christopher D
Korves, Tonia
description Pathogen metadata includes information about where and when a pathogen was collected and the type of environment it came from. Along with genomic nucleotide sequence data, this metadata is growing rapidly and becoming a valuable resource not only for research but for biosurveillance and public health. However, current freely available tools for analyzing this data are geared towards bioinformaticians and/or do not provide summaries and visualizations needed to readily interpret results. We designed a platform to easily access and summarize data about pathogen samples. The software includes a PostgreSQL database that captures metadata useful for disease outbreak investigations, and scripts for downloading and parsing data from NCBI BioSample and BioProject into the database. The software provides a user interface to query metadata and obtain standardized results in an exportable, tab-delimited format. To visually summarize results, the user interface provides a 2D histogram for user-selected metadata types and mapping of geolocated entries. The software is built on the LabKey data platform, an open-source data management platform, which enables developers to add functionalities. We demonstrate the use of the software in querying for a pathogen serovar and for genome sequence identifiers. This software enables users to create a local database for pathogen metadata, populate it with data from NCBI, easily query the data, and obtain visual summaries. Some of the components, such as the database, are modular and can be incorporated into other data platforms. The source code is freely available for download at https://github.com/wchangmitre/bioattribution.
doi_str_mv 10.1186/s12859-016-1231-2
format article
pmid 27634291
publisher England: BioMed Central Ltd
creationdate 2016-09-15
eissn 1471-2105
rights COPYRIGHT 2016 BioMed Central Ltd.
Copyright BioMed Central 2016
The MITRE Corporation. 2016
sourcetype Open Access Repository
oa free_for_read
toplevel peer_reviewed
online_resources
fulltext fulltext
identifier ISSN: 1471-2105
ispartof BMC bioinformatics, 2016-09, Vol.17 (1), p.379-379, Article 379
issn 1471-2105
1471-2105
language eng
recordid cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_5025631
source Publicly Available Content Database; PubMed
subjects Databases, Factual
Disease Outbreaks
Epidemics
Genetic aspects
Genome, Microbial
Genomics
Humans
Metadata
Pathogenic microorganisms
Software
United States
title Pathogen metadata platform: software for accessing and analyzing pathogen strain information
url http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-20T14%3A55%3A28IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-gale_pubme&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Pathogen%20metadata%20platform:%20software%20for%20accessing%20and%20analyzing%20pathogen%20strain%20information&rft.jtitle=BMC%20bioinformatics&rft.au=Chang,%20Wenling%20E&rft.date=2016-09-15&rft.volume=17&rft.issue=1&rft.spage=379&rft.epage=379&rft.pages=379-379&rft.artnum=379&rft.issn=1471-2105&rft.eissn=1471-2105&rft_id=info:doi/10.1186/s12859-016-1231-2&rft_dat=%3Cgale_pubme%3EA464400492%3C/gale_pubme%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c528t-83fbbf63f1beaeac865a72e3bd8b35a8883d4b627e8c0d1b294c6a98c3e47aae3%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=1824974840&rft_id=info:pmid/27634291&rft_galeid=A464400492&rfr_iscdi=true