Putting everything in its place: using the INSDC compliant Pathogen Data Object Model to better structure genomic data submitted for public health applications

Bibliographic Details
Published in: Microbial genomics 2023-12, Vol.9 (12)
Main Authors: Timme, Ruth E, Karsch-Mizrachi, Ilene, Waheed, Zahra, Arita, Masanori, MacCannell, Duncan, Maguire, Finlay, Petit III, Robert, Page, Andrew J, Mendes, Catarina Inês, Nasar, Muhammad Ibtisam, Oluniyi, Paul, Tyler, Andrea D, Raphenya, Amogelang R, Guthrie, Jennifer L, Olawoye, Idowu, Rinck, Gabriele, O'Cathail, Colman, Lees, John, Cochrane, Guy, Cummins, Carla, Brister, J Rodney, Klimke, William, Feldgarden, Michael, Griffiths, Emma
Format: Article
Language: English
Description
Summary: Fast, efficient public health actions require well-organized and coordinated systems that can supply timely and accurate knowledge. Public databases of pathogen genomic data, such as the International Nucleotide Sequence Database Collaboration (INSDC), have become essential tools for efficient public health decisions. However, these international resources began primarily for academic purposes, rather than for surveillance or interventions. Now, queries need to access not only the whole genomes of multiple pathogens but also make connections using robust contextual metadata to identify issues of public health relevance. Databases that over time developed a patchwork of submission formats and requirements need to be consistently organized and coordinated internationally to allow effective searches. To help resolve these issues, we propose a common pathogen data structure called the Pathogen Data Object Model (DOM) that will formalize the minimum pieces of sequence data and contextual data necessary for general public health uses, while recognizing that submitters will likely withhold a wide range of non-public contextual data. Further, we propose contributors use the Pathogen DOM for all pathogen submissions (bacterial, viral, fungal, and parasites), which will simplify data submissions and provide a consistent and transparent data structure for downstream data analyses. We also highlight how improved submission tools can support the Pathogen DOM, offering users additional easy-to-use methods to ensure this structure is followed.
ISSN: 2057-5858
DOI: 10.1099/mgen.0.001145