Loading…

DeepToA: an ensemble deep-learning approach to predicting the theater of activity of a microbiome

MOTIVATIONMetagenomics is the study of microbiomes using DNA sequencing. A microbiome consists of an assemblage of microbes that is associated with a 'theater of activity' (ToA). An important question is, to what degree does the taxonomic and functional content of the former depend on the...

Full description

Saved in:
Bibliographic Details
Published in:Bioinformatics (Oxford, England) England), 2022-10, Vol.38 (20), p.4670-4676
Main Authors: Zeng, Wenhuan, Gautam, Anupam, Huson, Daniel H
Format: Article
Language:English
Citations: Items that this one cites
Items that cite this one
Online Access:Get full text
Tags: Add Tag
No Tags, Be the first to tag this record!
Description
Summary:MOTIVATIONMetagenomics is the study of microbiomes using DNA sequencing. A microbiome consists of an assemblage of microbes that is associated with a 'theater of activity' (ToA). An important question is, to what degree does the taxonomic and functional content of the former depend on the (details of the) latter? Here, we investigate a related technical question: Given a taxonomic and/or functional profile estimated from metagenomic sequencing data, how to predict the associated ToA? We present a deep-learning approach to this question. We use both taxonomic and functional profiles as input. We apply node2vec to embed hierarchical taxonomic profiles into numerical vectors. We then perform dimension reduction using clustering, to address the sparseness of the taxonomic data and thus make the problem more amenable to deep-learning algorithms. Functional features are combined with textual descriptions of protein families or domains. We present an ensemble deep-learning framework DeepToA for predicting the ToA of amicrobial community, based on taxonomic and functional profiles. We use SHAP (SHapley Additive exPlanations) values to determine which taxonomic and functional features are important for the prediction. RESULTSBased on 7560 metagenomic profiles downloaded from MGnify, classified into 10 different theaters of activity, we demonstrate that DeepToA has an accuracy of 98.30%. We show that adding textual information to functional features increases the accuracy. AVAILABILITY AND IMPLEMENTATIONOur approach is available at http://ab.inf.uni-tuebingen.de/software/deeptoa. SUPPLEMENTARY INFORMATIONSupplementary data are available at Bioinformatics online.
ISSN:1367-4803
1367-4811
DOI:10.1093/bioinformatics/btac584