A machine learning approach utilizing DNA methylation as a classifier for Pulmonary Tuberculosis screening

Bibliographic Details
Published in: Gene Reports, 2024-09, Vol. 36, p. 101939, Article 101939
Main Authors: Le, Nhat Thong; Do, Thi Thu Hien; Duong, Doan Minh Trung; Tran, Doan Hong Ngoc; Huynh, Thuc Quyen; Huynh, Khon; Nguyen, Phuong Thao; Le, Minh Thong; Nguyen, Thi Thu Hoai
Format: Article
Language: English
Description
Summary: Tuberculosis (TB), caused by Mycobacterium tuberculosis (Mtb) infection, ranks among the deadliest infectious diseases, with an estimated 1.6 million human fatalities recorded in 2022. Recently, non-sputum DNA methylation biomarkers have emerged as a promising approach for rapid detection of TB infection. However, comprehensive work exploring the potential of DNA methylation in TB prediction has been infrequent. Here, we aimed to introduce a novel set of blood DNA methylation biomarkers associated with TB infection. We conducted a pooled analysis of DNA methylation datasets containing 290 Mtb-infected samples. We built and followed an in-house pipeline to identify differentially methylated CpGs (DMCs). Feature selection and five machine learning algorithms were used to construct classifiers that could predict Mtb infection. The performance of the classifiers was evaluated both in the discovery datasets and in an independent cohort. We also used GO and KEGG pathway enrichment to characterize the correlation between alterations of DNA methylation and immune processes responding to Mtb infection. Our data showed that the majority of active alterations in DNA methylation were hypomethylated. Notably, we observed a strong association between the reduction of DNA methylation and the activation of immune system processes. Multivariate analysis with MUVR using a random forest core algorithm (12-CpG model), combined with a random forest classifier, showed high performance, with a sensitivity of 90 %, a specificity of 82 % and an AUC of 0.91 (95 % CI: 0.85–0.97) in the validation cohort. Further differential analysis of Bacillus Calmette-Guerin (BCG) vaccination groups and HIV-Mtb coinfection showed clear differences between BCG and non-BCG groups, as well as between HIV-Mtb coinfection, Mtb infection and healthy samples, which suggests high potential to overcome the limitations of traditional methods.
Collectively, DNA methylation is a promising method for early detection of tuberculosis and potentially a clinical tool for TB diagnosis. Further external validation studies are needed to confirm the impact of our tool in daily practice. To make our model and the collected data widely available to the scientific community, we hosted both on a publicly accessible website at https://tbpred.shinyapps.io/shinyr/.
•Differential DNA methylation due to Mtb infection was identified and characterized.
•Reduction of DNA methylation was strongly associated with the activation im
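The validation metrics named in the summary (sensitivity, specificity and AUC) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the labels, scores, the 0.5 decision threshold and the function names below are all hypothetical, chosen only to show how such figures are typically derived from a classifier's predicted probabilities.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float]:
    """Return (sensitivity, specificity) for binary labels (1 = infected, 0 = healthy)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


def auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Rank-based AUC: the fraction of (positive, negative) pairs in which
    the positive sample receives the higher score; ties count as half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Made-up toy data (NOT from the study): true labels and predicted probabilities.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.6, 0.4, 0.7, 0.3, 0.2, 0.1]

# Hypothetical decision threshold of 0.5 to turn scores into class calls.
y_pred = [1 if s >= 0.5 else 0 for s in scores]

sens, spec = sensitivity_specificity(y_true, y_pred)
print(f"sensitivity={sens:.2f} specificity={spec:.2f} AUC={auc(y_true, scores):.3f}")
```

Sensitivity and specificity depend on the chosen threshold, whereas the AUC summarizes ranking quality across all thresholds, which is why abstracts such as this one commonly report all three.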
ISSN: 2452-0144
DOI: 10.1016/j.genrep.2024.101939