Identification of Biomarkers for Severity in COVID-19 Through Comparative Analysis of Five Machine Learning Algorithms [version 1; peer review: awaiting peer review]

Bibliographic Details
Published in: F1000Research 2024, Vol.13, p.688
Main Authors: Olán-Ramón, Juan P., De la Cruz-Ruiz, Freddy, De la Cruz-Cano, Eduardo, Aguilar-Barojas, Sarai, Zamarron-Licona, Erasmo
Format: Article
Language: English
Description
Summary: Background: COVID-19 is a global public health problem. Aim: The main objective of this research is to evaluate and compare the performance of five algorithms (Random Forest, Support Vector Machine, Logistic Regression, Decision Tree, and Neural Network) using metrics such as precision, recall, F1-score, and accuracy. Methods: A dataset (n=138) with numerical and categorical variables was used. The five algorithms were trained using an 80-20 train-test split. Precision, recall, and F1-score were evaluated, together with 5-fold stratified cross-validation. Results: The Random Forest algorithm was superior, achieving a maximum score of 0.9727 in cross-validation. The correlation analysis identified ferritin (0.8277) and oxygen saturation (-0.6444). The heuristic model was compared with metaheuristic models. Models obtained through metaheuristic search could maintain the metrics with 3 variables and a stable weight distribution. A perplexity analysis makes it possible to differentiate between the best models. Creatinine and ALT stand out as features in the model with the best CV score and the lowest perplexity. Conclusion: A comparative analysis of different classification models was carried out to predict the severity of COVID-19 cases from biological markers.
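The comparison described in the summary can be sketched roughly as follows. This is a minimal illustration and not the authors' code: it assumes scikit-learn, replaces the clinical dataset with synthetic data of the same size (n=138), uses default hyperparameters, and runs the 5-fold stratified cross-validation on the training portion only. The record does not state any of these choices, so the scores it prints will not match the reported ones.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score, precision_score, recall_score
from sklearn.model_selection import StratifiedKFold, cross_val_score, train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# ASSUMPTION: synthetic stand-in for the clinical data (same sample size, n=138).
X, y = make_classification(n_samples=138, n_features=10, n_informative=4,
                           random_state=0)

# 80-20 split, stratified so both parts keep the class balance.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0)

# The five algorithm families named in the summary.
# ASSUMPTION: default hyperparameters; scaling is added for the models that need it.
models = {
    "Random Forest": RandomForestClassifier(random_state=0),
    "Support Vector Machine": make_pipeline(StandardScaler(), SVC()),
    "Logistic Regression": make_pipeline(StandardScaler(),
                                         LogisticRegression(max_iter=1000)),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "Neural Network": make_pipeline(StandardScaler(),
                                    MLPClassifier(max_iter=2000, random_state=0)),
}

cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
results = {}
for name, model in models.items():
    # 5-fold stratified cross-validation accuracy on the training portion.
    cv_scores = cross_val_score(model, X_train, y_train, cv=cv, scoring="accuracy")
    # Fit on the full training portion, then score the held-out 20%.
    model.fit(X_train, y_train)
    y_pred = model.predict(X_test)
    results[name] = {
        "precision": precision_score(y_test, y_pred, zero_division=0),
        "recall": recall_score(y_test, y_pred, zero_division=0),
        "f1": f1_score(y_test, y_pred, zero_division=0),
        "cv_mean": cv_scores.mean(),
        "cv_max": cv_scores.max(),
    }

for name, metrics in results.items():
    print(name, {key: round(value, 4) for key, value in metrics.items()})
```

Reporting both the mean and the maximum fold score is a choice made here for illustration. The summary mentions only a maximum cross-validation score.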
ISSN:2046-1402
DOI:10.12688/f1000research.150128.1