An Optimal Model for Medical Text Classification Based on Adaptive Genetic Algorithm

Bibliographic Details
Published in: Data science and engineering 2024, Vol.9 (4), p.378-392
Main Authors: Ben Abdennour, Ghada, Gasmi, Karim, Ejbali, Ridha
Format: Article
Language: English
Description
Summary: Automatic text classification, in which textual data is assigned to predefined categories based on its content, is a classic problem in Natural Language Processing. In recent years, research on medical text classification has surged due to the increasing availability of medical data such as patient medical records and medical literature. Machine learning and statistical methods have proven highly efficient for these tasks. However, a significant amount of manual labor is still required to label the extensive datasets used for training. Recent research has demonstrated the effectiveness of pretrained language models, including machine learning models, in reducing the time and effort required for feature engineering by medical experts. However, directly applying the machine learning model to the classification task yields no statistically significant improvement in performance. In this paper, we present a hybrid machine learning model that combines individual traditional algorithms augmented by a genetic algorithm. The improved model is designed to enhance performance by optimizing the weight parameters. The best single model demonstrated commendable accuracy, and applying the hybridization approach with optimized weight parameters substantially enhanced the results. The results underscore the superiority of our augmented hybrid model over individual traditional algorithms. We conduct experiments on two distinct types of datasets: one comprising medical records, such as the Heart Failure Clinical Record, and another consisting of medical literature, such as PubMed 20k RCT. The objective is to demonstrate the effectiveness of our approach by highlighting the significant improvements in accuracy, precision, recall, and F1-score achieved by our improved model.
ISSN: 2364-1185, 2364-1541
DOI: 10.1007/s41019-024-00257-8
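
Illustrative sketch: The summary describes a hybrid model in which the outputs of individual classifiers are combined using weights tuned by a genetic algorithm. The record gives no implementation details, so the following is only a minimal sketch of that general idea, not the authors' method. It assumes binary labels, per-model positive-class probabilities, accuracy as the fitness function, and a plain genetic algorithm with a fixed (not adaptive) mutation rate. All names (`weighted_vote`, `optimize_weights`, the parameters, and the toy data) are hypothetical.

```python
import random


def weighted_vote(probs, weights):
    """Combine per-model positive-class probabilities into 0/1 predictions.

    probs[m][i] is model m's probability that sample i is positive.
    """
    total = sum(weights)
    preds = []
    for i in range(len(probs[0])):
        score = sum(w * p[i] for w, p in zip(weights, probs)) / total
        preds.append(1 if score >= 0.5 else 0)
    return preds


def accuracy(preds, labels):
    """Fraction of predictions that match the labels."""
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


def optimize_weights(probs, labels, pop_size=20, generations=30,
                     mutation_rate=0.2, seed=0):
    """Search for ensemble weights with a simple genetic algorithm."""
    rng = random.Random(seed)
    n_models = len(probs)

    def fitness(weights):
        return accuracy(weighted_vote(probs, weights), labels)

    # Random positive weights, plus the uniform vote as a baseline individual.
    population = [[rng.uniform(0.01, 1.0) for _ in range(n_models)]
                  for _ in range(pop_size - 1)]
    population.append([1.0] * n_models)

    for _ in range(generations):
        # Truncation selection: the fitter half survives unchanged (elitism).
        population.sort(key=fitness, reverse=True)
        parents = population[: pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            # Uniform crossover: each weight comes from one of the two parents.
            child = [rng.choice(pair) for pair in zip(a, b)]
            # Gaussian mutation, clipped so every weight stays positive.
            child = [min(1.0, max(0.01, w + rng.gauss(0, 0.1)))
                     if rng.random() < mutation_rate else w
                     for w in child]
            children.append(child)
        population = parents + children

    best = max(population, key=fitness)
    return best, fitness(best)


if __name__ == "__main__":
    # Toy probabilities from three hypothetical classifiers on six samples.
    toy_probs = [
        [0.9, 0.2, 0.8, 0.3, 0.7, 0.1],
        [0.4, 0.6, 0.4, 0.7, 0.3, 0.6],
        [0.6, 0.4, 0.3, 0.8, 0.6, 0.2],
    ]
    toy_labels = [1, 0, 1, 0, 1, 0]
    weights, acc = optimize_weights(toy_probs, toy_labels)
    print("weights:", [round(w, 3) for w in weights], "accuracy:", acc)
```

In practice the fitness would normally be computed on a held-out validation split rather than on the training data, so that the tuned weights do not simply overfit the samples used to select them.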