Transforming Medical Imaging: A VQA Model for Microscopic Blood Cell Classification

Bibliographic Details
Published in: IEEE Access 2024, Vol. 12, p. 168547-168556
Main Authors: Fatima, Izzah; Hussain Shah, Jamal; Saleem, Rabia; Riaz, Samia; Rafiq, Muhammad; Khokhar, Fahad Ahmed
Format: Article
Language: English
Description
Summary: Visual Question Answering (VQA) is a promising technology that could transform medical imaging by enabling computers to answer questions about medical images. However, the development of effective medical VQA models is hampered by the scarcity of easily accessible medical datasets, complicated medical scenarios, and the complexity of the medical images themselves. To contribute to the advancement of VQA models in the medical field, our research undertakes two key initiatives. First, we introduce a novel dataset for medical VQA, derived from an existing dataset of blood cell images. Second, we propose a VQA model specifically designed to classify images of microscopic blood cells. We use pre-trained transformers such as Electra, BERT, and DistilBERT to extract textual features and a Vision Transformer (ViT) to extract visual features from images; the textual and visual features are then combined and passed to classifiers such as Linear SVM and Quadratic SVM. Experimental results show that the Electra & ViT model surpasses the other models across multiple evaluation metrics, achieving a WUPS score of 90.09%, accuracy of 89.63%, F1-score of 64.03%, precision of 63.42%, and recall of 65.23%.
ISSN: 2169-3536
DOI: 10.1109/ACCESS.2024.3496655
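
The summary describes combining textual and visual features and passing them to Linear and Quadratic SVM classifiers. The sketch below illustrates that general idea only, under several assumptions not stated in the record: the features are fused by simple concatenation, "Quadratic SVM" is read as an SVM with a degree-2 polynomial kernel, and the encoder outputs are replaced by synthetic random vectors with arbitrary sizes. The helper `fuse_features` and all variable names are hypothetical, and scikit-learn is used purely for illustration; none of this is taken from the authors' implementation.

```python
import numpy as np
from sklearn.svm import SVC


def fuse_features(text_feats: np.ndarray, image_feats: np.ndarray) -> np.ndarray:
    """Concatenate per-sample text and image feature vectors along the feature axis."""
    if text_feats.shape[0] != image_feats.shape[0]:
        raise ValueError("text and image features must describe the same samples")
    return np.concatenate([text_feats, image_feats], axis=1)


# Synthetic stand-ins for encoder outputs (assumption: real features would come
# from a text transformer such as Electra and from ViT; the sizes are arbitrary).
rng = np.random.default_rng(0)
n_classes, n_per_class = 4, 40
text_dim, image_dim = 16, 24
labels = np.repeat(np.arange(n_classes), n_per_class)

# Each class gets its own center; samples are noisy copies of that center.
text_centers = rng.normal(scale=5.0, size=(n_classes, text_dim))
image_centers = rng.normal(scale=5.0, size=(n_classes, image_dim))
text_feats = text_centers[labels] + rng.normal(size=(labels.size, text_dim))
image_feats = image_centers[labels] + rng.normal(size=(labels.size, image_dim))

fused = fuse_features(text_feats, image_feats)

# Stratified split: the first 30 samples of each class train, the last 10 test.
is_train = (np.arange(labels.size) % n_per_class) < 30
X_train, y_train = fused[is_train], labels[is_train]
X_test, y_test = fused[~is_train], labels[~is_train]

# Linear SVM, and a degree-2 polynomial kernel as one reading of "Quadratic SVM".
linear_svm = SVC(kernel="linear").fit(X_train, y_train)
quadratic_svm = SVC(kernel="poly", degree=2, coef0=1.0).fit(X_train, y_train)

linear_acc = linear_svm.score(X_test, y_test)
quadratic_acc = quadratic_svm.score(X_test, y_test)
print(f"linear SVM accuracy:    {linear_acc:.3f}")
print(f"quadratic SVM accuracy: {quadratic_acc:.3f}")
```

With real data, the synthetic arrays would be replaced by feature vectors extracted from the question text and the blood cell image of each sample; the accuracies printed here say nothing about the results reported in the summary.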