
A pipeline for medical literature search and its evaluation

Bibliographic Details
Published in: Journal of Information Science, 2023-04
Main Authors: Zafar, Imamah; Wali, Aamir; Kunwar, Muhammad Ahmed; Afzal, Noor; Raza, Muhammad
Format: Article
Language: English
Description
Summary: One database commonly used by clinicians for searching the medical literature and practicing evidence-based medicine is PubMed. As the literature grows, it has become challenging for users to find relevant material quickly, because the most relevant results are often not at the top. In this article, we propose a search and ranking pipeline that orders search results by relevance. We first propose an ensemble of three classifiers: a bidirectional long short-term memory conditional random field (bi-LSTM-CRF), a support vector machine and naive Bayes, which recognises PICO (patient, intervention, comparison, outcome) elements in abstracts. The ensemble was trained on an annotated corpus of 5000 abstracts, split into 4000 training and 1000 testing abstracts, and recorded an accuracy of 93%. We then retrieved around 927,000 articles from PubMed for the years 2017–2021 (access date 16 April 2021). For every abstract, we extracted and grouped all P, I and O terms, and stored these groups along with the article ID in a separate database. During search, every P, I and O term of the query is matched only against its corresponding group in every abstract. The scoring method simply counts the number of matches between the query’s P, I and O elements and the words in the P, I and O groups, respectively. The abstracts are sorted by the number of matches and the top five are listed using their pre-stored abstract IDs. A comprehensive user study was conducted in which 60 different queries were formulated and used to generate ranked search results with both PubMed and the proposed model. Five medical professionals assessed the ranked results and marked every item as relevant or non-relevant. Both models were compared using precision@K and mean-average-precision@K (MAP@K) metrics with K = 5. For most queries, our model produced higher precision@K values than PubMed, and its MAP@K value is also higher (0.83 versus 0.67).
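The abstract does not state how the outputs of the three classifiers are combined. The sketch below assumes simple token-level majority voting, a common ensembling choice; plain callables stand in for the trained bi-LSTM-CRF, SVM and naive Bayes models, and all names and labels are illustrative, not the authors' code.

```python
from collections import Counter

def ensemble_tag(tokens, classifiers):
    """Assign each token a PICO label ('P', 'I', 'O' or 'N' for none)
    by majority vote across the classifiers."""
    tagged = []
    for token in tokens:
        votes = [clf(token) for clf in classifiers]
        tagged.append((token, Counter(votes).most_common(1)[0][0]))
    return tagged

# Stand-in classifiers for illustration only; the paper's models are
# a bi-LSTM-CRF, an SVM and naive Bayes trained on annotated abstracts.
clf_a = lambda tok: "P" if tok == "adults" else "N"
clf_b = lambda tok: "P" if tok in {"adults", "diabetes"} else "N"
clf_c = lambda tok: "N"
print(ensemble_tag(["adults", "metformin"], [clf_a, clf_b, clf_c]))
# -> [('adults', 'P'), ('metformin', 'N')]
```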
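The match-count ranking itself is straightforward. Below is a minimal Python sketch, assuming the P, I and O groups have already been extracted and stored per article ID as described; the function names and the toy database are illustrative, not the authors' implementation.

```python
def score_article(query_pico, article_pico):
    """Count matches between the query's P, I and O terms and the
    corresponding group of the article, field by field."""
    return sum(
        len(set(query_pico[field]) & set(article_pico.get(field, [])))
        for field in ("P", "I", "O")
    )

def rank(query_pico, pico_db, k=5):
    """Sort articles by match count and return the top-k article IDs."""
    scored = sorted(
        pico_db.items(),
        key=lambda item: score_article(query_pico, item[1]),
        reverse=True,
    )
    return [article_id for article_id, _ in scored[:k]]

# Example: each entry maps an article ID to its pre-stored P/I/O groups.
pico_db = {
    "PMID001": {"P": ["adults", "diabetes"], "I": ["metformin"], "O": ["hba1c"]},
    "PMID002": {"P": ["children", "asthma"], "I": ["salbutamol"], "O": ["fev1"]},
}
query = {"P": ["diabetes"], "I": ["metformin"], "O": ["hba1c"]}
print(rank(query, pico_db))  # -> ['PMID001', 'PMID002']
```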
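The evaluation compares ranked lists using precision@K and MAP@K with K = 5, computed from the assessors' binary relevance judgments. A minimal sketch of these standard metrics follows; the judgment data shown are made up for illustration.

```python
def precision_at_k(relevant_flags, k=5):
    """Fraction of the top-k results judged relevant (1) vs not (0)."""
    return sum(relevant_flags[:k]) / k

def average_precision_at_k(relevant_flags, k=5):
    """Mean of precision@i taken over the relevant positions i <= k."""
    hits, total = 0, 0.0
    for i, rel in enumerate(relevant_flags[:k], start=1):
        if rel:
            hits += 1
            total += hits / i
    return total / hits if hits else 0.0

def mean_average_precision(all_flags, k=5):
    """MAP@K averaged over a set of queries."""
    return sum(average_precision_at_k(f, k) for f in all_flags) / len(all_flags)

# Example: per-query relevance flags for the top-5 results of two queries.
judgments = [[1, 1, 0, 1, 0], [1, 0, 1, 1, 1]]
print(mean_average_precision(judgments))  # -> ~0.86
```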
ISSN: 0165-5515; 1741-6485
DOI: 10.1177/01655515231161557