Loading…
Medical concept normalization in social media posts with recurrent neural networks
[Display omitted] •Concept normalization is the task of mapping free-form expressions to medical terms.•Sequence learning with recurrent neural networks and semantic representation of text.•Evaluate end-to-end neural architectures and word embeddings on a real world dataset.•Build a framework based...
Saved in:
Published in: | Journal of biomedical informatics 2018-08, Vol.84, p.93-102 |
---|---|
Main Authors: | , , , |
Format: | Article |
Language: | English |
Subjects: | |
Citations: | Items that this one cites Items that cite this one |
Online Access: | Get full text |
Tags: |
Add Tag
No Tags, Be the first to tag this record!
|
Summary: | [Display omitted]
•Concept normalization is the task of mapping free-form expressions to medical terms.•Sequence learning with recurrent neural networks and semantic representation of text.•Evaluate end-to-end neural architectures and word embeddings on a real world dataset.•Build a framework based on RNNs with attention and similarity features based on UMLS.•Promising results, both quantitatively and qualitatively, on a set of drug reviews.
Text mining of scientific libraries and social media has already proven itself as a reliable tool for drug repurposing and hypothesis generation. The task of mapping a disease mention to a concept in a controlled vocabulary, typically to the standard thesaurus in the Unified Medical Language System (UMLS), is known as medical concept normalization. This task is challenging due to the differences in the use of medical terminology between health care professionals and social media texts coming from the lay public. To bridge this gap, we use sequence learning with recurrent neural networks and semantic representation of one- or multi-word expressions: we develop end-to-end architectures directly tailored to the task, including bidirectional Long Short-Term Memory, Gated Recurrent Units with an attention mechanism, and additional semantic similarity features based on UMLS. Our evaluation against a standard benchmark shows that recurrent neural networks improve results over an effective baseline for classification based on convolutional neural networks. A qualitative examination of mentions discovered in a dataset of user reviews collected from popular online health information platforms as well as a quantitative evaluation both show improvements in the semantic representation of health-related expressions in social media. |
---|---|
ISSN: | 1532-0464 1532-0480 |
DOI: | 10.1016/j.jbi.2018.06.006 |