Information Extraction for Intestinal Cancer Electronic Medical Records
Published in: | IEEE Access 2020, Vol.8, p.125923-125934 |
---|---|
Main Authors: | , , , , , , |
Format: | Article |
Language: | English |
Summary: | Data generated by structured electronic medical records are helpful for mining and extracting medical information, and offer an effective way to use valuable data resources. However, hospitals have accumulated large amounts of unstructured data in electronic medical records that cannot be searched effectively, resulting in a serious waste of resources. In this paper, we study the problem of extracting attribute values from the unstructured text in electronic medical records. Based on our observation of intestinal cancer diagnostic texts, we divide the attributes into two categories, discriminative attributes and extractive attributes, and extract their values with text classification and sequence labeling, respectively. For discriminative attributes, we first divide the text into sentences or segments, which serve as instances. Second, we fine-tune pre-trained word embeddings to capture domain-specific semantics and knowledge. Third, we use an attention mechanism to select the most important instance for each attribute extractor. Finally, we use multi-task learning to share useful information and obtain better experimental results. For extractive attributes, we propose a novel model consisting of a BiLSTM layer, a CNN layer, and a CRF layer: BiLSTM and CNN learn text features, and CRF serves as the last layer of the model. Experiments show that our method outperforms several competitive baseline methods. |
---|---|
ISSN: | 2169-3536 |
DOI: | 10.1109/ACCESS.2020.3005684 |
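
The summary mentions an attention mechanism that selects the most important instance (sentence or segment) for each attribute extractor. The record does not say how the scores are computed, so the following is only a minimal sketch under stated assumptions: each instance is already encoded as a fixed-length vector, each attribute has its own query vector, and scoring is a plain dot product followed by a softmax. The names `attend`, `instances`, `query_a`, and `query_b` are hypothetical and are not taken from the paper.

```python
import math


def softmax(scores):
    """Turn raw scores into weights that are positive and sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(instances, query):
    """Weight instance vectors by their relevance to one attribute query.

    Assumption: relevance is a dot product between the instance vector
    and the attribute's query vector (the source does not specify this).
    Returns the attention weights and the weighted sum of the instances.
    """
    scores = [sum(x * q for x, q in zip(inst, query)) for inst in instances]
    weights = softmax(scores)
    dim = len(instances[0])
    pooled = [
        sum(w * inst[d] for w, inst in zip(weights, instances))
        for d in range(dim)
    ]
    return weights, pooled


# Three toy instance vectors standing in for encoded sentences.
instances = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]

# One hypothetical query vector per attribute extractor.
query_a = [4.0, 0.0]
query_b = [0.0, 4.0]

weights_a, pooled_a = attend(instances, query_a)
weights_b, pooled_b = attend(instances, query_b)
```

With these toy inputs, the two attribute queries place their largest weight on different instances, which is the behavior the summary describes: each extractor attends to the part of the text most relevant to its own attribute. In a trained model the query vectors would be learned rather than fixed by hand.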