Deep neural model with self-training for scientific keyphrase extraction

Bibliographic Details
Published in: PLoS ONE 2020-05, Vol. 15 (5), p. e0232547
Main Authors: Zhu, Xun; Lyu, Chen; Ji, Donghong; Liao, Han; Li, Fei
Format: Article
Language: English
Description
Summary: Scientific information extraction is a crucial step for understanding scientific publications. In this paper, we focus on scientific keyphrase extraction, which aims to identify keyphrases in scientific articles and classify them into predefined categories. We present a neural network based approach to this task, which employs a bidirectional long short-term memory (LSTM) network to represent the sentences in the article. On top of the bidirectional LSTM layer, a conditional random field (CRF) predicts the label sequence for the whole sentence. Because annotated data for supervised learning is expensive, we introduce a self-training method into our neural model to leverage unlabeled articles. Experimental results on the ScienceIE corpus and the ACL keyphrase corpus show that our neural model achieves promising performance without any hand-designed features or external knowledge resources. Furthermore, it efficiently incorporates the unlabeled data and achieves competitive performance compared with previous state-of-the-art systems.
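
The summary describes a BiLSTM sentence encoder with a CRF output layer, plus a self-training loop over unlabeled articles. The following is a minimal PyTorch sketch of that idea, not the authors' implementation: the CRF layer is replaced by per-token emission scores with greedy decoding, and BiLSTMTagger, pseudo_label, the confidence threshold, and all hyperparameters are illustrative assumptions.

# Minimal sketch (not the paper's code) of a BiLSTM tagger with a
# self-training loop; padding masks and the CRF layer are omitted.
import torch
import torch.nn as nn

class BiLSTMTagger(nn.Module):
    def __init__(self, vocab_size, num_tags, emb_dim=100, hidden_dim=200):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.bilstm = nn.LSTM(emb_dim, hidden_dim // 2,
                              batch_first=True, bidirectional=True)
        # Per-token emission scores; the paper places a CRF on top of these
        # to model label-transition constraints (not implemented here).
        self.emissions = nn.Linear(hidden_dim, num_tags)

    def forward(self, token_ids):             # token_ids: (batch, seq_len)
        h, _ = self.bilstm(self.embed(token_ids))
        return self.emissions(h)              # (batch, seq_len, num_tags)

def pseudo_label(model, unlabeled_batch, threshold=0.9):
    """Greedy stand-in for CRF/Viterbi decoding: keep only sentences whose
    least-confident token prediction clears the threshold (an assumption;
    the paper's selection criterion may differ)."""
    model.eval()
    with torch.no_grad():
        probs = torch.softmax(model(unlabeled_batch), dim=-1)
        conf, tags = probs.max(dim=-1)                  # per-token confidence
        keep = conf.min(dim=1).values >= threshold      # sentence-level filter
    return unlabeled_batch[keep], tags[keep]

def self_train(model, optimizer, gold_loader, unlabeled_loader, rounds=3):
    """Train on gold data, pseudo-label unlabeled sentences, and add the
    confident ones to the training pool for the next round."""
    loss_fn = nn.CrossEntropyLoss()
    extra = []                                          # accepted pseudo-labeled batches
    for _ in range(rounds):
        model.train()
        for tokens, tags in list(gold_loader) + extra:
            optimizer.zero_grad()
            logits = model(tokens)
            loss = loss_fn(logits.flatten(0, 1), tags.flatten())
            loss.backward()
            optimizer.step()
        for tokens in unlabeled_loader:
            kept_tokens, kept_tags = pseudo_label(model, tokens)
            if len(kept_tokens):
                extra.append((kept_tokens, kept_tags))

In a faithful reproduction, a CRF layer would replace the greedy decoder so that Viterbi decoding enforces valid label transitions, and padding masks would be applied when computing the loss and the per-sentence confidence.
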
ISSN: 1932-6203
DOI: 10.1371/journal.pone.0232547