
Improving Sentence Representations via Component Focusing

Bibliographic Details
Published in: Applied Sciences 2020-02, Vol. 10 (3), p. 958
Main Authors: Yin, Xiaoya, Zhang, Wu, Zhu, Wenhao, Liu, Shuang, Yao, Tengjun
Format: Article
Language:English
Description
Summary: The efficiency of natural language processing (NLP) tasks, such as text classification and information retrieval, can be significantly improved with proper sentence representations. Neural networks such as convolutional neural networks (CNNs) and recurrent neural networks (RNNs) are increasingly applied to learn sentence representations and are well suited to processing sequences. Recently, bidirectional encoder representations from transformers (BERT) has attracted much attention because it achieves state-of-the-art performance on various NLP tasks. However, these standard models do not adequately address a general linguistic fact: different sentence components serve diverse roles in the meaning of a sentence. In general, the subject, predicate, and object play the most crucial roles, as they carry the primary meaning of a sentence; in addition, the words in a sentence are related to each other by syntactic relations. To address these issues, we propose a sentence representation model, a modification of the pre-trained BERT network via component focusing (CF-BERT). The sentence representation consists of a basic part, which encodes the complete sentence, and a component-enhanced part, which focuses on the subject, predicate, object, and their relations. A weight factor is introduced to adjust the ratio of the two parts for the best performance. We evaluate CF-BERT on two tasks, semantic textual similarity and entailment classification, and the results show that CF-BERT yields a significant performance gain over other sentence representation methods.
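
The weighted two-part combination described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' CF-BERT implementation: the bert-base-uncased encoder, mean pooling, the hand-supplied component string, and the names embed, cf_sentence_embedding, and lam are all illustrative choices; in the paper the components (subject, predicate, object) would come from syntactic analysis rather than be given by hand.

# Sketch of the component-focusing idea: combine an embedding of the
# full sentence (basic part) with an embedding of its main components
# (component-enhanced part), weighted by a factor lam.
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")
model.eval()

def embed(text: str) -> torch.Tensor:
    """Mean-pooled BERT token embeddings for a piece of text."""
    inputs = tokenizer(text, return_tensors="pt", truncation=True)
    with torch.no_grad():
        hidden = model(**inputs).last_hidden_state   # (1, seq_len, 768)
    mask = inputs["attention_mask"].unsqueeze(-1)    # (1, seq_len, 1)
    return (hidden * mask).sum(1) / mask.sum(1)      # (1, 768)

def cf_sentence_embedding(sentence: str, components: str,
                          lam: float = 0.5) -> torch.Tensor:
    """Basic part plus component-enhanced part, scaled by weight lam."""
    return embed(sentence) + lam * embed(components)

# Usage: the component string would normally be extracted by a
# dependency parser; here it is supplied by hand for illustration.
vec = cf_sentence_embedding(
    "The committee approved the new budget after a long debate.",
    "committee approved budget",
    lam=0.5,
)
print(vec.shape)  # torch.Size([1, 768])

The abstract does not specify how the two parts are fused, so a simple weighted sum is assumed here; tuning lam on a validation set would correspond to the "weight factor" adjustment the authors describe.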
ISSN: 2076-3417
DOI: 10.3390/app10030958