Leveraging event-based semantics for automated text simplification
Published in: | Expert systems with applications 2017-10, Vol.82, p.383-395 |
---|---|
Main Authors: | , |
Format: | Article |
Language: | English |
Subjects: | |
Summary: | • Event-based automatic text simplification system. • Lexical and syntactic text simplification with content reduction. • No need for parallel datasets or a large set of handcrafted simplification rules. • The proposed simplification system is highly competitive. • Novel evaluation method based on measuring post-editing effort.
Automated Text Simplification (ATS) aims to transform complex texts into simpler variants that are easier for wider audiences to understand and easier for natural language processing (NLP) tools to process. While simplification can be applied at the lexical, syntactic, and discourse levels, all previously proposed ATS systems operated only on the first two levels and therefore failed to simplify texts at the discourse level. We present a semantically motivated ATS system, the first to operate at the discourse level. By exploiting a state-of-the-art event extraction system, it is the first ATS system able to eliminate large portions of irrelevant information from texts, retaining only those parts of the original text that belong to factual event mentions. A few handcrafted rules ensure that the output of the system is syntactically simple by placing each factual event mention in a separate short sentence, while a state-of-the-art unsupervised lexical simplification module, based on word embeddings, replaces complex and infrequent words with simpler variants. We perform a thorough evaluation, both automatic and manual, showing that our system produces more readable and simpler texts than state-of-the-art ATS systems. Our newly proposed post-editing evaluation further reveals that, on news articles, our system requires less human effort to correct grammaticality and meaning preservation than the state-of-the-art ATS system. |
---|---|
ISSN: | 0957-4174 1873-6793 |
DOI: | 10.1016/j.eswa.2017.04.005 |
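
The summary describes an unsupervised lexical simplification module that uses word embeddings to replace complex, infrequent words with simpler variants. The sketch below illustrates that general idea only; it is not the authors' implementation. The toy vectors, the frequency counts, the thresholds, and the names `simplify_word` and `simplify_sentence` are all assumptions made up for illustration.

```python
import math

# Toy 3-d embeddings and corpus frequencies: invented values for illustration,
# not taken from the paper or from any real embedding model.
EMBEDDINGS = {
    "purchase": (0.90, 0.10, 0.00),
    "buy": (0.88, 0.15, 0.02),
    "acquire": (0.80, 0.30, 0.10),
    "sell": (0.10, 0.90, 0.00),
}
FREQUENCY = {"purchase": 40, "buy": 900, "acquire": 25, "sell": 700}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def simplify_word(word, min_similarity=0.95, rare_below=100):
    """Replace a rare word with its most similar, more frequent neighbour.

    Words that are unknown or already frequent are returned unchanged.
    Both thresholds are hypothetical and would need tuning on real data.
    """
    key = word.lower()
    if key not in EMBEDDINGS or FREQUENCY.get(key, 0) >= rare_below:
        return word
    best, best_sim = word, min_similarity
    for candidate, vector in EMBEDDINGS.items():
        # Only consider candidates that are strictly more frequent (simpler).
        if candidate == key or FREQUENCY.get(candidate, 0) <= FREQUENCY[key]:
            continue
        sim = cosine(EMBEDDINGS[key], vector)
        if sim >= best_sim:
            best, best_sim = candidate, sim
    return best


def simplify_sentence(sentence):
    """Apply word-level substitution to whitespace-separated tokens."""
    return " ".join(simplify_word(token) for token in sentence.split())


print(simplify_sentence("They purchase shares"))
```

Using corpus frequency as a proxy for simplicity is one common design choice; a real system would also need to check that the substitute fits the context and matches the original word's inflection, which this sketch leaves out.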