Abstractive text summarization using deep learning with a new Turkish summarization benchmark dataset

Bibliographic Details
Published in:Concurrency and computation 2022-04, Vol.34 (9), p.n/a
Main Authors: Ertam, Fatih; Aydin, Galip
Format: Article
Language:English
Description
Summary:The exponential increase in the amount of textual data available on the Internet creates new challenges for accessing information accurately and quickly. Text summarization can be defined as reducing the length of a text without distorting its meaning. Summarization can be extractive, abstractive, or a combination of both. In this study, we focus on abstractive summarization, which can produce more human‐like summaries. For the study we created a Turkish news summarization benchmark dataset by crawling the news title, short news, news content, and keywords from various news agency web portals for the last 5 years. The dataset is made publicly available for researchers. The deep learning network was trained on the news headlines and short news contents from the prepared dataset, and was then expected to generate the news headline as the summary of the short news. To evaluate performance, Rouge‐1, Rouge‐2, and Rouge‐L were compared using precision, recall (sensitivity), and F1 measure scores. Performance values were presented for each sentence as well as averaged over 50 randomly selected sentences. The F1 measure values are 0.4317, 0.2194, and 0.4334 for Rouge‐1, Rouge‐2, and Rouge‐L, respectively. The results show that the approach is promising for Turkish text summarization and that the prepared dataset will add value to the literature.
ISSN:1532-0626
1532-0634
DOI:10.1002/cpe.6482
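
The summary reports precision, recall, and F1 for Rouge‐1, Rouge‐2, and Rouge‐L. The following is a minimal sketch of how such scores are commonly computed, not the authors' evaluation code. It assumes naive whitespace tokenization, a single reference per candidate, and a balanced F1 for Rouge‐L; the function names and the English example sentences are hypothetical and are not taken from the dataset.

```python
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def prf(overlap, cand_total, ref_total):
    """Precision, recall and balanced F1 from overlap counts."""
    p = overlap / cand_total if cand_total else 0.0
    r = overlap / ref_total if ref_total else 0.0
    f = 2 * p * r / (p + r) if (p + r) else 0.0
    return p, r, f


def rouge_n(candidate, reference, n):
    """Rouge-N: clipped n-gram overlap between candidate and reference."""
    # Assumption: whitespace tokenization, no stemming or lowercasing.
    c = ngrams(candidate.split(), n)
    r = ngrams(reference.split(), n)
    overlap = sum((c & r).values())  # multiset intersection clips counts
    return prf(overlap, sum(c.values()), sum(r.values()))


def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        cur = [0]
        for j, y in enumerate(b, 1):
            cur.append(prev[j - 1] + 1 if x == y else max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]


def rouge_l(candidate, reference):
    """Rouge-L: scores based on the longest common subsequence."""
    c, r = candidate.split(), reference.split()
    return prf(lcs_length(c, r), len(c), len(r))


# Hypothetical example pair (generated headline vs. reference headline).
generated = "the cat sat on the mat"
reference = "the cat is on the mat"
scores = {
    "Rouge-1": rouge_n(generated, reference, 1),
    "Rouge-2": rouge_n(generated, reference, 2),
    "Rouge-L": rouge_l(generated, reference),
}
for name, (p, r, f) in scores.items():
    print(f"{name}: precision={p:.4f} recall={r:.4f} F1={f:.4f}")
```

Averaging the per-sentence F1 values over a sample of generated headlines would give corpus-level figures of the kind quoted in the summary. Established ROUGE packages apply their own tokenization and stemming, so their numbers can differ from this sketch.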