
Influence of the duration of training a deep neural network model on the quality of text summarization task


Bibliographic Details
Main Authors: Gryaznov, Artem; Rybka, Roman; Moloshnikov, Ivan; Selivanov, Anton; Sboev, Alexander
Format: Conference Proceeding
Language: English
Description
Summary: In this paper we apply a generative deep learning language model to the text summarization task. Because such large language models require substantial resources to train, it is of interest to study how much prolonged training affects the final result and at what point the improvement saturates. The experiments were run using the mT5 model on two news corpora, RIA and Lenta. We achieve state-of-the-art results for the task of predicting titles from news texts: 47.60 ROUGE-L on the RIA corpus and 40.69 ROUGE-L on Lenta. We show that after 800k weight updates the accuracy scores continue to grow, but the gains become increasingly small.
ISSN: 0094-243X, 1551-7616
DOI: 10.1063/5.0162393