
Influence of the duration of training a deep neural network model on the quality of text summarization task

In this paper we apply a generative deep-learning language model to the text summarization task. Because such large language models require substantial resources to train, it is interesting to study how training duration affects the final result and at what point it saturates. The experiments are...

Full description

Bibliographic Details
Main Authors: Gryaznov, Artem, Rybka, Roman, Moloshnikov, Ivan, Selivanov, Anton, Sboev, Alexander
Format: Conference Proceeding
Language: English
Subjects:
Online Access: Get full text
container_issue 1
container_volume 2849
description In this paper we apply a generative deep-learning language model to the text summarization task. Because such large language models require substantial resources to train, it is interesting to study how training duration affects the final result and at what point it saturates. The experiments are run using the mT5 model and two news corpora, RIA and Lenta. In the course of this research we achieve state-of-the-art results for the task of predicting titles from news texts: 47.60 Rouge-L on the RIA corpus and 40.69 Rouge-L on Lenta. It is shown that after 800k weight updates the accuracy scores continue to grow, but the gains become increasingly marginal.
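The 47.60 and 40.69 figures above are ROUGE-L F-scores, a metric based on the longest common subsequence (LCS) between a generated title and the reference. As a reminder of what that metric measures, here is a minimal sketch of an LCS-based ROUGE-L computation in Python; this is an illustrative implementation, not the authors' evaluation code, and it assumes simple whitespace tokenization.

```python
def lcs_len(a, b):
    # Classic dynamic-programming longest common subsequence length.
    m, n = len(a), len(b)
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m):
        for j in range(n):
            if a[i] == b[j]:
                dp[i + 1][j + 1] = dp[i][j] + 1
            else:
                dp[i + 1][j + 1] = max(dp[i][j + 1], dp[i + 1][j])
    return dp[m][n]

def rouge_l_f1(candidate, reference, beta=1.0):
    """ROUGE-L F-score between whitespace-tokenized candidate and reference."""
    cand, ref = candidate.split(), reference.split()
    lcs = lcs_len(cand, ref)
    if lcs == 0:
        return 0.0
    precision = lcs / len(cand)   # fraction of candidate tokens in the LCS
    recall = lcs / len(ref)       # fraction of reference tokens in the LCS
    return (1 + beta**2) * precision * recall / (recall + beta**2 * precision)
```

With beta=1 this reduces to the harmonic mean of LCS-based precision and recall; for example, a candidate that contains the full reference title plus two extra words scores below 1.0 because precision drops while recall stays perfect.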
doi_str_mv 10.1063/5.0162393
format conference_proceeding
contributor Simos, Theodore; Tsitouras, Charalambos
published Melville: American Institute of Physics, 2023-09-01
identifier ISSN: 0094-243X
ispartof AIP conference proceedings, 2023, Vol.2849 (1)
issn 0094-243X
1551-7616
language eng
recordid cdi_proquest_journals_2859722745
source American Institute of Physics:Jisc Collections:Transitional Journals Agreement 2021-23 (Reading list)
subjects Artificial neural networks
Large language models
Machine learning
News
Training
title Influence of the duration of training a deep neural network model on the quality of text summarization task
url http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-25T19%3A20%3A05IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_scita&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=proceeding&rft.atitle=Influence%20of%20the%20duration%20of%20training%20a%20deep%20neural%20network%20model%20on%20the%20quality%20of%20text%20summarization%20task&rft.btitle=AIP%20conference%20proceedings&rft.au=Gryaznov,%20Artem&rft.date=2023-09-01&rft.volume=2849&rft.issue=1&rft.issn=0094-243X&rft.eissn=1551-7616&rft.coden=APCPCS&rft_id=info:doi/10.1063/5.0162393&rft_dat=%3Cproquest_scita%3E2859722745%3C/proquest_scita%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-p133t-5eb09e3ecb31005208ff3ce767c1bd1631395305c4d5b2c24e8c2d513098683b3%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=2859722745&rft_id=info:pmid/&rfr_iscdi=true