Influence of the duration of training a deep neural network model on the quality of text summarization task
In this paper we apply a generative deep-learning language model to the text summarization task. Because such large language models require substantial resources to train, it is worth studying how training duration affects the final result and at which point quality saturates. The experiments are...
Saved in:
Main Authors: | Gryaznov, Artem; Rybka, Roman; Moloshnikov, Ivan; Selivanov, Anton; Sboev, Alexander |
---|---|
Format: | Conference Proceeding |
Language: | English |
Subjects: | Artificial neural networks; Large language models; Machine learning; News; Training |
Online Access: | Get full text |
cited_by | |
---|---|
cites | |
container_end_page | |
container_issue | 1 |
container_start_page | |
container_title | AIP conference proceedings |
container_volume | 2849 |
creator | Gryaznov, Artem Rybka, Roman Moloshnikov, Ivan Selivanov, Anton Sboev, Alexander |
description | In this paper we apply a generative deep-learning language model to the text summarization task. Because such large language models require substantial resources to train, it is worth studying how training duration affects the final result and at which point quality saturates. The experiments were run using the mT5 model and two news corpora, RIA and Lenta. In this research we achieve state-of-the-art results for the task of predicting titles from news texts: 47.60 Rouge-L on the RIA corpus and 40.69 Rouge-L on Lenta. We show that after 800k weight updates the accuracy scores continue to grow, but the growth becomes increasingly insignificant. |
doi_str_mv | 10.1063/5.0162393 |
format | conference_proceeding |
contributor | Simos, Theodore; Tsitouras, Charalambos |
publisher | Melville: American Institute of Physics |
publication_date | 2023-09-01 |
coden | APCPCS |
rights | 2023 Author(s). Published by AIP Publishing. |
tpages | 5 |
fulltext | fulltext |
identifier | ISSN: 0094-243X |
ispartof | AIP conference proceedings, 2023, Vol.2849 (1) |
issn | 0094-243X 1551-7616 |
language | eng |
recordid | cdi_proquest_journals_2859722745 |
source | American Institute of Physics:Jisc Collections:Transitional Journals Agreement 2021-23 (Reading list) |
subjects | Artificial neural networks Large language models Machine learning News Training |
title | Influence of the duration of training a deep neural network model on the quality of text summarization task |
url | http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-25T19%3A20%3A05IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_scita&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=proceeding&rft.atitle=Influence%20of%20the%20duration%20of%20training%20a%20deep%20neural%20network%20model%20on%20the%20quality%20of%20text%20summarization%20task&rft.btitle=AIP%20conference%20proceedings&rft.au=Gryaznov,%20Artem&rft.date=2023-09-01&rft.volume=2849&rft.issue=1&rft.issn=0094-243X&rft.eissn=1551-7616&rft.coden=APCPCS&rft_id=info:doi/10.1063/5.0162393&rft_dat=%3Cproquest_scita%3E2859722745%3C/proquest_scita%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-p133t-5eb09e3ecb31005208ff3ce767c1bd1631395305c4d5b2c24e8c2d513098683b3%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=2859722745&rft_id=info:pmid/&rfr_iscdi=true |
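The Rouge-L figures quoted in the abstract (47.60 for RIA, 40.69 for Lenta) score a generated title by the longest common subsequence (LCS) it shares with the reference title. A minimal Python sketch of the metric, for illustration only (not the authors' evaluation code; reported scores may use a recall-weighted F-beta rather than the plain F1 shown here):

```python
def lcs_length(a, b):
    # Classic dynamic-programming LCS over two token sequences.
    dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i, x in enumerate(a, 1):
        for j, y in enumerate(b, 1):
            dp[i][j] = dp[i - 1][j - 1] + 1 if x == y else max(dp[i - 1][j], dp[i][j - 1])
    return dp[len(a)][len(b)]

def rouge_l(reference: str, candidate: str) -> float:
    # F1 variant of Rouge-L over whitespace-separated tokens.
    ref, cand = reference.split(), candidate.split()
    if not ref or not cand:
        return 0.0
    lcs = lcs_length(ref, cand)
    if lcs == 0:
        return 0.0
    precision, recall = lcs / len(cand), lcs / len(ref)
    return 2 * precision * recall / (precision + recall)
```

An identical candidate scores 1.0; a candidate differing in one token out of four (LCS = 3) scores 0.75 under this F1 form.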