Influence of the duration of training a deep neural network model on the quality of text summarization task
In this paper we apply a generative deep-learning language model to the text summarization task. Because such large language models require substantial resources to train, it is worth studying how training duration affects the final result and at which point quality saturates. The experiments are...
Saved in:
Main Authors: | Gryaznov, Artem; Rybka, Roman; Moloshnikov, Ivan; Selivanov, Anton; Sboev, Alexander |
---|---|
Format: | Conference Proceeding |
Language: | English |
Subjects: | Artificial neural networks; Large language models; Machine learning; News; Training |
Online Access: | Get full text |
cited_by | |
---|---|
cites | |
container_end_page | |
container_issue | 1 |
container_start_page | |
container_title | AIP conference proceedings |
container_volume | 2849 |
creator | Gryaznov, Artem Rybka, Roman Moloshnikov, Ivan Selivanov, Anton Sboev, Alexander |
description | In this paper we apply a generative deep-learning language model to the text summarization task. Because such large language models require substantial resources to train, it is worth studying how training duration affects the final result and at which point quality saturates. The experiments were run using the mT5 model and two news corpora, RIA and Lenta. In this research we achieve state-of-the-art results for the task of predicting titles from news texts: 47.60 Rouge-L on the RIA corpus and 40.69 Rouge-L on Lenta. We show that after 800k weight updates the accuracy scores continue to grow, but the growth becomes increasingly insignificant. |
doi_str_mv | 10.1063/5.0162393 |
format | conference_proceeding |
contributor | Simos, Theodore; Tsitouras, Charalambos |
publisher | Melville: American Institute of Physics |
publication_date | 2023-09-01 |
coden | APCPCS |
rights | 2023 Author(s). Published by AIP Publishing. |
tpages | 5 |
fulltext | fulltext |
identifier | ISSN: 0094-243X |
ispartof | AIP conference proceedings, 2023, Vol.2849 (1) |
issn | 0094-243X 1551-7616 |
language | eng |
recordid | cdi_proquest_journals_2859722745 |
source | American Institute of Physics:Jisc Collections:Transitional Journals Agreement 2021-23 (Reading list) |
subjects | Artificial neural networks Large language models Machine learning News Training |
title | Influence of the duration of training a deep neural network model on the quality of text summarization task |
url | http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-25T19%3A20%3A05IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_scita&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=proceeding&rft.atitle=Influence%20of%20the%20duration%20of%20training%20a%20deep%20neural%20network%20model%20on%20the%20quality%20of%20text%20summarization%20task&rft.btitle=AIP%20conference%20proceedings&rft.au=Gryaznov,%20Artem&rft.date=2023-09-01&rft.volume=2849&rft.issue=1&rft.issn=0094-243X&rft.eissn=1551-7616&rft.coden=APCPCS&rft_id=info:doi/10.1063/5.0162393&rft_dat=%3Cproquest_scita%3E2859722745%3C/proquest_scita%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-p133t-5eb09e3ecb31005208ff3ce767c1bd1631395305c4d5b2c24e8c2d513098683b3%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=2859722745&rft_id=info:pmid/&rfr_iscdi=true |
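The Rouge-L figures quoted in the abstract (47.60 for RIA, 40.69 for Lenta) score a generated title by the longest common subsequence (LCS) it shares with the reference title. A minimal Python sketch of the metric, for illustration only (not the authors' evaluation code; reported scores may use a recall-weighted F-beta rather than the plain F1 shown here):

```python
def lcs_length(a, b):
    # Classic dynamic-programming LCS over two token sequences.
    dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i, x in enumerate(a, 1):
        for j, y in enumerate(b, 1):
            dp[i][j] = dp[i - 1][j - 1] + 1 if x == y else max(dp[i - 1][j], dp[i][j - 1])
    return dp[len(a)][len(b)]

def rouge_l(reference: str, candidate: str) -> float:
    # F1 variant of Rouge-L over whitespace-separated tokens.
    ref, cand = reference.split(), candidate.split()
    if not ref or not cand:
        return 0.0
    lcs = lcs_length(ref, cand)
    if lcs == 0:
        return 0.0
    precision, recall = lcs / len(cand), lcs / len(ref)
    return 2 * precision * recall / (precision + recall)
```

An identical candidate scores 1.0; a candidate differing in one token out of four (LCS = 3) scores 0.75 under this F1 form.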