The Method of Automatic Construction of Training Collections for the Task of Abstractive Summarization of News Articles

Bibliographic Details
Published in: Pattern recognition and image analysis, 2023-09, Vol.33 (3), p.255-267
Main Authors: Chernyshev, D. I., Dobrov, B. V.
Format: Article
Language: English
cited_by
cites cdi_FETCH-LOGICAL-c268t-f51a4db74fc486d4fcbd12d9e25e6faadd2813e66a3682456674a8426d8041023
container_end_page 267
container_issue 3
container_start_page 255
container_title Pattern recognition and image analysis
container_volume 33
creator Chernyshev, D. I.
Dobrov, B. V.
description Creating a collection of examples for training abstractive summarization systems is a costly process owing to the high time costs and high requirements for the qualification of experts necessary for writing high-quality summaries. A new method of creating collections for training neural summarization methods is proposed—ClusterVote, designed to simulate the features of the task by taking into account information in related documents. The method can be used to form abstractive summaries of various levels of detail, as well as to obtain extractive summaries. Using the ClusterVote method, a new collection was formed in English and Russian to train the news article summarization systems—Telegram NewsCV. Experimental results show that, under certain parameters, the collections formed by ClusterVote have similar extractive characteristics with such well-known datasets as CNN/Daily Mail and at the same time have higher indicators of “factuality”—reproduction in summaries of named entities of source texts, as well as their relationships.
doi_str_mv 10.1134/S1054661823030070
format article
fullrecord <record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_journals_2869268536</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2869268536</sourcerecordid><originalsourceid>FETCH-LOGICAL-c268t-f51a4db74fc486d4fcbd12d9e25e6faadd2813e66a3682456674a8426d8041023</originalsourceid><addsrcrecordid>eNp1kE9PAjEQxRujiYh-AG9NPK92-o9yJETFBPXAet6UbRcWly22XYl-egtoPBhPM5n3fm8mg9AlkGsAxm9mQASXEhRlhBEyIEeoB0KITFKgx6lPcrbTT9FZCCtCiIIh7aFtvrT40calM9hVeNRFt9axLvHYtSH6roy1a3dK7nXd1u0iCU1j9-OAK-dxTAG5Dq97fJ4YncR3i2fdeq19_al_Ep7sNuCRT-GNDefopNJNsBfftY9e7m7z8SSbPt8_jEfTrKRSxawSoLmZD3hVciVNKnMD1AwtFVZWWhtDFTArpWZSUS6kHHCtOJVGEQ6Esj66OuRuvHvrbIjFynW-TSsLquQwLRFMJhccXKV3IXhbFRtfp-s_CiDF7r_Fn_8mhh6YkLztwvrf5P-hL9HGfRI</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2869268536</pqid></control><display><type>article</type><title>The Method of Automatic Construction of Training Collections for the Task of Abstractive Summarization of News Articles</title><source>Springer Nature</source><creator>Chernyshev, D. I. ; Dobrov, B. V.</creator><creatorcontrib>Chernyshev, D. I. ; Dobrov, B. V.</creatorcontrib><description>Creating a collection of examples for training abstractive summarization systems is a costly process owing to the high time costs and high requirements for the qualification of experts necessary for writing high-quality summaries. A new method of creating collections for training neural summarization methods is proposed—ClusterVote, designed to simulate the features of the task by taking into account information in related documents. The method can be used to form abstractive summaries of various levels of detail, as well as to obtain extractive summaries. Using the ClusterVote method, a new collection was formed in English and Russian to train the news article summarization systems—Telegram NewsCV. 
Experimental results show that, under certain parameters, the collections formed by ClusterVote have similar extractive characteristics with such well-known datasets as CNN/Daily Mail and at the same time have higher indicators of “factuality”—reproduction in summaries of named entities of source texts, as well as their relationships.</description><identifier>ISSN: 1054-6618</identifier><identifier>EISSN: 1555-6212</identifier><identifier>DOI: 10.1134/S1054661823030070</identifier><language>eng</language><publisher>Moscow: Pleiades Publishing</publisher><subject>Computer Science ; Image Processing and Computer Vision ; Pattern Recognition ; Selected Conference Papers ; Summaries ; Training</subject><ispartof>Pattern recognition and image analysis, 2023-09, Vol.33 (3), p.255-267</ispartof><rights>Pleiades Publishing, Ltd. 2023. ISSN 1054-6618, Pattern Recognition and Image Analysis, 2023, Vol. 33, No. 3, pp. 255–267. © Pleiades Publishing, Ltd., 2023.</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><cites>FETCH-LOGICAL-c268t-f51a4db74fc486d4fcbd12d9e25e6faadd2813e66a3682456674a8426d8041023</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>314,776,780,27901,27902</link.rule.ids></links><search><creatorcontrib>Chernyshev, D. I.</creatorcontrib><creatorcontrib>Dobrov, B. V.</creatorcontrib><title>The Method of Automatic Construction of Training Collections for the Task of Abstractive Summarization of News Articles</title><title>Pattern recognition and image analysis</title><addtitle>Pattern Recognit. Image Anal</addtitle><description>Creating a collection of examples for training abstractive summarization systems is a costly process owing to the high time costs and high requirements for the qualification of experts necessary for writing high-quality summaries. 
A new method of creating collections for training neural summarization methods is proposed—ClusterVote, designed to simulate the features of the task by taking into account information in related documents. The method can be used to form abstractive summaries of various levels of detail, as well as to obtain extractive summaries. Using the ClusterVote method, a new collection was formed in English and Russian to train the news article summarization systems—Telegram NewsCV. Experimental results show that, under certain parameters, the collections formed by ClusterVote have similar extractive characteristics with such well-known datasets as CNN/Daily Mail and at the same time have higher indicators of “factuality”—reproduction in summaries of named entities of source texts, as well as their relationships.</description><subject>Computer Science</subject><subject>Image Processing and Computer Vision</subject><subject>Pattern Recognition</subject><subject>Selected Conference Papers</subject><subject>Summaries</subject><subject>Training</subject><issn>1054-6618</issn><issn>1555-6212</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2023</creationdate><recordtype>article</recordtype><recordid>eNp1kE9PAjEQxRujiYh-AG9NPK92-o9yJETFBPXAet6UbRcWly22XYl-egtoPBhPM5n3fm8mg9AlkGsAxm9mQASXEhRlhBEyIEeoB0KITFKgx6lPcrbTT9FZCCtCiIIh7aFtvrT40calM9hVeNRFt9axLvHYtSH6roy1a3dK7nXd1u0iCU1j9-OAK-dxTAG5Dq97fJ4YncR3i2fdeq19_al_Ep7sNuCRT-GNDefopNJNsBfftY9e7m7z8SSbPt8_jEfTrKRSxawSoLmZD3hVciVNKnMD1AwtFVZWWhtDFTArpWZSUS6kHHCtOJVGEQ6Esj66OuRuvHvrbIjFynW-TSsLquQwLRFMJhccXKV3IXhbFRtfp-s_CiDF7r_Fn_8mhh6YkLztwvrf5P-hL9HGfRI</recordid><startdate>20230901</startdate><enddate>20230901</enddate><creator>Chernyshev, D. I.</creator><creator>Dobrov, B. 
V.</creator><general>Pleiades Publishing</general><general>Springer Nature B.V</general><scope>AAYXX</scope><scope>CITATION</scope><scope>7SC</scope><scope>8FD</scope><scope>JQ2</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope></search><sort><creationdate>20230901</creationdate><title>The Method of Automatic Construction of Training Collections for the Task of Abstractive Summarization of News Articles</title><author>Chernyshev, D. I. ; Dobrov, B. V.</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c268t-f51a4db74fc486d4fcbd12d9e25e6faadd2813e66a3682456674a8426d8041023</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2023</creationdate><topic>Computer Science</topic><topic>Image Processing and Computer Vision</topic><topic>Pattern Recognition</topic><topic>Selected Conference Papers</topic><topic>Summaries</topic><topic>Training</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Chernyshev, D. I.</creatorcontrib><creatorcontrib>Dobrov, B. V.</creatorcontrib><collection>CrossRef</collection><collection>Computer and Information Systems Abstracts</collection><collection>Technology Research Database</collection><collection>ProQuest Computer Science Collection</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts – Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><jtitle>Pattern recognition and image analysis</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Chernyshev, D. I.</au><au>Dobrov, B. 
V.</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>The Method of Automatic Construction of Training Collections for the Task of Abstractive Summarization of News Articles</atitle><jtitle>Pattern recognition and image analysis</jtitle><stitle>Pattern Recognit. Image Anal</stitle><date>2023-09-01</date><risdate>2023</risdate><volume>33</volume><issue>3</issue><spage>255</spage><epage>267</epage><pages>255-267</pages><issn>1054-6618</issn><eissn>1555-6212</eissn><abstract>Creating a collection of examples for training abstractive summarization systems is a costly process owing to the high time costs and high requirements for the qualification of experts necessary for writing high-quality summaries. A new method of creating collections for training neural summarization methods is proposed—ClusterVote, designed to simulate the features of the task by taking into account information in related documents. The method can be used to form abstractive summaries of various levels of detail, as well as to obtain extractive summaries. Using the ClusterVote method, a new collection was formed in English and Russian to train the news article summarization systems—Telegram NewsCV. Experimental results show that, under certain parameters, the collections formed by ClusterVote have similar extractive characteristics with such well-known datasets as CNN/Daily Mail and at the same time have higher indicators of “factuality”—reproduction in summaries of named entities of source texts, as well as their relationships.</abstract><cop>Moscow</cop><pub>Pleiades Publishing</pub><doi>10.1134/S1054661823030070</doi><tpages>13</tpages></addata></record>
fulltext fulltext
identifier ISSN: 1054-6618
ispartof Pattern recognition and image analysis, 2023-09, Vol.33 (3), p.255-267
issn 1054-6618
1555-6212
language eng
recordid cdi_proquest_journals_2869268536
source Springer Nature
subjects Computer Science
Image Processing and Computer Vision
Pattern Recognition
Selected Conference Papers
Summaries
Training
title The Method of Automatic Construction of Training Collections for the Task of Abstractive Summarization of News Articles
url http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-29T12%3A13%3A12IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=The%20Method%20of%20Automatic%20Construction%20of%20Training%20Collections%20for%20the%20Task%20of%20Abstractive%20Summarization%20of%20News%20Articles&rft.jtitle=Pattern%20recognition%20and%20image%20analysis&rft.au=Chernyshev,%20D.%C2%A0I.&rft.date=2023-09-01&rft.volume=33&rft.issue=3&rft.spage=255&rft.epage=267&rft.pages=255-267&rft.issn=1054-6618&rft.eissn=1555-6212&rft_id=info:doi/10.1134/S1054661823030070&rft_dat=%3Cproquest_cross%3E2869268536%3C/proquest_cross%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c268t-f51a4db74fc486d4fcbd12d9e25e6faadd2813e66a3682456674a8426d8041023%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=2869268536&rft_id=info:pmid/&rfr_iscdi=true