
Unsupervised extractive multi-document summarization method based on transfer learning from BERT multi-task fine-tuning

Text representation is a fundamental cornerstone that impacts the effectiveness of several text summarization methods. Transfer learning using pre-trained word embedding models has shown promising results. However, most of these representations do not consider the order and the semantic relationships between words in a sentence, and thus they do not carry the meaning of a full sentence. To overcome this issue, the current study proposes an unsupervised method for extractive multi-document summarization based on transfer learning from a BERT sentence embedding model. Moreover, to improve sentence representation learning, we fine-tune the BERT model on supervised intermediate tasks from the GLUE benchmark datasets using single-task and multi-task fine-tuning methods. Experiments are performed on the standard DUC’2002–2004 datasets. The obtained results show that our method significantly outperforms several baseline methods and achieves comparable, and sometimes better, performance than recent state-of-the-art deep learning–based methods. Furthermore, the results show that fine-tuning BERT using multi-task learning considerably improves performance.

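As a rough illustration of the embedding-based extraction step described in the abstract, the sketch below encodes the sentences of a document cluster with a generic pre-trained sentence encoder, scores each sentence by cosine similarity to the cluster centroid, and greedily selects the most central, non-redundant sentences. This is a minimal sketch only: the all-MiniLM-L6-v2 checkpoint from the sentence-transformers library stands in for the paper's multi-task fine-tuned BERT encoder, and the centroid scoring rule and redundancy threshold are illustrative assumptions, not the authors' exact pipeline.

```python
# Minimal sketch (illustrative, not the paper's exact method): rank sentences
# of a document cluster by similarity to the cluster centroid in a
# sentence-embedding space, then pick the top non-redundant ones.
import numpy as np
from sentence_transformers import SentenceTransformer


def summarize(sentences, max_sentences=5, redundancy_threshold=0.8):
    """Return an extractive summary of a multi-document sentence list."""
    # Stand-in encoder; the paper instead fine-tunes BERT on GLUE tasks.
    model = SentenceTransformer("all-MiniLM-L6-v2")
    # Unit-normalised embeddings make dot products equal to cosine similarity.
    emb = model.encode(sentences, normalize_embeddings=True)

    centroid = emb.mean(axis=0)
    centroid /= np.linalg.norm(centroid)
    scores = emb @ centroid  # centrality of each sentence in the cluster

    selected = []
    for idx in np.argsort(-scores):  # most central sentences first
        # Redundancy filter: skip sentences too close to an already chosen one.
        if any(emb[idx] @ emb[j] > redundancy_threshold for j in selected):
            continue
        selected.append(idx)
        if len(selected) == max_sentences:
            break
    # Present the selected sentences in their original order.
    return [sentences[i] for i in sorted(selected)]


if __name__ == "__main__":
    cluster = [
        "Transfer learning with pre-trained encoders improves summarization.",
        "Sentence embeddings capture word order and sentence-level semantics.",
        "Pre-trained encoders transfer well to summarization tasks.",
        "Unrelated filler sentences should be ranked low.",
    ]
    print(summarize(cluster, max_sentences=2))
```

Under the paper's approach, the stand-in checkpoint would be replaced by a BERT encoder fine-tuned on GLUE intermediate tasks; only the model-loading line would change.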

Bibliographic Details
Published in: Journal of Information Science, 2023-02, Vol. 49 (1), p. 164-182
Main Authors: Lamsiyah, Salima; Mahdaouy, Abdelkader El; Ouatik, Saïd El Alaoui; Espinasse, Bernard
Format: Article
Language: English
Subjects: Computer Science; Datasets; Deep learning; Documents; Embedding; Representations
ISSN: 0165-5515
EISSN: 1741-6485
DOI: 10.1177/0165551521990616
Publisher: London, England: SAGE Publications
Rights: The Author(s) 2021; distributed under a Creative Commons Attribution 4.0 International License
Source: Library & Information Science Abstracts (LISA); Sage Journals Online
Online Access: Get full text; view record in HAL: https://hal.science/hal-03594048