Layer-wise Guided Training for BERT: Learning Incrementally Refined Document Representations
Although BERT is widely used by the NLP community, little is known about its inner workings. Several attempts have been made to shed light on certain aspects of BERT, often with contradictory conclusions. A frequently raised concern focuses on BERT's over-parameterization and under-utilization issues...
Published in: | arXiv.org 2020-10 |
---|---|
Main Authors: | Manginas, Nikolaos; Chalkidis, Ilias; Malakasiotis, Prodromos |
Format: | Article |
Language: | English |
Subjects: | Classification; Labels; Parameterization |
Online Access: | Get full text |
container_title | arXiv.org |
---|---|
creator | Manginas, Nikolaos; Chalkidis, Ilias; Malakasiotis, Prodromos |
description | Although BERT is widely used by the NLP community, little is known about its inner workings. Several attempts have been made to shed light on certain aspects of BERT, often with contradictory conclusions. A frequently raised concern focuses on BERT's over-parameterization and under-utilization issues. To this end, we propose a novel approach to fine-tune BERT in a structured manner. Specifically, we focus on Large Scale Multilabel Text Classification (LMTC), where documents are assigned one or more labels from a large predefined set of hierarchically organized labels. Our approach guides specific BERT layers to predict labels from specific hierarchy levels. Experimenting with two LMTC datasets, we show that this structured fine-tuning approach not only yields better classification results but also leads to better parameter utilization. (A minimal illustrative sketch of this layer-wise guidance appears after the record fields below.) |
format | article |
fulltext | fulltext |
identifier | EISSN: 2331-8422 |
ispartof | arXiv.org, 2020-10 |
issn | 2331-8422 |
language | eng |
recordid | cdi_proquest_journals_2450684675 |
source | Publicly Available Content Database |
subjects | Classification; Labels; Parameterization |
title | Layer-wise Guided Training for BERT: Learning Incrementally Refined Document Representations |
url | http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-21T16%3A58%3A56IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=document&rft.atitle=Layer-wise%20Guided%20Training%20for%20BERT:%20Learning%20Incrementally%20Refined%20Document%20Representations&rft.jtitle=arXiv.org&rft.au=Manginas,%20Nikolaos&rft.date=2020-10-12&rft.eissn=2331-8422&rft_id=info:doi/&rft_dat=%3Cproquest%3E2450684675%3C/proquest%3E%3Cgrp_id%3Ecdi_FETCH-proquest_journals_24506846753%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=2450684675&rft_id=info:pmid/&rfr_iscdi=true |
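The description above says that specific BERT layers are guided to predict labels from specific levels of the label hierarchy. The following is a minimal sketch of that general idea in PyTorch with HuggingFace Transformers, not the authors' actual implementation: the choice of guided layers (4, 8, 12), the label counts per hierarchy level, and the use of the [CLS] vector with a per-level BCE loss are all hypothetical assumptions made only for illustration.

```python
# Illustrative sketch: attach one classification head per hierarchy level to
# selected intermediate BERT layers and sum a multi-label loss per level.
# Layer indices, level sizes, and head design are hypothetical.
import torch
import torch.nn as nn
from transformers import BertModel


class LayerwiseGuidedBert(nn.Module):
    def __init__(self, model_name="bert-base-uncased",
                 guided_layers=(4, 8, 12),            # hypothetical layer -> hierarchy-level mapping
                 labels_per_level=(20, 200, 4000)):   # hypothetical label counts per level (coarse -> fine)
        super().__init__()
        self.bert = BertModel.from_pretrained(model_name)
        self.guided_layers = guided_layers
        hidden = self.bert.config.hidden_size
        # One linear head per guided layer / hierarchy level.
        self.heads = nn.ModuleList([nn.Linear(hidden, n) for n in labels_per_level])

    def forward(self, input_ids, attention_mask, level_targets=None):
        out = self.bert(input_ids=input_ids, attention_mask=attention_mask,
                        output_hidden_states=True)
        # hidden_states[0] is the embedding output; index i is the output of encoder layer i.
        logits = [head(out.hidden_states[layer][:, 0])  # [CLS] representation of that layer
                  for layer, head in zip(self.guided_layers, self.heads)]
        if level_targets is None:
            return logits
        # level_targets: one multi-hot float tensor per hierarchy level.
        loss_fn = nn.BCEWithLogitsLoss()
        loss = sum(loss_fn(l, t) for l, t in zip(logits, level_targets))
        return logits, loss
```

In this reading, the summed per-level losses explicitly push shallower layers toward the coarser hierarchy levels while the final layer handles the finest-grained label set, which is one way the "incrementally refined document representations" of the title could arise during fine-tuning.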