Fine-tuning techniques and data augmentation on transformer-based models for conversational texts and noisy user-generated content
Main Authors:
Format: Conference Proceeding
Language: English
Online Access: Request full text
Summary: Transfer learning and Transformer-based language models play important roles in the modern natural language processing research community. In this paper, we propose Transformer model fine-tuning and data augmentation (TMFTDA) techniques for conversational texts and noisy user-generated content. We use two NTCIR-15 tasks, namely the first Dialogue Evaluation task (DialEval-1) and the second Numeral Attachment in Financial Tweets task (FinNum-2), to evaluate the efficacy of TMFTDA. Experimental results show that TMFTDA substantially outperforms the Bidirectional Long Short-Term Memory (Bi-LSTM) baseline model on multi-turn dialogue evaluation in DialEval-1's Dialogue Quality (DQ) and Nugget Detection (ND) subtasks. Moreover, TMFTDA performs to a satisfactory level on FinNum-2 with a Cross-lingual Language Model trained using the Robustly Optimized BERT Pretraining Approach (XLM-RoBERTa). The research contribution of this paper is that we shed light on the usefulness of TMFTDA for conversational texts and noisy user-generated content in social media text analytics.
ISSN: 2473-991X
DOI: 10.1109/ASONAM49781.2020.9381329
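
As a rough illustration of the TMFTDA idea summarized above (not the authors' implementation), the following minimal Python sketch fine-tunes XLM-RoBERTa on noisy user-generated text after a simple data-augmentation step. The example tweets, labels, and the token-dropping augmentation rule are all assumptions made for the sketch; real experiments would use the FinNum-2 data and the paper's own augmentation strategy.

```python
import random

from datasets import Dataset
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

def drop_one_token(text: str) -> str:
    """Toy augmentation: delete a random token to mimic extra noise."""
    tokens = text.split()
    if len(tokens) > 1:
        tokens.pop(random.randrange(len(tokens)))
    return " ".join(tokens)

# Hypothetical labeled tweets standing in for FinNum-2 training data.
examples = [
    {"text": "BTC up 5% today, still holding 0.3", "label": 1},
    {"text": "sold all 200 shares, done with this stock", "label": 0},
]
# Data augmentation: add one noisy copy of each original example.
examples += [
    {"text": drop_one_token(e["text"]), "label": e["label"]} for e in examples
]

tokenizer = AutoTokenizer.from_pretrained("xlm-roberta-base")
model = AutoModelForSequenceClassification.from_pretrained(
    "xlm-roberta-base", num_labels=2
)

# Tokenize the augmented dataset for the Trainer.
dataset = Dataset.from_list(examples).map(
    lambda e: tokenizer(e["text"], truncation=True, max_length=64)
)

# Fine-tune with a small, illustrative configuration.
trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="tmftda_sketch",
        num_train_epochs=1,
        per_device_train_batch_size=2,
    ),
    train_dataset=dataset,
    tokenizer=tokenizer,  # enables dynamic padding via the default collator
)
trainer.train()
```

Token dropping is only one of many plausible augmentations for noisy social-media text; the same fine-tuning loop would apply unchanged to whichever augmentation the paper actually uses.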