Multitasking of sentiment detection and emotion recognition in code-mixed Hinglish data
Published in: | Knowledge-based systems 2023-01, Vol.260, p.110182, Article 110182 |
---|---|
Main Authors: | , , , |
Format: | Article |
Language: | English |
Subjects: | |
Summary: | As the number of non-native English speakers on social media has skyrocketed in recent years, sentiment and emotion analysis on regional languages and code-mixed data has gained traction. Despite extensive research on English, the area of Hindi–English code-mixed texts is still relatively new and understudied. To address this gap and enable future researchers to contribute to this domain, we create an emotion-annotated Hindi–English (Hinglish) code-mixed dataset by adding emotion annotations to the benchmark SentiMix dataset. We propose an end-to-end transformer-based multitask framework for sentiment detection and emotion recognition on the SentiMix code-mixed dataset. We fine-tune the pre-trained cross-lingual embedding model, XLMR, on task-specific data to further exploit transfer learning and improve the overall efficiency of our methods. Our proposed multitask solution outperforms the state-of-the-art single-task and multitask baselines by a considerable margin, implying that the auxiliary task (i.e., emotion recognition) increases the efficiency of the primary task (i.e., sentiment detection) in a multitask environment. The reported findings were obtained without any ensemble techniques, thereby adhering to a model of effective and production-ready NLP.
• Investigated the role of emotion in code-mixed Hinglish sentiment detection.
• Proposed a cross-lingual multitask system for sentiment analysis on code-mixed texts.
• Introduced a benchmark emotion-annotated code-mixed Hindi–English (Hinglish) dataset.
• Transfer learning improves performance on code-mixed Hinglish sentiment data.
• The dataset is available at: http://www.iitp.ac.in/~ai-nlp-ml/resources.html#EmoSen. |
---|---|
ISSN: | 0950-7051 1872-7409 |
DOI: | 10.1016/j.knosys.2022.110182 |
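
The multitask setup described in the summary (one shared representation feeding a primary sentiment head and an auxiliary emotion head) can be illustrated with a minimal sketch. This is not the authors' implementation: the feature vector stands in for a pooled XLMR sentence embedding, and the function names, the class counts (3 sentiment, 6 emotion), and the auxiliary loss weight `aux_weight` are all assumptions chosen for illustration. The record does not say how the two task losses are combined; a weighted sum is used here as one common choice.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def linear(weights, bias, features):
    """Apply one classification head: one weight row and one bias per class."""
    return [
        sum(w * x for w, x in zip(row, features)) + b
        for row, b in zip(weights, bias)
    ]


def cross_entropy(logits, gold):
    """Negative log-likelihood of the gold class index."""
    return -math.log(softmax(logits)[gold])


def init_head(n_classes, dim, rng):
    """Hypothetical head initialiser: small random weights, zero biases."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(dim)] for _ in range(n_classes)]
    return weights, [0.0] * n_classes


def multitask_loss(features, sentiment_head, emotion_head,
                   sentiment_gold, emotion_gold, aux_weight=0.5):
    """Joint loss over a shared feature vector.

    Both heads read the same features, so in a trainable model the
    auxiliary emotion loss would also shape the shared encoder.
    aux_weight is an assumed hyperparameter, not a value from the paper.
    """
    sent_loss = cross_entropy(linear(*sentiment_head, features), sentiment_gold)
    emo_loss = cross_entropy(linear(*emotion_head, features), emotion_gold)
    return sent_loss + aux_weight * emo_loss, sent_loss, emo_loss


# Usage: a random 8-dimensional vector stands in for the encoder output.
rng = random.Random(0)
dim = 8
features = [rng.uniform(-1.0, 1.0) for _ in range(dim)]
sentiment_head = init_head(3, dim, rng)  # assumed: 3 sentiment classes
emotion_head = init_head(6, dim, rng)    # assumed: 6 emotion classes
total, sent_loss, emo_loss = multitask_loss(
    features, sentiment_head, emotion_head, sentiment_gold=2, emotion_gold=4
)
print(f"sentiment={sent_loss:.3f} emotion={emo_loss:.3f} joint={total:.3f}")
```

In a full system the features would come from the fine-tuned encoder, and gradients of the joint loss would update both heads and the encoder together, which is the mechanism by which an auxiliary task can influence the primary one.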