
Multitasking of sentiment detection and emotion recognition in code-mixed Hinglish data


Bibliographic Details
Published in: Knowledge-Based Systems, 2023-01, Vol. 260, p. 110182, Article 110182
Main Authors: Ghosh, Soumitra, Priyankar, Amit, Ekbal, Asif, Bhattacharyya, Pushpak
Format: Article
Language: English
Description
Summary: As the number of non-native English speakers on social media has skyrocketed in recent years, sentiment and emotion analysis on regional languages and code-mixed data has gained traction. Despite extensive research on English, the area of Hindi–English code-mixed texts is still relatively new and understudied. To address this gap and enable future researchers to contribute to this domain, we create an emotion-annotated Hindi–English (Hinglish) code-mixed dataset by performing emotion annotation on the benchmark SentiMix dataset. We propose an end-to-end transformer-based multitask framework for sentiment detection and emotion recognition from the SentiMix code-mixed dataset. We fine-tune the pre-trained cross-lingual embedding model, XLMR, using task-specific data to further exploit the efficacy of transfer learning to improve the overall efficiency of our methods. Our proposed multitask solution outperforms the state-of-the-art single-task and multitask baselines by a considerable margin, implying that the auxiliary task (i.e., emotion recognition) increases the efficiency of the primary task (i.e., sentiment detection) in a multitask environment. It should be noted that the reported findings were obtained without the use of any ensemble techniques, thereby adhering to a model of effective and production-ready NLP.

• Investigated the role of emotion in code-mixed Hinglish sentiment detection.
• Proposed a cross-lingual multitask system for sentiment analysis on code-mixed texts.
• Introduced a benchmark emotion-annotated code-mixed Hindi–English (Hinglish) dataset.
• Transfer learning improves performance on code-mixed Hinglish sentiment data.
• The dataset is available at: http://www.iitp.ac.in/~ai-nlp-ml/resources.html#EmoSen
ISSN: 0950-7051, 1872-7409
DOI: 10.1016/j.knosys.2022.110182