Loading…

Sentiment analysis using a deep ensemble learning model

The coronavirus pandemic has kept people away from social life and this has led to an increase in the use of social media over the past two years. Thanks to social media, people can now instantly share their thoughts on various topics such as their favourite movies, restaurants, hotels, etc. This ha...

Full description

Saved in:
Bibliographic Details
Published in:Multimedia tools and applications 2024-04, Vol.83 (14), p.42207-42231
Main Authors: Başarslan, Muhammet Sinan, Kayaalp, Fatih
Format: Article
Language:English
Subjects:
Citations: Items that this one cites
Online Access:Get full text
Tags: Add Tag
No Tags, Be the first to tag this record!
Description
Summary:The coronavirus pandemic has kept people away from social life and this has led to an increase in the use of social media over the past two years. Thanks to social media, people can now instantly share their thoughts on various topics such as their favourite movies, restaurants, hotels, etc. This has created a huge amount of data and many researchers from different sciences have focused on analysing this data. Natural Language Processing (NLP) is one of these areas of computer science that uses artificial technologies. Sentiment analysis is also one of the tasks of NLP, which is based on extracting emotions from huge post data. In this study, sentiment analysis was performed on two datasets of tweets about coronavirus and TripAdvisor hotel reviews. A frequency-based word representation method (Term Frequency-Inverse Document Frequency (TF-IDF)) and a prediction-based Word2Vec word embedding method were used to vectorise the datasets. Sentiment analysis models were then built using single machine learning methods (Decision Trees-DT, K-Nearest Neighbour-KNN, Naive Bayes-NB and Support Vector Machine-SVM), single deep learning methods (Long Short Term Memory-LSTM, Recurrent Neural Network-RNN) and heterogeneous ensemble learning methods (Stacking and Majority Voting) based on these single machine learning and deep learning methods. Accuracy was used as a performance measure. The heterogeneous model with stacking (LSTM-RNN) has outperformed the other models with accuracy values of 0.864 on the coronavirus dataset and 0.898 on the Trip Advisor dataset and they have been evaluated as promising results when compared to the literature. It has been observed that the use of single methods as an ensemble gives better results, which is consistent with the literature, which is a step forward in the detection of sentiments through posts. Investigating the performance of heterogeneous ensemble learning models based on different algorithms in sentiment analysis tasks is planned as future work.
ISSN:1573-7721
1380-7501
1573-7721
DOI:10.1007/s11042-023-17278-6