A Comparative Study of ML Models on Large-Scale Turkish Sentiment Datasets
Main Authors: | , , |
---|---|
Format: | Conference Proceeding |
Language: | English |
Summary: | This study investigates the performance of traditional Machine Learning (ML) models for Sentiment Analysis (SA) in the Turkish language, addressing the growing need for robust Natural Language Processing (NLP) techniques in low-resource languages. We compare the efficacy of Logistic Regression (LR), Support Vector Machine (SVM), Random Forest (RF), and Naive Bayes (NB) classifiers on two recently developed large-scale Turkish sentiment datasets: "Vitamins and Supplements Customer Review" (VSCR) and "Turkish Sentiment Analysis version I" (TRSAv1). Our research leverages advanced hyperparameter tuning techniques to optimize these models, providing benchmarks and practical insights for Turkish SA. The study contributes to the field by establishing performance baselines on new open-source datasets, enhancing understanding of NLP models for agglutinative languages, and analyzing factors affecting classifier performance in binary and multi-class scenarios. Our findings reveal the strengths and limitations of each model within the context of Turkish language SA, offering valuable guidance for future research and applications in this domain. |
ISSN: | 2770-7946 |
DOI: | 10.1109/ASYU62119.2024.10757130 |