A Comparative Study of ML Models on Large-Scale Turkish Sentiment Datasets

Bibliographic Details
Main Authors: Kiziltepe, Rukiye Savran; Ezin, Ercan; Karakus, Murat
Format: Conference Proceeding
Language: English
Description
Summary: This study investigates the performance of traditional Machine Learning (ML) models for Sentiment Analysis (SA) in the Turkish language, addressing the growing need for robust Natural Language Processing (NLP) techniques in low-resource languages. We compare the efficacy of Logistic Regression (LR), Support Vector Machine (SVM), Random Forest (RF), and Naive Bayes (NB) classifiers on two recently developed large-scale Turkish sentiment datasets: "Vitamins and Supplements Customer Review" (VSCR) and "Turkish Sentiment Analysis version I" (TRSAv1). Our research leverages advanced hyperparameter tuning techniques to optimize these models, providing benchmarks and practical insights for Turkish SA. The study contributes to the field by establishing performance baselines on new open-source datasets, enhancing understanding of NLP models for agglutinative languages, and analyzing factors affecting classifier performance in binary and multi-class scenarios. Our findings reveal the strengths and limitations of each model within the context of Turkish language SA, offering valuable guidance for future research and applications in this domain.
ISSN: 2770-7946
DOI: 10.1109/ASYU62119.2024.10757130