A novel sentiment aware dictionary for multi-domain sentiment classification
Published in: | Computers & electrical engineering 2018-07, Vol.69, p.585-597 |
---|---|
Format: | Article |
Language: | English |
Summary: | • The proposed sentiment-aware dictionary, created using data from multiple domains, is a solution to multi-domain sentiment classification in the e-commerce domain.
• Our dictionary is used to classify unlabeled reviews of the target domain.
• Our classifier is implemented on Hindi-language product reviews. It can be easily extended to any reviews in the e-commerce domain by using a language-specific parser and tagger.
• In several experiments, the proposed method labeled 23–24% more words of the unlabeled target domain than Hindi Sentiwordnet (HSWN).
Sentiment analysis is a subarea of Natural Language Processing (NLP) that extracts users' opinions and classifies them according to their polarity. The task has many applications, but it is domain dependent, and annotating corpora in every domain of interest before training a classifier is costly. We attempt to solve this problem by creating a sentiment-aware dictionary using data from multiple domains. The dictionary is created using labeled data from the source domain and unlabeled data from both the source and target domains. It is then used to classify the unlabeled reviews of the target domain. The work is carried out in Hindi, an official language of India. The number of Hindi-language web pages has grown rapidly since the introduction of UTF-8 encoding. Compared with the labeling done by Hindi Sentiwordnet (HSWN), a general lexicon for word polarity, the proposed method labels 23–24% more words of the target domain. For the words available in HSWN, the labels assigned by our method match the HSWN labels with 76% accuracy. |
---|---|
ISSN: | 0045-7906 1879-0755 |
DOI: | 10.1016/j.compeleceng.2017.10.015 |
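
The two figures reported in the summary (more target-domain words labeled than HSWN, and 76% agreement on the words HSWN does cover) can be illustrated with a minimal sketch. This is not the authors' code: the function names `coverage` and `agreement`, the placeholder tokens, and the polarity labels are all hypothetical, and the record does not state exactly how the 23–24% figure was computed, so the sketch reports plain coverage for each lexicon and leaves the comparison to the reader.

```python
from typing import Dict, Iterable

# A lexicon maps a word to a polarity label ("pos" or "neg").
Lexicon = Dict[str, str]


def coverage(lexicon: Lexicon, vocabulary: Iterable[str]) -> float:
    """Fraction of distinct vocabulary words that the lexicon labels."""
    vocab = set(vocabulary)
    if not vocab:
        return 0.0
    return sum(1 for word in vocab if word in lexicon) / len(vocab)


def agreement(proposed: Lexicon, reference: Lexicon,
              vocabulary: Iterable[str]) -> float:
    """Fraction of words labeled by both lexicons whose labels match."""
    shared = [w for w in set(vocabulary) if w in proposed and w in reference]
    if not shared:
        return 0.0
    return sum(proposed[w] == reference[w] for w in shared) / len(shared)


# Invented toy data: placeholder tokens stand in for target-domain words.
target_vocab = ["w1", "w2", "w3", "w4", "w5"]
proposed_lexicon = {"w1": "pos", "w2": "neg", "w3": "pos", "w4": "neg"}
general_lexicon = {"w1": "pos", "w2": "pos"}  # stands in for a general lexicon

cov_proposed = coverage(proposed_lexicon, target_vocab)
cov_general = coverage(general_lexicon, target_vocab)
label_agreement = agreement(proposed_lexicon, general_lexicon, target_vocab)

print(f"proposed coverage: {cov_proposed:.2f}")
print(f"general coverage:  {cov_general:.2f}")
print(f"agreement on shared words: {label_agreement:.2f}")
```

Agreement is measured only on words present in both lexicons, which mirrors the summary's phrase "for the available words"; words that only the proposed dictionary labels count toward coverage but cannot be checked against the general lexicon.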