Active learning for sentiment analysis on data streams: Methodology and workflow implementation in the ClowdFlows platform
Published in: | Information processing & management 2015-03, Vol.51 (2), p.187-203 |
---|---|
Main Authors: | , , , , , |
Format: | Article |
Language: | English |
Subjects: | |
Summary: | • We present a cloud-based platform for data stream processing with workflows. • The ClowdFlows platform enables processing of multiple concurrent data streams. • We implement an active learning scenario for sentiment analysis on data streams. • Machine learning methods are shown to be suitable for sentiment analysis. • Active learning improves the accuracy of sentiment classification.
Sentiment analysis from data streams aims to detect authors’ attitudes, emotions, and opinions from texts in real time. To reduce the labeling effort needed in the data collection phase, active learning is often applied in streaming scenarios, where a learning algorithm is allowed to select new examples to be manually labeled in order to improve the learner’s performance. Even though many online platforms perform sentiment analysis, there is no publicly available interactive online platform for dynamic adaptive sentiment analysis that can handle changes in data streams and adapt its behavior over time. This paper describes ClowdFlows, a cloud-based scientific workflow platform, and its extensions enabling the analysis of data streams and active learning. Moreover, by utilizing the data and workflow sharing in ClowdFlows, the labeling of examples can be distributed through crowdsourcing. The advanced features of ClowdFlows are demonstrated on a sentiment analysis use case, in which active learning with a linear Support Vector Machine trains sentiment classification models that are applied to microblogging data streams. |
---|---|
ISSN: | 0306-4573 1873-5371 |
DOI: | 10.1016/j.ipm.2014.04.001 |