An Approach Based on the Visualization Model for the Ukrainian Web Content Classification
Main Authors: | , , , , , , |
---|---|
Format: | Conference Proceeding |
Language: | English |
Subjects: | |
Online Access: | Request full text |
Summary: | The work established that TextRank is the best vectorization method for modern everyday Ukrainian. An approach is proposed for building an SVM-based classifier that classifies web content written in everyday Ukrainian. The effectiveness of the proposed approach was evaluated using precision, recall, F1 score, and a confusion matrix. The classifier performed well on all metrics, with an accuracy of over 99%. The research was conducted on Ukrainian-language texts in two categories, each containing more than 200 texts of about 500 words. The results make it possible to apply the proposed approach to modern everyday Ukrainian for textual content analysis and classification by various features. The study results can be used to address socially significant issues: detecting bullying in social networks, detecting negative emotional content, and warning users about possible harmful content. |
---|---|
ISSN: | 2770-5218 |
DOI: | 10.1109/ACIT54803.2022.9913162 |
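
The summary names precision, recall, F1, and a confusion matrix as the evaluation metrics for a two-category classifier. As a rough illustration of how those quantities relate, the sketch below computes them from a small set of made-up labels. The category names `neutral` and `harmful`, the toy label lists, and the helper function names are all assumptions for illustration; they are not taken from the paper, and the code does not reproduce the authors' classifier or data.

```python
from collections import Counter


def confusion_matrix(y_true, y_pred, labels):
    """Rows are true labels, columns are predicted labels, in `labels` order."""
    counts = Counter(zip(y_true, y_pred))
    return [[counts[(t, p)] for p in labels] for t in labels]


def precision_recall_f1(y_true, y_pred, positive):
    """Precision, recall, and F1 for one class treated as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return precision, recall, f1


# Hypothetical toy labels, not data from the study.
y_true = ["neutral", "neutral", "harmful", "harmful", "harmful", "neutral"]
y_pred = ["neutral", "harmful", "harmful", "harmful", "neutral", "neutral"]
labels = ["neutral", "harmful"]

matrix = confusion_matrix(y_true, y_pred, labels)
precision, recall, f1 = precision_recall_f1(y_true, y_pred, positive="harmful")

print(matrix)
print(precision, recall, f1)
```

With two categories, the confusion matrix is 2×2, and precision and recall for one category can be read directly from its column and row.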