N-Gram Approach for Gender Prediction
Main Authors: | , , |
---|---|
Format: | Conference Proceeding |
Language: | English |
Subjects: | |
Summary: | The Internet is growing with a huge amount of information from blogs, tweets, reviews, social media networks, and other content. Most text on the Internet is unstructured and anonymous. Author Profiling is a text classification technique used to predict profiling characteristics of authors, such as gender, age, country, native language, and educational background, by analyzing their texts. Researchers have proposed different types of features, such as lexical, content-based, structural, and syntactic features, to identify authors' writing styles. Most existing Author Profiling approaches combine features to represent a document vector for classification. In this paper, a new model is proposed in which document weights are calculated from a combination of POS N-grams and most frequent terms. These document weights are used to represent the document vectors for classification. The experiment was carried out on the reviews domain to predict the gender of the authors, and the results were promising compared with existing approaches in Author Profiling. |
ISSN: | 2473-3571 |
DOI: | 10.1109/IACC.2017.0176 |