N-Gram Approach for Gender Prediction

Bibliographic Details
Main Authors: Reddy, T. Raghunadha, Vardhan, B. Vishnu, Reddy, P. Vijayapal
Format: Conference Proceeding
Language: English
Description
Summary: The Internet is growing with a huge amount of information from blogs, Twitter tweets, reviews, social media networks, and other content. Most text on the Internet is unstructured and anonymous. Author Profiling is a text classification technique used to predict profiling characteristics of authors, such as gender, age, country, native language, and educational background, by analyzing their texts. Researchers have proposed different types of features, such as lexical, content-based, structural, and syntactic features, to identify the writing styles of authors. Most existing approaches in Author Profiling use a combination of features to represent a document vector for classification. In this paper, a new model is proposed in which document weights are calculated from a combination of POS N-grams and most frequent terms. These document weights are used to represent the document vectors for classification. The experiment was carried out on the reviews domain to predict the gender of the authors, and the results were promising compared with existing approaches in Author Profiling.
ISSN: 2473-3571
DOI: 10.1109/IACC.2017.0176
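
Illustration: the summary describes representing each document by weights derived from POS N-grams and most frequent terms, but this record does not give the weighting formula. The sketch below is therefore only one plausible reading, not the authors' method. It assumes documents are already POS-tagged as (token, tag) pairs, uses plain relative frequencies as the weights, and all function names, parameters, and sample reviews are hypothetical.

```python
from collections import Counter


def pos_ngrams(tags, n=2):
    """Return the POS n-grams (as tuples) found in a sequence of tags."""
    return [tuple(tags[i:i + n]) for i in range(len(tags) - n + 1)]


def build_vocabulary(docs, n=2, top_terms=3, top_ngrams=3):
    """Pick the most frequent terms and POS n-grams across the corpus."""
    term_counts = Counter()
    ngram_counts = Counter()
    for doc in docs:
        term_counts.update(tok.lower() for tok, _ in doc)
        ngram_counts.update(pos_ngrams([tag for _, tag in doc], n))
    terms = [t for t, _ in term_counts.most_common(top_terms)]
    ngrams = [g for g, _ in ngram_counts.most_common(top_ngrams)]
    return terms, ngrams


def document_vector(doc, terms, ngrams, n=2):
    """Weight each selected feature by its relative frequency in the document.

    Assumption: relative frequency stands in for the paper's document
    weights, which this record does not specify.
    """
    tokens = [tok.lower() for tok, _ in doc]
    grams = pos_ngrams([tag for _, tag in doc], n)
    term_freq, gram_freq = Counter(tokens), Counter(grams)
    term_part = [term_freq[t] / len(tokens) if tokens else 0.0 for t in terms]
    gram_part = [gram_freq[g] / len(grams) if grams else 0.0 for g in ngrams]
    return term_part + gram_part


# Hypothetical, hand-tagged review snippets (Penn Treebank style tags).
docs = [
    [("The", "DT"), ("room", "NN"), ("was", "VBD"), ("clean", "JJ")],
    [("The", "DT"), ("staff", "NN"), ("was", "VBD"), ("rude", "JJ")],
]
terms, ngrams = build_vocabulary(docs)
vectors = [document_vector(doc, terms, ngrams) for doc in docs]
```

The resulting vectors could then be passed, together with gender labels, to any standard classifier; the choice of classifier is likewise not stated in this record.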