Loading…
WordNet2Vec: Corpora agnostic word vectorization method
The complex nature of big data resources requires new structuring methods, especially for textual content. WordNet is a good knowledge source for the comprehensive abstraction of natural language as it offers good implementation for many languages. Since WordNet embeds natural language in the form o...
Saved in:
Published in: | Neurocomputing (Amsterdam) 2019-01, Vol.326-327, p.141-150 |
---|---|
Main Authors: | , , , , |
Format: | Article |
Language: | English |
Subjects: | |
Citations: | Items that this one cites Items that cite this one |
Online Access: | Get full text |
Tags: |
Add Tag
No Tags, Be the first to tag this record!
|
Summary: | The complex nature of big data resources requires new structuring methods, especially for textual content. WordNet is a good knowledge source for the comprehensive abstraction of natural language as it offers good implementation for many languages. Since WordNet embeds natural language in the form of a complex network, a transformation mechanism, WordNet2Vec, is proposed in this paper. This creates vectors for each word from WordNet. These vectors encapsulate a general position — the role of a given word related to all other words in the given natural language. Any list or set of such vectors contains knowledge about the context of its components within the whole language. This type of word representation can be easily applied to many analytic tasks such as classification or clustering. The usefulness of the WordNet2Vec method is demonstrated in sentiment analysis including the classification of an Amazon opinion text dataset with transfer learning. |
---|---|
ISSN: | 0925-2312 1872-8286 |
DOI: | 10.1016/j.neucom.2017.01.121 |