A method for multilingual text mining and retrieval using growing hierarchical self-organizing maps
Published in: | Journal of information science 2009-02, Vol.35 (1), p.3-23 |
---|---|
Format: | Article |
Language: | English |
Summary: | With the increasing number of multilingual texts on the internet, multilingual text retrieval techniques have become an important research issue. However, the discovery of relationships between different languages remains an open problem. In this paper we propose a method that applies the growing hierarchical self-organizing map (GHSOM) model to discover knowledge from multilingual text documents. The GHSOM is trained on multilingual parallel corpora to generate hierarchical feature maps. A discovery process is then applied to these maps to reveal the relationships between documents in different languages, as well as the relationships between keywords in different languages. We conducted experiments on a set of Chinese-English bilingual parallel corpora to discover the relationships between documents in these two languages, and we use these relationships to perform multilingual information retrieval tasks. The experimental results show that our multilingual text mining approach may capture conceptual relationships among documents, as well as among keywords, written in different languages. |
---|---|
ISSN: | 0165-5515, 1741-6485 |
DOI: | 10.1177/0165551508088968 |