
Automatic building of an ontology on the basis of text corpora in Thai


Bibliographic Details
Published in: Language Resources and Evaluation 2008-05, Vol.42 (2), p.137-149
Main Authors: Imsombut, Aurawan; Kawtrakul, Asanee
Format: Article
Language: English
Description
Summary: This paper presents a methodology for automatically learning ontologies from Thai text corpora by extracting terms and relations. A shallow parser is used to chunk the texts, in which we identify taxonomic relations with the help of cues: lexico-syntactic patterns and item lists. The main advantage of the approach is that it simplifies the task of concept and relation labeling, since the cues help identify the ontological concepts and hint at their relations. However, these techniques pose certain problems, namely cue word ambiguity, item list identification, and numerous candidate terms. We also propose a methodology to solve these problems by using lexicon and co-occurrence features and weighting them with information gain. The precision, recall and F-measure of the system are 0.74, 0.78 and 0.76, respectively.
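The reported scores can be checked against each other. The sketch below assumes the F-measure is the balanced F1 score, that is, the harmonic mean of precision and recall; the record does not state which variant the authors used, and the function name `f_measure` is only illustrative.

```python
def f_measure(precision: float, recall: float) -> float:
    """Balanced F-measure (F1): harmonic mean of precision and recall.

    Assumption: the summary's "F-measure" is F1; the record does not say.
    """
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported in the summary.
reported_precision = 0.74
reported_recall = 0.78

f1 = f_measure(reported_precision, reported_recall)
print(round(f1, 2))
```

Under that assumption, the rounded result agrees with the reported F-measure of 0.76.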
ISSN: 1574-020X, 1572-8412, 1574-0218
DOI: 10.1007/s10579-007-9045-5