Automatic Technical Term Extraction Based on Term Association

Bibliographic Details
Main Authors: Miao Wan, Song Liu, Jian-Yi Liu, Cong Wang
Format: Conference Proceeding
Language: English
Description
Summary: This paper proposes a new algorithm for automatic Chinese term extraction that combines statistics-based and rule-based methods. The first subsystem uses a statistical method to extract two-word candidates from a raw corpus, then extends these candidates forward to obtain multi-word candidate terms. We propose a new metric, term association (TA), that measures how strongly the words in a string combine. The second subsystem filters the candidates against defined rules to obtain domain-specific technical terms. The goal of the hybrid method is to achieve higher precision on domain-specific Chinese term extraction than previous approaches. The algorithm is implemented as an extractor that takes an unprocessed corpus of technical papers on ethanol fuels as input. In the reported experiments, precision and recall are 84.26% and 63.86%, respectively.
DOI: 10.1109/FSKD.2008.40
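
The two-stage pipeline described in the summary can be illustrated with a minimal sketch. This record does not define the term association (TA) metric, so the sketch substitutes pointwise mutual information (PMI), a standard association measure, purely as a stand-in; it is not the paper's TA. It also assumes a pre-segmented token list (the paper works from a raw Chinese corpus), and the function names, thresholds, and the stopword rule are all hypothetical choices made for illustration.

```python
import math
from collections import Counter

# Hypothetical rule set: a term may not start or end with one of these words.
STOPWORDS = {"the", "of", "is", "a", "and"}


def bigram_pmi(tokens):
    """Score each adjacent word pair by PMI (a stand-in, NOT the paper's TA)."""
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    n_uni = len(tokens)
    n_bi = max(len(tokens) - 1, 1)
    scores = {}
    for (w1, w2), count in bigrams.items():
        p_pair = count / n_bi
        p_w1 = unigrams[w1] / n_uni
        p_w2 = unigrams[w2] / n_uni
        scores[(w1, w2)] = math.log2(p_pair / (p_w1 * p_w2))
    return scores


def two_word_candidates(tokens, min_count=2, min_pmi=1.0):
    """Stage 1a: keep pairs that recur and are strongly associated."""
    bigrams = Counter(zip(tokens, tokens[1:]))
    scores = bigram_pmi(tokens)
    return sorted(
        pair for pair, score in scores.items()
        if bigrams[pair] >= min_count and score >= min_pmi
    )


def extend_forward(tokens, pair, min_count=2):
    """Stage 1b: grow a pair to the right while the same next word recurs."""
    candidate = list(pair)
    while True:
        n = len(candidate)
        followers = Counter(
            tokens[i + n]
            for i in range(len(tokens) - n)
            if tokens[i:i + n] == candidate
        )
        if not followers:
            break
        word, count = followers.most_common(1)[0]
        if count < min_count:
            break
        candidate.append(word)
    return tuple(candidate)


def passes_rules(term):
    """Stage 2: a toy rule filter (assumed; the paper's rules are not listed here)."""
    return term[0] not in STOPWORDS and term[-1] not in STOPWORDS


def extract_terms(tokens):
    """Run both stages and return the surviving candidate terms."""
    extended = {extend_forward(tokens, pair) for pair in two_word_candidates(tokens)}
    return sorted(term for term in extended if passes_rules(term))


if __name__ == "__main__":
    text = ("ethanol fuel blend burns clean and ethanol fuel blend "
            "costs less and the fuel is good")
    for term in extract_terms(text.split()):
        print(" ".join(term))
```

Keeping candidate generation separate from rule filtering mirrors the two subsystems named in the summary, so either stage could be swapped out (for example, replacing PMI with the actual TA definition from the paper) without touching the other.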