
Discrete Student Psychology Optimization Algorithm for the Word Sense Disambiguation Problem


Bibliographic Details
Published in: Arabian journal for science and engineering (2011), 2024-03, Vol. 49 (3), p. 3487-3502
Main Authors: Haouassi, Hichem, Bekhouche, Abdelaali, Rahab, Hichem, Mahdaoui, Rafik, Chouhal, Ouahiba
Format: Article
Language: English
Description
Summary: Word Sense Disambiguation (WSD) is a key step in many natural language processing tasks, such as information search, automatic translation, and sentiment analysis. WSD is the process of identifying the appropriate senses of ambiguous words in a text. As the number of words to be disambiguated in large amounts of text data increases, WSD becomes very challenging, and an exhaustive search for the best set of senses may be impractical. Recently, several metaheuristic approaches have been proposed for different complex optimization problems and have achieved good results. Therefore, to improve the WSD process, this paper models the WSD problem as a combinatorial optimization problem and proposes the Discrete Student Psychology-Based Optimization (DSPBO) metaheuristic to select appropriate senses. The DSPBO-based WSD approach disambiguates multiple ambiguous words together according to their contexts in the target text, and a Lesk-based fitness function guides the DSPBO metaheuristic to optimize the overall semantic similarity of the selected senses. The proposed approach is evaluated and compared with several recent WSD approaches on the well-known corpora SensEval-2, SensEval-3, SemEval-2007, SemEval-13, and SemEval-15. The comparison is made in terms of F-measure, precision, and recall. Experiments show a significant improvement over both existing knowledge lexicon-based approaches and metaheuristic-based approaches, with higher F-measures of 84.21%, 83.33%, 87.5%, 77.58%, and 81.08% on SensEval-2, SensEval-3, SemEval-2007, SemEval-13, and SemEval-15, respectively.
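Illustration: the summary frames WSD as choosing one sense per ambiguous word so that a Lesk-style similarity over the chosen senses is maximized. The Python sketch below shows that framing on a toy problem. It is not the authors' DSPBO algorithm or their fitness function: the sense inventory `SENSES`, the stopword list, and the pairwise gloss-overlap score are all hypothetical and chosen only for illustration, and the search shown is the exhaustive baseline that the summary describes as impractical at scale.

```python
from itertools import product

# Hypothetical toy sense inventory (word -> {sense id: gloss}); not from the paper.
SENSES = {
    "bank": {
        "bank_river": "sloping land beside a river or stream of water",
        "bank_money": "financial institution that accepts deposits of money",
    },
    "deposit": {
        "deposit_money": "money placed in a financial institution account",
        "deposit_sediment": "layer of sediment left by water or a river",
    },
    "interest": {
        "interest_money": "fee paid on borrowed money by a financial institution",
        "interest_curiosity": "feeling of wanting to know about something",
    },
}

# Assumed minimal stopword list so that function words do not count as overlap.
STOPWORDS = {"a", "an", "the", "of", "or", "in", "on", "by", "that", "to", "about"}


def gloss_tokens(gloss):
    """Return the set of content words in a gloss."""
    return {w for w in gloss.lower().split() if w not in STOPWORDS}


def overlap(gloss_a, gloss_b):
    """Lesk-style score: number of content words shared by two glosses."""
    return len(gloss_tokens(gloss_a) & gloss_tokens(gloss_b))


def fitness(assignment):
    """Sum of pairwise gloss overlaps for one candidate assignment (word -> sense id)."""
    glosses = [SENSES[word][sense] for word, sense in assignment.items()]
    return sum(
        overlap(glosses[i], glosses[j])
        for i in range(len(glosses))
        for j in range(i + 1, len(glosses))
    )


def exhaustive_best(words):
    """Score every combination of senses and return the best one with its score."""
    best, best_score = None, -1
    for combo in product(*(SENSES[w] for w in words)):
        candidate = dict(zip(words, combo))
        score = fitness(candidate)
        if score > best_score:
            best, best_score = candidate, score
    return best, best_score


words = ["bank", "deposit", "interest"]
best_assignment, best_score = exhaustive_best(words)
print(best_assignment, best_score)
```

With three words and two senses each, there are only 8 candidate assignments to score. The number of candidates is the product of the sense counts of all target words, so it grows exponentially with the number of words, which is the motivation the summary gives for replacing exhaustive search with a metaheuristic guided by the fitness function.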
ISSN: 2193-567X, 1319-8025, 2191-4281
DOI: 10.1007/s13369-023-07993-5