WordNet-Based Information Retrieval Using Common Hypernyms and Combined Features
Text search based on lexical matching of keywords is not satisfactory due to polysemous and synonymous words. Semantic search that exploits word meanings, in general, improves search performance. In this paper, we survey WordNet-based information retrieval systems, which employ a word sense disambig...
Published in: | arXiv.org 2018-07 |
---|---|
Main Authors: | Ngo, Vuong M; Cao, Tru H; Le, Tuan M V |
Format: | Article |
Language: | English |
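Subjects: | Annotations; Information retrieval; Query processing; Search engines; Semantics; Word sense disambiguation |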
Online Access: | Get full text |
cited_by | |
---|---|
cites | |
container_end_page | |
container_issue | |
container_start_page | |
container_title | arXiv.org |
container_volume | |
creator | Ngo, Vuong M ; Cao, Tru H ; Le, Tuan M V |
description | Text search based on lexical matching of keywords is not satisfactory due to polysemous and synonymous words. Semantic search that exploits word meanings, in general, improves search performance. In this paper, we survey WordNet-based information retrieval systems, which employ a word sense disambiguation method to process queries and documents. The problem is that in many cases a word has more than one possible direct sense, and picking only one of them may give a wrong sense for the word. Moreover, the previous systems use only word forms to represent word senses and their hypernyms. We propose a novel approach that uses the most specific common hypernym of the remaining undisambiguated multi-senses of a word, as well as combined WordNet features to represent word meanings. Experiments on a benchmark dataset show that, in terms of the MAP measure, our search engine is 17.7% better than the lexical search, and at least 9.4% better than all surveyed search systems using WordNet. Keywords: ontology, word sense disambiguation, semantic annotation, semantic search. |
format | article |
publisher | Ithaca: Cornell University Library, arXiv.org |
date | 2018-07-15 |
rights | 2018. This work is published under http://arxiv.org/licenses/nonexclusive-distrib/1.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License. |
oa | free_for_read |
fulltext | fulltext |
identifier | EISSN: 2331-8422 |
ispartof | arXiv.org, 2018-07 |
issn | 2331-8422 |
language | eng |
recordid | cdi_proquest_journals_2074062164 |
source | Publicly Available Content Database |
subjects | Annotations ; Information retrieval ; Query processing ; Search engines ; Semantics ; Word sense disambiguation |
title | WordNet-Based Information Retrieval Using Common Hypernyms and Combined Features |
url | http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-23T11%3A41%3A37IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=document&rft.atitle=WordNet-Based%20Information%20Retrieval%20Using%20Common%20Hypernyms%20and%20Combined%20Features&rft.jtitle=arXiv.org&rft.au=Ngo,%20Vuong%20M&rft.date=2018-07-15&rft.eissn=2331-8422&rft_id=info:doi/&rft_dat=%3Cproquest%3E2074062164%3C/proquest%3E%3Cgrp_id%3Ecdi_FETCH-proquest_journals_20740621643%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=2074062164&rft_id=info:pmid/&rfr_iscdi=true |
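
The description above mentions representing a word by the most specific common hypernym of its remaining undisambiguated senses. The record does not include the authors' implementation, so the following is only a minimal sketch of that general idea. The taxonomy, the sense labels, and the function names (`HYPERNYMS`, `ancestors`, `depth`, `most_specific_common_hypernym`) are all hypothetical and are not taken from WordNet or from the paper.

```python
from functools import lru_cache

# Hypothetical toy taxonomy (NOT real WordNet data):
# each sense label maps to the list of its direct hypernyms.
HYPERNYMS = {
    "entity": [],
    "organism": ["entity"],
    "artifact": ["entity"],
    "animal": ["organism"],
    "plant": ["organism"],
    "dog": ["animal"],
    "cat": ["animal"],
    "oak": ["plant"],
    "hammer": ["artifact"],
}


def ancestors(sense):
    """Return the sense itself plus all of its direct and indirect hypernyms."""
    seen = {sense}
    stack = [sense]
    while stack:
        for parent in HYPERNYMS[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


@lru_cache(maxsize=None)
def depth(sense):
    """Length of the longest hypernym path from this sense up to a root."""
    parents = HYPERNYMS[sense]
    return 0 if not parents else 1 + max(depth(p) for p in parents)


def most_specific_common_hypernym(senses):
    """Return the deepest node that is an ancestor of every candidate sense.

    Ties are broken alphabetically so the result is deterministic.
    Returns None when there are no senses or no shared ancestor.
    """
    if not senses:
        return None
    common = set.intersection(*(ancestors(s) for s in senses))
    if not common:
        return None
    return max(common, key=lambda s: (depth(s), s))


if __name__ == "__main__":
    # A word whose remaining candidate senses are "dog" and "cat"
    # would be represented by their shared, most specific hypernym.
    print(most_specific_common_hypernym(["dog", "cat"]))
```

In this sketch a word with a single remaining sense is represented by that sense itself, since every sense counts as its own ancestor. Whether the paper treats that case the same way is not stated in this record.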