DOCUMENT RANKING USING AN ENRICHED THESAURUS

Bibliographic Details
Published in: Journal of documentation 1991-03, Vol.47 (3), p.240-253
Main Authors: RADA, ROY; BARLOW, JUDITH; POTHARST, JAN; ZANSTRA, PIETER; BIJSTRA, DJUJAN
Format: Article
Language: English
Description
Summary: A thesaurus may be viewed as a graph, and document retrieval algorithms can exploit this graph when both the documents and the query are represented by thesaurus terms. These retrieval algorithms measure the distance between the query and the documents by using path lengths in the graph. Previous work with such strategies has shown that the hierarchical relations in the thesaurus are useful but the non-hierarchical relations are not. This paper shows that when the query explicitly mentions a particular non-hierarchical relation, the retrieval algorithm benefits from the presence of such relations in the thesaurus. Our algorithms were applied to the Excerpta Medica bibliographic citation database, whose citations are indexed with terms from the EMTREE thesaurus. We also created an enriched EMTREE by systematically adding non-hierarchical relations from a medical knowledge base. Our algorithms ranked documents from Excerpta Medica against queries, using EMTREE in one run and the enriched EMTREE in another. When, and only when, the query specifically mentioned a particular non-hierarchical relation type did EMTREE enriched with that relation type lead to a ranking that better corresponded to an expert's ranking.
ISSN: 0022-0418, 1758-7379
DOI: 10.1108/eb026879
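
The path-length ranking described in the summary can be illustrated with a small sketch. This is not the authors' implementation: it assumes that the distance between a query and a document is the average shortest-path length over all pairs of query and document terms, and the toy thesaurus, the term names, the document identifiers, and the single added "treats"-style edge are all hypothetical. Every edge is given the same weight, and the condition that the query mention the relation type is not modelled.

```python
from collections import deque
from itertools import product


def build_graph(edges):
    """Build an undirected adjacency map from (term, term) pairs."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def shortest_path_length(graph, source, target):
    """Breadth-first search; returns the number of edges, or inf if unreachable."""
    if source == target:
        return 0
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        node, dist = queue.popleft()
        for neighbor in graph.get(node, ()):
            if neighbor == target:
                return dist + 1
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, dist + 1))
    return float("inf")


def distance(graph, query_terms, doc_terms):
    """Assumed metric: mean shortest-path length over all term pairs."""
    pairs = list(product(query_terms, doc_terms))
    return sum(shortest_path_length(graph, q, d) for q, d in pairs) / len(pairs)


def rank(graph, query_terms, documents):
    """Return document ids ordered from closest to farthest from the query."""
    return sorted(documents, key=lambda doc_id: distance(graph, query_terms, documents[doc_id]))


# Hypothetical hierarchical (broader/narrower) relations.
HIERARCHY = [
    ("root", "disease"), ("root", "drug"),
    ("disease", "infection"), ("disease", "neoplasm"),
    ("infection", "pneumonia"),
    ("drug", "antibiotic"), ("antibiotic", "penicillin"),
]
# Hypothetical non-hierarchical relation added in the "enriched" thesaurus.
NON_HIERARCHICAL = [("penicillin", "pneumonia")]

plain = build_graph(HIERARCHY)
enriched = build_graph(HIERARCHY + NON_HIERARCHICAL)

documents = {"doc_a": {"penicillin"}, "doc_b": {"neoplasm"}}
query = {"pneumonia"}

print(rank(plain, query, documents))     # ['doc_b', 'doc_a']
print(rank(enriched, query, documents))  # ['doc_a', 'doc_b']
```

In this toy case the added edge shortens the path from the query term to "penicillin" from six edges to one, so the ordering of the two documents flips. Whether such a flip improves agreement with an expert's ranking is what the paper evaluates.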