Searching and browsing collections of structured information
Main Authors: | , , |
---|---|
Format: | Conference Proceeding |
Language: | English |
Subjects: | |
Online Access: | Request full text |
Summary: | This paper proposes a new approach to querying collections of structured textual information such as SGML/XML documents. Knowledge about the structure of documents is an additional resource that should be exploited during retrieval, since the semantics of the different textual objects can be used to specify an information need much more precisely. However, the traditional probabilistic retrieval model lacks the ability to handle structural information. We define a new retrieval function based on the probabilistic model which overcomes this drawback. The presented query language allows the assignment of structural roles to individual terms. The efficient evaluation of queries in this framework requires appropriate index structures. We design text and structure indexes and show how their information is combined during evaluation. The implementation supports additional functionalities such as a table of contents for browsing. First evaluation results show the feasibility of the approach on collections of unstructured documents. |
---|---|
DOI: | 10.1109/ADL.2000.848377 |