Distributed processing of queries for XML documents in an agent based information retrieval system
Main Authors: | , , , |
---|---|
Format: | Conference Proceeding |
Language: | English |
Subjects: | |
Online Access: | Request full text |
Summary: | The paper addresses the problem of efficiently querying large numbers of text documents using parallel processing methods. The optimization criteria are somewhat different from those used in querying heterogeneous databases, largely because the extraction of ontological information from documents is the dominant component of query execution time. We assume that each document has been previously annotated using XML. We describe the architecture of a system that processes ontology-based queries over XML-annotated documents. We introduce two basic strategies for query processing, a simple strategy and a semi-join strategy, along with possible extensions using pipelining and longer lists for keyword search. Different levels of parallelism for these strategies are discussed. An evaluation model is created and used to derive the optimal replication of resource agents. The theoretical and experimental results are compared. |
---|---|
DOI: | 10.1109/DLRP.2000.942181 |