mod_oai: An Apache Module for Metadata Harvesting

Bibliographic Details
Published in: arXiv.org 2005-03
Main Authors: Nelson, Michael L, Van de Sompel, Herbert, Liu, Xiaoming, Harrison, Terry L, McFarland, Nathan
Format: Article
Language: English
Description
Summary: We describe mod_oai, an Apache 2.0 module that implements the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH). OAI-PMH is the de facto standard for metadata exchange in digital libraries and allows repositories to expose their contents in a structured, application-neutral format with semantics optimized for accurate incremental harvesting. Current implementations of OAI-PMH are either separate applications that access an existing repository or are built into repository software packages. mod_oai is different in that it optimizes the harvesting of web content by building OAI-PMH capability into the Apache server. We discuss the implications of adding harvesting capability to an Apache server and describe our initial experimental results from accessing a departmental web site using both web crawling and OAI-PMH harvesting techniques.
ISSN: 2331-8422
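
Note on incremental harvesting: the summary refers to OAI-PMH semantics for incremental harvesting. As a rough illustration only, the sketch below builds an OAI-PMH ListRecords request that asks for records changed since a given date. The verb and argument names (ListRecords, metadataPrefix, from, resumptionToken) come from the OAI-PMH specification; the base URL and the function name build_list_records_url are hypothetical and are not taken from the mod_oai paper. No request is actually sent.

```python
from urllib.parse import urlencode


def build_list_records_url(base_url, metadata_prefix="oai_dc",
                           from_date=None, resumption_token=None):
    """Build an OAI-PMH ListRecords request URL (illustrative sketch)."""
    if resumption_token is not None:
        # Per the OAI-PMH spec, resumptionToken is an exclusive argument:
        # it is sent with the verb and nothing else.
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
        if from_date is not None:
            # "from" restricts the response to records whose datestamp is
            # on or after this date, which makes the harvest incremental.
            params["from"] = from_date
    return base_url + "?" + urlencode(params)


# Hypothetical base URL, used only for illustration.
url = build_list_records_url("http://www.example.org/modoai",
                             from_date="2005-03-01")
print(url)
```

A harvester would typically record the time of its last successful run and pass it as from_date on the next run, then follow any resumptionToken returned in the response until the list is complete.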