
Strength and similarity of affix removal stemming algorithms

Bibliographic Details
Published in: SIGIR Forum, 2003-04, Vol. 37 (1), p. 26-30
Main Authors: Frakes, William B., Fox, Christopher J.
Format: Article
Language: English
Description
Summary: This study evaluated the strength of, and similarity among, four affix removal stemming algorithms. Strength and similarity were evaluated in different ways, including new metrics based on the Hamming distance measure. Data was collected on stemmer outputs for a list of 49,656 English words derived from the UNIX spelling dictionary and the Moby corpus. Conclusions about the relative strength and similarity of the four stemming algorithms are reported.
ISSN: 0163-5840
DOI: 10.1145/945546.945548
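
The summary mentions similarity metrics based on the Hamming distance measure but does not define them in this record. The sketch below shows one plausible way such a metric could work, under stated assumptions: the distance between two stems is taken as the number of mismatched characters over their shared length plus the difference in their lengths, and two stemmers are compared by averaging that distance over a word list. The function names and the two toy stemmers are hypothetical and are not the four algorithms evaluated in the article.

```python
def modified_hamming(a: str, b: str) -> int:
    """Assumed metric: mismatches over the shared length plus the length difference."""
    # zip stops at the shorter string, so only aligned positions are compared
    mismatches = sum(1 for x, y in zip(a, b) if x != y)
    return mismatches + abs(len(a) - len(b))


def mean_stem_distance(words, stem_a, stem_b) -> float:
    """Average distance between the outputs of two stemmers over a word list."""
    words = list(words)
    if not words:
        return 0.0
    total = sum(modified_hamming(stem_a(w), stem_b(w)) for w in words)
    return total / len(words)


# Hypothetical toy stemmers, used only to exercise the metric
def strip_s(word: str) -> str:
    return word[:-1] if word.endswith("s") else word


def strip_ing_or_s(word: str) -> str:
    if word.endswith("ing"):
        return word[:-3]
    return strip_s(word)


if __name__ == "__main__":
    sample = ["cats", "running", "tree"]
    print(mean_stem_distance(sample, strip_s, strip_ing_or_s))
```

Under this reading, a mean distance of zero would indicate that two stemmers produce identical output on the word list, and larger values would indicate that they diverge more.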