Loading…
Strength and similarity of affix removal stemming algorithms
This study evaluated the strength of, and similarity among, four affix removal stemming algorithms. Strength and similarity were evaluated in different ways, including new metrics based on the Hamming distance measure. Data was collected on stemmer outputs for a list of 49,656 English words derived...
Saved in:
Published in: | SIGIR forum 2003-04, Vol.37 (1), p.26-30 |
---|---|
Main Authors: | , |
Format: | Article |
Language: | English |
Citations: | Items that this one cites Items that cite this one |
Online Access: | Get full text |
Tags: |
Add Tag
No Tags, Be the first to tag this record!
|
Summary: | This study evaluated the strength of, and similarity among, four affix removal stemming algorithms. Strength and similarity were evaluated in different ways, including new metrics based on the Hamming distance measure. Data was collected on stemmer outputs for a list of 49,656 English words derived from the UNIX spelling dictionary and the Moby corpus. Conclusions about the relative strength and similarity of the four stemming algorithms are reported. |
---|---|
ISSN: | 0163-5840 |
DOI: | 10.1145/945546.945548 |