Applications of ngrams in textual information systems

Bibliographic Details
Published in: Journal of Documentation, 1998-03, Vol. 54 (1), p. 48-67
Main Authors: Robertson, Alexander M., Willett, Peter
Format: Article
Language: English
Description
Summary: This paper provides an introduction to the use of ngrams in textual information systems, where an ngram is a string of n, usually adjacent, characters extracted from a section of continuous text. Applications that can be implemented efficiently and effectively using sets of ngrams include spelling error detection and correction, query expansion, information retrieval with serial, inverted and signature files, dictionary lookup, text compression, and language identification.
ISSN: 0022-0418
DOI: 10.1108/EUM0000000007161
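
Illustration: The summary defines an ngram as a string of n adjacent characters taken from continuous text. The sketch below is not taken from the paper. It shows one plausible way to extract such ngrams and compare two words by their ngram sets, as a spelling-correction aid might. The function names `char_ngrams` and `dice_similarity` are hypothetical, and the choice of the Dice coefficient as the overlap measure is an assumption made for illustration only.

```python
def char_ngrams(text: str, n: int = 2) -> list[str]:
    """Return every run of n adjacent characters in text, in order."""
    if n < 1:
        raise ValueError("n must be at least 1")
    # A string shorter than n yields an empty list.
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def dice_similarity(a: str, b: str, n: int = 2) -> float:
    """Dice coefficient over the sets of character ngrams of a and b.

    Assumption: Dice is used here only as a familiar set-overlap
    measure; the catalog record does not say which measure the
    paper uses.
    """
    grams_a = set(char_ngrams(a, n))
    grams_b = set(char_ngrams(b, n))
    if not grams_a and not grams_b:
        # Both inputs are too short to produce any ngram.
        return 1.0
    return 2 * len(grams_a & grams_b) / (len(grams_a) + len(grams_b))


if __name__ == "__main__":
    print(char_ngrams("text", 2))
    print(dice_similarity("retrieval", "retreival"))
```

A misspelling usually shares most of its ngrams with the intended word, so candidate corrections can be ranked by such a score without any knowledge of the language involved.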