
Filtering search results using an optimal set of terms identified by an artificial neural network

Bibliographic Details
Published in: Information Processing & Management, 2006-03, Vol. 42 (2), p. 469-483
Main Authors: Kuflik, Tsvi; Boger, Zvi; Shoval, Peretz
Format: Article
Language: English
Description
Summary: Information filtering (IF) systems usually filter data items by correlating a set of terms representing the user’s interest (a user profile) with similar sets of terms representing the data items. Many techniques can be employed for constructing user profiles automatically, but they usually yield large sets of terms. Various dimensionality-reduction techniques can be applied to reduce the number of terms in a user profile. We describe a new term-selection technique that includes a dimensionality-reduction mechanism based on the analysis of a trained artificial neural network (ANN) model. Its novel feature is the identification of an optimal set of terms that can correctly classify data items that are relevant to a user. The proposed technique was compared with the classical Rocchio algorithm. We found that when all the distinct terms in the training set are used to train an ANN, the Rocchio algorithm outperforms the ANN-based filtering system; however, after the new dimensionality-reduction technique is applied, leaving only an optimal set of terms, the improved ANN technique outperforms both the original ANN and the Rocchio algorithm.
ISSN: 0306-4573, 1873-5371
DOI: 10.1016/j.ipm.2005.03.020
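
Illustrative sketch: the summary describes filtering by correlating a user profile's terms with each data item's terms, and names the Rocchio algorithm as the baseline. The Python below is a minimal sketch of that baseline idea only, under stated assumptions; it is not the authors' implementation and does not reproduce their ANN-based term selection. The function names, the whitespace tokenizer, the weights `beta=0.75` and `gamma=0.15`, the clipping of negative weights, and the `top_terms` truncation (a simple stand-in for dimensionality reduction) are all hypothetical choices made for illustration.

```python
import math
from collections import Counter


def term_vector(text):
    """Term-frequency vector from naive whitespace tokenization (assumption)."""
    return Counter(text.lower().split())


def centroid(vectors):
    """Average a list of term vectors; returns {} for an empty list."""
    total = Counter()
    for vec in vectors:
        total.update(vec)  # adds term counts
    n = len(vectors)
    return {term: weight / n for term, weight in total.items()} if n else {}


def rocchio_profile(relevant, nonrelevant, beta=0.75, gamma=0.15):
    """Rocchio-style profile: beta * relevant centroid - gamma * non-relevant centroid.

    No initial query vector is used, and non-positive weights are dropped.
    Both are illustrative simplifications, not details taken from the article.
    """
    pos, neg = centroid(relevant), centroid(nonrelevant)
    profile = {}
    for term in set(pos) | set(neg):
        weight = beta * pos.get(term, 0.0) - gamma * neg.get(term, 0.0)
        if weight > 0:
            profile[term] = weight
    return profile


def cosine(a, b):
    """Cosine similarity between two sparse term vectors (0.0 if either is empty)."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_terms(profile, k):
    """Keep the k highest-weighted terms.

    A naive stand-in for dimensionality reduction; NOT the ANN-based
    selection technique described in the summary.
    """
    ranked = sorted(profile.items(), key=lambda item: item[1], reverse=True)
    return dict(ranked[:k])


# Hypothetical training data for one user.
relevant = [term_vector("neural network filtering"),
            term_vector("neural network profile")]
nonrelevant = [term_vector("football match report")]

profile = rocchio_profile(relevant, nonrelevant)
reduced = top_terms(profile, 2)

# Score incoming items against the reduced profile; higher means more relevant.
score_on_topic = cosine(reduced, term_vector("neural network model"))
score_off_topic = cosine(reduced, term_vector("football report"))
```

In a filtering setting, an item would be passed to the user when its score exceeds a threshold; how that threshold and the profile size are chosen is exactly the kind of decision the article's term-selection technique addresses.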