A Method of Improving Feature Vector for Web Pages Reflecting the Contents of Their Out-Linked Pages

Bibliographic Details
Main Authors: Sugiyama, Kazunari, Hatano, Kenji, Yoshikawa, Masatoshi, Uemura, Shunsuke
Format: Book Chapter
Language: English
Description
Summary: TF-IDF schemes are popular for generating the feature vectors of documents. These schemes were proposed for characterizing a single document. Therefore, to characterize Web pages using TF-IDF schemes, the feature vector of a Web page should also reflect the contents of the Web pages it is linked with via hyperlinks. In this paper, we propose three methods of generating feature vectors for linked documents such as Web pages. Moreover, to verify the effectiveness of our proposed methods, we compare them with current search engines and confirm their retrieval accuracy using recall-precision curves.
ISSN: 0302-9743, 1611-3349
DOI: 10.1007/3-540-46146-9_88
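
Illustration: the general idea in the summary, a TF-IDF feature vector for a page that also reflects the contents of its out-linked pages, can be sketched as follows. This record does not describe the three methods the paper proposes, so the blending rule here (adding the average out-linked vector scaled by a weight `alpha`) is only an assumption chosen for illustration, as are the function names and the toy pages.

```python
import math
from collections import Counter


def tf_idf_vectors(docs):
    """Compute plain TF-IDF vectors.

    docs maps a page id to its list of tokens.
    Returns a dict: page id -> {term: weight}.
    """
    n_docs = len(docs)
    doc_freq = Counter()
    for tokens in docs.values():
        doc_freq.update(set(tokens))  # count each term once per page

    vectors = {}
    for page, tokens in docs.items():
        counts = Counter(tokens)
        total = len(tokens) or 1  # avoid division by zero on empty pages
        vectors[page] = {
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        }
    return vectors


def refine_with_out_links(vectors, out_links, alpha=0.5):
    """Blend each page's vector with the mean vector of its out-linked pages.

    ASSUMPTION: this averaging rule and the weight alpha are hypothetical;
    they are not taken from the paper.
    """
    refined = {}
    for page, vec in vectors.items():
        new_vec = dict(vec)
        targets = [t for t in out_links.get(page, []) if t in vectors]
        for target in targets:
            for term, weight in vectors[target].items():
                new_vec[term] = new_vec.get(term, 0.0) + alpha * weight / len(targets)
        refined[page] = new_vec
    return refined


# Toy data (hypothetical): page "a" links to page "b".
docs = {
    "a": ["web", "search"],
    "b": ["web", "link"],
    "c": ["tfidf", "link"],
}
out_links = {"a": ["b"]}

base = tf_idf_vectors(docs)
refined = refine_with_out_links(base, out_links, alpha=0.5)
```

In this toy example, the term "link" does not occur on page "a", yet it receives a nonzero weight in the refined vector for "a" because "a" links to "b". Pages without out-links keep their original TF-IDF vectors.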