Clustering Based URL Normalization Technique for Web Mining
URL (Uniform Resource Locator) normalization is an important activity in web mining. Web data can be retrieved more smoothly with an effective URL normalization technique. URL normalization also saves a lot of computation in web mining activities. A web mining technique for URL normalization is pro...
Main Author: | Nagwani, N K |
---|---|
Format: | Conference Proceeding |
Language: | English |
Subjects: | |
Online Access: | Request full text |
cited_by | |
---|---|
cites | |
container_end_page | 351 |
container_issue | |
container_start_page | 349 |
container_title | |
container_volume | |
creator | Nagwani, N K |
description | URL (Uniform Resource Locator) normalization is an important activity in web mining. Web data can be retrieved more smoothly with an effective URL normalization technique. URL normalization also saves a lot of computation in web mining activities. This paper proposes a web mining technique for URL normalization. The technique is based on the content, structure, and semantic similarity of a given set of URLs, together with their web page redirection and forwarding similarity. Web page redirection and forward graphs can be used to measure the similarity between URLs and to group the URLs into clusters, which in turn can be used for URL normalization. A data structure is also suggested for storing the forward and redirect URL information. |
doi_str_mv | 10.1109/ACE.2010.47 |
format | conference_proceeding |
fullrecord | <record><control><sourceid>ieee_6IE</sourceid><recordid>TN_cdi_ieee_primary_5532806</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>5532806</ieee_id><sourcerecordid>5532806</sourcerecordid><originalsourceid>FETCH-LOGICAL-i90t-4de7757ca2b96ce6fdc90b0726ba1344984d37d59bfa44dac5c46b9512348dfc3</originalsourceid><addsrcrecordid>eNotjEtLw0AURkdEUGtWLt3MH0idx70zGVzVUB-QVpCIyzKv6EiaaJIu9Ncb0G9zOIvzEXLJ2ZJzZq5X5Xop2Gygj8g508ogMCzMMcmMLjgIAM0R1CnJxvGDzQMUgOqM3JTtYZzikLo3emvHGOjLc0W3_bC3bfqxU-o7Wkf_3qWvQ6RNP9DX6OgmdXNwQU4a244x--eC1HfrunzIq6f7x3JV5cmwKYcQtUbtrXBG-aia4A1zTAvlLJcApoAgdUDjGgsQrEcPyhnkQkIRGi8X5OrvNsUYd59D2tvhe4coRcGU_AUh_Ea8</addsrcrecordid><sourcetype>Publisher</sourcetype><iscdi>true</iscdi><recordtype>conference_proceeding</recordtype></control><display><type>conference_proceeding</type><title>Clustering Based URL Normalization Technique for Web Mining</title><source>IEEE Electronic Library (IEL) Conference Proceedings</source><creator>Nagwani, N K</creator><creatorcontrib>Nagwani, N K</creatorcontrib><description>URL (Uniform Resource Locator) normalization is an important activity in web mining. Web data can be retrieved in smoother way using effective URL normalization technique. URL normalization also reduces lot of calculations in web mining activities. A web mining technique for URL normalization is proposed in this paper. The proposed technique is based on content, structure and semantic similarity and web page redirection and forwarding similarity of the given set of URLs. Web page redirection and forward graphs can be used to measure the similarities between the URL's and can also be used for URL clusters. The URL clusters can be used for URL normalization. 
A data structure is also suggested to store the forward and redirect URL information.</description><identifier>ISBN: 9781424471546</identifier><identifier>ISBN: 1424471540</identifier><identifier>EISBN: 0769540589</identifier><identifier>EISBN: 9781424471553</identifier><identifier>EISBN: 9780769540580</identifier><identifier>EISBN: 1424471559</identifier><identifier>DOI: 10.1109/ACE.2010.47</identifier><language>eng</language><publisher>IEEE</publisher><subject>Access protocols ; Clustering ; Crawlers ; Data structures ; Indexing ; Information retrieval ; Search engines ; Uniform resource locators ; URL Normalization ; Web mining ; Web Page Forward and Redirect Similarity Tree ; Web pages ; World Wide Web</subject><ispartof>2010 International Conference on Advances in Computer Engineering, 2010, p.349-351</ispartof><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/5532806$$EHTML$$P50$$Gieee$$H</linktohtml><link.rule.ids>309,310,776,780,785,786,2052,27899,54892</link.rule.ids><linktorsrc>$$Uhttps://ieeexplore.ieee.org/document/5532806$$EView_record_in_IEEE$$FView_record_in_$$GIEEE</linktorsrc></links><search><creatorcontrib>Nagwani, N K</creatorcontrib><title>Clustering Based URL Normalization Technique for Web Mining</title><title>2010 International Conference on Advances in Computer Engineering</title><addtitle>ACE</addtitle><description>URL (Uniform Resource Locator) normalization is an important activity in web mining. Web data can be retrieved in smoother way using effective URL normalization technique. URL normalization also reduces lot of calculations in web mining activities. A web mining technique for URL normalization is proposed in this paper. 
The proposed technique is based on content, structure and semantic similarity and web page redirection and forwarding similarity of the given set of URLs. Web page redirection and forward graphs can be used to measure the similarities between the URL's and can also be used for URL clusters. The URL clusters can be used for URL normalization. A data structure is also suggested to store the forward and redirect URL information.</description><subject>Access protocols</subject><subject>Clustering</subject><subject>Crawlers</subject><subject>Data structures</subject><subject>Indexing</subject><subject>Information retrieval</subject><subject>Search engines</subject><subject>Uniform resource locators</subject><subject>URL Normalization</subject><subject>Web mining</subject><subject>Web Page Forward and Redirect Similarity Tree</subject><subject>Web pages</subject><subject>World Wide Web</subject><isbn>9781424471546</isbn><isbn>1424471540</isbn><isbn>0769540589</isbn><isbn>9781424471553</isbn><isbn>9780769540580</isbn><isbn>1424471559</isbn><fulltext>true</fulltext><rsrctype>conference_proceeding</rsrctype><creationdate>2010</creationdate><recordtype>conference_proceeding</recordtype><sourceid>6IE</sourceid><recordid>eNotjEtLw0AURkdEUGtWLt3MH0idx70zGVzVUB-QVpCIyzKv6EiaaJIu9Ncb0G9zOIvzEXLJ2ZJzZq5X5Xop2Gygj8g508ogMCzMMcmMLjgIAM0R1CnJxvGDzQMUgOqM3JTtYZzikLo3emvHGOjLc0W3_bC3bfqxU-o7Wkf_3qWvQ6RNP9DX6OgmdXNwQU4a244x--eC1HfrunzIq6f7x3JV5cmwKYcQtUbtrXBG-aia4A1zTAvlLJcApoAgdUDjGgsQrEcPyhnkQkIRGi8X5OrvNsUYd59D2tvhe4coRcGU_AUh_Ea8</recordid><startdate>201006</startdate><enddate>201006</enddate><creator>Nagwani, N K</creator><general>IEEE</general><scope>6IE</scope><scope>6IL</scope><scope>CBEJK</scope><scope>RIE</scope><scope>RIL</scope></search><sort><creationdate>201006</creationdate><title>Clustering Based URL Normalization Technique for Web Mining</title><author>Nagwani, N 
K</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-i90t-4de7757ca2b96ce6fdc90b0726ba1344984d37d59bfa44dac5c46b9512348dfc3</frbrgroupid><rsrctype>conference_proceedings</rsrctype><prefilter>conference_proceedings</prefilter><language>eng</language><creationdate>2010</creationdate><topic>Access protocols</topic><topic>Clustering</topic><topic>Crawlers</topic><topic>Data structures</topic><topic>Indexing</topic><topic>Information retrieval</topic><topic>Search engines</topic><topic>Uniform resource locators</topic><topic>URL Normalization</topic><topic>Web mining</topic><topic>Web Page Forward and Redirect Similarity Tree</topic><topic>Web pages</topic><topic>World Wide Web</topic><toplevel>online_resources</toplevel><creatorcontrib>Nagwani, N K</creatorcontrib><collection>IEEE Electronic Library (IEL) Conference Proceedings</collection><collection>IEEE Proceedings Order Plan All Online (POP All Online) 1998-present by volume</collection><collection>IEEE Xplore All Conference Proceedings</collection><collection>IEEE Xplore</collection><collection>IEEE Proceedings Order Plans (POP All) 1998-Present</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Nagwani, N K</au><format>book</format><genre>proceeding</genre><ristype>CONF</ristype><atitle>Clustering Based URL Normalization Technique for Web Mining</atitle><btitle>2010 International Conference on Advances in Computer Engineering</btitle><stitle>ACE</stitle><date>2010-06</date><risdate>2010</risdate><spage>349</spage><epage>351</epage><pages>349-351</pages><isbn>9781424471546</isbn><isbn>1424471540</isbn><eisbn>0769540589</eisbn><eisbn>9781424471553</eisbn><eisbn>9780769540580</eisbn><eisbn>1424471559</eisbn><abstract>URL (Uniform Resource Locator) normalization is an important activity in web mining. Web data can be retrieved in smoother way using effective URL normalization technique. 
URL normalization also reduces lot of calculations in web mining activities. A web mining technique for URL normalization is proposed in this paper. The proposed technique is based on content, structure and semantic similarity and web page redirection and forwarding similarity of the given set of URLs. Web page redirection and forward graphs can be used to measure the similarities between the URL's and can also be used for URL clusters. The URL clusters can be used for URL normalization. A data structure is also suggested to store the forward and redirect URL information.</abstract><pub>IEEE</pub><doi>10.1109/ACE.2010.47</doi><tpages>3</tpages></addata></record> |
fulltext | fulltext_linktorsrc |
identifier | ISBN: 9781424471546 |
ispartof | 2010 International Conference on Advances in Computer Engineering, 2010, p.349-351 |
issn | |
language | eng |
recordid | cdi_ieee_primary_5532806 |
source | IEEE Electronic Library (IEL) Conference Proceedings |
subjects | Access protocols; Clustering; Crawlers; Data structures; Indexing; Information retrieval; Search engines; Uniform resource locators; URL Normalization; Web mining; Web Page Forward and Redirect Similarity Tree; Web pages; World Wide Web |
title | Clustering Based URL Normalization Technique for Web Mining |
url | http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-03-05T02%3A39%3A06IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-ieee_6IE&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=proceeding&rft.atitle=Clustering%20Based%20URL%20Normalization%20Technique%20for%20Web%20Mining&rft.btitle=2010%20International%20Conference%20on%20Advances%20in%20Computer%20Engineering&rft.au=Nagwani,%20N%20K&rft.date=2010-06&rft.spage=349&rft.epage=351&rft.pages=349-351&rft.isbn=9781424471546&rft.isbn_list=1424471540&rft_id=info:doi/10.1109/ACE.2010.47&rft.eisbn=0769540589&rft.eisbn_list=9781424471553&rft.eisbn_list=9780769540580&rft.eisbn_list=1424471559&rft_dat=%3Cieee_6IE%3E5532806%3C/ieee_6IE%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-i90t-4de7757ca2b96ce6fdc90b0726ba1344984d37d59bfa44dac5c46b9512348dfc3%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_id=info:pmid/&rft_ieee_id=5532806&rfr_iscdi=true |
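
The description above mentions clustering URLs by their redirection and forwarding relationships and then normalizing each cluster. The record does not give the paper's algorithm or its data structure, so the following is only a minimal sketch under stated assumptions: redirect and forward information is assumed to be available as a plain `source -> target` dictionary, clusters are taken to be the connected components of that graph, and the representative URL is chosen by a hypothetical rule (prefer a URL that does not redirect anywhere, then the lexicographically smallest). The function names `cluster_urls`, `canonical`, and `normalize` are illustrative and do not come from the paper.

```python
from collections import defaultdict


def cluster_urls(redirects):
    """Group URLs linked by redirect/forward edges into clusters.

    `redirects` is an assumed input format: a dict mapping a source URL
    to the URL it redirects or forwards to. Clusters are the connected
    components of that graph, found with a small union-find.
    """
    parent = {}

    def find(u):
        parent.setdefault(u, u)
        while parent[u] != u:
            parent[u] = parent[parent[u]]  # path halving
            u = parent[u]
        return u

    # Merge the components of every source/target pair.
    for src, dst in redirects.items():
        parent[find(src)] = find(dst)

    clusters = defaultdict(set)
    for u in list(parent):
        clusters[find(u)].add(u)
    return list(clusters.values())


def canonical(cluster, redirects):
    """Pick a representative URL (hypothetical rule, not from the paper).

    URLs that do not redirect anywhere sort first; ties are broken
    lexicographically, which also handles redirect cycles.
    """
    return min(cluster, key=lambda u: (u in redirects, u))


def normalize(redirects):
    """Map every URL seen in `redirects` to its cluster representative."""
    mapping = {}
    for cluster in cluster_urls(redirects):
        rep = canonical(cluster, redirects)
        for u in cluster:
            mapping[u] = rep
    return mapping


if __name__ == "__main__":
    example = {
        "http://example.com": "http://www.example.com/",
        "http://example.com/index.html": "http://www.example.com/",
        "http://a.test/x": "http://a.test/y",
    }
    for url, rep in sorted(normalize(example).items()):
        print(url, "->", rep)
```

This sketch uses only the redirect and forward edges; the content, structure, and semantic similarity measures named in the description would need additional inputs that this record does not specify.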