A study of divisive clustering with Hausdorff distances for interval data
Published in: | Pattern recognition, 2019-12, Vol.96, p.106969, Article 106969 |
---|---|
Main Authors: | Chen, Yi; Billard, L. |
Format: | Article |
Language: | English |
Subjects: | Divisive clustering; Euclidean normalization; Gowda–Diday distances; Hausdorff distances; Ichino–Yaguchi distances; Interval data; Local and global normalizations; Span normalization |
container_start_page | 106969 |
container_title | Pattern recognition |
container_volume | 96 |
creator | Chen, Yi; Billard, L.
description | •Hausdorff, Gowda–Diday and Ichino–Yaguchi distances for intervals are compared.•Euclidean counterparts and their normalizations are included.•Advantages and disadvantages of these respective distances are summarized based on simulation studies.•The simulation study shows local normalizations outperform global normalizations.
Clustering methods are becoming key as analysts try to understand what knowledge is buried inside contemporary large data sets. This article analyzes the impact of six different Hausdorff distances on sets of multivariate interval data (where, for each dimension, an interval is defined as an observation [a, b] with a ≤ b and with a and b taking values on the real line R1), used as the basis for Chavent’s [15, 16] divisive clustering algorithm. Advantages and disadvantages are summarized for each distance. Comparisons with two other distances for interval data, the Gowda–Diday and Ichino–Yaguchi measures, are included. All have specific strengths depending on the type of data present. Global normalization of a distance is not recommended, and care needs to be taken when using local normalizations to ensure the features of the underlying data sets are revealed. The study is based on sets of simulated data and on a real data set. |
doi_str_mv | 10.1016/j.patcog.2019.106969 |
format | article |
fulltext | fulltext |
identifier | ISSN: 0031-3203 |
ispartof | Pattern recognition, 2019-12, Vol.96, p.106969, Article 106969 |
issn | 0031-3203 (print); 1873-5142 (electronic)
language | eng |
source | ScienceDirect Freedom Collection |
subjects | Divisive clustering; Euclidean normalization; Gowda–Diday distances; Hausdorff distances; Ichino–Yaguchi distances; Interval data; Local and global normalizations; Span normalization
title | A study of divisive clustering with Hausdorff distances for interval data |
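The Hausdorff distance between intervals, central to the abstract above, has a simple closed form: for intervals [a1, b1] and [a2, b2] it reduces to max(|a1 − a2|, |b1 − b2|). The following is a minimal sketch under that standard definition from symbolic data analysis, not the paper's six specific variants; the span-normalization helper is likewise only an assumed illustration of "local" (per-variable) normalization, not the authors' exact formula:

```python
import math

def hausdorff_interval(u, v):
    """Hausdorff distance between intervals u = (a1, b1) and v = (a2, b2)."""
    (a1, b1), (a2, b2) = u, v
    return max(abs(a1 - a2), abs(b1 - b2))

def hausdorff_cityblock(x, y):
    """City-block aggregation: sum of per-dimension Hausdorff distances
    for two multivariate interval observations x and y."""
    return sum(hausdorff_interval(u, v) for u, v in zip(x, y))

def hausdorff_euclidean(x, y):
    """Euclidean aggregation: root of the sum of squared per-dimension
    Hausdorff distances (the 'Euclidean counterpart')."""
    return math.sqrt(sum(hausdorff_interval(u, v) ** 2 for u, v in zip(x, y)))

def hausdorff_span_normalized(x, y, spans):
    """Hypothetical local normalization: each dimension's distance is
    divided by that variable's span (e.g. its observed range over the
    data set) before city-block aggregation."""
    return sum(hausdorff_interval(u, v) / s for u, v, s in zip(x, y, spans))

# Two bivariate interval observations.
x = [(1.0, 3.0), (0.0, 2.0)]
y = [(2.0, 5.0), (1.0, 2.0)]
print(hausdorff_cityblock(x, y))   # max(1,2) + max(1,0) → 3.0
print(hausdorff_euclidean(x, y))   # sqrt(4 + 1) ≈ 2.2361
```

A divisive algorithm such as Chavent's would then use a matrix of these pairwise distances to decide, at each step, which cluster to split in two.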