
A study of divisive clustering with Hausdorff distances for interval data


Bibliographic Details
Published in: Pattern recognition, 2019-12, Vol. 96, p. 106969, Article 106969
Main Authors: Chen, Yi, Billard, L.
Format: Article
Language:English
Description:
•Hausdorff, Gowda–Diday and Ichino–Yaguchi distances for intervals are compared.
•Euclidean counterparts and their normalizations are included.
•Advantages and disadvantages of these respective distances are summarized, based on simulation studies.
•The simulation study shows local normalizations outperform global normalizations.

Clustering methods are becoming key as analysts try to understand what knowledge is buried inside contemporary large data sets. This article analyzes the impact of six different Hausdorff distances on sets of multivariate interval data (where, for each dimension, an interval is defined as an observation [a, b] with a ≤ b and with a and b taking values on the real line ℝ¹), used as the basis for Chavent’s [15, 16] divisive clustering algorithm. Advantages and disadvantages are summarized for each distance. Comparisons with two other distances for interval data, the Gowda–Diday and Ichino–Yaguchi measures, are included. All have specific strengths depending on the type of data present. Global normalization of a distance is not recommended, and care needs to be taken when using local normalizations to ensure the features of the underlying data sets are revealed. The study is based on sets of simulated data and on a real data set.
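For intervals, the Hausdorff distance has a simple closed form: between [a1, b1] and [a2, b2] it is max(|a1 − a2|, |b1 − b2|). A minimal sketch of that per-dimension distance and one plausible Euclidean-style aggregation across dimensions (the function names and the aggregation choice are illustrative assumptions, not the paper's six specific variants or their normalizations):

```python
def hausdorff_interval(u, v):
    """Hausdorff distance between intervals u = (a1, b1) and v = (a2, b2):
    max(|a1 - a2|, |b1 - b2|)."""
    (a1, b1), (a2, b2) = u, v
    return max(abs(a1 - a2), abs(b1 - b2))

def euclidean_hausdorff(x, y):
    """Illustrative Euclidean-type aggregation of the per-dimension
    Hausdorff distances for two multivariate interval observations."""
    return sum(hausdorff_interval(u, v) ** 2 for u, v in zip(x, y)) ** 0.5

# Example: two 2-dimensional interval observations.
x = [(0.0, 2.0), (1.0, 3.0)]
y = [(1.0, 2.0), (4.0, 6.0)]
print(hausdorff_interval(x[0], y[0]))  # max(|0-1|, |2-2|) = 1.0
print(euclidean_hausdorff(x, y))       # sqrt(1^2 + 3^2) ≈ 3.162
```

A local normalization would divide each per-dimension term by a dispersion measure computed for that dimension; the abstract's finding is that such local scalings tend to preserve the data's cluster structure better than a single global scaling.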
DOI: 10.1016/j.patcog.2019.106969
ISSN: 0031-3203
EISSN: 1873-5142
Source: ScienceDirect Freedom Collection
Subjects:
Divisive clustering
Euclidean normalization
Gowda–Diday distances
Hausdorff distances
Ichino–Yaguchi distances
Interval data
Local and global normalizations
Span normalization