Do all roads lead to Rome? Studying distance measures in the context of machine learning
Published in: | Pattern recognition 2023-09, Vol.141, p.109646, Article 109646 |
---|---|
Main Authors: | , , , |
Format: | Article |
Language: | English |
Subjects: | |
Summary: | • Review of the most commonly used distance measures in machine learning
• Analysis of their main properties, applications and key aspects to consider
• The similarity analysis shows a high degree of correlation between all the measures
• Evaluation of classification and clustering performance, noise tolerance and runtime
• Canberra distance shows the best overall performance and the highest tolerance to noise
Many machine learning and data mining tasks rely on distance measures, and a large body of literature addresses them in some form. Given the broad scope of the topic, this paper provides an overview of how these measures are used in the most common machine learning problems and highlights the aspects to consider when choosing the most appropriate measure for a particular task. To this end, the most recent works on the subject were reviewed and seven of the most commonly used measures were analyzed, with a detailed investigation of their main properties and applications. Several experiments were carried out to study the relationships among the measures and to compare their performance. The degradation of the results in the presence of noise was also considered, as well as the execution time required by each measure. |
---|---|
ISSN: | 0031-3203, 1873-5142 |
DOI: | 10.1016/j.patcog.2023.109646 |
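
For readers unfamiliar with the Canberra distance named in the summary, the following is a minimal sketch of its standard textbook definition, d(x, y) = Σ |x_i − y_i| / (|x_i| + |y_i|). It is not taken from the article itself. The function name `canberra` is illustrative, and skipping coordinates where both values are zero is an assumed (though common) convention for the 0/0 case.

```python
def canberra(x, y):
    """Canberra distance between two equal-length numeric sequences.

    Assumption: coordinates where both values are zero contribute 0
    (a common convention for the otherwise undefined 0/0 term).
    """
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")

    total = 0.0
    for a, b in zip(x, y):
        denom = abs(a) + abs(b)
        if denom == 0:
            # Both coordinates are zero: treat the term as 0.
            continue
        # Each term lies in [0, 1], so a single coordinate can add at most 1.
        total += abs(a - b) / denom
    return total


if __name__ == "__main__":
    print(canberra([1, 3], [3, 1]))
```

Because every term is normalized by the magnitudes of the two coordinates, each dimension contributes at most 1 to the total, regardless of scale.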