Statistical Technique in Clustering Problems

Bibliographic Details
Published in: Mathematical Models and Computer Simulations, 2023-06, Vol. 15 (3), p. 445-453
Main Author: Nikolaeva, O. V.
Format: Article
Language: English
Description
Summary: The problem of evaluating and improving the quality of clustering of multispectral data is considered. A method for calculating the distance between clusters is developed. The vectors of each cluster are treated as realizations of a random vector. Sampling distribution functions (SDFs) are found, and the errors in approximating the unknown exact distribution functions by the SDFs are obtained. The distance between two clusters is defined as the distance between their SDFs. Criteria for indiscernible, overlapping, and disjoint clusters are defined. A technique for improving clustering is proposed in which indiscernible (or indiscernible and overlapping) clusters are merged. Numerical experiments on simulated data show that the technique can decompose the data into the initial groups of vectors. Numerical experiments with real data, multispectral images from the HYPERION sensor acquired over the ocean under clear sky and broken clouds, show that the technique can distinguish clouds and their shadows in the images.
ISSN: 2070-0482, 2070-0490
DOI: 10.1134/S2070048223030134
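
The summary defines the distance between clusters through their distribution functions and merges clusters that cannot be told apart, but it does not give the formulas. The sketch below is one possible reading, not the article's method. It assumes that the SDF of a cluster is the per-band empirical distribution function, that the distance is the largest sup-norm (Kolmogorov-Smirnov) difference over the bands, and that two clusters count as indiscernible when this distance does not exceed the sum of their approximation errors, estimated here with the Dvoretzky-Kiefer-Wolfowitz (DKW) bound. All function names and the `alpha` parameter are hypothetical.

```python
import numpy as np


def ecdf_distance(a, b):
    """Sup-norm distance between the empirical CDFs of two 1-D samples."""
    a = np.sort(np.asarray(a, dtype=float))
    b = np.sort(np.asarray(b, dtype=float))
    # The supremum of |F_a - F_b| is attained at one of the sample points.
    grid = np.concatenate([a, b])
    fa = np.searchsorted(a, grid, side="right") / a.size
    fb = np.searchsorted(b, grid, side="right") / b.size
    return float(np.max(np.abs(fa - fb)))


def cluster_distance(x, y):
    """Assumed cluster distance: largest per-band ECDF distance.

    Rows are vectors, columns are spectral bands.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return max(ecdf_distance(x[:, j], y[:, j]) for j in range(x.shape[1]))


def dkw_error(n, alpha=0.05):
    """DKW bound on sup|F_n - F|, valid with probability >= 1 - alpha."""
    return float(np.sqrt(np.log(2.0 / alpha) / (2.0 * n)))


def merge_indiscernible(data, labels, alpha=0.05):
    """Merge, closest pair first, clusters whose distance is within the
    sum of their DKW errors (the assumed 'indiscernible' criterion)."""
    data = np.asarray(data, dtype=float)
    labels = np.asarray(labels).copy()
    while True:
        ids = np.unique(labels)
        best = None  # (distance, kept label, absorbed label)
        for i, p in enumerate(ids):
            for q in ids[i + 1:]:
                xp, xq = data[labels == p], data[labels == q]
                d = cluster_distance(xp, xq)
                tol = dkw_error(len(xp), alpha) + dkw_error(len(xq), alpha)
                if d <= tol and (best is None or d < best[0]):
                    best = (d, p, q)
        if best is None:
            return labels  # no indiscernible pair is left
        labels[labels == best[2]] = best[1]
```

Calling `merge_indiscernible(data, labels)` with an over-segmented labeling returns a new label array in which clusters judged indiscernible share one label. Applying the DKW bound band by band ignores the dependence between bands, so the threshold is a simplification; the article's own error estimates and its separate criterion for overlapping clusters are not reproduced here.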