Histogram Specification by Assignment of Optimal Unique Values
Published in: | arXiv.org 2021-02 |
---|---|
Main Authors: | , , |
Format: | Article |
Language: | English |
Subjects: | |
Summary: | In this paper, we propose two novel algorithms for histogram specification and quantile transformation of data without local information. These are core techniques that can serve as building blocks for applications that require specifying the sample distribution of a given set of data. Histogram specification is best known for its image enhancement applications, whereas quantile transformation is typically employed in data preprocessing for data normalization. In signal processing, methods often require temporal or spatial information; in data preprocessing, methods work by interpolation or by approximation, drawing on results in computational statistics, and trade off speed against quality. It is nontrivial to accommodate cases that have no local information (e.g., tabular data) while also providing a fast, exact solution. To that end, we take up a concept from image processing called the group mapping law and propose an extension. The proposed extension allows us to formulate a convex functional in which we look for the best approximation between the output unique values and the reference histogram. Then, we apply the ordered assignment solution, a result in optimal transport, to reconstruct the output from the optimal unique values. Two sets of results show the effectiveness of the proposed algorithms when compared to traditional and state-of-the-art methods. The proposed algorithms are fast, exact, and least \(p\)-norm optimal. Further, we define the algorithms as generic data processing methods. Thus, contributions from this paper can be easily incorporated into applications spanning many disciplines, especially applied data science. |
---|---|
ISSN: | 2331-8422 |
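
The summary mentions the ordered assignment solution from optimal transport. The sketch below illustrates only that general idea in one dimension: the k-th smallest input is mapped to the k-th smallest target value. It is not the paper's algorithm (the group mapping law extension and the optimization over unique values are not reproduced), and the names `expand_histogram` and `ordered_assignment` are hypothetical, chosen for illustration.

```python
import numpy as np

def expand_histogram(values, counts):
    """Turn a reference histogram (bin values and integer counts) into a
    flat array of target samples. Hypothetical helper for illustration."""
    return np.repeat(np.asarray(values, dtype=float), np.asarray(counts, dtype=int))

def ordered_assignment(data, targets):
    """Map the k-th smallest element of `data` to the k-th smallest target.

    This is the classic sort-based solution of 1D optimal transport between
    two equal-size samples. Assumption: ties in `data` are broken by position
    (stable sort), which may differ from how the paper treats unique values.
    """
    data = np.asarray(data, dtype=float)
    targets_sorted = np.sort(np.asarray(targets, dtype=float))
    if data.shape != targets_sorted.shape:
        raise ValueError("data and targets must have the same length")
    order = np.argsort(data, kind="stable")  # indices of data in ascending order
    out = np.empty_like(data)
    out[order] = targets_sorted              # rank k receives the k-th target
    return out

data = [0.3, 2.0, -1.0, 0.7]
targets = expand_histogram([10, 20, 30, 40], [1, 1, 1, 1])
result = ordered_assignment(data, targets)
print(result)  # [20. 40. 10. 30.]
```

The output has exactly the target sample distribution while preserving the rank order of the input, which is what makes the sort-based mapping exact and usable on data without spatial or temporal context.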