CSQUiD: an index and non-probability framework for constrained skyline query processing over uncertain data
Published in: | PeerJ. Computer science 2024-09, Vol.10, p.e2225 |
---|---|
Main Authors: | , , , , |
Format: | Article |
Language: | English |
Subjects: | |
Summary: | Uncertainty of data, the degree to which data are inaccurate, imprecise, untrusted, and undetermined, is inherent in many contemporary database applications, and numerous research endeavours have been devoted to efficiently answering skyline queries over uncertain data. The literature discusses two different methods for handling data uncertainty in which objects have continuous range values. The first method employs a probability-based approach, while the second assumes that the uncertain values are represented by their median values. Nevertheless, neither of these methods seems suitable for modern high-dimensional uncertain databases: the first requires intensive probability calculations, while the second is impractical. Therefore, this work introduces an index and non-probability framework named Constrained Skyline Query processing on Uncertain Data (CSQUiD), aiming at reducing the computational time in processing constrained skyline queries over uncertain high-dimensional data. Given a collection of objects with uncertain data, the CSQUiD framework constructs the minimum bounding rectangles (MBRs) by employing a tree-based indexing structure. Instead of scanning the whole collection of objects, only objects within the dominant MBRs are analyzed in determining the final skylines. In addition, CSQUiD makes use of an approach in which the exact value of each continuous range value of those dominant MBRs' objects is identified. The proposed CSQUiD framework is validated using real and synthetic data sets through extensive experimentation. Based on the performance analysis conducted, by varying the sizes of the constrained query, the CSQUiD framework outperformed the two most recent methods (an algorithm and a framework) with average improvements of 44.07% and 57.15% with regard to the number of pairwise comparisons, while the average improvements in CPU processing time over the same two methods stood at 27.17% and 18.62%, respectively. |
---|---|
ISSN: | 2376-5992 |
DOI: | 10.7717/peerj-cs.2225 |