Loading…
Advances in Principal Balances for Compositional Data
Compositional data analysis requires selecting an orthonormal basis with which to work on coordinates. In most cases this selection is based on a data driven criterion. Principal component analysis provides bases that are, in general, functions of all the original parts, each with a different weight...
Saved in:
Published in: | Mathematical geosciences 2018-04, Vol.50 (3), p.273-298 |
---|---|
Main Authors: | , , , |
Format: | Article |
Language: | English |
Subjects: | |
Citations: | Items that this one cites Items that cite this one |
Online Access: | Get full text |
Tags: |
Add Tag
No Tags, Be the first to tag this record!
|
Summary: | Compositional data analysis requires selecting an orthonormal basis with which to work on coordinates. In most cases this selection is based on a data driven criterion. Principal component analysis provides bases that are, in general, functions of all the original parts, each with a different weight hindering their interpretation. For interpretative purposes, it would be better to have each basis component as a ratio or balance of the geometric means of two groups of parts, leaving irrelevant parts with a zero weight. This is the role of principal balances, defined as a sequence of orthonormal balances which successively maximize the explained variance in a data set. The new algorithm to compute principal balances requires an exhaustive search along all the possible sets of orthonormal balances. To reduce computational time, the sets of possible partitions for up to 15 parts are stored. Two other suboptimal, but feasible, algorithms are also introduced: (i) a new search for balances following a constrained principal component approach and (ii) the hierarchical cluster analysis of variables. The latter is a new approach based on the relation between the variation matrix and the Aitchison distance. The properties and performance of these three algorithms are illustrated using a typical data set of geochemical compositions and a simulation exercise. |
---|---|
ISSN: | 1874-8961 1874-8953 |
DOI: | 10.1007/s11004-017-9712-z |