Bayesian-multiplicative treatment of count zeros in compositional data sets

Bibliographic Details
Published in: Statistical Modelling, 2015-04, Vol. 15 (2), p. 134-158
Main Authors: Martín-Fernández, Josep-Antoni; Hron, Karel; Templ, Matthias; Filzmoser, Peter; Palarea-Albaladejo, Javier
Format: Article
Language: English
Description
Summary: Compositional count data are discrete vectors representing the numbers of outcomes falling into any of several mutually exclusive categories. Compositional techniques based on the log-ratio methodology are appropriate in those cases where the total sum of the vector elements is not of interest. Such compositional count data sets can contain zero values which are often the result of insufficiently large samples. That is, they refer to unobserved positive values that may have been observed with a larger number of trials or with a different sampling design. Because the log-ratio transformations require data with positive values, any statistical analysis of count compositions must be preceded by a proper replacement of the zeros. A Bayesian-multiplicative treatment has been proposed for addressing this count zero problem in several case studies. This treatment involves the Dirichlet prior distribution as the conjugate distribution of the multinomial distribution and a multiplicative modification of the non-zero values. Different parameterizations of the prior distribution provide different zero replacement results, whose coherence with the vector space structure of the simplex is stated. Their performance is evaluated from both the theoretical and the computational point of view.
ISSN: 1471-082X, 1477-0342
DOI: 10.1177/1471082X14535524
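
The treatment outlined in the summary can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that each zero count is replaced by the posterior mean of a multinomial proportion under a Dirichlet prior, and that the non-zero proportions are then rescaled by a common factor so the vector still sums to one. The function name `bayes_mult_replace`, the parameters `strength` and `prior_mean`, and the default parameterization (uniform prior mean 1/D with strength equal to the number of parts D) are assumptions chosen for illustration; the summary does not state which parameterizations the article compares.

```python
def bayes_mult_replace(counts, strength=None, prior_mean=None):
    """Replace zeros in one count vector; return proportions summing to 1.

    Hypothetical helper for illustration only.
    strength   -- assumed prior strength s (defaults to the number of parts D)
    prior_mean -- assumed prior expectation t_j per part (defaults to 1/D each)
    """
    d = len(counts)
    n = sum(counts)
    s = float(d) if strength is None else float(strength)
    t = [1.0 / d] * d if prior_mean is None else list(prior_mean)

    # Zero parts: posterior Dirichlet mean (0 + s * t_j) / (n + s).
    zero_est = [t[j] * s / (n + s) if counts[j] == 0 else 0.0 for j in range(d)]
    zero_mass = sum(zero_est)

    # Non-zero parts: observed proportion x_j / n, shrunk by one common
    # factor so that ratios between non-zero parts are unchanged.
    return [
        zero_est[j] if counts[j] == 0 else (counts[j] / n) * (1.0 - zero_mass)
        for j in range(d)
    ]


# Usage: one zero among three categories, ten trials in total.
replaced = bayes_mult_replace([0, 3, 7])
```

Because every non-zero part is multiplied by the same factor, the ratio between the second and third parts stays at 3/7, which is the property that keeps the result usable with log-ratio transformations.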