Loading…

Problems with using the normal distribution--and ways to improve quality and efficiency of data analysis

The gaussian or normal distribution is the most established model to characterize quantitative variation of original data. Accordingly, data are summarized using the arithmetic mean and the standard deviation, by mean ± SD, or with the standard error of the mean, mean ± SEM. This, together with corr...

Full description

Saved in:
Bibliographic Details
Published in:PloS one 2011-07, Vol.6 (7), p.e21403-e21403
Main Authors: Limpert, Eckhard, Stahel, Werner A
Format: Article
Language:English
Subjects:
Citations: Items that this one cites
Items that cite this one
Online Access:Get full text
Tags: Add Tag
No Tags, Be the first to tag this record!
Description
Summary:The gaussian or normal distribution is the most established model to characterize quantitative variation of original data. Accordingly, data are summarized using the arithmetic mean and the standard deviation, by mean ± SD, or with the standard error of the mean, mean ± SEM. This, together with corresponding bars in graphical displays has become the standard to characterize variation. Here we question the adequacy of this characterization, and of the model. The published literature provides numerous examples for which such descriptions appear inappropriate because, based on the "95% range check", their distributions are obviously skewed. In these cases, the symmetric characterization is a poor description and may trigger wrong conclusions. To solve the problem, it is enlightening to regard causes of variation. Multiplicative causes are by far more important than additive ones, in general, and benefit from a multiplicative (or log-) normal approach. Fortunately, quite similar to the normal, the log-normal distribution can now be handled easily and characterized at the level of the original data with the help of both, a new sign, x/, times-divide, and notation. Analogous to mean ± SD, it connects the multiplicative (or geometric) mean mean * and the multiplicative standard deviation s* in the form mean * x/s*, that is advantageous and recommended. The corresponding shift from the symmetric to the asymmetric view will substantially increase both, recognition of data distributions, and interpretation quality. It will allow for savings in sample size that can be considerable. Moreover, this is in line with ethical responsibility. Adequate models will improve concepts and theories, and provide deeper insight into science and life.
ISSN:1932-6203
1932-6203
DOI:10.1371/journal.pone.0021403