In statistics, one of the first distributions that one learns about is usually the normal distribution. Not only because it’s pretty, also because it’s ubiquitous.

In addition, the normal distribution is often the reference that is used when discussion other distributions: right skewed is skewed to the rightÂ compared to the normal distribution; when looking at kurtosis, a leptokurtic distribution is relatively spiky compared to the normal distribution: and unimodality is considered the norm, too.

There exist quantitative representations of skewness, kurtosis, and modality (the dip test), and each of these can be tested against a null hypothesis, where the null hypothesis is (almost) always that the skewness, kurtosis, or dip test value of the distribution is equal to that of a normal distribution.

In addition, some statistical tests require that the sampling distribution of the relevant statistic is approximately normal (e.g. the *t*-test), and some require an even more elusive assumption called multivariate normality.

Perhaps all these bit of knowledge mesh together in people’s minds, or perhaps there’s another explanation: but for some reason, many researchers and almost all students operate on the assumption that their data have to be normally distributed. If they are not, they often resort to, for example, converting their data into categorical variables or transforming the data.

Continue reading “On the obsession with being normal”