Normality of raw data in general linear models: The most widespread myth in statistics

Bulletin of the Ecological Society of America

In years of statistical consulting for ecologists and wildlife biologists, by far the most common misconception we have come across concerns normality in general linear models. These models comprise a very large part of the statistical models used in ecology and include t tests, simple and multiple linear regression, polynomial regression, and analysis of variance (ANOVA) and covariance (ANCOVA). There is a widely held belief that the normality assumption pertains to the raw data rather than to the model residuals. We suspect that this error may also occur in countless published studies, whenever the normality assumption is tested prior to analysis. This may lead to the use of nonparametric alternatives (if there are any) when parametric tests would in fact be appropriate, or to the use of transformations of raw data, which may introduce hidden assumptions such as multiplicative effects on the natural scale in the case of log-transformed data. Our aim here is to dispel this myth. We very briefly describe the relevant theory for two cases of general linear models to show that it is the residuals that need to be normally distributed if tests requiring normality, such as t and F tests, are to be used. We then give two examples demonstrating that the distribution of the response variable may be nonnormal while the residuals are nevertheless well behaved. We do not go into the issue of how to test normality; instead we display the distributions of response variables and residuals graphically.
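The central point, that a nonnormal response can coexist with normal residuals, can be illustrated with a small simulation. The sketch below is not taken from the article: the two-group design, the group means (10 and 20), the unit error variance, the sample size, and the helper names `simulate_two_group_example` and `excess_kurtosis` are all assumptions chosen for illustration. It fits a two-group linear model (equivalent to a two-sample t test) by least squares and compares the excess kurtosis of the raw response with that of the residuals; a value near 0 is what a normal distribution would give.

```python
import numpy as np


def simulate_two_group_example(n_per_group=500, seed=0):
    """Simulate a two-group design with normal errors (illustrative values).

    Returns the raw response y and the residuals from the least-squares fit
    of y on an intercept and a group indicator.
    """
    rng = np.random.default_rng(seed)
    group = np.repeat([0, 1], n_per_group)
    # Hypothetical group means of 10 and 20 with standard normal errors.
    y = np.where(group == 0, 10.0, 20.0) + rng.normal(0.0, 1.0, size=group.size)
    # Design matrix: intercept column plus group indicator.
    X = np.column_stack([np.ones_like(y), group])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    residuals = y - X @ beta
    return y, residuals


def excess_kurtosis(x):
    """Sample excess kurtosis (approximately 0 for normal data)."""
    z = (x - x.mean()) / x.std()
    return float(np.mean(z ** 4) - 3.0)


y, residuals = simulate_two_group_example()
print("excess kurtosis of raw response:", round(excess_kurtosis(y), 2))
print("excess kurtosis of residuals:   ", round(excess_kurtosis(residuals), 2))
```

Under these assumptions the pooled response is strongly bimodal, so its excess kurtosis is clearly negative, while the residuals should sit close to 0. A normality check applied to the raw response would therefore reject, even though the model's normality assumption holds. Histograms of `y` and `residuals` would show the same contrast graphically.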
Publication type: Article
Publication subtype: Journal Article
Title: Normality of raw data in general linear models: The most widespread myth in statistics
Series title: Bulletin of the Ecological Society of America
DOI: 10.1890/0012-9623(2003)84[92:NORDIG]2.0.CO;2
Volume: 84
Issue: 2
Year published: 2003
Language: English
Contributing office(s): Patuxent Wildlife Research Center
Description: 3 p.
First page: 92
Last page: 94