ANOVA for Researchers
Kirk Steinhorst


Use Proc ANOVA for 1) one-way analysis of variance, 2) (completely) nested designs, and 3) BALANCED randomized block designs, Latin squares, etc.

Use Proc GLM for the same cases plus unbalanced fixed effects models. While GLM can be used for mixed effect models, SAS' new Proc MIXED has some advantages for mixed models per se.

After identifying the design, the most difficult part of analysis of variance for many researchers is identifying the experimental unit. For example, in the deer data the basic experimental unit is the deer. Pellet samples are SUBSAMPLES. The ANOVA has a subsampling error term below the experimental error term itself. Because the subsampling error is the last term in the table, most stat packages use it by default as the denominator of the F test, whereas the correct denominator for testing treatments is the experimental error (the variation among experimental units). One can usually force the package to do the right test (see the example).
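The choice of denominator can be illustrated with a minimal sketch, written here in Python rather than SAS. The numbers are made up for illustration (two treatments, two experimental units per treatment, two subsamples per unit) and are not the deer data; the function name subsampling_anova is likewise hypothetical. It assumes a balanced design.

```python
# Hypothetical balanced data: treatment -> experimental units -> subsamples.
data = {
    "A": [[10.0, 12.0], [14.0, 16.0]],
    "B": [[20.0, 22.0], [18.0, 20.0]],
}

def mean(values):
    return sum(values) / len(values)

def subsampling_anova(data):
    """Mean squares for a balanced one-way design with subsampling."""
    t = len(data)                            # treatments
    first_units = next(iter(data.values()))
    r = len(first_units)                     # experimental units per treatment
    s = len(first_units[0])                  # subsamples per unit
    grand = mean([y for units in data.values() for unit in units for y in unit])
    ss_trt = ss_unit = ss_sub = 0.0
    for units in data.values():
        trt_mean = mean([y for unit in units for y in unit])
        ss_trt += r * s * (trt_mean - grand) ** 2
        for unit in units:
            unit_mean = mean(unit)
            ss_unit += s * (unit_mean - trt_mean) ** 2
            ss_sub += sum((y - unit_mean) ** 2 for y in unit)
    return {
        "ms_trt": ss_trt / (t - 1),
        "ms_unit": ss_unit / (t * (r - 1)),      # experimental error
        "ms_sub": ss_sub / (t * r * (s - 1)),    # subsampling error
    }

ms = subsampling_anova(data)
f_correct = ms["ms_trt"] / ms["ms_unit"]  # treatments vs. experimental error
f_default = ms["ms_trt"] / ms["ms_sub"]   # what a package may do by default
print(f_correct, f_default)
```

With these made-up numbers the default F is several times larger than the correct one, which is the usual direction of the error: subsamples within a unit tend to agree with each other more than units do.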

Subsamples are ALWAYS nested within experimental units. Other terms in the model can be nested or crossed. The example above shows blocks nested within sites whereas sites, ozone treatments and rainfall treatments are crossed.

The underlying assumptions of experimental design ANOVA are

  1. Normality,
  2. Independence,
  3. Homogeneity of variance and
  4. Zero mean for error (this is a statistician's way of saying that you have specified the correct model).

Many textbooks suggest using statistical tests for normality, independence, and so on. However, if you think about it, your subsequent F test in the ANOVA is then CONDITIONAL on the outcome of the tests for assumptions. This cobbles up the Type I error of the F test so badly that you no longer know what is going on.

A better way to assess the underlying assumptions of ANOVA is to examine them graphically.

For assessing underlying assumptions graphically, one must remember that the raw data have treatment effects in them. You cannot do a histogram of raw data to assess normality because, if there is a treatment effect, you should see several "mounds" of data even if the data are normal. Thus one starts with the residuals. This takes out the treatment, block, factor, ... effects so that you can examine (an unbiased estimate of) the errors directly.
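As a minimal sketch of why residuals are the right starting point, the Python fragment below uses made-up one-way data (the group names and values are hypothetical) and subtracts each treatment mean from its own observations. For designs with blocks or several factors the residuals would come from the full fitted model instead.

```python
# Hypothetical one-way data: two treatments with clearly different means.
groups = {
    "control": [4.8, 5.1, 5.3],
    "treated": [9.7, 10.2, 10.1],
}

def one_way_residuals(groups):
    """Subtract each treatment mean from its own observations."""
    residuals = {}
    for name, ys in groups.items():
        group_mean = sum(ys) / len(ys)
        residuals[name] = [y - group_mean for y in ys]
    return residuals

residuals = one_way_residuals(groups)
pooled = [e for es in residuals.values() for e in es]
raw = [y for ys in groups.values() for y in ys]

# Raw data span two "mounds"; residuals form a single cluster around zero.
print("raw range:     ", min(raw), "to", max(raw))
print("residual range:", round(min(pooled), 3), "to", round(max(pooled), 3))
```

The pooled residuals are what one would feed to a dot plot, histogram, or normal probability plot.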

For assessing normality, one can look at dot plots, histograms, or normal probability plots. If there are just a few data points, dot plots are the best you can do (see termite data example).

For assessing independence and homogeneity of variance, one usually plots the residuals against Y, Yhat, order of data collection, time, ... to see if any patterns emerge. In experimental design, plots of the residuals versus Yhat (the predicted value) will often be uninteresting. Try the other plots first.

Determining if the model is correct requires a little more subtlety. If things you think should be significant aren't, then perhaps you have left terms out of the model. If you have an interaction which shows diverging patterns between treatments, then perhaps your model is multiplicative, rather than additive. Likewise, the model can be multiplicative because the treatments multiply the response by a factor rather than add to the response. Multiplicative models can often be made additive by taking logarithms. Examination of residual plots may reveal patterns that suggest a missing variable to you.
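The effect of taking logarithms can be seen in a small sketch. The 2 x 2 table of cell means below is made up so that factor B exactly doubles the response at each level of factor A; the helper name interaction_contrast is hypothetical.

```python
import math

# Hypothetical cell means where factor B doubles the response at every
# level of factor A (a purely multiplicative effect).
cell_means = [
    [10.0, 20.0],  # A1 at B1, B2
    [30.0, 60.0],  # A2 at B1, B2
]

def interaction_contrast(m):
    """(A2B2 - A2B1) - (A1B2 - A1B1) for a 2 x 2 table of means."""
    return (m[1][1] - m[1][0]) - (m[0][1] - m[0][0])

raw_interaction = interaction_contrast(cell_means)
log_means = [[math.log(v) for v in row] for row in cell_means]
log_interaction = interaction_contrast(log_means)

print(raw_interaction)              # nonzero: profiles diverge on the raw scale
print(round(log_interaction, 12))   # essentially zero on the log scale
```

On the raw scale the two profiles diverge, which shows up as interaction; on the log scale the same means are parallel, so an additive model fits.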

A contrast is a set of coefficients that sum to zero. Contrasts are used in experimental design ANOVA to compare treatments in various ways. Contrasts of main effects are straightforward. In the stepping data, subjects walked at three different speeds. To compare speed 1 to speed 3, use the contrast -1 0 1.
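A minimal sketch of how such a contrast is estimated and tested follows, in Python rather than SAS. The responses are made up (they are not the stepping data), a simple one-way layout is assumed, and the function name contrast_f is hypothetical.

```python
# Hypothetical responses at three walking speeds (one-way layout assumed).
speeds = [
    [1.0, 2.0, 3.0],  # speed 1
    [2.0, 3.0, 4.0],  # speed 2
    [4.0, 5.0, 6.0],  # speed 3
]
coefficients = [-1, 0, 1]  # compares speed 3 with speed 1

def contrast_f(groups, coefs):
    """Return (estimate, F) for a contrast among treatment means."""
    if sum(coefs) != 0:
        raise ValueError("contrast coefficients must sum to zero")
    means = [sum(g) / len(g) for g in groups]
    # Pooled within-group (error) mean square.
    ss_error = sum((y - m) ** 2 for g, m in zip(groups, means) for y in g)
    df_error = sum(len(g) for g in groups) - len(groups)
    ms_error = ss_error / df_error
    estimate = sum(c * m for c, m in zip(coefs, means))
    # A contrast has one degree of freedom, so its SS is also its mean square.
    ss_contrast = estimate ** 2 / sum(c ** 2 / len(g) for c, g in zip(coefs, groups))
    return estimate, ss_contrast / ms_error

estimate, f_value = contrast_f(speeds, coefficients)
print(estimate, f_value)
```

The resulting F has 1 numerator degree of freedom and the error degrees of freedom in the denominator. If the design has blocks or subjects, the error mean square should come from the full model rather than from this one-way pooling.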

Contrasts of interactions can be particularly useful. If factor A is qualitative (the levels are categorical) and B is quantitative (the levels are measured on a numerical scale), then one might be interested in the A x B(linear), A x B(quadratic), ... contrasts. Other contrasts may be of interest if both factors are qualitative or both factors are quantitative. See the example above to see how one tests contrasts in SAS.
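One way to build interaction contrasts of this kind is to multiply a contrast on A by a contrast on B, cell by cell. The sketch below assumes two levels of A and three equally spaced levels of B, for which the standard orthogonal polynomial coefficients are -1 0 1 (linear) and 1 -2 1 (quadratic). The cell means and function names are made up for illustration.

```python
# Orthogonal polynomial coefficients for three equally spaced levels of B.
B_LINEAR = [-1, 0, 1]
B_QUADRATIC = [1, -2, 1]
A_DIFFERENCE = [1, -1]  # compares the two levels of qualitative factor A

def interaction_coefficients(a_coefs, b_coefs):
    """Cell-by-cell products of an A contrast and a B contrast."""
    return [[a * b for b in b_coefs] for a in a_coefs]

def apply_contrast(coefs, cell_means):
    """Sum of coefficient times cell mean over all cells."""
    return sum(c * m
               for c_row, m_row in zip(coefs, cell_means)
               for c, m in zip(c_row, m_row))

# Hypothetical cell means: rows are levels of A, columns are levels of B.
cell_means = [
    [1, 2, 3],  # A1: rises 1 unit per level of B
    [1, 3, 5],  # A2: rises 2 units per level of B
]

a_by_b_linear = apply_contrast(
    interaction_coefficients(A_DIFFERENCE, B_LINEAR), cell_means)
a_by_b_quadratic = apply_contrast(
    interaction_coefficients(A_DIFFERENCE, B_QUADRATIC), cell_means)
print(a_by_b_linear, a_by_b_quadratic)
```

Here the A x B(linear) contrast is nonzero because the two levels of A have different slopes across B, while the A x B(quadratic) contrast is zero because neither profile has any curvature.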

Multiple comparisons of means is a topic that can lead to fist fights. There are proponents of all sorts of techniques: Tukey, Bonferroni, Scheffe, GT2, Duncan's Multiple Range, etc.

There are several schools of thought:

Repeated measures experiments involve observing several treatments on the same experimental unit. The hearing example above is a simple repeated measures experiment. Each subject listened to the four lists (presumably in random order). The experimental unit is the subject. Each experimental unit was observed under four conditions.

Repeated measures experiments when the repeated treatments occur in order are called repeated measures in time. Given the nonrandom nature of the repeated part of the experiment, one must interpret the analysis carefully.

If the correlations between observations on the same experimental unit are "nice" (equal for example), then the repeated measures analysis reduces to a split-plot ANOVA. In general, one uses a multivariate analysis of variance. See the example to get started. See a statistician when you get stuck.

ANCOVA combines elements of experimental design ANOVA and regression ANOVA. That is, one has blocks, factor A, B,... plus one or more continuous independent variables. In this presentation, we have the simplest case: simple treatments and a single covariate. Without the covariate, we would have a one-way ANOVA. Without the treatments, we would have a simple linear regression. It helps to visualize the problem. The SAS code shows how to plot the relationship between Y and X using treatment as the plotting symbol. What we expect to see is data falling along several parallel lines. To test for parallelism, we look at the "interaction" between treatments and the covariate. If it is significant, then the lines are not parallel and we might want to stop and explain why the lines have different slopes. If it makes sense to assume equal slopes, then we can do the traditional test of equal treatments adjusting for the covariate. Least squares means are the adjusted means in this case. See the example.
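The adjusted means in the equal-slopes case can be sketched as follows, in Python rather than SAS. The data are made up so that the two treatments lie exactly on parallel lines, and the function name ancova_adjusted_means is hypothetical. The sketch assumes a single covariate and a common slope, and adjusts each treatment mean to the overall mean of the covariate.

```python
# Hypothetical data: (x, y) pairs per treatment, lying on parallel lines.
data = {
    "trt1": [(1.0, 3.0), (2.0, 5.0), (3.0, 7.0)],
    "trt2": [(3.0, 10.0), (4.0, 12.0), (5.0, 14.0)],
}

def mean(values):
    return sum(values) / len(values)

def ancova_adjusted_means(data):
    """Common-slope ANCOVA: pooled slope and covariate-adjusted means."""
    grand_x = mean([x for pairs in data.values() for x, _ in pairs])
    sxy = sxx = 0.0
    group_means = {}
    for name, pairs in data.items():
        x_bar = mean([x for x, _ in pairs])
        y_bar = mean([y for _, y in pairs])
        group_means[name] = (x_bar, y_bar)
        sxy += sum((x - x_bar) * (y - y_bar) for x, y in pairs)
        sxx += sum((x - x_bar) ** 2 for x, _ in pairs)
    slope = sxy / sxx  # pooled within-treatment slope
    adjusted = {name: y_bar - slope * (x_bar - grand_x)
                for name, (x_bar, y_bar) in group_means.items()}
    return slope, adjusted

slope, adjusted = ancova_adjusted_means(data)
print(slope, adjusted)
```

In this made-up example the raw treatment means differ by 7, but part of that difference is due to the treatments having different covariate values; after adjustment the difference is 3, the vertical distance between the two parallel lines.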