Interpreting results: One-way ANOVA
One-way ANOVA compares three or more unmatched groups, based on the assumption that the populations are Gaussian.

P value

The P value answers this question: If all the populations really have the same mean, what is the chance that random sampling would result in sample means as far apart as (or farther apart than) those observed in this experiment?

If the overall P value is large, the data do not give you any reason to conclude that the means differ. Even if the population means were equal, you would not be surprised to find sample means this far apart just by chance. This is not the same as saying that the true means are the same. You just don't have compelling evidence that they differ.

If the overall P value is small, then it is unlikely that the differences you observed are due to random sampling. You can reject the idea that all the populations have identical means. This doesn't mean that every mean differs from every other mean, only that at least one differs from the rest. Look at the results of post tests to identify where the differences are.

F ratio and ANOVA table

The P value is computed from the F ratio, which is computed from the ANOVA table.
ANOVA partitions the variability among all the values into one component that is due to variability among group means (due to the treatment) and another component that is due to variability within the groups (also called residual variation). Variability within groups (within the columns) is quantified as the sum of squares of the differences between each value and its group mean. This is the residual sum-of-squares. Variation among groups (due to treatment) is quantified as the sum of the squares of the differences between the group means and the grand mean (the mean of all values in all groups). Adjusted for the size of each group, this becomes the treatment sum-of-squares.

Each sum-of-squares is associated with a certain number of degrees of freedom (df, computed from the number of subjects and the number of groups), and the mean square (MS) is computed by dividing the sum-of-squares by the appropriate number of degrees of freedom.

The F ratio is the ratio of the two mean square values: the treatment mean square divided by the residual mean square. If the null hypothesis is true, you expect F to have a value close to 1.0 most of the time. A large F ratio means that the variation among group means is more than you'd expect to see by chance. You'll see a large F ratio both when the null hypothesis is wrong (the data are not sampled from populations with the same mean) and when random sampling happened to end up with large values in some groups and small values in others.

The P value is determined from the F ratio and the two values for degrees of freedom shown in the ANOVA table.

Bartlett's test for equal variances

ANOVA is based on the assumption that the populations all have the same standard deviations. If every group has at least five values, Prism tests this assumption using Bartlett's test. It reports the value of Bartlett's statistic along with a P value that answers this question: If the populations really have the same standard deviations, what is the chance that you'd randomly select samples whose standard deviations are as different from one another as (or more different than) they are in your experiment?

If the P value is small, you must decide whether you will conclude that the standard deviations of the populations are different.
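The calculations described above (the sums-of-squares, degrees of freedom, mean squares, F ratio and P value, followed by Bartlett's test) can be sketched in Python. This is only an illustration, not how Prism computes its results: the data values are invented, the helper `anova_table` is a hypothetical name, and the sketch assumes NumPy and SciPy are installed. SciPy's `f_oneway` and `bartlett` functions are used as a cross-check and as the equal-variance test.

```python
import numpy as np
from scipy import stats

# Hypothetical measurements for three unmatched groups (invented for illustration).
groups = [
    np.array([34.0, 38.0, 41.0, 36.0, 39.0]),
    np.array([42.0, 45.0, 40.0, 47.0, 44.0]),
    np.array([37.0, 35.0, 39.0, 33.0, 36.0]),
]


def anova_table(groups):
    """Return the one-way ANOVA table quantities for a list of 1-D arrays."""
    all_values = np.concatenate(groups)
    grand_mean = all_values.mean()

    # Treatment sum-of-squares: squared distance of each group mean from the
    # grand mean, weighted by the size of the group.
    ss_treatment = sum(g.size * (g.mean() - grand_mean) ** 2 for g in groups)
    # Residual sum-of-squares: squared distance of each value from its group mean.
    ss_residual = sum(((g - g.mean()) ** 2).sum() for g in groups)

    df_treatment = len(groups) - 1                # number of groups minus 1
    df_residual = all_values.size - len(groups)   # number of values minus number of groups

    ms_treatment = ss_treatment / df_treatment
    ms_residual = ss_residual / df_residual
    f_ratio = ms_treatment / ms_residual

    # P value: upper tail of the F distribution with the two df values.
    p_value = stats.f.sf(f_ratio, df_treatment, df_residual)

    return {
        "SS_treatment": ss_treatment,
        "SS_residual": ss_residual,
        "df_treatment": df_treatment,
        "df_residual": df_residual,
        "MS_treatment": ms_treatment,
        "MS_residual": ms_residual,
        "F": f_ratio,
        "P": p_value,
    }


table = anova_table(groups)
for name, value in table.items():
    print(f"{name}: {value:.4g}")

# Cross-check the hand computation against SciPy's one-way ANOVA.
f_check, p_check = stats.f_oneway(*groups)
print(f"scipy f_oneway: F = {f_check:.4g}, P = {p_check:.4g}")

# Bartlett's test of the assumption that all populations share one SD.
bartlett_stat, bartlett_p = stats.bartlett(*groups)
print(f"Bartlett: statistic = {bartlett_stat:.4g}, P = {bartlett_p:.4g}")
```

Computing the table by hand and then comparing with `f_oneway` makes it easy to see that the F ratio is nothing more than the treatment mean square divided by the residual mean square.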
Obviously Bartlett's test is based only on the values in this one experiment. Think about data from other similar experiments before making a conclusion. If you conclude that the populations have different variances, you have three choices:
Why not switch to the nonparametric Kruskal-Wallis test? While nonparametric tests do not assume Gaussian distributions, the Kruskal-Wallis test (and other nonparametric tests) does assume that the shape of the data distribution is the same in each group. So if your groups have very different standard deviations and so are not appropriate for one-way ANOVA, they should not be analyzed by the Kruskal-Wallis test either.

Some suggest using Levene's median test instead of Bartlett's test. Prism doesn't do this test (yet), but it isn't hard to do with Excel (combined with Prism). To do Levene's test, first create a new table where each value is defined as the absolute value of the difference between the actual value and the median of its group. Then run a one-way ANOVA on this new table. The idea is that by subtracting the group median from each value, you've gotten rid of the differences between group averages. (Why not subtract means rather than medians? In fact, that was Levene's idea, but others have shown the median works better.) So if this ANOVA comes up with a small P value, it must be responding to different scatter (SD) in the different groups. If the Levene P value is small, then don't believe the results of the overall one-way ANOVA. See an example on pages 325-327 of Glantz.

Read more about the general topic of assumption checking after ANOVA in this article by Andy Karp.

R squared

R² is the fraction of the overall variance (of all the data, pooling all the groups) attributable to differences among the group means. It compares the variability among group means with the variability within the groups. A large value means that a large fraction of the variation is due to the treatment that defines the groups. The R² value is calculated from the ANOVA table and equals the between-group sum-of-squares divided by the total sum-of-squares. Some programs (and books) don't bother reporting this value. Others refer to it as η² (eta squared) rather than R².
It is a descriptive statistic that quantifies the strength of the relationship between group membership and the variable you measured.
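The two calculations described in the preceding sections, Levene's median test and R², can be sketched in Python as an alternative to a spreadsheet. This is a minimal sketch under stated assumptions: the data values are invented, NumPy and SciPy are assumed to be installed, and none of this reflects how Prism itself works. SciPy's `levene` function with `center="median"` is used only to confirm that the manual recipe (one-way ANOVA on absolute deviations from each group median) gives the same statistic.

```python
import numpy as np
from scipy import stats

# Hypothetical data: similar centers, visibly different scatter (invented values).
groups = [
    np.array([10.0, 12.0, 11.0, 13.0, 9.0]),
    np.array([8.0, 15.0, 11.0, 19.0, 4.0]),
    np.array([10.5, 11.0, 11.5, 10.0, 12.0]),
]

# Levene's median test, following the recipe in the text:
# 1. Replace each value by its absolute distance from its own group median.
abs_deviations = [np.abs(g - np.median(g)) for g in groups]
# 2. Run an ordinary one-way ANOVA on the transformed table.
levene_f, levene_p = stats.f_oneway(*abs_deviations)
print(f"Levene (manual): F = {levene_f:.4g}, P = {levene_p:.4g}")

# Cross-check with SciPy's built-in version of the same test.
levene_check = stats.levene(*groups, center="median")
print(f"Levene (scipy):  W = {levene_check.statistic:.4g}, "
      f"P = {levene_check.pvalue:.4g}")

# R squared (eta squared): between-group sum-of-squares divided by the
# total sum-of-squares, computed on the original (untransformed) values.
all_values = np.concatenate(groups)
grand_mean = all_values.mean()
ss_total = ((all_values - grand_mean) ** 2).sum()
ss_between = sum(g.size * (g.mean() - grand_mean) ** 2 for g in groups)
r_squared = ss_between / ss_total
print(f"R squared = {r_squared:.4g}")
```

Note that the Levene step uses the transformed table, while R² is computed from the original values, because R² describes the treatment effect on the variable you actually measured.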