Other multiple comparison tests

Tests that Prism offers, but we don't recommend

Bonferroni test to compare every pair of means

Prism offers the Bonferroni test for comparing every pair of means, but its only advantage over Tukey's test is that it is much easier to understand how it works. Its disadvantage is that it is too conservative, so you are more likely to miss real differences (its confidence intervals are also too wide). This is a minor concern when you compare only a few columns, but a major problem when you have many columns. Don't use the Bonferroni test with more than five groups.

Newman-Keuls

The Newman-Keuls test (also called the Student-Newman-Keuls test) compares all pairs of means following one-way ANOVA. The Newman-Keuls test is popular, but is no longer recommended for three reasons:
Although Prism still offers the Newman-Keuls test (for compatibility with prior versions), we recommend that you use the Tukey test instead. Unfortunately, the Tukey test has less power. This means that the Tukey test concludes that the difference between two groups is 'not statistically significant' in some cases where the Newman-Keuls test concludes that the difference is 'statistically significant'.

Tests Prism does not offer because many consider them obsolete

Fisher's LSD

While Fisher's Least Significant Difference (LSD) test is of historical interest as the first post test ever developed, it is no longer recommended. The other tests are better. Prism does not offer the Fisher LSD test.

Fisher's LSD test does not correct for multiple comparisons as the other post tests do. The other tests can be used even if the overall ANOVA yields a "not significant" conclusion. They set the 5% significance level for the entire family of comparisons, so there is only a 5% chance that any one or more comparisons will be declared "significant" if the null hypothesis is true.

Fisher's LSD post test can only be used if the overall ANOVA has a P value less than 0.05. This first step partly controls the false positive rate for the entire family of comparisons. But when doing each individual comparison, the test applies the 5% significance level to that comparison alone, rather than to the family of comparisons. This means it is easier to find statistical significance with the Fisher LSD test than with other post tests (it has more power), but it also means it is too easy to be misled by false positives (you'll get bogus 'significant' results in more than 5% of experiments).

Duncan's test

This test is adapted from the Newman-Keuls method. Like the Newman-Keuls method, Duncan's test does not control the familywise error rate at the specified alpha level. It has more power than the other post tests, but only because it doesn't control the error rate properly.
Few statisticians, if any, recommend this test.

Multiple comparisons tests that Prism does not offer

Scheffe's test

Scheffe's test (not calculated by Prism) is used to make all possible comparisons, including comparisons of averages of groups. So you might compare the average of groups A and B with the average of groups C, D and E. Or you might compare group A with the average of groups B through F. Because it is so versatile, Scheffe's test has less power to detect differences between pairs of groups, so it should not be used when your goal is to compare one group mean with another.

Holm's test

Some statisticians highly recommend Holm's test. We don't offer it in Prism because, while it does a great job of deciding which group differences are statistically significant, it cannot compute confidence intervals for the differences between group means. (Let us know if you would like to see this in a future version of Prism.)

False Discovery Rate

The concept of the False Discovery Rate is a major advance in statistics. But it is really only useful when you have calculated a large number of P values from independent comparisons, and now have to decide which P values are small enough to follow up. It is not used as a post test following one-way ANOVA.

References

1. MA Seaman, JR Levin and RC Serlin, Psychological Bulletin 110:577-586, 1991.
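
To make the difference between the Bonferroni and Holm corrections discussed above more concrete, here is a minimal Python sketch. It assumes you already have an unadjusted P value for each pairwise comparison. The four P values and the function names `bonferroni_adjust` and `holm_adjust` are hypothetical, chosen only for illustration, and nothing here reflects how Prism performs its calculations.

```python
def bonferroni_adjust(p_values):
    """Multiply each P value by the number of comparisons, capping at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


def holm_adjust(p_values):
    """Holm step-down adjustment.

    The smallest P value is multiplied by m, the next smallest by m - 1,
    and so on. Adjusted values are forced to be non-decreasing in that
    order and are capped at 1.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        candidate = min(1.0, (m - rank) * p_values[idx])
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


# Hypothetical unadjusted P values for four pairwise comparisons
raw = [0.010, 0.015, 0.020, 0.300]
alpha = 0.05  # familywise significance level

bonf = bonferroni_adjust(raw)
holm = holm_adjust(raw)

for p, b, h in zip(raw, bonf, holm):
    print(f"raw={p:.3f}  Bonferroni={b:.3f} ({b < alpha})  "
          f"Holm={h:.3f} ({h < alpha})")
```

With these made-up values, the Bonferroni adjustment leaves only one comparison below the 5% level, while the Holm adjustment leaves three, even though both control the error rate for the whole family of comparisons. Neither function computes confidence intervals for the differences between means.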