Results: Wilcoxon matched pairs test

Interpreting the P value

The Wilcoxon test is a nonparametric test that compares two paired groups. Prism first computes the difference within each pair and ranks the absolute values of the differences from low to high. Prism then sums the ranks of the differences where column A was higher (positive ranks), sums the ranks where column B was higher (it calls these negative ranks), and reports the two sums. If the two sums of ranks are very different, the P value will be small.
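The ranking step described above can be sketched in a few lines of Python. This is only an illustration, not Prism's code: the function name rank_sums and the sample data are hypothetical. The sketch assumes that tied absolute differences share their average rank and that pairs with no difference are dropped before ranking.

```python
def rank_sums(col_a, col_b):
    """Return (sum of positive ranks, sum of negative ranks) for paired data.

    Assumptions of this sketch: pairs with zero difference are dropped,
    and tied absolute differences receive the average of their ranks.
    """
    # Difference for each pair, skipping pairs that did not change
    diffs = [a - b for a, b in zip(col_a, col_b) if a != b]

    # Indices of the differences, ordered by absolute value (low to high)
    order = sorted(range(len(diffs)), key=lambda i: abs(diffs[i]))

    ranks = [0.0] * len(diffs)
    i = 0
    while i < len(order):
        # Find the run of tied absolute differences starting at position i
        j = i
        while j + 1 < len(order) and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        average_rank = (i + j) / 2 + 1  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1

    positive = sum(r for r, d in zip(ranks, diffs) if d > 0)  # column A higher
    negative = sum(r for r, d in zip(ranks, diffs) if d < 0)  # column B higher
    return positive, negative


# Hypothetical data: ten subjects, one of which shows no change
column_a = [125, 115, 130, 140, 140, 115, 140, 125, 140, 135]
column_b = [110, 122, 125, 120, 140, 124, 123, 137, 135, 145]
print(rank_sums(column_a, column_b))  # (27.0, 18.0)
```

The two sums always add up to n(n+1)/2, where n is the number of pairs that changed (here 9, so 45). The further the two sums are from an even split, the smaller the P value.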

The P value answers this question:

If the median difference in the entire population is zero (the treatment is ineffective), what is the chance that random sampling would result in a median change as far from zero (or further) as observed in this experiment?

If the P value is small, you can reject the idea that the difference is due to chance, and conclude instead that the populations have different medians.

If the P value is large, the data do not give you any reason to conclude that the overall medians differ. This is not the same as saying that the medians are the same. You just have no compelling evidence that they differ. If you have small samples, the Wilcoxon test has little power to detect small differences.

How the P value is calculated

If your samples are small and there are no tied ranks, Prism calculates an exact P value. If your samples are large or there are tied ranks, it calculates the P value from a Gaussian approximation. The term Gaussian, as used here, refers to the distribution of the sum of ranks and does not imply that your data need to follow a Gaussian distribution.
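To make the distinction between the exact and approximate calculations concrete, here is a minimal Python sketch of both. It is an illustration under stated assumptions, not a description of Prism's implementation: the function names are hypothetical, the exact version assumes no tied ranks and no zero differences, and the Gaussian version applies neither a continuity correction nor a correction for ties. The sample size at which Prism switches methods is not stated here and is not modeled.

```python
import math


def exact_two_tailed_p(n, w_pos):
    """Exact two-tailed P value for n untied, nonzero differences.

    Under the null hypothesis each rank 1..n is equally likely to be
    positive or negative, so all 2**n sign assignments are equally likely.
    """
    total = n * (n + 1) // 2
    mean = total / 2
    observed = abs(w_pos - mean)

    # count[s] = number of sign assignments whose positive ranks sum to s
    count = [0] * (total + 1)
    count[0] = 1
    for rank in range(1, n + 1):
        for s in range(total, rank - 1, -1):
            count[s] += count[s - rank]

    # Assignments at least as far from the expected sum as the observed one
    extreme = sum(c for s, c in enumerate(count) if abs(s - mean) >= observed)
    return extreme / 2 ** n


def gaussian_two_tailed_p(n, w_pos):
    """Two-tailed P value from the Gaussian approximation to the rank sum.

    Assumption: no continuity correction and no tie correction.
    """
    mean = n * (n + 1) / 4
    sd = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    z = (w_pos - mean) / sd
    return math.erfc(abs(z) / math.sqrt(2))


# Hypothetical example: 9 pairs changed, positive ranks sum to 27
print(exact_two_tailed_p(9, 27))
print(gaussian_two_tailed_p(9, 27))
```

The Gaussian distribution here describes how the sum of ranks varies under the null hypothesis. Nothing in either function looks at the shape of the raw data.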

When some of the subjects have exactly the same value before and after the intervention (same value in both columns), there are two ways to compute the P value:

Prism uses the method suggested by Wilcoxon and described in S Siegel and N Castellan, Nonparametric Statistics for the Behavioral Sciences and in WW Daniel, Applied Nonparametric Statistics (and many others). The subjects that show no change are simply eliminated from the analysis, reducing N. The argument is that since the outcome doesn't change at all in these subjects, they provide no information at all that will be helpful in comparing groups.
Other books show a different method that still accounts for those subjects, and this alternative method gives a different P value. The argument is that the lack of change in these subjects brings down the average change altogether, so appropriately raises the P value.

Test for effective pairing

The whole point of using a paired test is to control for experimental variability. Some factors you don't control in the experiment will affect the before and the after measurements equally, so they will not affect the difference between before and after. By analyzing only the differences, therefore, a paired test corrects for these sources of scatter.

If pairing is effective, you expect the before and after measurements to vary together. Prism quantifies this by calculating the nonparametric Spearman correlation coefficient, rs. From rs, Prism calculates a P value that answers this question: If the two groups really are not correlated at all, what is the chance that randomly selected subjects would have a correlation coefficient as large as (or larger than) the one observed in your experiment? The P value is one-tailed, as you are not interested in the possibility of observing a strong negative correlation.
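The following Python sketch shows one way to compute rs and a one-tailed P value for a small data set. It is a hedged illustration only: the function names and the data are hypothetical, and the P value is obtained here by enumerating every permutation of the ranks, which is an assumption of this sketch and practical only for small samples. How Prism computes this P value is not described in this topic.

```python
import math
from itertools import permutations


def average_ranks(values):
    """Rank values from low to high, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; assumes neither list is constant."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def spearman_one_tailed(before, after):
    """Return (rs, one-tailed P) using an exhaustive permutation test."""
    ranks_before = average_ranks(before)
    ranks_after = average_ranks(after)
    rs = pearson(ranks_before, ranks_after)  # Spearman = Pearson on ranks

    # Fraction of all pairings whose correlation is at least as large as rs
    n_total = 0
    n_as_large = 0
    for shuffled in permutations(ranks_after):
        n_total += 1
        if pearson(ranks_before, shuffled) >= rs - 1e-12:
            n_as_large += 1
    return rs, n_as_large / n_total


# Hypothetical before and after measurements for six subjects
before = [10, 12, 15, 11, 18, 14]
after = [12, 13, 17, 11, 20, 15]
rs, p_one_tailed = spearman_one_tailed(before, after)
print(rs, p_one_tailed)
```

Only correlations as large as or larger than the observed rs count toward the P value, which is what makes it one-tailed.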

If the pairing was effective, rs will be positive and the P value will be small. This means that the two groups are significantly correlated, so it made sense to choose a paired test.

If the P value is large (say larger than 0.05), you should question whether it made sense to use a paired test. Your choice of whether to use a paired test or not should not be based on this one P value, but also on the experimental design and the results you have seen in other similar experiments (assuming you have repeated the experiments several times).

If rs is negative, it means that the pairing was counterproductive! You expect the values of the pairs to move together: if one is higher, so is the other. Here the opposite is true: if one has a higher value, the other has a lower value. Most likely this is just a matter of chance. If rs is close to -1, you should review your procedures, as the data are unusual.



Copyright (c) 2007 GraphPad Software Inc. All rights reserved.
URL: http://www.graphpad.com/help/Prism5/Prism5Help.html?how_the_wilcoxon_matched_pairs_test_works.htm