KNOWLEDGEBASE - ARTICLE #2242

Two-way ANOVA is not full rank. What does that mean?

The problem

You just tried to run a two-way ANOVA on your data table and saw the following message in the results:

There are not enough data for each of the factor level combinations (row/column intersections) to estimate the terms of the two-way ANOVA (this ANOVA is not full rank). Learn more.

This situation is relatively rare as long as you have a moderate amount of data. This page explains why you received this error and what you can do to fix it (short version: you probably need more data!).

Why did this happen?

Since Prism version 9, it's been possible to perform two-way ANOVA with missing data for some row-column combinations. In these situations, it may be common to fit a "main effects only" model (a model that does not include an interaction term between the row and column factors).

In these cases, Prism would start by calculating the number of row/column intersections that contain any number of values (i.e. row/column intersections that are not totally blank). Call this value n. Now, let r be the number of levels in the row factor (i.e. the number of rows with data), and let c be the number of levels in the column factor (i.e. the number of columns with data). Using these definitions, the number of variables in a main effects only model is equal to:

1 + (r - 1) + (c - 1) = r + c - 1

Because a model always needs more values than variables, you need at least r + c values to produce valid results. If you had only r + c - 1 values (the same number of values as variables), you would get a perfect fit with a residual sum of squares of zero. In this case, Prism will display the message "There are not enough data to calculate two-way ANOVA."
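The counting rule above can be sketched in a few lines of Python. This is only an illustration, not Prism's implementation: the function names and the True/False grid format are assumptions made for the example, and n is counted as non-blank row/column intersections, following the definition given earlier.

```python
def main_effects_counts(has_data):
    """has_data[i][j] is True when row i, column j holds at least one value.

    Returns (n, r, c, parameters) for a main effects only model.
    """
    # Only rows and columns that contain data count as factor levels.
    r = sum(1 for row in has_data if any(row))
    c = sum(1 for col in zip(*has_data) if any(col))
    # n is the number of row/column intersections that are not totally blank.
    n = sum(bool(cell) for row in has_data for cell in row)
    parameters = r + c - 1  # 1 + (r - 1) + (c - 1)
    return n, r, c, parameters


def enough_cells(has_data):
    """True when there are more non-blank intersections than parameters."""
    n, r, c, _ = main_effects_counts(has_data)
    return n >= r + c


# Hypothetical 2 x 2 layout with one blank intersection.
layout = [[True, True],
          [True, False]]
print(main_effects_counts(layout))
print(enough_cells(layout))
```

In this hypothetical layout there are three non-blank intersections and three parameters, so the check fails: the model would fit perfectly and leave nothing for the residual.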

What about the "full rank" message?

There is a second requirement for ANOVA, which is that the factors must be linearly independent. In the previous section, we simply established that you must have a minimum number of row/column intersections that contain data. However, in some rare situations, your data may be arranged in such a way that they are "disconnected" from each other (see graphic representations below). In these cases, the corresponding system of linear equations used to solve the ANOVA is said to not be "full rank" (in other words, the equations are linearly dependent). Solving the ANOVA under these conditions would produce nonsensical results (values for SS that are zero, negative, or extremely large).
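For readers who want to see what "full rank" means numerically, here is a minimal sketch in plain Python. It assumes a reference-level (dummy) coding for the main effects only model and exact arithmetic with fractions; the function names and the 0-based (row, column) cell format are hypothetical, and nothing here is meant to describe how Prism solves the ANOVA internally. Replicates do not change the rank, so one entry per non-blank intersection is enough.

```python
from fractions import Fraction


def design_matrix(cells, r, c):
    """One matrix row per non-blank cell (i, j), with 0-based indices.

    Columns: intercept, r - 1 row dummies, c - 1 column dummies.
    Assumes every row 0..r-1 and column 0..c-1 has at least one cell.
    """
    matrix = []
    for i, j in cells:
        x = [Fraction(1)] + [Fraction(0)] * (r + c - 2)
        if i > 0:
            x[i] = Fraction(1)          # row level i (level 0 is the reference)
        if j > 0:
            x[r - 1 + j] = Fraction(1)  # column level j (level 0 is the reference)
        matrix.append(x)
    return matrix


def matrix_rank(matrix):
    """Rank by Gaussian elimination, exact because entries are Fractions."""
    m = [row[:] for row in matrix]
    rank = 0
    n_cols = len(m[0]) if m else 0
    for col in range(n_cols):
        pivot = next((k for k in range(rank, len(m)) if m[k][col] != 0), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        for k in range(len(m)):
            if k != rank and m[k][col] != 0:
                factor = m[k][col] / m[rank][col]
                m[k] = [a - factor * b for a, b in zip(m[k], m[rank])]
        rank += 1
        if rank == len(m):
            break
    return rank


def is_full_rank(cells, r, c):
    """Full rank means the rank equals the number of parameters, r + c - 1."""
    return matrix_rank(design_matrix(cells, r, c)) == r + c - 1


# Hypothetical layouts (not the tables pictured in this article).
connected = [(0, 0), (0, 1), (1, 0)]
disconnected = [(0, 0), (0, 1), (1, 0), (1, 1), (2, 2)]
print(is_full_rank(connected, 2, 2))
print(is_full_rank(disconnected, 3, 3))
```

The second layout has five non-blank intersections and five parameters, yet the rank of its design matrix is only 4, because the cell at (2, 2) shares no row or column with the others.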

How can you tell if data are "full rank"? Without getting too technical, one way to confirm that data are "full rank" is to visually confirm that the data aren't disconnected from each other. Start by looking at the row/column intersections where there are data (at least one replicate). To have a solvable ANOVA, you must be able to connect all row/column intersections with straight horizontal or vertical lines (no diagonals), with each row/column intersection that contains values on a "corner" of the line.
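One way to read the path rule is as a connectivity check: treat every row and every column as a node, and let each non-blank intersection link its row to its column. The sketch below, which is an interpretation rather than Prism's documented algorithm, counts the resulting groups with a small union-find structure. The function names and the 0-based (row, column) cell format are assumptions made for the example.

```python
def count_components(cells):
    """Count groups of non-blank cells linked through shared rows or columns."""
    parent = {}

    def find(node):
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]  # path halving
            node = parent[node]
        return node

    for i, j in cells:
        a, b = find(("row", i)), find(("col", j))
        if a != b:
            parent[a] = b  # this cell joins its row's group and its column's group
    return len({find(node) for node in list(parent)})


def is_connected(cells):
    """True when every non-blank cell can be reached from every other one."""
    return count_components(cells) == 1


# Hypothetical layouts (not the tables pictured in this article).
print(is_connected([(0, 0), (0, 1), (1, 1), (2, 1)]))
print(count_components([(0, 0), (0, 1), (1, 0), (1, 1), (2, 2)]))
```

A single group corresponds to data that can all be joined by horizontal and vertical moves; two or more groups correspond to the "disconnected" arrangement described above.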

Let's start with a simple example that doesn't contain any replicates:

There are 5 rows and 4 columns, so r + c - 1 = 8. We have 9 row/column intersections with values, so we're good on that front. Now, let's see if we can create a path connecting all of these values.

We can! And if you were to enter these values into Prism and perform a main effects only two-way ANOVA, Prism would calculate the results for you as expected.

Now let's take a look at the same data, but let's remove the value from Treatment 4 Group 1:

Even if we ignore the fact that we only have 8 row/column intersections with values (so a "perfect fit" scenario), let's see if we can draw a path to connect these values.

The path can no longer be constructed (the dashed lines show that we're missing a value in order to connect the two groups). Even if you start with a different value and try to create a different path, you won't be able to connect the data. Correspondingly, this ANOVA is said to no longer be "full rank". When trying to analyze these data, you will receive the error message shown at the top of this page (note: there was a bug in Prism 9; see below).

For the sake of completeness, let's look at another slightly more complex example with replicates.

We'll start by circling all row/column intersections with data, then try to connect all of these with a single path (horizontal and vertical segments only):

You can get most of the way there, but the data in Treatment 3 Group 3 are disjoint from all others.

So what could we do to fix this issue? The simple answer is: collect more data for more row/column intersections! With more complete data, it's less likely that you'll have an issue with your ANOVA not being "full rank". For example, what if we had collected data from Treatment 3 Group 4?

Now, we can construct a path that connects all row/column intersections with values:
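To find candidate intersections like this in a larger table, one option is to list every blank cell whose row and column currently sit in different groups, since filling any one of them would join those two groups. The sketch below is an illustration under that assumption; the function name and the 0-based (row, column) cell format are hypothetical, and it assumes every row and column already contains some data.

```python
def bridging_cells(cells, r, c):
    """Blank cells (i, j) whose row and column belong to different groups."""
    parent = {}

    def find(node):
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]  # path halving
            node = parent[node]
        return node

    # Each non-blank cell links its row to its column.
    for i, j in cells:
        a, b = find(("row", i)), find(("col", j))
        if a != b:
            parent[a] = b

    filled = set(cells)
    return [(i, j)
            for i in range(r)
            for j in range(c)
            if (i, j) not in filled and find(("row", i)) != find(("col", j))]


# Hypothetical 3 x 3 layout with one isolated cell at (2, 2).
cells = [(0, 0), (0, 1), (1, 0), (1, 1), (2, 2)]
print(bridging_cells(cells, 3, 3))
```

When the data are already connected, the list is empty. With more than two groups, a single new cell joins only two of them, so the check would need to be repeated after each addition.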

Prism 9 bug: No accounting for "full rank"

In Prism 9, the only restriction discussed on this page when performing two-way ANOVA was to verify that there were at least r + c - 1 values (row/column intersections) in the table. Prism 9 did not account for the structure/arrangement of the data, resulting in situations where it would attempt to calculate the two-way ANOVA even when it was not "full rank". In these cases, the analysis would report absurd values for the sum of squares (zero, negative, or extremely large values). This bug was corrected with the release of Prism 10.



Keywords: two-way anova, full rank
