Prism offers a number of goodness-of-fit metrics that can be reported for simple logistic regression. Three of them (Tjur’s R squared, Cox-Snell’s R squared, and Model deviance) are reported in the Goodness of Fit section of the results for simple logistic regression, and are briefly discussed below. A fourth option offered in the goodness-of-fit section of the simple logistic regression dialog is the likelihood ratio test, which is described here and reported in its own section of the simple logistic regression results sheet.
Tjur’s R squared is one of a number of metrics developed for logistic regression collectively referred to as “Pseudo R squared” values. If you’re familiar with linear regression, you’ve probably encountered the concept of R squared in the past. It is critical to understand that these pseudo R squared values for logistic regression are NOT the same as R squared for linear regression (read more about R squared for linear regression).
Tjur’s R squared is one of the easier pseudo R squared values for logistic regression to understand and interpret: find the average predicted probability for all rows where Y was entered as a 0 and the average predicted probability for all rows where Y was entered as a 1, and then find the absolute value of the difference between these two values. In other words:
Tjur’s R squared = |Average predicted probability for 0s – Average predicted probability for 1s|
For good models, you would expect that the average for 0s would be close to zero and the average for 1s would be close to one. Thus, like R squared for linear regression, this will be a value between 0 and 1, and values that are closer to 1 indicate a better model fit to the data.
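Because the calculation only involves averaging predicted probabilities within each observed group, Tjur’s R squared is easy to compute by hand or in code. Below is a minimal sketch (not Prism’s own code) using made-up outcomes and predicted probabilities purely for illustration:

```python
import numpy as np

# Hypothetical example data: observed 0/1 outcomes and the probabilities
# predicted for each row by a fitted logistic regression model.
y_observed = np.array([0, 0, 0, 0, 1, 1, 1, 1])
p_predicted = np.array([0.15, 0.30, 0.25, 0.40, 0.70, 0.85, 0.60, 0.90])

# Tjur's R squared: absolute difference between the average predicted
# probability for rows observed as 1 and for rows observed as 0.
mean_p_for_ones = p_predicted[y_observed == 1].mean()
mean_p_for_zeros = p_predicted[y_observed == 0].mean()
tjur_r_squared = abs(mean_p_for_ones - mean_p_for_zeros)

print(tjur_r_squared)  # 0.4875 for these made-up values
```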
Cox-Snell’s R squared is sometimes referred to as “generalized R squared”, but is just another pseudo R squared intended to provide an idea of how well a model fits the given data. The calculation of Cox-Snell’s R squared is more complicated than that of Tjur’s R squared, as is its interpretation. However, this is a common metric among other statistics packages, and Prism offers it for the sake of comparison to results calculated elsewhere. You can read more about how Cox-Snell’s R squared is calculated here.
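The full details are covered at the link above, but as a rough sketch, Cox-Snell’s R squared is commonly defined from the log-likelihoods of the fitted model and of the intercept-only (null) model as 1 – exp(2(LL_null – LL_model)/n), where n is the number of observations. The short example below uses hypothetical log-likelihood values chosen only to illustrate the arithmetic:

```python
import math

# Hypothetical log-likelihoods: ll_null for the intercept-only model,
# ll_model for the fitted model, with n observations. These numbers
# are made up for illustration only.
ll_null = -34.6
ll_model = -25.9
n = 50

# Common definition of Cox-Snell's R squared,
# 1 - (L_null / L_model)^(2/n), written in terms of log-likelihoods.
cox_snell_r_squared = 1 - math.exp(2 * (ll_null - ll_model) / n)

print(round(cox_snell_r_squared, 3))  # about 0.294 for these values
```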
Model deviance is a metric that can be used to assess how well a given model fits the entered data. Deviance is calculated based on another metric known as likelihood (or log likelihood). Calculation and interpretation of these values can be somewhat complicated, but for those interested, methods to calculate model deviance, likelihood, and log likelihood for logistic regression are provided in the multiple logistic regression portion of this guide.
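As a minimal sketch of the idea (the full treatment is in the multiple logistic regression portion of this guide), the log likelihood of a logistic model for binary data sums y·log(p) + (1 – y)·log(1 – p) over all rows, and for 0/1 outcomes the model deviance reduces to –2 times that log likelihood. The data below are made up for illustration only:

```python
import numpy as np

# Hypothetical data: observed 0/1 outcomes and the model's predicted probabilities.
y = np.array([0, 0, 0, 0, 1, 1, 1, 1])
p = np.array([0.15, 0.30, 0.25, 0.40, 0.70, 0.85, 0.60, 0.90])

# Log likelihood of the fitted logistic model for binary (0/1) data.
log_likelihood = np.sum(y * np.log(p) + (1 - y) * np.log(1 - p))

# For 0/1 outcomes the saturated model's log likelihood is zero, so
# model deviance reduces to -2 times the model's log likelihood.
deviance = -2 * log_likelihood

print(round(deviance, 3))  # about 4.906 for these made-up values
```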