An R-squared for logistic regression, packaged

This morning I checked Paul Allison's Statistical Horizons blog and found a post on R^2 measures for logistic regression. It introduced me to Tjur's R^2 by way of an example, which I repackaged below:

// Reference:

// program definition
capture prog drop tjur2
program tjur2, rclass

if !inlist("`e(cmd)'","logit","logistic") {
   di as err "Tjur's R-squared only works after logit or logistic."
   exit 498 // Thank you, Nick Cox.
}
tempvar yhat
quietly predict `yhat' if e(sample) // fitted probabilities, estimation sample only
local y `e(depvar)'
quietly ttest `yhat', by(`y')
// ttest group 2 is the higher value of the dependent variable, group 1 the lower
local r2logistic = r(mu_2) - r(mu_1)
di "Tjur's R-squared " _col(20) %4.3f `r2logistic'
return scalar r2logistic = `r2logistic'
end

// use case
use "", clear
logistic inlf kidslt6 age educ huswage city exper
tjur2
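
For reference, here is the quantity the program reports, written out. This is only a sketch in my own notation, not copied from the original post: \hat{p}_i is the fitted probability for observation i, and n_1 and n_0 are the numbers of estimation-sample observations with the outcome equal to 1 and 0. It assumes the dependent variable is coded 0/1, so that the second ttest group is the outcome-equals-1 group.

```latex
R^2_{\text{Tjur}}
  = \frac{1}{n_1} \sum_{i:\, y_i = 1} \hat{p}_i
  \;-\; \frac{1}{n_0} \sum_{i:\, y_i = 0} \hat{p}_i
```

In the program's terms, the first average is r(mu_2) and the second is r(mu_1) from the ttest call.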

I'm not sure yet if it's worth saving this program as ado/personal/t/tjur2.ado for my future logistic regression diagnostic needs, but I haven't posted anything Stata-related in too long, so there you have it.

3 Responses to “An R-squared for logistic regression, packaged”

  1. Nick Cox writes:


    Protect your program against misuse by ensuring an error message if the previous estimation command was not -logit- or -logistic-.

    Also, -word 2 of e(cmdline)- will fail to do what you want in the unlikely but possible case that the second word is a wildcard. It's safe to use -e(depvar)- instead.

    None of these measures should be taken too seriously, but I like the measure advocated by Zheng, B. and A. Agresti. 2000. Summarizing the predictive power of a generalized linear model. Statistics in Medicine 19: 1771-1781. There is a Stata implementation in -glmcorr- (SSC).

  2. Gabi Huiber writes:

    Done. Thank you also for the pointer to the Zheng and Agresti article.

  3. Nick Cox writes:

    For completeness, on error the return code should not be zero. -exit 498- is one possibility or you can look through the list of error messages in [P] for the most suitable code.