Odds, Odds Ratios, and the Logit Link

Stat 203 Lecture 25

Dr. Janssen

Setup

Odds

Suppose \(A\) is an event with probability \(\mu\). The odds of \(A\) are the ratio of the probability that \(A\) occurs to the probability that \(A\) does not occur:

\[ \text{odds} = \frac{\mu}{1-\mu}. \]

Therefore, the logit link function in a binomial GLM models the logarithm of the odds:

\[ \log(\text{odds}) = \beta_0 + \beta_1 x. \]
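
As a small illustration of the relationship between probability, odds, and log-odds, the following base R sketch uses an arbitrary probability of 0.75 (an illustrative value, not taken from any data set in this lecture).

```r
# An arbitrary illustrative probability (not from the lecture data)
p <- 0.75

odds     <- p / (1 - p)   # odds of the event: 0.75 / 0.25
log_odds <- log(odds)     # the scale on which the logit link is linear

# In base R, qlogis() is the logit function and plogis() is its inverse
all.equal(log_odds, qlogis(p))
all.equal(p, plogis(log_odds))
```

Odds above 1 correspond to probabilities above 0.5 and to positive log-odds; odds below 1 correspond to negative log-odds.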

Example: turbines

Recall:

library(GLMsData); data(turbines)
tr.logit <- glm( Fissures/Turbines ~ Hours, family=binomial, weights=Turbines, data=turbines)
coef(tr.logit)
  (Intercept)         Hours 
-3.9235965551  0.0009992372 
LogOdds <- predict( tr.logit ); odds <- exp( LogOdds )
plot( LogOdds ~ turbines$Hours, type="l", las=1,
        xlim=c(0, 5000), ylim=c(-5, 1),
        ylab="Log-odds", xlab="Run-time (in hours)" )
my <- turbines$Fissures; m <- turbines$Turbines
EmpiricalOdds <- (my + 0.5)/(m - my + 0.5) # To avoid log of zeros
points( log(EmpiricalOdds) ~ turbines$Hours)

plot( odds ~ turbines$Hours, las=1, xlim=c(0, 5000), ylim=c(0, 2),
        type="l", ylab="Odds", xlab="Run-time (in hours)")
points( EmpiricalOdds ~ turbines$Hours)
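
One way to read the fitted coefficients is on the odds scale. The sketch below does not refit the model: it copies the two coefficients from the `coef(tr.logit)` output and works in base R only. The choice of 1000 hours and 4000 hours as reference points is an illustrative assumption.

```r
# Coefficients copied from the coef(tr.logit) output
b0 <- -3.9235965551   # intercept: fitted log-odds of a fissure at 0 hours
b1 <-  0.0009992372   # change in fitted log-odds per additional hour

exp(b0)          # fitted odds of a fissure at 0 hours of run-time
exp(b1)          # multiplicative change in the odds per extra hour
exp(1000 * b1)   # multiplicative change in the odds per extra 1000 hours

# Fitted log-odds, odds, and probability at 4000 hours (illustrative value)
eta <- b0 + b1 * 4000
c(log_odds = eta, odds = exp(eta), prob = plogis(eta))
```

Because the model is linear in the log-odds, the multiplier `exp(1000 * b1)` applies to any 1000-hour increase, regardless of the starting run-time.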

Odds Ratio

Consider the binomial GLM with systematic component

\[ \log \frac{\mu}{1-\mu} = \beta_0 + \beta_1 x, \qquad(1)\]

where \(x\) is a dummy variable taking the values 0 or 1.

Then the odds of a success when \(x = 1\) are \(\exp(\beta_1)\) times the odds of a success when \(x = 0\). The factor \(\exp(\beta_1)\) is the odds ratio.
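
The factor \(\exp(\beta_1)\) can be derived by exponentiating both sides of (1) at each value of \(x\):

\[ \left.\frac{\mu}{1-\mu}\right|_{x=0} = \exp(\beta_0), \qquad \left.\frac{\mu}{1-\mu}\right|_{x=1} = \exp(\beta_0 + \beta_1), \]

so that

\[ \frac{\text{odds when } x = 1}{\text{odds when } x = 0} = \frac{\exp(\beta_0 + \beta_1)}{\exp(\beta_0)} = \exp(\beta_1). \]

In particular, \(\exp(\beta_0)\) is the odds of a success in the baseline group, \(x = 0\).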

Explorations

Setup

Does the probability of a save in a soccer match depend upon whether the goalkeeper’s team is behind or not? Roskes et al. (2011) looked at penalty kicks in the men’s World Cup soccer championships from 1982 to 2010, and they assembled data on 204 penalty kicks during shootouts.

             Saves   Scores   Total
Behind           2       22      24
Not Behind      39      141     180
Total           41      163     204

Begin by calculating various odds, including the odds ratio for a successful penalty kick when the shooter’s team is leading (that is, when the goalkeeper’s team is behind) relative to when it is not.

Logistic regression model

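# cbind() binds its arguments as columns, so each response row is (Behind, Not Behind):
# row 1 = saves (2, 39), row 2 = scores (22, 141); the factor codes save = 0, score = 1
soccer_model <- glm(formula = cbind(c(2, 22), c(39, 141)) ~ as.factor(c(0,1)),
                    family = binomial())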
exp(soccer_model$coefficients)
        (Intercept) as.factor(c(0, 1))1 
         0.05128205          3.04255319 
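
The fitted odds ratio can be checked by hand from the table. The sketch below is base R arithmetic on the four cell counts; the object names are illustrative and are not part of the original analysis.

```r
# Cell counts from the penalty-kick table
saves  <- c(Behind = 2,  NotBehind = 39)
scores <- c(Behind = 22, NotBehind = 141)

# Odds of a score (versus a save) in each situation
odds_score <- scores / saves
odds_score

# Odds ratio: goalkeeper's team behind versus not behind
or_score <- odds_score[["Behind"]] / odds_score[["NotBehind"]]
or_score

# Equivalent cross-product form for a 2 x 2 table
(22 * 39) / (2 * 141)
```

The odds ratio in a 2 x 2 table is the same whether the table is read by rows or by columns, so this hand calculation agrees with the exponentiated slope reported by the model (about 3.04).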

Example: germ

library(GLMsData)
data(germ); str(germ)
'data.frame':   21 obs. of  4 variables:
 $ Germ   : int  10 23 23 26 17 5 53 55 32 46 ...
 $ Total  : int  39 62 81 51 39 6 74 72 51 79 ...
 $ Extract: Factor w/ 2 levels "Bean","Cucumber": 1 1 1 1 1 2 2 2 2 2 ...
 $ Seeds  : Factor w/ 2 levels "OA73","OA75": 2 2 2 2 2 2 2 2 2 2 ...
plot( Germ/Total ~ Extract, data=germ, las=1, ylim=c(0, 1) )

plot( Germ/Total ~ Seeds,   data=germ, las=1, ylim=c(0, 1) )

A model

library(GLMsData)
data(germ)
gm.m1 <- glm(Germ/Total ~ Seeds + Extract, family=binomial,
               data=germ, weights=Total)
printCoefmat(coef(summary(gm.m1)))
                Estimate Std. Error z value  Pr(>|z|)    
(Intercept)     -0.70048    0.15072 -4.6475 3.359e-06 ***
SeedsOA75        0.27045    0.15471  1.7482   0.08044 .  
ExtractCucumber  1.06475    0.14421  7.3831 1.546e-13 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
exp( coef(gm.m1) )
    (Intercept)       SeedsOA75 ExtractCucumber 
      0.4963454       1.3105554       2.9001133 
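
To make the exponentiated coefficients concrete, the following sketch tabulates the fitted odds and probability of germination for the four Seeds and Extract combinations. It assumes the rounded estimates printed in the coefficient table (copied by hand, not extracted from `gm.m1`), so results agree with the model only to about four decimal places.

```r
# Estimates copied from the printed coefficient table (log-odds scale)
b <- c(Intercept = -0.70048, SeedsOA75 = 0.27045, ExtractCucumber = 1.06475)

# The four combinations, coded as 0/1 dummy variables
# (0, 0) is the baseline: OA73 seeds with bean extract
grid <- expand.grid(SeedsOA75 = 0:1, ExtractCucumber = 0:1)

grid$log_odds <- b[["Intercept"]] +
  b[["SeedsOA75"]] * grid$SeedsOA75 +
  b[["ExtractCucumber"]] * grid$ExtractCucumber
grid$odds <- exp(grid$log_odds)      # fitted odds of germination
grid$prob <- plogis(grid$log_odds)   # fitted probability of germination
grid
```

Because this model has no interaction term, moving from bean to cucumber extract multiplies the odds by the same factor, `exp(1.06475)`, for both seed types.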

Exploration: Health Expenditures

Head to

https://prof.mkjanssen.org/glm/S203_Lecture26.html