Stat 203 Lecture 25
Suppose \(A\) is an event. The odds of \(A\) are the ratio of the probability that \(A\) occurs to the probability that \(A\) does not occur: \(\text{odds}(A) = \Pr(A)/(1 - \Pr(A))\).
Therefore, the logit link function in a binomial GLM models the logarithm of the odds:
\[ \log(\text{odds}) = \beta_0 + \beta_1 x. \]
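As a quick illustration of how probabilities, odds, and log-odds relate, here is a minimal sketch in base R. The helper names `prob_to_odds` and `odds_to_prob` are hypothetical, introduced only for this example; `qlogis()` and `plogis()` are the logit and inverse-logit functions that ship with R.

```r
# Hypothetical helpers (not from the lecture): convert between probability and odds
prob_to_odds <- function(p) p / (1 - p)
odds_to_prob <- function(odds) odds / (1 + odds)

p        <- c(0.1, 0.5, 0.8)   # some example probabilities
odds     <- prob_to_odds(p)    # odds = p / (1 - p)
log_odds <- log(odds)          # the scale on which the logit link is linear

# qlogis() is the logit, so it should agree with log(odds);
# converting the odds back should recover the original probabilities
all.equal(log_odds, qlogis(p))
all.equal(odds_to_prob(odds), p)
```

A probability of 0.5 corresponds to odds of 1 and log-odds of 0, which is why a linear predictor of zero marks the point where success and failure are equally likely.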
turbines
Recall:
library(GLMsData); data(turbines)
tr.logit <- glm( Fissures/Turbines ~ Hours, family=binomial, weights=Turbines, data=turbines)
coef(tr.logit)
(Intercept) Hours
-3.9235965551 0.0009992372
LogOdds <- predict( tr.logit ); odds <- exp( LogOdds )
plot( LogOdds ~ turbines$Hours, type="l", las=1,
xlim=c(0, 5000), ylim=c(-5, 1),
ylab="Log-odds", xlab="Run-time (in hours)" )
my <- turbines$Fissures; m <- turbines$Turbines
EmpiricalOdds <- (my + 0.5)/(m - my + 0.5) # To avoid log of zeros
points( log(EmpiricalOdds) ~ turbines$Hours)
Consider the binomial GLM with systematic component
\[ \log \frac{\mu}{1-\mu} = \beta_0 + \beta_1 x, \qquad(1)\]
where \(x\) is a dummy variable taking the values 0 or 1.
Then the odds of a success when \(x = 1\) are \(\exp(\beta_1)\) times the odds of a success when \(x = 0\). The factor \(\exp(\beta_1)\) is the odds ratio.
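The factor \(\exp(\beta_1)\) follows from substituting the two possible values of \(x\) into (1) and exponentiating both sides:

\[ \text{odds}(x = 0) = \exp(\beta_0), \qquad \text{odds}(x = 1) = \exp(\beta_0 + \beta_1), \]

so that

\[ \frac{\text{odds}(x = 1)}{\text{odds}(x = 0)} = \frac{\exp(\beta_0 + \beta_1)}{\exp(\beta_0)} = \exp(\beta_1). \]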
Does the probability of a save in a soccer match depend upon whether the goalkeeper’s team is behind or not? Roskes et al. (2011) looked at penalty kicks in the men’s World Cup soccer championships from 1982 to 2010, and they assembled data on 204 penalty kicks during shootouts.
Goalkeeper's team | Saves | Scores | Total |
---|---|---|---|
Behind | 2 | 22 | 24 |
Not Behind | 39 | 141 | 180 |
Total | 41 | 163 | 204 |
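Begin by calculating the odds of a successful penalty kick in each situation, and then the odds ratio of a successful penalty kick when the shooter’s team is leading (that is, when the goalkeeper’s team is behind) compared with when it is not.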
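One possible way to set up these calculations is sketched below, using only the counts in the table and base R. A "success" is taken here to mean that the kick scores; that is an assumption about the intended reading, and the object names (`scores`, `saves`, `behind`, `pk.logit`) are made up for this sketch.

```r
# Counts copied from the table; "success" is assumed to mean the kick scores
scores <- c(Behind = 22, NotBehind = 141)
saves  <- c(Behind = 2,  NotBehind = 39)
total  <- scores + saves

# Odds of scoring in each group, and the odds ratio (Behind vs Not Behind)
odds_score <- scores / saves
odds_ratio <- unname(odds_score["Behind"] / odds_score["NotBehind"])

# The same odds ratio from a binomial GLM with a 0/1 dummy variable
behind   <- c(1, 0)   # 1 = goalkeeper's team is behind
pk.logit <- glm(scores/total ~ behind, family = binomial, weights = total)

odds_score
odds_ratio
exp(coef(pk.logit))   # exp(intercept) = odds when Not Behind; exp(slope) = odds ratio
```

With a single dummy variable the model has as many parameters as there are groups, so the exponentiated slope should reproduce the odds ratio computed directly from the table, apart from small numerical error in the fitting algorithm.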
germ
'data.frame': 21 obs. of 4 variables:
$ Germ : int 10 23 23 26 17 5 53 55 32 46 ...
$ Total : int 39 62 81 51 39 6 74 72 51 79 ...
$ Extract: Factor w/ 2 levels "Bean","Cucumber": 1 1 1 1 1 2 2 2 2 2 ...
$ Seeds : Factor w/ 2 levels "OA73","OA75": 2 2 2 2 2 2 2 2 2 2 ...
library(GLMsData)
data(germ)
gm.m1 <- glm(Germ/Total ~ Seeds + Extract, family=binomial,
data=germ, weights=Total)
printCoefmat(coef(summary(gm.m1)))
Estimate Std. Error z value Pr(>|z|)
(Intercept) -0.70048 0.15072 -4.6475 3.359e-06 ***
SeedsOA75 0.27045 0.15471 1.7482 0.08044 .
ExtractCucumber 1.06475 0.14421 7.3831 1.546e-13 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
exp(coef(gm.m1))
(Intercept) SeedsOA75 ExtractCucumber
0.4963454 1.3105554 2.9001133
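The last three values above are the exponentiated coefficient estimates, so they can be read as the baseline odds and two odds ratios. The sketch below shows one way to turn the fitted log-odds into germination probabilities for each combination of seed type and extract. It uses only the rounded estimates printed in the coefficient table, so results agree with the fitted model only to a few decimal places; the names `beta` and `grid` are made up for this example.

```r
# Rounded estimates copied from the coefficient table above
beta <- c("(Intercept)" = -0.70048, SeedsOA75 = 0.27045, ExtractCucumber = 1.06475)

# exp(intercept) is the baseline odds (OA73 seeds, bean extract);
# the other two exponentiated values are odds ratios
odds_ratios <- exp(beta)

# Dummy-variable values for the four Seeds x Extract combinations
grid <- expand.grid(SeedsOA75 = 0:1, ExtractCucumber = 0:1)

# Linear predictor (log-odds), then inverse logit to get probabilities
grid$log_odds <- beta[["(Intercept)"]] +
  beta[["SeedsOA75"]] * grid$SeedsOA75 +
  beta[["ExtractCucumber"]] * grid$ExtractCucumber
grid$prob <- plogis(grid$log_odds)

odds_ratios
grid
```

Because the model has no interaction term, the odds of germination with cucumber extract are about 2.9 times the odds with bean extract for both seed types, even though the corresponding difference in probabilities is not the same for the two seed types.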
Head to
https://prof.mkjanssen.org/glm/S203_Lecture26.html