Hypothesis Testing and Model Comparisons

Stat 203 Lecture 18

Dr. Janssen

First: Exploring the Poisson distribution

MLE for Poisson

The Poisson distribution has the probability function

\[ \mathcal{P}(y; \mu) = \frac{\exp(-\mu)\mu^y}{y!} \]

for \(\mu > 0\) and where \(y\) is a nonnegative integer. Initially, consider estimating the mean \(\mu\) for the Poisson distribution, based on a sample \(y_1, y_2, \ldots, y_n\).

  1. Determine the likelihood function and the log-likelihood function.
  2. Find the score function \(U(\mu)\).
  3. Using the score function, find the MLE of \(\mu\).
  4. Find the observed and expected information for \(\mu\).
  5. Find the standard error for \(\hat{\mu}\).
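
One way the five steps above can work out, assuming the \(y_i\) are independent. The symbols \(\mathcal{L}\) for the likelihood and \(\mathcal{J}\) for the observed information are choices made here for illustration; only \(\mathcal{I}\) is fixed elsewhere in these notes.

\[ \mathcal{L}(\mu; y) = \prod_{i=1}^n \frac{\exp(-\mu)\mu^{y_i}}{y_i!}, \qquad \ell(\mu; y) = -n\mu + \log\mu \sum_{i=1}^n y_i - \sum_{i=1}^n \log(y_i!) \]

\[ U(\mu) = \frac{d\ell}{d\mu} = -n + \frac{1}{\mu}\sum_{i=1}^n y_i = \frac{n(\bar{y} - \mu)}{\mu} \]

Setting \(U(\hat{\mu}) = 0\) gives \(\hat{\mu} = \bar{y}\), the sample mean.

\[ \mathcal{J}(\mu) = -\frac{d^2\ell}{d\mu^2} = \frac{1}{\mu^2}\sum_{i=1}^n y_i, \qquad \mathcal{I}(\mu) = E[\mathcal{J}(\mu)] = \frac{n}{\mu} \]

\[ \text{se}(\hat{\mu}) = \frac{1}{\sqrt{\mathcal{I}(\hat{\mu})}} = \sqrt{\frac{\bar{y}}{n}} \]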

Hypothesis Testing

Method 1: Wald test

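For testing \(H_0: \zeta = \zeta^0\), the Wald statistic is based on the squared distance between \(\hat{\zeta}\) and \(\zeta^0\), scaled by the estimated variance of \(\hat{\zeta}\):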

\[ W = \frac{(\hat{\zeta} - \zeta^0)^2}{\widehat\var[\hat{\zeta}]}, \]

where \(\widehat\var[\hat{\zeta}] = 1/\mathcal{I}(\hat{\zeta})\).

If \(H_0\) is true, then \(W\) follows a \(\chi_1^2\) distribution as \(n\to\infty\).

Example: quilpie, \(H_0: \mu = \mu^0 = 0.5\)

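library(GLMsData); data(quilpie)
# quilpie$y is a 0/1 response, so each y is modelled as Bernoulli with mean mu
muhat <- mean(quilpie$y)            # MLE of mu
mu0 <- 0.5                          # value of mu under H0
n <- length(quilpie$y)
varmu <- muhat*(1-muhat)/n          # estimated var[muhat] = 1/I(muhat)
W <- (muhat - mu0)^2 / varmu        # Wald statistic
P.W <- pchisq(W, df=1, lower.tail=FALSE)
round(c(Wald=W, Pvalue=P.W), 5)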
   Wald  Pvalue 
0.05887 0.80828 

Method 2: Score Test

\[ S = \frac{U(\zeta^0)^2}{\mathcal{I}(\zeta^0)} \]

If \(H_0\) is true, then \(S\) follows a \(\chi_1^2\) distribution as \(n\to\infty\).

Example: quilpie

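# For the Bernoulli model, U(mu0)^2 / I(mu0) simplifies to the expression below:
# the variance is evaluated at mu0 rather than at muhat
S <- (muhat - mu0)^2 / (mu0*(1-mu0)/n)
P.S <- pchisq(S, df=1, lower.tail=FALSE)
round(c(Score=S, Pvalue=P.S), 5)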
  Score  Pvalue 
0.05882 0.80837 

Method 3: Likelihood Ratio Test

\[ L = 2 \left( \ell(\hat{\zeta}; y) - \ell(\zeta^0; y)\right) \]

If \(H_0\) is true, then \(L\) follows a \(\chi_1^2\) distribution as \(n\to\infty\).

Example: quilpie

Lmu0 <- sum(dbinom(quilpie$y, 1, mu0, log=TRUE))      # log-likelihood at mu0
Lmuhat <- sum(dbinom(quilpie$y, 1, muhat, log=TRUE))  # log-likelihood at the MLE
L <- 2*(Lmuhat - Lmu0)                                # likelihood ratio statistic
P.L <- pchisq(L, df=1, lower.tail=FALSE)
round(c(Loglik=L, Pvalue=P.L), 5)
 Loglik  Pvalue 
0.05883 0.80835 

Example: quilpie, comparing the three tests

c( Wald=W, score=S, LLR=L)
      Wald      score        LLR 
0.05887446 0.05882353 0.05883201 
round(c(Wald=P.W, Score=P.S, LLR=P.L), 5)
   Wald   Score     LLR 
0.80828 0.80837 0.80835 
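
The same three statistics can be sketched for the Poisson model from the start of the lecture. This is only an illustration under stated assumptions: the counts are simulated (hypothetical, not lecture data), the null value `mu0 = 2` is arbitrary, and the code relies on the Poisson results \(\hat{\mu} = \bar{y}\) and \(\mathcal{I}(\mu) = n/\mu\).

```r
# Hypothetical data: simulated Poisson counts, not taken from the lecture
set.seed(203)
y <- rpois(50, lambda = 2.3)
mu0 <- 2                 # hypothetical value of mu under H0
n <- length(y)
muhat <- mean(y)         # MLE of mu for a Poisson sample

# Wald: information evaluated at the MLE, I(muhat) = n / muhat
W <- (muhat - mu0)^2 / (muhat / n)

# Score: U(mu0) = n * (muhat - mu0) / mu0 and I(mu0) = n / mu0
U0 <- n * (muhat - mu0) / mu0
S <- U0^2 / (n / mu0)

# Likelihood ratio: twice the difference in log-likelihoods
L <- 2 * (sum(dpois(y, muhat, log = TRUE)) - sum(dpois(y, mu0, log = TRUE)))

stats <- c(Wald = W, Score = S, LLR = L)
pvals <- pchisq(stats, df = 1, lower.tail = FALSE)
print(round(rbind(statistic = stats, Pvalue = pvals), 5))
```

The Wald and score statistics differ only in where the information is evaluated (at `muhat` versus at `mu0`), so the three values tend to agree closely when `muhat` is near `mu0`, as they do in the quilpie example.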

Confidence Intervals

Wald statistic

For a single parameter, the approximate \(100(1-\alpha)\)% confidence interval based on the Wald statistic is obtained via:

\[ \hat{\zeta}_j - z^* \sqrt{\widehat\var[\hat{\zeta}_j]} < \zeta_j < \hat{\zeta}_j + z^* \sqrt{\widehat\var[\hat{\zeta}_j]}, \]

where \(z^*\) is the quantile of the standard normal distribution such that an area of \(\alpha/ 2\) is in each tail.
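
A minimal sketch of this interval for a Bernoulli mean, in the style of the quilpie examples. To keep the block self-contained it uses a stand-in 0/1 vector with 35 ones among 68 values. That vector is an assumption, chosen because it reproduces the Wald statistic printed earlier, and `quilpie$y` from GLMsData could be substituted when the package is available.

```r
# Stand-in 0/1 data (assumed): 35 ones and 33 zeros
y <- rep(c(1, 0), times = c(35, 33))
n <- length(y)
muhat <- mean(y)                       # MLE of mu
se <- sqrt(muhat * (1 - muhat) / n)    # sqrt of estimated var[muhat]

alpha <- 0.05
zstar <- qnorm(1 - alpha / 2)          # leaves area alpha/2 in each tail

ci <- c(lower = muhat - zstar * se, upper = muhat + zstar * se)
print(round(ci, 4))
```

With these stand-in data the interval contains 0.5, which is consistent with the Wald test not rejecting \(H_0: \mu = 0.5\) at the 5% level.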

Exploration

Hot Hand in Basketball?

Explore the questions at https://prof.mkjanssen.org/glm/hothand.html