MLE for One Parameter

Stat 203 Lecture 16

Dr. Janssen

The Idea of Likelihood

Example: quilpie

Question

What is the unknown parameter in the quilpie data?

The probability function of \(y\) is

\[ \mathcal{P}(y; \mu) = \mu^y (1-\mu)^{1-y} \]

for \(y = 0\) or \(y = 1\); that is, each observation is Bernoulli with unknown mean \(\mu = \Pr(y = 1)\).

In R (first draft)

library(GLMsData); data(quilpie); names(quilpie)
[1] "Year"   "Rain"   "SOI"    "Phase"  "Exceed" "y"     
mu <- c(0.2, 0.4, 0.5, 0.6, 0.8) # Candidate values to test
ll <- rep(0, 5)   # A place-holder for the log-likelihood values
for (i in 1:5)
   ll[i] <- sum( dbinom(quilpie$y, size=1, prob=mu[i], log=TRUE))
data.frame(Mu=mu, LogLikelihood=ll)
   Mu LogLikelihood
1 0.2     -63.69406
2 0.4     -48.92742
3 0.5     -47.13401
4 0.6     -48.11649
5 0.8     -60.92148
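
The grid suggests the maximizer lies near \(\mu = 0.5\). Rather than refining the grid by hand, the log-likelihood can be maximized directly; a minimal sketch using base R's optimize() (not part of the lecture code):

ll.fun <- function(mu) sum( dbinom(quilpie$y, size=1, prob=mu, log=TRUE) )
# Maximize over (0, 1); $maximum is the value of mu at the peak
optimize(ll.fun, interval=c(0.01, 0.99), maximum=TRUE)$maximum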

Score Equations

A Systematic Approach

When there is a single parameter \(\zeta\), the derivative of the log-likelihood function is called the score function, denoted

\[ U(\zeta) = \frac{d\ell}{d\zeta}, \]

and the equation to be solved for \(\hat\zeta\) is the score equation, \(U(\hat\zeta) = 0\).
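
As an illustration (a minimal sketch, not part of the lecture code): the score can be approximated by a central difference of the log-likelihood, and the score equation solved numerically with uniroot(). Here the Bernoulli log-likelihood for the quilpie data stands in for \(\ell\).

loglik <- function(zeta, y) sum( dbinom(y, size=1, prob=zeta, log=TRUE) )
# Central-difference approximation to the score U(zeta) = d(loglik)/d(zeta)
U.approx <- function(zeta, y, h=1e-6)
   ( loglik(zeta + h, y) - loglik(zeta - h, y) ) / (2 * h)
uniroot(U.approx, interval=c(0.01, 0.99), y=quilpie$y)$root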

Example: Bernoulli

The log-probability function of the Bernoulli distribution is

\[ \log \mathcal{P}(y; \mu) = y\log \mu + (1-y) \log(1-\mu). \]

Differentiating and summing over the \(n\) independent observations gives the score

\[ U(\mu) = \sum_{i=1}^{n} \frac{y_i - \mu}{\mu(1-\mu)}, \]

and solving the score equation \(U(\hat\mu) = 0\) gives \(\hat\mu = \bar{y}\), the sample proportion:

muhat <- mean(quilpie$y); muhat
[1] 0.5147059
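
As a quick check (assuming quilpie is loaded), the score evaluated at muhat is zero up to rounding error:

U <- function(mu) sum( (quilpie$y - mu) / (mu * (1 - mu)) )
U(muhat)   # essentially zero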

Information: Observed and Expected

Second Derivative: Observed Information

Write \(\mathcal{J}(\zeta)\) for minus the second derivative of the log-likelihood with respect to \(\zeta\):

\[ \mathcal{J}(\zeta) = - \frac{d^2 \ell(\zeta; y)}{d\zeta^2} = - \frac{d U(\zeta)}{d\zeta}. \]

This is called the observed information.

Properties of Observed Information

  • \(\mathcal{J}(\zeta)\) must be positive near the MLE \(\hat\zeta\) (why?)
  • If \(\mathcal{J}\) is very large, then \(U\) is changing rapidly near the MLE (why?)

Thus, \(\mathcal{J}(\zeta)\) is a measure of the precision of the estimate \(\hat{\zeta}\); that is, \(\mathcal{J}(\zeta)\) measures how much information is available for estimating \(\zeta\).
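
To make this concrete, a minimal sketch for the Bernoulli model (assuming quilpie is loaded): approximate \(\mathcal{J}(\hat\mu)\) by a second-order central difference of the log-likelihood at the MLE.

loglik <- function(mu) sum( dbinom(quilpie$y, size=1, prob=mu, log=TRUE) )
muhat <- mean(quilpie$y); h <- 1e-5
# J = minus the second derivative, approximated numerically
J.hat <- -( loglik(muhat + h) - 2*loglik(muhat) + loglik(muhat - h) ) / h^2
J.hat   # larger values indicate a more sharply peaked log-likelihood, hence a more precise muhat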

Expected Information

The expected information is \(\mathcal{I}(\zeta) = E[\mathcal{J}(\zeta)]\). It measures the average information available for estimating \(\zeta\) under the model at the specified parameter value.

For the models we consider this semester, the expected information is easier to evaluate than the observed information.

Example: Bernoulli

We see

\[ \frac{d^2 \ell(\mu; y )}{d\mu^2} = \frac{d U(\mu)}{d\mu} = \frac{-\mu (1-\mu) - (y-\mu)(1-2\mu)}{\mu^2(1-\mu)^2}. \]
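
Taking expectations completes the example: since \(E[y] = \mu\), the term involving \((y - \mu)\) vanishes, so each observation contributes

\[ \mathcal{I}(\mu) = -E\left[ \frac{d^2 \ell(\mu; y)}{d\mu^2} \right] = \frac{\mu(1-\mu)}{\mu^2(1-\mu)^2} = \frac{1}{\mu(1-\mu)}, \]

and for \(n\) independent observations, \(\mathcal{I}(\mu) = n/\{\mu(1-\mu)\}\).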

Standard Errors of Parameters

Variance

Exercise. Show that \(E[U(\zeta)] = 0\), and hence that \(\mathcal{I}(\zeta) = \text{var}[U(\zeta)]\).

A Taylor series expansion of the log-likelihood around \(\zeta = \hat\zeta\) shows that

\[ \text{var}[\hat\zeta] \approx 1/\mathcal{I}(\zeta), \]

so the standard error of \(\hat\zeta\) is estimated by \(\text{se}(\hat\zeta) = 1/\sqrt{\mathcal{I}(\hat\zeta)}\).
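
For the Bernoulli example, \(\mathcal{I}(\mu) = n/\{\mu(1-\mu)\}\), so \(\text{var}[\hat\mu] \approx \hat\mu(1-\hat\mu)/n\). In R (a minimal sketch, assuming quilpie is loaded):

muhat <- mean(quilpie$y); n <- length(quilpie$y)
se.muhat <- sqrt( muhat * (1 - muhat) / n )   # standard error of muhat
se.muhat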