Stat 203 Lecture 27
For binomial data the variance function is \(V(\mu) = \mu(1-\mu)\), so a proportion \(y\) based on \(m\) trials has \(\text{var}[y] = \mu(1-\mu)/m\). In practice, the observed variation can exceed this quantity, even for binomial-like data.
This is called overdispersion.
Example: in normal linear regression, the MLE of \(\phi = \sigma^2\) is
\[ \hat{\sigma}^2 = \frac{1}{n} \sum\limits_{i=1}^n w_i (y_i - \hat{\mu}_i)^2, \]
which is rarely used in practice because it is biased downward: \(E[\hat{\sigma}^2] = \sigma^2 (n-p')/n\).
Instead:
\[ s^2 = \frac{1}{n-p'}\sum\limits_{i=1}^n w_i (y_i - \hat{\mu}_i)^2. \]
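As a quick check, a simulation sketch (the design matrix, sample sizes, and true \(\sigma^2\) are illustrative, with unit prior weights \(w_i = 1\)) shows the downward bias of the MLE and the unbiasedness of \(s^2\):

```python
import numpy as np

rng = np.random.default_rng(0)
n, p_prime = 20, 3          # n observations, p' = number of coefficients (incl. intercept)
sigma2 = 4.0                # true dispersion phi = sigma^2
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
beta = np.array([1.0, 2.0, -1.0])

mle, s2 = [], []
for _ in range(5000):
    y = X @ beta + rng.normal(scale=np.sqrt(sigma2), size=n)
    beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
    rss = np.sum((y - X @ beta_hat) ** 2)   # weighted RSS with w_i = 1
    mle.append(rss / n)                     # biased MLE  sigma-hat^2
    s2.append(rss / (n - p_prime))          # unbiased    s^2

print(np.mean(mle))   # ≈ sigma2 * (n - p')/n = 3.4
print(np.mean(s2))    # ≈ 4.0
```

The averages match the bias formula \(E[\hat{\sigma}^2] = \sigma^2(n-p')/n\): the MLE is too small by the factor \((n-p')/n\), which \(s^2\) corrects.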
The profile log-likelihood for \(\phi\) is obtained by maximizing over the regression coefficients at each fixed value of \(\phi\):
\[ \ell(\phi) = \ell(\hat{\beta}_0, \hat{\beta}_1, \ldots, \hat{\beta}_p, \phi; y). \]
The modified profile log-likelihood is
\[ \ell^0(\phi) = \frac{p'}{2} \log(\phi) + \ell(\hat{\beta}_0, \hat{\beta}_1, \ldots, \hat{\beta}_p, \phi; y). \]
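For the normal linear model, this modification removes the bias noted above. Up to an additive constant,
\[ \ell^0(\phi) = \frac{p'}{2}\log(\phi) - \frac{n}{2}\log(\phi) - \frac{1}{2\phi}\sum\limits_{i=1}^n w_i (y_i - \hat{\mu}_i)^2, \]
and setting \(\partial \ell^0 / \partial \phi = 0\) gives
\[ -\frac{n-p'}{2\phi} + \frac{1}{2\phi^2}\sum\limits_{i=1}^n w_i (y_i - \hat{\mu}_i)^2 = 0 \quad\Longrightarrow\quad \hat{\phi} = \frac{1}{n-p'}\sum\limits_{i=1}^n w_i (y_i - \hat{\mu}_i)^2 = s^2, \]
so maximizing the modified profile log-likelihood recovers the unbiased estimator \(s^2\).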
Another approach is to use the mean deviance estimator of \(\phi\):
\[ \tilde{\phi} = \frac{D(y,\hat{\mu})}{n-p'}. \]
The Pearson statistic is the working residual sum of squares:
\[ X^2 = \sum\limits_{i=1}^n W_i (z_i - \hat{\eta}_i)^2 = \sum\limits_{i=1}^n \frac{w_i (y_i - \hat{\mu}_i)^2}{V(\hat{\mu}_i)}, \]
where \(z_i\) is the working response and \(W_i\) the working weight from the final IRLS iteration.
The Pearson estimator of \(\phi\) is then
\[ \overline{\phi} = \frac{X^2}{n-p'}. \]
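Both dispersion estimators can be computed directly from a fitted binomial GLM. A minimal numpy sketch (the IRLS fit and the simulated grouped data are illustrative, not from the notes; \(y_i\) are proportions with prior weights \(w_i = m_i\)):

```python
import numpy as np

def fit_logistic(X, y, m, n_iter=25):
    """IRLS for a binomial GLM with logit link; y are proportions, m the trial counts."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        eta = X @ beta
        mu = 1.0 / (1.0 + np.exp(-eta))
        W = m * mu * (1.0 - mu)                 # working weights
        z = eta + (y - mu) / (mu * (1.0 - mu))  # working response
        beta = np.linalg.solve(X.T @ (W[:, None] * X), X.T @ (W * z))
    return beta

# simulated grouped binary data: proportions y_i out of m_i trials
rng = np.random.default_rng(1)
n, m = 30, 12
X = np.column_stack([np.ones(n), rng.normal(size=n)])
mu_true = 1 / (1 + np.exp(-(0.3 + 0.8 * X[:, 1])))
y = rng.binomial(m, mu_true) / m
mi = np.full(n, m)

beta_hat = fit_logistic(X, y, mi)
mu_hat = 1 / (1 + np.exp(-(X @ beta_hat)))
p_prime = X.shape[1]

# Pearson estimator: X^2 = sum w_i (y_i - mu_i)^2 / V(mu_i), with V(mu) = mu(1 - mu)
X2 = np.sum(mi * (y - mu_hat) ** 2 / (mu_hat * (1 - mu_hat)))
phi_pearson = X2 / (n - p_prime)

# mean deviance estimator: observations with y = 0 or y = 1 contribute only the defined term
with np.errstate(divide="ignore", invalid="ignore"):
    t1 = np.where(y > 0, y * np.log(y / mu_hat), 0.0)
    t2 = np.where(y < 1, (1 - y) * np.log((1 - y) / (1 - mu_hat)), 0.0)
D = 2 * np.sum(mi * (t1 + t2))
phi_dev = D / (n - p_prime)

print(phi_pearson, phi_dev)   # both near 1 when the binomial model holds
```

With genuinely binomial data, both estimators fluctuate around 1; values well above 1 signal overdispersion.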
A goodness-of-fit test compares the current model (Model A) to an alternative Model B of a particular type, typically the largest model that can be fitted to the data (the saturated model).
If the goodness-of-fit test rejects, this is evidence that the current model does not describe the data adequately.
The usual large-sample asymptotics do not apply, because the number of parameters in the saturated model grows with the number of observations.
Some rules-of-thumb for small-dispersion asymptotics are given on p. 277; in these cases, the Pearson statistic for goodness-of-fit is approximately chi-squared.
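For illustration (the deviance value and degrees of freedom below are hypothetical), a goodness-of-fit p-value can be computed against the \(\chi^2_{n-p'}\) reference distribution:

```python
from scipy.stats import chi2

# hypothetical fit: goodness-of-fit statistic D and residual degrees of freedom n - p'
D, df = 35.2, 27

# P(chi^2_df >= D): a small p-value is evidence against the current model
p_value = chi2.sf(D, df)
print(p_value)
```

The same computation applies with the Pearson statistic \(X^2\) in place of \(D\), subject to the small-dispersion rules-of-thumb above.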
One source of overdispersion: the probabilities \(\mu_i\) vary between observations even when all the explanatory variables are unchanged.
Alternatively, the \(m_i\) cases, of which observation \(y_i\) is a proportion, are not independent.
Example: positive cases arrive in clusters rather than as individual cases.
In that case, writing \(\rho\) for the correlation between the Bernoulli trials, we find \(\phi_i = 1 + (m_i-1)\rho\).
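This factor follows from the variance of a sum of exchangeable correlated Bernoulli trials: if \(Y_i = \sum_{j=1}^{m_i} B_j\) with \(\text{var}[B_j] = \mu_i(1-\mu_i)\) and \(\text{corr}[B_j, B_k] = \rho\) for \(j \neq k\), then
\[ \text{var}[Y_i] = m_i\,\mu_i(1-\mu_i) + m_i(m_i-1)\,\rho\,\mu_i(1-\mu_i) = \{1 + (m_i-1)\rho\}\, m_i\,\mu_i(1-\mu_i), \]
so the nominal binomial variance is inflated by exactly \(\phi_i = 1 + (m_i-1)\rho\).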