'data.frame': 21 obs. of 4 variables:
$ Age : int 22 23 25 27 28 29 30 31 32 33 ...
$ Births: int 1 1 1 1 6 1 3 6 7 7 ...
$ Weight: num 0.52 0.7 1 1.17 1.2 ...
$ SD : num NA NA NA NA 0.121 NA 0.589 0.319 0.438 0.313 ...
Stat 203 Lecture 4
\[ \begin{cases} \text{var}[y_i] = \sigma^2/w_i \\ \mu_i = \beta_0 + \sum\limits_{j=1}^p \beta_j x_{ji} \end{cases} \qquad(1)\]
where \(E[y_i] = \mu_i\), and the prior weights \(w_i\) are known.
The regression parameters \(\beta_0, \beta_1, \ldots, \beta_p\), as well as the error variance \(\sigma^2\), are unknown and must be estimated from the data.
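As a rough illustration of Equation 1, the sketch below fits a weighted linear regression with `lm()` and its `weights` argument. The data are simulated purely for illustration and are not from the lecture; the names `x`, `w`, `y`, `sigma`, and `fit` are hypothetical.

```r
# Simulated data (hypothetical; not from the lecture)
set.seed(1)
n     <- 50
x     <- runif(n, 0, 10)                 # one explanatory variable (p = 1)
w     <- sample(1:5, n, replace = TRUE)  # known prior weights w_i
sigma <- 2                               # true error standard deviation

# Generate responses with var[y_i] = sigma^2 / w_i and mu_i = 1 + 0.5 * x_i
y <- 1 + 0.5 * x + rnorm(n, sd = sigma / sqrt(w))

# Weighted least squares: lm() minimises sum(w_i * (y_i - mu_i)^2)
fit <- lm(y ~ x, weights = w)

coef(fit)             # estimates of beta_0 and beta_1
summary(fit)$sigma^2  # estimate of sigma^2
```

Observations with larger \(w_i\) have smaller variance, so they pull the fitted line more strongly than observations with small \(w_i\).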
Is simple linear regression a reasonable tool for modeling the following situations?
A possible model for the data is
\[ \begin{cases} \text{var}[y_i] = \sigma^2 / m_i\\ \mu_i = \beta_0 + \beta_1 x_i \end{cases} \qquad(2)\]
The model given in Equation 2 is a weighted linear regression model with prior weights \(w_i = m_i\).
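A minimal sketch of fitting Equation 2 in R follows. It assumes that \(y_i\) is `Weight`, \(x_i\) is `Age`, and \(m_i\) is `Births`; this mapping is inferred from the column names and is not stated explicitly. Only the first five rows visible in the `str()` output are used, with `Weight` values as rounded in that display, and the data frame name `births` is hypothetical.

```r
# First five rows as printed by str(); Weight values are rounded as displayed
births <- data.frame(
  Age    = c(22, 23, 25, 27, 28),
  Births = c(1, 1, 1, 1, 6),
  Weight = c(0.52, 0.70, 1.00, 1.17, 1.20)
)

# Weighted fit: var[y_i] = sigma^2 / m_i, with m_i taken to be Births (assumption)
fit_w <- lm(Weight ~ Age, data = births, weights = Births)

# Unweighted fit for comparison: every row treated as equally precise
fit_u <- lm(Weight ~ Age, data = births)

coef(fit_w)
coef(fit_u)
```

With only five rows the estimates are not meaningful in themselves; the point is that the row with `Births = 6` carries six times the weight of a row with `Births = 1`, so the two fits differ.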
Response variable of interest: sal77 (annual salary in 1977) and/or bsal (annual salary at the time of hire)
sex: MALE or FEMALE
senior: months since hired
age: age in months
educ: years of education
exper: months of prior work experience
bsal sal77 sex senior
Min. :3900 Min. : 7860 Length:93 Min. :65.00
1st Qu.:4980 1st Qu.: 9000 Class :character 1st Qu.:74.00
Median :5400 Median :10020 Mode :character Median :84.00
Mean :5420 Mean :10393 Mean :82.28
3rd Qu.:6000 3rd Qu.:11220 3rd Qu.:90.00
Max. :8100 Max. :16320 Max. :98.00
age educ exper
Min. :280.0 Min. : 8.00 Min. : 0.0
1st Qu.:349.0 1st Qu.:12.00 1st Qu.: 35.5
Median :468.0 Median :12.00 Median : 70.0
Mean :474.4 Mean :12.51 Mean :100.9
3rd Qu.:590.0 3rd Qu.:15.00 3rd Qu.:144.0
Max. :774.0 Max. :16.00 Max. :381.0
bsal sal77 sex senior age educ exper
1 5040 12420 MALE 96 329 15 14.0
2 6300 12060 MALE 82 357 15 72.0
3 6000 15120 MALE 67 315 15 35.5
4 6000 16320 MALE 97 354 12 24.0
5 6000 12300 MALE 66 351 12 56.0
6 6840 10380 MALE 92 374 15 41.5
bsal vs exper
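One way to draw a plot of bsal against exper is sketched below. It uses only the six rows shown in the printed data above, stored in a hypothetical data frame `salary_head`; the full data set has 93 rows, so the resulting picture is illustrative only.

```r
# Six rows copied from the printed head of the data (hypothetical object name)
salary_head <- data.frame(
  bsal  = c(5040, 6300, 6000, 6000, 6000, 6840),
  exper = c(14.0, 72.0, 35.5, 24.0, 56.0, 41.5)
)

# Scatterplot of starting salary against prior work experience
plot(bsal ~ exper, data = salary_head,
     xlab = "exper (months of prior work experience)",
     ylab = "bsal (annual salary at time of hire)")

# Overlay a simple linear regression line as a visual guide
fit_sal <- lm(bsal ~ exper, data = salary_head)
abline(fit_sal)
```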