Fitting Linear Regression Using R

Stat 203 Lecture 7

Author

Dr. Janssen

Syntax

The goal is to understand how to fit linear regression models in R, extract the estimated coefficients, and use the fitted models to plot lines of best fit.

General syntax

library(GLMsData); data(gestation);
gest.wtd <- lm(Weight ~ Age, data=gestation, weights=Births)
summary(gest.wtd)

Call:
lm(formula = Weight ~ Age, data = gestation, weights = Births)

Weighted Residuals:
     Min       1Q   Median       3Q      Max 
-1.62979 -0.60893 -0.30063 -0.08845  1.03880 

Coefficients:
             Estimate Std. Error t value Pr(>|t|)    
(Intercept) -2.678389   0.371172  -7.216 7.49e-07 ***
Age          0.153759   0.009493  16.197 1.42e-12 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.7753 on 19 degrees of freedom
Multiple R-squared:  0.9325,    Adjusted R-squared:  0.9289 
F-statistic: 262.3 on 1 and 19 DF,  p-value: 1.416e-12
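
The numbers printed by summary() can also be pulled out of the fitted model directly, which is handy when they are needed in later calculations. The following is a minimal sketch, assuming the same gest.wtd fit as above; the name coef.table is illustrative, and confint() is an addition not used elsewhere in this lecture.

```r
library(GLMsData); data(gestation)
gest.wtd <- lm(Weight ~ Age, data=gestation, weights=Births)

# Coefficient table as a numeric matrix with columns
# "Estimate", "Std. Error", "t value" and "Pr(>|t|)"
coef.table <- coef(summary(gest.wtd))
coef.table["Age", "Estimate"]      # slope estimate
coef.table["Age", "Std. Error"]    # standard error of the slope

# Other quantities reported in the summary
summary(gest.wtd)$sigma            # residual standard error
summary(gest.wtd)$r.squared        # multiple R-squared
df.residual(gest.wtd)              # residual degrees of freedom

# 95% confidence intervals for the coefficients (an addition, see above)
confint(gest.wtd, level=0.95)
```

Each value should agree with the corresponding entry in the printed summary, up to rounding.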

Weighted vs Unweighted

library(GLMsData); data(gestation);
gest.wtd <- lm(Weight ~ Age, data=gestation, weights=Births)
gest.ord <- lm(Weight ~ Age, data=gestation)
coef(gest.wtd)
(Intercept)         Age 
 -2.6783891   0.1537594 
coef(gest.ord)
(Intercept)         Age 
  -3.049879    0.159483 
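
To see where the two sets of coefficients come from, the fits can be reproduced by hand. With the weights argument, lm() chooses the coefficients that minimize the weighted sum of squares, sum of w_i (y_i - b0 - b1 x_i)^2, where here w_i is the number of births at each age; setting every w_i to 1 gives ordinary least squares. The sketch below solves the corresponding normal equations directly. The names X, y, w, beta.wtd and beta.ord are illustrative, and this is not how lm() computes the fit internally (it uses a QR decomposition).

```r
library(GLMsData); data(gestation)

X <- cbind("(Intercept)"=1, Age=gestation$Age)   # design matrix
y <- gestation$Weight                            # response
w <- gestation$Births                            # weights

# Weighted least squares: solve (X'WX) beta = X'Wy, with W = diag(w).
# w * X multiplies row i of X by w[i], which is the same as W %*% X.
beta.wtd <- solve( t(X) %*% (w * X), t(X) %*% (w * y) )

# Ordinary least squares: every weight equal to 1
beta.ord <- solve( t(X) %*% X, t(X) %*% y )

drop(beta.wtd)   # compare with coef(gest.wtd)
drop(beta.ord)   # compare with coef(gest.ord)
```

Ages with more births get more say in where the weighted line sits, which is why the two fits differ.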

Plotting

plot( Weight ~ Age, data=gestation, type="n",
    las=1, xlim=c(20, 45), ylim=c(0, 4),
    xlab="Gestational age (weeks)", ylab="Mean birthweight (in kg)" )
points( Weight[Births< 20] ~ Age[Births< 20], pch=1,  data=gestation )
points( Weight[Births>=20] ~ Age[Births>=20], pch=19, data=gestation )
abline( coef(gest.ord), lty=2, lwd=2)
abline( coef(gest.wtd), lty=1, lwd=2)
legend("topleft", lwd=c(2, 2), bty="n",
    lty=c(2, 1, NA, NA), pch=c(NA, NA, 1, 19),   # NA shows nothing
    legend=c("Ordinary regression", "Weighted regression",
        "Based on fewer than 20 obs.", "Based on 20 or more obs."))
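
abline() works here because the fitted model is a straight line. An alternative that also carries over to fits that are not straight lines is to compute fitted values over a grid with predict() and join them with lines(). This is a minimal sketch, assuming the gest.wtd fit from earlier; age.grid and mu.hat are illustrative names, and the grid of 50 points is an arbitrary choice.

```r
library(GLMsData); data(gestation)
gest.wtd <- lm(Weight ~ Age, data=gestation, weights=Births)

# Grid of ages spanning the plotted x-range (an illustrative choice)
age.grid <- seq(20, 45, length.out=50)

# Fitted mean birthweight at each grid value
mu.hat <- predict( gest.wtd, newdata=data.frame(Age=age.grid) )

plot( Weight ~ Age, data=gestation, las=1,
    xlim=c(20, 45), ylim=c(0, 4),
    xlab="Gestational age (weeks)", ylab="Mean birthweight (in kg)" )
lines( age.grid, mu.hat, lwd=2 )   # same line as abline( coef(gest.wtd) )
```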