Fitting Linear Regression Using R
Stat 203 Lecture 7
Syntax
The goal is to understand how to fit linear regression models in R, extract the estimated coefficients, and use the fitted model to plot lines of best fit.
General syntax
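library(GLMsData); data(gestation)   # load the package and the gestation data
# Weighted fit: each observation is weighted by its value of Births
gest.wtd <- lm(Weight ~ Age, data=gestation, weights=Births)
summary(gest.wtd)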
Call:
lm(formula = Weight ~ Age, data = gestation, weights = Births)

Weighted Residuals:
     Min       1Q   Median       3Q      Max
-1.62979 -0.60893 -0.30063 -0.08845  1.03880

Coefficients:
             Estimate Std. Error t value Pr(>|t|)
(Intercept) -2.678389   0.371172  -7.216 7.49e-07 ***
Age          0.153759   0.009493  16.197 1.42e-12 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.7753 on 19 degrees of freedom
Multiple R-squared:  0.9325,  Adjusted R-squared:  0.9289
F-statistic: 262.3 on 1 and 19 DF,  p-value: 1.416e-12
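The pieces printed by summary() can also be pulled out individually. The following is a minimal sketch, not part of the lecture code: it uses the built-in cars data set as a stand-in (an assumption, so that it runs without GLMsData), and the object names fit, coef_table, s and r2 are illustrative only.

```r
# Stand-in data: the built-in 'cars' data set (an assumption; not the gestation data)
fit <- lm(dist ~ speed, data = cars)

# Estimated intercept and slope as a named numeric vector
coef(fit)

# Full coefficient table: estimate, standard error, t value, p-value
coef_table <- coef(summary(fit))
coef_table["speed", "Std. Error"]

# Residual standard error and R-squared, as shown at the bottom of summary()
s  <- summary(fit)$sigma
r2 <- summary(fit)$r.squared

# 95% confidence intervals for the coefficients
confint(fit, level = 0.95)
```

The same calls should apply unchanged to a weighted fit such as gest.wtd.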
Weighted vs Unweighted
library(GLMsData); data(gestation)
gest.wtd <- lm(Weight ~ Age, data=gestation, weights=Births)
gest.ord <- lm(Weight ~ Age, data=gestation)
coef(gest.wtd)
(Intercept)         Age
 -2.6783891   0.1537594
coef(gest.ord)
(Intercept)         Age
  -3.049879    0.159483
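To see what the weights argument changes, it can help to compute both sets of coefficients by hand. The sketch below is illustrative only: the data frame toy is hypothetical (made-up values, not the gestation data), with y playing the role of a group mean and n the number of observations behind it. Weighted least squares solves (X'WX) b = X'Wy with W = diag(n); setting every weight to 1 gives ordinary least squares.

```r
# Hypothetical toy data: a group mean (y) at each x, with group size (n)
toy <- data.frame(
  x = c(1, 2, 3, 4, 5),
  y = c(2.1, 3.9, 6.2, 7.8, 10.1),
  n = c(2, 5, 20, 40, 10)
)

fit_wtd <- lm(y ~ x, data = toy, weights = n)
fit_ord <- lm(y ~ x, data = toy)

# Weighted least squares by hand: solve (X'WX) b = X'Wy
X <- cbind(1, toy$x)   # design matrix: intercept column and x
W <- diag(toy$n)       # diagonal matrix of weights
b_wtd <- drop(solve(t(X) %*% W %*% X, t(X) %*% W %*% toy$y))

# With all weights equal to 1 this reduces to ordinary least squares
b_ord <- drop(solve(t(X) %*% X, t(X) %*% toy$y))

# Both comparisons should agree up to numerical tolerance
all.equal(b_wtd, unname(coef(fit_wtd)))
all.equal(b_ord, unname(coef(fit_ord)))
```

In the gestation example the weights are Births, presumably because each recorded Weight is a mean over that many births, so means based on more births pull the fitted line harder.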
Plotting
plot( Weight ~ Age, data=gestation, type="n",   # type="n": set up empty axes
      las=1, xlim=c(20, 45), ylim=c(0, 4),
      xlab="Gestational age (weeks)", ylab="Mean birthweight (in kg)" )
# Open circles: means based on fewer than 20 births; filled circles: 20 or more
points( Weight[Births< 20] ~ Age[Births< 20], pch=1, data=gestation )
points( Weight[Births>=20] ~ Age[Births>=20], pch=19, data=gestation )
abline( coef(gest.ord), lty=2, lwd=2)   # ordinary fit (dashed)
abline( coef(gest.wtd), lty=1, lwd=2)   # weighted fit (solid)
legend("topleft", lwd=c(2, 2), bty="n",
       lty=c(2, 1, NA, NA), pch=c(NA, NA, 1, 19), # NA shows nothing
       legend=c("Ordinary regression", "Weighted regression",
                "Based on fewer than 20 obs.", "Based on 20 or more obs."))
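As a smaller illustration of drawing a fitted line, here is a minimal sketch on the built-in cars data (a stand-in chosen so the code runs without GLMsData; the names fit and grid are illustrative). It shows two routes that should give the same line: passing the fitted model to abline(), and predicting on a grid of x-values and joining the points with lines().

```r
# Stand-in data: the built-in 'cars' data set (an assumption; not the gestation data)
fit <- lm(dist ~ speed, data = cars)

pdf(NULL)  # null graphics device so the sketch also runs non-interactively
plot(dist ~ speed, data = cars, las = 1,
     xlab = "Speed (mph)", ylab = "Stopping distance (ft)")

# Route 1: abline() accepts a fitted lm object (or its coefficient vector)
abline(fit, lwd = 2)

# Route 2: predict on a grid of x-values and join the predictions
grid <- data.frame(speed = seq(min(cars$speed), max(cars$speed), length.out = 50))
grid$fitted <- predict(fit, newdata = grid)
lines(fitted ~ speed, data = grid, lty = 2, col = "grey40")

invisible(dev.off())  # close the device
```

In an interactive session the pdf(NULL) and dev.off() lines can be dropped so the plot appears on screen. The predict() route is the one that generalises to curved fits, where abline() no longer applies.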