Stat 203 Lecture 13
Question
Why does it not make sense to transform factors?
There are at least a couple of reasons we may consider transforming covariates:
When transforming the covariates, we still produce a model linear in the parameters, and we may transform any or all of the covariates as appropriate to the situation.
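As a minimal sketch of what "linear in the parameters" means, the following fits a model with a log-transformed covariate and checks that ordinary least squares on the transformed design matrix gives the same estimates. The data are simulated and the names `x`, `y`, `fit`, `X` and `beta_hat` are illustrative assumptions, not part of the lecture.

```r
# Simulated data (hypothetical; not from the lecture)
set.seed(1)
x <- runif(50, min = 1, max = 10)
y <- 2 + 3 * log(x) + rnorm(50, sd = 0.2)

# The covariate is transformed, but the model is still linear in the
# parameters: E[y] = beta0 + beta1 * log(x)
fit <- lm(y ~ log(x))

# Least squares on the design matrix with columns 1 and log(x)
# solves the normal equations and gives the same estimates
X <- cbind(1, log(x))
beta_hat <- solve(t(X) %*% X, t(X) %*% y)

coef(fit)
drop(beta_hat)
```

Only the columns of the design matrix change under the transformation; the fitting procedure itself is unchanged.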
Example 2
In this example, we explore a dataset relating wind velocity to the corresponding direct current output from windmills. We first import the data.
[1] "Wind" "DC"
scatter.smooth(windmill$DC ~ windmill$Wind, main="No transforms", xlab="Wind speed", ylab="DC output", las=1)
Explore. How can you adjust Wind to produce something more linear?
Warning
To tell R to interpret 1/Wind arithmetically rather than as a formula in the lm() function, we need to insulate it using the I() function:
Call:
lm(formula = DC ~ I(1/Wind), data = windmill)

Residuals:
     Min       1Q   Median       3Q      Max
-0.20547 -0.04940  0.01100  0.08352  0.12204

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)   2.9789     0.0449   66.34   <2e-16 ***
I(1/Wind)    -6.9345     0.2064  -33.59   <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.09417 on 23 degrees of freedom
Multiple R-squared:   0.98,  Adjusted R-squared:  0.9792
F-statistic:  1128 on 1 and 23 DF,  p-value: < 2.2e-16
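The effect of I() can be checked on simulated data. The sketch below assumes a hypothetical data frame `sim` with columns `Wind` and `DC` generated from an inverse relationship; it is not the windmill data. It compares the I() formulation with a model that uses a precomputed inverse column.

```r
# Simulated stand-in for the windmill data (hypothetical values)
set.seed(2)
wind <- runif(25, min = 2.5, max = 10)
dc <- 3 - 7 / wind + rnorm(25, sd = 0.1)
sim <- data.frame(Wind = wind, DC = dc)

# I() makes R evaluate 1/Wind as arithmetic inside the formula
fit_I <- lm(DC ~ I(1/Wind), data = sim)

# Equivalent approach: compute the transformed covariate beforehand
sim$InvWind <- 1 / sim$Wind
fit_pre <- lm(DC ~ InvWind, data = sim)

coef(fit_I)
coef(fit_pre)
```

Both fits should give the same intercept and slope; only the coefficient label differs.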
Consider the trees dataset.
             Estimate Std. Error t value  Pr(>|t|)
(Intercept) -6.631617   0.799790 -8.2917 5.057e-09 ***
log(Girth)   1.982650   0.075011 26.4316 < 2.2e-16 ***
log(Height)  1.117123   0.204437  5.4644 7.805e-06 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
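The call that produced the table above is not shown in these notes. A plausible sketch, assuming the response is log(Volume) and using the `trees` data that ships with R's datasets package, is given below; the object name `trees_mod` is an assumption.

```r
# trees is a built-in dataset with columns Girth, Height and Volume
data(trees)

# Assumed model: log-transform the response and both covariates
trees_mod <- lm(log(Volume) ~ log(Girth) + log(Height), data = trees)

# Print the coefficient table (estimates, standard errors, t and P values)
printCoefmat(coef(summary(trees_mod)))
```

Because function calls such as log() are evaluated arithmetically inside a formula, no I() wrapper is needed here.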
Consider the heatcap dataset.
            (Intercept)       Temp  I(Temp^2)
(Intercept)   1.0000000 -0.9984975  0.9941781
Temp         -0.9984975  1.0000000 -0.9985344
I(Temp^2)     0.9941781 -0.9985344  1.0000000
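Correlations this close to 1 in absolute value are typical when raw powers of a covariate are used as separate terms. The following sketch reproduces the pattern on simulated data; the data frame `sim`, the temperature grid and the model name `raw_mod` are hypothetical and are not the heatcap data.

```r
# Simulated data on a hypothetical temperature grid
set.seed(3)
temp <- seq(120, 460, by = 20)
cp <- 10 + 0.005 * temp + 1e-6 * temp^2 + rnorm(length(temp), sd = 0.02)
sim <- data.frame(Temp = temp, Cp = cp)

# Quadratic model using raw powers of Temp
raw_mod <- lm(Cp ~ Temp + I(Temp^2), data = sim)

# Correlation matrix of the estimated coefficients
raw_cor <- summary(raw_mod, correlation = TRUE)$correlation
round(raw_cor, 4)
```

The correlation between the coefficient estimates depends only on the covariate values, not on the response, so the pattern appears whenever Temp and Temp^2 are themselves strongly correlated.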
hc_mod1 <- lm( Cp ~ poly(Temp, 1), data=heatcap) # Linear
hc_mod2 <- lm( Cp ~ poly(Temp, 2), data=heatcap) # Quadratic
hc_mod3 <- lm( Cp ~ poly(Temp, 3), data=heatcap) # Cubic
hc_mod4 <- lm( Cp ~ poly(Temp, 4), data=heatcap) # Quartic
zapsmall( summary(hc_mod3,correlation=TRUE)$correlation )
               (Intercept) poly(Temp, 3)1 poly(Temp, 3)2 poly(Temp, 3)3
(Intercept)              1              0              0              0
poly(Temp, 3)1           0              1              0              0
poly(Temp, 3)2           0              0              1              0
poly(Temp, 3)3           0              0              0              1
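The identity matrix above reflects the fact that poly() builds orthogonal polynomial columns. A minimal sketch on simulated data (hypothetical data frame `sim`; not the heatcap data) checks that the columns are orthonormal, that the coefficient estimates are uncorrelated, and that the fitted values agree with those from a raw cubic.

```r
# Simulated data on a hypothetical temperature grid
set.seed(4)
temp <- seq(120, 460, by = 20)
cp <- 10 + 0.005 * temp + 1e-6 * temp^2 + rnorm(length(temp), sd = 0.02)
sim <- data.frame(Temp = temp, Cp = cp)

# Columns of poly() are orthonormal: their cross-product is the identity
P <- poly(sim$Temp, 3)
zapsmall(crossprod(P))

# Raw and orthogonal cubic models span the same column space
raw_fit  <- lm(Cp ~ Temp + I(Temp^2) + I(Temp^3), data = sim)
orth_fit <- lm(Cp ~ poly(Temp, 3), data = sim)

# Coefficient correlations vanish for the orthogonal parameterisation
orth_cor <- summary(orth_fit, correlation = TRUE)$correlation
zapsmall(orth_cor)

# Largest difference in fitted values between the two parameterisations
max(abs(fitted(raw_fit) - fitted(orth_fit)))
```

The two parameterisations describe the same fitted curve; the orthogonal version only changes how the curve is split among the coefficients.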