Stat 6601 Project: Bootstrapping Linear Models (V&R 6.6)
Jaimie Kwon
Statistics, CSUH
Goal
• How do we apply the bootstrap to linear regression
models?
Data
• ‘phones’ data (MASS package): the annual number of telephone
calls in Belgium
– ‘year’ : The last two digits of the year.
– ‘calls’ : The number of telephone calls made (in
millions of calls).
Model
• Linear models of the form:
Y = Xβ + ε
in which only ε is considered random.
Method
• Most obvious form of bootstrapping: randomly
sample pairs (xi, yi) with replacement (called
“case-based resampling” in Davison and Hinkley
(1997)).
– Might not be appropriate: it treats the xi as random,
whereas the model treats only the error term as random.
• Alternative: “model-based resampling”, which
resamples the residuals.
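For comparison, case-based resampling on the phones data might look like the following. This is only a sketch: the names ph.df, case.fun and case.boot and the seed are illustrative choices, not part of the original slides.

```r
library(MASS)   # 'phones' data
library(boot)   # boot()

# phones is stored as a list; convert it so that rows (cases) can be indexed
ph.df <- as.data.frame(phones)

# Statistic for boot(): refit the model on the resampled (year, calls) pairs
case.fun <- function(data, i) {
  coef(lm(calls ~ year, data = data[i, ]))
}

set.seed(6601)  # arbitrary seed, for reproducibility only
case.boot <- boot(ph.df, case.fun, R = 999)
case.boot
```

Here each replicate draws whole rows, so the set of years changes from one replicate to the next.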
Method (Continued)
• Procedure:
– After fitting the linear model to get
yi = xib + ei,
– create a new dataset by yi* = xib + ei*, where the ei* are
resampled with replacement from the residuals (ei).
• Some issues:
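The procedure above can also be written out by hand in base R, which may make the two steps easier to see. This is a minimal sketch, assuming the phones data from MASS; the names e, yhat, boot.coef and the seed are illustrative.

```r
library(MASS)   # 'phones' data

fit  <- lm(calls ~ year, data = phones)
e    <- resid(fit)    # residuals ei
yhat <- fitted(fit)   # fitted values xi b

set.seed(6601)        # arbitrary seed, for reproducibility only
R <- 999
boot.coef <- t(replicate(R, {
  e.star <- sample(e, replace = TRUE)   # ei*: residuals resampled with replacement
  y.star <- yhat + e.star               # yi* = xi b + ei*
  coef(lm(y.star ~ phones$year))        # refit with the x values held fixed
}))
colnames(boot.coef) <- c("intercept", "slope")

apply(boot.coef, 2, sd)   # bootstrap standard errors of the coefficients
```

The x values never change between replicates; only the response is rebuilt.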
Codes
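library(MASS)   # provides the 'phones' data
library(boot)   # provides boot()
plot(phones$year, phones$calls)
# Fit the linear model; keep residuals and fitted values alongside the data
fit <- lm(calls ~ year, data = phones)
ph <- data.frame(phones, res = resid(fit), fitted = fitted(fit))
# Statistic for boot(): rebuild calls from resampled residuals, then refit
ph.fun <- function(data, i) {
  d <- data
  d$calls <- d$fitted + d$res[i]
  coef(update(fit, data = d))
}
(ph.lm.boot <- boot(ph, ph.fun, R = 999))
plot(ph.lm.boot)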
Results
• Bootstrap Statistics :
       original        bias    std. error
t1* -260.059246  -3.5164095     96.730498
t2*    5.041478   0.0690514      1.567871
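The three columns can be recomputed from the boot object, which may help in reading the table: t1* is the intercept and t2* the coefficient of year. The sketch below refits the model so that it runs on its own; the seed is an arbitrary choice, so the bias and std. error values will differ slightly from those shown above.

```r
library(MASS)
library(boot)

fit <- lm(calls ~ year, data = phones)
ph <- data.frame(phones, res = resid(fit), fitted = fitted(fit))
ph.fun <- function(data, i) {
  d <- data
  d$calls <- d$fitted + d$res[i]
  coef(update(fit, data = d))
}
set.seed(6601)  # arbitrary seed, for reproducibility only
ph.lm.boot <- boot(ph, ph.fun, R = 999)

original  <- ph.lm.boot$t0                       # estimates from the original data
bias      <- colMeans(ph.lm.boot$t) - original   # mean of replicates minus original
std.error <- apply(ph.lm.boot$t, 2, sd)          # standard deviation of the replicates
cbind(original, bias, std.error)
```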
Summary
• Phones data
• Bootstrap linear models
Some Tips
• Don’t try to cover too much.
• Keep it structured.
• Don’t put too much in the slides.
– Let them listen to you.
– Plan to use blackboards as well.
– Don’t agonize over equations (or equation editor).
• Make slides look neat & pleasant.
• Practice and time your presentation.