5601 Notes: Smoothing
Charles J. Geyer
April 8, 2006

Contents

1 Web Pages
2 The General Smoothing Problem
3 Some Smoothers
  3.1 Running Mean Smoother
  3.2 General Kernel Smoothing
  3.3 Local Polynomial Smoothing
  3.4 Smoothing Splines
4 Some Theory
  4.1 Linear Smoothers
  4.2 Distribution Theory
    4.2.1 Assumptions
    4.2.2 Bias
    4.2.3 Variance
    4.2.4 Variance Estimate
    4.2.5 Degrees of Freedom
  4.3 Performance Criteria
    4.3.1 Mean Squared Error
    4.3.2 Mallows' C_p
    4.3.3 Cross Validation
    4.3.4 Leave One Out
    4.3.5 Cross Validation Revisited
    4.3.6 Generalized Cross Validation
5 The Bias-Variance Trade-off

1 Web Pages

This handout accompanies the web pages

http://www.stat.umn.edu/geyer/5601/examp/smoo.html
http://www.stat.umn.edu/geyer/5601/examp/smootoo.html

2 The General Smoothing Problem

In simple linear regression, the standard assumptions are that the data are of the form (x_i, y_i), i = 1, ..., n. We are interested in predicting y_i values given the corresponding x_i values. For this reason we treat the x_i as non-random.
If the x_i are actually random, we say we are conditioning on their observed values, which is the same thing as treating them as non-random. The conditional distribution of the y_i given the x_i is determined by

    y_i = α + β x_i + e_i                                        (1)

where α and β are unknown parameters (non-random but unknown constants) and the e_i are IID mean-zero normal random variables.

More generally, using multiple linear regression, we can generalize the model (1) to

    y_i = α + β_1 g_1(x_i) + ··· + β_k g_k(x_i) + e_i            (2)

where g_1, ..., g_k are any known functions and the errors e_i are as before. For example, polynomial regression is the case where the g_i are the monomials

    g_i(x) = x^i,    i = 1, ..., k.

But multiple regression works with any functions g_i so long as they are known, not estimated, that is, chosen by the data analyst without looking at the data rather than somehow estimated from the data (only the regression parameters α, β_1, ..., β_k are estimated from the data).
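As a concrete illustration of model (2), the following sketch (not from the notes; the data, sample size, and true parameter values are invented for the example) fits a polynomial regression by ordinary least squares, using the monomial basis g_j(x) = x^j. The point is that once the g_j are fixed in advance, (2) is just a linear model: build a design matrix whose columns are the known basis functions evaluated at the x_i, and estimate α, β_1, ..., β_k by least squares.

```python
import numpy as np

rng = np.random.default_rng(42)
n, k = 50, 3
x = np.linspace(0.0, 1.0, n)

# Simulate data from model (2) with monomial basis g_j(x) = x^j
# (true parameter values chosen arbitrarily for the illustration).
alpha, beta = 1.0, np.array([2.0, -3.0, 0.5])
y = alpha + sum(b * x ** (j + 1) for j, b in enumerate(beta))
y = y + rng.normal(0.0, 0.1, n)

# Design matrix: a column of ones for alpha, then g_j(x) = x^j, j = 1, ..., k.
# The basis functions are *known* (chosen before looking at the data);
# only the coefficients (alpha, beta_1, ..., beta_k) are estimated.
X = np.column_stack([x ** j for j in range(k + 1)])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
print(coef)  # least-squares estimates of (alpha, beta_1, ..., beta_k)
```

The same code fits any choice of known g_j (sines and cosines, indicator functions, and so on) by swapping out the columns of the design matrix; that flexibility, and its limits, is what motivates the smoothers discussed in the following sections.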