

Smoothing and Non-Parametric Regression

Germán Rodríguez
Spring, 2001

Objective: to estimate the effects of covariates $X$ on a response $y$ non-parametrically, letting the data suggest the appropriate functional form.

1 Scatterplot Smoothers

Consider first a model with one predictor

$$y = f(x) + \epsilon.$$

We want to estimate $f$, the trend or smooth. Assume the data are ordered so $x_1 < x_2 < \dots < x_n$. If we have multiple observations at a given $x_i$ we introduce a weight $w_i$.

1.1 Running Mean

We estimate the smooth at $x_i$ by averaging the $y$'s corresponding to $x$'s in a neighborhood of $x_i$:

$$S(x_i) = \sum_{j \in N(x_i)} y_j / n_i,$$

for a neighborhood $N(x_i)$ with $n_i$ observations. A common choice is to take a symmetric neighborhood consisting of the nearest $2k + 1$ points:

$$N(x_i) = \{\max(i - k, 1), \dots, i - 1, i, i + 1, \dots, \min(i + k, n)\}.$$

Problems: the estimate is wiggly, and it is biased near the endpoints. Use only for equally spaced points.
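The running mean of Section 1.1 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `running_mean` and the sample data are made up for the example, the observations are taken to be already sorted by $x$ and unweighted, and indices are 0-based, so the window runs from $\max(i - k, 0)$ to $\min(i + k, n - 1)$.

```python
def running_mean(y, k):
    """Running-mean smooth using the nearest 2k+1 points by index.

    Assumes y is ordered by x and unweighted. The window is truncated at the
    ends, which is the 0-based version of the neighborhood N(x_i) in the text.
    """
    n = len(y)
    smooth = []
    for i in range(n):
        lo = max(i - k, 0)
        hi = min(i + k, n - 1)
        window = y[lo:hi + 1]
        smooth.append(sum(window) / len(window))
    return smooth


# Hypothetical data, chosen only to exercise the function.
y = [1.0, 2.0, 4.0, 3.0, 5.0]
s = running_mean(y, k=1)

# An exactly linear trend shows the endpoint bias.
trend = [0.0, 1.0, 2.0, 3.0, 4.0]
s_trend = running_mean(trend, k=1)
```

On the linear trend the interior values are reproduced exactly, but the first fitted value is 0.5 rather than 0, because the truncated window at the left end only contains points to the right of $x_1$. This is the endpoint bias noted above.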

1.2 Running Line

One way to reduce the bias is by fitting a local line:

$$S(x_i) = \hat{\alpha}_i + \hat{\beta}_i x_i,$$

where $\hat{\alpha}_i$ and $\hat{\beta}_i$ are OLS estimates based on points in a neighborhood $N(x_i)$ of $x_i$. This is easy to do thanks to well-known regression updating formulas. The extension to weighted data is straightforward. Running lines work much better than running means.

1.3 Kernel Smoothers

An alternative approach is to use a weighted running mean, with weights that decline as one moves away from the target value. To calculate $S(x_i)$, the $j$-th point receives weight

$$w_{ij} = \frac{c_i}{\lambda} \, d\left(\frac{|x_i - x_j|}{\lambda}\right),$$

where $d(\cdot)$ is an even function, $\lambda$ is a tuning constant called the window width or bandwidth, and $c_i$ is a normalizing constant so the weights add up to one for each $x_i$. Popular choices of the function $d(\cdot)$ are

  • Gaussian density,
  • Epanechnikov: $d(t) = \frac{3}{4}(1 - t^2)$ for $t^2 < 1$, 0 otherwise,
  • Minimum variance: $d(t) = \frac{3}{8}(3 - 5t^2)$ for $t^2 < 1$, 0 otherwise.

One difficulty is that a kernel smoother still exhibits bias at the end points. Solution? Combine the last two approaches: use kernel weights to estimate a running line.

1.4 Loess/Lowess

One such approach is loess, a locally weighted running line smoother due to Cleveland and implemented in S and R. To calculate $S(x_i)$ you basically

  • find a symmetric nearest neighborhood of $x_i$,
  • find the distance from $x_i$ to the furthest neighbor and use this as $\lambda$,
  • use a tri-cube weight function $d(t) = (1 - t^3)^3$ for $0 \le t < 1$, 0 otherwise,
  • estimate a local line using these weights,
  • take the fitted value at $x_i$ as $S(x_i)$.

A variant uses robust regression in each neighborhood.
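The kernel smoother of Section 1.3 and the loess steps of Section 1.4 can be sketched as follows. This is a simplified illustration, not the S or R implementation: the names `kernel_smooth`, `weighted_line_fit` and `loess_like` are hypothetical, the neighborhood is taken to be the index window $i - k, \dots, i + k$ truncated at the ends, the $x$'s are assumed distinct and sorted, and no robustness iterations are performed. Because the furthest neighbor receives tri-cube weight zero, the sketch needs $k \ge 2$ to fit a real line.

```python
def epanechnikov(t):
    """Epanechnikov kernel d(t) = 3/4 (1 - t^2) for t^2 < 1, 0 otherwise."""
    return 0.75 * (1.0 - t * t) if t * t < 1.0 else 0.0


def tricube(t):
    """Tri-cube weight d(t) = (1 - t^3)^3 for 0 <= t < 1, 0 otherwise."""
    return (1.0 - t ** 3) ** 3 if 0.0 <= t < 1.0 else 0.0


def kernel_smooth(x, y, bandwidth, d=epanechnikov):
    """Weighted running mean with weights proportional to d(|x_i - x_j| / bandwidth)."""
    smooth = []
    for xi in x:
        w = [d(abs(xi - xj) / bandwidth) for xj in x]
        total = sum(w)  # normalizing here plays the role of c_i / lambda
        smooth.append(sum(wj * yj for wj, yj in zip(w, y)) / total)
    return smooth


def weighted_line_fit(x, y, w, x0):
    """Fitted value at x0 of the weighted least-squares line through (x, y)."""
    sw = sum(w)
    xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    ybar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - xbar) * (yi - ybar) for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx if sxx > 0 else 0.0  # fall back to a weighted mean
    return ybar + slope * (x0 - xbar)


def loess_like(x, y, k):
    """Local line with tri-cube weights over the index window i-k .. i+k."""
    n = len(x)
    smooth = []
    for i in range(n):
        lo, hi = max(i - k, 0), min(i + k, n - 1)
        xs, ys = x[lo:hi + 1], y[lo:hi + 1]
        lam = max(abs(x[i] - xj) for xj in xs)  # distance to furthest neighbor
        w = [tricube(abs(x[i] - xj) / lam) for xj in xs]
        smooth.append(weighted_line_fit(xs, ys, w, x[i]))
    return smooth


# Hypothetical data on an exact line y = 2x + 1.
x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [2.0 * xi + 1.0 for xi in x]
ks = kernel_smooth(x, y, bandwidth=2.0)
ll = loess_like(x, y, k=2)
```

On exactly linear data the local-line fit reproduces the line everywhere, including the ends, while the kernel-weighted mean matches the line in the interior but is pulled toward the interior at the first and last points. That contrast is the motivation for combining kernel weights with a running line.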
1.5 Other Approaches

Splines are a popular family of smoothers. We will study splines in the next section.

All of the methods discussed so far are linear smoothers: we can always write

$$S(x) = Ay,$$

where $S$ and $y$ are $n$-vectors and $A$ is an $n \times n$ matrix that depends on the $x$'s.

There are also non-linear smoothers. These are usually based on running medians, followed by enhancements such as Hanning, splitting, and twicing.
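To make the linear-smoother representation $S(x) = Ay$ concrete, the sketch below builds the smoother matrix of a running mean and contrasts it with a running median of three. The function names and data are hypothetical, and the rule of copying the two end values in the running median is an assumption made only for this example.

```python
def running_mean_matrix(n, k):
    """n x n smoother matrix A of the running mean with window i-k .. i+k."""
    A = [[0.0] * n for _ in range(n)]
    for i in range(n):
        lo, hi = max(i - k, 0), min(i + k, n - 1)
        for j in range(lo, hi + 1):
            A[i][j] = 1.0 / (hi - lo + 1)
    return A


def apply_smoother(A, y):
    """Matrix-vector product A y."""
    return [sum(a * v for a, v in zip(row, y)) for row in A]


def running_median3(y):
    """Running median of three; the two end values are copied (an assumption)."""
    mid = [sorted(y[i - 1:i + 2])[1] for i in range(1, len(y) - 1)]
    return [y[0]] + mid + [y[-1]]


A = running_mean_matrix(5, k=1)
y = [1.0, 2.0, 4.0, 3.0, 5.0]
z = [0.0, 5.0, 0.0, 0.0, 0.0]  # a single spike
yz = [a + b for a, b in zip(y, z)]

# Linear smoother: smoothing y + z equals smoothing y plus smoothing z.
lhs = apply_smoother(A, yz)
rhs = [a + b for a, b in zip(apply_smoother(A, y), apply_smoother(A, z))]

# Running median: the same identity fails for this data.
med_sum = running_median3(yz)
med_parts = [a + b for a, b in zip(running_median3(y), running_median3(z))]
```

Each row of `A` sums to one, and `A` depends only on the positions of the points, not on `y`. The running median removes the spike `z` entirely when it is smoothed on its own, so smoothing the sum is not the sum of the smooths and no such matrix exists for it.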
