Cheat Sheet (3) - Straight Line Regression

Straight Line Regression: Y(i) = Beta(0) + Beta(1)*X(i) + Epsilon(i), where Beta(0) is the intercept (constant), Beta(1) is the slope (coefficient), and the Epsilon(i) are iid white noise with mean 0 and variance sigma^2.

Least Squares Estimation: choose b(0) and b(1) to minimize sum{Y(i) - (b(0) + b(1)*X(i))}^2, where Y(i) is the observed value and b(0) + b(1)*X(i) is the fitted value from the model. The null hypothesis of the F-test is that the slope coefficients (the Beta(i)'s other than the intercept) are all zero. If we get a low p-value / high F-value, we reject this hypothesis and can say the coefficients are not 0, so the predictors have a linear relationship with the response. However, we cannot always trust a low p-value / high F-value when many variables are being tested at once. R^2 = 68.5% means the model explains 68.5% of the variation; we want a high R^2, but we should be suspicious of an R^2 that is too high. The standard error (root MSE) estimates sigma; we want a low standard error. The regression degrees of freedom is p = number of predictor variables (p = 1 for straight line regression), the total degrees of freedom is n - 1, and the residual error degrees of freedom is n - p - 1. The mean square (MS) for each source is its sum of squares divided by its degrees of freedom, and F = (regression MS)/(residual error MS). The least squares line is a sample version of the best linear predictor; R^2 is the squared correlation between X and Y, i.e. the fraction of the variation that can be predicted using the linear predictor. (A numeric sketch follows these notes.)

Multiple Linear Regression: Y(i) = Beta(0) + Beta(1)*X(i1) + ... + Beta(p)*X(ip) + Epsilon(i). All coefficients are estimated by least squares.

Model Selection: larger models have less bias (good) and would give the best predictions if all coefficients could be estimated without error, but once the coefficients are replaced by estimates, a larger model (more coefficients) predicts less accurately and with more variability (bad). R^2 is not useful for comparing models of different sizes because it always favors the largest model, so use adjusted R^2 instead. C(p) estimates how well a model will predict; choose a model with C(p) <= p and approximately equal to p. (Sketch below.)

Nonlinear Regression: used because a linear model cannot represent a nonlinear function. Estimate a starting value for r (usually between 5-7%), then look at the SAS output and pick the r where the sum of squares is smallest. (Sketch below.)

Residual Plotting: problems to look for (sketch below):
(1) Non-normality: outliers can be a problem since they have a large influence on the estimation results; solution: transformation of the response.
(2) Non-constant variance: causes inefficient (too variable) estimates; solution: transformation of the response and weighting.
(3) Non-linearity.
(4) Model misspecification: E(Y | X(1), ..., X(p)) has a functional form different from the model; causes biased estimates; solution: transformation, polynomial regression, nonparametric regression, etc.
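
Sketch 1 (least squares and the ANOVA table). A minimal numeric sketch in Python/numpy of the quantities described above: the least-squares slope and intercept, the sum-of-squares decomposition with its degrees of freedom, the mean squares, F, R^2, and the root MSE estimate of sigma. The simulated data and variable names are my own illustration, not from the course.

    import numpy as np

    # Simulated example data (purely illustrative).
    rng = np.random.default_rng(0)
    n = 30
    x = rng.uniform(0, 10, size=n)
    y = 2.0 + 0.5 * x + rng.normal(0, 1.0, size=n)   # Y = Beta0 + Beta1*X + noise

    # Least squares: b0, b1 minimize sum{Y(i) - (b0 + b1*X(i))}^2.
    b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
    b0 = y.mean() - b1 * x.mean()
    fitted = b0 + b1 * x
    resid = y - fitted

    # ANOVA decomposition: SST = SSR + SSE.
    p = 1                                    # one predictor (straight line)
    sst = np.sum((y - y.mean()) ** 2)        # total,      df = n - 1
    sse = np.sum(resid ** 2)                 # residual,   df = n - p - 1
    ssr = sst - sse                          # regression, df = p

    ms_reg = ssr / p
    ms_err = sse / (n - p - 1)
    F = ms_reg / ms_err                      # F = (regression MS) / (residual MS)
    r2 = ssr / sst                           # fraction of variation explained
    root_mse = np.sqrt(ms_err)               # estimates sigma

    # R^2 is the squared correlation between X and Y.
    assert np.isclose(r2, np.corrcoef(x, y)[0, 1] ** 2)

    print(f"b0={b0:.3f}, b1={b1:.3f}, R^2={r2:.3f}, F={F:.1f}, root MSE={root_mse:.3f}")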
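
Sketch 2 (model selection). A hedged sketch comparing nested submodels with R^2, adjusted R^2, and one common form of Mallows' C(p). The data, the helper fit_ls, and the convention that k counts all estimated coefficients including the intercept are assumptions for illustration; textbooks and software differ slightly in how C(p) is parameterized.

    import numpy as np

    rng = np.random.default_rng(1)
    n = 60
    X_all = rng.normal(size=(n, 3))                    # three candidate predictors
    y = 1.0 + 2.0 * X_all[:, 0] - 1.0 * X_all[:, 1] + rng.normal(0, 1.0, size=n)

    def fit_ls(X, y):
        # Least squares fit with an intercept; returns SSE and number of coefficients.
        Xd = np.column_stack([np.ones(len(y)), X])
        coef, *_ = np.linalg.lstsq(Xd, y, rcond=None)
        sse = np.sum((y - Xd @ coef) ** 2)
        return sse, Xd.shape[1]

    # sigma^2 is estimated from the largest (full) model.
    sse_full, k_full = fit_ls(X_all, y)
    sigma2_hat = sse_full / (n - k_full)

    sst = np.sum((y - y.mean()) ** 2)
    for cols in [[0], [0, 1], [0, 1, 2]]:              # nested candidate models
        sse, k = fit_ls(X_all[:, cols], y)             # k = coefficients incl. intercept
        r2 = 1 - sse / sst
        adj_r2 = 1 - (sse / (n - k)) / (sst / (n - 1))
        cp = sse / sigma2_hat - (n - 2 * k)            # one common C(p) convention
        print(f"predictors {cols}: R^2={r2:.3f}, adj R^2={adj_r2:.3f}, "
              f"C(p)={cp:.2f} (k={k})")

R^2 always rises as predictors are added; adjusted R^2 and C(p) penalize the extra coefficients, which is why they are the ones used to compare models of different sizes.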
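
Sketch 3 (nonlinear regression, grid search over r). The notes reference SAS output and a rate r between 5% and 7% without spelling out the model, so this sketch assumes, purely for illustration, an exponential growth curve Y(i) = Beta(0)*exp(r*t(i)) + Epsilon(i). For each candidate r it fits Beta(0) by least squares and keeps the r with the smallest sum of squares, mirroring the "find the r where the sum of squares is smallest" advice.

    import numpy as np

    rng = np.random.default_rng(2)
    t = np.arange(0, 20)                               # time index (illustrative)
    y = 100.0 * np.exp(0.06 * t) + rng.normal(0, 3.0, size=t.size)

    best = None
    for r in np.arange(0.050, 0.0701, 0.001):          # candidate rates, 5% .. 7%
        g = np.exp(r * t)                              # growth curve for this r
        beta0 = np.sum(g * y) / np.sum(g * g)          # least squares Beta0 given r
        sse = np.sum((y - beta0 * g) ** 2)
        if best is None or sse < best[2]:
            best = (r, beta0, sse)

    r_hat, beta0_hat, sse_min = best
    print(f"r with smallest SSE: {r_hat:.3f} (Beta0={beta0_hat:.1f}, SSE={sse_min:.1f})")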
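
Sketch 4 (residual plotting). A short sketch of the diagnostics listed above: residuals vs fitted values to look for non-constant variance, non-linearity, or misspecification, and a histogram of the residuals to look for non-normality and outliers. The quick straight-line fit inside is just to make the example self-contained; matplotlib is assumed.

    import numpy as np
    import matplotlib.pyplot as plt

    # Quick illustrative fit (same idea as Sketch 1).
    rng = np.random.default_rng(3)
    x = rng.uniform(0, 10, size=50)
    y = 2.0 + 0.5 * x + rng.normal(0, 1.0, size=50)
    b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
    b0 = y.mean() - b1 * x.mean()
    fitted = b0 + b1 * x
    resid = y - fitted

    fig, axes = plt.subplots(1, 2, figsize=(9, 4))

    # (2)/(3)/(4): residuals vs fitted -- a funnel shape suggests non-constant
    # variance; a curved pattern suggests non-linearity or misspecification.
    axes[0].scatter(fitted, resid)
    axes[0].axhline(0.0, color="gray", linestyle="--")
    axes[0].set_xlabel("fitted values")
    axes[0].set_ylabel("residuals")

    # (1): histogram of residuals -- heavy tails or isolated extreme points
    # flag non-normality and influential outliers.
    axes[1].hist(resid, bins=15)
    axes[1].set_xlabel("residuals")

    plt.tight_layout()
    plt.show()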