# Distances value of more than chi square critical

An observation whose distance value exceeds the chi-square critical value (with degrees of freedom equal to the number of explanatory variables) is classified as an outlier.
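The rule above can be sketched in a few lines. The preview does not show which distance is meant, so this sketch assumes it is the squared Mahalanobis distance, and it assumes exactly two explanatory variables so that the chi-square cutoff has a closed form. The function names and the toy data are hypothetical.

```python
import math

def squared_mahalanobis_2d(points):
    """Squared Mahalanobis distance of each (x1, x2) point from the sample mean.

    ASSUMPTION: the 'distance' in the text is the Mahalanobis distance.
    """
    n = len(points)
    m1 = sum(p[0] for p in points) / n
    m2 = sum(p[1] for p in points) / n
    # Sample variances and covariance (divisor n - 1).
    s11 = sum((p[0] - m1) ** 2 for p in points) / (n - 1)
    s22 = sum((p[1] - m2) ** 2 for p in points) / (n - 1)
    s12 = sum((p[0] - m1) * (p[1] - m2) for p in points) / (n - 1)
    det = s11 * s22 - s12 ** 2
    out = []
    for a, b in points:
        d1, d2 = a - m1, b - m2
        # d' S^{-1} d written out for a 2x2 covariance matrix.
        out.append((s22 * d1 * d1 - 2 * s12 * d1 * d2 + s11 * d2 * d2) / det)
    return out

def chi2_critical_df2(alpha):
    """Upper-tail chi-square critical value for 2 degrees of freedom.

    With 2 degrees of freedom the upper-tail probability is exp(-x / 2),
    so the cutoff is -2 * ln(alpha). Other degrees of freedom need a table.
    """
    return -2.0 * math.log(alpha)

# Toy data (hypothetical): a 3x3 grid plus one distant point.
points = [(i, j) for i in range(3) for j in range(3)] + [(10, 10)]
d2 = squared_mahalanobis_2d(points)
cutoff = chi2_critical_df2(0.05)
outliers = [k for k, value in enumerate(d2) if value > cutoff]
```

Only the distant point is flagged here. For a different number of explanatory variables, the cutoff would come from a chi-square table with that many degrees of freedom.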
Business Analytics – The Science of Data Driven Decision Making

## Cook's Distance

Cook's distance measures how much the predicted values of the dependent variable change across all observations in the sample when a particular observation is excluded from the sample used to estimate the regression parameters. Cook's distance for simple linear regression is given by

$$
D_i = \frac{\sum_{j} \left(\hat{Y}_j - \hat{Y}_{j(i)}\right)^2}{\text{MSE}}
$$

where $D_i$ is the Cook's distance measure for the $i$th observation, $\hat{Y}_j$ is the predicted value of the $j$th observation including the $i$th observation, $\hat{Y}_{j(i)}$ is the predicted value of the $j$th observation after excluding the $i$th observation from the sample, and MSE is the mean squared error.
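As a rough illustration of the definition above, the following sketch refits the line once per observation and sums the squared changes in the predicted values. The helper names and the five-point data set are hypothetical. The divisor `p` is an added parameter: `p=1` divides by MSE alone, as the slide text describes, while many references also divide by the number of regression parameters (2 in simple linear regression).

```python
def fit_slr(x, y):
    """Ordinary least squares for y = b0 + b1 * x; returns (b0, b1)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    ss_x = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / ss_x
    return y_bar - b1 * x_bar, b1

def cooks_distance(x, y, p=1):
    """Leave-one-out Cook's distance for every observation.

    ASSUMPTION: p=1 follows the slide's formula; pass p=2 for the
    parameter-scaled version found in many references.
    """
    n = len(x)
    b0, b1 = fit_slr(x, y)
    fitted = [b0 + b1 * xi for xi in x]
    mse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted)) / (n - 2)
    distances = []
    for i in range(n):
        # Refit without observation i, then predict for ALL observations.
        a0, a1 = fit_slr(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
        shift = sum((fj - (a0 + a1 * xj)) ** 2 for xj, fj in zip(x, fitted))
        distances.append(shift / (p * mse))
    return distances

# Hypothetical data for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
d = cooks_distance(x, y)
```

The brute-force refit mirrors the verbal definition directly; for large samples a closed form based on residuals and leverage values would avoid the repeated fitting.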
## Leverage Value

The leverage value of an observation measures the influence of that observation on the overall fit of the regression function. The leverage value for an observation in SLR is given by

$$
h_i = \frac{1}{n} + \frac{(x_i - \bar{x})^2}{\sum_{i=1}^{n} (x_i - \bar{x})^2}
$$

An observation with a leverage value of more than 2/n or 3/n is treated as highly influential. In the equation above, the first term (1/n) tends to zero for large values of n.
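A minimal sketch of the leverage formula for simple linear regression follows. The function name and the sample x values are hypothetical; the cutoff is left to the caller rather than fixed in the code.

```python
def leverage_values(x):
    """Leverage h_i = 1/n + (x_i - x_bar)^2 / sum((x_i - x_bar)^2) for SLR."""
    n = len(x)
    x_bar = sum(x) / n
    ss_x = sum((xi - x_bar) ** 2 for xi in x)
    return [1 / n + (xi - x_bar) ** 2 / ss_x for xi in x]

# Hypothetical x values for illustration only.
x = [1, 2, 3, 4, 5]
h = leverage_values(x)

# Observations far from the mean of x get the largest leverage.
most_extreme = max(range(len(x)), key=lambda i: h[i])
```

In simple linear regression the leverage values always sum to 2, which is a quick sanity check on any implementation.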
## DFFit and DFBeta

DFFit is the change in the predicted value of $Y_i$ when case $i$ is removed from the data set. DFBeta is the change in the regression coefficient values when observation $i$ is removed from the data.
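Both quantities can be sketched with a leave-one-out loop. This version reports the raw, unscaled changes exactly as defined above, with the sign taken as "full fit minus fit without case i" (an assumption, since the text does not fix the sign). Statistical packages often report standardized variants instead. All names and data are hypothetical.

```python
def fit_slr(x, y):
    """Ordinary least squares for y = b0 + b1 * x; returns (b0, b1)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    ss_x = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / ss_x
    return y_bar - b1 * x_bar, b1

def dffit_dfbeta(x, y):
    """Raw DFFit and DFBeta for each case (full fit minus leave-one-out fit)."""
    b0, b1 = fit_slr(x, y)
    dffit, dfbeta = [], []
    for i in range(len(x)):
        a0, a1 = fit_slr(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
        # Change in the prediction for case i itself.
        dffit.append((b0 + b1 * x[i]) - (a0 + a1 * x[i]))
        # Change in (intercept, slope).
        dfbeta.append((b0 - a0, b1 - a1))
    return dffit, dfbeta

# Hypothetical data for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
dffit, dfbeta = dffit_dfbeta(x, y)
```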
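## Confidence Interval for Regression Coefficients $\beta_0$ and $\beta_1$

The standard errors of the estimates of $\beta_0$ and $\beta_1$ are given by

$$
S_e(\hat{\beta}_0) = S_e \sqrt{\frac{\sum_{i=1}^{n} X_i^2}{n \, SS_X}}, \qquad S_e(\hat{\beta}_1) = \frac{S_e}{\sqrt{SS_X}}
$$

where $S_e$ is the standard error of the residuals,

$$
S_e = \sqrt{\frac{\sum (Y_i - \hat{Y}_i)^2}{n - 2}}, \qquad SS_X = \sum_{i=1}^{n} (X_i - \bar{X})^2
$$

The interval estimates, or $(1-\alpha)100\%$ confidence intervals, for $\beta_0$ and $\beta_1$ are given by

$$
\hat{\beta}_0 \pm t_{\alpha/2,\, n-2} \, S_e(\hat{\beta}_0), \qquad \hat{\beta}_1 \pm t_{\alpha/2,\, n-2} \, S_e(\hat{\beta}_1)
$$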
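The standard errors and intervals for the two coefficients can be computed roughly as follows. The function name and data are hypothetical, and the t critical value is supplied by the caller (here 3.182, the table value for a 95% interval with n - 2 = 3 degrees of freedom) rather than computed.

```python
import math

def coefficient_intervals(x, y, t_crit):
    """Standard errors and confidence intervals for b0 and b1 in SLR.

    t_crit is the two-sided t critical value for n - 2 degrees of freedom,
    looked up by the caller (an assumption of this sketch).
    """
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    ss_x = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / ss_x
    b0 = y_bar - b1 * x_bar
    # Standard error of the residuals.
    s_e = math.sqrt(sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y)) / (n - 2))
    se_b0 = s_e * math.sqrt(sum(xi ** 2 for xi in x) / (n * ss_x))
    se_b1 = s_e / math.sqrt(ss_x)
    ci_b0 = (b0 - t_crit * se_b0, b0 + t_crit * se_b0)
    ci_b1 = (b1 - t_crit * se_b1, b1 + t_crit * se_b1)
    return se_b0, se_b1, ci_b0, ci_b1

# Hypothetical data for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
se_b0, se_b1, ci_b0, ci_b1 = coefficient_intervals(x, y, t_crit=3.182)
```

With only five observations the interval for the slope includes zero, which illustrates how wide these intervals become in small samples.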
## Confidence Interval for the Expected Value of Y for a Given X

Point estimates are subject to error, both from uncertainty in the estimated parameters and from natural variation in the data around the predicted line, so the user would like to know the interval estimate, or confidence interval, for the conditional expected value. The confidence interval of the expected value of $Y_i$ for a given value of $X_i$ is given by

$$
\hat{Y}_i \pm t_{\alpha/2,\, n-2} \, S_e \sqrt{\frac{1}{n} + \frac{(X_i - \bar{X})^2}{\sum_{i=1}^{n} (X_i - \bar{X})^2}}
$$

where the term

$$
S_e \sqrt{\frac{1}{n} + \frac{(X_i - \bar{X})^2}{\sum_{i=1}^{n} (X_i - \bar{X})^2}}
$$

is the standard error of $E(Y|X)$.
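A small sketch of this interval follows, under the same assumptions as the formula: simple linear regression, with the t critical value looked up by the caller. The function name, the data, and the value 3.182 (95% interval, 3 degrees of freedom) are illustrative choices, not part of the original slides.

```python
import math

def mean_response_interval(x, y, x0, t_crit):
    """Confidence interval for E(Y | X = x0) in simple linear regression."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    ss_x = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / ss_x
    b0 = y_bar - b1 * x_bar
    s_e = math.sqrt(sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y)) / (n - 2))
    # Standard error of the estimated mean response at x0.
    se_mean = s_e * math.sqrt(1 / n + (x0 - x_bar) ** 2 / ss_x)
    y_hat = b0 + b1 * x0
    return y_hat - t_crit * se_mean, y_hat + t_crit * se_mean

# Hypothetical data for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
low, high = mean_response_interval(x, y, x0=3, t_crit=3.182)
far_low, far_high = mean_response_interval(x, y, x0=5, t_crit=3.182)
```

The interval is narrowest at the mean of X and widens as the given X moves away from it, because of the $(X_i - \bar{X})^2$ term.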
## Prediction Interval for the Value of Y for a Given X

The prediction interval of $Y_i$ for a given value of $X_i$ is given by

$$
\hat{Y}_i \pm t_{\alpha/2,\, n-2} \, S_e \sqrt{1 + \frac{1}{n} + \frac{(X_i - \bar{X})^2}{\sum_{i=1}^{n} (X_i - \bar{X})^2}}
$$

where the term

$$
S_e \sqrt{1 + \frac{1}{n} + \frac{(X_i - \bar{X})^2}{\sum_{i=1}^{n} (X_i - \bar{X})^2}}
$$

is the standard error of $Y_i$ for a given value of $X_i$.
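The prediction interval differs from the confidence interval for the expected value only by the extra 1 under the square root. A sketch, with a hypothetical function name and data and a caller-supplied t critical value (3.182 for a 95% interval with 3 degrees of freedom):

```python
import math

def prediction_interval(x, y, x0, t_crit):
    """Prediction interval for a single new Y at X = x0 in SLR."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    ss_x = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / ss_x
    b0 = y_bar - b1 * x_bar
    s_e = math.sqrt(sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y)) / (n - 2))
    # The leading 1 accounts for the variation of an individual observation.
    se_pred = s_e * math.sqrt(1 + 1 / n + (x0 - x_bar) ** 2 / ss_x)
    y_hat = b0 + b1 * x0
    return y_hat - t_crit * se_pred, y_hat + t_crit * se_pred

# Hypothetical data for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
pred_low, pred_high = prediction_interval(x, y, x0=3, t_crit=3.182)
```

Because of the extra term, the prediction interval at any X is always wider than the confidence interval for the expected value at the same X.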
For large n, the confidence interval of E(Y|X …
