146 Pages

Week 2

Course: GEO 4167, Spring 2012
School: University of Florida

More on the Reliability, Precision, and Performance of the regression model and its estimated parameters

Because the least-squares coefficient/parameter estimates (the $\hat\beta_j$'s) and the SRF's ability to explain variation in the dependent variable (Y) can vary from sample to sample, we need measures of reliability and precision. Let us review some of the more useful indices and tests.

1. Standard error of $\hat\beta$ and related statistics

"Standard error": the standard deviation of the sampling distribution of $\hat\beta_j$, based on sample estimates from repeated samples of a given size n, and generally denoted $se(\hat\beta_j)$ for any given j (where j = 0, 1, ..., k), with

$se(\hat\beta_0)$: standard error of the estimated coefficient for the y-intercept or constant term; and
$se(\hat\beta_j)$: standard error of the estimated slope coefficient associated with the j-th regressor (j = 1, ..., k).

Hypothesis testing for estimated regression coefficients

The aim is to assess whether an estimate $\hat\beta_j$ differs significantly from a hypothesized value of $\beta_j$, as designated under a stated null hypothesis. For example, consider the two-tailed test:

$$H_0: \beta_j = \beta_{j,H_0} \qquad H_a: \beta_j \neq \beta_{j,H_0}, \qquad (j = 0, 1, \ldots, k)$$

Typically, the default test procedure assumes $\beta_{j,H_0} = 0$.

Test statistic (t-statistic):

$$t = \frac{\hat\beta_j - \beta_{j,H_0}}{se(\hat\beta_j)}$$

or, when we are testing whether the estimated beta coefficient is significantly different from zero:

$$t = \frac{\hat\beta_j}{se(\hat\beta_j)}$$

distributed as a t-distribution with n-k* degrees of freedom.

Two-tailed test criterion: if $|t| > |t_{\alpha/2}|$, we must "reject" Ho at the $(1-\alpha) \times 100\%$ level of confidence.

Note: Critical t-values are found for (n-k*) degrees of freedom, where n = sample size; k = number of regressors; and k* = k+1 = number of regression coefficients to be estimated, including the intercept term or constant.
[Figure: t-distribution probability density. Rejection regions (reject Ho) of area $\alpha/2$ lie in each tail, below $-t_{\alpha/2}$ and above $+t_{\alpha/2}$; the non-rejection region (fail to reject Ho) lies between them.]

In general, the t-tests on the individual $\hat\beta$'s allow us to evaluate the "explanatory power" and/or "statistical significance" of each individual explanatory variable in the model. Rule of thumb: the larger the absolute t-value, the greater the contribution of a variable (X) to explaining variation in the dependent variable Y.

For the bi-variate model, it can be shown that the "standard error" of the estimated slope parameter $\hat\beta_1$ is

$$se(\hat\beta_1) = \sqrt{\frac{\hat\sigma^2}{\sum_{i=1}^{n} (X_i - \bar X)^2}}, \qquad \text{where} \qquad \hat\sigma^2 = \frac{\sum_{i=1}^{n} \hat\varepsilon_i^2}{n-k^*}$$

is the error variance (with k* = 2). Recall that the square root of the error variance is the standard error of the estimate, or "root mean square error" (RMSE).

In our bi-variate example (the GPA model), $\hat\beta_1 = .12037$ and $se(\hat\beta_1) \approx .0259$.

Under the hypotheses Ho: $\beta_1 = 0$ versus Ha: $\beta_1 \neq 0$, the t-statistic is

$$t = \frac{\hat\beta_1 - 0}{se(\hat\beta_1)} = \frac{.12037}{.0259} \approx 4.64$$

Since t > t-critical ($\alpha$ = .05, 6 d.f., two-tailed test), that is, 4.64 > 2.447, we must "reject" the null hypothesis Ho: $\beta_1 = 0$ at the 95% confidence level in favor of the alternative hypothesis.

[Figure: t-distribution probability density with critical values at -2.447 and +2.447; the computed t = 4.64 falls in the upper rejection region.]

How much area under the curve is associated with the tails for | t | > 4.64?

[Figure: t-distribution probability density with the tail areas beyond -4.64 and +4.64 marked.]

We can answer this question using a probability calculator for 6 d.f. and a two-tailed application. Or we can simply look at our computer output, which might look something like this:

Variable    Estimate    Std. error    t       prob. (p)
constant    1.3750      0.3687        3.72    .0097
X           .12037      0.0259        4.64    .0035

The p-value for X tells us that the null hypothesis Ho: $\beta_1 = 0$ would be rejected up to the 99.65% confidence level, from (1 - p) x 100%. In general, if p < .05, then the corresponding estimated coefficient is significantly different from zero at the 95% confidence level.
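The slope test above can be reproduced from summary statistics alone. The following is a minimal sketch, assuming the error sum of squares (0.65277) and the sum of squared deviations of X (162) that are quoted elsewhere in these notes for the GPA example; the critical value 2.447 is the tabulated one used above, and all variable names are illustrative.

```python
import math

# Summary statistics quoted in the notes for the GPA example
n, k_star = 8, 2        # observations; estimated coefficients (incl. intercept)
beta1_hat = 0.12037     # estimated slope
sse = 0.65277           # error sum of squares
sxx = 162.0             # sum of squared deviations of X about its mean
t_crit = 2.447          # tabulated two-tailed critical t (alpha = .05, 6 d.f.)

sigma2_hat = sse / (n - k_star)          # error variance
rmse = math.sqrt(sigma2_hat)             # standard error of the estimate
se_beta1 = math.sqrt(sigma2_hat / sxx)   # standard error of the slope
t_stat = (beta1_hat - 0.0) / se_beta1    # test of Ho: beta1 = 0

reject_h0 = abs(t_stat) > t_crit
print(f"RMSE = {rmse:.4f}, se = {se_beta1:.4f}, t = {t_stat:.2f}, reject Ho: {reject_h0}")
```

The p-value is deliberately not computed here, since that would require a t-distribution routine from a statistics library.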
Similarly, we may test whether an estimated coefficient is significantly different from any hypothesized value. For example, suppose we hypothesize that every $1,000 increase in per capita household disposable income raises GPA by .1, so that Ho: $\beta_1 = .10$.

In our bi-variate example (the GPA model), $\hat\beta_1 = .12037$ and $se(\hat\beta_1) \approx .0259$.

Testing the null hypothesis Ho: $\beta_1 = .10$ against Ha: $\beta_1 \neq .10$, the t-statistic now becomes

$$t = \frac{\hat\beta_1 - .10}{se(\hat\beta_1)} = \frac{.12037 - .10}{.0259} \approx .786$$

Note: .786 < 2.447, leading us to "fail to reject" the null hypothesis at the 95% confidence level.

A general formula for finding the standard error of beta, for a multivariate model Y = f (X1, X2, ..., Xk):

$$se(\hat\beta_j) = \sqrt{\frac{\sum_{i=1}^{n} (Y_i - \hat Y_i)^2}{(n-k^*)\,\sum_{i=1}^{n} (X_{j,i} - \bar X_j)^2\,(1 - R_j^2)}}$$

where $R_j^2$ = the "coefficient of determination" obtained by regressing Xj (the variable in question) on all other independent variables in the model: Xj = f (all Xp's), where p $\neq$ j.

Confidence intervals for $\beta_j$ may be drawn about the point estimate, so that we can state the probability that intervals constructed in this way from samples of size n contain the true (unknown) parameter $\beta_j$. Thus,

$$P\left[\hat\beta_j - t_{\alpha/2}\,se(\hat\beta_j) < \beta_j < \hat\beta_j + t_{\alpha/2}\,se(\hat\beta_j)\right] = 1 - \alpha$$

where the $(1-\alpha) \times 100\%$ confidence interval for $\beta_j$ is defined as

$$\{\hat\beta_j \pm t_{\alpha/2}\,se(\hat\beta_j)\}$$

for a chosen level of significance $\alpha$, and $t_{\alpha/2}$ is again defined for a regression model with n-k* degrees of freedom from n observations and k regressors or explanatory variables.

In our example, one may verify that the 95% confidence interval for the slope parameter $\beta_1$ is

{ .12037 $\pm$ 2.447 (.0259) } or { .12037 $\pm$ .0634 }

that is, approximately between .057 and .184. Notice that this interval does not contain the value of zero. Implications? The result agrees with the t-test: Ho: $\beta_1 = 0$ is rejected at the 95% level, while the hypothesized value .10 lies inside the interval and cannot be rejected.
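The test against a non-zero hypothesized value and the confidence interval use the same three numbers. A short sketch, assuming the rounded estimates quoted above and the tabulated critical value 2.447 (the helper names are illustrative, not from any package):

```python
beta1_hat, se_beta1 = 0.12037, 0.0259   # estimates quoted in the notes
t_crit = 2.447                          # tabulated t (alpha = .05, 6 d.f.)

def t_statistic(estimate, hypothesized, std_err):
    """t = (estimate - hypothesized value) / standard error."""
    return (estimate - hypothesized) / std_err

def confidence_interval(estimate, std_err, t_critical):
    """Return (lower, upper) limits of estimate +/- t * se."""
    half_width = t_critical * std_err
    return estimate - half_width, estimate + half_width

t_10 = t_statistic(beta1_hat, 0.10, se_beta1)        # test of Ho: beta1 = .10
lower, upper = confidence_interval(beta1_hat, se_beta1, t_crit)

print(f"t = {t_10:.3f}; 95% CI = ({lower:.3f}, {upper:.3f})")
print("zero inside interval:", lower <= 0.0 <= upper)
print(".10 inside interval:", lower <= 0.10 <= upper)
```

A hypothesized value lying inside the interval corresponds to "fail to reject" in the two-tailed test at the same significance level.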
Standardized Beta Coefficients ($\beta^*$'s)

Consider a "standardized" linear regression model, where the dependent and independent variables are written in terms of their Z-scores (as standard normal deviates):

$$Y_i^* = \beta_0^* + \sum_{j=1}^{k} \beta_j^* X_{j,i}^* + \varepsilon_i^*$$

where

$Y_i^* = (Y_i - \bar Y) / s_Y$ are the Z-scores of Y, and
$X_{j,i}^* = (X_{j,i} - \bar X_j) / s_{X_j}$ are the Z-scores of Xj, for j = 1, ..., k explanatory variables.

Properties of standardized beta coefficients:

A. $\beta_0^* = 0$. The standardized beta coefficient $\beta_0^*$ will always equal zero (by definition). Thus, the Y-intercept of a regression model using standardized data will always be zero, and there is no need to estimate the constant term when using Z-scores as input to the model (note: the "no-intercept" option is available in most regression packages).

B. The slope terms of the "standardized regression" model may be found from the non-standardized OLS estimates. There is no need to standardize the raw data and re-estimate a regression model, as the standardized beta coefficients may be computed from estimates obtained from the original regression model. It can be shown that

$$\hat\beta_j^* = \hat\beta_j \left(\frac{s_{X_j}}{s_Y}\right)$$

Note that for the bi-variate model only, $\hat\beta_1^* = r_{xy}$; that is, the standardized beta coefficient is equal to Pearson's r.

C. Standardized beta coefficients are "unit free" measures and allow comparisons of value changes that are interpreted strictly in "standard deviation" terms.

In our example, the standardized beta coefficient is $\hat\beta_1^* = .885$. Interpretation: a 1.0 standard deviation change in X will lead to an estimated +.885 standard deviation change in Y.

Note that most statistical software packages will compute standardized regression coefficients right along with the OLS coefficient estimates (e.g., NCSS output).

2. ANOVA ("Analysis of Variance") and R-Square: the "Coefficient of Determination"

Decomposition of the total variation in Y, using the "sum of squares" principle, may be summarized as

TSS = ESS + RSS

where TSS is the total sum of squares of Y about its mean, ESS is the error sum of squares (unexplained), and RSS is the regression sum of squares (explained).

[Figure: scatter of Y against X with the SRF. For the i-th observation (Xi, Yi), the total variation of Y about its mean, $Y_i - \bar Y$, is split into the variation explained by the regression, $\hat Y_i - \bar Y$, and the unexplained variation (error), $\hat\varepsilon_i = Y_i - \hat Y_i$.]

$$\sum (Y_i - \bar Y)^2 = \sum (Y_i - \hat Y_i)^2 + \sum (\hat Y_i - \bar Y)^2 = \sum \hat\varepsilon_i^2 + \sum (\hat Y_i - \bar Y)^2$$

Dividing each side of the equation by TSS yields

1 = ESS/TSS + RSS/TSS

Note: the ratio (RSS/TSS) is the measure of "goodness of fit" known as the "coefficient of determination" ($r^2$, $R^2$):

RSS/TSS = 1 - (ESS/TSS)

In its expanded form, the coefficient of determination may be expressed as

$$\{r^2, R^2\} = RSS/TSS = 1 - \left[\frac{\sum \hat\varepsilon_i^2}{\sum (Y_i - \bar Y)^2}\right]$$

with $r^2$ for the bi-variate model Y = f(X), or $R^2$ for the multi-variate model Y = f(X1, X2, ..., Xk), where $0 \le \{r^2, R^2\} \le 1.0$; and

as {$r^2$, $R^2$} tends toward zero (0): poor fit;
as {$r^2$, $R^2$} tends toward unity (1): excellent fit.

In our example, the coefficient of determination is

$$r^2 = RSS/TSS = 1 - \left[\frac{\sum \hat\varepsilon_i^2}{\sum (Y_i - \bar Y)^2}\right] = 1 - [\,0.65277 / 3.0\,] = 1 - [.21759] \approx .7824$$

Interpretation: approximately 78.24% of the variation in Y about its mean is explained, or has been accounted for, by the independent variable(s) (i.e., by the regression model), indicating a relatively strong model in terms of explanation.

Note: The coefficient of determination is a "crude measure" of the strength or precision of the model.
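The decomposition can be checked with the two sums of squares given for the GPA example. This is a small sketch using the notes' naming convention (ESS = error sum of squares, RSS = regression sum of squares); the variable names are illustrative.

```python
tss = 3.0        # total sum of squares of Y about its mean
ess = 0.65277    # error (unexplained) sum of squares

rss = tss - ess                # regression (explained) sum of squares
r2_explained = rss / tss       # RSS / TSS
r2_residual = 1 - ess / tss    # 1 - ESS / TSS, the same quantity

print(f"RSS = {rss:.5f}, r-square = {r2_explained:.4f}")
```

Note that many textbooks swap the two abbreviations (ESS for "explained", RSS for "residual"), so it is worth checking which convention a given software package reports.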
Alternatively, we may express $r^2$, $R^2$ as

$$RSS/TSS = \frac{\sum (\hat Y_i - \bar Y)^2}{\sum (Y_i - \bar Y)^2} = 2.34722 / 3.0 \approx .7824$$

And for the bi-variate case only,

$$r^2 = \hat\beta_1^2 \left[\frac{\sum x_i^2}{\sum y_i^2}\right] = (.12037)^2\,[\,162 / 3.0\,] \approx .7824$$

where $x_i$ and $y_i$ denote deviations of X and Y from their respective means.

Note that for the bi-variate case (only), the square root of $r^2$ equals r (Pearson's product moment correlation coefficient), and the square of Pearson's r is equal to the coefficient of determination:

$$\sqrt{r^2} = \sqrt{.7824} = r_{xy} = .8845$$

Recall that for the bi-variate model (as shown earlier):

$$\hat\beta_1 = r_{xy}\,(s_y / s_x) \qquad \text{and} \qquad r_{xy} = \hat\beta_1\,(s_x / s_y)$$

Note: A small beta value (slope coefficient) does not mean that there is a weak statistical correlation or association between the variables. In short, $\hat\beta_1$ and $r_{xy}$ are measuring two different things. The slope coefficient of the bi-variate model measures the sensitivity of Y to changes in X, while a correlation coefficient measures the amount of linear statistical association between two variables.

In our example,

$$r_{xy} = \hat\beta_1\,(s_x / s_y) = .12037\,(4.8107 / 0.654654) \approx .8845$$

Note: Pearson's r suggests a very strong positive correlation between variables X and Y. The slope coefficient (.12037) suggests that for every change in X of 1 (that is, $1,000), Y is expected to increase by approximately .120. And we have already established that the value of .120 is significantly different from zero (i.e., there is statistical evidence of a significant association between X and Y, and the slope of the SRF is significantly different from zero). r and $\hat\beta_1$ are measuring two different properties.

Significance testing also reveals that the correlation between X and Y is statistically "significant": the estimated correlation is significantly different from zero. Consider the test hypotheses Ho: $\rho = 0$ versus Ha: $\rho \neq 0$, with Pearson's r = .8845 as an estimate of the unknown parameter rho ($\rho$).
The test statistic is t, and under Ho

$$t = \frac{r - \rho}{S_r} = \frac{r}{S_r}, \qquad \text{where} \qquad S_r = \sqrt{\frac{1 - r^2}{n-2}}$$

Rearranging terms gives us the classic t-test statistic in the form:

$$t = \frac{r\,\sqrt{n-2}}{\sqrt{1 - r^2}} = \frac{.8845\,\sqrt{6}}{\sqrt{1 - .7824}} \approx 4.64 > \text{t-critical of } 2.447 \ (6 \text{ d.f.}, \alpha = .05)$$

leading us to "reject" the null hypothesis Ho: $\rho = 0$ at the 95% confidence level.

For the bi-variate case only, Pearson's r and the t-test for r can be used to assess the overall significance of the regression model (note that this t-value is identical to the t-value for the slope). This, of course, will not hold for multi-variate regression.

3. The Coefficient of Variation (CV)

An alternative "goodness-of-fit" measure for a regression model is the coefficient of variation (CV). It is usually expressed in one of two ways:

(a) $CV = \hat\sigma / \bar Y$
(b) $CV = (\hat\sigma / \bar Y) \times 100\%$

As CV decreases, goodness of fit improves. Range: typically { 0 < CV < 1.0 } for (a), or { 0% < CV < 100% } for (b). Note that as $r^2$, $R^2$ increases, CV decreases (they are inversely related).

[Figure: scatter of Y against X with a flat SRF (best-fit trend line) at $\bar Y$; $r^2$ = 0 and CV = 1 (or 100%).] Note: here the SRF offers a very poor fit, and X offers virtually no explanation for the variation in Y.

In our example, CV = (.3298428 / 3.0) x 100% $\approx$ 10.99%. A low CV indicates a relatively "good fit".

4. Modified Coefficient of Efficiency (E*)

The "coefficient of efficiency" compares the observed values of Y with the predicted values of Y obtained from the SRF. Essentially, it is R-square:

$$E = 1.0 - \frac{\sum (Y_i - \hat Y_i)^2}{\sum (Y_i - \bar Y)^2}$$

Note that because the differences are squared, this index is very sensitive to "extreme values". To sidestep this potential shortcoming, the index may be modified to use the absolute values of the differences between observed and predicted values. The "modified coefficient of efficiency" E* may thus be defined as

$$E^* = 1.0 - \frac{\sum |Y_i - \hat Y_i|}{\sum |Y_i - \bar Y|}$$

ranging from zero to 1.0 (with a value of 1.0 indicating the highest level of efficiency in the prediction of values based on a model).
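To make the difference between E and E* concrete, here is a minimal sketch of both indices. The five observed and predicted values are hypothetical (they are not the GPA data, which these notes do not list), and the function names are illustrative.

```python
def efficiency(observed, predicted):
    """E = 1 - sum of squared errors / sum of squared deviations about the mean."""
    mean_y = sum(observed) / len(observed)
    sse = sum((y - p) ** 2 for y, p in zip(observed, predicted))
    tss = sum((y - mean_y) ** 2 for y in observed)
    return 1.0 - sse / tss

def modified_efficiency(observed, predicted):
    """E* = 1 - sum of absolute errors / sum of absolute deviations about the mean."""
    mean_y = sum(observed) / len(observed)
    abs_err = sum(abs(y - p) for y, p in zip(observed, predicted))
    abs_dev = sum(abs(y - mean_y) for y in observed)
    return 1.0 - abs_err / abs_dev

# Hypothetical observed values and SRF predictions (illustration only)
y_obs = [2.0, 4.0, 5.0, 4.0, 5.0]
y_hat = [2.8, 3.4, 4.0, 4.6, 5.2]

e = efficiency(y_obs, y_hat)
e_star = modified_efficiency(y_obs, y_hat)
print(f"E = {e:.2f}, E* = {e_star:.2f}")
```

For these made-up values E works out to 0.60 and E* to 0.20, which illustrates that the two indices can rank the same model quite differently.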
In our model of GPA as a function of X, it can be shown that the modified coefficient of efficiency is E* = 1 - (2.0 / 4) = .50, suggesting a "moderately" efficient model in terms of its predictive abilities.

Note: Like R-square and CV, E* is a useful "descriptive" measure, but nothing more.

5. The F-test or F ratio

A statistical test for "goodness of fit" of the sample regression function (SRF). It is used to validate the overall strength of a regression model and the explanatory power of the independent variable(s). It is a formal procedure that addresses how likely it is that the relationship between the observed and fitted (predicted) values of Y is simply due to "chance".

The worst-case scenario (a benchmark) is the basis for our null hypothesis:

Ho: $\beta_1 = 0$ (bi-variate model), or Ho: $\beta_1 = \beta_2 = \ldots = \beta_k = 0$

versus Ha: not all the $\beta_j$'s (j > 0) are equal to zero; that is, at least one slope coefficient is significantly different from zero.

The F-ratio may be expressed as

F = (mean square due to regression) / (mean square about regression) = Explained Variation / Unexplained Variation

as derived from the ANOVA table results. Formally, the F statistic is defined as follows:

$$F = \frac{\left[\sum (\hat Y_i - \bar Y)^2\right] / (k^* - 1)}{\left[\sum (Y_i - \hat Y_i)^2\right] / (n - k^*)}$$

distributed as F with $v_1$ and $v_2$ degrees of freedom, where $v_1$ = k*-1 and $v_2$ = n-k* are the d.f. associated with the numerator and denominator of the ratio, respectively.

Algebraic manipulation of the F statistic shows that it is related to the coefficient of determination (R-square). More specifically, we can write F in terms of R-square:

$$F = \frac{(n-k^*)}{(k^*-1)} \cdot \frac{R^2}{1 - R^2}$$

Alternatively, the F statistic may be stated using the ANOVA components. In our example...
$$F = \frac{RSS / (k^*-1)}{ESS / (n-k^*)} = \frac{\text{mean square (explained by SRF)}}{\text{mean square error (unexplained)}} = \frac{2.34722 / (2-1)}{0.65277 / (8-2)} \approx 21.574$$

Computing F in terms of R-square yields the same answer:

$$F = \frac{(n-k^*)}{(k^*-1)} \cdot \frac{R^2}{1 - R^2} = \frac{(8-2)}{(2-1)} \cdot \frac{.78240}{.21759} = 6\,[\,3.5957\,] \approx 21.574$$

Critical value of F and test result: $F_\alpha$ = 5.99 for $\alpha$ = .05 with {$v_1$ = 1, $v_2$ = 6} d.f. As F > $F_\alpha$ (21.57 > 5.99), we must "reject" Ho: $\beta_1 = 0$ at the 95% confidence level.

Test procedure: should F > $F_\alpha$, reject the null hypothesis at the $(1-\alpha) \times 100\%$ confidence level.

[Figure: F-distribution probability density. The non-rejection region lies below the critical value $F_\alpha$ = 5.99 and the rejection region above it; the computed F = 21.57 is a highly improbable value under Ho.]

There is statistical evidence that the variation in Y accounted for by the independent variable X is a statistically significant amount, and that the amount of explained variation is unlikely to have been generated by chance. In fact, if Ho were true, the probability of obtaining an F value this large would be only about .004. In short, the variable in question contributes a statistically significant degree of explanation.

Computer output: F = 21.574 (p = .004). Reject Ho: $\beta_1 = 0$ up to the 99.6% confidence level.

For any regression model with k regressors, k+1 = k* estimated coefficients, and n-k* degrees of freedom,

$$Y_i = \beta_0 + \sum_{j=1}^{k} \beta_j X_{ji} + \varepsilon_i$$

with an SRF and error vector estimated for i = 1, ..., n sample observations, it can be shown that R-square is a function of F:

$$R^2 = \frac{F\,(k^* - 1) / (n-k^*)}{1 + F\,(k^*-1) / (n-k^*)}$$

When F is zero, $R^2$ is also zero; and as F becomes indefinitely large (F $\to \infty$), $R^2 \to 1.0$.

Since we may write

$$R^2 = 1 - \left[\frac{\sum \hat\varepsilon_i^2}{\sum (Y_i - \bar Y)^2}\right] = 1 - \left[\frac{\sum \hat\varepsilon_i^2}{\sum y_i^2}\right]$$

it follows that

$$\sum \hat\varepsilon_i^2 = -\sum y_i^2\,(R^2 - 1) = \sum y_i^2\,(1 - R^2)$$

Hence, minimization of the sum of squared errors seems to imply that we should somehow maximize $R^2$, since the above equation shows that as $R^2$ increases, $\sum \hat\varepsilon_i^2$ decreases. This logic is flawed, however.
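The two-way relationship between F and R-square derived above can be checked numerically. A brief sketch, assuming the GPA example's values (R-square = .782407, n = 8, k* = 2) and the tabulated critical value 5.99; the function names are illustrative.

```python
def f_from_r2(r2, n, k_star):
    """F = [(n - k*) / (k* - 1)] * [R^2 / (1 - R^2)]."""
    return ((n - k_star) / (k_star - 1)) * (r2 / (1.0 - r2))

def r2_from_f(f, n, k_star):
    """R^2 = c / (1 + c), where c = F (k* - 1) / (n - k*)."""
    c = f * (k_star - 1) / (n - k_star)
    return c / (1.0 + c)

n, k_star = 8, 2
f_stat = f_from_r2(0.782407, n, k_star)
r2_back = r2_from_f(f_stat, n, k_star)   # round trip recovers R-square
f_crit = 5.99                            # tabulated F (alpha = .05; 1 and 6 d.f.)

print(f"F = {f_stat:.3f}, R-square recovered = {r2_back:.6f}")
print("reject Ho:", f_stat > f_crit)
```

In the bi-variate case F is simply the square of the slope's t-statistic (4.645 squared is about 21.57), which is why the two tests agree.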
Regression Model                             R-square
Y = f (X1)                                   .7824
Y = f (X1, X2)                               .8405
Y = f (X1, X2, X3)                           .8509
Y = f (X1, X2, X3, X4)                       .8531
Y = f (X1, X2, X3, X4, X5)                   .8540
Y = f (X1, X2, X3, X4, X5, X6)               .8542
Y = f (X1, X2, X3, X4, X5, X6, X7)           .8543

Note the declining increase in $R^2$ with additional regressors (X's). The last several increases may be strictly due to chance.

Assume that the explanatory variables X5, X6, and X7 are "random variables": variables whose values are generated at random, say by a random-number generator. Even as random variables are added to a model, the sum of squared errors is likely to decline, as there will always be a chance that an additional proportion of the remaining or residual variation in Y will be accounted for. Thus, the goal of maximizing R-square is not advisable! Consequently, $R^2$ is not a reliable measure of the overall strength of a regression model.

A better approach involves "adjusting" the R-square value to account for the trade-off inherent in adding explanatory variables: each added regressor (X) increases the complexity of the model and raises the possibility that the extra explanation is simply due to chance, and not to information provided by the variables that are sequentially added to the model.

6. Adjusted R-square: the adjusted coefficient of determination

A preferred index of "goodness of fit" and efficiency is "adjusted $R^2$", sometimes written as $\bar R^2$. Adjusted R-square adjusts for the loss of degrees of freedom in a model as regressors (X's) are added, basically assessing a penalty for adding variables to the model (under the assumption that some portion of the increase in explanatory power is likely brought about by chance). Hence, the goal of maximizing adjusted $R^2$ is one that can be more easily defended.
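The penalty can be illustrated with the R-square values in the table above, using the usual formula adjusted R-square = 1 - (1 - R-square)(n - 1)/(n - k*). The sketch below assumes a sample size of n = 20, which is hypothetical: the notes do not state the sample size behind the table.

```python
def adjusted_r2(r2, n, k):
    """Adjusted R-square for k regressors (k* = k + 1 estimated coefficients)."""
    k_star = k + 1
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k_star)

# R-square by number of regressors, as listed in the table
r2_by_k = {1: .7824, 2: .8405, 3: .8509, 4: .8531, 5: .8540, 6: .8542, 7: .8543}
n = 20   # hypothetical sample size, assumed for illustration only

adj_by_k = {k: adjusted_r2(r2, n, k) for k, r2 in r2_by_k.items()}
best_k = max(adj_by_k, key=adj_by_k.get)

for k in sorted(r2_by_k):
    print(f"k = {k}: R-square = {r2_by_k[k]:.4f}, adjusted = {adj_by_k[k]:.4f}")
print("adjusted R-square peaks at k =", best_k)
```

Under this assumed n, R-square keeps rising through all seven models while adjusted R-square peaks at three regressors and then declines, which is the behavior the penalty is designed to produce.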
Adjusted R-square can be formally expressed in terms of n, k*, and $R^2$:

$$\bar R^2 = 1 - (1 - R^2)\,\frac{(n-1)}{(n-k^*)}$$

The ratio (n-1)/(n-k*) is the penalty assessed for sample size and limited degrees of freedom as k (and k*) increases. Adjusted R-square is the preferred index for measuring the amount of explained variation that is accounted for by the independent variables in the model:

(adjusted R-square) x 100% = % of variation in Y accounted for by the regression (the SRF).

In our example,

$$\bar R^2 = 1 - (1 - R^2)\,\frac{(n-1)}{(n-k^*)} = 1 - (1 - .782407)\,[\,(8-1) / (8-2)\,] \approx .7461$$

(recall that $R^2$ = .7824). Note: adjusted R-square < R-square.

Interpretation of an "adjusted R-square" value of .746: approximately 74.6% of the variation in Y is explained or accounted for by the variable X in the model Y = f (X). Note that the adjusted R-square takes into account the possibility that $R^2$ might be slightly inflated due to chance and/or the effects of a small sample size (limited degrees of freedom).

Linear Regression Functional Forms and Transformation of Variables

A review of some commonly used "linear models" and various transformations to achieve linearity.

Case #1: Curvilinear Function A ($\beta_1 > 0$, $X_i > 0$)

$Y = \beta_0 X^{\beta_1}$ "power function"

Transformation (double-log model): $\log Y = \log \beta_0 + \beta_1 \log X + \varepsilon$

[Figure: power-function curves of Y against X for $\beta_1 > 1.0$ and for $0 < \beta_1 < 1.0$.]

a.k.a. the "Double-log model", "Log-Linear model", or "Constant Elasticity model".

Consider the following "power function" used to model Y, suspended sediment flows (mg./liter), as a function of X, water discharge (cubic ft. per second):

$$Y = \beta_0 X^{\beta_1} \epsilon^{u}$$

where $\epsilon$ is the base of the logarithm ($\epsilon$ = e when using natural logs), $\beta_0$ and $\beta_1$ are parameters to be estimated, and u is the error term or stochastic disturbance.

Logarithmic transformation (using base e logs) yields

$$\ln Y = \ln \beta_0 + \beta_1 \ln X + u$$

where $\ln \beta_0$ is the constant term and $\beta_1$ is the slope term. Note that in this model, the slope coefficient measures the "elasticity" of Y with respect to X: the % change in Y for a very small % change in X (a point-elasticity measure).
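The double-log transformation can be sketched with a small bi-variate OLS helper. The data below are hypothetical and generated exactly from Y = 3 X^0.5 with no error term, so the regression on logs should recover the exponent and, after taking the antilog, the constant. The helper name `ols_line` is illustrative.

```python
import math

def ols_line(x, y):
    """Return (intercept, slope) of the least-squares line of y on x."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    sxx = sum((a - x_bar) ** 2 for a in x)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope

# Hypothetical data following Y = 3 * X**0.5 exactly (illustration only)
X = [1.0, 2.0, 4.0, 8.0, 16.0]
Y = [3.0 * x ** 0.5 for x in X]

# Regress ln Y on ln X: the slope is the elasticity, the intercept is ln(beta0)
ln_b0, b1 = ols_line([math.log(x) for x in X], [math.log(y) for y in Y])
b0 = math.exp(ln_b0)   # antilog recovers the multiplicative constant

print(f"beta0 = {b0:.4f}, beta1 (elasticity) = {b1:.4f}")
```

With real, noisy data the antilog of the estimated intercept is a biased estimate of the constant, so this exact recovery holds only for error-free data like the example.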
An "elasticity" is usually calculated at the point of the means of each of the independent variables of the model. For the j-th explanatory variable, the j-th elasticity may be expressed as

$$E_j = \hat\beta_j \left[\frac{\bar X_j}{\bar Y}\right] \approx \frac{\Delta Y / \bar Y}{\Delta X_j / \bar X_j} = \frac{\%\ \text{change in Y (about mean)}}{\%\ \text{change in } X_j\ \text{(about mean)}}$$

that is, the ratio of percentage changes in Y and Xj about their respective means.

Note: Elasticity values $E_j$ are "unbounded" and may be positive or negative depending on the sign of the estimated $\hat\beta_j$'s. When using the double-log model, elasticities are read directly from the regression output, as they are the estimated slope coefficients.

Elasticities are particularly useful measures because, like standardized beta coefficients, they are "unit free" (i.e., their values are independent of the units in which the variables are measured).

For example, if $E_j$ = 2.0, we can say that (about the means of Y and Xj) a 1% increase in Xj will lead to a 2% increase in the dependent variable Y (as $E_j$ = 2 = 2%/1%). If $E_j$ = -.5, it can be said that a 1% increase in Xj will lead to a .5% decrease in Y.

In our example,

$$E_1 = \hat\beta_1\,[\bar X_1 / \bar Y] = .12037\,[\,13.5 / 3.0\,] \approx .542$$

meaning that a 1% change in X (per capita household disposable income) will lead to an "expected" .54% change in Y (GPA).

Case #2: Curvilinear Function B ($\beta_1 < 0$ and $X_i > 0$)

$Y = \beta_0 X^{-\beta_1}$ "negative power function"

Transformation (double-log model): $\log Y = \log \beta_0 - \beta_1 \log X + \varepsilon$

[Figure: downward-sloping power-function curves of Y against X for $0 < |\beta_1| < 1.0$ and for $|\beta_1| > 1.0$.]

This form is used, for example, to model urban land-rent gradients.

[Figure: land value LV ($/unit) declining with X, the distance from the central node, down to the cost of conversion C at the urban fringe, LV = f(X); after linearizing, the sample regression function is a straight line of Ln(LV) against Ln(X).]

The double-log model (type B) is also used for the estimation of "gravity models", in particular the single-origin production-constrained gravity model.
Consider the simple gravity model used to model spatial flows $I_{ij}$ from a given origin i to various destinations j (j = 1, ..., n) as a multiplicative/power function with a constant of proportionality $k_0$, the mass or size $M_j$ of destination j, and the distance $d_{ij}$ between origin i and destination j, where $u_{ij}$ is the disturbance or error term:

$$I_{ij} = k_0\,M_j^{\alpha}\,d_{ij}^{-\beta}\,u_{ij}$$

where $\alpha$ is the mass attraction parameter and $\beta$ is the "friction-of-distance" parameter that describes how the propensity of spatial flows declines with increasing distance.

Note on the constant (Y-intercept term):

$\hat\beta_0 = \text{antilog}(\widehat{\ln \beta_0}) = e^{\widehat{\ln \beta_0}}$ (for natural logs)
$\hat\beta_0 = \text{antilog}(\widehat{\log \beta_0}) = 10^{\widehat{\log \beta_0}}$ (for base-10 logs)

and $\hat\beta_0$ is a "biased estimator" of $\beta_0$. As such, the estimated intercept term might have to be "adjusted". In most applications, however, this is not a problem, as the intercept term is of secondary importance in the modeling process.

Ad hoc test for bias: apply an equality-of-means procedure to test whether the mean of the observed values of Y is equal to the mean of the predicted values of Y, once the predicted logarithmic values are "recovered" (converted back into their raw or non-logarithmic form). The difference between the predicted mean and the observed mean can be used to adjust the predicted values (upward or downward) by a constant shift.
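The constant-shift adjustment described above might be sketched as follows. All numbers are hypothetical, the variable names are illustrative, and the formal equality-of-means test is not carried out here; the sketch only shows the recovery step and the shift.

```python
import math

def mean(values):
    return sum(values) / len(values)

# Hypothetical observed Y values and fitted values of ln(Y) from a log model
y_observed = [12.0, 15.0, 22.0, 30.0]
ln_y_predicted = [2.4, 2.7, 3.0, 3.4]

# "Recover" predictions in raw (non-logarithmic) units via the antilog
y_recovered = [math.exp(v) for v in ln_y_predicted]

# Constant shift that aligns the predicted mean with the observed mean
shift = mean(y_observed) - mean(y_recovered)
y_adjusted = [v + shift for v in y_recovered]

print(f"observed mean = {mean(y_observed):.3f}")
print(f"recovered mean = {mean(y_recovered):.3f}, shift = {shift:.3f}")
```

An additive shift is the approach the notes describe; other corrections for retransformation bias exist, but they are outside the scope of this sketch.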
[Figure: observed mean of Y versus predicted mean of Y.]

Case #3: Exponential Function A ($\beta_1 > 0$)

$Y = \beta_0 e^{\beta_1 X}$ "exponential function"

Transformation (semi-log model in Y): $\ln Y = \ln \beta_0 + \beta_1 X + \varepsilon$

The curve shows an explosive upward trend as X increases.

Applications: population growth models, economic growth modeling.

Case #4: Exponential Function B ($\beta_1 < 0$)

$Y = \beta_0 e^{-\beta_1 X}$ "negative exponential function"

Transformation (semi-log model in Y): $\ln Y = \ln \beta_0 - \beta_1 X + \varepsilon$

The curve shows a steep downward trend as X increases.

Applications: population density gradients, where declining population density D with increasing distance X from a central point P is expressed as a negative exponential function:

$$D(X) = D(0)\,e^{-\beta_1 X}$$

Transformation yields:

$$\ln D(X) = \ln D(0) - \beta_1 X + \varepsilon$$

At X = 0, $e^{-\beta_1 X}$ = 1.0; hence, density at X = 0 is equal to D(0).

[Figure: density D declining with distance X from the central point P out to the urban fringe.]

Note: $\hat\beta_1$ is an estimate of the density gradient.

Case #5a and 5b: Semi-log in X ($\beta_1 > 0$, shown in black in the original figure) and Semi-log in X ($\beta_1 < 0$, shown in red)

Function and transformation:

$Y = \beta_0 + \beta_1 (\log X)$
$Y = \beta_0 - \beta_1 (\log X)$

In general, "semi-log models" (in Y or X) are commonly used to linearize a curvilinear trend, for situations where a linear model is not appropriate.

[Figure: two panels. Left: Y against X with a curvilinear scatter and a straight-line SRF that fits poorly. Right: after transforming X, Y against Log X with an SRF that fits well.]

Running the model $Y = \beta_0 + \beta_1 X + \varepsilon$ would not work here, whereas running the model $Y = \beta_0 + \beta_1 \log(X) + \varepsilon$ would work fine.

A note on "semi-log" models

Consider the semi-log specifications shown below:

(A) $\ln Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i$
(B) $Y_i = \beta_0 + \beta_1 \ln X_i + \varepsilon_i$

Model A is known as the "constant growth" model, where $\beta_1$ = (relative change in Y) / (absolute change in X); whereas Model B is a semi-log in X model, where $\beta_1$ = (absolute change in Y) / (relative change in X).

Examples:
model A: population growth as a function of time;
model B: life expectancy as a function of regional economic growth or development.

Case #6: Reciprocal model with "positive curvature" (where $\beta_1 > 0$)

The function is such that it asymptotically approaches the constants $1/\beta_0$ and $\beta_1/\beta_0$.
Function: $Y = \dfrac{X}{\beta_0 X - \beta_1}$ (implied as $\beta_1 > 0$)

Transformation: $Y^{-1} = \beta_0 - \beta_1 X^{-1}$

[Figure: Y against X with asymptotes at $Y = 1/\beta_0$ and $X = \beta_1/\beta_0$.]

Applications: population density gradients and economic development status models.

Case #7: Reciprocal model with "negative curvature" (where $\beta_1 < 0$)

The function is such that it asymptotically approaches the constants $1/\beta_0$ and $\beta_1/\beta_0$.

Function: $Y = \dfrac{X}{\beta_0 X + \beta_1}$ (implied as $\beta_1 < 0$)

Transformation: $Y^{-1} = \beta_0 + \beta_1 X^{-1}$

[Figure: Y against X rising toward the asymptote $Y = 1/\beta_0$.]

Case #8: Semi-Reciprocal model (where $\beta_0 > 0$)

Function: $Y_i = \beta_0 + \beta_1 X_i^{-1} + \varepsilon_i$

Note: As X $\to \infty$, the term $\beta_1 X^{-1} \to 0$ and Y $\to \beta_0$ (the limiting value). If $\beta_0$ and $\beta_1$ are both positive, the reciprocal transformation model shows that Y decreases non-linearly as X increases.

Applications: land-use and population density gradients, economic development status models, trade areas and distance-decay effects.

[Figure: Y against $X^{-1}$ with the SRF (after transformation) $\hat Y = \hat\beta_0 + \hat\beta_1 X^{-1}$, allowing us to estimate the Y-intercept term $\beta_0$.]

Case #9: The "Logistic Function"

Y = f(X), an S-shaped curve with an inflection point, where Y approaches a "limiting value" (equal to 1.0 in the form given below); e.g., Y = built-up urban area as a function of X = time.

The logistic function may be expressed mathematically as

$$Y = \frac{e^{\beta_0 + \beta_1 X}}{1 + e^{\beta_0 + \beta_1 X}}$$

Post transformation, the function takes on a semi-log linear form:

$$\ln\left(\frac{Y}{1 - Y}\right) = \beta_0 + \beta_1 X$$

The notes then re-write the model (for i = 1, ..., n sample observations of Y and X) as

$$\ln(Y_i^{-1} - 1) = \beta_0 + \beta_1 X_i + \varepsilon_i$$

Because $\ln(Y^{-1} - 1) = -\ln[\,Y/(1-Y)\,]$, the coefficients in this second form have the opposite signs to those in the first form. Writing $Y_i^* = \ln(Y_i^{-1} - 1)$, the SRF (post transformation) is the straight line

$$\hat Y_i^* = \hat\beta_0 + \hat\beta_1 X_i$$

[Figure: the S-shaped logistic curve of Y against X, and the linear SRF of Y* against X after transformation.]

The "prediction equation" for any value of $X_i$, consistent with the second form, is

$$\hat Y_i = \left[\,1 + e^{\hat\beta_0 + \hat\beta_1 X_i}\,\right]^{-1}$$

Product Life Cycle and Innovation Diffusion models

Product life cycle (market saturation over time), where Y = cumulative unit sales or revenue and X = time. A 4-stage model:

I. Product intro
II. Take-off period
III. Wide acceptance
IV. Market saturation

[Figure: S-shaped curve of Y against X (time) through stages I to IV, with the inflection point marked and "satiation & obsolescence" at the top of the curve.]

Case #10: Polynomials

Consider the m-th order polynomial expression:

$$\sum_{j=0}^{m} \beta_j X^j$$

where m is a non-negative integer and each coefficient $\beta_j$ is a real number. A polynomial regression may thus be defined as follows:

$$Y_i = \sum_{j=0}^{m} \beta_j X_i^j + \varepsilon_i$$

A polynomial of "order 1" is for modeling a straight-line relationship, where m = 1:

$$Y_i = \sum_{j=0}^{m} \beta_j X_i^j + \varepsilon_i = \beta_0 + \beta_1 X_i + \varepsilon_i$$

Hence, the bi-variate linear model is a "special case" of the more generalized m-th order polynomial, where m $\ge$ 1.

A polynomial of "order 2" is for modeling a curvilinear relationship, where m = 2 (also known as the "quadratic function"):

$$Y_i = \sum_{j=0}^{m} \beta_j X_i^j + \varepsilon_i = \beta_0 + \beta_1 X_i + \beta_2 X_i^2 + \varepsilon_i$$

[Figure: Y against X with a curved SRF that rises and then flattens.] Note: from the way that the curve is drawn, we would expect that $\beta_1 > 0$ and $\beta_2 < 0$.

A polynomial of "order 3" is for modeling a sinusoidal or wave-like relationship, where m = 3 (also known as the "cubic function"):

$$Y_i = \sum_{j=0}^{m} \beta_j X_i^j + \varepsilon_i = \beta_0 + \beta_1 X_i + \beta_2 X_i^2 + \beta_3 X_i^3 + \varepsilon_i$$

[Figure: Y against X with a wave-like SRF; markers denote directional changes in the trend line.]

Determining polynomial order

For polynomials of order v, where v > 3, the order v is always one greater than the number of "directional changes" in the trend line or SRF, or equal to the number of observed trend segments.

[Figure: a 4-th order polynomial used to model the trend in the relationship between Y and X.]

[Figure: a 5-th order polynomial used to model the trend in the relationship between Y and X.] Note that in this case it is useful to assume one order up, that is, a 5-th order polynomial, to capture subtle up-turns at the extremes.

Application. Consider the following data and model of electrical power consumption/usage (Y, in kilowatt hours) as a function of the size of a home (X, in sq. feet) for n = 10 randomly selected residential households in a given urban area:

Obs.    Y (kw hrs)    X (sq. feet)
1       1,182         1,290
2       1,172         1,350
3       1,264         1,470
4       1,493         1,600
5       1,571         1,710
6       1,711         1,840
7       1,804         1,980
8       1,840         2,230
9       1,956         2,400
10      1,954         2,930

[Figure: scatter of Y against X showing a curvilinear trend; a 2nd-order polynomial is implied.]

Suppose that our goal is to build a model that will predict power consumption/usage to within 100 kw hrs based solely on the size of a house, deeming that an acceptable level of accuracy.

$$Y_i = \sum_{j=0}^{m} \beta_j X_i^j + \varepsilon_i = \beta_0 + \beta_1 X_i + \beta_2 X_i^2 + \varepsilon_i$$

Estimation of the model using OLS regression yields:

$$\hat Y_i = -1{,}216.1 + 2.3989\,X_i - 0.00045\,X_i^2$$

Term        Estimate     t         prob.
constant    -1,216.1     -5.009    (.0015)
X           2.3989       9.758     (.00003)
X-squared   -0.00045     -7.618    (.00012)

where all estimated coefficients test significantly different from zero at the 99% confidence level.

Suppose that the adjusted R-square is approx. 0.97, indicating that the model has a very good fit overall (note: F = 189.71 with p = .0000, and a normal error structure).

Suppose that the mean square error is

$$\hat\sigma^2 = \frac{\sum_{i=1}^{n} \hat\varepsilon_i^2}{n-k^*} = 15{,}332.52 / (10-3) = 2{,}190.36 \ (\text{kw hrs})^2$$

with a "root mean square error" (RMSE) of $\hat\sigma$ = 46.8 kw hrs (approx.) [a.k.a. the "standard error of the estimate"].

Interpretation of RMSE: We expect that our 2nd-order polynomial model will predict electrical power usage to within (plus or minus) $2\hat\sigma$ = 93.6 kw hrs, on average. Hence, (2 x RMSE) is roughly a measure of the power of the predictive model, and in this case it falls within the prescribed level of acceptability. This is something that adjusted R-square cannot tell us!

In a curvilinear model (2nd-order polynomial), the estimated $\beta_2$ parameter is said to measure the "curvature" of the response curve. [Note that in a straight-line relationship, it is expected that $\beta_2 = 0$, implying no curvature.]

[Figure: two panels of Y against X. With $\beta_2 > 0$ the curve shows upward curvature; with $\beta_2 < 0$ it shows downward curvature.]

If $\beta_2 > 0$ ($\beta_2 < 0$), the slope of the curve will increase (decrease) with increasing values of X.
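The quadratic fit can be reproduced from the ten observations listed above. The sketch below solves the OLS normal equations in plain Python; rescaling X to thousands of square feet before solving is a choice made here for numerical stability, not something the notes prescribe, and the helper `solve` is illustrative.

```python
import math

# Data from the notes: usage Y (kw hrs) and home size X (sq. feet), n = 10
X = [1290, 1350, 1470, 1600, 1710, 1840, 1980, 2230, 2400, 2930]
Y = [1182, 1172, 1264, 1493, 1571, 1711, 1804, 1840, 1956, 1954]

def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    size = len(m)
    for col in range(size):
        pivot = max(range(col, size), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(size):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][size] / m[i][i] for i in range(size)]

# Work in thousands of sq. feet so the normal equations stay well conditioned
xs = [x / 1000 for x in X]
design = [[1.0, x, x * x] for x in xs]
xtx = [[sum(r[i] * r[j] for r in design) for j in range(3)] for i in range(3)]
xty = [sum(r[i] * y for r, y in zip(design, Y)) for i in range(3)]
c0, c1, c2 = solve(xtx, xty)

# Convert back to coefficients on X and X^2 in the original units
b0, b1, b2 = c0, c1 / 1000, c2 / 1_000_000

fitted = [b0 + b1 * x + b2 * x * x for x in X]
sse = sum((y - f) ** 2 for y, f in zip(Y, fitted))
rmse = math.sqrt(sse / (len(Y) - 3))   # k* = 3 estimated coefficients

print(f"Y-hat = {b0:.1f} + {b1:.4f} X + {b2:.8f} X^2")
print(f"RMSE = {rmse:.1f} kw hrs; 2 x RMSE = {2 * rmse:.1f} kw hrs")
```

In practice a regression package would be used instead of hand-rolled normal equations; the point here is only that the reported coefficients and RMSE follow directly from the listed data.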
We could perform a one-tailed test procedure to evaluate Ho: $\beta_2 = 0$ versus Ha: $\beta_2 > 0$ (or $\beta_2 < 0$), or evaluate the confidence interval for this parameter estimate. It can be shown that our confidence interval for $\beta_j$ (j = 2),

$$\{\hat\beta_j \pm t_{\alpha/2}\,se(\hat\beta_j)\}$$

is { -0.000450 $\pm$ (2.365) (.0000591) } or [ -0.000590 < $\beta_2$ < -0.000310 ]. Note that this confidence interval does not contain the value of zero, consistent with our t-test results.

I. Testing the "Normality" Assumption of Regression Residuals

Most commonly used in regression analysis:

Shapiro-Wilk test statistic (W)
Kolmogorov-Smirnov (K-S) test
D'Agostino's battery of tests (3 in all): D'Agostino skewness, kurtosis, and omnibus

Why should we be concerned with normality of error terms from a regression model?

Consequences of Non-Normal Error

Non-normality of error does not interfere with the properties of OLS estimators; they are still said to be BLUE ("best linear unbiased estimates/estimators"). In other words, OLS estimators are "unbiased", even if the error terms of the model are "non-normal".

Then why should we be concerned with normality? The assumption of normality of error plays a crucial role in the tests of model structure and significance; in particular, the significance/inferential tests performed on the estimated beta parameters of the regression model.

Should the error terms be "non-normal", the estimates of the standard errors of the beta parameters are no longer "efficient": non-normality leads to "inflated" standard errors.

Consequences: as the standard errors of the betas increase, the t-statistics decrease, resulting in the under-estimation of the statistical significance of the betas and their associated explanatory variables. In short, there is a greater likelihood of making TYPE II errors in hypothesis testing: "failing to reject" Ho: $\beta_j = 0$ when the null should be rejected.
Shapiro-Wilk (SW) test [W statistic]

A particularly good test when sample size is small, especially when n < 50 or n < 30 (good up to n=2000); a preferred test of normality in regression error.
Used as a pre-test procedure to validate "normality" (that a variable from a sample of n observations was drawn from a normal population).
A useful alternative to "normality plots" (which are purely descriptive in nature).
Requires the use of SW tables: (a) coefficients; (b) critical values.

Null Hypothesis: Normality

Regarding critical values: if W > W(critical) for the given n, we must "Fail to Reject Normality". N.B.: this is the opposite of the usual rule, in which a large test statistic leads to rejection.

The Shapiro-Wilk statistic (W) may be formally expressed as

  $W = b^2 / [(n-1)\, s^2]$, where $b = \sum_{j=1}^{k} a_j \{ X_{(n-j+1)} - X_{(j)} \}$;

$a_j$ = SW-coefficients (j=1,..., k); n = sample size (number of observations); $s^2$ = unbiased sample variance estimate (the version with (n-1) in the denominator); and $X_{(i)}$ is the "ordered" vector of some variable X (in ascending order, lowest to highest values) for i=1,..., n observations of X, such that $X_{(1)} \le X_{(2)} \le \dots \le X_{(n)}$.

Note that the value of k (number of test coefficients) depends on the number of observations (n):

  k = n/2         when n is "even"
  k = n/2 + 1/2   when n is "odd"

Tabled values for the k coefficients are typically given for n: { 2 < n < 50 }. Note also that when n is "odd", $a_k$ = 0.0 (always), given that a median or middle value is known to exist and acts as a stand-alone pivot point.

Suppose that n=8, so that k=4. Thus, there are four SW coefficients {$a_j$} used in the calculation of W, defined as follows (from tabled values):

  $a_1$ = .6052   $a_2$ = .3164   $a_3$ = .1743   $a_4$ = .0561

Application. Consider a variable X measuring inches of rainfall from n=8 randomly sampled rain gauges in a given region:

  X(i) vector (unordered): 80, 70, 60, 55, 72, 73, 81, 50

with a mean of approximately 67.63 inches and s = 11.43 (the unbiased estimate of the sample standard deviation).
Goal (or implied Null Hypothesis): Test if the vector of values is not significantly different from a "normal" distribution at the 95% confidence level.

  H0: Normality vs. Ha: Non-normality

To carry out our Shapiro-Wilk test we must...

Step 1. Arrange values in ascending order, X(i)a = {50, 55, 60, 70, 72, 73, 80, 81}, and re-label observations in the ordered vector as { $X_{(1)}, X_{(2)}, \dots, X_{(n-1)}, X_{(n)}$ }.

Step 2. Using the "ordered" observations of the variable X and our SW coefficients {$a_j$}, j=1,..,k, we may define b:

  b = .6052(81-50) + .3164(80-55) + .1743(73-60) + .0561(72-70) = 29.04 approx. (rounded)

Note:
  when     (n-j+1)   X(n-j+1)
  j=1      n         largest value of X
  j=2      n-1       second-largest value of X
  j=3      n-2       third-largest value of X
  j=4 (k)  n-3       fourth-largest value of X

The Shapiro-Wilk test statistic (W) may thus be defined as

  $W = (29.04)^2 / \{ (8-1)\, 11.43^2 \} = 843.32 / 914.55 = .922$

This value must now be compared to the critical table value at a given confidence/significance level to make a decision regarding the null hypothesis Ho: normality. The tabled (critical) value of W: $W_{\alpha=.05}$ (95% confidence) = .818.

Since $W > W_{\alpha=.05}$, that is .922 > .818, we must "fail to reject" the null hypothesis Ho: Normality at the 95% confidence level. Hence, there is no statistical evidence to support the rejection of normality at the chosen level of confidence. In other words, there is no evidence to support the notion that the underlying distribution of the variable X is anything other than "normal".

[Figure: probability density of W for n=8 observations at $\alpha$=.05 significance, marking the critical value $W_{\alpha=.05}$ = .818 and the calculated W = .922. As the value of W approaches 1.0, the distribution approaches the standard normal. Theoretical range: 0 < W < 1.0.]

Problem. Verify that W = .162 (approx.) for the following vector of X: {5, 7, 9, 13, 11, 8, 10, 34, 17, 50, 17, 25} for n=12 obs. And since $W < W_{\alpha=.05}$ (.859), we must "reject" the assumption of normality at the 95% confidence level ($\alpha$=.05 significance level).
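The two-step hand calculation for the rainfall example can be sketched in Python as follows. The function name is illustrative, the coefficients are the n=8 table values quoted in the notes, and coefficients for any other n would have to be supplied from SW tables. The result differs slightly from the hand calculation in the third decimal because b and s are not rounded here.

```python
def shapiro_wilk_w(values, coeffs):
    """W = b^2 / sum of squared deviations, using tabled SW coefficients."""
    x = sorted(values)                      # Step 1: ascending order
    n = len(x)
    # Step 2: pair the j-th largest value with the j-th smallest value.
    b = sum(a * (x[n - 1 - j] - x[j]) for j, a in enumerate(coeffs))
    mean = sum(x) / n
    ss = sum((v - mean) ** 2 for v in x)    # equals (n - 1) * s^2
    return b * b / ss

A_N8 = [0.6052, 0.3164, 0.1743, 0.0561]    # tabled coefficients for n = 8
rainfall = [80, 70, 60, 55, 72, 73, 81, 50]

w = shapiro_wilk_w(rainfall, A_N8)
w_crit = 0.818                              # tabled critical value, n = 8, alpha = .05
decision = "fail to reject Ho" if w > w_crit else "reject Ho"
print(f"W = {w:.3f} vs critical {w_crit}: {decision}")
```

Note that the decision rule is reversed relative to most tests: a W above the critical value supports normality.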
In the case of error terms (OLS regression residuals), we may rewrite the Shapiro-Wilk statistic as

  $W^* = b^2 / \sum_{i=1}^{n} \hat{\varepsilon}_i^2$, where $b = \sum_{j=1}^{k} a_j \{ \hat{\varepsilon}_{(n-j+1)} - \hat{\varepsilon}_{(j)} \}$;

given that the expected value $E(\varepsilon) = 0$, or the mean error is zero (approx.) using OLS estimation.

Problem. Verify that the Shapiro-Wilk statistic for the OLS residuals {$\hat{\varepsilon}$} from our GPA=f(X) example, where the "sum of squared error" = .653025, is W=.9643 (approx.); a value that exceeds the critical value of .818, leading us to "fail to reject" the null hypothesis of normality at $\alpha$=.05.

  (i*)   error term (*in ascending order)
   1       -.455
   2       -.315
   3       -.175
   4       -.035
   5        .105
   6        .185
   7        .325
   8        .405

Note that the mean error is approx. zero (as expected).

Typically, for programs like SPSS and NCSS, we get the following output to address the null hypothesis [H0: variable in question is "normal"; that is, normally distributed]:

  Variable    n     W       p        Normal?
  X1          40    .954    (.830)   YES
  X2          40    .801    (.042)   NO

Conclusion (X1): Fail to reject null hypothesis of normality at 95% confidence as p > .05.
Conclusion (X2): Reject null hypothesis of normality at 95% confidence as p < .05.

Kolmogorov-Smirnov (KS) test, based on the cumulative normal distribution

The "K-S test" of normality for OLS residuals:
(1) Run OLS model and generate error terms {$\hat{\varepsilon}$};
(2) Compute "Z-scores" of error terms, where $Z(\hat{\varepsilon}_i) = \hat{\varepsilon}_i / \hat{\sigma}$ and $\hat{\sigma}$ is the std. error of estimate (accounting for n-k* d.f.);
(3) Create cumulative relative frequency distributions of $Z(\hat{\varepsilon})$ values and benchmark distribution (SN); and
(4) Carry out the Kolmogorov-Smirnov (K-S) test comparing the cumulative relative frequency distribution of $Z(\hat{\varepsilon})$ to the "expected" cumulative relative frequency values associated with the "standard normal" (SN) distribution (our theoretical benchmark distribution).

Suppose we were to use m=8 intervals and generate the cumulative "area" under the standard normal distribution.
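The cumulative areas for the m=8 intervals need not be looked up in a table; a minimal sketch using Python's standard library (`statistics.NormalDist`) is shown here. The interval boundaries at -3, -2, ..., +3 follow the notes, while the variable names are illustrative.

```python
import math
from statistics import NormalDist

# Upper bounds of the m = 8 intervals: (-inf,-3], (-3,-2], ..., (3,+inf).
upper_bounds = [-3.0, -2.0, -1.0, 0.0, 1.0, 2.0, 3.0, math.inf]

sn = NormalDist()  # standard normal: mean 0, standard deviation 1

# Cumulative area (crf) up to each upper bound; the last interval closes at 1.
crf_normal = [1.0 if math.isinf(u) else sn.cdf(u) for u in upper_bounds]

# Area inside each interval is the difference of successive cumulative areas.
areas = [crf_normal[0]] + [
    crf_normal[i] - crf_normal[i - 1] for i in range(1, len(crf_normal))
]

for u, a, c in zip(upper_bounds, areas, crf_normal):
    print(f"up to {u:>5}: area = {a:.4f}, cumulative = {c:.4f}")
```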
Standard Normal Distribution

  Interval            Area     Cumulative Area (crf)
  -inf  to -3.0       .0014    .0014
  -2.99 to -2.0       .0214    .0228
  -1.99 to -1.0       .1359    .1587
  -0.99 to  0.0       .3413    .5000
  +0.01 to +1.0       .3413    .8413
  +1.01 to +2.0       .1359    .9772
  +2.01 to +3.0       .0214    .9986
  +3.01 to +inf       .0014    1.0000

Note: One should choose between 5 and 11 intervals when carrying out such a procedure.

Our error terms and Z-scores from our GPA = f(X) example, where estimated S.E.E. $\hat{\sigma}$ = .3298 (approx.):

  Observation i    $\hat{\varepsilon}_i$    $Z_i = \hat{\varepsilon}_i / \hat{\sigma}$
       1             .105                     .3183
       2            -.175                    -.5306
       3             .325                     .9854
       4            -.455                   -1.3796
       5             .185                     .5609
       6            -.035                    -.1061
       7             .405                    1.2280
       8            -.315                    -.9551

Note: Standardization of error terms also helps us to identify potential "outliers", where $|Z(\hat{\varepsilon})|$ > 2.0 or 3.0 (extreme), for example. There are no outliers found in the example above.

Using our m=8 intervals, we can generate the "observed" cumulative relative frequencies of our error terms and compare them to the "expected" cumulative relative frequencies for the normal distribution.

Observed vs. Standard Normal Distribution

  Interval          # of obs.   cum. count   crf obs        crf normal   | crf error - crf normal |
  -inf  to -3.0        0            0         .0000           .0014          .0014
  -2.99 to -2.0        0            0         .0000           .0228          .0228
  -1.99 to -1.0        1            1         .1250 (1/8)     .1587          .0337
  -0.99 to  0.0        3            4         .5000 (4/8)     .5000          .0000
  +0.01 to +1.0        3            7         .8750 (7/8)     .8413          .0337
  +1.01 to +2.0        1            8         1.000 (8/8)     .9772          .0228
  +2.01 to +3.0        0            8         1.000 (8/8)     .9986          .0014
  +3.01 to +inf        0            8         1.000 (8/8)     1.0000         .0000

K-S (D-statistic) = | crf error - crf normal | max = .0337

Testing at the 95% confidence level ($\alpha$=.05): since $D < D_{\alpha}$ for n=8 observations (from tabled values), .0337 < .457, we must "fail to reject" the null hypothesis Ho: Normality at 95% confidence. In short, there is no statistical evidence to conclude that the distribution of OLS error from our example is anything other than "normal" at 95% confidence.
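The interval-based K-S comparison above can be reproduced with a short Python sketch. It mirrors the m=8 interval version used in these notes (not the observation-by-observation form found in most software); the residuals, S.E.E., and critical value are those quoted above, and the variable names are illustrative.

```python
import math
from statistics import NormalDist

residuals = [0.105, -0.175, 0.325, -0.455, 0.185, -0.035, 0.405, -0.315]
see = 0.3298       # standard error of the estimate quoted in the notes
d_crit = 0.457     # tabled critical D for n = 8, alpha = .05

n = len(residuals)
z = [e / see for e in residuals]                 # step 2: Z-scores

# Step 3: cumulative relative frequencies at each interval's upper bound.
upper_bounds = [-3.0, -2.0, -1.0, 0.0, 1.0, 2.0, 3.0, math.inf]
sn = NormalDist()
crf_obs = [sum(1 for zi in z if zi <= u) / n for u in upper_bounds]
crf_normal = [1.0 if math.isinf(u) else sn.cdf(u) for u in upper_bounds]

# Step 4: D is the largest absolute gap between the two distributions.
d_stat = max(abs(o - e) for o, e in zip(crf_obs, crf_normal))
decision = "fail to reject Ho" if d_stat < d_crit else "reject Ho"
print(f"D = {d_stat:.4f} vs critical {d_crit}: {decision}")
```

The same loop also flags outliers if one checks `abs(zi) > 2.0` for each Z-score.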
Other options for validating the "normality assumption".

Recall that "skewness" measures the degree and direction of asymmetry of a distribution of a variable (in comparison to a benchmark distribution: a symmetrical/mesokurtic distribution). The "Beta 1" measure of skewness may be defined as

  $\beta_1 = M_3^2 / M_2^3$, so that $\sqrt{\beta_1} = M_3 / M_2^{3/2} = M_3 / s^3$,

where $M_r$ refers to an r-th moment of a distribution and "s" is the standard deviation of the variable in question (X). The signed index $\sqrt{\beta_1}$ takes the sign of $M_3$.

Some basic properties of $\sqrt{\beta_1}$: (1) a value of zero indicates a perfectly symmetrical distribution; (2) a large positive value indicates skewness (tail to the right); (3) a large negative value indicates skewness (tail to the left), though values between -3 and +3 indicate the typical range for the normal distribution (or distributions that are not significantly different from a normal distribution); and (4) skewness is said to be severe if $|\sqrt{\beta_1}|$ > 3.0.

It is important that researchers notice skewness when it appears in their sample data, as it suggests a need to investigate "outliers". If an inferential test statistic assumes normality, one should employ one of various pretest procedures to validate the normality assumption. Should the normality assumption be violated, one may consider the use of transformations: transforming a variable X such that it conforms to a distribution that is not significantly different from normal. Transformations commonly employed to reduce (positive) skewness include taking the square root, logarithm (base 10 or natural), or reciprocal of X, and then retesting to see if the transformed variable "conforms" to a normal distribution.

NCSS reports a general skewness index defined as the square root of $\beta_1$; its standard error is defined under a large sample size assumption, thereby allowing a formal test of skewness. The test statistic (S), which evaluates the severity of skewness, may be expressed as

  $S = \{ \sqrt{\beta_1} - E[\sqrt{\beta_1}] \} / se(\sqrt{\beta_1}) = \sqrt{\beta_1} / se(\sqrt{\beta_1})$

and S is approximately normally distributed. Hypotheses: Ho: symmetry vs. Ha: asymmetry.
If $|S| > Z_{\alpha/2}$, then we would "reject" the null hypothesis at the (1-$\alpha$) x 100% level of confidence. The standard error of the skewness measure is found by taking the square root of the variance estimate. A large-sample variance estimate for the r-th moment ($m_r$) of a distribution, where r=3, can be obtained using the "Kendall-Stuart equation":

  $Var(m_r) = \{ m_{2r} - m_r^2 + r^2\, m_2\, m_{r-1}^2 - 2 r\, m_{r-1}\, m_{r+1} \} / n$

where an "r-th moment" is defined as

  $m_r = \{ \sum_{i=1}^{n} ( X_i - \bar{X} )^r \} / n$

Note: The Kendall-Stuart formula may also be used to find the variance estimate for kurtosis (when r=4).

Fisher's "g1 statistic": Fisher's g1 is an alternative measure of skewness, defined for a given $\sqrt{\beta_1}$ and sample size n, used to adjust for the effects of sample size, where

  $g_1 = [ \sqrt{n(n-1)}\, \sqrt{\beta_1} ] / (n-2)$

The statistic g1 is approximately normally distributed, and its expected value is zero (when the underlying distribution is perfectly symmetrical): E[g1] = 0. The standard error of this statistic is approximately equal to $\sqrt{6/n}$ for large sample sizes (when n > 150).

Hypothesis testing (Fisher's g1): If [g1 / se(g1)] falls within the interval $-Z_{\alpha/2}$ and $+Z_{\alpha/2}$, it indicates a degree of symmetry that is not significantly different from a normal distribution, where the sign of g1 defines the type of skewness (+ or -), at a given ($\alpha$) significance level. If | g1 / se(g1) | > $Z_{\alpha/2}$, then skewness is said to be "severe" at the (1 - $\alpha$) x 100% confidence level.

D'Agostino's skewness measure Z(S): Z(S) is a statistic that was developed to test if the value of $\sqrt{\beta_1}$ is significantly different from a value of zero (its expected value). Z(S) is approximately normally distributed, and its use is restricted to sample sizes n > 8 (hence, very useful). Null hypothesis: Ho: X is symmetrically distributed.
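The moment-based skewness measures above can be sketched in Python as follows. The function names are illustrative, and the data vector is the n=12 example from the Shapiro-Wilk problem earlier in these notes. The large-sample standard error sqrt(6/n) is applied here only to show the mechanics; with n=12 it is well outside the n > 150 range for which the notes recommend it.

```python
import math

def central_moment(x, r):
    """r-th moment about the mean, with n in the denominator."""
    mean = sum(x) / len(x)
    return sum((v - mean) ** r for v in x) / len(x)

def sqrt_beta1(x):
    """Signed skewness index: M3 / M2^(3/2)."""
    return central_moment(x, 3) / central_moment(x, 2) ** 1.5

def fisher_g1(x):
    """Fisher's sample-size-adjusted skewness."""
    n = len(x)
    return math.sqrt(n * (n - 1)) * sqrt_beta1(x) / (n - 2)

def g1_ratio(x):
    """g1 divided by its large-sample standard error sqrt(6/n)."""
    return fisher_g1(x) / math.sqrt(6.0 / len(x))

data = [5, 7, 9, 13, 11, 8, 10, 34, 17, 50, 17, 25]
print(f"sqrt(beta1) = {sqrt_beta1(data):.3f}")
print(f"g1 = {fisher_g1(data):.3f}, g1/se = {g1_ratio(data):.3f}")
```

The ratio would be compared with the two-tailed critical Z (1.96 at the .05 level), subject to the sample-size caveat.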
Formally,

  $Z(S) = \delta \, \ln \{ (\tau / \lambda) + \sqrt{ (\tau / \lambda)^2 + 1 } \}$

where

  $\tau = \sqrt{\beta_1}\, \sqrt{ [(n+1)(n+3)] / [6(n-2)] }$
  $\delta = 1 / \sqrt{ \ln W }$, with $W = \sqrt{W^2}$
  $\lambda = \sqrt{ 2 / (W^2 - 1) }$
  $W^2 = -1 + \sqrt{ 2 (C - 1) }$

and

  $C = [ 3 (n^2 + 27n - 70)(n+1)(n+3) ] / [ (n-2)(n+5)(n+7)(n+9) ]$

If | Z(S) | > $Z_{\alpha/2}$, then we must reject Ho at the (1 - $\alpha$) x 100% level of confidence. Note: a two-tailed critical significance level (p) is generally reported for this test, where we would reject the null hypothesis if p is less than a pre-determined level of significance (say, $\alpha$=.05).

Recall that indices of "kurtosis" measure the degree of peakedness or flatness of a distribution of a variable (in comparison to a benchmark distribution: the bell-shaped or normal distribution). The "Beta 2" measure of kurtosis may be defined as

  $\beta_2 = M_4 / M_2^2$ or $M_4 / (s^2)^2$

where $M_r$ refers to an r-th moment of a distribution and "$s^2$" is the variance estimate of the variable in question (X). The expected value of $\beta_2$ is 3.0 (when a distribution is mesokurtic or "normal"). Peaked or "leptokurtic" distributions tend to have $\beta_2$ values that exceed 3.0, whereas flat or "platykurtic" distributions tend to produce $\beta_2$ values that are less than 3.0. Note: This statistic is an unreliable estimator of kurtosis when we are dealing with a small sample size (i.e., when n < 40).

Fisher's "g2 statistic": Fisher's g2 is a measure of kurtosis, defined for a given $\beta_2$ and sample size n, used to adjust for the effects of sample size, where

  $g_2 = \{ [(n+1)(n-1)] / [(n-2)(n-3)] \} \, \{ \beta_2 - [3(n-1)/(n+1)] \}$

The expected value of g2 is zero (when the distribution is similar to that of a normal distribution in terms of its peakedness or shape): E[g2] = 0. Note that g2 is an unbiased estimator of the statistic $\gamma_2$, where $\gamma_2 = (\beta_2 - 3)$ or the "adjusted" $\beta_2$ index. For very large samples (n > 1000), the standard error of g2 is approximately $\sqrt{24/n}$.
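D'Agostino's skewness transformation Z(S), given at the start of this passage, involves several nested terms, so a small Python sketch may help in checking hand calculations. The function name and local variable names are illustrative; the formula follows the definitions in the notes (standardized skewness, C, W-squared, and the two scaling terms), and the two-tailed p-value uses the standard normal distribution from Python's standard library.

```python
import math
from statistics import NormalDist

def dagostino_zs(x):
    """D'Agostino's Z(S) for skewness; intended for n > 8."""
    n = len(x)
    mean = sum(x) / n
    m2 = sum((v - mean) ** 2 for v in x) / n
    m3 = sum((v - mean) ** 3 for v in x) / n
    root_b1 = m3 / m2 ** 1.5                      # signed skewness index

    t = root_b1 * math.sqrt((n + 1) * (n + 3) / (6.0 * (n - 2)))
    c = (3.0 * (n * n + 27 * n - 70) * (n + 1) * (n + 3)
         / ((n - 2) * (n + 5) * (n + 7) * (n + 9)))
    w2 = -1.0 + math.sqrt(2.0 * (c - 1.0))
    delta = 1.0 / math.sqrt(0.5 * math.log(w2))   # 1 / sqrt(ln W), W = sqrt(w2)
    lam = math.sqrt(2.0 / (w2 - 1.0))
    return delta * math.log(t / lam + math.sqrt((t / lam) ** 2 + 1.0))

data = [5, 7, 9, 13, 11, 8, 10, 34, 17, 50, 17, 25]  # n = 12 vector from the notes
z_s = dagostino_zs(data)
p_two_tailed = 2.0 * (1.0 - NormalDist().cdf(abs(z_s)))
print(f"Z(S) = {z_s:.4f}, two-tailed p = {p_two_tailed:.4f}")
```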
When the value of [g2 / se(g2)] falls within the interval $-Z_{\alpha/2}$ and $+Z_{\alpha/2}$, it indicates a degree of peakedness that is not significantly different from that of a normal distribution (with the sign of g2 indicating the type of kurtosis): negative (positive) values indicate a flatter (more peaked) distribution than normal. If | g2 / se(g2) | > $Z_{\alpha/2}$, then deviation from the normal distribution is said to be significant, and one would "reject" the null hypothesis of normality at the (1 - $\alpha$) x 100% confidence level. Such a test should only be employed when using very large samples (say n > 1000), as the sampling distribution of this statistic is asymptotically normal.

D'Agostino's kurtosis measure Z(K): Z(K) is a statistic that was developed to test if the value of $\beta_2$ is significantly different from a value of 3.0 (its expected value, using the normal curve as a benchmark distribution). The statistic Z(K) is approximately normally distributed, and its use is restricted to sample sizes n > 8. Null hypothesis: Ho: the degree of peakedness is not significantly different from that of a normal distribution.

Formally,

  $Z(K) = \{ [1 - 2/(9A)] - [ (1 - 2/A) / (1 + \xi \sqrt{2/(A-4)}) ]^{1/3} \} / \sqrt{2/(9A)}$

where

  $\xi = \{ \beta_2 - [(3n-3)/(n+1)] \} / \sqrt{ 24n(n-2)(n-3) / [ (n+1)^2 (n+3)(n+5) ] }$

and

  $A = 6 + (8/\theta)\, [ (2/\theta) + \sqrt{ 1 + (4/\theta^2) } ]$

where

  $\theta = \{ 6 (n^2 - 5n + 2) / [ (n+7)(n+9) ] \} \, \sqrt{ 6 (n+3)(n+5) / [ n (n-2)(n-3) ] }$

If | Z(K) | > $Z_{\alpha/2}$, then we must reject Ho at the (1 - $\alpha$) x 100% level of confidence. Note: a two-tailed critical significance level (p) is generally reported for this test, where we would reject the null hypothesis if p is less than a pre-determined level of significance (say, $\alpha$=.05).

D'Agostino's omnibus test statistic $K^2$: $K^2$ is a statistic that was developed to simultaneously test if the values of $\sqrt{\beta_1}$ and $\beta_2$ are significantly different from their "expected values" (using the normal curve as a benchmark distribution). It is a test that combines the tests of skewness and kurtosis into one formula.
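The Z(K) transformation for kurtosis given above can be sketched in the same way. The function and variable names are illustrative; the formula follows the standardized kurtosis, the A term, and the cube-root transformation as defined in the notes. A sign-preserving cube root is used so the function does not fail when the inner ratio is negative, which is an implementation choice rather than something stated in the lecture.

```python
import math

def dagostino_zk(x):
    """D'Agostino's Z(K) for kurtosis; intended for n > 8."""
    n = len(x)
    mean = sum(x) / n
    m2 = sum((v - mean) ** 2 for v in x) / n
    m4 = sum((v - mean) ** 4 for v in x) / n
    b2 = m4 / m2 ** 2                                  # Beta 2 kurtosis index

    expected = (3.0 * n - 3.0) / (n + 1)               # E[b2] under normality
    variance = (24.0 * n * (n - 2) * (n - 3)
                / ((n + 1) ** 2 * (n + 3) * (n + 5)))
    xi = (b2 - expected) / math.sqrt(variance)         # standardized kurtosis

    theta = (6.0 * (n * n - 5 * n + 2) / ((n + 7) * (n + 9))
             * math.sqrt(6.0 * (n + 3) * (n + 5) / (n * (n - 2) * (n - 3))))
    a = 6.0 + (8.0 / theta) * (2.0 / theta + math.sqrt(1.0 + 4.0 / theta ** 2))

    ratio = (1.0 - 2.0 / a) / (1.0 + xi * math.sqrt(2.0 / (a - 4.0)))
    cube_root = math.copysign(abs(ratio) ** (1.0 / 3.0), ratio)
    return ((1.0 - 2.0 / (9.0 * a)) - cube_root) / math.sqrt(2.0 / (9.0 * a))

flat = list(range(20))   # evenly spread values: flatter than normal
print(f"Z(K) for an evenly spread sample of n=20: {dagostino_zk(flat):.4f}")
```

A flat (platykurtic) sample such as this one yields a negative Z(K), consistent with the sign convention described above.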
The statistic $K^2$ is distributed as a Chi-square distribution with k=2 degrees of freedom. [Note: we will discuss the $\chi^2$ distribution in a forthcoming lecture.] Null hypothesis: Ho: the degree of skewness and the degree of peakedness are not significantly different from those of a normal distribution.

Before we can run this test, we must first calculate values for Z(S) and Z(K). D'Agostino's omnibus statistic is defined as follows:

  $K^2 = Z(S)^2 + Z(K)^2$

If $K^2$ > $\chi^2$ critical value (for a given $\alpha$), we would reject the null hypothesis at the (1 - $\alpha$) x 100% confidence level. Typically, a report is provided of the outcome of this test at the 95% confidence level, or a p-value is given.

Note: If p < .05, we must reject the null hypothesis that the distribution is shaped like a normal distribution at the 95% confidence level.

D'Agostino's test results, from NCSS output, support the assumption of normality for these error terms:

  D'Agostino's stat    value     prob.    Normality?
  skewness Z(S)        -.2719    .7857    accepted
  kurtosis Z(K)        -.8376    .4022    accepted
  omnibus $K^2$        0.7755    .6785    accepted

Note: In each case, D'Agostino's stats are less than their respective critical values at the 95% confidence level: | Z(S) | < 1.96, | Z(K) | < 1.96, $K^2$ < 5.99.

Several other test options are available for testing normality:
Anderson-Darling test
Martinez-Iglewicz test (based on the median value)
Jarque-Bera (JB) test

Note: each test/statistic has its own set of rules, assumptions, and restrictions (so be careful).

Note: When carrying out a normality test on small samples, it is highly recommended that one use Shapiro-Wilk's W (a test that is actually valid for samples that range from n=3 to n=5000, so long as there are relatively few ties in the data vector).

Note: The Anderson-Darling test (not applicable when a frequency variable is specified) is also useful when n < 40.
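The omnibus figures in the NCSS output above can be checked directly from the reported Z(S) and Z(K). This sketch relies on one fact not stated in the notes: for a Chi-square variable with exactly 2 degrees of freedom, the upper-tail probability is exp(-x/2), so no table or statistics library is needed. Variable names are illustrative.

```python
import math

z_s = -0.2719    # D'Agostino skewness statistic from the NCSS output
z_k = -0.8376    # D'Agostino kurtosis statistic from the NCSS output
alpha = 0.05

k2 = z_s ** 2 + z_k ** 2              # omnibus statistic K^2

# Chi-square with 2 d.f.: upper-tail probability is exp(-x / 2).
p_value = math.exp(-k2 / 2.0)
chi2_crit = -2.0 * math.log(alpha)    # critical value at the chosen alpha

decision = "fail to reject Ho" if k2 < chi2_crit else "reject Ho"
print(f"K^2 = {k2:.4f}, p = {p_value:.4f}, critical = {chi2_crit:.2f}: {decision}")
```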
For large samples, especially when n > 100, it is suggested that one apply a battery of normality tests (providing that the test assumptions are met).

The Anderson-Darling test is used to test if a sample of data came from a population with a specific distribution (like the normal distribution). It is a modification of the Kolmogorov-Smirnov (KS) test, but gives more weight to the tails than does the KS test. The KS test is distribution free in the sense that the critical values do not depend on the specific distribution being tested (hence, it is a nonparametric test, by definition). The Anderson-Darling test makes use of the specific distribution in calculating critical values. This has the advantage of allowing a more sensitive test and the disadvantage that critical values must be calculated for each distribution (advanced topic).

In normality testing, the p-value is typically given when the test statistic is reported. Rule of Thumb: If p-value < 0.05, reject normality (the distribution is not normal). If p-value > 0.05, then the distribution is not significantly different from a normal distribution. Note: a similar comparison of p-values occurs in all hypothesis testing. If p-value > 0.05, we "fail to reject" the null Ho.

The Jarque-Bera (JB) Test of Normality is an asymptotic test of normality, that is, for very large sample sizes only (n >> 40), and works well as a way to evaluate OLS residuals. JB utilizes skewness and kurtosis measures, and may be expressed as

  $JB = n \{ (S^2/6) + [(K - 3)^2 / 24] \}$,

where n is the sample size, S is the common skewness measure, and K is the measure of kurtosis; specifically,

  $S = M_3 / (s)^3$ and $K = M_4 / (s^2)^2$,

with $M_j$ representing the j-th moment of a distribution,

  $M_j = [ \sum_{i=1}^{n} ( X_i - \bar{X} )^j ] / n$ (where j = 1,..., 4),

and s ($s^2$) is the sample standard deviation (variance). Under the null hypothesis of "normally distributed error", the JB test can be applied to the residuals from a SRF, and for large samples the statistic asymptotically follows a $\chi^2$ distribution with 2 degrees of freedom.
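The JB formula can be sketched in Python as shown here. The function name is illustrative. Two assumptions are worth flagging: the skewness and kurtosis are computed from moments with n in the denominator throughout (one common convention), and the p-value uses the fact that the upper tail of a Chi-square with 2 degrees of freedom is exp(-x/2). The residuals are the n=8 GPA example from these notes and serve only to show the mechanics, since JB is meant for much larger samples.

```python
import math

def jarque_bera(x):
    """Return (JB statistic, asymptotic p-value from chi-square with 2 d.f.)."""
    n = len(x)
    mean = sum(x) / n
    m2, m3, m4 = (sum((v - mean) ** j for v in x) / n for j in (2, 3, 4))
    s = m3 / m2 ** 1.5            # skewness
    k = m4 / m2 ** 2              # kurtosis (3.0 under normality)
    jb = n * (s * s / 6.0 + (k - 3.0) ** 2 / 24.0)
    p_value = math.exp(-jb / 2.0)
    return jb, p_value

residuals = [0.105, -0.175, 0.325, -0.455, 0.185, -0.035, 0.405, -0.315]
jb, p = jarque_bera(residuals)
print(f"JB = {jb:.4f}, p = {p:.4f}  (illustration only: n = 8 is far too small)")
```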
University of Florida - EEL - 4744
University of Florida Electrical &amp; Computer Engineering Dept. Page 1/1EEL 4744Revision 0Drs. Eric M. Schwartz &amp; Karl Gugel13-Jan-12Pre-laboratory Report GuidelinesDecoding Logic:When we start using the CPLD, you should include printouts of either y
University of Florida - EEL - 4744
Application ReportSPRAAM0A May 2007 Revised October 2007Getting Started With TMS320C28x Digital Signal ControllersChristine Peng . C2000/AEC ABSTRACT This guide is organized by development flow and functional areas to make your design effort as seamles
University of Florida - EEL - 4744
University of FloridaDepartment of Electrical &amp; Computer EngineeringEEL 4744-Spring 2012Dr. Eric. M. Schwartz7-Feb-12SYLLABUSRevision 8Page 1/9EEL 4744C: MICROPROCESSOR APPLICATIONShttp:/mil.ufl.edu/4744/INSTRUCTOR LECTURES Dr. Eric M. Schwartz
University of Florida - EEL - 4744
test_32.asm 1 ;* 2 ; file = test_32.asm Author: Dr. Eric M. Schwartz Date: 13 Feb 2012 3; 4 ; Description: 5 ; This program will be used to test moving 32 bits at a time 6; between pairs of memory, the accumulator, and the auxiliary reisters 7 .global _c_
University of Florida - EEL - 4744
;* ; timer_ex1.asm 24 Mar 2011 Rev. 1 ; ; Original Author: Adam Mills ; Editted by: Dr. Schwartz ;* ;* Flashes the LED on our board. Example using Timer1. ;* GPAMUX1 .set 0x6F86 GPATOGGLE .set 0x6FC6 GPADIR .set 0x6F8A GPADAT .set 0X6FC0 INT13_VECT .set 0