The Pearson Product-Moment Correlation Coefficient

The regression coefficient is an asymmetrical statistic: it gives different values for the model Y = f(X) and the model X = f(Y). The other major measure of bivariate association is the Pearson product-moment correlation coefficient (sometimes called "little r" for short). The correlation coefficient is a symmetrical statistic. That is, it describes the association between X and Y without regard to whether Y = f(X) or X = f(Y), and it produces the same result in either case. Unlike the regression coefficient, whose values range from 0.0 to ±∞, the correlation coefficient ranges from 0.0 when there is NO association between X and Y to ±1.00 when there is PERFECT association (either direct or inverse).
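The contrast between the two statistics can be checked numerically. The following is a minimal sketch in plain Python, not a reference implementation: the data values and the function names (mean, slope, pearson_r) are made up for illustration. It computes the least-squares slope and the correlation coefficient twice, once with X as the predictor and once with Y as the predictor.

```python
from math import sqrt


def mean(values):
    """Arithmetic mean of a list of numbers."""
    return sum(values) / len(values)


def slope(x, y):
    """Least-squares slope b for the model y = a + b*x."""
    mx, my = mean(x), mean(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    return sxy / sxx


def pearson_r(x, y):
    """Pearson product-moment correlation between x and y."""
    mx, my = mean(x), mean(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical data, chosen only to illustrate the point.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]

b_yx = slope(x, y)      # slope for Y = f(X)
b_xy = slope(y, x)      # slope for X = f(Y)
r_xy = pearson_r(x, y)  # correlation with X listed first
r_yx = pearson_r(y, x)  # correlation with Y listed first

print("b for Y = f(X):", b_yx)
print("b for X = f(Y):", b_xy)
print("r(X, Y):", r_xy)
print("r(Y, X):", r_yx)
```

With these data the two slopes differ, while r is the same whichever variable is listed first.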
To generate the second set of statistics describing association from the linear model, we partition the sum of squares. Graphically, we begin with a single data point, i, in two-dimensional space. Y_i is its location on the scale of Y (on the y-axis); below that is the predicted location of Y, Y_i-hat. The dotted horizontal line (- - - -) is the location of the mean of Y. (When there is no association between X and Y, b = 0.0 and therefore a = Y-bar: where b = 0, the model Y = a + bX reduces to Y = a.)

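[Figure: a single data point i plotted at (X_i, Y_i). The y-axis marks Y_i, the predicted value Y_i-hat on the line of best fit, and the mean Y-bar (dotted horizontal line). Braces mark the two segments of the vertical deviation: (Y_i − Y_i-hat) and (Y_i-hat − Y-bar).]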

The vertical line represents the deviation of the ith observation from the mean of Y (i.e., the difference between Y_i and Y-bar). The line of best fit divides this deviation into its two mathematical components. The component ABOVE the line of best fit is the residual: the difference between Y_i and Y_i-hat, that is, between the actual location of the ith observation on the y-axis and its predicted location on the y-axis. This is the error (or residual) component.
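The split of a single observation's deviation can be verified directly. The sketch below is illustrative only: the data are made up, fit_line is a hypothetical helper, and fitted_minus_mean is a placeholder name for the part of the deviation that lies between Y_i-hat and Y-bar. For each point it checks that the deviation from the mean equals the residual plus that remaining part.

```python
def mean(values):
    """Arithmetic mean of a list of numbers."""
    return sum(values) / len(values)


def fit_line(x, y):
    """Least-squares intercept a and slope b for y = a + b*x."""
    mx, my = mean(x), mean(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    b = sxy / sxx
    a = my - b * mx
    return a, b


# Hypothetical data, chosen only to illustrate the decomposition.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]

a, b = fit_line(x, y)
y_bar = mean(y)

rows = []
for xi, yi in zip(x, y):
    y_hat = a + b * xi                 # predicted location on the y-axis
    total = yi - y_bar                 # deviation from the mean of Y
    residual = yi - y_hat              # error (residual) component
    fitted_minus_mean = y_hat - y_bar  # remaining component (placeholder name)
    rows.append((total, residual, fitted_minus_mean))
    print(f"X={xi}: total={total:+.2f}  "
          f"residual={residual:+.2f}  rest={fitted_minus_mean:+.2f}")
```

Each row's total equals the sum of its two parts, up to floating-point rounding.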
The component BELOW the line of best fit is new. It is

