# Week 5 pre-reading.pdf - Psychological data analysis...


Psychological data analysis

Correlation and regression

Week 3 pre-reading: Correlation and partial correlation

Many variables are related to one another in some way. In some fields of study, these relationships can be very complex. Fortunately for us, the kinds of relationships we typically find in psychology are quite simple ones. It is the representation and analysis of these relationships that we will concern ourselves with here.

To investigate a relationship we need scores on both variables from a number of different people. These can then be shown on a scatterplot, which is simply a plot of scores on one variable against scores on the other. It is a simple descriptive way of representing the strength and direction of any relationship between the two variables. For example, we might wish to investigate whether a relationship exists between the amount of alcohol ingested and the number of errors made on a cognitive task.
## Positive and negative linear correlations

As you can see, the points on the plot tend towards a straight (regression) line. This is called a linear relationship: Y (the number of errors) can be described as a particular linear function of X (the amount of alcohol ingested). If the line runs from bottom left to top right, as in this case, then as scores on one variable increase, scores on the other variable also increase, and the relationship is described as a positive linear relationship. If, on the other hand, the line runs from top left to bottom right, so that as scores on one variable increase, scores on the other variable decrease, the relationship is described as a negative linear relationship. In nearly all real-world cases, the relationship is much less clear than in the example used above.

## Adding a line of best fit

Because the scatterplot indicates a straight-line relationship between the two variables, it is perhaps not surprising that SPSS can add a line to the plot that best describes the relationship. This line is placed so as to minimise the overall distance between the data points and the line (in the usual least-squares sense, the sum of the squared vertical distances).

## No correlation and the influence of outliers

SPSS is, of course, just a program and makes no judgements about the data. It is quite possible to get a scatterplot, complete with a line of best fit, for two data sets that have no relationship between them.
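The three situations described so far (a rising line, a falling line, and a line fitted to unrelated data) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the text itself uses SPSS, the scores below are invented, and the function name `fit_line` is hypothetical. The fit uses the ordinary least-squares formulas for the slope and intercept.

```python
def fit_line(x, y):
    """Return (slope, intercept) of the least-squares line y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sum of squares of the deviations from the means.
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical scores for eight people (invented for illustration only).
alcohol_units = [1, 2, 3, 4, 5, 6, 7, 8]
errors_rising = [2, 3, 5, 4, 6, 8, 7, 9]      # errors go up with alcohol
errors_falling = [9, 7, 8, 6, 4, 5, 3, 2]     # errors go down with alcohol
errors_unrelated = [5, 3, 6, 4, 4, 6, 3, 5]   # no systematic trend

slope_rising, _ = fit_line(alcohol_units, errors_rising)
slope_falling, _ = fit_line(alcohol_units, errors_falling)
slope_flat, intercept_flat = fit_line(alcohol_units, errors_unrelated)

print(f"rising data:    slope = {slope_rising:+.2f}")
print(f"falling data:   slope = {slope_falling:+.2f}")
print(f"unrelated data: slope = {slope_flat:+.2f}, intercept = {intercept_flat:.2f}")
```

A positive slope corresponds to a line running from bottom left to top right, a negative slope to a line running from top left to bottom right, and a slope at or near zero to a horizontal line. Note that the fit always returns a line, even for the unrelated scores, which is why a fitted line on its own is not evidence of a relationship.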
The practically horizontal line of best fit shows that no sloped line describes the data any better than a flat one. This is because there is no relationship between the two sets of data. Such a picture can be grossly distorted, however, by the addition of just one outlying data point, as the following graph shows.
The only difference between the two sets of data is the addition of one outlying data point, the one that appears at the top right of the scatterplot. The line of best fit now indicates a strong positive relationship. Clearly, we need to be aware of outliers, and to consider excluding them, when we are investigating correlations. This is particularly important when sample sizes are small (N < 100 for correlations).
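The effect of a single outlier can be reproduced with a small Python sketch. The assumptions are the same as for any toy example: the scores are invented, the helper name `slope_and_r` is hypothetical, and Pearson's correlation coefficient r is used here as the measure of strength, although the passage above does not name a specific statistic.

```python
from math import sqrt


def slope_and_r(x, y):
    """Return (least-squares slope, Pearson's r) for paired scores x and y."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of squares and cross-products of the deviations from the means.
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    return sxy / sxx, sxy / sqrt(sxx * syy)


# Hypothetical unrelated scores for eight people (invented for illustration).
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [5, 3, 6, 4, 4, 6, 3, 5]

# The same data plus ONE outlying point at the top right of the plot.
x_with_outlier = x + [20]
y_with_outlier = y + [20]

slope_without, r_without = slope_and_r(x, y)
slope_with, r_with = slope_and_r(x_with_outlier, y_with_outlier)

print(f"without outlier: slope = {slope_without:+.2f}, r = {r_without:+.2f}")
print(f"with outlier:    slope = {slope_with:+.2f}, r = {r_with:+.2f}")
```

With these invented scores the correlation is zero for the original eight people and rises to roughly 0.89 once the single extra point is included, even though nothing about the other eight scores has changed. With so few cases, one point can carry most of the apparent relationship.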