Lecture 18: CHAPTER 7: CORRELATION AND SIMPLE LINEAR REGRESSION

Correlation (Section 7.1, page 500)

BIVARIATE DATA

Most of the data sets we have encountered up to now deal with measurements of one variable on each of several "individuals" (e.g. incomes of individuals, test scores, etc.). As discussed in our first lecture, this type of data is called univariate data.

In many situations it may be of interest to take measurements of two variables on each of several individuals. For example, we may be interested in both the number of hours an individual spends studying for a midterm and the midterm grade. In such a case each observation will consist of two values (a pair of measurements):

(number of hours studying, midterm grade)

This type of data is called bivariate data and is often written in the form (x, y), the "x-observation" and the "y-observation". A bivariate sample of n observations would then be written as

$(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$

The reason we take observations on two variables is that we are interested in exploring the relationship between the variables. For example: "Do A-average students tend to study more?" The simplest possible relationship between two variables x and y is that of a straight line. This leads us to scatter plots:

Since we are interested in both the number of hours an individual spends studying and the midterm grade, consider the following data set of 5 individuals' midterm grades (in percent) and time spent studying per week (in hours), where x = hours studying and y = grade:

(14, 95), (3, 53), (7, 76), (9, 88), (0, 28)

Draw a scatter plot of this data:

What does this show?
1)
2)

A scatter plot of bivariate numerical data gives a visual impression of how strongly x values and y values are related.
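As a rough aid for the scatter-plot exercise, the five (x, y) pairs can be placed on a coarse text grid. This is only a sketch: the helper name `ascii_scatter`, the grid limits, and the 10-point grade bands are assumptions made for illustration and are not part of the lecture, which asks for the plot to be drawn by hand.

```python
# Bivariate sample from the lecture: (hours studying, midterm grade in percent)
data = [(14, 95), (3, 53), (7, 76), (9, 88), (0, 28)]

def ascii_scatter(points, x_max=14, y_max=100, y_step=10):
    """Return a coarse text scatter plot (hypothetical helper).

    Each row covers one band of y values of width y_step, labelled by the
    band's lower bound; each column is one unit of x.
    """
    rows = []
    for band in range(y_max // y_step, -1, -1):
        cells = [" "] * (x_max + 1)
        for x, y in points:
            if y // y_step == band:
                cells[x] = "*"  # mark the observation in its grid cell
        rows.append(f"{band * y_step:>3} |" + "".join(cells))
    rows.append("    +" + "-" * (x_max + 1))  # x axis
    return "\n".join(rows)

plot = ascii_scatter(data)
print(plot)
```

Each `*` marks one individual. Reading the grid from left to right gives the same visual impression a hand-drawn scatter plot would.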
The following scatter plots display different types of relationships between the x and y values:

1)
2)
3)
4)

[Scatter plots 1) to 4) are not reproduced here.]

However, to make precise statements about a data set, we must go beyond just a scatter plot. A correlation coefficient is a quantitative assessment of the strength of the relationship between x and y....
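The text breaks off before it defines the coefficient. As a hedged illustration only, the sketch below assumes the lecture means Pearson's sample correlation coefficient, r = S_xy / sqrt(S_xx * S_yy), and applies it to the five (hours, grade) pairs given earlier. The function name `pearson_r` is hypothetical.

```python
import math

# Bivariate sample from the lecture: (hours studying, midterm grade in percent)
data = [(14, 95), (3, 53), (7, 76), (9, 88), (0, 28)]

def pearson_r(points):
    """Pearson's sample correlation coefficient (assumed to be the one intended)."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    # Sums of cross-products and squared deviations about the means
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    s_xx = sum((x - mean_x) ** 2 for x, _ in points)
    s_yy = sum((y - mean_y) ** 2 for _, y in points)
    return s_xy / math.sqrt(s_xx * s_yy)

r = pearson_r(data)
print(round(r, 3))
```

Pearson's r always lies between -1 and +1. Values near +1 or -1 indicate a strong linear relationship, and values near 0 indicate a weak one.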
This note was uploaded on 06/20/2011 for the course STAT 2800 taught by Professor Paula during the Winter '11 term at UOIT.
