Chapter 4 A First Look at Bivariate Data

4.1 Graphical representation

So far we have looked at a single variable, known as univariate data, and its graphical and numerical summaries. We now turn to bivariate data, that is, data on two variables. As economists, we are often interested in understanding the relationship between two variables, for example wages and education, CEO performance and salaries, or economic growth and aid. Here, as with univariate data, graphical and numerical summaries can be employed. The most important graphical summaries are the contingency table (usually for categorical variables) and the scatter diagram.

A contingency table lists the frequency of each combination of the values of the two variables. For example, a survey of 2,237 people identifying their gender and handedness could give: [1]

                Male   Female
Right-handed     934     1070
Left-handed      113       92
Ambidextrous      20        8

[1] For more on the relationship between categorical variables, a useful text is Utts and Heckard (2006: Chapter 6).

With cardinal data, the relationship between two variables is better appreciated through a scatter diagram. Below is a scatter diagram that shows the association between the duration of eruptions of Old Faithful and the waiting time between eruptions.

Figure 4.1: Waiting time and eruption duration of Old Faithful

4.2 Correlation

Graphical representations have their limitations: they show the shape of a relationship but do not measure it. A number of important numerical summaries are therefore available to describe the relationship between any two variables. In the previous chapters, the arithmetic mean and standard deviation were introduced as the most important numerical summaries for univariate data. One might ask, therefore, why these two numerical summaries are not sufficient for bivariate data as well. We will show why.
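Returning briefly to the handedness table of Section 4.1: a contingency table is usually read together with its row and column totals. The sketch below recomputes those totals from the six counts given in the text. The function names row_totals, column_totals and column_shares are illustrative choices, not part of the original notes, and the nested-dictionary layout is only one of several reasonable ways to hold such a table.

```python
# Counts copied from the gender-by-handedness table in Section 4.1.
counts = {
    "Right-handed": {"Male": 934, "Female": 1070},
    "Left-handed": {"Male": 113, "Female": 92},
    "Ambidextrous": {"Male": 20, "Female": 8},
}


def row_totals(table):
    """Total frequency of each row category (here: handedness)."""
    return {row: sum(cells.values()) for row, cells in table.items()}


def column_totals(table):
    """Total frequency of each column category (here: gender)."""
    totals = {}
    for cells in table.values():
        for column, frequency in cells.items():
            totals[column] = totals.get(column, 0) + frequency
    return totals


def column_shares(table):
    """Share of each row category within each column (conditional proportions)."""
    col = column_totals(table)
    return {
        row: {column: frequency / col[column] for column, frequency in cells.items()}
        for row, cells in table.items()
    }


if __name__ == "__main__":
    print("Row totals:   ", row_totals(counts))
    print("Column totals:", column_totals(counts))
    print("Grand total:  ", sum(row_totals(counts).values()))
    for row, shares in column_shares(counts).items():
        print(row, {column: round(share, 3) for column, share in shares.items()})
```

The grand total reproduces the 2,237 respondents mentioned in the text, which is a quick check that the table was transcribed correctly. The shares within each gender make the two columns comparable even though the survey has more women than men.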
Assume a university has been offering two statistics classes, sections A and B, to freshmen over the years. The midterm and final exam scores (both out of 200) were recorded and are shown on the scatter diagrams below, first for section A students:

Figure 4.2: Midterm and final exam scores for section A students

4...
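The claim that means and standard deviations alone cannot describe bivariate data can be checked on a small made-up example. The sketch below uses hypothetical scores, not the data behind Figure 4.2: two sets of final scores contain exactly the same five values, so their means and standard deviations coincide, but they are paired with the midterm scores in different ways. The function pearson_r is an illustrative helper that computes the standard Pearson correlation coefficient, which this excerpt has not yet defined.

```python
from math import sqrt
from statistics import mean, pstdev

# Hypothetical scores out of 200; NOT the data shown in Figure 4.2.
midterm = [60, 80, 100, 120, 140]
final_a = [70, 90, 110, 130, 150]   # finals rise in step with midterms
final_b = [110, 150, 70, 130, 90]   # the same five finals, paired differently


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


if __name__ == "__main__":
    for label, final in (("A", final_a), ("B", final_b)):
        print(
            f"Section {label}: mean = {mean(final)}, "
            f"SD = {pstdev(final):.2f}, "
            f"r with midterm = {pearson_r(midterm, final):.2f}"
        )
```

Both hypothetical sections report the same mean and standard deviation for the final exam, yet the association with the midterm differs sharply between them. Any summary of the relationship therefore has to use the pairing of the observations, which univariate summaries discard.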
This note was uploaded on 03/14/2010 for the course ECON Statistics taught by Professor Yy during the Spring '10 term at Seoul National.