
Chapter 4 A First Look at Bivariate Data

4.1 Graphical representation

So far we have looked at a single variable, known as univariate data, and at its descriptive and numerical summaries. We now turn to bivariate data, that is, data on two variables. As economists, we are often interested in understanding the relationship between two variables, for example, wages and education, CEO performance and salaries, economic growth and aid, and so on. Here, as with univariate data, graphical and numerical summaries can be employed.

The most important graphical summaries are the contingency table (usually for categorical variables) and the scatter diagram.

A contingency table lists the frequency of each combination of the values of the two variables. For example, a survey of 2,237 people identifying their gender and handedness could give the following table.¹

¹ For more on the relationship between categorical variables, a useful text is Utts and Heckard (2006: Chapter 6).
              Male  Female
Right-handed   934    1070
Left-handed    113      92
Ambidextrous    20       8

With cardinal data, the relationship between two variables is better appreciated through a scatter diagram. Below is a scatter diagram that shows the association between the duration of eruptions of Old Faithful and the waiting time between eruptions.

Figure 4.1: Waiting time and eruption duration of Old Faithful
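
As a rough illustration of how the gender and handedness contingency table above might be worked with, the following Python sketch stores the six counts from the text and derives the row totals, column totals, and the share of each gender in each handedness category. The variable names (`counts`, `row_totals`, `col_totals`, `col_share`) are assumptions made for this example and are not part of the chapter.

```python
# Counts copied from the gender and handedness table in the text.
counts = {
    "Right-handed": {"Male": 934, "Female": 1070},
    "Left-handed": {"Male": 113, "Female": 92},
    "Ambidextrous": {"Male": 20, "Female": 8},
}

# Row totals: number of respondents in each handedness category.
row_totals = {hand: sum(by_gender.values()) for hand, by_gender in counts.items()}

# Column totals: number of respondents of each gender.
col_totals = {}
for by_gender in counts.values():
    for gender, n in by_gender.items():
        col_totals[gender] = col_totals.get(gender, 0) + n

# Grand total: should match the survey size stated in the text (2,237).
grand_total = sum(row_totals.values())

# Column shares: within each gender, the fraction in each handedness category.
col_share = {
    hand: {gender: n / col_totals[gender] for gender, n in by_gender.items()}
    for hand, by_gender in counts.items()
}

print("Row totals:", row_totals)
print("Column totals:", col_totals)
print("Grand total:", grand_total)
for hand, shares in col_share.items():
    formatted = ", ".join(f"{g}: {s:.1%}" for g, s in shares.items())
    print(f"{hand}: {formatted}")
```

Comparing the column shares rather than the raw counts is one way to see whether handedness is distributed differently for men and women, since the two groups are not the same size.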
4.2 Correlation

Graphical representations have their limitations, so a number of important numerical summaries are available to describe the relationship between any two variables.

In the previous chapters, the arithmetic mean and standard deviation were introduced as the most important numerical summaries for univariate data. One might ask, therefore, why these two numerical summaries might not be sufficient for bivariate data as well. We will show why.

Assume a university has been offering two statistics classes, sections A and B, to freshmen over the years. The midterm and final scores (both out of 200) were recorded and are shown on the scatter diagrams below, first for section A students:

Figure 4.2: Mid-term and final exam scores for section A students
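
One way to see why the mean and standard deviation of each variable cannot by themselves describe a bivariate relationship is the small Python sketch below. The scores are made up for illustration and are not the section A or section B data. The sketch also assumes the population form of the standard deviation (dividing by n) and the usual Pearson correlation coefficient, which may differ in detail from the definitions used elsewhere in these notes.

```python
from math import sqrt


def mean(values):
    """Arithmetic mean of a list of numbers."""
    return sum(values) / len(values)


def std_dev(values):
    """Population standard deviation (divides by n; an assumption here)."""
    m = mean(values)
    return sqrt(sum((v - m) ** 2 for v in values) / len(values))


def correlation(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)
    return cov / (std_dev(xs) * std_dev(ys))


# Hypothetical midterm scores (out of 200), shared by both made-up sections.
midterm = [100, 120, 140, 160, 180]

# Hypothetical final scores: the same five values, paired differently.
final_a = [110, 125, 140, 155, 170]  # rises steadily with the midterm score
final_b = [140, 170, 110, 155, 125]  # same values, shuffled across students

# The univariate summaries of the final scores are identical...
print("Mean of finals:", mean(final_a), mean(final_b))
print("SD of finals:  ", std_dev(final_a), std_dev(final_b))

# ...but the strength and direction of the association are not.
r_a = correlation(midterm, final_a)
r_b = correlation(midterm, final_b)
print("Correlation, made-up section A:", round(r_a, 3))
print("Correlation, made-up section B:", round(r_b, 3))
```

Because the two lists of final scores contain exactly the same values, any summary computed from one variable at a time is the same for both. Only a measure that uses the pairing of midterm and final scores can tell the two cases apart.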