ARE106 Homework #2 - AK
ARE106 – Homework 2 Answer Key
1. Questions related to data file data22.
(i) Click on ‘Data’, then click on ‘Read info’ to see the following:
DATA22: High school and College GPA of undergraduate students
Data reflect the first year achievement.
colgpa = Grade point average in college (Range 0.85 - 3.97)
hsgpa = High school GPA (Range 2.29 - 4.5)
(ii) This is a cross-sectional dataset because the data are for a specific point in time. We do not
have measurements of the same variable at different points in time; we have only one high school
GPA and one college GPA for each student.
(iii) Within range: some values of the college GPA are well below 2. The lowest value of colgpa is
0.85, for observation 416. There are also some unusual values of hsgpa, namely values that
exceed 4. The highest hsgpa is 4.5, for observation 411.
Obvious errors: it is possible that hsgpa values exceeding 4.0 are errors; however, many high
schools allow grade point averages above 4.0 when students take Advanced Placement (AP) classes.
Therefore, there are no “obvious” errors in the dataset.
Surprises: there are some surprising observations, specifically those where the difference
between colgpa and hsgpa (colgpa minus hsgpa) is unusually large. The observations where the
difference exceeds 2 in absolute value are the following:
Obs    colgpa   hsgpa   diff
390    1.57     3.60    -2.03
380    1.57     3.67    -2.10
61     1.11     3.21    -2.10
92     1.84     4.00    -2.16
416    0.85     3.04    -2.19
386    1.81     4.07    -2.26
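The diff column can be recomputed from the two GPA columns. The following is a minimal Python sketch that uses only the six rows listed above; the variable names are illustrative, and the full data22 file is not reproduced here.

```python
# Rows copied from the table above: (observation, colgpa, hsgpa).
rows = [
    (390, 1.57, 3.60),
    (380, 1.57, 3.67),
    (61, 1.11, 3.21),
    (92, 1.84, 4.00),
    (416, 0.85, 3.04),
    (386, 1.81, 4.07),
]

# diff = colgpa - hsgpa, rounded to two decimals to match the table.
diffs = {obs: round(col - hs, 2) for obs, col, hs in rows}

# Keep only observations whose difference exceeds 2 in absolute value.
large = [obs for obs, d in diffs.items() if abs(d) > 2]

for obs, d in diffs.items():
    print(obs, d)
```

Applied to the full dataset, the same filter would be expected to return exactly these six observations.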
Overall impression: in general, hsgpa is greater than colgpa, which is what we would expect
before seeing the data, since grades generally decline in college.
(iv) The summary statistics are below. The means are highlighted and support the conclusion
reached in (iii), since the mean of hsgpa (3.5578) is greater than the mean of colgpa (2.7855).
The variance of hsgpa is 0.17605 (= 0.41958^2). The coefficient of variation (CV) for hsgpa is
0.11793. The CV uses the standard deviation rather than the variance because the units of the
variance are squared (e.g., GPA squared), while the units of the standard deviation are identical
to those of the variable of interest. Dividing the standard deviation by the mean therefore
cancels the units, giving a unitless measure of spread. The CV is defined as “a measure of the
dispersion of a distribution relative to its mean” (Ramanathan, pg. 26). We might use CV when we
are interested in a measure of spread
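
The variance and CV figures quoted in (iv) can be checked from the reported mean and standard deviation of hsgpa. This is a minimal Python sketch; the variable names are illustrative, and the inputs are the rounded values given in the answer rather than values recomputed from the raw data.

```python
# Summary statistics for hsgpa as reported in the answer to (iv).
mean_hsgpa = 3.5578
sd_hsgpa = 0.41958

# Variance is the square of the standard deviation (units: GPA squared).
variance = sd_hsgpa ** 2

# CV = standard deviation / mean; the GPA units cancel, so CV is unitless.
cv = sd_hsgpa / mean_hsgpa

print(f"variance = {variance:.5f}")
print(f"CV = {cv:.5f}")
```

Rounded to five decimals, these agree with the 0.17605 and 0.11793 reported in the answer.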
 Winter '09
 Havenner