# Problem Set 4 - CSSS/SOC/STAT 221 Summer 2020

CSSS/SOC/STAT 221 Summer 2020

Name: ____
Collaborators: __________________________________________
Student number: _____________

Problem Set 4

CSSS/SOC/STAT 221 students from the Summer 2019, Fall 2019, Winter 2020, and Summer 2020 quarters were asked to report the lowest temperature they remembered experiencing, in degrees Fahrenheit (F).

Among the Summer 2019 students, 26 out of 37 reported valid temperatures (the remaining 11 opted not to respond to this question). Among the Fall 2019 students, 159 out of 188 reported valid temperatures (the remaining 29 opted not to respond). Among the Winter 2020 students, 204 out of 238 reported valid temperatures (32 opted not to answer, while 2 more reported temperatures that were impossibly high for that quarter). Among the Summer 2020 students, 51 out of 62 reported valid temperatures (10 opted not to answer, while 1 more reported a temperature that was impossibly high for that quarter). In total, 440 out of 525 students provided valid responses to this question (an 83.81% valid response rate).

The side-by-side box plots below suggest that the temperature experiences of these students were broadly similar across the four quarters, from Summer 2019 through Summer 2020:

[Figure: side-by-side box plots of the lowest remembered temperature, by quarter]

The following figure further reinforces this assertion:

[Figure: histogram of all valid responses with smoothed quarter-specific curves and a dashed fitted curve]
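The response counts quoted in the problem statement can be tallied as a quick arithmetic check. The sketch below is only an illustration: the counts are copied from the text above, while the names `counts` and `valid_rate` are hypothetical and not part of the problem set.

```python
# Counts copied from the problem statement: (valid responses, students asked).
counts = {
    "Summer 2019": (26, 37),
    "Fall 2019": (159, 188),
    "Winter 2020": (204, 238),
    "Summer 2020": (51, 62),
}


def valid_rate(valid, total):
    """Return the valid response rate as a percentage."""
    return 100.0 * valid / total


# Pool the four quarters into one combined sample.
total_valid = sum(valid for valid, _ in counts.values())
total_students = sum(total for _, total in counts.values())
overall_rate = valid_rate(total_valid, total_students)

for quarter, (valid, total) in counts.items():
    print(f"{quarter}: {valid}/{total} = {valid_rate(valid, total):.2f}%")
print(f"Overall: {total_valid}/{total_students} = {overall_rate:.2f}%")
```

The pooled totals reproduce the 440 out of 525 valid responses reported in the text.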
The gray histogram presents the sample distribution of all 440 valid responses. The solid black, red, green, and blue lines present smoothed representations of each quarter’s sample distribution (they are similar to the hollow histograms discussed in chapter 2 of your textbook).

As you can see, the location, scale, and shape of all four quarter-specific sample distributions are very similar to one another. Their medians vary between 10 and 12.5 degrees F; their first quartiles vary between -5 and 0; their third quartiles vary between 20 and 24; and their interquartile ranges vary between 20 and 29. All four samples are also clearly left-skewed. When combined into a single sample, the quartiles {Q0, Q1, Q2, Q3, Q4} for all 440 valid cases are {-60, -1, 12, 22, 60} degrees F, implying an interquartile range of 22 - (-1) = 23 degrees.

The dashed line in the above graph represents a new parametric probability distribution model for continuous numerical variables called the Gumbel distribution, which has been fitted to the combined student data. The Gumbel model belongs to a set of models called “extreme value distributions,” which are widely used to approximate the probability distributions underlying “lowest value from median” or “highest value from median” variables, for example, scores for a particular sporting event observed over multiple years, daily website user traffic, or market anomalies. These distributions tend to be left-skewed for “lowest value per observational unit” data and right-skewed for “highest value per observational unit” data.

When applied to left-skewed data such as the temperature variable described above, the Gumbel model has two parameters, the location parameter α (the Greek letter alpha) and the scale parameter
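As a rough illustration of a left-skewed Gumbel model, the sketch below uses the standard minimum-value form, whose CDF is F(x) = 1 - exp(-exp((x - α) / β)). Several things here are assumptions rather than content from the problem set: the symbol β for the scale parameter, the function names, and the parameter values, which are hypothetical and are not the values fitted to the student data.

```python
import math


def gumbel_min_cdf(x, alpha, beta):
    """CDF of the minimum-value (left-skewed) Gumbel distribution."""
    return 1.0 - math.exp(-math.exp((x - alpha) / beta))


def gumbel_min_quantile(p, alpha, beta):
    """Inverse CDF: the value x with P(X <= x) = p, for 0 < p < 1."""
    return alpha + beta * math.log(-math.log(1.0 - p))


# Hypothetical parameters, chosen only for illustration (not fitted values).
alpha, beta = 15.0, 15.0

q1 = gumbel_min_quantile(0.25, alpha, beta)
q2 = gumbel_min_quantile(0.50, alpha, beta)
q3 = gumbel_min_quantile(0.75, alpha, beta)

print(f"Q1 = {q1:.2f}, median = {q2:.2f}, Q3 = {q3:.2f}, IQR = {q3 - q1:.2f}")

# Left skew: the lower half of the middle 50% is wider than the upper half.
print("Q2 - Q1 > Q3 - Q2:", (q2 - q1) > (q3 - q2))
```

Under this parameterization the interquartile range is always about 1.57 times β, so the scale parameter controls spread while α shifts the whole curve left or right.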
