ENGRD 2700 F10 Homework 1 Solutions (65 points total)

1. What fraction of the calls are cancelled? (5 points)

To count the cancelled calls, filter out every row whose pre-cancel time is -1 (the value that marks a call that was not cancelled) and count the rows that remain. The fraction is 883 / 15,000 = 5.9%. Other methods, such as sorting or the COUNTIF function, give the same answer.

(There are many ways of computing this number. So long as the method used is explained and the answer is correct, students should receive full points: 3 points for describing the working, 2 points for the percentage.)

2. Provide a histogram for the time in seconds until cancellation for all of those calls that are eventually cancelled. Explain why your histogram allows us to conclude that cancellation times greater than about 840 seconds (14 minutes) can be disregarded as outliers. (No formal definition of outlier is necessary.) (6 points)

First copy the pre-cancel times of the cancelled calls and convert them from days to seconds by multiplying by 24*60*60. Then draw a histogram of these pre-cancel times. The histogram shows that very few calls have a cancellation time greater than 840 s, so those values can be disregarded as outliers.

(6 points, broken down as follows: 2 points for ensuring the histogram doesn't include the "-1"s that indicate a non-cancelled call. 1 point for scaling to seconds from days. 1 point if the histogram axes are both labeled and there is a title (0 if any of these labels are missing). 2 points for explaining their working and reasoning.)

3. Now provide a second histogram for the time until cancellation for all those calls that are eventually cancelled within 840 seconds (14 minutes). Based on your new histogram, what is the most common value for the time until cancellation of a call? (4 points)

Filter the data from question 2 to keep only the calls cancelled within 840 s and draw the histogram. The histogram shows that the most common value for the cancellation time is approximately 300 seconds (5 minutes).

(2 points for the histogram, ensuring that it does not contain "-1"s or calls with pre-cancel time greater than 840 s; 2 points for indicating the most common value of the time. Deduct a point if the time given as the most common value is either too specific (say to the nearest second) or too broad
