Out[122]: 4.838400000000001

Hint: In class, what sort of samples can properly represent the population?

In [123]:
    sample1 = earthquakes.sort('mag', descending=True).take(np.arange(100))
    sample1_magnitude_mean = np.mean(sample1.column('mag'))
    sample2 = earthquakes.take(np.arange(100))
    sample2_magnitude_mean = np.mean(sample2.column('mag'))
    [sample1_magnitude_mean, sample2_magnitude_mean]

Out[123]: [6.422999999999999, 4.7749999999999995]

Sample 1 should not be expected to represent the entire population. Because the table was sorted by magnitude in descending order before the first 100 rows were taken, sample 1 consists of the 100 strongest earthquakes that occurred in this time frame, so its mean is biased toward stronger earthquakes. Sample 2 is more likely to be close to the population mean: the earthquakes table is ordered by time, so, assuming an earthquake's magnitude does not depend on when it occurs, taking the first 100 rows should not give a biased sample.
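The contrast between a sorted sample and a representative one can be sketched with the standard library alone. This is only an illustration under stated assumptions: the `magnitudes` list is synthetic (the real earthquakes table is not reproduced here), and the baseline is a simple random sample, which is not the same as sample 2 above (the first 100 rows in time order).

```python
import random
import statistics

# Hypothetical stand-in for the 'mag' column: synthetic values, for illustration only.
rng = random.Random(0)
magnitudes = [round(4.5 + rng.expovariate(2.0), 1) for _ in range(2000)]

population_mean = statistics.mean(magnitudes)

# Analogue of sample 1: sort in descending order, then take the first 100 values.
top_100 = sorted(magnitudes, reverse=True)[:100]
top_100_mean = statistics.mean(top_100)

# A simple random sample of 100 values, drawn without replacement.
random_100 = rng.sample(magnitudes, 100)
random_100_mean = statistics.mean(random_100)

print(population_mean, top_100_mean, random_100_mean)
```

On data like this, the mean of the 100 largest values sits well above the population mean, while the mean of the random sample lands close to it.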
In [124]:
    _ = ok.grade('q4_2')

    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    Running tests

    ---------------------------------------------------------------------
    Test summary
        Passed: 1
        Failed: 0
    [ooooooooook] 100.0% passed

Question 3. Suppose we want to figure out what the biggest magnitude earthquake was in 2017, but we are tasked with doing this only with a sample of 500 from the earthquakes table. To determine whether trying to find the biggest magnitude from a sample is a plausible idea, write code that simulates the maximum of a random sample of size 500 from the earthquakes table 5000 times. Assign your array of maximums to maximums.

In [127]:
    maximums = make_array()
    for i in np.arange(5000):
        # Draw a random sample of 500 rows from the table.
        representative_sample = earthquakes.sample(500)
        # Record the largest value in column 1 of that sample.
        representative_max = max(representative_sample.column(1))
        maximums = np.append(maximums, representative_max)

Out[127]: 5000
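The simulation described in Question 3 can also be sketched without the course library. Everything here is an assumption made for illustration: the `magnitudes` list is synthetic, the helper name `simulate_sample_maximums` is hypothetical, and the draws are made with replacement, which is my understanding of the default behavior of `Table.sample`.

```python
import random

rng = random.Random(1)

# Hypothetical stand-in for the magnitude column: synthetic values only.
magnitudes = [round(4.5 + rng.expovariate(2.0), 1) for _ in range(10000)]
true_max = max(magnitudes)


def simulate_sample_maximums(values, sample_size, repetitions, rng):
    """Return the maximum of each of `repetitions` random samples.

    Each sample has `sample_size` values drawn with replacement.
    """
    return [max(rng.choices(values, k=sample_size)) for _ in range(repetitions)]


maximums = simulate_sample_maximums(magnitudes, 500, 5000, rng)

print(len(maximums), min(maximums), max(maximums), true_max)
```

Building the list in a single comprehension avoids repeatedly copying an array, which is what appending inside a loop does.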
In [130]:
    #Histogram of your maximums
    Table().with_column('Largest magnitude in sample', maximums).hist('Largest magnitude …

In [131]:
    _ = ok.grade('q4_3')

    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    Running tests

    ---------------------------------------------------------------------
    Test summary
        Passed: 1
        Failed: 0
    [ooooooooook] 100.0% passed

Question 4. We want to see if a random sample of size 500 is likely to help you determine the largest magnitude earthquake in the population. To help determine this, find the magnitude of the (actual) strongest earthquake in 2017.
In [132]:
    strongest_earthquake_magnitude = max(earthquakes.column(1))
    strongest_earthquake_magnitude

Out[132]: 8.2

In [133]:
    _ = ok.grade('q4_4')

    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    Running tests

    ---------------------------------------------------------------------
    Test summary
        Passed: 1
        Failed: 0
    [ooooooooook] 100.0% passed

Question 5. Explain whether you believe you can accurately use a sample size of 500 to determine the maximum. What is a specific con of using the maximum as your estimator? Use the histogram above to help answer.
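One way to reason about Question 5 is to ask how likely a random sample is to contain the single strongest earthquake at all. The sketch below is a back-of-the-envelope calculation, not part of the assignment: it assumes the largest magnitude belongs to exactly one row, the helper name `chance_sample_contains_max` is hypothetical, and the population sizes are made up because the table's row count is not given here.

```python
def chance_sample_contains_max(population_size, sample_size, with_replacement=True):
    """Probability that a uniform random sample includes one specific row."""
    if with_replacement:
        # Each draw misses the row with probability 1 - 1/N, independently.
        return 1 - (1 - 1 / population_size) ** sample_size
    # Without replacement, every row is equally likely to be among the n chosen.
    return min(sample_size, population_size) / population_size


# Hypothetical population sizes; the real table's size is not stated.
for n in (1000, 5000, 20000):
    print(n, round(chance_sample_contains_max(n, 500), 3))
```

Because a sample maximum can never exceed the population maximum, any sample that misses that row underestimates it, so the estimate errs in only one direction.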
5. Assessing Gary's Models

Games with Gary
