hw06 part 4.pdf - hw06 Jupyter Notebook

In [92]:
    earthquakes = Table().read_table('earthquakes_2019.csv').select(['time', …
    earthquakes

Out[92]:
    time                     | mag  | place
    2019-12-31T11:22:49.734Z | 5    | 245km S of L'Esperance Rock, New Zealand
    2019-12-30T17:49:59.468Z | 5    | 37km NNW of Idgah, Pakistan
    2019-12-30T17:18:57.350Z | 5.5  | 34km NW of Idgah, Pakistan
    2019-12-30T13:49:45.227Z | 5.4  | 33km NE of Bandar 'Abbas, Iran
    2019-12-30T04:11:09.987Z | 5.2  | 103km NE of Chichi-shima, Japan
    2019-12-29T18:24:41.656Z | 5.2  | Southwest of Africa
    2019-12-29T13:59:02.410Z | 5.1  | 138km SSW of Kokopo, Papua New Guinea
    2019-12-29T09:12:15.010Z | 5.2  | 79km S of Sarangani, Philippines
    2019-12-29T01:06:00.130Z | 5    | 9km S of Indios, Puerto Rico
    2019-12-28T22:49:15.959Z | 5.2  | 128km SSE of Raoul Island, New Zealand
    ... (1626 rows omitted)

If we were studying all human-detectable 2019 earthquakes and had access to the above data, we'd be in good shape. However, if the USGS didn't publish the full data, we could still learn something about earthquakes from just a smaller subsample. If we gathered our sample correctly, we could use that subsample to get an idea about the distribution of magnitudes (above 5, of course) throughout the year!

In the following lines of code, we take two different samples from the earthquakes table, and calculate the mean of the magnitudes of these earthquakes.

In [93]:
    sample1 = earthquakes.sort('mag', descending=True).take(np.arange(100))
    sample1_magnitude_mean = np.mean(sample1.column('mag'))
    sample2 = earthquakes.take(np.arange(100))
    sample2_magnitude_mean = np.mean(sample2.column('mag'))
    [sample1_magnitude_mean, sample2_magnitude_mean]

Out[93]:
    [6.458999999999999, 5.279000000000001]

Question 1. Are these samples representative of the population of earthquakes in the original table (that is, should we expect the mean to be close to the population mean)?

Hint: Consider the ordering of the earthquakes table.

The second sample is more representative of the population of earthquakes in the original table. The average in this sample is 5.279, which is close to the population mean. It is still not a random sample, though: the table is ordered by time, so its first 100 rows are only the most recent earthquakes in the table. The first sample, on the other hand, consists of the 100 largest magnitudes, so its average of 6.459 is higher than the population mean, which means it is not representative.

Question 2. Write code to produce a sample of size 200 that is representative of the population. Then, take the mean of the magnitudes of the earthquakes in this sample. Assign these to representative_sample and representative_mean respectively.

Hint: In class, we learned what kind of samples should be used to properly represent the population.
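The difference between a sample chosen by sorting and a simple random sample can be sketched outside the notebook. The example below is only an illustration under stated assumptions: the magnitudes are synthetic (the name `magnitudes` and the exponential shape are made up here, not taken from the USGS file), and it uses Python's standard `random.sample` rather than the course library. In the notebook itself, one plausible route would be the `datascience` method `Table.sample` with `with_replacement=False`, but that is an assumption about the intended answer, not something the source confirms.

```python
import random
import statistics

# Hypothetical stand-in for the 'mag' column: 1,636 synthetic magnitudes of 5.0 or more.
# These values are invented for illustration; they are not the USGS data.
rng = random.Random(0)
magnitudes = [round(5.0 + rng.expovariate(3.0), 1) for _ in range(1636)]
population_mean = statistics.mean(magnitudes)

# Biased sample: the 100 largest magnitudes, analogous to sample1 in the notebook.
largest_100 = sorted(magnitudes, reverse=True)[:100]
largest_100_mean = statistics.mean(largest_100)

# Simple random sample of 200 values, drawn without replacement,
# so every earthquake has the same chance of being selected.
representative_sample = rng.sample(magnitudes, 200)
representative_mean = statistics.mean(representative_sample)

print(round(population_mean, 3), round(largest_100_mean, 3), round(representative_mean, 3))
```

With data like this, the mean of the 100 largest values sits well above the population mean, while the mean of the random sample should land close to it. How close it lands varies from draw to draw.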
