# 1. Preliminaries: We (the RAF in World War II) want to know...


lab08

August 6, 2019

[1]:

    # Initialize OK
    from client.api.notebook import Notebook
    ok = Notebook('lab08.ok')

Output:

    =====================================================================
    Assignment: Resampling and the Bootstrap
    OK, version v1.12.5
    =====================================================================

## 1 Lab 8: Resampling and the Bootstrap

The British Royal Air Force wanted to know how many warplanes the Germans had (some number N, which is a parameter), and they needed to estimate that quantity knowing only a random sample of the planes' serial numbers (from 1 to N). We know that the Germans' warplanes are labeled consecutively from 1 to N, so N would be the total number of warplanes they have.

We normally investigate the random variation among our estimates by simulating a sampling procedure from the population many times and computing estimates from each sample that we generate. In real life, if the RAF had known what the population looked like, they would have known N and would not have had any reason to think about random sampling. However, they didn't know what the population looked like, so they couldn't have run the simulations that we normally do.

Simulating a sampling procedure many times was a useful exercise in understanding random variation for an estimate, but it's not as useful as a tool for practical data analysis.

Let's flip that sampling idea on its head to make it practical. Given just a random sample of serial numbers, we'll estimate N, and then we'll use simulation to find out how accurate our estimate probably is, without ever looking at the whole population. This is an example of statistical inference.

As usual, run the cell below to prepare the lab and the automatic tests.

[3]:

    # Run this cell to set up the notebook, but please don't change it.

    # These lines import the Numpy and Datascience modules.
    import numpy as np
    from datascience import *

    # These lines do some fancy plotting magic.
    import matplotlib
    %matplotlib inline
    import matplotlib.pyplot as plt
    plt.style.use('fivethirtyeight')
    import warnings
    warnings.simplefilter('ignore', FutureWarning)

    # These lines load the tests.
    from client.api.notebook import Notebook
    ok = Notebook('lab08.ok')
    _ = ok.submit()

Output:

    =====================================================================
    Assignment: Resampling and the Bootstrap
    OK, version v1.12.5
    =====================================================================

    Saving notebook... Saved 'lab08.ipynb'.
    Submit... 100% complete
    Submission successful for user: [email protected]
    URL:

### 1.1 Preliminaries

We (the RAF in World War II) want to know the number of warplanes fielded by the Germans. That number is N. The warplanes have serial numbers from 1 to N, so N is also equal to the largest serial number on any of the warplanes.

We only see a small number of serial numbers (assumed to be a random sample with replacement from among all the serial numbers), so we have to use estimation.
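To make the setup concrete, here is a minimal sketch of the situation described above, written with Python's standard `random` module rather than the lab's `numpy` and `datascience` imports. Everything named here is an assumption for illustration: the population size `HYPOTHETICAL_N`, the sample size of 17, and the helper functions `draw_serial_numbers`, `max_estimate`, and `resampled_estimates` do not come from the lab. Using the largest observed serial number as the estimate is just one natural choice suggested by the fact that N equals the largest serial number; the lab may use other estimators.

```python
import random


def draw_serial_numbers(true_n, sample_size, rng):
    """Draw serial numbers uniformly at random, with replacement, from 1..true_n."""
    return [rng.randint(1, true_n) for _ in range(sample_size)]


def max_estimate(serials):
    """Estimate N by the largest serial number observed in the sample."""
    return max(serials)


def resampled_estimates(serials, estimator, repetitions, rng):
    """Resample the observed serials with replacement and re-estimate each time.

    Only the observed sample is used here, never the full population.
    """
    k = len(serials)
    return [estimator(rng.choices(serials, k=k)) for _ in range(repetitions)]


rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical population size; in the real problem N is unknown.
HYPOTHETICAL_N = 300

# The "observed" sample: a small random sample with replacement.
observed = draw_serial_numbers(HYPOTHETICAL_N, 17, rng)

# A single estimate of N from the observed sample.
estimate = max_estimate(observed)

# Re-estimates from resamples of the observed sample, to get a feel
# for how much the estimate varies from sample to sample.
boot = resampled_estimates(observed, max_estimate, 1000, rng)

print("observed serial numbers:", sorted(observed))
print("estimate of N from the sample:", estimate)
print("range of resampled estimates:", min(boot), "to", max(boot))
```

Note that the maximum of a sample can never exceed the true N, and a resampled maximum can never exceed the original sample's maximum, so this particular estimator tends to err on the low side. The spread of the resampled estimates should be read as a rough illustration of variability, not as a precise measure of accuracy.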