# Question 2: she then wants to use 10,000 bootstrap resamples

#### Question 2

She then wants to use 10,000 bootstrap resamples to compute a confidence interval for the proportion of all California voters who will vote Yes. Fill in the next cell to simulate an empirical distribution of Yes proportions with 10,000 resamples. In other words, use bootstrap resampling to simulate 10,000 election outcomes, and populate `resample_yes_proportions` with the Yes proportion of each bootstrap resample. Then, visualize `resample_yes_proportions` with a histogram. You should see a bell-shaped curve centered near the proportion of Yes in the original sample.
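The answer cell for this question is not part of the preview. As a rough illustration of the resampling step only, here is a minimal sketch in plain NumPy. It assumes the sample holds 210 Yes and 190 No votes (the counts that appear in the Question 4 cell), encodes Yes as 1 and No as 0, and uses a seed chosen purely for repeatability. The name `sample_votes` is hypothetical, and the course's own solution may use different tools.

```python
import numpy as np

# Assumption: 210 Yes (1) and 190 No (0) votes in a sample of 400,
# taken from the counts used in the Question 4 cell.
sample_votes = np.array([1] * 210 + [0] * 190)

rng = np.random.default_rng(0)  # hypothetical seed, only for repeatability
num_resamples = 10_000

# Each row is one bootstrap resample: 400 draws with replacement
# from the original sample.
resamples = rng.choice(
    sample_votes, size=(num_resamples, sample_votes.size), replace=True
)

# The mean of a 0/1 row is the proportion of Yes votes in that resample.
resample_yes_proportions = resamples.mean(axis=1)

print("center of the resampled proportions:", resample_yes_proportions.mean())
print("spread of the resampled proportions:", resample_yes_proportions.std())
```

Passing `resample_yes_proportions` to a histogram function such as `matplotlib.pyplot.hist` should then show the roughly bell-shaped curve the question describes.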
    In [27]: _ = ok.grade('q3_2')
             _ = ok.backup()

    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    Running tests

    ---------------------------------------------------------------------
    Test summary
        Passed: 1
        Failed: 0
    [ooooooooook] 100.0% passed

    Saving notebook... Saved 'hw09.ipynb'.
    Backup... 100% complete
    Backup successful for user: [email protected]
    URL:
    NOTE: this is only a backup. To submit your assignment, use:
        python3 ok --submit
#### Question 3

Why does the Central Limit Theorem (CLT) apply in this situation, and how does it explain the distribution we see above?
In a population whose members are 0 and 1, there is a simple formula for the standard deviation of that population:

$$\text{standard deviation} = \sqrt{(\text{proportion of 0s}) \times (\text{proportion of 1s})}$$

(Figuring out this formula, starting from the definition of the standard deviation, is a fun exercise for those who enjoy algebra.)
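A quick numerical check of this formula, reading it with a square root over the product (consistent with the `** 0.5` in the Question 4 cell). The sketch assumes a 0/1 population built from the same 210 and 190 counts used there; the variable names are illustrative only.

```python
import numpy as np

# Assumption: a 0/1 population with 210 ones and 190 zeros,
# mirroring the counts in the Question 4 cell.
population = np.array([1] * 210 + [0] * 190)

p_ones = population.mean()   # proportion of 1s
p_zeros = 1 - p_ones         # proportion of 0s

# SD from the definition: root mean squared deviation from the mean
# (np.std uses the population form, ddof=0, by default).
sd_from_definition = np.std(population)

# SD from the shortcut formula for 0/1 populations.
sd_from_formula = np.sqrt(p_zeros * p_ones)

print(sd_from_definition, sd_from_formula)
```

The two printed values should agree up to floating-point rounding, which is what the algebra exercise would show in general.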
#### Question 4

Using only the CLT and the numbers of Yes and No voters in our sample of 400, compute (algebraically) a number `approximate_sd` that's the predicted standard deviation of the array `resample_yes_proportions` according to the Central Limit Theorem. Do not access the data in `resample_yes_proportions` in any way. Remember that a predicted standard deviation of the sample means can be computed from the population SD and the size of the sample. Also remember that if we do not know the population SD, we can use the sample SD as a reasonable approximation in its place.

    In [28]: approximate_sd = ((210/400) * (190/400) / 400) ** 0.5
             approximate_sd
    Out[28]: 0.02496873044429772

    In [29]: _ = ok.grade('q3_4')
             _ = ok.backup()

    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    Running tests

    ---------------------------------------------------------------------
    Test summary
        Passed: 1
        Failed: 0
    [ooooooooook] 100.0% passed

    Saving notebook... Saved 'hw09.ipynb'.
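Written out, the computation in cell `In [28]` is the sample SD divided by the square root of the sample size. A worked version, using the 210 Yes and 190 No counts from that cell and the sample SD as a stand-in for the unknown population SD:

$$
\text{SD of sample proportions} \approx \frac{\text{sample SD}}{\sqrt{n}}
= \frac{\sqrt{\frac{210}{400} \times \frac{190}{400}}}{\sqrt{400}}
= \sqrt{\frac{0.525 \times 0.475}{400}}
\approx 0.02497
$$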