Question 2. Create a table called full_data_with_value that’s a copy of full_data, with an extra column called "Value" containing each player’s value (according to our crude measure). Then make a histogram of players’ values. Specify bins that make the histogram informative, and don’t forget your units!

Hint: Informative histograms contain a majority of the data and exclude outliers.

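The steps in Question 2 can be sketched in plain Python, without the Table class. This is only an illustration under stated assumptions: the measure used here (points per $1000 of salary), the column names "Points" and "Salary", the four rows, and the bin edges are all hypothetical, so substitute the measure and data defined earlier in the assignment.

```python
# Hypothetical rows standing in for full_data; names and numbers are made up.
full_data = [
    {"Name": "Player A", "Points": 1000, "Salary": 4_000_000},
    {"Name": "Player B", "Points": 300, "Salary": 500_000},
    {"Name": "Player C", "Points": 900, "Salary": 15_000_000},
    {"Name": "Player D", "Points": 1500, "Salary": 500_000},  # an outlier
]


def add_value(rows):
    """Return copies of the rows with an extra "Value" entry.

    ASSUMPTION: value = points scored per $1000 of salary.
    """
    return [dict(row, Value=row["Points"] / (row["Salary"] / 1000)) for row in rows]


def bin_counts(values, bins):
    """Count values per bin; values outside the bin range are skipped.

    Each bin is [left, right), except the last, which also includes its
    right edge.
    """
    counts = [0] * (len(bins) - 1)
    for v in values:
        for i in range(len(counts)):
            is_last = i == len(counts) - 1
            if bins[i] <= v < bins[i + 1] or (is_last and v == bins[-1]):
                counts[i] += 1
                break
    return counts


full_data_with_value = add_value(full_data)
values = [row["Value"] for row in full_data_with_value]

# Hypothetical bins chosen to cover most of the data and leave out the outlier.
bins = [0, 0.25, 0.5, 0.75, 1.0]
counts = bin_counts(values, bins)
print(counts)
```

Because add_value builds new dictionaries, full_data itself is left unchanged, which matches the "copy" wording of the question. Player D's value of 3.0 falls outside the bins and is not counted.
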
Now suppose we weren’t able to find out every player’s salary (perhaps it was too costly to interview each player). Instead, we have gathered a simple random sample of 100 players’ salaries. The cell below loads those data.

In [102]: sample_salary_data = Table.read_table("sample_salary_data.csv")
          sample_salary_data.show(3)

Question 3. Make a histogram of the values of the players in sample_salary_data, using the same method for measuring value we used in question 2. Use the same bins, too.

Hint: This will take several steps.

Now let us summarize what we have seen. To guide you, we have written most of the summary already.

Question 4. Complete the statements below by filling in the [SQUARE BRACKETS].

Hint 1: For a refresher on distribution types, check out Section 10.1.

Hint 2: The hist() table method ignores data points outside the range of its bins, but you may ignore this fact and calculate the areas of the bars using what you know about histograms from lecture.

The plot in question 2 displayed a(n) [empirical] distribution of the population of [492] players. The areas of the bars in the plot sum to [99.2].

The plot in question 3 displayed a(n) [empirical] distribution of the sample of [100] players. The areas of the bars in the plot sum to [95].

Question 5. For which range of values does the plot in question 3 better depict the distribution of the population’s player values: 0 to 0.5, or above 0.5? Explain your answer.

0 to 0.5
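
The area calculation mentioned in Hint 2 of Question 4 can be checked numerically. The sketch below assumes the histogram's vertical axis is in percent per unit, so that each bar's area equals the percent of all data points that fall in its bin. The values and bins are made up for illustration and are not the assignment's data.

```python
def bar_areas(values, bins):
    """Return the area of each histogram bar, in percent.

    ASSUMPTION: bar height is measured in percent per unit, i.e.
    height = (percent of all values in the bin) / (bin width),
    so area = height * width = percent of all values in the bin.
    """
    n = len(values)
    areas = []
    for i in range(len(bins) - 1):
        left, right = bins[i], bins[i + 1]
        is_last = i == len(bins) - 2
        # Bins are [left, right); the last bin also includes its right edge.
        count = sum(1 for v in values if left <= v < right or (is_last and v == right))
        width = right - left
        height = 100 * count / n / width  # percent per unit
        areas.append(height * width)
    return areas


# Ten made-up values; 5.0 lies outside the bins below.
values = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.9, 1.5, 2.0, 5.0]
bins = [0, 0.5, 1.0, 2.0]

areas = bar_areas(values, bins)
total = sum(areas)
print(areas, total)
```

Under this convention the total area is the percent of data points that fall inside the bin range, so it drops below 100 whenever the bins leave some points out. Here one of the ten values is outside the bins, so the areas sum to 90.
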
