# 3. Sampling Basketball Players

This exercise uses salary data and game statistics for basketball players from the 2014-2015 NBA season. The data was collected from Basketball-Reference and Spotrac.

Run the next cell to load the two datasets.

In [20]:

    player_data = Table.read_table('player_data.csv')
    salary_data = Table.read_table('salary_data.csv')
    player_data.show(3)
    salary_data.show(3)

| Name | Age | Team | Games | Rebounds | Assists | Steals | Blocks | Turnovers | Points |
|---|---|---|---|---|---|---|---|---|---|
| James Harden | 25 | HOU | 81 | 459 | 565 | 154 | 60 | 321 | 2217 |
| Chris Paul | 29 | LAC | 82 | 376 | 838 | 156 | 15 | 190 | 1564 |
| Stephen Curry | 26 | GSW | 80 | 341 | 619 | 163 | 16 | 249 | 1900 |

... (489 rows omitted)

| PlayerName | Salary |
|---|---|
| Kobe Bryant | 23500000 |
| Amar'e Stoudemire | 23410988 |
| Joe Johnson | 23180790 |

... (489 rows omitted)

Question 1. We would like to relate players' game statistics to their salaries. Compute a table called `full_data` that includes one row for each player who is listed in both `player_data` and `salary_data`. It should include all the columns from `player_data` and `salary_data`, except the `"PlayerName"` column.
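Question 1 asks for one row per player who appears in both tables, which is the matching rule of an inner join on the name columns. The plain-Python sketch below illustrates that rule with dictionaries standing in for the two tables. The helper `inner_join` is hypothetical and is not how the `datascience` library implements `join`; it also assumes each name appears at most once in the salary rows. The figures are taken from tables in this exercise.

```python
# Tiny stand-ins for player_data and salary_data (a few columns only).
players = [
    {"Name": "James Harden", "Points": 2217},
    {"Name": "Al Horford", "Points": 1156},
]
salaries = [
    {"PlayerName": "Kobe Bryant", "Salary": 23500000},
    {"PlayerName": "Al Horford", "Salary": 12000000},
]


def inner_join(left, left_key, right, right_key):
    """Hypothetical helper: merge each left row with the right row that
    shares its key, and drop left rows that have no match.

    Assumes keys are unique within `right`.
    """
    lookup = {row[right_key]: row for row in right}
    joined = []
    for row in left:
        match = lookup.get(row[left_key])
        if match is not None:
            merged = dict(row)
            # Copy every column from the right row except its key column,
            # so "PlayerName" does not appear in the result.
            merged.update({k: v for k, v in match.items() if k != right_key})
            joined.append(merged)
    return joined


full = inner_join(players, "Name", salaries, "PlayerName")
print(full)
```

Only Al Horford survives the join here, because James Harden has no row in the small salary list and Kobe Bryant has no row in the small player list.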
In [21]:

    full_data = player_data.join("Name", salary_data, "PlayerName")
    full_data

Out[21]:

| Name | Age | Team | Games | Rebounds | Assists | Steals | Blocks | Turnovers | Points | Salary |
|---|---|---|---|---|---|---|---|---|---|---|
| A.J. Price | 28 | TOT | 26 | 32 | 46 | 7 | 0 | 14 | 133 | 62552 |
| Aaron Brooks | 30 | CHI | 82 | 166 | 261 | 54 | 15 | 157 | 954 | 1145685 |
| Aaron Gordon | 19 | ORL | 47 | 169 | 33 | 21 | 22 | 38 | 243 | 3992040 |
| Adreian Payne | 23 | TOT | 32 | 162 | 30 | 19 | 9 | 44 | 213 | 1855320 |
| Al Horford | 28 | ATL | 76 | 544 | 244 | 68 | 98 | 100 | 1156 | 12000000 |
| Al Jefferson | 30 | CHO | 65 | 548 | 113 | 47 | 84 | 68 | 1082 | 13666667 |
| Al-Farouq Aminu | 24 | DAL | 74 | 342 | 59 | 70 | 62 | 55 | 412 | 1100602 |
| Alan Anderson | 32 | BRK | 74 | 204 | 83 | 56 | 5 | 60 | 545 | 1276061 |
| Alec Burks | 23 | UTA | 27 | 114 | 82 | 17 | 5 | 52 | 374 | 3034356 |
| Alex Kirk | 23 | CLE | 5 | 1 | 1 | 0 | 0 | 0 | 4 | 507336 |

... (482 rows omitted)

In [22]:

    _ = ok.grade('q3_1')

    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    Running tests

    ---------------------------------------------------------------------
    Test summary
        Passed: 1
        Failed: 0
    [ooooooooook] 100.0% passed

Basketball team managers would like to hire players who perform well but don't command high salaries. From this perspective, a very crude measure of a player's value to their team is the number of points the player scored in a season for every \$1000 of salary. (Note: the `Salary` column is in dollars, not thousands of dollars.) For example, Al Horford scored 1156 points and has a salary of \$12 million. This is equivalent to 12,000 thousands of dollars, so his value is $\frac{1156}{12000}$.

Question 2. Create a table called `full_data_with_value` that's a copy of `full_data`, with an extra column called `"Value"` containing each player's value (according to our crude measure). Then make a histogram of players' values. Specify bins that make the histogram informative, and don't forget your units! Remember that `hist()` takes in an optional third argument that allows you to specify the units!

Hint: Informative histograms contain a majority of the data and exclude outliers.

In [49]:

    value = full_data.column("Points") / (full_data.column("Salary") / 1000)
    full_data_with_value = full_data.with_column("Value", value)
    full_data_with_value.hist("Value", unit="Points per thousand dollars", bins=np.arange(0, 1, .05))
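The value measure and the binning hint in Question 2 can be checked by hand on a few rows. The sketch below uses plain Python rather than the `datascience` and NumPy calls from the notebook. The helpers `value_per_thousand` and `bin_index` are hypothetical names introduced for illustration, and the bin edges only approximate `np.arange(0, 1, .05)`. The rows are copied from the joined table in this exercise.

```python
from bisect import bisect_right

# (name, points, salary in dollars), copied from the joined table.
rows = [
    ("A.J. Price", 133, 62552),
    ("Aaron Brooks", 954, 1145685),
    ("Al Horford", 1156, 12000000),
    ("Alex Kirk", 4, 507336),
]


def value_per_thousand(points, salary_dollars):
    """Points scored per $1000 of salary (salary is given in dollars)."""
    return points / (salary_dollars / 1000)


values = {name: value_per_thousand(p, s) for name, p, s in rows}

# Bin edges 0.00, 0.05, ..., 0.95, approximating np.arange(0, 1, .05).
edges = [i * 0.05 for i in range(20)]


def bin_index(v):
    """Index of the bin that contains v, or None if v lies outside the edges."""
    if v < edges[0] or v > edges[-1]:
        return None
    # A value equal to the last edge belongs to the last bin.
    return min(bisect_right(edges, v) - 1, len(edges) - 2)


for name, v in values.items():
    print(f"{name}: value={v:.4f}, bin={bin_index(v)}")
```

Al Horford's value matches the worked example, 1156 / 12000. A.J. Price's value is above 2 because of his very low salary, so it falls outside these bins, which is the kind of outlier the hint suggests leaving out of the histogram.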