
# BerkeleyX+Data8.1x lab04.html - Functions and...


## Functions and Visualizations

Welcome to lab 4! This week, we'll learn about functions and the table method `apply` from Section 8.1. We'll also learn about visualization from Chapter 7.

First, set up the tests and imports by running the cell below.

In [1]:

    import numpy as np
    from datascience import *

    # These lines set up graphing capabilities.
    import matplotlib
    %matplotlib inline
    import matplotlib.pyplot as plt
    plt.style.use('fivethirtyeight')
    import warnings
    warnings.simplefilter('ignore', FutureWarning)

    from ipywidgets import interact, interactive, fixed, interact_manual
    import ipywidgets as widgets

    from client.api.notebook import Notebook
    ok = Notebook('lab04.ok')

Output:

    =====================================================================
    Assignment: Functions and Visualizations
    OK, version v1.13.11
    =====================================================================

### 1. Functions and CEO Incomes

Let's start with a real data analysis task. We'll look at the 2015 compensation of CEOs at the 100 largest companies in California. The data were compiled for a Los Angeles Times analysis and ultimately came from filings mandated by the SEC from all publicly-traded companies. Two companies have two CEOs, so there are 102 CEOs in the dataset.

We've copied the data in raw form from the LA Times page into a file called `raw_compensation.csv`. (The page notes that all dollar amounts are in millions of dollars.)

In [2]:

    raw_compensation = Table.read_table('raw_compensation.csv')
    raw_compensation

Out[2]:
| Rank | Name | Company (Headquarters) | Total Pay | % Change | Cash Pay | Equity Pay | Other Pay | Ratio of CEO pay to average industry worker pay |
|---|---|---|---|---|---|---|---|---|
| 1 | Mark V. Hurd* | Oracle (Redwood City) | \$53.25 | (No previous year) | \$0.95 | \$52.27 | \$0.02 | 362 |
| 2 | Safra A. Catz* | Oracle (Redwood City) | \$53.24 | (No previous year) | \$0.95 | \$52.27 | \$0.02 | 362 |
| 3 | Robert A. Iger | Walt Disney (Burbank) | \$44.91 | -3% | \$24.89 | \$17.28 | \$2.74 | 477 |
| 4 | Marissa A. Mayer | Yahoo! (Sunnyvale) | \$35.98 | -15% | \$1.00 | \$34.43 | \$0.55 | 342 |
| 5 | Marc Benioff | salesforce.com (San Francisco) | \$33.36 | -16% | \$4.65 | \$27.26 | \$1.45 | 338 |
| 6 | John H. Hammergren | McKesson (San Francisco) | \$24.84 | -4% | \$12.10 | \$12.37 | \$0.37 | 222 |
| 7 | John S. Watson | Chevron (San Ramon) | \$22.04 | -15% | \$4.31 | \$14.68 | \$3.05 | 183 |
| 8 | Jeffrey Weiner | LinkedIn (Mountain View) | \$19.86 | 27% | \$2.47 | \$17.26 | \$0.13 | 182 |
| 9 | John T. Chambers** | Cisco Systems (San Jose) | \$19.62 | 19% | \$5.10 | \$14.51 | \$0.01 | 170 |
| 10 | John G. Stumpf | Wells Fargo (San Francisco) | \$19.32 | -10% | \$6.80 | \$12.50 | \$0.02 | 256 |

... (92 rows omitted)

Question 1.1. We want to compute the average of the CEOs' pay. Try running the cell below.

In [3]:

    np.average(raw_compensation.column("Total Pay"))

Output:

    ---------------------------------------------------------------------------
    TypeError                                 Traceback (most recent call last)
    <ipython-input-3-f97fab5a8083> in <module>()
    ----> 1 np.average(raw_compensation.column("Total Pay"))

    /usr/local/lib/python3.6/dist-packages/numpy/lib/function_base.py in average(a, axis, weights, returned)
       1126
       1127     if weights is None:
    -> 1128         avg = a.mean(axis)
       1129         scl = avg.dtype.type(a.size/avg.size)
       1130     else:
    /usr/local/lib/python3.6/dist-packages/numpy/core/_methods.py in _mean(a, axis, dtype, out, keepdims)
         68             is_float16_result = True
         69
    ---> 70     ret = umr_sum(arr, axis, dtype, out, keepdims)
         71     if isinstance(ret, mu.ndarray):
         72         ret = um.true_divide(

    TypeError: cannot perform reduce with flexible type

You should see an error. Let's examine why this error occurred by looking at the values in the "Total Pay" column. Use the `type` function and set `total_pay_type` to the type of the first value in the "Total Pay" column.

In [4]:

    total_pay_type = type(raw_compensation.column("Total Pay")[0])
    total_pay_type

Out[4]:

    numpy.str_

In [5]:

    _ = ok.grade('q1_1')
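
As a rough illustration of what the type check reveals, the sketch below uses plain Python rather than the lab's `raw_compensation` table. The list `total_pay_strings` is a hypothetical stand-in holding a few "Total Pay" values as text; it is an assumption for illustration, not part of the lab's code. NumPy reports `numpy.str_` where plain Python reports `str`, but the underlying problem is the same: text values cannot be added together to form an average.

```python
# Hypothetical stand-in for a few "Total Pay" values, stored as text
# with the dollar sign included (an assumption for illustration only).
total_pay_strings = ["$53.25", "$53.24", "$44.91"]

# Each value is a string, not a number.
first_value_type = type(total_pay_strings[0])
print(first_value_type)

# Averaging needs addition, and Python cannot add a number to a string,
# so the attempt raises a TypeError instead of producing a result.
try:
    average_pay = sum(total_pay_strings) / len(total_pay_strings)
    failed = False
except TypeError as error:
    failed = True
    print("TypeError:", error)
```

Running it should print the string type followed by a `TypeError` message, which is analogous to (though worded differently from) the failure of `np.average` on the text column.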