# lab12

lab12, 4/26/19, 1:13 PM

In [38]:

```python
# Don't change this cell; just run it.
from datascience import *
import numpy as np
import math
import matplotlib.pyplot as plt
%matplotlib inline
plt.style.use('fivethirtyeight')
```

In [39]:

```python
# Don't change this cell either; just run it.
def abline(slope, intercept):
    """Plot a line from slope and intercept"""
    axes = plt.gca()
    x_vals = np.array(axes.get_xlim())
    y_vals = intercept + slope * x_vals
    plt.plot(x_vals, y_vals, '--')

def str_mat(A):
    return '\n'.join([
        ' '.join(['{:.3f}'.format(A[i, j]) for j in range(A.shape[1])])
        for i in range(A.shape[0])
    ])

def sort_eigs(eigvals, eigvecs):
    idx = eigvals.argsort()[::-1]
    eigvals = eigvals[idx]
    eigvecs = eigvecs[:, idx]
    return eigvals, eigvecs
```

## 1. PCA

A matrix doesn't have to represent a linear function. Some matrices, for example, represent data sets in such a way that every column of the matrix corresponds to a point of the data set.

Let's see an example in 2 dimensions. We have here two variables, $X$, which is the weights of 30 students in the fifth grade, and $Y$, which is their heights, and we're interested in seeing how "related" the two variables are, e.g. in the sense that changes in the value of one variable "predict" changes in the value of the other variable.

First, let's plot the sample points to get an idea of what we are working with.
In [3]:

```python
np.random.seed(42)
X = (np.random.normal(loc=5, scale=15, size=30)) + 80
Y = 0.01*X + 4 + np.random.normal(0, .25, size=30)
plt.ylim(0, 6)
plt.xlim(0, 120)
plt.scatter(X, Y, alpha=0.5)
plt.title('Height and weight of fifth graders')
plt.xlabel('weight (lbs)')
plt.ylabel('height (ft)')
plt.show()
```

### 1.1. Finding the correlation coefficient

The correlation coefficient is one first way in which we might try to gauge the strength and direction of the relationship between the two variables $X$ and $Y$.
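As a reminder of the quantities involved, the standard sample definitions can be sketched as follows. The symbols $n$, $x_i$, $y_i$, $\bar{x}$ and $\bar{y}$ (sample size, individual observations, and sample means) are notation assumed here, not taken from the lab text; the $n-1$ normalization matches the default behavior of `np.cov`.

```latex
\operatorname{Cov}(X, Y) = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y}),
\qquad
\operatorname{Var}(X) = \operatorname{Cov}(X, X),
\qquad
\sigma(X) = \sqrt{\operatorname{Var}(X)}
```

Because the same normalization factor appears in the covariance and in both variances, it cancels when the covariance is divided by the product of the two standard deviations.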
#### 1.1.1.

In the cell below, calculate the correlation coefficient:

$$\rho_{X,Y} \overset{\text{def}}{=} \frac{\operatorname{Cov}(X, Y)}{\sigma(X)\,\sigma(Y)}$$

You may not use `np.corrcoef` or any other similar methods.

Hint: Which values do you need to calculate the correlation coefficient? Where can you find these values within the covariance matrix?

In [4]:

```python
n = X.shape[0]
A = np.vstack((X.reshape((1, n)), Y.reshape((1, n))))
A_cov = np.cov(A)
corr_XY = A_cov.item(1)/(A_cov.item(0) * A_cov.item(3))**.5 # SOLUTION
print('rho_XY = {}\n'.format(corr_XY))
print('Your solution is {}.'.format(('correct' if np.isclose(corr_XY, np.corrcoef(A)[0][1]) else 'incorrect')))
```

#### 1.1.2. Observations

Based on the value of the correlation coefficient, to what extent do the two variables "predict" each other? In other words, to what extent do changes in the value of one variable correspond to changes in the value of the other variable?
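The calculation in 1.1.1 reads the covariance matrix through flat indices (`item(0)`, `item(1)`, `item(3)`). The sketch below is one possible way to spell out the same computation with explicit 2-D indexing, so that it is visible which entry holds which quantity. It regenerates the lab's synthetic data with the same seed; the names `C`, `cov_xy_manual` and `rho` are introduced here for illustration and do not appear in the lab.

```python
import numpy as np

# Same synthetic data as the lab's plotting cell (seed 42, 30 students).
np.random.seed(42)
X = np.random.normal(loc=5, scale=15, size=30) + 80      # weights (lbs)
Y = 0.01 * X + 4 + np.random.normal(0, 0.25, size=30)    # heights (ft)

# np.cov treats each row as one variable, so stack X and Y as two rows.
A = np.vstack((X, Y))
C = np.cov(A)  # 2x2 matrix: [[Var(X), Cov(X, Y)], [Cov(X, Y), Var(Y)]]

# Sample covariance straight from its definition
# (np.cov divides by n - 1 by default).
n = X.shape[0]
cov_xy_manual = np.sum((X - X.mean()) * (Y - Y.mean())) / (n - 1)

# Correlation = covariance scaled by both standard deviations.
rho = C[0, 1] / np.sqrt(C[0, 0] * C[1, 1])

print('Cov(X, Y) =', C[0, 1])
print('rho_XY    =', rho)
```

For a 2x2 array, `item(1)` is the entry `[0, 1]`, `item(0)` is `[0, 0]`, and `item(3)` is `[1, 1]`, so `rho` here should agree with `corr_XY` from the lab's cell.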
