4/26/19, 1:13 PM    lab12    Page 1 of 24

In [38]:
    # Don't change this cell; just run it.
    from datascience import *
    import numpy as np
    import math
    import matplotlib.pyplot as plt
    %matplotlib inline
    plt.style.use('fivethirtyeight')

In [39]:
    # Don't change this cell either; just run it.
    def abline(slope, intercept):
        """Plot a line from slope and intercept"""
        axes = plt.gca()
        x_vals = np.array(axes.get_xlim())
        y_vals = intercept + slope * x_vals
        plt.plot(x_vals, y_vals, '--')

    def str_mat(A):
        """Format a 2-D array as text, one row per line, 3 decimals per entry"""
        return '\n'.join([
            ' '.join(['{:.3f}'.format(A[i, j]) for j in range(A.shape[1])])
            for i in range(A.shape[0])
        ])

    def sort_eigs(eigvals, eigvecs):
        """Sort eigenvalues in descending order, reordering eigenvector columns to match"""
        idx = eigvals.argsort()[::-1]
        eigvals = eigvals[idx]
        eigvecs = eigvecs[:, idx]
        return eigvals, eigvecs

1. PCA

A matrix doesn't have to represent a linear function. Some matrices, for example, represent data sets in such a way that every column of the matrix corresponds to a point of the data set.

Let's see an example in 2 dimensions. We have here two variables: $X$, the weights of 30 students in the fifth grade, and $Y$, their heights. We're interested in seeing how "related" the two variables are, e.g. in the sense that changes in the value of one variable "predict" changes in the value of the other variable.

First, let's plot the sample points to get an idea of what we are working with.
In [3]:
    np.random.seed(42)
    X = (np.random.normal(loc=5, scale=15, size=30)) + 80
    Y = 0.01*X + 4 + np.random.normal(0, .25, size=30)
    plt.ylim(0, 6)
    plt.xlim(0, 120)
    plt.scatter(X, Y, alpha=0.5)
    plt.title('Height and weight of fifth graders')
    plt.xlabel('weight (lbs)')
    plt.ylabel('height (ft)')
    plt.show()

1.1. Finding the correlation coefficient

The correlation coefficient $\rho_{X,Y}$ is a first way in which we might try to gauge the strength and direction of the relationship between the two variables $X$ and $Y$.
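To make "strength and direction" concrete, here is a minimal sketch that computes a sample correlation directly from deviations about the mean. The helper `correlation` and the toy arrays are hypothetical names and values chosen for illustration; they are not part of the lab. The normalizing factor (such as 1/(n-1)) cancels between numerator and denominator, so it is omitted.

```python
import numpy as np

def correlation(x, y):
    """Sample correlation coefficient computed from deviations about the mean.

    Hypothetical helper for illustration only.
    """
    dx = x - x.mean()
    dy = y - y.mean()
    # Any common normalizing factor cancels between the covariance
    # in the numerator and the standard deviations in the denominator.
    return (dx * dy).sum() / np.sqrt((dx ** 2).sum() * (dy ** 2).sum())

# Toy data (made up for this sketch)
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y_up = 2.0 * x + 1.0                            # exactly linear, increasing
y_down = -0.5 * x + 3.0                         # exactly linear, decreasing
y_mixed = np.array([2.0, 1.0, 4.0, 3.0, 5.0])   # increasing trend with scatter

rho_up = correlation(x, y_up)        # 1 up to rounding
rho_down = correlation(x, y_down)    # -1 up to rounding
rho_mixed = correlation(x, y_mixed)  # 0.8 up to rounding

print(rho_up, rho_down, rho_mixed)
```

The sign of the result gives the direction of the relationship, and the magnitude (between 0 and 1) gives its strength.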
1.1.1.

In the cell below, calculate the correlation coefficient:

$$\rho_{X,Y} \stackrel{\text{def}}{=} \frac{\mathrm{Cov}(X, Y)}{\sigma(X)\,\sigma(Y)}$$

You may not use np.corrcoef or any other similar methods.

Hint: Which values do you need to calculate the correlation coefficient? Where can you find these values within the covariance matrix?

In [4]:
    n = X.shape[0]
    # Stack X and Y as the rows of a 2 x n matrix, so each column is one data point
    A = np.vstack((X.reshape((1, n)), Y.reshape((1, n))))
    A_cov = np.cov(A)
    # Flat indices into the 2 x 2 covariance matrix:
    # item(0) = Var(X), item(1) = Cov(X, Y), item(3) = Var(Y)
    corr_XY = A_cov.item(1)/(A_cov.item(0) * A_cov.item(3))**.5 # SOLUTION
    print('rho_XY = {}\n'.format(corr_XY))
    print('Your solution is {}.'.format(('correct' if np.isclose(corr_XY, np.corrcoef(A)[0][1]) else 'incorrect')))

1.1.2. Observations

Based on the value of the correlation coefficient, to what extent do the two variables "predict" each other? In other words, to what extent do changes in the value of one variable correspond to changes in the value of the other variable?
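One way to build intuition for this question is to vary the noise and watch how the correlation responds. The sketch below reuses the same linear form as the lab's data (slope 0.01, intercept 4) but with a larger sample, a different random generator and seed, and two noise levels. All of those choices are assumptions made for illustration, as is the helper name `corr_with_noise`. It uses np.corrcoef only as a checking tool, not as a solution to 1.1.1.

```python
import numpy as np

# Assumptions for this sketch: seed, sample size, and noise levels are arbitrary.
rng = np.random.default_rng(0)
x = rng.normal(loc=85, scale=15, size=1000)

def corr_with_noise(noise_scale):
    """Correlation between x and a noisy linear function of x (hypothetical helper)."""
    y = 0.01 * x + 4 + rng.normal(0, noise_scale, size=x.size)
    return np.corrcoef(x, y)[0, 1]

rho_low_noise = corr_with_noise(0.05)   # noise small relative to the linear trend
rho_high_noise = corr_with_noise(1.0)   # noise large relative to the linear trend

print('low noise:  rho = {:.3f}'.format(rho_low_noise))
print('high noise: rho = {:.3f}'.format(rho_high_noise))
```

With the same underlying slope, the correlation is closer to 1 when the noise is small and closer to 0 when the noise dominates, so a larger coefficient means one variable tells you more about the other.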

