Out[40]:
0.95392696822643275
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Running tests
---------------------------------------------------------------------
Test summary
Passed: 1
Failed: 0
[ooooooooook] 100.0% passed
Saving notebook... Saved 'hw07.ipynb'.
Backup... 100% complete
Backup successful for user: [email protected]
URL:
NOTE: this is only a backup. To submit your assignment, use:
python3 ok --submit

need to be done here before making such conclusions.

11/8/2017
hw07
8/31
In [32]:
_ = ok.grade('q1_7')
_ = ok.backup()
2. Finding the Least Squares Regression Line
In this exercise, you'll work with a small invented data set. Run the next cell to generate the dataset d and see a scatter plot.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Running tests
---------------------------------------------------------------------
Test summary
Passed: 1
Failed: 0
[ooooooooook] 100.0% passed
Saving notebook... Saved 'hw07.ipynb'.
Backup... 100% complete
Backup successful for user: [email protected]
URL:
NOTE: this is only a backup. To submit your assignment, use:
python3 ok --submit

In [42]:
d = Table().with_columns(
    'x', make_array(0, 1, 2, 3, 4),
    'y', make_array(1, .5, -1, 2, -3))
d.scatter('x')
Question 1 (Ungraded, but you'll need the result later)
Running the cell below will generate sliders that control the slope and intercept of a line through the scatter plot. When you adjust a slider, the line will move.
By moving the line around, make your best guess at the least-squares regression line. (It's okay if your line isn't exactly right, as long as it's reasonable.)
Note: Python will probably take about a second to redraw the plot each time you adjust the slider. We suggest clicking the place on the slider you want to try and waiting for the plot to be drawn; dragging the slider handle around will cause a long lag.

In [43]:
def plot_line(slope, intercept):
    plt.figure(figsize=(5, 5))
    endpoints = make_array(-2, 7)
    p = plt.plot(endpoints, slope * endpoints + intercept,
                 color='orange', label='Proposed line')
    plt.scatter(d.column('x'), d.column('y'), color='blue', label='Points')
    plt.xlim(-4, 8)
    plt.ylim(-6, 6)
    plt.gca().set_aspect('equal', adjustable='box')
    plt.legend(bbox_to_anchor=(1.8, .8))
    plt.show()

interact(plot_line, slope=widgets.FloatSlider(min=-4, max=4, step=.1),
         intercept=widgets.FloatSlider(min=-4, max=4, step=.1));
You can probably find a reasonable-looking line by just eyeballing it. But remember: the least-squares regression line minimizes the mean of the squared errors made by the line for each point. Your eye might not be able to judge squared errors very well.
A note on mean and total squared error
It is common to think of the least-squares line as the line with the least mean squared error (or the square root of the mean squared error), as the textbook does. But it turns out that it doesn't matter whether you minimize the mean squared error or the total squared error: the total is just the mean multiplied by the number of points, a fixed positive constant, so the same line minimizes both.
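This can be checked numerically with a brute-force grid search over candidate lines (a sketch using plain NumPy, not part of the original notebook; the grid resolution of 0.05 is an arbitrary choice):

```python
import numpy as np

x = np.array([0, 1, 2, 3, 4])
y = np.array([1, .5, -1, 2, -3])

def errors(slope, intercept):
    """Residuals of the line y = slope*x + intercept at each point."""
    return y - (slope * x + intercept)

# Search a coarse grid of candidate slopes and intercepts.
slopes = np.arange(-4, 4, 0.05)
intercepts = np.arange(-4, 4, 0.05)

best_mean = min((np.mean(errors(s, i) ** 2), s, i)
                for s in slopes for i in intercepts)
best_total = min((np.sum(errors(s, i) ** 2), s, i)
                 for s in slopes for i in intercepts)

# Both criteria select the same (slope, intercept) pair.
print(best_mean[1:])
print(best_total[1:])
```

Because the mean is the total divided by 5, the two objective functions rank every candidate line identically, so their minimizers coincide.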