# 8x3_lab02.pdf -...


{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Lab 2: Regression\n", "\n", "Welcome to Lab 2 of Data 8.3x!\n", "\n", "Today we will get some hands-on practice with linear regression. You can find more information about this topic in\n", "[section 15.2]( Regression_Line)." ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "# Run this cell, but please don't change it.\n", "\n", "# These lines import the Numpy and Datascience modules.\n", "import numpy as np\n", "from datascience import *\n", "\n", "# These lines do some fancy plotting magic.\n", "import matplotlib\n", "%matplotlib inline\n", "import matplotlib.pyplot as plots\n", "plots.style.use('fivethirtyeight')\n", "import warnings\n", "warnings.simplefilter('ignore', FutureWarning)\n", "warnings.simplefilter('ignore', UserWarning)\n", "\n", "# These lines load the tests.\n", "from gofer.ok import check" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 1. How Faithful is Old Faithful? Revisited\n", "\n", "Let's revisit a question from lab 1. Last lab, we investigated Old Faithful, a geyser in Yellowstone National Park in the western United States. It's famous for erupting on a fairly regular schedule. \n", "\n", "To recap, some of Old Faithful's eruptions last longer than others. Today, we will use the same dataset on eruption durations and waiting times to see if we can predict the wait time from the eruption duration using linear regression.\n", "\n", "The dataset has one row for each observed eruption. It includes the following columns:\n", "- **duration**: Eruption duration, in minutes\n", "- **wait**: Time between this eruption and the next, also in minutes\n", "\n", "Run the next cell to load the dataset." 
] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "data": { "text/html": [ "<table border=\"1\" class=\"dataframe\">\n", " <thead>\n", " <tr>\n", " <th>duration</th> <th>wait</th>\n", " </tr>\n", " </thead>\n", " <tbody>\n", " <tr>\n", " <td>3.6 </td> <td>79 </td>\n", " </tr>\n", " <tr>\n", " <td>1.8 </td> <td>54 </td>\n", " </tr>\n", " <tr>\n", " <td>3.333 </td> <td>74 </td>\n", " </tr>\n", " <tr>\n", " <td>2.283 </td> <td>62 </td>\n", " </tr>\n", " <tr>\n", " <td>4.533 </td> <td>85 </td>\n", " </tr>\n", " <tr>\n", " <td>2.883 </td> <td>55 </td>\n", " </tr>\n", " <tr>\n", " <td>4.7 </td> <td>88 </td>\n", " </tr>\n", " <tr>\n", " <td>3.6 </td> <td>85 </td>\n", " </tr>\n", " <tr>\n", " <td>1.95 </td> <td>51 </td>\n", " </tr>\n", " <tr>\n", " <td>4.35 </td> <td>85 </td>\n", " </tr>\n", " </tbody>\n", "</table>\n", "<p>... (262 rows omitted)</p>" ], "text/plain": [ "duration | wait\n", "3.6 | 79\n", "1.8 | 54\n", "3.333 | 74\n", "2.283 | 62\n", "4.533 | 85\n", "2.883 | 55\n", "4.7 | 88\n", "3.6 | 85\n", "1.95 | 51\n", "4.35 | 85\n", "... (262 rows omitted)" ] }, "execution_count": 2, "metadata": {}, "output_type": "execute_result" } ], "source": [ "faithful = Table.read_table(\"faithful.csv\")\n", "faithful" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Remember from last lab that we concluded eruption time and waiting time are positively correlated. The table below called `faithful_standard` contains the eruption durations and waiting times in standard units." 
] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "data": { "text/html": [ "<table border=\"1\" class=\"dataframe\">\n", " <thead>\n", " <tr>\n", " <th>duration (standard units)</th> <th>wait (standard units)</th>\n", " </tr>\n", " </thead>\n", " <tbody>\n", " <tr>\n", " <td>0.0984989 </td> <td>0.597123 </td>\n", " </tr>\n", " <tr>\n", " <td>-1.48146 </td> <td>-1.24518 </td>\n", " </tr>\n", " <tr>\n", " <td>-0.135861 </td> <td>0.228663 </td>\n", " </tr>\n", " <tr>\n", " <td>-1.0575 </td> <td>-0.655644 </td>\n", " </tr>\n", " <tr>\n", " <td>0.917443 </td> <td>1.03928 </td>\n", " </tr>\n", " <tr>\n", " <td>-0.530851 </td> <td>-1.17149 </td>\n", " </tr>\n", " <tr>\n", " <td>1.06403 </td> <td>1.26035 </td>\n", " </tr>\n", " <tr>\n", " <td>0.0984989 </td> <td>1.03928 </td>\n", " </tr>\n", " <tr>\n", " <td>-1.3498 </td> <td>-1.46626 </td>\n", " </tr>\n", " <tr>\n", " <td>0.756814 </td> <td>1.03928 </td>\n", " </tr>\n", " </tbody>\n", "</table>\n", "<p>... (262 rows omitted)</p>" ],
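
The notebook is cut off at this point. As a rough illustration of the two ideas it introduces (standard units and predicting wait time from duration), here is a minimal sketch in plain NumPy. It is not the lab's own solution. It uses only the ten rows visible in the `faithful` output above, so its numbers will differ from the `faithful_standard` table, which is built from all 272 rows. The helper names `standard_units` and `predict_wait` are assumptions made for this example.

```python
import numpy as np

# Only the ten rows visible in the preview of faithful.csv.
# The full table has 272 rows, so these results are illustrative only.
duration = np.array([3.6, 1.8, 3.333, 2.283, 4.533, 2.883, 4.7, 3.6, 1.95, 4.35])
wait = np.array([79, 54, 74, 62, 85, 55, 88, 85, 51, 85])

def standard_units(values):
    """Convert an array to standard units: (value - mean) / SD."""
    return (values - np.mean(values)) / np.std(values)

duration_su = standard_units(duration)
wait_su = standard_units(wait)

# The correlation r is the mean of the products of the two
# variables when both are measured in standard units.
r = np.mean(duration_su * wait_su)

# Regression line in original units:
#   slope = r * SD(wait) / SD(duration)
#   intercept = mean(wait) - slope * mean(duration)
slope = r * np.std(wait) / np.std(duration)
intercept = np.mean(wait) - slope * np.mean(duration)

def predict_wait(minutes):
    """Predicted wait (minutes) after an eruption lasting `minutes`."""
    return slope * minutes + intercept

print("r:", r)
print("predicted wait after a 3-minute eruption:", predict_wait(3))
```

In standard units the regression line has slope `r` and passes through the origin. The slope and intercept above are the same line converted back to minutes.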
