Once we recognize a need for a linear function to model the data in "Draw and interpret scatter plots," the natural follow-up question is "what is that linear function?" One way to approximate our linear function is to sketch the line that seems to best fit the data. Then we can extend the line until we can identify the y-intercept. We can approximate the slope of the line by extending it until we can estimate the rise over run, $\frac{\text{rise}}{\text{run}}$.
On a graph, we could try sketching a line.
Using the starting and ending points of our hand-drawn line, points (0, 30) and (50, 90), this graph has a slope of
$$m = \frac{90 - 30}{50 - 0} = \frac{60}{50} = 1.2$$
and a y-intercept at 30. This gives an equation of
$$T(c) = 1.2c + 30$$
where c is the number of chirps in 15 seconds, and T(c) is the temperature in degrees Fahrenheit. The resulting equation is represented in the graph below.
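The slope and intercept calculation can also be sketched in a few lines of Python. This is only an illustration, using the two endpoints (0, 30) and (50, 90) of the hand-drawn line; the helper name `line_through` is a hypothetical choice, not something defined in the text.

```python
def line_through(p1, p2):
    """Return (slope, intercept) of the line through two points."""
    (x1, y1), (x2, y2) = p1, p2
    if x1 == x2:
        raise ValueError("a vertical line has no defined slope")
    slope = (y2 - y1) / (x2 - x1)      # rise over run
    intercept = y1 - slope * x1        # solve y = m*x + b for b
    return slope, intercept


# Endpoints of the hand-drawn line from the text.
slope, intercept = line_through((0, 30), (50, 90))


def T(c):
    """Estimated temperature (degrees Fahrenheit) for c chirps in 15 seconds."""
    return slope * c + intercept


print(slope, intercept)  # 1.2 30.0
```

Because the line was drawn by eye, a different sketch would give slightly different endpoints and therefore a slightly different slope and intercept.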
While the data for most examples does not fall perfectly on the line, the equation is our best guess as to how the relationship will behave outside of the values for which we have data. We use a process known as interpolation when we predict a value inside the domain and range of the data. The process of extrapolation is used when we predict a value outside the domain and range of the data.
The graph below compares the two processes for the cricket-chirp data addressed in Example 2. We can see that interpolation would occur if we used our model to predict temperature when the values for chirps are between 18.5 and 44. Extrapolation would occur if we used our model to predict temperature when the values for chirps are less than 18.5 or greater than 44.
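As a rough sketch, the distinction can be expressed as a simple range check on the input. The bounds 18.5 and 44 are the chirp values quoted above; the function name `prediction_type` is an assumption made for illustration, and treating the endpoints themselves as interpolation is a choice made here, not something the text specifies.

```python
# Smallest and largest chirp counts in the data, as stated in the text.
CHIRP_MIN = 18.5
CHIRP_MAX = 44.0


def prediction_type(chirps, low=CHIRP_MIN, high=CHIRP_MAX):
    """Classify a prediction by whether the input lies inside the observed range."""
    if low <= chirps <= high:
        return "interpolation"
    return "extrapolation"


for c in (10, 30, 50):
    print(c, prediction_type(c))
```

For example, 30 chirps falls inside the observed range, so a prediction there is interpolation, while 10 or 50 chirps would require extrapolation.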
There is a difference between making predictions inside the domain and range of values for which we have data and outside that domain and range. Predicting a value outside of the domain and range has its limitations. When our model no longer applies after a certain point, it is sometimes called model breakdown. For example, predicting a cost function for a period of two years may involve examining the data where the input is the time in years and the output is the cost. But if we try to extrapolate a cost when x = 50, that is, 50 years from now, the model would not apply because we could not account for factors 50 years in the future.
Different methods of making predictions are used to analyze data.
Use the cricket data above to answer the following questions:
We can compare the regions of interpolation and extrapolation using the graph below.
According to the data from the table in Example 3, what temperature can we predict if we count 20 chirps in 15 seconds?
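One way to approach this question, sketched below, is to evaluate the eyeballed line at c = 20. The sketch assumes the hand-drawn line through (0, 30) and (50, 90) from earlier in the section, not a line refit to the table, so the result is an estimate under that assumption. The variable names are illustrative.

```python
# Line through the endpoints (0, 30) and (50, 90) of the hand-drawn fit.
slope = (90 - 30) / (50 - 0)   # 1.2 degrees per chirp
intercept = 30                 # temperature at 0 chirps


def predict_temperature(chirps):
    """Estimated temperature in degrees Fahrenheit for a 15-second chirp count."""
    return slope * chirps + intercept


estimate = predict_temperature(20)
print(round(estimate, 1))  # 54.0
```

Since 20 chirps lies between 18.5 and 44, this prediction of about 54 degrees Fahrenheit is an interpolation.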
While eyeballing a line works reasonably well, there are statistical techniques for fitting a line to data that minimize the differences between the line and data values. One such technique is called least squares regression and can be computed by many graphing calculators, spreadsheet software, statistical software, and many web-based calculators. Least squares regression is one means to determine the line that best fits the data, and here we will refer to this method as linear regression.
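To make the idea concrete, here is a minimal pure-Python sketch of the standard least squares formulas for a line, where the slope is the sum of products of deviations divided by the sum of squared x-deviations. The function name `least_squares_line` and the four sample points are hypothetical and chosen only for illustration; they are not the cricket data.

```python
def least_squares_line(xs, ys):
    """Return (slope, intercept) of the least squares regression line."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least two points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and of products of x and y deviations.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0:
        raise ValueError("all x values are identical; slope is undefined")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical illustrative points (NOT the cricket-chirp data).
xs = [1, 2, 3, 4]
ys = [2, 4, 5, 7]

m, b = least_squares_line(xs, ys)
print(round(m, 3), round(b, 3))  # 1.6 0.5
```

In practice a graphing calculator, spreadsheet, or statistics package performs this computation; the sketch only shows what such tools calculate for a single input variable.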
No. There is only one best-fit line.