# Lines of Best Fit

### Linear Regression

Linear regression is the process used to find the equation of a line of best fit, the line that most closely approximates the linear relationship between two variables. The correlation coefficient indicates the strength of the linear fit to the data.
Linear regression is a statistical method that calculates a line of best fit for a given set of data points. The line of best fit has the minimum value for the sum of the squares of the vertical distances from the data points to the line.
A correlation is a relationship between two variables. When the points in a scatterplot are very close to a line of best fit, there is a strong correlation. When they show a general linear pattern, but are not close to the line, there is a weak correlation. Both positive and negative trends may exhibit strong or weak correlations. When the points show no linear pattern at all, there is no linear correlation.
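The least-squares idea described above can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed method: the function names `least_squares_line` and `sum_squared_residuals` and the sample data are assumptions made for this example. It uses the standard formulas for slope (sum of products of deviations divided by sum of squared $x$-deviations) and intercept, then checks that changing the slope makes the sum of squares larger.

```python
def least_squares_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations, and sum of squared x-deviations.
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    slope = s_xy / s_xx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def sum_squared_residuals(xs, ys, slope, intercept):
    """Sum of the squared vertical distances from the points to the line."""
    return sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))


# Hypothetical sample data, chosen only for illustration.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

slope, intercept = least_squares_line(xs, ys)
best = sum_squared_residuals(xs, ys, slope, intercept)

# A line with a slightly different slope has a larger sum of squares.
nudged = sum_squared_residuals(xs, ys, slope + 0.1, intercept)

print(f"slope = {slope:.2f}, intercept = {intercept:.2f}")
print(f"sum of squares: best = {best:.2f}, nudged = {nudged:.2f}")
```

Any other choice of slope and intercept would give a sum of squares at least as large as `best`, which is what makes this line the line of best fit.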

The correlation coefficient, $r$, is a value between –1 and 1, inclusive, that indicates the strength and direction of the linear correlation between the two variables.

• An $r$-value of 1 indicates that the line has a positive slope and all the points lie on the line.
• A value of $r$ close to 1 indicates a strong positive correlation.
• A positive $r$-value closer to zero than to 1 indicates a weak positive correlation.
• An $r$-value of zero indicates no linear correlation.
• A negative $r$-value closer to zero than to –1 indicates a weak negative correlation.
• A value of $r$ close to –1 indicates a strong negative correlation.
• An $r$-value of –1 indicates that the line has a negative slope and all the points lie on the line.
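The extreme cases in the list above can be checked numerically. The sketch below computes $r$ from the standard Pearson formula; the function name `pearson_r` and the three small data sets are assumptions chosen for illustration, not data from this lesson.

```python
import math


def pearson_r(xs, ys):
    """Correlation coefficient r for paired data, using the Pearson formula."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_yy = sum((y - mean_y) ** 2 for y in ys)
    return s_xy / math.sqrt(s_xx * s_yy)


xs = [1, 2, 3, 4, 5]

# All points lie on a line with positive slope (y = 2x + 1).
r_up = pearson_r(xs, [3, 5, 7, 9, 11])

# All points lie on a line with negative slope (y = -2x + 12).
r_down = pearson_r(xs, [10, 8, 6, 4, 2])

# A symmetric rise-and-fall pattern: a clear shape, but no linear trend.
r_none = pearson_r(xs, [1, 2, 3, 2, 1])

print(r_up, r_down, r_none)
```

Up to floating-point rounding, the three results should be 1, –1, and 0. The last data set shows why an $r$-value of zero means no *linear* correlation: the points follow an obvious pattern, just not a linear one.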

### Interpreting Lines of Best Fit

Technology can be used to generate a line of best fit.
For very small data sets, a line of best fit can be calculated by hand. Most often, technology such as graphing calculators, spreadsheets, or online tools is used to determine the equation of the line.
Step-By-Step Example
Determining the Equation of a Line of Best Fit

Employees at a company start with an average salary of \$40,500 at year zero. The table shows the average salaries of the company's employees for selected years of service. Graph and interpret the line of best fit.

| Year | Average Salary |
|------|----------------|
| 0 | \$40,500 |
| 1 | \$42,000 |
| 2 | \$43,500 |
| 3 | \$45,000 |
| 4 | \$45,500 |
| 5 | \$47,000 |
| 9 | \$52,000 |
| 10 | \$54,000 |
| 11 | \$55,500 |
| 12 | \$56,000 |
| 13 | \$56,500 |
| 14 | \$56,000 |
| 15 | \$56,500 |
| 16 | \$57,000 |
| 17 | \$58,000 |

Step 1
Create a scatterplot of the data.
Step 2
Calculate the line of best fit by using the linear regression function of a graphing calculator.
The line of best fit is:
$y \approx 1{,}061x + 41{,}660$
Solution
Graph the line of best fit on the scatterplot.
Next, interpret the line of best fit.
• The correlation coefficient, $r\approx 0.98$, means that there is a very strong positive correlation between years of service and average salary. Employees who have worked at the company for more years tend to have higher average salaries.