Data for two variables can be represented using a scatterplot.
A scatterplot is a data display consisting of the graph of a set of ordered pairs. A scatterplot shows data for two variables and helps to indicate whether there is a relationship between those variables. The data points of a scatterplot, unlike those of a line graph, are not connected by line segments.
A scatterplot can be used in the real world to analyze data and make informed decisions about a situation. For example, a forester might use a scatterplot to determine how fast maple trees in a reforested area are growing based on their age. Before plotting the data on a scatterplot, a table can be created to organize the data, with each row representing one ordered pair. Plotting the data onto a scatterplot can then reveal the relationship between the x- and y-values of the data set.
Maple Tree Age and Height
Using a table can help organize data before plotting points on a scatterplot. For instance, a forester might use a table to record the age and height of maple trees in a reforested area.
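As a rough illustration of organizing data before plotting, the sketch below stores each tree as an ordered pair and prints the pairs as a two-column table. The measurements, the units, and the names `maple_trees` and `format_table` are hypothetical and are not taken from the forester example.

```python
# Hypothetical measurements (assumed for illustration only):
# each ordered pair is (age in years, height in feet).
maple_trees = [
    (1, 2.0),
    (2, 3.5),
    (3, 5.5),
    (4, 6.5),
    (5, 9.0),
    (6, 10.5),
]


def format_table(pairs, x_label, y_label):
    """Return the ordered pairs as the lines of a two-column text table."""
    lines = [f"{x_label:>12} | {y_label:>12}", "-" * 27]
    for x, y in pairs:
        lines.append(f"{x:>12} | {y:>12}")
    return lines


table_lines = format_table(maple_trees, "Age (years)", "Height (ft)")
print("\n".join(table_lines))

# Each row is one ordered pair (x, y). Separating the columns gives the
# x-values and y-values that would be plotted on the scatterplot.
ages = [age for age, _ in maple_trees]
heights = [height for _, height in maple_trees]
```

Here age is treated as the x-value and height as the y-value, so each row of the table corresponds to one point on the scatterplot.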
Scatterplot of Maple Tree Age and Height
Trends in Data
A scatterplot can be used to visualize trends in data that indicate a positive, negative, or no linear relationship between two variables. The type of relationship between the variables is indicated by the sign of the slope of the line of best fit.
Scatterplots can show linear patterns in data, even if no single line passes through all the data points. To gauge the general pattern of data in a scatterplot, a line of best fit is calculated and drawn on the scatterplot to indicate the general direction of the points. The line of best fit, or regression line, minimizes the sum of the squared vertical distances from the points to the line. When the line of best fit for a data set is graphed on a scatterplot, about half the data points will typically be above the line and about half will be below it. Any line that is far away from the data points or does not follow their general direction is not a line of best fit.
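The calculation of a line of best fit can be sketched with the standard least-squares formulas for slope and intercept. This is a minimal sketch, not a definitive implementation: the function names `line_of_best_fit` and `sum_squared_error` and the sample points are assumptions made for illustration.

```python
def line_of_best_fit(points):
    """Return (slope, intercept) of the least-squares line through points."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    # Sums of products of deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    if sxx == 0:
        raise ValueError("all x-values are equal, so the slope is undefined")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def sum_squared_error(points, slope, intercept):
    """Sum of squared vertical distances from the points to the line."""
    return sum((y - (slope * x + intercept)) ** 2 for x, y in points)


# Hypothetical (x, y) data, assumed for illustration only.
points = [(1, 2.0), (2, 3.5), (3, 5.5), (4, 6.5), (5, 9.0), (6, 10.5)]

slope, intercept = line_of_best_fit(points)
best_error = sum_squared_error(points, slope, intercept)

# A line with a slightly different slope has a larger squared error.
other_error = sum_squared_error(points, slope + 0.5, intercept)

print(f"slope = {slope:.3f}, intercept = {intercept:.3f}")
print(f"squared error: best fit = {best_error:.3f}, other line = {other_error:.3f}")
```

Comparing `best_error` with `other_error` shows what "best fit" means here: any other line gives a sum of squared vertical distances at least as large.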
Scatterplot with a Positive Trend
The slope, or the change in y-values over the change in x-values, of the line of best fit must match the general trend of the data on a scatterplot. Otherwise, the line is not considered a line of best fit. For a scatterplot with a positive trend, the line of best fit will not only be close to the data points, but it will also have a positive slope. A positive slope indicates that the y-values tend to increase as the x-values increase.
Some scatterplots show a negative trend. The line of best fit for a scatterplot with a negative trend will be close to the points in the scatterplot and will have a negative slope. So, the y-values tend to decrease as the x-values increase. Note that the word negative refers only to the direction of the trend, not its strength.
Scatterplot with a Negative Trend
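The link between the sign of the slope and the direction of the trend can be sketched as follows. The function names, the sample data sets, and the `tolerance` cutoff for treating a slope as zero are all assumptions made for illustration. A slope near zero suggests no linear trend, but it does not by itself rule out some other kind of relationship.

```python
def least_squares_slope(points):
    """Slope of the least-squares line: change in y over change in x."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    return sxy / sxx


def trend_direction(points, tolerance=1e-9):
    """Describe the linear trend by the sign of the best-fit slope.

    The tolerance is a hypothetical cutoff chosen for this sketch.
    """
    slope = least_squares_slope(points)
    if slope > tolerance:
        return "positive"
    if slope < -tolerance:
        return "negative"
    return "no linear trend"


# Hypothetical data sets, assumed for illustration only.
rising = [(1, 2), (2, 4), (3, 5), (4, 8)]
falling = [(1, 9), (2, 7), (3, 4), (4, 2)]
flat = [(1, 5), (2, 5), (3, 5)]

for name, data in [("rising", rising), ("falling", falling), ("flat", flat)]:
    print(name, "->", trend_direction(data))
```

The result depends only on the sign of the slope, which matches the point that positive and negative describe the direction of a trend rather than its strength.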
Not all scatterplots show a clear trend. If the distribution of the data points does not show an obvious pattern, it is more difficult to draw a representative line of best fit through the data points. For scatterplots without a clear trend, there may not be a linear relationship between the two variables displayed in the scatterplot.
Scatterplot Without a Linear Trend
Other scatterplots may show a trend, but not one that is linear. If the data points of a scatterplot lie close to a curve, or the pattern of the data points resembles a curve, the data set displays a nonlinear trend.