# Plots

# Scatter Plots

A `scatter plot` is a diagram in the \(xy\)-plane that consists of a collection of plotted points. The points illustrate the relationship (if there is one) between two different sets of data, and are plotted as Cartesian coordinates.

In the example on the left, each point shows the marks of one student at Sam's school on two different tests.

Let's construct a scatter plot for an example.

### Example: Soup Sales

Sam's school canteen sells soup in terms 2 and 3. The canteen manager keeps track of the number of bowls of soup sold and the temperature each day. She hopes to be able to use the temperature to predict how much soup she should make on any given day. Here is the data from the last three weeks of school:

**Soup Sales vs Temperature**

| Temperature (\({}^\circ \text{C}\)) | Bowls of Soup Sold |
|---|---|
| 8 | 28 |
| 10 | 25 |
| 11 | 24 |
| 8 | 26 |
| 12 | 22 |
| 15 | 18 |
| 8 | 27 |
| 17 | 15 |
| 16 | 20 |
| 12 | 21 |
| 21 | 9 |
| 16 | 18 |
| 17 | 15 |
| 18 | 12 |
| 20 | 8 |

Here's a scatter plot of the data:

The data appear to follow a straight line fairly closely, and the slope of the line is negative. So, it looks like the canteen manager should be able to use the temperature to predict how much soup to make. The relationship is not perfect, but it is easy to see that colder weather goes with more bowls of soup being sold.

## Line of Best Fit

We often draw a `line of best fit` (or `trend line`) to help us understand the relationship between the data sets plotted on our scatter plot. We choose the line that lies as close as possible to all of the points, and for which approximately the same number of points lie above and below the line.

Sometimes, it's enough to just estimate where the line should lie, but there are situations when we need to be more precise. We then use a technique called `linear regression` or `least squares regression` to find the line of best fit. We'll talk more about that in a more advanced article.
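For the curious, here is a minimal sketch of what least squares regression does with the soup data. It uses the standard textbook formulas for the gradient and intercept; the function name `least_squares_line` and the variable names are made up for this illustration, and nothing here is needed for the rest of the article.

```python
# Soup data from the table above: temperature (degrees C) and bowls sold.
temperatures = [8, 10, 11, 8, 12, 15, 8, 17, 16, 12, 21, 16, 17, 18, 20]
bowls_sold = [28, 25, 24, 26, 22, 18, 27, 15, 20, 21, 9, 18, 15, 12, 8]


def least_squares_line(xs, ys):
    """Return (gradient, intercept) of the least squares line y = gradient * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations, and sum of squared x deviations.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    gradient = sxy / sxx
    intercept = mean_y - gradient * mean_x
    return gradient, intercept


gradient, intercept = least_squares_line(temperatures, bowls_sold)
print(f"bowls = {gradient:.2f} * temperature + {intercept:.2f}")
```

The gradient comes out negative, which matches the downward trend we can see in the scatter plot.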

For our soup example, we don't need to be quite so precise. Here's a line of best fit drawn on the scatter plot:

Here's another example. Two data sets relating the stopping distances and speed of 1920s cars have been plotted on a scatter plot:

I've had a go at drawing a line of best fit on the scatter plot. See if you can do better!

## Interpolation and Extrapolation

In `interpolation`, we look for a missing value that lies in the range of our data set. For example, I have used **linear interpolation** (using a line to estimate the value) on the scatter plot below to estimate the number of bowls of soup sold when the temperature is \(9 {}^\circ \text{C}\).

In `extrapolation`, we look for a missing value that lies outside the range of our data set. We perform **linear extrapolation** by extending the line of best fit to include the data values we are looking for. On the scatter plot below, I've used linear extrapolation to estimate the number of bowls of soup sold when the temperature reaches \(22.5 {}^\circ \text{C}\).

**Note:** these techniques can only give an **estimate** of the missing values. Extrapolation, in particular, can give misleading results as we
really can't be certain about what happens to our data values once we leave our data set.

### Using an Equation to Interpolate or Extrapolate

We can use the points on our scatter plot to come up with an approximate equation for the line of best fit. We can then use the equation of this line to extrapolate or interpolate.

Let's try it on our soup example. We only need two points to find the equation of a straight line. Choose two that are as close to the line of best fit as possible.

I've chosen the points \((15^\circ,18)\) and \((17^\circ, 15)\), corresponding to the orange circle and blue square on my scatter plot.

First, let's find the gradient (slope) of the line:
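Using the two chosen points, the working can be sketched as follows. The symbols are assumptions made for this sketch: \(m\) is the gradient, \(x\) is the temperature and \(y\) is the number of bowls sold.

\[
m = \frac{15 - 18}{17 - 15} = -\frac{3}{2} = -1.5
\]

Substituting the point \((15, 18)\) into \(y - y_1 = m(x - x_1)\) then gives the equation of the line:

\[
y - 18 = -1.5(x - 15) \quad\Longrightarrow\quad y = -1.5x + 40.5
\]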

#### Interpolating

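We want to predict the number of bowls of soup that will be sold when the temperature is \(9^\circ\), so we plug this \(x\)-value into the equation of the line through our two points, \(y = -1.5x + 40.5\), to give \(y = -1.5 \times 9 + 40.5 = 27\) bowls of soup.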

#### Extrapolating

If we want to predict the number of bowls of soup that will be sold when the temperature is \(22.5^\circ\), then we need to extrapolate because this value is outside the range of our temperature data set. Plug \(x = 22.5\) into the equation \(y = -1.5x + 40.5\) to give \(y = -1.5 \times 22.5 + 40.5 = 6.75\), or about 7 bowls of soup.

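You need to be very careful not to extrapolate too far. If you tried to use the equation \(y = -1.5x + 40.5\) to predict how many bowls of soup would be sold at a temperature of \(40^\circ\), you'd get \(y = -1.5 \times 40 + 40.5 = -19.5\), a negative number of bowls, which is impossible.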
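The whole procedure in this section can also be sketched in a few lines of Python. This is only an illustration: the helper names `line_through` and `predict` are made up here, and the line is the one through the two chosen points \((15, 18)\) and \((17, 15)\), not a true least squares fit.

```python
def line_through(p, q):
    """Return (gradient, intercept) of the straight line through points p and q."""
    (x1, y1), (x2, y2) = p, q
    gradient = (y2 - y1) / (x2 - x1)
    intercept = y1 - gradient * x1
    return gradient, intercept


def predict(temperature, gradient, intercept):
    """Estimate bowls of soup sold at the given temperature using the line."""
    return gradient * temperature + intercept


# The two points chosen close to the line of best fit.
gradient, intercept = line_through((15, 18), (17, 15))

# Temperatures recorded in the soup data range from 8 to 21 degrees.
coldest, warmest = 8, 21

for temperature in (9, 22.5, 40):
    inside = coldest <= temperature <= warmest
    kind = "interpolation" if inside else "extrapolation"
    estimate = predict(temperature, gradient, intercept)
    print(f"{temperature} degrees ({kind}): {estimate} bowls")
```

The estimate for 40 degrees is negative, which is a good reminder that a straight line stops being a sensible model once we move far outside the data.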

## Correlation

Correlation gives us a measure of how strongly linked two sets of data are.

We say that the correlation is **positive** if both sets of data values **increase** together.

If one set of data values **increases** while the other **decreases**, then we say that the correlation is **negative**.

The values of linear correlation lie between \(-1\) and \(1\).
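As a rough illustration, here is one way to compute a linear correlation value for the soup data. It assumes the measure meant is the Pearson correlation coefficient (the article doesn't name one), and the function name `pearson_r` is made up for this sketch.

```python
from math import sqrt

# Soup data: temperature (degrees C) and bowls sold.
temperatures = [8, 10, 11, 8, 12, 15, 8, 17, 16, 12, 21, 16, 17, 18, 20]
bowls_sold = [28, 25, 24, 26, 22, 18, 27, 15, 20, 21, 9, 18, 15, 12, 8]


def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


r = pearson_r(temperatures, bowls_sold)
print(f"correlation between temperature and soup sales: {r:.2f}")
```

A value close to \(-1\) means a strong negative correlation, a value close to \(1\) means a strong positive correlation, and a value near \(0\) means there is little or no linear relationship.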

## Examples

There is a positive correlation between the stopping distances of 1920s cars and their speed:

The stopping distance increases with the speed.

There is a negative correlation between soup sales and the temperature:

The soup sales go down as the temperature goes up.

### Description

In these chapters you will learn more about:

- Histograms
- Scatter plots
- Stem and leaf plots etc.

These lessons are for students studying maths in Year 10 or higher.

### Audience

Year 10 students or higher, although the lessons are also suitable for students in Year 8 and above.

### Learning Objectives

Learn about plotting

Author: Subject Coach

Added on: 28th Sep 2018
