Measures Of Variability
Why does variability matter?
While the central tendency, or average, tells you where most of your points lie,
variability summarizes how far apart they are. This is important because the amount of
variability determines how well you can generalize results from the sample to your
population.
Low variability is ideal because it means that you can better predict information about
the population based on sample data. High variability means that the values are less
consistent, so it’s harder to make predictions.
Data sets can have the same central tendency but different levels of variability or vice
versa. If you know only the central tendency or the variability, you can’t say anything
about the other aspect. Both of them together give you a complete picture of your data.
Example: Variability in normal distributions
You are investigating the amounts of time spent on phones daily by different groups of
people.
Using simple random samples, you collect data from 3 groups:
● Sample A: high school students,
● Sample B: college students,
● Sample C: adult full-time employees.
All three of your samples have the same average phone use, at 195 minutes, or 3 hours and
15 minutes. This is the x-axis value at which each curve peaks.
Although the data in each sample follow a normal distribution, each sample has a different
spread. Sample A has the largest variability while Sample C has the smallest variability.
Range
The range tells you the spread of your data from the lowest to the highest value in the
distribution. It’s the easiest measure of variability to calculate.
To find the range, simply subtract the lowest value from the highest value in the data
set.
Range example
You have 8 data points from Sample A.
Data (minutes) 72 110 134 190 238 287 305 324
The highest value (H) is 324 and the lowest (L) is 72.
R = H – L
R = 324 – 72 = 252
The range of your data is 252 minutes.
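The range calculation above can be reproduced in a few lines of Python. This is a minimal sketch: the variable names (data, highest, lowest, data_range) are illustrative and not part of the original example.

```python
# The 8 data points from Sample A, in minutes.
data = [72, 110, 134, 190, 238, 287, 305, 324]

highest = max(data)  # H
lowest = min(data)   # L

# R = H - L
data_range = highest - lowest
print(data_range)  # 252
```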
Because it uses only 2 numbers, the range is heavily influenced by outliers and tells you
nothing about how the values in between are distributed. It’s best used in combination
with other measures.
Interquartile range
The interquartile range gives you the spread of the middle of your distribution.
For any distribution that’s ordered from low to high, the interquartile range contains the
middle half of the values. While the first quartile (Q1) contains the first 25% of values,
the fourth quartile (Q4) contains the last 25% of values.
The interquartile range is the third quartile (Q3) minus the first quartile (Q1). This gives
us the range of the middle half of a data set.
Interquartile range example
To find the interquartile range of your 8 data points, you first find the values at Q1 and
Q3.
Multiply the number of values in the data set (8) by 0.25 for the 25th percentile (Q1) and by
0.75 for the 75th percentile (Q3).
Q1 position: 0.25 × 8 = 2
Q3 position: 0.75 × 8 = 6
Q1 is the value in the 2nd position, which is 110. Q3 is the value in the 6th position, which is
287.
IQR = Q3 – Q1
IQR = 287 – 110 = 177
The interquartile range of your data is 177 minutes.
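The position-based steps above can be sketched in Python as follows. The helper name value_at_percentile is hypothetical, and rounding a fractional position up to the next whole position is an assumption of this sketch, since the example only covers positions that come out as whole numbers. Statistical libraries that interpolate between neighbouring values may return different quartiles for the same data.

```python
import math

def value_at_percentile(values, percent):
    """Return the value at position (percent / 100) * n in the ordered data (1-based)."""
    ordered = sorted(values)
    # Position rule from the example: number of values times the percentile fraction.
    # Assumption: a fractional position is rounded up to the next whole position.
    position = max(1, math.ceil(percent * len(ordered) / 100))
    return ordered[position - 1]

data = [72, 110, 134, 190, 238, 287, 305, 324]

q1 = value_at_percentile(data, 25)  # 2nd position
q3 = value_at_percentile(data, 75)  # 6th position
iqr = q3 - q1
print(q1, q3, iqr)  # 110 287 177
```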
Just like the range, the interquartile range uses only 2 values in its calculation. But the
IQR is less affected by outliers: the 2 values come from the middle half of the data set,
so they are unlikely to be extreme scores.
The IQR gives a consistent measure of variability for skewed as well as normal
distributions.
Five-number summary
Every distribution can be organized using a five-number summary:
● Lowest value
● Q1: 25th percentile
● Q2: the median
● Q3: 75th percentile
● Highest value (Q4)
These five-number summaries can be easily visualized using box and whisker plots.
Box and whisker plot example
For each of our samples, the horizontal lines in a box show Q1, the median and Q3, while
the whiskers at either end show the highest and lowest values.
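As a rough illustration, the five-number summary of the 8 data points from Sample A could be computed like this. The helper names value_at_percentile and five_number_summary are hypothetical. The quartiles follow the position rule from the interquartile range example (with fractional positions rounded up, which is an assumption), while the median comes from Python's statistics.median, which averages the two middle values when the number of values is even.

```python
import math
import statistics

def value_at_percentile(values, percent):
    """Return the value at position (percent / 100) * n in the ordered data (1-based)."""
    ordered = sorted(values)
    # Assumption: a fractional position is rounded up to the next whole position.
    position = max(1, math.ceil(percent * len(ordered) / 100))
    return ordered[position - 1]

def five_number_summary(values):
    """Return (lowest, Q1, median, Q3, highest) for a list of numbers."""
    return (
        min(values),
        value_at_percentile(values, 25),
        statistics.median(values),
        value_at_percentile(values, 75),
        max(values),
    )

data = [72, 110, 134, 190, 238, 287, 305, 324]
summary = five_number_summary(data)
print(summary)  # (72, 110, 214.0, 287, 324)
```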
Percentile range
A percentile range is the difference between two specified percentiles. These could
theoretically be any two percentiles, but the 10-90 percentile range is the most common.
To find the 10-90 percentile range:
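1. Calculate the 10th percentile using the steps above (multiply the number of values by
0.10 to find its position).
2. Calculate the 90th percentile using the steps above (multiply the number of values by
0.90 to find its position).
3. Subtract the result of Step 1 (the 10th percentile) from the result of Step 2 (the 90th
percentile).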
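The three steps can be sketched in Python using the same position rule as the quartile example. The function names are hypothetical, and rounding fractional positions up is an assumption of this sketch. The positions here are 0.8 and 7.2, which round up to the 1st and 8th values.

```python
import math

def value_at_percentile(values, percent):
    """Return the value at position (percent / 100) * n in the ordered data (1-based)."""
    ordered = sorted(values)
    # Assumption: a fractional position is rounded up to the next whole position.
    position = max(1, math.ceil(percent * len(ordered) / 100))
    return ordered[position - 1]

def percentile_range(values, low=10, high=90):
    """Difference between the high and low percentiles (default: 10-90)."""
    return value_at_percentile(values, high) - value_at_percentile(values, low)

data = [72, 110, 134, 190, 238, 287, 305, 324]

p10 = value_at_percentile(data, 10)  # Step 1
p90 = value_at_percentile(data, 90)  # Step 2
range_10_90 = percentile_range(data)  # Step 3
print(p10, p90, range_10_90)  # 72 324 252
```

With only 8 data points, this rule places the 10th and 90th percentiles on the lowest and highest values, so the 10-90 percentile range equals the full range. The two measures separate once the data set is large enough for the percentile positions to fall inside the extremes.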