Measures of Variability

This document discusses measures of variability including range, interquartile range, standard deviation, and variance. It explains that variability describes how spread out data is from the center and from each other. The standard deviation averages the distances from the mean while the variance averages the squared distances from the mean. For normally distributed data, standard deviation and variance are preferred measures of variability but the interquartile range is best for skewed or outlier-containing data as it is less influenced by extremes.

Uploaded by

Divyanshi Dubey
Measures of Variability:

Variability:

Variability describes how far apart data points lie from each other and from the center of a
distribution.

Variability is also referred to as spread, scatter or dispersion. It is most commonly measured
with the following:

• Range: the difference between the highest and lowest values
• Interquartile range: the range of the middle half of a distribution
• Standard deviation: average distance from the mean
• Variance: average of squared distances from the mean

Why does variability matter?

1. The amount of variability determines how well you can generalize results from the
sample to your population.
2. Low variability is ideal because it means that you can better predict information
about the population based on sample data. High variability means that the values
are less consistent, so it is harder to make predictions.
3. Data sets can have the same central tendency but different levels of variability (for
example, the same average but a different spread), or vice versa.
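As a small illustration of the third point, the sketch below compares two made-up data sets that share a mean but differ in spread. The names `tight` and `wide` and their values are hypothetical and do not come from the original notes. Only Python's standard `statistics` module is used.

```python
from statistics import mean, pstdev

# Hypothetical data sets, chosen only for illustration.
tight = [48, 49, 50, 51, 52]
wide = [10, 30, 50, 70, 90]

# Both data sets have the same central tendency (a mean of 50).
print(mean(tight), mean(wide))

# The population standard deviation of `wide` is much larger than that of `tight`.
print(pstdev(tight), pstdev(wide))
```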

Range:

1. The range tells you the spread of your data from the lowest to the highest value in the
distribution.
2. To find the range, simply subtract the lowest value from the highest value in the data
set.
3. Because only 2 numbers are used, the range is influenced by outliers and doesn’t
give you any information about the distribution of values.
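The range calculation described above can be sketched in a few lines. The helper name `value_range` and the scores are assumptions made for this example. The second call shows how a single extreme value changes the result.

```python
def value_range(values):
    """Return the highest value minus the lowest value."""
    return max(values) - min(values)

scores = [72, 75, 78, 80, 83]   # hypothetical scores
with_outlier = scores + [150]   # the same scores plus one extreme value

print(value_range(scores))        # 11
print(value_range(with_outlier))  # 78
```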

Interquartile range:

1. The interquartile range gives you the spread of the middle of your distribution.
2. For any distribution that’s ordered from low to high, the interquartile range contains
half of the values. The first quartile (Q1) is the value below which the lowest 25% of
values fall, and the third quartile (Q3) is the value above which the highest 25% of
values fall.
3. The interquartile range is the third quartile (Q3) minus the first quartile (Q1). This
gives us the range of the middle half of a data set.
4. Like the range, the interquartile range uses only 2 values in its calculation. But the
IQR is less affected by outliers: the 2 values come from the middle half of the data set,
so they are unlikely to be extreme scores.
5. The IQR gives a consistent measure of variability for skewed as well as normal
distributions.
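A minimal sketch of the Q3 minus Q1 calculation follows, assuming Python's `statistics.quantiles` with its default "exclusive" method. Other quartile conventions can give slightly different values for the same data. The function name `interquartile_range` and the data are hypothetical.

```python
from statistics import quantiles

def interquartile_range(values):
    """Return Q3 - Q1 using the default 'exclusive' quartile method."""
    q1, _median, q3 = quantiles(values, n=4)
    return q3 - q1

data = [2, 4, 5, 7, 8, 10, 12, 13, 15, 18, 21]  # hypothetical values
print(interquartile_range(data))  # 10.0 (Q1 = 5, Q3 = 15)
```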

Five-number summary

Every distribution can be organized using a five-number summary:

• Lowest value
• Q1: 25th percentile
• Q2: the median
• Q3: 75th percentile
• Highest value (Q4)

These five-number summaries can be easily visualized using box and whisker plots.
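The five-number summary can be computed as sketched below. The function name `five_number_summary` and the data are assumptions made for illustration. The quartiles again rely on the default method of `statistics.quantiles`, which is one convention among several.

```python
from statistics import quantiles

def five_number_summary(values):
    """Return (lowest, Q1, median, Q3, highest) for a list of numbers."""
    q1, q2, q3 = quantiles(values, n=4)
    return min(values), q1, q2, q3, max(values)

data = [2, 4, 5, 7, 8, 10, 12, 13, 15, 18, 21]  # hypothetical values
print(five_number_summary(data))  # (2, 5.0, 10.0, 15.0, 21)
```

These five values are what a box and whisker plot draws: the box spans Q1 to Q3, with a line at the median.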

Standard deviation:
The standard deviation is the average amount of variability in your dataset.

It tells you, on average, how far each score lies from the mean. The larger the standard
deviation, the more variable the data set is.

There are six steps for finding the standard deviation by hand:

1. List each score and find their mean.
2. Subtract the mean from each score to get the deviation from the mean.
3. Square each of these deviations.
4. Add up all of the squared deviations.
5. Divide the sum of the squared deviations by n – 1 (for a sample) or N (for a population).
6. Find the square root of the number you found.
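The six steps above can be followed one line at a time in code. This is a sketch under stated assumptions: the function name `standard_deviation`, its `sample` flag, and the scores are hypothetical and were chosen so the population result is a round number.

```python
from math import sqrt

def standard_deviation(scores, sample=True):
    """Follow the six by-hand steps; divide by n - 1 for a sample, N otherwise."""
    mean = sum(scores) / len(scores)          # step 1: find the mean
    deviations = [x - mean for x in scores]   # step 2: deviation of each score
    squared = [d ** 2 for d in deviations]    # step 3: square each deviation
    total = sum(squared)                      # step 4: add up the squares
    divisor = len(scores) - 1 if sample else len(scores)
    variance = total / divisor                # step 5: divide by n - 1 or N
    return sqrt(variance)                     # step 6: take the square root

scores = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical scores
print(standard_deviation(scores, sample=False))  # 2.0
print(standard_deviation(scores, sample=True))   # slightly larger, since the divisor is n - 1
```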

STANDARD DEVIATION FOR POPULATION:

$$\sigma = \sqrt{\frac{\sum (X - \mu)^2}{N}}$$

• σ = population standard deviation
• Σ = sum of…
• X = each value
• μ = population mean
• N = number of values in the population

STANDARD DEVIATION FOR SAMPLE:

$$s = \sqrt{\frac{\sum (X - \bar{x})^2}{n - 1}}$$

• s = sample standard deviation
• Σ = sum of…
• X = each value
• x̄ = sample mean
• n = number of values in the sample

Note: A population is the entire group that you want to draw conclusions about. A sample is
the specific group that you will collect data from. The size of the sample is always less than
the total size of the population.

Why use n – 1 for a sample?

1. When you use sample data, your sample standard deviation is always used as an
estimate of the population standard deviation. Using n in this formula tends to give
you a biased estimate that consistently underestimates variability.
2. Reducing the sample n to n – 1 makes the standard deviation artificially large, giving
you a conservative estimate of variability.
3. While this is not an unbiased estimate, it is a less biased estimate of standard
deviation: it is better to overestimate rather than underestimate variability in samples.
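The underestimation described in the first point can be checked on a tiny example. The sketch below lists every possible sample of size 2, drawn with replacement, from a hypothetical six-value population. It then averages the sample variance computed with each divisor. The population, the sample size, and the helper name `sum_sq_dev` are assumptions made for illustration. The check is made on the variance, since the standard deviation is its square root.

```python
from itertools import product
from statistics import mean

population = [1, 2, 3, 4, 5, 6]  # hypothetical population
mu = mean(population)
pop_var = mean((x - mu) ** 2 for x in population)  # population variance (divide by N)

def sum_sq_dev(sample):
    """Sum of squared deviations from the sample's own mean."""
    m = sum(sample) / len(sample)
    return sum((x - m) ** 2 for x in sample)

n = 2
samples = list(product(population, repeat=n))  # all 36 equally likely samples

avg_div_n = mean(sum_sq_dev(s) / n for s in samples)
avg_div_n_minus_1 = mean(sum_sq_dev(s) / (n - 1) for s in samples)

print(pop_var)            # the target value
print(avg_div_n)          # dividing by n falls short of the target on average
print(avg_div_n_minus_1)  # dividing by n - 1 matches the target on average
```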

Variance:

1. The variance is the average of squared deviations from the mean.


2. Variance is the square of the standard deviation.
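The relationship between variance and standard deviation can be checked with Python's standard `statistics` module, which provides population versions (`pvariance`, `pstdev`) and sample versions (`variance`, `stdev`). The scores below are hypothetical.

```python
from statistics import pstdev, pvariance, stdev, variance

scores = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical scores

# Population versions divide by N: the variance is 4 and the standard deviation is 2.
print(pvariance(scores), pstdev(scores))

# Sample versions divide by n - 1, so both values are a little larger.
print(variance(scores), stdev(scores))

# In both cases the variance equals the standard deviation squared.
print(abs(pstdev(scores) ** 2 - pvariance(scores)) < 1e-9)  # True
print(abs(stdev(scores) ** 2 - variance(scores)) < 1e-9)    # True
```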

VARIANCE FOR POPULATION:

$$\sigma^2 = \frac{\sum (X - \mu)^2}{N}$$

• σ² = population variance
• Σ = sum of…
• X = each value
• μ = population mean
• N = number of values in the population

VARIANCE FOR SAMPLE:

$$s^2 = \frac{\sum (X - \bar{x})^2}{n - 1}$$

• s² = sample variance
• Σ = sum of…
• X = each value
• x̄ = sample mean
• n = number of values in the sample

What is the best measure of variability?

1. For normal distributions, all measures can be used. The standard deviation and
variance are preferred because they take your whole data set into account, but this
also means that they are easily influenced by outliers.
2. For skewed distributions or data sets with outliers, the interquartile range is the best
measure. It’s least affected by extreme values because it focuses on the spread in the
middle of the data set.
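To make the comparison concrete, the sketch below computes the sample standard deviation and the interquartile range for a hypothetical data set, then again after its largest value is replaced by an extreme one. The data, the helper name `iqr`, and the use of the default quartile method in `statistics.quantiles` are all assumptions made for illustration.

```python
from statistics import quantiles, stdev

def iqr(values):
    """Return Q3 - Q1 using the default 'exclusive' quartile method."""
    q1, _median, q3 = quantiles(values, n=4)
    return q3 - q1

typical = [2, 4, 5, 7, 8, 10, 12, 13, 15, 18, 21]   # hypothetical values
skewed = [2, 4, 5, 7, 8, 10, 12, 13, 15, 18, 210]   # same, but the top value is extreme

print(iqr(typical), iqr(skewed))      # 10.0 10.0
print(stdev(typical), stdev(skewed))  # the second value is far larger than the first
```

In this example the interquartile range does not move at all, while the standard deviation grows roughly tenfold.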
REFERENCES:

1. https://www.scribbr.com/statistics/variability/
