
Chapter 1 Notes

Uploaded by

chaopuishan127


Associate Degree 2020 – 2021 First Semester

CCMA4001 Quantitative Analysis I

Chapter 1
Summarizing and Describing Data

1.1 Summarizing Data

In order to visualize the distribution of a set of raw data, we ought to compile the data into a
more comprehensible form, making use of tables and graphs.

A. Frequency Tables

Given a set of raw data we usually arrange it into a frequency distribution where we collect
‘like’ quantities and display them by writing down how many of each type there are to form
a frequency table.

Example 1

In a multiple-choice test with 10 questions, the numbers of correct answers of 40 students are as follows.

10 4 9 6 7 4 8 7 7 5
5 8 7 9 10 6 5 9 7 6
4 7 5 6 7 9 5 8 8 4
8 7 7 5 5 4 8 6 6 6

Construct a frequency table for these data.

Solution:

Number of correct answers   Tally         Frequency
4                           /////         5
5                           ///// //      7
6                           ///// //      7
7                           ///// ////    9
8                           ///// /       6
9                           ////          4
10                          //            2
Total                                     40
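The tallying in Example 1 can also be checked by machine. The following is a minimal sketch in Python, assuming the standard-library `Counter` class is an acceptable stand-in for hand tallying; the variable names `scores` and `frequency` are illustrative and not part of the original notes.

```python
from collections import Counter

# Numbers of correct answers of the 40 students in Example 1.
scores = [
    10, 4, 9, 6, 7, 4, 8, 7, 7, 5,
    5, 8, 7, 9, 10, 6, 5, 9, 7, 6,
    4, 7, 5, 6, 7, 9, 5, 8, 8, 4,
    8, 7, 7, 5, 5, 4, 8, 6, 6, 6,
]

# Counter collects 'like' values and counts how many of each there are.
frequency = Counter(scores)

# Print the frequency table in ascending order of the data value.
print(f"{'Correct answers':<18}{'Frequency':>9}")
for value in sorted(frequency):
    print(f"{value:<18}{frequency[value]:>9}")
print(f"{'Total':<18}{sum(frequency.values()):>9}")
```

The printed frequencies should agree with the table above, and the total should equal the number of students.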

B. Bar Chart

Example 2

Using the frequency table constructed in Example 1, draw a bar chart for the distribution of the
number of correct answers of 40 students in the multiple-choice test.

Solution:

[Figure: bar chart titled “Distribution of the number of correct answers of 40 students in the multiple-choice test”. Horizontal axis: Number of correct answers (4 to 10). Vertical axis: Frequency (Number of students), scaled from 0 to 10.]
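As a rough stand-in for the drawn chart, the bars can be sketched as text. This is only an illustration in Python, assuming one `#` per student; the helper name `text_bar_chart` is hypothetical and the frequencies are those of the table in Example 1.

```python
# Frequencies from Example 1 (number of correct answers -> number of students).
frequency = {4: 5, 5: 7, 6: 7, 7: 9, 8: 6, 9: 4, 10: 2}

def text_bar_chart(freq):
    """Return one line per category; the bar length equals the frequency."""
    return [
        f"{value:>2} | {'#' * count} ({count})"
        for value, count in sorted(freq.items())
    ]

chart = text_bar_chart(frequency)
print("\n".join(chart))
```

Each bar has the same width and is separated from its neighbours, which is what distinguishes a bar chart for discrete values from a histogram.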

C. Stem-and-leaf diagrams

A very useful graphical representation of a frequency distribution is the stem-and-leaf diagram
(or stemplot).

The stem-and-leaf diagram combines a graphical technique with a sorting technique.
Sorting here means listing the data in rank order according to numerical value.
The data values themselves are used to do this sorting.
The “stem” is the leading digit(s) of a data value, while the “leaf” is the trailing digit.

For example, the data value 386 might be split as 38 | 6:

Leading digits       Trailing digit
38                   6
(used in sorting)    (shown in display)

A stem-and-leaf diagram is a method of presenting a data set so that gaps or concentrations in the
data become visible.

Example 3

Suppose that a class of 40 students obtained the following results in a Mathematics test.

61 80 55 70 76 73 100 90 64 62
75 64 62 66 46 61 67 39 58 63
63 64 51 40 66 43 38 37 28 71
70 49 48 68 86 27 69 74 37 56

Construct a stem-and-leaf diagram for these data.

Solution:

Stem Leaf
(Tens) (Units)
2 78
3 7789
4 03689
5 1568
6 11223344466789
7 0013456
8 06
9 0
10 0
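The construction in Example 3 can be sketched in Python as follows. This is a minimal illustration, assuming whole-number marks with the tens as stems and the units as leaves; the function name `stem_and_leaf` is hypothetical.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Return {stem: sorted leaves}, with tens as stems and units as leaves."""
    rows = defaultdict(list)
    for v in sorted(values):          # sorting first keeps each row's leaves in order
        stem, leaf = divmod(v, 10)    # e.g. 64 -> stem 6, leaf 4
        rows[stem].append(leaf)
    # Keep every stem between the smallest and the largest so that gaps stay visible.
    return {s: rows.get(s, []) for s in range(min(rows), max(rows) + 1)}

marks = [
    61, 80, 55, 70, 76, 73, 100, 90, 64, 62,
    75, 64, 62, 66, 46, 61, 67, 39, 58, 63,
    63, 64, 51, 40, 66, 43, 38, 37, 28, 71,
    70, 49, 48, 68, 86, 27, 69, 74, 37, 56,
]

diagram = stem_and_leaf(marks)
for stem, leaves in diagram.items():
    print(f"{stem:>2} | {''.join(str(leaf) for leaf in leaves)}")
```

Because the leaves are stored in ascending order, reading the rows from top to bottom gives the data in rank order.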

Advantages of a stem-and-leaf diagram

1. It is easy to construct. In fact, it is no more difficult to construct than a frequency table.

2. It is actually partly a table and partly a graph and so it immediately and directly gives a good
picture of the frequency distribution without having to prepare a frequency table first and then
construct charts afterwards.

3. Since the actual data are recorded in the diagram, it retains the information about the original data,
and the information may be recovered readily.

In a frequency table or histogram, data are represented by tallies or areas of rectangles in class intervals
and so some information about the original data is lost and cannot be recovered.
For example, the reading 64 is recorded in its entirety in a stem-and-leaf diagram, but is represented
only by a count of 1 in the class interval (e.g. 60 – 64) in a frequency table or histogram.

4. It can be regarded as the original set of data arranged in ascending order of magnitude.
Hence it can be readily used for finding quartiles.

Disadvantages of a stem-and-leaf diagram

1. For some types of data, the number of stems that can be chosen is either very small or very large,
making the diagram inconvenient to construct and unable to show the distribution effectively.

2. It is not well suited to large sets of data.

Actually, for a large set of data, the purpose of graphical representation is to give a good overall
picture of the distribution rather than to show the details of the data.
A bar chart or a histogram is more suitable in this case.

Example 4

A fishery expert found the following concentrations of mercury, in parts per million, in thirty fish caught
in a certain stream.

0.024 0.031 0.052 0.024 0.024 0.030 0.056 0.034 0.059 0.068
0.035 0.021 0.052 0.023 0.054 0.028 0.037 0.034 0.048 0.040
0.022 0.049 0.043 0.034 0.032 0.021 0.040 0.032 0.021 0.039

Construct a double-stem diagram for these data.

Solution:

Stem Leaf
(Unit = 0.01) (Unit = 0.001)
2 11123444
2 8
3 0122444
3 579
4 003
4 89
5 224
5 69
6
6 8

In the above diagram, the units of the stems and leaves have been chosen to make the recorded digits simple.
This is an important feature of a stem-and-leaf diagram.
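The double-stem construction can be sketched in Python as below. This is an illustration only, assuming the readings are first rescaled to whole thousandths (0.024 becomes 24) so that the stem has unit 0.01 and the leaf has unit 0.001; the function name `double_stem` is hypothetical.

```python
def double_stem(values):
    """Split each stem into a lower row (leaves 0-4) and an upper row (leaves 5-9)."""
    rows = {}
    for v in sorted(values):
        stem, leaf = divmod(v, 10)
        rows.setdefault((stem, leaf >= 5), []).append(leaf)
    first, last = min(rows)[0], max(rows)[0]
    # Keep empty rows as well, so that gaps in the data remain visible.
    return {
        (stem, upper): rows.get((stem, upper), [])
        for stem in range(first, last + 1)
        for upper in (False, True)
    }

# Mercury concentrations from the example above, in thousandths of a part per million.
mercury = [
    24, 31, 52, 24, 24, 30, 56, 34, 59, 68,
    35, 21, 52, 23, 54, 28, 37, 34, 48, 40,
    22, 49, 43, 34, 32, 21, 40, 32, 21, 39,
]

diagram = double_stem(mercury)
for (stem, upper), leaves in diagram.items():
    print(f"{stem} | {''.join(str(leaf) for leaf in leaves)}")
```

Working in whole thousandths avoids floating-point rounding when the digits are split.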

1.2 Statistical Descriptions

In statistics, there are two useful types of measure which characterize any set of data or
frequency distribution.

The first type, a measure of ‘centralization’, attempts to locate a typical value about which the
distribution clusters. This type of measure is called an average or measure of central tendency
or measure of location.

The second type is a measure of how scattered or spread out a distribution is and is called
a measure of dispersion.

In the figures,

(a) shows two distributions with different measures of central tendency but roughly the same spread,
(b) illustrates two distributions with the same measure of central tendency but different spreads.

[Figure: panels (a) and (b), each showing the pair of distributions described above.]

I. Measures of Central Tendency

The most common measures of central tendency or average are the mean, the median and the mode.

A. Mean (Arithmetic Mean)

Given the complete set of N data $\{x_1, x_2, \ldots, x_N\}$ in a population, the mean $\mu$ is defined as

$$\mu = \frac{1}{N}(x_1 + x_2 + \cdots + x_N) \quad\text{or}\quad \mu = \frac{1}{N}\sum_{i=1}^{N} x_i$$

The population mean is usually denoted by the Greek letter $\mu$ (pronounced “mu”).

If the set of n data $\{x_{p_1}, x_{p_2}, \ldots, x_{p_n}\}$, where the $p_i$’s are a set of integers selected from 1 to N,
is a sample of size n drawn from a population, then the sample mean is defined similarly,
but is denoted by $\bar{x}$ (read as “x bar”). Thus

$$\bar{x} = \frac{1}{n}(x_{p_1} + x_{p_2} + \cdots + x_{p_n}) \quad\text{or}\quad \bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_{p_i}$$

The notation $x_{p_i}$ for the elements of the sample may be a bit difficult for beginners.

Hence, when no misunderstanding arises, we shall denote the sample of size n simply as
$\{x_1, x_2, \ldots, x_n\}$,
bearing in mind that the element $x_i$ in the sample is, in general, not the same as the element $x_i$ in
the population.

With this understanding, the sample mean is

$$\bar{x} = \frac{1}{n}(x_1 + x_2 + \cdots + x_n) \quad\text{or}\quad \bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$$

Example 4

Suppose that a class of 40 students obtained the following results in a Mathematics test.

61 80 55 70 76 73 100 90 64 62
75 64 62 66 46 61 67 39 58 63
63 64 51 40 66 43 38 37 28 71
70 49 48 68 86 27 69 74 37 56

(a) Find the mean of the population of Mathematics test marks.

(b) The following two samples each have been drawn randomly from the population of Mathematics
test marks.
S1 = {70, 43, 28, 69, 75, 90}
S 2 = {68, 62, 48, 39, 38, 55, 66, 71, 37, 76}
Find the means of these samples.
(c) Find the mean of the sample S3 formed by combining the samples S1 and S 2 .

Solution:

(a) The population mean is

$$\begin{aligned}
\mu = \frac{1}{40}(&61 + 80 + 55 + 70 + 76 + 73 + 100 + 90 + 64 + 62 + 75 + 64 + 62 + 66 + 46 + 61 + 67 + 39 + 58 + 63 \\
&+ 63 + 64 + 51 + 40 + 66 + 43 + 38 + 37 + 28 + 71 + 70 + 49 + 48 + 68 + 86 + 27 + 69 + 74 + 37 + 56) \\
= {}& 60.425
\end{aligned}$$

(b) The sample mean of $S_1$ is

$$\bar{x}_1 = \frac{1}{6}(70 + 43 + 28 + 69 + 75 + 90) = 62.5$$

The sample mean of $S_2$ is

$$\bar{x}_2 = \frac{1}{10}(68 + 62 + 48 + 39 + 38 + 55 + 66 + 71 + 37 + 76) = 56$$

Note that a population mean is a unique value, but the sample mean varies from sample to sample.

(c) The sample mean of $S_3$ is

$$\bar{x}_3 = \frac{1}{16}(70 + 43 + 28 + 69 + 75 + 90 + 68 + 62 + 48 + 39 + 38 + 55 + 66 + 71 + 37 + 76) = 58.4375$$

or

$$\bar{x}_3 = \frac{62.5 \times 6 + 56 \times 10}{6 + 10} = 58.4375$$
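The means in this example can be verified with a short Python sketch. It assumes the standard-library `statistics.mean` function; the variable names are illustrative. It also shows that the mean of the combined sample equals the size-weighted average of the two sample means.

```python
from statistics import mean

# Population of 40 Mathematics test marks.
marks = [
    61, 80, 55, 70, 76, 73, 100, 90, 64, 62,
    75, 64, 62, 66, 46, 61, 67, 39, 58, 63,
    63, 64, 51, 40, 66, 43, 38, 37, 28, 71,
    70, 49, 48, 68, 86, 27, 69, 74, 37, 56,
]
s1 = [70, 43, 28, 69, 75, 90]
s2 = [68, 62, 48, 39, 38, 55, 66, 71, 37, 76]

mu = mean(marks)    # population mean
x1 = mean(s1)       # sample mean of S1
x2 = mean(s2)       # sample mean of S2

# Part (c): the direct mean of the combined sample ...
x3_direct = mean(s1 + s2)
# ... and the same value obtained by weighting each sample mean by its sample size.
x3_weighted = (x1 * len(s1) + x2 * len(s2)) / (len(s1) + len(s2))

print(mu, x1, x2, x3_direct, x3_weighted)
```

The weighted form is convenient when only the sample means and sample sizes are known, not the raw data.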

B. Median

The median is a measure of position. It is the middle value in an ordered sequence of data.

To find the median from a set of data collected in its raw form, we must first arrange the data
in rank order, from the smallest to the largest observation. Such an ordered sequence of data is
called an ordered array.

For a set of discrete data $x_1, x_2, \ldots, x_n$ arranged in ascending order,

(i) if n is odd, the median is $x_{\frac{n+1}{2}}$, the value of the datum that is in the middle;

(ii) if n is even, the median is $\frac{1}{2}\left(x_{\frac{n}{2}} + x_{\frac{n}{2}+1}\right)$, the mean of the two data that are
nearest to the middle.

Example 5

(a) Find the median of the set of data {12, 8, 13, 16, 5}.
(b) Find the median of the set of data {25, 25, 37, 26, 25, 12, 75, 75}.

Solution:

(a) Arrange the set of five data in ascending order: 5, 8, 12, 13, 16. The median is $x_{\frac{5+1}{2}} = x_3 = 12$.

(b) Arrange the set of eight data in ascending order: 12, 25, 25, 25, 26, 37, 75, 75.
The median is

$$\frac{1}{2}\left(x_{\frac{8}{2}} + x_{\frac{8}{2}+1}\right) = \frac{1}{2}(x_4 + x_5) = \frac{1}{2}(25 + 26) = 25.5$$
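The two cases of the median rule can be sketched in Python as follows. This is a minimal illustration; the function name `median` is chosen here for convenience, and the code uses 0-based list indices, so position k in the notes corresponds to index k − 1.

```python
def median(data):
    """Median of raw data: middle value (n odd) or mean of the two middle values (n even)."""
    x = sorted(data)                  # the ordered array
    n = len(x)
    mid = n // 2
    if n % 2 == 1:
        return x[mid]                 # position (n + 1) / 2 in 1-based terms
    return (x[mid - 1] + x[mid]) / 2  # positions n / 2 and n / 2 + 1

print(median([12, 8, 13, 16, 5]))
print(median([25, 25, 37, 26, 25, 12, 75, 75]))
```

Python's standard library offers `statistics.median`, which follows the same rule for raw data.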

C. Mode

The mode of a set of data is the value that occurs with the highest frequency.
In this sense it is the “most typical” value of a set of data.

For example, for the data 1, 1, 2, 2, 2, 3, 3, 4, 5, 5, 6, the mode is 2.

A distribution with one mode is called a unimodal distribution, one with two modes is called
bimodal, and one with three or more modes is called multimodal.

The two main advantages of the mode are that it requires no calculations, only counting, and that
it can be determined for qualitative as well as quantitative data.
However, if all the values in the set of data are different, the mode is of no use.

Example 6

Suppose that 50 children are asked which of the six brands of soft drink they prefer most
and the following results are obtained.

Brand                 A    B    C    D    E    F
Number of children    4    15   5    8    3    15

Find the mode of these data.

Solution:

There are two modes in this set, namely, B and F.

This set is said to be bimodal.
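Finding the mode is a matter of counting, which can be sketched in Python as below. The helper name `modes` is hypothetical; it returns every category that attains the highest frequency, so that bimodal and multimodal sets are handled as well.

```python
from collections import Counter

def modes(frequency):
    """Return every category that attains the highest frequency."""
    highest = max(frequency.values())
    return [item for item, count in frequency.items() if count == highest]

# Qualitative data: preferred soft drink brand of the 50 children.
preferences = {"A": 4, "B": 15, "C": 5, "D": 8, "E": 3, "F": 15}
print(modes(preferences))

# Quantitative raw data: count the values first, then pick the most frequent.
raw = [1, 1, 2, 2, 2, 3, 3, 4, 5, 5, 6]
print(modes(Counter(raw)))
```

No arithmetic is performed on the categories themselves, which is why the mode also works for qualitative data such as brand names.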

II. Measures of Dispersion

The measures of central tendency provide only a brief summary of a set of data.
The averages alone cannot tell us how spread out or dispersed the data are.
We also need a measure of dispersion, a numerical value indicating the amount of scatter about
a central point.

Widely dispersed data are also highly variable data. Hence measures of dispersion are also called
measures of variability.

The most common measures of dispersion in statistics are the range, the inter-quartile range,
the variance and the standard deviation.

A. Range

The range of a set of data is the difference between the largest value and the smallest value of the set.

In general, the greater the range, the greater the dispersion of the set of data.

Example 7

Find the ranges of the scores of athletes A and B in Example 11.

Solution:

The range of the scores of athlete A = 9.5 – 6.0 = 3.5

The range of the scores of athlete B = 8.0 – 7.0 = 1.0

Since the range of the scores of athlete A is greater than that of athlete B, we say that the scores of
athlete A are more dispersed than those of athlete B.

B. Inter-quartile range

With the set of data arranged in ascending order, the median is the value which divides the set of
data into two equal parts.
Similarly, if we divide the set of data into four equal parts, the corresponding dividing values, denoted by
$Q_1$, $Q_2$, $Q_3$, are called the first, second and third quartiles respectively,
and $Q_2$ is just the median of the distribution.

The inter-quartile range (IQR) of a set of data is defined as $Q_3 - Q_1$.
It measures approximately how far from the median we must go on either side before we can
include one-half of the values of the data set.

When the set of data is divided into 100 equal parts, the dividing values are called percentiles and
are denoted by $P_1$, $P_2$, …, $P_{99}$.
The 50th percentile, $P_{50}$, corresponds to the median,
whereas $P_{25}$ and $P_{75}$ correspond to $Q_1$ and $Q_3$ respectively.

The pth percentile of a data set is a value such that at least p percent of the items take on this value or less
and at least (100 – p) percent of the items take on this value or more.

$Q_1$ is the first quartile (or lower quartile), where 25% of the data lie below it;
$Q_2$ is the second quartile (or middle quartile or median), where 50% of the data lie below it; and
$Q_3$ is the third quartile (or upper quartile), where 75% of the data lie below it.

To find the pth percentile, first arrange the set of discrete data $x_1, x_2, \ldots, x_n$ in ascending order,
then compute the index

$$i = \frac{p}{100} \times n$$

to find the position of the pth percentile.

If i is not an integer, round it up to the next integer; the pth percentile is the value in that position.
If i is an integer, the pth percentile is the average of the values in positions i and i + 1.
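The index rule above can be sketched in Python as follows. This is an illustration of the rule as stated in these notes, not of the percentile conventions used by statistical software, which often interpolate differently. The function name `percentile` is hypothetical, and p is assumed to be a whole number from 1 to 99.

```python
def percentile(data, p):
    """p-th percentile by the index rule i = (p / 100) * n, for integer p from 1 to 99."""
    x = sorted(data)
    # Integer arithmetic: i = q + r / 100, so i is a whole number exactly when r == 0.
    q, r = divmod(p * len(x), 100)
    if r:
        # i is not an integer: round up to position q + 1 (list index q).
        return x[q]
    # i is an integer: average the values in positions i and i + 1.
    return (x[q - 1] + x[q]) / 2

a = [14, 23, 16, 18, 15, 44, 19]
b = [10, 15, 40, 28, 34, 18, 24, 30]

iqr_a = percentile(a, 75) - percentile(a, 25)
iqr_b = percentile(b, 75) - percentile(b, 25)
print(iqr_a, iqr_b)
```

Using `divmod` on whole numbers avoids having to test a floating-point index for being an integer.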

Example 8

(a) Find the inter-quartile range of the data set A {14, 23, 16, 18, 15, 44, 19}.
(b) Find the inter-quartile range of the data set B {10, 15, 40, 28, 34, 18, 24, 30}.
(c) By comparing the inter-quartile range of the data sets A and B, which set has a greater dispersion?

Solution:

(a) Arrange the seven data of data set A in ascending order: 14, 15, 16, 18, 19, 23, 44.
For the 25th percentile, the index $i = \frac{25}{100} \times 7 = 1.75$, which is rounded up to 2,
hence $Q_1 = x_2 = 15$.
For the 75th percentile, the index $i = \frac{75}{100} \times 7 = 5.25$, which is rounded up to 6,
hence $Q_3 = x_6 = 23$.
The inter-quartile range $= Q_3 - Q_1 = 23 - 15 = 8$

(b) Arrange the eight data of data set B in ascending order: 10, 15, 18, 24, 28, 30, 34, 40.
For the 25th percentile, the index $i = \frac{25}{100} \times 8 = 2$,
hence $Q_1 = \frac{1}{2}(x_2 + x_3) = \frac{1}{2}(15 + 18) = 16.5$.
For the 75th percentile, the index $i = \frac{75}{100} \times 8 = 6$,
hence $Q_3 = \frac{1}{2}(x_6 + x_7) = \frac{1}{2}(30 + 34) = 32$.
The inter-quartile range $= Q_3 - Q_1 = 32 - 16.5 = 15.5$

(c) The range of each of the data sets A and B is 30 (44 – 14 for A and 40 – 10 for B).

However, the inter-quartile range of data set A (8) is less than that of data set B (15.5),
so data set B has the greater dispersion.

The range considers the difference between the maximum and minimum values of a set of data.
The inter-quartile range considers the range of 50% of the data in the middle and thus avoids the
impact of extreme values.

Therefore if there are extreme values in a set of data, the inter-quartile range is a better measure of
dispersion than the range.

Moreover, the inter-quartile range exists even if the set of data has open ends.
Box-and-Whisker Diagram

The median, the lower quartile and the upper quartile together with the maximum and the minimum
values provide a good description of a set of data as they indicate some of the most important
characteristics of the set. These five key descriptive statistical measures are often called the
five-number summary of the set of data. A graphical display of these measures, called a
box-and-whisker diagram or a box plot, gives an even better visual impression of the set.

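[Figure: box-and-whisker diagram. A box extends from Q1 to Q3 with a bar at Q2 (the median), and whiskers extend from the box to the minimum and the maximum. The box spans the IQR and contains the middle 50% of the data; each whisker covers the lower or the upper 25% of the data; the whole diagram spans the range.]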

A box-and-whisker diagram consists of a rectangular box drawn with its length parallel to the x-axis
and with its ends marking the position of the lower and the upper quartiles. An orange bar is then
inserted in the box to mark the median. The two extreme values, the minimum and the maximum
values of the data, are linked to the box by lines, called whiskers, parallel to the x-axis.

A glance at the diagram then gives us good information about the central tendency, dispersion and
extreme values of the set.
(1) The bar at the median shows the location of the centre of the data.
(2) The length of the box, which equals the inter-quartile range, shows the dispersion of the middle
50% of the data.
(3) The lengths of the whiskers show the dispersion of the data below the lower quartile and
above the upper quartile, and so describe the behavior at the ends or tails of the distribution.
(4) The shape of the diagram gives us a quick impression of the degree of symmetry of the data
distribution about the median.

It is easy to use box-and-whisker diagrams to compare the features, such as location of centre,
dispersion and symmetry of different sets of data. However, a box-and-whisker diagram does not
reveal the total frequency of each set of data, nor the frequency of the data for any specific range.
If such information is required, a stem-and-leaf diagram, bar chart or histogram can be used.
Box-and-whisker diagrams are particularly useful for comparing the central tendency and
the dispersion of two or more sets of data.

Example 9

The following box-and-whisker diagrams show the distributions of marks in the Chinese, English and
Mathematics tests.

(a) Which test has the marks with the largest inter-quartile range?
(b) Which test has the marks with the smallest range?
(c) Which test has the highest median mark?
(d) If Mary gets 70 marks in all three tests, in which test does she perform the best?
Briefly explain your answer.

Solution:

(a) Since the box for the Mathematics test is the longest, the Mathematics test has the marks with
the largest inter-quartile range.

(b) Since the distance between the two ends of the whiskers is the shortest for the Chinese test,
the Chinese test has the marks with the smallest range.

(c) Since the orange bar in the box for the Mathematics test is at the rightmost position, the median mark
of the Mathematics test is the highest.

(d) From the box-and-whisker diagrams above, Mary’s mark in the English test is in the top
25% of the class, while her marks in the Mathematics and Chinese tests are not.
Mary therefore performs the best in the English test.

Skewness of Distributions

A distribution can have many different shapes. It may be symmetric or skewed.

A distribution is symmetric if the parts above and below its center are mirror images.
In terms of the quartiles, the distribution is regarded as symmetric if $Q_2 - Q_1 = Q_3 - Q_2$.

[Figure: box-and-whisker diagram of a symmetric distribution, marked Min, Q1, Q2, Q3 and Max.]

A distribution is skewed to the right if the right side is longer, while it is skewed to the left if the left
side is longer.

A positively skewed (or right-skewed) distribution is an asymmetric distribution with a “tail” on the right,
which indicates the presence of extreme values at the positive end of the distribution.
A distribution is positively skewed if $Q_2 - Q_1 < Q_3 - Q_2$.

[Figure: box-and-whisker diagram with a long tail to the right, marked Min, Q1, Q2, Q3 and Max.]

A negatively skewed (or left-skewed) distribution is an asymmetric distribution with a “tail” on the left,
which indicates the presence of extreme values at the negative end of the distribution.
A distribution is negatively skewed if $Q_2 - Q_1 > Q_3 - Q_2$.

[Figure: box-and-whisker diagram with a long tail to the left, marked Min, Q1, Q2, Q3 and Max.]

Example 10

Consider the stem-and-leaf diagram constructed in Example 3 for the distribution of the results of the class
of 40 students in the Mathematics test.

Stem Leaf
(Tens) (Units)
2 78
3 7789
4 03689
5 1568
6 11223344466789
7 0013456
8 06
9 0
10 0

(a) Find the median, the first and the third quartiles.
(b) Construct the box-and-whisker diagram.
(c) Use the quartiles to comment on the skewness of the distribution.

Solution:

(a) The median is

$$\frac{1}{2}\left(x_{\frac{40}{2}} + x_{\frac{40}{2}+1}\right) = \frac{1}{2}(x_{20} + x_{21}) = \frac{1}{2}(63 + 63) = 63$$

For the 25th percentile, the index $i = \frac{25}{100} \times 40 = 10$, hence $Q_1 = \frac{1}{2}(x_{10} + x_{11}) = \frac{1}{2}(48 + 49) = 48.5$.

For the 75th percentile, the index $i = \frac{75}{100} \times 40 = 30$, hence $Q_3 = \frac{1}{2}(x_{30} + x_{31}) = \frac{1}{2}(70 + 70) = 70$.

(b)

[Figure: box-and-whisker diagram on a horizontal axis from 25 to 100, with minimum 27, Q1 = 48.5, median 63, Q3 = 70 and maximum 100.]

(c) $Q_2 - Q_1 = 63 - 48.5 = 14.5$ and $Q_3 - Q_2 = 70 - 63 = 7$

Since $Q_2 - Q_1 > Q_3 - Q_2$, the distribution is negatively skewed (left-skewed).

Example 11

The table below gives the monthly salaries in dollars of 25 employees of a certain department.

7800 11900 12700 10400 20200
6200 7300 9200 15500 17900
9700 9500 10500 13300 10200
9900 14200 8900 8700 16600
7400 6600 9600 6100 8200

(a) Construct a stem-and-leaf diagram for the data.

(b) Find the mean.
(c) Find the median, the first and the third quartiles and the inter-quartile range.
(d) Construct the box-and-whisker diagram.
(e) Use the quartiles to comment on the skewness of the distribution.

Solution:

(a)
Stem Leaf
(Unit = $1000) (Unit = $100)
6 126
7 348
8 279
9 25679
10 245
11 9
12 7
13 3
14 2
15 5
16 6
17 9
18
19
20 2

(b) The mean is

$$\begin{aligned}
\frac{1}{25}(&7800 + 11900 + 12700 + 10400 + 20200 + 6200 + 7300 + 9200 + 15500 + 17900 \\
&+ 9700 + 9500 + 10500 + 13300 + 10200 + 9900 + 14200 + 8900 + 8700 + 16600 \\
&+ 7400 + 6600 + 9600 + 6100 + 8200) = 10740
\end{aligned}$$

(c) Making use of the stem-and-leaf diagram for the distribution of the salaries (with a column of
cumulative frequencies added to help locate the quartiles),

the median is $x_{\frac{25+1}{2}} = x_{13} = 9700$.

For the 25th percentile, the index $i = \frac{25}{100} \times 25 = 6.25$, which is rounded up to 7,
hence $Q_1 = x_7 = 8200$.

For the 75th percentile, the index $i = \frac{75}{100} \times 25 = 18.75$, which is rounded up to 19,
hence $Q_3 = x_{19} = 12700$.

The inter-quartile range $= Q_3 - Q_1 = 12700 - 8200 = 4500$

(d)

[Figure: box-and-whisker diagram on a horizontal axis from 6000 to 21000, with minimum 6100, Q1 = 8200, median 9700, Q3 = 12700 and maximum 20200.]

(e) $Q_2 - Q_1 = 9700 - 8200 = 1500$ and $Q_3 - Q_2 = 12700 - 9700 = 3000$

Since $Q_2 - Q_1 < Q_3 - Q_2$, the distribution is positively skewed (right-skewed).
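The five-number summary and the quartile comparison used in Examples 10 and 11 can be sketched in Python as follows. The function names are hypothetical, and the percentile rule is the index rule stated earlier in these notes, restated here so that the sketch is self-contained. With this rule the 50th percentile coincides with the median.

```python
def percentile(data, p):
    """p-th percentile by the index rule i = (p / 100) * n, for integer p from 1 to 99."""
    x = sorted(data)
    q, r = divmod(p * len(x), 100)    # i = q + r / 100
    if r:
        return x[q]                   # i not an integer: position q + 1
    return (x[q - 1] + x[q]) / 2      # i an integer: average positions i and i + 1

def five_number_summary(data):
    """Return (minimum, Q1, median, Q3, maximum)."""
    return (min(data), percentile(data, 25), percentile(data, 50),
            percentile(data, 75), max(data))

def quartile_skewness(q1, q2, q3):
    """Compare Q2 - Q1 with Q3 - Q2 to describe the skewness."""
    lower, upper = q2 - q1, q3 - q2
    if lower < upper:
        return "positively skewed"
    if lower > upper:
        return "negatively skewed"
    return "symmetric"

marks = [
    61, 80, 55, 70, 76, 73, 100, 90, 64, 62,
    75, 64, 62, 66, 46, 61, 67, 39, 58, 63,
    63, 64, 51, 40, 66, 43, 38, 37, 28, 71,
    70, 49, 48, 68, 86, 27, 69, 74, 37, 56,
]
salaries = [
    7800, 11900, 12700, 10400, 20200,
    6200, 7300, 9200, 15500, 17900,
    9700, 9500, 10500, 13300, 10200,
    9900, 14200, 8900, 8700, 16600,
    7400, 6600, 9600, 6100, 8200,
]

for name, data in (("marks", marks), ("salaries", salaries)):
    low, q1, q2, q3, high = five_number_summary(data)
    print(name, (low, q1, q2, q3, high), quartile_skewness(q1, q2, q3))
```

The five values returned are exactly those needed to draw the box-and-whisker diagram.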

C. Variance and Standard Deviation

Although the inter-quartile range is an improved measure of dispersion compared with the range,
it still does not make use of the actual values of all the data in the set and therefore cannot completely
reflect the dispersion of the data. Two measures of dispersion which do take into account
all the values are the variance and the standard deviation.

To overcome the limitations of range and inter-quartile range mentioned above, we can find the
distance of each datum from the centre of a group of data. The greater the average distance of all
data from the centre, the wider the dispersion of a set of data is.

If the set of N data $\{x_1, x_2, \ldots, x_N\}$ represents a population with mean $\mu$, then the variance of the
set of data is defined as the mean of the squares of the deviations of individual values from the
population mean, and is commonly denoted by $\sigma^2$. Thus, the population variance is

$$\sigma^2 = \frac{1}{N}\sum_{i=1}^{N}(x_i - \mu)^2 = \frac{1}{N}\left[(x_1 - \mu)^2 + (x_2 - \mu)^2 + \cdots + (x_N - \mu)^2\right]$$

Large variances indicate large dispersion and small variances indicate small dispersion.

However, the variance defined above does not have the same unit as the original values of x.
To have a measure of dispersion with the same unit as the original data, we take the positive square
root of the variance. The resulting measure is called the standard deviation of the set of data. Thus,
the population standard deviation is

$$\sigma = \sqrt{\frac{1}{N}\sum_{i=1}^{N}(x_i - \mu)^2} = \sqrt{\frac{1}{N}\left[(x_1 - \mu)^2 + (x_2 - \mu)^2 + \cdots + (x_N - \mu)^2\right]}$$

If the set of n data $\{x_1, x_2, \ldots, x_n\}$ is a sample of size n drawn from a population and has mean $\bar{x}$,
the sample variance, $s^2$, is defined as

$$s^2 = \frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2 = \frac{1}{n-1}\left[(x_1 - \bar{x})^2 + (x_2 - \bar{x})^2 + \cdots + (x_n - \bar{x})^2\right]$$

The sample standard deviation, s, is the positive square root of the sample variance.

$$s = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2} = \sqrt{\frac{1}{n-1}\left[(x_1 - \bar{x})^2 + (x_2 - \bar{x})^2 + \cdots + (x_n - \bar{x})^2\right]}$$

Note that the differences between the sample variance $s^2$ and the population variance $\sigma^2$ are that
the sample mean $\bar{x}$ is used instead of the population mean $\mu$, and the divisor is n – 1 instead of N.

The standard deviation gives us an idea of how close the data are to their mean, and thus
we can learn about the consistency of the set of data.
The smaller the standard deviation, the less dispersed the set of data is.
In other words, the data in the set are more consistent.

Example 12

The temperatures (in °C) of water in seven beakers are: 30, 32, 33, 28, 31, 29, 34.
(a) Find the mean of the temperatures of the water.
(b) Find the population standard deviation of the temperatures of the water.

Solution:

(a) The mean of the temperatures of the water is

$$\mu = \frac{1}{7}\sum_{i=1}^{7} x_i = \frac{1}{7}(30 + 32 + 33 + 28 + 31 + 29 + 34) = 31$$

(b) The variance of the temperatures of the water is

$$\begin{aligned}
\sigma^2 &= \frac{1}{7}\sum_{i=1}^{7}(x_i - \mu)^2 \\
&= \frac{1}{7}\left[(30 - 31)^2 + (32 - 31)^2 + (33 - 31)^2 + (28 - 31)^2 + (31 - 31)^2 + (29 - 31)^2 + (34 - 31)^2\right] \\
&= \frac{1}{7}\left[(-1)^2 + 1^2 + 2^2 + (-3)^2 + 0^2 + (-2)^2 + 3^2\right] \\
&= \frac{1}{7}(1 + 1 + 4 + 9 + 0 + 4 + 9) \\
&= 4
\end{aligned}$$

Therefore, the population standard deviation of the temperatures of the water is $\sigma = \sqrt{4} = 2$.
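The definitions of the population and sample variances can be sketched directly in Python. The function names are illustrative. The sample variance of the same seven temperatures is included only to show the effect of the divisor n – 1; the example itself treats the readings as a population.

```python
import math

def population_variance(data):
    """Mean of the squared deviations from the mean (divisor N)."""
    n = len(data)
    mu = sum(data) / n
    return sum((x - mu) ** 2 for x in data) / n

def sample_variance(data):
    """Sum of the squared deviations from the sample mean, divided by n - 1."""
    n = len(data)
    x_bar = sum(data) / n
    return sum((x - x_bar) ** 2 for x in data) / (n - 1)

temperatures = [30, 32, 33, 28, 31, 29, 34]

sigma2 = population_variance(temperatures)   # population variance
sigma = math.sqrt(sigma2)                    # population standard deviation
s2 = sample_variance(temperatures)           # larger, because the divisor is n - 1

print(sigma2, sigma, s2)
```

Python's standard library provides `statistics.pvariance`, `statistics.pstdev`, `statistics.variance` and `statistics.stdev` for the same four quantities.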

Example 13

(a) Find the variance and standard deviation of the population of Mathematics test marks in Example 4,
given that the population mean is 60.425.
(b) If the passing mark is one population standard deviation less than the mean, find the number of
students who failed the Mathematics test.
(c) The sample $S_2$ = {68, 62, 48, 39, 38, 55, 66, 71, 37, 76} has been drawn from the population of
Mathematics test marks in Example 4. The sample mean was found to be 56.
Find the sample variance and sample standard deviation.

Solution:

(a) The variance is

$$\begin{aligned}
\sigma^2 &= \frac{1}{40}\sum_{i=1}^{40} x_i^2 - \mu^2 \\
&= \frac{1}{40}(61^2 + 80^2 + 55^2 + 70^2 + 76^2 + 73^2 + 100^2 + 90^2 + 64^2 + 62^2 + 75^2 + 64^2 + 62^2 + 66^2 + 46^2 \\
&\qquad + 61^2 + 67^2 + 39^2 + 58^2 + 63^2 + 63^2 + 64^2 + 51^2 + 40^2 + 66^2 + 43^2 + 38^2 + 37^2 + 28^2 + 71^2 \\
&\qquad + 70^2 + 49^2 + 48^2 + 68^2 + 86^2 + 27^2 + 69^2 + 74^2 + 37^2 + 56^2) - 60.425^2 \\
&= \frac{1}{40}(156497) - (60.425)^2 \\
&= 261.2444
\end{aligned}$$

Therefore the standard deviation is $\sigma = \sqrt{261.2444} = 16.1631$.

(b) The passing mark = 60.425 – 16.1631 = 44.2619

Eight students have marks below the passing mark (27, 28, 37, 37, 38, 39, 40 and 43), so eight students
failed the Mathematics test.

(c) The sample variance is

$$\begin{aligned}
s^2 &= \frac{1}{10 - 1}\left(\sum_{i=1}^{10} x_i^2 - 10\bar{x}^2\right) \\
&= \frac{1}{9}\left[(68^2 + 62^2 + 48^2 + 39^2 + 38^2 + 55^2 + 66^2 + 71^2 + 37^2 + 76^2) - 10 \times 56^2\right] \\
&= \frac{1}{9}(33304 - 31360) \\
&= 216
\end{aligned}$$

The sample standard deviation is $s = \sqrt{216} = 14.70$.
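Parts (a) and (c) use computational forms of the variance rather than the defining formulas. As a short supporting derivation, both forms follow from expanding the squared deviations and using $\sum x_i = N\mu$ for the population and $\sum x_i = n\bar{x}$ for the sample:

$$\sigma^2 = \frac{1}{N}\sum_{i=1}^{N}(x_i - \mu)^2 = \frac{1}{N}\sum_{i=1}^{N} x_i^2 - 2\mu \cdot \frac{1}{N}\sum_{i=1}^{N} x_i + \mu^2 = \frac{1}{N}\sum_{i=1}^{N} x_i^2 - \mu^2$$

$$s^2 = \frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2 = \frac{1}{n-1}\left(\sum_{i=1}^{n} x_i^2 - 2\bar{x}\sum_{i=1}^{n} x_i + n\bar{x}^2\right) = \frac{1}{n-1}\left(\sum_{i=1}^{n} x_i^2 - n\bar{x}^2\right)$$

These forms need only the sum of the squares and the mean, which is convenient when the mean is not a whole number.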

Use Scientific Calculator to find mean and standard deviation

Use the calculator to find the mean and standard deviation of the data set
{1, 2, 5, 6, 8, 9, 10, 12, 14, 18}
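The calculator result can be cross-checked with a short Python sketch using the standard-library `statistics` module. The notes do not say whether the population or the sample standard deviation is intended, so both are computed here; that choice is an assumption of this sketch.

```python
from statistics import mean, pstdev, pvariance, stdev, variance

data = [1, 2, 5, 6, 8, 9, 10, 12, 14, 18]

x_bar = mean(data)             # mean
pop_var = pvariance(data)      # population variance (divisor N)
pop_sd = pstdev(data)          # population standard deviation
samp_var = variance(data)      # sample variance (divisor n - 1)
samp_sd = stdev(data)          # sample standard deviation

print(x_bar, pop_var, pop_sd, samp_var, samp_sd)
```

The two standard deviations differ only in the divisor, so the sample value is always slightly larger than the population value for the same data.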
