Engineering Statistics

Uploaded by

hwynwy96

Engineering Statistics Second Year

CHAPTER ONE

INTRODUCTION TO ENGINEERING STATISTICS

1.1 Introduction
Engineering is about bridging the gaps between problems and solutions, and
that process requires an approach called the scientific method. Many aspects of
engineering practice involve collecting, working with, and using data in the solution
of a problem, so knowledge of statistics is just as important to the engineer
as knowledge of any of the other engineering sciences. Statistical methods are a
powerful aid in designing new products and systems, improving existing designs, and
designing, developing, and improving production operations. Statistical methods are
used to help us describe and understand variability.
All natural processes, as well as those devised by humans, are subject to
variability. Civil engineers are aware, for example, that the compressive strength of
concrete, soil pressures, traffic flow, floods, and pollution loads in streams vary
widely. To cope with uncertainty, the engineer must first obtain and investigate a
sample of data, such as a set of flow data or soil test results. The sample is used in
applying statistics and probability at the descriptive stage. For inferential purposes,
however, one needs to make decisions regarding the population from which the
sample is drawn. A data set comprises a number of measurements of a phenomenon
such as the failure load of a structural component. The quantities measured are
termed variables, each of which may take any one of a specified set of values.

1.2 Inferential statistics and probability models


After an experiment is completed and the data are described and
summarized, we hope to be able to draw a conclusion. This part of statistics,
concerned with the drawing of conclusions, is called inferential statistics. To draw
logical conclusions from the data, we must take into account the possibility of
chance, so we usually make some
assumptions about the chances (or probabilities) of obtaining the different data
values. The totality of these assumptions is referred to as a probability model for the
data. Sometimes the nature of the data suggests the form of the probability model that
is assumed.
In other situations, the appropriate probability model for a given data set will
not be readily apparent. However, careful description and presentation of the data
sometimes enable us to infer a reasonable model, which we can then try to verify
with the use of additional data. Because the basis of statistical inference is the
formulation of a probability model to describe the data, an understanding of statistical
inference requires some knowledge of the theory of probability. In other words,
statistical inference starts with the assumption that important aspects of the
phenomenon under study can be described in terms of probabilities; it then draws
conclusions by using data to make inferences about these probabilities.

1.3 Populations and samples
In everyday language, the word “population” refers to all the people or
organisms contained within a specific country, area, region, etc. When we talk about
the population of Iraq, we usually mean something like “the total number of
people who currently reside in Iraq.” In the field of statistics, however, the term
population is defined operationally by the question we ask: it is the entire collection
of measurements about which we want to make a statement.
In statistics, we are interested in obtaining information about a total collection
of elements, which we will refer to as the population. The population is often too
large for us to examine each of its members. In such cases, we try to learn about the
population by choosing and then examining a subgroup of its elements. This
subgroup of a population is called a sample.
A population consists of all possible observations available from a particular
probability distribution. A sample is a particular subset of the population that an
experimenter measures and uses to investigate the unknown probability distribution.
A random sample is one in which the elements of the sample are chosen at random
from the population, and this procedure is often used to ensure that the sample is
representative of the population.

1.4 Brief definitions in statistics


Statistics is a discipline of study dealing with the collection, analysis,
interpretation, and presentation of data. Descriptive statistics is the use of
graphs, charts, and tables and the calculation of various statistical measures to
organize and summarize information. Descriptive statistics help to reduce our
information to a manageable size and put it into focus. The population is the
complete collection of individuals, items, or data under consideration in a statistical
study. The portion of the population selected for analysis is called the sample.
Inferential statistics consist of techniques for reaching conclusions about a
population based upon information contained in a sample. A variable is a
characteristic of interest concerning the individual elements of a population or a
sample. A variable is often represented by a letter such as x, y, or z. The value of a
variable for one particular element from the sample or population is called an
observation. A data set consists of the observations of a variable for the elements of
a sample. A quantitative variable is determined when the description of the
characteristic of interest results in a numerical value. When a measurement is
required to describe the characteristic of interest or it is necessary to perform a
count to describe the characteristic, a quantitative variable is defined. A
discrete variable is a quantitative variable whose values are countable.
Discrete variables usually result from counting. A continuous variable is a
quantitative variable that can assume any numerical value over an interval or over
several intervals. A continuous variable usually results from making a measurement
of some type. A qualitative variable is determined when the description of the
characteristic of interest results in a non-numeric value. A qualitative variable may be
classified into two or more categories.

CHAPTER TWO

PRESENTATION OF STATISTICS DATA

2.1 Introduction
Once a data set has been collected, the experimenter’s next task is to find an
informative way of presenting it. In general, a table of numbers is not very
informative, whereas a picture or graphical representation of the data set can be quite
informative. If “a picture is worth a thousand words,” then it is worth at least a
million numbers.

2.2 Frequency distributions


One way of describing a distribution of sample values, which is particularly
useful in large samples, is to construct a frequency distribution of the sample values.
We distinguish between two types of frequency distributions, namely, frequency
distributions of: (i) discrete variables; and (ii) continuous variables.
A random variable, X, is called discrete if it can assume only a finite (or countably
infinite) number of different values. For example, the number of defective computer cards in a
production lot is a discrete random variable. A random variable is called continuous
if, theoretically, it can assume all possible values in a given interval. For example, the
output voltage of a power supply is a continuous random variable.

2.2.1 Frequency distributions of discrete random variables


Consider a random variable, X, that can assume only the value x1, x2, x3, …, xk,
where x1<x2<…..<xk . Suppose that we have made n different observations on X.
The frequency of xi (i=1,…k) is defined as the number of observations having the
value xi. We denote the frequency of xi by fi. Notice that

$$\sum_{i=1}^{k} f_i = f_1 + f_2 + \dots + f_k = n$$

We can present a frequency distribution in a tabular form as:

Value Frequency
x1 f1
x2 f2
. .
. .
. .
xk fk
Total n

It is sometimes useful to present a frequency distribution in terms of the
proportional or relative frequencies pi, which are defined by

$$p_i = \frac{f_i}{n} \qquad (i = 1, 2, \dots, k)$$
In addition to the frequency distribution, it is often useful to present the
cumulative frequency distribution of a given variable. The cumulative frequency of xi
is defined as the sum of frequencies of values less than or equal to xi. We denote it by
Fi, and the proportional cumulative frequency (or cumulative relative frequency) by

$$P_i = \frac{F_i}{n} \qquad (i = 1, 2, \dots, k)$$
A table of proportional cumulative frequency distribution could be represented
as follows:

Value    Relative frequency (p)    Cumulative relative frequency (P)
x1       p1                        P1 = p1
x2       p2                        P2 = p1 + p2
.        .                         .
.        .                         .
.        .                         .
xk       pk                        Pk = p1 + p2 + … + pk = 1
Total    1
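
The definitions of fi, pi, Fi and Pi above can be checked numerically. The following is a minimal Python sketch, not part of the original notes; the function name `frequency_table` and the sample of defect counts are hypothetical and chosen only for illustration.

```python
from collections import Counter
from itertools import accumulate


def frequency_table(observations):
    """Return one row (value, f, p, F, P) per distinct value, sorted by value.

    f = frequency, p = relative frequency f/n,
    F = cumulative frequency, P = cumulative relative frequency F/n.
    """
    counts = Counter(observations)
    n = len(observations)
    values = sorted(counts)
    freqs = [counts[v] for v in values]
    cumulative = list(accumulate(freqs))
    return [(v, f, f / n, F, F / n) for v, f, F in zip(values, freqs, cumulative)]


# Hypothetical sample: number of defective cards found in each of 10 lots.
sample = [0, 1, 1, 2, 0, 1, 3, 1, 0, 2]
for row in frequency_table(sample):
    print(row)
```

The last row always has F equal to n and P equal to 1, which matches the final line of the table above.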

2.2.2 Frequency distributions of continuous random variables


For the case of a continuous random variable, we partition the possible range
of variation of the observed variable into k subintervals. Generally speaking, if the
possible range of X is between L and H, we specify numbers b0, b1, b2, · · · , bk such
that L = b0< b1< b2< · · · < bk−1< bk = H. The values b0, b1, · · · , bk are called the
limits of the k subintervals. We then classify the X values into the interval (bi−1, bi ) if
bi−1< X ≤ bi (i = 1, · · · , k). (If X = b0, we assign it to the first subinterval.)
Subintervals are also called bins, classes or class-intervals.

In order to construct a frequency distribution we must consider the following
two questions:
(i) How many subintervals should we choose? and
(ii) How large should the width of the subintervals be?

In general, it is difficult to give exact answers to these important questions
that apply in all cases. However, the general recommendation is to use between 10
and 15 subintervals in large samples, and to use subintervals of equal width. The
frequency distribution is given then for the subintervals, where the mid-point of each
subinterval provides a numerical representation for that interval. A typical frequency
distribution table might look like the following:


Subinterval    Mid-point      Frequency    Cumulative frequency
b0 - b1        $\bar{b}_1$    f1           F1 = f1
b1 - b2        $\bar{b}_2$    f2           F2 = f1 + f2
.              .              .            .
.              .              .            .
.              .              .            .
bk-1 - bk      $\bar{b}_k$    fk           Fk = f1 + f2 + … + fk = n

2.3 Frequency distribution table

A frequency distribution table can be constructed by carrying out the
following steps:
1. Find the Largest and Smallest values in the data.
2. Calculate the Range of data. (Range = Largest Value – Smallest Value)
3. Select the suitable Number of Classes according to the data under study
(generally 5 – 15 ).
4. Find the Class Interval (C.I.):

$$\text{Class Interval} = \frac{\text{Range}}{\text{No. of Classes}}$$

Note: round the class interval up to the next convenient value, according to the accuracy of the data.
5. Determine the Class Limits (C.L.), starting with the lowest value of the
sample data, which is the lower limit of the first class.
6. Calculate the class boundaries based on the required accuracy.
7. Calculate the Class Mark (C.M.) of each class:

$$\text{Class Mark} = \frac{\text{Upper Limit} + \text{Lower Limit}}{2}$$

8. Find the frequency of each class, that is, the number of observations (data)
falling in that class.
To separate one class from another, we use class boundaries (C.B.). If the data
are given to the nearest integer, the class boundaries should be given to the nearest
half. Each class boundary lies halfway between the upper limit of one class and the
lower limit of the next.
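
The eight steps can be sketched in Python as follows. This is an illustrative sketch only: the function name `build_frequency_table` is hypothetical, the data are assumed to be integers (so boundaries sit 0.5 outside the limits), and the class interval is taken as the integer just above Range / No. of Classes. The 35 values used are those of Example 2.2.

```python
def build_frequency_table(data, num_classes):
    """Group integer data into equal-width classes (steps 1 to 8)."""
    smallest, largest = min(data), max(data)       # step 1
    data_range = largest - smallest                # step 2
    # Step 4: integer class interval just above Range / No. of Classes
    # (assumption: integer data, so integer limits are wanted).
    width = data_range // num_classes + 1
    table = []
    for i in range(num_classes):
        lower = smallest + i * width               # step 5: class limits
        upper = lower + width - 1
        frequency = sum(1 for x in data if lower <= x <= upper)  # step 8
        table.append({
            "limits": (lower, upper),
            "boundaries": (lower - 0.5, upper + 0.5),  # step 6
            "mark": (lower + upper) / 2,               # step 7
            "frequency": frequency,
        })
    return table


# Data of Example 2.2, grouped into 6 classes (step 3).
data = [111, 120, 127, 129, 130, 145, 145, 150, 153, 155, 160, 161,
        165, 167, 170, 171, 174, 175, 177, 179, 180, 180, 185, 185,
        190, 195, 195, 201, 210, 220, 224, 225, 230, 245, 248]
for row in build_frequency_table(data, 6):
    print(row)
```

Each observation falls in exactly one class because consecutive class limits never overlap.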

Example 2.1: Complete the other elements of the following frequency
table.
Score Frequency
80-94 8
95-109 14
110-124 24
125-139 16
140-154 13
Solution :-
Largest value = 154
Smallest value = 80
Range = 154 – 80 = 74
Take No. of classes = 5
Class Interval = (74)/5 =14.8 → take C.I.= 15
Lower boundary of the first class = 80-0.5 = 79.5
Upper boundary of the first class = 79.5 + 15 = 94.5

No. Class Limits Class Boundaries Class Mark Frequency
1 80-94 79.5 – 94.5 87 8
2 95-109 94.5 – 109.5 102 14
3 110-124 109.5 – 124.5 117 24
4 125-139 124.5 – 139.5 132 16
5 140-154 139.5 – 154.5 147 13
Note: When forming a frequency distribution, the following general
guidelines should be followed:
1. The number of classes should be between 5 and 15.
2. Each data value must belong to one, and only one, class.
3. When possible, all classes should be of equal width (Class Interval).

Example 2.2 :-Group the following data into classes and show its frequency table.
111 120 127 129 130 145 145 150 153 155 160 161
165 167 170 171 174 175 177 179 180 180 185 185
190 195 195 201 210 220 224 225 230 245 248

Solution :-
Largest value = 248
Smallest value = 111
Range = 248 – 111 = 137
Take No. of classes = 6
Class Interval = (137)/6 =22.8333 → take C.I.= 23
Lower boundary of the first class = 111-0.5 = 110.5
Upper boundary of the first class = 110.5 + 23 = 133.5

No. Class Limits Class Boundaries Class Mark Tally Frequency
1 111 – 133 110.5 – 133.5 122 llll 5
2 134 – 156 133.5 – 156.5 145 llll 5
3 157 – 179 156.5 – 179.5 168 llll llll 10
4 180 – 202 179.5 – 202.5 191 llll lll 8
5 203 – 225 202.5 – 225.5 214 llll 4
6 226 – 248 225.5 – 248.5 237 lll 3

2.4 Relative frequency
The relative frequency of a class is obtained by dividing the frequency for a class
by the sum of all the frequencies. The relative frequencies for the five classes in
Example 2.1 are shown below. The sum of the relative frequencies will always equal
one.
No. Frequency (F) Relative frequency (RF)
1 8 8/75 = 0.11
2 14 14/75 = 0.19
3 24 24/75 = 0.32
4 16 16/75 = 0.21
5 13 13/75 = 0.17
Total 75 1

2.5 Percentage
The percentage for a class is obtained by multiplying the relative frequency for
that class by 100. The percentages for the six classes in Example 2.2 are shown
below. The sum of the percentages for all the categories will always equal 100
percent.
No. Frequency (F) Relative frequency (RF) Percentage
1 5 5/35 = 0.14 0.14 x 100 = 14 %
2 5 5/35 = 0.14 0.14 x 100 = 14 %
3 10 10/35 = 0.29 0.29 x 100 = 29 %
4 8 8/35 = 0.23 0.23 x 100 = 23 %
5 4 4/35 = 0.11 0.11 x 100 = 11 %
6 3 3/35 = 0.09 0.09 x 100 = 9 %
Total 35 1 100 %

2.6 Cumulative frequency distribution


A cumulative frequency distribution gives the total number of values that fall
below various class boundaries of a frequency distribution. The cumulative frequency
distribution of the five classes in Example 2.1 is shown below.

No. Frequency (F) Cumulative frequency (CF)
1 8 8
2 14 22
3 24 46
4 16 62
5 13 75
Total 75

There are different types of Cumulative frequency; which are shown below:

1. Ascending Cumulative Frequency :-


For any class, the ascending cumulative frequency is equal to the number of
observations that are less than the upper boundary of that class.

2. Descending Cumulative Frequency :-


For any class, the descending cumulative frequency is equal to the number of
observations that are greater than the lower boundary of that class.

3. Relative Cumulative Frequency:-


Relative Cumulative Frequency is obtained by dividing a cumulative
frequency by the total number of observations in the data set. The cumulative
frequencies for the frequency distribution given in Example 2.1 are shown below.
Cumulative percentages are obtained by multiplying relative cumulative frequencies
by 100.

F = frequency, RF = relative frequency, ACF = ascending cumulative frequency,
RACF = relative ascending cumulative frequency, DCF = descending cumulative
frequency, RDCF = relative descending cumulative frequency.

No.     F     RF              ACF    RACF            DCF    RDCF
1       8     8/75 = 0.11     8      8/75 = 0.11     75     75/75 = 1
2       14    14/75 = 0.19    22     22/75 = 0.29    67     67/75 = 0.89
3       24    24/75 = 0.32    46     46/75 = 0.61    53     53/75 = 0.71
4       16    16/75 = 0.21    62     62/75 = 0.83    29     29/75 = 0.39
5       13    13/75 = 0.17    75     75/75 = 1       13     13/75 = 0.17
Total   75
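
The ascending, descending and relative cumulative frequencies can be generated from the class frequencies alone. The following Python sketch is illustrative; the function name `cumulative_frequencies` is hypothetical, and the rounding to two decimals is an assumption made to mirror the table.

```python
from itertools import accumulate


def cumulative_frequencies(freqs):
    """Return (ACF, DCF, RACF, RDCF) lists for a list of class frequencies."""
    n = sum(freqs)
    acf = list(accumulate(freqs))                   # running total from the first class
    dcf = [n - below for below in [0] + acf[:-1]]   # observations in this class and above
    racf = [round(a / n, 2) for a in acf]
    rdcf = [round(d / n, 2) for d in dcf]
    return acf, dcf, racf, rdcf


# Class frequencies of Example 2.1.
acf, dcf, racf, rdcf = cumulative_frequencies([8, 14, 24, 16, 13])
print(acf)
print(dcf)
print(racf)
print(rdcf)
```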

2.7 Graphical representation of data


If "a picture is worth a thousand words," then graphical techniques provide an
excellent method to visualize the variability and other properties of a set of data. We
proceed by assembling the data into graphs, scanning the details, and noting the
important characteristics. There are numerous types of graphs. Histograms, frequency
polygons, and cumulative frequency curves are given in this section.

2.7.1 Histogram
The data are divided into groups according to their magnitudes. The horizontal
axis of the graph gives the magnitudes. Blocks are drawn to represent the groups,
each of which has a different upper and lower limit. The area of a block is
proportional to the number of occurrences in the group. The variability of the data is
shown by the horizontal spread of the blocks, and the most common values are found
in blocks with the largest areas. Other features such as the symmetry of the data or
lack of it are also shown. The first step is to take into account the range r of the
observations, that is, the difference between the largest and smallest values.

2.7.2 Frequency polygon
A frequency polygon is a useful tool for showing the shape of the distribution
of a variable. It can be drawn by joining the midpoints of the tops of the rectangles of
a histogram after extending the diagram by one class on both sides. We assume that
equal class widths are used. If the ordinates of a histogram are divided by the total
number of observations, then a relative frequency histogram is obtained. Thus, the
ordinates for each class denote the probabilities bounded by 0 and 1, by which we
simply mean the chances of occurrence. The resulting diagram is called the relative
frequency polygon. The polygon ends on the horizontal axis at a distance equal to
one half class interval after the upper boundary.

2.7.3 Cumulative frequency curves


If a cumulative sum is taken of the relative frequencies step by step from the
smallest class to the largest, then the line joining the ordinates (cumulative relative
frequencies) at the ends of the class boundaries forms a cumulative relative frequency
or probability diagram. On the vertical axis of the graph, this line gives the
probabilities of non-exceedance of values shown on the horizontal axis. There are
two types of cumulative frequency curves:

1. Ascending Cumulative Frequency Curve :-


It is drawn on a pair of perpendicular axes, just as the histogram and the
frequency polygon, with the horizontal axis representing the values of the upper
boundaries of the classes and the vertical axis representing the corresponding
ascending cumulative frequencies of these classes.

2. Descending Cumulative Frequency Curve :-


It is drawn in the same way as the ascending curve, but with the horizontal axis
representing the values of the lower boundaries of the classes and the vertical axis
representing the corresponding descending cumulative frequencies of these classes.

Example 2.3:-
For the following classified data, draw the histogram, frequency polygon, and
the cumulative frequency curves.
Class Limits 11 – 27 28 – 44 45 – 61 62 – 78 79 – 95
Frequency 7 9 15 8 6
Solution :-
No. C.L F C.M. C.B. ACF DCF
1 11 – 27 7 19 10.5 – 27.5 7 45
2 28 – 44 9 36 27.5 – 44.5 16 38
3 45 – 61 15 53 44.5 – 61.5 31 29
4 62 – 78 8 70 61.5 – 78.5 39 14
5 79 – 95 6 87 78.5– 95.5 45 6

[Figure: histogram and frequency polygon for Example 2.3. Horizontal axis: class boundary; vertical axis: frequency.]

[Figure: ascending and descending cumulative frequency curves for Example 2.3. Horizontal axis: class boundary; vertical axis: cumulative frequency.]

CHAPTER THREE

MEASURES OF CENTRAL LOCATION AND DISPERSION

3.1 Introduction
When describing a numerical data set, it is common to report both a value
that describes where the data distribution is centered along the number line and a
value that describes how spread out the data distribution is.

Measures of center describe where the data distribution is located along the number
line. A measure of center provides information about what is “typical.”

Measures of spread describe how much variability there is in a data distribution. A
measure of spread provides information about how much individual values tend to
differ from one another.

There is more than one way to measure center and spread in a data distribution.

3.2 Measures of center


A data set consisting of the observations for some variable is referred to
as raw data or ungrouped data. Data presented in the form of a frequency
distribution are called grouped data. The measures of central tendency discussed
in this chapter will be described for both grouped and ungrouped data since
both forms of data occur frequently. There are many different measures of central
tendency. The five most widely used measures of central tendency are the mean, the
median, the mode, geometric mean, and the harmonic mean. These measures are
defined for both samples and populations.

3.2.1 The mean


The mean of a numerical data set is just the familiar arithmetic average: the
sum of all of the observations in the data set divided by the total number of
observations. It is denoted by the symbol $\bar{x}$. The individual observations are
written x1, x2, …, xn. Notice that the value of the subscript on
x doesn’t tell us anything about how small or large the data value is.

Mean of ungrouped Data


The sum of x1, x2, …, xn can be written x1+x2+…+xn, but this can be shortened
by using the Greek letter ∑, which is used to denote summation. In particular, ∑x is
used to denote the sum of all of the x values in a data set.

The mean is the most commonly used measure of central tendency. The mean of a
set of n numbers x1, x2, x3, …, xn is denoted by $\bar{x}$ and is defined as:

$$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$$

Example 3.1:-
For the following raw data on a variable X: 1, 4, 10, 8, 10. What is the mean of
X, that is, the value of $\bar{x}$?

Solution :-

$$\bar{x} = \frac{1 + 4 + 10 + 8 + 10}{5} = \frac{33}{5} = 6.6$$

Notes: The properties of the mean are as follows:


1. The mean always exists.
2. The mean is unique.

Mean of grouped (Classified) Data


For a sample given as a frequency distribution, the mean can be
obtained as:

$$\bar{x} = \frac{\sum_{i=1}^{c} x'_i f_i}{\sum_{i=1}^{c} f_i}$$

where :-
x'i : is the class mark of class (i).
fi : is the frequency of class (i).
c : the number of classes.

Example 3.2 :-
For the following classified data, find the mean.

Class Limits 84-86 87-89 90-92 93-95 96-98


Frequency 4 12 13 17 3

Solution :-

No. Class Limits Class Mark (x'i) Frequency (fi) x'i · fi
1 84-86 85 4 340
2 87-89 88 12 1056
3 90-92 91 13 1183
4 93-95 94 17 1598
5 96-98 97 3 291
Total: ∑fi = 49, ∑x'i fi = 4468

$\bar{x}$ = 4468/49 = 91.18
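
The grouped-mean calculation of Example 3.2 can be reproduced with a few lines of Python. This is a sketch under the stated formula; the function name `grouped_mean` is hypothetical.

```python
def grouped_mean(class_limits, freqs):
    """Mean of classified data: sum of (class mark * frequency) / sum of frequencies."""
    marks = [(lower + upper) / 2 for lower, upper in class_limits]
    return sum(m * f for m, f in zip(marks, freqs)) / sum(freqs)


# Classes and frequencies of Example 3.2.
limits = [(84, 86), (87, 89), (90, 92), (93, 95), (96, 98)]
freqs = [4, 12, 13, 17, 3]
print(round(grouped_mean(limits, freqs), 2))
```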

3.2.2 The median


Let the values of a data set of size n be ordered from smallest to largest. If
n is odd, the median is the value in position (n+1)/2; if n is even, it is the average of
the values in positions (n/2) and (n/2)+1. Thus the sample median of a set of three
values is the second smallest; of a set of four values, it is the average of the second
and third smallest.

Median of ungrouped Data


The sample median of n measurements x1, x2, …, xn is the middle value when
measurements are arranged from the smallest to largest. The median is the value that
divides the data into two equal halves. In other words, 50% of the data lie below the
median and 50% lie above it. If n is an odd number, there is a unique middle value
and it is the median. If n is an even number, there are two middle values and the
median is defined as their average value.

Example 3.3:-
For the following raw data on a variable X: 1, 8, 5, 10, 15, 2 and variable
Y: 8, 7, 12, 8, 6, 2, 4, 3, 5, 11, 10. What is the median of X and Y?
Solution :-
For variable X:-
Arranging the data in an increasing sequence gives, 1, 2, 5, 8, 10, 15
Median = (5+8)/2=6.5

For variable Y:-


Arranging the data in an increasing sequence yields, 2, 3, 4, 5, 6, 7, 8, 8, 10, 11, 12
Median = 7

Notes: The properties of the median are as follows:


1. The median may or may not equal the mean.
2. The median always exists.
3. The median is unique.

Median of Classified Data
The median of classified data can be found by first determining the smallest
class that has an ascending cumulative frequency (ACF) greater than half of
the sum of all frequencies. This class is called the median class. The median
can then be determined as follows :-

$$\text{Median} = B_L + \left( \frac{\frac{n}{2} - \left(\sum f\right)_L}{f_{\text{median class}}} \right) C$$
where :-
BL : lower boundary of the median class.
n : number of observations (data).
(∑f )L : sum of the frequencies of all classes below the median class.
fmedian class : frequency of median class.
C : the median class interval.
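
The median formula can be sketched in Python as follows. The function name `grouped_median` is hypothetical, and the data are those of Example 3.4 below. The code selects the first class whose ascending cumulative frequency reaches n/2; when the cumulative frequency equals n/2 exactly, this gives the same value as taking the next class, because the interpolation then lands on the shared boundary.

```python
def grouped_median(boundaries, freqs):
    """Median of classified data by interpolation inside the median class."""
    n = sum(freqs)
    below = 0  # (sum f)_L: total frequency of the classes below the current one
    for (lower, upper), f in zip(boundaries, freqs):
        if below + f >= n / 2:  # current class is the median class
            width = upper - lower  # C, the median class interval
            return lower + (n / 2 - below) / f * width
        below += f
    raise ValueError("frequencies must contain at least one observation")


# Class boundaries and frequencies of Example 3.4.
boundaries = [(29.5, 39.5), (39.5, 49.5), (49.5, 59.5), (59.5, 69.5), (69.5, 79.5)]
freqs = [6, 12, 15, 13, 8]
print(grouped_median(boundaries, freqs))
```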

Example 3.4 :-
For the following classified data, find the median.

Class Limits 30 – 39 40 – 49 50 – 59 60 – 69 70 – 79
Frequency 6 12 15 13 8

Solution :-
No. Class Limits Frequency Class Boundaries ACF
1 30 – 39 6 29.5 – 39.5 6
2 40 – 49 12 39.5 – 49.5 18
3 50 – 59 15 49.5 – 59.5 33
4 60 – 69 13 59.5– 69.5 46
5 70 – 79 8 69.5 – 79.5 54

n/2 = 54/2=27
Then the median class is (49.5 – 59.5) which is the third class.
BL= 49.5
n = 54
(∑f)L= 18
fmedian class= 15
C = 10

$$\text{Median} = 49.5 + \left( \frac{\frac{54}{2} - 18}{15} \right) \times 10 = 55.5$$

3.2.3 The mode
The mode is the value in a data set that occurs the most often. If no such value
exists, we say that the data set has no mode. If two or more such values exist, we say
the data set is multimodal. There is no symbol that is used to represent the mode.

Mode of ungrouped Data


The mode of X is the value that occurs with the highest frequency; it is the
most common or most probable value of X.

Example 3.5:- For the following raw data on a variable X: 1, 2, 4, 4, 5, variable Y:
1, 2, 4, 4, 5, 6, 6, 9 and variable Z: 1, 2, 3, 7, 9, 11. What is the mode of X, Y, and Z?

Solution :-
For variable X:-
Mode = 4 (its frequency is two)
There is only one mode.

For variable Y:-


Modes = 4 and 6 ( Y is thus a multimodal variable)
There are two modes.

For variable Z:-


Mode = no mode (no one value appears more often than any other).
There is no mode.
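
The median and mode of ungrouped data can be checked with Python's standard `statistics` module (`median` and `multimode`, the latter available from Python 3.8). The wrapper `modes` below is a hypothetical helper that follows the convention of these notes: when every distinct value occurs equally often, the data set is reported as having no mode.

```python
from statistics import median, multimode


def modes(values):
    """Return the list of modes, or [] when no value occurs more often than the others."""
    candidates = multimode(values)  # all values sharing the highest frequency
    if len(candidates) > 1 and len(candidates) == len(set(values)):
        return []  # every distinct value is equally frequent: no mode
    return candidates


# Data of Example 3.3 (median) and Example 3.5 (mode).
print(median([1, 8, 5, 10, 15, 2]))
print(median([8, 7, 12, 8, 6, 2, 4, 3, 5, 11, 10]))
print(modes([1, 2, 4, 4, 5]))
print(modes([1, 2, 4, 4, 5, 6, 6, 9]))
print(modes([1, 2, 3, 7, 9, 11]))
```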

Notes: The properties of the mode are as follows:


1. The mode may or may not equal the mean and median.
2. The mode may not exist.
3. If the mode exists, it may not be unique.

Mode of Classified Data


In classified data, the mode can be calculated from the following formula :-

$$\text{Mode} = L_w + \left( \frac{\Delta_1}{\Delta_1 + \Delta_2} \right) C$$
where :-
Lw : lower boundary of the modal class.
Δ1 : excess of modal class frequency over frequency of the next lower class.
Δ2 : excess of modal class frequency over frequency of the next higher class.
C : the modal class interval.

Modal Class: The class which has the maximum number of observations.

Example 3.6:-
Find the mode for the following classified data?

Class Limits 30 – 39 40 – 49 50 – 59 60 – 69 70 – 79
Frequency 6 12 15 13 8

Solution :-
Modal class is ( 50 – 59 ) which is the third class.
Lw = 49.5
Δ1 = 15-12 = 3
Δ2 = 15-13 = 2
C = 10
∴ Mode = 49.5 + {3 / (3+2)} × 10 = 55.5
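
The mode formula for classified data can be sketched in Python. The function name `grouped_mode` is hypothetical. Two assumptions are made that the notes do not state: there is a single modal class, and a missing neighbouring class (when the modal class is first or last) is treated as having frequency 0.

```python
def grouped_mode(boundaries, freqs):
    """Mode of classified data: L_w + d1 / (d1 + d2) * C."""
    k = max(range(len(freqs)), key=lambda i: freqs[i])     # index of the modal class
    before = freqs[k - 1] if k > 0 else 0                  # assumption: 0 if no lower class
    after = freqs[k + 1] if k < len(freqs) - 1 else 0      # assumption: 0 if no higher class
    d1 = freqs[k] - before   # excess over the next lower class
    d2 = freqs[k] - after    # excess over the next higher class
    lower, upper = boundaries[k]
    return lower + d1 / (d1 + d2) * (upper - lower)


# Class boundaries and frequencies of Example 3.6.
boundaries = [(29.5, 39.5), (39.5, 49.5), (49.5, 59.5), (59.5, 69.5), (69.5, 79.5)]
freqs = [6, 12, 15, 13, 8]
print(grouped_mode(boundaries, freqs))
```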

3.2.4 The geometric mean


The geometric mean is a kind of average of a set of numbers that is different from
the arithmetic average (mean). The geometric mean is well defined only for sets of
positive real numbers. It is calculated by multiplying all n numbers together
and taking the nth root of the product.

$$G = \left[ x_1 \times x_2 \times x_3 \times \dots \times x_n \right]^{1/n} = \sqrt[n]{x_1 \times x_2 \times x_3 \times \dots \times x_n}$$

G can also be computed using logarithms:

$$\log G = \frac{\sum_{i=1}^{n} \log x_i}{n}$$

Example 3.7:- Find the geometric mean for the following data 1,2,3,4,5.
Solution :-
n=5, total number of values.
Then $G = [x_1 \times x_2 \times \dots \times x_n]^{1/n} = [1 \times 2 \times 3 \times 4 \times 5]^{1/5} = 2.60517$
3.2.5 The harmonic mean
The harmonic mean of a set of n numbers x1, x2, x3, …, xn can be evaluated as
follows :-

$$H = \frac{n}{\sum_{i=1}^{n} \frac{1}{x_i}} = \frac{n}{\frac{1}{x_1} + \frac{1}{x_2} + \frac{1}{x_3} + \dots + \frac{1}{x_n}}$$

Example 3.8:- Find the harmonic mean for the following data 1,2,3,4,5.
Solution :-
n=5, total number of values

$$H = \frac{5}{\frac{1}{1} + \frac{1}{2} + \frac{1}{3} + \frac{1}{4} + \frac{1}{5}} = 2.189781$$
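
Both means can be computed directly from their definitions. The sketch below uses hypothetical function names; the geometric mean is evaluated through the logarithm form given in Section 3.2.4, which avoids overflow when many values are multiplied.

```python
import math


def geometric_mean(values):
    """Geometric mean via logarithms: exp(mean of log x_i). Requires positive values."""
    return math.exp(sum(math.log(x) for x in values) / len(values))


def harmonic_mean(values):
    """Harmonic mean: n divided by the sum of reciprocals. Requires non-zero values."""
    return len(values) / sum(1 / x for x in values)


data = [1, 2, 3, 4, 5]  # data of Examples 3.7 and 3.8
print(round(geometric_mean(data), 5))
print(round(harmonic_mean(data), 6))
```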

3.3 Measures of dispersion
In addition to locating the center of the data, another important aspect of a
descriptive study of data is numerically measuring the extent of variation around the
center. Two data sets may show similar positions of center, but may be remarkably
different with respect to variability. For example, dot diagrams of two data sets
can show similar center values but different variations.

There are several forms that are used as measures of variation such as range,
mean absolute deviation, variance, and standard deviation.

3.3.1 The range (R)


We previously defined the range of a variable X as range (R)= max xi - min xi.
Clearly, the range is a measure of the total spread of the data.

3.3.2 Mean absolute deviation (M.A.D.)


Mean absolute deviation is used to find how consistent a set of data is. It
describes the average distance from the mean for the numbers in the data set. M.A.D.
for ungrouped data can be calculated as:

$$\text{M.A.D.} = \frac{\sum_{i=1}^{n} \left| x_i - \bar{x} \right|}{n}$$

For grouped (classified) data, the M.A.D. can be calculated by the following
formula:-

$$\text{M.A.D.} = \frac{\sum_{i=1}^{c} f_i \left| x'_i - \bar{x} \right|}{n}$$

where :-
c : number of classes.
n : number of observations.
x'i : the class mark of class i in grouped data.
fi : the frequency of class i in grouped data.
$\bar{x}$ : the mean of (grouped or ungrouped) data.

Example 3.9:- Find the mean absolute deviation for the following data 1,2,4,5.
Solution :- Mean = (1+2+4+5)/4=12/4=3
M.A.D. =[(2+1+1+2)/4]=6/4=1.5

3.3.3 Variance (S²) and standard deviation (S)


Variance is constructed by adding the squared deviations and dividing the total
by the number of observations minus one. If the data set contains n measurements
labeled x1, x2, x3, …, xn, then the variance is defined as:

$$S^2 = \frac{\text{sum of squared deviations}}{n - 1} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$$

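Whereas the variance for grouped (classified) data, with c classes, class marks x'i,
class frequencies fi and n observations in total, weights each squared deviation by
its class frequency and can be calculated as :-

$$S^2 = \frac{\sum_{i=1}^{c} f_i (x'_i - \bar{x})^2}{n - 1}$$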

The standard deviation is the square root of the variance.

$$S = \sqrt{\text{Variance}} = \sqrt{S^2}$$

Example 3.10:- Find the variance and standard deviation for the following data:
3, 4, 6, 7, 10
Solution :- Mean ($\bar{x}$) = (3+4+6+7+10)/5 = 6
Variance (S²) = [(−3)² + (−2)² + 0² + 1² + 4²]/4 = 30/4 = 7.5
Standard deviation (S) = √7.5 = 2.7386

3.3.4 Coefficient of Variance (C.V.)


The coefficient of variance (more commonly called the coefficient of variation) is
the ratio of the standard deviation to the mean, expressed as a percentage. It can be
calculated as follows:

$$\text{C.V.} = \frac{S}{\bar{x}} \times 100$$

where :-
S : standard deviation for (grouped or ungrouped) data.
$\bar{x}$ : the mean of (grouped or ungrouped) data.

Example 3.11 :- Find the coefficient of variance for the data in Example 3.10 above.
Solution :- Mean ($\bar{x}$) = (3+4+6+7+10)/5 = 6
Variance (S²) = [(−3)² + (−2)² + 0² + 1² + 4²]/4 = 7.5
Standard deviation (S) = √7.5 = 2.7386

$$\text{C.V.} = \frac{S}{\bar{x}} \times 100 = \frac{2.7386}{6} \times 100 = 45.64\ \%$$
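
The ungrouped dispersion measures of Sections 3.3.2 to 3.3.4 can be written as small Python functions. This is an illustrative sketch with hypothetical function names; the variance uses the n − 1 divisor defined above.

```python
import math


def mean(values):
    return sum(values) / len(values)


def mean_absolute_deviation(values):
    """Average absolute distance of the observations from their mean."""
    m = mean(values)
    return sum(abs(x - m) for x in values) / len(values)


def sample_variance(values):
    """Sum of squared deviations divided by (n - 1)."""
    m = mean(values)
    return sum((x - m) ** 2 for x in values) / (len(values) - 1)


def coefficient_of_variation(values):
    """Standard deviation as a percentage of the mean."""
    return math.sqrt(sample_variance(values)) / mean(values) * 100


print(mean_absolute_deviation([1, 2, 4, 5]))   # data of Example 3.9
print(sample_variance([3, 4, 6, 7, 10]))       # data of Example 3.10
print(coefficient_of_variation([3, 4, 6, 7, 10]))
```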

Example 3.12 :- Find the mean absolute deviation (M.A.D.), standard deviation (S),
and coefficient of variance (C.V.) for the grouped (classified) data below?

Class Limits 60 – 62 63 – 65 66 – 68 69 – 71 72 – 74
Frequency 5 18 42 27 8
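Solution :-

Classes    x'i   fi    x'i fi   |x'i − x̄|   fi |x'i − x̄|   (x'i − x̄)²   fi (x'i − x̄)²
60 – 62    61    5     305      6.45        32.25          41.6025      208.0125
63 – 65    64    18    1152     3.45        62.10          11.9025      214.2450
66 – 68    67    42    2814     0.45        18.90          0.2025       8.5050
69 – 71    70    27    1890     2.55        68.85          6.5025       175.5675
72 – 74    73    8     584      5.55        44.40          30.8025      246.4200
∑                100   6745                 226.50                      852.7500

$$\bar{x} = \frac{\sum_{i=1}^{c} x_i' f_i}{\sum_{i=1}^{c} f_i} = \frac{6745}{100} = 67.45$$

$$M.A.D. = \frac{\sum_{i=1}^{c} f_i\,|x_i'-\bar{x}|}{n} = \frac{226.5}{100} = 2.265$$

Each squared deviation must be weighted by its class frequency, and the square root is taken to obtain the standard deviation:

$$S = \sqrt{\frac{\sum_{i=1}^{c} f_i\,(x_i'-\bar{x})^2}{n-1}} = \sqrt{\frac{852.75}{99}} = \sqrt{8.6136} = 2.9349$$

$$C.V. = \frac{S}{\bar{x}} \times 100 = \frac{2.9349}{67.45} \times 100 = 4.351\ \%$$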
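
The grouped-data measures of Example 3.12 can be recomputed with the following Python sketch. It assumes that each squared deviation is weighted by its class frequency f_i and that the standard deviation is the square root of the resulting variance; the variable names are illustrative.

```python
import math

marks = [61, 64, 67, 70, 73]   # class marks x'_i of Example 3.12
freqs = [5, 18, 42, 27, 8]     # class frequencies f_i

n = sum(freqs)                                            # total number of observations
mean = sum(x * f for x, f in zip(marks, freqs)) / n       # grouped mean
mad = sum(f * abs(x - mean) for x, f in zip(marks, freqs)) / n
# Assumption: squared deviations are weighted by the class frequencies.
variance = sum(f * (x - mean) ** 2 for x, f in zip(marks, freqs)) / (n - 1)
std = math.sqrt(variance)
cv = std / mean * 100                                     # coefficient of variance in percent

print(round(mean, 2), round(mad, 3), round(std, 4), round(cv, 3))
```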

Homework A: The following table gives statistical data for ages and their frequencies. Find the mean, median, mode, mean absolute deviation, standard deviation, and coefficient of variance.
Age 5 – 14 15 – 24 25 – 34 35 – 44 45 – 54
Frequency 750 2005 1950 195 100

Homework B: The table below gives the age distribution of individuals starting new companies. Find the mean, median, mean absolute deviation, standard deviation, and coefficient of variance for this distribution.
Age 20 – 29 30 – 39 40 – 49 50 – 59 60 – 69
Frequency 11 25 14 7 3

CHAPTER FOUR

THE PROBABILITY

4.1 Introduction
Probability theory is a mathematical theory for describing and analyzing situations in which randomness or uncertainty is present. An experiment that can result in different outcomes, even though it is repeated in the same manner every time, is called a random experiment. The result of the experiment is called an outcome. The set of all possible outcomes of a random experiment is called the sample space, denoted S. Any subset of the sample space of a random experiment is called an event.

4.2 Probability theory


Suppose there is a finite number n of possibilities (often called outcomes), each with the same probability 1/n. A collection A of k outcomes, with k ≤ n, is called an event, and its probability P(A) is calculated as k/n:

$$P(A) = \frac{k}{n} = \frac{\text{the number of outcomes in } A}{\text{the total number of outcomes}}$$

An empty collection has probability zero and the whole collection has probability one. The probability that event A does not occur is denoted q(A):

$$q(A) = P(\text{not } A) = \frac{n-k}{n} = 1-\frac{k}{n} = 1-P(A)$$

Example 4.1:- The figure below shows a sample space S consisting of eight outcomes, each of which is labeled with a probability value. Find P(A) and q(A).

Solution :-
P(A)= 0.10+0.15+0.30=0.55
q(A)=0.10+0.10+0.05+0.15+0.05=0.45
or q(A) = 1 − P(A) = 1 − 0.55 = 0.45, because P(A) + q(A) = 1, which is a general rule.

Example 4.2:- The table below gives the age distribution of individuals starting new companies. Find the probability of each of the following: 1- Age in the class (30-39). 2- Age = 45. 3- Age > 50. 4- Age < 40.

Age 20 – 29 30 – 39 40 – 49 50 – 59 60 – 69
Frequency 11 25 14 7 3

Solution :-
1- The probability of ages between (30-39) = 25/60= 0.41667
2- The probability of age (45) = (probability of the class 40-49) / (class interval of 10 years), assuming the ages are spread evenly within the class
= (14/60)/10 = 0.02333
3- The probability of age > 50 = probability of the classes (50-59) and (60-69)
= (7/60) + (3/60) = (10/60) = 0.16667
4- The probability of age < 40 = probability of the classes (20-29) and (30-39)
= (11/60) + (25/60) = (36/60) = 0.6
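
The four answers of Example 4.2 are relative frequencies, which the following Python sketch reproduces. The helper `class_probability` is a hypothetical name introduced here, and the single-age case assumes that ages are spread evenly within a class.

```python
# Frequency table of Example 4.2: (lower limit, upper limit) -> frequency
classes = {(20, 29): 11, (30, 39): 25, (40, 49): 14, (50, 59): 7, (60, 69): 3}
total = sum(classes.values())   # 60 individuals

def class_probability(lower, upper):
    """Relative frequency of all classes lying entirely within [lower, upper]."""
    hits = sum(f for (lo, hi), f in classes.items() if lo >= lower and hi <= upper)
    return hits / total

p_30_39 = class_probability(30, 39)
p_over_50 = class_probability(50, 69)    # classes (50-59) and (60-69)
p_under_40 = class_probability(20, 39)   # classes (20-29) and (30-39)
# Single age 45: class probability divided by the class interval of 10 years
# (assumes uniformity inside the class).
p_45 = class_probability(40, 49) / 10

print(round(p_30_39, 5), round(p_45, 5), round(p_over_50, 5), round(p_under_40, 5))
```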

4.3 Combinations of Events


For two events A and B, in addition to the consideration of the probability of
event A occurring and the probability of event B occurring, it is often important to
consider other probabilities such as the probability of both events occurring
simultaneously. Other quantities of interest may be the probability that neither event
A nor event B occurs, the probability that at least one of the two events occurs, or the
probability that event A occurs, but event B does not.

4.3.1 Intersections of Events


Consider first the calculation of the probability that both events occur
simultaneously. This can be done by defining a new event to consist of the outcomes
that are in both event A and event B.
The event A∩B is the intersection of the events A and B and consists of the
outcomes that are contained within both events A and B. The probability of this
event, P(A∩B), is the probability that both events A and B occur simultaneously.

Example 4.3:- The figure below shows a sample space S that consists of nine outcomes. Event A consists of three outcomes, and event B consists of five outcomes. Find P(A∩B), P(A∩A´), P(A´∩B), and P(A∩B´).
Solution :-

P(A) = 0.01 + 0.07 + 0.19 = 0.27

P(B) = 0.07 + 0.19 + 0.04 + 0.14 + 0.12 = 0.56

P(A∩B) = 0.07 + 0.19 = 0.26 (two outcomes are contained within both events A and B). This is the probability that both events A and B occur simultaneously.

Event A´, the complement of the event A, is the event consisting of the six outcomes that are not in event A. Notice that there are no outcomes in A∩A´, and this is written as A∩A´ = ∅, where ∅ is referred to as the “empty set,” a set that does not contain anything.
P(A∩A´) = P(∅) = 0

That is, it is impossible for the event A to occur at the same time as its complement (A´).
A more interesting event is the event A´∩B. This event consists of the three
outcomes that are contained within event B but that are not contained within event A.
It has a probability of

P(A´∩B) = 0.04 + 0.14 + 0.12 = 0.30 (which is the probability that event B occurs
but event A does not occur.)

Similarly, the event A∩B´ consists of the single outcome that is contained within event A but not within event B. It has a probability of

P(A∩B´) = 0.01 (This is the probability that event A occurs but event B does not.)

Notice that
P(A∩B) + P(A∩B´) = 0.26 + 0.01 = 0.27 = P(A)
and similarly that
P(A∩B) + P(A´∩B) = 0.26 + 0.30 = 0.56 = P(B)

The following two equalities hold in general for all events A and B:

P(A∩B) + P(A∩B´) = P(A)
P(A∩B) + P(A´∩B) = P(B)

Two events A and B that have no outcomes in common are said to be mutually
exclusive events. In this case A∩B=∅ and P(A∩B)=0.

Some other simple results concerning the intersections of events are as follows:

A∩B=B∩A A∩A=A A∩S=A


A∩∅=∅ A∩A´=∅ A∩(B∩C)=(A∩B)∩C

4.3.2 Unions of Events


The event A∪B is the union of events A and B and consists of the outcomes
that are contained within at least one of the events A and B. The probability of this
event, P(A∪B), is the probability that at least one of the events A and B occurs.

Notice that the outcomes in the event A∪B can be classified into three kinds.
They are
1- in event A, but not in event B
2- in event B, but not in event A
3- in both events A and B

Since these three kinds of outcomes form mutually exclusive events, the probability of A∪B is obtained as the sum of their probabilities, and the following result is obtained:

P(A∪B) = P(A∩B´) + P(A´∩B) + P(A∩B)

This equality can be presented in another form using the relationships

P(A∩B´) = P(A) − P(A∩B) and P(A´∩B) = P(B) − P(A∩B)

Substituting these expressions for P(A∩B´) and P(A´∩B) gives the following result:

P(A∪B) = P(A) + P(B) − P(A∩B)

This equality has the intuitive interpretation that the probability of at least one of the events A and B occurring can be obtained by adding the probabilities of the two events A and B and then subtracting the probability that both the events occur simultaneously.

If the events A and B are mutually exclusive, so that P(A∩B) = 0, then P(A∪B) = P(A) + P(B).

Example 4.4:- The sample space of nine outcomes illustrated in Example 4.3 can
be used to demonstrate some general relationships between unions
and intersections of events.
Solution :-
The event (A∪B) consists of six outcomes, and it has a probability of
P(A∪B) = 0.01 + 0.07 + 0.19 + 0.04 + 0.14 + 0.12 = 0.57

The event (A∪B)´, which is the complement of the union of the events A and B,
consists of the three outcomes that are neither in event A nor in event B. It has a
probability of

P((A∪B)´) = 0.03 + 0.22 + 0.18 = 0.43 = 1 − P(A∪B)

Notice that the event (A∪B)´ can also be written as A´∩B´ since it consists of
those outcomes that are simultaneously neither in event A nor in event B. This is a
general result:

(A∪B)´ = A´∩B´

Furthermore, the event A´∪B´ consists of seven outcomes, and it has a probability of
P(A´∪B´) = 0.01 + 0.03 + 0.22 + 0.18 + 0.12 + 0.14 + 0.04 = 0.74

However, this event can also be written as (A∩B)´ since it consists of the outcomes
that are in the complement of the intersection of sets A and B. Hence, its probability
could have been calculated by

P(A´∪B´) = P((A∩B)´) = 1 − P(A∩B) = 1 − 0.26 = 0.74


Again, this is a general result:
(A∩B) ´= A´∪B´
Finally, if event A is contained within event B, A⊂B, then clearly A∪B = B.
Some other simple results concerning the unions of events are as follows:

A∪B = B∪A A∪A = A A∪S = S


A∪∅ = A A∪A´= S A∪(B∪C) = (A∪B)∪C
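
The relationships between unions, intersections, and complements can be checked numerically with Python sets. In this sketch the outcome labels o1 to o9 are hypothetical; the probability values are the ones quoted in Examples 4.3 and 4.4.

```python
import math

# Hypothetical labels o1..o9; probability values as quoted in Examples 4.3 and 4.4.
prob = {"o1": 0.01, "o2": 0.07, "o3": 0.19, "o4": 0.04, "o5": 0.14,
        "o6": 0.12, "o7": 0.03, "o8": 0.22, "o9": 0.18}
S = set(prob)                             # sample space
A = {"o1", "o2", "o3"}                    # event A (three outcomes)
B = {"o2", "o3", "o4", "o5", "o6"}        # event B (five outcomes)

def P(event):
    """Probability of an event: the sum of the probabilities of its outcomes."""
    return sum(prob[o] for o in event)

# Addition rule: P(A U B) = P(A) + P(B) - P(A n B)
addition_rule_holds = math.isclose(P(A | B), P(A) + P(B) - P(A & B))
# De Morgan's laws: (A U B)' = A' n B'  and  (A n B)' = A' U B'
de_morgan_holds = (S - (A | B) == (S - A) & (S - B)) and \
                  (S - (A & B) == (S - A) | (S - B))

print(round(P(A & B), 2), round(P(A | B), 2), round(P(S - (A & B)), 2))
print(addition_rule_holds, de_morgan_holds)
```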

4.3.3 Combinations of Three or More Events
The probability of the union of three events A, B, and C is the sum of the
probability values of the simple outcomes that are contained within at least one of the
three events. It can also be calculated from the expression
P(A∪B∪C) =[P(A)+P(B)+P(C)]-[P(A∩B)+P(A∩C)+P(B∩C)]+P(A∩B∩C)

Example 4.5:- Three events decompose the sample space into eight regions as
shown in the figure below. Find P(A∪B∪C) ?

Solution :-
The event A is composed of the regions 2, 3, 5, and 6, and the event A∩B is composed of the regions 3 and 6. The event A∩B∩C, the intersection of the events A, B, and C, consists of the outcomes that are simultaneously contained within all three events A, B, and C; it corresponds to region 6. The event A∪B∪C, the union of the events A, B, and C, consists of the outcomes that are in at least one of the three events A, B, and C; it corresponds to all of the regions except for region 1. Hence region 1 can be referred to as (A∪B∪C)´ since it is the complement of the event A∪B∪C.

In general, care must be taken to avoid ambiguities when specifying combinations of three or more events. For example, the expression A∪B∩C is ambiguous, since the two events A∪(B∩C) and (A∪B)∩C are different.

The event B∩C is composed of regions 6 and 7, so A∪(B∩C) is composed of regions 2, 3, 5, 6, and 7. In contrast, the event A∪B is composed of regions 2, 3, 4, 5, 6, and 7, so (A∪B)∩C is composed of just regions 5, 6, and 7.

The required probability, P(A∪B∪C), is the sum of the probability values of the
outcomes in regions 2, 3, 4, 5, 6, 7, and 8. However, the sum of the probabilities
P(A), P(B), and P(C) counts regions 3, 5, and 7 twice, and region 6 three times.
Subtracting the probabilities P(A∩B), P(A∩C), and P(B∩C) removes the double
counting of regions 3, 5, and 7 but also subtracts the probability of region 6 three
times. The expression is then completed by adding back on P(A∩B∩C), the
probability of region 6.
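
The counting argument above can be verified with a small Python sketch. The region membership of A, B, and C follows the description in the solution; the region probabilities are hypothetical, chosen only so that they sum to 1, because the values in the figure are not quoted in the text.

```python
import math

# Hypothetical probabilities for regions 1..8 (they sum to 1).
region_prob = {1: 0.10, 2: 0.15, 3: 0.10, 4: 0.15,
               5: 0.10, 6: 0.05, 7: 0.15, 8: 0.20}
# Region membership as described in the solution of Example 4.5.
A, B, C = {2, 3, 5, 6}, {3, 4, 6, 7}, {5, 6, 7, 8}

def P(event):
    """Probability of an event made up of whole regions."""
    return sum(region_prob[r] for r in event)

direct = P(A | B | C)                      # regions 2 to 8 added once each
inclusion_exclusion = (P(A) + P(B) + P(C)
                       - (P(A & B) + P(A & C) + P(B & C))
                       + P(A & B & C))

print(math.isclose(direct, inclusion_exclusion))
print(sorted(A | (B & C)), sorted((A | B) & C))   # the two readings of A U B n C differ
```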

The figure below illustrates three events A, B, and C that are mutually exclusive
because no two events have any outcomes in common. In this case,
P(A∪B∪C) = P(A) + P(B) + P(C)

In general, the union and intersection operations satisfy the following laws. For any subsets A, B, C of S, we have:

Commutative Law: A∪B = B∪A ,
A∩B = B∩A
Associative Law: (A∪B)∪C = A∪(B∪C) ,
(A∩B)∩C = A∩(B∩C)
Distributive Law: (A∪B)∩C = (A∩C)∪(B∩C) ,
A∪(B∩C) = (A∪B)∩(A∪C)

4.4 Combinations Rule
A second method of determining the number of sample points for an experiment is to use combinatorial mathematics. This branch of mathematics is concerned with developing counting rules for given situations. For example, there is a simple rule for finding the number of different samples of 5 students selected from 1,000. This rule, called the combinations rule, is given below.

Suppose a sample of n elements is to be drawn without replacement from a set of N elements. Then the number of different samples possible is denoted by $\binom{N}{n}$ and is equal to:

$$\binom{N}{n} = \frac{N!}{n!\,(N-n)!}$$

Where: -
n! = n (n-1) (n-2) …. (3)(2)(1)
and similarly for N! and (N – n)!

For example, 5! = 5·4·3·2·1 = 120. [Note: the quantity 0! is defined to be equal to 1.]

Example 4.6: - Consider the task of choosing 2 students from 4 students. Use the
combinations counting rule to determine how many different
selections can be made.
Solution: -
For this example, N = 4, n = 2, and

$$\binom{4}{2} = \frac{4!}{2!\,(4-2)!} = \frac{4!}{2!\,2!} = \frac{4\cdot 3\cdot 2\cdot 1}{(2\cdot 1)(2\cdot 1)} = 6$$
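
The result of Example 4.6 can be confirmed in Python, both from the formula and by listing the selections. The student labels s1 to s4 are hypothetical.

```python
import math
from itertools import combinations

students = ["s1", "s2", "s3", "s4"]        # hypothetical labels for the 4 students
count = math.comb(4, 2)                    # N! / (n! (N - n)!)
pairs = list(combinations(students, 2))    # every unordered selection of 2 students

print(count, len(pairs))
for pair in pairs:
    print(pair)
```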

Classwork: - Compute the number of ways you can select n elements from N
elements for each of the following:
a. n = 2, N = 5
b. n = 3, N = 6
c. n = 5, N = 20


4.5 Multiplicative Rule


Suppose there are k sets of elements, with n1 elements in the first set, n2 in the second set, …, and nk in the kth set, and we wish to form a sample of k elements by taking one element from each of the k sets. Then the number of different samples that can be formed is the product n1 n2 n3 … nk.

Example 4.7: - There are 20 engineers available for three different positions: E1, E2, and E3. In how many different ways can the positions be filled?
Solution: -
There are k = 3 sets of elements, corresponding to
Set 1: engineers available to fill position E1
Set 2: engineers remaining (after filling E1) that are available to fill E2
Set 3: engineers remaining (after filling E1 and E2) that are available to fill E3 .
The numbers of elements in the sets are n1= 20, n2= 19, and n3= 18.
Therefore, the number of different ways of filling the three positions is given by the
multiplicative rule as n1 n2 n3 = (20) (19) (18) = 6840.
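
As a quick check of Example 4.7, the multiplicative rule is simply a product of the set sizes. The variable names in this sketch are illustrative.

```python
import math

set_sizes = [20, 19, 18]        # n1, n2, n3: engineers available for E1, E2, E3
ways = math.prod(set_sizes)     # multiplicative rule: n1 * n2 * n3
print(ways)
```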

4.6 Permutations Rule


Given a set of N different elements, suppose we wish to select n elements from the N and arrange them within n positions. The number of different permutations of the N elements taken n at a time is denoted by $P_n^N$ and is equal to

$$P_n^N = N(N-1)(N-2)\cdots(N-n+1) = \frac{N!}{(N-n)!}$$
Example 4.8: - Consider the task of choosing 2 students from 4 students. Use the permutations rule to determine in how many different ways this can be done when the order of selection matters.
Solution: -
For this example, N = 4, n = 2, and

$$P_n^N = P_2^4 = \frac{4!}{(4-2)!} = \frac{4\cdot 3\cdot 2\cdot 1}{2\cdot 1} = 12$$
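
The difference between Example 4.6 and Example 4.8 is whether order matters. The following Python sketch, using hypothetical student labels, shows that each of the 6 combinations gives 2! = 2 ordered arrangements.

```python
import math
from itertools import permutations

students = ["s1", "s2", "s3", "s4"]          # hypothetical labels for the 4 students
n_perm = math.perm(4, 2)                     # N! / (N - n)!
ordered = list(permutations(students, 2))    # ordered selections of 2 students

# Each unordered pair appears in 2! different orders.
print(n_perm, len(ordered), math.comb(4, 2) * math.factorial(2))
```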

4.7 Partitions Rule


Suppose we wish to partition a single set of N different elements into k sets, with the first set containing n1 elements, the second containing n2 elements, …, and the kth set containing nk elements. Then the number of different partitions is

$$\frac{N!}{n_1!\,n_2!\cdots n_k!}$$

where n1 + n2 + … + nk = N.
Example 4.9: - Suppose you have 12 construction workers and you wish to assign 3 workers to site 1, 4 workers to site 2, and 5 workers to site 3. In how many different ways can you make this assignment?
Solution: -
For this example, k = 3 (corresponding to the k = 3 different sites), N = 12, n1 = 3, n2 = 4, and n3 = 5. Then the number of different ways to assign the workers to the sites is

$$\frac{N!}{n_1!\,n_2!\,n_3!} = \frac{12!}{3!\,4!\,5!} = \frac{12\cdot 11\cdot 10\cdots 3\cdot 2\cdot 1}{(3\cdot 2\cdot 1)(4\cdot 3\cdot 2\cdot 1)(5\cdot 4\cdot 3\cdot 2\cdot 1)} = 27720$$
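
The partitions rule can be sketched in Python as follows. The function name `partitions` is hypothetical; the second calculation shows the same count obtained by repeated use of the combinations rule (choose 3 of 12, then 4 of the remaining 9, then 5 of the remaining 5).

```python
import math

def partitions(sizes):
    """N! / (n1! n2! ... nk!), where N is the sum of the group sizes."""
    total = math.factorial(sum(sizes))
    for n in sizes:
        total //= math.factorial(n)   # exact integer division at every step
    return total

ways = partitions([3, 4, 5])          # sites 1, 2, 3 of Example 4.9
stepwise = math.comb(12, 3) * math.comb(9, 4) * math.comb(5, 5)
print(ways, stepwise)
```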

Summary of Counting Rules

1. Multiplicative rule. If you are drawing one element from each of k sets of elements, where the sizes of the sets are n1, n2, …, nk, then the number of different results is
n1 n2 n3 … nk
2. Permutations rule. If you are drawing n elements from a set of N elements and arranging the n elements in a distinct order, then the number of different results is
$$P_n^N = \frac{N!}{(N-n)!}$$
3. Partitions rule. If you are partitioning the elements of a set of N elements into k groups consisting of n1, n2, …, nk elements (n1 + … + nk = N), then the number of different results is
$$\frac{N!}{n_1!\,n_2!\cdots n_k!}$$

4.8 Conditional Probability


The probability of an event B occurring when it is known that some event A has occurred is called a conditional probability and is denoted by P(B\A) (also commonly written P(B|A)). The symbol P(B\A) is usually read “the probability that B occurs given that A occurs” or simply “the probability of B, given A”. It is defined by P(B\A) = P(A∩B)/P(A), provided P(A) > 0.

Example 4.10: - The table below gives the number of persons in a small city, classified by gender and education. One person is chosen at random. Find the conditional probability that a man is chosen, given that the person chosen is educated.
Educated Uneducated Total
Male 460 40 500
Female 140 260 400
Total 600 300 900
Solution: -
For this example, let M: a man is chosen, and E: the one chosen is educated. Then
P(M\E) = P(E∩M)/P(E) = (460/900)/(600/900) = 460/600 = 0.7667

Example 4.11: - The following table summarizes the analysis of samples of
galvanized steel for coating weight and surface roughness. The
results from 100 samples are shown.
                               Coating Weight
                               High     Low
Surface Roughness    High      70       9
                     Low       16       5

Let A denote the event that a sample has high coating weight,
and let B denote the event that a sample has high surface
roughness. Determine the following probabilities:
(a) P(A ) (b) P(B) (c) P (A \B) (d) P(B \A)
Solution: -
(a) P(A)=(70+16)/100=86/100
(b) P(B)=(70+9)/100=79/100
(c) P(A\B)=P(B∩A)/P(B)= (70/100)/(79/100)=70/79
(d) P(B\A)=P(A∩B)/P(A)= (70/100)/(86/100)=70/86
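
The conditional probabilities of Example 4.11 can be computed directly from the table of counts. This Python sketch uses exact fractions; the dictionary layout and variable names are illustrative choices.

```python
from fractions import Fraction

# counts[(surface roughness, coating weight)] from the table of Example 4.11
counts = {("high", "high"): 70, ("high", "low"): 9,
          ("low", "high"): 16, ("low", "low"): 5}
total = sum(counts.values())   # 100 samples

# A: high coating weight, B: high surface roughness
p_A = Fraction(sum(v for (rough, coat), v in counts.items() if coat == "high"), total)
p_B = Fraction(sum(v for (rough, coat), v in counts.items() if rough == "high"), total)
p_AB = Fraction(counts[("high", "high")], total)

p_A_given_B = p_AB / p_B   # P(A\B) = P(A n B) / P(B)
p_B_given_A = p_AB / p_A   # P(B\A) = P(A n B) / P(A)
print(float(p_A_given_B), float(p_B_given_A))
```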

4.9 Total probability and Bayes’ theorems


Sometimes the probability of an event A cannot be determined directly. However, its occurrence is accompanied by the occurrence of other events Bi, i = 1, 2, . . . , k, such that the probability of A depends on which of the events Bi has occurred. In such a case, the probability of A is an expected probability, that is, the average of the conditional probabilities of A weighted by the probabilities of the Bi. This problem can be approached using the theorem of total probability, which can be derived from the definition of conditional probability. The case in which the sample space is partitioned into k subsets is covered by the following theorem, sometimes called the theorem of total probability or the rule of elimination (see figure).

If the events B1, B2, . . . , Bk constitute a partition of the sample space S such that P(Bi) ≠ 0 for i = 1, 2, . . . , k, then for any event A of S,

P(A) = P(B1)P(A\B1) + P(B2)P(A\B2) + … + P(Bk)P(A\Bk), or

$$P(A) = \sum_{i=1}^{k} P(B_i \cap A) = \sum_{i=1}^{k} P(B_i)\,P(A \backslash B_i)$$

Example 4.12: - A plant has three machines, B1, B2, and B3, which make 30%, 45%, and 25%, respectively, of the products. It is known from past experience that 2%, 3%, and 2% of the products made by each machine, respectively, are BAD. Now, suppose that a finished product is randomly selected. What is the probability that it is BAD?
Solution: -
Consider the following events:
A : the product is BAD,
B1 : the product is made by machine B1 ,
B2 : the product is made by machine B2 ,
B3 : the product is made by machine B3 .
Applying the rule of elimination and drawing a tree diagram,
P(A) = P(B1)P(A\B1) + P(B2)P(A\B2) + P(B3)P(A\B3)

Referring to the tree diagram, the three branches give the probabilities

P(B1)P(A\B1) = (0.3)(0.02) = 0.006,
P(B2)P(A\B2) = (0.45)(0.03) = 0.0135,
P(B3)P(A\B3) = (0.25)(0.02) = 0.005,
and hence
P(A) = 0.006 + 0.0135 + 0.005 = 0.0245.

Bayesian statistics is a collection of tools used in a special form of statistical inference that applies to the analysis of experimental data in many practical situations in science and engineering. Bayes’ rule is one of the most important rules in probability theory.

Bayes’ rule can be applied when an observed event A occurs with any one of several mutually exclusive and exhaustive events B1, B2, …, Bk. The formula for finding the appropriate conditional probabilities is given below.

Bayes’s Rule
Given k mutually exclusive and exhaustive events, B1, B2 , … ,Bk such that
P(B1)+P(B2)+…..+P(Bk)=1, and given an observed event A , it follows that
P(Bi\A) = P(Bi∩A)/P(A) = P(Bi)P(A\Bi)/P(A)
Where: P(A)=P(B1)P(A\B1)+P(B2)P(A\B2)+…….+P(Bk)P(A\Bk)

Example 4.13: - For example 4.12, if a product was chosen randomly and found to
be BAD, what is the probability that it was made by machine B3 ?
Solution: -
Using Bayes’ rule,
P(B3\A)=P(B3∩A)/P(A)
P(B3∩A)=P(B3)P(A\B3)= (0.25)(0.02) = 0.005,
P(A)=P(B1)P(A\B1)+P(B2)P(A\B2)+P(B3)P(A\B3)= 0.006+0.0135+0.005= 0.0245.
Then,
P(B3\A)= 0.005/0.0245=0.2041
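
Examples 4.12 and 4.13 can be reproduced together with a few lines of Python. This is a minimal sketch with illustrative variable names: the joint probabilities P(Bi∩A) are summed to give the total probability, and each is then divided by that total to apply Bayes’ rule.

```python
prior = {"B1": 0.30, "B2": 0.45, "B3": 0.25}       # P(Bi): share of production
bad_given = {"B1": 0.02, "B2": 0.03, "B3": 0.02}   # P(A\Bi): BAD rate per machine

joint = {m: prior[m] * bad_given[m] for m in prior}   # P(Bi n A)
p_bad = sum(joint.values())                           # theorem of total probability
posterior = {m: joint[m] / p_bad for m in prior}      # Bayes' rule: P(Bi\A)

print(round(p_bad, 4))
for machine, p in posterior.items():
    print(machine, round(p, 4))
```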

Example 4.14: - The table shows the probability of product failure for each level of contamination in manufacturing:
Probability of Failure    Level of Contamination
0.10                      High
0.01                      Medium
0.001                     Low

In a particular production run, 20% of the chips are subjected to high levels of
contamination, 30% to medium levels of contamination, and 50% to low levels of
contamination. What is the probability that a product using one of these chips
fails?
Solution: -
Let:
F denote the event that the product fails
H denote the event that a chip is exposed to high levels of contamination
M denote the event that a chip is exposed to medium levels of contamination
L denote the event that a chip is exposed to low levels of contamination
Then,

P(F) = P(F\H)P(H) + P(F\M)P(M) + P(F\L)P(L)
     = 0.10(0.20) + 0.01(0.30) + 0.001(0.50) = 0.0235
The calculations are conveniently organized with a tree diagram.

4.10 Geometric Probability
Probability is always expressed as a ratio between 0 and 1 that gives a value to how likely an event is to happen. A probability of 0 means there is no chance of that event happening. A probability of 1 means the particular event will always happen. To calculate a geometric probability, you need to find the areas of the shapes involved in the problem. You need to know the total area, which means the biggest area in the diagram, and you also need to know the desired area. The formula is simply:

$$P = \frac{\text{desired}}{\text{total}}$$

where P represents the geometric probability, desired stands for the desired area, and total stands for the area of the whole figure.

Example 4.14: A circle is inscribed in a square of side 5 cm. Find the probability that a point chosen at random inside the square also lies inside the circle.
Solution:-
1- Draw both the circle and the square.

2- When a circle is inscribed in a square, it is tangent to the square at four points of its boundary, so its diameter equals the side of the square and r = 5/2 = 2.5 cm.

3- Find the area of the circle and the area of the square.

Area of circle = πr² = 3.14 × (2.5)² = 19.63 cm²
Area of square = side² = (5)² = 25 cm²

4- Find the geometric probability.

$$P = \frac{\text{Area of circle}}{\text{Total area (square area)}} = \frac{19.63}{25} = 0.785$$

Example 4.15: Two numbers are chosen at random, each with a value between 0 and 1. What is the probability that the sum of their squares is greater than 1?
Solution:-
1. Represent each number by a coordinate axis and then determine the sample space.

[Figure: the sample space is the unit square 0 ≤ x ≤ 1, 0 ≤ y ≤ 1.]

2. Derive a mathematical expression for the required event:

X² + Y² > 1 (this is the required event)

3. Change the above expression to a mathematical equation:

X² + Y² = 1 (the equation of a circle with r = 1)

4. Now, draw the above equation in the sample space.

[Figure: the quarter circle X² + Y² = 1 divides the unit square into the region X² + Y² < 1 inside the arc and the region X² + Y² > 1 outside it.]

Then,

P = Desired / Total

Total = 1 × 1 = 1

Desired = 1 − (π·1²/4) = 1 − π/4

P = (1 − π/4) / 1 = 1 − π/4 = 0.2146 ≈ 0.21
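
A geometric probability of this kind can also be approximated by simulation. The following Python sketch is an illustration, not part of the original solution: it draws many random points in the unit square and counts the fraction for which X² + Y² > 1. The function name, the number of trials, and the seed are arbitrary choices.

```python
import math
import random

def estimate_probability(trials=200_000, seed=1):
    """Monte Carlo estimate of P(X^2 + Y^2 > 1) for X, Y uniform on (0, 1)."""
    rng = random.Random(seed)          # fixed seed so the run is repeatable
    hits = 0
    for _ in range(trials):
        x, y = rng.random(), rng.random()
        if x * x + y * y > 1:          # point falls outside the quarter circle
            hits += 1
    return hits / trials

exact = 1 - math.pi / 4                # desired area / total area
approx = estimate_probability()
print(round(exact, 4), round(approx, 4))
```

The estimate should agree with 1 − π/4 to about two decimal places; increasing the number of trials narrows the gap.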

CHAPTER FIVE

PROBABILITY DISTRIBUTION

5.1 Introduction
In an experiment, a measurement is usually denoted by a variable such as X. In a random experiment, a variable whose measured value can change (from one replicate of the experiment to another) is referred to as a random variable. There are two types of random variables. A discrete random variable is a random variable with a finite (or countably infinite) set of real numbers for its range. A continuous random variable is a random variable with an interval (either finite or infinite) of real numbers for its range.

The sample space for the machine breakdown problem is S = {electrical, mechanical, misuse}, and each of these failures may be associated with a repair cost. For example, suppose that electrical failures generally cost an average of $200 to repair, mechanical failures have an average repair cost of $350, and operator misuse failures have an average repair cost of only $50. These repair costs generate a random variable, cost, as illustrated in the figure below, which has a state space of {50, 200, 350}.

Notice that cost is a random variable because its values 50, 200, and 350 are numbers. The breakdown cause, defined to be electrical, mechanical, or operator misuse, is not considered to be a random variable because its values are not numerical.

The probability distribution of a random variable X is a description of the probabilities associated with the possible values of X. The two kinds of probability distribution of a random variable are discrete and continuous.

5.2 Discrete probability distribution
For a discrete random variable X, its distribution can be described by a
function that specifies the probability at each of the possible discrete values for X.

For a discrete random variable X with possible values x1, x2, …, xn, a probability mass function (pmf) is a function such that

(1) $f(x_i) \ge 0$
(2) $\sum_{i=1}^{n} f(x_i) = 1$
(3) $f(x_i) = P(X = x_i)$

The cumulative distribution function of a discrete random variable X, denoted as F(x), is

$$F(x) = P(X \le x) = \sum_{x_i \le x} f(x_i)$$

and F(x) satisfies the following properties:

(1) $F(x) = P(X \le x) = \sum_{x_i \le x} f(x_i)$
(2) $0 \le F(x) \le 1$
(3) if $x \le y$, then $F(x) \le F(y)$

Example 5.1: - Consider randomly selecting a student at a university, and define X = 1 if the selected student does not qualify and X = 0 if the student does qualify. If 20% of all students do not qualify, find the probability mass function (pmf) for X.
Solution.
The pmf for X is
p(0)=P(X=0)=P(the selected student does qualify)= 0.8
p(1)=P(X=1)=P(the selected student does not qualify)= 0.2
p(x)=P(X=x)=0 for x≠0 or 1.

$$p(x) = \begin{cases} 0.8 & \text{if } x = 0 \\ 0.2 & \text{if } x = 1 \\ 0 & \text{if } x \ne 0 \text{ or } 1 \end{cases}$$

Example 5.2: - Determine the probability mass function (pmf) of X from the
following cumulative distribution function:

$$F(x) = \begin{cases} 0 & x < -2 \\ 0.2 & -2 \le x < 0 \\ 0.7 & 0 \le x < 2 \\ 1 & 2 \le x \end{cases}$$
Solution.

The Figure above displays a plot of F(x) . From the plot, the only points that receive
nonzero probability are –2, 0, and 2. The probability mass function (pmf) at each
point is the jump in the cumulative distribution function at the point.
Therefore,
f (−2) = 0.2 − 0 = 0.2
f (0) = 0.7 − 0.2 = 0.5
f (2) = 1.0 − 0.7 = 0.3
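
The rule "pmf = jump in the cdf" used in Example 5.2 can be written as a short Python sketch. The list `cdf_steps` is an illustrative encoding of the cumulative distribution function: each entry is a jump point together with the value of F(x) just after the jump.

```python
# Jump points of F(x) in Example 5.2 and the value of F at each of them
cdf_steps = [(-2, 0.2), (0, 0.7), (2, 1.0)]

pmf = {}
previous = 0.0                            # F(x) = 0 to the left of the first jump
for x, F_x in cdf_steps:
    pmf[x] = round(F_x - previous, 10)    # size of the jump at x (rounded to hide float noise)
    previous = F_x

print(pmf)
```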

5.2.1 Discrete Uniform Distribution


The simplest discrete random variable is one that assumes only a finite number of possible values, each with equal probability. If a random variable X assumes each of the values x1, x2, …, xn with equal probability 1/n, then

$$f(x_i) = \frac{1}{n}$$

Suppose that X is a discrete uniform random variable on the consecutive integers a, a+1, a+2, …, b, for a ≤ b. Then

The mean of X is
$$\mu = E(X) = \frac{b+a}{2}$$
The variance of X is
$$\sigma^2 = \frac{(b-a+1)^2-1}{12}$$

Example 5.3: - The first digit of a serial number is equally likely to be any one of the digits 0 through 9. If one part is selected and X is the first digit, find f(x).
Solution.
X has a discrete uniform distribution with probability 0.1 for each value. That is,
f(x) = 0.1 for x = 0, 1, …, 9

Example 5.4: - A voice communication system contains 48 external lines. At a particular time, the system is observed, and some of the lines are being used. Assuming that the number of lines in use is a discrete uniform random variable, find its mean and variance.
Solution
Let the random variable X denote the number of the 48 lines in use. Then X can assume any of the integer values in the range 0 to 48.

$$\mu = E(X) = \frac{b+a}{2} = \frac{48+0}{2} = 24$$

And,

$$\sigma^2 = \frac{(b-a+1)^2-1}{12} = \frac{(48-0+1)^2-1}{12} = \frac{2400}{12} = 200$$

so the standard deviation is $\sigma = \sqrt{200} = 14.14$.
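
The mean and variance formulas of the discrete uniform distribution can be checked by direct enumeration. This Python sketch, with illustrative variable names, treats each integer from a = 0 to b = 48 as equally likely and compares the enumerated variance with the closed-form expression.

```python
a, b = 0, 48
values = range(a, b + 1)        # 0, 1, ..., 48
n = len(values)                 # 49 equally likely values, each with probability 1/n

mean = sum(values) / n
variance = sum((x - mean) ** 2 for x in values) / n   # E[(X - mu)^2]
formula_variance = ((b - a + 1) ** 2 - 1) / 12
std = variance ** 0.5

print(mean, variance, formula_variance, round(std, 2))
```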

5.2.2 Binomial Distribution


A trial with only two possible outcomes is used so frequently as a building
block of a random experiment that it is called a Bernoulli trial.
A random experiment consists of n Bernoulli trials such that
(1) The trials are independent.
(2) Each trial results in only two possible outcomes, labeled as “success” and
“failure.”
(3) The probability of a success in each trial, denoted as p, remains constant.

The random variable X that equals the number of trials that result in a success is a binomial random variable with parameters 0 < p < 1 and n = 1, 2, …. The probability mass function of X is

$$f(x) = \binom{n}{x} p^x (1-p)^{n-x}, \quad x = 0, 1, 2, \ldots, n$$

where $\binom{n}{x} = \frac{n!}{x!\,(n-x)!}$.

If X is a binomial random variable with parameters p and n,

The mean of X is
$$\mu = E(X) = np$$
The variance of X is
$$\sigma^2 = np(1-p)$$

Example 5.5:- Each sample of water has a 10% chance of containing a particular
organic pollutant. Assume that the samples are independent with regard to the
presence of the pollutant. Find the following :-
(a) The probability that in the next 18 samples, exactly 2 contain the pollutant.
(b) The probability that at least four samples contain the pollutant in 18 samples.
(c) The probability that 3 ≤ X < 7, where X is the number of samples that contain the pollutant.
Solution
Let X = the number of samples that contain the pollutant in the next 18
samples analyzed. Then X is a binomial random variable with p = 0.1 and
n = 18.

(a) $$P(X = 2) = \binom{18}{2}\, 0.1^{2}\, (1-0.1)^{18-2} = 0.284$$

(b) $$P(X \ge 4) = \sum_{x=4}^{18} \binom{18}{x}\, 0.1^{x} (1-0.1)^{18-x} = 1 - \sum_{x=0}^{3} \binom{18}{x}\, 0.1^{x} (1-0.1)^{18-x} = 0.098$$

(c) $$P(3 \le X < 7) = \sum_{x=3}^{6} \binom{18}{x}\, 0.1^{x} (1-0.1)^{18-x} = 0.265$$
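
The three probabilities of Example 5.5 can be evaluated with a few lines of Python. The helper `binom_pmf` is a hypothetical name; it implements the binomial probability mass function with `math.comb`.

```python
from math import comb

def binom_pmf(x, n, p):
    """Binomial pmf: C(n, x) * p^x * (1 - p)^(n - x)."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

n, p = 18, 0.1
p_a = binom_pmf(2, n, p)                                 # P(X = 2)
p_b = 1 - sum(binom_pmf(x, n, p) for x in range(0, 4))   # P(X >= 4) = 1 - P(X <= 3)
p_c = sum(binom_pmf(x, n, p) for x in range(3, 7))       # P(3 <= X < 7), x = 3..6

print(round(p_a, 3), round(p_b, 3), round(p_c, 3))
```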

Example 5.6:- Let X be a binomial random variable with p=0.1, and n=10. Calculate
the following probabilities from the binomial probability mass function:
(a) P(X ≤ 2)    (b) P(X = 4)    (c) P(5 ≤ X ≤ 7)
Solution

(a) $$P(X \le 2) = \sum_{x=0}^{2} \binom{10}{x}\, 0.1^{x} (1-0.1)^{10-x} = 0.9298$$

(b) $$P(X = 4) = \binom{10}{4}\, 0.1^{4} (1-0.1)^{6} = 0.0112$$

(c) $$P(5 \le X \le 7) = \sum_{x=5}^{7} \binom{10}{x}\, 0.1^{x} (1-0.1)^{10-x} = 0.0016$$

NOTE: Using Binomial Tables


Calculating binomial probabilities becomes tedious when n is large. For some values of n and p, the binomial probabilities have been tabulated in the tables below. The entries in the tables represent cumulative binomial probabilities.

For example, for N = 10, the entry in the column corresponding to p = 0.10 and the row corresponding to x = 2 is 0.930, and its interpretation is

$$P(x \le 2) = \sum_{x=0}^{2} \binom{10}{x}\, 0.1^{x} (1-0.1)^{10-x} = P(x=0) + P(x=1) + P(x=2) = 0.930$$

For x = 2, P(x = 2) = P(x ≤ 2) − P(x ≤ 1) = 0.930 − 0.736 = 0.194

For x > 2, P(x > 2) = 1 − P(x ≤ 2) = 1 − 0.930 = 0.070

Classwork: - For N = 20, p = 0.6, use the binomial tables to calculate the following:
1- The probability that x ≤ 10
2- The probability that x > 12
3- The probability that x = 11
Ans.
1- 0.245    2- 0.416    3- 0.159

[Tables: cumulative binomial probabilities for N = 5, 6, 7, 8, 9, 10, 15, 20, and 25]

5.2.3 Geometric Distribution


In a series of Bernoulli trials (independent trials with constant probability p of a success), the random variable X that equals the number of trials until the first success is a geometric random variable with parameter 0 < p < 1 and

$$f(x) = (1-p)^{x-1}\, p, \quad x = 1, 2, \ldots$$

The mean of X is
$$\mu = E(X) = \frac{1}{p}$$
The variance of X is
$$\sigma^2 = V(X) = \frac{1-p}{p^2}$$

Example 5.7:- Let X denote a random variable having a geometric distribution, with probability of success on any trial p = 0.4. Find (a) P(X ≤ 2) (b) P(4 < X < 7)
Solution
(a) $$P(X \le 2) = f(1) + f(2) = (1-0.4)^{1-1}(0.4) + (1-0.4)^{2-1}(0.4) = 0.4 + 0.24 = 0.64$$
(b) $$P(4 < X < 7) = f(5) + f(6) = (1-0.4)^{5-1}(0.4) + (1-0.4)^{6-1}(0.4) = 0.0518 + 0.0311 = 0.0829$$
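
The sums in Example 5.7 can be checked with a small Python sketch. The helper `geom_pmf` is a hypothetical name for the geometric probability mass function defined in this section.

```python
def geom_pmf(x, p):
    """Geometric pmf: probability that the first success occurs on trial x."""
    return (1 - p) ** (x - 1) * p

p = 0.4
p_a = geom_pmf(1, p) + geom_pmf(2, p)   # P(X <= 2)
p_b = geom_pmf(5, p) + geom_pmf(6, p)   # P(4 < X < 7), i.e. x = 5 or 6
mean = 1 / p                            # E(X)
variance = (1 - p) / p**2               # V(X)

print(round(p_a, 4), round(p_b, 4), mean, round(variance, 2))
```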

5.2.4 Negative binomial Distribution


In Section 5.2.3, the geometric distribution was used to model the probabilistic behavior of the number of the trial on which the first success occurs in a sequence of independent Bernoulli trials. But what if the interest is in the number of the trial for the second success, or the third success, or in general, the rth success? The distribution governing the probabilistic behavior in these cases is called the negative binomial distribution.
Let Y denote the number of the trial on which the rth success occurs in a sequence of independent Bernoulli trials, with p denoting the common probability of “success.” The negative binomial distribution is defined by two parameters, r and p:

$$P(y) = \binom{y-1}{r-1}\, p^{r} (1-p)^{y-r}, \quad y = r, r+1, \ldots$$

with r = 1, 2, 3, … and 0 < p < 1.

The mean of Y is
$$\mu = E(Y) = \frac{r}{p}$$
The variance of Y is
$$\sigma^2 = V(Y) = \frac{r(1-p)}{p^2}$$

Example 5.8:- Let Y denote a random variable having a negative binomial
distribution with p = 0.4. Find P(Y ≥ 4) if (a) r = 2 and (b) r = 4.
Solution
(a) With r = 2,
$$P(y) = \binom{y-1}{2-1} 0.4^2 (1-0.4)^{y-2}, \quad y = 2, 3, 4, 5, \ldots$$
$$P(Y \ge 4) = 1 - \left[\binom{1}{1} 0.4^2 (0.6)^{2-2} + \binom{2}{1} 0.4^2 (0.6)^{3-2}\right] = 1 - [0.16 + 0.192] = 0.648$$
(b) With r = 4,
$$P(y) = \binom{y-1}{4-1} 0.4^4 (1-0.4)^{y-4}, \quad y = 4, 5, 6, 7, \ldots$$
Because Y cannot be smaller than r = 4, P(Y ≥ 4) = 1.

Example 5.9:- It was found that 30% of the applicants for a certain job have
advanced training in computer programming. Applicants are selected at random and
are interviewed sequentially.
a- Find the probability that the first applicant having advanced training is found
on the fifth interview.
b- Suppose three jobs requiring advanced programming training are open. Find
the probability that the third qualified applicant is found on the fifth interview,
if the applicants are interviewed sequentially and at random.
Solution
(a) $P(Y = 5) = (1 - 0.3)^{5-1}(0.3) = 0.072$
(b) Assume independent trials, with the probability of finding a qualified
applicant on any one trial being 0.3. Let Y denote the number of the trial on
which the third qualified applicant is found. Then Y can reasonably be assumed
to have a negative binomial distribution with r = 3 and p = 0.3, so
$$P(Y = 5) = \binom{5-1}{3-1} 0.3^3 (1-0.3)^{5-3} = 0.0794$$
Example 5.10:- Camera flashes. Consider the time to recharge the flash. The
probability that a camera passes the test is 0.8, and the cameras perform
independently. What is the probability that the third failure is obtained in five or
fewer tests?
Solution
Let Y denote the number of cameras tested until three failures have been obtained.
The requested probability is P(Y ≤ 5).
Here Y has a negative binomial distribution with p = 1 − 0.8 = 0.2 and r = 3.
Therefore,
$$P(Y \le 5) = \sum_{y=3}^{5} \binom{y-1}{3-1} 0.2^3 (1-0.2)^{y-3} = 0.008 + 0.0192 + 0.0307 = 0.058$$
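The negative binomial examples above can be reproduced with a few lines of code. This is a sketch under the assumption that Example 5.8 asks for P(Y ≥ 4); `negbin_pmf` is a hypothetical helper name. Summing the three terms of Example 5.10 exactly gives about 0.0579.

```python
from math import comb

def negbin_pmf(y: int, r: int, p: float) -> float:
    """P(Y = y): the r-th success occurs on trial y."""
    if y < r:
        return 0.0
    return comb(y - 1, r - 1) * p**r * (1 - p) ** (y - r)

# Example 5.8(a): P(Y >= 4) with r = 2, p = 0.4
ex58_a = 1 - sum(negbin_pmf(y, 2, 0.4) for y in (2, 3))
# Example 5.9(b): third qualified applicant found on the fifth interview
ex59_b = negbin_pmf(5, 3, 0.3)
# Example 5.10: third failure within five tests (p = 0.2 is the failure probability)
ex510 = sum(negbin_pmf(y, 3, 0.2) for y in range(3, 6))
print(round(ex58_a, 4), round(ex59_b, 4), round(ex510, 4))
```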

5.3 Continuous Probability Distributions
Density functions are commonly used in engineering to describe physical
systems. A probability density function f (x) can be used to describe the
probability distribution of a continuous random variable X. If an interval is likely to
contain a value for X, its probability is large and it corresponds to large values for f
(x) . The probability that X is between a and b is determined as the integral of f (x)
from a to b.
For a continuous random variable X, a probability density function (pdf) is a
function such that

(1) $f(x) \ge 0$
(2) $\int_{-\infty}^{\infty} f(x)\,dx = 1$
(3) $P(a \le X \le b) = \int_a^b f(x)\,dx$ = area under f(x) from a to b, for any a and b

If X is a continuous random variable, then for any x1 and x2,

$$P(x_1 \le X \le x_2) = P(x_1 < X \le x_2) = P(x_1 \le X < x_2) = P(x_1 < X < x_2)$$

Example 5.11:- Let X denote the continuous random variable of the current
measured in a thin copper wire in milliamperes. Assume that the range of X is [4.9,
5.1] mA, and assume that the probability density function of X is f (x) = 5 for
4.9 ≤ x≤5.1. What is the probability that a current measurement is less than 5
milliamperes?
Solution
The probability density function is shown in the figure.
It is assumed that f (x) =0 wherever it is not
specifically defined. The shaded area in the figure
indicates the probability.
$$P(X < 5) = \int_{4.9}^{5} f(x)\,dx = \int_{4.9}^{5} 5\,dx = 0.5$$

Example 5.12:- Let X denote the continuous random variable of the diameter of a
hole drilled in a metal sheet. The target diameter is 12.5 millimeters. Past data
show that the distribution of X can be modeled by the probability density function
f(x) = 20 e^{−20(x−12.5)} for x ≥ 12.5. If a part with a diameter greater than 12.6 mm is
required, what proportion of parts meets this requirement?
Solution
A part is required if X > 12.6. The density function is shown in the figure.

$$P(X > 12.6) = \int_{12.6}^{\infty} f(x)\,dx = \int_{12.6}^{\infty} 20 e^{-20(x-12.5)}\,dx = \left. -e^{-20(x-12.5)} \right|_{12.6}^{\infty} = 0.135$$

The proportion of parts between 12.5 and 12.6 mm can be calculated in the same way:
$$P(12.5 < X < 12.6) = \int_{12.5}^{12.6} f(x)\,dx = \left. -e^{-20(x-12.5)} \right|_{12.5}^{12.6} = 0.865$$

Alternatively, because the total area under f(x) equals 1, P(X > 12.6) = 1 − P(12.5 < X < 12.6) = 1 − 0.865 = 0.135.
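The probabilities in Examples 5.11 and 5.12 are areas under a density curve, so they can also be approximated numerically. The sketch below uses a simple midpoint rule; the function names and the step count are illustrative assumptions, not part of the notes.

```python
from math import exp

def integrate(f, a: float, b: float, steps: int = 100_000) -> float:
    """Approximate the integral of f from a to b with the midpoint rule."""
    h = (b - a) / steps
    return h * sum(f(a + (i + 0.5) * h) for i in range(steps))

def f_current(x: float) -> float:
    """Density of Example 5.11: f(x) = 5 on [4.9, 5.1], zero elsewhere."""
    return 5.0 if 4.9 <= x <= 5.1 else 0.0

def f_diameter(x: float) -> float:
    """Density of Example 5.12: f(x) = 20 exp(-20 (x - 12.5)) for x >= 12.5."""
    return 20.0 * exp(-20.0 * (x - 12.5)) if x >= 12.5 else 0.0

p_511 = integrate(f_current, 4.9, 5.0)             # P(X < 5)
p_512_between = integrate(f_diameter, 12.5, 12.6)  # P(12.5 < X < 12.6)
p_512_above = 1 - p_512_between                    # P(X > 12.6)
print(round(p_511, 3), round(p_512_between, 3), round(p_512_above, 3))
```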

------------------------------------------------------------

The cumulative distribution function (cdf) of a continuous random variable X is

$$F(x) = P(X \le x) = \int_{-\infty}^{x} f(u)\,du \quad \text{for } -\infty < x < \infty$$

For Example 5.12, the cumulative distribution function consists of two expressions. First, F(x) = 0 for x < 12.5. For x ≥ 12.5,

$$F(x) = \int_{12.5}^{x} 20 e^{-20(u-12.5)}\,du = 1 - e^{-20(x-12.5)}$$

Therefore,
$$F(x) = \begin{cases} 0, & x < 12.5 \\ 1 - e^{-20(x-12.5)}, & 12.5 \le x \end{cases}$$
The figure represents the graph of F(x).

The probability density function of a continuous random variable can be determined from the cumulative distribution function by differentiating. The fundamental theorem of calculus states that

$$\frac{d}{dx} \int_{-\infty}^{x} f(u)\,du = f(x)$$

Then, given F(x),
$$f(x) = \frac{dF(x)}{dx}$$

Example 5.13:- The time until a chemical reaction is complete (in milliseconds) is
approximated by the cumulative distribution function F(x) below. Find the probability
density function of X.

$$F(x) = \begin{cases} 0, & x < 0 \\ 1 - e^{-0.01x}, & 0 \le x \end{cases}$$

Solution
Using the result that the probability density function is the derivative of F(x),
$$f(x) = \begin{cases} 0, & x < 0 \\ 0.01 e^{-0.01x}, & 0 \le x \end{cases}$$
The probability that a reaction completes within 200 milliseconds is
$$P(X < 200) = F(200) = 1 - e^{-2} = 0.8647$$

The mean and variance can also be defined for a continuous random
variable. Suppose that X is a continuous random variable with probability density
function f(x). The mean or expected value of X, denoted as μ or E(X), is
$$\mu = E(X) = \int_{-\infty}^{\infty} x f(x)\,dx$$
The variance of X, denoted as V(X) or σ², is
$$\sigma^2 = V(X) = \int_{-\infty}^{\infty} (x-\mu)^2 f(x)\,dx = \int_{-\infty}^{\infty} x^2 f(x)\,dx - \mu^2$$
The standard deviation of X is
$$\sigma = \sqrt{\sigma^2}$$

Example 5.14:- Find the mean and variance of X in Example 5.11.

Solution
The mean is
$$\mu = E(X) = \int_{4.9}^{5.1} x f(x)\,dx = \left. \frac{5x^2}{2} \right|_{4.9}^{5.1} = 5 \text{ mA}$$

The variance, with μ = 5, is
$$\sigma^2 = V(X) = \int_{4.9}^{5.1} (x-5)^2 f(x)\,dx = \left. \frac{5(x-5)^3}{3} \right|_{4.9}^{5.1} = 0.0033 \text{ mA}^2$$

5.3.1 Continuous Uniform Distributions

A continuous random variable X with probability density function

$$f(x) = \frac{1}{b-a}, \quad a \le x \le b$$

is a continuous uniform random variable.

The mean of a continuous uniform random variable is
$$\mu = E(X) = \frac{a+b}{2}$$
The variance of a continuous uniform random variable is
$$\sigma^2 = V(X) = \frac{(b-a)^2}{12}$$

Example 5.15:- Find the mean and variance of X in Example 5.11 using the continuous uniform formulas.

Solution
The mean and variance formulas can be applied with a = 4.9 and b = 5.1. Therefore,
$$\mu = E(X) = \frac{4.9 + 5.1}{2} = 5 \text{ mA} \quad \text{and} \quad \sigma^2 = V(X) = \frac{(5.1-4.9)^2}{12} = 0.0033 \text{ mA}^2$$
Consequently, the standard deviation of X is 0.0577 mA.
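The uniform formulas are simple enough to wrap in small helper functions. The following is a minimal sketch with hypothetical function names; it reproduces the numbers of Example 5.15 and also evaluates the uniform cumulative distribution function at the midpoint of the range.

```python
from math import sqrt

def uniform_mean(a: float, b: float) -> float:
    """E(X) = (a + b) / 2 for a continuous uniform variable on [a, b]."""
    return (a + b) / 2

def uniform_variance(a: float, b: float) -> float:
    """V(X) = (b - a)^2 / 12."""
    return (b - a) ** 2 / 12

def uniform_cdf(x: float, a: float, b: float) -> float:
    """F(x): 0 below a, (x - a)/(b - a) on [a, b), and 1 from b upward."""
    if x < a:
        return 0.0
    if x >= b:
        return 1.0
    return (x - a) / (b - a)

a, b = 4.9, 5.1
mu = uniform_mean(a, b)
var = uniform_variance(a, b)
sd = sqrt(var)
print(round(mu, 4), round(var, 4), round(sd, 4), round(uniform_cdf(5.0, a, b), 4))
```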
---------------------------------------------------------------

The cumulative distribution function of a continuous uniform random variable is obtained by integration. If a < x < b,
$$F(x) = \int_a^x \frac{1}{b-a}\,du = \frac{x-a}{b-a}$$
Therefore, the complete description of the cumulative distribution function of a continuous uniform random variable is
$$F(x) = \begin{cases} 0, & x < a \\ \dfrac{x-a}{b-a}, & a \le x < b \\ 1, & b \le x \end{cases}$$

5.3.2 Normal Distributions


A random variable X with probability density function
$$f(x) = \frac{1}{\sigma\sqrt{2\pi}} e^{-\frac{(x-\mu)^2}{2\sigma^2}}, \quad -\infty < x < \infty$$
is a normal random variable with parameters μ, where −∞ < μ < ∞, and σ > 0. Also, E(X) = μ and V(X) = σ², and the notation N(μ, σ²) is used to denote the distribution.

A normal random variable with μ = 0 and σ² = 1 is called a standard normal random variable and is denoted as Z. The cumulative distribution function of a standard normal random variable is denoted as Φ(z) = P(Z ≤ z).

Example 5.16:- Assume that Z is a standard normal random variable. Find P(Z ≤ 1.5).
Solution
Read down the z column of the standard normal table to the row that equals 1.5. The probability is read from the adjacent column, labeled 0.00, to be 0.93319.

For another example, P(Z ≤ 1.53) is found by reading down the z column to the row 1.5 and then selecting the probability from the column labeled 0.03 to be 0.93699.

NOTE: In practice, a probability is often rounded to one or two significant digits.

Example 5.17:- If Z is a standard normal random variable, find the following probabilities:
1- P(Z > 1.26) = 1 − P(Z ≤ 1.26) = 1 − 0.89616 = 0.10384
2- P(Z < −0.86) = 0.19490
3- P(Z > −1.37) = P(Z < 1.37) = 0.91465
4- P(−1.25 < Z < 0.37) = P(Z < 0.37) − P(Z < −1.25) = 0.64431 − 0.10565 = 0.53866
5- P(Z ≤ −4.6) ≈ 0, since the last value in the table is P(Z ≤ −3.99) = 0.00003.
6- Find the value of z such that P(Z > z) = 0.05. This probability expression can be written as P(Z ≤ z) = 1 − 0.05 = 0.95. We search the table probabilities for the value closest to 0.95. The nearest value is 0.95053, corresponding to z = 1.65.
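Table lookups like the ones in Examples 5.16 and 5.17 can be cross-checked in code. The sketch below assumes the identity Φ(z) = ½[1 + erf(z/√2)], which is standard but not stated in the notes, and inverts Φ by bisection; `phi` and `phi_inverse` are illustrative names. Small differences in the last digit relative to the printed table are expected, and the inverse returns a value near 1.645 rather than the nearest table entry 1.65.

```python
from math import erf, sqrt

def phi(z: float) -> float:
    """Standard normal cumulative distribution function P(Z <= z)."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

def phi_inverse(prob: float, lo: float = -10.0, hi: float = 10.0) -> float:
    """Find z with phi(z) = prob by bisection (phi is increasing in z)."""
    while hi - lo > 1e-10:
        mid = (lo + hi) / 2
        if phi(mid) < prob:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

print(round(phi(1.5), 5))                 # Example 5.16
print(round(1 - phi(1.26), 5))            # P(Z > 1.26)
print(round(phi(-0.86), 5))               # P(Z < -0.86)
print(round(phi(0.37) - phi(-1.25), 5))   # P(-1.25 < Z < 0.37)
z_95 = phi_inverse(0.95)                  # z such that P(Z > z) = 0.05
print(round(z_95, 3))
```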

Standardizing a Normal Random Variable

If X is a normal random variable with E(X) = μ and V(X) = σ², the random variable Z = (X − μ)/σ is a normal random variable with E(Z) = 0 and V(Z) = 1. That is, Z is a standard normal random variable. The random variable Z represents the distance of X from its mean in terms of standard deviations. Standardizing is the key step for calculating a probability for an arbitrary normal random variable. Then

$$P(X \le x) = P\left(\frac{X-\mu}{\sigma} \le \frac{x-\mu}{\sigma}\right) = P(Z \le z)$$

where Z is a standard normal random variable and z = (x − μ)/σ is the z-value calculated from standardizing X. The probability is then obtained from tables of Φ(z) = P(Z ≤ z) with z = (x − μ)/σ.


Example 5.18:- Suppose that the current measurements in a strip of wire are assumed
to follow a normal distribution with a mean of 10 milliamperes and a variance of 4
(milliamperes)². What is the probability that a measurement exceeds 13 milliamperes?
And what is the probability that a current measurement is between 9 and 11
milliamperes?
Solution
Let X denote the current in milliamperes. The requested probability can be
represented as P(X > 13). Let Z = (X − 10)/2. Note that X > 13 corresponds to Z > 1.5.
Therefore, from tables of Φ(z) = P(Z ≤ z),

P(X > 13) = P(Z > 1.5) = 1 − P(Z ≤ 1.5) = 1 − 0.93319 = 0.06681

Equivalently, the probability can be found by standardizing both sides of the inequality X > 13:

$$P(X > 13) = P\left(\frac{X-10}{2} > \frac{13-10}{2}\right) = P(Z > 1.5) = 0.06681$$

To find the probability that a current measurement is between 9 and 11 milliamperes, P(9 < X < 11) is required:

P(9 < X < 11) = P((9 − 10)/2 < (X − 10)/2 < (11 − 10)/2)
= P(−0.5 < Z < 0.5) = P(Z < 0.5) − P(Z < −0.5)
= 0.69146 − 0.30854
= 0.38292
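The standardization step in Example 5.18 translates directly into code. This sketch assumes the error-function form of Φ and uses an illustrative helper `normal_cdf`; neither appears in the notes.

```python
from math import erf, sqrt

def normal_cdf(x: float, mu: float, sigma: float) -> float:
    """P(X <= x) for X ~ N(mu, sigma^2), computed by standardizing X."""
    z = (x - mu) / sigma                 # z-value of x
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

mu, sigma = 10.0, 2.0                    # variance 4 means sigma = 2
p_exceeds_13 = 1 - normal_cdf(13, mu, sigma)
p_between_9_11 = normal_cdf(11, mu, sigma) - normal_cdf(9, mu, sigma)
print(round(p_exceeds_13, 5), round(p_between_9_11, 5))
```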

5.3.3 Exponential Distributions


The random variable X that equals the distance between successive events from a
Poisson process with mean number of events λ > 0 per unit interval is an exponential
random variable with parameter λ. The probability density function of X is

$$f(x) = \lambda e^{-\lambda x} \quad \text{for } 0 \le x < \infty$$

The exponential distribution obtains its name from the exponential function in the
probability density function. If the random variable X has an exponential distribution with
parameter λ, then

$$\mu = E(X) = \frac{1}{\lambda} \quad \text{and} \quad \sigma^2 = V(X) = \frac{1}{\lambda^2}$$

It is important to use consistent units to express intervals, X, and λ.


Class Work 1:- Suppose that X has an exponential distribution with mean equal to 10.
Determine the following:
(a) P(X > 10) Ans. = 0.3679
(b) P(X > 20) Ans. = 0.1353
(c) P(X < 30) Ans. = 0.9502
(d) Find the value of x such that P(X < x) = 0.95 Ans. x = 29.96

Class Work 2:- Suppose that the counts recorded by a counter follow a Poisson process
with an average of two counts per minute.
(a) What is the probability that there are no counts in a 30-second interval? Ans. = 0.3679
(b) What is the probability that the first count occurs in less than 10 seconds? Ans. = 0.2835
(c) What is the probability that the first count occurs between one and two minutes after
start-up? Ans. = 0.117
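Both class works rest on the exponential tail probability P(X > x) = e^{−λx}, which follows from integrating the density. The sketch below, with an illustrative helper `exp_tail`, works through each part; note the unit conversion to minutes in Class Work 2, where λ = 2 counts per minute.

```python
from math import exp, log

def exp_tail(x: float, lam: float) -> float:
    """P(X > x) = exp(-lam * x) for an exponential variable with rate lam."""
    return exp(-lam * x)

# Class Work 1: mean 10, so lam = 1/10
lam1 = 1 / 10
cw1_a = exp_tail(10, lam1)        # P(X > 10)
cw1_b = exp_tail(20, lam1)        # P(X > 20)
cw1_c = 1 - exp_tail(30, lam1)    # P(X < 30)
cw1_d = -log(1 - 0.95) / lam1     # x with P(X < x) = 0.95

# Class Work 2: lam = 2 counts per minute, all times expressed in minutes
lam2 = 2.0
cw2_a = exp_tail(0.5, lam2)                   # no count in 30 seconds
cw2_b = 1 - exp_tail(10 / 60, lam2)           # first count within 10 seconds
cw2_c = exp_tail(1, lam2) - exp_tail(2, lam2) # first count between 1 and 2 minutes
print(round(cw1_a, 4), round(cw1_b, 4), round(cw1_c, 4), round(cw1_d, 2))
print(round(cw2_a, 4), round(cw2_b, 4), round(cw2_c, 4))
```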

5.3.4 Lognormal Distributions


Variables in a system sometimes follow an exponential relationship as x= exp (w). If
the exponent is a random variable W, then X=exp(W) is a random variable with a
distribution of interest. An important special case occurs when W has a normal distribution.
In that case, the distribution of X is called a lognormal distribution. The name follows
from the transformation ln (X) = W. That is, the natural logarithm of X is normally
distributed.
Probabilities for X are obtained from the transform of the normal distribution. If W has mean θ and variance ω², then for x > 0

$$F(x) = P(X \le x) = P(\exp(W) \le x) = P(W \le \ln x) = \Phi\left(\frac{\ln x - \theta}{\omega}\right)$$

and F(x) = 0 for x ≤ 0. The range of X is (0, ∞).

The probability density function of X can be obtained from the derivative of F(x). This derivative is applied to the last term in the expression for F(x). Because Φ(⋅) is the integral of the standard normal density function, the fundamental theorem of calculus is used to calculate the derivative.

Suppose that W is normally distributed with mean θ and variance ω²; then
X = exp(W) is a lognormal random variable with probability density function
$$f(x) = \frac{1}{x\omega\sqrt{2\pi}} \exp\left[-\frac{(\ln x - \theta)^2}{2\omega^2}\right] \quad \text{for } 0 < x < \infty$$

The parameters of a lognormal distribution are θ and ω², but these are the mean and
variance of the normal random variable W. The mean and variance of X are
$$\mu = E(X) = e^{\theta + \omega^2/2} \quad \text{and} \quad \sigma^2 = V(X) = e^{2\theta + \omega^2}\left(e^{\omega^2} - 1\right)$$

Example 5.19:- The lifetime (in hours) of a semiconductor laser has a lognormal
distribution with θ = 10 and ω = 1.5. What is the probability that the lifetime exceeds 10000
hours? What lifetime is exceeded by 99% of lasers? Find the mean and standard
deviation of the lifetime.
Solution
From the cumulative distribution function for X,

P(X > 10000) = 1 − P[exp(W) ≤ 10000]
= 1 − P[W ≤ ln(10000)]
$$= 1 - \Phi\left(\frac{\ln(10000) - 10}{1.5}\right) = 1 - \Phi(-0.52) = 1 - 0.30 = 0.70$$

The second question is to determine x such that P(X > x) = 0.99. Therefore,

$$P(X > x) = P[\exp(W) > x] = P[W > \ln(x)] = 1 - \Phi\left(\frac{\ln(x) - 10}{1.5}\right) = 0.99$$

From tables of Φ(z) = P(Z ≤ z), 1 − Φ(z) = 0.99 when z = −2.33. Then,
$$\frac{\ln(x) - 10}{1.5} = -2.33 \quad \text{and} \quad x = \exp(6.505) = 668.48 \text{ hours}$$

The mean and variance of the lifetime are
$$\mu = E(X) = e^{\theta + \omega^2/2} = e^{10 + 1.125} = 67846.3 \text{ hours}$$
$$\sigma^2 = V(X) = e^{2\theta + \omega^2}\left(e^{\omega^2} - 1\right) = e^{20 + 2.25}\left(e^{2.25} - 1\right) = 39070059886.6$$

Then the standard deviation of X is 197661.5 hours.
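The lognormal calculations in Example 5.19 can be retraced in code. This is a sketch that assumes the error-function form of Φ and reuses the table value z = −2.33 from the solution instead of computing an exact percentile; the variable names are illustrative.

```python
from math import erf, exp, log, sqrt

def phi(z: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

theta, omega = 10.0, 1.5   # mean and standard deviation of W = ln(X)

# P(X > 10000) = 1 - Phi((ln 10000 - theta) / omega)
p_exceeds = 1 - phi((log(10000) - theta) / omega)

# Lifetime exceeded by 99% of lasers, using the table value z = -2.33
x_99 = exp(theta + omega * (-2.33))

# Mean and standard deviation of the lifetime X
mean_x = exp(theta + omega**2 / 2)
var_x = exp(2 * theta + omega**2) * (exp(omega**2) - 1)
sd_x = sqrt(var_x)
print(round(p_exceeds, 2), round(x_99, 2), round(mean_x, 1), round(sd_x, 1))
```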

CHAPTER SIX


SAMPLING DISTRIBUTION

6.1 Introduction
In inferential statistics, we want to use characteristics of the sample (i.e. a
statistic) to estimate the characteristics of the population (i.e. a parameter).
A sampling distribution is a probability distribution of a statistic obtained through a
large number of samples drawn from a specific population. The sampling distribution
of a given population is the distribution of frequencies of a range of different
outcomes that could possibly occur for a statistic of a population.
Much of the data used by academics, statisticians, and researchers comes from
samples, not whole populations. A sample is a subset of a population. For
example, it is very difficult for a medical researcher to compare the average weight of
all babies born in North America from 1995 to 2005 with that of babies born in South
America within a reasonable amount of time. The researcher will instead use the weights
of, say, 100 babies in each continent to draw a conclusion. The weights of the 200 babies
are the sample, and the average weight calculated is the sample mean.
Now suppose that instead of taking just one sample of 100 newborn weights
from each continent, the medical researcher takes repeated random samples from the
general population and computes the sample mean for each sample group. For
North America, the researcher pulls data on newborn weights recorded in the US,
Canada, and Mexico as follows: four samples of 100 from selected hospitals in the US,
five samples of 70 from Canada, and three samples of 150 from Mexico, for a total of 1200
weights of newborn babies grouped in 12 sets. The researcher also collects a sample of 100
birth weights from each of the 12 countries in South America. Each sample has its
own sample mean, and the distribution of the sample means is known as the
sampling distribution.
Other statistics, such as the standard deviation and variance, can be
calculated from sample data. The standard deviation and variance measure the
variability of the sampling distribution. The standard deviation of a sampling
distribution is called the standard error. Knowing how far apart the means of
the sample sets are from each other and from the population mean gives
an indication of how close the sample mean is to the population mean. The standard
error of the sampling distribution decreases as the sample size increases.

6.2 Sampling Distribution of the Sample Mean


The sample mean (the statistic) can be used to estimate the population mean
(the parameter). In doing so, we need to know the properties of the sample mean.
That is why we need to study the sampling distribution of the statistic. We will
begin with the sampling distribution of the sample mean. The sample mean x̄ is
random since its value depends on the sample chosen. It is called a statistic. The
population mean is fixed and is usually denoted as μ.

Example 6.1: Weights of concrete blocks.
The population is the weight of five concrete blocks (in pounds). You are asked to
guess the average weight of the five blocks by:
a) Taking a random sample of 2 blocks from the population.
b) Taking a random sample of 4 blocks from the population.

Block    A    B    C    D    E
Weight   19   14   15   9    10

Solution:
Calculate the population mean:
μ = (19 + 14 + 15 + 9 + 10) / 5 = 13.4 pounds
Part a
Obtain the sampling distribution of the sample mean x̄ for a sample size of 2 blocks.

Sample   Weights   x̄      Probability
A, B     19, 14    16.5   1/10
A, C     19, 15    17.0   1/10
A, D     19, 9     14.0   1/10
A, E     19, 10    14.5   1/10
B, C     14, 15    14.5   1/10
B, D     14, 9     11.5   1/10
B, E     14, 10    12.0   1/10
C, D     15, 9     12.0   1/10
C, E     15, 10    12.5   1/10
D, E     9, 10     9.5    1/10

Distribution of x̄:

x̄             9.5    11.5   12.0   12.5   14.0   14.5   16.5   17.0
Probability   1/10   1/10   2/10   1/10   1/10   2/10   1/10   1/10

One can thus see that no possible sample of size 2 has a mean exactly equal to the
population mean of 13.4 pounds. (In other examples, the sample mean may equal the
population mean for some samples, but the chance is usually small.) When using the
sample mean to estimate the population mean, some error will be involved
since the sample mean is random.
The mean of the sample mean =
(16.5+17.0+14.0+14.5+14.5+11.5+12.0+12.0+12.5+9.5) / 10 = 13.4 pounds
Thus, even though each sample may give an answer involving some error, the
expected value is right at the target: exactly the population mean.
Part b

Sample       Weights          x̄       Probability
A, B, C, D   19, 14, 15, 9    14.25   1/5
A, B, C, E   19, 14, 15, 10   14.50   1/5
A, B, D, E   19, 14, 9, 10    13.00   1/5
A, C, D, E   19, 15, 9, 10    13.25   1/5
B, C, D, E   14, 15, 9, 10    12.00   1/5

The mean of the sample mean =
(14.25+14.5+13+13.25+12) / 5 = 13.4 pounds

We can see that using the sample mean to estimate the population mean involves
sampling error. However, the error with a sample of size 4 is smaller than with a
sample of size 2: the possible sample means range from 12.0 to 14.5 for n = 4,
compared with 9.5 to 17.0 for n = 2.
The following dot plots show the distribution of the sample means
corresponding to sample sizes of 2 and 4 blocks.
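The two sampling distributions in Example 6.1 can be generated by listing every possible sample. The sketch below uses `itertools.combinations` from the Python standard library for sampling without replacement; the function name `sample_means` is an illustrative choice.

```python
from itertools import combinations
from statistics import mean

weights = [19, 14, 15, 9, 10]   # blocks A to E, in pounds

def sample_means(size: int) -> list:
    """Means of every possible sample of the given size (without replacement)."""
    return [mean(sample) for sample in combinations(weights, size)]

population_mean = mean(weights)
means_2 = sample_means(2)   # 10 equally likely samples
means_4 = sample_means(4)   # 5 equally likely samples

print(population_mean)
print(len(means_2), mean(means_2), min(means_2), max(means_2))
print(len(means_4), mean(means_4), min(means_4), max(means_4))
```

Both averages of the sample means agree with the population mean, while the spread of possible values is narrower for the larger sample size.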

6.2.2 Central Limit Theorem

The central limit theorem states that the sampling distribution of the mean of any
independent, random variable will be normal or nearly normal, if the sample size is
large enough.
How large is "large enough"? The answer depends on two factors.
1. Requirements for accuracy. The more closely the sampling distribution needs to
resemble a normal distribution, the more sample points will be required.
2. The shape of the underlying population. The more closely the original population
resembles a normal distribution, the fewer sample points will be required.

In practice, some statisticians say that a sample size of 30 is large enough when
the population distribution is roughly bell-shaped. Others recommend a sample size
of at least 40. But if the original population is distinctly not normal (e.g., is badly
skewed, has multiple peaks, and/or has outliers), researchers like the sample size to
be even larger.

6.2.3 Sample Size and Sampling Error
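The possible sample means cluster more closely around the population mean as
the sample size increases. Thus, possible sampling error decreases as sample size
increases. The standard deviation of the sample mean is called the standard error:

$$\text{standard error} = \sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$$
where σ is the standard deviation of the population and n is the sample size.

Example 6.2:
For Example 6.1, find the standard error.

Solution:
The standard deviation of the block weights is σ = 4.037 pounds.

For n = 2
$$\sigma_{\bar{x}} = \frac{4.037}{\sqrt{2}} = 2.855 \text{ pounds}$$

For n = 4
$$\sigma_{\bar{x}} = \frac{4.037}{\sqrt{4}} = 2.019 \text{ pounds}$$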

6.2.4 Application of Sample Mean Distribution

When we know that the sample mean is normal or approximately normal, and
we know the population mean, μ, and population standard deviation, σ, then we can
calculate a z-score for the sample mean and determine probabilities for it, where:

$$Z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}}$$

Example 6.3
The engines made by Ford for speedboats had an average power of 220
horsepower (HP) and standard deviation of 15 HP.
1. A buyer intends to take a sample of 4 engines and will not place an order if the
sample mean is less than 215 HP. What is the probability that the buyer will not
place an order? (Note: Suppose that the distribution of sample mean is normal
distribution).
2. If the customer samples 100 engines, what is the probability that the sample mean
will be less than 215?
Solution:
Part 1
We want to find P(x̄ < 215).
We need to know whether the distribution of the population is normal, since the
sample size is small: n = 4 (less than the 30 commonly required for the central limit
theorem). If the population is known to be normal, then we can proceed, since
the sampling distribution of the mean of a normal population is also normal for all
sample sizes.
If the population follows a normal distribution, we can conclude that x̄ has a
normal distribution with mean 220 HP and a standard error of
$$\sigma_{\bar{x}} = \frac{15}{\sqrt{4}} = 7.5 \text{ HP}$$
Then
P(x̄ < 215)
= P(Z < (215 − 220) / 7.5)
= P(Z < −0.67)
= 0.2514 (by using normal distribution tables)
The probability that the buyer will not place an order is 25.14%.

Part 2
$$\sigma_{\bar{x}} = \frac{15}{\sqrt{100}} = 1.5 \text{ HP}$$
P(x̄ < 215)
= P(Z < (215 − 220) / 1.5)
= P(Z < −3.33) = 0.0004
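Example 6.3 combines the standard error σ/√n with a standard normal lookup, and the same steps can be sketched in code. The helper name `prob_sample_mean_below` is hypothetical, and Φ is computed from the error function instead of tables, so Part 1 comes out near 0.2525 because z is not rounded to −0.67.

```python
from math import erf, sqrt

def prob_sample_mean_below(limit: float, mu: float, sigma: float, n: int) -> float:
    """P(sample mean < limit), assuming the sample mean is normally distributed."""
    standard_error = sigma / sqrt(n)
    z = (limit - mu) / standard_error
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

p_part1 = prob_sample_mean_below(215, mu=220, sigma=15, n=4)     # z = -0.667
p_part2 = prob_sample_mean_below(215, mu=220, sigma=15, n=100)   # z = -3.333
print(round(p_part1, 4), round(p_part2, 4))
```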

6.3 The Sampling Distribution of the Variance


The sampling distribution of the variance is the distribution of sample
variances, with all samples having the same sample size n taken from the same
population.

Example 6.4
For Example 6.1, find the sampling distribution of the variance by taking a random
sample of 4 blocks from the population.
Solution

Sample       Weights          x̄       S²       Probability
A, B, C, D   19, 14, 15, 9    14.25   16.916   1/5
A, B, C, E   19, 14, 15, 10   14.50   13.666   1/5
A, B, D, E   19, 14, 9, 10    13.00   20.666   1/5
A, C, D, E   19, 15, 9, 10    13.25   21.583   1/5
B, C, D, E   14, 15, 9, 10    12.00   8.666    1/5

Mean of the sample variances = (16.916 + 13.666 + 20.666 + 21.583 + 8.666) / 5 = 16.3
From Example 6.2, we can see that the variance of the population is:
σ² = (4.037)² = 16.3
Therefore, the mean of the sample variances equals the population
variance, so the sample variance is a good tool for estimating the population variance.
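The table of sample variances can be reproduced by enumeration. This sketch uses `statistics.variance`, which divides by n − 1; applying the same divisor to the five block weights gives the 16.3 used in the notes as the population variance.

```python
from itertools import combinations
from statistics import mean, variance

weights = [19, 14, 15, 9, 10]   # blocks A to E, in pounds

# Sample variance S^2 (divisor n - 1) for every sample of 4 blocks
sample_variances = [variance(sample) for sample in combinations(weights, 4)]
mean_of_variances = mean(sample_variances)

# Variance of the five weights with the same n - 1 divisor
reference_variance = variance(weights)

print([round(s2, 3) for s2 in sample_variances])
print(round(mean_of_variances, 3), round(reference_variance, 3))
```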


CHAPTER SEVEN

THE EXPECTATION

7.1 Introduction
Statistical methods are used to make decisions and draw conclusions
about the populations. This aspect of statistics is generally called statistical
inference. These techniques utilize the information in a sample for drawing
conclusions. This chapter begins our study of the statistical methods used in decision
making. Statistical inference may be divided into two major areas: parameter
estimation and hypothesis testing.

As an example of a parameter estimation problem, suppose that an
engineer is analyzing the tensile strength of a component used in an air frame.
This is an important part of assessing the overall structural integrity of the airplane.
Variability is naturally present in the individual components because of
differences in the batches of raw material used to make the components,
manufacturing processes, and measurement procedures (for example), so the
engineer wants to estimate the mean strength of the population of components. In
practice, the engineer will use sample data to compute a number that is in some sense
a reasonable value (a good guess) of the true population mean. This number is called
a point estimate. We will see that procedures are available for developing point
estimates of parameters that have good statistical properties. We will also be able to
establish the precision of the point estimate.

Statistical inference always focuses on drawing conclusions about one or more
parameters of a population. An important part of this process is obtaining estimates of
the parameters. Suppose that we want to obtain a point estimate (a reasonable value)
of a population parameter. We know that before the data are collected, the
observations are considered to be random variables, say, X1, X2, … , Xn. Therefore,
any function of the observations, or any statistic, is also a random variable. For
example, the sample mean X̄ and the sample variance S² are statistics and random
variables. Another way to visualize this is as follows. Suppose we take a sample of
n = 10 observations from a population and compute the sample average, getting the
result x̄ = 10.2. Now we repeat this process, taking a second sample of n = 10
observations from the same population and the resulting sample average is 10.4. The
sample average depends on the observations in the sample, which differ from sample
to sample because they are random variables. Consequently, the sample average (or
any other function of the sample data) is a random variable. Because a statistic is a
random variable, it has a probability distribution. We call the probability distribution
of a statistic a sampling distribution.

7.2 Expectation properties
Estimation problems occur frequently in engineering. We often need to
estimate

Reasonable point estimates of these parameters are as follows:

The definitions of other properties of estimators do not provide any guidance
about how to obtain good estimators. Thus, we will discuss a method for obtaining
point estimators: the method of moments. Moment estimators are sometimes easier to
compute.

7.3 Moments
The general idea behind the method of moments is to equate population
moments, which are defined in terms of expected values, to the corresponding
sample moments. The population moments will be functions of the unknown
parameters. Then these equations are solved to yield estimators of the unknown
parameters.
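As an illustration of the idea, the sketch below applies the method of moments to two distributions from Chapter Five. For an exponential population, E(X) = 1/λ is equated to the sample mean; for a normal population, E(X) = μ and E(X²) = σ² + μ² are equated to the first two sample moments. The function names and the data values are hypothetical and chosen only to keep the arithmetic easy to follow.

```python
from statistics import mean

def exponential_moment_estimate(data) -> float:
    """Solve 1/lambda = sample mean for lambda."""
    return 1 / mean(data)

def normal_moment_estimates(data):
    """Solve mu = m1 and sigma^2 + mu^2 = m2 for mu and sigma^2."""
    m1 = mean(data)                    # first sample moment
    m2 = mean(x * x for x in data)     # second sample moment
    return m1, m2 - m1**2

observations = [2.0, 4.0, 6.0, 8.0]    # hypothetical sample
lam_hat = exponential_moment_estimate(observations)
mu_hat, sigma2_hat = normal_moment_estimates(observations)
print(lam_hat, mu_hat, sigma2_hat)
```

Note that the moment estimator of σ² divides by n, not n − 1, so it differs slightly from the sample variance S².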

CHAPTER EIGHT

THE ESTIMATION

8.1 Introduction
Typically, all the information about a population of interest is not
available. In many situations, it is also not practical to obtain information from each
and every member of the population due to time, monetary, or experimental
constraints. For example, to estimate the breaking strength of manufactured bricks,
pressure must be applied until the brick breaks. In this case, if the manufacturer tries
to collect information from the population, there will be no bricks left to use. Then
how do we get information about the population? We take a random sample
from the population and obtain information from the sampled units. We then use this
information from the sample to estimate or to make inference about the
unknown population characteristics. In this example, to estimate µ, the mean strength
of the batch of bricks manufactured, the manufacturer would probably test a few
bricks from this batch for breaking strength, compute the sample mean from the
measurements, and use the computed sample mean to estimate the mean
breaking strength of the bricks from this whole batch.

We use the estimation technique frequently for decision making.


• Environmental engineers take a few soil samples and estimate chemical
levels at the proposed construction site.
• Marine biologists sample a few fish from Mobile Bay and estimate the
mercury level of the fish.
• Chemical engineers design experiments with a few trial runs and estimate the
percentage of yield of specific chemicals from the process.
• A team of engineers from a construction company takes a few measurements
and estimates the cost of a new highway construction project that involves two
overpasses.

• Car mechanics run a few diagnostic tests and estimate the extent of damage
and resulting repair costs.

The estimation processes use estimators to estimate unknown population
characteristics.
An estimator is a statistic that specifies how to use the sample data to estimate
an unknown parameter of the population.
There are basically two types of estimators: point estimators and interval
estimators
• A point estimator gives a single number as an estimate of the unknown
parameter value.
• An interval estimator gives a range of possible values (i.e., an interval of
possible values) as an estimate of the unknown parameter value.
A specific value of an estimator computed from the sample is known as an
estimate.
For example, the brick manufacturer could estimate the mean strength of
bricks manufactured using a sample of bricks as:
• The mean strength of bricks is 150 lb per sq inch.
• The mean strength of bricks is between 135 and 165 lb per sq inch;
or the mean strength of bricks is 150 lb per sq inch, give or take 15 lb per sq inch.
A chemical engineer could estimate the yield of a chemical process from a few
trial runs as:
• The yield of chemical A from this process is 28%.
• The yield of chemical A from this process is 25% to 31%; or the
yield of chemical A from this process is 28%, give or take 3%.
In each of these two examples, the first option gives a point estimate of
the unknown population characteristic (mean strength or chemical yield), whereas the
second option gives a range of believable values for the unknown population
characteristic.
