Statistics and Probabilities
1st year C.S
Prepared by
Dr/ Maisaa Mohamed
Vision and Mission of the Modern Academy
Vision:
Achieving excellence in its fields of specializations to be in line with the local,
regional, and international updates in the labor market.
Mission
Modern academy is committed to preparing professional graduates specialized
in the fields of Computer Science, Business Administration, Accounting and
Auditing, Economics of International Trade and Business Information Systems
to provide the regional and Arab community with qualified cadres equipped with
theoretical and professional foundations required in the labor market in the
aforementioned fields. By utilizing all the available resources, the academy
keeps up with the scientific and technological advancements through research
activities in addition to participating in serving the society and developing the
surrounding environment, all this within a frame of commitment to the
recognized moral and scientific values.
▪ Computer Science Program Vision
Achieving excellence in the field of Computer Science locally, regionally and
internationally, in view of the approved standards of the quality of education.
Course Specifications
1- Basic Information
Academic year / Level: 1st Year / 2nd term    Specialization: Computer Science
Title: Statistics & Probabilities    Code: BS110
Lecture: 2    Tutorial: 2    Practical: ----    Total: 3 (Hour/week)
2 – Overall Aims of Course:
On completion of this course the successful student will be able to:
Enable graduates to exhibit a high level of practical and theoretical skills in Mathematics with knowledge of currently available techniques and technologies.
Explore the principles that support developments in Mathematics. Teach students basic mechanisms for following and learning the continuous progress in Mathematics.
3 – Intended Learning Outcomes of Course (ILOs):
A-Knowledge and Understanding:
On completion of this course the student will have knowledge and understanding of:
a1 - Recognize the concepts of Probability. [A13]
a2 - Explain the mechanisms and methods of Probability. [A1]
a3 - Demonstrate statistical methods for computer science applications. [A2]
B-Intellectual Skills:
On completion of this course the successful student will be able to:
b1 - Recognize the various types of data in statistics and use the appropriate method for solving the related statistical model. [B3]
C-Professional and Practical Skills:
On completion of this course the successful student will be able to:
c1 - Solve different exercises in Probability. [C4]
c2 - Explain how to treat a computer model as a statistical model. [C6]
D-General and Transferable Skills:
On completion of this course the successful student will be able to:
d1 - Search for other methodologies in specific topics related to Probability. [D1]
d2 - Learn how to transform the solution of a statistical model into the solution of the desired computer model. [D7]
4-Contents:
Week No    Contents
1    An Introduction to Statistics (Types of Statistics, Types of Variables, Levels of Measurement)
2    Organizing and Graphing Data (Frequency Distributions, Relative Frequency and Percentage Distributions, Stem and Leaf Plots, Graphing for Quantitative Data, Graphing for Qualitative Data)
3    Measures of Central Tendency (Mean, Median, Mode, Relationships Among the Mean, Median, and Mode)
4    Measures of Dispersion (The Range, Variance and Standard Deviation, Measures of Position, Measures of Skewness)
5    Sample space, probability axioms, combinatorial techniques, conditional probability, independence and Bayes' theorem; random variables
6    Probability Distribution (The Concept of a Random Variable, Discrete Random Variable and Distribution Function, Mean and Variance for Discrete Random Variable, Continuous Random Variable and Distribution Function, Mean and Variance for Continuous Random Variable, Important Continuous Probability Distributions)
7    Estimation (Point and Interval Estimation, Estimation of the population mean when σ is known, Estimation of the population mean when σ is unknown, Estimation of a population proportion for large samples)
8    Hypothesis Tests (Concepts of Hypothesis Testing, Hypothesis Tests about the mean when σ is known, Hypothesis Tests about the mean when σ is unknown, One-way analysis of variance)
5–Teaching and Learning Methods:
- Lectures: ( √ )            - Exercises: ( √ )         - Practical training: ( X )
- Open Discussion: ( √ )     - Projects: ( X )          - Presentation: ( √ )
- Web-Site searches: ( √ )   - Self Studies: ( √ )      - E-Learning: ( √ )
- Chat Room: ( √ )           - Virtual class: ( X )     - Case Study: ( X )
- Voice Lectures: ( √ )      - Movie Lectures: ( √ )    - Virtual lab: ( X )
- Others (list): ( X )       - Simulation lab: ( X )
7-List of References:
A- Lecture notes: Lecture Notes, "Probability", staff members, Modern Academy for Computer Science and Management Technology.
B- Essential books (textbooks):
[1] Anthony Hayter, Probability and Statistics for Engineers and Scientists, 4th edition, 2006.
[1] Howard Anton, Elementary Linear Algebra, John Wiley & Sons, Inc., 2013.
[2] David C. Lay, Linear Algebra and Its Applications (3rd edition), Addison Wesley, 2002.
[3] Anthony Hayter, Probability and Statistics for Engineers and Scientists, Nelson Education, 2012.
[4] Sheldon Ross, "Probability and Statistics for Engineers and Scientists", Vol. 16, no. m2, Elsevier, New Delhi, 2009.
C- Recommended Books: ……
D- Periodicals, Web-Sites, etc.: ……
Course Content    Hours (Lec., Tut.)    a. Knowledge and Understanding (a1, a2, a3)    b. Intellectual Skills (b1)    c. Professional Skills (c1, c2)    d. General Skills (d1, d2)
Confidence interval estimates (for mean, proportions, differences, sums, variances, and variance ratios), maximum likelihood estimates.    Lec. 6, Tut. 2    √ √ √ √ √
Curve fitting, regression and correlation: method of least squares, multiple regression, (linear, generalized and rank) correlation. Correlation and dependence.    Lec. 8, Tut. 2    √ √ √ √ √ √ √
Sample space, probability axioms, combinatorial techniques, conditional probability, independence and Bayes' theorem; random variables.    Lec. 8, Tut. 4    √ √ √ √ √
Distribution functions, moments and generating function. Some probability distributions.    Lec. 6, Tut. 4    √ √ √ √ √
Teaching and learning methods versus Intended Learning Outcomes
Year: 2022-2023 Academic term : 2nd term
Title: Statistics & Probabilities Code: BS110
Academic year /level: 1st year Specialization: CS
Teaching Activities    a. Knowledge & Understanding (a1, a2, a3)    b. Intellectual Skills (b1)    c. Professional Skills (c1, c2)    d. General Skills (d1, d2)
Lectures √ √ √ √ √
Exercises √ √ √ √ √ √ √ √
Open Discussion √ √ √ √ √ √
Self Studies √ √ √ √ √ √ √ √
Presentation √ √ √ √ √
Web-site search √ √ √
E.Learning √ √ √ √ √ √ √ √
Movie Lecture √ √ √ √ √ √
Voice Lecture √ √ √ √ √ √
Chat Room √ √ √ √ √ √ √ √
Course Assessment Methods versus Intended Learning Outcomes
Year: 2022-2023 Academic term : 2nd term
Title: Statistics & Probabilities Code: BS110
Academic year /level: 1st year Specialization: CS
a. Knowledge & Understanding (a1, a2, a3)    b. Intellectual Skills (b1)    c. Professional Skills (c1, c2)    d. General Skills (d1, d2)
4.3.1 Interquartile Range (IQR)
4.3.2 Coefficient of Variation
4.4 Measures of Skewness
Exercises (4)
CHAPTER V
Probability
5.1 Experiment, Outcome, Sample Spaces and Events
5.2 Approaches of Probabilities
5.2.1 Classical Probability
5.2.2 Empirical Probability
5.2.3 Subjective Probability
5.3 Probability Rules
5.4 Conditional Probability and Independence
5.5 Bayes' Theorem
5.6 Probability Reference List
Exercises (5)
CHAPTER VI
Probability Distribution
6.1 The Concept of a Random Variable
6.1.1 Probability Distribution
6.2 Discrete Random Variable and Distribution Function
6.3 Mean and Variance for Discrete Random Variable
6.4 Important Discrete Probability Distributions
6.5 Continuous Random Variable and Distribution Function
6.6 Mean and Variance for Continuous Random Variable
6.7 Important Continuous Probability Distributions
Exercises (6)
CHAPTER VII
Estimation
7.1 Point and Interval Estimation
7.2 Estimation of the population mean when σ is known
7.3 Estimation of the population mean when σ is unknown
7.4 Estimation of a population proportion for large samples
CHAPTER VIII
Hypothesis Tests
8.1 Concepts of Hypothesis Testing
8.2 Hypothesis Tests about the mean when σ is known
8.3 Hypothesis Tests about the mean when σ is unknown
8.4 Testing for a Population Proportion
8.5 One-way analysis of variance
8.5.1 Calculating the Value of the Test Statistic
Exercises (8)
Appendix
References
GOALS of CHAPTER 1
quantitative variable.
a continuous variable.
CHAPTER I
An Introduction to Statistics
3. No matter what your career, you will make
professional decisions that involve data. An
understanding of statistical methods will help you
make these decisions effectively.
Application Areas
(i) Descriptive Statistics
The field of inferential statistics enables you to
make educated guesses about the numerical
characteristics of large groups. The logic of sampling
gives you a way to test conclusions about such groups
using only a small portion of its members.
Examples: gender, religious affiliation, type of
automobile owned, state of birth, and eye color.
(ii) Qualitative Variable
1.4 Levels of Measurements
(3) Interval level Data
The third level of measurement is the interval level
of measurement. It is similar to the ordinal level, with the
additional property that meaningful differences
between data values can be determined. There is no
natural zero point.
Example: temperature on the Fahrenheit scale.
(4) Ratio level Data
The fourth level of measurement is the ratio level of
measurement. In this level of measurement, the
observations, in addition to having equal intervals,
have a natural (true) zero point. This zero makes ratios
between values meaningful, which distinguishes the ratio
level from the other levels of measurement. Its remaining
properties are the same as those of the interval level: the
divisions between the points on the scale are equally spaced.
Glossary of Chapter I
Quantitative data Data generated by a quantitative
variable.
Quantitative variable A variable that can be measured
numerically.
Random sample A sample drawn in such a way that each
element of the population has some chance of being
included in the sample.
Representative sample A sample that contains the same
characteristics as the corresponding population.
Sample A portion of the population of interest.
Statistics Group of methods used to collect, analyze,
present, and interpret data and to make decisions.
Variable A characteristic under study or investigation that
assumes different values for different elements.
Exercises (1)
1. What is statistics?
Required
1. Describe the statistical population under study.
2. Define the variable under study.
3. If the grade of each student is recorded, is this a
population survey or a sample survey?
5. Explain the difference between quantitative and
qualitative data. Give an example of quantitative and
qualitative data.
GOALS of the CHAPTER
chart.
distribution.
CHAPTER II
• A percentage distribution records the percent of
the observations that fell into each class.
Class midpoint: A point that divides a class into two
equal parts. This is the average of the lower and upper
class limits.
$$\text{midpoint } (x) = \frac{\text{lower limit} + \text{upper limit}}{2}$$
Class frequency: The number of observations in each
class.
Class Width: Although it is not uncommon to have
classes of different sizes, most of the time it is preferable
to have the same width for all classes. To determine the
class width when all classes are the same size, first find
the difference between the largest and the smallest values
in the data. Then, the approximate width of a class is
obtained by dividing this difference by the number of
desired classes
$$\text{Approximate class width} = \frac{\text{Largest value} - \text{Smallest value}}{\text{Number of classes}}$$
from 5 to 20, depending mainly on the number of
observations in the data set. It is preferable to have more
classes as the size of a data set increases. The decision
about the number of classes is arbitrarily made by the data
organizer.
One rule to help decide on the number of classes is
Sturges' formula:
c = 1 + 3.3 log n
where c is the number of classes, n is the
number of observations in the data set, and log is the
base-10 logarithm. The value of log n
can be obtained by using a calculator.
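As a quick check of the rule, the following sketch evaluates the formula for a few sample sizes. The function name `sturges_classes` is a hypothetical helper introduced only for illustration, and the logarithm is assumed to be base 10.

```python
import math

def sturges_classes(n):
    """Suggested number of classes: c = 1 + 3.3 * log10(n)."""
    return 1 + 3.3 * math.log10(n)

# Evaluate the rule for a few data-set sizes.
for n in (30, 50, 100):
    c = sturges_classes(n)
    print(f"n = {n}: c = {c:.2f}, about {round(c)} classes")
```

The result is only a guide; as noted above, the final number of classes is chosen by the data organizer.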
5- Prepare a table of the distribution using actual
counts and/ or percentages (relative frequencies)
Example
8 25 11 15 29 22 10 5 17 21
22 13 26 16 18 12 9 26 20 16
23 14 19 23 20 16 27 16 21 14
Solution
as the lower limit of the first class. Then our classes will
be
5–9, 10–14, 15–19, 20–24, and 25–29
We record these five classes in the first column of Table
20-24 8
25-29 5
Total 30
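The tally for this example can be reproduced with a short sketch. It assumes the five classes 5–9, 10–14, 15–19, 20–24, and 25–29 (width 5, first lower limit 5) used in the solution; the function name `frequency_distribution` is hypothetical.

```python
from collections import Counter

data = [8, 25, 11, 15, 29, 22, 10, 5, 17, 21,
        22, 13, 26, 16, 18, 12, 9, 26, 20, 16,
        23, 14, 19, 23, 20, 16, 27, 16, 21, 14]

def frequency_distribution(values, lower, width, k):
    """Count how many values fall in each of k classes of equal width."""
    # (v - lower) // width is the zero-based index of the class holding v.
    counts = Counter((v - lower) // width for v in values)
    table = []
    for i in range(k):
        lo = lower + i * width
        hi = lo + width - 1
        table.append((f"{lo}-{hi}", counts.get(i, 0)))
    return table

table = frequency_distribution(data, lower=5, width=5, k=5)
for label, f in table:
    print(f"{label:<6} {f}")
print("Total ", sum(f for _, f in table))
```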
Example
Consider the following data
11 12 13 15 20 11 8 20 14 12
12 10 16 24 14 12 10 9 12 13
18 16 9 8 12 12 24 18 13 8
15 12 13 20 11 14 9 12 9 9
9 10 9 10 13 11 12 6 11 10
Construct a frequency distribution, using 5 classes and 6
as the lower limit of the first class.
Solution
$$\text{Approximate class width} = \frac{24 - 6}{5} = 3.6 \simeq 4$$
- Number of classes = 5
- Tally and count the number of observations in each
class
Class Tally Frequency
6- 11
10- 25
14- 7
18- 5
22-26 2
Total 50
Classes 6- 10- 14- 18- 22-26 Total
Frequency 11 25 7 5 2 50
Example
Consider the following observations of data set
57 39 41 63 51 59 55 37 70 23
53 64 63 54 48 49 60 46 45 65
55 78 82 52 38 41 65 43 64 75
42 25 65 48 61 24 58 34 32 31
55 88 45 46 53 55 52 43 50 20
Class Tally Frequency
20- 4
30- 6
40- 12
50- 14
60- 9
70- 3
80-90 2
Total 50
Frequency 4 6 12 14 9 3 2 50
Relative frequency and percentage are calculated as
follows:
Example
Consider the following observations of data set
13 27 12 14 17 12
20 22 18 27 23 15
21 20 17 16 14 12
13 21 20 23 22 27
26 24 14 16 17 19
- Frequency distribution and relative frequency distribution
Class    Frequency    Relative Frequency    Percentage
12-      8            8/30 = 0.27           27%
15-      6            6/30 = 0.20           20%
18-      5            5/30 = 0.17           17%
21-      6            6/30 = 0.20           20%
24-      2            2/30 = 0.07           7%
27-      3            3/30 = 0.10           10%
Total    30           1                     100%
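The same calculation can be sketched in code. As an illustration, the block below uses the class frequencies 4, 6, 12, 14, 9, 3, 2 from the 50-observation example earlier in this chapter; the function name `relative_frequencies` is hypothetical.

```python
classes = ["20-", "30-", "40-", "50-", "60-", "70-", "80-90"]
frequencies = [4, 6, 12, 14, 9, 3, 2]

def relative_frequencies(freqs):
    """Divide each class frequency by the total number of observations."""
    total = sum(freqs)
    return [f / total for f in freqs]

rel = relative_frequencies(frequencies)
for c, f, r in zip(classes, frequencies, rel):
    # Percentage = relative frequency * 100
    print(f"{c:<6} {f:>3} {r:.2f} {r * 100:.0f}%")
print(f"Total  {sum(frequencies):>3} {sum(rel):.2f} 100%")
```

The relative frequencies always sum to 1, which is a useful check on the table.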
Example
If you have the following observations of data set
12 7 12 12 7 13 17 8 3 11
15 10 6 16 19 4 14 7 17 14
13 18 0 10 11 5 9 17 14 15
11 9 2 8 13 9 10 19 14 11
5 7 15 11 8 12 9 2 13 19
-Determine the relative frequency distribution.
Solution
display we do not lose information on individual
observations.
Example
The National Bank is studying the number of times the
ATM located in City Stars Mall is used per day.
The following are the numbers of times the machine was used
on each of the last 30 days:
43 55 36 43 29 67 70 41 38 50
39 62 58 55 63 44 57 24 80 59
29 46 39 48 68 77 32 50 67 56
Stem Leaf
2 9 4 9
3 6 8 9 9 2
4 3 3 1 4 6 8
5 5 0 8 5 7 9 0 6
6 7 2 3 8 7
7 0 7
8 0
Stem Leaf
2 4 9 9
3 2 6 8 9 9
4 1 3 3 4 6 8
5 0 0 5 5 6 7 8 9
6 2 3 7 7 8
7 0 7
8 0
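The ordered display above can be generated with a minimal sketch, assuming two-digit values so that the tens digit is the stem and the units digit is the leaf. The function name `stem_and_leaf` is hypothetical.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Group the units digits (leaves) under their tens digits (stems)."""
    plot = defaultdict(list)
    for v in sorted(values):          # sorting gives ordered leaves
        stem, leaf = divmod(v, 10)
        plot[stem].append(leaf)
    return dict(plot)

atm_uses = [43, 55, 36, 43, 29, 67, 70, 41, 38, 50,
            39, 62, 58, 55, 63, 44, 57, 24, 80, 59,
            29, 46, 39, 48, 68, 77, 32, 50, 67, 56]

plot = stem_and_leaf(atm_uses)
print("Stem Leaf")
for stem in sorted(plot):
    print(f"{stem:<4} {' '.join(str(leaf) for leaf in plot[stem])}")
```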
Example
If you have the following data
13 19 65 58 17 67 23 66 46 17
16 44 48 38 29 28 25 47 20 29
37 80 51 52 59 60 69 35 34 33
52 17 41 50 57 38 48 23 46 77
32 27 70 64 61 19 45 49 47 19
Solution
Stem Leaf
1 3 9 7 7 6 7 9 9
2 3 9 8 5 0 9 3 7
3 8 7 5 4 3 8 2
4 6 4 8 7 1 8 6 5 9 7
5 8 1 2 9 2 0 7
6 5 7 6 0 9 4 1
7 7 0
8 0
Example
Construct a stem and leaf plot for the following data
222 218 249 212 233 239 265 240 217 247
260 255 243 234 249 239 261 219 216 269
254 238 249 235 257 229 250 218 212 263
270 227 257 266 258 263 261 253 237 277
259 239 230 273 255 220 262 226 258 264
Solution
Stem Leaf
21 8 2 7 9 6 8 2
22 2 9 7 0 6
23 3 9 4 9 8 5 7 9 0
24 9 0 7 3 9 9
25 5 4 7 0 7 8 3 9 5 8
26 5 0 1 9 3 6 3 1 2 4
27 0 7 3
Stem Leaf
21 2 2 6 7 8 8 9
22 0 2 6 7 9
23 0 3 4 5 7 8 9 9 9
24 0 3 7 9 9 9
25 0 3 4 5 5 7 7 8 8 9
26 0 1 1 2 3 3 4 5 6 9
27 0 3 7
Example
If you have the following frequency distribution
Classes 10- 20- 30- 40- 50-60 Sum.
Frequency 1 2 4 8 4 20
Solution
Polygon
A polygon is another device that can be used to present
quantitative data in graphic form. To draw a frequency
polygon, we first mark a dot above the midpoint of each
class at a height equal to the frequency of that class. This
is the same as marking the midpoint at the top of each bar
in a histogram. Next we mark two more classes, one at
each end, and mark their midpoints. Note that these two
classes have zero frequencies. In the last step, we join the
adjacent dots with straight lines. The resulting line graph
is called a frequency polygon or simply a polygon. A
polygon with relative frequencies marked on the vertical
axis is called a relative frequency polygon. Similarly, a
polygon with percentages marked on the vertical axis is
called a percentage polygon. The following figure shows the
frequency polygon for the frequency distribution of the
iPods-sold example.
Example
If you have the following frequency distribution
Classes 8- 12- 16- 20- 24- 28-32 Sum
frequency 5 13 2 15 7 8 50
Solution
Dot Diagram
A dot diagram can be used to represent a frequency
distribution, especially for discrete data. We construct a dot
diagram as follows:
x 1 3 5 6 7 11
f 2 1 3 5 3 2
Example
The following frequency distribution shows the salaries of 50
workers
frequency 6 12 13 9 10 50
Construct the cumulative frequency table (the Ogive)
Solution
[Figure: Ogive showing the ascending and descending cumulative frequency curves; horizontal axis from 10 to 60, vertical axis from 0 to 60]
2.5 Graph for Qualitative Data
Graphic representations are, in a sense, pictures of
data, and they sometimes make the data easier to
comprehend. There are different ways to represent data
graphically. The bar graph and the pie chart are two
types of graphs that are commonly used to display
qualitative data.
Bar Graph
To construct a bar graph (also called a bar chart), we
mark the various categories on the horizontal axis. Note
that all categories are represented by intervals of the same
width. We mark the frequencies on the vertical axis. Then
we draw one bar for each category such that the height of
the bar represents the frequency of the corresponding
category. We leave a small gap between adjacent bars. The
bar graphs for relative frequency and percentage
distributions can be drawn simply by marking the relative
frequencies or percentages, instead of the frequencies, on
the vertical axis.
Sometimes a bar graph is constructed by marking
the categories on the vertical axis and the frequencies on
the horizontal axis
Example
Stress on Job    Frequency    Relative Frequency    Percentage
Very             10           0.33                  33.3
None             6            0.200                 20
Example
The following data represent the value of industrial
production, in millions of units, for a company during the
last five years, and the value of exports for this company,
in millions of units, during the same period. Present these
data using a double bar chart.
2015 80 30
[Figure: double bar chart of Production and Export by year, 2012 to 2015; vertical axis from 0 to 90]
Pie Chart
Example
The next table gives the relative frequency distribution of
stress on the job.
Stress on Job Relative Frequency
Very 0.33
Somewhat 0.467
None 0.200
Sum=1
Stress on Job    Relative Frequency    Angle size
Very             0.33                  360 × 0.333 = 119.88
Somewhat         0.467                 360 × 0.467 = 168.12
None             0.200                 360 × 0.200 = 72
                 Sum = 1               Sum = 360
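The angle sizes follow from multiplying each relative frequency by 360 degrees. A minimal sketch, using the relative frequencies from the table (with 0.333 for "Very", as in the angle calculation):

```python
relative = {"Very": 0.333, "Somewhat": 0.467, "None": 0.200}

# Each sector's central angle is its share of the full 360-degree circle.
angles = {category: 360 * r for category, r in relative.items()}

for category, angle in angles.items():
    print(f"{category:<9} {angle:.2f} degrees")
print(f"Sum       {sum(angles.values()):.2f} degrees")
```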
[Figure: Pie Chart with sectors Very (119.88°), Somewhat (168.12°), and None (72°)]
Glossary of Chapter II
Bar graph A graph made of bars whose heights represent
the frequencies of respective categories.
Class An interval that includes all the values in a
(quantitative) data set that fall within two numbers, the
lower and upper limits of the class.
Class boundary The midpoint of the upper limit of one
class and the lower limit of the next class.
Class frequency The number of values in a data set that
belong to a certain class.
Class midpoint or mark Obtained by dividing the sum of
the lower and upper limits (or boundaries) of a class by 2.
Class width or size The difference between the two
boundaries of a class.
Cumulative frequency The frequency of a class that
includes all values in a data set that fall below the upper
boundary of that class.
Cumulative frequency distribution A table that lists the
total number of values that fall below the upper boundary
of each class.
Cumulative percentage The cumulative relative
frequency multiplied by 100.
Cumulative relative frequency The cumulative frequency
of a class divided by the total number of observations.
Frequency distribution A table that lists all the categories
or classes and the number of values that belong to each of
these categories or classes.
Grouped data A data set presented in the form of a
frequency distribution.
Histogram A graph in which classes are marked on the
horizontal axis and frequencies, relative frequencies, or
percentages are marked on the vertical axis. The
frequencies, relative frequencies, or percentages of
various classes are represented by bars that are drawn
adjacent to each other.
Ogive A curve drawn for a cumulative frequency
distribution.
Outliers or Extreme values Values that are very small or
very large relative to the majority of the values in a data
set.
Exercises (2)
2. The following data give the results of a sample survey.
The letters A, B, and C represent the three categories.
A B B A C B C C C A
C B C A C C B C C A
A B C C B C B A C A
C CK CK C CC D O C
CK CC D CC C CK CK CC
5 1 1 2 0 1 1 2 1 1
1 3 3 0 2 5 1 2 3 4
2 1 2 2 1 2 2 1 1 1
4 2 1 1 2 1 1 4 1 3
5- The following are the grades of fifty students in the final
statistics exam at Modern Academy
41 65 33 30 85 79 63 77 98 48
80 54 67 62 47 50 78 53 91 69
57 69 74 30 61 72 87 77 65 84
45 50 61 41 55 82 52 67 58 73
62 95 77 74 62 73 68 51 60 56
a. Construct a frequency distribution using six classes and
the lower limit for the first class =20.
b. Portray the histogram and polygon.
c. Convert the frequency distribution to a cumulative
frequency distribution.
d. Find the relative frequencies for the previous frequency
distribution.
GOALS of CHAPTER III
grouped data.
CHAPTER III
For raw data, that is, data that have not been grouped
in a frequency distribution, the mean is the sum of all
values divided by the number of values in the data set:
$$\mu = \frac{\sum x}{N} \quad \text{(population)}, \qquad \bar{x} = \frac{\sum x}{n} \quad \text{(sample)}$$
∑: the Greek capital sigma, which indicates the
operation of adding.
N: the total number of values (observations) in the
population.
Example
Compute the mean for the following data
1
2
2
4
5
10
Solution
$$\bar{x} = \frac{\sum x}{n} = \frac{1 + 2 + 2 + 4 + 5 + 10}{6} = 4$$
Example
If you have the course grades for seven students in an
exam: 90, 62, 92, 83, 78, 83, 81, compute the mean
course grade.
Solution
$$\bar{x} = \frac{\sum x}{n} = \frac{90 + 62 + 92 + 83 + 78 + 83 + 81}{7} = \frac{569}{7} \approx 81.286$$
Example
Compute the mean for the following data
2, 7, 8, 14, 15
Solution
$$\bar{x} = \frac{\sum x}{n} = \frac{2 + 7 + 8 + 14 + 15}{5} = 9.2$$
Example
Compute the mean for the following data
Solution
$$\bar{x} = \frac{\sum_{i=1}^{5} x_i}{n} = \frac{1 + 1 + 1 + 1 + 51}{5} = 11$$
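The raw-data examples above can be checked with a few lines of code. This is a minimal sketch; the function name `mean` is a local helper, not a library call.

```python
def mean(values):
    """Arithmetic mean: the sum of the values divided by their count."""
    return sum(values) / len(values)

print(mean([1, 2, 2, 4, 5, 10]))                       # first example
print(round(mean([90, 62, 92, 83, 78, 83, 81]), 3))    # course grades
print(mean([2, 7, 8, 14, 15]))
print(mean([1, 1, 1, 1, 51]))                          # one extreme value
```

In the last data set a single large value (51) pulls the mean far above the other four values, which illustrates how the mean is affected by outliers.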
$$\bar{x} = \frac{\sum_{i=1}^{n} x_i f_i}{\sum_{i=1}^{n} f_i}$$
Where
$\bar{x}$: stands for the sample mean; it is read "x bar".
$\sum_{i=1}^{n}$: is the Greek capital sigma and indicates the
operation of adding.
$x$: is the class midpoint.
$f$: is the class frequency.
Example
Determine the mean of the following frequency
distribution
Class 10- 20- 30- 40- 50-60 Sum.
Frequency 2 3 3 2 5 15
Solution
Class Frequency(f) 𝒙 𝒙𝒇
10- 2 15 30
20- 3 25 75
30- 3 35 105
40- 2 45 90
50-60 5 55 275
∑ 15 575
$$\bar{x} = \frac{\sum_{i=1}^{n} x_i f_i}{\sum_{i=1}^{n} f_i} = \frac{575}{15} = 38.33$$
Example
Determine the mean of the following frequency
distribution
Class 8- 12- 16- 20- 24-28 Sum.
Frequency 3 5 8 12 8 36
Solution
Class Frequency 𝒙 𝒙𝒇
8- 3 10 30
12- 5 14 70
16- 8 18 144
20- 12 22 264
24-28 8 26 208
∑ 36 716
$$\bar{x} = \frac{\sum_{i=1}^{n} x_i f_i}{\sum_{i=1}^{n} f_i} = \frac{716}{36} = 19.889$$
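Both grouped-data examples can be reproduced with a short sketch. Each class is given here as a (lower limit, upper limit) pair, and the function name `grouped_mean` is hypothetical.

```python
def grouped_mean(class_limits, frequencies):
    """Mean of grouped data: sum of (midpoint * frequency) / sum of frequencies."""
    midpoints = [(lo + hi) / 2 for lo, hi in class_limits]
    total_xf = sum(x * f for x, f in zip(midpoints, frequencies))
    return total_xf / sum(frequencies)

# First example: classes 10-, 20-, 30-, 40-, 50-60
mean_1 = grouped_mean([(10, 20), (20, 30), (30, 40), (40, 50), (50, 60)],
                      [2, 3, 3, 2, 5])

# Second example: classes 8-, 12-, 16-, 20-, 24-28
mean_2 = grouped_mean([(8, 12), (12, 16), (16, 20), (20, 24), (24, 28)],
                      [3, 5, 8, 12, 8])

print(round(mean_1, 2), round(mean_2, 3))
```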
The Properties of the Mean
1- The sum of squares of deviations from the mean is
minimum, i.e., $\sum (x - \bar{x})^2$ is least.
2- The sum of deviations of items from their mean is
equal to zero, i.e., $\sum (x - \bar{x}) = 0$.
3- A set of data has only one mean (the mean is unique).
4- The mean is a very useful measure for comparing two
or more populations.
5- The mean is affected by extreme values (outliers).
6- All the values are included in computing the mean.
If n is odd, the median is the middle observation of the
ordered array. If n is even, it is midway between the two
central observations.
Example
Find the median for the following data
30, 10, 20, 50, 40, 60
Solution
- First, arrange the data in ascending order:
10, 20, 30, 40, 50, 60
n = 6
- Second, since n is even, the median is determined by
averaging the two observations in the middle.
- The median is the average of the third and fourth
observations (the middle two), which are 30 and 40,
respectively. Thus, the median is $\frac{30 + 40}{2} = 35$.
Example
Find the median for the following data
7
5
14
10
3
Solution
- First, arrange the data in ascending order:
3, 5, 7, 10, 14
- Second, since n = 5 is odd, the median is the observation
of order $\frac{n+1}{2} = 3$, so the median is 7.
51 , 20 , 12 , 15 , 40 , 18 , 45 , 42 , 6, 66 , 27 , 30, 31
Solution
- First, arrange the data in ascending order:
6, 12, 15, 18, 20, 27, 30, 31, 40, 42, 45, 51, 66
- Second, since n = 13 is odd, the median is the observation
of order $\frac{n+1}{2} = 7$, so the median is 30.
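The odd and even cases of the median rule can be sketched as follows; the function name `median` is a local helper written for illustration.

```python
def median(values):
    """Middle value of the ordered data (average of the two middle values if n is even)."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # Odd n: the observation of order (n + 1) / 2, i.e. zero-based index n // 2.
        return ordered[mid]
    # Even n: average the two central observations.
    return (ordered[mid - 1] + ordered[mid]) / 2

print(median([30, 10, 20, 50, 40, 60]))
print(median([7, 5, 14, 10, 3]))
print(median([51, 20, 12, 15, 40, 18, 45, 42, 6, 66, 27, 30, 31]))
```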
measure of central tendency for data sets that contain
outliers.
2. It is used when one must find the center or middle
value of a data set.
3. It is used when one must determine whether the data
values fall into the upper half or lower half of the
distribution.
3.3 The Mode
The mode is the value that occurs with the highest
frequency in a data set.
(i) The Mode for Raw Data
The mode is the most frequent value.
Example
The following data give the speeds (in miles per hour) of
eight cars that were stopped on I-95 for speeding
violations.
77, 82, 74, 81, 79, 84, 74, 78 . Find the mode.
Mode = 74 miles per hour
Example
Last year’s incomes of five randomly selected families
were $76,150, $95,750, $124,985, $87,490, and $53,740.
Find the mode.
Solution
Because each value in this data set occurs only once, this
data set contains no mode.
Example
Solution
This data set has three modes 19, 21, and 22. Each of
these three values occurs with a (highest) frequency of 2.
Example
Find the mode for the following data
5, 3, 2, 8, 3, 7, 1, 3
Solution
The mode=3
Example
Find the mode for the following data
11 , 15 , 13 , 15, 7 , 15 , 17 , 7 , 15 , 6 , 15 , 7
Solution
The mode = 15
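The raw-data mode examples above, including the case with no mode, can be sketched as follows. The function name `modes` is hypothetical; it returns an empty list when every value occurs only once, matching the convention used in this section.

```python
from collections import Counter

def modes(values):
    """Return all values with the highest frequency, or [] if no value repeats."""
    counts = Counter(values)
    highest = max(counts.values())
    if highest == 1:
        return []                      # every value occurs once: no mode
    return sorted(v for v, c in counts.items() if c == highest)

print(modes([77, 82, 74, 81, 79, 84, 74, 78]))            # speeds
print(modes([76150, 95750, 124985, 87490, 53740]))        # incomes
print(modes([5, 3, 2, 8, 3, 7, 1, 3]))
print(modes([11, 15, 13, 15, 7, 15, 17, 7, 15, 6, 15, 7]))
```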
Example
Solution
set includes outliers. The mode is simple to locate, but it
is not of much use in practical applications.
1- For a symmetric histogram and frequency distribution
curve with one peak, the values of the mean, median, and
mode are identical, and they lie at the center of the
distribution.
that occur in the right tail. These outliers pull the mean to
the right.
Mean, median, and mode for a histogram and frequency distribution
curve skewed to the left.
Glossary of Chapter III
Exercises (3)
1- The following data give the prices (in thousands of
dollars) of seven houses selected from all houses sold last
month in a city.
312 257 421 289 526 374 497
Find the median.
2- The following data give the speeds (in miles per hour)
of eight cars that were stopped on I-95 for speeding
violations.
77 82 74 81 79 84 74 78
Find the mode
3- The following data set belongs to a population:
5  −7  2  0  −9  16  10  7
Calculate the mean, median, and mode.
4- The following data set belongs to a sample:
14  18  −10  8  8  −16
Calculate the mean, median, and mode.
5- The following data give the revenues (in millions of
dollars) for the last available fiscal year for a sample of
six charitable organizations for serious diseases in (2012).
The values are, listed in order, for the Alzheimer’s
Association, the American Cancer Society, the American
Diabetes Association, the American Heart Association,
the American Lung Association, and the Cystic Fibrosis
Foundation.
952 1129 231 668 49 149
Compute the mean and median. Do these data have a
mode? Why or why not?
24 32 27 23 35 33 29 40 23
28
Calculate the mean, median, and mode for these data.
Calculate the mean, the median, the mode of these data.
3- The weights (in kg) of a sample of twelve boxes
received by post are:
8 2 4 3 10 8 4 9 8 11 10 12
GOALS of CHAPTER IV
quartiles.
CHAPTER IV
Measures of Dispersion
(i)The Range for Ungrouped Data
Example
Find the range for the following distribution
Frequency 15 8 12 11 7 53
Solution
The range = Upper limit of the last class − Lower limit of
the first class
The range = 20 − 10 = 10
Example
Determine the range for the following distribution
Class 40- 50- 60- 70- 80-90 Sum.
Frequency 12 30 27 22 15 106
Solution
The range = Upper limit of the last class − Lower limit of
the first class
The range = 90 − 40 = 50
4.2 Variance and Standard Deviation
The standard deviation is the most-used measure
of dispersion. The value of the standard deviation tells
how closely the values of a data set are clustered around
the mean. In general, a lower value of the standard
deviation for a data set indicates that the values of that
data set are spread over a relatively smaller range around
the mean. In contrast, a larger value of the standard
deviation for a data set indicates that the values of that
data set are spread over a relatively larger range around
the mean.
$$\sigma^2 = \frac{\sum (x - \mu)^2}{N} \quad \text{and} \quad s^2 = \frac{\sum (x - \bar{x})^2}{n - 1}$$
where $\sigma^2$ is the population variance and $s^2$ is the
sample variance.
Example
Suppose the midterm scores of a sample of four
students are 82, 95, 67, and 92, respectively. Find the
mean score for these four students, the deviations of the x
values from the mean, the variance, and the standard deviation.
Solution
Step 1: Calculate $\bar{x} = \frac{\sum x}{n}$, then find the deviations $(x - \bar{x})$ by
subtracting the mean from each value.
Step2: find (𝒙 − 𝒙̄ )𝟐
𝒙 𝒙 − 𝒙̄ (𝒙 − 𝒙̄ )𝟐
82 82 - 84 = - 2 4
95 95 - 84 = 11 121
67 67 - 84 = -17 289
92 92 – 84 = 8 64
∑𝒙
𝒙̄ = = 𝟖𝟒 ∑(𝒙 − 𝒙̄ ) = 𝟎 ∑(𝒙 − 𝒙̄ )𝟐 = 478
𝒏
𝟐
∑(𝒙 − 𝒙̄ )𝟐 𝟒𝟕𝟔
𝒔 = = = 𝟏𝟓𝟗. 𝟑𝟑
𝒏−𝟏 𝟑
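The steps above can be checked with a short Python sketch. It is only an illustration: the variable names are chosen here, and the standard-library `statistics.variance` is used as a cross-check because it also divides by n − 1.

```python
import math
import statistics

scores = [82, 95, 67, 92]  # midterm scores from the example

n = len(scores)
mean = sum(scores) / n
deviations = [x - mean for x in scores]      # x - x_bar for each score
sum_sq = sum(d ** 2 for d in deviations)     # sum of squared deviations
sample_var = sum_sq / (n - 1)                # divide by n - 1 for a sample
sample_sd = math.sqrt(sample_var)            # standard deviation

# Cross-check against the standard library (sample variance).
assert math.isclose(sample_var, statistics.variance(scores))

print(mean, sum_sq, round(sample_var, 2), round(sample_sd, 2))
```

The last printed value is the standard deviation the example asks for, the positive square root of the variance.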
Example
Company     Market Value (billions of dollars)
PepsiCo     75
Google      107
Intel       71
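Solution
Step 1: Calculate $\bar{x} = \frac{\sum x}{n}$, then find $(x - \bar{x})$ by
subtracting the mean from each value of x.
Step 2: Find $(x - \bar{x})^2$.
The value of $(x - \bar{x})^2$ is obtained by squaring each value
of $x - \bar{x}$; adding these squared values gives $\sum (x - \bar{x})^2$.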
Step3: Determine the variance.
Because the given data are on the market values of only
five companies, we use the formula for the sample.
𝒙 𝒙 − 𝒙̄ (𝒙 − 𝒙̄ )𝟐
$$s^2 = \frac{\sum (x - \bar{x})^2}{n - 1} = \frac{26951.2}{4} = 6737.8$$
$$s = \sqrt{s^2} = \sqrt{6737.8} = 82.084$$
Example
Step 3: Determine the variance.
x      x − μ      (x − μ)²
$\mu = \frac{\sum x}{N} = 74.883$,  $\sum (x - \mu) = 0$,  $\sum (x - \mu)^2 = 2334.789$
$$\sigma^2 = \frac{\sum (x - \mu)^2}{N} = \frac{2334.789}{6} = 389.131$$
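The standard deviation is obtained by taking the (positive)
square root of the variance:
$$\sigma = \sqrt{\sigma^2} = \sqrt{389.131} = 19.726$$
For grouped data, with class midpoints x and frequencies f, the
variance formulas become
$$\sigma^2 = \frac{\sum f(x - \mu)^2}{N} \quad \text{and} \quad s^2 = \frac{\sum f(x - \bar{x})^2}{n - 1}$$
The standard deviation is obtained by taking the positive
square root of the variance.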
Example
Daily Commuting Time (minutes)    Number of Employees
0 to less than 10                 4
10 to less than 20                9
20 to less than 30                6
30 to less than 40                4
40 to less than 50                2
Total                             25
Solution
The sum of these products (that is, the sum of the fifth
column) gives ∑ 𝒙𝟐 𝒇
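Daily Commuting Time    f     x     xf     x²f
0 to less than 10       4     5     20     100
10 to less than 20      9     15    135    2025
20 to less than 30      6     25    150    3750
30 to less than 40      4     35    140    4900
40 to less than 50      2     45    90     4050
Totals                  ∑f = 25     ∑xf = 535     ∑x²f = 14825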
$$\sigma^2 = \frac{\sum x^2 f - \frac{(\sum xf)^2}{N}}{N} = \frac{14825 - \frac{(535)^2}{25}}{25} = \frac{3376}{25} = 135.04$$
$$\sigma = \sqrt{\sigma^2} = \sqrt{135.04} = 11.62$$
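The grouped-data calculation can be reproduced with a small Python sketch. It assumes, as in the solution, that each class is represented by its midpoint and that the 25 employees are treated as a population; the variable names are illustrative.

```python
import math

# (lower limit, upper limit, frequency) for each commuting-time class
classes = [(0, 10, 4), (10, 20, 9), (20, 30, 6), (30, 40, 4), (40, 50, 2)]

midpoints = [(lo + hi) / 2 for lo, hi, _ in classes]
freqs = [f for _, _, f in classes]

N = sum(freqs)                                               # sum of f
sum_xf = sum(x * f for x, f in zip(midpoints, freqs))        # sum of x*f
sum_x2f = sum(x * x * f for x, f in zip(midpoints, freqs))   # sum of x^2*f

# Short-cut formula for the population variance of grouped data.
pop_var = (sum_x2f - sum_xf ** 2 / N) / N
pop_sd = math.sqrt(pop_var)

print(N, sum_xf, sum_x2f, round(pop_var, 2), round(pop_sd, 2))
```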
Figure: Quartiles
Example
47 28 39 51 33 37 59 24 33
(a) Find the values of the three quartiles. Where does the
age of 28 years fall in relation to the ages of these
employees?
(b) Find the interquartile range.
Solution
Thus the values of the three quartiles are
Remember
Position of the first quartile (𝑸𝟏 )
Position 𝑄1 =(𝑛 + 1) × 0.25
Position of the third quartile (𝐐𝟑 )
Position 𝑄3 =(𝑛 + 1) × 0.75
Example
8, 11 ,12 , 15 , 17 , 18 , 21 , 26 , 29 , 33 , 41
- Position $Q_1 = (n + 1) \times 0.25$
  $= (11 + 1) \times 0.25 = 3$rd
- $Q_1 = 12$
- Position $Q_3 = (n + 1) \times 0.75$
  $= (11 + 1) \times 0.75 = 9$th
- $Q_3 = 29$
$IQR = Q_3 - Q_1 = 29 - 12 = 17$
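The position rule can be sketched in Python as follows. The function name is illustrative, and the linear interpolation used when the position is not a whole number is an assumption of this sketch; the example above only needs whole positions.

```python
def quartile_by_position(data, fraction):
    """Return the value at position (n + 1) * fraction in the sorted data.

    Positions are 1-based, as in the text. A fractional position is
    resolved by linear interpolation between the two neighbouring
    values (an assumption; the text only shows whole positions).
    """
    values = sorted(data)
    n = len(values)
    position = (n + 1) * fraction
    lower = int(position)              # whole part of the position
    remainder = position - lower       # fractional part of the position
    lower = min(max(lower, 1), n)      # clamp to the valid range 1..n
    upper = min(lower + 1, n)
    return values[lower - 1] + remainder * (values[upper - 1] - values[lower - 1])

data = [8, 11, 12, 15, 17, 18, 21, 26, 29, 33, 41]
q1 = quartile_by_position(data, 0.25)
q3 = quartile_by_position(data, 0.75)
iqr = q3 - q1
print(q1, q3, iqr)
```

Other textbooks and software packages use slightly different quartile conventions, so results may differ from this rule on other data sets.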
4.3.2 Coefficient of Variation
$$C.V = \frac{\sigma}{\bar{x}} \times 100 \quad \text{or} \quad C.V = \frac{s}{\bar{x}} \times 100$$
Example
Xi Yi
1 0
2 0
3 0
4 5
5 10
𝑥̄ =3 𝑦̄ =3
s=1.58 s=4.47
Solution
$$C.V_x = \frac{s_x}{\bar{x}} \times 100 = \frac{1.58}{3} \times 100 = 52.7$$
$$C.V_y = \frac{s_y}{\bar{y}} \times 100 = \frac{4.47}{3} \times 100 = 149$$
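A minimal Python sketch of the same comparison, assuming the sample standard deviation (divisor n − 1) as in the table above; the function name is illustrative.

```python
import statistics

def coefficient_of_variation(data):
    """Sample coefficient of variation in percent: (s / mean) * 100."""
    return statistics.stdev(data) / statistics.mean(data) * 100

x = [1, 2, 3, 4, 5]
y = [0, 0, 0, 5, 10]
cv_x = coefficient_of_variation(x)
cv_y = coefficient_of_variation(y)
print(round(cv_x, 1), round(cv_y, 1))
```

Both data sets have the same mean, so the larger coefficient for Y reflects only its larger spread.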
about its mean. Shape can be described by degree of
asymmetry (i.e., skewness).
$$SK = \frac{\text{mean} - \text{mode}}{\sigma}$$
or
$$SK = \frac{3(\text{mean} - \text{median})}{\sigma}$$
Example
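$$\bar{x} = \frac{\sum xf}{\sum f} = \frac{4910}{128} = 38.36$$
$$\text{The mode} = L.L. + \left(\frac{f_3}{f_1 + f_3} \times i\right)$$
Where
L.L : is the lower limit of the modal class (the class with the largest frequency).
i : is the interval (width) of the modal class.
$f_1$ : is the frequency of the class before the modal class.
$f_3$ : is the frequency of the class after the modal class.
$$= 30 + \left(\frac{26}{26 + 36} \times 10\right) = 30 + \frac{26}{62} \times 10 = 30 + 4.19$$
The mode = 34.19
$$\sigma = \sqrt{\frac{\sum x^2 f}{\sum f} - \left(\frac{\sum xf}{\sum f}\right)^2} = \sqrt{\frac{205800}{128} - \left(\frac{4910}{128}\right)^2} = \sqrt{136.37}$$
$\sigma = 11.68$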
The median
Class    Frequency    Upper Limits    Ascending Cumulative Frequency
20-      36           Less than 20    0
30-      43           Less than 30    36
40-      26           Less than 40    79
50-      16           Less than 50    105
60-70    7            Less than 60    121
Sum      128          Less than 70    128
- The median position = $\frac{\sum f}{2}$.
- The median = $L.L. + \left(\frac{\text{Position} - C.F.P}{F} \times i\right)$
Where
L.L : is the lower limit for the chosen class.
C.F.P. : is the prior cumulative frequency.
F : frequency of median class.
i : is the interval for the chosen class
- The median position = $\frac{\sum f}{2} = \frac{128}{2} = 64$
The median = $30 + \left(\frac{64 - 36}{43} \times 10\right) = 36.51$
$$SK = \frac{\text{mean} - \text{mode}}{\sigma} = \frac{38.36 - 34.19}{11.68}$$
$SK = +0.357$
By another way:
$$SK = \frac{3(\text{mean} - \text{median})}{\sigma} = \frac{3(38.36 - 36.51)}{11.68} = \frac{3 \times 1.85}{11.68}$$
$SK = +0.475$
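The whole calculation (mean, mode, median, standard deviation and both skewness coefficients) can be sketched in Python. The sketch assumes class midpoints represent each class and that the modal class is neither the first nor the last class, which holds for this table; all names are illustrative.

```python
import math

lower_limits = [20, 30, 40, 50, 60]   # class lower limits
freqs = [36, 43, 26, 16, 7]           # class frequencies
width = 10                            # class interval i

n = sum(freqs)
midpoints = [lo + width / 2 for lo in lower_limits]
mean = sum(x * f for x, f in zip(midpoints, freqs)) / n
sigma = math.sqrt(sum(x * x * f for x, f in zip(midpoints, freqs)) / n - mean ** 2)

# Mode: L.L. + f3 / (f1 + f3) * i, using the neighbours of the modal class.
m = freqs.index(max(freqs))
mode = lower_limits[m] + freqs[m + 1] / (freqs[m - 1] + freqs[m + 1]) * width

# Median: L.L. + (position - C.F.P.) / F * i
position = n / 2
cum = 0  # cumulative frequency before the current class
for lo, f in zip(lower_limits, freqs):
    if cum + f >= position:
        median = lo + (position - cum) / f * width
        break
    cum += f

sk_mode = (mean - mode) / sigma
sk_median = 3 * (mean - median) / sigma
print(round(sk_mode, 3), round(sk_median, 3))
```

Both coefficients are positive, which indicates a distribution skewed to the right.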
Example
Solution
Class    Frequency    Upper Limits    Ascending Cumulative Frequency
10-      8            Less than 10    0
20-      12           Less than 20    8
30-      16           Less than 30    20
40-      14           Less than 40    36
50-60    10           Less than 50    50
Sum      60           Less than 60    60
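- Position $Q_1 = \frac{\sum f}{4} = \frac{60}{4} = 15$
- $Q_1 = L.L. + \left(\frac{\text{Position } Q_1 - C.F.P}{F} \times i\right)$
  $Q_1 = 20 + \left(\frac{15 - 8}{12} \times 10\right) = 25.8$
- Position $Q_2 = \frac{2\sum f}{4} = \frac{2 \times 60}{4} = 30$
- $Q_2 = L.L. + \left(\frac{\text{Position } Q_2 - C.F.P}{F} \times i\right)$
  $Q_2 = 30 + \left(\frac{30 - 20}{16} \times 10\right) = 36.25$
- Position $Q_3 = \frac{3\sum f}{4} = \frac{3 \times 60}{4} = 45$
- $Q_3 = L.L. + \left(\frac{\text{Position } Q_3 - C.F.P}{F} \times i\right)$
  $Q_3 = 40 + \left(\frac{45 - 36}{14} \times 10\right) = 46.4$
$$SK = \frac{Q_1 + Q_3 - 2Q_2}{Q_3 - Q_1} = \frac{25.8 + 46.4 - 2 \times 36.25}{46.4 - 25.8} = \frac{-0.3}{20.6}$$
$SK = -0.015$
The quartile coefficient of skewness is −0.015.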
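The quartile interpolation can be written once and reused for all three quartiles. The following Python sketch uses an illustrative helper, `grouped_quantile`, that is not part of the text. Because it keeps full precision instead of rounding the quartiles to one decimal place, it gives a coefficient of about −0.012 rather than −0.015; the sign and the conclusion (very slight negative skewness) are the same.

```python
def grouped_quantile(lower_limits, freqs, width, fraction):
    """Interpolated quantile: L.L. + (position - C.F.P.) / F * i."""
    position = fraction * sum(freqs)
    cum = 0  # cumulative frequency before the current class (C.F.P.)
    for lo, f in zip(lower_limits, freqs):
        if cum + f >= position:
            return lo + (position - cum) / f * width
        cum += f
    raise ValueError("fraction must lie between 0 and 1")

lower_limits = [10, 20, 30, 40, 50]
freqs = [8, 12, 16, 14, 10]

q1, q2, q3 = (grouped_quantile(lower_limits, freqs, 10, p) for p in (0.25, 0.5, 0.75))
sk = (q1 + q3 - 2 * q2) / (q3 - q1)
print(round(q1, 2), round(q2, 2), round(q3, 2), round(sk, 3))
```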
Glossary of Chapter IV
Exercises (4)
5 8 7 4 10 6 9
8 2 4 3 12 7 5 9 11 10 6
Age                   16-    18-    20-    22-    24-26    Total
Number of students    7      10     18     12     3        50
Probability
GOALS of CHAPTER V
1- Define probability.
2- Explain the terms experiment, event, outcome.
3- Identify and apply the appropriate approach to
assigning probabilities.
4- Calculate probabilities using the rules of addition.
5- Define the term joint probability and calculate
probabilities using the rules of multiplication.
6- Define conditional probability and explain what you
can and can’t do with conditional expressions.
7- State and apply Bayes’ Theorem.
CHAPTER V
Probability
An outcome is the result of a single trial of an
experiment.
A sample space, Ω, is the set of all possible outcomes of a
random experiment.
Example
If a coin is tossed 3 times, the total number of results
from this experiment and the sample space are found as
follows.
Total number of results denoted by S
Number of results in one trial = N
Number of trials = m
∴ S = N^m
Then in this example
∴ S = N^m = 2^3 = 8
Sample space = {(HHH),(HHT),(HTH),(THH),(HTT),(THT),(TTH),(TTT)}
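The sample space can also be listed mechanically. The sketch below uses Python's `itertools.product` to generate every sequence of three tosses; the variable names are illustrative.

```python
from itertools import product

outcomes_per_trial = ["H", "T"]   # N = 2 results in one toss
trials = 3                        # m = 3 tosses

# Every ordered sequence of H and T of length 3.
sample_space = ["".join(seq) for seq in product(outcomes_per_trial, repeat=trials)]

print(len(sample_space))   # equals N ** m
print(sample_space)
```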
2. Compound event: An event consisting of more than
one outcome, such as {2, 4, 6}, the event of getting
an even number in the dice experiment.
3. The impossible event: An event that does not
contain any outcome, such as getting the number 7
in the experiment of throwing a die.
4. The certain event: An event that includes all
the outcomes of the sample space, such as getting
a number less than 7 in the dice experiment.
5. Mutually exclusive events: Two events that do
not share any outcome, so their intersection is the
empty set, $A \cap B = \varphi$; the two events can’t occur at
the same time.
— Male & Female in same person
6. Independent events: Events are independent if the
occurrence of one event does not affect the probability
of occurrence of another.
Example
Solution
Probability of an event P(E): “Chance” that an event will
occur
• Must lie between 0 and 1
• “0” implies that the event will not occur
• “1” implies that the event will occur
$$\sum p(E_i) = p(E_1) + p(E_2) + p(E_3) + \dots = 1$$
Example
Solution
5.2.2 Empirical Probability
Example
On February 1, 2003, the Space Shuttle Columbia
was lost during re-entry. This was the third disaster in 113 space
missions for NASA. On the basis of this information,
what is the probability that a future mission is
successfully completed?
Solution
$$\text{Probability of a successful flight} = \frac{\text{Number of successful flights}}{\text{Total number of flights}} = \frac{110}{113} = 0.973$$
5.2.3 Subjective Probability
composed of all basic outcomes in S that do not belong to
A.
The complement rule is used to determine the
probability of an event occurring by subtracting the
probability of the event not occurring from 1.
$P(A) + P(\bar{A}) = 1$ or $P(A) = 1 - P(\bar{A})$
Example
If the probability of getting a “working” computer is
0.7, what is the probability of getting a defective
computer?
Solution
Let A and B be two events in a sample space S. The
probability of the union of A and B is
$P(A \cup B) = P(A) + P(B) - P(A \cap B).$
Example
At State U, all first-year students must take chemistry and
math. Suppose 16% fail chemistry, 11% fail math, and
5% fail both. Suppose a first-year student is selected at
random. What is the probability that the student selected
failed at least one of the courses?
Solution
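A sketch of the calculation, applying the addition rule stated above. The event labels C (fails chemistry) and M (fails math) are introduced here only for illustration.

```latex
\[
P(C \cup M) = P(C) + P(M) - P(C \cap M) = 0.16 + 0.11 - 0.05 = 0.22
\]
```

Under these figures, about 22% of first-year students fail at least one of the two courses.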
Let A and B be two events in a sample space S. The
probability of the union of two mutually exclusive events
A and B is
$P(A \cup B) = P(A) + P(B).$
Solution
The probability that the second member selected made a
reservation is also 0.60, so P(R2) = 0.60.
Since the number of AAA members is very large,
P(R1 and R2) = P(R1 ∩ R2) = P(R1)P(R2) = (0.60)(0.60) = 0.36
$$P(A \mid B) = \frac{m}{m + n - 1}$$
We now give a formal definition of conditional
probability.
Solution
There are 100 ways in which we can select a patient from
the 100 patients. Since the patient is selected at random,
all the 100 ways of selecting a patient are equally likely.
(a) There are 20 males with blood group B. Therefore, the
probability that the patient selected is a male and has
blood group B is $\frac{20}{100} = 0.2$.
$$P(A \mid M) = \frac{P(A \cap M)}{P(M)} = \frac{30/100}{67/100} = \frac{30}{67}$$
Example
The probability that a regularly scheduled flight departs
on time is P ( D)= 0.83, the probability that it arrives on
time is P (A)= 0.82 ; and the probability that it departs and
arrives on time is P (D∩A )= 0.78. Find the probability
that a plane:
(a) arrives on time given that it departed on time,
(b) departed on time given that it has arrived on time,
Solution
(a) The probability that a plane arrives on time given that
it departed on time is
$$P(A \mid D) = \frac{P(D \cap A)}{P(D)} = \frac{0.78}{0.83} = 0.94$$
(b) The probability that a plane departed on time given
that it has arrived on time is
$$P(D \mid A) = \frac{P(D \cap A)}{P(A)} = \frac{0.78}{0.82} = 0.95$$
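The two answers differ only in which event is placed in the denominator. A small Python sketch makes this explicit; the helper name `conditional` is chosen here for illustration.

```python
def conditional(p_joint, p_given):
    """P(X | Y) = P(X and Y) / P(Y); requires P(Y) > 0."""
    if p_given <= 0:
        raise ValueError("the conditioning event must have positive probability")
    return p_joint / p_given

p_d, p_a, p_d_and_a = 0.83, 0.82, 0.78

p_a_given_d = conditional(p_d_and_a, p_d)   # arrives on time given on-time departure
p_d_given_a = conditional(p_d_and_a, p_a)   # departed on time given on-time arrival
print(round(p_a_given_d, 2), round(p_d_given_a, 2))
```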
5.5 Bayes' Theorem
Bayes' theorem is a formula that describes how to update
the probabilities of hypotheses when given evidence. It
follows simply from the axioms of conditional
probability, but can be used to powerfully reason about a
wide range of problems involving belief updates.
Given a hypothesis H and evidence E, Bayes' theorem
states that the relationship between the probability of the
hypothesis before getting the evidence P(H) and the
probability of the hypothesis after getting the
evidence 𝑃(𝐻 |𝐸 ) is
$$P(H \mid E) = \frac{P(E \mid H)}{P(E)} P(H)$$
Example
If a single card is drawn from a standard deck of playing
cards, the probability that the card is a king is 4/52, since
there are 4 kings in a standard deck of 52 cards.
$$P(\text{King} \mid \text{Face}) = \frac{P(\text{Face} \mid \text{King})}{P(\text{Face})} P(\text{King}).$$
Example
$$P(\text{Fire} \mid \text{Smoke}) = \frac{P(\text{Smoke} \mid \text{Fire})}{P(\text{Smoke})} P(\text{Fire}) = \frac{0.90 \times 0.01}{0.10} = 0.09$$
Example
- Find the probability of rain during the day.
- The probability of Rain given Cloud is written
$P(\text{Rain} \mid \text{Cloud})$
Solution
$$P(\text{Rain} \mid \text{Cloud}) = \frac{P(\text{Cloud} \mid \text{Rain})}{P(\text{Cloud})} P(\text{Rain})$$
$$P(\text{Rain} \mid \text{Cloud}) = \frac{0.1 \times 0.5}{0.4} = 0.125$$
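Bayes' theorem is a one-line computation once the three inputs are identified. The Python sketch below reproduces the fire/smoke and rain/cloud results. The function name is illustrative, and reading the rain example as P(Rain) = 0.1, P(Cloud | Rain) = 0.5 and P(Cloud) = 0.4 is an assumption based on the numbers shown in the solution.

```python
def bayes(p_e_given_h, p_h, p_e):
    """Posterior P(H | E) = P(E | H) * P(H) / P(E)."""
    return p_e_given_h * p_h / p_e

# Fire/smoke example: P(Smoke | Fire) = 0.90, P(Fire) = 0.01, P(Smoke) = 0.10
p_fire_given_smoke = bayes(p_e_given_h=0.90, p_h=0.01, p_e=0.10)

# Rain/cloud example (assumed reading of the numbers in the solution)
p_rain_given_cloud = bayes(p_e_given_h=0.5, p_h=0.1, p_e=0.4)

print(round(p_fire_given_smoke, 3), round(p_rain_given_cloud, 3))
```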
If A and B are mutually exclusive, then
P(A ∪ B) = P(A) + P(B).
• Conditional probability: $P(A \mid B) = \frac{P(A \cap B)}{P(B)}$.
Glossary of Chapter V
Independent events Two events for which the occurrence
of one does not change the probability of the occurrence
of the other.
Intersection of events The intersection of events is given
by the outcomes that are common to two (or more)
events.
Mutually exclusive events Two or more events that do not
contain any common outcome and, hence, cannot occur
together.
Exercises (5)
1- A car rental agency currently has 44 cars available,
28 of which have a GPS navigation system. One of
the 44 cars is selected at random. Find the
probability that this car
a. has a GPS navigation system.
b. does not have a GPS navigation system.
2- In a class of 35 students, 13 are seniors, 9 are juniors,
8 are sophomores, and 5 are freshmen. If one
student is selected at random from this class, what is
the probability that this student is
a. a junior?
b. a freshman?
3- Find P(A or B) for the following.
a. P(A)= 0.58, P(B) =0.66, and P(A and B)= 0.57.
b. P(A)= 0.72, P(B)= 0.42, and P(A and B)= 0.39.
5- The following table shows the number of skilled and
unskilled workers in three divisions in a factory
Worker type    First section    Second section    Third section    Sum
Skilled        370              410               55               835
Unskilled      30               90                45               165
Sum            400              500               100              1000
other colleges. A student is selected at random. Find
the following probabilities:
1 - The probability that the student is from the College of
Business Administration.
2 - The probability that the student is from the Faculty of Arts.
3 - The probability that the student is from the Faculty of
Business or from the Faculty of Arts.
4 - The probability that the student is from another college.
GOALS of CHAPTER VI
CHAPTER VI
Probability Distribution
6.2 Discrete Random Variable and Distribution
Function
Two Characteristics of a Probability Distribution
The probability distribution of a discrete random variable
possesses the following two characteristics.
1. The probability assigned to each value of a random
variable x lies in the range 0 to 1; that is,
0 ≤ P(x) ≤ 1 for each x
2. The sum of the probabilities assigned to all possible
values of x is equal to 1; that is,
∑ P(x) = 1.
These two characteristics are also called the two
conditions that a probability distribution must satisfy.
Example
Each of the following tables lists certain values of x and
their probabilities. Determine whether or not each table
represents a valid probability distribution.
Solution
0.27 = 0.85 . Therefore, the second condition is not
satisfied. Consequently, this table does not represent a
valid probability distribution.
(b) Each probability listed in this table is in the range 0 to
1. Also, ∑ 𝑝(𝑥 ) = 0.25 + 0.34 + 0.28 + 0.13 = 1.
Consequently, this table represents a valid probability
distribution.
(c) Although the sum of all probabilities listed in this
$$\text{Mean} = E(X) = \mu = \sum x P(x)$$
$$\sqrt{\operatorname{var}(X)} = \sigma = \sqrt{\sigma^2}$$
Example
John sells new cars for Ford. John usually sells the
largest number of cars on Saturday. He has developed the
following probability distribution for the number of cars
he expects to sell on a particular Saturday. Find the mean
and the variance.
Variance.
Example
(c) Find
(i) P(x > 1), (ii) P(0 < x < 2), (iii) P(x > 2).
Solution
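(a) $\sum_{x=0}^{3} p(x) = 1 \Rightarrow \frac{1}{4} + C + \frac{1}{2} + \frac{1}{8} = 1 \Rightarrow C = \frac{1}{8}$
(b)
Alternatively,
$$p(x > 1) = p(x = 2) + p(x = 3) + p(x = 4) = \frac{1}{8} + \frac{1}{2} + \frac{1}{8} = \frac{3}{4}$$
(ii) $p(0 < x < 2) = p(x = 1) = \frac{1}{4}$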
Example
= ∑ 𝑥𝑃(𝑥)
∴ √𝑣(𝑥) = 𝜎 = 0.8
6.4 Important Discrete Probability Distribution
Certain particular discrete distributions are so important
that we list them here.
Binomial Probability Distribution
A widely occurring discrete probability distribution.
Characteristics of a Binomial Probability Distribution
• There are only two possible outcomes on a particular
trial of an experiment.
• The outcomes are mutually exclusive,
• The random variable is the result of counts.
• Each trial is independent of any other trial.
Binomial Probability Formula
$$P(x) = C_x^n \, p^x q^{n-x} = \frac{n!}{x!(n-x)!} \, p^x q^{n-x}, \quad x = 0, 1, 2, \dots, n$$
C: denotes a combination
n: is the number of trials
x: is the random variable defined as the number of successes
p: is the probability of a success on each trial, and q
is (1 − p)
Mean and Variance
• Mean: $\mu = np$
• Variance: $\sigma^2 = npq$
• Standard deviation: $\sigma = \sqrt{npq}$
Example
There are five flights daily from Cairo via Miser Airways
into the Alex, Luxor Airport. Suppose the probability that
any flight arrives late is 0.20.
- What is the probability that none of the flights are
late today?
Solution
$$P(x) = C_x^n \, p^x q^{n-x} = \frac{n!}{x!(n-x)!} \, p^x q^{n-x}$$
n = 5, p = 0.20, q = 1 − p = 0.80
$$P(x = 0) = C_0^5 (0.20)^0 (0.80)^{5-0} = \frac{5!}{0!(5-0)!} (0.20)^0 (0.80)^{5-0} = 0.3277$$
Example
For the example regarding the number of late flights,
recall that p=0.20 and n=5
-What is the average number of late flights?
-What is the variance of the number of late flights?
Solution
• $\mu = np = 5 \times 0.20 = 1$
• $\sigma^2 = npq = 5 \times 0.20 \times 0.80 = 0.8$
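The late-flight results can be reproduced with a short Python sketch of the binomial formula. The function name `binomial_pmf` is illustrative; `math.comb` supplies the number of combinations.

```python
from math import comb, sqrt

def binomial_pmf(x, n, p):
    """P(X = x) for a binomial random variable with n trials."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

n, p = 5, 0.20                       # five flights, P(late) = 0.20
p_none_late = binomial_pmf(0, n, p)  # probability that no flight is late
mean = n * p                         # mu = np
variance = n * p * (1 - p)           # sigma^2 = npq
std_dev = sqrt(variance)

print(round(p_none_late, 4), mean, round(variance, 2), round(std_dev, 3))
```

As a sanity check, the probabilities for x = 0, 1, ..., 5 computed with this function add up to 1.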
Example
The probability that a certain kind of component will
survive a shock test is 3/4. Find the probability that
exactly 2 of the next 4 components tested survive.
Solution
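$$P(x = 2) = \binom{4}{2} \left(\frac{3}{4}\right)^2 \left(\frac{1}{4}\right)^2 = \frac{4!}{2! \cdot 2!} \left(\frac{9}{16}\right) \left(\frac{1}{16}\right) = 6 \left(\frac{9}{256}\right) = \frac{27}{128} = 0.21$$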
Example
A large chain retailer purchases a certain kind of
electronic device from a manufacturer. The manufacturer
indicates that the defective rate of the device is 3%. The
inspector randomly picks 20 items from a shipment. What
is the probability that there will be at least one defective
item among these 20?
Solution
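One way to approach this question is through the complement rule: "at least one defective" is the complement of "no defective item among the 20". The Python sketch below assumes the 20 inspected items are independent, each defective with probability 0.03, so that the number of defectives is binomial.

```python
n, p = 20, 0.03                  # items inspected, defective rate

p_none = (1 - p) ** n            # P(X = 0): all 20 items are non-defective
p_at_least_one = 1 - p_none      # complement rule

print(round(p_at_least_one, 4))
```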
distribution of this discrete random variable is called the
Bernoulli distribution.
Bernoulli Probability Formula
$$p(x) = p^x (1 - p)^{1-x}, \quad x = 0, 1$$
Mean and Variance
• Mean: $\mu = p$
• Variance: $\sigma^2 = pq$
• Standard deviation: $\sigma = \sqrt{pq}$
Poisson Process
• The specified region could be a line segment, an area, a
volume, or perhaps a piece of material. In such instances,
X might represent the number of field mice per acre, the
number of bacteria in a given culture, or the number of
typing errors per page.
Poisson Distribution formula
The probability distribution of the Poisson random
variable X, representing the number of outcomes
occurring in a given time interval or specified region
denoted by t, is
$$P(x) = \frac{e^{-\lambda} \lambda^x}{x!}, \quad x = 0, 1, 2, \dots$$
λ : is the mean number of occurrences (successes) in a
particular interval.
e : is the constant 2.718 (the base of the natural, or
Napierian, logarithm).
x : is the number of occurrences (successes).
P(x): is the probability for a specified value of x.
Example
During a laboratory experiment, the average number of
radioactive particles passing through a counter in 1
millisecond is 4. What is the probability that 6 particles
enter the counter in a given millisecond?
Solution
Using the Poisson distribution with x = 6 and λ = 4, we
have
$$P(x = 6) = \frac{e^{-4} \, 4^6}{6!} = 0.1042$$
Example
Ten is the average number of oil tankers arriving each day
at a certain port. What is the probability that on a given
day the port can handle at most 15 tankers per day?
Solution
Theorem
Example
In a manufacturing process where glass products are
made, defects or bubbles occur, occasionally rendering
the piece undesirable for marketing. It is known that, on
average, 1 in every 1000 of these items produced has one
or more bubbles. What is the probability that a random
sample of 8000 will yield fewer than 7 items possessing
bubbles?
Solution
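A sketch of one way to solve this, assuming the items are independent so that the number of items with bubbles is binomial with n = 8000 and p = 0.001. Because n is large and p is small, the binomial distribution is well approximated by a Poisson distribution with λ = np = 8. The Python code below computes P(X < 7) = P(X ≤ 6) both ways for comparison.

```python
from math import comb, exp, factorial

n, p = 8000, 0.001
lam = n * p   # Poisson mean used for the approximation

# Exact binomial probability of fewer than 7 items with bubbles (0..6).
exact = sum(comb(n, k) * p ** k * (1 - p) ** (n - k) for k in range(7))

# Poisson approximation with the same mean.
approx = sum(exp(-lam) * lam ** k / factorial(k) for k in range(7))

print(round(exact, 4), round(approx, 4))
```

The two values agree to about three decimal places, which is why the Poisson form is convenient when only tables are available.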
Example
(i)Show that
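$$f(x) = \begin{cases} \frac{1}{9} x^2 & 0 < x < 3 \\ 0 & \text{elsewhere} \end{cases}$$
$$\int_{-\infty}^{\infty} f(x)\,dx = 0 + \left[\frac{1}{27} x^3\right]_0^3 + 0 = 1$$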
The p.d.f. of x
Example
(1) 𝑝(𝑥 ≤ 2)
(2) 𝑝(𝑥 > 1)
(3) 𝑝(2 ≤ 𝑥 ≤ 3)
Solution
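(1)
$$P(x \le 2) = \int_{-\infty}^{2} f(x)\,dx = \int_{-\infty}^{0} f(x)\,dx + \int_{0}^{2} f(x)\,dx$$
$$= \int_{-\infty}^{0} 0\,dx + \int_{0}^{2} \frac{1}{9} x^2\,dx = 0 + \left[\frac{1}{27} x^3\right]_0^2 = \frac{8}{27}$$
(2)
$$P(x > 1) = \int_{1}^{\infty} f(x)\,dx = \int_{1}^{3} f(x)\,dx + \int_{3}^{\infty} f(x)\,dx$$
$$= \int_{1}^{3} \frac{1}{9} x^2\,dx + \int_{3}^{\infty} 0\,dx = \left[\frac{1}{27} x^3\right]_1^3 + 0 = \frac{26}{27}$$
(3)
$$P(2 \le x \le 3) = \int_{2}^{3} \frac{1}{9} x^2\,dx = \left[\frac{1}{27} x^3\right]_2^3 = \frac{1}{27}(27 - 8) = \frac{19}{27}$$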
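All three answers follow from the cumulative distribution function F(x) = x³/27 on 0 < x < 3. The Python sketch below uses exact fractions to confirm them; the function name `cdf` is illustrative.

```python
from fractions import Fraction

def cdf(x):
    """F(x) = P(X <= x) for f(x) = x**2 / 9 on 0 < x < 3 (0 elsewhere)."""
    if x <= 0:
        return Fraction(0)
    if x >= 3:
        return Fraction(1)
    return Fraction(x) ** 3 / 27

p1 = cdf(2)              # P(x <= 2)
p2 = 1 - cdf(1)          # P(x > 1)
p3 = cdf(3) - cdf(2)     # P(2 <= x <= 3)
print(p1, p2, p3)
```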
Example
$$f(x) = \begin{cases} k\sqrt{x} & 0 < x < 1 \\ 0 & \text{elsewhere} \end{cases}$$
Solution
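(a) The value of k is given by the equation
$$1 = \int_{-\infty}^{\infty} f(x)\,dx = \int_{-\infty}^{0} f(x)\,dx + \int_{0}^{1} f(x)\,dx + \int_{1}^{\infty} f(x)\,dx$$
$$= \int_{-\infty}^{0} 0\,dx + \int_{0}^{1} k\sqrt{x}\,dx + \int_{1}^{\infty} 0\,dx$$
$$= 0 + \left[\frac{k\,x^{3/2}}{3/2}\right]_0^1 + 0 = \frac{2}{3} k$$
$$\therefore \frac{2}{3} k = 1 \Rightarrow k = \frac{3}{2}$$
Thus
$$f(x) = \begin{cases} \frac{3}{2}\sqrt{x} & 0 < x < 1 \\ 0 & \text{elsewhere} \end{cases}$$
(b)
$$P\left(x < \frac{1}{4}\right) = \int_{-\infty}^{0} f(x)\,dx + \int_{0}^{1/4} f(x)\,dx$$
$$= \int_{-\infty}^{0} 0\,dx + \int_{0}^{1/4} \frac{3}{2}\sqrt{x}\,dx = 0 + \left[x^{3/2}\right]_0^{1/4} = \frac{1}{8}$$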
Let X be a continuous random variable with range
[a, b] and probability density function f(x). The expected
value of X is defined by
$$E(X) = \int_a^b x f(x)\,dx.$$
Example
Let X have range [0, 2] and density $f(x) = \frac{3}{8} x^2$. Find E(X).
Solution
$$E(X) = \int_0^2 x f(x)\,dx = \int_0^2 \frac{3}{8} x^3\,dx = \left[\frac{3x^4}{32}\right]_0^2 = \frac{3}{2}$$
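The integral can also be checked numerically. The Python sketch below uses a simple midpoint rule written for this illustration (it is not a library routine), first to confirm that the density integrates to 1 and then to approximate E(X).

```python
def integrate(g, a, b, steps=100_000):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / steps
    return sum(g(a + (i + 0.5) * h) for i in range(steps)) * h

def density(x):
    """f(x) = (3/8) x^2 on the range [0, 2]."""
    return 3 / 8 * x ** 2

total = integrate(density, 0, 2)                        # area under f
expected = integrate(lambda x: x * density(x), 0, 2)    # E(X)

print(round(total, 6), round(expected, 6))
```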
𝑉 (𝑥 ) = 𝐸 ((𝑥 − 𝜇)2 ).
Properties of Variance
FIGURE 𝑝(𝑥1 ≤ 𝑥 ≤ 𝑥2 )
Solution
The probability density function is
$$f(x) = \frac{1}{b - a} \quad \text{where } a \le x \le b$$
$$f(x) = \frac{1}{5000 - 2000} = \frac{1}{3000} \quad \text{where } 2000 \le x \le 5000$$
a- $P(2500 \le x \le 3000) = (3000 - 2500) \times \frac{1}{3000} = 0.167$
Figure: $P(2500 \le x \le 3000)$
b- $P(x \ge 4000) = (5000 - 4000) \times \frac{1}{3000} = 0.333$
Figure: $P(4000 \le x \le 5000)$
c- $P(x = 2500) = 0$, because X can take an uncountably
infinite number of values, so the probability of each
individual value is zero.
Figure: $P(x = 2500)$
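For a uniform distribution, every probability is the length of the interval times the constant height 1/(b − a). A minimal Python sketch, with an illustrative helper name:

```python
def uniform_prob(lo, hi, a, b):
    """P(lo <= X <= hi) for X uniformly distributed on [a, b]."""
    lo, hi = max(lo, a), min(hi, b)     # clip the interval to [a, b]
    return max(hi - lo, 0) / (b - a)    # interval length times 1 / (b - a)

a, b = 2000, 5000
p_a = uniform_prob(2500, 3000, a, b)           # part a
p_b = uniform_prob(4000, float("inf"), a, b)   # part b: x >= 4000
p_c = uniform_prob(2500, 2500, a, b)           # part c: a single point

print(round(p_a, 3), round(p_b, 3), p_c)
```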
Normal Distribution
• Bell shaped,
plus infinity.
The Normal Distribution - Graphically
$$n(x; \mu, \sigma) = \frac{1}{\sqrt{2\pi}\,\sigma} \, e^{-\frac{1}{2\sigma^2}(x - \mu)^2}, \quad -\infty < x < \infty$$
normal distribution.
▪ Any normal distribution can be converted to a
random variable
• Or, we need to know the area under the curve between
two specific boundaries.
Example
Solution
0.1915 + 0.3413 = 0.5328
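Areas such as 0.1915 and 0.3413 are normally read from the standard normal table. They can also be computed from the error function, as in the Python sketch below. Reading the two areas as "between 0 and 0.5" and "between 0 and 1" standard deviations is an assumption based on the table values; the helper name `phi` is illustrative.

```python
from math import erf, sqrt

def phi(z):
    """Standard normal cumulative distribution function P(Z <= z)."""
    return 0.5 * (1 + erf(z / sqrt(2)))

area_0_to_half = phi(0.5) - phi(0)   # area between z = 0 and z = 0.5
area_0_to_one = phi(1.0) - phi(0)    # area between z = 0 and z = 1
total = area_0_to_half + area_0_to_one

print(round(area_0_to_half, 4), round(area_0_to_one, 4), round(total, 4))
```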
Example
Example
Example
Solution
Exponential distribution.
Exponential distribution formula
Solution
a. The mean and standard deviation are equal to $\frac{1}{\lambda}$.
Thus,
$$\mu = \sigma = \frac{1}{\lambda} = \frac{1}{0.05} = 20 \text{ hours}$$
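Probabilities for an exponential random variable follow from its cumulative distribution function, P(X ≤ x) = 1 − e^(−λx) for x > 0. The Python sketch below uses the rate λ = 0.05 from this solution; the helper name and the particular probability evaluated are illustrative.

```python
from math import exp

def expon_cdf(x, lam):
    """P(X <= x) for an exponential random variable with rate lam."""
    return 1 - exp(-lam * x) if x > 0 else 0.0

lam = 0.05
mean = 1 / lam                          # also the standard deviation
p_within_mean = expon_cdf(mean, lam)    # P(X <= mean) = 1 - e^(-1)
p_beyond_mean = 1 - p_within_mean       # P(X > mean)

print(mean, round(p_within_mean, 4), round(p_beyond_mean, 4))
```

Note that more than half of the probability lies below the mean, a consequence of the right-skewed shape of the exponential distribution.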
Example
c. The service is completed in a time between 5 and 8
minutes.
Solution
Glossary of Chapter
Probability distribution of a discrete random variable A
list of all the possible values that a discrete random
variable can assume and their corresponding probabilities.
Exercises (6)
1- Refer to the information regarding the weekly income
of shift foremen in the glass industry. The distribution of
weekly incomes follows the normal probability
distribution with a mean of 1000 and a standard deviation
of 100.
What is the probability of selecting a shift foreman in the
glass industry whose income is: Less than 790?
2- Refer to the information regarding the weekly income
of shift foremen in the glass industry. The distribution of
weekly incomes follows the normal probability
distribution with a mean of 1000 and a standard deviation
of 100.
What is the probability of selecting a shift foreman in the
glass industry whose income is: Between 840 and 1,200?
5- What is P(Z < 1)?
6- What is P(Z < −2.23)?
8. Let X be an exponential random variable with λ = 0.50.
Find the following probabilities.
a. 𝑝(𝑥 > 1)
b. 𝑝(𝑥 > 0.4)
c. 𝑝(𝑥 < 0.5)
d. 𝑝(𝑥 < 2)
9. Let X be an exponential random variable with λ = 0.3.
Find the following probabilities.
a. 𝑝(𝑥 > 2)
b. 𝑝(𝑥 < 4)
c. 𝑝(1 < 𝑥 < 2)
d. 𝑝(𝑥 = 3)
10. Consider the experiment consisting of 10 tosses of a
coin. Determine whether or not it is a binomial
experiment.
11. At the Express House Delivery Service, providing
high-quality service to customers is the top priority of the
management. The company guarantees a refund of all
charges if a package it is delivering does not arrive at its
destination by the specified time. It is known from past
data that despite all efforts, 2% of the packages mailed
through this company do not arrive at their destinations
within the specified time. Suppose a corporation mails 10
packages through Express House Delivery Service on a
certain day.
(a) Find the probability that exactly one of these 10
packages will not arrive at its destination within the
specified time.
(b) Find the probability that at most one of these 10
packages will not arrive at its destination within the
specified time.
11. In a Robert Half International survey of senior
executives, 35% of the executives said that good
employees leave companies because they are unhappy
with the management (USA TODAY, February 10, 2009).
Assume that this result holds true for the current
population of senior executives. Let x denote the number
in a random sample of three senior executives who hold
this opinion. Write the probability distribution of x and
draw a bar graph for this probability distribution.
12. According to a Harris Interactive survey conducted
for World Vision and released in February 2009, 56% of
teens in the United States volunteer time for charitable
causes. Assume that this result is true for the current
population of U.S. teens. A sample of 60 teens is selected.
Let x be the number of teens in this sample who volunteer
time for charitable causes. Find the mean and standard
deviation of the probability distribution of x.
13. On average, a household receives 9.5 telemarketing
phone calls per week. Using the Poisson probability
distribution formula, find the probability that a randomly
selected household receives exactly 6 telemarketing
phone calls during a given week.
are low. The probabilities of these three scenarios are
0.32, 0.51, and 0.17, respectively.
(a) Let x be the profits (in millions of dollars) earned per
annum from this product by the company. Write the
probability distribution of x.
(b) Calculate the mean and standard deviation of x
GOALS of the CHAPTER
1- Define point and confidence interval estimates.
2- Define level of confidence.
3- Compute a confidence interval for the population
mean when the population standard deviation is
known.
4- Compute a confidence interval for a population
mean when the population standard deviation is
unknown.
5- Compute a confidence interval for a population
proportion.
CHAPTER VII
Estimation
2. Collect the required information from the members of
the sample.
3. Calculate the value of the sample statistic.
4. Assign value(s) to the corresponding population
parameter.
• Point estimate: A sample statistic used to estimate
the exact value of a population parameter.
• Confidence interval (interval estimate): A range of
values, defined by the confidence level, within which
the population parameter is estimated to fall.
• Confidence level: The likelihood, expressed as a
percentage or a probability, that a specified interval
will contain the population parameter.
• 95% confidence level: There is a 0.95 probability
that a specified interval does contain the population
mean. In other words, there are 5 chances out of
100 (or 1 chance out of 20) that the interval doesn’t
contain the population mean.
• 99% confidence level: There is 1 chance out of 100
that the interval doesn’t contain the population
mean.
Constructing a Confidence Interval (CI)
• The sample mean is the point estimate of the
population mean.
• The sample standard deviation is the point
estimate of the population standard deviation.
• The standard error of the mean makes it possible
to state the probability that an interval around the
point estimate contains the actual population mean.
Case I. If the following three conditions are fulfilled
1. The population standard deviation is known
2. The sample size is small (i.e., n < 30)
3. The population from which the sample is selected is
normally distributed,
then we use the normal distribution to make the
confidence interval for 𝜇.
Case II. If the following two conditions are fulfilled
Confidence Interval for μ: The (1 − α)100% confidence
interval for μ under Cases I and II is
$$\bar{x} \pm z \frac{\sigma}{\sqrt{n}}$$
$\bar{x}$: is the sample mean.
z: the value for a particular confidence level.
σ: the population standard deviation.
n: the number of observations in the sample.
The value of z used here is obtained from the
standard normal distribution table for the given
confidence level.
Example
Solution
$n = 81$, $\bar{x} = 40$, $\sigma = 5$, C.I. = 95%, $z = 1.96$
$$\bar{x} \pm z \frac{\sigma}{\sqrt{n}} = 40 \pm (1.96) \frac{5}{\sqrt{81}} = 40 \pm 1.089$$
$$38.91 \le \mu \le 41.09$$
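The interval above can be reproduced with a few lines of Python. The function name `z_interval` is illustrative, and the value z = 1.96 is taken from the standard normal table for 95% confidence.

```python
from math import sqrt

def z_interval(mean, sigma, n, z):
    """Confidence interval x_bar +/- z * sigma / sqrt(n) for a known sigma."""
    margin = z * sigma / sqrt(n)
    return mean - margin, mean + margin

low, high = z_interval(mean=40, sigma=5, n=81, z=1.96)
print(round(low, 2), round(high, 2))
```

The same function applies to any example in this section where σ is known; only the four inputs change.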
Example
Solution
29.216≤ 𝜇 ≤ 30.784
Example
Solution
$$\bar{x} \pm z \frac{\sigma}{\sqrt{n}} = 45420 \pm 1.96 \frac{2050}{\sqrt{256}}$$
If the following three conditions are fulfilled:
1. The population standard deviation σ is unknown.
2. The sample size is small (i.e., n < 30).
3. The population from which the sample is selected is
normally distributed,
then we use the t distribution to make the confidence
interval for μ.
$$\bar{x} \pm t \frac{s}{\sqrt{n}}$$
$\bar{x}$: is the sample mean.
t: the value for a particular confidence level.
s: the sample standard deviation.
n: the number of observations in the sample.
size increases, however, the t distribution approaches the
standard normal distribution.
The units of a t distribution are denoted by t. The
shape of a particular t distribution curve depends on the
number of degrees of freedom (df). The number of degrees
of freedom for a t distribution is equal to the sample size
minus one, that is,
$$df = n - 1$$
The t distribution has only one parameter, called the
degrees of freedom (df). The mean of the t distribution is
equal to 0, and its standard deviation is $\sqrt{\frac{df}{df - 2}}$
(defined for df > 2).
Example
shows the standard normal distribution and the t
distribution for 9 degrees of freedom.
Solution
$$\text{Standard deviation of } t = \sqrt{\frac{df}{df - 2}} = \sqrt{\frac{9}{9 - 2}} = 1.134$$
Example
Find the value of t for 16 degrees of freedom and 0.05
area in the right tail of a t distribution curve.
Solution
If the Sample Size Is Too Large and 𝝈 is unknown
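$$\bar{p} \pm z_{\alpha/2} \sqrt{\frac{\bar{p}(1 - \bar{p})}{n}} = \bar{p} \pm z_{\alpha/2} \sqrt{\frac{\bar{p}\,\bar{q}}{n}}$$
where $\bar{p} = \frac{x}{n}$ and $\bar{q} = 1 - \bar{p}$.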
Example
Out of 50 students randomly selected from a college, 16
had been bitten by a dog at some time. Construct an
approximate 95% confidence interval for the population
proportion.
Solution
$$\bar{p} = \frac{x}{n} = \frac{16}{50} = 0.32$$
$$p = \bar{p} \pm z \sqrt{\frac{\bar{p}(1 - \bar{p})}{n}}$$
$$p = 0.32 \pm 1.96 \sqrt{\frac{0.32(1 - 0.32)}{50}}$$
$$p = 0.32 \pm 0.13$$
$$0.19 \le p \le 0.45$$
We estimate that between 19% and 45% of all students in
the college had been bitten by a dog at some time, with
approximately 95% confidence.
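The proportion interval can be sketched in Python in the same way. The function name is illustrative, and z = 1.96 is the table value for 95% confidence.

```python
from math import sqrt

def proportion_interval(x, n, z):
    """Return (p_bar, lower limit, upper limit) for a population proportion."""
    p_bar = x / n
    margin = z * sqrt(p_bar * (1 - p_bar) / n)
    return p_bar, p_bar - margin, p_bar + margin

p_bar, low, high = proportion_interval(x=16, n=50, z=1.96)
print(p_bar, round(low, 2), round(high, 2))
```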
Example
The union representing of America (BBA) is considering
a proposal to merge with the Teamsters Union. According
to BBA union bylaws, at least three-fourths of the union
membership must approve any merger. A random sample
of 2000 current BBA members reveals 1600 plan to vote
for the merger proposal. What is the estimate of the
population proportion?
Develop a 95 percent confidence interval for the
population proportion. Basing your decision on this
sample information, can you conclude that the necessary
proportion of BBA members favor the merger? Why?
Solution
$$\text{C.I.} = \bar{p} \pm z_{\alpha/2} \sqrt{\frac{\bar{p}(1 - \bar{p})}{n}} = 0.80 \pm 1.96 \sqrt{\frac{0.80(1 - 0.80)}{2000}}$$
Glossary of Chapter
Exercises (7)
b. Make a 99% confidence interval for the percentage of
all auto claims filed with this company that are fraudulent.
GOALS of the CHAPTER
1. Define a hypothesis and hypothesis testing.
2. Explain the five-step hypothesis-testing procedure.
3. Describe Type I and Type II errors.
4. Distinguish between a one-tailed and two-tailed
hypothesis.
5. Describe the ANOVA approach for testing
difference in means.
CHAPTER VIII
Hypothesis Tests
Two hypotheses
A null hypothesis is a claim (or statement) about a
population parameter that is assumed to be true until it is
declared false. It is represented by 𝐻0 .
An alternative hypothesis is a claim about a
population parameter that will be true if the null
hypothesis is false. It is represented by 𝐻1 .
There are two possible errors. A Type I error occurs
when we reject a true null hypothesis. A Type II error is
defined as not rejecting a false null hypothesis. The
probability of a Type I error is denoted by 𝛼, which is
also called the significance level. The probability of a
Type II error is denoted by β.
value stated in the null hypothesis that is, when the
hypotheses assume the following form:
𝐻0 : 𝜇 = 𝜇0
𝐻1 : 𝜇 ≠ 𝜇0
𝐻0 : 𝜇 = 𝜇0
𝐻1 : 𝜇 > 𝜇0
Level of significance: The probability of rejecting the null
hypothesis when it is true, designated α.
Where
$$\text{standard error} = \frac{\sigma}{\sqrt{n}}$$
𝑝 𝑣𝑎𝑙𝑢𝑒 ≥ 𝛼 𝑜𝑟 𝛼 ≤ 𝑝 𝑣𝑎𝑙𝑢𝑒
Example
At Sweden Food Corporation, it used to take an average
of 90 minutes for new workers to learn a food processing
job. Recently the company installed a new food
processing machine. The supervisor at the company wants
to find if the mean time taken by new workers to learn the
food processing procedure on this new machine is
different from 90 minutes. A sample of 20 workers
showed that it took, on average, 85 minutes for them to
learn the food processing procedure on the new machine.
It is known that the learning times for all new workers are
normally distributed with a population standard deviation
of 7 minutes. Find the p-value for the test that the mean
learning time for the food processing procedure on the
new machine is different from 90 minutes. What will your
conclusion be if 𝛼 = 0.01?
Solution
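A sketch of the computation in Python, assuming a two-tailed z test (the population standard deviation is known and the population is normal). The standard normal cumulative probability is obtained from the error function instead of a table; the helper name `phi` is illustrative.

```python
from math import erf, sqrt

def phi(z):
    """Standard normal cumulative distribution function P(Z <= z)."""
    return 0.5 * (1 + erf(z / sqrt(2)))

x_bar, mu0, sigma, n = 85, 90, 7, 20

z = (x_bar - mu0) / (sigma / sqrt(n))   # test statistic
p_value = 2 * phi(-abs(z))              # two-tailed test: both tails count
alpha = 0.01
reject_h0 = p_value < alpha             # reject H0 when p-value < alpha

print(round(z, 2), round(p_value, 4), reject_h0)
```

Because the p-value is smaller than α = 0.01, the null hypothesis that the mean learning time is still 90 minutes is rejected.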
Figure: Two-tailed test with rejection areas beyond −2.58 and 2.58 and acceptance area (1 − α) between them; the observed value −3.19 lies in the left rejection area.
Example
Vodafone Telephone Company provides long-distance
telephone service in an area. According to the company’s
records, the average length of all long-distance calls
placed through this company in 2009 was 12.44 minutes.
The company’s management wanted to check if the mean
length of the current long-distance calls is different from
12.44 minutes. A sample of 150 such calls placed through
this company produced a mean length of 13.71 minutes.
The standard deviation of all such calls is 2.65 minutes.
Using the 2% significance level, can you conclude that
the mean length of all current long-distance calls is
different from 12.44 minutes?
Solution
$$z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}} = \frac{13.71 - 12.44}{2.65 / \sqrt{150}} = 5.87$$
Step 4: Formulate the decision rule.
Figure: Acceptance area (1 − α) between the two rejection areas.
If the population standard deviation is unknown
and the sample size is less than 30, use the t distribution:
$$t = \frac{\bar{x} - \mu}{s / \sqrt{n}}$$
Example
A psychologist claims that the mean age at which children
start walking is 12.5 months. Carol wanted to check if
this claim is true. She took a random sample of 18
children and found that the mean age at which these
children started walking was 12.9 months with a standard
deviation of 0.80 month. It is known that the ages at
which all children start walking are approximately
normally distributed. Test whether the mean age at which
all children start walking is different from 12.5 months at
the 1% significance level.
Solution
Step 1. State the null and alternative hypotheses.
𝐻0 : 𝜇 = 12.5
𝐻1 : 𝜇 ≠ 12.5
Step 2: Select the level of significance.
α = 0.01 as stated in the problem
Step 3: Select the test statistic.
$$t = \frac{\bar{x} - \mu}{s / \sqrt{n}} = \frac{12.9 - 12.5}{0.80 / \sqrt{18}} = 2.121$$
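Step 4: Formulate the decision rule.
[Figure: t distribution curve for the two-tailed test, with the acceptance area (1 − 𝛼) in the center and a rejection area in each tail.]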
Step 5: Make a decision and interpret the result.
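The decision for this example can be checked with a short Python sketch. One assumption is labeled in the code: the critical value 2.898 is taken from a t table (17 degrees of freedom, 0.005 in each tail) and is not computed here, because the standard library has no t distribution.

```python
from math import sqrt

# Values from the example.
sample_mean, mu0, s, n = 12.9, 12.5, 0.80, 18

# Test statistic when the population standard deviation is unknown.
t = (sample_mean - mu0) / (s / sqrt(n))
df = n - 1

# Assumption: critical value read from a t table for df = 17 with
# 0.005 in each tail (alpha = 0.01, two-tailed test).
t_critical = 2.898

# Decision rule: reject H0 if t falls outside (-t_critical, t_critical).
reject_h0 = abs(t) > t_critical

print(f"t = {t:.3f}, df = {df}, critical values = -{t_critical} and {t_critical}")
print("Reject H0" if reject_h0 else "Do not reject H0")
```

Because t = 2.121 lies between −2.898 and 2.898, the null hypothesis is not rejected at the 1% significance level. The sample does not provide sufficient evidence that the mean age at which children start walking differs from 12.5 months.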
$$z = \frac{\bar{p} - p}{\sqrt{\dfrac{p(1 - p)}{n}}}$$
Solution
$$\bar{p} = \frac{x}{n} = \frac{1550}{2000} = 0.775$$
Step 1: State the null hypothesis and the alternate
hypothesis.
𝐻0 : 𝑝 ≥ 0.80
𝐻1 : 𝑝 < 0.80
$$z = \frac{\bar{p} - p}{\sqrt{\dfrac{p(1 - p)}{n}}} = \frac{0.775 - 0.80}{\sqrt{\dfrac{0.80(1 - 0.80)}{2000}}} = -2.80$$
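The test statistic for the proportion can be reproduced with the sketch below. It also reports a left-tailed p-value, which is an addition for illustration; the variable names are assumptions, and the significance level is deliberately left out because it is not restated in this part of the solution.

```python
from math import sqrt
from statistics import NormalDist

# Values from the example: 1550 successes in a sample of 2000.
x, n = 1550, 2000
p0 = 0.80  # proportion stated in the null hypothesis

# Sample proportion.
p_hat = x / n

# Test statistic for a population proportion.
z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)

# Left-tailed test (H1: p < 0.80): area to the left of z.
p_value = NormalDist().cdf(z)

print(f"sample proportion = {p_hat:.3f}, z = {z:.3f}, p-value = {p_value:.4f}")
```

The unrounded statistic is about −2.795. H0 is rejected at any significance level 𝛼 that is larger than the reported p-value.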
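[Figure: standard normal curve for the left-tailed test, with the acceptance area (1 − 𝛼) and the rejection area in the left tail.]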
Step 5: Make a decision and interpret the result.
The F distribution
The shape of a particular F distribution curve
depends on the number of degrees of freedom. However,
the F distribution has two numbers of degrees of freedom,
degrees of freedom for the numerator and degrees of
freedom for the denominator. These two numbers are the
parameters of the F distribution.
To test whether the three teaching methods produce
different means, we test the null hypothesis
𝐻0 : 𝜇1 = 𝜇2 = 𝜇3
against the alternative hypothesis
𝐻1 : Not all three population means are equal.
ANOVA, short for analysis of variance, provides a
procedure for this test. It is used to compare three or more
population means in a single test.
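As a rough illustration of how a one-way ANOVA arrives at its F statistic, the sketch below divides the variance between samples by the variance within samples. The scores are hypothetical and do not come from the text, and the function name `one_way_anova_f` is an assumption made for this example.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_numerator, df_denominator) for a one-way ANOVA."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = mean(x for g in groups for x in g)

    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the observations around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    df_between = k - 1        # degrees of freedom for the numerator
    df_within = n_total - k   # degrees of freedom for the denominator
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical scores for three teaching methods (not from the text).
scores = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
f_stat, df1, df2 = one_way_anova_f(scores)
print(f"F = {f_stat:.2f} with ({df1}, {df2}) degrees of freedom")
```

The two degrees of freedom returned here are the two parameters of the F distribution. The computed F would then be compared with the critical value from an F table at the chosen significance level.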
Glossary of Chapter
Exercises (8)
A random sample of 600 observations taken from this
population produced a sample proportion of 0.86. a. If this
test is made at the 2% significance level, would you reject
the null hypothesis?
c. an alternative hypothesis is rejected when it is
actually true.
4. A critical value is the value
a. calculated from sample data.
b. determined from a table (e.g., the normal
distribution table or other such tables).
c. neither a nor b.
Appendix
Table (1) area under Normal curve
Table (2) t distribution
References
2- Douglas A. Lind, William G. Marchal, Samuel A. Wathen,
“Statistical Techniques in Business & Economics”,
14th Ed., McGraw-Hill, Irwin, 2012.