Statistics - For - Business Module

This document provides an overview of a course on statistics for business. It covers topics such as data collection and presentation methods, measures of central tendency and dispersion, probability, probability distributions, sampling, and statistical inference. The course is offered through the Department of Management at the Royal College and aims to teach fundamental statistical concepts and their applications in business. It is a 3-credit course offered through the college's Continuous Education Program in Addis Ababa, Ethiopia in 2019.
ROYAL COLLEGE

Department of Management
Statistics for Business (MGMT1071)
Credit Hours: 3

Continuous Education Program

2019
Addis Ababa

Contents
CHAPTER 1: INTRODUCTION
1.1. Definition of Statistics
1.2 Stages in Statistical Investigation
1.3. Definition of some terms
1.4. Applications, uses and limitation of Biostatistics/Statistics
1.5. Scale of Measurement
CHAPTER 2: METHODS OF DATA COLLECTION AND PRESENTATION
2.1 Methods of Data Collection
2.2. Methods of Data Presentation
2.2.1. Frequency distribution
2.2.2. Diagrammatic presentation of data: Bar charts, Pie-chart, Cartograms
2.2.4 Graphical Presentation of data
CHAPTER 3: MEASURES OF CENTRAL TENDENCY
3.1 The Summation Notation
3.2 Properties of measures of central tendency
3.3 Types of Measures of Central Tendency
3.3.1 Arithmetic Mean
3.3.2 Geometric Mean
3.3.3. Harmonic Mean
3.3.4 Median
3.3.5 The Mode or modal value
3.5 Measures of Non-central Locations
CHAPTER 4: Measures of Dispersion (Variation)
4.1. Introduction
4.2. Absolute and Relative Measures of Dispersion
4.3 Types of Measures of Variation
4.3.1 The Range and Relative Range
4.3.2. The Quartile Deviation and Coefficient of Quartile Deviation
4.3.3 The Mean Deviation and Coefficient of Mean Deviation
4.3.4 The Variance, Standard Deviation and Coefficient of Variation
4.4 Standard Scores (Z-Scores)
4.5 Moments, Skewness and Kurtosis
4.5.1 Moments
4.5.2 Skewness
4.5.3 Kurtosis
CHAPTER 5: ELEMENTARY PROBABILITY
5.1 Definition of some probability terms
5.2 Counting rules
5.3 Probability of an event
5.4 Some probability rules
5.5 Conditional Probability and Independence
5.5.1 Conditional Probability
5.5.2 Multiplication Law of Probability
5.5.3 Probability of Independent Event
Chapter 6: Probability Distribution
6.1 The Concept of Random Variables
6.1.1 Discrete Random Variable
6.1.2 Continuous Random Variable
6.2 Probability Distribution
6.3 Expectation and Variance of Random variable
6.3.1 Expectation/Mean
6.4 Common Discrete Probability Distributions
6.4.1 Binomial Distribution
6.4.2 Poisson Distribution
6.5. Common Continuous Probability Distributions
6.5.1 Normal Distributions
6.5.2 Chi-Square Distribution
6.5.3 The t-distribution
CHAPTER 7: SAMPLING AND SAMPLING DISTRIBUTION OF THE SAMPLE MEAN
7.1 Basic Concepts
7.2 Reasons for sampling
7.3 Sampling Techniques
7.5 Sampling Distribution of the sample Proportion
CHAPTER 8: STATISTICAL INFERENCES
8.1 Statistical Estimation
8.1.1 Confidence interval Estimation for population means
8.1.2 Sample size determination in estimation of population mean
8.1.3 Confidence interval for population proportion
8.2 Statistical Hypothesis testing
8.2.1 Hypothesis testing for population mean
8.2.2 Hypothesis testing for population proportion
8.2.3 TEST OF ASSOCIATION OF ATTRIBUTES
Decision Rule
CHAPTER 9: TWO SAMPLES INFERENCE
9.1 Inferences about differences between means
9.2 Paired comparison
9.3 Inferences about differences between Proportions
9.4 Inferences concerning variances
CHAPTER 10: SIMPLE LINEAR REGRESSION AND CORRELATION
10.1 Simple Linear Regression of Y on X
10.2 Covariance and Simple Linear Correlation Analysis
10.3 Spearman's Rank Correlation Coefficient
Appendix: Table-A
Table B
Table C
References

CHAPTER 1: INTRODUCTION

1.1. Definition of Statistics


The word statistics is defined in different ways depending on whether it is used in the plural or the
singular sense.

In the plural sense: statistics are collections of numerical facts or figures (or the raw data
themselves).

Eg. 1. Vital statistics (numerical data on marriages, births, deaths, etc).

2. "The average mark of students in the statistics course is 70%" is a statistic, whereas "Abebe
scored 90% in the statistics course" is not.

Remark: statistics are aggregates of facts. Single, isolated figures are not statistics because they
cannot be compared and are unrelated.

In the singular sense: Statistics is the subject that deals with the methods of collecting,
organizing, presenting, analyzing and interpreting statistical data.

Classification of Statistics
Statistics is broadly divided into two categories based on how the collected data are used.
Descriptive Statistics: deals with describing the collected data without drawing any conclusion that
goes beyond them.
Example 1.1: Suppose that the marks of 6 students in the Statistics course for Biology section A are
40, 45, 50, 60, 70 and 80. The average mark of the 6 students is 57.5, and it is considered a descriptive
statistic.
Inferential Statistics: deals with making inferences and/or conclusions about a population based on
data obtained from a sample of observations. It consists of performing hypothesis testing, determining
relationships among variables and making predictions.
Example 1.2: In the above example, if we treat the 6 students as a sample and say that the average mark
in the Statistics course for all Biology section A students is 57.5, then we are doing inferential
statistics (drawing a conclusion about the population based on the sample observations).
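
As a minimal sketch of Examples 1.1 and 1.2 in Python (the variable names are illustrative and not part
of the module), the descriptive step is simply the arithmetic on the six marks:

```python
from statistics import mean

# Marks of the 6 students from Example 1.1
marks = [40, 45, 50, 60, 70, 80]

# Descriptive statistics: summarize only the data actually collected
average_mark = mean(marks)
print(average_mark)  # 57.5
```

The same number becomes an inferential statement only when it is used as an estimate for students who
were not among the six observed.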

1.2 Stages in Statistical Investigation


A statistical investigation proceeds through five stages: collection, organization, presentation,
analysis and interpretation of data.

i. Collection of data: This is the process of obtaining measurements or counts, that is, of obtaining
raw data.
Data can be collected in a variety of ways; one of the most common is a sample or census survey. A
survey can itself be carried out in different ways, and three of the most common are:
• Telephone survey
• Mailed questionnaire
• Personal interview.
ii. Organization of data: Data collected from published sources are generally in organized form.
However, if an investigator has collected data through a survey, it is necessary to edit these data in
order to correct any apparent inconsistencies, ambiguities, and recording errors.
This phase also includes grouping the data into classes and tabulating them.
iii. Presentation of data: After the data have been collected and organized they can be presented in
the form of tables, charts, diagrams and graphs. This presentation in an orderly manner facilitates
the understanding as well as analysis of data.
iv. Analysis of data: the basic purpose of data analysis is to dig out useful information for decision
making. This analysis may simply be a critical observation of data to draw some meaningful
conclusions about it or it may involve highly complex and sophisticated mathematical techniques.
v. Interpretation of data: Interpretation means drawing conclusions from the data collected and
analyzed. Correct interpretation will lead to a valid conclusion of the study & thus can aid in
decision making.

1.3. Definition of some terms


Population: It is the totality of objects under study. The population represents the target of an
investigation, and the objective of the investigation is to draw conclusions about the population; hence
we sometimes call it the target population. The word population doesn't necessarily refer to people.
Example 1.3:
• All patients of a hospital suffering from TB and treated with a new drug
• All clients of a telephone company
• All students of Debre Markos University (DMU)
• Population of families, etc
The population could be finite or infinite (an imaginary collection of units).
Survey: is an investigation of a certain population to assess its characteristics. It may be a census
or a sample survey.
Sample: is a part or subset of the population under study.

Sampling frame: is the list of all units of the population from which the sample can be drawn.

Eg. A list of all students of PU, a list of all residential houses in AA, etc.

Census survey: a complete enumeration of the population under study.


Sample survey: the process of collecting data covering a representative part or portion of a population.
Parameter: is a statistical measure of a population, or summary value calculated from a population.
Examples: Average, Range, proportion, variance, etc.
Statistic: is a descriptive measure of a sample, or it is a summary value calculated from a sample.
Sampling: The process or method of sample selection from the population.
Sample size: The number of elements or observation to be included in the sample.
An element: is a member of a sample or population. It is a specific subject or object (for example a
person, firm, item, etc.) about which information is collected.
Variable: It is an item of interest that can take numerical or non-numerical values for different elements.
It may be qualitative or quantitative. Example: age, weight, sex, marital status, etc.
Observation (measurement): is the value of a variable for an element.
Qualitative variables: are variables that assume non-numerical values. They can be categorized and
they are usually called attributes. Eg. Sex, marital status, ID number (a label, even though it is
written with digits), etc.

Quantitative variables: are variables which assume numerical values. Eg. Age, weight, etc.
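
The distinction between a parameter and a statistic can be sketched in Python as follows. The
population of 200 marks is invented purely for illustration, and simple random sampling via
random.sample is an assumption of this sketch, not a method prescribed by the module.

```python
import random
from statistics import mean

random.seed(1)  # fixed seed so the sketch is repeatable

# Hypothetical population: marks of 200 students (invented for illustration)
population = [random.randint(30, 100) for _ in range(200)]

# Parameter: a summary value calculated from the whole population
population_mean = mean(population)

# Sampling: select a sample of the chosen size, without replacement
sample_size = 20
sample = random.sample(population, sample_size)

# Statistic: a summary value calculated from the sample
sample_mean = mean(sample)

print(f"parameter (population mean): {population_mean:.2f}")
print(f"statistic (sample mean):     {sample_mean:.2f}")
```

Each element here is one student, the variable is the mark, and each number in the list is an
observation. The statistic will generally differ somewhat from the parameter it estimates.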

1.4. Applications, uses and limitation of Biostatistics/Statistics


Application of Biostatistics
Application of Biostatistics is not restricted to certain experiments but is used in a wide variety of
contexts. Some of these applications are as follows:
1. Genetical statistics
- In classical or Mendelian genetics-the focus of interest is centered on the inheritance of qualitative
characteristics. The statistical methods generally applied are binomial or chi-square tests.
- Population genetics:- is concerned with studying genetic structure of populations and changes
occurring in it over generations. The frequencies of different genes & their changes due to the effect
of some forces can be estimated with the application of different statistical methods.
2. Numerical taxonomy:- deals with grouping of taxonomic units into taxa by numerical methods.
3. Statistical ecology:- deals with the study of temporal and spatial patterns of populations of
organisms.
4. Forest measurement: - measurement of tree length, area, volume, weight and the like.
5. Demography: - the study and measurement of human populations. It measures birth, death, marriage,
morbidity, migration, etc.

6. Medical sciences: - health managers are expected to take some decisions on the basis of whatever
bit of information is available.

Applications of Statistics
Statistics can be applied in any field of study which seeks quantitative evidence. For instance,
engineering, economics, natural science, etc.
a) Engineering: Statistics has wide application in engineering.
• To compare the breaking strength of two types of materials.
• To estimate the reliability of a product.
• To control the quality of products in a given production process.
• To compare the improvement of yield due to certain additives such as fertilizer, herbicides, etc.
b) Economics: Statistics is widely used in economic study and research.
• To measure and forecast Gross National Product (GNP).
• Statistical analyses of population growth, inflation rate, poverty, unemployment figures, rural or
urban population shifts and so on influence much of economic policy making.
• Financial statistics are necessary in the fields of money and banking, including consumer savings
and credit availability.
c) Statistics and research: there is hardly any advanced research going on without the use of statistics
in one form or another. Statistics is used extensively in medical, pharmaceutical and agricultural research.
Function/Uses of Statistics: Today the field of statistics is recognized as a highly useful tool for
decision making by managers in modern business and industry and in frequently changing technology. It
has many functions in everyday activities. The following are some uses of statistics:
• It condenses and summarizes a mass of data: the original set of data (raw data) is normally
voluminous and disorganized, so it is summarized and expressed in a few presentable, understandable and
precise figures.
• Statistics facilitates comparison of data: measures obtained from different sets of data can be
compared to draw conclusions about those sets. Statistical values such as averages, percentages, ratios,
rates, coefficients, etc, are the tools that can be used for the purpose of comparing sets of data.
• Statistics helps to predict future trends: statistics is very useful for analyzing past and present
data and forecasting future events.
• Statistics helps to formulate & review policies: Statistics provides the basic material for framing
suitable policies. The results of statistical studies in areas such as taxation, the unemployment rate,
inflation, the performance of military equipment, etc, may convince a government to review its
policies and plans with a view to meeting national needs and aspirations.

• Formulating and testing hypotheses: Statistical methods are extremely useful in formulating and
testing hypotheses and in developing new theories.
Limitations of Statistics
The field of statistics, though widely used in all areas of human knowledge and widely applied in a
variety of disciplines such as engineering, economics and research, has its own limitations. Some of these
limitations are:
a) It does not deal with individual values: as discussed earlier, statistics deals with aggregates of
facts. For example, the wage earned by an individual worker at any one time, taken by itself, is not a
statistic.
b) It does not deal with qualitative characteristics directly: statistics is not applicable to qualitative
characteristics such as beauty, honesty, poverty, standard of living and so on, since these cannot be
expressed in quantitative terms. These characteristics, however, can be dealt with statistically if
quantitative values can be assigned to them by some logical criterion. For example, intelligence may be
compared to some degree by comparing IQs or other scores in certain intelligence tests.
c) Statistical conclusions are not universally true: since statistics is not an exact science, as the
natural sciences are, statistical conclusions are true only under certain assumptions.
d) It can be misused: statistics cannot be used to full advantage in the absence of proper understanding of
the subject matter.

1.5. Scale of Measurement


Proper knowledge about the nature and type of data to be dealt with is essential in order to specify and
apply the proper statistical method for their analysis and inferences.
Scale Types
Measurement is the assignment of values to objects or events in a systematic fashion. Four levels of
measurement scales are commonly distinguished: nominal, ordinal, interval, and ratio, and each possesses
different properties of measurement systems. The first two are qualitative while the last two are
quantitative.
Nominal scale: The values of a nominal attribute are just different names, i.e., nominal attributes provide
only enough information to distinguish one object from another. They are qualities with no ranking or
ordering and no numerical or quantitative value. The data consist of names, labels and categories. This is
a scale for grouping individuals into different categories.
Example 1.4: type of tree, type of insect, eye color: brown, black, . . ., sex: male, female.
• In this scale, one is different from the other
• Arithmetic operations (+, -, *, ÷) are not applicable, comparison (<, >, etc) is impossible
Ordinal scale: - defined as nominal data that can be ordered or ranked.
• Can be arranged in some order, but the differences between the data values are meaningless.
• Data consisting of an ordering or ranking of measurements are said to be on an ordinal scale of
measurement. That is, the values of an ordinal scale provide enough information to order objects.
• One is different from and greater /better/ less than the other
• Arithmetic operations (+, -, *, ÷) are impossible, comparison (<, >, etc) is possible.
Example 1.5: Letter grading (A, B, C, D, F), rating scales (excellent, very good, good, fair, poor), military
rank (general, colonel, lieutenant etc).

Interval scale: data are ordinal data for which the differences between data values are meaningful.
However, there is no true zero, or starting point, and the ratios of data values are meaningless.
Note: Celsius & Fahrenheit temperature readings have no meaningful zero, so ratios are meaningless.
In this measurement scale:-
• One is different from, and better/greater than, another by a certain amount of difference
• Possible to add and subtract. For example: 80°C - 50°C = 30°C, 70°C - 40°C = 30°C
• Multiplication and division are not meaningful. For example, 60°C = 3(20°C), but this does not imply
that an object which is 60°C is three times as hot as an object which is 20°C.
Most common examples are: IQ, temperature.
Ratio scale: Similar to interval, except there is a true zero (absolute absence), or starting point, and the
ratios of data values have meaning.
• Arithmetic operations (+, -, *, ÷) are applicable. For ratio variables, both differences and ratios are
meaningful.
• One is different/larger /taller/ better/ less by a certain amount of difference and so many times than the
other.
• This measurement scale provides better information than the interval scale of measurement
Example 1.6: weight, age, number of trees, etc.
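
The difference between the interval and ratio scales can be checked numerically. The following is a
rough sketch in Python; the helper name celsius_to_fahrenheit and the 60 kg / 20 kg weights are
illustrative assumptions, not values from the module.

```python
def celsius_to_fahrenheit(c):
    """Convert a Celsius reading to Fahrenheit."""
    return c * 9 / 5 + 32

hot_c, cool_c = 60, 20

# Interval scale: the ratio depends on the arbitrary zero point of the unit
ratio_c = hot_c / cool_c  # 60 / 20
ratio_f = celsius_to_fahrenheit(hot_c) / celsius_to_fahrenheit(cool_c)  # 140 / 68

# Ratio scale: weight has a true zero, so the ratio survives a change of unit
heavy_kg, light_kg = 60, 20
ratio_kg = heavy_kg / light_kg
ratio_g = (heavy_kg * 1000) / (light_kg * 1000)

print(ratio_c, round(ratio_f, 2))  # 3.0 2.06
print(ratio_kg, ratio_g)           # 3.0 3.0
```

The same two temperatures give a ratio of 3 in Celsius but about 2.06 in Fahrenheit, so "three times
as hot" has no meaning, whereas the weight ratio is 3 in either unit.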
Exercise 1
1. A popular radio program asked its listeners to respond either 'Yes' or 'No' to the question: 'Are you
concerned about the spread of AIDS through unprotected sex?' Within 1 minute, 91 callers expressed their
views.
a/ What is the population of this study?
b/ What is the sample the radio station used to gauge the public mood?
c/ Is this sample scientific? Explain.
2. To study the average effect of a fish diet on human blood cholesterol level, a researcher randomly selects
500 males of 25 years of age who have never eaten fish more than once a week, and measures their
cholesterol level. The researcher then serves all the individuals 8 ounces of fish every day for one year.

After one year the researcher measures the cholesterol level of each individual again and calculates the
difference from the value of the year before (difference = pre-diet level minus post-diet level).
Determine:
a/ the population
b/ the sample
c/ the variable under study
d/ the parameter of interest and
e/ whether the variable is qualitative or quantitative
3. A survey has been conducted on a sample of 350 patients suffering from a particular type of headache.
Determine:
a/ the population
b/ the sample

4. Classify each variable as qualitative or quantitative, and state whether it is nominal, ordinal,
interval or ratio.

a/ Hair color
b/ Type of tree
c/ Age
d/ Grams of fat in herbages
e/ Types of surgical procedures
f/ White blood count
g/ Cancer staging
h/ WHO HIV stage
i/ Blood group

CHAPTER 2: METHODS OF DATA COLLECTION AND PRESENTATION
The second unit of this module introduces the methods of data collection and presentation. It deals with
how to collect data and how to present the data you have collected so that they can be of use. Collected
data, also known as raw data, are usually in an unorganized form and need to be organized and presented in
a meaningful and readily comprehensible form in order to facilitate further statistical analysis.
Objectives
At the end of this chapter students will be able to:
• Arrange raw data in an array and classify them to construct a frequency table and a
cumulative frequency table.
• Organize data using a frequency distribution.
• Present data using suitable graphs or diagrams.

2.1 Methods of Data Collection


Data: is the raw material of statistics. It can be obtained either by measurement or counting. When we
determine that the appropriate approach to seeking an answer to a question will require the use of
statistics, we begin to search for suitable data to serve as the raw material for our investigation.

Sources of data

Statistical data may be classified under two categories depending upon the source.
1. Primary data: - Data collected by the investigator himself for the purpose of a specific inquiry or
study. Such data are original in character & are mostly generated by surveys conducted by
individuals or research institutions.
Such data are usually more reliable & accurate, since the investigator can extract the correct
information by removing doubts, if any, in the minds of the respondents regarding certain questions.
2. Secondary data: - When an investigator uses data which have already been collected by others, such
data are called secondary data. Such data are primary data for the agency that collected them, and
become secondary for someone else who uses these data for his own purposes. Sources of secondary
data are books, journals, reports, etc.
When our source is secondary data, check that:
• The type and objective of the original study suit the present situation.
• The purpose for which the data were collected is compatible with the present problem.
• The nature and classification of the data are appropriate to our problem.
• There are no biases or misreporting in the published data.
Note: Data which are primary for one may be secondary for the other.

2.2. Methods of Data Presentation
Having collected and edited the data, the next important step is to organize them, that is, to present
them in a readily comprehensible, condensed form that helps in drawing inferences from them. It is also
necessary that like items be separated from unlike ones.
The presentation of data is broadly classified into the following two categories:
• Tabular presentation
• Diagrammatic and Graphic presentation.
The process of arranging data into classes or categories according to their similarities is technically
called classification. It eliminates inconsistency and also brings out the points of similarity and/or
dissimilarity of the collected items/data.
Classification is necessary because it would not be possible to draw inferences and conclusions directly
from a large set of collected [raw] data.

2.2.1. Frequency distribution


Frequency: is the number of times a certain value or class of values occurs.
Frequency distribution (FD): is the organization of raw data in table form using classes and frequency.
There are three types of FD and there are specific procedures for constructing each type.
The three types are:-
I. Categorical FD
II. Ungrouped FD and
III. Grouped FD
I. Categorical FD: Used for data that can be placed in specific categories, such as nominal or ordinal level
data.
Example 2.1: Twenty five patients were given a blood test to determine their blood type. The data is as
shown below: A B B AB O A O O B AB B B B O A O O O AB AB A O O B A.
Solution: Since the data are categorical by taking the four blood types as classes we can construct a FD as
shown below.
Step 1: Make a table as shown below.
CLASS   TALLY   FREQUENCY   PERCENT
A
B
AB
O

Step 2: Tally the data and place the result under the column Tally.
Step 3: Count the tallies and place the result under the column Frequency.
Step 4: Find the percentage of values in each class using the formula % = (f/n) × 100%, where f is the
class frequency and n is the total number of observations.
CLASS   TALLY         FREQUENCY   PERCENT
A       /////         5           5/25 × 100 = 20%
B       ///// //      7           7/25 × 100 = 28%
AB      ////          4           4/25 × 100 = 16%
O       ///// ////    9           9/25 × 100 = 36%
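
The four steps above can also be checked mechanically. The following is a minimal Python sketch, not part of the original module; the variable names (`blood`, `freq`, `percent`) are chosen here for illustration only.

```python
from collections import Counter

# Blood types of the 25 patients in Example 2.1
blood = "A B B AB O A O O B AB B B B O A O O O AB AB A O O B A".split()

freq = Counter(blood)      # Steps 2-3: tally and count each class
n = len(blood)             # total number of observations
percent = {c: 100 * f / n for c, f in freq.items()}   # Step 4: % = f/n * 100

for c in ["A", "B", "AB", "O"]:
    print(f"{c:<3} {freq[c]:>2} {percent[c]:.0f}%")
```

The same pattern works for any categorical variable: the classes are simply the distinct values that occur in the data.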

II. Ungrouped Frequency Distribution (UFD)


A UFD is a table of all the raw score values that could possibly occur in the data, along with the
number of times each value actually occurs.
A UFD is often constructed for a small set of data or for data on a discrete variable.

Constructing ungrouped frequency distribution:

 First find the smallest and largest raw score in the collected data.
 Arrange the data in order of magnitude and count the frequency.
 To facilitate counting one may include a column of tallies.

Example 2.2:The following data represent the number of days of sick leave taken by each of 50
workers of a company over the last 6 weeks.

2 0 0 5 8 3 4 1 0 0 7 1
7 1 5 4 0 4 0 1 8 9 7 0
1 7 2 5 5 4 3 3 0 0 2 5
1 3 0 2 4 5 0 5 7 5 1 1
0 2

i. Construct ungrouped frequency distribution


ii. How many workers had at least 1 day of sick leave?
iii. How many workers had between 2 and 6 days of sick leave?

Solution:

i. Since this data set contains only a relatively small number of distinct values, it is
convenient to represent it in a frequency table which presents each distinct value along with its
frequency of occurrence.

Class   Frequency   Cumulative frequency
0       12          12
1       8           20
2       5           25
3       4           29
4       5           34
5       8           42
7       5           47
8       2           49
9       1           50

ii. Since 12 of the 50 workers had no days of sick leave, the answer is 50 − 12 = 38.
iii. Taking "between 2 and 6" to mean more than 2 and fewer than 6 days, the answer is the sum of the
frequencies for the values 3, 4 and 5, that is 4 + 5 + 8 = 17.
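
As a cross-check of Example 2.2, here is a short Python sketch (an illustration added here, with hypothetical variable names) that rebuilds the ungrouped frequency table and answers parts ii and iii. It assumes, as in the solution, that "between 2 and 6" excludes both end values.

```python
from collections import Counter
from itertools import accumulate

# Days of sick leave for the 50 workers in Example 2.2
days = [2, 0, 0, 5, 8, 3, 4, 1, 0, 0, 7, 1,
        7, 1, 5, 4, 0, 4, 0, 1, 8, 9, 7, 0,
        1, 7, 2, 5, 5, 4, 3, 3, 0, 0, 2, 5,
        1, 3, 0, 2, 4, 5, 0, 5, 7, 5, 1, 1,
        0, 2]

freq = Counter(days)
values = sorted(freq)                              # distinct values in order
cum = list(accumulate(freq[v] for v in values))    # cumulative frequencies

at_least_one = sum(f for v, f in freq.items() if v >= 1)      # part ii
between_2_and_6 = sum(f for v, f in freq.items() if 2 < v < 6)  # part iii

for v, c in zip(values, cum):
    print(v, freq[v], c)
print(at_least_one, between_2_and_6)
```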

III. Grouped Frequency Distribution (GFD)


When the range of the data is large, the data must be grouped into classes that are more than one unit in
width.
Definition of some basic terms
 Grouped frequency distribution: a FD in which several values are grouped into one class.
 Class limits (CL): They separate one class from another. The limits could actually appear in the data, and
there are gaps between the upper limit of one class and the lower limit of the next class.
 Unit of measure (U): The smallest possible difference between successive values, e.g. 1, 0.1, 0.01,
0.001, etc.
 Class boundaries: Separate one class in a grouped frequency distribution from the other. The
boundary has one more decimal place than the raw data. There is no gap between the upper
boundary of one class and the lower boundary of the succeeding class. The lower class boundary is
found by subtracting half of the unit of measure from the lower class limit, and the upper class boundary
is found by adding half of the unit of measure to the upper class limit.
 Class width (W): The difference between the upper and lower boundaries of any class.
The class width is also the difference between the lower limits (or the upper limits) of two consecutive
classes.
 Class mark (Midpoint): It is found by adding the lower and upper class limits (or boundaries) and
dividing the sum by two.
 Cumulative frequency: It is the number of observations less than the upper class boundary, or greater
than the lower class boundary, of a class.
 CF (Less than type): the number of values less than the upper class boundary of a given class.
 CF (Greater than type): the number of values greater than the lower class boundary of a given
class.
 Relative frequency (Rf): The frequency of a class divided by the total frequency. This gives the proportion of
values falling in that class.

$Rf_i = f_i / n = f_i / \sum f_i$

 Relative cumulative frequency (RCf): The running total of the relative frequencies, or the
cumulative frequency divided by the total frequency. It gives the proportion of the values which are less
than the upper class boundary (or, for the greater than type, greater than the lower class boundary).

$RCf_i = Cf_i / n = Cf_i / \sum f_i$

STEPS IN CONSTRUCTING A GFD


1. Find the highest and the smallest value.
2. Compute the range: R = H – L.
3. Select the number of classes desired (K), either
I. by choosing arbitrarily between 5 and 15, or
II. by using Sturges' formula:
K = 1 + 3.322 log n, where n is the total frequency and log is the base-10 logarithm.
4. Find the class width (W) by dividing the range by the number of classes and rounding the result to
the nearest integer.
W = R/K
5. Identify the unit of measure (U), usually 1, 0.1, 0.01, etc.
6. Pick a suitable starting point less than or equal to the minimum value. Your starting point is the lower
limit of the first class.
- Then continue to add the class width to get the remaining lower class limits.
7. Find the upper class limit of the first class, UCL1 = LCL1 + W – U. Then continue to add the class width
to get the remaining upper class limits.
8. Find the class boundaries:
LCBi = LCLi – ½ U
UCBi = UCLi + ½ U
9. Find the class marks:
CMi = (UCLi + LCLi)/ 2 or CMi = (UCBi + LCBi)/ 2.
10. Tally the data.
11. Find the frequencies.
12. Find the cumulative frequencies. Depending on what you are trying to accomplish, it may be
necessary to find the cumulative frequency.
13. If necessary, find Rf and RCf.
When grouping data the following rules are important:
 The groups must not overlap, otherwise there is confusion concerning in which group a
measurement belongs.
 There must be continuity from one group to the next, which means that there must be no
gaps. Otherwise some measurements may not fit in a group.
 The groups must range from the lowest measurement to the highest measurement so that all
of the measurements have a group to which they can be assigned.
 The groups should normally be of an equal width, so that the counts in different groups can
easily be compared.
Example 2.3: The blood glucose level, in milligrams per deciliter, for 60 patients is shown
below. Construct a grouped frequency distribution for the data set.

55 70 85 90 93 86 103 74 92
63 101 83 82 100 97 97 109 84
84 75 92 68 114 84 101 81 91
82 115 86 69 59 56 84 77 90
77 97 80 101 61 74 87 80
58 81 78 88 86 59 82 83
59 78 116 72 62 105 65 78
Solution:-
1) Highest value = 116, Lowest value = 55
2) Range = 116 – 55 = 61
3) K = 1 + 3.322 log 60 = 1 + 3.322(1.78) = 6.9 ≈ 7
4) W = R / K = 61/7 = 8.7 ≈ 9
5) U = 1
6) LCL1 = 55
7) Find the upper class limits: UCL1 = 55 + 9 – 1 = 63, and so on.
8) Find the class boundaries.
9) Find the class marks.

Class limit   Frequency   Class boundary   Class mark   CF(<)   CF(>)   Rf     %Rf
55 – 63       9           54.5 – 63.5      59           9       60      0.15   15
64 – 72       5           63.5 – 72.5      68           14      51      0.08   8
73 – 81       12          72.5 – 81.5      77           26      46      0.20   20
82 – 90       17          81.5 – 90.5      86           43      34      0.28   28
91 – 99       7           90.5 – 99.5      95           50      17      0.12   12
100 – 108     6           99.5 – 108.5     104          56      10      0.10   10
109 – 117     4           108.5 – 117.5    113          60      4       0.07   7
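
The construction steps can be traced in code. The sketch below is an illustration added here, not the module's own method: it applies Sturges' formula, rounds the class width to the nearest integer, takes the minimum as the first lower class limit and uses UCL = LCL + W − U, which reproduces the classes of Example 2.3. All names (`glucose`, `rows`, and so on) are hypothetical.

```python
import math

# Blood glucose levels (mg/dl) of the 60 patients in Example 2.3
glucose = [55, 70, 85, 90, 93, 86, 103, 74, 92,
           63, 101, 83, 82, 100, 97, 97, 109, 84,
           84, 75, 92, 68, 114, 84, 101, 81, 91,
           82, 115, 86, 69, 59, 56, 84, 77, 90,
           77, 97, 80, 101, 61, 74, 87, 80,
           58, 81, 78, 88, 86, 59, 82, 83,
           59, 78, 116, 72, 62, 105, 65, 78]

n = len(glucose)
low, high = min(glucose), max(glucose)
k = round(1 + 3.322 * math.log10(n))   # Sturges' formula, rounded
w = round((high - low) / k)            # class width = range / k, rounded
u = 1                                  # unit of measure (whole numbers)

rows, cf = [], 0
for i in range(k):
    lcl = low + i * w                  # lower class limit
    ucl = lcl + w - u                  # upper class limit
    f = sum(lcl <= x <= ucl for x in glucose)
    cf += f                            # less-than cumulative frequency
    rows.append((lcl, ucl, f, lcl - u / 2, ucl + u / 2, (lcl + ucl) / 2, cf))

for r in rows:
    print(r)
```

Each tuple holds the class limits, frequency, class boundaries, class mark and less-than cumulative frequency, in that order.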

2.2.2. Diagrammatic presentation of data: Bar charts, Pie-chart, Cartograms


The most convenient and popular way of describing data is using graphical presentation. It is easier to
understand and interpret data when they are presented graphically than using words or a frequency table.
A graph can present data in a simple and clear way. It can also illustrate the important aspects of the data.
This leads to better analysis and presentation of the data. In this section, we discuss the approach for the
most commonly used diagrammatic or graphical methods such as bar charts, pie charts, histograms,
frequency polygons and cumulative frequency polygons.
The three most commonly used diagrammatic presentations for discrete as well as qualitative data are:

 Pie charts
 Bar charts
 Pictogram
A) Pie chart

A pie chart is a circle that is divided into sections or wedges according to the percentage of
frequencies in each category of the distribution. The angle of the sector is obtained using:

Angle of a sector = (value of the part / the whole quantity) × 360°
Example 2.4: Using the immunization status of children in a certain area given in Example 2.5,
draw the pie chart.

[Figure: pie chart with sectors not immunized 36%, partially immunized 27%, fully immunized 37%]
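
The sector angles behind such a pie chart follow from the rule that each category receives its share of the full 360°. The Python sketch below is an added illustration with hypothetical names; it uses the frequencies from Example 2.5.

```python
# Frequencies from Example 2.5 (immunization status of 210 children)
counts = {"not immunized": 75, "partially immunized": 57, "fully immunized": 78}

total = sum(counts.values())
# Each sector angle is the category's share of the total times 360 degrees
angles = {status: f / total * 360 for status, f in counts.items()}

for status, angle in angles.items():
    print(f"{status}: {angle:.1f} degrees")
```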

Bar Charts

 Used to represent & compare the frequency distribution of discrete variables and attributes or
categorical series.
 Bars can be drawn either vertically or horizontally.

In presenting data using bar diagram,

 All bars must have equal width and the distance between bars must be equal.
 The height or length of each bar indicates the size (frequency) of the figure represented.

There are different types of bar charts. The most common being:

 Simple bar chart


 Component or sub divided bar chart.
 Multiple bar charts.
I. Simple bar chart
 Are used to display data on one variable.
 They are thick lines (narrow rectangles) having the same breadth. The magnitude of a quantity is
represented by the height /length of the bar.

Example 2.5: Consider the immunization status of children in a certain area:
Immunization status (class)   Number (frequency)   Relative frequency in percent
Not immunized                 75                   35.7 %
Partially immunized           57                   27.2 %
Fully immunized               78                   37.1 %
Total                         210                  100.0 %
Draw a simple bar chart of the immunization status of children.

Solution:

[Figure: simple bar chart; y-axis: number of children (0 to 90), x-axis: immunization status (not immunized, partially immunized, fully immunized)]

II. Component Bar chart


 When there is a desire to show how a total (or aggregate) is divided in to its component parts, we
use component bar chart.
 The bars represent total value of a variable with each total broken in to its component parts and
different colors or designs are used for identifications


Example 2.6: Consider data on immunization status of women by marital status.

                  Immunization status
Marital status    Immunized          Non-immunized      Total
                  No.      %         No.      %
Single            58       24.7      177      75.3      235
Married           156      34.7      294      65.3      450
Divorced          10       35.7      18       64.3      28
Widowed           7        50.0      7        50.0      14
Total             231      31.8      496      68.2      727

Draw a component (sub-divided) bar chart of the immunization status of women by marital
status

Solution:

[Figure: component bar chart; y-axis: number of women (0 to 500), x-axis: marital status (Single, Married, Divorced, Widowed); each bar is divided into immunized and non-immunized]

III. Multiple Bar charts


 These are used to display data on more than one variable.
 They are used for comparing different variables at the same time.

Example 2.7: Draw a multiple bar chart to represent the immunization status of women by marital status
given in Example 2.6.

Solution:

[Figure: multiple bar chart; y-axis: number of women (0 to 350), x-axis: marital status (Single, Married, Divorced, Widowed); separate bars for Immunized and Non-Immunized]

B) Pictograph

In this diagram, we represent data by means of some picture symbols. We decide about a suitable picture
to represent a definite number of units in which the variable is measured.

2.2.4 Graphical Presentation of data


The histogram, frequency polygon and cumulative frequency graph (ogive) are the most commonly applied
graphical representations for continuous data.

Procedures for constructing statistical graphs:

 Draw and label the x and y axes.


 Choose a suitable scale for the frequencies or cumulative frequencies and label it on the y-axis.
 Represent the class boundaries for the histogram or ogive, or the mid points for the frequency
polygon, on the x-axis.
 Plot the points.
 Draw the bars or lines to connect the points.
Histogram
A graph which displays the data by using vertical bars of various heights to represent frequencies. Class
boundaries are placed along the horizontal axis. Class marks and class limits are sometimes used as the
quantity on the x-axis.
Example 2.8:Construct a histogram to represent the blood glucose level for 60 patients given in example
2.3.

Solution:

[Figure: histogram; x-axis: class boundary, y-axis: frequency; bar heights 9, 5, 12, 17, 7, 6 and 4 for the seven classes]

Frequency polygon
If we join the mid-points of the tops of the adjacent rectangles of the histogram with line segments a
frequency polygon is obtained. When the polygon is continued to the x-axis just outside the range of the
lengths the total area under the polygon will be equal to the total area under the histogram.
Example 2.9: Construct a frequency polygon to represent the following data.

Class limits   Frequency   Class marks   Class boundaries   R.F.   % R.F.   Less than C.F.   More than C.F.
15 - 24        3           19.5          14.5 - 24.5        0.06   6%       3                50
25 - 34        4           29.5          24.5 - 34.5        0.08   8%       7                47
35 - 44        10          39.5          34.5 - 44.5        0.20   20%      17               43
45 - 54        15          49.5          44.5 - 54.5        0.30   30%      32               33
55 - 64        12          59.5          54.5 - 64.5        0.24   24%      44               18
65 - 74        4           69.5          64.5 - 74.5        0.08   8%       48               6
75 - 84        2           79.5          74.5 - 84.5        0.04   4%       50               2

Solution:
Two class marks with fi = 0 are added: 9.5 at the beginning and 89.5 at the end. The following
frequency polygon is then plotted.

[Figure: frequency polygon; x-axis: class mark (9.5, 19.5, 29.5, 39.5, 49.5, 59.5, 69.5, 79.5, 89.5), y-axis: frequency (0 to 20)]

Ogive (cumulative frequency polygon)

An ogive (pronounced as “oh-jive”) is a line that depicts cumulative frequencies, just as the cumulative
frequency distribution lists cumulative frequencies. Note that the ogive uses class boundaries along the
horizontal scale, and the graph begins with the lower boundary of the first class and ends with the upper
boundary of the last class. An ogive is useful for determining the number of values below some particular
value. There are two types of ogive, namely the less than ogive and the more than ogive. The difference is that
the less than ogive uses the less than cumulative frequency and the more than ogive uses the more than
cumulative frequency on the y-axis.

Example 2.10: i) Draw a less than Ogive for data of blood glucose level of the 60 patients given in
Example 2.3.

[Figure: the less than ogive; x-axis: blood glucose level (upper class boundary: 54.5, 63.5, 72.5, 81.5, 90.5, 99.5, 108.5, 117.5), y-axis: number of patients (0 to 60)]

ii) Draw a more than Ogive for the F.D. of Example 2.9.

[Figure: the more than ogive; x-axis: class boundaries (14.5, 24.5, 34.5, 44.5, 54.5, 64.5, 74.5, 84.5), y-axis: cumulative frequency (0 to 60)]

Note: For both ogives, one class with frequency zero is added, for the same reason as for the frequency
polygon.

Exercise 2

1. Classify the following as discrete or continuous variable.


a)Temperature; b) number of courses offered in DMU; c) rain fall; d) age.

2. Distinguish between primary and secondary data. What precautions should be taken before using
secondary data?
3. Construct a frequency distribution for a survey taken at a hotel, in which 40 tourists arrived by the
following means of transportation:
car car bus plane plane car plane plane bus car plane car car car plane bus
car bus car plane car car car bus car bus bus plane plane plane car
plane plane plane bus bus car car plane car

4. The weights of 30 students of a College of Natural and Computational Science were recorded as follows:

143 151 104 99 121 126


138 119 104 112 132 123
121 133 137 132 126 107
139 122 127 90 129 134
133 136 113 112 140 123

Construct a frequency distribution with 7 classes.

5. Given the following frequency distribution:


Class limits 0-1 2-3 4-5 6-7 8-9

Frequency 16 25 13 4 2

Find a) the class marks; b) the class boundaries; c) the relative frequencies

CHAPTER 3: MEASURES OF CENTRAL TENDENCY
Objectives
At the end of this chapter students will be able to:
 Identify measure of central tendency
 understand properties of arithmetic mean
 Summarize an aggregate of statistical data by using single measure
 Define and calculate the mean, mode and median.
 Measure the position of data using quartiles, deciles and percentiles with their interpretation.

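3.1 The Summation Notation (∑)


Statistical Symbols: Let a data set consist of a number of observations, represented by $x_1, x_2, \ldots, x_n$,
where n (the last subscript) denotes the number of observations in the data and $x_i$ is the i-th observation.
Then the sum of all numbers $x_i$, where i goes from 1 up to n, is symbolically given by
$\sum_{i=1}^{n} x_i$, $\sum x_i$ or $\sum x$; that is,
$\sum_{i=1}^{n} x_i = x_1 + x_2 + \cdots + x_n$
x - whole set of numbers
$x_i$ - specific score in a set of numbers
n - total number of observations.
Example 3.1: For instance, a data set consisting of six measurements 2, 3, 9, 10, 8 and -2 is
represented by $x_1, x_2, \ldots, x_6$, where $x_1 = 2$, $x_2 = 3$, $x_3 = 9$, $x_4 = 10$, $x_5 = 8$ and $x_6 = -2$. Their sum
becomes $\sum_{i=1}^{6} x_i = x_1 + x_2 + \cdots + x_6 = 2 + 3 + 9 + 10 + 8 + (-2) = 30$.
Some Properties of the Summation Notation
1. $\sum_{i=1}^{n} c = n \cdot c$, where c is a constant number.

2. $\sum_{i=1}^{n} b x_i = b \sum_{i=1}^{n} x_i$, where b is a constant number.

3. $\sum_{i=1}^{n} (a + b x_i) = n \cdot a + b \sum_{i=1}^{n} x_i$

4. $\sum_{i=1}^{n} (x_i \pm y_i) = \sum_{i=1}^{n} x_i \pm \sum_{i=1}^{n} y_i$

5. $\sum_{i=1}^{n} x_i y_i \neq \sum_{i=1}^{n} x_i \sum_{i=1}^{n} y_i$ (in general)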

Example 3.2:

∑ = 20 ,∑ = 30, ∑ = 420, ∑ =280

Find i/ ∑ +4 = ∑ + 4∑ = 6.20 + 4.30 = 240


ii/ 3∑ ∑ = 3.420 – 2.280 = 700

3.2 Properties of measures of central tendency
A good average should be:

1. Rigidly defined (unique).

2. Based on all observation under investigation.

3. Easily understood.

4. Simple to compute.

5. Suitable for further mathematical treatment.

6. Little affected by fluctuations of sampling.

7. Not highly affected by extreme values.

3.3 Types of Measures of Central Tendency


Measures of Central Tendency: give us information about the location of the center of the
distribution of data values. A single value that describes the characteristics of the entire mass of data is
called a measure of central tendency. We will discuss briefly the three main measures of central tendency,
the mean, median and mode, in this unit.
The following are the types of measures of central tendency, each suitable for a particular type of data:
 Arithmetic Mean
- Weighted Arithmetic Mean
- Combined mean
 Geometric Mean
 Harmonic Mean
 Median
 Mode or modal value
3.3.1 Arithmetic Mean
Arithmetic mean: is defined as the sum of the measurements of the items divided by the total number
of items. It is usually denoted by $\bar{x}$.
Arithmetic Mean for individual series
Suppose $x_1, x_2, \ldots, x_n$ are observed values in a sample of size n from a population of size N, n < N; then
the arithmetic mean of the sample, denoted by $\bar{x}$, is given by

$\bar{x} = \frac{x_1 + x_2 + \cdots + x_n}{n} = \frac{\sum_{i=1}^{n} x_i}{n}$
If we take an entire population, the mean is denoted by μ and is given by:

$\mu = \frac{x_1 + x_2 + \cdots + x_N}{N} = \frac{\sum_{i=1}^{N} x_i}{N}$

Where N stands for the total number of observations in the population.


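Example 3.3: The number of flowers per plant is given below for two samples. Find the arithmetic mean of each.
i. 5 12 9 6
ii. 6 8 6 7 8
Solution:
i. The sample values are: 5 12 9 6

$\bar{x} = \frac{\sum x_i}{n} = \frac{5 + 12 + 9 + 6}{4} = \frac{32}{4} = 8$

The arithmetic mean for the sample values is 8.


ii. The sample values are: 6 8 6 7 8

$\bar{x} = \frac{\sum x_i}{n} = \frac{6 + 8 + 6 + 7 + 8}{5} = \frac{35}{5} = 7$

The arithmetic mean for the sample values is 7.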

Arithmetic mean for discrete data arranged in frequency distribution

When the numbers $x_1, x_2, \ldots, x_m$ occur with frequencies $f_1, f_2, \ldots, f_m$, respectively, then the mean can
be expressed in a more compact form as:

$\bar{x} = \frac{f_1 x_1 + f_2 x_2 + \cdots + f_m x_m}{f_1 + f_2 + \cdots + f_m} = \frac{\sum_{i=1}^{m} f_i x_i}{\sum_{i=1}^{m} f_i}$

Example 3.4
Calculate the arithmetic mean of the pulse rates (beats per minute) of eleven students:
60 60 71 68 71 72 71 76 72 80 80

$\bar{x} = \frac{\sum x_i}{n} = \frac{60 + 60 + 71 + \cdots + 80}{11} = \frac{781}{11} = 71$

In this case there are two 60's, one 68, three 71's, two 72's, one 76, and two 80's. The number of
times each number occurs is called its frequency, and the frequency is usually denoted by f. The
information in the sentence above can be written in a table, as follows.
Value, xi 60 68 71 72 76 80 Total
Frequency, fi 2 1 3 2 1 2 11
xi fi 120 68 213 144 76 160 781

The formula for the arithmetic mean for data of this type is

$\bar{x} = \frac{\sum f_i x_i}{\sum f_i} = \frac{1}{n} \sum f_i x_i$

In this case we have:

$\bar{x} = \frac{781}{11} = 71$.

The mean pulse rate (beats per minute) of the eleven students is 71.

Arithmetic Mean for Grouped Continuous Frequency Distribution


If data are given in the form of a continuous frequency distribution, the sample mean can be computed
as

$\bar{x} = \frac{\sum_{i=1}^{k} f_i x_i}{\sum_{i=1}^{k} f_i}$, where $x_i$ is the class mark of the ith class, i = 1, 2, . . . , k,

$f_i$ is the frequency of the ith class and k is the number of classes.

Note that $\sum f_i = n$ = the total number of observations.
Example 3.5

The following frequency table gives the height (in inches) of 100 students in a college.

Class boundary 60-62 62-64 64-66 66-68 68-70 70-72 Total


Frequency (fi) 5 18 42 20 8 7 100
Calculate the mean
Solution:
The formula to be used for the mean is as follows:

$\bar{x} = \frac{\sum f_i x_i}{\sum f_i}$

Let us calculate these values and make a table for these values for the sake of convenience.

Class boundary   Frequency (fi)   Mid-point (xi)   fi xi
60 – 62          5                61               305
62 – 64          18               63               1134
64 – 66          42               65               2730
66 – 68          20               67               1340
68 – 70          8                69               552
70 – 72          7                71               497
Total            100                               6558

Substituting these values with $\sum f_i = 100$, we get

$\bar{x} = \frac{\sum f_i x_i}{\sum f_i} = \frac{6558}{100} = 65.58$

The mean height of the students is 65.58 inches.
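
The grouped-mean calculation is a weighted sum of class marks, which is easy to verify in code. The following Python sketch is an illustration added here (variable names are hypothetical) using the data of Example 3.5.

```python
# Class boundaries and frequencies from Example 3.5 (heights in inches)
boundaries = [(60, 62), (62, 64), (64, 66), (66, 68), (68, 70), (70, 72)]
freqs = [5, 18, 42, 20, 8, 7]

marks = [(lo + hi) / 2 for lo, hi in boundaries]   # class marks x_i
n = sum(freqs)                                     # total frequency
total = sum(f * x for f, x in zip(freqs, marks))   # sum of f_i * x_i
mean = total / n

print(marks, total, mean)
```

The same code computes the mean of an ungrouped frequency table (as in Example 3.4) if the distinct values are used in place of the class marks.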


Properties of the Arithmetic Mean
 The algebraic sum of the deviations of a set of numbers $x_1, x_2, \ldots, x_n$ from their mean $\bar{x}$ is
always zero, i.e.

$\sum_{i=1}^{n} (x_i - \bar{x}) = 0$

 The sum of squares of deviations from the mean is the least. That is, $\sum_{i=1}^{n} (x_i - A)^2$ is
minimum when $A = \bar{x}$.
 If the mean of $x_1, x_2, \ldots, x_n$ is $\bar{x}$, then
a) The mean of $x_1 \pm k, x_2 \pm k, \ldots, x_n \pm k$ will be $\bar{x} \pm k$.
b) The mean of $k x_1, k x_2, \ldots, k x_n$ will be $k\bar{x}$.
Merits of Arithmetic Mean
 Arithmetic mean has a rigidly defined mathematical formula so that its value is always
definite or unique. It can be calculated for any set of numerical data.
 It is calculated based on all observations.
 Arithmetic mean is simple to calculate and easy to understand.
 It doesn't need arrangement of data in increasing or decreasing order.
 Arithmetic mean of many samples from the same population does not fluctuate considerably.
 It affords a good standard of comparison.
Demerits of Arithmetic Mean
 It can't be calculated for data which are not quantifiable.
 It is highly affected by extreme (abnormal) values in the series.
 It can be a number which does not exist in the series.
 It can't be calculated for grouped continuous open-ended classes.
Weighted Arithmetic Mean
While calculating the simple arithmetic mean, all items were assumed to be of equal importance (each
value in the data set has equal weight). When the observations have different weights, we use the weighted
average. Weights are assigned to each item in proportion to its relative importance.

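If $x_1, x_2, \ldots, x_n$ represent values of the items and $w_1, w_2, \ldots, w_n$ are the corresponding weights, then
the weighted mean ($\bar{x}_w$) is given by

$\bar{x}_w = \frac{w_1 x_1 + w_2 x_2 + \cdots + w_n x_n}{w_1 + w_2 + \cdots + w_n} = \frac{\sum w_i x_i}{\sum w_i}$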
Example 3.6
A student's final marks in Mathematics, Physics, Chemistry and Biology are respectively A, B, D and
C. If the respective credits received for these courses are 4, 4, 3 and 2, determine the approximate
average mark the student has got for the courses.
Solution
We use a weighted arithmetic mean, with the weight associated with each course taken as the number of
credits received for that course and the grade points A = 4, B = 3, C = 2 and D = 1.
Grade points (xi)   4    3    1    2    Total
Credits (wi)        4    4    3    2    13
wi xi               16   12   3    4    35

$\bar{x}_w = \frac{w_1 x_1 + w_2 x_2 + \cdots + w_n x_n}{w_1 + w_2 + \cdots + w_n} = \frac{\sum w_i x_i}{\sum w_i} = \frac{35}{13} = 2.69$

The average mark of the student is approximately 2.69.
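
A small Python sketch of the same weighted mean follows. It is an added illustration: the grade-point mapping A = 4, B = 3, C = 2, D = 1 is the one implied by the solution table, and the variable names are hypothetical.

```python
# Assumed grade-point scale, as implied by the table in Example 3.6
grade_points = {"A": 4, "B": 3, "C": 2, "D": 1}

grades = ["A", "B", "D", "C"]    # Mathematics, Physics, Chemistry, Biology
credits = [4, 4, 3, 2]           # weights w_i

weighted_sum = sum(w * grade_points[g] for g, w in zip(grades, credits))
weighted_mean = weighted_sum / sum(credits)

print(weighted_sum, sum(credits), round(weighted_mean, 2))
```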

Combined mean: When a set of observations is divided into k groups and $\bar{x}_1$ is the mean of $n_1$
observations of group 1, $\bar{x}_2$ is the mean of $n_2$ observations of group 2, …, $\bar{x}_k$ is the mean of $n_k$
observations of group k, then the combined mean, denoted by $\bar{x}_c$, of all observations taken together is
given by

$\bar{x}_c = \frac{n_1 \bar{x}_1 + n_2 \bar{x}_2 + \cdots + n_k \bar{x}_k}{n_1 + n_2 + \cdots + n_k}$

This is a special case of the weighted mean. In this case the sample sizes are the weights.

Example 3.7
In the previous year there were two sections taking the Statistics course. At the end of the semester, the
two sections got average marks of 70 & 78. There were 45 and 50 students in the two sections
respectively. Find the mean mark for all the students.

Solution:

$\bar{x}_c = \frac{n_1 \bar{x}_1 + n_2 \bar{x}_2}{n_1 + n_2} = \frac{45(70) + 50(78)}{45 + 50} = \frac{7050}{95} = 74.21$

The combined mean for all the students is 74.21.
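
Because the combined mean is a weighted mean with the group sizes as weights, it can be checked with a few lines of Python. This is an added sketch with hypothetical names, using the figures of Example 3.7.

```python
sizes = [45, 50]     # number of students in each section (n_1, n_2)
means = [70, 78]     # average mark of each section

# Combined mean = sum(n_i * mean_i) / sum(n_i)
combined = sum(n * m for n, m in zip(sizes, means)) / sum(sizes)
print(round(combined, 2))
```

Note that the simple average of 70 and 78 would be 74; the combined mean is slightly higher because the larger section has the higher mean.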

3.3.2 Geometric Mean


The geometric mean, like the arithmetic mean, is a calculated average. It is used when observed values are
measured as ratios, percentages, proportions, indices or growth rates.

Geometric mean for individual series: The geometric mean, G.M., of an individual series of positive
numbers $x_1, x_2, \ldots, x_n$ is defined as the nth root of their product.

$G.M. = \sqrt[n]{x_1 \cdot x_2 \cdots x_n} = \text{antilog}\left(\frac{1}{n} \sum_{i=1}^{n} \log x_i\right)$

Example 3.8

Find the G.M. of a) 3 and 12; b) 2, 4 and 8.

Solution: a) $G.M. = \sqrt{3 \times 12} = \sqrt{36} = 6$; b) $G.M. = \sqrt[3]{2 \times 4 \times 8} = \sqrt[3]{64} = 4$

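Geometric mean for discrete data arranged in FD: When the numbers $x_1, x_2, \ldots, x_m$ occur with
frequencies $f_1, f_2, \ldots, f_m$, respectively, then the geometric mean is obtained by

$G.M. = \sqrt[n]{x_1^{f_1} \cdot x_2^{f_2} \cdots x_m^{f_m}} = \text{antilog}\left(\frac{1}{n} \sum_{i=1}^{m} f_i \log x_i\right)$, where $n = \sum_{i=1}^{m} f_i$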


Example 3.9
Compute the geometric mean of the following values: 3, 3, 4, 4, 4, 5, 6 and 6.
Solution
Values 3 4 5 6
Frequency 2 3 1 2

$G.M. = \sqrt[8]{3^2 \cdot 4^3 \cdot 5^1 \cdot 6^2} = \sqrt[8]{103680} = 4.236$
The geometric mean for the given data is 4.236.
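
The logarithmic form of the geometric mean is the one usually used in computation. The Python sketch below, added here as an illustration with hypothetical names, evaluates it for the data of Example 3.9 using base-10 logarithms.

```python
import math

values = [3, 4, 5, 6]    # distinct values from Example 3.9
freqs = [2, 3, 1, 2]     # their frequencies

n = sum(freqs)
# (1/n) * sum(f_i * log10(x_i)), then take the antilog (10 ** ...)
log_sum = sum(f * math.log10(x) for x, f in zip(values, freqs))
gm = 10 ** (log_sum / n)

print(round(gm, 3))
```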
Geometric mean for continuous grouped FD: The above formula can also be used when the
frequency distribution is grouped continuous; the class marks of the class intervals are taken as xi.
Properties of geometric mean
 It is less affected by extreme values than the arithmetic mean.
 It takes each and every observation into consideration.
 If the value of one observation is zero, its value becomes zero.

3.3.3. Harmonic Mean
It is a suitable measure of central tendency when the data pertain to speed, rate and time. The
harmonic mean of n values is defined as n divided by the sum of their reciprocals.
Harmonic mean for individual series: If $x_1, x_2, \ldots, x_n$ are n observations, then the harmonic mean can
be represented by the following formula:

$H.M. = \frac{n}{\frac{1}{x_1} + \frac{1}{x_2} + \cdots + \frac{1}{x_n}}$

Example 3.10: A car travels 25 miles at 25 mph, 25 miles at 50 mph, and 25 miles at 75 mph. Find the
harmonic mean of the three velocities.
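Solution

$H.M. = \frac{n}{\frac{1}{x_1} + \frac{1}{x_2} + \cdots + \frac{1}{x_n}} = \frac{3}{\frac{1}{25} + \frac{1}{50} + \frac{1}{75}} = \frac{450}{11} = 40.9$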
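
The harmonic mean of the three speeds can be verified directly from the definition. The sketch below is an added Python illustration with hypothetical variable names.

```python
speeds = [25, 50, 75]    # mph, from Example 3.10

# n divided by the sum of the reciprocals
hm = len(speeds) / sum(1 / x for x in speeds)
print(round(hm, 1))

# Cross-check: total distance / total time for three 25-mile legs
total_time = sum(25 / x for x in speeds)   # hours
average_speed = 75 / total_time
```

Because each leg covers the same distance, the harmonic mean equals the overall average speed of the trip, which is why it is the appropriate average here.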

Harmonic mean for discrete data arranged in FD: If the data are arranged in the form of a frequency
distribution,

$H.M. = \frac{n}{\frac{f_1}{x_1} + \frac{f_2}{x_2} + \cdots + \frac{f_m}{x_m}}$, where $n = \sum_{k=1}^{m} f_k$

Harmonic mean for continuous grouped FD: Whenever the frequency distribution is grouped
continuous, the class marks of the class intervals are taken as $x_i$ and the above formula can be used
as

$H.M. = \frac{n}{\sum_{i=1}^{m} \frac{f_i}{x_i}}$, where $n = \sum_{k=1}^{m} f_k$ and

$x_i$ is the class mark of the ith class.


Properties of harmonic mean
 It is unique for a given set of data.
 It takes each and every observation into consideration.
 Difficult to calculate and understand.
 Appropriate measure of central tendency in situations where data is in ratio, speed or rate.
Relations among different means
i. If all the observations are positive, the relationship among the three means is:
$\bar{x} \geq GM \geq HM$
ii. For two observations, $GM = \sqrt{\bar{x} \cdot HM}$
iii. $\bar{x} = GM = HM$ if all observations are positive and have equal value.
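
For two positive observations, relation ii can be verified directly from the definitions. The short derivation below is an added sketch; the symbols a and b are introduced here only for illustration.

```latex
\bar{x} = \frac{a+b}{2}, \qquad
HM = \frac{2}{\frac{1}{a}+\frac{1}{b}} = \frac{2ab}{a+b}
\quad\Longrightarrow\quad
\bar{x}\cdot HM = ab = GM^{2}, \qquad
GM = \sqrt{\bar{x}\cdot HM}.
```

For example, with a = 3 and b = 12 the arithmetic mean is 7.5, the harmonic mean is 72/15 = 4.8, and √(7.5 × 4.8) = √36 = 6, which agrees with the geometric mean found in Example 3.8(a).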
3.3.4 Median
The median is, as its name indicates, the middle-most value in the arrangement, which divides
the data into two equal parts. It is obtained by arranging the data in increasing or
decreasing order of magnitude and is denoted by $\tilde{x}$.
Median for individual series: We arrange the sample in ascending order of the variable of
interest. Then the median is the middle value (if the sample size n is odd) or the average of the
two middle values (if the sample size n is even).
For individual series the median is obtained by
a/ $\tilde{x}$ = the $\left(\frac{n+1}{2}\right)$th value if n is odd, and

b/ $\tilde{x}$ = $\frac{\left(\frac{n}{2}\right)\text{th value} + \left(\frac{n}{2}+1\right)\text{th value}}{2}$ if n is even

Example 3.11
Find the median for the following data.
a/ -5 15 10 5 0 2 1 4 6 and 8
b/ 5 2 2 3 1 8 4

Solution:
a/ The data in ascending order is given by:
-5 0 1 2 4 5 6 8 10 15
n = 10, so n is even. The two middle values are the 5th and 6th observations. So the median is

$\tilde{x} = \frac{4 + 5}{2} = 4.5$.

b/ The data in ascending order is given by:
1 2 2 3 4 5 8
n = 7, so the middle value is the 4th observation. So the median is 3.

Note: The median is easy to calculate for small samples and is not affected by an "outlier".
Median for discrete data arranged in a frequency distribution: In this case also, the median is
obtained by the above formula. After arranging the values in increasing order, find the smallest CF
greater than or equal to the position given by the formula; the corresponding value is the median.

Median for grouped continuous data: For continuous data, the median is obtained by the following
formula.

$\tilde{x} = \text{Median} = L + \frac{w}{f_{med}} \left(\frac{n}{2} - CF\right)$

Where: L = the lower class boundary of the median class; w = the class width of the median class;
$f_{med}$ = the frequency of the median class; and CF = the cumulative frequency corresponding to the class
preceding the median class, that is, the sum of the frequencies of all classes lower than the median
class. The median class is the class which contains the (n/2)th observation, whether n is odd or
even, since the items lose their individual identity once they are grouped into continuous classes.

Example 3.12: Water percentage in the body of species of Fish is given below. Calculate the median.

C.I 15-24 25-34 35-44 45-54 55-64 Total

Freq. 7 17 16 6 4 50

Solution: Construct the less than cumulative frequency distribution:

C.I           15-24   25-34   35-44   45-54   55-64   Total
Freq.         7       17      16      6       4       50
Cuml. Freq.   7       24      40      46      50

Since n = 50, n/2 = 25, and the smallest CF greater than or equal to 25 is 40; thus, the median class is
the third class. For this class, L = 34.5, w = 10, $f_{med}$ = 16 and CF = 24. Applying the formula, we
get:

$\tilde{x} = 34.5 + \frac{10}{16}(25 - 24) = 35.125 \approx 35.1$.
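
The interpolation formula for the grouped median can be written as a small function. The following Python sketch is an added illustration; the function name `grouped_median` and its parameters are hypothetical, and it assumes equal class widths.

```python
def grouped_median(lower_boundaries, width, freqs):
    """Median of grouped data: L + (w / f_med) * (n/2 - CF)."""
    n = sum(freqs)
    cf = 0  # cumulative frequency of the classes before the current one
    for L, f in zip(lower_boundaries, freqs):
        if cf + f >= n / 2:          # first class whose CF reaches n/2
            return L + (width / f) * (n / 2 - cf)
        cf += f

# Data of Example 3.12 (lower class boundaries, width 10)
median = grouped_median([14.5, 24.5, 34.5, 44.5, 54.5], 10, [7, 17, 16, 6, 4])
print(f"{median:.3f}")
```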

Merits of median

• It is less affected by extreme values.


• Median can be calculated even in case of open-ended intervals.
• It can be computed for ratio, interval, and ordinal level of data.
Demerits of median
• Its value is not determined by each & every observation.

• It is not a good representative of the data if the number of items (data) is small.
• The arrangement of items in order of magnitude is sometimes very tedious process if the number of
items is very large.
3.3.5 The Mode or modal value
The mode or the modal value is the value with the highest frequency and is denoted by $\hat{x}$. A data set may
not have a mode or may have more than one mode. A distribution is called a bimodal distribution if it
has two data values that appear with the greatest frequency. If a distribution has more than two modes,
then the distribution is multimodal. If a distribution has no mode, then the distribution is non-modal.

Mode of individual series: The mode or the modal value of individual series (raw data) is simply
obtained by locating the observation with the maximum frequency.

Example 3.13
Consider the following data:
a. 30 45 69 70 32 18 32. The mode ($\hat{x}$) = 32.
b. 10 20 30 10 40 30. The modes ($\hat{x}$) = 10 and 30.
c. 10 40 30 20 50 60. No mode.
Note that in some samples there may be more than one mode or there may not be a mode. The mode is
not a suitable measure of central tendency in these cases. We use the mode as a measure of central
tendency if we require a measure that takes on one of the sample values. The mode can be used for
variables that are measured on a category (nominal) scale, e.g. the most popular computer type.

Mode for discrete data arranged in a frequency distribution: In the case of discrete grouped data,
the mode is found by inspection as the value (or values) with the highest frequency.

Mode for Grouped Continuous Frequency Distribution

For grouped continuous data, only the modal class, i.e. the class with the highest frequency, can be
identified directly. After locating this class, the mode is interpolated using:

$$\text{Mode} = L + \frac{\Delta_1}{\Delta_1 + \Delta_2} \times w$$

where $L$ = the lower class boundary of the modal class; $\Delta_1 = f_{mod} - f_1$;
$\Delta_2 = f_{mod} - f_2$; $w$ = the common class width; $f_1$ = frequency of the class immediately
preceding the modal class; $f_2$ = frequency of the class immediately succeeding the modal class; and
$f_{mod}$ = frequency of the modal class.
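The interpolation formula translates directly into code. This is a minimal sketch with an illustrative function name; the sample call uses the class boundary, width and frequencies that appear in Example 3.14.

```python
def grouped_mode(lower_boundary, width, f_mod, f_prev, f_next):
    """Interpolate the mode inside the modal class of a grouped distribution."""
    delta1 = f_mod - f_prev   # excess of modal frequency over the preceding class
    delta2 = f_mod - f_next   # excess of modal frequency over the succeeding class
    if delta1 + delta2 == 0:
        raise ValueError("neighbouring classes have the same frequency as the modal class")
    return lower_boundary + delta1 / (delta1 + delta2) * width

print(round(grouped_mode(24.5, 10, f_mod=17, f_prev=7, f_next=16), 2))  # 33.59
```

The guard covers the degenerate case in which the modal class and both neighbours share one frequency, where the formula would divide by zero.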
Example 3.14
Calculate the mode for the frequency distribution (water percentage) of the data on example 3.12.

Solution: By inspection, the mode lies in the second class, where $L = 24.5$, $f_{mod} = 17$, $f_1 = 7$,
$f_2 = 16$ and $w = 10$. Using the formula, the mode is:

$$\text{Mode} = L + \frac{\Delta_1}{\Delta_1 + \Delta_2} \times w = 24.5 + \frac{17 - 7}{(17 - 7) + (17 - 16)} \times 10 = 33.59$$

Merits of mode

• The mode is not affected by extreme values.
• The size of the non-modal observations can change without changing the mode.
• It can be computed for all levels of data, i.e., ratio, interval, ordinal or nominal.
Demerits of mode
• It does not take every value into consideration.
• It may not exist in a series and, if it exists, it may not be unique.
3.4. The Relationship of the Mean, Median and Mode
Comparing the Mean, Median, and the Mode
• If the data are skewed, avoid the mean.
• If there is a large gap in the data around the middle, avoid the median.
• A measure is resistant if its value is not affected by an outlier or an extreme data value.
• The mean is not a resistant measure of central tendency because it is influenced by extreme data
values or outliers.
• The median is resistant to the influence of extreme data values or outliers: its value does
not respond strongly to changes in a few extreme data values, regardless of how large the
changes may be.
• The mode has an advantage over both the mean and the median when the data are categorical
(nominal), since the mean and median cannot be calculated for this type of data. Also, the mode
usually indicates the location within a large distribution where the data values are
concentrated. However, the mode cannot always be calculated: if all the data values in a
distribution are different, the distribution is non-modal.
• In a symmetrical distribution, the mean, median and mode coincide, that is,
mean = median = mode. For a moderately asymmetrical (non-symmetrical) distribution, however,
the mean and mode lie at the two ends with the median between them, and they satisfy the
following important empirical relationship:
(Mean – Mode) = 3(Mean – Median).
Example 3.15
In a moderately asymmetrical distribution, the mean and the mode are 30 and 42 respectively. What is
the median of the distribution?
Solution:
Rearranging the empirical relationship gives Median = (2 × Mean + Mode)/3 = (2 × 30 + 42)/3 = 34.
Hence the median of the distribution is 34.
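The rearrangement used in Example 3.15 can be written as a one-line helper. This is only a sketch with an illustrative function name, and it applies only where the empirical relationship (Mean – Mode) = 3(Mean – Median) is assumed to hold.

```python
def median_from_mean_and_mode(mean, mode):
    """Solve (mean - mode) = 3 * (mean - median) for the median."""
    return (2 * mean + mode) / 3

print(median_from_mean_and_mode(30, 42))  # 34.0
```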
Which of the Three Measures is the Best?
At this stage, one may ask which of these three measures of central tendency is the best. There is
no simple answer to this question, because the three measures are based on different
concepts. The arithmetic mean is the sum of the values divided by the total number of observations in
the series. The median is the value of the middle observation, and the mode is the value around which
the observations tend to concentrate. As such, the use of a particular measure will largely depend on
the purpose of the study and the nature of the data. For example, when we are interested in knowing
consumers' preferences for different brands of television sets or kinds of advertising, the choice
should go in favor of the mode. The use of the mean or median would not be proper. However, the median
can sometimes be used for qualitative data when such data can be arranged in ascending or descending
order. Let us take another example. Suppose we invite applications for a certain vacancy in our
University, and a large number of candidates apply for the post. We are now interested in knowing
which age or age group has the largest concentration of applicants. Here, obviously, the mode will be
the most appropriate choice. The arithmetic mean may not be appropriate, as it may be influenced by
some extreme values.

3.5 Measures of Non-central Locations


The median is the value of the middle item, which divides the data into two equal parts and is found by
arranging the data in increasing or decreasing order of magnitude. Quantiles are measures
which divide a given set of data into approximately equal subdivisions and are obtained by the same
procedure as the median. They are averages of position (non-central location). The most common are
quartiles, deciles and percentiles.
Quartiles: values which divide the data set into approximately four equal parts, denoted by
$Q_1$, $Q_2$ and $Q_3$. The first quartile ($Q_1$) is also called the lower quartile and the third
quartile ($Q_3$) is the upper quartile. The second quartile ($Q_2$) is the median.
• Quartiles for individual series:

Let $x_1, x_2, \ldots, x_n$ be $n$ ordered observations. The $i$th quartile ($Q_i$) is the value of the
item at the $[i(n+1)/4]$th position, $i = 1, 2, 3$.

That is, after arranging the data in ascending order, $Q_1$, $Q_2$ and $Q_3$ are obtained by:

$Q_1 = \left(\frac{n+1}{4}\right)^{th}$ item, $Q_2 = \left(\frac{2(n+1)}{4}\right)^{th}$ item and $Q_3 = \left(\frac{3(n+1)}{4}\right)^{th}$ item.

• Quartiles for discrete data arranged in a frequency distribution: In this case also, we follow the
same procedure as for the median. That is, we construct the less-than cumulative frequency
distribution and apply the formula of quartiles for individual series.

• Quartiles in continuous data: For continuous data, use the following formula:

$$Q_i = L + \frac{w}{f_{Q_i}}\left(\frac{in}{4} - CF\right)$$

where $i = 1, 2, 3$, and $L$, $w$, $f_{Q_i}$ and $CF$ are defined in the same way as for the median:
$L$ is the lower class boundary of the quartile class, $w$ its width, $f_{Q_i}$ its frequency and $CF$
the cumulative frequency of the class preceding it.

i.e. $Q_1 = L + \frac{w}{f_{Q_1}}\left(\frac{n}{4} - CF\right)$, $Q_2 = L + \frac{w}{f_{Q_2}}\left(\frac{2n}{4} - CF\right)$, $Q_3 = L + \frac{w}{f_{Q_3}}\left(\frac{3n}{4} - CF\right)$

The quartile class is the one containing the $(i \times n/4)$th value. That is, the class with the
smallest cumulative frequency greater than or equal to $i \times n/4$ is the class of the $i$th quartile.

Deciles: values dividing the data into approximately ten equal parts, denoted by $D_1, D_2, \ldots, D_9$.
• Deciles for individual series:

Let $x_1, x_2, \ldots, x_n$ be $n$ ordered observations. The $i$th decile ($D_i$) is the value of the
item at the $[i(n+1)/10]$th position, $i = 1, 2, \ldots, 9$.

That is, after arranging the data in ascending order, $D_1, D_2, \ldots, D_9$ are obtained by:

$D_1 = \left(\frac{n+1}{10}\right)^{th}$ item, $D_2 = \left(\frac{2(n+1)}{10}\right)^{th}$ item, ..., and $D_9 = \left(\frac{9(n+1)}{10}\right)^{th}$ item.

• Deciles for discrete data arranged in a frequency distribution: In this case also, we follow the
same procedure as for the median. That is, we construct the less-than cumulative frequency
distribution and apply the formula of deciles for individual series.

• Deciles for continuous data: Apply the following formula and follow the procedure for quartiles
of continuous data.

$$D_i = L + \frac{w}{f_{D_i}}\left(\frac{in}{10} - CF\right), \quad i = 1, 2, \ldots, 9$$

The symbols are defined in the same way as for quartiles of continuous data.

Percentiles: values which divide the data into approximately one hundred equal parts, denoted by
$P_1, P_2, \ldots, P_{99}$.
• Percentiles for individual series:

Let $x_1, x_2, \ldots, x_n$ be $n$ ordered observations. The $i$th percentile ($P_i$) is the value of the
item at the $[i(n+1)/100]$th position, $i = 1, 2, \ldots, 99$.

That is, after arranging the data in ascending order, $P_1, P_2, \ldots, P_{99}$ are obtained by:

$P_1 = \left(\frac{n+1}{100}\right)^{th}$ item, $P_2 = \left(\frac{2(n+1)}{100}\right)^{th}$ item, ..., and $P_{99} = \left(\frac{99(n+1)}{100}\right)^{th}$ item.

• Percentiles for discrete data arranged in a frequency distribution: In this case also, we follow
the same procedure as for the median. That is, we construct the less-than cumulative frequency
distribution and apply the formula of percentiles for individual series.

• Percentiles for continuous data: Apply the following formula:

$$P_i = L + \frac{w}{f_{P_i}}\left(\frac{in}{100} - CF\right), \quad i = 1, 2, \ldots, 99$$

The symbols are defined in the same way as for quartiles or deciles of continuous data.
Interpretations
1. $Q_i$ is the value below which ($i \times 25$) percent of the observations in the series are found
(where $i = 1, 2, 3$). For instance, $Q_3$ is the value below which 75 percent of the observations in
the given series are found.
2. $D_i$ is the value below which ($i \times 10$) percent of the observations in the series are found
(where $i = 1, 2, \ldots, 9$). For instance, $D_4$ is the value below which 40 percent of the values
in the series are found.
3. $P_i$ is the value below which $i$ percent of the total observations are found (where
$i = 1, 2, 3, \ldots, 99$). For example, 60 percent of the observations in a given series are below $P_{60}$.
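Example 3.16: Calculate $Q_1$, $Q_2$, $Q_3$, $D_4$, $D_9$, $P_{40}$ and $P_{90}$ for the following table.
x 10 11 12 13 14 15 16 17 18
f 2 8 25 48 65 40 20 9 2

Solution: The data are already arranged in increasing order, so we only need to
construct the cumulative frequency table before calculating the required values.

x 10 11 12 13 14 15 16 17 18
f 2 8 25 48 65 40 20 9 2
Cum. freq. 2 10 35 83 148 188 208 217 219

The total number of observations is $n = 219$, which is odd. The median is therefore
$\tilde{x} = \left(\frac{n+1}{2}\right)^{th}$ value = 110th value = 14.

$Q_1 = \left(\frac{n+1}{4}\right)^{th}$ value $= \left(\frac{220}{4}\right)^{th}$ value = 55th value = 13

$Q_2 = \left(\frac{2(n+1)}{4}\right)^{th}$ value $= \left(\frac{440}{4}\right)^{th}$ value = 110th value = 14 = $\tilde{x}$

$Q_3 = \left(\frac{3(n+1)}{4}\right)^{th}$ value $= \left(\frac{660}{4}\right)^{th}$ value = 165th value = 15

$D_4 = \left(\frac{4(n+1)}{10}\right)^{th}$ value $= \left(\frac{880}{10}\right)^{th}$ value = 88th value = 14

$D_9 = \left(\frac{9(n+1)}{10}\right)^{th}$ value $= \left(\frac{1980}{10}\right)^{th}$ value = 198th value = 16

$P_{40} = \left(\frac{40(n+1)}{100}\right)^{th}$ value $= \left(\frac{8800}{100}\right)^{th}$ value = 88th value = 14

$P_{90} = \left(\frac{90(n+1)}{100}\right)^{th}$ value $= \left(\frac{19800}{100}\right)^{th}$ value = 198th value = 16.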
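The procedure for a discrete frequency distribution (build the less-than cumulative frequencies, then locate the $i(n+1)/k$th position) can be sketched as follows. The function name `fd_quantile` and its parameters are illustrative; the sketch does not interpolate when a fractional position falls between two different values, which is a simplification.

```python
from bisect import bisect_left
from itertools import accumulate

def fd_quantile(values, freqs, i, parts):
    """Return the i-th quantile of a discrete frequency distribution.

    parts = 4 gives quartiles, 10 deciles and 100 percentiles.
    The position i(n+1)/parts is located in the less-than cumulative
    frequencies; no interpolation between neighbouring values is done.
    """
    cum = list(accumulate(freqs))        # less-than cumulative frequencies
    n = cum[-1]                          # total number of observations
    position = i * (n + 1) / parts
    index = min(bisect_left(cum, position), len(values) - 1)
    return values[index]

x = [10, 11, 12, 13, 14, 15, 16, 17, 18]
f = [2, 8, 25, 48, 65, 40, 20, 9, 2]
print([fd_quantile(x, f, i, 4) for i in (1, 2, 3)])            # [13, 14, 15]
print(fd_quantile(x, f, 4, 10), fd_quantile(x, f, 9, 10))      # 14 16
print(fd_quantile(x, f, 40, 100), fd_quantile(x, f, 90, 100))  # 14 16
```

The 55th, 110th, 165th, 88th and 198th values found here match the positions worked out in Example 3.16.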

Example 3.17: The fecundity (rate of reproduction) of 50 fish of one species is given
below. Based on the data, find $Q_1$, $D_4$ and $P_7$.
Rate of reproduction 1-10 11-20 21-30 31-40 41-50 51-60 61-70 71-80 Total
f 3 11 7 4 15 0 7 3 50
Solution: First find the class boundaries and the cumulative frequency distribution.
Rate of reproduction 1-10 11-20 21-30 31-40 41-50 51-60 61-70 71-80 Total
f 3 11 7 4 15 0 7 3 50
Cumulative freq. 3 14 21 25 40 40 47 50

$Q_1$: the $(n/4)$th value = 12.5th value, which lies in the class 10.5 – 20.5.

$Q_1 = L + \frac{w}{f_{Q_1}}\left(\frac{n}{4} - CF\right) = 10.5 + \frac{10}{11}(12.5 - 3) = 19.1$

$D_4$: the $(4n/10)$th value = 20th value, which lies in the class 20.5 – 30.5.

$D_4 = L + \frac{w}{f_{D_4}}\left(\frac{4n}{10} - CF\right) = 20.5 + \frac{10}{7}(20 - 14) = 29.1$

$P_7$: the $(7n/100)$th value = 3.5th value, which lies in the class 10.5 – 20.5.

$P_7 = L + \frac{w}{f_{P_7}}\left(\frac{7n}{100} - CF\right) = 10.5 + \frac{10}{11}(3.5 - 3) = 11$.
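For grouped continuous data, the quartile, decile and percentile formulas differ only in the divisor (4, 10 or 100), so one function can serve all three. The sketch below assumes the classes are supplied as (lower boundary, upper boundary) pairs; the function name and argument layout are illustrative.

```python
def grouped_quantile(boundaries, freqs, i, parts):
    """Interpolate the i-th quantile of a grouped frequency distribution.

    boundaries: list of (lower, upper) class boundaries, in order.
    parts: 4 for quartiles, 10 for deciles, 100 for percentiles.
    """
    n = sum(freqs)
    target = i * n / parts   # the (i*n/parts)-th value
    cf = 0                   # cumulative frequency of the preceding classes
    for (lower, upper), f in zip(boundaries, freqs):
        if f > 0 and cf + f >= target:
            return lower + (upper - lower) / f * (target - cf)
        cf += f
    raise ValueError("quantile position lies outside the distribution")

# Class boundaries derived from the class limits 1-10, 11-20, ..., 71-80
classes = [(0.5 + 10 * k, 10.5 + 10 * k) for k in range(8)]
freqs = [3, 11, 7, 4, 15, 0, 7, 3]

print(round(grouped_quantile(classes, freqs, 1, 4), 1))    # 19.1
print(round(grouped_quantile(classes, freqs, 4, 10), 1))   # 29.1
print(round(grouped_quantile(classes, freqs, 7, 100), 1))  # 11.0
```

The loop picks the first class whose cumulative frequency reaches the target position, which is the rule stated for locating the quartile class.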

Exercise- 3
1. A random sample of 10 days gave the following information about the total number of people treated
per day at a community hospital emergency room (ER):
40, 8, 26, 20, 30, 27, 60, 13, 119, 40
Find a/ the mean, b/ the mode, c/ the median.
2. The circumferences (distance around) of 100 oak trees selected at random in a state park were
measured. The frequency distribution is given below.

Circumference (inch) No. of trees (f)
10-24 8
25-39 22
40-54 53
55-69 17
Find the a/ mean, b/ mode, c/ median, d/ 1st quartile, e/ P2, f/ D8, g/ D9.
3. The frequency of flowers on 30 plants is given below.

Number of flowers 1 2 3 4 5 6 7 Total
fi 1 4 12 9 2 1 1 30

Find the a/ mode, b/ median, c/ mean, d/ Q2, e/ D6, f/ P25.

4. The nose lengths of forty students in a class (measured to the nearest millimeter) are given below.
40 43 51 49 50 43 41 43 46 45 40 38 45 46 40 38 40 43 50 35 41 32 40 45 62 39 46 45
44 50 52 39 45 35 51 48 45 55 32 45
Make a frequency table with class width five to organize the above data set.
5. The following table contains the number of leaves on 20 plants.

No. of leaves 5 12 15 20 22 Total
fi 3 5 7 3 2 20

Find the a/ mean, b/ mode, c/ median, d/ D7, e/ P80.

6. A random sample of 10 insects had the following weights (in milligrams):
70 120 110 101 88 83 95 98 107 100
Find the a/ mean, b/ median, c/ mode.

7. The following data represent the diagnoses of patients admitted to a hospital.
Cancer, Heart failure, Diabetics, Accident, Gunshot wound, Diabetics,
Gunshot wound, Diabetics, Gunshot wound, Accident, Diabetics, Cancer,
Cancer, Diabetics, TB, Diabetics, TB
Determine the modal diagnosis.

8. The blood types of patients are given below.

O A O AB AB O A B AB A

A O B O O O A B AB B

Determine the modal blood type.

CHAPTER 4: Measures of Dispersion (Variation)
4.1. Introduction
Just as central tendency can be measured by a number in the form of an average, the amount of
variation (dispersion, spread, or scatter) among the values in a data set can also be measured. The
measures of central tendency indicate that the major part of the values in a data set tends to
concentrate around a central value, called the average, with the remaining values scattered on
either side of it. But these measures do not reveal how the values are dispersed
on each side of the central value. The dispersion of values is indicated by the extent to which
they spread over an interval rather than cluster closely around an average.
The term dispersion is generally used in two senses. Firstly, dispersion refers to the variation of the
items among themselves. If all the items of a series have the same value, there is no variation
among them. Secondly, dispersion refers to the variation of the items around an
average. If the differences between the values of the items and the average are large, the dispersion
is high; if the differences are small, the dispersion is low. Thus, dispersion is defined as the
scatter or spread of the individual items in a given series.

After studying this chapter, you should be able to:

• Explain the meaning of measures of dispersion.
• Compare two or more sets of data using relative measures of dispersion.
• Apply the Z-score to find out the relative standing of values.
• Explain measures of skewness and kurtosis.
Objectives of measuring variation:
• To judge the reliability of measures of central tendency.
• To control variability itself.
• To compare two or more groups of numbers in terms of their variability.
• To make further statistical analysis.
4.2. Absolute and Relative Measures of Dispersion
Absolute measures of dispersion: An absolute measure is expressed in the same statistical unit in
which the original data are given, such as kilograms, tonnes, etc. These measures are suitable for
comparing the variability of two distributions whose variables are expressed in the same units and
have the same average size. They are not suitable for comparing the variability of two
distributions whose variables are expressed in different units.

Relative measures of dispersion: A relative measure of dispersion is the ratio of a measure of absolute
dispersion to an appropriate average or to selected items of the data. Relative measures of
dispersion are classified as follows:

• Based on selected items: coefficient of range and coefficient of quartile deviation.
• Based on all items: coefficient of mean deviation and coefficient of standard deviation
(coefficient of variation).
4.3 Types of Measures of Variation
4.3.1 The Range and Relative Range
The range is the simplest measure of dispersion. It is defined as the difference between the largest
and smallest values in a given set of data. Its formula is:

$$R = L - S$$

where R = range, L = largest value in the data set and S = smallest value in the data set.
For a continuous grouped distribution, the range may be obtained as:

• The difference between the upper class limit of the last class and the lower class limit of the
first class, or

• The difference between the largest class mark and the smallest class mark, or

• The difference between the upper class boundary of the last class and the lower class boundary of
the first class.

The range is used to describe quantities such as the maximum change in daily temperature, rainfall,
etc. When the sample size is small, it can be an adequate measure of variation. It is commonly used
in quality control.

The relative measure of range, also called the coefficient of range, is defined as

$$\text{Coefficient of range} = \frac{L - S}{L + S}$$
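Example 4.1: Five students obtained marks in statistics of which the largest is 35 and the smallest
is 15. Find the range and relative range.
Solution: Here, L = 35 and S = 15, so R = L − S = 35 − 15 = 20 and

$$\frac{L - S}{L + S} = \frac{35 - 15}{35 + 15} = 0.4$$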
Example 4.2: Find the range and relative range of the following data.

Size 5-10 11-15 16-20 21-25 26-30

Frequency 4 9 15 30 40

Solution: Here, L = upper class limit of the last class = 30 and S = lower class limit of the
first class = 5.

Range = 30 – 5 = 25

$$\text{Coefficient of range} = \frac{30 - 5}{30 + 5} = 0.7143$$
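Both range examples reduce to the same two-line calculation, sketched here with an illustrative function name. The inputs are simply the largest and smallest values (or the extreme class limits for grouped data).

```python
def range_and_coefficient(largest, smallest):
    """Return (range, coefficient of range) for the given extremes."""
    r = largest - smallest
    return r, r / (largest + smallest)

print(range_and_coefficient(35, 15))  # (20, 0.4)
r, coeff = range_and_coefficient(30, 5)
print(r, round(coeff, 4))             # 25 0.7143
```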
Merits of the Range

• It is well-defined, easy to compute and simple to understand.
• It gives an idea of the variation just from the lowest and the greatest
values of the variable.

Demerits of the Range

• It is not based on all observations of the series.
• It can't be calculated for open-ended distributions.
• It is affected by sampling fluctuation.
• It is affected by extreme values in the series.

4.3.2. The Quartile Deviation and Coefficient of Quartile Deviation


The inter-quartile range and the quartile deviation are other measures of dispersion. The difference
between the upper quartile and the lower quartile is called the inter-quartile range. Symbolically,

$$IQR = Q_3 - Q_1$$

The inter-quartile range covers the dispersion of the middle 50% of the items of the series. The
quartile deviation, also called the semi-inter-quartile range, is half of the difference between the
upper and lower quartiles, that is, half of the inter-quartile range. Its formula is

$$QD = \frac{Q_3 - Q_1}{2}$$

The relative measure of quartile deviation, also called the coefficient of quartile deviation (CQD), is
defined as:

$$CQD = \frac{Q_3 - Q_1}{Q_3 + Q_1}$$

Example 4.3: Find the inter-quartile range, quartile deviation and coefficient of quartile
deviation of the following ages of patients.

18, 59, 24, 42, 21, 23, 24, 32

Solution: First arrange the data in ascending order: 18, 21, 23, 24, 24, 32, 42, 59 (n = 8).

$Q_1 = \left(\frac{n+1}{4}\right)^{th}$ item = (2.25)th item = 2nd item + 0.25(3rd item − 2nd item)
= 21 + 0.25(23 − 21) = 21.5

$Q_3 = \left(\frac{3(n+1)}{4}\right)^{th}$ item = (6.75)th item = 6th item + 0.75(7th item − 6th item)
= 32 + 0.75(42 − 32) = 39.5

Hence IQR = 39.5 − 21.5 = 18, QD = 18/2 = 9 and CQD = 18/(39.5 + 21.5) = 0.295.
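The fractional-position rule used in Example 4.3 (take the whole-numbered item and add the fractional part of the gap to the next item) can be sketched as below. Function names are illustrative.

```python
def value_at_position(sorted_values, position):
    """Value at a 1-based, possibly fractional position, linearly interpolated."""
    whole = int(position)
    frac = position - whole
    if whole < 1:
        return sorted_values[0]
    if whole >= len(sorted_values):
        return sorted_values[-1]
    lower = sorted_values[whole - 1]   # the whole-th item
    upper = sorted_values[whole]       # the next item
    return lower + frac * (upper - lower)

def quartile_measures(values):
    """Return (Q1, Q3, IQR, QD, CQD) using the i(n+1)/4 positions."""
    data = sorted(values)
    n = len(data)
    q1 = value_at_position(data, (n + 1) / 4)
    q3 = value_at_position(data, 3 * (n + 1) / 4)
    iqr = q3 - q1
    return q1, q3, iqr, iqr / 2, iqr / (q3 + q1)

q1, q3, iqr, qd, cqd = quartile_measures([18, 59, 24, 42, 21, 23, 24, 32])
print(q1, q3, iqr, qd, round(cqd, 3))  # 21.5 39.5 18.0 9.0 0.295
```

Python's standard library offers `statistics.quantiles(data, n=4)`, whose default `method='exclusive'` is based on the same (n+1) positions, so it can serve as a cross-check.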

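Example 4.4: Find the inter-quartile range, quartile deviation and coefficient of quartile
deviation from the following data.

Marks 2 3 4 5 6 7 8 9
No. of students 10 11 12 13 5 12 7 5

Solution:
Marks No. of students CF

2 10 10
3 11 21
4 12 33
5 13 46
6 5 51
7 12 63
8 7 70
9 5 75 = N

$Q_1 = \left(\frac{N+1}{4}\right)^{th}$ item = 19th item = 3

$Q_3 = \left(\frac{3(N+1)}{4}\right)^{th}$ item = 57th item = 7

Hence IQR = 7 − 3 = 4, QD = 4/2 = 2 and CQD = (7 − 3)/(7 + 3) = 0.4.

Remark: QD and CQD take into account only the middle 50% of the observations.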

Merits of QD

• It is well-defined, easy to compute and simple to understand.
• It helps in studying the middle 50% of the items in the series.
• It is not affected by extreme items.
• It is useful in measuring variation in open-ended distributions.

Demerits of QD

• It is not based on all the items (it ignores 50% of the items, i.e., the first 25% and the last 25%).
• It is greatly influenced by sampling fluctuations.
• It is not amenable to algebraic manipulation.

4.3.3 The Mean Deviation and Coefficient of Mean Deviation


The mean deviation (MD) measures the average deviation of a set of observations about their central
value, generally the mean or the median, ignoring the plus/minus sign of the deviations. In other
words, the mean deviation of a set of items is defined as the arithmetic mean of the absolute
deviations from a given average. Depending on the type of average used, we have different mean
deviations.
• The mean deviation of a sample of n observations $x_1, x_2, \ldots, x_n$ about an average $A$ is given as

$$MD(A) = \frac{\sum |x_i - A|}{n}$$

where $|x_i - A|$ denotes the absolute value of the deviation. Generally, the arithmetic mean and the
median are used in calculating the mean deviation. So, $A$ stands for the average used for
calculating MD, that is, $A = \bar{x}$, $\tilde{x}$ or $\hat{x}$.

• In the case of grouped data, the formula for MD becomes

$$MD(A) = \frac{\sum f_i |x_i - A|}{n}$$

where $x_i$ is the class mark of the ith class, $f_i$ is the frequency of the ith class and
$n = \sum f_i$.
1. The mean deviation about the arithmetic mean is, therefore, given by

$MD(\bar{x}) = \frac{\sum |x_i - \bar{x}|}{n}$ for ungrouped data, and
$MD(\bar{x}) = \frac{\sum f_i |x_i - \bar{x}|}{n}$ for discrete data arranged in a FD and for grouped frequency distributions,

where $x_i$ is the value or class mark of the ith class, $f_i$ is the frequency of the ith class and
$n = \sum f_i$.

Steps to calculate $MD(\bar{x})$

• Find the arithmetic mean, $\bar{x}$.
• Find the deviation of each reading from $\bar{x}$.
• Find the arithmetic mean of the deviations, ignoring sign.
2. The mean deviation about the median is given by
$MD(\tilde{x}) = \frac{\sum |x_i - \tilde{x}|}{n}$ for ungrouped data, and
$MD(\tilde{x}) = \frac{\sum f_i |x_i - \tilde{x}|}{n}$ for discrete data arranged in a FD and for grouped frequency distributions,

where $x_i$ is the value or class mark of the ith class, $f_i$ is the frequency of the ith class and
$n = \sum f_i$.

Steps to calculate $MD(\tilde{x})$

• Find the median, $\tilde{x}$.
• Find the deviation of each reading from $\tilde{x}$.
• Find the arithmetic mean of the deviations, ignoring sign.
3. The mean deviation about the mode is given by

$MD(\hat{x}) = \frac{\sum |x_i - \hat{x}|}{n}$ for ungrouped data, and
$MD(\hat{x}) = \frac{\sum f_i |x_i - \hat{x}|}{n}$ for discrete data arranged in a FD and for grouped frequency distributions,

where $x_i$ is the value or class mark of the ith class, $f_i$ is the frequency of the ith class and
$n = \sum f_i$.

Steps to calculate $MD(\hat{x})$

• Find the mode, $\hat{x}$.
• Find the deviation of each reading from $\hat{x}$.
• Find the arithmetic mean of the deviations, ignoring sign.

Example 4.5: The following are the numbers of visits made by ten mothers to the local doctor's surgery:
8, 6, 5, 5, 7, 4, 5, 9, 7, 4. Find the mean deviation about the mean, median and mode.
Solution:
First calculate the three averages:
$\bar{x} = 6$, $\tilde{x} = 5.5$, $\hat{x} = 5$
Then take the deviations of each observation from these averages.
$x_i$ 4 4 5 5 5 6 7 7 8 9 Total
$|x_i - \bar{x}|$ 2 2 1 1 1 0 1 1 2 3 14

$|x_i - \tilde{x}|$ 1.5 1.5 0.5 0.5 0.5 0.5 1.5 1.5 2.5 3.5 14

$|x_i - \hat{x}|$ 1 1 0 0 0 1 2 2 3 4 14

Since the data are ungrouped, the mean deviations about the mean, median and mode are:

$MD(\bar{x}) = \frac{\sum |x_i - \bar{x}|}{n} = \frac{14}{10} = 1.4$

$MD(\tilde{x}) = \frac{\sum |x_i - \tilde{x}|}{n} = \frac{14}{10} = 1.4$

$MD(\hat{x}) = \frac{\sum |x_i - \hat{x}|}{n} = \frac{14}{10} = 1.4$.

Merits of MD

• It is well-defined, easy to compute and simple to understand.
• It is based on all observations.
• It is not greatly affected by extreme items.
• It can be calculated using any average.

Demerit of MD

• It does not take into account the signs of the deviations of items from the average.

Remark: Of all the mean deviations taken about different averages or any arbitrary value, the
mean deviation about the median has the smallest value.
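The three steps for each mean deviation (find the average, take the deviations, average their absolute values) can be sketched with the standard `statistics` module. The helper name `mean_deviation` is illustrative; the data are those of Example 4.5.

```python
from statistics import mean, median, mode

def mean_deviation(values, center):
    """Arithmetic mean of the absolute deviations from `center`."""
    return sum(abs(x - center) for x in values) / len(values)

visits = [8, 6, 5, 5, 7, 4, 5, 9, 7, 4]
centers = {"mean": mean(visits), "median": median(visits), "mode": mode(visits)}

for name, center in centers.items():
    # Each line shows the average used and the mean deviation about it
    print(name, center, mean_deviation(visits, center))
```

For this data set all three mean deviations equal 1.4, which is consistent with the remark that no average gives a smaller mean deviation than the median.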

Coefficient of mean deviation (CMD):

The relative measure of mean deviation, also called the coefficient of mean deviation, is obtained by
dividing the mean deviation by the particular average used in computing it. Thus,

• CMD about the arithmetic mean is given by:

$CMD(\bar{x}) = \frac{MD(\bar{x})}{\bar{x}}$, where MD is the mean deviation calculated about the arithmetic mean.
• CMD about the median is given by:

$CMD(\tilde{x}) = \frac{MD(\tilde{x})}{\tilde{x}}$, in which case MD is calculated about the median of the observations.

• CMD about the mode is given by:

$CMD(\hat{x}) = \frac{MD(\hat{x})}{\hat{x}}$, in which case MD is calculated about the mode of the observations.

Example 4.6
Calculate the coefficient of mean deviation about the mean, median and mode for the data in Example
4.5 above.
Solution:
$CMD(\bar{x}) = \frac{MD(\bar{x})}{\bar{x}} = \frac{1.4}{6} = 0.2333$

$CMD(\tilde{x}) = \frac{MD(\tilde{x})}{\tilde{x}} = \frac{1.4}{5.5} = 0.2545$

$CMD(\hat{x}) = \frac{MD(\hat{x})}{\hat{x}} = \frac{1.4}{5} = 0.28$

4.3.4 The Variance, Standard Deviation and Coefficient of Variation


Variance and Standard Deviation

Like the mean deviation, the variance is based on all observations in a set of data. But the
variance is the average of the squared deviations from the mean. Recall that the sum of squared
deviations is a minimum only when taken from the mean. Squared deviations are also easier to
manipulate mathematically than absolute deviations. If we average the squared deviations from the
mean and take the square root of the result (to compensate for the fact that the deviations were
squared), we obtain the standard deviation. This overcomes the limitation of the mean deviation.

Population Variance ($\sigma^2$)
If we divide the sum of squared deviations by the number of values in the population, we get the
population variance. This variance is the "average squared deviation from the mean".
• For ungrouped data

$$\sigma^2 = \frac{\sum (x_i - \mu)^2}{N} = \frac{1}{N}\left[\sum x_i^2 - N\mu^2\right]$$

where $\mu$ is the population arithmetic mean and $N$ is the total number of observations in the
population.

• For discrete data arranged in a FD and for continuous grouped data

$$\sigma^2 = \frac{\sum f_i (x_i - \mu)^2}{N} = \frac{1}{N}\left[\sum f_i x_i^2 - N\mu^2\right]$$

where $\mu$ is the population arithmetic mean, $x_i$ is the value or class mark of the ith class,
$f_i$ is the frequency of the ith class and $N = \sum f_i$.
Sample Variance ($S^2$)
One would expect the sample variance to simply be the population variance with the population
mean replaced by the sample mean. However, one of the major uses of a statistic is to estimate the
corresponding parameter, and that formula has the problem that the estimated value is not, on
average, the same as the parameter. To offset this, the sum of the squares of the deviations is
divided by one less than the sample size.
• For ungrouped data

$$S^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1} = \frac{1}{n - 1}\left[\sum x_i^2 - n\bar{x}^2\right]$$

where $\bar{x}$ is the sample arithmetic mean and $n$ is the total number of observations in the sample.

• For discrete data arranged in a FD and for grouped data, where the values $x_i$ have frequencies
$f_i$ ($i = 1, 2, \ldots, m$)

$$S^2 = \frac{1}{n - 1}\sum_{i=1}^{m} f_i (x_i - \bar{x})^2 = \frac{1}{n - 1}\left[\sum f_i x_i^2 - n\bar{x}^2\right]$$

where $\bar{x}$ is the sample arithmetic mean, $x_i$ is the value or class mark of the ith class,
$f_i$ is the frequency of the ith class and $n = \sum f_i$.

The Standard Deviation

There is a problem with variances. Recall that the deviations were squared, which means that the units
were also squared. To get the units back to those of the original data values, the square root must be
taken.
• Population Standard Deviation ($\sigma$)
$\sigma = \sqrt{\sigma^2}$, where $\sigma^2$ is the population variance.
• Sample Standard Deviation ($S$)
$S = \sqrt{S^2}$, where $S^2$ is the sample variance.

Example 4.7: Find the sample variance and standard deviation for the frequency distribution of heights
(in cm) of students at DMU given below.

Heights in cm 150 152 154 156 158 160 162 164 166

Number of students 28 40 52 100 60 48 32 20 7

Solution: Prepare the following table:

$x_i$ $f_i$ $f_i x_i$ $x_i^2$ $f_i x_i^2$
150 28 4200 22500 630000
152 40 6080 23104 924160
154 52 8008 23716 1233232
156 100 15600 24336 2433600
158 60 9480 24964 1497840
160 48 7680 25600 1228800
162 32 5184 26244 839808
164 20 3280 26896 537920
166 7 1162 27556 192892
Sum 387 60674 224916 9518252

Thus, $n = \sum f_i = 387$, $\sum f_i x_i = 60674$, $\sum f_i x_i^2 = 9518252$ and
$\bar{x} = \frac{60674}{387} \approx 156.78$.

$S^2 = \frac{1}{n - 1}\left[\sum f_i x_i^2 - n\bar{x}^2\right] = \frac{1}{386}\left[9518252 - \frac{(60674)^2}{387}\right] \approx 14.92$

$S = \sqrt{14.92} \approx 3.86$

Example 4.8: Calculate the sample variance and standard deviation of the blood glucose level, in
milligrams per deciliter, for 60 patients shown below.

Class limit 55 – 63 64 – 72 73 – 81 82 – 90 91 – 99 100 – 108 109 – 117

Frequency 9 5 12 17 7 6 4

Solution: In a continuous F.D., $x_i$ is the class mark representing the ith class.

Class limit $x_i$ $f_i$ $f_i x_i$ $f_i x_i^2$
55 – 63 59 9 531 31329
64 – 72 68 5 340 23120
73 – 81 77 12 924 71148
82 – 90 86 17 1462 125732
91 – 99 95 7 665 63175
100 – 108 104 6 624 64896
109 – 117 113 4 452 51076

Total 60 4998 430476

where $n = \sum f_i = 60$ and $\bar{x} = \frac{\sum f_i x_i}{n} = \frac{4998}{60} = 83.3$, so that

$S^2 = \frac{1}{n - 1}\left[\sum f_i x_i^2 - n\bar{x}^2\right] = \frac{1}{59}\left[430476 - 60(83.3)^2\right] = 239.71$

$S = \sqrt{239.71} = 15.48$
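The shortcut formula used in Examples 4.7 and 4.8 can be sketched as a small function that takes values (or class marks) with their frequencies. The function name is illustrative, and $n\bar{x}^2$ is computed as $(\sum f_i x_i)^2 / n$ to avoid rounding the mean before squaring it.

```python
from math import sqrt

def fd_sample_variance(values, freqs):
    """Sample variance of a frequency distribution via the shortcut formula."""
    n = sum(freqs)
    sum_fx = sum(f * x for x, f in zip(values, freqs))
    sum_fx2 = sum(f * x * x for x, f in zip(values, freqs))
    # n * mean**2 equals sum_fx**2 / n; divide by n - 1 for a sample
    return (sum_fx2 - sum_fx ** 2 / n) / (n - 1)

class_marks = [59, 68, 77, 86, 95, 104, 113]
freqs = [9, 5, 12, 17, 7, 6, 4]
variance = fd_sample_variance(class_marks, freqs)
print(round(variance, 2), round(sqrt(variance), 2))  # 239.71 15.48
```

Rounding the mean first (for instance to two decimals) and then squaring it can shift the result noticeably when the values are large relative to their spread, as with the heights in Example 4.7.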

Properties of Variance & Standard Deviation

1. If a constant is added to (or subtracted from) all the values, the variance remains the same; i.e.,
for any constant k, $V(x_i \pm k) = V(x_i)$.

Example 4.9: Consider the 6 sample values $x_i$: 54, 52, 53, 50, 51 and 52.

The sample variance is $V(x_i) = 2$. Now, subtract 50 from each value to get

$y_i$: 4, 2, 3, 0, 1, 2; the variance of this new series is also 2, i.e. $V(x) = V(y) = 2$.

2. If every value is multiplied by a non-zero constant k, the standard deviation is
multiplied by $|k|$ and the variance is multiplied by $k^2$; i.e., $V(kx_i) = k^2 V(x_i)$.
3. Both the variance and the standard deviation give more weight to extreme values and less to
those which are near the mean.
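The first two properties can be checked numerically with the standard library's `statistics.variance` and `statistics.stdev`, which compute the sample (n − 1) versions. The multiplier k = −3 below is an arbitrary illustrative choice.

```python
from math import isclose
from statistics import stdev, variance

x = [54, 52, 53, 50, 51, 52]
k = -3

shifted = [v - 50 for v in x]   # subtract a constant from every value
scaled = [k * v for v in x]     # multiply every value by k

print(variance(x), variance(shifted))   # both equal 2: shifting changes nothing
print(variance(scaled))                 # k**2 times the original variance
print(isclose(stdev(scaled), abs(k) * stdev(x)))  # standard deviation scales by |k|
```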

Coefficient of Variation
The standard deviation is an absolute measure of dispersion. The corresponding relative measure is
known as the coefficient of variation (CV).
Although the standard deviation expresses the variation in the same unit as the original data, it
cannot be the sole basis for comparing two distributions. For
instance, if we have a standard deviation of 10 and a mean of 5, the values vary by an amount twice as
large as the mean itself. If, on the other hand, we have a standard deviation of 10 and a mean of 5000,
the variation relative to the mean is insignificant. Therefore, we cannot judge the dispersion of a
set of data until we know the standard deviation, the mean, and how the standard deviation compares
with the mean.
The coefficient of variation is used where we want to compare the variability of two or
more different series. It is the ratio of the standard deviation to the arithmetic
mean, usually expressed in percent.

$$CV = \frac{\text{Standard deviation}}{\text{Mean}} \times 100\%$$

For population data:

$$CV = \frac{\sigma}{\mu} \times 100\%$$

where $\sigma$ is the population standard deviation and $\mu$ is the population mean.

For sample data:

$$CV = \frac{S}{\bar{x}} \times 100\%$$

where $S$ is the sample standard deviation and $\bar{x}$ is the sample mean.

Remark: A distribution with a smaller coefficient of variation is said to be less variable, or more
consistent, more uniform or more homogeneous.
Example 4.10: One patient's blood pressure, measured daily over several weeks, averaged 182 with a
standard deviation of 12.6, while that of another patient averaged 124 with a standard deviation of 9.4.
Which patient's blood pressure is relatively more variable?

Solution:

Given: $S_1 = 12.6$, $\bar{x}_1 = 182$, $S_2 = 9.4$, $\bar{x}_2 = 124$

$CV_1 = \frac{S_1}{\bar{x}_1} \times 100\% = \frac{12.6}{182} \times 100\% = 6.923\%$

$CV_2 = \frac{S_2}{\bar{x}_2} \times 100\% = \frac{9.4}{124} \times 100\% = 7.58\%$

The blood pressure of the second patient is relatively more variable.
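The comparison in Example 4.10 can be sketched in a few lines. The function name is illustrative, and the patient labels are only for display.

```python
def coefficient_of_variation(sd, mean):
    """Standard deviation as a percentage of the mean."""
    return sd / mean * 100

patients = {"patient 1": (12.6, 182), "patient 2": (9.4, 124)}
cvs = {name: coefficient_of_variation(sd, m) for name, (sd, m) in patients.items()}

for name, cv in cvs.items():
    print(name, round(cv, 2))

# The series with the larger CV is relatively more variable
print(max(cvs, key=cvs.get))  # patient 2
```

Note that the first patient has the larger standard deviation but the smaller CV, which is why the absolute measure alone would give the opposite ranking.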

4.4 Standard Scores (Z-Scores)


A standard score for a value in a data set is obtained by subtracting the mean of the data set from
the value and dividing the result by the standard deviation of the data set. Basically, the standard
score (z-score) tells us how many standard deviations a specific value is above or below the mean
of the data set. That is, the z-score is the number of standard deviations the data value falls above
(positive z-score) or below (negative z-score) the mean of the data set.

Z-score computed from the population: $Z = \frac{x - \mu}{\sigma}$

Z-score computed from the sample: $Z = \frac{x - \bar{x}}{S}$

Example 4.11: What is the Z-score for the value of 14 in the following sample data set?

3 8 6 14 4 12 7 10

Solution:

$\bar{x} = 8$, $S = 3.8173$; thus, $Z = \frac{x - \bar{x}}{S} = \frac{14 - 8}{3.8173} = 1.57$

• The data value 14 is located 1.57 standard deviations above the mean 8, because the z-score is
positive.

Example 4.12: Suppose that a student scored 66 in Statistics and 80 in Biology. A summary of the
scores in the two courses is given below.
Course Average score Standard deviation of the score
Statistics 51 12
Biology 72 16

In which course did the student score better compared with his classmates?
Solution:
Z-score of the student in Statistics: $Z = \frac{x - \bar{x}}{S} = \frac{66 - 51}{12} = 1.25$

Z-score of the student in Biology: $Z = \frac{x - \bar{x}}{S} = \frac{80 - 72}{16} = 0.50$

From these two standard scores, we can conclude that the student scored better in Statistics,
relative to his classmates, than in Biology.
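Examples 4.11 and 4.12 can both be checked with one helper. The function name `z_score` is illustrative; `statistics.stdev` gives the sample (n − 1) standard deviation used in Example 4.11.

```python
from statistics import mean, stdev

def z_score(value, center, spread):
    """Number of standard deviations `value` lies above (+) or below (-) `center`."""
    return (value - center) / spread

# Example 4.11: z-score of 14 within its own sample
data = [3, 8, 6, 14, 4, 12, 7, 10]
print(round(z_score(14, mean(data), stdev(data)), 2))  # 1.57

# Example 4.12: compare relative standing across two courses
z_statistics = z_score(66, 51, 12)
z_biology = z_score(80, 72, 16)
print(z_statistics, z_biology)  # 1.25 0.5
```

Although the raw Biology score (80) is higher, the larger z-score in Statistics shows the better standing relative to classmates.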

4.5 Moments, Skewness and Kurtosis


The measures of central tendency and variation discussed in the previous sections do not reveal the
entire story about a frequency distribution. Two distributions may have the same mean and standard
deviation but differ in shape. The further description of their characteristics that is needed
is provided by measures of skewness and kurtosis.

4.5.1 Moments
Moments are statistical tools used in statistical investigation. The moments of a distribution are the
arithmetic means of the various powers of the deviations of items from some number. In this course, we
shall use them in the study of the skewness and kurtosis of statistical distributions.

Moments about the origin

$$M_r = \frac{\sum x_i^r}{n}$$

where $r = 1, 2, 3, \ldots$

Moments about the origin for grouped and for ungrouped frequency distributions are

$$M_r = \frac{\sum f_i x_i^r}{n}$$

where $f_i$ is the frequency of $x_i$, and $x_i$ is the midpoint in the case of a grouped frequency
distribution or the class value in the case of an ungrouped frequency distribution.

Note that: $M_1 = \bar{x}$.

Moments about the Mean (Central Moments)

$$M'_r = \frac{\sum (x_i - \bar{x})^r}{n}$$

Moments about the mean for grouped and for ungrouped frequency distributions are

$$M'_r = \frac{\sum f_i (x_i - \bar{x})^r}{n}$$

where $f_i$ is the frequency of $x_i$, and $x_i$ is the midpoint in the case of a grouped frequency
distribution or the class value in the case of an ungrouped frequency distribution.

Note that: $M'_1 = 0$, and $M'_2 = S^2$ if it is assumed that $n - 1 \approx n$.

Moments about any arbitrary constant

The $r$th moment about an arbitrary constant $A$ is $\frac{\sum (x_i - A)^r}{n}$, and for grouped and
for ungrouped frequency distributions it is $\frac{\sum f_i (x_i - A)^r}{n}$.

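Example 4.13: Find the first four moments about the mean for the following individual
series.

$x_i$: 3 6 8 10 18

Solution: $n = 5$, $\bar{x} = \frac{\sum x_i}{n} = \frac{45}{5} = 9$

S.No $x_i$ $(x_i - \bar{x})$ $(x_i - \bar{x})^2$ $(x_i - \bar{x})^3$ $(x_i - \bar{x})^4$

1 3 -6 36 -216 1296
2 6 -3 9 -27 81
3 8 -1 1 -1 1
4 10 1 1 1 1
5 18 9 81 729 6561
Total 45 0 128 486 7940

Thus,

$M'_1 = \frac{\sum (x_i - \bar{x})}{n} = \frac{0}{5} = 0$, $M'_2 = \frac{\sum (x_i - \bar{x})^2}{n} = \frac{128}{5} = 25.6$,

$M'_3 = \frac{\sum (x_i - \bar{x})^3}{n} = \frac{486}{5} = 97.2$, $M'_4 = \frac{\sum (x_i - \bar{x})^4}{n} = \frac{7940}{5} = 1588$.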
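The column totals of Example 4.13 and the resulting moments can be reproduced with a short sketch. The function name `central_moment` is illustrative; it divides by n, as in the definition of moments about the mean.

```python
def central_moment(values, r):
    """r-th moment about the mean: average of (x - mean)**r over n items."""
    n = len(values)
    m = sum(values) / n
    return sum((x - m) ** r for x in values) / n

series = [3, 6, 8, 10, 18]
for r in range(1, 5):
    print(r, central_moment(series, r))
```

The first central moment is always zero because positive and negative deviations from the mean cancel, while the second is the variance computed with divisor n.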

4.5.2 Skewness
Skewness refers to a lack of symmetry (or departure from symmetry) in a distribution.

• A skewed frequency distribution is one that is not symmetrical.
• Skewness is concerned with the shape of the curve, not its size.

A distribution is said to be symmetrical when the values are uniformly distributed around the mean
(the distribution of the data below the mean mirrors the distribution above the mean). In a symmetrical
distribution, the mean, median and mode coincide (i.e., mean = median = mode).
Positively skewed distribution: if the mean is greater than the mode, skewness is said to be
positive. In a positively skewed distribution the mean is greater than the mode and the median lies
between the mode and the mean. The mean is pulled towards the high-valued items (that is, to the right).
A positively skewed distribution contains some values that are much larger than the majority of the
other observations.
Negatively skewed distribution: if the mode is greater than the mean, skewness is said to be
negative. In a negatively skewed distribution the mode is greater than the mean and the median lies
between the mean and the mode. The mean is pulled towards the low-valued items (that is, to the left).
A negatively skewed distribution contains some values that are much smaller than the majority of the
other observations.

Note that: In moderately skewed distributions the averages have the following relationship.

(Mean – Mode) = 3(Mean - Median).

How to check the presence of skewness in a distribution?

Skewness is present in the data if:

i) the graph is not symmetrical,
ii) the mean, median and mode do not coincide,
iii) the sum of the positive and negative deviations from the median is not zero,
iv) the frequencies are not similarly distributed on either side of the mode.

Measures of skewness

A measure of skewness gives a numerical expression for the degree and the direction of asymmetry in a
distribution. It gives information about the shape of the distribution and the degree of variation on
either side of the central value. The three most commonly used measures of skewness are Pearson's
coefficient of skewness, Bowley's coefficient of skewness and the coefficient of skewness based on moments.

1. Pearson's coefficient of skewness (Pearsonian coefficient of skewness)

The skewness of the distribution can be measured by Pearson's coefficient of skewness ($S_k$), for
which the formula is:

$S_k = \frac{\text{Mean} - \text{Mode}}{\text{Standard deviation}}$

For moderately skewed distributions, where (Mean - Mode) = 3(Mean - Median), this can also be written as

$S_k = \frac{3(\text{Mean} - \text{Median})}{\text{Standard deviation}}$

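2. Bowley's Coefficient of Skewness
Bowley's coefficient of skewness is based on quartiles. The formula for calculating the coefficient of
skewness is:

$S_k = \frac{Q_3 + Q_1 - 2Q_2}{Q_3 - Q_1}$

where $Q_1$, $Q_2$ (the median) and $Q_3$ are the first, second and third quartiles.

3. Moment Coefficient of Skewness

The moment coefficient of skewness is based on moments. The formula for calculating the coefficient of
skewness is:

$\alpha_3 = \frac{M_3}{M_2^{3/2}}$

where $M_r = \frac{1}{n}\sum (x_i - \bar{x})^r$ is the r-th moment about the mean.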

The shape of the curve is determined by the value of the coefficient of skewness:

• If it is greater than 0, the distribution is positively skewed (skewed to the right), i.e., mode < median < mean.
Smaller observations are more frequent than larger observations; that is, the majority of the
observations have a value below the average.

• If it is equal to 0, the distribution is symmetric, i.e., mean = mode = median.

• If it is less than 0, the distribution is negatively skewed (skewed to the left), i.e., mean < median < mode.
Smaller observations are less frequent than larger observations; that is, the majority of the
observations have a value above the average.

4.5.3 Kurtosis
Kurtosis is a measure of the peakedness of a distribution. The degree of kurtosis of a distribution is
measured relative to the peakedness of a normal curve. If a curve is more peaked than the normal
curve it is called "leptokurtic"; if it is flatter than the normal curve it is called
"platykurtic" or flat-topped. The normal curve itself is known as "mesokurtic".

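Measures of Kurtosis

The moment coefficient of kurtosis:

$\alpha_4 = \frac{M_4}{M_2^2} = \frac{M_4}{\sigma^4}$

where $M_2$ and $M_4$ are the second and fourth moments about the mean.

The peakedness depends on the value of $\alpha_4$:

• $\alpha_4 > 3$: the curve is leptokurtic,
• $\alpha_4 = 3$: the curve is mesokurtic,
• $\alpha_4 < 3$: the curve is platykurtic.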

Example 4.14: Based on the following data:

$M_0 = 1$, $M_1 = -0.6$, $M_2 = 1.6$, $M_3 = -2.4$, $M_4 = 5.8$

a/ Find the coefficient of skewness and discuss the distribution type.
b/ Find the coefficient of kurtosis and discuss the distribution type.
Solution:
a/ $\alpha_3 = \frac{M_3}{M_2^{3/2}} = \frac{-2.4}{(1.6)^{3/2}} = -1.19 < 0$, the distribution is negatively skewed.

b/ $\alpha_4 = \frac{M_4}{M_2^2} = \frac{5.8}{(1.6)^2} = \frac{5.8}{2.56} = 2.27 < 3$, the distribution is platykurtic.

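Example 4.15: Find the coefficient of skewness and the coefficient of kurtosis for Example 4.13 above.
Solution: For the series 3, 6, 8, 10, 18 the moments about the mean are $M_2 = 25.6$, $M_3 = 97.2$ and $M_4 = 1588$.
i) $\alpha_3 = \frac{M_3}{M_2^{3/2}} = \frac{97.2}{(25.6)^{3/2}} = 0.75 > 0$,

the distribution is positively skewed.

ii) $\alpha_4 = \frac{M_4}{M_2^2} = \frac{1588}{(25.6)^2} = \frac{1588}{655.36} = 2.42 < 3$,

the distribution is platykurtic.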
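The moment coefficients of skewness and kurtosis can be checked with a small Python sketch. The helper name `moment_coefficients` is hypothetical, and the code assumes the moment-based definitions (third moment over the second moment to the power 3/2, and fourth moment over the squared second moment), with all moments divided by n.

```python
def moment_coefficients(data):
    """Return (skewness, kurtosis) based on moments about the mean."""
    n = len(data)
    mean = sum(data) / n
    m2 = sum((x - mean) ** 2 for x in data) / n
    m3 = sum((x - mean) ** 3 for x in data) / n
    m4 = sum((x - mean) ** 4 for x in data) / n
    skewness = m3 / m2 ** 1.5   # sign gives the direction of skew
    kurtosis = m4 / m2 ** 2     # compared with 3 (the normal curve)
    return skewness, kurtosis


skewness, kurtosis = moment_coefficients([3, 6, 8, 10, 18])
print(round(skewness, 2), round(kurtosis, 2))
```

A positive first value indicates a right-skewed series, and a second value below 3 indicates a platykurtic one.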

Exercise 4

1. Calculate the mean deviation about the mean, median and mode, and their coefficients and also
variance and standard deviation for the following data.

Size of shoes 3 6 11 2 4 10 5 7 8 9
No. of pairs sold 10 15 25 6 4 3 2 8 9 4

2. Last semester, the students of the Biology department in sections A and B took the Fundamentals of
Biostatistics course. At the end of the semester, the following information was recorded.
                      Section A   Section B
Mean score            79          64
Standard deviation    23          11

Compare the relative dispersion of the two sections' scores using an appropriate measure.
3. A meteorologist interested in the consistency of temperatures in three cities during a given week
collected the following data. The temperatures for the five days of the week in the three cities
were:
City 1: 25, 24, 23, 26, 17
City 2: 22, 21, 24, 22, 20
City 3: 32, 27, 35, 24, 28
Which city has the most consistent temperature, based on these data?
4. The median and the mode of a mesokurtic distribution are 32 and 34 respectively. The 4th moment
about the mean is 243. Compute the Pearsonian coefficient of skewness and identify the type of
skewness. Assume (n-1 = n).
5. If the standard deviation of a symmetric distribution is 10, what should be the value of the fourth
moment so that the distribution is mesokurtic?

CHAPTER 5: ELEMENTARY PROBABILITY
Objectives
After studying this chapter, you should be able to:
• Understand the fundamental concepts of probability.
• Apply the principles of counting techniques to solve real problems.
• Define some basic terms of probability.

5.1 Definition of some probability terms

• Experiment: any process of observation or measurement, or any process which generates well-defined
outcomes.
• Random experiment: an experiment which can be repeated any number of times under the same
conditions, but does not give unique results. The result will be any one of several possible outcomes,
but for each trial the result is not known in advance. A random experiment is also called a trial
and the outcomes are called events.
• Sample space: the collection of all possible outcomes (sample points) of a random experiment.
• Sample point: each element of the sample space is called a sample point.
• Event: a subset of a sample space, i.e., an event is a collection of sample points.
• Impossible event: an event which will never occur.

Example 5.1: In an experiment of tossing a coin three times, S = {HHH, HHT, HTH, HTT, THH,
THT, TTH, TTT}, and each sample point is an equally likely outcome. It is possible to define many
events on this sample space as follows:
A = {HHH} - the event of getting heads only.
B = {HHH, HHT} - the event of getting heads on the first two tosses.
C = {HHT, HTH, THH} - the event of getting exactly two heads.
D = the event of getting the number 9, which is an impossible event.
Example 5.2: If we toss a coin, the sample space of this experiment is S = {head, tail}, where head
and tail are the two faces of the coin. If we are interested in the outcome that a head turns up, then
the event is E = {head}.
Example 5.3
Find the sample space for rolling a fair die.
S = {1, 2, 3, 4, 5, 6}
• Mutually exclusive events: two events A and B are said to be mutually exclusive if there is no
sample point which is common to A and B, i.e., A ∩ B = ø.
• Independent events: two or more events are said to be independent if the occurrence or non-
occurrence of one event does not affect the occurrence or non-occurrence of the others.

• Dependent events: two events are dependent if the occurrence of the first event changes the
probability of occurrence of the second event.
• Complement of an event: the complement of an event A means the non-occurrence of A. It is denoted
by A' or Ac and contains those points of the sample space which do not belong to A.
• Equally likely outcomes: outcomes are equally likely if each outcome in the sample space has the
same chance of occurring.
Example 5.4: In casting a fair die, all possible outcomes are equally likely.
5.2 Counting rules: addition, multiplication, Permutation & Combination rule
In order to calculate probabilities, we have to know
• the number of elements of an event, and
• the number of elements of the sample space.
That is, in order to judge what is probable, we have to know what is possible.
In order to determine the number of outcomes, one can use several rules of counting:
1. Addition rule
2. Multiplication rule
3. Permutation rule
4. Combination rule.
1. The addition rule
Suppose that a procedure, designated by 1, can be done in n1 ways. Assume that a second procedure,
designated by 2, can be done in n2 ways. Suppose furthermore that it is not possible to do both 1 and 2
together. Then the number of ways in which we can do 1 or 2 is n1 + n2.
Example 5.5: Suppose we are planning a trip to some place. If there are 3 bus routes and 2 train routes
that we can take, then there are 3 + 2 = 5 different routes that we can take.
2. Multiplication rule: If an operation consists of k steps and the 1st step can be done in n1 ways, the
2nd step can be done in n2 ways (regardless of how the 1st step was performed), …, and the kth step can be
done in nk ways (regardless of how the preceding steps were performed), then the entire operation can
be performed in n1 · n2 · … · nk ways.
Example 5.6: Suppose that a person has 2 different pairs of trousers and 3 shirts. In how many ways
can he wear his trousers and shirts?
Solution: He can choose the trousers in n1 = 2 ways and the shirts in n2 = 3 ways. Therefore, he can
dress in n1 × n2 = 2 × 3 = 6 possible ways.
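The addition and multiplication rules can be illustrated by listing the possibilities explicitly. The following Python sketch uses made-up labels for the trousers, shirts and routes; only the counts come from the examples.

```python
from itertools import product

trousers = ["trousers 1", "trousers 2"]
shirts = ["shirt 1", "shirt 2", "shirt 3"]

# Multiplication rule: every (trousers, shirt) pair is one way of dressing.
outfits = list(product(trousers, shirts))
print(len(outfits))  # 6 = 2 * 3

# Addition rule: a bus route and a train route cannot be taken together,
# so the numbers of choices simply add.
bus_routes = ["bus A", "bus B", "bus C"]
train_routes = ["train A", "train B"]
routes = bus_routes + train_routes
print(len(routes))  # 5 = 3 + 2
```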

3. Permutation: An arrangement of objects with attention given to the order of arrangement is called a
permutation. The number of permutations of n different objects taken r at a time is obtained by:

$_nP_r = \frac{n!}{(n-r)!}$ for $r = 0, 1, 2, \ldots, n$

Permutation Rule:
a) The number of permutations of n objects taken all together is n!, where n! = n*(n-1)*(n-2)*…*3*2*1, i.e.,

$_nP_n = \frac{n!}{(n-n)!} = \frac{n!}{0!} = n!$

Note: By definition 0! = 1.
b) The arrangement of n distinct objects in a specific order using r objects at a time is called the
permutation of n objects taken r at a time. It is written as $_nP_r$ and the formula is

$_nP_r = \frac{n!}{(n-r)!}$

c) The number of distinct permutations of n objects in which $n_1$ are alike, $n_2$ are alike, …,
$n_k$ are alike is

$\frac{n!}{n_1! \, n_2! \cdots n_k!}$ for $n = n_1 + n_2 + \cdots + n_k$.

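Example 5.7: Find the number of permutations of the letters in the word "statistics".
Solution:
There are 3 s's, 3 t's, 1 a, 2 i's and 1 c, i.e., $n = 10$, $n_1 = 3$, $n_2 = 3$, $n_3 = 1$, $n_4 = 2$, $n_5 = 1$.

Therefore, the number of permutations is $\frac{10!}{3! \, 3! \, 1! \, 2! \, 1!}$ = 50,400.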
Example 5.8
A photographer wants to arrange 3 persons in a row for a photograph. How many different
photographs are possible?
Solution:
Let the 3 persons be Aster (A), Lemma (L) and Yared (Y), so n = 3.
Since n! = 3! = 3*2*1 = 6, there are 6 possible arrangements: ALY, AYL, LAY, LYA, YLA and YAL.

Example 5.9
Suppose we have the letters A, B, C, D, E.
a) How many permutations are there taking all five letters?
b) How many permutations are there taking two letters at a time?
Solution:
a) Here n = 5; there are five distinct objects.
There are 5! = 120 permutations.
b) Here n = 5, r = 2.
There are 5P2 = 5!/(5-2)! = 120/6 = 20 permutations.

Example 5.10
Fifteen Ethiopian athletes entered a race. In how many different ways could prizes for the
first, second and third places be awarded?
Solution
This is 15 objects taken 3 at a time: 15P3 = 15!/(15-3)! = 15*14*13 = 2730 ways.
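The permutation counts in these examples can be verified with Python's standard library. This is a minimal sketch, assuming Python 3.8 or later for `math.perm`; the helper `distinct_arrangements` is an illustrative name that applies the rule for objects that are partly alike.

```python
from collections import Counter
from math import factorial, perm


def distinct_arrangements(word):
    """Distinct orderings of the letters: n! / (n1! * n2! * ... * nk!)."""
    total = factorial(len(word))
    for count in Counter(word).values():
        total //= factorial(count)  # divide out repeats of each letter
    return total


print(factorial(3))                         # 3 persons in a row
print(perm(5, 2))                           # 5 letters taken 2 at a time
print(perm(15, 3))                          # 15 athletes, 3 ranked prizes
print(distinct_arrangements("statistics"))  # word with repeated letters
```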
4. Combination: A selection of objects made without regard to the order in which they occur is
called a combination. The number of combinations of n different objects taking r of them at a time is

$_nC_r = \binom{n}{r} = \frac{n!}{r!(n-r)!}$, for $r = 0, 1, 2, \ldots, n$.

Example 5.11
Given the letters A, B, C, and D, list the permutations and combinations for selecting two letters.
Solution:
Permutations          Combinations
AB  BA  CA  DA        AB  BC
AC  BC  CB  DB        AC  BD
AD  BD  CD  DC        AD  CD
Note that in permutations AB is different from BA, but in combinations AB is the same as BA.
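The distinction between ordered and unordered selections can be seen by generating both lists for the letters A, B, C and D. The sketch below uses `itertools` from the Python standard library; the variable names are illustrative.

```python
from itertools import combinations, permutations
from math import comb, perm

letters = "ABCD"

# Ordered selections of two letters: AB and BA are counted separately.
ordered = ["".join(p) for p in permutations(letters, 2)]

# Unordered selections of two letters: AB and BA are the same selection.
unordered = ["".join(c) for c in combinations(letters, 2)]

print(len(ordered), ordered)
print(len(unordered), unordered)

# The list lengths agree with the counting formulas.
print(perm(4, 2), comb(4, 2))
```

Each combination corresponds to 2! = 2 permutations, which is why the first list is twice as long as the second.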

Example 5.12
In a club containing 7 members, a committee of 3 people is to be formed. In how many ways can the
committee be formed?
Solution: Using $_nC_r = \binom{n}{r} = \frac{n!}{r!(n-r)!}$, we get $_7C_3 = \binom{7}{3} = \frac{7!}{3!(7-3)!} = 35$.

Example 5.13
How many four-digit numbers can be formed with the 10 digits 0, 1, 2, …, 9 if
a/ repetitions are allowed,
b/ repetitions are not allowed, and
c/ the last digit must be zero and repetitions are not allowed?
Solution:
a/ The first digit can be any one of 9 (since 0 is not allowed). The second, third and fourth digits can be
any one of 10. Then 9 · 10 · 10 · 10 = 9000 numbers can be formed.
b/ The first digit can be any one of 9 and the remaining three can be chosen in 9P3 ways.
Thus, 9 · 9P3 = 9 · 504 = 4536 numbers can be formed.
c/ The last digit is fixed as zero. The first digit can be chosen in 9 ways and the next two digits in
8P2 ways. Thus, 9 · 8P2 = 9 · 56 = 504 numbers can be formed.
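Counting problems of this size can also be checked by brute force. The following Python sketch simply enumerates every four-digit number (1000 to 9999) and counts those that satisfy each condition of Example 5.13; the variable names are illustrative.

```python
four_digit = range(1000, 10000)  # the first digit is never 0


def has_distinct_digits(number):
    """True if no digit of the number is repeated."""
    digits = str(number)
    return len(set(digits)) == len(digits)


with_repetition = len(four_digit)
without_repetition = sum(1 for n in four_digit if has_distinct_digits(n))
ending_in_zero = sum(
    1 for n in four_digit if has_distinct_digits(n) and n % 10 == 0
)

print(with_repetition)     # 9 * 10 * 10 * 10
print(without_repetition)  # 9 * 9 * 8 * 7
print(ending_in_zero)      # 9 * 8 * 7
```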

5.3 Probability of an event


Definition: Probability is a numerical measure of the chance or likelihood that a particular event will
occur; it lies in the range from 0 to 1, inclusive. Probability is a building block of inferential statistics.
Definition: Let E be an experiment and let S be a sample space associated with E. With each event A in
S we associate a real number designated by P(A) and called the probability of A.
Generally, probability can be divided into two types:
i) Subjective probability: probability determined on the basis of an individual's own judgment, experience,
information, belief, etc.
ii) Objective probability: the probability of an event in a certain experiment based on experimental
evidence.
Basic approaches to probability
There are three different conceptual approaches to the study of probability theory.
These are:
1. The classical approach.
2. The frequentist approach.
3. The axiomatic approach.
1. Classical approach:
Definition: If there are n equally likely outcomes of an experiment, and event A occurs in k of the
n outcomes, the probability of the event A, denoted by P(A), is defined as

$P(A) = \frac{k}{n} = \frac{n(A)}{n(S)} = \frac{\text{number of outcomes favourable to } A}{\text{total number of outcomes}}$

Note: The classical approach to measuring probability fails in the following situations:
• If the total number of outcomes is infinite, or if it is not possible to enumerate all elements of the sample
space.
• If the outcomes are not equally likely.
Example 5.14
Assuming boys and girls are equally likely, use the classical method to compute the probability of having
a/ two boys and one girl in a three-child family,
b/ three boys in a three-child family,
c/ three girls in a three-child family,
d/ two girls and one boy in a three-child family.

Solution

The sample space of the experiment is S = {BBB, BBG, BGB, BGG, GBB, GBG, GGB, GGG},
so n(S) = 8.

a/ For the event A = "two boys and one girl" = {BBG, BGB, GBB}, we have n(A) = 3. Since the outcomes
are equally likely, the probability of A is P(A) = n(A)/n(S) = 3/8 = 0.375.

b/ For the event B = "three boys" = {BBB}, we have n(B) = 1. Since the outcomes are equally likely, the
probability of B is P(B) = n(B)/n(S) = 1/8.

c/ For the event C = "three girls" = {GGG}, we have n(C) = 1. Since the outcomes are equally likely, the
probability of C is P(C) = n(C)/n(S) = 1/8.

d/ For the event D = "two girls and one boy" = {BGG, GBG, GGB}, we have n(D) = 3. Since the outcomes
are equally likely, the probability of D is P(D) = n(D)/n(S) = 3/8 = 0.375.

Example 5.15: A box of 80 candles consists of 30 defective and 50 non-defective candles. If 10 of
these candles are selected at random without replacement, what is the probability that
a) all will be defective?
b) 6 will be non-defective?
c) all will be non-defective?
Solution
Total number of selections: $n(S) = \binom{80}{10}$
a) Let A be the event that all will be defective.
Number of ways in which A occurs: $n(A) = \binom{30}{10}\binom{50}{0}$
$P(A) = \frac{n(A)}{n(S)} = \binom{30}{10}\binom{50}{0} \Big/ \binom{80}{10} = 0.00001825$
b) Let B be the event that 6 will be non-defective (and hence 4 defective).
Number of ways in which B occurs: $n(B) = \binom{30}{4}\binom{50}{6}$
$P(B) = \frac{n(B)}{n(S)} = \binom{30}{4}\binom{50}{6} \Big/ \binom{80}{10} = 0.2645$
c) Let C be the event that all will be non-defective.
Number of ways in which C occurs: $n(C) = \binom{30}{0}\binom{50}{10}$
$P(C) = \frac{n(C)}{n(S)} = \binom{30}{0}\binom{50}{10} \Big/ \binom{80}{10} = 0.00624$
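The probabilities in Example 5.15 follow one pattern: choose the defective candles, choose the non-defective candles, and divide by the number of ways to choose 10 candles from 80. A minimal Python sketch of that pattern is shown below; the function name `prob_defective` and its parameter names are illustrative, and `math.comb` requires Python 3.8 or later.

```python
from math import comb


def prob_defective(k, defective=30, good=50, drawn=10):
    """Probability of exactly k defective candles among the drawn ones."""
    favourable = comb(defective, k) * comb(good, drawn - k)
    return favourable / comb(defective + good, drawn)


p_all_defective = prob_defective(10)
p_six_good = prob_defective(4)    # 6 non-defective means 4 defective
p_all_good = prob_defective(0)

print(p_all_defective, p_six_good, p_all_good)
```

Summing `prob_defective(k)` over k = 0, 1, …, 10 gives 1, since exactly one of these counts must occur.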
2. The Frequentist Approach (Empirical Probability): This approach to probability is based on
relative frequencies.
Definition: Suppose we repeat a certain experiment n times, let A be an event of the
experiment and let k be the number of times that event A occurs. Then the probability of the
event A happening in the long run is given by:

$P(A) = \frac{k}{n} = \frac{\text{number of times } A \text{ occurs}}{\text{total number of trials}}$

In other words, given a frequency distribution, the probability of an event A being in a given class is

$P(A) = \frac{\text{frequency of the class}}{\text{total frequency}}$

Example 5.16: The National Center for Health Statistics reported that of every 539 deaths in recent
years, 24 resulted from automobile accidents, 182 from cancer, and 353 from other diseases. What
is the probability that a particular death is due to an automobile accident?

Solution:
P(automobile) = deaths due to automobile accidents / total deaths = 24/539 = 0.0445.
The probability that a particular death is due to an automobile accident is 0.0445.
3. The axiomatic approach
Let E be a random experiment and S be a sample space associated with E. With each event A we associate
a real number P(A), called the probability of A, which satisfies the following properties, called the
axioms (or postulates) of probability.
1. 0 ≤ P(A) ≤ 1
2. P(S) = 1, S is the sure/certain event.
3. If A1 and A2 are mutually exclusive events, the probability that one or the other occurs equals the
sum of the two probabilities, i.e., P(A1 ∪ A2) = P(A1) + P(A2).
Similarly, for mutually exclusive events A1, A2, …, An,
P(A1 ∪ A2 ∪ … ∪ An) = P(A1) + P(A2) + … + P(An) = $\sum_{i=1}^{n} P(A_i)$
4. P(A') = 1 - P(A)
5. P(ø) = 0, ø is the impossible event.

5.4 Some probability rules


Rule 1: Let A be an event and A' be the complement of A with respect to a given sample space of an
experiment. Then P(A') = 1 - P(A).
Proof: Let S be the sample space. Then S = A ∪ A', and A and A' are mutually exclusive:
A ∩ A' = ø
P(S) = P(A ∪ A') = P(A') + P(A) and P(S) = 1
1 = P(A') + P(A) => P(A') = 1 - P(A)
Rule 2: Let A and B be events of a sample space S. Then
P(A' ∩ B) = P(B) - P(A ∩ B)
Proof: B = S ∩ B = (A ∪ A') ∩ B = (A ∩ B) ∪ (A' ∩ B)
Since A ∩ B and A' ∩ B are disjoint, P(B) = P(A ∩ B) + P(A' ∩ B), so
P(A' ∩ B) = P(B) - P(A ∩ B)
Rule 3: Suppose A and B are two events of a sample space. Then
P(A ∪ B) = P(A) + P(B) - P(A ∩ B).
Proof:
A ∪ B = A ∪ (A' ∩ B), where A and A' ∩ B are disjoint sets
P(A ∪ B) = P(A) + P(A' ∩ B) . . . .*
But we have already proved that P(A' ∩ B) = P(B) - P(A ∩ B).
Substituting this in equation * gives
P(A ∪ B) = P(A) + P(B) - P(A ∩ B)
Example 5.17: A fair die is thrown twice. Calculate the probability that the sum of the spots on the
faces that turn up is divisible by 2 or 3.
Solution
S = {(1,1),(1,2),(1,3),(1,4),(1,5),(1,6),
(2,1),(2,2),(2,3),(2,4),(2,5),(2,6),
(3,1),(3,2),(3,3),(3,4),(3,5),(3,6),
(4,1),(4,2),(4,3),(4,4),(4,5),(4,6),
(5,1),(5,2),(5,3),(5,4),(5,5),(5,6),
(6,1),(6,2),(6,3),(6,4),(6,5),(6,6)}
This sample space has 6*6 = 36 elements. Let A be the event that the sum of the spots is
divisible by 2 and B be the event that the sum of the spots is divisible by 3. Then
A = {(1,1), (1,3), (1,5), (2,2), (2,4), (2,6), (3,1), (3,3), (3,5), (4,2), (4,4), (4,6), (5,1), (5,3), (5,5),
(6,2), (6,4), (6,6)}
B = {(1,2), (1,5), (2,1), (2,4), (3,3), (3,6), (4,2), (4,5), (5,1), (5,4), (6,3), (6,6)}
A ∩ B = {(1,5), (2,4), (3,3), (4,2), (5,1), (6,6)}
P(A or B) = P(A ∪ B) = P(A) + P(B) - P(A ∩ B) = 18/36 + 12/36 - 6/36 = 24/36 = 2/3
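The addition rule for two events can be checked by enumerating the 36 outcomes of two throws of a die. In the Python sketch below, the helper `prob` and the set names are illustrative; exact fractions are used so that the results match the hand calculation.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely outcomes of throwing a fair die twice.
space = list(product(range(1, 7), repeat=2))

A = {outcome for outcome in space if sum(outcome) % 2 == 0}  # sum divisible by 2
B = {outcome for outcome in space if sum(outcome) % 3 == 0}  # sum divisible by 3


def prob(event):
    """Classical probability: favourable outcomes over total outcomes."""
    return Fraction(len(event), len(space))


print(prob(A), prob(B), prob(A & B))
print(prob(A | B))                       # direct count of A or B
print(prob(A) + prob(B) - prob(A & B))   # addition rule gives the same value
```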

5.5 Conditional Probability and Independence


5.5.1 Conditional Probability
Let A and B be events. The conditional probability of A given B is the probability of occurrence of A
when the event B has already happened.
It is denoted by P(A | B) (also written P(A/B)) and is defined by
P(A | B) = P(A ∩ B)/P(B), if P(B) ≠ 0
The conditional probability of B given A is the probability of occurrence of B when the event A has
already happened. It is denoted by P(B | A) and is defined by
P(B | A) = P(A ∩ B)/P(A), if P(A) ≠ 0
It follows that P(A ∩ B) = P(A) P(B | A) = P(B) P(A | B).
5.5.2 Multiplication Law of Probability
If A and B are events in a sample space S, then
P(A ∩ B) = P(A) P(B | A), P(A) ≠ 0
= P(B) P(A | B), P(B) ≠ 0
where P(B | A) represents the conditional probability of B given A and P(A | B) represents the
conditional probability of A given B.
Note: The multiplication law of probability extends to n events A1, A2, …, An as follows:
P(A1 ∩ A2 ∩ … ∩ An) = P(A1) P(A2 | A1) P(A3 | A1 ∩ A2) … P(An | A1 ∩ A2 ∩ … ∩ An-1)
Example 5.18: A coin is tossed twice. If it is already known that the first toss showed a head, what
is the probability of getting two heads?
Solution:
S = {HH, HT, TH, TT}, A = the first toss shows a head = {HH, HT}, B = two heads occur = {HH}
P(B | A) = P(A ∩ B)/P(A)
But A ∩ B = {HH}, P(A ∩ B) = 1/4 and P(A) = 1/2; therefore, P(B | A) = P(A ∩ B)/P(A) = 1/2.
Example 5.19: Let A and B be events such that P(A ∪ B) = 3/4, P(A ∩ B) = 1/4 and P(A') = 2/3.
Find P(A' | B).
Solution:
P(A') = 2/3 => P(A) = 1 - P(A') = 1 - 2/3 = 1/3. Now, P(A ∪ B) = P(A) + P(B) - P(A ∩ B), so
3/4 = 1/3 + P(B) - 1/4 and P(B) = 3/4 - 1/3 + 1/4 = 2/3.
Therefore, P(A | B) = P(A ∩ B)/P(B) = (1/4)/(2/3) = 3/8 => P(A' | B) = 1 - P(A | B) = 1 - 3/8 = 5/8.
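The chain of steps in Example 5.19 can be reproduced with exact fractions. This is a small Python sketch; the variable names are illustrative and the three starting probabilities are the ones given in the example.

```python
from fractions import Fraction

p_union = Fraction(3, 4)   # P(A or B)
p_both = Fraction(1, 4)    # P(A and B)
p_not_a = Fraction(2, 3)   # P(A')

p_a = 1 - p_not_a                  # complement rule
p_b = p_union - p_a + p_both       # addition rule solved for P(B)
p_a_given_b = p_both / p_b         # definition of conditional probability
p_not_a_given_b = 1 - p_a_given_b  # complement rule, conditional on B

print(p_a, p_b, p_a_given_b, p_not_a_given_b)
```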

5.5.3 Probability of Independent Events
Two events A and B are said to be independent if the occurrence of A has no bearing on the
occurrence of B. That is, knowing that A has occurred gives no information about the occurrence of B.
Formally, two events A and B are said to be independent if P(A ∩ B) = P(A) P(B).
Suppose A and B are independent events with 0 < P(A) < 1 and 0 < P(B) < 1. Then the following statements
are true:
i. A' and B' are independent.
ii. A and B' are independent.
iii. A' and B are independent.
iv. P(B | A) = P(B).
v. P(B | A') = P(B).
Example 5.20: A box contains four black and six white balls. What is the probability of getting two
black balls in drawing one after the other under the following conditions?
a. The first ball drawn is not replaced
b. The first ball drawn is replaced
Solution
Let A = the first ball drawn is black
B = the second ball drawn is black
Required: P(A ∩ B)
a. P(A ∩ B) = P(A) P(B | A) = (4/10)(3/9) = 2/15
b. P(A ∩ B) = P(A) P(B) = (4/10)(4/10) = 16/100 = 4/25.
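The difference between drawing with and without replacement in Example 5.20 can be seen by enumerating all ordered pairs of draws. The Python sketch below labels the ten balls 0 to 9; the helper name `prob_two_black` is illustrative.

```python
from fractions import Fraction
from itertools import permutations, product

balls = ["black"] * 4 + ["white"] * 6  # four black and six white balls


def prob_two_black(draws):
    """Fraction of equally likely (first, second) draws that are both black."""
    draws = list(draws)
    hits = sum(
        1 for first, second in draws
        if balls[first] == balls[second] == "black"
    )
    return Fraction(hits, len(draws))


# Without replacement: the same ball cannot be drawn twice (90 ordered pairs).
without_replacement = prob_two_black(permutations(range(10), 2))

# With replacement: the first ball may be drawn again (100 ordered pairs).
with_replacement = prob_two_black(product(range(10), repeat=2))

print(without_replacement, with_replacement)
```

Only in the second case are the two draws independent, so that the probability equals the product of the two individual probabilities.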
Exercise 5
1. A basket of fruit contains 3 mangoes, 2 bananas and 7 pineapples. If a fruit is chosen at random,
what is the probability that it is either a mango or a banana?
2. Suppose a survey is conducted in which 50 families with three children are asked to disclose the
gender of their children. Based upon the results, it was found that 18 of the families had two boys
and one girl. Estimate the probability of having two boys and one girl in a three-child family.
3. A box contains 6 red, 4 white and 5 black balls. A man draws 4 balls from the box at random. Find
the probability that among the balls drawn there is at least one ball of each color.
4. Excluding leap years from the following calculations, determine the probability that a
randomly selected person has a birthday
a/ on the first day of a month, b/ on the 31st day of a month, c/ in the month of December, d/ on
November 8.

Chapter 6: Probability Distribution
The purpose of this unit is to introduce you to the concept of random variables and their probability
distributions. In a probability distribution, the variables are distributed according to some definite
probability function. In the previous unit we discussed the concept of probability. The different
rules of probability and frequency distributions were also discussed. In this unit we use this
information to understand discrete and continuous probability distributions. Moreover, the concept
of mathematical expectation is discussed.

After completing this unit you will be able to:

• define the term random variable
• understand discrete and continuous random variables
• define the term probability distribution
• differentiate between discrete and continuous probability distributions
• compute the expected value of a random variable

6.1 The Concept of Random Variables


Before introducing a probability distribution, we have to define what is meant by a random variable,
and provide some brief demonstrations based on finite sample spaces.

Definition: A variable whose values are determined by chance with associated probabilities is called a
random variable. It is a quantity in which different observations can assume different values.

In any experiment of chance, the outcomes occur randomly. For example, the total score when a pair
of dice is rolled, the number of heads when a coin is tossed several times, annual household income,
and so on are examples of random variables (or stochastic variables).

Random variables are usually denoted by capital letters X, Y, Z, etc., while the values taken by them
are denoted by lower case letters x, y, z, etc. Thus, P(x1 ≤ X ≤ x2) is the probability that the random
variable X takes values between x1 and x2, both inclusive. A random variable can be discrete or
continuous.

6.1.1 Discrete Random Variable
If the random variable X can assume only a particular finite or countably infinite set of values, it is
said to be a discrete random variable. For example, if you throw a die, the outcome X is a random
variable, which can assume only the values 1, 2, 3, 4, 5 and 6.

Example 6.1: Consider an experiment of "flipping a fair coin 3 times". List the elements of the
sample space that are assumed to be equally likely (as this is what is meant by a fair or balanced coin)
and the corresponding values x of the random variable X, the number of heads observed.

Solution: If H stands for heads and T for tails, then the sample space corresponding to this
experiment is S = {HHH, HHT, HTH, HTT, THH, THT, TTH, TTT}.

Since X = the number of heads observed, the results are shown in the following table:

Element of sample space    Probability    X
HHH                        1/8            3
HHT                        1/8            2
HTH                        1/8            2
HTT                        1/8            1
THH                        1/8            2
THT                        1/8            1
TTH                        1/8            1
TTT                        1/8            0

Thus, we can write X(HHH) = 3, X(HHT) = 2, …, X(TTT) = 0, and P(X = 3) = 1/8 = the probability
that the random variable X is 3, P(X = 2) = 3/8, and P(X = 0) = 1/8.

Note that the possible values of X are: xi = 0, 1, 2, 3.

6.1.2 Continuous Random Variable


A random variable X is said to be continuous if it can take all possible values (integral as well as
fractional) between certain limits. Continuous random variables occur when we deal with quantities
that are measured on a continuous scale. For instance, the life length of an electric bulb, the speed of a
car, weights, heights, and the like are continuous. In such cases, probabilities are associated with
intervals or regions of a continuous random variable, and not with individual points.

Example 6.2:
- The height of an individual
- The distance between Debre Markos and Addis Ababa
- The weight of an individual, etc.

6.2 Probability Distribution


A probability distribution shows the possible outcomes of an experiment and the probability of each of
these outcomes. That is, a probability distribution is a complete list of all possible values of a random
variable and their corresponding probabilities.

A formula giving the probability of the different values of the random variable X is called:

• the probability mass function (pmf), usually denoted by p(x), when X is discrete. If X is a
discrete random variable taking at most a countably infinite number of values x1, x2, …, then P(xi)
= P(X = xi), i = 1, 2, …, is called the probability mass function of the random variable X. The set of
ordered pairs {xi, P(xi)}, i = 1, 2, …, gives the probability distribution of the random variable X.
The numbers P(xi), i = 1, 2, …, must satisfy the following conditions.
i) P(xi) ≥ 0
ii) $\sum_{i=1}^{\infty} P(X = x_i) = 1$

• the probability density function (pdf), usually denoted by f(x), when X is continuous. A
random variable X is said to be a continuous random variable if there is a non-negative function
f such that
$F(x) = P(X \le x) = \int_{-\infty}^{x} f(t)\,dt$

The function f is called the probability density function of X, and it satisfies the following conditions.
i) f(x) ≥ 0 for all x, -∞ < x < ∞
ii) $\int_{-\infty}^{\infty} f(x)\,dx = 1$

Discrete probability distribution


A discrete probability distribution is a distribution whose random variable is discrete. It describes a
finite or countably infinite set of possible occurrences, for discrete "count data".

Example 6.3: Consider the possible outcomes for the random experiment of tossing three coins
together once.

Sample space, S = {HHH, THH, HTH, HHT, TTH, THT, HTT, TTT}

Let X be the number of heads that turn up when three coins are tossed. The possible values of X are
0, 1, 2 and 3.

P(X = 0) = P(TTT) = 1/8,
P(X = 1) = P(HTT) + P(THT) + P(TTH) = 1/8 + 1/8 + 1/8 = 3/8,
P(X = 2) = P(HHT) + P(HTH) + P(THH) = 1/8 + 1/8 + 1/8 = 3/8,
P(X = 3) = P(HHH) = 1/8.

X           0      1      2      3
P(X = x)    1/8    3/8    3/8    1/8
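The distribution of the number of heads in three tosses can be built directly from the sample space. The Python sketch below enumerates the eight equally likely outcomes and collects them into a probability mass function; the name `pmf` is illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# The 8 equally likely outcomes of tossing three coins.
outcomes = list(product("HT", repeat=3))

# X = number of heads in each outcome; count how often each value occurs.
counts = Counter(outcome.count("H") for outcome in outcomes)

# Probability mass function: P(X = x) for x = 0, 1, 2, 3.
pmf = {x: Fraction(counts[x], len(outcomes)) for x in sorted(counts)}

for x, p in pmf.items():
    print(x, p)

print(sum(pmf.values()))  # the probabilities of a pmf must add up to 1
```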

Continuous probability distribution


A continuous probability distribution is a probability distribution whose random variable is continuous.
It describes an "unbroken" continuum of possible occurrences. The probability of a single value is zero and
the probability of an interval is the area bounded by the curve of the probability density function and the
interval on the x-axis. Let a and b be any two values with a < b. The probability that X assumes a value
that lies between a and b is equal to the area under the curve between a and b; that is,

$P(a \le X \le b) = \int_a^b f(x)\,dx$

The integration from a to b in the case of a continuous variable is analogous to the summation of
probabilities in the discrete case.
Example 6.4: A continuous random variable X has a probability density function given by

$f(x) = \frac{1}{4}x + \frac{1}{2}$, $0 \le x \le 1$.

Find the probability that X lies in the interval between 0 and 1.

Solution: $\int_0^1 \left(\frac{1}{4}x + \frac{1}{2}\right) dx = \left[\frac{1}{8}x^2 + \frac{1}{2}x\right]_0^1 = \frac{1}{8} + \frac{1}{2} = \frac{5}{8}$
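The area under a density between two limits can also be approximated numerically. The following Python sketch applies the midpoint rule to f(x) = x/4 + 1/2 on the interval from 0 to 1; the function name `integrate` and the number of steps are illustrative choices, not part of the module.

```python
def integrate(f, a, b, steps=10_000):
    """Approximate the integral of f from a to b using the midpoint rule."""
    width = (b - a) / steps
    return sum(f(a + (i + 0.5) * width) for i in range(steps)) * width


def f(x):
    """The function used in the example: f(x) = x/4 + 1/2."""
    return x / 4 + 1 / 2


area = integrate(f, 0, 1)
print(round(area, 6))
```

For a straight-line function such as this one the midpoint rule is exact up to rounding error, so the result agrees with 5/8 = 0.625.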

6.3 Expectation and Variance of Random variable

The objective of this section is to introduce you to the most common parameters of probability
distributions. There are some summary measures in terms of which we can summarize the behavior of
probability distributions. The most common of these are the average, called the expected value, and the
dispersion about the average, called the variance.

6.3.1 Expectation/Mean
The averaging process, when applied to a random variable, is called expectation. It is denoted by E(X)
or μ and is read as the expected value of X or the mean value of X.

Case 1: For a discrete random variable

Suppose X is a discrete random variable which takes on values in a finite set x1, x2, …, xn with
probabilities P(xi) = P[X = xi], i = 1, 2, …, n. Then the expected value E(X) of the discrete random
variable is given by:

$E(X) = \mu = \sum_{i=1}^{n} x_i P(x_i)$

Case 2: For a continuous random variable

If X is a continuous random variable, then

$E(X) = \int_{-\infty}^{\infty} x f(x)\,dx$, provided $\int_{-\infty}^{\infty} |x| f(x)\,dx < \infty$,

where f(x) is the probability density function of the continuous random variable X.
Case 3: The mathematical expectation of some real function h(x) of a discrete random variable is given
by:

$E[h(X)] = \sum_{i=1}^{n} h(x_i) P(x_i)$

Similarly, if X is a continuous random variable, then

$E[h(X)] = \int_{-\infty}^{\infty} h(x) f(x)\,dx$
Properties of Expectation
If X and Y are random variables and k is a constant, then:
1. E(k) = k
2. E(kX) = k E(X)
3. E(X + k) = E(X) + k
4. E(X + Y) = E(X) + E(Y)
5. E(XY) = E(X) E(Y), if X and Y are independent random variables
6. E(X) ≥ 0, if X ≥ 0.
7. |E(X)| ≤ E(|X|)
8. [E(XY)]² ≤ E(X²) E(Y²).
6.3.2 Variance of a Random Variable
Mean of X = E(X) = μ
Variance of X = $E[(X - \mu)^2]$
= $E(X^2) - [E(X)]^2$
Case 1: Variance for a discrete random variable
If X is a discrete random variable with expected value μ, then the variance of X, denoted by Var(X), is
defined by:
Var(X) = E(X - μ)² = E(X²) - μ²
= $\sum_{i} (x_i - \mu)^2 P(x_i)$
Alternatively, Var(X) = $\sum_{i} x_i^2 P(x_i) - \mu^2$
Case 2: Variance for a continuous random variable
If X is a continuous random variable with density f(x), then

Var(X) = $\int_{-\infty}^{\infty} (x - \mu)^2 f(x)\,dx$

Properties of Variances
• For any random variable X and constant a, it can be shown that
- Var(aX) = a²Var(X)
- Var(X + a) = Var(X) + 0 = Var(X)
• If X and Y are independent random variables, then
Var(X + Y) = Var(X) + Var(Y)

More generally, if X1, X2, …, Xk are independent random variables, then

Var(X1 + X2 + … + Xk) = Var(X1) + Var(X2) + … + Var(Xk)
i.e., $Var\left(\sum_{i=1}^{k} X_i\right) = \sum_{i=1}^{k} Var(X_i)$
• If X and Y are not independent, then
Var(X + Y) = Var(X) + 2Cov(X,Y) + Var(Y)
Var(X - Y) = Var(X) - 2Cov(X,Y) + Var(Y)

Example 6.5: Consider the random variable representing the number of episodes of diarrhea in the
first 2 years of life. Suppose this random variable has a probability mass function as below:

x          0      1      2      3      4      5      6
P(X = x)   0.129  0.264  0.271  0.185  0.095  0.039  0.017

i) What is the expected number of episodes of diarrhea in the first 2 years of life?
ii) Compute the variance and SD for the random variable representing the number of episodes of diarrhea in
the first 2 years of life.
Solution:
i) \[ E(X) = \sum_{i=0}^{6} x_i P(x_i) \]
= 0·P(X=0) + 1·P(X=1) + 2·P(X=2) + … + 6·P(X=6)
= 0(0.129) + 1(0.264) + 2(0.271) + … + 6(0.017)
= 2.038
Thus, on the average a child would be expected to have about 2 episodes of diarrhea in the first 2 years of
life.
ii) E(X) = μ = 2.038
E(X²) = 0²·P(X=0) + 1²·P(X=1) + 2²·P(X=2) + … + 6²·P(X=6)
= 0(0.129) + 1(0.264) + 4(0.271) + … + 36(0.017)
= 6.12
Thus, Var(X) = E(X²) − μ² = 6.12 − (2.038)² = 1.967 and the SD of X is
σ = √1.967 ≈ 1.402.
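The arithmetic in Example 6.5 can be checked with a short script. This is only a sketch: the helper name `expectation` is chosen here for illustration and is not part of the module.

```python
from math import sqrt

# pmf from Example 6.5: number of episodes and their probabilities
values = [0, 1, 2, 3, 4, 5, 6]
probs = [0.129, 0.264, 0.271, 0.185, 0.095, 0.039, 0.017]

def expectation(xs, ps, h=lambda x: x):
    """Return E[h(X)] = sum of h(x_i) * P(x_i) for a discrete pmf."""
    return sum(h(x) * p for x, p in zip(xs, ps))

mean = expectation(values, probs)                          # E(X)
second_moment = expectation(values, probs, lambda x: x**2)  # E(X^2)
variance = second_moment - mean**2                         # Var(X) = E(X^2) - mu^2
sd = sqrt(variance)

print(round(mean, 3), round(second_moment, 2), round(variance, 3), round(sd, 3))
```

Passing a different function as `h` gives E[h(X)] as in Case 3 above.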

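Example 6.6: Compute the variance of a random variable X with pdf f(x) = x²/9 for 0 < x < 3.

V(X) = E(X²) − [E(X)]²

\[ E(X^2) = \int_0^3 x^2 \cdot \frac{x^2}{9}\,dx = \frac{1}{9}\int_0^3 x^4\,dx = \frac{1}{9}\left[\frac{x^5}{5}\right]_0^3 = \frac{27}{5} \]

\[ E(X) = \int_0^3 x \cdot \frac{x^2}{9}\,dx = \frac{1}{9}\left[\frac{x^4}{4}\right]_0^3 = \frac{9}{4} \]

Therefore, V(X) = E(X²) − [E(X)]²
\[ = \frac{27}{5} - \left(\frac{9}{4}\right)^2 = 0.3375 \approx 0.34 \]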
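As a cross-check of Example 6.6, the moments can be computed exactly with rational arithmetic. The sketch below uses the closed form of the integral of x^k · x²/9 over (0, 3); the function name `moment` is an illustrative choice.

```python
from fractions import Fraction

def moment(k):
    """E(X^k) for f(x) = x^2/9 on (0, 3).

    The integral of x^(k+2)/9 from 0 to 3 equals 3^(k+3) / (9*(k+3)).
    """
    return Fraction(3 ** (k + 3), 9 * (k + 3))

total_probability = moment(0)   # should be 1 for a valid pdf
ex = moment(1)                  # E(X)
ex2 = moment(2)                 # E(X^2)
variance = ex2 - ex ** 2        # V(X) = E(X^2) - [E(X)]^2

print(total_probability, ex, ex2, variance, float(variance))
```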
6.4 Common Discrete Probability Distributions
6.4.1 Binomial Distribution
The origin of the binomial distribution is the Bernoulli trial. A Bernoulli trial is an experiment where there
are only two possible outcomes, "success" or "failure". In connection with this trial, a success may be
getting heads with a balanced coin; it may be passing an examination. Whenever we face such an
experiment, we use the binomial distribution under the assumptions stated below. Any experiment can
also be turned into a Bernoulli trial by defining one or more possible results in which we are interested as
"Success" and all other possible results as "Failure". For instance, while rolling a fair die, a "success"
may be defined as "getting an even number on top" and an odd number as "Failure".

Generally, the sample space in a Bernoulli trial is S = {S, F}, S = Success, F = failure.

Notation: Let the probabilities of success and failure be p and q respectively.

P(success) = P(S) = p and P(failure) = P(F) = q, where q = 1 − p.

Definition: Let X be the number of successes in n repeated Bernoulli trials with probability of success p
on each trial; then the probability distribution of the discrete random variable X is called the binomial
distribution. Let p = the probability of success and q = 1 − p = the probability of failure on any given trial.
A binomial random variable with parameters n and p represents the number of successes r in n
independent trials, when each trial has probability of success p.
If X is a binomial random variable, then for r = 0, 1, 2, …, n
\[ P(X = r) = \frac{n!}{r!\,(n-r)!}\, p^r (1-p)^{n-r} = \frac{n!}{r!\,(n-r)!}\, p^r q^{n-r}, \quad \text{where } q = 1 - p \]

A binomial experiment is a probability experiment that satisfies the following assumptions.


1. The experiment consists of n identical trials.
2. Each trial has only one of the two possible mutually exclusive outcomes, success or a failure.
3. The probability of each outcome does not change from trial to trial.
4. The trials are independent.
If X is a binomial random variable with two parameters n and p then
i) E (X) = np
ii) Var (X) = npq.
Example 6.7: Assume that, when a child is born, the probability it is a girl is 1/2 and that the sex of the
child does not depend on the sex of an older sibling.
a) Find the probability distribution for the number of girls in a family with 4 children.
b) Find the mean and the SD of this distribution.
Solution:
Let X be the number of girls born, with possible values 0, 1, 2, 3, 4.
a) P(getting girl) = 1/2 and n = 4, so
\[ P(X = r) = \frac{4!}{r!\,(4-r)!} \left(\frac{1}{2}\right)^{r} \left(\frac{1}{2}\right)^{4-r} \]

r          0      1      2      3      4
P(X = r)   1/16   4/16   6/16   4/16   1/16

b) Mean = np = 4 × 1/2 = 2

Standard deviation (SD) = √(npq) = √(4 × 1/2 × 1/2) = √1 = 1.
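The binomial formula and Example 6.7 can be reproduced with a few lines of Python. This is a minimal sketch; `binomial_pmf` is a name introduced here for illustration, and exact fractions are used so the table entries appear as in the solution.

```python
from fractions import Fraction
from math import comb, sqrt

def binomial_pmf(r, n, p):
    """P(X = r) = C(n, r) * p^r * (1 - p)^(n - r)."""
    return comb(n, r) * p**r * (1 - p)**(n - r)

n, p = 4, Fraction(1, 2)                      # 4 children, P(girl) = 1/2
pmf = {r: binomial_pmf(r, n, p) for r in range(n + 1)}
mean = n * p                                  # E(X) = np
sd = sqrt(n * p * (1 - p))                    # SD = sqrt(npq)

for r, prob in pmf.items():
    print(r, prob)
print("mean =", mean, "SD =", sd)
```

Note that `Fraction` reduces 4/16 to 1/4 and 6/16 to 3/8 when printing.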

6.4.2 Poisson Distribution


Another important discrete probability distribution is the Poisson distribution. It is a discrete
probability distribution which is used in the area of rare events. The Poisson distribution counts the
number of successes in a fixed interval of time or within a specified region.
Examples of random variables that usually obey the Poisson distribution are:
- The number of car accidents in a day.
- The number of telephone calls arriving over an interval of time.
- The number of misprints on a typed page (or a group of pages) of a book.
- The number of natural disasters such as earthquakes.
- The number of suicides reported by a particular city.
- The number of customers entering a post office on a given day.
To apply the Poisson distribution, two conditions must be met:
i) The number of successes that occur in any interval is independent of those that occur in other
non-overlapping intervals.
ii) The probability of a success in an interval is proportional to the size of the interval. In short,
the two important traits of the Poisson distribution are independence and proportionality.
Let X be the number of occurrences in a Poisson process and λ be the actual average number of
occurrences of an event in a unit length of interval. The probability function for the Poisson
distribution is
\[ P(X = x) = \frac{\lambda^{x} e^{-\lambda}}{x!}, \quad x = 0, 1, 2, \ldots \]

Remarks
- The Poisson distribution possesses only one parameter, λ.
- If X has a Poisson distribution with parameter λ, then E(X) = λ and Var(X) = λ,
  i.e. E(X) = Var(X) = λ.
- \[ \sum_{i=0}^{\infty} P(x_i) = 1 \]

Example 6.8: In a small city, 10 accidents took place in a time of 50 days. Find the probability that
there will be a) two accidents in a day and b) three or more accidents in a day.
Solution:
There are 10/50 = 0.2 accidents per day on average.
Let X be the random variable, the number of accidents per day.
X ~ Poisson(λ = 0.2), X = 0, 1, 2, …
a) \[ P(X = 2) = \frac{e^{-0.2}(0.2)^2}{2!} = 0.0164 \]
b) P(X ≥ 3) = P(X = 3) + P(X = 4) + P(X = 5) + ...
= 1 − [P(X = 0) + P(X = 1) + P(X = 2)], since \sum_{i=0}^{\infty} P(x_i) = 1
= 1 − [0.8187 + 0.1637 + 0.0164]
= 0.0012.
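The Poisson probabilities in Example 6.8 can be recomputed directly from the probability function. The sketch below uses only the standard library; `poisson_pmf` is a name introduced here for illustration.

```python
from math import exp, factorial

def poisson_pmf(x, lam):
    """P(X = x) = lam^x * e^(-lam) / x! for x = 0, 1, 2, ..."""
    return lam**x * exp(-lam) / factorial(x)

lam = 10 / 50                                   # 0.2 accidents per day
p_two = poisson_pmf(2, lam)                     # part a)
p_three_or_more = 1 - sum(poisson_pmf(x, lam) for x in range(3))  # part b)

print(round(p_two, 4))
print(p_three_or_more)
```

Without intermediate rounding, part b) comes out at about 0.0011; the 0.0012 in the worked solution results from rounding each of the three terms to four decimal places before subtracting.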

6.5. Common Continuous Probability Distributions


6.5.1 Normal Distributions
It is the most important distribution in describing a continuous random variable and used as an
approximation of other distribution. Many variables in the practical world follow this distribution, and
hence in many ways it is the cornerstone of modern Statistical Theory. It has been noticed that
empirical distributions of various types of observations in natural and social sciences are often very
close to normal distribution. In statistical analysis the distributions of observations is frequently
assumed to be approximately normal. In statistical estimation and testing of hypotheses the normal
distribution plays an important role.

A random variable X has a normal distribution with parameters μ and σ², and it is known as a normal
random variable, iff its pdf is given by:
\[ f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left[-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2\right] = \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-(x-\mu)^2/2\sigma^2} \]
for −∞ < x < ∞, −∞ < μ < ∞ and σ > 0.
The graph of the normal distribution is known as the normal curve, which is bell-shaped:

[Figure: Normal probability curve]

SOME PROPERTIES OF THE NORMAL CURVE


The following are the important properties of the normal curve:

1. The normal curve is "bell-shaped" and symmetrical about the mean. The property of symmetry
can be shown using the pdf as: f(μ + c) = f(μ − c) for any c.
This implies that P(X < μ) = P(X > μ) = 0.5.
Since this is the property of the median, it follows that, for the normal distribution,
Mean = Median = Mode.

2. The height of the normal curve is at its maximum when X = μ (the mean), which means, again,
Mean = Mode = Median. This property can also be verified using the first and second derivative
tests; that is, f′(x) = 0 ⇒ x = μ.
This shows that f may have a maximum or a minimum at x = μ, but using the second derivative
test, \[ f''(\mu) = -\frac{1}{\sigma^{3}\sqrt{2\pi}} < 0, \] we see that the point is a maximum.

Therefore, by properties 1 and 2, we can conclude that the mean, median and mode coincide for the
normal curve.
3. The normal curve is asymptotic to the X-axis.
4. The first and the third quartiles are equidistant from the median,
i.e., Q3 − Q2 = Q2 − Q1, or Q2 = (Q1 + Q3)/2.

5. The Probability that a random variable will have a value between any two points is equal to the
area under the curve between those points.

Standard Normal Distribution

The symmetrical property of the normal distribution provides a means that is helpful in calculating
probabilities, which is also facilitated by transforming any normal distribution with any mean and
variance to the standard normal distribution.

By standardization we mean that the random variable X will be transformed to another random
variable whose mean is 0 and variance is 1. The normal distribution with zero mean and standard
deviation one is known as the standard normal distribution. If X has a normal distribution with mean μ and
standard deviation σ, then the standard normal variable Z is given by

\[ Z = \frac{X - \mu}{\sigma}, \quad \text{for population} \]

\[ Z = \frac{X - \bar{X}}{S}, \quad \text{for sample} \]

Using the properties of expectations, it is now trivial to show that E(Z) = 0 and V(Z) = 1. The pdf of Z
is, thus, given by
\[ f(z) = \frac{1}{\sqrt{2\pi}}\, e^{-z^{2}/2}, \quad -\infty < z < \infty. \]
The entries in Table A of the Appendix are the values of
\[ P(0 \le Z \le z) = \int_{0}^{z} f(t)\,dt. \]

That is, the table gives us the probabilities that a random variable Z having the standard normal
distribution will take on a value on the interval from 0 to z, for z = 0.00, 0.01, 0.02, …, 3.98, and 3.99;
due to the symmetrical property of the normal curve with respect to its mean, it is unnecessary to
extend the table for negative values of Z.

Note that P(Z < 0) = P(Z > 0) = 0.5.

[Figure: Tabulated areas under the standard normal distribution from 0 to z. The marked region is P(0 ≤ Z ≤ z) = \int_{0}^{z} \frac{1}{\sqrt{2\pi}} e^{-t^{2}/2}\,dt.]

Basic Properties of the standard normal Curve:


1. Total area under the standard normal curve is equal to 1.
2. The standard normal curve is asymptotic to x-axis.
3. The standard normal curve is symmetric about 0.
4. Most of the area under the standard normal curve lies between z= -3 and z = 3.
Given a normally distributed random variable X with mean μ and standard deviation σ, the
corresponding standard normal random variable is Z = (X − μ)/σ.

Note: i) P(a < X < b) = P(a ≤ X < b)
= P(a < X ≤ b)
= P(a ≤ X ≤ b)
ii) P(−∞ < Z < ∞) = 1
Example 6.9: Find the probabilities that a random variable having the standard normal distribution will
take on a value:
a) Less than 1.72; b)Less than -0.88;
c) Between 1.30 and 1.75; d) Between -0.25 and 0.45.

Solution: By using the normal table,

a) P(Z < 1.72) = P(Z < 0) + P(0 < Z < 1.72) = 0.5 + 0.4573 = 0.9573.

b) P(Z < −0.88) = P(Z > 0.88) = 0.5 − P(0 < Z < 0.88) = 0.5 − 0.3106 = 0.1894.

c) P(1.30 < Z < 1.75) = P(0 < Z < 1.75) − P(0 < Z < 1.30) = 0.4599 − 0.4032 = 0.0567.

d) P(−0.25 < Z < 0.45) = P(−0.25 < Z < 0) + P(0 < Z < 0.45)
= P(0 < Z < 0.25) + P(0 < Z < 0.45) = 0.0987 + 0.1736 = 0.2723.
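The table look-ups in Example 6.9 can be reproduced numerically. The sketch below assumes the standard identity P(0 ≤ Z ≤ z) = 0.5·erf(z/√2), which is not stated in the module; the function names `table_area` and `cdf` are illustrative.

```python
from math import erf, sqrt

def table_area(z):
    """P(0 <= Z <= z) for z >= 0, the quantity tabulated in Table A."""
    return 0.5 * erf(z / sqrt(2))

def cdf(z):
    """P(Z <= z), built from the table area and the symmetry of the curve."""
    return 0.5 + table_area(z) if z >= 0 else 0.5 - table_area(-z)

a = cdf(1.72)                              # less than 1.72
b = cdf(-0.88)                             # less than -0.88
c = table_area(1.75) - table_area(1.30)    # between 1.30 and 1.75
d = table_area(0.25) + table_area(0.45)    # between -0.25 and 0.45

print(a, b, c, d)
```

The results agree with the worked solution up to the four-decimal rounding of the printed table.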

Application of the Standard Normal Distribution


Let X ~ N(μ, σ²). Suppose that we want to find the probability P(a < X < b).

Since a, b, μ, and σ are known (given), we standardize a, b and X as:

\[ P(a < X < b) = P\left(\frac{a-\mu}{\sigma} < \frac{X-\mu}{\sigma} < \frac{b-\mu}{\sigma}\right) = P(z_1 < Z < z_2), \text{ say.} \]

Now, we need only to get the readings from the Z-table corresponding to z1 and z2 to get the required
probabilities, as we have done in the preceding example.

Also, we can find the following one-sided probabilities:

\[ P(X < b) = P\left(Z < \frac{b-\mu}{\sigma}\right) = P(Z < z_2), \quad \text{and} \quad P(X > a) = P\left(Z > \frac{a-\mu}{\sigma}\right) = P(Z > z_1). \]

We have seen that a Z-value measures the distance between a particular value of X and the mean in
units of standard deviation.

Example 6.10: If X ~ N(μ, σ²), find the probabilities

a) P(μ − σ < X < μ + σ); b) P(μ − 2σ < X < μ + 2σ); c) P(μ − 3σ < X < μ + 3σ).

Solution: As in the case of P(a < X < b), we simply replace a and b.

a) \[ P(\mu-\sigma < X < \mu+\sigma) = P\left(\frac{(\mu-\sigma)-\mu}{\sigma} < Z < \frac{(\mu+\sigma)-\mu}{\sigma}\right) \]
= P(−1 < Z < 1) = 2P(0 < Z < 1) = 2(0.3413) (see Table A)
= 0.6826 or 68.26%.

b) Similarly, P(μ − 2σ < X < μ + 2σ) = P(−2 < Z < 2) = 2P(0 < Z < 2) = 2(0.4772) = 0.9544.

c) P(μ − 3σ < X < μ + 3σ) = P(−3 < Z < 3) = 2P(0 < Z < 3) = 2(0.4987) = 0.9974, or
99.74%.

From which we can tell that:

a) About 68.3% of the area lies between μ − σ and μ + σ (1 standard deviation on either side).
b) About 95.4% lies between μ − 2σ and μ + 2σ (2 standard deviations on either side).
c) About 99.7% lies between μ − 3σ and μ + 3σ (3 standard deviations on either side).

Notation: Z_α denotes the value of Z for which the area to its right is equal to α.

This notation is useful in statistical inference. Note that finding Z_α works like reading a table of
anti-logarithms: we look for the area in the body of the table and read off the corresponding Z value.

Example 6.11: Find a) Z_0.01; b) Z_0.05

Solution: a) Z_0.01 corresponds to an entry of 0.5 − 0.01 = 0.4900.

In Table A, look for the value closest to 0.4900, which is 0.4901, and the Z value for this is Z = 2.33.
Thus, Z_0.01 = 2.33.

b) Again, Z_0.05 corresponds to an entry of 0.5 − 0.05 = 0.4500, which lies exactly between 0.4495 and 0.4505,
corresponding to Z = 1.64 and Z = 1.65. Hence, using interpolation, Z_0.05 = 1.645.
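The reverse table look-up for Z_α can also be done with the inverse cumulative distribution function. This sketch uses `statistics.NormalDist` from the Python standard library (Python 3.8 or later); the helper name `z_alpha` is an illustrative choice.

```python
from statistics import NormalDist

def z_alpha(alpha):
    """Value of Z with area alpha to its right under the standard normal curve."""
    return NormalDist().inv_cdf(1 - alpha)

print(round(z_alpha(0.01), 2))   # compare with Z_0.01 from Table A
print(round(z_alpha(0.05), 3))   # compare with Z_0.05 from Table A
```

The area to the right of Z_α is α, so the area to its left is 1 − α, which is the argument passed to `inv_cdf`.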

Example 6.12: Suppose that X ~ N(165, 9), where X = the breaking strength of cotton fabric. A
sample is defective if X < 162. Find the probability that a randomly chosen fabric will be defective.

Solution: Given that μ = 165 and σ² = 9 (so σ = 3),

\[ P(X < 162) = P\left(\frac{X-\mu}{\sigma} < \frac{162-\mu}{\sigma}\right) = P\left(Z < \frac{162-165}{3}\right) \]

= P(Z < −1) = 0.5 − P(−1 < Z < 0) (since P(Z < 0) = 0.5)

= 0.5 − P(0 < Z < 1) (by symmetry)

= 0.5 − 0.3413 = 0.1587 (table value for Z = 1)

6.5.2 Chi-Square Distribution


The chi-square distribution may be derived from normal distributions. If Xi (i = 1, 2, …, n) are n
independent normal variates with mean μi and variance σi² (i = 1, 2, …, n), then
\[ \chi^{2} = \sum_{i=1}^{n} \left(\frac{X_i - \mu_i}{\sigma_i}\right)^{2} \]
is a chi-square variate with n degrees of freedom. The probability density
function of the χ²-distribution is given by
\[ f(\chi^{2}) = \frac{1}{2^{n/2}\,\Gamma(n/2)}\, (\chi^{2})^{n/2-1}\, e^{-\chi^{2}/2}, \quad 0 < \chi^{2} < \infty, \]
where n is the degree of freedom.


Since the chi-square distribution arises in many important applications, its values have been
extensively tabulated. Table C at the end of this module contains values of χ²_{α,n} for α = 0.05, 0.025,
0.01, 0.005 and n = 1, 2, 3, …, 30, where χ²_{α,n} is such that the area to its right under the chi-square
curve with n degrees of freedom is equal to α. That is, χ²_{α,n} is such that if X is a random variable
having a chi-square distribution with n degrees of freedom, then P(X ≥ χ²_{α,n}) = α. α is known as
the level of significance. When n is greater than 30, the table cannot be used and probabilities related
to chi-square distributions are usually approximated with normal distributions.

[Figure: Chi-square curve with the area α to the right of χ²_{α,n}]

Properties of Chi-square Distribution


1. The exact shape of the distribution depends upon the number of degrees of freedom n. In
general, when n is small, the shape of the curve is skewed to the right and as n gets larger, the
distribution becomes more and more symmetrical.
2. The mean and variance of the distribution are n and 2n respectively.
3. As n → ∞, the distribution approaches a normal distribution.
4. The sum of independent χ² variates is also a χ² variate.
6.5.3 The t-distribution
Let X1, X2, …, Xn be a random sample drawn from a normal distribution having mean μ and
standard deviation σ (unknown but estimated by S, the sample standard deviation).
The statistic
\[ t = \frac{\bar{X} - \mu}{S/\sqrt{n}} \]
has a t-distribution with (n − 1) degrees of freedom, where X̄ is the sample
mean and S is the sample standard deviation.

In view of its importance, the t distribution has been tabulated extensively. Table B at the end
of this module contains values of t_{α,n−1} for α = 0.10, 0.05, 0.025, 0.01, 0.005, etc. and 1,
2, 3, … degrees of freedom, where t_{α,n−1} is such that the area to its right under the curve of
the t distribution with (n − 1) degrees of freedom is equal to α.

Notation: t_α(n − 1) stands for the value of t with (n − 1) degrees of freedom to the right of which the
area is equal to α when reading the tabulated values.

[Figure: Student's t distribution, symmetric about 0, with −t and t marked]

Note: 1. The table does not contain values of t_{α,n−1} for α > 0.50. Since the curve is
symmetrical about t = 0 (like the normal distribution), we have
t_{1−α,n−1} = −t_{α,n−1}.

2. When (n − 1) = 30 or more, probabilities related to the t distribution are usually approximated
with the use of normal distributions.

Example 6.13: For a t-distribution with n = 20, find the t values leaving an area of

a) 0.05 to the right; b) 0.975 to the right;
c) 0.10 to the left; d) half of α = 0.01 on either side.

Solution: Referring to Table B with (n − 1) = 19 df, we have

a) t_0.05 = 1.729;
b) t_0.975 = −t_0.025 = −2.093;
c) −t_0.10 = −1.328;
d) t_{α/2} = t_0.005 = 2.861 and −t_0.005 = −2.861.

Applications of t Distribution

The t distribution has wide applications in Statistics; only some are listed below:

a) Test of population mean (one-sample t-test)

When we are dealing with a random sample of size n < 30 from a normal population and
σ² is unknown, the t distribution with n − 1 degrees of freedom is used to test the hypothesis
that the population mean μ equals a given value (say, μ0), against one of the alternatives μ ≠ μ0,
μ > μ0, or μ < μ0.

Then, we calculate
\[ t = \frac{\bar{X} - \mu_0}{S/\sqrt{n}}, \]
which is to be compared with the table value t_α or t_{α/2} with n − 1 degrees of freedom.

Note: The assumptions underlying Student's t-distribution for such tests are:

a) The parent population from which the sample is drawn is normal.
b) The sample observations are independent; that is, the sample is random.
c) The population standard deviation (σ) is unknown.
d) n is small; that is, n < 30.
Example 6.14: In 16 one-hour test runs, the gasoline consumption of an engine averaged
16.4 gallons with a standard deviation of 2.1 gallons. In order to test the claim that the
average gasoline consumption of this engine is 12.0 gallons per hour, calculate the t value and
t_{α,n−1} for α = 0.05.

Solution: Substituting n = 16, μ0 = 12.0, X̄ = 16.4, and S = 2.1 in the formula, we get

\[ t = \frac{\bar{X} - \mu_0}{S/\sqrt{n}} = \frac{16.4 - 12.0}{2.1/\sqrt{16}} = 8.38; \]
and the table value for n − 1 = 15 is t_{0.05,15} = 1.753.
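The calculation in Example 6.14 can be sketched as below. The function name `one_sample_t` is illustrative, and the critical value 1.753 is simply copied from the worked solution (Table B) rather than computed, since the standard library has no t-distribution quantile function.

```python
from math import sqrt

def one_sample_t(sample_mean, mu0, s, n):
    """t = (sample_mean - mu0) / (s / sqrt(n)), with n - 1 degrees of freedom."""
    return (sample_mean - mu0) / (s / sqrt(n))

t_value = one_sample_t(sample_mean=16.4, mu0=12.0, s=2.1, n=16)
t_table = 1.753                     # t_{0.05, 15} as read from Table B in the text

print(round(t_value, 2))
print("calculated t exceeds table value:", t_value > t_table)
```

Because the calculated t is far larger than the table value, the data are not consistent with an average consumption of 12.0 gallons per hour at α = 0.05 for the one-sided alternative μ > 12.0.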

b) t-Test of the equality of two means (two-sample t-test)


In many problems of applied research, we are interested in hypotheses concerning the
difference between the means of two populations, μ1 − μ2, or the equality of two
means (μ1 − μ2 = 0). In such tests, the following are assumed:

I. The parent populations from which the samples have been drawn are normally distributed;
II. The two population variances are equal, though unknown: σ1² = σ2² = σ².
III. The two samples are random and independent of each other;
IV. The sample sizes are small: n1 and/or n2 are < 30.
c) t-Test of correlation and regression coefficients
In normal regression and correlation analyses, the t distribution is used to test:

a) if the population regression coefficient equals a certain constant (β = β0); and
b) if the population correlation coefficient is significantly different from zero.
Exercise 6

1. From a lot containing 20 items, of which 5 are defective, 4 are chosen at random. Let X be the
number of defectives found.

a) Write down the pmf of X. b) Find the probability distribution of X.

c) Find E(X) and V(X).

2. If X has a pdf of f(x) = 3x², for 0 < x < 1, and 0 elsewhere, find

a) P(X < 0.5); b) E(X) and V(X);

c) a if P(X ≤ a) = 0.05; d) b if P(X < b) = P(X > b).

3. The amount of bread X (in hundreds of kg) that a certain bakery is able to sell in a day is found to be
a continuous random variable with a pdf given as below:

\[ f(x) = \begin{cases} kx, & 0 \le x < 5 \\ k(10 - x), & 5 \le x < 10 \\ 0, & \text{otherwise} \end{cases} \]

a) Find k; b) Find the probability that the amount of bread that will be sold tomorrow is
i) More than 500kg, ii) between 250 and 750 kg;

c) Find the expected amount of bread to be sold in any day.

4. Find the value of Z if the area between -Z and Z is a) 0.4038; b) 0.8812; c) 0.3410.
5. The reduction of a person's oxygen consumption during periods of deep meditation may be looked
upon as a random variable having the normal distribution with μ = 38.6 cc per minute and σ = 6.5 cc
per minute. Find the probabilities that during such a period a person's oxygen consumption will be
reduced by

a) at least 33.4 cc per minute; b) at most 34.7 cc per minute


6. For a Chi-square distribution with n-1 =14 degrees of freedom, find the table value such that the area
a) to its right is 0.01;
b) to its left is 0.975.
7. What is the probability that a family with five children will have 3 boys and 2 girls?
8. In a research, rats are injected with a drug that inhibits body synthesis of protein. By the previous
research, it was found that the probability of a rat dying from the drug before the experiment is over is
0.2. If 10 rats are used:
i) How many are expected to die before the experiments ends.
ii) What is the probability that at least eight will survive.
9. The heights of 10 males of a certain locality are found to be 70, 67, 62, 68, 61, 68, 70, 64, 64, and 66
inches. If it is desired to test if the average height is 64 inches, at α = 0.05,
a) calculate the t value;
b) find the table value t_{α,n−1}.
CHAPTER 7: SAMPLING AND SAMPLING DISTRIBUTION OF
THE SAMPLE MEAN
After completing this unit, the student should be able to:
- Describe the basic concepts of sampling.
- List and explain random sampling versus non-random sampling techniques.
- Identify the causes of non-sampling error.
- Develop the sampling distribution of the mean.
- Construct the probability distribution of the mean.
- Calculate the means, standard deviation and variance of sample means and sample proportion.
- … is normal or non-normal.
- Explain the importance of the central limit theorem for statistical inference.

7.1 Basic Concepts


Population: is the complete collection of individuals, objects or measurements for which
inferences are to be made. The population represents the target of an investigation, and the
objective of the investigation is to draw conclusions about the population and it should be
defined on the basis of the objective of the study by the investigator.
Example 7.1
-All customers of electric supply company.
-All students of DMU.
-Population of farms having a certain type of natural fertility.
-Population of households in a certain village.
Sample: A sample from a population is the set of measurements that are actually collected in
the course of an investigation. It should be selected using some predefined sampling
technique in such a way that they represent the population very well.
Sampling (elementary) unit:- the ultimate unit to be sampled or elements of the population
to be sampled.
Example 7.2: If somebody studies the economic status of households, the household is the
sampling unit. If one studies the performance of freshman students in some college, the student is
the sampling unit.
Sampling frame: is the list of all elements (sampling units) in a population.
Example 7.3:
- List of households of a certain city.
- List of students in the registrar office of the university.
Parameter and Statistic are basic terms in sampling theory. A parameter is a value calculated
from the population. For instance, the population mean, population variance and population
proportion are parameters. A statistic is a value calculated from a sample. The sample mean, sample
variance, sample proportion, etc. are statistics.
Sampling error: A sampling error is the difference between a sample statistic and its corresponding
parameter. It may arise, for example, when inappropriate sampling techniques are applied. We can
make probabilistic statements about this sampling error only if we have a probability sample.
Non-sampling error: In addition to sampling error, the sample estimate may be subject to
other errors, called non-sampling errors. These include errors in observation, interview or
measurement, errors due to non-response, and errors in data processing (editing, coding, etc.).
The non-sampling error is likely to increase with an increase in sample size. For instance, a census
survey may have non-sampling errors in large amount.

7.2 Reasons for sampling


Sample survey saves money: It is possible to collect information from sample households
and obtain estimates that reasonably approximate the actual characteristics of a large
population.
It is obviously cheaper to gather information from 100 households rather than from 10,000
households.
Sample survey saves time: A sample survey requires a smaller scale of operations at all stages,
and it reduces data collection and processing time.
Sample survey provides higher level of accuracy: This accuracy can be achieved through
more selective recruiting of interviewers and supervisors, more extensive training programs, a
closer supervision of the personnel involved and a more efficient monitoring of the field
work.

Sample survey could be the only option for the study in some specialized area. For
example, there are some cases where information of a technical nature requires highly trained
personnel and specialized equipment, as in medical areas.

Experimentation could be destructive in nature, like testing industrial products such as
testing the average duration of burning of bulbs, and testing the quality of wine, beer, etc. In
this case sampling is the only feasible means of study.

7.3 Sampling Techniques


The technique of selecting a sample is important in sampling theory and usually it depends
upon the nature of the investigation. The commonly used sampling techniques may be broadly
classified as: Non Probability and Probability Sampling.
A. Random Sampling or probability sampling.
Probability sampling is a method of sampling in which all elements in the
population have a pre-assigned probability of being included in the sample.
In this sub-section, four different techniques of taking a random sample are discussed:
a/ Simple random sampling, b/ Stratified random sampling, c/ Cluster sampling, d/ Systematic
sampling
a) Simple Random Sampling
In statistics, a simple random sample from a population is a sample chosen randomly, so that
each possible sample has the same probability of being chosen. One consequence is that each
member of the population has the same probability of being chosen as any other. In small
populations such sampling is typically done "without replacement", i.e., one deliberately
avoids choosing any member of the population more than once. Although simple random
sampling can be conducted with replacement instead, this is less common and would normally
be described more fully as simple random sampling with replacement. Simple random
sampling is a method of selecting n units out of a finite population of size N by giving equal
probability to all units, or a sampling procedure in which all possible combinations of n units
that may be formed from the finite population of size N units have the same probability of
selection. There are NCn = N!/(n!(N − n)!) distinct possible samples in the case of sampling without
replacement; the chance of selecting each one of them is 1/NCn. There are N^n possible
samples in the case of sampling with replacement; the chance of selecting each one of them is
1/N^n. Conceptually, simple random sampling is the simplest of the probability sampling
techniques. It requires a complete sampling frame, which may not be available or feasible to
construct for large populations. Even if a complete frame is available, more efficient
approaches may be possible if other useful information is available about the units in the
population.
Simple random sampling is free of classification error, and it requires minimum advance
knowledge of the population. It best suits situations where the population is fairly
homogeneous and not much information is available about the population. If these conditions
are not true, some other types of sampling techniques may be a better choice. Lottery method
and computer generated random numbers are used to select a random sample in simple
random sampling.
i) Lottery method: This is a very common method of taking a random sample, under this
method; we label each member of the population by identifiable ticket or pieces of papers.
Tickets must be of identical size, color and shape. They are placed in the container and well
mixed before each draw and then draws may be continued until a sample of the required size
is selected. This shows that selection of items depends entirely on chance.
Example 7.4: If we want to take a sample of 25 persons out of a population of 150, the
procedure is to write the names of all the 150 persons on separate slips of papers, fold these
slips, mix them thoroughly and then make a blindfold selection of 25 slips without
replacement.
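The lottery draw in Example 7.4 can be imitated with computer-generated random numbers. The sketch below is illustrative only: the labels `person_1` … `person_150` and the seed value are assumptions made here so that the draw can be repeated.

```python
import random
from math import comb

# Hypothetical sampling frame: 150 labelled persons
population = [f"person_{i}" for i in range(1, 151)]

rng = random.Random(2019)              # fixed seed (an arbitrary choice) for repeatability
sample = rng.sample(population, 25)    # 25 units drawn without replacement

# Number of distinct possible samples without replacement: NCn
n_possible_samples = comb(150, 25)

print(sample[:5])
print(n_possible_samples)
```

`random.sample` never returns the same unit twice, which corresponds to drawing slips without putting them back.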

ii) Table of random numbers

This is an alternative method of selecting a simple random sample. It is constructed from the
digits 0, 1, 2,…, 9. There are several tables available in standard books of Statistics.

Suppose we want to select a sample of size n, then

- Make a list of population to be sampled;


- Give a distinct code number to each unit of the population;
- Choose the direction of selection randomly;

- Take n units whose code numbers coincide with the random numbers as members of the sample,
omitting those random numbers which do not exist on the list and, if an element may not appear
more than once in the sample, repeated numbers.

Table of Random Numbers

              Column
Row     1      2      3      4      5      6      7      8
 1    57172  42088  70098  17333  26902  29959  43909  49607
 2    33883  87680  24923  15659  09839  45817  89405  70743
 3    77950  15344  35609  87119  15859  74577  42791  75889
 4    11607  26596  16796  24498  17009  67119  60557  49521
 5    56149  55678  38169  47228  49931  94303  67448  31286
 6    80719  65101  77729  83949  83358  75230  56624  27549
 7    93809  19505  82000  79068  45552  86776  48980  56684
 8    40950  86216  48161  17646  24164  35513  94057  51834
 9    12182  59744  83710  41125  14291  74773  66391  50031
10    13382  48076  73151  48724  35670  38453  63154  58116
11    38629  94576  48859  75654  17152  66516  78796  73099
12    60728  52063  12431  23898  23683  10853  04038  75246
13    01881  99056  46747  08846  01331  88163  74462  14551
14    23094  08831  24387  23917  07421  97869  88092  72201

Example 7.5: Suppose that N = 40 and we want to select n = 10 units without replacement, starting
with the 3rd row and 2nd column of the above random number table and reading vertically.

Solution: Starting with the 3rd row and 2nd column, reading vertically, taking the first two digits of
each entry, and ignoring numbers greater than 40 and numbers already selected, we get:

15, 26, 19, 08, 24, 35, 16, 38, 12 and 17.
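The selection in Example 7.5 can be traced in code. The sketch below rests on two assumptions that reproduce the printed answer but are not spelled out in the module: the first two digits of each five-digit entry are used as the code number, and after the bottom of a column reading continues at the top of the next column. The function name `select_codes` is illustrative.

```python
# Random number table from the text (letter O misprints read as zero)
TABLE = """
57172 42088 70098 17333 26902 29959 43909 49607
33883 87680 24923 15659 09839 45817 89405 70743
77950 15344 35609 87119 15859 74577 42791 75889
11607 26596 16796 24498 17009 67119 60557 49521
56149 55678 38169 47228 49931 94303 67448 31286
80719 65101 77729 83949 83358 75230 56624 27549
93809 19505 82000 79068 45552 86776 48980 56684
40950 86216 48161 17646 24164 35513 94057 51834
12182 59744 83710 41125 14291 74773 66391 50031
13382 48076 73151 48724 35670 38453 63154 58116
38629 94576 48859 75654 17152 66516 78796 73099
60728 52063 12431 23898 23683 10853 04038 75246
01881 99056 46747 08846 01331 88163 74462 14551
23094 08831 24387 23917 07421 97869 88092 72201
"""
rows = [line.split() for line in TABLE.strip().splitlines()]

def select_codes(rows, N, n, start_row, start_col):
    """Read down the columns, keeping valid, unrepeated two-digit codes."""
    chosen = []
    r = start_row - 1                       # convert to 0-based index
    for c in range(start_col - 1, len(rows[0])):
        while r < len(rows):
            code = int(rows[r][c][:2])      # first two digits of the entry
            if 1 <= code <= N and code not in chosen:
                chosen.append(code)
                if len(chosen) == n:
                    return chosen
            r += 1
        r = 0                               # continue at the top of the next column
    return chosen

selected = select_codes(rows, N=40, n=10, start_row=3, start_col=2)
print(selected)
```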

b/ Stratified random sampling

In stratified sampling, the population of N units is sub-divided into k sub-populations, called
strata, so that the units in each stratum are as homogeneous as possible and the means of the
different strata are as different as possible. These sub-populations should be non-overlapping
and together comprise the whole population, so that N1 + N2 + … + Nk = N, where Ni
represents the population size in the ith stratum. Then a sample is drawn from each stratum
independently, the sample size within the ith stratum being ni (i = 1, 2, …, k), such that
n1 + n2 + … + nk = n. The procedure of taking samples in this way is called stratified
sampling. If the sample is taken randomly from each stratum, the procedure is known as
stratified random sampling.

Remarks:

In stratified random sampling, the following two points are equally important to ensure
accuracy:

a) proper stratification of the population into various strata, and

b) a suitable sample size from each stratum.

For example, a population can be stratified based on the following variables:

- Sex (male, female); Age (under 18, 18 to 28, 29 to 39); Occupation (professional, other);
Geographical classifications; Administrative regions, etc.

c/ Cluster Sampling:
The population is divided into non-overlapping groups called clusters. A simple random
sample of groups or clusters of elements is chosen, and all the sampling units in the selected
clusters are surveyed in the case of single-stage cluster sampling. Clusters are formed in a
way that elements within a cluster are heterogeneous, i.e. observations in each cluster should
be more or less dissimilar. Cluster sampling is useful when it is difficult or costly to generate
a simple random sample. For example, to estimate the average annual household income in a
large city we use cluster sampling, because to use simple random sampling we need a
complete list of households in the city as a sampling frame. To use stratified random sampling,
we would again need the list of households. A less expensive way is to let each block within
the city represent a cluster. A sample of clusters could then be randomly selected, and every
household within these clusters could be interviewed to find the average annual household
income.
d/ Systematic Sampling:
Systematic sampling is the selection of every kth element from a sampling frame, where k, the
sampling interval, is
k = population size / sample size = N/n.
Using this procedure each element in the population has a known and equal probability of
selection. This makes systematic sampling functionally similar to simple random sampling. It
is, however, much more efficient and much less expensive to do. Like simple random
sampling, a complete list of all elements within the population (sampling frame) is required.
Suppose that we have a complete and up-to-date list of the N units in the population numbered
from 1 to N in some order, and that N is an integral multiple of n, i.e. N = k*n for some integer k.
The procedure starts by determining the first element to be included in the sample: select a unit
i randomly from the first k units of the frame. The second unit will be the
(i+k)th element from the frame, and so on. In total we have a sample of size n from the population
of size N: the ith, (i+k)th, (i+2k)th, …, (i+(n-1)k)th elements of the population are taken as the sample.
Example 7.6
Suppose that N = 20 and we want to select a sample of size 4, so that k = N/n = 20/4 = 5.
The first element in the sample is selected randomly from the first 5 units, say the 3rd, which is
the random start. Then every 5th unit is selected, and the sample contains the 3rd, 8th, 13th and
18th units of the population.
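
The systematic selection in Example 7.6 can be reproduced with a short Python sketch. The function name systematic_sample is illustrative, and the sketch assumes that N is an integral multiple of n, as in the text.

```python
def systematic_sample(N, n, start):
    """Return the 1-based positions i, i+k, ..., i+(n-1)k, where k = N // n."""
    if N % n != 0:
        raise ValueError("this sketch assumes N is an integral multiple of n")
    k = N // n
    if not 1 <= start <= k:
        raise ValueError("the random start must lie in the first k units")
    return [start + j * k for j in range(n)]

positions = systematic_sample(N=20, n=4, start=3)
print(positions)  # [3, 8, 13, 18]
```

In practice the random start would itself be drawn at random from 1 to k, for example with random.randint(1, k).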
B. Non-Random Sampling or non-probability sampling.
It is a sampling technique in which the choice of individuals for a sample depends on
convenience, personal choice or interest.
Types of non-random sampling are:
1. Judgment sampling.
2. Convenience sampling
3. Quota Sampling.
1. Judgment Sampling
In this case, the person taking the sample has direct or indirect control over which items are
selected for the sample. This method is mainly used for opinion surveys but is not
recommended for general use, as it is subject to the bias of the sampler.
2. Convenience Sampling
In this method, the decision maker selects a sample from the population in a manner that is
relatively easy and convenient.
3. Quota Sampling
This is a type of judgment sampling and may be the most commonly used one in the non-
probability category. In a quota sample, quotas are set up according to some specified
characteristics such as income groups, age groups, political or religious groups, etc. Within
the quota, the selection of sampling units depends upon personal judgment.
7.4 Sampling Distribution of the sample mean
Consider all possible samples of size n that can be drawn from a given population (either with
or without replacement). For each sample, we can compute a statistic (such as the mean & the
standard deviation) that will vary from sample to sample. In this manner we obtain a
distribution of the statistic that is called its sampling distribution.
Steps for the construction of Sampling Distribution of the mean
1. From a finite population of size N, randomly draw all possible samples of size n. There are
$N^n$ possible samples if sampling is with replacement and there are $\binom{N}{n}$ (written NCn)
possible samples if sampling is without replacement.
2. Calculate the mean for each sample.
3. Summarize the means obtained in step 2 in terms of a frequency distribution.
Example 7.7: Suppose we have a population of size 5, consisting of the ages of five children:
3, 5, 7, 9, and 11. The population mean is 7 and the population variance is 8. (Consider sampling
without replacement.)
Take samples of size 2 and construct the sampling distribution of the sample mean.
Solution:
Step 1: N = 5, n = 2, so we have 5C2 = 10 possible samples:
(3,5), (3,7), (3,9), (3,11), (5,7), (5,9), (5,11), (7,9), (7,11) and (9,11)
Step 2: Calculate the sample mean for each sample:
Means = 4, 5, 6, 7, 6, 7, 8, 8, 9, 10 respectively.
Step 3: Summarize the means obtained in step 2 in terms of a frequency distribution.

$\bar{x}_i$                  4   5   6    7    8    9   10   Total
$f_i$                        1   1   2    2    2    1   1    10
$f_i \bar{x}_i$              4   5   12   14   16   9   10   70
$f_i(\bar{x}_i - 7)^2$       9   4   2    0    2    4   9    30

a) Mean of sample means: $E(\bar{X}) = \frac{\sum f_i \bar{x}_i}{\sum f_i} = 70/10 = 7$
b) Variance of sample means: $\mathrm{var}(\bar{X}) = \frac{\sum f_i (\bar{x}_i - E(\bar{X}))^2}{\sum f_i} = 30/10 = 3$

The same value follows from the formula for sampling without replacement:
$V(\bar{x}) = \frac{\sigma^2}{n}\left(\frac{N-n}{N-1}\right) = \frac{8}{2}\left(\frac{5-2}{5-1}\right) = 3$
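
The three construction steps of Example 7.7 can be checked by brute-force enumeration. The sketch below is an illustration only; it uses exact fractions so that the results can be compared directly with the worked values, and the variable names are assumptions of this sketch.

```python
from itertools import combinations
from fractions import Fraction

population = [3, 5, 7, 9, 11]
n = 2
N = len(population)

# Steps 1 and 2: all NCn samples without replacement and their means.
means = [Fraction(sum(s), n) for s in combinations(population, n)]

# Step 3: mean and variance of the sample means.
mean_of_means = sum(means) / len(means)
var_of_means = sum((m - mean_of_means) ** 2 for m in means) / len(means)

# Population variance (divisor N) and the finite-population formula.
mu = Fraction(sum(population), N)
sigma2 = sum((x - mu) ** 2 for x in population) / N
formula = sigma2 / n * Fraction(N - n, N - 1)

print(len(means), mean_of_means, var_of_means, formula)  # 10 7 3 3
```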
Example 7.8: Three students have taken a class test which is marked out of 10. We want to
estimate the mean mark using the sample mean as the estimate of the population mean. We
take a sample of size 2 under two sampling schemes, and suppose the marks of the three students are 1, 2 and 6.
The population mean is $\mu = (1+2+6)/3 = 3$.
The population variance is $\sigma^2 = \frac{\sum (x_i - \mu)^2}{N} = \frac{4 + 1 + 9}{3} = 14/3$.

i) Sampling without replacement

In this type of sampling an observation is included in the sample only once and is selected
randomly without any preference or conscious effort.
If sampling is without replacement we can take 3C2 = 3 possible samples; the possibilities are
given below.

Possible sample   (1,2)   (1,6)   (2,6)
Sample mean       1.5     3.5     4

The sample mean is a random variable, and we see that it can take three possible values. We
can now write down its probability distribution as follows:

$\bar{x}_i$                     1.5    3.5    4     Total
$P(\bar{X} = \bar{x}_i)$        1/3    1/3    1/3   1
$(\bar{x}_i - 3)^2$             2.25   0.25   1     3.5

i) Mean of sample means: $E(\bar{X}) = \sum \bar{x}_i P(\bar{X} = \bar{x}_i) = 1.5(1/3) + 3.5(1/3) + 4(1/3) = 3$,
i.e., the mean of sample means $E(\bar{X})$ equals the population mean.

ii) Variance of sample means: $\mathrm{var}(\bar{X}) = \frac{\sum (\bar{x}_i - E(\bar{X}))^2}{k} = 3.5/3 = 1.17$,
where k is the number of sample means.

This agrees with the formula for sampling without replacement:
$V(\bar{x}) = \frac{\sigma^2}{n}\left(\frac{N-n}{N-1}\right) = \frac{14/3}{2}\left(\frac{3-2}{3-1}\right) = 14/12 = 1.17$.
ii) Sampling with replacement
In this type of sampling every observation has a chance of being selected at each draw.
If we take the sample with replacement, there are $3^2 = 9$ possible samples.

Sample        (1,1)  (1,2)  (1,6)  (2,1)  (2,2)  (2,6)  (6,1)  (6,2)  (6,6)
Sample mean   1      1.5    3.5    1.5    2      4      3.5    4      6

The sample mean is a random variable and its probability distribution is:

$\bar{x}_i$                             1     1.5   2     3.5    4     6     Total
$P(\bar{X} = \bar{x}_i)$                1/9   2/9   1/9   2/9    2/9   1/9   1
$\bar{x}_i P(\bar{X} = \bar{x}_i)$      1/9   1/3   2/9   7/9    8/9   6/9   3
$f_i(\bar{x}_i - 3)^2$                  4     4.5   1     0.50   2     9     21

Here $f_i$ is the number of samples that give the mean $\bar{x}_i$.

i) Mean of sample means: $E(\bar{X}) = \sum \bar{x}_i P(\bar{X} = \bar{x}_i)$
= 1(1/9) + 1.5(2/9) + 2(1/9) + 3.5(2/9) + 4(2/9) + 6(1/9) = 3
Mean of sample means, $E(\bar{X})$ = population mean.

ii) Variance of sample means: $\mathrm{var}(\bar{X}) = \frac{\sum f_i (\bar{x}_i - E(\bar{X}))^2}{k} = 21/9 = 2.33$,
where k = 9 is the number of samples.

This agrees with the formula for sampling with replacement:
$V(\bar{X}) = \sigma_{\bar{x}}^2 = \frac{\sigma^2}{n} = \frac{14/3}{2} = 14/6 = 2.33$
In each case the expected value of the sample mean equals the population mean. This explains
why the sample mean is a good estimate of the population mean. If we use the sample mean
as an estimate of the population mean we will sometimes overestimate it, and sometimes
under-estimate it, but “on average” we will be accurate.
The example above illustrates an important result.

Remark:
1. Mean of sample means: $E(\bar{X}) = \frac{\sum f_i \bar{x}_i}{\sum f_i} = \sum \bar{x}_i P(\bar{X} = \bar{x}_i) = \mu$, the population mean.

2. Variance of sample means: $V(\bar{X}) = \sigma_{\bar{x}}^2 = \frac{\sigma^2}{n}$ (if sampling is with replacement)

3. Variance of sample means: $V(\bar{x}) = \frac{\sigma^2}{n}\left(\frac{N-n}{N-1}\right)$ (if sampling is without replacement)

The quantity $\frac{N-n}{N-1}$ is the finite population correction (fpc); if n/N < 0.05, the fpc is ignored.
Note: the square root of the variance of sample means is known as the standard error.
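
As a check on these remarks, the following Python sketch enumerates every sample of size 2 from the marks 1, 2 and 6 of Example 7.8, both with and without replacement, and compares the results with the two variance formulas. The helper name summarize is an assumption of this sketch.

```python
from itertools import combinations, product
from fractions import Fraction

def summarize(sample_means):
    """Mean and variance (divisor = number of samples) of a list of sample means."""
    k = len(sample_means)
    centre = sum(sample_means) / k
    spread = sum((m - centre) ** 2 for m in sample_means) / k
    return centre, spread

marks = [1, 2, 6]
N, n = len(marks), 2
mu = Fraction(sum(marks), N)
sigma2 = sum((x - mu) ** 2 for x in marks) / N                          # 14/3

with_repl = [Fraction(sum(s), n) for s in product(marks, repeat=n)]     # 9 samples
without_repl = [Fraction(sum(s), n) for s in combinations(marks, n)]    # 3 samples

print(summarize(with_repl))      # mean 3, variance 7/3 (= 2.33)
print(summarize(without_repl))   # mean 3, variance 7/6 (= 1.17)
print(sigma2 / n, sigma2 / n * Fraction(N - n, N - 1))  # 7/3 7/6
```

The enumerated variances match $\sigma^2/n$ and $\frac{\sigma^2}{n}\left(\frac{N-n}{N-1}\right)$ exactly, and the mean of the sample means equals the population mean in both cases.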
The distribution of sample means depends on the distribution of the population, the sample size and
whether the population variance is known or unknown. A sample may be from a normally
distributed population or from a non-normally distributed population, from a population whose
variance is known or unknown, and the sample size may be large or small.
Case-I: If sampling is from a normally distributed population with known variance:
When sampling is from a normally distributed population with known variance, the
distribution of the sample mean $\bar{X}$ is normal whatever the sample size.
Example 7.9: The heights of trees on a Christmas tree farm are normally distributed with mean 68
inches and variance 9 square inches. Find the probability that the mean height of a random
sample of 16 Christmas trees is more than 70 inches.
Solution:
Let X be the height of a tree, with mean $\mu = 68$ and variance $\sigma^2 = 9$.
A sample of size n = 16 is taken, and the sample mean $\bar{X}$ is a random variable. Since the
population is normally distributed,
$\bar{X} \sim N\left(\mu, \frac{\sigma^2}{n}\right) = N(68, 9/16) = N(68, 0.5625)$,
so the standard error is $\sqrt{0.5625} = 0.75$. The probability that the sample mean is greater than 70 is
$P(\bar{X} > 70) = P\left(Z > \frac{70 - 68}{0.75}\right) = P(Z > 2.67) = 0.0038$.
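
The calculation in Example 7.9 can be verified with Python's standard library, as a sketch. NormalDist is the normal distribution class in the statistics module; the variable names are assumptions of this sketch.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma2, n = 68, 9, 16
standard_error = sqrt(sigma2 / n)            # sqrt(9/16) = 0.75

# Sampling distribution of the mean: normal with mean 68 and s.d. 0.75.
xbar_dist = NormalDist(mu=mu, sigma=standard_error)
z = (70 - mu) / standard_error
p = 1 - xbar_dist.cdf(70)

print(round(z, 2), round(p, 4))  # 2.67 0.0038
```

Note that the Z value divides by the standard error 0.75, not by the variance 0.5625.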

Case-II: When sampling is from a non-normal population and the sample size is large.
If sampling is from a non-normal population and the sample size is large, the distribution
of $\bar{X}$ is given by the Central Limit Theorem.
The Central Limit Theorem
If X1, X2, …, Xn is a random sample from a population with mean μ and variance $\sigma^2$, then as
n goes to infinity the distribution of the sample mean $\bar{X}$ approaches a normal distribution
with mean μ and variance $\sigma^2/n$. In short, as n gets large, $\bar{X} \sim N\left(\mu, \frac{\sigma^2}{n}\right)$ approximately.

We can standardize this to get $Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \sim N(0, 1)$ (approximately as n gets large). When the
population variance is unknown, $Z = \frac{\bar{X} - \mu}{S/\sqrt{n}} \sim N(0, 1)$ (approximately as n gets large).


Example 7.10: The mean weight of 500 male students at a certain university is 151 pounds
(lb) and the standard deviation is 15 lb. Assume that the weights are normally distributed.
Suppose that a sample of 64 students is taken. What is the probability that the mean weight in the
sample is more than 154.75 lb?
Solution:
As we have taken a large (n = 64) sample we can use the Central Limit Theorem. This says that
the mean weight of the sample can be approximated by a normal random variable with a mean
of 151 and a variance of 225/64. If we let $\bar{X}$ be the mean weight of the students in the sample, it is required to
find $P(\bar{X} > 154.75)$, where $\bar{X} \sim N(151, 225/64)$:
$P(\bar{X} > 154.75) = P\left(\frac{\bar{X} - \mu}{\sigma/\sqrt{n}} > \frac{154.75 - 151}{15/8}\right) = P(Z > 2.00) = 0.5 - 0.4772 = 0.0228$.
Example 7.11: Suppose that 150 customers enter a supermarket on a given day. Each
customer spends a random amount. All we know about the distribution of these expenditures is
that its mean is 7.50 birr and its standard deviation is 3.40 birr. What is the probability that the
average amount spent per customer during the day is more than 8.00 birr?
Solution:
We have n = 150, which is large enough to use the Central Limit Theorem. Mean = 7.50 and
standard deviation = 3.40.
Let X be the amount of an individual's expenditure during the day; X has mean 7.50 and variance 11.56.
Let $\bar{X}$ be the average amount of an individual's expenditure during the day. It is required to find
$P(\bar{X} > 8)$:

$P(\bar{X} > 8.00) = P\left(\frac{\bar{X} - \mu}{\sigma/\sqrt{n}} > \frac{8.00 - \mu}{\sigma/\sqrt{n}}\right) = P\left(Z > \frac{8.00 - 7.5}{3.4/\sqrt{150}}\right) = P(Z > 1.80) = 0.5 - P(0 < Z < 1.80)$
= 0.5 – 0.4641 = 0.0359
This means the probability that the average amount spent per customer exceeds 8.00 birr is
only 0.0359.
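
Examples 7.10 and 7.11 both follow the same pattern, which can be sketched as a small Python function. The function name prob_mean_exceeds is illustrative. Because the code uses the unrounded Z value rather than the table value Z = 1.80, the fourth decimal place for Example 7.11 may differ slightly from the table-based answer.

```python
from math import sqrt
from statistics import NormalDist

def prob_mean_exceeds(threshold, mu, sigma, n):
    """P(sample mean > threshold) under the normal approximation N(mu, sigma^2 / n)."""
    z = (threshold - mu) / (sigma / sqrt(n))
    return z, 1 - NormalDist().cdf(z)

z_710, p_710 = prob_mean_exceeds(154.75, mu=151, sigma=15, n=64)     # Example 7.10
z_711, p_711 = prob_mean_exceeds(8.00, mu=7.50, sigma=3.40, n=150)   # Example 7.11

print(round(z_710, 2), round(p_710, 4))  # 2.0 0.0228
print(round(z_711, 2), round(p_711, 4))
```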
Case-III: When sampling is from a normally distributed population with unknown population
variance,

a) If the sample size is large, $Z = \frac{\bar{X} - \mu}{S/\sqrt{n}} \sim N(0, 1)$, where S is an estimate of $\sigma$.

b) If the sample size is small (n < 30), $t = \frac{\bar{X} - \mu}{S/\sqrt{n}} \sim t(n-1)$; that is, t has a t-distribution with (n-1) degrees of
freedom, where S is an estimate of $\sigma$.

7.5 Sampling Distribution of the sample Proportion


In situations where it is not possible to measure the characteristic under study, but it is possible
to classify the whole population into various categories with respect to the attributes they
possess, consideration is usually given to estimating the proportion of population elements that belong to a
defined category or class. Suppose that we have two complementary and mutually exclusive
classes, C and C', such that every unit in the population falls into exactly one of them.
In order to know how many of the units fall in class C, we define a counting variable as

$X_i = 1$ if the $i$-th unit falls in class C, and $X_i = 0$ otherwise.

If the number of units falling in C is denoted by A for the population and by a for the sample,
then $A = \sum_{i=1}^{N} X_i$ and $a = \sum_{i=1}^{n} X_i$, and hence the population proportion, denoted by P, is given by P = A/N.

Given a simple random sample of n units, the sample proportion is $p = \frac{a}{n} = \frac{\sum_{i=1}^{n} X_i}{n}$. From
the formula, we see that $\bar{X}$ and p are essentially identical. In fact p is a special case of $\bar{X}$, the
case where the only possible values of $X_i$ are 0 and 1. Consequently p possesses all the properties of
$\bar{X}$. p is an estimate of P, with variance

$\mathrm{var}(p) = \mathrm{var}(\bar{X}) = \frac{\sigma^2}{n}\left(\frac{N-n}{N-1}\right)$, where $\sigma^2 = \frac{\sum_{i=1}^{N} (X_i - P)^2}{N} = PQ$,

so that

$\mathrm{var}(p) = \frac{PQ}{n}\left(\frac{N-n}{N-1}\right)$

where Q = 1 - P is the proportion of units falling in class C'.
This variance is estimated by using sample values as

$\widehat{\mathrm{var}}(p) = \frac{pq}{n-1}\left(\frac{N-n}{N}\right) = \frac{pq}{n-1}(1 - f)$

where q = 1 - p and the sampling fraction is f = n/N.
This expression is obtained by replacing $\sigma^2$ by its estimator $s^2 = \frac{npq}{n-1}$.
The sampling fraction can be ignored when N is large relative to the sample size n (n/N < 0.05). Then

$\widehat{\mathrm{var}}(p) = \frac{pq}{n-1}$ and the standard error of p is $\sqrt{\frac{pq}{n-1}}$.

For large n, the sample proportion p is approximately normally distributed with mean P and variance $\mathrm{var}(p) = \frac{PQ}{n}\left(\frac{N-n}{N-1}\right)$.
Example 7.12
In a simple random sample of size 100, from a population of size 500, there are 37 employed
persons in the sample.
a) Estimate the proportion of employed persons in the population.
b) Calculate the standard error of p.
Solution:
a) The population proportion P is estimated by p = a/n = 37/100 = 0.37. That is, an estimated 37% of the
population is employed.

b) The standard error of p is $\sqrt{\frac{pq(1-f)}{n-1}} = \sqrt{\frac{(0.37)(0.63)(1-0.2)}{99}} = 0.0434$.
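
The standard error in Example 7.12 can be reproduced with a few lines of Python. This is a sketch; the function name proportion_standard_error and its optional argument N are assumptions made for illustration.

```python
from math import sqrt

def proportion_standard_error(a, n, N=None):
    """Estimated standard error of p = a/n; the factor (1 - f) is applied when N is given."""
    p = a / n
    q = 1 - p
    f = n / N if N is not None else 0.0     # sampling fraction f = n/N
    return sqrt(p * q * (1 - f) / (n - 1))

se = proportion_standard_error(a=37, n=100, N=500)
print(round(se, 4))  # 0.0434
```

Leaving N unset corresponds to ignoring the sampling fraction, which the text allows when n/N < 0.05.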

Exercise 7
1. A certain type of bacteria occurs in all raw milk. Let X denote the bacteria count per ml of milk. The
public health department has found that if the milk is not contaminated, then X has a normal distribution.
The population distribution has a mean of 2250 with standard deviation 300. In a large commercial
dairy, the health inspector takes 42 random samples of milk produced each day. At the end of the day
the bacteria count in each of the 42 samples is averaged to obtain the sample average bacteria count
$\bar{x}$.
a/ Assuming that the milk is not contaminated, what is the distribution of $\bar{x}$?
b/ Assuming that the milk is not contaminated, what is the probability that the sample average
bacteria count $\bar{x}$ for one day is between 2500 and 2600 bacteria/ml?
2. The diastolic blood pressure (DBP) of adult males between the ages of 25 and 45 years in a certain society
is normally distributed with mean 78 mm and s.d. 10 mm. The DBP is considered elevated if it
exceeds 95 mm. An insurance company has 5000 adult male policy holders between 25 and 45 years of age.
What is the probability that 10% or less of them have elevated DBP?
3. Determine the type of sampling used.
a/ An interviewer in a mall is told to survey every fifth shopper starting with the second.
b/ A researcher randomly selects 5 of the 70 hospitals in a metropolitan area and then surveys all of
the surgical doctors in each hospital.
4. The amount of sulphur in a daily emission from a factory has a normal distribution with mean of 134
pounds and a standard deviation of 22 pounds. For a day selected randomly, find the probability
that the mean amount of sulphur emission will be less than 130 pounds.
5. A population consists of the five numbers 3, 7, 11, 13 and 15. Consider all possible samples of size 2
drawn from this population without replacement. Find a) the sampling distribution of sample means,
b) the mean of sample means, c) the standard deviation of the sample means.

CHAPTER 8: STATISTICAL INFERENCES
The process of inferring information about a population from a sample is known as statistical
inference. This chapter has two major parts. The first part, statistical estimation, discusses
the method of estimating a population parameter by using a statistic (point estimation). It also
explains the concept of the confidence interval. The second part, hypothesis testing, describes
the different techniques of testing a given tentative assumption by applying an appropriate
test statistic.
Objectives
After completing this unit, the student should be able to:
• Explain the concepts of statistical estimation and the confidence interval.
• Distinguish interval estimation from point estimation.
• Calculate and interpret point estimates of the population mean and population proportion.
• Define the concept of hypothesis testing and differentiate types of tests.
• List the basic steps in hypothesis testing.
• Follow the steps to solve problems on hypothesis testing.
• Identify the appropriate test statistic for a given practical problem.

8.1 Statistical Estimation

It is the procedure of using a sample statistic to estimate a population parameter. This is one
way of making inference about the population parameter where the investigator does not have
any prior notion about values or characteristics of the population parameter. A statistic used to
estimate a parameter is called an estimator and the value taken by the estimator is called
an estimate. Statistical estimation is divided into two main categories: Point Estimation and
Interval Estimation.
Point Estimation:- When we use a single value of a statistic to estimate the corresponding
parameter of a population, it is called point estimation. It is a common way of estimating a
parameter, where a random sample of n observations is selected from a population and the
statistic is calculated.
Examples:
• A sample mean is an estimate of the population mean μ. That is, $\bar{X}$ is an estimator of the population mean μ.
• A sample variance is an estimate of the population variance. That is, $S^2$ is an estimator of the population
variance $\sigma^2$.
• A sample proportion is an estimate of the population proportion.

Properties of best estimator
The following are some qualities of an estimator.
• It should be unbiased.
• It should be consistent.
• It should be relatively efficient.
To explain these properties let $\hat{\theta}$ be an estimator of θ.
1. Unbiased Estimator: An estimator whose expected value is the value of the parameter being
estimated, i.e. $E(\hat{\theta}) = \theta$.
2. Consistent Estimator: An estimator which gets closer to the value of the parameter as the
sample size increases, i.e. $\hat{\theta}$ gets closer to θ as the sample size increases.
3. Relatively Efficient Estimator: The estimator for a parameter with the smallest variance. This
actually compares two or more estimators for one parameter.
Interval estimation:- It is unlikely that any particular estimate will be exactly equal to the
population parameter; an estimate can be greater than or less than the parameter. That is, it
is not always possible to estimate a population parameter without any error, so an allowance is
needed for such error. We take an interval, a range of values about an estimate, in which the
parameter may lie. This procedure is interval estimation.
It is the procedure that results in an interval of values for a parameter. Interval estimates
indicate the precision or accuracy of an estimate and are, therefore, preferable to point
estimates. Interval estimation deals with identifying the upper and lower limits of a parameter. The confidence
interval for the parameter is:
Estimate ± critical value × standard error of the estimator
Example 8.1: The confidence interval for the population mean is:
$\bar{x}$ ± critical value × standard error of $\bar{x}$

8.1.1 Confidence interval Estimation for population means


Although $\bar{x}$ possesses nearly all the qualities of a good estimator, because of sampling error
we know that our sample statistic is not likely to be exactly equal to the population parameter,
but instead will fall within some interval of values around it. We will have to be satisfied knowing that the
statistic is "close to" the parameter. That leads to the obvious question: what is "close"?
We can phrase the latter question differently: How confident can we be that the value of the
statistic falls within a certain "distance" of the parameter? Or, what is the probability that the
parameter's value is within a certain range of the statistic's value? This range is the confidence
interval.

The confidence level is the probability that the value of the parameter falls within the range
specified by the confidence interval surrounding the statistic. There are different cases to be
considered to construct confidence intervals.

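Case-I: Population variance ($\sigma^2$) is known and parent population is normal.

The sampling distribution of the sample mean is normal with mean μ and variance $\sigma^2/n$, that
is, $\bar{X} \sim N(\mu, \sigma^2/n)$. We can standardize this to get $Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \sim N(0, 1)$.

From the standard normal distribution, we have

$P(-Z_{\alpha/2} < Z < Z_{\alpha/2}) = 1 - \alpha$

where α is the risk probability and 1 - α is the confidence level. The confidence level is the probability
that the value of the parameter falls within the range specified by the confidence interval
surrounding the statistic. $\sigma/\sqrt{n}$ is the standard error of the statistic $\bar{X}$. The standard error is the
square root of the variance, where $\mathrm{Var}(\bar{X}) = \sigma^2/n$.
Using the standardized form of the sampling distribution of the sample mean in the above
probability statement, we get the limits of the confidence interval as follows:

$P\left(-Z_{\alpha/2} < \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} < Z_{\alpha/2}\right) = 1 - \alpha$

$P\left(-Z_{\alpha/2}\,\sigma/\sqrt{n} < \bar{X} - \mu < Z_{\alpha/2}\,\sigma/\sqrt{n}\right) = 1 - \alpha$

$P\left(-\bar{X} - Z_{\alpha/2}\,\sigma/\sqrt{n} < -\mu < -\bar{X} + Z_{\alpha/2}\,\sigma/\sqrt{n}\right) = 1 - \alpha$

$P\left(\bar{X} - Z_{\alpha/2}\,\sigma/\sqrt{n} < \mu < \bar{X} + Z_{\alpha/2}\,\sigma/\sqrt{n}\right) = 1 - \alpha$

The last statement shows that, with (1-α)100% confidence, the
population mean (μ) lies in the interval
$\left(\bar{X} - Z_{\alpha/2}\,\sigma/\sqrt{n},\ \bar{X} + Z_{\alpha/2}\,\sigma/\sqrt{n}\right)$.
This interval is known as a (1-α)100% confidence interval for the population mean (μ).

Here are the Z values corresponding to the most commonly used confidence levels.

(1-α)100%    α       α/2      $Z_{\alpha/2}$
90           0.10    0.05     1.645
95           0.05    0.025    1.96
99           0.01    0.005    2.58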

Example 8.2: The weights of full boxes of a certain kind of cereal are normally distributed
with a standard deviation of 0.27 ounce. If a sample of 15 randomly selected boxes produced
a mean weight of 9.87 ounce, find:

a) The 95% confidence interval for the true mean weight of boxes of this cereal,

b) The 99% confidence interval for the true mean weight of boxes of this cereal,

c) What effect does the increase in the level of confidence have on the width of the interval?

Solution:

a) Given $1 - \alpha = 0.95$, so that $\alpha/2 = 0.025$, $n = 15$, $\sigma = 0.27$ ounce, $\bar{x} = 9.87$ ounces. For the 95%
C.I., $P(-Z_{0.025} < Z < Z_{0.025}) = 0.95$ and $Z_{\alpha/2} = Z_{0.025} = 1.96$,
where $Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}}$.

Substituting these values in $\bar{x} - Z_{\alpha/2}\frac{\sigma}{\sqrt{n}} < \mu < \bar{x} + Z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$, the resulting confidence
interval is (9.73, 10.01).

b) Similarly, using $Z_{0.005} = 2.58$, the 99% C.I. is (9.69, 10.05).

c) The increase in the confidence level widens the confidence interval.
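
The two intervals of Example 8.2 can be recomputed with Python's standard library. This is a sketch; the function name z_interval is illustrative, and NormalDist().inv_cdf supplies the critical value that the text reads from the Z table.

```python
from math import sqrt
from statistics import NormalDist

def z_interval(xbar, sigma, n, confidence):
    """(1 - alpha)100% confidence interval for the mean when sigma is known."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)   # Z_{alpha/2}
    margin = z * sigma / sqrt(n)
    return xbar - margin, xbar + margin

lo95, hi95 = z_interval(9.87, 0.27, 15, 0.95)
lo99, hi99 = z_interval(9.87, 0.27, 15, 0.99)
print(round(lo95, 2), round(hi95, 2))  # 9.73 10.01
print(round(lo99, 2), round(hi99, 2))  # 9.69 10.05
```

The 99% interval is wider than the 95% interval, which is the point made in part c).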

Case-II: When sampling is from a non-normal population and the sample size is large, the
distribution of $\bar{X}$ is given by the Central Limit Theorem (with known or unknown variance).
Recall the Central Limit Theorem, which applies to the sampling distribution of the mean of a
sample. Consider samples of size n drawn from a population whose mean is μ and standard
deviation is σ. The population can have any frequency distribution. The sampling distribution
of $\bar{X}$ will have mean μ and standard deviation $\sigma/\sqrt{n}$. The sampling distribution of $\bar{X}$ is approximately normal
with mean μ and variance $\sigma^2/n$ as n gets large. That is, $\bar{X} \sim N(\mu, \sigma^2/n)$ (as n gets large). We can
standardize this to get $Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \sim N(0, 1)$, or $Z = \frac{\bar{X} - \mu}{S/\sqrt{n}} \sim N(0, 1)$ when σ is unknown.

A (1-α)100% confidence interval for the population mean (μ) is

$\left(\bar{X} - Z_{\alpha/2}\,\sigma/\sqrt{n},\ \bar{X} + Z_{\alpha/2}\,\sigma/\sqrt{n}\right)$, if σ is known
$\left(\bar{X} - Z_{\alpha/2}\,S/\sqrt{n},\ \bar{X} + Z_{\alpha/2}\,S/\sqrt{n}\right)$, if σ is unknown
Example 8.3: An economist wants to estimate the average amount in checking accounts at
banks in a given region. A random sample of 100 accounts gives a sample mean $\bar{x}$ and S = $140.00.
Give a 95% confidence interval for μ, the average amount in any checking account at a bank
in the given region.
Solution:
Given: n = 100, the sample mean $\bar{x}$, S = $140.00 and α = 0.05.
A 95% confidence interval for the population mean (μ) is
$\left(\bar{x} - Z_{\alpha/2}\,S/\sqrt{n},\ \bar{x} + Z_{\alpha/2}\,S/\sqrt{n}\right)$

$= \left(\bar{x} - 1.96(140/\sqrt{100}),\ \bar{x} + 1.96(140/\sqrt{100})\right) = (\bar{x} - 27.44,\ \bar{x} + 27.44)$

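Case-III: When sampling is from a normally distributed population with unknown population
variance and when the sample size is small (n < 30).
When the population variance $\sigma^2$ is unknown, we estimate it by the sample variance. The
standardized distribution of the sample mean, $\frac{\bar{X} - \mu}{S/\sqrt{n}}$, is a t-distribution with (n-1) degrees of
freedom. From this distribution, a (1-α)100% confidence interval for the population mean is
$\left(\bar{X} - t_{\alpha/2,\,n-1}\,S/\sqrt{n},\ \bar{X} + t_{\alpha/2,\,n-1}\,S/\sqrt{n}\right)$.

Example 8.4: From a normal population a sample of size 25 was taken and a mean of 32 was found. Given that the
sample standard deviation is 4.2, find

a) A 95% confidence interval for the population mean.

b) A 99% confidence interval for the population mean.

Solution: a/ Given: n = 25, $\bar{x} = 32$, S = 4.2, 1-α = 0.95, so α = 0.05 and $t_{0.025,\,24} = 2.064$.

The required interval will be $\left(\bar{x} - t_{\alpha/2,\,n-1}\,S/\sqrt{n},\ \bar{x} + t_{\alpha/2,\,n-1}\,S/\sqrt{n}\right)$

$= 32 \pm t_{0.025,\,24} \times \frac{4.2}{\sqrt{25}}$

$= 32 \pm 2.064 \times 0.84$

= 32 ± 1.73
= (30.27, 33.73)
b/ Given: n = 25, $\bar{x} = 32$, S = 4.2, 1-α = 0.99, so α = 0.01 and $t_{0.005,\,24} = 2.797$.

The required interval will be $\left(\bar{x} - t_{\alpha/2,\,n-1}\,S/\sqrt{n},\ \bar{x} + t_{\alpha/2,\,n-1}\,S/\sqrt{n}\right)$

$= 32 \pm t_{0.005,\,24} \times \frac{4.2}{\sqrt{25}}$

$= 32 \pm 2.797 \times 0.84$

= 32 ± 2.35
= (29.65, 34.35)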
8.1.2 Sample size determination in estimation of population mean
In the process of estimating the population mean μ using the sample mean with absolute margin
of error (d) and risk probability α, the sample size is given by:

$n = \left[\frac{Z_{\alpha/2}\,\sigma}{d}\right]^2$, where $d = |\bar{x} - \mu|$.

Example 8.5: To determine the average amount of time students take to get from one class to
the next, how large a sample is needed with probability 0.95 that the error will be at most 0.25
minutes, if σ is known from past experience to be 1.50 minutes?

Solution: Using $Z_{0.025} = 1.96$, and substituting $d = 0.25$ and $\sigma = 1.50$ in the formula for n, we
get $n = (1.96 \times 1.50 / 0.25)^2 = 138.30 \approx 139$ (always rounded up to the next integer), which is the sample size required for the estimate.
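
The sample size formula can be wrapped in a small Python helper, shown here as a sketch. The function name sample_size_for_mean is illustrative; math.ceil implements the rule of always rounding up to the next integer.

```python
from math import ceil
from statistics import NormalDist

def sample_size_for_mean(sigma, d, confidence):
    """Smallest n with margin of error at most d: n = (Z_{alpha/2} * sigma / d)^2, rounded up."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return ceil((z * sigma / d) ** 2)

print(sample_size_for_mean(sigma=1.50, d=0.25, confidence=0.95))  # 139
```

Halving the margin of error d roughly quadruples the required sample size, because d enters the formula squared.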

8.1.3 Confidence interval for population proportion


The confidence interval for the population proportion is constructed in the same manner as that for the
population mean. We have discussed that the sampling distribution of the sample proportion is
approximately normal for large samples. The sample estimate of the population proportion P is the sample
proportion $\hat{p}$, and for a large sample the estimate of the variance of the sample proportion is $\frac{\hat{p}\hat{q}}{n}$,
where $\hat{q} = 1 - \hat{p}$.

A (1-α)100% confidence interval for the proportion P is given by (for large n):

$\hat{p} \pm Z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}}$

Example 8.6: The Human Resource director of a large organization wanted to know what
proportion of all persons who had ever been interviewed for a job with his organization had
been hired. He was willing to settle for a 95% confidence interval. A random sample of 500
interview records revealed that 76, or 0.152, of the persons in the sample had been hired.

Solution:

Given: $\hat{p} = 0.152$, $\hat{q} = 1 - \hat{p} = 0.848$, n = 500, α = 0.05.

The 95% confidence interval for the population proportion is given by

$\hat{p} \pm Z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}} = 0.152 \pm 1.96\sqrt{\frac{(0.152)(0.848)}{500}}$

= (0.121, 0.183)

Hence, with 95% confidence, the required proportion lies between 0.121 and 0.183.
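
The interval in Example 8.6 can be recomputed as follows. This is a sketch that assumes the large-sample variance estimate $\hat{p}\hat{q}/n$; the function name proportion_interval is illustrative.

```python
from math import sqrt
from statistics import NormalDist

def proportion_interval(successes, n, confidence):
    """Large-sample (1 - alpha)100% confidence interval for a population proportion."""
    p_hat = successes / n
    q_hat = 1 - p_hat
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)   # Z_{alpha/2}
    margin = z * sqrt(p_hat * q_hat / n)
    return p_hat - margin, p_hat + margin

low, high = proportion_interval(76, 500, 0.95)
print(round(low, 3), round(high, 3))  # 0.121 0.183
```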

8.2 Statistical Hypothesis testing


In section 8.1, we have studied how to make estimations of the mean using point and interval
estimations. The other aspect of statistical inference is known as statistical test of hypothesis.
The branch of statistics which helps us in arriving at the criterion for deciding about the
characteristics of the population, a parameter, based on the information obtained from the
sample data is known as testing of hypothesis. We shall use the theoretical results presented
for the interval estimation, and hence, a test of hypothesis is highly connected with the theory
of estimation we studied before.

In this section, basically we will deal with testing hypotheses about population mean and
population proportion. While doing so, we shall define some important terminologies which
we may face and the errors we are committing in the process. We shall employ the standard
normal distribution (or Z-test), chi-square distribution and the t-distribution (or t-test),
depending upon the nature of the population sampled and the sample.

8.2.1 Hypothesis testing for population mean

8.2.1.1 Some terms in tests of hypothesis


Statistical hypothesis is defined as a statement (or an assertion) about the parameter of a
population or its distribution that may be proved or disproved. Its plausibility is to be
evaluated on the basis of information obtained by taking sample from the population.
Test statistic: is a statistic whose value serves to determine whether to reject or accept the
hypothesis to be tested. It is a random variable.
A given statement concerning a parameter could be true or false. Hence we have two
complementary hypotheses, namely, null hypothesis and alternative hypothesis.
a) Null hypothesis ( H0)
It is the hypothesis to be tested for possible rejection under the assumption that it is true and it
is the hypothesis of equality or the hypothesis of no difference.
b) Alternative hypothesis (H1)
It is the hypothesis which is complementary to the null hypothesis. It may be accepted if H0 is
rejected, or be rejected if H0 is accepted. It is the hypothesis of difference.
Statistical Test: is a test or procedure used to evaluate a statistical hypothesis for deciding
whether to reject the hypothesis depending on sample data. The decisions we make are of two
types: either to reject H0 and conclude that H1 is accepted, or to retain H0 and conclude that we
do not have enough evidence to reject H0.
Types of errors
Statistical test of hypothesis can lead to two kinds of errors. If the statistical test rejects H0
when it is true, the error is a type I error. If the test accepts H0 when it is false, the error is a
type II error.
The following table gives a summary of types of errors:

Decision     H0 is true         H0 is false
Reject H0    Type I error       Correct decision
Accept H0    Correct decision   Type II error

Type I error is the error committed in rejecting the null hypothesis when it is true.
The probability of committing a type I error is called the level of significance and is denoted
by α.
Type II error is the error committed in accepting the null hypothesis when it is false.
The probability of committing a type II error is denoted by β.
In both types of errors, a wrong decision has occurred. An ideal test procedure is one which is
so planned as to safeguard against both these errors. However, in practical situations, for a fixed
sample size, an attempt to reduce one of the errors increases the other. In view of this dilemma and the
fact that wrong rejection of H0 is a more serious error, we will hold α at a predetermined low
level, such as 0.1, 0.05, or 0.01, when choosing a rejection region. The level of significance
5% (α = 0.05) implies that in 5 samples out of 100 we are likely to reject a correct H0. In
other words, when H0 is true, the test will correctly retain it in about 95% of samples.
General steps in hypothesis testing on population mean, μ
Step-1: The first step in hypothesis testing is to specify the null hypothesis (H0) and the
alternative hypothesis (H1). Suppose the assumed or hypothesized value of μ is denoted by μ0,
then one can formulate two sided and one sided hypotheses as follows:
1. H0: μ = μ0 versus H1: μ ≠ μ0 (two sided test)
2. H0: μ = μ0 versus H1: μ < μ0 (one sided test or left sided-test)
3. H0: μ = μ0 versus H1: μ > μ0 (one sided test or right sided-test)
Step-2: Specify a significance level α.
Step-3: We should identify the sampling distribution of the estimator and the test statistic.
Case-I: Population variance ($\sigma^2$) is known and parent population is normal.
The test statistic is $Z_c = \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}} \sim N(0, 1)$.

Case-II: When sampling is from a non-normal population and the sample size is large, the
distribution of $\bar{X}$ is given by the Central Limit Theorem (with known or unknown variance).
a) The test statistic is $Z_c = \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}} \sim N(0, 1)$ with known variance.

b) The test statistic is $Z_c = \frac{\bar{X} - \mu_0}{S/\sqrt{n}} \sim N(0, 1)$ with unknown variance.

Case-III: When sampling is from a normally distributed population with unknown population
variance.

i) When the sample size is large, $\frac{\bar{X} - \mu_0}{S/\sqrt{n}} \sim N(0, 1)$.

ii) When the sample size is small (n < 30), $\frac{\bar{X} - \mu_0}{S/\sqrt{n}} \sim t(n-1)$.

Step-4: The value of the test statistic can be calculated as follows:

a) $Z_c = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}$ with known variance.

b) $Z_c = \frac{\bar{x} - \mu_0}{S/\sqrt{n}}$ with unknown variance.

c) $t_c = \frac{\bar{x} - \mu_0}{S/\sqrt{n}}$ with unknown variance and small sample size.

where $\bar{x}$ is the sample mean and $\mu_0$ is the value of the parameter specified by the null hypothesis.
Step-5: Identify the critical (rejection) region, that is, state the decision rule.
a) For the two sided test H0: μ = μo versus H1: μ ≠ μo, reject H0 if
Zc > Z_{α/2} or Zc < −Z_{α/2}.
[Figure: standard normal curve with rejection regions (area α/2 each) below −Z_{α/2} and above
Z_{α/2}, and the acceptance region between them.]

b) For the one sided test H0: μ = μo versus H1: μ > μo, reject H0 if Zc > Z_α.
[Figure: standard normal curve with the acceptance region to the left of Z_α and the rejection
region (area α) to its right.]

c) For the one sided test H0: μ = μo versus H1: μ < μo, reject H0 if Zc < −Z_α.
[Figure: standard normal curve with the rejection region (area α) to the left of −Z_α and the
acceptance region to its right.]

Step-6: Summarize the result and state the conclusion.

Decision Table

To test H0: μ = μ0 against the three alternatives, the rules are summarized as:

Alternative    Accept H0 if               Reject H0 if                      Inconclusive if
Hypothesis
μ ≠ μ0         −Z_{α/2} < Zc < Z_{α/2}    Zc > Z_{α/2} or Zc < −Z_{α/2}     Zc = Z_{α/2} or Zc = −Z_{α/2}
μ > μ0         Zc < Z_α                   Zc > Z_α                          Zc = Z_α
μ < μ0         Zc > −Z_α                  Zc < −Z_α                         Zc = −Z_α

Example 8.7: Test at α = 0.05 whether the mean of a random sample of size n = 16 is
"significantly less than 10" if the distribution from which the sample was taken is normal,
$\bar{x} = 8.4$ and σ = 3.2 (known).

Solution:

* H0: μ = 10 versus H1: μ < 10, α = 0.05

* −Z_α = −Z_0.05 = −1.645 (critical value)

* $Z_c = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} = \frac{8.4 - 10}{3.2/4} = -2$ (calculated value)

* Since Zc = −2 < −Z_α = −1.645, the null hypothesis is rejected. That is, the population mean
is significantly less than 10 at the 5% level of significance.

Example 8.8: Assume that in a certain district the mean systolic blood pressure of persons
aged 20 to 40 is 130 mm Hg with a standard deviation of 10 mm Hg. A random sample of 64
persons aged 20 to 40 from village x of the same district has a mean systolic blood pressure of
132 mm Hg. Does the mean systolic blood pressure of the dwellers of the village (aged 20 to
40) differ from that of the inhabitants of the district (aged 20 to 40) in general, at a 5% level
of significance?

Solution: H0: μ = 130, H1: μ ≠ 130, α = 0.05 and hence α/2 = 0.025.

Z_0.025 = 1.96

$Z_c = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} = \frac{132 - 130}{10/\sqrt{64}} = \frac{2}{1.25} = 1.6$

[Figure: standard normal curve with critical regions below −Z_0.025 = −1.96 and above
Z_0.025 = 1.96, and the acceptance region between them.]

Since Zc = 1.6 is in the acceptance region, H0 is accepted (not rejected). That is, the sample
gives no significant evidence that the mean systolic blood pressure of persons (aged 20 to 40)
living in village x differs from the mean systolic blood pressure of the inhabitants (aged 20
to 40) of the district.
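
The Z-test steps used in Examples 8.7 and 8.8 can be sketched in a few lines of Python. This is only an illustrative sketch: the function names `z_statistic` and `z_test_rejects` are hypothetical helpers introduced here, and the critical values come from the standard library's `statistics.NormalDist` rather than from the printed normal table.

```python
from math import sqrt
from statistics import NormalDist


def z_statistic(sample_mean, mu0, sigma, n):
    """Calculated value Zc = (x_bar - mu0) / (sigma / sqrt(n))."""
    return (sample_mean - mu0) / (sigma / sqrt(n))


def z_test_rejects(zc, alpha, alternative):
    """Apply the Step-5 decision rule.

    alternative is 'two-sided' (H1: mu != mu0), 'less' (H1: mu < mu0)
    or 'greater' (H1: mu > mu0). Returns True when H0 is rejected.
    """
    std_normal = NormalDist()  # standard normal: mean 0, sd 1
    if alternative == "two-sided":
        return abs(zc) > std_normal.inv_cdf(1 - alpha / 2)
    if alternative == "less":
        return zc < -std_normal.inv_cdf(1 - alpha)
    if alternative == "greater":
        return zc > std_normal.inv_cdf(1 - alpha)
    raise ValueError("alternative must be 'two-sided', 'less' or 'greater'")


# Example 8.7: n = 16, x_bar = 8.4, sigma = 3.2, H1: mu < 10
z_87 = z_statistic(8.4, 10, 3.2, 16)
reject_87 = z_test_rejects(z_87, 0.05, "less")

# Example 8.8: n = 64, x_bar = 132, sigma = 10, H1: mu != 130
z_88 = z_statistic(132, 130, 10, 64)
reject_88 = z_test_rejects(z_88, 0.05, "two-sided")

print(round(z_87, 2), reject_87)
print(round(z_88, 2), reject_88)
```

Under these assumptions the sketch reproduces the decisions reached by hand: H0 is rejected in Example 8.7 and not rejected in Example 8.8.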

Example 8.9: A sample of 16 students gave an average mark of 53.8 with a standard
deviation of 5.2. Can we conclude that the population mean of marks is 50 at α = 0.05?

Solution: H0: μ = 50, H1: μ ≠ 50

α = 0.05 and hence α/2 = 0.025

t_{α/2, n−1} = t_{0.025, 15} = 2.131.

$t_c = \frac{\bar{x} - \mu_0}{s/\sqrt{n}} = \frac{53.8 - 50}{5.2/\sqrt{16}} = \frac{3.8}{1.3} = 2.92$

Since tc = 2.92 > 2.131, H0 is rejected, i.e. the population mean mark is significantly different
from 50 at α = 0.05.
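
As a quick check of Example 8.9, the calculated t value can be reproduced as follows. This is a minimal sketch: `t_statistic` is a hypothetical helper name, and the critical value 2.131 is simply the tabulated figure quoted in the example, since the Python standard library has no t distribution.

```python
from math import sqrt


def t_statistic(sample_mean, mu0, s, n):
    """Calculated value tc = (x_bar - mu0) / (s / sqrt(n))."""
    return (sample_mean - mu0) / (s / sqrt(n))


# Tabulated t(0.025, 15) as quoted in Example 8.9 (assumed, not computed)
T_CRITICAL = 2.131

tc = t_statistic(53.8, 50, 5.2, 16)
reject_h0 = abs(tc) > T_CRITICAL  # two sided rule: reject if |tc| > t(alpha/2, n-1)
print(round(tc, 2), reject_h0)
```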

8.2.2 Hypothesis testing for population proportion


Hypothesis testing for population proportion P is carried out in the same way as hypothesis
testing for population mean when large samples and normality assumptions are fulfilled.
The test statistic is:
$Z_c = \frac{\hat{P} - P_0}{\sqrt{P_0(1 - P_0)/n}} \sim N(0, 1)$, where $\hat{P} = x/n$ is the sample proportion.

The decision rule is:

a) For the two sided test Ho: P = Po versus H1: P ≠ Po, reject Ho if
Zc > Z_{α/2} or Zc < −Z_{α/2}.
b) For the one sided test Ho: P = Po versus H1: P > Po, reject Ho if Zc > Z_α.
c) For the one sided test Ho: P = Po versus H1: P < Po, reject Ho if Zc < −Z_α.

Example 8.10: A pharmaceutical company claims that a drug which it manufactures relieves
cold symptoms for a period of 10 hrs in 90% of those who take it. In a random sample of 400
people with colds who take the drug, 350 find relief for 10 hrs. At a 0.05 level of significance,
is the manufacturer's claim correct?

Solution: $\hat{P}$ = 350/400 = 0.875, P0 = 0.9, 1 − P0 = 0.10, n = 400, α = 0.05, −Z_α = −Z_0.05 = −1.645

Ho: P = 0.90 Vs H1: P < 0.90

Using the z statistic, we have

$Z_c = \frac{\hat{P} - P_0}{\sqrt{P_0(1 - P_0)/n}} = \frac{0.875 - 0.90}{\sqrt{0.90(0.10)/400}} = -1.67$

Since the computed value Zc = −1.67 is less than the critical value −1.645,
the null hypothesis is rejected. The manufacturer's claim is not upheld.
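
The computation in Example 8.10 can be sketched in Python as below. The helper name `proportion_z` is hypothetical, and the critical value is taken from `statistics.NormalDist` instead of the printed table; treat it as an illustration of the formula rather than a general-purpose routine.

```python
from math import sqrt
from statistics import NormalDist


def proportion_z(successes, n, p0):
    """Zc = (p_hat - p0) / sqrt(p0 * (1 - p0) / n) for a one-sample proportion test."""
    p_hat = successes / n
    return (p_hat - p0) / sqrt(p0 * (1 - p0) / n)


# Example 8.10: 350 of 400 find relief, H0: P = 0.90 versus H1: P < 0.90
zc = proportion_z(350, 400, 0.90)
critical = -NormalDist().inv_cdf(1 - 0.05)  # -Z_0.05, about -1.645
reject_h0 = zc < critical                   # left sided rule
print(round(zc, 2), round(critical, 3), reject_h0)
```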
8.2.3 TEST OF ASSOCIATION OF ATTRIBUTES
In the tests of hypotheses considered so far, the Z-test and the t-test, we have assumed that the
samples were drawn from normally distributed populations, or we were considering large
samples, or more accurately, the tests were based on the assumption that the sample means
were normally distributed. Since the tests require assumptions about the type of population or
parameters, they are known as "parametric tests".

In this section, we will introduce the Chi-square test. The Chi-square test is used to test
association of attributes. It is a test for nominal data. Before the test, the Chi-square
distribution will be introduced.

8.2.3.1. The Chi-Square (χ²) distribution

The χ² distribution is one of the simplest and most widely used in statistical applications.
This distribution is not defined for negative real numbers and is not applicable when
observations assume such values.

Unlike the normal and the t distributions, its curve is not symmetrical; it is rather positively
skewed. As in the case of the t distribution, the degrees of freedom, (n − 1), is the parameter of
the Chi-square distribution. Since this distribution arises in many important applications, its
values for different values of significance α are tabulated as a function of its degrees of
freedom, n − 1. Table C at the end of this module provides the χ² values for
α = 0.05, 0.025, 0.01, 0.005, etc. and 1, 2, 3, ..., 30 degrees of freedom such that

$P(\chi^2 > \chi^2_{\alpha,\, n-1}) = \alpha$, meaning the area to the right of $\chi^2_{\alpha,\, n-1}$ under the Chi-square
curve with n − 1 degrees of freedom is equal to α.

[Figure: Chi-Square distribution]
8.2.3.2. A Test of Association
This is also known as analysis of an r × c contingency table. A table of frequencies of order r × c
(r by c) without totals is said to be an r × c contingency table if the row totals and column totals
are random. Suppose that the frequencies of the occurrences of two attributes, say A and B,
are given in a contingency table having r rows and c columns. Then, the table gives a total of
r × c frequencies. We say that A has r levels (categories) and B has c levels. Based on the
information provided by the r × c contingency table, we test the hypothesis

H0: The two attributes are independent, against

HA: The two attributes are dependent, at a specified level α.

The test statistic is: $\chi^2_c = \sum_{i=1}^{r} \sum_{j=1}^{c} \frac{(O_{ij} - E_{ij})^2}{E_{ij}}$,

where $O_{ij}$ denotes the observed frequency for the cell in the ith row and the jth column, and
$E_{ij}$ is the expected frequency (the frequency or count which is expected of the two attributes
A and B if the null hypothesis of independence is true) of the same cell.

$E_{ij} = n \cdot P(A_i \cap B_j) = n \cdot P(A_i) \cdot P(B_j)$, where $n = \sum_{i=1}^{r} \sum_{j=1}^{c} O_{ij}$,

$P(A_i) = \frac{r_i}{n}$, $r_i$ = total for ith row, $P(B_j) = \frac{c_j}{n}$, $c_j$ = total for jth column.

From which we get, $E_{ij} = n \left(\frac{r_i}{n}\right)\left(\frac{c_j}{n}\right) = \frac{r_i \cdot c_j}{n} = \frac{(\text{row total}) \times (\text{column total})}{n}$.

The expected frequency, $E_{ij}$, is obtained by multiplying the total of the row (i) to which the
cell belongs ($r_i$) by the total of the column (j) to which it belongs ($c_j$) and then dividing by the
grand total: $E_{ij} = \frac{r_i \cdot c_j}{n}$.
Decision Rule
To reach a decision, we need to know the distribution of the test statistic. Under the
assumption that H0 is true, the test statistic follows a chi-square distribution with (r-1)(c-1)
degrees of freedom. That is, the degrees of freedom of an r × c contingency table, excluding the
totals, is given by (r-1) × (c-1).

Note that the degrees of freedom here is based on the number of cells that can be freely
changed given that the row totals and the column totals are fixed. Can you verify?

The decision rule is then: we reject H0 (and accept H1) at the α level of significance if the
calculated value $\chi^2_c$ is larger than the table value, $\chi^2_{\alpha,\,(r-1)(c-1)}$. That is, reject H0 if
$\chi^2_c > \chi^2_{\alpha,\,(r-1)(c-1)}$ at the α level of significance, or with (1 − α)100% confidence.

Example 8.11: A geneticist took a random sample of 301 men to study whether there is
association between father and son regarding boldness. He obtained the following results.
Using α = 0.05, test whether there is association between father and son regarding
boldness.

Father          Son                    Total
            Bold      Not bold
Bold         85          59             144
Not bold     65          92             157
Total       150         151             301

Solution: H0: The father and son are independent regarding boldness

H1: The father and son are dependent regarding boldness

The test statistic is:

$\chi^2_c = \sum_{i=1}^{r} \sum_{j=1}^{c} \frac{(O_{ij} - E_{ij})^2}{E_{ij}}$, with $E_{ij} = \frac{r_i \cdot c_j}{n}$

e11 = (144)(150)/301 = 71.76, e12 = (144)(151)/301 = 72.24,
e21 = (157)(150)/301 = 78.24, e22 = (157)(151)/301 = 78.76

$\chi^2_c = \frac{(85 - 71.76)^2}{71.76} + \frac{(59 - 72.24)^2}{72.24} + \frac{(65 - 78.24)^2}{78.24} + \frac{(92 - 78.76)^2}{78.76} = 9.34$

The degrees of freedom is (r-1)(c-1) = (2-1)(2-1) = 1, so $\chi^2_{\alpha,\,(r-1)(c-1)} = \chi^2_{0.05,\,1} = 3.84$.

H0 is rejected in favour of H1 since $\chi^2_c = 9.34 > \chi^2_{0.05,\,1} = 3.84$.

Therefore, the sample indicates that father and son are dependent regarding boldness at 0.05
level of significance.
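
The expected frequencies and the chi-square statistic of Example 8.11 can be reproduced with a short Python sketch. The function names below are hypothetical, the table is represented as a plain list of rows, and the critical value 3.84 is the tabulated figure quoted in the example.

```python
def expected_frequencies(observed):
    """E_ij = (row total * column total) / grand total for every cell."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


def chi_square_statistic(observed):
    """Sum of (O_ij - E_ij)^2 / E_ij over all cells of the table."""
    expected = expected_frequencies(observed)
    return sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )


# Example 8.11: rows = father (bold, not bold), columns = son (bold, not bold)
observed = [[85, 59], [65, 92]]
expected = expected_frequencies(observed)
chi2_c = chi_square_statistic(observed)
df = (len(observed) - 1) * (len(observed[0]) - 1)

CHI2_CRITICAL = 3.84  # tabulated chi-square(0.05, 1) quoted in the example
reject_h0 = chi2_c > CHI2_CRITICAL
print([[round(e, 2) for e in row] for row in expected], round(chi2_c, 2), df, reject_h0)
```

The same two functions work for any r × c table, since the row and column totals are computed from the data.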

Exercise 8
1. From a normal population with standard deviation 4.2, a sample of size 25 is taken with a mean
of 32. Find a 99% confidence interval for the population mean.
2. A sample from an assumed normal distribution produced the values 9, 14, 10, 12, 7, 13, 12. a) What
is the single best estimate of μ? b) Find an 80% C.I. for μ.
3. Out of a sample of 80 customers, 60 reply that they are satisfied with the service they
received. Calculate a 95% confidence interval for the proportion of satisfied customers.
4. The mean pulse rate and standard deviation of a random sample of 9 first year male medical students
were 68.7 and 8.67 beats per minute respectively. (Assume normal distribution).
a) Find a 95% C.I. for the population mean.
b) If past experience indicates that the mean pulse rate of first year male medical students is 72 beats per
minute, test the hypothesis that the above sample estimate is consistent with the population mean at
5% level of significance.

5. According to the norms established for a reading comprehension test, students should average
84. If 45 randomly selected students averaged 87.8 with a s.d. of 8.6, test the null hypothesis
μ = 84 against the alternative μ > 84, at α = 0.01.

6. In a survey of drug users in a large city, it is found that 18 out of 423 of them were HIV positive. Can
we conclude that at α=0.05 level of significance fewer than 5 percent of the drug users in the
population are HIV positive?

7. In a study of aviophobia, a psychologist claims that 30% of all women are afraid of flying. If, in
a random sample, 41 of 150 women are afraid of flying, test the null hypothesis p = 0.30 against
HA: p ≠ 0.30, at α = 0.05.
8. A biostatistician intends to estimate μ, the mean blood pressure of women between the ages of 45 and
50. She takes a random sample of 20 women and measures their blood pressure. Based on past
experience she believes the measurements will follow a N(μ, 100) distribution. (Measurements are in
mm mercury.) Suppose she discovers the sample mean is equal to 136.9 mm mercury. Find a 95%
confidence interval for μ.
9. In a study of parents' feelings about a required course in sex education, a random sample of 360
parents are classified according to whether they have one, two, or three or more children in the school
system and also whether they feel that the course is poor, adequate, or good. Based on the results
shown in the following table, test at the 0.05 level of significance whether there is a relationship
between parents' reaction to the course and the number of children they have in the school system.
Parents'        Number of Children
feelings        1       2       3 or more
Poor            48      40      12
Adequate        55      53      29
Good            57      46      20

10. A special diet given to 8 overweight women helped them to lose 6, 7.5, 11, 9, 6.5, 11, 8 and 5 kg
within a period of 3 months. Assuming normal distribution, can we reject the claim that the diet
will help an overweight woman to lose at least 10 kg in 3 months, at α = 0.01?

CHAPTER 9: TWO SAMPLES INFERENCE
9.1 Inferences about differences between means
In many applied research problems, we are interested in hypotheses concerning differences
between the mean values of two populations. For instance, we may want to decide whether the
mean step pulse of men is less than the mean step pulse of women.

Case 1: when the populations are normal and their variances σ1² and σ2² are known

Suppose we are given independent random samples of sizes n1 and n2 from two normal populations
having the means μ1 and μ2 and the known variances σ1² and σ2². To test the null hypothesis
H0: μ1 − μ2 = 0 against the alternatives HA: μ1 − μ2 ≠ 0, HA: μ1 − μ2 > 0, or HA: μ1 − μ2 < 0,
the respective critical regions are

|Zc| > Z_{α/2}, Zc > Z_α, and Zc < −Z_α, where

$Z_c = \frac{\bar{x}_1 - \bar{x}_2 - (\mu_1 - \mu_2)}{\sigma_{\bar{x}_1 - \bar{x}_2}} \sim N(0, 1)$ and $\sigma_{\bar{x}_1 - \bar{x}_2} = \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$.

$\sigma_{\bar{x}_1 - \bar{x}_2}$ is called the standard error of the difference of the two sample means.

N.B: When we deal with independent random samples from populations with unknown variances
that may not even be normal, we can still use the test described above with S1 substituted for
σ1 and S2 substituted for σ2, as long as both samples are large enough to invoke the Central
Limit Theorem.

Example 9.1: Vision, or more especially visual acuity, depends on a number of factors. A study
was undertaken in Australia to determine the effect of one of these factors: racial variation.
Visual acuity of recognition as assessed in clinical practice has a defined normal value of
20/20 (or zero in log scale). The following summarized data on monocular visual acuity
(expressed in log scale) were obtained from two groups:

Race                                     Number of       Sample mean          Sample standard
                                         observations    (of visual acuity)   deviation
Australian males of European origin      89              -0.20                0.18
Australian males of Aboriginal origin    107             -0.26                0.13

Solution: a) H0: μ1 − μ2 = 0, HA: μ1 − μ2 ≠ 0 at α = 0.05

Z_{α/2} = 1.96, so reject H0 if |Zc| > 1.96, where $Z_c = \frac{\bar{x}_1 - \bar{x}_2 - (\mu_1 - \mu_2)}{\sigma_{\bar{x}_1 - \bar{x}_2}}$.

Substituting n1 = 89, n2 = 107, $\bar{x}_1$ = −0.20, $\bar{x}_2$ = −0.26, s1 = 0.18 for σ1, s2 = 0.13 for σ2,
and μ1 − μ2 = 0, we get

$Z_c = \frac{-0.20 - (-0.26)}{\sqrt{\frac{(0.18)^2}{89} + \frac{(0.13)^2}{107}}} = \frac{0.06}{0.0228} = 2.63$.

Conclusion: Since Zc = 2.63 > 1.96, H0 must be rejected; in other words, the difference is
statistically significant.

One can also construct a 100(1−α)% CI for μ1 − μ2: $(\bar{x}_1 - \bar{x}_2) \pm Z_{\alpha/2} \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$
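
The two-sample Z test and the accompanying confidence interval for Example 9.1 can be sketched as follows. The function name `two_sample_z` is hypothetical, and the sample standard deviations stand in for σ1 and σ2 as the N.B. above allows for large samples.

```python
from math import sqrt
from statistics import NormalDist


def two_sample_z(mean1, mean2, sd1, sd2, n1, n2, delta0=0.0):
    """Return (Zc, standard error) for H0: mu1 - mu2 = delta0."""
    se = sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    return (mean1 - mean2 - delta0) / se, se


# Example 9.1: visual acuity (log scale) in the two groups
zc, se = two_sample_z(-0.20, -0.26, 0.18, 0.13, 89, 107)

alpha = 0.05
z_half_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96
reject_h0 = abs(zc) > z_half_alpha

# 95% confidence interval for mu1 - mu2
diff = -0.20 - (-0.26)
ci = (diff - z_half_alpha * se, diff + z_half_alpha * se)
print(round(zc, 2), reject_h0, tuple(round(v, 3) for v in ci))
```

The interval excludes zero, which agrees with the rejection of H0 at the 5% level.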

Case 2: Small samples and σ1 and σ2 unknown

When n1 and n2 are small (both less than 30), and σ1 and σ2 are unknown, the above test
cannot be used. In this case the t test will be used. t tests can be used either to compare two
independent groups or to compare observations from two measurement occasions for the
same group. The first kind of problem is treated using what is known as the independent
samples t test and the second using the paired samples t test.

a) Independent-samples t test

This test is used to compare two groups of scores on the same variable which are independent
random samples from normal distributions having the same unknown variance σ². We can
have two sub-cases under this case.

a/ when n1 ≠ n2

$t_c = \frac{\bar{x}_1 - \bar{x}_2 - (\mu_1 - \mu_2)}{\sqrt{\frac{S_p^2}{n_1} + \frac{S_p^2}{n_2}}} = \frac{\bar{x}_1 - \bar{x}_2 - (\mu_1 - \mu_2)}{S_{\bar{x}_1 - \bar{x}_2}}$, which has a t distribution with n1 + n2 − 2
degrees of freedom.

Here, $S_p^2 = \frac{(n_1 - 1)S_1^2 + (n_2 - 1)S_2^2}{n_1 + n_2 - 2}$ is called the pooled variance.

Under the given assumptions and H0, tc is a value of a random variable having the t
distribution with n1 + n2 − 2 degrees of freedom. Thus, for testing H0: μ1 − μ2 = δ (δ being the
hypothesized difference, usually 0) against the alternatives μ1 − μ2 ≠ δ, μ1 − μ2 > δ, or
μ1 − μ2 < δ, the appropriate critical (rejection) regions of size α are, respectively,

$|t_c| > t_{\alpha/2,\, n_1 + n_2 - 2}$, $t_c > t_{\alpha,\, n_1 + n_2 - 2}$, or $t_c < -t_{\alpha,\, n_1 + n_2 - 2}$.

Example 9.2: In an attempt to assess the physical condition of joggers and non-joggers, a
sample was selected from each and their maximum volume of oxygen uptake (V-O2) was
measured with the following results:

               Number of observations    Sample mean    Sample standard deviation
Joggers        25                        47.5 ml/kg     4.8 ml/kg
Non-joggers    26                        37.5 ml/kg     5.1 ml/kg

Solution: * H0: μ1 = μ2 (or μ1 − μ2 = 0)

HA: μ1 ≠ μ2

α = 0.05

Results: n1 = 25, $\bar{x}_1$ = 47.5, s1² = 23.04, n2 = 26, $\bar{x}_2$ = 37.5, s2² = 26.01,

so that $S_p^2 = \frac{(n_1 - 1)S_1^2 + (n_2 - 1)S_2^2}{n_1 + n_2 - 2} = \frac{(25 - 1)23.04 + (26 - 1)26.01}{25 + 26 - 2} = 24.56$ or $S_p = 4.96$, and

$s_{\bar{x}_1 - \bar{x}_2} = 4.96 \sqrt{\frac{1}{25} + \frac{1}{26}} = 1.39$.

* $t_{\alpha/2,\,49} = t_{0.025,\,49} \approx 2.0$ (critical value).

* $t_c = \frac{47.5 - 37.5}{1.39} = 7.19$.

* Since tc = 7.19 > 2.0, reject H0 at the 5% level of significance.

One can also construct a 100(1−α)% CI for μ1 − μ2: $(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2,\, n_1 + n_2 - 2}\, S_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$.
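
The pooled-variance calculation of Example 9.2 can be checked with the sketch below. `pooled_t` is a hypothetical helper name, and the critical value 2.0 is the approximate tabulated value quoted in the example. Working without intermediate rounding gives a t value of about 7.20 instead of the hand-computed 7.19, which used the rounded standard error 1.39.

```python
from math import sqrt


def pooled_t(mean1, mean2, s1, s2, n1, n2):
    """Return (tc, pooled variance, degrees of freedom) for H0: mu1 = mu2."""
    sp2 = ((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / (n1 + n2 - 2)
    se = sqrt(sp2 * (1 / n1 + 1 / n2))  # standard error of x1_bar - x2_bar
    return (mean1 - mean2) / se, sp2, n1 + n2 - 2


# Example 9.2: joggers versus non-joggers
tc, sp2, df = pooled_t(47.5, 37.5, 4.8, 5.1, 25, 26)

T_CRITICAL = 2.0  # approximate t(0.025, 49) quoted in the example
reject_h0 = abs(tc) > T_CRITICAL
print(round(sp2, 2), df, round(tc, 2), reject_h0)
```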

b/ when n1 = n2 = n

$t = \frac{\bar{x}_1 - \bar{x}_2 - (\mu_1 - \mu_2)}{S_{\bar{x}_1 - \bar{x}_2}} \sim t_{2(n-1)}$

To compute $S_{\bar{x}_1 - \bar{x}_2}$, use the formula $s^2_{\bar{x}_1 - \bar{x}_2} = 2s^2/n$, where $s^2 = (s_1^2 + s_2^2)/2$,
$s_1^2$ is the sample variance of sample 1 and $s_2^2$ is the sample variance of sample 2.

One can also construct a 100(1−α)% CI for μ1 − μ2: $(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2,\, 2(n-1)} \sqrt{\frac{s_1^2}{n} + \frac{s_2^2}{n}}$
9.2 Paired comparison
Suppose we have a random sample of n pairs of measurements, say (x, y). In this case
observations for each pair should be made under the same conditions and the differences
should be normally distributed. Variances of each variable can be equal or unequal.

The purpose is to test the hypothesis H0: μ1 = μ2 or H0: μ1 − μ2 = 0 against an alternative
such as HA: μ2 > μ1 or HA: μ2 ≠ μ1.

Let D = Y − X so that for the pairs (x1, y1), (x2, y2), (x3, y3), ..., (xn, yn) we have

d1 = y1 − x1, d2 = y2 − x2, d3 = y3 − x3, ..., dn = yn − xn.

The test statistic is $t_c = \frac{\bar{d}}{\sqrt{s_d^2/n}}$, where $\bar{d}$ is the mean and $s_d^2$ is the variance of the
variable d. This statistic is, under the null hypothesis, distributed as t with n-1 degrees of
freedom. The decision rule is the same as for the other t-tests.

Example 9.3: The weights of 10 boys before they are subjected to a change of diet and after a
lapse of 6 months are given below. Test whether there has been any gain in weight as a result
of the change of diet.

Weights (lb) before, X   109  112   98  114  102   97   88  101   89   91
Weights (lb) after, Y    112  120   99  117  105   98   91   99   93   89

Solution:

To test H0: μ1 = μ2 against HA: μ2 > μ1 at α = 0.05

First calculate d = Y − X for each boy:

d                          3    8    1    3    3    1    3   -2    4   -2

Using the procedures for determining the test statistic given above, we have
$\bar{d} = 2.2$ and $s_d^2 = 8.62$.

Thus, $t = \frac{2.2}{\sqrt{8.62/10}} = 2.37$. Taking the level of significance to be α = 0.05, we have
$t_{0.05,\,9} = 1.833$. Since the calculated value of the statistic is greater than the tabulated value,
the null hypothesis of no gain in weight will be rejected.

One can also construct a 100(1−α)% CI for the mean difference μ_D: $\bar{d} \pm t_{\alpha/2,\, n-1}\, s_d/\sqrt{n}$.
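
The paired comparison in Example 9.3 can be reproduced with the standard library. This is a sketch under the same convention as the example (d = after − before); `paired_t` is a hypothetical helper name and 1.833 is the tabulated t value quoted in the text.

```python
from math import sqrt
from statistics import mean, variance

before = [109, 112, 98, 114, 102, 97, 88, 101, 89, 91]  # X
after = [112, 120, 99, 117, 105, 98, 91, 99, 93, 89]    # Y


def paired_t(x, y):
    """Return (tc, d_bar, s_d^2) for the differences d = y - x."""
    d = [yi - xi for xi, yi in zip(x, y)]
    d_bar = mean(d)
    s2_d = variance(d)  # sample variance, divisor n - 1
    return d_bar / sqrt(s2_d / len(d)), d_bar, s2_d


tc, d_bar, s2_d = paired_t(before, after)

T_CRITICAL = 1.833  # tabulated t(0.05, 9) quoted in Example 9.3
reject_h0 = tc > T_CRITICAL  # right sided rule for HA: mu2 > mu1
print(round(d_bar, 2), round(s2_d, 2), round(tc, 2), reject_h0)
```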

9.3 Inferences about differences between Proportions


If we have two populations with proportions of the required characteristic equal to p1 and p2,
then their estimates can be obtained on the basis of samples of sizes n1 and n2. Suppose that x1
and x2 are the numbers of successes observed in n1 trials of one kind and n2 of another, the
trials are independent, and the corresponding probabilities of success are, respectively, p1 and
p2. Then the sampling distribution of $\frac{x_1}{n_1} - \frac{x_2}{n_2}$ has mean p1 − p2 and standard error

$\sqrt{\frac{p_1(1 - p_1)}{n_1} + \frac{p_2(1 - p_2)}{n_2}}$.

To test the null hypothesis p1 = p2 = p (or p1 − p2 = 0), the standard error can be written as

$\sqrt{p(1 - p)\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}$,

where p is usually estimated by pooling the data and substituting for p the combined sample
proportion given by: $\hat{p} = \frac{x_1 + x_2}{n_1 + n_2}$.

Then, converting to standard units, we get the test statistic:

$Z = \frac{\frac{x_1}{n_1} - \frac{x_2}{n_2}}{\sqrt{\hat{p}(1 - \hat{p})\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}} = \frac{\hat{p}_1 - \hat{p}_2}{s_{\hat{p}_1 - \hat{p}_2}}$, with $\hat{p} = \frac{x_1 + x_2}{n_1 + n_2}$.

This is approximately normal for large samples.

More generally, to test a hypothesized value of the difference p1 − p2, we use:

$Z = \frac{\hat{p}_1 - \hat{p}_2 - (p_1 - p_2)}{\sqrt{\frac{\hat{p}_1(1 - \hat{p}_1)}{n_1} + \frac{\hat{p}_2(1 - \hat{p}_2)}{n_2}}}$.

The test criteria are similar to those of H0: μ1 − μ2 = 0, with p1 and p2 substituted for μ1 and
μ2, respectively.

One can also construct a 100(1−α)% CI for p1 − p2:

$(\hat{p}_1 - \hat{p}_2) \pm z_{\alpha/2} \sqrt{\frac{\hat{p}_1(1 - \hat{p}_1)}{n_1} + \frac{\hat{p}_2(1 - \hat{p}_2)}{n_2}}$

Example 9.4: A health officer is trying to study the malaria situation of Ethiopia. From the
records of seasonal blood survey results he came to understand that the proportion of people
having malaria in Ethiopia was 3.8% in 1978 (Eth. Cal.). The size of the sample considered
was 15,000. He also realized that during the year that followed (1979), blood samples were
taken from 10,000 randomly selected persons. The result of the 1979 seasonal blood survey
showed that 200 persons were positive for malaria. Help the health officer test the
hypothesis that the malaria situation of 1979 did not show any significant difference from that
of 1978 (take the level of significance as 0.01).

Solution: To test H0: p1978 = p1979 (or H0: p1978 − p1979 = 0)

against HA: p1978 ≠ p1979 (or HA: p1978 − p1979 ≠ 0) at α = 0.01

$\hat{p}_{1978}$ = 0.038, $\hat{p}_{1979}$ = 200/10,000 = 0.02

n1 = 15,000, n2 = 10,000

x1 = 570, x2 = 200

z_tab = z_{α/2} = z_0.005 = 2.58

Common (pooled) proportion, $\hat{p} = \frac{x_1 + x_2}{n_1 + n_2} = \frac{570 + 200}{15,000 + 10,000} = 0.0308$

And the standard error = $\sqrt{(0.0308)(0.9692)\left(\frac{1}{15,000} + \frac{1}{10,000}\right)}$ = 0.0022

Hence, zc = (0.038 − 0.020)/0.0022 = 0.018/0.0022 = 8.2

Conclusion: Since zc > z_tab, reject H0. Therefore, it is concluded that there was a statistically
significant difference in the proportion of malaria patients between 1978 and 1979 at a 0.01
level of significance.
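
The pooled two-proportion test of Example 9.4 can be sketched in Python as below. The helper name `two_proportion_z` is hypothetical and 2.58 is the tabulated two sided critical value at the 0.01 level. Without intermediate rounding the statistic comes out near 8.07 rather than 8.2, because the hand calculation rounds the standard error to 0.0022; the conclusion is unchanged.

```python
from math import sqrt


def two_proportion_z(x1, n1, x2, n2):
    """Return (Zc, pooled proportion, standard error) for H0: p1 = p2."""
    p1_hat, p2_hat = x1 / n1, x2 / n2
    p_pool = (x1 + x2) / (n1 + n2)
    se = sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
    return (p1_hat - p2_hat) / se, p_pool, se


# Example 9.4: 570 of 15,000 positive in 1978, 200 of 10,000 positive in 1979
zc, p_pool, se = two_proportion_z(570, 15000, 200, 10000)

Z_CRITICAL = 2.58  # tabulated z(0.005) for a two sided test at alpha = 0.01
reject_h0 = abs(zc) > Z_CRITICAL
print(round(p_pool, 4), round(se, 4), round(zc, 2), reject_h0)
```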

9.4 Inferences concerning variances


Under the assumption of normality one can perform a test on a population variance. If we
want to test the null hypothesis

H0: σ² = σ0² against one of the alternatives: σ² ≠ σ0², or σ² > σ0², or σ² < σ0²,

we use the calculated value $\chi^2_c = \frac{(n - 1)S^2}{\sigma_0^2}$; this value is then compared with
the table value of $\chi^2_{\alpha,\,\nu}$, where ν = (n-1) is the degrees of freedom and α is the level of
significance.

The decision rule is to reject H0 if $\chi^2_c > \chi^2_{\alpha/2,\,\nu}$ or $\chi^2_c < \chi^2_{1-\alpha/2,\,\nu}$ for the two sided test, if
$\chi^2_c > \chi^2_{\alpha,\,\nu}$ for the right sided test, or if $\chi^2_c < \chi^2_{1-\alpha,\,\nu}$ for the left sided test.

Example 9.5: In a random sample, the amount of time which 18 women took to complete the
written test for their driver licenses has standard deviation 2.1 minutes. Do the data give
sufficient evidence that the population variance is significantly less than 6.25 square minutes
at α = 0.05? (Assume normality.)

Solution: Given that n = 18 and S² = (2.1)² = 4.41, we test

H0: σ² = 6.25 versus

HA: σ² < 6.25

The computed χ² is $\chi^2_c = \frac{(n - 1)S^2}{\sigma_0^2} = \frac{17 \times 4.41}{6.25} = 11.995$.

$\chi^2_{0.95,\,17} = 8.672 < 11.995 = \chi^2_c$, so we accept the null hypothesis.

One can also construct a 100(1−α)% CI for σ²: $\left(\frac{(n - 1)S^2}{\chi^2_{\alpha/2}},\ \frac{(n - 1)S^2}{\chi^2_{1-\alpha/2}}\right)$.
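
The variance test of Example 9.5 reduces to a one-line computation, sketched below. The function name `variance_chi_square` is hypothetical and 8.672 is the tabulated lower-tail value quoted in the example.

```python
def variance_chi_square(n, sample_variance, sigma0_sq):
    """Calculated chi-square value (n - 1) * S^2 / sigma0^2."""
    return (n - 1) * sample_variance / sigma0_sq


# Example 9.5: n = 18, S = 2.1 minutes, H0: sigma^2 = 6.25 versus HA: sigma^2 < 6.25
chi2_c = variance_chi_square(18, 2.1 ** 2, 6.25)

CHI2_LOWER = 8.672  # tabulated chi-square(0.95, 17) quoted in the example
reject_h0 = chi2_c < CHI2_LOWER  # left sided rule
print(round(chi2_c, 3), reject_h0)
```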

Exercise 9
1. Hearing levels in two groups of school children with normal hearing at the frequency of 500 cycles
per second were found to be as follows.

            No. of children    Mean of hearing threshold    Stand. deviation
Group 1     62                 15.5 decibels                6.5 decibels
Group 2     76                 20 decibels                  7.1 decibels

Test if there is any difference between hearing levels recorded in the two groups at α = 0.01.
2. A group of 12 patients was given a new type of sleeping pill. Another group consisting of 10 patients
was given a conventional type. The number of hours of sleep of these patients in one day are as
follows:
Group 1 (new pill): 8.4, 6, 7, 8, 6.5, 7.4, 8.1, 9, 10.1, 7.2, 8, 11
Group 2 (conventional): 8, 7.2, 6.3, 6.4, 8.5, 9.2, 8, 7, 7, 9
Do the data suggest that there is any appreciable difference in the effect of the two types of pills on
patients for α = 0.05?

3. The mean produce of wheat from a sample of 10 fields comes to 200 kg per acre and another sample
of 150 fields gives a mean of 220 kg per acre. Assuming the standard deviation of the yield at 11 kg
for the population, test if there is a significant difference between the means of the samples at
α = 0.01.

4. A certain stimulus administered to each of 12 patients resulted in the following increase of blood
pressure: 5, 2, 8, -1, 3, 0, -2, 1, 5, 0, 4 & 6. Can it be concluded that the stimulus will, in general, be
accompanied by an increase in blood pressure at α = 0.05?

5. In a year there are 956 births in a town A of which 52.5% were male, while in towns A & B combined,
this proportion in a total of 1406 births was 0.496. Is there any significant difference in the proportion
of male births in the two towns at α = 0.05?


6. The variance of the weights of 10 shipments was 31 square kilograms. Can we say that the variance
of the distribution of weights of all shipments from which the above sample was drawn is equal to 20
square kilograms? (Assume normality.)
7. Two different types of drugs A & B were tried on certain patients for increasing weight; 5 persons
were given drug A & 7 persons were given drug B. The increase in weight (in pounds) is given below.
Drug A: 8 12 13 9 3
Drug B: 10 8 12 15 6 8 11

Do the two drugs differ significantly with regard to their effect in increasing weight at α = 0.05?

8. Two samples of sizes 9 and 8 gave the sums of squares of deviations from their respective means as
equal to 50 and 25, with means 275 and 290. Do the sample means differ significantly at α = 0.01?

CHAPTER 10: SIMPLE LINEAR REGRESSION AND
CORRELATION
After completing the topic, the students will be able to:
• Determine the relationship between variables.
• Find the fitted regression line of the two variables.
• Draw and describe a scatter diagram.
• Interpret the slope and intercept of the fitted regression line.
• Calculate and interpret the correlation coefficient.
• Find and interpret the coefficient of determination.
• Calculate and interpret explained and unexplained variations.
• Calculate and interpret Spearman's correlation coefficient.

10.1 Simple Linear Regression of Y on X


Under simple linear regression of Y on X, we have one independent (influencing) variable,
usually denoted by X, and one dependent variable, influenced by the independent
variable, which we denote by Y. For example, real-world variables that may be related
linearly are production/yield (Y) and amount of rainfall (X), and monthly income (Y) and level
of education (X), where an increase in one variable is associated with an increase in the other
variable. Similar examples can also be given of a negative relation between two variables,
where the increase in one is accompanied by a decrease in the other.
A simple linear regression model is given as
Y = α + βX + ε
where α is the intercept of the regression line. It gives the value of Y whenever X is zero. If the
range of X does not include zero, α has no practical interpretation. β is the slope. It is a
measure of the rate of change. It shows by how much Y changes for every unit change in X.
The sign of β also has some significance, because it shows the direction of the relation
between the two variables. A positive sign of β shows that the two variables are positively
related and a negative sign of β shows that the two variables are negatively related.
The constants α and β are parameters and are commonly referred to as regression coefficients.
ε is a random error term. It is neither observable nor measurable. In real life problems, even
though two variables are linearly related, their relationship is not fixed as
Y = α + βX
This is because the dependent variable, Y, is the effect of many independent variables, of
which X is one. The contribution of other independent variables not considered in the
model may be minor. However, we cannot be certain that Y depends only on X. Thus the
contribution of these variables not included in the model, and of other factors such as
measurement error, is accommodated by ε.
The mean of the values of ε is zero. Some of its values are positive, that is when the actual value
of Y lies above the line $\hat{Y}_i = \hat{\alpha} + \hat{\beta} X_i$, and some are negative, when the actual value of Y lies
below the fitted regression line.
Assumptions:
1. The relationship between the dependent variable Y and independent variable X exists and is
linear.
2. For every value of the independent variable X, there is an expected value of the dependent
variable Y.
3. The dependent variable Y is a continuous random variable, whereas values of the
independent variable X are fixed values.
4. The sampling error, ε, associated with the expected value of the dependent variable Y is
assumed to be an independent random variable distributed normally with mean 0 and constant
variance σ² about the regression line.
To estimate this model we take a sample of n independent observations which give rise to n
pairs (Xi, Yi) and find the best estimates of the parameters, or the best fitted line, using the
least squares method of estimation. A best fitting line is one for which the sum of squares of the
errors, $\sum \varepsilon_i^2$, is minimum.
In the principle of the least squares method, one would select $\hat{\alpha}$ and $\hat{\beta}$ such that
$\sum \varepsilon_i^2 = \sum (Y_i - \hat{Y}_i)^2$ is minimum, where $\hat{Y}_i = \hat{\alpha} + \hat{\beta} X_i$.

To minimize this function, first we take the partial derivatives of $\sum \varepsilon_i^2$ with respect to
$\hat{\alpha}$ and $\hat{\beta}$ respectively. Then the partial derivatives are equated to zero separately and result in
the following normal equations respectively:
$\sum Y_i = n\hat{\alpha} + \hat{\beta} \sum X_i$
$\sum X_i Y_i = \hat{\alpha} \sum X_i + \hat{\beta} \sum X_i^2$
Solving these normal equations simultaneously we can get the values of $\hat{\alpha}$ and $\hat{\beta}$ as follows.

$\hat{\beta} = \frac{n \sum_{i=1}^{n} x_i y_i - \left(\sum_{i=1}^{n} x_i\right)\left(\sum_{i=1}^{n} y_i\right)}{n \sum_{i=1}^{n} x_i^2 - \left(\sum_{i=1}^{n} x_i\right)^2} = \frac{\sum xy - n\bar{x}\bar{y}}{\sum x^2 - n\bar{x}^2} = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sum (x - \bar{x})^2}$ and $\hat{\alpha} = \bar{y} - \hat{\beta}\bar{x}$

These estimates are denoted by $\hat{\alpha}$ and $\hat{\beta}$, and the estimated (fitted) regression line is
given by
$\hat{Y}_i = \hat{\alpha} + \hat{\beta} X_i$
Before estimating the regression coefficients, it would be wise to plot the observed data on a
graph known as a scatter diagram. A scatter diagram is a plot of all ordered pairs (xi, yi) on the
coordinate plane which helps to observe the relationship between two variables. This diagram
gives a preliminary idea of the type of relationship the two variables have.
Regression analysis is useful in predicting the value of one variable from the given value of
another variable, using $\hat{Y}_i = \hat{\alpha} + \hat{\beta} X_i$.
Example 10.1

Consider the following data on the height of fathers in inches (X) and the height of their sons
in inches (Y). The rows x², xy and y² and the totals are used in the solution.

                                                            Total
x      65    63    67    64    68    70    71    69           537
y      67    66    68    65    69    68    70    68           541
x²   4225  3969  4489  4096  4624  4900  5041  4761         36105
xy   4355  4158  4556  4160  4692  4760  4970  4692         36343
y²   4489  4356  4624  4225  4761  4624  4900  4624         36603

Assuming a simple linear relationship between X and Y,

a/ Draw the scatter diagram;
b/ Find the estimated regression equation of Y on X;
c/ Give the predicted value of Y for X = 65.5

Solution: a/ The scatter diagram is as follows:

[Figure: Scatter diagram for height of father (x) and height of their sons (y). Horizontal axis:
Height of Father (62 to 72); vertical axis: Height of Sons (64 to 71).]

b) The necessary statistics are computed below:

$\hat{\beta} = \frac{\sum xy - n\bar{x}\bar{y}}{\sum x^2 - n\bar{x}^2} = \frac{36343 - (8)(67.125)(67.625)}{36105 - (8)(67.125)^2} = \frac{28.375}{58.875} = 0.482$ and

$\hat{\alpha}$ = 67.625 – 0.482(67.125) = 35.27

Hence, the equation is $\hat{y}$ = 35.27 + 0.482x.

c) When X = 65.5, $\hat{y}_{65.5}$ = 35.27 + 0.482(65.5) = 66.841
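
The least squares estimates of Example 10.1 can be reproduced with the sketch below, which applies the computational formula for the slope and intercept directly. The function name `least_squares_line` is hypothetical. Because nothing is rounded along the way, the results may differ from the hand computation in the third decimal place.

```python
def least_squares_line(x, y):
    """Return (alpha_hat, beta_hat) for the fitted line y_hat = alpha_hat + beta_hat * x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum(xi * yi for xi, yi in zip(x, y)) - n * x_bar * y_bar  # sum(xy) - n*x_bar*y_bar
    sxx = sum(xi * xi for xi in x) - n * x_bar ** 2                 # sum(x^2) - n*x_bar^2
    beta_hat = sxy / sxx
    alpha_hat = y_bar - beta_hat * x_bar
    return alpha_hat, beta_hat


fathers = [65, 63, 67, 64, 68, 70, 71, 69]  # X, inches
sons = [67, 66, 68, 65, 69, 68, 70, 68]     # Y, inches

alpha_hat, beta_hat = least_squares_line(fathers, sons)
predicted = alpha_hat + beta_hat * 65.5  # predicted son's height for X = 65.5
print(round(alpha_hat, 2), round(beta_hat, 3), round(predicted, 2))
```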

10.2 Covariance and Simple Linear Correlation Analysis


Given the paired data (x1, y1), (x2, y2), . . ., (xn, yn), we may want to describe the type and strength
of the relationship between the independent variable X and the dependent variable Y. Both can be
described by an index called the simple correlation coefficient. The population
correlation coefficient is represented by ρ and its estimator by r. The correlation coefficient r
is also called Pearson's correlation coefficient since it was developed by Karl Pearson. r is
given as the ratio of the covariance of the variables X and Y to the product of the standard
deviations of X and Y. The computational formula is:

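$$r = \frac{\operatorname{cov}(X, Y)}{\sqrt{\operatorname{Var}(X)\operatorname{Var}(Y)}} = \frac{\sum (x - \bar{x})(y - \bar{y})/(n - 1)}{\sqrt{\left[\sum (x - \bar{x})^2/(n - 1)\right]\left[\sum (y - \bar{y})^2/(n - 1)\right]}}$$

$$r = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sqrt{\sum (x - \bar{x})^2 \sum (y - \bar{y})^2}}$$

Alternatively, the correlation coefficient is given by

$$r = \frac{\sum xy - n\bar{x}\bar{y}}{\sqrt{\left[\sum x^2 - n\bar{x}^2\right]\left[\sum y^2 - n\bar{y}^2\right]}}$$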
The correlation coefficient r always lies between –1 and +1, inclusive.
• r = -1 implies a perfect negative linear relationship between the two variables.
• r = +1 implies a perfect positive linear relationship between the two variables.
• r = 0 implies there is no linear relationship between the two variables, although the two variables may
still have a non-linear relationship.
• r close to +1 indicates a strong positive linear relationship between the two variables.
• r close to -1 indicates a strong negative linear relationship between the two variables.
• r close to 0 indicates a weak linear relationship between the two variables.
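
As an illustration of the deviation form of the formula, the sketch below computes r in plain Python. It reuses the father and son heights from Example 10.1 as sample input, which is a choice made here for illustration; the function name `pearson_r` is likewise hypothetical.

```python
import math

# Sample data: father (x) and son (y) heights in inches from Example 10.1.
x = [65, 63, 67, 64, 68, 70, 71, 69]
y = [67, 66, 68, 65, 69, 68, 70, 68]


def pearson_r(xs, ys):
    """Pearson correlation: sum of cross-deviations over the root of the
    product of the sums of squared deviations."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((a - x_bar) * (b - y_bar) for a, b in zip(xs, ys))
    s_xx = sum((a - x_bar) ** 2 for a in xs)
    s_yy = sum((b - y_bar) ** 2 for b in ys)
    return s_xy / math.sqrt(s_xx * s_yy)


r = pearson_r(x, y)
print(f"r = {r:.3f}")
```

The (n - 1) divisors of the covariance and the variances cancel, which is why they do not appear in the code.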

Coefficient of Determination ($r^2$)


The square of the correlation coefficient, $r^2$, is called the coefficient of determination. It
measures the proportion of the variation in the dependent variable Y explained by the simple linear
regression of Y on X. $1 - r^2$ measures the proportion of variation in Y not explained by the simple
linear regression of Y on X.
Example 10.2
If r = 0.9, then $r^2$ = 0.81 and $1 - r^2$ = 0.19. Approximately 81% of the variation in the dependent
variable, Y, is explained by the simple linear regression of Y on X fitted on the sample data. The
remaining $1 - r^2$, that is 19% of the variation in Y, is unexplained by the simple linear regression
of Y on X fitted on the sample data.
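
The "explained variation" reading of $r^2$ can be verified numerically. The sketch below, in plain Python, fits the least squares line to the Example 10.1 heights (reused here only as sample data) and compares $r^2$ with one minus the ratio of the residual sum of squares to the total sum of squares. The names `sse` and `sst` are labels introduced for this sketch, not notation from the module.

```python
# Sample data: father (x) and son (y) heights in inches from Example 10.1.
x = [65, 63, 67, 64, 68, 70, 71, 69]
y = [67, 66, 68, 65, 69, 68, 70, 68]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n
s_xy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
s_xx = sum((a - x_bar) ** 2 for a in x)
s_yy = sum((b - y_bar) ** 2 for b in y)

# Least squares line of y on x.
slope = s_xy / s_xx
intercept = y_bar - slope * x_bar
fitted = [intercept + slope * a for a in x]

# sse: variation left unexplained by the line; sst: total variation in y.
sse = sum((b - f) ** 2 for b, f in zip(y, fitted))
sst = s_yy

r_squared = s_xy ** 2 / (s_xx * s_yy)
explained = 1 - sse / sst

print(f"r^2 = {r_squared:.3f}, 1 - SSE/SST = {explained:.3f}")
```

The two printed quantities agree up to floating-point error, which is the sense in which $r^2$ is the proportion of variation in Y explained by the regression.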
Example 10.3
The research director of the Saving and Loan Bank collected 25 observations of mortgage
interest rates X and the number of house sales Y at each interest rate. The director computed that
$\sum x = 125$, $\sum y = 100$, $\sum x^2 = 650$, $\sum xy = 520$, $\sum y^2 = 436$.
Compute and interpret (i) the coefficient of correlation;
(ii) the coefficient of determination.
Solution: i) Coefficient of correlation:

$$r = \frac{\sum xy - n\bar{x}\bar{y}}{\sqrt{\left[\sum x^2 - n\bar{x}^2\right]\left[\sum y^2 - n\bar{y}^2\right]}} = \frac{520 - (25)(5)(4)}{\sqrt{\left[650 - 25(5)(5)\right]\left[436 - (25)(4)(4)\right]}} = \frac{20}{30} = 0.667$$

The two variables have a positive linear relationship.
ii) Coefficient of determination: $r^2 = (0.667)^2 = 0.44$. This shows that 44% of the variation in the
number of house sales (Y) is explained by the simple linear regression of Y on X (interest
rate).
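
When only summary sums are available, as in Example 10.3, the computational form of r can be evaluated directly. The sketch below is plain Python with a hypothetical helper name, `correlation_from_sums`. The values ∑x = 125 and ∑y = 100 are inferred from the means 5 and 4 used in the solution with n = 25, so they should be read as an assumption.

```python
import math


def correlation_from_sums(n, sum_x, sum_y, sum_xy, sum_x2, sum_y2):
    """Pearson r from summary sums, using the computational formula."""
    x_bar = sum_x / n
    y_bar = sum_y / n
    numerator = sum_xy - n * x_bar * y_bar
    denominator = math.sqrt(
        (sum_x2 - n * x_bar ** 2) * (sum_y2 - n * y_bar ** 2)
    )
    return numerator / denominator


# sum_x and sum_y are inferred from the means 5 and 4 (assumption).
r = correlation_from_sums(
    n=25, sum_x=125, sum_y=100, sum_xy=520, sum_x2=650, sum_y2=436
)
r_squared = r ** 2

print(f"r = {r:.3f}, r^2 = {r_squared:.2f}")
```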

10.3 Spearman's Rank Correlation Coefficient


The simple correlation coefficient (r) cannot be used when we are dealing with qualitative
data such as judgments about beauty, efficiency, honesty, etc. In such cases, the rank
correlation coefficient is used to measure the correlation, that is, the degree of agreement between rankings.

It is denoted by $r_s$ and is defined as follows:

Definition: The coefficient of rank correlation, $r_s$, given by Spearman for n pairs, is

$$r_s = 1 - \frac{6\sum d^2}{n(n^2 - 1)},$$

where d is the difference between the rank of x and the rank of the corresponding y.

To calculate $r_s$, we first rank the x's among themselves from lowest to highest or from highest to
lowest; then we rank the y's in the same way and find the sum of the squares of the differences, d,
between the ranks of the x's and the y's. When there are ties in rank, we assign to each of the
tied observations (having equal value) the mean of the ranks they jointly occupy.
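
The tie rule can be made concrete with a small ranking routine. The following is a minimal sketch in plain Python; the function name `mean_ranks` and the scores are hypothetical and chosen only to show two tied values sharing the mean of their ranks.

```python
def mean_ranks(values):
    """Rank values from smallest (rank 1) to largest; tied values each
    receive the mean of the ranks they jointly occupy."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the block while the next sorted value equals the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        # Positions pos..end hold ranks pos+1..end+1; share their mean.
        shared = (pos + 1 + end + 1) / 2
        for k in range(pos, end + 1):
            ranks[order[k]] = shared
        pos = end + 1
    return ranks


scores = [70, 85, 85, 60, 90]  # hypothetical scores with one tie
ranks = mean_ranks(scores)
print(ranks)
```

Here the two scores of 85 would occupy ranks 3 and 4, so each is assigned 3.5.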

Example 10.4: Assume that ten girls in a beauty contest for Miss Debre Markos were ranked
by two judges as follows:

Girl Number 1 2 3 4 5 6 7 8 9 10

Judge A 4 8 6 7 1 3 2 5 10 9

Judge B 3 9 6 5 1 2 4 7 8 10

Calculate rs and interpret it.

Solution: Since the ranks are given, we need to find only the difference in ranks for each
girl and the square of these differences.

Girl Number 1 2 3 4 5 6 7 8 9 10 Total

d 1 -1 0 2 0 1 -2 -2 2 -1 0

d² 1 1 0 4 0 1 4 4 4 1 20

For these n = 10 pairs, $\sum d^2 = 20$, and

$$r_s = 1 - \frac{6(20)}{10(100 - 1)} = 0.88,$$

which is positive and close to 1, showing that there is very good agreement (or concordance) between
the two judges regarding the beauty of the girls.

N.B.: $\sum d = 0$ provides a check on the calculations.

Like the values of r, the values of $r_s$ also lie between -1 and +1, inclusive, and the interpretations of
its size and sign are analogous to those of r. $r_s = 1$ means perfect positive agreement; $r_s = -1$ means
complete disagreement, where the two rankings go in completely opposite directions.
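
The calculation in Example 10.4 can be reproduced in a few lines. This is a minimal sketch in plain Python using the two judges' ranks from the example; the function name `spearman_rs` is illustrative. It also returns the sum of the differences so that the check ∑d = 0 can be confirmed.

```python
# Ranks given by the two judges to the ten contestants (Example 10.4).
judge_a = [4, 8, 6, 7, 1, 3, 2, 5, 10, 9]
judge_b = [3, 9, 6, 5, 1, 2, 4, 7, 8, 10]


def spearman_rs(rank_x, rank_y):
    """Return (r_s, sum of d, sum of d^2) for two lists of ranks."""
    n = len(rank_x)
    d = [a - b for a, b in zip(rank_x, rank_y)]
    sum_d2 = sum(v * v for v in d)
    rs = 1 - 6 * sum_d2 / (n * (n ** 2 - 1))
    return rs, sum(d), sum_d2


rs, sum_d, sum_d2 = spearman_rs(judge_a, judge_b)
print(f"sum d = {sum_d}, sum d^2 = {sum_d2}, r_s = {rs:.2f}")
```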

Exercise 10
1. The following table shows the heights to the nearest inch (in) and the weights to the nearest pound (lb)
of a sample of 12 male students drawn at random from the first-year students at a university.

Height x (in) 70 63 72 60 66 70 74 65 62 67 65 68
Weight y (lb) 155 150 180 135 156 168 178 160 132 145 139 152

a/ Plot a scatter diagram of the data.
b/ Fit the least squares regression equation.
c/ Estimate the weight of a student whose height is 63 inches.

2. The following table presents the age of female patients (X) and their total cholesterol (Y).

Age (X)          25  25  28  32  32  32  38  42  48  51  51  58  62  65
Total Chol. (Y)  180 195 186 180 210 197 239 183 204 221 243 208 228 269

a/ Draw the scatter diagram.
b/ Find the least squares regression equation.
c/ Find the correlation coefficient.
d/ Find the coefficient of determination.
3. The following data represent the number of calories per serving and the number of grams of sugar per
serving for a random sample of high- fiber cereals.

Calories(x) 200 210 170 190 200 180 210 210 210 190 190 200
Sugar (Y) 18 23 17 20 18 19 23 16 17 12 11 11

a/ Draw the scatter diagram


b/ Determine the least square regression equation
4. Based on the following data, answer the questions.
n = 10, $\sum xy = 4174$, $\sum x = 139$, $\sum y = 280$, $\sum x^2 = 1985$
a/ What is the linear regression equation of y on x?
b/ Suppose the value of x = 20. What will be the value of y?

ANSWERS TO EXERCISES
Exercise 1
1. a/ All potential listeners of the radio program, b/ Those 91 listeners who called within a minute of
the question being asked, c/ The sample is not random. Only those who listened to that program and had
access to phones could have responded to the question. Hence the sample was biased and not scientific.
2. a/ collection of all 25-year-old males who have never taken fish more than once a week,
b/ a sub-collection of 500 males selected randomly, c/ difference (pre-diet minus post-diet) in
cholesterol level, d/ mean difference in cholesterol level, e/ qualitative
3. a/ Collection of all patients suffering from that particular type of headache, b/ Collection of 350
patients who have been surveyed or interviewed.
4. a/ nominal, b/ nominal, c/ ratio, d/ ratio, e/ nominal, f/ ratio, g/ ordinal, h/ ordinal, i/ nominal

Exercise 2
1. a) Continuous ; b) discrete; c) continuous; d) continuous.
2. Primary data are data collected by the investigator for the intended purpose; while secondary
data are collected by some other agency for different or similar purpose. Secondary data have to be
checked whether or not they are suitable, adequate and reliable for the purpose of the current study
before using them.

3.
Means of transport   Bus   Car   Plane   Total
Frequency            17    9     14      40

4. The frequency distribution is:

Monthly salary 90-98 99-107 108-116 117-125 126-134 135-143 144-152 Total

Frequency 1 4 3 6 9 6 1 30

5. a) Class marks: 0.5, 2.5, 4.5, 6.5, and 8.5;

b) Class boundaries: -0.5 to 1.5, 1.5 to 3.5, 3.5 to 5.5, 5.5 to 7.5, and 7.5 to 9.5;

c) Relative frequencies: 0.27, 0.42, 0.22, 0.07, and 0.03;

d) Cumulative frequencies are:

i) "Less than" cumulative frequency     ii) "Or more" cumulative frequency

Less than 1.5    16                      -0.5 or more    60
Less than 3.5    41                      1.5 or more     44
Less than 5.5    54                      3.5 or more     19
Less than 7.5    58                      5.5 or more     6
Less than 9.5    60                      7.5 or more     2

Exercise 3
1. a/mean=38.3, b/mode=40, c/median=28.5
2. a/mean=45.59, b/mode=46.44, c/median=40.16, d/Q1=36.09, P1=36.09, D8=53.65, D9=60.68
3. a/mode=3, b/median = 3, c/mean=3.47, d/Q2=3, e/D6=4, f/P25=3
4.
CI 32-36 37-41 42-46 47-51 52-56 57-61 Total
f 4 11 15 7 2 1 40

5. a/mean=14.35, b/mode = 15, c/median=15, d/D7=14.1, e/P80=20

Exercise 4
1. M.D($\bar{x}$) = 2.65, C.M.D($\bar{x}$) = 0.36, M.D($\hat{x}$) = 3.71, C.M.D($\hat{x}$) = 0.34, $S^2$ = 9.584, S = 3.096
2. Since CVA = 29.11% < CVB = 17.19%, section B scores was more variable than that of section A
scores.
3. City 1: ̅
City 2: ̅
City 1: ̅
Therefore, the city2 has the most consistent temperature.
4. . Since , the distribution is negatively skewed.
The fourth moment is 30,000.

Exercise 5

1. 0.4167, 2/18/50, 3/0.5275, 4/ a/12/365, b/7/365, c/31/365, d/1/365
Exercise 6

1. a) $f(x) = \dfrac{\binom{5}{x}\binom{15}{4 - x}}{\binom{20}{4}},\ x = 0, 1, 2, 3, 4$; b) Find f(0), f(1), f(2), f(3) and f(4) in (a);
c) E(X) = 1, and V(X) = 12/19
2. a) 0.125; b) E(X) = 3/4, and V(X) = 3/80; c) $(0.95)^{1/3}$; d) $(0.5)^{1/3}$.
3. a) k = 1/25 ; b) i) 0.5 ; ii) ¾; c) 500kg
4. a)0.53; b)1.56; c)0.44 (5) a)0.7881; b)0.2743.
6. a) 29.141; b) 26.119; (7) 0.312. 8. i) 2 ii) 0.678 (9) a) 2; b)1.833

Exercise 7
1. a/Mean of sample mean = 2550, variance of sample mean = 46.291
b/ p=0.7198
2. p=1
3. a/ systematic random sampling, b/cluster sampling(first stage)
4. 0.4286, 5. b/9, c/ s.d.=2.64
Exercise 8

1. (29.833, 34.167)
2. a) $\bar{x} = 11$ b) (9.87, 12.13)
3. (0.654, 0.846)
4. a) The 95% C.I. for μ is (62.0, 75.4) beats per minute. b) Ho: μ = μo Vs H1: μ ≠ μo. Since |tcal| = 1.14 <
ttab = 2.306, Ho is accepted.
5. Zc = 2.96 > 2.33 ⇒ Ho is rejected.
6. Ho: P = 0.05 Vs H1: P < 0.05, Zc = -0.07 > -Zα = -1.645 ⇒ Ho is not rejected.
7. Z = -0.81 ⇒ Ho cannot be rejected.
8. (132.517, 141.283)
9. χ² = 4.01 ⇒ The two attributes are independent.
10. t = 2.55 ⇒ Ho cannot be rejected.

Exercise 9
1. zc= 3.88, reject Ho
2. ̅ =8.1, ̅ = 7.66, = 2.07, = 1.07, tc= 0.74, Accept Ho

3. zc= 14.08, reject Ho
4. ̅ =2.58, =9.538, tc=2.89, reject Ho
5. n1 = 956, n2 = 450, p1= 0.525, p2 = 0.434, p = 0.496, reject Ho
6. = 13.95, accept Ho
7. tc=0.5, accept Ho
8. tc= -13.81, reject Ho
Exercise 10
1. b/ $\hat{y}$ = -60.7 + 3.22x, c/ 142.16
2. b/ $\hat{y}$ = 151.354 + 1.399x, c/ 0.718, d/ coefficient of determination = 51.5%
3. b/ $\hat{y}$ = 0.93 + 0.0821x
4. a/ $\hat{y}$ = -46.10 + 5.33x, b/ 60.52

Appendix: Table A. Area under the standard normal curve between Z = 0 and Z = z, i.e. P(0 ≤ Z ≤ z)

z 0.00 0.01 0.02 0.03 0.04 0.05 0.06 0.07 0.08 0.09
0.0 0.0000 0.0040 0.0080 0.0120 0.0160 0.0199 0.0239 0.0279 0.0319 0.0359
0.1 0.0398 0.0438 0.0478 0.0517 0.0557 0.0596 0.0636 0.0675 0.0714 0.0753
0.2 0.0793 0.0832 0.0871 0.0910 0.0948 0.0987 0.1026 0.1064 0.1103 0.1141
0.3 0.1179 0.1217 0.1255 0.1293 0.1331 0.1368 0.1406 0.1443 0.1480 0.1517
0.4 0.1554 0.1591 0.1628 0.1664 0.1700 0.1736 0.1772 0.1808 0.1844 0.1879
0.5 0.1915 0.1950 0.1985 0.2019 0.2054 0.2088 0.2123 0.2157 0.2190 0.2224
0.6 0.2257 0.2291 0.2324 0.2357 0.2389 0.2422 0.2454 0.2486 0.2517 0.2549
0.7 0.2580 0.2611 0.2642 0.2673 0.2704 0.2734 0.2764 0.2794 0.2823 0.2852
0.8 0.2881 0.2910 0.2939 0.2967 0.2995 0.3023 0.3051 0.3078 0.3106 0.3133
0.9 0.3159 0.3186 0.3212 0.3238 0.3264 0.3289 0.3315 0.3340 0.3365 0.3389
1.0 0.3413 0.3438 0.3461 0.3485 0.3508 0.3531 0.3554 0.3577 0.3599 0.3621
1.1 0.3643 0.3665 0.3686 0.3708 0.3729 0.3749 0.3770 0.3790 0.3810 0.3830
1.2 0.3849 0.3869 0.3888 0.3907 0.3925 0.3944 0.3962 0.3980 0.3997 0.4015
1.3 0.4032 0.4049 0.4066 0.4082 0.4099 0.4115 0.4131 0.4147 0.4162 0.4177
1.4 0.4192 0.4207 0.4222 0.4236 0.4251 0.4265 0.4279 0.4292 0.4306 0.4319
1.5 0.4332 0.4345 0.4357 0.4370 0.4382 0.4394 0.4406 0.4418 0.4429 0.4441
1.6 0.4452 0.4463 0.4474 0.4484 0.4495 0.4505 0.4515 0.4525 0.4535 0.4545
1.7 0.4554 0.4564 0.4573 0.4582 0.4591 0.4599 0.4608 0.4616 0.4625 0.4633
1.8 0.4641 0.4649 0.4656 0.4664 0.4671 0.4678 0.4686 0.4693 0.4699 0.4706
1.9 0.4713 0.4719 0.4726 0.4732 0.4738 0.4744 0.4750 0.4756 0.4761 0.4767
2.0 0.4772 0.4778 0.4783 0.4788 0.4793 0.4798 0.4803 0.4808 0.4812 0.4817
2.1 0.4821 0.4826 0.4830 0.4834 0.4838 0.4842 0.4846 0.4850 0.4854 0.4857
2.2 0.4861 0.4864 0.4868 0.4871 0.4875 0.4878 0.4881 0.4884 0.4887 0.4890
2.3 0.4893 0.4896 0.4898 0.4901 0.4904 0.4906 0.4909 0.4911 0.4913 0.4916
2.4 0.4918 0.4920 0.4922 0.4925 0.4927 0.4929 0.4931 0.4932 0.4934 0.4936
2.5 0.4938 0.4940 0.4941 0.4943 0.4945 0.4946 0.4948 0.4949 0.4951 0.4952
2.6 0.4953 0.4955 0.4956 0.4957 0.4959 0.4960 0.4961 0.4962 0.4963 0.4964
2.7 0.4965 0.4966 0.4967 0.4968 0.4969 0.4970 0.4971 0.4972 0.4973 0.4974
2.8 0.4974 0.4975 0.4976 0.4977 0.4977 0.4978 0.4979 0.4979 0.4980 0.4981
2.9 0.4981 0.4982 0.4982 0.4983 0.4984 0.4984 0.4985 0.4985 0.4986 0.4986
3.0 0.4987 0.4987 0.4987 0.4988 0.4988 0.4989 0.4989 0.4989 0.4990 0.4990
3.1 0.4990 0.4991 0.4991 0.4991 0.4992 0.4992 0.4992 0.4992 0.4993 0.4993
3.2 0.4993 0.4993 0.4994 0.4994 0.4994 0.4994 0.4994 0.4995 0.4995 0.4995
3.3 0.4995 0.4995 0.4995 0.4996 0.4996 0.4996 0.4996 0.4996 0.4996 0.4997
3.4 0.4997 0.4997 0.4997 0.4997 0.4997 0.4997 0.4997 0.4997 0.4997 0.4998
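
Individual entries of Table A can be checked without a printed table. The sketch below assumes Python 3.8 or later, where the standard library provides `statistics.NormalDist`; the function name `area_zero_to_z` is a label chosen here. The tabulated area between 0 and z is the cumulative probability at z minus 0.5.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
standard_normal = NormalDist()


def area_zero_to_z(z):
    """Area under the standard normal curve between 0 and z (z >= 0)."""
    return standard_normal.cdf(z) - 0.5


# Row label plus column label gives z, e.g. row 1.9 and column 0.06 give z = 1.96.
for z in (0.05, 1.00, 1.96):
    print(f"z = {z:.2f}: area = {area_zero_to_z(z):.4f}")
```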
Table B. t- table with right tail probabilities

t α=p 0.1 0.05 0.025 0.01 0.005 0.0025 0.001 0.0005


df = 1 3.078 6.314 12.706 31.821 63.656 127.321 318.289 636.578
2 1.886 2.920 4.303 6.965 9.925 14.089 22.328 31.600
3 1.638 2.353 3.182 4.541 5.841 7.453 10.214 12.924
4 1.533 2.132 2.776 3.747 4.604 5.598 7.173 8.610
5 1.476 2.015 2.571 3.365 4.032 4.773 5.894 6.869
6 1.440 1.943 2.447 3.143 3.707 4.317 5.208 5.959
7 1.415 1.895 2.365 2.998 3.499 4.029 4.785 5.408
8 1.397 1.860 2.306 2.896 3.355 3.833 4.501 5.041
9 1.383 1.833 2.262 2.821 3.250 3.690 4.297 4.781
10 1.372 1.812 2.228 2.764 3.169 3.581 4.144 4.587
11 1.363 1.796 2.201 2.718 3.106 3.497 4.025 4.437
12 1.356 1.782 2.179 2.681 3.055 3.428 3.930 4.318
13 1.350 1.771 2.160 2.650 3.012 3.372 3.852 4.221
14 1.345 1.761 2.145 2.624 2.977 3.326 3.787 4.140
15 1.341 1.753 2.131 2.602 2.947 3.286 3.733 4.073
16 1.337 1.746 2.120 2.583 2.921 3.252 3.686 4.015
17 1.333 1.740 2.110 2.567 2.898 3.222 3.646 3.965
18 1.330 1.734 2.101 2.552 2.878 3.197 3.610 3.922
19 1.328 1.729 2.093 2.539 2.861 3.174 3.579 3.883
20 1.325 1.725 2.086 2.528 2.845 3.153 3.552 3.850
21 1.323 1.721 2.080 2.518 2.831 3.135 3.527 3.819
22 1.321 1.717 2.074 2.508 2.819 3.119 3.505 3.792
23 1.319 1.714 2.069 2.500 2.807 3.104 3.485 3.768
24 1.318 1.711 2.064 2.492 2.797 3.091 3.467 3.745
25 1.316 1.708 2.060 2.485 2.787 3.078 3.450 3.725
26 1.315 1.706 2.056 2.479 2.779 3.067 3.435 3.707
27 1.314 1.703 2.052 2.473 2.771 3.057 3.421 3.689
28 1.313 1.701 2.048 2.467 2.763 3.047 3.408 3.674
29 1.311 1.699 2.045 2.462 2.756 3.038 3.396 3.660
30 1.310 1.697 2.042 2.457 2.750 3.030 3.385 3.646
40 1.303 1.684 2.021 2.423 2.704 2.971 3.307 3.551
50 1.299 1.676 2.009 2.403 2.678 2.937 3.261 3.496
60 1.296 1.671 2.000 2.390 2.660 2.915 3.232 3.460
Infinity 1.282 1.645 1.960 2.326 2.576 2.807 3.090 3.290
Table C. Right tail areas for the Chi-square Distribution

df\area 0.995 0.99 0.975 0.95 0.9 0.25 0.1 0.05 0.025 0.01 0.005
1 0.000 0.000 0.001 0.004 0.016 1.323 2.706 3.841 5.024 6.635 7.879
2 0.010 0.020 0.051 0.103 0.211 2.773 4.605 5.991 7.378 9.210 10.597
3 0.072 0.115 0.216 0.352 0.584 4.108 6.251 7.815 9.348 11.345 12.838
4 0.207 0.297 0.484 0.711 1.064 5.385 7.779 9.488 11.143 13.277 14.860
5 0.412 0.554 0.831 1.145 1.610 6.626 9.236 11.071 12.833 15.086 16.750
6 0.676 0.872 1.237 1.635 2.204 7.841 10.645 12.592 14.449 16.812 18.548
7 0.989 1.239 1.690 2.167 2.833 9.037 12.017 14.067 16.013 18.475 20.278
8 1.344 1.647 2.180 2.733 3.490 10.219 13.362 15.507 17.535 20.090 21.955
9 1.735 2.088 2.700 3.325 4.168 11.389 14.684 16.919 19.023 21.666 23.589
10 2.156 2.558 3.247 3.940 4.865 12.549 15.987 18.307 20.483 23.209 25.188
11 2.603 3.053 3.816 4.575 5.578 13.701 17.275 19.675 21.920 24.725 26.757
12 3.074 3.571 4.404 5.226 6.304 14.845 18.549 21.026 23.337 26.217 28.300
13 3.565 4.107 5.009 5.892 7.042 15.984 19.812 22.362 24.736 27.688 29.819
14 4.075 4.660 5.629 6.571 7.790 17.117 21.064 23.685 26.119 29.141 31.319
15 4.601 5.229 6.262 7.261 8.547 18.245 22.307 24.996 27.488 30.578 32.801
16 5.142 5.812 6.908 7.962 9.312 19.369 23.542 26.296 28.845 32.000 34.267
17 5.697 6.408 7.564 8.672 10.085 20.489 24.769 27.587 30.191 33.409 35.718
18 6.265 7.015 8.231 9.390 10.865 21.605 25.989 28.869 31.526 34.805 37.156
19 6.844 7.633 8.907 10.117 11.651 22.718 27.204 30.144 32.852 36.191 38.582
20 7.434 8.260 9.591 10.851 12.443 23.828 28.412 31.410 34.170 37.566 39.997
21 8.034 8.897 10.283 11.591 13.240 24.935 29.615 32.671 35.479 38.932 41.401
22 8.643 9.542 10.982 12.338 14.041 26.039 30.813 33.924 36.781 40.289 42.796
23 9.260 10.196 11.689 13.091 14.848 27.141 32.007 35.172 38.076 41.638 44.181
24 9.886 10.856 12.401 13.848 15.659 28.241 33.196 36.415 39.364 42.980 45.559
25 10.520 11.524 13.120 14.611 16.473 29.339 34.382 37.652 40.646 44.314 46.928
26 11.160 12.198 13.844 15.379 17.292 30.435 35.563 38.885 41.923 45.642 48.290
27 11.808 12.879 14.573 16.151 18.114 31.528 36.741 40.113 43.195 46.963 49.645
28 12.461 13.565 15.308 16.928 18.939 32.620 37.916 41.337 44.461 48.278 50.993
29 13.121 14.256 16.047 17.708 19.768 33.711 39.087 42.557 45.722 49.588 52.336
30 13.787 14.953 16.791 18.493 20.599 34.800 40.256 43.773 46.979 50.892 53.672
