MAS 206 Lecture Notes 2

The History of the Central Limit Theorem

What we call the central limit theorem actually comprises several theorems developed over the years. The first such theorem was the discovery of the normal curve by Abraham De Moivre in 1733, when he discovered the normal distribution as the limit of the binomial distribution. The fact that the normal distribution appears as a limit of the binomial distribution as n increases is a form of the central limit theorem. Around the turn of the twentieth century, Liapunov gave a more general form of the central limit theorem, and in 1922 Lindeberg gave the final form we use in applied statistics. In 1935, W. Feller gave the proof of the necessary condition of the theorem.

Let us now look at an example of the use of the central limit theorem.

Example 1

ABC Tool Company makes the Laser XR, a special engine used in speedboats. The company's engineers believe that the engine delivers an average power of 220 horsepower and that the standard deviation of power delivered is 15 horsepower. A potential buyer intends to sample 100 engines (each engine to be run a single time). What is the probability that the sample mean X̄ will be less than 217 horsepower?

Solution: Given that:

Population mean μ = 220 horsepower
Population standard deviation σ = 15 horsepower
Sample size n = 100

Here our random variable X̄ is normal (or at least approximately so, by the central limit theorem, since our sample size is large):

X̄ ~ N(μ, σ²/n)

or X̄ ~ N(220, 15²/100)

So we can use the standard normal variable Z = (X̄ - μ)/(σ/√n) to find the required probability:

P(X̄ < 217) = P(Z < (217 - 220)/(15/√100))
            = P(Z < -2)
            = 0.0228

So there is only a small probability that the potential buyer's tests will result in a sample mean less than 217 horsepower.
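This probability can be checked numerically. The following short Python sketch (assuming numpy and scipy are available) computes the same value from the normal sampling distribution of the mean and also approximates it by simulating many samples of 100 engines; the normal shape of individual engine outputs is assumed purely for the simulation.

    import numpy as np
    from scipy import stats

    mu, sigma, n = 220, 15, 100          # population mean, sd, sample size
    se = sigma / np.sqrt(n)              # standard error of the sample mean

    # Value from the (approximately) normal sampling distribution of X-bar
    p_exact = stats.norm.cdf(217, loc=mu, scale=se)

    # Simulation: draw many samples of size n and record their means
    rng = np.random.default_rng(0)
    means = rng.normal(mu, sigma, size=(100_000, n)).mean(axis=1)
    p_sim = np.mean(means < 217)

    print(p_exact, p_sim)                # both are close to 0.0228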

SAMPLING DISTRIBUTION OF THE PROPORTION


Let us assume we have a binomial population, in which a proportion p of the population possesses a particular attribute that is of interest to us. This also implies that a proportion q (= 1 - p) of the population does not possess the attribute of interest. If we pick a sample of size n with replacement and find x successes in the sample, the sample proportion of successes (p̂) is given by

p̂ = x/n

Here x is a binomial random variable; the possible values of this random variable depend on the composition of the random sample from which p̂ is computed. The probability of x successes in a sample of size n is given by the binomial probability distribution, viz.

P(x) = ⁿCₓ pˣ qⁿ⁻ˣ

Since p̂ = x/n and n is fixed (determined before the sampling), the distribution of the number of successes (x) leads to the distribution of p̂.


The sampling distribution of p̂ is the probability distribution of all possible values the random variable p̂ may take when a sample of size n is taken from a specified population.

The expected value and the variance of x, i.e. the number of successes in a sample of size n, are known to be:

E(x) = np
Var(x) = npq

Finally, we have the mean and variance of the sampling distribution of p̂:

μ_p̂ = E(p̂) = E(x/n) = (1/n) E(x) = (1/n) np = p

and

σ²_p̂ = Var(p̂) = Var(x/n) = (1/n²) Var(x) = (1/n²) npq = pq/n

σ_p̂ = SD(p̂) = √(pq/n)

When sampling is without replacement, we can use the finite population correction factor, so the sampling distribution of p̂ has

Mean: μ_p̂ = p

Variance: σ²_p̂ = (pq/n)·((N - n)/(N - 1))

Standard deviation: σ_p̂ = √[(pq/n)·((N - n)/(N - 1))]
As the sample size n increases, the central limit theorem applies here as well. The rate at which the distribution approaches a normal distribution does depend, however, on the shape of the distribution of the parent population:

- if the population is symmetrically distributed, the distribution of p̂ approaches the normal distribution relatively fast;
- if the population distribution is very different from a symmetrical distribution, a relatively large sample size is required to achieve a good normal approximation for the distribution of p̂.

In order to use the normal approximation for the sampling distribution of p̂, the sample size needs to be large. A commonly used rule of thumb says that the normal approximation to the distribution of p̂ may be used only if both np and nq are greater than 5.

We now state the central limit theorem when sampling for the population proportion p.

When sampling is done from a population with proportion p, the sampling distribution of the sample proportion p̂ approaches a normal distribution with mean p and standard deviation √(pq/n) as the sample size n increases.

For "large enough" n:  p̂ ~ N(p, pq/n)

The estimated standard deviation of p̂ is also called its standard error. We demonstrate the use of the theorem in Example 2.

Example 2

A manufacturer of screws has noticed that, on average, a proportion 0.02 of the screws produced are defective. A random sample of 400 screws is examined for the proportion of defective screws. Find the probability that the proportion of defective screws (p̂) in the sample is between 0.01 and 0.03.

Solution: Given that:

Population proportion p = 0.02
So q = 1 - 0.02 = 0.98
Sample size n = 400

Since the population is infinite and the sample size is large, the central limit theorem applies. So

p̂ ~ N(p, pq/n)
p̂ ~ N(0.02, (0.02)(0.98)/400)

We can find the required probability using the standard normal variable Z = (p̂ - p)/√(pq/n):

P(0.01 < p̂ < 0.03) = P[(0.01 - 0.02)/√((0.02)(0.98)/400) < Z < (0.03 - 0.02)/√((0.02)(0.98)/400)]
                   = P(-0.01/0.007 < Z < 0.01/0.007)
                   = P(-1.43 < Z < 1.43)
                   = 2 P(0 < Z < 1.43)
                   = 0.8472

So there is a very high probability that the sample will result in a proportion between 0.01 and 0.03.
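The same probability can be checked numerically. A minimal Python sketch (assuming numpy and scipy) evaluates the normal approximation and also simulates sample proportions from the exact binomial model:

    import numpy as np
    from scipy import stats

    p, n = 0.02, 400
    q = 1 - p
    se = np.sqrt(p * q / n)              # standard error of the sample proportion

    # Normal approximation to the sampling distribution of p-hat
    p_normal = stats.norm.cdf(0.03, p, se) - stats.norm.cdf(0.01, p, se)

    # Simulation: sample proportions from the exact binomial model
    rng = np.random.default_rng(0)
    phat = rng.binomial(n, p, size=200_000) / n
    p_sim = np.mean((phat > 0.01) & (phat < 0.03))

    print(p_normal, p_sim)   # ~0.85 vs ~0.79; the gap reflects the discreteness of p-hat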

SAMPLING DISTRIBUTION OF THE DIFFERENCE OF SAMPLE MEANS


In order to bring out the sampling distribution of the difference of sample means, let us assume we have two populations labelled 1 and 2, so that

μ₁ and μ₂ denote the two population means,
σ₁ and σ₂ denote the two population standard deviations,
n₁ and n₂ denote the two sample sizes,
X̄₁ and X̄₂ denote the two sample means.

Let us consider independent random sampling from the populations, so that the sample sizes need not be the same for both populations.

Since X̄₁ and X̄₂ are random variables, so is their difference X̄₁ - X̄₂. As a random variable, X̄₁ - X̄₂ has a probability distribution. This probability distribution is the sampling distribution of X̄₁ - X̄₂.

The sampling distribution of X̄₁ - X̄₂ is the probability distribution of all possible values the random variable X̄₁ - X̄₂ may take when independent samples of sizes n₁ and n₂ are taken from two specified populations.

Mean and Variance of X̄₁ - X̄₂

μ_(X̄₁-X̄₂) = E(X̄₁ - X̄₂) = E(X̄₁) - E(X̄₂) = μ₁ - μ₂

and

σ²_(X̄₁-X̄₂) = Var(X̄₁ - X̄₂) = Var(X̄₁) + Var(X̄₂)

= σ₁²/n₁ + σ₂²/n₂ ; when sampling is with replacement

= (σ₁²/n₁)·((N₁ - n₁)/(N₁ - 1)) + (σ₂²/n₂)·((N₂ - n₂)/(N₂ - 1)) ; when sampling is without replacement

As the sample sizes n₁ and n₂ increase, the central limit theorem applies here as well. So we state the central limit theorem when sampling for the difference of sample means X̄₁ - X̄₂.

When sampling is done from two populations with means μ₁ and μ₂ and standard deviations σ₁ and σ₂ respectively, the sampling distribution of the difference of sample means X̄₁ - X̄₂ approaches a normal distribution with mean μ₁ - μ₂ and standard deviation √(σ₁²/n₁ + σ₂²/n₂) as the sample sizes n₁ and n₂ increase.

For "large enough" n₁ and n₂:  X̄₁ - X̄₂ ~ N(μ₁ - μ₂, σ₁²/n₁ + σ₂²/n₂)

The estimated standard deviation of X̄₁ - X̄₂ is also called its standard error. We demonstrate the use of the theorem in Example 3.

Example 3

The makers of Duracell batteries claim that their size AA battery lasts on average 45 minutes longer than Duracell's main competitor, the Energizer. Two independent random samples of 100 batteries of each kind are selected. Assuming σ₁ = 84 minutes and σ₂ = 67 minutes, find the probability that the difference in the average lives of Duracell and Energizer batteries based on the samples does not exceed 54 minutes.

Solution: Given that:

μ₁ - μ₂ = 45
σ₁ = 84 and σ₂ = 67
n₁ = 100 and n₂ = 100

Let X̄₁ and X̄₂ denote the sample average lives of Duracell and Energizer batteries respectively. Since the populations are infinite and the sample sizes are large, the central limit theorem applies, i.e.

X̄₁ - X̄₂ ~ N(μ₁ - μ₂, σ₁²/n₁ + σ₂²/n₂)
X̄₁ - X̄₂ ~ N(45, 84²/100 + 67²/100)

So we can find the required probability using the standard normal variable

Z = [(X̄₁ - X̄₂) - (μ₁ - μ₂)] / √(σ₁²/n₁ + σ₂²/n₂)

So P(X̄₁ - X̄₂ < 54) = P(Z < (54 - 45)/√(84²/100 + 67²/100))
                    = P(Z < 0.84)
                    = 1 - 0.20045
                    = 0.79955

So there is a fairly high probability that the difference in the average lives of Duracell and Energizer batteries based on the samples does not exceed 54 minutes.
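The calculation can be mirrored in a few lines of Python (assuming numpy and scipy). Note that the individual population means of 445 and 400 minutes used in the simulation are invented for illustration only; the example specifies just their difference of 45 minutes, which is all that matters here.

    import numpy as np
    from scipy import stats

    mu_diff, s1, s2, n1, n2 = 45, 84, 67, 100, 100
    se = np.sqrt(s1**2 / n1 + s2**2 / n2)     # standard error of X1-bar minus X2-bar

    # Direct computation from the normal sampling distribution
    p_exact = stats.norm.cdf(54, loc=mu_diff, scale=se)

    # Simulation: 445 and 400 are made-up means whose difference is 45
    rng = np.random.default_rng(0)
    d = (rng.normal(445, s1, size=(100_000, n1)).mean(axis=1)
         - rng.normal(400, s2, size=(100_000, n2)).mean(axis=1))
    p_sim = np.mean(d < 54)

    print(p_exact, p_sim)                     # both are close to 0.80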

SAMPLING DISTRIBUTION OF THE DIFFERENCE OF SAMPLE PROPORTIONS

Let us assume we have two binomial populations labelled 1 and 2, so that

p₁ and p₂ denote the two population proportions,
n₁ and n₂ denote the two sample sizes,
p̂₁ and p̂₂ denote the two sample proportions.

Let us consider independent random sampling from the populations, so that the sample sizes need not be the same for both populations.

Since p̂₁ and p̂₂ are random variables, so is their difference p̂₁ - p̂₂. As a random variable, p̂₁ - p̂₂ has a probability distribution. This probability distribution is the sampling distribution of p̂₁ - p̂₂.

The sampling distribution of p̂₁ - p̂₂ is the probability distribution of all possible values the random variable p̂₁ - p̂₂ may take when independent samples of sizes n₁ and n₂ are taken from two specified binomial populations.
Mean and Variance of p̂₁ - p̂₂

μ_(p̂₁-p̂₂) = E(p̂₁ - p̂₂) = E(p̂₁) - E(p̂₂) = p₁ - p₂

and

σ²_(p̂₁-p̂₂) = Var(p̂₁ - p̂₂) = Var(p̂₁) + Var(p̂₂)

= p₁q₁/n₁ + p₂q₂/n₂ ; when sampling is with replacement

= (p₁q₁/n₁)·((N₁ - n₁)/(N₁ - 1)) + (p₂q₂/n₂)·((N₂ - n₂)/(N₂ - 1)) ; when sampling is without replacement

As the sample sizes n₁ and n₂ increase, the central limit theorem applies here as well. So we state the central limit theorem when sampling for the difference of sample proportions p̂₁ - p̂₂.

When sampling is done from two populations with proportions p₁ and p₂ respectively, the sampling distribution of the difference of sample proportions p̂₁ - p̂₂ approaches a normal distribution with mean p₁ - p₂ and standard deviation √(p₁q₁/n₁ + p₂q₂/n₂) as the sample sizes n₁ and n₂ increase.

For "large enough" n₁ and n₂:  p̂₁ - p̂₂ ~ N(p₁ - p₂, p₁q₁/n₁ + p₂q₂/n₂)

The estimated standard deviation of p̂₁ - p̂₂ is also called its standard error. We demonstrate the use of the theorem in Example 4.

Example 4

It has been experienced that the proportions of defaulters (in tax payments) belonging to the business class and the professional class are 0.20 and 0.15 respectively. The results of a sample survey are:

                              Business class        Professional class
Sample size:                  n₁ = 400              n₂ = 420
Proportion of defaulters:     p̂₁ = 0.21             p̂₂ = 0.14

Find the probability of drawing two samples with a difference in the two sample proportions larger than the one observed.

Solution: Given that:

p₁ = 0.20, q₁ = 1 - 0.20 = 0.80, n₁ = 400, p̂₁ = 0.21
p₂ = 0.15, q₂ = 1 - 0.15 = 0.85, n₂ = 420, p̂₂ = 0.14

Since the populations are infinite and the sample sizes are large, the central limit theorem applies, i.e.

p̂₁ - p̂₂ ~ N(p₁ - p₂, p₁q₁/n₁ + p₂q₂/n₂)
p̂₁ - p̂₂ ~ N(0.05, (0.20)(0.80)/400 + (0.15)(0.85)/420)

So we can find the required probability using the standard normal variable

Z = [(p̂₁ - p̂₂) - (p₁ - p₂)] / √(p₁q₁/n₁ + p₂q₂/n₂)

The observed difference is p̂₁ - p̂₂ = 0.21 - 0.14 = 0.07, so

P(p̂₁ - p̂₂ > 0.07) = P(Z > (0.07 - 0.05)/√((0.20)(0.80)/400 + (0.15)(0.85)/420))
                   = P(Z > 0.75)
                   = 0.2266

So there is a fairly low probability of drawing two samples with a difference in the two sample proportions larger than the one observed.
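As a check on the arithmetic, a small Python sketch (assuming numpy and scipy) gives essentially the same answer from the normal approximation and from a binomial simulation of the two samples:

    import numpy as np
    from scipy import stats

    p1, n1 = 0.20, 400
    p2, n2 = 0.15, 420
    se = np.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)

    # P(p1-hat - p2-hat > 0.07) under the normal approximation
    p_normal = stats.norm.sf(0.07, loc=p1 - p2, scale=se)

    # Simulation from the two binomial populations
    rng = np.random.default_rng(0)
    diff = (rng.binomial(n1, p1, 200_000) / n1
            - rng.binomial(n2, p2, 200_000) / n2)
    p_sim = np.mean(diff > 0.07)

    print(p_normal, p_sim)   # both are roughly 0.22-0.23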


SMALL SAMPLING DISTRIBUTIONS

Up to now we have been discussing large sampling distributions, in the sense that the various sampling distributions can be well approximated by a normal distribution for "large enough" sample sizes. In other words, the Z-statistic is used in statistical inference when the sample size is large. It may, however, be appreciated that the sample size may be prevented from being large, either due to physical limitations or due to practical difficulties such as sampling costs being too high. Consequently, for our statistical inferences, we may often have to content ourselves with a small sample size and limited information. The consequences of the sample being small (n < 30) are that

- the central limit theorem ceases to operate, and

- the sample standard deviation S can no longer be treated as a reliable stand-in for σ, since its own sampling variability is no longer negligible.

Thus, the basic difference the sample size makes is that while the sampling distributions based on large samples are approximately normal and substituting S for σ does little harm, the same does not hold when the sample is small.

It may be appreciated that small sampling distributions are also known as exact sampling distributions, as the statistical inferences based on them are not subject to approximation. However, the assumption of the population being normal is the basic qualification underlying the application of small sampling distributions.

In the category of small sampling distributions, the binomial and Poisson distributions have already been discussed. Now we will discuss three more important small sampling distributions: the chi-square, the F and the Student's t-distribution. The purpose of discussing these distributions at this stage is limited to understanding the variables which define them and their essential properties. The applications of these distributions will be highlighted in the next two lessons.

The small sampling distributions are defined in terms of the concept of degrees of freedom. We will discuss this concept before proceeding further.

Degrees of Freedom (df)

The concept of degrees of freedom (df) is important for many statistical calculations and probability distributions. We may define the df associated with a sample statistic as the number of observations contained in a set of sample data which can be freely chosen. It refers to the number of independent observations which can vary freely without being influenced by the restrictions imposed by the sample statistic(s) to be computed.

Let x₁, x₂, ..., xₙ be n observations comprising a sample whose mean x̄ = (1/n) Σ xᵢ is a value known to us. Obviously, we are free to assign any value to n - 1 observations out of the n observations. Once values have been freely assigned to n - 1 observations, the freedom to do the same for the nth observation is lost, and its value is automatically determined as

nth observation = n x̄ - sum of the n - 1 observations
                = n x̄ - (x₁ + x₂ + ... + xₙ₋₁)

As the value of the nth observation must satisfy the restriction

Σ xᵢ = n x̄

we say that one degree of freedom is lost, and the sum n x̄ of the n observations has n - 1 df associated with it.

For example, if the sum of four observations is 10, we are free to assign any value to three observations only, say x₁ = 2, x₂ = 1 and x₃ = 4. Given these values, the value of the fourth observation is automatically determined as

x₄ = Σ xᵢ - (x₁ + x₂ + x₃)
x₄ = 10 - (2 + 1 + 4)
x₄ = 3

Sampling essentially consists of defining various sample statistics and making use of them in estimating the corresponding population parameters. In this respect, degrees of freedom may be defined as the number n of independent observations contained in a sample, less the number m of parameters to be estimated on the basis of that sample information, i.e. df = n - m.

For example, when the population variance σ² is not known, it has to be estimated by a particular value of its estimator S², the sample variance. The number of observations in the sample being n, df = n - m = n - 1, because only one parameter (i.e. m = 1) has to be estimated from the sample in computing the sample variance.

SAMPLING DISTRIBUTION OF THE VARIANCE


We will now discuss the sampling distribution of the variance. We will first introduce the concept of the sample variance as an unbiased estimator of the population variance, and then present the chi-square distribution, which helps us in working out probabilities for the sample variance.

THE SAMPLE VARIANCE

By now it is implicitly clear that we use the sample mean to estimate the population mean, and the sample proportion to estimate the population proportion, when those parameters are unknown. Similarly, we use a sample statistic called the sample variance to estimate the population variance.

As we will see in the next lesson on statistical estimation, a sample statistic is an unbiased estimator of a population parameter when the expected value of the sample statistic is equal to the corresponding population parameter it estimates. Thus, if the sample variance S² is to be an unbiased estimator of the population variance σ², we need

E(S²) = σ²

However, it can be shown that if, while calculating S², we divide the sum of squared deviations from the mean (SSD), i.e. Σ(x - x̄)², by n, the result will not be an unbiased estimator of σ²; instead

E[ Σ(x - x̄)² / n ] = ((n - 1)/n) σ² = σ² - σ²/n

i.e. Σ(x - x̄)²/n will underestimate the population variance σ² by the amount σ²/n. To compensate for this downward bias we divide by n - 1, so that

S² = Σ(x - x̄)² / (n - 1)

is an unbiased estimator of the population variance σ², and we have

E[ Σ(x - x̄)² / (n - 1) ] = σ²

In other words, to get an unbiased estimator of the population variance σ², we divide the sum Σ(x - x̄)² by the degrees of freedom n - 1.
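A small simulation makes this bias visible. The sketch below (Python with numpy; the normal population with σ² = 9 and sample size n = 5 are illustrative assumptions) repeatedly draws samples and averages the two variance estimates:

    import numpy as np

    rng = np.random.default_rng(0)
    sigma2, n, reps = 9.0, 5, 200_000

    samples = rng.normal(0.0, np.sqrt(sigma2), size=(reps, n))
    ssd = ((samples - samples.mean(axis=1, keepdims=True)) ** 2).sum(axis=1)

    print(np.mean(ssd / n))        # ~ ((n-1)/n) * sigma2 = 7.2  (biased low)
    print(np.mean(ssd / (n - 1)))  # ~ sigma2 = 9.0              (unbiased)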

THE CHI-SQUARE DISTRIBUTION

Let X be a random variable representing the N values of a population which is normally distributed with mean μ and variance σ², i.e.

X = {X₁, X₂, ..., X_N}

We may draw a random sample of size n comprising the values x₁, x₂, ..., xₙ from this population. As brought out in the previous section, each of the n sample values x₁, x₂, ..., xₙ can be treated as an independent normal random variable with mean μ and variance σ². In other words,

xᵢ ~ N(μ, σ²)  where i = 1, 2, ..., n

Thus each of these n normally distributed random variables may be standardized, so that

Zᵢ = (xᵢ - μ)/σ ~ N(0, 1)  where i = 1, 2, ..., n

A sample statistic U may then be defined as

U = Z₁² + Z₂² + ... + Zₙ²
  = Σ Zᵢ²
  = Σ [(xᵢ - μ)/σ]²

which will take different values in repeated random sampling. Obviously, U is a random variable. It is called the chi-square variable, denoted by χ². Thus the chi-square random variable is the sum of several independent, squared standard normal random variables.

The chi-square distribution is the probability distribution of the chi-square variable. So

The chi-square distribution is the probability distribution of the sum of several independent, squared standard normal random variables.

The chi-square distribution is defined by the density

f(χ²) = C e^(-χ²/2) (χ²)^(n/2 - 1)   for χ² ≥ 0

where e is the base of the natural logarithm, n denotes the sample size (or the number of independent normal random variables), and C is a constant so determined that the total area under the χ² distribution is unity. χ² values are determined in terms of degrees of freedom, df = n.

Properties of the χ² Distribution

1. A χ² distribution is completely defined by the number of degrees of freedom, df = n. So there are many χ² distributions, each with its own df.

2. χ² is a sample statistic having no corresponding population parameter, which makes the χ² distribution a non-parametric distribution.

3. As a sum of squares, the χ² random variable cannot be negative and is, therefore, bounded on the left by zero.

Figure: χ² distributions with different numbers of df

4. The mean of a χ² distribution is equal to the degrees of freedom df, and the variance of the distribution is equal to twice the number of degrees of freedom:

E(χ²) = n        Var(χ²) = 2n

5. Unless the df is large, a χ² distribution is skewed to the right. As df increases, the χ² distribution looks more and more like a normal distribution. Thus, for large df,

χ² ~ N(n, 2n)

The figure shows several χ² distributions with different numbers of df. In general, for n ≥ 30, the probability of χ² taking a value greater than or less than a particular value can be approximated by using the normal area tables.

6. If χ₁², χ₂², χ₃², ..., χₖ² are k independent χ² random variables with degrees of freedom n₁, n₂, n₃, ..., nₖ, then their sum χ₁² + χ₂² + ... + χₖ² also possesses a χ² distribution with df = n₁ + n₂ + ... + nₖ.

The χ² Distribution in Terms of the Sample Variance S²

We can write

Σ [(xᵢ - μ)/σ]² = (1/σ²) Σ [(xᵢ - x̄) + (x̄ - μ)]²
               = (1/σ²) Σ [(xᵢ - x̄)² + (x̄ - μ)² + 2(xᵢ - x̄)(x̄ - μ)]
               = (1/σ²) Σ (xᵢ - x̄)² + (1/σ²) Σ (x̄ - μ)² + (2/σ²)(x̄ - μ) Σ (xᵢ - x̄)
               = (n - 1)S²/σ² + [(x̄ - μ)/(σ/√n)]²

[since Σ (xᵢ - x̄)² = (n - 1)S², Σ (x̄ - μ)² = n(x̄ - μ)² and Σ (xᵢ - x̄) = 0]

Now, we know that the LHS of the above equation is a random variable which has a chi-square distribution with df = n.

We also know that if

x̄ ~ N(μ, σ²/n)

then [(x̄ - μ)/(σ/√n)]² will have a chi-square distribution with df = 1.

Since the two terms on the RHS are independent, (n - 1)S²/σ² will also have a chi-square distribution, with df = n - 1. One degree of freedom is lost because all the deviations are measured from x̄ and not from μ.
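This result is easy to verify numerically. The sketch below (Python with numpy/scipy; the normal population parameters and n = 8 are arbitrary illustrative choices) compares the simulated behaviour of (n - 1)S²/σ² with the χ² distribution on n - 1 df:

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(0)
    mu, sigma, n, reps = 50.0, 4.0, 8, 100_000

    samples = rng.normal(mu, sigma, size=(reps, n))
    s2 = samples.var(axis=1, ddof=1)            # sample variance with n-1 divisor
    u = (n - 1) * s2 / sigma**2                 # should follow chi-square with n-1 df

    # Compare simulated moments with the theoretical chi-square(n-1) values
    print(u.mean(), n - 1)                      # both ~ 7
    print(u.var(), 2 * (n - 1))                 # both ~ 14

    # Compare a tail probability with scipy's chi-square distribution
    print(np.mean(u > 14.07), stats.chi2.sf(14.07, df=n - 1))   # both ~ 0.05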

Expected Value and Variance of S²

In practice, therefore, we work with the distribution of (n - 1)S²/σ² and not with the distribution of S² directly.

Since (n - 1)S²/σ² has a chi-square distribution with df = n - 1,

E[(n - 1)S²/σ²] = n - 1
[(n - 1)/σ²] E(S²) = n - 1
E(S²) = σ²

Also

Var[(n - 1)S²/σ²] = 2(n - 1)

Using the definition of variance, we get

E[ (n - 1)S²/σ² - E((n - 1)S²/σ²) ]² = 2(n - 1)
or  E[ (n - 1)S²/σ² - (n - 1) ]² = 2(n - 1)
or  E[ (n - 1)²S⁴/σ⁴ + (n - 1)² - 2(n - 1)² S²/σ² ] = 2(n - 1)
or  [(n - 1)²/σ⁴] E[ S⁴ + σ⁴ - 2S²σ² ] = 2(n - 1)
or  [(n - 1)²/σ⁴] E(S² - σ²)² = 2(n - 1)
or  E(S² - σ²)² = 2(n - 1)σ⁴/(n - 1)² = 2σ⁴/(n - 1)

So  Var(S²) = 2σ⁴/(n - 1)

It may be noted that the conditions necessary for the central limit theorem to be operative in the case of the sample variance S² are quite restrictive. For the sampling distribution of S² to be approximately normal, it is required not only that the parent population be normal, but also that the sample be at least as large as 100.

Example 5

In an automated process, a machine fills cans of coffee. The variance of the filling process is known to be 30. In order to keep the process in control, regular checks of the variance of the filling process are made from time to time. This is done by randomly sampling filled cans, measuring their amounts and computing the sample variance. A random sample of 101 cans is selected for the purpose. What is the probability that the sample variance is between 21.28 and 38.72?

Solution: We have

Population variance σ² = 30
n = 101

We can find the required probability by using the chi-square distribution of

χ² = (n - 1)S²/σ²

So P(21.28 < S² < 38.72) = P[(101 - 1)(21.28)/30 < χ² < (101 - 1)(38.72)/30]
                         = P(70.93 < χ² < 129.07)
                         = P(χ² > 70.93) - P(χ² > 129.07)
                         ≈ 0.990 - 0.025
                         = 0.965

Since our population is normal and the sample size is quite large, we can also estimate the required probability using the normal distribution. We have, approximately,

S² ~ N(σ², 2σ⁴/(n - 1))

So P(21.28 < S² < 38.72) = P[(21.28 - σ²)/√(2σ⁴/(n - 1)) < Z < (38.72 - σ²)/√(2σ⁴/(n - 1))]
                         = P[(21.28 - 30)/√(2(30)²/100) < Z < (38.72 - 30)/√(2(30)²/100)]
                         = P(-8.72/4.24 < Z < 8.72/4.24)
                         = P(-2.06 < Z < 2.06)
                         = 2 P(0 < Z < 2.06)
                         = 2 × 0.4803
                         = 0.9606

which is approximately the same as the value calculated above using the χ² distribution.
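Both calculations can be reproduced in a few lines of Python (assuming scipy is available); the exact chi-square probability and the normal approximation come out close to the hand-computed values:

    import numpy as np
    from scipy import stats

    sigma2, n = 30.0, 101
    lo, hi = 21.28, 38.72

    # Exact probability from the chi-square distribution of (n-1)S^2/sigma^2
    p_chi2 = (stats.chi2.cdf((n - 1) * hi / sigma2, df=n - 1)
              - stats.chi2.cdf((n - 1) * lo / sigma2, df=n - 1))

    # Normal approximation: S^2 ~ N(sigma^2, 2*sigma^4/(n-1))
    se = np.sqrt(2 * sigma2**2 / (n - 1))
    p_norm = stats.norm.cdf(hi, sigma2, se) - stats.norm.cdf(lo, sigma2, se)

    print(p_chi2, p_norm)   # roughly 0.96 for both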

THE F-DISTRIBUTION

Let us assume two normal populations with variances σ₁² and σ₂² respectively. For a random sample of size n₁ drawn from the first population, we have the chi-square variable

χ₁² = (n₁ - 1)S₁²/σ₁²

which possesses a χ² distribution with ν₁ = n₁ - 1 df.

Similarly, for a random sample of size n₂ drawn from the second population, we have the chi-square variable

χ₂² = (n₂ - 1)S₂²/σ₂²

which possesses a χ² distribution with ν₂ = n₂ - 1 df.

A new sample statistic defined as

F = (χ₁²/ν₁) / (χ₂²/ν₂)

is a random variable known as the F statistic, named in honour of the English statistician Sir Ronald A. Fisher. Being a random variable, it has a probability distribution, which is known as the F distribution.

The F distribution is the distribution of the ratio of two chi-square random variables that are independent of each other, each of which is divided by its own degrees of freedom.
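As an illustration of this definition, the sketch below (Python with numpy/scipy; the degrees of freedom ν₁ = 5 and ν₂ = 10 are arbitrary choices) builds F values from two independent chi-square variables and compares them with scipy's F distribution:

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(0)
    v1, v2, reps = 5, 10, 200_000

    chi1 = rng.chisquare(v1, reps)          # chi-square with v1 df
    chi2 = rng.chisquare(v2, reps)          # independent chi-square with v2 df
    f = (chi1 / v1) / (chi2 / v2)           # F statistic with (v1, v2) df

    # Mean should be v2/(v2-2) = 1.25 for v2 = 10
    print(f.mean(), v2 / (v2 - 2))

    # Compare an upper-tail probability with scipy's F distribution
    print(np.mean(f > 3.33), stats.f.sf(3.33, v1, v2))   # both ~ 0.05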

Properties of the F Distribution

1. The F distribution is defined by two kinds of degrees of freedom: the degrees of freedom of the numerator, always listed as the first item in the parentheses, and the degrees of freedom of the denominator, always listed as the second item in the parentheses. So there is a large number of F distributions, one for each pair of ν₁ and ν₂. The figure shows several F distributions with different ν₁ and ν₂.

2. As a ratio of two squared quantities, the F random variable cannot be negative and is, therefore, bounded on the left by zero.

Figure: F distributions with different ν₁ and ν₂

3. The F(ν₁, ν₂) distribution has no mean for ν₂ ≤ 2 and no variance for ν₂ ≤ 4. However, for ν₂ > 2 the mean, and for ν₂ > 4 the variance, are given by

E(F(ν₁, ν₂)) = ν₂/(ν₂ - 2)        Var(F(ν₁, ν₂)) = 2ν₂²(ν₁ + ν₂ - 2) / [ν₁(ν₂ - 2)²(ν₂ - 4)]

4. Unless ν₂ is large, an F distribution is skewed to the right. As ν₂ increases, the F distribution looks more and more like a normal distribution. In general, for ν₂ ≥ 30, the probability of F taking a value greater than or less than a particular value can be approximated by using the normal area tables.

5. The F distributions with df (ν₁, ν₂) and (ν₂, ν₁) are reciprocals of each other, in the sense that if a random variable follows the F(ν₁, ν₂) distribution, then its reciprocal follows the F(ν₂, ν₁) distribution.

THE t-DISTRIBUTION

Let us assume a normal population with mean μ and variance σ², and let x₁, x₂, ..., xₙ represent the n values of a sample drawn from this population. Then

Zᵢ = (xᵢ - μ)/σ ~ N(0, 1)  where i = 1, 2, ..., n

and

U = Σ (xᵢ - x̄)²/σ² ~ χ² (n - 1 df)

A new sample statistic T may then be defined as

T = [(xᵢ - μ)/σ] / √[ (1/(n - 1)) Σ (xᵢ - x̄)²/σ² ]

T = (xᵢ - μ) / √[ Σ (xᵢ - x̄)²/(n - 1) ]

T = (xᵢ - μ)/S

This statistic, the ratio of a standard normal variable Z to the square root of a χ² variable divided by its degrees of freedom, is known as the 't' statistic or Student's 't' statistic, named after the pen name of W. S. Gosset, who discovered the distribution of the quantity. The random variable (xᵢ - μ)/S follows a t-distribution with n - 1 degrees of freedom:

(xᵢ - μ)/S ~ t (n - 1 df)  where i = 1, 2, ..., n

The t-Distribution in Terms of the Sampling Distribution of the Sample Mean

We know that X̄ ~ N(μ, σ²/n), so

(X̄ - μ)/(σ/√n) ~ N(0, 1)

Putting (X̄ - μ)/(σ/√n) in place of (xᵢ - μ)/σ in the definition of T, we get

T = [(X̄ - μ)/(σ/√n)] / √[ (1/(n - 1)) Σ (xᵢ - x̄)²/σ² ]

or T = (X̄ - μ) / √[ Σ (xᵢ - x̄)² / (n(n - 1)) ]

or T = (X̄ - μ) / (S/√n)

When defined as above, T again follows a t-distribution with n - 1 degrees of freedom:

(X̄ - μ)/(S/√n) ~ t (n - 1 df)
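A short simulation (Python with numpy/scipy; the population parameters and n = 10 are illustrative assumptions) confirms that (X̄ - μ)/(S/√n) behaves like a t variable with n - 1 df rather than a standard normal variable:

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(0)
    mu, sigma, n, reps = 100.0, 12.0, 10, 200_000

    samples = rng.normal(mu, sigma, size=(reps, n))
    xbar = samples.mean(axis=1)
    s = samples.std(axis=1, ddof=1)             # sample standard deviation (n-1 divisor)
    t_stat = (xbar - mu) / (s / np.sqrt(n))

    # Two-sided tail probability beyond the t critical value 2.262 for 9 df
    print(np.mean(np.abs(t_stat) > 2.262))       # ~ 0.05, matching t(9)
    print(2 * stats.t.sf(2.262, df=n - 1))       # ~ 0.05
    print(2 * stats.norm.sf(2.262))              # ~ 0.024, the normal would understate the tail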

Properties of the t-Distribution

1. The t-distribution, like the Z distribution, is unimodal, symmetric about the mean 0, and the t-variable ranges from -∞ to ∞.

2. The t-distribution is defined by the degrees of freedom ν = n - 1; the df associated with the distribution are the df associated with the sample standard deviation.

3. The t-distribution has no mean for n = 2, i.e. for ν = 1, and no variance for n ≤ 3, i.e. for ν ≤ 2. However, for ν > 1 the mean, and for ν > 2 the variance, are given by

E(T) = 0        Var(T) = ν/(ν - 2)

Figure: t-distributions with different df

4. The variance of the t-distribution, ν/(ν - 2), is always greater than 1, so the t-distribution is more variable than the Z distribution, which has variance 1. This follows from the fact that while Z values vary from sample to sample owing to changes in X̄ alone, the variation in T values is due to changes in both X̄ and S.

5. The variance of the t-distribution approaches 1 as the sample size n increases. In general, for n ≥ 30, the variance of the t-distribution is approximately the same as that of the Z distribution. In other words, the t-distribution is approximately normal for n ≥ 30.
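The way the t variance shrinks towards 1 can be seen directly; a minimal sketch (Python with scipy) prints ν/(ν - 2) alongside scipy's value for increasing df:

    from scipy import stats

    for v in (3, 5, 10, 30, 100):
        # Exact variance of the t-distribution with v df, which equals v/(v-2)
        print(v, stats.t.var(df=v), v / (v - 2))
    # The values approach 1, the variance of the standard normal Z.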
