Statistical Inferences & Gauss Markov Model
Methods of Estimation
'Estimate' means 'to judge'.
If the estimate of a parameter is a particular value (a quantity in the form of a real number), the method of estimation is called point estimation.
If the estimate of a parameter is a value lying in an interval, the method of estimation is called interval estimation.
Characteristics of Estimators
• Consistency
• Unbiasedness
• Efficiency
• Sufficiency
Let X be a random variable with pdf f(x, θ), θ being a parameter. The problem is to find the value of θ. The collection of all possible values that θ can take is denoted by Θ and is called the parameter space. In set-theory notation the above information can be expressed as {f(x, θ) | θ ∈ Θ}.
We consider a random sample x₁, x₂, ..., xₙ of size n from a population with probability density function f(x; θ₁, θ₂, ..., θₖ), where θ₁, θ₂, ..., θₖ are unknown population parameters.
A function of sample values used to estimate one or more parameters is called a statistic. An infinite number of statistics can be created to estimate population parameters. Among these statistics, a statistic whose value is nearest to the parameter value (under study) is called the best estimate of the parameter.
Symbolically, if x₁, x₂, ..., xₙ is a sample of size n, let T₁ = g₁(x₁, x₂, ..., xₙ), T₂ = g₂(x₁, x₂, ..., xₙ), ..., Tₖ = gₖ(x₁, x₂, ..., xₙ) be functions of sample values (i.e., statistics) which have values near to the population parameters; then the problem is to find the statistic which is the best estimate of the population parameter(s). These statistics are called estimators.

Consistency
Let {Tₙ}, where Tₙ = T(x₁, x₂, ..., xₙ) is based on a random sample of size n, be a sequence of estimators of γ(θ).
Syllabus
• Methods of Estimation
• Properties of Estimators
• Confidence Intervals
• Tests of Hypothesis
• Most Powerful and Uniformly Most Powerful Tests
• Likelihood Ratio Tests
• Analysis of Discrete Data and Chi-square Test of Goodness of Fit
• Large Sample Tests
• Simple Non-parametric Tests for One and Two Sample Problems
• Rank Correlation and Test for Independence
• Elementary Bayesian Inference
• Gauss-Markov Models
• Estimability of Parameters
• Best Linear Unbiased Estimators
• Tests for Linear Hypothesis and Confidence Intervals
• Analysis of Variance and Covariance
• Fixed, Random and Mixed Effects Models
• Simple and Multiple Linear Regression
• Elementary Regression Diagnostics
• Logistic Regression
• Multivariate Normal Distribution
• Wishart Distribution and their Properties
• Distribution of Quadratic Forms
• Inference for Parameters
• Partial and Multiple Correlation Coefficients and Related Tests
• Data Reduction Techniques
• Principal Component Analysis
• Discriminant Analysis
• Cluster Analysis
• Canonical Correlation
• Simple Random Sampling
• Stratified Sampling and Systematic Sampling
• Probability Proportional to Size Sampling
• Ratio and Regression Methods
• Completely Randomized, Randomized Blocks and Latin-square Designs
• Connectedness and Orthogonality of Block Designs, BIBD
• 2^k Factorial Experiments: Confounding and Construction
• Series and Parallel Systems
• Hazard Function and Failure Rates
• Censoring and Life Testing
This Tₙ is said to be consistent if
lim (n → ∞) Tₙ = γ(θ) in probability.
This can also be expressed as: Tₙ is said to be consistent for γ(θ) if for any arbitrarily chosen ε > 0 and η > 0 there exists a positive integer m (depending upon ε and η) such that
P[|Tₙ − γ(θ)| < ε] → 1 as n → ∞,
i.e., P[|Tₙ − γ(θ)| < ε] > 1 − η, ∀ n ≥ m.
Here, Tₙ = T(x₁, x₂, ..., xₙ) is an estimator of γ(θ). Its value depends upon the sample size n. As n increases, Tₙ may change; on making n very large, we study how Tₙ behaves.
Theorem (Invariance Property of Consistent Estimators) If Tₙ is a consistent estimator of γ(θ) and ψ(γ(θ)) is a continuous function of γ(θ), then ψ(Tₙ) is a consistent estimator of ψ(γ(θ)).
Theorem (Sufficient Condition for Consistency) Let {Tₙ} be a sequence of estimators such that for all θ ∈ Θ,
(i) E_θ(Tₙ) → γ(θ) as n → ∞, and
(ii) Var_θ(Tₙ) → 0 as n → ∞.
Then Tₙ is a consistent estimator of γ(θ).
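A minimal simulation sketch of this sufficient condition, using the sample mean of Exp(1) draws; the distribution, the tolerance ε = 0.1, and the sample sizes are illustrative assumptions, not from the text:

```python
import numpy as np

# The sample mean of Exp(1) draws has E(T_n) = 1 and Var(T_n) = 1/n -> 0,
# so by the sufficient condition it is consistent for the mean 1.
rng = np.random.default_rng(0)
for n in [10, 100, 10_000]:
    # 2000 replications of T_n = sample mean at this sample size
    T = rng.exponential(scale=1.0, size=(2000, n)).mean(axis=1)
    # P(|T_n - 1| > 0.1) should shrink toward 0 as n grows
    print(n, np.mean(np.abs(T - 1.0) > 0.1))
```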
Unbiasedness
Unbiasedness is a property associated with a fixed sample size n.
A statistic Tₙ = T(x₁, x₂, ..., xₙ) is said to be an unbiased estimator of γ(θ) if
E(Tₙ) = γ(θ), ∀ θ ∈ Θ.
If E(Tₙ) > γ(θ), then Tₙ is said to be positively biased, and if E(Tₙ) < γ(θ), it is said to be negatively biased.
The amount of bias b(θ) is given by
b(θ) = E(Tₙ) − γ(θ), θ ∈ Θ.
Unbiased Estimates for the Population Mean (μ) and Variance (σ²)
Let a large population be {X₁, X₂, ..., X_N} (of size N) with mean μ and variance σ².
Let {x₁, x₂, ..., xₙ} be a random sample of size n taken from it. Then the sample mean x̄ is given by
x̄ = (x₁ + x₂ + ... + xₙ)/n = (1/n) Σᵢ xᵢ ... (i)
and the sample variance is
s² = (1/n) Σᵢ (xᵢ − x̄)².
In the sample, each xᵢ is from the population {X₁, X₂, ..., X_N}, and so xᵢ can take any one of the values X₁, X₂, ..., X_N, each with equal probability 1/N. Thus, the probability distribution of xᵢ is

xᵢ      : X₁    X₂    ...  X_N
P(xᵢ)   : 1/N   1/N   ...  1/N

Therefore, E(xᵢ) = (1/N)X₁ + (1/N)X₂ + ... + (1/N)X_N = (X₁ + X₂ + ... + X_N)/N = μ.
Hence
E(x̄) = (1/n) Σᵢ E(xᵢ) = (1/n)(nμ) = μ
⟹ the sample mean x̄ is an unbiased estimate of the population mean μ.
Now, V(x) = E(x²) − [E(x)]². For x = xᵢ this gives
V(xᵢ) = E(xᵢ²) − [E(xᵢ)]²
⟹ E(xᵢ²) = V(xᵢ) + [E(xᵢ)]² = σ² + μ². ... (ii)
Also, V(x̄) = E(x̄²) − [E(x̄)]² ... (iii)
and V(x̄) = V((1/n) Σᵢ xᵢ) = (1/n²) Σᵢ V(xᵢ) = (1/n²)(nσ²) = σ²/n. ... (iv)
From Eqs. (iii) and (iv), we get
E(x̄²) = V(x̄) + [E(x̄)]² = σ²/n + μ². ... (v)
Now,
E(s²) = E[(1/n) Σᵢ (xᵢ − x̄)²] = E[(1/n) Σᵢ xᵢ² − x̄²]
= (1/n) Σᵢ E(xᵢ²) − E(x̄²)
= (1/n) · n(σ² + μ²) − (σ²/n + μ²) [using Eqs. (ii) and (v)]
= σ² − σ²/n = ((n − 1)/n) σ²
∴ E(s²) ≠ σ², i.e., the sample variance is not an unbiased estimate of the population variance.
Since E[(n/(n − 1)) s²] = σ², the statistic
S² = (n/(n − 1)) s² = (1/(n − 1)) Σᵢ (xᵢ − x̄)²
satisfies E(S²) = σ².
∴ S² is an unbiased estimate of the population variance σ².
We know S² = (n/(n − 1)) s². For a large sample, n → ∞ and so s² ≈ S², i.e., s² becomes an unbiased estimate of the population variance.
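The bias factor (n − 1)/n derived above can be checked numerically. A small sketch, assuming a normal population with σ² = 4 and n = 5 (both illustrative choices):

```python
import numpy as np

# Compare E(s^2) (divisor n) with E(S^2) (divisor n - 1) by simulation;
# the first should be near (n-1)/n * sigma^2, the second near sigma^2.
rng = np.random.default_rng(1)
n, sigma2 = 5, 4.0
x = rng.normal(0.0, np.sqrt(sigma2), size=(100_000, n))
s2 = x.var(axis=1, ddof=0)   # divisor n      -> biased
S2 = x.var(axis=1, ddof=1)   # divisor n - 1  -> unbiased
print(s2.mean(), (n - 1) / n * sigma2)  # both close to 3.2
print(S2.mean(), sigma2)                # both close to 4.0
```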
Test of Significance for Sample Mean
For large samples, the standard normal variate corresponding to x̄ is
Z = (x̄ − μ)/(σ/√n).
Null Hypothesis H₀: The sample has been drawn from a population with mean μ and variance σ², i.e., there is no significant difference between the sample mean x̄ and the population mean μ. The test statistic (for large samples) is
Z = (x̄ − μ)/(σ/√n).
If σ is unknown, its estimate σ̂ = s is used for large samples.
Confidence limits for μ: the 95% confidence interval for μ is given by
|Z| ≤ 1.96, i.e., |x̄ − μ|/(σ/√n) ≤ 1.96,
so x̄ ± 1.96 σ/√n are known as the 95% confidence limits for μ (x̄ ± 1.96 s/√n when σ is unknown).
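A sketch of this large-sample Z test and the 95% limits; the data, the hypothesised mean μ₀ = 50, and the seed are illustrative assumptions:

```python
import numpy as np
from scipy import stats

# Large-sample Z test of H0: mu = 50 and the 95% confidence limits
# x_bar +/- 1.96 * s / sqrt(n), using s in place of the unknown sigma.
rng = np.random.default_rng(2)
x = rng.normal(51.0, 10.0, size=400)           # hypothetical sample
n, x_bar, s = len(x), x.mean(), x.std(ddof=1)
Z = (x_bar - 50.0) / (s / np.sqrt(n))
p_value = 2 * (1 - stats.norm.cdf(abs(Z)))     # two-tailed
half = 1.96 * s / np.sqrt(n)
print(f"Z = {Z:.2f}, p = {p_value:.4f}, "
      f"95% CI = ({x_bar - half:.2f}, {x_bar + half:.2f})")
```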
Theorem 1 For large samples, the likelihood equation
∂/∂θ log L = 0
has a solution which converges in probability to the true value θ₀. In other words, MLEs are consistent.
MLEs are always consistent but need not be unbiased.
Theorem 2 (Huzurbazar's Theorem) Any consistent solution of the likelihood equation is asymptotically normally distributed about the true value θ. Thus,
θ̂ ~ N(θ, 1/I(θ)) as n → ∞,
where I(θ) = E[−∂²/∂θ² log L].
Theorem 3 If the MLE exists, it is the most efficient in the class of such estimators. The variance of the MLE is given by Var(θ̂) = 1/I(θ).
Theorem 4 If a sufficient estimator exists, it is a function of the maximum likelihood estimator. (This theorem is useful in finding whether a sufficient estimator exists or not.)
Theorem 5 If for a given population with pdf f(x, θ) an MVB (minimum variance bound) estimator T exists for θ, then the likelihood equation will have a solution equal to the estimator T.
Theorem 6 (Invariance property of MLE) If T is the MLE of θ and ψ(θ) is a one-to-one function of θ, then ψ(T) is the MLE of ψ(θ).
Theorem 7 (Cramér-Rao Inequality) If t is an unbiased estimator for γ(θ), a function of the parameter θ, then
Var(t) ≥ [γ′(θ)]² / I(θ),
where I(θ) is the information on θ supplied by the sample.
Regularity Conditions for the Cramér-Rao Inequality
(i) The parameter space Θ is a non-degenerate open interval on the real line R = ]−∞, ∞[.
(ii) For almost all x = (x₁, x₂, ..., xₙ) and for all θ ∈ Θ, ∂/∂θ L(x, θ) exists, the exceptional set, if any, being independent of θ.
(iii) The range of integration is independent of the parameter θ, so that f(x, θ) is differentiable under the integral sign.
(iv) The conditions of uniform convergence of integrals are satisfied, so that differentiation under the integral sign is valid.
(v) I(θ) = E[(∂/∂θ log L)²] exists and is positive for all θ ∈ Θ.
(vi) t is an unbiased estimator of θ, i.e., E(t) = θ.
Under these conditions,
Var(t) ≥ 1/I(θ),
where I(θ) = E[(∂/∂θ log L)²] is called the amount of information on θ supplied by the sample (x₁, x₂, ..., xₙ), and its reciprocal 1/I(θ) is the information limit to the variance of the estimator t = t(x₁, x₂, ..., xₙ).
(This formulation is due to R.A. Fisher.)
MVBE
An unbiased estimator t of γ(θ) for which the Cramér-Rao lower bound is attained, i.e., for which
Var(t) = [γ′(θ)]² / E[(∂/∂θ log L)²],
is called a minimum variance bound estimator (MVBE).
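A numerical illustration: for a Poisson(θ) population the information per observation is 1/θ, so the Cramér-Rao bound for estimating θ from n draws is θ/n, and the sample mean attains it. The parameter values below are illustrative:

```python
import numpy as np

# For Poisson(theta), I_1(theta) = 1/theta per observation, so the
# Cramer-Rao bound for n draws is theta/n; the sample mean attains it,
# making it an MVB estimator.
rng = np.random.default_rng(3)
theta, n = 3.0, 50
T = rng.poisson(theta, size=(100_000, n)).mean(axis=1)
print(T.var())     # simulated variance of the estimator
print(theta / n)   # Cramer-Rao lower bound: both ~0.06
```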
Condition for the Equality Sign in the Cramér-Rao (C-R) Inequality
The C-R inequality is
Var(t) ≥ [γ′(θ)]² / E[(∂/∂θ log L)²].
The necessary and sufficient condition for an unbiased estimator t to attain the lower bound of its variance is
∂/∂θ log L = A(θ)[t − γ(θ)],
where A(θ) is a function of θ alone, in which case
Var(t) = E[t − γ(θ)]² = γ′(θ)/A(θ).
Theorem 1 If the likelihood function L is expressible as
∂/∂θ log L = A(θ)[t − γ(θ)],
then
(i) t is an unbiased estimator of γ(θ),
(ii) the minimum variance bound (MVB) estimator t for γ(θ) exists, and
(iii) Var(t) = γ′(θ)/A(θ).
Theorem 2 An MVB estimator for γ(θ) exists if and only if there exists a sufficient estimator for γ(θ).
Complete Family of Distributions
Let a statistic T = T(x₁, x₂, ..., xₙ) be based on a random sample of size n from the population f(x, θ), θ ∈ Θ.
The distribution of the statistic T will, in general, depend on θ. So, corresponding to T, we again have a family of distributions, say {g(t, θ), θ ∈ Θ}.
Definition The statistic T = t(x₁, x₂, ..., xₙ), or more precisely the family of distributions {g(t, θ) | θ ∈ Θ}, is said to be complete for θ if
E_θ[h(T)] = 0 for all θ ∈ Θ ⟹ P[h(T) = 0] = 1,
i.e., ∫ h(t) g(t, θ) dt = 0, ∀ θ ∈ Θ (or Σ_t h(t) g(t, θ) = 0, ∀ θ ∈ Θ in the discrete case)
⟹ h(T) = 0 almost surely (a.s.).
MVUE (Minimum Variance Unbiased Estimator) and Rao-Blackwellisation
Theorem (Rao-Blackwell Theorem) Let X and Y be random variables such that
E(Y) = μ and Var(Y) = σ²_Y > 0.
Let φ(x) = E(Y | X = x).
Then, (i) E[φ(X)] = μ, and
(ii) Var[φ(X)] ≤ Var(Y).
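A simulation sketch of Rao-Blackwellisation for a normal mean, taking Y = X₁ (unbiased but crude) and conditioning on the sufficient statistic X̄, for which E(X₁ | X̄) = X̄; the population values below are illustrative:

```python
import numpy as np

# Y = X_1 is unbiased for mu; conditioning on the sufficient statistic
# X_bar gives phi = E(X_1 | X_bar) = X_bar, still unbiased but with
# smaller variance, as the Rao-Blackwell theorem guarantees.
rng = np.random.default_rng(4)
mu, n = 2.0, 10
x = rng.normal(mu, 1.0, size=(100_000, n))
Y = x[:, 0]              # crude unbiased estimator
phi = x.mean(axis=1)     # Rao-Blackwellised version
print(Y.mean(), phi.mean())   # both ~2.0 (unbiasedness preserved)
print(Y.var(), phi.var())     # ~1.0 vs ~0.1 (variance reduced)
```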
The Principle of Maximum Likelihood Estimation
It consists of finding an estimator θ̂ for the unknown parameter θ = (θ₁, θ₂, ..., θₖ), say, which maximises the likelihood function L(θ) for variations in the parameter, i.e., to find θ̂ = (θ̂₁, θ̂₂, ..., θ̂ₖ) such that
L(θ̂) ≥ L(θ), ∀ θ ∈ Θ,
or L(θ̂) = sup_{θ ∈ Θ} L(θ).
θ̂ is called the Maximum Likelihood Estimator (MLE). θ̂ is the solution of
∂L/∂θ = 0 and ∂²L/∂θ² < 0.
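A sketch of the principle for an exponential population f(x, θ) = θe^(−θx), where the likelihood equation gives θ̂ = 1/x̄ in closed form; a numeric maximiser of log L agrees. The true rate 2.5 and the sample size are illustrative:

```python
import numpy as np
from scipy.optimize import minimize_scalar

# MLE for the rate of an exponential population: the likelihood equation
# gives theta_hat = 1 / x_bar; the numeric maximiser should agree.
rng = np.random.default_rng(5)
x = rng.exponential(scale=1 / 2.5, size=1000)   # true rate theta = 2.5

def neg_log_lik(theta):
    # -log L(theta) = -(n log(theta) - theta * sum(x))
    return -(len(x) * np.log(theta) - theta * x.sum())

res = minimize_scalar(neg_log_lik, bounds=(1e-6, 100.0), method="bounded")
print(res.x, 1 / x.mean())   # numeric and closed-form MLE, both near 2.5
```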
If t₁ and t₂ are statistics such that
P(t₁ > θ) = α₁ ... (i)
and P(t₂ < θ) = α₂, ... (ii)
where α₁ and α₂ are constants independent of θ, then on combining Eqs. (i) and (ii), we get
P(t₁ < θ < t₂) = 1 − (α₁ + α₂) = 1 − α,
so that t₁ and t₂ may be taken as confidence limits for θ.
Against H₀: μ = μ₀, the alternative may be (i) H₁: μ > μ₀, (ii) H₁: μ < μ₀, or (iii) H₁: μ ≠ μ₀.
The value of Z given by Eq. (i) is known as the test statistic. For a two-tailed test at 'α' level of significance, the critical value Z_α satisfies P(|Z| > Z_α) = α.
The level of significance is the size of the type I error (or the maximum producer's risk). It is generally 5% or 1%. It is fixed in advance, before collecting the sample information.
Errors in Sampling
The main objective of sampling theory is to draw conclusions about the population parameters on the basis of sample observations, generally in the form: to accept or reject a hypothesis about the population by examining a sample from it. In this working there may be a possibility of error. It is of two types:
Type I Error: Reject H₀ when it is true.
Type II Error: Accept H₀ when it is wrong, i.e., accept H₀ when H₁ is true.
P{Reject H₀ when it is true} = P{Reject H₀ | H₀} = α
and P{Accept H₀ when it is wrong} = P{Accept H₀ | H₁} = β.
Then α is called the size of the type I error and β the size of the type II error.
α is also called the producer's risk and β the consumer's risk.
In a two-dimensional sample space (n = 2), each element is in the form of an ordered pair (x₁, x₂), and the values of the test statistic t = t(x₁, x₂) can be plotted in the 2-dimensional plane.
Critical Region
A region (corresponding to a statistic t) in the sample space S which amounts to rejection of H₀ is termed as the critical region or region of rejection.
If W is the critical region and if t = t(x₁, x₂, ..., xₙ) is the value of the statistic based on a random sample of size n, then
P(t ∈ W | H₀) = α,
P(t ∈ W̄ | H₁) = β,
where W̄ (the complement of W) is called the acceptance region. Here, W ∪ W̄ = S and W ∩ W̄ = ∅.
The probability 'α' that a random value of the statistic t belongs to the critical region is known as the level of significance.

Power of the Test
We know
α = probability of type I error = P(Reject H₀ | H₀ is true),
β = probability of type II error = P(Accept H₀ | H₀ is false).
The power of the test is defined as 1 − β. It is also called the power function of the test of hypothesis H₀ against the alternative hypothesis H₁.
The value of the power function at a parameter point is called the power of the test at that point.
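A sketch computing size and power by simulation for the one-sided Z test of H₀: μ = 0 with σ = 1 known; the alternative values of μ, the sample size, and α are illustrative assumptions:

```python
import numpy as np
from scipy import stats

# Size and power of the one-sided Z test of H0: mu = 0 vs H1: mu > 0
# with sigma = 1, n = 25, alpha = 0.05 (reject when Z > 1.645).
rng = np.random.default_rng(6)
n, z_alpha = 25, stats.norm.ppf(0.95)
for mu in [0.0, 0.2, 0.5]:                  # mu = 0 recovers the size alpha
    x = rng.normal(mu, 1.0, size=(100_000, n))
    Z = x.mean(axis=1) * np.sqrt(n)         # (x_bar - 0) / (1 / sqrt(n))
    print(mu, np.mean(Z > z_alpha))         # rejection rate = power at mu
```

The printed rejection rate at μ = 0 is the size α; its values at μ > 0 trace out the power function.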
Steps in Solving a Testing of Hypothesis Problem
(i) From the problem, identify the form of the population distribution and the parameter(s) about which the hypotheses are set up.
(ii) Set up the null hypothesis H₀ and the alternative hypothesis H₁ in terms of the range of the parameter values.
(iii) Choose a statistic t = t(x₁, x₂, ..., xₙ), called the test statistic, which controls α and β to a minimum level.
(iv) Partition the set of all possible values of the statistic t into disjoint sets W (rejection region or critical region) and W̄ (acceptance region) such that
(a) H₀ is rejected (i.e., H₁ accepted) if the value of t falls in W;
(b) H₀ is accepted if the value of t falls in W̄.
(v) Take observations (experimental sample observations), calculate the test statistic, follow the above steps, and take the decision.
Optimum Test under Different Situations
In any testing problem the first two steps, viz.,
(i) the form of the population distribution, and
(ii) the parameter(s) of interest,
are determined from the problem. The most important step is to choose the best test, i.e., the best statistic 't' and the critical region W, where by the best test we mean one which, in addition to controlling α at any desired low level, has the minimum type II error β, or maximum power 1 − β, compared with the β of all other tests having this α.
Most Powerful Test (MP Test)
Let the hypotheses be
H₀: θ = θ₀,
H₁: θ = θ₁.
The critical region W is the most powerful (MP) critical region of size α (and the corresponding test a most powerful test of level α) for testing H₀: θ = θ₀ against H₁: θ = θ₁, if
P(x ∈ W | H₀) = ∫_W L₀ dx = α ... (i)
and P(x ∈ W | H₁) ≥ P(x ∈ W₁ | H₁)
for every other critical region W₁ satisfying Eq. (i).
Uniformly Most Powerful Test (UMP Test)
Let H₀: θ = θ₀,
H₁: θ ≠ θ₀.
Then, for a predetermined α, the best test for H₀ is called the uniformly most powerful test of level α.
Definition The region W is called the uniformly most powerful (UMP) critical region of size α (and the corresponding test the uniformly most powerful (UMP) test of level α) for testing H₀: θ = θ₀ against H₁: θ ≠ θ₀, i.e.,
H₁: θ = θ₁ ≠ θ₀, if
P(x ∈ W | H₀) = ∫_W L₀ dx = α ... (i)
and P(x ∈ W | H₁) ≥ P(x ∈ W₁ | H₁) for all θ₁ ≠ θ₀,
whatever the region W₁ satisfying Eq. (i) may be.
Neyman-Pearson Lemma
Let (x₁, x₂, ..., xₙ) be a random sample of size n from the population whose density function is f(x, θ).
Let the null hypothesis be H₀: θ = θ₀ and the alternative H₁: θ = θ₁.
Let k > 0 be a constant and W be a critical region of size α such that
W = {x ∈ S : L₁/L₀ > k}
and W̄ = {x ∈ S : L₁/L₀ ≤ k},
where L₀ and L₁ are the likelihood functions of the sample observations (x₁, x₂, ..., xₙ) under H₀ and H₁, respectively.
Then W is the most powerful critical region of the test of hypothesis H₀: θ = θ₀ against the alternative H₁: θ = θ₁.
Let W = {x ∈ S : L₁/L₀ > k} be the most powerful critical region of size α for testing H₀: θ = θ₀ against H₁: θ = θ₁, and let it be independent of θ₁ ∈ Θ₁ = Θ − Θ₀, where Θ₁ is the parameter space under H₁. Then we say that the critical region W is the uniformly most powerful critical region (UMP CR) of size α for testing H₀: θ = θ₀ against H₁: θ ∈ Θ₁.
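A sketch of the lemma for H₀: N(0, 1) against H₁: N(1, 1), where L₁/L₀ > k reduces to x̄ > c because the ratio is monotone in Σxᵢ; the sample size, α, and seed below are illustrative:

```python
import numpy as np
from scipy import stats

# Neyman-Pearson region for H0: N(0,1) vs H1: N(1,1) with n = 10.
# L1/L0 > k is equivalent to x_bar > c; choose c for size alpha = 0.05.
rng = np.random.default_rng(7)
n, alpha = 10, 0.05
c = stats.norm.ppf(1 - alpha) / np.sqrt(n)   # P(x_bar > c | H0) = alpha
x0 = rng.normal(0.0, 1.0, size=(100_000, n)).mean(axis=1)
x1 = rng.normal(1.0, 1.0, size=(100_000, n)).mean(axis=1)
print(np.mean(x0 > c))   # ~0.05, the size under H0
print(np.mean(x1 > c))   # ~0.94, the power under H1
```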
Unbiased Test and Unbiased Critical Region
Let the testing be of
H₀: θ = θ₀ against
H₁: θ = θ₁.
The critical region W, and consequently the test based on it, is said to be unbiased if the power of the test exceeds the size of the critical region, i.e., if
power of the test ≥ size of the CR
⟹ 1 − β ≥ α
⟹ P_θ₁(W) ≥ P_θ₀(W)
⟹ P{x | x ∈ W | H₁} ≥ P{x | x ∈ W | H₀}.
In other words, the critical region W is said to be unbiased if
P_θ(W) ≥ P_θ₀(W), ∀ θ ∈ Θ.
Theorem Every most powerful (MP) or uniformly most powerful (UMP) critical region (CR) is necessarily unbiased.
(i) If W is the MPCR of size α for testing H₀: θ = θ₀ against H₁: θ = θ₁, then it is necessarily unbiased.
(ii) Similarly, if W is the UMPCR of size α for testing H₀: θ = θ₀ against H₁: θ ∈ Θ₁, then it is also unbiased.
Optimum Regions and Sufficient Statistics
Let X₁, X₂, ..., Xₙ be a random sample from a population f(x, θ). Let T be a sufficient statistic for θ. Then, by the factorization theorem,
L(x, θ) = ∏ᵢ f(xᵢ, θ) = g_θ(t(x)) h(x), ... (i)
where g_θ(t(x)) is the marginal distribution of the statistic T. By the NP-Lemma, the MPCR for testing H₀: θ = θ₀ against H₁: θ = θ₁ is given by
W = {x : L(x, θ₁) ≥ k L(x, θ₀)}, k > 0. ... (ii)
From Eqs. (i) and (ii), we get
W = {x : g_θ₁(t(x)) h(x) ≥ k g_θ₀(t(x)) h(x)}
= {x : g_θ₁(t(x)) ≥ k g_θ₀(t(x))}, k > 0.
Hence, if T = t(X) is a sufficient statistic for θ, then the MPCR is given in terms of the marginal distribution of T = t(x) instead of the joint distribution of X₁, X₂, ..., Xₙ.
Likelihood Function
Let x₁, x₂, ..., xₙ be a random sample of size n from a population with density function f(x, θ). Then the likelihood function of the sample values x₁, x₂, ..., xₙ, denoted by L(θ), is defined as their joint density function:
L = f(x₁, θ) f(x₂, θ) ⋯ f(xₙ, θ) = ∏ᵢ f(xᵢ, θ).
L gives the relative likelihood that the random variables assume a particular set of values x₁, x₂, ..., xₙ. For a given sample x₁, x₂, ..., xₙ, L becomes a function of the variable θ, the parameter.
Likelihood Ratio Test
Let x₁, x₂, ..., xₙ be a random sample of size n > 1 from a population with pdf f(x; θ₁, θ₂, ..., θₖ), where Θ, the parameter space, is the totality of all points that (θ₁, θ₂, ..., θₖ) can assume.
We want to test the null hypothesis H₀: (θ₁, θ₂, ..., θₖ) ∈ Θ₀ against the alternative hypothesis of the type H₁: (θ₁, θ₂, ..., θₖ) ∈ Θ − Θ₀. The likelihood function of the sample observations is given by
L = ∏ᵢ f(xᵢ; θ₁, θ₂, ..., θₖ). ... (i)
By the principle of maximum likelihood, the likelihood equation for estimating any parameter θᵢ is given by
∂L/∂θᵢ = 0, (i = 1, 2, ..., k). ... (ii)
We can obtain the maximum likelihood estimates for the parameters (θ₁, θ₂, ..., θₖ) as these parameters vary over the parameter space Θ and the subspace Θ₀. Using the maximum likelihood estimates so obtained in L, we obtain the maximum values of the likelihood function for variation of the parameters in Θ and Θ₀, respectively.
The likelihood ratio test statistic is defined as the quotient of these two maxima:
λ = λ(x₁, x₂, ..., xₙ) = L(Θ̂₀)/L(Θ̂) = sup_{θ ∈ Θ₀} L(x, θ) / sup_{θ ∈ Θ} L(x, θ),
where L(Θ̂₀) and L(Θ̂) are the maxima of the likelihood function (i) with respect to the parameters in the regions Θ₀ and Θ, respectively. H₀ is rejected for small values of λ, i.e., the critical region is of the form 0 < λ < λ₀, where λ₀ is determined by the size α.
Theorem 1 If λ is the likelihood ratio statistic for testing a null hypothesis H₀, and if U = φ(λ) is a monotonic increasing (decreasing) function of λ, then the test based on U is equivalent to the likelihood ratio test. The critical region of the test based on U is
φ(0) < U < φ(λ₀)
[respectively φ(λ₀) < U < φ(0)].
Theorem 2 Let x₁, x₂, ..., xₙ be a random sample from a population with pdf f(x; θ₁, θ₂, ..., θₖ), where the parameter space Θ is k-dimensional. Suppose we want to test the composite hypothesis
H₀: θ₁ = θ₁′, θ₂ = θ₂′, ..., θᵣ = θᵣ′ (r ≤ k),
where θ₁′, θ₂′, ..., θᵣ′ are specified numbers. When H₀ is true, −2 logₑ λ is asymptotically distributed as chi-square with r degrees of freedom, i.e., under H₀, −2 log λ ~ χ²ᵣ if n is large.
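A sketch of Theorem 2 for H₀: μ = 0 in N(μ, 1), where −2 log λ = n x̄² in closed form and r = 1 parameter is restricted; the sample size and seed are illustrative:

```python
import numpy as np
from scipy import stats

# Likelihood ratio test of H0: mu = 0 for N(mu, 1): here
# -2 log(lambda) = n * x_bar^2, which follows chi-square(1) under H0.
rng = np.random.default_rng(8)
n = 30
x_bar = rng.normal(0.0, 1.0, size=(100_000, n)).mean(axis=1)
stat = n * x_bar**2                       # -2 log(lambda) in closed form
# compare simulated quantiles with the chi-square(1) reference
for q in [0.90, 0.95, 0.99]:
    print(q, np.quantile(stat, q), stats.chi2.ppf(q, df=1))
```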
Test for the Mean of a Normal Population N(μ, σ²)
Let (x₁, x₂, ..., xₙ) be a random sample of size n from it.
Let H₀: μ = μ₀ (specified), σ² > 0, with
Θ = {(μ, σ²) | −∞ < μ < ∞, σ² > 0}.
Test for the Equality of Means of Two Normal Populations
Here Θ = {(μ₁, μ₂, σᵢ²) | −∞ < μᵢ < ∞, σᵢ > 0, i = 1, 2}. Two cases arise: the population variances may be unequal, σ₁² ≠ σ₂², or equal, σ₁² = σ₂² = σ².
Test for the Equality of Means of Several Normal Populations having Unknown Common Variance σ²
Let H₀: μ₁ = μ₂ = ... = μₖ = μ, σ₁² = σ₂² = ... = σₖ² = σ²;
H₁: the μᵢ are not all equal, σ₁² = σ₂² = ... = σₖ² = σ².
Test for the Variance of a Normal Population
Let H₀: σ² = σ₀² (specified).
Test for Equality of Variances of Two Normal Populations
Let the populations be N(μ₁, σ₁²) and N(μ₂, σ₂²).
H₀: σ₁² = σ₂² (unspecified), with μ₁ and μ₂ unspecified;
H₁: σ₁² ≠ σ₂², μ₁, μ₂ unspecified.
Test for the Equality of Variances of Several Normal Populations
Let the populations be N(μᵢ, σᵢ²), i = 1, 2, ..., k.
H₀: σ₁² = σ₂² = ... = σₖ² = σ² (unspecified), with μ₁, μ₂, ..., μₖ unspecified;
H₁: the σᵢ² (i = 1, 2, ..., k) are not all equal, μᵢ unspecified.
IID Random Variables
A collection of random variables is independent and identically distributed (IID) if each random variable has the same probability distribution as the others and all are mutually independent.
Central Limit Theorem The probability distribution of the sum (or average) of IID variables with finite variance approaches a normal distribution.
Examples of IID
(i) A sequence of fair die rolls is IID.
(ii) A sequence of fair coin flips is IID.
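A simulation sketch of the central limit theorem for IID Uniform(0, 1) variables (mean 1/2, variance 1/12); n = 40 and the replication count are illustrative choices:

```python
import numpy as np
from scipy import stats

# Standardised means of IID Uniform(0,1) draws are approximately N(0,1)
# even at moderate n, as the CLT asserts.
rng = np.random.default_rng(9)
n = 40
means = rng.uniform(0.0, 1.0, size=(100_000, n)).mean(axis=1)
z = (means - 0.5) / np.sqrt(1 / 12 / n)    # standardise the sample mean
print(np.mean(np.abs(z) < 1.96))           # ~0.95, matching N(0,1)
print(stats.kstest(z, "norm").statistic)   # small KS distance from N(0,1)
```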
Chi-square Test of Goodness of Fit
It tests whether the deviation of the experiment from theory is just by chance, or is really due to the inadequacy of the theory to fit the observed data.
If Oᵢ and Eᵢ (i = 1, 2, ..., k) be a set of observed and expected frequencies, then
χ² = Σᵢ (Oᵢ − Eᵢ)²/Eᵢ
follows the chi-square distribution with (k − 1) df.
Conditions:
(i) Sample observations should be independent.
(ii) Constraints on the cell frequencies, if any, should be linear, e.g., Σnᵢ = ΣOᵢ = ΣEᵢ = N.
(iii) N, the total frequency, should be reasonably large, say greater than 50.
(iv) No theoretical cell frequency should be less than 5. (Chi-square is a continuous distribution, but it fails to maintain its character of continuity if a cell frequency is less than 5.) It is a non-parametric test (or distribution-free test).
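A sketch of the test for a hypothetically fair die over 600 rolls; the observed counts below are made-up illustrative data, with Eᵢ = 100 and k − 1 = 5 df:

```python
import numpy as np
from scipy import stats

# Goodness-of-fit test that a die is fair: 600 hypothetical rolls,
# expected counts E_i = 100 per face, df = k - 1 = 5.
observed = np.array([95, 110, 102, 88, 107, 98])
expected = np.full(6, 100.0)
chi2 = ((observed - expected) ** 2 / expected).sum()
p = stats.chi2.sf(chi2, df=5)
print(chi2, p)                           # fairness not rejected if p > 0.05
# the same test in one call:
print(stats.chisquare(observed, expected))
```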
Degrees of Freedom (df)
The number of independent variates which make up a statistic (e.g., χ²) is known as the degrees of freedom (df) and is usually denoted by ν.
The number of degrees of freedom is the total number of observations less the number of independent constraints imposed on the observations. If k is the number of independent constraints in a set of data of n observations, then ν = n − k.
If a random variable X has a chi-square distribution with n degrees of freedom, then we write X ~ χ²₍ₙ₎. Its probability density function is given by
f(x) = [1 / (2^(n/2) Γ(n/2))] e^(−x/2) x^(n/2 − 1), 0 ≤ x < ∞.
Properties of the χ²-Distribution
(i) If Xᵢ, i = 1, 2, ..., n are n independent normal variates with mean μᵢ and standard deviation σᵢ, then
Σᵢ [(Xᵢ − μᵢ)/σᵢ]²
is a χ²-variate with n degrees of freedom.
(ii) The normal distribution is a particular case of the χ²-distribution when n = 1. We know the pdf for the χ²-distribution is
f(x) = [1 / (2^(n/2) Γ(n/2))] e^(−x/2) x^(n/2 − 1), 0 ≤ x < ∞.
Putting x = χ², the probability differential is given by
P(χ²) dχ² = [1 / (2^(n/2) Γ(n/2))] e^(−χ²/2) (χ²)^(n/2 − 1) dχ².
For n = 1,
P(χ²) dχ² = [1 / (2^(1/2) Γ(1/2))] e^(−χ²/2) (χ²)^(−1/2) dχ² = [1 / √(2π)] e^(−χ²/2) (χ²)^(−1/2) dχ², 0 ≤ χ² < ∞
⟹ χ is a standard normal variate.
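A quick numerical check of this n = 1 case, squaring standard normal draws and comparing quantiles with χ²₁; the replication count and seed are illustrative:

```python
import numpy as np
from scipy import stats

# The square of a standard normal variate is chi-square with 1 df,
# the n = 1 case derived above; compare simulated and exact quantiles.
rng = np.random.default_rng(10)
z2 = rng.standard_normal(200_000) ** 2
for q in [0.5, 0.9, 0.99]:
    print(q, np.quantile(z2, q), stats.chi2.ppf(q, df=1))
```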