Chapter5 Sampling Ratio Method Estimation
An important objective in any statistical estimation procedure is to obtain the estimators of parameters of
interest with more precision. It is also well understood that incorporation of more information in the
estimation procedure yields better estimators, provided the information is valid and proper. Use of such
auxiliary information is made through the ratio method of estimation to obtain an improved estimator of
population mean. In ratio method of estimation, auxiliary information on a variable is available which is
linearly related to the variable under study and is utilized to estimate the population mean.
Let $Y$ be the variable under study and $X$ be any auxiliary variable which is correlated with $Y$. The observations $x_i$ on $X$ and $y_i$ on $Y$ are obtained for each sampling unit. The population mean $\bar{X}$ of $X$ (or equivalently the population total $X_{tot}$) must be known. For example, the $x_i$'s may be the values of the $y_i$'s from
- some earlier completed census,
- some earlier surveys,
- some characteristic on which it is easy to obtain information etc.
For example, if yi is the quantity of fruits produced in the ith plot, then xi can be the area of ith plot or
Let $(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$ be the random sample of size $n$ on the paired variable $(X, Y)$ drawn, preferably by SRSWOR, from a population of size $N$. The ratio estimate of the population mean $\bar{Y}$ is
$$\hat{\bar{Y}}_R = \frac{\bar{y}}{\bar{x}}\,\bar{X} = \hat{R}\,\bar{X},$$
assuming the population mean $\bar{X}$ is known. The ratio estimator of the population total $Y_{tot} = \sum_{i=1}^{N} Y_i$ is
$$\hat{Y}_{R(tot)} = \frac{y_{tot}}{x_{tot}}\,X_{tot},$$
where $X_{tot} = \sum_{i=1}^{N} X_i$ is the population total of $X$, which is assumed to be known, and $y_{tot} = \sum_{i=1}^{n} y_i$ and $x_{tot} = \sum_{i=1}^{n} x_i$ are the sample totals of $Y$ and $X$ respectively. The estimator $\hat{Y}_{R(tot)}$ can be equivalently expressed as
Sampling Theory| Chapter 5 | Ratio Product Method Estimation | Shalabh, IIT Kanpur Page 1
$$\hat{Y}_{R(tot)} = \frac{\bar{y}}{\bar{x}}\,X_{tot} = \hat{R}\,X_{tot}.$$
Looking at the structure of ratio estimators, note that the ratio method estimates the relative change $Y_{tot}/X_{tot}$ that occurred after $(x_i, y_i)$ were observed. It is clear that if the value of $y_i/x_i$ is nearly the same for all $i = 1, 2, \ldots, n$, then the values of $y_{tot}/x_{tot}$ (or equivalently $\bar{y}/\bar{x}$) vary little from sample to sample and the ratio estimate will be of high precision.
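The two estimators defined above can be sketched in a few lines of Python. This is a minimal illustration, not part of the original notes: the function names and the small data set are hypothetical, and the known population mean of X is simply passed in as an argument.

```python
def ratio_estimate_mean(x, y, x_pop_mean):
    """Ratio estimate of the population mean: (ybar / xbar) * Xbar."""
    if len(x) != len(y) or not x:
        raise ValueError("x and y must be non-empty and of equal length")
    r_hat = sum(y) / sum(x)  # R-hat = ybar / xbar = y_tot / x_tot
    return r_hat * x_pop_mean


def ratio_estimate_total(x, y, x_pop_total):
    """Ratio estimate of the population total: (y_tot / x_tot) * X_tot."""
    if len(x) != len(y) or not x:
        raise ValueError("x and y must be non-empty and of equal length")
    return (sum(y) / sum(x)) * x_pop_total


# Hypothetical sample of n = 3 plots from a population of N = 50 plots.
x = [2.0, 3.0, 5.0]    # auxiliary variable, e.g. plot area
y = [4.1, 5.9, 10.0]   # study variable, e.g. fruit yield
N = 50
x_pop_mean = 4.0       # assumed known population mean of X

mean_est = ratio_estimate_mean(x, y, x_pop_mean)
total_est = ratio_estimate_total(x, y, N * x_pop_mean)
print(mean_est, total_est)
```

Because $X_{tot} = N\bar{X}$, the total estimate is just $N$ times the mean estimate, which is a quick sanity check on any implementation.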
known. Then
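$$E(\hat{\bar{Y}}_R) = \frac{1}{\binom{N}{n}} \sum_{i=1}^{\binom{N}{n}} \frac{\bar{y}_i}{\bar{x}_i}\,\bar{X} \neq \bar{Y} \quad \text{(in general)}.$$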
Moreover, it is difficult to find the exact expressions for $E\left(\frac{\bar{y}}{\bar{x}}\right)$ and $E\left(\frac{\bar{y}^2}{\bar{x}^2}\right)$. So we approximate them and proceed as follows:
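Let
$$\varepsilon_0 = \frac{\bar{y} - \bar{Y}}{\bar{Y}} \;\Rightarrow\; \bar{y} = (1 + \varepsilon_0)\bar{Y},$$
$$\varepsilon_1 = \frac{\bar{x} - \bar{X}}{\bar{X}} \;\Rightarrow\; \bar{x} = (1 + \varepsilon_1)\bar{X}.$$
Since SRSWOR is being followed,
$$E(\varepsilon_0) = 0, \qquad E(\varepsilon_1) = 0,$$
$$E(\varepsilon_0^2) = \frac{1}{\bar{Y}^2} E(\bar{y} - \bar{Y})^2 = \frac{1}{\bar{Y}^2}\,\frac{N-n}{Nn}\,S_Y^2 = \frac{f}{n}\,\frac{S_Y^2}{\bar{Y}^2} = \frac{f}{n}\,C_Y^2$$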
where $f = \frac{N-n}{N}$, $S_Y^2 = \frac{1}{N-1}\sum_{i=1}^{N}(Y_i - \bar{Y})^2$ and $C_Y = \frac{S_Y}{\bar{Y}}$ is the coefficient of variation related to $Y$.
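Similarly,
$$E(\varepsilon_1^2) = \frac{f}{n}\,C_X^2,$$
$$\begin{aligned}
E(\varepsilon_0 \varepsilon_1) &= \frac{1}{\bar{X}\bar{Y}}\,E[(\bar{x} - \bar{X})(\bar{y} - \bar{Y})] \\
&= \frac{1}{\bar{X}\bar{Y}}\,\frac{N-n}{Nn}\,\frac{1}{N-1}\sum_{i=1}^{N}(X_i - \bar{X})(Y_i - \bar{Y}) \\
&= \frac{1}{\bar{X}\bar{Y}}\cdot\frac{f}{n}\,S_{XY} \\
&= \frac{1}{\bar{X}\bar{Y}}\cdot\frac{f}{n}\,\rho\,S_X S_Y \\
&= \frac{f}{n}\,\rho\,\frac{S_X}{\bar{X}}\,\frac{S_Y}{\bar{Y}} \\
&= \frac{f}{n}\,\rho\,C_X C_Y
\end{aligned}$$
where $C_X = \frac{S_X}{\bar{X}}$ is the coefficient of variation related to $X$ and $\rho$ is the population correlation coefficient between $X$ and $Y$.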
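$$\hat{\bar{Y}}_R = \frac{\bar{y}}{\bar{x}}\,\bar{X} = \frac{(1 + \varepsilon_0)\bar{Y}}{(1 + \varepsilon_1)\bar{X}}\,\bar{X} = (1 + \varepsilon_0)(1 + \varepsilon_1)^{-1}\,\bar{Y}.$$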
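Assuming $|\varepsilon_1| < 1$, the term $(1 + \varepsilon_1)^{-1}$ may be expanded as an infinite series and it would be convergent. Such an assumption means that $\left|\frac{\bar{x} - \bar{X}}{\bar{X}}\right| < 1$, i.e., a possible estimate $\bar{x}$ of the population mean $\bar{X}$ lies between 0 and $2\bar{X}$. This is likely to hold true if the variation in $\bar{x}$ is not large. In order to ensure that the variation in $\bar{x}$ is small, assume that the sample size $n$ is fairly large. With this assumption,
$$\hat{\bar{Y}}_R = \bar{Y}(1 + \varepsilon_0)(1 - \varepsilon_1 + \varepsilon_1^2 - \ldots) = \bar{Y}(1 + \varepsilon_0 - \varepsilon_1 + \varepsilon_1^2 - \varepsilon_0\varepsilon_1 + \ldots).$$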
When the sample size is large, $\varepsilon_0$ and $\varepsilon_1$ are likely to be small quantities and so the terms involving second and higher powers of $\varepsilon_0$ and $\varepsilon_1$ would be negligibly small. In such a case
$$\hat{\bar{Y}}_R - \bar{Y} \approx \bar{Y}(\varepsilon_0 - \varepsilon_1)$$
and
$$E(\hat{\bar{Y}}_R - \bar{Y}) = 0.$$
So the ratio estimator is an unbiased estimator of the population mean up to the first order of approximation.
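If we assume that only terms of $\varepsilon_0$ and $\varepsilon_1$ involving powers more than two are negligibly small (which is more realistic than assuming that powers more than one are negligibly small), then the estimation error is approximately
$$\hat{\bar{Y}}_R - \bar{Y} \approx \bar{Y}(\varepsilon_0 - \varepsilon_1 + \varepsilon_1^2 - \varepsilon_0\varepsilon_1),$$
so that
$$E(\hat{\bar{Y}}_R - \bar{Y}) = \bar{Y}\left(0 - 0 + \frac{f}{n}C_X^2 - \frac{f}{n}\rho\,C_X C_Y\right),$$
$$Bias(\hat{\bar{Y}}_R) = E(\hat{\bar{Y}}_R - \bar{Y}) = \frac{f}{n}\,\bar{Y}\,C_X (C_X - \rho\,C_Y)$$
up to the second order of approximation. The bias generally decreases as the sample size grows large.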
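$$Bias(\hat{\bar{Y}}_R) = 0$$
if $E(\varepsilon_1^2 - \varepsilon_0\varepsilon_1) = 0$,
or if $\frac{Var(\bar{x})}{\bar{X}^2} - \frac{Cov(\bar{x}, \bar{y})}{\bar{X}\bar{Y}} = 0$,
or if $\frac{1}{\bar{X}^2}\left[Var(\bar{x}) - \frac{\bar{X}}{\bar{Y}}\,Cov(\bar{x}, \bar{y})\right] = 0$,
or if $Var(\bar{x}) - \frac{Cov(\bar{x}, \bar{y})}{R} = 0$ (assuming $\bar{X} \neq 0$),
or if $R = \frac{\bar{Y}}{\bar{X}} = \frac{Cov(\bar{x}, \bar{y})}{Var(\bar{x})}$,
which is satisfied when the regression line of $Y$ on $X$ passes through the origin.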
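Under the assumption $|\varepsilon_1| < 1$ and that the terms of $\varepsilon_0$ and $\varepsilon_1$ involving powers more than two are negligibly small,
$$MSE(\hat{\bar{Y}}_R) = \bar{Y}^2\left[\frac{f}{n}C_X^2 + \frac{f}{n}C_Y^2 - \frac{2f}{n}\rho\,C_X C_Y\right] = \frac{\bar{Y}^2 f}{n}\left[C_X^2 + C_Y^2 - 2\rho\,C_X C_Y\right]$$
up to the second order of approximation.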
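As a numerical illustration of the second-order bias and MSE expressions, the sketch below evaluates them for a fully known (toy) population. The function name and the data are hypothetical; in practice the population quantities are unknown and would have to be estimated from the sample.

```python
from math import sqrt


def ratio_bias_mse(X, Y, n):
    """Second-order approximations to the bias and MSE of the ratio
    estimator of the mean under SRSWOR, from a fully known population."""
    N = len(X)
    x_mean = sum(X) / N
    y_mean = sum(Y) / N
    s_x2 = sum((v - x_mean) ** 2 for v in X) / (N - 1)
    s_y2 = sum((v - y_mean) ** 2 for v in Y) / (N - 1)
    s_xy = sum((a - x_mean) * (b - y_mean) for a, b in zip(X, Y)) / (N - 1)

    f = (N - n) / N                     # f = (N - n) / N as defined in the text
    c_x = sqrt(s_x2) / x_mean           # coefficient of variation of X
    c_y = sqrt(s_y2) / y_mean           # coefficient of variation of Y
    rho = s_xy / sqrt(s_x2 * s_y2)      # population correlation

    bias = (f / n) * y_mean * c_x * (c_x - rho * c_y)
    mse = (f / n) * y_mean ** 2 * (c_x ** 2 + c_y ** 2 - 2 * rho * c_x * c_y)
    return bias, mse


# Hypothetical population of N = 5 units, sample size n = 2.
X = [1.0, 2.0, 3.0, 4.0, 5.0]
Y = [2.0, 3.0, 7.0, 8.0, 12.0]
bias, mse = ratio_bias_mse(X, Y, n=2)
print(bias, mse)
```

When $Y_i$ is exactly proportional to $X_i$, both quantities are zero, which matches the remark that the bias vanishes when the regression line passes through the origin.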
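or if $\rho > \frac{1}{2}\,\frac{C_X}{C_Y}$.
Thus the ratio estimator is more efficient than the sample mean based on SRSWOR if
$$\rho > \frac{1}{2}\,\frac{C_X}{C_Y} \quad \text{if } R > 0$$
and
$$\rho < -\frac{1}{2}\,\frac{C_X}{C_Y} \quad \text{if } R < 0,$$
with $C_X$ and $C_Y$ taken in absolute value.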
It is clear from this expression that the success of ratio estimator depends on how close is the auxiliary
information to the variable under study.
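The efficiency condition can be traced with a short worked step. This is a sketch under stated assumptions: $Var(\bar{y}) = \frac{f}{n}S_Y^2$ under SRSWOR, the second-order MSE of the ratio estimator, and $\bar{X}, \bar{Y} > 0$ (so that $R > 0$ and $C_X, C_Y > 0$).

```latex
\begin{aligned}
MSE(\hat{\bar{Y}}_R) - Var(\bar{y})
  &= \frac{f}{n}\bar{Y}^2\left(C_X^2 + C_Y^2 - 2\rho\,C_X C_Y\right)
     - \frac{f}{n}\bar{Y}^2 C_Y^2 \\
  &= \frac{f}{n}\bar{Y}^2\,C_X\left(C_X - 2\rho\,C_Y\right), \\
MSE(\hat{\bar{Y}}_R) < Var(\bar{y})
  &\iff C_X - 2\rho\,C_Y < 0
  \iff \rho > \frac{1}{2}\,\frac{C_X}{C_Y}.
\end{aligned}
```

For instance, if $C_X = C_Y$, the ratio estimator beats the sample mean only when $\rho > 0.5$.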
Thus
$$E(\hat{R}) = \frac{\bar{Y}}{\bar{X}} - \frac{Cov(\hat{R}, \bar{x})}{\bar{X}} = R - \frac{Cov(\hat{R}, \bar{x})}{\bar{X}}.$$
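$$Bias(\hat{R}) = E(\hat{R}) - R = -\frac{Cov(\hat{R}, \bar{x})}{\bar{X}} = -\frac{\rho_{\hat{R},\bar{x}}\,\sigma_{\hat{R}}\,\sigma_{\bar{x}}}{\bar{X}}$$
where $\rho_{\hat{R},\bar{x}}$ is the correlation between $\hat{R}$ and $\bar{x}$; $\sigma_{\hat{R}}$ and $\sigma_{\bar{x}}$ are the standard errors of $\hat{R}$ and $\bar{x}$ respectively.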
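Thus
$$|Bias(\hat{R})| = \frac{|\rho_{\hat{R},\bar{x}}|\,\sigma_{\hat{R}}\,\sigma_{\bar{x}}}{\bar{X}} \leq \frac{\sigma_{\hat{R}}\,\sigma_{\bar{x}}}{\bar{X}} \qquad (\text{since } |\rho_{\hat{R},\bar{x}}| \leq 1),$$
assuming $\bar{X} > 0$. Thus
$$\frac{|Bias(\hat{R})|}{\sigma_{\hat{R}}} \leq \frac{\sigma_{\bar{x}}}{\bar{X}}$$
or
$$\frac{|Bias(\hat{R})|}{\sigma_{\hat{R}}} \leq C_{\bar{x}}$$
where $C_{\bar{x}} = \sigma_{\bar{x}}/\bar{X}$ is the coefficient of variation of $\bar{x}$. If $C_{\bar{x}} < 0.1$, then the bias in $\hat{R}$ may be safely regarded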
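$$\begin{aligned}
\sum_{i=1}^{N}(Y_i - RX_i)^2 &= \sum_{i=1}^{N}\left[(Y_i - \bar{Y}) + (\bar{Y} - RX_i)\right]^2 \\
&= \sum_{i=1}^{N}\left[(Y_i - \bar{Y}) - R(X_i - \bar{X})\right]^2 \qquad (\text{using } \bar{Y} = R\bar{X}) \\
&= \sum_{i=1}^{N}(Y_i - \bar{Y})^2 + R^2\sum_{i=1}^{N}(X_i - \bar{X})^2 - 2R\sum_{i=1}^{N}(X_i - \bar{X})(Y_i - \bar{Y}),
\end{aligned}$$
$$\frac{1}{N-1}\sum_{i=1}^{N}(Y_i - RX_i)^2 = S_Y^2 + R^2 S_X^2 - 2R\,S_{XY}.$$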
The MSE of YˆR has already been derived which is now expressed again as follows:
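$$\begin{aligned}
MSE(\hat{\bar{Y}}_R) &= \frac{f}{n}\,\bar{Y}^2\left(C_Y^2 + C_X^2 - 2\rho\,C_X C_Y\right) \\
&= \frac{f}{n}\,\bar{Y}^2\left(\frac{S_Y^2}{\bar{Y}^2} + \frac{S_X^2}{\bar{X}^2} - 2\,\frac{S_{XY}}{\bar{X}\bar{Y}}\right) \\
&= \frac{f}{n}\,\frac{\bar{Y}^2}{\bar{Y}^2}\left(S_Y^2 + \frac{\bar{Y}^2}{\bar{X}^2}\,S_X^2 - 2\,\frac{\bar{Y}}{\bar{X}}\,S_{XY}\right) \\
&= \frac{f}{n}\left(S_Y^2 + R^2 S_X^2 - 2R\,S_{XY}\right) \\
&= \frac{f}{n(N-1)}\sum_{i=1}^{N}(Y_i - RX_i)^2 \\
&= \frac{N-n}{nN(N-1)}\sum_{i=1}^{N}(Y_i - RX_i)^2.
\end{aligned}$$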
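Writing $U_i = Y_i - RX_i$, so that $\bar{U} = \bar{Y} - R\bar{X} = 0$,
$$MSE(\hat{\bar{Y}}_R) = \frac{f}{n}\,\frac{1}{N-1}\sum_{i=1}^{N}(U_i - \bar{U})^2 = \frac{f}{n}\,S_U^2$$
where $S_U^2 = \frac{1}{N-1}\sum_{i=1}^{N}(U_i - \bar{U})^2$.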
$$\hat{R} = \frac{\bar{y}}{\bar{x}}.$$
Based on the expression
$$MSE(\hat{\bar{Y}}_R) = \frac{f}{n(N-1)}\sum_{i=1}^{N}(Y_i - RX_i)^2,$$
an estimate of $MSE(\hat{\bar{Y}}_R)$ is
$$\widehat{MSE}(\hat{\bar{Y}}_R) = \frac{f}{n(n-1)}\sum_{i=1}^{n}(y_i - \hat{R}x_i)^2 = \frac{f}{n}\left(s_y^2 + \hat{R}^2 s_x^2 - 2\hat{R}\,s_{xy}\right).$$
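The sample-based MSE estimate can be computed directly from the residuals $y_i - \hat{R}x_i$. The following is a minimal sketch; the function name and the data are hypothetical.

```python
def estimate_mse_ratio(x, y, N):
    """Estimate of MSE of the ratio estimator of the mean:
    f / (n (n - 1)) * sum (y_i - R_hat x_i)^2, with f = (N - n) / N."""
    n = len(x)
    if n < 2 or n != len(y):
        raise ValueError("need paired samples with n >= 2")
    f = (N - n) / N
    r_hat = sum(y) / sum(x)  # R-hat = ybar / xbar
    resid_ss = sum((yi - r_hat * xi) ** 2 for xi, yi in zip(x, y))
    return f * resid_ss / (n * (n - 1))


# Hypothetical sample of n = 3 from a population of N = 30.
x = [2.0, 3.0, 5.0]
y = [4.0, 7.0, 9.0]
mse_hat = estimate_mse_ratio(x, y, N=30)
print(mse_hat)
```

The residual form and the form in $s_y^2$, $s_x^2$, $s_{xy}$ agree because the residuals $y_i - \hat{R}x_i$ sum to zero by construction.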
Confidence interval of ratio estimator
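If the sample is large so that the normal approximation is applicable, then the $100(1-\alpha)\%$ confidence intervals of $\bar{Y}$ and $R$ are
$$\left(\hat{\bar{Y}}_R - Z_{\alpha/2}\sqrt{\widehat{Var}(\hat{\bar{Y}}_R)},\;\; \hat{\bar{Y}}_R + Z_{\alpha/2}\sqrt{\widehat{Var}(\hat{\bar{Y}}_R)}\right)$$
and
$$\left(\hat{R} - Z_{\alpha/2}\sqrt{\widehat{Var}(\hat{R})},\;\; \hat{R} + Z_{\alpha/2}\sqrt{\widehat{Var}(\hat{R})}\right)$$
respectively, where $Z_{\alpha/2}$ is the normal deviate to be chosen for the given value of the confidence coefficient $(1-\alpha)$.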
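A large-sample interval for the population mean can be sketched as follows. This is an illustration under assumptions not fixed by the text: the variance estimate used is the residual-based form $\frac{f}{n(n-1)}\sum (y_i - \hat{R}x_i)^2$, and the function name and data are hypothetical. The normal deviate comes from the standard library's `statistics.NormalDist`.

```python
from math import sqrt
from statistics import NormalDist


def ratio_mean_ci(x, y, N, x_pop_mean, alpha=0.05):
    """Approximate 100(1 - alpha)% confidence interval for the population
    mean based on the ratio estimator and the normal approximation."""
    n = len(x)
    f = (N - n) / N
    r_hat = sum(y) / sum(x)
    estimate = r_hat * x_pop_mean
    # Residual-based estimate of the variance (MSE) of the ratio estimator.
    var_hat = f * sum((yi - r_hat * xi) ** 2 for xi, yi in zip(x, y)) / (n * (n - 1))
    z = NormalDist().inv_cdf(1 - alpha / 2)  # upper alpha/2 normal deviate
    half_width = z * sqrt(var_hat)
    return estimate - half_width, estimate + half_width


# Hypothetical data; n = 3 is far too small for the normal approximation
# and is used only to keep the arithmetic easy to follow.
x = [2.0, 3.0, 5.0]
y = [4.0, 7.0, 9.0]
lower, upper = ratio_mean_ci(x, y, N=30, x_pop_mean=4.0)
print(lower, upper)
```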
If $(\bar{x}, \bar{y})$ follows a bivariate normal distribution, then $(\bar{y} - R\bar{x})$ is normally distributed. If SRS is followed for drawing the sample, then assuming $R$ is known, the statistic
$$\frac{\bar{y} - R\bar{x}}{\sqrt{\dfrac{N-n}{Nn}\left(s_y^2 + R^2 s_x^2 - 2R\,s_{xy}\right)}}$$
is approximately $N(0,1)$.
This can also be used for finding confidence limits; see Cochran (1977, Chapter 6, page 156) for more details.
(i) the relationship between $y_i$ and $x_i$ is linear passing through the origin, i.e.
$$y_i = \beta x_i + e_i,$$
where the $e_i$'s are independent with $E(e_i / x_i) = 0$ and $\beta$ is the slope parameter
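Proof. Consider the linear estimator $\hat{\beta} = \sum_{i=1}^{n}\ell_i y_i$ of $\beta$, where $y_i = \beta x_i + e_i$ and the $\ell_i$'s are constants. Then $E(\hat{\beta}) = \beta$ when $\sum_{i=1}^{n}\ell_i x_i = 1$.
Consider the minimization of $Var(\hat{\beta}) = \sum_{i=1}^{n}\ell_i^2\,Var(y_i / x_i)$ subject to the unbiasedness condition $\sum_{i=1}^{n}\ell_i x_i = 1$ using the Lagrangian function. With $Var(y_i / x_i) = C x_i$, where $C$ is a constant, the Lagrangian function with Lagrangian multiplier $\lambda$ is
$$\varphi = \sum_{i=1}^{n}\ell_i^2\,Var(y_i / x_i) - 2\lambda\left(\sum_{i=1}^{n}\ell_i x_i - 1\right) = C\sum_{i=1}^{n}\ell_i^2 x_i - 2\lambda\left(\sum_{i=1}^{n}\ell_i x_i - 1\right).$$
Now
$$\frac{\partial\varphi}{\partial\ell_i} = 0 \;\Rightarrow\; C\,\ell_i x_i = \lambda x_i, \quad i = 1, 2, \ldots, n,$$
$$\frac{\partial\varphi}{\partial\lambda} = 0 \;\Rightarrow\; \sum_{i=1}^{n}\ell_i x_i = 1.$$
Using $\sum_{i=1}^{n}\ell_i x_i = 1$,
or $\frac{\lambda}{C}\sum_{i=1}^{n} x_i = 1$,
or $\frac{\lambda}{C} = \frac{1}{n\bar{x}}$.
Thus
$$\ell_i = \frac{1}{n\bar{x}}$$
and so
$$\hat{\beta} = \frac{\sum_{i=1}^{n} y_i}{n\bar{x}} = \frac{\bar{y}}{\bar{x}}.$$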
Thus $\hat{\beta}$ is not only superior to $\bar{y}$ but also the best in the class of linear and unbiased estimators.
Alternative approach:
This result can alternatively be derived as follows:
The ratio estimator $\hat{R} = \frac{\bar{y}}{\bar{x}}$ is the best linear unbiased estimator of $R = \frac{\bar{Y}}{\bar{X}}$ if the following two conditions hold:
(i) For fixed $x$, $E(y) = \beta x$, i.e., the line of regression of $y$ on $x$ is a straight line passing through the origin.
(ii) For fixed $x$, $Var(y) \propto x$, i.e., $Var(y) = \lambda x$ where $\lambda$ is the constant of proportionality.
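Proof: Let $y = (y_1, y_2, \ldots, y_n)'$ and $x = (x_1, x_2, \ldots, x_n)'$ be two vectors of observations on $y$ and $x$. For fixed $x$, conditions (i) and (ii) give $E(y) = \beta x$ and $Var(y) = \Omega = \lambda\,\mathrm{diag}(x_1, x_2, \ldots, x_n)$,
where $\mathrm{diag}(x_1, x_2, \ldots, x_n)$ is the diagonal matrix with $x_1, x_2, \ldots, x_n$ as the diagonal elements.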
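$$S^2 = (y - \beta x)'\,\Omega^{-1}(y - \beta x) = \sum_{i=1}^{n}\frac{(y_i - \beta x_i)^2}{\lambda x_i}.$$
Solving
$$\frac{\partial S^2}{\partial\beta} = 0 \;\Rightarrow\; \sum_{i=1}^{n}(y_i - \hat{\beta}x_i) = 0$$
or
$$\hat{\beta} = \frac{\bar{y}}{\bar{x}} = \hat{R}.$$
Thus $\hat{R}$ is the best linear unbiased estimator of $R$. Consequently, $\hat{R}\bar{X} = \hat{\bar{Y}}_R$ is the best linear unbiased estimator of $\bar{Y}$.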
Ratio estimator in stratified sampling
Suppose a population of size $N$ is divided into $k$ strata. The objective is to estimate the population mean $\bar{Y}$ using the ratio method of estimation.
In such a situation, a random sample of size $n_i$ is drawn by SRSWOR from the $i$th stratum of size $N_i$ on the variable under study $Y$ and the auxiliary variable $X$.
Let
$y_{ij}$ : $j$th observation on $Y$ from the $i$th stratum,
$x_{ij}$ : $j$th observation on $X$ from the $i$th stratum, $i = 1, 2, \ldots, k$; $j = 1, 2, \ldots, n_i$.
An estimator of $\bar{Y}$ based on the philosophy of stratified sampling can be derived in the following two possible ways:
$$\hat{\bar{Y}}_{Rs} = \sum_{i=1}^{k}\frac{N_i}{N}\,\hat{\bar{Y}}_{Ri} = \sum_{i=1}^{k} w_i\,\hat{\bar{Y}}_{Ri} = \sum_{i=1}^{k} w_i\,\frac{\bar{y}_i}{\bar{x}_i}\,\bar{X}_i$$
where $w_i = N_i/N$ and
$\bar{y}_i = \frac{1}{n_i}\sum_{j=1}^{n_i} y_{ij}$ : sample mean of $Y$ from the $i$th stratum,
$\bar{x}_i = \frac{1}{n_i}\sum_{j=1}^{n_i} x_{ij}$ : sample mean of $X$ from the $i$th stratum,
$\bar{X}_i = \frac{1}{N_i}\sum_{j=1}^{N_i} x_{ij}$ : mean of all the $X$ units in the $i$th stratum.
No assumption is made that the true ratio remains constant from stratum to stratum. It depends on information on each $\bar{X}_i$.
$$\hat{\bar{Y}}_{Rc} = \frac{\bar{y}_{st}}{\bar{x}_{st}}\,\bar{X}$$
where $\bar{X}$ is the population mean of $X$ based on all the $N = \sum_{i=1}^{k} N_i$ units. It does not depend on individual
$$E(\hat{\bar{Y}}_R) = \bar{Y} + \frac{\bar{Y} f}{n}\left(C_X^2 - \rho\,C_X C_Y\right).$$
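$$f_i = \frac{N_i - n_i}{N_i}, \qquad C_{iy}^2 = \frac{S_{iy}^2}{\bar{Y}_i^2}, \qquad C_{ix}^2 = \frac{S_{ix}^2}{\bar{X}_i^2},$$
$$S_{iy}^2 = \frac{1}{N_i - 1}\sum_{j=1}^{N_i}(Y_{ij} - \bar{Y}_i)^2, \qquad S_{ix}^2 = \frac{1}{N_i - 1}\sum_{j=1}^{N_i}(X_{ij} - \bar{X}_i)^2,$$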
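$$\begin{aligned}
E(\hat{\bar{Y}}_{Rs}) &= \sum_{i=1}^{k} w_i\,E(\hat{\bar{Y}}_{Ri}) \\
&= \sum_{i=1}^{k} w_i\left[\bar{Y}_i + \bar{Y}_i\,\frac{f_i}{n_i}\left(C_{ix}^2 - \rho_i\,C_{ix}C_{iy}\right)\right] \\
&= \bar{Y} + \sum_{i=1}^{k}\frac{w_i\bar{Y}_i f_i}{n_i}\left(C_{ix}^2 - \rho_i\,C_{ix}C_{iy}\right),
\end{aligned}$$
$$Bias(\hat{\bar{Y}}_{Rs}) = E(\hat{\bar{Y}}_{Rs}) - \bar{Y} = \sum_{i=1}^{k}\frac{w_i\bar{Y}_i f_i}{n_i}\,C_{ix}\left(C_{ix} - \rho_i\,C_{iy}\right)$$
up to the second order of approximation.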
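Assuming the finite population correction to be approximately 1, $n_i = n/k$, and $C_{ix}$, $C_{iy}$ and $\rho_i$ to be the same for all the strata as $C_x$, $C_y$ and $\rho$ respectively, we have
$$Bias(\hat{\bar{Y}}_{Rs}) = \frac{k}{n}\,\bar{Y}\left(C_x^2 - \rho\,C_x C_y\right).$$
Thus the bias is negligible when the sample size within each stratum is sufficiently large, and $\hat{\bar{Y}}_{Rs}$ is unbiased when $C_{ix} = \rho_i\,C_{iy}$.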
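Now we derive the approximate MSE of $\hat{\bar{Y}}_{Rs}$. We have already derived the MSE of $\hat{\bar{Y}}_R$ earlier as
$$MSE(\hat{\bar{Y}}_R) = \frac{\bar{Y}^2 f}{n}\left(C_X^2 + C_Y^2 - 2\rho\,C_X C_Y\right) = \frac{f}{n(N-1)}\sum_{i=1}^{N}(Y_i - RX_i)^2$$
where $R = \frac{\bar{Y}}{\bar{X}}$.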
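Thus the MSE of the ratio estimate up to the second order of approximation based on the $i$th stratum is
$$MSE(\hat{\bar{Y}}_{Ri}) = \frac{f_i}{n_i}\,\bar{Y}_i^2\left(C_{iX}^2 + C_{iY}^2 - 2\rho_i\,C_{iX}C_{iY}\right) = \frac{f_i}{n_i(N_i - 1)}\sum_{j=1}^{N_i}(Y_{ij} - R_i X_{ij})^2$$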
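and so
$$\begin{aligned}
MSE(\hat{\bar{Y}}_{Rs}) &= \sum_{i=1}^{k} w_i^2\,MSE(\hat{\bar{Y}}_{Ri}) \\
&= \sum_{i=1}^{k}\frac{w_i^2 f_i}{n_i}\,\bar{Y}_i^2\left(C_{iX}^2 + C_{iY}^2 - 2\rho_i\,C_{iX}C_{iY}\right) \\
&= \sum_{i=1}^{k}\left[w_i^2\,\frac{f_i}{n_i(N_i - 1)}\sum_{j=1}^{N_i}(Y_{ij} - R_i X_{ij})^2\right].
\end{aligned}$$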
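An estimate of $MSE(\hat{\bar{Y}}_{Rs})$ can be found by substituting the unbiased estimators $s_{ix}^2$, $s_{iy}^2$ and $s_{ixy}$ of $S_{iX}^2$, $S_{iY}^2$ and $S_{iXY}$, respectively, for the $i$th stratum, and $R_i = \bar{Y}_i/\bar{X}_i$ can be estimated by $r_i = \bar{y}_i/\bar{x}_i$:
$$\widehat{MSE}(\hat{\bar{Y}}_{Rs}) = \sum_{i=1}^{k}\frac{w_i^2 f_i}{n_i}\left(s_{iy}^2 + r_i^2 s_{ix}^2 - 2r_i\,s_{ixy}\right).$$
Also
$$\widehat{MSE}(\hat{\bar{Y}}_{Rs}) = \sum_{i=1}^{k}\left[\frac{w_i^2 f_i}{n_i(n_i - 1)}\sum_{j=1}^{n_i}(y_{ij} - r_i x_{ij})^2\right].$$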
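$$\hat{\bar{Y}}_{Rc} = \frac{\sum_{i=1}^{k} w_i\bar{y}_i}{\sum_{i=1}^{k} w_i\bar{x}_i}\,\bar{X} = \frac{\bar{y}_{st}}{\bar{x}_{st}}\,\bar{X} = \hat{R}_c\,\bar{X}.$$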
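The two stratified estimators, $\hat{\bar{Y}}_{Rs}$ (a weighted sum of within-stratum ratio estimates, commonly called the separate ratio estimator) and $\hat{\bar{Y}}_{Rc}$ (one ratio of stratified means, commonly called the combined ratio estimator), can be compared side by side in a short sketch. The dictionary layout, function names, and data below are hypothetical choices made for this illustration.

```python
from statistics import fmean


def separate_ratio_estimate(strata):
    """Sum over strata of w_i * (ybar_i / xbar_i) * Xbar_i, with w_i = N_i / N.
    Needs the population mean of X in every stratum."""
    N = sum(s["N"] for s in strata)
    return sum(
        (s["N"] / N) * (fmean(s["y"]) / fmean(s["x"])) * s["X_mean"]
        for s in strata
    )


def combined_ratio_estimate(strata):
    """(ybar_st / xbar_st) * Xbar, using stratified means of y and x.
    Needs only the overall population mean of X."""
    N = sum(s["N"] for s in strata)
    y_st = sum((s["N"] / N) * fmean(s["y"]) for s in strata)
    x_st = sum((s["N"] / N) * fmean(s["x"]) for s in strata)
    # Overall mean of X, here rebuilt from the stratum means for convenience.
    x_pop_mean = sum((s["N"] / N) * s["X_mean"] for s in strata)
    return (y_st / x_st) * x_pop_mean


# Hypothetical two-stratum example with different ratios per stratum.
strata = [
    {"N": 10, "X_mean": 5.0, "x": [4.0, 6.0], "y": [8.0, 12.0]},
    {"N": 30, "X_mean": 10.0, "x": [8.0, 12.0], "y": [24.0, 36.0]},
]
sep = separate_ratio_estimate(strata)
comb = combined_ratio_estimate(strata)
print(sep, comb)
```

In this toy data the sample means of X equal the stratum means, so both estimators coincide; with real samples they generally differ, and the separate form needs every $\bar{X}_i$ while the combined form needs only $\bar{X}$.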
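It is difficult to find the exact expressions of the bias and mean squared error of $\hat{\bar{Y}}_{Rc}$, so we find their approximate expressions.
Define
$$\varepsilon_1 = \frac{\bar{y}_{st} - \bar{Y}}{\bar{Y}}, \qquad \varepsilon_2 = \frac{\bar{x}_{st} - \bar{X}}{\bar{X}}.$$
Then
$$E(\varepsilon_1) = 0, \qquad E(\varepsilon_2) = 0,$$
$$E(\varepsilon_1^2) = \sum_{i=1}^{k}\frac{N_i - n_i}{N_i n_i}\,\frac{w_i^2 S_{iY}^2}{\bar{Y}^2} = \sum_{i=1}^{k}\frac{f_i}{n_i}\,\frac{w_i^2 S_{iY}^2}{\bar{Y}^2}$$
(recall that in the case of $\hat{\bar{Y}}_R$, the corresponding quantity was $E(\varepsilon_0^2) = \frac{f}{n}\frac{S_Y^2}{\bar{Y}^2} = \frac{f}{n}C_Y^2$),
$$E(\varepsilon_2^2) = \sum_{i=1}^{k}\frac{f_i}{n_i}\,\frac{w_i^2 S_{iX}^2}{\bar{X}^2},$$
$$E(\varepsilon_1\varepsilon_2) = \sum_{i=1}^{k} w_i^2\,\frac{f_i}{n_i}\,\frac{S_{iXY}}{\bar{X}\bar{Y}}.$$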
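Thus assuming $|\varepsilon_2| < 1$,
$$\begin{aligned}
\hat{\bar{Y}}_{Rc} &= \frac{(1 + \varepsilon_1)\bar{Y}}{(1 + \varepsilon_2)\bar{X}}\,\bar{X} \\
&= \bar{Y}(1 + \varepsilon_1)(1 - \varepsilon_2 + \varepsilon_2^2 - \ldots) \\
&= \bar{Y}(1 + \varepsilon_1 - \varepsilon_2 - \varepsilon_1\varepsilon_2 + \varepsilon_2^2 - \ldots).
\end{aligned}$$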
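Retaining the terms up to order two, for the same reason as in the case of $\hat{\bar{Y}}_R$,
$$\hat{\bar{Y}}_{Rc} \approx \bar{Y}(1 + \varepsilon_1 - \varepsilon_2 - \varepsilon_1\varepsilon_2 + \varepsilon_2^2),$$
$$\hat{\bar{Y}}_{Rc} - \bar{Y} = \bar{Y}(\varepsilon_1 - \varepsilon_2 - \varepsilon_1\varepsilon_2 + \varepsilon_2^2).$$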
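$$\begin{aligned}
MSE(\hat{\bar{Y}}_{Rc}) &\approx \bar{Y}^2 E(\varepsilon_1 - \varepsilon_2)^2 \\
&= \bar{Y}^2\sum_{i=1}^{k}\frac{f_i}{n_i}\,w_i^2\left[\frac{S_{iX}^2}{\bar{X}^2} + \frac{S_{iY}^2}{\bar{Y}^2} - \frac{2S_{iXY}}{\bar{X}\bar{Y}}\right] \\
&= \bar{Y}^2\sum_{i=1}^{k}\frac{f_i}{n_i}\,w_i^2\left[\frac{S_{iX}^2}{\bar{X}^2} + \frac{S_{iY}^2}{\bar{Y}^2} - \frac{2\rho_i S_{iX}S_{iY}}{\bar{X}\bar{Y}}\right] \\
&= \frac{\bar{Y}^2}{\bar{Y}^2}\sum_{i=1}^{k}\frac{f_i}{n_i}\,w_i^2\left[\frac{\bar{Y}^2}{\bar{X}^2}S_{iX}^2 + S_{iY}^2 - 2\rho_i\,\frac{\bar{Y}}{\bar{X}}\,S_{iX}S_{iY}\right] \\
&= \sum_{i=1}^{k}\frac{f_i}{n_i}\,w_i^2\left(R^2 S_{iX}^2 + S_{iY}^2 - 2\rho_i R\,S_{iX}S_{iY}\right).
\end{aligned}$$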
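An estimate of $MSE(\hat{\bar{Y}}_{Rc})$ can be obtained by replacing $S_{iX}^2$, $S_{iY}^2$ and $S_{iXY}$ by their unbiased estimators $s_{ix}^2$, $s_{iy}^2$ and $s_{ixy}$ respectively, whereas $R = \frac{\bar{Y}}{\bar{X}}$ is replaced by $r = \frac{\bar{y}}{\bar{x}}$. Thus the following estimate is obtained:
$$\widehat{MSE}(\hat{\bar{Y}}_{Rc}) = \sum_{i=1}^{k}\frac{w_i^2 f_i}{n_i}\left(r^2 s_{ix}^2 + s_{iy}^2 - 2r\,s_{ixy}\right).$$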
$R_i = \frac{\bar{y}_i}{\bar{x}_i}$ in $MSE(\hat{\bar{Y}}_{Rs})$,
$R = \frac{\bar{Y}}{\bar{X}}$ in $MSE(\hat{\bar{Y}}_{Rc})$.
Thus
x is linear and passes through origin within each stratum. See as follows:
$$R_i S_{ix}^2 - \rho_i S_{ix}S_{iy} = 0 \;\Rightarrow\; R_i = \frac{\rho_i S_{ix}S_{iy}}{S_{ix}^2},$$
which is the estimator of the slope parameter in the regression of $y$ on $x$ in the $i$th stratum. In such a case
So unless $R_i$ varies considerably, the use of $\hat{\bar{Y}}_{Rc}$ would provide an estimate of $\bar{Y}$ with negligible bias.
If $R_i \approx R$, $\hat{\bar{Y}}_{Rc}$ can be as precise as $\hat{\bar{Y}}_{Rs}$ but its bias will be small. It also does not require knowledge of $\bar{X}_1, \bar{X}_2, \ldots, \bar{X}_k$.
surveys. There are several approaches to derive such estimators. We consider here two such approaches:
$$\hat{\bar{Y}}_{R0} = \frac{1}{n}\sum_{i=1}^{n} R_i\,\bar{X} = \bar{r}\,\bar{X}$$
where
$$\bar{r} = \frac{1}{n}\sum_{i=1}^{n} R_i.$$
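$$Bias(\hat{\bar{Y}}_{R0}) = E(\hat{\bar{Y}}_{R0}) - \bar{Y} = E(\bar{r}\bar{X}) - \bar{Y} = E(\bar{r})\bar{X} - \bar{Y}.$$
Since
$$E(\bar{r}) = \frac{1}{n}\sum_{i=1}^{n}\left(\frac{1}{N}\sum_{i=1}^{N} R_i\right) = \frac{1}{n}\sum_{i=1}^{n}\bar{R} = \bar{R},$$
so $Bias(\hat{\bar{Y}}_{R0}) = \bar{R}\bar{X} - \bar{Y}$.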
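Using the result that under SRSWOR, $Cov(\bar{x}, \bar{y}) = \frac{N-n}{Nn}\,S_{XY}$, it also follows that
$$\begin{aligned}
Cov(\bar{r}, \bar{x}) &= \frac{N-n}{Nn}\,\frac{1}{N-1}\sum_{i=1}^{N}(R_i - \bar{R})(X_i - \bar{X}) \\
&= \frac{N-n}{Nn}\,\frac{1}{N-1}\left(\sum_{i=1}^{N} R_i X_i - N\bar{R}\bar{X}\right) \\
&= \frac{N-n}{Nn}\,\frac{1}{N-1}\left(\sum_{i=1}^{N}\frac{Y_i}{X_i}\,X_i - N\bar{R}\bar{X}\right) \\
&= \frac{N-n}{Nn}\,\frac{1}{N-1}\left(N\bar{Y} - N\bar{R}\bar{X}\right) \\
&= \frac{N-n}{n}\,\frac{1}{N-1}\left[-Bias(\hat{\bar{Y}}_{R0})\right].
\end{aligned}$$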
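Thus using the result that in SRSWOR, $Cov(\bar{x}, \bar{y}) = \frac{N-n}{Nn}\,S_{XY}$, and therefore $Cov(\bar{r}, \bar{x}) = \frac{N-n}{Nn}\,S_{RX}$, we have
$$\begin{aligned}
Bias(\hat{\bar{Y}}_{R0}) &= -\frac{n(N-1)}{N-n}\,Cov(\bar{r}, \bar{x}) \\
&= -\frac{n(N-1)}{N-n}\,\frac{N-n}{Nn}\,S_{RX} \\
&= -\frac{N-1}{N}\,S_{RX}
\end{aligned}$$
where $S_{RX} = \frac{1}{N-1}\sum_{i=1}^{N}(R_i - \bar{R})(X_i - \bar{X})$.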
The following result helps in obtaining an unbiased estimator of the population mean:
Since under the SRSWOR set up,
$$E(s_{xy}) = S_{xy}$$
where
$$s_{xy} = \frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y}), \qquad S_{xy} = \frac{1}{N-1}\sum_{i=1}^{N}(X_i - \bar{X})(Y_i - \bar{Y}).$$
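So an unbiased estimator of the bias $Bias(\hat{\bar{Y}}_{R0}) = -\frac{N-1}{N}\,S_{RX}$ is obtained as follows:
$$\begin{aligned}
\widehat{Bias}(\hat{\bar{Y}}_{R0}) &= -\frac{N-1}{N}\,s_{rx} \\
&= -\frac{N-1}{N(n-1)}\sum_{i=1}^{n}(r_i - \bar{r})(x_i - \bar{x}) \\
&= -\frac{N-1}{N(n-1)}\left(\sum_{i=1}^{n} r_i x_i - n\bar{r}\bar{x}\right) \\
&= -\frac{N-1}{N(n-1)}\left(\sum_{i=1}^{n}\frac{y_i}{x_i}\,x_i - n\bar{r}\bar{x}\right) \\
&= -\frac{N-1}{N(n-1)}\left(n\bar{y} - n\bar{r}\bar{x}\right).
\end{aligned}$$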
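So
$$\widehat{Bias}(\hat{\bar{Y}}_{R0}) = \widehat{E(\hat{\bar{Y}}_{R0}) - \bar{Y}} = -\frac{n(N-1)}{N(n-1)}\,(\bar{y} - \bar{r}\bar{x}).$$
Thus
$$E\left[\hat{\bar{Y}}_{R0} - \widehat{Bias}(\hat{\bar{Y}}_{R0})\right] = \bar{Y}$$
or
$$E\left[\hat{\bar{Y}}_{R0} + \frac{n(N-1)}{N(n-1)}\,(\bar{y} - \bar{r}\bar{x})\right] = \bar{Y}.$$
Thus
$$\hat{\bar{Y}}_{R0} + \frac{n(N-1)}{N(n-1)}\,(\bar{y} - \bar{r}\bar{x}) = \bar{r}\bar{X} + \frac{n(N-1)}{N(n-1)}\,(\bar{y} - \bar{r}\bar{x})$$
is an unbiased estimator of the population mean.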
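The unbiasedness of this bias-corrected estimator (commonly known as the Hartley-Ross estimator) can be checked exactly on a small population by enumerating every SRSWOR sample. The sketch below does that; the function name and the toy population are hypothetical.

```python
from itertools import combinations
from statistics import fmean


def bias_corrected_ratio_mean(x, y, N, x_pop_mean):
    """r_bar * Xbar + n (N - 1) / (N (n - 1)) * (ybar - r_bar * xbar),
    where r_bar is the mean of the unit-level ratios y_i / x_i."""
    n = len(x)
    r_bar = fmean(yi / xi for xi, yi in zip(x, y))
    correction = n * (N - 1) / (N * (n - 1)) * (fmean(y) - r_bar * fmean(x))
    return r_bar * x_pop_mean + correction


# Hypothetical population of N = 5 units; all samples of size n = 3.
X = [1.0, 2.0, 4.0, 5.0, 8.0]
Y = [3.0, 3.5, 9.0, 8.0, 20.0]
N, n = len(X), 3
X_mean = fmean(X)

estimates = [
    bias_corrected_ratio_mean([X[i] for i in idx], [Y[i] for i in idx], N, X_mean)
    for idx in combinations(range(N), n)
]
# Under SRSWOR every sample is equally likely, so the plain average of the
# estimates over all samples is the exact expectation of the estimator.
expected_value = fmean(estimates)
print(expected_value, fmean(Y))
```

The two printed values agree up to floating-point rounding, whereas the same enumeration applied to $\bar{r}\bar{X}$ alone would in general not reproduce $\bar{Y}$.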
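With the sample of size $n = mg$ divided into $g$ groups of size $m$ each, and $E(\hat{R}) = R + \frac{a_1}{n} + \frac{a_2}{n^2} + \ldots$,
$$E(g\hat{R}) = gR + \frac{g a_1}{gm} + \frac{g a_2}{g^2 m^2} + \ldots = gR + \frac{a_1}{m} + \frac{a_2}{g m^2} + \ldots$$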
Let $\hat{R}_i^* = \frac{\sum^* y_i}{\sum^* x_i}$, where $\sum^*$ denotes the summation over all values of the sample except the $i$th group. So $\hat{R}_i^*$ is based on a simple random sample of size $m(g-1)$,
so we can express
$$E(\hat{R}_i^*) = R + \frac{a_1}{m(g-1)} + \frac{a_2}{m^2(g-1)^2} + \ldots$$
or
$$E\left[(g-1)\hat{R}_i^*\right] = (g-1)R + \frac{a_1}{m} + \frac{a_2}{m^2(g-1)} + \ldots$$
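Thus
$$E\left[g\hat{R} - (g-1)\hat{R}_i^*\right] = R - \frac{a_2}{g(g-1)m^2} + \ldots$$
or
$$E\left[g\hat{R} - (g-1)\hat{R}_i^*\right] = R - \frac{a_2}{n^2}\,\frac{g}{g-1} + \ldots$$
Hence the bias of $g\hat{R} - (g-1)\hat{R}_i^*$ is of order $\frac{1}{n^2}$.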
Now $g$ estimates of this form can be obtained, one estimator for each group. Then the jackknife or Quenouille's estimator is the average of these estimators:
$$\hat{R}_Q = g\hat{R} - (g-1)\,\frac{\sum_{i=1}^{g}\hat{R}_i^*}{g}.$$
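Quenouille's estimator is straightforward to compute once the sample is split into groups. The sketch below assumes, purely for illustration, that the $g$ groups are consecutive blocks of equal size in the order the observations are listed; the function name and data are hypothetical.

```python
def jackknife_ratio(x, y, g):
    """Quenouille (jackknife) estimator of R:
    g * R_hat - (g - 1) * mean of the g leave-one-group-out ratios."""
    n = len(x)
    if n != len(y) or g < 2 or n % g != 0:
        raise ValueError("need paired data, g >= 2, and n divisible by g")
    m = n // g                      # group size
    x_tot, y_tot = sum(x), sum(y)
    r_full = y_tot / x_tot          # R-hat from the whole sample

    leave_out = []
    for i in range(g):
        lo, hi = i * m, (i + 1) * m
        # Ratio of totals computed from all groups except the i-th.
        leave_out.append((y_tot - sum(y[lo:hi])) / (x_tot - sum(x[lo:hi])))

    return g * r_full - (g - 1) * sum(leave_out) / g


# Hypothetical sample of n = 4 split into g = 2 groups of size m = 2.
x = [1.0, 2.0, 3.0, 4.0]
y = [2.0, 3.0, 7.0, 8.0]
r_q = jackknife_ratio(x, y, g=2)
print(r_q)
```

Taking $g = n$ (groups of size one) gives the familiar leave-one-out jackknife.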
which is usually the case. This shows that if the auxiliary information is such that $\rho < -\frac{1}{2}\,\frac{C_x}{C_y}$, then we cannot use the ratio method of estimation to improve on the sample mean as an estimator of the population mean. So there is a need for another type of estimator which also makes use of information on the auxiliary variable $X$. The product estimator is an attempt in this direction.
The product estimator of the population mean $\bar{Y}$ is defined as
$$\hat{\bar{Y}}_P = \frac{\bar{y}\,\bar{x}}{\bar{X}},$$
assuming the population mean $\bar{X}$ to be known.
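Let $\varepsilon_0 = \frac{\bar{y} - \bar{Y}}{\bar{Y}}$, $\varepsilon_1 = \frac{\bar{x} - \bar{X}}{\bar{X}}$.
We write $\hat{\bar{Y}}_p$ as
$$\hat{\bar{Y}}_p = \frac{\bar{y}\,\bar{x}}{\bar{X}} = \bar{Y}(1 + \varepsilon_0)(1 + \varepsilon_1) = \bar{Y}(1 + \varepsilon_0 + \varepsilon_1 + \varepsilon_0\varepsilon_1).$$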
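$$Bias(\hat{\bar{Y}}_p) = \bar{Y}\,E(\varepsilon_0\varepsilon_1) = \frac{1}{\bar{X}}\,Cov(\bar{y}, \bar{x}) = \frac{f}{n\bar{X}}\,S_{xy},$$
which shows that the bias of $\hat{\bar{Y}}_p$ decreases as $n$ increases. The bias of $\hat{\bar{Y}}_p$ can be estimated by
$$\widehat{Bias}(\hat{\bar{Y}}_p) = \frac{f}{n\bar{X}}\,s_{xy}.$$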
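Writing $\hat{\bar{Y}}_p$ in terms of $\varepsilon_0$ and $\varepsilon_1$, we find that the mean squared error of the product estimator $\hat{\bar{Y}}_p$ up to the second order of approximation is
$$MSE(\hat{\bar{Y}}_p) = E(\hat{\bar{Y}}_p - \bar{Y})^2 = \bar{Y}^2 E(\varepsilon_0 + \varepsilon_1 + \varepsilon_0\varepsilon_1)^2 \approx \bar{Y}^2 E(\varepsilon_0^2 + \varepsilon_1^2 + 2\varepsilon_0\varepsilon_1).$$
Here terms in $(\varepsilon_1, \varepsilon_0)$ of degrees greater than two are assumed to be negligible. Using the expected values we find that
$$MSE(\hat{\bar{Y}}_p) = \frac{f}{n}\left(S_Y^2 + R^2 S_X^2 + 2R\,S_{XY}\right),$$
which can be estimated by
$$\widehat{MSE}(\hat{\bar{Y}}_p) = \frac{f}{n}\left(s_y^2 + r^2 s_x^2 + 2r\,s_{xy}\right)$$
where $r = \bar{y}/\bar{x}$.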
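The product estimator and its sample-based bias and MSE estimates can be sketched as below. The MSE estimate is taken in the form $\frac{f}{n}(s_y^2 + r^2 s_x^2 + 2r\,s_{xy})$, i.e. with a plus sign on the cross term, which is an assumption of this illustration; the function name and data are hypothetical.

```python
from statistics import fmean


def product_estimate(x, y, N, x_pop_mean):
    """Product estimate of the population mean, ybar * xbar / Xbar, together
    with estimates of its bias and (second-order) MSE under SRSWOR."""
    n = len(x)
    x_bar, y_bar = fmean(x), fmean(y)
    s_x2 = sum((v - x_bar) ** 2 for v in x) / (n - 1)
    s_y2 = sum((v - y_bar) ** 2 for v in y) / (n - 1)
    s_xy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y)) / (n - 1)

    f = (N - n) / N
    r = y_bar / x_bar
    estimate = y_bar * x_bar / x_pop_mean
    bias_hat = f * s_xy / (n * x_pop_mean)
    mse_hat = (f / n) * (s_y2 + r * r * s_x2 + 2 * r * s_xy)
    return estimate, bias_hat, mse_hat


# Hypothetical sample where y falls as x rises (negative correlation),
# the situation in which the product estimator is meant to help.
x = [2.0, 4.0, 6.0]
y = [9.0, 6.0, 3.0]
est, bias_hat, mse_hat = product_estimate(x, y, N=20, x_pop_mean=5.0)
print(est, bias_hat, mse_hat)
```

In this toy sample $s_y^2 + r^2 s_x^2 + 2r\,s_{xy}$ happens to cancel exactly, so the estimated MSE is zero; real data would not be this tidy.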
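where $Var(\bar{y})_{SRS} = \frac{f}{n}\,S_Y^2$, which shows that $\hat{\bar{Y}}_p$ is more efficient than the simple mean $\bar{y}$ for
$$\rho < -\frac{1}{2}\,\frac{C_x}{C_y} \quad \text{if } R > 0$$
and for
$$\rho > \frac{1}{2}\,\frac{C_x}{C_y} \quad \text{if } R < 0,$$
with $C_x$ and $C_y$ taken in absolute value.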
Further it is assumed that $X_1, X_2, \ldots, X_p$ are independent. Let $\bar{Y}, \bar{X}_1, \bar{X}_2, \ldots, \bar{X}_p$ be the population means of the variables $y, X_1, X_2, \ldots, X_p$. We assume that a SRSWOR of size $n$ is selected from the population of
$S_i^2$ : the population mean sum of squares for the variate $X_i$,
$s_i^2$ : the sample mean sum of squares for the variate $X_i$,
$S_0^2$ : the population mean sum of squares for the study variable $y$,
$s_0^2$ : the sample mean sum of squares for the study variable $y$,
$C_i = \frac{S_i}{\bar{X}_i}$ : coefficient of variation of the variate $X_i$,
$C_0 = \frac{S_0}{\bar{Y}}$ : coefficient of variation of the variate $y$,
$\rho_i = \frac{S_{iy}}{S_i S_0}$ : coefficient of correlation between $y$ and $X_i$,
$\hat{\bar{Y}}_{Ri} = \frac{\bar{y}}{\bar{x}_i}\,\bar{X}_i$ : ratio estimator of $\bar{Y}$, based on $X_i$.
$$Bias(\hat{\bar{Y}}_{Ri}) = \frac{f}{n}\,\bar{Y}\left(C_i^2 - \rho_i\,C_i C_0\right).$$
The variance of $\hat{\bar{Y}}_{MR}$ up to the second order of approximation is obtained as
$$Var(\hat{\bar{Y}}_{MR}) = \frac{f}{n}\,\bar{Y}^2\sum_{i=1}^{p} w_i^2\left(C_0^2 + C_i^2 - 2\rho_i\,C_0 C_i\right).$$