Chapter 3: Element Sampling Design (Part 2) : Jae-Kwang Kim

This document summarizes key aspects of systematic sampling and stratified sampling discussed in Chapter 3 of a textbook. For systematic sampling, it describes the setup, inclusion probabilities, estimation approach, and comparisons to simple random sampling. Stratified sampling is introduced as dividing the population into non-overlapping groups and sampling independently within each group. Estimation for stratified sampling combines estimates from each stratum. Sample allocation strategies like proportional and optimal allocation are also covered.


Chapter 3: Element sampling design (Part 2)

Jae-Kwang Kim
Iowa State University
Spring, 2013
Kim (ISU) Ch. 3: Element sampling design Spring, 2013 1 / 26
Systematic sampling
1. Systematic sampling
2. Stratified sampling
3. Domain estimation
Systematic sampling
Setup:
1. Have $N$ elements in a list.
2. Choose a positive integer $a$, called the sampling interval. Let $n = [N/a]$, the integer part of $N/a$. That is, $N = na + c$, where $c$ is an integer with $0 \le c < a$.
3. Select a random start $r$ from $\{1, 2, \ldots, a\}$ with equal probability.
4. The final sample is
\[
A = \begin{cases}
\{r,\ r + a,\ r + 2a,\ \ldots,\ r + (n-1)a\} & \text{if } c < r \le a, \\
\{r,\ r + a,\ r + 2a,\ \ldots,\ r + na\} & \text{if } 1 \le r \le c.
\end{cases}
\]
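The setup steps can be sketched in a few lines of Python. This is only an illustration: the function name `systematic_sample` and its arguments are hypothetical, and elements are labelled 1 to N as in the setup.

```python
import random


def systematic_sample(N, a, r=None, rng=random):
    """Return the 1-based labels of a systematic sample from a list of N elements.

    a is the sampling interval; r is the random start (drawn if not supplied).
    """
    if r is None:
        r = rng.randint(1, a)  # uniform on {1, ..., a}, endpoints included
    if not 1 <= r <= a:
        raise ValueError("random start r must lie in {1, ..., a}")
    # Stepping by a and stopping at N gives n + 1 elements when r <= c
    # and n elements when c < r <= a, where N = n * a + c.
    return list(range(r, N + 1, a))
```

With N = 10 and a = 3 (so n = 3, c = 1), the start r = 1 gives four elements and the starts r = 2, 3 give three each, which matches the two cases of the final sample.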
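Systematic sampling
Sample size can be random:
\[
n_A = \begin{cases} n & \text{if } c < r \le a, \\ n + 1 & \text{if } r \le c. \end{cases}
\]
Inclusion probabilities: each element belongs to exactly one of the $a$ possible samples, and each sample is selected with probability $1/a$, so
\[
\pi_k = \frac{1}{a}, \qquad
\pi_{kl} = \begin{cases} 1/a & \text{if } k \text{ and } l \text{ belong to the same systematic sample}, \\ 0 & \text{otherwise.} \end{cases}
\]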
Systematic sampling
Remark
- This is very easy to do.
- This is a probability sampling design.
- This is not a measurable sampling design: there is no design-unbiased estimator of the variance (because there is only one random draw).
- Pick one set of elements (which always go together) and measure each one: later, we will call this cluster sampling.
- Divide the population into non-overlapping groups and choose an element in each group: closely related to stratification.
Systematic sampling
Estimation
Partition the population into $a$ groups
\[ U = U_1 \cup U_2 \cup \cdots \cup U_a, \]
where the $U_i$ are disjoint.
Population total
\[
Y = \sum_{i \in U} y_i = \sum_{r=1}^{a} \sum_{k \in U_r} y_k = \sum_{r=1}^{a} t_r,
\]
where $t_r = \sum_{k \in U_r} y_k$.
Think of a finite population with $a$ elements with measurements $t_1, \ldots, t_a$.
Systematic sampling
Estimation (Cont'd)
HT estimator:
\[ \hat{Y}_{HT} = \frac{t_r}{1/a} = a\, t_r, \quad \text{if } A = U_r. \]
Variance: note that we are doing SRS of size one from the population of $a$ elements $\{t_1, \ldots, t_a\}$.
\[
\mathrm{Var}\left(\hat{Y}_{HT}\right) = \frac{a^2}{1}\left(1 - \frac{1}{a}\right) S_t^2,
\]
where
\[
S_t^2 = \frac{1}{a-1} \sum_{r=1}^{a} \left(t_r - \bar{t}\right)^2
\]
and $\bar{t} = \sum_{r=1}^{a} t_r / a$.
When is the variance small?
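Because only $a$ samples are possible, the HT estimator and its design variance can be checked by brute force. The sketch below is illustrative (the function names are hypothetical); it computes the totals $t_1, \ldots, t_a$, the estimate $a\,t_r$, and the variance $a^2(1 - 1/a)S_t^2$.

```python
def systematic_totals(y, a):
    """Totals t_1, ..., t_a of the a possible systematic samples.

    y holds the values in list order; y[r::a] picks every a-th value
    starting at 0-based position r, i.e. the sample with start r + 1.
    """
    return [sum(y[r::a]) for r in range(a)]


def ht_estimate(y, a, r):
    """HT estimate of the population total for random start r (1-based)."""
    return a * systematic_totals(y, a)[r - 1]


def design_variance(y, a):
    """Design variance a^2 (1 - 1/a) S_t^2 of the HT estimator."""
    t = systematic_totals(y, a)
    t_bar = sum(t) / a
    s2_t = sum((t_r - t_bar) ** 2 for t_r in t) / (a - 1)
    return a ** 2 * (1 - 1 / a) * s2_t
```

Averaging `(ht_estimate(y, a, r) - sum(y)) ** 2` over r = 1, ..., a reproduces `design_variance(y, a)`, since each start has probability 1/a.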
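Systematic sampling
Estimation (Cont'd)
Now, assuming $N = na$,
\[
V\left(\hat{Y}_{HT}\right) = a(a-1) S_t^2 = n^2 a \sum_{r=1}^{a} \left(\bar{y}_r - \bar{y}_u\right)^2,
\]
where $\bar{y}_r = t_r/n$ and $\bar{y}_u = \bar{t}/n$.
ANOVA: $U = \cup_{r=1}^{a} U_r$
\[
\begin{aligned}
SST &= \sum_{k \in U} \left(y_k - \bar{y}_u\right)^2
     = \sum_{r=1}^{a} \sum_{k \in U_r} \left(y_k - \bar{y}_u\right)^2 \\
    &= \sum_{r=1}^{a} \sum_{k \in U_r} \left(y_k - \bar{y}_r\right)^2 + n \sum_{r=1}^{a} \left(\bar{y}_r - \bar{y}_u\right)^2 \\
    &= SSW + SSB.
\end{aligned}
\]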
Systematic sampling
\[
V\left(\hat{Y}_{HT}\right) = na \cdot SSB = N \cdot SSB = N\left(SST - SSW\right).
\]
- If SSB is small, then the $\bar{y}_r$ are more alike and $V(\hat{Y}_{HT})$ is small.
- If SSW is small, then $V(\hat{Y}_{HT})$ is large.
- The intraclass correlation coefficient $\rho$ measures the homogeneity of clusters:
\[
\rho = 1 - \frac{n}{n-1} \frac{SSW}{SST}.
\]
More details about $\rho$ will be covered in cluster sampling (Chapter 4).
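The ANOVA decomposition and the intraclass correlation are easy to compute for a small list. The following is a minimal sketch under the assumption $N = na$ with $n \ge 2$ and a non-constant $y$; the function name is hypothetical.

```python
def anova_systematic(y, a):
    """Return (SST, SSW, SSB, rho) for the a systematic samples of y.

    Assumes N = n * a, n >= 2, and that y is not constant (so SST > 0).
    """
    N = len(y)
    if N % a != 0:
        raise ValueError("this sketch assumes N = n * a")
    n = N // a
    y_u = sum(y) / N                               # population mean
    groups = [y[r::a] for r in range(a)]           # the a possible samples
    means = [sum(g) / n for g in groups]           # sample means
    sst = sum((y_k - y_u) ** 2 for y_k in y)
    ssw = sum((y_k - m) ** 2 for g, m in zip(groups, means) for y_k in g)
    ssb = n * sum((m - y_u) ** 2 for m in means)
    rho = 1 - (n / (n - 1)) * ssw / sst            # intraclass correlation
    return sst, ssw, ssb, rho
```

For y = 1, 2, ..., 12 and a = 3, this gives SSB = 8 and a negative rho: the three samples look alike, so N * SSB = 96 is a small variance for the estimated total.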
Systematic sampling
Comparison between systematic sampling (SY) and SRS
How does SY compare to SRS when the population is sorted in one of the following ways?
1. Random ordering: intuitively, the two should be the same.
2. Linear ordering: SY should be better than SRS.
3. Periodic ordering: if the period equals $a$, SY can be terrible.
4. Autocorrelated ordering: successive $y_k$'s tend to lie on the same side of $\bar{y}_u$. Thus, SY should be better than SRS.
Systematic sampling
How to quantify?
\[
V_{SRS}\left(\hat{Y}_{HT}\right) = \frac{N^2}{n}\left(1 - \frac{n}{N}\right) \frac{1}{N-1} \sum_{k=1}^{N} \left(y_k - \bar{Y}_N\right)^2
\]
\[
V_{SY}\left(\hat{Y}_{HT}\right) = n^2 a \sum_{r=1}^{a} \left(\bar{y}_r - \bar{y}_u\right)^2
\]
Cochran (1946) introduced the superpopulation model to deal with this problem (treat $y_k$ as a random variable).
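The two variance formulas can be evaluated directly on artificial populations to see the effect of the ordering. The populations below (a linear trend and a pattern with period $a$) are made-up illustrations, and the function names are hypothetical; the sketch assumes $N = na$.

```python
def var_srs(y, n):
    """Design variance of the HT total estimator under SRS of size n."""
    N = len(y)
    y_bar = sum(y) / N
    s2 = sum((y_k - y_bar) ** 2 for y_k in y) / (N - 1)
    return N ** 2 / n * (1 - n / N) * s2


def var_sy(y, a):
    """Design variance under systematic sampling with interval a (N = n * a)."""
    N = len(y)
    n = N // a
    y_u = sum(y) / N
    means = [sum(y[r::a]) / n for r in range(a)]
    return n ** 2 * a * sum((m - y_u) ** 2 for m in means)


N, a = 100, 10
n = N // a
linear = [float(k) for k in range(1, N + 1)]              # y_k = k
periodic = [float((k - 1) % a) for k in range(1, N + 1)]  # period equal to a

v_sy_linear, v_srs_linear = var_sy(linear, a), var_srs(linear, n)
v_sy_periodic, v_srs_periodic = var_sy(periodic, a), var_srs(periodic, n)
```

In this example SY beats SRS on the linear population, while on the periodic population every systematic sample is constant, all the variation is between samples, and SY is much worse than SRS.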
Systematic sampling
Example: superpopulation model for a population in random order.
Denote the model by $\zeta$: $\{y_k\} \overset{iid}{\sim} \left(\mu, \sigma^2\right)$.
\[
E_\zeta\left\{V_{SRS}\left(\hat{Y}_{HT}\right)\right\} = \frac{N^2}{n}\left(1 - \frac{n}{N}\right) \sigma^2
\]
\[
E_\zeta\left\{V_{SY}\left(\hat{Y}_{HT}\right)\right\} = \frac{N^2}{n}\left(1 - \frac{n}{N}\right) \sigma^2
\]
Thus, the model expectations of the design variances are the same under the IID model.
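A short derivation sketch of why the two model expectations agree, under the extra assumption $N = na$ and writing $E_\zeta$ for expectation under the IID model with mean $\mu$ and variance $\sigma^2$. Under that model the $a$ sample means $\bar{y}_r$ are themselves IID with variance $\sigma^2/n$, and $\bar{y}_u$ is their average, so the sum of squared deviations has expectation $(a-1)\sigma^2/n$:
\[
E_\zeta\left\{V_{SY}\right\}
= n^2 a \, E_\zeta\left\{\sum_{r=1}^{a} \left(\bar{y}_r - \bar{y}_u\right)^2\right\}
= n^2 a \,(a-1)\frac{\sigma^2}{n}
= N(a-1)\sigma^2 .
\]
For SRS, the finite population variance $S^2 = (N-1)^{-1}\sum_{k=1}^{N}(y_k - \bar{Y}_N)^2$ has $E_\zeta(S^2) = \sigma^2$, and $N^2/n = Na$, $1 - n/N = (a-1)/a$, so
\[
E_\zeta\left\{V_{SRS}\right\}
= \frac{N^2}{n}\left(1 - \frac{n}{N}\right)\sigma^2
= Na \cdot \frac{a-1}{a}\,\sigma^2
= N(a-1)\sigma^2 .
\]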
Stratified sampling
Stratified sampling:
1. The finite population is stratified into $H$ subpopulations:
\[ U = U_1 \cup \cdots \cup U_H. \]
2. Within each subpopulation (or stratum), samples are drawn independently across the strata:
\[
\Pr\left(i \in A_h,\ j \in A_g\right) = \Pr\left(i \in A_h\right) \Pr\left(j \in A_g\right), \quad \text{for } h \ne g,
\]
where $A_h$ is the index set of the sample in stratum $h$, $h = 1, 2, \ldots, H$.
Example: Stratified SRS
1. Stratify the population. Let $N_h$ be the population size of $U_h$.
2. Sample size allocation: determine $n_h$.
3. Perform SRS independently (select $n_h$ sample elements from $N_h$) in each stratum.
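The three steps of stratified SRS can be sketched as follows. The function name, the dictionary layout, and the stratum labels in the usage note are hypothetical choices for illustration.

```python
import random


def stratified_srs(strata, n_h, seed=None):
    """Draw an SRS without replacement independently in each stratum.

    strata: dict mapping stratum label -> list of unit identifiers (U_h)
    n_h:    dict mapping stratum label -> sample size in that stratum
    Returns a dict mapping stratum label -> list of sampled units (A_h).
    """
    rng = random.Random(seed)
    # Successive calls to rng.sample use fresh random draws, so the
    # selections in different strata are independent of each other.
    return {h: rng.sample(units, n_h[h]) for h, units in strata.items()}
```

For example, `stratified_srs({"urban": list(range(50)), "rural": list(range(50, 80))}, {"urban": 5, "rural": 3})` returns five urban and three rural unit identifiers.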
Stratified sampling
Why stratification?
1. Control for domains of study
2. Flexibility in design and estimation
3. Convenience
4. Efficiency
Stratified sampling
Estimation
HT estimation for $t = \sum_{h=1}^{H} t_h$, where $t_h = \sum_{i \in U_h} y_i$.
1. HT estimator:
\[ \hat{t}_{HT} = \sum_{h=1}^{H} \hat{t}_{h,HT}, \]
where $\hat{t}_{h,HT}$ is unbiased for $t_h$.
2. Variance:
\[ \mathrm{Var}\left(\hat{t}_{HT}\right) = \sum_{h=1}^{H} \mathrm{Var}\left(\hat{t}_{h,HT}\right) \]
by independence.
3. Variance estimation:
\[ \hat{V}\left(\hat{t}_{HT}\right) = \sum_{h=1}^{H} \hat{V}_h\left(\hat{t}_{h,HT}\right), \]
where $\hat{V}_h(\hat{t}_{h,HT})$ is unbiased for $\mathrm{Var}(\hat{t}_{h,HT})$.
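Stratified sampling
Example: Stratified SRS
1. HT estimator:
\[ \hat{t}_{HT} = \sum_{h=1}^{H} N_h \bar{y}_h, \]
where $\bar{y}_h = n_h^{-1} \sum_{i \in A_h} y_i$.
2. Variance:
\[
\mathrm{Var}\left(\hat{t}_{HT}\right) = \sum_{h=1}^{H} \frac{N_h^2}{n_h}\left(1 - \frac{n_h}{N_h}\right) S_h^2,
\]
where $S_h^2 = (N_h - 1)^{-1} \sum_{i \in U_h} \left(y_i - \bar{Y}_h\right)^2$.
3. Variance estimation:
\[
\hat{V}\left(\hat{t}_{HT}\right) = \sum_{h=1}^{H} \frac{N_h^2}{n_h}\left(1 - \frac{n_h}{N_h}\right) s_h^2,
\]
where $s_h^2 = (n_h - 1)^{-1} \sum_{i \in A_h} \left(y_i - \bar{y}_h\right)^2$.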
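The stratified SRS formulas translate directly into code. This is a minimal sketch with hypothetical names and input layout; it assumes $n_h \ge 2$ in every stratum, because $s_h^2$ is undefined otherwise.

```python
from statistics import mean, variance


def stratified_estimate(samples, N_h):
    """HT estimate of the total and its variance estimate under stratified SRS.

    samples: dict mapping stratum label -> list of observed y values (n_h >= 2)
    N_h:     dict mapping stratum label -> stratum population size
    Returns (t_hat, v_hat).
    """
    t_hat = 0.0
    v_hat = 0.0
    for h, y in samples.items():
        n = len(y)
        t_hat += N_h[h] * mean(y)                      # N_h * ybar_h
        # statistics.variance divides by n_h - 1, i.e. it returns s_h^2
        v_hat += N_h[h] ** 2 / n * (1 - n / N_h[h]) * variance(y)
    return t_hat, v_hat
```

With samples `{"a": [1, 2, 3], "b": [10, 14]}` and sizes `{"a": 30, "b": 20}`, the estimate is 30 * 2 + 20 * 12 = 300 and the variance estimate is 270 + 1440 = 1710.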
Stratified sampling
Sample allocation: given $n = \sum_{h=1}^{H} n_h$, how to choose $n_h$?
1. Proportional allocation: choose $n_h \propto N_h$.
2. Optimal allocation: choose $n_h$ to
\[
\text{minimize } \mathrm{Var}\left(\hat{t}_{HT}\right) \quad \text{subject to} \quad \sum_{h=1}^{H} c_h n_h = C,
\]
where $c_h$ is the cost of observing an element in stratum $h$ and $C$ is a given total cost. The solution (Neyman, 1934) is
\[ n_h \propto N_h S_h / \sqrt{c_h}. \]
3. Properties
- Under proportional allocation, the weights are all equal.
- In general,
\[
V_{opt}\left(\hat{t}_{HT}\right) \le V_{prop}\left(\hat{t}_{HT}\right) \le V_{SRS}\left(\hat{t}_{HT}\right),
\]
where $V_{opt}(\hat{t}_{HT})$ is the variance of the stratified sampling estimator under optimal allocation, $V_{prop}(\hat{t}_{HT})$ is the variance of the stratified sampling estimator under proportional allocation, and $V_{SRS}(\hat{t}_{HT})$ is the variance of the SRS estimator.
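The two allocation rules can be compared numerically. The sketch below uses hypothetical function names and made-up stratum sizes and standard deviations; it returns real-valued allocations and leaves rounding to integers (and capping at $N_h$) aside.

```python
from math import sqrt


def proportional_allocation(N_h, n):
    """n_h proportional to N_h, with sum(n_h) = n."""
    N = sum(N_h)
    return [n * Nh / N for Nh in N_h]


def optimal_allocation(N_h, S_h, c_h, C):
    """n_h proportional to N_h S_h / sqrt(c_h), scaled so sum(c_h n_h) = C."""
    k = [Nh * Sh / sqrt(ch) for Nh, Sh, ch in zip(N_h, S_h, c_h)]
    scale = C / sum(ch * kh for ch, kh in zip(c_h, k))
    return [scale * kh for kh in k]


def stratified_variance(N_h, S_h, n_h):
    """Variance of the stratified SRS estimator of the total."""
    return sum(Nh ** 2 / nh * (1 - nh / Nh) * Sh ** 2
               for Nh, Sh, nh in zip(N_h, S_h, n_h))


# Made-up example with unit costs, so the budget C = 100 equals n = 100.
N_h, S_h, c_h = [500, 300, 200], [10.0, 20.0, 40.0], [1.0, 1.0, 1.0]
n_prop = proportional_allocation(N_h, 100)
n_opt = optimal_allocation(N_h, S_h, c_h, 100)
v_prop = stratified_variance(N_h, S_h, n_prop)
v_opt = stratified_variance(N_h, S_h, n_opt)
```

Here the optimal allocation moves sample from the large, homogeneous first stratum to the small, variable third stratum, and `v_opt` comes out below `v_prop`.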
Stratified sampling
Method of collapsed strata
$n_h = 1$ in every stratum: one-per-stratum design
1. Most efficient
2. No unbiased estimator of $\mathrm{Var}(\hat{t}_{HT})$ under stratified sampling.
Form pairs of strata:
\[
\left\{\hat{t}_1, \ldots, \hat{t}_H\right\} \rightarrow \left(\hat{t}_{j1}, \hat{t}_{j2}\right), \quad j = 1, 2, \ldots, H/2,
\]
where $H$ is even.
Variance estimator:
\[
\hat{V}_{coll} = \sum_{j=1}^{H/2} \left(\hat{t}_{j1} - \hat{t}_{j2}\right)^2
\]
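Stratified sampling
Method of collapsed strata (Cont'd)
Property
\[
\begin{aligned}
E\left(\hat{V}_{coll}\right)
&= E\left[\sum_{j=1}^{H/2} \left\{\left(\hat{t}_{j1} - t_{j1}\right) - \left(\hat{t}_{j2} - t_{j2}\right) - \left(t_{j2} - t_{j1}\right)\right\}^2\right] \\
&= \sum_{j=1}^{H/2} \left\{\mathrm{Var}\left(\hat{t}_{j1}\right) + \mathrm{Var}\left(\hat{t}_{j2}\right) + \left(t_{j2} - t_{j1}\right)^2\right\} \\
&= \sum_{h=1}^{H} \mathrm{Var}\left(\hat{t}_h\right) + \sum_{j=1}^{H/2} \left(t_{j1} - t_{j2}\right)^2 \\
&\ge \mathrm{Var}\left(\hat{t}_{HT}\right).
\end{aligned}
\]
Thus, it is a conservative variance estimator.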
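The collapsed strata estimator is a one-line computation once the strata are paired. In this sketch the pairing convention (consecutive entries of the list form a pair) is an assumption made for illustration, as is the function name.

```python
def collapsed_variance(t_hat):
    """Collapsed strata variance estimator for a one-per-stratum design.

    t_hat: per-stratum estimated totals, ordered so that entries
           (0, 1), (2, 3), ... are the collapsed pairs; H must be even.
    """
    if len(t_hat) % 2 != 0:
        raise ValueError("the number of strata H must be even")
    return sum((t_hat[j] - t_hat[j + 1]) ** 2
               for j in range(0, len(t_hat), 2))
```

For `t_hat = [10, 12, 20, 17]` the pairs are (10, 12) and (20, 17), giving 4 + 9 = 13. The upward bias term $\sum_j (t_{j1} - t_{j2})^2$ is smallest when paired strata have similar true totals.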
Domain estimation
Basic setup
- Estimation for domains (subpopulations): usually we want to make inference about subpopulations as well as the whole population.
- Often, we do not plan for every subpopulation of interest, so the sample size within a subpopulation is random.
- Denote domain $d$ by $U_d \subset U$. Parameters are
  - $N_d = |U_d|$: number of elements in $U_d$
  - $P_d = N_d/N$: proportion of elements in $U_d$. Often, $N$ is known but $N_d$ is unknown.
  - $t_d = \sum_{i \in U_d} y_i$: domain total of $y$ in domain $d$
  - $\bar{Y}_d = t_d/N_d$: domain mean of $y$ in domain $d$
Domain estimation
Domain estimation
For $k = 1, 2, \ldots, N$, define
\[
z_{kd} = \begin{cases} 1 & \text{if } k \in U_d, \\ 0 & \text{if } k \notin U_d. \end{cases}
\]
Note that $z_{kd}$ is not a random variable (i.e., it does not depend on the sampling scheme).
Properties of $z_{kd}$:
1. $\sum_{k \in U} z_{kd} = N_d$
2. $\bar{Z}_d = \sum_{k \in U} z_{kd}/N = N_d/N = P_d$
3. $S_{zd}^2 = \dfrac{1}{N-1}\left\{\sum_{k \in U} z_{kd}^2 - N \bar{Z}_d^2\right\} = \dfrac{N}{N-1} P_d\left(1 - P_d\right)$
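Domain estimation
HT estimation of $N_d$:
\[
\hat{N}_d = \sum_{k \in U} \frac{z_{kd} I_k}{\pi_k}
\]
Under SRS,
\[
\hat{N}_d = \sum_{k \in U} \frac{z_{kd} I_k}{n/N} = N n_d / n = N p_d,
\]
where $n_d$ is the number of sampled elements in $U_d$ and $p_d = n_d/n$, and
\[
\mathrm{Var}\left(\hat{N}_d\right) = \frac{N^2}{n}\left(1 - \frac{n}{N}\right) S_{zd}^2 = \frac{N^2}{n}\left(1 - \frac{n-1}{N-1}\right) P_d\left(1 - P_d\right)
\]
\[
\hat{V}\left(\hat{N}_d\right) = \frac{N^2}{n}\left(1 - \frac{n}{N}\right) s_{zd}^2 = N^2\left(1 - \frac{n}{N}\right) \frac{p_d\left(1 - p_d\right)}{n-1}.
\]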
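Under SRS the estimator of the domain size and its variance estimator need only the sampled domain indicators and $N$. A minimal sketch, with a hypothetical function name and assuming $n \ge 2$:

```python
def domain_count_srs(z, N):
    """Estimate the domain size N_d under SRS, with its variance estimate.

    z: list of 0/1 domain indicators z_kd for the n sampled elements (n >= 2)
    N: population size
    Returns (N_d_hat, v_hat).
    """
    n = len(z)
    p_d = sum(z) / n                       # sample proportion in the domain
    N_d_hat = N * p_d                      # HT estimator under SRS
    v_hat = N ** 2 * (1 - n / N) * p_d * (1 - p_d) / (n - 1)
    return N_d_hat, v_hat
```

With `z = [1, 0, 1, 1, 0]` and `N = 100`, the sample proportion is 0.6, so the estimated domain size is 60 and the variance estimate is 10000 * 0.95 * 0.24 / 4 = 570.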
Domain estimation
HT estimation of $t_d = \sum_{k \in U_d} y_k = \sum_{k \in U} y_k z_{kd}$:
\[
\hat{t}_d = \sum_{k \in U} \frac{y_k z_{kd} I_k}{\pi_k} = \sum_{k \in A} \frac{y_k z_{kd}}{\pi_k}.
\]
It is unbiased for $t_d$.
HT estimator of $\bar{Y}_d = t_d/N_d$:
\[
\bar{y}_d = \frac{\hat{t}_d}{\hat{N}_d}
\]
In general it is not exactly unbiased, because it is a nonlinear function of unbiased estimators.
Generally, we will express population parameters as functions of population totals and then apply HT estimation to each total.
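Domain estimation
The statistical properties of $\bar{y}_d$ can be derived from the following approximation:
\[
\begin{aligned}
\bar{y}_d = \frac{\hat{t}_d}{\hat{N}_d} &= f\left(\hat{N}_d, \hat{t}_d\right) \\
&\doteq f\left(N_d, t_d\right)
 + \left[\frac{\partial}{\partial t_d} f\left(N_d, t_d\right)\right]\left(\hat{t}_d - t_d\right)
 + \left[\frac{\partial}{\partial N_d} f\left(N_d, t_d\right)\right]\left(\hat{N}_d - N_d\right) \\
&= \frac{t_d}{N_d} + \left(\frac{1}{N_d}\right)\left(\hat{t}_d - t_d\right) + \left(-\frac{t_d}{N_d^2}\right)\left(\hat{N}_d - N_d\right)
\end{aligned}
\]
Thus,
\[
\mathrm{Var}\left(\bar{y}_d\right) \doteq \mathrm{Var}\left\{\frac{1}{N_d}\left(\hat{t}_d - \bar{Y}_d \hat{N}_d\right)\right\}.
\]
Under SRS,
\[
\mathrm{Var}\left(\bar{y}_d\right) \doteq \left(\frac{1}{E(n_d)} - \frac{1}{N_d}\right) \frac{1}{N_d - 1} \sum_{i \in U_d} \left(y_i - \bar{Y}_d\right)^2.
\]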
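Under SRS the weights $N/n$ cancel in the ratio, so $\bar{y}_d$ is just the sample mean of the sampled elements that fall in the domain. The sketch below (hypothetical names) computes that estimate and evaluates the approximate variance from a fully known toy population, using $E(n_d) = n N_d / N$ under SRS.

```python
def domain_mean_srs(y, z):
    """Domain mean estimate t_d_hat / N_d_hat under SRS.

    y: sampled y values; z: matching 0/1 domain indicators.
    The common weight N/n cancels between numerator and denominator.
    """
    n_d = sum(z)
    if n_d == 0:
        raise ValueError("no sampled element falls in the domain")
    return sum(y_k * z_k for y_k, z_k in zip(y, z)) / n_d


def approx_var_domain_mean(y_pop, z_pop, n):
    """Linearization variance of the domain mean under SRS of size n.

    Uses the full population (y_pop, z_pop), so it illustrates the
    formula rather than estimating it from a sample. Needs N_d >= 2.
    """
    N = len(y_pop)
    y_d = [y_k for y_k, z_k in zip(y_pop, z_pop) if z_k == 1]
    N_d = len(y_d)
    Y_bar_d = sum(y_d) / N_d
    S2_d = sum((y_k - Y_bar_d) ** 2 for y_k in y_d) / (N_d - 1)
    expected_n_d = n * N_d / N             # E(n_d) under SRS
    return (1 / expected_n_d - 1 / N_d) * S2_d
```

For the population y = 1, ..., 10 with the first four elements forming the domain and n = 5, the expected domain sample size is 2 and the approximate variance is (1/2 - 1/4) * 5/3 = 5/12.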
