Chapter 7
Sampling & Sampling Distributions
Reasons for Sampling
Sampling can save money.
Sampling can save time.
For given resources, sampling can
broaden the scope of the data set.
Because the research process is
sometimes destructive, the sample can
save product.
If accessing the population is impossible,
sampling is the only option.
Reasons for Taking a Census
Eliminate the possibility that a random
sample is not representative of the
population.
For the safety of the consumer
Population Frame
Frame - a list, map, directory, or other source used to
represent the population
Overregistration - the frame contains all members of
the target population and some additional elements
Example: Using Bell Montreal Telephone Registry
as a listing of Bell telephones in Montreal
Underregistration - the frame does not contain all
members of the target population.
Example: using the chamber of commerce
membership directory as the frame for a target
population of all businesses.
Random Versus Nonrandom
Sampling
Random sampling
• Every unit of the population has the same
probability of being included in the
sample.
• A chance mechanism is used in the
selection process.
• Eliminates bias in the selection process
• Also known as probability sampling
Nonrandom Sampling
• Not every unit of the population has
the same probability of being included in
the sample.
• Open to selection bias
• Not an appropriate data collection method
for most statistical methods
• Also known as nonprobability sampling
Random Sampling Techniques
Simple Random Sample
Stratified Random Sample
Proportionate
Disproportionate
Systematic Random Sample
Cluster (or Area) Sampling
Simple Random Sampling
Number each frame unit from 1 to N.
Use a random number table or a
random number generator to select n
distinct numbers between 1 and N,
inclusively.
Easier to perform for small populations
Cumbersome for large populations
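The numbering-and-drawing procedure above can be sketched in a few lines of Python. This is only an illustration: the function name `simple_random_sample` and the values N = 30, n = 6 are assumptions made for the example, and the standard library's pseudorandom generator stands in for a random number table.

```python
import random

def simple_random_sample(N, n, seed=None):
    """Select n distinct unit numbers between 1 and N, inclusive."""
    if not 0 <= n <= N:
        raise ValueError("n must be between 0 and N")
    rng = random.Random(seed)  # seed only makes the draw repeatable
    # range(1, N + 1) numbers the frame units 1..N;
    # sample() draws without replacement, so the numbers are distinct.
    return sorted(rng.sample(range(1, N + 1), n))

# Hypothetical frame of 30 units, sample of 6
chosen = simple_random_sample(N=30, n=6, seed=1)
print(chosen)
```

With a generator, a large N is no harder than a small one; the cumbersome part for large populations is building and numbering the frame itself.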
Stratified Random Sampling
Population is divided into nonoverlapping
subpopulations called strata.
A random sample is selected from each stratum.
Potential for reducing sampling error.
Proportionate - the percentage of the sample
taken from each stratum is proportionate to the
percentage that each stratum is within the
population
Disproportionate - proportions of the strata
within the sample are different than the
proportions of the strata within the population
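As a rough sketch of proportionate stratified sampling, the code below allocates the sample to each stratum according to its share of the population and then draws a simple random sample within each stratum. The function name, the three strata "A", "B", "C", and their sizes are hypothetical, chosen only so the allocation comes out in whole numbers.

```python
import random

def proportionate_stratified_sample(strata, n, seed=None):
    """strata maps a stratum name to its list of units; returns a sample per stratum."""
    rng = random.Random(seed)
    N = sum(len(units) for units in strata.values())
    sample = {}
    for name, units in strata.items():
        # Allocation proportional to the stratum's share of the population.
        # Rounding may make the total differ slightly from n in general.
        n_h = round(n * len(units) / N)
        sample[name] = rng.sample(units, n_h)
    return sample

# Hypothetical population of 100 units in three nonoverlapping strata
strata = {
    "A": list(range(1, 61)),    # 60% of the population
    "B": list(range(61, 91)),   # 30%
    "C": list(range(91, 101)),  # 10%
}
picked = proportionate_stratified_sample(strata, n=20, seed=7)
print({name: len(units) for name, units in picked.items()})
# {'A': 12, 'B': 6, 'C': 2}
```

A disproportionate design would replace the computed `n_h` with allocations chosen by the researcher.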
Systematic Sampling
Convenient and relatively easy to administer
Population elements are an ordered sequence (at least,
conceptually).
The first sample element is selected randomly from the
first k population elements.
Thereafter, sample elements are selected at a constant
interval, k, from the ordered sequence frame.
$k = \frac{N}{n}$
where:
n = sample size
N = population size
k = size of selection interval
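A minimal sketch of the systematic procedure, assuming the frame units are numbered 1 to N. The function name and the values N = 1000, n = 50 are illustrative, and rounding k down when N/n is not a whole number is an assumption of this sketch, not something the slide specifies.

```python
import random

def systematic_sample(N, n, seed=None):
    """Return n positions from an ordered frame numbered 1..N."""
    rng = random.Random(seed)
    k = N // n                 # selection interval k = N / n, rounded down
    start = rng.randint(1, k)  # random start among the first k elements
    # Every k-th element after the random start
    return [start + i * k for i in range(n)]

# Hypothetical frame of 1000 units, sample of 50, so k = 20
sample = systematic_sample(N=1000, n=50, seed=3)
print(sample[:5], "...", sample[-1])
```

Only the starting point is random; once it is fixed, every other selection is determined by k.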
Cluster (Area) Sampling
Population is divided into nonoverlapping clusters or
areas
Each cluster is a miniature, or microcosm, of the
population.
A subset of the clusters is selected randomly for the
sample.
If the number of elements in the subset of clusters is
larger than the desired value of n, these clusters may
be subdivided to form a new set of clusters and
subjected to a random selection process.
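The selection step can be sketched as follows. The cluster names and members are made up for illustration, and the sketch covers only the single-stage case in which every element of a selected cluster enters the sample; the subdivision of oversized clusters is not shown.

```python
import random

def cluster_sample(clusters, m, seed=None):
    """Randomly select m whole clusters; clusters maps a cluster name to its elements."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), m)  # random subset of cluster names
    return {name: clusters[name] for name in chosen}

# Hypothetical population split into four nonoverlapping areas
clusters = {
    "North": ["n1", "n2", "n3"],
    "South": ["s1", "s2"],
    "East":  ["e1", "e2", "e3", "e4"],
    "West":  ["w1", "w2", "w3"],
}
picked = cluster_sample(clusters, m=2, seed=5)
print(picked)
```

Note that the random draw is made over clusters, not over individual elements, which is why no element-level frame is needed.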
Advantages
• More convenient for geographically dispersed
populations
• Reduced travel costs to contact sample elements
• Simplified administration of the survey
• Usable when the unavailability of a sampling frame
prohibits other random sampling methods
Disadvantages
• Statistically less efficient when the cluster elements
are similar
• Costs and problems of statistical analysis are
greater than for simple random sampling
Nonrandom Sampling
Techniques
Convenience Sampling
Judgment Sampling
Quota Sampling
Snowball Sampling
Convenience Sampling
Sample elements are selected for the convenience
of the researcher.
Convenience sampling is probably the most common of all
sampling techniques. Samples are selected because they are
accessible to the researcher; subjects are chosen simply
because they are easy to recruit. This technique is
considered the easiest, cheapest, and least
time-consuming.
Judgment Sampling
Sample elements are selected by the judgment of
the researcher.
Judgment sampling is more commonly known as purposive
sampling. Subjects are chosen to be part of the sample
with a specific purpose in mind: the researcher believes
that some subjects are better suited to the research
than others, and chooses them for that reason.
Quota Sampling
Sample elements are selected until the quota controls are
satisfied.
Quota sampling is a nonprobability sampling technique in which the
researcher ensures equal or proportionate representation of
subjects, depending on which trait is used as the basis of
the quota.
For example, if the basis of the quota is college year level and the
researcher needs equal representation with a sample size of 100, the
researcher must select 25 1st-year students, 25 2nd-year students,
25 3rd-year students, and 25 4th-year students. The bases of the quota
are usually age, gender, education, race, religion, and socioeconomic status.
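The college-year example can be sketched as a loop that accepts subjects in the order they are encountered until each quota is full. The function name and the generated stream of subjects are hypothetical. Note that nothing in the loop is random, which is what makes this a nonrandom technique.

```python
def quota_sample(subjects, quotas):
    """Accept (name, group) pairs in arrival order until every quota is satisfied."""
    remaining = dict(quotas)
    selected = []
    for name, group in subjects:
        if remaining.get(group, 0) > 0:
            selected.append((name, group))
            remaining[group] -= 1
        if not any(remaining.values()):
            break  # all quota controls are satisfied
    return selected

# Hypothetical stream of 200 students tagged with a year level 1..4
subjects = [(f"student{i}", 1 + i % 4) for i in range(200)]
quotas = {1: 25, 2: 25, 3: 25, 4: 25}  # equal representation, sample size 100
selected = quota_sample(subjects, quotas)
print(len(selected))
# 100
```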
Snowball Sampling
Survey subjects are selected based on referral from
other survey respondents.
Snowball sampling is usually used when the population
size is very small. The researcher asks the initial
subject to identify another potential subject who also
meets the criteria of the research. The downside of a
snowball sample is that it is hardly representative
of the population.
Errors
• Data from nonrandom samples are not appropriate for analysis
by inferential statistical methods.
• Sampling error occurs when the sample is not
representative of the population.
This happens when the sampling process excludes or
under-represents a specific group, deliberately or inadvertently. If
that group differs from the rest of the population with respect
to the survey issues, bias will occur.
• Nonsampling Errors
Some examples of nonsampling errors are:
  • Missing data, recording, data entry, and analysis
  errors
  • Poorly conceived concepts, unclear definitions, and
  defective questionnaires
  • Response errors, which occur when people do not know, will
  not say, or overstate in their answers
Sampling Distribution of $\bar{X}$
Central Limit Theorem
Z Formula for Sample Means
$Z = \frac{\bar{X} - \mu_{\bar{X}}}{\sigma_{\bar{X}}} = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}}$
Example:
The mean expenditure per customer at a tire
store is P85.00, with a standard deviation of
P9.00. If a random sample of 40 customers is
taken, what is the probability that the sample
average expenditure per customer for this
sample will be P87.00 or more?
$Z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}} = \frac{87 - 85}{9 / \sqrt{40}} = \frac{2}{1.42} = 1.41$
The z value of 1.41 yields a probability of 0.4207.
Solving for the tail of the distribution yields
.5000 - .4207 = .0793
* 7.93% of the time, a random sample of 40 customers from
this population will yield a sample mean expenditure of P87.00
or more.
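The tire store calculation can be checked numerically. The sketch below uses Python's standard-library `statistics.NormalDist` in place of a z table; the variable names are chosen for this example only.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma = 85.0, 9.0   # population mean and standard deviation (pesos)
n, x_bar = 40, 87.0     # sample size and sample mean of interest

std_error = sigma / sqrt(n)           # standard deviation of the sample mean
z = (x_bar - mu) / std_error          # Z formula for sample means
p_tail = 1 - NormalDist().cdf(z)      # P(sample mean >= 87)

print(round(z, 2), round(p_tail, 4))
```

The computed tail probability is close to, but not exactly, .0793, because the table lookup rounds z to 1.41 before reading the area while the code uses the unrounded value.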
Sampling from a Finite Population
without Replacement
In this case, the standard deviation of the
distribution of sample means is smaller than
when sampling from an infinite population (or
from a finite population with replacement).
The correct value of this standard deviation is
computed by applying a finite correction factor
to the standard deviation for sampling from an
infinite population.
If the sample size is less than 5% of the
population size, the adjustment is unnecessary.
Finite Correction Factor
$\sqrt{\frac{N - n}{N - 1}}$
Modified Z Formula
$Z = \frac{\bar{X} - \mu}{\frac{\sigma}{\sqrt{n}} \sqrt{\frac{N - n}{N - 1}}}$
Finite Correction Factor for Selected Sample Sizes
Population   Sample     Sample %        Value of
Size (N)     Size (n)   of Population   Correction Factor
6,000        30         0.50%           0.998
6,000        100        1.67%           0.992
6,000        500        8.33%           0.958
2,000        30         1.50%           0.993
2,000        100        5.00%           0.975
2,000        500        25.00%          0.866
500          30         6.00%           0.971
500          50         10.00%          0.950
500          100        20.00%          0.895
200          30         15.00%          0.924
200          50         25.00%          0.868
200          75         37.50%          0.793
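The entries in the table above follow directly from the finite correction factor formula. A short sketch that recomputes a few of the rows (the function name `finite_correction_factor` is chosen for this example):

```python
from math import sqrt

def finite_correction_factor(N, n):
    """Square root of (N - n) / (N - 1) for population size N and sample size n."""
    return sqrt((N - n) / (N - 1))

# Recompute a few rows of the table
for N, n in [(6000, 30), (2000, 100), (500, 50), (200, 75)]:
    share = n / N
    factor = finite_correction_factor(N, n)
    print(f"{N:>5} {n:>4} {share:>7.2%} {factor:.3f}")
```

The factor stays close to 1 while the sample is a small share of the population, which is why the adjustment is usually skipped below about 5%.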
Example:
A production company’s 350 hourly
employees average 37.6 years of age, with
a standard deviation of 8.3 years. If a
random sample of 45 hourly employees is
taken, what is the probability that the sample
will have an average age of less than 40
years?
Given:
population mean (µ) = 37.6
population standard deviation (σ) = 8.3
sample size (n) = 45
finite population size (N) = 350
sample mean under consideration ($\bar{X}$) = 40
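Solution:
$Z = \frac{\bar{X} - \mu}{\frac{\sigma}{\sqrt{n}} \sqrt{\frac{N - n}{N - 1}}} = \frac{40 - 37.6}{\frac{8.3}{\sqrt{45}} \sqrt{\frac{350 - 45}{350 - 1}}} = \frac{2.4}{1.157} = 2.07$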
The z value of 2.07 yields a probability of .4808.
Therefore, the probability of getting a sample
average age of less than 40 years is
.4808 + .5000 = .9808.
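The same result can be reproduced in code. This is a sketch using the standard-library `statistics.NormalDist` instead of a z table, with variable names picked for the example.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma = 37.6, 8.3   # population mean and standard deviation (years)
N, n = 350, 45          # finite population size and sample size
x_bar = 40.0            # sample mean under consideration

# n / N is about 12.9%, above the 5% guideline, so the correction is applied
fcf = sqrt((N - n) / (N - 1))
std_error = (sigma / sqrt(n)) * fcf
z = (x_bar - mu) / std_error
p_below = NormalDist().cdf(z)   # P(sample mean < 40)

print(round(z, 2), round(p_below, 4))
```

Without the correction factor the standard error would be about 1.237 instead of about 1.157, giving a smaller z and a slightly lower probability.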
Sampling Distribution of $\hat{p}$
Sample Proportion
$\hat{p} = \frac{X}{n}$
where:
X = number of items in a sample that possess the characteristic
n = number of items in the sample
Sampling Distribution
• Approximately normal if nP > 5 and nQ > 5
(P is the population proportion and Q = 1 - P.)
• The mean of the distribution is P.
• The standard deviation of the distribution is $\sqrt{\frac{P \cdot Q}{n}}$
Z Formula for Sample Proportions
$Z = \frac{\hat{p} - P}{\sqrt{\frac{P \cdot Q}{n}}}$
Problem:
Suppose 60% of the electrical contractors
in a region use a particular brand of wire.
What is the probability of taking a
random sample of size 120 from these
electrical contractors and finding that .50
or less use that brand of wire?
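Solution:
Given:
P = .60, $\hat{p}$ = .50, n = 120
$Z = \frac{\hat{p} - P}{\sqrt{\frac{P \cdot Q}{n}}} = \frac{.50 - .60}{\sqrt{\frac{(.60)(.40)}{120}}} = \frac{-.10}{.0447} = -2.24$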
The probability corresponding to z = -2.24 is .4875.
For z ≤ -2.24 (the tail of the distribution), the answer
is .5000 - .4875 = .0125.
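A numerical check of the wire-brand problem, sketched with the standard-library `statistics.NormalDist`. The variable names are illustrative, and the explicit check of the nP > 5 and nQ > 5 conditions is added here for completeness.

```python
from math import sqrt
from statistics import NormalDist

P, n = 0.60, 120   # population proportion and sample size
p_hat = 0.50       # sample proportion of interest
Q = 1 - P

# Normal approximation is reasonable only if nP > 5 and nQ > 5
if not (n * P > 5 and n * Q > 5):
    raise ValueError("sample too small for the normal approximation")

std_error = sqrt(P * Q / n)        # standard deviation of the sample proportion
z = (p_hat - P) / std_error        # Z formula for sample proportions
p_tail = NormalDist().cdf(z)       # P(sample proportion <= .50)

print(round(z, 2), round(p_tail, 4))
```

The computed probability differs slightly from .0125 in the fourth decimal place because the table lookup uses z rounded to -2.24.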