SAMPLING
The Importance of Sampling
Sekaran, U. & R. Bougie, Research Methods for Business, John Wiley and Sons Inc., 2016, 7th edition, p.235.
Representativeness of Samples
Sample statistics cannot be exactly equal to the population parameters.
However, they should predict these parameters as closely as possible.
Two things are very important to achieve this:
Sampling design
Sample size
Sekaran, U. & R. Bougie, Research Methods for Business, John Wiley and Sons Inc., 2016, 7th edition, p.238.
Reasons for Sampling
Time
Cost
Fatigue and errors increase if we do not sample
Destructive sampling: destructive nature of measurement (e.g. health,
quality control)
Sekaran, U. & R. Bougie, Research Methods for Business, John Wiley and Sons Inc., 2016, 7th edition, pp.235-236.
Basic Terminology of Sampling
Population: the entire group of people, events or things of interest that the
researcher wishes to investigate
- effectiveness of advertising strategies for products directed at preschool
children in Turkey (Population: All preschool children in Turkey)
- Advertising strategies of touristic hotels in Turkey (Population: All touristic
hotels in Turkey)
Element: a single member of the population
Sample: a subset of the population
Subject: a single member of the sample
Census: a complete measurement of the total population (used when the
population is so small that sampling is not possible or meaningful)
Sekaran, U. & R. Bougie, Research Methods for Business, John Wiley and Sons Inc., 2016, 7th edition, pp.236-237.
The Sampling Process
Step 1: Define the population:
A precise definition of the target population is important
What are the characteristics of Turkish consumers who shop online?
P: All online shoppers in Turkey
University students’ satisfaction with the education system in Turkey
P: All university students in Turkey
University students’ satisfaction with Boğaziçi University
P: All students at Boğaziçi
Financial situation of companies in the food sector
P: All companies in the food sector
Sekaran, U. & R. Bougie, Research Methods for Business, John Wiley and Sons Inc., 2016, 7th edition, p.240.
Step 2: Determine the sampling frame
A sampling frame is a listing of all the elements in the population from
which the sample is drawn. Examples:
Payroll list for members of an organization
Registration list of students
İTO registration list for companies in a specific sector in İstanbul
Sometimes the sampling frame is incomplete or not current. This creates a
coverage problem. If the frame contains too much inaccuracy, this is an
important problem. Minor inaccuracy is usually acceptable and ignored.
Sometimes there is no sampling frame.
Ex: Online shoppers in Turkey, people who consume coffee every day, etc.
In these cases, non-probability sampling is used.
Sekaran, U. & R. Bougie, Research Methods for Business, John Wiley and Sons Inc., 2016, 7th edition, p.240.
Step 3: Determine the sampling design
Two Major Types of Sampling Design
Probability Sampling: Elements in the population have a known and
nonzero probability of being selected as sample subjects.
Representativeness of the sample is important for generalizability.
Nonprobability Sampling: Elements do not have a known probability of
being selected as subjects.
Time or similar other factors are more important than generalizability.
We rely on personal judgment here. The sample may be a good estimate of
the population but its generalizability is low.
Sekaran, U. & R. Bougie, Research Methods for Business, John Wiley and Sons Inc., 2016, 7th edition, p.240.
Step 4: Determine the sample size
Step 5: Execute the sampling process
Nonresponse error: Some people who have been selected as subjects may not
respond to your survey. Non-response can happen because of:
- not being able to reach the respondent
- refusals: being rejected or not getting a response from the respondent
Keeping surveys short, preparing personalized cover letters, sending notices
before or after sending surveys (follow-up) are some of the ways to reduce non-
response.
Sekaran, U. & R. Bougie, Research Methods for Business, John Wiley and Sons Inc., 2016, 7th edition, pp.241-242.
Sampling Designs
Probability Sampling
Simple Random Sampling:
Every element in the population has a “known” and “equal” chance of being selected
as a subject.
Ex: selecting 100 out of 1000
This method has the least bias and offers the most generalizability. (+)
A full list of the population may not be available. (-)
It might be difficult and costly. (-)
Let’s say there are approximately 15,000 undergraduate students at Boğaziçi
University. There is a full list of these students at the Registrar’s Office. Using a
random selection program, we can select 1,500 students from this total list to conduct
a student satisfaction survey. This is simple random sampling.
Sekaran, U. & R. Bougie, Research Methods for Business, John Wiley and Sons Inc., 2016, 7th edition, pp.242-243.
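A minimal sketch of the simple random sampling example, assuming the Registrar’s list can be represented by student numbers 1 to 15,000 (the list, the seed and the variable names are hypothetical):

```python
import random

# Hypothetical sampling frame: IDs 1..15000 stand in for the Registrar's full list.
population = list(range(1, 15001))

rng = random.Random(42)  # fixed seed only so that the draw can be reproduced

# Draw 1,500 students without replacement; every student has the same chance.
sample = rng.sample(population, k=1500)

print(len(sample), "students selected,", len(set(sample)), "of them distinct")
```

Because the draw is made without replacement, no student can appear twice in the sample.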
Systematic Sampling: Drawing every nth element in the
population starting with a randomly chosen element between 1
and n.
Ex: Sample of 15 students out of a class of 100.
Very similar to simple random sampling
Systematic bias might occur during selection (not very common)
An airline flies from Istanbul to Vienna on Mondays, Wednesdays and Fridays. What if
we decide to choose every 3rd flight to measure satisfaction? Might the results
differ if we start with a Monday flight? A Friday flight?
I will select 1,500 students from a total population of 15,000. I can first sort my
population list according to a specific variable, e.g. GPA from lowest to highest or
student number. Then, starting from a random point among the first 10 students in the
list, I can select every 10th student and come up with a sample of 1,500. This ensures
that students from different GPA levels or different entry years are represented in
the sample. This is systematic sampling.
Sekaran, U. & R. Bougie, Research Methods for Business, John Wiley and Sons Inc., 2016, 7th edition, p.243.
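The systematic selection described above can be sketched as follows, assuming the sorted student list is represented by positions 1 to 15,000 (the list and the seed are hypothetical):

```python
import random

# Hypothetical frame: 15,000 students, assumed to be already sorted (e.g. by GPA).
population = list(range(1, 15001))
sample_size = 1500

# Sampling interval: every 10th student (15,000 / 1,500).
step = len(population) // sample_size

rng = random.Random(7)       # fixed seed only for reproducibility
start = rng.randrange(step)  # random starting index within the first 10 students

# Take the starting student, then every 10th student after that.
sample = population[start::step]

print("start position:", start + 1, "sample size:", len(sample))
```

Whatever the random start, the slice yields exactly 1,500 students spread evenly across the sorted list.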
Stratified Random Sampling: If there are identifiable subgroups of elements
within the population that may be expected to have different parameters on
research variables, the population is first divided into these subgroups. Then, a
random selection of subjects from each group is done.
Ex: measuring the training needs or motivation of employees, the satisfaction level
with an elective course open to all departments, the satisfaction of different
segments with products
Ex: Motivation research is done within a company where middle-level managers seem
unmotivated. With simple random or systematic sampling, this information might
not be captured.
• It makes it possible to identify differences between group parameters
• Stratification ensures homogeneity within strata but heterogeneity between
strata
a) proportionate – when sizes of different groups are close
b) disproportionate – when sizes of different groups are different (too small and too
large groups)
Sekaran, U. & R. Bougie, Research Methods for Business, John Wiley and Sons Inc., 2016, 7th edition, p.244.
Let’s say I am interested in finding out whether there are differences between the
satisfaction levels of students in different faculties at Boğaziçi University. I will first
divide the population into subgroups, which will be faculties in this case. Then I can
select a proportionate number of students from each stratum (10% of each) to
reach a sample size of 1,500. This is stratified sampling.
Sekaran, U. & R. Bougie, Research Methods for Business, John Wiley and Sons Inc., 2016, 7th edition, p.244.
Proportionate vs. Disproportionate Sampling in Stratified Sampling

                                             Number of subjects in the sample
Job Level                  No. of elements   Prop. Samp. (20%)   Disprop. Samp.
Top management                   10                  2                  7
Middle-level management          30                  6                 15
Lower-level management           50                 10                 20
Supervisors                     100                 20                 30
Clerks                          500                100                 60
Secretaries                      20                  4                 10
Total                           710                142                142
Disproportionate sampling decisions are made when sizes are different or when there
is more variability suspected within a particular stratum.
Sekaran, U. & R. Bougie, Research Methods for Business, John Wiley and Sons Inc., 2016, 7th edition, pp.244-245.
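The proportionate column of the table can be reproduced with a short sketch. The stratum sizes come from the table; the employee labels and the seed are hypothetical and only serve to show the random draw within each stratum:

```python
import random

# Stratum sizes taken from the table above.
strata_sizes = {
    "Top management": 10,
    "Middle-level management": 30,
    "Lower-level management": 50,
    "Supervisors": 100,
    "Clerks": 500,
    "Secretaries": 20,
}
population_total = sum(strata_sizes.values())  # 710
sample_total = 142                             # 20% of 710

# Proportionate allocation: each stratum gets the same share of the sample
# as it has of the population.
allocation = {
    job: round(size * sample_total / population_total)
    for job, size in strata_sizes.items()
}

# Random selection inside each stratum (labels are made up for illustration).
rng = random.Random(0)
sample = {}
for job, size in strata_sizes.items():
    employees = [f"{job} #{i}" for i in range(1, size + 1)]
    sample[job] = rng.sample(employees, k=allocation[job])

print(allocation)
```

A disproportionate design would replace the computed `allocation` with numbers chosen by the researcher, such as those in the last column of the table.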
Cluster Sampling: Clusters are groups of elements that are expected to be naturally
similar to (thus representative of) the population. The target population is first
divided into clusters, then a random sample of clusters is drawn.
Contrary to stratified sampling, clusters have “heterogeneity within the group” but
“homogeneity between groups”.
Ex: We want to research the motivations of students to participate in student clubs:
If we select a proportionate number of students from each club, this would be
stratified sampling.
If we randomly select a certain number of clubs first (e.g. 10 clubs), then reach all
the students in those clubs, this is cluster sampling. (one-stage)
If we make a random selection of students from each of the ten selected clubs
instead of reaching all of them, this is two-stage cluster sampling.
Cluster sampling is more convenient and less costly.(+)
Groups may not be mutually exclusive so there might be a duplication of
subjects.(-) (A student might be a member of more than one club, for example)
Sekaran, U. & R. Bougie, Research Methods for Business, John Wiley and Sons Inc., 2016, 7th edition, p.246.
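The difference between one-stage and two-stage cluster sampling in the student club example can be sketched as below. The number of clubs, the club sizes and the member labels are assumptions made for illustration, and overlapping memberships are ignored:

```python
import random

rng = random.Random(1)  # fixed seed only for reproducibility

# Hypothetical frame: 40 clubs with 30 members each.
clubs = {
    f"club_{c}": [f"club_{c}_student_{i}" for i in range(30)]
    for c in range(40)
}

# Stage 1: draw a random sample of clusters (10 clubs).
chosen_clubs = rng.sample(sorted(clubs), k=10)

# One-stage cluster sampling: survey every member of the chosen clubs.
one_stage = [student for club in chosen_clubs for student in clubs[club]]

# Two-stage cluster sampling: draw a random sample of members within each chosen club.
two_stage = [
    student
    for club in chosen_clubs
    for student in rng.sample(clubs[club], k=5)
]

print(len(one_stage), len(two_stage))
```

Only the ten selected clubs need a member list, which is where the convenience and cost advantage comes from.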
Double Sampling: It is used when further information is needed from a
subset of the group from which some information has already been
collected for the same study
Ex: Let’s say a study is conducted on the satisfaction of all academicians
at Boğaziçi University. This study shows that assistant professors have a
very low level of satisfaction compared to academicians with other titles. A
second sample of assistant professors can be selected to conduct a
second study and understand the reasons for this. This is double sampling.
Sekaran, U. & R. Bougie, Research Methods for Business, John Wiley and Sons Inc., 2016, 7th edition, p.247.
Nonprobability Sampling
- Used when there is no sampling frame, thus there is no known
chance of being selected
- Generalizability is a problem, but sometimes collecting the data in a
quick and inexpensive way is more important
Convenience Sampling: the collection of information from members of the
population who are conveniently available to provide it.
Ağaç Ev collects data from those who come to the cafeteria that week about
their satisfaction with the new menu. No population list is used and data is
collected from those who are conveniently available. This is convenience
sampling.
Snowball Sampling: using an initial set of contacts to reach subjects with
similar characteristics, e.g. mothers who have children aged 0-6 years. An
initial circle of subjects with these characteristics can help you reach
others with the same profile.
Sekaran, U. & R. Bougie, Research Methods for Business, John Wiley and Sons Inc., 2016, 7th edition, p.247.
Purposive Sampling: Instead of those readily available, it might sometimes be
necessary to obtain information from specific target groups
a) Judgment sampling: The choice of subjects who are most advantageously
placed or in the best position to provide the information required
Ex: what it takes for women managers to make it to the top (I can use a list of
the most successful female C-level managers listed in Capital magazine.
This is not a definitive list but I use my judgment saying that this can be a
good source to use.)
Ex: I want to understand what motivates gamers to pay for in-app purchases. I
go to a mobile gaming convention and collect data from people who attend
this convention. As a researcher, I have found this convention to be a good
environment to collect this data.
This might be the only viable method to get information from subjects who
cannot be replaced by others.(+)
It might be difficult and costly.(-)
Sekaran, U. & R. Bougie, Research Methods for Business, John Wiley and Sons Inc., 2016, 7th edition, pp.248-249.
b) Quota Sampling: It ensures that certain groups are adequately
represented in the study through the assignment of a quota which is generally
determined according to the number of members in each subgroup
This is the nonprobability or convenience-based form of stratified sampling.
The sample may not be representative of the population, generalizability is low.
I want to conduct a study about factors influencing football fans to support their
teams. My sample size will be 500. I set quotas as 125 FB supporters, 125 GS
supporters, 125 BJK supporters, 25 TS supporters, 25 BS supporters, 75 other
teams. This is not probability-based. I don’t have an exact population list but I
set quotas in a non-probability way.
Sekaran, U. & R. Bougie, Research Methods for Business, John Wiley and Sons Inc., 2016, 7th edition, pp.248-249.
Which Sampling Design?
The human resources director of a company with 820 people on its
payroll has been asked by the vice president to consider formulating an
implementable flextime policy. The director feels that such a policy is not
necessary since everyone seems happy with the 9 to 5 hours and no one
has complained. She wants to prove this to the V.P.
(simple random)
The director of human resources of a manufacturing firm wants to offer
stress management seminars to the personnel who experience high
levels of stress. Although everybody is expected to experience some
level of stress in the company, it is expected that the workmen who
constantly handle dangerous chemicals, the foremen responsible for
production quotas, counselors who listen to the problems of employees
might have higher stress levels.
(stratified – proportionate or disproportionate is determined according to
group sizes)
Source: Sekaran, U. & R. Bougie, Research Methods for Business, John Wiley and Sons Inc., 2014, 6th edition, pp.256-257.
International financial analysts want to analyze the depositing patterns of
bank clients in large cities in Turkey. Deposit holders in a selected set of
large cities will be selected.
(cluster)
Exit surveys are done to see why people resign from the company. From
those who have been selected, those who say they resign because of
discrimination are surveyed again in more depth to find out the types and
impact of discrimination in the company.
(double)
An accounts executive in a company has established a new accounting
system and wants to test initial reactions. (The Registrar’s Office establishes a new
registration system for instructors and wants to test responses: send an e-mail to
instforum.) (convenience)
Quota sampling example: selecting equal numbers of people who use
anti-depressants (Group 1), meditation-yoga (Group 2), herbal
treatments (Group 3), and therapy (Group 4) for mood problems
SAMPLE SIZE
The Importance of the Sample Size
Is choosing the correct sampling design enough?
When my population is ...., what should my sample size be so that I can
make generalizations? See page 268 for approximate figures.
Sample statistics → population parameters
What is the sample size that will give me the statistics that can estimate
population parameters as closely as possible within a narrow margin of
error? The larger the sample size, the higher the estimation power will
be.
Precision and Confidence
Precision: How close is our estimate to the true population characteristic?
Ex: 50 out of 300 employees
Sample mean (x̅) = 50 pieces / day
Population mean (µ) :
50 +/- 10 – less precise, more confident
50 +/- 5 – more precise, less confident
“Standard error” is a measure of the extent of precision offered by the sample
(the range of variability in the sampling distribution of the sample mean)
Sekaran, U. & R. Bougie, Research Methods for Business, John Wiley and Sons Inc., 2016, 7th edition, p.257.
Standard Error
If we take a number of different samples from a population and then take
the mean of each of these, we will usually find that they are all different.
The smaller the dispersion or variability of these means, the greater the
probability that the sample mean will be closer to the population mean.
This variability is called the standard error (the extent of precision offered
by the sample):
S(x̅) = S / √n
S = standard deviation of the sample
n = sample size
In order to reduce standard error, we need to increase the sample size.
Sekaran, U. & R. Bougie, Research Methods for Business, John Wiley and Sons Inc., 2016, 7th edition, p.257.
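A small sketch of how the standard error S / √n shrinks as the sample grows. The standard deviation of 10 and the sample sizes are assumed values chosen for illustration:

```python
import math

def standard_error(s, n):
    """Standard error of the mean: sample standard deviation over the square root of n."""
    return s / math.sqrt(n)

s = 10  # assumed sample standard deviation

# Each step quadruples the sample size, which halves the standard error.
errors = {n: standard_error(s, n) for n in (64, 256, 1024)}

for n, se in errors.items():
    print(f"n = {n:4d}  standard error = {se}")
```

Because n sits under a square root, halving the standard error requires four times as many subjects.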
Precision and Confidence
Confidence: How certain are we that our estimates will
really hold true for the population?
90% - 95% - 99%: 90, 95 or 99 times out of 100, our
estimate will reflect the true population characteristic.
We can be 100% confident only that it will be between 0 and ∞.
Sekaran, U. & R. Bougie, Research Methods for Business, John Wiley and Sons Inc., 2016, 7th edition, pp.258-259.
Tradeoff between precision and confidence
Through sampling we determine sample statistics that are
expected to represent the population mean within an interval.
Problem: We want to estimate the mean dollar value of purchases
made by customers when they shop from department stores.
Sample mean (x̅): $105
n: 64
S: $10
We have a sample mean to estimate the population mean. We will create an interval
estimate around the sample mean such that the population mean will fall within it.
Two things are important here:
• Standard error of the sampling distribution
• Confidence level we require
Sekaran, U. & R. Bougie, Research Methods for Business, John Wiley and Sons Inc., 2016, 7th edition, p.259.
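A sketch of the interval estimate for the department store problem, using the form x̅ ± z · S / √n. The z-values 1.96 (95%) and 2.576 (99%) are the usual normal-table values and are an assumption here, as is the helper name `confidence_interval`:

```python
import math

def confidence_interval(mean, s, n, z):
    """Return (lower, upper) limits of mean ± z * s / sqrt(n)."""
    standard_error = s / math.sqrt(n)
    margin = z * standard_error
    return mean - margin, mean + margin

# Department store example: sample mean $105, S = $10, n = 64,
# so the standard error is 10 / 8 = 1.25.
ci_95 = confidence_interval(105, 10, 64, z=1.96)   # assumed z for 95%
ci_99 = confidence_interval(105, 10, 64, z=2.576)  # assumed z for 99%

print("95%:", [round(v, 2) for v in ci_95])
print("99%:", [round(v, 2) for v in ci_99])
```

Under these assumptions the 95% interval is roughly $102.55 to $107.45 and the 99% interval roughly $101.78 to $108.22: more confidence costs a wider, less precise interval.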
Alternative Calculations of the Interval
Source: Sekaran, U. & R. Bougie, Research Methods for Business, John Wiley and Sons Inc., 2014, 6th edition, p.263.
Dr.Kerry has many patients suffering from high levels of blood
cholesterol. He wants to determine the average reduction in the
cholesterol levels of these patients after following his medical
plan and advice. The mean reduction level of a sample of
selected 81 patients is found to be 68 with a standard deviation of
18. By using this example, show the trade-off between precision
and confidence for 95% and 99% levels of confidence.
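One possible way to work through this exercise, again assuming the usual normal-table z-values of 1.96 and 2.576 (they are not given on the slide):

```python
import math

mean_reduction = 68  # sample mean reduction in cholesterol
s = 18               # sample standard deviation
n = 81               # number of patients in the sample

standard_error = s / math.sqrt(n)  # 18 / 9 = 2.0

# Assumed two-sided z-values for each confidence level.
z_values = {"95%": 1.96, "99%": 2.576}

intervals = {}
for confidence, z in z_values.items():
    margin = z * standard_error
    intervals[confidence] = (mean_reduction - margin, mean_reduction + margin)
    low, high = intervals[confidence]
    print(f"{confidence}: {low:.2f} to {high:.2f} (width {high - low:.2f})")
```

With these inputs the 95% interval is about 64.08 to 71.92, while the 99% interval widens to about 62.85 to 73.15, which is the trade-off the exercise asks for.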
Determining the Sample Size
A manager wants to be 95% confident that the expected monthly
withdrawals in a bank will be within a confidence interval of ±$500. A
study of sample clients indicates that average withdrawals have a
standard deviation of $3,500. What is the sample size needed?
What do we know? Interval estimate, z and S.
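A worked sketch of the calculation, assuming the $500 is the half-width e of the interval, z = 1.96 for 95% confidence, and a population large enough that no finite-population correction is needed:

```latex
\begin{aligned}
e &= z \cdot \frac{S}{\sqrt{n}}
  \quad\Longrightarrow\quad
  n = \left(\frac{z\,S}{e}\right)^{2} \\[4pt]
n &= \left(\frac{1.96 \times 3500}{500}\right)^{2} = (13.72)^{2} \approx 188
\end{aligned}
```

Under these assumptions, about 188 clients are needed at the 95% confidence level.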
If the confidence level increases to 99%, the required sample size increases to
325. If, at that 99% level, the interval estimate also narrows to $300, the sample
size needs to be about 902.
Thus, we can see how difficult it is to achieve precision and confidence
at the same time.
Sekaran, U. & R. Bougie, Research Methods for Business, John Wiley and Sons Inc., 2016, 7th edition, p.262.
A solar energy company wants to find out how much people spend on solar energy
devices on average, with a $100 interval of estimation (confidence interval) and
95% confidence (confidence level). From previous studies, the standard deviation
can be taken as $1,500.
A company wants to determine the average spending of loyal customers on a
monthly basis with 90% confidence and an interval estimate of $500. From
previous studies, standard deviation is estimated to be $5,000.
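A sketch for checking answers to these two exercises with n = (z · S / e)². It assumes that the stated interval is the half-width e (±$100 and ±$500) and uses the usual normal-table values z = 1.96 for 95% and z = 1.645 for 90%; the function name is hypothetical:

```python
import math

def required_sample_size(z, s, e):
    """Sample size from n = (z * s / e) ** 2, before rounding."""
    return (z * s / e) ** 2

# Solar energy company: 95% confidence (assumed z = 1.96), S = $1,500, e = $100.
n_solar = required_sample_size(z=1.96, s=1500, e=100)

# Loyal customers: 90% confidence (assumed z = 1.645), S = $5,000, e = $500.
n_loyal = required_sample_size(z=1.645, s=5000, e=500)

for label, n in [("solar energy", n_solar), ("loyal customers", n_loyal)]:
    # Rounding up is the conservative choice, since a fraction of a subject cannot be sampled.
    print(f"{label}: n = {n:.2f}, rounded up to {math.ceil(n)}")
```

If the stated interval were instead read as the total width, e would be halved and the required sample size would be four times larger.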