Unit 4
Accessible
Goal-Oriented
Practical
Basis of Comparison | Population | Sample
Meaning | Population refers to the collection of all elements possessing common characteristics, that comprises the universe | Sample means a subgroup of the members of the population chosen for participation in the study
Includes | Each and every unit of the group | Only a handful of units of the population
Data Collection | Complete enumeration or census | Sample survey or sampling
Focus on | Identifying the characteristics | Making inferences about the population
▪ A sampling unit is the element or set of elements that is available for selection at
some stage of the sampling process
▪ A sampling unit is one of the units selected for the purpose of sampling, each unit
being regarded as individual and indivisible when the selection is made
▪ A unit is a person, animal, plant or thing which is actually studied by a
researcher, the basic objects upon which the study or experiment is executed
▪ A unit is a single member of the sample
▪ Individuals who are to be contacted are the sampling units. If retailers are to be
contacted in a locality, they are the sampling units
▪ Sampling unit may be a geographical one, such as state, district, village, etc. or a
construction unit such as house, flat etc. or it may be a social unit such as family,
club, school etc. or it may be an individual. The researcher will have to decide one
or more of such units that he has to select for his study
▪ The elementary units or the group or cluster of such units may form the basis of sampling process
in which case they are called sampling units. A list containing all such sampling units is known as
sampling frame. Thus, sampling frame consists of a list of items from which the sample is to be
drawn
▪ If the population is finite and the time frame is in the present or past, then it is possible for the
frame to be identical with the population. In most cases they are not identical because it is often
impossible to draw a sample directly from population.
▪ This frame is either constructed by a researcher for the purpose of his study or may consist of
some existing list of the population. For instance, one can use telephone directory as a frame for
conducting opinion survey in a city. Whatever the frame may be, it should be a good
representative of the population.
▪ The list of all the sampling units with a proper identification (which represents the
population to be covered) is called the sampling frame. The frame may also consist of the
units of an area map.
▪ The frame should be accurate, free from omission and duplication (overlapping),
adequate and up to date; the units must cover the whole of the population and
should be well identified
▪ In improving sampling design, supplementary information for the field covered by
the sampling frame may also be valuable
▪ A sampling frame is a list of all the items in your population. It is a complete list of
everyone or everything you want to study. The difference between a population
and a sampling frame is that the population is general, and the frame is specific
▪ All sampling units can be found, i.e. contact information, map location or other relevant
information about each sampling unit is present
▪ All sampling units have a logical and numerical identifier
▪ The frame is organised in a logical and systematic manner
▪ The sampling frame has some additional information about the units that allows the use of
more advanced sampling frames
▪ Every element of the population of interest is present in the frame
▪ Every element of the population is present only once in the frame
▪ No elements from outside the population of interest are present in the frame
▪ The data is up-to-date
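The properties above (no omissions, no outside elements, no duplicates) can be checked mechanically when both lists are available. The following is a minimal sketch; the function name `audit_frame` and the household identifiers are hypothetical and used only for illustration.

```python
from collections import Counter


def audit_frame(frame_ids, population_ids):
    """Compare a sampling frame with the population of interest.

    Both arguments are hypothetical lists of unit identifiers.
    """
    counts = Counter(frame_ids)
    population = set(population_ids)
    return {
        # population units that the frame omits
        "missing": sorted(population - set(counts)),
        # frame units that lie outside the population of interest
        "foreign": sorted(set(counts) - population),
        # units that are listed more than once
        "duplicated": sorted(u for u, c in counts.items() if c > 1),
    }


population = ["H01", "H02", "H03", "H04"]   # made-up household IDs
frame = ["H01", "H02", "H02", "H05"]        # made-up frame with defects
report = audit_frame(frame, population)
print(report)
```

A frame that passes this check has three empty lists; whether it is also up to date has to be verified separately.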
▪ Inaccurate: some sampling units of the population are listed inaccurately or some units
which do not actually exist are included in the list
▪ Inadequate: when it does not include all classes of the population which are to be
taken for survey
Sampling Error
Non-Sampling Error
▪ Sampling error comprises the differences between the sample and the population
that are due solely to the particular units that happen to have been selected
▪ E.g., suppose that a sample of 100 females from Haryana is taken and all are found
to be taller than six feet
▪ It is very clear, even without any statistical proof, that this would be a highly
unrepresentative sample leading to invalid conclusions
▪ This is a very unlikely occurrence because naturally such rare cases are widely
distributed among the population. But it can occur
▪ Sampling bias is a tendency to favour the selection of units that have particular
characteristics
▪ Sampling error may be committed due to the chance factor
▪ Unusual units in a population do exist and there is always a possibility that an
abnormally large number of them will be chosen. Sampling error may also be
committed due to sampling bias
▪ Sampling bias is usually the result of a poor sampling plan
▪ The most notable is the bias of non-response when for some reason some units
have no chance of appearing in the sample
▪ As an example, we would like to know the average income of some community and
we decide to use the telephone numbers to select a sample of the total population
in a locality
▪ Households without a telephone, which tend to have lower incomes, have no chance of
being selected, so we will end up with an overestimated average income, which will
lead to the wrong policy decisions
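The telephone example can be illustrated with a few numbers. This is a sketch only: the incomes and telephone ownership flags below are invented for illustration and are not data from any survey.

```python
from statistics import mean

# Hypothetical community: (monthly income, has_telephone)
households = [
    (12000, False), (15000, False), (18000, False), (20000, True),
    (26000, True), (40000, True), (55000, True), (90000, True),
]

# Average over the whole community (what we actually want to know)
true_mean = mean(income for income, _ in households)

# Average over the telephone frame only (what the biased sample can reach)
phone_mean = mean(income for income, has_phone in households if has_phone)

print(true_mean, phone_mean)
```

With these made-up figures the telephone-only average is higher than the community average, which is the direction of bias the example describes.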
▪ Non-sampling errors occur whether a census or a sample is being used. A non-
sampling error is an error that results solely from the manner in which the
observations are made
▪ The simplest example of non-sampling error is inaccurate measurements due to
malfunctioning instruments or poor procedures
▪ For example, if persons are asked to state their own weights themselves, no two
answers will be of equal reliability
▪ An individual’s weight fluctuates during the day and so the time of weighing will
also affect the answer
MINIMIZING SAMPLING ERROR
Of the two types of errors, sampling error is easier to identify. The main techniques for
reducing sampling error are:
▪ Increase the sample size: A larger sample size leads to a more precise result because
the study gets closer to the actual population size
▪ Divide the population into groups: Instead of a random sample, test groups according to
their size in the population. E.g., if people of a certain demographic make up 35% of the
population, make sure 35% of the study is made up of this group
▪ Know your population: The error of population specification is when a research team
selects an inappropriate population from which to obtain data. Know who buys your
product, uses it, works with you, and so forth. With basic socio-economic information, it
is possible to reach a consistent sample of the population. In cases like marketing
research, studies often relate to one specific population like Facebook users, baby
boomers or even homeowners
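The first two techniques can be sketched numerically. The block below assumes the textbook standard error of a sample mean under simple random sampling (standard deviation divided by the square root of n) and proportional allocation of the sample to groups; the group names, shares and the standard deviation of 10 are hypothetical.

```python
import math


def standard_error(std_dev, n):
    """Standard error of a sample mean: it shrinks as the sample size n grows."""
    return std_dev / math.sqrt(n)


def proportional_allocation(population_shares, sample_size):
    """Give each group the same share of the sample as it has in the population.

    Rounding means the counts may not always add up exactly to sample_size.
    """
    return {group: round(share * sample_size)
            for group, share in population_shares.items()}


# Quadrupling the sample size halves the standard error
print(standard_error(10, 100), standard_error(10, 400))

# A group that is 35% of the population gets 35% of a sample of 200
shares = {"group_a": 0.35, "group_b": 0.65}
allocation = proportional_allocation(shares, 200)
print(allocation)
```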
MINIMIZING NON-SAMPLING ERROR
Non-sampling error is harder to pin down because the types of marketing studies conducted
vary widely. The following are general techniques used to minimize non-sampling error, but
remember, an in-person study has different factors than a survey or questionnaire
▪ Randomize selection to eliminate bias: Select participants based on a random factor,
like choosing every fourth person on a list
▪ Train your team: If the study is conducted by a researcher, either use the same
researcher or be sure to train your team on the procedure. Training and experience are
essential
▪ Perform an external record check: Human error occurs in entering data. Have an
external source check your records and confirm their consistency with written results.
Entering the number 20 as the number 200 is an easy mistake that could throw off your
research dramatically
▪ A failure to obtain information from a number of subjects included in the sample (non-response) may
lead to non-response error. Non-response error exists to the extent that those who did respond to
your survey differ from those who did not on one or more of the characteristics of interest in your
study
▪ Two important sources of non-response are not-at-homes and refusals. An effective way to reduce the
incidence of not-at-homes is to call back at another time, preferably at a different time of day
▪ Personalized cover letters, a small incentive for participating in the study, and an advance notice that
the survey is taking place may also help you to increase the response rate
▪ The rate of refusal depends, among other things, on the length of the survey, the data collection
method, and the patronage of the research
▪ Hence, a decrease in survey length, a change in the data collection method (personal interviews
instead of mail questionnaires), and the auspices of the research often improve the overall return rate
▪ Nonetheless, it is almost impossible to entirely avoid non-response in surveys. In these cases, you
may have to turn to methods to deal with non-response error, such as generalizing the results to the
respondents only or statistical adjustment (weighting the data by observable variables)
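One simple form of the statistical adjustment mentioned above is to weight each respondent group by the ratio of its population share to its share among respondents. The sketch below assumes the population shares are known; the group names, counts and group means are hypothetical.

```python
def adjustment_weights(population_shares, respondent_counts):
    """Weight = population share of a group / its share among respondents."""
    total = sum(respondent_counts.values())
    return {group: population_shares[group] / (count / total)
            for group, count in respondent_counts.items()}


def weighted_mean(group_means, respondent_counts, weights):
    """Mean of the respondents after applying the group weights."""
    numerator = sum(group_means[g] * respondent_counts[g] * weights[g]
                    for g in group_means)
    denominator = sum(respondent_counts[g] * weights[g] for g in group_means)
    return numerator / denominator


# Hypothetical survey: the young are half the population but only 20% of respondents
population_shares = {"young": 0.5, "old": 0.5}
respondent_counts = {"young": 20, "old": 80}
group_means = {"young": 10.0, "old": 20.0}

weights = adjustment_weights(population_shares, respondent_counts)
unweighted = sum(group_means[g] * respondent_counts[g] for g in group_means) / 100
adjusted = weighted_mean(group_means, respondent_counts, weights)
print(weights, unweighted, adjusted)
```

The adjustment only corrects for differences that the weighting variable captures; respondents and non-respondents within a group may still differ.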
Step - I • Define the Target Population
Step - II • Specify the Sampling Frame
Step - III • Specify the Sampling Unit
Step - IV • Selection of the Sampling Method
Step - V • Determination of Sample Size
Step - VI • Specifying the Sampling Plan
Step - VII • Selecting the Sample
▪ Defining the population of interest, for business research, is the first step in sampling
process.
▪ In general, target population is defined in terms of element, sampling unit, extent, and
time frame.
▪ The definition should be in line with the objectives of the research study.
▪ E.g. if a kitchen appliances firm wants to conduct a survey to ascertain the demand for its
micro-ovens, it may define the population as ‘all women above the age of 20 who cook
(assuming that very few men cook)’
▪ However, this definition is too broad and will include every household in the country, in
the population that is to be covered by the survey
▪ Therefore, the definition can be further refined at the sampling unit level as
all women above the age of 20 who cook and whose monthly household income
exceeds Rs. 50,000. This reduces the target population size and makes the research more
focused.
▪ Once the definition of the population is clear a researcher should decide on the
sampling frame. A sampling frame is the list of elements from which the sample
may be drawn
▪ Continuing with the micro-oven example, an ideal sampling frame would be a
database that contains all the households that have a monthly income above Rs.
50,000. However, in practice it is difficult to get an exhaustive sampling frame that
exactly fits the requirements of a particular research
▪ In general, researchers use easily available sampling frames like telephone
directories and list of credit card and mobile phone users. Various private players
provide databases developed along various geographic and economic variables.
▪ A sampling unit is a basic unit that contains a single element or a group of
elements of the population to be sampled. In this case, a household becomes a
sampling unit and all women above the age of 20 years living in that particular
house become the sampling elements
▪ If it is possible to identify the exact target audience of the business research, every
individual element would be a sampling unit. This would present a case of primary
sampling unit
▪ However, a convenient and better means of sampling would be to select
households as the sampling unit and interview all females above 20 years, who
cook. This would present a case of secondary sampling unit.
▪ The sampling method outlines the way in which the sample units are to be selected
▪ The choice of the sampling method is influenced by the objectives of the business
research, availability of financial resources, time constraints, and the nature of the
problem to be investigated
▪ All sampling methods can be grouped under two distinct heads, that is, probability
and non-probability sampling
▪ The sample size plays a crucial role in sampling process. There are various ways of
classifying the techniques used in determining the sample size
▪ In this step, the specification and decisions regarding the implementation of the
research process are outlined. Suppose, blocks in a city are the sampling units and
the households are the sampling elements. This step outlines the modus operandi
of the sampling plan in identifying houses based on specified characteristics.
▪ It includes issues such as how the interviewer is going to take a systematic sample of
the houses, what the interviewer should do when a house is vacant, and what the
recontact procedure is for respondents who were unavailable. All these and many
other questions need to be answered for the smooth functioning of the research
process. These are guidelines that would help the researcher in every step of the
process.
▪ This is the final step in the sampling process, where the actual
selection of the sample elements is carried out.
▪ At this stage, it is necessary that the interviewers stick to the rules
outlined for the smooth implementation of the business research.
▪ A sample design is a definite plan for obtaining a sample from a
given population. It refers to the technique or the procedure the
researcher would adopt in selecting items for the sample
▪ Sample design may as well lay down the number of items to be
included in the sample i.e. the size of the sample. Sample design is
determined before data are collected. There are many sample
designs from which a researcher can choose
▪ Some designs are relatively more precise and easier to apply than
others. Researcher must select/prepare a sample design which
should be reliable and appropriate for his research study
Probability Sampling: Based on the concept of random selection. The researcher sets a few
selection criteria and chooses members of a population randomly. All members have an equal
opportunity to be a part of the sample with this selection parameter.
Non-Probability Sampling: The researcher chooses members for research on the basis of
judgment or convenience rather than by random selection. This method is not a fixed or
predefined selection process, so not all elements of a population have an equal
opportunity to be included in a sample.
Sampling Methods
▪ Quota Sampling
▪ Area Sampling
▪ Snowball Sampling
▪ Cluster Sampling
▪ Here all members have the same chance (probability) of being selected
▪ Random methods provide an unbiased cross section of the population
▪ It is an equal probability sampling method
▪ The selection of one item for inclusion in the sample should in no way influence the
selection of another item
▪ It should be used with a homogeneous population, that is, a population consisting
of items that possess the same attributes that the research is interested in
▪ The characteristics of homogeneity may include gender, age, income, social,
religious, political, geographical region etc.
▪ The best way to choose a simple random sample is to use a random number table
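In practice a pseudo-random number generator plays the role of the lottery or the random number table. A minimal sketch using Python's standard `random` module follows; the population of 100 numbered units and the seed value are arbitrary choices made for illustration.

```python
import random


def simple_random_sample(units, n, seed=None):
    """Draw n units without replacement; every unit has the same chance."""
    rng = random.Random(seed)      # a seed makes the draw repeatable
    return rng.sample(units, n)


population = list(range(1, 101))   # units numbered 1 to 100 (hypothetical)
sample = simple_random_sample(population, 10, seed=42)
print(sample)
```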
Lottery Method
Table of Random Numbers
ADVANTAGES
▪ Requires less knowledge about the characteristics of the population
▪ The sample is selected at random, giving each member of the population an equal
chance of being selected
▪ The sample can be called an unbiased sample; bias due to human preferences and
influences is eliminated
▪ Assessment of the accuracy of the results is possible by sample error estimation
▪ It is a simple and practical sampling method, provided the population size is not
large
DISADVANTAGES
▪ If the population size is large, a great deal of time must be spent listing and numbering
the members of the population
▪ A simple random sample will not adequately represent many population characteristics
unless the sample is very large. That is, if the researcher is interested in choosing a
sample on the basis of the distribution in the population of gender, age and social status, a
simple random sample needs to be very large to ensure all these distributions are
representative of the population
▪ To obtain a representative sample across multiple population attributes we should use
stratified random sampling
▪ It is a method of probability sampling in which the defined target population is ordered,
and the 1st unit of sample is selected at random, and the rest of the sample is selected
according to position using a skip interval
▪ Fixed interval method
▪ This method of sampling is an alternative to random selection. It consists of selecting every
Kth item in the population after a random start with an item from 1 to K
▪ As the interval between sample units is fixed, this method is also known as fixed
interval method
▪ Arrange all units in the population in an order by giving serial numbers from 1 to N
▪ Determine the sampling interval by dividing the population size by the sample size, i.e. K =
N/n
▪ Select the first sample unit at random from the first sampling interval (1 to K)
▪ Select the subsequent sample units at equal regular intervals
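The four steps above can be sketched as follows. This is an illustrative implementation, not a prescribed one: it assumes the interval K is taken as the whole-number part of N/n, and the function name and the population of 100 units are hypothetical.

```python
import random


def systematic_sample(units, n, seed=None):
    """Select n units at a fixed interval K after a random start in 1..K."""
    N = len(units)
    K = N // n                      # sampling interval (integer part of N/n)
    if K < 1:
        raise ValueError("sample size cannot exceed population size")
    rng = random.Random(seed)
    start = rng.randint(1, K)       # random start within the first interval
    positions = [start + i * K for i in range(n)]   # serial numbers, 1-based
    return [units[p - 1] for p in positions]


population = list(range(1, 101))    # units with serial numbers 1 to N = 100
chosen = systematic_sample(population, 10, seed=1)
print(chosen)
```

Only the first unit is random; every later unit is fixed by the interval, which is why the ordering of the list matters.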
ADVANTAGES
▪ Since the samples are drawn from each of the strata of the population, stratified
sampling is more representative, and thus more accurately reflects the characteristics of
the population from which they are chosen
▪ It is more precise and to a great extent avoids bias
▪ Since the sample size can be smaller in this method, it saves a lot of time, money
and other resources for data collection
DISADVANTAGES
▪ Requires detailed knowledge of the distribution of attributes or characteristics of
interest in the population to determine the homogeneous groups that lie within it
▪ If we cannot accurately identify the homogeneous groups, it is better to use a simple
random sample, since improper stratification can lead to serious errors
▪ Preparing a stratified list is a difficult task as the list may not be readily
available
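A stratified sample can be drawn by grouping the units into strata and taking a simple random sample inside each stratum. The sketch below assumes proportional allocation (the same sampling fraction in every stratum); the urban/rural strata and the 10% fraction are hypothetical.

```python
import random
from collections import defaultdict


def stratified_sample(units, stratum_of, fraction, seed=None):
    """Draw the same fraction from every stratum by simple random sampling."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for unit in units:
        strata[stratum_of(unit)].append(unit)   # group units by stratum
    sample = []
    for name in sorted(strata):
        members = strata[name]
        n_h = max(1, round(fraction * len(members)))  # at least one per stratum
        sample.extend(rng.sample(members, n_h))
    return sample


# Hypothetical population: 60 urban and 40 rural units
units = [("urban", i) for i in range(60)] + [("rural", i) for i in range(40)]
drawn = stratified_sample(units, stratum_of=lambda u: u[0], fraction=0.1, seed=7)
print(drawn)
```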
▪ Area Sampling is a method of sampling used when no complete frame of
reference is available. The total area under investigation is divided into
small sub-areas which are sampled at random or according to a restricted
process (stratification of sampling). Each of the chosen sub-areas is then
fully inspected and enumerated, and may form the basis for further
sampling if desired
▪ It involves sampling from a map, an aerial photograph or a similar area
frame. It is often the sampling method of choice when a sampling frame is
not available
▪ E.g., a city map can be divided into equal size blocks, from which random
samples can be drawn. Although area sampling is most often associated
with maps, sometimes the samples might be drawn from lists.
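The city map example can be sketched by treating the map as a grid of equal-size blocks and drawing blocks at random. The grid dimensions, the number of blocks drawn and the function name are assumptions made for illustration.

```python
import random


def sample_blocks(n_rows, n_cols, n_blocks, seed=None):
    """Divide an area into a grid of equal blocks and pick blocks at random."""
    rng = random.Random(seed)
    blocks = [(row, col) for row in range(n_rows) for col in range(n_cols)]
    return rng.sample(blocks, n_blocks)   # blocks to be fully enumerated


# Hypothetical city map cut into a 5 x 8 grid; 6 blocks are drawn
selected_blocks = sample_blocks(5, 8, 6, seed=3)
print(selected_blocks)
```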
▪ In cluster sampling, we divide the population into groups having
heterogeneous characteristics called clusters and then select a sample of
clusters using simple random sampling
▪ It is assumed that each of the clusters is representative of the population as
a whole. This sampling is widely used for geographical studies of many
issues
▪ The principles that are basic to the cluster sampling are as follows:
▪ The differences or variability within a cluster should be as large as
possible. As far as possible the variability within each cluster should be
the same as that of the population.
▪ The variability between clusters should be as small as possible.
Once the clusters are selected, all the units in the selected clusters are
covered for obtaining data.
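One-stage cluster sampling as described above (select clusters at random, then cover every unit in them) can be sketched as follows. The village names and household numbers are hypothetical.

```python
import random


def cluster_sample(clusters, n_clusters, seed=None):
    """Pick whole clusters at random, then take every unit in the chosen clusters."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)   # simple random sample of clusters
    units = [unit for name in chosen for unit in clusters[name]]
    return chosen, units


# Hypothetical clusters: villages and the households in each
villages = {
    "village_a": [1, 2, 3],
    "village_b": [4, 5],
    "village_c": [6, 7, 8, 9],
}
chosen_villages, households = cluster_sample(villages, 2, seed=5)
print(chosen_villages, households)
```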
ADVANTAGES
▪ Provides significant gains in data collection costs, since travelling costs are
smaller
▪ Since the researcher need not cover all the clusters and only a sample of clusters is
covered, it becomes a more practical method which facilitates fieldwork
DISADVANTAGES
▪ Less precise than sampling of units from the whole population since the latter is
expected to provide a better cross-section of the population than the former, due to the
usual tendency of units in a cluster to be homogeneous
▪ Sampling efficiency of cluster sampling is likely to decrease with the decrease in
cluster size or increase in number of clusters
This equation is for an unknown population size or a very large population size.
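A formula commonly used for an unknown or very large population is Cochran's sample size formula, n0 = z² p (1 − p) / e². It is an assumption that this is the equation referred to here. In the sketch below, z is the z-value for the chosen confidence level, p the estimated proportion and e the margin of error; the values 1.96, 0.5 and 0.05 are conventional illustrative choices.

```python
import math


def cochran_sample_size(z, p, e):
    """Cochran's formula: n0 = z^2 * p * (1 - p) / e^2, rounded up."""
    return math.ceil((z ** 2) * p * (1 - p) / (e ** 2))


# 95% confidence (z = 1.96), maximum variability (p = 0.5), 5% margin of error
n0 = cochran_sample_size(z=1.96, p=0.5, e=0.05)
print(n0)
```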