
Chapter 3 Sampling Techniques and Data Collection

3.1 Introduction to Sampling

A sample is a part of the population selected for observation and analysis (Kothari, 2004).
Sampling is the process of selecting a subset of individuals from a population to estimate its
characteristics. Often, research concerns a large population of which, for practical reasons, only
some members can be included in the investigation. You then have to draw a sample from the total
population.
In such cases, you must consider the following questions:
▪ What is the study population you are interested in, from which you want to draw a sample?
▪ How many subjects do you need in your sample?
▪ How will these subjects be selected?
Population, also known as the target population, refers to the entire group or set of individuals,
objects, or events that possess specific characteristics and are of interest to the researcher. It
represents the larger group from which a sample is drawn. The research population is defined
based on the research objectives and the specific parameters or attributes under investigation. The
study population has to be clearly defined; otherwise, you cannot do the sampling. Apart from
persons, a study population may consist of villages, institutions, plants, animals, records, etc. Each
study population consists of study units.
3.2 Why Do We Sample?
➢ Economy: Saves time and money.
➢ Feasibility: Entire populations are often too large to study.
➢ Accuracy: If well-designed, sampling can yield highly reliable results.
➢ Speed: Data can be collected and analyzed faster.

The key reason for being concerned with sampling is validity: the extent to which the
interpretations of the results of the study follow from the study itself, and the extent to which the
results may be generalized to other people or situations.
Sampling is essential for external validity, which refers to the extent to which the results of a study
can be applied to people or situations beyond those directly studied. Sampling is also important
for internal validity, which looks at whether the study's outcomes come from the variables we
changed or measured, rather than from other factors that were not carefully controlled.

To ensure their findings are applicable to the entire study population, researchers must create a
representative sample. This means the sample should include all the key characteristics of the
overall population being studied.

Table 3.1 Population vs. Sample

| Concept        | Definition                                                                                     |
|----------------|------------------------------------------------------------------------------------------------|
| Population     | The entire group of individuals/events of interest (e.g., all malaria patients in Ghana).      |
| Sample         | A subset of the population (e.g., 500 malaria patients from Accra hospitals).                  |
| Sampling Frame | A list of all units in the population from which the sample is drawn (e.g., hospital records). |

Example
Research Study/Topic: Social Media Usage and University Students’ Academic Performance and
Lifestyle Choices.

Population: All University students in a particular city.

Sampling Frame: The sampling frame would involve obtaining a comprehensive list of all
Universities in the specific city. A selection of schools would be made from this list to ensure
representation from different areas and demographics of the city.

Sample: 500 randomly selected University students from different schools in the city.

The sample represents a subset of the entire population of University students in the city.

3.3 Types of Sampling Methods

(A) Probability Sampling (Random Selection)

Probability sampling is a method in which every population element has a known, non-zero chance
of being selected. To generalize findings validly from a sample to some defined population, the
sample must have been drawn from that population according to one of several probability
sampling plans; in a probability sample, the probability of any element's inclusion must be known
a priori. All probability samples involve the idea of random sampling at some stage. Probability
sampling requires that a listing of all study units exists or can be compiled. This listing is called
the sampling frame.

3.3.1 Probabilistic Sampling Methods

Simple random sampling


The guiding principle behind this technique is that each element must have an equal and nonzero
chance of being selected. The product of this technique is a sample determined entirely by chance.
It should be noted, however, that chance is “lumpy”, meaning that random selection does not
always produce a sample that is representative of the population.
Imagine, for example, a sampling frame comprising 10,000 people in which altitude is a critical
variable, and that the composition of the sampling frame is as follows: 1,500 are from high altitude,
7,500 are from medium altitude, and 1,000 are from low altitude. You are going to select a sample
of 500 people from this sampling frame using a simple random sampling technique. Unfortunately,
the simple random selection process may or may not yield a sample that has equivalent altitudinal
proportions as the sampling frame due to chance; disproportionate numbers of each altitudinal
category may be selected.
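To make the "lumpy chance" point concrete, here is a minimal Python sketch (standard library only); the altitude frame mirrors the hypothetical 10,000-person example above, and the function and variable names are illustrative:

```python
import random

# Hypothetical sampling frame matching the altitude example:
# 1,500 high-, 7,500 medium-, and 1,000 low-altitude study units.
frame = (["high"] * 1500) + (["medium"] * 7500) + (["low"] * 1000)

def simple_random_sample(frame, n, seed=None):
    """Draw n units from the frame, each with an equal, nonzero chance."""
    rng = random.Random(seed)
    return rng.sample(frame, n)

sample = simple_random_sample(frame, 500, seed=42)

# Because selection is left entirely to chance, the altitude proportions in
# the sample may drift from the frame's 15% / 75% / 10% split.
proportions = {level: sample.count(level) / len(sample)
               for level in ("high", "medium", "low")}
```

Running this repeatedly with different seeds shows the sample's altitude mix wandering around the frame's 15% / 75% / 10% split, which is exactly why a researcher concerned about altitude might prefer stratified sampling.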

Systematic sampling
The systematic random sampling technique begins by selecting one element at random in the
sampling frame as the starting point; from this point onward, the rest of the sample is selected
systematically by applying a predetermined interval. After the initial element is selected at random,
every "kth" element is selected and becomes eligible for inclusion in the study, where k (the
sampling interval) is the ratio of population size to sample size. Selection continues through the
end of the sampling frame and then from the beginning until a complete cycle is made back to the
starting point (that is, the place where the initial random selection was made). If there is cyclic
repetition in the sampling frame, systematic sampling is not recommended.
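The procedure just described (random start, then every kth element, wrapping around the frame) can be sketched as follows; the frame of 100 numbered units is hypothetical:

```python
import random

def systematic_sample(frame, n, seed=None):
    """Select every kth unit after a random start, wrapping around the frame."""
    k = len(frame) // n          # sampling interval: population / sample size
    start = random.Random(seed).randrange(len(frame))
    # Walk the frame in steps of k, wrapping past the end back to the start.
    return [frame[(start + i * k) % len(frame)] for i in range(n)]

units = list(range(100))                        # hypothetical frame of 100 units
chosen = systematic_sample(units, 10, seed=1)   # k = 10, so every 10th unit
```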

Stratified sampling
Stratified random sampling begins with the identification of some variable, which may be related
indirectly to the research question and could act as a confounder (such as geography, age, income,
ethnicity, or gender). This variable is then used to divide the sampling frame into mutually
exclusive strata or subgroups. Once the sampling frame is arranged by strata, the sample is selected
from each stratum using simple random sampling or systematic sampling techniques. It is
important that the sample selected within each stratum reflect the population proportions; to
achieve this, you can employ proportionate stratified sampling.
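A proportionate stratified draw can be sketched like this; the `stratum_of` helper and the altitude frame are illustrative assumptions, not a standard API:

```python
import random
from collections import defaultdict

def proportionate_stratified_sample(frame, stratum_of, n, seed=None):
    """Sample from each stratum in proportion to its share of the frame."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for unit in frame:
        strata[stratum_of(unit)].append(unit)
    sample = []
    for members in strata.values():
        quota = round(n * len(members) / len(frame))  # proportional allocation
        sample.extend(rng.sample(members, quota))
    return sample

# Hypothetical frame: (id, altitude) pairs with a 15% / 75% / 10% split.
frame = [(i, "high") for i in range(1500)] \
      + [(i, "medium") for i in range(7500)] \
      + [(i, "low") for i in range(1000)]
sample = proportionate_stratified_sample(frame, lambda u: u[1], 500, seed=7)
# The sample contains exactly 75 high-, 375 medium-, and 50 low-altitude units.
```

Unlike the simple random sketch earlier, the altitude proportions here are fixed by design rather than left to chance.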

Cluster sampling
It may be difficult or impossible to take a simple random sample of the units of the study population
because a complete sampling frame does not exist. Logistical difficulties may also
discourage random sampling techniques (e.g., interviewing people who are scattered over a large
area may be too time-consuming). However, when a list of groupings of study units is available

(e.g., villages or schools) or can be easily compiled, a number of these groupings can be randomly
selected. Then all study units in the selected clusters will be included in the study.
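A minimal sketch of cluster sampling, assuming a hypothetical frame of villages (groupings) mapped to their resident IDs:

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """Randomly select whole clusters, then include every unit in each one."""
    rng = random.Random(seed)
    chosen = rng.sample(list(clusters), n_clusters)
    return [unit for name in chosen for unit in clusters[name]]

# Hypothetical frame of groupings: village name -> list of resident IDs.
villages = {f"village_{v}": [f"v{v}_resident_{i}" for i in range(20)]
            for v in range(30)}
sample = cluster_sample(villages, 5, seed=3)  # 5 villages x 20 residents each
```

Note that only the list of villages is needed up front, not a list of every individual.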

Multistage sampling
Multistage cluster sampling is used when an appropriate sampling frame does not exist or cannot
be obtained. Multistage cluster sampling uses a collection of preexisting units or clusters to “stand
in” for a sampling frame. The first stage in the process is selecting a sample of clusters at random
from the list of all known clusters. The second stage consists of selecting a random sample from
each cluster. Because of this multistage process, the likelihood of sampling bias increases. This
creates a lack of sampling precision known as a design effect. It is recommended to consider the
design effect during sample size determination.
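The two-stage process can be sketched as follows, with hypothetical schools standing in for clusters; stage one picks clusters at random, stage two samples units within each chosen cluster:

```python
import random

def multistage_sample(clusters, n_clusters, n_per_cluster, seed=None):
    """Stage 1: randomly pick clusters; stage 2: randomly sample within each."""
    rng = random.Random(seed)
    chosen = rng.sample(list(clusters), n_clusters)
    return [unit for name in chosen
            for unit in rng.sample(clusters[name], n_per_cluster)]

# Hypothetical clusters: school name -> list of enrolled student IDs.
schools = {f"school_{s}": [f"s{s}_student_{i}" for i in range(200)]
           for s in range(25)}
sample = multistage_sample(schools, 5, 40, seed=11)  # 5 schools x 40 students
```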

Table 3.2 Summary of probabilistic sampling methods

| Method        | Description                                                       | Example                                                    |
|---------------|-------------------------------------------------------------------|------------------------------------------------------------|
| Simple Random | Each member has an equal chance.                                  | Lottery draw of student names.                             |
| Systematic    | Every kth individual is chosen.                                   | Every 10th patient on a clinic list.                       |
| Stratified    | Population divided into strata; random samples from each stratum. | Sampling male & female students proportionally.            |
| Cluster       | Entire clusters (groups) are randomly selected.                   | Randomly select schools, then survey all students in them. |
(B) Non-Probability Sampling (Non-Random Selection)
In non-probability sampling, not every element has a known chance of being included.

3.3.2 Non-Probabilistic Sampling Methods

Convenience Sampling
Convenience sampling is a type of non-probabilistic sampling where the researcher selects
participants based on their ease of access and proximity. This method is often used when time,
cost, and resources are limited. For example, a researcher might choose individuals who happen
to be nearby or readily available, such as students in a classroom or people in a shopping mall.
Although this approach is practical and easy to implement, it is prone to bias because the sample
may not represent the wider population, leading to limited generalizability of the results.

Purposive Sampling
Purposive sampling, also known as judgmental or selective sampling, involves the deliberate
selection of participants based on specific characteristics or qualities deemed important for the
study. The researcher uses their judgment to identify and choose individuals who have the
experience, knowledge, or attributes necessary to provide rich and relevant data. This method is
commonly used in qualitative research where the goal is to gain deep insight rather than to
generalize findings. However, since the selection is subjective, there is a risk of bias and limited
representativeness.

Quota Sampling
Quota sampling is a non-probabilistic technique where the researcher ensures that the sample
reflects certain characteristics of the population in specific proportions. The population is divided
into subgroups, or quotas, based on variables such as age, gender, or income, and a set number of
participants are selected from each subgroup. While the selection within each subgroup is not
random, this method helps ensure that the final sample mirrors the population structure to some
extent. Despite this, quota sampling can still suffer from bias because the individuals chosen within
each category may not be representative of that subgroup as a whole.

Snowball Sampling
Snowball sampling is often used in situations where the population is difficult to locate or identify,
such as in studies involving hidden or hard-to-reach groups. In this method, initial participants are
identified and then asked to refer others who meet the study’s criteria. The sample gradually
"snowballs" as more participants are recruited through referrals. This technique is particularly
useful in qualitative research and social network studies. However, it has limitations, including the
potential for homogeneity in the sample since participants often refer individuals who are similar
to themselves, which may lead to biased outcomes and limited diversity.

Table 3.3 Summary of non-probabilistic sampling methods

| Method      | Description                                           | Example                                                    |
|-------------|-------------------------------------------------------|------------------------------------------------------------|
| Convenience | Selecting whoever is easiest to reach.                | Surveying friends/family.                                  |
| Purposive   | Selecting specific individuals with unique insights.  | Experts on malaria control.                                |
| Quota       | Filling pre-set quotas (e.g., 60% male, 40% female).  | Interviewing until quotas are met.                         |
| Snowball    | Participants refer others.                            | Locating hidden populations (e.g., HIV-positive individuals). |

3.4 Sample Size Determination

Having decided how to select the sample, you have to determine the sample size. The research
proposal should provide information and justification about sample size. It is not necessarily true
that the bigger the sample, the better the study. Beyond a certain point, an increase in sample size
will not improve the study. In fact, it may do the opposite if the quality of the measurement or
data collection is adversely affected by the large size of the study. Beyond a certain sample size,
it is generally much better to increase the accuracy and richness of data collection (for example,
by improving the training of interviewers, by pre-testing the data collection tools, or by calibrating
measurement devices) than to increase the sample size. Also, it is better to make extra effort to get a
representative sample rather than to get a very large sample. The level of precision needed for the
estimates will impact the sample size. Generally, the actual sample size of a study is a compromise
between the level of precision to be achieved, the research budget and any other operational
constraints, such as time.

Factors to consider:
➢ Population size: To a certain extent, the bigger the population, the bigger the sample
needed. But once you reach a certain level, an increase in population no longer affects the
sample size. For instance, the necessary sample size to achieve a certain level of precision
will be about the same for a population of one million as for a population twice that size.
➢ Margin of error (usually 5%)
➢ Confidence level (typically 95%)
➢ Expected variability: For instance, if every person in a population had the same salary, then
a sample of one person would be all you would need to estimate the average salary of the
population. If the salaries are very different, then you would need a bigger sample in order
to produce a reliable estimate.
➢ Type of algorithm to be used for processing: Some algorithms require a certain amount
of data to estimate or train well.
➢ The sampling and estimation methods: Not all sampling and estimation methods have the
same level of efficiency. You will need a bigger sample if your method is not the most
efficient. But because of operational constraints and the unavailability of an adequate
frame, you cannot always use the most efficient technique.

Various formulas exist for determining sample size, each serving different research contexts.
Choose based on population size, confidence level, and precision needs.
Example formulas:

1. Yamane's Formula (simplified, for proportions): n = N / (1 + N * e^2), where N is the
population size and e is the margin of error. Useful for basic surveys with a known
population size.

2. Cochran's Formula (for large populations, over 10,000): n0 = (z^2 * p * (1 - p)) / e^2,
where z is the z-score for the chosen confidence level and p is the expected proportion.

3. Adjusted Sample Size for a Finite Population: n = n0 / (1 + (n0 - 1) / N). Ensures the
sample is appropriate for small populations.

4. Alternatively, online tools like the Raosoft Sample Size Calculator can be used.
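The three approaches named above (Yamane's formula, Cochran's formula, and the finite-population adjustment) can be sketched in Python using only the standard library; the function names are illustrative, not from any statistics package:

```python
import math

def yamane(N, e=0.05):
    """Yamane's formula: n = N / (1 + N * e^2), for a known population size N."""
    return math.ceil(N / (1 + N * e * e))

def cochran(p=0.5, e=0.05, z=1.96):
    """Cochran's formula: n0 = z^2 * p * (1 - p) / e^2, for large populations."""
    return math.ceil(z * z * p * (1 - p) / (e * e))

def finite_adjust(n0, N):
    """Finite-population correction: n = n0 / (1 + (n0 - 1) / N)."""
    return math.ceil(n0 / (1 + (n0 - 1) / N))

n_yamane = yamane(10_000)          # 385 at a 5% margin of error
n0 = cochran()                     # 385 at 95% confidence with p = 0.5
n_small = finite_adjust(n0, 1_000) # shrinks the sample for a population of 1,000
```

At a 5% margin of error and 95% confidence, both formulas give about 385 respondents for a large population; the finite-population correction then scales that requirement down for small populations.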

3.5 Introduction to Data Collection

Data collection is the systematic gathering of information for research purposes (Kumar, 2019).

3.5.1 Types of Data

Primary Data
Primary data refers to original data collected firsthand by the researcher to specifically address a
current research problem. It is gathered directly from the source using methods such as interviews,
surveys, observations, experiments, and focus groups. Primary data is considered highly reliable,
relevant, and up-to-date, as it is tailored to the researcher’s specific objectives. However, it often
requires more time, resources, and planning to collect compared to secondary data (Kumar, 2019).
For instance, a researcher studying solar panel efficiency may collect power output data from
experimental setups in different orientations. This constitutes primary data because it is collected
in real time for that specific study.
Secondary Data
Secondary data refers to existing information that was originally collected for a different purpose but is
now being reused for a new research question. This includes data from sources such as government
reports, academic publications, statistical databases, organizational records, and previous research
studies. It is typically more cost-effective and time-saving, as it is already available, but it may not
always perfectly align with the current research objectives or context. Researchers must evaluate
the credibility, relevance, and limitations of secondary data before using it (Creswell, 2014). For
example, using national energy reports or weather databases to analyze solar radiation trends
would involve secondary data.

3.6 Data Collection Methods/Tools/Instruments

1. Questionnaires

Questionnaires are structured tools made up of a series of questions aimed at gathering information
from respondents. They are ideal for efficiently collecting large amounts of data and are widely
used in quantitative research. They can be administered in person, by mail, or online, which offers
flexibility and scalability. However, they may experience low response rates or misinterpretation
of questions if not well designed. To enhance reliability and validity, questionnaires should be pre-
tested prior to use (Kumar, 2019; Creswell, 2014).
Types of Questionnaires
➢ Structured (Closed-ended): Fixed response options (e.g., multiple choice, Likert scale).
➢ Unstructured (Open-ended): Participants provide free-text responses.
➢ Self-administered: Respondent completes it independently (e.g., online, paper-based).
➢ Researcher-administered: Facilitated by the researcher (e.g., face-to-face or phone
survey).

2. Interviews

An interview is a method of data collection that involves verbal interaction between the researcher
and the participant to explore ideas, experiences, or behaviors. Interviews are frequently employed
in qualitative research to investigate complex behaviors, perceptions, or motivations. They provide
rich, detailed data but can be time-consuming and require skilled interviewers to prevent bias and
misinterpretation (Creswell, 2014).
Types of Interviews
➢ Structured: Same set of questions in a fixed order.
➢ Semi-structured: Predefined questions with flexibility for follow-up.
➢ Unstructured: Open-ended discussion with minimal guidance.
➢ Individual Interviews: One-on-one interactions.
➢ Group Interviews: More than one participant interviewed simultaneously (but not as
dynamic as a focus group).

3. Observations

Observation is a method in which the researcher observes and records behaviors, events, or
interactions in either a natural or controlled setting. This method proves valuable in behavioral or
social studies, especially when participants might not fully acknowledge or be truthful about their
behaviors. It enables real-time data collection but may raise ethical concerns (e.g., informed
consent) and the potential for observer bias (Kumar, 2019).
Types of Observation
➢ Participant Observation: The researcher becomes part of the group being studied.
➢ Non-participant Observation: The researcher observes without interaction.
➢ Overt Observation: Participants know they are being observed.
➢ Covert Observation: Participants are unaware of the observation.
➢ Structured Observation: Uses a predefined checklist or rubric.
➢ Unstructured Observation: Open-ended, flexible approach.

4. Focus Groups

Focus groups involve discussions led by a moderator to gather opinions, beliefs, or experiences on
a specific topic. A focus group typically includes 6–10 participants, and their interactions can
generate new ideas or shared insights that individual interviews might not uncover. Focus groups
are especially effective in exploratory research or needs assessments, although they may be
affected by dominant participants or groupthink. The moderator must be adept at managing group
dynamics to ensure that all voices are heard (Creswell, 2014; Bryman, 2016).

5. Document Review
Document review entails analyzing existing documents and records to extract relevant
information, including official reports, meeting minutes, letters, policy papers, and archival data.
This method proves beneficial when historical data is needed or when primary data collection is
impractical. Though cost-effective and non-intrusive, document review relies heavily on the
authenticity, completeness, and accuracy of the documents being examined. Researchers must
assess the credibility and relevance of each source (Bowen, 2009).

Table 3.4 Summary of research instruments

| Method          | Advantages                                    | Disadvantages                                    |
|-----------------|-----------------------------------------------|--------------------------------------------------|
| Questionnaires  | Can reach many people; standardized responses. | Low response rate; misunderstanding of questions. |
| Interviews      | In-depth data; clarifies misunderstandings.   | Time-consuming; interviewer bias risk.           |
| Observations    | Directly records behavior/events.             | Observer bias; can be intrusive.                 |
| Focus Groups    | Interactive discussions; generate new insights. | Dominant voices may skew discussion; confidentiality concerns. |
| Document Review | Uses existing data; non-intrusive.            | May be outdated or irrelevant to current needs.  |

3.7 Qualitative vs. Quantitative Data Collection

| Aspect         | Quantitative                             | Qualitative                            |
|----------------|------------------------------------------|----------------------------------------|
| Nature of Data | Numbers/statistics                       | Words, meanings, experiences           |
| Instruments    | Surveys, experiments, and questionnaires | Interviews, focus groups, observations |
| Analysis       | Statistical tests                        | Thematic/coding analysis               |
| Goal           | Measure and quantify                     | Explore and understand                 |

3.8 Ensuring Reliability and Validity

Reliability in data collection refers to the consistency and stability of the measurement process
over time. It indicates the degree to which repeated measurements under the same conditions yield
the same results. For instance, if a survey is administered to a group of people on two different
occasions and produces similar responses each time, the instrument is considered reliable.
Reliability is crucial for ensuring that the data are dependable and not influenced by random errors
or inconsistencies. Common methods of assessing reliability include test-retest reliability, inter-
rater reliability, and internal consistency. While reliability does not guarantee accuracy, it is a
necessary condition for validity because data must be consistent before it can be considered
accurate.

a. Test-Retest Reliability

➢ What it measures: Consistency of scores over time.


➢ How it's done: Administer the same instrument to the same group at two different times
and compute correlation (e.g., Pearson’s r).
➢ Use case: Attitudes or perceptions that are stable over time.
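A test-retest check can be sketched in plain Python by correlating two administrations of the same instrument; the two score lists below are hypothetical:

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two administrations of one instrument."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Hypothetical attitude scores for 6 respondents, measured on two occasions.
time1 = [10, 12, 9, 15, 11, 13]
time2 = [11, 12, 9, 14, 10, 13]
r = pearson_r(time1, time2)  # close to 1.0 suggests a stable instrument
```

The near-identical score patterns give r of about 0.94, which would indicate good test-retest reliability.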

b. Internal Consistency (e.g., Cronbach’s Alpha)

➢ What it measures: Consistency among items in a scale.


➢ How it's done: Use Cronbach’s alpha statistic; α ≥ 0.7 is generally acceptable (Creswell,
2014).
➢ Use case: Multiple-item Likert scales.
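Cronbach's alpha can be computed from raw item scores with the standard library alone; the Likert responses below are hypothetical (three items, five respondents):

```python
from statistics import variance

def cronbach_alpha(items):
    """Cronbach's alpha for a list of item-score columns (one list per item)."""
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]  # per-respondent totals
    item_var = sum(variance(col) for col in items)
    return (k / (k - 1)) * (1 - item_var / variance(totals))

# Hypothetical responses: 3 Likert items answered by 5 respondents.
items = [[4, 5, 3, 4, 2],
         [4, 4, 3, 5, 2],
         [5, 5, 3, 4, 1]]
alpha = cronbach_alpha(items)
```

Here the three items move together across respondents, so alpha comes out well above the 0.7 threshold.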

c. Inter-Rater Reliability

➢ What it measures: Agreement between different observers/raters.


➢ How it's done: Use Cohen’s Kappa or Intraclass Correlation Coefficient (ICC).
➢ Use case: Observational checklists or coding in qualitative research.
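Cohen's kappa for two raters can be sketched as follows; the observation codes assigned by the two raters are hypothetical:

```python
def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa: agreement between two raters, corrected for chance."""
    n = len(rater_a)
    categories = set(rater_a) | set(rater_b)
    p_observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    p_chance = sum((rater_a.count(c) / n) * (rater_b.count(c) / n)
                   for c in categories)
    return (p_observed - p_chance) / (1 - p_chance)

# Hypothetical codes assigned by two raters to 10 observation episodes.
a = ["on", "on", "off", "on", "off", "on", "on", "off", "on", "off"]
b = ["on", "on", "off", "on", "on",  "on", "on", "off", "on", "off"]
kappa = cohens_kappa(a, b)
```

Raw agreement here is 9/10, but kappa (about 0.78) is lower because some of that agreement is expected by chance alone.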

Validity in data collection refers to the extent to which the data accurately represent the concept
or phenomenon being measured. It determines whether the research truly measures what it intends
to measure. For example, if a questionnaire is designed to assess students' academic motivation, it
should capture aspects directly related to motivation, rather than unrelated factors such as test
anxiety or self-esteem. Validity ensures that the conclusions drawn from the data are credible and
meaningful. There are various types of validity, including content validity, construct validity, and
criterion-related validity, each addressing different aspects of accuracy and relevance. High
validity indicates that the data and the instruments used are aligned with the research objectives,
which strengthens the overall trustworthiness of the study.

a. Content Validity

➢ What it measures: Whether the instrument covers all aspects of the construct.
➢ How it's done: Review by subject matter experts.
➢ Use case: Custom questionnaires or exams.

b. Construct Validity

➢ What it measures: Whether the instrument truly measures the theoretical construct.
➢ How it's done: Through factor analysis or by checking correlations with related
measures (convergent validity) and unrelated measures (discriminant validity).
➢ Use case: Psychological or educational assessments.

c. Criterion Validity

➢ What it measures: How well one measure predicts an outcome based on another measure.
➢ Types:
o Concurrent validity – comparison with current gold-standard tool.
o Predictive validity – forecasting future outcomes.
➢ Use case: Admission tests predicting academic performance.

Pre-testing Instruments (Pilot Study)

Conduct a small trial run before full data collection to:


➢ Check clarity of questions.
➢ Identify logistical issues.
➢ Improve reliability and validity.

Ethical Considerations in Data Collection

➢ Informed consent
➢ Confidentiality & privacy
➢ Voluntary participation
➢ Avoid harm (physical, psychological)

3.9 The Difference between Research Method and Research Methodology
Although the terms "methods" and "methodologies" are often used synonymously, it is helpful for
you to understand that they convey different meanings.

A method is a specific research technique or approach used to gather evidence about a
phenomenon. Methods are, therefore, the specific research tools we use in research projects to gain
a fuller understanding of phenomena: the range of approaches used to gather data that serve as a
basis for inference, interpretation, explanation, and prediction (e.g., surveys, interviews,
participant observation). Methods can be grouped under qualitative, quantitative, and mixed
approaches.

Methodology describes "the theory of how inquiry should proceed," which involves "analysis of
the principles and procedures in a particular field of inquiry." It involves the researchers'
assumptions about the nature of reality and the nature of knowing and knowledge. In other words,
methodology represents "a theory and analysis of how research does or should proceed."
Methodology encompasses our entire approach to research. Our assumptions about what we
believe knowledge is, embedded in methodological discussions, have consequences for how we
design and implement research studies.
