Data Collection and Analysis
Presenter :Engineer Motion Marufu
E-mail: motionmarufu60@gmail.com
WhatsApp: +263773587388
Date: 9/10/2025
Objectives
After studying this chapter, the reader will be able to:
• Define data collection and explain its importance
• Differentiate between primary and secondary data
• Identify and evaluate data collection methods
• Explain ethical issues in data collection
• Apply basic analysis techniques (quantitative & qualitative)
• Present and interpret research findings
• Understand types of data in engineering research.
• Learn methods for data collection in automotive and aerospace contexts.
Introduction
Data are needed in research work to serve the following purposes:
1. Collection of data is essential in any educational research because it provides a solid
foundation for the study.
2. Data are something like the raw material used in the production of research; the quality
of the data determines the quality of the research.
3. They provide a definite direction and a definite answer to a research inquiry. Every
inquiry has to give a definite answer to an investigation, and data are essential for
scientific research.
4. Data are needed to substantiate the various arguments in research findings.
5. The main purpose of data collection is to verify the hypotheses.
6. Statistical data are used for two basic problems of any investigation:
(a) Estimation of population parameters, which helps in drawing generalizations.
(b) Testing the hypotheses of the investigation through the data collection procedure.
Introduction
7. Qualitative data are used to find out facts, while quantitative data are employed to
formulate new theories or principles.
8. Data are also employed to ascertain the effectiveness of a new device for its
practical utility.
9. Data are necessary to provide the solution to the problem.
MEANING OF DATA
Data means observations or evidence. Scientific educational research requires data
obtained by means of standardized research tools or self-designed instruments.
Data are both qualitative and quantitative in nature.
A score is the numerical description of an individual with regard to some
characteristic or variable.
Introduction
• Data collection is an important feature of the whole process of research.
• Measured facts used as a basis for reasoning, calculation, or decision-making are
called data (Although datum is singular and data is plural, in technical writing,
the plural form data is always used).
• When you collect data, your intention is to make inferences based on the
collected data.
• Sometimes, the information you are trying to find out is already available in
records but needs to be extracted. However, this may not be the case always, and
in most situations, data have to be generated by doing some research.
• According to the broad approaches to information gathering, data are
categorized as primary data and secondary data.
Introduction
• When the data are collected by direct observation or survey, they are called
primary data. They give you first-hand information.
• If the data are collected from already published books, census reports, journals,
theses, project reports, published statistics, and similar documents, they are
called secondary data.
Introduction to Data Collection
What is data?
Why collect data in research?
Types of data: Qualitative vs Quantitative
Data sources: Primary vs Secondary
Introduction to Data
Definition of data: Measurable observations to answer research questions.
Types:
Qualitative → e.g., pilot feedback, driver surveys
Quantitative → e.g., engine RPM, lift-to-drag ratio
Importance: Valid data ensures credible research conclusions.
Introduction to Data Collection
Data collection is a process of collecting information from all the relevant sources
to find answers to the research problem, test the hypothesis and evaluate the
outcomes.
Data collection methods can be divided into two categories: primary methods of
data collection and secondary methods of data collection.
Primary Data Collection Methods
Primary data collection methods can be divided into two groups: quantitative and
qualitative. Quantitative methods are based on mathematical calculations in various
formats. Methods of quantitative data collection and analysis include questionnaires
with closed-ended questions, methods of correlation and regression, mean, mode and
median, and others.
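To make these techniques concrete, here is a minimal sketch in Python (standard library only, Python 3.10+), using hypothetical questionnaire ratings, that computes the mean, median, mode, a Pearson correlation, and a least-squares regression line:

```python
# Minimal sketch (hypothetical data): basic quantitative analysis of
# closed-ended questionnaire scores using only the Python standard library.
import statistics

# Hypothetical 1-5 ratings from ten respondents for two survey items.
comfort = [4, 5, 3, 4, 4, 2, 5, 4, 3, 4]
noise   = [3, 4, 2, 4, 3, 2, 5, 4, 3, 3]

print("mean  :", statistics.mean(comfort))
print("median:", statistics.median(comfort))
print("mode  :", statistics.mode(comfort))

# Simple correlation and least-squares regression between the two items
# (both functions require Python 3.10 or later).
r = statistics.correlation(comfort, noise)                 # Pearson r
slope, intercept = statistics.linear_regression(comfort, noise)
print(f"r = {r:.2f}; noise ~ {slope:.2f}*comfort + {intercept:.2f}")
```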
Precautions in Data Collection
In the data collection the following precautions should be observed:
1. The data must be relevant to the research problem.
2. It should be collected through formal or standardized research tools.
3. The data should be such that they can easily be subjected to statistical treatment.
4. The data should have minimum measurement error.
5. The data must be tenable for the verification of the hypotheses.
6. The data should be such that the parameters of the population can be estimated for
inferential purposes.
7. The data should be complete in itself and also comprehensive in nature.
8. The data should be collected through objective procedure.
9. The data should be accurate and precise.
10. The data should be reliable and valid.
Precautions in Data Collection
11. The data should be such that they can be presented and interpreted easily.
12. The scoring procedure of the research tool should be easy and objective.
Data Collection Methods
Data Collection: the systematic process of gathering information relevant to research objectives.
Key point: the choice of method affects reliability, validity, and the research outcome.
Quantitative
• Surveys & Questionnaires
• Experiments
• Structured Observations
• Standardized Tests
Qualitative
• Interviews
• Focus Groups
• Case Studies
• Document Review
• Participant Observation
Choosing a Method
Based on:
Research questions & objectives
Type of data needed
Resources & time available
Ethical considerations
Data Collection Methods (Table)
Method | Automotive Example | Aerospace Example | Notes
Surveys | Driver satisfaction | Pilot cockpit evaluation | Human factors studies
Experiments | Engine tests | Wind tunnel tests | Controlled environment
Field Trials | Vehicle durability | UAV flight tests | Real-world validation
Sensors | OBD-II logs | Flight data recorders | High-precision measurements
Simulations | CFD airflow | FEM stress analysis | Cost-effective & risk-free
Example: Wind tunnel for wing design is cheaper than full-scale flight test
Practical Examples
Field | Topic | Data Collection | Analysis
Automotive | Tire pressure on braking distance | Instrumented braking | Regression; mean stopping distance
Aerospace | Wing shape on lift-to-drag | Wind tunnel + sensors | ANOVA; lift vs drag curves
Automotive | Fuel type on efficiency | Engine dynamometer | Efficiency curves; % improvement
Aerospace | UAV stability in gusts | Flight + IMU sensors | Time-series; pitch/roll deviations
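As a hedged illustration of the first row of the table (tire pressure versus braking distance), the sketch below fits a least-squares line and reports the mean stopping distance; the measurements are invented and NumPy is assumed to be available:

```python
# Minimal sketch (hypothetical measurements): regressing braking distance
# on tire pressure, as in the first row of the table above.
import numpy as np

pressure_kpa = np.array([180, 200, 220, 240, 260])        # tire pressure
distance_m   = np.array([44.1, 42.5, 41.0, 40.2, 39.8])   # stopping distance

slope, intercept = np.polyfit(pressure_kpa, distance_m, deg=1)
print(f"distance ~ {slope:.3f}*pressure + {intercept:.1f} m")
print("mean stopping distance:", distance_m.mean(), "m")
```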
Summary
Accurate data collection is critical in engineering research
Choose method based on objective, resources, and constraints
Use software tools and statistical techniques for analysis
Real-world examples help bridge theory and practice
Quantitative vs Qualitative Research and their applications
• Research is a systematic investigation that aims to generate knowledge about a
particular phenomenon.
• However, the nature of this knowledge varies and reflects your study objectives.
Some study objectives seek to make standardised and systematic comparisons,
others seek to study a phenomenon or situation in detail.
• These different intentions require different approaches and methods, which are
typically categorised as either quantitative or qualitative.
• You have probably already made decisions about using qualitative or
quantitative data for monitoring and evaluation.
• Perhaps you have had to choose between using a questionnaire or conducting a
focus group discussion in order to gather data for a particular indicator.
Quantitative research
• Quantitative research typically explores specific and clearly defined questions
that examine the relationship between two events, or occurrences, where the
second event is a consequence of the first event.
• Such a question might be: ‘what impact did the programme have on children’s
school performance?’
• To test the causality or link between the programme and children’s school
performance, quantitative researchers will seek to maintain a level of control of
the different variables that may influence the relationship between events and
recruit respondents randomly.
• Quantitative data is often gathered through surveys and questionnaires that are
carefully developed and structured to provide you with numerical data that can
be explored statistically and yield a result that can be generalised to some larger
population.
Qualitative research
• Research following a qualitative approach is exploratory and seeks to explain
‘how’ and ‘why’ a particular phenomenon, or programme, operates as it does in a
particular context.
• As such, qualitative research often investigates
i) local knowledge and understanding of a given issue or programme;
ii) people’s experiences, meanings and relationships and
iii) social processes and contextual factors (e.g., social norms and cultural
practices) that marginalise a group of people or impact a programme.
• Qualitative data is non-numerical, covering images, videos, text and people’s
written or spoken words.
• Qualitative data is often gathered through individual interviews and focus group
discussions using semistructured or unstructured topic guides.
Qualitative research
• Qualitative data do not involve numbers or mathematical calculations.
• Qualitative research is closely associated with words, sounds, feeling, emotions,
colors and other elements that are non-quantifiable.
• Qualitative studies aim to ensure a greater depth of understanding. Qualitative
data collection methods include interviews, questionnaires with open-ended
questions, focus groups, observation, games or role-playing, case studies, etc.
Data Collection - Qualitative
Information you gather can come from a range of sources. Likewise, there are a
variety of techniques to use when gathering primary data. Listed below are some of
the most common data collection techniques.
⚫ Interviews
⚫ Questionnaires and Surveys
⚫ Observations
⚫ Focus Groups
⚫ Documents and Records
Quantitative methods
Quantitative data can be collected using a number of different methods and from a
variety of sources.
1. Surveys and questionnaires use carefully constructed questions, often ranking
or scoring options or using closed-ended questions. A closed-ended question
limits respondents to a specified number of answers. For example, this is the
case in multiple-choice questions. Good quality design is particularly important
for quantitative surveys and questionnaires.
2. Biophysical measurements can include height and weight of a child
3. Project records are a useful source of data. For example, the number of training
events held and the number of participants attending
4. Service provider or facility data for example school attendance or health care
provider vaccination records
5. Service provider or facility assessments are often carried out during the
monitoring and evaluation of our projects.
Qualitative vs Quantitative
• Table 1: Key differences between qualitative and quantitative research
Qualitative vs Quantitative
• Although the table above illustrates qualitative and quantitative research as
distinct and opposite, in practice they are often combined or draw on elements
from each other.
• For example, quantitative surveys can include open ended questions. Similarly,
qualitative responses can be quantified.
• Qualitative and quantitative methods can also support each other, both through a
triangulation of findings and by building on each other (e.g., findings from a
qualitative study can be used to guide the questions in a survey).
Methods for collecting Data
Methods for Collecting Primary Data
Individual interview
• An individual interview is a conversation between two people that has a
structure and a purpose. It is designed to elicit the interviewee’s knowledge or
perspective on a topic.
• Individual interviews, which can include key informant interviews, are useful for
exploring an individual’s beliefs, values, understandings, feelings, experiences
and perspectives on an issue. Individual interviews also allow the researcher to
probe into a complex issue, learning more about the contextual factors that govern
individual experiences.
Designing questionnaire and schedule of questions
Questionnaire:
• A questionnaire refers to a device for securing answers to questions by using a
form which the respondent fills in by himself/herself. It consists of some
questions printed or typed in a definite order.
• People quite commonly use questionnaire and schedule interchangeably, due to
much resemblance in their nature; however, there are many differences between
these two.
• While a questionnaire is filled in by the informants themselves, enumerators fill in
the schedule on behalf of the respondents.
Questionnaires
• A questionnaire consists of a series of pre-determined questions that can be self-
administered, administered by mail, or asked by interviewers.
• The information is collected usually through mail by sending a proforma. The
respondent is asked to fill it, and there is no interaction between the investigator
and the respondent. Hence, simple questions only are included in questionnaires.
• When the questionnaire is administered by interview, it is often called an
interview schedule or simply schedule. Here, investigators ask questions directly
to the respondents using the schedule. They frame a set of questions, and while
asking questions, they can make some variations in the mode of asking
questions.
Questionnaires
• Questionnaires are used in research based on a fundamental and important
assumption that the respondents are willing to give truthful answers.
• They must also be able to give responses freely. Interviews or questionnaires
commonly include three types of questions, (1) closed questions or forced-
choice questions, (2) open-ended questions, and (3) scale items.
• Forced-choice items: Forced-choice items or closed items allow the respondent
to choose from two or more fixed alternatives.
• Most frequently used are the dichotomous items, which offer two alternatives
only, for example: ‘yes/no’ or ‘agree/disagree’. Sometimes, a third alternative is
also given as ‘undecided’ or ‘don’t know’. The alternatives offered should cover
all the possibilities.
Questionnaires
Open-ended items:
• The respondents are free to answer anything to the questions offered to them.
The answers may be a short note or short essay depending on the nature of
question. No restraints are imposed on their answer to the question.
Scale items:
• As already mentioned, the scale is a set of verbal items to which the respondent
answers by indicating degrees of agreement or disagreement. Individual
responses are located on a scale of fixed alternatives, for example: ‘strongly
disagree’ to ‘strongly agree’.
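As a small illustration, the sketch below shows one common way of coding such fixed-alternative scale responses as numbers so they can be summarized; the 1-5 coding and the responses are assumptions made for the example:

```python
# Minimal sketch (hypothetical responses): coding Likert-type scale items
# into numbers so they can be summarized like other quantitative data.
SCALE = {"strongly disagree": 1, "disagree": 2, "undecided": 3,
         "agree": 4, "strongly agree": 5}

responses = ["agree", "strongly agree", "undecided", "agree", "disagree"]
scores = [SCALE[r] for r in responses]

print("coded scores  :", scores)
print("mean agreement:", sum(scores) / len(scores))
```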
Definition of Schedule
• The schedule is a proforma which contains a list of questions filled by the
research workers or enumerators, specially appointed for the purpose of data
collection.
• Enumerators go to the informants with the schedule, ask them the questions from
the set in the given sequence, and record the replies in the space provided.
• There are certain situations, where the schedule is distributed to the respondents,
and the enumerators assist them in answering the questions.
Limitations of a Schedule
The following are the main disadvantages of a schedule:
1. It is a very time-consuming and costly instrument to administer to subjects
personally.
2. Some subjects repeatedly have queries about the schedule, and it is difficult to
explain and satisfy them.
3. Some subjects have to be contacted individually before data can be obtained.
4. Some subjects, e.g. principals and administrators, are not easily approachable,
and it is difficult to get an appointment for administering the tool.
5. Sometimes subjects are more alert or intelligent than the researcher, and the
researcher has difficulty administering the tool.
6. This tool cannot be used effectively and easily on a large sample of subjects.
Key Differences Between Questionnaire and Schedule
• Questionnaire refers to a technique of data collection which consists of a series of
written questions along with alternative answers. The schedule is a formalized
set of questions, statements, and spaces for answers, provided to the enumerators
who ask questions to the respondents and note down the answers.
• Questionnaires are delivered to the informants by post or mail and answered as
specified in the cover letter. On the other hand, schedules are filled by the
research workers, who interpret the questions to the respondents if necessary.
• The response rate is low in the case of questionnaires, as many people do not respond
and often return them without answering all the questions. On the contrary, the
response rate of schedules is high, as they are filled in by the enumerators, who can
get answers to all the questions.
Key Differences Between Questionnaire and Schedule
• Questionnaires can be distributed to a large number of people at the same time,
and even respondents who are not approachable in person can be reached easily.
Conversely, in the schedule method, the reach is relatively small, as the enumerators
cannot be sent to a large area.
• Data collection by questionnaire method is comparatively cheaper and
economical as the money is invested only in the preparation and posting of the
questionnaire. As against this, a large amount is spent on the appointment and
training of the enumerators and also on the preparation of schedules.
• In the questionnaire method, it is not known who answers the questions, whereas
in the case of a schedule, the respondent’s identity is known.
• The success of the questionnaire lies in the quality of the questionnaire, while the
honesty and competency of the enumerator determine the success of a schedule.
Characteristics of a good questionnaire
• The questionnaire is usually employed only when the respondents are literate and
cooperative, unlike the schedule, which can be used for data collection from all
classes of people.
Characteristics of a good questionnaire
• Deals with a significant topic
• Seeks only that information which cannot be obtained from other sources such as
census data
• As short as possible, only long enough to get the essential data.
• Attractive in appearance, neatly arranged, and clearly duplicated or printed.
• Directions are clear and complete. Questions are objective, with no leading
suggestions to the desired response
• Questions are presented in good psychological order, proceeding from general to
more specific responses.
• Easy to tabulate and interpret.
Guidelines for preparing questionnaire
• Prepare the questionnaire in accordance with the study objectives
• Keep it concise, precise and brief
• Trial the questionnaire with friends before use
• Select respondents carefully
• Avoid open-ended questions as far as possible
• Avoid controversial and ambiguous questions
• Get permission within the organization before administering the questionnaire
• Try to get the aid of sponsorship
• A mailed questionnaire should have an introduction, a statement of purpose and
directions for filling in the questions
• Avoid an abrupt ending of the questions and the questionnaire.
Deciding which Questions to Use
• If you’re sure that a questionnaire is the most appropriate method for your
research, you need to decide whether you intend to construct a closed-ended,
open-ended or combination questionnaire.
• In open questions respondents use their own words to answer a question,
whereas in closed questions prewritten response categories are provided.
• You need to think about whether your questionnaire is to be self-administered,
that is, the respondent fills it in on his own, away from the researcher, or whether
it is to be interviewer administered.
• Self-administered questionnaires could be sent through the post, delivered in
person or distributed via the internet.
• It is also important to think about the analysis of your questionnaire at this stage
as this could influence its design
Open and Closed Questions
OPEN QUESTIONS | CLOSED QUESTIONS
Tend to be slower to administer. | Tend to be quicker to administer.
Can be harder to record responses. | Often easier and quicker for the researcher to record responses.
May be difficult to code, especially if multiple answers are given. | Tend to be easy to code.
Do not stifle response. | Respondents can only answer in a predefined way.
Enable respondents to raise new issues. | New issues cannot be raised.
Respondents tend to feel that they have been able to speak their mind. | Respondents can only answer in a way which may not match their actual opinion and may, therefore, become frustrated.
In self-administered questionnaires, respondents might not be willing to write a long answer and decide to leave the question blank. | Quick and easy for respondents to tick boxes; they might be more likely to answer all the questions.
Wording and Structure of Questions
• Questions should be kept short and simple
• Make sure that your questions don’t contain some type of prestige bias.
• Some issues may be very sensitive and you might be better asking an indirect
question rather than a direct question.
• Avoid leading questions
Focus group discussions
• A focus group discussion is an organised discussion among 6 to 8 people.
• Focus group discussions provide participants with a space to discuss a particular
topic, in a context where people are allowed to agree or disagree with each other.
• Focus group discussions allow you to explore how a group thinks about an issue,
the range of opinions and ideas, and the inconsistencies and variations that exist
in a particular community in terms of beliefs and their experiences and practices.
• You should therefore purposefully recruit participants for whom the issue is
relevant.
• Be clear about the benefits and limitations of recruiting participants that
represent either one population (e.g. school going girls) or a mix (e.g. school
going boys and girls), and whether or not they know each other.
Focus Groups
• A focus group is where a number of people are asked to come together in order
to discuss a certain issue for the purpose of research.
• They are popular within the fields of market research, political research and
educational research.
• The focus group is facilitated by a moderator who asks questions, probes for
more detail, makes sure the discussion does not digress and tries to ensure that
everyone has an input and that no one person dominates the discussion.
Observation
• Observation and survey are the two major primary data collection procedures
employed by researchers.
• The choice depends upon the purpose of the study, nature of the problem, the
resources available, and the skills of the researcher.
• Surveys do not involve direct observation; rather, inferences about behaviour or
situations are made from data collected through interviews or questionnaires.
• Survey methods are used widely in social sciences and management to assess
prevalence, attitudes, and opinions on different subjects, especially in non-
experimental studies such as cross-sectional studies.
Observation
• Observation of nature is the fundamental basis of science.
• It is a purposeful and selective way of watching and recording an interaction or
phenomena as it occurs.
• When you conduct an experiment, you observe and record information on
several features. For example, if it is a growth analysis study of plants, you will
observe and record number of leaves, leaf area, total dry weight, stem weight,
height, root weight, and so on.
• Similarly, when you are conducting a titration study with chemicals or designing
a device and testing its functions, you are making observations.
• In social sciences too, observation is made in several situations such as the
behaviour of a group, personality traits of an individual, and functions of a
worker.
Observation
• Like experimental studies, the objective of an observational study is to
understand the meanings of what is going on and the cause-and-effect
relationships.
• However, different from an experimental study, we do not change the conditions
and parameters during an observational study.
• Some observational studies aim to develop or test the models that describe or
predict a behavior. In such a study, we directly observe the subject of research,
typically in a natural setting without an interference and manipulation.
• During a site visit for observation, we should carefully take records, in the form
of handwritten notes, photos, and/or videos. We should get as much information
as possible during a site visit, even though only a small part of the collected
information may be used in later analysis and study.
• During an observational study, we may also participate in the activity.
Observation
• An observational study may involve human participants, such as production
workers.
• The human participants may or may not be aware of being observed. If not
aware, the observation is called unobtrusive or nonreactive.
• In some cases, an observational study does not have a fully defined research
problem in advance. New questions likely arise during observation. For
example, we may notice an unknown phenomenon and ask, “what is happening
and why?”
Natural Observation
• In natural observation, researchers passively observe and record the behaviour,
phenomena, or some other events in its natural background.
• The researcher will not in any way interfere with the system.
• This kind of observation has been a successful method in such physical sciences
as astronomy, geology, oceanography, and meteorology.
• In social sciences, humans or animals are observed as they go about their
activities in real life situations.
• In natural sciences, this may involve observing plants, animals, or some physical
phenomena in natural settings.
• In health science-related areas such as human anatomy, observation is the
primary method used to describe the construction of the body.
• In ecology, observation has always been of importance although experimental
methods have also been utilized
Advantages of Observational Studies
• Compared with surveys and interviews, observational studies have a greater
proximity to real-world situations. Therefore, industrial professionals often use
observational studies particularly for problem-solving and validation.
• Other advantages of observational studies include that they are usually
inexpensive and can generate new research initiatives or hypotheses. They may
also be complementary to other research methods and simultaneously used.
Limitations of Observational Studies
• Since observational studies are passive, not altering any conditions and
parameters during a study can limit the depth of a study. We often use an
observational approach at an early phase of a large research project.
• Another concern related to observational studies is the possibility of subjectivity.
The influence of observers, with their unique experience and background, can be
significant because observers exercise a certain level of judgment and inference in
their results. A study showed that only 5–20% of observational studies could be
replicated. Using such results may carry invalid findings forward and make further
research results questionable.
• In order to reduce such limitations, we should train and practice in advance to
obtain consistent results. For example, let observers use the same rating scale to
evaluate the same phenomenon independently. Then, we analyze the
observational results to assess their consistency, which is similar to
reproducibility tests for quantitative data.
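A minimal sketch of such a consistency check is shown below: two observers rate the same ten events on the same scale, and their percent agreement and Cohen's kappa are computed (the ratings are hypothetical):

```python
# Minimal sketch (hypothetical ratings): checking the consistency of two
# observers who scored the same events on the same rating scale.
from collections import Counter

obs_a = [1, 2, 2, 3, 1, 2, 3, 3, 2, 1]   # observer A's ratings
obs_b = [1, 2, 3, 3, 1, 2, 3, 2, 2, 1]   # observer B's ratings
n = len(obs_a)

p_o = sum(a == b for a, b in zip(obs_a, obs_b)) / n        # observed agreement
count_a, count_b = Counter(obs_a), Counter(obs_b)
p_e = sum(count_a[k] * count_b[k] for k in set(obs_a) | set(obs_b)) / n**2

kappa = (p_o - p_e) / (1 - p_e)                            # Cohen's kappa
print(f"agreement = {p_o:.2f}, Cohen's kappa = {kappa:.2f}")
```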
Interviews
• An interview is a more personal form of research than a questionnaire.
• For seeking opinions or impressions from a person or group of persons, an
interview is an ideal choice.
• An interview is a verbal interchange, often face to face, in which an interviewer
tries to elicit information, beliefs, or opinions from another person.
• In interviews, the interviewer works directly with the respondent. If required, the
interviewer can probe further by asking follow up questions.
• Interviews can be conducted in many ways; it may be a structured interview,
unstructured interview, or semi-structured interview based on the interview
schedule used and the context.
• Quantitative analysis is possible only from the numerical data generated out of
structured interviews. If you need only qualitative information, then either semi-
structured or unstructured interviews can be adopted.
Interviews
Structured Interviews
• An interview study with predetermined and standardized questionnaires is called a
structured interview.
• Using this approach, we can present questions in the same or a similar order to various
interviewees and reliably aggregate the corresponding answers, so that after the
interviews we can carry out comparative analysis with good confidence across
interviewees and across different periods.
• A major advantage of a structured interview is that the quantitative data generated are
amenable to statistical analysis.
Unstructured Interviews
• This takes the form of a conversation between the informant and the researcher.
• There is no standardized list of questions, and it is a free flowing conversation in a
natural setting. In an unstructured way, it focuses on the respondent’s perception or
opinion on various issues. This is also called open-ended interviewing or in-depth
interviewing.
Interviews
• An unstructured interview offers great flexibility in the interview content and
plentiful outcomes, but these may not be fully predictable, which makes later
analysis difficult.
Semi-structured Interviews
• Semi-structured interview makes use of a guide list with some broad questions
or issues, which are to be discussed for possible investigation during an
interview.
• The list normally includes broad and open-ended questions to be answered in a
free flowing, conversational, relaxed, and informal setting.
• The interviewer is left free to rephrase the questions and to ask probing
questions for added details.
• During an interview, however, we may omit some questions and add new topics,
depending on the flow of an interview conversation.
Overview of Interviews
• We may view an interview as a type of survey for an individual participant,
conducted either face to face, by telephone, or online. The purpose of an interview
study is to explore an individual's perceptions in detail. In other words, interviews
can be a better method to collect in-depth information on people's opinions,
thoughts, experiences, and feelings.
• An interview study has characteristics similar to a survey study but with unique
features. In the design and planning stages, we should more judiciously prepare the
questions, target interviewees, moderators, and possible flexibility during interviews.
Overview of Interviews
• Interview studies may be more suitable for a study that requires flexible probing
to obtain adequate information than questionnaire surveys.
• Thus, an advantage of interview studies is the flexibility and possibility to ask
further and/or controversial questions.
• Compared with a survey, the interview approach is also more intrusive and
reactive.
• Interestingly, there are interview studies in various engineering fields; for
example, interview studies appear in applied software engineering, product
development, manufacturing, etc.
Types and Characteristics of Interviews
Figure below shows three types of interviews based on how the questions are
pre-prepared.
Considerations in Interviews
• There are a few practices to consider to ensure the success of an interview.
• For example, we may send e-mail reminders before an interview to confirm the
interviewee's participation.
• We may also plan to limit an interview session to half an hour with about 10
questions.
• We will also need to decide how to record the interview information, for
example, take notes, video tape, or simply rely on our memory.
• Video recording or audiotaping is effective and accurate but needs the
permission from interviewees beforehand.
• In addition, during the beginning of the interview, we may need to ensure the
confidentiality and impartiality toward the interviewee.
Considerations in Interviews
• We can elicit more in-depth responses or fill in information by helping
participants to understand the questions.
• However, to avoid influencing or prompting a reaction from an interviewee, we
should ask questions in a neutral tone of voice.
• After an interview, we normally submit the transcripts and summary of
interviews to the interviewees so that they may clarify matters, add a few
afterthoughts, and correct misrepresentations.
• By doing so, we can improve interview quality by not fully relying on quick
responses during the interview. We may send a cordial closing and thank-you
letter on the following day.
Limitations of Interviews
• People often welcome the opportunity to express themselves and talk about their
opinions. However, there are challenging factors to a quality interview beyond
the time-consumption of an interview study. The factors include:
• Analytic observation of interviewers leading to next questions
• Possible snowball sampling in terms of questions
• Influence of the result from the first interview on the subsequent interviews
• Role of interviewers and the consistency of their conduct when there are multiple
interviewers. It may be difficult for an interviewer to remain neutral when asking
additional questions after receiving answers to the basic questions. The
additional questions, the tone, and the interviewer's facial expression can reveal
the interviewer's preference, which may lead to different reactions and
answers.
Limitations of Interviews
• Given the amount of time and effort required for an interview, inherently it has a
limitation in the number and range of interviewees.
• There is an argument that an interview approach is biased in its data collection.
For example, influencing factors include the social interaction between the
researcher and interviewees and the researcher's own background and attitude.
Such factors are difficult to measure and to eliminate fully.
Identifying Participants
Qualitative research often focuses on a limited number of respondents who have
been purposefully selected to participate because you believe they have in-depth
knowledge of an issue you know little about, such as:
• They have experienced first-hand your topic of study, e.g. working street children
• They show variation in how they respond to hardship, e.g. children who draw on
different protective mechanisms to cope with hardship on the street and in the
work place
• They have particular knowledge or expertise regarding the group under study,
e.g. social workers supporting working street children.
Identifying Participants
You can select a sample of individuals with a particular ‘purpose’ in mind in
different ways, including:
• Extreme or typical case sampling – learning from unusual or typical cases, e.g.
children who expectedly struggle with hardship (typical) or those who do well
despite extreme hardship (unusual)
• Snowball sampling – asking others to identify people who will interview well,
because they are open and because they have an in-depth understanding about
the issue under study. For example, you may ask street children to identify other
street children you can talk to.
• Random purposeful sampling – if your purposeful sample size is large you can
randomly recruit respondents from it.
Identifying Participants
• Whilst purposeful sampling enables you to recruit individuals based on your
study objectives, this limits your ability to produce findings that represent your
population as a whole.
• It is therefore good practice for triangulation purposes to recruit a variety of
respondents (e.g., children, adults, service users and providers)
Modelling And Simulation
Models
Many problems can be formulated or translated into models. A system is a
subset of the world that is considered to be self-contained. A model is a
simplified representation of a system. Armed with a suitable model,
researchers may try either or both of:
mathematical analysis to solve problems about a system or optimise
its functioning; and computer simulation to approximate what happens when it is
functioning.
There are several good reasons for using a model.
1. It would be too expensive to build the real thing to see if it works (e.g.
a petrochemical plant).
2. The real system exists but cannot be experimented on (e.g. a nuclear
reactor).
3. Researchers can use the model for 'what-ifs' (e.g. 'what will happen if…').
Simulation
The process of model creation and usage encapsulates the scientific method in
miniature. The steps in a simulation are as follows:
1. Define the system and the objectives.
2. Determine the model's scope and scale (what is in it and how much detail
will be included).
3. Choose a programming language and code the model.
4. Run the model.
5. Gather data and analyse it.
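A minimal sketch of these steps applied to a toy automotive model is given below; the system, the friction range, and all parameter values are assumptions chosen purely for illustration:

```python
# Minimal sketch (hypothetical model): the simulation steps above applied to
# estimating stopping distance when the road friction coefficient is uncertain.
import random
import statistics

# Steps 1-2. System, objective, scope: stopping distance d = v^2 / (2*mu*g)
# for one initial speed, with friction mu treated as a random input.
V = 27.8          # initial speed in m/s (about 100 km/h)
G = 9.81          # gravitational acceleration, m/s^2

# Step 3. Code the model.
def stopping_distance(mu):
    return V**2 / (2 * mu * G)

# Step 4. Run the model many times with randomly drawn friction coefficients.
random.seed(1)
samples = [stopping_distance(random.uniform(0.5, 0.8)) for _ in range(10_000)]

# Step 5. Gather the output data and analyse it.
print("mean distance :", round(statistics.mean(samples), 1), "m")
print("std deviation :", round(statistics.stdev(samples), 1), "m")
```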
Computer Simulation
Concept of Simulation
Physical systems in the real world are extremely complex, which challenges
analytical solutions and computing resources. In some cases, studying a system or a
situation in the real world analytically or experimentally is either impractical or
unable to yield good results due to limited resources and/or human knowledge.
Computer simulation, on the other hand, can be used to solve those problems.
Simulation is a research method and process to produce the functions, behaviors,
and outcomes of a physical system based on a certain algorithm and using a
computer technology in a digital and virtual environment. As a type of virtual
experimental study, the information and conditions in a physical world are modeled
into a virtual world.
Computer Simulation
We may simulate physical systems of any type, such as electrical, mechanical,
operational, human actions, and so on. Modeling is a central part of computer
simulation, which is based on scientific and engineering principles. Most
simulation work uses commercial software and sometimes may need additional
programming efforts.
To make a simulation study feasible, a simplification of real-world situations is
necessary; some assumptions and parameters (including input data) are determined
based on the assumptions. Therefore, a simulation may not be 100% representative
of the corresponding real world, but hopefully it is close. We should check the output
of a simulation for its validity against the real situations if possible. In many cases,
we can also perform sensitivity analyses for the parameters to
improve the accuracy of the simulation results.
Common Types of Simulation
There are a few types of computer simulation, which may be categorized in
different ways with significant overlaps between some types. Figure below shows
the common types of computer simulation.
Common Types of Simulation
Based on a deterministic model, a computer simulation generates output that is
fully determined by the parameters and the initial conditions. Due to the
complexity of the real world, deterministic models may be used as an
approximation of reality with simplified inputs and assumptions.
Continuous simulation and discrete event simulation are distinguished by whether the
state variables change in a continuous way or at a countable number of points in
time. Therefore, the choice between continuous and discrete event simulation
depends on the nature and states of the physical system and the data types. Both
continuous and discrete event simulation may be either deterministic or stochastic.
Common Types of Simulation
We often use the term dynamic simulation, which models the time-varying
behaviors, characteristics, and outputs of a physical system. Dynamic simulation
can be viewed as consisting of two major parts: simulation calculation over time and
real-time graphic animation, which are integrated into powerful software
packages. The simulation method itself may be based on various types of computer
calculation, such as mathematical equations and numerical analysis. Dynamic
simulation is widely used in industrial applications, such as vehicle performance
modeling and movements of complex equipment.
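As a hedged illustration of dynamic simulation, the sketch below steps a very simple vehicle braking model through time and records the time-varying state; all parameter values are assumed:

```python
# Minimal sketch (hypothetical parameters): a dynamic simulation that steps
# a simple vehicle braking model through time and records its state.
DT = 0.1                   # time step, s
MU, G = 0.7, 9.81          # friction coefficient and gravity
v, x, t = 27.8, 0.0, 0.0   # initial speed (m/s), position (m), time (s)

history = []
while v > 0:
    a = -MU * G                 # constant braking deceleration
    v = max(v + a * DT, 0.0)    # update the state variables over time
    x += v * DT
    t += DT
    history.append((round(t, 1), round(v, 2), round(x, 1)))

print("recorded", len(history), "time steps")
print("stopped after", round(t, 1), "s over", round(x, 1), "m")
```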
Common Types of Simulation
As emphasized above, we need to validate the simulation outcomes. A common
way is to use physical experiments under the same settings and parameters. The
simulation results have a good validity if they are in agreement with the physical
experimental observations.
Sampling for quantitative methods
• Commonly in our research or programmatic data collection, it is not possible or
even desirable, to collect data from a whole target group or population.
• This could be extremely difficult and expensive.
• Through accurate sampling of a subset of the population we can reduce costs and
gain a good representation from which we can infer or generalize about the total
population.
• Accurate sampling requires a sample frame or list of all the units in our target
population. A unit is the individual, household or school (for example) from
which we are interested in collecting data.
Bias
• The process of recruiting participants for quantitative research is quite different
from that of qualitative research.
• In order to ensure that our sample accurately represents the population and
enables us to make generalisations from our sample we must fulfil a number of
requirements.
• Sampling bias can occur if decisions are made about sample selection that mean
that some individuals have a greater chance of being selected for the sample than
others.
• Sample bias is a major failing in our research design and can lead to
inconclusive, unreliable results. There are many different types of bias. For
example, tarmac bias relates to our tendency to survey those villages that are
easily accessible by road. We may be limited in our ability to travel to many
places due to lack of roads, weather conditions etc. which can create a bias in our
sample.
Bias
• Self-selection or non-response bias is one of the most common forms of bias
and is difficult to manage.
• Participation in questionnaire/surveys must be on a voluntary basis.
• If only those people with strong views about the topic being researched
volunteer then the results of the study may not reflect the opinions of the wider
population creating a bias.
Definitions
• Quantitative Data Collection: Gathering numerical data that can be measured,
counted, or expressed in numbers. Involves statistical, mathematical, or
computational techniques.
• Qualitative Data Collection: Gathering non-numeric, descriptive data. Focuses
on understanding meanings, perceptions, experiences, and processes.
Why Use Both? Mixed Methods
• Combines strengths of both: quantitative gives breadth, qualitative gives depth.
• In aerospace and automotive, complex systems often require both: e.g. sensor
data (quantitative), user/operator feedback (qualitative).
• Helps in validation: qualitative insights can explain anomalies in quantitative
data.
Quantitative and Qualitative Methods
Quantitative Methods
• Sensors & Telemetry (speed, vibration, temperature, accelerometers).
• Statistical/Mathematical Modeling & Simulations.
• Surveys / Questionnaires (e.g., safety culture surveys).
• Operational Data & Logs (maintenance, fuel, errors).
• Performance Metrics (turnaround time, emissions, costs).
Qualitative Methods
• Interviews with pilots, engineers, technicians, customers.
• Observations of operations: assembly, maintenance, flight ops.
• Focus Groups for usability, safety, innovation challenges.
• Document & Policy Analysis (manuals, certification docs).
• Case Studies (fleet or company-level insights).
Applications / Use-Cases
• Maintenance compliance: Quantitative = error rates, Qualitative = interviews
with technicians.
• Safety culture: Quantitative = survey scales, Qualitative = focus groups.
• Operational efficiency: Quantitative = turnaround time metrics, Qualitative =
crew observations.
• Systems/software development: Quantitative = usage logs, Qualitative = user
interviews.
Special Issues in Aerospace & Automotive
• Safety & certification requirements.
• High costs & risks of testing.
• Long product life cycles.
• Human factors & organizational culture.
• Data privacy and ownership.
Characteristics Of Quantitative Data
The quantitative data are collected by administering the research tools. These should
possess the following characteristics:
1. The quantitative data should be collected through standardized tests. If a self-made
test is used, it should be reliable and valid.
2. They are highly reliable and valid. Therefore, generalization and conclusions
can be made easily with certain level of accuracy.
3. The obtained results through quantitative data can be easily interpreted with
scientific accuracy. The level of significance can also be determined.
4. The scoring system of quantitative data is highly objective.
5. The use of quantitative data is always based upon the purpose of the study.
Specific psychometric tests are used depending on the investigation.
6. Inferential statistics can be used with the help of quantitative data.
7. The precision and accuracy of the results can be obtained by using quantitative
data in an educational research.
Primary Data Collection Summary
Research in aviation and automotive engineering requires systematic data
collection.
Data collection methods determine reliability and validity.
Data analysis transforms raw data into insights.
Primary Data Collection
1. Surveys & Questionnaires
- Aviation: Passenger comfort surveys
- Automotive: Customer feedback on efficiency
2. Interviews
- Aviation: Pilots on cockpit workload
- Automotive: Engineers on autonomous systems
Primary Data Collection Summary
3. Observations
- Aviation: Ground crew operations
- Automotive: Driver behavior at intersections
4. Experiments & Tests
- Aviation: Wind tunnel tests
- Automotive: Crash tests
5. Simulations
- Aviation: Flight simulators
- Automotive: Driving simulators
6. Sensors & Instrumentation
- Aviation: Flight data recorders
- Automotive: Telematics & OBD-II
Secondary Data Collection Summary
Sources of secondary data:
- Aviation: ICAO accident databases, FAA safety reports
- Automotive: Government road safety statistics, manufacturer reports
- Industry-wide studies, published research
Qualitative Analysis
Focus: Non-numerical data (opinions, experiences).
Methods:
- Thematic analysis (passenger complaints)
- Content analysis (maintenance logs)
Examples:
- Aviation: Passenger safety concerns
- Automotive: Drivers’ perceptions of electric cars
Reasons for using secondary data
1. Because collecting primary data is difficult, time-consuming and expensive.
2. Because you can never have enough data.
3. Because it makes sense to use it if the data you want already exists in some form.
4. Because it may shed light on, or complement, the primary data you have collected.
5. Because it may confirm, modify or contradict your findings.
6. Because it allows you to focus your attention on analysis and interpretation.
7. Because you cannot conduct a research study in isolation from what has already
been done.
8. Because more data is collected than is ever used.
Quantitative Analysis
Focus: Numerical data, statistical methods.
Techniques:
1. Descriptive Stats – Average delay time, fuel consumption
2. Inferential Stats – Hypothesis testing (wing drag reduction, tire braking)
3. Correlation/Regression – Engine thrust vs fuel burn, engine size vs mileage
4. Time-Series – Passenger traffic forecasting, EV adoption
5. Reliability/Safety Analysis – FTA, FMEA
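As one worked example of the inferential techniques listed above, the sketch below runs a two-sample t-test on hypothetical braking distances for two tire compounds; SciPy is assumed to be installed:

```python
# Minimal sketch (hypothetical measurements): a two-sample t-test comparing
# braking distances (m) recorded for two tire compounds.
from scipy import stats

compound_a = [41.2, 40.8, 42.0, 41.5, 40.9, 41.7]
compound_b = [43.0, 42.6, 43.4, 42.8, 43.1, 42.5]

t_stat, p_value = stats.ttest_ind(compound_a, compound_b)
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")
if p_value < 0.05:
    print("Reject H0: the mean braking distances differ")
```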
Measurement Scales
Researchers need to measure several characteristics of the variables under study. It
is essential to have a numerical method for describing observations.
Four scales of measurement are commonly used depending upon the nature of
variables;
(1) the nominal or classificatory scale,
(2) the ordinal or ranking scale,
(3) the interval scale, and
(4) the ratio scale.
The data scale specifies the categories of measurements and data. Based on the
measurement scale, we may consider data nominal, ordinal, etc. The scale of data
also determines the measurement procedure and the subsequent data analysis.
Measurement Scales
Measurement scales define how variables are categorized, ordered, or quantified.
They guide the choice of statistical analysis.
Four main scales: Nominal, Ordinal, Interval, Ratio.
Nominal
Definition: Categories only, no order or magnitude.
Aviation Examples:
- Types of aircraft (Commercial Jet, Cargo Plane, Fighter Jet)
- Pilot license categories (PPL, CPL, ATPL)
Automotive Examples:
- Vehicle types (Sedan, SUV, Truck, Electric Car)
- Fuel type (Petrol, Diesel, Electric, Hybrid)
Stats: Frequency, mode, chi-square test.
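A short sketch of such a chi-square test on nominal data is given below; the vehicle-type versus fuel-type counts are hypothetical and SciPy is assumed to be installed:

```python
# Minimal sketch (hypothetical counts): chi-square test of independence
# between two nominal variables - vehicle type and fuel type.
from scipy.stats import chi2_contingency

#              Petrol  Diesel  Electric
counts = [[40,     25,     15],    # Sedan
          [30,     35,     10],    # SUV
          [10,      5,     20]]    # Truck

chi2, p, dof, expected = chi2_contingency(counts)
print(f"chi-square = {chi2:.1f}, dof = {dof}, p = {p:.4f}")
```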
Nominal Scale
• Nominal scale (also called classificatory scale or categorical scale) is the most
elementary method of quantification of observation, and the least precise.
• This scaling is done for nominal variables. A nominal scale describes differences
between characters by assigning them to categories.
• By using nominal scale, you can classify animate beings, inanimate objects, or
events into a number of mutually exclusive categories, so that each member of
the subgroups has some characteristics in common.
• For example, the variable ‘gender’ can have two categories, males or females;
and considering ‘educational qualification’, six categories can be suggested, non-
literate, literate, matriculate, graduate, post-graduate, and doctorate. See the
fictitious data in Table in next slide.
Nominal Scale
Using nominal scale, you can only allot individuals to a category, but cannot
rank them. Sometimes, code values are given for processing data, for example,
1 for female and 2 for male. Individuals having a single value are alike and
those with different values are different.
The labels tell us that the categories are
qualitatively different from each other.
However, they have no quantitative
significance, implying that they cannot be
added, subtracted, multiplied, or divided.
Probably, the only arithmetic operation
possible with nominal scales is counting.
Nominal Scale
• Nominal data are normally analyzed using simple graphic and statistical
techniques, such as bar charts and pie charts.
• Nominal data, and their combination with other types of data, are used in
engineering disciplines.
• For example, “Calmness of partially perturbed linear systems with an application
to the central path”.
• “Fast and efficient prediction of finned-tube heat exchanger performance using
wet-dry transformation method with nominal data” (Zhou et al. 2018)
Ordinal Scale
• With a nominal scale, the researcher may only be able to indicate that certain
things differ as they fit into certain categories. However, with an ordinal scale
(also called ranking scale); one may be able to assert the amount or degree of
their differences. For ordinal variables, ordinal scaling is done.
• Note that they will have all the properties of a nominal scale; but in addition, it is
possible to rank the subgroups in a certain order. They can be arranged either in
ascending or descending ranks according to the magnitude of variation, but
actual differences between adjacent ranks may not be equal, as they have no
absolute values.
• All the properties of nominal scale can be applied to ordinal scores too, but not
vice versa.
• For ordinal scale, frequency can be identified. As in the case of nominal scores,
in ordinal scale too, individuals with the same scores are treated alike.
Ordinal Scale
After tabulating and ranking the scores, these can be subjected to analysis. Consider
the example given in Table below.
In the example, ranks were allotted based on the
degree of adoption of integrated pest management
(IPM) in rice by farmers. The lowest score of ‘0’ was
given to ‘no adoption’ and the highest rank of ‘5’ was
given to ‘full adoption’. In a similar way, if you want
to study the level of infestation of a disease in a crop,
scores can be given by visually observing the disease
infestation and assessing its severity. The scores can
be arranged either in increasing or in decreasing
order.
Ordinal Scale
If 0–5 is the range of scores to be given for the level of intensity of infestation, you
can give score 0 for no infestation and 5 for the severest infestation. The scores
would be 0- no infestation, 1—very slight infestation, 2—slight infestation, 3—
moderate infestation, 4—severe infestation, and 5—very severe infestation. You can
rank the units as 5 > 4 > 3 > 2 > 1 > 0.
Ordinal Scale
Definition: Ordered data without equal intervals.
Aviation Examples:
- Passenger satisfaction ratings (Excellent, Good, Fair, Poor)
- Airport safety ranking (Category A, B, C)
Automotive Examples:
- Crash test safety ratings (1-star to 5-star)
- Customer preference ranking of brands
Stats: Median, percentiles, Spearman’s correlation.
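A brief sketch of Spearman's rank correlation on ordinal data is given below, using invented crash-test star ratings and customer preference ranks; SciPy is assumed to be installed:

```python
# Minimal sketch (hypothetical rankings): Spearman's rank correlation for
# ordinal data - crash-test star ratings versus customer preference ranks.
from scipy.stats import spearmanr

star_rating     = [5, 4, 4, 3, 5, 2, 3, 4]   # 1-5 stars per vehicle model
preference_rank = [1, 3, 2, 6, 1, 8, 5, 4]   # lower rank = more preferred

rho, p = spearmanr(star_rating, preference_rank)
print(f"Spearman rho = {rho:.2f}, p = {p:.3f}")
```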
Interval Scale
• Although the ordinal scale is an improvement over the nominal scale, it is not
without problems.
• As the real differences between adjacent ranks are not equal with ordinal scale,
the differences between two values do not have any specific meaning.
• If our intention is to explain such differences, we must go for interval scales.
Interval level scores are in a position higher than ordinal scores as the
differences or intervals convey some more meaning.
• All the characters of ordinal values are applicable in interval level scores too, but
additionally, interval scale uses a measurement unit, which allow the responses
to be placed at equally spaced intervals in relation to the spread of the variable.
For example, the difference between 3 and 4 is the same as the difference between
10 and 11.
Interval Scale
• Because the interval is same in both cases, the differences between the characters
are also the same. Interval scale is applicable for interval variables coming under
continuous variables.
• Most quantitative measurements are amenable to interval-level measurement.
• The interval scale has a starting point and a terminating point divided into
equally spaced intervals.
• Nevertheless, the starting and terminating points and the number of intervals
between them are arbitrary and vary from scale to scale.
• A common example of interval scale is temperature measurement scales, Celsius
and Fahrenheit, which start with different points of origin.
• The point of origin for the same natural phenomenon, the freezing point of
water, is 0 on the Celsius scale and 32 on Fahrenheit scale. Each degree or
interval is a measurement of temperature; however, as the starting point and
terminating points are arbitrary, they are not absolute.
Interval Scale
• Therefore, saying that 100 °C is twice as hot as 50 °C is wrong. The main limitation
of the interval scale is the absence of a true zero, and you cannot measure the complete
absence of a feature using this scale. Because it is a relative scale, ratios of its
readings are not meaningful.
Interval Scale
Definition: Equal intervals, no true zero.
Aviation Examples:
- Cabin temperature (°C, °F)
- Time zones (UTC offsets)
Automotive Examples:
- Engine temperature (°C)
- Tire pressure (if offset-adjusted)
Stats: Mean, SD, correlation, regression, t-tests, ANOVA.
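As a hedged example of the ANOVA mentioned above, the sketch below compares mean cabin temperatures recorded on three aircraft; the readings are hypothetical and SciPy is assumed to be installed:

```python
# Minimal sketch (hypothetical readings): one-way ANOVA comparing mean cabin
# temperature (interval data, deg C) recorded on three aircraft.
from scipy.stats import f_oneway

aircraft_1 = [21.5, 22.0, 21.8, 22.3, 21.9]
aircraft_2 = [23.1, 22.8, 23.4, 23.0, 22.9]
aircraft_3 = [21.9, 22.1, 22.0, 22.4, 21.7]

f_stat, p = f_oneway(aircraft_1, aircraft_2, aircraft_3)
print(f"F = {f_stat:.2f}, p = {p:.4f}")
```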
Ratio Scale
• A ratio scale, in addition to having equal interval properties of an interval scale,
has two additional features. It has a true zero, meaning a fixed starting point.
• Therefore, it is possible to indicate the complete absence of a property.
• Another feature is that the numerals of the ratio scale have the qualities of real
numbers and can be added, subtracted, multiplied, and divided, and expressed in
ratio relationships.
• Measurement of height, weight, area, income, and age are examples of this scale.
• Ratio scale is applicable for ratio variables coming under continuous variables.
• In physical and natural sciences, variables are mostly expressed in ratio scale.
• However, behavioural sciences such as sociology and psychology are generally
limited to describing variables in ordinal scale, nominal scale, or interval scale
warranting the use of nonparametric tests.
Ratio Scale
• The interval scale with a natural origin is called a ratio scale.
• A length measurement is an example of such a ratio scale.
• Different from interval data, ratio data have a true zero point. Ratio data are in
relation to a zero value (e.g. a distance).
• Both differences and ratios have real meanings and are interpretable. For
instance, zero inches and zero centimetres represent exactly the same thing.
• The advantage of ratio data is that they can express values in terms of multiples
or fractional parts and offer a wealth of possibilities for various analyses.
Ratio Scale
Definition: Equal intervals and true zero. Ratios are meaningful.
Aviation Examples:
- Aircraft speed (knots, Mach)
- Fuel consumption (kg/hour)
- Altitude (feet, meters)
Automotive Examples:
- Vehicle speed (km/h, mph)
- Fuel efficiency (liters/100km)
- Distance traveled (km)
Sampling Methods
• Sampling is a process used in statistical analysis in which a predetermined
number of observations are taken from a larger population.
• Sampling matters greatly in research: it is one of the most important factors that
determine the accuracy of your research or survey results. If anything goes wrong
with your sample, it will be directly reflected in the final result.
• A sample is a subset of the population. The process of selecting a sample is
known as sampling, and the number of elements in the sample is the sample size.
Sampling
• A population is a group of individuals having one or more characteristics in
common. In some cases, you may attempt to obtain information from all the
elements of a population or complete enumeration of a population as in a census.
• A researcher, however, attempts a census only if the population is
sufficiently small. When the population is large, it is usually impractical to collect
information from all the members of the population, and the researcher has to plan
for some shortcuts.
• A widely accepted procedure is to collect information from representative
samples of the population. If this subset is a true representative of the overall
population and exhibits similar characteristics to any randomly chosen division
of the population, then the generalization or conclusion may have applicability to
behaviour of the entire population.
Advantages of Sampling
The process of sampling enables us to draw generalizations based on careful
observations.
A measured value based on sample data is called a statistic. You can use this
statistic to estimate the characteristics of a population. A population value estimated
from a statistic is a parameter (see the sketch after this list). The following are the
merits of sampling over complete enumeration, or census, of a population.
• Reduced cost: If the data are secured from only a small fraction of the
population, naturally, expenditure is less.
• Greater speed: Sample data can be collected and summarized more
quickly than with complete enumeration.
• Greater scope for accuracy: studies that rely on sampling can be supervised
more carefully than a complete enumeration.
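The sketch below (illustrative only, with a simulated population of hypothetical vehicle speeds) shows the relationship between a statistic and a parameter: the sample mean is computed from a random sample and used to estimate the population mean.

```python
# A sample statistic (sample mean) estimating a population parameter.
import numpy as np

rng = np.random.default_rng(seed=42)
# Simulated population of 100,000 hypothetical vehicle speeds (km/h).
population = rng.normal(loc=120.0, scale=15.0, size=100_000)

parameter = population.mean()                        # normally unknown in practice
sample = rng.choice(population, size=200, replace=False)
statistic = sample.mean()                            # computed from the sample only

print(f"Population mean (parameter): {parameter:.2f}")
print(f"Sample mean (statistic)    : {statistic:.2f}")
```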
Probability Sampling Methods
Types of probability sampling methods
Sampling Methods
• Sampling methods are classified as probability sampling and non-probability
sampling.
• In probability sampling, each constituent of the population has a known
probability of being selected.
• Because of this character, sampling error can be estimated—a major advantage
of probability sampling.
• When extrapolating data from samples to that of population, values are presented
plus or minus the sampling error.
• Common probability methods include simple random sampling, systematic
random sampling, and stratified random sampling.
• This sampling technique uses randomization to make sure that every element of
the population gets an equal chance of being part of the selected sample; it is
alternatively known as random sampling.
Sampling strategies
Probability sampling:
• Simple random sampling – selection at random
• Systematic sampling – selecting every nth case
• Stratified sampling – sampling within groups of the population
• Cluster sampling – surveying whole clusters of the population sampled at
random
• Stage sampling – sampling clusters sampled at random
The most widely understood probability sampling approach is probably
random sampling, where every individual or object in the group or ‘population’
of interest (e.g. MPs, dog owners, course members, pages, archival texts)
has an equal chance of being chosen for study.
Sampling strategies
Non-probability sampling:
• Convenience sampling – sampling those most convenient
• Voluntary sampling – the sample is self-selected
• Quota sampling – convenience sampling within groups of the population
• Purposive sampling – handpicking supposedly typical or interesting cases
• Dimensional sampling – multi-dimensional quota sampling
• Snowball sampling – building up a sample through informants
Other kinds of sampling:
• Event sampling – using routine or special events as the basis for sampling
• Time sampling – recognizing that different parts of the day, week or year may be
significant
Sampling strategies illustrated
Probability - Simple Random Sampling
• In simple random sampling, a common practice is to avoid choosing any member of the
population more than once, i.e., sampling without replacement.
• This practice is important when a population is small.
• Simple random sampling is easy to use with minimum knowledge of the
population.
• If the information is available about the population, other types of sampling may
be more efficient.
• When you attempt sampling, ensure that every constituent of the population has
an equal chance of being selected.
• You have to fix up the number of samples based on the size of population, and
every element of population has the same probability of being selected.
• Data collection will be made from the representative samples only and not from
the entire population.
Sampling Methods
Simple Random Sampling: Every element has an equal chance of being selected
as part of the sample. It is used when we do not have any prior information
about the target population.
⚫ For example: random selection of 20 students from a class of 50 students, where each
student has an equal chance of being selected (see the sketch below).
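A minimal sketch of the example above, assuming a hypothetical roster of 50 students; `random.sample` draws without replacement, so each student is equally likely to be chosen.

```python
# Simple random sampling without replacement: 20 students out of 50.
import random

students = [f"student_{i:02d}" for i in range(1, 51)]   # hypothetical class roster
sample = random.sample(students, k=20)                  # each student equally likely

print(sorted(sample))
```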
Stratified Sampling
This technique divides the elements of the population into small subgroups (strata)
based on similarity, in such a way that the elements within a group are
homogeneous while the subgroups are heterogeneous with respect to one another.
The elements are then randomly selected from each of these strata. We need prior
information about the population to create the subgroups.
Sampling Methods
Cluster Sampling
• The entire population is divided into clusters or sections, and then the clusters are
randomly selected. All the elements of the selected clusters are used in the sample.
Clusters are identified using details such as age, sex, location, etc.
Sampling
Degree of precision
• Information generated through sample collection is prone to some uncertainties.
• A major problem is the occurrence of sampling errors, as only a part of the
population is measured in sampling.
• Although there will not be any sampling errors in complete enumeration,
possibilities of non-sampling errors are greater.
• If you want a high degree of precision, take larger samples and use superior
measuring instruments.
Sampling and Non-sampling Errors
• In sampling, instead of studying a whole population, you study a fraction of the
population, the sample, and infer the situation based on that sample. This is
actually an inductive process.
• Inferences are made by observing a small section of people, animals, or plants
and extrapolating the findings to the whole population they represent.
• Note that representative sampling is a prerequisite for successfully averaging out
random errors and avoiding systematic errors. The quality and usefulness of your
inference depends on how representative the sample is.
Sampling and Non-sampling Errors
• The error occurring because of the likely faults in the sampling process is called
sampling error. Sampling error is the extent to which a sample drawn from a
population differs from the original population.
• Errors can also occur due to other reasons like errors in measurement,
investigator bias during data processing, and interpretation.
• Such errors are called non-sampling errors. They also include personal errors, for
example, information that respondents hide or forget.
• When the population is large and the sample size is small, the chance of error is
greater. The degree of precision can be increased by taking larger samples and by
using superior measuring instruments.
Sampling and Non-sampling Errors
• In a census, sampling errors are absent and only non-sampling errors occur, while in
sampling, both types of error occur.
• When the sample size increases, sampling error decreases, but non-sampling
error tends to increase (see the sketch below).
• When the sample size becomes equal to the population, there is no sampling
error, as there is no difference between the population and the sample.
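A minimal simulation (illustrative only) of how sampling error shrinks as the sample size grows: repeated samples of increasing size are drawn from a simulated population, and the average distance between the sample mean and the true mean is reported.

```python
# Sampling error decreases roughly as 1/sqrt(n) when the sample size n grows.
import numpy as np

rng = np.random.default_rng(seed=0)
population = rng.normal(loc=100.0, scale=20.0, size=50_000)  # simulated population
true_mean = population.mean()

for n in (10, 100, 1_000, 10_000):
    errors = [abs(rng.choice(population, size=n, replace=False).mean() - true_mean)
              for _ in range(200)]                           # 200 repeated samples
    print(f"n = {n:>6}: average sampling error ≈ {np.mean(errors):.3f}")
```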
Systematic Random Sampling
Systematic Sampling
We often use systematic sampling for a large list, to select elements from an
ordered sampling frame. We pick every kth element from a complete set of N
elements; thus, the sample size is n = N/k. In other words, the fixed interval k is
determined by k = N/n when N is known and n is decided. An example of
systematic sampling is a study of “Automated Pre-Seizure Detection for
Epileptic Patients Using Machine Learning Methods”.
A simple random sampling plan is often preferred over a systematic sampling plan
because random sampling helps avoid subjective selection of samples. In fact,
systematic sampling involves little subjectivity if the samples are determined in
advance. Systematic sampling can be better than simple random sampling when n is
large, because it gives more uniform coverage of the entire population.
Systematic Random Sampling
• Sometimes, a variation of simple random sampling—systematic random
sampling—is adopted if the entire population is finite or can be listed.
• It is also called an Nth-name selection technique. First, the required sample size
is determined; suppose it is 50 from a population of 1000.
• Then, the sampling interval is found by dividing the population size by the
sample size (1000 / 50 = 20). One selects the first item at random, and then every
20th (Nth) item is selected from the population list until the sample of 50 items is
completed.
• For example, if the first sample is the 6th item, subsequent samples will be the 26th,
46th, 66th, and so on until we get 50 samples (see the sketch below). This kind of
systematic sampling is as good as random sampling.
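A minimal sketch of the Nth-item procedure described above (N = 1000, n = 50, so k = 20), with a random start within the first interval; the item IDs are purely illustrative.

```python
# Systematic (every kth item) sampling from an ordered frame of 1000 items.
import random

N, n = 1000, 50
k = N // n                              # sampling interval, here 20
frame = list(range(1, N + 1))           # ordered sampling frame of item IDs

start = random.randint(0, k - 1)        # random start within the first interval
sample = frame[start::k]                # then every kth item

print(len(sample), sample[:5])          # 50 items, e.g. [6, 26, 46, 66, 86]
```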
Stratified Random Sampling
• Stratified random sampling is a commonly used probability method when large
samples are involved. In such cases, it is considered superior to random
sampling as it helps to reduce sampling error.
• Stratified sampling is adopted usually for sampling heterogeneous populations.
• The population is divided into mutually exclusive groups called strata, and simple
random sampling is used to collect samples from each group.
• Before going for sampling, the researcher has to identify relevant strata with
some common characteristics and their actual representation in the population.
• Afterwards, for each stratum, subjects are selected in proportion to its frequency
in the population using a random sampling procedure (see the sketch below).
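A minimal sketch of proportional stratified sampling, assuming three hypothetical strata of vehicle owners; the quota drawn from each stratum matches its share of the population, and selection within each stratum is random.

```python
# Stratified random sampling with proportional allocation.
import random

strata = {                                   # stratum name -> member IDs (hypothetical)
    "sedan_owners": [f"S{i}" for i in range(600)],
    "suv_owners":   [f"U{i}" for i in range(300)],
    "ev_owners":    [f"E{i}" for i in range(100)],
}
total = sum(len(members) for members in strata.values())
sample_size = 50

sample = []
for name, members in strata.items():
    quota = round(sample_size * len(members) / total)    # proportional to the stratum
    sample.extend(random.sample(members, k=quota))       # random within the stratum

print(len(sample))   # 50 in total, split 30 / 15 / 5 across the strata
```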
Cluster Sampling
Cluster sampling is similar to stratified sampling. Sometimes, researchers divide a
population into natural groups based on existing groupings. The sample sizes may
differ; for example, the sampling probability may be proportional to cluster size. The
groups are called clusters in cluster sampling. An example is an electrical study on
“An I/O Efficient Distributed Approximation Framework Using Cluster Sampling”.
The accompanying table shows a comparison between stratified sampling and
cluster sampling. A population may or may not be uniform or homogeneous, so care
is needed when using these two sampling methods, and the report should discuss the
possibility of the data not being accurately representative (see the sketch below).
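A minimal sketch of one-stage cluster sampling, assuming hypothetical service-centre clusters: a few clusters are chosen at random, and every element of the chosen clusters enters the sample.

```python
# One-stage cluster sampling: pick clusters at random, then take all their members.
import random

clusters = {                                           # hypothetical clusters
    f"service_centre_{c}": [f"{c}-{i}" for i in range(random.randint(20, 40))]
    for c in "ABCDEFGH"
}

chosen = random.sample(list(clusters), k=3)            # randomly select 3 clusters
sample = [member for c in chosen for member in clusters[c]]

print("Chosen clusters:", chosen)
print("Sample size    :", len(sample))
```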
Multistage Sampling
• It is a type of stratified sampling suitable for infinite populations, where a list of
members is absent, or when the individuals are living in widely scattered groups.
• This is also called cluster or area sampling.
• The population is first divided into different stages, and random samples are
drawn.
• Initially, the population is divided into first stage sampling units, from which a
random sample is selected.
• This sample is then divided into second stage units, and again a sample is
selected.
• In this way, a random sample is selected at each stage. There must be at least two
stages in this type of sampling.
Non-Probability Sampling Methods
• In non-probability sampling, samples are chosen from the population using some
non-random procedures.
• Non-probability sampling methods include convenience sampling, quota
sampling, judgment sampling, and snowball sampling.
• A major disadvantage of non-probability sampling is that the extent to which the
sample differs from the population remains unknown, and therefore, it is very
difficult to estimate sampling error.
• Non-probability sampling methods are generally used for qualitative studies.
Non-probability Sampling Methods
Types of Non-probability Sampling
In some situations, it is impossible to know the sampling probability, and we may
have to use non-probability sampling methods, which do not follow the random or
statistical requirements. However, it may still be useful to select samples from a
population in this way. The common types of non-random sampling include
convenience, judgment (expert), purposive, quota, and snowball sampling, discussed
below.
Characteristics of Non-probability Sampling
• Any type of sampling method has some appropriate utility under certain
circumstances.
• Due to convenience and data availability, we use non-random sampling
methods in many cases, either explicitly or implicitly.
• Research results based on non-random sampling may have good reference
value but are largely limited to particular, special conditions.
• In other words, the selection of a sampling method for a research project can be
case dependent, and researchers need to study the characteristics of sampling
methods while considering practical issues.
Characteristics of Non-probability Sampling
• In general, non-probability sampling techniques cannot produce a general
conclusion about the whole population. We cannot guarantee that the sample
data represent the population well, because some characteristics of the population
have little or no chance of being included in the study. Therefore, the
disadvantages of non-probability sampling approaches often outweigh the advantages,
because the results have limited value for general situations.
Convenience Sampling
• Convenience sampling is a non-probability sampling technique, which is also
called accidental sampling. It is most often used in descriptive research where
the concern of researchers is to get an inexpensive estimate of facts.
• As the name implies, selection of samples is based on convenience. This non-
probability method is often used during preliminary stages of research to get a
rough estimate of the results.
• Using this approach, we select only samples that are readily available as the
opportunity arises.
• As a result, the extent to which the sample is representative of the target
population is unknown.
• The analysed results are unlikely to be accurate for the target population. We
should avoid this type of sampling where possible, or use it only for a preliminary
study.
Judgment/Expert Sampling
• Judgment sampling/Expert sampling is another commonly used non-probability
method.
• The researcher selects the sample based on some judgment, especially when the
entire population is inaccessible.
• This is actually an extension of convenience sampling.
• Suppose that an investigator has to take samples from several districts. The
researcher may decide to select samples from one representative district only
instead of several districts, having been convinced that the chosen district is truly
representative of all the districts to be sampled.
Judgment/Expert Sampling
• The first step of using expert sampling is to define and identify the expert, which
may be subjective as well. In addition, even a true expert can be wrong.
• A study that used the sampling method stated, “The definition of an expert is
clearly open to interpretation as an ‘expert’ may very well be in the eye of the
beholder, and an improper interpretation on the part of researchers may lead to a
biased sample of participants that fails to adequately represent a population”
• Although bias can occur in judgment sampling, you can still obtain a good
representation of the population if the selection is done objectively.
Purposive Sampling
• We select samples for a particular objective based on our knowledge and
professional judgment.
• This type of sampling may be acceptable only for special situations. In such
cases, we should explain why the particular samples are selected.
• We may use this sampling method to measure a difficult-to-reach population. An
example of using purposive sampling is “Designing a local Flexible Model for
Electronic Systems Acquisition Based on Systems Engineering, Case Study:
Electronic high-tech Industrial” (Karbasian et al. 2016). In general, purposive
sampling can introduce or increase research(er) bias.
Quota Sampling
• Quota sampling is another non-probability sampling method, broadly similar to
stratified sampling.
• In quota sampling too, the researcher first identifies the strata and their
proportions as they are represented in the population.
• However, after selecting the strata, samples are drawn from each stratum using
convenience or judgment sampling, unlike stratified sampling,
where each stratum is filled by random sampling.
• We conveniently select samples within a subgroup of the general population,
but not in a random fashion.
• Thus, quota sampling is a type of convenience sampling. The results may be
true for the subgroup but probably not for the entire population.
• There are very few examples of quota sampling in technical fields.
Snowball Sampling
• Snowball sampling is a special non-probability sampling method for situations
where the desired sample characteristic is rare.
• It is also called chain sampling or referral sampling. On certain occasions,
locating experimental subjects may be very difficult or costly. In such cases, the
researcher can ask for referrals from initial subjects to generate additional subjects.
• It begins with a few cases and spreads out on the basis of links or referrals to the
initial cases.
• This method is sometimes used in qualitative research. For instance,
snowball sampling has been used to study what makes research software sustainable.
• Although this technique substantially reduces the difficulty of
locating samples and cuts the cost of searching, it may increase sampling bias.
Generalisations and interpretation
• If a hypothesis is tested and upheld several times, it may be possible for the
researcher to arrive at generalisation, i.e., to build a theory.
• As a matter of fact, the real value of research lies in its ability to arrive at certain
generalisations.
• If the researcher had no hypothesis to start with, they might seek to explain the
findings on the basis of some theory; this is known as interpretation.
• The process of interpretation may quite often trigger new questions, which in
turn may lead to further research.
Mixed-Method Approaches
Combination of Two Types of Methods
Characteristics of Mixed Methods
When both quantitative and qualitative data are available, we have an opportunity to
use and analyse both of them. Qualitative research, for example, may include
quantitative components and data, such as categories and frequencies, and vice
versa. Using both quantitative and qualitative methods is called mixed methods or
multi-methods. As discussed, quantitative and qualitative analysis methods have
their own characteristics (see the accompanying table).
Mixed-methods research may draw on the potential strengths of both types of methods.
Using a mixed method may, in principle, offset the weaknesses of each and allow both
exploration and analysis in the same study. Mixed-method approaches may also
provide additional evidence and support for research findings, reduce personal
biases, and improve the validity of results.
Mixed-Method Approaches
Considerations for Using Mixed Methods
We may ask ourselves whether we should use both types of methods to take
advantage of each in one study. However, simply having both qualitative
and quantitative methods in a study does not necessarily mean the methodology
is optimal.
In other words, both methods should have their own purposes and make meaningful
contributions to the study. There are a few factors we may consider when deciding
what type of mixed method to use in a study (see the accompanying table).
Statistical Terms
Some of the words and phrases necessary for understanding the terminology
of statistics are listed below:
Population: A population is the universal group from which data are sampled.
Sample: A sample is a part of a population, provided it represents the whole
population.
Variable: A variable is a characteristic of the members of a population that differs
in quality or quantity from one member to another.
Qualitative variable: A variable differing in quality is called a qualitative variable
or attribute, for example, eye color, gender, religion etc.
Quantitative variable: A variable differing in quantity is called quantitative
variable, for example, age, height, weight etc.
Discrete variable: A discrete variable is one which can take only certain
isolated values, for example, the number of members in a family, the number of errors
in a book, etc.
Introduction to Data Analysis
Definition: Processing and interpreting data to draw conclusions.
Importance: Converts raw measurements into actionable engineering insights.
Steps in Data Analysis
Data cleaning → Remove errors/outliers
Organization → Tables, charts
Descriptive analysis → Mean, median, SD
Inferential analysis → Regression, ANOVA, correlation
Simulation/model comparison → CFD vs experimental results
Visual: Flowchart of data analysis process
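A minimal sketch of the steps listed above on a small, hypothetical data set of engine temperature versus vehicle speed: clean an obvious recording error, describe the data, and then fit a simple regression as an inferential step. All values are illustrative.

```python
# Data cleaning, descriptive statistics, and a simple inferential analysis.
import numpy as np
from scipy import stats

speed = np.array([40, 60, 80, 100, 120, 140, 999])   # km/h; 999 is a recording error
temp  = np.array([82, 85, 88, 90, 93, 96, 5])        # °C; 5 pairs with the error

# 1. Data cleaning: drop the obvious outlier/error.
mask = speed < 300
speed, temp = speed[mask], temp[mask]

# 2-3. Organization and descriptive analysis.
print("Mean:", temp.mean(), "Median:", np.median(temp), "SD:", temp.std(ddof=1))

# 4. Inferential analysis: linear regression (temperature vs speed).
result = stats.linregress(speed, temp)
print(f"temp ≈ {result.intercept:.1f} + {result.slope:.3f} × speed, r = {result.rvalue:.2f}")
```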
Introduction to Data Analysis
• Analysis of data: Soon after the collection of data, the researcher turns to the
process of analyzing the collected data.
• The raw data are refined during this process.
• Several operations are involved in analysis, such as coding, editing, tabulation, and
statistical analysis.
• Data will often be collected in the form of questionnaires or schedules. Responses
recorded in short form are then elaborated through coding.
• Editing can be done at the time of collecting the data or after the data have been collected.
• Through editing, the researcher removes mistakes from the data set and polishes it.
Through tabulation, the researcher prepares the tables.
Introduction to Data Analysis
Purpose of analysis: making sense of data
Quantitative analysis: Descriptive & Inferential statistics
Qualitative analysis: Thematic coding, Content, Narrative analysis
Tools for Data Analysis
Microsoft Excel
SPSS / R / Stata
NVivo / ATLAS.ti (qualitative data)
Visualization: charts, graphs, tables
Presenting and Interpreting Data
• Use charts, graphs, and tables
• Identify patterns and trends
• Relate findings back to research questions/hypotheses
• Ensure clarity for target audience
Tools & Techniques
Software: MATLAB, Python, Excel, LabVIEW, ANSYS, CATIA
Techniques: Regression, ANOVA, visualization, CFD/FEM
Visual: Screenshot examples of MATLAB graphs and CFD plots
Interpretation of Results
Compare with theory, design criteria, or standards
Identify anomalies (e.g., unexpected vibration or airflow separation)
Draw conclusions & recommend improvements
Hypothesis-testing
• After analysing the data as stated above, the researcher is in a position to test the
hypotheses, if any, he had formulated earlier.
• Do the facts support the hypotheses, or do they happen to be contrary? This is the
usual question that should be answered while testing hypotheses.
• Various tests, such as the chi-square test, t-test, and F-test, have been developed by
statisticians for this purpose.
• The hypotheses may be tested through the use of one or more of such tests,
depending upon the nature and object of research inquiry.
• Hypothesis-testing will result in either accepting the hypothesis or in rejecting it.
• If the researcher had no hypotheses to start with, generalisations established on
the basis of the data may be stated as hypotheses to be tested by subsequent research
(see the sketch below for a worked test).
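A minimal sketch of hypothesis testing with a two-sample t-test, using hypothetical stopping-distance measurements for two brake-pad designs; the numbers are illustrative, and the decision rule simply compares the p-value with a chosen significance level.

```python
# Two-sample t-test: do two hypothetical brake-pad designs differ in mean
# stopping distance (m)?
from scipy import stats

design_a = [38.2, 39.1, 37.8, 40.0, 38.5, 39.3]
design_b = [41.0, 40.2, 42.1, 39.8, 41.5, 40.9]

t_stat, p_value = stats.ttest_ind(design_a, design_b)
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")

alpha = 0.05
if p_value < alpha:
    print("Reject the null hypothesis: the mean stopping distances differ.")
else:
    print("Fail to reject the null hypothesis.")
```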
Ethical Considerations
Informed consent
Confidentiality & anonymity
Honesty and transparency
Avoiding bias or manipulation
Ethical considerations in Collection of Data
Any researcher who involves human subjects in research has certain
responsibilities towards them. Since the activities of the subjects are often
closely associated with the data collection process, it is appropriate to consider
ethics here.
The following points have to be considered in the process of data collection:
1. The researcher must protect the dignity and welfare of human sample subjects.
2. The human subjects' freedom to decline participation must be respected,
and the confidentiality of research data must be maintained.
3. The researcher must guard against violation or invasion of privacy.
4. The responsibility for maintaining ethical standards remains with the individual
researcher; the principal investigator or supervisor is also responsible for the actions
of their scholars.
Ethical considerations in Collection of Data
Any researcher anticipating the use of human subjects should consult
ethics statements such as those mentioned above. A researcher should not mention
the names of subjects anywhere in the report. If possible, the names of the institutions
from which the subjects were selected for data collection should not be mentioned
even in the appendix.