
Module 1 Part 1 - Introduction To Statistics & Data Analysis

The document is an introduction to statistics and data analysis, focusing on the role of statistics, data collection, and sampling techniques. It covers key concepts such as uncertainty, variability, and the data analysis process, including steps like data collection, summarization, and interpretation. Additionally, it discusses different levels of measurement and their significance in statistical analysis.

Uploaded by

CHRISTIAN ALER

29/08/2024

INTRODUCTION TO
STATISTICS & DATA ANALYSIS
NCE 3108: Engineering Data Analysis
Module 1 Part 1

PREPARED BY: ENGR. MARC DANIEL LAURINA

No part of this material may be reproduced, distributed or transmitted in any
form or by any means including photocopying or other means without prior
written permission of the owner except for personal academic use and
certain other non-commercial uses permitted by copyright law.

Part 1: Role of Statistics and Data
Analysis Process
1.1. Introduction
1.2. Uncertainty and Variability
1.3. Data Analysis Process
Part 2: Data Collection and Sampling
Techniques
1.4. Data and Measurement
1.5. Data Collection Methods
1.6. Sampling
1.7. Introduction to Design of
Experiments


I. Role of Statistics
and the Data
Analysis Process

1.1. Introduction


Definition

Statistics is defined as the branch of science that deals with the
collection, presentation, organization, analysis, and
interpretation of data (Almeda et al., 2010). Statistics is widely
used in many different fields of science and technology, as
well as in our everyday lives.

ENGINEERING DATA ANALYSIS LECTURES by Engr. Marc Daniel Laurina

Examples of applications of statistics

Applications of statistics include (but are not limited to):
1. Population census (as implemented by the government);
2. Public choices/responses (e.g. surveys related to elections and
government service satisfaction);
3. Product advertisements (e.g. comparison of Brand X vs Brand Y);
4. Teaching and instruction (e.g. students' performance, examination
item analysis etc.);
5. Scientific observations and experiments (e.g. clinical trials for
medicines and vaccines, development of new technologies etc.);
and
6. Engineering data collection (e.g. engineering soil properties,
testing of materials etc.).


Main Branches of Statistics

There are two main branches of statistics: descriptive statistics and
inferential statistics. Descriptive statistics is the branch of statistics
that includes methods for organizing and summarizing
data. Inferential statistics is the branch of statistics that involves
generalizing from a sample to the population from which the sample
was selected and assessing the reliability of such generalizations.


Taxonomy of Statistics


Steps in the Data Analysis Process

A good understanding of statistics is needed in order to understand the ideas
behind data. To do this, one must be able to (Peck et al., 2019):
1. Decide whether existing data are adequate or whether additional information is
required;
2. If necessary, collect more information in a reasonable and thoughtful way;
3. Summarize the available data in a useful and informative manner;
4. Analyze the available data; and
5. Draw conclusions, make decisions, and assess the risk of an incorrect decision.


1.2. Uncertainty
and Variability

Uncertainty

• Concept: Uncertainty in statistics refers to the lack of precise knowledge
about a value or outcome due to limited data or information.
• Uncertainty is about not being sure of your estimation due to limited data.
• Example: If you take a sample of people to find out the average height in a
city, there's some uncertainty because you didn't measure everyone. You
estimate the average height with a margin of error.
• In statistics, uncertainty is often measured using confidence intervals or
standard errors.
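
As a minimal sketch of how a margin of error might be computed for the height example, the snippet below uses only Python's standard library. The height values are made-up illustrative data, and the 1.96 multiplier assumes a normal approximation for a roughly 95% confidence interval (an assumption; small samples would normally use a t-based multiplier instead).

```python
import math
import statistics

# Hypothetical sample of heights in cm (illustrative values only)
heights_cm = [158.0, 162.5, 165.0, 167.5, 170.0, 171.5, 174.0, 176.5, 180.0, 183.0]

n = len(heights_cm)
sample_mean = statistics.mean(heights_cm)
sample_sd = statistics.stdev(heights_cm)     # sample standard deviation (n - 1 divisor)
standard_error = sample_sd / math.sqrt(n)    # standard error of the mean

# Assumed normal approximation: about 95% of estimates lie within 1.96 standard errors
margin_of_error = 1.96 * standard_error
ci_low = sample_mean - margin_of_error
ci_high = sample_mean + margin_of_error

print(f"mean = {sample_mean:.2f} cm, margin of error = {margin_of_error:.2f} cm")
print(f"approximate 95% CI: ({ci_low:.2f}, {ci_high:.2f}) cm")
```

A larger sample shrinks the standard error, which is one way to see that this kind of uncertainty comes from limited data.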

Variability

• Concept: Variability refers to the natural differences or changes that occur in data
or measurements. It’s the range of different possible values.
• Variability is about the natural differences in the data you do have.
• Example: If you measure the height of everyone in a city, you'll find that not
everyone is the same height. Some people are taller, some are shorter, and this
spread in height is variability.
• In statistics, variability is measured using things like range, variance, or standard
deviation.
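
The three measures named above can be sketched with Python's standard library. The heights are made-up illustrative values, and the population versions of variance and standard deviation are used on the assumption that everyone in the (tiny, hypothetical) city was measured.

```python
import statistics

# Hypothetical heights in cm for an entire small population (illustrative values only)
heights_cm = [150.0, 160.0, 165.0, 170.0, 175.0, 180.0, 190.0]

data_range = max(heights_cm) - min(heights_cm)   # largest value minus smallest value
variance = statistics.pvariance(heights_cm)      # population variance, in cm^2
std_dev = statistics.pstdev(heights_cm)          # population standard deviation, in cm

print(f"range = {data_range} cm")
print(f"variance = {variance} cm^2")
print(f"standard deviation = {std_dev:.2f} cm")
```

If the values were only a sample from the city, `statistics.variance` and `statistics.stdev` (which divide by n - 1) would be the usual choice.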


Uncertainty vs Variability

Another Example:
• Uncertainty: You estimate that the average load on the bridge will be around
10,000 kg with a margin of error of ±500 kg because you can't measure every
single load scenario.
• Variability: On the bridge, the weight varies from light cars of 1,000 kg to heavy
trucks of 20,000 kg. This spread of different weights is the variability.


Uncertainty vs Variability

• Statistics deals with both uncertainty and variability.


• Uncertainty: Refers to the lack of certainty or predictability in data or outcomes.
It can arise from various sources such as incomplete knowledge, measurement
errors, or inherent unpredictability.
• Variability: Refers to the natural variation inherent in a system or process. This
can be due to differences in individual elements, fluctuations over time, or
environmental factors.
• Uncertainty is derived from theoretical information (i.e. it is expressed in terms of
probabilities) while variability is derived from data extracted from observations
and experiments (i.e. it is expressed in terms of frequencies).

Sources of Uncertainty

Sources of uncertainty are classified broadly into two types:


1. Aleatory uncertainty
2. Epistemic uncertainty.


Aleatory Uncertainty

Concept: Aleatory uncertainty is the inherent randomness or variability
in a system. It’s the kind of uncertainty that comes from the natural
variation in the data or process.
Example: If you're measuring the daily traffic flow on a highway, aleatory
uncertainty comes from the fact that the number of cars will naturally
vary each day. This randomness is something you can’t eliminate, only
measure and understand.


Epistemic Uncertainty

Concept: Epistemic uncertainty is the uncertainty that comes from a lack
of knowledge or information. It's the uncertainty we could potentially
reduce by gathering more data or improving our models.
Example: If you’re designing a new bridge and don’t have complete
information about the materials you’re using or the exact environmental
conditions, this lack of knowledge creates epistemic uncertainty. By
conducting more tests and gathering more data, you can reduce this
type of uncertainty.

Aleatory vs Epistemic Uncertainty

• Summary with Examples:


Aleatory Uncertainty: Daily traffic flow varies due to natural randomness. Some
days are busier, some are quieter, and this variability is inherent.
Epistemic Uncertainty: You don’t know the exact properties of the construction
materials. By testing and researching, you can gain more knowledge and reduce this
uncertainty.
• In essence:
Aleatory Uncertainty is about the randomness you can’t eliminate.
Epistemic Uncertainty is about the gaps in your knowledge that you can work to
fill.


1.3. Data Analysis
Process

Data Analysis Process

In order to understand the ideas behind data, statistics and data analysis must be
performed. The data analysis process goes as follows (Peck et al., 2019):
1. Understanding the nature of the problem. Effective data analysis requires an
understanding of the research problem. We must know the goal of the research
and what questions we hope to answer. It is important to have a clear direction
before gathering data to ensure that we will be able to answer the questions of
interest using the data collected.
2. Deciding what to measure and how to measure it. The next step in the
process is deciding what information is needed to answer the questions of
interest. In some cases, the choice is obvious.


Data Analysis Process

3. Data collection. The data collection step is very important. The researcher
must first decide whether an existing data source is adequate or whether new
data must be collected. If a decision is made to use existing data, it is important
to understand how the data were collected and for what purpose, so that any
resulting limitations are also fully understood. If new data are to be collected, a
careful plan must be developed, because the type of analysis that is appropriate
and the conclusions that can be drawn depend on how the data are collected.


Data Analysis Process

4. Data summarization and preliminary analysis. After the data are collected,
the next step is usually a preliminary analysis that includes summarizing the data
graphically and numerically. This initial analysis provides insight into important
characteristics of the data and provides guidance in selecting appropriate
methods for further analysis.
5. Formal data analysis. The data analysis step requires the researcher to select
appropriate statistical methods.
6. Interpretation of results. The interpretation step often leads to the
formulation of new research questions. These new questions lead back to the first
step. In this way, good data analysis is often an iterative process.


Population vs Sample

• In statistics, data are usually obtained from samples, and the corresponding
values for the entire population are estimated from these sample data.


Data Analysis Process

• A summary measure that describes a specific characteristic of a sample is called
a statistic, while a summary measure that describes a specific characteristic of
a population is called a parameter.
• This means that a value computed from sample data (e.g. a sample mean) is a
statistic, while the corresponding population value that inferential statistics
seeks to estimate (e.g. the population mean) is a parameter.
• A designed study that provides the information needed to solve a certain
research problem is called a statistical inquiry.
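
The difference between a parameter and a statistic can be sketched as follows. The population here is simulated, purely for illustration: the 1,000 concrete cylinder strengths, the mean of 28 MPa, and the spread of 3 MPa are all assumed values, not data from the lecture.

```python
import random
import statistics

random.seed(42)  # fixed seed so the sketch is repeatable

# Hypothetical population: compressive strengths (MPa) of 1,000 concrete cylinders
population = [random.gauss(28.0, 3.0) for _ in range(1000)]

# Parameter: a summary measure of the whole population
population_mean = statistics.mean(population)

# Statistic: a summary measure of only the sample that was drawn
sample = random.sample(population, 30)
sample_mean = statistics.mean(sample)

print(f"population mean (parameter): {population_mean:.2f} MPa")
print(f"sample mean (statistic):     {sample_mean:.2f} MPa")
```

In practice the population mean is usually unknown, and the sample mean is used to estimate it.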


II. Data Collection &
Sampling Techniques

1.4. Data and
Measurement

Data and Measurement

• In order to perform statistical analysis, data must be collected.
• A data set is a collection of observations on one or more variables.
• A variable is a characteristic whose value may change from one observation to
another.


Classification of Data

Data may be classified depending on its nature and on the number of variables involved.
Depending on its nature, data may be classified as either:
• Categorical (qualitative): if the individual observations are categorical
responses; or
• Numerical (quantitative): if the individual observations are expressed as
numbers.


Classification of Data

Numerical data may be further classified as follows:
• Discrete: if possible values of the variable/s correspond to isolated points on the
number line; and
• Continuous: if possible values of the variable/s correspond to all points inside an
interval on the number line.


Measurement

• In order to collect data for statistical analysis, the manner in which the variables
are to be observed should be decided upon, depending on the type of data
involved.
• The process of determining the value (for numerical data) or label (for categorical
data) of the variable based on what has been observed is called measurement.


Measurement

In statistics, measurement is important because it helps in determining the
appropriate statistical method/s that should be used to analyze a data set. There are
four levels or scales of measurement considered in statistics, as follows:
1. Ratio Level
2. Interval Level
3. Ordinal Level
4. Nominal Level


Nominal Scale

• Definition: The nominal level is the most basic level of measurement. It involves
categorizing data without any quantitative value. The categories are distinct and have no
inherent order.
• Examples: Gender (male, female), eye color (blue, green, brown), types of engineering
disciplines (civil, mechanical, electrical).
• Key Characteristics:
o Data is classified into distinct groups or categories.
o There is no ranking or order to the categories.
o The only analysis possible is counting the frequency of each category (mode).

Ordinal Scale

• Definition: The ordinal level involves categorizing data with a meaningful order or
ranking, but the intervals between the categories are not necessarily equal or known.
• Examples: Education level (high school, bachelor's, master's, Ph.D.), satisfaction ratings
(satisfied, neutral, dissatisfied), ranks in a competition (1st, 2nd, 3rd).
• Key Characteristics:
o Data is ranked or ordered.
o The distance between the ranks is not uniform or specified.
o Analysis can include median and mode, but not meaningful arithmetic operations like
addition or subtraction.

Interval Scale

• Definition: The interval level includes data that is ordered, with equal intervals between
values. However, there is no true zero point, meaning that ratios or comparisons of
absolute magnitude are not meaningful.
• Examples: Temperature in Celsius or Fahrenheit, dates on a calendar, IQ scores.
• Key Characteristics:
o Data is ordered with equal intervals.
o There is no true zero (e.g., 0 degrees Celsius does not mean "no temperature").
o Arithmetic operations like addition and subtraction are meaningful; however,
multiplication and division are not because there is no absolute zero.

Ratio Scale

• Definition: The ratio level is the highest level of measurement, where data is ordered,
with equal intervals, and a true zero point exists. This allows for a full range of arithmetic
operations.
• Examples: Weight, height, age, income, length, time.
• Key Characteristics:
o Data is ordered with equal intervals and a true zero point.
o Both arithmetic operations and comparisons of ratios are meaningful (e.g., a weight
of 10 kg is twice as heavy as 5 kg).
o All statistical methods, including mean, median, mode, and geometric mean, are
applicable.


Measurement

• Summary:
o Nominal: Categories without order (e.g., gender, colors).
o Ordinal: Ordered categories without equal intervals (e.g., rankings).
o Interval: Ordered with equal intervals, but no true zero (e.g., temperature in Celsius).
o Ratio: Ordered with equal intervals and a true zero (e.g., weight, income).

• Understanding these levels of measurement helps in selecting appropriate statistical
methods and interpreting data correctly.
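
One way to see how the level of measurement guides the choice of method is a simple lookup, sketched below. The dictionary and the function name `meaningful_summaries` are hypothetical helpers written for this illustration; the entries follow the key characteristics listed for each scale above, with the mean placed at the interval level on the assumption that meaningful addition and subtraction are what make an average meaningful.

```python
# Assumed mapping from level of measurement to the summaries that are meaningful at that level
MEANINGFUL_SUMMARIES = {
    "nominal": ["mode"],
    "ordinal": ["mode", "median"],
    "interval": ["mode", "median", "mean"],
    "ratio": ["mode", "median", "mean", "geometric mean"],
}


def meaningful_summaries(level):
    """Return the summary statistics that are meaningful for a level of measurement."""
    return MEANINGFUL_SUMMARIES[level.lower()]


print(meaningful_summaries("Ordinal"))  # → ['mode', 'median']
```

Each level keeps everything allowed at the levels below it and adds more, which is why ratio data supports the widest range of methods.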

https://www.youtube.com/watch?v=OXTdii-b9Co&t=1s


Teaching and Learning Activity (TLA)_Assignment 1

• Instructions: Accomplish the assignment on a short paper-sized document, using
handwritten text. Make sure to include a title page. Complete the following survey by
answering each question. For each item, identify the level of measurement (Nominal,
Ordinal, Interval, or Ratio).
____1. What type of soil is predominant at your project site?
Options: Clay, Sand, Silt, Gravel, Other
____ 2. Rank the difficulty of the following civil engineering subjects: Structural Analysis,
Geotechnical Engineering, Fluid Mechanics, Transportation Engineering.
Options: (Rank each subject from 1 to 4)
____3. What is the compressive strength of the concrete used in your latest project?
Options: (Fill in the value in MPa)
____4. What was the temperature during the curing of the concrete sample?
Options: (Fill in the temperature in Celsius)
____5. Which software do you use most frequently for structural analysis?
Options: SAP2000, ETABS, STAAD Pro, SAFE, MIDAS

Teaching and Learning Activity (TLA)_Assignment 1

• Instructions: Complete the following survey by answering each question. For each item,
identify the level of measurement (Nominal, Ordinal, Interval, or Ratio).
____6. How would you rate your understanding of soil classification methods (e.g., USCS,
AASHTO)?
Options: Excellent, Good, Fair, Poor
____7. On a scale from 1 to 10, how confident are you in your ability to perform a load
calculation?
Options: (1 to 10 scale)
____8. How many floors does the building you are designing have?
Options: (Fill in the number of floors)
____9. Which type of foundation is most commonly used in your current projects?
Options: Shallow Foundation, Deep Foundation, Pile Foundation, Mat Foundation
____10. How many engineering courses are you taking this semester?
Options: (Fill in the number of courses)

1.5. Data Collection
Methods

Data Collection Methods

• Data may be collected, depending on the type of study, using any of the following
methods:
1. Use of documented data: Available research data may be used, such as government
data and data from related studies. However, caution must be
exercised when using documented data, especially secondary data (i.e. data
documented by entities other than the actual data collectors).
2. Surveys: A survey is a method of collecting data on the variable of interest by asking
people to answer a set of carefully written questions called a questionnaire. A survey
comprising an entire population is called a census while a survey comprising only a
sample of the population is called a sample survey. Surveys are usually performed if
the study involves human behavior such as consumer studies and election surveys.


Data Collection Methods

• Data may be collected, depending on the type of study, using any of the following
methods:
3. Experiments: An experiment is a method of collecting data where there is a direct
human intervention on the conditions that may affect the values of the variable of
interest. Variables that may be directly manipulated are called independent
variables while variables that cannot be manipulated directly but can have their
values changed are called dependent variables. Most scientific studies with
multivariate data involve experimentation.
4. Observations: An observation is a method of collecting data on the phenomenon of
interest by recording the observations made about the phenomenon as it actually
happens. Examples of studies involving observations include weather and climate,
earthquake, and astronomical studies.

1.6. Sampling

Sampling

• Sampling is a crucial process in engineering data analysis, where a subset of data is
selected from a larger population for analysis.
• This approach is often necessary when it's impractical or impossible to collect data
from the entire population.
• To ensure reliable and acceptable statistical results, proper selection of samples
should be performed.
• The process of obtaining or selecting samples from a population related to a
study is called sampling.


Sampling Bias

• Sampling bias occurs when the sample selected for analysis does not accurately
represent the entire population, leading to skewed or invalid results. In engineering
data analysis, this bias can significantly affect the conclusions drawn from the data,
potentially leading to poor decision-making and flawed engineering solutions.


Causes of Sampling Bias

1. Non-Random Sampling:
• When samples are selected based on convenience, judgment, or any non-random
method, certain segments of the population may be overrepresented or
underrepresented.
• Example: In a material testing study, if an engineer only selects samples from the
top of a batch, they might miss variations that occur throughout the batch, leading
to biased conclusions about material strength.
2. Undercoverage:
• This occurs when some parts of the population are not included or are
underrepresented in the sample.
• Example: If a survey of construction practices only includes responses from large
firms and excludes small contractors, the results may not reflect the practices of
the entire industry.

Causes of Sampling Bias

3. Non-Response Bias:
• When individuals or items selected for the sample do not respond or participate,
and their absence is related to the study’s focus, the results can be biased.
• Example: In a survey on the adoption of new engineering software, if only tech-
savvy engineers respond, the results may overestimate the overall adoption rate.

4. Voluntary Response Bias:


• This occurs when the sample consists only of individuals who choose to
participate, leading to overrepresentation of strong opinions or specific groups.
• Example: An online survey on engineering tools might attract responses mostly
from those with a strong preference for a particular tool, skewing the results.


Impacts of Sampling Bias in Engineering:

1. Inaccurate Data Analysis: The data may not truly reflect the population, leading to
incorrect conclusions.
2. Faulty Designs and Decisions: Engineering solutions based on biased data may be
ineffective or unsafe, potentially leading to project failures or increased costs.
3. Reduced Reliability: Results that suffer from sampling bias lack credibility and may
not be applicable to broader contexts, reducing their usefulness in guiding
engineering practices.


Avoiding Sampling Bias:

To minimize sampling bias, engineers should:


• Use random sampling methods wherever possible.
• Ensure that the sample is representative of the entire population,
considering all relevant factors.
• Address non-response and voluntary response biases by encouraging
broader participation and follow-ups.


Sampling Methods

• The choice of sampling method depends on the specific goals of the analysis, the
nature of the population, and practical considerations like time and resources.
• Proper sampling leads to more accurate and meaningful conclusions in engineering
projects.
• Here’s a brief overview of common sampling methods.

Sampling Methods

1. Simple Random Sampling:


• Definition: Each member of the population has an equal chance of being selected.
This method is the most straightforward and helps reduce bias.
• Application in Engineering: When testing materials in construction, a simple
random sample of batches might be selected to ensure a representative
understanding of quality.
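
A minimal sketch of simple random sampling using Python's standard library is given below. The 50 numbered batches and the sample size of 5 are assumed values chosen only for illustration.

```python
import random

random.seed(1)  # fixed seed so the sketch is repeatable

# Hypothetical population: 50 concrete batches identified by number
batches = list(range(1, 51))

# random.sample draws without replacement; every batch is equally likely to be chosen
selected = random.sample(batches, k=5)

print(sorted(selected))
```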


Sampling Methods

2. Stratified Sampling:
• Definition: The population is divided into distinct subgroups (strata) based on a
specific characteristic, and random samples are taken from each subgroup. This
ensures representation from all key segments of the population.
• Application in Engineering: When analyzing the strength of concrete from
multiple suppliers, engineers might stratify the samples by supplier and then
randomly select from each group to ensure each supplier's product is tested.
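
The supplier example can be sketched as below. The supplier names, specimen IDs, and the 20% sampling fraction are assumptions for illustration, and `stratified_sample` is a hypothetical helper, not a library function. Sampling the same fraction from each stratum (proportional allocation) is one common choice among several.

```python
import random

random.seed(2)  # fixed seed so the sketch is repeatable

# Hypothetical strata: specimen IDs grouped by supplier
strata = {
    "Supplier A": [f"A{i}" for i in range(1, 21)],   # 20 specimens
    "Supplier B": [f"B{i}" for i in range(1, 31)],   # 30 specimens
    "Supplier C": [f"C{i}" for i in range(1, 11)],   # 10 specimens
}


def stratified_sample(strata, fraction):
    """Draw a simple random sample of the given fraction from every stratum."""
    sample = {}
    for name, units in strata.items():
        k = max(1, round(len(units) * fraction))  # at least one unit per stratum
        sample[name] = random.sample(units, k)
    return sample


sample = stratified_sample(strata, fraction=0.2)
for name, units in sample.items():
    print(name, units)
```

Because every stratum contributes at least one specimen, no supplier can be left out of the sample by chance.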


Sampling Methods

3. Systematic Sampling:
• Definition: Every nth member of the population is selected after a random starting
point. This method is easier to administer than simple random sampling and can be
used when the population is ordered.
• Application in Engineering: For quality control in manufacturing, an engineer might
inspect every 10th item coming off a production line to monitor consistency.
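
The "every nth member after a random starting point" rule can be sketched as follows. The 100 serial numbers and the helper name `systematic_sample` are assumptions made for this illustration.

```python
import random


def systematic_sample(items, step):
    """Select every `step`-th item after a random start within the first `step` items."""
    start = random.randrange(step)   # random starting index: 0, 1, ..., step - 1
    return items[start::step]


random.seed(3)  # fixed seed so the sketch is repeatable

production_line = list(range(1, 101))   # hypothetical item serial numbers 1..100
inspected = systematic_sample(production_line, step=10)

print(inspected)
```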


Sampling Methods

4. Cluster Sampling:
• Definition: The population is divided into clusters, often based on geography or
another natural grouping, and entire clusters are randomly selected. This method is
cost-effective when dealing with large, dispersed populations.
• Application in Engineering: In a large construction project spread over different sites,
engineers might select a few sites (clusters) and then test all the samples within those
sites.
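
A minimal sketch of the construction-site example is shown below. The site names and specimen IDs are made up for illustration. Note that the random selection happens at the level of whole sites, not individual specimens.

```python
import random

random.seed(4)  # fixed seed so the sketch is repeatable

# Hypothetical clusters: construction sites and the specimens collected at each
sites = {
    "Site 1": ["S1-a", "S1-b", "S1-c"],
    "Site 2": ["S2-a", "S2-b"],
    "Site 3": ["S3-a", "S3-b", "S3-c", "S3-d"],
    "Site 4": ["S4-a", "S4-b", "S4-c"],
}

# Randomly choose whole sites, then keep every specimen from the chosen sites
chosen_sites = random.sample(sorted(sites), k=2)
tested = [specimen for site in chosen_sites for specimen in sites[site]]

print(chosen_sites)
print(tested)
```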


Sampling Methods

5. Convenience Sampling:
• Definition: Samples are selected based on their availability and ease of access. While
this method is the least rigorous and can introduce bias, it is sometimes used in
exploratory research or when other methods are impractical.
• Application in Engineering: When an engineer needs to quickly assess the
performance of a material and selects samples from the nearest available source, this
is convenience sampling.


Sampling Methods

6. Multistage Sampling:
• Definition: Combines several sampling methods. The population is divided into
groups (like in cluster sampling), and then a sample is taken from within each
selected group, often using another method like simple random sampling or
stratified sampling.
• Application in Engineering: In large infrastructure projects, an engineer might first
choose specific regions (clusters), then within each region, select specific
construction sites (stratified), and finally, within those sites, randomly choose
samples for testing.

Video
https://www.youtube.com/watch?v=PdXDLNNXPik&t=510s


1.7. Introduction to
Design of
Experiments (DOE)

Design of Experiments (DOE)

• Scientific and engineering studies usually involve experiments to understand physical
phenomena.
• An experiment is a method of collecting data where there is a direct human intervention
on the conditions that may affect the values of the variable of interest.
• Design of Experiments (DOE) is a systematic approach used in engineering to plan,
conduct, and analyze controlled tests to evaluate the factors that may influence a
particular process or product.
• It allows engineers to understand the relationships between different variables and
optimize processes for better performance, quality, and efficiency.

Variables

• An experiment is a study in which one or more explanatory variables are manipulated in order to
observe the effect on a response variable.
• Explanatory variables are independent variables or factors, those that have values that are
controlled by the experimenter, while response variables are dependent variables, those that are
thought to be related to the explanatory variables in an experiment.
• Response variables are measured as part of the experiment, but not controlled by the
experimenter. An experimental study involves several set-ups, called experimental conditions or
treatments, to observe the relationship between the independent and the dependent variables.


Key Concepts in DOE

1. Factors and Levels:


• Factors are the variables that are being tested in an experiment (e.g., temperature, pressure, material
type).
• Levels are the specific values or settings of each factor (e.g., high/low temperature).
2. Response Variable:
• The response variable is the outcome or result that is measured in the experiment (e.g., product
strength, yield, efficiency).
3. Experimental Units:
• The experimental units are the items or subjects on which the experiment is conducted (e.g., samples,
machines, batches of material).


Key Concepts in DOE

4. Randomization:
• Randomization involves randomly assigning treatments to experimental units to
minimize the effects of uncontrolled variables and reduce bias.
• Example in Engineering: In a study testing the durability of different concrete mixes,
randomization might involve randomly assigning various concrete formulas to
different batches. This ensures that any differences in durability are due to the mix
itself and not to variations in the testing conditions (like temperature or humidity)
that could affect the results.
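
Random assignment of treatments to experimental units can be sketched as below. The three mix names and nine batches are assumed for illustration, and giving each mix the same number of batches (a balanced assignment) is a design choice of this sketch rather than something stated in the lecture.

```python
import random

random.seed(5)  # fixed seed so the sketch is repeatable

mixes = ["Mix A", "Mix B", "Mix C"]               # hypothetical treatments
batches = [f"Batch {i}" for i in range(1, 10)]    # hypothetical experimental units (9 batches)

# Build a balanced list of treatments (3 batches per mix), then shuffle it
treatments = mixes * (len(batches) // len(mixes))
random.shuffle(treatments)

# Pair each batch with the treatment that landed in its position
assignment = dict(zip(batches, treatments))

for batch, mix in assignment.items():
    print(batch, "->", mix)
```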


Key Concepts in DOE

5. Replication:
• Replication means repeating the experiment under the same conditions to ensure
that the results are consistent and reliable.
• Example in Engineering: If an engineer is testing the tensile strength of a new steel
alloy, they might replicate the test by pulling multiple samples of the steel under the
same conditions. This helps confirm that the observed strength is consistent across
different samples and not due to random variation.


Key Concepts in DOE

6. Blocking:
• Blocking is used to control for variables that are not of primary interest but may
influence the response, by grouping similar experimental units together.
• Example in Engineering: When testing the effectiveness of a new pavement material,
an engineer might block the experiment by different weather conditions (e.g., sunny,
rainy). Within each block, they would test the material under various conditions to
isolate the effects of the weather from the material's performance. This ensures that
differences in performance are due to the material itself rather than the varying
weather conditions.
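
One common way to combine blocking with randomization is a randomized complete block design, in which every treatment appears once in each block and the order is randomized within the block. The lecture does not name this design, so the sketch below is an assumed illustration; the materials, weather blocks, and test sections are all hypothetical.

```python
import random

random.seed(6)  # fixed seed so the sketch is repeatable

materials = ["Current pavement", "New pavement"]   # hypothetical treatments
blocks = {                                         # hypothetical test sections grouped by weather
    "sunny": ["Section 1", "Section 2"],
    "rainy": ["Section 3", "Section 4"],
}

design = {}
for weather, sections in blocks.items():
    order = materials[:]          # copy so each block gets its own independent shuffle
    random.shuffle(order)         # randomize which section receives which material
    for section, material in zip(sections, order):
        design[section] = (weather, material)

for section, (weather, material) in design.items():
    print(section, weather, material)
```

Because both materials are tested under each weather condition, a difference between them is not confused with a difference between sunny and rainy conditions.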


Importance of DOE in Engineering:

• Efficiency: DOE helps identify the most significant factors affecting a process with the
fewest number of experiments, saving time and resources.
• Optimization: By understanding the interactions between variables, engineers can
optimize processes for better performance and quality.
• Problem Solving: DOE provides a structured approach to troubleshooting and improving
engineering systems, leading to more robust and reliable designs.


Video:

https://www.youtube.com/watch?v=DaBq0naj0YY&t=524s


Teaching and Learning Activity (TLA)_Assignment 2

Instructions:
• Present an example of the application of the data analysis process in civil engineering. You may search Google
Scholar (or any other credible website) for papers or design experiments which show how statistics is
applied in understanding a civil engineering problem. Compare the process followed in the said paper/design
experiment to the data analysis process discussed in pages 22-24.
• Accomplish this on short paper size document, typewritten in PDF format. Provide a title page.
• Use 12-pt Arial Narrow for font type and size, with 1.5-space line spacing. Sentences must be in justified
alignment. Images, if any, must be centered and with caption below.
• References used for this assignment should be put at the end of the document. Use APA format for citation
and listing. Photo credits (list of sources where photos come from) should also be put right after
references.

Get answers to your questions

GOOGLE CANVAS FACEBOOK GMAIL


mdlaurina.ce@tip.edu.ph Marc Laurina mdlaurina.ce@tip.edu.ph


END OF SLIDE

TO GOD BE, THE GLORY
