Lecture 3

The document discusses various statistical methods for analyzing and forecasting logistics data, including histograms, boxplots, time series plots, and scatterplots. It emphasizes the importance of data preprocessing techniques such as outlier detection, data aggregation, and normalization to improve forecasting accuracy. Additionally, it covers the classification of time series and the use of linear regression for explanatory modeling.


INDU 342: Logistics Network Models

Forecasting

Claudio Contardo

Mechanical, Industrial and Aerospace Engineering

Concordia University

Lecture 3
Histograms

Histograms can help identify (or at least estimate) the underlying probability distribution governing a random variable.
First: establish a number of equal-width intervals between MIN and MAX.
Second: compute the percentage of the total observations that fall within each interval.
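
The two steps above can be sketched in a few lines of Python. This is only an illustration: the function name `relative_frequency_histogram`, the number of bins, and the travel-time values are assumptions, not data from the lecture.

```python
import numpy as np

def relative_frequency_histogram(values, n_bins):
    """Return (bin edges, share of observations per bin) for equal-width bins."""
    values = np.asarray(values, dtype=float)
    # Step 1: equal-width intervals between MIN and MAX
    edges = np.linspace(values.min(), values.max(), n_bins + 1)
    # Step 2: share of the observations falling in each interval
    counts, _ = np.histogram(values, bins=edges)
    return edges, counts / counts.sum()

# Hypothetical travel times in minutes (illustrative values only)
travel_times = [4.1, 4.8, 5.0, 5.2, 5.3, 5.9, 6.1, 6.4, 7.0, 8.2]
edges, shares = relative_frequency_histogram(travel_times, n_bins=4)
for low, high, share in zip(edges[:-1], edges[1:], shares):
    print(f"[{low:.3f}, {high:.3f}]: {100 * share:.0f}%")
```

The choice of the number of bins is left to the analyst; too few bins hide the shape of the distribution, too many make it noisy.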
Histograms

Mumbarai
In the Mumbarai problem, the logistics manager wants to verify visually
whether the parent distribution of the AGV travel times is normal. They
plot, in the same diagram, the density histogram of the available data and
the probability density function of the normal distribution whose mean and
standard deviation equal the sample mean and the sample standard deviation
of the data. Visual inspection of the resulting plot supports the
logistics manager's intuition.
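
A comparison of this kind can be sketched numerically as follows. The travel times below are synthetic (drawn from a normal distribution with assumed parameters), since the Mumbarai data are not reproduced in these slides; the helper names are also assumptions.

```python
import numpy as np

def density_histogram(values, n_bins):
    """Histogram heights scaled so that the total bar area equals 1."""
    heights, edges = np.histogram(np.asarray(values, dtype=float),
                                  bins=n_bins, density=True)
    return heights, edges

def normal_pdf(x, mean, std):
    """Probability density function of the normal distribution."""
    return np.exp(-0.5 * ((x - mean) / std) ** 2) / (std * np.sqrt(2.0 * np.pi))

# Synthetic travel times in seconds (assumed mean 60, standard deviation 5)
rng = np.random.default_rng(0)
times = rng.normal(loc=60.0, scale=5.0, size=2000)

heights, edges = density_histogram(times, n_bins=20)
centers = 0.5 * (edges[:-1] + edges[1:])
# Normal density with the sample mean and sample standard deviation
fitted = normal_pdf(centers, times.mean(), times.std(ddof=1))

# Largest vertical gap between histogram bars and fitted density
max_gap = np.max(np.abs(heights - fitted))
print(f"largest gap between histogram and normal pdf: {max_gap:.4f}")
```

Plotting `heights` as bars and `fitted` as a curve over `centers` gives the visual check described above; a small gap is suggestive of normality but is not a formal test.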
Histograms

Figure: Travel times histogram

Boxplots

Boxplot
A boxplot (or box-and-whisker plot) is a graphical display depicting
numerical data through their quartiles. In particular, the plot contains a
box whose extreme sides represent quartiles Q25 and Q75 . The box also
includes an internal line representing the median value and, possibly, a
small triangle associated with the mean of the data. In addition, the
representation includes two lines (called whiskers), extending from the
box to the minimum and to the maximum data values which are not
outliers, respectively
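
The quantities drawn in a boxplot can be computed directly, as in the sketch below. It assumes that outliers are the points lying more than 1.5 interquartile ranges outside the box (a common convention, not stated in this definition) and uses NumPy's default linear-interpolation quartiles; the data are hypothetical.

```python
import numpy as np

def boxplot_summary(values, whisker=1.5):
    """Quartiles, mean, and whisker ends (extreme non-outlier values)."""
    values = np.asarray(values, dtype=float)
    q25, median, q75 = np.percentile(values, [25, 50, 75])
    iqr = q75 - q25
    # Assumed outlier convention: beyond `whisker` IQRs from the box
    inside = values[(values >= q25 - whisker * iqr) &
                    (values <= q75 + whisker * iqr)]
    return {
        "q25": q25, "median": median, "q75": q75,
        "mean": values.mean(),
        "whisker_low": inside.min(), "whisker_high": inside.max(),
    }

# Hypothetical weekly expenditures (illustrative values only)
summary = boxplot_summary([12, 15, 14, 10, 13, 16, 11, 40])
print(summary)
```

In this example the value 40 lies outside the upper fence, so the upper whisker stops at 16 rather than at the maximum.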
Boxplots

Ravaioli
Ravaioli is an Italian producer of fresh pasta, renowned for its ravioli. The
sales of its spinach ravioli Nonna Pina are influenced heavily by the
company's TV and social media advertising. The following table reports the
latest 20 weekly expenditures (in k€) made by the company in TV and
social media advertising, together with the sales (also in k€) realized
in the same time periods. In order to devise a sales forecast for inventory
planning, the logistics manager first performed an exploratory data
analysis (EDA). This phase included generating boxplots of the company's
TV and social media advertising expenditures.
Boxplots

Figure: Ravaioli’s expenditures in marketing

Boxplots

Figure: Boxplot annotated with its elements (average, max, Q75, median, Q25, min)
Time series plots

Time series plot

A plot of a time series $y_t$, $t = 1, \dots, T$, is a Cartesian diagram $(t, y_t)$ in
which the horizontal axis shows the time periods $t = 1, \dots, T$ on an
appropriate scale (weeks, months, quarters, years), while the vertical axis
shows the corresponding numerical value $y_t$ for each $t$
Time series plots

Example: Ravaioli
The EDA performed by the logistics manager of Ravaioli includes the
generation of the plot of the weekly sales of spinach ravioli

Figure: Weekly sales of spinach ravioli (in k€) in the Ravaioli problem
Bivariate EDA

Sample covariance
Given two time series $x_t, y_t$, $t = 1, \dots, T$, their sample covariance is

$$v_{xy} = \frac{1}{T} \sum_{t=1}^{T} (x_t - \bar{x})(y_t - \bar{y})$$

Since $x$ and $y$ may be in different units, it is best to normalize the covariance using the
sample standard deviations $S_x, S_y$:

$$r_{xy} = \frac{v_{xy}}{S_x S_y}$$

The term $r_{xy}$ is also referred to as the Pearson correlation coefficient

Bivariate EDA

If $r_{xy}$ is substantially less than zero ⇒ $x, y$ are negatively correlated

If $r_{xy}$ is substantially greater than zero ⇒ $x, y$ are positively
correlated
If $r_{xy}$ is close to zero ⇒ $x, y$ are (linearly) uncorrelated
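
The sample covariance and the Pearson coefficient can be computed as in the sketch below. It assumes that the standard deviations use the same 1/T denominator as the covariance, so that the coefficient lies in [-1, 1]; the function names and data are illustrative.

```python
import numpy as np

def sample_covariance(x, y):
    """Covariance with the 1/T denominator used in the definition above."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return np.mean((x - x.mean()) * (y - y.mean()))

def pearson(x, y):
    """Covariance normalized by the two standard deviations (1/T form)."""
    return sample_covariance(x, y) / (np.std(x) * np.std(y))

# Hypothetical series (illustrative values only)
x = [1.0, 2.0, 3.0, 4.0, 5.0]
r_pos = pearson(x, [2.0, 4.0, 6.0, 8.0, 10.0])   # perfectly increasing
r_neg = pearson(x, [10.0, 8.0, 6.0, 4.0, 2.0])   # perfectly decreasing
r_mix = pearson(x, [3.0, 1.0, 4.0, 1.0, 5.0])    # weaker relationship
print(r_pos, r_neg, r_mix)
```

The result does not depend on whether 1/T or 1/(T-1) is used, as long as the covariance and the standard deviations use the same convention.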
Scatterplot

Scatterplot
A scatterplot is a diagram using Cartesian coordinates to display
corresponding values of two numerical variables $x_t, y_t$, $t = 1, \dots, T$, of a
dataset
Scatterplot

Example: Ravaioli
The findings of Ravaioli's logistics manager about the quality of x1
and x2 as predictors are confirmed by the 2D scatterplots shown in the
following two figures

Figure: (a) TV advertising vs sales; (b) Social media advertising vs TV advertising
Data preprocessing

Data preprocessing
Data preprocessing is the process of cleaning, interpolating, aggregating,
or transforming data before they are used to make forecasts
Data preprocessing

Insertion of missing data

Simplest case: replace missing value with average of previous and
subsequent observations

Figure: Number of cars sold per month

In this case we can set

$x_6 = (x_5 + x_7)/2 = (38{,}521 + 41{,}345)/2 = 39{,}933$
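
The same replacement rule can be written as a small helper. The function name is an assumption, and the sketch only handles isolated gaps that have an observation on both sides; only the two neighboring values from the example above are used.

```python
def fill_missing_with_neighbor_mean(series):
    """Replace each missing value (None) by the mean of its two neighbors."""
    filled = list(series)
    for i, value in enumerate(series):
        if value is None:
            # The simple rule needs an observation on both sides of the gap
            if i == 0 or i == len(series) - 1:
                raise ValueError("cannot fill a gap at the start or end")
            if series[i - 1] is None or series[i + 1] is None:
                raise ValueError("cannot fill consecutive missing values")
            filled[i] = (series[i - 1] + series[i + 1]) / 2
    return filled

# Months 5, 6, 7 of the car sales example; month 6 is missing
months_5_to_7 = [38521, None, 41345]
print(fill_missing_with_neighbor_mean(months_5_to_7))
```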
Data preprocessing

Outliers detection
Data may contain anomalous values due to device failures, human errors, or
even natural deviations. Such values can lead to misleading forecasts.
Identifying them can be challenging, since there is no universal rule for
deciding whether an observation is an outlier: the decision is very
application-dependent

Rule of thumb for outliers’ detection

If the trend is constant and no cyclical components are observed:
Compute $Q_{25}$ and $Q_{75}$ for the dataset
Discard every data point $x_t$ such that $x_t < Q_{25} - 1.5(Q_{75} - Q_{25})$ or
$x_t > Q_{75} + 1.5(Q_{75} - Q_{25})$
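
A minimal sketch of this rule of thumb is shown below. The monthly figures are hypothetical, and the quartiles are computed with NumPy's default linear interpolation; other quartile conventions can give slightly different bounds.

```python
import numpy as np

def iqr_bounds(q25, q75, k=1.5):
    """Lower and upper limits outside which a point is flagged as an outlier."""
    iqr = q75 - q25
    return q25 - k * iqr, q75 + k * iqr

def find_outliers(values, k=1.5):
    """Indices of the observations falling outside the IQR-based limits."""
    values = np.asarray(values, dtype=float)
    q25, q75 = np.percentile(values, [25, 75])
    low, high = iqr_bounds(q25, q75, k)
    return np.flatnonzero((values < low) | (values > high))

# Hypothetical monthly deliveries with one suspicious value (index 7)
monthly = [900, 870, 950, 980, 860, 910, 940, 200, 990, 880, 920, 960]
print(find_outliers(monthly))
```

The factor `k = 1.5` is the one used in the rule above; a larger value flags fewer points.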
Data preprocessing

Example: Elleshop
Elleshop distributes electrical appliances in Austria. The table below
reports its sales of smart LED TV sets in the province of Klagenfurt
during the last 12 months. Since the trend is constant and there are no
cyclical components, the above-mentioned rule of thumb is used

Figure: Number of Smart TVs sets delivered monthly

Data preprocessing

$Q_{25} = 866.25$, $Q_{75} = 977.5$

The interval of admissible values is
$[Q_{25} - 1.5(Q_{75} - Q_{25}),\ Q_{75} + 1.5(Q_{75} - Q_{25})] = [699.38, 1144.38]$
The sales amount reported in month 8 (200) falls outside this interval;
it is therefore identified as an outlier and removed
We replace it with the average of the sales reported for months 7 and 9
Data preprocessing

Data aggregation
It consists of merging disaggregated data from multiple sources (e.g.
monthly sales from individual retailers in a given district) into a single
time series (e.g. overall sales in a given district). Aggregating stochastic
data reduces their relative variability, which tends to make forecasts more
accurate

Variability of aggregated data

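Let $X_1, \dots, X_n$ be iid with expected value $\mu$ and standard deviation $\sigma$, and let $Y = X_1 + \dots + X_n$

$\mu_Y = n\mu$
$\sigma_Y^2 = n\sigma^2$

Therefore $\sigma_Y / \mu_Y = \frac{1}{\sqrt{n}} \, \sigma / \mu$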
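
The reduction in relative variability can be checked with a small simulation. The sketch below assumes normally distributed demands with arbitrary parameters ($\mu = 100$, $\sigma = 20$, $n = 16$); any iid distribution with finite variance would illustrate the same effect.

```python
import numpy as np

def coefficient_of_variation(samples):
    """Standard deviation divided by the mean."""
    return np.std(samples) / np.mean(samples)

# Assumed parameters, chosen only for illustration
mu, sigma, n = 100.0, 20.0, 16
rng = np.random.default_rng(42)

# Each row is one realization of (X_1, ..., X_n); Y is the row sum
x = rng.normal(mu, sigma, size=(50_000, n))
y = x.sum(axis=1)

cv_single = coefficient_of_variation(x[:, 0])
cv_sum = coefficient_of_variation(y)
print(f"single source: {cv_single:.4f}")
print(f"aggregate:     {cv_sum:.4f}")
print(f"theory:        {(sigma / mu) / np.sqrt(n):.4f}")
```

With these parameters the theoretical ratio drops from 0.2 for a single source to 0.05 for the aggregate.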
Data preprocessing

Data aggregation
For the same reason, it may sometimes be convenient to aggregate data
if the fine granularity of a time series leads to too much variability. Examples:
Daily sales vs weekly sales vs quarterly sales
Sales of cars of a particular make/model/year vs sales of SUVs
of a given maker
Sales in multiple small districts vs aggregate sales in a larger area
Data preprocessing

Removing calendar variations

Time series representing a cumulative amount over a time period (e.g.
monthly sales of a product) may contain calendar effects due to the
variability of the length of a month or week. A potential solution:
Define $w_t = n / n_t$ (e.g. $n$ = average number of business days in a
month; $n_t$ = number of business days in month $t$)
Replace $y_t$ by $y'_t = w_t y_t$
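
As an illustration, the adjustment can be coded as follows. The sketch assumes that $n$ is taken as the average of the $n_t$ over the series itself; the sales and business-day counts are hypothetical.

```python
def remove_calendar_variation(y, business_days):
    """Rescale each y_t by w_t = n / n_t, with n the average of the n_t."""
    if len(y) != len(business_days):
        raise ValueError("y and business_days must have the same length")
    # Assumption: n is the mean number of business days over the series
    n_avg = sum(business_days) / len(business_days)
    return [n_avg / n_t * y_t for y_t, n_t in zip(y, business_days)]

# Hypothetical monthly sales that are exactly 100 units per business day
sales = [2100, 1900, 2200]
days = [21, 19, 22]
adjusted = remove_calendar_variation(sales, days)
print([round(v, 2) for v in adjusted])
```

Because the daily rate is constant in this example, the adjusted values coincide: the month-to-month differences were purely a calendar effect.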
Data preprocessing
Deflating monetary time series
For a time series impacted by inflation (e.g. yearly sales of a product over
a 10-year period), it may be convenient to deflate the data (compare
apples with apples)

Example: Cavis
Cavis is a wine-making company that sells its products almost exclusively
in France. The annual sales (in M€) over the last 10 years are reported
in the left Table. The same table also shows the annual rate of inflation
recorded in the decade. The deflated data is reported in the right Table
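
The slides do not spell out the deflation formula. One common convention, assumed here, expresses every value in the prices of the first year by dividing by the cumulative price index built from the annual inflation rates. The function name and figures below are hypothetical and are not the Cavis data.

```python
def deflate(nominal, inflation_rates):
    """Express nominal values in the prices of the first period.

    inflation_rates[k] is the assumed inflation between period k+1 and
    period k+2, so it has one entry fewer than `nominal`.
    """
    if len(inflation_rates) != len(nominal) - 1:
        raise ValueError("need one inflation rate per period after the first")
    deflated = [nominal[0]]
    price_index = 1.0
    for value, rate in zip(nominal[1:], inflation_rates):
        price_index *= 1.0 + rate          # cumulative price level
        deflated.append(value / price_index)
    return deflated

# Hypothetical sales growing exactly at the 5% inflation rate
nominal_sales = [100.0, 105.0, 110.25]
real_sales = deflate(nominal_sales, [0.05, 0.05])
print([round(v, 2) for v in real_sales])
```

In this example the deflated series is flat, which shows that the apparent growth in nominal sales was entirely due to inflation.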
Data preprocessing

Adjusting for population variations

When forecasting some economic variables such as sales in a certain
geographic area, demographic variations need to be taken into account.
Let $a_t$ be the population of a given market in time period $t$ and let
$y_t$, $t = 1, \dots, T$, be the time series. Then, forecasts are devised on
$y'_t = \frac{a_1}{a_t} y_t$
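
A direct transcription of this adjustment is sketched below; the function name and the figures are hypothetical.

```python
def adjust_for_population(y, population):
    """Rescale y_t to the population of the first period: y'_t = a_1 / a_t * y_t."""
    if len(y) != len(population):
        raise ValueError("y and population must have the same length")
    a_1 = population[0]
    return [a_1 / a_t * y_t for y_t, a_t in zip(y, population)]

# Hypothetical customers growing in proportion to the population
customers = [500, 550, 600]
population = [1000, 1100, 1200]
print(adjust_for_population(customers, population))
```

Here the adjusted series is constant, meaning that the growth in customers is explained by population growth alone.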
Data preprocessing

Example: Salus
Salus is a private company providing home care services for the elderly in
the Lombardy region, Italy. The annual number of customers over the
past decade is shown in the left Table below. Considering the annual
population of Lombardy (second and sixth columns of the right Table
below) over the same 10 years, the modified time series is obtained as
shown in the fourth and eighth columns.
Data preprocessing

Data normalization
Make data fit into an interval $[m, M]$, $m < M$. If $y^{MIN}, y^{MAX}$ represent
the min and max values of $y_t$, $t = 1, \dots, T$, we let

$$y'_t = \frac{y_t - y^{MIN}}{y^{MAX} - y^{MIN}} (M - m) + m$$

The most usual form of normalization is the $[0, 1]$-normalization, obtained with
$m = 0$, $M = 1$
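
The normalization formula translates into a short function. The name `normalize` and the guard against a constant series (where the denominator would be zero) are additions made for this sketch.

```python
def normalize(y, m=0.0, M=1.0):
    """Map the values of y linearly onto the interval [m, M]."""
    if not m < M:
        raise ValueError("m must be smaller than M")
    y_min, y_max = min(y), max(y)
    if y_min == y_max:
        # The formula is undefined for a constant series
        raise ValueError("all values are equal; cannot normalize")
    return [(v - y_min) / (y_max - y_min) * (M - m) + m for v in y]

# Hypothetical values (illustrative only)
print(normalize([10, 15, 20]))           # [0, 1]-normalization
print(normalize([10, 15, 20], -1, 1))    # normalization to [-1, 1]
```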
Classification of time series

Intermittent vs continuous

(a) Intermittent time series (b) Continuous time series

Classification of time series

Regular time series

A time series is said to be regular if it can be decomposed into
Trend. Long-term modification of a data pattern over time
Cycle. Long-term fluctuations due to the business cycle, which
depends on macroeconomic issues. Four phases: prosperity (or
boom), recession, depression, and recovery
Seasonality. Recurring pattern caused by the periodicity of human
activities: Christmas sales season, summer season for ice creams,
gym registrations in January, etc.
Error. Also called residual component or noise, it is the irregular
component of the historical data
Regular time series

Figure: Example of a regular time series

Explanatory methods

Linear regression
Given a set of explanatory variables $x_{ti}$ and an outcome $y_t$, we want to
build a model $y = w^T x + \epsilon$ that approximates $y$
as a linear function of the explanatory variables plus a random error

How to find w?
The vector $w$ is chosen as the one that minimizes the sum of the squared
errors, namely

$$SSE = \sum_{t=1}^{T} (y_t - w^T x_t)^2. \qquad (1)$$

The first-order optimality conditions for a minimizer $w^*$ lead to the
following identity

$$\nabla_w SSE = -2 X^T y + 2 X^T X w^* = 0$$

Explanatory methods

Linear regression
If $X^T X$ is non-singular, then

$$w^* = (X^T X)^{-1} X^T y$$

Otherwise, it can be shown that the columns of $X$ are linearly
dependent
If $\det(X^T X) \approx 0$ the matrix is ill-conditioned ⇒ forecast very
sensitive to the input data
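
The closed-form solution can be sketched with NumPy as follows. The data are hypothetical and generated without noise, and the leading column of ones (an intercept term) is an assumption of this example rather than something stated in the slides. The condition number of $X^T X$ is returned as a more practical indicator of ill-conditioning than the determinant.

```python
import numpy as np

def fit_normal_equations(X, y):
    """Solve X^T X w = X^T y and report the condition number of X^T X."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    gram = X.T @ X
    # Solving the linear system avoids forming the inverse explicitly
    w = np.linalg.solve(gram, X.T @ y)
    return w, np.linalg.cond(gram)

# Hypothetical data: column of ones (intercept), then x1 and x2
X = np.array([[1, 1, 0],
              [1, 2, 1],
              [1, 3, 1],
              [1, 4, 3],
              [1, 5, 2]], dtype=float)
y = 2 + 3 * X[:, 1] - X[:, 2]      # exact linear relation, no noise

w, cond = fit_normal_equations(X, y)
# Cross-check with NumPy's least-squares solver
w_lstsq = np.linalg.lstsq(X, y, rcond=None)[0]
print(w, cond)
```

When the condition number is very large, a least-squares routine such as `np.linalg.lstsq`, which works on $X$ directly, is usually a safer choice than the normal equations.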
