Predictive Analytics
BY KHAKIM HABIBI
RENNES SCHOOL OF BUSINESS

Outline
◦ Forecasting
◦ Predictive Analytics
◦ Machine Learning
Forecasting
◦ Sales force allocation, promotions, new product introduction
◦ Plant/equipment investment, budgetary planning
◦ Workforce planning, hiring, layoffs
• All of these decisions are interrelated
1. Forecasts are always inaccurate and should thus include both the expected value of the forecast and a measure of forecast error.
2. Long-term forecasts are usually less accurate than short-term forecasts.
3. Aggregate forecasts are usually more accurate than disaggregate forecasts.
4. In general, the farther up the supply chain a company is, the greater is the distortion of information it receives.

Companies must identify the factors that influence future demand and then ascertain the relationship between these factors and future demand:
◦ Past demand
◦ Lead time of product replenishment
◦ Planned advertising or marketing efforts
◦ Planned price discounts
◦ State of the economy
◦ Actions that competitors have taken
2. Integrate demand planning and forecasting throughout the supply chain.
3. Identify the major factors that influence the demand forecast.

Forecasting methods:
◦ Exponential Smoothing
◦ Moving Average
◦ ARIMA
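As a brief illustration of two of these methods, a simple moving average and simple exponential smoothing can be sketched as follows. This is an illustration only; the demand history is hypothetical and not taken from the slides.

```python
# Minimal sketch of a simple moving average and simple exponential smoothing.
# The demand series below is hypothetical.

demand = [120, 132, 101, 134, 90, 110, 128, 115]   # hypothetical monthly demand

def moving_average_forecast(history, window=3):
    """Forecast the next period as the mean of the last `window` observations."""
    return sum(history[-window:]) / window

def exponential_smoothing_forecast(history, alpha=0.3):
    """Simple exponential smoothing: level = alpha * demand + (1 - alpha) * previous level."""
    level = history[0]                    # initialise the level at the first observation
    for observation in history[1:]:
        level = alpha * observation + (1 - alpha) * level
    return level                          # the final level is the next-period forecast

print(moving_average_forecast(demand))            # mean of the last 3 periods
print(exponential_smoothing_forecast(demand))     # smoothed forecast with alpha = 0.3
```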
(Overview of the machine learning methods covered: Linear Regression, K-Nearest Neighbour, and the unsupervised methods K-Means and Hierarchical Clustering.)
Decision Tree
Outline
◦ Structure
◦ Entropy and Information Gain
◦ An example
◦ Issues
  ◦ Overfitting
  ◦ Continuous Variables
The aim is to split the data into subsets that are as homogeneous as possible.
$Gain(S, A) = E(S) - \sum_{v \in Values(A)} \frac{|S_v|}{|S|} E(S_v)$
An entropy of 0 means the set is completely homogeneous, while an entropy of 1 means it is completely heterogeneous (an even mix of the two classes).
The full data set contains 9 Yes and 5 No examples. Using $Entropy(S) = -\sum_i p_i \log_2 p_i$, its entropy is $-\frac{9}{14}\log_2\frac{9}{14} - \frac{5}{14}\log_2\frac{5}{14} = 0.94$. Since entropy ranges between 0 and 1, a value of 0.94 means the set is highly heterogeneous.
Splitting on Outlook gives $E(S_{Sunny}) = 0.971$, $E(S_{Overcast}) = 0$, and, with $S_{Rain} = [3+, 2-]$, $E(S_{Rain}) = 0.971$.
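These figures can be reproduced with a few lines of code. Below is a minimal sketch of the entropy and information-gain calculations, using only the class counts given in the slides (9 Yes / 5 No overall; Outlook subsets of [2+, 3−], [4+, 0−] and [3+, 2−]).

```python
# Minimal sketch of entropy and information gain from (positive, negative) counts.
from math import log2

def entropy(pos, neg):
    """Entropy of a set with `pos` positive and `neg` negative examples."""
    total = pos + neg
    if total == 0:
        return 0.0
    e = 0.0
    for count in (pos, neg):
        if count:                          # treat 0 * log2(0) as 0
            p = count / total
            e -= p * log2(p)
    return e

def information_gain(parent, subsets):
    """Gain(S, A) = E(S) - sum over subsets of |Sv|/|S| * E(Sv)."""
    total = sum(p + n for p, n in subsets)
    remainder = sum((p + n) / total * entropy(p, n) for p, n in subsets)
    return entropy(*parent) - remainder

print(entropy(9, 5))                                        # ~0.94
print(information_gain((9, 5), [(2, 3), (4, 0), (3, 2)]))   # ~0.246 = Gain(S, Outlook)
```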
$Gain(S, Outlook) = Entropy(S) - \sum_{v \in \{Sunny, Overcast, Rain\}} \frac{|S_v|}{|S|} Entropy(S_v)$
$= Entropy(S) - \frac{5}{14} Entropy(S_{Sunny}) - \frac{4}{14} Entropy(S_{Overcast}) - \frac{5}{14} Entropy(S_{Rain}) = 0.2464$

$Gain(S, Humidity) = Entropy(S) - \sum_{v \in \{High, Normal\}} \frac{|S_v|}{|S|} Entropy(S_v)$
$= Entropy(S) - \frac{7}{14} Entropy(S_{High}) - \frac{7}{14} Entropy(S_{Normal}) = 0.1516$
There are four candidate attributes (columns) to consider: Outlook, Temperature, Humidity, and Wind.
$Gain(S, Wind) = Entropy(S) - \sum_{v \in \{Strong, Weak\}} \frac{|S_v|}{|S|} Entropy(S_v)$
$= Entropy(S) - \frac{6}{14} Entropy(S_{Strong}) - \frac{8}{14} Entropy(S_{Weak}) = 0.0478$
Outlook has the highest information gain, so it is chosen as the root node:
◦ Sunny: {D1, D2, D8, D9, D11}, [2+, 3−] → ?
◦ Overcast: {D3, D7, D12, D13}, [4+, 0−] → Yes
◦ Rain: {D4, D5, D6, D10, D14}, [3+, 2−] → ?
$Gain(S_{Sunny}, Temperature) = Entropy(S_{Sunny}) - \sum_{v \in \{Hot, Mild, Cool\}} \frac{|S_v|}{|S_{Sunny}|} Entropy(S_v)$
$= Entropy(S_{Sunny}) - \frac{2}{5} Entropy(S_{Hot}) - \frac{2}{5} Entropy(S_{Mild}) - \frac{1}{5} Entropy(S_{Cool}) = 0.570$

$Gain(S_{Sunny}, Humidity) = Entropy(S_{Sunny}) - \sum_{v \in \{High, Normal\}} \frac{|S_v|}{|S_{Sunny}|} Entropy(S_v)$
$= Entropy(S_{Sunny}) - \frac{3}{5} Entropy(S_{High}) - \frac{2}{5} Entropy(S_{Normal}) = 0.97$
$Gain(S_{Sunny}, Wind) = Entropy(S_{Sunny}) - \sum_{v \in \{Strong, Weak\}} \frac{|S_v|}{|S_{Sunny}|} Entropy(S_v)$
$= Entropy(S_{Sunny}) - \frac{2}{5} Entropy(S_{Strong}) - \frac{3}{5} Entropy(S_{Weak}) = 0.0192$
Humidity has the highest information gain on the Sunny subset (0.97), so it becomes the test below the Sunny branch:
◦ Sunny → Humidity:
  ◦ High: {D1, D2, D8} → No
  ◦ Normal: {D9, D11} → Yes
◦ Overcast: {D3, D7, D12, D13}, [4+, 0−] → Yes
◦ Rain: {D4, D5, D6, D10, D14}, [3+, 2−] → ? (still to be split)
$S_{Rain} = [3+, 2-]$, $Entropy(S_{Rain}) = -\frac{3}{5}\log_2\frac{3}{5} - \frac{2}{5}\log_2\frac{2}{5} = 0.97$
$S_{Hot} \leftarrow [0+, 0-]$, $Entropy(S_{Hot}) = 0$
$S_{Mild} \leftarrow [2+, 1-]$, $Entropy(S_{Mild}) = -\frac{2}{3}\log_2\frac{2}{3} - \frac{1}{3}\log_2\frac{1}{3} = 0.9183$
$S_{Cool} \leftarrow [1+, 1-]$, $Entropy(S_{Cool}) = -\frac{1}{2}\log_2\frac{1}{2} - \frac{1}{2}\log_2\frac{1}{2} = 1$

$Gain(S_{Rain}, Temperature) = Entropy(S_{Rain}) - \sum_{v \in \{Hot, Mild, Cool\}} \frac{|S_v|}{|S_{Rain}|} Entropy(S_v)$
$= Entropy(S_{Rain}) - \frac{0}{5} Entropy(S_{Hot}) - \frac{3}{5} Entropy(S_{Mild}) - \frac{2}{5} Entropy(S_{Cool}) = 0.0192$
$Gain(S_{Rain}, Humidity) = Entropy(S_{Rain}) - \frac{2}{5} Entropy(S_{High}) - \frac{3}{5} Entropy(S_{Normal}) = 0.0192$

$Gain(S_{Rain}, Wind) = Entropy(S_{Rain}) - \frac{2}{5} Entropy(S_{Strong}) - \frac{3}{5} Entropy(S_{Weak}) = 0.97$
Recapitulation
Root: Outlook over {D1, D2, ..., D14} with [9+, 5−]
◦ Sunny → Humidity:
  ◦ High: {D1, D2, D8} → No
  ◦ Normal: {D9, D11} → Yes
◦ Overcast: {D3, D7, D12, D13}, [4+, 0−] → Yes
◦ Rain: {D4, D5, D6, D10, D14}, [3+, 2−] → ?
Final tree: Outlook at the root; the Sunny branch splits on Humidity (High → No, Normal → Yes), the Overcast branch predicts Yes, and the Rain branch splits on Wind (Strong → No, Weak → Yes).
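The same construction can be written as a short recursive procedure. Below is a minimal ID3-style sketch: at every node it picks the attribute with the highest information gain and recurses on each attribute value. The four-row data set at the bottom is purely hypothetical, just to show the call signature; plugging in the D1-D14 examples as dictionaries with a 'label' key should reproduce the tree above.

```python
# Minimal recursive ID3-style sketch (illustration, not the slides' exact procedure).
from math import log2
from collections import Counter

def entropy(labels):
    counts = Counter(labels)
    total = len(labels)
    return -sum(c / total * log2(c / total) for c in counts.values())

def information_gain(examples, attribute):
    remainder = 0.0
    for v in {e[attribute] for e in examples}:
        subset = [e for e in examples if e[attribute] == v]
        remainder += len(subset) / len(examples) * entropy([e["label"] for e in subset])
    return entropy([e["label"] for e in examples]) - remainder

def id3(examples, attributes):
    labels = [e["label"] for e in examples]
    if len(set(labels)) == 1 or not attributes:       # pure node, or nothing left to split on
        return Counter(labels).most_common(1)[0][0]
    best = max(attributes, key=lambda a: information_gain(examples, a))
    tree = {best: {}}
    for v in {e[best] for e in examples}:
        subset = [e for e in examples if e[best] == v]
        tree[best][v] = id3(subset, [a for a in attributes if a != best])
    return tree

# Hypothetical mini data set, only to show the call:
data = [
    {"Outlook": "Sunny", "Humidity": "High", "label": "No"},
    {"Outlook": "Sunny", "Humidity": "Normal", "label": "Yes"},
    {"Outlook": "Overcast", "Humidity": "High", "label": "Yes"},
    {"Outlook": "Rain", "Humidity": "High", "label": "Yes"},
]
print(id3(data, ["Outlook", "Humidity"]))
```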
Rule post-pruning
1. Convert the tree into an equivalent set of rules
   1. Each path corresponds to a rule
   2. Each node along a path corresponds to a pre-condition
   3. Each leaf classification corresponds to the post-condition
   (For example, the path Outlook = Sunny and Humidity = High becomes the rule IF Outlook = Sunny AND Humidity = High THEN No.)
2. Prune (generalize) each rule by removing those preconditions whose removal improves accuracy over the validation set
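A minimal sketch of this idea follows, assuming the tree has already been converted into rules. A rule is represented as a list of (attribute, value) preconditions plus a predicted label, and a precondition is dropped whenever doing so strictly improves accuracy on a validation set; both the rule and the validation examples below are hypothetical illustrations.

```python
# Minimal sketch of rule post-pruning against a (hypothetical) validation set.

def rule_accuracy(rule, validation):
    """Accuracy of the rule on the validation examples its preconditions match."""
    preconditions, label = rule
    matched = [e for e in validation if all(e.get(a) == v for a, v in preconditions)]
    if not matched:
        return 0.0
    return sum(e["label"] == label for e in matched) / len(matched)

def prune_rule(rule, validation):
    """Greedily drop any precondition whose removal strictly improves validation accuracy."""
    preconditions, label = rule
    best = list(preconditions)
    improved = True
    while improved and best:
        improved = False
        for i in range(len(best)):
            candidate = best[:i] + best[i + 1:]
            if rule_accuracy((candidate, label), validation) > rule_accuracy((best, label), validation):
                best, improved = candidate, True
                break
    return best, label

rule = ([("Outlook", "Sunny"), ("Humidity", "High")], "No")
validation = [  # hypothetical validation examples
    {"Outlook": "Sunny", "Humidity": "High", "label": "No"},
    {"Outlook": "Sunny", "Humidity": "Normal", "label": "No"},
    {"Outlook": "Sunny", "Humidity": "High", "label": "Yes"},
]
print(prune_rule(rule, validation))   # the Humidity precondition is dropped here
```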
$Gain(S, Temperature_{>t_1}) = Entropy(S) - \frac{2}{6} Entropy(S_{\le t_1}) - \frac{4}{6} Entropy(S_{>t_1}) = 0.4591$

$Gain(S, Temperature_{>t_2}) = Entropy(S) - \frac{5}{6} Entropy(S_{\le t_2}) - \frac{1}{6} Entropy(S_{>t_2}) = 0.1908$

where $t_1$ and $t_2$ are the two candidate thresholds for splitting the continuous Temperature attribute.
Recapitulation
$Gain(S, Temperature_{>t_1}) = 0.4591$
$Gain(S, Temperature_{>t_2}) = 0.1908$
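For a continuous attribute such as Temperature, candidate thresholds are placed midway between adjacent sorted values whose labels differ, and the threshold with the highest information gain is kept. A minimal sketch follows; the six (temperature, label) pairs are the usual textbook example and are an assumption rather than values read from the slides, chosen because they reproduce the gains of 0.4591 and 0.1908 above (with thresholds 54 and 85).

```python
# Minimal sketch of choosing a split threshold for a continuous attribute.
from math import log2

def entropy(labels):
    total = len(labels)
    if total == 0:
        return 0.0
    return -sum(labels.count(c) / total * log2(labels.count(c) / total) for c in set(labels))

def gain_for_threshold(pairs, t):
    labels = [label for _, label in pairs]
    below = [label for value, label in pairs if value <= t]
    above = [label for value, label in pairs if value > t]
    return entropy(labels) - (len(below) / len(labels)) * entropy(below) \
                           - (len(above) / len(labels)) * entropy(above)

# Hypothetical (temperature, label) pairs:
data = sorted([(40, "No"), (48, "No"), (60, "Yes"), (72, "Yes"), (80, "Yes"), (90, "No")])
# Candidate thresholds lie midway between adjacent examples with different labels.
candidates = [(v1 + v2) / 2 for (v1, l1), (v2, l2) in zip(data, data[1:]) if l1 != l2]
best = max(candidates, key=lambda t: gain_for_threshold(data, t))
print(candidates)                               # [54.0, 85.0]
print(best, gain_for_threshold(data, best))     # 54.0 with gain ~0.459
```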
K-Nearest Neighbour
Source: https://en.wikipedia.org/wiki/K-nearest_neighbors_algorithm
Evans (2016)
Data

ID   Age   Gender   Product
1    54    Male     C
2    32    Female   B
3    43    Female   B
4    75    Female   C
5    15    Female   A
6    52    Male     B
7    49    Female   B
8    10    Female   A
9    17    Female   C
10   49    Female   B

Prediction
With k = 3, predict the product that may be bought by a new customer with the following information: Age = 48, Gender = Male.
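A minimal sketch of how such a k-nearest-neighbour prediction can be computed is given below. It assumes the Male = 0 / Female = 1 encoding used in the distance table that follows and plain, unscaled Euclidean distance over (Age, Gender); with other encodings, distance conventions, or feature scaling, different neighbours (and hence a different product) can be selected.

```python
# Minimal k-NN sketch: Euclidean distance over (Age, Gender) and a majority vote.
from collections import Counter
from math import sqrt

customers = [  # (Age, Gender, Product) from the data table, with Male = 0 / Female = 1
    (54, 0, "C"), (32, 1, "B"), (43, 1, "B"), (75, 1, "C"), (15, 1, "A"),
    (52, 0, "B"), (49, 1, "B"), (10, 1, "A"), (17, 1, "C"), (49, 1, "B"),
]

def knn_predict(data, query, k):
    """Return the majority product among the k customers closest to `query`."""
    distances = sorted(
        (sqrt((age - query[0]) ** 2 + (gender - query[1]) ** 2), product)
        for age, gender, product in data
    )
    nearest = [product for _, product in distances[:k]]
    return Counter(nearest).most_common(1)[0][0]

print(knn_predict(customers, (48, 0), k=3))
```

Note that with raw ages the 0/1 gender coordinate contributes almost nothing to the distance, so feature scaling is usually worth considering in practice.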
Gender is encoded numerically: Male = 0, Female = 1.

ID   Age   Gender   Euclidean distance
1    54    0        6
2    32    1        32.01562
3    43    1        43.01163
4    75    1        75.00667
5    15    1        15.0333
6    52    0        52
7    49    1        49.0102
8    10    1        10.04988
9    17    1        17.02939
10   49    1        49.0102
New customer 11: Age 48, Gender 0
Conclusion

Age   Gender   Distance   Product
54    0        6          C
10    1        10.04988   A
15    1        15.0333    A
17    1        17.02939   C
32    1        32.01562   B
43    1        43.01163   B
49    1        49.0102    B
49    1        49.0102    B
52    0        52         B
75    1        75.00667   C

As the majority of the k = 3 nearest neighbors chose product A, we predict that the new customer will choose Product A.

Exercise: k = 4!

Patient ID   Age   Gender   Blood Type   Demand
1            78    Female   AB+          Fresh Frozen Plasma
2            61    Male     B+           Red Blood Cell
3            18    Male     A-           Whole Blood
4            85    Female   B-           Platelets
5            95    Female   A+           Whole Blood
6            84    Male     A-           Fresh Frozen Plasma
7            33    Female   O-           Red Blood Cell
9            34    Male     A+           Fresh Frozen Plasma
10           93    Female   A+           Whole Blood
11           57    Male     A+           ????
K-Means
It captures the insight that each point in a cluster should be near to the center of that cluster.
1. Choose k, the number of clusters we want to find in the data. Then the centers of those k clusters, called centroids, are initialized in some fashion.
2. Do:
   1. Reassign Points step: assign every point in the data to the cluster whose centroid is nearest to it.
   2. Update Centroids step: recalculate each centroid's location as the mean (center) of all the points assigned to its cluster. If the points stop switching clusters, end the algorithm. Otherwise, go back to step 2.1.
(A minimal code sketch of this loop is given after the customer data below.)
Customer data (ID: two attribute values):
1: (2, 10)    2: (2, 5)    3: (8, 4)    4: (5, 8)    5: (7, 5)
6: (6, 4)     7: (1, 2)    8: (4, 9)    9: (62, 61)  10: (37, 13)

Cluster the customers into two clusters!
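Below is a minimal sketch of the k-means loop described earlier, applied to these ten customers and using Manhattan distance as in the worked iterations that follow. The initial centroids are an assumption (the slides' initialisation is not recoverable); starting from customers 1 and 2 happens to converge to the same centroids, (49.5, 37) and (4.375, 5.875), as the worked example.

```python
# Minimal k-means sketch with Manhattan distance and mean-based centroid updates.

points = [(2, 10), (2, 5), (8, 4), (5, 8), (7, 5),
          (6, 4), (1, 2), (4, 9), (62, 61), (37, 13)]

def manhattan(a, b):
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

def k_means(points, centroids):
    while True:
        # Reassign Points step: each point goes to the nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: manhattan(p, centroids[i]))
            clusters[nearest].append(p)
        # Update Centroids step: each centroid becomes the mean of its cluster.
        new_centroids = [
            (sum(x for x, _ in c) / len(c), sum(y for _, y in c) / len(c)) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
        if new_centroids == centroids:     # nothing changed: the algorithm has converged
            return clusters, centroids
        centroids = new_centroids

clusters, centroids = k_means(points, centroids=[points[0], points[1]])  # assumed initialisation
print(centroids)   # [(49.5, 37.0), (4.375, 5.875)] with this initialisation
print(clusters)    # clusters {9, 10} and {1, ..., 8}
```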
Iteration 1
◦ Determine the initial centroids
◦ Compute the distance between each customer and each centroid
◦ Calculate the minimum distance for each customer
◦ Assign each customer to the closest centroid (cluster each customer)
◦ Determine the new centroid per cluster; for example, one cluster's new centroid is ((8+7+6+1)/4, (4+5+4+2)/4) = (5.5, 3.75)
Iteration 2
◦ Compute the distance of each customer to the new centroids
◦ Calculate the minimum distance for each customer
◦ Assign each customer to the closest centroid
◦ Determine the new centroid for each cluster
For example, customer 10 is at distance 40.75 from the first centroid and 23 from the second, so its minimum distance is 23 and it is assigned to cluster 2.
Iteration 3
◦ Compute the Manhattan distance of each customer to the new centroids (4.375, 5.875) and (49.5, 37)
◦ Calculate the minimum distance for each customer
◦ Assign each customer to the closest centroid
(Table columns: ID, distance to (4.375, 5.875), distance to (49.5, 37), minimum distance, cluster.)
Exercise: k = 3

Supplier ID   Experience (in years)   Capacity (in ton)
1             2                       10
2             2                       5
3             8                       4
4             5                       8
5             7                       5
6             6                       4
7             1                       2
8             4                       9

Initial centroids: suppliers 1, 4, and 7

Hierarchical Clustering
Introduction
A hierarchical clustering method works by grouping data into a tree of clusters.
Hierarchical (agglomerative) clustering begins by treating every data point as a separate cluster. Then it repeatedly executes the following steps:
◦ Identify the two clusters that are closest together, and
◦ Merge these two most similar clusters.
◦ Continue until all the clusters are merged together.

Dendrogram
A tree-like structure which represents the hierarchical technique:
◦ Leaf: an individual data point
◦ Root: a single cluster containing all the points
A cluster at level i is the merger of its child clusters at level i+1.
(Example dendrogram over six points labelled A to F.)
Distances
Single (nearest) distance, or single linkage
◦ Distance between the closest members of the two clusters
Complete (farthest) distance, or complete linkage
◦ Distance between the members that are the farthest apart

Example
(Scatter plot of six points labelled 1 to 6.)
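Below is a minimal sketch of agglomerative clustering with single linkage: start with every point in its own cluster and repeatedly merge the pair of clusters whose closest members are nearest. The six coordinates are hypothetical stand-ins for the plotted example; the printed merge order can be read like a dendrogram.

```python
# Minimal single-linkage agglomerative clustering sketch.
from math import dist   # Euclidean distance between two points (Python 3.8+)

points = {1: (1.0, 1.0), 2: (1.5, 1.5), 3: (5.0, 5.0),   # hypothetical coordinates
          4: (3.0, 4.0), 5: (4.0, 4.0), 6: (3.0, 3.5)}

def single_linkage(c1, c2):
    """Distance between the closest members of the two clusters."""
    return min(dist(points[a], points[b]) for a in c1 for b in c2)

clusters = [[i] for i in points]          # every point starts as its own cluster
while len(clusters) > 1:
    # Find the pair of clusters with the smallest single-linkage distance.
    i, j = min(
        ((i, j) for i in range(len(clusters)) for j in range(i + 1, len(clusters))),
        key=lambda pair: single_linkage(clusters[pair[0]], clusters[pair[1]]),
    )
    print("merge", clusters[i], "+", clusters[j],
          "at distance", round(single_linkage(clusters[i], clusters[j]), 3))
    merged = clusters[i] + clusters[j]
    clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
```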
(Step-by-step construction: at each step the smallest distance in the distance matrix is identified and the two corresponding clusters are merged; the process repeats until a single cluster remains. The resulting dendrogram has its leaves in the order 3, 6, 2, 5, 4, 1.)
References
Chopra, S. and Meindl, P. (2019). Supply Chain Management: Strategy, Planning, and Operation. (7th edition). Pearson.
Evans, J. R. (2016). Business Analytics. (2nd Edition). Pearson.
http://www.r2d3.us/visual-intro-to-machine-learning-part-1/
https://en.wikipedia.org/wiki/K-nearest_neighbors_algorithm
https://www.naftaliharris.com/blog/visualizing-k-means-clustering/
https://www.geeksforgeeks.org/hierarchical-clustering-in-data-mining/?ref=gcse
Various tutorials from http://www.anuradhabhatia.com
Various lecture series from https://www.vtupulse.com/
https://towardsdatascience.com/entropy-how-decision-trees-make-decisions-2946b9c18c8