Anomaly Detection
Algorithms in Machine Learning, ISAE-SUPAERO
Jérémy Pirard
Data Scientist
Airbus Commercial Aircraft
Anomaly detection: intuition
Build a model to detect anomalies (labeled in red here)... what do you do?
Supervised Learning?
Naive Bayes classifier, Random Forest, SVM...
→ Features = X1, X2
→ Label = anomaly or not (0 or 1)
Anomaly detection: intuition
Build a model to detect anomalies... what do you do if new types of anomalies appear?
Anomaly detection: intuition
Build a model to detect anomalies... what do you do if there are no labels at all?
Anomaly detection: definition and scope
What is an anomaly?
1/ Generally: a rare individual (row) in a dataset that differs significantly from the majority of the data
2/ Sometimes: anomalies are not so rare, and may not be so different from the majority of the data...
[Figure: data clusters labeled "normal during winter", "normal during mid-season", "normal during summer", and "anomalies"]
Anomaly detection: definition and scope
Why not use Supervised Learning with a labeled dataset?
→ Very unbalanced dataset: e.g. 5 anomalies for 100,000 normal points...
→ Lack of coverage of all anomaly types: an anomaly is something not expected, so what if a new type happens?
We need other approaches...
Outlier detection: the dataset contains anomalies in the sense of statement 1/ (rare + statistically different)
→ Detect elements in this same dataset which differ from the majority of the data
Novelty detection: you have a clean dataset without anomalies (in the sense of 1/ or 2/)
→ Learn the normal behavior, to be able to check if a new item is normal or an anomaly
→ Some techniques can be used for both, but be aware of the approach you are using, and why...
Outlier detection
1/ Anomaly = a rare individual (row) in a dataset that differs significantly from the majority of the data
Outlier detection: the dataset contains anomalies in the sense of statement 1/
→ Detect elements in this same dataset which differ from the majority of the data
Outlier detection: 1D
Example: a single feature x.
Univariate case: in one dimension (one variable), how would you detect anomalies?
Remember your normal distribution!
→ Mean and std help quantify the density of the data
→ Outliers = points outside [mean - 2×std, mean + 2×std]
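A minimal sketch of this 2×std rule on synthetic data (numpy only; the injected outlier values are arbitrary). For a truly normal sample, about 5% of perfectly normal points also fall outside ±2×std, which already shows why the threshold is a tuning choice:

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(0, 1, 1000), [8.0, -7.5]])  # normal data + 2 injected outliers

mean, std = x.mean(), x.std()
is_outlier = (x < mean - 2 * std) | (x > mean + 2 * std)
print(x[is_outlier])  # the 2 injected points, plus ~5% of genuinely normal points
```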
Outlier detection: 1D
Are mean and std always reliable?
They quantify the data density in the case of a normal distribution... which is not always the case! They are also sensitive to outliers: extreme or numerous outliers distort the estimation!
What is the alternative? Let's go MAD!
→ Robustify the mean? → median
→ Robustify the standard deviation? → MAD
MAD = Median Absolute Deviation = median( | x - median(x) | )
Example:
→ Outliers = points outside [median - 3×MAD, median + 3×MAD]
[Figure: robust thresholds]
What threshold to use? Why 3×MAD?
→ There are relationships linking percentiles to the median and MAD
→ They depend on the type of distribution...
→ Thresholds always need human intervention / fine-tuning!
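A possible implementation of the MAD rule (a sketch; the factor k = 3 follows the slide and is distribution-dependent: for a normal distribution MAD ≈ 0.6745×std, so 3×MAD ≈ 2×std):

```python
import numpy as np

def mad_outliers(x, k=3.0):
    """Flag points outside [median - k*MAD, median + k*MAD]."""
    med = np.median(x)
    mad = np.median(np.abs(x - med))
    return np.abs(x - med) > k * mad

rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(0, 1, 1000), [15.0, 20.0, -18.0]])
print(x[mad_outliers(x)])  # median and MAD stay stable despite the 3 extreme points
```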
Outlier detection: nD
Multivariate case: how do we generalize what we saw in 1D?
→ 1st approach: median and MAD on each of the variables (still univariate...). This does not take into account at all the relationship between x1 and x2.
→ We can use the covariance matrix! Mahalanobis distance of a point x to the distribution with mean μ and covariance Σ:
d(x) = sqrt( (x - μ)ᵀ Σ⁻¹ (x - μ) )
If the distribution is scaled (whitened), this is just the Euclidean distance to the center!
→ Elliptic envelope: flag points whose Mahalanobis distance exceeds a threshold... to be defined!
For more details on robust covariance estimator (FastMCD algorithm):
A Fast Algorithm for the Minimum Covariance Determinant Estimator
Peter J. Rousseeuw and Katrien Van Driessen
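In scikit-learn this corresponds to sklearn.covariance.EllipticEnvelope; a minimal sketch on synthetic correlated data (the contamination value is an assumption and plays the role of the threshold):

```python
import numpy as np
from sklearn.covariance import EllipticEnvelope

rng = np.random.default_rng(0)
X = rng.multivariate_normal([0, 0], [[1, 0.8], [0.8, 1]], size=500)
X = np.vstack([X, [[4, -4], [-4, 4]]])  # moderate per-variable values, but they break the correlation

ee = EllipticEnvelope(contamination=0.01).fit(X)
labels = ee.predict(X)  # +1 = inlier, -1 = outlier
d2 = ee.mahalanobis(X)  # squared Mahalanobis distances: the underlying continuous score
print(X[labels == -1])
```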
Outlier detection: nD
Minimum Covariance Determinant (MCD)
→ 1/ Randomly select a subset of data points
→ 2/ Compute the covariance matrix, its determinant, and the mean on this subset
→ 3/ Repeat 1 and 2 several times and keep the matrix with the smallest determinant (the determinant of the covariance matrix "measures" how broad the distribution is)
→ 4/ Compute the Mahalanobis distance of each observation based on this robust estimate
... Again, a threshold has to be defined ...
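The robust estimate itself is exposed as sklearn.covariance.MinCovDet; a sketch where the decision threshold on the robust Mahalanobis distances is left as an explicit quantile choice:

```python
import numpy as np
from sklearn.covariance import MinCovDet

rng = np.random.default_rng(0)
X = rng.multivariate_normal([0, 0], [[1, 0.8], [0.8, 1]], size=500)
X[:10] += 8  # contaminate 2% of the points

mcd = MinCovDet(random_state=0).fit(X)  # robust location + covariance via FastMCD
d2 = mcd.mahalanobis(X)                 # squared robust Mahalanobis distances
threshold = np.quantile(d2, 0.98)       # ... again, the threshold is up to you
print(np.where(d2 > threshold)[0])      # indices 0..9: the contaminated points
```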
Outlier detection: nD
Other methods - Isolation Forest
→ 1/ Build an Isolation Tree: recursively split the entire dataset on random variables with random thresholds
→ 2/ Repeat to build 100, 1000 trees...
→ 3/ The average depth of a point in the forest gives its anomaly score*:
low depth = high anomaly score
high depth = low anomaly score
Once again, a threshold has to be defined!
Advantages:
→ Few hyperparameters to tune
→ Linear complexity: runtime does not explode with data volume!
* Anomaly score = average depth normalized by the average depth of unsuccessful searches in a binary search tree. For more details: Isolation-based Anomaly Detection, Fei Tony Liu and Kai Ming Ting
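A minimal sklearn.ensemble.IsolationForest sketch (data and contamination value are illustrative; max_samples=256 anticipates the subsampling discussed on the next slide):

```python
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (1000, 2)), [[6, 6], [-5, 7]]])

iso = IsolationForest(n_estimators=100, max_samples=256,
                      contamination=0.01, random_state=0).fit(X)
labels = iso.predict(X)        # +1 = inlier, -1 = outlier
scores = iso.score_samples(X)  # continuous score; higher = more normal in sklearn's convention
print(X[labels == -1])         # isolated points get short average path lengths
```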
Outlier detection: nD
Other methods - Isolation Forest (continued)
"Split the entire dataset" is not exactly true... each tree splits only a subsample of the data (256 points max) to avoid swamping and masking:
Swamping: predicting normal points as anomalies, because the local density is lowered
Masking: locally dense clusters of anomalies, causing these anomalies to be predicted as normal points
Subsampling reduces these 2 effects.
Image taken from: Isolation-based Anomaly Detection, Fei Tony Liu and Kai Ming Ting
Outlier detection: nD
Other methods - Local Outlier Factor (LOF)
→ 1/ For each point A, compute the k-distance: the distance of A to its k-th nearest neighbour
→ 2/ Compute the Reachability Distance (RD) of A to each neighbour B: reachability-distance_k(A, B) = max{ k-distance(B), d(A, B) }
→ 3/ Compute the Local Reachability Density (LRD) of A: the inverse of the average RD of A to its k neighbours. A low LRD means the closest cluster of points is "far" from A.
→ 4/ Local Outlier Factor = average LRD of the k neighbours of A, divided by the LRD of A:
LOF ≤ 1: similar or higher density than the neighbours = low anomaly score
LOF > 1: lower density than the neighbours = high anomaly score
Once again, a threshold has to be defined!
Advantage:
→ Locality: points close to a very dense cluster can still be anomalies, unlike with "border"-based methods
Inconvenient:
→ The anomaly score (a ratio) is hard to interpret
For more details:
LOF: Identifying Density-Based Local Outliers, Markus M. Breunig, Hans-Peter Kriegel, Raymond T. Ng, Jörg Sander
LoOP: Local Outlier Probabilities, Hans-Peter Kriegel, Peer Kröger, Erich Schubert, Arthur Zimek
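A minimal sklearn.neighbors.LocalOutlierFactor sketch illustrating the locality aspect: the last point sits near, but not inside, a very dense cluster, so a global distance criterion would consider it unremarkable while LOF flags it (data and parameters are illustrative assumptions):

```python
import numpy as np
from sklearn.neighbors import LocalOutlierFactor

rng = np.random.default_rng(0)
dense = rng.normal(0, 0.2, (500, 2))   # tight cluster around the origin
sparse = rng.normal(5, 1.5, (500, 2))  # loose cluster further away
X = np.vstack([dense, sparse, [[1.0, 1.0]]])  # last point: just outside the tight cluster

lof = LocalOutlierFactor(n_neighbors=20, contamination=0.01)
labels = lof.fit_predict(X)                 # +1 = inlier, -1 = outlier
lof_scores = -lof.negative_outlier_factor_  # back to the "LOF > 1 = outlier" convention
print(labels[-1], lof_scores[-1])           # -1, with a LOF well above 1
```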
Outlier detection: nD
Other methods - One Class SVM
→ Simple idea: draw a circle around your data points!
→ Allow a "soft margin", tuned with the parameter nu (the contamination rate), because there are outliers in your dataset
→ With a kernel: project the dataset into a higher-dimensional space, compute the circle there, and translate it back into a non-linear boundary in the initial space!
Outside the circle: outlier. Inside the circle: normal point.
The kernel trick is used as in a regular SVM: no need to know the projection explicitly, just the dot products...
Once again, a threshold has to be defined!
Advantage:
→ Complex boundary definition
Inconvenient:
→ Very sensitive to the threshold and the choice of kernel...
See https://scikit-learn.org/stable/modules/generated/sklearn.svm.OneClassSVM.html
For more details:
Estimating the Support of a High-Dimensional Distribution
Bernhard Scholkopf et al.
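A minimal sklearn.svm.OneClassSVM sketch (synthetic two-blob data; the nu and gamma values are assumptions to experiment with):

```python
import numpy as np
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(0)
# Two separate blobs: no single linear boundary can enclose them
X = np.vstack([rng.normal([0, 0], 0.5, (300, 2)),
               rng.normal([4, 4], 0.5, (300, 2))])

# nu bounds the fraction of training points allowed outside the boundary;
# gamma controls how tightly the RBF boundary hugs the data
ocsvm = OneClassSVM(kernel="rbf", nu=0.05, gamma=0.5).fit(X)
labels = ocsvm.predict(X)     # +1 inside the boundary, -1 outside
print((labels == -1).mean())  # close to nu by construction
```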
Outlier detection: nD
It's time to play with these methods in sklearn... The main interest is to play with the parameters, see their impact on the detection boundaries, and explain it through the theory. A minimal sandbox is sketched below.
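A possible starting point for such experiments, in the spirit of scikit-learn's outlier detection comparison examples (the dataset and parameter values are assumptions meant to be varied):

```python
import numpy as np
from sklearn.covariance import EllipticEnvelope
from sklearn.ensemble import IsolationForest
from sklearn.neighbors import LocalOutlierFactor
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (980, 2)), rng.uniform(-6, 6, (20, 2))])

estimators = {
    "EllipticEnvelope": EllipticEnvelope(contamination=0.02),
    "IsolationForest": IsolationForest(contamination=0.02, random_state=0),
    "LocalOutlierFactor": LocalOutlierFactor(n_neighbors=35, contamination=0.02),
    "OneClassSVM": OneClassSVM(nu=0.02, gamma=0.1),
}
for name, est in estimators.items():
    labels = est.fit_predict(X)  # try changing the parameters above and compare
    print(f"{name}: {np.sum(labels == -1)} points flagged")
```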
Outlier detection: score vs decision
Be careful!
→ These unsupervised methods give only a relative, continuous measure of abnormality:
→ Elliptic envelope: Mahalanobis distance
→ iForest: average depth
→ LOF: density ratio with the neighbours
→ ...
→ The decision itself (outlier or not) is proposed by default by these methods, but it always requires threshold tuning!
→ Human intervention is always needed, especially with complex interdependent systems!
→ For example: cross the anomaly scores with manual cluster analysis (PCA), geometrical interpretation...
This is NOT because the technology is not mature enough... BUT because the problem is badly formulated: "anomaly" is not clearly defined a priori, and statistics will never tell you what it is! Human intervention is needed for the threshold and/or the decision!
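To make the score / decision distinction concrete, here is a sketch contrasting a model's built-in decision with a human-chosen threshold applied to the same continuous scores (the 0.2% quantile is an arbitrary stand-in for a domain decision):

```python
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(0)
X = rng.normal(0, 1, (1000, 2))

iso = IsolationForest(random_state=0).fit(X)
scores = iso.score_samples(X)  # continuous anomaly score (higher = more normal)

default_labels = iso.predict(X)             # sklearn's default decision
my_threshold = np.quantile(scores, 0.002)   # "flag the 0.2% most abnormal"
my_labels = np.where(scores < my_threshold, -1, 1)
print(np.sum(default_labels == -1), np.sum(my_labels == -1))  # same scores, different decisions
```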
Novelty detection
2/ Anomalies = not so rare, and may not be so different from the majority of the data...
Novelty detection: you have a clean dataset without anomalies (in the sense of 1/ or 2/)
→ Learn the normal behavior, to be able to check if a new item is normal or an anomaly
[Figure: data clusters labeled "normal during winter", "normal during mid-season", "normal during summer", and "anomalies"]
Novelty detection
Basic principle
Clean dataset without anomalies: "learn" the normal behavior, then predict the value / score of new points to find out whether they match the normal behavior or not.
→ The unsupervised methods we have seen can be used in this case (One-Class SVM is even better at novelty detection than at outlier detection!)
New possibilities
Why not use supervised learning to learn the normal behavior? Predict each variable using the others as features:
Model 1: v1 = f(v2, v3)
Model 2: v2 = f(v1, v3)
Model 3: v3 = f(v1, v2)
→ with linear regression, Random Forest, SVM...
Training data (clean):
v1    v2   v3
8.4   15   2.2
9.1   10   5.1
...   ...  ...
→ A new point comes in: (x1, x2, x3)
→ Compute the predictions [x1] = f(x2, x3), [x2] = f(x1, x3), [x3] = f(x1, x2)
→ Compute the errors [xi] - xi: squared error, absolute error...
High error = does not fit the "normal" model = high anomaly score
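A sketch of this per-variable regression scheme, assuming a clean training set with a known dependency structure (the data, the Random Forest choice, and the squared-error aggregation are illustrative):

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
# Clean training set: v2 ≈ 2*v1 and v3 ≈ 3*v1
v1 = rng.normal(0, 1, 1000)
X_train = np.column_stack([v1,
                           2 * v1 + rng.normal(0, 0.1, 1000),
                           3 * v1 + rng.normal(0, 0.1, 1000)])

# One model per variable, trained on the other variables
models = []
for i in range(X_train.shape[1]):
    other = np.delete(X_train, i, axis=1)
    models.append(RandomForestRegressor(n_estimators=50, random_state=0)
                  .fit(other, X_train[:, i]))

def anomaly_score(x):
    """Sum of squared prediction errors [xi] - xi over the per-variable models."""
    return sum((m.predict(np.delete(x, i).reshape(1, -1))[0] - x[i]) ** 2
               for i, m in enumerate(models))

print(anomaly_score(np.array([1.0, 2.0, 3.0])))   # respects the structure: low score
print(anomaly_score(np.array([1.0, -2.0, 3.0])))  # breaks it: high score
```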
Novelty detection
The rise of deep learning and neural networks opens new possibilities in anomaly detection.
Example: AutoEncoders
→ Use the reconstruction error as an anomaly score
→ The choice of architecture and loss is problem-dependent and requires lots of iterations
Going further:
- Variational autoencoders: https://github.com/Michedev/VAE_anomaly_detection
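Real autoencoders are usually built with Keras or PyTorch; purely as a sketch of the reconstruction-error idea, an MLPRegressor with a bottleneck layer, trained to reproduce its own input, can serve as a minimal stand-in (data and architecture are illustrative assumptions):

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
# "Normal" data lives near a 1D curve embedded in 3D space
t = rng.normal(0, 1, (2000, 1))
X_train = np.hstack([t, t ** 2, 2 * t]) + rng.normal(0, 0.05, (2000, 3))

# The size-2 bottleneck forces the network to learn the normal structure
ae = MLPRegressor(hidden_layer_sizes=(8, 2, 8), max_iter=2000,
                  random_state=0).fit(X_train, X_train)

def reconstruction_error(X):
    return np.mean((ae.predict(X) - X) ** 2, axis=1)

print(reconstruction_error(np.array([[1.0, 1.0, 2.0]])))   # on the normal manifold: low
print(reconstruction_error(np.array([[1.0, 5.0, -3.0]])))  # off the manifold: high
```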
Synthesis
Formulate the problem: what is an anomaly?
Anomaly detection
→ Outlier detection: unsupervised methods
→ Novelty detection: unsupervised methods, supervised methods to learn normality
Human intervention
→ Cluster-based or geometrical analysis
→ Reformulate the problem, define a threshold, find a decision model
What's next?
This is not the end of the story... Monitoring the performance of the deployed algorithm is key.
Data drift monitoring: does my input data still have the same characteristics? Sensor issues?
Concept drift: is my understanding of the anomaly still relevant? Explainability?
→ Data collection, retraining strategy...
Quick piece of advice...
Machine Learning is a complex field → a lot of models, ideas, approaches, theories... every day!
How to keep up the rhythm? → Build your own understanding, from global to detail.
Example:
→ Goals (qualitative, global): based on historical data, detect when the behavior is changing
→ Means: novelty detection, learn the normal past behavior, use the prediction error as an anomaly score
→ Techniques (quantitative, detail): Random Forest regression to predict each feature as a function of the others, use the mean squared error
Questions?