CHAPTER V: SUPPORT VECTOR MACHINE (SVM)
This chapter introduces one of the most widely used methods in data mining: SVM. Highly
regarded for its performance, this method often serves as a benchmark against which other
methods are compared.
5.1 INTRODUCTION TO SVM:
SVM (Support Vector Machine) is a binary classification method based on supervised
learning. It was introduced by Cortes and Vapnik in 1995 [12]. The method searches for a
linear separator of the data in a given space. As a two-class classification problem, it uses a
training dataset to estimate the model parameters. It relies on so-called kernel functions,
which allow an optimal separation of the data.
Initially, the method was proposed by its authors for two-class classification, but it was
later generalised to n classes.
5.2 BASIC CONCEPTS:
For two given classes of examples, the aim of SVM is to find a classifier that will
separate the data and maximise the distance between the two classes. With SVM, this
classifier is a linear classifier called a hyperplane.
In the figure below, we need to find a hyperplane that separates two classes of points. Class 1
is represented by plus symbols. Class 2 is represented by round symbols.
Fig 5.1 SVM goal: separate the data
5.2.1 Hyperplane, support vectors, margin
The closest points, which are the only ones used to determine the hyperplane, are called
support vectors. They are highlighted in the figure. Clearly, we must have at least one support
vector for each class.
Fig 5.2 Hyperplane and support vectors
Obviously, there is a multitude of valid hyperplanes, but the remarkable property of SVMs is
that this hyperplane must be optimal. We are therefore going to look for the hyperplane that
passes "in the middle" of the points of the two classes of examples. Intuitively, this is like
looking for the "safest" hyperplane. If an example has not been described perfectly, a small
variation will not change its classification if its distance from the hyperplane is large.
Formally, this amounts to looking for a hyperplane whose minimum distance to the training
examples is the maximum. This distance is called the "margin" between the hyperplane and
the examples. The optimal separating hyperplane is the one that maximises the margin. As we
are looking to maximise this margin, we will talk about separators with a wide margin.
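To make these notions concrete, here is a minimal sketch (assuming the scikit-learn library is available; the data and parameter values are purely illustrative) that fits a linear SVM on a small two-dimensional dataset and reads off the support vectors, the separating hyperplane and the margin width.

# Minimal sketch, assuming scikit-learn: fit a linear SVM and inspect
# the support vectors, the separating hyperplane and the margin.
import numpy as np
from sklearn.svm import SVC

# Illustrative toy data: two well-separated classes in the plane.
X = np.array([[1, 2], [2, 1], [2, 2], [6, 5], [7, 7], [6, 6]])
y = np.array([-1, -1, -1, 1, 1, 1])

clf = SVC(kernel="linear", C=1e6)   # very large C: (almost) hard margin
clf.fit(X, y)

w = clf.coef_[0]                    # normal vector of the hyperplane
b = clf.intercept_[0]               # bias term
print("Support vectors:", clf.support_vectors_)
print("Hyperplane: %.2f*x + %.2f*y + %.2f = 0" % (w[0], w[1], b))
print("Margin width:", 2 / np.linalg.norm(w))

The support vectors printed here are the points closest to the hyperplane, exactly as in figure 5.2.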
5.2.2 Why maximise the margin?
Intuitively, having a wider margin provides greater confidence when classifying a new
example. Moreover, the classifier that leaves the widest margin on the training data can also
be expected to classify new examples best. In figure 5.3, the right-hand side shows that with
the optimal hyperplane, a new example remains well classified even though it falls within the
margin. On the left, with a smaller margin, the same example is misclassified.
Fig 5.3 SVM objective: find the best separating hyperplane giving the largest margin (left: hyperplane with low margin; right: best separating hyperplane)
In general, the classification of a new unknown example is given by its position relative to the
optimal hyperplane.
5.2.3 Example of SVM use
To illustrate the principle of SVM, consider the following set containing data from two
classes C1 and C2 (table 5.1). Let us carry out the following tasks:
Represent the data on a plane.
Find, approximately, the optimal separating hyperplane (a straight line).
Give its equation.
Consider the new data point (7, 4). Will it be classified in C1 or C2?
Table 5.1 Dataset for SVM
X Y Class
1 2 C2
2 4 C2
3 3 C2
4 2 C2
4 4 C2
6 6 C1
8 4 C1
6 8 C1
7 8 C1
5 9 C1
9 9 C1
Figure 5.4 shows the graphical representation of the data. Approximately, we can see that the
support vectors on the C1 side are the points (6, 6) and (8, 4). On the C2 side, the support
vector is the point (4, 4). The equation of the separator is y = -x + 10.
From these considerations, we can conclude that the new point (7, 4) will be assigned to class
C1: since 7 + 4 = 11 > 10, it lies on the same side of the separator as the points of C1.
It is important to note that in this simple example, we have designated the support
vectors visually so that the equation of the separator can easily be written down. However, as
we will see in the mathematical formulation of the SVM problem, finding the equation of the
optimal separator requires solving a constrained optimisation problem.
Fig 5.4 Example of the use of SVM.
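The same example can be checked numerically. The following sketch (assuming scikit-learn; the hard-margin setting is approximated with a very large C) trains a linear SVM on the data of table 5.1 and classifies the new point (7, 4).

# Minimal sketch, assuming scikit-learn: reproduce the example of table 5.1.
import numpy as np
from sklearn.svm import SVC

X = np.array([[1, 2], [2, 4], [3, 3], [4, 2], [4, 4],           # class C2
              [6, 6], [8, 4], [6, 8], [7, 8], [5, 9], [9, 9]])  # class C1
y = np.array(["C2"] * 5 + ["C1"] * 6)

clf = SVC(kernel="linear", C=1e6).fit(X, y)
print(clf.support_vectors_)   # expected to contain (4, 4), (6, 6) and (8, 4)
print(clf.predict([[7, 4]]))  # expected: ['C1']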
5.3 LINEARITY AND NON-LINEARITY
SVM models cover both the linearly separable case and the non-linearly separable case. The
former is the simplest, since a linear classifier can be found directly. However, in most real-
world problems no linear separation of the data is possible, and the maximum-margin
classifier cannot be used as such, because it only works when the classes of the training data
are linearly separable.
Figure 5.5 summarises this linearity problem. In the figure on the left, the data is linearly
separable and a linear classifier exists. In the figure on the right, the data is not linearly
separable and such a classifier cannot be found directly; the data must first be transformed,
as explained below.
Fig 5.5 Linearly separable data (left) vs. non-linearly separable data (right)
5.3.1 Solving linear cases:
In cases where the data is linearly separable, finding the optimal hyperplane with SVM
reduces to solving a constrained quadratic programming problem.
A linear model corresponds to the following general equation:
f(x) = w.x + b (5.1)
The separating hyperplane (decision frontier) therefore has the equation:
w.x + b = 0 (5.2)
The distance of a point x of the plane from the separating hyperplane is given by:
d(x) = |w.x + b| / ||w|| (5.3)
where ||w|| is the norm of the vector w.
The optimal hyperplane is the one for which the distance to the nearest points (the margin) is
maximum. Let x1 and x2 be the nearest points (support vectors) of the two classes, with
f(x1) = +1 and f(x2) = -1.
We can therefore write:
(w.x1) + b = +1 and (w.x2) + b = -1 (5.4)
So, we have:
w.(x1 - x2) = 2 (5.5)
If we divide both sides of the previous equation by the norm of w, we obtain:
w.(x1 - x2) / ||w|| = 2 / ||w|| (5.6)
We can therefore deduce that maximising the margin is equivalent to minimising ||w|| under
certain constraints:
minimise (1/2)||w||²  subject to  yi (w.xi + b) ≥ 1 for every training example xi of label yi (5.7)
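As an illustration, problem (5.7) can be handed to a generic constrained optimiser on a tiny toy dataset. The sketch below assumes NumPy and SciPy are available and is only meant to show the structure of the problem; real SVM implementations use dedicated quadratic-programming or SMO solvers instead.

# Minimal sketch, assuming NumPy and SciPy: solve the hard-margin primal
# (5.7), i.e. minimise (1/2)||w||^2 subject to yi (w.xi + b) >= 1.
import numpy as np
from scipy.optimize import minimize

# Illustrative linearly separable data.
X = np.array([[1., 2.], [2., 1.], [6., 6.], [7., 5.]])
y = np.array([-1., -1., 1., 1.])

def objective(p):                  # p = (w1, w2, b)
    w = p[:2]
    return 0.5 * np.dot(w, w)

constraints = [{"type": "ineq",
                "fun": lambda p, xi=xi, yi=yi: yi * (np.dot(p[:2], xi) + p[2]) - 1}
               for xi, yi in zip(X, y)]

res = minimize(objective, x0=np.zeros(3), constraints=constraints)
w, b = res.x[:2], res.x[2]
print("w =", w, "b =", b, "margin =", 2 / np.linalg.norm(w))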
5.3.2 Solving non-linear cases:
To overcome the drawbacks of non-linearly separable cases, the idea behind SVMs is
to change the data space. The non-linear transformation of the data can allow a linear
separation of the examples in a new space. We therefore have a change of dimension. This
new dimension is called the "re-description space". Intuitively, the higher the dimension of
the re-description space, the higher the probability of finding a separating hyperplane between
the examples. This is illustrated by figure 5.6.
Fig 5.6 Transformation of non-linear separable data to another space.
We therefore have a transformation of a non-linear separation problem in the representation
space into a linear separation problem in a higher-dimensional re-description space. This
non-linear transformation is performed using a kernel function; commonly used kernels
include the polynomial, Gaussian, sigmoid and Laplacian kernels.
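In libraries such as scikit-learn, changing the kernel amounts to changing a single parameter. The sketch below is illustrative (the dataset and kernel choices are arbitrary, and the Laplacian kernel is not built in, so it would have to be supplied as a custom kernel); it compares a few kernels on concentric-circle data that is not linearly separable.

# Minimal sketch, assuming scikit-learn: the same SVM with different kernels.
from sklearn.datasets import make_circles
from sklearn.svm import SVC

# Non-linearly separable data: two concentric circles.
X, y = make_circles(n_samples=200, factor=0.3, noise=0.05, random_state=0)

for kernel in ("linear", "poly", "rbf", "sigmoid"):   # rbf = Gaussian kernel
    clf = SVC(kernel=kernel).fit(X, y)
    print(kernel, "training accuracy:", clf.score(X, y))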
Example: Let us take the XOR function, which produces data that is not linearly separable
(table 5.2).
Table 5.2: XOR function giving non-linearly separable data
X Y Class
0 0 0
1 0 1
0 1 1
1 1 0
Figure 5.7 shows the data from the XOR function on a plane. Result 0 (class 0) is represented
by an empty circle. Result 1 (class 1) is represented by a solid circle.
Fig 5.7 XOR function giving non-linear separable data.
The data resulting from the XOR function is not linearly separable. Therefore, we cannot
directly find a linear separator to apply SVM. To do this, we apply a polynomial
transformation function that maps the pair (x, y) to the triplet (x, y, x * y). Table 5.3
summarises this transformation.
Table 5.3: Using the polynomial function to transform the data
X Y X*Y Class
0 0 0 0
1 0 0 1
0 1 0 1
1 1 1 0
The transformation applied takes us from a 2-dimensional space to a 3-dimensional space.
Let's represent the data from this transformation (figure 5.8). We can see visually that the data
has become linearly separable in the new space. We can now apply SVM.
Fig 5.8 Linearly separable data after transformation into a new space.
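This transformation can be checked in a few lines. The sketch below (assuming scikit-learn) adds the feature x*y by hand, as in table 5.3, and fits a linear SVM in the resulting three-dimensional space.

# Minimal sketch, assuming scikit-learn: XOR becomes linearly separable
# after the transformation (x, y) -> (x, y, x*y).
import numpy as np
from sklearn.svm import SVC

X = np.array([[0, 0], [1, 0], [0, 1], [1, 1]])
y = np.array([0, 1, 1, 0])                       # XOR labels

X3 = np.column_stack([X, X[:, 0] * X[:, 1]])     # add the feature x*y
clf = SVC(kernel="linear", C=1e6).fit(X3, y)
print(clf.predict(X3))                           # expected: [0 1 1 0]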
5.4 MULTI-CLASS SVM
The SVM method was originally introduced to classify data belonging to two classes.
But later it was generalised to n classes. Figure 5.9 shows a multi-class case.
Fig 5.9 How to use SVM in the case of multi-class data?
There are several ways of adapting two-class SVMs to the multi-class case, including the one-
versus-all approach and the one-versus-one approach. The choice will depend on the size of
the problem.
1. The one versus all approach: consists of training a two-class SVM using the elements
of one class against all the others. This amounts to solving c SVM problems (where c is
the number of classes), each of size n (the number of training examples).
2. The one versus one approach: consists of training c(c-1)/2 SVMs, one for each pair of
classes, then deciding the winning class either by a majority vote or by post-processing
the results using a posteriori probability estimates.
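Both strategies are available as generic wrappers in libraries such as scikit-learn. The sketch below (the iris dataset is used purely as an illustrative multi-class example) builds one-versus-all and one-versus-one classifiers on top of a binary linear SVM.

# Minimal sketch, assuming scikit-learn: the two multi-class strategies.
from sklearn.datasets import load_iris
from sklearn.multiclass import OneVsOneClassifier, OneVsRestClassifier
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)                # 3 classes, so c = 3

ovr = OneVsRestClassifier(SVC(kernel="linear")).fit(X, y)  # c binary SVMs
ovo = OneVsOneClassifier(SVC(kernel="linear")).fit(X, y)   # c(c-1)/2 binary SVMs
print("one-versus-all accuracy:", ovr.score(X, y))
print("one-versus-one accuracy:", ovo.score(X, y))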
5.5 CRITICISM OF SVM
The main advantages of SVM are:
High-dimensional efficiency: SVMs are efficient in high-dimensional spaces, such as
those encountered in image classification and bioinformatics.
Efficiency with a small number of samples: they continue to perform well even when
the number of samples is relatively small compared with the number of dimensions.
Versatility: By selecting the appropriate kernel, SVMs can be adapted to a wide
variety of problems.
The main disadvantage of SVM is the cost in computation time, especially if the dataset is
large, the number of attributes is high, the data are multi-class and they are not linearly
separable.
CONCLUSION OF THE CHAPTER
In this chapter, we have presented SVM, which is considered to be one of the most powerful
methods in data mining. We have described its principle, which is to search for an optimal
data separator. We explained the transformation that needs to be made (using a kernel
function) when the data is not linearly separable. We end the chapter with the most commonly
used approaches for handling multi-class data. The following are theoretical and practical
exercises on the use of SVM.