Support vector machine and
Support vector regression
Support Vector Machine
The support vector machine (SVM) is a supervised classification technique.
Consider three candidate separating hyperplanes: H1 does not separate the classes. H2 does, but
only with a small margin. H3 separates them with the maximal margin.
This maximum-margin separator is determined
by a subset of the datapoints.
Datapoints in this subset are called “support vectors”.
It will be useful computationally if only a small fraction of the datapoints are support
vectors, because we use the support vectors to decide which side of the separator a test
case is on.
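As a quick illustration of this point, scikit-learn (used again later in this section) exposes the support vectors of a fitted classifier directly; the tiny two-cluster data set below is made up purely for illustration:
import numpy as np
from sklearn import svm
# Made-up 2-D data: two roughly separated clusters
X = np.array([[1, 2], [2, 3], [2, 1], [6, 5], [7, 7], [8, 6]])
y = np.array([0, 0, 0, 1, 1, 1])
clf = svm.SVC(kernel='linear')
clf.fit(X, y)
print(clf.support_vectors_)    # the support vectors themselves
print(clf.n_support_)          # number of support vectors per class
print(clf.predict([[4, 4]]))   # prediction depends only on the support vectors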
Maximum Margin Hyperplane in SVM
• In the case of support vector machines, a data point is viewed as a p-dimensional
vector, and we want to know whether we can separate such points with a (p−1)-dimensional
hyperplane. This is called a linear classifier.
• So we choose the hyperplane so that the distance from it to the nearest data point on
each side is maximized. If such a hyperplane exists, it is known as the maximum-
margin hyperplane and the linear classifier it defines is known as a maximum-margin
classifier.
Linear SVM formulation
The margin $\mathrm{width} = \frac{2}{\lVert \mathbf{w} \rVert}$ needs to be maximized; equivalently, $\lVert \mathbf{w} \rVert^2$ should be as small as possible.
Constraints:
$\mathbf{w} \cdot x_i + b \ge +1$ for positive cases
$\mathbf{w} \cdot x_i + b \le -1$ for negative cases
This is an optimization problem!
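Writing the labels as $y_i \in \{+1, -1\}$, the two constraints can be combined and the whole problem stated as a single quadratic program (a standard restatement of the conditions above):
$$
\min_{\mathbf{w},\, b}\ \tfrac{1}{2}\lVert \mathbf{w} \rVert^2
\quad \text{subject to} \quad
y_i\,(\mathbf{w} \cdot x_i + b) \ge 1, \qquad i = 1, \dots, n.
$$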
Linear SVM formulation
• In the context of a linear SVM, the decision boundary can be represented as a hyperplane
defined by the equation $\mathbf{w} \cdot x + b = 0$, where $x$ is the input feature vector (a small numeric sketch follows this list).
• $\mathbf{w}$ (weights or coefficients): a vector of weights associated with the features in
the input data. The weights determine the orientation of the decision boundary and how much
each feature contributes to the decision-making process.
• $b$ (bias or intercept): a scalar value that shifts the decision boundary away from the origin,
providing additional flexibility for the model to achieve better separation between the classes.
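A minimal sketch of how $\mathbf{w}$ and $b$ are used at prediction time (the parameter and input values here are made up for illustration): a test point is classified by the sign of $\mathbf{w} \cdot x + b$.
import numpy as np
# Hypothetical learned parameters, for illustration only
w = np.array([0.4, -1.2])    # weights: set the orientation of the hyperplane
b = 0.5                      # bias: shifts the hyperplane away from the origin
x = np.array([2.0, 1.0])     # a test point
score = np.dot(w, x) + b     # which side of the hyperplane the point falls on
label = +1 if score >= 0 else -1
print(score, label)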
Hard Margin: If the training data is linearly separable,
we can select two parallel hyperplanes that separate
the two classes of data, so that the distance between
them is as large as possible.
Linear SVM formulation
To extend SVM to cases in which the data are not linearly separable,
the hinge loss function $L\big(y_i f(x_i)\big)$ is helpful:
$L\big(y_i f(x_i)\big) = \max\big(0,\ 1 - y_i f(x_i)\big)$
The hinge loss penalizes instances that are misclassified or that violate the margin
constraint.
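A minimal sketch of the hinge loss in code (the labels and decision values below are made up):
import numpy as np
y = np.array([1, -1, 1, 1])            # true labels y_i
f = np.array([2.3, -0.4, 0.6, -1.0])   # decision values f(x_i) = w.x_i + b
hinge = np.maximum(0, 1 - y * f)       # zero when the margin is satisfied, linear otherwise
print(hinge)                           # [0.  0.6 0.4 2. ]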
For linear SVM, the primal form is fine!
$\mathbf{w} \cdot x_i + b \ge +1 - \xi_i$ for positive cases
$\mathbf{w} \cdot x_i + b \le -1 + \xi_i$ for negative cases
with $\xi_i \ge 0$ for all $i$,
and $\frac{\lVert \mathbf{w} \rVert^2}{2} + C \sum_{i=1}^{n} \xi_i$ as small as possible.
$\xi_i$ is the slack variable.
$C$ is the hyperparameter.
Explanation- slack variable
• In SVMs, slack variables 𝜉𝑖 are introduced
to allow for some degree of
misclassification when training the model,
making it possible to create a soft-margin
SVM. The concept of slack variables allows
the SVM to handle cases where classes are
not perfectly separable.
Definition: for an instance $i$, $\xi_i \ge 0$ represents the amount by which the margin is
violated. Essentially, slack variables measure how much an observation fails to meet the
margin constraint.
Physical interpretation of C
A smaller C puts less weight on the slack penalty, so more margin violations are tolerated and the margin widens; a larger C puts more weight on the slack penalty, so fewer violations are tolerated and the margin shrinks.
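A quick sketch of this effect with scikit-learn on synthetic, slightly overlapping data (the exact numbers will vary): smaller C values leave more points inside or beyond the margin (more support vectors) and give a wider margin $2/\lVert \mathbf{w} \rVert$.
import numpy as np
from sklearn import svm
from sklearn.datasets import make_blobs
# Synthetic, slightly overlapping 2-D data
X, y = make_blobs(n_samples=100, centers=2, cluster_std=2.0, random_state=0)
for C in [0.01, 1, 100]:
    clf = svm.SVC(kernel='linear', C=C).fit(X, y)
    margin = 2 / np.linalg.norm(clf.coef_)          # margin width 2/||w||
    print(C, len(clf.support_), round(margin, 3))   # C, number of support vectors, margin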
How algorithm works
For linear SVM, the primal form is fine!
$\mathbf{w} \cdot x_i + b \ge +1 - \xi_i$ for positive cases
$\mathbf{w} \cdot x_i + b \le -1 + \xi_i$ for negative cases
with $\xi_i \ge 0$ for all $i$,
and $\frac{\lVert \mathbf{w} \rVert^2}{2} + C \sum_{i=1}^{n} \xi_i$ as small as possible
($\xi_i$ is the slack variable, $C$ is the hyperparameter).
One method is gradient descent: whenever a training point $(x_i, y_i)$ violates the margin, update
$\mathbf{w} \mathrel{+}= y_i \cdot lr \cdot x_i$
$b \mathrel{+}= y_i \cdot lr$
where $lr$ is the learning rate. Iteratively evaluate the optimum $\mathbf{w}$ and $b$ (example Python code is attached).
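The attached code is not reproduced here, but a minimal sketch of such an iterative scheme, assuming the update above is applied only to points that violate the margin (and, as on the slide, omitting the regularization step on $\mathbf{w}$), might look like this:
import numpy as np

def train_linear_svm(X, y, lr=0.01, epochs=1000):
    # y must contain labels in {-1, +1}
    n, p = X.shape
    w = np.zeros(p)
    b = 0.0
    for _ in range(epochs):
        for i in range(n):
            # Margin violation: y_i * (w.x_i + b) < 1
            if y[i] * (np.dot(w, X[i]) + b) < 1:
                w += lr * y[i] * X[i]   # w += y_i * lr * x_i
                b += lr * y[i]          # b += y_i * lr
    return w, b

# Tiny made-up example
X = np.array([[2.0, 3.0], [1.0, 1.5], [6.0, 5.0], [7.0, 8.0]])
y = np.array([-1, -1, 1, 1])
w, b = train_linear_svm(X, y)
print(w, b, np.sign(X @ w + b))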
The kernel trick
[Figure: a decision boundary that is curved in the low-dimensional feature space becomes planar (a hyperplane) in the high-dimensional feature space.]
• For many mappings $\phi$ from a low-dimensional space to a high-dimensional space, there is a simple operation on two vectors in the low-dimensional space that can be used to compute the scalar product of their two images in the high-dimensional space:
$K(x^a, x^b) = \phi(x^a) \cdot \phi(x^b)$
The kernel function is simply a function of the low-dimensional feature data: letting the kernel do the work replaces computing the scalar product of $\phi(x^a)$ and $\phi(x^b)$ in the obvious way.
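A small sketch of this idea for the degree-2 polynomial kernel in two dimensions, where the explicit map $\phi(x) = (x_1^2,\ \sqrt{2}x_1x_2,\ x_2^2,\ \sqrt{2}x_1,\ \sqrt{2}x_2,\ 1)$ is known, so we can check that the cheap low-dimensional kernel evaluation equals the scalar product of the high-dimensional images:
import numpy as np

def phi(x):
    # Explicit map into the 6-D feature space of the degree-2 polynomial kernel
    x1, x2 = x
    return np.array([x1**2, np.sqrt(2)*x1*x2, x2**2,
                     np.sqrt(2)*x1, np.sqrt(2)*x2, 1.0])

def poly_kernel(a, b, p=2):
    return (np.dot(a, b) + 1) ** p     # K(x, y) = (x.y + 1)^p

xa = np.array([1.0, 2.0])
xb = np.array([3.0, 0.5])
print(np.dot(phi(xa), phi(xb)))   # scalar product in the high-D space: 25.0
print(poly_kernel(xa, xb))        # same value, computed in the low-D space: 25.0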
Different Kernel functions
• Linear Kernel: No mapping is needed as the data is already assumed
to be linearly separable.
• Polynomial Kernel: Maps inputs into a polynomial feature space,
enhancing the classifier’s ability to capture interactions between
features.
• Radial Basis Function (RBF) Kernel: Also known as the Gaussian
kernel, it is useful for capturing complex regions by considering the
distance between points in the input space.
• Sigmoid Kernel: Mimics the behavior of neural networks by using a
sigmoid function as the kernel.
Different Kernel functions
Polynomial: $K(x, y) = (x \cdot y + 1)^p$
Gaussian radial basis function: $K(x, y) = e^{-\lVert x - y \rVert^2 / 2\sigma^2}$
Neural net (sigmoid): $K(x, y) = \tanh(\kappa\, x \cdot y - \delta)$
The parameters ($p$, $\sigma$, $\kappa$, $\delta$) must be chosen by the user.
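These kernels are easy to write down directly in code; a brief sketch (the parameter values below are arbitrary examples):
import numpy as np

def polynomial_kernel(x, y, p=3):
    return (np.dot(x, y) + 1) ** p

def rbf_kernel(x, y, sigma=1.0):
    return np.exp(-np.linalg.norm(x - y) ** 2 / (2 * sigma ** 2))

def sigmoid_kernel(x, y, kappa=0.1, delta=0.0):
    return np.tanh(kappa * np.dot(x, y) - delta)

x = np.array([1.0, 2.0])
y = np.array([0.5, -1.0])
print(polynomial_kernel(x, y), rbf_kernel(x, y), sigmoid_kernel(x, y))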
How algorithm works
The kernel trick is fundamentally a feature of the dual formulation of the Support Vector
Machine (SVM) and cannot be applied directly to the primal form. Hence the dual-form SVM is
often used. The dual formulation is derived from the Lagrangian of the primal problem:
maximize $W(\alpha) = \sum_{i=1}^{n} \alpha_i - \frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n} \alpha_i \alpha_j y_i y_j K(x_i, x_j)$
subject to $\sum_{i=1}^{n} \alpha_i y_i = 0$ and $0 \le \alpha_i \le C$.
Remember: $K(x_i, x_k)$ is the transformation (kernel) function. Our aim is to evaluate the $\alpha_i$ using optimization. All other parameters, such as $b$, are then evaluated from the $\alpha_i$ using the training data:
$b = y_k - \sum_{i=1}^{n} \alpha_i y_i K(x_i, x_k)$
Finally, the function $f(x)$ is evaluated for prediction:
$f(x) = \sum_{i=1}^{n} \alpha_i y_i K(x_i, x) + b$
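As a sanity-check sketch with scikit-learn: the fitted classifier exposes exactly these dual quantities (dual_coef_ holds $\alpha_i y_i$ for the support vectors, intercept_ holds $b$), so the prediction formula above can be reproduced by hand and compared with decision_function. The data below is synthetic.
import numpy as np
from sklearn import svm
from sklearn.datasets import make_blobs
from sklearn.metrics.pairwise import rbf_kernel

X, y = make_blobs(n_samples=60, centers=2, random_state=0)
clf = svm.SVC(kernel='rbf', gamma=0.5, C=1.0).fit(X, y)

x_new = X[:3]                                              # a few query points
K = rbf_kernel(x_new, clf.support_vectors_, gamma=0.5)     # K(x, x_i) over the support vectors
f_manual = K @ clf.dual_coef_.ravel() + clf.intercept_     # sum_i alpha_i y_i K(x_i, x) + b
print(f_manual)
print(clf.decision_function(x_new))                        # matches f_manual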
SVM Implementation using Python
# Import necessary libraries
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn import svm
from sklearn import metrics
# Load the breast cancer dataset and split it into training and test sets
cancer = datasets.load_breast_cancer()
X_train, X_test, y_train, y_test = train_test_split(
    cancer.data, cancer.target, test_size=0.3, random_state=109)
SVM Implementation using Python
# Generating Model
# Create an SVM Classifier using a linear kernel
clf = svm.SVC(kernel='linear')
# Train the model using the training sets
clf.fit(X_train, y_train)
# Predict the response for the test dataset
y_pred = clf.predict(X_test)
# Evaluating the Model
# Model Accuracy: how often is the classifier correct?
accuracy = metrics.accuracy_score(y_test, y_pred)
print("Accuracy:", accuracy)
# Model Precision: what percentage of tuples labeled positive are actually positive?
precision = metrics.precision_score(y_test, y_pred)
print("Precision:", precision)
# Model Recall: what percentage of positive tuples are labelled as such?
recall = metrics.recall_score(y_test, y_pred)
print("Recall:", recall)
Support vector regression
Support vector regression (SVR) is a
type of support vector machine (SVM)
that is used for regression tasks. It
tries to find a function that best
predicts the continuous output value
for a given input value.
Minimization problem, subject to the $\varepsilon$-insensitive constraints:
$y_i - \big(\mathbf{w}^T \phi(x_i) + b\big) \le \varepsilon + \xi_i$ for $i = 1, \dots, m$
$\big(\mathbf{w}^T \phi(x_i) + b\big) - y_i \le \varepsilon + \xi_i^*$ for $i = 1, \dots, m$
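The objective minimized under these constraints (together with $\xi_i, \xi_i^* \ge 0$) is the usual soft-margin one:
$$
\min_{\mathbf{w},\, b,\, \xi,\, \xi^*}\ \tfrac{1}{2}\lVert \mathbf{w} \rVert^2 + C \sum_{i=1}^{m} \big(\xi_i + \xi_i^*\big).
$$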
[Figure: SVR fit showing the hyperplane, the decision boundaries of the ε-tube, and the support vectors.]
In the case of NON-LINEAR SVR:
• The kernel functions transform the data into a higher-dimensional feature space in which a
linear regression function can be fitted (a brief sketch follows).
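A brief sketch of kernelized SVR with scikit-learn (the noisy 1-D sine data below is synthetic and only for illustration):
import numpy as np
from sklearn.svm import SVR

# Synthetic noisy sine data
rng = np.random.default_rng(0)
X = np.sort(rng.uniform(0, 5, 60)).reshape(-1, 1)
y = np.sin(X).ravel() + 0.1 * rng.standard_normal(60)

# The RBF kernel maps the inputs implicitly to a high-dimensional space
svr = SVR(kernel='rbf', C=10.0, epsilon=0.1, gamma=0.5)
svr.fit(X, y)
print(svr.predict([[1.0], [2.5], [4.0]]))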
Primal formulation
You can evaluate the $\alpha$ values using an iterative mechanism, which is applicable for small data sets (please refer to the attached code).
Dual formulation (derived from the Lagrangian of the primal problem), used for large datasets (not important):
$\max_{\alpha_i,\, \alpha_i^*} \left[ \sum_{i=1}^{n} (\alpha_i - \alpha_i^*)\, y_i - \frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n} K(x_i, x_j)\, (\alpha_i - \alpha_i^*)(\alpha_j - \alpha_j^*) \right]$
subject to $\sum_{i=1}^{n} (\alpha_i - \alpha_i^*) = 0$, $\alpha_i \ge 0$, $\alpha_i^* \ge 0$.
SVR from scratch (not important!)
import numpy as np

# Sample dataset: each row is [x1, x2, y]
data = np.array([[1, 2, 3], [2, 3, 4], [3, 4, 5], [4, 5, 6]])
X = data[:, 0:2]   # Input features (x1, x2)
Y = data[:, 2]     # Target variable (y)

# Step 1: Choose a kernel function (linear kernel)
def kernel_function(x, y):
    return np.dot(x, y)   # Linear kernel

# Step 2: Compute the kernel matrix K
n = X.shape[0]
K = np.zeros((n, n))
for i in range(n):
    for j in range(n):
        K[i, j] = kernel_function(X[i, :], X[j, :])
print('Kernel Matrix K:')
print(K)
SVR from scratch (not important!)
# Step 3: Set up parameters for SVR
C = 1          # Regularization parameter
epsilon = 0.1  # Epsilon in the epsilon-insensitive loss function

# Placeholder values for the Lagrange multipliers alpha and alpha*
alpha = np.array([1, 1, 1, 1])       # Placeholder values for alpha
alpha_star = np.array([0, 0, 0, 0])  # Placeholder values for alpha*

# Support vectors are the points with non-zero (alpha - alpha*)
support_vector_indices = np.where((alpha - alpha_star) != 0)[0]

# Calculate the bias term b (here we use a simple average for demonstration)
b_values = []
for i in support_vector_indices:
    b_value = Y[i] - np.sum((alpha - alpha_star) * K[i, :])
    b_values.append(b_value)
b = np.mean(b_values)   # Average of the computed b values over the support vectors
SVR from scratch (not important!)
# Step 4: Compute the regression function f(x) = sum_i (alpha_i - alpha_i*) K(x_i, x) + b
def regression_function(x):
    return np.sum((alpha - alpha_star) * kernel_function(X, x)) + b

# Step 5: Make predictions with the regression function
new_data_point = np.array([2, 3])   # New input data point (x1=2, x2=3)
predicted_y = regression_function(new_data_point)
print(f'Predicted y for input ({new_data_point[0]:.1f}, '
      f'{new_data_point[1]:.1f}) is: {predicted_y:.2f}')