Dimension Reduction
"It is a way of
converting the higher dimensions dataset into lesser dimensions
dataset ensuring that it provides similar information." These techniques
are widely used in machine learning for obtaining a better fit predictive
model while solving the classification and regression problems.
There are two ways to apply the dimension reduction technique, which
are given below:
Feature Selection
1. Filter Methods
In this method, the dataset is filtered using a statistical measure, and only a subset containing the relevant features is kept; no learning model is involved in scoring the features (a minimal sketch follows the list below). Some common techniques of the filter method are:
o Correlation
o Chi-Square Test
o ANOVA
o Information Gain, etc.
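For illustration, here is a minimal sketch of filter-based selection with scikit-learn's SelectKBest; the Iris dataset, the chi2 and f_classif (ANOVA) scoring functions, and k = 2 are assumptions made only for the demo.

```python
# Filter method sketch: score each feature independently of any model,
# then keep the k highest-scoring ones.
from sklearn.datasets import load_iris
from sklearn.feature_selection import SelectKBest, chi2, f_classif

X, y = load_iris(return_X_y=True)

# Chi-Square test (features must be non-negative) and ANOVA F-test
# are two of the scoring functions listed above.
chi2_selector = SelectKBest(score_func=chi2, k=2)
X_chi2 = chi2_selector.fit_transform(X, y)

anova_selector = SelectKBest(score_func=f_classif, k=2)
X_anova = anova_selector.fit_transform(X, y)

print("Chi-square scores:", chi2_selector.scores_)
print("ANOVA F-scores:   ", anova_selector.scores_)
print("Reduced shape:", X_chi2.shape)  # (150, 2)
```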
2. Wrapper Methods
The wrapper method has the same goal as the filter method, but it uses a machine learning model for its evaluation. Some features are fed to the ML model and its performance is evaluated; that performance decides whether those features should be added or removed to increase the accuracy of the model. This method is more accurate than the filter method but computationally more expensive (a minimal sketch follows the list below). Some common techniques of wrapper methods are:
o Forward Selection
o Backward Selection
o Bi-directional Elimination
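A minimal sketch of Forward Selection using scikit-learn's SequentialFeatureSelector; the Iris dataset, the logistic regression estimator, and the choice of 2 features are assumptions for the demo.

```python
# Wrapper method sketch: forward selection driven by a model's
# cross-validated accuracy rather than a statistical score.
from sklearn.datasets import load_iris
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)
model = LogisticRegression(max_iter=1000)

# direction="forward" starts from no features and greedily adds the one
# that improves cross-validated performance most; "backward" would start
# from all features and remove them (Backward Selection).
selector = SequentialFeatureSelector(
    model, n_features_to_select=2, direction="forward", cv=5
)
selector.fit(X, y)

print("Selected feature mask:", selector.get_support())
print("Reduced shape:", selector.transform(X).shape)
```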
3. Embedded Methods
Embedded methods perform feature selection during model training itself: the importance of each feature is evaluated as part of fitting the model, for example through a regularization penalty (a minimal sketch follows the list below). Some common techniques of embedded methods are:
o LASSO
o Elastic Net
o Ridge Regression, etc.
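A minimal sketch of the LASSO embedded method with scikit-learn; the diabetes dataset and the alpha value are assumptions for the demo. Features whose coefficients are driven exactly to zero are effectively discarded.

```python
# Embedded method sketch: LASSO's L1 penalty drives uninformative
# coefficients to exactly zero during training, so feature selection
# happens as a by-product of fitting the model.
import numpy as np
from sklearn.datasets import load_diabetes
from sklearn.linear_model import Lasso
from sklearn.preprocessing import StandardScaler

X, y = load_diabetes(return_X_y=True)
X = StandardScaler().fit_transform(X)   # put features on the same scale

lasso = Lasso(alpha=1.0).fit(X, y)      # alpha controls the penalty strength

kept = np.flatnonzero(lasso.coef_)      # indices of features with non-zero weight
print("Coefficients:", np.round(lasso.coef_, 2))
print("Features kept:", kept)
```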
Feature Extraction:
a. Principal Component Analysis (PCA)
b. Linear Discriminant Analysis
c. Kernel PCA
d. Quadratic Discriminant Analysis
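A minimal sketch of feature extraction with scikit-learn; the Iris dataset and n_components = 2 are assumptions for the demo.

```python
# Feature extraction sketch: project the data onto new axes rather than
# selecting a subset of the original columns.
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA, KernelPCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

X, y = load_iris(return_X_y=True)

X_pca = PCA(n_components=2).fit_transform(X)                             # unsupervised, linear
X_lda = LinearDiscriminantAnalysis(n_components=2).fit_transform(X, y)   # supervised
X_kpca = KernelPCA(n_components=2, kernel="rbf").fit_transform(X)        # non-linear

print(X_pca.shape, X_lda.shape, X_kpca.shape)  # each (150, 2)
```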
PCA Algorithm-
The steps involved in PCA Algorithm are as follows-
Step-01: Get data.
Step-02: Compute the mean vector (µ).
Step-03: Subtract mean from the given data.
Step-04: Calculate the covariance matrix.
Step-05: Calculate the eigen vectors and eigen values of the covariance
matrix.
Step-06: Choose components and form a feature vector.
Step-07: Derive the new data set.
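A minimal NumPy sketch of Steps 01 to 07; the random data matrix is a placeholder, and the covariance is divided by n to match the worked example below (np.cov divides by n – 1 by default).

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))            # Step-01: get data (hypothetical placeholder)

mu = X.mean(axis=0)                      # Step-02: mean vector
Xc = X - mu                              # Step-03: subtract the mean
cov = (Xc.T @ Xc) / len(X)               # Step-04: covariance matrix (divide by n)

eigvals, eigvecs = np.linalg.eigh(cov)   # Step-05: eigen values and eigen vectors
order = np.argsort(eigvals)[::-1]        # sort by decreasing eigen value
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

k = 2                                    # Step-06: choose components,
W = eigvecs[:, :k]                       #          forming the feature vector

X_new = Xc @ W                           # Step-07: derive the new data set
print(X_new.shape)                       # (100, 2)
```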
Given data = { 2, 3, 4, 5, 6, 7 ; 1, 5, 3, 6, 7, 8 }.
Compute the principal component using PCA Algorithm.
OR
Consider the two dimensional patterns (2, 1), (3, 5), (4, 3), (5, 6), (6, 7), (7,
8).
Compute the principal component using PCA Algorithm.
OR
Compute the principal component of following data-
CLASS 1
X = 2, 3, 4
Y = 1, 5, 3
CLASS 2
X = 5, 6, 7
Y = 6, 7, 8
Solution-
We use the above discussed PCA Algorithm-
Step-01:
Get data.
The given feature vectors are-
x1 = (2, 1)
x2 = (3, 5)
x3 = (4, 3)
x4 = (5, 6)
x5 = (6, 7)
x6 = (7, 8)
Step-02:
Calculate the mean vector (µ).
Mean vector (µ)
= ((2 + 3 + 4 + 5 + 6 + 7) / 6, (1 + 5 + 3 + 6 + 7 + 8) / 6)
= (4.5, 5)
Step-03:
Subtract mean vector (µ) from the given feature vectors.
x1 – µ = (2 – 4.5, 1 – 5) = (-2.5, -4)
x2 – µ = (3 – 4.5, 5 – 5) = (-1.5, 0)
x3 – µ = (4 – 4.5, 3 – 5) = (-0.5, -2)
x4 – µ = (5 – 4.5, 6 – 5) = (0.5, 1)
x5 – µ = (6 – 4.5, 7 – 5) = (1.5, 2)
x6 – µ = (7 – 4.5, 8 – 5) = (2.5, 3)
Step-04:
Calculate the covariance matrix.
Covariance matrix is given by-
Covariance matrix M = (m1 + m2 + m3 + m4 + m5 + m6) / 6
where mi = (xi – µ)(xi – µ)^T is the 2 x 2 matrix formed from the i-th mean-subtracted feature vector.
On adding the above matrices and dividing by 6, we get-
M = | 2.92  3.67 |
    | 3.67  5.67 |
Step-05:
Calculate the eigen values and eigen vectors of the covariance matrix.
λ is an eigen value for a matrix M if it is a solution of the characteristic
equation |M – λI| = 0.
So, we have-
| 2.92 – λ   3.67     |
| 3.67       5.67 – λ |  = 0
From here,
(2.92 – λ)(5.67 – λ) – (3.67 x 3.67) = 0
16.56 – 2.92λ – 5.67λ + λ² – 13.47 = 0
λ² – 8.59λ + 3.09 = 0
Solving this quadratic equation, we get λ = 8.22, 0.38
Thus, two eigen values are λ1 = 8.22 and λ2 = 0.38.
Clearly, the second eigen value is very small compared to the first eigen value.
So, the second eigen vector can be left out.
The eigen vector corresponding to the greatest eigen value is the principal component of the given data set.
So, we find the eigen vector corresponding to eigen value λ1.
We use the following equation to find the eigen vector-
MX = λX
where-
M = Covariance Matrix
X = Eigen vector
λ = Eigen value
Substituting the values in the above equation, we get-
| 2.92  3.67 | | X1 |          | X1 |
| 3.67  5.67 | | X2 |  =  8.22 | X2 |
Solving these, we get-
2.92X1 + 3.67X2 = 8.22X1
3.67X1 + 5.67X2 = 8.22X2
On simplification, we get-
5.3X1 = 3.67X2 ……… (1)
3.67X1 = 2.55X2 ……… (2)
From (1) and (2), X1 = 0.69X2
From (2), taking X2 = 1, the eigen vector is X = (X1, X2) = (0.69, 1), up to scaling.
Refer class notes for further calculation
Lastly, we project the data points onto the new subspace: each point xi is mapped to X^T (xi – µ), the dot product of the principal eigen vector with the mean-subtracted feature vector.
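A small NumPy sketch, assuming the same six points, that reproduces the numbers of this worked example:

```python
import numpy as np

X = np.array([[2, 1], [3, 5], [4, 3], [5, 6], [6, 7], [7, 8]], dtype=float)

mu = X.mean(axis=0)                       # (4.5, 5)
Xc = X - mu
cov = (Xc.T @ Xc) / len(X)                # [[2.92, 3.67], [3.67, 5.67]]

eigvals, eigvecs = np.linalg.eigh(cov)
print(np.round(eigvals, 2))               # [0.38 8.21]; the hand calculation rounds to 8.22

pc = eigvecs[:, np.argmax(eigvals)]       # principal component, ~(0.57, 0.82) up to sign
print(np.round(pc / pc[1], 2))            # [0.69 1.  ] -> matches X1 = 0.69 X2

projections = Xc @ pc                     # projection of each point onto the principal
print(np.round(projections, 2))           # component (sign depends on eigen vector orientation)
```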
Backward Feature Elimination
o In this technique, firstly, all n variables of the given dataset are taken to train the model.
o The performance of the model is checked.
o Now we remove one feature at a time, train the model on the remaining n-1 features (n times in total), and compute the performance of the model each time.
o The variable whose removal causes the smallest (or no) change in the performance of the model is dropped; after that, we are left with n-1 features.
o Repeat the complete process until no further feature can be dropped (see the sketch after this list).
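A minimal sketch of backward elimination using scikit-learn's SequentialFeatureSelector with direction="backward"; the dataset, the linear model, and the stopping point of 5 features are assumptions for the demo.

```python
# Backward elimination sketch: start from all n features and repeatedly
# drop the one whose removal hurts cross-validated performance least.
from sklearn.datasets import load_diabetes
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.linear_model import LinearRegression

X, y = load_diabetes(return_X_y=True)          # 10 features

selector = SequentialFeatureSelector(
    LinearRegression(),
    n_features_to_select=5,    # stopping point chosen only for the demo
    direction="backward",
    cv=5,
)
selector.fit(X, y)

print("Kept features:", selector.get_support())
print("Reduced shape:", selector.transform(X).shape)   # (442, 5)
```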
Missing Value Ratio
If a variable has too many missing values, we drop it, as it does not carry much useful information. To perform this, we can set a threshold level, and if the fraction of missing values in a variable exceeds that threshold, we drop the variable. The lower the threshold, the more variables are dropped and the more aggressive the reduction. A small pandas sketch is given below.
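A minimal pandas sketch of this missing-value-ratio filter; the column names and the 40% threshold are assumptions for the demo.

```python
# Missing-value-ratio sketch: drop any column whose share of missing
# values exceeds a chosen threshold.
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "age":    [25, 32, np.nan, 41, 29],                  # 20% missing
    "income": [np.nan, np.nan, np.nan, 52000, 61000],    # 60% missing
    "city":   ["A", "B", "A", np.nan, "C"],              # 20% missing
})

threshold = 0.4                                  # allow at most 40% missing values
missing_ratio = df.isna().mean()                 # fraction missing per column
reduced = df.loc[:, missing_ratio <= threshold]  # drop columns above the threshold

print(missing_ratio)
print("Kept columns:", list(reduced.columns))    # ['age', 'city']
```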
Random Forest
The feature importance scores produced by a trained random forest can be used to rank the features and keep only the most useful ones.
Factor Analysis
Observed variables are modelled as linear combinations of a smaller number of latent factors, and those factors are used in place of the original variables.