Machine Learning
Module 4
Clustering
Hebbian Learning Rule
■ The Hebbian Learning Rule, also known as the Hebb Learning
Rule, was proposed by Donald O. Hebb.
■ It is one of the earliest and simplest learning rules for
neural networks.
■ It is used for pattern classification.
■ It is a single-layer neural network, i.e. it has one input
layer and one output layer.
■ The input layer can have many units, say n.
■ The output layer only has one unit.
■ The Hebbian rule works by updating the weights between
neurons in the network for each training sample.
Hebbian Learning Rule Algorithm
1. Set all weights to zero, wi = 0 for i = 1 to n, and set the
bias to zero, b = 0.
2. For each training pair S : t (input vector S with target
output t), repeat steps 3-5.
3. Set the activations of the input units to the input vector,
xi = si for i = 1 to n.
4. Set the output of the output neuron to the target value,
i.e. y = t.
5. Update the weight and bias by applying the Hebb rule for all
i = 1 to n (see the sketch below):
wi(new) = wi(old) + xi y
b(new) = b(old) + y
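A minimal Python sketch of this algorithm (not from the slides; the function and variable names are illustrative, and bipolar inputs and targets are assumed):

    # Hebbian learning rule: a minimal sketch assuming bipolar inputs/targets.
    def hebb_train(samples):
        """samples: list of (x, t) pairs; x is a list of n inputs, t the target."""
        n = len(samples[0][0])
        w = [0.0] * n               # step 1: weights start at zero
        b = 0.0                     # step 1: bias starts at zero
        for x, t in samples:        # step 2: loop over the training pairs
            y = t                   # steps 3-4: output is set to the target
            for i in range(n):
                w[i] += x[i] * y    # step 5: wi(new) = wi(old) + xi * y
            b += y                  # step 5: b(new) = b(old) + y
        return w, b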
Implementing AND Gate
• There are 4 training samples, so there will be 4
iterations.
• The activation function used here is the bipolar
sigmoidal function, so the range is [-1, 1].
• The four bipolar training samples (with a constant bias
input of 1) are:

  x1   x2   b    t
  -1   -1   1   -1
  -1    1   1   -1
   1   -1   1   -1
   1    1   1    1
Step 1:
Set the weights and bias to zero, w = [ 0 0 0 ]T and b = 0.
Step 2:
Set the input vectors Xi = Si for i = 1 to 4:
X1 = [ -1 -1 1 ]T
X2 = [ -1 1 1 ]T
X3 = [ 1 -1 1 ]T
X4 = [ 1 1 1 ]T
Step 3:
The output value is set to y = t.
Step 4: Modify the weights using the Hebbian rule.
First iteration –
w(new) = w(old) + x1y1 = [ 0 0 0 ]T + [ -1 -1 1 ]T . [ -1 ] = [ 1 1 -1 ]T
For the second iteration, the final weights from the first
iteration are used, and so on.
Second iteration –
w(new) = [ 1 1 -1 ]T + [ -1 1 1 ]T . [ -1 ] = [ 2 0 -2 ]T
Third iteration –
w(new) = [ 2 0 -2]T + [ 1 -1 1 ]T . [ -1 ] = [ 1 1 -3 ]T
Fourth iteration –
w(new) = [ 1 1 -3]T + [ 1 1 1 ]T . [ 1 ] = [ 2 2 -2 ]T
So, the final weight vector is [ 2 2 -2 ]T.
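The same four iterations can be reproduced in a few lines of Python (a sketch; as in the slides, the bias is folded into the weight vector as its third component):

    # Reproducing the four AND-gate iterations above.
    X = [[-1, -1, 1], [-1, 1, 1], [1, -1, 1], [1, 1, 1]]  # inputs, bias input last
    T = [-1, -1, -1, 1]                                   # bipolar AND targets
    w = [0, 0, 0]
    for x, t in zip(X, T):
        w = [wi + xi * t for wi, xi in zip(w, x)]         # Hebb update: w += x * t
        print(w)
    # Prints [1, 1, -1], [2, 0, -2], [1, 1, -3] and finally [2, 2, -2].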
Testing the network:
For x1 = -1, x2 = -1, b = 1, Y = (-1)(2) + (-1)(2) + (1)(-2) = -6
For x1 = -1, x2 = 1, b = 1, Y = (-1)(2) + (1)(2) + (1)(-2) = -2
For x1 = 1, x2 = -1, b = 1, Y = (1)(2) + (-1)(2) + (1)(-2) = -2
For x1 = 1, x2 = 1, b = 1, Y = (1)(2) + (1)(2) + (1)(-2) = 2
All four results are consistent with the original truth table:
the sign of Y matches the target output in every case.
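The same check in Python (a sketch; the sign of the net input Y is taken as the bipolar output):

    # Testing the trained network on all four inputs.
    w = [2, 2, -2]                               # final weights; bias weight last
    for x1, x2 in [(-1, -1), (-1, 1), (1, -1), (1, 1)]:
        Y = x1 * w[0] + x2 * w[1] + 1 * w[2]     # net input with bias input 1
        output = 1 if Y > 0 else -1              # bipolar threshold on Y
        print((x1, x2), Y, output)
    # Y values: -6, -2, -2, 2, giving outputs -1, -1, -1, 1 as required.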
Decision boundary:
The net input is y = 2x1 + 2x2 - 2b.
Setting y = 0: 2x1 + 2x2 - 2b = 0.
Since the bias input is b = 1: 2x1 + 2x2 - 2(1) = 0,
i.e. 2( x1 + x2 ) = 2.
The final equation of the boundary is x2 = -x1 + 1.
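A quick numerical check of this boundary (a sketch; points with x2 > -x1 + 1 lie on the positive side):

    # Verifying that the line x2 = -x1 + 1 separates the AND training points.
    points = {(-1, -1): -1, (-1, 1): -1, (1, -1): -1, (1, 1): 1}
    for (x1, x2), t in points.items():
        side = x2 - (-x1 + 1)      # > 0 above the boundary, < 0 below
        print((x1, x2), side, t)   # only (1, 1), the +1 sample, lies above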
(Figure: plot of the AND inputs and the decision boundary
x2 = -x1 + 1, which separates (1, 1) from the other three points.)
Expectation-Maximization Algorithm in ML
• In most real-life applications of machine learning, many
relevant features are available, but only a few of them are
observable; the rest are unobservable.
• If a variable is observable, its value can be predicted
directly from the training instances.
• For variables that are latent, i.e. not directly
observable, the Expectation-Maximization (EM) algorithm
plays a vital role in predicting their values.
• EM is an iterative approach that alternates between two modes.
• In the first mode, we estimate the missing or latent variables;
hence it is referred to as the expectation/estimation step
(E-step).
• The other mode optimizes the parameters of the model so
that it explains the data more clearly.
• This second mode is known as the maximization step or
M-step.
• Expectation step (E-step): It involves estimating (guessing)
all the missing values in the dataset, so that after this
step there are no missing values left.
• Maximization step (M-step): This step uses the data
estimated in the E-step to update the parameters.
• The E-step and M-step are repeated until the values
converge.
• The primary goal of the EM algorithm is to use the available
observed data of the dataset to estimate the missing data of the
latent variables, and then to use those estimates to update the
parameter values in the M-step.
Steps in EM Algorithm
• 1st step: Initialize the parameter values. The system is
provided with the incomplete observed data, under the assumption
that the data are obtained from a specific model.
• 2nd step: The Expectation or E-step, which is used to estimate
(guess) the values of the missing or incomplete data using the
observed data. The E-step primarily updates the estimates of the
latent variables.
• 3rd step: The Maximization or M-step, where the complete data
obtained from the 2nd step are used to update the parameter
values.
• 4th step: Check whether the values of the latent variables are
converging. If they have converged, stop the process; otherwise,
repeat from the 2nd step until convergence occurs (see the
sketch below).
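These four steps can be written as a generic Python skeleton (a sketch; e_step and m_step are placeholders to be supplied for a concrete model, and the parameters are assumed to be a tuple of numbers):

    # A generic EM loop matching the four steps above.
    def em(data, init_params, e_step, m_step, tol=1e-6, max_iter=100):
        params = init_params                       # 1st step: initialize
        for _ in range(max_iter):
            hidden = e_step(data, params)          # 2nd step: estimate latent values
            new_params = m_step(data, hidden)      # 3rd step: update parameters
            # 4th step: check convergence of the parameter values
            if all(abs(a - b) < tol for a, b in zip(new_params, params)):
                return new_params
            params = new_params
        return params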
Example: Suppose two coins, A and B, are used for tossing. In
each of five rounds, one of the coins is chosen and tossed 10
times. The table shows the heads (H) and tails (T) observed in
each round and which coin was used. What is the probability of
getting H when coin A is used, and when coin B is used?

  Round  Coin   H    T
  1      B      5    5
  2      A      9    1
  3      A      8    2
  4      B      4    6
  5      A      7    3

(The counts per round are consistent with the totals computed in
the solution below.)
Solution: First count the number of H and T for each coin.
Coin A (rounds 2, 3, 5): 24 H and 6 T.
Coin B (rounds 1, 4): 9 H and 11 T.
Probability of getting a head when coin A is used:
Pa = 24/(24+6) = 0.8
Probability of getting a head when coin B is used:
Pb = 9/(9+11) = 0.45
Example:
Now suppose the same tosses are observed, but it is not known
which coin was used in each round. Calculate the probability of
getting H for coins A and B.
Solution:
Only the observation sequence is known; the hidden state (coin A
or coin B) is not. In this situation, the EM algorithm is used.
Assume initial estimates Pa = 0.6 and Pb = 0.5.
Solution: Starting from Pa = 0.6 and Pb = 0.5, the E-step and
M-step are repeated until Pa and Pb converge; a Python sketch of
the iteration follows. (The iteration-by-iteration working was
shown as figures on the original slides.)
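A compact Python sketch of the whole procedure (assuming the per-round counts tabulated above; the binomial coefficient is common to both coins' likelihoods, so it cancels in the E-step ratio and is omitted):

    # EM for the two-coin problem: repeat E- and M-steps until convergence.
    rounds = [(5, 5), (9, 1), (8, 2), (4, 6), (7, 3)]  # (H, T) counts per round
    pa, pb = 0.6, 0.5                                  # initial estimates
    for _ in range(20):
        ha = ta = hb = tb = 0.0
        for h, t in rounds:
            # E-step: relative likelihood that this round used coin A
            la = pa**h * (1 - pa)**t
            lb = pb**h * (1 - pb)**t
            ra = la / (la + lb)                # responsibility of coin A
            ha += ra * h                       # expected H count for coin A
            ta += ra * t                       # expected T count for coin A
            hb += (1 - ra) * h                 # expected H count for coin B
            tb += (1 - ra) * t                 # expected T count for coin B
        # M-step: re-estimate each coin's head probability
        pa = ha / (ha + ta)
        pb = hb / (hb + tb)
    print(round(pa, 2), round(pb, 2))          # converges to roughly 0.8 and 0.52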