B.Sc.
(IT)/(CS) V Semester
Soft Computing
TBI-504/TBS-505
Jaishankar Bhatt
Assistant Professor
Graphic Era Deemed to be University
Dehradun
Unit 1
Introduction to Soft Computing
Table of Contents
• Introduction
• Hard vs Soft Computing
• Requirement of soft computing
• Applications of soft computing
• Advantages and disadvantages of soft computing
• The biological neural network
• Artificial Neural Network (ANN)
• References
Definition of Soft Computing:
Soft computing is the process of solving complex real-life problems using approximate
calculations. It gives solutions that are not exact, much as the human brain does;
unlike traditional computing, it works with partial truths and approximations. The two
major problem-solving technologies are:
1. Hard computing
2. Soft computing
Hard computing deals with precise models where accurate solutions are achieved
quickly. Soft computing, on the other hand, deals with approximate models and gives
solutions to complex problems. Dr. Lotfi Zadeh of the University of California, Berkeley,
USA, coined the term soft computing in 1980. The objective of soft computing is to
provide good approximations and quick solutions for complex real-life problems.
According to Dr. Zadeh, soft computing is a methodology that imitates the way the
human brain reasons and learns in an uncertain environment. Soft computing uses
multivalued logic and thus has a sophisticated nature. It is well suited to parallel
computation.
Soft computing is a branch of artificial intelligence that provides approximate solutions
to complex real-life problems that are difficult or impossible to solve using classical
methods. It uses a combination of Genetic Algorithms (GAs), Neural Networks (NNs)
and Fuzzy Logic (FL).
Hard Computing vs Soft Computing
1. Hard computing requires a precisely stated analytical model; in soft computing,
imprecision is tolerable.
2. Hard computing requires a high amount of computation for complex problems; soft
computing involves intelligent computation steps that reduce computational time for
complex problems.
3. Hard computing involves binary logic, crisp systems and numerical analysis; soft
computing involves nature-inspired systems, fuzzy logic and neural approximation.
4. In hard computing, imprecision and uncertainty are undesirable; in soft computing
they are tolerable and are exploited to arrive at a better solution.
5. Hard computing requires exact input data; soft computing can handle ambiguous
and noisy data.
6. Hard computing gives exact and precise results; soft computing gives approximate
results.
7. Hard computing strictly follows a sequence of computation; soft computing allows
parallel processing and computation.
8. Examples of hard computing are numerical and traditional methods of problem
solving; examples of soft computing are neural networks such as Adaline and Madaline.
The following are some of the reasons why soft computing is needed (Requirement of
Soft Computing):
1. Complexity of real-world problems: Many real-world problems are complex and
involve uncertainty, vagueness, and imprecision. Traditional computing methods are
not well-suited to handle these complexities.
2. Incomplete information: In many cases, there is a lack of complete and accurate
information available to solve a problem. Soft computing techniques can provide
approximate solutions even in the absence of complete information.
3. Noise and uncertainty: Real-world data is often noisy and uncertain, and classical
methods can produce incorrect results when dealing with such data. Soft computing
techniques are designed to handle uncertainty and imprecision.
4. Non-linear problems: Many real-world problems are non-linear, and classical methods
are not well-suited to solve them. Soft computing techniques such as fuzzy logic and
neural networks can handle non-linear problems effectively.
5. Human-like reasoning: Soft computing techniques are designed to mimic human-like
reasoning, which is often more effective in solving complex problems.
Applications of AI and Soft Computing
Soft computing methodologies are widely used in various scientific and engineering
disciplines such as data mining, electronics, automotive, aerospace, marine, robotics,
defense, industrial, medical and business applications.
Recent developments in soft computing:
1. In the field of Big Data, soft computing is used to build data-analysis models, data-
behaviour models, data-driven decision models, etc.
2. In recommender systems, soft computing plays an important role in analysing the
problem algorithmically and working towards precise results.
3. In behaviour and decision science, soft computing is used to analyse behaviour, and
soft computing models are built accordingly.
4. In the field of mechanical engineering, soft computing is used for computing
problems such as how a machine will work and how it will make a decision for a
specific problem or input.
5. In computer science, soft computing is a core part of advanced areas such as
Machine Learning (ML), Deep Learning (DL) and Artificial Intelligence (AI).
Advantages of Soft Computing:
1. Robustness: Soft computing techniques are robust and can handle
uncertainty, imprecision, and noise in data, making them ideal for solving real-
world problems.
2. Approximate solutions: Soft computing techniques can provide approximate
solutions to complex problems that are difficult or impossible to solve exactly.
3. Non-linear problems: Soft computing techniques such as fuzzy logic and neural
networks can handle non-linear problems effectively.
4. Human-like reasoning: Soft computing techniques are designed to mimic human-like
reasoning, which is often more effective in solving complex problems.
5. Real-time applications: Soft computing techniques can provide real-time solutions
to complex problems, making them ideal for use in real-time applications.
Disadvantages of Soft Computing:
1. Approximate solutions: Soft computing techniques provide approximate solutions,
which may not always be accurate.
2. Computationally intensive: Soft computing techniques can be computationally
intensive, making them unsuitable for use in some real-time applications.
3. Lack of transparency: Soft computing techniques can sometimes lack transparency,
making it difficult to understand how the solution was arrived at.
4. Difficulty in validation: The approximation techniques used in soft
computing can sometimes make it difficult to validate the results, leading to a lack of
confidence in the solution.
5. Complexity: Soft computing techniques can be complex and difficult to
understand, making it difficult to implement them effectively.
The Biological Neural Network: A neural network is an information processing system;
the design of a NN is inspired by the structure and functioning of the human brain. A
human brain contains approximately 10^11 neurons, which are connected with each
other.
ANNs possess a large number of highly interconnected processing elements, called
nodes, units or neurons, which usually operate in parallel and are configured in regular
architectures.
A typical biological neuron has following components:
1. The fundamental unit of the network is called a neuron or a nerve cell.
2. A neuron has a roughly spherical cell body called the soma, where the nucleus is
located.
3. The signals generated in the soma are transmitted to other neurons through an
extension of the cell body called the axon or nerve fibre.
4. Tree-like nerve fibres called dendrites are associated with the cell body. Dendrites
are responsible for receiving the incoming signals from other neurons and serve as
receptors for signals from adjacent neurons.
5. The axon is a single long fibre extending from the cell body, which eventually
branches into strands and sub-strands connecting to many other neurons at the
synaptic junctions, or synapses.
Working of the neuron
1. Dendrites receive activation signals from other neurons; the combined activation
forms the internal state of the neuron.
2. The soma processes the incoming activation signals and converts them into an
output activation signal.
3. The axon carries the signal away from the neuron and sends it to other neurons.
4. Electric impulses are passed across the synapses to the dendrites. The signal
transmission involves chemical messengers called neurotransmitters.
Brain vs. Computer: Comparison between Biological Neuron and Artificial Neuron
1. Speed: The cycle time of execution in an ANN is a few nanoseconds, whereas in the
case of a biological neuron it is a few milliseconds. Hence, the artificial neuron model
implemented on a computer is faster.
2. Processing: The biological neuron can perform massively parallel operations
simultaneously. The artificial neuron can also perform several parallel operations
simultaneously, but, in general, the artificial neural network process is faster than that
of the brain.
3. Size and complexity: The total number of neurons in the brain is about 10^11 and the
total number of interconnections is about 10^15. Hence, the complexity of the brain is
comparatively higher, i.e. the computational work takes place not only in the brain
cell body, but also in the axon, synapse, etc. On the other hand, the size and complexity
of an ANN are based on the chosen application and the network designer. The size and
complexity of a biological neuron are greater than those of an artificial neuron.
4. Storage capacity (memory): The biological neuron stores information in its
interconnections or in synapse strength, but in an artificial neuron it is stored in
contiguous memory locations. In an artificial neuron, the continuous loading of new
information may sometimes overload the memory locations, and as a result some of
the addresses containing older memories may be destroyed. In the brain, however,
new information can be added in the interconnections by adjusting their strength
without destroying the older information. A disadvantage of the brain is that it may
sometimes fail to recollect stored information, whereas in an artificial neuron, once
information is stored in its memory locations, it can be retrieved. Owing to these facts,
the adaptability is more toward an artificial neuron.
5. Tolerance: The biological neuron possesses fault-tolerant capability, whereas the
artificial neuron has no fault tolerance. The distributed nature of the biological
neurons enables them to store and retrieve information even when some
interconnections get disconnected; thus biological neurons are fault tolerant. In the
case of artificial neurons, the information gets corrupted if the network
interconnections are disconnected. Biological neurons can accept redundancies,
which is not possible in artificial neurons. Even when some cells die, the human
nervous system appears to perform with the same efficiency.
6. Control mechanism: In an artificial neuron modelled using a computer, there is a
control unit in the Central Processing Unit, which can transfer and control precise
scalar values from unit to unit, but there is no such monitoring control unit in the
brain. The strength of a neuron in the brain depends on the active chemicals present
and on whether neuron connections are strong or weak. The ANN, by contrast,
possesses simpler interconnections and is free from the chemical actions taking place
in the brain (biological neuron). Thus, the control mechanism of an artificial neuron
is very simple compared to that of a biological neuron.
Comparison between Biological Neuron and Artificial Neuron (Brain vs. Computer):
• Speed: the brain's execution time is a few milliseconds; the computer's is a few
nanoseconds.
• Processing: the brain performs massively parallel operations simultaneously; the
computer performs several parallel operations simultaneously and is faster than the
biological neuron.
• Size and complexity: the brain has about 10^11 neurons and 10^15 interconnections,
so its complexity is higher than the computer's; the size and complexity of the
computer (ANN) depend on the chosen application and the network designer.
• Storage capacity: in the brain, information is stored in interconnections or in synapse
strength, new information is stored without destroying the old, and recall sometimes
fails; in the computer, information is stored in contiguous memory locations,
overloading may destroy older locations, and stored information can be easily
retrieved.
• Tolerance: the brain is fault tolerant, can store and retrieve information even when
interconnections fail, and accepts redundancies; the computer has no fault tolerance,
information is corrupted if network connections are disconnected, and there are no
redundancies.
• Control mechanism: in the brain, control depends on active chemicals and on
whether neuron connections are strong or weak; in the computer, the CPU control
mechanism is very simple.
Application of Soft Computing
The application of soft computing offers the following advantages:
• Applications that cannot be modelled mathematically can be solved.
• Non-linear problems can be solved.
• Human knowledge such as cognition, understanding, recognition and learning can
be introduced into the field of computing.
A few applications of soft computing are listed below:
1. Handwritten Script Recognition using Soft Computing:
This is one of the demanding areas of computer science. It can translate multilingual
documents and sort the various scripts accordingly. A block-level technique is used
by the system to recognise the script from the given multi-script documents. To
classify scripts according to their features, it uses the Discrete Cosine Transform
(DCT) and the Discrete Wavelet Transform (DWT) together.
2. Image Processing and Data Compression using Soft Computing:
Image analysis is a high-level processing technique which includes recognition and
bifurcation of patterns. It is one of the most important parts of the medical field. The
problems of computational complexity and efficiency in classification can easily be
solved using soft computing techniques. Genetic algorithms, genetic programming,
classifier systems, evolutionary strategies, etc. are soft computing techniques that can
be used. These algorithms give fast solutions to pattern recognition problems and help
in analysing medical images obtained from microscopes as well as in examining X-rays.
3. Use of Soft Computing in Automotive Systems and Manufacturing:
The automobile industry has also adopted soft computing to solve some of its major
problems. Classical control methods in vehicles are augmented using fuzzy logic
techniques, which take the example of human behaviour, described in the form of
"If-Then" rules. The logic controller converts the sensor inputs into fuzzy variables
that are then evaluated according to these rules. Fuzzy logic techniques are used in
engine control, automatic transmissions, antiskid steering, etc.
4. Soft Computing based Architecture:
An intelligent building takes inputs from the sensors and controls effectors by using
them. The construction industry uses the technique of DAI (Distributed Artificial
Intelligence) and fuzzy genetic agents to provide the building with capabilities that
match human intelligence. The fuzzy logic is used to create behaviour-based
architecture in intelligent buildings to deal with the unpredictable nature of the
environment, and these agents embed sensory information in the buildings.
5. Soft Computing and Decision Support System:
Soft computing gives an advantage of reducing the cost of the decision support
system. The techniques are used to design, maintain, and maximize the value of the
decision process. The first application of fuzzy logic is to create a decision system
that can predict any sort of risk. The second application is using fuzzy information
that selects the areas which need replacement.
6. Soft Computing Techniques in Bioinformatics:
The techniques of soft computing help in handling the uncertainty and imprecision
that bioinformatics data may have. Soft computing provides distinct low-cost
solutions with the help of algorithms, databases, Fuzzy Sets (FSs), and Artificial
Neural Networks (ANNs). These techniques are best suited to give quality results in
an efficient way.
7. Soft Computing in Investment and Trading:
The data present in the finance field is abundant, and traditional computing is not
able to handle and process that kind of data. Various soft computing techniques help
to handle such noisy data. Pattern recognition is used to analyse the pattern or
behaviour of the data, and time-series analysis is used to predict future trading points.
Artificial Neural Networks (ANN):
Neural networks are information processing systems that are implemented to model the
working of the human brain. It is more of a computational model used to perform tasks
in a better optimized way than the traditional systems.
The inventor of the first neuro-computer, Dr. Robert Hecht-Nielsen, defines a neural
network as a computing system made up of simple, highly interconnected processing
elements, which process information by their dynamic state response to external
inputs.
ANNs are composed of multiple nodes, which imitate biological neurons of human
brain. The neurons are connected by links and they interact with each other. The nodes
can take input data and perform simple operations on the data. The result of these
operations is passed to other neurons. The output at each node is called
its activation or node value. Each link is associated with weight. ANNs are capable of
learning, which takes place by altering weight values.
Artificial neural networks (ANNs) are composed of node layers, containing an
input layer, one or more hidden layers, and an output layer. Each node, or
artificial neuron, connects to others and has an associated weight and
threshold. If the output of any individual node is above the specified threshold
value, that node is activated, sending data to the next layer of the network.
Otherwise, no data is passed along to the next layer of the network.
Artificial Neural Networks (ANN): Introduction
• An artificial neural network may be defined as an information processing model
that is inspired by the way biological nervous systems, such as the brain, process
information.
• This model tries to replicate only the most basic functions of the brain.
• An ANN is composed of a large number of highly interconnected processing units
(neurons) working in unison to solve specific problems.
• Like human beings, ANNs also learn by example.
• An ANN is configured for a specific application, such as spam classification, face
recognition or pattern recognition, through a learning process.
• Each neuron is connected to the others by a connection link.
• Each connection is associated with weights, which contain information about the
particular problem.
• This information is used by the neural network to solve the problem.
• An ANN's collective behaviour is characterised by its ability to learn, recall and
generalise training patterns or data, similar to a human brain.
• ANNs have the capability to model networks of original neurons as found in the
brain.
• Thus the ANN processing elements are called neurons or artificial neurons.
To depict the basic operation of a neural net, consider a set of neurons, say X1 and X2,
transmitting signals to another neuron Y. Here X1, and X2 are input neurons, which
transmit signals, and Y is the output neuron, which receives signals. Input neurons X1
and X2 are connected to the output neuron Y, over a weighted interconnection links
W1, and W2.
For the above simple neural net architecture, the net input is calculated as
yin = x1w1 + x2w2
where x1 and x2 are the activations of the input neurons X1 and X2, i.e., the outputs
of the input signals. The output y of the output neuron Y can be obtained by applying
an activation function over the net input, i.e., a function of the net input:
y = f(yin)
Output = Function (net input calculated)
Characteristics of ANN:
An ANN possesses the following characteristics:
1. It is a neurally implemented mathematical model.
2. There exist a large number of highly interconnected processing elements, called
neurons, in an ANN.
3. The interconnections with their weighted linkages hold the informative knowledge.
4. The input signals arrive at the processing elements through connections and
connecting weights.
5. The processing elements of the ANN have the ability to learn, recall and generalise
from the given data by suitable assignment or adjustment of weights.
6. The computational power is demonstrated only by the collective behaviour of
neurons; no single neuron carries specific information.
Components of the basic Artificial Neuron:
1. Inputs: Inputs are the set of values for which we need to predict an output value.
They can be viewed as features or attributes in a dataset.
2. Weights: Weights are real values attached to each input/feature; they convey the
importance of the corresponding feature in predicting the final output. In other
words, a weight decides how much influence an input will have on the output.
3. Bias: Bias is used for shifting the activation function towards the left or right; you
can compare it to the y-intercept in the equation of a line.
4. Summation Function: The work of the summation function is to bind the weights
and inputs together and calculate their sum.
5. Activation Function: The activation function decides whether a neuron should be
activated or not, taking the weighted sum plus the bias as its input. The purpose of
the activation function is to introduce non-linearity into the output of a neuron.
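The five components above can be sketched in a few lines of Python. This is a minimal illustration, assuming a binary step activation with threshold 0; the function name and the sample input values are illustrative, not from the slides.

```python
def neuron_output(inputs, weights, bias):
    """Forward pass of a single artificial neuron: the summation
    function binds inputs and weights and adds the bias, then a
    binary step activation decides whether the neuron fires."""
    # Summation function: weighted sum of inputs plus bias
    net = sum(x * w for x, w in zip(inputs, weights)) + bias
    # Activation function: binary step with threshold 0
    return 1 if net > 0 else 0

# Illustrative inputs and weights: net = 0.2 + 0.48 - 0.3 = 0.38 > 0
print(neuron_output([0.5, 0.8], [0.4, 0.6], bias=-0.3))  # prints 1
```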
Terminologies of ANN
1. Weights: Weight is a parameter which contains information about the input signal. This
information is used by the net to solve a problem. In ANN architecture, every neuron is
connected to other neurons by means of a directed communication link and every link is
associated with weights. Wij is the weight from processing element ‘i’ source node to
processing element ‘j’ destination node.
2. Bias (b): The bias is a constant value included in the network. Its impact is seen in
calculating the net input. The bias is included by adding a component x0 =1 to the input
vector X. Bias can be positive or negative. The positive bias helps in increasing the net
input of the network. The negative bias helps in decreasing the net input of the network.
3. Threshold (𝜽): Threshold is a set value used in the activation function. In ANN, based
on the threshold value the activation functions are defined and the output is calculated.
4. Learning Rate (𝜶): The learning rate is used to control the amount of weight
adjustment at each step of training. The learning rate ranges from 0 to 1. It determines the
rate of learning at each time step.
Activation Function:
To better understand the role of the activation function, let us assume a person is
performing some work. To make the work more efficient and to obtain exact output,
some force or activation may be given. This activation helps in achieving the exact output.
In a similar way, the activation function is applied over the net input to calculate output
of an ANN.
There are several activation functions. Some of them are as follows.
1. Identity function: It is a linear function and can be defined as
f(x) = x for all x
The output remains the same as the input. The input layer uses the identity activation
function.
2. Binary step function: This function can be defined as
f(x) = 1 if x ≥ θ; 0 if x < θ
where θ represents the threshold value. This function is most widely used in single-layer
nets to convert the net input to a binary output (1 or 0).
3. Bipolar step function: This function can be defined as
f(x) = 1 if x ≥ θ; −1 if x < θ
where θ represents the threshold value. This function is also used in single-layer nets to
convert the net input to a bipolar output (+1 or −1).
4. Sigmoidal functions: The sigmoidal functions are widely used in back-propagation nets
because of the relationship between the value of the functions at a point and the value of
the derivative at that point which reduces the computational burden during training.
Sigmoidal functions are of two types:
i. Binary sigmoid function: It is also termed the logistic sigmoid function or unipolar
sigmoid function. It can be defined as
f(x) = 1 / (1 + e^(−λx))
where λ is the steepness parameter; the function's range is 0 to 1.
ii. Bipolar sigmoid function: This function is defined as
f(x) = (1 − e^(−λx)) / (1 + e^(−λx))
where λ is the steepness parameter and the function's range is between −1 and +1.
5. Ramp function: The ramp function is defined as
f(x) = 1 if x > 1; x if 0 ≤ x ≤ 1; 0 if x < 0
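The activation functions above can be implemented directly; the following Python sketch uses θ = 0 as the default threshold and lam for the steepness parameter λ (names are illustrative choices).

```python
import math

def identity(x):
    # f(x) = x for all x
    return x

def binary_step(x, theta=0.0):
    # 1 if the net input reaches the threshold, else 0
    return 1 if x >= theta else 0

def bipolar_step(x, theta=0.0):
    # +1 if the net input reaches the threshold, else -1
    return 1 if x >= theta else -1

def binary_sigmoid(x, lam=1.0):
    # logistic (unipolar) sigmoid, range (0, 1)
    return 1.0 / (1.0 + math.exp(-lam * x))

def bipolar_sigmoid(x, lam=1.0):
    # range (-1, +1); lam is the steepness parameter
    return (1.0 - math.exp(-lam * x)) / (1.0 + math.exp(-lam * x))

def ramp(x):
    # 1 for x > 1, x for 0 <= x <= 1, 0 for x < 0
    return max(0.0, min(1.0, x))
```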
Q. No.1: For the network shown in Figure I, calculate the net input to the output neuron.
Sol: The given neural net consists of three input neurons and one output neuron. The input and weights are
[x1, x2, x3] = [0.3, 0.5, 0.6]
[w1, w2, w3] = [0.2, 0.1, -0.3]
The net input can be calculated as :
Yin = x1w1 + x2w2 + x3w3
= (0.3 × 0.2) + (0.5 × 0.1) + (0.6 × (−0.3))
= 0.06 + 0.05 − 0.18 = −0.07
Q. No.2: Calculate the net input for the network shown in Figure 2 with bias included in the network.
Solution: The given net consists of two input neurons, a bias and an output neuron. The inputs are:
[x1, x2] = [0.2, 0.6]
[w1, w2] = [0.3, 0.7]
Since the bias is included b = 0.45 and bias input x0 is equal to 1, the net input is calculated as
Yin= b+x1w1 +x2w2
= 0.45 + (0.2 × 0.3) + (0.6 × 0.7)
= 0.45 + 0.06 + 0.42 = 0.93
Therefore yin = 0.93 is the net input.
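Both calculations can be checked with a few lines of Python (variable names are illustrative):

```python
# Q1: three inputs, no bias
x = [0.3, 0.5, 0.6]
w = [0.2, 0.1, -0.3]
yin1 = sum(xi * wi for xi, wi in zip(x, w))
print(round(yin1, 2))  # -0.07

# Q2: two inputs with bias b = 0.45 (bias input x0 = 1)
b = 0.45
x = [0.2, 0.6]
w = [0.3, 0.7]
yin2 = b + sum(xi * wi for xi, wi in zip(x, w))
print(round(yin2, 2))  # 0.93
```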
McCulloch-Pitts Neuron:
The McCulloch-Pitts neuron, proposed by Warren McCulloch and Walter Pitts in
1943, was the earliest neural network model. It is usually called the M-P neuron. M-P
neurons are connected by directed weighted paths. It should be noted that the
activation of an M-P neuron is binary, that is, at any time step the neuron may fire or
may not fire. The weights associated with the communication links may be excitatory
(weight is positive) or inhibitory (weight is negative).
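As a minimal sketch of the M-P neuron in Python: a classic illustration (not from the slides) is a two-input neuron with both excitatory weights set to 1 and a firing threshold of 2, which realises the logical AND function; the helper name mp_neuron is an illustrative choice.

```python
def mp_neuron(inputs, weights, threshold):
    """McCulloch-Pitts neuron: binary inputs and output;
    fires (1) only when the weighted sum reaches the threshold."""
    net = sum(x * w for x, w in zip(inputs, weights))
    return 1 if net >= threshold else 0

# AND via two excitatory links (weight +1 each) and threshold 2
for x1 in (0, 1):
    for x2 in (0, 1):
        print(x1, x2, mp_neuron([x1, x2], [1, 1], threshold=2))
```

Only the (1, 1) input drives the net input to the threshold, so only that pattern fires.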
Hebbian Network Architecture
The Hebbian learning rule is one of the earliest and simplest learning rules for
neural networks. It was proposed by Donald Hebb, who proposed that if two
interconnected neurons are both "on" at the same time, then the weight between
them should be increased. A Hebbian network is a single-layer neural network which
consists of one input layer with many input units and one output layer with one
output unit. The bias, which increases the net input, has the value 1.
Donald Hebb stated in 1949 that in the brain, learning is performed by a change in
the synaptic gap. Hebb explained it: "When an axon of cell A is near enough to excite
cell B, and repeatedly or persistently takes part in firing it, some growth process or
metabolic change takes place in one or both cells such that A's efficiency, as one of
the cells firing B, is increased."
According to the Hebb rule, the weight vector is found to increase proportionately to
the product of the input and the learning signal. Here the learning signal is equal to the
neuron's output. In Hebb learning, if two interconnected neurons are 'on'
simultaneously then the weights associated with these neurons can be increased by the
modification made in their synaptic gap (strength). The weight update in Hebb rule is
given by
wi(new) = wi(old) + xi·y
or
wi(new) = wi(old) + Δwi (where Δwi = xi·y)
The Hebb rule is more suited for bipolar data than binary data. If binary data is used,
the above weight updation formula cannot distinguish two conditions namely;
1. A training pair in which an input unit is "on" and target value is "off."
2. A training pair in which both the input unit and the target value are "off."
Thus, there are limitations in Hebb rule application over binary data. Hence, the
representation using bipolar data is advantageous.
Hebb Net Algorithm
Step 1: Initialize all weights and bias to zero, i.e., wi = 0 for i = 1 to n, b = 0.
Here, n is the number of input neurons.
Step 2: For each input training vector and target output pair, s : t, do Steps 3–5.
Step 3: Set activation for input units: xi = Si, i = 1, …, n.
Step 4: Set activation for output unit: y = t.
Step 5: Adjust the weights and bias:
wi(new) = wi(old) + xi·y
b(new) = b(old) + y
The above five steps complete the algorithmic process.
In Step 5, the weight updating formula can also be given in vector form as
w(new) = w(old) + xy
Here the change in weight can be expressed as
Δw = xy
As a result,
w(new) = w(old) + Δw
The Hebb rule can be used for pattern association, pattern categorization,
pattern classification and over a range of other areas.
Design a Hebb net to implement the logical AND function (use bipolar inputs and
targets)
The training data for the AND function (bipolar): inputs [x1 x2 b] = [1 1 1],
[1 −1 1], [−1 1 1], [−1 −1 1] with targets 1, −1, −1, −1.
Initially the weights and bias are set to zero, i.e.
w1 = w2 = b = 0
Setting the initial weights as the old weights and applying the Hebb rule Δwi = xi·y
to each training pair in turn:
First input [1 1 1], target = 1: w1 = 0 + 1 = 1, w2 = 0 + 1 = 1, b = 0 + 1 = 1
Second input [1 −1 1], target = −1: w1 = 1 − 1 = 0, w2 = 1 + 1 = 2, b = 1 − 1 = 0
Third input [−1 1 1], target = −1: w1 = 0 + 1 = 1, w2 = 2 − 1 = 1, b = 0 − 1 = −1
Fourth input [−1 −1 1], target = −1; the new weights here are
w1(new) = w1(old) + Δw1 = 1 + 1 = 2
w2(new) = w2(old) + Δw2 = 1 + 1 = 2
b(new) = b(old) + Δb = −1 + (−1) = −2
The final weights after presenting all four patterns are w1 = 2, w2 = 2, b = −2.
Design a Hebb net to implement the logical OR function (use bipolar inputs and
targets)
The training data for the OR function (bipolar): inputs [x1 x2 b] = [1 1 1],
[1 −1 1], [−1 1 1], [−1 −1 1] with targets 1, 1, 1, −1.
Initially the weights and bias are set to zero, i.e.
w1 = w2 = b = 0
First input [x1 x2 b] = [1 1 1] and target = 1 (i.e. y = 1)
Setting the initial weights as the old weights and applying the Hebb rule:
The weight change here is:
Δwi = xi·y
Δw1 = x1·y = 1 × 1 = 1
Δw2 = x2·y = 1 × 1 = 1
Δb = y = 1
The new weights here are:
w1(new) = w1(old) + Δw1 = 0 + 1 = 1
w2(new) = w2(old) + Δw2 = 0 + 1 = 1
b(new) = b(old) + Δb = 0 + 1 = 1
Second input [x1 x2 b] = [1 −1 1] and target = 1 (i.e. y = 1)
The weight change here is:
Δwi = xi·y
Δw1 = x1·y = 1 × 1 = 1
Δw2 = x2·y = −1 × 1 = −1
Δb = y = 1
The new weights here are:
w1(new) = w1(old) + Δw1 = 1 + 1 = 2
w2(new) = w2(old) + Δw2 = 1 + (−1) = 0
b(new) = b(old) + Δb = 1 + 1 = 2
Third input [x1 x2 b] = [−1 1 1] and target = 1 (i.e. y = 1)
The weight change here is:
Δwi = xi·y
Δw1 = x1·y = −1 × 1 = −1
Δw2 = x2·y = 1 × 1 = 1
Δb = y = 1
The new weights here are:
w1(new) = w1(old) + Δw1 = 2 + (−1) = 1
w2(new) = w2(old) + Δw2 = 0 + 1 = 1
b(new) = b(old) + Δb = 2 + 1 = 3
Fourth input [x1 x2 b] = [−1 −1 1] and target = −1 (i.e. y = −1)
The weight change here is:
Δwi = xi·y
Δw1 = x1·y = −1 × (−1) = 1
Δw2 = x2·y = −1 × (−1) = 1
Δb = y = −1
The new weights here are:
w1(new) = w1(old) + Δw1 = 1 + 1 = 2
w2(new) = w2(old) + Δw2 = 1 + 1 = 2
b(new) = b(old) + Δb = 3 + (−1) = 2
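The Hebb rule worked through above can be reproduced with a short Python sketch; the function name train_hebb is an illustrative choice, and the training pairs are the bipolar AND and OR data from the examples.

```python
def train_hebb(samples):
    """One pass of the Hebb rule over bipolar training pairs:
    wi(new) = wi(old) + xi*y,  b(new) = b(old) + y."""
    n = len(samples[0][0])
    w, b = [0] * n, 0            # Step 1: weights and bias start at zero
    for x, y in samples:         # Steps 2-5 for each training pair
        for i in range(n):
            w[i] += x[i] * y     # weight update
        b += y                   # bias update
    return w, b

# Bipolar AND: target is +1 only for (1, 1)
and_data = [([1, 1], 1), ([1, -1], -1), ([-1, 1], -1), ([-1, -1], -1)]
print(train_hebb(and_data))  # ([2, 2], -2)

# Bipolar OR: target is -1 only for (-1, -1)
or_data = [([1, 1], 1), ([1, -1], 1), ([-1, 1], 1), ([-1, -1], -1)]
print(train_hebb(or_data))   # ([2, 2], 2)
```

The final values match the hand calculations: w1 = w2 = 2 with b = −2 for AND, and w1 = w2 = 2 with b = 2 for OR.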
Perceptron Introduction
The Perceptron algorithm consists of a single layer of neurons that process inputs,
compute a weighted total, and then apply an activation function to get an output.
Based on a set of features or input variables, the algorithm learns to classify input
data into one of two potential groups. To reduce the discrepancy between the
expected output and the actual output, the weights of the neurons are changed.
The Perceptron is a Machine Learning algorithm for supervised learning of binary
classification tasks. A Perceptron can also be understood as an artificial neuron or
neural network unit that helps detect certain input data computations in business
intelligence. The Perceptron model is also treated as one of the best and simplest
types of artificial neural networks. It is a supervised learning algorithm for binary
classifiers. Hence, we can consider it a single-layer neural network with four main
parameters: input values, weights and bias, net sum, and an activation function.
Basic Components of Perceptron
Mr. Frank Rosenblatt invented the perceptron model as a binary classifier. It contains
five main components: inputs, weights, net input function, activation function and
output.
In the Perceptron Learning Rule, the predicted output is compared with
the known output. If it does not match, the error is propagated
backward to allow weight adjustment to happen.
Flowchart for Perceptron network with single output
Perceptron Training Algorithm for Single Output Classes
Step 0: Initialize the weights and bias (for easy calculation they can be set to 0). Also initialize
the learning rate α (0 < α ≤ 1). For simplicity, α is set to 1.
Step 1: Perform Steps 2–6 until the final stopping condition is false.
Step 2: Perform Steps 3–5 for each training pair indicated by s:t.
Step 3: The input layer containing input units is applied with identity activation functions:
xi = si
Step 4: Calculate the output of the network. To do so, first obtain the net input:
y_in = b + Σ (i = 1 to n) xi wi
where "n" is the number of input neurons in the input layer. Then apply the activation
function over the net input calculated to obtain the output:
y = f(y_in) =  1  if y_in > 0
               0  if y_in = 0
              −1  if y_in < 0
Step 5: Weight and bias adjustment: Compare the value of the actual
(calculated) output and the desired (target) output. If t ≠ y, then
w(new) = w(old) + α t x
b(new) = b(old) + α t
Else we have
w(new) = w(old)
b(new) = b(old)
Step 6: Train the network until there is no weight change. This is the stopping
condition for the network. If this condition is not met, start again from Step 2.
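Steps 0–6 can be collected into a single training loop. The sketch below assumes α = 1 and threshold 0, as in the text; it is an illustration, not a definitive implementation:

```python
def train_perceptron(samples, n_inputs, alpha=1, max_epochs=100):
    """Perceptron training (Steps 0-6): samples is a list of (x, t) pairs
    with bipolar target t; epochs repeat until no weight changes occur."""
    w = [0] * n_inputs                   # Step 0: weights and bias start at 0
    b = 0
    for _ in range(max_epochs):          # Step 1: loop until the stopping condition
        changed = False
        for x, t in samples:             # Step 2: each training pair s:t
            y_in = b + sum(xi * wi for xi, wi in zip(x, w))   # Step 4: net input
            y = 1 if y_in > 0 else (0 if y_in == 0 else -1)   # activation
            if y != t:                   # Step 5: adjust weights only on error
                w = [wi + alpha * t * xi for wi, xi in zip(w, x)]
                b += alpha * t
                changed = True
        if not changed:                  # Step 6: stop when an epoch changes nothing
            break
    return w, b

# Bipolar AND function as a usage example
and_samples = [([1, 1], 1), ([1, -1], -1), ([-1, 1], -1), ([-1, -1], -1)]
print(train_perceptron(and_samples, 2))  # ([1, 1], -1)
```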
Implement the AND function using a perceptron network for bipolar inputs and targets.
Solution: Perceptron learning rule: the learning signal is the difference between the
calculated output (y) and the actual output (target) of a neuron.
The output y is obtained on the basis of the net input calculated and the activation
function applied over the net input.
Weights are updated using the formula:
If y ≠ t, then
w(new) = w(old) + α t x   (α = learning rate)
b(new) = b(old) + α t
Else we have
w(new) = w(old)
b(new) = b(old)
1. The perceptron network, which uses the perceptron
learning rule, is used to train the AND function.
2. The input patterns are presented to the network
one by one.
3. When all the input patterns have been presented,
one epoch is said to be completed.
4. The initial weights and threshold are set to zero.
5. The learning rate is set equal to 1.
Initially w1 = w2 = b = 0 and θ (threshold) = 0.
y_in = b + x1 w1 + x2 w2
Δw1 = α t x1
Δw2 = α t x2
Δb = α t
AND function using Perceptron Rule
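The epoch-by-epoch table for the AND function can be reproduced with a short trace (a sketch using the rule and settings stated above, α = 1 and θ = 0; the printout format is ours):

```python
# Epoch-by-epoch trace of training the bipolar AND function (alpha = 1, theta = 0)
samples = [([1, 1], 1), ([1, -1], -1), ([-1, 1], -1), ([-1, -1], -1)]
w, b = [0, 0], 0
for epoch in (1, 2):
    print(f"Epoch {epoch}")
    for x, t in samples:
        y_in = b + x[0] * w[0] + x[1] * w[1]
        y = 1 if y_in > 0 else (0 if y_in == 0 else -1)
        if y != t:  # weights change only when y differs from the target
            w = [w[0] + t * x[0], w[1] + t * x[1]]
            b += t
        print(f"  x={x} t={t:2d} y_in={y_in:2d} y={y:2d} -> w={w}, b={b}")
# Epoch 2 produces no changes: final w = [1, 1], b = -1
```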
Question: Find the weights required to perform the following classification using a
perceptron network.
Solution: Assume initial weights w1 = w2 = w3 = w4 = 0, b = 0, and learning rate α = 1.
Calculate the net input using y_in = b + x1 w1 + x2 w2 + x3 w3 + x4 w4
and apply the activation function to calculate y. For the weight change use Δwi = α t xi and
update the weights using w(new) = w(old) + Δwi.
EPOCH 1
EPOCH 2
For the second input t = y, so there is no need to update the weights. Since not all four
inputs are classified correctly (i.e., the values in column t and column y do not match
for all 4 inputs), we proceed to a third epoch.
EPOCH 3
In the above table, y = t for all four inputs, so there is no need to update the weights.
Using weights w1 = −2, w2 = 2, w3 = 0, w4 = 2 and b = 0, all four inputs are
classified correctly (the stopping condition is true). The final network is as follows.
Question 1: Implement the OR function with binary inputs and bipolar targets using the
perceptron training algorithm, up to 3 epochs.
Solution: The truth table for the OR function with binary inputs and bipolar targets is
shown below.
The perceptron network, which uses the perceptron learning rule, is used to train the
OR function. The network architecture is shown in the figure. The initial values of
the weights and bias are taken as zero, i.e.,
w1 = w2 = b = 0
Also, the learning rate is 1 and the threshold θ is 0.2. So the activation function
becomes
y = f(y_in) =  1  if y_in > 0.2
               0  if −0.2 ≤ y_in ≤ 0.2
              −1  if y_in < −0.2
The final weights at the end of the third epoch are w1 = 2, w2 = 1, and b = −1.
Further epochs have to be done for the convergence of the network.
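A trace of the first three epochs under these settings can be sketched as follows (binary inputs, bipolar targets, α = 1, θ = 0.2; the code is illustrative, and the final values it prints are consistent with the w1 = 2 and b = −1 quoted in the text):

```python
# OR function: binary inputs, bipolar targets
samples = [([1, 1], 1), ([1, 0], 1), ([0, 1], 1), ([0, 0], -1)]
w, b, alpha, theta = [0, 0], 0, 1, 0.2

def activate(net, theta):
    """Threshold activation with a dead zone of width 2*theta around zero."""
    if net > theta:
        return 1
    if net < -theta:
        return -1
    return 0

for epoch in range(3):  # exactly three epochs, as the exercise asks
    for x, t in samples:
        y = activate(b + x[0] * w[0] + x[1] * w[1], theta)
        if y != t:
            w = [w[0] + alpha * t * x[0], w[1] + alpha * t * x[1]]
            b += alpha * t

print(w, b)  # [2, 1] -1
```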
Question 2: Find the weights required to perform the following classification using a
perceptron network. The vectors (1, 1, 1, 1) and (−1, 1, −1, −1) belong to the
class (so they have target value 1); the vectors (1, 1, 1, −1) and (1, −1, −1, 1) do not belong
to the class (so they have target value −1). Assume the learning rate is 1, the initial weights
are 0, and the threshold θ = 0.2.
Solution: Let w1 = w2 = w3 = w4 = b = 0 and learning rate = 1. Since the threshold is
0.2, the activation function is
y = f(y_in) =  1  if y_in > 0.2
               0  if −0.2 ≤ y_in ≤ 0.2
              −1  if y_in < −0.2
Question 3: Find the weights using a perceptron network for the ANDNOT function
when all the inputs are presented only once. Use bipolar inputs and targets.
The truth table of the ANDNOT function is given below.
Solution: Initialize w1 = w2 = b = 0; the learning rate is 1 and the threshold θ = 0, so the
activation function is
y = f(y_in) =  1  if y_in > 0
               0  if y_in = 0
              −1  if y_in < 0
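Since each input is presented only once, a single pass over the patterns suffices. A sketch under the stated settings (α = 1, θ = 0); the truth-table values below are the standard bipolar ANDNOT ones, assumed here because the table itself is not reproduced in this excerpt:

```python
# ANDNOT function (x1 AND NOT x2), bipolar inputs and targets
samples = [([1, 1], -1), ([1, -1], 1), ([-1, 1], -1), ([-1, -1], -1)]
w, b = [0, 0], 0
for x, t in samples:  # one epoch: every pattern presented exactly once
    net = b + x[0] * w[0] + x[1] * w[1]
    y = 1 if net > 0 else (0 if net == 0 else -1)
    if y != t:  # perceptron rule with alpha = 1
        w = [w[0] + t * x[0], w[1] + t * x[1]]
        b += t

print(w, b)  # [1, -1] -1
```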