Chapter Five
Learning Agents
5.1. Learning agents and their components
An agent is learning if it improves its performance on future tasks after making observations
about the world. Why would we want an agent to learn? If the design of the agent can be
improved, why wouldn’t the designers just program in that improvement to begin with? There
are three main reasons. First, the designers cannot anticipate all possible situations that the agent
might find itself in. For example, a robot designed to navigate mazes must learn the layout of
each new maze it encounters. Second, the designers cannot anticipate all changes over time; a
program designed to predict tomorrow’s stock market prices must learn to adapt when conditions
change from boom to bust. Third, sometimes human programmers have no idea how to program
a solution themselves. For example, most people are good at recognizing the faces of family
members, but even the best programmers are unable to program a computer to accomplish that
task, except by using learning algorithms.
A learning agent in AI is an agent that can learn from its past experiences. It starts to act
with basic knowledge and then adapts automatically through learning. A learning agent has four
main conceptual components: the learning element, the performance element, the critic, and the
problem generator.
You can review the details of the learning agent components in Chapter 2.
Figure 5.1 A general model of a learning agent
A machine learning algorithm is an algorithm that is able to learn from data. But what do we
mean by learning? In 1997 Tom Mitchell gave a definition of machine learning: ''A computer
program is said to learn from experience E with respect to some class of tasks T and
performance measure P, if its performance at tasks in T, as measured by P, improves with
experience E.''
5.3. How does Machine Learning work
A machine learning system learns from previous data, builds a prediction model, and predicts the
output whenever it receives new data. The accuracy of the predicted output depends on the amount
of training data: in general, more data helps to build a better model. Suppose we have a complex
problem that requires predictions. Instead of writing code for it by hand, we feed the data to a
generic algorithm, and the algorithm builds the logic from the data and predicts the output.
The block diagram below explains how a machine learning algorithm works.
Machine learning is commonly divided into three types:
1. Supervised learning
2. Unsupervised learning
3. Reinforcement learning
1. Supervised Learning:
Supervised learning is a type of machine learning in which the algorithm is trained on a labeled
dataset. The algorithm is given input features together with the corresponding output labels, and
it learns a mapping from features to targets that generalizes to new, unseen data.
Let's understand supervised learning with an example. Suppose we have an input dataset of cat
and dog images, each labeled with the correct animal. First, we train the machine on these images
so that it learns the distinguishing features, such as the shape and size of the tail, the shape
of the eyes, the color, and the height (dogs are usually taller, cats smaller). After training is
complete, we input a new picture of a cat and ask the machine to identify it. Because the machine
has been trained, it checks the features of the object, such as height, shape, color, eyes, ears,
and tail, finds that they match a cat, and puts the picture in the Cat category. This is how a
machine identifies objects in supervised learning.
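The cat and dog example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two features (height and weight), the six training samples, and the function names `train_centroids` and `predict` are all hypothetical, and the nearest-centroid rule used here is just one very simple supervised classifier, not the method a real image recognizer would use.

```python
from math import dist

def train_centroids(samples, labels):
    """Training step: average the feature vectors of each labeled class."""
    sums, counts = {}, {}
    for features, label in zip(samples, labels):
        acc = sums.setdefault(label, [0.0] * len(features))
        for k, value in enumerate(features):
            acc[k] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [v / counts[label] for v in acc] for label, acc in sums.items()}

def predict(centroids, features):
    """Prediction step: return the label whose centroid is closest."""
    return min(centroids, key=lambda label: dist(centroids[label], features))

# Hypothetical labeled data: (height in cm, weight in kg) -> animal
train_x = [(25, 4.0), (23, 3.5), (27, 5.0), (55, 20.0), (60, 25.0), (50, 18.0)]
train_y = ["cat", "cat", "cat", "dog", "dog", "dog"]

centroids = train_centroids(train_x, train_y)
print(predict(centroids, (24, 4.2)))   # unseen, cat-sized animal
print(predict(centroids, (58, 22.0)))  # unseen, dog-sized animal
```

The labels in `train_y` play the role of the supervision: without them the training step would have no way to know which samples belong together.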
The main goal of the supervised learning technique is to map the input variable (x) to the
output variable (y). Some real-world applications of supervised learning are risk assessment,
fraud detection, and spam filtering.
Supervised machine learning can be classified into two types of problems, which are:
Classification
Regression
Classification:
Classification is a type of supervised learning where the algorithm learns to assign input data to a
specific category or class based on input features. The output labels in classification are discrete
values. Classification algorithms can be binary, where the output is one of two possible classes,
or multiclass, where the output can be one of several classes.
Examples of supervised learning algorithms include:
Linear Regression
Logistic Regression
Decision Trees
Support Vector Machines (SVM)
Neural Networks
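Of the algorithms listed above, linear regression is perhaps the easiest to sketch. The snippet below fits a straight line to one input variable with the closed-form least-squares solution. The data points and the function name `fit_line` are made up for illustration, and the points were chosen to lie exactly on y = 2x + 1 so that the result is easy to check by hand.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept for one input variable."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance of x and y, and variance of x (both unnormalized)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    slope = cov / var
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical training data lying on the line y = 2x + 1
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]

slope, intercept = fit_line(xs, ys)
print(slope, intercept)          # learned parameters
print(slope * 5.0 + intercept)   # prediction for the unseen input x = 5
```

Unlike classification, the output here is a continuous value rather than a discrete class label.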
2. Unsupervised Machine Learning:
Unsupervised learning differs from supervised learning in that, as its name suggests, there is
no need for supervision. The machine is trained on an unlabeled dataset: the data is neither
classified nor labelled, and the model must find structure in it without any guidance about the
correct output.
The main aim of an unsupervised learning algorithm is to group or categorize the unsorted
dataset according to similarities, patterns, and differences. The machine is instructed to
find the hidden patterns in the input dataset.
Let's take an example to understand it more precisely. Suppose there is a basket of fruit images,
and we input it into the machine learning model. The images are totally unknown to the model,
and the task of the machine is to find the patterns and categories of the objects.
The machine discovers patterns and differences on its own, such as differences in color and
shape, and uses them to group new images when it is tested with the test dataset.
Unsupervised Learning can be further classified into two types, which are:
Clustering
Association
Clustering
The clustering technique is used when we want to find the inherent groups in the data. It is a
way to group objects into clusters such that the objects with the most similarities remain in
one group and have few or no similarities with the objects of other groups. An example of
clustering is grouping customers by their purchasing behavior.
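As a rough sketch of clustering customers by purchasing behavior, the snippet below uses k-means, one common clustering algorithm (the text does not name a specific one, so this choice is an assumption). The customer data, the two features, the starting centroids, and the function name `k_means` are all hypothetical.

```python
from math import dist

def k_means(points, centroids, max_iters=100):
    """Alternate between assigning points to the nearest centroid and
    moving each centroid to the mean of its assigned points."""
    centroids = [tuple(c) for c in centroids]
    labels = []
    for _ in range(max_iters):
        # Assignment step: index of the closest centroid for every point
        labels = [min(range(len(centroids)), key=lambda k: dist(p, centroids[k]))
                  for p in points]
        # Update step: mean of each cluster (keep the old centroid if empty)
        new_centroids = []
        for k, old in enumerate(centroids):
            members = [p for p, label in zip(points, labels) if label == k]
            if members:
                new_centroids.append(tuple(sum(axis) / len(members)
                                           for axis in zip(*members)))
            else:
                new_centroids.append(old)
        if new_centroids == centroids:  # no centroid moved: converged
            break
        centroids = new_centroids
    return labels, centroids

# Hypothetical customers: (visits per month, monthly spend in hundreds)
customers = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 9.0), (8.5, 9.5)]

# Start from one occasional and one frequent shopper (an arbitrary choice)
labels, centroids = k_means(customers, [customers[0], customers[-1]])
print(labels)     # cluster index for each customer
print(centroids)  # the two learned cluster centers
```

No labels are supplied anywhere: the two groups emerge only from the distances between the data points.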
Association
Popular association rule learning algorithms include:
Apriori algorithm
Eclat
FP-growth algorithm
3. Reinforcement Machine Learning
Reinforcement learning works on a feedback-based process in which an AI agent (a software
component) automatically explores its surroundings by trial and error: it takes actions,
learns from experience, and improves its performance. The agent is rewarded for each good
action and punished for each bad action; hence the goal of a reinforcement learning agent is
to maximize its rewards.
The reinforcement learning process is similar to the way a human being learns; for example, a
child learns various things from experience in day-to-day life. An example of reinforcement
learning is playing a game, where the game is the environment, the moves of the agent at each
step define the states, and the goal of the agent is to get a high score. The agent receives
feedback in the form of rewards and punishments.
Because of the way it works, reinforcement learning is employed in fields such as game theory,
operations research, information theory, and multi-agent systems.
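The reward-driven loop described above can be sketched with tabular Q-learning, one standard reinforcement learning algorithm (chosen here as an assumption; the text does not prescribe one). The environment is a hypothetical five-cell corridor in which the agent starts at the left end and receives a reward of 1 only on reaching the right end. All names and parameter values are illustrative. The learning rate is set to 1.0, which is reasonable only because this toy environment is deterministic.

```python
import random

ACTIONS = ("left", "right")

def step(state, action, n_states):
    """Deterministic corridor: reward 1 only on reaching the last cell."""
    nxt = max(state - 1, 0) if action == 0 else state + 1
    done = nxt == n_states - 1
    return nxt, (1.0 if done else 0.0), done

def q_learning(n_states=5, episodes=50, alpha=1.0, gamma=0.9, epsilon=0.2, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(n_states)]  # q[state][action]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            if rng.random() < epsilon:
                action = rng.randrange(2)       # explore: random action
            else:
                best = max(q[state])            # exploit, ties broken randomly
                action = rng.choice([a for a in (0, 1) if q[state][a] == best])
            nxt, reward, done = step(state, action, n_states)
            # Target: immediate reward plus discounted best future value
            target = reward if done else reward + gamma * max(q[nxt])
            q[state][action] += alpha * (target - q[state][action])
            state = nxt
    return q

q = q_learning()
policy = [ACTIONS[row.index(max(row))] for row in q[:-1]]
print(policy)  # ['right', 'right', 'right', 'right']
```

The agent is never told that moving right is correct. It discovers this because only sequences of moves that end at the goal ever produce a reward, and that reward is propagated backwards through the table.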
The given figure illustrates a typical biological neural network.
A typical artificial neural network looks something like the given figure.
Dendrites in a biological neural network correspond to inputs in an artificial neural network,
the cell nucleus corresponds to nodes, synapses correspond to weights, and the axon corresponds
to the output.
Relationship between biological and artificial neural networks:
Biological Neural Network    Artificial Neural Network
Dendrites                    Inputs
Cell nucleus                 Nodes
Synapse                      Weights
Axon                         Output
Neural networks are composed of nodes or units (see Figure 18.19) connected by directed links.
A link from unit i to unit j serves to propagate the activation a_i from i to j. Each link also
has a numeric weight w_{i,j} associated with it, which determines the strength and sign of the
connection. Just as in linear regression models, each unit has a dummy input a_0 = 1 with an
associated weight w_{0,j}.
Each unit j first computes a weighted sum of its inputs:
$$in_j = \sum_{i=0}^{n} w_{i,j}\, a_i$$
Then it applies an activation function g to this sum to derive the output:
$$a_j = g(in_j) = g\left(\sum_{i=0}^{n} w_{i,j}\, a_i\right)$$
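The computation of a single unit, a weighted sum of the inputs (including the dummy input a_0 = 1) followed by an activation function g, can be sketched as follows. The function names, the two example activation functions, and the example weights are assumptions made for illustration; the text does not fix a particular g.

```python
import math

def sigmoid(x):
    """Smooth activation function that squashes any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def hard_threshold(x):
    """Step activation function: the unit fires (1) when the sum is non-negative."""
    return 1.0 if x >= 0 else 0.0

def unit_output(inputs, weights, g=sigmoid):
    """Output a_j of one unit. weights[0] is w_{0,j}, paired with a_0 = 1."""
    if len(weights) != len(inputs) + 1:
        raise ValueError("need one weight per input plus one for the dummy input")
    activations = [1.0] + list(inputs)                       # prepend a_0 = 1
    in_j = sum(w * a for w, a in zip(weights, activations))  # weighted sum
    return g(in_j)                                           # apply activation

# Hypothetical weights that make a threshold unit behave like logical AND
and_weights = [-1.5, 1.0, 1.0]
for a1 in (0, 1):
    for a2 in (0, 1):
        print(a1, a2, unit_output([a1, a2], and_weights, g=hard_threshold))
```

With these weights the weighted sum is non-negative only when both inputs are 1, so the unit outputs 1 for that case and 0 for the other three.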