unit4(AI)2024.docx-1
Value-Based
In a value-based reinforcement learning method, the
agent tries to maximize a value function V(s). The
value function is the long-term return the agent
expects from the current state s under policy π.
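As a rough illustration of the "long-term return" that V(s) estimates, the sketch below sums a sequence of rewards with a discount factor. The discount factor gamma, the function name discounted_return and the sample rewards are assumptions made for this example; the notes above do not define them.

```python
def discounted_return(rewards, gamma):
    """Sum of rewards, each weighted by gamma raised to its time step."""
    total = 0.0
    for step, reward in enumerate(rewards):
        total += (gamma ** step) * reward
    return total


# Hypothetical rewards collected after leaving state s under some policy.
sample_rewards = [1.0, 1.0, 1.0]
print(discounted_return(sample_rewards, gamma=0.5))  # 1.0 + 0.5 + 0.25 = 1.75
```

Under this assumption, V(s) would be the average of such returns over many episodes that start in s and follow policy π.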
Policy-based
In a policy-based RL method, you try to find a policy
such that the action performed in every state helps
you gain the maximum reward in the future.
Two types of policy-based methods are:
● Deterministic: For any given state, the policy π
always produces the same action.
● Stochastic: In each state, every action is chosen
with a certain probability, which is given by the
stochastic policy:
$\pi(a \mid s) = P[A_t = a \mid S_t = s]$
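The difference between the two policy types can be sketched in a few lines of Python. This is a minimal illustration, not a training method: the state names, action names and probabilities are hypothetical values chosen for the example.

```python
import random

# Deterministic policy: each state maps to exactly one action.
deterministic_policy = {"s0": "right", "s1": "left"}

# Stochastic policy: each state maps to a probability for every action,
# i.e. an explicit table of pi(a | s). Values are made up for illustration.
stochastic_policy = {
    "s0": {"left": 0.2, "right": 0.8},
    "s1": {"left": 0.5, "right": 0.5},
}


def act_deterministic(state):
    """Return the single action the policy prescribes for this state."""
    return deterministic_policy[state]


def act_stochastic(state, rng=random):
    """Sample an action according to the probabilities pi(a | state)."""
    probs = stochastic_policy[state]
    actions = list(probs)
    weights = [probs[a] for a in actions]
    return rng.choices(actions, weights=weights, k=1)[0]


print(act_deterministic("s0"))  # always "right"
print(act_stochastic("s0"))     # "right" about 80% of the time
```

Calling act_deterministic("s0") repeatedly always gives the same action, while act_stochastic("s0") can differ from call to call.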
Model-Based
In this reinforcement learning method, you need to
create a virtual model of each environment. The
agent learns to perform in that specific environment.
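One way to picture a "virtual model" is as a table that predicts the next state and reward for each state-action pair, which the agent can then plan against without touching the real environment. The sketch below assumes a tiny deterministic model and uses value iteration as the planning step; the states, actions, rewards, the discount factor GAMMA and the function name plan are all hypothetical choices for this example.

```python
GAMMA = 0.9  # assumed discount factor
ACTIONS = ("left", "right")
STATES = ("A", "B")  # "end" is a terminal state with value 0

# Hypothetical model: (state, action) -> (next_state, reward).
MODEL = {
    ("A", "left"): ("A", 0.0),
    ("A", "right"): ("B", 0.0),
    ("B", "left"): ("A", 0.0),
    ("B", "right"): ("end", 1.0),
}


def plan(model, gamma=GAMMA, sweeps=50):
    """Plan with value iteration using only the model's predictions."""
    values = {"A": 0.0, "B": 0.0, "end": 0.0}

    def backup(state, action):
        # Predicted reward plus discounted value of the predicted next state.
        next_state, reward = model[(state, action)]
        return reward + gamma * values[next_state]

    for _ in range(sweeps):
        for state in STATES:
            values[state] = max(backup(state, a) for a in ACTIONS)

    # Greedy policy: pick the action with the best predicted outcome.
    policy = {s: max(ACTIONS, key=lambda a: backup(s, a)) for s in STATES}
    return values, policy


values, policy = plan(MODEL)
print(policy)  # {'A': 'right', 'B': 'right'}
```

Because the plan depends entirely on MODEL, it only applies to the environment that the model describes, which matches the point above that the agent learns to perform in that specific environment.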
Positive:
Positive reinforcement is defined as an event that
occurs because of a specific behavior. It increases the
strength and frequency of the behavior and has a
positive effect on the action taken by the agent.
This type of reinforcement helps you maximize
performance and sustain change for a more extended
period. However, too much reinforcement may lead to
over-optimization of a state, which can affect the
results.
Negative:
Negative reinforcement is defined as the strengthening
of a behavior that occurs because a negative condition
is stopped or avoided. It helps you define the minimum
standard of performance. However, the drawback of this
method is that it provides only enough to meet the
minimum behavior.