Reinforcement Learning
Value-Based:
In a value-based Reinforcement Learning method, you try to maximize a value
function V(s). The value of a state is the long-term return the agent expects
to collect from that state onward while following policy π.
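
As a rough illustration of "long-term return", the sketch below sums the rewards of an episode and then averages such returns to estimate V(s). The discount factor gamma, the function names, and the sample reward lists are assumptions made for this example; they do not come from the text above.

```python
def discounted_return(rewards, gamma=0.9):
    """Return r_0 + gamma*r_1 + gamma^2*r_2 + ... for one episode.

    gamma (the discount factor) is an assumption of this sketch.
    """
    g = 0.0
    # Work backwards so each earlier step adds its reward to the
    # discounted return of everything that follows it.
    for r in reversed(rewards):
        g = r + gamma * g
    return g


def estimate_value(episodes, gamma=0.9):
    """Estimate V(s) as the average return over sample episodes.

    Every episode is a list of rewards collected after starting in the
    same state s and following the same policy.
    """
    returns = [discounted_return(ep, gamma) for ep in episodes]
    return sum(returns) / len(returns)


# Hypothetical reward sequences, all starting from the same state.
sample_episodes = [[0, 0, 10], [1, 1, 1]]
value_of_s = estimate_value(sample_episodes, gamma=0.5)
```

A value-based method would compare estimates like value_of_s across states and prefer the actions that lead to the states with the highest value.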
Policy-based:
In a policy-based RL method, you try to find a policy such that the action
performed in every state helps you gain the maximum reward in the future.
A policy can be deterministic or stochastic:
Deterministic: For any state, the same action is produced by the policy π.
Stochastic: Every action has a certain probability, which is determined by
the following equation.
Stochastic Policy:
π(a|s) = P[A_t = a | S_t = s]
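
The difference between the two policy types can be sketched in a few lines. The state names, action names, and probabilities below are hypothetical and chosen only for illustration.

```python
import random

# Deterministic policy: each state maps to exactly one action.
# (States "s0"/"s1" and actions "left"/"right" are made up for this sketch.)
deterministic_policy = {"s0": "right", "s1": "left"}

# Stochastic policy: each state maps to a probability for every action,
# i.e. a table of pi(a|s) values. Each row sums to 1.
stochastic_policy = {
    "s0": {"left": 0.7, "right": 0.3},
    "s1": {"left": 0.5, "right": 0.5},
}


def act_deterministic(state):
    """Always return the same action for a given state."""
    return deterministic_policy[state]


def act_stochastic(state, rng=random):
    """Sample an action a with probability pi(a|state)."""
    probs = stochastic_policy[state]
    actions = list(probs)
    weights = [probs[a] for a in actions]
    return rng.choices(actions, weights=weights, k=1)[0]
```

Calling act_deterministic("s0") repeatedly always gives the same action, while act_stochastic("s0") returns "left" about 70% of the time under the probabilities assumed here.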
Model-Based:
In this Reinforcement Learning method, you need to create a virtual model of
each environment. The agent then learns to perform in that specific environment.
Positive:
Positive reinforcement is defined as an event that occurs because of a specific
behavior. It increases the strength and frequency of that behavior and has a
positive effect on the action taken by the agent.
Set of actions - A
Set of states - S
Reward - R
Policy - π
Value - V
Explanation:
In the image below, each state is drawn as a node, while the arrows show
the actions.
For example, an agent traverses from room number 2 to room number 5.
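
The room example can be sketched as a small graph problem. Everything specific below is an assumption: the door layout (DOORS) is hypothetical because the image is not reproduced here, the reward of 100 for entering room 5 and the discount GAMMA are made up, and tabular Q-learning with a learning rate of 1 is used only as one possible way for the agent to learn the route.

```python
import random

# Hypothetical layout: room -> rooms reachable through a door.
# Rooms are the states (nodes); moving through a door is the action (arrow).
DOORS = {
    0: [4],
    1: [3, 5],
    2: [3],
    3: [1, 2, 4],
    4: [0, 3, 5],
    5: [1, 4, 5],
}
GOAL = 5     # target room from the example
GAMMA = 0.8  # assumed discount factor


def reward(next_room):
    """Assumed reward: 100 for entering the goal room, 0 otherwise."""
    return 100.0 if next_room == GOAL else 0.0


def train(episodes=500, seed=0):
    """Fill a Q-table by random exploration (learning rate fixed at 1)."""
    rng = random.Random(seed)
    q = {s: {a: 0.0 for a in nbrs} for s, nbrs in DOORS.items()}
    for _ in range(episodes):
        state = rng.choice(list(DOORS))
        while state != GOAL:
            action = rng.choice(DOORS[state])  # pick a random door
            # Transitions are deterministic: taking door `action`
            # puts the agent in room `action`.
            q[state][action] = reward(action) + GAMMA * max(q[action].values())
            state = action
    return q


def greedy_path(q, start, max_steps=10):
    """Follow the highest-valued door from `start` until the goal."""
    path = [start]
    while path[-1] != GOAL and len(path) <= max_steps:
        state = path[-1]
        path.append(max(q[state], key=q[state].get))
    return path


q_table = train()
route = greedy_path(q_table, start=2)
```

With this assumed layout, route starts in room 2 and ends in room 5, passing through room 3 on the way, since room 3 is the only room connected to room 2.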
Parameters | Reinforcement Learning | Supervised Learning
Best suited | Supports and works better in AI, where human interaction is prevalent. | It is mostly operated with a software system or application.