
ML ASSIGNMENT – 2

Reinforcement Learning

Name: Karthik Nivedhan A


Roll number: 221225
ECE Third year

Sources:

What is Reinforcement Learning? - Reinforcement Learning Explained - AWS

Comparison Between Model-Free And Model-Based Reinforcement Learning Algorithms In 2022 - Techyv.com

Types of Reinforcement Learning - GeeksforGeeks

Q1. Reinforcement learning


1.Introduction
Reinforcement learning (RL) is a machine learning technique that trains
software to make decisions that achieve optimal results. It mimics the
trial-and-error learning process that humans use to achieve their goals.
Software actions that work towards the goal are reinforced, while actions that
detract from the goal are ignored.
RL algorithms use a reward-and-punishment paradigm as they process data.
They learn from the feedback of each action and discover for themselves the
best processing paths to reach the final outcome. The algorithms can also
handle delayed gratification: the best overall strategy may require short-term
sacrifices, so the approach they discover may include some punishments or
backtracking along the way. RL is a powerful method for helping artificial
intelligence systems achieve optimal outcomes in unseen environments.

2.Why Reinforcement Learning?

Excels in complex environments


RL algorithms can be used in complex environments with many rules and
dependencies. In the same environment, a human may not be capable of
determining the best path to take, even with superior knowledge of the
environment. Instead, model-free RL algorithms adapt quickly to continuously
changing environments and find new strategies to optimize results.

Requires less human interaction


In supervised machine learning, humans must label data pairs to direct the
algorithm. With an RL algorithm, this labelling isn't necessary: the agent
learns by itself from reward signals. At the same time, RL offers mechanisms
to integrate human feedback, allowing for systems that adapt to human
preferences, expertise, and corrections.

Optimizes for long-term goals


RL inherently focuses on long-term reward maximization, which makes it apt for
scenarios where actions have prolonged consequences. It is particularly well-
suited for real-world situations where feedback isn't immediately available for
every step, since it can learn from delayed rewards.

3.Core Components of Reinforcement Learning


1. Agent: The decision-maker that interacts with the environment. It learns
a policy, which maps states to actions.
2. Environment: The external world with which the agent interacts. It
provides states and rewards to the agent.
3. State: The current situation or configuration of the environment.
4. Action: The choices available to the agent.
5. Reward: A numerical value indicating the desirability of a particular
action or outcome.
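
As a minimal illustration of how these five components map onto code, here is a toy sketch (the GridWorld and RandomAgent names are hypothetical, not from any particular library):

import random

class GridWorld:
    """Environment: a one-dimensional corridor of 5 cells; the goal is the right end."""
    def __init__(self):
        self.state = 0                                   # State: the agent's current cell

    def step(self, action):
        """Apply an action (-1 = left, +1 = right) and return (next_state, reward)."""
        self.state = max(0, min(4, self.state + action))
        reward = 1.0 if self.state == 4 else 0.0         # Reward: 1 at the goal cell, else 0
        return self.state, reward

class RandomAgent:
    """Agent: its policy maps the current state to one of the available actions."""
    def policy(self, state):
        return random.choice([-1, +1])                   # Action: the choices available

env, agent = GridWorld(), RandomAgent()
print(env.step(agent.policy(env.state)))                 # one interaction: (next_state, reward)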

4.The Reinforcement Learning Process


1. Initialization: The agent starts in an initial state.
2. Action Selection: The agent chooses an action based on its current
policy.
3. Environment Transition: The environment transitions to a new state
based on the agent's action.
4. Reward: The environment provides a reward to the agent.
5. Policy Update: The agent updates its policy to improve its chances of
receiving higher rewards in the future.
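
A minimal, self-contained sketch of this loop (the corridor environment and random policy below are assumptions chosen only to keep the example short):

import random

def run_episode(num_steps=20):
    """Walk through the five steps above: initialize, select, transition, reward, update."""
    state = 0                                            # 1. Initialization: start in an initial state
    value = {}                                           # crude action-value table used for the update step
    for _ in range(num_steps):
        action = random.choice([-1, +1])                 # 2. Action selection (a random policy here)
        next_state = max(0, min(4, state + action))      # 3. Environment transition to a new state
        reward = 1.0 if next_state == 4 else 0.0         # 4. Reward provided by the environment
        key = (state, action)
        value[key] = value.get(key, 0.0) + 0.1 * (reward - value.get(key, 0.0))  # 5. Policy/value update
        state = next_state
    return value

print(run_episode())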

5.Applications of Reinforcement Learning


Reinforcement learning has found applications in various domains, including:
• Game Playing: AlphaGo, DeepMind's AI that defeated the world champion Go player, is a notable example.
• Robotics: RL has been used to train robots to perform tasks such as grasping objects and navigating environments.
• Finance: RL can be used for algorithmic trading and risk management.
• Healthcare: RL can assist in medical diagnosis and treatment planning.
• Natural Language Processing: RL can be used for tasks like machine translation and dialogue systems.

Q2. Types of RL with examples

Reinforcement learning (RL) can be broadly categorized into two main types:
Model-Based and Model-Free. Each type has its own approach to learning
and decision-making.

• Model-Based RL: Builds a model of the environment to plan actions.
• Model-Free RL: Learns directly from experience without a model.

1.Model-Based Reinforcement Learning


In model-based RL, the agent attempts to build a model of the environment.
This model predicts the next state and reward given a current state and action.
Using this model, the agent can plan its actions and evaluate different policies.
Example: A self-driving car might learn a model of the traffic environment,
predicting the behaviour of other vehicles and pedestrians. Using this model, it
can plan its route and make decisions like changing lanes or braking.
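
A real self-driving stack is far beyond a few lines, but the model-based idea can be sketched with a toy tabular model (all names and numbers below are made up for illustration): the agent records the transitions and rewards it observes to build a model, then plans by one-step lookahead with that model.

from collections import defaultdict

# Learned model of the environment: transition counts and average rewards
transition_counts = defaultdict(lambda: defaultdict(int))   # (state, action) -> {next_state: count}
reward_sums = defaultdict(float)                             # (state, action) -> total reward observed
visit_counts = defaultdict(int)                              # (state, action) -> number of visits

def record(state, action, next_state, reward):
    """Update the learned model from one observed interaction."""
    transition_counts[(state, action)][next_state] += 1
    reward_sums[(state, action)] += reward
    visit_counts[(state, action)] += 1

def plan(state, actions, state_values):
    """One-step lookahead: use the model to pick the action with the best predicted return."""
    def predicted_return(action):
        key = (state, action)
        if visit_counts[key] == 0:
            return 0.0                                       # unknown action: assume neutral
        avg_reward = reward_sums[key] / visit_counts[key]
        total = sum(transition_counts[key].values())
        expected_value = sum(count / total * state_values.get(nxt, 0.0)
                             for nxt, count in transition_counts[key].items())
        return avg_reward + expected_value
    return max(actions, key=predicted_return)

# Example usage with made-up data:
record("cruising", "brake", "stopped", 0.0)
record("cruising", "keep_speed", "collision", -10.0)
print(plan("cruising", ["brake", "keep_speed"], {"stopped": 1.0, "collision": -100.0}))  # -> "brake"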

2.Model-Free Reinforcement Learning


Model-free RL, on the other hand, doesn't require an explicit model of the
environment. Instead, it learns directly from experience, updating its policy
based on the rewards it receives.

Example: A game-playing AI might learn to play chess by playing numerous
games against itself. It doesn't need an explicit model of the game's dynamics;
it simply learns which moves tend to lead to wins and losses.
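
As a toy sketch of the model-free idea (a hypothetical two-armed bandit rather than chess), the agent below never models how rewards are generated; it only averages the rewards it actually receives and acts greedily on those estimates:

import random

# Two actions with unknown reward probabilities. The agent does not model them;
# it only tracks the average reward each action has actually produced (model-free).
true_win_prob = {"a": 0.3, "b": 0.7}       # hidden from the agent
value = {"a": 0.0, "b": 0.0}
counts = {"a": 0, "b": 0}

for _ in range(1000):
    # epsilon-greedy: mostly pick the action with the best current estimate, sometimes explore
    action = random.choice(["a", "b"]) if random.random() < 0.1 else max(value, key=value.get)
    reward = 1.0 if random.random() < true_win_prob[action] else 0.0
    counts[action] += 1
    value[action] += (reward - value[action]) / counts[action]    # incremental average

print(value)                                # value["b"] should approach 0.7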

Q3. Q-Learning
1.Introduction
Q-learning is a popular model-free reinforcement learning algorithm that aims
to learn the optimal action-value function, often referred to as the Q-function.
This function estimates the expected future reward for taking a particular
action in a given state. By maximizing the Q-function, the agent can learn to
make decisions that lead to the highest cumulative reward.

2.Q-Function and Update Rule


The Q-function, denoted Q(s, a), represents the expected future discounted
reward obtained by taking action a in state s and then following the optimal
policy thereafter. The discount factor, denoted by γ, determines the importance
of future rewards relative to immediate ones.

• s: current state
• a: current action
• r: reward received
• s': next state
• a': next action
• α: learning rate
• γ: discount factor

The Q-Learning Update Rule


The Q-learning algorithm iteratively updates the Q-function based on the
Bellman equation:

Q(s, a) ← Q(s, a) + α [ R(s, a) + γ · max_a' Q(s', a') − Q(s, a) ]

where:
• s is the current state
• a is the current action
• R(s, a) is the immediate reward received
• s' is the next state
• a' is the best action in the next state (according to the Q-function)
• α is the learning rate, which determines how much the Q-function is updated based on new experiences
• γ is the discount factor, which weights future rewards relative to immediate ones
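
As a single worked example of this update, suppose Q(s, a) = 2.0, the immediate reward R(s, a) = 1.0, the best next-state value max_a' Q(s', a') = 3.0, and the (assumed) hyperparameters are α = 0.1 and γ = 0.9. Then:

Q(s, a) ← 2.0 + 0.1 × [1.0 + 0.9 × 3.0 − 2.0] = 2.0 + 0.1 × 1.7 = 2.17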

3.Overview of Q-Learning Algorithm


1. Initialize the Q-function: Set all Q-values to zero.
2. Choose an action: Select an action based on the current state and the Q-
function. This can be done using an ε-greedy policy, which chooses the
best action with probability 1 - ε and a random action with probability ε.
3. Observe the reward and next state: Take the chosen action, observe the
resulting reward and next state.
4. Update the Q-function: Use the Q-learning update rule to update the Q-
value for the current state and action.
5. Repeat: Repeat steps 2-4 until convergence or a desired number of
episodes.
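
A minimal, self-contained sketch of these five steps in Python, using the same toy corridor environment as earlier (an illustrative example, not a reference implementation):

import random

NUM_STATES, GOAL = 5, 4
ACTIONS = [-1, +1]                       # move left / move right along a 5-cell corridor
alpha, gamma, epsilon = 0.1, 0.9, 0.1    # learning rate, discount factor, exploration rate

# Step 1: initialize the Q-function, setting every Q-value to zero
Q = {(s, a): 0.0 for s in range(NUM_STATES) for a in ACTIONS}

def env_step(state, action):
    """Toy environment: move along the corridor; reward 1 on reaching the goal cell."""
    next_state = max(0, min(NUM_STATES - 1, state + action))
    return next_state, (1.0 if next_state == GOAL else 0.0)

def choose_action(state):
    """Step 2: epsilon-greedy action selection (ties broken randomly)."""
    if random.random() < epsilon:
        return random.choice(ACTIONS)
    best = max(Q[(state, a)] for a in ACTIONS)
    return random.choice([a for a in ACTIONS if Q[(state, a)] == best])

for episode in range(500):               # Step 5: repeat over many episodes
    state = 0
    while state != GOAL:
        action = choose_action(state)
        # Step 3: take the action, observe the reward and next state
        next_state, reward = env_step(state, action)
        # Step 4: Q-learning update rule
        best_next = max(Q[(next_state, a)] for a in ACTIONS)
        Q[(state, action)] += alpha * (reward + gamma * best_next - Q[(state, action)])
        state = next_state

print(Q)                                  # Q-values now favour moving right toward the goal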

4.Advantages of Q-Learning
• Simple and effective for small action spaces.
• Doesn't require a model of the environment.
• Q-learning has been applied to a wide range of problems, including game playing, robotics, finance, and healthcare.

In conclusion, Q-learning is a powerful and versatile algorithm for model-free
reinforcement learning. Its ability to learn optimal policies through trial and
error makes it suitable for a wide range of applications.

THANK YOU, MA'AM.
