1. What is Reinforcement Learning?
Let us start with a simple analogy. If you have a pet at home, you may have
used this technique with it.
A clicker (or whistle) lets your pet know that a treat is just about to be served.
You click the clicker and follow up with a treat, which essentially “reinforces” good
behaviour. With time, your pet gets accustomed to the sound and responds every
time it hears the click. With this technique, you can train your pet to do “good”
deeds when required.
Now let’s make these replacements in the example:
• The pet becomes the artificial agent
• The treat becomes the reward function
• The good behavior is the resultant action
The above example illustrates what reinforcement learning looks like; in fact,
clicker training is a classic example of it.
To apply this to an artificial agent, you set up a feedback loop that reinforces the
agent: it is rewarded when the action it performs is right and punished when it is
wrong (a small code sketch of this loop follows the list below). Basically, what you
have in your kitty is:
• an internal state, which is maintained by the agent to learn about the
environment
• a reward function, which is used to train your agent how to behave
• an environment, which is a scenario the agent has to face
• an action, which is done by the agent in the environment
• and last but not least, an agent, which does all the deeds!
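To make these pieces concrete, here is a minimal sketch of that feedback loop in Python. All the class and method names below are made up for illustration; they are not from any specific library.

import random

class Environment:
    # A toy environment: a small stochastic machine that returns observations and rewards.
    def reset(self):
        return 0                                  # the agent's starting state
    def step(self, action):
        reward = 1 if action == 1 else -1         # reward the "good" action, punish the rest
        next_state = random.randint(0, 3)         # the environment moves to a new state
        done = random.random() < 0.1              # the episode occasionally ends
        return next_state, reward, done

class Agent:
    # A placeholder agent: acts on the environment and learns from rewards.
    def act(self, state):
        return random.choice([0, 1])              # pick an action (here: at random)
    def learn(self, state, action, reward, next_state):
        pass                                      # update the internal state / value estimates here

env, agent = Environment(), Agent()
state = env.reset()
for _ in range(100):
    action = agent.act(state)                     # the agent acts on the environment
    next_state, reward, done = env.step(action)   # the environment returns a reward and an observation
    agent.learn(state, action, reward, next_state)  # the reward "reinforces" the behaviour
    state = env.reset() if done else next_state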
2. Examples of Reinforcement Learning
Now, I am sure you are wondering how an experiment conducted on animals can
be relevant to people practising machine learning. That is exactly what I thought
when I first came across reinforcement learning.
A lot of beginners tend to think that there are only 2 types of problems in
machine learning: supervised machine learning and unsupervised machine
learning. I don’t know where this notion comes from, but the world of machine
learning is much bigger than those 2 types of problems. Reinforcement learning
is one such class of problems.
Let’s look at some real-life applications of reinforcement learning. Generally, we
know the start state and the end state of an agent, but there could be multiple
paths to reach the end state; reinforcement learning finds its applications in these
scenarios. This essentially means that driverless cars, self-navigating vacuum
cleaners and elevator scheduling are all applications of reinforcement learning.
Here is a video of a game bot trained to play Flappy Bird.
3. What is a Reinforcement Learning Platform?
Before we look into what a platform is, let’s try to understand a reinforcement
learning environment.
A reinforcement learning environment is what an agent can observe and act
upon. The horizon of an agent is much bigger, but it is the task of the agent to
perform actions on the environment that help it maximize its reward. As per
“A brief introduction to reinforcement learning” by Murphy (1998):
The environment is modeled as a stochastic finite state machine with inputs
(actions sent from the agent) and outputs (observations and rewards sent to the
agent).
Let’s take an example.
This is a typical game of Mario. Remember how you played this game? Now
consider that you are the “agent” who is playing the game.
You have “access” to a land of opportunities, but you don’t know what will
happen when you do something, say smash a brick. You can see only a limited
amount of the “environment”, and until you traverse the world you can’t see
everything. So you move around, trying to perceive what lies ahead of you, and at
the same time try to increase your chances of attaining your goal.
This whole “story” is not created by itself. You have to “render” it first. And that is
the main task of the platform: to create everything required for a complete
experience, namely the environment, the agent and the rewards.
4. Major Reinforcement Learning Platforms
i) DeepMind Lab
DeepMind Lab is a fully 3D game-like platform tailored for agent-based AI
research.
A recent release by Google DeepMind, DeepMind Lab is an integrated agent-
environment platform for general artificial intelligence research, with a focus on
first-person perspective games. It was built to accommodate the research done
at DeepMind. DeepMind Lab is based on the open-source engine ioquake3,
which was modified to be a flexible interface for integration with artificial
systems.
Things I liked
• It has rich and realistic visuals.
• Close integration with the gaming environment
Things I did not like
• It still lacks variety in terms of game environments, though this should build up
over time through open-source contributions.
• At the moment it supports only Linux. Bazel (which is a dependency for
DeepMind Lab) is experimental on Windows, so Windows support for DeepMind
Lab is still not guaranteed.
ii) OpenAI Gym
(OpenAI Gym is) A toolkit for developing and comparing reinforcement
learning algorithms
OpenAI Gym is a platform for creating, evaluating and benchmarking artificial
agents in game environments. The best thing I like about Gym is that, along with
the toolkit, there is community support built around it: an evaluation platform, a
code-sharing platform and a discussion platform. The Gym platform consists of
multiple categories of environments, along with sample solutions provided by the
community.
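To give a feel for it, here is roughly what interacting with a Gym environment looks like. This is a minimal sketch using the classic Gym API of the time (reset returning an observation and step returning a 4-tuple), with a random agent rather than a trained one.

import gym

env = gym.make('CartPole-v0')                       # one of the many environments Gym provides
observation = env.reset()
total_reward = 0
for _ in range(200):
    action = env.action_space.sample()              # a random action, just to show the loop
    observation, reward, done, info = env.step(action)
    total_reward += reward
    if done:                                        # the pole fell or the time limit was hit
        break
print(total_reward)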
Things I liked
• Variety of game environments with considerable open-source support.
Things I did not like
• Like DeepMind Lab, Gym also has a limit on the number of environments it
supports (this is essentially taken care of by OpenAI Universe).
Resources to explore further:
• Release post
• Open-source Repository
• Whitepaper
• Simple tutorial
iii) OpenAI Universe
Universe is a software platform for measuring and training an AI’s general
intelligence across the world’s supply of games, websites and other
applications
This is essentially an extension of OpenAI Gym, with support for literally
“anything” you can do on a computer. Universe is built to emulate how a human
interacts with a computer: it uses Virtual Network Computing (VNC) to access a
computer remotely, packages any program and converts it into a Gym
environment.
Things I liked
• Unlimited access to any game environment.
• Universe is not constrained only to gaming environments; it can be used for
things like manual testing and working on Amazon Mechanical Turk.
Things I did not like
• The initial release lacks many things that were promised, the most significant
being integration with Windows.
Resources to explore further:
• Release post
• Open-source Repository
• Simple tutorial
iv) Project Malmo
The Malmo platform is a sophisticated AI experimentation platform built on top
of Minecraft, and designed to support fundamental research in artificial
intelligence.
Project Malmo is a research initiative by Microsoft Research that aims to build
AI agents capable of complex tasks. Minecraft provides a perfect scenario for
building such agents, which is why it was chosen.
Things I liked
• Integration with a sophisticated environment
• Flexibility for customizing game environment
Things I did not like
• Support is only for Minecraft, with no other game environment, unlike
OpenAI Universe.
Resources to explore further:
• Release Post
• Open-source Repository
• Whitepaper
• Simple tutorial
v) VizDoom
Doom-based AI Research Platform for Reinforcement Learning from Raw
Visual Information
I personally found this the most interesting platform for building AI agents, as
you get multi-agent support and a competitive environment to test the agent in.
The platform runs on Doom, a first-person shooter, with a variety of levels and
modes.
Things I liked
• Variety of game modes with competitive environment
Things I did not like
• Similar to the platforms above, VizDoom supports only one environment.
Resources to explore further:
• Release Post
• Open-source Repository
• Whitepaper
• Simple tutorial
5. Cheatsheet of Major Platforms
A few other notable platforms
• RL-Glue
o About: A standard interface for a variety of languages to connect agents,
environments and experiment programs together.
o Resource: Main wiki Page
• CommAI
o About: A platform for training and testing AI systems based on communication
tasks
o Resource: Github repo
• Burlap
o About: BURLAP is a java code library for the use and development of single
or multi-agent planning and learning algorithms and domains to accompany
them.
o Resource: Github repo
• rlenvs
o About: A platform similar to OpenAI Gym, but for Lua
o Resource: Github repo
Introduction
One of the most fundamental questions for scientists across the globe has been:
“How do we learn a new skill?”. The desire to understand the answer is obvious:
if we can understand this, we can enable the human species to do things we
might not have thought possible before. Alternatively, we can train machines to do
more “human” tasks and create true artificial intelligence.
While we don’t have a complete answer to the above question yet, a few things
are clear. Irrespective of the skill, we first learn by interacting with the
environment. Whether we are learning to drive a car or an infant is learning to
walk, the learning is based on interaction with the environment. Learning from
interaction is the foundational concept underlying all theories of learning and
intelligence.
Reinforcement Learning
Today, we will explore Reinforcement Learning: goal-oriented learning based on
interaction with the environment. Reinforcement Learning is said to be the hope
of true artificial intelligence, and rightly so, because the potential it possesses is
immense.
Reinforcement Learning is growing rapidly, producing a wide variety of learning
algorithms for different applications, so it is important to be familiar with its
techniques. If you are not familiar with reinforcement learning, I suggest you go
through my previous article on the introduction to reinforcement learning and the
open-source RL platforms.
Once you have an understanding of the underlying fundamentals, proceed with
this article. By the end of it, you will have a thorough understanding of
Reinforcement Learning and its practical implementation.
1. Formulating a Reinforcement Learning Problem
Reinforcement Learning is learning what to do and how to map situations to
actions, with the end goal of maximizing a numerical reward signal. The learner
is not told which action to take, but instead must discover which actions yield the
maximum reward. Let’s understand this with the simple example below.
Consider an example of a child learning to walk.
Here are the steps a child will take while learning to walk:
1. The first thing the child will do is observe how you walk. You use two
legs, taking one step at a time. Grasping this concept, the child tries to
replicate you.
2. But soon he/she will understand that before walking, the child has to
stand up! This is a challenge that comes along with trying to walk. So
now the child attempts to get up, staggering and slipping but still
determined to get up.
3. Then there’s another challenge to cope with. Standing up was easy,
but remaining still is another task altogether! Clutching at thin air for
support, the child manages to stay standing.
4. Now the real task for the child is to start walking. But that is easier said
than done. There are so many things to keep in mind, like balancing
the body weight, deciding which foot to put down next and where to put it.
Sounds like a difficult task, right? It actually is a bit challenging to get up and
start walking, but you have become so used to it that you are not fazed by the
task. Now, though, you can get the gist of how difficult it is for a child.
Let’s formalize the above example. The “problem statement” of the example is to
walk, where the child is an agent trying to manipulate the environment
(the surface on which it walks) by taking actions (walking), and he/she tries to go
from one state (each step he/she takes) to another. The child gets a reward
(let’s say chocolate) when he/she accomplishes a sub-module of the task (taking
a couple of steps) and does not receive any chocolate (a.k.a. a negative reward)
when he/she is not able to walk. This is a simplified description of a
reinforcement learning problem.
Here’s a good introductory video on Reinforcement Learning.
2. Comparison with other machine learning methodologies
Reinforcement Learning belongs to a bigger class of machine learning
algorithms. Let’s see a comparison between RL and the other machine learning
methodologies:
• Supervised vs Reinforcement Learning: In supervised learning, there is an
external “supervisor” which has knowledge of the environment and shares it with
the agent to complete the task. But in some problems there are so many
combinations of subtasks that the agent can perform to achieve the objective that
creating a “supervisor” is almost impractical. For example, in a chess game there
are tens of thousands of moves that can be played, so creating a knowledge base
covering all of them is a tedious task. In these problems, it is more feasible to
learn from one’s own experience and gain knowledge from it. This is the main
difference between reinforcement learning and supervised learning. In both
supervised and reinforcement learning there is a mapping between input and
output, but in reinforcement learning a reward function acts as feedback to the
agent, as opposed to the supervisor in supervised learning.
• Unsupervised vs Reinforcement Learning: In reinforcement learning there is
a mapping from input to output, which is not present in unsupervised learning.
In unsupervised learning, the main task is to find the underlying patterns rather
than the mapping. For example, if the task is to suggest a news article to a user,
an unsupervised learning algorithm will look at similar articles the person has
previously read and suggest one of them, whereas a reinforcement learning
algorithm will get constant feedback from the user by suggesting a few news
articles and then build a “knowledge graph” of which articles the person will like.
There is also a fourth type of machine learning methodology called semi-
supervised learning, which is essentially a combination of supervised and
unsupervised learning. It differs from reinforcement learning in that, like
supervised learning, it relies on a direct mapping from labelled examples,
whereas reinforcement learning learns from a reward signal instead.
3. Framework for solving Reinforcement Learning Problems
To understand how to solve a reinforcement learning problem, let’s go through a
classic example: the Multi-Armed Bandit Problem. First, we will look at the
fundamental trade-off between exploration and exploitation, and then go on to
define the framework for solving RL problems.
Suppose you have many slot machines, each with a random payout.
Now, what you want to do is get the maximum payout from the slot machines as
fast as possible. What would you do?
One naive approach is to select a single slot machine and keep pulling the lever
all day long. Sounds boring, but it may give you “some” payout. With this
approach you might hit the jackpot (with a probability close to zero), but most of
the time you will just be sitting in front of the slot machine losing money.
Formally, this can be defined as a pure exploitation approach. Is this the optimal
choice? The answer is NO.
Let’s look at another approach: we could pull the lever of each and every slot
machine and pray to God that at least one of them hits the jackpot. This is
another naive approach, which would keep you pulling levers all day long but
give you sub-optimal payouts. Formally, this is a pure exploration approach.
Neither of these approaches is optimal; we have to find a proper balance
between them to get the maximum reward. This is the exploration vs
exploitation dilemma of reinforcement learning.
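To make the balance concrete, here is a minimal sketch of an epsilon-greedy strategy for the slot-machine setting. The payout values, function name and parameters are illustrative assumptions, not taken from the article.

import numpy as np

def epsilon_greedy_bandit(payouts, epsilon=0.1, n_pulls=10000):
    # With probability epsilon we explore a random machine; otherwise we exploit
    # the machine that currently looks best.
    n_arms = len(payouts)
    estimates = np.zeros(n_arms)             # running estimate of each machine's payout
    counts = np.zeros(n_arms)                # how many times each machine was pulled
    total_reward = 0.0
    for _ in range(n_pulls):
        if np.random.rand() < epsilon:
            arm = np.random.randint(n_arms)           # explore: pick a random machine
        else:
            arm = int(np.argmax(estimates))           # exploit: pick the best machine so far
        reward = np.random.normal(payouts[arm], 1.0)  # stochastic payout of the chosen machine
        counts[arm] += 1
        estimates[arm] += (reward - estimates[arm]) / counts[arm]  # incremental mean update
        total_reward += reward
    return estimates, total_reward

# Hypothetical true mean payouts for 4 machines; the strategy should home in on the third one.
estimates, total = epsilon_greedy_bandit(payouts=[1.0, 1.5, 2.0, 0.5])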
First, we formally define the framework for a reinforcement learning problem and
then list the probable approaches to solving it.
Markov Decision Process:
The mathematical framework for defining a solution in a reinforcement learning
scenario is called a Markov Decision Process. It can be described by:
• Set of states, S
• Set of actions, A
• Reward function, R
• Policy, π
• Value, V
We have to take actions (A) to transition from our start state to our end state
(S), getting a reward (R) for each action we take. Our actions can lead to a
positive reward or a negative reward.
The set of actions we take defines our policy (π), and the rewards we get in
return define our value (V). Our task is to maximize our rewards by choosing the
correct policy, i.e. to maximize the value over all possible states S at a time t.
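To state this objective a little more precisely (a standard formulation, not spelled out in the original text): the agent looks for the policy π that maximizes the expected cumulative reward

E[ r_t + γ·r_{t+1} + γ²·r_{t+2} + … | π, s_t = s ]

for every state s it may be in at time t, where γ (between 0 and 1) is a discount factor that keeps the sum finite and weights immediate rewards more than distant ones.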
Shortest Path Problem
Let me take you through another example to make it clear.
This is a representation of a shortest path problem. The task is to go from place
A to place F at as low a cost as possible. The number on each edge between
two places represents the cost of traversing that edge; a negative cost is actually
an earning along the way. Value is defined as the total cumulative reward
obtained by following a policy.
Here,
• The set of states is the set of nodes, viz {A, B, C, D, E, F}
• An action is to go from one place to another, viz {A -> B, C -> D, etc.}
• The reward function is the value represented by an edge, i.e. the cost
• The policy is the “way” to complete the task, viz {A -> C -> F}
Now suppose you are at place A. The only thing visible to you is your next
possible destination; anything beyond that is not known at this stage (a.k.a. the
observable space).
You can take a greedy approach and take the best possible next step, which is
going from A to D out of the subset {A -> (B, C, D, E)}. Similarly, once you are
at place D and want to go to place F, you can choose from {D -> (B, C, F)}. We
see that {D -> F} has the lowest cost, and hence we take that path.
So here, our policy was to take {A -> D -> F} and our value is -120.
Congratulations! You have just implemented a reinforcement learning
algorithm. The strategy we followed is essentially the greedy half of the
epsilon-greedy approach: always take the best-looking next step. Now if you
(the salesman) want to go from place A to place F again, you will always choose
the same policy.
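For illustration, here is a minimal sketch of that greedy strategy in Python. The individual edge costs are made up for the example; only the A -> D -> F route and its value of -120 come from the description above.

# Hypothetical edge costs; a negative value would represent an earning on the way.
costs = {
    'A': {'B': 100, 'C': 80, 'D': 50, 'E': 150},
    'B': {'F': 90},
    'C': {'F': 100},
    'D': {'B': 90, 'C': 100, 'F': 70},
    'E': {'F': 30},
}

def greedy_path(start, goal):
    # Always take the cheapest visible edge from the current node (pure exploitation).
    path, node, total_cost = [start], start, 0
    while node != goal:
        next_node = min(costs[node], key=costs[node].get)
        total_cost += costs[node][next_node]
        path.append(next_node)
        node = next_node
    return path, -total_cost        # value = negative of the total cost

print(greedy_path('A', 'F'))        # (['A', 'D', 'F'], -120) with the costs above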
Other ways of travelling?
Can you guess which category our policy belongs to, i.e. pure exploration or
pure exploitation?
Notice that the policy we took is not an optimal policy. We would have to
“explore” a little bit to find the optimal policy. The approach we took here is
policy-based learning, and our task is to find the optimal policy among all
possible policies. There are different ways to solve this problem; I’ll briefly list
the major categories:
• Policy-based, where our focus is to find the optimal policy
• Value-based, where our focus is to find the optimal value, i.e. the cumulative
reward
• Action-based, where our focus is on which optimal actions to take at each step
I will try to cover in-depth reinforcement learning algorithms in future articles.
Till then, you can refer to this paper on a survey of reinforcement learning
algorithms.
4. An implementation of Reinforcement Learning
We will be using the Deep Q-Learning algorithm. Q-learning is a value-based
learning algorithm, and in Deep Q-learning a neural network is used as the
function approximator for the Q-values. This algorithm was used by Google
DeepMind to beat humans at Atari games!
Let’s see the pseudocode of Q-learning:
1. Initialize the values table Q(s, a).
2. Observe the current state s.
3. Choose an action a for that state based on one of the action selection
policies (e.g. epsilon-greedy).
4. Take the action, and observe the reward r as well as the new state s'.
5. Update the value for the state using the observed reward and the
maximum reward possible from the next state, according to the
Q-learning update formula (shown after this list).
6. Set the state to the new state, and repeat the process until a terminal
state is reached.
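The update formula referred to in step 5 is the standard (textbook) Q-learning update; it is reproduced here because the text refers to it without showing it:

Q(s, a) ← Q(s, a) + α · [ r + γ · max over a' of Q(s', a') - Q(s, a) ]

where α is the learning rate and γ is the discount factor. In words: nudge the value of the current state-action pair towards the observed reward plus the discounted best value achievable from the next state.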
Next, let’s see what the CartPole problem is, and then code up a solution. In the
CartPole environment, a pole is attached to a cart that moves along a track, and
the agent must push the cart left or right to keep the pole balanced upright for as
long as possible.
When I was a kid, I remember picking up a stick and trying to balance it on one
hand. My friends and I used to have a competition where whoever balanced it for
longer would get a “reward”, a chocolate!
Here’s a short video of a real cart-pole system.
Let’s code it up!
To set up our code, we first need to install a few things.
Step 1: Install the keras-rl library
From the terminal, run the following commands:
git clone https://github.com/matthiasplappert/keras-rl.git
cd keras-rl
python setup.py install
Step 2: Install dependencies for the CartPole environment
Assuming you have pip installed, install the following libraries:
pip install h5py
pip install gym
Step 3: Let’s get started!
First, we have to import the necessary modules:
import numpy as np
import gym
from keras.models import Sequential
from keras.layers import Dense, Activation, Flatten
from keras.optimizers import Adam
from rl.agents.dqn import DQNAgent
from rl.policy import EpsGreedyQPolicy
from rl.memory import SequentialMemory
Then set the relevant variables
ENV_NAME = 'CartPole-v0'
# Get the environment and extract the number of actions available in the CartPole problem
env = gym.make(ENV_NAME)
np.random.seed(123)
env.seed(123)
nb_actions = env.action_space.n
Next, we build a very simple single hidden layer neural network model.
model = Sequential()
model.add(Flatten(input_shape=(1,) + env.observation_space.shape))
model.add(Dense(16))
model.add(Activation('relu'))
model.add(Dense(nb_actions))
model.add(Activation('linear'))
print(model.summary())
Next, we configure and compile our agent. We set our policy to Epsilon Greedy
and our memory to Sequential Memory, because we want to store the results of
the actions we performed and the rewards we got for each action.
policy = EpsGreedyQPolicy()
memory = SequentialMemory(limit=50000, window_length=1)
dqn = DQNAgent(model=model, nb_actions=nb_actions, memory=memory, nb_steps_warmup=10,
target_model_update=1e-2, policy=policy)
dqn.compile(Adam(lr=1e-3), metrics=['mae'])
# Okay, now it's time to learn something! We visualize the training here for show,
# but this slows down training quite a lot.
dqn.fit(env, nb_steps=5000, visualize=True, verbose=2)
Now we test our reinforcement learning model
dqn.test(env, nb_episodes=5, visualize=True)
With visualize=True, the test run renders a few episodes of the trained agent
balancing the pole and prints the reward obtained in each episode.
And Voila! You have just built a reinforcement learning bot!
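If you want to keep the trained agent around, keras-rl agents can save and reload their weights. A small sketch (check the library’s documentation for the exact options):

# Save the trained weights to disk (this uses h5py, installed earlier) and reload them later.
dqn.save_weights('dqn_{}_weights.h5f'.format(ENV_NAME), overwrite=True)
dqn.load_weights('dqn_{}_weights.h5f'.format(ENV_NAME))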
5. Increasing the complexity
Now that you have seen a basic implementation of reinforcement learning,
let us start moving towards a few more problems, increasing the complexity a
little every time.
Problem – Towers of Hanoi
For those who don’t know the game: it was invented in 1883 and consists of
3 rods along with a number of sequentially sized disks (3 in this example), all
starting on the leftmost rod. The objective is to move all the disks from the
leftmost rod to the rightmost rod in the least number of moves. (You can
read more on Wikipedia.)
If we have to map this problem onto reinforcement learning, let us start with the
states:
• Starting state – All 3 disks on the leftmost rod (in order 1, 2 and 3 from top to
bottom)
• End state – All 3 disks on the rightmost rod (in order 1, 2 and 3 from top to
bottom)
All possible states:
Here are our 27 possible states:

All disks on one rod | One disk on each rod | Disks (1,3) together | Disks (2,3) together | Disks (1,2) together
(123)**              | 321                  | (13)2*               | (23)1*               | (12)3*
*(123)*              | 312                  | (13)*2               | (23)*1               | (12)*3
**(123)              | 231                  | 2(13)*               | 1(23)*               | 3(12)*
                     | 132                  | *(13)2               | *(23)1               | *(12)3
                     | 213                  | 2*(13)               | 1*(23)               | 3*(12)
                     | 123                  | *2(13)               | *1(23)               | *3(12)
Here, (12)3* represents disks 1 and 2 on the leftmost rod (top to bottom), disk 3
on the middle rod, and * denotes an empty rightmost rod.
Numerical Reward:
Since we want to solve the problem in the least number of steps, we can attach a
reward of -1 to each step.
Policy:
Now, without going into any technical details, we can map the possible transitions
between the above states. For example, (123)** can go to (23)1* with a reward
of -1; it can also go to (23)*1.
If you can now see the parallel, each of these 27 states can be a node in a graph
similar to the one in the shortest path problem above, and we can find the optimal
solution by experimenting with various states and paths.
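If you want to experiment with this yourself, here is a minimal sketch of how these states and transitions could be encoded in Python. The representation (a tuple of three rods, each a tuple of disks from top to bottom) is my own choice for illustration, and the search at the end is just a sanity check of the mapping, not a learning algorithm.

from collections import deque

# (123)** from the table above becomes ((1, 2, 3), (), ()).
START = ((1, 2, 3), (), ())
GOAL = ((), (), (1, 2, 3))

def legal_moves(state):
    # Yield (next_state, reward) pairs; every move costs a reward of -1.
    for i, src in enumerate(state):
        if not src:
            continue
        disk = src[0]                                  # only the top disk of a rod can move
        for j, dst in enumerate(state):
            if i != j and (not dst or disk < dst[0]):  # a disk may only sit on a larger one
                rods = list(state)
                rods[i] = src[1:]
                rods[j] = (disk,) + dst
                yield tuple(rods), -1

# Breadth-first search over the 27 states, to check the mapping is consistent;
# an RL agent would instead learn which move to pick from the -1 rewards.
frontier, seen = deque([(START, 0)]), {START}
while frontier:
    state, moves = frontier.popleft()
    if state == GOAL:
        print('solved in', moves, 'moves')             # 7 moves for 3 disks
        break
    for nxt, _ in legal_moves(state):
        if nxt not in seen:
            seen.add(nxt)
            frontier.append((nxt, moves + 1))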
Problem – 3 x 3 Rubik's Cube
While I could solve this for you as well, I would want you to do it yourself.
Follow the same line of thought I used above and you should be good.
Start by defining the starting state and the end state. Next, define all possible
states and their transitions, along with the reward and policy. Finally, you should
be able to create a solution for solving a Rubik's cube using the same approach.
6. A Peek into Recent Advancements in Reinforcement Learning
As you will realize, the complexity of the Rubik's Cube is many fold higher than
that of the Towers of Hanoi, and you can also see how much the number of
possible options has grown. Now think of the number of states and options in a
game of chess, and then in Go! Google DeepMind recently created a deep
reinforcement learning algorithm that defeated Lee Sedol!
With the recent success of deep learning, the focus is now slowly shifting to
applying deep learning to solve reinforcement learning problems. The news has
recently been flooded with the defeat of Lee Sedol by a deep reinforcement
learning algorithm developed by Google DeepMind. Similar breakthroughs are
being seen in video games, where the algorithms developed are achieving
human-level accuracy and beyond. Research is still going strong, with both
industrial and academic groups working towards the goal of building better
self-learning robots.
Some major domains where RL has been applied are as follows:
• Game Theory and Multi-Agent Interaction
• Robotics
• Computer Networking
• Vehicular Navigation
• Medicine, and
• Industrial logistics.
There are so many things still unexplored, and with the current craze of applying
deep learning to reinforcement learning, there are certainly breakthroughs
incoming!
Here is one piece of recent news:
Excited to share an update on #AlphaGo! pic.twitter.com/IT5HGBmYDr
— Demis Hassabis (@demishassabis) January 4, 2017
7. Additional Resources
I hope you now have an in-depth understanding of how reinforcement learning
works. Here are some additional resources to help you explore reinforcement
learning further:
• Videos on Reinforcement Learning
• Book on Introduction to Reinforcement Learning
• Awesome Reinforcement Learning Github repo
• Course on Reinforcement Learning by David Silver