Some Thoughts on Reinforcement Learning
Maher Said
February 3, 2006
1 Motivation
Why do we care about learning? One important reason is rationality (or the lack thereof): it is commonly accepted that rationality, as defined by economists (i.e., optimization), is far too demanding a requirement. In particular, the Bayesian approach requires full mental models, with state spaces, partitions, priors, and the like; in games, we need to know our payoffs, our opponents' payoffs, how the interaction of actions determines payoffs, the distributions over possible payoff functions of our opponents, and so on. While this may be a good approximation in some cases, it entirely rules out situations in which such a mental model is unavailable or infeasible. Furthermore, assuming the existence of such a mental model begs the question of where it comes from. One natural source is repeated experience combined with inductive and deductive reasoning. Specifically, simple learning algorithms or rules of thumb may lead to rational behavior if there is a great enough wealth of experience to learn from. Thus, it is useful to understand the performance of these learning rules in individual (as well as interactive) decision-making problems.
2 Overview of Reinforcement Learning
Reinforcement learning (RL) is a powerful tool used frequently by computer scientists to train essentially independent agents (generally computers or machines) to choose optimal actions to achieve their goals in a Markovian environment. It has been successfully used, for example, to train a computer to learn the game of backgammon and frequently defeat the best human players in the world, as well as to control elevator dispatching in complicated environments (see Sutton and Barto (1998) for case studies on these and other real-world implementations of reinforcement learning). A basic reinforcement learning system consists of a decision maker and an environment, as well as a policy, reward function, and value function. The policy defines the decision maker's behavior at a given time; in particular, it is a mapping from observed states to available actions. The reward function maps the observed state of the environment to a real-valued reward (which may be interpreted as utility in our setting). The decision maker is assumed to maximize the sum total of rewards received. Finally, the value function serves as the agent's estimate of the best possible stream of rewards that may be achieved by following her policy. The key to an RL system, however, is the manner in which the policy evolves in response to the interaction of the agent with the environment: as actions are taken and rewards are received, the value function is updated and the policy is refined.
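To make the pieces concrete, the following is a minimal Python sketch of the agent-environment loop just described; it is purely illustrative (the two-state environment, the Gaussian rewards, and all names are my own assumptions, not taken from Sutton and Barto (1998)).

import random

class Environment:
    """A toy Markovian environment with two states; rewards and transitions are hypothetical."""
    def __init__(self):
        self.state = 0

    def step(self, action):
        # Reward function: the observed state and the chosen action determine a real-valued reward.
        reward = random.gauss(1.0 if action == self.state else 0.0, 0.1)
        # State transition: the environment moves (here, randomly) to a new state.
        self.state = random.choice([0, 1])
        return reward, self.state

class Agent:
    """A policy derived from a table of value estimates, one per (state, action) pair."""
    def __init__(self, n_states=2, n_actions=2):
        self.n_actions = n_actions
        self.values = {(s, a): 0.0 for s in range(n_states) for a in range(n_actions)}
        self.counts = {(s, a): 0 for s in range(n_states) for a in range(n_actions)}

    def policy(self, state):
        # The policy maps the observed state to an action; here, greedily with respect to the values.
        return max(range(self.n_actions), key=lambda a: self.values[(state, a)])

    def update(self, state, action, reward):
        # Refine the value estimate as rewards are received.
        self.counts[(state, action)] += 1
        self.values[(state, action)] += (reward - self.values[(state, action)]) / self.counts[(state, action)]

env, agent = Environment(), Agent()
state = env.state
for t in range(1000):
    # Occasional random actions stand in for exploration.
    action = agent.policy(state) if random.random() > 0.1 else random.randrange(agent.n_actions)
    reward, next_state = env.step(action)
    agent.update(state, action, reward)
    state = next_state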
2.1 Q-Learning
There exists a wide variety of reinforcement learning techniques, some of which are better suited to certain learning environments and tasks than others. The most fundamental of these techniques is that of Q-learning. Define the function Q(s, a) to be the maximum payoff achievable when starting in state s and taking action a in that period; that is, in a setting with no discounting, we would define

    Q(s, a) := E[r(s, a)] + E[V(δ(s, a))],    (1)

where r(s, a) is the reward function evaluated at state s and action a, δ(s, a) is a state transition mapping, V is the optimal value function, and E is the expectations operator. The optimal policy in state s is then

    π(s) := arg max_{a ∈ A} {Q(s, a)};    (2)

thus, if the decision maker is able to learn Q through experience, it is possible to determine an optimal policy without knowledge of the reward function r, the transition function δ, or the optimal value function V. In fact, knowledge of Q implies knowledge of V, as V(s) = Q(s, π(s)) for all states s. So, how does a decision maker go about learning Q? The first step is noting that

    Q(s, a) = E[r(s, a)] + E[max_{a′} Q(δ(s, a), a′)].    (3)
So, let Q̂_t denote the estimate of Q at time t, and initialize Q̂_0(s, a) = 0 for all s and a. The decision maker observes the current state s_t and chooses an action a_t. After receiving a reward r_t and observing the new state s_{t+1}, the estimate of Q is updated to Q̂_t:

    Q̂_t(s_t, a_t) = (1 − α_t) Q̂_{t−1}(s_t, a_t) + α_t [r_t + max_{a′} Q̂_{t−1}(s_{t+1}, a′)],    (4)

where α_t is the inverse of the number of times the state-action pair (s_t, a_t) has been observed. It can be shown that, for an arbitrary initialization of Q̂_0 and any action selection rule such that each state-action pair (s, a) is observed infinitely often, the update rule above is such that Q̂_t(s, a) → Q(s, a) for all s and all a as t → ∞.¹
¹ For a detailed statement and discussion of this convergence theorem, see Theorem 13.2 in Mitchell (1997).
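As an illustration, a minimal Python sketch of the update rule in (4) follows; it assumes a generic environment object with reset() and step() methods (hypothetical names, not an interface from Mitchell (1997) or Sutton and Barto (1998)), and uses ε-greedy selection merely as one rule that visits every state-action pair infinitely often.

from collections import defaultdict
import random

def q_learning(env, n_actions, n_periods=10_000, epsilon=0.1):
    """Tabular Q-learning with alpha_t equal to 1 / (number of visits to (s, a))."""
    Q = defaultdict(float)       # Q_0(s, a) = 0 for all s and a
    visits = defaultdict(int)    # visit counts for each state-action pair

    state = env.reset()
    for t in range(n_periods):
        # Any selection rule that tries every (s, a) infinitely often will do;
        # epsilon-greedy is one simple choice.
        if random.random() < epsilon:
            action = random.randrange(n_actions)
        else:
            action = max(range(n_actions), key=lambda a: Q[(state, a)])

        reward, next_state = env.step(action)

        # Update rule (4): a convex combination of the old estimate and the new
        # sample r_t + max_a' Q_{t-1}(s_{t+1}, a'), with no discounting.
        visits[(state, action)] += 1
        alpha = 1.0 / visits[(state, action)]
        target = reward + max(Q[(next_state, a)] for a in range(n_actions))
        Q[(state, action)] = (1 - alpha) * Q[(state, action)] + alpha * target

        state = next_state
    return Q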
2.2 The n-Armed Bandit Problem
An n-armed bandit, also called a multi-armed bandit, is similar to a traditional slot machine (a one-armed bandit), but has n levers. When pulled, each lever yields a reward drawn from some fixed distribution associated with that lever. Initially, the decision maker (or gambler) has no information about any of the levers, but with repeated trials begins to learn which levers are associated with the highest payoffs. The n-armed bandit is a particularly good setting in which to study such methods due to its simplicity. One can easily compare the average rewards yielded by different decision-making approaches. Furthermore, there is always a well-defined optimal lever, so one may compare the time required by each strategy before consistently choosing the correct action. Finally, the stationarity and independence of the basic n-armed bandit problem allows one to examine the optimality properties of each method in a setting that does not require assumptions of significant probabilistic sophistication on the part of the decision maker.
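For concreteness, a simple simulated bandit of this kind might look as follows (a sketch of my own, with arbitrary Gaussian reward distributions); it serves as a test bed for the selection rules discussed below.

import random

class NArmedBandit:
    """Each lever pays a draw from a fixed distribution; here, Gaussian with unit variance."""
    def __init__(self, n, seed=None):
        rng = random.Random(seed)
        self.means = [rng.gauss(0.0, 1.0) for _ in range(n)]   # true (unknown) mean payoffs
        self.rng = rng

    def pull(self, arm):
        return self.rng.gauss(self.means[arm], 1.0)

    def best_arm(self):
        # The well-defined optimal lever: the one with the highest expected payoff.
        return max(range(len(self.means)), key=lambda a: self.means[a])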
2.3 Q-Learning and the n-Armed Bandit Problem
In the n-armed bandit problem, the Q-learning algorithm described above becomes even simpler. Since the problem is completely stationary, s_t = s_{t+1} for all t ∈ N, allowing us to drop the dependence on s; thus, we may rewrite equation (3) above as

    Q(a) = E[r(a)] + E[max_{a′} Q(a′)].    (5)
Since an optimal policy maximizes Q and the n-armed bandit (since n is finite) has a well-defined best action (the one with the highest expected payoff), we may define, for all a,

    Q̄(a) := Q(a) − E[max_{a′} Q(a′)] = E[r(a)].    (6)
Thus, a policy that is optimal with respect to Q̄ is also optimal with respect to Q; knowledge of Q̄ immediately implies knowledge of Q. Therefore, we can concentrate on learning Q̄. Notice, however, that the algorithm above for learning Q allows us, via a simple law of large numbers argument, to learn the value of Q̄ as long as each arm is chosen infinitely often.²

² Note that the update rule will be modified slightly, becoming Q̂_t(a_t) = (1 − α_t) Q̂_{t−1}(a_t) + α_t r_t, where α_t is the reciprocal of the number of times action a_t has been chosen. If Q̂_0(a) = 0 for all a, this corresponds to tracking the average performance over time of each action.
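In code, the modified update rule in the footnote is simply an incremental sample average; the following short sketch (my own illustration) makes this explicit.

def update_estimate(Q_hat, counts, arm, reward):
    """One step of Q_hat_t(a_t) = (1 - alpha_t) Q_hat_{t-1}(a_t) + alpha_t r_t,
    with alpha_t the reciprocal of the number of times arm a_t has been chosen."""
    counts[arm] += 1
    alpha = 1.0 / counts[arm]
    Q_hat[arm] = (1 - alpha) * Q_hat[arm] + alpha * reward
    return Q_hat[arm]

# With Q_hat initialized to zero, Q_hat[arm] is exactly the average payoff
# received so far from that arm.
n = 5
Q_hat, counts = [0.0] * n, [0] * n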
2.4 Action Selection
Given that the only requirement for convergence of Q̂_t(a) → Q̄(a) for all a is that each arm be chosen infinitely often, what then is the appropriate method for choosing which action to take? Especially given that the decision maker cares about the stream of rewards received while learning Q̄, it is unclear how best to balance the explorative and exploitative motives. However, two main methods dominate the literature: ε-greedy action selection and softmax action selection.

The ε-greedy selection method is extremely simple: for some fixed ε > 0, in each period t, the decision maker chooses a*_t = arg max_a Q̂_{t−1}(a) with probability 1 − ε, and randomizes uniformly between all arms with probability ε. With this method, for any ε > 0, each action will be sampled an infinite number of times as t → ∞, ensuring convergence of Q̂_t → Q̄. Furthermore, this implies that the best action will be chosen (in the limit) with probability 1 − ε(n − 1)/n.
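An ε-greedy chooser is only a few lines; the sketch below is a generic illustration (function and parameter names are mine).

import random

def epsilon_greedy(Q_hat, epsilon, rng=random):
    """With probability 1 - epsilon exploit the current best estimate;
    with probability epsilon randomize uniformly over all arms."""
    if rng.random() < epsilon:
        return rng.randrange(len(Q_hat))
    return max(range(len(Q_hat)), key=lambda a: Q_hat[a])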
While simple, ε-greedy selection suffers from the criticism that it ignores much of the information learned over time: it is just as likely to choose the worst-performing action as it is to choose the second-best action. Since the decision maker facing the bandit problem cares about payoffs as well as learning the optimal policy, such a method may be quite unsatisfactory. Therefore, softmax action selection procedures have been suggested, which attempt to smooth the relationship between the estimates of Q̄ and the probability of selecting various actions. The most common softmax method uses the Boltzmann distribution, choosing action a at period t with probability

    π_t(a) = e^{Q̂_t(a)/τ} / Σ_{b=1}^{n} e^{Q̂_t(b)/τ},    (7)

where τ > 0 is a parameter called the temperature. Note that π_t(a) > 0 for all arms a and all t ∈ N. Furthermore, as τ → ∞, π_t approaches a discrete uniform distribution, while as τ → 0, π_t approaches the ε-greedy method with ε = 0. Finally, if a* = arg max_a Q̄(a), then π_t(a*) → e^{Q̄(a*)/τ} / Σ_{b=1}^{n} e^{Q̄(b)/τ} as t → ∞, a much less intuitive result than that obtained by ε-greedy methods. Furthermore, choosing a value of τ for the softmax method is a much more opaque task than choosing a level of ε for the ε-greedy method.

Notice that modifying each of these methods slightly by decreasing ε or τ over time allows one to decrease the amount of exploration over time (by converging to a 0-greedy selection rule) while ensuring that each action will still be chosen an infinite number of times in the limit (since all actions will still be chosen with positive probability for any ε > 0 or any τ > 0). The literature often suggests decreasing these parameters at a rate of either 1/t or log(t)/t; it is also possible to modify these methods so that ε or τ decline at a rate of 1/k_t or log(k_t)/k_t, where k_t is the number of times that a*_t = arg max_a Q̂_t(a) has been chosen.
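The Boltzmann rule in (7), together with a declining temperature of the sort just described, might be implemented as in the following sketch (again my own illustration; the 1/t schedule is one of those mentioned above, and the max-subtraction is only a numerical safeguard that leaves the probabilities unchanged).

import math
import random

def softmax_choice(Q_hat, tau, rng=random):
    """Choose arm a with probability exp(Q_hat[a] / tau) / sum_b exp(Q_hat[b] / tau)."""
    m = max(Q_hat)  # subtracting the maximum does not change the probabilities but avoids overflow
    weights = [math.exp((q - m) / tau) for q in Q_hat]
    return rng.choices(range(len(Q_hat)), weights=weights, k=1)[0]

def tau_schedule(t, tau_0=1.0):
    """One possible schedule: tau_t = tau_0 / t, so exploration declines over time
    while every arm keeps strictly positive probability in every period."""
    return tau_0 / max(t, 1)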
3 Related Literature
Economists have only relatively recently begun to examine reinforcement learning, with the vast majority of authors examining the performance of reinforcement learning agents in the context of a game. Roth and Erev (1995) and Erev and Roth (1998) were seminal papers in this regard: they took as given players' initial propensities to choose each action available to them in a game, with propensities updated additively by the payoffs received in each period and actions chosen with probabilities proportional to their propensities. The authors experimentally tested this model with human subjects, finding that observed behavior seemed to be relatively well-explained by their reinforcement model.

Rustichini (1999) is the only paper that I am aware of that examines the performance of RL-based algorithms in individual decision-making problems. In particular, he examines the optimality properties of two different reinforcement learning procedures in the case where there is a finite set of states that change according to a Markov process, with a finite number of actions with deterministic payoffs. The first rule he examines is that of Roth and Erev (1995) and Erev and Roth (1998):
take as given some initial propensities S^i_0 > 0, where i indexes actions. Then, if action j is chosen in period t − 1, yielding a payoff of r_{t−1}, then S^j_t = S^j_{t−1} + r_{t−1}. The probability of choosing action i at time t is then

    π^i_t = S^i_t / Σ_j S^j_t;    (8)
effectively, the probability of choosing an action is proportional to the cumulative payoff from that action (plus the initial propensity weights). Rustichini shows that this procedure converges almost surely to the action that maximizes expected payoff, where the expectation is taken relative to the invariant distribution of the Markov process governing state transitions. He also examines action selection rules of the form
    π^i_t = e^{θ S^i_t} / Σ_j e^{θ S^j_t},    (9)

where θ > 0 and the S^i_t are defined as before. This procedure is shown to converge to its limit
much faster than the cumulative linear rule in (8); however, this convergence need not be to the payoff-maximizing choice.³

³ Bell (2001) examines the sensitivity of the Roth-Erev rule to the initial conditions S^i_0. Her simulation results show that early behavior is extremely sensitive to these initial conditions.

The remainder of the literature focuses on the performance of RL agents in games. Hopkins (2002) compares the Roth-Erev rule in (8) to fictitious play in two-player games, and finds that an appropriate perturbation of a dynamic system generated by the interaction of two RL agents has the same stationary points as a stochastic fictitious play model, implying that if the game has sufficient structure to determine the asymptotic behavior of stochastic fictitious play, then the Roth-Erev reinforcement learning model will yield similar behavior. Beggs (2005) shows that the play of Roth-Erev agents leads to the elimination of iteratively dominated strategies, and that Roth-Erev agents cannot be forced permanently below their minmax payoffs, even when playing against sophisticated players. Laslier, Topol, and Walliser (2001) show that the play of agents using the Roth-Erev rule will converge with positive probability to any strict Nash equilibrium. Laslier and Walliser (2005) show that this rule converges to the (generically unique) subgame perfect equilibrium in finite extensive form games of perfect information, and Jehiel and Samet (2005) prove a similar result for a closely related updating rule. All of the above-mentioned authors note a link between the dynamic system generated by reinforcement learning and the evolutionary replicator dynamic.

Börgers and Sarin (1997) and (2000) are the only two papers in this literature to examine a rule other than the Roth-Erev cumulative payoff rule. Suppose that action i is taken in period t, yielding a payoff r_t ∈ (0, 1). Then actions are chosen in the following period with probabilities given by
    π^j_{t+1} = r_t + (1 − r_t) π^j_t    if j = i,
    π^j_{t+1} = (1 − r_t) π^j_t          if j ≠ i.    (10)
Thus, payoffs directly affect selection probabilities, with no intermediaries such as propensities or Q. The authors then note that the continuous time limit of this stochastic process satisfies the differential equation which characterizes the continuous time replicator dynamic. In the second of the two papers, they add to the previous specification an endogenous aspiration level a_{t+1} = λ r_t + (1 − λ) a_t, and the reinforcement in period t is based on |r_t − a_t| instead of on r_t alone. In this case, the stochastic system is the sum of a replicator dynamic and a probability matching component.⁴

Unfortunately, the established literature suffers from a few shortcomings. Firstly, all of the papers based on the Roth-Erev framework assume that payoffs are always positive.⁵ Therefore,

    S^i_{t+1} = S^i_t + r_t > S^i_t,

implying that π^i_{t+1} > π^i_t; that is, there is always positive reinforcement and no negative reinforcement, implying that the probability of taking an action increases merely due to that action having been taken in the past. Meanwhile, the framework used by Börgers and Sarin (1997) and (2000) has no intermediary between payoffs and action probabilities, preventing any sort of cognitive or behavioral interpretation of their model. Finally, the only paper that explores nonstationary environments is Rustichini (1999); however, he makes use of a Markov process governing the state transition process and looks for optimality in terms of the stationary distribution, thereby introducing stationarity through the back door.

⁴ Suppose there are two actions s_1 and s_2 with (respective) payoffs of (1, 0) with probability α and (0, 1) with probability 1 − α. Then probability matching implies that the long-run frequency of choosing s_1 is α.

⁵ Negative payoffs can cause Σ_j S^j_t to be small enough that π^i_t is either negative or greater than 1 for some i. This is not an issue with exponential weighting rules.
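For reference, both families of updating rules discussed in this section are easy to simulate; the sketch below (my own construction, with hypothetical Bernoulli payoff parameters) implements the cumulative-propensity rule in (8) and the Börgers-Sarin rule in (10), assuming positive payoffs for the former and payoffs strictly inside (0, 1) for the latter.

import random

def roth_erev_probs(S):
    """Rule (8): choice probabilities proportional to cumulative propensities."""
    total = sum(S)
    return [s / total for s in S]

def roth_erev_update(S, chosen, payoff):
    """Propensities updated additively by the payoff received (assumed non-negative)."""
    S[chosen] += payoff
    return S

def borgers_sarin_update(probs, chosen, payoff):
    """Rule (10): a payoff in (0, 1) shifts probability directly toward the chosen action."""
    return [payoff + (1 - payoff) * p if j == chosen else (1 - payoff) * p
            for j, p in enumerate(probs)]

# Example: two actions with Bernoulli payoff parameters (hypothetical values).
rng = random.Random(0)
payoff_prob = [0.8, 0.4]
S = [1.0, 1.0]           # initial propensities S_0^i > 0
probs_bs = [0.5, 0.5]    # initial choice probabilities for the Borgers-Sarin rule
for t in range(5000):
    # Roth-Erev agent
    a = rng.choices([0, 1], weights=roth_erev_probs(S), k=1)[0]
    roth_erev_update(S, a, 1.0 if rng.random() < payoff_prob[a] else 0.0)
    # Borgers-Sarin agent (payoffs kept strictly inside (0, 1))
    b = rng.choices([0, 1], weights=probs_bs, k=1)[0]
    r = 0.9 if rng.random() < payoff_prob[b] else 0.1
    probs_bs = borgers_sarin_update(probs_bs, b, r)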
4 Possible Directions for Future Work
What can we conclude from the overview above? The main point is that, at least for the basic n-armed bandit problem, reinforcement learning algorithms do quite well in terms of picking out the best arm. Furthermore, there are many different parameterizations that yield this same result. Moreover, the economics literature has explored only a small portion of the universe of reinforcement learning algorithms. So, where do I want to go from here? There are several potential directions for further investigation.
4.1 Reinforcement Learning in a Nonstationary Environment
One limitation of the literature on reinforcement learning (at least in economics) is the assumption of stationarity. While multi-armed bandit problems are interesting and crop up in many strands of economic theory, what is potentially more interesting is when the world changes. Suppose the distributions governing the payoffs of our bandit problem unexpectedly change: does a naive agent making use of a reinforcement learning approach notice the change? And if so, how long does it take for there to be a response? Can the agent eventually learn what the new optimal action is? Moreover, if we take things a step further and look at the performance of RL agents playing games against one another, what happens if the underlying game changes? How will these rules of thumb perform?

One result that seems rather intuitive is that a single change in the environment is not sufficient to bring about permanently sub-optimal behavior. To see this, note that the action selection rules described above all have positive probabilities of choosing any given action in any given period. Therefore, each action will be selected infinitely often after the change in the environment. While the agent's initial (pre-change) experience will bias the Q̂ estimates, a law of large numbers argument implies that these estimates will be consistent in the limit: they will converge to the true means of the distributions, and the agent will have sufficient information about the new environment to choose the optimal action.

Thus, for the situation to be interesting, we need a setting in which the environment will change infinitely often. So, let there be M actions {a_m}_{m=1}^{M} and N states of the world {s_n}_{n=1}^{N}. Taking action m in state n yields a stochastic payoff from a distribution F_{mn} with mean μ_{mn}; thus, we have N different M-armed bandit problems, with payoff distributions governed by the state of the world. State transitions are governed by, for example, a Poisson distribution; in each period in which there is a switch in states, the state of the world is chosen from a uniform distribution over {s_n}_{n=1}^{N}.

One of the most difficult questions is perhaps that of quantifying optimality. If the state of the world is changing infinitely often, there is no action that may be considered to be ex ante optimal, so we cannot look for situations in which the probability of choosing a given arm goes to one as time goes to infinity. One possible metric of optimality would be performance within a regime: the reinforcement learning algorithm would be considered to be performing well if it adjusts quickly after the regime change and approaches the state-specific optimal action. Intuition, supported by simulation evidence, suggests that with such a metric, performance should be inversely related to the frequency of regime change: if the state is constantly in flux, the best we can hope to do is to pick actions at random, whereas if there are long periods of stability, we can expect a significant amount of learning (or re-learning) to occur.

Another approach to this question would be to consider the problem from a Bayesian perspective. What would a fully rational, forward-looking Bayesian agent do in each period? This behavior may be used as a benchmark with which to compare the performance of agents who act according to reinforcement learning. In particular, we can examine the sequence of actions, as well as the history of payoffs, and determine whether RL behavior approximates Bayesian behavior, and how much of a payoff loss, if any, is incurred by acting in accordance with reinforcement learning as opposed to Bayesian optimality.
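A simulation along these lines is straightforward to set up; the sketch below is entirely my own construction, with a per-period switching probability standing in for the Poisson arrival of regime changes and arbitrary Gaussian payoff distributions, and it measures performance as the overall frequency with which the currently optimal (state-specific) arm is chosen.

import random

def simulate(M=5, N=3, switch_prob=0.01, epsilon=0.1, T=50_000, seed=0):
    """M-armed bandit whose payoff means depend on an unobserved state that switches
    with probability switch_prob each period (a geometric stand-in for Poisson-timed
    regime changes); the agent never observes the state."""
    rng = random.Random(seed)
    means = [[rng.gauss(0.0, 1.0) for _ in range(M)] for _ in range(N)]  # mu_mn
    state = rng.randrange(N)

    Q_hat, counts = [0.0] * M, [0] * M
    optimal_pulls = 0

    for t in range(T):
        # Regime change: redraw the state uniformly over {s_1, ..., s_N}.
        if rng.random() < switch_prob:
            state = rng.randrange(N)

        # epsilon-greedy action selection on the agent's running estimates.
        if rng.random() < epsilon:
            arm = rng.randrange(M)
        else:
            arm = max(range(M), key=lambda a: Q_hat[a])

        reward = rng.gauss(means[state][arm], 1.0)
        counts[arm] += 1
        Q_hat[arm] += (reward - Q_hat[arm]) / counts[arm]   # sample-average update

        # Performance metric: how often the state-specific best arm is chosen.
        if arm == max(range(M), key=lambda a: means[state][a]):
            optimal_pulls += 1

    return optimal_pulls / T

print(simulate())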
4.2 An Axiomatic Approach
A major difficulty with making use of reinforcement learning techniques is the determination of which specific parameterization one should use. It would be useful to have an axiomatic foundation for reinforcement learning. Restricting attention to the stationary environment setting for the moment, it seems plausible and worthwhile to develop reasonable axioms which would narrow down the set of possible dynamic decision-making rules to those that fall under the umbrella of reinforcement learning. In light of the large variety of possible RL models, axioms that characterize the softmax parameterization, for example, would be extremely useful (both theoretically and for practical application).
4.3 Reinforcement Learning in Other Settings
Recently, there have been a number of papers published about the performance of various RL algorithms in interactive settings: for example, several authors have noted links between the behavior of RL agents playing games against one another and the continuous time replicator dynamic, while others have demonstrated that payoffs will be bounded below by minmax levels, or that RL behavior (in specific parameterizations) is consistent with iterated deletion of strictly dominated strategies. In this vein, it might be interesting to examine the performance of reinforcement learning agents in networks: will reinforcement learning lead to convergence to optimal or efficient networks?
References
[1] Beggs, Alan (2005). "On the Convergence of Reinforcement Learning." Journal of Economic Theory 122: 1-36.
[2] Bell, Ann Maria (2001). "Reinforcement Learning Rules in a Repeated Game." Computational Economics 18: 89-111.
[3] Börgers, Tilman and Rajiv Sarin (1997). "Learning through Reinforcement and Replicator Dynamics." Journal of Economic Theory 77: 1-14.
[4] Börgers, Tilman and Rajiv Sarin (2000). "Naive Reinforcement Learning with Endogenous Aspirations." International Economic Review 41: 921-950.
[5] Erev, Ido and Alvin Roth (1998). "Predicting How People Play Games: Reinforcement Learning in Experimental Games with Unique Mixed Strategy Equilibria." American Economic Review 88: 848-881.
[6] Hopkins, Ed (2002). "Two Competing Models of How People Learn in Games." Econometrica 70: 2141-2166.
[7] Jehiel, Philippe and Dov Samet (2005). "Learning to Play Games in Extensive Form by Valuation." Journal of Economic Theory 124: 129-148.
[8] Laslier, Jean-François, Richard Topol, and Bernard Walliser (2001). "A Behavioral Learning Process in Games." Games and Economic Behavior 37: 340-366.
[9] Laslier, Jean-François and Bernard Walliser (2005). "A Reinforcement Learning Process in Extensive Form Games." International Journal of Game Theory 33: 219-227.
[10] Mitchell, Tom (1997). Machine Learning. Boston: McGraw-Hill.
[11] Roth, Alvin and Ido Erev (1995). "Learning in Extensive-Form Games: Experimental Data and Simple Dynamic Models in the Intermediate Term." Games and Economic Behavior 8: 164-212.
[12] Rustichini, Aldo (1999). "Optimal Properties of Stimulus-Response Learning Models." Games and Economic Behavior 29: 244-273.
[13] Sutton, Richard and Andrew Barto (1998). Reinforcement Learning: An Introduction. Cambridge, MA: MIT Press.