(CS 5008) Reinforcement Learning : Assignment 3
Markov Chain
Q1) Consider a random walk on the set of integer Z described as follow: P rob(st+1 = s + 1|st =
s) = 0.5 and P rob(st+1 = s−1|st = s) = 0.5. Start from various initial distributions µ0 (remember
s0 ∼ µ0 ), and find out µ4 , the distribution after 4 time steps.
Q2) Consider a single queue with maximum length n = 9. What is the total number of states? Let the
queue evolve in discrete time steps t = 0, 1, . . ., and let probability of arrival of a customer between
times t and t + 1 be p, and let the probability that a customer is serviced between t and t + 1 be q.
Also, let arrival and service be independent of each other. Describe the probability transition matrix
for this system.
Q3) We know that µ>
t+1 = µt P. Verify that µt+1 is a distribution if µt is a distribution.
Q4) Consider the filtering problem discussed in the class? Verify that P rob(s0 |o0 , s0 ∼ µ0 ) is
nothing but the Bayes rule.
Q5) Consider the filtering problem discussed in the class? How will we find P rob(st |ok , s0 ∼ µ0 ),
where k < t.
Q6) Please work out the “rain and umbrella" explained in class for i) Filtering ii) Prediction iii)
Smoothing and iv) Maximum likelihood sequence (Viterbi Algorithm). Play around with different
numbers.