Dynamic Programming and Optimal Control: Programming Exercise Topic: Infinite Horizon Problems

This programming exercise uses dynamic programming and optimal control algorithms to find the optimal policy for a paparazzi who wants to take a picture of a celebrity in minimum time while navigating an estate with security cameras, obstacles, and the celebrity's mansion. The tasks are (1) creating the transition probability and stage cost matrices and (2) applying value iteration, policy iteration, and linear programming to compute the optimal value function and policy. Templates are provided for the key functions that implement the algorithms and solve the stochastic shortest path problem.

151-0563-01 Dynamic Programming and Optimal Control (Fall 2018)

Programming Exercise Topic: Infinite Horizon Problems

Issued: Nov 22, 2018 Due: Dec 19, 2018
Rajan Gill (rgill@ethz.ch), Weixuan Zhang (wzhang@ethz.ch), November 25, 2018

Policy Iteration, Value Iteration, and Linear Programming

The goal of this programming exercise is to help a paparazzi take a picture of a celebrity in
minimum time. For this purpose the paparazzi enters a celebrity’s estate (see Fig. 1) and tries
to sneak up to the celebrity’s mansion in order to get a good picture. Unfortunately, there are
various security cameras installed on the property. If caught on camera, the paparazzi will be
brought to the entrance gate by the security guards and has to restart. In addition, there are
various obstacles such as trees, bushes, ponds and pools on the property. The trees and bushes
can be used by the paparazzi to hide from the cameras, but he cannot move through them. The
ponds and pools can be crossed by the paparazzi, but this costs extra time and they do not offer
any sight protection.

Figure 1: Bird’s eye view of a celebrity’s estate with trees, bushes, ponds and a pool.

Figure 2: Example map of the celebrity’s estate. The black cells represent trees and bushes, the
blue cells ponds and pools, the grey cells the mansion, the red cells security cameras and the
yellow cell the entrance gate. North points up; the cells are labelled from (1,1) in the south-west
corner, with (2,1) to its east and (1,2) to its north, up to (N,M) in the north-east corner.

Problem set up
The celebrity’s property is discretized into N × M cells (see Fig. 2), where N is the width of
the estate (from west to east) and M its length (from south to north). At each time step, the
paparazzi can move north, west, south or east, or stay at his current position and try to take
a picture. The paparazzi’s position is described by x = (n, m), with n ∈ {1, ..., N} and
m ∈ {1, ..., M}.
The map of the estate is described by an M × N matrix, where positive values indicate
obstacles that are inaccessible (e.g. trees, bushes, mansion, cameras, etc.) and negative
values indicate ponds or pools. If the paparazzi decides to move into a pond or pool, it takes
him four time steps. Leaving the pool or taking a picture from inside the pool only takes one
time step. The position of the mansion, which is always rectangular, is given by an F × 2 matrix,
where each row indicates a cell of the map that is part of the mansion.
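
The sign convention of the map and the idea of numbering the accessible cells can be sketched in a few lines. The sketch below is in Python purely for illustration (the exercise itself is solved in the MATLAB templates). It assumes that the map is stored as a list of rows with `map_rows[m - 1][n - 1]` holding the entry of cell (n, m), that every non-positive entry is accessible, and that states are numbered column by column. The actual orientation and ordering are defined in main.m, which is not reproduced here, so these are assumptions, and the function names are hypothetical.

```python
def cell_type(value):
    """Classify a map entry using the sign convention from the text."""
    if value > 0:
        return "obstacle"  # trees, bushes, mansion, cameras: inaccessible
    if value < 0:
        return "pool"      # ponds and pools: accessible, entering takes four time steps
    return "free"


def build_state_space(map_rows):
    """Return the accessible cells (n, m); list position k holds state index k + 1.

    Assumption: map_rows[m - 1][n - 1] is the map entry of cell (n, m).
    """
    M = len(map_rows)      # number of rows (south-north extent)
    N = len(map_rows[0])   # number of columns (west-east extent)
    states = []
    for n in range(1, N + 1):
        for m in range(1, M + 1):
            if cell_type(map_rows[m - 1][n - 1]) != "obstacle":
                states.append((n, m))
    return states


# Tiny example with M = 2 rows and N = 3 columns.
example_map = [
    [0, 1, -1],
    [0, 0, 0],
]
example_states = build_state_space(example_map)  # K = len(example_states)
```

The number of states K is then the length of the returned list, which is in general smaller than N · M because obstacle cells are skipped.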
The positions of the security cameras are given by an H × 3 matrix, where each row indicates
the cell of the camera position (first two columns) and the camera’s image quality (third column).
The cameras can film in all four cardinal directions, but not diagonally or through obstacles. If
the paparazzi moves to a cell that is in the line of sight of a camera, the probability of being
detected by this camera is γ_i / ||d_pc||, where γ_i denotes the camera’s image quality and
||d_pc|| the current distance between the paparazzi and the camera (measured in number of cells).
If the paparazzi is caught on camera, he is brought to the entrance gate, which costs an additional
six time steps. The position of the entrance gate is given by a 1 × 2 matrix indicating the cell
of the map where the gate is located.
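
The line-of-sight rule and the per-camera detection probability can be illustrated with a short Python sketch (illustration only, not the MATLAB template). It assumes `map_rows[m - 1][n - 1]` is the entry of cell (n, m), that any positive entry blocks the view while pools do not, and that adjacent cells are at distance 1. Two further assumptions are not stated in the handout: several cameras are treated as detecting independently, and the ratio quality / distance is clamped to 1. All names are hypothetical.

```python
DIRECTIONS = [(0, 1), (-1, 0), (0, -1), (1, 0)]  # north, west, south, east as (dn, dm)


def visible_cells(map_rows, camera):
    """Map every cell (n, m) in the camera's line of sight to its distance in cells.

    Assumption: map_rows[m - 1][n - 1] is the entry of cell (n, m) and any
    positive entry (obstacle) blocks the view.
    """
    M, N = len(map_rows), len(map_rows[0])
    n0, m0 = camera
    seen = {}
    for dn, dm in DIRECTIONS:
        n, m, dist = n0 + dn, m0 + dm, 1
        # Walk along the ray until the map border or the first obstacle.
        while 1 <= n <= N and 1 <= m <= M and map_rows[m - 1][n - 1] <= 0:
            seen[(n, m)] = dist
            n, m, dist = n + dn, m + dm, dist + 1
    return seen


def detection_probability(map_rows, cameras, cell):
    """Probability that at least one camera detects the paparazzi at `cell`.

    cameras: list of (n, m, quality). Each camera detects with probability
    quality / distance; cameras are ASSUMED to act independently.
    """
    p_undetected = 1.0
    for n, m, quality in cameras:
        dist = visible_cells(map_rows, (n, m)).get(cell)
        if dist is not None:
            p_undetected *= 1.0 - min(1.0, quality / dist)  # clamp is an assumption
    return 1.0 - p_undetected
```

For a single row `[[2, 0, 0, 0]]` with one camera of quality 0.5 at cell (1, 1), the cell (3, 1) is two cells away, so the sketch yields a detection probability of 0.25; an obstacle placed between camera and cell reduces it to 0.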
In each time step, the paparazzi can take a picture instead of moving around. The minimum
probability of successfully taking a picture is p_c = 0.001, which represents the rare case
when the celebrity walks into the picture outside of the mansion. If any cell of the mansion is
in the field of view of the paparazzi (all four cardinal directions, not diagonally or through
trees or bushes), the probability of successfully taking a picture is max{p_c, γ_p / ||d_pm||},
with γ_p = 0.5 and ||d_pm|| being the current distance between the paparazzi and the
corresponding cell of the mansion.
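
As a small worked illustration of the formula max{p_c, γ_p / ||d_pm||}, the following Python sketch (illustration only) computes the success probability from a list of distances to the mansion cells currently in view. Taking the largest value over all visible mansion cells, i.e. using the nearest one, is an assumed reading of "the corresponding cell"; the function name is hypothetical.

```python
P_MIN = 0.001   # p_c: minimum success probability from the text
GAMMA_P = 0.5   # gamma_p from the text


def picture_success_probability(mansion_distances):
    """Success probability of the 'take a picture' action.

    mansion_distances: distances (in cells) to every mansion cell in the
    field of view; an empty list means no mansion cell is visible.
    """
    best = P_MIN
    for dist in mansion_distances:
        best = max(best, GAMMA_P / dist)
    return best
```

With no mansion cell in view the result is 0.001; with a mansion cell two cells away it is 0.5 / 2 = 0.25; for very distant cells the floor p_c takes over.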

Note: The order of events during a time step is the following: First, the paparazzi moves to a
new cell or takes a picture. If he successfully took a picture of the celebrity, the task is over.
In all other cases, the security cameras try to spot the paparazzi and, if he is caught on camera,
the paparazzi is escorted to the entrance gate by the security guards.
If the paparazzi moves into a pond or pool cell, the security cameras have four attempts to
film the paparazzi and hence the probability of the paparazzi being detected increases.
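
The effect of repeated detection attempts on the expected number of time steps can be sketched as follows (Python, illustration only). The sketch assumes that the attempts are independent and that being caught simply adds six time steps to the duration of the move; both are one possible reading of the rules above, and the function names are hypothetical.

```python
def caught_probability(p_single, attempts):
    """Probability of being detected in at least one of `attempts` independent tries."""
    return 1.0 - (1.0 - p_single) ** attempts


def expected_move_cost(p_single, into_pool):
    """Expected number of time steps of a move into a cell.

    p_single: detection probability of one attempt in the target cell.
    into_pool: True if the target cell is a pond or pool (four time steps,
    four detection attempts), False otherwise (one time step, one attempt).
    """
    base_steps = 4 if into_pool else 1
    attempts = 4 if into_pool else 1
    return base_steps + 6 * caught_probability(p_single, attempts)
```

For example, a per-attempt detection probability of 0.5 in a pool cell gives 1 − 0.5⁴ = 0.9375 for being caught, and an expected cost of 4 + 6 · 0.9375 = 9.625 time steps under these assumptions.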
Tasks
Find the policy minimizing the expected number of time steps required to successfully take a
picture by

a) creating a transition probability matrix P ∈ R^(K×K×L), where K is the number of possible
   states and L is the number of control inputs¹. For creating P, the state space is created
   by assigning a unique state index i = 1, 2, ..., K to all accessible cells of the estate (see
   main.m).
   Use the provided file ComputeTransitionProbabilities.m as a template for your
   implementation.
   This part counts 30% towards the grade.

b) creating a stage cost matrix² G ∈ R^(K×L).
   Use the provided file ComputeStageCosts.m as a template for your implementation.
   This part counts 25% towards the grade.

c) applying Value Iteration³, Policy Iteration and Linear Programming⁴ to compute J ∈ R^K
   and the optimal policy µ(i), i = 1, ..., K, that solves the stochastic shortest path problem.
   Use the provided files ValueIteration.m, PolicyIteration.m
   and LinearProgramming.m as templates for your implementation.
   Each algorithm accounts for 15% of the grade.
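
Value iteration and policy iteration for a stochastic shortest path problem can be sketched on a toy instance as follows. This is a plain-Python illustration, not the MATLAB templates. It assumes that `P[i][j][u]` and `G[i][u]` mirror the indexing of P and G with zero-based indices, that the termination state is not among the K states (so a row of P may sum to less than 1), and that infeasible controls carry cost `inf`. Policy evaluation is done by fixed-point iteration here rather than by solving the linear system directly, and policy iteration must be started from a feasible, proper policy. The toy data and all names are hypothetical.

```python
import math


def q_value(P, G, J, i, u):
    """Expected cost of applying control u in state i and continuing with cost J."""
    if math.isinf(G[i][u]):
        return math.inf  # infeasible control
    return G[i][u] + sum(P[i][j][u] * J[j] for j in range(len(J)))


def value_iteration(P, G, tol=1e-5):
    """Iterate the Bellman update until no J(i) changes by more than tol."""
    K, L = len(G), len(G[0])
    J = [0.0] * K
    while True:
        J_new = [min(q_value(P, G, J, i, u) for u in range(L)) for i in range(K)]
        change = max(abs(a - b) for a, b in zip(J_new, J))
        J = J_new
        if change <= tol:
            break
    policy = [min(range(L), key=lambda u: q_value(P, G, J, i, u)) for i in range(K)]
    return J, policy


def evaluate_policy(P, G, policy, tol=1e-10):
    """Cost of a fixed proper policy, computed by fixed-point iteration."""
    J = [0.0] * len(G)
    while True:
        J_new = [q_value(P, G, J, i, policy[i]) for i in range(len(G))]
        change = max(abs(a - b) for a, b in zip(J_new, J))
        J = J_new
        if change <= tol:
            return J


def policy_iteration(P, G, policy):
    """Alternate policy evaluation and improvement until the policy is stable."""
    K, L = len(G), len(G[0])
    policy = list(policy)
    while True:
        J = evaluate_policy(P, G, policy)
        improved = False
        for i in range(K):
            best_u = min(range(L), key=lambda u: q_value(P, G, J, i, u))
            # Switch only on a strict improvement to avoid cycling between ties.
            if q_value(P, G, J, i, best_u) < q_value(P, G, J, i, policy[i]) - 1e-9:
                policy[i] = best_u
                improved = True
        if not improved:
            return J, policy


# Toy problem with K = 2 states and L = 2 controls, indexed as P_toy[i][j][u].
# State 0: u = 0 moves to state 1; u = 1 stays w.p. 0.9 and terminates w.p. 0.1.
# State 1: u = 0 moves to state 0 w.p. 0.5 and terminates w.p. 0.5; u = 1 is infeasible.
P_toy = [
    [[0.0, 0.9], [1.0, 0.0]],
    [[0.5, 0.0], [0.0, 0.0]],
]
G_toy = [[1.0, 1.0], [1.0, math.inf]]

J_vi, mu_vi = value_iteration(P_toy, G_toy)
J_pi, mu_pi = policy_iteration(P_toy, G_toy, [1, 0])
```

For this toy problem the Bellman equations J(0) = 1 + J(1) and J(1) = 1 + 0.5 J(0) give J = (4, 3) with control 0 in both states, and both routines should agree with that up to their tolerances.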

¹ Set the transition probabilities to 0 for infeasible moves.
² Set the expected stage cost to inf for infeasible moves.
³ You can terminate the algorithm if all J(i), i = 1, ..., K, do not change by more than 10^-5
  within one iteration step.
⁴ In your implementation of the file LinearProgramming.m, you may use the MATLAB function
  “linprog” to solve the linear program.
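
For reference, the linear program for a stochastic shortest path problem is commonly written as below. This is the standard textbook formulation in the notation of this exercise, not a statement taken from the handout, and it assumes that the termination state is not among the K states.

```latex
\begin{aligned}
\max_{J \in \mathbb{R}^{K}} \quad & \sum_{i=1}^{K} J(i) \\
\text{subject to} \quad & J(i) \le G(i,u) + \sum_{j=1}^{K} P(i,j,u)\, J(j)
  \qquad \text{for all } i = 1,\dots,K \text{ and all feasible } u .
\end{aligned}
```

Since linprog minimizes f'x subject to Ax ≤ b, one way to pass this problem is to set f to a vector of −1 and to stack one row (e_i − P(i,:,u)) J ≤ G(i,u) per feasible state-control pair, where e_i is the i-th unit row vector. Pairs with G(i,u) = inf are infeasible and would be left out of the constraints.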
Provided Matlab files
A set of MATLAB files is provided on the class website. Use them for solving the above prob-
lem. Strictly follow the structure. Grading is automated using MATLAB 2018b. You can add
functions to the template files, but each file should be self-contained, i.e. not depend on any
external custom function.
main.m                            Matlab script that has to be used to generate a map of the estate, execute the stochastic shortest path algorithms and display the results.
GenerateMap.p                     Matlab function that generates a map of the celebrity’s estate.
PlotMap.m                         Matlab function that can plot a map of the estate, and the cost and control action for each accessible cell.
PlotMap3.m                        Matlab function that can plot a three dimensional view of the estate and control action for each accessible cell.
ComputeTransitionProbabilities.m  Matlab function template to be used for creating the transition probability matrix P ∈ R^(K×K×L).
ComputeStageCosts.m               Matlab function template to be used for creating the stage cost matrix G ∈ R^(K×L).
ValueIteration.m                  Matlab function template to be used for your implementation of the Value Iteration algorithm for the stochastic shortest path problem.
PolicyIteration.m                 Matlab function template to be used for your implementation of the Policy Iteration algorithm for the stochastic shortest path problem.
LinearProgramming.m               Matlab function template to be used for your implementation of the Linear Programming algorithm for the stochastic shortest path problem.
exampleMap.mat                    A pre-generated 15 × 10 map to be used for testing your implementations of the above functions.
exampleP.mat                      The transition probability matrix P ∈ R^(K×K×L) for the example map.
exampleG.mat                      The stage cost matrix G ∈ R^(K×L) for the example map.
Deliverables
Please hand in by e-mail

• your MATLAB implementation of the following files:
    ComputeTransitionProbabilities.m,
    ComputeStageCosts.m,
    ValueIteration.m,
    PolicyIteration.m,
    LinearProgramming.m.
  Only submit the above-mentioned files. Your code should not depend on any other non-
  standard MATLAB functions.

• in a PDF file, a scanned declaration of originality, signed by each student to confirm that
  the work is original and has been done by the author(s) independently:
  https://www.ethz.ch/content/dam/ethz/main/education/rechtliches-abschluesse/leistungskontrollen/declaration-originality.pdf
  Each work submitted will be tested for plagiarism.

Please include all files in one zip archive named DPOCEx_Names.zip, where Names is a list of
the full names of all students who have worked on the solution
(e.g. DPOCEx_GillRajan_ZhangWeixuan.zip)⁵.
Send your files to wzhang@ethz.ch with subject [programming exercise submission]
by the due date indicated above. We will send a confirmation e-mail upon receiving your e-mail.
You are ultimately responsible that we receive your solution in time.

⁵ Up to three students are allowed to work together on the problem. They will all receive the same grade.
