SLAP
Abstract—Simultaneous Localization and Planning (SLAP) under process and measurement uncertainties is a challenge. It involves solving a stochastic control problem modeled as a Partially Observed Markov Decision Process (POMDP) in a general framework. For a convex environment, we propose an optimization-based open-loop optimal control problem coupled with a receding horizon control strategy to plan high-quality trajectories along which the uncertainty of the state localization is reduced while the system reaches a goal state with minimum control effort. In a static environment with non-convex state constraints, the optimization is modified by defining barrier functions to obtain collision-free paths while maintaining the previous goals. By initializing the optimization with trajectories in different homotopy classes and comparing the resulting costs, we improve the quality of the solution in the presence of action and measurement uncertainties. In dynamic environments with time-varying constraints, such as moving obstacles or banned areas, the approach is extended to find collision-free trajectories. In this paper, the underlying spaces are continuous and the beliefs are non-Gaussian. Without obstacles, the optimization is a globally convex problem, while in the presence of obstacles it becomes locally convex. We demonstrate the performance of the method on different scenarios.

I. INTRODUCTION

Uncertainty is a feature of many robotic systems. Whether it is navigation underwater, aerially, or in a cluttered environment, uncertainty presents itself as an inevitable part of the problem. Dealing with uncertainty in any situation is a challenge. For a robust and reliable plan, the controller must solve a stochastic control problem [25, 26], which can be formulated as a Partially Observed Markov Decision Process (POMDP) [3, 45, 19]. However, unless the domain of the planning problem is finite and small, finding the optimal control law in a general set-up is extremely difficult [36, 30]. This is because the planner needs to compute an offline policy function for all conditional probability distributions of the state, referred to as belief states. Approaches that solve the problem by either building a decision tree of discretized observation-action pairs [44, 37], or maintaining a global discretization of the value function over continuous spaces, suffer from the exponential complexity known as the curse of dimensionality [19, 56]. A more significant problem is the exponential growth of the number of trajectories corresponding to action-observation pairs, known as the curse of history [37].

In contrast, deterministic motion-planning methods assume no uncertainty and aim to obtain collision-free paths. In these methods, the dynamics of the system and any kind of uncertainty involved in the problem have no impact on the final trajectory. Falling in this category are sampling-based approaches, such as Rapidly-exploring Random Tree (RRT) [27, 21] and Probabilistic RoadMap (PRM)-based [22, 23] methods, and trajectory optimization-based algorithms, such as Covariant Hamiltonian Optimization for Motion Planning (CHOMP) [57] and TrajOpt [43], which aim to obtain a collision-free trajectory.

However, real-world systems contain both motion and sensing uncertainty and require belief space planning. Assuming the belief is fully parameterized by a mean and covariance is a well-entrenched method in the literature. Early extensions of Linear Quadratic Gaussian (LQG)-based methods [6], such as iterative LQG (iLQG) [48], restricted attention to process uncertainty by assuming full observations. By incorporating observation uncertainty, roadmap-based [40, 17] and tree-based methods [50] apply LQG methodology to find a locally optimal nominal path, rather than constructing a trajectory. Van den Berg et al. [51] use stochastic Differential Dynamic Programming (sDDP) to extend LQG-MP [50] to roadmaps, which is then extended in [52] utilizing iLQG to perform the belief value-iterations. Methods described in [49, 38] perform filtering during the planning stage using the most likely observation in order to make the belief propagation deterministic. Algorithms that consider Monte-Carlo simulations on future observations, or methods such as [1, 51] which consider all future observations, can perform better. However, LQG-based methods always require an initial feasible solution, which can limit trajectories and affect planning. In the Feedback-Based Information RoadMap (FIRM) [2], a belief-MDP problem is solved over a belief graph. This method is advantageous because it breaks the curse of history; however, its belief stabilizations are LQG-based. In LQG-based methods, beyond the Gaussianity assumption on the uncertainty (which is insufficient in many situations), computational cost and scalability are a concern. In case of large deviations, unpredicted error accumulations force the controller to re-plan. Hence, methods that provide an output trajectory instead of a feedback policy must have a fast re-planning algorithm.
Thus, the application of methods such as [51], with their high computational complexity, is limited to problems with small domains.

Receding Horizon Control (RHC) or Model Predictive Control (MPC)-based methods [15, 31] are closely related methods, where the optimization problem involved in the inner control loop is required to compute quickly. In these methods, at each time step, a finite-horizon optimal control problem results in an optimal open-loop sequence of controls for the current state of the system, whose first element is applied to the plant [33, 32]. An appealing advantage of MPCs is that, unlike POMDP solvers that obtain an offline feedback policy prescribing the control for all (belief) states, MPCs have a natural online planning procedure for the current state of the plant and thus explore only the relevant portion of the belief space. Particularly, in problems with changing environments and moving objects, offline plans can become unreliable after the execution of a few steps of planned actions. Deterministic MPCs have received the most attention in the literature, and stochastic MPCs are still under development. A large body of the literature focuses on robust MPCs and on optimization in a tube of trajectories generated by propagating the initial state with several samples of the process uncertainties [18, 29]. These methods can lead to extreme conservativeness [28]. Another class of methods tackles the problem with linear process and observation models; however, these methods can lead to either non-convexity or high computational costs [16]. In problems with process uncertainty, Monte-Carlo based methods [4, 20], and related methods such as the scenario approach [12], have also been successful in providing probabilistic guarantees with high confidence for convex problems. However, in the presence of obstacles, most problems become inherently non-convex, which limits the application of such methods in robotics problems.

In this paper, a stochastic MPC is proposed for planning in the belief space. We define a finite-horizon optimal control problem with a terminal constraint that samples from an initial non-Gaussian belief and maps the belief-state samples to observation samples through usage of the observation model. The actions are planned with the aim of compressing the ensemble of observation trajectories. Therefore, the filtering equation, which usually poses a heavy burden on stochastic control problems, is avoided during the planning stage. Additionally, the "most likely observation" assumption is not used. The core problem in a convex environment is convexified for common non-linear observation models. Moreover, non-convex constraints are respected softly via line integration of barrier functions along the state trajectory. In a static environment, we apply the proposed optimization over trajectories in different homotopy classes to find the lowest-cost, smooth, collision-free trajectory with uncertainty reduction and minimum effort. It should be noted that, although the optimization is initialized with trajectories, our algorithm constructs a new trajectory, since the underlying trajectory only defines the homotopy class and is morphed continuously into a locally optimal one. Moreover, as elaborated in the simulations, the initial trajectory need not be completely feasible; if it is feasible, the optimization avoids morphing towards infeasible regions. In dynamic environments, where the obstacles can move or banned areas can change over time, the optimization problem is modified to respect the time-varying constraints via time-varying barrier functions. Therefore, in neither the static nor the dynamic case does the size of the optimization vector change, and the decision variables remain solely the control variables. Our approach, which simplifies the planning stage and reduces the computational complexity of solving the open-loop optimal control problem, significantly improves online navigation under uncertainty when compared to previous methods. Planning under uncertainty becomes more efficient through the reduced computational cost of our stochastic MPC, which is inherently flexible and scalable, enabling its usage in both static and dynamic environments. This flexibility allows our stochastic MPC to increase its scalability in the static environment, moving from locally optimal solutions towards a better approximation of a globally optimal solution by applying the algorithm over multiple homotopy classes. Furthermore, this scalability makes the treatment of dynamic environments with time-varying constraints or moving obstacles possible, which makes the algorithm suitable for online planning.

II. PROBLEM DEFINITION

In this section, we provide the general problem and other definitions.

System equations: We denote the state of the system by x ∈ X ⊂ R^{n_x}, the control action by u ∈ U ⊂ R^{n_u}, and the observation vector by z ∈ Z ⊂ R^{n_z}. We only consider holonomic systems in this paper, where the general system equations are x_{t+1} = f(x_t, u_t, ω_t) with z_t = h(x_t) + ν_t, where {ω_t} and {ν_t} are zero-mean, independent, identically distributed (iid), mutually independent random sequences, f : X × U × R^{n_x} → X defines the process (motion) model, and h : X → Z specifies the observation (measurement) model.

Belief (information state): The belief b_t : X × Z^t × U^t → R is the conditional distribution of x_t given the data history up to time t. It is defined as b_t(x, z_{0:t}, u_{0:t−1}, b_0) := p_{X_t|Z_{0:t};U_{0:t−1}}(x | z_{0:t}; u_{0:t−1}; b_0), and is denoted by b_t(x) or b_t in this paper [46, 25, 9, 24]. We use a non-Gaussian particle-filter representation of the belief in the belief space B by taking a number N of state samples {x_t^i}_{i=1}^N with importance weights {w_t^i}_{i=1}^N [47, 13, 42, 41], so that b_t(x) ≈ Σ_{i=1}^N w_t^i δ(x − x_t^i), where δ(·) is the Dirac delta mass. It can be proven that as the number of particles tends to infinity, the distribution of the particles tends to the true filtering distribution [14]. In our planning problem, we employ the Maximum-A-Posteriori (MAP) state estimate, defined as x̂_t = arg max_{x∈X} b_t(x). The belief dynamics follow Bayesian update equations, summarized as a function τ : B × U × Z → B, where b_{t+1} = τ(b_t, u_t, z_{t+1}) [5].
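For concreteness, the following is a minimal sketch of the weighted-particle belief, one Bayesian update b_{t+1} = τ(b_t, u_t, z_{t+1}) implemented as a bootstrap particle-filter step, and the MAP estimate approximated by the highest-weight particle. The 2D single-integrator process model, the range-only observation to a single hypothetical landmark, and the noise levels are illustrative assumptions, not the systems used in the paper.

```python
import numpy as np

# Minimal particle-filter belief sketch. The process model f, observation model h,
# landmark location, and noise levels below are stand-in assumptions for the
# generic f and h of Section II.
LANDMARK = np.array([1.0, 0.5])   # hypothetical information source
SIGMA_W, SIGMA_V = 0.02, 0.05     # process / measurement noise std. deviations

def f(x, u, w):                    # process model x_{t+1} = f(x_t, u_t, w_t)
    return x + u + w

def h(x):                          # observation model z_t = h(x_t) + v_t
    return np.array([np.linalg.norm(x - LANDMARK)])

def tau(particles, weights, u, z):
    """One Bayesian belief update b_{t+1} = tau(b_t, u_t, z_{t+1})."""
    # propagate each particle through the noisy process model
    prop = np.array([f(x, u, SIGMA_W * np.random.randn(2)) for x in particles])
    # reweight by the observation likelihood p(z | x)
    lik = np.array([np.exp(-0.5 * np.sum((z - h(x)) ** 2) / SIGMA_V ** 2) for x in prop])
    w = weights * lik
    return prop, w / w.sum()

def map_estimate(particles, weights):
    """Approximate MAP estimate x_hat_t = argmax_x b_t(x) (highest-weight particle)."""
    return particles[np.argmax(weights)]

# usage: an initial non-Gaussian belief represented by N weighted samples
N = 100
particles = 0.1 * np.random.randn(N, 2)
weights = np.ones(N) / N
u = np.array([0.05, 0.05])
z = h(np.array([0.05, 0.05])) + SIGMA_V * np.random.randn(1)
particles, weights = tau(particles, weights, u, z)
print("MAP estimate:", map_estimate(particles, weights))
```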
Problem 1 (General optimal control problem): For a holonomic system, given the initial belief state b_{t_0} and a goal state x_g, we define the following optimal control problem:

    min_π  E_π[ Σ_{t=0}^{K−1} c_t^π(x_t, u_t) + c_K(x_K) ]
    s.t.   b_{t+1} = τ(b_t, u_t, z_{t+1}),        (1a)
           x_{t+1} = f(x_t, u_t, ω_t),            (1b)
           z_t = h(x_t) + ν_t,                    (1c)
           φ_t^j(x_t) < 0,  j ∈ [1, n_c],         (1d)

where the optimization is over feasible policies, π = {π_0, · · · , π_t}, where π_t : Z^{t+1} → U specifies an action given the output, u_t = π_t(z_{0:t}). Moreover, c_t^π(·, ·) : X × U → R is the one-step cost function, φ_t^j(x_t) < 0 are n_c(K+1) inequality constraints, c_K^π(·) : X → R denotes the terminal cost, and the expectation is taken with respect to the policy π.

Now that we have defined our problem, we present the proposed solution for different variations of this problem.

III. OUR SOLUTION

A. Belief Space Planning

Calculating the solution of the belief space Problem 1 is intractable in general. One attempt to make the problem more tractable is to restrict the policy to strategies such as RHC, reduce the closed-loop optimization to an open-loop optimal control problem, and close the feedback loop by re-planning after the execution of each action to account for the updated belief. Even though the open-loop problem does not provide a feedback policy for all beliefs, it provides an open-loop optimal control sequence; the problem still involves optimization in the belief space, which is a function space. Particularly, since the observation space is continuous, the constraint of equation (1a) makes the problem computationally intractable. However, we will define our cost function such that the filtering equation is incorporated into the cost function. It can be shown that equation (1a) can then be omitted to make the optimization more tractable, while the approximate solution maintains its quality.

Linearized system equations: We linearize the process model around a nominal trajectory {x_t^p}_{t=t_0}^{t_0+K}, {u_t^p}_{t=t_0}^{t_0+K−1} for a lookahead time horizon of K, and linearize the observation model about the MAP state trajectory, to obtain:

    x_{t+1} = f_t^p + A_t x_t + B_t u_t + G_t ω_t,          (2a)
    z_t = h(x̂_t) + H(x̂_t)(x_t − x̂_t) + ν_t,                (2b)

where f_t^p := f(x_t^p, u_t^p, 0) − A_t x_t^p − B_t u_t^p, and the matrices A_t = ∂f(x, u, 0)/∂x|_{x_t^p, u_t^p}, B_t = ∂f(x, u, 0)/∂u|_{x_t^p, u_t^p}, and G_t = ∂f(x, u, 0)/∂ω|_{x_t^p, u_t^p} are constant during each time step (but can be time-dependent). In our planning problem we construct an optimal trajectory. Therefore, in holonomic systems and under saturation constraints, the initial nominal trajectory can be morphed to differ significantly from the final optimized trajectory. It is important to note that, although we linearize the observation model, the resulting Jacobian H(x̂_t) = ∂h(x)/∂x|_{x̂_t} is non-linear and state- and time-dependent for many observation models. In other words, H(x̂_t) is a function of the control variables.

Incremental cost: The one-step cost c(·, ·) : X × U → R is defined as

    c(x_t, u_t) = E_{b_t}[ (x_t − x̂_t)^T W(x̂_t)(x_t − x̂_t) + u_{t−1}^T V_t^u u_{t−1} ],          (3)

where W(x̂_t) := H(x̂_t)^T R(x̂_t) H(x̂_t) ≻ 0 and V_t^u ≻ 0 are positive definite matrices, and R : X → R^{n_z×n_z} is a proper weighting matrix, to be defined later. Conceptually, this special definition of the W matrix converts the weighted error of the state particles into a weighted error of the observation particles. In doing so, we incorporate the uncertainty reduction into the cost and avoid the filtering equation as a constraint of the optimization; this breaks the computational burden of the planning problem, avoids introducing new decision variables, avoids propagating the belief using most likely observations, and distinguishes our method from LQG-based methods and similar particle filter-based methods. This cost seeks to reduce the dispersion of the ensemble of observation trajectories in terms of the weighted covariance. In other words, the minimization seeks to reduce the uncertainty in the predicted observation, or equivalently in the estimate, which translates into shrinking the support of the belief distribution.
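As a concrete illustration of the incremental cost (3), the sketch below evaluates the information term, built from W(x̂_t) = H(x̂_t)^T R(x̂_t) H(x̂_t), for a particle cloud and adds the control-effort term. The range-to-landmark observation model, the choice R = I, and the unit control weight are assumptions made only for this example, not the paper's setup.

```python
import numpy as np

# Illustrative evaluation of the incremental cost in (3): the first term measures
# the dispersion of the state particles around the MAP estimate, weighted by
# W(x_hat) = H(x_hat)^T R(x_hat) H(x_hat), so it equals a weighted dispersion of
# the predicted observation particles. Landmarks and R = I are assumptions.
LANDMARKS = np.array([[1.0, 0.5], [0.0, 2.0]])

def H(x_hat):
    """Jacobian of the range observation model h_j(x) = ||x - L_j||."""
    diffs = x_hat - LANDMARKS                      # shape (n_z, n_x)
    ranges = np.linalg.norm(diffs, axis=1, keepdims=True)
    return diffs / ranges                          # row j = dh_j/dx at x_hat

def incremental_cost(particles, weights, x_hat, u_prev, R=None, V=None):
    n_z = LANDMARKS.shape[0]
    R = np.eye(n_z) if R is None else R
    V = np.eye(u_prev.size) if V is None else V
    Hx = H(x_hat)
    W = Hx.T @ R @ Hx                              # W(x_hat), positive semidefinite
    errs = particles - x_hat                       # (N, n_x) deviations from the MAP
    info_cost = float(np.sum(weights * np.einsum('ni,ij,nj->n', errs, W, errs)))
    effort_cost = float(u_prev @ V @ u_prev)
    return info_cost + effort_cost

# usage with a small particle cloud around a hypothetical MAP estimate
N = 50
particles = np.array([0.3, 0.4]) + 0.05 * np.random.randn(N, 2)
weights = np.ones(N) / N
x_hat = particles[np.argmax(weights)]
print(incremental_cost(particles, weights, x_hat, u_prev=np.array([0.1, 0.0])))
```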
Fig. 1: Comparison of our method (3) with the traditional belief propagation methods (1 and 2).

Comparison of our method with traditional approaches: Figure 1 graphically compares our method with traditional methods in the literature that tackle the open-loop problem. In order to perform the filtering equation, a previous belief and action and a current observation are required. In the planning stage, where the controller obtains the best sequence of future actions, a current belief is given; however, all that is known about a future observation is its likelihood distribution. As shown in the figure, in classic methods the initial belief is propagated using finitely many samples of the observation obtained from the likelihood distribution. Therefore, a decision tree over the future predicted beliefs is constructed so that the optimizer can obtain the best action for each level of the tree. Overall, this first method is computationally intensive. In the second popular class of methods, only the most likely observation is used to perform the filtering equations and propagate the belief. This method can be less accurate than the former, and although it provides a less expensive optimization, the filtering equation nevertheless remains part of the optimization constraints, which limits the application of this method to small domains. In our method, however, once the samples of the initial belief are propagated via the predicted model of the system, they are converted into observation particles by a proper usage of the observation model. Thus, a rope-like bundle of propagated observation particle strands is constructed from the initial belief samples, and with the help of the particular cost function defined above, the dispersion in the strands is minimized. Hence, the optimization not only morphs the rope towards regions that provide observations, but also seeks to compress the bundle towards the end of the horizon. As a result of the reduced uncertainty in the observation bundle, the belief itself shrinks, and the same results are obtained without performing the filtering equation. Therefore, using this idea, the main computational burden of the problem is broken and the much cheaper optimization yields the desired results. We provide more details as we proceed in the paper.

B. Convex Environment

In this subsection, we provide our solution to the first variant of Problem 1. Our assumption in this variant is that the inequalities φ_t^j(x̂_t) < 0 can only be convex, such as the ones that define the boundary of the problem or saturation constraints. In this setting, states that are far from the information sources are penalized more heavily, which penalizes the corresponding observation trajectories as well.

Convexifying the first term of the cost: Given the differentiable observation model h(x) = [h_j(x)] and its Jacobian H(x) = [H_j(x)], where H_j(x) = (∂h_j(x)/∂x_1, · · · , ∂h_j(x)/∂x_{n_x}) for 1 ≤ j ≤ n_z, if R(x) = diag(R_j(x)) is the diagonal matrix of the R_j(x)'s corresponding to (uncorrelated) observations, then, extending the definition of g to include the matrix R, we have

    d^T H(x)^T R(x) H(x) d = Σ_{j=1}^{n_z} R_j(x) (d · H_j(x)^T)^2,

which is a sum of positive convex functions as in Lemma 1. Thus, the first term of the cost in equation (3) can be convexified. In our cost function, the vector d represents any of the vectors e_t^i := (1/√N) Ã_{t_0:t−1}(x_{t_0}^i − x̂_{t_0}) ∈ R^{n_x} for 1 ≤ i ≤ N, summarized in the vector e_t := ((e_t^1)^T, (e_t^2)^T, · · · , (e_t^N)^T)^T ∈ R^{N n_x}, which is constant at each time step t.

Problem 2 (Core convex problem in convex feasible space): Define W̄(x̂_t) := BlockDiag(W(x̂_t)) with N equal diagonal blocks of W(x̂_t). Moreover, define the cost of information as cost_info(x̂_t) := e_t^T W̄(x̂_t) e_t and the cost of control effort as cost_eff(u_t) := u_t^T V_{t+1}^u u_t. Under the assumptions of holonomic systems and a convex environment, and given the initial re-sampled set of particles at time step t_0, {x_{t_0}^i}_{i=1}^k, and a goal state x_g, the core convex SLAP problem is:

    min_{u_{t_0:t_0+K−1}}  Σ_{t=t_0+1}^{t_0+K} [ cost_info(x̂_t) + cost_eff(u_{t−1}) ]
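The receding-horizon use of this open-loop problem can be summarized as: solve over the K-step horizon, apply only the first control, re-estimate, and re-plan. The sketch below illustrates that loop under stand-in assumptions: a 2D single-integrator model, a single range-only landmark with R = I, the state-transition Jacobian taken as identity so the deviation vectors e_t^i stay constant, and a soft quadratic goal term in place of the paper's terminal constraint and barrier functions. It is not the paper's implementation.

```python
import numpy as np
from scipy.optimize import minimize

# Hedged receding-horizon sketch around the core SLAP objective:
# cost_info + cost_eff over a K-step horizon, first control applied, then re-plan.
LANDMARK = np.array([1.0, 0.5])       # hypothetical information source
X_GOAL = np.array([2.0, 2.0])         # goal state (soft term here, an assumption)
K, N = 8, 30                          # horizon length, number of particles

def H(x_hat):                         # Jacobian of the range observation model
    d = x_hat - LANDMARK
    return (d / np.linalg.norm(d)).reshape(1, -1)

def open_loop_cost(u_flat, particles, x_hat):
    u = u_flat.reshape(K, 2)
    # deviations e_t^i with the state-transition Jacobian taken as identity
    e = (particles - x_hat) / np.sqrt(len(particles))
    cost, x = 0.0, x_hat.copy()
    for t in range(K):
        x = x + u[t]                                      # nominal (MAP) trajectory
        Ht = H(x)
        W = Ht.T @ Ht                                     # W(x_hat_t) with R = I
        cost += np.sum(np.einsum('ni,ij,nj->n', e, W, e)) # cost_info term
        cost += 0.1 * u[t] @ u[t]                         # cost_eff term
    cost += 10.0 * np.sum((x - X_GOAL) ** 2)              # soft goal term (assumption)
    return cost

particles = 0.1 * np.random.randn(N, 2)
x_hat = particles.mean(axis=0)
for step in range(3):                                     # receding-horizon outer loop
    res = minimize(open_loop_cost, np.zeros(2 * K), args=(particles, x_hat))
    u0 = res.x.reshape(K, 2)[0]                           # apply only the first control
    particles = particles + u0                            # stand-in for execution + re-estimation
    x_hat = particles.mean(axis=0)
    print(f"step {step}: applied u0 = {u0}, MAP ~ {x_hat}")
```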
C. Dynamic Environment

In this scenario, we simulate a case where there are four objects, starting from a common position and moving in different directions, downwards and towards the right of the map. The robot starts from a distribution whose mean is at (0, 0) and wishes to reach the goal state (2, 2) with high probability. As seen in Fig. 7, at the beginning of its trajectory the robot heads towards the landmark at (1, 0.5), and as the moving obstacles get closer, it changes its direction to bypass the objects in the opposite direction. In this scenario, the initial trajectory is just the straight line between the most probable initial location of the robot and the final destination, shown in Fig. 7.

Fig. 8: Moving object. The robot spends most of its time near the information source and avoids the object, which is moving in a spiral path, and heads towards the goal region safely.

Fig. 6: YouBot base control. There are three obstacles and two landmarks. The robot base is shown by a rectangle with a line at the heading. Initial and planned trajectories are depicted by dashed and solid lines, respectively.

V. CONCLUSION

In this paper, we have proposed a method for controlling a stochastic system starting from a configuration in the state space to reach a goal region. We have proposed a barrier function method that, combined with our optimization and gradient descent methods, such as an interior point method, keeps a trajectory in its homotopy class and finds the locally optimal path. The optimization seeks a smooth trajectory that results in the lowest combined cost of uncertainty in the robot's perception and energy effort, while avoiding collision with obstacles and reaching the goal destination with high probability. Moreover, by finding trajectories in different homotopy classes and optimizing over paths in all classes, we can compare the costs incurred by those trajectories. Therefore, we find the lowest-cost trajectory that accounts for the cost of uncertainty incurred along the trajectories, which is the closest trajectory to the globally optimal one, while avoiding the need to solve the belief space dynamic programming equations. Finally, our receding horizon control strategy and the low computational cost of the optimization allow us to incorporate dynamic
obstacles, as well as new objects that are detected along the trajectory of the robot, by adjusting the optimization and the barrier functions during execution. This allows us to perform belief space planning in an online manner. Our future work is aimed at extending our results, analyzing the performance and sensitivities, and executing the methods on real robots.

ACKNOWLEDGMENT

This material is based upon work partially supported by NSF under Contract Nos. CNS-1646449 and Science & Technology Center Grant CCF-0939370, the U.S. Army Research Office under Contract No. W911NF-15-1-0279, and NPRP grant NPRP 8-1531-2-651 from the Qatar National Research Fund, a member of Qatar Foundation.

BIBLIOGRAPHY

[1] A. Agha-Mohammadi, S. Chakravorty, and N. Amato. FIRM: Feedback controller-based information-state roadmap - a framework for motion planning under uncertainty. In Intelligent Robots and Systems (IROS), 2011 IEEE/RSJ International Conference on, pages 4284-4291. IEEE, 2011.
[2] A. Agha-Mohammadi, S. Chakravorty, and N. Amato. FIRM: Sampling-based feedback motion planning under motion uncertainty and imperfect measurements. The International Journal of Robotics Research, page 0278364913501564, 2013.
[3] K. Astrom. Optimal control of Markov decision processes with incomplete state estimation. Journal of Mathematical Analysis and Applications, 10:174-205, 1965.
[4] Daniele Bernardini and Alberto Bemporad. Scenario-based model predictive control of stochastic constrained linear systems. In Decision and Control, 2009 held jointly with the 2009 28th Chinese Control Conference (CDC/CCC 2009), Proceedings of the 48th IEEE Conference on, pages 6333-6338. IEEE, 2009.
[5] Dimitri Bertsekas. Dynamic Programming and Optimal Control, 3rd Ed. Athena Scientific, 2007.
[6] Dimitri P. Bertsekas. Dynamic Programming and Optimal Control, volume 1. Athena Scientific, Belmont, MA, 1995.
[7] Subhrajit Bhattacharya, Vijay Kumar, and Maxim Likhachev. Search-based path planning with homotopy class constraints. In Third Annual Symposium on Combinatorial Search, 2010.
[8] Subhrajit Bhattacharya, Maxim Likhachev, and Vijay Kumar. Topological constraints in search-based robot path planning. Autonomous Robots, 33(3):273-290, 2012.
[9] M. Boloursaz, R. Kazemi, B. Barazandeh, and F. Behnia. Bounds on compressed voice channel capacity. In Communication and Information Theory (IWCIT), 2014 Iran Workshop on, pages 1-6. IEEE, 2014.
[10] Stephen Boyd and Lieven Vandenberghe. Convex Optimization. Cambridge University Press, 2004.
[11] Sébastien Bubeck. Theory of convex optimization for machine learning. arXiv preprint arXiv:1405.4980, 2014.
[12] Giuseppe C. Calafiore and Marco C. Campi. The scenario approach to robust control design. Automatic Control, IEEE Transactions on, 51(5):742-753, 2006.
[13] D. Crisan and A. Doucet. A survey of convergence results on particle filtering methods for practitioners. IEEE Transactions on Signal Processing, 50(3), 2002.
[14] A. Doucet, J.F.G. de Freitas, and N.J. Gordon. Sequential Monte Carlo Methods in Practice. New York: Springer, 2001.
[15] Carlos E. Garcia, David M. Prett, and Manfred Morari. Model predictive control: theory and practice - a survey. Automatica, 25(3):335-348, 1989.
[16] Peter Hokayem, Debasish Chatterjee, Federico Ramponi, Georgios Chaloulos, and John Lygeros. Stable stochastic receding horizon control of linear systems with bounded control inputs. In Int. Symposium on Mathematical Theory of Networks and Systems (MTNS), pages 31-36, 2010.
[17] V. Huynh and N. Roy. icLQG: combining local and global optimization for control in information space. In IEEE International Conference on Robotics and Automation (ICRA), 2009.
[18] Zhong-Ping Jiang and Yuan Wang. Input-to-state stability for discrete-time nonlinear systems. Automatica, 37(6):857-869, 2001.
[19] L. P. Kaelbling, M. L. Littman, and A. R. Cassandra. Planning and acting in partially observable stochastic domains. Artificial Intelligence, 101:99-134, 1998.
[20] Nikolas Kantas, J.M. Maciejowski, and A. Lecchini-Visintini. Sequential Monte Carlo for model predictive control. In Nonlinear Model Predictive Control, pages 263-273. Springer, 2009.
[21] Sertac Karaman and Emilio Frazzoli. Sampling-based algorithms for optimal motion planning. International Journal of Robotics Research, 30(7):846-894, June 2011.
[22] L.E. Kavraki, P. Švestka, J.C. Latombe, and M. Overmars. Probabilistic roadmaps for path planning in high-dimensional configuration spaces. IEEE Transactions on Robotics and Automation, 12(4):566-580, 1996.
[23] L.E. Kavraki, M.N. Kolountzakis, and J.C. Latombe. Analysis of probabilistic roadmaps for path planning. IEEE Transactions on Robotics and Automation, 14:166-171, February 1998.
[24] Seyed-Mahdi Kazempour-Radi and Mohammad-Hussein Rafiei-Sakhaei. A utilization of wireless sensor network in smart homes, reviewing its combination with modern cellular networks. In Communication and Industrial Application (ICCIA), 2011 International Conference on, pages 1-5. IEEE, 2011.
[25] P. R. Kumar and P. P. Varaiya. Stochastic Systems: Estimation, Identification, and Adaptive Control. Prentice-Hall, Englewood Cliffs, NJ, 1986.
[26] P.R. Kumar et al. Control: a perspective. Automatica, 50(1):3-43, 2014.
[27] Steve Lavalle and J. Kuffner. Randomized kinodynamic planning. International Journal of Robotics Research, 20(378-400), 2001.
[28] M. Lazar, W.P.M.H. Heemels, and A.R. Teel. Further input-to-state stability subtleties for discrete-time systems. Automatic Control, IEEE Transactions on, 58(6):1609-1613, June 2013. ISSN 0018-9286. doi: 10.1109/TAC.2012.2231611.
[29] D. Limon, T. Alamo, D.M. Raimondo, D. Munoz de la Pena, J.M. Bravo, A. Ferramosca, and E.F. Camacho. Input-to-state stability: a unifying framework for robust model predictive control. In Nonlinear Model Predictive Control, pages 1-26. Springer, 2009.
[30] O. Madani, S. Hanks, and A. Condon. On the undecidability of probabilistic planning and infinite-horizon partially observable Markov decision problems. In Proceedings of the Sixteenth Conference on Artificial Intelligence (AAAI), pages 541-548, 1999.
[31] David Q. Mayne. Model predictive control: Recent developments and future promise. Automatica, 50(12):2967-2986, 2014.
[32] David Q. Mayne and Hannah Michalska. Receding horizon control of nonlinear systems. Automatic Control, IEEE Transactions on, 35(7):814-824, 1990.
[33] David Q. Mayne, James B. Rawlings, Christopher V. Rao, and Pierre O.M. Scokaert. Constrained model predictive control: Stability and optimality. Automatica, 36(6):789-814, 2000.
[34] Nima Moshtagh. Minimum volume enclosing ellipsoid. Convex Optimization, 111:112, 2005.
[35] Arkadii Nemirovsky. Problem complexity and method efficiency in optimization.
[36] C. Papadimitriou and J. N. Tsitsiklis. The complexity of Markov decision processes. Mathematics of Operations Research, 12(3):441-450, 1987.
[37] J. Pineau, G. Gordon, and S. Thrun. Point-based value iteration: An anytime algorithm for POMDPs. In International Joint Conference on Artificial Intelligence, pages 1025-1032, 2003.
[38] R. Platt, R. Tedrake, L. Kaelbling, and T. Lozano-Perez. Belief space planning assuming maximum likelihood observations. In Robotics: Science and Systems (RSS), 2010.
[39] Robert Platt. Convex receding horizon control in non-Gaussian belief space. In Algorithmic Foundations of Robotics X, pages 443-458. Springer, 2013.
[40] Sam Prentice and Nicholas Roy. The belief roadmap: Efficient planning in belief space by factoring the covariance. International Journal of Robotics Research, 28(11-12), October 2009.
[41] Mohammadhussein Rafieisakhaei, Amirhossein Tamjidi, and Suman Chakravorty. On-line MPC-based stochastic planning in the non-Gaussian belief space with non-convex constraints. 2015.
[42] Mohammadhussein Rafieisakhaei, Amirhossein Tamjidi, Suman Chakravorty, and P.R. Kumar. Feedback motion planning under non-Gaussian uncertainty and non-convex state constraints. In 2016 IEEE International Conference on Robotics and Automation (ICRA), pages 4238-4244. IEEE, 2016.
[43] John Schulman, Yan Duan, Jonathan Ho, Alex Lee, Ibrahim Awwal, Henry Bradlow, Jia Pan, Sachin Patil, Ken Goldberg, and Pieter Abbeel. Motion planning with sequential convex optimization and convex collision checking. The International Journal of Robotics Research, 33(9):1251-1270, 2014.
[44] Guy Shani, Joelle Pineau, and Robert Kaplow. A survey of point-based POMDP solvers. Autonomous Agents and Multi-Agent Systems, 27:1-51, 2013.
[45] R. D. Smallwood and E. J. Sondik. The optimal control of partially observable Markov processes over a finite horizon. Operations Research, 21(5):1071-1088, 1973.
[46] E. J. Sondik. The optimal control of partially observable Markov processes. PhD thesis, Stanford University, 1971.
[47] Sebastian Thrun, Wolfram Burgard, and Dieter Fox. Probabilistic Robotics. MIT Press, 2005.
[48] Emanuel Todorov and Weiwei Li. A generalized iterative LQG method for locally-optimal feedback control of constrained nonlinear stochastic systems. In American Control Conference, 2005. Proceedings of the 2005, pages 300-306. IEEE, 2005.
[49] Noel Du Toit and Joel W. Burdick. Robotic motion planning in dynamic, cluttered, uncertain environments. In ICRA, May 2010.
[50] Jur Van Den Berg, Pieter Abbeel, and Ken Goldberg. LQG-MP: Optimized path planning for robots with motion uncertainty and imperfect state information. The International Journal of Robotics Research, 30(7):895-913, 2011.
[51] Jur Van den Berg, Sachin Patil, and Ron Alterovitz. Motion planning under uncertainty using differential dynamic programming in belief space. In Proc. of International Symposium of Robotics Research (ISRR), 2011.
[52] Jur Van Den Berg, Sachin Patil, and Ron Alterovitz. Motion planning under uncertainty using iterative local optimization in belief space. The International Journal of Robotics Research, 31(11):1263-1278, 2012.
[53] S.J. Yakowitz. Algorithms and computational techniques in differential dynamic programming. Control and Dynamical Systems: Advances in Theory and Applications, 31:75-91, 2012.
[54] Youbot-store.com. YouBot 3D model - YouBot wiki, 2016. URL http://www.youbot-store.com/wiki/index.php/YouBot 3D Model.
[55] zakharov. zakharov youbot model, 2011. URL https://github.com/zakharov/youbot model/wiki/KUKA-youBot-kinematics,-dynamics-and-3D-model.
[56] R. Zhou and E. A. Hansen. An improved grid-based approximation algorithm for POMDPs. IJCAI, 2001.
[57] Matt Zucker, Nathan Ratliff, Anca D. Dragan, Mihail Pivtoraiko, Matthew Klingensmith, Christopher M. Dellin, J. Andrew Bagnell, and Siddhartha S. Srinivasa. CHOMP: Covariant Hamiltonian optimization for motion planning. The International Journal of Robotics Research, 32(9-10):1164-1193, 2013.