Int. Game Theory Rev. 2010.12:1-17. Downloaded from www.worldscientific.com
by INSTITUTE OF MATHEMATICS AND MECHANICS OF URAL BRANCHE OF RAS on 02/28/13. For personal use only.
International Game Theory Review, Vol. 12, No. 1 (2010) 1–17
© World Scientific Publishing Company
DOI: 10.1142/S0219198910002489
GUARANTEED STRATEGIES FOR NONLINEAR
MULTI-PLAYER PURSUIT-EVASION GAMES∗
DUŠAN M. STIPANOVIĆ†, ARIK MELIKYAN
and NAIRA HOVAKIMYAN‡
†Department of Industrial and Enterprise Systems
Engineering and the Coordinated Science Laboratory
University of Illinois at Urbana-Champaign
Urbana, IL 61801, USA
dusan@illinois.edu
‡Department of Mechanical Science and Engineering
University of Illinois at Urbana-Champaign
Urbana, IL 61801, USA
nhovakim@illinois.edu
In this paper, we provide a methodology to design strategies for either guaranteed
capture or guaranteed evasion in the case of pursuit-evasion games with multiple players
which are represented by nonlinear dynamic models. This methodology is based on
the continuously differentiable upper and lower approximations of the minimum and
maximum function of an arbitrary number of arguments, comparison principle, and
differential inequalities.
Keywords: Pursuit-evasion games; differential inequalities; multi-player dynamic games;
Liapunov analysis.
1. Introduction
Consideration of optimal strategies for dynamic pursuit-evasion games dates back to
the original work of Isaacs (1965). The problem of “pursuing a moving object with
another controlled object” was formulated as a stochastic optimal control problem
in Pontryagin et al. (1962). It is important to point out a significant contribution
in the theory of pursuit-evasion games based on extremal aiming method that was
introduced and developed by Krasovskii and Subbotin (see Krasovskii and Subbotin
(1988), Subbotin (1995)). In order to deal with the nondifferentiability of solutions
of the Hamilton-Jacobi partial differential equations, viscosity solutions [Crandall
and Lions (1983), Bardi and Capuzzo-Dolcetta (1997)] and so-called “minmax” solutions [Subbotin (1984)] were independently introduced. Numerical approximations
∗ This paper appears almost a year and a half after the untimely death of Arik Melikyan. D. M.
Stipanović and N. Hovakimyan would like to dedicate this paper to his memory.
† Corresponding author.
based on viscosity solutions were provided in Falcone and Ferretti (1994), Bardi
and Capuzzo-Dolcetta (1997). Some particular strategies for the players in various multi-player pursuit-evasion games were presented in Hagedorn and Breakwell
(1976), Breakwell and Hagedorn (1979), Melikyan (1981), Pashkov and Terekhov
(1987), Levchenkov and Pashkov (1990), Petrosjan (1993), Petrov (1994), Vagin
and Petrov (2002), Petrov (2003). For an application of generalized characteristics
of partial differential equations to differential pursuit-evasion games we refer to the
results reported in Melikyan (1981, 1998).
Another interesting and important scenario, in which the players obtain information only at discrete time instants, was first considered in Melikyan (1973) for the case of one pursuer and one evader. This result was later generalized in Olsder and Pourtallier (1995), using a cost of information, to more complex models of the pursuer and the evader. Finally, pursuit-evasion games with several pursuers and one evader under discrete observations were studied in Melikyan and Pourtallier (1996), based on a comparison with solutions to differential games in which continuous observations are available to the players. The problem in which one of the two players is provided with delayed observations was considered in Chernousko and Melikyan (1975).
In this paper we follow another approach to defining strategies in pursuit-evasion games, which is based on a Liapunov type of analysis [Stipanović et al.
(2004)]. Instead of solving a Hamilton-Jacobi-Isaacs partial differential equation
for a value function, a specific function of the norms of relative distances between
pursuers and evaders is considered. This function is appropriately chosen from the
set of functions that are continuously differentiable and represent approximations
of the minimum and maximum function. Strategies are then formulated by either
maximizing or minimizing the growth, that is, the time derivative of the corresponding differentiable Liapunov-like function. One of the most important features of this
approach is that the methodology is applicable to a wide class of linear multi-player
pursuit-evasion games. These results were later extended in Stipanović et al. (2009)
to include more complicated nonlinear models that are affine in control strategies
for the players in the game. The pursuit-evasion games considered were restricted
to the case of two pursuers and two evaders. In this paper we generalize these results
to multiple pursuit-evasion games by introducing convergent approximations of the
minimum and maximum function of an arbitrary number of arguments and by using
less restrictive comparison results.
The organization of the paper is as follows. In Sec. 2 we introduce functions that
represent convergent lower and upper approximations of both the minimum and
the maximum function. Some of the most general comparison results are recalled
in Sec. 3. Guaranteed strategies for either capture or evasion of the evaders are
provided in Sec. 4. Finally, as an illustration of the proposed methodology, we
consider pursuit-evasion games with nonidentical nonholonomic players described
by the unicycle model in Sec. 5.
2. Properties of the Minimum and the Maximum
Function Approximations
In this section we generalize the functions introduced in Stipanović et al. (2009), which approximate the minimum and the maximum of two arguments, to the case of an arbitrary number of arguments. These functions will later be used to establish sufficient conditions for either guaranteed capture or guaranteed evasion of all or some of the evaders. As a starting point, let us assume that we are given $N$ positive numbers $a_i$, $i \in \mathbf{N}$, where $\mathbf{N} = \{1, \ldots, N\}$. In order to approximate the minimum function from below, we consider the following function:
$$\underline{\sigma}_\delta(a_1, \ldots, a_N) = \frac{1}{\sqrt[\delta]{\sum_{i=1}^{N} a_i^{-\delta}}}, \quad \delta > 0 \tag{1}$$
and similarly for the approximation from above, we consider the following function:
$$\overline{\sigma}_\delta(a_1, \ldots, a_N) = \frac{\sqrt[\delta]{N}}{\sqrt[\delta]{\sum_{i=1}^{N} a_i^{-\delta}}}, \quad \delta > 0. \tag{2}$$
Let us denote $a_m = \min_{i \in \mathbf{N}}\{a_i\}$ and define $m$ as a variable taking the integer value $j$ representing the index of a minimal $a_j$, that is $m = j$. Notice that if the minimum value is achieved by more than one argument, we can choose any of the corresponding indices without any loss of generality. Now, we can state the following theorem:
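Theorem 2.1. The minimum approximation functions satisfy the following properties:
$$\underline{\sigma}_\delta \le a_m \le \overline{\sigma}_\delta, \quad \forall\, \delta > 0, \tag{3}$$
$$\lim_{\delta \to \infty} \underline{\sigma}_\delta = \lim_{\delta \to \infty} \overline{\sigma}_\delta = a_m. \tag{4}$$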
Proof. First notice that the approximation functions may be written as:
$$\underline{\sigma}_\delta = \frac{a_m}{\sqrt[\delta]{1 + \sum_{i \ne m} (a_m/a_i)^\delta}}, \tag{5}$$
$$\overline{\sigma}_\delta = \frac{a_m \sqrt[\delta]{N}}{\sqrt[\delta]{1 + \sum_{i \ne m} (a_m/a_i)^\delta}}. \tag{6}$$
Also,
$$\lim_{\delta \to \infty} \sqrt[\delta]{1 + \sum_{i=1}^{N} c_i^\delta} = 1 \quad \text{if } (\forall\, i \in \{1, \ldots, N\})\ (c_i \in [0, 1])$$
which is true due to the following:
$$1 \le \sqrt[\delta]{1 + \sum_{i=1}^{N} c_i^\delta} \le \sqrt[\delta]{1 + N} \ \text{ if } (c_i \in [0, 1]) \quad \text{and} \quad \lim_{\delta \to \infty} \sqrt[\delta]{1 + N} = 1. \tag{7}$$
Finally, since $a_m/a_i \le 1$ for all $i \in \{1, \ldots, N\}$ and $N \ge 1 + \sum_{i \ne m} (a_m/a_i)^\delta$, we conclude that the statements of the theorem are true.
Another interesting feature that is easy to show is that the minimum approximation functions behave well for any finite positive $\delta$ when the minimum approaches zero, that is,
$$\lim_{a_m \to 0} \underline{\sigma}_\delta = \lim_{a_m \to 0} \overline{\sigma}_\delta = 0 \tag{8}$$
which is a direct consequence of Eqs. (5) and (6).
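The sandwich property (3) and the convergence (4) can be checked numerically. The following is a minimal sketch, assuming the forms of Eqs. (1) and (2) given above; the function names `sigma_lower` and `sigma_upper` and the sample values are illustrative choices, not taken from the paper.

```python
def sigma_lower(a, delta):
    """Lower approximation of min(a), Eq. (1): (sum a_i^-delta)^(-1/delta)."""
    return sum(x ** (-delta) for x in a) ** (-1.0 / delta)


def sigma_upper(a, delta):
    """Upper approximation of min(a), Eq. (2): (N / sum a_i^-delta)^(1/delta)."""
    return (len(a) / sum(x ** (-delta) for x in a)) ** (1.0 / delta)


a = [2.0, 3.0, 5.0]  # illustrative positive arguments (assumed values)
for delta in (1, 3, 10, 50):
    lo, hi = sigma_lower(a, delta), sigma_upper(a, delta)
    # Sandwich property (3), with a small tolerance for rounding.
    assert lo <= min(a) + 1e-12 and min(a) <= hi + 1e-12
    print(f"delta={delta:3d}  lower={lo:.6f}  upper={hi:.6f}")
```

As $\delta$ grows, both printed values approach $\min(a) = 2$, with the upper bound converging more slowly because of the factor $\sqrt[\delta]{N}$.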
In order to approximate the maximum function from below, we introduce the following function:
$$\underline{\rho}_\delta(a_1, \ldots, a_N) = \sqrt[\delta]{\frac{\sum_{i=1}^{N} a_i^\delta}{N}}, \quad \delta > 0 \tag{9}$$
and similarly for the approximation from above, we propose to use
$$\overline{\rho}_\delta(a_1, \ldots, a_N) = \sqrt[\delta]{\sum_{i=1}^{N} a_i^\delta}, \quad \delta > 0. \tag{10}$$
Let us denote $a_M = \max_{i \in \mathbf{N}}\{a_i\}$ and define $M$ as a variable taking the integer value of the index of a maximal $a_j$, that is $M = j$. Again, notice that if the set of maximal variables has more than one element we can choose any one of them without any loss of generality. Now, analogously to the case of approximating the minimum, we formulate the following theorem:
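Theorem 2.2. The convergent maximum approximation functions satisfy the following properties:
$$\underline{\rho}_\delta \le a_M \le \overline{\rho}_\delta, \quad \forall\, \delta > 0, \tag{11}$$
$$\lim_{\delta \to \infty} \underline{\rho}_\delta = \lim_{\delta \to \infty} \overline{\rho}_\delta = a_M. \tag{12}$$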
Proof. Notice that the approximation functions can be rewritten as:
$$\underline{\rho}_\delta = a_M \frac{\sqrt[\delta]{1 + \sum_{i \ne M} (a_i/a_M)^\delta}}{\sqrt[\delta]{N}}, \qquad \overline{\rho}_\delta = a_M \sqrt[\delta]{1 + \sum_{i \ne M} (a_i/a_M)^\delta}. \tag{13}$$
Again, using the following property:
$$\lim_{\delta \to \infty} \sqrt[\delta]{1 + \sum_{i=1}^{N} c_i^\delta} = 1 \quad \text{if } (\forall\, i \in \{1, \ldots, N\})\ (c_i \in [0, 1]), \tag{14}$$
the fact that $a_i/a_M \le 1$ for all $i \in \{1, \ldots, N\}$, and $N \ge 1 + \sum_{i \ne M} (a_i/a_M)^\delta$, we conclude that the statements of the theorem are true.
Finally, it is interesting to note that the lower and the upper convergent approximations of both the minimum and the maximum function may be linked to the
constant elasticity of substitution (CES) functions with particular coefficients, multiplying the arguments, that are either 1 or 1/N , respectively (for more details see
Luenberger (1995)).
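The maximum approximations can be checked in the same way. The sketch below assumes the forms of Eqs. (9) and (10): the lower approximation is the power mean of order $\delta$ and the upper approximation is the $\ell_\delta$ norm of the argument vector. The names `rho_lower` and `rho_upper` and the sample values are illustrative.

```python
def rho_lower(a, delta):
    """Lower approximation of max(a), Eq. (9): power mean of order delta."""
    return (sum(x ** delta for x in a) / len(a)) ** (1.0 / delta)


def rho_upper(a, delta):
    """Upper approximation of max(a), Eq. (10): the l_delta norm of a."""
    return sum(x ** delta for x in a) ** (1.0 / delta)


a = [2.0, 3.0, 5.0]  # illustrative positive arguments (assumed values)
for delta in (1, 3, 10, 50):
    lo, hi = rho_lower(a, delta), rho_upper(a, delta)
    # Sandwich property (11), with a small tolerance for rounding.
    assert lo <= max(a) + 1e-12 and max(a) <= hi + 1e-12
    print(f"delta={delta:3d}  lower={lo:.6f}  upper={hi:.6f}")
```

Here the upper bound converges quickly to $\max(a) = 5$, while the lower bound carries the slowly decaying factor $1/\sqrt[\delta]{N}$.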
3. Comparison Principle Theorems
In this section we recall the comparison principle theorems to be used for proving
that the strategies of the players would guarantee either capture of all evaders or
their evasion from the pursuers. The following theorem [Lakshmikantham et al.
(1989)] will be used in proving guaranteed capture results:
Theorem 3.1. Let $v \in C[R_+ \times R^n, R_+]$ be such that $v(t, x)$ is locally Lipschitzian in $x$. Assume that $G \in C[R_+ \times R^n \times R_+, R]$ and for $(t, x) \in R_+ \times R^n$,
$$Dv(t, x) \le G(t, x, v(t, x)). \tag{15}$$
Let $x(t) = x(t, t_0, x_0)$ be any solution of $\dot{x} = f(t, x)$, $x(t_0) = x_0$, $t_0 \in R_+$, where $f \in C[R_+ \times R^n, R^n]$, existing on $[t_0, \infty)$. Also, let us assume that $r(t, t_0, x_0, u_0)$ is the maximal solution of
$$\dot{u} = G(t, x(t), u), \quad u(t_0) = u_0 \tag{16}$$
existing for $t \ge t_0$. Then $v(t_0, x_0) \le u_0$ implies
$$v(t, x(t)) \le r(t, t_0, x_0, u_0), \quad t \ge t_0. \tag{17}$$
In the formulation of Theorem 3.1, D represents any Dini derivative [Bainov
and Simeonov (1992)] yet we assume that v(t, x) is continuously differentiable in
the domains of interest so that all Dini derivatives coincide with the standard total
time derivative d/dt. Also, we use dx/dt = ẋ when function x(·) is only a function
of time, that is, x ≡ x(t). Finally, R denotes the set of real numbers, R+ = [0, ∞),
and C[D1 , D2 ] denotes the set of all continuous functions with domain D1 and
codomain D2 [Lakshmikantham and Leela (1969)]. For more details on notation
and comparison results we refer to Lakshmikantham and Leela (1969, 1989) and
Bainov and Simeonov (1992).
In order to establish guaranteed evasion results we need the following straightforward modification of Theorem 3.1 which is justified along the lines of the
basic arguments provided in Lakshmikantham and Leela (1969) and Bainov and
Simeonov (1992).
Theorem 3.2. Let $v \in C[R_+ \times R^n, R_+]$ be such that $v(t, x)$ is locally Lipschitzian in $x$. Assume that $g \in C[R_+ \times R^n \times R_+, R]$ and for $(t, x) \in R_+ \times R^n$,
$$Dv(t, x) \ge g(t, x, v(t, x)). \tag{18}$$
Let $x(t) = x(t, t_0, x_0)$ be any solution of $\dot{x} = f(t, x)$, $x(t_0) = x_0$, $t_0 \in R_+$, where $f \in C[R_+ \times R^n, R^n]$, existing on $[t_0, \infty)$. Also, let us assume that $z(t, t_0, x_0, q_0)$ is the minimal solution of
$$\dot{q} = g(t, x(t), q), \quad q(t_0) = q_0 \tag{19}$$
existing for $t \ge t_0$. Then $v(t_0, x_0) \ge q_0$ implies
$$v(t, x(t)) \ge z(t, t_0, x_0, q_0), \quad t \ge t_0. \tag{20}$$
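To make the comparison principle of Theorem 3.1 concrete, the following sketch checks it numerically on a toy scalar system that is not taken from the paper: for $\dot{x} = -x - x^3$ and $v(x) = x^2$ we have $dv/dt = -2x^2 - 2x^4 \le -2v$, so $G(t, x, u) = -2u$ and the comparison solution is $r(t) = v(x_0) e^{-2t}$. The system, the step size and the function name `comparison_gap` are assumptions made only for illustration.

```python
import math


def comparison_gap(x0=1.5, dt=1e-3, t_end=3.0):
    """Return the smallest value of r(t) - v(x(t)) seen along a simulated trajectory.

    The trajectory of x' = -x - x^3 is integrated with forward Euler, and
    r(t) = v(x0) * exp(-2 t) is the exact solution of u' = -2 u, u(0) = v(x0).
    """
    x, t = x0, 0.0
    v0 = x0 ** 2
    smallest_gap = float("inf")
    while t < t_end:
        x += dt * (-x - x ** 3)  # forward Euler step of the toy system
        t += dt
        v = x ** 2               # Liapunov-like function along the trajectory
        r = v0 * math.exp(-2.0 * t)
        smallest_gap = min(smallest_gap, r - v)
    return smallest_gap


gap = comparison_gap()
# Theorem 3.1 predicts v(x(t)) <= r(t), i.e. a non-negative gap.
assert gap >= -1e-12
print("smallest r(t) - v(x(t)) along the trajectory:", gap)
```

Reversing the inequality signs gives the analogous check for Theorem 3.2, where a lower bound on the derivative yields a lower bound on $v$ along trajectories.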
4. Differential Inequalities and Pursuit-Evasion Games
Let us assign $e_i \in R^{n_i}$, $i \in \{1, \ldots, N_e\}$, to be a vector of all state variables corresponding to the $i$-th evader, where $N_e$ denotes the total number of evaders. Similarly, let us assign $p_j \in R^{n_j}$, $j \in \{1, \ldots, N_p\}$, to be a vector of all state variables corresponding to the $j$-th pursuer, where $N_p$ denotes the total number of pursuers. In order to simplify the notation let us concatenate all the individual state vectors into the vectors $e = [e_1^T, \ldots, e_{N_e}^T]^T$ and $p = [p_1^T, \ldots, p_{N_p}^T]^T$. The dimensions of these two vectors are determined by the dimensions of the players’ individual state vectors. Let us assume that the evaders’ dynamics are given in compact form as
$$\dot{e} = f_e(e, u_e) \tag{21}$$
and similarly that the pursuers’ dynamics are given by
$$\dot{p} = f_p(p, u_p) \tag{22}$$
where $u_e$ and $u_p$ represent evaders’ and pursuers’ input strategies, respectively.
In order to generalize results presented in Stipanović et al. (2009) we start by considering the following function:
$$\phi_\delta^i(e_i, p) = \overline{\sigma}_\delta(\|\bar{e}_i - \bar{p}_1\|, \ldots, \|\bar{e}_i - \bar{p}_{N_p}\|) \tag{23}$$
where $\bar{e}_i$ and $\bar{p}_j$ represent $n$-dimensional rectangular coordinates in the corresponding $n$-dimensional space for the $i$-th evader and the $j$-th pursuer, respectively. Obviously the state variables in $\bar{e}_i$ and $\bar{p}_j$ are subsets of the state variables in $e_i$ and $p_j$, respectively. Without loss of generality and to simplify the notation, we introduce functions $\phi_\delta^i(\cdot, \cdot)$, $i \in \{1, \ldots, N_e\}$, as functions of $e_i$ and $p$. Furthermore, we define
$$\pi_\delta(e, p) = \overline{\rho}_\delta(\phi_\delta^1(e_1, p), \ldots, \phi_\delta^{N_e}(e_{N_e}, p)). \tag{24}$$
Now, we can define the following function:
$$v(e, p) = \pi_\delta(e, p) \tag{25}$$
and compute the corresponding strategies as follows:
$$\begin{aligned} \hat{u}_e(e, p) &= \arg\max_{u_e \in U_e} \frac{dv}{dt} = \arg\max_{u_e \in U_e} \frac{\partial v(e, p)}{\partial e} f_e(e, u_e) \\ \hat{u}_p(e, p) &= \arg\min_{u_p \in U_p} \frac{dv}{dt} = \arg\min_{u_p \in U_p} \frac{\partial v(e, p)}{\partial p} f_p(p, u_p) \end{aligned} \tag{26}$$
where $U_e$ and $U_p$ represent admissible classes of functions for the evaders’ and pursuers’ strategies, respectively. To streamline our presentation we assume that the classes of admissible functions are such that the objective function that is either minimized or maximized determines the arguments for the solution function. Therefore, the fact that $\frac{\partial v(e, p)}{\partial e} f_e(e, u_e)$ depends on $e$ and $p$ implies that the solution $\hat{u}_e(\cdot)$ is also a function of $e$ and $p$. One of the most general examples is a class of piecewise continuous functions that are norm bounded. The construction of strategies follows the main ideas of the design of controllers based on Liapunov functions (for more details see [Khalil (2002), Bacciotti and Rosier (2005), Blanchini and Miani (2008)]). In order to use the comparison principle we bound from above the total time derivative of the function $v(e, p)$ as:
$$\frac{dv(e, p)}{dt} = \frac{\partial v(e, p)}{\partial e} f_e(e, \hat{u}_e(e, p)) + \frac{\partial v(e, p)}{\partial p} f_p(p, \hat{u}_p(e, p)) \le G(e, p, v(e, p)) \tag{27}$$
where G(·, ·, ·) is a scalar continuous function of its arguments and ûe (e, p) and
ûp (e, p) are respectively collections of the evaders’ and pursuers’ strategies that
maximize or minimize the time derivative, that is the growth, of the function v(e, p).
Again, fj (j, ûj (e, p)), j ∈ {e, p} represent collective dynamics of the evaders (when
j = e) and the pursuers (when j = p) for the previously defined collective strategies.
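The construction in (23)-(26) can be sketched for the simplest possible player model. The example below assumes planar single-integrator pursuers, $\dot{p}_j = u_{p_j}$ with $\|u_{p_j}\| \le \mu$, which is a simplification chosen here for illustration and not a model used in the paper. For such dynamics the minimizer of $\frac{\partial v}{\partial p_j} u_{p_j}$ over the ball is $-\mu$ times the normalized gradient. The gradient is approximated by central differences, and all function names are hypothetical.

```python
import math


def sigma_upper(a, delta):
    """Upper approximation of min(a), Eq. (2)."""
    return (len(a) / sum(x ** (-delta) for x in a)) ** (1.0 / delta)


def rho_upper(a, delta):
    """Upper approximation of max(a), Eq. (10)."""
    return sum(x ** delta for x in a) ** (1.0 / delta)


def v(evaders, pursuers, delta=3.0):
    """Composite function (24)-(25) for planar point players (positions only)."""
    phis = [sigma_upper([math.dist(e, p) for p in pursuers], delta)
            for e in evaders]
    return rho_upper(phis, delta)


def pursuer_strategy(evaders, pursuers, j, mu, delta=3.0, h=1e-6):
    """Assumed single-integrator pursuer j: minimize (dv/dp_j) . u over ||u|| <= mu."""
    grad = []
    for k in range(2):  # central-difference gradient with respect to p_j
        plus = [list(p) for p in pursuers]
        minus = [list(p) for p in pursuers]
        plus[j][k] += h
        minus[j][k] -= h
        grad.append((v(evaders, plus, delta) - v(evaders, minus, delta)) / (2 * h))
    norm = math.hypot(*grad)
    if norm == 0.0:
        return [0.0, 0.0]  # gradient vanishes; any admissible input is optimal
    return [-mu * g / norm for g in grad]


# One evader straight above one pursuer: the pursuer should move straight up.
u = pursuer_strategy(evaders=[(0.0, 10.0)], pursuers=[(0.0, 0.0)], j=0, mu=2.0)
print(u)
```

With a single evader and a single pursuer, $v$ reduces to the distance between them, so the computed strategy is pure pursuit at maximal speed. The evaders' strategy follows in the same way with the sign reversed and the gradient taken with respect to the evader's coordinates.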
By defining the capture of an evader to be accomplished whenever its Euclidean
distance to any of the pursuers becomes less than a prescribed positive number
R (also known as the “soft capture”) we state the following theorem:
Theorem 4.1. Assume that the initial conditions $e_0 = e(t_0)$ and $p_0 = p(t_0)$ at the initial time $t_0$ are such that the players are outside of the capture regions defined by a positive number $R$, and that the maximal solution (as defined in Lakshmikantham and Leela (1969)) of the following differential equation:
$$\frac{dw}{dt} = G(e(t), p(t), w), \quad w(t_0) = w_0 = v(e_0, p_0) \tag{28}$$
is denoted as $\bar{w}(t, t_0, e_0, p_0, w_0)$ along the trajectories of the players’ dynamic systems for the collections of their strategies $\hat{u}_e(e, p)$ and $\hat{u}_p(e, p)$. Then, when the pursuers use the collective strategies provided in vector form as $\hat{u}_p(e, p)$, the capture of all evaders is guaranteed within a finite time interval $T - t_0$ after the initial time, for any feedback strategies of the evaders, if $\bar{w}(T, t_0, e_0, p_0, w_0) < R$.
Proof. The assumption of the theorem is that the pursuers choose their strategies to be $\hat{u}_p(e, p)$. Then, if the evaders choose any strategy $\bar{u}_e(e, p) \in U_e$, from (26) and (27) it follows that:
$$\left.\frac{dv(e, p)}{dt}\right|_{\substack{u_e = \bar{u}_e(e, p) \\ u_p = \hat{u}_p(e, p)}} \le \left.\frac{dv(e, p)}{dt}\right|_{\substack{u_e = \hat{u}_e(e, p) \\ u_p = \hat{u}_p(e, p)}} \le G(e, p, v(e, p)). \tag{29}$$
Thus, for any strategy $\bar{u}_e(e, p) \in U_e$, we obtain the differential inequality (29) and using the comparison principle we obtain:
$$v(e(t), p(t)) \le \bar{w}(t, t_0, e_0, p_0, w_0), \quad t \ge t_0. \tag{30}$$
From (11) and (24) it follows that:
$$\max\{\phi_\delta^1(e_1(t), p(t)), \ldots, \phi_\delta^{N_e}(e_{N_e}(t), p(t))\} \le v(e(t), p(t)) \tag{31}$$
which implies
$$\phi_\delta^i(e_i(t), p(t)) \le v(e(t), p(t)) \tag{32}$$
for all $i \in \{1, \ldots, N_e\}$. Then, using Eqs. (3) and (23) we obtain
$$\min\{\|\bar{e}_i(t) - \bar{p}_1(t)\|, \ldots, \|\bar{e}_i(t) - \bar{p}_{N_p}(t)\|\} \le v(e(t), p(t)), \quad t \ge t_0, \tag{33}$$
for all $i \in \{1, \ldots, N_e\}$. Finally, from inequalities (30) and (33) and the assumption of the theorem, we obtain
$$\min\{\|\bar{e}_i(T) - \bar{p}_1(T)\|, \ldots, \|\bar{e}_i(T) - \bar{p}_{N_p}(T)\|\} < R \tag{34}$$
for all $i \in \{1, \ldots, N_e\}$, which is a guarantee that all evaders will be captured before or at time $T$, and thus the theorem is proved.
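As a worked instance of the condition $\bar{w}(T, t_0, e_0, p_0, w_0) < R$, consider two bounding functions that are assumed here purely for illustration (the paper does not derive them): a constant decrease rate $G \equiv -c$, for which $\bar{w}(t) = w_0 - c\,(t - t_0)$, and a proportional rate $G = -c\,w$, for which $\bar{w}(t) = w_0 e^{-c (t - t_0)}$. In both cases the comparison equation has a unique solution, so the capture condition holds for every $T$ strictly greater than the value returned below. The function names are hypothetical.

```python
import math


def capture_time_constant_rate(w0, R, c, t0=0.0):
    """Threshold time for the assumed bound G = -c, with w(t) = w0 - c (t - t0)."""
    if c <= 0:
        raise ValueError("c must be positive")
    return t0 + max(0.0, (w0 - R) / c)


def capture_time_proportional_rate(w0, R, c, t0=0.0):
    """Threshold time for the assumed bound G = -c w, with w(t) = w0 exp(-c (t - t0))."""
    if c <= 0 or w0 <= 0 or R <= 0:
        raise ValueError("c, w0 and R must be positive")
    return t0 + max(0.0, math.log(w0 / R) / c)


# Illustrative numbers: initial value w0 = 10, capture radius R = 2.
print(capture_time_constant_rate(10.0, 2.0, 4.0))       # (10 - 2) / 4
print(capture_time_proportional_rate(10.0, 2.0, 1.0))   # ln(10 / 2)
```

Any $T$ beyond the returned threshold satisfies the hypothesis of Theorem 4.1, so it serves as a guaranteed upper bound on the capture time under the assumed $G$.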
Now, let us first consider the problem of evasion of a single evader $i$, $i \in \{1, \ldots, N_e\}$, from the pursuers. In order to do so, we first assume that the $i$-th evader’s dynamics are given by
$$\dot{e}_i = f_e^i(e_i, u_e^i). \tag{35}$$
To obtain strategies for the guaranteed evasion we consider the following approximation function
$$\eta_\delta^i(e_i, p) = \underline{\sigma}_\delta(\|\bar{e}_i - \bar{p}_1\|, \ldots, \|\bar{e}_i - \bar{p}_{N_p}\|) \tag{36}$$
and define $v_i(\cdot, \cdot)$ as
$$v_i(e_i, p) = \eta_\delta^i(e_i, p). \tag{37}$$
Similarly to the case of guaranteed capture, let us again bound the total time derivative of $v_i(e_i, p)$, yet in this case the bound will be from below:
$$\frac{dv_i(e_i, p)}{dt} = \frac{\partial v_i(e_i, p)}{\partial e_i} f_e^i(e_i, \tilde{u}_e^i(e_i, p)) + \frac{\partial v_i(e_i, p)}{\partial p} f_p(p, \tilde{u}_p^i(e_i, p)) \ge g_i(e_i, p, v_i(e_i, p)) \tag{38}$$
where the strategies are computed from:
$$\begin{aligned} \tilde{u}_e^i(e_i, p) &= \arg\max_{u_e^i \in U_e^i} \frac{dv_i}{dt} = \arg\max_{u_e^i \in U_e^i} \frac{\partial v_i(e_i, p)}{\partial e_i} f_e^i(e_i, u_e^i) \\ \tilde{u}_p^i(e_i, p) &= \arg\min_{u_p \in U_p} \frac{dv_i}{dt} = \arg\min_{u_p \in U_p} \frac{\partial v_i(e_i, p)}{\partial p} f_p(p, u_p) \end{aligned} \tag{39}$$
and $g_i(\cdot, \cdot, \cdot)$ is a scalar continuous function of its arguments.
Theorem 4.2. Assume that the initial conditions $e_{i0} = e_i(t_0)$ and $p_0 = p(t_0)$ at the initial time $t_0$ are such that the players are outside of the region defined by the set $\{(e_i, p) : v_i(e_i, p) \le R\}$ (notice that this is an over or outer approximation of the soft capture set) and that the minimal solution (as defined in [Lakshmikantham and Leela (1969)]) of the following differential equation:
$$\frac{dz_i}{dt} = g_i(e_i(t), p(t), z_i), \quad z_i(t_0) = z_{i0} = v_i(e_{i0}, p_0) \tag{40}$$
is denoted as $\underline{z}_i(t, t_0, e_{i0}, p_0, z_{i0})$ along the trajectories of the players’ dynamic systems for the collections of their strategies $\tilde{u}_e^i(e_i, p)$ and $\tilde{u}_p^i(e_i, p)$. Then the evasion of the $i$-th evader is guaranteed for any admissible composite strategy of the pursuers if $\underline{z}_i(t, t_0, e_{i0}, p_0, z_{i0}) > R$, for all $t \ge t_0$.
Proof. The assumption of the theorem is that the $i$-th evader chooses its strategy to be $\tilde{u}_e^i(e_i, p)$. Then, if the pursuers choose any collective strategy $\bar{u}_p(e_i, p) \in U_p$, from (38) and (39) it follows that:
$$\left.\frac{dv_i(e_i, p)}{dt}\right|_{\substack{u_e^i = \tilde{u}_e^i(e_i, p) \\ u_p = \bar{u}_p(e_i, p)}} \ge \left.\frac{dv_i(e_i, p)}{dt}\right|_{\substack{u_e^i = \tilde{u}_e^i(e_i, p) \\ u_p = \tilde{u}_p^i(e_i, p)}} \ge g_i(e_i, p, v_i(e_i, p)). \tag{41}$$
Thus, for any strategy $\bar{u}_p(e_i, p) \in U_p$, we obtain the differential inequality (41) and using the comparison principle we obtain:
$$v_i(e_i(t), p(t)) \ge \underline{z}_i(t, t_0, e_{i0}, p_0, z_{i0}), \quad t \ge t_0. \tag{42}$$
From (3), (36) and (37), it follows that:
$$\min\{\|\bar{e}_i(t) - \bar{p}_1(t)\|, \ldots, \|\bar{e}_i(t) - \bar{p}_{N_p}(t)\|\} \ge v_i(e_i(t), p(t)), \quad t \ge t_0 \tag{43}$$
for all $i \in \{1, \ldots, N_e\}$. Finally, using Eqs. (42) and (43), and the assumption of the theorem, we obtain
$$\min\{\|\bar{e}_i(t) - \bar{p}_1(t)\|, \ldots, \|\bar{e}_i(t) - \bar{p}_{N_p}(t)\|\} \ge \underline{z}_i(t, t_0, e_{i0}, p_0, z_{i0}) > R, \quad t \ge t_0 \tag{44}$$
which is a guarantee that the $i$-th evader will never be captured by the pursuers. This completes the proof.
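The role of the lower approximation in Theorem 4.2 can be illustrated with a small sketch. It assumes planar positions and, purely for illustration, the bounding function $g_i \equiv 0$ (that is, the evader can keep $v_i$ from decreasing), for which the minimal solution of (40) is the constant $z_{i0}$ and the evasion condition reduces to $v_i(e_{i0}, p_0) > R$. The function names and numbers are hypothetical.

```python
import math


def sigma_lower(a, delta):
    """Lower approximation of min(a), Eq. (1)."""
    return sum(x ** (-delta) for x in a) ** (-1.0 / delta)


def eta(evader, pursuers, delta=3.0):
    """Function (36)-(37) for one evader: lower bound on the closest-pursuer distance."""
    return sigma_lower([math.dist(evader, p) for p in pursuers], delta)


def evasion_certified(evader, pursuers, R, delta=3.0):
    """Evasion test under the assumed bound g_i = 0, where z_i(t) = z_i0 for all t."""
    return eta(evader, pursuers, delta) > R


evader = (0.0, 0.0)
pursuers = [(3.0, 0.0), (0.0, 4.0)]  # true minimal distance is 3
print(eta(evader, pursuers))                     # strictly below 3, as (3) requires
print(evasion_certified(evader, pursuers, 2.0))  # certified for R = 2
print(evasion_certified(evader, pursuers, 2.9))  # not certified, although 3 > 2.9
```

The last line shows the conservatism of the outer approximation of the capture set: the test can fail even when the true minimal distance exceeds $R$, and the gap shrinks as $\delta$ increases.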
Now, let us construct cooperative strategies for the evaders that would guarantee that none of them will be captured by the pursuers. For the cooperative case we propose the following function:
$$\gamma_\delta(e, p) = \underline{\sigma}_\delta(\eta_\delta^1(e_1, p), \ldots, \eta_\delta^{N_e}(e_{N_e}, p)) \tag{45}$$
and define
$$v(e, p) = \gamma_\delta(e, p). \tag{46}$$
This is a cooperative case since $v(e, p)$ considers all evaders at the same time. Notice that due to the assumption that the players’ dynamics are independent, the cooperation is described through the function $v(e, p)$. Then, we proceed by defining the strategies for the players as:
$$\begin{aligned} \tilde{u}_e(e, p) &= \arg\max_{u_e \in U_e} \frac{dv}{dt} = \arg\max_{u_e \in U_e} \frac{\partial v(e, p)}{\partial e} f_e(e, u_e) \\ \tilde{u}_p(e, p) &= \arg\min_{u_p \in U_p} \frac{dv}{dt} = \arg\min_{u_p \in U_p} \frac{\partial v(e, p)}{\partial p} f_p(p, u_p) \end{aligned} \tag{47}$$
where again $U_e$ and $U_p$ represent admissible classes of functions for the evaders’ and pursuers’ strategies, respectively. In order to use the comparison principle, we bound from below the total time derivative of the function $v(e, p)$ as
$$\frac{dv(e, p)}{dt} = \frac{\partial v(e, p)}{\partial e} f_e(e, \tilde{u}_e(e, p)) + \frac{\partial v(e, p)}{\partial p} f_p(p, \tilde{u}_p(e, p)) \ge g(e, p, v(e, p)) \tag{48}$$
where $g(\cdot, \cdot, \cdot)$ is a scalar continuous function of its arguments.
Now, we are ready to formulate a theorem on the collective evasion as follows:
Theorem 4.3. Assume that the initial conditions $e_0 = e(t_0)$ and $p_0 = p(t_0)$ at the initial time $t_0$ are such that the players are outside of the region defined by the set $\{(e, p) : v(e, p) \le R\}$ (notice that this is an over or outer approximation of the soft capture set for any evader) and that the minimal solution (as defined in [Lakshmikantham and Leela (1969)]) of the following differential equation:
$$\frac{dz}{dt} = g(e(t), p(t), z), \quad z(t_0) = z_0 = v(e_0, p_0) \tag{49}$$
is denoted as $\underline{z}(t, t_0, e_0, p_0, z_0)$ along the trajectories of the players’ dynamic systems for the collections of their strategies $\tilde{u}_e(e, p)$ and $\tilde{u}_p(e, p)$. Then the evasion of all evaders is guaranteed for any admissible composite strategy of the pursuers if $\underline{z}(t, t_0, e_0, p_0, z_0) > R$, for all $t \ge t_0$.
Proof. The assumption of the theorem is that the evaders choose their collective strategy to be $\tilde{u}_e(e, p)$. Then, if the pursuers choose any collective strategy $\bar{u}_p(e, p) \in U_p$, from (47) and (48) it follows that:
$$\left.\frac{dv(e, p)}{dt}\right|_{\substack{u_e = \tilde{u}_e(e, p) \\ u_p = \bar{u}_p(e, p)}} \ge \left.\frac{dv(e, p)}{dt}\right|_{\substack{u_e = \tilde{u}_e(e, p) \\ u_p = \tilde{u}_p(e, p)}} \ge g(e, p, v(e, p)). \tag{50}$$
Thus, for any strategy $\bar{u}_p(e, p) \in U_p$, differential inequality (50) is valid, and using the comparison principle we obtain
$$v(e(t), p(t)) \ge \underline{z}(t, t_0, e_0, p_0, z_0), \quad t \ge t_0. \tag{51}$$
From (3), (45) and (46) it follows that:
$$\min\{\eta_\delta^1(e_1(t), p(t)), \ldots, \eta_\delta^{N_e}(e_{N_e}(t), p(t))\} \ge v(e(t), p(t)), \quad t \ge t_0 \tag{52}$$
which implies
$$\eta_\delta^i(e_i(t), p(t)) \ge v(e(t), p(t)), \quad t \ge t_0, \tag{53}$$
for all $i \in \{1, \ldots, N_e\}$. Then, from Eqs. (3) and (36) we obtain
$$\min\{\|\bar{e}_i(t) - \bar{p}_1(t)\|, \ldots, \|\bar{e}_i(t) - \bar{p}_{N_p}(t)\|\} \ge v(e(t), p(t)), \quad t \ge t_0 \tag{54}$$
for all $i \in \{1, \ldots, N_e\}$. Inequalities (51) and (54) imply that for all $i \in \{1, \ldots, N_e\}$,
$$\min\{\|\bar{e}_i(t) - \bar{p}_1(t)\|, \ldots, \|\bar{e}_i(t) - \bar{p}_{N_p}(t)\|\} \ge \underline{z}(t, t_0, e_0, p_0, z_0) > R, \quad t \ge t_0 \tag{55}$$
which is a guarantee that none of the evaders will ever be captured by any of the pursuers. This completes the proof.
Notice that by defining goals of the players in terms of trajectories of the
system either reaching or not reaching the target sets, we can generalize the proposed methodology to be applicable to a larger class of dynamic games rather than
pursuit-evasion games only.
5. Unicycle Players
In order to provide an illustration for designing players’ strategies using the proposed approach we consider pursuit-evasion games where the players are modelled
using a nonholonomic nonlinear model also known as the unicycle [Spong et al.
(2005)]. Let us assume that each player $i$, where the total number of players is denoted by $N$, that is $i \in \mathbf{N} = \{1, \ldots, N\}$, is modelled using the unicycle model, that is,
$$\begin{aligned} \dot{X}_i &= s_i \cos(\varphi_i) \\ \dot{Y}_i &= s_i \sin(\varphi_i) \\ \dot{\varphi}_i &= \omega_i \end{aligned} \tag{56}$$
where $X_i$ and $Y_i$ represent planar rectangular coordinates and $\varphi_i$ represents the heading angle of the $i$-th player. Velocity, denoted by $s_i$, and angular velocity, denoted by $\omega_i$, are assumed to be norm bounded by $\mu_i$ and $\nu_i$, respectively.
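For simulation purposes, the unicycle model (56) can be advanced in time with a simple integrator. The following is a minimal sketch using a forward Euler step with input saturation at the bounds $\mu_i$ and $\nu_i$; the integration scheme, the step size and the function name `unicycle_step` are assumptions made here and are not taken from the paper.

```python
import math


def unicycle_step(state, s, omega, dt, mu, nu):
    """One forward Euler step of the unicycle model (56).

    state is (X, Y, phi); the inputs are clipped so that |s| <= mu, |omega| <= nu.
    """
    X, Y, phi = state
    s = max(-mu, min(mu, s))
    omega = max(-nu, min(nu, omega))
    return (X + dt * s * math.cos(phi),
            Y + dt * s * math.sin(phi),
            phi + dt * omega)


# A player heading along the X axis that requests more speed than its bound allows.
state = (0.0, 0.0, 0.0)
state = unicycle_step(state, s=5.0, omega=0.5, dt=0.1, mu=2.0, nu=1.0)
print(state)
```

A smaller step size or a higher-order scheme may be preferable when the switching strategies change sign frequently.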
Assume that the Liapunov-like function for the $i$-th player is given by:
$$\upsilon_i \equiv \upsilon_i(P_i, \bar{P}_i), \quad P_i = [X_i, Y_i]^T, \quad \bar{P}_i = \{P_j : j \ne i,\ j \in \mathbf{N}_i\} \tag{57}$$
where $\mathbf{N}_i$ is a set of players of interest to player $i$. A set of players of interest is defined based on the individual role of the player. For example, if a player is an evader then the set of interest would include indices of all pursuers. A Liapunov-like function candidate associated with the $i$-th player, denoted by $\upsilon_i(\cdot, \cdot)$, is an appropriate approximation of the minimum or the maximum function depending on whether the $i$-th player is a pursuer or an evader. Then, the control strategies for the pursuers, that is, velocity and angular velocity, are given by the following expression:
$$\begin{bmatrix} \hat{s}_i \\ \hat{\omega}_i \end{bmatrix} = \arg\min_{|s_i| \le \mu_i,\ |\omega_i| \le \nu_i} \frac{d\upsilon_i}{dt} = \arg\min_{|s_i| \le \mu_i,\ |\omega_i| \le \nu_i} \left( \frac{\partial \upsilon_i}{\partial X_i} \cos(\varphi_i) + \frac{\partial \upsilon_i}{\partial Y_i} \sin(\varphi_i) \right) s_i. \tag{58}$$
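From Eq. (58) it follows that the closed-form solution for the pursuers’ velocities is given by the following formula:
$$\hat{s}_i = -\mu_i\, \mathrm{sign}\left( \frac{\partial \upsilon_i}{\partial X_i} \cos(\varphi_i) + \frac{\partial \upsilon_i}{\partial Y_i} \sin(\varphi_i) \right) = -\mu_i\, \mathrm{sign}\left( \sin\left( \varphi_i + \arctan \frac{\partial \upsilon_i / \partial X_i}{\partial \upsilon_i / \partial Y_i} \right) \right), \tag{59}$$
where $\mathrm{sign}(\cdot)$ denotes the standard sign function. Similarly, for the evaders we consider the arg max function in (58) and obtain the following formula:
$$\hat{s}_i = \mu_i\, \mathrm{sign}\left( \frac{\partial \upsilon_i}{\partial X_i} \cos(\varphi_i) + \frac{\partial \upsilon_i}{\partial Y_i} \sin(\varphi_i) \right) = \mu_i\, \mathrm{sign}\left( \sin\left( \varphi_i + \arctan \frac{\partial \upsilon_i / \partial X_i}{\partial \upsilon_i / \partial Y_i} \right) \right). \tag{60}$$
From Eqs. (59) and (60) it can be easily shown that the extremal values of the time derivative are attained when the argument of the sine function equals $\pi/2$ (in both cases), which implies that the desired heading angle is given by
$$\varphi_i^{des} = \pi/2 - \arctan \frac{\partial \upsilon_i / \partial X_i}{\partial \upsilon_i / \partial Y_i}. \tag{61}$$
Then, one possible norm bounded solution for the angular velocity strategy of the $i$-th player is to guide the player toward the desired heading angle as
$$\hat{\omega}_i = -\nu_i\, \mathrm{sign}(\varphi_i - \varphi_i^{des}). \tag{62}$$
Notice that the differential equation $\dot{\varphi}_i = \hat{\omega}_i$ is finite-time stable if $\varphi_i^{des}$ is a constant. It is interesting to note that the functional forms for the angular velocities are the same for all the players.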
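Given the two partial derivatives of the Liapunov-like function, the switching strategies (59)-(62) are straightforward to evaluate. The sketch below assumes that the gradient components `gx` $= \partial \upsilon_i / \partial X_i$ and `gy` $= \partial \upsilon_i / \partial Y_i$ are supplied by the caller and that `gy` is nonzero; the function names are hypothetical. The velocity uses the first form in (59) and (60), and the desired heading uses the single-argument arctangent exactly as written in (61), without any additional quadrant handling.

```python
import math


def sign(x):
    """Standard sign function returning -1, 0 or 1."""
    return (x > 0) - (x < 0)


def speed_strategy(gx, gy, phi, mu, pursuer=True):
    """Eqs. (59)/(60): -mu * sign(.) for a pursuer, +mu * sign(.) for an evader."""
    value = gx * math.cos(phi) + gy * math.sin(phi)
    return (-mu if pursuer else mu) * sign(value)


def desired_heading(gx, gy):
    """Eq. (61): pi/2 - arctan(gx / gy); gy != 0 is assumed."""
    return math.pi / 2 - math.atan(gx / gy)


def turn_rate_strategy(phi, phi_des, nu):
    """Eq. (62): turn at the maximal rate toward the desired heading."""
    return -nu * sign(phi - phi_des)


# Illustrative gradient pointing along +Y, player currently heading along +X.
gx, gy, phi = 0.0, 1.0, 0.0
phi_des = desired_heading(gx, gy)
print(phi_des)                                                    # pi/2 for this gradient
print(turn_rate_strategy(phi, phi_des, nu=1.0))                   # turn counterclockwise
print(speed_strategy(gx, gy, math.pi / 2, mu=2.0, pursuer=True))  # pursuer moves down the gradient
```

In a simulation loop these values would be recomputed at every step from the current gradient and fed to an integrator of the unicycle model.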
[Figure: trajectories in the X-Y plane, with evaders labelled E1 and E2 and pursuers labelled P3, P4 and P5.]
Fig. 1. Pursuit-evasion game for the first set of initial conditions with five nonidentical unicycle players.
Finally, in order to illustrate the proposed design of players’ strategies let us
consider a pursuit-evasion game with three pursuers and two evaders. The first
scenario depicted in Fig. 1 shows two evaders with the initial conditions in the
upper part of the figure and three pursuers with the initial conditions in the lower
part of the figure. The two evaders start from the same initial Y-coordinate, and so do the three pursuers. The trajectories for the evaders appear as thinner lines while the trajectories for the pursuers appear as thicker lines. We will refer to the evaders as
players one and two and denote them as E1 and E2 (as shown in Fig. 1). Similarly, we will refer to the three pursuers as players three, four and five and denote
them as P3 , P4 and P5 , respectively. This is also depicted in Fig. 1 by placing
the corresponding labels next to the players’ trajectories. By doing so, we assign
subscript indices 1 and 2 to players which are evaders and indices 3, 4 and 5 to
players which are pursuers. In the normalized units the bounds for the velocities of
the players are µ1 = µ2 = µ3 = 1, µ4 = µ5 = 2 and νi = 1, for all i ∈ {1, 2, 3, 4, 5}.
The strategies are computed using equation (26), that is, Eqs. (58)–(62), and the
composite function (25) (which is the same for all players) with δ = 3. In Fig. 1,
we can see that the players three and four (that is, pursuers P3 and P4 ) pursue
player one (that is, evader E1 ) and that player five (that is, pursuer P5 ) pursues
player two (that is, evader E2 ). Since players four and five are the fastest they
will capture players one and two. We do not provide closed-form solutions for the
players’ strategies and the time derivative of the composite function since the derivation is straightforward using Eqs. (25), (58)–(62), and the equations are long and
cumbersome.
[Figure: trajectories in the X-Y plane, with evaders labelled E1 and E2 and pursuers labelled P3, P4 and P5.]
Fig. 2. Pursuit-evasion game for the second set of initial conditions with five nonidentical unicycle players: trajectories at the beginning of the pursuit.
[Figure: trajectories in the X-Y plane, with evaders labelled E1 and E2 and pursuers labelled P3, P4 and P5.]
Fig. 3. Pursuit-evasion game for the second set of initial conditions with five nonidentical unicycle players: trajectories over the whole time horizon.
A more complex example is depicted in Figs. 2 and 3 where we only changed
initial conditions slightly (not to change the numbering order of the players) and the
bounds on the velocities as µ1 = 5, µ2 = 2, µ3 = 3, µ4 = 1, µ5 = 6, ν1 = ν3 = ν5 = 2,
and ν2 = ν4 = 1. Parameter δ = 3 remained unchanged as well as all the equations
used to compute players’ strategies. It is interesting that the initial strategy of
player three (that is, pursuer P3) is to pursue player one (that is, evader E1), as
depicted in Fig. 2, yet after a while it turns and starts pursuing player two (that is,
evader E2), as shown in Fig. 3, since player four (that is, pursuer P4) is too slow for
player two. Player five (that is, pursuer P5), as the fastest player, pursues from the
very beginning player one, who is the faster of the two evaders.
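To make the parameter set of this example concrete, the following minimal sketch stores the velocity bounds listed above and advances one unicycle player by a forward-Euler step with saturated inputs. It assumes the standard unicycle kinematics (dx/dt = v cos θ, dy/dt = v sin θ, dθ/dt = ω) and that µi bounds the linear speed while νi bounds the turn rate. The function names, the step size, and the Euler scheme are illustrative choices and are not taken from the Matlab code used to produce the figures.

```python
import math

# Velocity bounds of the second example (players 1, 2 are evaders; 3, 4, 5 are pursuers).
# Assumption: mu bounds the linear speed and nu bounds the turn rate.
MU = {1: 5.0, 2: 2.0, 3: 3.0, 4: 1.0, 5: 6.0}
NU = {1: 2.0, 2: 1.0, 3: 2.0, 4: 1.0, 5: 2.0}


def saturate(value, bound):
    """Clip value to the interval [-bound, bound]."""
    return max(-bound, min(bound, value))


def unicycle_step(state, v_cmd, omega_cmd, player, dt=0.01):
    """One forward-Euler step of the unicycle model for the given player.

    state is (x, y, theta); the commands are clipped to the player's bounds.
    """
    x, y, theta = state
    v = saturate(v_cmd, MU[player])
    omega = saturate(omega_cmd, NU[player])
    return (x + dt * v * math.cos(theta),
            y + dt * v * math.sin(theta),
            theta + dt * omega)
```

For instance, calling `unicycle_step((0.0, 0.0, 0.0), 10.0, 0.0, 4)` clips the commanded speed of pursuer P4 to its bound µ4 = 1 before integrating, which is the sense in which that player is slow relative to evader E2.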
These initial simulation results show the capabilities of the proposed strategies, yet many open questions remain to be addressed in our future work.
Some of the immediate issues to be considered are the inclusion of delays, the
robustness properties of the methodology, and the possibility that the players
update their information only at discrete time instants.
6. Conclusion
In this paper we provide a methodology for designing strategies for the players that
guarantee either capture or evasion of all or some evaders in multi-player pursuit-evasion games. The players’ dynamics are represented by nonlinear models, and the
sufficient conditions are formulated using a Liapunov-type analysis based on
the comparison principle and differential inequalities, by considering differentiable
functions that are convergent approximations of the minimum and the maximum
function.
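As an illustration of the kind of approximation referred to here, the sketch below uses one well-known family of differentiable bounds for positive arguments: (Σ a_i^(−δ))^(−1/δ) bounds the minimum from below, (Σ a_i^δ)^(1/δ) bounds the maximum from above, and both converge to the exact values as δ grows. That these coincide with the functions defined earlier in the paper, and that δ here plays the role of the parameter δ = 3 used in the simulations, is an assumption; the function names and the sample values are hypothetical.

```python
def smooth_min_lower(values, delta):
    """Differentiable lower bound on min(values), for positive values."""
    return sum(a ** (-delta) for a in values) ** (-1.0 / delta)


def smooth_max_upper(values, delta):
    """Differentiable upper bound on max(values), for positive values."""
    return sum(a ** delta for a in values) ** (1.0 / delta)


distances = [2.0, 3.0, 5.0]  # illustrative positive numbers, not from the paper
for delta in (3, 10, 50):
    lo = smooth_min_lower(distances, delta)
    hi = smooth_max_upper(distances, delta)
    # lo <= min(distances) and max(distances) <= hi; the gaps shrink as delta grows
    print(delta, lo, min(distances), max(distances), hi)
```

A small δ gives a smoother but looser bound, while a large δ tightens the bound at the cost of steeper gradients near ties between the arguments.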
Acknowledgments
This work has been supported by the Alexander von Humboldt Foundation, Bonn,
Germany. The first author would also like to thank Mr. Juan S. Mejía for his help
in producing the Matlab code used to obtain the simulation results provided in the
paper.
References
Bacciotti, A. and Rosier, L. [2005] Liapunov Functions and Stability in Control Theory,
2nd edn., Springer-Verlag, Berlin, Germany.
Bainov, D. and Simeonov, P. [1992] Integral inequalities and applications, Mathematics and
Its Applications, Vol. 57, Kluwer Academic Publishers, Dordrecht, The Netherlands.
Bardi, M. and Capuzzo-Dolcetta, I. [1997] Optimal Control and Viscosity Solutions of
Hamilton-Jacobi-Bellman Equations, Birkhäuser, Boston, MA.
Blanchini, F. and Miani, S. [2008] Set-Theoretic Methods in Control, Birkhäuser, Boston,
MA.
Breakwell, J. V. and Hagedorn, P. [1979] Point capture of two evaders in succession,
Journal of Optimization Theory and Applications 27, 89–97.
Chernousko, F. L. and Melikyan, A. A. [1975] Some differential games with incomplete
information, Lecture Notes in Computer Science, Vol. 27, Springer-Verlag, Berlin,
pp. 445–450.
Crandall, M. G. and Lions, P.-L. [1983] Viscosity solutions of Hamilton-Jacobi equations,
Transactions of the American Mathematical Society 277, 1–42.
Falcone, M. and Ferretti, R. [1994] Discrete time high-order schemes for viscosity solutions
of Hamilton-Jacobi-Bellman equations, Numerische Mathematik 67, 315–344.
Hagedorn, P. and Breakwell, J. V. [1976] A differential game with two pursuers and one
evader, Journal of Optimization Theory and Applications 18, 15–29.
Isaacs, R. [1965] Differential Games: A Mathematical Theory with Applications to
Warfare and Pursuit, Control and Optimization, John Wiley and Sons, Inc.,
New York, NY.
Khalil, H. K. [2002] Nonlinear Systems, 3rd edn., Prentice Hall, Upper Saddle River, NJ.
Krasovskii, N. N. and Subbotin, A. I. [1988] Game-Theoretical Control Problems, Springer-Verlag, New York, NY.
Lakshmikantham, V. and Leela, S. [1969] Differential and integral inequalities: Theory
and applications, Mathematics in Science and Engineering, Vol. 55, Academic Press,
New York, NY.
Lakshmikantham, V., Leela, S. and Martinyuk, A. A. [1989] Stability Analysis of Nonlinear
Systems, Marcel Dekker, New York, NY.
Levchenkov, A. Y. and Pashkov, A. G. [1990] Differential game of optimal approach of
two inertial pursuers to a noninertial evader, Journal of Optimization Theory and
Applications 65, 501–518.
Luenberger, D. G. [1995] Microeconomic Theory, McGraw-Hill, Inc., New York, NY.
Melikyan, A. A. [1973] On minimal observations in a game of encounter (in Russian),
PMM 37(3), 407–414.
Melikyan, A. A. [1981] Optimal interaction of two pursuers in a game problem (in Russian),
Tekhnicheskaya Kibernetika (2), 49–56.
Melikyan, A. A. [1998] Generalized characteristics of first order PDEs, Birkhäuser, Boston,
MA.
Melikyan, A. A., Hovakimyan, N. H. and Harutyunyan, L. [1998] Games of simple pursuit and approach on two-dimensional cone, Journal of Optimization Theory and
Applications 98(3), 515–543.
Melikyan, A. A. and Pourtallier, A. [1996] Games with Several Pursuers and One Evader
with Discrete Observations, Game Theory and Applications (Petrosjan, L. A. and
Mazalov, V. V. eds.), Vol. 2, Nova Science Publishers, New York, NY, pp. 169–184.
Olsder, G. J. and Pourtallier, O. [1995] Optimal selection of observation times in a costly
information game, New Trends in Dynamic Games and Applications (Olsder, G. J.
ed.), Annals of the International Society of Dynamic Games, Vol. 3, Birkhäuser,
Boston, MA, pp. 227–246.
Pashkov, A. G. and Terekhov, S. D. [1987] A differential game of approach with two pursuers and one evader, Journal of Optimization Theory and Applications 55, 303–311.
Petrosjan, L. A. [1993] Differential Games of Pursuit, Series on Optimization, Vol. 2,
World Scientific, Singapore.
Petrov, N. N. [1994] Existence of the value of a many-person game of pursuit, Journal of
Applied Mathematics and Mechanics 58(4), 593–600.
Petrov, N. N. [2003] “Soft” capture in Pontryagin’s example with many participants,
Journal of Applied Mathematics and Mechanics 67(5), 671–680.
Pontryagin, L. S., Boltyanskii, V. G., Gamkrelidze, R. V. and Mishchenko, E. F. [1962]
The Mathematical Theory of Optimal Processes, Interscience Publishers, New York,
NY.
Spong, M. W., Hutchinson, S. and Vidyasagar, M. [2005] Robot Modeling and Control,
John Wiley & Sons, Hoboken, NJ.
Stipanović, D. M., Melikyan, A. and Hovakimyan, N. [2009] Some sufficient conditions
for multi-player pursuit-evasion games with continuous and discrete observations,
Annals of Dynamic Games 10, 133–145.
Stipanović, D. M., Sriram and Tomlin, C. J. [2004] Strategies for agents in multi-player
pursuit-evasion games, Proceedings of the Eleventh International Symposium on
Dynamic Games and Applications (Tucson, Arizona).
Subbotin, A. [1984] Generalization of the main equation of differential game theory,
Journal of Optimization Theory and Applications 43, 103–133.
Subbotin, A. I. [1995] Generalized Solutions of First-Order PDEs: The Dynamical
Optimization Perspective, Birkhäuser, Boston, MA.
Vagin, D. A. and Petrov, N. N. [2002] A problem of group pursuit with phase constraints,
Journal of Applied Mathematics and Mechanics 66(2), 225–232.