
Int. Game Theory Rev. 2010.12:1-17. Downloaded from www.worldscientific.com by INSTITUTE OF MATHEMATICS AND MECHANICS OF URAL BRANCHE OF RAS on 02/28/13. For personal use only.

International Game Theory Review, Vol. 12, No. 1 (2010) 1–17
© World Scientific Publishing Company
DOI: 10.1142/S0219198910002489

GUARANTEED STRATEGIES FOR NONLINEAR MULTI-PLAYER PURSUIT-EVASION GAMES∗

DUŠAN M. STIPANOVIƆ, ARIK MELIKYAN and NAIRA HOVAKIMYAN‡

†Department of Industrial and Enterprise Systems Engineering and the Coordinated Science Laboratory, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA
dusan@illinois.edu

‡Department of Mechanical Science and Engineering, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA
nhovakim@illinois.edu

In this paper, we provide a methodology to design strategies for either guaranteed capture or guaranteed evasion in the case of pursuit-evasion games with multiple players which are represented by nonlinear dynamic models. This methodology is based on the continuously differentiable upper and lower approximations of the minimum and maximum function of an arbitrary number of arguments, the comparison principle, and differential inequalities.

Keywords: Pursuit-evasion games; differential inequalities; multi-player dynamic games; Liapunov analysis.

1. Introduction

Consideration of optimal strategies for dynamic pursuit-evasion games dates back to the original work of Isaacs (1965). The problem of “pursuing a moving object with another controlled object” was formulated as a stochastic optimal control problem in Pontryagin et al. (1962). It is important to point out a significant contribution to the theory of pursuit-evasion games based on the extremal aiming method that was introduced and developed by Krasovskii and Subbotin (see Krasovskii and Subbotin (1988), Subbotin (1995)).
In order to deal with the nondifferentiability of solutions of the Hamilton-Jacobi partial differential equations, viscosity solutions [Crandall and Lions (1983), Bardi and Capuzzo-Dolcetta (1997)] and so-called “minmax” solutions [Subbotin (1984)] were independently introduced. Numerical approximations based on viscosity solutions were provided in Falcone and Ferretti (1994) and Bardi and Capuzzo-Dolcetta (1997). Some particular strategies for the players in various multi-player pursuit-evasion games were presented in Hagedorn and Breakwell (1976), Breakwell and Hagedorn (1979), Melikyan (1981), Pashkov and Terekhov (1987), Levchenkov and Pashkov (1990), Petrosjan (1993), Petrov (1994), Vagin and Petrov (2002), Petrov (2003). For an application of generalized characteristics of partial differential equations to differential pursuit-evasion games we refer to the results reported in Melikyan (1981, 1998). Another interesting and important scenario, in which the players obtain information at discrete time instances, was first considered in Melikyan (1973) for the case of one pursuer and one evader. This result was later generalized, using a cost of information, to more complex models for the pursuer and the evader in Olsder and Pourtallier (1995). Finally, the case of pursuit-evasion games with several pursuers and one evader with discrete observations, based on a comparison with solutions to differential games with continuous observations available to the players, was studied in Melikyan and Pourtallier (1996).

∗ This paper appears almost a year and a half after the untimely death of Arik Melikyan. D. M. Stipanović and N. Hovakimyan would like to dedicate this paper to his memory.
† Corresponding author.
The problem in which one of the two players is provided with delayed observations was considered in Chernousko and Melikyan (1975).

In this paper we follow another approach to define strategies in pursuit-evasion games, which is based on the Liapunov type of analysis [Stipanović et al. (2004)]. Instead of solving a Hamilton-Jacobi-Isaacs partial differential equation for a value function, a specific function of the norms of relative distances between pursuers and evaders is considered. This function is appropriately chosen from the set of functions that are continuously differentiable and represent approximations of the minimum and maximum function. Strategies are then formulated by either maximizing or minimizing the growth, that is, the time derivative of the corresponding differentiable Liapunov-like function. One of the most important features of this approach is that the methodology is applicable to a wide class of linear multi-player pursuit-evasion games. These results were later extended in Stipanović et al. (2009) to include more complicated nonlinear models that are affine in control strategies for the players in the game. The pursuit-evasion games considered there were restricted to the case of two pursuers and two evaders. In this paper we generalize these results to multiple pursuit-evasion games by introducing convergent approximations of the minimum and maximum function of an arbitrary number of arguments and by using less restrictive comparison results.

The organization of the paper is as follows. In Sec. 2 we introduce functions that represent convergent lower and upper approximations of both the minimum and the maximum function. Some of the most general comparison results are recalled in Sec. 3. Guaranteed strategies for either capture or evasion of the evaders are provided in Sec. 4.
Finally, as an illustration of the proposed methodology, we consider pursuit-evasion games with nonidentical nonholonomic players described by the unicycle model in Sec. 5.

2. Properties of the Minimum and the Maximum Function Approximations

In this section we generalize the functions introduced in Stipanović et al. (2009), which approximate the minimum and the maximum of two arguments, to the case of an arbitrary number of arguments. These functions will later be used to establish sufficient conditions for either guaranteed capture or evasion of all or some of the evaders. As a starting point, let us assume that we are given $N$ positive numbers $a_i$, $i \in \mathbf{N}$, where $\mathbf{N} = \{1, \ldots, N\}$. In order to approximate the minimum function from below, we consider the following function:

$$\underline{\sigma}_\delta(a_1, \ldots, a_N) = \frac{1}{\sqrt[\delta]{\sum_{i=1}^{N} a_i^{-\delta}}}, \quad \delta > 0 \tag{1}$$

and similarly, for the approximation from above, we consider the following function:

$$\overline{\sigma}_\delta(a_1, \ldots, a_N) = \sqrt[\delta]{\frac{N}{\sum_{i=1}^{N} a_i^{-\delta}}}, \quad \delta > 0. \tag{2}$$

Let us denote $a_m = \min_{i \in \mathbf{N}} \{a_i\}$ and define $m$ as a variable taking the integer value $j$ representing the index of a minimal $a_j$, that is, $m = j$. Notice that if the minimum value is achieved by more than one argument, we can choose any of the corresponding indices without any loss of generality. Now, we can state the following theorem:

Theorem 2.1. The minimum approximation functions satisfy the following properties:

$$\underline{\sigma}_\delta \le a_m \le \overline{\sigma}_\delta, \quad \forall\, \delta > 0, \tag{3}$$

$$\lim_{\delta \to \infty} \underline{\sigma}_\delta = \lim_{\delta \to \infty} \overline{\sigma}_\delta = a_m. \tag{4}$$

Proof. First notice that the approximation functions may be written as:

$$\underline{\sigma}_\delta = \frac{a_m}{\sqrt[\delta]{1 + \sum_{i \ne m} (a_m/a_i)^{\delta}}}, \qquad \overline{\sigma}_\delta = \frac{a_m \sqrt[\delta]{N}}{\sqrt[\delta]{1 + \sum_{i \ne m} (a_m/a_i)^{\delta}}}. \tag{5}$$

Also,

$$\lim_{\delta \to \infty} \sqrt[\delta]{1 + \sum_{i=1}^{N} c_i^{\delta}} = 1 \quad \text{if } (\forall\, i \in \{1, \ldots, N\})\ (c_i \in [0, 1]) \tag{6}$$

which is true due to the following:

$$1 \le \sqrt[\delta]{1 + \sum_{i=1}^{N} c_i^{\delta}} \le \sqrt[\delta]{1 + N} \ \text{ if } (c_i \in [0, 1]) \quad \text{and} \quad \lim_{\delta \to \infty} \sqrt[\delta]{1 + N} = 1. \tag{7}$$

Finally, since $a_m/a_i \le 1$ for all $i \in \{1, \ldots, N\}$ and $N \ge 1 + \sum_{i \ne m} (a_m/a_i)^{\delta}$, we conclude that the statements of the theorem are true.

Another interesting feature that is easy to show is that the minimum approximation functions behave well for any finite positive $\delta$ when the minimum approaches zero, that is,

$$\lim_{a_m \to 0} \underline{\sigma}_\delta = \lim_{a_m \to 0} \overline{\sigma}_\delta = 0 \tag{8}$$

which is a direct consequence of Eqs. (5) and (6).

In order to approximate the maximum function from below, we introduce the following function:

$$\underline{\rho}_\delta(a_1, \ldots, a_N) = \sqrt[\delta]{\frac{\sum_{i=1}^{N} a_i^{\delta}}{N}}, \quad \delta > 0 \tag{9}$$

and similarly, for the approximation from above, we propose to use

$$\overline{\rho}_\delta(a_1, \ldots, a_N) = \sqrt[\delta]{\sum_{i=1}^{N} a_i^{\delta}}, \quad \delta > 0. \tag{10}$$

Let us denote $a_M = \max_{i \in \mathbf{N}} \{a_i\}$ and define $M$ as a variable taking the integer value of the index of a maximal $a_j$, that is, $M = j$. Again, notice that if the set of maximal variables has more than one element we can choose any one of them without any loss of generality. Now, analogously to the case of approximating the minimum, we formulate the following theorem:

Theorem 2.2. The convergent maximum approximation functions satisfy the following properties:

$$\underline{\rho}_\delta \le a_M \le \overline{\rho}_\delta, \quad \forall\, \delta > 0, \tag{11}$$

$$\lim_{\delta \to \infty} \underline{\rho}_\delta = \lim_{\delta \to \infty} \overline{\rho}_\delta = a_M. \tag{12}$$

Proof. Notice that the approximation functions can be rewritten as:

$$\underline{\rho}_\delta = a_M \frac{\sqrt[\delta]{1 + \sum_{i \ne M} (a_i/a_M)^{\delta}}}{\sqrt[\delta]{N}}, \qquad \overline{\rho}_\delta = a_M \sqrt[\delta]{1 + \sum_{i \ne M} (a_i/a_M)^{\delta}}. \tag{13}$$

Again, using the following property:

$$\lim_{\delta \to \infty} \sqrt[\delta]{1 + \sum_{i=1}^{N} c_i^{\delta}} = 1 \quad \text{if } (\forall\, i \in \{1, \ldots, N\})\ (c_i \in [0, 1]), \tag{14}$$

the fact that $a_i/a_M \le 1$ for all $i \in \{1, \ldots, N\}$, and $N \ge 1 + \sum_{i \ne M} (a_i/a_M)^{\delta}$, we conclude that the statements of the theorem are true.

Finally, it is interesting to note that the lower and the upper convergent approximations of both the minimum and the maximum function may be linked to the constant elasticity of substitution (CES) functions with particular coefficients, multiplying the arguments, that are either $1$ or $1/N$, respectively (for more details see Luenberger (1995)).

3. Comparison Principle Theorems

In this section we recall the comparison principle theorems to be used for proving that the strategies of the players would guarantee either capture of all evaders or their evasion from the pursuers. The following theorem [Lakshmikantham et al. (1989)] will be used in proving guaranteed capture results:

Theorem 3.1. Let $v \in C[\mathbb{R}_+ \times \mathbb{R}^n, \mathbb{R}_+]$ such that $v(t, x)$ is locally Lipschitzian in $x$. Assume that $G \in C[\mathbb{R}_+ \times \mathbb{R}^n \times \mathbb{R}_+, \mathbb{R}]$ and for $(t, x) \in \mathbb{R}_+ \times \mathbb{R}^n$,

$$Dv(t, x) \le G(t, x, v(t, x)). \tag{15}$$

Let $x(t) = x(t, t_0, x_0)$ be any solution of $\dot{x} = f(t, x)$, $x(t_0) = x_0$, $t_0 \in \mathbb{R}_+$, where $f \in C[\mathbb{R}_+ \times \mathbb{R}^n, \mathbb{R}^n]$, existing on $[t_0, \infty)$. Also, let us assume that $r(t, t_0, x_0, u_0)$ is the maximal solution of

$$\dot{u} = G(t, x(t), u), \quad u(t_0) = u_0 \tag{16}$$

existing for $t \ge t_0$. Then $v(t_0, x_0) \le u_0$ implies

$$v(t, x(t)) \le r(t, t_0, x_0, u_0), \quad t \ge t_0. \tag{17}$$

In the formulation of Theorem 3.1, $D$ represents any Dini derivative [Bainov and Simeonov (1992)], yet we assume that $v(t, x)$ is continuously differentiable in the domains of interest so that all Dini derivatives coincide with the standard total time derivative $d/dt$. Also, we use $dx/dt = \dot{x}$ when the function $x(\cdot)$ is only a function of time, that is, $x \equiv x(t)$. Finally, $\mathbb{R}$ denotes the set of real numbers, $\mathbb{R}_+ = [0, \infty)$, and $C[D_1, D_2]$ denotes the set of all continuous functions with domain $D_1$ and codomain $D_2$ [Lakshmikantham and Leela (1969)].
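As a small numerical illustration of Theorem 3.1 (an example that is not taken from the paper), consider the hypothetical scalar system $\dot{x} = -x - x^3$ with $v(x) = x^2$. Then $dv/dt = -2x^2 - 2x^4 \le -2v$, so one admissible choice is $G(t, x, u) = -2u$, whose solution from $u_0 = v(x_0)$ at $t_0 = 0$ is $r(t) = u_0 e^{-2t}$. The Python sketch below integrates the system with a classical Runge-Kutta step and records the smallest value of $r(t) - v(x(t))$ along the trajectory. The system, the function names and the step size are illustrative assumptions.

```python
import math


def f(x):
    # Hypothetical scalar dynamics x' = -x - x^3 (not from the paper).
    return -x - x ** 3


def rk4_step(x, dt):
    # One classical fourth-order Runge-Kutta step for x' = f(x).
    k1 = f(x)
    k2 = f(x + 0.5 * dt * k1)
    k3 = f(x + 0.5 * dt * k2)
    k4 = f(x + dt * k3)
    return x + dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0


def worst_gap(x0=1.5, dt=0.01, t_end=3.0):
    # Smallest value of r(t) - v(x(t)) on the time grid, where
    # v(x) = x^2 and r(t) = v(x0) * exp(-2 t) solves u' = -2 u, u(0) = v(x0).
    u0 = x0 * x0
    x = x0
    gap = float("inf")
    for k in range(round(t_end / dt) + 1):
        t = k * dt
        gap = min(gap, u0 * math.exp(-2.0 * t) - x * x)
        x = rk4_step(x, dt)
    return gap


if __name__ == "__main__":
    print("smallest r(t) - v(x(t)) on the grid:", worst_gap())
```

A nonnegative result (up to rounding) is consistent with inequality (17) for this particular example; it is a sanity check on one trajectory, not a proof.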
For more details on notation and comparison results we refer to Lakshmikantham and Leela (1969, 1989) and Bainov and Simeonov (1992).

In order to establish guaranteed evasion results we need the following straightforward modification of Theorem 3.1, which is justified along the lines of the basic arguments provided in Lakshmikantham and Leela (1969) and Bainov and Simeonov (1992).

Theorem 3.2. Let $v \in C[\mathbb{R}_+ \times \mathbb{R}^n, \mathbb{R}_+]$ such that $v(t, x)$ is locally Lipschitzian in $x$. Assume that $g \in C[\mathbb{R}_+ \times \mathbb{R}^n \times \mathbb{R}_+, \mathbb{R}]$ and for $(t, x) \in \mathbb{R}_+ \times \mathbb{R}^n$,

$$Dv(t, x) \ge g(t, x, v(t, x)). \tag{18}$$

Let $x(t) = x(t, t_0, x_0)$ be any solution of $\dot{x} = f(t, x)$, $x(t_0) = x_0$, $t_0 \in \mathbb{R}_+$, where $f \in C[\mathbb{R}_+ \times \mathbb{R}^n, \mathbb{R}^n]$, existing on $[t_0, \infty)$. Also, let us assume that $z(t, t_0, x_0, q_0)$ is the minimal solution of

$$\dot{q} = g(t, x(t), q), \quad q(t_0) = q_0 \tag{19}$$

existing for $t \ge t_0$. Then $v(t_0, x_0) \ge q_0$ implies

$$v(t, x(t)) \ge z(t, t_0, x_0, q_0), \quad t \ge t_0. \tag{20}$$

4. Differential Inequalities and Pursuit-Evasion Games

Let us assign $e_i \in \mathbb{R}^{n_i}$, $i \in \{1, \ldots, N_e\}$, to be a vector of all state variables corresponding to the $i$-th evader, where $N_e$ denotes the total number of evaders. Similarly, let us assign $p_j \in \mathbb{R}^{n_j}$, $j \in \{1, \ldots, N_p\}$, to be a vector of all state variables corresponding to the $j$-th pursuer, where $N_p$ denotes the total number of pursuers. In order to simplify the notation let us concatenate all the individual state vectors into the vectors $e = [e_1^T, \ldots, e_{N_e}^T]^T$ and $p = [p_1^T, \ldots, p_{N_p}^T]^T$. The dimensions of these two vectors are determined by the dimensions of the players’ individual state vectors.
Let us assume that the evaders’ dynamics are given in compact form as

$$\dot{e} = f_e(e, u_e) \tag{21}$$

and similarly that the pursuers’ dynamics are given by

$$\dot{p} = f_p(p, u_p) \tag{22}$$

where $u_e$ and $u_p$ represent the evaders’ and pursuers’ input strategies, respectively. In order to generalize the results presented in Stipanović et al. (2009) we start by considering the following function:

$$\phi_\delta^i(e_i, p) = \overline{\sigma}_\delta(\|\bar{e}_i - \bar{p}_1\|, \ldots, \|\bar{e}_i - \bar{p}_{N_p}\|) \tag{23}$$

where $\bar{e}_i$ and $\bar{p}_j$ represent $n$-dimensional rectangular coordinates in the corresponding $n$-dimensional space for the $i$-th evader and the $j$-th pursuer, respectively. Obviously the state variables in $\bar{e}_i$ and $\bar{p}_j$ are subsets of the state variables in $e_i$ and $p_j$, respectively. Without loss of generality and to simplify the notation, we introduce the functions $\phi_\delta^i(\cdot, \cdot)$, $i \in \{1, \ldots, N_e\}$, as functions of $e_i$ and $p$. Furthermore, we define

$$\pi_\delta(e, p) = \overline{\rho}_\delta(\phi_\delta^1(e_1, p), \ldots, \phi_\delta^{N_e}(e_{N_e}, p)). \tag{24}$$

Now, we can define the following function:

$$v(e, p) = \pi_\delta(e, p) \tag{25}$$

and compute the corresponding strategies as follows:

$$\hat{u}_e(e, p) = \arg\max_{u_e \in U_e} \frac{dv}{dt} = \arg\max_{u_e \in U_e} \left\{ \frac{\partial v(e, p)}{\partial e} f_e(e, u_e) \right\}, \qquad \hat{u}_p(e, p) = \arg\min_{u_p \in U_p} \frac{dv}{dt} = \arg\min_{u_p \in U_p} \left\{ \frac{\partial v(e, p)}{\partial p} f_p(p, u_p) \right\} \tag{26}$$

where $U_e$ and $U_p$ represent admissible classes of functions for the evaders’ and pursuers’ strategies, respectively. To streamline our presentation we assume that the classes of admissible functions are such that the objective function that is either minimized or maximized determines the arguments for the solution function. Therefore, the fact that $\frac{\partial v(e, p)}{\partial e} f_e(e, u_e)$ depends on $e$ and $p$ implies that the solution $\hat{u}_e(\cdot)$ is also a function of $e$ and $p$. One of the most general examples is a class of piecewise continuous functions that are norm bounded.
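To make the construction in (23)–(25) concrete, the following Python sketch implements the four approximation functions (1), (2), (9), (10) and evaluates the composite function for a hypothetical planar configuration of two evaders and three pursuers. The positions, the function names, and the helper `worst_case_distance` are assumptions introduced only for illustration; $\delta = 3$ matches the value used in the simulations of Sec. 5.

```python
import math


def sigma_lower(a, delta):
    # Lower approximation of min(a), Eq. (1).
    return sum(x ** (-delta) for x in a) ** (-1.0 / delta)


def sigma_upper(a, delta):
    # Upper approximation of min(a), Eq. (2).
    return (len(a) / sum(x ** (-delta) for x in a)) ** (1.0 / delta)


def rho_lower(a, delta):
    # Lower approximation of max(a), Eq. (9).
    return (sum(x ** delta for x in a) / len(a)) ** (1.0 / delta)


def rho_upper(a, delta):
    # Upper approximation of max(a), Eq. (10).
    return sum(x ** delta for x in a) ** (1.0 / delta)


def capture_function(evaders, pursuers, delta):
    # Composite function (24)-(25): the upper max-approximation of the
    # upper min-approximations (23) of the evader-to-pursuer distances.
    phi = [
        sigma_upper([math.dist(e, p) for p in pursuers], delta)
        for e in evaders
    ]
    return rho_upper(phi, delta)


def worst_case_distance(evaders, pursuers):
    # Exact quantity being over-approximated: the largest, over evaders,
    # of the distance to the nearest pursuer.
    return max(min(math.dist(e, p) for p in pursuers) for e in evaders)


# Hypothetical positions (not the initial conditions used in the paper).
evaders = [(0.0, 20.0), (10.0, 20.0)]
pursuers = [(-5.0, 5.0), (5.0, 5.0), (15.0, 5.0)]
delta = 3.0

v = capture_function(evaders, pursuers, delta)
d = worst_case_distance(evaders, pursuers)

if __name__ == "__main__":
    print("v(e, p) =", v, " max-min distance =", d)
```

Because both layers approximate from above, $v(e, p)$ is never smaller than the largest evader-to-nearest-pursuer distance, which is why driving $v$ below $R$ forces every evader within distance $R$ of some pursuer.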
The construction of strategies follows the main ideas of the design of controllers based on Liapunov functions (for more details see [Khalil (2002), Bacciotti and Rosier (2005), Blanchini and Miani (2008)]). In order to use the comparison principle we bound from above the total time derivative of the function $v(e, p)$ as:

$$\frac{dv(e, p)}{dt} = \frac{\partial v(e, p)}{\partial e} f_e(e, \hat{u}_e(e, p)) + \frac{\partial v(e, p)}{\partial p} f_p(p, \hat{u}_p(e, p)) \le G(e, p, v(e, p)) \tag{27}$$

where $G(\cdot, \cdot, \cdot)$ is a scalar continuous function of its arguments and $\hat{u}_e(e, p)$ and $\hat{u}_p(e, p)$ are respectively collections of the evaders’ and pursuers’ strategies that maximize or minimize the time derivative, that is, the growth, of the function $v(e, p)$. Again, $f_j(j, \hat{u}_j(e, p))$, $j \in \{e, p\}$, represent the collective dynamics of the evaders (when $j = e$) and the pursuers (when $j = p$) for the previously defined collective strategies. By defining the capture of an evader to be accomplished whenever its Euclidean distance to any of the pursuers becomes less than a prescribed positive number $R$ (also known as the “soft capture”) we state the following theorem:

Theorem 4.1. Assume that the initial conditions $e_0 = e(t_0)$ and $p_0 = p(t_0)$ at the initial time $t_0$ are such that the players are outside of the capture regions defined by a positive number $R$, and that the maximal solution (as defined in Lakshmikantham and Leela (1969)) of the following differential equation:

$$\frac{dw}{dt} = G(e(t), p(t), w), \quad w(t_0) = w_0 = v(e_0, p_0) \tag{28}$$

is denoted as $\bar{w}(t, t_0, e_0, p_0, w_0)$ along the trajectories of the players’ dynamic systems for the collections of their strategies $\hat{u}_e(e, p)$ and $\hat{u}_p(e, p)$. Then, if the pursuers use the collective strategies given in vector form as $\hat{u}_p(e, p)$, the capture of all evaders is guaranteed within a finite time interval $T - t_0$ after the initial time, for any feedback strategies of the evaders, if $\bar{w}(T, t_0, e_0, p_0, w_0) < R$.

Proof. The assumption of the theorem is that the pursuers choose their strategies to be $\hat{u}_p(e, p)$. Then, if the evaders choose any strategy $\bar{u}_e(e, p) \in U_e$, from (26) and (27) it follows that:

$$\left. \frac{dv(e, p)}{dt} \right|_{\substack{u_e = \bar{u}_e(e, p) \\ u_p = \hat{u}_p(e, p)}} \le \left. \frac{dv(e, p)}{dt} \right|_{\substack{u_e = \hat{u}_e(e, p) \\ u_p = \hat{u}_p(e, p)}} \le G(e, p, v(e, p)). \tag{29}$$

Thus, for any strategy $\bar{u}_e(e, p) \in U_e$, we obtain the differential inequality (29) and using the comparison principle we obtain:

$$v(e(t), p(t)) \le \bar{w}(t, t_0, e_0, p_0, w_0), \quad t \ge t_0. \tag{30}$$

From (11) and (24) it follows that:

$$\max\{\phi_\delta^1(e_1(t), p(t)), \ldots, \phi_\delta^{N_e}(e_{N_e}(t), p(t))\} \le v(e(t), p(t)) \tag{31}$$

which implies

$$\phi_\delta^i(e_i(t), p(t)) \le v(e(t), p(t)) \tag{32}$$

for all $i \in \{1, \ldots, N_e\}$. Then, using Eqs. (3) and (23) we obtain

$$\min\{\|\bar{e}_i(t) - \bar{p}_1(t)\|, \ldots, \|\bar{e}_i(t) - \bar{p}_{N_p}(t)\|\} \le v(e(t), p(t)), \quad t \ge t_0, \tag{33}$$

for all $i \in \{1, \ldots, N_e\}$. Finally, from inequalities (30) and (33) and the assumption of the theorem, we obtain

$$\min\{\|\bar{e}_i(T) - \bar{p}_1(T)\|, \ldots, \|\bar{e}_i(T) - \bar{p}_{N_p}(T)\|\} < R \tag{34}$$

for all $i \in \{1, \ldots, N_e\}$, which is a guarantee that all evaders will be captured before or at time $T$, and thus the theorem is proved.

Now, let us first consider the problem of evasion of a single evader $i$, $i \in \{1, \ldots, N_e\}$, from the pursuers. In order to do so we first assume that the $i$-th evader’s dynamics is given by

$$\dot{e}_i = f_e^i(e_i, u_e^i). \tag{35}$$

To obtain strategies for the guaranteed evasion we consider the following approximation function

$$\eta_\delta^i(e_i, p) = \underline{\sigma}_\delta(\|\bar{e}_i - \bar{p}_1\|, \ldots, \|\bar{e}_i - \bar{p}_{N_p}\|) \tag{36}$$

and define $v_i(\cdot, \cdot)$ as

$$v_i(e_i, p) = \eta_\delta^i(e_i, p). \tag{37}$$

Similarly to the case of guaranteed capture, let us again bound the total time derivative of $v_i(e_i, p)$, yet in this case the bound will be from below:

$$\frac{dv_i(e_i, p)}{dt} = \frac{\partial v_i(e_i, p)}{\partial e_i} f_e^i(e_i, \tilde{u}_e^i(e_i, p)) + \frac{\partial v_i(e_i, p)}{\partial p} f_p(p, \tilde{u}_p^i(e_i, p)) \ge g_i(e_i, p, v_i(e_i, p)) \tag{38}$$

where the strategies are computed from:

$$\tilde{u}_e^i(e_i, p) = \arg\max_{u_e^i \in U_e^i} \frac{dv_i}{dt} = \arg\max_{u_e^i \in U_e^i} \left\{ \frac{\partial v_i(e_i, p)}{\partial e_i} f_e^i(e_i, u_e^i) \right\}, \qquad \tilde{u}_p^i(e_i, p) = \arg\min_{u_p \in U_p} \frac{dv_i}{dt} = \arg\min_{u_p \in U_p} \left\{ \frac{\partial v_i(e_i, p)}{\partial p} f_p(p, u_p) \right\} \tag{39}$$

and $g_i(\cdot, \cdot, \cdot)$ is a scalar continuous function of its arguments.

Theorem 4.2. Assume that the initial conditions $e_{i0} = e_i(t_0)$ and $p_0 = p(t_0)$ at the initial time $t_0$ are such that the players are outside of the region defined by the set $\{(e_i, p) : v_i(e_i, p) \le R\}$ (notice that this is an over or outer approximation of the soft capture set) and that the minimal solution (as defined in [Lakshmikantham and Leela (1969)]) of the following differential equation:

$$\frac{dz_i}{dt} = g_i(e_i(t), p(t), z_i), \quad z_i(t_0) = z_{i0} = v_i(e_{i0}, p_0) \tag{40}$$

is denoted as $\underline{z}_i(t, t_0, e_{i0}, p_0, z_{i0})$ along the trajectories of the players’ dynamic systems for the collections of their strategies $\tilde{u}_e^i(e_i, p)$ and $\tilde{u}_p^i(e_i, p)$. Then the evasion of the $i$-th evader is guaranteed for any admissible composite strategy of the pursuers if $\underline{z}_i(t, t_0, e_{i0}, p_0, z_{i0}) > R$, for all $t \ge t_0$.

Proof. The assumption of the theorem is that the $i$-th evader chooses its strategy to be $\tilde{u}_e^i(e_i, p)$. Then, if the pursuers choose any collective strategy $\bar{u}_p(e_i, p) \in U_p$, from (38) and (39) it follows that:

$$\left. \frac{dv_i(e_i, p)}{dt} \right|_{\substack{u_e^i = \tilde{u}_e^i(e_i, p) \\ u_p = \bar{u}_p(e_i, p)}} \ge \left. \frac{dv_i(e_i, p)}{dt} \right|_{\substack{u_e^i = \tilde{u}_e^i(e_i, p) \\ u_p = \tilde{u}_p^i(e_i, p)}} \ge g_i(e_i, p, v_i(e_i, p)). \tag{41}$$

Thus, for any strategy $\bar{u}_p(e_i, p) \in U_p$, we obtain the differential inequality (41) and using the comparison principle we obtain:

$$v_i(e_i(t), p(t)) \ge \underline{z}_i(t, t_0, e_{i0}, p_0, z_{i0}), \quad t \ge t_0. \tag{42}$$

From (3), (36) and (37), it follows that:

$$\min\{\|\bar{e}_i(t) - \bar{p}_1(t)\|, \ldots, \|\bar{e}_i(t) - \bar{p}_{N_p}(t)\|\} \ge v_i(e_i(t), p(t)), \quad t \ge t_0 \tag{43}$$

for all $i \in \{1, \ldots, N_e\}$. Finally, using Eqs. (42) and (43), and the assumption of the theorem, we obtain

$$\min\{\|\bar{e}_i(t) - \bar{p}_1(t)\|, \ldots, \|\bar{e}_i(t) - \bar{p}_{N_p}(t)\|\} \ge \underline{z}_i(t, t_0, e_{i0}, p_0, z_{i0}) > R, \quad t \ge t_0 \tag{44}$$

which is a guarantee that the $i$-th evader will never be captured by the pursuers. This completes the proof.

Now, let us construct cooperative strategies for the evaders that would guarantee that none of them will be captured by the pursuers. For the cooperative case we propose the following function:

$$\gamma_\delta(e, p) = \underline{\sigma}_\delta(\eta_\delta^1(e_1, p), \ldots, \eta_\delta^{N_e}(e_{N_e}, p)) \tag{45}$$

and define

$$v(e, p) = \gamma_\delta(e, p). \tag{46}$$

This is a cooperative case since $v(e, p)$ considers all evaders at the same time. Notice that due to the assumption that the players’ dynamics are independent, the cooperation is described through the function $v(e, p)$. Then, we proceed by defining the strategies for the players as:

$$\tilde{u}_e(e, p) = \arg\max_{u_e \in U_e} \frac{dv}{dt} = \arg\max_{u_e \in U_e} \left\{ \frac{\partial v(e, p)}{\partial e} f_e(e, u_e) \right\}, \qquad \tilde{u}_p(e, p) = \arg\min_{u_p \in U_p} \frac{dv}{dt} = \arg\min_{u_p \in U_p} \left\{ \frac{\partial v(e, p)}{\partial p} f_p(p, u_p) \right\} \tag{47}$$

where again $U_e$ and $U_p$ represent admissible classes of functions for the evaders’ and pursuers’ strategies, respectively. In order to use the comparison principle, we bound from below the total time derivative of the function $v(e, p)$ as

$$\frac{dv(e, p)}{dt} = \frac{\partial v(e, p)}{\partial e} f_e(e, \tilde{u}_e(e, p)) + \frac{\partial v(e, p)}{\partial p} f_p(p, \tilde{u}_p(e, p)) \ge g(e, p, v(e, p)) \tag{48}$$

where $g(\cdot, \cdot, \cdot)$ is a scalar continuous function of its arguments. Now, we are ready to formulate a theorem on the collective evasion as follows:

Theorem 4.3. Assume that the initial conditions $e_0 = e(t_0)$ and $p_0 = p(t_0)$ at the initial time $t_0$ are such that the players are outside of the region defined by the set $\{(e, p) : v(e, p) \le R\}$ (notice that this is an over or outer approximation of the soft capture set for any evader) and that the minimal solution (as defined in [Lakshmikantham and Leela (1969)]) of the following differential equation:

$$\frac{dz}{dt} = g(e(t), p(t), z), \quad z(t_0) = z_0 = v(e_0, p_0) \tag{49}$$

is denoted as $\underline{z}(t, t_0, e_0, p_0, z_0)$ along the trajectories of the players’ dynamic systems for the collections of their strategies $\tilde{u}_e(e, p)$ and $\tilde{u}_p(e, p)$. Then the evasion of all evaders is guaranteed for any admissible composite strategy of the pursuers if $\underline{z}(t, t_0, e_0, p_0, z_0) > R$, for all $t \ge t_0$.

Proof. The assumption of the theorem is that the evaders choose their collective strategy to be $\tilde{u}_e(e, p)$. Then, if the pursuers choose any collective strategy $\bar{u}_p(e, p) \in U_p$, from (47) and (48) it follows that:

$$\left. \frac{dv(e, p)}{dt} \right|_{\substack{u_e = \tilde{u}_e(e, p) \\ u_p = \bar{u}_p(e, p)}} \ge \left. \frac{dv(e, p)}{dt} \right|_{\substack{u_e = \tilde{u}_e(e, p) \\ u_p = \tilde{u}_p(e, p)}} \ge g(e, p, v(e, p)). \tag{50}$$

Thus, for any strategy $\bar{u}_p(e, p) \in U_p$, the differential inequality (50) is valid, and using the comparison principle we obtain

$$v(e(t), p(t)) \ge \underline{z}(t, t_0, e_0, p_0, z_0), \quad t \ge t_0. \tag{51}$$

From (3), (45) and (46) it follows that:

$$\min\{\eta_\delta^1(e_1(t), p(t)), \ldots, \eta_\delta^{N_e}(e_{N_e}(t), p(t))\} \ge v(e(t), p(t)), \quad t \ge t_0 \tag{52}$$

which implies

$$\eta_\delta^i(e_i(t), p(t)) \ge v(e(t), p(t)), \quad t \ge t_0, \tag{53}$$

for all $i \in \{1, \ldots, N_e\}$. Then, from Eqs. (3) and (36) we obtain

$$\min\{\|\bar{e}_i(t) - \bar{p}_1(t)\|, \ldots, \|\bar{e}_i(t) - \bar{p}_{N_p}(t)\|\} \ge v(e(t), p(t)), \quad t \ge t_0 \tag{54}$$

for all $i \in \{1, \ldots, N_e\}$. Inequalities (51) and (54) imply that for all $i \in \{1, \ldots, N_e\}$,

$$\min\{\|\bar{e}_i(t) - \bar{p}_1(t)\|, \ldots, \|\bar{e}_i(t) - \bar{p}_{N_p}(t)\|\} \ge \underline{z}(t, t_0, e_0, p_0, z_0) > R, \quad t \ge t_0 \tag{55}$$

which is a guarantee that none of the evaders will ever be captured by any of the pursuers. This completes the proof.

Notice that by defining the goals of the players in terms of trajectories of the system either reaching or not reaching the target sets, we can generalize the proposed methodology to be applicable to a larger class of dynamic games rather than pursuit-evasion games only.

5. Unicycle Players

In order to provide an illustration for designing players’ strategies using the proposed approach we consider pursuit-evasion games where the players are modelled using a nonholonomic nonlinear model also known as the unicycle [Spong et al. (2005)]. Let us assume that each player $i$, where the total number of players is denoted by $N$, that is, $i \in \mathbf{N} = \{1, \ldots, N\}$, is modelled using the unicycle model, that is,

$$\dot{X}_i = s_i \cos(\varphi_i), \qquad \dot{Y}_i = s_i \sin(\varphi_i), \qquad \dot{\varphi}_i = \omega_i \tag{56}$$

where $X_i$ and $Y_i$ represent planar rectangular coordinates and $\varphi_i$ represents the heading angle of the $i$-th player. Velocity, denoted by $s_i$, and angular velocity, denoted by $\omega_i$, are assumed to be norm bounded by $\mu_i$ and $\nu_i$, respectively.

Assume that the Liapunov-like function for the $i$-th player is given by:

$$\upsilon_i \equiv \upsilon_i(P_i, P^i), \quad P_i = [X_i, Y_i]^T, \quad P^i = \{P_j : j \ne i,\ j \in \mathbf{N}_i\} \tag{57}$$

where $\mathbf{N}_i$ is a set of players of interest to player $i$. A set of players of interest is defined based on the individual role of the player. For example, if a player is an evader then the set of interest would include the indices of all pursuers. A Liapunov-like function candidate associated with the $i$-th player, denoted by $\upsilon_i(\cdot, \cdot)$, is an appropriate approximation of the minimum or the maximum function depending on whether the $i$-th player is a pursuer or an evader.
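The unicycle model (56) with norm-bounded inputs can be simulated with a simple time-stepping scheme. The sketch below uses a forward Euler step and saturates the commanded velocity and angular velocity at $\mu_i$ and $\nu_i$. The integration scheme, the step size, and the function names are assumptions made for illustration; the paper does not specify how the simulations were discretized.

```python
import math


def clip(value, bound):
    # Saturate a scalar input so that |value| <= bound.
    return max(-bound, min(bound, value))


def unicycle_step(state, s, omega, mu, nu, dt):
    # One forward Euler step of the unicycle model (56).
    # state = (X, Y, phi); s and omega are the commanded velocity and
    # angular velocity, saturated at the bounds mu and nu.
    X, Y, phi = state
    s = clip(s, mu)
    omega = clip(omega, nu)
    return (
        X + dt * s * math.cos(phi),
        Y + dt * s * math.sin(phi),
        phi + dt * omega,
    )


if __name__ == "__main__":
    # Hypothetical player with mu = 1 and nu = 1, commanded to turn left
    # at full speed for one time unit.
    state = (0.0, 0.0, 0.0)
    for _ in range(100):
        state = unicycle_step(state, s=1.0, omega=1.0, mu=1.0, nu=1.0, dt=0.01)
    print(state)
```

A smaller step or a higher-order integrator could be substituted without changing the interface; only the saturation reflects the norm bounds stated with the model.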
Then, the control strategies for the pursuers, that is, the velocity and the angular velocity, are given by the following expression:

$$\begin{bmatrix} \hat{s}_i \\ \hat{\omega}_i \end{bmatrix} = \arg\min_{|s_i| \le \mu_i,\ |\omega_i| \le \nu_i} \frac{d\upsilon_i}{dt} = \arg\min_{|s_i| \le \mu_i,\ |\omega_i| \le \nu_i} \left\{ \left( \frac{\partial \upsilon_i}{\partial X_i} \cos(\varphi_i) + \frac{\partial \upsilon_i}{\partial Y_i} \sin(\varphi_i) \right) s_i \right\}. \tag{58}$$

From Eq. (58) it follows that the closed-form solution for the pursuers’ velocities is given by the following formula:

$$\hat{s}_i = -\mu_i\, \mathrm{sign}\left( \frac{\partial \upsilon_i}{\partial X_i} \cos(\varphi_i) + \frac{\partial \upsilon_i}{\partial Y_i} \sin(\varphi_i) \right) = -\mu_i\, \mathrm{sign}\left( \sin\left( \varphi_i + \arctan\left( \frac{\partial \upsilon_i / \partial X_i}{\partial \upsilon_i / \partial Y_i} \right) \right) \right), \tag{59}$$

where $\mathrm{sign}(\cdot)$ denotes the standard sign function. Similarly, for the evaders we consider the $\arg\max$ function in (58) and obtain the following formula:

$$\hat{s}_i = \mu_i\, \mathrm{sign}\left( \frac{\partial \upsilon_i}{\partial X_i} \cos(\varphi_i) + \frac{\partial \upsilon_i}{\partial Y_i} \sin(\varphi_i) \right) = \mu_i\, \mathrm{sign}\left( \sin\left( \varphi_i + \arctan\left( \frac{\partial \upsilon_i / \partial X_i}{\partial \upsilon_i / \partial Y_i} \right) \right) \right). \tag{60}$$

From Eqs. (59) and (60) it can be easily shown that the largest rate of decrease (for a pursuer) or increase (for an evader) of $\upsilon_i$ is achieved when the argument of the sine function equals $\pi/2$ (in both cases), which implies that the desired heading angle is given by

$$\varphi_i^{des} = \pi/2 - \arctan\left( \frac{\partial \upsilon_i / \partial X_i}{\partial \upsilon_i / \partial Y_i} \right). \tag{61}$$

Then, one possible norm bounded solution for the angular velocity strategy of the $i$-th player is to guide the player toward the desired heading angle as

$$\hat{\omega}_i = -\nu_i\, \mathrm{sign}(\varphi_i - \varphi_i^{des}). \tag{62}$$

Notice that the differential equation $\dot{\varphi}_i = \hat{\omega}_i$ is finite-time stable if $\varphi_i^{des}$ is a constant. It is interesting to note that the functional forms for the angular velocities are the same for all the players.

Fig. 1. Pursuit-evasion game for the first set of initial conditions with five nonidentical unicycle players (axes: X, Y; labelled trajectories: E1, E2, P3, P4, P5).

Finally, in order to illustrate the proposed design of players’ strategies let us consider a pursuit-evasion game with three pursuers and two evaders. The first scenario depicted in Fig. 1 shows two evaders with the initial conditions in the upper part of the figure and three pursuers with the initial conditions in the lower part of the figure. The two evaders start from the same initial $Y$-coordinate, and so do the three pursuers. The trajectories for the evaders appear as thinner lines while the trajectories for the pursuers appear as thicker lines. We will refer to the evaders as players one and two and denote them as $E_1$ and $E_2$ (as shown in Fig. 1). Similarly, we will refer to the three pursuers as players three, four and five and denote them as $P_3$, $P_4$ and $P_5$, respectively. This is also depicted in Fig. 1 by placing the corresponding labels next to the players’ trajectories. By doing so, we assign subscript indices 1 and 2 to players which are evaders and indices 3, 4 and 5 to players which are pursuers. In the normalized units the bounds for the velocities of the players are $\mu_1 = \mu_2 = \mu_3 = 1$, $\mu_4 = \mu_5 = 2$ and $\nu_i = 1$, for all $i \in \{1, 2, 3, 4, 5\}$. The strategies are computed using equation (26), that is, Eqs. (58)–(62), and the composite function (25) (which is the same for all players) with $\delta = 3$. In Fig. 1, we can see that players three and four (that is, pursuers $P_3$ and $P_4$) pursue player one (that is, evader $E_1$) and that player five (that is, pursuer $P_5$) pursues player two (that is, evader $E_2$). Since players four and five are the fastest they will capture players one and two. We do not provide closed-form solutions for the players’ strategies and the time derivative of the composite function since the derivation is straightforward using Eqs. (25), (58)–(62), and the equations are long and cumbersome.

Fig. 2. Pursuit-evasion game for the second set of initial conditions with five nonidentical unicycle players: trajectories at the beginning of the pursuit (axes: X, Y; labelled trajectories: E1, E2, P3, P4, P5).

Fig. 3. Pursuit-evasion game for the second set of initial conditions with five nonidentical unicycle players: trajectories over the whole time horizon (axes: X, Y; labelled trajectories: E1, E2, P3, P4, P5).

A more complex example is depicted in Figs. 2 and 3, where we changed the initial conditions only slightly (so as not to change the numbering order of the players) and set the bounds on the velocities to $\mu_1 = 5$, $\mu_2 = 2$, $\mu_3 = 3$, $\mu_4 = 1$, $\mu_5 = 6$, $\nu_1 = \nu_3 = \nu_5 = 2$, and $\nu_2 = \nu_4 = 1$. The parameter $\delta = 3$ remained unchanged, as did all the equations used to compute the players’ strategies. It is interesting that the initial strategy of player three (that is, pursuer $P_3$) is to pursue player one (that is, evader $E_1$) as depicted in Fig. 2, yet after a while it turns and starts pursuing player two (that is, evader $E_2$) as shown in Fig. 3, since player four (that is, pursuer $P_4$) is too slow for player two. Player five (that is, pursuer $P_5$), as the fastest player, pursues from the very beginning player one, who is the faster of the two evaders. These initial simulation results show the capabilities of the proposed strategies, yet many open questions remain to be addressed in our future work. Some of the immediate issues to be considered are to include delays and to study robustness properties of the methodology, as well as the possibility that the players update their information only at discrete-time instances.

6. Conclusion

In this paper we provide a methodology for designing strategies for the players that guarantee either capture or evasion of all or some evaders in multi-player pursuit-evasion games.
The players’ dynamics are represented by nonlinear models, and the sufficient conditions are formulated using a Liapunov-type analysis based on the comparison principle and differential inequalities, applied to differentiable functions that are convergent approximations of the minimum and maximum functions.

Acknowledgments

This work has been supported by the Alexander von Humboldt Foundation, Bonn, Germany. The first author would also like to thank Mr. Juan S. Mejía for his help in producing the Matlab code used to obtain the simulation results provided in the paper.

References

Bacciotti, A. and Rosier, L. [2005] Liapunov Functions and Stability in Control Theory, 2nd edn., Springer-Verlag, Berlin, Germany.
Bainov, D. and Simeonov, P. [1992] Integral inequalities and applications, Mathematics and Its Applications, Vol. 57, Kluwer Academic Publishers, Dordrecht, The Netherlands.
Bardi, M. and Capuzzo-Dolcetta, I. [1997] Optimal Control and Viscosity Solutions of Hamilton-Jacobi-Bellman Equations, Birkhäuser, Boston, MA.
Blanchini, F. and Miani, S. [2008] Set-Theoretic Methods in Control, Birkhäuser, Boston, MA.
Breakwell, J. V. and Hagedorn, P. [1979] Point capture of two evaders in succession, Journal of Optimization Theory and Applications 27, 89–97.
Chernousko, F. L. and Melikyan, A. A. [1975] Some differential games with incomplete information, Lecture Notes in Computer Science, Vol. 27, Springer-Verlag, Berlin, pp. 445–450.
Crandall, M. G. and Lions, P.-L. [1983] Viscosity solutions of Hamilton-Jacobi equations, Transactions of American Mathematical Society 277, 1–42.
Falcone, M. and Ferretti, R. [1994] Discrete time high-order schemes for viscosity solutions of Hamilton-Jacobi-Bellman equations, Numerische Mathematik 67, 315–344.
Hagedorn, P. and Breakwell, J. V. [1976] A differential game with two pursuers and one evader, Journal of Optimization Theory and Applications 18, 15–29.
Isaacs, R. [1965] Differential Games: A Mathematical Theory With Applications to Warfare and Pursuit, Control and Optimization, John Wiley and Sons, Inc., New York, NY.
Khalil, H. K. [2002] Nonlinear Systems, 3rd edn., Prentice Hall, Upper Saddle River, NJ.
Krasovskii, N. N. and Subbotin, A. I. [1988] Game-Theoretical Control Problems, Springer-Verlag, New York, NY.
Lakshmikantham, V. and Leela, S. [1969] Differential and integral inequalities: Theory and applications, Mathematics in Science and Engineering, Vol. 55, Academic Press, New York, NY.
Lakshmikantham, V., Leela, S. and Martinyuk, A. A. [1989] Stability Analysis of Nonlinear Systems, Marcel Dekker, New York, NY.
Levchenkov, A. Y. and Pashkov, A. G. [1990] Differential game of optimal approach of two inertial pursuers to a noninertial evader, Journal of Optimization Theory and Applications 65, 501–518.
Luenberger, D. G. [1995] Microeconomic Theory, McGraw-Hill, Inc., New York, NY.
Melikyan, A. A. [1973] On minimal observations in a game of encounter (in Russian), PMM 37(3), 407–414.
Melikyan, A. A. [1981] Optimal interaction of two pursuers in a game problem (in Russian), Tekhnicheskaya Kibernetika (2), 49–56.
Melikyan, A. A. [1998] Generalized Characteristics of First Order PDEs, Birkhäuser, Boston, MA.
Melikyan, A. A., Hovakimyan, N. H. and Harutyunyan, L. [1998] Games of simple pursuit and approach on two-dimensional cone, Journal of Optimization Theory and Applications 98(3), 515–543.
Melikyan, A. A. and Pourtallier, A. [1996] Games with several pursuers and one evader with discrete observations, Game Theory and Applications (Petrosjan, L. A. and Mazalov, V. V. eds.), Vol. 2, Nova Science Publishers, New York, NY, pp. 169–184.
Olsder, G. J. and Pourtallier, O.
[1995] Optimal selection of observation times in a costly information game, New Trends in Dynamic Games and Applications (Olsder, G. J. ed.), Annals of the International Society of Dynamic Games, Vol. 3, Birkhäuser, Boston, MA, pp. 227–246.
Pashkov, A. G. and Terekhov, S. D. [1987] A differential game of approach with two pursuers and one evader, Journal of Optimization Theory and Applications 55, 303–311.
Petrosjan, L. A. [1993] Differential Games of Pursuit, Series on Optimization, Vol. 2, World Scientific, Singapore.
Petrov, N. N. [1994] Existence of the value of a many-person game of pursuit, Journal of Applied Mathematics and Mechanics 58(4), 593–600.
Petrov, N. N. [2003] “Soft” capture in Pontryagin’s example with many participants, Journal of Applied Mathematics and Mechanics 67(5), 671–680.
Pontryagin, L. S., Boltyanskii, V. G., Gamkrelidze, R. V. and Mishchenko, E. F. [1962] The Mathematical Theory of Optimal Processes, Interscience Publishers, New York, NY.
Spong, M. W., Hutchinson, S. and Vidyasagar, M. [2005] Robot Modeling and Control, John Wiley & Sons, Hoboken, NJ.
Stipanović, D. M., Melikyan, A. and Hovakimyan, N. [2009] Some sufficient conditions for multi-player pursuit-evasion games with continuous and discrete observations, Annals of Dynamic Games 10, 133–145.
Stipanović, D. M., Sriram and Tomlin, C. J. [2004] Strategies for agents in multi-player pursuit-evasion games, Proceedings of the Eleventh International Symposium on Dynamic Games and Applications (Tucson, Arizona).
Subbotin, A. [1984] Generalization of the main equation of differential game theory, Journal of Optimization Theory and Applications 43, 103–133.
Subbotin, A. I.
[1995] Generalized Solutions of First-Order PDEs: The Dynamical Optimization Perspective, Birkhäuser, Boston, MA.
Vagin, D. A. and Petrov, N. N. [2002] A problem of group pursuit with phase constraints, Journal of Applied Mathematics and Mechanics 66(2), 225–232.