DISCRETE AND CONTINUOUS DYNAMICAL SYSTEMS Volume 35, Number 9, September 2015

doi:10.3934/dcds.2015.35.xx pp. X–XX

CONTROL OF DYNAMICAL SYSTEMS WITH DISCRETE AND UNCERTAIN OBSERVATIONS

Aleksandar Zatezalo
Scientific Systems Company, Inc.
500 West Cummings Park, Suite 3000
Woburn, MA 01801, USA

Dušan M. Stipanović
Coordinated Science Laboratory and the Department of Industrial and Enterprise Systems Engineering
University of Illinois at Urbana-Champaign
Urbana, IL 61801, USA

Abstract. In this paper we provide a design methodology for computing control strategies for a primary dynamical system that operates in a domain shared with other dynamical systems, where the interactions between these systems and the primary one are either of interest or being actively pursued. Information about the other systems is available to the primary dynamical system only at discrete time instances and is assumed to be corrupted by noise. Having only this limited and noisy information, the primary system has to make decisions based on the estimated behavior of the other systems, which may range from cooperative to noncooperative. Each decision is reflected in the design of the most appropriate action, that is, the control strategy of the primary system. The design is illustrated on several collision avoidance problem scenarios.

2010 Mathematics Subject Classification. Primary: 37N35, 93E03; Secondary: 93E15, 93C10.
Key words and phrases. Game theory, control theory, stochastic processes, robust control, estimation, discrete observations.

1. Introduction. Designing continuous-time control strategies for dynamical systems, based on information that is updated only at discrete time instances, is still very much an open problem. Its technical complexity partially stems from the fact that combining continuous-time dynamics with discrete-time information updates requires a merger of these two domains. On the other hand, the problem is highly motivated by applications in which the information about the surrounding world that is available to the primary dynamical system is often updated only at discrete time instances. Some of the pioneering work in this area is cited and/or provided in [18], where the control strategies were assumed to be piecewise constant and updated only at discrete time instances. The problem becomes even more complicated if one is interested in finding how often the information needs to be updated when the control design is driven by an objective to be satisfied in an optimal or some other way. This particular issue was treated for a variety of scenarios in [20, 7, 22, 21] involving multiple interacting dynamical systems falling into the category of dynamic (or differential) game problems [1, 23]. These works treated


particular scenarios, and the solutions were provided under specific assumptions on the systems' dynamics and their objectives. The main goal of this article is to derive control strategies for a primary dynamical system given discrete and uncertain (that is, noisy) observations of other dynamical systems that the primary system is either interacting with, depending on (such as situations of intersecting trajectories, which may be treated as collisions), or interested in (such as a pursuer trying to capture an evader). We assume that the discrete times at which new observations become available are given, yet we generalize the problem by assuming that the observations are corrupted by noise. Therefore the primary system has to estimate the set of possible decisions that the other systems are pursuing, which govern the other systems' dynamics and consequently influence the primary system's dynamics. The challenge, given our setup in which the information is updated at discrete time instants, is that changes in the dynamic behavior can and do occur between those discrete observations. Other challenges in the derivation of estimation methods to predict the changes include uncertainties in the other systems' dynamics, possible state and observation model nonlinearities, and the existence of noise in the measurements and/or observations. Of course, this is closely related to the problem of controlling dynamical systems with stochastic perturbations, which on its own constitutes a challenging class of problems (for more details see, for example, [31, 12, 9]).

In particular, for the observed (other) dynamical systems we consider Itô's stochastic equation, including systems' dynamics with discrete controls or decisions, in the following form:
$$dx_t = b(t, x_t, z_t)\,dt + \sigma(t, x_t, z_t)\,dw_t,$$
where $b(\cdot,\cdot,\cdot)$ and $\sigma(\cdot,\cdot,\cdot)$ are functions satisfying appropriate conditions, $w_t$, $t \ge 0$, is the standard Wiener process, and $z_t$, $t \ge 0$, is a time-nonhomogeneous finite-state Markov process taking values in some finite set $Z$, whose trajectories are right-continuous on $[0,\infty)$ and have finite left limits on $(0,\infty)$; that is, the process $z_t$ is càdlàg. The Markov process $z_t$, $t \ge 0$, models decisions that govern the systems' dynamic behavior. The other systems' overall state trajectory is denoted by $x_t$, $t \ge 0$.

In addition, we assume that the primary dynamical system has states governed by an ordinary differential equation of the form
$$\frac{d\tilde x_t}{dt} = B(t, \tilde x_t, u_t),$$
where $B(\cdot,\cdot,\cdot)$ is an appropriate function and $u_t$ is its control strategy/law. The state values over time, $\tilde x_t$, $t \ge 0$, characterize the vehicle system, which is driven by its control law $u_t$; this law has to be determined based on an estimate of the behavior of the other dynamical systems' states (again denoted by $x_t$). Again, we assume that the other systems' states $x_t$ are observed at discrete times $t_k$, $k = 1, 2, \ldots$, and therefore the observations are modeled as
$$\tilde y_k = h(x_{t_k}) + \nu_k,$$
where the function $h(\cdot)$ models the observation process and the stochastic observation error $\nu_k$ is considered to be, for example, white noise.

Given the σ-algebra of all observations $\tilde y_s$ up to time $t$, denoted by $Y_t$, we want to estimate the discrete decision state $z_t$ and possibly the current trajectory $x_t$; that is, for


$t \ge s \ge 0$ and $z \in Z$, we want to estimate
$$P(z_t = z \mid Y_s)$$
and, for $\bar x_t = (x_t, z_t)^T$, where $v^T$ denotes the transpose of the vector $v$, we want, if possible, to estimate
$$P(\bar x_t \mid Y_s),$$
which is an extension that gives an estimate of the trajectory $x_t$ for an estimated decision $z_t$ and therefore allows for better anticipation of, and a possible reaction of the primary system to, that particular trajectory of the overall state of the other systems. Moreover, for a given payoff function $v(t, \cdot \mid Y_s)$, $t \ge s \ge 0$, we want to derive effective control strategies that drive the dynamics $\tilde x_t$ of the primary system so as to maximize the values of the payoff function. In particular, we propose to compute the control strategies based on particular approximations (the details are provided in Section 4) of the following maximization (whose solution may not be unique):
$$\hat u(t, \tilde x_t) = \arg\max_{u(\cdot) \in \mathcal{U}} v(t, \tilde x_t \mid Y_s),$$

where $\mathcal{U}$ is a class of admissible control functions or strategies. Some standard classes of admissible functions for control strategies can be found in [5].

The paper is organized as follows. In Section 2, we introduce some auxiliary results on time-nonhomogeneous finite-state càdlàg Markov processes, mostly based on the previous work provided in [16]. Modeling of the other systems' dynamics using Itô differential equations is provided in Section 3. The details of the control strategy design procedure are given in Section 4. In Section 5, the proposed control design is applied to a variety of collision avoidance scenarios [32] motivated by recent developments in space applications [33], [34]. Finally, some concluding remarks are provided in Section 6.

2. Time-nonhomogeneous finite-state Markov processes. We start this section by introducing some preliminary assumptions and notation. Let $(\Omega, \mathcal{F}, P)$ be a probability space, $Z$ a finite set, and $z_t$, $t \ge 0$, a Markov process with values in $Z$ and càdlàg trajectories, such that there are functions $\lambda^{uv}(t)$, defined for $u, v \in Z$, $t \ge 0$, which satisfy
$$\int_0^t \left|\lambda^{uv}(r)\right| dr < \infty, \quad t \in [0, \infty), \quad u, v \in Z.$$

Furthermore, for any $z \in Z$, the process
$$I_{z_t = z} - \int_0^t \lambda^{z_s z}(s)\, ds \tag{1}$$
is an $\mathcal{F}^z_t$ martingale, where $\mathcal{F}^z_t$ is the complete σ-algebra generated by the sets $\{z_s = v\}$, $s \in [0, t]$, $v \in Z$, and, for $A = \{\omega \in \Omega : z_t(\omega) = z\} \in \mathcal{F}$, $I_{z_t = z} \equiv I_A$ is a random function such that $I_A(\omega) = 1$ for $\omega \in A$ and zero otherwise. Following the derivations provided in [16] (in particular, see Remark 2.2 on page 232), for $u, v \in Z$ and $0 \le s \le t$, the transition probabilities $p(s, u, t, v) = P(z_t = v \mid z_s = u)$ of the Markov process $z_t$, defined by
$$p(s, u, t, v) = \begin{cases} \dfrac{P(z_t = v,\, z_s = u)}{P(z_s = u)} & \text{if } P(z_s = u) > 0, \\[2mm] P(z_t = v) & \text{if } P(z_s = u) = 0, \end{cases} \tag{2}$$


satisfy the Kolmogorov forward equations
$$p(s, u, t, v) = \delta^{uv} + \sum_{z \in Z} \int_s^t p(s, u, r, z)\, \lambda^{zv}(r)\, dr, \tag{3}$$
where $\delta^{uv} = 1$ for $u = v$ and $\delta^{uv} = 0$ otherwise; that is, we use the standard Kronecker delta notation. Note that the characterizations (3) and (1) of the Markov process $z_t$, $t \ge 0$, are equivalent. Also note that the functions $\lambda^{uv}(t)$, $u, v \in Z$, $t \ge 0$, can be modified, keeping the same defining properties (in particular, local integrability and (1)), so that they are non-negative when $u \ne v$ and their "row-wise sums" are equal to zero, which can be written as
$$\sum_{z \in Z} \lambda^{uz}(t) = 0, \quad \lambda^{uv}(t) \ge 0, \quad t \ge 0, \quad v \in Z \setminus \{u\}. \tag{4}$$

Furthermore, let us assume that there is a given function $\tilde h(\cdot)$ on $Z$ and random variables $\nu_k$ (for example, normally distributed, independent and identically distributed (i.i.d.) random variables that are independent of $z_t$, $t \ge 0$) such that
$$\bar y_k = \tilde h\left(z_{t_k}\right) + \nu_k. \tag{5}$$

For $t \ge 0$, let $\tilde Y_t$ be the complete σ-algebra generated by $\bar y_s$, $s \in [0, t]$. Again using the results provided in [16] (Remark 2.2 on page 232; see also Section 8 on page 254), we have that, for a given distribution on $Z$ as the initial condition and $0 \le s \le t$, the Kolmogorov forward equations have a unique solution which is also a distribution function $\bar\pi(r, \cdot)$, $r \in [s, t]$, on $Z$; that is, for a given $\bar\pi(s, z)$, $z \in Z$, we have
$$\frac{d\bar\pi(r, z)}{dr} = \sum_{v \in Z} \bar\pi(r, v)\, \lambda^{vz}(r), \quad r \in [s, t]. \tag{6}$$

Now the question is the following: for $z \in Z$ and $0 \le s \le t$, how do we derive $P(z_t = z \mid \tilde Y_s)$ given $P(z_s = v \mid \tilde Y_s)$, $v \in Z$? The answer is given by the following theorem.

Theorem 2.1. For $k = 1, 2, \ldots$, $0 \le t_k \le t$, $P(z_t = z \mid \tilde Y_{t_k})$, $z \in Z$, is a solution of (6) with initial condition $P(z_{t_k} = z \mid \tilde Y_{t_k})$, $z \in Z$, where we set $s = t_k$.

Proof. Since $\nu_k$, $k = 1, 2, \ldots$, are independent of $z_t$, $t \ge 0$, for Borel (for example, open) sets $A_j$, $j = 1, \ldots, k$, and $v \in Z$, by the Markov property of $z_t$ we have
$$E\left[\prod_{j=1}^{k} I_{\{\nu_j + \tilde h(z_{t_j}) \in A_j\}}\, I_{z_t = v}\right] = E\left[\prod_{j=1}^{k} I_{\{\nu_j + \tilde h(z_{t_j}) \in A_j\}}\, P\left(z_t = v \mid z_{t_k}\right)\right]. \tag{7}$$

From equations (7) and (5), writing $Y_k = \tilde Y_{t_k}$ for brevity, we obtain
$$P(z_t = v \mid Y_k) = E\left[P\left(z_t = v \mid z_{t_k}\right) \mid Y_k\right]. \tag{8}$$
Since $P(z_t = v \mid z_{t_k})$ is a function of $z_{t_k}$, from (8) we have
$$P(z_t = v \mid Y_k) = \sum_{z \in Z} P\left(z_t = v \mid z_{t_k} = z\right) P\left(z_{t_k} = z \mid Y_k\right). \tag{9}$$

The remainder of the proof of the theorem then follows from equations (9), (2), and (3). More specifically, in technical terms this means that $P(z_t = v \mid z_{t_k} = z)$ in the sum on the right hand side of equation (9) satisfies (6).
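To make Theorem 2.1 concrete, the following sketch (our illustration, not part of the original development) propagates the conditional distribution $P(z_t = z \mid \tilde Y_{t_k})$ between observation times by numerically integrating the forward equation (6) for a finite set $Z$. The rate matrix, the time grid, and the initial distribution are hypothetical placeholders chosen only to satisfy the properties in (4).

```python
import numpy as np

def propagate_distribution(pi0, lam, t_grid):
    """Integrate the forward equation (6): d pi(r, z)/dr = sum_v pi(r, v) lam[v, z],
    with pi0 the conditional distribution P(z_{t_k} = . | Y_k) at t_grid[0].
    lam(r) returns a rate matrix with nonnegative off-diagonal entries and
    zero row sums, as required by (4). Simple explicit Euler steps."""
    pi = np.array(pi0, dtype=float)
    for r0, r1 in zip(t_grid[:-1], t_grid[1:]):
        pi = pi + (r1 - r0) * pi @ lam(r0)
        pi = np.clip(pi, 0.0, None)
        pi /= pi.sum()            # guard against numerical drift
    return pi

# Hypothetical two-decision example: Z = {0, 1}.
lam = lambda r: np.array([[-0.2, 0.2],
                          [0.1, -0.1]])   # rows sum to zero, cf. (4)
pi_tk = np.array([0.9, 0.1])              # P(z_{t_k} = . | Y_k)
print(propagate_distribution(pi_tk, lam, np.linspace(0.0, 5.0, 501)))
```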


3. Itô's characterization of the systems' dynamics. In this section we concentrate on developing a model for the overall dynamics of the other systems that interact with the primary system. This interaction may be relevant in terms of safety (meaning collision avoidance) or an actual pursuit, in which the primary system may be either a pursuer or an evader. Let $x_t$ be a $d$-dimensional random process modeling the state of the other systems and satisfying the following equation:
$$dx_t = b(t, x_t, z_t)\, dt + \sigma(t, x_t, z_t)\, dw_t, \tag{10}$$
with some initial data, where the functions $b(\cdot,\cdot,\cdot)$ and $\sigma(\cdot,\cdot,\cdot)$ are smooth and differentiable, and $w_t$ is the $d_2$-dimensional Wiener process. Here, the process $z_t$, $t \ge 0$, is the finite-state Markov process introduced in Section 2.

Remark 1. The finite-state Markov process $z_t$, $t \ge 0$, models possible strategies for the process $x_t$, $t \ge 0$, which can be as simple as performing turns, incorporating changes in particular directions, etc., so that we can have a control process $\alpha^z_t$, $z_t = z$, $t \ge 0$, $z \in Z$, which can represent, for example, the direction and magnitude of the thrust, as demonstrated in [33].

Furthermore, to define the jumps of $z_t$ we set
$$\tau(0) = 0, \quad \tau(n+1) = \inf\left\{s \ge \tau(n) : z_s \ne z_{\tau(n)}\right\}, \quad n = 0, 1, 2, \ldots,$$
$$T = \{\tau(1), \tau(2), \ldots\}, \quad T(s, t) = T \cap (s, t], \quad 0 \le s \le t < \infty.$$
Then, instead of equation (10), we have
$$dx_t = b\left(t, x_t, \alpha^{z_t}_t\right) dt + \sigma\left(t, x_t, \alpha^{z_t}_t\right) dw_t, \tag{11}$$
where $\alpha^{z_t}_t = \alpha^{z_{\tau(k)}}_t$ with $k = \max\{n : \tau(n) \le t\}$. Therefore, for $t \in [\tau(n), \tau(n+1))$, $n = 0, 1, 2, \ldots$, from (11) we have
$$x_t = x_{\tau(n)} + \int_{\tau(n)}^{t} b\left(s, x_s, \alpha^{z_{\tau(n)}}_s\right) ds + \int_{\tau(n)}^{t} \sigma\left(s, x_s, \alpha^{z_{\tau(n)}}_s\right) dw_s. \tag{12}$$

Therefore, by considering (12) and using the results from references [15] and [16], we have that the stochastic process $\bar x_t = (x_t, z_t)^T$ is completely characterized by the infinitesimal intensities $\lambda^{uv}(\cdot)$ satisfying (4) and Itô's equation (11). At this point, the following lemma is needed for the characterization of the stochastic process $\bar x_t$, $t \ge 0$.

Lemma 3.1. Let the coefficients in (10) be Lipschitz and bounded (or satisfy the more general conditions of Itô's theorem as stated on page 166 in [15]), and let the Wiener process $w_t$, $t \ge 0$, be independent of the finite-state time-nonhomogeneous process $z_t$, $t \ge 0$. Then, for a given initial condition, system (10) has a unique solution $x_t$.

Proof. Because of the independence of $z_t$ and $w_t$, and the Markov property of $z_t$, we can conclude the existence of a unique solution $x_t$ of (12) on each consecutive interval $[\tau(n), \tau(n+1)]$, with initial condition $x_{\tau(n)}$, $n = 0, 1, \ldots$. Then, by Itô's theorem (again, page 166 in [15]), the derived $x_t$ is the unique solution of (10).

In connection with Remark 1 and Lemma 3.1, we give the following remarks.

Remark 2. Instead of the somewhat implicit but constructive proof of Lemma 3.1, which follows an approach similar to the one in [15] (see page 166), we could prove Lemma 3.1 by applying a generalization of the Gronwall inequality [6] and develop an alternative, direct proof as was done in [13] (see page 288).
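As an illustration of the switching dynamics (11)-(12) — our own sketch, not code from the paper — the following simulates one sample path of a planar diffusion whose drift is selected by a two-state decision process $z_t$, using the Euler-Maruyama scheme on each inter-jump interval. The drift and diffusion coefficients and the jump times are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical decision-dependent coefficients for (11):
# z = 0 -> drift "straight", z = 1 -> drift "turning".
def b(t, x, z):
    return np.array([1.0, 0.0]) if z == 0 else np.array([0.0, 1.0])

def sigma(t, x, z):
    return 0.1 * np.eye(2)     # constant diffusion for simplicity

def euler_maruyama_switching(x0, jump_times, states, t_end, dt=1e-2):
    """Simulate (12) piecewise: between consecutive jump times tau(n)
    the decision z is frozen at states[n] and the SDE is integrated
    with Euler-Maruyama steps of size dt."""
    x, t, path = np.array(x0, float), 0.0, [np.array(x0, float)]
    taus = list(jump_times) + [t_end]
    for tau_next, z in zip(taus, states):
        while t < min(tau_next, t_end):
            h = min(dt, tau_next - t, t_end - t)
            dw = rng.normal(scale=np.sqrt(h), size=2)
            x = x + b(t, x, z) * h + sigma(t, x, z) @ dw
            t += h
            path.append(x.copy())
    return np.array(path)

path = euler_maruyama_switching([0.0, 0.0], jump_times=[2.0], states=[0, 1], t_end=4.0)
print(path[-1])
```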


Remark 3. As already mentioned, the infinitesimal intensities $\lambda^{uv}(\cdot)$, $u, v \in Z$, satisfying (4), completely characterize the finite-state Markov process $z_t$, $t \ge 0$, in the sense of a unique solution of (3), a probability space, and a stochastic process for which those solutions are the transition probability densities (for more details see [16] and the references therein). A similar statement is true for the Itô process $x_t$. For $x \in R^d$, $t \ge 0$, $z \in Z$, and $a(t, x, z) = \sigma(t, x, z)\, \sigma(t, x, z)^T$, we consider the following operator:
$$L(t, x, z) = \sum_{j} b_j(t, x, z) \frac{\partial}{\partial x_j} + \sum_{i,j} a_{i,j}(t, x, z) \frac{\partial^2}{\partial x_i \partial x_j}. \tag{13}$$
Now let us consider the fundamental solution of the following second-order parabolic partial differential equation:
$$\frac{dp(t, x, z)}{dt} = L^*(t, x, z)\, p(t, x, z), \tag{14}$$
where $L^*$ is the formal adjoint of the operator $L$ defined by (13). The fundamental solution of (14) is the transition probability density of $x_t$, which therefore also completely characterizes this stochastic process [13].

It is important to note that we are interested in the transition probabilities of the stochastic process $\bar x_t = (x_t, z_t)^T$ which, for $x, y \in R^d$, $0 \le s \le t < \infty$, $u, v \in Z$, can be intuitively associated with the following differential equations:
$$\frac{dp(s, y, u, t, x, v)}{dt} = L^*(t, x, v)\, p(s, y, u, t, x, v) + \sum_{z \in Z} p(s, y, u, t, x, z)\, \lambda^{zv}(t) \tag{15}$$
and the associated initial condition property
$$\lim_{s \to t} p(s, y, u, t, x, v) = \delta(x - y)\, \delta^{uv}, \tag{16}$$

where $\delta(\cdot)$ is the Dirac delta function and $\delta^{uv}$ is the Kronecker delta. At this point we are ready to state the following theorem.

Theorem 3.2. Assume conditions (4) and the assumptions of Lemma 3.1. Then the system of equations (15) and (16) has a unique solution.

Proof. For $u, v, z \in Z$ and $0 \le s \le t < \infty$, let $p_1(s, y, t, x, z)$ be the fundamental solution of equation (14) [13] and let $p_2(s, u, t, v)$ be the unique solution of equation (3) [16]. A similar argument then yields the existence of a solution of equations (15) and (16). The direct proof for the characterization of the solution of equations (15) and (16) can also be derived by setting the initial condition (16) to be a smooth approximation of the product of the Dirac and the Kronecker delta functions. The existence of the solution can then be deduced from the general theorems provided in [15], and the approximation argument follows by using the Gronwall lemma [6]. The uniqueness is implied directly by following lines similar to those of [13] (see page 288) or [16], again by using the Gronwall inequality [6].

Theorem 3.2 is a direct consequence of the characterizations of the processes $x_t$ and $z_t$. The resulting joint characterization follows from the following corollary.

Corollary 1. A solution of the system of equations (15) with the initial condition (16) is the transition probability density for the process $\bar x_t = (x_t, z_t)^T$, $t \ge 0$.


Proof. From the proof of Theorem 3.2, we have that the solution of (15) and (16) exists and is unique. Here we give just a rough proof of the corollary; it can be derived more precisely by following the techniques in [16] and [17]. Notice that, by Itô's rule, for a smooth function $f$ we have
$$d\left(I_{z_t = z} f(x_t)\right) = I_{z_t = z}\, df(x_t) + f(x_t)\, dI_{z_t = z}, \tag{17}$$
where $df(x_t)\, dI_{z_t = z} = 0$, since $dw_t\, dI_{z_t = z} = 0$ by the independence of $w_t$ and $z_t$. From equations (1) and (17), and Itô's rule applied to $f(x_t)$ (that is, $df(x_t) = L(t, x_t, z_t) f(x_t)\, dt + dm^1_t$, where $m^1_t$ is a martingale), we have
$$d\left(I_{z_t = z} f(x_t)\right) = I_{z_t = z} L(t, x_t, z_t) f(x_t)\, dt + f(x_t)\, \lambda^{z_t z}(t)\, dt + dm^2_t \tag{18}$$
for a martingale $m^2_t$. Then, writing (18) in its integral form, taking the conditional expectation with respect to $x_s = x$ and $z_s = u$, $0 \le s \le t < \infty$, $u \in Z$, and setting $\lambda^{z_t z}(t) = \sum_{v \in Z} I_{z_t = v}\, \lambda^{vz}(t)$ in the second term on the right hand side of (18), we get
$$\begin{aligned} E\left[f(x_t) I_{z_t = z} \mid x_s = x, z_s = u\right] ={}& f(x)\, \delta^{uz} \\ &+ \int_s^t E\left[I_{z_r = z} L(r, x_r, z_r) f(x_r) \mid x_s = x, z_s = u\right] dr \\ &+ \sum_{v \in Z} \int_s^t E\left[f(x_r) I_{z_r = v} \mid x_s = x, z_s = u\right] \lambda^{vz}(r)\, dr. \end{aligned} \tag{19}$$

The second and third terms on the right hand side of equation (19) correspond to the right hand side of equation (15), so the left hand side of (19) can be expressed as
$$E\left[f(x_t) I_{z_t = z} \mid x_s = x, z_s = u\right] = \int_{R^d} f(y)\, p(s, x, u, t, y, z)\, dy, \tag{20}$$

and for the integrand of the second term of (19) we have
$$E\left[I_{z_r = z} L(r, x_r, z_r) f(x_r) \mid x_s = x, z_s = u\right] = \int_{R^d} L(r, y, z) f(y)\, p(s, x, u, r, y, z)\, dy. \tag{21}$$

Similar statements follow for the integrands of the summands in the third term on the right hand side of (19) by setting $r$ in place of $t$ in (20). The other technical details are the same as in [16] and [17].

Let us now introduce an observation process of the other systems' concatenated dynamic states $x_t$: for a continuous $d_3$-dimensional function $h(t, \cdot)$, a $d_3 \times d_4$-dimensional matrix $Q$, and standard normal i.i.d. random variables $\nu_k$, $k = 1, 2, \ldots$,
$$\tilde y_k = h\left(t_k, x_{t_k}\right) + Q \nu_k. \tag{22}$$
Furthermore, let $Y_k$ be the complete σ-algebra generated by $\tilde y_j$, $j \le k$. Now, we are interested in estimating the conditional probability density function $p(t, x, z \mid Y_k)$ for $t \ge t_k$, $x \in R^d$, and $z \in Z$. Note that, according to the standard definition [3], for every bounded Borel function $g$ on $R^d \times Z$ we have
$$E\left[g(x_t, z_t) \mid Y_k\right] = \sum_{z \in Z} \int_{R^d} g(x, z)\, p(t, x, z \mid Y_k)\, dx. \tag{23}$$


Similarly to the statement of Theorem 2.1, we derive a method for estimating $p(t, x, z \mid Y_k)$ given $p(t_k, x, z \mid Y_k)$ by using the following theorem.

Theorem 3.3. For $0 \le t_k \le t < \infty$, the conditional probability density functions $p(t, x, z \mid Y_k)$, $x \in R^d$, $z \in Z$, defined by (23), are solutions of
$$\frac{d\tilde\pi(t, x, z)}{dt} = L^*(t, x, z)\, \tilde\pi(t, x, z) + \sum_{v \in Z} \tilde\pi(t, x, v)\, \lambda^{vz}(t), \quad t \ge t_k, \quad z \in Z, \tag{24}$$
with initial conditions $\tilde\pi(t_k, x, z) = p(t_k, x, z \mid Y_k)$, $x \in R^d$, $z \in Z$.

Proof. From Theorem 3.2, we have that the fundamental solution $p(s, x, u, t, y, v)$, $0 \le s \le t < \infty$, $x, y \in R^d$, $u, v \in Z$, exists and satisfies (15), and therefore satisfies (24) with the initial condition (16). Then, by following lines similar to those in [13] (see equation (7.29) on page 369), we have that the solution of (24) with initial condition $\tilde\pi(t_k, x, z) = p(t_k, x, z \mid Y_k)$, $x \in R^d$, $z \in Z$, has the representation
$$\tilde\pi(t, x, v) = \sum_{z \in Z} \int_{R^d} p(t_k, y, z, t, x, v)\, p(t_k, y, z \mid Y_k)\, dy. \tag{25}$$
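To convey the structure of the prediction step (24), here is a minimal sketch (ours, with hypothetical coefficients) that propagates a density on a one-dimensional state grid for two decision states. The adjoint operator $L^*$ is discretized with central differences for constant per-decision drift and diffusion, and the decision coupling enters through a rate matrix satisfying (4).

```python
import numpy as np

def predict_densities(p, x, dt, n_steps, drift, diff, lam):
    """Explicit finite-difference integration of (24) on a 1-D grid x:
    dp_z/dt = -d/dx(b_z p_z) + a_z d^2/dx^2 p_z + sum_v p_v lam[v, z].
    p has shape (len(Z), len(x)); drift[z], diff[z] are constants."""
    dx = x[1] - x[0]
    for _ in range(n_steps):
        new = np.empty_like(p)
        for z in range(p.shape[0]):
            dpx = np.gradient(drift[z] * p[z], dx)          # d/dx (b p)
            d2px = np.gradient(np.gradient(p[z], dx), dx)   # d^2/dx^2 p
            new[z] = p[z] + dt * (-dpx + diff[z] * d2px)
        p = new + dt * np.tensordot(lam.T, new, axes=1)     # decision coupling
        p = np.clip(p, 0.0, None)
    return p

x = np.linspace(-5, 5, 201)
p0 = np.exp(-x**2 / 0.1); p0 /= np.trapz(p0, x)
p = np.stack([0.9 * p0, 0.1 * p0])                # weighted by P(z | Y_k)
lam = np.array([[-0.2, 0.2], [0.1, -0.1]])        # cf. (4)
p = predict_densities(p, x, dt=1e-4, n_steps=2000,
                      drift=[1.0, -1.0], diff=[0.05, 0.05], lam=lam)
```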

Remark 4. The proof of Theorem 3.3 can also be derived directly, as in the proof of Theorem 2.1.

As a consequence, we have the following corollary.

Corollary 2. For $k = 0, 1, \ldots$, given $p(t_k, x, z \mid Y_k)$, $x \in R^d$, $z \in Z$, and the measurement $\tilde y_{k+1}$, we obtain $p(t_{k+1}, x, z \mid Y_{k+1})$ first by estimating the solution $p(t_{k+1}, x, z \mid Y_k)$ of (24) on the interval $t \in [t_k, t_{k+1}]$ for the given initial condition $p(t_k, x, z \mid Y_k)$, and then, for $x \in R^d$ and $z \in Z$, by setting
$$p(t_{k+1}, x, z \mid Y_{k+1}) = \frac{p(t_{k+1}, \tilde y_{k+1} \mid x, z)\, p(t_{k+1}, x, z \mid Y_k)}{\sum_{v \in Z} \int_{R^d} p(t_{k+1}, \tilde y_{k+1} \mid y, v)\, p(t_{k+1}, y, v \mid Y_k)\, dy}, \tag{26}$$
where $p(t_{k+1}, \tilde y_{k+1} \mid x, z)$ is the likelihood of the measurement $\tilde y_{k+1}$ given the states $x \in R^d$ and $z \in Z$ at time $t_{k+1}$, which, in the case of the observation model (22), is
$$p(t_{k+1}, \tilde y_{k+1} \mid x, z) = \frac{1}{\sqrt{\det\left(2\pi Q Q^T\right)}}\, e^{-\frac{1}{2}\left\|\left(Q Q^T\right)^{-\frac{1}{2}}\left(\tilde y_{k+1} - h(t_{k+1}, x)\right)\right\|^2}. \tag{27}$$

Proof. The statement of the corollary follows from Theorem 3.3 and Bayes' theorem (see, for example, page 463 in [3]).

The following two remarks are important for understanding the structure of the conditional probability density function $p(t, x, z \mid Y_s)$, $0 \le s \le t < \infty$.

Remark 5. The right hand side of (27) does not depend on the state $z$ that appears on the left hand side of (27). This means that the conditional probabilities $p(t_k, x, z \mid Y_k)$, $k = 1, 2, \ldots$, are mainly determined by the solutions of (24). The Bayes update (26) then serves the role of testing each of the possibilities $z \in Z$.

Remark 6. From Remark 5, we can intuitively deduce that, for $t \ge t_k$, the conditional probability density $p(t, x, z \mid Y_k)$ can be represented by separating the conditional probability $P(z_t = z \mid Y_k)$ and the conditional density of the stochastic process $x_t$; that is, we can consider the representation
$$p(t, x, z \mid Y_k) = \sum_{v \in Z} \delta^{vz}\, P(z_t = v \mid Y_k)\, p_1(t, x, v \mid Y_k) \quad \text{(a.s.)}, \tag{28}$$

where $p_1(t, x, z \mid Y_k)$ is the conditional probability density of the process
$$x_t = x_{t_k} + \int_{t_k}^{t} b(r, x_r, z)\, dr + \int_{t_k}^{t} \sigma(r, x_r, z)\, dw_r, \quad z \in Z, \quad 0 \le t_k \le t < \infty, \tag{29}$$

given all observations up to time $t_k$. The representation (28) is generally not true, since the transition probability density cannot, in general, be represented as the product of the transition probability densities of $z_t$ and $x_t$. For the fundamental solutions of $x_t$ and $z_t$, denoted by $p_1(s, y, t, x, u)$ and $p_2(s, u, t, v)$, respectively, the function defined as
$$p(s, x, u, t, y, v) = p_1(s, y, t, x, u)\, p_2(s, u, t, v) \tag{30}$$
satisfies (15) when $L^*(t, x, v)$ is replaced by $L^*(t, x, u)$, and satisfies the same initial condition (16). The representation (30), and consequently (28), then follows if $L^*(t, x, u) = L^*(t, x, z)$ for all relevant $u, z \in Z$ (the relevant states being the ones where the process $z_t$ can take values with positive probability), which is true if the process $x_t$ propagates in the same way for any decision between the measurements (22). Again, the decisions correspond to the values of the process $z_t$. If we denote by $x^z_t$ the solution of Itô's equation (29), then each summand $P(z_t = z \mid Y_k)\, p_1(t, x, z \mid Y_k)$ on the right hand side of (28) corresponds to one decision state value $z \in Z$, where $p_1(t, x, z \mid Y_k)$ is the conditional probability density of the state trajectory $x^z_t$, $t \ge t_k$, and $P(z_t = z \mid Y_k)$ is the conditional probability that the finite-state Markov process $z_t$ assumes the value $z$ or, as mentioned earlier, the decision corresponding to $z \in Z$.

We conclude the section with the following corollary.

Corollary 3. For $0 \le s \le t < \infty$, let $p(t, x, z \mid Y_s)$, $x \in R^d$, $z \in Z$, be the conditional probability density functions of $\bar x_t$ given $Y_s$, as in Remark 6. Then for $0 \le s \le t < \infty$ and $z \in Z$ we have
$$P(z_t = z \mid Y_s) = \int_{R^d} p(t, x, z \mid Y_s)\, dx = \sum_{v \in Z} \int_{R^d} \int_{R^d} p(s, y, v, t, x, z)\, p(s, y, v \mid Y_s)\, dy\, dx. \tag{31}$$

Proof. The corollary follows from Theorem 3.3 (equation (25)), where we set $Y_s$ in place of $Y_k$, and Fubini's theorem or, more specifically, the so-called Tonelli theorem, which is formulated as Theorem 18.3 on page 238 in [3].
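Corollaries 2 and 3 together describe a predict-update filter. The sketch below (our illustration; the scalar measurement model and all parameters are hypothetical) performs one Bayes update (26) with the Gaussian likelihood (27) on a one-dimensional grid, starting from densities of the form produced by the prediction step (24); the per-decision probabilities are then recovered as in (31) by integrating over the state.

```python
import numpy as np

def bayes_update(p_pred, x, y_obs, h, Q):
    """Measurement update (26) on a 1-D grid for the scalar observation
    model y = h(x) + Q*nu, nu ~ N(0, 1), so the likelihood is (27)."""
    lik = np.exp(-0.5 * ((y_obs - h(x)) / Q) ** 2) / np.sqrt(2 * np.pi * Q**2)
    post = lik * p_pred                              # numerator of (26), per z
    norm = sum(np.trapz(post[z], x) for z in range(post.shape[0]))
    return post / norm

x = np.linspace(-5, 5, 201)
# Hypothetical predicted densities p(t_{k+1}, x, z | Y_k) for Z = {0, 1}:
p_pred = np.stack([0.6 * np.exp(-(x - 1.0)**2), 0.4 * np.exp(-(x + 1.0)**2)])
p_pred /= sum(np.trapz(p_pred[z], x) for z in range(2))

p_post = bayes_update(p_pred, x, y_obs=0.8, h=lambda x: x, Q=0.5)
print([np.trapz(p_post[z], x) for z in range(2)])   # P(z | Y_{k+1}), cf. (31)
```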


4. Design of robust strategies for the primary dynamical system. In this section we provide a design methodology for the computation of control strategies for the primary dynamical system, based on discrete and noise-corrupted information about the behavior of the other dynamical systems. We also assume that the primary system has its own objectives, which are captured in its payoff function, to be introduced shortly. The goal of the primary system is to optimize the values of its payoff function, which is constructed so that either larger or smaller (depending on the construction) values of the payoff function lead to an accomplishment of the objectives (for more details on particular constructions we refer to [29]).

Again, we proceed with some necessary assumptions and notation. Let $B$ be a smooth $R^{d_4}$-valued function defined on $[0, \infty) \times R^{d_4} \times U$, where $U \subset R^{d_5}$ is a compact set and $d_4$ and $d_5$ are positive integers. For $\mathcal{U}$ being the set of Lebesgue measurable functions on $[0, \infty)$, we are interested in deriving appropriate control laws $u \in \mathcal{U}$ such that the dynamics of the primary system's $d_4$-dimensional state $\tilde x_t$ is governed by
$$\frac{d\tilde x_t}{dt} = B(t, \tilde x_t, u_t), \quad t \ge 0. \tag{32}$$

Remark 7. The right hand side of equation (32) can be nonlinear or affine in the control [8], [30]. For example, for a smooth $R^{d_4}$-valued function $\tilde B$ defined on $[0, \infty) \times R^{d_4}$ and a smooth $R^{d_4} \times R^{d_5}$-valued function $\tilde g$ defined on $[0, \infty) \times R^{d_4}$, the control-affine dynamics is given by
$$\frac{d\tilde x_t}{dt} = \tilde B(t, \tilde x_t) + \tilde g(t, \tilde x_t)\, u_t, \quad t \ge 0. \tag{33}$$
If on the right hand side of equation (33) we set $d_4 = 3$, $d_5 = 2$, $\tilde x_t = (x^1_t, x^2_t, \psi_t)^T$, $x^1_t, x^2_t \in R$, $\psi_t \in [0, 2\pi)$, $\tilde B \equiv 0$, and $u_t = (v_t, \dot\psi_t)^T$ as the control variables, with
$$\tilde g(t, \tilde x_t) = \begin{pmatrix} \cos\psi_t & 0 \\ \sin\psi_t & 0 \\ 0 & 1 \end{pmatrix},$$
we get the so-called unicycle model [30]. This model falls into the category of nonholonomic systems and thus has some special properties (for more on the properties of nonholonomic systems and how to control them, we refer to [4, 25]). Another example of a control-affine model is the satellite orbital motion with rocket thrust, where the control is the thrust direction vector [33].

Now let $v(t, x, y)$, $t \ge 0$, $x \in R^{d_4}$, $y \in R^d$, be a non-negative and differentiable function, which we call a Lyapunov-like payoff function for the primary dynamics (32); it relies on the information about the other systems' dynamics (11). Derivations, applications, and demonstrations of various constructions of $v(t, x, y)$ based on multi-agent multi-objective scenarios are given in [29]. For each possible decision $z \in Z$, we have different payoff values $v(t, x, x^z_t)$. Motivated by the maximum approximation functions from [29] and by the application-driven derivations in [34], we introduce the conditional maximum approximation function, given the discrete observations $\tilde y_j$, $j = 1, 2, \ldots, k$, for $t \ge t_k$ and $\delta > 0$, as
$$v(t, x \mid Y_k) = v_\delta(t, x \mid Y_k) = \left(E\left[\left(v(t, x, x_t)\right)^\delta \mid Y_k\right]\right)^{\frac{1}{\delta}}, \tag{34}$$
where $x_t$ is a solution of equation (10), or a solution of equation (11) (or of its integral form (12)). Solely for illustration purposes we choose the maximization operation rather than the minimization, yet all the derivations can be modified in a trivial manner to fit the minimization scenario. At this point we can state the following corollary, which completely characterizes the payoff function maximization given discrete and noisy observations $Y_k = \sigma\{\tilde y_j,\ j = 1, 2, \ldots, k\}$ (that is, given the complete σ-algebra generated by $\tilde y_j$, $j = 1, 2, \ldots, k$) of the stochastic system $x_t$ with different possible dynamics. The overall system's changes between different state dynamics are based on the values of the finite-state Markov process $z_t$.

Corollary 4. Let $x_t$ and $z_t$ be the stochastic processes characterized by (15) and (3), respectively, let $Y_k$ be the complete σ-algebra generated by $\tilde y_j$, $j = 1, 2, \ldots, k$, as given in


(22), and let the payoff function be given by (34). Then for $t_k \le t$, $x \in R^{d_4}$, and $\delta > 0$, we have
$$v(t, x \mid Y_k) = \left(\sum_{z \in Z} \int_{R^d} \left(v(t, x, y)\right)^\delta\, p(t, y, z \mid Y_k)\, dy\right)^{\frac{1}{\delta}}. \tag{35}$$

Proof. From Corollary 2, we have that $p(t, y, z \mid Y_k)$ is the conditional probability density function of $x_t$ and $z_t$ given $Y_k$ for $t \ge t_k$. The conclusion of the corollary then follows from equation (34).

Remark 8. In terms of the classical $L^p$ spaces [24], from (35) we have
$$v(t, x \mid Y_k) = v_\delta(t, x \mid Y_k) = \left\|v(t, x, \cdot)\right\|_{L^\delta(p(t,\cdot,\cdot \mid Y_k))}, \tag{36}$$
where $L^\delta(p(t,\cdot,\cdot \mid Y_k))$ is the classical $L^p$ space for $p = \delta$ with respect to the measure $p(t,\cdot,\cdot \mid Y_k)$ on $R^d \times Z$. It is well known that
$$\lim_{\delta \to \infty} \left\|v(t, x, \cdot)\right\|_{L^\delta(p(t,\cdot,\cdot \mid Y_k))} = \left\|v(t, x, \cdot)\right\|_{L^\infty(p(t,\cdot,\cdot \mid Y_k))}, \tag{37}$$
where, by definition [24],
$$\left\|v(t, x, \cdot)\right\|_{L^\infty(p(t,\cdot,\cdot \mid Y_k))} = \inf\left\{M : \sum_{z \in Z} \int_{\{\hat y :\, v(t, x, \hat y) > M\}} p(t, y, z \mid Y_k)\, dy = 0\right\}. \tag{38}$$

(39)

u(·)∈U

by deriving control laws where time derivatives of function v (t, x ˜t |Yk ) achieve maximum, that is, we are considering the following problem u ˆ(t, x ˜t ) = arg max u(·)∈U

dv (t, x ˜t |Yk ) , t ≥ tk , k = 1, 2, . . . . dt

(40)

It is important to stress that even the problem (40) is very difficult to solve yet it is still easier to handle than the one in equation (39) due to its relationship to the standard control Lyapunov approach [10]. Motivated by this idea we formulate the following theorem which is an approximation of the problem (39) such that the optimization is performed point-wise over the image set of control strategies where symbol u is taken as a parameter not a function in the optimization problem (for more details on this we refer to [29]). Theorem 4.1. Let for k = 1, 2, . . . , t ≥ tk , x ∈ Rdk , v (t, x|Yk ) be a payoff function given by (34) with assumed primary system dynamics being affine in control as in equation (33). Then for positive number µ > 0 which defines the image set for control strategies as U = {u : kuk ≤ µ} and problem u ˆ(t, x ˜t ) = arg max u∈U

dv (t, x ˜t |Yk ) dt

(41)


we have
$$\hat u(t, \tilde x_t) = \mu\, \frac{\tilde g(t, \tilde x_t)^T\, \nabla v(t, \tilde x_t \mid Y_k)}{\left\|\tilde g(t, \tilde x_t)^T\, \nabla v(t, \tilde x_t \mid Y_k)\right\|}, \tag{42}$$
where $\nabla v(t, \tilde x_t \mid Y_k) = \left(\frac{\partial v(t, \tilde x_t \mid Y_k)}{\partial \tilde x_1}, \ldots, \frac{\partial v(t, \tilde x_t \mid Y_k)}{\partial \tilde x_{d_4}}\right)^T$ is the gradient of $v(t, \tilde x \mid Y_k)$ with respect to $\tilde x \in R^{d_4}$.

Proof. From (33), we have
$$\frac{dv(t, \tilde x_t \mid Y_k)}{dt} = \frac{\partial v(t, \tilde x_t \mid Y_k)}{\partial t} + \nabla v(t, \tilde x_t \mid Y_k)^T \frac{d\tilde x_t}{dt} = \frac{\partial v(t, \tilde x_t \mid Y_k)}{\partial t} + \nabla v(t, \tilde x_t \mid Y_k)^T \left(\tilde B(t, \tilde x_t) + \tilde g(t, \tilde x_t)\, u\right). \tag{43}$$
If we now take the $\arg\max$ with respect to $u \in U$ of the right hand side of both lines of equation (43), we get
$$\arg\max_{u \in U} \frac{dv(t, \tilde x_t \mid Y_k)}{dt} = \arg\max_{u \in U} \nabla v(t, \tilde x_t \mid Y_k)^T\, \tilde g(t, \tilde x_t)\, u. \tag{44}$$

By following the geometric proof of the "Lemma on Circular Vectograms" in [11], or by simple inspection, we obtain that the expression
$$\nabla v(t, \tilde x_t \mid Y_k)^T\, \tilde g(t, \tilde x_t)\, u = \left\langle \tilde g(t, \tilde x_t)^T\, \nabla v(t, \tilde x_t \mid Y_k),\ u \right\rangle, \tag{45}$$
where $\langle \cdot, \cdot \rangle$ denotes the scalar product, is maximized when we choose the control strategy of equation (42), so that the angle between $u$ and $\tilde g(t, \tilde x_t)^T \nabla v(t, \tilde x_t \mid Y_k)$ is zero (meaning the directions of these vectors are the same) and $u$ has the maximum magnitude $\mu$.

Remark 9. The control laws/strategies given in equation (42) can also be derived by using the method of Lagrange multipliers, that is, by finding the extrema of the function
$$f(u, \lambda) = \nabla v(t, \tilde x_t \mid Y_k)^T\, \tilde g(t, \tilde x_t)\, u + \lambda\left(\|u\|^2 - \mu^2\right). \tag{46}$$

Remark 10. A solution to the optimization problem (41) can be different from a solution to the optimization problem (40), even if we assume the same norm-bounding constraint in both problems. In (40), we are looking for solutions, or extrema, in the set $\mathcal{U}$ of all admissible control functions, while in (41) we are looking for point-wise solutions from the compact set $U \subset R^{d_5}$. If we assume that the feedback estimates of the controls (42) are computed at discrete time instances, so that the control is recomputed whenever the feedback information is updated, then the problem (41) can be considered a subproblem of (40). In other words, the solutions of (41) are approximations of the solutions of (40). The obtained values from $U$ can be held as constant (or some other predetermined) functions on the intervals between feedback updates, which in turn are also elements of $\mathcal{U}$. Then the estimation of those constants on the feedback subintervals using (42), and their consequent application to driving system (33), can be considered as a piecewise constant control function on $[0, \infty)$, which is also an element of $\mathcal{U}$. In this sense, the solution of the optimization problem (41) is an approximation of the solution of the optimization problem (40), and therefore an approximation for the more general problem (39).
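As a concrete illustration of (42) — our sketch, not the authors' code — the following computes the norm-bounded control for the unicycle model of Remark 7, approximating the gradient of a supplied payoff $v(t, \tilde x \mid Y_k)$ by finite differences. The quadratic payoff used in the demo is a hypothetical stand-in for (34).

```python
import numpy as np

def g_unicycle(x):
    """Control matrix of the unicycle model from Remark 7."""
    psi = x[2]
    return np.array([[np.cos(psi), 0.0],
                     [np.sin(psi), 0.0],
                     [0.0, 1.0]])

def control_law(v, t, x, mu, eps=1e-6):
    """Equation (42): u = mu * g^T grad(v) / ||g^T grad(v)||,
    with grad(v) estimated by central finite differences."""
    grad = np.array([(v(t, x + eps * e) - v(t, x - eps * e)) / (2 * eps)
                     for e in np.eye(len(x))])
    d = g_unicycle(x).T @ grad
    n = np.linalg.norm(d)
    return mu * d / n if n > 0 else np.zeros_like(d)

# Hypothetical payoff: negative squared distance to a goal (maximizing it
# steers toward the goal); a stand-in for the conditional payoff (34).
goal = np.array([5.0, 3.0])
v = lambda t, x: -np.sum((x[:2] - goal) ** 2)
print(control_law(v, 0.0, np.array([0.0, 0.0, 0.0]), mu=1.0))
```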


5. Collision avoidance problem. In this section we provide some specific technical details related to various collision avoidance problem scenarios. Let $\{\mathcal{F}_t\}$ be an increasing filtration of σ-algebras, and let $\tau$ be a Markov time with respect to the filtration $\{\mathcal{F}_t\}$. Motivated by the collision avoidance application described in [34], we are interested in estimating $\tau$ on a given time interval $[t_L, t_U]$, or in estimating whether $\tau \in I = [t_L, t_U]$, for a given lower time $t_L$ and upper time $t_U$ such that $0 \le t_L \le t_U < \infty$. Let $t_k = t_L + (k - 1)\, \Delta t_o$ for a given $\Delta t_o > 0$ and a positive integer $n$, $k = 1, \ldots, n + 1$, such that $I = \cup_{k=1}^{n} I_k$, where $I_k = [t_k, t_{k+1})$ for $k = 1, \ldots, n - 1$, and $I_n = [t_n, t_{n+1}]$, $t_{n+1} = t_U$.

For $k = 1, \ldots, n$ and $\Delta t > 0$, set $t^j_k = t_k + (j - 1)\, \Delta t$ and $I^j_k = [t^j_k, t^{j+1}_k)$ for a positive integer $m$, $j = 1, \ldots, m + 1$, such that $I_k = \cup_{j=1}^{m} I^j_k$, where we set $I^m_n$ to be a closed interval. Let $\aleph_A$ be the indicator function of the set $A \subset [0, \infty)$, that is, $\aleph_A(s) = 1$ if $s \in A$ and $\aleph_A(s) = 0$ otherwise. Then, for $t \ge 0$, we set
$$z_t = \aleph_{[\tau, \infty)}(t) \sum_{k=1}^{n} \sum_{j=1}^{m} \aleph_{I^j_k}(\tau)\, (k, j). \tag{47}$$

In relation to the finite-state Markov process $z_t$, $t \ge 0$, defined by equation (47), we formulate the following two remarks.

Remark 11. From equation (47), we have
$$z_t = (0, 0) \iff \tau > t, \qquad z_t = (k, j) \iff (\tau \le t)\ \&\ \left(\tau \in I^j_k\right). \tag{48}$$

The first equivalence in (48) means that the Markov (stopping) time $\tau$ has not yet been realized (at the current moment $t$), and the second equivalence in (48) means that the Markov time $\tau$ is realized, and that it is realized in the interval $I^j_k = [t^j_k, t^{j+1}_k)$.
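The encoding (47) is easy to state in code. The helper below (a hypothetical illustration of ours) maps a realized Markov time $\tau$ and the current time $t$ to the value of $z_t$ in $Z = \{0, \ldots, n\} \times \{0, \ldots, m\}$, mirroring the equivalences in (48).

```python
def z_of(t, tau, t_L, dt_o, dt, n, m):
    """Value of z_t per (47): (0, 0) while tau > t; otherwise the pair
    (k, j) identifying the subinterval I_k^j = [t_k^j, t_k^{j+1}) containing tau."""
    if tau > t:
        return (0, 0)
    k = min(int((tau - t_L) // dt_o) + 1, n)   # coarse interval index, 1..n
    t_k = t_L + (k - 1) * dt_o
    j = min(int((tau - t_k) // dt) + 1, m)     # fine subinterval index, 1..m
    return (k, j)

# Example with hypothetical interval parameters: I = [0, 10], n = 5, m = 4.
print(z_of(t=3.0, tau=7.2, t_L=0.0, dt_o=2.0, dt=0.5, n=5, m=4))  # (0, 0)
print(z_of(t=8.0, tau=7.2, t_L=0.0, dt_o=2.0, dt=0.5, n=5, m=4))  # (4, 3)
```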

Remark 12. The stochastic process $z_t$, $t \ge 0$, is a finite-state Markov process with càdlàg trajectories. It takes values in the finite set $Z = \{0, 1, \ldots, n\} \times \{0, 1, \ldots, m\}$ and has the transition probabilities given by equation (2). The transition probability densities $p(s, u, t, v)$ are right-continuous in $t$ for any $u$, $v$, and $s$. The distribution of $\tau$ determines, or characterizes, the process $z_t$, and therefore the possible existence of the so-called infinitesimal intensities $\lambda^{uv}(t)$, $u, v \in Z$, $t \ge 0$, with the properties (4) [16]; vice versa, by specifying the infinitesimal intensities with the properties (4), we completely characterize the finite-state Markov process $z_t$, $t \ge 0$, and therefore an approximation of the distribution of the Markov time $\tau$.

The following example is motivated by the space applications [33] and [34].

Example 1. Let $d_4 = d = 6$, and let $x_t = (\dot\xi_t, \xi_t)^T$ and $\tilde x_t = (\dot\zeta_t, \zeta_t)^T$ be the state vectors of some orbiting body and of an Earth-orbiting satellite, respectively, where $\xi_t, \zeta_t \in R^3$ and $\dot\xi_t, \dot\zeta_t \in R^3$ are their corresponding locations and velocities. We assume that, for some Markov time $\tau \in I$, there is a jump in velocity, $\Delta v = \dot\xi_\tau - \dot\xi_{\tau^-}$, such that
$$\dot\xi_\tau = f_B\left(\xi_\tau, \zeta_{\tau+s}, s\right), \tag{49}$$


where $f_B$ is the Battin function [34] representing the solution of Lambert's problem [2], and $s > 0$ is the transfer time from location $\xi_\tau$ to location $\zeta_{\tau+s}$. The original trajectory $x_t$ is completely determined by its values at $x_{\tau^-} = (\dot\xi_{\tau^-}, \xi_\tau)^T$, and it is different from the trajectory determined by $x_\tau = (\dot\xi_\tau, \xi_\tau)^T$ (as the initial value for the propagation equation (10)) created by the impulse $\Delta v$. The transfer time $s$ corresponds to the closest point of approach between the trajectories determined by $x_{\tau^-}$ and $\tilde x_\tau$, and it can usually vary over a short time (considered to be a few seconds [33]) for bounded $\Delta v$. In reality, there is no sudden jump in velocity expressed by the impulse $\Delta v$, but rather changes in acceleration that can be smooth or sudden due to external forces, such as perturbations, and/or internal forces, such as rocket propulsion and the change of mass. Impulse modeling using (49) can be used to derive numerical approximations of smooth changes in velocity due to propulsion [2], as derived and demonstrated recently in [34].

The primary system can have a general idea about the possible distribution of $\tau$, and therefore about the finite-state Markov process $z_t$ that approximates its values within a given interval of length $\Delta t$. In the following two examples, we provide two reasonable candidate distributions of the process $z_t$.

Example 2. Since the estimation updates in equation (26) occur at the times $t_k$, $k = 1, 2, \ldots, n + 1$, it is enough to specify the transition probabilities of $z_t$, $t \ge 0$, at those times and to consider their possible jumps on the subintervals $I^j_k$, $j = 1, \ldots, m + 1$, between $t_k$ and $t_{k+1}$, $k = 1, \ldots, n$. Therefore, by considering Remark 11, for $l = j = 0$ we set
$$P\left(z_{t_{k+1}} = (0, 0) \mid z_{t_k} = (0, 0)\right) = \frac{1}{2}, \tag{50}$$
and for $l \ne 0$ and $j \ne 0$ we set
$$P\left(z_{t_{k+1}} = (l, j) \mid z_{t_k} = (0, 0)\right) = \frac{1}{2m} \tag{51}$$
and
$$P\left(z_{t_{k+1}} = (l, j) \mid z_{t_k} = (l, j)\right) = 1. \tag{52}$$
The transition probabilities (50), (51), and (52) completely characterize the finite-state Markov process $z_t$, $t \ge 0$.

Example 3. For $l = j = 0$ and $\lambda > 0$, we set
$$P\left(z_{t_U} = (0, 0) \mid z_{t_L} = (0, 0)\right) = e^{-\lambda(t_U - t_L)}. \tag{53}$$

For $k = 1, 2, \ldots$, we set
$$P\left(z_{t_{k+1}} = (0, 0) \mid z_{t_k} = (0, 0)\right) = e^{-\lambda \Delta t}, \tag{54}$$
and for $l \ne 0$ and $j \ne 0$ we set
$$P\left(z_{t_{k+1}} = (l, j) \mid z_{t_k} = (0, 0)\right) = \frac{1 - e^{-\lambda \Delta t}}{2m} \tag{55}$$
and
$$P\left(z_{t_{k+1}} = (l, j) \mid z_{t_k} = (l, j)\right) = 1. \tag{56}$$
From equation (53), we can set a desired probability of a non-threat propagation during the interval of interest $I = [t_L, t_U]$. For example, we can set that transition probability (that is, the right hand side of equation (53)) to $\frac{1}{2}$, which then implies $\lambda = \frac{\log 2}{t_U - t_L}$.
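The transition probabilities of Examples 2 and 3 are straightforward to tabulate. The sketch below (ours; the horizon and grid sizes are hypothetical) builds the one-step transition rule of Example 3 and propagates the probability of the non-threat state $(0, 0)$ across the update times $t_k$, using $\lambda = \log 2 / (t_U - t_L)$ so that the non-threat probability over $I$ is $1/2$, as discussed after (56).

```python
import numpy as np

def example3_step(p, lam, dt, m, reachable):
    """One step t_k -> t_{k+1} of Example 3: the non-threat state (0, 0)
    survives with probability exp(-lam * dt) (54); otherwise mass is
    assigned to the reachable maneuver states (l, j) per (55); realized
    maneuver states are absorbing (56). p maps states to probabilities."""
    stay = np.exp(-lam * dt)
    q = dict(p)                           # absorbing states keep their mass
    q[(0, 0)] = p[(0, 0)] * stay
    for s in reachable:
        q[s] = q.get(s, 0.0) + p[(0, 0)] * (1.0 - stay) / (2 * m)
    return q

t_L, t_U, n, m = 0.0, 10.0, 5, 4
lam = np.log(2.0) / (t_U - t_L)
dt = (t_U - t_L) / n
p = {(0, 0): 1.0}
for k in range(1, n + 1):
    p = example3_step(p, lam, dt, m,
                      reachable=[(k, j) for j in range(1, m + 1)])
print(p[(0, 0)])   # approximately 0.5 after traversing I = [t_L, t_U]
```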


The following remark is important for understanding the finite-state Markov processes specified in Examples 2 and 3.

Remark 13. Since $\lambda$ in equation (53) has a constant value, and also because of the constant value of the right hand side of equation (50) (for which a different value can be chosen), it may seem that the Markov process $z_t$, $t \ge 0$, in both examples is time-homogeneous. Because of Remark 11, it is not. The non-homogeneity can be observed if, by using (3), we derive the transition probabilities and the infinitesimal intensity functions at the shorter time steps $t^j_k$, $j = 1, \ldots, m + 1$, $k = 1, \ldots, n + 1$.

Now, by considering Example 1, let us define the distance between $x_t$ and $\tilde x_t$, $t \in I$, at the closest point of approach on the interval $I$, as follows:
$$d(x, \tilde x) = d(x_I, \tilde x_I) = \min_{t \in I} \left\|\xi_t - \zeta_t\right\|. \tag{57}$$

By considering a parametrization of each of the trajectories by their values at the time instants $t \in I$, that is, for $y \in R^6$, the parametrization $x(t, y) = \{x_s : s \in I,\, x_t = y\}$ with $x(t, x_t) = x(x_t)$, and similarly $\tilde x(\tilde x_t)$ for $\tilde x_t$, $t \in I$, the distance $d(\tilde x_t, x^{z_t}_t) = d(\tilde x(\tilde x_t), x(x^{z_t}_t))$ from equation (57) is well defined. Then, following the methodology in [30], we set the avoidance function
$$v_a\left(\tilde x_t, x^{z_t}_t\right) = \left(\min\left\{0,\ \frac{d\left(\tilde x_t, x^{z_t}_t\right)^2 - R_a^2}{d\left(\tilde x_t, x^{z_t}_t\right)^2 - r_a^2}\right\}\right)^2, \tag{58}$$
where $0 < r_a < R_a < \infty$ are given constants representing the avoidance and detection ranges, respectively. These avoidance functions were developed as a modification of the avoidance functions originally introduced in [19] under the concept of avoidance control. Also notice that, in order to avoid collisions, smaller values of the avoidance functions are preferred; this is an example where minimization of the payoff function would be performed instead of maximization. How to incorporate multiple avoidance functions into a single payoff function was shown in [29] using approximations of minimum and maximum functions.

Given discrete observations, and consequently by Corollary 4, we have explicit expressions for $v_a(\tilde x \mid Y_k)$, and therefore $v_a(\tilde x_t \mid Y_k)$, $t \ge t_k$, $\tilde x \in R^6$, $k = 1, 2, 3, \ldots, n + 1$, which in turn allow us to derive the avoidance controls (42). It is of interest to note that, considering Example 1 and either Example 2 or Example 3, from Remark 6 and (28) we have
$$v_a\left(\tilde x_{t_k} \mid Y_k\right) = \left(\sum_{z \in Z} P\left(z_{t_k} = z \mid Y_k\right) \int_{R^d} v_a^\delta\left(\tilde x_{t_k}, y\right)\, p_1\left(t_k, y, z \mid Y_k\right)\, dy\right)^{\frac{1}{\delta}}, \tag{59}$$

16

ˇ ´ ALEKSANDAR ZATEZALO AND DUSAN STIPANOVIC

6. Conclusions. In this paper we provided a design methodology for control strategies of a primary dynamical system that depend on the behavior of other dynamical systems operating in the neighborhood of the primary dynamical system, which may thus interact with or influence its behavior in various ways. The problem is very relevant to, for example, multi-player dynamic games and collision avoidance scenarios. The design of control actions or strategies is based only on discrete-time and noisy observations of the behavior of the other dynamical systems, and thus the strategies are robust by construction. Based on this limited and somewhat inaccurate information, the behavior of the other dynamical systems is first estimated, and an appropriate control strategy for the primary dynamical system is then chosen accordingly. As an illustration, a few collision avoidance scenarios were considered and presented.

Acknowledgments. The authors would like to thank Professor Hasnaa Zidani for the invitation and for the encouragement to submit this work.

REFERENCES

[1] T. Başar and G. J. Olsder, Dynamic Noncooperative Game Theory, revised and updated 2nd edition, SIAM, Philadelphia, PA, 1999.
[2] R. H. Battin, An Introduction to the Mathematics and Methods of Astrodynamics, revised edition, AIAA Education Series, 1999.
[3] P. Billingsley, Probability and Measure, 2nd edition, John Wiley & Sons, New York, 1986.
[4] A. M. Bloch (with the collaboration of J. Baillieul, P. Crouch and J. Marsden), Nonholonomic Mechanics and Control, Springer-Verlag, New York, 2003.
[5] A. Bressan and B. Piccoli, Introduction to the Mathematical Theory of Control, American Institute of Mathematical Sciences, Springfield, MO, 2007.
[6] L. Buonanno, A new Gronwall-Bellman inequality for discontinuous functions, Journal of Interdisciplinary Mathematics, 9 (2006), 543–550.
[7] F. L. Chernousko and A. A. Melikyan, Some differential games with incomplete information, Lecture Notes in Computer Science, Springer-Verlag, Berlin, 27 (1975), 445–450.
[8] A. Désilles, H. Zidani and E. Crück, Collision analysis for an UAV, Proceedings of the AIAA Guidance, Navigation, and Control Conference, Minneapolis, MN, 2012.
[9] W. H. Fleming and H. M. Soner, Controlled Markov Processes and Viscosity Solutions, 2nd edition, Springer, New York, 2006.
[10] R. A. Freeman and P. V. Kokotović, Robust Nonlinear Control Design: State Space and Lyapunov Techniques, Birkhäuser, Boston, MA, 1996.
[11] R. Isaacs, Differential Games: A Mathematical Theory with Applications to Warfare and Pursuit, Control and Optimization, Dover, Mineola, 1999.
[12] I. Ya. Kats and A. A. Martynyuk, Stability and Stabilization of Nonlinear Systems with Random Structure, Taylor and Francis, London and New York, 2002.
[13] I. Karatzas and S. E. Shreve, Brownian Motion and Stochastic Calculus, 2nd edition, Springer, New York, 1991.
[14] P. E. Kloeden and E. Platen, Numerical Solution of Stochastic Differential Equations, Springer, Berlin, 1992.
[15] N. V. Krylov, Introduction to the Theory of Diffusion Processes, Translations of Mathematical Monographs, Vol. 142, American Mathematical Society, 1995.
[16] N. V. Krylov and A. Zatezalo, Filtering of finite-state time-nonhomogeneous Markov processes, a direct approach, Applied Mathematics & Optimization, 42 (2000), 229–258.
[17] N. V. Krylov and A. Zatezalo, A direct approach to deriving filtering equations for diffusion processes, Applied Mathematics & Optimization, 42 (2000), 315–332.
[18] N. N. Krasovskii and A. I. Subbotin, Game-Theoretical Control Problems, Springer-Verlag, New York, NY, 1988.
[19] G. Leitmann and J. Skowronski, Avoidance control, Journal of Optimization Theory and Applications, 23 (1977), 581–591.
[20] A. A. Melikyan, On minimal observations in a game of encounter, Prikladnaya Matematika i Mekhanika, 37 (1973), 407–414 (in Russian).
[21] A. A. Melikyan and A. Pourtallier, Games with several pursuers and one evader with discrete observations, in Game Theory and Applications (L. A. Petrosjan and V. V. Mazalov, eds.), Nova Science Publishers, New York, NY, 2 (1996), 169–184.
[22] G. J. Olsder and O. Pourtallier, Optimal selection of observation times in a costly information game, in New Trends in Dynamic Games and Applications (G. J. Olsder, ed.), Annals of the International Society of Dynamic Games, Birkhäuser, Boston, MA, 3 (1995), 227–246.
[23] L. A. Petrosjan, Differential Games of Pursuit, Series on Optimization, Vol. 2, World Scientific, Singapore, 1993.
[24] H. L. Royden, Real Analysis, 3rd edition, Macmillan Publishing Company, New York, 1988.
[25] M. W. Spong, S. Hutchinson and M. Vidyasagar, Robot Modeling and Control, John Wiley & Sons, Hoboken, NJ, 2005.
[26] D. M. Stipanović, A survey and some new results in avoidance control, in 15th International Workshop on Dynamics and Control IWDC 2009 (J. Rodellar and E. Reithmeier, eds.), Barcelona, 2009.
[27] D. M. Stipanović, A. Melikyan and N. Hovakimyan, Some sufficient conditions for multi-player pursuit-evasion games with continuous and discrete observations, Annals of the International Society of Dynamic Games, 10 (2009), 133–145.
[28] D. M. Stipanović, A. Melikyan and N. Hovakimyan, Guaranteed strategies for nonlinear multi-player pursuit-evasion games, International Game Theory Review, 12 (2010), 1–17.
[29] D. M. Stipanović, C. J. Tomlin and G. Leitmann, Monotone approximations of minimum and maximum functions and multi-objective problems, Applied Mathematics & Optimization, 66 (2012), 455–473.
[30] D. M. Stipanović, C. Valicka, C. J. Tomlin and T. R. Bewley, Safe and reliable coverage control, Numerical Algebra, Control and Optimization (NACO), 3 (2013), 31–48.
[31] J. Yong and X. Y. Zhou, Stochastic Controls: Hamiltonian Systems and HJB Equations, Springer, New York, 1999.
[32] A. Zatezalo, D. Stipanović, S. Yu and P. McLaughlin, Game-theoretic approach to peer-to-peer confrontations, Technical Paper, AUVSI's Unmanned Systems, 2014.
[33] A. Zatezalo, D. Stipanović, R. K. Mehra and K. Pham, Constrained orbital intercept-evasion, Proceedings of SPIE, 9085-14, 2014.
[34] A. Zatezalo, D. Stipanović, R. K. Mehra and K. Pham, Space collision threat mitigation, Proceedings of SPIE, 9091-17, 2014.

Received April 2014; revised October 2014.

E-mail address: [email protected]
E-mail address: [email protected]