Asymptotic Stackelberg Optimal Control Design for an Uncertain Euler-Lagrange System

M. Johnson, T. Hiramatsu, N. Fitz-Coy, and W. E. Dixon

Abstract— Game theory methods have advanced various disciplines, from the social sciences, notably economics and biology, to engineering. Game theory establishes an optimal strategy for multiple players in either a cooperative or noncooperative manner, where the objective is to reach an equilibrium state among the players. A Stackelberg game strategy involves a leader and a follower in a hierarchical relationship in which the leader enforces its strategy on the follower. In this paper, a general framework is developed for feedback control of an Euler-Lagrange system using an open-loop Stackelberg differential game. A Robust Integral of the Sign of the Error (RISE) controller is used to cancel uncertain nonlinearities in the system, and a Stackelberg optimal controller is used for stabilization in the presence of uncertainty. A Lyapunov analysis is provided to examine the stability of the developed controller.

I. INTRODUCTION¹

Noncooperative game theory has been applied to a variety of control problems [1]–[14]. While zero-sum differential games have been heavily exploited in nonlinear H∞ control theory, non-zero-sum differential games have had limited application in feedback control. In particular, the Stackelberg differential game, which is based on a hierarchical relationship between the players of the game, has been utilized in decentralized control systems [5], hierarchical control problems [3], [4], [13], and nonclassical control problems [6]. This paper focuses on the two-player Stackelberg game with a leader and a follower. The leader is able to enforce a strategy on the follower and knows how the follower will rationally react; the follower, however, does not know the leader's rational reaction. The strategy space for the game is the set of all information available to the players for making their decisions. The players are committed to following a predetermined strategy based on their knowledge of the initial state, the system model, and the cost functional to be minimized. In this open-loop game formulation, each player has access to state measurements and can adapt the respective strategy as a function of the system's evolution. Previous research on the open-loop Stackelberg game in control theory has mainly focused on deriving an analytic solution and providing gain constraints; however, these results are limited to linear systems and do not demonstrate stabilization properties of the control law. The main contribution of this work is the development of an optimal open-loop Stackelberg-based feedback control law, in conjunction with a robust feedback control law, that

¹This project was supported by the National Aeronautics and Space Administration through the University of Central Florida's Space Grant Consortium. M. Johnson, T. Hiramatsu, N. Fitz-Coy, and W. E. Dixon are with the Dept. of Mechanical and Aerospace Engineering, University of Florida, Gainesville, Florida 32611, {marc1518,takashi,nfc,wdixon}@ufl.edu.

is shown to stabilize a nonlinear system with exogenous disturbances. In addition, the control law is shown to account for unknown, state-varying, and bounded disturbances. To formulate the Stackelberg game for a nonlinear Euler-Lagrange system, an implicit learning robust controller is used to cancel nonlinearities in the system. The control development is based on the continuous Robust Integral of the Sign of the Error (RISE) [15]–[17] technique. The RISE architecture is adopted since this method can accommodate sufficiently smooth (C²) disturbances and yield an asymptotic result. This paper investigates the development of the RISE controller in conjunction with an optimal Stackelberg feedback controller for an Euler-Lagrange system with additive disturbances. The formulation of the Stackelberg game yields a structured uncertainty as one of the players; however, this paper also considers an additional bounded unstructured uncertainty that cannot be accounted for by the Stackelberg feedback. An optimal feedback controller based on an open-loop Stackelberg game is shown to minimize a cost functional in the presence of structured uncertainty, while a Lyapunov-based asymptotic tracking result is obtained through the amalgam of the RISE and Stackelberg feedback methods.

II. DYNAMIC MODEL AND PROPERTIES

The class of nonlinear dynamic systems considered in this paper is assumed to be modeled by the following Euler-Lagrange [18] formulation:

M(q)q̈ + V_m(q, q̇)q̇ + G(q) + F(q̇) + τ_s + τ_d = τ(t).   (1)

In (1), M(q) ∈ R^{n×n} denotes the generalized inertia matrix, V_m(q, q̇) ∈ R^{n×n} denotes the generalized centripetal-Coriolis matrix, G(q) ∈ R^n denotes the generalized gravity vector, F(q̇) ∈ R^n denotes the generalized friction vector, τ_s ∈ R^n denotes a general uncertain disturbance with a structure derived from the formulation of the Stackelberg game, τ_d(t) ∈ R^n is a general bounded unstructured disturbance, τ(t) ∈ R^n represents the input control vector, and q(t), q̇(t), q̈(t) ∈ R^n denote the generalized position, velocity, and acceleration vectors, respectively. The subsequent development is based on the assumption that q(t) and q̇(t) are measurable, and that M(q), V_m(q, q̇), G(q), F(q̇), τ_s, and τ_d are unknown. Moreover, the following properties and assumptions will be exploited in the subsequent development.

Assumption 1: The inertia matrix M(q) is symmetric, positive definite, and satisfies the following inequality ∀ ξ(t) ∈ R^n:

m₁‖ξ‖² ≤ ξᵀM(q)ξ ≤ m̄(q)‖ξ‖²,   (2)

where m₁ ∈ R is a known positive constant, m̄(q) ∈ R is a known positive function, and ‖·‖ denotes the standard Euclidean norm.

Assumption 2: The following skew-symmetric relationships are satisfied:

ξᵀ(Ṁ(q) − 2V_m(q, q̇))ξ = 0   ∀ ξ ∈ R^n,   (3)

−(Ṁ(q) − 2V_m(q, q̇)) = (Ṁ(q) − 2V_m(q, q̇))ᵀ,   (4)

−(Ṁ(q) − (V_m + V_mᵀ)) = (Ṁ(q) − (V_m + V_mᵀ))ᵀ.   (5)

Assumption 3: If q(t), q̇(t) ∈ L∞, then V_m(q, q̇), F(q̇), and G(q) are bounded. Moreover, if q(t), q̇(t) ∈ L∞, then the first and second partial derivatives of the elements of M(q), V_m(q, q̇), G(q) with respect to q(t) exist and are bounded, and the first and second partial derivatives of the elements of V_m(q, q̇), F(q̇) with respect to q̇(t) exist and are bounded.

Assumption 4: The desired trajectory is assumed to be designed such that q_d(t), q̇_d(t), q̈_d(t), q_d⁽³⁾(t), q_d⁽⁴⁾(t) ∈ R^n exist and are bounded.

Assumption 5: The disturbance term and its first two time derivatives, i.e., τ_d(t), τ̇_d(t), τ̈_d(t), are bounded by known constants.

III. ERROR SYSTEM DEVELOPMENT

The control objective is to ensure that the system tracks a desired time-varying trajectory, denoted by q_d(t) ∈ R^n, despite uncertainties in the dynamic model, while minimizing a given performance index. To quantify the tracking objective, a position tracking error, denoted by e₁(t) ∈ R^n, is defined as

e₁ ≜ q_d − q.   (6)

To facilitate the subsequent analysis, filtered tracking errors, denoted by e₂(t), r(t) ∈ R^n, are also defined as

e₂ ≜ ė₁ + α₁e₁,   (7)

r ≜ ė₂ + α₂e₂,   (8)

where α₁, α₂ ∈ R^{n×n} are positive definite, constant gain matrices. The filtered tracking error r(t) is not measurable since the expression in (8) depends on q̈(t). The error systems are based on the assumption that the generalized coordinates of the Euler-Lagrange dynamics allow additive and not multiplicative errors. A state-space model can be developed based on the tracking errors in (6) and (7). Based on this model, a controller is developed that minimizes a quadratic performance index under the (temporary) assumption that the dynamics in (1), excluding the additive disturbance, are known. The feedback controller is the solution to a nonzero-sum differential game in the form of a Stackelberg problem. The subsequent analysis then uses a robust controller to identify the unknown dynamics and additive disturbance, thereby relaxing the temporary assumption that these dynamics are known.

To develop a state-space model for the tracking errors in (6) and (7), the time derivative of (7) is premultiplied by the inertia matrix, and substitutions are made from (1) and (6) to obtain

M ė₂ = −V_m e₂ − τ + f + τ_s + τ_d,   (9)

where the nonlinear function f(t) ∈ R^n is defined as

f ≜ M(q̈_d + α₁ė₁) + V_m(q̇_d + α₁e₁) + G + F.   (10)

Under the (temporary) assumption that the dynamics in (1) are known, the control input can be designed as

τ ≜ f + τ_d − μ,   (11)

where μ(t) ∈ R^n is an auxiliary control input that will be designed to minimize a subsequent performance index. By substituting (11) into (9), the closed-loop error system for e₂(t) can be obtained as

M ė₂ = −V_m e₂ + μ + τ_s.   (12)

A state-space model for (7) and (12) can now be developed as

ż = A(q, q̇)z + B(q)μ + B(q)τ_s,   (13)

where A(q, q̇) ∈ R^{2n×2n}, B(q) ∈ R^{2n×n}, and z(t) ∈ R^{2n} are defined as

A(q, q̇) ≜ [ −α₁, I_{n×n} ; 0_{n×n}, −M⁻¹V_m ],
B(q) ≜ [ 0_{n×n} ; M⁻¹ ],
z(t) ≜ [ e₁ᵀ  e₂ᵀ ]ᵀ,

where I_{n×n} and 0_{n×n} denote an n × n identity matrix and matrix of zeros, respectively.

IV. STACKELBERG GAME CONTROL DESIGN

Stackelberg differential games provide a framework for systems that operate on different levels with a prescribed hierarchy of decisions. The game is cast in two solution spaces: the leader's and the follower's. The follower tries to minimize a cost functional based on the decision from the leader, while the leader, knowing the follower's rationale, defines an input such that the leader's and the follower's inputs yield minimal cost functionals. The Stackelberg differential game for the system given in (13) can be formulated in an optimal control framework by defining the leader's input as u₂ = τ_s and the follower's input as u₁ = μ. The system in (13) can be rewritten as

ż = A(q, q̇)z + B(q)u₁ + B(q)u₂.   (14)

Each player has a cost functional J₁(z, u₁, u₂), J₂(z, u₁, u₂) ∈ R defined as

J₁ = ½ ∫₀^∞ ( zᵀQ₁z + u₁ᵀR₁₁u₁ + u₂ᵀR₁₂u₂ ) dt,   (15)

J₂ = ½ ∫₀^∞ ( zᵀQ₂z + u₁ᵀR₂₁u₁ + u₂ᵀR₂₂u₂ ) dt,   (16)

where Q₁, Q₂ ∈ R^{2n×2n} are positive definite, symmetric, constant matrices partitioned as

Q₁ = [ Q₁₁, Q₁₂ ; Q₁₂ᵀ, Q₂₂ ],   Q₂ = [ T₁₁, T₁₂ ; T₁₂ᵀ, T₂₂ ],

where Q_{ij}, T_{ij} ∈ R^{n×n} are constant symmetric matrices, i, j ∈ {1, 2}. Based on the minimum principle [19], the Hamiltonians H₁(z, u₁, u₂), H₂(z, u₁, u₂) ∈ R of the follower and leader are respectively defined as

H₁ = ½( zᵀQ₁z + u₁ᵀR₁₁u₁ + u₂ᵀR₁₂u₂ ) + λ₁ᵀ( Az + Bu₁ + Bu₂ ),   (17)

H₂ = ½( zᵀQ₂z + u₁ᵀR₂₁u₁ + u₂ᵀR₂₂u₂ ) + λ₂ᵀ( Az + Bu₁ + Bu₂ ) + ξᵀλ̇₁.   (18)

Applying the minimum principle to (17), the optimal controller and costate equation of the follower are

u₁ = −R₁₁⁻¹Bᵀλ₁,   (19)

λ̇₁ = −∂H₁/∂z = −( Q₁z + Aᵀλ₁ ).   (20)

Substituting (19) into (18) yields the leader's Hamiltonian

H₂ = ½( zᵀQ₂z + λ₁ᵀBR₁₁⁻¹R₂₁R₁₁⁻¹Bᵀλ₁ + u₂ᵀR₂₂u₂ ) + λ₂ᵀ( Az − BR₁₁⁻¹Bᵀλ₁ + Bu₂ ) + ξᵀλ̇₁,

where the leader's optimal controller and costate equations are defined as

u₂ = −R₂₂⁻¹Bᵀλ₂,   (21)

λ̇₂ = −∂H₂/∂z = −Q₂z − Aᵀλ₂ + Q₁ξ,   (22)

ξ̇ = −∂H₂/∂λ₁ = −( BR₁₁⁻¹ )R₂₁( R₁₁⁻¹Bᵀ )λ₁ + ( BR₁₁⁻¹Bᵀ )λ₂ + Aξ.   (23)

The expressions derived in (19)–(23) define the optimal control problem. The subsequent analysis aims at developing expressions for the costate variables (λ₁(t), λ₂(t), ξ(t)) that satisfy the costate equations (λ̇₁(t), λ̇₂(t), ξ̇(t)) and can be implemented by the controllers u₁(t) and u₂(t). To this end, the subsequent development is based on the following assumed solutions for the costate variables:

λ₁ = Pz,   (24)
λ₂ = Kz,   (25)
ξ = Sz,   (26)

where P(t), K(t), S(t) ∈ R^{2n×2n} are time-varying block-diagonal matrices, P = diag(P₁₁, P₂₂), K = diag(K₁₁, K₂₂), S = diag(S₁₁, S₂₂). Given these assumed solutions, conditions/constraints are then developed to ensure these solutions satisfy (17)–(23).

Substituting (14) and (19)–(26) into the derivative of (24)–(26) yields the three differential Riccati equations

0 = Ṗ + PA + AᵀP − PBR₁₁⁻¹BᵀP − PBR₂₂⁻¹BᵀK + Q₁,   (27)

0 = K̇ + KA + AᵀK − KBR₁₁⁻¹BᵀP − KBR₂₂⁻¹BᵀK + Q₂ − Q₁S,   (28)

0 = Ṡ + SA − AS − SBR₁₁⁻¹BᵀP − SBR₂₂⁻¹BᵀK + BR₁₁⁻¹R₂₁R₁₁⁻¹BᵀP − BR₁₁⁻¹BᵀK.   (29)

Equations (27)–(29) must be solved simultaneously to yield a control strategy for the leader and follower. The Riccati solutions P(t) and K(t) correspond to λ₁(t) and λ₂(t), respectively, while S(t) constrains the trajectories of P(t) and K(t). From the Riccati equation in (28), four simultaneous equations are generated as

0 = K̇₁₁ − K₁₁α₁ − α₁K₁₁ + T₁₁ − Q₁₁S₁₁,   (30)
0 = K₁₁ + T₁₂ − Q₁₂S₂₂,   (31)
0 = K₁₁ + T₁₂ᵀ − Q₁₂ᵀS₁₁,   (32)
0 = K̇₂₂ − K₂₂M⁻¹V_m − V_mᵀM⁻¹K₂₂ + T₂₂ − Q₂₂S₂₂ − K₂₂M⁻¹R₁₁⁻¹M⁻¹P₂₂ − K₂₂M⁻¹R₂₂⁻¹M⁻¹K₂₂.   (33)

If the (2,2) blocks are selected as

P₂₂ = K₂₂ = M(q),   (34)

then the skew-symmetry properties in Assumption 2 can be applied to (33) to determine that

0 = −R₁₁⁻¹ − R₂₂⁻¹ + T₂₂ − Q₂₂S₂₂,   (35)

which implies that S₂₂ is a constant matrix; therefore, K₁₁ must also be a constant matrix from (31). From the Riccati equation given in (27), the following four simultaneous equations are generated:

0 = Ṗ₁₁ − P₁₁α₁ − α₁P₁₁ + Q₁₁,   (36)
0 = P₁₁ + Q₁₂,   (37)
0 = P₁₁ + Q₁₂ᵀ,   (38)
0 = Ṗ₂₂ − P₂₂M⁻¹V_m − V_mᵀM⁻¹P₂₂ + Q₂₂ − P₂₂M⁻¹R₁₁⁻¹M⁻¹P₂₂ − P₂₂M⁻¹R₂₂⁻¹M⁻¹K₂₂.   (39)

Substituting (34) into (39) and applying Assumption 2 yields

0 = −R₁₁⁻¹ − R₂₂⁻¹ + Q₂₂,   (40)

which, when combined with (35), yields Q₂₂( I_{n×n} + S₂₂ ) − T₂₂ = 0. If Q₂ is chosen such that T₂₂ = −Q₂₂, then S₂₂ is constrained to be

S₂₂ = −2I_{n×n}.   (41)

From the Riccati equation given in (29), the following three simultaneous equations are generated:

0 = Ṡ₁₁ − S₁₁α₁ + α₁S₁₁,   (42)
0 = S₁₁ − S₂₂,   (43)
0 = Ṡ₂₂ − S₂₂M⁻¹V_m + M⁻¹V_mS₂₂ − S₂₂M⁻¹R₁₁⁻¹M⁻¹P₂₂ − S₂₂M⁻¹R₂₂⁻¹M⁻¹K₂₂ + M⁻¹R₁₁⁻¹R₂₁R₁₁⁻¹M⁻¹P₂₂ − M⁻¹R₁₁⁻¹M⁻¹K₂₂.   (44)

Substituting (34) and (41) into (44) results in the constraint

R₁₁⁻¹ + 2R₂₂⁻¹ + R₁₁⁻¹R₂₁R₁₁⁻¹ = 0.   (45)

In addition, substituting (41) into (43) yields

S₁₁ = −2I_{n×n}.   (46)

It is evident from (31), (32), (37), (38), (41), and (46) that the following relationships can be established:

P₁₁ = −½( Q₁₂ + Q₁₂ᵀ ),   (47)

K₁₁ = −½( T₁₂ + T₁₂ᵀ ) + 2P₁₁.   (48)

Another constraint can be established by substituting (48), (46), and (36) into (30) and reducing the equation as

0 = T₁₁ + ½[ α₁( T₁₂ + T₁₂ᵀ ) + ( T₁₂ + T₁₂ᵀ )α₁ ].

Substituting (13), (24), and (34) into (19) yields the Stackelberg-derived controller

μ = −R₁₁⁻¹e₂.   (49)

The controller in (49) is subject to the modeled disturbance, derived by substituting (13), (25), and (34) into (21), given as

τ_s = −R₂₂⁻¹e₂.   (50)

It is evident from (40) and (45) that the gain matrices R₁₁ and R₂₂ are constrained by

Q₂₂ + R₂₂⁻¹ + R₁₁⁻¹R₂₁R₁₁⁻¹ = 0.   (51)

In addition, the weights (Q₁, Q₂) imposed on the state vector in the cost functionals (15) and (16) are subject to the following constraints:

0 = Q₂₂ + T₂₂,   (52)
0 = ½[ α₁( Q₁₂ + Q₁₂ᵀ ) + ( Q₁₂ + Q₁₂ᵀ )α₁ ] + Q₁₁,   (53)
0 = ½[ α₁( T₁₂ + T₁₂ᵀ ) + ( T₁₂ + T₁₂ᵀ )α₁ ] + T₁₁.   (54)

Based on the open-loop Stackelberg strategy, the derived controller in (49) minimizes the cost functional given by (15) and is subject to the disturbance model in (50) that minimizes the cost functional given by (16). To demonstrate optimality of the proposed controller, Hamiltonians were constructed in (17)–(18) and an optimal control problem was formulated. The costate variables in (24)–(26) were assumed to be solutions to the costate equations (20), (22), and (23), and gain constraints were developed. If all the constraints in (51)–(54) are satisfied, then the assumed solutions in (24)–(26) satisfy (17)–(23) and hence are optimal.

V. RISE FEEDBACK CONTROL DEVELOPMENT

In general, the bounded disturbance τ_d(t) and the nonlinear dynamics given in (10) are unknown, so the controller given in (11) cannot be implemented. However, if the control input contains some method to identify and cancel these effects, then the closed-loop system converges to the state-space model in (13), so that μ(t) minimizes the respective performance index. In this section, a control input is developed that exploits RISE feedback to identify the nonlinear effects and bounded disturbances, enabling asymptotic convergence to the state-space model. To develop the control input, the error system in (8) is premultiplied by M(q) and the expressions in (1), (6), and (7) are utilized to obtain

Mr = −V_m e₂ + f + τ_s + τ_d + α₂Me₂ − τ.   (55)

Based on the open-loop error system in (55), the control input is composed of the optimal control developed in (49) plus a subsequently designed auxiliary control term μ_R(t) ∈ R^n as

τ ≜ μ_R − μ.   (56)

The closed-loop tracking error system can be developed by substituting (56) into (55) as

Mr = −V_m e₂ + f + τ_s + τ_d + α₂Me₂ + μ − μ_R.   (57)
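To illustrate the structure of the coupled Riccati conditions (27)–(29) from Section IV, the following sketch solves a scalar, steady-state analogue (time derivatives set to zero; the commutator SA − AS vanishes for scalars). The plant, weights, and solver below are illustrative assumptions, not the paper's matrix development:

```python
import numpy as np

# Scalar steady-state analogue of (27)-(29) for x' = a x + b u1 + b u2,
# with assumed weights q1, q2, r11, r22, r21.
a, b = -1.0, 1.0
q1, q2 = 1.0, 1.0
r11, r22, r21 = 1.0, 1.0, 0.5

def residual(v):
    p, k, s = v
    f1 = 2*a*p + q1 - (b**2/r11)*p**2 - (b**2/r22)*p*k          # eq. (27)
    f2 = 2*a*k + q2 - q1*s - (b**2/r11)*k*p - (b**2/r22)*k**2   # eq. (28)
    f3 = (-s*(b**2/r11)*p - s*(b**2/r22)*k                      # eq. (29)
          + (b**2*r21/r11**2)*p - (b**2/r11)*k)
    return np.array([f1, f2, f3])

# Damped Newton iteration with a finite-difference Jacobian
v = np.array([0.5, 0.5, 0.0])
for _ in range(100):
    F = residual(v)
    J = np.empty((3, 3))
    h = 1e-7
    for j in range(3):
        e = np.zeros(3); e[j] = h
        J[:, j] = (residual(v + e) - F) / h
    v = v - 0.5 * np.linalg.solve(J, F)

p, k, s = v
print(p, k, s, np.linalg.norm(residual(v)))
```

The follower's Riccati gain p comes out positive, while s is unconstrained in sign, mirroring the role of S(t) as a constraint rather than a feedback gain.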

To facilitate the subsequent stability analysis, the auxiliary function f_d(t) ∈ R^n, defined as

f_d ≜ M(q_d)q̈_d + V_m(q_d, q̇_d)q̇_d + G(q_d) + F(q̇_d),   (58)

is added and subtracted to (57) to yield

Mr = −V_m e₂ + f̄ + f_d + τ_s + τ_d + μ − μ_R + α₂Me₂,   (59)

where f̄(t) ∈ R^n is defined as

f̄ ≜ f − f_d.   (60)

Substituting (50) into (59), taking the time derivative, and using the relationship in (8) yields

Mṙ = −½Ṁr + Ñ + N_d − e₂ − ( R₁₁⁻¹ + R₂₂⁻¹ )r − μ̇_R   (61)

after strategically grouping specific terms. In (61), the unmeasurable auxiliary terms Ñ(e₁, e₂, r, t), N_d(t) ∈ R^n are defined as

Ñ ≜ −½Ṁr − V̇_m e₂ − V_m ė₂ + f̄̇ + e₂ + α₂Ṁe₂ + α₂Mė₂ + ( R₁₁⁻¹ + R₂₂⁻¹ )α₂e₂,

N_d ≜ ḟ_d + τ̇_d.

Motivation for grouping terms into Ñ(e₁, e₂, r, t) and N_d(t) comes from the subsequent stability analysis and the fact that the Mean Value Theorem, Assumption 3, Assumption 4, and Assumption 5 can be used to upper bound the auxiliary terms as

‖Ñ(t)‖ ≤ ρ(‖y‖)‖y‖,   (62)

‖N_d‖ ≤ ζ_{N1},   ‖Ṅ_d‖ ≤ ζ_{N2},   (63)

where y(t) ∈ R^{3n} is defined as

y(t) ≜ [ e₁ᵀ  e₂ᵀ  rᵀ ]ᵀ,   (64)

the bounding function ρ(‖y‖) ∈ R is a positive, globally invertible, nondecreasing function, and ζ_{Ni} ∈ R (i = 1, 2) denote known positive constants. Based on (61), the control term μ_R(t) is designed as the generalized solution to

μ̇_R(t) ≜ ( k_s + 1 )r(t) + β₁ sgn( e₂(t) ),   (65)

where k_s, β₁ ∈ R are positive constant control gains. The closed-loop error system for r(t) can now be obtained by substituting (65) into (61) as

Mṙ = −½Ṁr + Ñ + N_d − e₂ − ( R₁₁⁻¹ + R₂₂⁻¹ )r − ( k_s + 1 )r − β₁ sgn( e₂ ).   (66)

VI. STABILITY ANALYSIS

Theorem 1: The controller given in (49) and (56) ensures that all system signals are bounded under closed-loop operation, and the tracking errors are regulated in the sense that

‖e₁(t)‖, ‖e₂(t)‖, ‖r(t)‖ → 0   as   t → ∞.   (67)

The boundedness of the closed-loop signals and the result in (67) can be obtained provided the control gain k_s introduced in (65) is selected sufficiently large (see the subsequent stability analysis), and α₁, α₂ are selected according to the sufficient conditions

λ_min(α₁) > ½,   λ_min(α₂) > 1,   (68)

where λ_min(·) denotes the minimum eigenvalue. The gain β₁ is selected according to the following sufficient condition:

β₁ > ζ_{N1} + ( 1/λ_min(α₂) ) ζ_{N2},   (69)

where ζ_{N1}, ζ_{N2} were introduced in (63). Furthermore, μ(t) converges to an optimal controller that minimizes (15) subject to (13) provided the gain constraints given in (51)–(54) are satisfied.

Remark: The control gain α₁ cannot be arbitrarily selected; rather, it is calculated using a Lyapunov equation solver. Its value is determined by the cost-functional weights through the constraints in (51)–(54); therefore, the weights must be chosen such that (68) is satisfied.

Proof: Let D ⊂ R^{3n+1} be a domain containing Φ(t) = 0, where Φ(t) ∈ R^{3n+1} is defined as

Φ(t) ≜ [ yᵀ  √P(t) ]ᵀ.   (70)

In (70), the auxiliary function P(t) ∈ R is defined as

P(t) ≜ β₁‖e₂(0)‖₁ − e₂(0)ᵀN_d(0) − L(t),   (71)

where the auxiliary function L(t) ∈ R is the generalized solution to

L̇(t) ≜ rᵀ( N_d(t) − β₁ sgn(e₂) ),   (72)

and β₁ ∈ R is a positive constant chosen according to the sufficient condition in (69). As illustrated in [20], provided the sufficient condition introduced in (69) is satisfied, the following inequality can be obtained:

L(t) ≤ β₁‖e₂(0)‖₁ − e₂(0)ᵀN_d(0).   (73)

Hence, (73) can be used to conclude that P(t) ≥ 0. Let V(Φ, t) : D × [0, ∞) → R be a continuously differentiable, positive definite function defined as

V(Φ, t) ≜ e₁ᵀe₁ + ½e₂ᵀe₂ + ½rᵀM(q)r + P,   (74)

which satisfies the following inequalities:

U₁(Φ) ≤ V(Φ, t) ≤ U₂(Φ),   (75)

provided the sufficient condition introduced in (69) is satisfied. In (75), the continuous positive definite functions U₁(Φ), U₂(Φ) ∈ R are defined as U₁(Φ) ≜ λ₁‖Φ‖² and U₂(Φ) ≜ λ₂(q)‖Φ‖², where λ₁, λ₂(q) ∈ R are defined as

λ₁ ≜ ½ min{ 1, m₁ },   λ₂(q) ≜ max{ ½ m̄(q), 1 },

where m₁, m̄(q) are introduced in (2). After taking the time derivative of (74), V̇(Φ, t) can be expressed as

V̇(Φ, t) = 2e₁ᵀė₁ + e₂ᵀė₂ + rᵀMṙ + ½rᵀṀr + Ṗ.

After utilizing (7), (8), (66), and substituting in for the time derivative of P(t), V̇(Φ, t) can be simplified as follows:

V̇(Φ, t) ≤ −2e₁ᵀα₁e₁ + 2e₂ᵀe₁ + rᵀÑ(t) − ( k_s + 1 + λ_min( R₁₁⁻¹ + R₂₂⁻¹ ) )‖r‖² − λ_min(α₂)‖e₂‖².   (76)

Based on the fact that

e₂ᵀe₁ ≤ ½‖e₁‖² + ½‖e₂‖²,

the expression in (76) can be simplified as

V̇ ≤ rᵀÑ(t) − ( k_s + 1 + λ_min( R₁₁⁻¹ + R₂₂⁻¹ ) )‖r‖² − ( 2λ_min(α₁) − 1 )‖e₁‖² − ( λ_min(α₂) − 1 )‖e₂‖².   (77)

By using (62), the expression in (77) can be rewritten as

V̇ ≤ −λ₃‖y‖² − [ k_s‖r‖² − ρ(‖y‖)‖r‖‖y‖ ],   (78)

where λ₃ ≜ min{ 2λ_min(α₁) − 1, λ_min(α₂) − 1, 1 + λ_min( R₁₁⁻¹ + R₂₂⁻¹ ) }, and α₁ and α₂ are chosen according to the sufficient conditions in (68). After completing the squares for the terms inside the brackets in (78), the following expression can be obtained:

V̇ ≤ −λ₃‖y‖² + ρ²(‖y‖)‖y‖² / ( 4k_s ) ≤ −U(Φ),   (79)

where U(Φ) = c‖y‖², for some positive constant c, is a continuous, positive semi-definite function that is defined on the closed set

D ≜ { Φ ∈ R^{3n+1} : ‖Φ‖ ≤ ρ⁻¹( 2√( λ₃k_s ) ) }.

The inequalities in (75) and (79) can be used to show that V(Φ, t) ∈ L∞ in D; hence, e₁(t), e₂(t), and r(t) ∈ L∞ in D. Given that e₁(t), e₂(t), and r(t) ∈ L∞ in D, standard linear analysis methods can be used to prove that ė₁(t), ė₂(t) ∈ L∞ in D from (7) and (8). Since e₁(t), e₂(t), r(t) ∈ L∞ in D, the assumption that q_d(t), q̇_d(t), q̈_d(t) exist and are bounded can be used along with (6)–(8) to conclude that q(t), q̇(t), q̈(t) ∈ L∞ in D. Since q(t), q̇(t) ∈ L∞ in D, Assumption 3 can be used to conclude that M(q), V_m(q, q̇), G(q), and F(q̇) ∈ L∞ in D. Thus, from (1) and Assumption 5, we can show that τ(t) ∈ L∞ in D. Given that r(t) ∈ L∞ in D, it can be shown that μ̇_R(t) ∈ L∞ in D. Since q̇(t), q̈(t) ∈ L∞ in D, Assumption 3 can be used to show that Ṁ(q), V̇_m(q, q̇), Ġ(q), and Ḟ(q̇) ∈ L∞ in D; hence, (66) can be used to show that ṙ(t) ∈ L∞ in D. Since ė₁(t), ė₂(t), ṙ(t) ∈ L∞ in D, the definitions for U(Φ) and y(t) can be used to prove that U(Φ) is uniformly continuous in D. Let S ⊂ D denote a set defined as follows:

S ≜ { Φ(t) ⊂ D | U₂(Φ(t)) < λ₁( ρ⁻¹( 2√( λ₃k_s ) ) )² }.

(80)

The region of attraction in (80) can be made arbitrarily large to include any initial conditions by increasing the control gain k_s (i.e., a semi-global type of stability result) [21]. Theorem 8.4 of [22] can now be invoked to state that

c‖y(t)‖² → 0   as   t → ∞,   ∀ Φ(0) ∈ S.   (81)

Based on the definition of y(t), (81) can be used to conclude the results in (67) ∀ Φ(0) ∈ S. Since μ(t) → 0 as e₂(t) → 0 (see (49)), (59) can be used to conclude that

f̄ + f_d + τ_d → μ_R   as   μ(t), e₂(t) → 0.   (82)

The result in (82) indicates that the dynamics in (1) converge to the state-space system in (13). Hence, μ(t) converges to an optimal controller that minimizes (15) subject to (13) in the presence of structured disturbances, provided the gain constraints given in (51)–(54) are satisfied.

VII. CONCLUSION

A novel approach for the design of a Stackelberg-based feedback controller was proposed for an Euler-Lagrange system subject to state-dependent and bounded disturbances. An optimal Stackelberg feedback component was used in conjunction with a RISE feedback component, which enables the generalized coordinates of the system to semi-globally asymptotically track a desired time-varying trajectory despite uncertainty in the dynamics. Using a Lyapunov stability analysis and a Stackelberg game development, sufficient gain conditions were derived to ensure asymptotic stability while maintaining

optimality of the proposed controller. Future work will focus on generalizing the Stackelberg framework to a more general class of systems.

REFERENCES

[1] T. Basar and G. J. Olsder, Dynamic Noncooperative Game Theory. Philadelphia, PA: SIAM, 1999.
[2] M. Bloem, T. Alpcan, and T. Basar, "A Stackelberg game for power control and channel allocation in cognitive radio networks," in Proc. ACM Intl. Conf. on Game Theory in Communication Networks (GameComm), 2007.
[3] T. Basar and H. Selbuz, "Closed-loop Stackelberg strategies with applications in the optimal control of multilevel systems," IEEE Trans. Autom. Control, vol. 24, no. 2, pp. 166–179, 1979.
[4] J. Medanic, "Closed-loop Stackelberg strategies in linear-quadratic problems," IEEE Trans. Autom. Control, vol. 23, no. 4, pp. 632–637, 1978.
[5] M. Simaan and J. B. Cruz, Jr., "A Stackelberg solution for games with many players," IEEE Trans. Autom. Control, vol. 18, pp. 322–324, 1973.
[6] G. Papavassilopoulos and J. Cruz, "Nonclassical control problems and Stackelberg games," IEEE Trans. Autom. Control, vol. 24, pp. 155–166, 1979.
[7] A. Gambier, A. Wellenreuther, and E. Badreddin, "A new approach to design multi-loop control systems with multiple controllers," in Proc. IEEE Conf. on Decision and Control, 2006.
[8] J. Hongbin and C. Y. Huang, "Non-cooperative uplink power control in cellular radio systems," Wireless Networks, vol. 4, pp. 233–240, 1998.
[9] T. Basar and P. Bernhard, H∞-Optimal Control and Related Minimax Design Problems. Boston: Birkhäuser, 2008.
[10] A. Isidori and A. Astolfi, "Disturbance attenuation and H∞ control via measurement feedback in nonlinear systems," IEEE Trans. Autom. Control, vol. 37, no. 9, pp. 1283–1293, 1992.
[11] L. Pavel, "A noncooperative game approach to OSNR optimization in optical networks," IEEE Trans. Autom. Control, vol. 51, pp. 848–852, 2006.
[12] C. Tomlin, "A game theoretic approach to controller design for hybrid systems," Proceedings of the IEEE, vol. 88, pp. 949–970, 2000.
[13] T. Basar and G. J. Olsder, "Team-optimal closed-loop Stackelberg strategies in hierarchical control problems," Automatica, vol. 16, pp. 409–414, 1980.
[14] M. Jungers, E. Trelat, and H. Abou-Kandil, "Stackelberg strategy with closed-loop information structure for linear-quadratic games," Preprint HAL, 2006.
[15] Z. Cai, M. S. de Queiroz, and D. M. Dawson, "Robust adaptive asymptotic tracking of nonlinear systems with additive disturbance," IEEE Trans. Autom. Control, vol. 51, pp. 524–529, 2006.
[16] C. Makkar, G. Hu, W. G. Sawyer, and W. E. Dixon, "Lyapunov-based tracking control in the presence of uncertain nonlinear parameterizable friction," IEEE Trans. Autom. Control, vol. 52, no. 10, pp. 1988–1994, 2007.
[17] P. M. Patre, W. MacKunis, K. Kaiser, and W. E. Dixon, "Asymptotic tracking for uncertain dynamic systems via a multilayer neural network feedforward and RISE feedback control structure," IEEE Trans. Autom. Control, 2008, to appear.
[18] R. Ortega, A. Loría, P. J. Nicklasson, and H. J. Sira-Ramirez, Passivity-based Control of Euler-Lagrange Systems: Mechanical, Electrical and Electromechanical Applications. Springer, 1998.
[19] D. Kirk, Optimal Control Theory: An Introduction. Dover, 2004.
[20] K. Dupree, P. M. Patre, Z. D. Wilcox, and W. E. Dixon, "Optimal control of uncertain nonlinear systems using RISE feedback," in Proc. IEEE Conf. on Decision and Control, Dec. 2008, pp. 2154–2159.
[21] B. Xian, D. M. Dawson, M. S. de Queiroz, and J. Chen, "A continuous asymptotic tracking control strategy for uncertain nonlinear systems," IEEE Trans. Autom. Control, vol. 49, no. 7, pp. 1206–1211, Jul. 2004.
[22] H. K. Khalil, Nonlinear Systems. New Jersey: Prentice-Hall, 2002.