2011 IEEE International Conference on Control Applications (CCA) Part of 2011 IEEE Multi-Conference on Systems and Control Denver, CO, USA. September 28-30, 2011
Discrete-time Nonlinear Systems Inverse Optimal Control: A Control Lyapunov Function Approach
Fernando Ornelas, Edgar N. Sanchez and Alexander G. Loukianov
Abstract: This paper presents an inverse optimal control approach for exponential stabilization of discrete-time nonlinear systems, which avoids solving the associated Hamilton-Jacobi-Bellman (HJB) equation while minimizing a meaningful cost function. The stabilizing optimal controller is based on a discrete-time control Lyapunov function. The applicability of the proposed approach is illustrated via simulation of the stabilization of an example system.
I. INTRODUCTION
In optimal nonlinear control, we deal with the problem of finding a stabilizing control law for a given system such that a criterion, which is a function of the state variables and the control inputs, is minimized; the major drawback is the requirement to solve the associated HJB equation. In fact, the HJB equation has so far rarely proved useful except for the linear regulator problem, to which it seems particularly well suited. In this paper, the inverse optimal control approach, proposed initially by Kalman for linear systems and quadratic cost functions, is treated for the discrete-time nonlinear systems case. The aim of inverse optimal control is to avoid the solution of the HJB equation. In the inverse approach, a stabilizing feedback control law, based on a priori knowledge of a control Lyapunov function (CLF), is designed first, and then it is established that this control law optimizes a meaningful cost functional. The main characteristic of the inverse problem is that the meaningful cost function is determined a posteriori for the stabilizing feedback control law. For applications of continuous-time inverse optimal control, we refer to the works listed in the references. To the best of our knowledge, there are few results on discrete-time nonlinear inverse optimal control. In one of them, an inverse optimal control scheme is proposed based on a passivity approach, where a storage function is used as the Lyapunov function and output feedback is used as the stabilizing control law. In this paper, we directly propose a CLF to establish the stabilizing control law and to minimize a cost functional. Although stability margins do not guarantee robustness, they do characterize basic robustness properties that well-designed feedback systems must possess. Optimality is thus a discriminating measure by which to select, from among the entire set of stabilizing control laws, those with desirable properties.
This work is supported by CONACYT under projects 57801 and 46069. All authors are with CINVESTAV, Unidad Guadalajara, Jalisco 45019, México. [email protected]
978-1-4577-1063-6/11/$26.00 ©2011 IEEE
Inverse optimal controllers possess desirable
robustness properties because they are optimal with respect to sensible performance criteria. The existence of a CLF implies stabilizability, and every CLF can be considered as a meaningful cost function. Systematic techniques for finding CLFs do not exist for general nonlinear systems; however, this approach has been applied successfully to classes of systems for which CLFs can be found, such as feedback linearizable, strict feedback and feed-forward systems. Moreover, by using a CLF, the system is not required to be stable for zero input ($u_k = 0$). The contents of the paper are as follows. Section II gives a brief review of optimal control and presents a general stability analysis. Section III establishes the inverse optimal control and its solution by means of a quadratic CLF. Section IV illustrates the applicability of the proposed inverse optimal controller by means of an example. Finally, Section V concludes the paper.
II. MATHEMATICAL PRELIMINARIES
A. OPTIMAL CONTROL
Although the main goal of the paper is to design an inverse optimal discrete-time controller, this section briefly discusses the optimal control methodology and its limitations. Consider the nonlinear discrete-time affine system
$$x_{k+1} = f(x_k) + g(x_k)\,u_k, \qquad x_0 = x(0) \tag{1}$$
with the associated meaningful cost functional
$$V(x_k) = \sum_{n=k}^{\infty} \left( l(x_n) + u_n^T R(x_n)\,u_n \right) \tag{2}$$
where $x_k \in \mathbb{R}^n$ is the state of the system at time $k \in \mathbb{N}$, with $\mathbb{N}$ the set of nonnegative integers; $u_k \in \mathbb{R}^m$ is the control input; $f : \mathbb{R}^n \to \mathbb{R}^n$ and $g : \mathbb{R}^n \to \mathbb{R}^{n \times m}$ are smooth mappings with $f(0) = 0$ and $g(x_k) \neq 0$ for all $x_k \neq 0$; $V : \mathbb{R}^n \to \mathbb{R}^+$; $l : \mathbb{R}^n \to \mathbb{R}^+$ is a positive semidefinite$^1$ function and $R : \mathbb{R}^n \to \mathbb{R}^{m \times m}$ is a real symmetric positive definite$^2$ weighting matrix. The meaningful cost functional (2) is a performance measure. The entries of $R$ can be functions of the system state in order to vary the weighting on control effort according to the state value.
$^1$ A function $l(z)$ is positive semidefinite (or nonnegative definite) if $l(z) \geq 0$ for all vectors $z$. In other words, there are some vectors $z$ for which $l(z) = 0$, and for all other $z$, $l(z) > 0$.
$^2$ A real symmetric matrix $R$ is positive definite if $z^T R z > 0$ for all $z \neq 0$.
Considering the state feedback control design problem, we assume that the full state $x_k$ is available. Equation (2) can be rewritten as
$$V(x_k) = l(x_k) + u_k^T R(x_k)\,u_k + \sum_{n=k+1}^{\infty} \left( l(x_n) + u_n^T R(x_n)\,u_n \right) = l(x_k) + u_k^T R(x_k)\,u_k + V(x_{k+1}) \tag{3}$$
where we require the boundary condition $V(0) = 0$ so that $V(x_k)$ becomes a Lyapunov function. From Bellman's optimality principle, it is known that, for the infinite horizon optimization case, the value function $V(x_k)$ becomes time invariant and satisfies the discrete-time (DT) Bellman equation
$$V(x_k) = \min_{u_k} \left\{ l(x_k) + u_k^T R(x_k)\,u_k + V(x_{k+1}) \right\} \tag{4}$$
where $V(x_{k+1})$ depends on both $x_k$ and $u_k$ by means of $x_{k+1}$ in (1). Note that the DT Bellman equation is solved backward in time. In order to establish the conditions which the optimal control must satisfy, we define the discrete-time Hamiltonian $H$ (see pages 830–832 of the cited reference) as
$$H(x_k, u_k) = l(x_k) + u_k^T R(x_k)\,u_k + V(x_{k+1}) - V(x_k). \tag{5}$$
A necessary condition that the optimal control law should satisfy is $\partial H / \partial u_k = 0$, which is equivalent to setting to zero the gradient of the right-hand side of (4) with respect to $u_k$; then
$$0 = 2 R(x_k)\,u_k + \frac{\partial V(x_{k+1})}{\partial u_k} = 2 R(x_k)\,u_k + g^T(x_k)\,\frac{\partial V(x_{k+1})}{\partial x_{k+1}}. \tag{6}$$
Therefore, the optimal control law is formulated as
$$u_k = -\frac{1}{2} R^{-1}(x_k)\, g^T(x_k)\, \frac{\partial V(x_{k+1})}{\partial x_{k+1}} \tag{7}$$
with the boundary condition $V(0) = 0$. Moreover, since $H$ is quadratic in $u_k$ and $R(x_k) > 0$, the condition
$$\frac{\partial^2 H}{\partial u_k^2} > 0$$
holds, which is sufficient for the optimal control law (7) to (globally) minimize $H$ and the performance index (2). Substituting (7) in (4) results in the following DT Hamilton-Jacobi-Bellman (HJB) equation:
$$V(x_k) = l(x_k) + \left( -\frac{1}{2} R^{-1}(x_k)\, g^T(x_k)\, \frac{\partial V(x_{k+1})}{\partial x_{k+1}} \right)^{T} R(x_k) \left( -\frac{1}{2} R^{-1}(x_k)\, g^T(x_k)\, \frac{\partial V(x_{k+1})}{\partial x_{k+1}} \right) + V(x_{k+1})$$
$$= l(x_k) + V(x_{k+1}) + \frac{1}{4} \frac{\partial V^T(x_{k+1})}{\partial x_{k+1}}\, g(x_k)\, R^{-1}(x_k)\, g^T(x_k)\, \frac{\partial V(x_{k+1})}{\partial x_{k+1}} \tag{8}$$
which can be rewritten as
$$l(x_k) + V(x_{k+1}) - V(x_k) + \frac{1}{4} \frac{\partial V^T(x_{k+1})}{\partial x_{k+1}}\, g(x_k)\, R^{-1}(x_k)\, g^T(x_k)\, \frac{\partial V(x_{k+1})}{\partial x_{k+1}} = 0. \tag{9}$$
Nevertheless, solving the partial differential equation (9) is not simple. Thus, having to solve the above HJB equation for $V(x_k)$ constitutes an important disadvantage of discrete-time optimal control for nonlinear systems.
B. LYAPUNOV STABILITY
Because inverse optimal control is based on a Lyapunov function, we establish the following definitions. A function $V(x_k)$ satisfying $V(x_k) \to \infty$ as $\|x_k\| \to \infty$ is said to be radially unbounded.
Definition 1: Let $V(x_k)$ be a radially unbounded, positive definite function, with $V(x_k) > 0$, $\forall x_k \neq 0$, and $V(0) = 0$. If for any $x_k \in \mathbb{R}^n$, $x_k \neq 0$, there exist real values $u_k$ such that
$$\Delta V(x_k, u_k) < 0,$$
where the Lyapunov difference $\Delta V(x_k, u_k)$ is defined as $V(x_{k+1}) - V(x_k) = V(f(x_k) + g(x_k)\,u_k) - V(x_k)$, then $V(\cdot)$ is said to be a "discrete-time control Lyapunov function" (CLF) for system (1).
Theorem 1: (Exponential stability) Suppose that there exist a positive definite function $V : \mathbb{R}^n \to \mathbb{R}$ and constants $c_1, c_2, c_3 > 0$ and $p > 1$ such that
$$c_1 \|x\|^p \leq V(x_k) \leq c_2 \|x\|^p, \qquad \Delta V(x_k, u_k) \leq -c_3 \|x\|^p, \qquad \forall k \geq 0, \; \forall x \in \mathbb{R}^n. \tag{10}$$
Then $x_k = 0$ is an exponentially stable equilibrium of system (1).
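As an informal illustration of Definition 1, the following sketch evaluates the smallest achievable Lyapunov difference for a quadratic candidate $V(x) = \frac{1}{2} x^T P x$ on a single-input system. It borrows the second-order example of Section IV (system (35)-(36) with $P = 10\,I$); the helper names `f`, `g`, `V` and `min_lyapunov_difference` are ours, and the closed-form minimizer over $u_k$ assumes that $V$ is quadratic and that $g(x_k) \neq 0$. A negative value at a sampled state is only a pointwise check, not a proof that $V$ is a CLF on all of $\mathbb{R}^n$.

```python
import math

P_DIAG = (10.0, 10.0)  # diagonal of P = 10 * I (value used in Section IV)


def f(x):
    """Drift term f(x_k) of the example system (35)."""
    x1, x2 = x
    return (x1 * x2 - 0.8 * x2, x1 ** 2 + 1.8 * x2)


def g(x):
    """Input vector g(x_k) of the example system (36)."""
    return (0.0, -2.0 + math.cos(x[1]))


def V(x):
    """Quadratic candidate V(x) = 0.5 * x^T P x with diagonal P."""
    return 0.5 * sum(p * xi * xi for p, xi in zip(P_DIAG, x))


def min_lyapunov_difference(x):
    """Return (min over u of V(f + g*u) - V(x), minimizing u)."""
    fx, gx = f(x), g(x)
    gPf = sum(p * gi * fi for p, gi, fi in zip(P_DIAG, gx, fx))
    gPg = sum(p * gi * gi for p, gi in zip(P_DIAG, gx))
    u = -gPf / gPg  # stationary point of the quadratic in u (needs g != 0)
    x_next = tuple(fi + gi * u for fi, gi in zip(fx, gx))
    return V(x_next) - V(x), u


dv, u = min_lyapunov_difference((2.0, -2.0))
print("min Lyapunov difference:", dv, "attained at u =", u)
```

A negative first component means that some input decreases $V$ at that state; the check has to be repeated at every state of interest.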
III. INVERSE OPTIMAL CONTROL
For the inverse approach, a stabilizing feedback control law is first developed, and then it is established that this control law optimizes a meaningful cost functional. When we want to emphasize that $u_k$ is optimal, we write $u_k^*$. The following definition establishes the discrete-time inverse optimal control problem.
Definition 2: The control law
$$u_k^* = -\frac{1}{2} R^{-1}(x_k)\, g^T(x_k)\, \frac{\partial V(x_{k+1})}{\partial x_{k+1}} \tag{11}$$
is inverse optimal (globally) stabilizing if (i) it achieves (global) asymptotic stability of $x = 0$ for system (1); (ii) $V(x_k)$ is a (radially unbounded) positive definite function such that the inequality
$$\bar{V} := V(x_{k+1}) - V(x_k) + u_k^{*T} R(x_k)\, u_k^* \leq 0 \tag{12}$$
is fulfilled. When we select $l(x_k) := -\bar{V}$, then $V(x_k)$ is a solution of (9).
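The last statement of Definition 2 can be verified in one step. The short derivation below is written out for convenience; it uses only (9), (11), (12) and the symmetry of $R(x_k)$, with $\bar{V}$ denoting the quantity defined in (12).

```latex
\begin{align*}
\frac{1}{4}\,\frac{\partial V^{T}(x_{k+1})}{\partial x_{k+1}}\, g(x_k)\, R^{-1}(x_k)\, g^{T}(x_k)\,
\frac{\partial V(x_{k+1})}{\partial x_{k+1}}
&= \Big(-\tfrac{1}{2} R^{-1}(x_k)\, g^{T}(x_k)\, \tfrac{\partial V(x_{k+1})}{\partial x_{k+1}}\Big)^{T}
   R(x_k)
   \Big(-\tfrac{1}{2} R^{-1}(x_k)\, g^{T}(x_k)\, \tfrac{\partial V(x_{k+1})}{\partial x_{k+1}}\Big) \\
&= u_k^{*T} R(x_k)\, u_k^{*}, \\[4pt]
\text{so (9) reads}\quad
l(x_k) + \underbrace{V(x_{k+1}) - V(x_k) + u_k^{*T} R(x_k)\, u_k^{*}}_{\bar{V}} &= 0
\quad\Longleftrightarrow\quad l(x_k) = -\bar{V}.
\end{align*}
```

Hence requiring $\bar{V} \leq 0$ in (12) is the same as requiring the state penalty $l(x_k)$ to be nonnegative.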
As established in Definition 2, the inverse optimal control problem is based on knowledge of $V(x_k)$; thus, we propose a CLF $V(x_k)$ such that (i) and (ii) can be guaranteed. For the control law (11), let us consider the twice differentiable ($C^2$) positive definite function
$$V(x_k) = \frac{1}{2}\, x_k^T P\, x_k \tag{13}$$
as a CLF, where $P \in \mathbb{R}^{n \times n}$ is assumed to be a positive definite ($P > 0$) and symmetric ($P = P^T$) matrix. Considering one step ahead for (13) and evaluating (11), we obtain
$$u_k^* = -\frac{1}{2} R^{-1}(x_k)\, g^T(x_k)\, \frac{\partial V(x_{k+1})}{\partial x_{k+1}} = -\frac{1}{2} R^{-1}(x_k)\, g^T(x_k)\,(P\, x_{k+1}) = -\frac{1}{2} R^{-1}(x_k)\, g^T(x_k)\left( P f(x_k) + P g(x_k)\, u_k^* \right). \tag{14}$$
Thus,
$$\left( I + \frac{1}{2} R^{-1}(x_k)\, g^T(x_k)\, P\, g(x_k) \right) u_k^* = -\frac{1}{2} R^{-1}(x_k)\, g^T(x_k)\, P f(x_k).$$
Multiplying by $R(x_k)$, (14) becomes
$$\left( R(x_k) + \frac{1}{2}\, g^T(x_k)\, P\, g(x_k) \right) u_k^* = -\frac{1}{2}\, g^T(x_k)\, P f(x_k) \tag{15}$$
which results in the following state feedback control law:
$$u_k^* = \alpha(x_k) = -\frac{1}{2} \left( R(x_k) + P_2(x_k) \right)^{-1} P_1(x_k) \tag{16}$$
where $P_1(x_k) = g^T(x_k)\, P f(x_k)$ and $P_2(x_k) = \frac{1}{2}\, g^T(x_k)\, P\, g(x_k)$. Note that $P_2(x_k)$ is a symmetric positive semidefinite matrix and $R(x_k)$ is positive definite, so $R(x_k) + P_2(x_k)$ is positive definite, which ensures that the inverse in (16) exists. Having proposed a CLF for solving the inverse optimal control problem in accordance with Definition 2, we now present the main contribution.
Theorem 2: Consider the affine discrete-time nonlinear system (1). If there exists a matrix $P = P^T > 0$ such that the following inequality holds:
$$V_f(x_k) - \frac{1}{4} P_1^T(x_k) \left( R(x_k) + P_2(x_k) \right)^{-1} P_1(x_k) \leq -\zeta_Q \|x_k\|^2 \tag{17}$$
where $V_f(x_k) = \frac{1}{2}\left[ f^T(x_k)\, P f(x_k) - x_k^T P\, x_k \right] = V(f(x_k)) - V(x_k)$ and $\zeta_Q > 0$, with $P_1(x_k)$ and $P_2(x_k)$ as defined in (16), then the equilibrium point $x_k = 0$ of system (1) is globally exponentially stabilized by the control law (16), with the CLF (13). Moreover, with (13) as a CLF, this control law is inverse optimal in the sense that it minimizes the meaningful functional given by
$$V(x_k) = \sum_{k=0}^{\infty} \left( l(x_k) + u_k^T R(x_k)\, u_k \right) \tag{18}$$
with
$$l(x_k) = -\bar{V}\,\big|_{u_k^* = \alpha(x_k)}. \tag{19}$$
Proof: First, we analyze stability. Global stability of the equilibrium point $x_k = 0$ of system (1) with (16) as input is achieved if inequality (12) for $\bar{V}$ is satisfied. Here $\bar{V}$ results in
$$\bar{V} = V(x_{k+1}) - V(x_k) + \alpha^T(x_k)\, R(x_k)\, \alpha(x_k)$$
$$= \frac{f^T(x_k) P f(x_k) + 2 f^T(x_k) P g(x_k)\, \alpha(x_k) + \alpha^T(x_k)\, g^T(x_k) P g(x_k)\, \alpha(x_k) - x_k^T P x_k}{2} + \alpha^T(x_k)\, R(x_k)\, \alpha(x_k)$$
$$= V_f(x_k) - \frac{1}{2} P_1^T(x_k) \left( R(x_k) + P_2(x_k) \right)^{-1} P_1(x_k) + \frac{1}{4} P_1^T(x_k) \left( R(x_k) + P_2(x_k) \right)^{-1} P_1(x_k)$$
$$= V_f(x_k) - \frac{1}{4} P_1^T(x_k) \left( R(x_k) + P_2(x_k) \right)^{-1} P_1(x_k). \tag{20}$$
Selecting $P$ such that $\bar{V} \leq 0$, stability of $x_k = 0$ is guaranteed. Furthermore, by means of $P$, we can achieve a desired amount of negativity for the closed-loop function $\bar{V}$ in (20). This amount of negativity can be bounded using a positive definite matrix $Q$ as follows:
$$\bar{V} = V_f(x_k) - \frac{1}{4} P_1^T(x_k) \left( R(x_k) + P_2(x_k) \right)^{-1} P_1(x_k) \leq -x_k^T Q\, x_k \leq -\lambda_{\min}(Q)\, \|x_k\|^2 = -\zeta_Q \|x_k\|^2, \qquad \zeta_Q = \lambda_{\min}(Q) \tag{21}$$
where $\|\cdot\|$ stands for the Euclidean norm and $\zeta_Q > 0$ denotes the minimum eigenvalue of matrix $Q$ ($\lambda_{\min}(Q)$). Thus, condition (17) follows from (21). Considering (20)-(21), if $\bar{V} = V(x_{k+1}) - V(x_k) + \alpha^T(x_k)\, R(x_k)\, \alpha(x_k) \leq -\zeta_Q \|x_k\|^2$, then, because $\alpha^T(x_k)\, R(x_k)\, \alpha(x_k) \geq 0$, $\Delta V = V(x_{k+1}) - V(x_k) \leq -\zeta_Q \|x_k\|^2$. Moreover, as $V(x_k)$ is a radially unbounded function, the solution $x_k = 0$ of the closed-loop system (1) with (16) as input is globally exponentially stable according to Theorem 1. When the function $-l(x_k)$ is set to be the closed-loop function $\bar{V}$ in (20)-(21), that is
$$l(x_k) := -\bar{V}\,\big|_{u_k^* = \alpha(x_k)} = -V_f(x_k) + \frac{1}{4} P_1^T(x_k) \left( R(x_k) + P_2(x_k) \right)^{-1} P_1(x_k),$$
then $V(x_k)$ as proposed in (13) is a solution of the DT HJB equation (9). In order to establish optimality, considering that (16) stabilizes (1), and substituting $l(x_k)$ in (18), we obtain
$$V(x_k) = \sum_{k=0}^{\infty} \left( l(x_k) + u_k^T R(x_k)\, u_k \right) = \sum_{k=0}^{\infty} \left( -\bar{V} + u_k^T R(x_k)\, u_k \right)$$
$$= -\sum_{k=0}^{\infty} \left[ V_f(x_k) - \frac{1}{4} P_1^T(x_k) \left( R(x_k) + P_2(x_k) \right)^{-1} P_1(x_k) \right] + \sum_{k=0}^{\infty} u_k^T R(x_k)\, u_k. \tag{22}$$
Now, factorizing (22) and then inserting the identity matrix $I_m \in \mathbb{R}^{m \times m}$ written as $I_m = \left( R(x_k) + P_2(x_k) \right) \left( R(x_k) + P_2(x_k) \right)^{-1}$, we obtain
$$V(x_k) = -\sum_{k=0}^{\infty} \Big[ V_f(x_k) - \frac{1}{2} P_1^T(x_k) \left( R(x_k) + P_2(x_k) \right)^{-1} P_1(x_k) + \frac{1}{4} P_1^T(x_k) \left( R(x_k) + P_2(x_k) \right)^{-1} P_2(x_k) \left( R(x_k) + P_2(x_k) \right)^{-1} P_1(x_k)$$
$$+ \frac{1}{4} P_1^T(x_k) \left( R(x_k) + P_2(x_k) \right)^{-1} R(x_k) \left( R(x_k) + P_2(x_k) \right)^{-1} P_1(x_k) \Big] + \sum_{k=0}^{\infty} u_k^T R(x_k)\, u_k. \tag{23}$$
Since $\alpha(x_k) = -\frac{1}{2} \left( R(x_k) + P_2(x_k) \right)^{-1} P_1(x_k)$, (23) becomes
$$V(x_k) = -\sum_{k=0}^{\infty} \left[ V_f(x_k) + P_1^T(x_k)\, \alpha(x_k) + \alpha^T(x_k)\, P_2(x_k)\, \alpha(x_k) \right] + \sum_{k=0}^{\infty} \left[ u_k^T R(x_k)\, u_k - \alpha^T(x_k)\, R(x_k)\, \alpha(x_k) \right]$$
$$= -\sum_{k=0}^{\infty} \left[ V(x_{k+1}) - V(x_k) \right] + \sum_{k=0}^{\infty} \left[ u_k^T R(x_k)\, u_k - \alpha^T(x_k)\, R(x_k)\, \alpha(x_k) \right]. \tag{24}$$
After evaluating the summation for $k = 0$, (24) can be written as
$$V(x_k) = -\sum_{k=1}^{\infty} \left[ V(x_{k+1}) - V(x_k) \right] - V(x_1) + V(x_0) + \sum_{k=0}^{\infty} \left[ u_k^T R(x_k)\, u_k - \alpha^T(x_k)\, R(x_k)\, \alpha(x_k) \right]$$
$$= -\sum_{k=2}^{\infty} \left[ V(x_{k+1}) - V(x_k) \right] - V(x_2) + V(x_1) - V(x_1) + V(x_0) + \sum_{k=0}^{\infty} \left[ u_k^T R(x_k)\, u_k - \alpha^T(x_k)\, R(x_k)\, \alpha(x_k) \right]. \tag{25}$$
For notational convenience in (25), the upper limit $\infty$ will be treated as $N \to \infty$, and thus
$$V(x_k) = -V(x_N) + V(x_{N-1}) - V(x_{N-1}) + V(x_0) + \sum_{k=0}^{N} \left[ u_k^T R(x_k)\, u_k - \alpha^T(x_k)\, R(x_k)\, \alpha(x_k) \right]$$
$$= -V(x_N) + V(x_0) + \sum_{k=0}^{N} \left[ u_k^T R(x_k)\, u_k - \alpha^T(x_k)\, R(x_k)\, \alpha(x_k) \right].$$
Letting $N \to \infty$ and noting that $V(x_N) \to 0$ for all $x_0$, then
$$V(x_k) = V(x_0) + \sum_{k=0}^{\infty} \left[ u_k^T R(x_k)\, u_k - \alpha^T(x_k)\, R(x_k)\, \alpha(x_k) \right]. \tag{26}$$
Thus, the minimum value of (26) is reached with $u_k = \alpha(x_k)$. Hence, the control law (16) minimizes the cost functional (18). The optimal value function of (18) is $V^*(x_0) = V(x_0)$ for all $x_0$.
We can establish the main conceptual differences between optimal control and inverse optimal control as:
• For optimal control, the cost functions $l(x_k) \geq 0$ and $R(x_k) > 0$ are given a priori; then, they are used to calculate $u(x_k)$ and $V(x_k)$ by means of the solution of the DT HJB equation.
• For inverse optimal control, the control Lyapunov function $V(x_k)$ and the cost function $R(x_k)$ are given a priori; then, these functions are used to calculate $u(x_k)$ and $l(x_k)$ by means of the function $\bar{V}$ defined in (12).
The optimal control will in general be of the form (11), and the minimum value of the performance index will be some function $V(x_0)$ of the initial state $x_0$.
Remark 1: Additionally, $V(x_k)$ solves the following Hamilton-Jacobi-Bellman equation:
$$l(x_k) + V(x_{k+1}) - V(x_k) + \frac{1}{4} \frac{\partial V^T(x_{k+1})}{\partial x_{k+1}}\, g(x_k)\, R^{-1}(x_k)\, g^T(x_k)\, \frac{\partial V(x_{k+1})}{\partial x_{k+1}} = 0. \tag{27}$$
Remark 2: Research is being pursued to propose a method for calculating the matrix $P$ in (17), based on the linear matrix inequality (LMI) approach.
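The feedback law (16) and the closed-form expression (20) can be spot-checked numerically. The sketch below is an illustration under stated assumptions rather than the authors' implementation: it uses the second-order example of Section IV (system (35)-(36), $P = 10\,I$, $R = 1$), for which the input is scalar so that $P_1$ and $P_2$ are scalars; all function names and the sample states are ours.

```python
import math

P = 10.0  # P = 10 * I, stored as the scalar multiplying the identity
R = 1.0   # constant weighting R(x_k) = 1


def f(x):
    """Drift term f(x_k) of the example system (35)."""
    x1, x2 = x
    return (x1 * x2 - 0.8 * x2, x1 ** 2 + 1.8 * x2)


def g(x):
    """Input vector g(x_k) of the example system (36)."""
    return (0.0, -2.0 + math.cos(x[1]))


def quad(x):
    """x^T P x for P = 10 * I."""
    return P * (x[0] ** 2 + x[1] ** 2)


def p1_p2(x):
    """P1 = g^T P f and P2 = 0.5 * g^T P g, as defined below (16)."""
    fx, gx = f(x), g(x)
    p1 = P * (gx[0] * fx[0] + gx[1] * fx[1])
    p2 = 0.5 * P * (gx[0] ** 2 + gx[1] ** 2)
    return p1, p2


def alpha(x):
    """State feedback (16) for a scalar input."""
    p1, p2 = p1_p2(x)
    return -0.5 * p1 / (R + p2)


def vbar_direct(x):
    """V(x_{k+1}) - V(x_k) + alpha^T R alpha, first line of (20)."""
    u = alpha(x)
    fx, gx = f(x), g(x)
    x_next = (fx[0] + gx[0] * u, fx[1] + gx[1] * u)
    return 0.5 * quad(x_next) - 0.5 * quad(x) + u * R * u


def vbar_closed_form(x):
    """V_f - 0.25 * P1^T (R + P2)^{-1} P1, last line of (20)."""
    p1, p2 = p1_p2(x)
    v_f = 0.5 * (quad(f(x)) - quad(x))
    return v_f - 0.25 * p1 * p1 / (R + p2)


samples = [(2.0, -2.0), (0.5, 0.3), (-1.0, 0.2), (0.1, -0.4)]
for x in samples:
    print(x, alpha(x), vbar_direct(x), vbar_closed_form(x))
```

The two evaluations of $\bar{V}$ should agree up to rounding at every state; the sign of $\bar{V}$ at a given state indicates whether the chosen $P$ satisfies (12) there.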
A. INVERSE OPTIMAL CONTROL FOR LINEAR SYSTEMS
For the special case of linear systems, it can be shown that inverse optimal control is an alternative way to solve the DT algebraic Riccati equation (DARE), which plays the role of the DT HJB equation of DT nonlinear systems. Particularly, for the DT linear system
$$x_{k+1} = A\, x_k + B\, u_k, \qquad x_0 = x(0) \tag{28}$$
according to Theorem 2, the inverse control law provides
$$u_k^* = -\frac{1}{2} \left( R + P_2(x_k) \right)^{-1} P_1(x_k) = -\frac{1}{2} \left( R + \frac{1}{2} B^T P B \right)^{-1} B^T P A\, x_k \tag{29}$$
where $P_1(x_k)$ and $P_2(x_k)$ are defined as
$$P_1(x_k) = B^T P A\, x_k, \tag{30}$$
$$P_2(x_k) = \frac{1}{2} B^T P B. \tag{31}$$
Now, if we can find $P$ satisfying (17), then the closed-loop system (28) with the inverse optimal control law (29) is globally asymptotically stable. Selecting $R = \frac{1}{2} \bar{R} > 0$ in (29) yields
$$u_k^* = -\frac{1}{2} \left( \frac{1}{2} \bar{R} + \frac{1}{2} B^T P B \right)^{-1} B^T P A\, x_k = -\left( \bar{R} + B^T P B \right)^{-1} B^T P A\, x_k. \tag{32}$$
Moreover, choosing $Q = \frac{1}{2} \bar{Q} > 0$ and the inverse optimal control law (32), the DT HJB (4) with $l(x_k) = x_k^T Q\, x_k$ becomes
$$V(x_k) = l(x_k) + u_k^{*T} R(x_k)\, u_k^* + V(x_{k+1}) = x_k^T Q\, x_k + u_k^{*T} R(x_k)\, u_k^* + V(f(x_k)) + P_1^T(x_k)\, u_k^* + u_k^{*T} P_2(x_k)\, u_k^*.$$
Then
$$\frac{x_k^T P x_k}{2} = \frac{x_k^T \bar{Q}\, x_k}{2} + \frac{u_k^{*T} \bar{R}\, u_k^*}{2} + \frac{x_k^T A^T P A\, x_k}{2} - x_k^T A^T P B \left( \bar{R} + B^T P B \right)^{-1} B^T P A\, x_k + \frac{u_k^{*T} B^T P B\, u_k^*}{2}$$
which can be rewritten as
$$x_k^T P x_k = x_k^T \bar{Q}\, x_k + x_k^T A^T P A\, x_k - 2\, x_k^T A^T P B \left( \bar{R} + B^T P B \right)^{-1} B^T P A\, x_k + x_k^T A^T P B \left( \bar{R} + B^T P B \right)^{-1} \left( \bar{R} + B^T P B \right) \left( \bar{R} + B^T P B \right)^{-1} B^T P A\, x_k. \tag{33}$$
Finally, from (33) the DT algebraic Riccati equation
$$P = \bar{Q} + A^T P A - A^T P B \left( \bar{R} + B^T P B \right)^{-1} B^T P A \tag{34}$$
is obtained.
IV. EXAMPLE
The applicability of the developed method is illustrated via the design of a stabilizing control law for a discrete-time second-order nonlinear system of the form (1) with
$$f(x_k) = \begin{bmatrix} x_{1,k}\, x_{2,k} - 0.8\, x_{2,k} \\ x_{1,k}^2 + 1.8\, x_{2,k} \end{bmatrix} \tag{35}$$
and
$$g(x_k) = \begin{bmatrix} 0 \\ -2 + \cos(x_{2,k}) \end{bmatrix}. \tag{36}$$
According to (16), the stabilizing optimal control law is formulated as
$$u_k^* = -\frac{1}{2} \left( R(x_k) + \frac{1}{2}\, g^T(x_k)\, P\, g(x_k) \right)^{-1} g^T(x_k)\, P f(x_k)$$
where the positive definite matrix $P$ is selected as
$$P = 10 \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$$
and $R(x_k)$ is the constant matrix $R(x_k) = 1$. The state penalty term $l(x_k)$ in (18) is calculated according to (19). Fig. 1 shows the stabilization of this system with initial conditions $x_0 = [2 \;\; -2]^T$, and Fig. 2 displays the evaluation of the cost functional $V(x_k)$.
Fig. 1. Stabilization of a nonlinear system.
Remark 3: By using a CLF, system (35)-(36) is not required to be stable for $u_k = 0$.
Remark 4: In this example, according to Theorem 2, the optimal value function is calculated as $V^*(x_0, \alpha(x_k)) = V(x_0) = \frac{1}{2}\, x_0^T P\, x_0 = 40$, which is reached as shown in Fig. 2.
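For the second-order example (35)-(36), the closed loop under the control law (16) can be reproduced with a few lines of code. The sketch below is our own re-creation under stated assumptions, not the authors' simulation code: it uses $P = 10\,I$, $R = 1$ and $x_0 = [2 \;\; -2]^T$ as stated in the text, an arbitrary horizon of 30 steps, and hypothetical function names. It accumulates the cost (18) with $l(x_k)$ taken from (19), so that the running total can be compared with $V(x_0) = \frac{1}{2} x_0^T P x_0$.

```python
import math

P, R = 10.0, 1.0  # P = 10 * I (stored as a scalar) and R(x_k) = 1


def f(x):
    """Drift term f(x_k) of (35)."""
    x1, x2 = x
    return (x1 * x2 - 0.8 * x2, x1 ** 2 + 1.8 * x2)


def g(x):
    """Input vector g(x_k) of (36)."""
    return (0.0, -2.0 + math.cos(x[1]))


def V(x):
    """CLF (13) with P = 10 * I."""
    return 0.5 * P * (x[0] ** 2 + x[1] ** 2)


def alpha(x):
    """Control law (16); P1 and P2 are scalars for a single input."""
    fx, gx = f(x), g(x)
    p1 = P * (gx[0] * fx[0] + gx[1] * fx[1])
    p2 = 0.5 * P * (gx[0] ** 2 + gx[1] ** 2)
    return -0.5 * p1 / (R + p2)


def simulate(x0, steps=30):
    """Iterate x_{k+1} = f + g*alpha and accumulate the cost (18)."""
    x, cost, values = x0, 0.0, [V(x0)]
    for _ in range(steps):
        u = alpha(x)
        fx, gx = f(x), g(x)
        x_next = (fx[0] + gx[0] * u, fx[1] + gx[1] * u)
        vbar = V(x_next) - V(x) + u * R * u   # closed-loop function (12)
        cost += -vbar + u * R * u             # l(x_k) + u_k^T R u_k, l from (19)
        x = x_next
        values.append(V(x))
    return x, cost, values


x_final, cost, values = simulate((2.0, -2.0))
print("final state:", x_final)
print("accumulated cost:", cost, " V(x0):", values[0])
```

Because $l(x_k) + \alpha^T R\, \alpha = V(x_k) - V(x_{k+1})$ along the closed loop, the accumulated cost telescopes to $V(x_0) - V(x_N)$, which is the mechanism behind (26).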
Fig. 2. Cost functional evaluation versus time ($k$). Solid line shows the optimal value of the cost functional ($V^*(x_0)$). Dashed line shows the evaluation of the cost functional $V(x_k)$ at the $k$-th step.
V. CONCLUSIONS
This paper has established the inverse optimal control problem for a class of nonlinear DT systems. To avoid the solution of the Hamilton-Jacobi-Bellman equation, we propose a discrete-time control Lyapunov function (CLF) in a quadratic form. Based on this CLF, the inverse optimal control strategy is synthesized. Stability and the corresponding conditions for the inverse optimal control solution are established. Simulation results illustrate that the proposed controller ensures stabilization of a nonlinear system and minimizes a meaningful cost functional.
REFERENCES
R. Sepulchre, M. Jankovic, and P. V. Kokotović, Constructive Nonlinear Control. London, New York, USA: Springer-Verlag, 1997.
M. Krstić and H. Deng, Stabilization of Nonlinear Uncertain Systems. Secaucus, NJ, USA: Springer-Verlag New York, Inc., 1998.
B. D. O. Anderson and J. B. Moore, Optimal Control: Linear Quadratic Methods. Englewood Cliffs, New Jersey, USA: Prentice-Hall, 1990.
R. E. Kalman, “When is a linear control system optimal?” Transactions of the ASME, Journal of Basic Engineering, Series D, vol. 86, pp. 81–90, 1964.
R. A. Freeman and P. V. Kokotović, “Optimal nonlinear controllers for feedback linearizable systems,” in Proceedings of the 14th American Control Conference, 1995, vol. 4, Seattle, WA, USA, June 1995, pp. 2722–2726.
P. J. Moylan and B. D. O. Anderson, “Nonlinear regulator theory and an inverse optimal control problem,” IEEE Transactions on Automatic Control, vol. 18, no. 5, pp. 460–465, 1973.
J. L. Willems and H. V. D. Voorde, “Inverse optimal control problem for linear discrete-time systems,” Electronics Letters, vol. 13, p. 493, Aug 1977.
L. Magni and R. Sepulchre, “Stability margins of nonlinear receding horizon control via inverse optimality,” Systems and Control Letters, vol. 32, no. 4, pp. 241–245, 1997. [Online]. Available: http://www.montefiore.ulg.ac.be/services/stochastic/pubs/1997/MS97
R. A. Freeman and P. V. Kokotović, Robust nonlinear control design: state-space and Lyapunov techniques. Cambridge, MA, USA: Birkhauser Boston Inc., 1996.
T. Ahmed-Ali, F. Mazenc, and F. Lamnabhi-Lagarrigue, “Disturbance attenuation for discrete-time feedforward nonlinear systems,” Lecture notes in control and information sciences, vol. 246, pp. 1–17, March 1999.
F. Ornelas, E. N. Sanchez, and A. G. Loukianov, “Discrete-time inverse optimal control for nonlinear systems trajectory tracking,” in Proceedings of the 49th IEEE Conference on Decision and Control, Atlanta, Georgia, USA, Dec 2010, pp. 4813–4818.
R. A. Freeman and P. V. Kokotović, “Inverse optimality in robust stabilization,” SIAM Journal on Control and Optimization, vol. 34, no. 4, pp. 1365–1391, 1996.
J. Casti, “On the general inverse problem of optimal control theory,” Journal of Optimization Theory and Applications, vol. 32, no. 4, pp. 491–497, Dec 1980.
J. A. Primbs, V. Nevistic, and J. C. Doyle, “Nonlinear optimal control: A control Lyapunov function and receding horizon perspective,” Asian Journal of Control, vol. 1, pp. 14–24, 1999.
R. A. Freeman and J. A. Primbs, “Control Lyapunov functions: New ideas from an old source,” in Proceedings of the 35th IEEE Conference on Decision and Control, Kobe, Japan, 1996, pp. 3926–3931.
D. E. Kirk, Optimal Control Theory: An Introduction. Prentice-Hall, NJ, USA, 1970.
F. L. Lewis and V. L. Syrmos, Optimal control. John Wiley & Sons, New York, USA, 1995.
T. Basar and G. J. Olsder, Dynamic noncooperative game theory, 2nd ed. Academic Press, New York, 1995.
A. Al-Tamimi and F. L. Lewis, “Discrete-time nonlinear HJB solution using approximate dynamic programming: Convergence proof,” IEEE Transactions on Systems, Man, Cybernetics-Part B, vol. 38, no. 4, pp. 943–949, August 2008.
T. Ohsawa, A. M. Bloch, and M. Leok, “Discrete Hamilton-Jacobi theory,” 2009. [Online]. Available: http://www.citebase.org/abstract?id=oai:arXiv.org:0911.2258
W. M. Haddad, V.-S. Chellaboina, J. L. Fausz, and C. Abdallah, “Optimal discrete-time control for non-linear cascade systems,” Journal of The Franklin Institute, vol. 335, no. 5, pp. 827–839, 1998.
H. K. Khalil, Nonlinear Systems. New Jersey: Prentice Hall, 1996.
G. L. Amicucci, S. Monaco, and D. Normand-Cyrot, “Control Lyapunov stabilization of affine discrete-time systems,” vol. 1, San Diego, CA, USA, Dec. 1997, pp. 923–924.
M. Vidyasagar, Nonlinear systems analysis, 2nd ed. Prentice Hall, 1993.