On Time-Optimal Feedback Control - American Control ... - IEEE Xplore

On Time-Optimal Feedback Control

S. Sundar and Z. Shiller
Department of Mechanical, Aerospace and Nuclear Engineering
University of California, Los Angeles
Los Angeles, CA 90024
[email protected]

Abstract

Time-optimal feedback control can be computed by solving the Hamilton-Jacobi-Bellman (HJB) equation. To date, this problem has not been solved for nonlinear systems, such as articulated robotic manipulators, partly due to the difficulty of efficiently finding a solution to the HJB equation. In this paper, a new sufficient optimality condition for time-optimal feedback control is presented. It generalizes the previous sufficiency conditions: the HJB equation and a Lyapunov-based condition derived in [11]. The new condition is satisfied by a class of piecewise C^2 continuous functions, termed generalized value functions, as is demonstrated in an example for a simple nonlinear system.

1. Introduction

The time-optimal control of robotic manipulators has been the subject of extensive research for the past twenty years. As a result, three main approaches to solving the time-optimal control problem for point-to-point motions have emerged: 1) solving the two-point boundary value problem (TPBVP) formulated by the Maximum Principle (see for example [5]), 2) dynamic programming (e.g. [14]), and 3) parameter optimization (e.g. [15]). Using the Pontryagin Maximum Principle is computationally difficult due to the large dimensionality of the problem and the split boundary conditions on the states and costates. Dynamic programming suffers from what is known as the "curse of dimensionality": the complexity of the problem increases rapidly with the grid resolution and the number of states. Parameter optimization obtains approximate solutions more efficiently than dynamic programming or solving the TPBVP; however, it is as yet computationally too inefficient for on-line implementation. To date, no practical method has been developed for computing the exact time-optimal feedback control for manipulators with nonlinear dynamics. The time-optimal feedback control problem of driving an autonomous (time-invariant) system from some
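To illustrate the dynamic-programming approach (and why grid resolution drives its cost, since a grid with r points per state dimension requires r^n cells), the sketch below runs value iteration for a minimum-time problem on a one-dimensional grid. The toy system (xdot = u, |u| <= 1), grid size, and step are illustrative choices, not taken from the paper.

```python
import math

# Toy minimum-time problem: xdot = u, |u| <= 1, drive x to 0.
# Value iteration on a uniform grid; crossing one cell of width h
# at full control effort takes time h.
N, h = 10, 0.1                      # 21 grid points covering [-1, 1]
INF = math.inf
T = [INF] * (2 * N + 1)             # time-to-go estimates
T[N] = 0.0                          # origin: time-to-go is zero

for _ in range(4 * N):              # enough sweeps to converge
    for i in range(2 * N + 1):
        if i == N:
            continue
        best = min(T[i - 1] if i > 0 else INF,
                   T[i + 1] if i < 2 * N else INF)
        T[i] = min(T[i], h + best)

# The converged grid values approximate the true time-to-go |x|.
print(T[2 * N])                     # state x = +1.0
print(T[0])                         # state x = -1.0
```

Doubling the resolution or adding a state dimension multiplies the number of cells (and sweeps needed), which is exactly the curse of dimensionality noted above.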

initial state to the origin can be stated as:

    \min_{u \in \Omega} \int_0^{t_f} 1 \, dt    (1)

subject to the autonomous (no explicit dependence on t) system dynamics:

    \dot{x} = f(x, u); \quad x \in R^n, \; u \in \Omega \subset R^m    (2)

and the fixed (not time-dependent) boundary conditions:

    x(0) = x_0; \quad x(t_f) = 0    (3)

where f(x, u) is differentiable in both arguments, and \Omega is a fixed convex set of feasible controls. We assume the existence of the time-optimal control for all initial states x_0. The optimal feedback control for problem (1) may, in theory, be determined by solving the HJB equation [1][3][4], as stated in Theorem 1:

Theorem 1 [1][3]: The control u^*(x, t) is the solution to problem (1) if it satisfies the HJB equation

    \min_{u \in \Omega} \{ w_t(x, t) + \langle w_x(x, t), f(x, u) \rangle \} = -1, \quad x \in R^n - \{0\}    (4)

where w(x, t) is a C^2 scalar function satisfying

    w(0, t) = 0    (5)

    w(x, t) > 0, \quad x \neq 0    (6)
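To make the condition concrete, the sketch below numerically checks the HJB equation (4) on a toy problem not taken from the paper: the scalar system \dot{x} = u, |u| \le 1, whose minimum time-to-go is w(x) = |x|.

```python
# Toy check of the HJB condition min_u { <w_x, f(x,u)> } = -1
# for xdot = f(x,u) = u, |u| <= 1, with the candidate return
# function w(x) = |x| (the minimum time-to-go for this system).

def w_x(x):
    return 1.0 if x > 0 else -1.0   # gradient of |x| away from x = 0

def hjb_lhs(x, n_controls=201):
    # Minimize <w_x(x), f(x, u)> over a sampled control set in [-1, 1].
    controls = [-1.0 + 2.0 * k / (n_controls - 1) for k in range(n_controls)]
    return min(w_x(x) * u for u in controls)

for x in (-2.0, -0.5, 0.3, 1.7):
    print(x, hjb_lhs(x))            # the minimum equals -1 at every x != 0
```

Note that the check deliberately avoids x = 0, where, as discussed below, the gradient of the return function need not exist.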

The subscripts x and t represent partial derivatives with respect to x and t, respectively, and \langle \cdot, \cdot \rangle denotes the inner product on R^n. The scalar function w(x, t) is the return, or value, function [1][3][4], representing the minimum time-to-go to the origin. We observe that w_t = 0 for autonomous systems with fixed terminal conditions. We will, therefore, eliminate the time argument from the return function in the following. The return function w(x) need only be piecewise C^2 continuous [1][4], since (4) is a local condition.
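The piecewise-smooth structure of the return function can be seen in a standard textbook example (not worked in this paper): the double integrator \dot{x}_1 = x_2, \dot{x}_2 = u, |u| \le 1, whose minimum time-to-go is:

```latex
% Classical minimum time-to-go for the double integrator
% \dot{x}_1 = x_2, \dot{x}_2 = u, |u| \le 1 (standard result,
% stated here for illustration).
w(x) =
\begin{cases}
 x_2 + 2\sqrt{\,x_1 + \tfrac{1}{2}x_2^2\,},  & x_1 > -\tfrac{1}{2}x_2|x_2|,\\[4pt]
-x_2 + 2\sqrt{\,-x_1 + \tfrac{1}{2}x_2^2\,}, & x_1 < -\tfrac{1}{2}x_2|x_2|,\\[4pt]
|x_2|, & x_1 = -\tfrac{1}{2}x_2|x_2|.
\end{cases}
```

This w is continuous everywhere, but its gradient is discontinuous across the switching curve x_1 = -\tfrac{1}{2}x_2|x_2|, so it is only piecewise C^2, consistent with the requirement above.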


Consequently, w(x) can be constructed by piecing together continuous solutions of the HJB equation (4) for disjoint subsets of R^n [1]. Along the boundaries, w_x(x) may be discontinuous, or w_{xx}(x) may not exist [4]. Therefore, (4) does not apply to the origin, where w_x(x) may not exist [1]. Intuitively, this can be explained by noting that w_x(x) is normal to the isocost surfaces, which reduce to a point at the origin; clearly, a normal to a point is not defined. The HJB equation is a sufficient condition for computing the global time-optimal control. It reduces the dimensionality of the problem by eliminating the need for the costates commonly used in optimizations based on the Maximum Principle. Also, solving for the return function is a final boundary value problem (FBVP), as opposed to the TPBVP formulated by the necessary conditions. Generally, it is easier to find a solution satisfying a reduced number of boundary conditions. However, the partial differential equation (4) of the FBVP is more difficult to solve than the ordinary differential equations of the TPBVP resulting from the necessary conditions. For this reason, the use of the sufficient conditions to compute optimal feedback laws for articulated systems has been limited to quadratic cost functionals, minimizing energy [9] and joint rate [8]. The quadratic structure of the cost functionals in these cases allows one to select a quadratic return function, which reduces the optimization problem to solving a Riccati equation [8][9]. For time-optimal control problems of nonlinear systems, such as articulated robotic manipulators, it is generally very difficult to guess a return function. Another sufficiency condition for time-optimal control problems has been derived using Lyapunov functions [11]. It yields the time-optimal control by minimizing the time derivative of a suitable Lyapunov function [11].
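As a sketch of the quadratic-cost case just mentioned (the scalar system and weights are illustrative assumptions, not from the paper), a quadratic return function v(x) = p x^2 reduces the HJB optimization for \dot{x} = a x + b u with cost \int (q x^2 + r u^2) dt to a scalar algebraic Riccati equation:

```python
import math

# Scalar LQR sketch: xdot = a*x + b*u, cost = integral of q*x^2 + r*u^2.
# Substituting the quadratic return function v(x) = p*x^2 into the HJB
# equation yields the scalar algebraic Riccati equation
#   (b^2/r)*p^2 - 2*a*p - q = 0,
# and the optimal feedback u*(x) = -(b*p/r)*x.
a, b, q, r = 0.0, 1.0, 1.0, 1.0     # illustrative numbers

p = r * (a + math.sqrt(a * a + q * b * b / r)) / (b * b)  # positive root
gain = b * p / r                     # feedback gain: u* = -gain * x

# Sanity check: p indeed solves the Riccati equation.
residual = (b * b / r) * p * p - 2 * a * p - q
print(p, gain, residual)
```

For a pure integrator with unit weights this gives p = 1 and u^*(x) = -x; no such closed form exists for the time-optimal cost (1), which is why guessing a return function is hard there.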
In this paper, we show that the Lyapunov-based condition is equivalent to the HJB equation, i.e., a return function satisfying the HJB equation can be constructed from the Lyapunov function. We also propose a new sufficiency condition for time-optimal control problems that is satisfied by a class of piecewise C^2 continuous functions that we call generalized value functions. The HJB equation and the Lyapunov-based condition are shown to be special cases of the new sufficiency condition. The new condition is demonstrated in an example.

Theorem 2 [11]: The control u^*(x) is the time-optimal feedback control for problem (1) if it satisfies

    \min_{u \in \Omega} \langle v_x(x), f(x, u) \rangle = -h(v(x))    (7)

where v(x) is a positive definite C^2 scalar function satisfying

    \lim_{\|x\| \to 0} v(x) = 0
h(v(x)) is a differentiable positive definite scalar function, and \|\cdot\| denotes the Euclidean norm.

Proof: Theorem 2 is proven in Appendix A by showing that a return function satisfying the HJB equation (4) can be constructed from the Lyapunov function satisfying Theorem 2. This is a more constructive proof than the one presented in [11], since it shows the connection between Nahi's Theorem 2 and the HJB sufficiency condition, Theorem 1. Note that the Lyapunov functions satisfying (7) must be C^2 continuous, which is not the case for the return function, which need only be piecewise C^2 continuous.
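As a toy check of condition (7) (the system, Lyapunov function, and h are illustrative choices, not from the paper), take \dot{x} = u with |u| \le 1 and v(x) = x^2; minimizing \langle v_x, f \rangle over the controls gives -2|x| = -h(v(x)) with h(v) = 2\sqrt{v}, which is positive definite and smooth away from v = 0:

```python
import math

# Toy check of the Lyapunov-based condition (7):
#   min_u <v_x(x), f(x,u)> = -h(v(x))
# for xdot = u, |u| <= 1, with v(x) = x^2 and h(v) = 2*sqrt(v).

def v(x):
    return x * x

def min_vdot(x, n_controls=201):
    # Minimize v_x * f(x, u) = 2*x*u over a sampled control set in [-1, 1].
    controls = [-1.0 + 2.0 * k / (n_controls - 1) for k in range(n_controls)]
    return min(2.0 * x * u for u in controls)

def h(val):
    return 2.0 * math.sqrt(val)

for x in (-1.5, -0.2, 0.4, 2.0):
    print(x, min_vdot(x), -h(v(x)))  # the last two columns agree
```

The minimizing control is u^* = -sign(x), which is indeed the time-optimal feedback for this toy system, matching the construction in the proof.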

First, we show that w(x) is a function of v(x), as in (35). From (36), \beta_m(x) is strictly monotonically increasing [6][13]. Since \beta_m(x) is bounded from above by (39) and is monotonically increasing, it converges to a finite limit as m \to \infty [6][13], given by (42). From (40), it is easy to verify that v(x) and w(x) satisfy (39); therefore, from Lemma 2, there exists a differentiable function g such that (40) holds. It can be shown by contradiction that g_v \neq 0 [17]; consequently, (42) follows from the Implicit Function Theorem [6].

Similar to the proof of Theorem 2, from (40), w(x) satisfies (4). For w(x) to be a return function, it needs in addition to satisfy conditions (5) and (6). This would follow from deriving w(x) in the form of (27); however, to do that, we need to show that g(x) is a function of w(x), as in (45). Similar to the proof of Theorem 2, we conclude that conditions (5) and (6) are satisfied, and, therefore, w(x) is a return function, and the control u^*(x) satisfying (10) is the time-optimal control.

Equation (28) follows from taking limits on both sides of (31). Equation (25) follows from (35) and (28), thus completing the proof of the Lemma.