IEEE TRANSACTIONS ON NEURAL NETWORKS, VOL. 22, NO. 6, JUNE 2011


Adaptive Learning Control for Finite Interval Tracking Based on Constructive Function Approximation and Wavelet

Jian-Xin Xu, Senior Member, IEEE, and Rui Yan, Member, IEEE

Abstract—Using a constructive function approximation network, an adaptive learning control (ALC) approach is proposed for finite interval tracking problems. The constructive function approximation network consists of a set of bases, and the number of bases can evolve when learning repeats. The nature of the basis allows the continuous adaptive learning of parameters when the network undergoes any structural changes, and consequently offers flexibility in tuning the network structure. The expandability of the bases guarantees the precision of the function approximation and avoids the trial-and-error procedure in structure selection required by any fixed-structure network. Two classes of unknown nonlinear functions, namely, either global L² or local L² with a known bounding function, are taken into consideration. Using the Lyapunov method, the existence of solution and the convergence property of the proposed ALC system are discussed in a rigorous manner. By virtue of the celebrated orthonormal and multiresolution properties, the wavelet network is used as the universal function approximator, with the weights tuned by the proposed adaptive learning mechanism.

Index Terms—Adaptive learning, function approximation, nonlinear control, structure tuning, wavelet network.

Manuscript received January 7, 2010; revised January 25, 2011; accepted March 13, 2011. Date of publication May 13, 2011; date of current version June 2, 2011. J.-X. Xu is with the Department of Electrical and Computer Engineering, National University of Singapore, 117576, Singapore (e-mail: [email protected]). R. Yan is with the Institute for Infocomm Research, Agency for Science, Technology and Research, 138632, Singapore (e-mail: [email protected]). Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org. Digital Object Identifier 10.1109/TNN.2011.2132143

I. INTRODUCTION

Learning control [1]–[4] or adaptive learning control (ALC) [5], [6], developed as a complement to adaptive control, can cope with tracking control tasks repeated over a finite time interval. Unlike adaptive control, which targets asymptotic convergence along the time axis, learning control targets perfect tracking over a finite interval by means of asymptotic convergence along the learning axis (iteration axis). In this paper, we focus on ALC with the ultimate objective of addressing finite interval tracking problems.

A constant challenge for the control community is to deal with dynamic systems in the presence of unknown nonlinearities. Consider the simple affine dynamics

ẋ = f(x) + u

where u is the system input. Over the past five decades, numerous control strategies have been developed according to

the characteristics and prior knowledge of f(x). If f(x) can be parameterized as the product of unknown time-invariant parameters and known nonlinear functions, adaptive control and adaptive learning are most suitable. If f(x) cannot be parameterized but its upper bounding function f̄(x) is known a priori, robust control or robust learning control [7] is pertinent.

In the past decade, intelligent control methods using function approximation, such as the neural network, fuzzy network, and wavelet network, have been proposed, opening a new avenue toward more generic solutions and better control performance. The most profound feature of those function approximation methods is that the nonparametric function f(x) is given a representation in a parameter space. Various structures based on neural networks, radial basis networks, wavelet networks, and hinging hyperplanes have been used as black-box nonlinear dynamical models [8], [9]. With the selected function approximation, the control problem turns into an analogous adaptive control or ALC problem that needs to deal with only unknown time-invariant parameters. Neural network and fuzzy logic-based control has been very widely studied [10]–[22]. The success of neural control is subject to the validity of a prerequisite: the structure of the network, such as the number of layers and nodes, must be adequate to meet the desired approximation precision. Hence, it is commonly assumed in adaptive neural control that, for a continuous function f(x) on a compact set, a finite and sufficiently large neural network is chosen and there exists a set of ideal weights θ such that the function can be approximated to a specified precision [23]. It was indicated in [24]–[26] that, if the node number of a three-layer neural network is adequate, the approximation error can be made arbitrarily small on a compact set.
Recently, an adaptive controller using variable function approximators has been designed to guarantee the approximation precision [27], [28]. Owing to the lack of prior information on f (x), often a designer is unable to know how large a neural network would be adequate. If the network structure is inadequate, the control mission is impossible. Intuitively, a solution to this problem is to let the neural network evolve continuously from a small initial configuration and cease only when the desired precision is achieved. However, we encounter a difficulty when implementing this idea with adaptive neural control, because a neural network is constructed as a complete system instead of a basis. The fundamental difference between a complete system and a basis can be clearly seen from the changes of weights


when the system structure evolves [29]. The new weights of a complete system, θ_A, may be totally different from the original weights θ. On the other hand, the new weights of a basis, θ_A, include the original weights θ as an invariant subset. Hence, after adding new nodes to a neural network, parametric adaptation may have to restart from scratch for the new weights θ_A. Using a basis in approximation, on the other hand, the adaptively learned results for the weights θ remain valid, and adaptive learning can be carried on; learning starts from the beginning only for the newly added weights in θ_A.

In this paper, we consider two scenarios. In the first scenario, f(x) is assumed global L², i.e., f ∈ L²(Rⁿ), which is the only prior knowledge. ALC can generate a convergent sequence that enters the prespecified bound in a finite number of learning iterations. In the second scenario, f(x) is assumed local L², and the prior knowledge is the upper bound f̄(x). A robust control mechanism is applied first to confine the state x to a compact set. By augmenting f(x) to a new function defined on Rⁿ, we show that the second scenario reverts to the first one, and consequently achieves the same convergence property with ALC. Extension to more general plants in cascade form is also explored.

The wavelet network has been developed as a universal function approximator in L²; thus its structure can easily evolve in conjunction with parametric adaptation or adaptive learning. In [30], an adaptive wavelet network control is presented for nonlinear dynamical systems, where the online adjustment of the network structure is done in a constructive manner by gradually increasing the network resolution. In [31], a new type of wavelet neural network is proposed to improve function approximation by generating a network with the optimal size. In this paper, three different wavelets are presented and their suitability explored.
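The basis-versus-complete-system distinction can be made concrete with a small numerical sketch (my construction, not from the paper): with an orthonormal Haar-type basis, coefficients learned for a small network remain exactly valid when the network is enlarged, while the approximation error shrinks as bases are added.

```python
import numpy as np

def haar_basis(n_funcs, x):
    """First n_funcs Haar functions on [0, 1), orthonormal in L2."""
    funcs = [np.ones_like(x)]          # scaling function
    j, k = 0, 0
    while len(funcs) < n_funcs:
        s = 2.0 ** (j / 2)
        h = np.where((x >= k / 2**j) & (x < (k + 0.5) / 2**j), s, 0.0) \
          - np.where((x >= (k + 0.5) / 2**j) & (x < (k + 1) / 2**j), s, 0.0)
        funcs.append(h)
        k += 1
        if k == 2**j:                  # move to the next (finer) resolution level
            j, k = j + 1, 0
    return np.array(funcs)

x = np.linspace(0.0, 1.0, 4096, endpoint=False)
f = np.exp(-x) * np.sin(5 * x)         # an arbitrary target in L2([0, 1))

def fit(n_funcs):
    G = haar_basis(n_funcs, x)
    return G @ f / len(x)              # inner products <f, g_j>

theta_small, theta_big = fit(4), fit(16)
err = lambda th, n: np.linalg.norm(f - haar_basis(n, x).T @ th) / np.sqrt(len(x))

# Old weights are an invariant subset of the enlarged network's weights:
assert np.allclose(theta_small, theta_big[:4])
# and the L2 approximation error decreases as bases are added:
assert err(theta_big, 16) < err(theta_small, 4)
```

A retrained "complete system" such as a multilayer perceptron offers no analogous guarantee: enlarging it generally perturbs every weight.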
Through illustrative examples, we demonstrate the relationship between the complexity of the wavelet network and the number of learning iterations.

This paper is organized as follows. In Section II, the problem formulation and preliminaries are briefly presented. In Section III, ALC with universal function approximation is proposed. In Section IV, a robust ALC is proposed for local L² nonlinear plants. In Section V, ALC is extended to cascaded nonlinear plants. In Section VI, the structure and properties of wavelet approximation are presented. In Section VII, illustrative examples, design considerations, and comparisons are provided. In Section VIII, the conclusion is given.

In this paper, we define ‖·‖ as the vector norm, ‖·‖_2 as the L²-norm, |·|_s as the uniform norm, ‖·‖_T = √((1/T)∫_0^T ‖·‖² dτ) as the extended L²-norm, and ‖z_i‖_m = max{|z_{j,i}|_s : j = 1, . . . , n+i} for z_i = (z_{1,i}, . . . , z_{n+i,i})^T. In subsequent contexts, we omit the argument t for all variables where no confusion arises.

II. PROBLEM FORMULATION AND PRELIMINARIES

Definition 1: Let Y be a normed linear space over the real-number field R. A system of elements {g_1, g_2, . . .} ⊂ Y is said to be a basis for Y if any element y ∈ Y has a unique representation y = Σ_{j=1}^∞ θ_j g_j with scalars θ_j ∈ R.

Note that the meaning of the representation of y is as follows: if y_{N_b} = Σ_{j=1}^{N_b} θ_j g_j, where N_b is the total number of bases employed, then lim_{N_b→∞} ‖y − y_{N_b}‖ = 0, where ‖·‖ is the norm in the space Y. For arbitrary ε > 0, to make ‖y − y_{N_b}‖ ≤ ε, we simply take N_b large enough. Further, the coefficients θ_1, θ_2, . . . are unique. To facilitate subsequent discussions on the existence of solution, the following lemma is introduced.

Lemma 1 [32]: Consider the following Cauchy problem:

ẋ = f(t, x),  x(t_0) = x_0.   (1)

If D is an open set in R^{n+1}, f : D → R^n is continuous in D and satisfies a local Lipschitz condition in x, then the solution of the Cauchy problem (1) can be extended to the boundary ∂D of D (∂D can be at ∞).

This lemma implies that, for a continuous and locally Lipschitzian function f, the solution of (1) in the open set D can be extended to the boundary of D. For instance, consider D = (−2, 2), an open set in R. Then, for a continuous and locally Lipschitzian function f, the solution of (1) can be continued up to the boundary points of D = (−2, 2).

To focus on the essential idea and properties of the proposed ALC, the following simple dynamic plant is considered first:

S_I:  ẋ_{l,i} = x_{l+1,i},  l = 1, 2, . . . , n − 1;  ẋ_{n,i} = f(x_i) + u_i   (2)

where x_i = [x_{1,i}, x_{2,i}, . . . , x_{n,i}]^T ∈ R^n is the state vector at the ith learning iteration, and u_i ∈ R is the plant input at the ith iteration. The mapping f(x) is an unknown nonlinear function which is continuous and locally Lipschitzian for x ∈ R^n. We consider two types of prior knowledge of f that lead to two distinct ALC designs.

Assumption 1: f(x) ∈ L²(R^n).

An ALC method is developed for S_I satisfying Assumption 1.

Assumption 2: f(x) ∈ L²(D), where D ⊂ R^n is a compact set, and there exists a known continuous function f̄(x) ≥ 0 such that |f(x)| ≤ f̄(x), ∀x ∈ D.

For S_I satisfying Assumption 2, a robust ALC is proposed in this paper. The ALC is further extended to the following nth-order cascade dynamics:

S_II:  ẋ_{l,i} = f_l(x_{l,i}) + x_{l+1,i},  l = 1, . . . , n − 1;  ẋ_{n,i} = f_n(x_i) + u_i   (3)

where x_{l,i} = [x_{1,i}, . . . , x_{l,i}]^T ∈ R^l and x_i = [x_{1,i}, x_{2,i}, . . . , x_{n,i}]^T ∈ R^n are the state vectors at the ith learning iteration, u_i ∈ R is the plant input at the ith iteration, and f_l(x_{l,i}) ∈ L²(R^l) are unknown nonlinear functions. Note that f_l (l = 1, . . . , n − 1) are unmatched uncertainties. We now state the control objective.
Let x_r(t) ∈ C^n[0, T′] be an nth-order continuously differentiable trajectory; then x_r, x_r^{(1)}, . . . , x_r^{(n)} are bounded on a finite interval [0, T′], where T′ > T. Define x_r = [x_r, x_r^{(1)}, . . . , x_r^{(n−1)}]^T and x̃_i = x_i − x_r = [x̃_{1,i}, x̃_{2,i}, . . . , x̃_{n,i}]^T. An augmented tracking


error σ_i at the ith learning iteration is defined as

σ_i = (d/dt + λ)^{n−1} x̃_{1,i} = [λ^T  1] x̃_i   (4)

where λ = [λ^{n−1}, (n−1)λ^{n−2}, . . . , (n−1)λ]^T with λ > 0. The ultimate control objective is to find a sequence of appropriate control inputs, u_i(t), t ∈ [0, T], such that the tracking error sequence enters a prespecified bound ε in L²_T after a finite number of learning iterations.

III. ALC

In this section, a new ALC approach based on function approximation is presented for the plant S_I in (2), whereby f(x) satisfies Assumption 1. Suppose that g_1(x), g_2(x), . . . form a continuous and locally Lipschitzian basis in the space L²(R^n); then f(x) = Σ_{j=1}^∞ θ_j g_j(x) with θ_j being unknown weights. If we choose N_b bases to approximate the function f(x), the approximation error is e_{N_b}(x) = f(x) − Σ_{j=1}^{N_b} θ_j g_j(x). It is obvious that lim_{N_b→∞} ‖e_{N_b}‖_2 = 0. If the basis is sufficiently smooth and well localized, then the series expansion of continuous square integrable functions in fact also converges pointwise. For example, if we choose a wavelet as a basis, then the convergence of the resulting series in an L² sense should also be pointwise under appropriate constraints on the wavelet [33], [34]. These additional smoothness and decay conditions on the basis are assumed throughout the analysis in this paper. Note that the pointwise convergence of e_{N_b}(x) holds ∀x ∈ R^n. Suppose x is a vector-valued function of the time t, and let t ∈ [0, T]; then x(t) is a map x : [0, T] → D ⊂ R^n. Obviously, e_{N_b}(x(t)) is pointwise convergent over D, and thus e_{N_b}(x(t)) is a compound function pointwise convergent in [0, T]; in the sequel, ‖e_{N_b}‖_T is a convergent sequence

lim_{N_b→∞} ‖e_{N_b}‖_T = lim_{N_b→∞} √((1/T)∫_0^T |e_{N_b}(t)|² dt) = 0.   (5)

From the above convergence property, there exists a constant M such that ‖e_{N_b}‖_T ≤ M for any i. Since the learning control objective is to track a given trajectory in a finite interval, it is well known that the initial state values will directly affect the learning results [35]. In this paper, we consider five types of initial conditions from a practical point of view.

Assumption 3:
1) σ_i(0) = 0;
2) Σ_{i=1}^∞ σ_i²(0) = σ_0, where σ_0 is a constant;
3) |σ_i(0)| = σ_0 ≠ 0, where σ_0 is a constant;
4) σ_i(0) is random and bounded by a constant σ_0;
5) σ_i(0) = σ_{i−1}(T), and |σ_1(0)| ≤ σ_0.

Condition 1) is the typical identical initialization condition, condition 2) implies that σ_i(0) belongs to l², condition 3) is a fixed initial shift, condition 4) includes the first three conditions as special cases, and condition 5) is the alignment condition often seen in processes without a resetting mechanism [35].

Fig. 1. Updating the structure for every three iterations by adding one new base (N_b rises from 2 to 5 as i passes 3, 6, 9, 12).
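The schedule in Fig. 1 can be sketched as follows (an assumed form of the rule, chosen to match the plotted points N_b = 2, 3, 4, 5 at i = 3, 6, 9, 12 with N_dwell = 3; the paper does not give a closed-form expression):

```python
# Base-number schedule: one new basis function is appended after every
# N_dwell learning iterations; weights of existing bases carry over and
# only the newly added weight starts from zero.
def n_bases(i, n_dwell=3):
    """Number of bases N_b active at learning iteration i (i >= 1)."""
    return 1 + i // n_dwell

assert [n_bases(i) for i in (3, 6, 9, 12)] == [2, 3, 4, 5]
assert n_bases(5) == 2   # N_b stays constant between structure updates
```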

Considering the system S_I in (2), the tracking error dynamics at the ith learning iteration can be expressed as

σ̇_i = f(x_i) + u_i(t) + v(t, x_i),  ∀t ∈ [0, T]   (6)

where v(t, x_i) = −x_r^{(n)}(t) + [0  λ^T] x̃_i. For notational convenience, in the following, f(x_i), g_k(x_i), and v(t, x_i) are denoted by f_i, g_i, and v_i, respectively. The ALC mechanism is given as

u_i = −βσ_i − θ̂_i^T g_i − v_i   (7)
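For clarity, (6) follows directly from (2) and (4); the expansion below is added here and is not spelled out in the original derivation:

```latex
% Expanding \sigma_i = [\lambda^T\,1]\,\tilde{x}_i along (2), with
% \dot{\tilde{x}}_{l,i} = \tilde{x}_{l+1,i} for l < n and
% \dot{\tilde{x}}_{n,i} = f(x_i) + u_i - x_r^{(n)}:
\dot{\sigma}_i
  = \lambda^T[\tilde{x}_{2,i},\dots,\tilde{x}_{n,i}]^T
    + f(x_i) + u_i - x_r^{(n)}(t)
  = f(x_i) + u_i + v(t,x_i),
\qquad
v(t,x_i) = -x_r^{(n)}(t) + [0\;\;\lambda^T]\,\tilde{x}_i .
```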

where θ̂_i = [θ̂_1, . . . , θ̂_{N_b}]^T and g_i = [g_1, . . . , g_{N_b}]^T. The value of N_b determines the approximation accuracy, so its selection is important. In our method, we choose N_b to be a function of the number of iterations i. Thus, N_b reflects how frequently new bases are added to the existing basis set as the iteration number i increases. Our idea is to add one or a few new bases, called a new base set {g_j, . . . , g_m}, to the existing set {g_1, . . . , g_{j−1}} after every N_dwell learning iterations, where N_dwell is a positive integer. An example of the relationship between N_b and i is given in Fig. 1: the network structure is updated by adding a new base to the existing set every three iterations, i.e., N_dwell = 3. In the following, for simplicity of derivation, we choose N_b(i) = i, i.e., N_dwell = 1. This implies that the function approximation network is updated at every learning iteration. The theoretical results thus derived can be easily extended to N_dwell > 1. The parametric adaptive learning law is

θ̂̇_i = σ_i g_i,  θ̂_1(0) = 0,  θ̂_i(0) = θ̂_{i−1}(T).   (8)

Substituting the ALC law (7) into the tracking error dynamics (6) yields

σ̇_i = f_i + u_i + v_i = −βσ_i + (θ_i − θ̂_i)^T g_i + e_i.   (9)
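A minimal closed-loop sketch of (7)–(9) for a first-order plant (n = 1, so σ_i = x̃_{1,i} and v = −ẋ_r): the plant nonlinearity, the gains, and the fixed grid of Gaussian bumps standing in for the wavelet bases of Section VI are all illustrative assumptions, not choices from the paper.

```python
import numpy as np

f = lambda x: x * np.exp(-x**2)                 # unknown to the controller, in L2(R)
centers = np.linspace(-2.0, 2.0, 25)
g = lambda x: np.exp(-((x - centers) ** 2) / 0.1)   # basis vector g(x)

T, dt = 2.0, 1e-3
t = np.arange(0.0, T, dt)
xr, xr_dot = np.sin(t), np.cos(t)               # reference trajectory on [0, T]
beta = 5.0

theta = np.zeros_like(centers)                  # θ̂_1(0) = 0
errors = []
for i in range(8):                              # learning iterations
    x = 0.0                                     # identical initialization, σ_i(0) = 0
    sq_err = 0.0
    for k in range(len(t)):
        sigma = x - xr[k]
        gk = g(x)
        u = -beta * sigma - theta @ gk + xr_dot[k]   # ALC law (7), v = -ẋ_r
        x += dt * (f(x) + u)                    # Euler step of the plant
        theta += dt * sigma * gk                # learning law (8); θ̂ carries over
        sq_err += dt * sigma**2
    errors.append(np.sqrt(sq_err / T))          # extended L2 norm of σ_i

assert errors[-1] < errors[0]                   # tracking error shrinks over iterations
```

The weight vector is deliberately not reset between iterations, which is exactly the θ̂_i(0) = θ̂_{i−1}(T) alignment in (8).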

Define the augmented state vector y_i = (x_i, θ̂_i). From the plant (2), relation (4), adaptive learning mechanism (8), and ALC law (7), we have

ẏ_i = h(t, y_i)   (10)


where

h(t, y_i) = [x_{2,i}, . . . , x_{n,i}, h_x(t, y_i), h_{θ̂}^T(t, y_i)]^T
h_x(t, y_i) = f_i + u_i = −β[λ^T 1]x_i − v_i + θ_i^T g_i − θ̂_i^T g_i + e_i + β[λ^T 1]x_r
h_{θ̂}(t, y_i) = [λ^T 1] x̃_i g_i.   (11)

The first main result, concerning the existence of the solution of the augmented dynamics (10) under the initial conditions described in Assumption 3, is summarized in the following theorem.

Theorem 1: The solution y_i exists in [0, T] by choosing the feedback gain β > 1.

Proof: Since the control task ends in the finite interval [0, T], all we need to prove is that there is no finite escape time for y_i in [0, T]. We shall prove that the solution y_i(t) of the dynamic system (10) exists in [0, T), which therefore implies the existence in [0, T]. Define Ω_i = R^{n+i} × [0, T). Clearly, h(t, y_i) : Ω_i → R^{n+i} is continuous. By Peano's existence theorem [32] associated with the initial values y_i(0) = (x_0, θ̂_i(0)) ∈ Ω_i, (10) has a continuous solution in a neighborhood of t = 0. Furthermore, it is easy to check that h(t, y_i) is locally Lipschitz continuous in y_i. We only need to consider the solution for t > 0. Let [0, t_i) be the maximum interval over which the solution y_i(t) can be continued. Lemma 1 implies that y_i(t) tends to the boundary ∂Ω_i as t → t_i. It further implies that lim_{t→t_i} ‖y_i(t)‖_m = ∞ if t_i < T, i.e., for any C > 0 and for each i, there exists δ_i > 0 such that ‖y_i(t)‖_m ≥ C for all t ≥ t_i − δ_i. Since y_i(t) exists for all t ∈ [0, t_i − (δ_i/2)], define a Lyapunov function

V(σ_i, θ̃_i) = ½σ_i² + ½θ̃_i^T θ̃_i   (12)

where θ̃_i = θ_i − θ̂_i. Differentiating V(σ_i, θ̃_i) with respect to time t yields

V̇(σ_i, θ̃_i) = σ_i σ̇_i − θ̃_i^T θ̂̇_i.   (13)

Substituting the augmented error dynamics (9) and the parametric adaptive learning law (8) yields

V̇(σ_i, θ̃_i) = −βσ_i² + σ_i e_i.   (14)

Using Young’s inequality, there exists c ∈ (0, 1) such that 1 σi ei ≤ cσi2 + ei2 . (15) 4c It follows from (14) that 1 (16) V˙ (σi , θ˜ i ) ≤ (c − β)σi2 + ei2 4c where c − β < 0. Next, we will complete the proof by mathematical induction. For i = 1, from Assumption 3, |σ1 (0)| ≤ σ0 for all initial conditions, and θˆ 1 (0) = 0. It follows from (16) and ei T ≤ M that  t V˙ (σ1 , θ˜ 1 )dτ + V (0, 0) 0 ≤ V (σ1 , θ˜ 1 ) = 0

M 1 1  M2 ≤ + σ02 + θ 21 = 1 4c 2 2 4

for all t ∈ [0, t1 −(δ1 /2)], i.e., V (σ1 , θˆ 1 ) is bounded on [0, t1 − (δ1 /2)] by a constant which does not depend on δ1 . By the definition of Lyapunov function V , it can be derived from the above relationship that |σ1 |s ≤ M1 and |θˆ1 | ≤ M1 . Therefore, y1 (t)m ≤ M1 for all t ∈ [0, t1 − (δ1 /2)]. Note that M1 > 0 is a constant independent of δ1 . Taking C = 2M1 in advance, for the corresponding δ1 > 0 we have δ1 C C ≤ y1 (t1 − )m ≤ M1 = (17) 2 2 which is a contradiction implying t1 ≥ T . Assume that t j ≥ T for j = 2, . . . , i −1. Then the solution y j (t) exists in [0, T ) and therefore σ j and θˆ j are both bounded for all t ∈ [0, T ]. If ti < T , we have yi (t)m ≥ C for all t ≥ ti − δi , as shown above. Note that |σi (0)| ≤ σ0 for initial conditions 1)–4), σi (0) = σi−1 (T ) for the initial condition 5), and θˆ i (0) = θˆ i−1 (T ). Hence, the quantities σi (0) and θˆ i (0) are bounded by a constant independent of δi . From (16) and L2T convergence property of ei , we have 0 ≤ V (σi , θ˜ i )  t = V˙ (σi , θ˜ i ) dτ + V (σi (0), θ˜ i (0)) 0

M 1 1 ≤ + σi (0)2 + [θ i − θˆ i (0))T (θ i − θˆ i (0)) 4c 2 2 2 M  i (18) = 4 for all t ∈ [0, ti − (δ1 /2)], i.e., V (σi , θ˜ i ) is bounded on [0, ti − (δ1 /2)] by a constant which does not depend on δi . The definition of Lyapunov function V also implies that yi (t)m ≤ Mi for all t ∈ [0, ti −(δ1 /2)]. By taking C = 2Mi , it leads to a contradiction analogous to (17). As a result, ti ≥ T . For the closed-loop dynamic system (9) with the parametric updating law (8), the convergence property associated with initial conditions in Assumption 3 is shown in the following theorem. Theorem 2: Part 1) Under the initial conditions 1), 2), and 5), the exists a subsequence {σi j } of {σi }, which enters any prespecified bound  after a finite number of learning iterations. Part 2) Under the initial conditions 3) and 4), for any arbitrary δ > 0 and a bound given by  = (σ02 + δ)/(β − c)T there exists a sub-sequence {σi j } of {σi } which enters the given bound  after a finite number of learning iterations. Proof: Integrating both sides of (16) from 0 to T , and using the fact θ˜ i (0) = θ˜ i−1 (T ) V (σi (T ), θ˜ i (T ))  T ˜ = V (σi (0), θ i (0)) + V˙ dt 0

≤ V(σ_{i−1}(T), θ̃_{i−1}(T)) + V(σ_i(0), θ̃_{i−1}(T)) − V(σ_{i−1}(T), θ̃_{i−1}(T)) − (β − c)∫_0^T σ_i² dt + (1/(4c))∫_0^T e_i² dt
= V(σ_{i−1}(T), θ̃_{i−1}(T)) + ½σ_i²(0) − ½σ_{i−1}²(T) − (β − c)∫_0^T σ_i² dt + (1/(4c))∫_0^T e_i² dt.

Repeating the operation i − 1 times leads to the following:

V(σ_i(T), θ̃_i(T)) ≤ V(σ_1(T), θ̃_1(T)) + ½Σ_{j=2}^i σ_j²(0) − ½Σ_{j=2}^i σ_{j−1}²(T) − (β − c)Σ_{j=2}^i ∫_0^T σ_j² dt + (1/(4c))Σ_{j=2}^i ∫_0^T e_j² dt.   (19)

Part 1) From the initial conditions 1), 2), and 5), we have

½Σ_{j=2}^i σ_j²(0) − ½Σ_{j=2}^i σ_{j−1}²(T) ≤ ½σ_0

and (19) becomes

V(σ_i(T), θ̃_i(T)) ≤ V(σ_1(T), θ̃_1(T)) + ½σ_0 − (β − c)Σ_{j=2}^i ∫_0^T σ_j² dt + (1/(4c))Σ_{j=2}^i ∫_0^T e_j² dt.   (20)

To derive the convergence, reduction to absurdity will be used. Suppose, on the contrary, that there exists a positive integer N_1 such that ‖σ_j‖_T ≥ ε for all iteration numbers j ≥ N_1. Since e_j(x_j) is a convergent sequence in L²_T, for the given ε there exists a positive integer N_2 such that ∫_0^T e_j² dt ≤ 2c(β − c)Tε² for all j ≥ N_2. Letting N = max{N_1, N_2}, and noticing the existence of solution shown in Theorem 1, the following quantity is finite:

B = V(σ_1(T), θ̃_1(T)) + ½σ_0 − (β − c)Σ_{j=2}^N ∫_0^T σ_j² dt + (1/(4c))Σ_{j=2}^N ∫_0^T e_j² dt.

Then it follows from (20) that

V(σ_i(T), θ̃_i(T)) ≤ B − (β − c)Σ_{j=N+1}^i ∫_0^T σ_j² dt + (1/(4c))Σ_{j=N+1}^i ∫_0^T e_j² dt ≤ B − ½(β − c)Tε²(i − N).   (21)

When i → ∞, the right-hand side of (21) approaches −∞ since B is finite, which contradicts the fact that V(σ_i(T), θ̃_i(T)) is positive definite. Therefore, there must exist a subsequence of σ_i which enters the given bound ε after a finite number of learning iterations.

Part 2) The relation (19) with the initial conditions 3) and 4), |σ_i(0)| ≤ σ_0, becomes

V(σ_i(T), θ̃_i(T)) ≤ V(σ_1(T), θ̃_1(T)) + ½Σ_{j=2}^i σ_0² − (β − c)Σ_{j=2}^i ∫_0^T σ_j² dt + (1/(4c))Σ_{j=2}^i ∫_0^T e_j² dt.   (22)

Analogous to the proof of Part 1), assume that there exists a positive integer N_1 such that ‖σ_j‖_T ≥ ε for all j ≥ N_1. Since the approximation error e_i is a convergent sequence in L²_T, there exists an integer N_2 such that ∫_0^T e_j² dt ≤ 2c(β − c)Tε² for all j ≥ N_2. From the existence of solution and the finiteness of N = max{N_1, N_2},

B = V(σ_1(T), θ̃_1(T)) + ½Nσ_0² − (β − c)Σ_{j=2}^N ∫_0^T σ_j² dt + (1/(4c))Σ_{j=2}^N ∫_0^T e_j² dt

is finite. For arbitrary δ > 0 and ε = √((σ_0² + δ)/((β − c)T)), substitution into (22) yields

V(σ_i(T), θ̃_i(T)) ≤ B + ½Σ_{j=N+1}^i σ_0² − (β − c)Σ_{j=N+1}^i ∫_0^T σ_j² dt + (1/(4c))Σ_{j=N+1}^i ∫_0^T e_j² dt
≤ B + ½Σ_{j=N+1}^i [σ_0² − (β − c)Tε²]
= B − ½(i − N)δ.   (23)

The right-hand side of (23) approaches −∞ since B is finite, which contradicts the fact that V(σ_i(T), θ̃_i(T)) is positive definite. Therefore, there must exist a subsequence of σ_i which enters the given bound ε after a finite number of learning iterations.

Remark 1: From Part 2 of Theorem 2, a large gain β can reduce the tracking error bound ε under the initial conditions 3) and 4). Moreover, it should be noted that, in deriving the above convergence properties, we considered only sufficient conditions, i.e., the worst-case performance. In practice, we may achieve better learning performance, such as pointwise or uniform convergence, although in theory only L²_T convergence is guaranteed.

IV. ROBUST ADAPTIVE LEARNING CONTROL (RALC)

In Section III, we studied the ALC problem with the unknown function f(x) ∈ L²(R^n). However, functions in the space L²(R^n) are rarely met in practice. For instance, the simple linear function f(x) = x does not belong to the space L²(R^n). In this section, our objective is to study functions more general

than L²(R^n). As such, we consider functions in L²(D), where D ⊂ R^n is a compact set. Most functions we handle in control practice belong to L²(D). Compared with L²(R^n), the difficulty of function approximation for L²(D) is that the basis defined on D is not valid outside the compact set D. In particular, the weights θ will change when the states x move out of the compact set D. Most function approximation-based control methods developed hitherto require the system states to stay strictly in D, i.e., no expansion from D. Such a non-expansion condition, in fact, concerns the transient behavior of the control system and is in general far more difficult to establish than the original control task of asymptotic convergence. On the other hand, robust control methods can easily constrain the system states in D at all times, provided the unknown functions satisfy Assumption 2. Most studies on robust control are based on this assumption. In this section, we study the possibility of combining robust control with function approximation to achieve better control performance for the plant S_I.

It is well known that, in robust control, achieving a small tracking error bound in the presence of nonvanishing perturbations requires a high feedback gain: the smaller the error bound, the higher the gain. Using an overly large control gain will, however, incur excessive control actions, not only wasting energy but also degrading responses, shortening the life cycle of control mechanisms, or even destabilizing the control system. An appropriate control approach is to incorporate function approximation into robust control. Robust control with a lower gain will guarantee a bounded tracking performance, say D, although the error bound may not meet the performance specification.
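The gain/error-bound tradeoff described above can be checked numerically; the scalar example below is my construction, not from the paper, and uses pure proportional feedback against a bounded nonvanishing perturbation.

```python
import numpy as np

# For the scalar error dynamics ẋ = d(t) − k·x with |d| ≤ 1, robust feedback
# alone leaves a residual error on the order of d_max/k: halving the residual
# bound demands doubling the gain k.
def residual(k, dt=1e-3, T=10.0):
    x, worst = 0.0, 0.0
    for i in range(int(T / dt)):
        d = np.sin(2.0 * np.pi * i * dt)     # bounded nonvanishing perturbation
        x += dt * (d - k * x)                # Euler step of ẋ = d − k·x
        worst = max(worst, abs(x))
    return worst

r10, r100 = residual(10.0), residual(100.0)
assert r100 < r10                 # higher gain, smaller residual error
assert r10 < 0.15 and r100 < 0.02 # roughly d_max / k in each case
```

Learning-based compensation of d(t), as proposed in this section, removes the residual without resorting to such gain inflation.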
Then the function approximation with adaptive learning will gradually take over the tracking task by generating the necessary control signals to compensate any nonvanishing perturbations or produce the "internal model."

Consider a compact set

D_0 = {σ_i ∈ R : |σ_i|_s ≤ ε_0}   (24)

where ε_0 > 0 is a sufficiently large constant so that the initial condition |σ(0)| ≤ σ_0 is within the compact set. From the definition of the augmented tracking error σ_i(t) in (4), corresponding to D_0 there exists a compact set D such that x_i ∈ D. So long as we can prove the non-expansion property of the compact set D_0 for any i and t ∈ [0, T], the non-expansion property of D is guaranteed. The non-expansion of D warrants a valid function approximation sequence, because the weights θ will not change. To fulfill this control task, we need to show two properties of the RALC: first, the non-expansion of D_0, namely the boundedness of σ_i by ε_0; second, the convergence of the tracking error sequence ‖σ_i‖_T to the prespecified bound ε.

In the preceding section, we have shown the learning convergence analysis for f ∈ L²(R^n). In order to make use of the analysis results in Theorems 1 and 2, we need to modify the functions f ∈ L²(D) into functions f_a ∈ L²(R^n), as shown in Fig. 2. It is obvious that f_a(x) ∈ L²(R^n) and f(x) = f_a(x) for x ∈ D.

Fig. 2. Relationship between f(x) and f_a(x) for a scalar x. f_a(x) = f(x) for |x|_s ≤ D and f_a(x) = 0 for |x|_s ≥ 2D. Clearly, f_a(x) is smooth and monotonous between the boundaries ∂D and ∂2D.

Remark 2: Note that such a modification is fictitious, because the states x will not leave D by the robust control part, as we will show later. Hence the construction of such a fictitious f_a is only for the convenience of analysis. Likewise, the bounding function f̄ of f, defined on D, can also be modified into a fictitious f̄_a defined on R^n, with f̄_a = f̄ where x ∈ D.

Now we are ready to construct an augmented plant

S_a:  ẋ_{l,i} = x_{l+1,i},  l = 1, 2, . . . , n − 1;  ẋ_{n,i} = f_a(x_i) + u_i   (25)

which has the same form as S_I. The ALC law (7) is revised with an additional robust control term β_i, as follows:

u_i = −(β + β_i)σ_i − θ̂_i^T g_i − v_i,   β_i = (|θ̂_i^T g_i| + f̄_i^a)/ε_0   (26)

where β > 1, and θ̂_i^T g_i is the function approximation series of f_a on R^n. Substituting the RALC law (26) into (6) and replacing f(x_i) by f_a(x_i), the dynamics of the tracking error σ_i is

σ̇_i = −(β + β_i)σ_i − θ̂_i^T g_i + f_i^a   (27)

where f_i^a = f_a(x_i).

Theorem 3: For the plant S_a shown in (25) satisfying Assumption 3, the controller (26) together with the parametric adaptive learning law (8) guarantees σ_i ∈ D_0 for any i and t ∈ [0, T].

Proof: Differentiating the Lyapunov function

V(σ_i) = ½σ_i²   (28)

with respect to time t, substituting the tracking error dynamics (27), and applying the control law (26), we have

V̇(σ_i) = σ_i(−(β + β_i)σ_i − θ̂_i^T g_i + f_i^a)
 ≤ −βσ_i² − β_i|σ_i|(|σ_i| − (|θ̂_i^T g_i| + |f̄_i^a|)/β_i)
 = −βσ_i² − β_i|σ_i|(|σ_i| − ε_0).   (29)

Clearly, V̇ is negative definite if |σ_i| ≥ ε_0; thus |σ_i(t)| ≤ ε_0 is strictly guaranteed for any i and t ∈ [0, T]. This implies σ_i ∈ D_0 and x_i ∈ D.
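The fictitious augmentation f_a of Fig. 2 can be realized, for analysis or simulation, by a smooth cutoff; the smoothstep blend below is an assumed concrete form (the paper only requires f_a to be smooth and monotonous between ∂D and ∂2D).

```python
import numpy as np

# f_a = f inside |x| <= D, f_a = 0 outside |x| >= 2D, with a smooth monotone
# transition in between, so f_a ∈ L2(R) while f_a = f on the compact set the
# robust term confines the state to.
def f_a(f, x, D=1.0):
    ax = np.abs(x)
    s = np.clip((2 * D - ax) / D, 0.0, 1.0)   # 1 inside D, 0 outside 2D
    w = s * s * (3.0 - 2.0 * s)               # smoothstep transition weight
    return w * f(x)

f = lambda x: x                                # local-L2 example: f(x) = x ∉ L2(R)
x = np.linspace(-4.0, 4.0, 9)
fa = f_a(f, x, D=1.0)
assert np.allclose(fa[np.abs(x) <= 1.0], f(x)[np.abs(x) <= 1.0])  # f_a = f on D
assert np.all(fa[np.abs(x) >= 2.0] == 0.0)                        # f_a = 0 beyond 2D
```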


Theorem 4: For the plant S_a in (25), the controller (26) together with the parametric adaptive learning law (8) guarantees the existence of a subsequence ‖σ_{i_j}‖_T of ‖σ_i‖_T, which enters the bound ε after a finite number of learning iterations.

Proof: The idea of the proof is similar to that of Theorems 1 and 2. Define the same Lyapunov function

V(σ_i, θ̃_i) = (1/2) σ_i^2 + (1/2) θ̃_i^T θ̃_i.   (30)

Differentiating V(σ_i, θ̃_i) with respect to time t, substituting the tracking error dynamics (27), and applying the adaptive learning law (8) yield

V̇(σ_i, θ̃_i) = σ_i σ̇_i − θ̃_i^T θ̂̇_i
             = σ_i [−(β + β_i) σ_i − θ̂_i^T g_i + θ_i^T g_i + e_i] − θ̃_i^T g_i σ_i
             ≤ −β σ_i^2 + σ_i e_i.   (31)

Note that the above relation is the same as (14). Thus all subsequent derivations in Theorems 1 and 2 remain valid, and hence the convergence property concluded in Theorem 2 also holds.

Remark 3: By choosing a sufficiently large ε_0, which is reciprocal to the robust control gain, the robust control effort can be greatly reduced. At the same time, the control objective can still be achieved after adaptive learning.

V. EXTENSION TO CASCADE FORM

Consider the nth-order cascade dynamic system S_II in (3). The backstepping design has been developed as a systematic approach to handle cascade dynamics or any system in triangular form. The principal idea of backstepping design is, for the lth subsystem, to construct a fictitious control input that enters the (l+1)th subsystem as the objective trajectory. As a systematic method, the backstepping design can easily be extended from second order to nth order; hence, for simplicity and to concentrate on the most fundamental steps of the problem solving, we consider a second-order dynamics, i.e., n = 2 in (3), as below

ẋ_{1,i} = f_1(x_{1,i}) + x_{2,i}
ẋ_{2,i} = f_2(x_i) + u_i   (32)

where x_i = [x_{1,i}, x_{2,i}]^T. Denote f_{1,i} = f_1(x_{1,i}) and f_{2,i} = f_2(x_i). The control objective is to design an appropriate control input u_i(t) such that x_{1,i} tracks x_r in L_T^2 as i → ∞. Since f_{j,i} ∈ L^2(R^j) for j = 1, 2, there exist continuous and locally Lipschitzian bases g_j = g_j(x_{1,i}) and h_j = h_j(x_i) such that

f_{1,i} = Σ_{j=1}^∞ θ_j g_j = Σ_{j=1}^i θ_j g_j + e_{1,i} = θ_i^T g_i + e_{1,i}
f_{2,i} = Σ_{j=1}^∞ φ_j h_j = Σ_{j=1}^i φ_j h_j + e_{2,i} = φ_i^T h_i + e_{2,i}

where e_{1,i} and e_{2,i} are approximation errors. Defining the new coordinates z_{1,i} = x_{1,i} − x_r and z_{2,i} = x_{2,i} − α_{1,i}, the fictitious control is

α_{1,i} = −β_1 z_{1,i} + ẋ_r − θ̂_i^T g_i   (33)

where β_1 > 1, and the parametric adaptive learning law is

θ̂̇_i = g_i z_{1,i} − ρ_{1,i} g_i z_{2,i}   (34)

where θ̂_1(0) = 0, θ̂_i(0) = θ̂_{i−1}(T), and

ρ_{1,i} = ∂α_{1,i}/∂x_{1,i} + (∂α_{1,i}/∂g_i)^T (∂g_i/∂x_{1,i}).   (35)

Design the actual controller at the ith iteration as

u_i = ρ_{2,i} − z_{1,i} − β_2 z_{2,i} + ρ_{1,i} θ̂_i^T g_i − φ̂_i^T h_i   (36)

where β_2 > ρ_{1,i}^2 + 1 and

ρ_{2,i} = ∂α_{1,i}/∂t + (∂α_{1,i}/∂x_{1,i}) x_{2,i} + (∂α_{1,i}/∂x_r) ẋ_r + (∂α_{1,i}/∂ẋ_r) x_r^{(2)} + (∂α_{1,i}/∂θ̂_i)^T θ̂̇_i + (∂α_{1,i}/∂g_i)^T (∂g_i/∂x_{1,i}) x_{2,i}.

The second parametric adaptive learning law is

φ̂̇_i = h_i z_{2,i}   (37)

where φ̂_1(0) = 0 and φ̂_i(0) = φ̂_{i−1}(T). The convergence property of the above ALC scheme is derived by the following theorem.

Theorem 5: For the plant (32), the control laws (33) and (36) and the adaptive learning laws (34) and (37) guarantee the existence of a subsequence {z_{1,i_j}} of {z_{1,i}} such that, for arbitrary ε > 0, ‖z_{1,i_j}‖_T enters the bound ε after a finite number of learning iterations.

Proof: The proof consists of two steps.

Step 1: From (32), we have

ż_{1,i} = ẋ_{1,i} − ẋ_r = z_{2,i} + α_{1,i} + f_{1,i} − ẋ_r.   (38)

Substituting the fictitious control α_{1,i} in (33) into (38) yields

ż_{1,i} = z_{2,i} − β_1 z_{1,i} + f_{1,i} − θ̂_i^T g_i = z_{2,i} − β_1 z_{1,i} + θ̃_i^T g_i + e_{1,i}.   (39)

Define a Lyapunov function

V_{1,i} = (1/2) z_{1,i}^2 + (1/2) θ̃_i^T θ̃_i.   (40)

Using (39), the derivative of V_{1,i} is

V̇_{1,i} = z_{1,i} ż_{1,i} − θ̃_i^T θ̂̇_i = z_{1,i} z_{2,i} − β_1 z_{1,i}^2 + e_{1,i} z_{1,i} − θ̃_i^T (θ̂̇_i − g_i z_{1,i}).   (41)


IEEE TRANSACTIONS ON NEURAL NETWORKS, VOL. 22, NO. 6, JUNE 2011

Step 2: From (32) and (33), we have

ż_{2,i} = ẋ_{2,i} − α̇_{1,i}
        = u_i + f_{2,i} − ∂α_{1,i}/∂t − (∂α_{1,i}/∂x_{1,i}) ẋ_{1,i} − (∂α_{1,i}/∂x_r) ẋ_r − (∂α_{1,i}/∂ẋ_r) x_r^{(2)} − (∂α_{1,i}/∂θ̂_i)^T θ̂̇_i − (∂α_{1,i}/∂g_i)^T (∂g_i/∂x_{1,i}) ẋ_{1,i}.

Since ẋ_{1,i} = x_{2,i} + f_{1,i}, collecting the known terms into ρ_{2,i} gives

ż_{2,i} = u_i + f_{2,i} − ρ_{1,i} f_{1,i} − ρ_{2,i}   (42)

where ρ_{2,i}, defined earlier, is known. Substituting the control law (36) into (42) yields

ż_{2,i} = −z_{1,i} − β_2 z_{2,i} − ρ_{1,i}(f_{1,i} − θ̂_i^T g_i) + (f_{2,i} − φ̂_i^T h_i)
        = −z_{1,i} − β_2 z_{2,i} − ρ_{1,i} θ̃_i^T g_i − ρ_{1,i} e_{1,i} + φ̃_i^T h_i + e_{2,i}.   (43)

Define the Lyapunov function

V_{2,i} = V_{1,i} + (1/2) z_{2,i}^2 + (1/2) φ̃_i^T φ̃_i.   (44)

The derivative of V_{2,i} is

V̇_{2,i} = V̇_{1,i} + z_{2,i} ż_{2,i} − φ̃_i^T φ̂̇_i.   (45)

Using (43), we have

z_{2,i} ż_{2,i} = −z_{1,i} z_{2,i} − β_2 z_{2,i}^2 − ρ_{1,i} θ̃_i^T g_i z_{2,i} − ρ_{1,i} e_{1,i} z_{2,i} + φ̃_i^T h_i z_{2,i} + e_{2,i} z_{2,i}.   (46)

Substituting (41) and (46) into (45) yields

V̇_{2,i} = z_{1,i} z_{2,i} − β_1 z_{1,i}^2 + e_{1,i} z_{1,i} − θ̃_i^T (θ̂̇_i − g_i z_{1,i}) − z_{1,i} z_{2,i} − β_2 z_{2,i}^2 − ρ_{1,i} θ̃_i^T g_i z_{2,i} − ρ_{1,i} e_{1,i} z_{2,i} + φ̃_i^T h_i z_{2,i} + e_{2,i} z_{2,i} − φ̃_i^T φ̂̇_i
        = −β_1 z_{1,i}^2 − β_2 z_{2,i}^2 − θ̃_i^T (θ̂̇_i − g_i z_{1,i} + ρ_{1,i} g_i z_{2,i}) − φ̃_i^T (φ̂̇_i − h_i z_{2,i}) + e_{1,i} z_{1,i} − ρ_{1,i} e_{1,i} z_{2,i} + e_{2,i} z_{2,i}.   (47)

Substitution of the adaptive learning laws (34) and (37) results in

V̇_{2,i} = −β_1 z_{1,i}^2 − β_2 z_{2,i}^2 + e_{1,i} z_{1,i} − ρ_{1,i} e_{1,i} z_{2,i} + e_{2,i} z_{2,i}.   (48)

Using Young's inequality, there exists c ∈ (0, 1) such that e_{1,i} z_{1,i} ≤ c z_{1,i}^2 + (1/4c) e_{1,i}^2, −ρ_{1,i} e_{1,i} z_{2,i} ≤ c ρ_{1,i}^2 z_{2,i}^2 + (1/4c) e_{1,i}^2, and e_{2,i} z_{2,i} ≤ c z_{2,i}^2 + (1/4c) e_{2,i}^2. Choosing (β_1 − c) ≥ β and (β_2 − c ρ_{1,i}^2 − c) ≥ β with β > 0, we obtain

V̇_{2,i} ≤ −(β_1 − c) z_{1,i}^2 − (β_2 − c ρ_{1,i}^2 − c) z_{2,i}^2 + (1/2c) e_{1,i}^2 + (1/4c) e_{2,i}^2
        ≤ −β (z_{1,i}^2 + z_{2,i}^2) + (e_{1,i}^2 + e_{2,i}^2)/(2c).

By viewing z_{1,i}^2 + z_{2,i}^2 and (e_{1,i}^2 + e_{2,i}^2)/2c as lumped quantities, the above relation is analogous to relation (16) in Theorem 1. Further, (e_{1,i}^2 + e_{2,i}^2)/2c is convergent in L_T^2 when i → ∞. Therefore, by following the derivation procedures in Theorems 1 and 2, we reach the conclusion that ‖z_{1,i_j}‖_T ≤ ε can be achieved after a finite number of learning iterations.

VI. WAVELET BASES

A. Multiresolution Approximations by Wavelet

From the previous discussion, finding an appropriate basis is indispensable to achieve the desirable function approximation property in ALC or RALC. Wavelet theory offers such a basis for the L^2 space [32], [33]. In this section, we illustrate how an orthonormal basis of wavelets for L^2(R) can be constructed from the multiresolution approximation.

Multiresolution analysis was proposed in [38]. It provides a mathematical tool to describe the increment in information from a coarse resolution approximation to a finer resolution approximation. Let us give the definition of this concept. Denote Z the set of integers.

Definition 2: A multiresolution analysis of L^2(R) is an increasing sequence V_j ⊂ L^2(R), j ∈ Z, of closed subspaces of L^2(R), with the following properties:
1) ... ⊂ V_{−2} ⊂ V_{−1} ⊂ V_0 ⊂ V_1 ⊂ V_2 ⊂ ...;
2) ∩_{j=−∞}^{∞} V_j = {0}, and ∪_{j=−∞}^{∞} V_j is dense in L^2(R);
3) ∀f ∈ L^2(R), ∀j ∈ Z, f(x) ∈ V_j ⇔ f(2x) ∈ V_{j+1};
4) f(x) ∈ V_j ⇒ f(x − 2^{−j}k) ∈ V_j, j, k ∈ Z;
5) for all j, there exists a φ(x), called the scaling function, such that {φ_{j,k}(x) = 2^{j/2} φ(2^j x − k) | k ∈ Z} is an orthonormal basis of V_j and V_j = span{φ_{j,k} | k ∈ Z}.

The orthogonal projection of a function f ∈ L^2(R) onto V_j is given by f_j(x) = Σ_{k∈Z} <φ_{j,k}(x), f(x)> φ_{j,k}(x) and can be interpreted as an approximation to f at resolution 2^{−j}. Therefore, the function f(x) can be uniquely approximated in the space V_j as

f(x) = Σ_{k=1}^{N_j} <φ_{j,k}(x), f(x)> φ_{j,k}(x) + e_j

where e_j is the approximation error at the jth resolution, including the truncation error, N_j is the number of bases used at the jth resolution, and <·,·> is the inner product. Note that a larger j means a higher resolution; therefore e(j+1) ≤ e(j) and lim_{j→∞} e(j) = 0.
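The projection onto V_j and the monotone error property e(j+1) ≤ e(j) can be illustrated numerically. Below is a minimal sketch that uses the Haar scaling function — an assumed choice made only because its dilated translates form a simple orthonormal basis; the networks later in the paper use Sinc, Morlet, and Mexican-hat bases instead.

```python
import numpy as np

# Haar scaling function: phi(x) = 1 on [0, 1), else 0 (illustrative choice;
# its translates 2^{j/2} phi(2^j x - k) are orthonormal in V_j).
def phi(x):
    return np.where((x >= 0.0) & (x < 1.0), 1.0, 0.0)

def project_Vj(f_vals, j, xs):
    """L2 projection of sampled f onto V_j = span{2^{j/2} phi(2^j x - k)} on [0, 1]."""
    dx = xs[1] - xs[0]
    fj = np.zeros_like(xs)
    for k in range(2 ** j):                       # translates covering [0, 1]
        basis = 2.0 ** (j / 2.0) * phi(2.0 ** j * xs - k)
        coeff = np.sum(basis * f_vals) * dx       # inner product <phi_{j,k}, f>
        fj += coeff * basis
    return fj

xs = np.linspace(0.0, 1.0, 4096, endpoint=False)
f_vals = np.sin(2.0 * np.pi * xs)
errors = [np.sqrt(np.sum((f_vals - project_Vj(f_vals, j, xs)) ** 2) * (xs[1] - xs[0]))
          for j in (2, 4, 6)]
# a finer resolution (larger j) yields a smaller L2 error: e(j+1) <= e(j)
assert errors[0] > errors[1] > errors[2]
```

Each doubling of the resolution roughly halves the L2 error for this smooth test function, which is the mechanism the constructive network exploits when it adds a resolution layer.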

By defining W_j as the orthogonal complement of V_j in V_{j+1}, i.e., V_{j+1} = V_j ⊕ W_j, the space L^2(R) is represented as a direct sum

L^2(R) = ⊕_{j∈Z} W_j.   (49)

Moreover, from the previous assumption on V_j it follows that there exists a function ψ(x), called the mother wavelet, such that

{ψ_{j,k}(x) = 2^{j/2} ψ(2^j x − k) | k ∈ Z}   (50)

is an orthonormal basis of W_j. From (49), {ψ_{j,k} | j, k ∈ Z} constitutes an orthonormal basis for L^2(R). The spaces W_j are called wavelet subspaces of L^2(R) relative to the scaling function φ(x), and the orthogonal projection of a function f ∈ L^2(R) onto W_j, given by

g_j(x) = Σ_{k∈Z} <ψ_{j,k}(x), f(x)> ψ_{j,k}(x),   (51)

can be interpreted as an approximation to f at resolution 2^{−j}. Therefore, the function f(x) in the space L^2(R) can be uniquely approximated as

f(x) = Σ_{k∈Z} <φ_{J,k}(x), f(x)> φ_{J,k}(x) + Σ_{j≥J} Σ_{k∈Z} <ψ_{j,k}(x), f(x)> ψ_{j,k}(x)
     = Σ_{k=0}^{k_J} v_{J,k} φ_{J,k}(x) + Σ_{j=J}^{J_p} Σ_{k=0}^{k_j} w_{j,k} ψ_{j,k}(x) + e(J_p)

where J denotes the starting resolution, J_p denotes the terminal resolution, and v_{J,k} and w_{j,k} denote the coefficients or weights of the wavelet network. For a fixed j, the index k increases from 0 to k_j.

Now consider an n-dimensional system with x = [x_1, ..., x_n]^T. The n-dimensional scaling function and mother wavelet are expressed as the product of n one-dimensional scaling functions and mother wavelets

φ_{j,k}(x) = φ_{j,k_1}(x_1) φ_{j,k_2}(x_2) ... φ_{j,k_n}(x_n)
ψ_{j,k}(x) = ω_{j,k_1}(x_1) ω_{j,k_2}(x_2) ... ω_{j,k_n}(x_n)

where each ω_{j,k_l}(x_l), l = 1, 2, ..., n, can be either φ_{j,k_l}(x_l) or ψ_{j,k_l}(x_l), but they cannot all be scaling functions. Given a function f(x), x = [x_1, ..., x_n]^T, in the space L^2(R^n), the approximation up to the J_p resolution is

f(x) = Σ_{k=[0,...,0]}^{k_J} v_{J,k} φ_{J,k}(x) + Σ_{j=J}^{J_p} Σ_{k=[0,...,0]}^{k_j} w_{j,k} ψ_{j,k}(x) + e(J_p)   (52)

where, on the right-hand side, the first term is the starting or coarse resolution, which consists of the scaling functions only; the second term comprises the higher resolutions up to the J_p level, which consist of mother wavelets only; and the third term is the approximation error. The index k = [k_1, ..., k_n] ∈ Z^n and each k_l (l = 1, 2, ..., n) starts from 0. For the resolution J, k_l increases up to the upper limit k_{l,J}, which is the lth element of k_J. For the jth resolution, k_l increases up to the upper limit k_{l,j}, which is the lth element of k_j. The selection of k_J and k_j will be further discussed in the numerical example.

B. Three Wavelet Bases

Three types of continuous wavelet functions, namely, Sinc, Mexican hat, and Morlet, are investigated for their suitability for control problems.

Case 1: Sinc (Shannon) Wavelet: The Sinc wavelet is widely used to solve signal processing problems. The scaling function of the Sinc wavelet is φ(x) = sinc(x). The corresponding wavelet function is ψ(x) = 2 sinc(2x) − sinc(x), shown in Fig. 3.

Case 2: Morlet Wavelet: This wavelet is derived from a function that is proportional to the cosine function and the Gaussian probability density function. It is nonorthogonal with an infinite support, and its maximum energy lies around the origin within a narrow band. The Morlet wavelet is described by ψ(x) = e^{−x^2/2} cos(5x), as shown in Fig. 4.

Case 3: Mexican Wavelet: The Mexican (hat) wavelet is derived from a function that is proportional to the second derivative of the Gaussian probability density function. It is nonorthogonal with an infinite support, and its maximum energy lies around the origin within a narrow band. This wavelet is described by ψ(x) = (1 − x^2) e^{−x^2/2}, as illustrated in Fig. 5.
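The three mother wavelets listed above can be written down directly; a quick sketch (note that np.sinc is the normalized sinc, sin(πx)/(πx), matching the convention used here) is:

```python
import numpy as np

# Mother wavelets from Section VI-B.
def sinc_wavelet(x):
    # Sinc (Shannon): psi(x) = 2 sinc(2x) - sinc(x)
    return 2.0 * np.sinc(2.0 * x) - np.sinc(x)

def morlet_wavelet(x):
    # Morlet: psi(x) = exp(-x^2/2) cos(5x)
    return np.exp(-x ** 2 / 2.0) * np.cos(5.0 * x)

def mexican_hat_wavelet(x):
    # Mexican hat: psi(x) = (1 - x^2) exp(-x^2/2)
    return (1.0 - x ** 2) * np.exp(-x ** 2 / 2.0)

# All three equal 1 at the origin and decay away from it,
# consistent with the plots in Figs. 3-5.
for psi in (sinc_wavelet, morlet_wavelet, mexican_hat_wavelet):
    assert abs(psi(0.0) - 1.0) < 1e-12
    assert abs(psi(6.0)) < 0.2
```

Dilated and translated copies 2^{j/2} ψ(2^j x − k) of these functions, together with the scaling functions at the coarse resolution, form the regressors of the wavelet network.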

VII. ILLUSTRATIVE EXAMPLE

A. Design of Wavelet Network Structure

In order to provide useful information and guidelines for practical applications of wavelets in ALC, we focus on a few important factors: the suitability of a wavelet basis, the complexity of the function approximation network, and the length of the adaptive learning period. Let N_total denote the total

number of iterations. Let N_dwell be the number of iterations between two structural updates; in other words, N_dwell is the number of "dwell iterations." For instance, by choosing N_dwell = 10, the resolution j will increase by 1 after every 10 iterations, i.e., whenever i increases by 10.

Now we demonstrate how to construct the wavelet network. According to (52), it is necessary to select the values of J, k_J, j, and k_j. First, choose J as a coarse resolution. By estimating the ranges of the state variables, we can further decide k, which starts from [0, ..., 0] and increases by 1 until it ends at k_J = [k_{1,J}, ..., k_{m,J}, ..., k_{n,J}] at the Jth layer. Note that n is the dimension of the states, and k_{m,J}, m = 1, 2, ..., n, can take different values because the ranges of the variables could be different. The number of wavelet bases at the Jth layer is N_J = (k_{1,J} + 1) × ... × (k_{n,J} + 1). Analogously, for the jth resolution, j > J, k_j = [k_{1,j}, ..., k_{n,j}] is the upper boundary of k in the wavelet network. Generally speaking, k_j determines the locations of the wavelets at the jth resolution, and the distance between any two consecutive wavelets is 2^{−j}, as can be seen from (50). This distance halves when the resolution j increases by 1. In order to cover the same ranges when the resolution j increases by 1, k_j should be doubled, namely, k_{j+1} = 2k_j = [2k_{1,j}, ..., 2k_{n,j}]. The total number of bases in the wavelet network, which starts from the coarse resolution J and ends at the final resolution j = J_p, is

N_b = Σ_{j=J}^{J_p} N_j = Σ_{j=J}^{J_p} Π_{l=1}^{n} (k_{l,j} + 1).   (53)

The design procedure is summarized below.

Step 1: Choose J as the coarse resolution.
Step 2: Decide k_J by estimating the region of the states.
Step 3: Decide the dwell iteration number N_dwell.
Step 4: If the error is above the prespecified bound after N_dwell iterations, increase the resolution j by 1.
Step 5: Double k_j when j increases by 1, then proceed with new learning iterations.
Step 6: Stop adaptive learning when the tracking error enters the prespecified bound; otherwise, go to Step 4.

Fig. 5. Mexican wavelet function ψ(x).

Fig. 6. Maximum tracking error based on ALC with Sinc wavelet for N_dwell = 1.

Fig. 7. Maximum tracking error based on ALC with Sinc wavelet for N_dwell = 10.
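The bookkeeping in (53), combined with the doubling rule k_{j+1} = 2k_j, reproduces the network sizes quoted in the example below and in Table I (J = 1, k_J = [4, 4], n = 2). A small sketch:

```python
def network_size(k_J, J, J_p):
    """Total number of bases N_b = sum_{j=J}^{J_p} prod_l (k_{l,j} + 1),
    as in eq. (53), with k_j doubled at each added resolution layer."""
    k = list(k_J)
    total = 0
    for _ in range(J, J_p + 1):
        n_j = 1
        for kl in k:
            n_j *= kl + 1                 # bases at this resolution layer
        total += n_j
        k = [2 * kl for kl in k]          # doubling rule: k_{j+1} = 2 k_j
    return total

# J = 1, k_J = [4, 4] as in the illustrative example
assert network_size([4, 4], 1, 2) == 106    # Sinc/Mexican final structure (Table I)
assert network_size([4, 4], 1, 3) == 395    # Morlet with three layers (Table I)
assert network_size([4, 4], 1, 4) == 1484   # size after four updates, as quoted below
```

The quadratic-per-layer growth (25, 81, 289, 1089 bases at successive layers here) is why the dwell iteration number N_dwell matters: updating the structure too eagerly inflates N_b long before the parameters have been adequately learned.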

B. Complexity Versus Learning Rate Consider the following dynamic system: x˙1 = x 2 x˙2 = 8e−x1 sin 5(x 1 + x 2 ) + u. The reference trajectory is xr = 2t 2 − t 3 , x˙r = 4t − 3t 2 . In this system, the unknown function f (x) = 8e−x1 sin 5(x 1 + x 2 ) ∈ L2 (R 2 ). Due to space limitation, we only demonstrate ALC control under the initial condition 1), i.e., x 1,i = x 2,i = 0. The prespecified tracking error bound is  = 0.01. We choose the coarse resolution J = 1 and k J = [k1,J , k2,J ]T = [4, 4]T to estimate the unknown function in the region [0, 1] × [0, 2]. The three different wavelet bases, Sinc, Morlet, and Mexican, are applied. First, Sinc Wavelet is applied. Choose Ndwell = 1, i.e., the wavelet network structure is adjusted by increasing one resolution when the iteration number i increases by 1. The tracking error is shown in Fig. 6. From the figure, the tracking error x 2 does not enter the prespecified error bound after four iterations. On the other hand, resolution j = 4 corresponds to a relatively complex structure since the number of wavelet bases at the resolution j = 4 reaches 1484 according to (53). A question is whether a higher resolution j = 5 is really imperative. Note that updating the structure at every iteration is the fastest updating rate. Since ALC needs more iterations to converge, we can figure out that the tracking error is more likely due to inadequate learning and less likely due to inadequate approximation. Hence we may update the network structure at a lower rate: for instance, updating once after a few learning iterations. Choose a dwell iteration number

Fig. 8. Maximum tracking error based on ALC with Morlet wavelet.

Fig. 9. Maximum tracking error based on ALC with Mexican wavelet.

Fig. 10. Approximation of the unknown function in the 2-D system by the Mexican wavelet network with ALC.

Fig. 11. Maximum tracking error based on ALC with Morlet wavelet for N_dwell = 20.

TABLE I
COMPARISON OF WAVELET NETWORKS FOR DIFFERENT WAVELETS

Type of wavelet   J_p   N_b   N_total
Sinc               2    106     17
Morlet             3    395     21
Mexican            2    106     16

Ndwell = 10. The tracking error performance is shown in Fig. 7. From Fig. 7, the tracking errors enter the desired tracking error bound after 17 iterations. Thus the final resolution is J p = 2, and the corresponding number of bases is Nb = 106. This implies that the complexity of the wavelet network can be greatly reduced by selecting an appropriate Ndwell . In practical control applications, the dwell iteration number Ndwell can be determined according to other control requirements. For instance, if priority is given to the learning rate, a small Ndwell would be proper. On the contrary, if controller complexity is the main concern, a large Ndwell shall be chosen. Now choosing Ndwell = 10, we apply Morlet wavelet and Mexican Wavelet in ALC approaches. The performances are shown in Figs. 8 and 9, respectively. Table I summarizes the comparative results of ALC with the three kinds of wavelets when Ndwell = 10. From Table I, we can see that Sinc and Mexican wavelets achieve a fast convergence rate and at the same time generate the simple structure. The performances are almost the same for Sinc wavelet and Mexican wavelet. Fig. 10 shows a fairly precise approximation of the unknown function f (x) by the Mexican wavelet network with ALC. From Fig. 8, we note that it may not be necessary to add the third resolution to the Morlet wavelet network, as the

errors are converging when the iteration number approaches 20. To verify this point, we choose Ndwell = 20 and run Morlet wavelet-based ALC again. The tracking performance is shown in Fig. 11, where the tracking errors enter the prespecified bound after 32 iterations with only two layers of resolution. In practice, it remains an open issue as to how to choose an optimal Ndwell that ensures the simplest wavelet network structure and at the same time the fastest learning convergence rate. C. Comparison with Adaptive Neural Network Finally, the proposed ALC and the adaptive neural network control (ANNC) are compared. In ANNC approaches that are based on function approximation, the radial basis function (RBF) is most widely used. The RBF function is a Gaussian function   −(x − μ j )T (x − μ j ) , j = 1, 2, . . . , Nb s j (x) = exp η2j which is used to approximate the continuous function f (x) : R n → R as fˆ(x) = wT s(x), where x ∈ ∈ R n , w = [w1 , w2 , . . . , w Nb ]T ∈ R Nb , Nb > 1 is the ANNC node number, s(x) = [s1 (x), s2 (x), . . . , s Nb (x)]T , μ j = [μ1, j , μ2 , . . . , μn, j ] are the centers of the receptive field, and η j is the width of the Gaussian function [16]. ANNC is applied first with 121 nodes, and centers μ j ( j = 1, 2, . . . , 121) are evenly spaced in [0, 1]×[0, 2], and η j = 0.1 for all j = 1, 2, . . . , 121. The tracking performance is shown in Fig. 12, in which the tracking error x 2 remains 0.03 after 25 iterations. Next, the number of neural nodes is increased to

Fig. 12. Maximum tracking errors for each iteration based on ANN with 121 nodes.

Fig. 13. Maximum tracking errors for each iteration based on ANN with 400 nodes.

400, with the centers μ_j (j = 1, 2, ..., 400) evenly spaced in [0, 1] × [0, 2] and η_j = 0.1 for all j = 1, 2, ..., 400. The tracking error profiles are shown in Fig. 13. Although the number of neural nodes is larger than the number of bases of all three kinds of wavelet networks, the tracking error x_2 is still above 0.015 after 25 iterations.

VIII. CONCLUSION

In this paper, we developed an ALC approach that can make full use of powerful function approximation in a more flexible and constructive manner. The wavelet network provides an orthonormal basis for L^2(R) and can be constructed from the multiresolution approximation, thereby fulfilling all requirements of the ALC approach. To concentrate on the idea, concepts, and basic methods, we considered two classes of nonlinear uncertain dynamics: higher order plants with a lumped uncertain nonlinear function, and plants in the cascade form. With rigorous analysis, we proved the existence of a solution and the learning convergence properties. A number of case studies were presented to demonstrate the effectiveness of wavelet-based ALC, as well as the choice and design issues of the wavelet network.

REFERENCES

[1] S. Arimoto, S. Kawamura, and F. Miyazaki, "Bettering operation of robots by learning," J. Robot. Syst., vol. 1, no. 2, pp. 123–140, Apr. 1984.
[2] X. Ruan and Z. Bien, "Iterative learning controllers with time-varying gains for large-scale industrial processes to track trajectories with different magnitudes," Int. J. Syst. Sci., vol. 39, no. 5, pp. 513–527, May 2008.

[3] H. S. Ahn, K. L. Moore, and C. Y. Chen, Iterative Learning Control: Robustness and Monotonic Convergence for Interval Systems. New York: Springer-Verlag, 2007.
[4] M. Sun and D. Wang, "Initial condition issues on iterative learning control for non-linear systems with time delay," Int. J. Syst. Sci., vol. 32, no. 11, pp. 1365–1375, 2001.
[5] J.-X. Xu and B. Viswanathan, "Adaptive robust iterative learning control with dead zone scheme," Automatica, vol. 36, no. 1, pp. 91–99, Jan. 2000.
[6] M. French and E. Rogers, "Nonlinear iterative learning by an adaptive Lyapunov technique," Int. J. Control, vol. 73, no. 10, pp. 840–850, 2000.
[7] Y. Tan and J.-X. Xu, "Learning based nonlinear internal model control," in Proc. IEEE Amer. Control Conf., vol. 4, Jun. 2003, pp. 3009–3013.
[8] J. Sjoberg, Q. Zhang, L. Ljung, A. Benveniste, P. Y. Glorennec, B. Delyon, H. Hjalmarsson, and A. Juditsky, "Nonlinear black-box modeling in system identification: A unified overview," Automatica, vol. 31, no. 12, pp. 1691–1724, Dec. 1995.
[9] A. Juditsky, H. Hjalmarsson, A. Benveniste, L. Ljung, J. Sjoberg, and Q. Zhang, "Nonlinear black-box models in system identification: Mathematical foundations," Automatica, vol. 31, no. 12, pp. 1691–1724, Dec. 1995.
[10] K. S. Narendra and K. Parthasarathy, "Identification and control of dynamical systems using neural networks," IEEE Trans. Neural Netw., vol. 1, no. 1, pp. 4–27, Mar. 1990.
[11] R. Zbikowski, K. J. Hunt, D. Sbarbaro, and P. J. Gawthrop, "Neural networks for control systems: A survey," Automatica, vol. 28, no. 6, pp. 1083–1112, Nov. 1992.
[12] A. U. Levin and K. S. Narendra, "Control of nonlinear dynamical systems using neural networks. II. Observability, identification, and control," IEEE Trans. Neural Netw., vol. 7, no. 1, pp. 30–42, Jan. 1996.
[13] R. Sanner and J.-J. E. Slotine, "Gaussian networks for direct adaptive control," IEEE Trans. Neural Netw., vol. 3, no. 6, pp. 837–863, Nov. 1992.
[14] M. M. Polycarpou, "Stable adaptive neural control scheme for nonlinear systems," IEEE Trans. Autom. Control, vol. 41, no. 3, pp. 447–451, Mar. 1996.
[15] S. Seshagiri and H. K. Khalil, "Output feedback control of nonlinear systems using RBF neural networks," IEEE Trans. Neural Netw., vol. 11, no. 1, pp. 69–79, Jan. 2000.
[16] S. S. Ge and C. Wang, "Direct adaptive NN control of a class of nonlinear systems," IEEE Trans. Neural Netw., vol. 13, no. 1, pp. 214–221, Jan. 2002.
[17] K. K. Tan, S. N. Huang, and T. H. Lee, "Further results on adaptive control for a class of nonlinear systems using neural networks," IEEE Trans. Neural Netw., vol. 14, no. 3, pp. 719–722, May 2003.
[18] M. Chen and W. Lee, "Robust adaptive neural network synchronization controller design for a class of time delay uncertain chaotic systems," Chaos, Solitons Fractals, vol. 41, no. 5, pp. 2716–2724, Sep. 2009.
[19] Y.-J. Liu, S. Tong, and Y. Li, "Adaptive neural network tracking control for a class of non-linear systems," Int. J. Syst. Sci., vol. 41, no. 2, pp. 143–158, Feb. 2010.
[20] L.-X. Wang, "Stable adaptive fuzzy control of nonlinear systems," IEEE Trans. Fuzzy Syst., vol. 1, no. 2, pp. 146–155, May 1993.
[21] S. Labiod and T. Guerra, "Adaptive fuzzy control of a class of SISO nonaffine nonlinear systems," Fuzzy Sets Syst., vol. 158, no. 10, pp. 1126–1137, May 2007.
[22] T.-P. Zhang and Y. Yang, "Adaptive fuzzy control for a class of MIMO nonlinear systems with unknown dead-zones," Acta Autom. Sin., vol. 33, no. 1, pp. 96–99, Jan. 2007.
[23] T. Poggio and F. Girosi, "Networks for approximation and learning," Proc. IEEE, vol. 78, no. 9, pp. 1481–1497, Sep. 1990.
[24] M. M. Gupta and D. H. Rao, Neuro-Control Systems: Theory and Applications. Piscataway, NJ: IEEE Press, 1994.
[25] K. Funahashi, "On the approximate realization of continuous mappings by neural networks," Neural Netw., vol. 2, no. 3, pp. 183–192, 1989.
[26] K. Hornik, M. Stinchcombe, and H. White, "Multilayer feedforward networks are universal approximators," Neural Netw., vol. 2, no. 5, pp. 359–366, 1989.
[27] G. P. Liu, V. Kadirkamanathan, and S. A. Billings, "Variable neural networks for adaptive control of nonlinear systems," IEEE Trans. Syst., Man, Cybern. C, Appl. Rev., vol. 29, no. 1, pp. 34–43, Feb. 1999.
[28] G. P. Liu, Nonlinear Identification and Control. New York: Springer-Verlag, 2001.


[29] I. I. Vorovich, L. P. Lebedev, and G. M. L. Gladwell, Functional Analysis. Norwell, MA: Kluwer, 1994.
[30] J.-X. Xu and Y. Tan, "Nonlinear adaptive wavelet control using constructive wavelet networks," IEEE Trans. Neural Netw., vol. 18, no. 1, pp. 115–127, Jan. 2007.
[31] M. Pushpalatha and N. Nalini, "Generalization analysis of wavelet frame based neural network for function representation using compactly supported Gaussian wavelets," Int. J. Recent Trends Eng., vol. 2, no. 6, pp. 188–191, Nov. 2009.
[32] W. Z. Huang, Z. H. Zheng, T. R. Ding, and Z. X. Dong, Qualitative Theory of Differential Equations. Providence, RI: AMS, 1991.
[33] S. E. Kelly, M. A. Kon, and L. A. Raphael, "Local convergence for wavelet expansions," J. Funct. Anal., vol. 126, pp. 102–138, 1994.
[34] G. G. Walter, "Pointwise convergence of wavelet expansions," J. Approx. Theory, vol. 80, no. 1, pp. 108–118, Jan. 1995.
[35] J.-X. Xu and R. Yan, "On initial conditions in iterative learning control," IEEE Trans. Autom. Control, vol. 50, no. 9, pp. 1349–1354, Sep. 2005.
[36] T. Zhang, S. S. Ge, and C. C. Hang, "Stable adaptive control for a class of nonlinear systems using a modified Lyapunov function," IEEE Trans. Autom. Control, vol. 45, no. 1, pp. 129–132, Jan. 2000.
[37] T. M. Apostol, Mathematical Analysis. Reading, MA: Addison-Wesley, 1957.
[38] S. G. Mallat, "Multiresolution approximations and wavelet orthonormal bases of L^2(R)," Trans. Amer. Math. Soc., vol. 315, no. 1, pp. 69–87, Sep. 1989.
[39] I. Daubechies, "Orthonormal bases of compactly supported wavelets," Commun. Pure Appl. Math., vol. 41, no. 7, pp. 909–996, Oct. 1988.


Jian-Xin Xu (SM'98) received the Bachelor's degree in electrical engineering from Zhejiang University, Hangzhou, China, in 1982, and the Master's and Ph.D. degrees in electrical engineering from the University of Tokyo, Tokyo, Japan, in 1986 and 1989, respectively. He spent one year at the Hitachi Research Laboratory, Ibaraki, Japan, more than one year at Ohio State University, Columbus, as a Visiting Scholar, and six months at Yale University, New Haven, CT, as a Visiting Research Fellow. He joined the National University of Singapore, Singapore, in 1991, and is currently a Professor in the Department of Electrical Engineering. His current research interests include learning theory, intelligent control, nonlinear and robust control, robotics, and precision motion control.

Rui Yan (M’11) received the Bachelors and Masters degrees from the Department of Mathematics, Sichuan University, Chengdu, China, in 1998 and 2001, respectively, and the Ph.D. degree from the Department of Electrical and Computer Engineering, National University of Singapore, Singapore, in 2006. She is currently a Research Fellow in the Institute for Infocomm Research, Agency for Science, Technology and Research, Singapore. Her current research interests include advanced nonlinear control, power system stability analysis and control, intelligent control, and social robotics.