Neural Networks 76 (2016) 122–134


Hybrid feedback feedforward: An efficient design of adaptive neural network control

Yongping Pan (a), Yiqi Liu (b), Bin Xu (c), Haoyong Yu (a,*)

(a) Department of Biomedical Engineering, National University of Singapore, Singapore 117583, Singapore
(b) School of Automation Science and Engineering, South China University of Technology, Guangzhou 510640, China
(c) School of Automation, Northwestern Polytechnical University, Xi'an 710072, China

* Corresponding author. E-mail addresses: [email protected] (Y. Pan), [email protected] (Y. Liu), [email protected] (B. Xu), [email protected] (H. Yu).

Article history: Received 10 July 2015; Revised and accepted 11 December 2015; Available online 23 December 2015.

Keywords: Adaptive control; Feedforward compensation; Human motor learning control; Neural network; Euler–Lagrange system

Abstract

This paper presents an efficient hybrid feedback feedforward (HFF) adaptive approximation-based control (AAC) strategy for a class of uncertain Euler–Lagrange systems. The control structure includes a proportional–derivative (PD) control term in the feedback loop and a radial-basis-function (RBF) neural network (NN) in the feedforward loop, which mimics the human motor learning control mechanism. In the presence of discontinuous friction, a sigmoid-jump-function NN is incorporated to improve control performance. The major difference of the proposed HFF-AAC design from the traditional feedback AAC (FB-AAC) design is that only desired outputs, rather than both tracking errors and desired outputs, are applied as RBF-NN inputs. Yet, such a slight modification leads to several attractive properties of HFF-AAC, including the convenient choice of an approximation domain, a decrease in the number of RBF-NN inputs, and semiglobal practical asymptotic stability dominated by control gains. Compared with previous HFF-AAC approaches, the proposed approach possesses the following two distinctive features: (i) all of the above attractive properties are achieved by a much simpler control scheme; (ii) the bounds of plant uncertainties are not required to be known. Consequently, the proposed approach guarantees a minimum configuration of the control structure and a minimum requirement of plant knowledge for the AAC design, which leads to a sharp decrease of implementation cost in terms of hardware selection, algorithm realization and system debugging. Simulation results have demonstrated that the proposed HFF-AAC can perform as well as or even better than the traditional FB-AAC under much simpler control synthesis and much lower computational cost. © 2015 Elsevier Ltd. All rights reserved.

1. Introduction

By virtue of the universal approximation property of function approximators such as fuzzy logic systems and neural networks (NNs), adaptive approximation-based control (AAC) demonstrates a powerful capability for handling functional uncertainties in nonlinear systems (Farrell & Polycarpou, 2006) and has attracted considerable attention in recent years; for example, see Castaneda and Esquivel (2012), Chemachema (2012), Fairbank, Li, Fu, Alonso, and Wunsch (2014), Hamdy and el-Ghazaly (2014), Herzallah (2013), Kostarigka and Rovithakis (2012), Kruger, Schnetter, Placzek, and Vorsmann (2012), Liu and Tong (2014), Liu and Tong (2015), Liu, Tong, and Chen (2013), Pan and Er (2013), Pan, Er, Huang, and Sun (2012), Pan, Er, Huang, and Wang (2011), Pan, Yu, and Er (2014), Pan, Zhou, Sun, and Er (2013), Yousef, Hamdy, and Shafiq (2013), Zou, Kumar, and Hou (2010), and Zou, Kumar, and Hou (2013). In the traditional feedback-based AAC (FB-AAC), function approximators are applied in the feedback loop to compensate for state-dependent unknown functions. Due to the local approximation property of most approximators and the feedback control structure, the traditional FB-AAC faces some challenges in addressing the following issues (Ren, Ge, Tee, & Lee, 2010): (1) how to determine an approximation domain a priori such that approximators can be constructed; (2) how to ensure approximator inputs remain within the approximation domain such that function approximation is always valid; (3) how to reduce the number of approximator inputs such that computational cost can be decreased.

A hybrid feedback feedforward (HFF) scheme of AAC provides a promising way of tackling the aforementioned challenges of FB-AAC (Pan & Yu, 2014). The key feature of the HFF-AAC design is that only desired outputs, rather than both tracking errors and desired outputs, are utilized as NN inputs. Since desired outputs can usually be prespecified, an approximation domain can be


determined by the desired outputs a priori, and the desired outputs certainly remain within this domain. In addition, the number of NN inputs can be decreased since tracking errors are not needed as NN inputs. There are only a few innovative works on HFF-AAC so far (Chen, Jiao, & Wu, 2012; Chiu, 2006; Ishihara, van Doornik, & Ben-Menahem, 2011). In Ishihara et al. (2011), a HFF-AAC approach was developed for a class of robotic arms. The major limitation of this approach is that six formulas with exact plant bounds are applied to determine a feedback control gain. In Chiu (2006), a HFF-AAC approach was proposed for a class of uncertain affine nonlinear systems with functional control gains, where the feedforward loop contains two approximators, and the feedback loop is constructed based on H∞ control and nonlinear damping techniques. The control law of this approach has three control terms with over 15 design parameters, where a nonlinear damping term with known plant bounds is used to ensure global stability. In Chen et al. (2012), a HFF-AAC approach was presented for a class of affine nonlinear systems with constant control gains, where the control law has three control terms with over 10 design parameters, and an adaptive bounding term with permanently positive estimation is utilized to ensure global stability. Despite their attractive features and great potential, the approaches of Chen et al. (2012) and Chiu (2006) are subject to the following drawbacks: (1) the stability results obtained rely on a strict assumption that plant uncertainties are bounded by partially known functions; (2) favourable tracking performance is guaranteed at the cost of complex controllers with numerous design parameters. Moreover, these approaches face a dilemma in that exact analytical expressions of plant uncertainties are applied to derive the partially known bounds in their illustrative examples, which implies that the plant dynamics are actually known during the control synthesis.

The analysis of the previous HFF-AAC methods motivates the contributions of this study: to propose an efficient HFF-AAC scheme and to establish stability of the resulting closed-loop system. The controlled plant considered is a class of Euler–Lagrange systems, which represents a large class of engineering systems such as robotic arms, surface vessels, unmanned aircraft and power converters (Ortega, Perez, Nicklasson, & Sira-Ramirez, 1998). It is suggested that human motions are controlled by the central nervous system, where the lateral part of the cerebellum forms a feedforward path to act as a predictive machine, and the intermediate part of the cerebellum forms a feedback path to act as a servo machine (Khemaissia & Morris, 1998). Such a HFF architecture has been verified by neuroscientific results such as Lam, Anderschitz, and Dietz (2006), Lee and Terzopoulos (2006), Seidler, Noll, and Thiers (2004) and Wagner and Smith (2008). Inspired by the above human motor learning control mechanism, this study proposes a biomimetic HFF-AAC strategy for Euler–Lagrange systems, where a proportional–derivative (PD) control term is applied as the feedback servo machine, and a radial-basis-function (RBF) NN is applied as the feedforward predictive machine. In the presence of discontinuous friction, a sigmoid-jump-function (SJF) NN is incorporated into the proposed HFF-AAC to improve control performance.
It is worth noting that, due to the novel HFF-AAC scheme proposed, extra efforts are made to establish semiglobal practical asymptotic stability of the closed-loop system. This study is a comprehensive extension of our preliminary work (Pan & Yu, 2014), in which NN approximation errors, external disturbances and discontinuous friction were not considered so as to simplify the control analysis. Differing from the existing HFF-AAC methods (Chen et al., 2012; Chiu, 2006; Ishihara et al., 2011), the proposed HFF-AAC strategy is directly inspired by the human motor learning control mechanism, resulting in only two control terms with three design parameters, and consequently possesses the following distinctive features: (1) all aforementioned challenges of the traditional FB-AAC design are completely resolved by a much simpler control scheme, which guarantees a minimum configuration of the control structure for the AAC design; (2) the bounds of plant uncertainties are not required during control synthesis, which guarantees a minimum requirement of plant knowledge for the AAC design. These two distinctive features are significant for the following reasons. From a theoretical point of view, the minimum configuration of the control structure and the minimum requirement of plant knowledge in the proposed approach overturn the traditional FB-AAC design and lead to an efficient HFF-AAC methodology. From an engineering point of view, the control algorithm obtained leads to a sharp decrease of implementation cost in terms of hardware selection, algorithm realization and system debugging.

The rest of this paper is organized as follows: The control problem is formulated in Section 2; the HFF-AAC is designed in Section 3; discussions are given in Section 4; illustrative examples are given in Section 5; conclusions are provided in Section 6.

Throughout this paper, N, R, R^+, R^n and R^{m×n} denote the spaces of natural numbers, real numbers, positive real numbers, real n-vectors and real m × n matrices, respectively; ∥x∥ denotes the Euclidean norm of x; L∞ denotes the space of bounded signals; λmin(A) and λmax(A) are the minimal and maximal eigenvalues of A, respectively; min(·), max(·) and sup(·) are the functions of minimum, maximum and supremum, respectively; sgn(·) is the sign function; diag(·) is a diagonal matrix; col(x, z) := [x^T, z^T]^T; and C^k represents the space of functions whose derivatives up to order k all exist and are continuous, where x, z ∈ R^n, A ∈ R^{n×n}, and m, n, k ∈ N. In the subsequent sections, the arguments of a function may be omitted when the context is sufficiently explicit.

2. Problem formulation

The class of nonlinear systems considered here is described by the following Euler–Lagrange formulation (Ortega et al., 1998):

$$M(q)\ddot{q} + C(q,\dot{q})\dot{q} + G(q) + F(\dot{q}) + \tau_d = \tau \tag{1}$$

in which M(q) ∈ R^{n×n} is an inertia matrix, C(q, q̇) ∈ R^{n×n} is a centripetal–Coriolis matrix, G(q) ∈ R^n is a vector of gravitational torques, F(q̇) ∈ R^n is a vector of friction torques, τd(t) ∈ R^n is a vector of external disturbances, τ(t) ∈ R^n is a vector of control torques, q(t) := [q1(t), q2(t), ..., qn(t)]^T ∈ R^n is a vector of joint angles, and n is the number of joints. Let qd(t) := [qd1(t), qd2(t), ..., qdn(t)]^T ∈ R^n be a vector of desired inputs. In this study, it is assumed that q and q̇ are measurable, and M, C, G, F and τd are unknown a priori. The following properties are exploited for the subsequent development (Ortega et al., 1998).

Property 1. M(q) is symmetric and positive-definite, and satisfies m0∥ξ∥² ≤ ξ^T M(q)ξ ≤ m̄(q)∥ξ∥², ∀ξ ∈ R^n, where m0 ∈ R^+ is an unknown constant, and m̄(q): R^n → R^+ is an unknown function.

Property 2. Ṁ(q) − 2C(q, q̇) is skew-symmetric such that ξ^T(Ṁ(q) − 2C(q, q̇))ξ = 0, ∀ξ ∈ R^n.

Property 3. M(q), C(q, q̇) and G(q) are of class C^1, ∀q, q̇ ∈ R^n.

Property 4. F(q̇) is uncoupled among joints so that F(q̇) = [f1(q̇1), f2(q̇2), ..., fn(q̇n)]^T, where fi(q̇i): R → R (i = 1 to n) are bounded, continuous and analytic on a compact set Ωq ⊂ R^n, except at q̇i = 0 where fi(q̇i) is right-continuous with a finite jump.

Property 5. τd(t) is bounded by ∥τd(t)∥ ≤ τ̄d with τ̄d ∈ R^+ being an unknown constant.

Property 6. q, q̇ and q̈ ∈ L∞.

Define a position tracking error e1(t) := qd(t) − q(t), a filtered tracking error e2(t) := ė1(t) + K1e1(t) and an auxiliary signal v(t) := q̇d(t) + K1e1(t), where K1 ∈ R^{n×n} is a positive-definite and diagonal matrix. Let e := col(e1, e2) and xd := col(qd, q̇d, q̈d). The objective of this study is to develop a NN-based control strategy for the system (1) such that the system output q accurately tracks its desired signal qd for a wide range of initial states.
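For concreteness, the error signals defined above can be formed as in the following Python sketch. This is a minimal illustration only: the array names, the two-joint setting and the gain K1 = diag(1, 1) are assumptions for the example, not values taken from the paper.

```python
import numpy as np

def filtered_errors(q, dq, qd, dqd, K1):
    """Error signals of Section 2 (illustrative sketch).

    e1 = qd - q          (position tracking error)
    e2 = de1 + K1 e1     (filtered tracking error)
    v  = dqd + K1 e1     (auxiliary signal), so that dq = v - e2
    """
    e1 = qd - q
    de1 = dqd - dq
    e2 = de1 + K1 @ e1
    v = dqd + K1 @ e1
    return e1, e2, v

# Hypothetical usage with n = 2 joints
K1 = np.diag([1.0, 1.0])
e1, e2, v = filtered_errors(np.zeros(2), np.zeros(2),
                            np.array([0.1, -0.2]), np.zeros(2), K1)
```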


3. Feedback feedforward control design

3.1. Open-loop system dynamics

Differentiating e2 with respect to time t and multiplying both sides of the resulting equality by M(q) yields M(q)ė2 = M(q)(q̈d + K1ė1) − M(q)q̈. Substituting the expression of M(q)q̈ given by (1) into the above equality and noting the definition of v, one gets M(q)ė2 = M(q)v̇ + C(q, q̇)q̇ + G(q) + F(q̇) + τd − τ. Using the definitions of e1, e2 and v, one gets q̇ = v − e2. Applying this result to the above equality leads to the open-loop filtered tracking error dynamics as follows:

$$M(q)\dot{e}_2 = H(x_d, e_1, \dot{e}_1) + F(\dot{q}) - C(q,\dot{q})e_2 + \tau_d - \tau \tag{2}$$

where H is a lumped uncertainty given by

$$H(x_d, e_1, \dot{e}_1) := M(q)(\ddot{q}_d + K_1\dot{e}_1) + C(q,\dot{q})(\dot{q}_d + K_1 e_1) + G(q). \tag{3}$$

Noting ė1 = e2 − K1e1 and q = qd − e1, one gets q̇ = q̇d − e2 + K1e1. By applying this with e = col(e1, e2) and xd = col(qd, q̇d, q̈d) to (3), H can be expressed as a function of xd and e as follows:

$$H(x_d, e) = M(q_d - e_1)\big(\ddot{q}_d + K_1(e_2 - K_1 e_1)\big) + G(q_d - e_1) + C(q_d - e_1, \dot{q}_d - e_2 + K_1 e_1)(\dot{q}_d + K_1 e_1). \tag{4}$$

According to the definition of v, H can also be expressed as

$$H(q,\dot{q},v,\dot{v}) = \underbrace{M(q)\dot{v}}_{H_1(x_{v1})} + \underbrace{C(q,\dot{q})v + G(q)}_{H_2(x_{v2})} \tag{5}$$

with xv1 := col(q, v̇) ∈ R^{2n} and xv2 := col(q, q̇, v) ∈ R^{3n}. Let Hd(xd) := H(xd, e)|_{e=0} be the lumped plant uncertainty at e = 0, where e = 0 implies e1 = ė1 = 0. Therefore, applying e = 0 to (4), one immediately obtains

$$H_d(x_d) = \underbrace{M(q_d)\ddot{q}_d}_{H_{d1}(x_{d1})} + \underbrace{C(q_d,\dot{q}_d)\dot{q}_d + G(q_d)}_{H_{d2}(x_{d2})} \tag{6}$$

with xd1 := col(qd, q̈d) ∈ R^{2n} and xd2 := col(qd, q̇d) ∈ R^{2n}. Subtracting and adding Hd on the right side of (2) yields

$$M(q)\dot{e}_2 = \tilde{H}(x_d, e) + H_d(x_d) + F(\dot{q}) - C(q,\dot{q})e_2 + \tau_d - \tau \tag{7}$$

in which H̃ is a mismatching term given by

$$\tilde{H}(x_d, e) := H(x_d, e) - H_d(x_d). \tag{8}$$

In the same manner as Remark 3 of Xian, Dawson, de Queiroz, and Chen (2004), since H given by (4) is of C^1 as guaranteed by Property 3, the Mean Value Theorem can be applied to H̃ to obtain

$$\|\tilde{H}(x_d, e)\| \le \rho(\|e\|)\|e\| \tag{9}$$

in which ρ: R^+ → R^+ is a certain function that is globally invertible and strictly increasing.¹

¹ The rigorous proof of (9) can be referred to de Queiroz, Jun, Dawson, Burg, and Donepudi (1997, Appendix A).

3.2. Feedback and feedforward approximations

For the traditional FB-AAC of the system (1), NNs with continuous activation functions would be applied to approximate both H(xd, e) and F(q̇) (Lewis, 1996), where a NN with 5n inputs (xd, e) is required in the feedback loop to approximate H(xd, e). It is stated in Selmic and Lewis (2000) that using continuous NNs to approximate discontinuous functions (e.g. F(q̇)) requires many NN nodes with many training iterations, and still does not yield very good results. In Selmic and Lewis (2002), a NN with jump activation functions is applied to improve the approximation of F(q̇) by only a few NN nodes. However, a NN with 5n inputs (xd, e) is still required in the feedback loop to approximate H(xd, e) in Selmic and Lewis (2002). In this study, a linearly parameterized RBF-NN with only 3n inputs (xd) as follows (Farrell & Polycarpou, 2006):

$$\hat{H}(x_d|\hat{W}_h) = \hat{W}_h^T \Phi_h(x_d) \tag{10}$$

is applied in the feedforward loop to approximate Hd(xd) in (6), where Ŵh = [Ŵh1, Ŵh2, ..., Ŵhn] ∈ R^{N×n} with Ŵhi = [ŵhi^1, ŵhi^2, ..., ŵhi^N]^T ∈ R^N is a matrix of NN weights, Φh(xd) = [φh^1(xd), φh^2(xd), ..., φh^N(xd)]^T ∈ R^N is a vector of regression functions, N is the number of RBF-NN nodes, and φh^j(xd): R^{3n} → R^+ with j = 1 to N are Gaussian RBFs given by

$$\phi_h^j(x_d) = \exp\!\big(-\|x_d - c_j\|^2/(2\sigma_j^2)\big) \tag{11}$$

in which cj ∈ R and σj ∈ R are centres and widths of the RBFs, respectively. In addition, in the presence of the discontinuous F(q̇) in (2), SJF-NNs are introduced as follows (Selmic & Lewis, 2002):

$$\hat{f}_i(\dot{q}_i|\hat{W}_{fi}) = \hat{W}_{fi}^T \Phi_f(\dot{q}_i) \tag{12}$$

to approximate fi(q̇i), where Ŵfi = [ŵfi^1, ŵfi^2, ..., ŵfi^L]^T ∈ R^L is a vector of NN weights, Φf(q̇i) = [φf^1(q̇i), φf^2(q̇i), ..., φf^L(q̇i)]^T ∈ R^L is a vector of regression functions, L is the number of SJF-NN nodes, and φf^j(q̇i): R → R^+ are SJFs given by

$$\phi_f^j(\dot{q}_i) = \begin{cases} (1 - e^{-\dot{q}_i})^{j-1} & \text{if } \dot{q}_i \ge 0 \\ 0 & \text{if } \dot{q}_i < 0 \end{cases} \tag{13}$$

in which i = 1, 2, ..., n and j = 1, 2, ..., L. From the expressions of φh^j(xd) and φf^j(q̇i), one knows that Φh(xd), Φf(q̇i) ∈ L∞, ∀xd ∈ R^{3n} and ∀q̇ ∈ R^n. Now, the control law is designed as follows:

$$\tau = K_2 e_2 + \hat{H}(x_d|\hat{W}_h) + \hat{F}(\dot{q}|\hat{W}_f) \tag{14}$$

in which F̂(q̇|Ŵf) := [f̂1(q̇1|Ŵf1), f̂2(q̇2|Ŵf2), ..., f̂n(q̇n|Ŵfn)]^T, Ŵf := [Ŵf1, Ŵf2, ..., Ŵfn] ∈ R^{L×n}, and K2 ∈ R^{n×n} is a positive-definite and diagonal matrix of control gains.

Let compact sets Ωd := {xd | ∥xd∥ ≤ cd}, Ωh := {Ŵh | ∥Ŵh∥_F ≤ ch}, and Ωf := {Ŵf | ∥Ŵf∥ ≤ cf}, where cd, ch, cf ∈ R^+. Next, define optimal NN approximation errors

$$\varepsilon_h(x_d) := H_d(x_d) - \hat{H}(x_d|W_h^*), \tag{15}$$

$$\varepsilon_f(\dot{q}) := F(\dot{q}) - \hat{F}(\dot{q}|W_f^*), \tag{16}$$

where Wh* and Wf* are optimal weight vectors given by

$$W_h^* := \arg\min_{\hat{W}_h \in \Omega_h}\; \sup_{x_d \in \Omega_d} \|H_d(x_d) - \hat{H}(x_d|\hat{W}_h)\|,$$

$$W_f^* := \arg\min_{\hat{W}_f \in \Omega_f}\; \sup_{\dot{q} \in \Omega_q} \|F(\dot{q}) - \hat{F}(\dot{q}|\hat{W}_f)\|.$$

The following lemmas show the universal approximation properties of the NNs in (10) and (12).

Lemma 1 (Farrell & Polycarpou, 2006). Let Hd(xd): Ωd → R^n be any C^1 function. Given any small constant ε̄h ∈ R^+, there exists a RBF-NN Ĥ(xd|Ŵh) in (10) with a sufficiently large number of NN nodes N such that ∥εh(xd)∥ ≤ ε̄h, ∀xd ∈ Ωd.


Lemma 2 (Selmic & Lewis, 2000). Let F(q̇): Ωq → R^n be any bounded function which is continuous and analytic on Ωq, except at q̇ = 0 where it has a finite jump and is right-continuous. Given any small constant ε̄f ∈ R^+, there is a SJF-NN F̂(q̇|Ŵf) in (12) with a sufficiently large number of NN nodes L so that ∥εf(q̇)∥ ≤ ε̄f, ∀q̇ ∈ Ωq.

Remark 1. In the proposed control law (14), a RBF-NN Ĥ(xd|Ŵh) in (10) and a SJF-NN F̂(q̇|Ŵf) in (12) are utilized to compensate for the continuous part H(xd) and the discontinuous part F(q̇) of (1), respectively, and a PD controller K2e2 is utilized to guarantee closed-loop stability. Since F(q̇) is uncoupled among joints according to Property 4, the number of NN nodes in F̂(q̇|Ŵf) increases linearly, rather than exponentially, with the number of joints n. This fact provides a nice feature that adding F̂(q̇|Ŵf) in (14) does not obviously increase the total number of NN nodes Nd. On the other hand, removing e from the inputs of Ĥ(xd|Ŵh) greatly reduces Nd. Consequently, the complexity of the NNs adopted in (14) can be greatly reduced even though an additional F̂(q̇|Ŵf) is applied. More details can be found in Section 4.1. If F(q̇) in (1) is omitted or considered to be smooth (Makkar, Hu, Sawyer, & Dixon, 2007), Ĥ(xd|Ŵh) in (10) can be utilized to approximate the entire term (H(xd) + F(q̇d)) so that (14) is simplified to a pure human motor learning control structure without F̂(q̇|Ŵf). More details can be found in Pan and Yu (2014).
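As a concrete illustration of the approximators in (10)–(13), the following Python sketch builds the Gaussian RBF regressor, the SJF regressor and the two NN outputs. It is a minimal sketch under stated assumptions: the array shapes and helper names are illustrative and not part of the paper.

```python
import numpy as np

def rbf_regressor(xd, centres, widths):
    """Gaussian RBF vector Phi_h(xd) as in (11): phi_j = exp(-||xd - c_j||^2 / (2 sigma_j^2)).
    xd: (3n,), centres: (N, 3n), widths: (N,)."""
    diffs = xd[None, :] - centres
    return np.exp(-np.sum(diffs**2, axis=1) / (2.0 * widths**2))

def sjf_regressor(dqi, L):
    """SJF vector Phi_f(dq_i) as in (13):
    phi_f^j = (1 - exp(-dq_i))^(j-1) for dq_i >= 0, and 0 for dq_i < 0."""
    if dqi < 0:
        return np.zeros(L)
    return (1.0 - np.exp(-dqi)) ** np.arange(L)   # exponents j-1 = 0, ..., L-1

def nn_outputs(xd, dq, Wh, Wf, centres, widths):
    """RBF-NN output (10) and SJF-NN output (12); Wh: (N, n), Wf: (L, n)."""
    H_hat = Wh.T @ rbf_regressor(xd, centres, widths)
    F_hat = np.array([Wf[:, i] @ sjf_regressor(dq[i], Wf.shape[0])
                      for i in range(dq.size)])
    return H_hat, F_hat
```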


3.3. Control synthesis and analysis

For compact presentation, let Ŵ := [Ŵ1, Ŵ2, ..., Ŵn]^T be a block vector of NN weights, Φ(xd, q̇) := diag(Φ1^T(xd, q̇1), ..., Φn^T(xd, q̇n)) be a block diagonal matrix of regression functions, and W* be an optimal value of Ŵ, where Ŵi = col(Ŵhi, Ŵfi), Φi(xd, q̇i) = col(Φh(xd), Φf(q̇i)), and i = 1 to n. Define W̃ := W* − Ŵ and Ωw := {Ŵ | ∥Ŵ∥ ≤ cw} with cw := (ch² + cf²)^{1/2}.

Applying (14)–(16) to (7) and noting the above notations, one gets the closed-loop filtered tracking error dynamics as follows²:

$$M(q)\dot{e}_2 = -K_2 e_2 - C(q,\dot{q})e_2 + \Phi^T(x_d,\dot{q})\tilde{W} + \tilde{H} + \delta \tag{17}$$

where δ is regarded as a perturbation given by

$$\delta(x_d,\dot{q},t) := \varepsilon_h(x_d) + \varepsilon_f(\dot{q}) + \tau_d(t). \tag{18}$$

An adaptive law of Ŵ is designed to be

$$\dot{\hat{W}} = P\big(\hat{W}, \Gamma\Phi(x_d,\dot{q})e_2\big) \tag{19}$$

with Γ = diag(Γi)|_{1≤i≤n} and Γi = diag(Γhi, Γfi), in which Γhi ∈ R^{N×N} and Γfi ∈ R^{L×L} with i = 1 to n are positive-definite and diagonal matrices of learning rates, and P(Ŵ, •) is a projection operator given by (Farrell & Polycarpou, 2006)

$$P(\hat{W}, \bullet) = \begin{cases} \bullet, & \text{if } \|\hat{W}\| < c_w \text{ or } (\|\hat{W}\| = c_w \ \&\ \hat{W}^T\bullet \le 0); \\ \bullet - \hat{W}\hat{W}^T\bullet/\|\hat{W}\|^2, & \text{otherwise.} \end{cases}$$

From Lemmas 1, 2 and Property 5, δ in (18) can be bounded by ∥δ∥ ≤ δ̄ := ε̄h + ε̄f + τ̄d, ∀xd ∈ Ωd and q̇ ∈ Ωq.

Now, choose a Lyapunov function candidate

$$V(z) = \tfrac{1}{2}e_1^T e_1 + \tfrac{1}{2}e_2^T M(q)e_2 + \tfrac{1}{2}\tilde{W}^T\Gamma^{-1}\tilde{W} \tag{20}$$

with z := col(e, ∥W̃∥) ∈ R^{2n+1} for the closed-loop system constituted by (17)–(19). The following theorem is established to show the main result of this study.

² The operation in Φ^T(xd, q̇)W̃ follows the multiplication of block matrices.

Theorem 1. For the system (1) under Properties 1–6 driven by the control law (14) with (10), (12), (19), λmin(Ki) > 1/2 (i = 1, 2) and Ŵ(0) ∈ Ωw, the stability results are given as follows:
(1) If the perturbation δ = 0, then the closed-loop system achieves semiglobal asymptotic stability in the sense that the tracking error e converges to 0 for an arbitrarily large estimated domain of attraction S determined by the control gains K1 and K2;
(2) If the perturbation δ ≠ 0, then there exist suitably large control gains K1 and K2 such that the closed-loop system achieves semiglobal practical asymptotic stability in the sense that an estimated domain of attraction S can be arbitrarily enlarged and a domain of error convergence Ωc can be arbitrarily diminished, both by the increase of K1 and K2.

Proof. Firstly, the time derivative of V given by (20) is

$$\dot{V} = -e_1^T K_1 e_1 + e_1^T e_2 + e_2^T M(q)\dot{e}_2 + e_2^T\dot{M}(q)e_2/2 - \tilde{W}^T\Gamma^{-1}\dot{\hat{W}}.$$

Substituting (17) into the above result yields

$$\dot{V} \le -\sum_{i=1}^{2}(\lambda_{\min}(K_i) - 1/2)\|e_i\|^2 + e_2^T(\tilde{H} + \delta) + e_2^T\Phi^T(x_d,\dot{q})\tilde{W} - \tilde{W}^T\Gamma^{-1}\dot{\hat{W}} + e_2^T(\dot{M}(q) - 2C(q,\dot{q}))e_2/2.$$

Applying Property 2 and e2^TΦ^T(xd, q̇)W̃ = W̃^TΦ(xd, q̇)e2 to the above result, one immediately obtains

$$\dot{V} \le -\sum_{i=1}^{2}(\lambda_{\min}(K_i) - 1/2)\|e_i\|^2 + e_2^T(\tilde{H} + \delta) + \tilde{W}^T\big(\Phi(x_d,\dot{q})e_2 - \Gamma^{-1}\dot{\hat{W}}\big). \tag{21}$$

From the result of the projection modification in Farrell and Polycarpou (2006), if Ŵ(0) ∈ Ωw, then the adaptive law (19) guarantees Ŵ(t) ∈ Ωw, ∀t ≥ 0 and W̃^T(Φ(xd, q̇)e2 − Γ^{-1}Ŵ̇) ≤ 0. Applying the above second result to (21) leads to

$$\dot{V} \le -\sum_{i=1}^{2}(\lambda_{\min}(K_i) - 1/2)\|e_i\|^2 + e_2^T(\tilde{H} + \delta). \tag{22}$$

Secondly, consider the case that the perturbation δ in (17) is absent. Noting (9), one obtains

$$e_2^T\tilde{H} \le \|e_2\|\,\|e\|\rho(\|e\|) \le \|e\|^2\rho(\|e\|). \tag{23}$$

Applying (23) and δ = 0 to (22) yields V̇ ≤ −(ks − ρ(∥e∥))∥e∥² with ks := min_{i=1,2}{λmin(Ki) − 1/2} ∈ R^+, resulting in

$$\dot{V} \le -\lambda_3\|e\|^2, \quad \forall k_s > \rho(\|e\|)$$

with λ3 := ks − ρ(∥e∥) ∈ R^+, which is equivalent to

$$\dot{V} \le -\lambda_3\|e\|^2, \quad \forall e \in \Omega_{ea}$$

with Ωea := {e | ∥e∥ < ρ^{-1}(ks)}. Noting Property 1, define Dz := {z | ∥z∥ < cz}, λ1 := min{1/2, m0/2, λmin(Γ^{-1})/2} ∈ R^+ and λ2 := max_{z∈Dz}{1/2, m̄(q)/2, λmax(Γ^{-1})/2} ∈ R^+, where cz := ((ρ^{-1}(ks))² + 4cw²)^{1/2}. Then, one obtains

$$\alpha_1(\|z\|) \le V(z) \le \alpha_2(\|z\|), \qquad \dot{V}(z) \le -U_a(e), \quad \forall z \in D_z \tag{24}$$

with α1(∥z∥) := λ1∥z∥², α2(∥z∥) := λ2∥z∥² and Ua(e) := λ3∥e∥², which implies V(z) ∈ L∞ in Dz such that e, W̃ ∈ L∞ in Dz.


Combining with Properties 3, 6 and Φh, Φf ∈ L∞, one knows that all terms in (17) except ė2 are uniformly bounded in Dz. In addition, it follows from Property 1 that M(q) is lower bounded. Thus, one gets ė2 ∈ L∞ in Dz such that ė1 ∈ L∞ in Dz, implying e, ė ∈ L∞ in Dz. Using e, ė ∈ L∞ in Dz and the definition of Ua(e), one obtains U̇a(e) ∈ L∞ in Dz, which is a sufficient condition for Ua(e) being uniformly continuous in Dz. Now, since V given by (20) satisfies (24), ∀z ∈ Dz, Ua(e) is uniformly continuous positive-semidefinite, and α1(∥z∥) and α2(∥z∥) are continuous positive-definite, the Invariance-like Theorem (Xian et al., 2004, Lemma 2) is invoked to conclude that

$$\lim_{t\to\infty} U_a(e) = 0, \quad \forall z(0) \in \mathcal{S}$$

in which the domain of attraction is estimated by

$$\mathcal{S} := \big\{z \in D_z \mid \alpha_2(\|z\|) \le \lambda_1\big((\rho^{-1}(k_s))^2 + 4c_w^2\big)\big\}. \tag{25}$$

Noting the definition of Ua(e), one gets lim_{t→∞} ∥e(t)∥ = 0, ∀z(0) ∈ S. Moreover, because S in (25) can be arbitrarily enlarged by the increase of ks related to K1 and K2 to include any initial states q(0) and q̇(0), the asymptotic stability result is semiglobal.

Thirdly, consider the case that δ in (17) is present. Applying (23) and ∥δ∥ ≤ δ̄ to (22) and noting ∥e2∥ ≤ ∥e∥, one gets

$$\dot{V} \le -(k_s - \rho(\|e\|))\|e\|^2 + \bar{\delta}\|e\|. \tag{26}$$

Thus, if ks > ρ(∥e∥) such that e ∈ Ωea, then V̇ remains negatively definite until ∥e∥ ≤ δ̄/(ks − ρ(∥e∥)). As the bound δ̄ is obtained on xd ∈ Ωd and q̇ ∈ Ωq, and only xd ∈ Ωd is guaranteed a priori before control, (26) is valid only on q̇ ∈ Ωq. Let Ωeb := {e | e ∈ Ωea, q̇ ∈ Ωq}. Hence, for any given Ωq, there exists a suitably large ks* ∈ R^+ such that Ωea = Ωeb, ∀ks ≥ ks*, which implies that the condition q̇ ∈ Ωq is included in the condition e ∈ Ωea as long as ks ≥ ks*. Consequently, (26) holds, ∀ks ≥ ks*.

According to the above analysis, under ks > ρ(∥e∥), i.e. e ∈ Ωea, one has that V given by (20) satisfies

$$\alpha_1(\|z\|) \le V(z) \le \alpha_2(\|z\|), \qquad \dot{V}(z) \le -U_b(e) \ \ \text{for } \|e\| \ge \bar{\delta}/(k_s - \rho(\|e\|)), \quad \forall z \in D_z \tag{27}$$

where Ub(e) := (ks − ρ(∥e∥))∥e∥² + δ̄∥e∥ is continuous positive-definite. Define a domain Dµ := {z | ∥z∥ ≤ µ} with µ := (δ̄²/(ks − ρ(∥e∥))² + 4cw²)^{1/2}. Let S in (25) be an estimated domain of attraction. According to the Uniform Ultimate Boundedness (UUB) Theorem (Khalil, 2015, Th. 4.4), a necessary condition of closed-loop stability is that µ satisfies µ < α2^{-1}(α1(cz)) = (λ1/λ2)^{1/2}((ρ^{-1}(ks))² + 4cw²)^{1/2}, implying Dµ ⊂ Dz. It is derived that

$$\begin{aligned}
\sqrt{\lambda_2/\lambda_1}\,\bar{\delta} < \rho^{-1}(k_s)(k_s - \rho(\|e\|))
&\Leftrightarrow \bar{\delta}/(k_s - \rho(\|e\|)) < \sqrt{\lambda_1/\lambda_2}\,\rho^{-1}(k_s)\\
&\Leftrightarrow \bar{\delta}^2/(k_s - \rho(\|e\|))^2 < \lambda_1(\rho^{-1}(k_s))^2/\lambda_2\\
&\Leftrightarrow \bar{\delta}^2/(k_s - \rho(\|e\|))^2 + 4c_w^2 < (\lambda_1/\lambda_2)\big((\rho^{-1}(k_s))^2 + 4c_w^2\big)\\
&\Leftrightarrow \mu < \sqrt{\lambda_1/\lambda_2}\,c_z \ \Leftrightarrow\ \mu < \alpha_2^{-1}(\alpha_1(c_z))
\end{aligned}$$

where the definitions of α1, α2, cz and µ are applied in the above derivation. Using the strictly increasing property of ρ, if z ∈ Dz such that ks > ρ(∥e∥), then there must exist a sufficiently large ks** ∈ R^+ to guarantee (λ2/λ1)^{1/2} δ̄ < ρ^{-1}(ks)(ks − ρ(∥e∥)) as long as ks ≥ ks**, resulting in µ < α2^{-1}(α1(cz)).

Combining with the results of the previous two paragraphs, if the selection of K1 and K2 follows ks ≥ max{ks*, ks**}, then (27) holds on z ∈ Dz and µ satisfies µ < α2^{-1}(α1(cz)), where α1(∥z∥), α2(∥z∥) and Ub(e) are continuous positive-definite. Accordingly, the UUB Theorem (Khalil, 2015, Th. 4.4) is invoked to state that the domain S is positively invariant and there exists a class KL function³ β such that for every z(0) ∈ S in (25), the solution of the closed-loop system composed of (2) and (17)–(19) satisfies

$$\|z(t)\| \le \max\{\beta(\|z(0)\|, t), \alpha_1^{-1}(\alpha_2(\mu))\}, \quad \forall t \ge 0 \tag{28}$$

which implies that z(t) is uniformly bounded, ∀t ≥ 0 and is UUB with an ultimate bound α1^{-1}(α2(µ)) = (λ2/λ1)^{1/2} µ. Thus, the closed-loop system achieves UUB stability such that z(t) remains within Dz, ∀t ≥ 0 and converges to Dµ as t → ∞, ∀z(0) ∈ S. Noting the strictly increasing property of ρ, one gets that e(t) converges to a domain of error convergence as follows:

$$\Omega_c := \{e \mid \|e\| \le \bar{\delta}/(k_s - \rho_0)\} \tag{29}$$

with ρ0 := min_{∥e∥=δ̄/ks} ρ(∥e∥). Furthermore, since S in (25) can be arbitrarily enlarged and Ωc in (29) can be arbitrarily diminished, both by the increase of ks related to K1 and K2, the stability result is semiglobally and practically asymptotic. □

³ The definition of a KL function can be referred to Khalil (2015, Definition 4.1).

Remark 2. A structural comparison between the traditional FB-AAC and the proposed HFF-AAC is demonstrated in Fig. 1, where the difference is that the former includes an additional feedback (dashed) line of e. It is worth noting that all RBF-NN inputs xd of HFF-AAC are known before control, whereas the 2n additional RBF-NN inputs e of FB-AAC are unknown before control. This slight difference leads to several attractive features of HFF-AAC as follows: (1) the domain of NN approximation Ωd can be determined a priori by the desired output qd; (2) the number of RBF-NN inputs can be reduced from 5n to 3n, resulting in a sharp decrease of computational cost; (3) semiglobal practical asymptotic stability can be guaranteed by increasing the control gains K1 and K2, as shown in Theorem 1.

Fig. 1. Schematic diagrams of HFF-AAC and FB-AAC.

4. Discussions

4.1. Comparison to existing FB-AACs

For the traditional FB-AAC of (1), a RBF-NN Ĥ(xe|Ŵe) in the form of (10) is applied to approximate (H(xd, e) + F(q̇)) in (2), which leads to a control law as follows (Lewis, 1996):

$$\tau = K_2 e_2 + \hat{H}(x_e|\hat{W}_e) \tag{30}$$

with xe := col(xd, e) ∈ R^{5n} and Ŵe ∈ R^{m^{5n}×n}, where m is the number of activation functions for each input.


Table 1
Complexity comparisons of FB-AACs and HFF-AACs.

Controller type   Nd                              Case 1     Case 2       Case 3        Case 4
FB-AAC (30)       m^{5n} n                        118,098    2,097,152    43,046,721    3,221,200,000
FB-AAC (31)       m^{4n} n                        13,122     131,072      1,594,323     50,331,648
FB-AAC (32)       (m^n + m^{3n})n^2 + 2m^n n      2,988      16,512       177,552       2,360,256
FB-AAC (33)       (m^{2n} + m^{3n})n              1,620      8,704        61,236        798,720
FB-AAC (34)       (m^{2n} + m^{3n} + L)n          1,630      8,714        61,251        798,735
HFF-AAC (14)      (m^{3n} + L)n                   1,468      8,202        59,064        786,447
HFF-AAC (35)      (2m^{2n} + L)n                  334        1,029        4,389         24,591
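The Nd formulas in Table 1 can be checked with a short script. This is a sketch only; the formulas are copied from the Nd column, the case settings (n, m, L) from Section 4.1, and the function name is illustrative.

```python
def node_counts(n, m, L):
    """Total NN node counts Nd for the control laws compared in Table 1."""
    return {
        "FB-AAC (30)":  m**(5*n) * n,
        "FB-AAC (31)":  m**(4*n) * n,
        "FB-AAC (32)":  (m**n + m**(3*n)) * n**2 + 2 * m**n * n,
        "FB-AAC (33)":  (m**(2*n) + m**(3*n)) * n,
        "FB-AAC (34)":  (m**(2*n) + m**(3*n) + L) * n,
        "HFF-AAC (14)": (m**(3*n) + L) * n,
        "HFF-AAC (35)": (2 * m**(2*n) + L) * n,
    }

print(node_counts(n=2, m=3, L=5))   # Case 1: e.g. HFF-AAC (35) -> 334
```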

ˆ of (30) By the definition of v, the number of RBF-NN inputs in H can be reduced from 5n to 4n, which leads to ˆ v) τ = K2 e2 + Hˆ (xv |W

(31)

ˆ v ∈ Rm4n ×n . with xv := col(q, q˙ , v , v˙ ) ∈ R and W 4n

ˆ (q|W ˆ m ), To reduce the total number of NN nodes, sub-NNs M ˆC (q, q˙ |W ˆ c ), Gˆ (q|W ˆ g ) and Fˆ (˙q|W ˆ f ) in the form of (10) are applied to approximate M (q), C (q, q˙ ), G(q) and F (˙q) in (2), respectively resulting in a partitioned counterpart of (31) as follows (Lewis, 1996):

Table 2 Complexity comparisons of various HFF-AACs. Numbersa

Control type

Nc HFF-AAC in Chiu (2006) HFF-AAC in Chen et al. (2012) HFF-AAC (14) HFF-AAC (35)

3 3 2 2

Na 3 2 1 1

Nd

Nd 2n

2m n 2m2n n + 1 m3n n + L 2m2n n + L

≥15 ≥10 3 3

a Nc : the number of control terms; Na : the number of adaptive laws; Nd : the number of design parameters.

ˆ (q|W ˆ m )˙v + Cˆ (q, q˙ |W ˆ c )v + Gˆ (q|W ˆ g ) + Fˆ (˙q|W ˆf) τ = K2 e2 + M (32) mn ×n2

m3n ×n2

mn ×n

ˆc ∈ R ˆ g, W ˆf ∈ R ,W , and W . Applying ˆ ˆ ˆ ˆ sub-NNs H (xv 1 |Wv 1 ) and H (xv 2 |Wv 2 ) in the form of (10) to approximate H1 (xv 1 ) and (H2 (xv 2 )+ F (˙q)) in (5), respectively, one obtains ˆm ∈ R with W

another partitioned counterpart of (31) as follows:

ˆ v1 ) + Hˆ 2 (xv2 |W ˆ v2 ) τ = K2 e2 + Hˆ 1 (xv1 |W

(33)

m2n ×n

ˆ v1 , W ˆ v2 ∈ R in which W . To overcome the limitations of the RBF-NN in (10) for the approximation of discontinuous functions, the SJF-NN in (12) is introduced to approximate F (˙q) in (2) separately, which results in the following control law (Selmic & Lewis, 2002): ˆ v1 ) + Hˆ 2 (xv2 |W ˆ v2 ) + Fˆ (˙q|W ˆ f ). τ = K2 e2 + Hˆ 1 (xv1 |W

(34)

ˆ (xd1 |W ˆ 1 ) and Hˆ (xd2 |W ˆ 2 ) in the form Now, applying sub-NNs H

ˆ 1, W ˆ 2 ∈ Rm2n ×n to approximate H1 (xd1 ) and Hd2 (xd2 ) of (10) with W

system, a PD control term and an adaptive bounding term, with three adaptive laws and over 10 design parameters. Yet, the proposed control law (14) or (35) only has two control terms, i.e. a NN4 and a PD control term, with one adaptive law (19) and three design parameters, including the control gains K1 and K2 and the matrix of learning rates Γ . Complexity comparisons of various HFF-AAC laws are given in Table 2. It is observed that the proposed HFFAAC laws (14) and (35) have simpler control structures and much less design parameters than the HFF-AAC laws of Chen et al. (2012) and Chiu (2006). Therefore, the proposed approach resolves the challenges of the traditional FB-AAC design using a much simpler control scheme, which guarantees a minimum configuration of the control structure for the AAC design. On the other hand, the stability results in Chiu (2006) are based ˜ in (8) is bounded by (Chiu, 2006, A.1) on an assumption that H

˜ ≤ eT Ψ e + eT2 H

in (6), respectively leads to

ˆ 1 ) + Hˆ (xd2 |W ˆ 2 ) + Fˆ (˙q|W ˆ f ). τ = K2 e2 + Hˆ (xd1 |W

p 

ψk ∥e1 ∥2k ∥e2 ∥2

(36)

k=1

(35)

In summary, the traditional FB-AAC implements the NNs in the forms of (30)–(34), whereas the proposed HFF-AAC implements the NNs in the forms of (14) and (35). Complexity comparisons of all mentioned control laws with respect to the total number of NN nodes Nd are given in Table 1, where n = 2 and m = 3 for Case 1, n = 2 and m = 4 for Case 2, n = m = 3 for Case 3, n = 3 and m = 4 for Case 4, and L = 5 for all Cases. It is shown that the partitioned approximation is useful for reducing Nd , and the proposed HFF-AAC law (35) always has much less Nd than the FBAAC laws (30)–(34). Therefore, the proposed HFF-AAC law (35) is useful for the reduction of computational cost, and its advantage is much clear while the number of joints n and/or the number of activation functions m become high. For instance, in Case 4, the Nd of the simplest FB-AAC law (33) is more than 32 times larger than that of the proposed HFF-AAC law (35). 4.2. Comparison to existing HFF-AACs Two typical HFF-AAC approaches can be found in Chen et al. (2012) and Chiu (2006). The control law of Chiu (2006) has three control terms, i.e. a NN, a H ∞ control term and a nonlinear damping term, with two adaptive laws and over 15 design parameters. The control law of Chen et al. (2012) has three control terms, i.e. a fuzzy

where ψk ∈ R+ are known constants, Ψ ∈ R2n×2n is a known symmetric positive-semidefinite matrix, and p is a positive integer. The stability results in Chen et al. (2012) are based on an assumption ˜ in (8) is bounded by (Chen et al., 2012, A.2) that H

∥H˜ ∥ ≤

2n 

aj ϕj (q, q˙ , qd , q˙ d )|qj − qdj | + b

(37)

j =1

with col(q, q˙ ) := [q1 , q2 , . . . , q2n ] and col(qd , q˙ d ) := [qd1 , qd2 , . . . , qd2n ], where ϕj : R4n → R are known continuous functions, and aj , b ∈ R+ are unknown constants. Note that some symbols in (36) and (37) are slightly modified such that the assumptions are comparable. It is obvious that the above bounds are difficult to be obtained for an unknown H in (4), which results in a dilemma that exact analytical formulas of H are applied to derive the bounds of (36) and (37) in illustrative examples of Chen et al. (2012) and Chiu (2006). On the contrary, the proposed HFF-AAC does not need any boundary condition, which guarantees a minimum requirement of plant knowledge for the AAC design.

4 For (14), two NNs H ˆ (xd |W ˆ h ) and Fˆ (˙q|W ˆ f ) can be integrated into one as in (17) and only one adaptive law (19) is needed. The same can be done for (35).
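To make the structural comparison above concrete, the following Python sketch implements one step of the proposed control law (14) together with the projection-based adaptive law (19), with the RBF and SJF regressors stacked into a single block regressor as suggested after (17). The integration scheme (explicit Euler), the array shapes and the helper names are assumptions for illustration, not part of the paper.

```python
import numpy as np

def projection(W_hat, Y, c_w):
    """Sketch of the projection operator P(W_hat, Y) used in the adaptive law (19)."""
    norm = np.linalg.norm(W_hat)
    inner = float(W_hat @ Y)
    if norm < c_w or inner <= 0.0:
        return Y                              # inside the ball, or update points inward
    return Y - W_hat * inner / norm**2        # remove the outward component

def hff_aac_step(e2, Phi, W_hat, K2, Gamma, c_w, dt):
    """One control/adaptation step of the HFF-AAC law (14) with adaptive law (19).

    Phi stacks the per-joint regressors col(Phi_h(xd), Phi_f(dq_i)) block-diagonally,
    so that Phi.T @ W_hat reproduces H_hat + F_hat and tau = K2 e2 + Phi.T @ W_hat.
    """
    tau = K2 @ e2 + Phi.T @ W_hat                       # control law (14)
    W_dot = projection(W_hat, Gamma @ Phi @ e2, c_w)    # adaptive law (19)
    W_hat = W_hat + dt * W_dot                          # explicit Euler update (assumption)
    return tau, W_hat
```

The design choice mirrors footnote 4: with the two NNs merged into one regressor, a single weight vector and a single adaptive law suffice, leaving K1, K2 and Γ as the only design parameters.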


Table 3
Performance index comparisons for various controllers.

Example   Case   Controller type   IAE1     IAE2    Ec(τ1)       Ec(τ2)        Tr
A         1      FB-AAC (33)       0.9113   1.422   1.377        3.421×10⁻⁴    193
A         1      HFF-AAC (35)      0.6407   1.303   1.392        3.763×10⁻⁴    82
A         2      FB-AAC (33)       1.456    2.088   1.378        3.530×10⁻⁴    191
A         2      HFF-AAC (35)      1.222    2.033   1.393        3.871×10⁻⁴    85
B         1      FB-AAC (33)       3.444    2.211   5.525×10⁴    4.457×10⁴     259
B         1      FB-AAC (34)       1.705    1.376   5.436×10⁴    4.565×10⁴     220
B         1      HFF-AAC (35)      1.621    1.345   5.421×10⁴    4.597×10⁴     33
B         2      FB-AAC (33)       3.448    2.225   5.523×10⁴    4.456×10⁴     261
B         2      FB-AAC (34)       1.725    1.432   5.438×10⁴    4.563×10⁴     223
B         2      HFF-AAC (35)      1.636    1.429   5.417×10⁴    4.598×10⁴     35
B         3      FB-AAC (33)       3.690    2.453   5.495×10⁴    4.541×10⁴     264
B         3      FB-AAC (34)       1.918    1.557   5.410×10⁴    4.685×10⁴     226
B         3      HFF-AAC (35)      1.816    1.528   5.391×10⁴    4.710×10⁴     36

Note: IAE1 and IAE2: the integral absolute errors with respect to links 1 and 2, respectively; Ec(τ): the energy of control torque τ; Tr: the real running time (s).
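The performance indexes in Table 3 can be computed from logged trajectories roughly as follows. This is a sketch under stated assumptions: uniform sampling is assumed, and the control-energy definition Ec(τi) = ∫τi² dt is my assumption for "energy of control torque", since the paper does not spell out the formula.

```python
import numpy as np

def performance_indexes(t, e, tau):
    """IAE per joint and control energy per joint from uniformly sampled signals.

    t: (T,) time stamps; e: (T, n) tracking errors; tau: (T, n) control torques.
    IAE_i ~ sum |e_i| dt,  Ec(tau_i) ~ sum tau_i^2 dt  (rectangle-rule approximation).
    """
    dt = t[1] - t[0]                      # uniform step assumed
    iae = np.sum(np.abs(e), axis=0) * dt
    ec = np.sum(tau**2, axis=0) * dt
    return iae, ec
```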

Fig. 2. Control trajectories of the helicopter by the FB-AAC (33) in Case 1. (a) Pitch control trajectories. (b) Yaw control trajectories.

5. Application examples

5.1. Example A: Quanser helicopters

Consider a Quanser helicopter modelled by (1) with F(q̇) = τd = 0, τ = B(q)u and (Zarikian & Serrani, 2007)

$$M(q) = \begin{bmatrix} p_1 & 0 \\ 0 & p_2 + p_3\cos^2 q_1 \end{bmatrix},\qquad
C(q,\dot{q}) = \begin{bmatrix} 0 & p_3\sin(2q_1)\dot{q}_2/2 \\ -p_3\sin(2q_1)\dot{q}_2/2 & -p_3\sin(2q_1)\dot{q}_1/2 \end{bmatrix},$$

$$G(q) = \begin{bmatrix} -p_4\cos q_1 + p_5\sin q_1 \\ 0 \end{bmatrix},\qquad
B(q) = \begin{bmatrix} -p_6 & p_7 \\ -p_8\cos q_1 & -p_9\cos q_1 \end{bmatrix}$$

where q1 (rad) and q2 (rad) are pitch and yaw angles, respectively, u (V) is a voltage vector applied to the motors, and pi ∈ R^+ with i = 1, 2, ..., 9 are coefficients related to helicopter parameters. For simulations, set q(0) = [π/3, −π/4]^T, q̇(0) = [0, 0]^T, p1 = 0.0826 kg m², p2 = 0.0041 kg m², p3 = 0.0349 kg m², p4 = 0.1610 kg m²/s², p5 = 0.4080 kg m²/s², p6 = 0.0566 N m/V, p7 = 0.0054 N m/V, p8 = 0.0042 N m/V and p9 = 0.0114 N m/V (Zarikian & Serrani, 2007). The desired signals qd, q̇d and q̈d are obtained by filtering a piecewise constant input command qc through a linear filter G(s) = ω²/(s² + 2ζωs + ω²) with ω = 0.1, ζ = 1 and s being a complex variable (Zarikian & Serrani, 2007).
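For reference, the helicopter model above can be simulated with a simple sketch like the following. It is illustrative only: the explicit-Euler integrator, the function names and the placeholder controller u(t, q, dq) are assumptions, while the matrices and parameter values are those listed above.

```python
import numpy as np

p = dict(p1=0.0826, p2=0.0041, p3=0.0349, p4=0.1610, p5=0.4080,
         p6=0.0566, p7=0.0054, p8=0.0042, p9=0.0114)

def helicopter_matrices(q, dq):
    """M(q), C(q, dq), G(q), B(q) of the Quanser helicopter model given above."""
    q1 = q[0]
    M = np.array([[p['p1'], 0.0],
                  [0.0, p['p2'] + p['p3'] * np.cos(q1)**2]])
    C = np.array([[0.0, p['p3'] * np.sin(2*q1) * dq[1] / 2],
                  [-p['p3'] * np.sin(2*q1) * dq[1] / 2,
                   -p['p3'] * np.sin(2*q1) * dq[0] / 2]])
    G = np.array([-p['p4'] * np.cos(q1) + p['p5'] * np.sin(q1), 0.0])
    B = np.array([[-p['p6'], p['p7']],
                  [-p['p8'] * np.cos(q1), -p['p9'] * np.cos(q1)]])
    return M, C, G, B

def simulate(u, q0, dq0, dt=5e-3, T=60.0):
    """Integrate M(q)ddq + C(q,dq)dq + G(q) = B(q)u(t,q,dq) by explicit Euler (assumption)."""
    q, dq = q0.copy(), dq0.copy()
    for t in np.arange(0.0, T, dt):
        M, C, G, B = helicopter_matrices(q, dq)
        ddq = np.linalg.solve(M, B @ u(t, q, dq) - C @ dq - G)
        q, dq = q + dt * dq, dq + dt * ddq
    return q, dq

# Hypothetical usage with zero input and the initial state given above:
# q_end, dq_end = simulate(lambda t, q, dq: np.zeros(2),
#                          np.array([np.pi/3, -np.pi/4]), np.zeros(2))
```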

Simulations are carried out in MATLAB software running on Windows 10 and an Intel Core i7-4510U CPU, where the solver is chosen as fixed-step ode3 with a step size of 5 × 10⁻³ s and other settings left at their defaults. The Gaussian white noise applied to the measurement is generated by the SIMULINK AWGN Channel block with an SNR of 50 dB and an input signal power of 1 W. In this example, the SJF-NN F̂(q̇|Ŵf) in (12) is not applied since F(q̇) = 0, and simulations are only compared between the FB-AAC law (33) (with only a RBF-NN) and the proposed HFF-AAC law (35) without SJF-NNs. The previous HFF-AAC approaches are not compared here since the constraint conditions of their control designs are substantially different from those of the proposed HFF-AAC approach, as discussed in Sections 1 and 4.2.

The design procedure of the proposed HFF-AAC law (35) is as follows: firstly, select Gaussian RBFs μ1^j = exp(−(qd1^(k) − 0.80(j − 2))²/0.50) and μ2^j = exp(−(qd2^(k) − 2.50(j − 2))²/12.50) with j = 1 to 3 and k = 0 to 2 for the RBF-NN in (10) according to the information of qd; secondly, select K1 = diag(1, 1) and K2 = diag(0.5, 0.05) in (35); finally, choose Γh1 = 10I, Γh2 = 0.3I, cw = 10 and Ŵ(0) = 0 in (19). The design of the FB-AAC law (33) follows the same steps as above except that the feedback signals (q, q̇, v, v̇) rather than the feedforward signals xd are applied as RBF-NN inputs in (33). The total numbers of NN nodes are Nd = (3⁴ + 3⁶) × 2 = 1620 and Nd = 2 × 3⁴ × 2 = 324 for the control laws (33) and (35), respectively. For fair comparison, all shared parameters of the two control laws are set to the same values.

Case 1: Measurement noise free. Control trajectories by the control laws (33) and (35) in this case are depicted in Figs. 2 and 3,


Fig. 3. Control trajectories of the helicopter by the proposed HFF-AAC (35) in Case 1. (a) Pitch control trajectories. (b) Yaw control trajectories.

Fig. 4. Control trajectories of the helicopter by the FB-AAC (33) in Case 2. (a) Pitch control trajectories. (b) Yaw control trajectories. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)

respectively, where upd = [upd1, upd2]^T denotes the PD control input, and unn = [unn1, unn2]^T denotes the NN control input. In this example, upd = B⁻¹(q)K2e2, and unn is equal to the rest of u. It is observed that the FB-AAC law (33) achieves a favourable control performance with high tracking accuracy and reasonable control inputs (see Fig. 2), and that the proposed HFF-AAC law (35) performs at least as well as the FB-AAC law (33) using far fewer NN nodes (see Fig. 3). The feedforward NN control input unn of (35) dominates the control process and drives the helicopter without much effort from upd after only a few seconds of adaptation (see the lower subfigures of Fig. 3), and the feedback NN control input unn of (33) performs similarly to that of (35). Tracking error comparisons for this case are depicted in Fig. 6(a), where the proposed HFF-AAC law (35) demonstrates even better tracking accuracy although it uses far fewer NN nodes than the FB-AAC law (33). A possible reason is that the coverage domain of the RBF-NN in (35) can be chosen more accurately since all of its RBF-NN inputs xd are known before control, whereas the coverage domain of the RBF-NN in (33) may be more conservative since its RBF-NN inputs (q, q̇, v, v̇) are unknown before control. Performance index comparisons for this case are given in Example A of Table 3, which verify that the proposed HFF-AAC law (35) achieves at least as good tracking accuracy as the FB-AAC law

(33) under similar control energy Ec and a shorter real running time Tr, implying less computational cost.

Case 2: Noisy measurement. Control trajectories by the control laws (33) and (35) in this case are depicted in Figs. 4 and 5, respectively. Note that, due to the noisy RBF-NN inputs (q, q̇, v, v̇) of the FB-AAC law (33), the red dashed line representing unn1 in Fig. 4(a) is not clearly shown. Tracking error comparisons and performance index comparisons of this case are given in Fig. 6(b) and Example A of Table 3, respectively. It is observed that these results are quite similar to those of Case 1 except that upd exhibits chattering due to the noisy measurement.

5.2. Example B: robotic manipulators

Consider a two-link planar robotic manipulator modelled by (1) with (Peng & Woo, 2002; Selmic & Lewis, 2002):

$$M(q) = \begin{bmatrix} p_1 + p_2 + 2p_3\cos q_2 & p_2 + p_3\cos q_2 \\ p_2 + p_3\cos q_2 & p_2 \end{bmatrix},\qquad
C(q,\dot{q}) = \begin{bmatrix} -p_3\dot{q}_2\sin q_2 & -p_3(\dot{q}_1 + \dot{q}_2)\sin q_2 \\ p_3\dot{q}_1\sin q_2 & 0 \end{bmatrix},$$


Fig. 5. Control trajectories of the helicopter by the proposed HFF-AAC (35) in Case 2. (a) Pitch control trajectories. (b) Yaw control trajectories. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)

Fig. 6. Tracking error comparisons of helicopter control. (a) Tracking errors in Case 1. (b) Tracking errors in Case 2.

$$G(q) = \begin{bmatrix} p_4 g\cos q_1 + p_5 g\cos(q_1 + q_2) \\ p_5 g\cos(q_1 + q_2) \end{bmatrix},\qquad
f_i(\dot{q}_i) = \big[\alpha_{0i} + \alpha_{1i}e^{-\beta_{1i}|\dot{q}_i|} + \alpha_{2i}(1 - e^{-\beta_{2i}|\dot{q}_i|})\big]\,\mathrm{sgn}(\dot{q}_i)$$

with p1 = (m1 + m2)l1², p2 = m2l2², p3 = m2l1l2, p4 = (m1 + m2)l1, p5 = m2l2, where mi (kg) and li (m) are the masses and lengths of the ith links, respectively, g (m/s²) is the gravitational acceleration, α0i, α1i, α2i, β1i, β2i ∈ R^+ are friction coefficients, and i = 1, 2. For simulation, set q(0) = [π/0, 0]^T, q̇(0) = [0, 0]^T, p1 = 2.90, p2 = 0.76, p3 = 0.87, p4 = 3.04, p5 = 0.87, g = 9.8, α01 = 35, α11 = 1.1, α21 = 0.9, β11 = 50, β21 = 55, α02 = 38, α12 = 1, α22 = 0.95, β12 = 55, and β22 = 60 (Selmic & Lewis, 2002). The control objective is to make q(t) track the desired output qd(t) = [(π/4) sin(πt), (π/4) cos(πt)]^T.

The simulation environment is set to be the same as in Example A except that the step size is 2 × 10⁻³ s in this example. Simulations are compared between the FB-AAC law (33) (with only a RBF-NN), the FB-AAC law (34) (with a RBF-NN and a SJF-NN) and the proposed HFF-AAC law (35) (with a RBF-NN and a SJF-NN).

The design procedure of the proposed HFF-AAC law (35) is as follows: firstly, select Gaussian RBFs μi^j = exp(−(qdi − (π/4)(j − 2))²/0.48), μ(i+2)^j = exp(−(q̇di − (π²/4)(j − 2))²/4.76) and μ(i+4)^j = exp(−(q̈di − (π³/4)(j − 2))²/46.94) with i = 1, 2 and j = 1, 2, 3 for the RBF-NN in (10) according to the information of qd; secondly, let L = 5 for the SJF-NN in (12); thirdly, let K1 = 5I and K2 = 50I in (35); finally, choose Γhi = Γfi = 5000I (i = 1, 2), cw = 1000

ˆ = 0 in (19). Therefore, the total numbers of NN nodes are and W Nd = (34 + 36 ) × 2 = 1620, Nd = (34 + 36 + 5) × 2 = 1630 and Nd = (2 × 34 + 5) × 2 = 334 for the control laws (33)– (35), respectively. For fair comparisons, the shared parameters of all control laws are set to be the same values. Case 1: Measurement noise free with τ d (t ) = [10 sin t , 10 sin t ]T . Control trajectories of this case by the control laws (33)–(35) are depicted in Figs. 7–9, respectively, in which τ pd = [τpd1 , τpd2 ]T denotes the PD control torque, and τ nn = [τnn1 , τnn2 ]T denotes the NN control torque. In this example, τ pd = K2 e2 , and τ nn is equal to the rest of τ . The observations are as follows: (1) the FB-AAC law (33) generally does not guarantee high-accuracy tracking under the discontinuous friction F (˙q) although the RBF-NN has certain capability to handle discontinuous uncertainties (see Fig. 7); (2) the FB-AAC law (34) achieves much higher tracking accuracy


Fig. 7. Control trajectories of the manipulator by the FB-AAC (33) in Case 1. (a) Link 1 control trajectories. (b) Link 2 control trajectories.

Fig. 8. Control trajectories of the manipulator by the FB-AAC (34) in Case 1. (a) Link 1 control trajectories. (b) Link 2 control trajectories.

Fig. 9. Control trajectories of the manipulator by the proposed HFF-AAC (35) in Case 1. (a) Link 1 control trajectories. (b) Link 2 control trajectories.

than the FB-AAC law (33) due to the effect of the SJF-NN applied in (34) (see Fig. 8); (3) the proposed HFF-AAC law (35) performs similarly to or even better than the FB-AAC law (34) using far fewer NN nodes (see Fig. 9); (4) the learned feedforward NN control torque τnn of (35) dominates the control process after only a short adaptation time (see the lower subfigures of Fig. 9). Tracking


Fig. 10. Control trajectories of the manipulator by the FB-AAC (33) in Case 2. (a) Link 1 control trajectories. (b) Link 2 control trajectories.

Fig. 11. Control trajectories of the manipulator by the FB-AAC (34) in Case 2. (a) Link 1 control trajectories. (b) Link 2 control trajectories.

Fig. 12. Control trajectories of the manipulator by the proposed HFF-AAC (35) in Case 2. (a) Link 1 control trajectories. (b) Link 2 control trajectories.

accuracy comparisons and performance index comparisons in this case are given in Fig. 13(a) and Example B of Table 3, respectively. All these results further verify the above observations.

Case 2: Noisy measurement with τd(t) = [10 sin t, 10 sin t]^T. Control trajectories of this case by the control laws (33)–(35) are depicted in Figs. 10–12, respectively, and tracking accuracy


comparisons and performance index comparisons of this case are provided in Fig. 13(b) and Example B of Table 3, respectively, where the qualitative analysis of these results is the same as that of Case 1. Specifically, in this case the proposed HFF-AAC law (35) only spends less than 16% of the time of the other two FB-AAC laws to finish the same simulation trial.

Case 3: Noisy measurement with τd(t) = [10 sgn(sin t), 10 sgn(sin t)]^T. This case is used to verify the robustness of the proposed approach against a step-changing τd. Control trajectories of this case are very similar to those of Case 2, so they are omitted here to save space. However, tracking accuracy comparisons and performance index comparisons of this case are given in Fig. 13(c) and Example B of Table 3, respectively, where the qualitative analysis of these results is the same as that of Case 1.

Fig. 13. Tracking error comparisons of manipulator control. (a) Tracking errors in Case 1. (b) Tracking errors in Case 2. (c) Tracking errors in Case 3.

6. Conclusions

This paper has presented an efficient HFF-AAC strategy for a class of uncertain Euler–Lagrange systems and has successfully established semiglobal practical asymptotic stability of the closed-loop system. The significance of the proposed approach is its simplicity and efficiency, which results in a sharp decrease of implementation cost in terms of hardware selection, algorithm realization and system debugging. The power of the proposed approach is exactly "a minor difference makes significant benefits". Illustrative examples have demonstrated that the proposed HFF-AAC can perform as well as or even better than the traditional FB-AAC under much simpler control synthesis and much lower computational cost. Further work would focus on NN learning control under the HFF structure.

Acknowledgements

This work was supported in part by the Biomedical Engineering Programme, Agency for Science, Technology and Research (A*STAR), Singapore under Grant No. 1421480015, and in part by the Defense Innovative Research Programme, MINDEF of Singapore under Grant No. MINDEF-NUS-DIRP/2012/02.

References

Castaneda, C. E., & Esquivel, P. (2012). Decentralized neural identifier and control for nonlinear systems based on extended Kalman filter. Neural Networks, 31, 81–87. Chemachema, M. (2012). Output feedback direct adaptive neural network control for uncertain SISO nonlinear systems using a fuzzy estimator of the control error. Neural Networks, 36, 25–34. Chen, W. S., Jiao, L., & Wu, J. (2012). Globally stable adaptive robust tracking control using RBF neural networks as feedforward compensators. Neural Computing and Applications, 21(2), 351–363. Chiu, C. S. (2006). Mixed feedforward/feedback based adaptive fuzzy control for a class of MIMO nonlinear systems. IEEE Transactions on Fuzzy Systems, 14(6), 716–727. de Queiroz, M. S., Jun, H., Dawson, D. M., Burg, T., & Donepudi, S. R. (1997). Adaptive position/force control of robot manipulators without velocity measurements: theory and experimentation. IEEE Transactions on Systems, Man and Cybernetics, Part B (Cybernetics), 27(5), 796–809. Fairbank, M., Li, S. H., Fu, X. G., Alonso, E., & Wunsch, D. (2014). An adaptive recurrent neural-network controller using a stabilization matrix and predictive inputs to solve a tracking problem under disturbances. Neural Networks, 49, 74–86. Farrell, J. A., & Polycarpou, M. M. (2006). Adaptive approximation based control: unifying neural, fuzzy and traditional adaptive approximation approaches. Hoboken, NJ, USA: Wiley. Hamdy, M., & el-Ghazaly, G. (2014). Adaptive neural decentralized control for strict feedback nonlinear interconnected systems via backstepping. Neural Computing and Applications, 24(2), 259–269. Herzallah, R. (2013). Probabilistic DHP adaptive critic for nonlinear stochastic control systems. Neural Networks, 42, 74–82. Ishihara, A. K., van Doornik, J., & Ben-Menahem, S. (2011). Control of robots using radial basis function neural networks with dead-zone. International Journal of Adaptive Control and Signal Processing, 25(7), 613–638. Khalil, H. K. (2015). Nonlinear control. Upper Saddle River, NJ, USA: Prentice Hall. Khemaissia, S., & Morris, A. (1998). Use of an artificial neuroadaptive robot model to describe adaptive and learning motor mechanisms in the central nervous system. IEEE Transactions on Systems, Man and Cybernetics, Part B (Cybernetics), 28(3), 404–416.


Kostarigka, A. K., & Rovithakis, G. A. (2012). Adaptive dynamic output feedback neural network control of uncertain MIMO nonlinear systems with prescribed performance. IEEE Transactions on Neural Networks and Learning Systems, 23(1), 138–149. Kruger, T., Schnetter, P., Placzek, R., & Vorsmann, P. (2012). Fault-tolerant nonlinear adaptive flight control using sliding mode online learning. Neural Networks, 32, 267–274. Lam, T., Anderschitz, M., & Dietz, V. (2006). Contribution of feedback and feedforward strategies to locomotor adaptations. Journal of Neurophysiology, 95(2), 766–773. Lee, S. H., & Terzopoulos, D. (2006). Heads up! Biomechanical modeling and neuromuscular control of the neck. ACM Transactions on Graphics, 25(3), 1188–1198. Lewis, F. L. (1996). Neural network control of robot manipulators. IEEE Expert, 11(3), 64–75. Liu, Y. J., & Tong, S. C. (2014). Adaptive fuzzy control for a class of nonlinear discrete-time systems with backlash. IEEE Transactions on Fuzzy Systems, 22(5), 1359–1365. Liu, Y. J., & Tong, S. C. (2015). Adaptive neural network tracking control of uncertain nonlinear discrete-time systems with nonaffine dead-zone input. IEEE Transactions on Cybernetics, 45(3), 497–505. Liu, Y. J., Tong, S. C., & Chen, C. (2013). Adaptive fuzzy control via observer design for uncertain nonlinear systems with unmodeled dynamics. IEEE Transactions on Fuzzy Systems, 21(2), 275–288. Makkar, C., Hu, G., Sawyer, W. G., & Dixon, W. E. (2007). Lyapunov-based tracking control in the presence of uncertain nonlinear parameterizable friction. IEEE Transactions on Automatic Control, 52(10), 1988–1994. Ortega, R., Perez, J. A. L., Nicklasson, P. J., & Sira-Ramirez, H. (1998). Passivity-based control of Euler–Lagrange systems: mechanical, electrical and electromechanical applications. London, UK: Springer. Pan, Y. P., & Er, M. J. (2013). Enhanced adaptive fuzzy control with optimal approximation error convergence. IEEE Transactions on Fuzzy Systems, 21(6), 1123–1132. Pan, Y. P., Er, M. J., Huang, D. P., & Sun, T. R. (2012). Practical adaptive fuzzy H∞ tracking control of uncertain nonlinear systems. International Journal of Fuzzy Systems, 14(4), 463–473. Pan, Y. P., Er, M. J., Huang, D. P., & Wang, Q. R. (2011). Adaptive fuzzy control with guaranteed convergence of optimal approximation error. IEEE Transactions on Fuzzy Systems, 19(5), 807–818.

Pan, Y.P., & Yu, H.Y. (2014). Biomimetic hybrid feedback feedforword adaptive neural control of robotic arms. In Proc. IEEE symposium on computational intelligence in control and automation, Orlando, FL, USA (pp. 1–7). Pan, Y. P., Yu, H. Y., & Er, M. J. (2014). Adaptive neural PD control with semiglobal asymptotic stabilization guarantee. IEEE Transactions on Neural Networks and Learning Systems, 25(12), 2264–2274. Pan, Y. P., Zhou, Y., Sun, T. R., & Er, M. J. (2013). Composite adaptive fuzzy H∞ tracking control of uncertain nonlinear systems. Neurocomputing, 99, 15–24. Peng, L., & Woo, P. Y. (2002). Neural-fuzzy control system for robotic manipulators. IEEE Control Systems Magazine, 22(1), 53–63. Ren, B., Ge, S. S., Tee, K. P., & Lee, T. H. (2010). Adaptive neural control for output feedback nonlinear systems using a barrier Lyapunov function. IEEE Transactions on Neural Networks, 21(8), 1339–1345. Seidler, R. D., Noll, D. C., & Thiers, G. (2004). Feedforward and feedback processes in motor control. Neuroimage, 22(4), 1775–1783. Selmic, R. R., & Lewis, F. L. (2000). Deadzone compensation in motion control systems using neural networks. IEEE Transactions on Automatic Control, 45, 602–613. Selmic, R. R., & Lewis, F. L. (2002). Neural-network approximation of piecewise continuous functions: Application to friction compensation. IEEE Transactions on Automatic Control, 13, 745–751. Wagner, M. J., & Smith, M. A. (2008). Shared internal models for feedforward and feedback control. Journal of Neuroscience, 28(42), 10663–10673. Xian, B., Dawson, D. M., de Queiroz, M. S., & Chen, J. (2004). A continuous asymptotic tracking control strategy for uncertain nonlinear systems. IEEE Transactions on Automatic Control, 49(7), 1206–1211. Yousef, H. A., Hamdy, M., & Shafiq, M. (2013). Flatness-based adaptive fuzzy output tracking excitation control for power system generators. Journal of the Franklin Institute, 350(8), 2334–2353. Zarikian, G., & Serrani, A. (2007). Harmonic disturbance rejection in tracking control of Euler–Lagrange systems: An external model approach. IEEE Transactions on Control Systems Technology, 15(1), 118–129. Zou, A. M., Kumar, K. D., & Hou, Z. G. (2010). Quaternion-based adaptive output feedback attitude control of spacecraft using Chebyshev neural networks. IEEE Transactions on Neural Networks, 21(9), 1457–1471. Zou, A. M., Kumar, K. D., & Hou, Z. G. (2013). Distributed consensus control for multiagent systems using terminal sliding mode and Chebyshev neural networks. International Journal of Robust and Nonlinear Control, 23(3), 334–357.