Neural Network based Adaptive Algorithms for Nonlinear Control

A Thesis Presented to The Academic Faculty of The School of Aerospace Engineering

by

Flavio Nardi

In Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy in Aerospace Engineering

Georgia Institute of Technology November 2000

Neural Network based Adaptive Algorithms for Nonlinear Control

Approved:

Anthony J. Calise, Chairman

Wassim M. Haddad

J.V.R. Prasad

Nader Sadegh

Panagiotis Tsiotras

Date Approved:

To all my mentors, "Per Aspera Ad Astra"


Contents

List of Figures   vii
Acknowledgements   ix
Summary   xi

1 Introduction   1
  1.1 Nonlinear Adaptive Control: Motivation and Overview   1
  1.2 Brief Outline of the Dissertation   7

2 Mathematical Preliminaries   9
  2.1 Geometric Nonlinear Control   9
  2.2 Stability and Boundedness   11
  2.3 NN Models   12
    2.3.1 Single Hidden Layer NN   12
    2.3.2 Radial Basis Function NN   13
    2.3.3 Functional Link Perceptron NN   15
  2.4 NN Universal Approximation Property   16

3 Direct Adaptive Full State Feedback Control   19
  3.1 Dynamic Inversion for Non-Affine Systems   20
  3.2 SHL NN Control Augmentation   26
  3.3 Comments   29
  3.4 Simulation Results   30
  3.5 Extension to an Alternative Feedback Inversion Control Law   35
  3.6 Full Adaptive RBF NN Control Augmentation   38
  3.7 Additional Comments   42
  3.8 Simulation Results   43

4 Observer Based Adaptive Output Feedback Control   47
  4.1 Problem Formulation   49
  4.2 NN based adaptive observer   51
  4.3 NN based adaptive controller   55
  4.4 Simulation Results   59

5 Output Feedback Control with a Linear Observer for the Error Dynamics   65
  5.1 Introduction   65
  5.2 Problem Formulation   67
  5.3 Controller Design   68
  5.4 Design and Analysis of an Observer for the Error Dynamics   71
  5.5 Neural Network Approximation of the Inversion Error   72
  5.6 Stability Analysis   73
  5.7 Comments   74
  5.8 Simulation Results   75

6 Decentralized State Feedback Control   79
  6.1 Introduction   79
  6.2 Problem Formulation and Derivation of the Error Dynamics   81
  6.3 Neural Network Augmentation and Stability Analysis   84
  6.4 Comments   86
  6.5 Simulation Results   87

7 Concluding Remarks and Recommendations for Future Research   91
  7.1 Conclusions   91
  7.2 Recommended Research   92

A Proofs   97
  A.1 Full State Feedback SHL Augmentation   97
    A.1.1 Taylor Series Expansion of the SHL NN Output   97
    A.1.2 Proof of Lemma 3.2.1   98
    A.1.3 Proof of Boundedness   99
  A.2 Alternative Full State Feedback Inversion SHL Augmentation   101
  A.3 Full State Feedback RBF Augmentation   103
    A.3.1 Taylor Series Expansion of the RBF NN Output   103
    A.3.2 Bound on w   104
    A.3.3 Proof of Boundedness   105
  A.4 Observer Based Output Feedback   107
    A.4.1 Observer Boundedness   107
    A.4.2 Combined Observer and Controller UUB   109
  A.5 Output Feedback Linearly Parameterized NN Augmentation   111
  A.6 Decentralized Full State Feedback SHL NN Augmentation   114
  A.7 Geometric Description of Ultimate Boundedness   118

B A Flight Control Experiment: The Cal Tech Ducted Fan   123
  B.1 Cal Tech's Ducted Fan   124
  B.2 Time Scale Separation   125
  B.3 Design of Inner Loop Controller   129
  B.4 Design of the Outer Loop Controllers   132
  B.5 Experimental Results   136

Bibliography   139

Vita   153

List of Figures

2.1 The SHL Perceptron Network.   13
2.2 RBF/FLP NN Structure.   14
3.1 Full State Feedback SHL NN Controller Architecture.   24
3.2 Tracking Performance of the Baseline Linear Controller.   31
3.3 Tracking Performance with SHL NN.   32
3.4 Control Effort and Inversion Error of the Baseline Linear Controller.   33
3.5 Control Effort and Inversion Error of the SHL NN Augmented Controller.   33
3.6 SHL NN Weight History.   34
3.7 SHL NN Adaptive Bound.   34
3.8 Alternative Full State Feedback SHL NN Controller Architecture.   36
3.9 Tracking Performance with RBF NN.   44
3.10 Control Effort and Inversion Error of the RBF NN Augmented Controller.   45
3.11 RBF NN Outer Layer Weight History.   46
3.12 RBF NN Adaptive Bound.   46
4.1 Tracking Performance without NN Controller.   61
4.2 Tracking Performance with NN Controller.   62
4.3 Observer and Controller NN Weight History.   62
4.4 Observer Performance: Actual and Estimated Velocities.   63
4.5 Observer Performance with no NN.   63
5.1 Output Feedback Adaptive Controller Architecture.   69
5.2 Output Tracking Performance without NN Controller.   77
5.3 Output Tracking Performance with NN Controller.   77
5.4 NN Controller Weights History.   78
6.1 Decentralized Control: Performance with no NN.   88
6.2 Decentralized Control: Performance with NN.   89
6.3 Subsystem 1 Control Effort with NN.   89
A.1 Geometric Representation of the Sets in the Error Space.   119
A.2 Geometric Representation of the Set of Allowable Command Inputs.   121
B.1 The Cal Tech Ducted Fan.   124
B.2 Ducted Fan Section View.   125
B.3 Ducted Fan Pitch Control Inner Loop Architecture.   130
B.4 Altitude Controller, Hover Mode.   134
B.5 Velocity Controller, Hover Mode.   134
B.6 Altitude Controller, Forward Flight Mode.   135
B.7 Velocity Controller, Forward Flight Mode.   135
B.8 Experimental Results: Hover to Forward Flight.   137
B.9 Experimental Results: Reversal of Flight Direction.   137
B.10 Experimental Results: Steps in Altitude.   138
B.11 Experimental Results: Forward Flight to Hover.   138

Acknowledgements

I wish to express sincere, deep gratitude to my advisor, Prof. Anthony J. Calise. Dr. Calise has been extremely supportive throughout my graduate studies. He has taught me much more than control system theory, and I consider myself very fortunate to have had the opportunity to work with him. I am thankful to the members of my doctoral committee: Prof. Wassim M. Haddad, Prof. J.V.R. Prasad, Prof. Nader Sadegh, and Prof. Panagiotis Tsiotras. In particular, I would like to thank Prof. Haddad for his lectures on linear robust control theory. I would also like to mention my colleagues in the flight control laboratory: Eric Johnson, Franz Lee, and Manu Sharma. I enjoyed interesting discussions with Joe Horn and Hong Xin. Special mention is due to Rolf Rysdyk and Naira Hovakimyan: I thank you both for your patience while working with me. I appreciated the painstaking proofreading of parts of my thesis by Prof. Moshe Idan. Finally, I thank my parents, Lucio and Gerd, and my brothers, Luca and Marco, for their support and love.


Summary

Adaptive control using on-line function approximators for feedback linearizable systems has proven to be a very effective way to design controllers based on approximate knowledge of the system dynamics. Practical implementation of such controllers on the X-36 aircraft has demonstrated their stability and performance characteristics, as well as superior fault tolerance when there is redundancy in the control. These real-time, real-world experiments provide great momentum for theoretical research in nonlinear adaptive control systems that use neural networks to approximate the unknown dynamics.

In this thesis we present neural network based adaptive algorithms for full state feedback, output feedback, and decentralized control of feedback linearizable nonlinear systems. We first introduce a novel approach to dynamic model inversion of a class of nonlinear, non-affine dynamic systems that leads to a controller architecture that can host approximate feedback linearizing controllers. Boundedness of signals is guaranteed in a semi-global sense; that is, the controllers are local with respect to a predetermined compact set. We specifically construct adaptive update laws for the single hidden layer and the full adaptive (adapting gain, centers, and widths) radial basis function neural networks. We strengthen the stability result with respect to previously developed proofs by introducing a robust adaptive gain to estimate on-line the bound on the higher order terms of the neural network output.

Next, we present a state observer based adaptive output feedback control architecture limited to relative degree 2 systems, but applicable to multi-input-multi-output systems. This architecture employs functional link perceptron neural networks. Update laws for the combined observer and controller are derived through Lyapunov analysis.

We also consider an adaptive output feedback approach that overcomes the relative degree restriction of the adaptive observer approach. We argue that it is considerably more convenient to design an observer for the output tracking error dynamics, since these dynamics appear linear as a result of feedback linearization. We prove ultimate boundedness of both the observer and controller tracking errors. The proof of stability is limited to linearly parameterized neural networks.

Finally, we develop a decentralized adaptive control design procedure for large-scale uncertain systems using single hidden layer neural networks. The subsystems are assumed to be feedback linearizable and non-affine in the control, and their interconnections bounded linearly by the tracking error norms. Single hidden layer neural networks are introduced to approximate the feedback linearization error signal on-line from available measurements. A robust adaptive signal is required in the analysis to shield the feedback linearizing control law from the interconnection effects. The signals of all subsystems are shown to be ultimately bounded.

As an appendix, we also report experimental results obtained by implementing a neural network based adaptive controller on the CalTech ducted fan.

Chapter 1
Introduction

1.1 Nonlinear Adaptive Control: Motivation and Overview

Research in nonlinear control theory has been motivated by the inherently nonlinear characteristics of the dynamical systems we often try to control. Examples of such systems are Euler-Lagrange systems, limit and rate saturated control systems, and dynamically coupled and interconnected systems, to list a few. If we add to the nonlinear nature of the dynamics the fact that most systems are not well known, and therefore not exactly modeled, it is clear that linear control techniques fall short in both their theoretical and practical aspects. Although linear systems are very well understood and controlled, linear control is not enough to guarantee stability and performance of nonlinear systems. A methodology often adopted in practice to control nonlinear systems is gain scheduling. This methodology is based on the fact that a smooth nonlinear system can be locally approximated by a linear one. By generating a high fidelity model of the process, the control engineer can linearize it at all operating conditions and design locally stable controllers that deliver the desired performance. The engineer then has to devise a smooth switching scheme to schedule these local controllers. This schedule results in a "practically" semiglobal nonlinear controller. Although this


methodology works in practice, rigorous mathematical proofs of stability have been developed only in the last decade. Examples of such efforts are Refs. [123, 65]. For different design frameworks for gain scheduling control with a priori guarantees of closed loop system stability over the whole operating range, see [99, 67] and references therein. Adaptive control seems today a "natural" strategy to attack the stabilization and tracking of highly uncertain dynamical systems. In fact, since the late 1950s, control theorists have struggled to develop adaptive control laws that guarantee closed loop stability in the presence of unmodeled dynamics and external disturbances. The Model Reference Adaptive Control (MRAC) architecture [63, 84, 2, 128, 44] was first proposed for linear systems by Whitaker at the M.I.T. Instrumentation Laboratory in 1958 [143, 98]. The MRAC approach was based on a heuristically constructed gradient or delta rule, also known as the M.I.T. rule. Grayson [33] and Butchart et al. [8] developed this initial heuristic approach into a more rigorous design methodology by introducing a closed loop stability analysis based on Lyapunov's second (or direct) method. Parks extended Butchart's result and first introduced the key idea for output feedback control design [100]. Specifically, Parks showed how to avoid using the derivatives of the model-system error in the Lyapunov analysis by satisfying a positive real condition. Dressler [23] introduced a state space formulation of the model reference adaptive system architecture and considered the stability of the closed loop system by Lyapunov's direct method. A similar approach can be found in [76]. Monopoli [77] strengthened these preliminary results by considering an augmented error signal that in effect extends the application of the Meyer-Kalman-Yacubovich lemma [108, 132, 73] to error systems of relative degree greater than one.
Monopoli showed global stability of MRAC systems when one has access to only the plant’s input and output signals. Derivatives of the plant output are replaced by filtered derivative signals.


The idea of introducing filters to render the error system's closed loop transfer function positive real has gone a long way from its first inception. Monopoli later extended these results to multivariable systems [78]. In 1973, Astrom et al. [1] derived the asymptotic properties of self-tuning regulators when the parameter estimates tend in the limit to a finite value. A unifying survey of early results was presented in [85, 86]. Extensions of this early work by Narendra et al. can be found in [89, 87], and by Morse in [79]. A new robust adaptive law without a persistent excitation condition was proposed in [83]. A unified account of linear multivariable adaptive control systems is given by Morse [80, 81]. An analysis of its parameter convergence and transient performance was presented in [94]. A milestone in the extension of linear control techniques to nonlinear systems has been the development of nonlinear geometric control. Recent research involving differential geometric methods [10, 11, 12, 9, 45, 92, 73] has rendered the design of controllers for a class of nonlinear systems somewhat systematic. This nonlinear control theory is based on coordinate transformations by which a class of nonlinear systems can be transformed into linear systems through feedback; hence the term feedback linearization for this methodology. Exact feedback linearization proved too restrictive for practical applications due to its lack of robustness to unmodeled dynamics. A series of results reported in the literature address ways to overcome this issue. A list of references to the so-called approximate feedback linearizing methods can be found in [35, 36, 56, 57, 51, 4, 31, 134, 110, 133, 5]. The approximate input-output linearization methodology has gained much attention in many fields of application. Examples in flight control design are [64, 7, 66, 24, 6]. Extensions that render this methodology robust to uncertainty can be found in [127] and [37].


With the advent of feedback linearization, adaptive control found its way into nonlinear control. Adaptive control of linearizable systems was first proposed in [130] and [120]. A different setup for nonlinear adaptive control is presented in [97]. Later, Kanellakopoulos et al. [50] developed a systematic approach to the design of adaptive controllers for linearizable systems. The key idea contained in this paper, back-stepping control, has become a very popular and powerful tool in nonlinear adaptive control. A complete account of such methods can be found in [59, 73, 121]. An extension to non-linearizable systems was proposed in [107]. The combination of adaptive control and feedback linearization applied to flight control can be found in [126]. In most of the classical adaptive control literature it is common to assume that the unknown dynamics have a known structure, with unknown parameters entering linearly in the dynamics. The linear parameterization of unknown dynamics poses serious obstacles to adopting adaptive control algorithms in practical applications, because it is difficult to fix the structure of the unknown nonlinearities. This fact has been the motivating factor behind the interest in on-line function approximators to estimate and learn the unknown function. The most common function approximators used in adaptive control are artificial neural network and fuzzy logic structures. On-line control algorithms that do not require knowledge of the system dynamics (except its dimension and relative degree) have been made possible by employing artificial neural networks in the feedback loop [34]. The ability of neural networks to approximate uniformly continuous functions has been proven in several articles [21, 27, 38, 28, 40]. An important aspect of neural network control applications is the difference between approximation theory results and what is achievable in on-line adaptive schemes using such approximators.
First and most importantly, in off-line applications the neural network weights are updated based on input-output matching,


whereas in direct adaptive control situations the update of the network parameters is driven by a tracking error, which by its definition does not contain any input-type information. In addition, the adaptive update must yield boundedness of all signals in the closed loop. Consequently, standard back-propagation of the error is commonly not the only component in the parameter update law. Neural networks (NNs) for identification and control were first proposed by Narendra and Parthasarathy [88]. Results in this initial research were limited to simulation, and no proofs of closed loop stability were provided. Relevant initial analyses of neural networks in identification and control settings can be found in the work of Polycarpou and Ioannou [105, 104]. Sanner and Slotine first proposed Radial Basis Function (RBF) NNs for control of affine systems and provided rigorous analysis of stability and implementation. They introduced a sliding mode component in the control for the purpose of keeping the evolving dynamics within the predetermined compact set of interest [117]. Tzirkel and Fallside proposed the use of neural network control, providing a proof of stability [131]. Sadegh provided rigorous analysis of a neuro-control system using sigmoidal neural networks [115, 116]. First results in proving stability of on-line feedback linearizing adaptive Single Hidden Layer (SHL) neural network augmented controllers were obtained by Chen and Khalil [15] for discrete time systems, and by Chen and Liu [16] for continuous time systems. Lewis et al. [145, 71, 72] developed novel adaptive control update laws for an on-line SHL neural network based on Lyapunov analysis. These results were limited to affine-in-control nonlinear systems. Calise et al.
[54, 14] removed the affine in control restriction by developing a dynamic inversion based control architecture with linearly parameterized neural networks in the feedback path to compensate for the inversion error introduced by an approximate inverse. Extensions of this work to SHL neural networks can be found in [74]. Flight testing has been carried out


using this control architecture on the X-36 aircraft, demonstrating both greater performance than the baseline dynamic inverting controller and superior fault tolerance [13]. Relevant contributions to adaptive neural network theory were also introduced by Rovithakis and Christodoulou [114] and by Polycarpou [103, 106]. In particular, Polycarpou introduced the use of a robust adaptive gain to compensate for neglected higher order terms in the Taylor series expansion of the neural network output. This expansion is needed to obtain implementable update laws. The advantage of the adaptive bounding approach is that only the functional dependence of these higher order terms is needed, not their magnitude. Specifically, there is no need to know the upper bound on the neural network weight matrices. More recently, Zhang et al. [148] have extended the use of Lewis' SHL adaptive laws to non-affine nonlinear systems and analyzed their transient performance. In [147] they extended the control architecture to include a high gain observer for output feedback applications. In [112] Rovithakis shows that neuro-control can also be considered when a controller associated with the nominal dynamics is given along with its Lyapunov function. In [111] he introduces a dynamic neural network for control of affine systems by developing adaptive laws based on Lyapunov analysis. The control architecture is based on model reference adaptive control, and the control signal is composed of the neural network and robust adaptive outputs, with no nominal controller in the loop. Farrel presented a thorough stability and approximator convergence analysis for nonparametric nonlinear adaptive control [26]. Choi and Farrel [18] have proposed an adaptive control scheme using piecewise linear approximators to parameterize the uncertainty in a nonlinear affine system. They also proposed an interesting and promising RBF NN observer [17].
Structurally dynamic wavelet neural networks for control are considered in [118]. Kulawski et al. proposed


dynamic neural networks for control of feedback linearizable systems [61]. An interesting experimental comparison between linear, adaptive, and NN adaptive control is presented in [101].

1.2 Brief Outline of the Dissertation

The main theme of the research presented in this thesis is the development of adaptive algorithms for nonlinear control. We address three major subsets of nonlinear adaptive control: full state feedback, output feedback, and decentralized control. In Chapter 3 we present an alternative approach to that of Ref. [54] for feedback linearization of non-affine unknown nonlinear systems that employs an on-line adaptive neural network for control. We follow the approach in Ref. [106] and introduce an adaptive robustifying term. This avoids the need to know an upper bound on the ideal NN weights, as well as the choice of a fixed robust gain that in practice may result in high gain control. We also consider a more general architecture for the linear part of the control design. The extensions over Ref. [106] are that we do not assume the dynamics to be affine in control, and that the development is carried out in the setting of feedback inversion. We adopt nonlinearly parameterized NNs, namely the SHL NN and the full adaptive RBF NN. In Chapter 4 we illustrate an adaptive observer/controller architecture for MIMO systems of full relative degree. Each subsystem is assumed feedback linearizable with relative degree no greater than two. We prove stability adopting linearly parameterized NNs for both the observer and controller augmentation. In Chapter 5 we remove the relative degree limitation of the output feedback approach of Chapter 4 by employing a linear observer for the tracking error dynamics. The controller is augmented with a linearly parameterized NN, and a proof of stability


is provided that shows boundedness of both the linear observer and adaptive controller simultaneously. In Chapter 6 we extend the full state feedback approach of Chapter 3 to decentralized control of large-scale interconnected systems. We assume the interconnections to be bounded linearly by a function of the tracking error. Ultimate boundedness is guaranteed by introducing an additional robust adaptive term that shields the NN augmented feedback linearizing controller from interconnection effects. The proofs of boundedness theorems in Chapters 3-6 are detailed in Appendix A. Lastly, in Appendix B, we describe the analysis that went into the design of a NN based adaptive controller for the CalTech ducted fan. This controller was implemented, and experimental results are reported.

Chapter 2
Mathematical Preliminaries

2.1 Geometric Nonlinear Control

In this chapter we present the mathematical notation that we use extensively in the exposition of the theoretical results. We will refer to the following nonlinear system,

$$\dot{x} = f(x, u), \tag{2.1.1}$$
$$y = g(x), \tag{2.1.2}$$

where $x \in \mathbb{R}^n$, $u \in \mathbb{R}^m$, and $y \in \mathbb{R}^q$, as a non-affine nonlinear system. The mappings $f, g$ are assumed smooth ($\in C^\infty$). The Lie derivative is defined as $L_f g \triangleq \dfrac{\partial g}{\partial x} f$.

Definition 2.1.1 (Vector Relative Degree). [119] The system (2.1.1) is said to have vector relative degree $(\gamma_1, \gamma_2, \ldots, \gamma_p)$ at $x_0$ if

$$L_{g_i} L_f^k g_i(x) \equiv 0, \qquad 0 \le k \le \gamma_i - 2, \tag{2.1.3}$$

for $i = 1, \ldots, p$, and the following decoupling matrix

$$\begin{bmatrix}
L_{g_1} L_f^{\gamma_1 - 1} g_1 & \cdots & L_{g_p} L_f^{\gamma_1 - 1} g_1 \\
\vdots & \ddots & \vdots \\
L_{g_1} L_f^{\gamma_p - 1} g_p & \cdots & L_{g_p} L_f^{\gamma_p - 1} g_p
\end{bmatrix} \tag{2.1.4}$$

is nonsingular at $x_0$.


Definition 2.1.2 (Diffeomorphism). [119, 128] A map $f : X \to Y$ is said to be a diffeomorphism if $f$ is a homeomorphism (i.e., a one-to-one and onto continuous map with continuous inverse) and if both $f$ and $f^{-1}$ are smooth.

An important concept in geometric nonlinear control is that of zero dynamics. For ease of notation, assume that a single-input-single-output (SISO) system has relative degree $r < n$, where $x \in \mathbb{R}^n$. Then there exists a diffeomorphism

$$\Phi(x) \triangleq \begin{bmatrix} L_f^0 g \\ L_f^1 g \\ \vdots \\ L_f^{r-1} g \end{bmatrix} \tag{2.1.5}$$

that transforms the original system in Eq. (2.1.1) into the so-called normal form [45]

$$\begin{aligned}
\dot{\chi} &= g(\xi_1, \ldots, \xi_r, \chi) \\
\dot{\xi}_i &= \xi_{i+1}, \qquad i = 1, \ldots, r-1 \\
\dot{\xi}_r &= h(\xi, \chi, u) \\
y &= \xi_1,
\end{aligned} \tag{2.1.6}$$

where $h(\xi, \chi, u) = L_f^r g$ with $x = \Phi^{-1}(\xi)$ and $\xi \in \Phi_\xi$, where $\Phi_\xi = \Phi(\Omega)$. The zero dynamics is defined as the $\chi$ dynamics with the output set to zero, i.e.,

$$\dot{\chi} = g(0, \chi). \tag{2.1.7}$$

A common assumption in stabilization and tracking control of feedback linearizable nonlinear systems is that the zero dynamics is exponentially attractive. We denote by $\|\cdot\|$ the Euclidean norm, by $\|\cdot\|_1$ the vector 1-norm, and by $\|\cdot\|_F$ the Frobenius norm for matrices.
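The Lie derivative and relative degree computations above can be checked symbolically. The sketch below uses SymPy on a toy pendulum-like system (an illustrative example, not taken from the thesis); the input vector field is written `g_in` to avoid a clash with the output map $g$:

```python
import sympy as sp

x1, x2 = sp.symbols('x1 x2')
x = sp.Matrix([x1, x2])

# Toy affine system xdot = f(x) + g_in(x) u, with output y = h(x) = x1
f = sp.Matrix([x2, -sp.sin(x1)])
g_in = sp.Matrix([0, 1])
h = x1

def lie(vec, scalar):
    # Lie derivative L_vec scalar = (d scalar / dx) * vec
    return sp.simplify((sp.Matrix([scalar]).jacobian(x) * vec)[0])

# The relative degree r is the smallest r with L_{g_in} L_f^{r-1} h nonzero
Lg_h = lie(g_in, h)        # 0: input does not appear after one differentiation
Lf_h = lie(f, h)           # x2
LgLf_h = lie(g_in, Lf_h)   # nonzero, so the relative degree is r = 2
```

Since $r = 2 = n$ here, the normal form has no $\chi$ component and the zero dynamics is trivial; a system with $r < n$ would leave $n - r$ states in the $\dot{\chi}$ equation of (2.1.6).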


2.2 Stability and Boundedness

In the sequel we will address the closed loop boundedness of the proposed adaptive control augmentation. In introducing a multilayer neural network as an online approximator to the uncertainty, we will not be able to show asymptotic stability, nor can we show that the tracking error goes to zero. For this reason, the following definition states precisely what kind of boundedness we can ensure. The following facts refer to the general form of a nonlinear dynamical system,

$$\dot{x} = f(t, x). \tag{2.2.1}$$

Definition 2.2.1. [44] The solutions of (2.2.1) are uniformly ultimately bounded (with bound $B$) if there exists a $B > 0$ and if, corresponding to any $\alpha$ and $t_0 \in \mathbb{R}^+$, there exists a $T = T(\alpha) > 0$ (independent of $t_0$) such that $\|x_0\| < \alpha$ implies $\|x(t; t_0, x_0)\| < B$ for all $t \ge t_0 + T$.

Theorem 2.2.1. [146] Suppose there exists a Lyapunov function $V(t, x)$ defined on $0 \le t < \infty$, $\|x\| \ge R$, where $R$ may be large, which satisfies the following conditions:

1. $a(\|x\|) \le V(t, x) \le b(\|x\|)$, where $a(\cdot) \in C$ is a monotone increasing function with $a(r) \to \infty$ as $r \to \infty$, and $b(\cdot) \in C$ is a monotone increasing function;
2. $\dot{V}(t, x) \le -c(\|x\|)$, where $c(r)$ is a positive and continuous function.

Then the solutions of (2.2.1) are uniformly ultimately bounded.

Theorem 2.2.2. [62] Let $V(x)$ be a scalar function which for all $x$ has continuous first partial derivatives, with the property that $V(x) \to \infty$ as $\|x\| \to \infty$. If $\dot{V}(x) < 0$ for all $x$ outside some closed and bounded set $M$, then the solutions of $\dot{x} = f(x)$ are ultimately bounded.
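As a standard scalar illustration of these notions (a textbook-style example, not taken from the thesis), consider a stable linear system perturbed by a bounded disturbance:

```latex
% Scalar system with bounded disturbance:
\dot{x} = -x + d(t), \qquad |d(t)| \le \bar{d}.
% With the candidate Lyapunov function V(x) = \tfrac{1}{2}x^2,
\dot{V} = -x^2 + x\,d(t) \le -x^2 + |x|\,\bar{d}.
% For |x| \ge 2\bar{d} we have |x|\,\bar{d} \le \tfrac{1}{2}x^2, hence
\dot{V} \le -\tfrac{1}{2}x^2 \quad \text{whenever } |x| \ge 2\bar{d}.
```

Since $\dot{V}$ is negative outside the closed and bounded set $\{\,|x| \le 2\bar{d}\,\}$, the hypotheses above are met, and the solutions are uniformly ultimately bounded with any ultimate bound $B > 2\bar{d}$, even though the tracking error need not converge to zero. This is exactly the type of conclusion obtained for the adaptive NN controllers in the sequel.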


2.3 NN Models

2.3.1 Single Hidden Layer NN

Figure 2.1 shows the structure of a Single Hidden Layer (SHL) Perceptron NN [69]. The input-output map of an SHL network can be represented as

$$y_k = b_M \theta_{M_k} + \sum_{j=1}^{n_2} M_{jk}\,\sigma_j, \qquad k = 1, \ldots, n_3,$$

where

$$\sigma_j = \sigma\!\left(b_N \theta_{N_j} + \sum_{i=1}^{n_1} N_{ij}\,x_i\right).$$

Here $n_1$, $n_2$, and $n_3$ are the number of inputs, hidden layer neurons, and outputs, respectively. The scalar function $\sigma(z)$ is a sigmoidal activation function that represents the "firing" characteristics of the neuron. In this work it is defined as

$$\sigma(z) = \frac{1}{1 + e^{-a z}}. \tag{2.3.1}$$

The factor $a$ is known as the activation potential. For convenience, define the two weight matrices

$$N = \begin{bmatrix}
\theta_{N_1} & \cdots & \theta_{N_{n_2}} \\
N_{1,1} & \cdots & N_{1,n_2} \\
\vdots & \ddots & \vdots \\
N_{n_1,1} & \cdots & N_{n_1,n_2}
\end{bmatrix}, \qquad
M = \begin{bmatrix}
\theta_{M_1} & \cdots & \theta_{M_{n_3}} \\
M_{1,1} & \cdots & M_{1,n_3} \\
\vdots & \ddots & \vdots \\
M_{n_2,1} & \cdots & M_{n_2,n_3}
\end{bmatrix},$$

and define a vector $\sigma(z)$ as

$$\sigma(z) = [\,b_M \;\; \sigma(z_1) \;\; \sigma(z_2) \;\; \cdots \;\; \sigma(z_{n_2})\,]^T,$$

where $b_M \ge 0$ allows the thresholds $\theta_M$ to be included in the weight matrix $M$. Let $\bar{x} = [\,b_N \;\; x\,]^T$,


Figure 2.1: The SHL Perceptron Network.

where b_N ≥ 0 is an input bias that allows the thresholds θ_N to be included in the weight matrix N. With the above definitions, the input–output map of a SHL perceptron can be written in matrix form as

y = M^T σ(N^T x̄).    (2.3.2)
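As a concreteness check, the matrix form y = M^T σ(N^T x̄) can be evaluated directly, with the first rows of N and M carrying the thresholds via the bias entries b_N and b_M. This is an illustrative sketch assuming NumPy; all dimensions and weight values are arbitrary.

```python
import numpy as np

def sigmoid(z, a=1.0):
    """Sigmoidal activation of eq. (2.3.1) with activation potential a."""
    return 1.0 / (1.0 + np.exp(-a * z))

def shl_forward(x, N, M, b_N=1.0, b_M=1.0):
    """SHL perceptron map y = M^T sigma(N^T x_bar), eq. (2.3.2).

    N is (n1+1) x n2, with thresholds theta_N in its first row;
    M is (n2+1) x n3, with thresholds theta_M in its first row.
    """
    x_bar = np.concatenate(([b_N], x))           # x_bar = [b_N, x]^T
    z = N.T @ x_bar                              # hidden-layer pre-activations
    sigma = np.concatenate(([b_M], sigmoid(z)))  # [b_M, sigma(z_1), ..., sigma(z_n2)]^T
    return M.T @ sigma

rng = np.random.default_rng(0)
n1, n2, n3 = 3, 5, 2
N = rng.standard_normal((n1 + 1, n2))
M = rng.standard_normal((n2 + 1, n3))
y = shl_forward(rng.standard_normal(n1), N, M)
assert y.shape == (n3,)
```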

2.3.2 Radial Basis Function NN

A Radial Basis Function (RBF) NN differs in structure from the SHL NN in that there are no weight parameters associated with the input layer interconnections. In addition, the activation function is a bell shaped Gaussian function. Figure 2.2 shows the typical structure of a RBF NN. Although this NN is usually considered linearly parameterized, by adjusting the centers and the widths this NN structure becomes



Figure 2.2: RBF/FLP NN Structure.

nonlinearly parameterized. The activation functions φ are defined as

φ(x̄, ξ, η) ≜ exp( −‖x̄ − ξ‖² / η² ),    (2.3.3)

where x̄ is the vector of input variables, η is the width of the Gaussian function, and ξ is the vector of Gaussian center positions. The argument of the activation function of each hidden layer unit involves the Euclidean norm of the difference between the input vector and the unit's center position; this gives the Gaussian functions their exponentially decaying, localized nonlinearity. The output of an RBF network can hence be written as

y = M^T φ(x̄, ξ, η).    (2.3.4)

Even when the RBF network is restricted to linear parameterization, this structure can uniformly approximate continuous functions to arbitrary accuracy on compact sets, provided a sufficient number of Gaussian functions is employed [32, 102]. The "local" characteristic of RBF networks is considered an unattractive feature when the function approximation is required over a large domain: the number of Gaussian functions needed can quickly grow intractable due to the curse of dimensionality [117].
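The forward pass of (2.3.3)–(2.3.4) can be sketched as follows. This is an illustration assuming NumPy, with one width per Gaussian unit; the network sizes, centers, and weights are arbitrary choices.

```python
import numpy as np

def rbf_forward(x, centers, widths, M):
    """RBF network y = M^T phi(x, xi, eta), eqs. (2.3.3)-(2.3.4).

    centers: (n2, n1) Gaussian center positions xi_j
    widths:  (n2,)    Gaussian widths eta_j
    M:       (n2, n3) output-layer weights
    """
    # phi_j = exp(-||x - xi_j||^2 / eta_j^2): localized, decays away from each center
    d2 = np.sum((centers - x) ** 2, axis=1)
    phi = np.exp(-d2 / widths ** 2)
    return M.T @ phi

rng = np.random.default_rng(1)
centers = rng.uniform(-1, 1, size=(8, 2))   # 8 units on a 2-dimensional input
widths = np.full(8, 0.5)
M = rng.standard_normal((8, 1))
y = rbf_forward(np.array([0.2, -0.3]), centers, widths, M)
assert y.shape == (1,)
```

The curse of dimensionality mentioned above is visible here: covering a d-dimensional input domain with centers on a grid of k points per axis requires k^d units.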

2.3.3 Functional Link Perceptron NN

The Functional Link Perceptron Network (FLPN) was first proposed for identification and control of dynamic systems by Sadegh [115, 116], who showed that this type of network with sigmoidal basis functions can uniformly approximate continuous functions over compact sets. An FLPN is a three layered neural network whose hidden layer is composed of fixed activation functions, so that the input–output relationship is defined by

y = Σ_{i=1}^{N} m_i φ_i(x̄),

or, in more compact form,

y = M^T Φ(x̄),    (2.3.5)

where

M ≜ [ m_1 ⋯ m_N ]^T,    Φ(x̄) ≜ [ φ_1(x̄) ⋯ φ_N(x̄) ]^T.

The activation function most commonly adopted is the so called sigmoid

φ(z) = (1 − e^{−a z}) / (1 + e^{−a z}),    (2.3.6)

which is a hyperbolic tangent, φ(z) = tanh(a z / 2), bounded over the whole domain.
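A quick numerical check (illustrative only; the value of a is arbitrary) confirms that (2.3.6) is a scaled hyperbolic tangent and that it is bounded in (−1, 1):

```python
import numpy as np

def flpn_sigmoid(z, a=1.0):
    """Sigmoid basis function of eq. (2.3.6)."""
    return (1.0 - np.exp(-a * z)) / (1.0 + np.exp(-a * z))

z = np.linspace(-10, 10, 1001)
# algebraically, (1 - e^{-az})/(1 + e^{-az}) = tanh(a z / 2)
assert np.allclose(flpn_sigmoid(z, a=2.0), np.tanh(z))
assert np.all(np.abs(flpn_sigmoid(z, a=2.0)) < 1.0)   # bounded over the domain
```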


2.4 NN Universal Approximation Property

The dominant theme of this thesis is the use of NNs to approximate unknown functions of the dynamic states. In order to adopt the different NN models for online adaptation, we must ensure that these models are capable of uniformly approximating continuous functions over predetermined compact sets. To help clarify the concepts of convergence and approximation, we introduce a key result from real analysis [3].

Definition 2.4.1. A function g approximates f uniformly on D ⊂ ℝᵖ to within ε > 0 if

‖g(x) − f(x)‖ ≤ ε    ∀x ∈ D,    (2.4.1)

or, equivalently, if

sup_{x∈D} ‖g(x) − f(x)‖ ≤ ε.    (2.4.2)

The sup norm is sometimes referred to as the L∞ norm, because L∞ (the space of bounded functions) is the space of all functions for which this norm is defined.

Theorem 2.4.1 (Stone–Weierstrass Theorem). Let K be a compact subset of ℝᵖ and let A be a collection of continuous functions on K to ℝ with the properties:

• The constant function e(x) = 1, x ∈ K, belongs to A.
• If f, g belong to A, then αf + βg belongs to A.
• If f, g belong to A, then fg belongs to A.
• If x ≠ y are two points of K, there exists a function f in A such that f(x) ≠ f(y).


Then any continuous function on K to ℝ can be uniformly approximated on K by functions in A.

This theorem summarizes all the characteristics needed to approximate a function: the first item includes the constant function, the second and third define an algebra for the class (closure under addition and multiplication), and the fourth ensures that there are functions other than the constant function in the class. In the following we report results from the NN literature that establish the approximation capability of the different NN models presented in the previous section.

Theorem 2.4.2 (SHL NN is a universal approximator). [27] Let φ(x) be a nonconstant, bounded and monotonically increasing continuous function. Let K be a compact subset of ℝⁿ and f(x_1, …, x_n) be a real valued continuous function on K. Then for arbitrary ε > 0 there exist an integer N and real constants c_i, θ_i, i = 1, …, N, and w_ij, i = 1, …, N, j = 1, …, n, such that

f̂(x_1, …, x_n) = Σ_{i=1}^{N} c_i φ( Σ_{j=1}^{n} w_ij x_j + θ_i )    (2.4.3)

satisfies

max_{x∈K} |f(x_1, …, x_n) − f̂(x_1, …, x_n)| < ε.    (2.4.4)
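The theorem is existential, but its flavor can be seen numerically. The sketch below is illustrative only: the target function sin(2πx), the neuron counts, and the scheme of fixing random inner weights w_ij, θ_i and solving for the outer coefficients c_i by least squares are all invented choices, not part of the theorem.

```python
import numpy as np

def fit_shl(f, N, x, rng):
    """Fit f on sample points x with N sigmoidal neurons: random inner
    weights and thresholds, least-squares outer weights c_i.
    Returns the sup error on the sample grid."""
    w = rng.uniform(-8, 8, N)       # inner weights w_i
    theta = rng.uniform(-8, 8, N)   # thresholds theta_i
    Phi = 1.0 / (1.0 + np.exp(-(np.outer(x, w) + theta)))  # sigmoid features
    c, *_ = np.linalg.lstsq(Phi, f(x), rcond=None)
    return np.max(np.abs(Phi @ c - f(x)))

rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 400)
f = lambda x: np.sin(2 * np.pi * x)
errors = [fit_shl(f, N, x, rng) for N in (2, 10, 50)]
assert errors[-1] < errors[0]   # more neurons -> smaller sup error here
```

Randomly drawn inner weights are of course not what the theorem prescribes; the point is only that a modest number of sigmoidal units already drives the sup error down on a compact interval.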

As far as the RBF and FLPN models are concerned, their approximation capabilities are guaranteed by the nature of the activation functions. Sadegh [115] specified the conditions under which an activation function qualifies as a basis function.

Definition 2.4.2. Let S be a compact, simply connected subset of ℝⁿ, and let φ(·) : S → ℝⁿ be integrable and bounded. Then φ(·) is said to form a basis for Cᵐ(S) if

1. a constant function on S can be expressed as the output of the NN for a finite number of neurons,


2. the span of φ(·) is dense in Cᵐ(S) for a countable number of neurons.

It has been shown in [115, 116, 32, 102] that both the shifted sigmoid (a superposition of sigmoidal functions) and the Gaussian basis function can uniformly approximate any continuous function to an arbitrary degree of accuracy, provided a sufficient number of neurons is employed.

Chapter 3

Direct Adaptive Full State Feedback Control

The Single Hidden Layer (SHL) NN was first introduced in on-line adaptive control by Chen and Khalil [15] for discrete-time systems. This first attempt adopted the standard back-propagation (gradient) rule to update the NN weights, along with a dead zone to avoid NN parameter drift when the tracking error goes to zero. Lewis [71] developed a set of update rules for continuous time control using SHL NNs. These differential equations were obtained through Lyapunov analysis. Boundedness was shown by adding a robustifying term, which can potentially lead to high gain control. This robust term requires a priori knowledge of an upper bound on the Frobenius norm of the NN weights, and the choice of a fixed feedback gain. Both conditions can potentially be conservative.

In this chapter we propose a novel approach to dynamic inversion of nonlinear feedback linearizable systems, assuming the full state vector is available for feedback. From a technical standpoint, the novelty of the proposed approach lies in the fact that, by invoking the Mean Value Theorem [3] on the right hand side of a nonlinear controlled system, the resulting error dynamics are forced by the discrepancy between the applied control and the ideal inverting control law, as opposed to the classical inversion error of the approach by Calise et al. [54, 14]. Moreover, the proposed approach exposes a restriction on the time rate of change of the sensitivity of the


system with respect to the control, not present in earlier work. From an implementation point of view, the adaptive signals are injected after the approximate inverse model, thus avoiding altogether the algebraic loop assumption of the previously developed approach of Calise et al. [54, 14]. Furthermore, the control architecture can host any linear controller, including dynamic compensators, and does not require the approximate inverse model for stability analysis.

To show ultimate boundedness of all signals, we introduce a robust adaptive term, the adaptive bound of Polycarpou [103, 106], to replace the conservative fixed gain term of Lewis' approach [71]. This yields an additional update law for the gain of the robust adaptive term. The advantage of the proposed approach from an implementation standpoint is that the choice of a feedback gain is replaced by the choice of an adaptation gain. In practice an adaptive gain is easier to work with, since it does not have to be chosen by the designer. Moreover, this approach avoids altogether having to guess an upper bound on the Frobenius norm of the NN weights.

In the second half of this chapter, we also consider Radial Basis Function (RBF) NNs for the approximation of the unknown inversion error function. Our contribution in the present work is the development of adaptive laws for the outer layer weights, the Gaussian centers, and the widths, thereby obtaining a nonlinearly parameterized NN structure. The adaptation laws are derived from Lyapunov analysis.

3.1 Dynamic Inversion for Non-Affine Systems

Consider the following system

ẋ_n = f(x, u),

where x ∈ ℝⁿ with elements x_i, i = 1, …, n, u ∈ ℝ, and f ∈ C^∞. The goal is to achieve bounded tracking of the x_1 commands, assuming that all states are


available for feedback. It is perhaps important to remark that the following analysis is also applicable to systems with asymptotically stable zero dynamics, as long as all states are available for feedback.

Assumption 3.1.1. There exist positive constants f^L, f^U, H such that

0 < f^L ≤ ∂f(x, u)/∂u ≤ f^U,    (3.1.1)

and

| (d/dt) ∂f(x, u)/∂u | ≤ H < (λ_min(Q)/λ_max(P)) f^L,    (3.1.2)

for (x, u) ∈ Ω × ℝ, with Ω ⊂ ℝⁿ, where Q, P are positive definite matrices that will be defined in the sequel. The condition in (3.1.2) is needed in the stability analysis.

Remark 3.1.1. The condition in (3.1.2) may be difficult to verify; we offer a qualitative discussion of some of its implications. The condition can be physically interpreted as a bound on the time rate of change of the sensitivity of the plant with respect to the control. For affine systems of the form ẋ_n = f(x) + u, the condition is always satisfied, since (d/dt)(∂/∂u)(f(x) + u) = 0. For affine systems of the form ẋ_n = f(x) + g(x)u, the condition becomes |ġ(x)| < H. In the non-affine case, the restriction that the plant be invertible makes the functional relation with respect to the control one-to-one: the first condition, on the sensitivity of the plant with respect to the control, makes f a monotonically increasing function of u. The time derivative of the sensitivity of a non-affine system with respect to the control may contain the derivative of the control itself. Since the control law is, in general, a function of the states x, the bound on the rate of change of the sensitivity can be viewed as a bound on the time rate of change of the closed loop system dynamics. An alternative approach that avoids this condition is described in Section 3.5.
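As an illustration of the sensitivity condition (3.1.1), it can be checked numerically on a grid. The plant f(x, u) = x + u + 0.3 sin u and the bounds f^L, f^U below are invented for this example, not taken from the thesis.

```python
import numpy as np

# hypothetical non-affine plant: f(x, u) = x + u + 0.3 sin(u)
# analytic sensitivity: df/du = 1 + 0.3 cos(u), so f_L = 0.7, f_U = 1.3
def df_du(u):
    return 1.0 + 0.3 * np.cos(u)

u = np.linspace(-50.0, 50.0, 200001)
f_L, f_U = 0.7, 1.3
s = df_du(u)
assert np.all(s >= f_L) and np.all(s <= f_U)   # 0 < f_L <= df/du <= f_U holds
```

Because the sensitivity stays strictly positive, this sample plant is monotonically increasing in u, which is exactly what the inversion construction below exploits.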


Consider the following state dependent transformation

v = f(x, u),    (3.1.3)

where v is commonly referred to as the pseudo control. The pseudo control is chosen in this derivation as a linear operator; in general, it may also be nonlinear, for example including a sliding mode component [119, 129]. The transformation in (3.1.3) is defined locally by invoking the implicit function theorem [3]. Since the pseudo control v is in general not a function of the control u but a state dependent operator, we have

∂[v − f(x, u)]/∂u ≠ 0,    (3.1.4)

by Assumption 3.1.1. Since the expression in (3.1.4) is nonsingular, there exists, in a neighborhood of every (x, u) ∈ Ω × ℝ, an implicit function α(x, v) such that v − f(x, α(x, v)) = 0. Since Assumption 3.1.1 holds on the whole domain Ω × ℝ, we may take the union of all such neighborhoods and extend the existence of the transformation to the entire domain. Let

u* ≜ α(x, v) = f⁻¹(x, v).    (3.1.5)
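Because (3.1.1) makes f strictly increasing in u, the ideal inverse u* = f⁻¹(x, v) can be computed numerically by bisection on u. The sketch below uses the same invented plant f(x, u) = x + u + 0.3 sin u as before; it is an illustration of the implicit function α(x, v), not the approximate inverse fˆ⁻¹ used later in the chapter.

```python
import math

def f(x, u):
    # hypothetical non-affine plant, monotonically increasing in u
    return x + u + 0.3 * math.sin(u)

def f_inverse(x, v, tol=1e-12):
    """Solve f(x, u) = v for u by bisection, valid since df/du >= f_L > 0."""
    lo, hi = -1.0, 1.0
    while f(x, lo) > v: lo *= 2.0   # expand the bracket downward
    while f(x, hi) < v: hi *= 2.0   # expand the bracket upward
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if f(x, mid) < v else (lo, mid)
    return 0.5 * (lo + hi)

x, v = 0.4, 2.0
u_star = f_inverse(x, v)
assert abs(f(x, u_star) - v) < 1e-9   # f(x, u*) = v, as required by (3.1.5)
```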

Before proceeding to the construction of the error equation, consider the following remark.

Remark 3.1.2. For systems that are affine in the control,

ẋ_n = f(x) + g(x)u,

the ideal feedback linearizing law is given by

u* = (−f(x) + v) / g(x).

This feedback linearizing law is defined for all (x, u) ∈ Ω × ℝ, whereas in the non-affine case we can only invoke the implicit function theorem for local existence. The analysis presented here is therefore equally applicable to systems that are affine in the control.

With reference to Figure 3.1, define the tracking error e_1 = x_{f_1} − x_1. Then the error dynamics can be written in the following form

ė_i = e_{i+1},    i = 1, …, n − 1,

ė_n = ẋ_{f_n} − f(x, u).

By invoking the mean value theorem, there exists a real number λ, with 0 < λ < 1, such that

f(x, u) = f(x, u*) + ∂f(x, u)/∂u |_{u=u_λ} (u − u*),

with u_λ = λu + (1 − λ)u*. Introduce the shorthand notation f_u ≜ ∂f/∂u |_{u=u_λ}. The nth equation in the error dynamics becomes

ė_n = ẋ_{f_n} − f(x, u*) − f_u (u − u*).

Based on the transformation introduced in (3.1.3), we know that for u* = f⁻¹(x, v) we have f(x, u*) = v. Using this fact in the error dynamics,

ė_n = ẋ_{f_n} − v − f_u (u − u*).

Let the control u be defined as follows:

u ≜ fˆ⁻¹(x, v) + u_ad + ū,    (3.1.6)

where fˆ⁻¹(x, v) is any available approximation of the unknown function f⁻¹(x, v), u_ad is a neural network based adaptive signal, and ū is a robust adaptive term.
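The mean value step above can be checked numerically for the invented plant f(x, u) = x + u + 0.3 sin u used earlier (illustrative only): a λ in (0, 1) is located such that f(x, u) − f(x, u*) equals ∂f/∂u evaluated at u_λ = λu + (1 − λ)u*, times (u − u*).

```python
import math

def f(x, u):    return x + u + 0.3 * math.sin(u)   # hypothetical plant
def df_du(u):   return 1.0 + 0.3 * math.cos(u)     # its control sensitivity

x, u, u_star = 0.4, 2.5, 1.0
target = (f(x, u) - f(x, u_star)) / (u - u_star)   # required slope f_u

# g(lam) = df/du at u_lam; continuous in lam, so a root of g - target
# lies in (0, 1) by the intermediate value theorem
g = lambda lam: df_du(lam * u + (1 - lam) * u_star)
lo, hi = 0.0, 1.0
if g(lo) > g(hi):
    lo, hi = hi, lo                                # orient the bracket
for _ in range(60):                                # bisection on lambda
    mid = 0.5 * (lo + hi)
    lo, hi = (mid, hi) if g(mid) < target else (lo, mid)
lam = 0.5 * (lo + hi)
assert 0.0 < lam < 1.0
assert abs(g(lam) - target) < 1e-9                 # mean value identity holds
```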

