FACULTY OF ENGINEERING
Department of Fundamental Electricity and Instrumentation

Identification of Nonlinear Systems using Polynomial Nonlinear State Space Models

Thesis submitted in fulfillment of the requirements for the degree of Doctor in de Ingenieurswetenschappen (Doctor in Engineering) by

ir. Johan Paduart

Chair:

Prof. Dr. ir. Annick Hubin (Vrije Universiteit Brussel)

Vice chair:

Prof. Dr. ir. Jean Vereecken (Vrije Universiteit Brussel)

Secretary:

Prof. Dr. Steve Vanlanduit (Vrije Universiteit Brussel)

Advisers:

Prof. Dr. ir. Johan Schoukens (Vrije Universiteit Brussel)
Prof. Dr. ir. Rik Pintelon (Vrije Universiteit Brussel)

Jury:

Prof. Dr. ir. Lennart Ljung (Linköping University)
Prof. Dr. ir. Johan Suykens (Katholieke Universiteit Leuven)
Prof. Dr. ir. Jan Swevers (Katholieke Universiteit Leuven)
Prof. Dr. ir. Yves Rolain (Vrije Universiteit Brussel)

Print: Grafikon, Oostkamp
© Vrije Universiteit Brussel - ELEC Department
© Johan Paduart 2007

Uitgeverij VUBPRESS Brussels University Press
VUBPRESS is an imprint of ASP nv (Academic and Scientific Publishers nv)
Ravensteingalerij 28, B-1000 Brussels
Tel. +32 (0)2 289 26 50
Fax +32 (0)2 289 26 59
E-mail: [email protected]
www.vubpress.be

ISBN 978 90 5487 468 3
NUR 910
Legal deposit D/2008/11.161/012

All rights reserved. No part of this book may be reproduced or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, or otherwise, without the prior written permission of the author or the ELEC Department of the Vrije Universiteit Brussel.

Leaving the calm water of the linear sea for the more unpredictable waves of the nonlinear ocean.

Acknowledgements

Research and writing a PhD thesis is not something one can do all by himself. Therefore, I would like to thank everyone who contributed to this thesis in one or many ways.

First of all, I am grateful to Johan Schoukens and Rik Pintelon for giving me the opportunity to pursue my PhD degree at the ELEC department, and for introducing me to this fascinating research field. Without the guidance of the 'patjes' (Jojo, Rik and Yves) over the past four years, this work would have been a much tougher job.

I would also like to thank all my colleagues at the ELEC department for the stimulating work environment, the exchange of interesting ideas, and the relaxing talks. I am indebted to Lieve Lauwers, Rik Pintelon, Johan Schoukens, and Wendy Van Moer for proofreading (parts of) my thesis. A special word of thanks goes to Lieve, who has polished the rough edges of this text. Lieve, you have made my thesis much more pleasant to read.

Furthermore, I thank Tom Coen, Thomas Delwiche, Liesbeth Gommé, and Kris Smolders for letting me use their measurements. I very much appreciate your difficult and time-consuming experimental work.

Last but not least, I am grateful to my family and my lieve Lieve for their unconditional support and love.

Johan Paduart

Table of Contents

Operators and Notational Conventions
Symbols
Abbreviations

Chapter 1 Introduction
  1.1 What are Nonlinear Systems?
  1.2 Why build Nonlinear Models?
  1.3 A Framework for Nonlinear Modelling
    1.3.1 Approximation Criteria
    1.3.2 The Volterra-Wiener Theory
    1.3.3 Continuous-time versus Discrete-time
    1.3.4 Single Input, Single Output versus Multiple Input, Multiple Output
    1.3.5 What is not included in the Volterra Framework?
  1.4 Outline of the thesis
  1.5 Contributions
  1.6 Publication List

Chapter 2 The Best Linear Approximation
  2.1 Introduction
  2.2 Class of Excitation Signals
    2.2.1 Random Phase Multisine
    2.2.2 Gaussian Noise
  2.3 Properties of the Best Linear Approximation
    2.3.1 Single Input, Single Output Systems
    2.3.2 Multiple Input, Multiple Output Systems
  2.4 Some Properties of Nonlinear Systems
    2.4.1 Response to a Sine Wave
    2.4.2 Even and Odd Nonlinear Behaviour
    2.4.3 The Multisine as a Detection Tool for Nonlinearities
  2.5 Estimating the Best Linear Approximation
    2.5.1 Single Input, Single Output Systems
    2.5.2 Multiple Input, Multiple Output Systems
  Appendix 2.A Calculation of the FRF Covariance from the Input/Output Covariances
  Appendix 2.B Covariance of the FRF for Non Periodic Data

Chapter 3 Fast Measurement of Quantization Distortions
  3.1 Introduction
  3.2 The Multisine as a Detection Tool for Non-idealities
  3.3 DSP Errors
    3.3.1 Truncation Errors of the Filter Coefficients
    3.3.2 Finite Precision Distortion
    3.3.3 Finite Range Distortion
    3.3.4 Influence of the Implementation
  3.4 Quality Analysis of Audio Codecs
  3.5 Conclusion

Chapter 4 Identification of Nonlinear Feedback Systems
  4.1 Introduction
  4.2 Model Structure
  4.3 Estimation Procedure
    4.3.1 Best Linear Approximation
    4.3.2 Nonlinear Feedback
    4.3.3 Nonlinear Optimization
  4.4 Experimental Results
    4.4.1 Linear Model
    4.4.2 Estimation of the Nonlinear Feedback Coefficients
    4.4.3 Nonlinear Optimization
    4.4.4 Upsampling
  4.5 Conclusion
  Appendix 4.A Analytic Expressions for the Jacobian

Chapter 5 Nonlinear State Space Modelling of Multivariable Systems
  5.1 Introduction
  5.2 The Quest for a Good Model Structure
    5.2.1 Volterra Models
    5.2.2 NARX Approach
    5.2.3 State Space Models
  5.3 Polynomial Nonlinear State Space Models
    5.3.1 Multinomial Expansion Theorem
    5.3.2 Graded Lexicographic Order
    5.3.3 Approximation Behaviour
    5.3.4 Stability
    5.3.5 Some Remarks on the Polynomial Approach
  5.4 On the Equivalence with some Block-oriented Models
    5.4.1 Hammerstein
    5.4.2 Wiener
    5.4.3 Wiener-Hammerstein
    5.4.4 Nonlinear Feedback
    5.4.5 Conclusion
  5.5 A Step beyond the Volterra Framework
    5.5.1 Duffing Oscillator
    5.5.2 Lorenz Attractor
  5.6 Identification of the PNLSS Model
    5.6.1 Best Linear Approximation
    5.6.2 Frequency Domain Subspace Identification
    5.6.3 Nonlinear Optimization of the Linear Model
    5.6.4 Estimation of the Full Nonlinear Model
  Appendix 5.A Some Combinatorials
  Appendix 5.B Construction of the Subspace Weighting Matrix from the FRF Covariance
  Appendix 5.C Nonlinear Optimization Methods
  Appendix 5.D Explicit Expressions for the PNLSS Jacobian
  Appendix 5.E Computation of the Jacobian regarded as an alternative PNLSS system

Chapter 6 Applications of the Polynomial Nonlinear State Space Model
  6.1 Silverbox
    6.1.1 Description of the DUT
    6.1.2 Description of the Experiments
    6.1.3 Best Linear Approximation
    6.1.4 Nonlinear Model
    6.1.5 Comparison with Other Approaches
  6.2 Combine Harvester
    6.2.1 Description of the DUT
    6.2.2 Description of the Experiments
    6.2.3 Best Linear Approximation
    6.2.4 Nonlinear Model
  6.3 Semi-active Damper
    6.3.1 Description of the DUT
    6.3.2 Description of the Experiments
    6.3.3 Best Linear Approximation
    6.3.4 Nonlinear Model
  6.4 Quarter Car Set-up
    6.4.1 Description of the DUT
    6.4.2 Description of the Experiments
    6.4.3 Best Linear Approximation
    6.4.4 Nonlinear Model
  6.5 Robot Arm
    6.5.1 Description of the DUT
    6.5.2 Description of the Experiments
    6.5.3 Best Linear Approximation
    6.5.4 Nonlinear Model
  6.6 Wiener-Hammerstein
    6.6.1 Description of the DUT
    6.6.2 Description of the Experiments
    6.6.3 Level of Nonlinear Distortions
    6.6.4 Best Linear Approximation
    6.6.5 Nonlinear Model
    6.6.6 Comparison with a Block-oriented Approach
  6.7 Crystal Detector
    6.7.1 Description of the DUT
    6.7.2 Description of the Experiments
    6.7.3 Best Linear Approximation
    6.7.4 Nonlinear Model
    6.7.5 Comparison with a Block-oriented Approach

Chapter 7 Conclusions

References
Publication List

Operators and Notational Conventions

Outline (blackboard) upper case font denotes a set: ℕ, ℤ, ℝ, and ℂ are, respectively, the natural, the integer, the real, and the complex numbers.

⊗ : Kronecker matrix product
Re(x) : real part of x
Im(x) : imaginary part of x
arg min f(x) : the minimizing argument of f(x)
O(x) : an arbitrary function with the property lim_{x→0} |O(x)/x| < ∞
θ̂ : estimated value of θ
x̄ : complex conjugate of x
sub/superscript re : A_re = [Re(A); Im(A)], the real and imaginary parts of A stacked
subscript u : with respect to the input of the system
subscript y : with respect to the output of the system
x^(r) : vector which contains all the distinct nonlinear combinations of the elements of vector x, of exactly degree r
x^{r} : vector which contains all the distinct nonlinear combinations of the elements of vector x, from degree 2 up to r
superscript T : matrix transpose
superscript -T : transpose of the inverse matrix
superscript H : Hermitian transpose (complex conjugate transpose) of a matrix
superscript -H : Hermitian transpose of the inverse matrix
superscript + : Moore-Penrose pseudo-inverse
∠x : phase (argument) of the complex number x
A[:, j] : j-th column of A
A[i, :] : i-th row of A
κ(A) = max_i σ_i(A) / min_i σ_i(A) : condition number of an n × m matrix A
|x| = sqrt( Re(x)² + Im(x)² ) : magnitude of the complex number x
diag(A_1, A_2, …, A_K) : block diagonal matrix with blocks A_k, k = 1, 2, …, K
herm(A) = (A + A^H)/2 : Hermitian symmetric part of an n × m matrix A
rank(A) : rank of the n × m matrix A, i.e., the maximum number of linearly independent rows (columns) of A
vec(A) : column vector formed by stacking the columns of the matrix A on top of each other
E{·} : mathematical expectation
Cov(X, Y) : cross-covariance matrix of X and Y
var(x) : variance of x
C_X = Cov(X) = Cov(X, X) : covariance matrix of X
Ĉ_X : sample covariance matrix of X
C_XY = Cov(X, Y) : cross-covariance matrix of X and Y
Ĉ_XY : sample cross-covariance matrix of X and Y
DFT(x(t)) : Discrete Fourier Transform of the samples x(t), t = 0, 1, …, N-1
I_m : m × m identity matrix
0_{m×n} : m × n zero matrix
S_XX(jω) : auto-power spectrum of x(t)
S_XY(jω) : cross-power spectrum of x(t) and y(t)
x̄ : sample mean of x
µ_x = E{x} : mean value of x
σ_x² = var(x) : variance of x
σ̂_x² : sample variance of x
σ_xy² = covar(x, y) : covariance of x and y
σ̂_xy² : sample covariance of x and y

Symbols

C_BLA(k) : total covariance of the Best Linear Approximation (MIMO)
C_n(k) : covariance of the BLA due to the measurement noise (MIMO)
C_NL(k) : covariance of the BLA due to the stochastic nonlinear contributions (MIMO)
f : frequency
F : number of frequency domain data samples
f_s : sampling frequency
G(jω) : frequency response function
G_BLA(jω) : best linear approximation of a nonlinear plant
j : imaginary unit, j² = -1
k : frequency index
M : number of (repeated) experiments
N : number of time domain data samples
n_a, n_u, n_y : state dimension, input dimension, and output dimension
n_θ : dimension of the parameter vector θ
s : Laplace transform variable
s_k : Laplace transform variable evaluated along the imaginary axis at DFT frequency k: s_k = jω_k
t : continuous- or discrete-time variable
T_s : sampling period
U(k), Y(k) : discrete Fourier transform of the samples u(tT_s) and y(tT_s), t = 0, 1, …, N-1
U_k, Y_k : Fourier coefficients of the periodic signals u(t), y(t)
U(jω), Y(jω) : Fourier transform of u(t), y(t)
u(t), y(t) : input and output time signals
V_F(θ, Z) : cost function based on F measurements
z : Z-transform variable
z_k : Z-transform variable evaluated along the unit circle at DFT frequency k: z_k = e^{jω_k T_s} = e^{j2πk/N}
ε(θ, Z) : column vector of the model residuals (dimension F)
J(θ, Z) = ∂ε(θ, Z)/∂θ : gradient of the residuals ε(θ, Z) w.r.t. the parameters θ (dimension F × n_θ)
θ : column vector of the model parameters
ζ(t) : nonlinear vector map of the state equation
η(t) : nonlinear vector map of the output equation
ξ(t) : column vector that contains the stacked state and input vectors
σ²_BLA(k) : total variance of the Best Linear Approximation (SISO)
σ²_n(k) : variance of the BLA due to the measurement noise (SISO)
σ²_NL(k) : variance of the BLA due to the stochastic nonlinear contributions (SISO)
ω = 2πf : angular frequency

Abbreviations

BL : Band-Limited (measurement set-up)
BLA : Best Linear Approximation
DFT : Discrete Fourier Transform
DUT : Device Under Test
FIR : Finite Impulse Response
FFT : Fast Fourier Transform
FRF : Frequency Response Function
GN : Gaussian Noise
iid : independent identically distributed
IIR : Infinite Impulse Response
LS : Least Squares
LTI : Linear Time Invariant
MIMO : Multiple Input Multiple Output
MISO : Multiple Input Single Output
NARX : Nonlinear Auto Regressive with eXternal input (model)
NLS : Nonlinear Least Squares
NOE : Nonlinear Output Error (model)
pdf : probability density function
PID : Proportional-Integral-Derivative (controller)
PISPOT : Periodic Input, Same Periodic OuTput
PNLSS : Polynomial NonLinear State Space
PSD : Power Spectral Density
RF : Radio Frequency
RBF : Radial Basis Function
RMS : Root Mean Square (value)
RMSE : Root Mean Square Error
rpm : rotations per minute
RPM : Random Phase Multisine
SA : State Affine (model)
SISO : Single Input Single Output
SDR : Signal-to-Distortion Ratio
SNR : Signal-to-Noise Ratio
SVD : Singular Value Decomposition
w.p.1 : with probability one
WLS : Weighted Least Squares

CHAPTER 1

INTRODUCTION

Chapter 1: Introduction

1.1 What are Nonlinear Systems?

It is difficult, if not impossible, to give a conclusive definition of nonlinear systems. The famous remark by the mathematician Stan Ulam illustrates this [6]: "using a term like 'nonlinear science' is like referring to the bulk of zoology as the study of non-elephant animals". Nevertheless, the world around us is filled with nonlinear phenomena, and we are very familiar with some of these effects. Essentially, a system is nonlinear when the rule of three is not applicable to its behaviour.

Tax rating systems in Belgium, for instance, behave nonlinearly: the higher someone's gross salary gets, the higher his or her average tax rate becomes. Audio amplifiers are another good example of nonlinear systems: when their volume is turned up too eagerly, the signals they produce get clipped, and the music we hear becomes distorted instead of sounding louder. The weather system also behaves nonlinearly: slight perturbations of this system can lead to massive modifications after a long period of time. This is the so-called butterfly effect, and it explains why it is so hard to accurately predict the weather with a time horizon of more than a couple of days.

In some situations, nonlinear behaviour is a desired effect. Video and audio broadcasting, mobile telephony, and CMOS technology would simply be impossible without nonlinear devices such as transistors and mixers. Hence, it is important to understand and model their behaviour.

Finally, let us define, in a slightly more rigorous way, what nonlinear systems are. To this end, we start by defining a linear system. With zero initial conditions, the system T{.} is linear if it obeys the superposition principle and the scaling property

T{ αu_1(t) + βu_2(t) } = αT{ u_1(t) } + βT{ u_2(t) },   (1-1)

where u_1(t) and u_2(t) are two arbitrary input signals as a function of time, and α and β are two arbitrary scalars. When the superposition principle or the scaling property is not fulfilled, we call T{.} a nonlinear system. The most important implication of this open definition is that there exists no general nonlinear framework. That is why studying nonlinear systems is such a difficult task.
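The superposition-and-scaling test (1-1) can also be checked numerically on sampled signals. A minimal sketch: the FIR filter and the cubic system below are purely illustrative choices, and a single random failure of the test already proves nonlinearity.

```python
import numpy as np

def is_linear(T, n=64, tol=1e-9, seed=0):
    """Numerically test the superposition/scaling property (1-1)
    on a pair of random inputs and random scalars."""
    rng = np.random.default_rng(seed)
    u1, u2 = rng.standard_normal((2, n))
    alpha, beta = 0.7, -1.3
    lhs = T(alpha * u1 + beta * u2)
    rhs = alpha * T(u1) + beta * T(u2)
    return np.allclose(lhs, rhs, atol=tol)

# A moving-average (FIR) filter is linear; a cubic, clipping-like
# system is not.
fir = lambda u: np.convolve(u, [0.5, 0.3, 0.2])[:len(u)]
cubic = lambda u: u + 0.1 * u**3

print(is_linear(fir))    # True
print(is_linear(cubic))  # False
```

Passing the test for one input pair does not prove linearity, of course; it only fails to disprove it.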


1.2 Why build Nonlinear Models?

From the handful of examples listed in the previous section, it is clear that many real-life phenomena are nonlinear. Often, it is possible to use linear models to approximate their behaviour. This is an attractive idea, because the linear framework is well established. Furthermore, linear models are easy to interpret and to understand, and building them usually requires significantly less effort than estimating nonlinear models. Unfortunately, linear approximations are only valid for a given input range. Hence, there has been a tendency towards nonlinear modelling in various application fields during the last decades. Technological innovations have resulted in fewer limitations at the computational, memory, and data-acquisition level, making nonlinear modelling a more feasible option.

In order to build models for the studied nonlinear devices, we will employ system identification methods. Classic text books on system identification are [75], [38], and [56]. The basic goal of system identification is to identify mathematical models from the available input/output data. This is often achieved via the minimization of a cost function embedded in a statistical framework (Figure 1-1). An excellent starting point for nonlinear modelling is [70]. Other reference works on nonlinear systems and nonlinear modelling include [60], [59], [7], [5], [33], [77], and [80].

In this thesis, we focus on the estimation of so-called simulation models: from measured input/output data, we estimate a model that, given a new input data set, simulates the output as well as possible. Such models can, for instance, be used to replace expensive experiments by cheap computer simulations. The major difference with prediction error modelling is that no past measured outputs are used to predict new output values.

As mentioned before, there is no general nonlinear framework. However, there exists a class of nonlinear systems that has been studied intensively in the past, and which covers a broad spectrum of 'nice' nonlinear behaviour: the class of Wiener systems. This class will be used as a starting point in this thesis; it stems from the Volterra-Wiener theory, which is briefly explained in what follows.

Figure 1-1. Basic idea of system identification: the cost function relates data and model.
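As a concrete illustration of the loop in Figure 1-1, the sketch below identifies a simple FIR model by minimizing a least-squares cost; the 3-tap system, the noise level, and the record length are arbitrary illustrative choices, not taken from the thesis.

```python
import numpy as np

# Least-squares identification of a 3-tap FIR model from input/output
# data: minimize V(theta) = sum_t ( y(t) - Phi[t, :] @ theta )^2.
rng = np.random.default_rng(1)
u = rng.standard_normal(500)
h_true = np.array([1.0, -0.4, 0.15])                   # "unknown" system
y = np.convolve(u, h_true)[:len(u)] + 0.01 * rng.standard_normal(len(u))

# Regression matrix: row t holds u(t), u(t-1), u(t-2), for t = 2 ... N-1.
Phi = np.column_stack([u[2:], u[1:-1], u[:-2]])
theta, *_ = np.linalg.lstsq(Phi, y[2:], rcond=None)
print(np.round(theta, 3))  # close to h_true
```

The cost function is what ties the measured data to the model parameters: a different cost (weighted, robust, frequency domain) yields a different estimator on the same data.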


1.3 A Framework for Nonlinear Modelling

1.3.1 Approximation Criteria

In this thesis, we will use models to approximate the behaviour of nonlinear systems. This requires an approximation quality measure. For this, two principal convergence criteria can be employed: convergence in mean square sense and uniform convergence.

Definition 1.1 (Convergence in the Mean) The model f̂(u, θ) converges in mean square sense to a system f(u) if, for all ε > 0, there exists an M_ε independent of θ such that for all M > M_ε,

∃θ_M : ∀u ∈ 𝕌, E{ |f(u) - f̂(u, θ_M)|² } < ε,   with M = dim(θ),

where the expected value is taken over the class of excitation signals 𝕌.

Definition 1.2 (Uniform Convergence) The model f̂(u, θ) converges uniformly to a system f(u) if, for all ε > 0, there exists an M_ε independent of θ and u such that for all M > M_ε,

∃θ_M : ∀u ∈ 𝕌, |f(u) - f̂(u, θ_M)| < ε,   with M = dim(θ).

Note that uniform convergence is a stronger result than convergence in mean square sense, but the latter is often easier to obtain.
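Convergence in mean square sense can be illustrated numerically: below, polynomial models of growing dimension M approximate a saturation nonlinearity over a class of Gaussian excitations, and the mean square error shrinks as M grows. The tanh nonlinearity and the chosen degrees are illustrative, not taken from the thesis; because the models are nested, the least-squares error cannot increase with the degree.

```python
import numpy as np

# f(u) = tanh(u) approximated by polynomial models f_hat(u, theta)
# of dimension M = deg + 1, fitted in least squares over Gaussian u.
rng = np.random.default_rng(0)
u = rng.standard_normal(10_000)
f = np.tanh(u)

mse = []
for deg in (1, 3, 5, 7):
    theta = np.polyfit(u, f, deg)                    # LS fit, M = deg + 1
    mse.append(np.mean((f - np.polyval(theta, u)) ** 2))

print(mse)  # non-increasing as M grows
```

Uniform convergence over all of ℝ would fail here (a polynomial diverges while tanh saturates), which is exactly why the weaker mean-square criterion, taken over a signal class, is the workable one.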

1.3.2 The Volterra-Wiener Theory In order to set up a rigorous framework for the following chapter, we consider a particular class of nonlinear systems. A classical approach is to make use of the Volterra-Wiener theory, which is thoroughly described in [59] and [60]. A short overview of the results that will serve in the rest of this thesis is given here.


Volterra series can be seen as the dynamic extension of power series, and they are defined as the (infinite) sum of Volterra operators H_n. For an input u(t) as a function of time, the output y(t) of the series is given by

y(t) = Σ_{n=1}^{∞} H_n[u(t)].   (1-2)

The n-th order continuous-time Volterra operator H_n is defined as

H_n[u(t)] = ∫_0^∞ … ∫_0^∞ h_n(τ_1, …, τ_n) u(t-τ_1) … u(t-τ_n) dτ_1 … dτ_n,   (1-3)

where h_n(τ_1, …, τ_n) is the n-th order Volterra kernel. For a first order system (n = 1), equation (1-3) reduces to the well-known relation

H_1[u(t)] = ∫_0^∞ h(τ) u(t-τ) dτ,   (1-4)

which is the convolution representation of a linear system with an impulse response h(τ). When the kernel h_n is causal, it is zero for any negative argument:

h_n(τ_1, …, τ_n) = 0   for τ_i < 0, i = 1, …, n.   (1-5)

Because we restrict ourselves to causal systems, the lower integral limits in (1-3) and (1-4) are set equal to zero.

Volterra series can be used to approximate the behaviour of a certain class of nonlinear systems. However, they can suffer from severe convergence problems, which is a common phenomenon for power series. This can occur for example in the presence of discontinuities like hard clipping or dead-zones.

To overcome this difficulty, Wiener introduced the Wiener-G functionals [60], which are Volterra functionals orthogonalized with respect to white Gaussian input signals. Note that the Wiener-G functionals are only required to solve numerical issues. Hence, what follows holds for both Volterra and Wiener-G functionals. The Wiener theory states that any nonlinear system f satisfying a number of conditions can be represented arbitrarily well in mean square sense by Volterra/Wiener-G functionals. The restrictions on system f are [60]:


1. f is not explosive; in other words, the system's response to a bounded input sequence is finite;
2. f has a finite memory, i.e., the present output becomes asymptotically independent of the past values of the input;
3. f is causal and time-invariant.

The set of systems W satisfying these conditions is known as the class of Wiener systems. When approximating f by a Volterra/Wiener-G functional f̂, the mean square convergence is guaranteed over a finite time interval, with respect to the class of white Gaussian input signals 𝔾:

E{ [f(u(t)) - f̂(u(t))]² } < ε,   t ∈ [0, T], f ∈ W, ∀u ∈ 𝔾.   (1-6)

Boyd and Chua achieved even more powerful results with Volterra series via the introduction of a concept called Fading Memory [5].

Definition 1.3 (Fading Memory) f has Fading Memory on a subset K of a compact set if there is a decreasing function w : ℝ⁺ → (0, 1], with lim_{t→∞} w(t) = 0, such that for each u_1 ∈ K and ε > 0 there is a δ > 0 such that for all u_2 ∈ K:

sup_{t ≤ 0} w(-t) |u_1(t) - u_2(t)| < δ  ⇒  |f(u_1(t)) - f(u_2(t))| < ε.   (1-7)

Loosely explained, an operator has Fading Memory when two input signals that are close to each other in the recent past, but not necessarily in the remote past, yield present output signals that are close. This strengthened continuity requirement on f makes it possible to obtain more powerful approximation results with Volterra series. Boyd and Chua have proved the uniform convergence of finite (n < ∞) Volterra series to any continuous-time Fading Memory system for the class of input signals with bounded amplitude and bounded slew rate, without any restriction on the time interval (T → ∞).


1.3.3 Continuous-time versus Discrete-time

Until now, we have only considered continuous-time Volterra series. However, this thesis mainly deals with the identification of discrete-time models. The discrete-time version of the n-th order causal Volterra operator is defined as

H_n[u(t)] = Σ_{τ_1=0}^{∞} … Σ_{τ_n=0}^{∞} h_n(τ_1, …, τ_n) u(t-τ_1) … u(t-τ_n),   (1-8)

where h_n(τ_1, …, τ_n) denotes the n-th order, discrete-time Volterra kernel. In [5], the approximation properties of the Volterra series are shown under the Fading Memory assumption for discrete-time systems. The only difference with the continuous-time case is that the slew rate of the input signal does not need to be bounded.
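A finite-memory truncation of (1-8), restricted to orders one and two, can be evaluated directly; the memory length and the kernels below are illustrative choices, not taken from the thesis.

```python
import numpy as np

def volterra2(u, h1, h2):
    """Output of a second-order, finite-memory Volterra model:
    y(t) = sum_tau h1(tau) u(t-tau)
         + sum_{tau1,tau2} h2(tau1,tau2) u(t-tau1) u(t-tau2)."""
    N, m = len(u), len(h1)
    up = np.concatenate([np.zeros(m - 1), u])    # zero initial conditions
    # Shifted-input matrix: column k holds u(t-k), t = 0 ... N-1.
    U = np.column_stack([up[m - 1 - k : m - 1 - k + N] for k in range(m)])
    y1 = U @ h1                                  # first-order term
    y2 = np.einsum('ti,ij,tj->t', U, h2, U)      # second-order term
    return y1 + y2

m = 4
h1 = 0.5 ** np.arange(m)                         # decaying linear kernel
h2 = 0.05 * np.outer(h1, h1)                     # rank-one quadratic kernel
u = np.sin(2 * np.pi * 0.05 * np.arange(50))
y = volterra2(u, h1, h2)
```

With h2 set to zero the model collapses to the convolution (1-4), which provides a simple sanity check on the implementation.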

1.3.4 Single Input, Single Output versus Multiple Input, Multiple Output

So far, only SISO (Single Input, Single Output) systems were considered, but in Chapter 5 we will deal with MIMO (Multiple Input, Multiple Output) systems as well. MIMO Volterra models with n_y outputs are defined as n_y separate MISO Volterra series. In the following, the analysis is only pursued for discrete-time systems. For notational simplicity, we define u(t) as the vector of the n_u assembled inputs at time instance t:

u(t) = [u_1(t), …, u_{n_u}(t)]^T.   (1-9)

A MISO Volterra series is defined as the sum of Volterra functionals H_n:

y(t) = Σ_{n=0}^{∞} H_n[u(t)].   (1-10)

Next, consider the n-th term of (1-10). The n-th order, discrete-time MISO Volterra functional is defined as




H_n[u(t)] = Σ_{j_1, …, j_n} H_n^{j_1, …, j_n}[u(t)],   (1-11)

where the j_i are input indices between 1 and n_u, and

H_n^{j_1, …, j_n}[u(t)] = Σ_{τ_1=0}^{∞} … Σ_{τ_n=0}^{∞} h^{j_1, …, j_n}(τ_1, …, τ_n) u_{j_1}(t-τ_1) … u_{j_n}(t-τ_n).   (1-12)

To determine the number of distinct operators H_n^{j_1, …, j_n} of order n, we need to apply some combinatorics. In Appendix 5.A, the same problem is solved in a different context, but the idea remains the same. Hence, we distinguish

( (n_u + n - 1) choose n ) = (n + n_u - 1)! / ( n! (n_u - 1)! )   (1-13)

different terms.
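Equation (1-13) counts the multisets of size n drawn from n_u input indices, which is easy to cross-check by explicit enumeration (the values of n_u and n below are arbitrary):

```python
from itertools import combinations_with_replacement
from math import comb, factorial

def n_terms(n_u, n):
    """Number of distinct order-n kernels for n_u inputs, eq. (1-13)."""
    return factorial(n + n_u - 1) // (factorial(n) * factorial(n_u - 1))

# Cross-check against the binomial form and against an explicit
# enumeration of the index multisets {j_1, ..., j_n}.
n_u, n = 3, 4
assert n_terms(n_u, n) == comb(n_u + n - 1, n)
assert n_terms(n_u, n) == sum(
    1 for _ in combinations_with_replacement(range(n_u), n))
print(n_terms(n_u, n))  # 15
```

The count grows quickly with both n_u and n, which is one reason full MIMO Volterra models become unwieldy at higher orders.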

1.3.5 What is not included in the Volterra Framework?

Since Volterra series are open-loop models, they cannot represent a number of closed-loop phenomena. Bearing the negative definition of nonlinear systems in mind, it is impossible to give an exhaustive list of the systems that cannot be approximated by Volterra models. However, it is possible to list a few examples.

• No Subharmonic Generation

In [60], it was shown that the steady-state response of a Volterra series to a harmonic input is harmonic, and has the same period as the input. Hence, systems that generate subharmonics are excluded from the Volterra-Wiener framework. For this reason, we sometimes say that Wiener systems are PISPOT (Periodic Input, Same Periodic OuTput) systems. An example of subharmonic generation is given in the "Duffing Oscillator" on p. 111.


• No Chaotic Behaviour

Chaos is typically the result of a nonlinear dynamic system whose output depends extremely sensitively on the initial conditions. Such behaviour conflicts with the finite memory requirement of the Wiener class: the present output does not become asymptotically independent of the past. An example of chaotic behaviour is given in "Lorenz Attractor" on p. 113.

• No Multiple-valued Output

Volterra series are a single-valued output representation. Hence, they cannot represent systems that exhibit output multiplicity, like for instance hysteresis.
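The PISPOT property mentioned in the first example can be verified numerically for a simple system inside the Wiener class: the response to a periodic input carries energy only at harmonics of the input frequency, never at subharmonics. The static cubic nonlinearity and the signal parameters are illustrative choices.

```python
import numpy as np

# A static nonlinearity driven by a single sine produces spectral
# content only at integer multiples of the input frequency bin.
N, k0 = 256, 8                       # record length, input frequency bin
t = np.arange(N)
u = np.sin(2 * np.pi * k0 * t / N)
y = u + 0.5 * u**3                   # static (Wiener-class) nonlinearity

Y = np.abs(np.fft.rfft(y))
active = np.flatnonzero(Y > 1e-8 * Y.max())
print(active)                        # only multiples of k0
```

A system generating subharmonics would show energy at bins that are not multiples of k0, which immediately places it outside the Volterra-Wiener framework.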


1.4 Outline of the thesis

All the work in this thesis relies on the concept of the Best Linear Approximation (BLA). Therefore, in Chapter 2 the BLA is first introduced in an intuitive way, and then rigorously defined for SISO and MIMO nonlinear systems. Furthermore, some interesting properties of multisine excitation signals with respect to the qualification and quantification of nonlinear behaviour are rehearsed. Finally, we explain how the BLA should be estimated, for both non-periodic and periodic input/output data. As will become clear in Chapter 2, periodic excitations are preferred, since in that case more information can be extracted from the Device Under Test.

The tools described in Chapter 2 are applied to a number of Digital Signal Processing (DSP) algorithms in Chapter 3. A measurement technique is proposed to characterize the non-idealities of DSP algorithms which are induced by quantization effects, overflows, or other nonlinear effects. The main idea is to apply specially designed excitations such that a distinction can be made between the output of the ideal system and the contributions of the system's non-idealities. The proposed method is applied to digital filtering and to an audio compression codec.

In Chapter 4, an identification procedure is presented for a specific kind of block-oriented model: the Nonlinear Feedback model. By estimating the Best Linear Approximation of the system and by rearranging the model's structure, the identification of the feedback model parameters is reduced to a linear problem. The numerical parameter values obtained by solving the linear problem are then used as starting values for a nonlinear optimization procedure. The proposed method is illustrated on measurements obtained from a physical system.

Chapter 5 introduces the Polynomial Nonlinear State Space (PNLSS) model and studies its approximation capabilities. Next, a link is established between this model and a number of classical block-oriented models, such as Hammerstein and Wiener models. Furthermore, by means of two simple examples, we illustrate that the proposed model class is broader than the Volterra framework. In the last part of Chapter 5, a general identification procedure is presented which utilizes the Best Linear Approximation of the nonlinear system. Next, frequency domain subspace identification is employed to initialize the PNLSS model. The identification of the full PNLSS model is then regarded as a nonlinear optimization problem.


1.5 Contributions

The main goal of this thesis is to study and design tools which allow the practicing engineer to qualify, to understand, and to model nonlinear systems. In this context, the contributions of this thesis are:

• The characterization of DSP systems/algorithms via the Best Linear Approximation and multisine excitation signals.

• A method to generate starting values for a block-oriented, Nonlinear Feedback model with a static nonlinearity in the feedback loop.

• A method that initializes the Polynomial NonLinear State Space (PNLSS) model by means of the BLA of the Device Under Test.

• The establishment of a link between the PNLSS model structure and five classical blockoriented models.

• The application of the proposed identification method to several real-life measurement problems.


1.6 Publication List

Chapter 3 was published as

• J. Paduart, J. Schoukens, Y. Rolain. Fast Measurement of Quantization Distortions in DSP Algorithms. IEEE Transactions on Instrumentation and Measurement, vol. 56, no. 5, pp. 1917-1923, 2007. The major part of Chapter 4 was presented at the Nolcos 2004 conference:

• J. Paduart, J. Schoukens. Fast Identification of systems with nonlinear feedback. Proceedings of the 6th IFAC Symposium on Nonlinear Control Systems, Stuttgart, Germany, pp. 525-529, 2004. The comparative study between the PNLSS model and the block-oriented models from Chapter 5 was presented at IMTC 2007:

• J. Paduart, J. Schoukens, L. Gommé. On the Equivalence between some Block-oriented Nonlinear Models and the Nonlinear Polynomial State Space Model. Proceedings of the IEEE Instrumentation and Measurement Technology Conference, Warsaw, Poland, pp. 1-6, 2007. The identification of the PNLSS model and its application to two real-life set-ups was presented at the SYSID 2006 conference:

• J. Paduart, J. Schoukens, R. Pintelon, T. Coen. Nonlinear State Space Modelling of Multivariable Systems. Proceedings of the 14th IFAC Symposium on System Identification, Newcastle, Australia, pp. 565-569, 2006. The application of the PNLSS model to a quarter car set-up led to the following publication:


• J. Paduart, J. Schoukens, K. Smolders, J. Swevers. Comparison of two different nonlinear state-space identification algorithms. Proceedings of the International Conference on Noise and Vibration Engineering, Leuven, Belgium, pp. 2777-2784, 2006.

Finally, the cooperation with colleagues from the ELEC Department (Vrije Universiteit Brussel) and the PMA and BIOSYST-MeBioS Departments (KULeuven) resulted in the following publications:

• J. Schoukens, J. Swevers, J. Paduart, D. Vaes, K. Smolders, R. Pintelon. Initial estimates for block structured nonlinear systems with feedback. Proceedings of the International Symposium on Nonlinear Theory and its Applications, Brugge, Belgium, pp. 622-625, 2005.

• J. Schoukens, R. Pintelon, J. Paduart, G. Vandersteen. Nonparametric Initial Estimates for Wiener-Hammerstein systems. Proceedings of the 14th IFAC Symposium on System Identification, Newcastle, Australia, pp. 778-783, 2006.

• T. Coen, J. Paduart, J. Anthonis, J. Schoukens, J. De Baerdemaeker. Nonlinear system identification on a combine harvester. Proceedings of the American Control Conference, Minneapolis, Minnesota, USA, pp. 3074-3079, 2006.


CHAPTER 2

THE BEST LINEAR APPROXIMATION

In this chapter, we introduce the Best Linear Approximation in an intuitive way. Next, the excitation signals used throughout this thesis are presented, followed by a formal definition of the Best Linear Approximation. We then explain how the properties of a multisine excitation signal can be exploited to quantify and qualify the nonlinear behaviour of a system. Finally, we show how the Best Linear Approximation of a nonlinear system can be obtained.



2.1 Introduction

As was shown in the introductory chapter, linear models have many attractive properties. Therefore, it can be useful to approximate nonlinear systems by linear models. Since we are dealing with an approximation, model errors will be present. Hence, a framework needs to be selected in order to decide in which sense the approximate linear model is optimal. We will use a classical approach and minimize the errors in mean square sense.

Definition 2.1 (Best Linear Approximation) The Best Linear Approximation (BLA) is defined as the model $G$ belonging to the set of linear models $\mathcal{G}$, such that

$$G_{\mathrm{BLA}} = \arg\min_{G \in \mathcal{G}} \; \mathcal{E}\left\{ \left| y(t) - G(u(t)) \right|^2 \right\}, \qquad (2\text{-}1)$$

where $u(t)$ and $y(t)$ are the input and output of the nonlinear system, respectively. In general, the Best Linear Approximation $G_{\mathrm{BLA}}$ of a nonlinear system depends on the amplitude distribution, the power spectrum, and the higher order moments of the stochastic input $u(t)$ [56],[19],[21]. The amplitude dependency is illustrated by means of a short simulation example.

Example 2.2 Consider the following static nonlinear system:

$$y = \tanh(u), \qquad (2\text{-}2)$$

and three white excitation signals drawn from different distributions: a uniform, a Gaussian, and a binary distribution (Figure 2-1 (a)). The parameters of these distributions are chosen such that their variance is equal to one. Figure 2-1 (b) shows the BLA (grey) for the three distributions, together with the static nonlinearity (black). In this set-up, the BLA is a straight line through the origin. Note that in general, a static nonlinearity does not necessarily have a static BLA [19]. It can be seen from Figure 2-1 (b) that the slope $g$ of the BLA changes from distribution to distribution.

From this example, it is clear that the properties of the input are of paramount importance when the BLA of a nonlinear system is determined. That is why we will start by discussing the class of excitation signals used throughout this thesis. Next, a formal definition of the Best


Linear Approximation is given. Then, we will demonstrate how multisine signals can be used to quantify and qualify the nonlinear behaviour of the Device Under Test (DUT). Finally, we will show how the BLA can be determined for SISO and MIMO systems.

Figure 2-1. (a) Three different probability density functions; (b) Static nonlinearity (black) and BLA (grey), with slopes $g = 0.67$ (uniform), $g = 0.61$ (Gaussian), and $g = 0.76$ (binary).
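The three slopes can be checked numerically. Below is a minimal NumPy sketch (our own illustration, not code from the thesis) that estimates the BLA slope as the least-squares gain $\mathcal{E}\{uy\}/\mathcal{E}\{u^2\}$ for each unit-variance input distribution:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1_000_000  # samples per Monte Carlo experiment

# Three white, unit-variance input distributions, as in Example 2.2.
inputs = {
    "uniform":  rng.uniform(-np.sqrt(3), np.sqrt(3), n),
    "gaussian": rng.standard_normal(n),
    "binary":   rng.choice([-1.0, 1.0], n),
}

for name, u in inputs.items():
    y = np.tanh(u)                       # static nonlinearity (2-2)
    g = np.mean(u * y) / np.mean(u**2)   # BLA slope through the origin
    print(f"{name:8s} g = {g:.2f}")
```

For the binary input the slope is exactly $\tanh(1) \approx 0.76$; for the Gaussian input it equals $\mathcal{E}\{\operatorname{sech}^2(u)\} \approx 0.61$ by Stein's lemma.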



2.2 Class of Excitation Signals

Since the Best Linear Approximation of a nonlinear system depends on the properties of the applied input signal, it is important to define the kind of excitations that will be employed, and to discuss their properties. In this thesis, we will utilize the class of Gaussian excitation signals $\mathbb{E}$ with a user-defined power spectrum. Furthermore, it is required that the signals are stationary such that their power spectrum is well defined. Three excitation signals that are commonly used belong to $\mathbb{E}$: Gaussian noise, Gaussian periodic noise, and random phase multisines (Figure 2-2). For periodic noise and random phase multisines, this membership is only asymptotic, i.e., for the number of excited frequency components going to infinity ($N \to \infty$). We will restrict ourselves to Gaussian noise and random phase multisines, and give a definition and a brief overview of some of the properties of these signals.

Figure 2-2. Class of excitation signals: Gaussian signals, comprising Gaussian noise, Gaussian periodic noise*, and random phase multisines*. *: asymptotic result, for the number of excited frequency components going to infinity.

2.2.1 Random Phase Multisine

Definition 2.3 (Random Phase Multisine) A random phase multisine is a periodic signal, defined as a sum of harmonically related sine waves:

$$u(t) = \frac{1}{\sqrt{N}} \sum_{k=-N}^{N} U_k \, e^{j\left(2\pi f_{\max} \frac{k}{N} t + \varphi_k\right)} \qquad (2\text{-}3)$$

with $\varphi_{-k} = -\varphi_k$, $U_k = U_{-k} = U\!\left(\frac{k f_{\max}}{N}\right)$, and $f_{\max}$ the maximum frequency of the excitation signal. The amplitudes $U(f)$ are chosen according to the user-defined power spectrum that should be realized. The phases $\varphi_k$ are realizations of an independently distributed random process such that $\mathcal{E}\{e^{j\varphi_k}\} = 0$.

The factor $1/\sqrt{N}$ serves as a normalization such that, asymptotically ($N \to \infty$), the power of the multisine remains finite, and its Root Mean Square (RMS) value stays constant as $N$ increases. A typical choice is to take $\varphi_k$ uniformly distributed over $[0, 2\pi)$, but discrete phase distributions can be used as well, as long as $\mathcal{E}\{e^{j\varphi_k}\} = 0$ holds. Note that the random phase multisine is asymptotically normally distributed ($N \to \infty$); in practice, 20 excited lines already work very well for smoothly varying amplitude distributions [56],[62].

Next, we illustrate some of the properties of a random phase multisine signal with a short example.

Example 2.4 We consider a random phase multisine with $N = 128$, a flat power spectrum, and $f_{\max} = 0.25$ Hz. Figure 2-3 (a) shows the histogram of this signal together with a theoretic Gaussian pdf having the same variance and expected value. Figure 2-3 (b) and (c) show the multisine in the time and frequency domain, respectively. From these plots, we see that the random phase multisine has a noisy behaviour in the time domain, and perfectly realizes the user-defined amplitude spectrum in the frequency domain.

The main advantage of random phase multisines is that their periodicity can be exploited to distinguish the measurement noise from the nonlinear distortions [15]; we will discuss this property in detail further in this chapter. A drawback, common to periodic excitation signals, is the need to introduce a settling time for the transients.

Figure 2-3. Some properties of a Random Odd, Random Phase Multisine: (a) Histogram (black) and theoretic Gaussian pdf (grey), (b) Time domain and (c) Frequency domain representation (DFT spectrum).
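A flat-spectrum random phase multisine can be synthesized directly from Definition 2.3. The sketch below (our own illustration; the 1 Hz sampling rate and function name are assumptions) also verifies that the RMS value is independent of the number of excited lines $N$:

```python
import numpy as np

def random_phase_multisine(N=128, f_max=0.25, rng=None):
    """One period of u(t) = (2A/sqrt(N)) * sum_k cos(2*pi*f_max*(k/N)*t + phi_k), A = 1."""
    rng = rng or np.random.default_rng(0)
    T = N / f_max                            # one period of the frequency grid f_max/N
    t = np.arange(int(T))                    # 1 Hz sampling (assumed)
    phi = rng.uniform(0.0, 2.0 * np.pi, N)   # random phases, E{e^{j phi_k}} = 0
    k = np.arange(1, N + 1)
    u = (2.0 / np.sqrt(N)) * np.cos(
        2.0 * np.pi * f_max * np.outer(t, k / N) + phi).sum(axis=1)
    return t, u

for N in (16, 128):
    _, u = random_phase_multisine(N)
    print(N, round(float(u.std()), 3))       # RMS stays A*sqrt(2) for every N
```

Because each harmonic completes an integer number of periods on the time grid, the cosines are exactly orthogonal and the RMS value is $\sqrt{2}$ regardless of $N$ and of the phase realization.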



2.2.2 Gaussian Noise

Definition 2.5 (Gaussian Noise) A Gaussian noise signal is a random sequence drawn from a Gaussian distribution with a user-defined power spectral density.

Example 2.6 An example of a Gaussian noise signal is shown in Figure 2-4 (b). To generate this sequence, a signal of $N = 128$ samples was drawn from a normal distribution. In order to achieve the same bandwidth as the random phase multisine from Figure 2-3, the signal was filtered using a 6th order Butterworth filter with a cut-off frequency of 0.25 Hz. Finally, to obtain the same RMS value as for the multisine example, the amplitude of the filtered sequence was normalized. Figure 2-4 (a) shows the histogram of this signal together with a theoretic Gaussian pdf. In Figure 2-4 (c), the DFT spectrum of the sequence is plotted.

The DFT spectrum of the Gaussian noise contains dips that can lead to unfavourable results such as a low SNR at some frequencies. Two more disadvantages are associated with the non periodic nature of random Gaussian noise. First of all, no distinction can be made between the measurement noise and the nonlinear distortions using simple tools. Secondly, leakage errors are present when computing the DFT of this signal. Note that when comparing Figure 2-3 and Figure 2-4, it is impossible to distinguish a random phase multisine from a random Gaussian noise sequence based on their histogram and time domain waveform.

Figure 2-4. Some properties of filtered random Gaussian Noise: (a) Histogram (black) and Gaussian pdf (grey), (b) Time domain and (c) Frequency domain representation (DFT spectrum).
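The sequence of Example 2.6 can be reproduced along the following lines. This is a sketch under stated assumptions (sampling rate 1 Hz, so the 0.25 Hz cut-off is half the Nyquist frequency), using SciPy's Butterworth design:

```python
import numpy as np
from scipy.signal import butter, lfilter

rng = np.random.default_rng(1)
fs = 1.0                            # assumed sampling rate; Nyquist = 0.5 Hz
e = rng.standard_normal(128)        # white Gaussian sequence, N = 128

b, a = butter(6, 0.25 / (fs / 2))   # 6th-order Butterworth low-pass, 0.25 Hz
u = lfilter(b, a, e)                # coloured (band-limited) Gaussian noise
u *= np.sqrt(2.0) / u.std()         # normalize to the multisine RMS value
print(round(float(u.std()), 3))     # 1.414
```

Unlike the multisine, the DFT of this realization will show random dips in the excited band, which is the low-SNR drawback mentioned above.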



2.3 Properties of the Best Linear Approximation

2.3.1 Single Input, Single Output Systems

Consider a noiseless Single Input, Single Output (SISO) nonlinear system $S$ with an input $u$ and an output $y$ (see Figure 2-5 (a)). We make the following assumption on $S$.

Assumption 2.7 There exists a uniformly bounded Volterra series of which the output converges in mean square sense to the output $y$ of $S$ for $u \in \mathbb{E}$. These systems are also called Wiener, or PISPOT systems. This class includes discontinuities like quantizers or relays, and excludes chaotic behaviour or systems with bifurcations.

Theorem 2.8 If $S$ satisfies Assumption 2.7, it can be modelled as the sum of a linear system $G_{\mathrm{BLA}}(j\omega)$, called the Best Linear Approximation, and a noise source $y_s$. The Best Linear Approximation is calculated as

$$G_{\mathrm{BLA}}(j\omega) = \frac{S_{yu}(j\omega)}{S_{uu}(j\omega)}, \qquad (2\text{-}4)$$

where $S_{uu}(j\omega)$ is the auto-power spectrum of the input, and $S_{yu}(j\omega)$ the cross-power spectrum between the output and the input.

Relation (2-4) is obtained by calculating the Fourier transform of the Wiener-Hopf equation, which in its turn follows from equation (2-1) (see for instance [24] for this classic result). Note that by using equation (2-4) no causality is imposed on $G_{\mathrm{BLA}}(j\omega)$. In [53], it was shown that the BLAs for Gaussian noise and for random phase multisines with an equivalent power spectrum are asymptotically identical. The BLA for random phase multisines converges to $G_{\mathrm{BLA}}(j\omega)$ as the number of frequency components $N$ goes to infinity:

$$G_{\mathrm{BLA},N}(j\omega) = G_{\mathrm{BLA}}(j\omega) + O(N^{-1}) \qquad (2\text{-}5)$$

The Best Linear Approximation for a nonlinear SISO system is illustrated in Figure 2-5 (b). The noise source $y_s$ represents that part of the output $y$ that cannot be captured by the linear model $G_{\mathrm{BLA}}(j\omega)$. Hence, for frequency $\omega_k$ we have that

$$Y(j\omega_k) = G_{\mathrm{BLA}}(j\omega_k) U(j\omega_k) + Y_s(j\omega_k). \qquad (2\text{-}6)$$

$Y_s(j\omega_k)$ depends on the particular input realization and exhibits a stochastic behaviour from realization to realization, with $\mathcal{E}\{Y_s(j\omega_k)\} = 0$. Hence, $G_{\mathrm{BLA}}(j\omega)$ can be determined by averaging the system's response over several input realizations.

Figure 2-5. (a) SISO nonlinear system vs. (b) its alternative representation.

2.3.2 Multiple Input, Multiple Output Systems

In [17], the Best Linear Approximation was extended to a Multiple Input, Multiple Output (MIMO) framework. Consider a noiseless nonlinear system $S$ with inputs $u_i$ ($i = 1, \ldots, n_u$) and outputs $y_j$ ($j = 1, \ldots, n_y$). As in the SISO case, $S$ needs to fulfil some conditions in order to define the Best Linear Approximation.

Assumption 2.9 For all $i, j$, there exists a uniformly bounded MIMO Volterra series of which the outputs converge in mean square sense to the outputs $y_j$ of $S$ for $u_i \in \mathbb{E}$.



Theorem 2.10 If $S$ satisfies Assumption 2.9, it can be modelled as the sum of a linear system $G_{\mathrm{BLA}}(j\omega)$ and $n_y$ noise sources $y_s^{(j)}$. The Best Linear Approximation of $S$ is calculated as

$$G_{\mathrm{BLA}}(j\omega) = S_{yu}(j\omega) \, S_{uu}^{-1}(j\omega), \qquad (2\text{-}7)$$

where $S_{uu}(j\omega) \in \mathbb{C}^{n_u \times n_u}$ is the auto-power spectrum of the inputs, and $S_{yu}(j\omega) \in \mathbb{C}^{n_y \times n_u}$ is the cross-power spectrum between the outputs and the inputs.

Also for MIMO systems, it was proven that the Best Linear Approximation for Gaussian noise and random phase multisines is asymptotically equivalent [17]. Figure 2-6 (a) shows a nonlinear MIMO system with $n_u$ inputs and $n_y$ outputs. In Figure 2-6 (b), the alternative representation of this system is given: the Best Linear Approximation $G_{\mathrm{BLA}}(j\omega)$ together with the $n_y$ stochastic nonlinear noise sources $Y_s^{(j)}(j\omega_k)$. For frequency $\omega_k$, the following equation holds:

$$Y(j\omega_k) = G_{\mathrm{BLA}}(j\omega_k) U(j\omega_k) + Y_s(j\omega_k), \qquad (2\text{-}8)$$

with $Y(j\omega_k) \in \mathbb{C}^{n_y \times 1}$, $G_{\mathrm{BLA}}(j\omega_k) \in \mathbb{C}^{n_y \times n_u}$, $U(j\omega_k) \in \mathbb{C}^{n_u \times 1}$, and $Y_s(j\omega_k) \in \mathbb{C}^{n_y \times 1}$. Here also, we have that $\mathcal{E}\{Y_s(j\omega_k)\} = 0$.

Figure 2-6. (a) MIMO nonlinear system vs. (b) its alternative representation.
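As a toy illustration of (2-7), consider the static, zero-lag analogue where the power spectra reduce to the covariance matrices $\mathcal{E}\{u u^T\}$ and $\mathcal{E}\{y u^T\}$. The example system below is our own invention, not one from the thesis:

```python
import numpy as np

rng = np.random.default_rng(0)
n, n_u = 200_000, 2
u = rng.standard_normal((n, n_u))        # two independent Gaussian inputs

# hypothetical 2-input, 1-output system: linear in u1, saturating in u2
y = 1.0 * u[:, 0] + 0.5 * np.tanh(u[:, 1])

Suu = u.T @ u / n                        # "auto-power": E{u u^T}
Syu = y @ u / n                          # "cross-power": E{y u^T}
G_bla = Syu @ np.linalg.inv(Suu)         # MIMO BLA, analogue of (2-7)
print(np.round(G_bla, 2))                # linear gain 1.0; ~0.5 * 0.61 for tanh
```

The tanh channel is replaced by its Gaussian BLA gain of about 0.61, exactly as in the SISO Example 2.2, while the linear channel is recovered unchanged.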


2.4 Some Properties of Nonlinear Systems

From Definition 2.3, we know that the amplitude spectrum of a random phase multisine can be customized. We will demonstrate that this freedom can be used to detect, quantify, and qualify the nonlinear behaviour of devices that satisfy Assumption 2.7. The tools developed here will be used extensively in Chapter 3 to characterize Digital Signal Processing (DSP) algorithms.

2.4.1 Response to a Sine Wave

Figure 2-7 shows the response of a linear (a) and a nonlinear (b) system to a sine wave. The output of the linear system consists of a sine wave, possibly with a modified amplitude and phase. The output spectrum of the nonlinear system, in general, contains additional spectral components, harmonically related to the input sine wave. Hence, the spectral components on the non excited spectral lines indicate the level of nonlinear behaviour of the DUT.

Figure 2-7. Response of (a) a linear system and (b) a nonlinear system to a sine wave.

2.4.2 Even and Odd Nonlinear Behaviour

Using this principle, we can also retrieve qualitative information about the nonlinear behaviour of the DUT. In Figure 2-8, two kinds of nonlinear systems are considered. In (a), an even nonlinear system (e.g. $y = e^{-u^2}$) is excited with a sine wave with a frequency $f_0$. The output spectrum of this system only contains contributions on the even harmonic lines (green arrows). This is due to the fact that the output spectrum of an even nonlinear system only contains even combinations of the input frequencies (e.g. $f_0 + f_0$, $f_0 - f_0$, ...). For an odd nonlinear system (b) (e.g. $y = \tanh(u)$), the converse is true: its output spectrum consists of components on the odd frequency lines (red arrows). Here, the output spectrum contains only odd combinations of the input frequencies (e.g. $f_0 + f_0 + f_0$, $f_0 + f_0 - f_0$, ...).

Figure 2-8. Response of (a) an even nonlinear system and (b) an odd nonlinear system to a sine wave.
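The even/odd harmonic pattern is easy to verify with a DFT. A small sketch of our own, using the two example nonlinearities from Figure 2-8:

```python
import numpy as np

n = 64
t = np.arange(n)
u = np.sin(2 * np.pi * t / n)                # sine wave at DFT bin 1 (f0)

for name, y in [("even  y = exp(-u^2)", np.exp(-u**2)),
                ("odd   y = tanh(u)",   np.tanh(u))]:
    Y = np.abs(np.fft.rfft(y)) / n
    lines = np.nonzero(Y > 1e-3)[0]          # dominant output bins
    print(name, "-> bins", lines)            # even resp. odd multiples of f0
```

The even system responds only at DC and even multiples of $f_0$; the odd system only at odd multiples, with amplitudes decaying quickly for higher harmonics.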

2.4.3 The Multisine as a Detection Tool for Nonlinearities

Consider now the case of a multisine signal applied to a nonlinear system (see Figure 2-9). We will show that by carefully choosing the spectrum of the multisine, we can qualify and quantify the nonlinear behaviour of the DUT. For a multisine having only odd frequency components ($f_0$, $3f_0$, $5f_0$, ...), the even nonlinearities will only generate spectral components at even frequencies, since an even combination of odd frequencies always yields an even frequency. Hence, the even frequency lines at the output can be used to detect even nonlinear behaviour of the DUT. Furthermore, odd combinations of odd frequency lines always result in odd frequency lines. Hence, when some of the odd frequency lines are not excited, they can serve to detect odd nonlinear behaviour of the DUT [56].

Figure 2-9. Response of a nonlinear system to a multisine.

According to which frequency lines are used to detect the nonlinear behaviour, several kinds of random phase multisines can be distinguished:

• Full multisine: all frequencies up to $f_{\max}$ are excited,

$$U_k = A \quad \text{for } k = 1, \ldots, N. \qquad (2\text{-}9)$$

• Odd multisine: only the odd lines are excited,

$$U_k = \begin{cases} A & k = 2n + 1, \quad k \le N, \; n \in \mathbb{N} \\ 0 & \text{elsewhere} \end{cases} \qquad (2\text{-}10)$$

Since even nonlinearities only contribute to the even harmonic output lines, they do not disturb the FRF measurements in this case. Hence, a lower uncertainty is achieved for the BLA [64].

• Random odd multisine: this is an odd multisine where the odd frequency lines are divided into groups with a predefined length, for instance, a block length of 4. In each block, all odd lines are excited ($U_k = A$) except for one line which serves as a detection line for the odd nonlinearities. The frequency index of this line is randomly selected in each consecutive block [57]. This is the best way to reflect the nonlinear behaviour of the DUT; hence this type of multisine should be the default option when employing random phase multisines. The ability to analyse the nonlinear behaviour of the DUT comes with a price: the frequency resolution diminishes or, equivalently, the measurement time increases.
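A sketch of how the excited and detection lines of a random odd multisine could be laid out (block length 4, as in the description above; the function and variable names are our own):

```python
import numpy as np

def random_odd_lines(N=64, block=4, rng=None):
    """Split the odd harmonics 1..N into blocks of `block` lines and leave one
    randomly chosen line per block unexcited (detection line for odd NL)."""
    rng = rng or np.random.default_rng(0)
    odd = np.arange(1, N + 1, 2)                  # odd harmonic indices 1, 3, 5, ...
    detection = np.array([rng.choice(blk)         # one detection line per block
                          for blk in odd.reshape(-1, block)])
    excited = np.setdiff1d(odd, detection)        # remaining odd lines get U_k = A
    return excited, detection

excited, detection = random_odd_lines()
print("detection lines:", detection)
```

At the output, energy on the even bins reveals even nonlinear distortion, while energy on the unexcited detection lines reveals odd nonlinear distortion.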



2.5 Estimating the Best Linear Approximation

In the last part of this chapter, we explain how the Best Linear Approximation of a nonlinear system can be determined. First, single input, single output systems are treated. Next, the results are extended to MIMO systems. Depending on the kind of excitation signals used during the experiments, the BLA is calculated differently. Therefore, we will make a distinction between periodic and non periodic data.

2.5.1 Single Input, Single Output Systems

A. Periodic Data

When using periodic excitation signals to determine the BLA of a SISO system, the experiments should be carried out according to the scheme depicted in Figure 2-10 [15]. In total, $M$ different random phase multisines are applied, as shown on the vertical axis. After the transients have settled, we measure for each experiment $P$ periods of the input and the output. Per experiment $m$ and period $p$, we compute the DFT of the input ($U(k)^{[m,p]} \in \mathbb{C}$) and the output ($Y(k)^{[m,p]} \in \mathbb{C}$). Then, the spectra are averaged over the periods. For frequency $k$, we obtain:

$$\hat{U}(k)^{[m]} = \frac{1}{P} \sum_{p=1}^{P} U(k)^{[m,p]}, \qquad \hat{Y}(k)^{[m]} = \frac{1}{P} \sum_{p=1}^{P} Y(k)^{[m,p]} \qquad (2\text{-}11)$$

For every experiment $m$, we then calculate the FRF estimate $\hat{G}(j\omega_k)^{[m]} \in \mathbb{C}$:

$$\hat{G}(j\omega_k)^{[m]} = \frac{\hat{Y}(k)^{[m]}}{\hat{U}(k)^{[m]}}, \qquad (2\text{-}12)$$

which is equivalent to (2-4) for periodic excitations. Next, the $M$ FRF estimates $\hat{G}(j\omega_k)^{[m]}$ are combined in order to obtain the Best Linear Approximation $\hat{G}_{\mathrm{BLA}}(j\omega_k)$:

$$\hat{G}_{\mathrm{BLA}}(j\omega_k) = \frac{1}{M} \sum_{m=1}^{M} \hat{G}(j\omega_k)^{[m]} \qquad (2\text{-}13)$$

Figure 2-10. Experiment design to calculate the BLA of a SISO nonlinear system: $M$ experiments of $P$ measured periods each, after transient settling; per experiment, $\hat{G}^{[m]}$ and $\hat{\sigma}_n^{2\,[m]}$ are computed.

Furthermore, due to the periodic nature of the excitation signals, the effect of the nonlinear distortions and the measurement noise on the Best Linear Approximation can be distinguished from each other. The variations over the $P$ periods stem from the measurement noise, while the variations over the $M$ experiments are due to the combined effect of the measurement noise and the stochastic nonlinear behaviour. Note that non-stationary disturbances such as non-synchronous periodic signals can also be detected using the measurement scheme from Figure 2-10 [15].

First, we will determine the sample variance of $\hat{G}_{\mathrm{BLA}}(j\omega_k)$ due to the measurement noise. A straightforward way to achieve this is to calculate the FRFs per period:

$$\hat{G}(j\omega_k)^{[m,p]} = \frac{Y(k)^{[m,p]}}{U(k)^{[m,p]}}, \qquad (2\text{-}14)$$

and to employ $\hat{G}(j\omega_k)^{[m,p]}$ to calculate the sample variance $\hat{\sigma}_n^{2\,[m]}$ of $\hat{G}(j\omega_k)^{[m]}$, which is then given by

$$\hat{\sigma}_n^{2\,[m]} = \frac{1}{P(P-1)} \sum_{p=1}^{P} \left| \hat{G}(j\omega_k)^{[m,p]} - \hat{G}(j\omega_k)^{[m]} \right|^2. \qquad (2\text{-}15)$$


The drawback of this approach is that in equation (2-14) raw input data are employed without increasing the SNR by averaging over the periods. If the SNR of the input is low, the estimates $\hat{G}(j\omega_k)^{[m,p]}$ will be of poor quality: a non negligible bias is present (SNR < 10 dB, [55]) and a high uncertainty (SNR < 20 dB, [54]). A better option to determine $\hat{\sigma}_n^{2\,[m]}$ is to use the covariance information from the averaged input and output spectra, and to apply a first order approximation [56]. First, we calculate the sample variances and covariance of the estimated spectra $\hat{U}(k)^{[m]}$ and $\hat{Y}(k)^{[m]}$:

$$\begin{aligned}
\hat{\sigma}_U^{2\,[m]} &= \frac{1}{P-1} \sum_{p=1}^{P} \left| U^{[m,p]} - \hat{U}^{[m]} \right|^2 \\
\hat{\sigma}_Y^{2\,[m]} &= \frac{1}{P-1} \sum_{p=1}^{P} \left| Y^{[m,p]} - \hat{Y}^{[m]} \right|^2 \qquad (2\text{-}16)\\
\hat{\sigma}_{YU}^{2\,[m]} &= \frac{1}{P-1} \sum_{p=1}^{P} \left( Y^{[m,p]} - \hat{Y}^{[m]} \right) \overline{\left( U^{[m,p]} - \hat{U}^{[m]} \right)}
\end{aligned}$$

In (2-16), the frequency index $k$ was omitted in order to simplify the formulas. From $\hat{\sigma}_U^{2\,[m]} \in \mathbb{R}$, $\hat{\sigma}_Y^{2\,[m]} \in \mathbb{R}$, and $\hat{\sigma}_{YU}^{2\,[m]} \in \mathbb{C}$, the sample variance $\hat{\sigma}_n^{2\,[m]}$ of $\hat{G}(j\omega_k)^{[m]}$ can be approximated by [56]:

$$\hat{\sigma}_n^{2\,[m]} = \frac{\left|\hat{G}^{[m]}\right|^2}{P} \left( \frac{\hat{\sigma}_Y^{2\,[m]}}{\left|\hat{Y}^{[m]}\right|^2} + \frac{\hat{\sigma}_U^{2\,[m]}}{\left|\hat{U}^{[m]}\right|^2} - 2\,\mathrm{Re}\!\left( \frac{\hat{\sigma}_{YU}^{2\,[m]}}{\hat{Y}^{[m]} \, \overline{\hat{U}^{[m]}}} \right) \right), \qquad (2\text{-}17)$$

with $\hat{\sigma}_n^{2\,[m]} \in \mathbb{R}$. In a noiseless input framework, expression (2-17) simplifies to

$$\hat{\sigma}_n^{2\,[m]} = \frac{1}{P} \frac{\hat{\sigma}_Y^{2\,[m]}}{\left|\hat{U}^{[m]}\right|^2}. \qquad (2\text{-}18)$$

Next, the $M$ estimates $\hat{\sigma}_n^{2\,[m]}$ are averaged in order to acquire an improved estimate:

$$\frac{1}{M} \sum_{m=1}^{M} \hat{\sigma}_n^{2\,[m]}. \qquad (2\text{-}19)$$

After applying the $\sqrt{N}$-law, we obtain the uncertainty of $\hat{G}_{\mathrm{BLA}}$ due to the measurement noise:

$$\hat{\sigma}_n^2 = \frac{1}{M^2} \sum_{m=1}^{M} \hat{\sigma}_n^{2\,[m]} \qquad (2\text{-}20)$$

Furthermore, the combined effect of the stochastic nonlinear behaviour and the measurement noise can also be measured by calculating the total sample variance $\hat{\sigma}_{\mathrm{BLA}}^2$. It is determined from the $M$ estimates $\hat{G}(j\omega_k)^{[m]}$:

$$\hat{\sigma}_{\mathrm{BLA}}^2 = \frac{1}{M(M-1)} \sum_{m=1}^{M} \left| \hat{G}^{[m]} - \hat{G}_{\mathrm{BLA}} \right|^2 \qquad (2\text{-}21)$$

The total variance of the BLA is equal to the sum of the measurement noise variance and the variance due to the stochastic nonlinear contributions $\sigma_{\mathrm{NL}}^2(k)$:

$$\sigma_{\mathrm{BLA}}^2(k) = \sigma_{\mathrm{NL}}^2(k) + \sigma_n^2(k) \qquad (2\text{-}22)$$

Hence, $\hat{\sigma}_{\mathrm{NL}}^2(k)$ is estimated with

$$\hat{\sigma}_{\mathrm{NL}}^2(k) = \hat{\sigma}_{\mathrm{BLA}}^2(k) - \hat{\sigma}_n^2(k). \qquad (2\text{-}23)$$

To conclude, for frequency index $k$ we now have:

• $\hat{G}_{\mathrm{BLA}}(j\omega_k)$: the Best Linear Approximation.
• $\hat{\sigma}_{\mathrm{BLA}}^2(k)$: the total sample variance of the estimate $\hat{G}_{\mathrm{BLA}}(j\omega_k)$, due to the combined effect of the measurement noise and the stochastic nonlinear behaviour.
• $\hat{\sigma}_n^2(k)$: the measurement noise sample variance of the estimate $\hat{G}_{\mathrm{BLA}}(j\omega_k)$.
• $\hat{\sigma}_{\mathrm{NL}}^2(k)$: the sample variance of the stochastic nonlinear contributions.
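The whole periodic-data procedure can be condensed into a short simulation. The sketch below is entirely synthetic (the mildly cubic DUT, noise level, and sizes are our own choices) and computes the quantities of (2-13), (2-18), (2-20), (2-21), and (2-23) for $M$ random phase multisines of $P$ periods each:

```python
import numpy as np

rng = np.random.default_rng(0)
N, M, P = 64, 20, 4          # excited lines, realizations, periods
ns = 2 * N                   # samples per period

def multisine():
    """One period of a flat random phase multisine (Definition 2.3)."""
    U = np.zeros(ns, dtype=complex)
    U[1:N + 1] = np.exp(1j * rng.uniform(0, 2 * np.pi, N))
    return np.fft.ifft(U).real * ns / np.sqrt(N)

dut = lambda u: u + 0.1 * u**3          # hypothetical static nonlinear DUT

Gm = np.empty((M, N), dtype=complex)    # per-realization FRF estimates
s2n_m = np.empty((M, N))                # per-realization noise variances
for m in range(M):
    u = multisine()
    Uk = np.fft.fft(u)[1:N + 1]
    Y = np.array([np.fft.fft(dut(u) + 0.01 * rng.standard_normal(ns))[1:N + 1]
                  for _ in range(P)])    # P noisy periods
    Ym = Y.mean(axis=0)                                   # (2-11)
    Gm[m] = Ym / Uk                                       # (2-12)
    s2y = np.sum(np.abs(Y - Ym)**2, axis=0) / (P - 1)
    s2n_m[m] = s2y / (P * np.abs(Uk)**2)                  # (2-18), noiseless input

G_bla = Gm.mean(axis=0)                                   # (2-13)
s2n = s2n_m.mean(axis=0) / M                              # (2-19)-(2-20)
s2bla = np.sum(np.abs(Gm - G_bla)**2, axis=0) / (M * (M - 1))  # (2-21)
s2nl = s2bla - s2n                                        # (2-23)
print(round(float(np.mean(np.abs(G_bla))), 2))            # gain > 1: cubic term
```

The estimated gain exceeds one because the cubic term contributes roughly $1 + 0.3\,\mathrm{var}(u)$ to the BLA of a Gaussian-like input, and $\hat{\sigma}_{\mathrm{NL}}^2$ dominates $\hat{\sigma}_n^2$ at this noise level.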

B. Non Periodic Data

A combination of higher order correlation tests can be used to detect unmodelled nonlinearities for arbitrary excitations [20]. But in contrast to the case of periodic excitation signals, no simple methods exist to distinguish between the nonlinear distortions and the effect of measurement noise when non periodic excitations are used. Furthermore, leakage errors will be present when calculating the input and output DFT spectra [53]. However, it is still possible to characterize the combined effect of nonlinear distortions and measurement noise if we assume a noiseless input framework.

First, the measured input and output time domain data are split into $M$ blocks. In order to reduce the leakage effect, a Hanning or a diff window can for instance be applied to the signals; we will employ the latter [65]. Then, the input and output DFT spectra of each block are calculated. The next step is to calculate the sample cross-power spectrum between the output and the input $\hat{S}_{YU}(k)$ and the auto-power spectrum of the input $\hat{S}_{UU}(k)$, using Welch's method [84]:

$$\hat{S}_{XY} = \frac{1}{M} \sum_{m=1}^{M} X^{[m]} Y^{[m]H}. \qquad (2\text{-}24)$$

The BLA is then given by

$$\hat{G}_{\mathrm{BLA}}(j\omega_k) = \frac{\hat{S}_{YU}(k)}{\hat{S}_{UU}(k)}. \qquad (2\text{-}25)$$

When working in a noiseless input framework, it can be shown that the following expression yields an unbiased estimate of the covariance of the output DFT $\hat{Y}(k)$:

$$\hat{\sigma}_Y^2(k) = \frac{M}{2(M-1)} \left( \hat{S}_{YY}(k) - \hat{G}_{\mathrm{BLA}}(j\omega_k) \hat{S}_{UY}(k) \right), \qquad (2\text{-}26)$$

where the factor 2 stems from the usage of the diff window. From this expression, we can derive the uncertainty of $\hat{G}_{\mathrm{BLA}}(j\omega_k)$ (see the end result of Appendix 2.B, simplified to the SISO case):

$$\hat{\sigma}_{\mathrm{BLA}}^2(k) = \frac{1}{M} \frac{\hat{\sigma}_Y^2(k)}{\hat{S}_{UU}(k)} \qquad (2\text{-}27)$$

To summarize, for frequency $k$ we have:

• $\hat{G}_{\mathrm{BLA}}(j\omega_k)$: the Best Linear Approximation.
• $\hat{\sigma}_{\mathrm{BLA}}^2(k)$: the total sample variance of the estimate $\hat{G}_{\mathrm{BLA}}(j\omega_k)$, due to the combined effect of the measurement noise and the stochastic nonlinear behaviour.

Remark: When choosing the number of blocks $M$ into which the input and output data are split, a trade-off is made between the leakage errors and the uncertainty due to the stochastic nonlinear behaviour and the measurement noise. When $M$ is large, this results in shorter time records of length $N$. Hence, the leakage will be more important, since leakage is an $O(N^{-1})$ effect for a rectangular window, and an $O(N^{-2})$ effect for a Hanning window [53]. Furthermore, the frequency resolution diminishes with increasing $M$. On the other hand, it can be seen from (2-27) that the uncertainty on the estimate $\hat{G}(k)$ reduces for a larger $M$. To conclude, with the number of blocks $M$ we can balance between a lower variance and a better frequency resolution of the BLA.



2.5.2 Multiple Input, Multiple Output Systems

Next, we explain how the Best Linear Approximation is estimated for MIMO systems. In the case of periodic excitation signals, the essential difference between the SISO and the MIMO framework is the need for multiple experiments. This stems from the fact that the influences of the different inputs, superposed in a single experiment, need to be separated. Again, a distinction is made between periodic and non periodic data.

A. Periodic Data

When periodic signals are employed to determine the Best Linear Approximation of a MIMO system, the experiments are usually carried out according to the scheme depicted in Figure 2-11. In total, $M$ blocks of $n_u$ experiments are performed, as shown on the vertical axis. After the transients have settled, $P$ periods of the input and the output are measured for each experiment. Per block $m$ and period $p$, we assemble the DFT spectra of the $n_u$ inputs and $n_y$ outputs in the matrices $U(k)^{[m,p]} \in \mathbb{C}^{n_u \times n_u}$ and $Y(k)^{[m,p]} \in \mathbb{C}^{n_y \times n_u}$, respectively. Then, the input and output spectra are averaged over the periods per block $m$. For frequency $k$, we have:

$$\hat{U}(k)^{[m]} = \frac{1}{P} \sum_{p=1}^{P} U(k)^{[m,p]}, \qquad \hat{Y}(k)^{[m]} = \frac{1}{P} \sum_{p=1}^{P} Y(k)^{[m,p]} \qquad (2\text{-}28)$$

For every block $m$, we now calculate an FRF estimate $\hat{G}(j\omega_k)^{[m]} \in \mathbb{C}^{n_y \times n_u}$:

$$\hat{G}(j\omega_k)^{[m]} = \hat{Y}(k)^{[m]} \left( \hat{U}(k)^{[m]} \right)^{-1}. \qquad (2\text{-}29)$$

The $M$ FRF estimates $\hat{G}(j\omega_k)^{[m]}$ are then combined in order to obtain the Best Linear Approximation $\hat{G}_{\mathrm{BLA}}(j\omega_k)$:

$$\hat{G}_{\mathrm{BLA}}(j\omega_k) = \frac{1}{M} \sum_{m=1}^{M} \hat{G}(j\omega_k)^{[m]} \qquad (2\text{-}30)$$

Figure 2-11. Experiment design to calculate the BLA of a MIMO nonlinear system: $M$ blocks of $n_u$ experiments, each with $P$ measured periods after transient settling; per block, $\hat{G}^{[m]}$ and $\hat{C}_n^{[m]}$ are computed.
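A synthetic frequency-domain sketch of the MIMO averaging, (2-29)-(2-30), at a single frequency bin (the true FRF, noise model, and sizes are our own choices; we also build the inputs with a unitary matrix so that the inverse in (2-29) is well conditioned, anticipating the remark below). Note that NumPy's row-major `reshape` is used in place of the column-major `vec` operator, which only permutes the rows and columns of the covariance estimate:

```python
import numpy as np

rng = np.random.default_rng(0)
n_u, n_y, M = 2, 2, 10
G0 = np.array([[1.0, 0.5], [0.2, 1.0]])      # "true" FRF at one frequency bin

# unitary DFT matrix keeps each U(k) perfectly conditioned, cf. (2-39)
i = np.arange(n_u)
W = np.exp(-2j * np.pi * np.outer(i, i) / n_u) / np.sqrt(n_u)

Gm = np.empty((M, n_y, n_u), dtype=complex)
for m in range(M):
    U = W @ np.diag(np.exp(1j * rng.uniform(0, 2 * np.pi, n_u)))  # n_u experiments
    Ys = 0.05 * (rng.standard_normal((n_y, n_u))
                 + 1j * rng.standard_normal((n_y, n_u)))  # stochastic NL + noise
    Gm[m] = (G0 @ U + Ys) @ np.linalg.inv(U)              # (2-29)

G_bla = Gm.mean(axis=0)                                   # (2-30)
V = (Gm - G_bla).reshape(M, -1)                           # rows ~ vec(G^[m] - G_BLA)
C_bla = V.T @ V.conj() / (M * (M - 1))                    # total covariance, (2-35)
print(np.round(np.abs(G_bla), 1))                         # close to |G0|
```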

Again, we are able to make a distinction between the nonlinear distortions and the effect of the measurement noise. First, we will determine the sample covariance matrix of $\hat{G}_{\mathrm{BLA}}(j\omega_k)$ due to the measurement noise. For the discussion of why the calculation should be carried out via the covariances of the input and output spectra, we refer to the SISO case ("Periodic Data" on p. 28). The sample covariance matrices of the averaged DFT spectra $\hat{U}(k)^{[m]}$ and $\hat{Y}(k)^{[m]}$ are given by:

$$\begin{aligned}
\hat{C}_{\hat{U}}^{[m]} &= \frac{1}{P(P-1)} \sum_{p=1}^{P} \mathrm{vec}\!\left( U^{[m,p]} - \hat{U}^{[m]} \right) \mathrm{vec}\!\left( U^{[m,p]} - \hat{U}^{[m]} \right)^H \\
\hat{C}_{\hat{Y}}^{[m]} &= \frac{1}{P(P-1)} \sum_{p=1}^{P} \mathrm{vec}\!\left( Y^{[m,p]} - \hat{Y}^{[m]} \right) \mathrm{vec}\!\left( Y^{[m,p]} - \hat{Y}^{[m]} \right)^H \qquad (2\text{-}31)\\
\hat{C}_{\hat{Y}\hat{U}}^{[m]} &= \frac{1}{P(P-1)} \sum_{p=1}^{P} \mathrm{vec}\!\left( Y^{[m,p]} - \hat{Y}^{[m]} \right) \mathrm{vec}\!\left( U^{[m,p]} - \hat{U}^{[m]} \right)^H
\end{aligned}$$

In (2-31), the frequency index $k$ was omitted in order to simplify the formulas. From $\hat{C}_{\hat{U}}^{[m]} \in \mathbb{C}^{n_u n_u \times n_u n_u}$, $\hat{C}_{\hat{Y}}^{[m]} \in \mathbb{C}^{n_y n_u \times n_y n_u}$, and $\hat{C}_{\hat{Y}\hat{U}}^{[m]} \in \mathbb{C}^{n_y n_u \times n_u n_u}$, the sample covariance $\hat{C}_n^{[m]}$ of $\hat{G}^{[m]}$ is estimated with (see Appendix 2.A):

$$\begin{aligned}
\hat{C}_n^{[m]} = \; & \left( \left( \hat{U}^{[m]} \right)^{-T} \otimes I_{n_y} \right) \hat{C}_{\hat{Y}}^{[m]} \left( \left( \hat{U}^{[m]} \right)^{-T} \otimes I_{n_y} \right)^H + \ldots \\
& \left( \left( \hat{U}^{[m]} \right)^{-T} \otimes \hat{G}^{[m]} \right) \hat{C}_{\hat{U}}^{[m]} \left( \left( \hat{U}^{[m]} \right)^{-T} \otimes \hat{G}^{[m]} \right)^H - \ldots \\
& 2\,\mathrm{herm}\!\left\{ \left( \left( \hat{U}^{[m]} \right)^{-T} \otimes I_{n_y} \right) \hat{C}_{\hat{Y}\hat{U}}^{[m]} \left( \left( \hat{U}^{[m]} \right)^{-T} \otimes \hat{G}^{[m]} \right)^H \right\}
\end{aligned} \qquad (2\text{-}32)$$

with $\hat{C}_n^{[m]} \in \mathbb{C}^{n_y n_u \times n_y n_u}$. In a noiseless input framework, expression (2-32) simplifies to

$$\hat{C}_n^{[m]} = \left( \left( \hat{U}^{[m]} \right)^{-T} \otimes I_{n_y} \right) \hat{C}_{\hat{Y}}^{[m]} \left( \left( \hat{U}^{[m]} \right)^{-T} \otimes I_{n_y} \right)^H \qquad (2\text{-}33)$$

Since $M$ estimates $\hat{C}_n^{[m]}$ are at our disposal, we can combine them in order to obtain an improved estimate of the covariance matrix that characterizes the measurement noise:

$$\hat{C}_n = \frac{1}{M^2} \sum_{m=1}^{M} \hat{C}_n^{[m]} \qquad (2\text{-}34)$$

(2-34)

The combined effect of the stochastic nonlinear behaviour and the measurement noise is characterized by the total sample covariance $\hat{C}_{BLA}$. This quantity is determined from the $M$ estimates $\hat{G}(j\omega_k)^{[m]}$:

$$
\hat{C}_{BLA} = \frac{1}{M(M-1)} \sum_{m=1}^{M} \mathrm{vec}\left( \hat{G}^{[m]} - \hat{G}_{BLA} \right) \mathrm{vec}\left( \hat{G}^{[m]} - \hat{G}_{BLA} \right)^{H} \tag{2-35}
$$
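As a check on the bookkeeping, the scalar (SISO, single-frequency) specialization of (2-35) reduces to the sample variance of the mean of the $M$ FRF estimates. A minimal sketch (variable names and toy values are illustrative):

```python
def bla_and_total_variance(G_estimates):
    """Scalar specialization of (2-35): the BLA is the mean of the M FRF
    estimates, and the total sample variance characterizes that mean."""
    M = len(G_estimates)
    G_bla = sum(G_estimates) / M
    # 1/(M(M-1)) * sum |G^[m] - G_BLA|^2
    var_bla = sum(abs(g - G_bla) ** 2 for g in G_estimates) / (M * (M - 1))
    return G_bla, var_bla

# three FRF estimates at one frequency line (toy values)
G_bla, var_bla = bla_and_total_variance([1.0 + 0.1j, 1.2 - 0.1j, 0.8 + 0.0j])
```

In the MIMO case, the scalar deviations are replaced by vec(·) outer products, which yields the full covariance matrix of (2-35).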

The total covariance of the BLA is equal to the sum of the measurement noise covariance $C_n(k)$ and the covariance due to the stochastic nonlinear contributions $C_{NL}(k)$:

$$
C_{BLA}(k) = C_{NL}(k) + C_n(k) \tag{2-36}
$$

Hence, $\hat{C}_{NL}(k)$ is estimated with

$$
\hat{C}_{NL}(k) = \hat{C}_{BLA}(k) - \hat{C}_n(k) \tag{2-37}
$$


To conclude, for frequency k we have:

• $\hat{G}_{BLA}(j\omega_k)$: The Best Linear Approximation.
• $\hat{C}_{BLA}(k)$: The total sample covariance matrix of the estimate $\hat{G}_{BLA}(j\omega_k)$, due to the combined effect of the measurement noise and the stochastic nonlinear behaviour.
• $\hat{C}_n(k)$: The measurement noise sample covariance matrix of the estimate $\hat{G}_{BLA}(j\omega_k)$.
• $\hat{C}_{NL}(k)$: The sample covariance of the stochastic nonlinear contributions.

Remark: When random phase multisines are used to perform FRF measurements on a multivariable system, it is possible to make an optimal choice for the phases. Within a block of experiments m , orthogonal random phase multisines should be used whenever possible, as they are optimal in the sense that they minimize the variance of the estimated BLA [18],[85].

When these signals are used, the condition number of the matrix $\hat{U}(k)^{[m]}$ in equation (2-29) equals one. Hence, the inverse of $\hat{U}(k)^{[m]}$ in (2-29) is calculated in optimal numerical conditions. Orthonormal random phase multisines are created in the following way. First, $n_u$ ordinary random phase multisines are generated: $U_1, \ldots, U_{n_u}$. Then, a unitary matrix $W \in \mathbb{C}^{n_u \times n_u}$ is used to define the excitation signals for the $n_u$ experiments. For frequency $k$, the applied signal is:

$$
U(k) = \begin{bmatrix}
w_{11} U_1(k) & \cdots & w_{1 n_u} U_{n_u}(k) \\
\vdots & & \vdots \\
w_{n_u 1} U_1(k) & \cdots & w_{n_u n_u} U_{n_u}(k)
\end{bmatrix}, \tag{2-38}
$$

where we omitted the index $[m]$. For $W$, the DFT matrix can for instance be used:

$$
w_{kl} = \frac{1}{\sqrt{n_u}} \exp\left( \frac{-j 2\pi (k-1)(l-1)}{n_u} \right), \tag{2-39}
$$


where $w_{kl}$ is the element of $W$ at position $(k, l)$. In the case of a system with three inputs ($n_u = 3$), we have

$$
W = \frac{1}{\sqrt{3}} \begin{bmatrix}
1 & 1 & 1 \\
1 & \exp\left( \frac{-j 2\pi}{3} \right) & \exp\left( \frac{j 2\pi}{3} \right) \\
1 & \exp\left( \frac{j 2\pi}{3} \right) & \exp\left( \frac{-j 2\pi}{3} \right)
\end{bmatrix}. \tag{2-40}
$$
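As a sanity check on this construction, the sketch below builds $W$ for $n_u = 3$ per (2-39)–(2-40) and verifies that the input matrix (2-38) satisfies $U(k)U(k)^H = I$ for unit-amplitude random-phase lines, which implies a condition number of one (stdlib only; values are illustrative):

```python
import cmath
import random

n_u = 3
# DFT matrix (2-39): w_kl = exp(-j*2*pi*(k-1)*(l-1)/n_u) / sqrt(n_u)
W = [[cmath.exp(-2j * cmath.pi * k * l / n_u) / n_u ** 0.5
      for l in range(n_u)] for k in range(n_u)]

# one frequency line of n_u random phase multisines (unit amplitude)
rng = random.Random(0)
U_lines = [cmath.exp(1j * rng.uniform(0, 2 * cmath.pi)) for _ in range(n_u)]

# applied input matrix (2-38): row r, experiment l -> w_{rl} * U_l(k)
U = [[W[r][l] * U_lines[l] for l in range(n_u)] for r in range(n_u)]

# U(k) U(k)^H; the identity matrix here implies condition number 1
UUH = [[sum(U[r][m] * U[c][m].conjugate() for m in range(n_u))
        for c in range(n_u)] for r in range(n_u)]
```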

B. Non Periodic Data

Contrary to the case of periodic excitation signals, no simple methods are available to distinguish between the nonlinear distortions and the noise effects when non periodic excitations are used. Furthermore, leakage errors will be present when calculating the input and the output spectra. However, it is still possible to characterize the combined effect of the nonlinear distortions and the measurement noise. First, the input and output data are split into $M > n_u$ blocks. Then, the input and output DFT spectra are calculated per block. Again, leakage effects are diminished by means of a Hanning or diff window [65]. The next step is to calculate the sample cross-power spectrum of the inputs and outputs $\hat{S}_{YU}(k)$ and the auto-power spectrum of the inputs $\hat{S}_{UU}(k)$ using (2-24). The Best Linear Approximation is then given by

$$
\hat{G}_{BLA}(j\omega_k) = \hat{S}_{YU}(k)\, \hat{S}_{UU}^{-1}(k). \tag{2-41}
$$
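For a single input and a single output, (2-41) reduces to a ratio of sample power spectra at each frequency line. A minimal noiseless sketch (names and toy data are illustrative):

```python
import random

rng = random.Random(1)
g0 = 0.7 - 0.3j      # "true" scalar FRF at one frequency line
M = 8                # number of data blocks

# per-block input DFT lines, and noiseless output lines Y = g0 * U
U = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(M)]
Y = [g0 * u for u in U]

# sample cross- and auto-power spectra, then the scalar form of (2-41)
S_YU = sum(y * u.conjugate() for y, u in zip(Y, U)) / M
S_UU = sum(abs(u) ** 2 for u in U) / M
G_bla = S_YU / S_UU
```

With noiseless data the estimate recovers the true FRF exactly; with noise or nonlinear distortions, the residual scatter is what (2-42)–(2-43) quantify.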

If we assume noiseless inputs, the following expression can be used to calculate the covariance of the output spectrum:

$$
\hat{C}_Y(k) = \frac{M}{2(M - n_u)} \left( \hat{S}_{YY}(k) - \hat{G}_{BLA}(j\omega_k)\, \hat{S}_{UY}(k) \right), \tag{2-42}
$$

where the factor 2 stems from using a diff window. Making use of $\hat{C}_Y(k)$, the uncertainty of $\hat{G}_{BLA}(j\omega_k)$ can be derived (see Appendix 2.B):

$$
\hat{C}_{BLA}(k) = \frac{1}{M}\, \hat{S}_{UU}^{-T}(k) \otimes \hat{C}_Y(k) \tag{2-43}
$$


To summarize, for frequency k we have:

• $\hat{G}_{BLA}(j\omega_k)$: The Best Linear Approximation.
• $\hat{C}_{BLA}(k)$: The total sample covariance matrix of the estimate $\hat{G}_{BLA}(j\omega_k)$, due to the combined effect of the measurement noise and the stochastic nonlinear behaviour.

Remark: Again, a trade-off is made when choosing the number of blocks M . See the SISO case (“Non Periodic Data” on p. 32) for a discussion.


Appendix 2.A Calculation of the FRF Covariance from the Input/Output Covariances

The measured input and output DFT coefficients for a block of $n_u$ experiments are given by

$$
\begin{aligned}
U(k) &= U_0(k) + N_U(k) \\
Y(k) &= Y_0(k) + N_Y(k)
\end{aligned} \tag{2-44}
$$

for frequencies $k = 1, \ldots, F$. $U_0(k) \in \mathbb{C}^{n_u \times n_u}$ and $Y_0(k) \in \mathbb{C}^{n_y \times n_u}$ are the noiseless Fourier coefficients; $N_U(k)$ and $N_Y(k)$ are the contributions of all the noise sources in the experimental set-up.

Assumption 2.11 (Disturbing Noise): The input $N_U(k)$ and output $N_Y(k)$ errors satisfy the following set of equations:

$$
\begin{aligned}
&\mathcal{E}\{\mathrm{vec}(N_U(k))\} = 0, \qquad \mathcal{E}\{\mathrm{vec}(N_Y(k))\} = 0 \\
&\mathcal{E}\left\{ \mathrm{vec}\left( N_U(k) \right) \mathrm{vec}\left( N_U(k) \right)^{H} \right\} = C_U(k) \\
&\mathcal{E}\left\{ \mathrm{vec}\left( N_Y(k) \right) \mathrm{vec}\left( N_Y(k) \right)^{H} \right\} = C_Y(k) \\
&\mathcal{E}\left\{ \mathrm{vec}\left( N_Y(k) \right) \mathrm{vec}\left( N_U(k) \right)^{H} \right\} = C_{YU}(k) = C_{UY}^{H}(k)
\end{aligned} \tag{2-45}
$$

Furthermore, we assume that $N_U(k)$ and $N_Y(k)$ are independent of $U_0(k)$ and $Y_0(k)$. The FRF estimate $G(j\omega_k)$ is given by

$$
G(j\omega_k) = Y(k)\, U^{-1}(k) = \left( Y_0(k) + N_Y(k) \right) \left( U_0(k) + N_U(k) \right)^{-1}. \tag{2-46}
$$


We will calculate the variability of the FRF estimate using a first order Taylor approximation. For notational simplicity, we will omit the frequency index $k$ in the following calculations. First, we isolate $U_0^{-1}$ from $U^{-1}$:

$$
G = (Y_0 + N_Y) \left( \left( I_{n_u} + N_U U_0^{-1} \right) U_0 \right)^{-1} = (Y_0 + N_Y)\, U_0^{-1} \left( I_{n_u} + N_U U_0^{-1} \right)^{-1} \tag{2-47}
$$

Next, we apply the Taylor expansion, restricting ourselves to the first order terms. For small $\alpha$, we have

$$
\left( I_{n_u} + \alpha \right)^{-1} \approx I_{n_u} - \alpha. \tag{2-48}
$$

When we apply this to (2-47), we obtain

$$
G = (Y_0 + N_Y)\, U_0^{-1} \left( I_{n_u} - N_U U_0^{-1} \right), \tag{2-49}
$$

and when we omit the second order terms in $N_U$ and $N_Y$,

$$
G = Y_0 U_0^{-1} - Y_0 U_0^{-1} N_U U_0^{-1} + N_Y U_0^{-1}. \tag{2-50}
$$

We define

$$
G_0 = Y_0 U_0^{-1}, \tag{2-51}
$$

and then rewrite (2-50):

$$
G = G_0 + N_G = G_0 - G_0 N_U U_0^{-1} + N_Y U_0^{-1} \tag{2-52}
$$

Hence, we obtain

$$
N_G = N_Y U_0^{-1} - G_0 N_U U_0^{-1}. \tag{2-53}
$$

In order to compute vec ( N G ) as a function of vec ( N U ) and vec ( N Y ) , we will apply the following vectorization property:


$$
\mathrm{vec}(ABC) = \left( C^T \otimes A \right) \mathrm{vec}(B) \tag{2-54}
$$

This results in

$$
\mathrm{vec}(N_G) = \left( U_0^{-T} \otimes I_{n_y} \right) \mathrm{vec}(N_Y) - \left( U_0^{-T} \otimes G_0 \right) \mathrm{vec}(N_U). \tag{2-55}
$$

Next, we determine the covariance matrix C G which is defined as

$$
C_G = \mathcal{E}\left\{ \mathrm{vec}(N_G)\, \mathrm{vec}(N_G)^{H} \right\}. \tag{2-56}
$$

By combining equations (2-45), (2-55) and (2-56), we obtain:

$$
\begin{aligned}
C_G ={}& \left( U_0^{-T} \otimes I_{n_y} \right) C_Y \left( U_0^{-T} \otimes I_{n_y} \right)^{H} + \left( U_0^{-T} \otimes G_0 \right) C_U \left( U_0^{-T} \otimes G_0 \right)^{H} \\
& - 2\, \mathrm{herm}\left\{ \left( U_0^{-T} \otimes I_{n_y} \right) C_{YU} \left( U_0^{-T} \otimes G_0 \right)^{H} \right\}
\end{aligned} \tag{2-57}
$$
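The vectorization property (2-54) used throughout this derivation — with vec(·) stacking the columns of a matrix — can be checked numerically. A stdlib-only sketch with small, arbitrarily chosen complex matrices:

```python
def vec(A):
    """Column-major stacking of a matrix (nested lists) into a vector."""
    return [A[r][c] for c in range(len(A[0])) for r in range(len(A))]

def kron(A, B):
    """Kronecker product of A (ra x ca) with B (rb x cb)."""
    ra, ca, rb, cb = len(A), len(A[0]), len(B), len(B[0])
    return [[A[i // rb][j // cb] * B[i % rb][j % cb]
             for j in range(ca * cb)] for i in range(ra * rb)]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def transpose(A):
    return [list(row) for row in zip(*A)]

A = [[1 + 2j, 0 - 1j, 2 + 0j], [3 + 1j, 1 + 1j, 0 + 2j]]    # 2x3
B = [[1 + 0j, 2 - 1j], [0 + 1j, 1 + 3j], [2 + 2j, 0 - 2j]]  # 3x2
C = [[1 - 1j, 2 + 0j], [0 + 3j, 1 + 1j]]                    # 2x2

lhs = vec(matmul(matmul(A, B), C))          # vec(ABC)
K = kron(transpose(C), A)                   # C^T (x) A
vB = vec(B)
rhs = [sum(K[i][k] * vB[k] for k in range(len(vB))) for i in range(len(K))]
```

Both sides agree element by element, which is the mechanism that turns the matrix relations (2-53) into the vectorized form (2-55).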


Appendix 2.B Covariance of the FRF for Non Periodic Data

If we assume a noiseless input framework, then the measured input and output DFT spectra for block $m$ are given by

$$
\begin{aligned}
U^{[m]}(k) &= U_0^{[m]}(k) \\
Y^{[m]}(k) &= Y_0^{[m]}(k) + N_Y^{[m]}(k)
\end{aligned} \tag{2-58}
$$

for frequencies $k = 1, \ldots, F$. $U_0^{[m]}(k) \in \mathbb{C}^{n_u \times 1}$ and $Y_0^{[m]}(k) \in \mathbb{C}^{n_y \times 1}$ are the noiseless DFT spectra; $N_Y^{[m]}(k)$ represents the contributions of all the noise sources and the stochastic nonlinear behaviour in the experimental set-up.

Assumption 2.12 (Disturbing Noise): The output error $N_Y^{[m]}(k)$ satisfies the following set of equations:

$$
\begin{aligned}
&\mathcal{E}\left\{ N_Y^{[m]}(k) \right\} = 0 \\
&\mathcal{E}\left\{ N_Y^{[m]}(k)\, N_Y^{[m]H}(k) \right\} = C_Y(k)
\end{aligned} \tag{2-59}
$$

Furthermore, we assume that $N_Y^{[m]}(k)$ is uncorrelated with $U_0^{[m]}(k)$ and $Y_0^{[m]}(k)$.

The noiseless FRF estimate $\hat{G}_0(j\omega_k)$ is given by

$$
\hat{G}_0(j\omega_k) = \hat{S}_{Y_0 U_0}(k)\, \hat{S}_{U_0 U_0}^{-1}(k). \tag{2-60}
$$

We will calculate the variability of the FRF estimate and omit the frequency index $k$ in the following calculations for notational simplicity. From (2-58), we have

$$
Y^{[m]} = Y_0^{[m]} + N_Y^{[m]} = \hat{G}\, U_0^{[m]}. \tag{2-61}
$$

We right-multiply both sides of equation (2-61) with $\frac{1}{M} U_0^{[m]H}$, and compute the summation over block index $m$:


$$
\frac{1}{M} \sum_{m=1}^{M} Y_0^{[m]} U_0^{[m]H} + \frac{1}{M} \sum_{m=1}^{M} N_Y^{[m]} U_0^{[m]H} = \hat{G}\, \frac{1}{M} \sum_{m=1}^{M} U_0^{[m]} U_0^{[m]H} \tag{2-62}
$$

or when we apply (2-24)

$$
\hat{S}_{Y_0 U_0} + \frac{1}{M} \sum_{m=1}^{M} N_Y^{[m]} U_0^{[m]H} = \hat{G}\, \hat{S}_{U_0 U_0}. \tag{2-63}
$$

m=1 –1 We then right-multiply (2-63) with Sˆ U U and use (2-60): 0 0 M

1 ˆ = G ˆ +N = G ˆ + ---G 0 0 G M



[m]

[ m ]H ˆ – 1 SU U 0 0

NY U0

(2-64)

[ m ]H ˆ – 1 SU U . 0 0

(2-65)

m=1

Hence, we obtain M

1 N G = ----M



[m]

NY U0

m=1 [m]

In order to compute $\mathrm{vec}(N_G)$ as a function of $\mathrm{vec}(N_Y^{[m]}) = N_Y^{[m]}$, we apply the vectorization property

$$
\mathrm{vec}(ABC) = \left( C^T \otimes A \right) \mathrm{vec}(B). \tag{2-66}
$$

This results in

$$
\mathrm{vec}(N_G) = \frac{1}{M} \sum_{m=1}^{M} \left( \left( U_0^{[m]H}\, \hat{S}_{U_0 U_0}^{-1} \right)^{T} \otimes I_{n_y} \right) N_Y^{[m]}. \tag{2-67}
$$

Next, we determine the covariance matrix $C_G$ which is defined as

$$
C_G = \mathcal{E}\left\{ \mathrm{vec}(N_G)\, \mathrm{vec}(N_G)^{H} \right\}. \tag{2-68}
$$

By combining equations (2-67) and (2-68) we obtain:


$$
C_G = \frac{1}{M^2}\, \mathcal{E}\left\{ \left( \sum_{m=1}^{M} \left( \left( U_0^{[m]H}\, \hat{S}_{U_0 U_0}^{-1} \right)^{T} \otimes I_{n_y} \right) N_Y^{[m]} \right) \left( \sum_{n=1}^{M} \left( \left( U_0^{[n]H}\, \hat{S}_{U_0 U_0}^{-1} \right)^{T} \otimes I_{n_y} \right) N_Y^{[n]} \right)^{H} \right\} \tag{2-69}
$$

In order to eliminate the two sums, we make use of the independency of $N_Y^{[m]}$ and $N_Y^{[n]}$ for $m \neq n$, and of $U_0^{[m]}$ and $U_0^{[n]}$ for $m \neq n$. We also have that $N_Y^{[m]}$ is uncorrelated with $U_0^{[n]}$ for any $m$ and $n$. This results in

$$
C_G = \frac{1}{M^2} \sum_{m=1}^{M} \mathcal{E}\left\{ \left( \left( U_0^{[m]H}\, \hat{S}_{U_0 U_0}^{-1} \right)^{T} \otimes I_{n_y} \right) N_Y^{[m]} N_Y^{[m]H} \left( \left( U_0^{[m]H}\, \hat{S}_{U_0 U_0}^{-1} \right)^{T} \otimes I_{n_y} \right)^{H} \right\}. \tag{2-70}
$$

Applying (2-59) together with $\left( S_{XX}^{-1} \right)^{H} = S_{XX}^{-1}$, and taking into account the fact that $C_Y = 1 \otimes C_Y$, results in

$$
C_G = \frac{1}{M^2} \sum_{m=1}^{M} \mathcal{E}\left\{ \left( \hat{S}_{U_0 U_0}^{-T}\, U_0^{[m]HT} \otimes I_{n_y} \right) \left( 1 \otimes C_Y \right) \left( U_0^{[m]T}\, \hat{S}_{U_0 U_0}^{-T} \otimes I_{n_y} \right) \right\}. \tag{2-71}
$$

Next, we make use of the Mixed-Product rule:

$$
(A \otimes B)(C \otimes D) = AC \otimes BD, \tag{2-72}
$$

provided that $A$, $B$, $C$, $D$ have compatible matrix dimensions. We then obtain

$$
C_G = \frac{1}{M}\, \mathcal{E}\left\{ \left( \hat{S}_{U_0 U_0}^{-T} \left( \frac{1}{M} \sum_{m=1}^{M} U_0^{[m]} U_0^{[m]H} \right)^{T} \hat{S}_{U_0 U_0}^{-T} \right) \otimes C_Y \right\}, \tag{2-73}
$$

or finally, after using (2-24) and reintroducing the frequency index $k$:

$$
C_G(k) = \frac{1}{M}\, \hat{S}_{U_0 U_0}^{-T}(k) \otimes C_Y(k). \tag{2-74}
$$


CHAPTER 3

FAST MEASUREMENT OF QUANTIZATION DISTORTIONS

A measurement technique is proposed to characterize the nonidealities of DSP algorithms which are induced by quantization effects, overflows (fixed point), or nonlinear distortions in one single experiment/simulation. The main idea is to apply specially designed multisine excitations such that a distinction can be made between the output of the ideal system and the contributions of the system’s nonidealities. This approach makes it possible to compare and quantify the quality of different implementation alternatives. Applications of this method include, for instance, digital filters, FFTs, and audio codecs.


3.1 Introduction

Digital Signal Processing (DSP) systems have the advantage of being flexible when compared with analog circuits. However, they are prone to calculation errors, especially when a fixed point implementation is used. These errors induce non-ideal signal contributions at the output such as quantization noise, limit cycles, and nonlinear distortions. In order to be sure that the design specifications are still met, it is necessary to verify the presence of these effects and to quantify them. A simple approach uses single sine excitations to test the system. Unfortunately, this method does not reveal all problems and requires many experiments in order to cover the full frequency range. A more thorough approach consists of a theoretical analysis of the Device Under Test (DUT). A good example of this is given in [50], where the effects of coefficient quantization of Infinite Impulse Response (IIR) systems are analysed. The main disadvantage is that for every new DUT, or for every different implementation of the DUT, the full analysis needs to be repeated. This can be an involved and time-consuming task, as a different approach may be needed for every new DUT. In this chapter, a method is proposed that detects and quantifies the quantization errors using one single multisine experiment with a well chosen amplitude spectrum. First, the measurement concept will be introduced. Then, a brief discussion of the major errors in DSP algorithms will be given. Finally, the results will be illustrated on a number of examples.


3.2 The Multisine as a Detection Tool for Non-idealities

Here, a special kind of multisine will be used, namely a Random Odd Multisine (see “The Multisine as a Detection Tool for Nonlinearities” on p. 26). Using this excitation signal, it is possible to extract the BLA $G_{BLA}(j\omega)$ of the DSP system for the class of Gaussian excitations with a fixed Power Spectral Density [62],[66]. The BLA can be obtained in two ways. The first way is to average the measured transfer functions for different phase realizations of the input multisine (see “Estimating the Best Linear Approximation” on p. 28). The second way is to identify a low order parametric model from a single phase realization. In order to have a realistic idea of the level of nonlinear distortions, the power spectrum of the excitation signal should be chosen such that it coincides with the power spectrum of the expected input signal of the system in later use. To summarize, the aim of the proposed method is twofold:

1. Extract the BLA in order to evaluate the linear behaviour of the DSP system;
2. Detect and quantify the nonlinear effects in the DSP system in order to see whether the design specifications are met.

The main advantage of this method is that it can be used for any DSP system or algorithm, as long as the aim is to achieve a linear operation. A possible drawback is the limitation to a specific class of input signals. However, it must be said that the class of Gaussian signals is not that restrictive, since, for example, most telecommunication signals fall inside this class [4], [13].
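To make the idea concrete, a random odd multisine can be sketched as follows (stdlib only; the grid parameters and design rules here are illustrative simplifications, not the exact procedures of the cited references):

```python
import math
import random

def random_odd_multisine(N, max_harm, block=4, seed=0):
    """Excite odd harmonics up to max_harm, randomly omitting one odd line
    per group of `block` consecutive odd harmonics (the omitted lines act
    as detection lines); all phases are uniform random."""
    rng = random.Random(seed)
    odd = list(range(1, max_harm + 1, 2))
    excited = []
    for i in range(0, len(odd), block):
        group = odd[i:i + block]
        omitted = rng.choice(group)            # random detection line
        excited += [k for k in group if k != omitted]
    phase = {k: rng.uniform(0, 2 * math.pi) for k in excited}
    x = [sum(math.cos(2 * math.pi * k * t / N + phase[k]) for k in excited)
         for t in range(N)]
    return x, excited

x, excited = random_odd_multisine(N=256, max_harm=101)
```

By construction the even harmonics and the omitted odd harmonics carry no input energy, so any output energy appearing there must come from the system's nonidealities.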


3.3 DSP Errors

In this section, multisines are used to quantify the quantization distortions. A fixed point digital filter serves as an example. The advantages of a fixed point representation in DSP systems are well-known when compared to floating point processing: it is both faster and cheaper. However, fixed point implementations suffer from a number of serious drawbacks as well. They require some knowledge about the expected dynamic range of the input signal and the intermediate signal levels. They induce a finite numerical precision and a finite dynamic range for the internal representation of the processed samples. In contrast with a fixed point representation, floating point arithmetic allows input numbers with a practically unlimited dynamic range. Depending on how the numbers are quantized (e.g. ceil, floor or round), different kinds of distortion arise in the fixed point implementation. The overflow behaviour (saturation or two’s complement overflow) also plays an important role, because it can lead to a higher level of distortions and even to chaotic behaviour [8]. To investigate the influence of the finite quantization and the range problems, a fourth order Butterworth low pass filter is considered with a normalized cut-off frequency of 0.2 Hz, to be operated at a sampling rate of $f_s = 1\,\mathrm{Hz}$. The normalized, non-quantized filter coefficients are (in Matlab notation, with a precision of five digits after the decimal point):

b = 0.00204  0.00814  0.01222  0.00814  0.00204
a = 0.42203 -1.00000  0.97657 -0.44510  0.07908     (3-1)

with b the numerator, and a the denominator coefficients. The filter is implemented in Direct Form (DF) II [50], with 32 bit wide accumulators and a 16 bit wide data memory, including one sign bit. The fixed point representation is illustrated in Figure 3-1. The accumulators consist of 14 integer bits and 17 fractional bits; the memory has 7 integer bits and 8 fractional bits. These settings are summarized in Table 3-1, together with the largest and smallest numbers that can be achieved, and the least significant bit (lsb). The lsb is nothing more than the numerical resolution.

Figure 3-1. Fixed point representation: 1: Sign bit - 2: Integer bits - 3: Fractional bits.


            | Sign Bit | Int. Bits | Fr. Bits | Max. Val.     | Min. Val. | lsb
Accumulator |    1     |    14     |    17    | 2^14 - 2^-17  |  -2^14    | 2^-17
Memory      |    1     |     7     |     8    | 2^7  - 2^-8   |  -2^7     | 2^-8

Table 3-1. DSP settings.

The input signal consists of 16 phase realizations of a Random Odd Multisine, each with a period length of 1024 samples. The random grid has excited lines between DC ($f = 0\,\mathrm{Hz}$) and 80% of the Nyquist frequency ($f = 0.4\,\mathrm{Hz}$). Using a block length of 4, this results in 154 excited harmonic lines. The default RMS value of the input signal is set to about 328 lsb, which is 1% of the largest number that can be represented in the data memory. Two periods of each phase realization are applied. Here, the first period of the output signal is discarded in order to eliminate transients. A common way to determine the number of samples that need to be discarded is plotting the difference between the subsequent output periods. Then, by visual inspection the number of required transient points can be determined. Next, the Best Linear Approximation is calculated by averaging the measured Frequency Response Functions (FRFs) for each phase realization [56]. The total length of the input sequence applied to the DUT is in our experiment 1024 * 2 * 16 = 32 768 samples. The default truncation and overflow behaviour (these terms will be explained in the following sections) are set to rounding and saturation, respectively, but we will alter these settings in order to analyse their influence.
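The numbers in Table 3-1 follow directly from the bit allocation. A small helper illustrates this (a sketch with assumed rounding/saturation conventions, not the simulator used in the text):

```python
import math

def fixed_point_params(int_bits, frac_bits):
    """Range and resolution of a signed fixed-point format with one sign
    bit, int_bits integer bits and frac_bits fractional bits."""
    lsb = 2.0 ** -frac_bits
    return 2.0 ** int_bits - lsb, -2.0 ** int_bits, lsb   # max, min, lsb

def quantize(x, int_bits, frac_bits):
    """Arithmetic rounding to the grid, followed by saturation."""
    max_val, min_val, lsb = fixed_point_params(int_bits, frac_bits)
    q = math.copysign(math.floor(abs(x) / lsb + 0.5), x) * lsb
    return min(max(q, min_val), max_val)

acc_max, acc_min, acc_lsb = fixed_point_params(14, 17)   # accumulator
mem_max, mem_min, mem_lsb = fixed_point_params(7, 8)     # data memory
```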

3.3.1 Truncation Errors of the Filter Coefficients

Numerous different truncation methods exist [34]. Since the exact implementation of the truncation is of no importance to the proposed method, we shall consider two common truncation methods: arithmetic rounding and flooring. Both truncation methods are depicted in Figure 3-2. The solid black line represents the truncation characteristic, the dashed line stands for the ideal behaviour, and the grey line shows the average behaviour of the truncation method. The floor operation is often used since it requires the least computation time: the least significant bits of the calculated samples are simply discarded. However, it introduces an average offset of ½ lsb, which is not present when using the computationally more involved rounding method.


Figure 3-2. Truncation characteristics: (a) Floor; (b) Round.

To inspect the quantization effect on the filter coefficients, the FRF is calculated for every phase realization by dividing the measured output spectrum by the input spectrum at the excited frequencies. Then, these FRFs are averaged over all the phase realizations. The following definition is used:

$$
\begin{aligned}
x_d(t) &= x(t T_s), \qquad t = 1, 2, \ldots, N \\
X &= \mathrm{DFT}(x_d) \;\Leftrightarrow\; X(l) = \frac{1}{\sqrt{N}} \sum_{t=1}^{N} x_d(t)\, e^{\frac{-j 2\pi t l}{N}}
\end{aligned} \tag{3-2}
$$

$G$ is the measured transfer function of the DUT:

$$
G(k f_0) = \frac{Y(k f_0)}{U(k f_0)}, \qquad k = \text{excited frequency lines} \tag{3-3}
$$

with $U$ and $Y$ the DFT of the input and output signals, respectively:

$$
U = \mathrm{DFT}(u), \qquad Y = \mathrm{DFT}(y) \tag{3-4}
$$

The measured transfer function can now be compared with the designed one (see Figure 3-3) for both truncation methods. Since the quantization error of the coefficients results in displaced system poles, the realized transfer functions (full grey lines) differ from the designed one (black line). Furthermore, the amplitude of the complex model errors (dashed grey lines) is shown on the same plot. From Figure 3-3, we conclude that rounding should be used in this particular example in order to implement a filter with a transfer function that is as close as possible to the designed transfer function in the pass-band. Hence, we will continue the analysis in the following sections with the rounded coefficients.


Figure 3-3. Distortion of the transfer function due to quantized filter coefficients (Best Linear Approximation [dB] versus frequency [Hz]; curves: Ideal, Round, Floor).

3.3.2 Finite Precision Distortion

The level of nonlinear distortions at the output, due to the finite precision of the arithmetic and the storage operations, is investigated from the same data that was used to calculate $G$. In Figure 3-4 (a) and (b), the output spectrum is plotted as a function of the frequency for the rounding and floor operation, respectively. At the non excited lines in these plots, crosses represent the even contributions, and circles denote the odd contributions. Next, the distortion level at these lines is extrapolated to the excited lines. Hence, the Signal to Distortion Ratio (SDR) on the latter can be determined. At the even detection lines, the even nonlinearities (e.g. $x^2$) pop up, and at the odd detection lines, the odd nonlinearities (e.g. $x^3$) are visible. From Figure 3-4 (a) and Figure 3-4 (b), it can be seen that the rounding method only leads to odd nonlinearities, while the floor operation leads to both odd and even nonlinearities. This behaviour can be explained by the symmetry properties of the truncation characteristics (see Figure 3-2). The rounding operation is an odd function: $\mathrm{Round}(-x) = -\mathrm{Round}(x)$. If this function was decomposed in its Taylor series, it would only consist of odd terms. The following properties hold for odd functions:

• the sum of two odd functions is odd;
• the multiplication of an odd function with a constant number is odd;
• the cascade of odd functions results in an odd function;
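These symmetry properties can be illustrated with the two quantizers themselves (a stdlib sketch; the grid step of 2^-8 is illustrative):

```python
import math

LSB = 2.0 ** -8

def q_round(x):
    """Arithmetic rounding: round half away from zero (an odd function)."""
    return math.copysign(math.floor(abs(x) / LSB + 0.5), x) * LSB

def q_floor(x):
    """Flooring: discard the fractional bits (not an odd function)."""
    return math.floor(x / LSB) * LSB

xs = [k * LSB / 100.0 for k in range(1, 5000)]
round_is_odd = all(q_round(-x) == -q_round(x) for x in xs)
floor_is_odd = all(q_floor(-x) == -q_floor(x) for x in xs)
# flooring biases the output downwards by about half an lsb on average
mean_floor_error = sum(q_floor(x) - x for x in xs) / len(xs)
```

The average flooring error of roughly -½ lsb is exactly the DC offset visible in Figure 3-4 (b).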


Figure 3-4. Output spectrum of a single multisine for different truncation methods: (a) Round; (b) Floor (legend: Excited, Odd Detection, Even Detection).

The filtering operation performed by the DUT consists of subsequent additions and multiplications with constant numbers (i.e., the filter coefficients). That is why at the output of the system only odd nonlinearities are seen when rounding is used as a truncation method. The floor operation is neither an odd, nor an even nonlinear operation. This leads to even and odd terms when this function is decomposed in its Taylor series. As a result, odd and even nonlinearities are both observed in the output spectrum. Figure 3-4 (b) also shows the presence of a DC component corresponding to the offset of ½ lsb that is introduced by the flooring operation (see Figure 3-2). In the pass band, we observe that for this RMS level the SDR due to the finite precision effects is about 60 dB.


3.3.3 Finite Range Distortion

Finally, the effects of the limited dynamic range of the numeric representation are discussed. Two approaches can be used to deal with this problem. First, when no precautions are taken and when the value of the samples exceeds the allowed dynamic range, a two’s complement overflow will occur. The resulting transfer characteristic is given in Figure 3-5.

Figure 3-5. Transfer characteristic for two’s complement overflow.

A second and better method to deal with the limited range is to detect overflows and to saturate the representation. This leads to the characteristic given in Figure 3-6.

Figure 3-6. Transfer characteristic for saturation overflow.

In the following simulation experiment, the RMS value of the input signal is increased by a factor of 3, up to 983 lsb. Figure 3-7 shows heavy distortion of the transfer function in the case of two’s complement overflow, while in the case of saturation the distortion is much lower. In Figure 3-8 (a) and (b), the level of nonlinear distortion at the output is analysed, using the same data as before.


Figure 3-7. Transfer function for finite range (Best Linear Approximation [dB] versus frequency [Hz]; curves: Ideal, Saturation, Overflow).

Figure 3-8. Output spectrum of a single multisine for different overflow methods: (a) Saturation; (b) Two’s Complement Overflow (legend: Excited, Odd Detection, Even Detection).


We observe the presence of a small nonlinear contribution at the even detection lines. This is caused by the asymmetry of the fixed point representation: the largest positive number is $2^7 - 2^{-8}$, while the largest negative number is $-2^7$. Consequently, the relation $\mathrm{Overflow}(-x) = -\mathrm{Overflow}(x)$ does not hold for all $x$ (neither for the saturation, nor for the two’s complement overflow), resulting in a nonlinear behaviour that is not strictly odd. Thus, when rounding is used to truncate the numbers, the even distortions in the output spectrum can only be caused by an overflow event. Hence, the even detection lines can act as a warning flag for internal overflows.
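The two overflow mechanisms and the range asymmetry can be made concrete with a simplified model of the data-memory format of Table 3-1:

```python
INT_BITS, FRAC_BITS = 7, 8
LSB = 2.0 ** -FRAC_BITS
MAX_VAL = 2.0 ** INT_BITS - LSB        #  127.99609375
MIN_VAL = -2.0 ** INT_BITS             # -128

def saturate(x):
    """Saturation overflow: clip to the representable range."""
    return min(max(x, MIN_VAL), MAX_VAL)

def twos_complement_wrap(x):
    """Two's complement overflow: wrap around the range."""
    span = 2.0 ** (INT_BITS + 1)       # 256
    return (x - MIN_VAL) % span + MIN_VAL

# the range is asymmetric, so Overflow(-x) != -Overflow(x) at the extremes
asymmetric = saturate(-130.0) != -saturate(130.0)
wrapped = twos_complement_wrap(MAX_VAL + LSB)   # one lsb above the maximum
```

Note how the wrap variant jumps from the largest positive value straight to the most negative one, which explains the heavy distortion seen in Figure 3-7.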

3.3.4 Influence of the Implementation

The method presented here is also appropriate to evaluate the relative performance of different types of implementations. In the following experiment, we compare three types of implementations: ordinary Direct Form II, a cascade of Second Order Sections (SOS) optimized to reduce the probability of overflow, and a cascade of SOS reducing the round-off noise. The results for the different implementations are shown in Figure 3-9. For this particular system and RMS value of the input signal (328 lsb), the Direct Form II system shows the smallest quantization noise.


Figure 3-9. Output spectrum of a single multisine for different implementations: (a) DF II; (b) SOS Overflow; (c) SOS Round-off (legend: Excited, Odd Detection, Even Detection).


3.4 Quality Analysis of Audio Codecs

The method described above can also be applied to the encoding and decoding of music in a compressed format, for instance MP3. Although the coding/decoding process cannot be considered as strictly time-invariant, we can still use the multisine technique to get an idea of the level of distortion that arises. This makes it easy to compare different codecs with identical bit rates, or different bit rates for the same codec. Of course, the psycho-acoustic models employed in the encoding process can only be rated through subjective listening tests; they are not considered here. Such models are used to determine the spectral music components that are less audible to the human ear. This effect is caused by the so-called masking effect: the human ear is less sensitive to small spectral components residing in the proximity of large spectral components. The codec takes advantage of this shortcoming in order to achieve higher compression ratios [31]. Since MP3 codecs are designed to handle music, it would be more interesting to use a music sample instead of a multisine signal. However, the benefits of even and odd detection lines are still needed. Detection lines in a music sample can easily be achieved with the following procedure. Consider the music sample vector $x$ and the following anti-symmetrical sequence:

$$
\begin{bmatrix} x & x & 0_{[x]} & -x & -x & 0_{[x]} \end{bmatrix}, \tag{3-5}
$$

where $0_{[x]}$ denotes a set of zero samples with the same length as vector $x$. For such a sequence, all the frequency lines which are a multiple of 2 or 3 of the fundamental frequency will be zero. Hence, they will serve as detection lines for the odd and even nonlinearities. The LAME MP3 codec (version 1.30, engine 3.92 [88]) is used in this example. The input signal is a monophonic music sequence, sampled at 44.1 kHz with 16 bits per sample. In order to have a representative set of data, an excerpt of $2^{20}$ samples from a pop song is used. After applying the above procedure (3-5) to create detection lines, the length of the test sequence is increased to about $6 \cdot 10^6$ samples. In order not to overload the figures, the number of plotted points in Figure 3-10 has been reduced. The level of distortion for three bit rates (64 kbps, 128 kbps and 256 kbps) is plotted at the left hand side of Figure 3-10. For the 64 kbps encoding/decoding process, the distortion level lies 30 dB below the signal level for low frequencies. The results for 128 kbps and 256 kbps show that the distortion level decreases with about 10 dB per additional 64 kbps.
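The detection-line property of the sequence (3-5) can be verified with a short simulation (naive DFT, stdlib only; the short "music" vector is a toy example):

```python
import cmath

def dft(x):
    """Naive DFT: X(l) = sum_t x(t) exp(-j*2*pi*t*l/N)."""
    N = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * t * l / N)
                for t in range(N)) for l in range(N)]

x = [1.0, -0.5, 2.0, 0.7]                     # toy "music" sample
zeros = [0.0] * len(x)
neg = [-v for v in x]
seq = x + x + zeros + neg + neg + zeros       # the sequence (3-5)

S = dft(seq)
detection = [l for l in range(1, len(seq)) if l % 2 == 0 or l % 3 == 0]
max_detection_level = max(abs(S[l]) for l in detection)
```

Only the lines with $l \equiv \pm 1 \pmod 6$ carry signal energy; all other lines are available as detection lines for the even and odd nonlinear distortions.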


Figure 3-10. MP3 coding/decoding distortion and the BLA for different bit rates (64 kbps, 128 kbps, 256 kbps): left, distortion level (Excited, Odd Detection, Even Detection); right, Best Linear Approximation (amplitude [dB] and phase [°] versus frequency [Hz]).


Furthermore, the Best Linear Approximation is computed for the three bit rates and plotted at the right hand side of Figure 3-10. We see a flat amplitude spectrum and a zero phase, as expected. The variations in the amplitude spectrum for the lowest bit rate are probably caused by the effect of masking (which lines are masked is decided by the psycho-acoustic model of the encoder). It can also be observed that in the encoding process for 64 and 128 kbps, a low pass characteristic is present (cut-off frequencies of 14 kHz and 18 kHz, respectively). Consequently, the MP3 codec cuts off the high frequencies when low bit rates are used.


3.5 Conclusion

In this chapter, we showed that it is possible to identify and quantify many non-idealities that occur in DSP systems, using custom designed multisines. The proposed concepts make it possible to verify quickly whether the input range and the quantization level are well chosen for input signals with a certain pdf and power spectrum. We have illustrated the ideas for an IIR system, for which the impact of the filter coefficient quantization, the presence of round-off noise, the overflow behaviour, and the effect of the chosen implementation can easily be measured and compared with the design specifications. Finally, the method was successfully applied to analyse the performance of an audio compression/decompression process.


CHAPTER 4

IDENTIFICATION OF NONLINEAR FEEDBACK SYSTEMS

In this chapter, a method is proposed to estimate block-oriented models which are composed of a linear, time-invariant system and a static nonlinearity in the feedback loop. By rearranging the model’s structure and by imposing one delay tap for the linear system, the identification process is reduced to a linear problem, allowing a fast estimation of the feedback parameters. The numerical parameter values obtained by solving the linear problem are then used as starting values for the nonlinear optimization. Finally, the proposed method is illustrated on measurements from a physical system.


4.1 Introduction

Many physical systems contain, in an implicit manner, a nonlinear feedback. Consider for example a mass-spring-damper system with a nonlinear, hardening spring. For this system, the differential equation describing the displacement $y_c(t)$ of the mass $m$ is given by

$m \ddot{y}_c(t) + d \dot{y}_c(t) + k_1 y_c(t) + k_3 y_c^3(t) = u_c(t)$,   (4-1)

where $u_c(t)$ is the input force and $d$ is the damping coefficient. The constants $k_1$ and $k_3$ characterize the behaviour of the hardening spring ($k_3 > 0$). To demonstrate the implicit feedback behaviour of this system, equation (4-1) is rewritten as follows

$m \ddot{y}_c(t) + d \dot{y}_c(t) + k_1 y_c(t) = u_c(t) - k_3 y_c^3(t)$.   (4-2)

The model structure which corresponds to this equation is shown in Figure 4-1. In this block scheme, $G(s)$ is the Laplace transfer function between the input signal $u_c(t)$ and the output signal $y_c(t)$ of the system. The nonlinear block NL contains a static nonlinearity and represents the term $k_3 y_c^3(t)$, which is fed back negatively. In the following sections, we will develop an identification procedure for this kind of Nonlinear Feedback system.

Figure 4-1. LTI system with static nonlinear feedback.
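To build some intuition for the feedback interpretation of (4-1)-(4-2), the following sketch simulates the hardening-spring system with a fixed-step Runge-Kutta integrator. All parameter values and the forcing signal are illustrative assumptions, not the values of the physical set-up used later in this chapter.

```python
import numpy as np

# Minimal sketch: simulate the hardening-spring system (4-1),
#   m*y'' + d*y' + k1*y + k3*y^3 = u_c(t),
# with a fixed-step 4th-order Runge-Kutta integrator.
# All parameter values below are illustrative assumptions.
m, d, k1, k3 = 1.0, 0.1, 1.0, 100.0
u_c = lambda t: 0.5 * np.sin(2 * np.pi * 0.05 * t)   # slow sine forcing

def deriv(t, s):
    y, v = s                                  # displacement, velocity
    return np.array([v, (u_c(t) - d * v - k1 * y - k3 * y**3) / m])

dt, n = 1e-3, 50_000
s = np.zeros(2)                               # start at rest
ys = np.empty(n)
for i in range(n):
    t = i * dt
    f1 = deriv(t, s)
    f2 = deriv(t + dt / 2, s + dt / 2 * f1)
    f3 = deriv(t + dt / 2, s + dt / 2 * f2)
    f4 = deriv(t + dt, s + dt * f3)
    s = s + dt / 6 * (f1 + 2 * f2 + 2 * f3 + f4)
    ys[i] = s[0]
```

Because of the cubic term $k_3 y_c^3$, the displacement grows slower than linearly with the forcing amplitude: this is exactly the hardening behaviour that the negative feedback branch of Figure 4-1 represents.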


4.2 Model Structure

For a band-limited input signal $u_c(t)$ (for which the power spectrum $S_{uu}(\omega) = 0$ for $\omega > \omega_{max}$), the linear system $G(s)$ can be approximated in the frequency domain by a discrete-time model

$G(z, \theta_L) = \frac{\sum_{i=0}^{n_b} b_i z^{-i}}{\sum_{j=0}^{n_a} a_j z^{-j}}$  and  $\theta_L = (a, b)$,   (4-3)

provided that the sampling frequency $f_s$ is sufficiently high such that no aliasing occurs. In this respect, note that the bandwidth of $y_c(t)$ is possibly higher than the bandwidth of the input signal, due to the nonlinearity which is present in the feedback loop. The relation between the discrete-time signals $u$ and $y$ and the continuous-time signals $u_c$ and $y_c$, for a given sampling period $T_s$, is described by the following equations

$u(t) = u_c(t T_s)$,  $y(t) = y_c(t T_s)$.   (4-4)

The relation between input u ( t ) and output y ( t ) is then given by

$y(t) = G(z, \theta_L)\, u(t)$.   (4-5)

The static nonlinearity is represented by a polynomial $f_{NL}$

$f_{NL}(x) = \sum_{l=0}^{r} p_l x^l$  and  $\theta_{NL} = p$.   (4-6)

The vector θ , which contains all the model parameters, is defined as

$\theta = (\theta_L, \theta_{NL})$.   (4-7)

The proposed model structure to identify the system is shown in Figure 4-2. The set of difference equations that describe the input/output relation is given by

Figure 4-2. Model structure.

$\sum_{j=0}^{n_a} a_j y(t-j) = \sum_{i=0}^{n_b} b_i x(t-i)$,
$x(t) = u(t) - \sum_{l=0}^{r} p_l y^l(t)$.   (4-8)

Without loss of generality, we divide the first equation by $a_0$, or equivalently, set $a_0 \equiv 1$. We then substitute the second equation of (4-8) into the first one, and get

$y(t) = \sum_{i=0}^{n_b} b_i u(t-i) - \sum_{j=1}^{n_a} a_j y(t-j) - \sum_{i=0}^{n_b} b_i \sum_{l=0}^{r} p_l y^l(t-i)$.   (4-9)

We isolate the terms in $y^l(t)$ and move them to the left hand side:

$y(t) + b_0 \sum_{l=0}^{r} p_l y^l(t) = \sum_{i=0}^{n_b} b_i u(t-i) - \sum_{j=1}^{n_a} a_j y(t-j) - \sum_{i=1}^{n_b} b_i \sum_{l=0}^{r} p_l y^l(t-i)$.   (4-10)

This equation is an example of a nonlinear algebraic loop [41]: a nonlinear algebraic equation should be solved for every time step t when the model output is calculated. In principle, this problem can be tackled by a numeric solver. The disadvantage of this loop is that it can have multiple solutions. In fact, it is even possible that no solution exists when the degree r is even. As an example, let us take a look at the polynomial shown in Figure 4-3. The black curve is a polynomial of third degree; it corresponds to the left hand side of (4-10). The grey horizontal levels represent possible values of the right hand side of (4-10). In this example,


Figure 4-3. Number of solutions of the algebraic equation.

we observe that, according to the horizontal level, the number of solutions varies: one solution for the solid grey line, two solutions for the dotted line, and three solutions for the dash dotted line. Although $y(t-1)$ should be a good initial guess leading the numeric solver to the correct solution, we prefer to avoid multiple solutions. To do this in a simple way, we will impose one delay tap for the linear block or, equivalently, $b_0$ in equation (4-10) will be set to zero. Taking into account the imposed delay, we obtain the following model equation

$y(t) = \sum_{i=1}^{n_b} b_i u(t-i) - \sum_{j=1}^{n_a} a_j y(t-j) - \sum_{i=1}^{n_b} b_i \sum_{l=0}^{r} p_l y^l(t-i)$.   (4-11)
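With the delay tap imposed, $y(t)$ in (4-11) depends only on past samples, so the model can be simulated by a plain recursion without a numerical solver. The sketch below illustrates this; all coefficient values are illustrative, not the estimates obtained later in Section 4.4.

```python
import numpy as np

# Minimal sketch: simulate the Nonlinear Feedback model (4-11).
# b = [b1..b_nb] (b0 = 0 imposed), a = [a1..a_na] (a0 = 1),
# p = [p0..pr]. All numerical values are illustrative.
def simulate_nlfb(b, a, p, u):
    y = np.zeros(len(u))
    for t in range(len(u)):
        acc = 0.0
        for i in range(1, len(b) + 1):
            if t - i >= 0:
                # f_NL(y(t-i)) = sum_l p_l * y^l(t-i)
                f_nl = sum(pl * y[t - i] ** l for l, pl in enumerate(p))
                acc += b[i - 1] * (u[t - i] - f_nl)
        for j in range(1, len(a) + 1):
            if t - j >= 0:
                acc -= a[j - 1] * y[t - j]
        y[t] = acc
    return y

u = 0.1 * np.sin(2 * np.pi * np.arange(500) / 50)
y = simulate_nlfb(b=[0.2, 0.1], a=[-1.5, 0.7], p=[0.0, 0.1, 0.0, 1.0], u=u)
```

Note how the imposed delay removes the algebraic loop: each new output sample is an explicit function of previously computed samples only.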

Chapter 4: Identification of Nonlinear Feedback Systems

4.3 Estimation Procedure

A three-step procedure is used to identify the parameters of the model equation (4-11). In the first step, the parameters of the linear model are identified. Next, the coefficients of the static nonlinearity are estimated. Finally, a nonlinear search procedure is employed to refine the initial values obtained in the first two steps.

4.3.1 Best Linear Approximation

The first step in the identification process consists in estimating the Best Linear Approximation (BLA) from the measured input $u_m(t)$ and output $y_m(t)$ for $t = 1, \ldots, N$ (see "Estimating the Best Linear Approximation" on p. 28). A parametric linear model is then estimated in the $Z$-domain and denoted as $\hat{G}_{BLA}(z, \theta_L)$. Note that the linear behaviour of the nonlinear feedback branch $Lin(NL)$ is implicitly included in the estimated BLA (see Figure 4-4).

4.3.2 Nonlinear Feedback

In the second step, the nonlinear block which is present in the feedback branch is identified. To achieve this, the feedback loop is opened, and the model is restructured as shown in Figure 4-5. In order to keep the identification simple, the measured output $y_m(t)$ is used at the input of the NL block. The idea of using measured outputs instead of estimated outputs in order to avoid recurrence is similar to the series-parallel architecture from the identification of neural networks [47].

Figure 4-4. The BLA (grey box) of a Nonlinear Feedback system.


Figure 4-5. Rearranged model structure.

As will be shown in what follows, opening the loop allows the formulation of an estimation problem that is linear-in-the-parameters $\theta_{NL}$ for a fixed $\hat{G}_{BLA}(z, \theta_L)$. From Figure 4-5, we obtain the following equation

$y(t) = \hat{G}_{BLA}(z, \theta_L) \left( u(t) - \sum_{l=0}^{r} p_l y^l(t) \right)$.   (4-12)

The residual w ( t ) is defined as

$w(t) \equiv y_m(t) - \hat{G}_{BLA}(z, \theta_L)\, u_m(t)$,   (4-13)

where u m ( t ) and y m ( t ) are the measured input and output, respectively. Note that w ( t ) is independent of the parameters θ NL . Next, the error e w ( t, θ NL ) needs to be minimized:

$e_w(t, \theta_{NL}) = w(t) - \left( -\hat{G}_{BLA}(z, \theta_L) \sum_{l=0}^{r} p_l y_m^l(t) \right)$.   (4-14)

To achieve this, the least squares cost function

$V(\theta_{NL}) = \sum_{t=1}^{N} e_w^2(t, \theta_{NL})$   (4-15)

is minimized with respect to $p_l$.


Since $\hat{G}_{BLA}(z, \theta_L)$ is a known linear operator, independent of $\theta_{NL}$, this minimization is a problem that is linear-in-the-parameters which can be solved in the time or frequency domain. Its solution is given by the matrix equation

$\hat{\theta}_{NL} = H^{+} w$,   (4-16)

with

$H = -\hat{G}_{BLA}(z, \theta_L) \left( y_m^0 \;\; y_m^1 \;\; \ldots \;\; y_m^r \right)$,   (4-17)

and where $w$ and $y_m$ are vectors that contain the elements $w(t)$ and $y_m(t)$ for $t = 1, \ldots, N$, respectively. In (4-17), the powers should be computed elementwise, and the operator $\hat{G}_{BLA}(z, \theta_L)$ should be applied to all the columns. The pseudo-inverse $H^{+}$ can be calculated in a numerically stable way via a Singular Value Decomposition (SVD) [27]. Since measurements are used in the observation matrix $H$ of this linear least squares problem, a bias is present on the estimated parameters $\hat{\theta}_{NL}$ [56]. However, when the Signal to Noise Ratio (SNR) achieved by the measurement set-up is reasonable, this bias remains small, yielding results that are good enough to initialize the nonlinear search procedure.
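The linear least-squares step (4-16)-(4-17) can be sketched as follows. The BLA filter, the feedback polynomial and the signals below are synthetic stand-ins, constructed so that the model relation holds exactly; the point is only to show how the observation matrix is built column by column and solved with an SVD-based pseudo-inverse.

```python
import numpy as np
from scipy.signal import lfilter

# Hedged sketch of (4-16)-(4-17): build the observation matrix H by
# filtering the elementwise powers of the output through G_BLA, then
# solve with a pseudo-inverse. All values below are synthetic.
rng = np.random.default_rng(0)
b_bla, a_bla = [0.0, 0.3], [1.0, -0.5]        # hypothetical BLA (one delay tap)
p_true = [0.0, 0.2, 0.0, 0.5]                 # hypothetical feedback polynomial

N, r = 2000, 3
y_m = 0.3 * rng.standard_normal(N)            # stand-in for the measured output
powers = np.vstack([y_m ** l for l in range(r + 1)]).T    # columns y_m^l
H = -np.column_stack([lfilter(b_bla, a_bla, col) for col in powers.T])
w = H @ p_true                                # residual consistent with (4-12)

p_hat = np.linalg.pinv(H) @ w                 # SVD-based pseudo-inverse, (4-16)
```

In a real experiment `w` would be computed from the measured signals via (4-13); here it is synthesised from `p_true` so that the estimator can be checked against a known answer.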

4.3.3 Nonlinear Optimization

The starting values obtained from the initialization procedure can be improved by solving the full nonlinear estimation problem. The Levenberg-Marquardt algorithm (see "The Levenberg-Marquardt Algorithm" on p. 135) is used to minimize the weighted least squares cost function

$V_{WLS}(\theta) = \sum_{k=1}^{F} W(k)\, |\varepsilon(k, \theta)|^2$,   (4-18)

where $W(k) \in \mathbb{R}$ is a user-chosen, frequency domain weighting. The model error $\varepsilon(k, \theta) \in \mathbb{C}$ is defined as

$\varepsilon(k, \theta) = Y_m(k) - Y(k, \theta)$,   (4-19)

where $Y_m(k)$ and $Y(k, \theta)$ are the DFT of the measured output and of the modelled output from equation (4-11), respectively. Hence, the cost function is formulated in the frequency domain, which


enables the use of nonparametric weighting. Typically, the weighting matrix $W(k)$ is chosen equal to the inverse covariance matrix of the output, $\hat{C}_Y^{-1}(k)$. This matrix can be obtained straightforwardly when periodic signals are used to excite the DUT. In two different situations, leakage can appear in equation (4-19): when arbitrary excitations are employed, or when subharmonics are present in the measured or modelled output. In the first case, the leakage can be reduced by windowing techniques, or by increasing the length of the data record. In the second case, it suffices to increase the DFT window length such that an integer number of periods is measured in the window.

The Levenberg-Marquardt algorithm requires the computation of the derivatives of the model error with respect to the model parameters $\theta$. The analytical expressions of the Jacobian are given in Appendix 4.A. In these expressions, the modelled outputs are utilized instead of the measured outputs. Hence, the bias which was present in the previous section due to the noise on the output is now removed.


4.4 Experimental Results

We will now apply the ideas of the previous sections to a practical measurement set-up. The Device Under Test is an electronic circuit, also known as the Silverbox [67], which emulates the behaviour of a mass-spring-damper system with a nonlinear spring. The experimental data originate from a single measurement and contain two main parts. In all experiments, the input and output signals are measured at a sampling frequency of 10 MHz/2^14 = 610.35 Hz.

The first part of the data is a filtered Gaussian signal with an RMS value that increases linearly with time. This sequence consists of 40 700 samples and has a bandwidth of 200 Hz; it will be used for validation purposes. Note that the amplitude of the validation sequence exceeds the amplitude of the estimation sequence. A warning is in place here: generally speaking, extrapolation during the validation test should be avoided, since it reveals no information about the model quality. Good extrapolation performance, certainly in a black box framework, is often a matter of luck: the model structure happens to correspond exactly to the system's internal structure.

The second part of the data consists of ten consecutive realizations of a random phase multisine with 8192 samples and 500 transient points per realization, depicted in Figure 4-6 with alternating colours. The bandwidth of the excitation signal is also 200 Hz and its RMS value is 22.3 mV. The multisines will be employed to estimate the model. In this measurement, odd multisines were used:

$U_k = A$ for $k = 2n + 1$, $n \in \mathbb{N}$, and $U_k = 0$ elsewhere,   (4-20)

Figure 4-6. Excitation signal that consists of a validation and an estimation set.


using the same symbols as in equation (2-3). The phases $\varphi_k$ were chosen uniformly distributed over $[0, 2\pi)$. We will extract the BLA of the DUT by averaging the measured transfer functions for different phase realizations of the multisine [56],[62],[66]. Note that this also diminishes the measurement noise on the BLA.
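An odd random-phase multisine as in (4-20) can be generated as follows. The numbers (8192 samples per period, 200 Hz band, 610.35 Hz sampling, 22.3 mV RMS) are taken from this section; the construction itself is a standard sketch, not the exact signal used in the measurement.

```python
import numpy as np

# Hedged sketch: one period of an odd random-phase multisine (4-20):
# U_k = A at the odd harmonics inside the band, zero elsewhere,
# with phases uniform in [0, 2*pi).
rng = np.random.default_rng(2)
N, fs, f_max = 8192, 610.35, 200.0            # samples, Hz, excited band
k_max = int(np.floor(f_max * N / fs))         # highest excited bin

U = np.zeros(N // 2 + 1, dtype=complex)
k_odd = np.arange(1, k_max + 1, 2)            # odd bins only
U[k_odd] = np.exp(1j * rng.uniform(0, 2 * np.pi, k_odd.size))
u = np.fft.irfft(U, N)                        # one period, real signal
u *= 0.0223 / np.sqrt(np.mean(u ** 2))        # scale to 22.3 mV RMS
```

Because only odd harmonics are excited, any energy that appears at the even bins of the measured output spectrum can be attributed to even nonlinear distortion, which is what makes this excitation useful for detecting nonlinear behaviour.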

4.4.1 Linear Model

We will start with a comparison of two second order linear models. The first model has completely unrestricted parameters. For the second model, we impose one delay tap, which is equivalent to forcing $b_0$ to zero during the estimation. As mentioned before, this delay is imposed in order to avoid an algebraic loop. Furthermore, the order of the numerator is increased to reduce the model error. The consequences of the delay are now investigated by comparing the quality of the following models:

1. Full linear model:

$G_1(z) = \frac{b_0 + b_1 z^{-1} + b_2 z^{-2}}{a_0 + a_1 z^{-1} + a_2 z^{-2}}$   (4-21)

2. Linear model with imposed delay:

$G_2(z) = \frac{b_1 z^{-1} + b_2 z^{-2} + b_3 z^{-3} + b_4 z^{-4} + b_5 z^{-5}}{a_0 + a_1 z^{-1} + a_2 z^{-2}}$   (4-22)

The estimation of the parameters of $G_1(z)$ and $G_2(z)$ for the DUT is carried out in the frequency domain, using the ELiS Frequency Domain Identification toolbox [56]. The resulting models are plotted in Figure 4-7 (dash dotted and dashed grey lines, respectively), together with the BLA (solid black line). From this figure, it can be seen that imposing a delay results in a slight distortion of the modelled transfer function for high frequencies. The following step consists in validating these models by using the validation data set. The simulation error signals of the linear models are plotted in Figure 4-8 (b) and (c), together with the measured output signal (a). From these plots, we conclude that the same model quality is achieved for both linear models: the Root Mean Square Error (RMSE) obtained with the validation data set is 14.3 mV. This means that imposing a delay tap does not significantly deteriorate the quality of the linear model in this particular experimental set-up.


Figure 4-7. Measured FRF (solid black line), model $G_1(z)$ (dash dotted grey line) and model $G_2(z)$ (dashed grey line).

Figure 4-8. Validation of the linear models: (a) measured output signal; (b) error of linear model 1 (RMSE: 14.3 mV); (c) error of linear model 2 (RMSE: 14.3 mV).


The estimated coefficients of the second linear model are

$\hat{b} = [\,0 \;\; 0.4838 \;\; -0.0987 \;\; 0.1217 \;\; -0.0782 \;\; 0.0257\,]$,  $\hat{a} = [\,1 \;\; -1.4586 \;\; 0.9323\,]$.   (4-23)
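A small sanity check one can run on the estimated denominator in (4-23): the poles of $G_2(z)$ must lie inside the unit circle for the linear model to be stable, and the pole angle gives the resonance frequency implied by the model at the 610.35 Hz sampling rate.

```python
import numpy as np

# Check the estimated denominator coefficients from (4-23):
# the poles of G2(z) must lie inside the unit circle.
a_hat = [1.0, -1.4586, 0.9323]
poles = np.roots(a_hat)
pole_radius = np.abs(poles)
# resonance frequency implied by the pole angle, at fs = 610.35 Hz
f_res = abs(np.angle(poles[0])) / (2 * np.pi) * 610.35
```

The complex-conjugate pole pair has radius $\sqrt{0.9323} \approx 0.97$, i.e., a lightly damped resonance well below the Nyquist frequency, consistent with the mass-spring-damper behaviour the Silverbox emulates.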

We now proceed with the identification procedure, using the second linear model and extending it with the static nonlinear feedback branch (see Figure 4-2).

4.4.2 Estimation of the Nonlinear Feedback Coefficients

After estimating the linear transfer characteristics, the nonlinear feedback coefficients $p$ are estimated in the time domain, using the ten measured multisine realizations. Several degrees $r$ were tried out, and $r = 1{:}3$ yielded the best result. The identified coefficients are

$\hat{p} = [\,-0.0347 \;\; -0.0260 \;\; 3.9177\,]$.   (4-24)

Again, the model is validated on the Gaussian noise sequence. Figure 4-9 (a) shows the simulation error of the Nonlinear Feedback model. Note that the vertical scale is enlarged 10 times compared with the plots in Figure 4-8. The RMSE has dropped by more than a factor of 10 compared to the linear model. Furthermore, in Figure 4-9 (a), the large spikes in the error signal have disappeared.

Figure 4-9. Validation of the nonlinear models: (a) simulation error of the NLFB model (RMSE: 1.01 mV); (b) simulation error of the optimized NLFB model (RMSE: 0.77 mV).


4.4.3 Nonlinear Optimization

The proposed identification procedure significantly enhances the results compared to the linear model. However, we can achieve even better results by applying the nonlinear optimization method from section 4.3.3. Since no covariance information is available from the measured data, a constant weighting is employed. The resulting simulation error after applying the Levenberg-Marquardt algorithm is plotted in Figure 4-9 (b). The RMSE then decreases further by about 20% to 0.77 mV.

4.4.4 Upsampling

To obtain a further improvement of the modelling results, we will upsample the input and output data. The idea behind upsampling is that the influence of the delay, which is imposed artificially and which is one sample period long, is reduced. Hence, the model quality should improve. After upsampling the input and output data by a factor of 2, the estimation procedure described in the previous sections is applied. The simulation error of the Nonlinear Feedback model is shown in Figure 4-10 (a). Indeed, we observe that the validation test yields better results: the RMS error has decreased to 0.70 mV. In addition, a nonlinear search routine is used to optimize the parameters, resulting in a simulation error of 0.38 mV (Figure 4-10 (b)).

Figure 4-10. Validation of the nonlinear models for upsampled data: (a) simulation error of the NLFB model, P = 2 (RMSE: 0.70 mV); (b) simulation error of the optimized NLFB model, P = 2 (RMSE: 0.38 mV).

The modelling results are summarized in Table 4-1. The linear models and the linear parts of the Nonlinear Feedback models are all of order $n_a = 2$, $n_b = 5$; the degree $r$ of the polynomial is set to $r = 1{:}3$.

Model                               RMSE Validation
Linear                              14.3 mV
Linear + delay                      14.3 mV
NLFB, $f_s$ = 610 Hz                1.01 mV
NLFB, $f_s$ = 610 Hz (optimized)    0.77 mV
NLFB, $f_s$ = 1221 Hz               0.70 mV
NLFB, $f_s$ = 1221 Hz (optimized)   0.38 mV

Table 4-1. Summary of the modelling results.

From Table 4-1, we conclude that by extending the linear model with a static nonlinear feedback, the simulation error on the validation data set is reduced by more than 20 dB: from 14.3 mV to 1.01 mV. The nonlinear optimization slightly improves this result down to 0.77 mV. Furthermore, using upsampling, the total error reduction increases to more than 30 dB compared to the linear model. Finally, the simulation error of the optimized nonlinear model using the upsampled data set decreases to 0.38 mV. For a comparison with other modelling approaches on the same DUT, using the same data set, we refer to the Silverbox case study in Chapter 6 (see "Comparison with Other Approaches" on p. 151).


Figure 4-11. Spectrum of the measured output (solid black line); linear model error (solid grey line); NLFB model error (black dots); NLFB model error for upsampled data (grey dots).

In Figure 4-11, the amplitude spectrum of the simulation errors of the various models is shown, together with the measured output spectrum (solid black line). The solid grey line represents the error of the (unrestricted) linear model. The grey and black dots represent the Nonlinear Feedback model errors with and without upsampling, respectively.


4.5 Conclusion

The technique proposed in this chapter provides a practical and fast way to model systems that are composed of a linear, time-invariant system and a static nonlinear feedback. The estimated model yields satisfactory modelling results, which can be further improved by applying a nonlinear optimization procedure. We have applied the method to experimental data, and obtained good results. The modelling error was significantly reduced to less than 3% of the error obtained with an ordinary linear model.


Appendix 4.A Analytic Expressions for the Jacobian

Recall equation (4-11), which describes the Nonlinear Feedback model

$y(k) = \sum_{i=1}^{n_b} b_i u(k-i) + \sum_{i=1}^{n_b} \sum_{l=1}^{r} b_i p_l y^l(k-i) - \sum_{j=1}^{n_a} a_j y(k-j)$.   (4-25)

In order to use the Levenberg-Marquardt algorithm, we need to compute the derivatives of the output with respect to the model parameters, i.e., the Jacobian. The Jacobian elements are defined as

$J_{b_n}(k) \equiv \frac{\partial y(k)}{\partial b_n}$,  $J_{a_n}(k) \equiv \frac{\partial y(k)}{\partial a_n}$,  $J_{p_n}(k) \equiv \frac{\partial y(k)}{\partial p_n}$.   (4-26)

Finally, we obtain

$J_{b_n}(k) = u(k-n) + \sum_{l=1}^{r} p_l y^l(k-n) + \sum_{i=1}^{n_b} \sum_{l=1}^{r} p_l b_i\, l\, y^{l-1}(k-i)\, J_{b_n}(k-i) - \sum_{j=1}^{n_a} a_j J_{b_n}(k-j)$,

$J_{a_n}(k) = -y(k-n) + \sum_{i=1}^{n_b} \sum_{l=1}^{r} p_l b_i\, l\, y^{l-1}(k-i)\, J_{a_n}(k-i) - \sum_{j=1}^{n_a} a_j J_{a_n}(k-j)$,

$J_{p_n}(k) = \sum_{i=1}^{n_b} b_i y^n(k-i) + \sum_{i=1}^{n_b} \sum_{l=1}^{r} p_l b_i\, l\, y^{l-1}(k-i)\, J_{p_n}(k-i) - \sum_{j=1}^{n_a} a_j J_{p_n}(k-j)$.   (4-27)
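The recursions in (4-27) can be verified numerically against a finite-difference approximation. The sketch below does this for $J_{p_n}$; it follows the sign convention of (4-25), and all coefficient values are illustrative.

```python
import numpy as np

# Hedged sketch: compute J_pn(k) = dy(k)/dp_n with the recursion
# (4-27) and verify it against a finite-difference approximation.
# Model convention as in (4-25), with p = [p1..pr].
def simulate(b, a, p, u):
    nb, na, r = len(b), len(a), len(p)
    y = np.zeros(len(u))
    for k in range(len(u)):
        for i in range(1, nb + 1):
            if k - i >= 0:
                y[k] += b[i - 1] * (u[k - i] + sum(p[l - 1] * y[k - i] ** l
                                                   for l in range(1, r + 1)))
        for j in range(1, na + 1):
            if k - j >= 0:
                y[k] -= a[j - 1] * y[k - j]
    return y

def jac_p(b, a, p, u, n, y):
    """Recursion (4-27) for J_pn(k), n = 1..r."""
    nb, na, r = len(b), len(a), len(p)
    J = np.zeros(len(u))
    for k in range(len(u)):
        for i in range(1, nb + 1):
            if k - i >= 0:
                J[k] += b[i - 1] * y[k - i] ** n
                J[k] += b[i - 1] * sum(p[l - 1] * l * y[k - i] ** (l - 1) * J[k - i]
                                       for l in range(1, r + 1))
        for j in range(1, na + 1):
            if k - j >= 0:
                J[k] -= a[j - 1] * J[k - j]
    return J

rng = np.random.default_rng(3)
u = 0.2 * rng.standard_normal(200)
b, a, p = [0.3, 0.1], [-1.2, 0.5], [0.05, 0.2]
y = simulate(b, a, p, u)
J1 = jac_p(b, a, p, u, n=1, y=y)
eps = 1e-6
J1_fd = (simulate(b, a, [p[0] + eps, p[1]], u) - y) / eps   # finite difference
```

Such a finite-difference cross-check is a cheap way to catch sign or index errors in analytic Jacobians before they silently slow down the Levenberg-Marquardt iterations.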


CHAPTER 5

NONLINEAR STATE SPACE MODELLING OF MULTIVARIABLE SYSTEMS

This chapter deals with the modelling of multivariable nonlinear systems. We will compare a number of candidate model structures and select the one that is most suitable for our modelling problem. Next, the different classes of systems that can exactly be represented by the selected model structure are discussed. Finally, an identification procedure to determine the model parameters is presented.


5.1 Introduction

The aim of this chapter is to model nonlinear Multiple Input, Multiple Output (MIMO) systems. One way to achieve this is to examine the Device Under Test (DUT) thoroughly, and to build a model using first principles: for instance, the laws of physics and chemistry. This process can be very time-consuming, since it requires exact knowledge of the system's structure and all its parameters. It also implies that the system must be fully understood, which is not always feasible. Another way to tackle the modelling problem is to consider the system as a black box. In that case, the only available information about the system is given by its measured inputs and outputs. This approach usually means that no physical parameters or quantities are estimated. Hence, no physical interpretation whatsoever can be given to the model. Black box modelling implies the application of a model structure that is as flexible as possible, since no information about the device's internal structure is utilized. Often, this flexibility results in a high number of parameters.

In this chapter, we will make use of discrete-time models. One of the arguments for this choice is that, when looking at control applications, discrete-time descriptions are more suitable since control actions are usually taken at discrete time instances. Furthermore, the estimation of nonlinear continuous-time models is not a trivial task, and can be computationally involved, because it may imply the calculation of time-derivatives or integrals of sophisticated nonlinear functions of the measured signals [79]. Finally, it should be noted that a continuous-time approach is not strictly necessary, since we are not interested in the estimation of physical system parameters.

One of the objectives is to choose a model structure that is suitable for MIMO systems.
Hence, it is important that the common dynamics, present in the different outputs of the DUT, are exploited in such a way that they result in a smaller number of model parameters. First, a number of candidate model structures found in literature will be examined. Next, a specific model structure will be selected, and the relation with some other model structures (standard block-oriented nonlinear models, among others) will be investigated. Finally, an identification procedure for the selected model structure will be proposed.


5.2 The Quest for a Good Model Structure

The literature regarding nonlinear system identification is vast, and the number of available model structures is practically unlimited. In order not to re-invent the wheel, we will briefly discuss a number of candidate model structures, and pick the one that seems most adequate. Initially, only deterministic models are considered: the presence of any kind of noise is ignored. First, two popular examples of input/output models are considered. Volterra and NARX models are both appealing from a system theoretic viewpoint and because of their approximation capabilities.

5.2.1 Volterra Models

An introduction to Volterra series was already given in the first chapter (see "The Volterra-Wiener Theory" on p. 4). Since we have chosen the Volterra-Wiener approach as a framework for the Best Linear Approximation (see "Properties of the Best Linear Approximation" on p. 21), it is a logical first step to consider Volterra models. These models have already been employed in many application fields [43]: in video and image enhancement, speech processing, communication channel equalization, and compensation of loudspeaker nonlinearities. The main advantage of Volterra series is their conceptual simplicity, because they can be viewed as generalized LTI descriptions. Furthermore, they are open loop models for which the stability is easy to check and to enforce. However, in the case of a nonparametric representation, these benefits do not outweigh one important disadvantage: when identifying discrete-time Volterra kernels, an enormous number of kernel coefficients needs to be identified, even for a modest kernel degree. We illustrate this with a simple example of a SISO Volterra model. Table 5-1 shows the number of kernel samples $N_{kern}$ of two Volterra functionals for a memory length of $N = 10$ and $N = 100$ samples. Note that triangular/regular kernels were considered in order to eliminate the redundancy of the kernel coefficients. The number of effective kernel samples is then computed using the following binomial coefficient [43] (see also Appendix 5.A):

$N_{kern} = \binom{N + n - 1}{n}$,   (5-1)

where $n$ is the kernel degree.

Degree n    N = 10    N = 100
1           10        100
2           55        5050
3           220       171 700
4           715       4 421 275

Table 5-1. Number of kernel coefficients $N_{kern}$ in a nonparametric representation, as a function of the kernel degree $n$ and the memory length $N$.

Table 5-1 shows that the number of required kernel coefficients increases dramatically for growing model degree and memory length. From this example, it is clear that, despite the theoretical insights they provide, nonparametric Volterra functionals are not useful in practical identification situations. However, the combinatorial growth of $N_{kern}$ can be tackled in various ways. For instance, a frequency domain IIR representation can be employed for the kernels [61]. But, in a black box framework such a parametrization is not straightforward, and poses a difficult model selection problem. Another solution is to apply interpolation methods to approximate the kernels in the time or frequency domain. This approach works well when the kernels exhibit a certain smoothness, see for instance [48]. The application of this idea leads to a significant decrease in parameters and, thus, measurement time.

A multivariable Volterra model is, by definition, composed of $n_y$ different MISO models. When these models are parametrized independently, no advantage is taken of the common dynamics that appear in the different outputs. Consequently, such a representation does not satisfy our needs, since we are looking for a more parsimonious model structure.
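Equation (5-1) is straightforward to evaluate; the following snippet reproduces the entries of Table 5-1 with Python's exact binomial coefficient.

```python
from math import comb

# Reproduce Table 5-1: the number of distinct coefficients of a
# triangular Volterra kernel, N_kern = C(N + n - 1, n), equation (5-1).
def n_kern(N, n):
    return comb(N + n - 1, n)

table = {(n, N): n_kern(N, n) for n in range(1, 5) for N in (10, 100)}
```

Even this tiny table makes the combinatorial explosion tangible: going from degree 2 to degree 4 at N = 100 multiplies the number of coefficients by almost a factor of 1000.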

5.2.2 NARX Approach To avoid the excessive numbers of parameters, NARX (Nonlinear AutoRegressive model with eXogeneous inputs) models were intensively studied in the eighties, as an alternative for Volterra series. The generic single input, single output NARX model is defined as

$y(t) = f(y(t-1), \ldots, y(t-n_y), u(t-1), \ldots, u(t-n_u))$,   (5-2)

where f ( . ) is an arbitrary nonlinear function of delayed inputs and outputs [7]. Contrary to the Volterra approach, the model output y ( t ) is now also a function of delayed output


samples. This is very similar to the extension of linear FIR models to IIR models. This has two important consequences. First of all, a longer memory length is achieved without suffering from the dramatic increase in the number of parameters, as was observed with nonparametric Volterra series. Secondly, due to the nonlinear feedback which is present, the stability analysis becomes very difficult compared with Volterra models. This is the major price that is paid for the increase in flexibility. In [37], it is proven that a nonlinear discrete-time, time-invariant system can always be represented by a general NARX model in a region around an equilibrium point, when it is subject to two sufficient conditions:

• the response function of the system is finitely realizable (i.e., the state space representation has a finite number of states);

• a linearised model exists when the system operates close to the chosen equilibrium point.

Often, a more specific kind of NARX model is employed: the polynomial NARX model. By applying the Stone-Weierstrass Theorem [16], it was shown in [7] that these models can approximate any sampled nonlinear system arbitrarily well, under the assumption that the space of input and output signals is compact (i.e., bounded and closed). The NARX model can also be used to handle multivariable systems. Just as with Volterra models, multivariable NARX models are defined as a set of MISO NARX models:

$y_i(t+n) = f_i[\, y_1(t+n_1-1), \ldots, y_1(t), \ldots, y_{n_y}(t+n_{n_y}-1), y_{n_y}(t+n_{n_y}-2), \ldots, y_{n_y}(t),$
$\qquad u_1(t+n), u_1(t+n-1), \ldots, u_1(t), \ldots, u_{n_u}(t+n), u_{n_u}(t+n-1), \ldots, u_{n_u}(t) \,]$   (5-3)

with $i = 1, \ldots, n_y$ the output index, $n_u$ the number of inputs, $n_i$ the delay per output $y_i$, and $n = \max(n_1, \ldots, n_{n_y})$ the maximum delay [37]. For this general nonlinear model, there is no straightforward way to parametrize the functions $f_i$, such that advantage is taken of the


common dynamics present in the different outputs. Hence, we will investigate another class of models than input/output models, namely state space models. It will turn out that the latter are a very suitable description for multiple input, multiple output systems.

5.2.3 State Space Models

The most natural way to represent systems with multiple inputs and outputs is to use the state space framework. In its most general form, an $n_a$-th order discrete-time state space model is expressed as

$x(t+1) = f(x(t), u(t))$,
$y(t) = g(x(t), u(t))$.   (5-4)

In these equations, $u(t) \in \mathbb{R}^{n_u}$ is the vector that contains the $n_u$ input values at time instance $t$, and $y(t) \in \mathbb{R}^{n_y}$ is the vector of the $n_y$ outputs. The state vector $x(t) \in \mathbb{R}^{n_a}$ represents the memory of the system, and contains the common dynamics present in the different outputs. The use of this intermediary variable constitutes the essential difference between state space and input/output models. For the latter, the memory is created by utilizing delayed inputs or outputs. The first equation of (5-4), referred to as the state equation, describes the evolution of the state as a function of the input and the previous state. The second equation of (5-4) is called the output equation. It relates the system output with the state and the input. Furthermore, the state space representation is not unique. By means of a similarity transform, the model equations (5-4) can be converted into a new model that exhibits exactly the same input/output behaviour. The similarity transform $x_T(t) = T^{-1} x(t)$ with an arbitrary non-singular square matrix $T$ yields

$x_T(t+1) = T^{-1} f(T x_T(t), u(t)) = f_T(x_T(t), u(t))$,
$y(t) = g(T x_T(t), u(t)) = g_T(x_T(t), u(t))$.   (5-5)

Note that when the similarity transform is applied to arbitrary functions $f$ and $g$, the resulting functions $f_T$ and $g_T$ do not necessarily have the same form as $f$ and $g$. This is illustrated in the following example. Consider the output equation

$$y(t) = \frac{a}{x_1(t)} + b\, x_2(t), \qquad \text{(5-6)}$$

The Quest for a Good Model Structure

where $a$ and $b$ are the model parameters. When we define $t_{ij}$ as the $(i,j)$-th element of the matrix $T^{-1}$, the similarity transform results in

$$y(t) = \frac{a}{t_{11}\, x_{1T}(t) + t_{12}\, x_{2T}(t)} + b \left( t_{21}\, x_{1T}(t) + t_{22}\, x_{2T}(t) \right), \qquad \text{(5-7)}$$

which obviously cannot be written in the form

$$y(t) = \frac{a_T}{x_{1T}(t)} + b_T\, x_{2T}(t). \qquad \text{(5-8)}$$

Hence, the similarity transform can have an influence on the model complexity. Whether the similarity transform introduces redundancy in the representation depends on the (fixed) parametrization of $f$ and $g$. However, this issue is not so important to us: all the state space models discussed in this chapter retain their model structure under a similarity transform. In what follows, we assume that $f(0,0) = 0$ and $g(0,0) = 0$, such that $x = 0$ is an equilibrium state. In the following section, we describe different kinds of state space models.

A. Linear State Space Models

The model equations of the well-known linear state space model are given by

$$\begin{cases} x(t+1) = A x(t) + B u(t) \\ y(t) = C x(t) + D u(t) \end{cases} \qquad \text{(5-9)}$$

with the state space matrices $A \in \mathbb{R}^{n_a \times n_a}$, $B \in \mathbb{R}^{n_a \times n_u}$, $C \in \mathbb{R}^{n_y \times n_a}$, and $D \in \mathbb{R}^{n_y \times n_u}$. The transfer function $G(z)$ that corresponds to (5-9) is given by

$$G(z) = C \left( z I_{n_a} - A \right)^{-1} B + D, \qquad \text{(5-10)}$$

with $I_{n_a}$ the identity matrix of dimension $n_a$. From (5-10), it is clear that the poles of $G(z)$ are given by the eigenvalues of $A$. By means of a similarity transform, the set of state space matrices $\{A, B, C, D\}$ can be converted into a new set $\{A_T, B_T, C_T, D_T\}$ that exhibits exactly the same input/output behaviour. The similarity transform $x_T(t) = T^{-1} x(t)$ with an arbitrary non-singular square matrix $T$ yields


$$A_T = T^{-1} A T \qquad B_T = T^{-1} B \qquad C_T = C T \qquad D_T = D \qquad \text{(5-11)}$$

It is easily verified that the similarity transform has no influence on the transfer function:

$$\begin{aligned} G_T(z) &= C_T \left( z I_{n_a} - A_T \right)^{-1} B_T + D_T \\ &= C T \left( z I_{n_a} - T^{-1} A T \right)^{-1} T^{-1} B + D \\ &= C \left( z T T^{-1} - T T^{-1} A T T^{-1} \right)^{-1} B + D \\ &= G(z) \end{aligned} \qquad \text{(5-12)}$$

B. Bilinear State Space Models

A continuous-time, bilinear state space model is defined as

$$\begin{cases} \dfrac{dx_c(t)}{dt} = A x_c(t) + B u_c(t) + F\, x_c(t) \otimes u_c(t) \\ y_c(t) = C x_c(t) + D u_c(t) \end{cases} \qquad \text{(5-13)}$$

where $A \in \mathbb{R}^{n_a \times n_a}$, $B \in \mathbb{R}^{n_a \times n_u}$, $C \in \mathbb{R}^{n_y \times n_a}$, $D \in \mathbb{R}^{n_y \times n_u}$, and $F \in \mathbb{R}^{n_a \times n_a n_u}$ are the bilinear state space matrices. These models are a straightforward extension of linear state space models, which enables them to cope with nonlinear systems. It was shown in the past that, in continuous time, these models are universal approximators for nonlinear systems [26],[46]: any continuous causal functional can be approximated arbitrarily well by a continuous-time, bilinear state space model within a bounded time interval. The discrete-time, bilinear state space model is given by

$$\begin{cases} x(t+1) = A x(t) + B u(t) + F\, x(t) \otimes u(t) \\ y(t) = C x(t) + D u(t) \end{cases} \qquad \text{(5-14)}$$

Intuitively, it is expected that this model preserves the approximation capabilities of its continuous-time counterpart. Unfortunately, this is not the case: it is not possible to approximate all (nonlinear) discrete-time systems by discrete-time, bilinear models. The reason for this is that the set of discrete-time, bilinear systems is not closed with respect to the product operation: the product of the outputs of two discrete-time, bilinear state space systems is not necessarily a bilinear system again [26]. In order to maintain the universal


approximation property also for discrete-time systems, a more generic class of models needs to be defined: state affine models.

C. State Affine Models

A single input, single output, state affine model of degree $r$ is defined as

$$\begin{cases} x(t+1) = \displaystyle\sum_{i=0}^{r-1} A_i\, u^i(t)\, x(t) + \sum_{i=1}^{r} B_i\, u^i(t) \\ y(t) = \displaystyle\sum_{i=0}^{r-1} C_i\, u^i(t)\, x(t) + \sum_{i=1}^{r} D_i\, u^i(t) \end{cases} \qquad \text{(5-15)}$$

with $A_i \in \mathbb{R}^{n_a \times n_a}$, $B_i \in \mathbb{R}^{n_a \times n_u}$, $C_i \in \mathbb{R}^{n_y \times n_a}$, and $D_i \in \mathbb{R}^{n_y \times n_u}$. These models were introduced in [73], and they pop up in a natural way in the description of sampled continuous-time, bilinear systems [7],[59]. On a finite time interval and for bounded inputs, they can approximate any continuous, discrete-time system arbitrarily well in uniform sense [26]. Just as in the case of bilinear models, the states $x(t)$ appear in the state and output equations in an affine way. Hence, such a model structure enables the use of subspace identification techniques to estimate the state space matrices [80].

D. Other Kinds of State Space Models

In the literature, state space models come in many different flavours. In this section, we give a non-exhaustive list of various existing approaches, and mention their most remarkable properties. The idea behind Linear Parameter Varying (LPV) models [80] is to create a linear, time-variant model. Its parameters are a function of a user-chosen vector $p(t) \in \mathbb{R}^s$ which characterizes the operating point of the system, and which is assumed to be measurable. The state space equations are an affine function of $p(t)$:


$$\begin{cases} x(t+1) = A \begin{bmatrix} x(t) \\ p(t) \otimes x(t) \end{bmatrix} + B \begin{bmatrix} u(t) \\ p(t) \otimes u(t) \end{bmatrix} \\ y(t) = C \begin{bmatrix} x(t) \\ p(t) \otimes x(t) \end{bmatrix} + D \begin{bmatrix} u(t) \\ p(t) \otimes u(t) \end{bmatrix} \end{cases} \qquad \text{(5-16)}$$

with $A = \begin{bmatrix} A_0 & A_1 & \ldots & A_s \end{bmatrix}$ and $A_i \in \mathbb{R}^{n_a \times n_a}$. The other state space matrices $B$, $C$, and $D$ are partitioned in a similar way. Two special cases of the LPV model are the bilinear and the state affine model. When $p(t)$ is chosen equal to $u(t)$, and $B_i = 0$, $C_i = 0$, and $D_i = 0$ for $i = 1, \ldots, s$, then the model equations become identical to the bilinear model equations (5-14). The state affine description of degree $r$ is obtained from the LPV model by choosing

$p(t)$ equal to a vector that contains all distinct nonlinear combinations of $u(t)$ up to degree $r-1$. The LPV model structure is particularly interesting for nonlinear control: it enables the use of different linear controllers at different operating points (i.e., gain scheduling). Another kind of state space model is the so-called Local Linear Model (LLM) [80]. The idea here is to partition the input space and the state space into operating regions in which a particular linear model dominates. The state space matrices are defined as a weighted sum of local linear models:

$$\begin{cases} x(t+1) = \displaystyle\sum_{i=1}^{s} p_i(\phi_t) \left( A_i x(t) + B_i u(t) + O_i \right) \\ y(t) = \displaystyle\sum_{i=1}^{s} p_i(\phi_t) \left( C_i x(t) + D_i u(t) + P_i \right) \end{cases} \qquad \text{(5-17)}$$

The scalar weighting functions $p_i(\cdot)$ generally have local support, like for instance radial basis functions. The scheduling vector $\phi_t$ is a function of the input $u(t)$ and the state $x(t)$. The last type of nonlinear state space model discussed here is the deterministic Neural State Space model [77],[78]. The general nonlinear equations in (5-4) are parametrized by multilayer feedforward neural networks with hyperbolic tangents as activation functions:

$$\begin{cases} x(t+1) = W_{AB} \tanh\!\left( V_A x(t) + V_B u(t) + \beta_{AB} \right) \\ y(t) = W_{CD} \tanh\!\left( V_C x(t) + V_D u(t) + \beta_{CD} \right) \end{cases} \qquad \text{(5-18)}$$


The model in (5-18) can be viewed as a multi-layer recurrent neural network with one hidden layer. It is also a specific kind of $\mathrm{NL}_q$ system, for which sufficient conditions for global asymptotic stability were derived in [78]. Furthermore, the $\mathrm{NL}_q$ theory allows one to check and to ensure the global asymptotic stability of neural control loops.


5.3 Polynomial Nonlinear State Space Models

The approach in this thesis consists of starting from the general model

$$\begin{cases} x(t+1) = f(x(t), u(t), \theta) \\ y(t) = g(x(t), u(t), \theta) \end{cases} \qquad \text{(5-19)}$$

and applying a functional expansion to the functions $f(\cdot)$ and $g(\cdot)$. For this, we need to choose a set of basis functions out of the many possibilities: sigmoid functions, wavelets, radial basis functions, polynomials, hyperbolic tangents, etc. We opted for a polynomial approach. The main advantage of polynomial basis functions is that they are straightforward to compute and easy to apply in a multivariable framework. We propose the following notation for the Polynomial NonLinear State Space (PNLSS) model:

$$\begin{cases} x(t+1) = A x(t) + B u(t) + E \zeta(t) \\ y(t) = C x(t) + D u(t) + F \eta(t) \end{cases} \qquad \text{(5-20)}$$

The coefficients of the linear terms in $x(t)$ and $u(t)$ are given by the matrices $A \in \mathbb{R}^{n_a \times n_a}$ and $B \in \mathbb{R}^{n_a \times n_u}$ in the state equation, and $C \in \mathbb{R}^{n_y \times n_a}$ and $D \in \mathbb{R}^{n_y \times n_u}$ in the output equation. The vectors $\zeta(t) \in \mathbb{R}^{n_\zeta}$ and $\eta(t) \in \mathbb{R}^{n_\eta}$ contain monomials in $x(t)$ and $u(t)$; the matrices $E \in \mathbb{R}^{n_a \times n_\zeta}$ and $F \in \mathbb{R}^{n_y \times n_\eta}$ contain the coefficients associated with those monomials. The separation between the linear and the nonlinear terms in (5-20) is of no importance for the behaviour of the model. However, later on in the identification procedure this distinction will turn out to be very practical. The reason for this is that the first stage of the identification procedure consists of estimating a linear model. First, we briefly summarize the multinomial expansion theorem and the graded lexicographic order, which are both useful concepts when dealing with multivariable monomials.
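As an illustration of how such a model is used in practice, the following sketch simulates the state and output equations above for given matrices. This is our own illustrative Python (the thesis' actual implementation is in Matlab), and the helper names `monomials` and `simulate_pnlss` are ours; the monomial vector simply stacks all distinct cross-products of the states and inputs of degree two up to $r$:

```python
import numpy as np
from itertools import combinations_with_replacement

def monomials(xi, r):
    """All distinct monomials of degree 2 up to r in the entries of xi,
    grouped by degree (graded lexicographic order)."""
    return np.array([np.prod(xi[list(c)])
                     for d in range(2, r + 1)
                     for c in combinations_with_replacement(range(len(xi)), d)])

def simulate_pnlss(A, B, C, D, E, F, u, r, x0=None):
    """Simulate x(t+1) = A x + B u + E zeta, y = C x + D u + F eta,
    with zeta(t) = eta(t) = the monomials of [x(t); u(t)] of degree 2..r."""
    x = np.zeros(A.shape[0]) if x0 is None else x0
    y = []
    for ut in np.asarray(u, dtype=float).reshape(len(u), -1):
        z = monomials(np.concatenate([x, ut]), r)
        y.append(C @ x + D @ ut + F @ z)   # output equation
        x = A @ x + B @ ut + E @ z         # state equation
    return np.array(y)
```

With $E = F = 0$ the simulation reduces to an ordinary linear state space model, which gives a simple sanity check.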

5.3.1 Multinomial Expansion Theorem

In order to denote monomials in an uncomplicated way, we first define the $n$-dimensional multi-index $\alpha$, which contains the powers of a multivariable monomial:


$$\alpha = \begin{bmatrix} \alpha_1 & \alpha_2 & \ldots & \alpha_n \end{bmatrix}, \qquad \text{(5-21)}$$

with $\alpha_i \in \mathbb{N}$. A monomial composed of the components of the vector $\xi \in \mathbb{R}^n$ is then simply written as

$$\xi^\alpha = \prod_{i=1}^{n} \xi_i^{\alpha_i}, \qquad \text{(5-22)}$$

where $\xi_i$ is the $i$-th component of $\xi$. The total degree of the monomial is given by

$$|\alpha| = \sum_{i=1}^{n} \alpha_i, \qquad \text{(5-23)}$$

and the factorial function of the multi-index $\alpha$ is defined as

$$\alpha! = \alpha_1!\, \alpha_2! \cdots \alpha_n!. \qquad \text{(5-24)}$$

Furthermore, we define $\xi^{(r)}$ as the column vector of all the distinct monomials of degree $r$ (i.e., with multi-index $|\alpha| = r$) composed from the elements of vector $\xi$. The number of elements in the vector $\xi^{(r)}$ is given by the following binomial coefficient (see Appendix 5.A):

$$\binom{n + r - 1}{r}. \qquad \text{(5-25)}$$

Finally, the vector $\xi^{\{r\}}$ is defined as the column vector containing all the monomials of degree two up to degree $r$. The length of this vector is given by

$$L_{n,r} = \binom{n + r}{r} - 1 - n. \qquad \text{(5-26)}$$

The notations introduced above can now be used to express the multinomial expansion theorem [83].


Theorem 5.1 (Multinomial Expansion Theorem) The multinomial expansion theorem gives an expression for the power of a sum as a function of the powers of the terms:

$$\left( \sum_{i=1}^{n} \xi_i \right)^{k} = \sum_{|\alpha| = k} \frac{k!}{\alpha!}\, \xi^\alpha \qquad \text{(5-27)}$$
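The identity is easy to verify numerically; the sketch below is our own illustrative check (the helper name `multinomial_expand` is ours), enumerating every multi-index with $|\alpha| = k$ as an index multiset:

```python
from itertools import combinations_with_replacement
from math import factorial, prod

def multinomial_expand(xi, k):
    """Evaluate the right-hand side of (5-27) for a concrete vector xi:
    the sum over all multi-indices alpha with |alpha| = k of k!/alpha! * xi^alpha."""
    n = len(xi)
    total = 0.0
    for c in combinations_with_replacement(range(n), k):
        alpha = [c.count(i) for i in range(n)]        # multi-index with |alpha| = k
        coeff = factorial(k) // prod(factorial(a) for a in alpha)
        total += coeff * prod(x ** a for x, a in zip(xi, alpha))
    return total

# Both sides of (5-27) agree:
xi = [1.5, -2.0, 0.75]
assert abs(multinomial_expand(xi, 4) - sum(xi) ** 4) < 1e-9
```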

5.3.2 Graded Lexicographic Order

To assemble the monomials in a deterministic way, it is convenient to define an order of succession among the monomials. A possible choice is to utilize the lexicographic (or alphabetical) order. For this, a sequence is chosen between the symbols $\{\xi_1\}, \{\xi_2\}, \ldots, \{\xi_n\}$. The most trivial choice is to base the ordering on the index $i$ of $\xi_i$:

$$\{\xi_1\} < \{\xi_2\} < \ldots < \{\xi_n\} \qquad \text{(5-28)}$$

When the symbols $\xi_i$ are combined into a word, we arrange them according to this order. For instance, the disordered monomial $\xi_3 \xi_1^2 \xi_2$ should be written as $\{\xi_1 \xi_1 \xi_2 \xi_3\}$. Furthermore, monomials of the same degree can be ordered like words in a dictionary, e.g.

$$\{\xi_1 \xi_1 \xi_2\} < \{\xi_1 \xi_1 \xi_3\} < \{\xi_2 \xi_2 \xi_3\}. \qquad \text{(5-29)}$$

Monomials of different degrees are placed in groups of increasing degree. Within each degree, the lexicographic order is applied. This results in the so-called graded lexicographic order. We will use this to order the elements of the vector $\xi^{\{r\}}$, which contains all monomials with a degree between two and $r$. For instance, the vector $\xi^{\{3\}}$ with $n = 2$ denotes

$$\xi^{\{3\}} = \begin{bmatrix} \xi^{(2)} ;\; \xi^{(3)} \end{bmatrix} = \begin{bmatrix} \xi_1^2 & \xi_1 \xi_2 & \xi_2^2 & \xi_1^3 & \xi_1^2 \xi_2 & \xi_1 \xi_2^2 & \xi_2^3 \end{bmatrix}^T \qquad \text{(5-30)}$$
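These ordered vectors are straightforward to generate programmatically. The sketch below is our own illustration (helper names are ours); it relies on the fact that `combinations_with_replacement` enumerates index tuples in exactly the lexicographic order defined above, and checks the lengths against (5-25) and (5-26):

```python
from itertools import combinations_with_replacement
from math import comb

def xi_of_degree(n, r):
    """Index tuples of all distinct monomials of degree r in n variables,
    in lexicographic order: the vector xi^(r)."""
    return list(combinations_with_replacement(range(1, n + 1), r))

def xi_up_to(n, r):
    """All monomials of degree 2 up to r: the vector xi^{r},
    in graded lexicographic order."""
    return [m for d in range(2, r + 1) for m in xi_of_degree(n, d)]

# n = 2, r = 3 reproduces (5-30): the tuple (1, 1, 2) stands for xi1*xi1*xi2, etc.
print(xi_up_to(2, 3))

# The lengths agree with the binomial counts (5-25) and (5-26):
assert len(xi_of_degree(2, 3)) == comb(2 + 3 - 1, 3)
assert len(xi_up_to(2, 3)) == comb(2 + 3, 3) - 1 - 2   # L_{2,3} = 7
```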

5.3.3 Approximation Behaviour

When a full polynomial expansion of (5-19) is carried out, all monomials up to a chosen degree $r$ must be taken into account. First, we define $\xi(t)$ as the concatenation of the state vector and the input vector:


$$\xi(t) = \begin{bmatrix} x_1(t) & \ldots & x_{n_a}(t) & u_1(t) & \ldots & u_{n_u}(t) \end{bmatrix}^T. \qquad \text{(5-31)}$$

As a consequence, the dimension of the vector $\xi(t)$ is given by $n = n_a + n_u$. Then, we define $\zeta(t)$ and $\eta(t)$ in equation (5-20) as

$$\zeta(t) = \eta(t) = \xi(t)^{\{r\}}. \qquad \text{(5-32)}$$

This is our default choice for the PNLSS model structure. The total number of parameters required by the model in (5-20) is given by

$$\left( \binom{n + r}{r} - 1 \right) (n_a + n_y) = \left( \binom{n_a + n_u + r}{r} - 1 \right) (n_a + n_y). \qquad \text{(5-33)}$$

When all the nonlinear combinations of the states are present in $\zeta(t)$ and $\eta(t)$ for a given degree, the proposed model structure is invariant under a similarity transform. Since the $n_a^2$ elements of the transform matrix $T$ can be chosen freely provided that $T$ is non-singular, the effective number of parameters becomes

$$\left( \binom{n_a + n_u + r}{r} - 1 \right) (n_a + n_y) - n_a^2. \qquad \text{(5-34)}$$

A. The PNLSS Approach versus State Affine Models

The question we want to answer is what the approximation properties of this model structure are. When taking a closer look at (5-15) (“State Affine Models” on p. 89), we observe that the State Affine (SA) representation forms a subclass of the default PNLSS model structure. Therefore, the PNLSS model structure inherits its approximation properties from the state affine framework. The remaining question is then what the additional advantage of the PNLSS approach is over the state affine representation. To investigate this, we recapitulate a derivation given in [59] for a SISO system. This derivation starts from a polynomial expansion of degree $2r$ of the general state space equations (5-19). This expansion is expressed by Kronecker products:


$$\begin{cases} x(t+1) = \displaystyle\sum_{i=0}^{r} \sum_{j=0}^{r} F_{ij}\, x^{(i)}(t)\, u^j(t), & F_{00} = 0 \\ y(t) = \displaystyle\sum_{i=0}^{r} \sum_{j=0}^{r} G_{ij}\, x^{(i)}(t)\, u^j(t), & G_{00} = 0 \end{cases} \qquad \text{(5-35)}$$

where $F_{ij}$ and $G_{ij}$ are the Taylor series coefficients of $f$ and $g$, respectively, and $x^{(i)}(t)$ is defined as follows:

$$x^{(2)}(t) = x(t) \otimes x(t), \quad \ldots, \quad x^{(r)}(t) = x(t) \otimes \ldots \otimes x(t) \qquad \text{(5-36)}$$

The notation with Kronecker products is more elegant than the one we proposed in (5-20), but it has the disadvantage that redundant monomials are present in the vector x ( i ) ( t ) . However, for the ideas developed here the Kronecker notation is well suited. Next, difference equations are developed for x ( i ) ( t + 1 ) . For instance, for x ( 2 ) ( t + 1 ) we have

$$\begin{aligned} x^{(2)}(t+1) &= x(t+1) \otimes x(t+1) \\ &= \left( \sum_{i=0}^{r} \sum_{j=0}^{r} F_{ij}\, x^{(i)}(t)\, u^j(t) \right) \otimes \left( \sum_{i=0}^{r} \sum_{j=0}^{r} F_{ij}\, x^{(i)}(t)\, u^j(t) \right) \end{aligned} \qquad \text{(5-37)}$$

Still following the calculations in [59] and using implicit summation, this results in a difference equation of the form

$$x^{(2)}(t+1) = \sum_{\substack{i,\,j \,\ge\, 0 \\ k + q = i \\ m + n = j}} F_{km} \otimes F_{qn}\; x^{(i)}(t)\, u^j(t). \qquad \text{(5-38)}$$

We apply the same procedure for x ( 3 ) ( t + 1 ) , and so on. Furthermore, a new state vector is defined:


$$x^{\otimes}(t) = \begin{bmatrix} x^{(1)}(t) \\ \vdots \\ x^{(r)}(t) \end{bmatrix} \qquad \text{(5-39)}$$

Note that this state vector is non-minimal for $n_a > 1$, because it contains identical elements due to the redundant monomials of the Kronecker representation. Finally, the terms in (5-38) with a nonlinear degree greater than $r$ (i.e., the terms for which $i + j > r$) are neglected. This implies that the approximation of the system is actually of degree $r$. The following state affine model is then obtained:

$$\begin{cases} x^{\otimes}(t+1) = \displaystyle\sum_{i=0}^{r-1} A_i\, u^i(t)\, x^{\otimes}(t) + \sum_{i=1}^{r} B_i\, u^i(t) \\ y(t) = \displaystyle\sum_{i=0}^{r-1} C_i\, u^i(t)\, x^{\otimes}(t) + \sum_{i=1}^{r} D_i\, u^i(t) \end{cases} \qquad \text{(5-40)}$$

B. Comparison of the Number of Parameters

In (5-39), we observe that the number of states in the state affine approximation grows combinatorially with the degree of approximation $r$. This is the price to be paid for the state affine representation. To calculate the number of required states, the redundant states that originate from the use of the Kronecker product need to be taken into account. The total number of distinct states $n_{\otimes}$ of model (5-40) is:

$$n_{\otimes} = \binom{n_a + r}{r} - 1 \qquad \text{(5-41)}$$

For a SISO state affine model, there are $n_{\otimes}^2 + 2 n_{\otimes} + 1$ matrix coefficients per set of state affine matrices $\{A_i, B_i, C_i, D_i\}$, and in total there are $r$ such sets. We also have to take into account the similarity transform. Hence, the actual number of parameters is given by

$$r \left( n_{\otimes}^2 + 2 n_{\otimes} + 1 \right) - n_{\otimes}^2. \qquad \text{(5-42)}$$

We will now compare this with the number of parameters that are required for a PNLSS approximation of degree $r$. In Figure 5-1, the ratio between expressions (5-42) (state affine approach) and (5-33) (PNLSS approach) is shown for various system orders ($n_a = 1, \ldots, 10$) and for different degrees of approximation ($r = 1, \ldots, 5$).

Figure 5-1. Ratio of the number of parameters for the SA approximation and the PNLSS approximation, for different degrees $r$ (ratio on a logarithmic scale, $10^0$ to $10^3$, versus the order $n_a$ of the approximated system).

For $r = 1$ (red line), the ratio is one. This is a natural result, since it corresponds to a linear model, which has the same order for both approximations. For $r > 1$, it can be seen from Figure 5-1 that the ratio is always higher than one.

C. Conclusion

We have shown that, for an approximation of the same quality, the PNLSS model always requires fewer parameters than the state affine model. For this reason, we prefer the PNLSS model structure over the state affine one.

5.3.4 Stability

The only recursive relation present in the general state space model (5-19) is the state equation. Hence, the stability of the model depends only on the function $f$, the initial conditions of the state, and the properties of the input signal. Therefore, when analysing the stability of (5-19), it suffices to study the equation

$$x(t+1) = f(x(t), u(t)), \qquad \text{(5-43)}$$

with x ( 0 ) = x 0 . The concept of Input-to-State Stability (ISS) reflects this idea. It was introduced in [74] for continuous-time systems, and extended to the discrete-time case in


[32]. In order to define ISS, the following notations and definitions are used: $\mathbb{N}$ denotes the set of all non-negative integers, and $\|\cdot\|$ is the Euclidean norm. The set of all input functions $u: \mathbb{N} \to \mathbb{R}^m$ with norm $\|u\| = \sup\{\, \|u(t)\| : t \in \mathbb{N} \,\} < \infty$ is denoted by $l_\infty^m$. The initial state is given by $x(0) = x_0$.

Definition 5.2 A function $\gamma: \mathbb{R}_{\ge 0} \to \mathbb{R}_{\ge 0}$ is a $\mathcal{K}$-function if it is continuous, strictly increasing, and if $\gamma(0) = 0$. A function $\beta: \mathbb{R}_{\ge 0} \times \mathbb{N} \to \mathbb{R}_{\ge 0}$ is a $\mathcal{KL}$-function if for each fixed $t \ge 0$ the function $\beta(\cdot, t)$ is a $\mathcal{K}$-function, if for each fixed $s \ge 0$ the function $\beta(s, \cdot)$ is decreasing, and if $\beta(s, t) \to 0$ as $t \to \infty$.

Definition 5.3 (Input-to-State Stability) System (5-43) is globally ISS if there exist a $\mathcal{KL}$-function $\beta$ and a $\mathcal{K}$-function $\gamma$ such that for each input $u \in l_\infty^m$ and each $x_0 \in \mathbb{R}^{n_a}$ it holds that

$$\| x(t, x_0, u) \| \le \beta( \|x_0\|, t ) + \gamma( \|u\| ) \qquad \text{(5-44)}$$

for each $t \in \mathbb{N}$. Loosely explained, a system is ISS if every state trajectory corresponding to a bounded input remains bounded, and if the trajectory eventually becomes small when the input signal becomes small as well, independent of the initial state. In this thesis, we will not try to find such functions $\beta$ and $\gamma$.

5.3.5 Some Remarks on the Polynomial Approach

A. Orthogonal Polynomials

The question addressed here is whether the use of orthogonal polynomials can create some additional value for the identification of the PNLSS model. Orthogonal polynomials have proved their usefulness in times when computing power and memory were scarce. For linear problems, the orthogonality of regressors induces two advantages. First of all, the re-estimation of parameters is circumvented when new regressors are added to an already solved problem. Unfortunately, this asset is of no use here, since the identification of the proposed model (5-20) requires solving a nonlinear problem (see “Identification of the PNLSS Model” on p. 115). Secondly, orthogonality can improve the numerical conditioning. To this


matter, it should be noted that orthogonal basis functions only offer a clear advantage when they are applied to signals with a given probability density function. For instance, Hermite polynomials and Chebyshev polynomials are well suited for Gaussian and uniformly distributed signals, respectively. Bearing in mind the class of excitation signals we chose in Chapter 2, it would be logical to select Hermite polynomials. However, the states, which are polynomial functions of the input and the previous states, do not necessarily have a Gaussian distribution. Therefore, we will employ ordinary polynomials, since no clear advantage can be taken from the application of orthogonal polynomials.

B. Disadvantages of the Polynomial Approach

A number of drawbacks come along with the application of polynomials. The most important one is the explosive behaviour of polynomials outside the region in which they were estimated. Indeed, a polynomial quickly attains large numerical values when its arguments are large. At first sight, this might seem a serious drawback compared with the well-behaved extrapolation of basis functions that tend to a constant value for large input values. However, it should be noted that, in general, it is never a good idea to extrapolate with an estimated model. This fact is independent of the chosen basis functions, whether they are polynomials, hyperbolic tangents, radial basis functions, or sigmoids. The only exception to this rule is when there exists an exact match between the DUT’s internal structure and the model structure. In a black-box framework, this is seldom the case. We illustrate this rule of thumb by means of a short simulation, where we use two kinds of basis functions to approximate the relation

$$y = \arctan(u). \qquad \text{(5-45)}$$

To generate the estimation data set, an input signal of 2000 samples, uniformly distributed between u = – 5 and u = 5 is used. First, we estimate a 15th degree polynomial using linear least squares. Then, a Gaussian Radial Basis Function (RBF) network with 8 centres is estimated with the RBF Matlab toolbox [51]. Both models require the estimation of 16 parameters. Next, we evaluate the extrapolation behaviour of both models by applying an input between u = – 10 and u = 10 . The result of this test is shown in Figure 5-2. The top plot shows the original function (solid black line), together with the polynomial approximation (dashed grey line) and the RBF approximation (solid grey line). The bottom plot shows the error of both approximations on a logarithmic scale. Although the output of the RBF model

100

Polynomial Nonlinear State Space Models

y

2 0 í2

Error [dB]

í10

í5

0

5

10

í5

0 u

5

10

100 50 0 í50 í10

Figure 5-2. Top: arctangent function (solid black line); polynomial approximation (dash dotted grey line); Gaussian RBF approximation (solid grey line). Bottom: Model error for both approximations. does not explode like with the polynomial approach (as a matter of fact, it converges to zero for x → ± ∞ ), it still exhibits severe extrapolation errors close to the estimation region. We conclude that the use of ‘well-behaved’ basis functions like RBFs, hyperbolic tangents, or sigmoids is no justification to employ them for extrapolation.
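The polynomial half of this experiment is easy to reproduce; a minimal sketch of our own (plain NumPy least squares instead of the RBF toolbox [51], and a fit on a scaled domain for numerical conditioning — a raw 15th-degree Vandermonde fit on $[-5, 5]$ is ill-conditioned):

```python
import numpy as np

rng = np.random.default_rng(0)
u_est = rng.uniform(-5, 5, 2000)              # estimation data, as in the text
y_est = np.arctan(u_est)

# 15th-degree polynomial by linear least squares (Polynomial.fit rescales
# the domain internally, which keeps the normal equations well conditioned).
poly = np.polynomial.Polynomial.fit(u_est, y_est, deg=15)

u_in = np.linspace(-5, 5, 201)                # inside the estimation region
err_in = np.max(np.abs(poly(u_in) - np.arctan(u_in)))

err_out = abs(poly(10.0) - np.arctan(10.0))   # extrapolation to u = 10

print(f"error inside [-5,5]: {err_in:.1e}; error at u = 10: {err_out:.1e}")
```

The extrapolation error at $u = 10$ is many orders of magnitude larger than the in-region error, which is the behaviour shown in the bottom panel of Figure 5-2.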


5.4 On the Equivalence with some Block-oriented Models

In the past years, simple block-oriented models have been utilized extensively to model nonlinear systems. The block-oriented models most commonly used are the Hammerstein, Wiener, Wiener-Hammerstein, and the Nonlinear Feedback model. Numerous applications of these models in various fields can be found in the literature: the modelling of heat exchangers [10], transmission lines [58], chemical processes [49], and biological systems [30]. Furthermore, various identification methods for block-oriented models exist [2],[3],[11]. In the following sections, we will establish a link between the Polynomial NonLinear State Space (PNLSS) model and a number of standard block-oriented models. We will restrict ourselves to SISO systems, because there exists no rigorous definition for MIMO block-oriented models: when the number of system inputs differs from the number of outputs, it is not clear which dimensions the intermediate signals (i.e., the signals between the blocks) should have. Furthermore, a distinction will be made between the Hammerstein, Wiener, and Wiener-Hammerstein systems. The first two can be considered special cases of the Wiener-Hammerstein system, but making the distinction renders the analysis simpler and more interpretable.

5.4.1 Hammerstein

A Hammerstein system consists of a static nonlinearity followed by a linear dynamic system (see Figure 5-3). A typical example where this model is utilized is the case of a non-ideal sensor exhibiting a static nonlinear effect, followed by a transmission line showing linear dynamic behaviour.

Figure 5-3. Hammerstein system.

In general, the input signal $u$ is distorted by a static nonlinearity $P$, resulting in the intermediate signal $v$, which is filtered by a linear system $G_0(z)$. The linear system is


parametrized as an $n_a$-th order linear state space model with parameters $\{A_0, B_0, C_0, D_0\}$. For the parametrization of the static nonlinearity, we rely on the Weierstrass theorem.

Theorem 5.4 (Weierstrass Approximation Theorem) Let $f$ be a continuous function on a closed interval $[a,b]$. Then, given any $\varepsilon > 0$, there exists a polynomial $P$ of degree $r$ such that

$$|f(x) - P(x)| < \varepsilon \qquad \text{(5-46)}$$

for all $x$ in $[a,b]$. In other words, a continuous function on a closed interval can be uniformly approximated by polynomials [35]. Hence, the static nonlinearity in Figure 5-3 is parametrized as a polynomial with coefficients $p_i$. The following equations describe the Hammerstein system:

$$v(t) = \sum_{i=1}^{r} p_i\, u^i(t) \qquad \text{(5-47)}$$

$$\begin{cases} x_0(t+1) = A_0 x_0(t) + B_0 v(t) \\ y(t) = C_0 x_0(t) + D_0 v(t) \end{cases} \qquad \text{(5-48)}$$

The substitution of (5-47) in (5-48) results in a set of equations identical to (5-20), when we define the system matrices in (5-20) as

$$\begin{aligned} A &= A_0 & B &= p_1 B_0 & E &= \begin{bmatrix} p_2 B_0 & \ldots & p_r B_0 \end{bmatrix} \\ C &= C_0 & D &= p_1 D_0 & F &= \begin{bmatrix} p_2 D_0 & \ldots & p_r D_0 \end{bmatrix} \end{aligned} \qquad \text{(5-49)}$$

and the vectors of monomials as

$$\zeta(t) = \eta(t) = u(t)^{\{r\}}. \qquad \text{(5-50)}$$


For the Hammerstein system, ζ ( t ) and η ( t ) are a subset of the polynomial vector functions defined in (5-32). Therefore, we can conclude that a Hammerstein system with a continuous nonlinearity can be represented by the PNLSS model in (5-20).
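The mapping (5-49) is purely mechanical, as the following sketch shows (our own illustrative helper, SISO, with the polynomial coefficients passed as p = [p1, …, pr]); the loop at the end checks that a direct simulation of the Hammerstein system and the PNLSS form (5-49)-(5-50) produce identical outputs:

```python
import numpy as np

def hammerstein_to_pnlss(A0, B0, C0, D0, p):
    """Map a SISO Hammerstein system (static polynomial with coefficients p,
    followed by the linear system {A0, B0, C0, D0}) to the PNLSS matrices (5-49)."""
    p = np.asarray(p, dtype=float)                 # p[0] = p_1, ..., p[-1] = p_r
    A, C = A0, C0
    B, D = p[0] * B0, p[0] * D0
    E = np.hstack([pi * B0 for pi in p[1:]])       # coefficients of u^2 .. u^r
    F = np.hstack([pi * D0 for pi in p[1:]])
    return A, B, C, D, E, F

# Consistency check against a direct simulation of the Hammerstein system:
A0, B0 = np.array([[0.5]]), np.array([[1.0]])
C0, D0 = np.array([[2.0]]), np.array([[0.1]])
p = [1.0, 0.3, -0.2]                               # r = 3
A, B, C, D, E, F = hammerstein_to_pnlss(A0, B0, C0, D0, p)
x = np.zeros(1)
xh = np.zeros(1)
for u in [0.5, -1.0, 2.0]:
    v = sum(pi * u ** (i + 1) for i, pi in enumerate(p))
    z = np.array([u ** 2, u ** 3])                 # zeta = eta = u^{r}, eq. (5-50)
    assert np.allclose(C0 @ xh + D0.ravel() * v,        # Hammerstein output
                       C @ x + D.ravel() * u + F @ z)   # PNLSS output
    xh = A0 @ xh + B0.ravel() * v
    x = A @ x + B.ravel() * u + E @ z
```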

5.4.2 Wiener

A Wiener system is composed of a linear dynamic block $G_0(z)$ followed by a static nonlinear block, as shown in Figure 5-4.

Figure 5-4. Wiener system.

The equations that describe the behaviour of a Wiener system are

$$\begin{cases} x_0(t+1) = A_0 x_0(t) + B_0 u(t) \\ v(t) = C_0 x_0(t) + D_0 u(t) \end{cases} \qquad \text{(5-51)}$$

$$y(t) = \sum_{i=1}^{r} p_i\, v^i(t) \qquad \text{(5-52)}$$

By substituting the second equation of (5-51) in (5-52), and by applying the multinomial expansion (see “Multinomial Expansion Theorem” on p. 92), we find the following set of system matrices:

$$\begin{aligned} A &= A_0 & B &= B_0 & E &= 0 \\ C &= p_1 C_0 & D &= p_1 D_0 & F &= \begin{bmatrix} p_2 C_0^2(1) & 2 p_2 C_0(1) C_0(2) & \ldots & r p_r C_0(n_a) D_0^{r-1} & p_r D_0^r \end{bmatrix} \end{aligned} \qquad \text{(5-53)}$$

and

$$\zeta(t) = 0 \qquad \eta(t) = \xi(t)^{\{r\}}. \qquad \text{(5-54)}$$


A similar conclusion as for the Hammerstein systems can be drawn: ζ ( t ) and η ( t ) in (5-54) are a subset of the polynomial vector functions defined in (5-32). Hence, Wiener systems with a continuous static nonlinearity can be represented using the PNLSS approach.

5.4.3 Wiener-Hammerstein

Wiener-Hammerstein systems are defined as a static nonlinear block sandwiched between two linear dynamic blocks $G_1(z)$ and $G_2(z)$ with orders $n_1$ and $n_2$, respectively (see Figure 5-5). The intermediate signals are denoted as $v(t)$ and $w(t)$.

Figure 5-5. Wiener-Hammerstein system.

The system equations are:

$$\begin{cases} x_1(t+1) = A_1 x_1(t) + B_1 u(t) \\ v(t) = C_1 x_1(t) + D_1 u(t) \end{cases} \qquad \text{(5-55)}$$

$$w(t) = \sum_{i=1}^{r} p_i\, v^i(t) \qquad \text{(5-56)}$$

$$\begin{cases} x_2(t+1) = A_2 x_2(t) + B_2 w(t) \\ y(t) = C_2 x_2(t) + D_2 w(t) \end{cases} \qquad \text{(5-57)}$$

These equations are combined in order to obtain the representation of (5-20). For this, the state vectors x 1 ( t ) and x 2 ( t ) are merged into the new state vector x ( t ) . Again, the equivalence holds, and we obtain the system matrices in (5-58).


$$\begin{aligned} A &= \begin{bmatrix} A_1 & 0_{n_1 \times n_2} \\ p_1 B_2 C_1 & A_2 \end{bmatrix} & B &= \begin{bmatrix} B_1 \\ p_1 B_2 D_1 \end{bmatrix} & C &= \begin{bmatrix} p_1 D_2 C_1 & C_2 \end{bmatrix} & D &= p_1 D_2 D_1 \\ E &= \begin{bmatrix} 0_{n_1 \times 1} & 0_{n_1 \times 1} & \ldots & 0_{n_1 \times 1} & 0_{n_1 \times 1} \\ p_2 B_2 C_1^2(1) & 2 p_2 B_2 C_1(1) C_1(2) & \ldots & r p_r B_2 C_1(n_1) D_1^{r-1} & p_r B_2 D_1^r \end{bmatrix} \\ F &= D_2 \begin{bmatrix} p_2 C_1^2(1) & 2 p_2 C_1(1) C_1(2) & \ldots & r p_r C_1(n_1) D_1^{r-1} & p_r D_1^r \end{bmatrix} \end{aligned} \qquad \text{(5-58)}$$

The vectors of monomials are defined as

$$\zeta(t) = \eta(t) = \xi'(t)^{\{r\}}, \qquad \text{(5-59)}$$

where

$$\xi'(t) = \begin{bmatrix} x_1(t) & \ldots & x_{n_1}(t) & u(t) \end{bmatrix}^T. \qquad \text{(5-60)}$$

5.4.4 Nonlinear Feedback

In this section, we will discuss a simple (I) and a more general (II) type of Nonlinear Feedback system. The first system is shown in Figure 5-6, and is referred to as NLFB I.

Figure 5-6. Nonlinear Feedback I.

It is described by the following equations:

$$\begin{cases} x_0(t+1) = A_0 x_0(t) + B_0 v(t) \\ y(t) = C_0 x_0(t) + D_0 v(t) \end{cases} \qquad \text{(5-61)}$$

$$v(t) = u(t) - \sum_{i=1}^{r} p_i\, y^i(t) \qquad \text{(5-62)}$$

After substitution of (5-62) in (5-61), we obtain:

$$\begin{cases} x_0(t+1) = A_0 x_0(t) + B_0 \left( u(t) - \displaystyle\sum_{i=1}^{r} p_i\, y^i(t) \right) \\ y(t) = C_0 x_0(t) + D_0 \left( u(t) - \displaystyle\sum_{i=1}^{r} p_i\, y^i(t) \right) \end{cases} \qquad \text{(5-63)}$$

The last equation of (5-63) is a nonlinear algebraic equation due to the presence of the direct term coefficient D 0 . This coefficient renders the system incompatible with the PNLSS model. For the more general Nonlinear Feedback system (see Figure 5-7), similar nonlinear algebraic equations pop up due to the direct term of the linear subsystems.

Figure 5-7. Nonlinear Feedback II.

In order to continue the analysis, the following assumptions are made:

Assumption 5.5 (Delay in system NLFB I) A delay is present in the linear dynamic block $G_0(z)$, i.e., $D_0 = 0$.

Assumption 5.6 (Delay in system NLFB II) A delay is present in at least one of the linear blocks $G_1(z)$, $G_2(z)$, or $G_3(z)$. This is equivalent to $D_1 = 0$, $D_2 = 0$, or $D_3 = 0$.

Assumption 5.5 and Assumption 5.6 hold when, for instance, a digital controller is present somewhere in the feedback loop. If this is not the case, we will still assume that a zero direct

107

Chapter 5: Nonlinear State Space Modelling of Multivariable Systems

term is present in one of the linear blocks. Note that a delay is always present in real-life systems. If the sampling frequency is sufficiently high, the delay will be in the same order of magnitude as one delay tab. When this condition is not fulfilled, the data can be upsampled in order to achieve a negligible direct term [12]. The equations in (5-63) are therefore reduced to: r ⎧ i ⎛ p i ( C0 x0 ( t ) ) ⎞ ⎪ x0 ( t + 1 ) = A0 x0 ( t ) + B0 ⎝ u ( t ) – ∑ ⎠ i=1 ⎨ ⎪ y ( t ) = C0 x0 ( t ) ⎩

(5-64)

These system equations are equivalent to (5-20) using the following system parameters:

A = A0 − p1 B0 C0 ,  B = B0 ,  C = C0 ,  D = 0 ,  F = 0 ,
E = −B0 ( p2 C0(1)² , 2 p2 C0(1) C0(2) , … , r p_r C0(n_a − 1) C0(n_a)^(r−1) , … , p_r C0(n_a)^r )   (5-65)

and the following vectors of monomials:

ζ(t) = x(t){ r } ,  η(t) = 0 .   (5-66)
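The equivalence between the feedback equations (5-63) (with D0 = 0) and the PNLSS parameters of (5-65)-(5-66) is easy to check numerically. The sketch below uses a hypothetical first-order G0(z) and a cubic feedback polynomial (r = 3); all numerical values are illustrative.

```python
import numpy as np

# Hypothetical first-order example: G0(z) has one state (A0, B0, C0, D0 = 0)
# and the feedback polynomial is P(y) = p1*y + p2*y^2 + p3*y^3 (r = 3).
A0, B0, C0 = 0.9, 1.0, 0.5
p1, p2, p3 = 0.4, 0.3, 0.2
N = 200
rng = np.random.default_rng(0)
u = 0.1 * rng.standard_normal(N)     # small input keeps the loop well-behaved

# direct simulation of the feedback loop: v(t) = u(t) - P(y(t))
x, y_fb = 0.0, np.zeros(N)
for t in range(N):
    y_fb[t] = C0 * x
    v = u[t] - (p1 * y_fb[t] + p2 * y_fb[t]**2 + p3 * y_fb[t]**3)
    x = A0 * x + B0 * v

# PNLSS simulation with the parameters of (5-65):
# A = A0 - p1*B0*C0, E = -B0*[p2*C0^2, p3*C0^3], zeta(t) = [x^2, x^3]
A = A0 - p1 * B0 * C0
E = -B0 * np.array([p2 * C0**2, p3 * C0**3])
x, y_ss = 0.0, np.zeros(N)
for t in range(N):
    y_ss[t] = C0 * x
    x = A * x + B0 * u[t] + E @ np.array([x**2, x**3])

print(np.max(np.abs(y_fb - y_ss)))   # agreement to machine precision
```

Both loops implement the same polynomial recursion, so the outputs coincide up to floating-point rounding.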

To prove the equivalence for the system NLFB II, different polynomial vector maps are necessary. They depend on the position of the delay in the feedback loop (i.e., the linear system for which D i is assumed to be zero). In (5-67), we define three vectors composed of the state vectors of the linear systems G 1 ( z ) , G 2 ( z ) , and G 3 ( z ) .

ξ1(t) = [ x1(t) x2(t) ]^T ,  ξ2(t) = x2(t) ,  ξ3(t) = [ x1(t) x2(t) x3(t) u(t) ]^T   (5-67)

The necessary monomials ζ ( t ) and η ( t ) are listed in Table 5-2.

        D1 = 0        D2 = 0        D3 = 0
ζ(t)    ξ1(t){ r }    ξ2(t){ r }    ξ3(t){ r }
η(t)    0             ξ2(t){ r }    0

Table 5-2. Required monomials as a function of the position of the delay.



5.4.5 Conclusion

We have established a link between a number of standard block-oriented models and the PNLSS model. The results for the different nonlinear block structures are summarized in Table 5-3 and Table 5-4. Each row presents the required monomials for the PNLSS approach.

        Hammerstein (5-49)    Wiener (5-53)    Wiener-Hammerstein (5-58)
ζ(t)    u(t){ r }             0                ξ'(t){ r }
η(t)    u(t){ r }             ξ(t){ r }        ξ'(t){ r }

Table 5-3. PNLSS monomials for open loop block-oriented models.

        NLFB I (5-65)    NLFB II, D1 = 0 (5-67)    NLFB II, D2 = 0 (5-67)    NLFB II, D3 = 0 (5-67)
ζ(t)    x(t){ r }        ξ1(t){ r }                ξ2(t){ r }                ξ3(t){ r }
η(t)    0                0                         ξ2(t){ r }                0

Table 5-4. PNLSS monomials for feedback block-oriented models.

Block-oriented models undoubtedly give the most physical insight to the user. From an identification point of view, they often require fewer parameters than the PNLSS approach. The open loop block-oriented models have the advantage that the intermediary signals are solved in a non-recurrent way during the estimation and the simulation. Therefore, their stability is simple to check and to ensure. On the other hand, block-oriented models require prior knowledge about the structure of the device, which is not always easy to obtain. For some block-oriented models, like the Wiener-Hammerstein system or the Nonlinear Feedback structure, initial values are not always straightforward to obtain. The PNLSS model is inherently compatible with MIMO systems, and it does not need any prior knowledge. The price paid for this flexibility is the explosion of the number of required model parameters. The pros and cons of both approaches are summarized in Table 5-5. To conclude,


neither of the two approaches is clearly better than the other. For this reason, the user should choose an appropriate model structure according to his/her needs.

                           Block-Oriented    State Space
Physical interpretation    +                 −
Number of parameters       +                 −
Flexibility of the model   −                 +
Model initialization       −                 +

Table 5-5. Comparison of the block-oriented approach vs. the state space approach.


A Step beyond the Volterra Framework

5.5 A Step beyond the Volterra Framework

By means of two simple examples, we illustrate that systems which fit into the polynomial nonlinear state space model structure do not necessarily belong to the Volterra framework set up in the introductory chapter.

5.5.1 Duffing Oscillator

The Duffing oscillator is a second order nonlinear dynamic system which is excited with a harmonic signal. Its behaviour is described by the following differential equation:

d²y_c ⁄ dt² + a dy_c ⁄ dt + b y_c + c y_c³ = d cos(ωt) ,   (5-68)

where a, b, and c are the system parameters. The amplitude of the sinusoidal signal is determined by the parameter d. Depending on the value of this parameter, several kinds of output behaviour can occur, such as ordinary harmonic output, period doubling, period quadrupling, and even chaotic behaviour. The Duffing equation can also be written in a state space form:

dX1(t) ⁄ dt = X2(t)
dX2(t) ⁄ dt = d u_c(t) − a X2(t) − b X1(t) − c X1³(t)   (5-69)

where u c ( t ) = cos ( ωt ) . Next, this continuous-time model is converted into a discrete-time model using the Euler rule with a time step h :

dX(t) ⁄ dt = f( X(t), u_c(t) )   →   x(t+1) = x(t) + h f( x(t), u(t) )   (5-70)

such that

X(th) ≅ x(t) ,   (5-71)

and

u(t) = u_c(th) .   (5-72)

We apply this principle to (5-69), and obtain

x1(t+1) = x1(t) + h x2(t)
x2(t+1) = h d u(t) + (1 − h a) x2(t) − h b x1(t) − h c x1³(t)   (5-73)

The discrete-time model in (5-73) might be a poor approximation of its continuous-time counterpart due to the simplicity of the differentiation rule. However, in what follows the focus will lie solely on the behaviour of the discrete-time model, and not on the relation between (5-69) and (5-73). From (5-73), it is clear that this system belongs to the PNLSS model class. In the next simulation, these equations are simulated during N_sim = 10⁶ iterations using the following settings:

a = 1   b = −10   c = 100   d = 0.82   ω = 3.5   h = 2π ⁄ (ωN)   (5-74)

Figure 5-8. Top plot: state trajectory of the discretized Duffing oscillator; bottom plot: DFT of the state trajectory (grey) and the input signal (black).


with N = 4096. The value of the time step h is chosen such that no leakage is present in the calculated DFTs. In Figure 5-8, the state trajectory of x2(t) during the first 100 seconds is shown (top plot). In the bottom plot, the DFT of the last 8N samples of the state trajectory is plotted (grey), together with the DFT of the input signal (black). From this figure, we observe that, besides harmonic components, subharmonic components are also present: the lines corresponding to ω ⁄ 4, ω ⁄ 2, and 3ω ⁄ 4 are excited as well. Although the DUT can be represented by the PNLSS model, it does not fit into the Volterra framework.
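The discrete-time simulation is straightforward to reproduce. The sketch below simulates (5-73) with the settings of (5-74); for brevity it uses fewer iterations than the run in the text, and the variable names are illustrative.

```python
import numpy as np

# Simulation of the discretized Duffing oscillator (5-73) with the settings
# of (5-74); fewer iterations than in the text are used for brevity.
a, b, c, d, omega = 1.0, -10.0, 100.0, 0.82, 3.5
N = 4096                      # samples per period of the input
h = 2 * np.pi / (omega * N)   # time step: no leakage in the DFT
n_sim = 20 * N

x1, x2 = 0.0, 0.0
X2 = np.empty(n_sim)
for t in range(n_sim):
    u = np.cos(omega * t * h)
    x1, x2 = (x1 + h * x2,
              h * d * u + (1 - h * a) * x2 - h * b * x1 - h * c * x1**3)
    X2[t] = x2

# DFT of the last 8N samples; in the regime discussed in the text, the
# subharmonic lines appear at multiples of omega/4 in the long run
spectrum = np.abs(np.fft.rfft(X2[-8 * N:]))
```

Despite the cubic term, the damped forced oscillator stays bounded, which makes the long DFT analysis of the text possible.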

5.5.2 Lorenz Attractor

In [40], E. N. Lorenz studied the nonlinear differential equations that describe the behaviour of a forced, dissipative hydrodynamic flow. The solutions of these equations appeared to be extremely sensitive to minor changes of the initial conditions; this is the so-called butterfly effect. The behaviour of this system is often referred to as chaotic, although it is described by the following deterministic model equations:

dX1(t) ⁄ dt = σ ( X2(t) − X1(t) )
dX2(t) ⁄ dt = X1(t) ( ρ − X3(t) ) − X2(t)
dX3(t) ⁄ dt = X1(t) X2(t) − b X3(t)   (5-75)

where the model parameters are given by ρ = 28, σ = 10, and b = 8 ⁄ 3. As in the previous section, we convert these equations into a discrete-time description by applying the Euler differentiation method with time step h:

x1(t+1) = h σ ( x2(t) − x1(t) ) + x1(t) + h u(t)
x2(t+1) = h ( x1(t) ( ρ − x3(t) ) − x2(t) ) + x2(t)
x3(t+1) = h ( x1(t) x2(t) − b x3(t) ) + x3(t)   (5-76)

with X_i(th) ≅ x_i(t). An input term h u(t) is added to the first state equation, such that initial conditions can be imposed on the system. It can easily be seen that these equations fit into the proposed PNLSS model structure. As with the Duffing oscillator, we are not interested in a perfect match between (5-75) and (5-76). In the following simulation, we use a time step h = 0.01, apply an impulse with amplitude A = 0.01 on the state x1(t), and simulate the equations in (5-76) during N_sim = 10 000 samples. The resulting state trajectory is shown in Figures 5-9 and 5-10. The chaotic behaviour of this model is in contradiction to the Volterra framework, since here the response to a periodic input is definitely not periodic. This example illustrates that the PNLSS model structure is 'richer' than the Volterra framework.

Figure 5-9. State trajectory of the discretized Lorenz attractor (2D plot).

Figure 5-10. State trajectory of the discretized Lorenz attractor (3D plot).
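The trajectory of Figures 5-9 and 5-10 can be regenerated with a minimal sketch of the Euler-discretized equations (5-76), using the settings given above:

```python
import numpy as np

# Euler-discretized Lorenz equations (5-76) with the settings of the text:
# rho = 28, sigma = 10, b = 8/3, h = 0.01, an impulse of amplitude 0.01 on
# the input u(t), and N_sim = 10 000 samples.
rho, sigma, b, h = 28.0, 10.0, 8.0 / 3.0, 0.01
n_sim = 10_000

x = np.zeros(3)
X = np.empty((n_sim, 3))
for t in range(n_sim):
    u = 0.01 if t == 0 else 0.0   # impulse imposes the initial condition
    x = np.array([
        h * sigma * (x[1] - x[0]) + x[0] + h * u,
        h * (x[0] * (rho - x[2]) - x[1]) + x[1],
        h * (x[0] * x[1] - b * x[2]) + x[2],
    ])
    X[t] = x

# the trajectory wanders chaotically over the butterfly-shaped attractor,
# yet remains bounded
```

Note how the tiny impulse is enough to drive the state away from the unstable origin and onto the attractor.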


Identification of the PNLSS Model

5.6 Identification of the PNLSS Model

In this part of the chapter, an identification procedure for the model in (5-20) is proposed. It consists of four major steps. First, we estimate, in mean square sense, the Best Linear Approximation (BLA) of the plant. Then, a parametric linear model is estimated from the BLA using frequency domain subspace techniques. This is immediately followed by a nonlinear optimization to improve the linear model estimates. The last step consists of a nonlinear optimization procedure in order to obtain the parameters of the full nonlinear model.

5.6.1 Best Linear Approximation

For calculating the Best Linear Approximation of the Device Under Test (DUT), we refer to Chapter 2 (see "Estimating the Best Linear Approximation" on p. 28). The procedure explained there converts the measured input/output data into a nonparametric linear model Ĝ(k), given in the form of a Frequency Response Function (FRF), and its sample covariance, denoted by Ĉ_G(k). This data reduction step offers a number of advantages. First of all, the Signal to Noise Ratio (SNR) is enhanced. Secondly, it allows the user to select, in a straightforward way, a frequency band of interest. Finally, when periodic data are available, the measurement noise and the effect of the nonlinear behaviour can be separated.
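As a rough illustration of such a nonparametric estimate (the full procedure, including the separation of noise and nonlinear distortions, is described in Chapter 2), the sketch below averages the FRF of a simulated SISO plant over several periods and realizations of a random-phase multisine. All names and values are assumptions made for the illustration only.

```python
import numpy as np

# Illustrative sketch: average the FRF of a simulated SISO plant over M
# realizations and P periods of a random-phase multisine to obtain a
# nonparametric estimate G_bla(k) and a sample variance.
rng = np.random.default_rng(1)
N, P, M = 256, 4, 8                  # samples/period, periods, realizations
g = 0.5 ** np.arange(20)             # impulse response of the 'true' plant
G0 = np.fft.rfft(g, n=N)             # its FRF on the DFT grid

G_est = []
for _ in range(M):
    U1 = np.exp(1j * rng.uniform(0, 2 * np.pi, N // 2 + 1))
    U1[0], U1[-1] = 1.0, 1.0         # DC and Nyquist bins must be real
    u1 = np.fft.irfft(U1)            # one period of the multisine
    u = np.tile(u1, P + 1)           # one extra period as transient preamble
    y = np.convolve(u, g)[: N * (P + 1)]
    y += 0.001 * rng.standard_normal(y.size)           # measurement noise
    Y = np.fft.rfft(y[N:].reshape(P, N).mean(axis=0))  # average the periods
    G_est.append(Y / U1)

G_bla = np.mean(G_est, axis=0)                 # averaged FRF estimate
var_G = np.var(G_est, axis=0, ddof=1) / M      # sample variance of the mean
```

Averaging over periods suppresses the measurement noise, while averaging over realizations also averages the stochastic nonlinear contributions, exactly the mechanism exploited by the BLA framework.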

5.6.2 Frequency Domain Subspace Identification

The next step is to transform the nonparametric estimate Ĝ(k) into a parametric model. The purpose is to estimate a linear, discrete-time state space model from Ĝ(k), taking into account the covariance matrix Ĉ_G(k). For this, we make use of the frequency domain subspace algorithm in [44], which allows covariance information to be incorporated for non uniformly spaced frequency domain data. Furthermore, we rely on the results presented in [53], where the stochastic properties of this algorithm were analysed for the case in which the sample covariance matrix is employed instead of the true covariance matrix. In this section, we briefly recapitulate the algorithm and the model equations on which the algorithm is based.

A. Model Equations

First, consider the DFT in N samples of the state space equations (5-9):

z_k X(k) = A X(k) + B U(k) + z_k x_I
Y(k) = C X(k) + D U(k)   (5-77)

for k = 1, …, F, and with z_k = e^(jω_k T_s). In the transient term z_k x_I, x_I is defined as

x_I = (1 ⁄ √N) ( x(0) − x(N) ) .   (5-78)

In what follows, we will neglect the transient term. The procedures that determine the BLA result in an estimate in the form of an FRF. Hence, we rewrite (5-77) into the FRF form as well. This is done by setting U(k) = I_{n_u}. The plant model then looks as follows:

z_k X(k) = A X(k) + B
G(k) = C X(k) + D   (5-79)

with the state matrix X(k) ∈ ℂ^(n_a × n_u), G(k) ∈ ℂ^(n_y × n_u), and n_a the order of the model.

We multiply the second equation of (5-79) by z_k^p, and elaborate it by repeatedly substituting z_k X(k) with the first equation of (5-79):

z_k^p G(k) = z_k^(p−1) ( C z_k X(k) + z_k D )
           = z_k^(p−1) ( C A X(k) + C B + z_k D )
           = z_k^(p−2) ( C A² X(k) + C A B + z_k C B + z_k² D )
           = …   (5-80)

After p − 1 substitutions, we obtain

z_k^p G(k) = C A^p X(k) + ( C A^(p−1) B + z_k C A^(p−2) B + … + z_k^(p−1) C B + z_k^p D ) .   (5-81)

We write down equation (5-81) for p = 0, …, r − 1 with r > n_a:


G(k) = C X(k) + D
z_k G(k) = C A X(k) + C B + z_k D
…
z_k^(r−1) G(k) = C A^(r−1) X(k) + C A^(r−2) B + … + z_k^(r−2) C B + z_k^(r−1) D   (5-82)

The extended observability matrix O_r and the matrix S_r that contains the Markov parameters are defined as:

O_r = [ C ; C A ; … ; C A^(r−1) ]

S_r = [ D            0            …    0
        C B          D            …    0
        …            …            …    …
        C A^(r−2) B  C A^(r−3) B  …  C B  D ]   (5-83)

We also define

W_r(k) = [ 1  z_k  …  z_k^(r−1) ]^T .   (5-84)

By applying the definitions (5-83) and (5-84) to the r equations of (5-82), we obtain the following relation:

G = O_r X + S_r I ,   (5-85)

where the matrices G, X, and I are defined as

G = [ W_r(1) ⊗ G(1)  …  W_r(F) ⊗ G(F) ]
X = [ X(1)  …  X(F) ]
I = [ W_r(1) ⊗ I_{n_u}  …  W_r(F) ⊗ I_{n_u} ]   (5-86)

The complex data equation in (5-85) is now converted into a real equation:

G^re = O_r X^re + S_r I^re ,   (5-87)

where we define Z^re = [ Re(Z)  Im(Z) ].


Assumption 5.7 We assume the following additive noise setting:

G(k) = G_0(k) + N_G(k) ,   (5-88)

where the noise matrix N_G(k) has independent (over k), circular complex normally distributed elements with zero mean

E{ N_G(k) } = 0 ,   (5-89)

and covariance C_G(k):

C_G(k) = E{ vec( N_G(k) ) vec^H( N_G(k) ) } .   (5-90)

Equation (5-87) then becomes

G^re = O_r X^re + S_r I^re + N_G^re ,   (5-91)

with

N_G = [ W_r(1) ⊗ N_G(1)  …  W_r(F) ⊗ N_G(F) ] .   (5-92)

Another assumption concerns the controllability and the observability of the true plant model:

Assumption 5.8 The true plant model can be written in the form (5-79), where (A, C) is observable and (A, B) is controllable.

At first sight, it may seem awkward to refer to a 'true' linear model, whereas the algorithm will be used to identify a model for a nonlinear DUT. The actual system will definitely not adhere to the linear representation in (5-79). However, one should bear in mind that, at this stage, the goal is to retrieve a parametric model for the Best Linear Approximation of the system. From the viewpoint of the BLA, the nonlinear behaviour of the DUT only results in two kinds of effects: bias contributions, which change the dynamic behaviour of the BLA, and stochastic contributions, which act like ordinary disturbing noise (see also Chapter 2). The state space matrices can be retrieved from equation (5-91) using the frequency domain subspace identification algorithm summarized in paragraph B.


B. Frequency Domain Subspace Identification Algorithm [44]

1. Estimate the extended observability matrix O_r, given G(k) and C_G(k).

1a. Initialization: choose r > n_a and form

Z = [ I^re ; G^re ]  and  C_N = Re( ∑_{k=1}^{F} ∑_{i=1}^{n_u} W_r(k) W_r^H(k) ⊗ C_G^i(k) ) ,   (5-93)

where C_G^i(k) denotes the i-th diagonal partition of C_G(k) (see Appendix 5.B).

1b. Eliminate the input I^re from Z using a QR-decomposition of Z^T: Z = R^T Q^T.

Z = [ I^re ; G^re ] = [ R_11^T  0 ; R_12^T  R_22^T ] [ Q_1^T ; Q_2^T ]   (5-94)

Define R_11^T as the left upper block of r n_u × r n_u elements. Then, R_12^T has dimensions r n_y × r n_u, and R_22^T ( r n_y × r n_y ) remains after the elimination of I^re from Z.

1c. Remove the noise influence from (5-91): calculate the SVD of C_N^(−1/2) R_22^T:

C_N^(−1/2) R_22^T = U Σ V^T ,   (5-95)

and estimate O_r as

Ô_r = C_N^(1/2) U_[:, 1:n_a] .   (5-96)

2. Make use of the shift property of O_r to estimate A and C from Ô_r:

Â = Ô_r[1:(r−1)n_y , :]^† Ô_r[n_y+1:r n_y , :]  and  Ĉ = Ô_r[1:n_y , :]   (5-97)

3. Estimate B and D, given Â and Ĉ: minimize V_SS with respect to B and D:

V_SS = ∑_{k=1}^{F} ε^H(k) C_G^(−1)(k) ε(k) ,   (5-98)

with

ε(k) = vec( G_SS(Â, B, Ĉ, D, k) − G(k) ) ,   (5-99)

G_SS(A, B, C, D, k) = C ( z_k I_{n_a} − A )^(−1) B + D .   (5-100)
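The core of steps 1a-2 can be illustrated on noise-free FRF data, in which case the covariance weighting drops out (C_N = I). The sketch below is an illustrative SISO example; the system and all values are assumptions. The eigenvalues of Â then match those of the true A (the state space basis itself is only recovered up to a similarity transform).

```python
import numpy as np

# Noise-free sketch of algorithm steps 1a-2 for a SISO second-order system,
# with the C_N weighting omitted (identity). All values are illustrative.
na, r, F = 2, 5, 50
A = np.array([[0.8, 0.2], [-0.2, 0.8]])      # true system, poles 0.8 +/- 0.2j
B = np.array([[1.0], [0.0]])
C = np.array([[1.0, 0.0]])
D = np.array([[0.1]])

wk = np.linspace(0.05, 3.0, F)
zk = np.exp(1j * wk)
G = np.array([(C @ np.linalg.inv(z * np.eye(na) - A) @ B + D).item()
              for z in zk])

# build G and I of (5-86); for SISO the Kronecker products are scalars
Wr = np.vstack([zk**p for p in range(r)])    # rows: 1, z_k, ..., z_k^(r-1)
Gmat = Wr * G                                # W_r(k) (x) G(k)
Imat = Wr                                    # W_r(k) (x) I_1
Zre = np.vstack([np.hstack([Imat.real, Imat.imag]),
                 np.hstack([Gmat.real, Gmat.imag])])   # Z = [I_re; G_re]

# 1b: QR of Z^T eliminates the input block; R22^T spans range(O_r)
R = np.linalg.qr(Zre.T, mode='r')            # Z = R^T Q^T
R22T = R[r:, r:].T
# 1c: SVD (no noise weighting); first n_a left singular vectors give O_r
U = np.linalg.svd(R22T)[0]
Or = U[:, :na]
# 2: shift property of O_r recovers A up to a similarity transform
A_hat = np.linalg.lstsq(Or[:-1], Or[1:], rcond=None)[0]
print(np.sort_complex(np.linalg.eigvals(A_hat)))   # 0.8 -/+ 0.2j
```

The recovered Â equals T⁻¹AT for some invertible T, so its eigenvalues (the pole locations) coincide with those of the true model.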


5.6.3 Nonlinear Optimization of the Linear Model

The weighted least squares cost function V_SS defined in (5-98) is a measure of the model quality. According to this measure, the subspace algorithm generates acceptable model estimates. However, in practical applications V_SS strongly depends on the dimension parameter r chosen in step 1a of the identification procedure. A first action that can be taken to improve the model estimates is to apply the subspace algorithm for different values of r, for instance r = n_a + 1, …, 6 n_a, and to select the model that corresponds to the lowest V_SS. The second way to obtain better modelling results is to consider the cost function

V_WLS = ∑_{k=1}^{F} ε^H(k) C_G^(−1)(k) ε(k) ,   (5-101)

with

ε(k) = vec( G_SS(A, B, C, D, k) − G(k) ) ,   (5-102)

and to minimize V_WLS with respect to all the parameters (A, B, C, D). This is a nonlinear problem that can be solved using the Levenberg-Marquardt algorithm (see "The Levenberg-Marquardt Algorithm" on p. 135). This method requires the computation of the Jacobian of the model error ε(k) with respect to the model parameters. From (5-102) and (5-100), we calculate the following expressions:

∂ε(k) ⁄ ∂A_ij = vec( C ( z_k I_{n_a} − A )^(−1) I_ij^(n_a × n_a) ( z_k I_{n_a} − A )^(−1) B )
∂ε(k) ⁄ ∂B_ij = vec( C ( z_k I_{n_a} − A )^(−1) I_ij^(n_a × n_u) )
∂ε(k) ⁄ ∂C_ij = vec( I_ij^(n_y × n_a) ( z_k I_{n_a} − A )^(−1) B )
∂ε(k) ⁄ ∂D_ij = vec( I_ij^(n_y × n_u) )   (5-103)

120
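The first expression of (5-103) is easy to check against a finite difference; the sketch below does so for an arbitrary random system (all values are illustrative, and I_ij denotes a matrix with a single one at entry (i, j)).

```python
import numpy as np

# Finite-difference check of the first expression in (5-103):
# dG/dA_ij = C (zI - A)^{-1} I_ij (zI - A)^{-1} B.
# The system below is an arbitrary illustration.
rng = np.random.default_rng(2)
na, nu, ny = 3, 2, 2
A = 0.5 * rng.standard_normal((na, na))
B = rng.standard_normal((na, nu))
C = rng.standard_normal((ny, na))
z = np.exp(1j * 0.7)

def G(A_):
    return C @ np.linalg.inv(z * np.eye(na) - A_) @ B

i, j, eps = 1, 2, 1e-6
Iij = np.zeros((na, na))
Iij[i, j] = 1.0

M = np.linalg.inv(z * np.eye(na) - A)
dG_analytic = C @ M @ Iij @ M @ B                       # analytic derivative
dG_fd = (G(A + eps * Iij) - G(A - eps * Iij)) / (2 * eps)  # central difference

err = np.max(np.abs(dG_analytic - dG_fd))
print(err)   # small compared to the derivative itself
```

The analytic expression follows from d(M⁻¹) = −M⁻¹ (dM) M⁻¹ applied to M = z I − A.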



Figure 5-11. WLS cost function V SS of the subspace estimates (grey dots), and the cost function V WLS after the nonlinear optimization (black dots) for different values of dimensioning parameter r .

The subspace method is used to generate a number of initial linear models (e.g. r = n_a + 1, …, 6 n_a), which are used as starting values for the nonlinear optimization procedure. Finally, the model that corresponds to the lowest cost function V_WLS is selected. Because a high number of different initial models is employed, there is a higher probability to end up in a global minimum of V_WLS, or at least in a good local minimum. Note that in the parameter space, there exists an infinite number of global minimizers for V_WLS, more precisely a subspace of dimension n_a². This is a consequence of fully parametrizing the linear state space representation. Furthermore, while carrying out the nonlinear optimization, the unstable models estimated with the subspace algorithm are stabilized, for instance using the methods described in [14]. To exemplify this method, we apply it to the semi-active damper data set (see "Description of the Experiments" on p. 161), and estimate models of order n_a = 3 for different values of r: r = 3, …, 75. In Figure 5-11, the cost function of the subspace estimates V_SS is shown (grey dots), together with the cost function of the optimized models V_WLS (black dots). Figure 5-11 illustrates that V_SS is a craggy function of r, and that for this particular data set, the Levenberg-Marquardt algorithm ends up in the same local minimum: the same value of V_WLS is attained for a large number of initial models.


5.6.4 Estimation of the Full Nonlinear Model

The last step in the identification process is to estimate the full nonlinear model

x(t+1) = A x(t) + B u(t) + E ζ(t)
y(t) = C x(t) + D u(t) + F η(t) + e(t)   (5-104)

with the initial state given by x(1) = x_0, and where e(t) is the output noise. For this, a weighted least squares approach will be employed. In order to keep the estimates of the model parameters unbiased, the following assumption is required.

Assumption 5.9 It is assumed that the input u(t) of the model in (5-104) is noiseless, i.e., it is observed without any errors and independent of the output noise.

In practical situations, it may occur that Assumption 5.9 is not fulfilled. When the SNR at the input is sufficiently high (> 40 dB), the resulting bias in the estimated model parameters is negligible. When the SNR is too low, it can be increased by employing periodic signals: by measuring a sufficient number of periods, and averaging over time or frequency, the SNR is improved in a straightforward way. The weighted least squares cost function V_WLS with respect to the model parameters θ = [ vec(A) ; vec(B) ; vec(C) ; vec(D) ; vec(E) ; vec(F) ] will be minimized:

V_WLS(θ) = ∑_{k=1}^{F} ε^H(k, θ) W(k) ε(k, θ) ,   (5-105)

where W(k) ∈ ℂ^(n_y × n_y) is a user-chosen, frequency domain weighting matrix. Typically, this matrix is chosen equal to the inverse covariance matrix of the output, Ĉ_Y^(−1)(k). This matrix can be obtained straightforwardly when periodic signals are used to excite the DUT. By choosing W(k) properly, it is also possible to put more weight in a certain frequency band of interest. When no covariance information is available and no specific weighting is required by the user, a constant weighting ( W(k) = 1, for k = 1, …, F ) is employed. Furthermore, the model error ε(k, θ) ∈ ℂ^(n_y) is defined as

ε(k, θ) = Y_m(k, θ) − Y(k) ,   (5-106)


where Y m ( k, θ ) and Y ( k ) are the DFT of the modelled and the measured output, respectively. Note that when Y m ( k, θ ) is calculated with correct initial conditions, equation (5-106) does not pose serious leakage problems in the case of non periodic data, because the leakage terms present in Y m ( k, θ ) and Y ( k ) cancel each other.

A. Calculation of the Jacobian

We minimize V_WLS(θ) by means of the Levenberg-Marquardt algorithm (see "The Levenberg-Marquardt Algorithm" on p. 135). This requires the computation of the Jacobian J(k, θ) of the modelled output with respect to the model parameters. Hence, we need to compute

J(k, θ) = ∂Y_m(k, θ) ⁄ ∂θ = ∂ε(k, θ) ⁄ ∂θ .   (5-107)

Given the nonlinear relationship in (5-104), it is impractical to calculate the model output and the Jacobian directly in the frequency domain. Therefore, we will perform these operations in the time domain, followed by a DFT in order to obtain Y_m(k, θ) and J(k, θ). Before deriving explicit expressions, we recapitulate some general aspects with respect to the calculation of the Jacobian, which are pointed out in [47] and [78]. Consider a general discrete-time nonlinear model

x(t+1) = f( x(t), u(t), a )
y(t) = g( x(t), u(t), b )   (5-108)

where a and b are the model parameters present in the state and output equation, respectively. The derivatives of the output y(t) with respect to a and b are given by

∂x(t+1) ⁄ ∂a = ( ∂f(x(t), u(t), a) ⁄ ∂x(t) ) ( ∂x(t) ⁄ ∂a ) + ∂f(x(t), u(t), a) ⁄ ∂a
∂y(t) ⁄ ∂a = ( ∂g(x(t), u(t), b) ⁄ ∂x(t) ) ( ∂x(t) ⁄ ∂a )
∂y(t) ⁄ ∂b = ∂g(x(t), u(t), b) ⁄ ∂b   (5-109)


These equations can be rewritten as follows:

x_a(t+1) = f_x( x(t), u(t), a ) x_a(t) + f_a( x(t), u(t), a )
y_a(t) = g_x( x(t), u(t), b ) x_a(t)
y_b(t) = g_b( x(t), u(t), b )   (5-110)

Hence, the expressions that define the calculation of the Jacobian (5-110) can be regarded as a new dynamic discrete-time nonlinear model. The inputs of this Jacobian model are the inputs and the simulated states of the original model. These states are obtained by simulating the original model with the estimated parameters of the previous Levenberg-Marquardt iteration. For model equations (5-104), explicit expressions for the Jacobian can be found in Appendix 5.D. Furthermore, due to the polynomial nature of (5-104), the equations in (5-110) are in a polynomial form as well. Hence, a PNLSS model can be determined that calculates the elements of the Jacobian. In Appendix 5.E, explicit expressions are derived for the state space matrices of this new model.
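For a scalar PNLSS model, the Jacobian model of (5-110) can be written out by hand and checked against a finite difference. The sketch below is illustrative (toy parameters, not taken from the text).

```python
import numpy as np

# Toy illustration of (5-110): for a scalar PNLSS model
#   x(t+1) = a*x + b*u + e*x^2 ,  y(t) = c*x ,
# the sensitivity x_a = dx/da obeys the Jacobian model
#   x_a(t+1) = (a + 2*e*x(t)) * x_a(t) + x(t) ,  y_a(t) = c * x_a(t).
a, b, c, e = 0.6, 1.0, 1.5, 0.05
rng = np.random.default_rng(3)
u = 0.5 * rng.standard_normal(100)

def simulate(a_val):
    x, y = 0.0, np.zeros(100)
    for t in range(100):
        y[t] = c * x
        x = a_val * x + b * u[t] + e * x**2
    return y

# recursive Jacobian, driven by the simulated states of the original model
x, xa = 0.0, 0.0
ya = np.zeros(100)
for t in range(100):
    ya[t] = c * xa
    xa = (a + 2 * e * x) * xa + x        # f_x * x_a + f_a, with f_a = x(t)
    x = a * x + b * u[t] + e * x**2

eps = 1e-6
ya_fd = (simulate(a + eps) - simulate(a - eps)) / (2 * eps)
print(np.max(np.abs(ya - ya_fd)))        # small: the recursion matches
```

As the text notes, the sensitivity recursion is itself a dynamic model driven by the inputs and the simulated states of the original model.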

B. Initial Conditions

In (5-110), the simulated states are employed to calculate the Jacobian. Hence, when the state sequence is computed, the initial state x_0 of the model in (5-104) should be taken into account. For this, three possible approaches are distinguished. The simplest, but rather inefficient, way is to calculate the Jacobian for the full data set, and then to discard the first N_trans transient samples of both the Jacobian and the model error. In this way, a part of the data is not used for the model estimation.

The second method can only be employed when periodic excitations are applied during the experiments. As mentioned earlier, the simulated states from the previous Levenberg-Marquardt iteration are used to calculate the Jacobian. By applying several periods of the input sequence, and by considering only the last simulated state period, the transients become negligible. This principle is depicted in Figure 5-12 for two input periods. In this particular example, it suffices to discard the first period to obtain states that are in regime. In order to save computing time, a fraction of a period can be used as a preamble. This can be done for highly damped systems, or when the number of samples per period is high.


Figure 5-12. Removal of the transients in the simulated states; state(s) in regime (black), transient (red).

The last method, which is suitable for both periodic and non periodic excitations, is to estimate the initial conditions x 0 as if they were ordinary model parameters. This can be achieved in a straightforward way, since the estimation of x 0 is equivalent to estimating an extra column in the state space matrix B . The idea is to add an artificial model input u art to the model, which only contributes to the state equation in a linear way (i.e., only via the B matrix). The resulting input is then given by

u'(t) = [ u(t) ; u_art(t) ] .   (5-111)

Assume that the original input, state, and output data sequences are defined for time indices t = 1, …, N. We consider u(0) = 0 and x(0) = 0, and apply an impulse signal to the artificial input of the system:

u_art(t) = 1 for t = 0 ,  u_art(t) = 0 for t = 1, …, N   (5-112)

Then, we obtain the following state equation for t = 0:

x(1) = A x(0) + B u'(0) + E ζ(0)
     = B u'(0)
     = B_[:, n_u + 1]   (5-113)

Consequently, the initial conditions can be estimated like ordinary model parameters by adding an artificial input u_art to the model.
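The mechanism of (5-111)-(5-113) can be verified numerically on a linear model (E = 0 for simplicity): appending x_0 as an extra column of B and driving that column with an impulse at t = 0 reproduces the simulation started from x(1) = x_0. The sketch below is illustrative.

```python
import numpy as np

# Check of the initial-state trick (5-111)-(5-113) on a linear state space
# model: B' = [B, x0] plus an impulse on u_art gives x(1) = x0 exactly.
rng = np.random.default_rng(4)
na, N = 3, 50
A = 0.5 * np.eye(na) + 0.1 * rng.standard_normal((na, na))
B = rng.standard_normal((na, 1))
C = rng.standard_normal((1, na))
x0 = rng.standard_normal(na)
u = rng.standard_normal(N)

def simulate(x_init, B_sim, u_sim):
    x, y = x_init.copy(), np.zeros(N)
    for t in range(N):
        y[t] = (C @ x).item()
        x = A @ x + B_sim @ u_sim[:, t]
    return y

# reference: simulation started from x(1) = x0
y_ref = simulate(x0, B, u[None, :])

# augmented model: B' = [B, x0], u'(t) = [u(t); u_art(t)], impulse at t = 0
B_aug = np.column_stack([B, x0])
u_aug = np.vstack([np.concatenate([[0.0], u[:-1]]),      # shifted: u(0) = 0
                   np.concatenate([[1.0], np.zeros(N - 1)])])
y_aug = simulate(np.zeros(na), B_aug, u_aug)

print(np.max(np.abs(y_ref[:-1] - y_aug[1:])))   # zero: trajectories coincide
```

The impulse on the artificial input injects x_0 into the state exactly once, after which both simulations follow identical recursions.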


C. Starting Values

The last obstacle that needs to be cleared before starting the nonlinear optimization is the choice of good starting values for θ. For the matrices A, B, C, and D, we will use the estimates obtained from the parametric Best Linear Approximation. The other state space matrices (E and F) are initially set to zero. The idea of using a linear model as a starting point for nonlinear modelling is certainly not new (e.g. [71]), and is quite often employed. Using the parametric BLA as the initial nonlinear model offers two important advantages. First of all, it guarantees that the estimated nonlinear model performs at least as well as the best linear model. Secondly, for the model structure in (5-104), this principle results in a rough estimate of the model order n_a.

D. How to handle the similarity transform?

As mentioned earlier, the state space representation is not unique: the similarity transform x_T(t) = T^(−1) x(t) leaves the input/output behaviour unaffected. The n_a² elements of the transformation matrix T can be chosen freely, under the condition that T is non singular. The parameter space thus has at least n_a² redundant dimensions. This poses a problem for the gradient-based identification of the model parameters θ ∈ ℝ^(n_θ): the Jacobian does not have full rank and, hence, an infinite number of equivalent solutions exists. One way to deal with this problem is to use a canonical parametrization, such that the redundancy disappears. However, it is known that this may lead to numerically ill-conditioned estimation problems [45]. A second way to cope with the overparametrization is to employ so-called Data Driven Local Coordinates (DDLC) [45], or a Projected Gradient search [80]. The key idea of these methods is to identify the manifold of models parametrized by θ in the parameter space on which the models have an identical input/output behaviour. Thus, any parameter update for which the model remains on this manifold does not change the input/output behaviour. Therefore, the methods presented in [45] and [80] compute the parameter update such that it is locally orthogonal to the manifold: this is achieved by computing a projection matrix P ∈ ℝ^(n_θ × (n_θ − n_a²)) such that the new Jacobian

J_DDLC(θ) = J(θ) P   (5-114)

has n_a² fewer columns than the original Jacobian J(θ), and has full rank. The matrix P needs to be determined during every iteration step. The third method to deal with the rank deficiency of the Jacobian consists in using a full parametrization, and employing a truncated Singular Value Decomposition (for more details, see "The Levenberg-Marquardt Algorithm" on p. 135). In [86], it is shown that this method and the DDLC method are equivalent: the search direction in the θ-space computed with DDLC and the one obtained by means of a truncated SVD are identical. The additional advantage of the DDLC method is that n_a² fewer columns of the Jacobian matrix have to be computed compared to the full parametrization. This can save a considerable amount of computation time, especially when the model order is high. The DDLC approach is feasible when the computation of P is straightforward. This is the case for linear, bilinear, and LPV state space models. However, for the polynomial nonlinear state space model, the calculation of P is very involved. Hence, we will employ the third method: a full parametrization and a truncated SVD.
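The redundancy discussed here is easy to visualize: any non-singular T yields a different parameter vector θ with exactly the same input/output behaviour. An illustrative check (random values, linear model for simplicity):

```python
import numpy as np

# Similarity transform invariance: A_T = T^{-1} A T, B_T = T^{-1} B,
# C_T = C T leave the FRF C (zI - A)^{-1} B + D unchanged.
rng = np.random.default_rng(5)
na = 3
A = 0.5 * rng.standard_normal((na, na))
B = rng.standard_normal((na, 1))
C = rng.standard_normal((1, na))
D = rng.standard_normal((1, 1))
T = rng.standard_normal((na, na))          # any non-singular T

Ti = np.linalg.inv(T)
AT, BT, CT = Ti @ A @ T, Ti @ B, C @ T

z = np.exp(1j * 0.4)
G  = C  @ np.linalg.inv(z * np.eye(na) - A)  @ B  + D
GT = CT @ np.linalg.inv(z * np.eye(na) - AT) @ BT + D
print(np.abs(G - GT).max())   # agreement to machine precision
```

Since T has n_a² free entries, this generates an n_a²-dimensional manifold of equivalent models through every point of the parameter space, which is exactly the rank deficiency the truncated SVD has to absorb.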

E. Overfitting and Validation

The nonlinear search should be pursued until the cost function in (5-105) stops decreasing. However, as is often the case for model structures with many parameters, overfitting can occur during the nonlinear optimization [70]. This phenomenon can be visualized by applying a fresh data set to the models obtained from the iterations of the nonlinear search. In the case of overfitting, the model quality first increases up to an optimum, and then deteriorates as a function of the number of iterations. The reason for this is the following: at the start of the optimization, the important parameters are quickly pulled to minimizing values, diminishing the bias error. As the minimization continues, the less important parameters are more and more drawn to minimizing values. Hence, a growing number of parameters becomes activated, and the variance of the parameter estimates increases. In order to avoid this effect, we use the so-called stopped search [70]: we evaluate the quality of every estimated model on a test set, and then select the model that achieves the best result. This method is a form of implicit regularization, because it prevents the activation of unnecessary parameters.


Chapter 5: Nonlinear State Space Modelling of Multivariable Systems

F. Stability during Estimation and Validation

We will assume that the parametric linear model obtained from the BLA is stable. As mentioned before, this can be ensured using stabilizing methods for linear models, like for instance [14]. Hence, the nonlinear optimization of the full PNLSS model is started from a stable model. The first phase in the nonlinear optimization consists of calculating the Jacobian. Hence, the first question is whether this calculation remains stable. Consider the recursive expressions for the Jacobian given in (5-168). From these equations, it is observed that the time-varying dynamics of the Jacobian are determined by the factor

( A + Eζ' ( t ) ) .

(5-115)

On the other hand, the Jacobian matrix of the original state equation in (5-104), with respect to the states, is given by

$\dfrac{\partial x(t+1)}{\partial x(t)} = A + E\zeta'(t)$ .

(5-116)

This expression describes the linearised dynamic behaviour of the original model at every time instance. Since (5-115) and (5-116) are identical, the original model and the Jacobian model share the same dynamic behaviour (i.e., the instantaneous poles of both models are identical). Consequently, a stable model always yields a stable Jacobian. The second question is whether a parameter update during the nonlinear optimization yields a stable model. Naturally, this is not necessarily the case. When instability occurs, it is reflected by the value of the cost function (Inf or NaN). This phenomenon can easily be handled by the nonlinear optimization procedure, as if it were an ordinary increase of the cost function. On experimental data, it occurs from time to time that the estimated model becomes unstable on the validation set. To overcome this problem, the following heuristic approach is employed. The validation input signal is also passed on to the nonlinear optimization algorithm. In this way, the validation output of the updated model with parameters $\theta_{test}$ (see Figure 5-14 in "The Levenberg-Marquardt Algorithm" on p. 135) can be computed in every iteration. When the validation output is unstable, the optimization algorithm reacts as if the cost function had increased. This approach guarantees a model which is stable for the validation set.
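The guard against unstable iterates can be sketched as follows; this is an illustration under assumed interfaces, with `simulate` a hypothetical stand-in for the PNLSS model simulation. An unstable simulation is mapped to an infinite cost, so the optimizer treats it as an ordinary cost increase:

```python
import numpy as np

def guarded_cost(simulate, theta, data):
    """Evaluate the cost; treat an unstable simulation (Inf/NaN in the
    output) as an infinite cost so the optimizer rejects the step."""
    y = simulate(theta, data)
    if not np.all(np.isfinite(y)):
        return np.inf                 # reacts as an ordinary cost increase
    return float(np.sum((y - data) ** 2))

# Toy check: a diverging simulation is rejected, a stable one is scored.
data = np.zeros(4)
stable = guarded_cost(lambda th, d: np.full(4, 0.1), None, data)
unstable = guarded_cost(lambda th, d: np.array([1.0, np.inf, 2.0, 3.0]),
                        None, data)
```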


Identification of the PNLSS Model

Nevertheless, this procedure prevents the iterative search from passing through an unstable (validation) zone before ending up in a stable zone again. Consequently, this method should only be applied when strictly necessary.


Appendix 5.A Some Combinatorials

In this appendix, we calculate the number of distinct monomials in $n$ variables of a given degree $r$. Choosing $r$ different elements out of a set of $n$ elements can be done in a number of different ways, which is given by the binomial coefficient:

$\binom{n}{r} = \dfrac{n!}{r!\,(n-r)!}$

(5-117)

For instance, if we have to choose two different elements from { 1, 2, 3, 4 } , this results in

$\binom{4}{2} = \dfrac{4!}{2!\,(4-2)!} = 6$

(5-118)

combinations, namely

$\{1,2\}, \{1,3\}, \{1,4\}, \{2,3\}, \{2,4\}, \{3,4\}$    (5-119)

We would also like to include combinations with repeated elements, like $\{1,1\}$ for instance. To do so, we need to add one dummy variable $s$ to the set $\{1,2,3,4\}$. Then, the resulting 10 combinations are:

$\{1,2\}, \{1,3\}, \{1,4\}, \{1,s\}$   ($s = 1$)
$\{2,3\}, \{2,4\}, \{2,s\}$   ($s = 2$)
$\{3,4\}, \{3,s\}$   ($s = 3$)
$\{4,s\}$   ($s = 4$)    (5-120)

In general, we need to add r – 1 dummy variables in order to obtain

$\binom{n+r-1}{r} = \dfrac{(n+r-1)!}{r!\,(n-1)!}$    (5-121)

monomials of degree $r$ in $n$ variables.
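The count in (5-121) is the number of combinations with repetition, which can be checked numerically (an illustrative sketch; `n_monomials` is a hypothetical helper, not from the thesis):

```python
from math import comb

def n_monomials(n, r):
    """Number of distinct monomials of degree r in n variables,
    i.e. combinations with repetition: C(n + r - 1, r)."""
    return comb(n + r - 1, r)

# The worked example of the appendix: pairs from {1, 2, 3, 4}
# allowing repeated elements gives 10 combinations.
assert n_monomials(4, 2) == 10
```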


Appendix 5.B Construction of the Subspace Weighting Matrix from the FRF Covariance

The subspace identification algorithm requires the computation of the weighting matrix $C_N$, which is defined as

$C_N = \mathrm{Re}\big(\mathcal{E}\{ N_G(k)\,N_G^H(k) \}\big)$ .

(5-122)

In Chapter 2, we have determined the covariance matrix C G ( k ) as:

$C_G(k) = \mathcal{E}\{ \mathrm{vec}(N_G(k))\,\mathrm{vec}^H(N_G(k)) \}$ .

(5-123)

The purpose of this appendix is to find an expression for C N as a function of the elements of CG ( k ) . When we substitute (5-92) in (5-122), we obtain:

$C_N = \mathrm{Re}\Big(\mathcal{E}\Big\{ \sum_{k=1}^{F} [W_r(k) \otimes N_G(k)]\,[W_r(k) \otimes N_G(k)]^H \Big\}\Big)$

(5-124)

The following identities hold:

$(A \otimes B)^H = A^H \otimes B^H$
$(A \otimes B)(C \otimes D) = AC \otimes BD$    (5-125)

i.e., the Hermitian transpose of a Kronecker product and the mixed product rule. Applying these Kronecker product properties to (5-124) results in:

$C_N = \mathrm{Re}\Big( \sum_{k=1}^{F} W_r(k)\,W_r^H(k) \otimes \mathcal{E}\{ N_G(k)\,N_G^H(k) \} \Big)$

(5-126)

We denote the $i$-th column of $N_G(k)$ as $N_{[:,i]}(k)$ and obtain

$N_G(k)\,N_G^H(k) = \sum_{i=1}^{n_u} N_{[:,i]}(k)\,N_{[:,i]}^H(k)$ .    (5-127)

On the other hand, we also have that


Figure 5-13. $\tilde{N}$ divided into $n_y \times n_y$ partitions.

$\tilde{N} = \mathrm{vec}(N_G(k))\,\mathrm{vec}^H(N_G(k)) = \begin{bmatrix} N_{[:,1]}(k) \\ \vdots \\ N_{[:,n_u]}(k) \end{bmatrix} \begin{bmatrix} N_{[:,1]}^H(k) & \cdots & N_{[:,n_u]}^H(k) \end{bmatrix}$    (5-128)

Hence, the partition $\tilde{N}_{ij}$ at position $(i, j)$ in $\tilde{N}$, with dimensions $n_y \times n_y$ (see Figure 5-13), is given by

$\tilde{N}_{ij} = N_{[:,i]}(k)\,N_{[:,j]}^H(k)$ .

(5-129)

Taking into account equation (5-127), it is clear that $\mathcal{E}\{N_G(k)\,N_G^H(k)\}$ can be computed from the elements of $C_G(k) = \mathcal{E}\{\tilde{N}(k)\}$. First, $C_G(k)$ should be divided into $n_u \times n_u$ partitions, in which each partition contains $n_y \times n_y$ elements. Next, the diagonal partitions should be summed in order to determine $\mathcal{E}\{N_G(k)\,N_G^H(k)\}$. Finally, we obtain

$C_N = \mathrm{Re}\Big( \sum_{k=1}^{F} W_r(k)\,W_r^H(k) \otimes \sum_{i=1}^{n_u} C_G^i(k) \Big)$ ,    (5-130)

where $C_G^i(k)$ denotes the $i$-th $n_y \times n_y$ partition on the diagonal of $C_G(k)$.
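The partitioning and summation leading to (5-130) can be sketched as follows (illustrative only; the function and variable names are assumptions, not the thesis code):

```python
import numpy as np

def subspace_weight(W_r, C_G, ny, nu):
    """Sketch of (5-130): C_N = Re( sum_k W_r(k) W_r(k)^H  kron  sum_i C_G^i(k) ),
    where C_G^i(k) is the i-th ny-by-ny diagonal partition of C_G(k).

    W_r -- list of weighting vectors/matrices W_r(k), one per frequency
    C_G -- list of (ny*nu)-by-(ny*nu) covariance matrices C_G(k)
    """
    C_N = 0
    for Wk, CGk in zip(W_r, C_G):
        # sum the nu diagonal ny-by-ny partitions of C_G(k)
        S = sum(CGk[i*ny:(i+1)*ny, i*ny:(i+1)*ny] for i in range(nu))
        C_N = C_N + np.kron(Wk @ Wk.conj().T, S)
    return C_N.real

# Toy example: F = 1 frequency, ny = 1, nu = 2.
W_r = [np.array([[1.0], [2.0]])]
C_G = [np.diag([2.0, 3.0])]
C_N = subspace_weight(W_r, C_G, ny=1, nu=2)
```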


Appendix 5.C Nonlinear Optimization Methods

Consider the cost function $V(\theta, Z)$, which is a function of the parameter vector $\theta \in \mathbb{R}^{n_\theta}$ and the measurements $Z$. In this appendix, we will summarize a number of standard iterative nonlinear optimization methods that can be used to minimize $V(\theta, Z)$ with respect to $\theta$. Given their iterative nature, these methods all have in common the computation of a parameter update $\Delta\theta$.

A. The Gradient Descent Algorithm

The gradient descent algorithm is the most intuitive method to find the minimum of a function. In this iterative procedure, the parameter update $\Delta\theta$ is proportional to the negative gradient $\nabla V$ of the cost function

∆θ = – λ∇ V ,

(5-131)

with λ the damping factor. The main advantages of the gradient method are its conceptual simplicity and its large region of convergence to a (local) minimum. The most important drawback is its slow convergence.

B. The Gauss-Newton Algorithm

When a quadratic cost function $V(\theta, Z)$ needs to be minimized:

$V(\theta, Z) = e^T(\theta, Z)\,e(\theta, Z) = \sum_{k=1}^{N} e_k^2(\theta, Z)$ ,    (5-132)

with $e(\theta, Z) \in \mathbb{R}^N$ a residual, the Gauss-Newton algorithm is well suited. The reason for this

is that this iterative procedure makes explicit use of the quadratic nature of the cost function. This results in a faster convergence compared to the gradient method [25]. The parameter update ∆θ of this method is given by

$\Delta\theta = -\big(\nabla^2 V\big)^{-1}\,\nabla V$ .    (5-133)

This approach requires the knowledge of the Hessian matrix $\nabla^2 V$ (i.e., the matrix containing the second derivatives) and the gradient $\nabla V$ of the cost function, both with respect to $\theta$.


Further on, it will become clear that, for a quadratic cost function, the Hessian can be approximated by making use of only the first order derivatives of e ( θ, Z ) . The Hessian matrix and the gradient are given by

$\nabla^2 V = \dfrac{\partial^2 V(\theta,Z)}{\partial\theta^2} = 2\,J^T(\theta,Z)\,J(\theta,Z) + 2\sum_{k=1}^{N} \dfrac{\partial^2 e_k(\theta,Z)}{\partial\theta^2}\,e_k(\theta,Z)$ ,    (5-134)

$\nabla V = \dfrac{\partial V(\theta,Z)}{\partial\theta} = 2\,J^T(\theta,Z)\,e(\theta,Z)$ ,    (5-135)

where J ( θ, Z ) is defined as the Jacobian matrix of e ( θ, Z ) with respect to θ :

$J(\theta,Z) = \dfrac{\partial e(\theta,Z)}{\partial\theta}$ .

(5-136)

When the residuals e k ( θ, Z ) are small, the second term in (5-134) is negligible compared to the first term. Hence, the Hessian can be approximated by

$\dfrac{\partial^2 V(\theta,Z)}{\partial\theta^2} \approx 2\,J^T(\theta,Z)\,J(\theta,Z)$ ,

(5-137)

which is only a function of the Jacobian. Hence, the Gauss-Newton parameter update ∆θ is found by solving

J T ( θ, Z )J ( θ, Z )∆θ = – J T ( θ, Z )e ( θ, Z ) .

(5-138)

The step ∆θ can be computed in a numerically stable way via the Singular Value Decomposition (SVD) [27] of J ( θ, Z ) :

J ( θ, Z ) = UΣV T .

(5-139)

The parameter update is then given by

$\Delta\theta = -V\,\Sigma^{-1}\,U^T e(\theta,Z)$ .

(5-140)

If $J(\theta,Z)$ is not of full rank, then $\Sigma$ is singular and cannot be inverted, and a truncated SVD should be used in order to compute (5-140). This occurs, for example, when an overparametrized model is utilized.
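The Gauss-Newton step of (5-138)-(5-140), including the truncation of near-zero singular values, can be sketched as follows (an illustrative NumPy implementation; the function name and the relative tolerance are assumptions):

```python
import numpy as np

def gauss_newton_step(J, e, tol=1e-10):
    """Gauss-Newton update via a (truncated) SVD of the Jacobian:
    solves J^T J dtheta = -J^T e.  Singular values below tol * s_max
    are truncated, which handles a rank-deficient J."""
    U, s, Vt = np.linalg.svd(J, full_matrices=False)
    keep = s > tol * s[0]                 # truncate near-zero singular values
    s_inv = np.where(keep, 1.0 / np.where(keep, s, 1.0), 0.0)
    return -(Vt.T * s_inv) @ (U.T @ e)

# For a full-rank linear least-squares problem, a single step reaches
# the minimizer of ||J theta + e||^2 when e lies in the range of J.
J = np.array([[1.0, 0.0], [0.0, 2.0], [1.0, 1.0]])
e = J @ np.array([1.0, -1.0])
dtheta = gauss_newton_step(J, e)          # equals -[1, -1]
```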


The convergence rate of the Gauss-Newton algorithm depends on the assumption that the residuals $e_k(\theta,Z)$ are small. If this is the case, then the convergence rate is quadratic; otherwise, it can drop to superlinear. The main drawback of the Gauss-Newton algorithm is its smaller region of convergence compared to the gradient method.

C. The Levenberg-Marquardt Algorithm

The Levenberg-Marquardt algorithm [36],[42] combines the large convergence region of the gradient descent method with the fast convergence of the Gauss-Newton method. In order to increase the numerical stability and to avoid comparing apples with oranges, the columns of the Jacobian matrix $J(\theta,Z)$ need to be normalized prior to the computation of the parameter update. The normalized Jacobian matrix $J_N(\theta,Z)$ is given by

$J_N(\theta,Z) = J(\theta,Z)\,N$ ,    (5-141)

where the diagonal normalization matrix $N \in \mathbb{R}^{n_\theta \times n_\theta}$ is defined as

$N = \mathrm{diag}\left( \dfrac{1}{\mathrm{rms}(J_{[:,1]}(\theta,Z))}, \ldots, \dfrac{1}{\mathrm{rms}(J_{[:,n_\theta]}(\theta,Z))} \right)$ .    (5-142)

In most cases, the normalization yields a better condition number (i.e., the ratio between the largest and the smallest nonzero singular value) for $J_N(\theta,Z)$ compared with $J(\theta,Z)$. Next, the parameter update $\Delta\theta_N$ is computed by solving the equation

$\big(J_N^T(\theta,Z)\,J_N(\theta,Z) + \lambda^2 I_{n_\theta}\big)\,\Delta\theta_N = -J_N^T(\theta,Z)\,e(\theta,Z)$ ,

(5-143)

where the damping factor $\lambda$ determines the weight between the two methods. If $\lambda$ has a large numerical value, then the second term in (5-143) is important, and hence the gradient descent method dominates. When $\lambda$ is small, the Gauss-Newton method takes over. In order to compute (5-143) in a numerically stable way, the SVD of $J_N(\theta,Z)$ is calculated first. When the Jacobian is singular, $J_N(\theta,Z)$ has rank $\tilde{n}_\theta < n_\theta$ and the SVD is given by

$J_N(\theta,Z) = U\,\mathrm{diag}\big(\sigma_1, \sigma_2, \ldots, \sigma_{\tilde{n}_\theta}, 0, \ldots, 0\big)\,V^T$ .    (5-144)


Next, the parameter update ∆θ N is calculated using a truncated SVD. This results in

∆θ N = – VΛU T e ( θ, Z ) ,

(5-145)

where the matrix Λ is defined as

$\Lambda = \mathrm{diag}\left( \dfrac{\sigma_1}{\sigma_1^2 + \lambda^2}, \dfrac{\sigma_2}{\sigma_2^2 + \lambda^2}, \ldots, \dfrac{\sigma_{\tilde{n}_\theta}}{\sigma_{\tilde{n}_\theta}^2 + \lambda^2}, 0, \ldots, 0 \right)$ .    (5-146)

In the last step, the parameter update ∆θ N needs to be denormalized again:

∆θ = N∆θ N .

(5-147)

As a starting value for $\lambda$, the largest singular value of $J_N(\theta,Z)$ from the first iteration can be used [25]. Next, $\lambda$ is adjusted according to the success of the parameter update. When the cost function decreases, the approximation made in (5-137) works well. Hence, $\lambda$ should be decreased such that the Gauss-Newton influence becomes more important. Conversely, when the cost function increases, the gradient descent method should gain more weight: this is obtained by increasing $\lambda$. Different stop criteria can be employed to bring the iterative Levenberg-Marquardt algorithm to an end. For instance, the optimization can be halted when the relative decrease of the cost function becomes smaller than a user-chosen value, or when the relative update of the parameter vector becomes too small. However, the simplest approach is to stop the optimization when a sufficiently high number of iterations $i_{max}$ is exceeded. A full optimization scheme that makes use of this stop criterion is shown in Figure 5-14. In practice, we will also evaluate the cost function on the validation set, and choose the model which performs best on this data set (see "Overfitting and Validation" on p. 127).
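The whole loop of (5-141)-(5-147), with the $\lambda$-adaptation described above, can be sketched as follows. This is a minimal NumPy illustration on a toy problem; all names are assumptions, and the thesis scheme of Figure 5-14 additionally monitors a validation set:

```python
import numpy as np

def levenberg_marquardt(residual, jacobian, theta, max_iter=50):
    """Minimal Levenberg-Marquardt sketch following (5-141)-(5-147):
    normalize the Jacobian columns, damp the truncated-SVD update with
    lambda, and adapt lambda according to the success of the step."""
    lam = None
    V_old = float(residual(theta) @ residual(theta))
    for _ in range(max_iter):
        e, J = residual(theta), jacobian(theta)
        N = np.diag(1.0 / np.sqrt(np.mean(J**2, axis=0)))   # rms column scaling
        U, s, Vt = np.linalg.svd(J @ N, full_matrices=False)
        if lam is None:
            lam = s[0]                        # start from the largest singular value
        Lam = s / (s**2 + lam**2)             # damped inverse singular values, (5-146)
        dtheta = N @ (-(Vt.T * Lam) @ (U.T @ e))  # denormalized update, (5-147)
        theta_test = theta + dtheta
        V_test = float(residual(theta_test) @ residual(theta_test))
        if V_test < V_old:                    # success: move towards Gauss-Newton
            theta, V_old, lam = theta_test, V_test, lam / 2
        else:                                 # failure: move towards gradient descent
            lam = lam * 10
    return theta

# Quadratic toy problem: residual(theta) = theta - [1, 2].
theta = levenberg_marquardt(lambda th: th - np.array([1.0, 2.0]),
                            lambda th: np.eye(2), np.zeros(2))
```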


[Flowchart: initialize $\theta$ and compute $V(\theta,Z)$; while $i < i_{max}$: compute and normalize $J$, take the SVD of $J_N$ (on the first pass, set $\lambda$ to the largest singular value), compute $\Delta\theta_N$ and denormalize it, evaluate $V_{test}(\theta_{test}, Z)$ with $\theta_{test} = \theta + \Delta\theta$; if $V_{test} < V$, accept $\theta_{test}$ and halve $\lambda$, otherwise multiply $\lambda$ by 10.]

Figure 5-14. Levenberg-Marquardt algorithm.


D. Dealing with Complex Data

Suppose the following cost function $V(\theta, Z)$ needs to be minimized:

$V(\theta, Z) = \varepsilon^H(\theta, Z)\,\varepsilon(\theta, Z) = \sum_{k=1}^{N} |\varepsilon_k(\theta, Z)|^2$ ,    (5-148)

with $\varepsilon(\theta, Z) \in \mathbb{C}^N$ a complex residual. $V(\theta, Z)$ can be rewritten as

$V(\theta, Z) = \varepsilon_{re}^T(\theta, Z)\,\varepsilon_{re}(\theta, Z)$ ,

(5-149)

where ε re ( θ, Z ) is defined as

$\varepsilon_{re}(\theta, Z) = \begin{bmatrix} \mathrm{Re}(\varepsilon(\theta, Z)) \\ \mathrm{Im}(\varepsilon(\theta, Z)) \end{bmatrix}$ .

(5-150)

Furthermore, the matrix J re ( θ, Z ) is defined as

$J_{re}(\theta, Z) = \begin{bmatrix} \mathrm{Re}(J(\theta, Z)) \\ \mathrm{Im}(J(\theta, Z)) \end{bmatrix}$ .

(5-151)

The quantities $\varepsilon_{re}(\theta, Z)$ and $J_{re}(\theta, Z)$ are thus real-valued, which allows us to recycle the ideas described in section C.
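The stacking of (5-150)-(5-151) preserves the cost, since $\varepsilon^H\varepsilon = \varepsilon_{re}^T\varepsilon_{re}$. A minimal sketch (illustrative names, not the thesis code):

```python
import numpy as np

def stack_real(eps, J):
    """Stack real and imaginary parts of a complex residual and its
    Jacobian, as in (5-150)-(5-151), so the real-valued Gauss-Newton /
    Levenberg-Marquardt machinery can be reused."""
    eps_re = np.concatenate([eps.real, eps.imag])
    J_re = np.vstack([J.real, J.imag])
    return eps_re, J_re

# The cost is preserved: eps^H eps = eps_re^T eps_re.
eps = np.array([1 + 2j, 3 - 1j])
J = np.array([[1j], [2 + 0j]])
eps_re, J_re = stack_real(eps, J)
# here eps^H eps = 1 + 4 + 9 + 1 = 15, and so is eps_re^T eps_re
```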

E. Weighted Least Squares

In general, a Weighted Least Squares (WLS) cost function is defined as

$V_{WLS}(\theta, Z) = \varepsilon^H(\theta, Z)\,W\,\varepsilon(\theta, Z)$ ,    (5-152)

where $W \in \mathbb{C}^{N \times N}$ is a Hermitian, positive definite weighting matrix. Any Hermitian positive

(semi-)definite matrix can be decomposed as [27]:

$W = W^{1/2}\,W^{1/2}$ ,

(5-153)

where the square root matrix $W^{1/2}$ is also Hermitian. Using the SVD of $W = U\Sigma V^H$, $W^{1/2}$ can be calculated straightforwardly:


$W^{1/2} = V\Sigma^{1/2}V^H = \big(W^{1/2}\big)^H$ .

(5-154)

For real matrices, a similar result holds:

$W^{1/2} = V\Sigma^{1/2}V^T = \big(W^{1/2}\big)^T$ .

(5-155)

Equation (5-152) can thus be rewritten as

$V_{WLS}(\theta, Z) = \varepsilon^H(\theta, Z)\,\big(W^{1/2}\big)^H W^{1/2}\,\varepsilon(\theta, Z) = \big(W^{1/2}\varepsilon(\theta, Z)\big)^H\,W^{1/2}\varepsilon(\theta, Z)$ ,

(5-156)

V WLS ( θ, Z ) = ε˜ H ( θ, Z )ε˜ ( θ, Z ) ,

(5-157)

ε˜ ( θ, Z ) = W 1 / 2 ε ( θ, Z ) .

(5-158)

or

with

The Jacobian of ε˜ ( θ, Z ) is then given by

$\tilde{J}(\theta, Z) = \dfrac{\partial\tilde{\varepsilon}(\theta, Z)}{\partial\theta} = \dfrac{\partial\big(W^{1/2}\varepsilon(\theta, Z)\big)}{\partial\theta} = W^{1/2}\,J(\theta, Z)$ .

(5-159)

In this way, we recast the WLS problem such that it can be solved using the techniques from sections C and D.
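The factor $W^{1/2}$ of (5-153)-(5-154) can be computed as sketched below; for a Hermitian matrix the eigendecomposition coincides with the SVD used in the text. This is illustrative code, and the guard against tiny negative eigenvalues is an assumption, not from the thesis:

```python
import numpy as np

def sqrtm_hermitian(W):
    """Hermitian square root W^{1/2} of a Hermitian positive
    (semi-)definite W, via W = V S V^H."""
    s, V = np.linalg.eigh(W)
    s = np.clip(s, 0.0, None)         # guard against tiny negative eigenvalues
    return (V * np.sqrt(s)) @ V.conj().T

W = np.array([[2.0, 1.0], [1.0, 2.0]])
W_half = sqrtm_hermitian(W)
# W_half is Hermitian and W_half @ W_half recovers W
```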


Appendix 5.D Explicit Expressions for the PNLSS Jacobian

In this appendix, we compute explicit expressions for the derivatives of the model output (5-104) with respect to the parameters $\theta$. We first define the matrices $\zeta'(t) \in \mathbb{R}^{n_\zeta \times n_a}$ and $\eta'(t) \in \mathbb{R}^{n_\eta \times n_a}$ as

$\zeta'(t) = \dfrac{\partial \zeta(t)}{\partial x(t)} = \begin{bmatrix} \dfrac{\partial \zeta(t)}{\partial x_1(t)} & \cdots & \dfrac{\partial \zeta(t)}{\partial x_{n_a}(t)} \end{bmatrix}$    (5-160)

$\eta'(t) = \dfrac{\partial \eta(t)}{\partial x(t)} = \begin{bmatrix} \dfrac{\partial \eta(t)}{\partial x_1(t)} & \cdots & \dfrac{\partial \eta(t)}{\partial x_{n_a}(t)} \end{bmatrix}$

$I_{ij}^{m \times n} \in \mathbb{R}^{m \times n}$ denotes a zero matrix with a single element equal to one at entry $(i, j)$:

$I_{ij}^{m \times n} = \begin{bmatrix} 0 & \cdots & 0 & \cdots & 0 \\ \vdots & & \vdots & & \vdots \\ 0 & \cdots & 1 & \cdots & 0 \\ \vdots & & \vdots & & \vdots \\ 0 & \cdots & 0 & \cdots & 0 \end{bmatrix}$    (5-161)

We begin by computing the Jacobian with respect to the elements $A_{ij}$ of the matrix $A$. The derivative of the output equation with respect to $A_{ij}$ is given by:

$\dfrac{\partial y(t)}{\partial A_{ij}} = \dfrac{\partial}{\partial A_{ij}}\big(Cx(t) + Du(t) + F\eta(t)\big) = C\,\dfrac{\partial x(t)}{\partial A_{ij}} + F\eta'(t)\,\dfrac{\partial x(t)}{\partial A_{ij}}$    (5-162)

In order to determine the right hand side of (5-162), we also need the derivatives of the state equation which are given by

$\dfrac{\partial x(t+1)}{\partial A_{ij}} = \dfrac{\partial}{\partial A_{ij}}\big(Ax(t) + Bu(t) + E\zeta(t)\big)$ .    (5-163)

We define $x_{A_{ij}}(t) \in \mathbb{R}^{n_a}$ as


$x_{A_{ij}}(t) = \dfrac{\partial x(t)}{\partial A_{ij}}$ .

(5-164)

Then, equation (5-163) is rewritten as

$x_{A_{ij}}(t+1) = I_{ij}^{n_a \times n_a}\,x(t) + \big(A + E\zeta'(t)\big)\,x_{A_{ij}}(t)$ .    (5-165)

Combining equations (5-162) and (5-165) results in

$\begin{cases} x_{A_{ij}}(t+1) = I_{ij}^{n_a \times n_a}\,x(t) + \big(A + E\zeta'(t)\big)\,x_{A_{ij}}(t) \\ J_{A_{ij}}(t) = \big(C + F\eta'(t)\big)\,x_{A_{ij}}(t) \end{cases}$    (5-166)

where $J_{A_{ij}}(t) \in \mathbb{R}^{n_y}$ is defined as

$J_{A_{ij}}(t) = \dfrac{\partial y(t)}{\partial A_{ij}}$    (5-167)

The Jacobians with respect to the other model parameters are computed in a similar way. We summarize the results below:

$\begin{cases} x_{A_{ij}}(t+1) = I_{ij}^{n_a \times n_a}\,x(t) + \big(A + E\zeta'(t)\big)\,x_{A_{ij}}(t) \\ J_{A_{ij}}(t) = \big(C + F\eta'(t)\big)\,x_{A_{ij}}(t) \end{cases} \qquad J_{C_{ij}}(t) = I_{ij}^{n_y \times n_a}\,x(t)$

$\begin{cases} x_{B_{ij}}(t+1) = I_{ij}^{n_a \times n_u}\,u(t) + \big(A + E\zeta'(t)\big)\,x_{B_{ij}}(t) \\ J_{B_{ij}}(t) = \big(C + F\eta'(t)\big)\,x_{B_{ij}}(t) \end{cases} \qquad J_{D_{ij}}(t) = I_{ij}^{n_y \times n_u}\,u(t)$

$\begin{cases} x_{E_{ij}}(t+1) = I_{ij}^{n_a \times n_\zeta}\,\zeta(t) + \big(A + E\zeta'(t)\big)\,x_{E_{ij}}(t) \\ J_{E_{ij}}(t) = \big(C + F\eta'(t)\big)\,x_{E_{ij}}(t) \end{cases} \qquad J_{F_{ij}}(t) = I_{ij}^{n_y \times n_\eta}\,\eta(t)$    (5-168)
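For one entry $A_{ij}$, the recursion in (5-168) can be simulated as sketched below (an illustrative NumPy implementation; array names and shapes are assumptions, not the thesis code):

```python
import numpy as np

def jacobian_wrt_A(A, C, E, F, x, zeta_prime, eta_prime, i, j):
    """Simulate the recursion (5-166)/(5-168) for one entry A_ij:
    the Jacobian state x_Aij is driven by the original state sequence,
    with the time-varying dynamics A + E zeta'(t).

    x           -- original state sequence, shape (N, na)
    zeta_prime  -- zeta'(t) for every t, shape (N, nzeta, na)
    eta_prime   -- eta'(t) for every t, shape (N, neta, na)
    """
    N, na = x.shape
    xA = np.zeros(na)                        # x_Aij(0) = 0
    JA = np.zeros((N, C.shape[0]))
    for t in range(N):
        JA[t] = (C + F @ eta_prime[t]) @ xA
        # I_ij^{na x na} x(t) simply selects x_j(t) into row i
        drive = np.zeros(na); drive[i] = x[t, j]
        xA = drive + (A + E @ zeta_prime[t]) @ xA
    return JA

# Linear sanity check (E = F = 0): for x(t+1) = a x(t) with x(0) = 1,
# the derivative dx(t)/da equals t * a**(t-1).
A = np.array([[0.5]]); C = np.array([[1.0]])
E = np.zeros((1, 1)); F = np.zeros((1, 1))
N = 4
x = np.array([[0.5 ** t] for t in range(N)])
zp = np.zeros((N, 1, 1)); ep = np.zeros((N, 1, 1))
JA = jacobian_wrt_A(A, C, E, F, x, zp, ep, 0, 0)
```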


Appendix 5.E Computation of the Jacobian regarded as an alternative PNLSS system

It is clear from (5-168) that the computation of $J_{A_{ij}}(t)$, $J_{B_{ij}}(t)$, and $J_{E_{ij}}(t)$ is equivalent to calculating the output of an alternative PNLSS system. We consider here, for instance, the calculation of $J_{A_{ij}}(t)$, and we will attempt to write equations (5-166) in the following form:

$\begin{cases} \tilde{x}(t+1) = \tilde{A}\,\tilde{x}(t) + \tilde{B}\,\tilde{u}(t) + \tilde{E}\,\tilde{\zeta}(t) \\ \tilde{y}(t) = \tilde{C}\,\tilde{x}(t) + \tilde{F}\,\tilde{\eta}(t) \end{cases}$

(5-169)

For this, we define the new inputs, states and outputs as follows:

$\tilde{u}(t) = \begin{bmatrix} x(t) \\ u(t) \end{bmatrix}, \qquad \tilde{x}(t) = x_{A_{ij}}(t), \qquad \tilde{y}(t) = J_{A_{ij}}(t)$

(5-170)

A number of relations between the original and the new system matrices are trivial:

$\tilde{A} = A, \qquad \tilde{C} = C, \qquad \tilde{B} = \begin{bmatrix} I_{ij}^{n_a \times n_a} & 0^{n_a \times n_u} \end{bmatrix}, \qquad \tilde{D} = 0$    (5-171)

The remaining system matrices and monomials require slightly more effort to determine. The goal of the following calculations is to rewrite the terms Eζ' ( t )x˜ ( t ) and Fη' ( t )x˜ ( t ) as

$\tilde{E}\tilde{\zeta}(t)$ and $\tilde{F}\tilde{\eta}(t)$, respectively. The time indices will be omitted for the sake of simplicity. Using the multinomial notation, and with the multi-index $\alpha_j$ defined as the powers of the $j$-th monomial $\zeta_j$, we have:

$\zeta = \begin{bmatrix} \zeta_1 \\ \vdots \\ \zeta_{n_\zeta} \end{bmatrix} = \begin{bmatrix} \tilde{u}^{\alpha_1} \\ \vdots \\ \tilde{u}^{\alpha_{n_\zeta}} \end{bmatrix}$    (5-172)

Next, we derive expressions for $\zeta' = \partial\zeta/\partial x$. The derivative of $\zeta_j$ with respect to the state variable $x_i$ is equal to $\alpha_j(i)\,\zeta_j\,x_i^{-1}$. We can neglect the presence of the factor $x_i^{-1}$ when $x_i$ is not present in a given monomial, since the corresponding $\alpha_j(i)$ is in that case equal to zero. Hence, we obtain the following relation:


$\zeta' = \begin{bmatrix} \alpha_1(1)\,\zeta_1 x_1^{-1} & \cdots & \alpha_1(n_a)\,\zeta_1 x_{n_a}^{-1} \\ \vdots & & \vdots \\ \alpha_{n_\zeta}(1)\,\zeta_{n_\zeta} x_1^{-1} & \cdots & \alpha_{n_\zeta}(n_a)\,\zeta_{n_\zeta} x_{n_a}^{-1} \end{bmatrix}$    (5-173)

Then, the product $\zeta'\tilde{x}$ is given by

$\zeta'\tilde{x} = \begin{bmatrix} \alpha_1(1)\,\zeta_1 & \cdots & \alpha_1(n_a)\,\zeta_1 \\ \vdots & & \vdots \\ \alpha_{n_\zeta}(1)\,\zeta_{n_\zeta} & \cdots & \alpha_{n_\zeta}(n_a)\,\zeta_{n_\zeta} \end{bmatrix} \begin{bmatrix} \tilde{x}_1 x_1^{-1} \\ \vdots \\ \tilde{x}_{n_a} x_{n_a}^{-1} \end{bmatrix}$    (5-174)

We define the new $j$-th monomial $\tilde{\zeta}_j$ as

$\tilde{\zeta}_j = \zeta_j \begin{bmatrix} \tilde{x}_1 x_1^{-1} \\ \vdots \\ \tilde{x}_{n_a} x_{n_a}^{-1} \end{bmatrix} = \zeta_j \begin{bmatrix} \tilde{x}_1 \tilde{u}_1^{-1} \\ \vdots \\ \tilde{x}_{n_a} \tilde{u}_{n_a}^{-1} \end{bmatrix}$ ,    (5-175)

which allows us to rewrite (5-174) as

$\zeta'\tilde{x} = \begin{bmatrix} \alpha_1 & 0 & \cdots & 0 \\ 0 & \alpha_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \alpha_{n_\zeta} \end{bmatrix} \begin{bmatrix} \tilde{\zeta}_1 \\ \tilde{\zeta}_2 \\ \vdots \\ \tilde{\zeta}_{n_\zeta} \end{bmatrix}$    (5-176)

This leads to the definition of the new matrix $\tilde{E}$:

$\tilde{E} = E \begin{bmatrix} \alpha_1 & 0 & \cdots & 0 \\ 0 & \alpha_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \alpha_{n_\zeta} \end{bmatrix}$    (5-177)


CHAPTER 6

APPLICATIONS OF THE POLYNOMIAL NONLINEAR STATE SPACE MODEL

In this chapter, the nonlinear state space approach is applied to a number of real-life systems: the Silverbox, a combine harvester, a semi-active damper, a quarter car set-up, a robot arm, a Wiener-Hammerstein system, and a crystal detector. For each case study, the Device Under Test (DUT) and the performed experiments are described. Next, the Best Linear Approximation is estimated and a nonlinear state space model is built. Whenever possible, we compare our approach with other modelling methods.


Chapter 6: Applications of the Polynomial Nonlinear State Space Model

6.1 Silverbox

6.1.1 Description of the DUT

The Silverbox is an electronic circuit that emulates the behaviour of a mass-spring-damper system (Figure 6-1). The input $u_c$ of the system is the force applied to the mass $m$; the output $y_c$ represents the mass displacement. The spring acts nonlinearly and is characterized by the parameters $k_1$ and $k_3$. Since $k_3$ is positive, the spring is hardening. This means that relatively more force is required as the spring is extended. The equation that describes the system's behaviour is given by

$m\,\ddot{y}_c(t) + d\,\dot{y}_c(t) + k_1 y_c(t) + k_3 y_c^3(t) = u_c(t)$ .

(6-1)

The parameter d determines the damping which is present in the system. For a sinusoidal input u c ( t ) , (6-1) is also known as the Duffing equation (see “Duffing Oscillator” on p. 111).

Figure 6-1. Mass-spring-damper system (mass $m$, damper $d$, nonlinear spring $k_1$, $k_3$, input force $u_c$).

6.1.2 Description of the Experiments

The applied excitation signal consists of two parts (Figure 6-2). The first part of the signal is filtered Gaussian white noise with a linearly increasing RMS value as a function of time. This sequence consists of 40 700 samples and has a bandwidth of 200 Hz. The average RMS value of the signal is 22.3 mV. This data set will be used to validate the models. The second part of the excitation signal contains 10 realizations of an odd random phase multisine with 8192 samples and 500 transient points per realization. The bandwidth of the excitation signal is also 200 Hz and its RMS value is 22.3 mV. This sequence is applied once to the system under test and will be used to estimate the models. In all experiments, the input and output signals are measured at a sampling frequency of 10 MHz / 2^14 = 610.35 Hz.


Silverbox

[Plot: input signal [V] versus time [s]; the validation part is followed by the estimation part.]

Figure 6-2. Excitation signal that contains a validation and an estimation set.

6.1.3 Best Linear Approximation

In order to obtain a nonparametric estimate of the Best Linear Approximation (BLA), the FRF is determined for every phase realization of the estimation data set. The BLA $\hat{G}_{BLA}(j\omega_k)$ is then calculated by averaging those FRFs. Next, a parametric second order linear state space model is estimated. From this model, initial values will be extracted in order to estimate some nonlinear models. The results are plotted in Figure 6-3: the top and bottom plot show the amplitude and phase of the BLA, respectively. The solid black line denotes the BLA; the solid grey line represents the linear model. The total standard deviation $\hat{\sigma}_{BLA}(k)$ is also given (black dashed line), together with the model error (dashed grey line), i.e., the difference between the measured BLA and the linear model. Unfortunately, only one period per realization was measured. Hence, no distinction can be made between the nonlinear contributions and the measurement noise (see "Periodic Data" on p. 28). From Figure 6-3, we observe that the linear model is of good quality: up to a frequency of 120 Hz, the model error coincides with the standard deviation. A statistically significant, but small model error is present in the frequency band between 120 Hz and 200 Hz. This error is surprising, because equation (6-1) corresponds to a linear, second order system when omitting the nonlinear term $k_3 y_c^3(t)$. To explain this behaviour, the way the measurements


[Plot: BLA amplitude [dB] and phase [°] versus frequency [Hz].]

Figure 6-3. BLA of the Silverbox (solid black line); total standard deviation (black dashed line); 2nd order linear model (solid grey line); model error (dashed grey line).

were made should be taken into account. A band-limited (BL) set-up was employed during the measurements [56]. Hence, a discrete-time, second order model does not suffice to model this continuous-time, second order system. Although the model error disappears for a third order model, we will neglect it and continue with the second order model. The second data set is now used to validate the linear model. The measured output of the system is plotted in Figure 6-4 (black line) together with the simulation error (grey line). The RMS value of the simulation error (RMSE) is 13.7 mV. This number should be compared to the RMS output level, which measures 53.4 mV.


[Plot: output amplitude [V] versus time [s]; RMSE: 13.7 mV.]

Figure 6-4. Validation result for the 2nd order linear model: measured output (black) and model simulation error (grey).

6.1.4 Nonlinear Model

Now, we will investigate whether better modelling results can be obtained with a nonlinear model. First, a second order polynomial nonlinear state space (PNLSS) model is estimated with the following settings:

$\xi(t) = \begin{bmatrix} x_1(t) & x_2(t) & u(t) \end{bmatrix}^T$    (6-2)

and

$\zeta(t) = \xi(t)^{\{3\}}, \qquad \eta(t) = 0$ .    (6-3)

Hence, the nonlinear degree in the state equation is nx = [2 3]. We include all cross products of the states and the input. In the output equation, only the linear terms are present (ny = 0). This results in a nonlinear model that contains 37 parameters. The validation results for this nonlinear model are shown in Figure 6-5. Again, the measured output signal is denoted by the black line; the simulation error of the nonlinear model is plotted in grey. The RMS value of the model error has dropped significantly, from 13.7 mV for the linear model to 0.26 mV for the nonlinear model. Hence, the second order polynomial nonlinear state space model performs more than a factor of 50 better than the linear one. The


[Plot: output amplitude [V] versus time [s]; RMSE: 0.26 mV.]

Figure 6-5. Validation result for the best nonlinear model: measured output (black) and model simulation error (grey).

spectra of the measured validation output signal (black), linear simulation error (light grey), and nonlinear simulation error (dark grey) are shown in Figure 6-6. The errors of the linear model are particularly present around the resonance frequency (approximately 60 Hz), i.e., for large signal amplitudes. The errors of the nonlinear model are concentrated around the resonance frequency and close to DC. Higher model orders and degrees of nonlinearity were also tried out, but none of them gave better results than this second order nonlinear model. We also estimated some state affine

[Plot: DFT spectra, amplitude [dBV] versus frequency [Hz].]

Figure 6-6. DFT spectra of the measured validation output signal (black), linear simulation error (light grey), and nonlinear simulation error (dark grey).


models (see “State Affine Models” on p. 89) of various orders and degrees. In Table 6-1, the validation results for such models of degree 3 and 4 are summarized.

| Model order | RMSE [mV] (degree 3) | Parameters (degree 3) | RMSE [mV] (degree 4) | Parameters (degree 4) |
|---|---|---|---|---|
| n = 2 | 9.12 | 23 | 9.12 | 32 |
| n = 3 | 10.51 | 39 | 7.51 | 55 |
| n = 4 | 9.40 | 59 | 10.04 | 84 |
| n = 5 | 6.22 | 83 | 9.47 | 119 |
| n = 6 | 12.65 | 111 | 13.15 | 160 |
| n = 7 | 11.81 | 143 | 12.77 | 207 |

Table 6-1. Validation results for state affine models of degree 3 and 4.

It is clear that the state affine approach yields unsatisfying results on the Silverbox data set. A possible reason for the poor performance is that the state affine approach approximates the nonlinear behaviour of a system by polynomial functions of the input, while the nonlinear behaviour of the Silverbox mainly consists of a nonlinear feedback of the output. As a result, high system orders are required in order to obtain a good model [69].

6.1.5 Comparison with Other Approaches

At the Symposium on Nonlinear Control Systems (Nolcos) in 2004, a special session was organized around the Silverbox device. The aim was to compare different modelling approaches applied to the same nonlinear device, using the same experimental data. In all the papers participating in this session, the multisine and the Gaussian noise data sets were used for estimation and validation, respectively. Before continuing, a warning concerning the validation data is appropriate. As can be seen from Figure 6-2, the amplitude of the last part of the validation input exceeds the amplitude of the estimation data. Hence, for this part of the validation data, extrapolation will occur. It is important to emphasize that the performance achieved in this region is not a good measure for the model quality. It is rather a matter of "luck": if there is an exact correspondence between the internal structure of the DUT and the model structure, this will generally yield good extrapolation behaviour. But if this is not the case, and the estimated model is only an approximation, the extrapolation will in general be


poor. Therefore, the extrapolation behaviour should be discarded in a fair assessment. Note that the danger of extrapolation also resides in the use of too small amplitudes (for instance dead zones, which become relatively more important for smaller inputs). Among the papers, we distinguish three methodologies. The first one is a white box approach (H. Hjalmarsson [29], J. Paduart [52]), making explicit use of the knowledge about the internal structure of the Silverbox device. M. Espinoza [22], L. Sragner [76], and V. Verdult [81] employ a black box approach. Finally, the paper of L. Ljung [39] shows results for black box and grey box modelling. In the following, we will briefly describe each methodology. In [29], the internal block structure of the Silverbox is reordered to turn it into a MISO Hammerstein system. An existing relaxation method for Hammerstein-Wiener systems (published by E. Bai [1]) is extended to MISO systems, and applied to the Silverbox device. The model obtained with this method achieves a validation RMSE of 0.96 mV. Another white box approach is presented in [52]. In Chapter 4, the ideas from this paper are elaborated. The RMSE on the validation set is 0.38 mV. In [22], Least Squares Support Vector Machines (LS-SVM) for nonlinear regression are applied to model the Silverbox. The idea here is to consider a model where the inputs are mapped to a high dimensional feature space with a nonlinear mapping φ. This feature space is converted to a low dimensional dual space by means of a positive definite kernel K. As the final model is expressed as a function of K, there is no need to compute φ explicitly. Furthermore, the dual space allows the model parameters to be estimated by solving a linear least squares problem under equality constraints. In [22], polynomial kernels are used on the Silverbox data, yielding a validation result of 0.32 mV. An even better model is obtained in [23] using a partially linear model (PL-LS-SVM): 0.27 mV.
This approach includes the prior knowledge that linear regressors are present in the model. The model used in [81] is a state space model, composed of a weighted sum of two local second order linear models (LLM). The weights are a function of a scheduling vector, which is chosen equal to the output of the system for this particular DUT. A typical choice for the weighting functions is radial basis functions, which are also used in [81]. The validation result obtained here is 1.3 mV. In [76], different types of Artificial Neural Networks (ANN) are assessed: Multi-Layer Perceptron (MLP) Networks and ΣΠ Networks, both making use of hyperbolic tangent base functions. In both cases, the maximal time lag for the input and the output is chosen equal to 5. The MLP Network has one hidden layer and contains 60 neurons. The ΣΠ Network has 20 multiplicative and 20 additive elements. The best model is a special MLP Network with only 10 hidden neurons, which also makes use of linear regressors. This results in an RMSE of 7.8 mV. A whole arsenal of black and grey box techniques is applied in [39], such as neural networks, wavenets, block-oriented models, and physical models. The best result is achieved by a one-hidden-layer sigmoidal neural network with 30 neurons, using input and output regressors with a maximal time lag of 10. Note that a custom cubic regressor is included, which improves the results significantly. This leads to an RMSE of 0.30 mV. In Table 6-2, the validation RMSE and the number of required parameters are summarized for the different approaches.

Author                 Approach                         Validation RMSE [mV]   Number of parameters
J. Paduart             PNLSS                            0.26                   37
H. Hjalmarsson [29]    Physical block-oriented model    0.96                   5
J. Paduart [52]        Physical block-oriented model    0.38                   10
M. Espinoza [22]       LS-SVM with NARX                 0.32                   490
M. Espinoza [23]       PL-LS-SVM with PL-NARX           0.27                   190
L. Sragner [76]        MLP-ANN                          7.8                    600
V. Verdult [81]        Local Linear State Space model   1.3                    16
L. Ljung [39]          NL ARX model                     0.30                   712
Table 6-2. Validation results for various modelling approaches.

From Table 6-2, we conclude that the physical models achieve reasonable validation RMSE values. The main advantage of this approach is the small number of parameters, and the ability to give a physical interpretation to the identified parameters. In general, the black and deep-grey box models show the lowest RMSE values. The price paid for their excellent performance is the higher number of parameters they require. An exception to the rule in Table 6-2 is the MLP neural network. Several reasons can explain its poor performance for this particular set-up. First of all, the hyperbolic tangent functions used in the ANN approach do not exploit the polynomial behaviour of the Silverbox. Secondly, it could be that the neural network was not properly initialized, or that the nonlinear search used in the estimation procedure got stuck in a local minimum. We obtained a good result with the polynomial nonlinear state space model: a low RMSE value (0.26 mV) and a reasonable number of parameters (37). The reason our approach works so well is the correspondence between the PNLSS model and the internal structure of the Silverbox, which basically consists of a cubic feedback of the output (see “Nonlinear Feedback” on p. 106). This match is a clear advantage in a validation test that requires extrapolation. To conclude, three black box models clearly stand out in this comparison, with RMSE values close to the noise level of 0.25 mV. This level was obtained from another data set, in similar experimental conditions.
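As an illustration of the kernel idea behind the LS-SVM entries in Table 6-2, the following is a minimal sketch of the dual formulation on toy data. It is not the estimator of [22] (which additionally builds NARX regressors from past inputs and outputs, and is extended to a partially linear form in [23]); the data, the kernel constant, and the regularization value γ are chosen here purely for illustration.

```python
import numpy as np

def poly_kernel(X1, X2, degree=3, c=1.0):
    # positive definite polynomial kernel K(x, z) = (x'z + c)^degree
    return (X1 @ X2.T + c) ** degree

def lssvm_fit(X, y, gamma=1e4, degree=3):
    # LS-SVM dual: one linear system in (b, alpha); phi is never formed
    n = len(y)
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = 1.0
    A[1:, 0] = 1.0
    A[1:, 1:] = poly_kernel(X, X, degree) + np.eye(n) / gamma
    sol = np.linalg.solve(A, np.concatenate(([0.0], y)))
    return sol[1:], sol[0]          # alpha, bias b

def lssvm_predict(Xnew, X, alpha, b, degree=3):
    # model expressed entirely through the kernel K
    return poly_kernel(Xnew, X, degree) @ alpha + b

# toy static cubic nonlinearity, reminiscent of the Silverbox feedback
X = np.linspace(-1, 1, 50).reshape(-1, 1)
y = X.ravel() ** 3
alpha, b = lssvm_fit(X, y)
yhat = lssvm_predict(X, X, alpha, b)
```

Because the cubic target lies exactly in the feature space induced by the degree-3 polynomial kernel, the dual solution fits it almost perfectly despite the ridge term.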


6.2 Combine Harvester

6.2.1 Description of the DUT

The system that we will model in this section is a New Holland CR-960 combine harvester (see Figure 6-7). Note that there is no grain header mounted at the front of the harvester. We will use measurements performed by ir. Tom Coen from the KULeuven, Faculty of Bioscience Engineering, Department MeBioS.

Figure 6-7. Combine harvester.

A block scheme of the traction system of the machine is shown in Figure 6-8. The black connections denote mechanical transmissions; the grey connection is part of the hydrostatic transmission. The diesel engine delivers the traction power and is coupled to a hydrostatic pump, which in turn drives a hydrostatic engine. The speed of the diesel engine is kept at the requested set point by a regulator which varies the fuel injection. The flow of the hydrostatic pump is controlled by an electric current. The power is then transferred to the front axle through the mechanical gearbox and the front differential.

Figure 6-8. Traction system of the combine harvester.

The traction system to be modelled has two inputs and one output. The first input is the steering current of the hydrostatic pump; the second input is the speed setting of the diesel engine. The engine speed is limited to between 1300 and 2100 rotations per minute (rpm); the steering current of the hydrostatic pump can be varied between 0% and 100%. The output of this MISO system is the measured driving speed, expressed in km/h. A detailed analysis of the expected system order of the traction system is presented in [9]. The dynamic behaviour is mainly located in the pump, and consists of three second order subsystems. Part of these dynamics can be neglected as they are relatively fast; hence, the required model order turned out to be four.

6.2.2 Description of the Experiments

All experiments were performed on the road, with the gearbox fixed in the second gear. Two sets of orthogonal random odd, random phase multisines were generated (see “Periodic Data” on p. 34). Hence, a total of 4 realizations were applied to both input channels. Each realization consisted of two periods of 4096 samples each, and 192 transient samples. The RMS values of the multisines for the first and second input were 57% and 1715 rpm, respectively. The bandwidth of the excitation signals was 2 Hz, and the sampling frequency f_s used in the experiments was 20 Hz. The first two realizations will be used to estimate the models, and the remaining two to validate them. Due to timing problems with the PXI instrumentation system used to perform the experiments, the applied input signals were not completely periodic. As a consequence, we cannot exploit the periodic nature of the original signals to separate the measurement noise from the nonlinear contributions. Hence, we treat the data sequences as if they were non periodic (see “Non Periodic Data” on p. 38).
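The excitation type used here can be sketched as follows. This is a simplified, hypothetical generator: it excites all odd harmonics up to the requested bandwidth with uniformly random phases, whereas the actual random odd design of “Periodic Data” (p. 34) additionally leaves a random subset of odd lines unexcited as detection lines, and the MIMO experiments use orthogonal realizations.

```python
import numpy as np

rng = np.random.default_rng(0)

def odd_multisine(n_samples, fs, f_max, rms):
    # excite only the odd harmonics of the base frequency f0 = fs / n_samples,
    # up to f_max, each with a uniformly random phase
    f0 = fs / n_samples
    k_max = int(f_max / f0)
    lines = np.arange(1, k_max + 1, 2)           # odd frequency bins
    phases = rng.uniform(0, 2 * np.pi, lines.size)
    t = np.arange(n_samples) / fs
    u = np.zeros(n_samples)
    for k, ph in zip(lines, phases):
        u += np.cos(2 * np.pi * k * f0 * t + ph)
    u *= rms / np.sqrt(np.mean(u ** 2))          # scale to the requested RMS value
    return u

# one period: 4096 samples at fs = 20 Hz, 2 Hz bandwidth (as in the text)
u = odd_multisine(4096, 20.0, 2.0, 1.0)
```

By construction, the signal is periodic over 4096 samples and its DFT is zero (up to rounding) at the even bins.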

6.2.3 Best Linear Approximation

To estimate the Multiple Input, Single Output (MISO) BLA and its covariance, we split the estimation data (2 × (2 × 4096 + 192) samples) into M = 32 subrecords of 524 samples each.

Figure 6-9. The MISO BLA (G11 and G12) of the combine harvester (solid black line); Total standard deviation (dashed black line); 6th order linear model (solid grey line); Model error (dashed grey line).

Then, we compute the auto- and cross spectra with equation (2-24). Finally, the BLA is obtained with equation (2-41), and its covariance with equation (2-43). Next, 4th, 5th, and 6th order linear models are estimated with a subspace method (see “Frequency Domain Subspace Identification” on p. 115) from the BLA and its covariance matrix. The subspace method is then followed by a nonlinear optimization. Figure 6-9 shows the BLA (solid black line) and the total standard deviation (dashed black line), together with the 6th order linear model (solid grey line) and the amplitude of the complex model error (dashed grey line). G11 is the transfer function from the steering current (input 1) to the measured speed (output 1); G12 is the transfer function from the diesel engine’s speed setting (input 2) to the measured speed (output 1). Next, a validation test is carried out with the 6th order linear model.

Figure 6-10. Validation result for the 6th order linear model: measured output (black) and model simulation error (grey).

In Figure 6-10, the model error for the validation data is shown (grey) together with the measured output (black). The validation data consist of two merged multisine realizations. Therefore, two transient phenomena are present in the model error: one at the start of the data set and one around 400 s. For the calculation of the RMSE (0.73 km/h), we discard 200 samples at the start of each realization in order to eliminate the effect of the transients. When taking a closer look at the simulation error, we observe periodic residuals. This effect can be caused by periodic disturbances (e.g. coupling with the 50 Hz mains), or by unmodelled dynamics (since a quasi-periodic excitation signal was employed). In the next section, it will become clear that there are unmodelled dynamics.
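The RMSE computation with transient removal described above can be sketched as follows (the function name and the toy array layout are illustrative only):

```python
import numpy as np

def validation_rmse(y_meas, y_sim, realization_starts, n_discard=200):
    # build a mask that drops n_discard samples after each realization start,
    # so the simulation transients do not inflate the error measure
    keep = np.ones(len(y_meas), dtype=bool)
    for start in realization_starts:
        keep[start:start + n_discard] = False
    err = y_meas[keep] - y_sim[keep]
    return np.sqrt(np.mean(err ** 2))

# toy check: large error bursts inside the discarded zones do not count
y_meas = np.zeros(8192)
y_sim = np.zeros(8192)
y_sim[0:200] = 10.0       # transient of the first realization
y_sim[4096:4296] = 10.0   # transient of the second realization
print(validation_rmse(y_meas, y_sim, [0, 4096]))  # -> 0.0
```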

6.2.4 Nonlinear Model

For the nonlinear modelling, we use as starting values the 4th, 5th, and 6th order linear models obtained in the previous step. Two types of nonlinear state space models are considered: polynomial nonlinear models and state affine models. For the polynomial models, we have observed that a nonlinear output equation does not enhance the modelling results. Therefore, we only show the results for models with a linear output equation (η(t) = 0). We have also noticed that the nonlinear combinations with the inputs in the state equation do not improve the results. Hence, these terms are omitted in what follows. The validation RMSE of the estimated polynomial nonlinear models is shown in Table 6-3 (left: ζ(t) = x(t)^(3); right: ζ(t) = x(t)^{3}). From this table, it is clear that the RMSE decreases for higher model orders, while the number of parameters increases significantly.

Model Order   PNLSS nx=[3], ny=[]          PNLSS nx=[2 3], ny=[]
              RMSE [km/h]   Parameters     RMSE [km/h]   Parameters
n=4           0.63          94             0.63          134
n=5           0.57          192            0.54          267
n=6           0.35          356            0.36          482

Table 6-3. Validation results for the polynomial nonlinear state space models.

The best result achieved with the polynomial nonlinear approach is an RMSE of 0.35 km/h, using a 6th order model with nx=[3]. The validation error of this model is shown in Figure 6-11 (grey), together with the measured output (black). The two transients are discarded in the same way as described in the previous section.

Figure 6-11. Validation result for the best nonlinear model: measured output (black) and model simulation error (grey).

Figure 6-12 shows the spectra of the measured validation output (black), the linear simulation error (light grey), and the nonlinear simulation error (dark grey). From this plot, we observe that the nonlinear model reduces the linear model error between DC and 1 Hz, but for higher frequencies no significant improvement is obtained.

Figure 6-12. DFT spectra of the measured validation output signal (black), linear simulation error (light grey), and nonlinear simulation error (dark grey).

Furthermore, state affine models of degree 3 and 4 are also estimated. The validation results for these models are shown in Table 6-4. A 5th order model of degree 3 yields the best result (0.39 km/h). No clear trends are visible in Table 6-4; the results are comparable to what we obtained with the polynomial nonlinear state space models. Hence, both approaches perform equally well.

Model Order   State Affine degree 3        State Affine degree 4
              RMSE [km/h]   Parameters     RMSE [km/h]   Parameters
n=4           0.46          149            0.45          254
n=5           0.39          209            0.40          359
n=6           0.41          279            0.44          482

Table 6-4. Validation results for state affine models of degree 3 and 4.
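As a side note, the PNLSS parameter counts reported in Table 6-3 can be reproduced by counting monomials, assuming (as the numbers suggest) that the reported counts exclude the n² redundant degrees of freedom associated with a state space similarity transformation. A sketch:

```python
from math import comb

def n_monomials(n_states, degrees):
    # number of distinct monomials of the listed degrees in n_states variables:
    # C(n + d - 1, d) monomials of degree exactly d
    return sum(comb(n_states + d - 1, d) for d in degrees)

def pnlss_param_count(n, m, p, nx_degrees, ny_degrees):
    # A (n x n), B (n x m), C (p x n), D (p x m) plus the nonlinear
    # coefficient matrices E and F; subtract the n^2 degrees of freedom
    # that a state space similarity transform makes redundant
    linear = n * n + n * m + p * n + p * m
    e_coeffs = n * n_monomials(n, nx_degrees)
    f_coeffs = p * n_monomials(n, ny_degrees)
    return linear + e_coeffs + f_coeffs - n * n

# combine harvester: m = 2 inputs, p = 1 output, "states only" nonlinearities
counts = [pnlss_param_count(n, 2, 1, [3], []) for n in (4, 5, 6)]
```

With nx=[3] this reproduces the left-hand column of Table 6-3 (94, 192, 356 parameters), and with nx=[2 3] the right-hand column (134, 267, 482).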


6.3 Semi-active Damper

6.3.1 Description of the DUT

This application concerns the modelling of a magneto-rheological (MR) damper. This damper is called semi-active because the characteristics of the viscous fluid inside the damper are influenced by a magnetic field. Hence, the relation between the force over the damper and the position/velocity of the piston is changed.

Figure 6-13. Measurement set-up of the magneto-rheological damper.

Two quantities serve as input to this system: the reference signal applied to the PID controller that regulates the piston position via the shaker, and the current which determines the magnetic field over the viscous fluid. As system output, we consider the force over the damper, which is measured by a load cell. The measurement set-up is shown in Figure 6-13. For an ideal, linear damper, we expect to obtain an improper first order model, i.e., the theoretical relationship between displacement and force for a perfect damper. Due to the non-idealities of the device, the required model order will turn out to be higher.

6.3.2 Description of the Experiments

Both the construction of the set-up and the measurements were carried out by ir. Kris Smolders from the PMA Department of the KULeuven. He applied three realizations of a full grid, random phase multisine to the DUT. The multisines were excited in a frequency band between 0.12 Hz and 10 Hz, and 6 periods per realization were measured, with 65 536 samples per period. In all the measurements, a sampling frequency f_s of 2000 Hz was used.


A slow DC trend present in the measured output data was removed prior to the estimation procedures. After removal of the DC levels, the signals applied to the first (piston reference) and second input (damper current) of the DUT have RMS values of 39 mV and 194 mV, respectively. The first two multisine realizations are used for the estimation of the models, and the third realization for the validation.
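The trend removal step can be sketched as a least squares fit of a straight line that is subtracted from the data (scipy.signal.detrend performs the same operation); the toy signal below is illustrative only:

```python
import numpy as np

def linear_detrend(y):
    # remove a slow linear (DC) trend by least squares fitting a + b*t
    t = np.arange(len(y), dtype=float)
    b, a = np.polyfit(t, y, 1)   # slope, intercept
    return y - (a + b * t)

# toy data: a linear drift plus a periodic signal of interest
t = np.arange(1000)
y = 0.002 * t + 0.5 + np.sin(2 * np.pi * t / 50)
yd = linear_detrend(y)
```

After detrending, the residual is (up to a small leakage of the sinusoid into the fitted line) the zero-mean periodic component.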

6.3.3 Best Linear Approximation

First, we estimate the device’s Best Linear Approximation. Unfortunately, only two realizations were available. For a dual input system, this is sufficient to calculate the BLA, but not enough to determine an estimate of its covariance. Hence, the approach described in “Periodic Data” on p. 34 is not suitable. Therefore, we employ the method described in “Non Periodic Data” on p. 38 to determine the BLA from the averaged input/output data. We compute the auto- and cross spectra with equation (2-24), with M = 32 blocks of 4096 samples. Finally, the BLA is obtained with equation (2-41) and its covariance with equation (2-43). Then, linear models of different orders (2nd to 5th) are estimated from the BLA and its covariance matrix, using a subspace method (see “Frequency Domain Subspace Identification” on p. 115), which is followed by a nonlinear optimization.

Figure 6-14 shows the MISO BLA (solid black line) and the total standard deviation (dashed black line), together with the 3rd order linear model (solid grey line) and the amplitude of the complex model error (dashed grey line). G11 is the transfer function from the piston reference (input 1) to the measured force (output 1); G12 is the transfer function from the damper current (input 2) to the measured force (output 1). G11 behaves as expected for a damper: ideally, the force over the damper should be proportional to the velocity of the piston, i.e., jω times the displacement. This is, indeed, roughly what we observe for G11. Furthermore, from the top plots it can be seen that the relative uncertainty on G12 is high compared with the one on G11. Hence, the estimated linear model is mainly determined by G11.

Figure 6-14. The MISO BLA (G11 and G12) of the semi-active damper (solid black line); 3rd order linear model (solid grey line); Total standard deviation (dashed black line); Model error (dashed grey line).

Next, a validation test is carried out with the 3rd order linear model. In Figure 6-15, the model error for the validation data is shown (grey), together with the measured output (black). The RMS value of the model error (34 mV) is quite high compared with the RMS value of the measured output (71 mV). We will reduce this error using nonlinear models.

Figure 6-15. Validation result for the 3rd order linear model: measured output (black) and model simulation error (grey).
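Equations (2-24) and (2-41) are not reproduced here, but the block-averaged cross-spectral estimate of a MISO frequency response has the generic form G(k) = S_YU(k) S_UU(k)⁻¹. A simplified sketch (rectangular blocks, no windowing, no covariance estimate; the test system is a toy static one):

```python
import numpy as np

def miso_bla(u, y, n_blocks):
    # u: (N, 2) inputs, y: (N,) output; split into blocks, accumulate the
    # auto-spectrum S_UU and cross-spectrum S_YU per bin, then solve for G
    N = (len(y) // n_blocks) * n_blocks
    U = np.stack([np.fft.rfft(b, axis=0) for b in np.split(u[:N], n_blocks)])
    Y = np.stack([np.fft.rfft(b) for b in np.split(y[:N], n_blocks)])
    n_freq = U.shape[1]
    G = np.zeros((n_freq, 2), dtype=complex)
    for k in range(1, n_freq):                   # DC bin left at zero
        Suu = sum(np.outer(U[m, k], U[m, k].conj()) for m in range(n_blocks))
        Syu = sum(Y[m, k] * U[m, k].conj() for m in range(n_blocks))
        G[k] = np.linalg.solve(Suu.T, Syu)       # row G from G @ Suu = Syu
    return G

# toy noiseless static MISO system: y = 2*u1 + 3*u2
rng = np.random.default_rng(1)
u = rng.standard_normal((1024, 2))
y = 2.0 * u[:, 0] + 3.0 * u[:, 1]
G = miso_bla(u, y, 4)
```

For this static system there is no leakage, so the estimate equals [2, 3] at every non-DC bin.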


6.3.4 Nonlinear Model

For the nonlinear modelling, we use as starting values the 2nd to 5th order linear models obtained in the previous step. Again, two types of nonlinear state space models are considered: PNLSS and state affine models. For the polynomial models, we have observed that using a nonlinear relation for both the state and the output equation always yields better modelling results. Hence, neither a purely linear state equation nor a purely linear output equation is considered in what follows.

For the PNLSS model, we make a distinction between two choices for the nonlinear vectors ζ(t) and η(t). First, we take into account all nonlinear combinations of the states and the inputs (referred to as "full", ζ(t) = η(t) = ξ(t)^{3}, with ξ(t) = [x(t); u(t)]). Secondly, we consider only the nonlinear combinations of the states, without the inputs (referred to as "states only", ζ(t) = η(t) = x(t)^{3}).

Model Order   PNLSS "full"                 PNLSS "states only"
              nx=[2 3], ny=[2 3]           nx=[2 3], ny=[2 3]
              RMSE [mV]   Parameters       RMSE [mV]   Parameters
n=2           12.4        98               22.9        29
n=3           8.2         211              8.9         75
n=4           6.7         399              6.6         164
n=5           9.7         689              10.3        317

Table 6-5. Validation results for the PNLSS models with (left) all the nonlinear combinations, and (right) without nonlinear combinations using the input.

The validation RMSE of the estimated polynomial models is given in Table 6-5. The RMSE clearly decreases for higher model orders up to n=4. We also conclude that taking into account the nonlinear combinations of the input improves the RMSE on average, at the price of a significantly higher number of parameters. For the PNLSS approach, the best result is achieved using a "states only" 4th order model with degrees nx=[2 3], ny=[2 3], resulting in an RMSE of 6.6 mV. This is a reduction of the model error by a factor of 5 compared with the linear model. The validation error for the best nonlinear model is given in Figure 6-16 (grey), together with the measured output (black).

Figure 6-16. Validation result for the best nonlinear model: measured output (black) and model simulation error (grey).

Figure 6-17 shows the spectra of the measured validation output signal (black), the linear simulation error (light grey), and the nonlinear simulation error (dark grey). This plot illustrates that the nonlinear model squeezes down the model error over a broad frequency range.

Figure 6-17. DFT spectra of the measured validation output signal (black), linear simulation error (light grey), and nonlinear simulation error (dark grey).

Furthermore, state affine models of degree 3 and 4 are estimated. The validation results for these models are given in Table 6-6. A 4th order model of degree 4 yields the best result (13.5 mV). For this DUT, the PNLSS approach clearly performs better than the state affine approach.

Model Order   State Affine degree 3        State Affine degree 4
              RMSE [mV]   Parameters       RMSE [mV]   Parameters
n=2           26.5        59               23.2        98
n=3           17.7        99               17.1        167
n=4           13.8        149              13.5        254
n=5           14.5        209              30.2        359

Table 6-6. Validation results for different state affine models of degree 3 and 4.

By employing a 4th order PNLSS model, the simulation error on the validation set was reduced by more than a factor of 5 compared with the BLA: from 34 mV to 6.6 mV. This result should be compared with the noise level (1.8 mV), which can easily be determined since several periods of the measured data are available. Hence, it should be possible to reduce the model error by an additional factor of 3. However, this was not achieved with the PNLSS approach. Perhaps the nonlinear optimization got stuck in a local minimum, or the model order/degree should be increased further in order to obtain better results.
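The nonlinear vectors ζ(t) and η(t) used above are simply vectors of monomials. A small sketch of the "full" versus "states only" construction (the numeric values are toy data; the PNLSS notation x(t)^{3} collects all distinct monomials of degrees 2 and 3):

```python
import numpy as np
from itertools import combinations_with_replacement

def monomials(v, degrees):
    # all distinct monomials of the listed degrees in the entries of v;
    # degrees [2, 3] on v = [v0, v1] gives v0^2, v0*v1, v1^2, v0^3, ...
    out = []
    for d in degrees:
        for idx in combinations_with_replacement(range(len(v)), d):
            out.append(np.prod([v[i] for i in idx]))
    return np.array(out)

x = np.array([2.0, 3.0])   # states (toy values)
u = np.array([5.0])        # input (toy value)
zeta_states_only = monomials(x, [2, 3])                 # x(t)^{3}
zeta_full = monomials(np.concatenate([x, u]), [2, 3])   # xi(t)^{3}, xi = [x; u]
```

For two states, "states only" yields 7 monomials, while including one input ("full") yields 16, which is why the full models in Table 6-5 carry many more parameters.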


6.4 Quarter Car Set-up

6.4.1 Description of the DUT

In this test case, we study a quarter car set-up which is situated at the PMA Department of the KULeuven, and which was built by ir. Kris Smolders [72]. The set-up is a scale model of a car suspension, based on masses, springs, and the magneto-rheological damper that was modelled in section 6.3.

Figure 6-18. Quarter car set-up.

The system is excited by a hydraulic shaker which emulates, by means of a PID controller, the vertical road displacement. The reference signal for the position of the shaker serves as system input. The force over the damper is considered as the system output and is measured with a load cell, which is placed between the damper and the car mass. Taking into account the various interactions between the masses, springs, and damper, and the shaker dynamics, the expected model order is about six (for a more elaborate discussion, see [72]).

6.4.2 Description of the Experiments

K. Smolders applied two realizations of a full grid, random phase multisine (RPM), which was excited in a frequency band between 0.05 Hz and 10 Hz. Per multisine realization, 10 periods were measured, with 40 000 samples per period. Furthermore, a filtered Gaussian noise (GN) sequence with a linearly increasing RMS value over time was applied to the system. This signal consisted of about 280 000 data samples. The RMS values of the multisine and the noise sequence are 75 mV and 58 mV, respectively. Both data sets are shown in Figure 6-19. In the plot on the right side, the RMS value of the RPM sequence is given (light grey line), together with the RMS value of the GN data set (dark grey line). The latter is calculated per block of 8 000 samples. From this plot, we observe that the RMS value of the Gaussian noise sequence exceeds the RMS value of the multisine data set around t=100 s. For larger values of t, we end up in the extrapolation zone (grey block).

Figure 6-19. (a) Random Phase Multisine data set, and (b) Gaussian Noise data set. RMS RPM (light grey line), RMS GN (dark grey line), and extrapolation zone (grey block).

In all the measurements, the current applied to the semi-active damper was fixed to 1 A, and a sampling frequency f_s of 2000 Hz was used. Prior to the estimation, a slow DC trend that stems from the load cell sensor was removed from all the measured data, using linear detrending. Originally, the RPM data set was intended for the estimation, and the GN data set for the validation. However, this leads to poor modelling results, even when the GN sequence is only used up to t=100 s. A possible explanation for this is the fact that the spectrum of the GN is broader than the RPM’s spectrum (see Figure 6-20). Hence, we decided to interchange the roles of both data sets such that spectral extrapolation is avoided: the GN data set now serves for the estimation of the models, and the RPM data set for the validation.

Figure 6-20. DFT spectrum of (a) the RPM signal, and (b) the GN data set.
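The per-block RMS tracking used in Figure 6-19 can be sketched as follows (toy signal; 8 000-sample blocks as in the text):

```python
import numpy as np

def blockwise_rms(y, block_size):
    # RMS value per non-overlapping block, as used to track the slowly
    # increasing excitation level of the Gaussian noise sequence
    n = (len(y) // block_size) * block_size
    blocks = y[:n].reshape(-1, block_size)
    return np.sqrt(np.mean(blocks ** 2, axis=1))

# toy signal whose amplitude grows linearly with time
rng = np.random.default_rng(0)
t = np.linspace(0, 1, 40_000)
y = t * rng.standard_normal(40_000)
rms = blockwise_rms(y, 8_000)
```

The resulting block RMS values grow monotonically, mimicking the light-grey/dark-grey crossing used to locate the extrapolation zone.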

6.4.3 Best Linear Approximation

Since the GN data set is employed for the estimation of the models, the approach described in “Non Periodic Data” on p. 38 is used to determine the BLA. First, we compute the auto- and cross spectra with equation (2-24), with M = 34 blocks of 8 000 samples. Then, the BLA is obtained with equation (2-41) and its covariance with equation (2-43). From these data, 4th to 6th order linear models are estimated using a subspace method (see “Frequency Domain Subspace Identification” on p. 115), which is followed by a nonlinear optimization. Figure 6-21 shows the BLA (solid black line) and the total standard deviation (dashed black line), together with the 4th order linear model (solid grey line) and the amplitude of the complex model error (dashed grey line).

Figure 6-21. BLA of the quarter car set-up (solid black line); Total standard deviation (black dashed line); 4th order linear model (solid grey line); Model error (dashed grey line).

Next, the 4th order linear model is validated on the RPM data set. In Figure 6-22, the simulation error for the validation data is given (grey) together with the measured output (black). The RMS value of the model error (136 mV) is quite high compared with the RMS value of the measured output (285 mV). Hence, we will try to reduce this error with the PNLSS approach.

Figure 6-22. Validation result for the 4th order linear model: measured output (black) and model simulation error (grey).

6.4.4 Nonlinear Model

In what follows, we only show the results for the PNLSS models. State affine models were also estimated, but they are omitted here since they yielded poor results. In Table 6-7, the validation results are shown for PNLSS models of various orders, with a nonlinear state and output equation. Two kinds of models are discussed: models that contain all the nonlinear combinations of the states and the inputs (PNLSS, "full", ζ(t) = η(t) = ξ(t)^{3}), and models that only employ the nonlinear combinations of the states (PNLSS, "states only", ζ(t) = η(t) = x(t)^{3}). The entries indicated with "N.A." correspond to models which could not be estimated due to memory restrictions. In this regard, recall that the estimation data set consists of about 280 000 data samples.

Model Order   PNLSS "full"                 PNLSS "states only"
              nx=[2 3], ny=[2 3]           nx=[2 3], ny=[2 3]
              RMSE [mV]   Parameters       RMSE [mV]   Parameters
n=4           104         259              107         159
n=5           N.A.        N.A.             44          311
n=6           N.A.        N.A.             50          552

Table 6-7. Validation results for the PNLSS models with all the nonlinear combinations (left), and without nonlinear combinations with the input (right).

The best validation result is achieved by the 5th order model from the right hand side of the table, giving a simulation error of 44 mV (see Figure 6-23). Since the validation data are periodic and several periods were measured, we can compare this figure to the noise level at the output, which is 1.8 mV. Apparently, a significant amount of unmodelled dynamics is still present in the residuals, although the model error decreased by more than a factor of 3 compared with the linear model. Hence, this DUT is an example where the PNLSS approach delivers unsatisfying results. A higher model order or an increased nonlinear degree might improve the results, but the size of the data set prevents the estimation of such models due to memory restrictions. Another possible explanation for the poor result is that the polynomial approximation is not suited for this set-up, e.g. due to the presence of hard saturation in the DUT. In Figure 6-24, the spectra of the measured validation output signal (black), the linear simulation error (light grey), and the nonlinear simulation error (dark grey) are shown. From this plot, it can be seen that a significant model error reduction is achieved by the nonlinear model between DC and approximately 50 Hz.
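For reference, a PNLSS model of the type used in this chapter is simulated through the recursion x(t+1) = A x(t) + B u(t) + E ζ(t), y(t) = C x(t) + D u(t) + F η(t). A minimal simulation sketch with toy first order matrices and a cubic "states only" nonlinearity (all numeric values are hypothetical):

```python
import numpy as np

def simulate_pnlss(A, B, C, D, E, F, zeta, eta, u, x0):
    # x(t+1) = A x + B u + E zeta(x, u);  y(t) = C x + D u + F eta(x, u)
    x = np.array(x0, dtype=float)
    y = np.zeros(len(u))
    for t, ut in enumerate(u):
        ut = np.atleast_1d(ut)
        y[t] = (C @ x + D @ ut + F @ eta(x, ut))[0]
        x = A @ x + B @ ut + E @ zeta(x, ut)
    return y

# toy first order model with a cubic state nonlinearity
A = np.array([[0.5]]);  B = np.array([[1.0]])
C = np.array([[1.0]]);  D = np.array([[0.0]])
E = np.array([[-0.1]]); F = np.array([[0.0]])
cubic = lambda x, u: x ** 3
y = simulate_pnlss(A, B, C, D, E, F, cubic, cubic, 0.1 * np.ones(50), [0.0])
```

With a constant input, the toy model converges to the fixed point of 0.5x + 0.1x³ = 0.1 (x ≈ 0.199); the validation errors in this chapter are RMS differences between such simulated outputs and the measurements.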


Figure 6-23. Validation result for the best nonlinear model: measured output (black) and model simulation error (grey).


Figure 6-24. DFT spectra of the measured validation output signal (black), linear simulation error (light grey), and nonlinear simulation error (dark grey).


6.5 Robot Arm

6.5.1 Description of the DUT

In this case study, we will model a robot arm (see Figure 6-25) that was constructed by ir. Thomas Delwiche and his co-workers from the Control Engineering Department of the Université Libre de Bruxelles (ULB). The goal of his research is to design a controller for the robot arm such that it can be used for long-distance surgery. The manipulations carried out by a surgeon with the master robot arm should be repeated accurately by a slave device, and should give force feedback to the surgeon. T. Delwiche carried out experiments on the device in cooperation with the Department ELEC of the Vrije Universiteit Brussel. The robot arm rotates by means of a DC motor that is driven by a servo-amplifier, which incorporates a controller. The reference voltage sent to the servo-amplifier serves as input of the system. The input signal is proportional to the couple applied to the arm (1 V = 12.95 × 10^-3 Nm). The output of the system is the angle of the arm, measured with a 1024 counts per turn encoder connected to the motor shaft (1 V ≈ 90°). Furthermore, the speed of the arm is fed back through a controller in order to introduce damping in the system. This feedback loop is considered an intrinsic part of the DUT. When we neglect the dynamics of the DC motor and the nonlinear effects in the set-up, we expect a second order relationship between the couple applied to the arm and the resulting angle.

Figure 6-25. Robot arm.


6.5.2 Description of the Experiments

T. Delwiche performed several multisine experiments on the robot arm, using different RMS input levels and different bandwidths for the excitation signal. All experiments were performed at a sampling frequency of 10 MHz/2^14 ≈ 610.35 Hz. From all the available data, we selected a set of experiments in which the excitation signal has a bandwidth of 30 Hz and an RMS value of 80 mV. Ten realizations of a random odd, random phase multisine were applied to the DUT. Each realization consisted of two periods with 24 415 samples per period. We use eight of these realizations (a total of 195 320 samples) to estimate the models, and the remaining two realizations (48 830 samples) to validate them.

[Figure: amplitude [dB] and phase [º] versus frequency [Hz], 5-30 Hz]
Figure 6-26. BLA of the robot arm (solid black line); Total standard deviation (dashed black line); Measurement noise level (dotted black line); 3rd order linear model (solid grey line); Model error (dashed grey line).


6.5.3 Best Linear Approximation

First, we calculate the BLA with formula (2-30). Since periodic excitations were used and more than one period per realization was measured, it is possible to distinguish the nonlinear contributions from the measurement noise. Figure 6-26 shows the estimated BLA of the robot arm (solid black line). Furthermore, the total standard deviation σ̂_BLA(k), due to the combined effect of measurement noise and nonlinear distortions (dashed black line), and the standard deviation due to the measurement noise σ̂_n(k) (dotted black line) are also plotted. We see that the total standard deviation lies significantly higher than the measurement noise level, indicating that the nonlinear behaviour is dominant compared with the measurement noise. Next, a number of linear models is estimated with subspace techniques, followed by a nonlinear optimization of the cost function (5-101), using only the excited frequency lines. The best result is achieved by a 3rd order linear model. This model is also plotted in Figure 6-26 (solid grey line), together with the model error (dashed grey line). Although these models seem to fit the nonparametric Best Linear Approximation estimate well, they deliver poor validation results. Hence, a second nonlinear optimization is applied, this time using all the frequency lines, including DC. After this optimization, the 3rd order linear model is validated. The result is presented in Figure 6-27, which shows the measured output (black) and the model error (grey). The RMS error of this model is 34.6 mV, which should be compared with the RMS level of the output, 218 mV. We will now try to reduce this model error by estimating a number of nonlinear state space models.

[Figure: output amplitude [V] versus time [s], 0-80 s; RMSE: 34.6 mV]
Figure 6-27. Validation result for the 3rd order linear model: measured output (black) and model simulation error (grey).
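The BLA estimate and its two standard deviations can be sketched as follows for M realizations with P measured periods each. This is a generic implementation in the spirit of the variance analysis of Chapter 2; the exact expressions (2-30) and the covariance formulas in the thesis may differ in normalization details:

```python
import numpy as np

def bla_periodic(u_all, y_all, k):
    """Best Linear Approximation from periodic data. u_all, y_all have shape
    (M, P, N): M multisine realizations, P measured periods of N samples.
    k: indices of the excited FFT bins. Returns the BLA and the standard
    deviations due to noise alone and to noise plus nonlinear distortions."""
    M, P, _ = u_all.shape
    U = np.fft.fft(u_all, axis=2)[:, :, k]
    Y = np.fft.fft(y_all, axis=2)[:, :, k]
    G = Y / U                                # FRF of every measured period
    G_real = G.mean(axis=1)                  # average per realization
    G_bla = G_real.mean(axis=0)              # average over realizations
    # Period-to-period scatter -> measurement noise on the averaged FRF.
    s_noise = np.sqrt(G.var(axis=1, ddof=1).mean(axis=0) / (M * P))
    # Realization-to-realization scatter -> noise + nonlinear distortions.
    s_total = np.sqrt(G_real.var(axis=0, ddof=1) / M)
    return G_bla, s_noise, s_total
```

Comparing s_total with s_noise is exactly the check made in Figure 6-26: when the realization scatter dominates the period scatter, the stochastic nonlinear contributions dominate the measurement noise.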


6.5.4 Nonlinear Model

Again, we start by estimating some polynomial nonlinear state space models, using the linear models obtained in the previous step as starting values. The results are given in Table 6-8. The first pair of result columns shows models with a nonlinear state and output equation (ζ(t) = η(t) = ξ(t){3}); the second pair shows models that only have a nonlinear state equation (ζ(t) = ξ(t){3}, η(t) = 0). The best result is achieved by the 3rd order model with a nonlinear output equation (13.5 mV).

              PNLSS nx=[2 3], ny=[2 3]     PNLSS nx=[2 3], ny=[]
Model order   RMSE [mV]   Parameters       RMSE [mV]   Parameters
n = 2            23.2         53              24.1         37
n = 3            13.5        127              22.2         97
n = 4            24.3        259              16.3        209

Table 6-8. Validation results (RMSE and number of parameters) for the polynomial nonlinear state space models.

Furthermore, state affine models of degree 3 and 4 are estimated. The results are summarized in Table 6-9. Increasing the model order improves the RMSE, apart from some exceptions where the nonlinear optimization probably got stuck in a local minimum.

              State affine, degree 3       State affine, degree 4
Model order   RMSE [mV]   Parameters       RMSE [mV]   Parameters
n = 2            21.1         23              21.3         32
n = 3            23.8         39              23.5         55
n = 4            17.3         59              16.7         84
n = 5             7.6         83              18.2        119
n = 6             5.9        111               8.4        160
n = 7            12.3        143              15.1        207
n = 8             9.3        179              18.3        260
n = 9             5.3        219               6.0        319

Table 6-9. Validation results (RMSE and number of parameters) for state affine models of degree 3 and 4.
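As a side note, the parameter counts in Tables 6-8 and 6-9 can be reproduced with simple combinatorics. The sketch below assumes that n² parameters are discounted because a similarity transform of the state space leaves the input/output behaviour unchanged, and the state affine count is an empirical fit to the table; both are our assumptions, not the thesis' formal derivation, but the numbers match the tables:

```python
from math import comb

def n_monomials(m, degrees=(2, 3)):
    """Number of monomials of the listed total degrees in m variables."""
    return sum(comb(m + d - 1, d) for d in degrees)

def pnlss_params(n, full=True, nonlinear_output=True):
    """Parameter count of a SISO PNLSS model of order n, degrees [2 3].
    n*n parameters are discounted: a similarity transform of the state
    space leaves the input/output behaviour unchanged (our assumption)."""
    m = n + 1 if full else n                          # states (+ input for "full")
    nm = n_monomials(m)
    state = n * n + n + n * nm                        # A, B, E
    output = n + 1 + (nm if nonlinear_output else 0)  # C, D, F
    return state + output - n * n

def state_affine_params(n, degree):
    """Empirical count that matches Table 6-9: (p-1)*n^2 + 2*p*n + p."""
    return (degree - 1) * n * n + 2 * degree * n + degree
```

For example, pnlss_params(3) reproduces the 127 parameters of the best robot arm model, and state_affine_params(9, 3) the 219 parameters of the best state affine model.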


[Figure: output amplitude [V] versus time [s], 0-80 s; RMSE: 5.3 mV]
Figure 6-28. Validation result for the best nonlinear model: measured output (black) and model simulation error (grey).

For the robot arm, the state affine approach clearly yields better results than the PNLSS approach when it comes to minimizing the RMSE: the best RMSE achieved on the validation set is 5.3 mV versus 13.5 mV. The validation test of the best nonlinear model is shown in Figure 6-28. Compared with the linear model, the model error is reduced by almost a factor of 7: from 34.6 mV to 5.3 mV. Although this is a good result, the smallest validation error is still large compared with the noise level (0.4 mV). This indicates that there are still unmodelled dynamics in the residuals. Figure 6-29 shows the DFT spectra of the measured validation output signal (black), the linear simulation error (light grey), and the nonlinear simulation error (dark grey). The nonlinear model visibly reduces the model error over a broad spectral range. The remaining errors are concentrated around DC and stem from the (low frequency) drift problems observed during the measurements.

[Figure: amplitude [dBV] versus frequency [Hz], 0-50 Hz]
Figure 6-29. DFT spectra of the measured validation output signal (black), linear simulation error (light grey), and nonlinear simulation error (dark grey).


6.6 Wiener-Hammerstein

6.6.1 Description of the DUT

In this section, we will model an electronic circuit with a Wiener-Hammerstein structure, designed by Gerd Vandersteen [82] from the Vrije Universiteit Brussel, Department ELEC. The system is composed of a static nonlinear block sandwiched between two linear dynamic systems.

[Figure: block diagram — the input u passes through a linear dynamic block, a static nonlinearity f(·), and a second linear dynamic block to produce the output y]
Figure 6-30. Wiener-Hammerstein system.

The first linear system is a 3rd order Chebyshev low-pass filter with a 0.5 dB ripple and a pass band up to 4.4 kHz. The static nonlinearity is realized by resistors and a diode. The second linear system is a 3rd order inverse Chebyshev low-pass filter with a -40 dB stop band starting at 5 kHz.
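A minimal discrete-time sketch of this structure is given below. The filter specifications and the sampling frequency are taken from the description in this section; the diode characteristic is not specified, so an illustrative soft squaring term stands in for the static nonlinearity:

```python
import numpy as np
from scipy import signal

fs = 51200.0                                   # sampling frequency [Hz]
# First linear block: 3rd order Chebyshev type I low-pass,
# 0.5 dB ripple, pass band up to 4.4 kHz.
b1, a1 = signal.cheby1(3, 0.5, 4400.0, fs=fs)
# Second linear block: 3rd order inverse Chebyshev (type II) low-pass,
# 40 dB stop band starting at 5 kHz.
b2, a2 = signal.cheby2(3, 40.0, 5000.0, fs=fs)

def wiener_hammerstein(u):
    x = signal.lfilter(b1, a1, u)              # first linear dynamic block
    z = x + 0.2 * np.maximum(x, 0.0) ** 2      # static nonlinearity (illustrative)
    return signal.lfilter(b2, a2, z)           # second linear dynamic block

y = wiener_hammerstein(np.random.default_rng(0).standard_normal(4096) * 0.64)
```

Note that the asymmetric (one-sided) nonlinearity generates strong even-order distortions, consistent with the measured behaviour reported below.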

6.6.2 Description of the Experiments

The excitation signal consists of two parts: four periods of a random odd, random phase multisine with 16 384 samples per period, and about 170 000 data points of filtered Gaussian noise. Both signals have a bandwidth of 10 kHz and an RMS value of about 640 mV. The multisine is used for the estimation procedure, and the Gaussian noise for validation purposes. We performed the measurements at a sampling frequency of 51.2 kHz.

6.6.3 Level of Nonlinear Distortions

First, we analyse the level of nonlinear distortions from the multisine experiment. In Figure 6-31, the spectrum of the averaged output is shown. The solid black line represents the output at the excited lines. The grey circles and crosses denote the contributions at the odd and even detection lines, respectively. In order to improve the visibility of the figure, the number of plotted contributions on the detection lines is reduced. From Figure 6-31, it is clear that in the pass band the nonlinear distortions lie about 20 dB below the linear contributions. Furthermore, the even nonlinear distortions slightly dominate the odd nonlinear contributions. The standard deviation on the excited lines, which is a measure for the measurement noise, is also plotted (dashed black line). We see that in the pass band, the noise level is about 30 dB lower than the nonlinear distortion level.

[Figure: amplitude [dB] versus frequency [kHz], 0-10 kHz]
Figure 6-31. Averaged output spectrum: excited lines (solid black line); standard deviation (dashed black line); odd nonlinear distortions (grey circles); even nonlinear distortions (grey crosses).

6.6.4 Best Linear Approximation

We now calculate Ĝ_BLA(jω_k) using the multisine data set. Since several periods were measured, the variance due to the measurement noise σ̂²_n(k) is estimated using formula (2-17). To calculate the total variance σ̂²_BLA(k) (i.e., the combined effect of the nonlinear distortions and the measurement noise), we cannot use equation (2-21), because only one multisine realization was applied. However, the level of nonlinear distortions at the non-excited harmonic lines can be interpolated to the excited frequency lines, which allows us to calculate the total variance on the BLA. The BLA is plotted in Figure 6-32 (solid black line), together with the standard deviation σ̂_n(k) due to the measurement noise (dotted black line), and the total standard deviation σ̂_BLA(k) (dashed black line).


[Figure: amplitude [dB] and phase [º] versus frequency [kHz], 0-10 kHz]
Figure 6-32. BLA of the Wiener-Hammerstein circuit (solid black line); Total standard deviation (dashed black line); Measurement noise level (dotted black line); 6th order linear model (solid grey line); Model error (dashed grey line).

[Figure: output amplitude [V] versus time [s], 0-3 s; RMSE: 36.2 mV]
Figure 6-33. Validation result for the 6th order linear model: measured output (black) and model simulation error (grey).


Linear models of various orders are estimated using a subspace technique followed by a numerical optimization, both carried out in the frequency domain. The 6th order linear model yields the best result and is shown in Figure 6-32 (solid grey line), together with the model error (dashed grey line). Next, the 6th order linear model is validated using the Gaussian noise sequence. In Figure 6-33, the simulation error (grey) is plotted together with the measured output (black). The RMSE is quite high compared with the RMS level of the output: 36.2 mV versus 213 mV. Note also that the asymmetric behaviour of the model error is in agreement with the dominant even nonlinear behaviour of the system.

6.6.5 Nonlinear Model

The linear models obtained in the previous section are now used as starting values to estimate a number of polynomial nonlinear state space models. We estimate models of 4th, 5th, and 6th order. Two kinds of models are discussed: models that use all the nonlinear combinations of the states and the input (ξ(t){3}, "full"), and models that use the nonlinear combinations of the states only (x(t){3}, "states only"). We also verify whether the use of a linear or a nonlinear output equation influences the results. Table 6-10 shows the modelling results for the "full" PNLSS models.

              PNLSS "full" nx=[2 3], ny=[2 3]   PNLSS "full" nx=[2 3], ny=[]
Model order   RMSE [mV]   Parameters            RMSE [mV]   Parameters
n = 4            9.60        259                   8.33        209
n = 5            3.70        473                   3.61        396
n = 6            3.32        797                   3.21        685

Table 6-10. Validation results for the "full" PNLSS models.

In Table 6-11, the validation results are shown for models that do not use the input in the nonlinear combinations. Since these models contain fewer nonlinear terms, the number of required parameters is significantly lower. However, the RMSE values are always higher than the corresponding entries in Table 6-10. Taking a closer look at Table 6-10, we observe that the models with a linear output equation (η(t) = 0, in the right columns of the table) always yield better results than the models with a nonlinear output equation. The best PNLSS model is the 6th order model with a linear output equation that uses all the nonlinear combinations of the states and the input up to degree [2 3]; it has a validation RMSE of 3.21 mV.

              PNLSS "states only" nx=[2 3], ny=[2 3]   PNLSS "states only" nx=[2 3], ny=[]
Model order   RMSE [mV]   Parameters                   RMSE [mV]   Parameters
n = 4           12.0         159                         12.18        129
n = 5            4.18        311                          3.94        261
n = 6            3.65        552                          3.23        475

Table 6-11. Validation results for the "states only" PNLSS models.

Next, we estimate state affine models of various orders and of degree 3 and 4. Looking at Table 6-12, which shows the validation RMSE for the state affine approach, we observe a similar trend as with the robot arm data: the RMSE diminishes smoothly as the model order increases.

              State affine, degree 3       State affine, degree 4
Model order   RMSE [mV]   Parameters       RMSE [mV]   Parameters
n = 5            7.35         83              6.98        119
n = 6            5.10        111              4.64        160
n = 7            4.41        143              3.82        207
n = 8            3.86        179              3.13        260
n = 9            3.66        219              3.01        319
n = 10           3.56        263              2.60        384

Table 6-12. Validation results for state affine models of degree 3 and 4.

The best validation result is achieved by the 10th order model of degree 4, with a RMSE of 2.6 mV. The simulation error of this model is plotted in Figure 6-34 (grey), together with the measured output signal (black).


[Figure: output amplitude [V] versus time [s], 0-3 s; RMSE: 2.6 mV]
Figure 6-34. Validation result for the best nonlinear model: measured output (black) and model simulation error (grey).

Figure 6-35 shows the spectra of the measured validation output signal (black), the linear simulation error (light grey), and the nonlinear simulation error (dark grey). In the pass band of the device, the nonlinear model pushes the model error down by about 20 dB. Beyond 5 kHz, no significant difference between the linear and the nonlinear model error can be observed.

[Figure: amplitude [dBV] versus frequency [kHz], 0-25 kHz]
Figure 6-35. DFT spectra of the measured validation output signal (black), linear simulation error (light grey), and nonlinear simulation error (dark grey).


6.6.6 Comparison with a Block-oriented Approach

In [68], the same measurements were used to model this electronic circuit using a block-oriented approach. Both linear blocks of the Wiener-Hammerstein model were identified as a 6th order linear model. The static nonlinearity was parametrized as a 9th degree polynomial. Taking into account two exchangeable gains between the three blocks, this results in a total of 34 parameters. Furthermore, the RMS value of the simulation error of this model is 3.8 mV. Hence, we see that this error is reduced by more than 30% to 2.6 mV using the PNLSS/state affine approach, at the cost of a significantly higher number of parameters.


6.7 Crystal Detector

6.7.1 Description of the DUT

The last modelling challenge discussed in this chapter is an Agilent-HP420C crystal detector (see Figure 6-36). This kind of device is often used in microwave applications to measure the envelope of a signal. The RF connection of the crystal detector (the left part in Figure 6-36) serves as input of the DUT; the video connection of the detector (right part) is considered as output of the system. From the physical, block-oriented model proposed in [63], we expect to find a second order relationship between the input and the output of this device.

[Figure: photograph of the crystal detector]
Figure 6-36. Agilent-HP crystal detector.

6.7.2 Description of the Experiments

Ir. Liesbeth Gommé from the ELEC Department at the Vrije Universiteit Brussel carried out the experiments with the crystal detector. She applied two filtered Gaussian noise sequences of 50 000 samples with an RMS value that grows as a function of time. Both signals are superimposed on a DC level of 117 mV, and have a total RMS value of 118 mV. The input signal of the first data set had a bandwidth of 800 kHz and will be used for estimation purposes (Figure 6-37 (a)). The second data set had a bandwidth of 400 kHz and will serve as validation data set (Figure 6-37 (b)). The sampling frequency f_s used in the experiments was 10 MHz. Each sequence was repeated and measured 5 times. This allows us to average the data and to compute the standard deviation of the noise on the measurements (0.23 mV at the input, and 0.24 mV at the output of the system).

[Figure: amplitude [V] versus time [ms], 0-4 ms; panels (a) estimation data set, (b) validation data set]
Figure 6-37. Estimation and validation input sequences.

6.7.3 Best Linear Approximation

To calculate the BLA, we first average the 5 measured periods of the estimation data set in order to reduce the measurement noise. We then split the data into M = 10 subblocks of 5000 samples. With equations (2-25) and (2-27), we calculate the estimated BLA together with its covariance σ̂²_BLA(k). In Figure 6-38, Ĝ_BLA(jω_k) (solid black line) and the total standard deviation σ̂_BLA(k) (dashed black line) are plotted.

[Figure: amplitude [dB] and phase [º] versus frequency [kHz], 0-900 kHz]
Figure 6-38. BLA of the crystal detector (solid black line); Total standard deviation (dashed black line); 3rd order linear model (solid grey line); Model error (dashed grey line).

Next, a number of linear models is estimated using frequency domain subspace identification, followed by a nonlinear optimization. The 3rd order linear model that was obtained is also plotted in Figure 6-38 (solid grey line), together with the model error (dashed grey line). This linear model is then validated; the result is shown in Figure 6-39. The black line denotes the measured output signal; the grey line represents the model error. The RMSE of the linear model is 0.89 mV, whereas the RMS value of the output sequence, without the DC offset, is 15 mV. In the next step, we will try to reduce the model error using a nonlinear model.

[Figure: output amplitude [V] versus time [ms], 0-4 ms; RMSE: 0.89 mV]
Figure 6-39. Validation result for the 3rd order linear model: measured output (black) and model simulation error (grey).
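For the Gaussian noise data used here, a subblock-averaged FRF estimate can be sketched as follows. This is a textbook cross-/auto-spectrum average in the spirit of equations (2-25) and (2-27); the thesis' exact estimator may differ in windowing and bias-correction details, and the FIR system in the example is purely synthetic:

```python
import numpy as np

def frf_subblocks(u, y, M):
    """Averaged FRF estimate from M subblocks: G = <Y U*> / <U U*>."""
    N = len(u) // M
    U = np.fft.fft(u[: M * N].reshape(M, N), axis=1)
    Y = np.fft.fft(y[: M * N].reshape(M, N), axis=1)
    Suy = (Y * U.conj()).mean(axis=0)        # averaged cross-spectrum
    Suu = (U * U.conj()).mean(axis=0).real   # averaged auto-spectrum
    return Suy / Suu

rng = np.random.default_rng(7)
u = rng.standard_normal(50_000)
y = np.convolve(u, [1.0, -0.4], mode="full")[: len(u)]   # toy FIR system
G = frf_subblocks(u, y, M=10)
```

Averaging over subblocks trades frequency resolution for a lower variance of the FRF estimate, which is the same trade-off made with M = 10 subblocks of 5000 samples above.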

6.7.4 Nonlinear Model

We start by estimating some polynomial nonlinear state space models. We check whether a linear or a nonlinear output equation needs to be utilized, and whether the input needs to be included in the nonlinear combinations. Table 6-13 shows the modelling results when all nonlinear combinations of the states and the input are used ("full" PNLSS, using ξ(t){3}). The left and right column pairs show the results for a nonlinear and a linear output equation, respectively.

              PNLSS "full" nx=[2 3], ny=[2 3]   PNLSS "full" nx=[2 3], ny=[]
Model order   RMSE [mV]   Parameters            RMSE [mV]   Parameters
n = 3           0.267        127                  0.780         97
n = 4           0.260        259                  0.308        209
n = 5           0.367        473                  0.555        396

Table 6-13. Validation results for the "full" PNLSS models.

Next, we estimate polynomial nonlinear state space models that only use nonlinear combinations of the states ("states only" PNLSS, using x(t){3}). The validation results for these models are summarized in Table 6-14.

              PNLSS "states only" nx=[2 3], ny=[2 3]   PNLSS "states only" nx=[2 3], ny=[]
Model order   RMSE [mV]   Parameters                   RMSE [mV]   Parameters
n = 3           0.580         71                         0.743         55
n = 4           0.277        159                         0.397        129
n = 5           0.876        311                         0.440        261

Table 6-14. Validation results for the "states only" PNLSS models.

Clearly, the best model structure is the one in the left columns of Table 6-13: these models have a nonlinear state and output equation, and use all nonlinear combinations of the states and the input. The best PNLSS model is the 4th order model in the left columns of Table 6-13, with a RMSE of 0.260 mV. Taking into account the input and output noise levels on the averaged data (around 0.23 mV), this is a satisfying result.

Next, we try the state affine approach. The results for state affine models of degree 3 and 4 are shown in Table 6-15. The results here are even better than for the polynomial nonlinear state space approach: lower validation RMSEs are obtained with a lower number of parameters.

              State affine, degree 3       State affine, degree 4
Model order   RMSE [mV]   Parameters       RMSE [mV]   Parameters
n = 3           0.275         39              0.266         55
n = 4           0.259         59              0.260         84
n = 5           0.264         83              0.261        119
n = 6           0.262        111              0.271        160
n = 7           0.278        143              0.266        207
n = 8           0.264        179              0.264        260

Table 6-15. Validation results for state affine models of degree 3 and 4.

These excellent results seem to contradict the poor performance of the state affine approach in the Silverbox case study. Both the Silverbox and the crystal detector have been identified as Nonlinear Feedback systems; hence, we would expect a similar modelling performance. However, there is an important difference between both systems: the crystal detector shows a significantly lower amount of dynamics between its input and output than the Silverbox (Figure 6-3 versus Figure 6-38). Consequently, there is a high resemblance between the input and output signals of the crystal detector. Therefore, the fact that the nonlinear behaviour is mainly present at the output poses fewer difficulties for the state affine approach (for which the approximation relies on polynomials of the input). The best result obtained with the state affine approach is the 4th order model of degree 3, with a validation RMSE of 0.259 mV. This model was used to generate Figure 6-40, which shows the measured validation output signal (black) and the simulation error of the best nonlinear model (grey).
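For reference, one common form of the state affine model evolves linearly in the state but with polynomially input-dependent coefficients; the generic simulator below illustrates that structure (the exact parametrization used in the thesis may differ, and the matrices in the test are arbitrary):

```python
import numpy as np

def simulate_state_affine(A, b, c, d, u, x0=None):
    """Simulate  x(t+1) = sum_i A[i] x(t) u(t)^i + sum_i b[i] u(t)^i,
                 y(t)   = sum_i c[i] x(t) u(t)^i + sum_i d[i] u(t)^i.
    A, b, c, d are lists of coefficient arrays, one entry per power of u."""
    n = A[0].shape[0]
    x = np.zeros(n) if x0 is None else np.asarray(x0, dtype=float)
    y = np.empty(len(u))
    n_pow = max(len(A), len(b), len(c), len(d))
    for t, ut in enumerate(u):
        powers = [ut ** i for i in range(n_pow)]
        y[t] = sum(ci @ x * powers[i] for i, ci in enumerate(c)) \
             + sum(di * powers[i] for i, di in enumerate(d))
        x = sum(Ai @ x * powers[i] for i, Ai in enumerate(A)) \
          + sum(bi * powers[i] for i, bi in enumerate(b))
    return y
```

With a single constant A matrix and input-proportional b and d terms, the model reduces to an ordinary linear state space model, which is a convenient sanity check.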

[Figure: output amplitude [V] versus time [ms], 0-4 ms; RMSE: 0.259 mV]
Figure 6-40. Validation result for the best nonlinear model: measured output (black) and model simulation error (grey).


[Figure: error amplitude [mV] versus time [ms], 0-4 ms]
Figure 6-41. Simulation error in the validation test for the best linear (light grey) and best nonlinear (dark grey) model.

In Figure 6-41, the model errors of the best linear and best nonlinear model are plotted. The residuals of the linear model mainly consist of a DC offset and large asymmetric spikes at the end of the sequence, which are not present in the nonlinear model error. Both phenomena indicate that an even nonlinear behaviour is present in the DUT. This is consistent with the practical use of the DUT: a squaring characteristic is required for AM demodulation. Furthermore, Figure 6-42 shows the DFT spectra of the measured validation output (black), the linear (light grey), and the nonlinear (dark grey) simulation error. From this plot, we observe that the nonlinear model reduces the model error in the frequency band between DC and 1 MHz.

[Figure: amplitude [dB] versus frequency [MHz], 0-5 MHz]
Figure 6-42. DFT spectra of the measured validation output signal (black), linear simulation error (light grey), and nonlinear simulation error (dark grey).

6.7.5 Comparison with a Block-oriented Approach

From the analysis of its internal structure, it became clear that the crystal detector can be represented by a Nonlinear Feedback model. Hence, a block-oriented Nonlinear Feedback model was estimated for this device in [63], using the same measured data. Contrary to the Nonlinear Feedback model from Chapter 4, the approach in [63] allows a dynamic nonlinearity in the feedback branch. This nonlinearity takes the form of a Wiener-Hammerstein structure (see Figure 5-7). In order to separate the feedforward and feedback dynamics during the estimation, an excitation signal with a linearly increasing amplitude as a function of time is required. The feedforward branch of the block-oriented model was identified as a 1st order linear system. The Wiener-Hammerstein structure in the feedback branch consisted of a 1st order linear system, followed by a static nonlinearity that was parametrized as a 9th degree polynomial; the second linear system was a simple gain factor. Taking into account the exchangeability of gain factors between the different blocks, the total number of parameters of the block-oriented model is 13. The simulation error achieved in the validation test was 0.30 mV. Hence, the PNLSS and the state affine approaches perform better, at the price of a considerably higher number of parameters. The added value of a block-oriented approach is illustrated in [28]: in this paper, the block-oriented model, which is estimated in the base band, is used to predict the behaviour of the device in the RF band.


CHAPTER 7

CONCLUSIONS

Chapter 7: Conclusions

The main goal of this thesis was to study and to model the behaviour of nonlinear systems, using the Best Linear Approximation and multisine excitation signals as intermediate tools.

These tools were successfully applied to the qualification and quantification of DSP non-idealities. This was demonstrated in Chapter 3 with some practical examples, including finite word length effects in digital filtering, and the coding/decoding process of an audio compression codec. The main advantage of the presented method resides in its simplicity and its general applicability: no elaborate, deep theoretical analysis is required, and the method can be applied regardless of the functionality of the DSP algorithm.

Chapter 4 revealed yet another example of how the Best Linear Approximation can be adopted in the identification of block-oriented models. A straightforward identification procedure was presented to identify a specific class of Nonlinear Feedback models. It was successfully applied to real measurements from a physical device.

In Chapter 5, we have shown that when a general, black box model of a nonlinear device is required, the PNLSS model is a perfect tool to achieve this goal. First, we illustrated the extreme flexibility of this model by proving an explicit link with several popular block-oriented models. In addition, some examples of non-Fading Memory systems were shown to have a PNLSS representation. Secondly, the ability to cope with multivariable systems is included in a very natural way. If the user is not interested in a physical interpretation of the model parameters, then block-oriented models can be considered obsolete: the PNLSS model encompasses them all, at the price of a higher number of parameters. Furthermore, the PNLSS identification procedure is straightforward and consists of only three simple steps: (1) compute the Best Linear Approximation, (2) estimate a linear model, and (3) solve a standard nonlinear optimization problem.
The seven successful practical applications from Chapter 6 confirm the flexibility of the PNLSS model. They demonstrate that the identification procedure works very well in practice, and that it is robust for both small (e.g. combine harvester) and large (e.g. quarter car set-up) data sets. In each of the test cases, a significant model error reduction was achieved compared with the linear models. One of the main advantages of the PNLSS approach is that the user does not have to choose difficult identification settings, such as the number of input and output time lags, the number of neurons, or the hyperparameter values. An estimate of the model order is determined easily from the BLA in step (2). Although it is possible to improve the model by tweaking the nonlinear state and output equations, the standard full PNLSS model usually delivers satisfying results with a moderate nonlinear degree.


Naturally, there are limitations to the proposed approach. First of all, the PNLSS model is only suitable to handle low order systems (as a rule of thumb: n_a < 10). For higher order systems, the combinatorial explosion of the number of parameters becomes too restrictive to obtain good modelling results. This problem can be overcome, for instance, by restricting the number of states included in the polynomial expansion, or by imposing a linear relation for a part of the states. The second disadvantage resides in the nonlinear search during the estimation of the model parameters: the risk of getting trapped in a local minimum is always imminent. However, it must be said that this weak spot is common to many identification methods, even in the case of linear modelling. Finally, it is difficult to guarantee the stability of the estimated models. But again, the risk of instability is inherent to any recursive model. Sometimes it is possible to add constraints in order to keep the model stable, but this may have a negative influence on the modelling performance. These three limitations constitute interesting topics for future research.

We conclude that the common denominator in this thesis is the KISS principle [87]: Keep it Simple and Stupid, or rather, Successful. The user is not interested in overwhelming and complicated mathematical models, or sophisticated identification procedures. He/she does not want to go astray in the many identification options that are difficult to understand. Hence, one of the main contributions of this thesis is the PNLSS approach: a simple but robust tool that can be used to model a broad class of nonlinear systems. To put the KISS concept into action, we recommend the following work flow:

1. Excite the DUT with a broadband excitation signal that corresponds to normal operating conditions (bandwidth and RMS value). Preferably, apply several realizations of a random phase multisine, with two periods per realization.

2. Use the input/output measurements to calculate the BLA, as explained in Chapter 2.

3. Estimate a linear model from the BLA and its measured covariance. Select the linear model with the lowest order for which there are no significant systematic model errors.

4. Estimate the standard (full) PNLSS model with a nonlinear degree [2 3] in both the state and the output equation, and the model order determined in the previous step. Use all but one multisine realization in this estimation step; the remaining multisine is then used for validation.



PUBLICATION LIST

Journal papers

• J. Paduart, J. Schoukens, Y. Rolain. Fast Measurement of Quantization Distortions in DSP Algorithms. IEEE Transactions on Instrumentation and Measurement, vol. 56, no. 5, pp. 1917-1923, 2007.

Conference papers

• J. Paduart, J. Schoukens. Fast Identification of systems with nonlinear feedback. Proceedings of the 6th IFAC Symposium on Nonlinear Control Systems, Stuttgart, Germany, pp. 525-529, 2004.

• J. Paduart, J. Schoukens, L. Gommé. On the Equivalence between some Block-oriented Nonlinear Models and the Nonlinear Polynomial State Space Model. Proceedings of the IEEE Instrumentation and Measurement Technology Conference, Warsaw, Poland, pp. 1-6, 2007.

• J. Paduart, J. Schoukens, R. Pintelon, T. Coen. Nonlinear State Space Modelling of Multivariable Systems. Proceedings of the 14th IFAC Symposium on System Identification, Newcastle, Australia, pp. 565-569, 2006.

• J. Paduart, J. Schoukens, K. Smolders, J. Swevers. Comparison of two different nonlinear state-space identification algorithms. Proceedings of the International Conference on Noise and Vibration Engineering, Leuven, Belgium, pp. 2777-2784, 2006.

• J. Schoukens, J. Swevers, J. Paduart, D. Vaes, K. Smolders, R. Pintelon. Initial estimates for block structured nonlinear systems with feedback. Proceedings of the International Symposium on Nonlinear Theory and its Applications, Brugge, Belgium, pp. 622-625, 2005.

• J. Schoukens, R. Pintelon, J. Paduart, G. Vandersteen. Nonparametric Initial Estimates for Wiener-Hammerstein Systems. Proceedings of the 14th IFAC Symposium on System Identification, Newcastle, Australia, pp. 778-783, 2006.

• T. Coen, J. Paduart, J. Anthonis, J. Schoukens, J. De Baerdemaeker. Nonlinear system identification on a combine harvester. Proceedings of the American Control Conference, Minneapolis, Minnesota, USA, pp. 3074-3079, 2006.