COMPUTATIONAL EXPERIENCE WITH ALGORITHMS FOR SEPARABLE NONLINEAR LEAST SQUARES PROBLEMS (1)

C. CORRADI (2) - L. STEFANINI (3)

ABSTRACT. Nonlinear least squares problems frequently arise in which the fitting function can be written as a linear combination of functions involving further parameters in a nonlinear manner. This paper outlines an efficient implementation of an iterative procedure originally developed by Golub and Pereyra and subsequently modified by various authors, which takes advantage of the linear-nonlinear structure, and investigates its performance on various test problems as compared with the standard Gauss-Newton and Gauss-Newton-Marquardt schemes.

1. Introduction. The last several years have witnessed an increasing interest in separable nonlinear least squares problems, that is, nonlinear least squares problems in which the parameters to be solved for can be separated into a linear part and a nonlinear part. Typical applications are least squares approximations by exponential or rational functions. Various techniques have been proposed which take advantage of the special structure of this kind of model: they generally consist of two-stage iterative processes, each stage dealing with one set of parameters. For a comprehensive bibliography the reader should refer to [5]; see also [7], [12], [11], [1] and references cited there.

Received May 25, 1977. (1) A preliminary version of this note was presented at the CNR-GNIM meeting held in Florence, September 1976. (2) CNEN, Centro di Calcolo, and Istituto di Scienze Economiche, Università di Bologna. (3) SOGESTA s.p.a., Urbino.


The purpose of this note is to compare the performance of the generalized Gauss-Newton (GN) and Gauss-Newton-Marquardt (GNM) algorithms, modified for separable problems along the lines suggested by Golub and Pereyra [5] and Kaufman [7], with the corresponding standard GN and GNM methods on a set of test problems commonly used for numerical experiments. In Section 2 we outline an efficient implementation of these algorithms, while in Section 3 we discuss some experimental results.

2. Gauss-Newton and Gauss-Newton-Marquardt algorithms for separable nonlinear least squares. The problem of interest can be stated briefly as follows: determine the parameter vectors a ∈ Rᵐ, b ∈ Rⁿ to minimize the functional

(1)    F(b, a) = ||X(a) b + f(a) − y||²,

where y ∈ Rᵀ is a given vector of observations on the dependent variable, X(a) is a given T × n matrix of observations on the independent variables whose elements are functions of the parameter vector a only, and f is a given vector function of a, (possibly) involving some of the independent variables. All the functions are assumed to be continuously differentiable with respect to a. We shall refer to b and a as the linear and nonlinear parameters respectively. The standard approach is to minimize (1) with respect to the vector [b', a']' ∈ R^{n+m} as a whole, that is, all parameters are treated as nonlinear. The technique proposed by Golub and Pereyra is based on the observation that, for any fixed a, the optimal choice for b is

(2)    b̂ = X⁺(a) v(a),

where v(a) = y − f(a), and X⁺ denotes the Moore-Penrose inverse of X. Substituting in (1) yields a problem in the nonlinear parameters only:

(3)    ||[X(a) X⁺(a) − I] v(a)||² ≡ ||r(a)||².

The two approaches (1), (3) are shown to be equivalent under the assumption that X(a) has constant rank for a belonging to some open set Ω containing the desired solution. In what follows we shall assume that X(a) has full rank for a ∈ Ω.
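The reduction (2)-(3) can be illustrated with a short numerical sketch (numpy; the model, data and all names below are illustrative assumptions, not taken from the paper). For a fixed a, the optimal linear parameters come from a linear least squares solve, and the reduced residual of (3) coincides with the residual of (1) evaluated at b̂:

```python
import numpy as np

# Illustrative sketch of (2)-(3): a toy separable model
# y ≈ b1*exp(-a*t) + b2, so X(a) = [exp(-a*t), 1] and f(a) = 0.
t = np.linspace(0.0, 3.0, 20)
rng = np.random.default_rng(0)
y = 2.0 * np.exp(-1.5 * t) + 0.5 + 0.01 * rng.standard_normal(t.size)

def X(a):
    return np.column_stack([np.exp(-a * t), np.ones_like(t)])

a = 1.2          # any fixed value of the nonlinear parameter
v = y            # v(a) = y - f(a), and f = 0 for this model

# (2): optimal linear parameters for this fixed a, b_hat = X^+(a) v(a)
b_hat, *_ = np.linalg.lstsq(X(a), v, rcond=None)

# (3): the reduced residual ||[X X^+ - I] v||^2 equals F(b_hat, a) from (1)
P = X(a) @ np.linalg.pinv(X(a))              # the projector X(a) X^+(a)
reduced = np.linalg.norm((P - np.eye(t.size)) @ v) ** 2
full = np.linalg.norm(X(a) @ b_hat - v) ** 2
print(abs(reduced - full))
```

Minimizing the scalar function of a alone (with any nonlinear least squares method) then recovers b through (2), which is the point of the separation.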


Problem (3) can be solved by some nonlinear least squares method, e.g. the GN algorithm:

(4)    a_{i+1} = a_i − γ_i [D(r(a_i))]⁺ r(a_i) = a_i − γ_i h_i,

or the GNM algorithm:

(5)    a_{i+1} = a_i − [D(r(a_i)); √ν_i I]⁺ [r(a_i); 0]

(semicolons denote row-wise stacking of blocks), where [5]

D(r(a)) = (X(a) X⁺(a) − I) [D(X(a)) X⁺(a) v(a) + D(f(a))] + [(X(a) X⁺(a) − I) D(X(a)) X⁺(a)]' r(a).

The procedure can be improved in the following way. Let us first construct an orthogonal factorization of X, e.g. using the Householder transformations,

(6)    X(a) S = Q' [R1; 0],

for any fixed a, where Q is T × T orthogonal, R1 is n × n upper triangular, and S is an n × n permutation matrix. Then we have

X X⁺ − I = −Q' [0, 0; 0, I_{T−n}] Q

and, by the well known isometric property of orthogonal matrices,

||[X(a) X⁺(a) − I] v(a)|| = ||Q2(a) v(a)|| ≡ ||r2(a)||,

where

Q = [Q1; Q2], with Q1 the first n rows and Q2 the last T − n rows of Q.
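This isometry is easy to check numerically. The sketch below (numpy, with illustrative random data) builds a full orthogonal factorization of a full-rank X, takes Q2 as the last T − n rows of the paper's Q, and verifies that ||(X X⁺ − I) v|| = ||Q2 v||:

```python
import numpy as np

# Illustrative check: with Q X = [R1; 0] from a complete QR factorization,
# the reduced residual norm ||Q2 v|| equals ||(X X^+ - I) v||.
rng = np.random.default_rng(1)
T, n = 12, 3
Xmat = rng.standard_normal((T, n))     # a full-rank T x n stand-in for X(a)
v = rng.standard_normal(T)             # a stand-in for v(a)

Qc, _ = np.linalg.qr(Xmat, mode="complete")   # Xmat = Qc [R1; 0]
Q = Qc.T                                      # the paper's Q: Q Xmat = [R1; 0]
Q2 = Q[n:, :]                                 # last T - n rows of Q

lhs = np.linalg.norm((Xmat @ np.linalg.pinv(Xmat) - np.eye(T)) @ v)
rhs = np.linalg.norm(Q2 @ v)                  # = ||r2|| in the paper's notation
print(abs(lhs - rhs))
```

The practical gain is that r2 is obtained from the factorization already at hand, with no explicit projector.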

The corresponding GN or GNM iterations, which have been proposed by Krogh [8], are obtained from (4), (5) by using r2(a), Q2 D(r(a)) in place of r(a), D(r(a)). This modification yields a slight reduction in the amount of computation required per iteration. A further improvement is described by Kaufman [7], who suggests ignoring the second term in the expression of Q2 D(r(a)), so that the iterative schemes reduce to

(4')    a_{j+1} = a_j − γ_j [W2(a_j)]⁺ r2(a_j),


and

(5')    a_{j+1} = a_j − [W2(a_j); √ν_j I]⁺ [r2(a_j); 0],

respectively; here we have denoted

W2(a) = Q2(a) [D(X(a)) X⁺(a) v(a) + D(f(a))].

The resulting algorithms, which obviously require less computational effort, are shown [12] to have roughly the same asymptotic convergence rate as those proposed by Golub and Pereyra. Following [12], we shall refer to (4') and (5') as the restricted GN and GNM algorithms respectively. Let us now describe a recommended computational scheme for the restricted and unrestricted (or full) GN and GNM methods.

i) Restricted GN and GNM.

STEP 1. Choose an initial guess for a.

STEP 2. Compute the factorization (6). Then b̂(a) = X⁺(a) v(a) is computed by solving the upper triangular system R1 b̃ = d1, where

Q v = [d1; d2], d1 ∈ Rⁿ, d2 ∈ R^{T−n},

and setting b̂ = S b̃.

REMARK. This step corresponds to the least squares solution of the linear problem

||X(a) b − v(a)||²

for fixed a.

STEP 3. Compute

W = Q [D(X) b̂ + D(f)] = [W1; W2], with W1 the first n rows and W2 the last T − n rows.

REMARK. The explicit computation of W1 is not needed unless one wants to evaluate the estimated variance-covariance matrix of the parameters (for details see [5]).
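STEPs 2 and 3 can be sketched as follows for a one-exponential model (an illustrative assumption: y ≈ b·exp(−a·t), f = 0, and the permutation S taken as the identity, which is harmless for this single-column X):

```python
import numpy as np

# Sketch of STEPs 2-3 (illustrative model and data, S = I):
t = np.linspace(0.0, 2.0, 10)
y = 3.0 * np.exp(-0.8 * t)
a = 1.0                                    # current nonlinear iterate
n = 1

Xa = np.exp(-a * t).reshape(-1, 1)         # X(a), T x n
DXa = (-t * np.exp(-a * t)).reshape(-1, 1) # D(X(a)) = dX/da
v = y                                      # v(a) = y - f(a), f = 0

# STEP 2: factorization (6), then b_hat from the triangular system R1 b = d1
Qc, R1 = np.linalg.qr(Xa, mode="complete")
Q = Qc.T
d = Q @ v                                  # Q v = [d1; d2]
b_hat = np.linalg.solve(R1[:n, :n], d[:n])
r2 = d[n:]                                 # r2(a) = Q2 v(a)

# STEP 3: W = Q [D(X) b_hat + D(f)]; its last T - n rows form W2
W = Q @ (DXa * b_hat)                      # D(f) = 0 for this model
W2 = W[n:]
print(np.linalg.norm(Xa @ b_hat - v) - np.linalg.norm(r2))
```

Note that D(X) b̂ equals D(X) X⁺ v here because b̂ = X⁺ v, so this W2 is exactly the W2(a) of the restricted iterations (4') and (5').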


STEP 4. Perform an iteration according to (4') or (5'). Using the Householder transformations again, obtain

W2 S̃ = Q̃' [R2; 0],

where Q̃ is (T − n) × (T − n) orthogonal, R2 is m × m upper triangular, and S̃ is an m × m permutation matrix. Then solve the upper triangular system R2 h̃ = −d̃1, where

Q̃ r2 = [d̃1; d̃2], d̃1 ∈ Rᵐ, d̃2 ∈ R^{T−n−m},

and put h = S̃ h̃. This is the correction term for GN. For the GNM iteration it is necessary to complete the decomposition of the augmented jacobian

(7)    [W2; √ν_j I]

according to the scheme

(8)    [W2; √ν_j I] S̃ = Q̄' [R̄2; 0],

where Q̄ is (T − n + m) × (T − n + m) orthogonal and R̄2 is m × m upper triangular. Then solve the upper triangular system R̄2 h̃ = −d̄1, with [d̄1; d̄2] = Q̄ [r2; 0], and set h = S̃ h̃.

REMARK 1. The two stage orthogonal decomposition (8), originally described by Jennings and Osborne [6], requires less computational effort than the standard one stage factorization. Moreover, it permits the calculation of the Moore-Penrose inverse of the matrix (7) for different values of ν without complete refactorization of the matrix itself. Note that pivoting is not needed for


the second stage of (8), since as a result of the first stage the Euclidean norms of the columns of the augmented jacobian turn out to be in descending order.

REMARK 2. For the actual implementation of the generalized GN and GNM algorithms we need to compute a) the steplength γ and b) the damping factor ν respectively. Convenient schemes are the following.

a) Set γ = 1 (full step GN). Then if ||r2(a − γh)||² ≥ ||r2(a)||² put γ
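The corrections of STEP 4 can be checked against direct least squares solves. The sketch below (numpy, with random stand-ins for W2 and r2 and the permutation S̃ omitted, all illustrative assumptions) forms the GN correction from the triangular system and the GNM correction from the augmented jacobian (7); the latter agrees with the damped normal equations (W2'W2 + νI)h = −W2'r2:

```python
import numpy as np

# Illustrative check of the STEP 4 corrections (random stand-ins, S~ = I).
rng = np.random.default_rng(2)
Tn, m, nu = 8, 2, 0.1                 # T - n residuals, m nonlinear params
W2 = rng.standard_normal((Tn, m))
r2 = rng.standard_normal(Tn)

# GN: factor W2 = Q~'[R2; 0], form Q~ r2 = [d1; d2], solve R2 h = -d1
Qc, R = np.linalg.qr(W2, mode="complete")
d = Qc.T @ r2
h_gn = np.linalg.solve(R[:m, :m], -d[:m])
h_gn_ref = np.linalg.lstsq(W2, -r2, rcond=None)[0]   # direct solve, same h

# GNM: correction from the augmented jacobian (7), min ||[W2; sqrt(nu) I] h + [r2; 0]||
A = np.vstack([W2, np.sqrt(nu) * np.eye(m)])
h_gnm = np.linalg.lstsq(A, np.concatenate([-r2, np.zeros(m)]), rcond=None)[0]
h_ne = np.linalg.solve(W2.T @ W2 + nu * np.eye(m), -(W2.T @ r2))

print(np.max(np.abs(h_gn - h_gn_ref)), np.max(np.abs(h_gnm - h_ne)))
```

In the paper's scheme the QR factors of W2 are reused across different values of ν, which is exactly the saving that the two stage decomposition (8) delivers.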