Numerische Mathematik
© Springer-Verlag 1994

Numer. Math. 68: 3–16 (1994), Electronic Edition

Application of a block modified Chebyshev algorithm to the iterative solution of symmetric linear systems with multiple right-hand side vectors

D. Calvetti¹ *, L. Reichel² **

¹ Department of Pure and Applied Mathematics, Stevens Institute of Technology, Hoboken, NJ 07030, USA. E-mail: [email protected]
² Department of Mathematics and Computer Science, Kent State University, Kent, OH 44242, USA. E-mail: [email protected]

Received April 22, 1993

Dedicated to Professor Josef Stoer on the occasion of his 60th birthday

Summary. An adaptive Richardson iteration method is described for the solution of large sparse symmetric positive definite linear systems of equations with multiple right-hand side vectors. This scheme "learns" about the linear system to be solved by computing inner products of residual matrices during the iterations. These inner products are interpreted as block modified moments. A block version of the modified Chebyshev algorithm is presented which yields a block tridiagonal matrix from the block modified moments and the recursion coefficients of the residual polynomials. The eigenvalues of this block tridiagonal matrix define an interval, which determines the choice of relaxation parameters for Richardson iteration. Only minor modifications are necessary in order to obtain a scheme for the solution of symmetric indefinite linear systems with multiple right-hand side vectors. We outline the changes required.

Mathematics Subject Classification (1991): 65F10, 65F15, 15A24

1. Introduction

We describe an adaptive Richardson iteration method for the solution of large linear systems of equations

(1.1)  $AX = B\,, \qquad A \in \mathbb{R}^{n \times n}\,, \quad X, B \in \mathbb{R}^{n \times s}\,,$

where the matrix $A$ is sparse and symmetric. We will assume that $A$ is positive definite unless explicitly stated otherwise. The number of columns $s$ of the right-hand side matrix $B$ is assumed to be much smaller than the number of rows $n$.

* Research supported in part by the Design and Manufacturing Institute at Stevens Institute of Technology.
** Research supported in part by NSF grant DMS-9205531.


Linear systems of equations of this form have to be solved in the block Cimmino method [2, 22]. They also arise in iterative methods for 3D problems in which 2D subproblems are solved accurately in each step. Several solution methods for systems of the form (1.1) are available. They include block conjugate gradient and block Lanczos methods, as well as modifications of the vector Lanczos method¹; see [12, 17, 18, 19, 23, 25] and references therein.

The Chebyshev and Richardson iteration methods can be implemented effectively on vector and parallel computers and have therefore received renewed attention; see [6, 24]. When applying these methods one typically requires that an interval $[a, b]$, $0 < a \le b < \infty$, that contains the spectrum of $A$, denoted by $\lambda(A)$, be explicitly known. Recently, Golub and Kent [11] proposed an adaptive Chebyshev iteration method, in which approximations of such an interval are determined by computing certain modified moments during the iterations. Modifications of this scheme, which can reduce the number of iterations required to achieve convergence, are proposed in [3]. The methods in [3, 11] assume that $s = 1$ in (1.1). An adaptive Richardson iteration method in which approximations of an interval $[a, b]$ containing $\lambda(A)$ are determined by computing certain modified moments during the iterations was proposed in [4]. The scheme in [4] is designed for the solution of linear systems (1.1) with $s = 1$. The present paper generalizes the method in [4] to linear systems (1.1) with $s > 1$.

Let $X_0 \in \mathbb{R}^{n \times s}$ be a given initial approximate solution of (1.1). We can write Richardson iteration in the form

(1.2)  $X_{k+1} := X_k + \delta_k R_k\,, \qquad k = 0, 1, 2, \ldots\,,$

where

(1.3)  $R_k := B - A X_k$

is the residual matrix associated with the approximate solution $X_k$ of (1.1) and $\delta_k \in \mathbb{R}$ is a relaxation parameter. We wish to determine the relaxation parameters so that the error matrices

(1.4)  $E_k := A^{-1} B - X_k$

converge to the zero matrix rapidly as $k$ increases. The residual matrices can be expressed as

(1.5)  $R_k = p_k(A) R_0\,, \qquad k = 0, 1, 2, \ldots\,,$

where the polynomials $p_k$ satisfy the recursion relation

(1.6)  $p_{k+1}(\lambda) = (1 - \delta_k \lambda)\, p_k(\lambda)\,, \qquad k = 0, 1, 2, \ldots\,,$

with $p_0(\lambda) := 1$. Because of (1.5) we refer to the $p_k$ as residual polynomials.

Edrei [7] and Leja [16] introduced recursively defined points associated with compact sets in the complex plane. These points are known as Leja points. Our adaptive Richardson scheme chooses the relaxation parameters $\delta_k$ to be reciprocal values of Leja points associated with intervals $K_j := [a_j, b_j]$ on the positive real axis.

¹ Throughout this paper, for clarity, we sometimes refer to the Lanczos method as the vector Lanczos method in order to distinguish it from the block Lanczos method.


Let

(1.7)  $0 < \lambda_1 \le \lambda_2 \le \ldots \le \lambda_n < \infty$

denote the eigenvalues of $A$. During the iterations we determine a sequence of intervals $K_0, K_1, K_2, \ldots$, such that $K_j \subset K_{j+1}$ and

(1.8)  $\lambda_1 \le a_j \le b_j \le \lambda_n\,.$

We would like the intervals $K_j$ to contain most of the spectrum of $A$, and we determine them in the following manner. Let $R_0 = Q_0 B_0$ be a QR factorization of $R_0$, i.e., $Q_0 \in \mathbb{R}^{n \times s}$, $Q_0^T Q_0 = I$ and $B_0 \in \mathbb{R}^{s \times s}$ is upper triangular. We interpret the matrix products $Q_0^T R_j$ as block modified moments, which together with the recursion coefficients $\delta_k$ for the residual polynomials are input to a block generalization of the modified Chebyshev algorithm. This algorithm yields a symmetric block tridiagonal matrix $H_j$, whose eigenvalues are Ritz values of $A$. The extreme eigenvalues of $H_j$ are used to determine the interval $K_j$.

This paper is organized in the following manner. Section 2 reviews properties of Leja points and the convergence of Richardson iteration. Section 3 introduces a block modified Chebyshev algorithm and discusses some of its properties. Our adaptive Richardson method is presented in Sect. 4, and computed examples are shown in Sect. 5. Concluding remarks can be found in Sect. 6.

Denote the columns of $B$ by $b_\ell$ and the columns of $X$ by $x_\ell$. Then (1.1) can be written

(1.9)  $A x_\ell = b_\ell\,, \qquad 1 \le \ell \le s\,.$

Instead of solving (1.1) by the adaptive Richardson method of Sect. 4, we can solve the $s$ linear systems of equations (1.9), e.g., by the Richardson scheme in [4] designed for the solution of linear systems of equations with a single right-hand side vector. The scheme of the present paper has the advantage that it can be implemented with higher level BLAS, and this typically yields higher performance on modern computers; see [1, 6, 13]. Moreover, the block modified Chebyshev algorithm of the present paper often yields better estimates for the interval $[\lambda_1, \lambda_n]$ than the (scalar) modified Chebyshev algorithm that is used in [4]. The latter is illustrated by the computed examples in Sect. 5.
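To make the basic iteration concrete, the following minimal NumPy sketch (ours, not part of the original paper) implements (1.2) and (1.3) for a supplied sequence of relaxation parameters; how the parameters are chosen adaptively is the subject of the remainder of the paper.

```python
import numpy as np

def block_richardson(A, B, X0, deltas):
    """Richardson iteration (1.2) with residual matrices (1.3),
    applied to all s right-hand sides at once."""
    X = X0.copy()
    R = B - A @ X                # residual matrix R_0, cf. (1.3)
    for delta in deltas:
        X = X + delta * R        # update (1.2)
        R = B - A @ X
    return X, R
```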

2. Richardson iteration based on Leja points

In this section we review results on the convergence of Richardson iteration when the spectrum of $A$ lies in an interval $K = [a, b]$ on the positive real axis, and the relaxation parameters $\delta_k$ are reciprocal values of Leja points for $K$. More detailed discussions can be found in [4, 8, 21].

Define the weight function $w(z) := z$, and let

(2.1)  $z_0 := b\,.$

Choose $z_k$ so that

(2.2)  $\prod_{j=0}^{k-1} |z_k - z_j|\, w(z_k) = \max_{z \in K} \prod_{j=0}^{k-1} |z - z_j|\, w(z)\,, \qquad z_k \in K\,, \quad k = 1, 2, 3, \ldots\,.$


Formula (2.2) might not determine the $z_k$, $k \ge 1$, uniquely. We call any points $z_0, z_1, z_2, \ldots$ that satisfy (2.1)–(2.2) Leja points for $K$. Edrei [7] and Leja [16] studied these points when the weight function is $w(z) := 1$. The asymptotic distribution of the Leja points is the same for $w(z) := 1$ and $w(z) := z$, but computed examples in [21] show the latter weight function to yield a better choice of the first few relaxation parameters $\delta_k := 1/z_k$. It follows from results in [16] that Leja points for $K = [a, b]$ are uniformly distributed on $K$ with respect to the density function

(2.3)  $d\omega(z) := \frac{1}{\pi}\, (z - a)^{-1/2} (b - z)^{-1/2}\,, \qquad a < z < b\,.$

We remark that the zeros of Chebyshev polynomials for $K$ are uniformly distributed with respect to the density function (2.3), too.

We formulate a convergence result for Richardson iteration with the relaxation parameters $\delta_k := 1/z_k$, where the $z_k$ are Leja points for $K$ and $\lambda(A) \subset K$. Let $\Delta = \{\delta_k\}_{k=0}^{\infty}$ be a sequence of relaxation parameters and introduce the asymptotic convergence factor

(2.4)  $\rho(\Delta) := \lim_{k \to \infty} \left[ \sup_{E_0 \ne 0} \frac{\|E_k\|}{\|E_0\|} \right]^{1/k}\,,$

where $\|\cdot\|$ denotes a matrix norm, and the error matrices $E_k$ are given by (1.4).

Lemma 2.1. Let $p_k$ be the residual polynomials defined by (1.6). Then

(2.5)  $\rho(\Delta) = \lim_{k \to \infty} \max_{\lambda \in \lambda(A)} |p_k(\lambda)|^{1/k}$

for any matrix norm in (2.4).

Proof. First assume that the matrix norm in (2.4) is the 2-norm

$\|B\|_2 := \max_{\|y\|_2 = 1} \|B y\|_2\,, \qquad B \in \mathbb{R}^{n \times s}\,,$

where

$\|u\|_2 := \left( \sum_{j=1}^{s} |u_j|^2 \right)^{1/2}\,, \qquad u = [u_1, u_2, \ldots, u_s]^T \in \mathbb{R}^s\,.$

From $E_k = p_k(A) E_0$ it follows that

(2.6)  $\frac{\|E_k\|_2}{\|E_0\|_2} \le \|p_k(A)\|_2 = \max_{\lambda \in \lambda(A)} |p_k(\lambda)|\,.$

Assume that the maximum on the right-hand side of (2.6) is achieved for $\lambda = \lambda_1^{(k)}$, and let $x_1 \in \mathbb{R}^n$ satisfy $A x_1 = \lambda_1^{(k)} x_1$ and $\|x_1\|_2 = 1$. Substitution of $E_0 = [x_1, 0, \ldots, 0]$ into (2.6) yields

$\frac{\|E_k\|_2}{\|E_0\|_2} = \|p_k(A) E_0\|_2 = |p_k(\lambda_1^{(k)})|\,.$

Therefore,


$\lim_{k \to \infty} \left( \frac{\|E_k\|_2}{\|E_0\|_2} \right)^{1/k} = \lim_{k \to \infty} \max_{\lambda \in \lambda(A)} |p_k(\lambda)|^{1/k}\,.$

This shows that (2.5) holds for the 2-norm. Now let $\|\cdot\|$ be a general matrix norm, i.e., we assume that for arbitrary matrices $B, C \in \mathbb{R}^{n \times s}$ and $\alpha \in \mathbb{R}$ the norm satisfies i) $\|B\| \ne 0$ if $B \ne 0$, ii) $\|\alpha B\| = |\alpha|\, \|B\|$, and iii) $\|B + C\| \le \|B\| + \|C\|$. Horn and Johnson [15, p. 320] refer to such a matrix norm as a generalized matrix norm or as a vector norm, because $\|\cdot\|$ is not required to be submultiplicative. By [15, Theorem 5.7.8] it follows that there are constants $c_2 \ge c_1 > 0$, such that for all $B \in \mathbb{R}^{n \times s}$,

(2.7)  $c_1 \|B\|_2 \le \|B\| \le c_2 \|B\|_2\,.$

Substituting (2.7) into (2.4) shows that $\rho(\Delta)$ is independent of the choice of matrix norm. We remark that [15, Theorem 5.7.8] is only formulated for $n \times n$ matrices, and we apply this theorem to $n \times n$ matrices that are obtained by appending $n - s$ zero columns to the $n \times s$ error matrices $E_k$. □

In general $\lambda(A)$ is not explicitly known. Assume instead that an interval $K = [a, b]$ which contains $\lambda(A)$ is explicitly known. Introduce the asymptotic convergence factor with respect to $K$,

$\rho(\Delta; K) := \lim_{k \to \infty} \max_{z \in K} |p_k(z)|^{1/k}\,.$

Then $\rho(\Delta; K) \ge \rho(\Delta)$. We would like to choose the relaxation parameters $\delta_k$ in Richardson iteration so that $\rho(\Delta; K)$ is minimal for a given interval $K = [a, b]$.

Lemma 2.2. Let $\lambda(A) \subset K = [a, b]$ with $0 < a < b < \infty$. Assume that the iteration parameters $\Delta^* = \{\delta_k\}_{k=0}^{\infty}$ are reciprocal values of Leja points for $K$. Then

$\rho(\Delta^*; K) = \inf_{\Delta} \rho(\Delta; K) = \rho(a, b) := \frac{\kappa}{1 + \sqrt{1 - \kappa^2}}\,,$

where $\kappa := \frac{b - a}{b + a}$. In particular, if $K'$ is a subinterval of $K$, then

$\inf_{\Delta} \rho(\Delta; K') \le \inf_{\Delta} \rho(\Delta; K)\,.$

Proof. The lemma follows from results by Leja [16] and from properties of Chebyshev polynomials; see [4, Lemma 2.1] for details. □
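For a concrete feel of Lemma 2.2 (our illustration, not in the original), the optimal factor $\rho(a, b)$ is easy to evaluate; the value below agrees with the equivalent Chebyshev form $(\sqrt{b} - \sqrt{a})/(\sqrt{b} + \sqrt{a})$.

```python
import math

def rho(a, b):
    """Asymptotic convergence factor of Lemma 2.2 for K = [a, b]."""
    kappa = (b - a) / (b + a)
    return kappa / (1.0 + math.sqrt(1.0 - kappa ** 2))

# For K = [1, 100]: (sqrt(100) - 1) / (sqrt(100) + 1) = 9/11 ~ 0.818, i.e.,
# the residual norm shrinks by roughly a factor 0.818 per step asymptotically.
print(rho(1.0, 100.0))
```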

3. A block modified Chebyshev algorithm

The vector Lanczos algorithm computes, from an initial vector $r_0 \in \mathbb{R}^n$, a symmetric tridiagonal matrix, whose eigenvalues are Ritz values of $A$; see, e.g., Golub and Van Loan [13], Gragg [14] and Parlett [20]. The block Lanczos algorithm determines, from an initial matrix $R_0 \in \mathbb{R}^{n \times s}$, a symmetric block tridiagonal matrix $H_j$, whose eigenvalues are Ritz values of $A$; see Golub and Underwood [12], Parlett [20] and references therein. This section presents a block version of the modified Chebyshev algorithm that determines the matrix $H_j$ from the residual matrices $R_0, R_1, \ldots, R_{2j-1}$ given by (1.3) and from the recursion coefficients $\delta_0, \delta_1, \ldots, \delta_{2j-1}$ of the residual polynomials (1.6). Our derivation of the block modified Chebyshev algorithm is analogous to the derivation of the (scalar) modified Chebyshev algorithm by Golub and Gutknecht [10].

We first consider the block Lanczos algorithm. Let $R_0 = Q_0 B_0$ be a QR factorization of $R_0$, i.e., $Q_0 \in \mathbb{R}^{n \times s}$, $Q_0^T Q_0 = I$ and $B_0 \in \mathbb{R}^{s \times s}$ is upper triangular. The block Lanczos algorithm generates a sequence of mutually orthogonal matrices $Q_k \in \mathbb{R}^{n \times s}$, $k = 0, 1, 2, \ldots$, i.e.,

(3.1)  $Q_k^T Q_j = \begin{cases} I\,, & j = k\,, \\ 0\,, & j \ne k\,, \end{cases}$

that satisfy a three-term recursion relation

(3.2)  $Q_{k+1} B_{k+1} = A Q_k - Q_k A_k - Q_{k-1} B_k^T\,, \qquad k \ge 0\,,$

for certain matrices $A_j, B_j \in \mathbb{R}^{s \times s}$, where $Q_{-1} := 0$. Define the Krylov subspaces

(3.3)  $\mathbb{K}_j(A, R_0) := \mathrm{span}\{R_0, A R_0, \ldots, A^{j-1} R_0\}\,, \qquad j \ge 0\,,$

where the right-hand side denotes the span of the columns of the matrices listed. We will ignore degenerate cases and assume that

(3.4)  $\mathbb{K}_j(A, R_0) = \mathrm{span}\{Q_0, Q_1, \ldots, Q_{j-1}\}\,, \qquad j \ge 0\,.$

Algorithm 3.1 (Block Lanczos algorithm)
Input: $R_0 \in \mathbb{R}^{n \times s}$, $A \in \mathbb{R}^{n \times n}$, $j \le n/s$;
Output: $\{Q_k\}_{k=0}^{j-1}$, $\{A_k\}_{k=0}^{j-1}$, $\{B_k\}_{k=1}^{j-1}$, $Q_k \in \mathbb{R}^{n \times s}$, $A_k, B_k \in \mathbb{R}^{s \times s}$;

Compute QR factorization $R_0 = Q_0 B_0$, where $Q_0^T Q_0 = I$ and $B_0$ is upper triangular;
$P_1 := A Q_0$; $A_0 := Q_0^T P_1$; $P_1 := P_1 - Q_0 A_0$;
for $k := 2, 3, \ldots, j$ do
  Compute QR factorization $P_{k-1} = Q_{k-1} B_{k-1}$, where $Q_{k-1}^T Q_{k-1} = I$ and $B_{k-1}$ is upper triangular;
  $P_k := A Q_{k-1} - Q_{k-2} B_{k-1}^T$; $A_{k-1} := Q_{k-1}^T P_k$; $P_k := P_k - Q_{k-1} A_{k-1}$;
end k;

The matrices $A_k, B_k$ generated by Algorithm 3.1 are uniquely determined if the diagonal entries of the $B_k$ are positive. We will assume this to be the case. The three-term recursion relation (3.2) can be written in matrix form

(3.5)  $A [Q_0, Q_1, \ldots, Q_{j-1}] = [Q_0, Q_1, \ldots, Q_{j-1}] H_j + [0, \ldots, 0, Q_j B_j]\,,$

where $H_j \in \mathbb{R}^{js \times js}$ is the symmetric block tridiagonal matrix

(3.6)  $H_j = \begin{bmatrix} A_0 & B_1^T & & & \\ B_1 & A_1 & B_2^T & & \\ & B_2 & \ddots & \ddots & \\ & & \ddots & & B_{j-1}^T \\ & & & B_{j-1} & A_{j-1} \end{bmatrix}\,.$
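For reference, Algorithm 3.1 admits a compact transcription. The following NumPy sketch is ours; it omits breakdown handling and the normalization of the QR factors to positive diagonal entries that the uniqueness assumption above requires.

```python
import numpy as np

def block_lanczos(A, R0, j):
    """Algorithm 3.1 (sketch): returns the diagonal blocks A_k and the
    subdiagonal blocks B_k of the block tridiagonal matrix H_j in (3.6)."""
    Q, _ = np.linalg.qr(R0)            # R0 = Q0 B0
    P = A @ Q
    A_blk = [Q.T @ P]                  # A_0
    P = P - Q @ A_blk[0]
    B_blk = []
    Q_prev = Q
    for _ in range(j - 1):
        Q_new, Bk = np.linalg.qr(P)    # P_{k-1} = Q_{k-1} B_{k-1}
        B_blk.append(Bk)
        P = A @ Q_new - Q_prev @ Bk.T  # three-term recursion (3.2)
        Ak = Q_new.T @ P
        P = P - Q_new @ Ak
        A_blk.append(Ak)
        Q_prev = Q_new
    return A_blk, B_blk
```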


Note that because the matrices $B_j$ are upper triangular, the bandwidth of $H_j$ is only $2s + 1$.

We will now derive a block modified Chebyshev algorithm and show that it also yields the matrix (3.6). For notational simplicity we let the matrix $A$ be a mapping from $\ell_2$ to $\ell_2$, and the matrices $Q_j$ be mappings from $\mathbb{R}^s$ to $\ell_2$. Define the matrix

(3.7)  $Q = [Q_0, Q_1, Q_2, \ldots]\,.$

Then (3.5) can be written in the form

(3.8)  $AQ = QH\,,$

where $H$ is a block tridiagonal matrix with leading principal $js \times js$ submatrix $H_j$. Let the residual matrices $R_k \in \ell_2^s$ be defined by (1.3). In view of (1.5) and (1.6) they satisfy

(3.9)  $R_{k+1} = R_k - \delta_k A R_k\,, \qquad k = 0, 1, 2, \ldots\,.$

Introduce the matrix

$R = [R_0, R_1, R_2, \ldots]\,,$

and define the block bidiagonal matrix

$T = \begin{bmatrix} T_{00} & & & \\ T_{10} & T_{11} & & \\ & T_{21} & T_{22} & \\ & & \ddots & \ddots \end{bmatrix}$

with $s \times s$ blocks $T_{jj} := \frac{1}{\delta_j} I$ and $T_{j+1,j} := -\frac{1}{\delta_j} I$. Then (3.9) yields

(3.10)  $AR = RT\,.$

Define the $s \times s$ matrices

$S_{jk} := Q_j^T R_k\,, \qquad j \ge 0\,,\ k \ge 0\,,$

and let

(3.11)  $S = \begin{bmatrix} S_{00} & S_{01} & S_{02} & \cdots \\ S_{10} & S_{11} & & \\ S_{20} & & \ddots & \\ \vdots & & & \end{bmatrix} = Q^T R\,.$

In view of (3.1) and (3.4), we have

(3.12)  $S_{jk} = 0\,, \qquad j > k\,,$


i.e., the matrix $S$ is block upper triangular. It follows from (3.8), (3.10), (3.11) and the symmetry of $A$ and $H$ that

(3.13)  $HS = H Q^T R = Q^T A R = Q^T R T = S T\,.$

Identifying corresponding blocks on the left-hand side and on the right-hand side of (3.13) yields

(3.14)  $B_j S_{j-1,k} + A_j S_{jk} + B_{j+1}^T S_{j+1,k} = S_{jk} T_{kk} + S_{j,k+1} T_{k+1,k}\,, \qquad j \ge 0\,,\ k \ge 0\,,$

where $S_{-1,k} := 0$. If $j > k + 1$, then equation (3.14) is trivially satisfied because both the left-hand side and the right-hand side vanish due to (3.12). When $j = k + 1$, equation (3.14) simplifies to

(3.15)  $B_{k+1} S_{kk} = S_{k+1,k+1} T_{k+1,k}\,,$

and $j = k$ yields

(3.16)  $B_k S_{k-1,k} + A_k S_{kk} = S_{kk} T_{kk} + S_{k,k+1} T_{k+1,k}\,.$

Note that the blocks $A_k$ and $B_k$ of $H_j$ can be determined from (3.15)–(3.16), i.e.,

(3.17)  $B_k = S_{kk} T_{k,k-1} S_{k-1,k-1}^{-1}\,,$

(3.18)  $A_k = (S_{kk} T_{kk} + S_{k,k+1} T_{k+1,k} - B_k S_{k-1,k})\, S_{kk}^{-1}\,.$

Similarly as above, we ignore degenerate situations and assume that the matrices $S_{jj}^{-1}$ exist for all $j \ge 0$. Note that the computation of the $A_k$ and $B_k$ by (3.17)–(3.18) only requires the diagonal and superdiagonal blocks of the matrix $S$. The blocks of the matrix $S$ required for the computation of the entries of $H_j$ can be generated recursively, starting from the blocks $S_{0k}$, $0 \le k < 2j$. Details of the computations are described in the following algorithm.

We refer to the matrices $S_{0k}$ as block modified moments, because when the block size $s$ is one, the quantities $S_{0k}$ can be interpreted as modified moments; see [3, 4, 11]. The (scalar) modified Chebyshev algorithm generates, from modified moments, a symmetric tridiagonal matrix whose entries are recursion coefficients of the orthogonal polynomials associated with the modified moments; see, e.g., [9, 10]. Our algorithm generates, from block analogues of modified moments, a symmetric block tridiagonal matrix whose entries are recursion coefficients of the mutually orthogonal matrices $Q_j$. We therefore refer to our algorithm as the block modified Chebyshev algorithm. The algorithm is presented here for the special case when $T_{kk} = -T_{k+1,k} = \frac{1}{\delta_k} I$.

Algorithm 3.2 (Block modified Chebyshev algorithm)
Input: $j$, $\{S_{0k}\}_{k=0}^{2j-1}$, $\{\delta_k\}_{k=0}^{2j-1}$;
Output: $H_j$;

$B_0 := 0$; $A_0 := \frac{1}{\delta_0} (I - S_{01} S_{00}^{-1})$;
for $k := 1, 2, \ldots, j - 1$ do
  $\hat{S}_{kk} := -\frac{1}{\delta_k} S_{k-1,k+1} + \left( \frac{1}{\delta_k} I - A_{k-1} \right) S_{k-1,k} - B_{k-1} S_{k-2,k}$;
  Compute the Cholesky factorization $B_k^T B_k$ of $-\frac{1}{\delta_{k-1}}\, \hat{S}_{kk} S_{k-1,k-1}^{-1}$;
  $S_{kk} := -\delta_{k-1} B_k S_{k-1,k-1}$;
  for $m := k + 1, k + 2, \ldots, 2j - k - 1$ do
    $S_{km} := B_k^{-T} \left[ -\frac{1}{\delta_m} S_{k-1,m+1} + \left( \frac{1}{\delta_m} I - A_{k-1} \right) S_{k-1,m} - B_{k-1} S_{k-2,m} \right]$;
  end m;
  $A_k := \frac{1}{\delta_k} I - \left( \frac{1}{\delta_k} S_{k,k+1} + B_k S_{k-1,k} \right) S_{kk}^{-1}$;
end k;
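The following NumPy/SciPy sketch is our transcription of Algorithm 3.2; the degenerate cases excluded above are not guarded against. It returns the blocks $A_k$, $B_k$ of $H_j$.

```python
import numpy as np
from scipy.linalg import cholesky, solve_triangular

def block_mod_chebyshev(S0, deltas, j):
    """Algorithm 3.2 (sketch): blocks A_k, B_k of H_j from the block
    modified moments S0[k] = S_{0k} and the parameters deltas[k] = delta_k."""
    s = S0[0].shape[0]
    I = np.eye(s)
    S = {(-1, m): np.zeros((s, s)) for m in range(2 * j)}   # S_{-1,m} := 0
    S.update({(0, m): S0[m] for m in range(2 * j)})
    A = [(I - S0[1] @ np.linalg.inv(S0[0])) / deltas[0]]    # A_0
    B = [np.zeros((s, s))]                                  # B_0 := 0
    for k in range(1, j):
        def rhs(m):
            # -S_{k-1,m+1}/d_m + (I/d_m - A_{k-1}) S_{k-1,m} - B_{k-1} S_{k-2,m}
            return (-S[(k - 1, m + 1)] / deltas[m]
                    + (I / deltas[m] - A[k - 1]) @ S[(k - 1, m)]
                    - B[k - 1] @ S[(k - 2, m)])
        Shat = rhs(k)
        M = -Shat @ np.linalg.inv(S[(k - 1, k - 1)]) / deltas[k - 1]
        Bk = cholesky(M)                       # upper triangular, M = B_k^T B_k
        B.append(Bk)
        S[(k, k)] = -deltas[k - 1] * Bk @ S[(k - 1, k - 1)]
        for m in range(k + 1, 2 * j - k):      # m = k+1, ..., 2j-k-1
            S[(k, m)] = solve_triangular(Bk.T, rhs(m), lower=True)
        A.append(I / deltas[k]
                 - (S[(k, k + 1)] / deltas[k] + Bk @ S[(k - 1, k)])
                 @ np.linalg.inv(S[(k, k)]))
    return A, B
```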

Thus, under suitable regularity conditions, the blocks defining the matrix $H_j$ can be computed by Algorithm 3.2. The algorithm can be applied to Richardson iteration as follows. From the residual matrices $R_k$, $0 \le k < 2j$, defined by (1.3), we compute the matrices

(3.19)  $S_{0k} = Q_0^T R_k\,, \qquad 0 \le k < 2j\,.$

These matrices and the relaxation parameters $\delta_k$ are input to Algorithm 3.2, which determines the matrix $H_j$. Let

(3.20)  $\hat{\lambda}_1 \le \hat{\lambda}_2 \le \ldots \le \hat{\lambda}_{sj}$

denote the eigenvalues of $H_j$. These eigenvalues are used to determine an interval that contains most of the eigenvalues of $A$. The relaxation parameters $\delta_k$ are chosen as reciprocal values of Leja points for this interval. Details of the computation of the relaxation parameters are presented in Sect. 4. There we also show how two block modified moments can be computed in each iteration of the Richardson method.
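Assembling $H_j$ from its blocks and extracting the extreme Ritz values is routine; a small sketch (ours), anticipating the interval update rule (4.1) of the next section:

```python
import numpy as np

def ritz_interval(A_blk, B_blk, a_m, b_m):
    """Assemble H_j from its blocks (3.6), compute its extreme eigenvalues
    (3.20), and enlarge the current interval [a_m, b_m], cf. (4.1)."""
    j, s = len(A_blk), A_blk[0].shape[0]
    H = np.zeros((j * s, j * s))
    for k, Ak in enumerate(A_blk):
        H[k*s:(k+1)*s, k*s:(k+1)*s] = Ak
    for k, Bk in enumerate(B_blk, start=1):   # B_k sits below the diagonal
        H[k*s:(k+1)*s, (k-1)*s:k*s] = Bk
        H[(k-1)*s:k*s, k*s:(k+1)*s] = Bk.T
    eigs = np.linalg.eigvalsh(H)              # Ritz values of A, ascending
    return min(a_m, eigs[0]), max(b_m, eigs[-1])
```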

4. An adaptive Richardson iteration method

This section describes our adaptive Richardson iteration scheme and discusses some computational aspects. Assume that we know an interval $[a_m, b_m]$ whose endpoints satisfy $\lambda_1 \le a_m$ and $b_m \le \lambda_n$, where $\lambda_1$ and $\lambda_n$ are the extreme eigenvalues of $A$; see (1.7). Assume, moreover, that we have carried out $q$ iterations by the Richardson method with relaxation parameters $\delta_0, \delta_1, \ldots, \delta_{q-1}$. We would like to determine new relaxation parameters $\delta_k$, $k \ge q$, as reciprocal values of Leja points for $K_m = [a_m, b_m]$ in the presence of the points $z_k = 1/\delta_k$ for $0 \le k < q$. However, the numerical determination of a large number of Leja points according to (2.1)–(2.2) may be quite difficult. We therefore discretize the interval $K_m$ using grid points $\{t_i\}_{i=1}^{\ell}$, which we choose to be the zeros of the Chebyshev polynomial of the first kind of degree $\ell$ for the interval $K_m$.

Algorithm 4.1 (Computation of relaxation parameters)
Input: $a_m$, $b_m$, $\ell$, $q$, $r$, $\{z_k\}_{k=0}^{q-1}$ ($z_k = 1/\delta_k$);
Output: $\{\delta_k\}_{k=q}^{r-1}$;


Let $K_m^{(\ell)} = \{t_i\}_{i=1}^{\ell}$ be the set of zeros of the Chebyshev polynomial of the first kind of degree $\ell$ for the interval $[a_m, b_m]$;
for $k := q, q + 1, \ldots, r - 1$ do
  if $k = 0$ then
    $z_0 := b_m$; $\delta_0 := 1/z_0$;
  else
    Determine $z_k \in K_m^{(\ell)}$ so that
    $\prod_{i=0}^{k-1} |z_k - z_i|\, w(z_k) = \max_{z \in K_m^{(\ell)}} \prod_{i=0}^{k-1} |z - z_i|\, w(z)\,;$
    $\delta_k := 1/z_k$;
  endif;
end k;
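In code, Algorithm 4.1 amounts to a greedy search over the Chebyshev grid. The following NumPy sketch (ours) returns the new relaxation parameters together with the extended Leja sequence:

```python
import numpy as np

def leja_parameters(a_m, b_m, ell, q, r, z_prev):
    """Algorithm 4.1 (sketch): greedily extend the Leja points z_prev
    (z_k = 1/delta_k, k < q) on a degree-ell Chebyshev grid for [a_m, b_m]."""
    i = np.arange(1, ell + 1)
    t = 0.5 * (a_m + b_m) + 0.5 * (b_m - a_m) * np.cos((2 * i - 1) * np.pi / (2 * ell))
    z = list(z_prev)
    deltas = []
    for k in range(q, r):
        if k == 0:
            zk = b_m                                   # (2.1): z_0 := b
        else:
            # maximize prod_i |z - z_i| * w(z) over the grid, with w(z) = z
            vals = np.abs(t) * np.prod(np.abs(t[:, None] - np.array(z)), axis=1)
            zk = t[int(np.argmax(vals))]
        z.append(zk)
        deltas.append(1.0 / zk)
    return deltas, z
```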

In Sect. 3 we showed how to compute block modified moments associated with the residual polynomials $p_k$ during the iterations, and how these block modified moments and the relaxation parameters can be used to determine a symmetric block tridiagonal matrix $H_j$. The eigenvalues (3.20) of $H_j$ can be used to update the interval $K_m = [a_m, b_m]$. The updated interval $K_{m+1} = [a_{m+1}, b_{m+1}]$ is defined by

(4.1)  $a_{m+1} := \min\{a_m, \hat{\lambda}_1\}\,, \qquad b_{m+1} := \max\{b_m, \hat{\lambda}_{sj}\}\,.$

We remark that if the initial interval $K_0 = [a_0, b_0]$ satisfies (1.8), then the intervals $K_j$, $j \ge 1$, defined by (4.1) also satisfy (1.8).

Formula (3.19) shows how to compute one block modified moment in each iteration. The application of this formula would make it necessary to carry out $2j - 1$ iterations in order to determine the matrix $H_j$. We now describe how to compute two modified moments in each iteration. This makes it possible to update the intervals $[a_m, b_m]$ more frequently. The ability to update the interval often is particularly important in the beginning of the iterations, when many of the eigenvalues of $A$ might lie far outside the intervals $[a_m, b_m]$ used to determine relaxation parameters.

Consider the polynomials

$\tilde{p}_{2k-1}(\lambda) := (1 - \delta_{k-1}\lambda)[p_{k-1}(\lambda)]^2 = p_{k-1}(\lambda)\, p_k(\lambda)\,, \qquad \tilde{p}_{2k}(\lambda) := [p_k(\lambda)]^2\,.$

They satisfy the recursion relation

$\tilde{p}_{k+1}(\lambda) = (1 - \tilde{\delta}_k \lambda)\, \tilde{p}_k(\lambda)\,, \qquad k = 0, 1, \ldots\,,$

where

(4.2)  $\tilde{\delta}_{2k+1} := \tilde{\delta}_{2k} := \delta_k\,, \qquad k = 0, 1, \ldots\,.$

Analogously to (1.5) and (3.19), we define

$\tilde{R}_k := \tilde{p}_k(A)\, \tilde{R}_0\,, \qquad \tilde{S}_{0k} := Q_0^T \tilde{R}_k\,, \qquad k \ge 0\,,$

where $\tilde{R}_0 = R_0 = Q_0 B_0$. Then,

(4.3)  $S_{00} = \tilde{S}_{00} = B_0$


and, for $k \ge 1$,

$\tilde{S}_{0,2k-1} = Q_0^T \tilde{p}_{2k-1}(A) R_0 = Q_0^T p_{k-1}(A)\, p_k(A) R_0 = B_0^{-T} R_{k-1}^T R_k\,,$
$\tilde{S}_{0,2k} = Q_0^T \tilde{p}_{2k}(A) R_0 = Q_0^T p_k(A)\, p_k(A) R_0 = B_0^{-T} R_k^T R_k\,.$

Thus, we can compute two block modified moments during each iteration with Richardson's method. From the residual matrices $R_0, R_1, \ldots, R_j$ and the relaxation parameters $\delta_0, \delta_1, \ldots, \delta_{j-1}$ we can compute $2j$ block modified moments $\{\tilde{S}_{0k}\}_{k=0}^{2j-1}$ and the parameters $\{\tilde{\delta}_k\}_{k=0}^{2j-1}$, which can be used as input to Algorithm 3.2 to determine a symmetric block tridiagonal matrix $H_j \in \mathbb{R}^{js \times js}$. The eigenvalues (3.20) of $H_j$ can be applied to update the interval $[a_m, b_m]$ according to (4.1). Therefore, we can update the interval every $j$ iterations. Computational experience suggests, however, that one should, in general, carry out more than $j$ iterations between updates of the interval, in order to make the residual matrices rich in eigenvector components associated with eigenvalues lying outside the interval presently being used to determine relaxation parameters. In Algorithm 4.2 below we therefore carry out at least $jq$ iterations between updates of the interval, where $q$ is a small integer larger than one.

Lemmas 2.1 and 2.2 suggest a criterion for deciding when to update the interval $[a_m, b_m]$. Let $C \in \mathbb{R}^{n \times s}$ have elements $c_{jk}$ and introduce the norm

(4.4)  $|||C||| := \max_{1 \le k \le s} \left( \sum_{j=1}^{n} |c_{jk}|^2 \right)^{1/2}\,.$

Since the matrix $A$ is symmetric and positive definite, $|||C|||_A := |||AC|||$ is also a matrix norm. Assume for the moment that $\lambda(A) \subset K_m = [a_m, b_m]$. Then $\rho(\Delta) \le \rho(\Delta; K_m)$, and Lemmas 2.1–2.2 show that

$\lim_{k \to \infty} \left( \frac{|||R_k|||}{|||R_0|||} \right)^{1/k} = \lim_{k \to \infty} \left( \frac{|||E_k|||_A}{|||E_0|||_A} \right)^{1/k} = \rho(\Delta) \le \rho(\Delta; K_m) = \rho(a_m, b_m)\,.$

We therefore update the interval $[a_m, b_m]$ after $jmq$ iterations if

$\frac{|||R_{jmq}|||}{|||R_0|||} > \rho(a_m, b_m)^{jmq}$

for $m = 1, 2, \ldots$. We have found this criterion for updating to work quite well.

The following algorithm assumes that the initial interval $[a_0, b_0]$ is a subset of the interval $[\lambda_1, \lambda_n]$ defined by the extreme eigenvalues of the matrix $A$; see (1.7). This condition is satisfied, for instance, when

(4.5)  $a_0 := b_0 := \frac{1}{n}\, \mathrm{trace}(A)\,.$
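A sketch (ours) of the per-iteration bookkeeping used below: the two block modified moments and the update test in the norm (4.4).

```python
import numpy as np

def triple_norm(C):
    """The norm (4.4): largest Euclidean column norm."""
    return np.max(np.linalg.norm(C, axis=0))

def two_moments(B0, R_km1, R_k):
    """S~_{0,2k-1} = B0^{-T} R_{k-1}^T R_k and S~_{0,2k} = B0^{-T} R_k^T R_k."""
    apply_B0_invT = lambda M: np.linalg.solve(B0.T, M)
    return apply_B0_invT(R_km1.T @ R_k), apply_B0_invT(R_k.T @ R_k)

def interval_needs_update(R_jmq, R0, rho_ab, jmq):
    """Criterion above: |||R_jmq||| / |||R_0||| > rho(a_m, b_m)^jmq."""
    return triple_norm(R_jmq) / triple_norm(R0) > rho_ab ** jmq
```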

Algorithm 4.2 (Adaptive Richardson method)
Input: $j, n, q, s \in \mathbb{N}$, $A \in \mathbb{R}^{n \times n}$, $X_0, B \in \mathbb{R}^{n \times s}$, $a_0, b_0, \mathrm{tol} \in \mathbb{R}$;
Output: Approximate solution $X_{k+1} \in \mathbb{R}^{n \times s}$;


$R_0 := B - A X_0$; Compute $\tilde{S}_{00}$ by (4.3);
for $m := 0, 1, \ldots$ do
  if $m = 0$ or $|||R_{jmq}||| / |||R_0||| > \rho(a_m, b_m)^{jmq}$ then
    Compute $\{\delta_{jmq+k}\}_{k=0}^{j-1}$ for the interval $[a_m, b_m]$ by Algorithm 4.1;
    for $k := 0, 1, \ldots, j - 1$ do
      $X_{jmq+k+1} := X_{jmq+k} + \delta_{jmq+k} R_{jmq+k}$;
      $R_{jmq+k+1} := B - A X_{jmq+k+1}$;
      if $|||R_{jmq+k+1}||| \le \mathrm{tol}$ then stop;
      $\tilde{S}_{0,2k+1} := B_0^{-T} R_k^T R_{k+1}$; $\tilde{S}_{0,2k+2} := B_0^{-T} R_{k+1}^T R_{k+1}$;
      $\tilde{\delta}_{2k+1} := \tilde{\delta}_{2k} := \delta_{jmq+k}$;
    end k;
    Use the block modified moments $\{\tilde{S}_{0k}\}_{k=0}^{2j-1}$ and the parameters $\{\tilde{\delta}_k\}_{k=0}^{2j-1}$ as input to Algorithm 3.2 to determine the symmetric block tridiagonal matrix $H_j$ of order $js$;
    Compute the extreme eigenvalues $\hat{\lambda}_1$ and $\hat{\lambda}_{js}$ of $H_j$;
    Determine the endpoints of the new interval $[a_{m+1}, b_{m+1}]$ by (4.1);
  else
    $a_{m+1} := a_m$; $b_{m+1} := b_m$;
    Compute $\{\delta_{jmq+k}\}_{k=0}^{j-1}$ for the interval $[a_{m+1}, b_{m+1}]$ by Algorithm 4.1;
    for $k := 0, 1, \ldots, j - 1$ do
      $X_{jmq+k+1} := X_{jmq+k} + \delta_{jmq+k} R_{jmq+k}$;
      $R_{jmq+k+1} := B - A X_{jmq+k+1}$;
      if $|||R_{jmq+k+1}||| \le \mathrm{tol}$ then stop;
    end k;
  endif;
  Compute $\{\delta_{jmq+k}\}_{k=j}^{jq-1}$ for the interval $[a_{m+1}, b_{m+1}]$ by Algorithm 4.1;
  for $k := j, j + 1, \ldots, jq - 1$ do
    $X_{jmq+k+1} := X_{jmq+k} + \delta_{jmq+k} R_{jmq+k}$;
    $R_{jmq+k+1} := B - A X_{jmq+k+1}$;
    if $|||R_{jmq+k+1}||| \le \mathrm{tol}$ then stop;
  end k;
end m;

The algorithm above is designed for the iterative solution of linear systems of equations with a symmetric positive definite matrix. The scheme can be modified fairly easily to be applicable when the matrix $A$ is symmetric indefinite. In the latter case, the set $K_m$ is replaced by two intervals on the real axis, one on each side of the origin. A Richardson method for the solution of (1.1) when $A$ is symmetric indefinite and $s = 1$ is presented in [5], and the approach there generalizes in an obvious manner to the case $s > 1$. We therefore omit the details.

5. Computed examples

This section presents some numerical experiments that illustrate the behavior of the adaptive Richardson iteration method defined by Algorithm 4.2. The computer programs for our examples were written in FORTRAN 77, and all computations were carried out on an HP 9000-720 workstation in double precision arithmetic, i.e., with approximately 15 significant digits. In the examples we use Algorithm 4.2 with the input parameters $j = 3$ and $q = 4$. We compare the number of iterations required by Algorithm 4.2 with the number of iterations required by an iterative scheme in which the $s$ linear systems of equations


(1.9) with a single vector as right-hand side are solved by applying $s$ copies of Algorithm 4.2 with input parameter $s = 1$ in lock step. We refer to the latter scheme as the vector Richardson algorithm and to Algorithm 4.2 as the block Richardson algorithm.

The vector Richardson algorithm proceeds as follows. When $s = 1$ the moment matrices $\tilde{S}_{0k}$, $0 \le k < 2j$, computed in Algorithm 4.2 are scalars, and Algorithm 3.2 simplifies to the (scalar) modified Chebyshev algorithm. The symmetric matrix $H_j$ generated by the scalar modified Chebyshev algorithm is of order $j$ and tridiagonal. Whenever, in the solution of one of the $s$ linear systems, the computation of a symmetric $j \times j$ tridiagonal matrix is required in order to update the interval used to compute relaxation parameters by Algorithm 4.1, we determine a symmetric $j \times j$ tridiagonal matrix for each one of the $s$ linear systems. We then compute the $j$ eigenvalues of each one of the $s$ tridiagonal matrices and denote them by $\hat{\lambda}_k$, $1 \le k \le js$. We order these eigenvalues according to (3.20). The interval used to determine relaxation parameters for each one of the $s$ linear systems of equations is then updated according to (4.1). In this manner, information "learned" when solving one of the $s$ linear systems (1.9) is applied in the solution process of the other systems. The relaxation parameters used to solve each one of the systems (1.9) are the same.

Example 5.1. In this example we let $A$ be a $200 \times 200$ symmetric matrix with one eigenvalue at 1, and the remaining eigenvalues at $400, 401, \ldots, 598$. We solve (1.1) for $s = 2$. The entries of the two right-hand side vectors are uniformly distributed random numbers in the interval $[-1, 1]$. We begin the Richardson iterations with the initial approximate solution $X_0 := 0$ and with an initial interval $K_0$ given by (4.5). The iterations were terminated when the norm (4.4) of a computed residual matrix was smaller than $\epsilon = 10^{-5}$. The number of iterations required for the block Richardson algorithm to achieve convergence was 84, and the intervals $[a_m, b_m]$ were adapted three times. The vector Richardson algorithm required 165 iterations, and the interval was adapted five times.

Example 5.2. The $200 \times 200$ symmetric matrix $A$ considered in this example has one eigenvalue at 1, one at 1000 and the remaining eigenvalues at $400, 401, \ldots, 597$. The right-hand side matrix $B$ is the same as in the previous example, and so are the choice of $X_0$ and the stopping criterion for the iterations. The initial interval is $K_0 = [4, 4]$. The block Richardson algorithm demanded 140 iterations to achieve convergence, while the vector Richardson algorithm required 187 iterations.
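For reproduction purposes, the test problem of Example 5.1 might be set up as follows (our sketch; the paper specifies only the spectrum, so a diagonal representative is used, and the random seed is arbitrary):

```python
import numpy as np

n, s = 200, 2
# Spectrum of Example 5.1: one eigenvalue at 1, the rest at 400, 401, ..., 598.
A = np.diag(np.concatenate(([1.0], np.arange(400.0, 599.0))))
rng = np.random.default_rng(0)
B = rng.uniform(-1.0, 1.0, size=(n, s))   # right-hand sides with entries in [-1, 1]
X0 = np.zeros((n, s))
a0 = b0 = np.trace(A) / n                 # degenerate initial interval (4.5)
tol = 1.0e-5
```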

6. Conclusion

The paper describes a block generalization of the modified Chebyshev algorithm and discusses an application to an adaptive Richardson iteration method for the solution of symmetric linear systems of equations with $s > 1$ right-hand side vectors. Numerical examples compare this iterative method with an adaptive Richardson iterative scheme based on the (scalar) modified Chebyshev algorithm, in which the linear system of equations is treated as $s$ linear systems of equations, each with a single vector as right-hand side. The examples show that the adaptive iterative method based on the block modified Chebyshev algorithm can yield convergence within considerably fewer iterations.


References

1. Anderson, E., Bai, Z., Bischof, C., Demmel, J., Dongarra, J., Du Croz, J., Greenbaum, A., Hammarling, S., McKenney, A., Ostrouchov, S., Sorensen, D. (1992): LAPACK Users' Guide. SIAM, Philadelphia
2. Arioli, M., Duff, I., Ruiz, D., Sadkane, M. (1992): Techniques for accelerating the block Cimmino method. In: Dongarra, J., Kennedy, K., Messina, P., Sorensen, D.C., Voigt, R.G., eds., Parallel Processing for Scientific Computing, pp. 98–104. SIAM, Philadelphia
3. Calvetti, D., Golub, G.H., Reichel, L. (1994): Gaussian quadrature applied to adaptive Chebyshev iteration. In: Golub, G., Greenbaum, A., Luskin, M., eds., Recent Advances in Iterative Methods, pp. 31–44. Springer, New York
4. Calvetti, D., Reichel, L. (1992): Adaptive Richardson iteration based on Leja points. Report, Institute for Computational Mathematics, Kent State University, Kent, OH
5. Calvetti, D., Reichel, L. (1993): An adaptive Richardson iteration method for indefinite linear systems. Report, Institute for Computational Mathematics, Kent State University, Kent, OH
6. Dongarra, J.J., Duff, I.S., Sorensen, D.C., van der Vorst, H.A. (1991): Solving Linear Systems on Vector and Shared Memory Computers. SIAM, Philadelphia
7. Edrei, A. (1939): Sur les déterminants récurrents et les singularités d'une fonction donnée par son développement de Taylor. Compositio Math. 7, 20–88
8. Eiermann, M., Niethammer, W., Varga, R.S. (1985): A study of semiiterative methods for nonsymmetric systems of linear equations. Numer. Math. 47, 505–533
9. Gautschi, W. (1982): On generating orthogonal polynomials. SIAM J. Sci. Stat. Comput. 3, 289–317
10. Golub, G.H., Gutknecht, M.H. (1990): Modified moments for indefinite weight functions. Numer. Math. 57, 607–624
11. Golub, G.H., Kent, M.D. (1989): Estimates of eigenvalues for iterative methods. Math. Comp. 53, 619–626
12. Golub, G.H., Underwood, R. (1977): The block Lanczos method for computing eigenvalues. In: Rice, J.R., ed., Mathematical Software III, pp. 361–377. Academic Press, New York
13. Golub, G.H., Van Loan, C.F. (1989): Matrix Computations, 2nd ed. Johns Hopkins University Press, Baltimore
14. Gragg, W.B. (1974): Matrix interpretations and applications of the continued fraction algorithm. Rocky Mountain J. Math. 4, 213–225
15. Horn, R.A., Johnson, C.R. (1985): Matrix Analysis. Cambridge University Press, Cambridge
16. Leja, F. (1957): Sur certaines suites liées aux ensembles plans et leur application à la représentation conforme. Ann. Polon. Math. 4, 8–13
17. O'Leary, D.P. (1980): The block conjugate gradient algorithm and related methods. Linear Algebra Appl. 29, 293–322
18. O'Leary, D.P. (1987): Parallel implementation of the block conjugate gradient algorithm. Parallel Computing 5, 127–139
19. Parlett, B.N. (1980): A new look at the Lanczos algorithm for solving symmetric systems of linear equations. Linear Algebra Appl. 29, 323–346
20. Parlett, B.N. (1980): The Symmetric Eigenvalue Problem. Prentice Hall, Englewood Cliffs
21. Reichel, L. (1991): The application of Leja points to Richardson iteration and polynomial preconditioning. Linear Algebra Appl. 154–156, 389–414
22. Ruiz, D.F. (1992): Résolution de grands systèmes linéaires creux non symétriques par une méthode itérative par blocs dans un environnement multiprocesseur. Ph.D. thesis, Report TH/PA/92/6, CERFACS, Toulouse, France
23. Saad, Y. (1987): On the Lanczos method for solving symmetric linear systems with several right-hand sides. Math. Comp. 48, 651–662
24. Saylor, P.E. (1988): Leapfrog variants of iterative methods for linear algebraic equations. J. Comput. Appl. Math. 24, 169–193
25. van der Vorst, H.A. (1987): An iterative solution method for solving f(A)x = b, using Krylov subspace information obtained for the symmetric positive matrix A. J. Comput. Appl. Math. 18, 249–263
