
Mathematical Programming 31 (1985) 15-24 North-Holland

CONVERGENCE PROPERTIES OF RELAXATION ALGORITHMS

Gerard G.L. MEYER
Electrical Engineering and Computer Science Department, The Johns Hopkins University, Baltimore, MD 21218, USA

Received 3 November 1983
Revised manuscript received 3 July 1984

Nonadaptive relaxation algorithms require strong continuity assumptions and adaptive relaxation algorithms are computationally costly. To remedy that situation, an anti-jamming procedure similar to the one used by the author for the method of feasible directions is proposed. The resulting algorithms are compared with the existing ones for solving unconstrained optimization problems in $E^n$.

Key words: Anti-Jamming, Coordinate Descent, Univariate Search, Cyclic Relaxation.

1. Introduction

In [8] and [9], the relaxation algorithms originally proposed by Jacobi and Seidel, and used in [13] and [15] for solving linear systems of equations, have been extended so that nonlinear systems could be solved. Relaxation methods have also been used on structured optimization problems. For example, in [1], [4] and [14] the problem is that of minimizing a functional on a product space; in [3] various coordinate descent algorithms are analyzed using Zangwill's idea of composition of point-to-set maps; and in [2] Fiorot and Huard present a theory for analyzing relaxation algorithms for optimization.

Given a closed subset $T$ of $E^n$ and a collection of maps $a_j(\cdot)$ from $T$ into $T$, $j = 1, 2, \ldots, p$, let $C_j$ be the set of fixed points of the map $a_j(\cdot)$ in $T$, i.e.,

$$C_j = \{z \in T \mid a_j(z) = z\}.$$

The goal of a relaxation algorithm is to find points that are in the intersection of the fixed point sets $C_j$, $j = 1, 2, \ldots, p$, i.e., points in the set $C_\times$, where

$$C_\times = C_1 \cap C_2 \cap \cdots \cap C_p.$$
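As a rough illustration of the fixed point sets $C_j$ and their intersection, the following Python sketch uses two maps on the plane that are not taken from the paper: projection onto the horizontal axis and projection onto the diagonal. The names `project_onto_axis`, `project_onto_diagonal` and `relax` are hypothetical. Each map's fixed point set is the line it projects onto, so the intersection is the origin, and applying the two maps in a fixed cyclic order moves the iterate toward it.

```python
from typing import Callable, Sequence, Tuple

Point = Tuple[float, float]


def project_onto_axis(z: Point) -> Point:
    """Hypothetical map a_1: its fixed points are the line {(x, 0)}."""
    return (z[0], 0.0)


def project_onto_diagonal(z: Point) -> Point:
    """Hypothetical map a_2: its fixed points are the line {(x, x)}."""
    mean = 0.5 * (z[0] + z[1])
    return (mean, mean)


def relax(maps: Sequence[Callable[[Point], Point]], z0: Point, iterations: int) -> Point:
    """Apply one map per iteration, cycling through the maps in order."""
    z = z0
    for i in range(iterations):
        z = maps[i % len(maps)](z)
    return z


# The only common fixed point of the two maps is the origin.
z_final = relax([project_onto_axis, project_onto_diagonal], (4.0, 2.0), 60)
```

In this toy case both maps are continuous, so it says nothing about the difficulties that arise when continuity fails.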

Relaxation algorithms generate sequences in $T$ from an initial point in $T$ by using at each iteration one of the maps $a_j(\cdot)$. The manner in which that choice is made cannot be arbitrary if the algorithms are to exhibit the appropriate convergence behavior. In this paper, we shall examine the sequencings of the maps $a_j(\cdot)$ that lead to interesting convergence results when either one or both of the following hypotheses are satisfied by the maps $a_j(\cdot)$.

G.G.L. Meyer / Convergence properties of relaxation algorithms

Hypothesis 1. The maps $a_j(\cdot)$, $j = 1, 2, \ldots, p$, are continuous on $T$ with respect to $T$.

Hypothesis 2. A map $v(\cdot)$ from $T$ into $E$ exists such that for every $j = 1, 2, \ldots, p$, if $z$ is not in $C_j$, scalars $\epsilon_j(z) > 0$, $\delta_j(z) > 0$, and $\lambda_j(z)$ exist such that

$v(a_j(y))$ [...] $= 0,\ j = 1, 2, \ldots, n\}$, where $d_1, d_2, \ldots, d_n$ is a set of linearly independent vectors in $E^n$. For $j = 1, 2, \ldots, n$ and $z$ in $E^n$, let the sets $B_j(z)$ and the scalars $b_j(z)$ be defined by

$$B_j(z) = \{\beta \mid f(z + \beta d_j) = \min\{f(z + \alpha d_j) \mid \alpha \in E\}\},$$

$$b_j(z) \in B_j(z)$$

and

$$\|b_j(z)\| = \min\{\|\beta\| \mid \beta \in B_j(z)\}.$$

Note that if two quantities $\beta_1$ and $\beta_2$ exist such that

$$\|\beta_1\| = \|\beta_2\| = \min\{\|\beta\| \mid \beta \in B_j(z)\},$$

then we let $b_j(z) = \|\beta_1\|$.

Let $v(\cdot)$ be the map from $E^n$ into $E$ and for $j = 1, 2, \ldots, p$, let the maps $a_j(\cdot)$ from $E^n$ into $E^n$ be defined by

$$v(z) = f(z) \tag{2}$$

and

$$a_j(z) = z + b_j(z)\, d_j. \tag{3}$$
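For a strictly convex quadratic, the step $b_j(z)$ in (3) has a closed form, which gives one concrete instance of the map $a_j(\cdot)$. The sketch below assumes $f(z) = \tfrac{1}{2} z^T Q z - c^T z$ with $Q$ symmetric positive definite. This choice of $f$, and the names `step_length` and `relaxation_map`, are illustrative assumptions and do not appear in the paper.

```python
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def dot(u: Vector, v: Vector) -> float:
    return sum(ui * vi for ui, vi in zip(u, v))


def mat_vec(Q: Matrix, z: Vector) -> Vector:
    return [dot(row, z) for row in Q]


def step_length(Q: Matrix, c: Vector, z: Vector, d: Vector) -> float:
    """Unique minimizer of beta -> f(z + beta * d) for the assumed quadratic f.

    Setting the derivative d.(Q(z + beta d) - c) to zero gives
    beta = -d.(Qz - c) / (d.Qd); the denominator is positive because Q is
    assumed positive definite and d is nonzero.
    """
    gradient = [g - ci for g, ci in zip(mat_vec(Q, z), c)]
    return -dot(gradient, d) / dot(d, mat_vec(Q, d))


def relaxation_map(Q: Matrix, c: Vector, z: Vector, d: Vector) -> Vector:
    """One application of z -> z + b(z) d along the direction d."""
    beta = step_length(Q, c, z, d)
    return [zi + beta * di for zi, di in zip(z, d)]


Q = [[2.0, 1.0], [1.0, 2.0]]
c = [1.0, 1.0]
d1 = [1.0, 0.0]
z_next = relaxation_map(Q, c, [0.0, 0.0], d1)  # minimizes f along d1 from the origin
```

Because the minimizer along each direction is unique here, the set $B_j(z)$ is a singleton and the tie-breaking rule for $b_j(z)$ never comes into play. Applying the same map a second time leaves `z_next` unchanged, so `z_next` is a fixed point of that map.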


Lemma 6. If Hypothesis 4 is satisfied, the maps $a_j(\cdot)$ and $v(\cdot)$ defined in (2) and (3) satisfy Hypothesis 2.

Note that Hypothesis 4 does not imply the continuity of the maps $a_j(\cdot)$ given in (3). Thus, we may produce examples of maps $f(\cdot)$ which satisfy Hypothesis 4, but on which the nonadaptive algorithm fails when the maps $a_j(\cdot)$ defined in (3) are used: that is, nonadaptive algorithms may produce bounded sequences of points $\{z_i\}$ such that the corresponding sequences $\{\|\nabla f(z_i)\|\}$ are bounded away from 0 [2, 12]. It is possible to strengthen the assumptions on the map $f(\cdot)$ so that the corresponding maps $a_j(\cdot)$ are continuous.

Hypothesis 5. For every $z$ in $E^n$ and direction $d_j$, $j = 1, 2, \ldots, n$, the sets $B_j(z)$ contain at most one point.

The uniqueness assumption implied by Hypothesis 5 eliminates 'flat spots' along the directions $d_j$, $j = 1, 2, \ldots, n$ [2, Hypothesis H3, p. 72], [17, p. 112]. This is useful because the maps $a_j(\cdot)$ are continuous whenever Hypotheses 4 and 5 are satisfied.

Lemma 7. If Hypotheses 4 and 5 are satisfied, the maps $a_j(\cdot)$, $j = 1, 2, \ldots, n$, defined in (3) are continuous.

Using the results of the preceding sections, we may now present the convergence properties of relaxation algorithms which use the maps $a_j(\cdot)$ defined in (3).

Assume that $f(\cdot)$ satisfies Hypotheses 4 and 5. In that case we may use Algorithm 1. If $m(\cdot)$ satisfies Hypothesis 3, then every cluster point of every sequence generated by the algorithm is a solution of Problem 1. If $m(\cdot)$ satisfies Hypothesis 3 and if Problem 1 possesses at most a countable number of solutions, then every sequence generated by Algorithm 1 converges to one of them.

Assume that $f(\cdot)$ satisfies Hypothesis 4 only. In that case, we cannot use a nonadaptive algorithm, and therefore must use Algorithms 2, 3 or 4. If we use Algorithm 2, every cluster point of every sequence generated by the algorithm is a solution of Problem 1. If Problem 1 possesses only one solution, then every sequence generated by Algorithm 2 converges to that solution. If we use Algorithm 3, at least one cluster point of every sequence generated by the algorithm is a solution of Problem 1. If we use Algorithm 4, every cluster point of every sequence $\{y_i\}$ generated by the algorithm is a solution of Problem 1, and if Problem 1 possesses only one solution, every sequence $\{y_i\}$ generated by Algorithm 4 converges to it.

When Algorithm 1 uses the maps $a_j(\cdot)$ defined in (3) and the map $m(\cdot)$ defined by $m(1) = 1$, $m(2) = 2$, $\ldots$, $m(p) = p$, $m(p+1) = 1$, $m(p+2) = 2$, etc., the resultant method is called: the 'cyclic coordinate descent algorithm' [3, p. 158], [17, p. 111]; the 'univariate search method' [10, p. 292]; the 'method that changes one variable at a time' [12, p. 194]; or the 'cyclic univariate relaxation method' [2], [9, p. 244].
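The cyclic ordering $m(1) = 1$, $m(2) = 2$, $\ldots$, $m(p+1) = 1$ combined with exact minimization along coordinate directions can be sketched in a few lines of Python. This is a minimal sketch rather than a transcription of Algorithm 1: it assumes the quadratic $f(z) = \tfrac{1}{2} z^T Q z - c^T z$ with $Q$ symmetric positive definite and takes $d_j$ to be the $j$-th unit vector. The function names `cyclic_index` and `cyclic_coordinate_descent` are hypothetical.

```python
from typing import List


def cyclic_index(i: int, p: int) -> int:
    """Map index m(i) for i = 1, 2, ...: 1, 2, ..., p, 1, 2, ..."""
    return (i - 1) % p + 1


def cyclic_coordinate_descent(
    Q: List[List[float]], c: List[float], z0: List[float], iterations: int
) -> List[float]:
    """Minimize the assumed quadratic f exactly along one coordinate per iteration."""
    z = list(z0)
    n = len(z)
    for i in range(1, iterations + 1):
        j = cyclic_index(i, n) - 1  # zero-based coordinate for this iteration
        # Partial derivative of f with respect to z_j at the current point.
        partial = sum(Q[j][k] * z[k] for k in range(n)) - c[j]
        # Exact line minimization along the j-th unit vector (Q[j][j] > 0 is assumed).
        z[j] -= partial / Q[j][j]
    return z


Q = [[2.0, 1.0], [1.0, 2.0]]
c = [1.0, 1.0]
z_star = cyclic_coordinate_descent(Q, c, [0.0, 0.0], 100)  # approaches the solution of Qz = c
```

For this particular quadratic the iteration coincides with the Gauss-Seidel method applied to $Qz = c$, whose solution is $(1/3, 1/3)$.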


6. Conclusion

Work is in progress to extend the results presented in this paper to additional classes of algorithm components. First, although point-to-point maps have been assumed here, many potentially useful algorithms employ point-to-set maps. While Algorithm 1 cannot use point-to-set maps because of the requirement for continuity, it should be possible to generalize Algorithms 2, 3 and 4 so that nondeterministic components may be used. Second, each component map $a_j(\cdot)$ is autonomous, and it may be possible to extend the techniques developed here to treat both point-to-point and point-to-set nonautonomous maps. Finally, we have assumed that the quantity $m(i)$ is chosen deterministically at each iteration. An interesting issue is the feasibility of choosing the map index, $m(i)$, in a random fashion. By making a random component selection at each iteration, continuity and the computationally intensive schemes for eliminating the continuity requirement may not be necessary.
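To make the idea of a random map index concrete, the sketch below draws the coordinate uniformly at random at each iteration. It is only an illustration of what such a selection rule could look like, under the assumptions that $f(z) = \tfrac{1}{2} z^T Q z - c^T z$ with $Q$ symmetric positive definite and that $d_j$ is the $j$-th unit vector. It makes no claim about the convergence questions raised above, and the name `random_coordinate_descent` and the `seed` parameter are hypothetical.

```python
import random
from typing import List


def random_coordinate_descent(
    Q: List[List[float]],
    c: List[float],
    z0: List[float],
    iterations: int,
    seed: int = 0,
) -> List[float]:
    """Exact coordinate minimization with the index drawn uniformly at random."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    z = list(z0)
    n = len(z)
    for _ in range(iterations):
        j = rng.randrange(n)  # random choice replacing a deterministic m(i)
        partial = sum(Q[j][k] * z[k] for k in range(n)) - c[j]
        z[j] -= partial / Q[j][j]  # exact minimizer along the j-th unit vector
    return z


Q = [[2.0, 1.0], [1.0, 2.0]]
c = [1.0, 1.0]
z_random = random_coordinate_descent(Q, c, [0.0, 0.0], 400)
```

On this smooth, strictly convex example the random rule reaches the same minimizer as a cyclic rule; the open question in the text concerns maps that are not continuous, which this example does not exercise.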

Appendix: Proofs for Section 4

Proof of Lemma 3. Let $z_*$ be a cluster point of a sequence $\{z_i\}$ generated by Algorithm 3. An infinite subset $K$ of the integers exists so that the subsequence $\{z_i\}_K$ converges to $z_*$. The construction of the sequence $\{p_i\}$ implies that $0 \leq p$ [...] $0$. Assume that $p_* > 0$. An index $k_1$ exists such that $p_i = p_*$ for every $i \geq k_1$. Further, the map index set [...]$(i, p_i)$ is a strict subset of $\{1, 2, \ldots, p\}$ for every $i \geq k_1$, permitting us to choose an index $m(i)$ for every $i \geq k_1$. Let $k$ be an index such that $k \geq k_1$, $p_i = p_*$, and $\|z_i - z_*\|$ [...] $\geq k$, $i$ in $K$. Let $i_0$ be the smallest index in $K$ such that $i_0 \geq k$, and let $m(i_0)$ be the map index used by the algorithm at that point. Let $i_1$ be the smallest index in $K$ such that $i_1 > i_0$. The point $z_{i_1}$ satisfies

$\|z_{i_1}$ [...] $\|$ [...]

and therefore [...]$(i_1, p_{i_1})$ contains the map index $m(i_0)$. By construction, the map index $m(i_1)$ used at iteration $i_1$ is not equal to $m(i_0)$. Let $i_2$ be the smallest index in $K$ such that $i_2 > i_1$. The point $z_{i_2}$ satisfies $\|z_{i_2} - z$ [...]