INSTITUTE OF PHYSICS PUBLISHING
Inverse Problems 22 (2006) 311–329
doi:10.1088/0266-5611/22/1/017

Nonlinear iterative methods for linear ill-posed problems in Banach spaces

F Schöpfer¹, A K Louis and T Schuster

Fachrichtung Mathematik, Geb. 36, Universität des Saarlandes, 66041 Saarbrücken, Germany

E-mail: [email protected], [email protected] and [email protected]

Received 23 August 2005, in final form 13 December 2005
Published 30 January 2006
Online at stacks.iop.org/IP/22/311

Abstract
We introduce and discuss nonlinear iterative methods to recover the minimum-norm solution of the operator equation Ax = y in Banach spaces X, Y, where A is a continuous linear operator from X to Y. The methods are nonlinear due to the use of duality mappings, which reflect the geometrical aspects of the underlying spaces. The space X is required to be smooth and uniformly convex, whereas Y can be an arbitrary Banach space. The cases of exact as well as approximate and disturbed data and operator are taken into consideration, and we prove strong convergence of the sequence of iterates.

1. Introduction

Let X and Y be Banach spaces and $A : X \to Y$ a continuous linear operator. We discuss the problem of iteratively recovering a solution of the operator equation
\[
Ax = y. \qquad (1.1)
\]

Problem (1.1) may be ill posed, i.e. the solution (if it exists) need not be unique (e.g., when A has a non-trivial kernel), nor need it depend continuously on the right-hand side, so that small perturbations of the data can result in arbitrarily large deviations from the solution (which happens when A is a compact operator in the infinite-dimensional case). In the Hilbert space setting this problem has been thoroughly studied and many methods of solution and regularization have been established (see for instance [14, 9, 3], to mention just a few references).

¹ Author to whom all correspondence is to be sent.


Among the first iterative algorithms in Hilbert spaces are variations of the Landweber iteration [12],
\[
x_{n+1} = x_n - \mu_n A^*(Ax_n - y), \qquad (1.2)
\]
with $x_0 \in X$ and $\mu_n > 0$ chosen appropriately.
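For orientation, here is a minimal numerical sketch of the classical iteration (1.2) for a matrix operator between Euclidean spaces; the operator, data and step size below are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def landweber(A, y, mu, n_iter=500, x0=None):
    """Classical Landweber iteration x_{n+1} = x_n - mu * A^T (A x_n - y)."""
    x = np.zeros(A.shape[1]) if x0 is None else x0.copy()
    for _ in range(n_iter):
        x = x - mu * A.T @ (A @ x - y)
    return x

# Illustrative (assumed) data: a small consistent linear system.
rng = np.random.default_rng(0)
A = rng.standard_normal((20, 10))
y = A @ rng.standard_normal(10)
mu = 1.0 / np.linalg.norm(A, 2) ** 2   # 0 < mu < 2 / ||A||^2 keeps the iteration stable
x_rec = landweber(A, y, mu)
print(np.linalg.norm(A @ x_rec - y))
```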

Some of these methods have also been transferred to the case of general Banach spaces. To our knowledge, the linear iterative methods are restricted to the case $Y = X$, and the operator A has to fulfil certain resolvent conditions which may be difficult to verify in concrete situations [3, 16, 11]. Recently, some nonlinear iterative methods have been proposed in the context of Bregman projections and the minimization of convex functionals, and they were shown to be weakly convergent [5, 15]. In this paper, we consider nonlinear generalizations of (1.2) of the form
\[
J_X(x_{n+1}) = J_X(x_n) - \mu_n A^* J_Y(Ax_n - y), \qquad x_{n+1} = J_{X^*}(J_X(x_{n+1})),
\]
where $J_X : X \to 2^{X^*}$ is a so-called duality mapping (which also plays a key role in the theory of monotone and accretive mappings [1, 2, 7, 20, 21]). These methods only require the space X to be smooth and uniformly convex, whereas Y can be an arbitrary Banach space, and we prove strong convergence. In the next section, we provide the necessary theoretical tools and apply them in section 3 to describe the methods and prove their convergence. We conclude with some numerical experiments.

2. Preliminaries

Throughout the paper let X and Y be real Banach spaces with duals $X^*$ and $Y^*$. Their norms will be denoted by $\|\cdot\|$; we omit indices indicating the space, since it will be clear from the context which one is meant. For $x \in X$ and $x^* \in X^*$ we write $\langle x, x^* \rangle = \langle x^*, x \rangle = x^*(x)$. By $L(X, Y)$ we denote the space of all continuous linear operators $A : X \to Y$ and write $A^*$ for the dual operator $A^* \in L(Y^*, X^*)$ and $\|A\| = \|A^*\|$ for the operator norm of A. For real numbers a, b we write
\[
a \vee b = \max\{a, b\}, \qquad a \wedge b = \min\{a, b\}.
\]
Also, let $p, q, r, s \in (1, \infty)$ be conjugate exponents, so that
\[
\frac{1}{p} + \frac{1}{q} = 1 \qquad \text{and} \qquad \frac{1}{r} + \frac{1}{s} = 1.
\]

2.1. Geometry, duality mapping and minimum-norm solution

There is a tight connection between certain geometrical aspects of Banach spaces, such as convexity and smoothness, and the so-called duality mapping. We give a short survey of what we will need in the following; a detailed introduction to this topic can be found in [8]. The functions $\delta_X : [0, 2] \to [0, 1]$ and $\rho_X : [0, \infty) \to [0, \infty)$ defined by
\[
\delta_X(\epsilon) = \inf\left\{ 1 - \left\| \tfrac{1}{2}(x + y) \right\| : \|x\| = \|y\| = 1,\ \|x - y\| \geq \epsilon \right\}
\]
and
\[
\rho_X(\tau) = \tfrac{1}{2} \sup\left\{ \|x + y\| + \|x - y\| - 2 : \|x\| = 1,\ \|y\| \leq \tau \right\}
\]
are referred to as the modulus of convexity and the modulus of smoothness, respectively. $\delta_X$ is continuous and nondecreasing with $\delta_X(0) = 0$; $\rho_X$ is continuous, convex and nondecreasing with $\rho_X(0) = 0$ and $\rho_X(\tau) \leq \tau$. The function $\tau \mapsto \rho_X(\tau)/\tau$ is nondecreasing and fulfils $\rho_X(\tau)/\tau > 0$ for all $\tau > 0$. For every Hilbert space H, we have $\delta_H(\epsilon) = 1 - \sqrt{1 - \epsilon^2/4}$ and $\rho_H(\tau) = \sqrt{1 + \tau^2} - 1$, and $\delta_X(\epsilon) \leq \delta_H(\epsilon)$, $\rho_X(\tau) \geq \rho_H(\tau)$ for arbitrary Banach spaces X [10, 13].

Definition 2.1. A Banach space X is said to be
(a) uniformly convex if $\delta_X(\epsilon) > 0$ for any $\epsilon \in (0, 2]$,
(b) uniformly smooth if $\lim_{\tau \to 0} \rho_X(\tau)/\tau = 0$,
(c) strictly convex if $\|\lambda x + (1 - \lambda)y\| < 1$ for all $\lambda \in (0, 1)$ and $x, y \in X$ with $\|x\| = \|y\| = 1$ and $x \neq y$, i.e. the boundary of the unit ball contains no line segment,
(d) smooth if for every $0 \neq x \in X$ there is a unique $x^* \in X^*$ such that $\|x^*\| = 1$ and $\langle x, x^* \rangle = \|x\|$.

Example 2.2. $L^p$ spaces ($1 < p < \infty$) are known to be both uniformly convex and uniformly smooth [13], with
\[
\delta_{L^p}(\epsilon) \geq \begin{cases} \dfrac{p-1}{8}\,\epsilon^2, & 1 < p < 2, \\[1ex] \dfrac{1}{p\,2^p}\,\epsilon^p, & 2 \leq p < \infty, \end{cases}
\qquad
\rho_{L^p}(\tau) \leq \begin{cases} \dfrac{1}{p}\,\tau^p, & 1 < p < 2, \\[1ex] \dfrac{p-1}{2}\,\tau^2, & 2 \leq p < \infty. \end{cases}
\]
The modulus of convexity $\delta_X$ (resp. the modulus of smoothness $\rho_X$) is said to be of power type p if there is a constant $C > 0$ such that
\[
\delta_X(\epsilon) \geq C\epsilon^p \qquad \text{resp.} \qquad \rho_X(\tau) \leq C\tau^p. \qquad (2.3)
\]
It is a fact that $\delta_X$ is of power type p iff $\rho_{X^*}$ is of power type q.

The mapping $J_p : X \to 2^{X^*}$ defined by
\[
J_p(x) = \{ x^* \in X^* : \langle x^*, x \rangle = \|x\|\,\|x^*\|,\ \|x^*\| = \|x\|^{p-1} \} \qquad (2.4)
\]
is called the duality mapping of X with gauge function $t \mapsto t^{p-1}$. By $j_p$ we denote a single-valued selection of $J_p$ (i.e., $j_p(x) \in J_p(x)$ for every $x \in X$). We summarize a few facts about the duality mapping.

Theorem 2.3.
(a) For every $x \in X$ the set $J_p(x)$ is nonempty, convex and weakly closed in $X^*$.
(b) $J_p(-x) = -J_p(x)$ and $J_p(\lambda x) = \lambda^{p-1} J_p(x)$ for all $x \in X$ and $\lambda > 0$.
(c) $J_p$ is monotone, i.e. $\langle x^* - y^*, x - y \rangle \geq 0$ for all $x, y \in X$ and $x^* \in J_p(x)$, $y^* \in J_p(y)$.

Example 2.4.
(a) In the case of a Hilbert space, $J_2$ is just the identity mapping.
(b) In $L^r$ spaces, we have
\[
J_p(x) = \|x\|_r^{p-r}\, |x|^{r-1}\, \mathrm{sgn}(x), \qquad x \in L^r,
\]
which is to be understood pointwise (resp. componentwise).


(c) A single-valued selection for the duality mapping in $\mathbb{R}^N$ with the supremum norm $\|x\|_\infty = \max_{1 \leq n \leq N} |x_n|$, $x = (x_n)_{n=1}^N \in \mathbb{R}^N$, is given by
\[
(j_p(x))_n = \|x\|_\infty^{p-1}\, \mathrm{sgn}(x_k)\, \delta_{n,k}, \qquad n = 1, \ldots, N,
\]
where k is an index with $|x_k| = \|x\|_\infty$.
(d) If we equip $\mathbb{R}^N$ with the $L^1$-norm $\|x\|_1 = \sum_{n=1}^N |x_n|$, we may choose
\[
(j_p(x))_n = \|x\|_1^{p-1}\, \mathrm{sgn}(x_n), \qquad n = 1, \ldots, N.
\]
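The selections in example 2.4 translate directly into code. The following sketch is an illustration only: the finite-dimensional r-norm stands in for $L^r$, and the function names are ours.

```python
import numpy as np

def jp_lr(x, p, r):
    """Duality mapping with gauge t -> t^(p-1) for the r-norm, cf. example 2.4(b):
    J_p(x) = ||x||_r^(p-r) * |x|^(r-1) * sgn(x), componentwise."""
    nrm = np.linalg.norm(x, r)
    if nrm == 0:
        return np.zeros_like(x)
    return nrm ** (p - r) * np.abs(x) ** (r - 1) * np.sign(x)

def jp_sup(x, p):
    """Selection for the supremum norm, cf. example 2.4(c): all weight on one maximizing index."""
    out = np.zeros_like(x)
    k = int(np.argmax(np.abs(x)))
    out[k] = np.abs(x[k]) ** (p - 1) * np.sign(x[k])
    return out

def jp_l1(x, p):
    """Selection for the 1-norm, cf. example 2.4(d)."""
    return np.linalg.norm(x, 1) ** (p - 1) * np.sign(x)

# Quick check of the defining property (2.4): <j_p(x), x> = ||x||^p.
x = np.array([1.0, -2.0, 0.5])
p, r = 2.0, 4.0
print(np.dot(jp_lr(x, p, r), x), np.linalg.norm(x, r) ** p)   # the two numbers should agree
```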

The following theorem is contained in [8, chapters 1, 2] and [18, theorem 1.1.17] and shows that convexity and smoothness are dual concepts.

Theorem 2.5.
(a) X is uniformly convex (resp. uniformly smooth) iff $X^*$ is uniformly smooth (resp. uniformly convex).
(b) If X is uniformly convex then X is reflexive and strictly convex.
(c) If X is uniformly smooth then X is reflexive and smooth.
(d) Let X be reflexive. Then X is strictly convex (resp. smooth) iff $X^*$ is smooth (resp. strictly convex).
(e) X is strictly convex iff every duality mapping $J_p$ of X is strictly monotone, i.e. $\langle x^* - y^*, x - y \rangle > 0$ for all $x, y \in X$ with $x \neq y$ and $x^* \in J_p(x)$, $y^* \in J_p(y)$.
(f) X is smooth iff every duality mapping $J_p$ of X is single valued.
(g) If X is reflexive, strictly convex and smooth then $J_p$ is single valued, norm-to-weak continuous and bijective. The inverse $J_p^{-1} : X^* \to X$ is given by $J_p^{-1} = J_q^*$, with $J_q^*$ being the duality mapping of $X^*$ with gauge function $t \mapsto t^{q-1}$.
(h) Let $M \neq \emptyset$ be a closed convex subset of X. If X is uniformly convex then there exists a unique $x \in M$ such that
\[
\|x\| = \inf_{z \in M} \|z\|.
\]
If in addition X is smooth then $\langle J_p(x), x \rangle \leq \langle J_p(x), z \rangle$ for all $z \in M$.

Remark 2.6. Smoothness is also related to differentiability of the norm:
(a) X is smooth iff the norm is Gâteaux differentiable on $X \setminus \{0\}$.
(b) X is uniformly smooth iff the norm is uniformly Fréchet differentiable on the unit sphere.

The next two theorems [19] provide us with inequalities which will be of great relevance for proving the convergence of our method.

Theorem 2.7. If X is uniformly convex then for all $x, y \in X$
\[
\|x - y\|^p \geq \|x\|^p - p\langle j_p(x), y \rangle + \sigma_p(x, y) \qquad (2.5)
\]
with
\[
\sigma_p(x, y) = p K_p \int_0^1 \frac{(\|x - ty\| \vee \|x\|)^p}{t}\, \delta_X\!\left( \frac{t\|y\|}{2(\|x - ty\| \vee \|x\|)} \right) dt, \qquad (2.6)
\]
where
\[
K_p = 4(2 + \sqrt{3}) \min\left\{ \tfrac{1}{2}p(p-1) \wedge 1,\ \left(\tfrac{1}{2}p \wedge 1\right)(p-1),\ (p-1)\left(1 - (\sqrt{3}-1)^q\right),\ 1 - \left(1 + (2 - \sqrt{3})q\right)^{1-p} \right\}. \qquad (2.7)
\]


Theorem 2.8. If X is uniformly smooth then for all $x, y \in X$
\[
\|x - y\|^p \leq \|x\|^p - p\langle J_p(x), y \rangle + \tilde{\sigma}_p(x, y) \qquad (2.8)
\]
with
\[
\tilde{\sigma}_p(x, y) = p\, G_p \int_0^1 \frac{(\|x - ty\| \vee \|x\|)^p}{t}\, \rho_X\!\left( \frac{t\|y\|}{\|x - ty\| \vee \|x\|} \right) dt, \qquad (2.9)
\]
where $G_p = 8 \vee 64\, c\, K_p^{-1}$, with $K_p$ defined according to (2.7) and $c > 0$ an absolute constant, determined in [19] in terms of $\tau_0 = \frac{\sqrt{339} - 18}{30}$.

Remark 2.9.
(a) For $p = 2$ in a real Hilbert space, these inequalities reduce to the well-known polarization identity
\[
\|x - y\|^2 = \|x\|^2 - 2\langle x, y \rangle + \|y\|^2.
\]
(b) In fact the above inequalities completely characterize uniformly smooth (resp. uniformly convex) Banach spaces [19].

Since we are interested in the minimum-norm solution of (1.1), i.e. the unique $x \in X$ such that
\[
Ax = y \qquad \text{and} \qquad \|x\| = \inf\{\|z\| : z \in X,\ Az = y\}, \qquad (2.10)
\]
we will need the following lemma.

Lemma 2.10. Let X be smooth and uniformly convex and $y \in \mathcal{R}(A)$.
(a) There exists the minimum-norm solution x of (1.1) and $J_p(x) \in \overline{\mathcal{R}(A^*)}$.
(b) If $x \in X$ is the minimum-norm solution of (1.1) and $\tilde{x} \in X$ fulfils $J_p(\tilde{x}) \in \overline{\mathcal{R}(A^*)}$ and $x - \tilde{x} \in \mathcal{N}(A)$, then $\tilde{x} = x$.

Proof. The set $M := \{z \in X : Az = y\}$ is a nonempty closed convex subset of X since $y \in \mathcal{R}(A)$ and A is a continuous linear operator. Theorem 2.5(h) then guarantees the existence (and uniqueness) of the minimum-norm solution x of (1.1). Now let z be an arbitrary element of $\mathcal{N}(A)$. Then $(x \pm z) \in M$ and, by theorem 2.5(h),
\[
\langle J_p(x), x \rangle \leq \langle J_p(x), x \pm z \rangle = \langle J_p(x), x \rangle \pm \langle J_p(x), z \rangle.
\]
It follows that $\langle J_p(x), z \rangle = 0$, and therefore $J_p(x) \in (\mathcal{N}(A))^\perp = \overline{\mathcal{R}(A^*)}$, which proves the rest of (a). If $\tilde{x} \in X$ is as in (b), then by (a) we can find a sequence $u_n \in Y^*$ such that $J_p(x) - J_p(\tilde{x}) = \lim_{n\to\infty} A^* u_n$. Then
\[
\langle J_p(x) - J_p(\tilde{x}), x - \tilde{x} \rangle = \lim_{n\to\infty} \langle A^* u_n, x - \tilde{x} \rangle = \lim_{n\to\infty} \langle u_n, A(x - \tilde{x}) \rangle = 0,
\]
since $x - \tilde{x} \in \mathcal{N}(A)$. Because of theorem 2.5(e) we conclude that $\tilde{x} = x$. □


2.2. Bregman distance

It turned out that, due to the geometrical characteristics of Banach spaces other than Hilbert spaces, it is more appropriate to use Bregman distances instead of conventional Ljapunov functionals like $\|x - y\|^p$ or $\|j_p(x) - j_p(y)\|^q$ to prove convergence of regularizing algorithms [1, 2, 6, 4, 17, 5, 15]:
\[
\Delta_p(x, y) := \frac{1}{p}\|y\|^p - \frac{1}{p}\|x\|^p - \inf\{\langle \xi, y - x \rangle : \xi \in J_p(x)\}, \qquad x, y \in X. \qquad (2.11)
\]
Because of theorem 2.5(f) and (2.4), in smooth Banach spaces this can be written as
\[
\Delta_p(x, y) = \frac{1}{q}\|x\|^p + \frac{1}{p}\|y\|^p - \langle J_p(x), y \rangle
= \frac{1}{q}\left(\|x\|^p - \|y\|^p\right) + \langle J_p(y) - J_p(x), y \rangle, \qquad x, y \in X. \qquad (2.12)
\]
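Formula (2.12) can be evaluated directly. A small illustration for the finite-dimensional r-norm, reusing the selection $j_p$ from example 2.4(b); the concrete vectors are made up.

```python
import numpy as np

def jp_lr(x, p, r):
    # duality mapping J_p for the r-norm, cf. example 2.4(b)
    s = np.linalg.norm(x, r)
    return np.zeros_like(x) if s == 0 else s ** (p - r) * np.abs(x) ** (r - 1) * np.sign(x)

def bregman(x, y, p, r):
    """Delta_p(x, y) = (1/q)||x||^p + (1/p)||y||^p - <J_p(x), y>, cf. (2.12)."""
    q = p / (p - 1)
    return (np.linalg.norm(x, r) ** p / q
            + np.linalg.norm(y, r) ** p / p
            - float(np.dot(jp_lr(x, p, r), y)))

x = np.array([1.0, 2.0, -1.0])
y = np.array([0.5, 1.5, 0.0])
print(bregman(x, y, p=2.0, r=3.0))   # nonnegative; zero iff the arguments coincide
print(bregman(x, x, p=2.0, r=3.0))   # ~ 0 up to rounding
```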

Remark 2.11. $\Delta_p$ is not a metric. In a real Hilbert space, $\Delta_2(x, y) = \frac{1}{2}\|x - y\|^2$.

We now summarize and prove a few facts concerning $\Delta_p$ and its relationship to the norm in X (see also [1, 2, 4]).

Theorem 2.12. Let X be smooth and uniformly convex. Then for all $x, y \in X$ and sequences $(x_n)_n$ in X the following holds:
(a) $\Delta_p(x, y) \geq 0$ and $\Delta_p(x, y) = 0 \Leftrightarrow x = y$.
(b) $\lim_{\|x_n\| \to \infty} \Delta_p(x_n, x) = \infty$, i.e. the sequence $(x_n)_n$ remains bounded if the sequence $(\Delta_p(x_n, x))_n$ is bounded.
(c) $\Delta_p$ is continuous in both arguments.
(d) The following are equivalent:
(i) $\lim_{n\to\infty} \|x_n - x\| = 0$,
(ii) $\lim_{n\to\infty} \|x_n\| = \|x\|$ and $\lim_{n\to\infty} \langle J_p(x_n), x \rangle = \langle J_p(x), x \rangle$,
(iii) $\lim_{n\to\infty} \Delta_p(x_n, x) = 0$.
(e) $(x_n)_n$ is a Cauchy sequence if it is bounded and for all $\epsilon > 0$ there exists an $n_0 \in \mathbb{N}$ such that $\Delta_p(x_k, x_l) < \epsilon$ for all $k, l \geq n_0$.

Proof. Equations (2.12) and (2.4) yield
\[
\Delta_p(x, y) \geq \frac{1}{p}\|y\|^p + \frac{1}{q}\|x\|^p - \|x\|^{p-1}\|y\| \geq 0,
\]
which proves the first part of (a) and (b). (c) and the implication (i) ⇒ (ii) in (d) are a consequence of theorem 2.5(b) and (g). The implication (ii) ⇒ (iii) follows directly from the definition of $\Delta_p$ (2.12). Let us prove (iii) ⇒ (i) (see also [4], proof of claim 2.2 and corollary 2.4); substituting $x - y$ for y in theorem 2.7, we arrive at
\[
p\,\Delta_p(x, y) = \|y\|^p + (p-1)\|x\|^p - p\langle J_p(x), y \rangle \geq \sigma_p(x, x - y).
\]
With the explicit expression for $\sigma_p$ (2.6), we have
\[
\frac{1}{pK_p}\,\sigma_p(x, x - y) = \int_0^1 \frac{(\|x - t(x - y)\| \vee \|x\|)^p}{t}\, \delta_X\!\left( \frac{t\|x - y\|}{2(\|x - t(x - y)\| \vee \|x\|)} \right) dt.
\]
Since for all $t \in [0, 1]$
\[
\|x - t(x - y)\| \vee \|x\| \leq \|x\| + \|x - y\|
\]
and $\|x - t(x - y)\| \vee \|x\| \geq \frac{t}{2}\|x - y\|$ (in the case $\|x\| \geq \frac{t}{2}\|x - y\|$ this is clear, and otherwise $\|x - t(x - y)\| \geq t\|x - y\| - \|x\| \geq \frac{t}{2}\|x - y\|$), and $\delta_X$ is nondecreasing and non-negative, we can estimate
\[
\frac{1}{pK_p}\,\sigma_p(x, x - y) \geq \frac{\|x - y\|^p}{2^p} \int_0^1 t^{p-1}\, \delta_X\!\left( \frac{t\|x - y\|}{2(\|x\| + \|x - y\|)} \right) dt
\geq \frac{\|x - y\|^p}{2^p} \int_{1/2}^1 t^{p-1}\, \delta_X\!\left( \frac{t\|x - y\|}{2(\|x\| + \|x - y\|)} \right) dt
\geq \frac{\|x - y\|^p}{2^p}\, \delta_X\!\left( \frac{\|x - y\|}{4(\|x\| + \|x - y\|)} \right) \frac{1}{p}\left(1 - \frac{1}{2^p}\right).
\]
Putting all together, we see that
\[
\Delta_p(x_n, x) \geq C\, \|x_n - x\|^p\, \delta_X\!\left( \frac{\|x_n - x\|}{4(\|x_n\| + \|x_n - x\|)} \right)
\]
with $C = \frac{K_p}{p\,2^p}\left(1 - \frac{1}{2^p}\right)$. If $\lim_{n\to\infty} \Delta_p(x_n, x) = 0$ then, by (b) of this theorem, there is a constant $R > 0$ such that $4(\|x_n\| + \|x_n - x\|) \leq R$ for all $n \in \mathbb{N}$. Suppose $\limsup_{n\to\infty} \|x_n - x\| > 0$. Then we can find an $\epsilon > 0$ and a subsequence $(x_{n_k})_k$ of $(x_n)_n$ so that $\|x_{n_k} - x\| \geq \epsilon$ for all $k \in \mathbb{N}$. Therefore, by the monotonicity of $\delta_X$ and the uniform convexity of X (definition 2.1(a)),
\[
\Delta_p(x_{n_k}, x) \geq C\, \epsilon^p\, \delta_X\!\left( \frac{\epsilon}{R} \right) > 0,
\]
which contradicts $\lim_{k\to\infty} \Delta_p(x_{n_k}, x) = 0$. An analogous argument proves (e), and the rest of (a) also follows from this. □

3. Discussion of the solution methods

Now we turn to the solution of (1.1) for a given operator $A \in L(X, Y)$, where X is assumed to be smooth and uniformly convex and Y can be an arbitrary (real) Banach space. By theorem 2.5(a), (b) and (d), X is then reflexive and the dual $X^*$ is strictly convex and uniformly smooth.

3.1. Exact data

First we consider the case of exact data $y \in \mathcal{R}(A)$. Let x be the minimum-norm solution of (1.1) (which exists according to lemma 2.10(a)). To recover x, we propose the following.

Method 3.1.
(0) If $y = 0$ then $x = 0$ and we are done, else we start with
(1) We fix $p, r \in (1, \infty)$, choose a constant
\[
C \in (0, 1) \qquad (3.1)
\]
and an initial vector $x_0 \in X$ such that
\[
J_p(x_0) \in \overline{\mathcal{R}(A^*)} \qquad \text{and} \qquad \Delta_p(x_0, x) \leq \frac{1}{p}\|x\|^p. \qquad (3.2)
\]
For $n = 0, 1, 2, \ldots$, we repeat the following step:


(2) We set
\[
R_n := \|Ax_n - y\|. \qquad (3.3)
\]
If $R_n = 0$ → STOP, else we choose the parameters as follows.
(a) In the case $x_0 = 0$, we set
\[
\mu_0 := C\, \frac{q^{p-1}}{\|A\|^p}\, R_0^{p-r}. \qquad (3.4)
\]
(b) For all $n \geq 0$ (resp. $n \geq 1$ if $x_0 = 0$) let
\[
\lambda_n := \left(\rho_{X^*}(1)\right) \wedge \left( \frac{C}{2^q G_q}\, \frac{R_n}{\|A\|\, \|x_n\|} \right),
\]
where $G_q > 0$ is the constant in (2.9). Since $X^*$ is uniformly smooth (definition 2.1(b)), we can find a $\tau_n \in (0, 1]$ with
\[
\frac{\rho_{X^*}(\tau_n)}{\tau_n} = \lambda_n. \qquad (3.5)
\]
Then we set
\[
\mu_n := \frac{\tau_n \|x_n\|^{p-1}}{\|A\|\, R_n^{r-1}}. \qquad (3.6)
\]
The iterates are defined by
\[
J_p(x_{n+1}) = J_p(x_n) - \mu_n A^* j_r(Ax_n - y), \qquad x_{n+1} = J_q^*(J_p(x_{n+1})). \qquad (3.7)
\]
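To make the steps above concrete, here is a minimal numerical sketch of method 3.1 for a matrix operator between $\mathbb{R}^M$ and $\mathbb{R}^N$ equipped with $\ell$-norms. The dimensions, norms, test operator, the value of $G_q$ and the power-type shortcut used to compute $\tau_n$ (it anticipates the simplification discussed in section 4) are all illustrative assumptions, not prescriptions of the paper.

```python
import numpy as np

def j_gauge(x, gauge, nrm):
    """Duality-mapping selection ||x||_nrm^(gauge-nrm) |x|^(nrm-1) sgn(x) for an ell^nrm norm."""
    s = np.linalg.norm(x, nrm)
    if s == 0:
        return np.zeros_like(x)
    return s ** (gauge - nrm) * np.abs(x) ** (nrm - 1) * np.sign(x)

def method_3_1(A, y, p=2.0, r=2.0, b=1.3, a=2.0, C=0.5, Gq=1.0, n_iter=200):
    """Sketch of method 3.1 with X = (R^N, ||.||_b) and Y = (R^M, ||.||_a).
    Gq and the power-type bound below are illustrative stand-ins for the constants in (2.9)."""
    q = p / (p - 1)
    bs = b / (b - 1)                          # exponent of the dual norm on X*
    normA = np.linalg.norm(A, 2)              # rough stand-in for the operator norm
    # assumed power-type bound rho_{X*}(tau) <= K tau^t for the ell^bs norm (cf. example 2.2)
    t, K = (2.0, (bs - 1) / 2.0) if bs >= 2 else (bs, 1.0 / bs)
    x = np.zeros(A.shape[1])                  # x0 = 0 is always admissible, cf. (3.2)
    Jx = np.zeros_like(x)                     # J_p(x0)
    for n in range(n_iter):
        res = A @ x - y
        Rn = np.linalg.norm(res, a)
        if Rn < 1e-12:                        # stopping rule R_n = 0, up to rounding
            break
        if n == 0:
            mu = C * q ** (p - 1) / normA ** p * Rn ** (p - r)            # (3.4)
        else:
            xb = np.linalg.norm(x, b)
            lam = min(K, C * Rn / (2 ** q * Gq * normA * xb))             # cf. lambda_n
            tau = min(1.0, (lam / K) ** (1.0 / (t - 1.0)))                # rho(tau)/tau <= lam, cf. (3.5)
            mu = tau * xb ** (p - 1) / (normA * Rn ** (r - 1))            # (3.6)
        Jx = Jx - mu * A.T @ j_gauge(res, r, a)                           # (3.7), update in X*
        x = j_gauge(Jx, q, bs)                                            # back to X via J_p^{-1}
    return x

# Illustrative (assumed) test problem; the exponent b determines the norm in which the
# minimum-norm solution is measured.
rng = np.random.default_rng(1)
A = rng.standard_normal((30, 60))
y = A @ np.concatenate([rng.standard_normal(5), np.zeros(55)])
x_rec = method_3_1(A, y)
print(np.linalg.norm(A @ x_rec - y))
```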

Remark 3.2.
(a) The choice of the initial vector $x_0$ (3.2) and the definition of the iterates (3.7) guarantee that $J_p(x_n) \in \overline{\mathcal{R}(A^*)}$ for all $n \in \mathbb{N}$. The choice $x_0 = 0$ is always possible since $\Delta_p(0, x) = \frac{1}{p}\|x\|^p$ (see (2.12)).
(b) If the stopping rule $R_n = 0$ is fulfilled for some $n \in \mathbb{N}$, then $\|A(x_n - x)\| = \|Ax_n - y\| = R_n = 0$ and thus $x - x_n$ lies in $\mathcal{N}(A)$. So by (a) and lemma 2.10(b), $x_n$ is our solution.
(c) In proving the convergence of the above method we will see that (3.2) also ensures that $x_n \neq 0$ for all $n \geq 1$, and thus the parameters $\tau_n$ (3.5) are always well defined.

Now we can prove the main theorem of this paper.

Theorem 3.3. Method 3.1 either stops after a finite number of iterations with the minimum-norm solution x of (1.1) or the sequence of iterates $(x_n)_n$ converges strongly to x.

Proof. If the method stops at step n with $R_n = 0$ we are done (remark 3.2(b)). Otherwise $R_n > 0$ for all $n \geq 0$. The proof of convergence in that case is structured as follows: first we show that the sequence $(\Delta_n)_n$ with
\[
\Delta_n := \Delta_p(x_n, x) = \frac{1}{q}\|x_n\|^p + \frac{1}{p}\|x\|^p - \langle J_p(x_n), x \rangle \qquad (3.8)
\]
(see (2.12)) obeys a recursive inequality which implies its convergence. Then we deduce that the sequence $(x_n)_n$ has a Cauchy subsequence and finally that $(x_n)_n$ converges strongly to x. Equations (3.7) and (2.4) together with (3.8) yield
\[
\Delta_{n+1} = \frac{1}{q}\|J_p(x_n) - \mu_n A^* j_r(Ax_n - y)\|^q + \frac{1}{p}\|x\|^p - \langle J_p(x_n) - \mu_n A^* j_r(Ax_n - y), x \rangle
= \frac{1}{q}\|J_p(x_n) - \mu_n A^* j_r(Ax_n - y)\|^q + \frac{1}{p}\|x\|^p - \langle J_p(x_n), x \rangle + \mu_n \langle j_r(Ax_n - y), Ax \rangle. \qquad (3.9)
\]


In the case $x_0 = 0$ we have $R_0 = \|y\| > 0$ and $\Delta_0 = \frac{1}{p}\|x\|^p$, and therefore (3.9) gives
\[
\Delta_1 = \frac{1}{q}\mu_0^q \|A^* j_r(y)\|^q + \Delta_0 - \mu_0 \langle j_r(y), Ax \rangle \leq \frac{1}{q}\mu_0^q \|A\|^q R_0^{(r-1)q} + \Delta_0 - \mu_0 R_0^r,
\]
since $Ax = y$. By the choice of $\mu_0$ (3.4), we get the estimate
\[
\Delta_1 \leq C^q\, \frac{q^{p-1}}{\|A\|^p}\, R_0^p + \Delta_0 - C\, \frac{q^{p-1}}{\|A\|^p}\, R_0^p = \Delta_0 - C(1 - C^{q-1})\, \frac{q^{p-1}}{\|A\|^p}\, R_0^p,
\]
and therefore $\Delta_1 < \Delta_0 = \frac{1}{p}\|x\|^p$, which implies that $x_1 \neq 0$. For all $n \geq 0$ (resp. $n \geq 1$ if $x_0 = 0$), we apply theorems 2.8 and 2.5(g) to equation (3.9) and get
\[
\Delta_{n+1} \leq \frac{1}{q}\left( \|J_p(x_n)\|^q - q\langle x_n, \mu_n A^* j_r(Ax_n - y) \rangle + \tilde{\sigma}_q(J_p(x_n), \mu_n A^* j_r(Ax_n - y)) \right) + \frac{1}{p}\|x\|^p - \langle J_p(x_n), x \rangle + \mu_n \langle j_r(Ax_n - y), Ax \rangle.
\]
With (3.8) this can be written as
\[
\Delta_{n+1} \leq \Delta_n - \mu_n \langle j_r(Ax_n - y), Ax_n - Ax \rangle + \frac{1}{q}\tilde{\sigma}_q(J_p(x_n), \mu_n A^* j_r(Ax_n - y))
= \Delta_n - \mu_n R_n^r + \frac{1}{q}\tilde{\sigma}_q(J_p(x_n), \mu_n A^* j_r(Ax_n - y)). \qquad (3.10)
\]
Now we estimate the integrand in the explicit expression for $\tilde{\sigma}_q$ (2.9). The choice of $\mu_n$ (3.6) and $\tau_n$ (3.5) yields for all $t \in [0, 1]$
\[
\|J_p(x_n) - t\mu_n A^* j_r(Ax_n - y)\| \vee \|J_p(x_n)\| \leq \|x_n\|^{p-1} + \mu_n \|A\| R_n^{r-1} = \|x_n\|^{p-1}(1 + \tau_n) \leq 2\|x_n\|^{p-1}
\]
and
\[
\|J_p(x_n) - t\mu_n A^* j_r(Ax_n - y)\| \vee \|J_p(x_n)\| \geq \|J_p(x_n)\| = \|x_n\|^{p-1}.
\]
Together with the monotonicity of $\rho_{X^*}$ this gives
\[
\tilde{\sigma}_q(J_p(x_n), \mu_n A^* j_r(Ax_n - y)) \leq q G_q \int_0^1 \frac{(2\|x_n\|^{p-1})^q}{t}\, \rho_{X^*}\!\left( \frac{t \mu_n \|A\| R_n^{r-1}}{\|x_n\|^{p-1}} \right) dt
= 2^q q G_q \|x_n\|^p \int_0^1 \frac{\rho_{X^*}(t\tau_n)}{t}\, dt
= 2^q q G_q \|x_n\|^p \int_0^{\tau_n} \frac{\rho_{X^*}(t)}{t}\, dt
\leq 2^q q G_q \|x_n\|^p \rho_{X^*}(\tau_n).
\]
Substituting this into (3.10), we get
\[
\Delta_{n+1} \leq \Delta_n - \mu_n R_n^r + 2^q G_q \|x_n\|^p \rho_{X^*}(\tau_n)
= \Delta_n - \frac{\tau_n \|x_n\|^{p-1} R_n}{\|A\|}\left( 1 - 2^q G_q\, \frac{\|A\|\, \|x_n\|}{R_n}\, \frac{\rho_{X^*}(\tau_n)}{\tau_n} \right).
\]
The choice of C (3.1) and $\tau_n$ (3.5) finally gives the recursive inequality
\[
\Delta_{n+1} \leq \Delta_n - \frac{1 - C}{\|A\|}\, \tau_n \|x_n\|^{p-1} R_n. \qquad (3.11)
\]


Therefore, also in the case $x_0 \neq 0$, the relation $\Delta_1 < \Delta_0 \leq \frac{1}{p}\|x\|^p$ holds (3.2). Inductively, we obtain for every admissible choice of the initial vector
\[
0 \leq \Delta_{n+1} \leq \Delta_n \leq \Delta_1 < \Delta_p(0, x) = \frac{1}{p}\|x\|^p \qquad (3.12)
\]
and conclude that $x_n \neq 0$ for all $n \geq 1$ and that the sequence $(\Delta_n)_n$ is nonincreasing, and therefore convergent and in particular bounded. Theorem 2.12(b) then ensures that the sequence $(x_n)_n$ is bounded, which also implies the boundedness of the sequences $(J_p(x_n))_n$ (2.4) and $(R_n)_n$ (3.3). From (3.11), we can further derive
\[
0 \leq \frac{1 - C}{\|A\|}\, \tau_n \|x_n\|^{p-1} R_n \leq \Delta_n - \Delta_{n+1},
\]
and thus for all $N \in \mathbb{N}$ we have
\[
0 \leq \frac{1 - C}{\|A\|} \sum_{n=0}^{N} \tau_n \|x_n\|^{p-1} R_n \leq \sum_{n=0}^{N} (\Delta_n - \Delta_{n+1}) = \Delta_0 - \Delta_{N+1} \leq \Delta_0,
\]
which gives
\[
\sum_{n=0}^{\infty} \tau_n \|x_n\|^{p-1} R_n < \infty. \qquad (3.13)
\]
Suppose $\liminf_{n\to\infty} R_n > 0$. Then there exist $n_0 \in \mathbb{N}$ and $\epsilon > 0$ such that $R_n \geq \epsilon$ for all $n \geq n_0$ and therefore
\[
\sum_{n=n_0}^{\infty} \tau_n \|x_n\|^{p-1} \leq \frac{1}{\epsilon} \sum_{n=n_0}^{\infty} \tau_n \|x_n\|^{p-1} R_n < \infty,
\]
which forces $(x_n)_n$ to be a null sequence, since by the boundedness of $(x_n)_n$ and by $R_n \geq \epsilon$ the sequence $(\tau_n)_n$ remains bounded away from zero (3.5). The continuity of $\Delta_p(\cdot, x)$ (theorem 2.12(c)) and (3.12) yields
\[
\frac{1}{p}\|x\|^p = \Delta_p(0, x) = \lim_{n\to\infty} \Delta_p(x_n, x) = \lim_{n\to\infty} \Delta_n < \frac{1}{p}\|x\|^p,
\]
which is a contradiction. So we have $\liminf_{n\to\infty} R_n = 0$, and therefore we can choose a subsequence $(R_{n_k})_k$ with the property that
\[
R_{n_k} \to 0 \ \text{for} \ k \to \infty \qquad \text{and} \qquad R_{n_k} < R_n \ \text{for all} \ n < n_k. \qquad (3.14)
\]
The same property then also holds for every subsequence of $(R_{n_k})_k$. By the boundedness of $(x_n)_n$ and $(J_p(x_n))_n$, we can thus find a subsequence $(x_{n_k})_k$ such that
(S.1) the sequence of the norms $(\|x_{n_k}\|)_k$ is convergent,
(S.2) the sequence $(J_p(x_{n_k}))_k$ is weakly convergent, and
(S.3) the sequence $(R_{n_k})_k$ fulfils (3.14).
Now we show that $(x_{n_k})_k$ is a Cauchy sequence. With (2.12), we have for all $l, k \in \mathbb{N}$ with $k > l$
\[
\Delta_p(x_{n_l}, x_{n_k}) = \frac{1}{q}\left( \|x_{n_l}\|^p - \|x_{n_k}\|^p \right) + \langle J_p(x_{n_k}) - J_p(x_{n_l}), x_{n_k} \rangle.
\]
Because of (S.1) the first summand converges to zero for $l \to \infty$. The second summand can be written as
\[
\langle J_p(x_{n_k}) - J_p(x_{n_l}), x_{n_k} \rangle = \langle J_p(x_{n_k}) - J_p(x_{n_l}), x \rangle + \langle J_p(x_{n_k}) - J_p(x_{n_l}), x_{n_k} - x \rangle.
\]


Again the first summand converges to zero for $l \to \infty$ by (S.2). We estimate the second summand
\[
\left| \langle J_p(x_{n_k}) - J_p(x_{n_l}), x_{n_k} - x \rangle \right| = \left| \sum_{n=n_l}^{n_k - 1} \langle J_p(x_{n+1}) - J_p(x_n), x_{n_k} - x \rangle \right|.
\]
The recursive definition of the method (3.7) yields
\[
\left| \langle J_p(x_{n_k}) - J_p(x_{n_l}), x_{n_k} - x \rangle \right| = \left| \sum_{n=n_l}^{n_k - 1} \mu_n \langle j_r(Ax_n - y), Ax_{n_k} - y \rangle \right|
\leq \sum_{n=n_l}^{n_k - 1} \mu_n \|j_r(Ax_n - y)\|\, \|Ax_{n_k} - y\|
= \frac{1}{\|A\|} \sum_{n=n_l}^{n_k - 1} \tau_n \|x_n\|^{p-1} R_{n_k}.
\]
(S.3) then ensures that
\[
\left| \langle J_p(x_{n_k}) - J_p(x_{n_l}), x_{n_k} - x \rangle \right| \leq \frac{1}{\|A\|} \sum_{n=n_l}^{n_k - 1} \tau_n \|x_n\|^{p-1} R_n.
\]
By (3.13), the right-hand side converges to zero for $l \to \infty$, and therefore so does $\Delta_p(x_{n_l}, x_{n_k})$. By theorem 2.12(e), we conclude that $(x_{n_k})_k$ is a Cauchy sequence and thus convergent to an $\tilde{x} \in X$. It remains to prove that $\tilde{x} = x$ and $\lim_{n\to\infty} \|x_n - x\| = 0$. We have
\[
R_{n_k} = \|Ax_{n_k} - y\| = \|A(x_{n_k} - x)\|,
\]
where the left-hand side converges to zero for $k \to \infty$ (S.3). Since A is continuous, the right-hand side converges to $\|A(\tilde{x} - x)\|$ for $k \to \infty$, and we see that $\tilde{x} - x$ lies in $\mathcal{N}(A)$. On the other hand, $J_p(\tilde{x})$ lies in $\overline{\mathcal{R}(A^*)}$ by remark 3.2(a) and theorem 2.5(g), and together with lemma 2.10(b) this shows that $\tilde{x} = x$. So, by the continuity of $\Delta_p(\cdot, x)$ and theorem 2.12(a), we have
\[
\lim_{k\to\infty} \Delta_{n_k} = \lim_{k\to\infty} \Delta_p(x_{n_k}, x) = \Delta_p(x, x) = 0.
\]
Since the sequence $(\Delta_n)_n$ is convergent and has a subsequence converging to zero, it must be a null sequence. By theorem 2.12(d), we finally conclude that $(x_n)_n$ converges strongly to x. □

3.2. Approximate data

Suppose that instead of exact data $y \in \mathcal{R}(A)$ and operator $A \in L(X, Y)$ we are given approximations $(y_k)_k$ in Y and $(A_l)_l$ in $L(X, Y)$, for instance when we discretize an infinite-dimensional problem or when the operators $(A_l)_l$ allow for faster computation of $A_l x_n$. We assume that we know estimates for the deviations
\[
\|y_k - y\| \leq \delta_k, \qquad \delta_k > \delta_{k+1} > 0, \qquad \lim_{k\to\infty} \delta_k = 0, \qquad (3.15)
\]
\[
\|A_l - A\| \leq \eta_l, \qquad \eta_l > \eta_{l+1} > 0, \qquad \lim_{l\to\infty} \eta_l = 0. \qquad (3.16)
\]
Moreover, to properly include the second case (3.16), we need an a priori estimate for the norm of the solution x, i.e. there is a constant $R > 0$ such that
\[
\|x\| \leq R. \qquad (3.17)
\]
We set
\[
S := \sup_{l \in \mathbb{N}} \|A_l\|. \qquad (3.18)
\]

Method 3.1 has to be altered appropriately in concordance with the approximations.

Method 3.4.
(1) We fix $p, r \in (1, \infty)$, choose constants
\[
C, D \in (0, 1) \qquad (3.19)
\]
and an initial vector $x_0 \in X$ such that
\[
J_p(x_0) \in \overline{\mathcal{R}(A^*)} \qquad \text{and} \qquad \Delta_p(x_0, x) \leq \frac{1}{p}\|x\|^p. \qquad (3.20)
\]
We set $k_{-1} := 0$ and $l_{-1} := 0$, and for $n = 0, 1, 2, \ldots$ we repeat the following step:
(2) If for all $k > k_{n-1}$ and all $l > l_{n-1}$
\[
\|A_l x_n - y_k\| < \frac{1}{D}(\delta_k + \eta_l R) \qquad (3.21)
\]
→ STOP, else we can find $k_n > k_{n-1}$ and $l_n > l_{n-1}$ with
\[
\delta_{k_n} + \eta_{l_n} R \leq D R_n, \qquad (3.22)
\]
where
\[
R_n := \|A_{l_n} x_n - y_{k_n}\|. \qquad (3.23)
\]
We choose the parameters as follows.
(a) In the case $x_0 = 0$, we set
\[
\mu_0 := C(1 - D)^{p-1}\, \frac{q^{p-1}}{S^p}\, R_0^{p-r}. \qquad (3.24)
\]
(b) For all $n \geq 0$ (resp. $n \geq 1$ if $x_0 = 0$), we set
\[
\lambda_n := \left(\rho_{X^*}(1)\right) \wedge \left( \frac{C(1 - D)}{2^q G_q}\, \frac{R_n}{S\, \|x_n\|} \right),
\]
where $G_q > 0$ is the constant in (2.9), and choose a $\tau_n \in (0, 1]$ with
\[
\frac{\rho_{X^*}(\tau_n)}{\tau_n} = \lambda_n. \qquad (3.25)
\]
Then we set
\[
\mu_n := \frac{\tau_n \|x_n\|^{p-1}}{S\, R_n^{r-1}}. \qquad (3.26)
\]
The iterates are defined by
\[
J_p(x_{n+1}) = J_p(x_n) - \mu_n A_{l_n}^* j_r(A_{l_n} x_n - y_{k_n}), \qquad x_{n+1} = J_q^*(J_p(x_{n+1})). \qquad (3.27)
\]

Remark 3.5.
(a) If the stopping rule (3.21) is fulfilled for an $n \in \mathbb{N}$, then for all $k > k_{n-1}$ and all $l > l_{n-1}$
\[
\|A_l x_n - y_k\| < \frac{1}{D}(\delta_k + \eta_l R),
\]
where the left-hand side converges to $\|Ax_n - y\| = \|A(x_n - x)\|$ and the right-hand side converges to zero for $l, k \to \infty$ by (3.15), (3.16); as in the case of exact data, we see that $x_n$ is our solution.
(b) Condition (3.22) guarantees that the sequence $(\Delta_n)_n$ is nonincreasing. The assertions of theorem 3.3 remain valid.


Theorem 3.6. Method 3.4 either stops after a finite number of iterations with the minimum-norm solution x of (1.1) or the sequence of iterates $(x_n)_n$ converges strongly to x.

Proof. The proof is rather similar to the one in the case of exact data, and we only give the main modifications. If the stopping rule (3.21) is never fulfilled then, according to (3.22) and (3.15), (3.16), we always have $R_n > 0$. In the case $x_0 = 0$, we estimate
\[
\Delta_1 = \frac{1}{q}\mu_0^q \|A_{l_0}^* j_r(y_{k_0})\|^q + \Delta_0 - \mu_0 \langle j_r(y_{k_0}), A_{l_0} x \rangle
= \frac{1}{q}\mu_0^q \|A_{l_0}^* j_r(y_{k_0})\|^q + \Delta_0 - \mu_0 \langle j_r(y_{k_0}), y_{k_0} \rangle + \mu_0 \langle j_r(y_{k_0}), y_{k_0} - y \rangle + \mu_0 \langle j_r(y_{k_0}), Ax - A_{l_0} x \rangle.
\]
Because of (3.15), (3.16), (3.17) and (3.18), we get
\[
\Delta_1 \leq \frac{1}{q}\mu_0^q S^q R_0^{(r-1)q} + \Delta_0 - \mu_0 R_0^r + \mu_0 R_0^{r-1}(\delta_{k_0} + \eta_{l_0} R).
\]
Equation (3.22) and the choice of $\mu_0$ (3.24) finally yield
\[
\Delta_1 \leq \frac{1}{q}\mu_0^q S^q R_0^{(r-1)q} + \Delta_0 - \mu_0 R_0^r (1 - D)
= C^q (1 - D)^p\, \frac{q^{p-1}}{S^p}\, R_0^p + \Delta_0 - C(1 - D)^p\, \frac{q^{p-1}}{S^p}\, R_0^p
= \Delta_0 - C(1 - C^{q-1})(1 - D)^p\, \frac{q^{p-1}}{S^p}\, R_0^p,
\]
and therefore $\Delta_1 < \Delta_0$ and especially $x_1 \neq 0$. With (3.22) and since
\[
-\mu_n \langle j_r(A_{l_n}x_n - y_{k_n}), A_{l_n}x_n - A_{l_n}x \rangle
= -\mu_n \langle j_r(A_{l_n}x_n - y_{k_n}), A_{l_n}x_n - y_{k_n} \rangle
- \mu_n \langle j_r(A_{l_n}x_n - y_{k_n}), y_{k_n} - y \rangle
- \mu_n \langle j_r(A_{l_n}x_n - y_{k_n}), Ax - A_{l_n}x \rangle
\leq -\mu_n R_n^r + \mu_n R_n^{r-1}(\delta_{k_n} + \eta_{l_n} R),
\]
inequality (3.10) becomes for all $n \geq 0$ (resp. $n \geq 1$ if $x_0 = 0$)
\[
\Delta_{n+1} \leq \Delta_n - \mu_n R_n^r + \mu_n R_n^{r-1}(\delta_{k_n} + \eta_{l_n} R) + \frac{1}{q}\tilde{\sigma}_q(J_p(x_n), \mu_n A_{l_n}^* j_r(A_{l_n}x_n - y_{k_n}))
\leq \Delta_n - (1 - D)\mu_n R_n^r + \frac{1}{q}\tilde{\sigma}_q(J_p(x_n), \mu_n A_{l_n}^* j_r(A_{l_n}x_n - y_{k_n})).
\]
The last summand can be estimated analogously to the case of exact data by
\[
\tilde{\sigma}_q(J_p(x_n), \mu_n A_{l_n}^* j_r(A_{l_n}x_n - y_{k_n})) \leq 2^q q G_q \|x_n\|^p \rho_{X^*}(\tau_n),
\]
and thus we arrive at
\[
\Delta_{n+1} \leq \Delta_n - (1 - D)\, \frac{\tau_n \|x_n\|^{p-1} R_n}{S} \left( 1 - \frac{2^q G_q}{1 - D}\, \frac{S\, \|x_n\|}{R_n}\, \frac{\rho_{X^*}(\tau_n)}{\tau_n} \right).
\]
By the choice of $\tau_n$ (3.25), we get
\[
\Delta_{n+1} \leq \Delta_n - \frac{(1 - D)(1 - C)}{S}\, \tau_n \|x_n\|^{p-1} R_n. \qquad (3.28)
\]
Now we proceed as after inequality (3.11), while keeping the following in mind:
\[
\left| \langle J_p(x_{n_j}) - J_p(x_{n_i}), x_{n_j} - x \rangle \right|
= \left| \sum_{n=n_i}^{n_j - 1} \mu_n \langle j_r(A_{l_n}x_n - y_{k_n}), A_{l_n}x_{n_j} - A_{l_n}x \rangle \right|
\leq \sum_{n=n_i}^{n_j - 1} \mu_n \Big( \left| \langle j_r(A_{l_n}x_n - y_{k_n}), A_{l_{n_j}}x_{n_j} - y_{k_{n_j}} \rangle \right|
+ \left| \langle j_r(A_{l_n}x_n - y_{k_n}), y_{k_{n_j}} - y \rangle \right|
+ \left| \langle j_r(A_{l_n}x_n - y_{k_n}), Ax - A_{l_n}x \rangle \right|
+ \left| \langle j_r(A_{l_n}x_n - y_{k_n}), A_{l_n}x_{n_j} - A_{l_{n_j}}x_{n_j} \rangle \right| \Big).
\]
With (3.15), (3.16) (the sequences $(\delta_n)_n$ and $(\eta_n)_n$ are nonincreasing) and (3.17) this gives
\[
\left| \langle J_p(x_{n_j}) - J_p(x_{n_i}), x_{n_j} - x \rangle \right| \leq \sum_{n=n_i}^{n_j - 1} \mu_n R_n^{r-1}\left( R_{n_j} + \delta_{k_n} + \eta_{l_n} R + 2\eta_{l_n}\|x_{n_j}\| \right).
\]
Let $\tilde{R} > 0$ be a constant such that $\|x_n\| \leq \tilde{R}$ for all $n \in \mathbb{N}$. Then, by (3.22) and property (S.3) of the sequence $(R_{n_j})_j$, we have
\[
\left| \langle J_p(x_{n_j}) - J_p(x_{n_i}), x_{n_j} - x \rangle \right| \leq \sum_{n=n_i}^{n_j - 1} \mu_n R_n^{r-1}\left( R_{n_j} + D R_n + 2\,\frac{D R_n}{R}\,\tilde{R} \right)
\leq \frac{1 + D + 2D\tilde{R}/R}{S} \sum_{n=n_i}^{n_j - 1} \tau_n \|x_n\|^{p-1} R_n. \qquad \square
\]

3.3. Discrepancy principle as stopping rule

Now we consider the case of noisy data $y^\delta$ and a disturbed operator $A_\eta$ with known noise levels
\[
\|y - y^\delta\| \leq \delta \qquad \text{and} \qquad \|A - A_\eta\| \leq \eta. \qquad (3.29)
\]
We apply method 3.4 with $\delta_k = \delta$ and $\eta_l = \eta$ for all $k, l \in \mathbb{N}$ and use the discrepancy principle [9, 14]. To that end, condition (3.21) supplies us with a simple stopping rule; we terminate the iteration when for the first time
\[
R_n < \frac{1}{D}(\delta + \eta R), \qquad (3.30)
\]
because as long as $R_n \geq \frac{1}{D}(\delta + \eta R)$, then according to (3.28) and remark 3.5(b), $x_{n+1}$ is a better approximation to x than $x_n$. As a consequence of this and theorem 3.6 (for $\delta, \eta \to 0$), we can formulate the following.

Corollary 3.7. Together with the discrepancy principle as stopping rule (3.30), method 3.4 is a regularization method for problem (1.1).

Remark 3.8.
(a) This also proves the stability of method 3.1.
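A small sketch of how the stopping rule (3.30) might be wrapped around one of the iterations above; `step` and `residual_norm` are hypothetical placeholders for a single update of method 3.4 and the evaluation of $R_n$, not functions defined in the paper.

```python
import numpy as np

def run_with_discrepancy(step, residual_norm, x0, Jx0, delta, eta, R, D=0.9, max_iter=1000):
    """Iterate until the discrepancy principle (3.30) fires:
    stop as soon as R_n < (delta + eta * R) / D."""
    x, Jx = x0, Jx0
    for n in range(max_iter):
        Rn = residual_norm(x)
        if Rn < (delta + eta * R) / D:
            return x, n                        # discrepancy principle satisfied
        x, Jx = step(x, Jx, Rn)                # one update of the iteration
    return x, max_iter

# Toy usage with dummy placeholders, just to exercise the control flow.
x_stop, n_stop = run_with_discrepancy(
    step=lambda x, Jx, Rn: (0.9 * x, Jx),
    residual_norm=lambda x: float(np.linalg.norm(x)),
    x0=np.ones(3), Jx0=np.ones(3),
    delta=0.05, eta=0.0, R=1.0)
print(n_stop)
```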


(b) Since the selection $j_r$ need not be continuous (and in fact cannot be continuous if $J_r$ is set valued [8]), the method is an example of regularization with non-continuous mappings.

4. Numerical experiments

To implement the methods we need estimates for the constant $G_p$ in (2.9). Due to the generality of theorem 2.8, the value given there is not the best possible in special cases. To get better estimates, we can use the following observation. In the proof of theorem 2.8 (here used for the dual $X^*$), it can be seen that the value of $G_q$ stems from an estimate of the form
\[
\|J_q(x) - J_q(y)\| \leq G_q\, \frac{(\|x\| \vee \|y\|)^q}{\|x - y\|}\, \rho_{X^*}\!\left( \frac{\|x - y\|}{\|x\| \vee \|y\|} \right).
\]
Moreover, let $X^*$ have modulus of smoothness of power type t, i.e. for some constant $K > 0$
\[
\rho_{X^*}(\tau) \leq K\tau^t.
\]
Then we can use
\[
\|J_q(x) - J_q(y)\| \leq G_q K\, (\|x\| \vee \|y\|)^{q-t}\, \|x - y\|^{t-1},
\]
and the parameter $\tau_n$ in method 3.1 (resp. 3.4) can be chosen according to
\[
\tau_n = 1 \wedge \left( \frac{C}{2^q G_q K}\, \frac{R_n}{\|A\|\, \|x_n\|} \right)^{\frac{1}{t-1}},
\qquad \text{respectively} \qquad
\tau_n = 1 \wedge \left( \frac{C(1 - D)}{2^q G_q K}\, \frac{R_n}{S\, \|x_n\|} \right)^{\frac{1}{t-1}}.
\]
For instance, in $l^q$ spaces we have
\[
\|J_q(x) - J_q(y)\| \leq \begin{cases} 2^{2-q}\, \|x - y\|^{q-1}, & 1 < q < 2, \\ (q-1)\, (\|x\| \vee \|y\|)^{q-2}\, \|x - y\|, & 2 \leq q < \infty, \end{cases}
\]
i.e.
\[
G_q K = \begin{cases} 2^{2-q}, & 1 < q < 2, \\ q - 1, & 2 \leq q < \infty. \end{cases}
\]