Hindawi Publishing Corporation, Mathematical Problems in Engineering, Volume 2015, Article ID 145323, 7 pages. http://dx.doi.org/10.1155/2015/145323

Research Article

Multivariate Spectral Gradient Algorithm for Nonsmooth Convex Optimization Problems

Yaping Hu

School of Science, East China University of Science and Technology, Shanghai 200237, China

Correspondence should be addressed to Yaping Hu; [email protected]

Received 20 April 2015; Revised 4 July 2015; Accepted 5 July 2015

Academic Editor: Dapeng P. Du

Copyright Β© 2015 Yaping Hu. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

We propose an extended multivariate spectral gradient algorithm for solving the nonsmooth convex optimization problem. First, by using the Moreau-Yosida regularization, we convert the original objective function to a continuously differentiable function; then we use approximate function and gradient values of the Moreau-Yosida regularization in place of the corresponding exact values in the algorithm. The global convergence is proved under suitable assumptions. Numerical experiments are presented to show the effectiveness of this algorithm.

1. Introduction

Consider the unconstrained minimization problem

$$\min_{x \in \mathbb{R}^{n}} f(x), \tag{1}$$

where f : R^n β†’ R is a nonsmooth convex function. The Moreau-Yosida regularization [1] of f at x ∈ R^n is defined by

$$F(x) = \min_{z \in \mathbb{R}^{n}} \left\{ f(z) + \frac{1}{2\lambda} \|z - x\|^{2} \right\}, \tag{2}$$

where ||Β·|| is the Euclidean norm and Ξ» is a positive parameter. The function minimized on the right-hand side is strongly convex and differentiable, so it has a unique minimizer for every x ∈ R^n. Under some reasonable conditions, the gradient function of F(x) can be proved to be semismooth [2, 3], though in general F(x) is not twice differentiable. It is well known that the problem

$$\min_{x \in \mathbb{R}^{n}} F(x) \tag{3}$$

and the original problem (1) are equivalent in the sense that their solution sets coincide. The following proposition collects some properties of the Moreau-Yosida regularization function F(x).

Proposition 1 (see Chapter XV, Theorem 4.1.4, [1]). The Moreau-Yosida regularization function F is convex, finite-valued, and differentiable everywhere with gradient

$$g(x) \equiv \nabla F(x) = \frac{1}{\lambda} \left( x - p(x) \right), \tag{4}$$

where

$$p(x) = \arg\min_{z \in \mathbb{R}^{n}} \left\{ f(z) + \frac{1}{2\lambda} \|z - x\|^{2} \right\} \tag{5}$$

is the unique minimizer in (2). Moreover, for all x, y ∈ R^n, one has

$$\|g(x) - g(y)\| \le \frac{1}{\lambda} \|x - y\|. \tag{6}$$
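To make (2), (4), and (5) concrete, consider the particular choice f(x) = ||x||_1, for which the minimizer p(x) has the well-known closed form given by componentwise soft-thresholding. The following minimal sketch (in Python with NumPy; the function names are ours and the example is not part of the paper) evaluates F(x), g(x), and p(x) and checks the Lipschitz bound (6) numerically.

```python
import numpy as np

def soft_threshold(x, lam):
    """Closed-form minimizer p(x) of f(z) + ||z - x||^2 / (2*lam) for f = l1-norm."""
    return np.sign(x) * np.maximum(np.abs(x) - lam, 0.0)

def moreau_yosida_l1(x, lam):
    """Return F(x), g(x) = grad F(x), and p(x), as in (2), (4), (5), for f = l1-norm."""
    p = soft_threshold(x, lam)
    F = np.sum(np.abs(p)) + np.dot(p - x, p - x) / (2.0 * lam)   # eq. (2)
    g = (x - p) / lam                                            # eq. (4)
    return F, g, p

# Quick numerical check of the Lipschitz bound (6) at two points.
rng = np.random.default_rng(0)
x, y, lam = rng.normal(size=5), rng.normal(size=5), 0.5
_, gx, _ = moreau_yosida_l1(x, lam)
_, gy, _ = moreau_yosida_l1(y, lam)
assert np.linalg.norm(gx - gy) <= np.linalg.norm(x - y) / lam + 1e-12
```

For a general nonsmooth f the inner problem in (2) has no closed form, which is exactly why the approximate quantities introduced below are needed.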

Proposition 1 shows that the gradient function g : R^n β†’ R^n is Lipschitz continuous with modulus 1/Ξ». Since g is Lipschitz continuous, it is differentiable almost everywhere by Rademacher's theorem; the B-subdifferential [4] of g at x ∈ R^n is then defined by

$$\partial_{B} g(x) = \left\{ V \in \mathbb{R}^{n \times n} : V = \lim_{x_{k} \to x} \nabla g(x_{k}),\ x_{k} \in D_{g} \right\}, \tag{7}$$

where D_g = {x ∈ R^n : g is differentiable at x}, and the following property of BD-regularity holds [4–6].


Proposition 2. If g is BD-regular at x, then

(i) all matrices V ∈ βˆ‚_B g(x) are nonsingular;

(ii) there exist a neighborhood N of x ∈ R^n, ΞΊ_1 > 0, and ΞΊ_2 > 0 such that, for all y ∈ N,

$$\|V^{-1}\| \le \kappa_{2}, \qquad d^{T} V d \ge \kappa_{1} \|d\|^{2}, \qquad \forall d \in \mathbb{R}^{n},\ V \in \partial_{B} g(y). \tag{8}$$

In practical computation we often use approximate values of the function F(x) and the gradient g(x) instead of the corresponding exact values, because p(x) is difficult, and sometimes impossible, to compute precisely. Suppose that, for any Ξ΅ > 0 and for each x ∈ R^n, there exists an approximate vector p^a(x, Ξ΅) ∈ R^n of the unique minimizer p(x) in (2) such that

$$f(p^{a}(x, \varepsilon)) + \frac{1}{2\lambda} \|p^{a}(x, \varepsilon) - x\|^{2} \le F(x) + \varepsilon. \tag{9}$$

Implementable algorithms for finding such an approximate vector p^a(x, Ξ΅) can be found, for example, in [7, 8]. The existence theorem for the approximate vector p^a(x, Ξ΅) is presented as follows.

Proposition 3 (see Lemma 2.1 in [7]). Let {x_k} be generated according to the formula

$$x_{k+1} = x_{k} - \alpha_{k} \upsilon_{k}, \quad \text{for } k = 1, 2, \ldots, \tag{10}$$

where Ξ±_k > 0 is a stepsize and Ο…_k is an approximate subgradient at x_k; that is,

$$\upsilon_{k} \in \partial_{\varepsilon_{k}} f(x_{k}) = \left\{ \upsilon : f(z) \ge f(x_{k}) + \langle \upsilon, z - x_{k} \rangle - \varepsilon_{k},\ \forall z \in \mathbb{R}^{n} \right\}, \quad \text{for } k = 1, 2, \ldots. \tag{11}$$

(i) If Ο…_k satisfies

$$\upsilon_{k} \in \partial f(x_{k+1}), \quad \text{for } k = 1, 2, \ldots, \tag{12}$$

then (11) holds with

$$\varepsilon_{k} = f(x_{k}) - f(x_{k+1}) - \alpha_{k} \|\upsilon_{k}\|^{2} \ge 0. \tag{13}$$

(ii) Conversely, if (11) holds with Ξ΅_k given by (13), then (12) holds, with x_{k+1} = p^a(x_k, Ξ΅_k).

We use the approximate vector p^a(x, Ξ΅) to define approximate function and gradient values of the Moreau-Yosida regularization, respectively, by

$$F^{a}(x, \varepsilon) = f(p^{a}(x, \varepsilon)) + \frac{1}{2\lambda} \|p^{a}(x, \varepsilon) - x\|^{2}, \tag{14}$$

$$g^{a}(x, \varepsilon) = \frac{1}{\lambda} \left( x - p^{a}(x, \varepsilon) \right). \tag{15}$$

The following proposition is crucial in the convergence analysis; its proof can be found in [2].

Proposition 4. Let Ξ΅ be an arbitrary positive number and let p^a(x, Ξ΅) be a vector satisfying (9). Then one gets

$$F(x) \le F^{a}(x, \varepsilon) \le F(x) + \varepsilon, \tag{16}$$

$$\|g^{a}(x, \varepsilon) - g(x)\| \le \sqrt{\frac{2\varepsilon}{\lambda}}, \tag{17}$$

$$\|p^{a}(x, \varepsilon) - p(x)\| \le \sqrt{2\lambda\varepsilon}. \tag{18}$$
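The next sketch illustrates Proposition 4 under the same β„“1 assumption: it builds a point p^a(x, Ξ΅) satisfying (9) by perturbing the exact proximal point (possible here because p(x) is available in closed form; for a general f one would instead rely on the inner solvers of [7, 8]), forms F^a and g^a as in (14)-(15), and checks the bounds (16)-(18). All names are illustrative and not part of the paper.

```python
import numpy as np

def prox_l1(x, lam):
    # Exact minimizer p(x) of (2) for f = l1-norm (soft-thresholding).
    return np.sign(x) * np.maximum(np.abs(x) - lam, 0.0)

def approx_prox_l1(x, lam, eps):
    """Return a point p^a(x, eps) satisfying (9), built by perturbing the exact
    proximal point and shrinking the perturbation until (9) is certified."""
    f = lambda z: np.sum(np.abs(z))
    phi = lambda z: f(z) + np.dot(z - x, z - x) / (2.0 * lam)
    p = prox_l1(x, lam)
    F = phi(p)                       # F(x); available in closed form for the l1-norm
    d = np.ones_like(x)              # arbitrary perturbation direction
    while phi(p + d) > F + eps:      # shrink until condition (9) holds
        d *= 0.5
    return p + d

x, lam, eps = np.array([1.5, -0.2, 0.7]), 0.8, 1e-3
pa, p = approx_prox_l1(x, lam, eps), prox_l1(x, lam)
F = np.sum(np.abs(p)) + np.dot(p - x, p - x) / (2 * lam)
Fa = np.sum(np.abs(pa)) + np.dot(pa - x, pa - x) / (2 * lam)    # eq. (14)
ga, g = (x - pa) / lam, (x - p) / lam                           # eqs. (15) and (4)
assert F <= Fa <= F + eps + 1e-12                               # bound (16)
assert np.linalg.norm(ga - g) <= np.sqrt(2 * eps / lam) + 1e-9  # bound (17)
assert np.linalg.norm(pa - p) <= np.sqrt(2 * lam * eps) + 1e-9  # bound (18)
```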

σ΅„© σ΅„©σ΅„© π‘Ž (18) 󡄩󡄩𝑝 (π‘₯, πœ€) βˆ’ 𝑝 (π‘₯)σ΅„©σ΅„©σ΅„© ≀ √2πœ†πœ€. Algorithms which combine the proximal techniques with Moreau-Yosida regularization for solving the nonsmooth problem (1) have been proved to be effective [7, 9, 10], and also some trust region algorithms for solving (1) have been proposed in [5, 11, 12], and so forth. Recently, Yuan et al. [13, 14] and Li [15] have extended the spectral gradient method and conjugate gradient-type method to solve (1), respectively. Multivariate spectral gradient (MSG) method was first proposed by Han et al. [16] for optimization problems. This method has a nice property that it converges quadratically for objective function with positive definite diagonal Hessian matrix [16]. Further studies on such method for nonlinear equations and bound constrained optimization can be found, for instance, in [17, 18]. By using nonmonotone technique, some effective spectral gradient methods are presented in [13, 16, 17, 19]. In this paper, we extend the multivariate spectral gradient method by combining with a nonmonotone line search technique as well as the Moreau-Yosida regulation function to solve the nonsmooth problem (1) and do some numerical experiments to test its efficiency. The rest of this paper is organized as follows. In Section 2, we propose multivariate spectral gradient algorithm to solve (1). In Section 3, we prove the global convergence of the proposed algorithm; then some numerical results are presented in Section 4. Finally, we have a conclusion section.

2. Algorithm

then (11) holds with σ΅„© σ΅„©2 πœ€π‘˜ = 𝑓 (π‘₯π‘˜ ) βˆ’ 𝑓 (π‘₯π‘˜+1 ) βˆ’ π›Όπ‘˜ σ΅„©σ΅„©σ΅„©πœπ‘˜ σ΅„©σ΅„©σ΅„© β‰₯ 0.

2πœ€ σ΅„© σ΅„©σ΅„© π‘Ž 󡄩󡄩𝑔 (π‘₯, πœ€) βˆ’ 𝑔 (π‘₯)σ΅„©σ΅„©σ΅„© ≀ √ , πœ†

(16)

(11)

(i) If πœπ‘˜ satisfies πœπ‘˜ ∈ πœ•π‘“ (π‘₯π‘˜+1 ) ,

𝐹 (π‘₯) ≀ πΉπ‘Ž (π‘₯, πœ€) ≀ 𝐹 (π‘₯) + πœ€,

(15)

The following proposition is crucial in the convergence analysis. The proof of this proposition can be found in [2].

In this section, we present the multivariate spectral gradient algorithm for solving the nonsmooth convex unconstrained optimization problem (1). Our approach uses the Moreau-Yosida regularization to smooth the nonsmooth function and then employs the approximate values of the function F and the gradient g within the multivariate spectral gradient framework. We first recall the multivariate spectral gradient algorithm [16] for the smooth optimization problem

$$\min \left\{ f(x) : x \in \mathbb{R}^{n} \right\}, \tag{19}$$

where f : R^n β†’ R is continuously differentiable and its gradient is denoted by g. Let x_k be the current iterate; the multivariate spectral gradient iteration is defined by

$$x_{k+1} = x_{k} - \operatorname{diag}\left\{ \frac{1}{\lambda_{k}^{1}}, \frac{1}{\lambda_{k}^{2}}, \ldots, \frac{1}{\lambda_{k}^{n}} \right\} g_{k}, \tag{20}$$

where g_k is the gradient vector of f at x_k and diag{Ξ»_k^1, Ξ»_k^2, ..., Ξ»_k^n} is obtained by minimizing

$$\left\| \operatorname{diag}\{\lambda^{1}, \lambda^{2}, \ldots, \lambda^{n}\}\, s_{k-1} - u_{k-1} \right\|^{2} \tag{21}$$

with respect to {Ξ»^i}_{i=1}^n, where s_{kβˆ’1} = x_k βˆ’ x_{kβˆ’1} and u_{kβˆ’1} = g_k βˆ’ g_{kβˆ’1}. Denote the ith elements of s_k and y_k by s_k^i and y_k^i, respectively.
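Since problem (21) is separable, each Ξ»_k^i can be computed in closed form: whenever s_{kβˆ’1}^i β‰  0, the minimizing value is u_{kβˆ’1}^i / s_{kβˆ’1}^i. The short Python sketch below performs one smooth MSG step (20)-(21); the positivity test and safeguard mirror Step 6 of Algorithm 5 stated below, and the helper name msg_step is ours.

```python
import numpy as np

def msg_step(x_prev, x_curr, grad_prev, grad_curr, eps=1e-10, delta=1.0):
    """One multivariate spectral gradient step for a smooth f, cf. (20)-(21).
    Assumes x_curr != x_prev."""
    s = x_curr - x_prev                          # s_{k-1}
    u = grad_curr - grad_prev                    # u_{k-1}
    quotient = u / np.where(s == 0.0, 1.0, s)    # componentwise minimizer of (21)
    lam = np.where((s != 0.0) & (quotient > 0.0), quotient,
                   np.dot(s, u) / np.dot(s, s))  # scalar fallback, cf. Step 6(a)
    lam = np.where((lam <= eps) | (lam >= 1.0 / eps), delta, lam)  # cf. Step 6(b)
    return x_curr - grad_curr / lam              # update (20)

# Two iterations on the quadratic f(x) = 0.5 * x^T diag(1, 10) x.
grad = lambda x: np.array([1.0, 10.0]) * x
x0 = np.array([1.0, 1.0])
x1 = x0 - 0.1 * grad(x0)                         # a plain gradient step to start
x2 = msg_step(x0, x1, grad(x0), grad(x1))        # x2 == [0, 0], the exact minimizer
```

On a quadratic with diagonal Hessian the scaling recovers the exact curvature in each coordinate, which is the property behind the quadratic convergence result of [16] mentioned above.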

We present the following multivariate spectral gradient (MSG) algorithm.

Algorithm 5. Set x_0 ∈ R^n, Οƒ ∈ (0, 1), Ξ² > 0, Ξ» > 0, Ξ³ β‰₯ 0, Ξ΄ > 0, ρ ∈ [0, 1], Ο΅ ∈ (0, 1), E_0 = 1, and Ο„_0 ∈ (0, 1], where {Ο„_k} is a strictly decreasing sequence with lim_{kβ†’βˆž} Ο„_k = 0. Set k := 0.

Step 1. Set Ξ΅_0 = Ο„_0. Calculate F^a(x_0, Ξ΅_0) by (14) and g^a(x_0, Ξ΅_0) by (15). Let J_0 = F^a(x_0, Ξ΅_0) and d_0 = βˆ’g^a(x_0, Ξ΅_0).

Step 2. Stop if ||g^a(x_k, Ξ΅_k)|| = 0. Otherwise, go to Step 3.

Step 3. Choose Ξ΅_{k+1} satisfying 0 < Ξ΅_{k+1} ≀ min{Ο„_k, Ο„_k ||g^a(x_k, Ξ΅_k)||Β²}; find Ξ±_k which satisfies

$$F^{a}(x_{k} + \alpha_{k} d_{k}, \varepsilon_{k+1}) - J_{k} \le \sigma \alpha_{k}\, g^{a}(x_{k}, \varepsilon_{k})^{T} d_{k}, \tag{22}$$

where Ξ±_k = Ξ² 2^{βˆ’i_k} and i_k is the smallest nonnegative integer such that (22) holds.

Step 4. Let x_{k+1} = x_k + Ξ±_k d_k. Stop if ||g^a(x_{k+1}, Ξ΅_{k+1})|| = 0.

Step 5. Update J_{k+1} by the following formula:

$$E_{k+1} = \rho E_{k} + 1, \qquad J_{k+1} = \frac{\rho E_{k} J_{k} + F^{a}(x_{k} + \alpha_{k} d_{k}, \varepsilon_{k+1})}{E_{k+1}}. \tag{23}$$

Step 6. Compute the search direction d_{k+1} as follows.

(a) If y_k^i / s_k^i > 0, then set Ξ»_{k+1}^i = y_k^i / s_k^i; otherwise set Ξ»_{k+1}^i = s_k^T y_k / s_k^T s_k, for i = 1, 2, ..., n, where y_k = g^a(x_{k+1}, Ξ΅_{k+1}) βˆ’ g^a(x_k, Ξ΅_k) + Ξ³ s_k and s_k = x_{k+1} βˆ’ x_k.

(b) If Ξ»_{k+1}^i ≀ Ο΅ or Ξ»_{k+1}^i β‰₯ 1/Ο΅, then set Ξ»_{k+1}^i = Ξ΄ for i = 1, 2, ..., n.

Let d_{k+1} = βˆ’diag{1/Ξ»_{k+1}^1, 1/Ξ»_{k+1}^2, ..., 1/Ξ»_{k+1}^n} g^a(x_{k+1}, Ξ΅_{k+1}).

Step 7. Set k := k + 1 and go back to Step 2.

Remarks. (i) The choice of Ξ΅_{k+1} in Step 3 implies Ξ΅_{k+1} = o(||g^a(x_k, Ξ΅_k)||Β²); together with (15) and Proposition 3, this gives

$$\varepsilon_{k+1} = o\left( \|x_{k} - p^{a}(x_{k}, \varepsilon_{k})\|^{2} \right) = o\left( \|x_{k} - x_{k+1}\|^{2} \right) = o\left( \alpha_{k}^{2} \|d_{k}\|^{2} \right); \tag{24}$$

then, by the decreasing property of Ξ΅_{k+1}, the condition Ξ΅_k = o(Ξ±_kΒ² ||d_k||Β²) assumed in Lemma 7 holds.

(ii) From the nonmonotone line search technique (22), we see that J_{k+1} is a convex combination of the function value F^a(x_{k+1}, Ξ΅_{k+1}) and J_k. Since J_0 = F^a(x_0, Ξ΅_0), it follows that J_k is a convex combination of the function values F^a(x_0, Ξ΅_0), F^a(x_1, Ξ΅_1), ..., F^a(x_k, Ξ΅_k). The parameter ρ plays an important role in controlling the degree of nonmonotonicity of the line search: ρ = 0 yields a strictly monotone scheme, while ρ = 1 yields J_k = C_k, where

$$C_{k} = \frac{1}{k+1} \sum_{i=0}^{k} F^{a}(x_{i}, \varepsilon_{i}) \tag{25}$$

is the average function value.

(iii) From Step 6, we obtain

$$\min\left\{ \epsilon, \frac{1}{\delta} \right\} \|g^{a}(x_{k}, \varepsilon_{k})\| \le \|d_{k}\| \le \max\left\{ \frac{1}{\epsilon}, \frac{1}{\delta} \right\} \|g^{a}(x_{k}, \varepsilon_{k})\|; \tag{26}$$

then there is a positive constant ΞΌ (one may take ΞΌ = min{Ο΅, 1/Ξ΄}) such that, for all k,

$$g^{a}(x_{k}, \varepsilon_{k})^{T} d_{k} \le -\mu \|g^{a}(x_{k}, \varepsilon_{k})\|^{2}, \tag{27}$$

which shows that the proposed multivariate spectral gradient algorithm possesses the sufficient descent property.
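For readers who prefer code, the following Python sketch assembles Steps 1-7 of Algorithm 5 into a single loop. It assumes a user-supplied routine approx_prox(x, eps) returning a point p^a(x, Ξ΅) that satisfies (9) (for example, the β„“1 construction sketched in Section 1); the stopping tolerance, the choice Ο„_k = 1/(2(k+1)Β²), and the backtracking cap are placeholders of ours rather than prescriptions of the paper.

```python
import numpy as np

def msg_nonsmooth(f, approx_prox, x0, lam=1.0, sigma=0.9, beta=1.0, gamma=0.01,
                  delta=1.0, rho=0.75, eps_safe=1e-10, tol=1e-10, max_iter=500):
    """Illustrative sketch of Algorithm 5; approx_prox(x, eps) must return a
    point p^a(x, eps) satisfying condition (9) for the convex function f."""

    def F_g(x, eps):
        # Approximate Moreau-Yosida value and gradient, eqs. (14)-(15).
        p = approx_prox(x, eps)
        Fa = f(p) + np.dot(p - x, p - x) / (2.0 * lam)
        ga = (x - p) / lam
        return Fa, ga

    tau = lambda k: 1.0 / (2.0 * (k + 1) ** 2)    # strictly decreasing, -> 0
    eps = tau(0)                                  # Step 1
    Fa, ga = F_g(x0, eps)
    x, d, J, E = x0.astype(float), -ga, Fa, 1.0

    for k in range(max_iter):
        if np.linalg.norm(ga) <= tol:             # Step 2 (with tolerance tol)
            break
        eps_next = min(tau(k), tau(k) * np.dot(ga, ga))       # Step 3
        alpha = beta
        for _ in range(60):                       # backtracking until (22) holds
            Fa_new, ga_new = F_g(x + alpha * d, eps_next)
            if Fa_new - J <= sigma * alpha * np.dot(ga, d):
                break
            alpha *= 0.5
        x_new = x + alpha * d                     # Step 4
        E_new = rho * E + 1.0                     # Step 5, eq. (23)
        J = (rho * E * J + Fa_new) / E_new
        E = E_new
        s = x_new - x                             # Step 6
        y = ga_new - ga + gamma * s
        quotient = y / np.where(s == 0.0, 1.0, s)
        lam_k = np.where((s != 0.0) & (quotient > 0.0), quotient,
                         np.dot(s, y) / np.dot(s, s))
        lam_k = np.where((lam_k <= eps_safe) | (lam_k >= 1.0 / eps_safe),
                         delta, lam_k)
        d = -ga_new / lam_k
        x, ga, eps = x_new, ga_new, eps_next      # Step 7
    return x

# Example call: minimize f(x) = ||x||_1 using the certified approximate prox
# sketched in Section 1 (assumed available as approx_prox_l1):
# x_star = msg_nonsmooth(lambda z: np.sum(np.abs(z)),
#                        lambda x, e: approx_prox_l1(x, 1.0, e),
#                        x0=np.array([2.0, -1.0, 0.5]))
```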

3. Global Convergence

In this section, we provide a global convergence analysis for the multivariate spectral gradient algorithm. To begin with, we make the following assumptions, which have also been used in [5, 12–14].

Assumption A. (i) F is bounded from below.

(ii) The sequence {V_k}, V_k ∈ βˆ‚_B g(x_k), is bounded; that is, there exists a constant M > 0 such that, for all k,

$$\|V_{k}\| \le M. \tag{28}$$

The following two lemmas play crucial roles in establishing the convergence theorem for the proposed algorithm. By using (26) and (27) and Assumption A, and arguing as in Lemma 1.1 of [20], we obtain the next lemma, which shows that Algorithm 5 is well defined. Since the proof follows the same ideas as that of Lemma 1.1 in [20], it is omitted.

Lemma 6. Let {F^a(x_k, Ξ΅_k)} be the sequence generated by Algorithm 5. Suppose that Assumption A holds and that C_k is defined by (25). Then one has F^a(x_k, Ξ΅_k) ≀ J_k ≀ C_k for all k. Moreover, there exists a stepsize Ξ±_k satisfying the nonmonotone line search condition (22).

Lemma 7. Let {(x_k, Ξ΅_k)} be the sequence generated by Algorithm 5. Suppose that Assumption A holds and that Ξ΅_k = o(Ξ±_kΒ² ||d_k||Β²). Then, for all k, one has

$$\alpha_{k} \ge m_{0}, \tag{29}$$

where m_0 > 0 is a constant.

Proof (by contradiction). Let Ξ±_k satisfy the nonmonotone Armijo-type line search (22). Assume, on the contrary, that lim inf_{kβ†’βˆž} Ξ±_k = 0 holds; then there exists a subsequence {Ξ±_k}_{K'} such that Ξ±_k β†’ 0 as k β†’ ∞. From the nonmonotone line search rule (22), Ξ±'_k = Ξ±_k/2 satisfies

$$F^{a}(x_{k} + \alpha_{k}' d_{k}, \varepsilon_{k+1}) - J_{k} > \sigma \alpha_{k}'\, g^{a}(x_{k}, \varepsilon_{k})^{T} d_{k}; \tag{30}$$

together with F^a(x_k, Ξ΅_k) ≀ J_k ≀ C_k from Lemma 6, we have

$$F^{a}(x_{k} + \alpha_{k}' d_{k}, \varepsilon_{k+1}) - F^{a}(x_{k}, \varepsilon_{k}) \ge F^{a}(x_{k} + \alpha_{k}' d_{k}, \varepsilon_{k+1}) - J_{k} > \sigma \alpha_{k}'\, g^{a}(x_{k}, \varepsilon_{k})^{T} d_{k}. \tag{31}$$

By (28), (31), Proposition 4, and Taylor's formula, there is

$$\begin{aligned} \sigma \alpha_{k}'\, g^{a}(x_{k}, \varepsilon_{k})^{T} d_{k} &< F^{a}(x_{k} + \alpha_{k}' d_{k}, \varepsilon_{k+1}) - F^{a}(x_{k}, \varepsilon_{k}) \le F(x_{k} + \alpha_{k}' d_{k}) - F(x_{k}) + \varepsilon_{k+1} \\ &= \alpha_{k}' d_{k}^{T} g(x_{k}) + \frac{1}{2} (\alpha_{k}')^{2} d_{k}^{T} V(u_{k}) d_{k} + \varepsilon_{k+1} \le \alpha_{k}' d_{k}^{T} g(x_{k}) + \frac{M}{2} (\alpha_{k}')^{2} \|d_{k}\|^{2} + \varepsilon_{k+1}, \end{aligned} \tag{32}$$

where u_k ∈ (x_k, x_{k+1}). From (32) and Proposition 4, we have

$$\begin{aligned} \frac{\alpha_{k}}{2} = \alpha_{k}' &> \frac{2\left[ \left( g^{a}(x_{k}, \varepsilon_{k}) - g(x_{k}) \right)^{T} d_{k} - (1-\sigma)\, g^{a}(x_{k}, \varepsilon_{k})^{T} d_{k} - \varepsilon_{k+1}/\alpha_{k}' \right]}{M \|d_{k}\|^{2}} \\ &\ge \frac{2\left[ \mu (1-\sigma) \|g^{a}(x_{k}, \varepsilon_{k})\|^{2} - \sqrt{2\varepsilon_{k}/\lambda}\, \|d_{k}\| - \varepsilon_{k}/\alpha_{k}' \right]}{M \|d_{k}\|^{2}} \\ &= \frac{2}{M} \left[ \frac{\mu (1-\sigma) \|g^{a}(x_{k}, \varepsilon_{k})\|^{2}}{\|d_{k}\|^{2}} - \frac{o(\alpha_{k})}{\sqrt{\lambda}} - o(\alpha_{k}) \right] \\ &\ge \frac{2}{M} \left[ \frac{\mu (1-\sigma)}{\left( \max\{1/\epsilon, 1/\delta\} \right)^{2}} - \frac{o(\alpha_{k})}{\sqrt{\lambda}} - o(\alpha_{k}) \right], \end{aligned} \tag{33}$$

where the second inequality follows from (27), from (17) in Proposition 4, and from Ξ΅_{k+1} ≀ Ξ΅_k; the equality follows from Ξ΅_k = o(Ξ±_kΒ² ||d_k||Β²); and the last inequality follows from (26). Dividing both sides by Ξ±_k and letting k β†’ ∞ in the above inequality, we deduce that

$$\frac{1}{2} \ge \lim_{k \to \infty} \frac{2 \mu (1-\sigma)}{\left( \max\{1/\epsilon, 1/\delta\} \right)^{2} M \alpha_{k}} = +\infty, \tag{34}$$

which is impossible, so the conclusion is obtained.

sequence {π‘₯π‘˜ }∞ π‘˜=0 has accumulation point, and every accumulation point of {π‘₯π‘˜ }∞ π‘˜=0 is optimal solution of problem (1). Proof. Suppose that there exist πœ–0 > 0 and π‘˜0 > 0 such that σ΅„©σ΅„© π‘Ž σ΅„© (36) 󡄩󡄩𝑔 (π‘₯π‘˜ , πœ€π‘˜ )σ΅„©σ΅„©σ΅„© β‰₯ πœ–0 , βˆ€π‘˜ > π‘˜0 . From (22), (26), and (29), we get 𝑇

πΉπ‘Ž (π‘₯π‘˜ + π›Όπ‘˜ π‘‘π‘˜ , πœ€π‘˜+1 ) βˆ’ π½π‘˜ ≀ πœŽπ›Όπ‘˜ π‘”π‘Ž (π‘₯π‘˜ , πœ€π‘˜ ) π‘‘π‘˜ 1 σ΅„© σ΅„©2 ≀ βˆ’ πœŽπ›Όπ‘˜ min {πœ–, } σ΅„©σ΅„©σ΅„©π‘”π‘Ž (π‘₯π‘˜ , πœ€π‘˜ )σ΅„©σ΅„©σ΅„© 𝛿 1 ≀ βˆ’ πœŽπ‘š0 πœ–0 min {πœ–, } , βˆ€π‘˜ > π‘˜0 . 𝛿

(38)

πœŽπ‘š0 πœ–0 min {πœ–, 1/𝛿} . πΈπ‘˜+1

By Assumption A, 𝐹 is bounded from below. Further by Proposition 4, 𝐹(π‘₯π‘˜ ) ≀ πΉπ‘Ž (π‘₯π‘˜ , πœ€π‘˜ ) for all π‘˜, we see that πΉπ‘Ž (π‘₯π‘˜ , πœ€π‘˜ ) is bounded from below. Together with πΉπ‘Ž (π‘₯π‘˜ , πœ€π‘˜ ) ≀ π½π‘˜ for all π‘˜ from Lemma 6, it shows that π½π‘˜ is also bounded from below. By (38), we obtain ∞

βˆ‘ π‘˜=π‘˜0

πœŽπ‘š0 πœ–0 min {πœ–, 1/𝛿} < ∞. πΈπ‘˜+1

(39)

On the other hand, the definition of πΈπ‘˜+1 implies that πΈπ‘˜+1 ≀ π‘˜ + 2, and it follows that ∞

(37)

πœŒπΈπ‘˜ π½π‘˜ + π½π‘˜ βˆ’ πœŽπ‘š0 πœ–0 min {πœ–, 1/𝛿} πΈπ‘˜+1

≀ π½π‘˜ βˆ’

By using the above lemmas, we are now ready to prove the global convergence of Algorithm 5. Theorem 8. Let {π‘₯π‘˜ } be generated by Algorithm 5 and suppose that the conditions of Lemma 7 hold. Then one has σ΅„© σ΅„© lim 󡄩󡄩𝑔 (π‘₯π‘˜ )σ΅„©σ΅„©σ΅„© = 0; (35) π‘˜β†’βˆž σ΅„©

πœŒπΈπ‘˜ π½π‘˜ + πΉπ‘Ž (π‘₯π‘˜ + π›Όπ‘˜ π‘‘π‘˜ , πœ€π‘˜+1 ) πΈπ‘˜+1

βˆ‘ π‘˜=π‘˜0

∞ πœŽπ‘š0 πœ–0 min {πœ–, 1/𝛿} πœŽπ‘š0 πœ–0 min {πœ–, 1/𝛿} β‰₯ βˆ‘ πΈπ‘˜+1 π‘˜+2 π‘˜=π‘˜

= + ∞.

0

(40)

Mathematical Problems in Engineering

5

This is a contradiction. Therefore, we should have σ΅„© σ΅„© lim σ΅„©σ΅„©π‘”π‘Ž (π‘₯π‘˜ , πœ€π‘˜ )σ΅„©σ΅„©σ΅„© = 0. π‘˜β†’βˆž σ΅„©

Table 1: Test problems.

(41)

From (17) in Proposition 4 together with πœ€π‘˜ as π‘˜ β†’ ∞, which comes from the definition of πœ€π‘˜ and limπ‘˜ β†’ 0 πœπ‘˜ = 0 in Algorithm 5, we obtain σ΅„© σ΅„© lim 󡄩󡄩𝑔 (π‘₯π‘˜ )σ΅„©σ΅„©σ΅„© = 0.

π‘˜β†’βˆž σ΅„©

(42)

Set π‘₯βˆ— as an accumulation point of sequence {π‘₯π‘˜ }∞ π‘˜=0 ; there is a convergent subsequence {π‘₯π‘˜π‘™ }∞ 𝑙=0 such that lim π‘₯π‘˜π‘™ = π‘₯βˆ— .

π‘™β†’βˆž

(43)

Nr. 1 2 3 4 5 6 7 8 9 10 11 12

Problems Rosenbrock Crescent CB2 CB3 DEM QL LQ Mifflin 1 Mifflin 2 Wolfe Rosen-Suzuki Shor

Dim. 2 2 2 2 2 2 2 2 2 2 4 5

𝑓ops (π‘₯) 0 0 1.9522245 2.0 βˆ’3 7.20 βˆ’1.4142136 βˆ’1.0 βˆ’1.0 βˆ’8.0 βˆ’44 22.600162

From (4) we know that 𝑔(π‘₯π‘˜ ) = (π‘₯π‘˜ βˆ’π‘(π‘₯π‘˜ ))/πœ†. Consequently, (42) and (43) show that π‘₯βˆ— = 𝑝(π‘₯βˆ— ). Hence, π‘₯βˆ— is an optimal solution of problem (1).

4. Numerical Results

This section presents numerical results from experiments using our multivariate spectral gradient algorithm on the nonsmooth test problems from [21]. We also list the results of [14] (the modified Polak-RibiΓ¨re-Polyak gradient method, MPRP) and [22] (the proximal bundle method, PBL) for comparison with Algorithm 5. All codes were written in MATLAB R2010a and run on a PC with a 2.8 GHz CPU, 2 GB of memory, and Windows 8. We set Ξ² = Ξ» = 1, Οƒ = 0.9, Ο΅ = 10^{-10}, and Ξ³ = 0.01, and the parameter Ξ΄ is chosen as

$$\delta = \begin{cases} 1, & \text{if } \|g^{a}(x_{k}, \varepsilon_{k})\| > 1, \\ \|g^{a}(x_{k}, \varepsilon_{k})\|^{-1}, & \text{if } 10^{-5} \le \|g^{a}(x_{k}, \varepsilon_{k})\| \le 1, \\ 10^{-5}, & \text{if } \|g^{a}(x_{k}, \varepsilon_{k})\| < 10^{-5}. \end{cases} \tag{44}$$
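In code, rule (44) reads roughly as follows (a sketch; the function name is ours):

```python
def choose_delta(ga_norm):
    """Safeguard value delta for Step 6(b), following rule (44)."""
    if ga_norm > 1.0:
        return 1.0
    if ga_norm >= 1e-5:
        return 1.0 / ga_norm
    return 1e-5

# Fixed experiment settings reported above: beta = lam = 1, sigma = 0.9,
# eps = 1e-10, gamma = 0.01; a run stops once ||g^a(x_k, eps_k)|| <= 1e-10.
```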

We adopt the termination condition ||g^a(x_k, Ξ΅_k)|| ≀ 10^{-10}. Subproblem (5) is solved by the classical PRP conjugate gradient method (called the subalgorithm); the subalgorithm stops if ||βˆ‚f(x_k)|| ≀ 10^{-4} or f(x_{k+1}) βˆ’ f(x_k) + ||βˆ‚f(x_{k+1})||Β² βˆ’ ||βˆ‚f(x_k)||Β² ≀ 10^{-3} holds, where βˆ‚f(x_k) denotes the subgradient of f(x) at the point x_k, and it also stops if its iteration count exceeds fifteen. In its line search, the Armijo technique is used, and the step length is accepted once the number of searches exceeds five. Table 1 lists the problem names, problem dimensions, and optimal values. A summary of the test results is presented in Tables 2 and 3, where "Nr." denotes the name of the tested problem, "NF" the number of function evaluations, "NI" the number of iterations, and "f(x)" the function value at the final iteration.

Table 1: Test problems.

Nr. | Problem | Dim. | f_opt(x)
1 | Rosenbrock | 2 | 0
2 | Crescent | 2 | 0
3 | CB2 | 2 | 1.9522245
4 | CB3 | 2 | 2.0
5 | DEM | 2 | βˆ’3
6 | QL | 2 | 7.20
7 | LQ | 2 | βˆ’1.4142136
8 | Mifflin 1 | 2 | βˆ’1.0
9 | Mifflin 2 | 2 | βˆ’1.0
10 | Wolfe | 2 | βˆ’8.0
11 | Rosen-Suzuki | 4 | βˆ’44
12 | Shor | 5 | 22.600162

The value of ρ controls the degree of nonmonotonicity of the line search, which may affect the performance of the MSG algorithm. Table 2 reports results for different values of the parameter ρ, as well as for values of the parameter Ο„_k ranging from 1/6(k+2)^6 to 1/2kΒ², on the problem Rosenbrock. We can conclude from the table that the proposed algorithm works reasonably well in all the test cases. The table also shows that the value of ρ can influence the performance of the algorithm significantly when Ξ΅ lies within a certain range, and that the choice ρ = 0.75 is better than ρ = 0.

Table 2: Results on Rosenbrock with different ρ and Ξ΅.

Ο„_k | ρ = 0: NI/NF/f(x) | Time | ρ = 0.75: NI/NF/f(x) | Time
1/2kΒ² | 30/46/1.581752e-9 | 1.794 | 29/30/7.778992e-9 | 1.076
1/3(k+2)Β³ | 28/38/5.207744e-9 | 1.420 | 26/27/6.541087e-9 | 1.023
1/4(k+2)⁴ | 29/37/1.502034e-9 | 1.388 | 27/28/5.112699e-9 | 1.030
1/5(k+2)⁡ | 27/37/1.903969e-9 | 1.451 | 27/28/6.329141e-9 | 1.092
1/6(k+2)⁢ | 27/36/4.859901e-9 | 1.376 | 27/28/6.073222e-9 | 1.025

We then compare the performance of MSG with that of the algorithms MPRP and PBL. In this test we fix Ο„_k = 1/2kΒ² and ρ = 0.75. To show the behavior of each algorithm more specifically, Table 3 reports the number of iterations, the number of function evaluations, and the final objective function value for the three methods. The numerical results indicate that Algorithm 5 successfully solves the test problems. From the iteration counts in Table 3, we see that Algorithm 5 performs best among the three methods, and the final function values obtained by Algorithm 5 are closer to the optimal values than those obtained by MPRP and PBL. In summary, the numerical experiments show that the proposed algorithm provides an efficient approach for solving nonsmooth problems.

Table 3: Numerical results for MSG/MPRP/PBL on problems 1–12.

Nr. | MSG: NI/NF/f(x) | MPRP: NI/NF/f(x) | PBL: NI/NF/f(x) | f_opt(x)
1 | 29/30/7.778992e-9 | 46/48/7.091824e-7 | 42/45/0.381e-6 | 0
2 | 9/10/1.450669e-5 | 11/13/6.735123e-5 | 18/20/0.679e-6 | 0
3 | 9/10/1.9522245 | 12/14/1.952225 | 32/34/1.9522245 | 1.9522245
4 | 4/9/2.000009 | 2/6/2.000098 | 14/16/2.0000000 | 2.0
5 | 3/4/βˆ’2.999949 | 4/6/βˆ’2.999866 | 17/19/βˆ’3.0000000 | βˆ’3
6 | 11/12/7.200000 | 10/12/7.200011 | 13/15/7.2000015 | 7.20
7 | 3/4/βˆ’1.4142136 | 2/3/βˆ’1.414214 | 11/12/βˆ’1.4142136 | βˆ’1.4142136
8 | 9/10/βˆ’0.9999638 | 4/6/βˆ’0.9919815 | 66/68/βˆ’0.9999994 | βˆ’1.0
9 | 12/13/βˆ’0.9999978 | 20/23/βˆ’0.9999925 | 13/15/βˆ’1.0000000 | βˆ’1.0
10 | 5/6/βˆ’7.999999 | β€” | 43/46/βˆ’8.0000000 | βˆ’8.0
11 | 6/7/βˆ’43.99797 | 28/58/βˆ’43.99986 | 43/45/βˆ’43.999999 | βˆ’44
12 | 12/13/2.260017 | 33/91/22.60023 | 27/29/22.600162 | 22.600162

5. Conclusions

We have extended the multivariate spectral gradient algorithm to solve nonsmooth convex optimization problems. The proposed algorithm combines a nonmonotone line search technique with the Moreau-Yosida regularization. The algorithm satisfies the sufficient descent property, and its global convergence has been established. Numerical results show the efficiency of the proposed algorithm.

Conflict of Interests

The author declares that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

The author would like to thank the anonymous referees for their valuable comments and suggestions, which helped to improve the paper greatly. The author also thanks Professor Gong-lin Yuan for kindly providing the source BB codes for nonsmooth problems. This work is supported by the National Natural Science Foundation of China (Grant no. 11161003).

References

[1] J. B. Hiriart-Urruty and C. LemarΓ©chal, Convex Analysis and Minimization Algorithms, Springer, Berlin, Germany, 1993.
[2] M. Fukushima and L. Qi, "A globally and superlinearly convergent algorithm for nonsmooth convex minimization," SIAM Journal on Optimization, vol. 6, no. 4, pp. 1106–1120, 1996.
[3] L. Q. Qi and J. Sun, "A nonsmooth version of Newton's method," Mathematical Programming, vol. 58, no. 3, pp. 353–367, 1993.
[4] F. H. Clarke, Optimization and Nonsmooth Analysis, Wiley, New York, NY, USA, 1983.
[5] S. Lu, Z. Wei, and L. Li, "A trust region algorithm with adaptive cubic regularization methods for nonsmooth convex minimization," Computational Optimization and Applications, vol. 51, no. 2, pp. 551–573, 2012.
[6] L. Q. Qi, "Convergence analysis of some algorithms for solving nonsmooth equations," Mathematics of Operations Research, vol. 18, no. 1, pp. 227–244, 1993.
[7] R. Correa and C. LemarΓ©chal, "Convergence of some algorithms for convex minimization," Mathematical Programming, vol. 62, no. 1–3, pp. 261–275, 1993.
[8] M. Fukushima, "A descent algorithm for nonsmooth convex optimization," Mathematical Programming, vol. 30, no. 2, pp. 163–175, 1984.
[9] J. R. Birge, L. Qi, and Z. Wei, "Convergence analysis of some methods for minimizing a nonsmooth convex function," Journal of Optimization Theory and Applications, vol. 97, no. 2, pp. 357–383, 1998.
[10] Z. Wei, L. Qi, and J. R. Birge, "A new method for nonsmooth convex optimization," Journal of Inequalities and Applications, vol. 2, no. 2, pp. 157–179, 1998.
[11] N. Sagara and M. Fukushima, "A trust region method for nonsmooth convex optimization," Journal of Industrial and Management Optimization, vol. 1, no. 2, pp. 171–180, 2005.
[12] G. Yuan, Z. Wei, and Z. Wang, "Gradient trust region algorithm with limited memory BFGS update for nonsmooth convex minimization," Computational Optimization and Applications, vol. 54, no. 1, pp. 45–64, 2013.
[13] G. Yuan and Z. Wei, "The Barzilai and Borwein gradient method with nonmonotone line search for nonsmooth convex optimization problems," Mathematical Modelling and Analysis, vol. 17, no. 2, pp. 203–216, 2012.
[14] G. Yuan, Z. Wei, and G. Li, "A modified Polak-RibiΓ¨re-Polyak conjugate gradient algorithm for nonsmooth convex programs," Journal of Computational and Applied Mathematics, vol. 255, pp. 86–96, 2014.
[15] Q. Li, "Conjugate gradient type methods for the nondifferentiable convex minimization," Optimization Letters, vol. 7, no. 3, pp. 533–545, 2013.
[16] L. Han, G. Yu, and L. Guan, "Multivariate spectral gradient method for unconstrained optimization," Applied Mathematics and Computation, vol. 201, no. 1-2, pp. 621–630, 2008.
[17] G. Yu, S. Niu, and J. Ma, "Multivariate spectral gradient projection method for nonlinear monotone equations with convex constraints," Journal of Industrial and Management Optimization, vol. 9, no. 1, pp. 117–129, 2013.
[18] Z. Yu, J. Sun, and Y. Qin, "A multivariate spectral projected gradient method for bound constrained optimization," Journal of Computational and Applied Mathematics, vol. 235, no. 8, pp. 2263–2269, 2011.
[19] Y. Xiao and Q. Hu, "Subspace Barzilai-Borwein gradient method for large-scale bound constrained optimization," Applied Mathematics and Optimization, vol. 58, no. 2, pp. 275–290, 2008.
[20] H. Zhang and W. W. Hager, "A nonmonotone line search technique and its application to unconstrained optimization," SIAM Journal on Optimization, vol. 14, no. 4, pp. 1043–1056, 2004.
[21] L. LukΕ‘an and J. VlΔek, "Test problems for nonsmooth unconstrained and linearly constrained optimization," Tech. Rep. 798, Institute of Computer Science, Academy of Sciences of the Czech Republic, Praha, Czech Republic, 2000.
[22] L. LukΕ‘an and J. VlΔek, "A bundle-Newton method for nonsmooth unconstrained minimization," Mathematical Programming, vol. 83, no. 3, pp. 373–391, 1998.

