SIAM J. OPTIM. Vol. 10, No. 1, pp. 289–313

© 1999 Society for Industrial and Applied Mathematics

DECREASING FUNCTIONS WITH APPLICATIONS TO PENALIZATION∗

A. M. RUBINOV†, B. M. GLOVER‡, AND X. Q. YANG§

Abstract. The theory of increasing positively homogeneous functions defined on the positive orthant is applied to the class of decreasing functions. A multiplicative version of the inf-convolution operation is studied for decreasing functions. Modified penalty functions for some constrained optimization problems are introduced that are in general nonlinear with respect to the objective function of the original problem. As the perturbation function of a constrained optimization problem is decreasing, the theory of decreasing functions is subsequently applied to the study of modified penalty functions, the zero duality gap property, and exact penalization.

Key words. decreasing functions, IPH functions, multiplicative inf-convolution, modified penalty functions, exact penalization

AMS subject classifications. 90C30, 65K05

PII. S1052623497326095

1. Introduction. In this paper we study positive decreasing functions defined on the positive orthant R^n_{++} and their applications to nonlinear penalization formed by increasing positively homogeneous (IPH) functions of the first degree. There exists a natural isomorphism between the ordered space of all positive decreasing functions and the ordered space of all IPH functions. The theory of abstract convexity (see [9, 12, 18]) allows us to consider duality in various nonconvex situations. Recently [16] a duality theory based on abstract convexity with respect to the so-called min-type functions was developed for IPH functions. The isomorphism between IPH functions and decreasing functions allows us to apply this theory in the study of positive decreasing functions. This approach is developed in the first part of the paper. We also introduce and study a multiplicative analogue of the inf-convolution operation. This operation exhibits very nice properties in the class of decreasing functions, similar to the usual "sum" inf-convolution, which exhibits very nice properties in the class of convex functions. The second part of the paper is devoted to the study of modified penalty functions for the single constraint problem

f_o(x) → inf   subject to x ∈ X, f_1(x) ≤ 0

with a real-valued positive objective function f_o and a real-valued constraint function f_1. The perturbation function

β̃(y) = inf{f_o(x) : x ∈ X, f_i(x) ≤ y_i, i = 1, . . . , m},   y = (y_1, . . . , y_m),

∗ Received by the editors August 15, 1997; accepted for publication (in revised form) December 14, 1998; published electronically November 29, 1999. This research was supported by grants from the Australian Research Council. http://www.siam.org/journals/siopt/10-1/32609.html † School of Information Technology and Mathematical Sciences, The University of Ballarat, Ballarat, VIC 3353, Australia ([email protected]). ‡ Department of Mathematics, Curtin University of Technology, Perth, WA 6845, Australia ([email protected]). § Department of Mathematics, The University of Western Australia, Nedlands, WA 6907, Australia ([email protected]).

has useful applications in the study of the nonlinear programming problem

f_o(x) → inf   subject to x ∈ X, f_i(x) ≤ 0, i = 1, . . . , m

(see for example [3, 5, 6, 8, 10, 11, 14] and the references therein). The perturbation function is decreasing: y^1 ≥ y^2 ⟹ β(y^1) ≤ β(y^2). Consequently, the study of perturbation functions should be based on a theory of decreasing functions. It is well known that all constraints f_i can be convoluted into a single constraint (see [19] for a detailed discussion). For example, we can use a convolution by maximum: max_i f_i(x) ≤ 0 ⟺ f_i(x) ≤ 0 for all i. So we restrict ourselves to the problem with a single constraint. The classical penalty function for an optimization problem with a single constraint is formed by means of the classical convolution function p(y_1, y_2) = y_1 + y_2. Sometimes it is more convenient to consider the convolution by an increasing function p with some additional properties. This approach has been developed by Rubinov, Glover, and Yang [17] and Andramonov [1]. In such a case the modified penalty function L^+_p(x, d) has the form

(1.1)   L^+_p(x, d) = p(f_o(x), d max(f_1(x), 0)).
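As a quick numerical illustration of (1.1), the sketch below evaluates the modified penalty function for the classical convolution p_1(y_o, y_1) = y_o + y_1 and for p_{1/2}(y_o, y_1) = (y_o^{1/2} + y_1^{1/2})^2; the objective f_o and constraint f_1 are toy choices of ours, not taken from the paper.

```python
import math

def L_plus(p, f0, f1, x, d):
    # Modified penalty function (1.1): L_p^+(x, d) = p(f0(x), d * max(f1(x), 0))
    return p(f0(x), d * max(f1(x), 0.0))

# Two IPH convolution functions on R^2_+:
p1 = lambda y0, y1: y0 + y1                                    # classical convolution
p_half = lambda y0, y1: (math.sqrt(y0) + math.sqrt(y1)) ** 2   # p_{1/2}

# Toy problem (an assumption): minimize f0(x) = x^2 + 1 subject to f1(x) = 1 - x <= 0
f0 = lambda x: x * x + 1.0
f1 = lambda x: 1.0 - x

# On the feasible set both penalty functions reduce to f0:
print(L_plus(p1, f0, f1, 1.5, 10.0))      # ≈ f0(1.5) = 3.25
print(L_plus(p_half, f0, f1, 1.5, 10.0))  # ≈ 3.25
# An infeasible point is penalized, and more strongly for larger d:
print(L_plus(p1, f0, f1, 0.0, 10.0))      # ≈ 1 + 10*1 = 11
```

The sketch only illustrates the shape of (1.1); the paper's analysis of exact penalty parameters for p_{1/2} versus p_1 comes later.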

Among the many issues that arise in connection with such a setting, we indicate the following:
1. how to obtain conditions ensuring the zero duality gap property;
2. how to obtain conditions ensuring exact penalization;
3. how to find convolution functions p such that an exact penalty parameter for the function L^+_p is substantially smaller than that of the classical function;
4. how to find convolution functions p such that the penalty function L^+_p is smooth at the solution.
The corresponding questions for the classical function (except question 3) have been discussed, for example, in [2, 5, 7, 11, 14]. The first three questions are addressed in this paper. We consider the penalty function of the form (1.1) involving only an IPH function p with some natural properties. It is demonstrated that for a large class of IPH functions p, the zero duality gap property

inf{f_o(x) : x ∈ X, f_1(x) ≤ 0} = sup_{d>0} inf_{x∈X} L^+_p(x, d)

holds if and only if the perturbation function

β(y) = inf{f_o(x) : f_1(x) ≤ y},   y ≥ 0,

is lower semicontinuous at the origin. Thus the zero duality gap property depends only on the problem itself but does not depend on an outer convolution function from a very large class of such functions. The proof of this fact is based on the theory of multiplicative inf-convolution developed in the first part of the paper. In contrast to this result, we show that exact penalization essentially depends on an outer convolution function. In particular, it is proved that as a rule the penalization with respect to the function p_{+∞}(α, y) = max{α, y} is not exact.

The convolution with respect to the family of IPH functions

p_k(δ, y) = (δ^k + y^k)^{1/k}   (0 < k < +∞)

is considered. We study especially the exact penalization by p_{1/2} and demonstrate that this penalization can always be accomplished with a smaller penalty parameter than that of the classical convolution function p_1. We also obtain an asymptotically sharp estimate of the ratio d̄_{1/2}/d̄_1, where d̄_k is the least exact penalization parameter with respect to p_k. This estimate allows us to draw the following conclusion: if the constrained minimum of the objective function is not very far from the unconstrained minimum of this function, then the penalization by p_{1/2} can be accomplished with a substantially smaller exact penalty parameter d̄_{1/2} than d̄_1. Thus for many problems the ill-conditioning of the penalty function with a large penalty parameter can be overcome by using p_{1/2}. We prove that the class of IPH functions is sufficiently large to provide exact penalization: an exact modified penalty function can be found for a given problem under some very mild assumptions.

The paper is structured as follows. The class of IPH functions is studied in section 2 and decreasing functions in section 3. The multiplicative inf-convolution of decreasing functions is introduced and studied in section 4. The modified penalty function is discussed in section 5. Section 6 is devoted to the perturbation function, and its links with the modified penalty function and exact penalization are investigated in section 7.

2. Increasing positively homogeneous functions. Let I be a finite set of indices. We shall use the following notation:
• R^I is the space of all vectors (x_i)_{i∈I};
• x_i is the ith coordinate of a vector x ∈ R^I;
• if x, y ∈ R^I, then x ≥ y ⟺ x_i ≥ y_i for all i ∈ I;
• if x, y ∈ R^I, then x ≫ y ⟺ x_i > y_i for all i ∈ I;
• R^I_+ = {x = (x_i) ∈ R^I : x ≥ 0};
• R^I_{++} = {x = (x_i) ∈ R^I : x ≫ 0}.
If I consists of n elements we will also use the notation R^n, R^n_+, and R^n_{++} instead of R^I, R^I_+, and R^I_{++}, respectively.
A function p defined on either the cone R^I_{++} or the cone R^I_+ and mapping into R_{+∞} := R_+ ∪ {+∞} is called an IPH function if p is increasing (x ≥ y ⟹ p(x) ≥ p(y)) and positively homogeneous of degree one (p(λx) = λp(x) for λ > 0) and if there is a point y ∈ R^I_{++} such that p(y) < +∞. If p is an IPH function defined on R^I_+, then the restriction of p to R^I_{++} is also an IPH function. We shall denote the class of all IPH functions defined on R^I_{++} by IPH_{++}. It is known (see [16]) that every function p ∈ IPH_{++} is continuous on R^I_{++}. The simplest example of an IPH function is a function of the form ℓ(y) = ⟨ℓ, y⟩, where ℓ = (ℓ_1, . . . , ℓ_m) ∈ R^I_{++} and

(2.1)   ⟨ℓ, y⟩ = min_{i∈I} ℓ_i y_i.

The main tool in the study of IPH functions will be the so-called support set. In particular, we will use support sets of IPH functions defined on R^I_{++}.

Definition 2.1 (see [16]). Let p ∈ IPH_{++}. The set

(2.2)   supp(p) = {ℓ ∈ R^I_{++} : ⟨ℓ, y⟩ ≤ p(y) for all y ∈ R^I_{++}}

is called the support set of the function p.

For IPH functions defined on the cone R^I_+ it is possible to define a support set in different ways. One of them has been studied in [15]. However, it will be more convenient to use the following definition in this paper.

Definition 2.2. Let p̄ be an IPH function defined on R^I_+ and let p be the restriction of the function p̄ to the cone R^I_{++}. Then the support set of the function p, defined by (2.2), is called the support set of the function p̄.

We shall study in this section only support sets for IPH functions defined on R^I_{++}. The following result shows that each IPH function can be reconstructed from its support set.

Theorem 2.1 (see [16]). Let p ∈ IPH_{++}. Then p(y) = max{⟨ℓ, y⟩ : ℓ ∈ supp(p)} for all y ∈ R^I_{++}.

A subset U of R^I_{++} is called normal if ℓ^1 ∈ U, ℓ^2 ∈ R^I_{++}, and ℓ^1 ≥ ℓ^2 imply ℓ^2 ∈ U. A subset U of R^I_{++} is called closed if it is closed in the topological space R^I_{++}.

Theorem 2.2 (see [16]). Let U ⊂ R^I_{++}. Then U = supp(p) for some p ∈ IPH_{++} if and only if U is normal and closed.

For a, x ∈ R^I_{++}, U ⊂ R^I_{++} we shall require the following notational convention:

(2.3)   a · x = (a_i x_i)_{i∈I},   a · U = {a · u : u ∈ U},   a/x = (a_i/x_i)_{i∈I}.

If x ∈ R^I_{++}, then x^{-1} ≡ 1/x = (1/x_i)_{i∈I}.

The following result provides an explicit description of the support set.

Theorem 2.3 (see [16]). Let p be an IPH function defined on the cone R^I_{++}. Then supp(p) = {ℓ ∈ R^I_{++} : p(ℓ^{-1}) ≥ 1}.

We now describe some properties of support sets. It follows immediately from the definition that

p_1 ≤ p_2 ⟺ supp(p_1) ⊂ supp(p_2)   for p_1, p_2 ∈ IPH_{++}.
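Theorem 2.3 gives a directly computable membership test for support sets. The sketch below applies it to a max-type function p(y) = max_i a_i y_i (the vector a and the test points are our own toy choices) and spot-checks the monotonicity property above with a min-type comparison function.

```python
def in_supp(p, ell):
    # Theorem 2.3: ell ∈ supp(p) iff p(ell^{-1}) >= 1, where ell^{-1} = (1/ell_i)_{i∈I}
    return p([1.0 / li for li in ell]) >= 1.0

a = (2.0, 3.0)  # toy data (an assumption)
p_max = lambda y: max(ai * yi for ai, yi in zip(a, y))  # p(y) = max_i a_i y_i
p_min = lambda y: min(ai * yi for ai, yi in zip(a, y))  # min-type function, p_min <= p_max

for ell in [(1.0, 5.0), (2.0, 3.0), (2.5, 3.5)]:
    # for p_max, membership should reduce to min_i ell_i / a_i <= 1
    assert in_supp(p_max, ell) == (min(li / ai for li, ai in zip(ell, a)) <= 1.0)
    # p_min <= p_max, so supp(p_min) ⊂ supp(p_max)
    assert (not in_supp(p_min, ell)) or in_supp(p_max, ell)
print("support-set checks passed")
```

The first assertion anticipates the explicit description of supp(p) for max-type functions derived in Example 2.1 below.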

Proposition 2.1. Let p(x) = inf_{α∈A} p_α(x), where (p_α)_{α∈A} is a family of IPH functions. Then supp(p) = ∩_{α∈A} supp(p_α).

Proof. Since the sets supp(p_α) are normal and closed it follows that their intersection is also normal and closed. Theorem 2.2 shows that there exists an IPH function p̃ such that supp(p̃) = ∩_{α∈A} supp(p_α). Since p̃ ≤ p_α for all α ∈ A it follows that p̃ ≤ inf_{α∈A} p_α = p. The function p(x) = inf_{α∈A} p_α(x) is IPH. Since p ≤ p_α for all α, it follows that supp(p) ⊂ ∩_{α∈A} supp(p_α) = supp(p̃), so p ≤ p̃. Hence p = p̃ and supp(p) = supp(p̃) = ∩_{α∈A} supp(p_α).

Let a = (a_i)_{i∈I} ∈ R^I_{++} and p ∈ IPH_{++}. We will require in what follows the function p_a, where p_a(y) = p(a · y) and a · y is defined by (2.3).

Proposition 2.2. supp(p_a) = a · supp(p) ≡ {a · ℓ : ℓ ∈ supp(p)}.

Proof. Let ℓ ∈ supp(p_a), y = (y_i)_{i∈I} ∈ R^I_{++}, and z = (z_i)_{i∈I} = a · y. Then

p(z) = p(a · y) ≥ ⟨ℓ, y⟩ = min_{i∈I} ℓ_i y_i = min_{i∈I} (ℓ_i/a_i) z_i.

Thus the vector ℓ' = ℓ/a belongs to supp(p). Since ℓ = a · ℓ' it follows that ℓ ∈ a · supp(p).

Lemma 2.1. If U is a normal set and a ≥ a' ≫ 0, then a · U ⊃ a' · U.

Proof. Let ℓ ∈ a' · U. Then there exists u ∈ U such that ℓ = a' · u. Thus (ℓ/a') ∈ U. Since U is normal it follows that (ℓ/a) ∈ U, so ℓ ∈ a · U.

We now give some examples of IPH functions and support sets.

Example 2.1. Let p(y) = max_{i∈I} a_i y_i with a = (a_i)_{i∈I} and a_i > 0. Clearly p ∈ IPH_{++}. Applying Theorem 2.3 we can easily conclude that the support set supp(p) coincides with the following set V_a:

(2.4)   V_a = ∪_{i∈I} {ℓ = (ℓ_1, . . . , ℓ_m) ∈ R^I_{++} : ℓ_i ≤ a_i} = {ℓ ∈ R^I_{++} : min_{i∈I} ℓ_i/a_i ≤ 1}.

Assume, more generally, that a_i ≥ 0 for all i ∈ I. Let I_a = {i ∈ I : a_i > 0}. It is easy to check that

V_a = {ℓ ∈ R^I_{++} : min_{i∈I_a} ℓ_i/a_i ≤ 1}.

Example 2.2. Let 0 < k < +∞ and

p_k(x) = (∑_{i∈I} x_i^k)^{1/k}   for all x ∈ R^I_{++}.

Clearly p_k ∈ IPH_{++}. Applying Theorem 2.3 we obtain the following:

(2.5)   supp(p_k) = {ℓ ∈ R^I_{++} : ∑_{i∈I} 1/ℓ_i^k ≥ 1}.

If k_1 ≥ k_2, then p_{k_1} ≤ p_{k_2}; thus supp(p_{k_1}) ⊂ supp(p_{k_2}). For the function p_∞(x) := max_{i∈I} x_i we have p_∞(x) = inf_{k>0} p_k(x). From Proposition 2.1 it follows that

supp(p_∞) = ∩_{k>0} supp(p_k).

Thus, from (2.4) and (2.5),

{ℓ ∈ R^I_{++} : min_{i∈I} ℓ_i ≤ 1} = ∩_{k>0} {ℓ ∈ R^I_{++} : ∑_{i∈I} 1/ℓ_i^k ≥ 1}.
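The intersection identity relating (2.4) and (2.5) can be probed numerically: min_i ℓ_i ≤ 1 should hold exactly when ∑_i ℓ_i^{-k} ≥ 1 for every k > 0. A sketch with a few toy points of ours and a finite sample of exponents (for the points used here, the largest sampled k already separates the two cases):

```python
# Sampled exponents; for min(ell) > 1 the sum sum_i ell_i^{-k} tends to 0 as k grows,
# so a moderately large k suffices to witness failure for these toy points.
ks = [0.5, 1.0, 2.0, 8.0, 32.0]

def in_supp_pk(ell, k):
    # Membership in supp(p_k) per (2.5)
    return sum(li ** (-k) for li in ell) >= 1.0

for ell in [(0.9, 5.0), (1.0, 1.0), (1.1, 1.2), (3.0, 4.0)]:
    lhs = min(ell) <= 1.0                           # membership in supp(p_inf) per (2.4)
    rhs = all(in_supp_pk(ell, k) for k in ks)       # membership in every sampled supp(p_k)
    assert lhs == rhs
print("intersection identity checks passed")
```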

So g_U(y) = +∞ ≥ lim sup_k g_U(y_k). Assume now that g_U(y_k) < +∞ for all k. Then (g_U(y_k), y_k) ∈ U. If λ := lim sup_k g_U(y_k) < +∞, then (λ, y) ∈ U and therefore λ ≤ g_U(y). If λ = +∞ then it easily follows that g_U(y) = +∞. Thus lim sup_k g_U(y_k) ≤ g_U(y) in both cases.

Proposition 3.2. Let g ≥ 0 be a decreasing and upper semicontinuous function and U = {(α, y) : y ≫ 0, 0 < α ≤ g(y), y ∈ dom g}. Then U is a normal closed set and g = g_U.

Proof. We first show that U is normal. Let (α_1, y_1) ∈ U, α_2 > 0, y_2 ≫ 0, and (α_1, y_1) ≥ (α_2, y_2). Since g is decreasing we have α_2 ≤ α_1 ≤ g(y_1) ≤ g(y_2). Thus (α_2, y_2) ∈ U. Since g is upper semicontinuous it follows that U is closed. We also have

g_U(y) = sup{α : (α, y) ∈ U} = sup{α : α ≤ g(y)} = g(y)   (y ∈ dom g).

Recall that I^0 = {0} ∪ I. Consider an IPH function p defined on the cone R^{I^0}_{++}. Let U = supp(p) be the support set of the function p; then the set U generates the function g_U by (3.1).

Definition 3.1. Let p ∈ IPH_{++} and U = supp(p). Then the function g_U defined by (3.1) is called the associated function to p. We shall denote the associated function to p by h_p.

Example 3.1. Let p(δ, y) = max{αδ, a_1 y_1, . . . , a_m y_m} with α > 0, a = (a_1, . . . , a_m) ∈ R^I_+. Then (see Example 2.1) U = supp(p) coincides with the set V_{(α,a)} defined as follows:

V_{(α,a)} = {ℓ ∈ R^{I^0}_{++} : min{ℓ_0/α, min_{i∈I_a} ℓ_i/a_i} ≤ 1},

where I_a = {i : a_i > 0}. If y ∈ R^I_{++} is a vector such that y_i ≤ a_i for some i, then (δ, y) ∈ U for all δ > 0, so h_p(y) = g_U(y) = +∞. Assume now that y ≫ a. Then (δ, y) ∈ V_{(α,a)} if and only if δ ≤ α, so h_p(y) = α. Thus

(3.3)   h_p(y) = α if y ≫ a, and +∞ otherwise.

Assume now that α = 0. Again applying Example 2.1, it is easy to check that h_p is defined by (3.3) with α = 0. Thus this function coincides with the indicator function δ_Z of the set Z = {y : y ≫ a}.

Example 3.2. Let 0 < k < +∞ and p_k(δ, y) = (δ^k + ∑_{i∈I} y_i^k)^{1/k}. Let U = supp(p_k). Furthermore, let

u = (1 − ∑_{i∈I} 1/y_i^k)^{1/k}.

Then (see Example 2.2) we have

U = {(α, y) ∈ R^{I^0}_{++} : 1/α^k + ∑_{i∈I} 1/y_i^k ≥ 1}.

So

h_p(y) = sup{α : 1/α^k ≥ 1 − ∑_{i∈I} 1/y_i^k} = 1/u if 1 > ∑_{i∈I} 1/y_i^k, and +∞ otherwise.
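The closed form in Example 3.2 can be spot-checked numerically: below we approximate h_p(y) = sup{α : 1/α^k + ∑_i 1/y_i^k ≥ 1} by a grid search and compare it with 1/u. The values of k and y are arbitrary toy choices.

```python
# Grid-based check of Example 3.2 (k and y are assumed toy values).
k = 2.0
y = (2.0, 3.0)
s = sum(1.0 / yi**k for yi in y)          # sum_i 1/y_i^k = 1/4 + 1/9
assert s < 1.0                            # so h_p(y) should be finite

u = (1.0 - s) ** (1.0 / k)
h_closed = 1.0 / u                        # Example 3.2: h_p(y) = 1/u

# h_p(y) = sup{alpha : 1/alpha^k + sum_i 1/y_i^k >= 1}, approximated on a grid:
alphas = [i * 1e-4 for i in range(1, 100000)]
h_grid = max(a for a in alphas if 1.0 / a**k + s >= 1.0)
assert abs(h_grid - h_closed) < 1e-3
print(h_closed)
```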

Let p be an IPH function defined on R^{I^0}_{++}. We now show that

sup_{y≫0} p(1, y) = sup_{y≫0} h_p(y).

We need the following simple assertion.

Lemma 3.1. Let ψ(λ) be a continuous decreasing function defined on the interval (0, +∞) and let lim_{λ→+0} ψ(λ) = M̄ > 0. Let χ_b(λ) = min{ψ(λ), bλ} for b > 0. Then
1. for all b > 0 the function χ_b attains its maximum at a unique point λ_b > 0;
2. λ_b is a solution of the equation ψ(λ) = bλ;
3. λ_b → 0 as b → +∞;
4. χ_b(λ_b) = ψ(λ_b) = bλ_b → M̄ as b → +∞.

Proof. The proof is straightforward.

Proposition 3.3. Let p be an IPH function defined on R^{I^0}_{++}. Then

sup_{y≫0} p(1, y) = sup_{y≫0} h_p(y).

Proof. First we shall verify that

(3.4)   p(1, y) = sup_{z≫0} min{h_p(z), ⟨z, y⟩}   for all y ∈ R^I_{++}.

Indeed it follows, from the definition of the associated function h_p, that supp(p) = {(δ, z) : z ≫ 0, 0 < δ ≤ h_p(z)}. So for y ≫ 0 we have

p(1, y) = sup{⟨(δ, z), (1, y)⟩ : (δ, z) ∈ supp(p)} = sup_{z≫0, δ≤h_p(z)} min(δ, ⟨z, y⟩).

Thus (3.4) holds. It follows that for an arbitrary y ≫ 0 and ε > 0 there exists a vector z ≫ 0 such that

p(1, y) − ε ≤ min(h_p(z), ⟨z, y⟩) ≤ h_p(z) ≤ sup_{u≫0} h_p(u).

Thus

sup_{y≫0} p(1, y) ≤ sup_{u≫0} h_p(u).

We now verify the reverse inequality. Fix a vector z ≫ 0 and consider the ray {λz : λ > 0}. Let ψ(λ) ≡ ψ_z(λ) = h_p(λz). The function ψ is decreasing and

(3.5)   lim_{λ→+0} ψ(λ) = lim_{y→0} h_p(y) = sup_{y≫0} h_p(y).

For y ≫ 0 consider the function χ_b(λ) = min{ψ(λ), b_y λ}, where b_y = ⟨z, y⟩. Let λ_y be a solution of the equation ψ(λ) = b_y λ. Lemma 3.1 shows that max_{λ>0} min{ψ(λ), b_y λ} is attained at the point λ_y and equals ψ(λ_y). It follows from (3.4) that

p(1, y) ≥ max_{λ>0} min(ψ(λ), b_y λ) = ψ(λ_y).

Applying (3.5) and Lemma 3.1 we have

sup_{y≫0} p(1, y) ≥ sup_{y≫0} ψ(λ_y) = lim_{λ→+0} ψ(λ) = lim_{λ→+0} h_p(λz) = sup_{u≫0} h_p(u).

We now show that the associated function h_p can be expressed in terms of the initial function p.

Proposition 3.4. Assume that p is an IPH function defined on R^{I^0}_{++}. Let z ∈ R^I_{++}. Then the following hold:
1. If lim_{τ→+0} p(τ, z^{-1}) ≥ 1, then h_p(z) = +∞.
2. If lim_{τ→+∞} p(τ, z^{-1}) < 1, then h_p(z) = 0.
3. If lim_{τ→+0} p(τ, z^{-1}) < 1 ≤ lim_{τ→+∞} p(τ, z^{-1}), then

h_p(z) = 1/b(1/z),

where b(z) is the smallest solution of the equation p(b, z) = 1.

Proof. It follows from Theorem 2.3 that

supp(p) = {(α, y) ∈ R^{I^0}_{++} : p(α^{-1}, y^{-1}) ≥ 1}.

Let z ∈ R^I_{++} and y = z^{-1}. Then

h_p(z) = h_p(y^{-1}) = sup{α : (α, y^{-1}) ∈ supp(p)} = sup{α : p(α^{-1}, y) ≥ 1} = sup{τ^{-1} : p(τ, y) ≥ 1} = 1/inf{τ : p(τ, y) ≥ 1}.

Let ψ_y(τ) = p(τ, y). Thus

(3.6)   h_p(z) = 1/inf{τ : ψ_y(τ) ≥ 1}.

It follows, from the properties of the function p, that ψ_y is an increasing continuous function on R_{++}. Let

γ_− = lim_{τ→+0} ψ_y(τ)   and   γ_+ = lim_{τ→+∞} ψ_y(τ).

If γ_− ≥ 1, then inf{τ : ψ_y(τ) ≥ 1} = 0; if γ_+ < 1, then the set {τ : ψ_y(τ) ≥ 1} is empty and so the infimum of this set is defined to be +∞. If γ_− < 1 ≤ γ_+, then inf{τ : ψ_y(τ) ≥ 1} is equal to the smallest root of the equation ψ_y(τ) = 1. The desired result follows from (3.6).

Proposition 3.4 allows us to describe some properties of IPH functions in terms of associated functions.

Let p be an IPH function defined on the cone R^{I^0}_+. The support set of the function p coincides (see Definition 2.2) with the support set of its restriction to R^{I^0}_{++}. We will denote this restriction by the same letter p. The following propositions will be useful in what follows. We give a sketch of their proofs.

Proposition 3.5. Let p be a continuous IPH function defined on the cone R^{I^0}_+. Then lim_{min_i z_i→+∞} h_p(z) = 1 if and only if p(1, 0, . . . , 0) = 1.

Proof. Let lim_{min_i z_i→+∞} h_p(z) = 1. Since 0 < h_p(z) < +∞, it follows from Proposition 3.4 that h_p(z) = (b(z^{-1}))^{-1}, where p(b(z^{-1}), z^{-1}) = 1. Since p is continuous we can conclude that

(3.7)   p(1, 0) = lim_{min_i z_i→+∞} p(b(z^{-1}), z^{-1}) = 1.

Now assume that p(1, 0) = 1. Let t(z) = {α : p(α, z) = 1}. Since p is positively homogeneous it follows from (3.7) that t(0) = {1}. By continuity of p we have

(α → α_0, z → 0, α ∈ t(z)) ⟹ α_0 = 1.

Since b(z^{-1}) ∈ t(z^{-1}) it follows that h_p(z) = (b(z^{-1}))^{-1} → 1 as min_i z_i → +∞.

Let h be a function defined on R^I_{++} and L = lim_{∥z∥→+∞} h(1/z). We can present this limit in the following form:

(3.8)   L = lim_{max_i z_i→+∞} h(1/z) = lim_{min_i y_i→0} h(y).
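Proposition 3.4 suggests a practical way to evaluate h_p: solve p(b, z^{-1}) = 1 for the smallest root b by bisection (ψ_y is increasing and continuous) and take h_p(z) = 1/b. A sketch for the family p_k of Example 3.2, with toy values of k and z, cross-checked against that example's closed form:

```python
import math

def b_smallest(p, yvec, lo=1e-12, hi=1e12, iters=200):
    # Bisection for the smallest b with p(b, y) = 1; psi_y(b) = p(b, y) is increasing
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if p(mid, yvec) >= 1.0:
            hi = mid
        else:
            lo = mid
    return hi

k = 2.0
p_k = lambda d, yv: (d**k + sum(yi**k for yi in yv)) ** (1.0 / k)

z = (2.0, 3.0)                                  # toy point (an assumption)
y = tuple(1.0 / zi for zi in z)                 # y = z^{-1}
h_via_root = 1.0 / b_smallest(p_k, y)           # Proposition 3.4: h_p(z) = 1/b(1/z)

s = sum(1.0 / zi**k for zi in z)
h_closed = 1.0 / (1.0 - s) ** (1.0 / k)         # Example 3.2's closed form
assert abs(h_via_root - h_closed) < 1e-6
print(h_via_root)
```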

Proposition 3.6. Let p be an IPH function defined on R^{I^0}_{++} with I^0 = {0} ∪ I and lim_{∥u∥→+∞} p(1, u) = +∞. Then lim_{min_i z_i→0} h_p(z) = +∞.

Proof. First we show that

(3.9)   (p(α, y) = 1, ∥y∥ → +∞) ⟹ α → 0.

Indeed, if α ≥ 1 then p(α, y) ≥ p(1, y) → +∞ (as ∥y∥ → +∞), which is a contradiction. Thus α < 1. Since p is IPH it follows from p(α, y) = 1 that p(1, y/α) = 1/α. Assume that α → α_0 as ∥y∥ → +∞, with α_0 ≤ 1. If α_0 > 0, then ∥y/α∥ → +∞ and therefore p(1, y/α) → +∞, while p(1, y/α) = 1/α → 1/α_0 < +∞, which is again a contradiction. Thus α_0 = 0. Since

h_p(1/y) = (b(y))^{-1},

where p(b(y), y) = 1, it follows from (3.9) and (3.8) that

lim_{min_i z_i→0} h_p(z) = lim_{∥y∥→+∞} h_p(1/y) = lim_{∥y∥→+∞} (b(y))^{-1} = +∞.

4. Multiplicative inf-convolution of decreasing functions.

Definition 4.1. Let h and l be functions defined on R^I_{++} and mapping into (0, +∞]. The function

(4.1)   (h ⊙ l)(z) = inf_{y≫0} h(y) l(z/y),   z ∈ R^I_{++},

is called the multiplicative inf-convolution of the functions h and l.

Since inf_{y≫0} h(y) l(z/y) = inf_{u≫0} h(z/u) l(u), it follows that the multiplicative inf-convolution is a commutative operation: h ⊙ l = l ⊙ h. If l is a decreasing function, then, applying (4.1), it is easy to check that the multiplicative inf-convolution h ⊙ l of l and an arbitrary positive function h is also decreasing. Assume now that l is an upper semicontinuous function. Then for an arbitrary function h the function z ↦ h(y) l(z/y) is upper semicontinuous for all y ≫ 0 and therefore h ⊙ l is also upper semicontinuous. In particular, the following assertion holds.

Proposition 4.1. If l is a positive decreasing upper semicontinuous function, then h ⊙ l is decreasing and upper semicontinuous for an arbitrary positive function h.

Example 4.1. Let α be a nonnegative number and a = (a_1, . . . , a_m) be a positive vector. Let p(δ, y) = max{αδ, a_1 y_1, . . . , a_m y_m}. Then (see Example 3.1) h_p(y) = α if y ≫ a and h_p(y) = +∞ otherwise. Let l be a continuous decreasing function defined on R^I_{++}. We have

(l ⊙ h_p)(z) = inf_{y≫0} l(y) h_p(z/y) = inf_{u≫0} l(z/u) h_p(u) = inf_{u≫a} l(z/u) α.

Since l is continuous and decreasing we conclude that inf_{u≫a} l(z/u) = l(z/a). Thus

(l ⊙ h_p)(z) = α l(z/a).

In particular, if α = 1 and a = (1, . . . , 1), then l ⊙ h_p = l for all continuous decreasing functions l.

We now describe the positive part of the hypograph of the multiplicative inf-convolution of decreasing functions. It is convenient to describe this in terms of the support set of the corresponding IPH function.

Proposition 4.2. Let l be a finite decreasing positive function defined on R^I_{++} and p be an IPH function defined on R^{I^0}_{++} with I^0 = {0} ∪ I. Let h_p be the associated function for p. Then

(4.2)   ∩_{y≫0} (l(y), y) · supp(p) = {(δ, z) : 0 < δ ≤ (l ⊙ h_p)(z), z ≫ 0},

where the product a · U is defined by (2.3).

Proof. Let us prove that for all y ≫ 0:

(4.3)   (l(y), y) · supp(p) = {(δ, z) : δ ≤ l(y) h_p(z/y)}.

Indeed, since l(y) > 0 it follows from the definition of the associated function that

(l(y), y) · supp(p) = {(l(y)γ, y · u) : (γ, u) ∈ supp(p)} = {(l(y)γ, y · u) : γ ≤ h_p(u)} = {(δ, z) : δ/l(y) ≤ h_p(z/y)} = {(δ, z) : δ ≤ l(y) h_p(z/y)}.

Let V be the set on the left-hand side in (4.2). Then

(δ, z) ∈ V ⟺ (δ, z) ∈ (l(y), y) · supp(p) for all y ≫ 0 ⟺ δ ≤ l(y) h_p(z/y) for all y ≫ 0 ⟺ δ ≤ inf_{y≫0} l(y) h_p(z/y).

So V = {(δ, z) : δ ≤ (l ⊙ h_p)(z), z ≫ 0}.

Remark 4.1. Let U and V be closed normal subsets of R^{I^0}_{++} and g_U = l. The set {(l(y), y) : y ≫ 0} represents the upper boundary of the normal set U. We can consider the set defined by (4.2) with V = supp(p) as a "product" of the sets U and V. Since the set V can be considered as the positive part of the hypograph hyp h_p and, similarly, U can be considered the positive part of hyp l, it follows that the positive part of the hypograph of the multiplicative inf-convolution of h_p and l coincides with the "product" of the positive parts of the hypographs hyp h_p and hyp l.

Let h = l ⊙ h_p be a multiplicative inf-convolution, where l and h_p are as in Proposition 4.2. It follows from Proposition 4.1 that h is a decreasing and upper semicontinuous function and therefore there exists an IPH function r such that h = h_r.

Proposition 4.3. Let l be a decreasing positive function defined on R^I_{++} and let p, r : R^{I^0}_{++} → R_{+∞} be IPH functions. Then h_r = l ⊙ h_p if and only if

supp(r) = ∩_{y≫0} (l(y), y) · supp(p).

Proof. This follows directly from Proposition 4.2.

Lemma 4.1. Let h be a finite decreasing function defined on R^I_{++}. Then the following limits exist:

lim_{z→0} h(z) = sup_{z≫0} h(z),   lim_{min_i z_i→+∞} h(z) = inf_{z≫0} h(z).

Proof. The proof is straightforward.

We now present the main result of this section.
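Before stating it, a brute-force grid approximation of Definition 4.1 can be used to sanity-check Example 4.1's identity (l ⊙ h_p)(z) = α l(z/a); the particular l, α, a, and z below are toy choices of ours.

```python
import math, itertools

alpha, a = 2.0, (1.0, 2.0)                        # toy parameters (assumptions)
l = lambda y: 1.0 / (1.0 + y[0] + y[1])           # continuous, decreasing, positive

def h_p(y):
    # Example 3.1 / 4.1: h_p(y) = alpha on {y >> a}, +infinity otherwise
    return alpha if all(yi > ai for yi, ai in zip(y, a)) else math.inf

def conv(z, n=400):
    # Grid approximation of (l ⊙ h_p)(z) = inf_{y >> 0} l(y) * h_p(z/y)
    best = math.inf
    for i, j in itertools.product(range(1, n), repeat=2):
        y = (z[0] * i / n, z[1] * j / n)
        best = min(best, l(y) * h_p((z[0] / y[0], z[1] / y[1])))
    return best

z = (3.0, 8.0)
exact = alpha * l((z[0] / a[0], z[1] / a[1]))     # Example 4.1: alpha * l(z/a)
approx = conv(z)
assert abs(approx - exact) < 1e-2
print(exact, approx)
```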

Theorem 4.1. Let l and h be decreasing functions defined on R^I_{++} such that
1. 0 < γ := lim_{min_i z_i→+∞} l(z) and M := lim_{y→0} l(y) < +∞;
2. dom h = {y : h(y) < +∞} ≠ ∅ and H := lim_{min_i z_i→+∞} h(z) > 0;
3. lim inf_{min_i z_i→0} h(z) > (M/γ)H.
Then

(4.4)   lim_{z→0} (h ⊙ l)(z) = lim_{z→0} l(z) × lim_{min_i z_i→+∞} h(z) = MH.

Proof. First we show that (h ⊙ l)(z) ≤ MH for all z ≫ 0. Let e = (1, . . . , 1). For the functions u(λ) = l(λe) and v_z(λ) = h(z/λ) with z ≫ 0, we have

u(λ) ≤ sup_{λ'>0} u(λ') = lim_{λ'→0} l(λ'e) = lim_{y→0} l(y) = M,

inf_{λ>0} v_z(λ) = inf_{µ>0} h(µz) = lim_{µ→+∞} h(µz) = lim_{min_i y_i→+∞} h(y) = H.

So

(4.5)   (h ⊙ l)(z) = inf_{y≫0} l(y) h(z/y) ≤ inf_{λ>0} u(λ) v_z(λ) ≤ M inf_{λ>0} v_z(λ) = MH.

Thus

(4.6)   lim_{z→+0} (h ⊙ l)(z) ≤ MH.

We now prove the reverse inequality. It follows from condition 3 that there exist numbers µ > 0 and ε > 0 such that h(u) ≥ (1 + ε)(1/γ)MH whenever min_i u_i ≤ µ. Thus, if min_i (z/y)_i ≤ µ, then

l(y) h(z/y) ≥ γ h(z/y) ≥ (1 + ε)MH.

Thus

inf_{y≫0, min_i(z/y)_i≤µ} l(y) h(z/y) > MH.

Applying (4.5) we can conclude that

MH ≥ (h ⊙ l)(z) = inf_{y≫0} l(y) h(z/y) = min{inf_{y≫0, min_i(z/y)_i≤µ} l(y) h(z/y), inf_{y≫0, z/y≫µe} l(y) h(z/y)} = inf_{y≫0, z/y≫µe} l(y) h(z/y).

Let z ∈ R^I_{++} and z_µ = (1/µ)z. We have

(4.7)   (h ⊙ l)(z) = inf_{y≫0, z/y≫µe} l(y) h(z/y) = inf_{0≪y≪z_µ} l(y) h(z/y).

Since l is decreasing we have l(y) ≥ l(z_µ) for 0 ≪ y ≪ z_µ. So

(h ⊙ l)(z) ≥ inf_{0≪y≪z_µ} l(z_µ) h(z/y) = l(z_µ) inf_{u≫µe} h(u).

It follows from Lemma 4.1 that

inf_{u≥µe} h(u) = lim_{min_i u_i→+∞} h(u) = H,

so (h ⊙ l)(z) ≥ H l(z_µ). Thus

(4.8)   lim_{z→0} (h ⊙ l)(z) ≥ H lim_{z→0} l(z_µ) = H lim_{y→0} l(y) = HM.

It follows from (4.6) and (4.8) that (4.4) holds.

Remark 4.2. Condition 3 of the theorem holds if

lim inf_{min_i z_i→0} h(z) / lim_{min_i z_i→+∞} h(z) > lim_{z→0} l(z) / lim_{min_i z_i→+∞} l(z).

It is clear that this inequality holds if lim_{min_i z_i→0} h(z) = +∞.

5. The modified penalty function. We now study the following constrained optimization problem:

(5.1)   (P):   f_o(x) → inf   subject to x ∈ X, f_1(x) ≤ 0,

where X ⊂ R^n and f_i : X → R, i = 0, 1.

Remark 5.1. The more general problem

(5.2)   f_o(x) → inf   subject to x ∈ X, g_i(x) ≤ 0, i ∈ I,

can be represented in the form of (5.1) with f_1(x) = sup_{i∈I} g_i(x).

We will require the following assumption.

Assumption 5.1. inf_{x∈X} f_o(x) := γ > 0.

Let X_o be the set of all feasible solutions for (P):

(5.3)   X_o = {x ∈ X : f_1(x) ≤ 0}.

The set X_o can also be represented in the following form: X_o = {x ∈ X : f_1^+(x) = 0}, where f_1^+(x) = max{f_1(x), 0}. Thus the problem (P) is equivalent to the following problem:

(5.4)   (P^+):   f_o(x) → inf   subject to f_1^+(x) = 0.

It follows from Assumption 5.1 that f_o^+(x) := max{f_o(x), 0} = f_o(x) for all x ∈ X. Let

F^+(x, d_o, d) = (d_o f_o(x), d f_1^+(x)),   x ∈ X, d_o, d > 0.

We now consider the function p_1 defined on R^2_+ by p_1(y_o, y_1) = y_o + y_1. Clearly p_1 is an IPH function. The function p_1 generates the classical penalty function L^+ for the problem (P):

L^+(x, d_o, d) = d_o f_o(x) + d f_1^+(x) ≡ p_1(F^+(x, d_o, d)),   x ∈ X, d_o, d > 0.

Suppose now that we have an arbitrary continuous IPH function p defined on R^2_+.

Definition 5.1. The function L^+_p,

L^+_p(x, d_o, d) = p(F^+(x, d_o, d)),   x ∈ X, d_o > 0, d > 0,

is called the modified penalty function for the problem (P), corresponding to the function p. Let

(5.5)   q_p(d_o, d) = inf_{x∈X} p(d_o f_o(x), d f_1^+(x)) ≡ inf_{x∈X} L^+_p(x, d_o, d),   d_o > 0, d > 0.

Definition 5.2. The problem

(D_p):   q_p(1, d) → sup   subject to d > 0,

where q_p is defined by (5.5), is called the dual problem to the problem (P), corresponding to the function p.

We will denote by M_P the value of the initial problem (P) and by M_{D_p} the value of the dual problem:

M_P = inf{f_o(x) : x ∈ X_o},   M_{D_p} = sup{q_p(1, d) : d > 0}.

Note that the function d ↦ q_p(1, d) is increasing and M_{D_p} = lim_{d→+∞} q_p(1, d). We can express the function q_p(d_o, d) defined by (5.5) in the following form:

(5.6)   q_p(d_o, d) = inf_{x∈X} L^+_p(x, d_o, d) = min{inf_{x∈X_o} L^+_p(x, d_o, d), inf_{x∉X_o} L^+_p(x, d_o, d)}.
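The dual value M_{D_p} = sup_{d>0} q_p(1, d) of Definition 5.2 can be approximated on a toy instance (our own choice, satisfying Assumption 5.1). With the classical convolution p_1 the computation also illustrates that d ↦ q_p(1, d) is increasing and bounded above by M_P.

```python
# Toy instance (assumed): X = [-2, 2], f0(x) = x^2 + 1, f1(x) = 1 - x (i.e. x >= 1).
# Then M_P = f0(1) = 2, and Assumption 5.1 holds since f0 >= 1 > 0.
X = [-2.0 + 4.0 * i / 4000 for i in range(4001)]
f0 = lambda x: x * x + 1.0
f1 = lambda x: 1.0 - x

p1 = lambda y0, y1: y0 + y1                      # classical convolution

def q(d):
    # q_p(1, d) = inf_{x in X} p(f0(x), d * max(f1(x), 0)), cf. (5.5)
    return min(p1(f0(x), d * max(f1(x), 0.0)) for x in X)

MP = min(f0(x) for x in X if f1(x) <= 0.0)
vals = [q(d) for d in (0.1, 1.0, 10.0, 1000.0)]
assert all(v1 <= v2 + 1e-12 for v1, v2 in zip(vals, vals[1:]))  # d -> q(1, d) is increasing
assert all(v <= MP + 1e-12 for v in vals)                        # weak duality: q(1, d) <= M_P
assert MP - vals[-1] < 1e-2                                      # sup_d q(1, d) approaches M_P = 2
print(MP, vals)
```

For this smooth instance the classical penalty is already exact for moderate d; the paper's point is how the choice of p affects the size of the exact penalty parameter.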

Let

(5.7)   X_1 = {x ∈ X : f_1(x) > 0} = {x ∈ X : x ∉ X_o}

and

(5.8)   r_p(d_o, d) = inf_{x∈X_1} p(d_o f_o(x), d f_1(x)).

Proposition 5.1. If Assumption 5.1 holds, then

q_p(d_o, d) = min{inf_{x∈X_o} L^+_p(x, d_o, d), r_p(d_o, d)}.

Proof. This follows directly from (5.6) and (5.8).

We now study the function r_p defined by (5.8) on R^2_{++}. Since p is an IPH function it follows that r_p is also an IPH function. Recall (see Definition 2.2) that the support set of the function p coincides with the support set of its restriction to the cone R^2_{++}. We will denote this restriction by the same letter p. We now describe the support set supp(r_p) of r_p in terms of the set supp(p).

Proposition 5.2. Let p be an IPH function and let f_o and f_1 be, respectively, the objective and constraint functions of the problem (P). Furthermore, let the set X_1 and the function r_p be defined by (5.7) and (5.8), respectively. Then the following holds:

(5.9)   supp(r_p) = ∩_{x∈X_1} (f_o(x), f_1(x)) · supp(p),

where the product a · U is defined by (2.3).

Proof. We have for d_o, d > 0:

r_p(d_o, d) = inf_{x∈X_1} p(d_o f_o(x), d f_1(x)) = inf_{x∈X_1} q_p^x(d_o, d),

where q_p^x(d_o, d) = p((f_o(x), f_1(x)) · (d_o, d)) = p((f_o(x), f_1^+(x)) · (d_o, d)). It follows from Propositions 2.1 and 2.2 that

supp(r_p) = ∩_{x∈X_1} supp(q_p^x) = ∩_{x∈X_1} (f_o(x), f_1(x)) · supp(p).

6. Perturbation functions. We now study the perturbation function (see, for example, [4, 5, 6, 13, 14] and references therein) for the problem (P) defined by (5.1).

Definition 6.1. The function β defined on R_+ = {y ∈ R : y ≥ 0} by

(6.1)   β(y) = inf{f_o(x) : x ∈ X, f_1(x) ≤ y}

is called the perturbation function of the problem (P).

The value β(0) of the perturbation function at the origin coincides with the value M_P of the problem (P). We also have

inf_{y>0} β(y) = inf_{y>0} inf_{x∈X, f_1(x)≤y} f_o(x) = inf_{x∈X} f_o(x).

Since inf_{x∈X} f_o(x) = γ > 0 (by Assumption 5.1) we have inf_{y>0} β(y) = γ > 0. It follows directly from the definition that the perturbation function is decreasing: y_1 ≥ y_2 ⟹ β(y_1) ≤ β(y_2).

Assumption 6.1. Let X_o and X_1 be the sets defined by (5.3) and (5.7), respectively. There exists a sequence x_k ∈ X_1 such that f_1(x_k) → 0 and f_o(x_k) → M_P, where M_P = inf_{x∈X_o} f_o(x) is the value of the problem (5.1).

If Assumption 6.1 holds, then for each y > 0 we have inf_{x∈X_1, f_1(x)≤y} f_o(x) ≤ M_P, so

(6.2)   β(y) = inf_{x∈X_1, f_1(x)≤y} f_o(x),   y > 0.
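For a concrete toy instance (an assumption of ours), the perturbation function (6.1) can be tabulated directly; the run below checks that β is decreasing and that β(y) → M_P = β(0) as y → +0, i.e., that β is lower semicontinuous at zero for this instance.

```python
# Perturbation function (6.1) for an assumed toy problem:
# X = [-2, 2], f0(x) = x^2 + 1, f1(x) = 1 - x, so beta(y) = inf{f0(x) : f1(x) <= y}.
X = [-2.0 + 4.0 * i / 4000 for i in range(4001)]
f0 = lambda x: x * x + 1.0
f1 = lambda x: 1.0 - x

def beta(y):
    return min(f0(x) for x in X if f1(x) <= y)

MP = beta(0.0)                   # beta(0) coincides with the value of (P)
ys = [1.0, 0.5, 0.1, 0.01, 0.001]
vals = [beta(y) for y in ys]     # ys decreases, so these values should increase
assert all(v1 <= v2 + 1e-12 for v1, v2 in zip(vals, vals[1:]))   # beta is decreasing in y
assert MP - vals[-1] < 1e-2      # beta(y) -> M_P as y -> +0 (lower semicontinuity at 0)
print(MP, vals)
```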

Proposition 6.1. Let p be an IPH function defined on R^2_{++} and let U = supp(p). Let Assumptions 5.1 and 6.1 hold. Then, for the function r_p defined by (5.8), we have

supp(r_p) = ∩_{y>0} (β(y), y) · U.

Proof. Let

A = ∩_{x∈X_1} (f_o(x), f_1(x)) · U   and   B = ∩_{y>0} (β(y), y) · U.

It follows from Proposition 5.2 that supp(r_p) = A. We now check that B ⊂ A. Let x ∈ X_1 and y = f_1(x). Then (see (6.2))

β(y) = inf_{x'∈X_1, f_1(x')≤y} f_o(x') ≤ f_o(x)

and therefore (f_o(x), f_1(x)) ≥ (β(y), y). It follows from Lemma 2.1 that (f_o(x), f_1(x)) · U ⊃ (β(y), y) · U, so

A = ∩_{x∈X_1} (f_o(x), f_1(x)) · U ⊃ ∩_{y=f_1(x), x∈X_1} (β(y), y) · U ⊃ B.

We now prove that A ⊂ B. Let y > 0. It follows from (6.2) that for each sufficiently small ε_0 > 0 there exists a vector x ∈ X_1 such that f_1(x) ≤ y and f_o(x) − ε_0 ≤ β(y). It follows from Lemma 2.1 that (β(y), y) · U ⊃ (f_o(x) − ε_0, f_1(x)) · U. Therefore,

B = ∩_{y>0} (β(y), y) · U ⊃ ∩_{x∈X_1} (f_o(x) − ε_0, f_1(x)) · U.

T Since ε0 > 0 is an arbitrary number it follows that B ⊃ x∈X1 (fo (x), f1 (x)) · U = A. Corollary 6.1. hrp = β  hp . Proof. This follows immediately from Proposition 4.3. Remark 6.1. The perturbation function β does not depend on the IPH function p, and the associated function hp does not depend on the problem (P ) (that is, on the functions fo and f1 ). Let M = limy→+0 β(y) = supy>0 β(y). Since Xo ≡ {x ∈ X : f1 (x) ≤ 0} ⊂ {x ∈ X : f1 (x) ≤ y}, it follows that β(y) ≤ inf x∈Xo fo (x) = MP ; therefore, M ≤ MP = β(0). Hence, the equality (6.3)

MP = lim β(y) y→+0

holds if and only if β is lower semicontinuous at the point zero. Conditions ensuring lower semicontinuity of the perturbation function are well known (see, for example, [12]). A simple sufficient condition has the following form: if fo is continuous and there exists y > 0 such that the set Xy = {x ∈ X : f1 (x) ≤ y} is compact, then β is lower semicontinuous and (6.3) holds. We shall assume below that both Assumptions 5.1 and 6.1 hold. Theorem 6.1. Let hp be the associated function for p and assume lim hp (z) = 1

z→+∞

and

lim hp (z) = +∞.

z→+0

Then (6.4)

sup hrp (z) = sup rp (1, d) = inf fo (x) z>0

d>0

x∈Xo

if and only if the perturbation function β is lower semicontinuous at the point zero. Proof. This follows directly from Corollary 6.1, Theorem 4.1, Remark 4.2, and Proposition 3.3. The conditions in Theorem 6.1 are given in terms of the associated function hp of an IPH function p. Applying Propositions 3.5 and 3.6 we can present conditions guaranteeing the validity of (6.4) in terms of the function p itself. Theorem 6.2. Let p be a continuous IPH function defined on the cone R2+ with p(1, 0) = 1 and limu→+∞ p(1, u) = +∞. Then (6.4) holds if and only if the function β is lower semicontinuous at the point zero.

DECREASING FUNCTIONS AND PENALIZATION


Let us now consider the dual problem (see Definition 5.2) to the problem (P), where MDp is the value of the dual problem.

Lemma 6.1. Let p be an IPH function defined on R^2_+ with p(1, 0) = 1. Then MDp ≤ MP and, for all x ∈ Xo and d > 0,

L+p(x, 1, d) ≡ p(fo(x), d f1+(x)) = fo(x).

Proof. Let d > 0 and x ∈ Xo. Since f1+(x) = 0 and p is positively homogeneous with p(1, 0) = 1, we have fo(x) = p(fo(x), 0) = p(fo(x), d f1+(x)) = L+p(x, 1, d). Also

L+p(x, 1, d) ≥ inf_{x'∈X} p(fo(x'), d f1+(x')) = qp(1, d).

Thus MP = inf_{x∈Xo} fo(x) ≥ sup_{d>0} qp(1, d) = MDp.

Lemma 6.2. Let p be a continuous IPH function defined on the cone R^2_+ with p(1, 0) = 1 and lim_{u→+∞} p(1, u) = +∞. Then MDp = M, where M = lim_{y→+0} β(y).

Proof. By Proposition 5.1 and Lemma 6.1 we have

(6.5)    qp(1, d) = min{ inf_{x∈Xo} L+p(x, 1, d), rp(1, d) } = min{MP, rp(1, d)}.

Propositions 3.5 and 3.6 show that

(6.6)    lim_{z→+∞} hp(z) = 1    and    lim_{z→+0} hp(z) = +∞.

Applying Corollary 6.1 and Theorem 4.1 we can conclude that

lim_{d→+0} hrp(d) = lim_{y→+0} β(y) = M.

It follows from Proposition 3.3 that

(6.7)    sup_{d>0} rp(1, d) = lim_{d→+0} hrp(d) = M.

Since MP ≥ M we have, by applying (6.5) and (6.7), that

MDp = lim_{d→+∞} qp(1, d) = lim_{d→+∞} min{MP, rp(1, d)} = min{MP, M} = M.

Remark 6.2. Let lim_{y→+0} β(y) = MP. Then qp(1, d) = rp(1, d) for all d > 0. Indeed, since the function d ↦ rp(1, d) is increasing, it follows from (6.7) that rp(1, d) ≤ MP for all d > 0. Applying (6.5) we can conclude that qp(1, d) = rp(1, d) for all d > 0.

Theorem 6.3. Let p be a continuous IPH function defined on the cone R^2_+ with p(1, 0) = 1 and lim_{u→+∞} p(1, u) = +∞. Then MDp = MP if and only if the perturbation function β is lower semicontinuous at the point zero.

Proof. This follows directly from Lemma 6.2.
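Lemma 6.2 and Theorem 6.3 can be observed numerically on a toy instance (our own illustration, not from the paper), with p = p1 the classical convolution p1(u, v) = u + v: qp(1, d) increases with d, and, since the perturbation function of this instance is continuous at zero, sup_{d>0} qp(1, d) = MP.

```python
import numpy as np

# Illustrative instance (not from the paper): fo(x) = (x - 3)^2, f1(x) = x,
# X = [-5, 1], so MP = 9.  Classical convolution p1(u, v) = u + v.
X = np.linspace(-5.0, 1.0, 6001)
fo = (X - 3.0) ** 2
f1_plus = np.maximum(X, 0.0)   # f1(x)+ with f1(x) = x

def q1(d):
    """q_{p1}(1, d) = inf over X of fo(x) + d * f1(x)+."""
    return (fo + d * f1_plus).min()

ds = [1.0, 2.0, 4.0, 8.0, 100.0]
vals = [q1(d) for d in ds]
```

Here exactness occurs for d ≥ 6; for smaller d the infimum is attained at an infeasible point and qp(1, d) < MP = 9.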


7. Exact penalty functions. Consider the optimization problem (P) defined by (5.1). In this section we will discuss the existence of an (exact) penalty parameter, that is, a number d > 0 such that MDp = qp(1, d), for a given IPH function p. If Assumptions 5.1 and 6.1 hold and the perturbation function β is lower semicontinuous at the origin, then (see Remark 6.2) qp(1, d) = rp(1, d), where rp is defined by (5.8). It has been shown (see Corollary 6.1) that the function hrp associated with rp can be presented as the multiplicative inf-convolution of the perturbation function β and the function hp: hrp = β ⊗ hp. We will use this formula in the study of the problem under consideration. We need the following assumption.

Assumption 7.1. The perturbation function β(y) is lower semicontinuous at the origin.

If Assumption 7.1 holds, then (see Theorem 6.3) MP = MDp for each continuous IPH function p defined on R^2_+ with the properties p(1, 0) = 1 and lim_{y→+∞} p(1, y) = +∞.

Proposition 7.1. Let Assumptions 5.1, 6.1, and 7.1 hold. Let p be a continuous IPH function defined on R^2_+ with p(1, 0) = 1 and lim_{y→+∞} p(1, y) = +∞. Then rp(1, d̄) = MP if and only if the associated function hrp is constant on the segment [0, MP/d̄]:

hrp(y) = MP,    0 ≤ y ≤ MP/d̄.

Proof. Assume that there exists d̄ such that rp(1, d̄) = MP. Since the function d ↦ rp(1, d) is increasing and sup_{d>0} rp(1, d) = MP, it follows that rp(1, d) = MP for all d ≥ d̄. Thus, for d/d' ≥ d̄, we have

rp(d', d) = d' rp(1, d/d') = d' MP.

For the support set srp of the function rp the following is valid (see Theorem 2.3):

srp = { l = (l0, l1) : rp(1/l0, 1/l1) ≥ 1 }.

Consider the point (l0, l1) = (MP, MP/d̄). We have (1/l1)(1/l0)^{-1} = d̄, and thus

rp(1/l0, 1/l1) = (1/l0) rp(1, d̄) = (1/l0) MP = 1.

Hence (MP, MP/d̄) ∈ srp. The set srp is normal, so {(l0, l1) : l0 ≤ MP, l1 ≤ MP/d̄} ⊂ srp. By the definition of the associated function, we have

hrp(y) = sup{α : (α, y) ∈ srp}.

So if 0 ≤ y ≤ MP/d̄, then hrp(y) ≥ MP. On the other hand, hrp(y) is a decreasing function with hrp(0) = MP. Thus hrp(y) = MP if y ≤ MP/d̄.

Assume now that hrp(y) = MP for 0 ≤ y ≤ ȳ = MP/d̄. Since

hrp(ȳ) = sup{α : (α, ȳ) ∈ srp},

srp is closed, and hrp(ȳ) < +∞, we can deduce that (MP, ȳ) ∈ srp. Thus

rp(1/MP, d̄/MP) = (1/MP) rp(1, d̄) ≥ 1;


that is, rp(1, d̄) ≥ MP. On the other hand, rp(1, d) ≤ MP for all d. Thus rp(1, d̄) = MP.

Example 7.1. Let p(α, y) = max{α, y} and let β be the perturbation function of the problem (P). Assume that β is continuous. Then (see Example 4.1) hrp(z) = (β ⊗ hp)(z) = β(z). We have, by applying Theorem 2.1 and the definition of the associated function,

rp(1, d) = max{⟨l, (1, d)⟩ : l ∈ supp(rp)} = max_{z>0} min(hrp(z), zd) = max_{z>0} min(β(z), zd).

Assume that the perturbation function β is strictly decreasing for sufficiently small y, that is, that inf{fo(x) : f1(x) ≤ y1} > inf{fo(x) : f1(x) ≤ y2} whenever y1 < y2. Then MP = lim_{y→0} β(y) > β(z) for all z > 0. Since max_{z>0} min(β(z), zd) = β(zd), where zd is a solution of the equation β(z) = zd, we have

qp(1, d) = rp(1, d) = max_{z>0} min(β(z), zd) = β(zd) < MP.

Thus there is no d > 0 such that rp(1, d) = MP.

We now consider the convolution function pk (k > 0) defined by

(7.1)    pk(δ, y) = (δ^k + y^k)^{1/k}    (δ > 0, y > 0).
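The influence of the exponent k can be probed numerically before developing the theory. In the sketch below (our own illustration, not from the paper) the instance fo(x) = (x − 3)^2, f1(x) = x, X = [−5, 1] (so MP = 9) is penalized through pk for k = 1 and k = 1/2; the smallest exact parameter turns out to be 6 for k = 1 but only 1 for k = 1/2.

```python
import numpy as np

# Illustrative instance (not from the paper): fo(x) = (x - 3)^2, f1(x) = x,
# X = [-5, 1], so MP = 9.
X = np.linspace(-5.0, 1.0, 6001)
fo = (X - 3.0) ** 2
f1_plus = np.maximum(X, 0.0)   # f1(x)+ with f1(x) = x

def pk(u, v, k):
    """Convolution function p_k(u, v) = (u^k + v^k)^(1/k) from (7.1)."""
    return (u ** k + v ** k) ** (1.0 / k)

def q(d, k):
    """inf over X of the modified penalty function p_k(fo(x), d * f1(x)+)."""
    return pk(fo, d * f1_plus, k).min()
```

This matches the values d̄1 = 2(a − b) and d̄_{1/2} = c − b discussed below for problems of this form (here a = 3, b = 0, c = 1).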

For the sake of simplicity we shall denote the function rpk by r[k] and its associated function hr[k] by h[k]. The following assertion will be useful in the study of both conditions for the exact penalization and estimations of penalty parameters.

Lemma 7.1. Let Assumptions 5.1, 6.1, and 7.1 hold. Let k > 0 and let pk = p be the function defined by (7.1). Let h[k] be the associated function to rpk. Then MP = h[k](z) if and only if

(7.2)    MP ≤ β(y) · z / (z^k − y^k)^{1/k}    for 0 < y < z.

Proof. It easily follows from Example 3.2 that

hp(y) = y / (y^k − 1)^{1/k}  for y > 1,    hp(y) = +∞  for y ≤ 1.

Since lim_{y→+∞} hp(y) = 1 and lim_{y→+0} hp(y) = +∞, we can apply Theorem 6.1, which shows that

(7.3)    MP = MDp = sup_{z>0} h[k](z) = lim_{z→0} h[k](z).

By Corollary 6.1 we have

(7.4)    h[k](z) = inf_{y>0} β(y) hp(z/y) = inf_{0<y<z} β(y) hp(z/y) = inf_{0<y<z} β(y) · z / (z^k − y^k)^{1/k}.

Since h[k](z) ≤ MP for all z > 0 (see (7.3)), the equality MP = h[k](z) holds if and only if every term under the infimum in (7.4) is at least MP, that is, if and only if (7.2) holds.

Theorem 7.1. Let Assumptions 5.1, 6.1, and 7.1 hold and let k > 0. Then there exists a number d̄ > 0 such that MP = r[k](1, d̄) if and only if

(7.6)    lim inf_{y→+0} (β(y) − MP) / y^k > −∞.

The family of problems (Py),

(Py)    fo(x) → inf    subject to x ∈ X, f1(x) ≤ y    (y > 0),

is said to be calm at the point zero if (7.6) with k = 1 holds. Using Theorem 7.1 it is possible to derive the well-known result (see, for example, [4, 6]) that there exists a number d̃ > 0 such that MP = qp1(1, d̃) if and only if the family (Py) is calm.

For many problems the exact penalization with respect to the classical penalty function p1 can be accomplished only with very large penalty parameters, which leads to ill-conditioned unconstrained optimization problems. Applying penalization with respect to other convolution functions p we can sometimes decrease the penalty parameters providing exact penalization. We shall next study this question for the convolution function p_{1/2}. First we consider an arbitrary k > 0 and describe the least penalty parameter, that is, the least number for which the equality MP = r[k](1, d) holds.

Assumption 7.2. The perturbation function β(y) is continuous on [0, +∞) and β(y) < MP for y > 0.

Let Assumption 7.2 hold, let k > 0, and set

(7.9)    vk(y) = y / (1 − (β(y) MP^{-1})^k)^{1/k}    (y > 0).

Let

(7.10)    ϕk(z) = inf_{0<y<z} vk(y)    (z > 0).

Note that vk(y) > 0 for all y > 0. Clearly ϕk is a decreasing function.

Lemma 7.2. Let k > 0 and let (7.6) be valid. If Assumptions 5.1, 6.1, and 7.2 hold, then the least exact penalty parameter d̄k of the problem (P) with respect to the function pk is equal to MP/z̄k, where z̄k is the solution of the equation z = ϕk(z).

Proof. It follows from Proposition 7.1 that r[k](1, d) = MP if and only if h[k](z) = MP with z = MP/d. Lemma 7.1 demonstrates that the equality MP = h[k](z) is valid if and only if (7.2) holds; that is,

MP ≤ β(y) · z / (z^k − y^k)^{1/k}    for 0 < y < z.

It is easy to check that this inequality is equivalent to the following:

(7.11)    z ≤ vk(y)    for 0 < y < z,

where vk is defined by (7.9). Thus MP = h[k](z) if and only if z ≤ ϕk(z). Since (7.6) holds, it follows from Theorem 7.1 that there exists z > 0 such that MP = h[k](z); that is, z ≤ ϕk(z). Thus r[k](1, d) = MP if and only if z = MP/d satisfies z ≤ ϕk(z), so the least exact penalty parameter d̄k is equal to MP divided by the greatest element z̄k of the set {z : z ≤ ϕk(z)}. Since ϕk is a continuous decreasing function, we can deduce that the equation z = ϕk(z) has a unique solution; clearly this solution coincides with z̄k.

Let β(y) < MP for y > 0. Set

(7.12)    u(y) = (1 + (β(y) MP^{-1})^{1/2}) / (1 − (β(y) MP^{-1})^{1/2})    (y > 0).

Since β is a decreasing function, it follows that u is a decreasing function as well.

Lemma 7.3. Let β(y) < MP for y > 0, and let vk, ϕk, and u be the functions defined by (7.9), (7.10), and (7.12), respectively. Then

ϕ_{1/2}(z) ≥ ϕ1(z) u(z)    (z > 0).

Proof. We have

v1(y) = y / (1 − β(y) MP^{-1}),    v_{1/2}(y) = y / (1 − (β(y) MP^{-1})^{1/2})^2.

Hence

v_{1/2}(y) = u(y) v1(y)    (y > 0).

Let z > 0 and 0 < y < z. Since ϕ1(z) ≤ v1(y) and u is a decreasing function, it follows that

ϕ_{1/2}(z) = inf_{0<y<z} v_{1/2}(y) ≥ ϕ1(z) inf_{0<y<z} u(y) = ϕ1(z) u(z).

Theorem 7.2. Let Assumptions 5.1, 6.1, and 7.2 hold, and let (7.6) with k = 1 be valid. Then

(7.16)    d̄_{1/2} ≤ ((1 − (γ MP^{-1})^{1/2}) / (1 + (γ MP^{-1})^{1/2})) · d̄1.

Example 7.2. Let b < c < a, let X = {x ∈ R : x ≤ c}, and consider the problem

(7.17)    fo(x) = (x − a)^2 → min    subject to f1(x) = x − b ≤ 0,  x ∈ X,

so that MP = (a − b)^2 and γ = (c − a)^2. It is easy to check that sup_{d>0} r[1/2](1, d) = (a − b)^2 is attained at the point d̄_{1/2} = c − b. Note that d̄_{1/2} does not depend on a.
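The fixed-point characterization of Lemma 7.2 can be checked numerically on an instance of this family. The sketch below is our own illustration: the grid-based infimum is a crude stand-in for ϕk, and the scaling d̄k = MP/z̄k is the one consistent with the segment [0, MP/d̄] in Proposition 7.1. It recovers d̄1 = 2(a − b) and d̄_{1/2} = c − b for a = 3, b = 0, c = 1, and also evaluates the arithmetic behind (7.18).

```python
import numpy as np

# Instance of the form in Example 7.2: fo(x) = (x - a)^2, f1(x) = x - b,
# X = (-inf, c], with b < c < a.  Then MP = (a - b)^2 and the perturbation
# function is beta(y) = (a - b - min(y, c - b))^2.
a, b, c = 3.0, 0.0, 1.0
MP = (a - b) ** 2

def beta(y):
    return (a - b - min(y, c - b)) ** 2

def v(y, k):
    """v_k(y) as in (7.9)."""
    return y / (1.0 - (beta(y) / MP) ** k) ** (1.0 / k)

def phi(z, k):
    """phi_k(z) as in (7.10), approximated on a grid over (0, z)."""
    ys = np.linspace(z * 1e-4, z * (1.0 - 1e-4), 2000)
    return min(v(y, k) for y in ys)

def dbar(k):
    """Least exact penalty parameter MP / zbar_k, zbar_k solving z = phi_k(z)."""
    lo, hi = 1e-6, 100.0
    for _ in range(60):           # bisection: z - phi_k(z) is increasing in z
        mid = 0.5 * (lo + hi)
        if mid <= phi(mid, k):
            lo = mid
        else:
            hi = mid
    return MP / lo

# right-hand side of (7.18); its gap to c - b is (c - b)^2 / (2a - b - c)
ratio = (a - c) / (a - b)
rhs = (1.0 - ratio) / (1.0 + ratio) * 2.0 * (a - b)
```

The bisection uses the fact that z ↦ z − ϕk(z) is increasing because ϕk is decreasing; the value rhs reproduces the right-hand side of (7.18), whose gap to c − b equals (c − b)^2 (2a − c − b)^{-1}.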


Consider now the classical penalty function with k = 1. It is easy to check that sup_{d>0} r[1](1, d) = (a − b)^2 is attained at the point d̄1 = 2(a − b). Thus d̄1 → +∞ as a → +∞.

Proposition 7.2. The estimation (7.16) is asymptotically sharp in the following sense: for each ε > 0 there exists a problem (P) such that the difference between the expressions in the right-hand side and the left-hand side of (7.16) is less than ε.

Proof. Consider the problem (7.17) from Example 7.2. We have

MP = (a − b)^2,    γ = (c − a)^2,    d̄1 = 2(a − b),    d̄_{1/2} = c − b.

Hence the inequality (7.16) can be presented in the following form:

(7.18)    c − b ≤ ((1 − (a−c)/(a−b)) / (1 + (a−c)/(a−b))) · 2(a − b).

Note that the difference between the expressions in the right-hand side and the left-hand side of (7.18) is equal to (c − b)^2 (2a − c − b)^{-1}, so this difference tends to zero as c − b → 0.

Remark 7.2. Consider the following problem (Pc):

fo(x) + c → min    subject to f1(x) ≤ 0,

which is equivalent to the problem (P). Clearly, both problems (P) and (Pc) have the same exact penalty parameter d̄1 with respect to the classical penalty function p1. At the same time, they have different exact penalty parameters d̄_{1/2} with respect to the convolution function p_{1/2}. Let d̄_{1/2}(c) be the least penalty parameter with respect to p_{1/2} for the problem (Pc). It follows from (7.16) that d̄_{1/2}(c) tends to zero as c → +∞. Note that the corresponding unconstrained optimization problem can become ill-conditioned for very large c.

Remark 7.3. Consider the classical convolution function p1 and the corresponding penalty parameter d̄1. It is well known (see [3]) that d̄1 can be estimated from below by the optimal Lagrange multiplier λ of the problem (P). Clearly, the optimal Lagrange multiplier of the problem (Pc) (see Remark 7.2) coincides with λ. It follows from Remark 7.2 that an estimation of the exact penalty parameter d̄_{1/2} by the Lagrange multiplier λ is impossible.

We conclude the paper by showing that it is possible to find an IPH function p and a number d such that for the modified penalty function L+p generated by p we have MP = inf_{x∈X} L+p(x, 1, d) ≡ qp(1, d).

Theorem 7.3. Let Assumptions 5.1, 6.1, and 7.1 hold. Then for the problem (P) defined by (5.1) there exist an IPH function p and a number d > 0 such that MP = qp(1, d), where qp is defined by (5.5).

Proof. Let β be the perturbation function of the problem (P). Consider the lower semicontinuous hull β̄ of the function β:

β̄(y) = min{ β(y), lim inf_{y'→y, y'≠y} β(y') }.


Clearly β̄ is decreasing and lower semicontinuous. Since β is lower semicontinuous at the origin, we can conclude that β̄(0) = β(0) = MP. Let

g(y) = +∞  if 0 < y ≤ 1,    g(y) = MP / β̄(y^{-1})  if y > 1.

Clearly g is a decreasing function. Since β̄ is lower semicontinuous, it follows that g is upper semicontinuous. We also have

lim_{y→+∞} g(y) = lim_{u→+0} MP/β̄(u) = 1.

Since g is upper semicontinuous we can conclude (see Proposition 3.2) that there exists a normal closed set U ⊂ R^2_{++} such that g = gU. Consider an IPH function p̄ defined on R^2_{++} by

p̄(y) = sup{ min(l1 y1, l2 y2) : l ∈ U }.

Since the function y ↦ p̄(1, y) is increasing, it follows that a = lim_{y→+0} p̄(1, y) < +∞. Let the function p be defined on R^2_+ by

p(y1, y2) = p̄(y1, y2)  if y1, y2 > 0;    p(y1, y2) = 0  if y1 = 0;    p(y1, y2) = a y1  if y2 = 0.

It is easy to check that p is a continuous IPH function and hp(y) = g(y) for y > 0. Since lim_{y→+∞} hp(y) = lim_{y→+∞} g(y) = 1, it follows from Proposition 3.5 that a = p(1, 0) = 1. The equality hp(y) = +∞ for y ≤ 1 shows that lim_{u→+∞} p(1, u) = +∞.

It follows from Theorem 6.2 that MP = sup{hrp(z) : z > 0}, so hrp(z) ≤ MP for all z > 0. On the other hand, we have

hrp(z) = inf_{0<y<z} β(y) hp(z/y) = inf_{0<y<z} β(y) g(z/y) = inf_{0<y<z} β(y) MP/β̄(y/z).

Let z = 1. If the function β is continuous at a point y, then β̄(y) = β(y); otherwise β(y) ≥ β̄(y). So

hrp(1) = inf_{y<1} (β(y)/β̄(y)) MP = MP.

Thus hrp attains its maximal value MP, and it follows from Proposition 7.1 that there exists d̄ > 0 such that rp(1, d̄) = MP. Since qp(1, d̄) = rp(1, d̄) (see Remark 6.2), we conclude that MP = qp(1, d̄).

Acknowledgments. The authors are very grateful to two anonymous referees for constructive comments on an earlier version of this paper. These comments have enabled the authors to significantly improve the last section of the paper.

REFERENCES

[1] M. Yu. Andramonov, An Approach to Constructing Generalized Penalty Functions, Research Report 21/97, SITMS, University of Ballarat, Ballarat, VIC, Australia, 1997.
[2] A. Auslender, R. Cominetti, and M. Haddou, Asymptotic analysis for penalty and barrier methods in convex and linear programming, Math. Oper. Res., 22 (1997), pp. 43–62.


[3] D. P. Bertsekas, Constrained Optimization and Lagrange Multiplier Methods, Academic Press, New York, 1982.
[4] J. V. Burke, Calmness and exact penalization, SIAM J. Control Optim., 29 (1991), pp. 493–497.
[5] J. V. Burke, An exact penalization viewpoint of constrained optimization, SIAM J. Control Optim., 29 (1991), pp. 968–998.
[6] F. H. Clarke, Optimization and Nonsmooth Analysis, Wiley-Interscience, New York, 1983; reprinted as Classics Appl. Math. 5, SIAM, Philadelphia, PA, 1990.
[7] C. Chen and O. L. Mangasarian, Smoothing methods for convex inequalities and linear complementarity problems, Math. Programming, 71 (1995), pp. 51–69.
[8] C. J. Goh and X. Q. Yang, A sufficient and necessary condition for nonconvex constrained optimization, Appl. Math. Lett., 10 (1997), pp. 9–12.
[9] S. S. Kutateladze and A. M. Rubinov, Minkowski Duality and Its Applications, Nauka, Novosibirsk, Russia, 1976 (in Russian).
[10] L. S. Lasdon, Optimization Theory for Large Systems, Macmillan, London, 1970.
[11] M. Minoux, Programmation mathématique: théorie et algorithmes, Dunod, Paris, 1989.
[12] D. Pallaschke and S. Rolewicz, Foundations of Mathematical Optimization, Kluwer Academic, Norwell, MA, 1997.
[13] R. T. Rockafellar, Conjugate Duality and Optimization, CBMS-NSF Regional Conf. Ser. in Appl. Math. 16, SIAM, Philadelphia, 1974.
[14] R. T. Rockafellar, Lagrange multipliers and optimality, SIAM Rev., 35 (1993), pp. 183–238.
[15] A. M. Rubinov, Some properties of increasing convex-along-rays functions, Proc. Centre Math. Appl. Austral. Nat. Univ., 36 (1999), pp. 153–167.
[16] A. M. Rubinov and B. M. Glover, Duality for increasing positively homogeneous functions and normal sets, Rech. Opér., 32 (1998), pp. 105–123.
[17] A. M. Rubinov, B. M. Glover, and X. Q. Yang, Extended Lagrange and penalty functions in continuous optimization, Optimization, to appear.
[18] I. Singer, Abstract Convex Analysis, Wiley-Interscience, New York, 1997.
[19] Yu. Yevtushenko and V. Zhadan, Exact auxiliary functions in optimization problems, U.S.S.R. Comput. Math. Math. Phys., 30 (1990), pp. 31–42.