

An interior point constraint generation algorithm for semi-infinite optimization with healthcare application*

Mohammad R. Oskoorouchi† College of Business Administration, California State University San Marcos, San Marcos, California 92096, USA, Email: [email protected]

Hamid R. Ghaffari Department of Mechanical and Industrial Engineering, University of Toronto, Toronto ON, M5S 3G8, Canada, Email: [email protected]

Tamás Terlaky‡ Department of Industrial and Systems Engineering, Lehigh University, 200 West Packer Avenue, Bethlehem, PA 18015-1582, Email: [email protected]

Dionne M. Aleman Department of Mechanical and Industrial Engineering, University of Toronto, Toronto ON, M5S 3G8, Canada, Email: [email protected]

We propose an interior point constraint generation (IPCG) algorithm for semi-infinite linear optimization (SILO) and prove that the algorithm converges to an ε-solution of SILO after a finite number of constraints is generated. We derive a complexity bound on the number of Newton steps needed to approach the updated µ-center after adding multiple violated constraints, and a complexity bound on the total number of constraints that is required for the overall algorithm to converge. We implement our algorithm to solve the sector duration optimization model arising in Leksell Gamma Knife® Perfexion™ (Elekta, Stockholm, Sweden) treatment planning, a highly specialized treatment for brain tumors. Using real patient data provided by the Department of Radiation Oncology at Princess Margaret Hospital in Toronto, Ontario, Canada, we show that our algorithm can efficiently handle problems in real-life healthcare applications. Comparing our results with those of a projected gradient method, we show that our IPCG algorithm obtains a more accurate solution and is, on average, 7 times faster. We also compare our numerical results with SeDuMi, and show that our IPCG method outperforms classical primal-dual interior point methods on the sector duration optimization problem and on a class of randomly generated large-scale second-order cone optimization (SOCO) problems. We illustrate the convergence behavior of our algorithm using classical SILO problems.

Key words: Semi-infinite linear optimization, second-order cone optimization, sector duration optimization.


1. Introduction

In this paper we present an interior point constraint generation (IPCG) algorithm for problems with a linear objective function and exponentially large, or infinitely many, constraints. This class of problems essentially contains semi-infinite linear optimization (SILO) and convex optimization (CO); in particular, semidefinite optimization (SDO) and second-order cone optimization (SOCO). Many efficient software packages exist for convex conic optimization, some based on polynomial interior point methods (such as SDPT3 (Toh et al. 1999), SeDuMi (Sturm 1999, SeDuMi 2003), and CSDP (Borchers 1999)) and some based on low-rank factorization (such as SDPLR (Burer and Monteiro 2003)); nevertheless, we keep this class of problems within our domain as we develop this algorithm. While today's software packages perform extremely well on small to moderate size convex conic problems, they cannot efficiently handle large-scale problems arising in various real-life applications. For example, a problem with a few thousand conic constraints of large size, say $10^4$, especially when dense, would be challenging for classical primal-dual interior point methods, and thus would require significant computation time to solve even with today's state-of-the-art software packages.

We present our algorithm in the context of SILO and prove its theoretical convergence and complexity in this general setting. We then implement the algorithm on SOCO models arising in Leksell Gamma Knife® Perfexion™ treatment planning using real patient data and show that it can efficiently handle real-life applications. Further, using randomly generated instances, we illustrate that our algorithm outperforms SeDuMi on a class of large-scale SOCO problems.

* This research was partially conducted when the first author was visiting McMaster University.

† The research of this author has been supported by a research grant from the College of Business Administration, California State University San Marcos, and the University Professional Development Grant.

‡ The research of this author has been supported by NSERC, the Canada Research Chair program, and MITACS, and a start-up fund of Lehigh University.

Consider the following SILO problem:

$$\max \{ b^T y : a_\omega^T y \le c_\omega,\ \omega \in \Omega \}, \tag{1}$$

where Ω is a compact set, $b \in \mathbb{R}^m$, $a_\omega \in \mathbb{R}^m$, and $c_\omega \in \mathbb{R}$, for ω ∈ Ω. This problem has been well studied in the literature and has numerous applications in engineering, healthcare and management. See Goberna and Lopez (2002) for a theoretical survey and Lopez and Still (2007) for a more recent survey on this topic. We propose a constraint generation build-up technique to solve problem (1), and show that a finite number of iterations is required to obtain an ε-optimal solution. The principle behind constraint generation is that if a model is too large, or a function is too complex, it is better not to know it entirely, but to discover it as needed by intelligent queries (questions). Let

$$F_\Omega = \{ y \in \mathbb{R}^m : a_\omega^T y \le c_\omega,\ \omega \in \Omega \}$$

be the feasible region of problem (1), assumed to be compact. Consider a discretization of problem (1), where the feasible region is an outer approximation of $F_\Omega$ and is defined by a finite number of constraints:

$$\max \{ b^T y : A^T y \le c \}, \tag{2}$$

where $A \in \mathbb{R}^{m \times n}$ is a full row rank matrix composed of column vectors $a_i$ and $c \in \mathbb{R}^n$ is composed of scalars $c_i$. Problem (2) is a relaxation of the original problem. Let us call it the dual problem. The corresponding primal problem reads

$$\min \{ c^T x : Ax = b,\ x \ge 0 \}. \tag{3}$$
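To make the notion of an "intelligent query" concrete, the following Python sketch shows one possible separation oracle for a toy SILO instance. Everything specific here is an illustrative assumption rather than part of the paper: the instance (constraints $y_1\cos\omega + y_2\sin\omega \le 1$ for $\omega \in [0, 2\pi]$, i.e. the unit disc), the finite search grid over Ω, and the function name `violated_constraints`.

```python
import math

def violated_constraints(y, grid_size=360, max_cuts=3, tol=1e-9):
    """Return up to max_cuts of the most violated constraints at the point y.

    Hypothetical instance: a_omega = (cos w, sin w), c_omega = 1, w in [0, 2*pi].
    Each returned item is (a_omega, c_omega, violation).
    """
    cuts = []
    for i in range(grid_size):
        omega = 2.0 * math.pi * i / grid_size
        a = (math.cos(omega), math.sin(omega))
        c = 1.0
        violation = a[0] * y[0] + a[1] * y[1] - c   # positive means a^T y > c
        if violation > tol:
            cuts.append((violation, a, c))
    # Deepest cuts first; only these few constraints join the discretization.
    cuts.sort(key=lambda item: item[0], reverse=True)
    return [(a, c, v) for v, a, c in cuts[:max_cuts]]

inside = violated_constraints((0.0, 0.0))    # feasible point: no cuts
outside = violated_constraints((2.0, 0.0))   # infeasible point: a few deep cuts
```

Only the returned constraints would be appended as new columns of $A$ in problem (2); the remaining constraints of problem (1) are never formed explicitly.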

The main idea of our algorithm is as follows: We start from problem (2) with only an artificial box constraint whose bounds (right-hand sides) are dynamically updated. Using a point in the vicinity of the central path of problem (2), multiple violated (deep) constraints from $F_\Omega$ are identified. The feasible region of the dual problem is updated by adding the violated constraints and the barrier function is simultaneously updated by reducing the barrier parameter. This is equivalent to adding columns to primal problem (3). Then the strict feasibility for the new feasible region is recovered and the central path is updated. This process continues until the barrier parameter is small enough, i.e., the duality gap approaches zero.

There are many algorithms in the literature based on cutting plane methods for SILO. See, for instance, Ferris and Philpott (1989), Wu et al. (1998), Li et al. (2006), and Luo et al. (1999). The method that we describe in this paper is a variant of Luo et al. (1999) with major differences both from the theoretical and implementation viewpoints. There are three main theoretical enhancements. First, our algorithm adds violated constraints with no changes to the right-hand side. In Luo et al. (1999), when a violated constraint is identified, it is relaxed by changing its right-hand side to make the current µ-center strictly feasible, which results in loss of information. We keep the violated constraints as deep as they are. Second, we extend the analysis to the case where multiple violated constraints are added simultaneously instead of adding one constraint at a time. Finally, at each iteration we update the barrier parameter together with the feasible region in the same step. All of these modifications contribute to the efficiency of the method as documented in the implementation section of this paper. We implement our algorithm to solve SOCO problems and show that our method outperforms primal-dual interior point methods on a class of large-scale problems.

We also derive two theoretical complexity results. After adding p violated constraints and simultaneously updating the centering parameter µ by $\mu^+ = (1 - \eta)\mu$, where $\eta = \frac{1}{9\sqrt{2m}}$, we show that only $O(p \log(p + 1))$ Newton steps are required to obtain a point in the vicinity of the new $\mu^+$-center. We also show that our IPCG algorithm stops with an ε-solution to the SILO problem after adding at most

$$O\left( \frac{m^2 \hat{p}^2}{\delta^2}\, e^{3\sqrt{m}/\varepsilon} \right)$$

constraints, where δ is the radius of the largest full dimensional ball contained in $F_\Omega$, and $\hat{p}$ is the maximum number of constraints added simultaneously.



We implement our algorithm to solve an optimization model arising in healthcare applications. Specifically, we solve a set of sector duration optimization models which are a component of the treatment planning process for Leksell Gamma Knife® Perfexion™, a type of external beam radiotherapy used to treat brain tumors and lesions. Perfexion™ provides several technological advancements to the previous generation of Gamma Knife® units, which greatly increase its treatment accuracy. At the same time, these advancements result in much larger and more difficult treatment planning problems.

Most mathematical models available in the literature are specific to older Gamma Knife® machines. Thus, the new features of the Perfexion™ units cannot be exploited in these models. See Lim and Lee (2008) for a review of these techniques. Very recently, Ghaffari et al. (2009) developed a SOCO formulation of the sector duration optimization model for Perfexion™. Their formulation is based on the intensity modulated radiation therapy (IMRT) treatment planning approach proposed by Aleman et al. (2008). We implement our algorithm to solve the SOCO model developed by Ghaffari et al. (2009), and use real patient data. Using 100 instances, we illustrate that on average our algorithm is over 7 times faster than a projected gradient algorithm on the original formulation of the sector duration optimization model. We also compare the computational results of our algorithm with those of SeDuMi on the SOCO model of the sector duration optimization problem. Unfortunately, SeDuMi runs into numerical problems after a few iterations and stops with an error message. We suspect this failure is due to the nature of the application, which is numerically hard. However, we show that our IPCG algorithm outperforms the classical primal-dual interior point methods on a certain class of large-scale SOCO problems using randomly generated data.

Since the algorithm is presented in the SILO setting, we also test our algorithm on classical SILO problems selected from the literature. These tests illustrate the convergence behavior of our algorithm in terms of the number of iterations that it takes for the upper and lower bounds to approach the optimal value with high precision.



The paper is organized as follows: Section 2 presents preliminaries and some technical lemmas that are needed throughout this paper. In Section 3, we describe our IPCG algorithm in detail. The complexity of recovering the µ-center and the complexity and convergence of the algorithm are given in Sections 4 and 5, respectively. In Section 6, we present our computational experience with the sector duration optimization model for Perfexion™ treatment planning. In Section 7, we compare our numerical results with those of SeDuMi on randomly generated problems, and illustrate the performance of our algorithm on a set of classical SILO problems.

2. Preliminaries

We denote the primal, dual and primal-dual feasible regions of the discretization problem by $F_p$, $F_d$, and $F$, respectively:

$$F_p = \{ x \in \mathbb{R}^n : Ax = b,\ x \ge 0 \}, \quad F_d = \{ s \in \mathbb{R}^n : A^T y + s = c,\ s \ge 0 \}, \quad F = F_p \times F_d.$$

Let µ > 0 be the barrier parameter. The corresponding barrier functions read as

$$\varphi_p(x, \mu) := \frac{c^T x}{\mu} - \sum_{i=1}^n \log x_i,$$

$$\varphi_d(s, \mu) := -\frac{b^T y}{\mu} - \sum_{i=1}^n \log s_i,$$

$$\varphi(x, s, \mu) := \frac{x^T s}{\mu} - \sum_{i=1}^n \log x_i s_i.$$

Due to the one-to-one correspondence between y and s in $F_d$, we drop the argument y from the barrier function. The unique minimizer of $\varphi(x, s, \mu)$ over $F$, denoted by $(x(\mu), s(\mu))$, is a point on the central path, and it satisfies the primal-dual feasibility and the centering condition $xs = \mu e$, where $xs$ is the Hadamard product of x and s, i.e., it is an n-vector composed of $x_i s_i$, and $e \in \mathbb{R}^n$ is the vector with all its components equal to 1. We also call this point the µ-center. For the µ-center $(x(\mu), s(\mu))$, one has $\varphi(x(\mu), s(\mu), \mu) = n - n \log \mu$. A θ-approximate µ-center $(\bar{x}, \bar{s})$ is a point in the vicinity of the central path that satisfies

$$A^T \bar{y} + \bar{s} = c, \qquad A\bar{x} = b, \qquad \left\| \frac{\bar{x}\bar{s}}{\mu} - e \right\| \le \theta < 1. \tag{4}$$
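As a small illustration of the proximity condition in (4), the Python sketch below evaluates $\|\bar{x}\bar{s}/\mu - e\|$ and compares it with θ. It assumes the Euclidean norm and checks only the centering part of (4); the two linear feasibility conditions are taken for granted. The function names are hypothetical.

```python
import math

def proximity(x, s, mu):
    """Euclidean norm of x*s/mu - e, with x*s the Hadamard product."""
    return math.sqrt(sum((xi * si / mu - 1.0) ** 2 for xi, si in zip(x, s)))

def is_approximate_center(x, s, mu, theta=0.25):
    """True if (x, s) is strictly positive and within theta of the mu-center."""
    if min(x) <= 0.0 or min(s) <= 0.0:
        return False
    return proximity(x, s, mu) <= theta

exact = proximity([1.0, 2.0], [0.5, 0.25], mu=0.5)                 # x_i s_i = mu for all i
near = is_approximate_center([1.0, 2.0], [0.6, 0.25], mu=0.5)      # small perturbation
far = is_approximate_center([1.0, 2.0], [1.0, 0.25], mu=0.5)       # large perturbation
```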



We now state some technical lemmas that are needed throughout this paper. The proofs of those lemmas that are not given here can be found in interior point methods books, such as Roos et al. (1997), Ye (1997b), and den Hertog (1994).

Lemma 1 Let $z \in \mathbb{R}^n$ and $\|z\| < 1$. Then

$$\phi(\|z\|) \le \psi(z) \le \phi(-\|z\|),$$

where $\psi(z) = e^T z - \sum_{j=1}^n \log(1 + z_j)$ and $\phi(\alpha) = \alpha - \log(1 + \alpha)$.

Lemma 2 If $z \in \mathbb{R}^n$ and $\|z\|_\infty < 1$, then

$$e^T z - \frac{\|z\|^2}{2(1 - \|z\|_\infty)} \le \sum_{j=1}^n \log(1 + z_j) \le e^T z.$$
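The two inequalities can be sanity-checked numerically. The Python sketch below evaluates both sides of Lemma 1 and Lemma 2 on random points with $\|z\| < 1$; the sampling scheme, the tolerance, and the function names are assumptions made only for this illustration, and a numerical check is of course not a proof.

```python
import math
import random

def phi(alpha):
    return alpha - math.log(1.0 + alpha)

def psi(z):
    return sum(z) - sum(math.log(1.0 + zj) for zj in z)

def lemma_1_holds(z, tol=1e-12):
    norm = math.sqrt(sum(zj * zj for zj in z))          # Euclidean norm, must be < 1
    return phi(norm) - tol <= psi(z) <= phi(-norm) + tol

def lemma_2_holds(z, tol=1e-12):
    norm_sq = sum(zj * zj for zj in z)
    norm_inf = max(abs(zj) for zj in z)                 # must be < 1
    log_sum = sum(math.log(1.0 + zj) for zj in z)
    lower = sum(z) - norm_sq / (2.0 * (1.0 - norm_inf))
    return lower - tol <= log_sum <= sum(z) + tol

def random_point(n, radius, rng):
    """Random direction in R^n rescaled to the given Euclidean norm."""
    g = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    scale = radius / math.sqrt(sum(gj * gj for gj in g))
    return [scale * gj for gj in g]

rng = random.Random(0)
samples = [random_point(5, rng.uniform(0.05, 0.95), rng) for _ in range(200)]
all_ok = all(lemma_1_holds(z) and lemma_2_holds(z) for z in samples)
```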

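Lemma 3 Let $(\bar{x}, \bar{s})$ be a θ-approximate µ-center. Then

$$n - \theta\sqrt{n} \le \frac{\bar{x}^T \bar{s}}{\mu} \le n + \theta\sqrt{n}.$$

Moreover, if $\mu^+ = (1 - \eta)\mu$ with $0 < \eta < 1$, then

$$\left\| \frac{\bar{x}\bar{s}}{\mu^+} - e \right\| \le \frac{1}{1 - \eta}\,(\theta + \eta\sqrt{n}).$$

Corollary 4 For $n \ge 2$, $\eta = \frac{1}{9\sqrt{n}}$, and arbitrary $\theta \le 1/4$, one has

$$\left\| \frac{\bar{x}\bar{s}}{\mu^+} - e \right\| \le 0.40.$$

3. Interior point constraint generation algorithm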

In this section we present our IPCG algorithm for solving problem (1). We make the following assumptions:

Assumption 1 The set Ω is compact, and the mappings $t \to a_t$ and $t \to c_t$ are continuous in t.

Assumption 2 The feasible region $F_\Omega$ contains a δ-radius full dimensional ball.

Assumption 3 $F_\Omega$ is contained in the unit cube $[0, 1]^m$, and all m-vectors b and $a_t$ are normalized.



Assumption 1 is made to ensure that the optimal solution of the constraint generation algorithm coincides with that of problem (1) (see Lemma 6). Assumption 2 is needed to establish a bound on the number of constraints, and Assumption 3 is a scaling assumption that helps to keep the complexity bound simple.

We now describe the algorithm. Let $\bar{y}$ be a point in the vicinity of the central path of $F_d$ (see (4)) and let $\bar{a}_j^T y \le \bar{c}_j$, for $j = 1, \ldots, p$, be p constraints in $F_\Omega$ such that $\bar{c}_j < \bar{a}_j^T \bar{y}$. The feasible region of the updated discretization therefore reads as

$$F_d^+ = \{ s \in \mathbb{R}^n_+,\ r \in \mathbb{R}^p_+ : A^T y + s = c,\ \bar{A}^T y + r = \bar{c} \},$$

where $\bar{A} \in \mathbb{R}^{m \times p}$ is composed of the p column vectors $\bar{a}_j$ and $\bar{c} = (\bar{c}_1; \ldots; \bar{c}_p)$. Let $\mu^+ = (1 - \eta)\mu$ be the updated barrier parameter for a later-specified value $0 < \eta < 1$. The task is now to find a point in the vicinity of the central path of the updated discretization, close to the $\mu^+$-center of $F_d^+$. However, since $\bar{c} < \bar{A}^T \bar{y}$, the constraints $\bar{A}^T y \le \bar{c}$ are deep constraints for $F_d$, and the current point $\bar{y}$ is not a feasible point of $F_d^+$. Therefore we first need to derive a strictly feasible point for $F_d^+$. Let

$$\bar{t} = \arg\min \left\{ \frac{p}{2}\, t^T V t - \sum_{i=1}^p \log t_i \right\}, \tag{5}$$

where $V = \bar{A}^T (A \bar{X}^2 A^T)^{-1} \bar{A}$, and $\bar{X}$ is a diagonal $n \times n$ matrix with the components of the vector $\bar{x}$ as its diagonal elements. Also define

$$\bar{d} = p\,(\bar{A}^T \bar{y} - \bar{c})\,\bar{t}, \tag{6}$$

where the product of the two p-vectors is taken component-wise. Notice that since $\bar{A}^T \bar{y} - \bar{c} > 0$ and $\bar{t} > 0$, we have $\bar{d} > 0$. Let $\alpha < 1 - \theta$ be fixed. We consider two cases:

1. Moderately deep constraints: $\bar{d} < \alpha e$. In this case we show that all violated constraints cross the Dikin ellipsoid around $\bar{y}$, and the dual feasibility can be recovered using the current point $\bar{y}$.

2. Very deep constraints: There exists a constraint for which $\bar{d}_i \ge \alpha$. In this case, dual feasibility cannot be recovered. We show that one can recover feasibility in the primal space

$$F_p^+ = \{ x \in \mathbb{R}^n_+,\ t \in \mathbb{R}^p_+ : Ax + \bar{A}t = b \},$$

and obtain the new $\mu^+$-center using the primal barrier function.



The concept of shallow and deep cuts was first introduced in the context of the analytic center cutting plane method by Goffin and Vial (1999).

Lemma 5 Let $F_p$ and $F_d$ be the primal and dual feasible regions of the discretization problem, respectively. Let µ be the barrier parameter, and $(\bar{x}, \bar{s})$ be a point in the vicinity of the central path that satisfies (4). Let p violated constraints $\bar{A}^T y \le \bar{c}$ be added to $F_d$, $\Delta x = -\bar{X}^2 A^T (A \bar{X}^2 A^T)^{-1} \bar{A}\bar{t}$, and $\bar{d}_i < \alpha < 1 - \theta$, for $i = 1, \ldots, p$. Then $x^+ = (\bar{x} + \alpha \Delta x;\ \alpha \bar{t})$ is strictly feasible for $F_p^+$. Furthermore, define $\Delta s = A^T (A \bar{X}^2 A^T)^{-1} \bar{A}\bar{t}$ and

$$\bar{r} = \frac{1}{p}\,(\alpha e - \bar{d})\,\bar{t}^{-1}, \tag{7}$$

where the p-vector $\bar{t}^{-1}$ is the component-wise inverse of the vector $\bar{t}$. Then $s^+ = (\bar{s} + \alpha \Delta s;\ \bar{r})$ is strictly feasible for $F_d^+$.

Proof: A similar lemma is presented in Goffin and Vial (2000) for the multiple cutting plane algorithm where $\mu = 1$ and $\bar{d} = 0$. The directions $\Delta x$ and $\Delta s$ defined in this lemma are similar to those of Goffin and Vial (2000). Therefore, to some extent, their proof remains valid here. In particular, $A \Delta x + \bar{A}\bar{t} = 0$ is obtained by construction. Also, the strict feasibility of the updated points, $\bar{x} + \alpha \Delta x > 0$ and $\bar{s} + \alpha \Delta s > 0$, is obtained by Lemma 7 below and by the fact that $\alpha < 1 - \theta$. We prove that $\bar{A}^T y^+ + \bar{r} = \bar{c}$ and $\bar{r} > 0$. Notice that $A^T(\bar{y} + \Delta y) + \bar{s} + \Delta s = c$ implies $A^T \Delta y = -\Delta s$ and $\Delta y = -(A \bar{X}^2 A^T)^{-1} \bar{A}\bar{t}$. Therefore

$$\bar{A}^T y^+ + \bar{r} = \bar{A}^T \bar{y} + \alpha \bar{A}^T \Delta y + \bar{r} = \bar{A}^T \bar{y} - \alpha V \bar{t} + \bar{r},$$

and from the KKT optimality conditions of problem (5), we have

$$\bar{A}^T y^+ + \bar{r} = \bar{A}^T \bar{y} - \frac{\alpha}{p}\,\bar{t}^{-1} + \frac{1}{p}\,(\alpha e - \bar{d})\,\bar{t}^{-1} = \bar{A}^T \bar{y} - \frac{1}{p}\,\bar{d}\,\bar{t}^{-1} = \bar{c}.$$

Now since $\bar{d} < \alpha e$, we have $\bar{r} > 0$.
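The construction in Lemma 5 can be traced numerically in the simplest setting. The Python sketch below is an illustration under strong assumptions that are not from the paper: a single cut ($p = 1$), the initial box discretization $0 \le y \le 1$ written as $A = [I, -I]$, $c = (e; 0)$, so that $A\bar{X}^2A^T$ is diagonal, and a vector $b := A\bar{x}$ chosen so that the starting point is an exact µ-center. For $p = 1$ the optimality condition of (5) reduces to $p\,v\,t = 1/t$. The function name and the test data are hypothetical.

```python
import math

def recover_after_one_cut(y_bar, mu, a_new, c_new, alpha=0.5):
    """Lemma 5 with p = 1 on the box 0 <= y <= 1, i.e. A = [I, -I], c = (e; 0)."""
    m = len(y_bar)
    p = 1
    s_bar = [1.0 - yi for yi in y_bar] + list(y_bar)       # s = c - A^T y
    x_bar = [mu / si for si in s_bar]                      # x_i s_i = mu (exact mu-center)
    b = [x_bar[i] - x_bar[m + i] for i in range(m)]        # b := A x_bar, chosen for the demo
    # For A = [I, -I] the matrix A X^2 A^T is diagonal.
    diag = [x_bar[i] ** 2 + x_bar[m + i] ** 2 for i in range(m)]
    v = sum(a_new[i] ** 2 / diag[i] for i in range(m))     # V is 1 x 1 when p = 1
    t_bar = 1.0 / math.sqrt(p * v)                         # solves p * v * t = 1 / t
    violation = sum(a_new[i] * y_bar[i] for i in range(m)) - c_new
    d_bar = p * violation * t_bar                          # equation (6)
    w = [a_new[i] * t_bar / diag[i] for i in range(m)]     # (A X^2 A^T)^{-1} a_new t_bar
    delta_s = w + [-wi for wi in w]                        # Delta s = A^T w
    delta_x = [-(x_bar[j] ** 2) * delta_s[j] for j in range(2 * m)]  # -X^2 A^T w
    r_bar = (alpha - d_bar) / (p * t_bar)                  # equation (7)
    x_plus = [x_bar[j] + alpha * delta_x[j] for j in range(2 * m)] + [alpha * t_bar]
    y_plus = [y_bar[i] - alpha * w[i] for i in range(m)]   # Delta y = -w
    s_plus = [s_bar[j] + alpha * delta_s[j] for j in range(2 * m)] + [r_bar]
    return {"b": b, "d_bar": d_bar, "x_plus": x_plus, "y_plus": y_plus, "s_plus": s_plus}

# Hypothetical data: the cut 0.6*y1 + 0.8*y2 <= 0.9 is violated at y_bar = (0.6, 0.7).
out = recover_after_one_cut([0.6, 0.7], mu=0.5, a_new=[0.6, 0.8], c_new=0.9)
```

In this instance $\bar{d} < \alpha$, so the cut is moderately deep, and both $x^+$ and $s^+$ come out strictly positive while satisfying $Ax + \bar{A}t = b$ and $\bar{A}^Ty + r = \bar{c}$ up to rounding.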



Lemma 5 shows that if the violated constraints are moderately deep, then Newton’s method can be initiated from x+ and s+ to obtain a point in the vicinity of the new central path. In the next section we derive a bound on the number of Newton steps required to update the µ+ -center. When there is at least one very deep inequality, dual feasibility cannot be recovered because it is not clear how far the constraint is away from the Dikin ellipsoid. In this situation one can still recover primal feasibility by using x+ and Newton’s method can be applied in the primal space to update the µ+ -center. This procedure is repeated until the barrier parameter µ falls within the desired accuracy. The next lemma, due to Luo et al. (1999), shows that the constraint generation algorithm delivers an ε-optimal solution for problem (1). Lemma 6 Let ε > 0 be given. Under Assumption 1, if y¯ ∈ FΩ is in the vicinity of µ
$(\bar{A}^k)^T y^k$.

Step 2. Update $n_k = n_{k-1} + p_k$, $\eta_k = \frac{1}{9\sqrt{n_k}}$, $\mu_k = (1 - \eta_k)\mu_{k-1}$, $A^k = [A^{k-1}\ \bar{A}^k]$, and $c^k = (c^{k-1}; \bar{c}^k)$.

Step 3. Compute $\bar{t}$ from (5) and $\bar{d}$ from (6).

Step 4.1. If $\bar{d} < \alpha e$, then use $s^+$ to start a dual Newton procedure to obtain $s^k$, and define $x^k := x(s^k)$ in the vicinity of the $\mu_k$-center of $F_d^k$.

Step 4.2. Otherwise, use $x^+$ to start a primal Newton procedure to obtain $x^k$, and define $s^k := s(x^k)$ in the vicinity of the $\mu_k$-center of $F_p^k$.

Step 5. $k = k + 1$.

End
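The bookkeeping of Step 2 can be simulated on its own. The Python sketch below tracks only $n_k$, $\eta_k$ and $\mu_k$; it is not an implementation of the algorithm. It assumes a constant number of cuts per iteration, an initial box with $n_0 = 2m$ constraints and $\mu_0 = 1$ (as in Lemma 14), and the stopping test $\mu_k \le \varepsilon/(n_k + \sqrt{n_k})$ that is quoted from Lemma 6 in the proof of Theorem 16. The oracle and the Newton recentering of Steps 3 and 4 are omitted, and the function name is hypothetical.

```python
import math

def barrier_schedule(m, cuts_per_iter, eps, mu0=1.0, max_iter=100000):
    """Iterate the Step 2 updates until mu_k <= eps / (n_k + sqrt(n_k))."""
    n = 2 * m          # initial box constraints
    mu = mu0
    k = 0
    while mu > eps / (n + math.sqrt(n)) and k < max_iter:
        k += 1
        n += cuts_per_iter                  # n_k = n_{k-1} + p_k
        eta = 1.0 / (9.0 * math.sqrt(n))    # eta_k = 1 / (9 sqrt(n_k))
        mu *= 1.0 - eta                     # mu_k = (1 - eta_k) mu_{k-1}
    return k, n, mu

iterations, n_final, mu_final = barrier_schedule(m=2, cuts_per_iter=1, eps=1e-2)
```

Because $\eta_k$ shrinks as constraints accumulate, the barrier parameter decreases more slowly in later iterations, which is why the number of generated constraints enters the overall complexity bound.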


4. Complexity of recovering the µ-center

In this section we derive a bound on the number of Newton steps required to obtain a point in the vicinity of the $\mu^+$-center when all violated constraints are moderately deep. First, we recall a lemma from Goffin and Vial (2000):

Lemma 7 For the directions $\Delta x$ and $\Delta s$ in Lemma 5, we have

$$\left\| \bar{X}^{-1} \Delta x \right\| \le \frac{1}{1 - \theta} \quad \text{and} \quad \left\| \bar{S}^{-1} \Delta s \right\| \le \frac{1}{1 - \theta}.$$

We also need the following technical lemma:

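Lemma 8 Let $(\bar{x}, \bar{s})$ be a θ-approximate µ-center. Then

1. $\varphi(\bar{x}, \bar{s}, \mu) \le \varphi(x(\mu), s(\mu), \mu) + \frac{\theta^2}{2(1 - \theta)}$,

2. $\varphi(\bar{x}, \bar{s}, \mu^+) + n \log(1 - \eta) \le \varphi(x(\mu), s(\mu), \mu) + \nu(n, \eta, \theta)$,

3. $\varphi(\bar{x}, \bar{s}, \mu^+) \le \varphi(x(\mu^+), s(\mu^+), \mu^+) + \nu(n, \eta, \theta)$,

where

$$\nu(n, \eta, \theta) = n \log(1 - \eta) + \frac{\eta(n + \theta\sqrt{n})}{1 - \eta} + \frac{\theta^2}{2(1 - \theta)}.$$

Proof: In view of Lemma 2, the first inequality is straightforward. Let us prove the second inequality. Since $(\bar{x}, \bar{s})$ is in the vicinity of the µ-center, from Lemma 2 we have

$$\varphi(\bar{x}, \bar{s}, \mu^+) = \frac{\bar{x}^T \bar{s}}{\mu^+} - n \log \mu - \sum_{i=1}^n \log \frac{\bar{x}_i \bar{s}_i}{\mu} \le \frac{\bar{x}^T \bar{s}}{\mu^+} - \frac{\bar{x}^T \bar{s}}{\mu} + n - n \log \mu + \frac{\theta^2}{2(1 - \theta)}.$$

The second inequality follows from Lemma 3 and

$$\frac{\bar{x}^T \bar{s}}{\mu^+} - \frac{\bar{x}^T \bar{s}}{\mu} = \frac{\eta\, \bar{x}^T \bar{s}}{(1 - \eta)\mu} \le \frac{\eta(n + \theta\sqrt{n})}{1 - \eta}.$$

The third inequality follows from the second one, since $\varphi(x(\mu^+), s(\mu^+), \mu^+) = n - n \log \mu - n \log(1 - \eta)$.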

Notice that the bounds on the primal-dual barrier function in Lemma 8 are also valid for the primal and the dual barrier functions. The following corollary simplifies these bounds for some given values.



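Corollary 9 For $n \ge 2$, $\eta = \frac{1}{9\sqrt{n}}$, and arbitrary $\theta \le 1/4$ we have

1. $\varphi(\bar{x}, \bar{s}, \mu) \le \varphi(x(\mu), s(\mu), \mu) + 0.05$,

2. $\varphi(\bar{x}, \bar{s}, \mu^+) + n \log(1 - \eta) \le \varphi(x(\mu), s(\mu), \mu) + 0.10$,

3. $\varphi(\bar{x}, \bar{s}, \mu^+) \le \varphi(x(\mu^+), s(\mu^+), \mu^+) + 0.10$.

Proof: Since $0 < \eta < 1$, we have

$$\nu(n, \eta, \theta) \le -n\eta + \frac{\eta(n + \theta\sqrt{n})}{1 - \eta} + \frac{\theta^2}{2(1 - \theta)} = \frac{\theta\eta\sqrt{n} + n\eta^2}{1 - \eta} + \frac{\theta^2}{2(1 - \theta)}.$$

For $\eta = \frac{1}{9\sqrt{n}}$ and $n \ge 2$, the bound simplifies as follows:

$$\nu(n, \eta, \theta) \le \frac{12}{99}\left(\theta + \frac{1}{9}\right) + \frac{\theta^2}{2(1 - \theta)}.$$

The proof follows by the assumption $\theta \le 0.25$.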
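The numerical constants in Corollary 4 and Corollary 9 can be checked directly from the formulas of Lemma 3 and Lemma 8. The Python sketch below does this for θ = 1/4 over a range of n; the range and the function names are choices made only for this illustration.

```python
import math

def proximity_bound(n, theta):
    """Right-hand side of Lemma 3 with eta = 1 / (9 sqrt(n))."""
    eta = 1.0 / (9.0 * math.sqrt(n))
    return (theta + eta * math.sqrt(n)) / (1.0 - eta)

def nu(n, theta):
    """nu(n, eta, theta) of Lemma 8 with eta = 1 / (9 sqrt(n))."""
    eta = 1.0 / (9.0 * math.sqrt(n))
    return (n * math.log(1.0 - eta)
            + eta * (n + theta * math.sqrt(n)) / (1.0 - eta)
            + theta ** 2 / (2.0 * (1.0 - theta)))

theta = 0.25
worst_proximity = max(proximity_bound(n, theta) for n in range(2, 5001))
worst_nu = max(nu(n, theta) for n in range(2, 5001))
first_term = theta ** 2 / (2.0 * (1.0 - theta))   # bound in item 1 of Lemma 8
```

For these values the largest proximity bound occurs at n = 2 and stays below 0.40, while ν stays below 0.10 and the item 1 term stays below 0.05, in agreement with the two corollaries.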

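Now we establish an upper bound on the primal barrier function at $(x^+, \mu^+)$.

Lemma 10 Let $(\bar{x}, \bar{s})$ be a θ-approximate µ-center, $\mu^+ = \left(1 - \frac{1}{9\sqrt{n}}\right)\mu$, and all the violated constraints be moderately deep, i.e., $\bar{d} < \alpha e$. Then for $\alpha < 1 - \theta$ and $0 < \theta \le 1/4$, one has

$$\varphi_p^+(x^+, \mu^+) \le \varphi_p(\bar{x}, \mu^+) - \alpha - \log\left(1 - \frac{\alpha}{1 - \theta}\right) - e^T \bar{d} - \sum_{j=1}^p \log \alpha \bar{t}_j + 0.40.$$

Proof: The primal barrier function at $(x^+, \mu^+)$ reads

$$\varphi_p^+(x^+, \mu^+) = \frac{(c^+)^T x^+}{\mu^+} - \sum_{j=1}^n \log \bar{x}_j (1 + \alpha \Delta x_j / \bar{x}_j) - \sum_{j=1}^p \log \alpha \bar{t}_j.$$

Since $\|\bar{X}^{-1} \Delta x\| \le \frac{1}{1 - \theta}$ and $\alpha < 1 - \theta$, from Lemma 1

$$\varphi_p^+(x^+, \mu^+) \le \frac{(c^+)^T x^+}{\mu^+} - \sum_{j=1}^n \log \bar{x}_j - \alpha e^T \bar{X}^{-1} \Delta x - \frac{\alpha}{1 - \theta} - \log\left(1 - \frac{\alpha}{1 - \theta}\right) - \sum_{j=1}^p \log \alpha \bar{t}_j. \tag{8}$$

On the other hand,

$$\frac{(c^+)^T x^+}{\mu^+} - \alpha e^T \bar{X}^{-1} \Delta x = \frac{c^T \bar{x}}{\mu^+} + \alpha \left[ \frac{c^T \Delta x}{\mu^+} + \frac{\bar{c}^T \bar{t}}{\mu^+} - e^T \bar{X}^{-1} \Delta x \right]. \tag{9}$$

In view of (6) one has

$$\bar{c}^T \bar{t} = \bar{y}^T \bar{A}\bar{t} - \frac{\bar{t}^T \bar{D}\, \bar{t}^{-1}}{p},$$

where $\bar{D}$ is a diagonal matrix composed of the vector $\bar{d}$. Thus, the term in brackets in (9) reads as

$$\frac{\bar{s}^T \Delta x}{\mu^+} - e^T \bar{X}^{-1} \Delta x - \frac{e^T \bar{d}}{\mu^+},$$

or

$$\left( \frac{\bar{s}\bar{x}}{\mu^+} - e \right)^T (\bar{X}^{-1} \Delta x) - \frac{e^T \bar{d}}{\mu^+}, \tag{10}$$

and using the Cauchy-Schwarz inequality we have

$$\left( \frac{\bar{s}\bar{x}}{\mu^+} - e \right)^T \bar{X}^{-1} \Delta x \le \left\| \frac{\bar{s}\bar{x}}{\mu^+} - e \right\| \left\| \bar{X}^{-1} \Delta x \right\|.$$

Now, from (10), Corollary 4, Lemma 7, and the assumption that $\mu \le 1$, we have

$$\frac{(c^+)^T x^+}{\mu^+} - \alpha e^T \bar{X}^{-1} \Delta x \le \frac{c^T \bar{x}}{\mu^+} - e^T \bar{d} + 0.40.$$

The proof follows from (8).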

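Notice that since $\bar{d} > 0$, the term $e^T \bar{d}$ can be eliminated from the bound in Lemma 10. We now bound the dual barrier function.

Lemma 11 Let the assumptions of Lemma 10 be satisfied. Then

$$\varphi_d^+(s^+, \mu^+) \le \varphi_d(\bar{s}, \mu^+) - \alpha - \log\left(1 - \frac{\alpha}{1 - \theta}\right) - \sum_{j=1}^p \log \bar{r}_j + 0.40.$$

Proof: Observe that

$$\varphi_d^+(s^+, \mu^+) = -\frac{b^T y^+}{\mu^+} - \sum_{j=1}^{n+p} \log s_j^+ = -\frac{b^T \bar{y}}{\mu^+} - \frac{\alpha b^T \Delta y}{\mu^+} - \sum_{j=1}^n \log \bar{s}_j (1 + \alpha \Delta s_j / \bar{s}_j) - \sum_{j=1}^p \log \bar{r}_j.$$

Now since $\alpha < 1 - \theta$, in view of Lemma 1 and Lemma 7 we have

$$\varphi_d^+(s^+, \mu^+) \le \varphi_d(\bar{s}, \mu^+) + \frac{\alpha \bar{x}^T \Delta s}{\mu^+} - \alpha e^T \bar{S}^{-1} \Delta s - \frac{\alpha}{1 - \theta} - \log\left(1 - \frac{\alpha}{1 - \theta}\right) - \sum_{j=1}^p \log \bar{r}_j.$$

On the other hand,

$$\frac{\bar{x}^T \Delta s}{\mu^+} - e^T \bar{S}^{-1} \Delta s = \left( \frac{\bar{x}\bar{s}}{\mu^+} - e \right)^T \bar{S}^{-1} \Delta s \le \left\| \frac{\bar{x}\bar{s}}{\mu^+} - e \right\| \left\| \bar{S}^{-1} \Delta s \right\| \le \frac{0.40}{1 - \theta},$$

where the last inequality is due to Corollary 4 and Lemma 7.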

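We now present the main result of this section.

Theorem 12 Let $(\bar{x}, \bar{s})$ be a θ-approximate µ-center, $\mu^+ = \left(1 - \frac{1}{9\sqrt{n}}\right)\mu$, and all the violated constraints be moderately deep. Moreover, let $\bar{d} < (\alpha/2)e$. Then for $\alpha < 1 - \theta$ and $0 < \theta \le 1/4$ we have

$$\varphi^+(x^+, s^+, \mu^+) - \varphi^+(x(\mu^+), s(\mu^+), \mu^+) \le p \log p + \xi(p, \theta, \alpha),$$

where

$$\xi(p, \theta, \alpha) = 1.0 - 2\alpha - 2 \log\left(1 - \frac{\alpha}{1 - \theta}\right) - p\left(1 + \log \frac{\alpha^2}{2}\right).$$

Proof: Adding the inequalities in Lemma 10 and Lemma 11 gives

$$\varphi^+(x^+, s^+, \mu^+) \le \varphi(\bar{x}, \bar{s}, \mu^+) + 0.80 - 2\alpha - 2 \log\left(1 - \frac{\alpha}{1 - \theta}\right) - \sum_{j=1}^p \log \alpha \bar{t}_j \bar{r}_j.$$

From (7) we have

$$\sum_{j=1}^p \log \alpha \bar{t}_j \bar{r}_j = \sum_{j=1}^p \log \frac{\alpha}{p}(\alpha - \bar{d}_j) = p \log \frac{\alpha}{p} + \sum_{j=1}^p \log(\alpha - \bar{d}_j) \ge p \log \frac{\alpha^2}{2p},$$

where the inequality is valid because $\bar{d}_j < \alpha/2$, for $j = 1, \ldots, p$. Thus,

$$\varphi^+(x^+, s^+, \mu^+) \le \varphi(\bar{x}, \bar{s}, \mu^+) + 0.80 - 2\alpha - 2 \log\left(1 - \frac{\alpha}{1 - \theta}\right) + p \log \frac{2p}{\alpha^2},$$

and from Lemma 8 and Corollary 9, we have

$$\varphi^+(x^+, s^+, \mu^+) \le \varphi(x(\mu^+), s(\mu^+), \mu^+) + 1.0 - 2\alpha - 2 \log\left(1 - \frac{\alpha}{1 - \theta}\right) + p \log \frac{2p}{\alpha^2}. \tag{11}$$

On the other hand,

$$\varphi(x(\mu^+), s(\mu^+), \mu^+) = n - n \log \mu^+ = n + p - (n + p) \log \mu^+ - (p - p \log \mu^+) = \varphi^+(x(\mu^+), s(\mu^+), \mu^+) - p + p \log \mu^+ \le \varphi^+(x(\mu^+), s(\mu^+), \mu^+) - p.$$

The proof follows from (11).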

Note that at each iteration of the Newton method the barrier function is reduced by a constant amount. Therefore, Theorem 12 shows that after adding p moderately deep constraints and simultaneously updating µ, only O(p log(p + 1)) Newton steps are required to obtain a point in the vicinity of the new µ+ -center. We remark that the assumption µ ≤ 1 has been made only to simplify this bound. If µ > 1, the complexity changes to O(p log(µp + 1)).
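To get a feel for how the bound of Theorem 12 grows with the number p of added constraints, the Python sketch below evaluates $p \log p + \xi(p, \theta, \alpha)$. The default values θ = 0.25 and α = 0.5 are the ones used later in Lemma 13; the function names are hypothetical, and the result bounds the barrier gap, not the Newton step count itself, since the constant decrease per Newton step is not specified here.

```python
import math

def xi(p, theta, alpha):
    """xi(p, theta, alpha) from Theorem 12."""
    return (1.0 - 2.0 * alpha
            - 2.0 * math.log(1.0 - alpha / (1.0 - theta))
            - p * (1.0 + math.log(alpha ** 2 / 2.0)))

def barrier_gap_bound(p, theta=0.25, alpha=0.5):
    """Upper bound p log p + xi on the barrier gap after adding p cuts."""
    if not 0.0 < alpha < 1.0 - theta:
        raise ValueError("Theorem 12 requires 0 < alpha < 1 - theta")
    return p * math.log(p) + xi(p, theta, alpha)

bounds = [barrier_gap_bound(p) for p in (1, 2, 4, 8, 16)]
```

Dividing such a bound by the guaranteed decrease per Newton step would give an estimate of the number of recentering steps, which is the source of the $O(p \log(p + 1))$ statement above.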

5. Complexity analysis and convergence

The complexity analysis and convergence of Algorithm 1 are presented for the general case. Let $\bar{A}^T y \le \bar{c}$ be the p violated constraints such that $\bar{c} < \bar{A}^T \bar{y}$. In this section, we do not differentiate between moderately deep and very deep constraints; for simplicity we treat all constraints as deep. This approach covers the worst-case behavior of our algorithm.

Lemma 13 For $n \ge 2$, $\eta = \frac{1}{9\sqrt{n}}$, $\theta = 0.25$ and $\alpha = 0.50$, we have

$$\varphi_d^+(s(\mu^+), \mu^+) \ge \varphi_d(s(\mu), \mu) - p \log p - \sum_{i=1}^p \log v_i^{1/2}, \tag{12}$$

where $v \in \mathbb{R}^p$ is composed of the diagonal elements of the matrix V as defined in Section 3.

Proof: First observe that

$$\varphi_d^+(s(\mu^+), \mu^+) = n + p - (n + p) \log \mu^+ - \varphi_p^+(x(\mu^+), \mu^+) \ge n + p - (n + p) \log \mu^+ - \varphi_p^+(x^+, \mu^+),$$

and from Lemma 10 we have

$$\varphi_d^+(s(\mu^+), \mu^+) \ge n - n \log \mu - n \log(1 - \eta) - \varphi_p(\bar{x}, \mu^+) + p - p \log \mu^+ + \alpha + \log\left(1 - \frac{\alpha}{1 - \theta}\right) + e^T \bar{d} + \sum_{j=1}^p \log \alpha \bar{t}_j - 0.40.$$

Now, from Corollary 9 we have

$$\varphi_d^+(s(\mu^+), \mu^+) \ge n - n \log \mu - \varphi_p(x(\mu), \mu) + p - p \log \mu^+ + \alpha + \log\left(1 - \frac{\alpha}{1 - \theta}\right) + e^T \bar{d} + \sum_{j=1}^p \log \alpha \bar{t}_j - 0.50.$$

Thus

$$\varphi_d^+(s(\mu^+), \mu^+) \ge \varphi_d(s(\mu), \mu) + \sum_{j=1}^p \log \bar{t}_j + p + \alpha + \log\left(1 - \frac{\alpha}{1 - \theta}\right) + e^T \bar{d} + p \log \alpha - 0.50.$$

On the other hand, Goffin and Vial (2000) prove that

$$\sum_{j=1}^p \log \bar{t}_j \ge -p \log p - \sum_{j=1}^p \log v_j^{1/2}.$$

Therefore,

$$\varphi_d^+(s(\mu^+), \mu^+) \ge \varphi_d(s(\mu), \mu) - p \log p - \sum_{j=1}^p \log v_j^{1/2} + p + \alpha + \log\left(1 - \frac{\alpha}{1 - \theta}\right) + e^T \bar{d} + p \log \alpha - 0.50.$$

The proof follows by substituting $\theta = 0.25$ and $\alpha = 0.50$.

Lemma 13 establishes a bound on the optimal value of the updated dual barrier function after adding p deep constraints and updating µ. Notice that inequality (12) derived in this lemma is the same inequality that was derived for central cuts by Ye (1997a), and Goffin and Vial (2000). Here, we simply drop the term $e^T \bar{d} > 0$ from the bound because we do not have any information on the depth of the cut. However, in practice, having deep constraints is beneficial in the sense that a feasible solution to the original problem is reached faster when constraints are added with no changes to their right-hand side.

Lemma 14 At the kth iteration of the algorithm, let $\mu_k \le \mu_0 := 1$, $n_k := n_0 + n_p := 2m + \sum_{i=1}^k p_i$, and $p = \max_{i=1,\ldots,k} \{p_i\}$, where $p_i$ is the number of deep constraints added at iteration i. Then

$$-\frac{\sqrt{m}}{\mu_k} + n_k \log \delta \le -\varphi_d^k(s(\mu_k), \mu_k) \le \frac{\sqrt{m}}{2} + 2m \log \frac{1}{2} + n_p \log(p + 1) + \sum_{i=1}^{n_p} \log v_i^{1/2},$$

where δ is the radius of the full dimensional ball contained in $F_\Omega$.

Proof: The right-hand side inequality follows from Lemma 13,

$$\varphi_d^k(s(\mu_k), \mu_k) \ge \varphi_d^0(s(\mu_0), \mu_0) - \sum_{i=1}^{k} p_i \log(p + 1) - \sum_{i=1}^{n_p} \log v_i^{1/2},$$

and the fact that

$$\varphi_d^0(s(\mu_0), \mu_0) = -\frac{b^T y}{\mu_0} - \sum_{j=1}^{2m} \log s_j(\mu_0) = -\frac{b^T e}{2} - 2m \log \frac{1}{2} \ge -\frac{\sqrt{m}}{2} - 2m \log \frac{1}{2},$$

where the inequality is due to $\|b\| \le 1$, as presented in Assumption 3. To prove the left-hand side inequality, let $(y^c, s^c)$ be the center of the δ-ball. Then $s_i^c = c_i - a_i^T y^c \ge \delta$, for all $i = 1, \ldots, 2m + n_p$. Also from Assumption 3, since $F_\Omega$ is contained in the unit cube, we have $\|y\|_\infty \le 1$. Therefore, at the kth iteration

$$\varphi_d^k(s^c, \mu_k) = -\frac{b^T y^c}{\mu_k} - \sum_{i=1}^{n_k} \log s_i^c \le \frac{\sqrt{m}}{\mu_k} - (n_p + 2m) \log \delta.$$

The lemma now follows from $\varphi_d^k(s(\mu_k), \mu_k) \le \varphi_d^k(s^c, \mu_k)$.

The following lemma is due to Ye (1997a).

Lemma 15 Let $p \le m$. Then

$$\sum_{i=1}^{n_p} \log v_i \le 2m^2 \log\left(1 + \frac{n_p}{8m^2}\right).$$

We now present the main theoretical result of this paper:

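Theorem 16 For all i, let $1 \le p_i \le p \le m$. Then after adding at most

$$O\left( \frac{m^2 p^2}{\delta^2}\, e^{3\sqrt{m}/\varepsilon} \right)$$

constraints, the IPCG algorithm stops with an ε-solution to the SILO problem.

Proof: From Lemma 14 we have

$$-\frac{\sqrt{m}}{\mu_k} - \frac{\sqrt{m}}{2} + n_k \log \delta - n_p \log(p + 1) \le 2m \log \frac{1}{2} + \sum_{i=1}^{n_p} \log v_i^{1/2}.$$

Since $p \ge 1$ and $n_k \ge n_p$ one has

$$-\frac{3\sqrt{m}}{2 n_k \mu_k} + \log \frac{\delta}{p + 1} \le \frac{1}{2 n_k}\left( 2m \log \frac{1}{4} + \sum_{i=1}^{n_p} \log v_i \right)$$

$$\le \frac{1}{2} \log \frac{\frac{m}{2} + \sum_{i=1}^{n_p} v_i}{n_k} \tag{13}$$

$$\le \frac{1}{2} \log \frac{\frac{m}{2} + 2m^2 \log\left(1 + \frac{n_p}{8m^2}\right)}{n_k}, \tag{14}$$

where (13) is due to the Geometric Mean Inequality, and (14) is due to Lemma 15. Notice that inequality (14) is valid at each iteration of the IPCG algorithm. Therefore, a feasible solution in the δ-ball is obtained when this inequality is violated. That is,

$$\log \frac{\frac{m}{2} + 2m^2 \log\left(1 + \frac{n_p}{8m^2}\right)}{n_k} \le -\frac{3\sqrt{m}}{n_k \mu_k} + \log \left( \frac{\delta}{p + 1} \right)^2.$$

On the other hand, from Lemma 6, an ε-solution of the original problem is reached when

$$\mu_k \le \frac{\varepsilon}{n_k + \sqrt{n_k}}.$$

Therefore, an ε-solution is achieved when

$$\log \frac{\frac{m}{2} + 2m^2 \log\left(1 + \frac{n_p}{8m^2}\right)}{n_k} \le -\frac{3\sqrt{m}}{\varepsilon} + \log \left( \frac{\delta}{p + 1} \right)^2,$$

or when

$$\frac{\frac{m}{2} + 2m^2 \log\left(1 + \frac{n_p}{8m^2}\right)}{n_k} \le \frac{e^{-3\sqrt{m}/\varepsilon}\, \delta^2}{(p + 1)^2}$$

holds. The proof now follows from this inequality.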


6. Gamma Knife® Perfexion™ treatment planning

In this section we implement our algorithm to solve the sector duration optimization problem for the latest model of Gamma Knife® treatment units, known as Leksell Gamma Knife® Perfexion™, developed and maintained by Elekta, Stockholm, Sweden (http://www.elekta.com/). The Leksell Gamma Knife® Perfexion™ radiosurgery unit provides several technological advancements to the Gamma Knife® unit which greatly increase its treatment accuracy.

Gamma Knife® is a highly specialized unit that provides treatment of brain tumors and lesions wherein individual proton beams from several directions are directed into the brain tumor or malformation, also known as the gross tumor volume (GTV). The size and dose intensity of each of these beams can be controlled independently of the other beams. The focal point of all of these beams is called the isocenter. A simultaneous collection of beams pointed at a particular isocenter is called a shot. Several shots are placed inside the target region at different isocenters, and sufficient radiation dose in those shots eliminates the tumorous cells. The neurosurgeons and the radiation oncologists plan the treatment by determining the isocenters (where the radiation shots should be centered), and the dose intensity (radiation duration at an isocenter). The dose is calculated by discretizing the patient into several cubes, called voxels (volume pixels). The desired amount of dose delivered to each voxel is determined by clinicians. The purpose of the sector duration optimization problem, given a set of isocenters, is to achieve that desired dose. Although we can use our method to solve the sector duration optimization problem, optimally locating the isocenters is not trivial. Physicians traditionally rely on their experience and judgment in determining the isocenter locations, which may lead to a non-optimal treatment plan.

The ideal plan is to deliver a high dose of radiation to the malformation within the brain in such a way that the damage to the healthy tissues surrounding it is minimized. In the Perfexion™ units, each beam originates from one of eight sectors spaced around the patient. Unlike older Gamma Knife® machines, Perfexion™ has unlimited access to the cranial volume, fully automated couch positioning capability, and fully automated sector selection. It also has the capability of delivering different beam sizes in different sectors.



Mathematical models for Gamma Knife® treatment planning have been well studied in the literature. See, for instance, Lim and Lee (2008) and Ganz (1997). Most existing mathematical models for the Gamma Knife® problem are based on mixed integer programming and mixed integer nonlinear programming models. See, for instance, Lim and Lee (2008) and Ferris et al. (2003). Mathematical models for Perfexion™ treatment planning are still in early stages. In an ongoing research project, Ghaffari et al. (2009) developed a mathematical model to incorporate all the new features of Perfexion™. In this model, they use a combination of two optimization models: isocenter optimization and sector duration optimization. This approach is similar to that of Aleman et al. (2008), used for IMRT treatment planning problems. We implement the algorithm developed in this paper to solve the sector duration optimization problem arising in Perfexion™ treatment planning as proposed in Ghaffari et al. (2009), and test it using real patient data. All tests were run on a desktop computer with an Intel(R) Core(TM)2 Quad 2.66 GHz CPU and 4 GB RAM.

6.1. Sector Duration Optimization Model

Sector duration optimization is the problem of finding the optimal shot shape for a given set of isocenters. The shot shape is determined by the amount of time each sector delivers radiation at each of the three available beam diameter sizes (4mm, 8mm, or 16mm). This problem essentially minimizes the damage to the healthy tissues around the tumor. The model that we use here is based on the fluence map optimization model presented by Aleman et al. (2008). In this model, each voxel is assigned a penalty related to the amount of under- or overdosage it receives. The penalties are weighted according to the structure to which the voxel belongs so that some structures can be given priority over other structures. Similarly, the penalties for underdosing may differ from the penalties for overdosing, so that the optimization model can express a preference for the dose in certain structures. The sector duration optimization model minimizes the total penalty in the treatment plan.

In the following formulation, $\mathcal{S}$ is the set of structures, $v_s$ is the number of voxels in structure $s$, $T_s$ is the threshold (desired) dose level for structure $s$, $\mathcal{B}$ is the set of sectors, $\mathcal{C}$ is the set of feasible sector sizes, $\mathcal{I}$ is the set of feasible isocenters, $t_{Ibc} \ge 0$ is the time (sec) of radiation delivery at isocenter $I \in \mathcal{I}$ from sector $b \in \mathcal{B}$ with size $c \in \mathcal{C}$ (decision variable), $D_{Ibcjs}$ is the dose deposition coefficient for radiation delivered at isocenter $I \in \mathcal{I}$ from sector $b \in \mathcal{B}$ with size $c \in \mathcal{C}$ to voxel $j = 1, \dots, v_s$ in structure $s \in \mathcal{S}$, and $z_{js}$ is the total dose received by voxel $j = 1, \dots, v_s$ in structure $s \in \mathcal{S}$ in the treatment plan. For a given finite set of isocenters $\Theta \subseteq \mathcal{I}$,

\[
\begin{aligned}
\min \quad & \sum_{s \in \mathcal{S}} \sum_{j=1}^{v_s} F_s(z_{js}) \\
\text{subject to} \quad & z_{js} = \sum_{I \in \Theta} \sum_{b \in \mathcal{B}} \sum_{c \in \mathcal{C}} D_{Ibcjs}\, t_{Ibc}, && s \in \mathcal{S},\ j = 1, \dots, v_s, \\
& t_{Ibc} \ge 0, && I \in \Theta,\ b \in \mathcal{B},\ c \in \mathcal{C},
\end{aligned}
\tag{15}
\]

where

\[
F_s(z_{js}) = \frac{1}{v_s} \left[ \overline{w}_s (z_{js} - T_s)_+^2 + \underline{w}_s (T_s - z_{js})_+^2 \right]
\tag{16}
\]

is the penalty function for the dose $z_{js}$ received by voxel $j$ in structure $s$, and $v_s$ is the number of voxels in structure $s$. The overdose weight $\overline{w}_s$, the underdose weight $\underline{w}_s$, and the threshold $T_s$ are parameters used to influence the quality of the treatment plan.
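The penalty (16) and the objective of (15) can be illustrated with a short sketch. This is not the authors' implementation: the structure names, doses, thresholds, and weights below are hypothetical values chosen only to make the arithmetic easy to follow, and `w_over` and `w_under` are assumed names for the overdose and underdose weights.

```python
from typing import Dict, List


def penalty(z: float, target: float, w_over: float, w_under: float, n_voxels: int) -> float:
    """One-voxel penalty F_s(z) as in (16): one-sided quadratics scaled by 1/v_s."""
    over = max(z - target, 0.0)    # (z - T_s)_+ : dose above the threshold
    under = max(target - z, 0.0)   # (T_s - z)_+ : dose below the threshold
    return (w_over * over ** 2 + w_under * under ** 2) / n_voxels


def total_penalty(doses: Dict[str, List[float]], params: Dict[str, Dict[str, float]]) -> float:
    """Objective of (15): sum of F_s(z_js) over all structures s and voxels j."""
    total = 0.0
    for s, z_list in doses.items():
        p = params[s]
        for z in z_list:
            total += penalty(z, p["T"], p["w_over"], p["w_under"], len(z_list))
    return total


# Hypothetical data: per-voxel doses (Gy) for two structures.
doses = {"GTV": [1.5, 2.0, 2.5], "Brainstem": [0.5, 1.0]}
params = {
    "GTV": {"T": 2.0, "w_over": 1.0, "w_under": 10.0},       # underdosing the target is costly
    "Brainstem": {"T": 0.8, "w_over": 5.0, "w_under": 0.0},  # only overdose is penalized
}
objective = total_penalty(doses, params)
```

With these made-up numbers, only one brainstem voxel and two GTV voxels contribute to the objective; the voxel that receives exactly the threshold dose contributes nothing.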

One approach to solving problem (15) is the projected gradient method, which is based on projecting a descent direction onto the feasible set. However, as we see in our computational results, the projected gradient algorithm can zigzag, which results in very slow convergence. Since the functions Fs are convex quadratic functions, problem (15) is a convex problem with only non-negativity constraints. Thus one can reformulate it as an SOCO problem:

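\[
\begin{aligned}
\min \quad & \sum_{s \in \mathcal{S} \cup \mathcal{T}} (\delta_s + \gamma_s) \\
\text{subject to} \quad & \sqrt{\sum_{j=1}^{v_s} \frac{\overline{w}_s}{v_s} (z_{js} - T_s)_+^2} \le \delta_s, && s \in \mathcal{S}, \\
& \sqrt{\sum_{j=1}^{v_s} \frac{\underline{w}_s}{v_s} (T_s - z_{js})_+^2} \le \gamma_s, && s \in \mathcal{S}, \\
& z_{js} = \sum_{I \in \Theta} \sum_{b \in \mathcal{B}} \sum_{c \in \mathcal{C}} D_{Ibcjs}\, t_{Ibc}, && s \in \mathcal{S},\ j = 1, \dots, v_s, \\
& t_{Ibc} \ge 0, && I \in \Theta,\ b \in \mathcal{B},\ c \in \mathcal{C}.
\end{aligned}
\tag{17}
\]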
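For comparison with the conic reformulation, the projected gradient approach to problem (15) can be sketched as follows. This is a minimal illustration under simplifying assumptions, not the implementation used in the experiments: it handles a single structure, uses a fixed step size, and the dose matrix `D`, the weights, and all function names are hypothetical. Because the only constraints are non-negativity constraints, the projection reduces to clipping negative durations to zero.

```python
from typing import List

Matrix = List[List[float]]  # one row per voxel, one column per duration variable


def dose(D: Matrix, t: List[float]) -> List[float]:
    """z = D t: dose received by each voxel."""
    return [sum(d * t_k for d, t_k in zip(row, t)) for row in D]


def objective(D: Matrix, t: List[float], target: float, w_over: float, w_under: float) -> float:
    """Single-structure version of the objective in (15) with penalty (16)."""
    z = dose(D, t)
    return sum(
        w_over * max(z_j - target, 0.0) ** 2 + w_under * max(target - z_j, 0.0) ** 2
        for z_j in z
    ) / len(z)


def gradient(D: Matrix, t: List[float], target: float, w_over: float, w_under: float) -> List[float]:
    """Gradient of the objective with respect to the durations t (chain rule through z = D t)."""
    z = dose(D, t)
    v = len(z)
    slopes = [
        2.0 * (w_over * max(z_j - target, 0.0) - w_under * max(target - z_j, 0.0)) / v
        for z_j in z
    ]
    return [sum(slopes[j] * D[j][k] for j in range(v)) for k in range(len(t))]


def projected_gradient(D: Matrix, t0: List[float], target: float, w_over: float,
                       w_under: float, step: float = 0.5, iters: int = 200) -> List[float]:
    """Fixed-step gradient descent followed by projection onto {t >= 0}."""
    t = list(t0)
    for _ in range(iters):
        g = gradient(D, t, target, w_over, w_under)
        t = [max(0.0, t_k - step * g_k) for t_k, g_k in zip(t, g)]
    return t


# Hypothetical data: 3 voxels, 2 duration variables, prescription dose 2.0.
D = [[1.0, 0.2], [0.3, 1.0], [0.5, 0.5]]
t0 = [1.0, 1.0]
target, w_over, w_under = 2.0, 1.0, 1.0
t_final = projected_gradient(D, t0, target, w_over, w_under)
```

On this tiny, well-conditioned instance the iteration converges quickly; the slow zigzagging behavior reported in the text is a property of large, ill-conditioned instances and is not reproduced here.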

We implement our algorithm to solve this model and compare the results with the projected gradient algorithm applied to the equivalent model (15). At each iteration of the constraint generation algorithm, an oracle is called to return an outer approximation of the violated second-order cone constraints. This is obtained by computing the gradient of the constraint functions at the current µ-center. If no violated constraint is detected, the algorithm continues by updating the centering parameter. The oracle uses a random search to identify violated constraints.

We use the dose volume histogram (DVH) to illustrate our results. The purpose of a DVH is to summarize a three-dimensional dose distribution in a two-dimensional graph. This graph has been accepted as a tool for treatment plan evaluation. A DVH shows what percentage of a structure's volume receives a given amount of dose in the treatment. The "volume" referred to in DVH analysis can be a target of radiation treatment or a healthy tissue around it. A good treatment plan is one that spares all the healthy organs and ensures that 100% of the GTV receives 100% of the prescribed dose.

The following experiment is based on real patient data provided by the Department of Radiation Oncology at the Princess Margaret Hospital (PMH), Toronto, Ontario, Canada. The isocenters are randomly generated inside the gross tumor volume (GTV) and the planning target volume (PTV) separately. The PTV is typically an extended area around the GTV and contains tissues that are suspected to be tumorous. Figure 1 shows our starting guess. We select three isocenters within the PTV (diamonds) and seven isocenters in the GTV (circles), and initialize the delivery times of all sectors at all sizes to 1. The dashed vertical line in the DVH (the figure on the right) shows the prescription dose, which is 2Gy in this example. We illustrate the percentage of dose received by 10 structures in the brain. However, only structures that receive significant dose are visible in this figure. We use marked lines to differentiate the curves. The DVH in this figure shows that more than 70% of the brainstem receives more than the prescribed dose, which will kill the brainstem.

Figure 1: Initial treatment plan. [Panel titles: "Case= 107, Shots#= 10, layer: 90" and "SDO: Initial" (DVH; horizontal axis: Dose (Gy)). Legend: L_eye, R_eye, LLens, L_Optic_N, R_Optic_N, RLens, Brainstem, Chiasm, GTV, PTV.]

We first solved problem (15) using the projected gradient method and stopped the algorithm when the relative improvement of the objective value fell below 1E-8. Figure 2 shows the final dose distribution on the PTV, GTV and brainstem produced by the projected gradient algorithm. The solution time in this case is 47 minutes with an objective function value of 1.7769.

Figure 2: The treatment plan produced by the projected gradient algorithm. [Panel titles: "Case= 107, Shots#= 10, layer: 90" and "SDO: PROJ" (DVH; horizontal axis: Dose (Gy)).]

We then implemented our IPCG algorithm to solve the equivalent model (17). In this instance the dimension of the dual space is m = 250, the number of linear constraints is 240, and since we have 10 isocenters, there are 10 second-order cones. The dimensions of the second-order cones are 6084, 5741, 160, 1337, 163, 1590, 18301, 775, 7179, and 8358. Figure 3 shows the final dose distribution on the PTV, GTV and brainstem and the optimal treatment plan produced by our constraint generation algorithm. The solution time in this case is only 11 minutes with a more accurate objective value of 1.7316 (an improvement of approximately 3%).

Figure 3: The optimal treatment plan produced by IPCG algorithm. [Panel titles: "Case= 107, Shots#= 10, layer: 90" and "SDO: SILP" (DVH; horizontal axis: Dose (Gy)).]

We also solved (17) using SeDuMi. The coefficient matrix in SeDuMi is a sparse 49,678 by 49,929 matrix. After 30 iterations and 13 minutes, SeDuMi ran into numerical problems and stopped with this error message: "No sensible solution found". As a matter of fact, SeDuMi gives a solution which is not too far from optimal, but it cannot compute the optimal solution with high precision. We tested many other instances with different numbers of isocenters, but the result was the same. This might be due to the fact that the sector duration optimization problems are numerically hard in nature (personal communication with Polik (2009)).

The time advantage of the IPCG algorithm over the projected gradient algorithm is significant. We compared the two algorithms using 100 instances of 15 randomly placed isocenters, and measured the solution time in minutes. The result is illustrated in Figure 4. The vertical axis is the solution time and the horizontal axis shows the instances. The upper solid line graph illustrates the projected gradient algorithm and the lower dashed line graph shows our interior point constraint generation method. The mean solution time of the projected gradient algorithm is 155 minutes while the mean solution time of the IPCG algorithm is only 22 minutes. Consequently we may conclude that on average our algorithm is over 7 times faster than the projected gradient algorithm when solving these radiation therapy treatment planning problems. Additionally, our algorithm returns a more accurate solution, which is vital for this type of application.

Figure 4: The time comparison between the IPCG algorithm and the projected gradient algorithm on 100 instances with 15 isocenters. [Axes: solving time versus instance. Legend: PG, IPCG.]
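The DVH curves used to evaluate the treatment plans in this section can be computed from per-voxel doses in a few lines. The sketch below assumes the cumulative form of the DVH (percentage of voxels receiving at least a given dose) and equal voxel volumes; the dose values are hypothetical and the function name is ours.

```python
from typing import List


def dvh_curve(voxel_doses: List[float], dose_levels: List[float]) -> List[float]:
    """Percentage of a structure's voxels receiving at least each dose level.

    Assumes all voxels have equal volume, so volume fraction equals voxel fraction.
    """
    n = len(voxel_doses)
    return [100.0 * sum(1 for d in voxel_doses if d >= level) / n for level in dose_levels]


# Hypothetical doses (Gy) for the five voxels of one structure.
doses = [0.5, 1.0, 2.0, 2.5, 3.0]
levels = [0.0, 1.0, 2.0, 3.0, 4.0]
curve = dvh_curve(doses, levels)  # one percentage per dose level, non-increasing
```

Plotting `curve` against `levels` for each structure gives one line of the DVH; for a target, the ideal curve stays at 100% up to the prescription dose.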

7. More computational experience

In this section we present some additional computational results to illustrate the behavior of our interior point constraint generation algorithm. We test two sets of problems. First we show the convergence behavior of the algorithm on some classical SILO problems selected from Coope and Watson (1985), and then we show the power of our algorithm on a class of SOCO problems using randomly generated data. In the following examples we solve an optimization problem of the form

\[
\max \{\, b^T y : g(y, \omega) \le 0,\ \omega \in \Omega \,\},
\tag{18}
\]

where g(y, ω) is a linear function of y for a given ω in the compact set Ω. To solve this form of problem with our IPCG algorithm, we need to convert problem (18) to the form of problem (1). To do this, at each iteration an oracle is used to discretize Ω and identify multiple violated constraints using a random search. The violated constraints are then added to the relaxation problem as new constraints and the µ+-center is updated. At each iteration of the algorithm, therefore, we deal with a relaxation problem (2), and its corresponding primal problem (3), which is a restricted form of the primal of the original problem.

Example 1. Let $b = (-1, -1/2, -1/3)^T$, $g(y, \omega) = \tan(\omega) - \sum_{i=1}^{m} y_i \omega^{i-1}$, and $\Omega = [0, 1]$.
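For Example 1, a random-search oracle of the kind described above can be sketched as follows. The sample size, the cap on the number of returned cuts, and the choice to return the most violated constraints first are our assumptions; the text only states that a random search over Ω is used. Each returned pair (a, c) encodes the linear inequality a·y ≤ c obtained by fixing ω in g(y, ω) ≤ 0.

```python
import math
import random
from typing import List, Tuple


def g(y: List[float], omega: float) -> float:
    """Constraint function of Example 1: tan(omega) - sum_i y_i * omega**(i-1)."""
    return math.tan(omega) - sum(y_i * omega ** i for i, y_i in enumerate(y))


def random_search_oracle(y: List[float], n_samples: int = 200, max_cuts: int = 5,
                         tol: float = 1e-9, seed: int = 0) -> List[Tuple[List[float], float]]:
    """Sample Omega = [0, 1] and return up to max_cuts violated constraints a . y <= c."""
    rng = random.Random(seed)
    samples = [rng.uniform(0.0, 1.0) for _ in range(n_samples)]
    # Keep only sample points where the constraint is violated, most violated first.
    violated = sorted((w for w in samples if g(y, w) > tol),
                      key=lambda w: g(y, w), reverse=True)
    cuts = []
    for w in violated[:max_cuts]:
        a = [-(w ** i) for i in range(len(y))]  # g(y, w) <= 0  <=>  a . y <= -tan(w)
        cuts.append((a, -math.tan(w)))
    return cuts


cuts_at_zero = random_search_oracle([0.0, 0.0, 0.0])  # y = 0 violates g <= 0 for omega > 0
cuts_feasible = random_search_oracle([2.0, 0.0, 0.0])  # 2 >= tan(omega) on [0, 1]: no cuts
```

An empty list signals that no violated constraint was found among the samples, which in the algorithm triggers an update of the centering parameter rather than proving feasibility.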

Solving this problem using our IPCG algorithm yields the optimal solution y∗ = (0.089073, 0.423147, 1.0450756)T and the optimal objective value bT y∗ = −0.6490412. Figure 5 shows the convergence behavior of our algorithm for Example 1. In this figure we plot the objective values of the relaxed dual and restricted primal problems at the current µ-center (y-axis) in each iteration (x-axis). The upper (lower) curve comes from evaluating cT x (bT y), the objective function of the restricted primal (relaxed dual) problem, at the current µ-center. This figure illustrates that the IPCG algorithm quickly approaches the optimal value with a reasonable duality gap. Observe that a good approximation of the optimal solution is achieved in fewer than 40 iterations. However, to get a high precision ($10^{-8}$) solution, we let the algorithm run for 90 iterations. Notice that since the feasible region of problem (3) is also feasible for the primal of the original problem, cT x at the µ-center always gives an upper bound for the optimal objective value, that is, the upper curve never crosses the optimal value line. However, this is not true for the lower curve. This curve is obtained by evaluating bT y, the objective value of the dual problem, at a feasible point of the relaxation. This point is not necessarily feasible for the original problem, which is why the lower curve may cross the optimal value line in the early iterates.

Figure 5: Convergence behavior of IPCG algorithm on Example 1 to $10^{-8}$ precision.

Example 2. Let $b = (-1, -1/2, -1/2, -1/3, -1/4, -1/3)^T$, $g(y, \omega) = e^{\omega_1^2 + \omega_2^2} - (y_1 + \omega_1 y_2 + \omega_2 y_3 + \omega_1^2 y_4 + \omega_1 \omega_2 y_5 + \omega_2^2 y_6)$, and $\Omega = [0, 1] \times [0, 1]$. The optimal solution of this problem is y∗ = (2.5782999, −4.106585, −4.0981235, 4.2450596, 4.5222404, 4.2370932)T, and the optimal objective value is bT y∗ = −2.4338899. The convergence behavior of this problem is shown in Figure 6.

Figure 6 [Convergence plot for Example 2; horizontal axis: iteration.]

Example 3. Let $b = (-2, -4, -3)^T$, $g(y, \omega) = \sum_{i=1}^{3} (1 - y_i) h_i(\omega_1, \omega_2) - \frac{1}{2}$, $\Omega = [-1, 4] \times [-1, 4]$, and

\[
\begin{aligned}
h_1(\omega_1, \omega_2) &= \frac{1}{\omega_1} \exp\!\left( -\frac{1}{\omega_1} \left( 1 + (\omega_2 - 1)^2 \right) \right), && \omega_1 > 0, \\
h_2(\omega_1, \omega_2) &= \frac{1}{\omega_1} \exp\!\left( -\frac{1}{\omega_1} \left( 2 + \omega_2^2 / 4 \right) \right), && \omega_1 > 0, \\
h_3(\omega_1, \omega_2) &= \frac{1}{\omega_1 - 2} \exp\!\left( -\frac{1}{\omega_1 - 2} \left( 1 + (\omega_2 + 1)^2 \right) \right), && \omega_1 > 2, \\
h_{1,2,3}(\omega_1, \omega_2) &= 0 && \text{elsewhere}.
\end{aligned}
\]


Figure 7 [Panel title: "Optimal solution upper and lower bound in each iteration". Legend: lower bound b'y, upper bound c'x. Horizontal axis: iteration.]

The optimal solution for this problem is y ∗ = (1.5425641, −2.1014821, 0.9345579)T and the optimal objective value is bT y ∗ = 4.3862422. The convergence behavior of this problem is illustrated in Figure 7. Notice that in Figures 5–7 the lower and upper curves are not monotonically approaching each other when the current iterate is far from the optimal solution. This phenomenon is due to the fact that these bounds are computed by evaluating the objective functions of the relaxed dual problem and its corresponding primal problem at the current µ−center. When violated constraints are identified, the feasible region of the relaxed dual is updated by adding new constraints. Since this problem is not solved to optimality, but evaluated at the µ+ −center, the value of the objective function is unpredictable at early stages of the algorithm. However, as we get closer to the optimal solution of the original problem, the fluctuations decrease and the lower and upper curves become lower and upper bounds on the optimal objective value, and finally they approach the optimum value monotonically.



As the second, large-scale test set, we implement our algorithm to solve SOCO problems using randomly generated data. Consider the following SOCO problem:

\[
\max_{y \in \mathbb{R}^m} \{\, b^T y : l_b \le y \le u_b,\ (c_j - A_j^T y) \in L^{n_j},\ j = 1, 2, \dots, k \,\},
\tag{19}
\]

where b is a non-zero vector in $\mathbb{R}^m$, $l_b < u_b$ are real vectors indicating lower and upper bounds on y respectively, and $L^n$ is an n-dimensional Lorentz cone, defined by

\[
L^n = \left\{ s \in \mathbb{R}^n : \sqrt{s_2^2 + \cdots + s_n^2} \le s_1 \right\}.
\tag{20}
\]
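Definition (20) and the constraints of problem (19) translate directly into a feasibility check. The sketch below is illustrative only: the tolerance, the row-wise storage of each A_j (m rows, n_j columns), and the small instance at the end are assumptions made for the example.

```python
import math
from typing import List, Sequence, Tuple

Matrix = List[List[float]]  # A_j stored with m rows and n_j columns


def in_lorentz_cone(s: Sequence[float], tol: float = 1e-12) -> bool:
    """Membership test for L^n as in (20): ||(s_2, ..., s_n)|| <= s_1."""
    return math.sqrt(sum(x * x for x in s[1:])) <= s[0] + tol


def is_feasible(y: Sequence[float], lb: Sequence[float], ub: Sequence[float],
                cones: List[Tuple[Matrix, List[float]]], tol: float = 1e-12) -> bool:
    """Check the bound and conic constraints of problem (19) at the point y."""
    if any(v < lo - tol or v > hi + tol for v, lo, hi in zip(y, lb, ub)):
        return False
    for A, c in cones:
        # s = c_j - A_j^T y
        s = [c[i] - sum(A[k][i] * y[k] for k in range(len(y))) for i in range(len(c))]
        if not in_lorentz_cone(s, tol):
            return False
    return True


# Hypothetical instance: m = 2 variables, one cone of dimension 3.
A1 = [[1.0, 0.0, 0.0],
      [0.0, 1.0, 0.0]]
c1 = [2.0, 0.0, 0.0]
lb, ub = [-2.0, -2.0], [2.0, 2.0]
cones = [(A1, c1)]
```

For instance, `is_feasible([1.0, 0.0], lb, ub, cones)` holds because the slack is (1, 0, 0), while `[1.0, 1.5]` gives the slack (1, −1.5, 0), which lies outside the cone.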

The bound constraints $l_b \le y \le u_b$ are added to ensure a bounded feasible region and a finite optimal objective value. We use the MATLAB function randn.m to generate data for matrices $A_j$ and vectors $c_j$ from a normal distribution with mean zero and standard deviation one. We let

\[
c_{j1} = 2 \sqrt{c_{j2}^2 + \cdots + c_{jn_j}^2}, \qquad j = 1, \dots, k,
\]

to ensure feasibility. At each iteration of the constraint generation algorithm an oracle is called to return an outer approximation of the violated second-order cone constraints. This is obtained by computing the gradient of the constraint functions at the current µ-center. If no violated constraint is detected, the algorithm continues by updating the centering parameter. Our oracle uses a random search for identifying violated constraints. This technique works well when the number of cones (k) is relatively small. Note that a more efficient technique is needed to detect violated constraints for problems with a large number of conic constraints.

Tables 1 and 2 show the numerical results of this experiment. Each row shows a different random problem with characteristics given in the first two columns: k, the number of second-order cone constraints in problem (19), and n̄, the size of each cone. The column under "cuts" reports the number of gradient inequalities that must be added before the optimal solution is reached. The next pair of columns in Table 1 compares the optimal objective values obtained by solving the SOCO problem with our IPCG algorithm and with SeDuMi. The corresponding columns in Table 2 report the duality gap at the final solution. The CPU times needed to achieve these values by the IPCG algorithm and SeDuMi, rounded to the nearest integer in seconds, are reported in the last two columns of these tables. For our algorithm we report IPCG/Oracle, that is, the CPU time of the whole algorithm and the CPU time of the oracle.

Table 1: CPU time comparison of SeDuMi and our IPCG approach implemented to solve randomly generated SOCO problems with m = 3 and different values of n̄ and k.

                         Optimum value            CPU time (sec)
    k       n̄      cuts  IPCG        SeDuMi       IPCG/Oracle  SeDuMi
    3       1E+6   30    2.9913802   2.9913802    19/18        99
    9       5E+5   31    2.9878005   2.9878005    26/24        260
    27      1E+5   30    2.9751433   2.9751433    11/10        105
    81      5E+4   31    2.9492360   2.9492360    14/13        147
    243     1E+4   26    2.8790904   2.8790904    10/9         86
    729     5E+3   32    2.8021491   2.8021491    13/11        135
    2187    1E+3   31    2.5023178   2.5023178    11/10        158
    6561    5E+2   28    2.3104156   2.3104156    31/27        248
    19683   1E+2   33    1.7423363   1.7423363    110/101      115
    59049   5E+1   29    1.2555162   1.2555162    861/854      1120

Table 2: CPU time comparison of SeDuMi and our IPCG approach implemented to solve randomly generated SOCO problems with m = 30 and different values of n̄ and k.

                         gap                      CPU time (sec)
    k       n̄      cuts  IPCG        SeDuMi       IPCG/Oracle  SeDuMi
    2       1E+6   44    5.74E-03    -            139/134      -
    4       5E+5   44    5.94E-03    -            91/87        -
    8       1E+5   46    6.05E-03    6.93E-03     25/19        154
    16      5E+4   44    5.72E-03    8.34E-03     14/7         166
    32      1E+4   46    5.83E-03    2.06E-03     10/4         87
    64      5E+3   49    6.13E-03    8.25E-03     9/3          72
    128     1E+3   49    5.95E-03    4.06E-03     9/3          18
    256     5E+2   53    6.17E-03    3.50E-03     7/2          15
    512     1E+2   59    6.26E-03    1.71E-03     6/1          8
    1024    5E+1   61    6.59E-03    2.45E-03     5/1          5
    2048    1E+1   63    6.58E-03    1.85E-03     5/1          2

A close study of these results reveals that our IPCG algorithm outperforms the classical interior point methods on problems with a large number of conic constraints of large size when m, the dimension of y, is relatively small. Except for the last two instances of Table 2, our algorithm outperforms SeDuMi in terms of CPU time. However, it should be mentioned that primal-dual interior point methods, and in particular SeDuMi, are superior to our interior point constraint generation algorithm for problems with small to moderate values of k, n, and m.

Table 1 reveals interesting information. When m is small, a duality gap of $10^{-8}$ is achieved quickly in all of the test problems. This is not typical behavior for cutting plane methods, which are known to have difficulties near the optimal solution (see Oskoorouchi and Goffin (2007) and Oskoorouchi and Mitchell (2008)). As m increases, the algorithm returns to its traditional performance. This is the reason that in Table 2 we run the test problems to only three digits of accuracy. In this table, we show that our algorithm can reach an approximate solution with reasonable precision faster than SeDuMi.

A disadvantage of our algorithm is that it requires the value of m to be relatively small. When the dimension of this space is large, the IPCG algorithm must add too many constraints before the desired accuracy is reached. Also, the IPCG/Oracle data in Table 2 show that a substantial portion of the CPU time is consumed by the oracle in the random search. Clearly, a more efficient search could reduce this time and consequently enhance the performance of our algorithm.
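The data generation and the gradient-based outer approximation described in this section can be sketched for a single cone as follows. This is an illustration under stated assumptions rather than the authors' code: Python's `random.gauss` stands in for MATLAB's randn, each A_j is stored with m rows and n_j columns, and the function names are ours. For the convex constraint function f(y) = ||s̄(y)|| − s_1(y) with s(y) = c − A^T y, linearizing f at a violated point ŷ gives an inequality a·y ≤ rhs that every feasible y satisfies and that ŷ violates.

```python
import math
import random
from typing import List, Optional, Tuple

Matrix = List[List[float]]  # m rows (one per coordinate of y), n columns (cone dimension)


def random_cone(m: int, n: int, seed: int = 0) -> Tuple[Matrix, List[float]]:
    """Standard normal A and c, with c_1 = 2 * ||(c_2, ..., c_n)|| so that y = 0 is feasible."""
    rng = random.Random(seed)
    A = [[rng.gauss(0.0, 1.0) for _ in range(n)] for _ in range(m)]
    c = [rng.gauss(0.0, 1.0) for _ in range(n)]
    c[0] = 2.0 * math.sqrt(sum(x * x for x in c[1:]))
    return A, c


def slack(A: Matrix, c: List[float], y: List[float]) -> List[float]:
    """s = c - A^T y."""
    return [c[i] - sum(A[k][i] * y[k] for k in range(len(y))) for i in range(len(c))]


def gradient_cut(A: Matrix, c: List[float],
                 y_hat: List[float]) -> Optional[Tuple[List[float], float]]:
    """Return (a, rhs) with a . y <= rhs violated at y_hat, or None if y_hat satisfies the cone."""
    s = slack(A, c, y_hat)
    tail_norm = math.sqrt(sum(x * x for x in s[1:]))
    if tail_norm <= s[0]:
        return None  # constraint holds at y_hat, nothing to cut
    # Unit vector along the tail of the slack; a zero vector is a valid subgradient choice
    # in the degenerate case where the tail vanishes.
    u = [x / tail_norm for x in s[1:]] if tail_norm > 0.0 else [0.0] * (len(s) - 1)
    # Linearization of ||tail|| - s_1 at y_hat, rearranged as a . y <= rhs.
    a = [row[0] - sum(r * u_i for r, u_i in zip(row[1:], u)) for row in A]
    rhs = c[0] - sum(c_i * u_i for c_i, u_i in zip(c[1:], u))
    return a, rhs


A, c = random_cone(m=3, n=5, seed=1)
origin = [0.0, 0.0, 0.0]
# Build a violated point by moving along the first column of A until s_1 is about -1.
first_col = [row[0] for row in A]
scale = (c[0] + 1.0) / sum(x * x for x in first_col)
y_hat = [scale * x for x in first_col]
cut = gradient_cut(A, c, y_hat)
```

Because the cut is a plain linear inequality in y, it can be appended to the relaxation in the same way as any other generated constraint.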

8. Conclusions and future research

We presented an interior point constraint generation algorithm for semi-infinite linear programming and showed that the algorithm converges to an ε-solution after a finite number of iterations. We derived two theoretical complexity results. The number of Newton steps needed to get close to the updated µ-center after adding p new violated constraints is bounded by O(p log(p + 1)), and the overall algorithm stops with an ε-solution to the SILO problem after adding at most

\[
O\!\left( \frac{m^2 \hat{p}^2}{\delta^2}\, e^{3\sqrt{m}/\varepsilon} \right)
\]

constraints, where δ is the radius of the largest full-dimensional ball contained in $F_\Omega$, and $\hat{p}$ is the maximum number of constraints added simultaneously.

We implemented our algorithm to solve the sector duration optimization model of the Gamma Knife® Perfexion™ treatment planning problem and showed that our algorithm can efficiently handle real-life problems in healthcare applications. We also illustrated the convergence behavior of our algorithm on some classical SILO problems and reported numerical results by implementing it on SOCO problems. We showed that our IPCG method outperforms classical primal-dual interior point methods on problems with a large number of conic constraints of large size when m, the dimension of y, is small.

There are a number of directions we can take to continue this research in the future. The sector duration optimization model is one component of Perfexion™ treatment planning. Another component of treatment planning is isocenter optimization, which determines the optimal locations of the isocenters. As an immediate extension, our algorithm can be modified to solve the isocenter optimization model. This work is currently in progress.

The algorithm that we described in this paper has the potential to be combined with branch-and-cut algorithms and to be implemented to solve mixed integer conic programming problems. An efficient technique for problems of this kind is the use of outer approximation of the second-order cone constraints. See, for instance, Bonami et al. (2008) and Abhishek et al. (2008). The main reason to use polyhedral approximation is the opportunity to have a warm start in the branch-and-bound algorithm after adding an integer cut. Ben-Tal and Nemirovski (2001) develop a polyhedral approximation for second-order cone optimization that is used by Vielma et al. (2008) in their approach to mixed integer conic programming. The advantage of this approximation is that it is computed once and used at every relaxation node. However, this approximation, although tight, yields an LP with a large number of constraints and variables. For example, the polyhedral approximation of a single second-order cone of dimension 4 creates an LP with over 10,000 variables and 22,000 constraints (with high precision). This could be costly for ILOG CPLEX, especially when the number of cones and their dimensions are large.

We see a potential advantage of our algorithm in solving a class of mixed integer conic programming problems. At each node of the branch-and-cut algorithm, a conic optimization relaxation is formed. In this paper we showed, using an outer approximation, that we can efficiently solve a class of SOCO problems. Given that the algorithm works by gradually generating constraints, an integer cut could be treated as a newly added constraint in the solution process of our algorithm. This could potentially be an efficient process within the framework of branch-and-cut algorithms. We intend to explore these avenues in our future research.

Acknowledgment: The authors would like to thank Professor Imre Polik for his help with SeDuMi, and the anonymous reviewers for their suggestions and comments on earlier versions of this paper, which substantially improved its content.

References

Abhishek, K., S. Leyffer, J.T. Linderoth. 2008. FilMINT: an outer-approximation-based solver for nonlinear mixed integer programs. Preprint ANL/MCS-P1374-0906, Argonne National Laboratory, Mathematics and Computer Science Division.
Aleman, D.M., A. Kumar, R.K. Ahuja, H.E. Romeijn, J.F. Dempsey. 2008. Neighborhood search approaches to beam orientation optimization in intensity modulated radiation therapy treatment planning. J Glob Optim 42 578–586.
Ben-Tal, A., A. Nemirovski. 2001. On polyhedral approximations of the second-order cone. Mathematics of Operations Research 26 193–205.
Bonami, P., L.T. Biegler, A.R. Conn, G. Cornuejols, I.E. Grossmann, C.D. Laird, J. Lee, A. Lodi, F. Margot, N. Sawaya, A. Wachter. 2008. An algorithmic framework for convex mixed integer nonlinear programs. Discrete Optimization to appear.
Borchers, B. 1999. CSDP, a C library for semidefinite programming. Optimization Methods and Software 11 613–623.
Burer, S., R.D.C. Monteiro. 2003. A nonlinear programming algorithm for solving semidefinite programs via low-rank factorization. Mathematical Programming 95(2) 329–357.
Coope, I.D., G.A. Watson. 1985. A projected Lagrangian algorithm for semi-infinite programming. Mathematical Programming 32 337–356.
den Hertog, D. 1994. Interior Point Approach to Linear, Quadratic and Convex Programming. Kluwer Academic Publishers, Dordrecht, Boston, London.
Ferris, M.C., J. Lim, D.M. Shepard. 2003. An optimization approach for treatment planning on a gamma knife. SIAM Journal on Optimization 13 921–937.
Ferris, M.C., A.B. Philpott. 1989. An interior point algorithm for semi-definite programming. Mathematical Programming 43 257–276.
Ganz, J.C. 1997. Gamma Knife Surgery. Springer-Verlag, Wien, Austria.
Ghaffari, H.R. 2009. Optimization models on Leksell Gamma Knife® Perfexion™. Ph.D. Proposal, University of Toronto.
Ghaffari, H.R., D.M. Aleman, M. Ruscin, D. Jaffry. 2009. Optimization models in Gamma Knife® Perfexion™ radiotherapy treatment planning. In progress.
Goberna, M.A., M.A. Lopez. 2002. Linear semi-infinite programming theory: an updated survey. European Journal of Operational Research 143 390–405.
Goffin, J.-L., J.-P. Vial. 1999. Shallow, deep and very deep cuts in the analytic center cutting plane method. Mathematical Programming 84 89–103.
Goffin, J.-L., J.-P. Vial. 2000. Multiple cuts in the analytic center cutting plane methods. SIAM Journal on Optimization 11 266–288.
Li, S.J., S.Y. Wu, X.Q. Yang, K.L. Teo. 2006. A relaxed cutting plane method for semi-definite programming. Computational Optimization and Applications 196(2) 459–473.
Lim, G.J., E.K. Lee. 2008. Optimization in medicine and biology. Taylor & Francis Group, LLC, 6000 Broken Sound Parkway NW, Suite 300, Boca Raton, FL 33487-2742.
Lopez, M., G. Still. 2007. Semi-infinite programming. European Journal of Operational Research 180 491–518.
Luo, Z.-Q., C. Roos, T. Terlaky. 1999. Complexity analysis of a logarithmic barrier decomposition method for semi-infinite linear programming. Applied Numerical Mathematics 29 379–394.
Oskoorouchi, M.R., J.L. Goffin. 2007. A matrix generation approach for eigenvalue optimization. Mathematical Programming 109 155–179.
Oskoorouchi, M.R., J.E. Mitchell. 2008. A second-order cone cutting surface method: complexity and applications. Computational Optimization and Applications to appear.
Polik, I. 2009. Personal communication. COR@L Computational Research At Lehigh.
Roos, C., T. Terlaky, J.-P. Vial. 1997. Theory and Algorithms for Linear Optimization. John Wiley & Sons Ltd., Baffins Lane, Chichester, England.
SeDuMi. 2003. SeDuMi. http://sedumi.ie.lehigh.edu/.
Sturm, J.F. 1999. Using SeDuMi 1.02, a MATLAB toolbox for optimization over symmetric cones. Optimization Methods and Software 625–653.
Toh, K.C., M.J. Todd, R.H. Tutuncu. 1999. SDPT3: a MATLAB software package for semidefinite programming, version 2.1. Optimization Methods and Software 11 545–581.
Vielma, J.P., S. Ahmed, G.L. Nemhauser. 2008. A lifted linear programming branch-and-bound algorithm for mixed integer conic quadratic programs. INFORMS Journal of Computing 438–450.
Wu, S.Y., S.C. Fang, C.J. Lin. 1998. Relaxed cutting plane method for solving linear semi-definite programming problems. Computational Optimization and Applications 99 759–779.
Ye, Y. 1997a. Complexity analysis of the analytic center cutting plane method that uses multiple cuts. Mathematical Programming 78 85–104.
Ye, Y. 1997b. Interior Point Algorithms, Theory and Analysis. John Wiley Inc., New York, NY.
