Scheduling Parallel Dedicated Machines with the Speeding-Up Resource

Hans Kellerer (1), Vitaly A. Strusevich (2)

(1) Institut für Statistik und Operations Research, Universität Graz, Universitätsstraße 15, Graz A-8010, Austria

(2) School of Computing and Mathematical Sciences, University of Greenwich, Old Royal Naval College, Park Row, London SE10 9LS, United Kingdom

Received 7 November 2006; revised 29 February 2008; accepted 8 March 2008 DOI 10.1002/nav.20292 Published online 23 April 2008 in Wiley InterScience (www.interscience.wiley.com).

Abstract: We consider a problem of scheduling jobs on m parallel machines. The machines are dedicated, i.e., for each job the processing machine is known in advance. We mainly concentrate on the model in which at any time there is one unit of an additional resource. Any job may be assigned the resource and this reduces its processing time. A job that is given the resource uses it at each time of its processing. No two jobs are allowed to use the resource simultaneously. The objective is to minimize the makespan. We prove that the two-machine problem is NP-hard in the ordinary sense, describe a pseudopolynomial dynamic programming algorithm and convert it into an FPTAS. For the problem with an arbitrary number of machines we present an algorithm with a worst-case ratio close to 3/2, and close to 3, if a job can be given several units of the resource. For the problem with a fixed number of machines we give a PTAS. Virtually all algorithms rely on a certain variant of the linear knapsack problem (maximization, minimization, multiple-choice, bicriteria). © 2008 Wiley Periodicals, Inc. Naval Research Logistics 55: 377–389, 2008

Keywords: scheduling; parallel dedicated machines; resource constraints; complexity; approximation

1. INTRODUCTION

In this article we consider the problem of scheduling jobs on parallel dedicated machines, provided that the processing of jobs can be sped up by allocating an additional resource. We are given a set $N = \{1, 2, \ldots, n\}$ of jobs and $m$ processing machines $M_1, M_2, \ldots, M_m$. The machines are parallel and dedicated. Each job has to be processed on exactly one machine, and the set $N$ of jobs is partitioned in advance into $m$ subsets $N_1, N_2, \ldots, N_m$, so that the jobs of set $N_i$, and only these, are processed on machine $M_i$, $1 \le i \le m$. The processing time of job $j$ is equal to $p_j \ge 0$ time units. No machine processes more than one job at a time and preemption is not allowed. For all problems considered in this article, the goal is to find a schedule that minimizes the makespan, i.e., the maximum completion time. For a schedule $S$, let the makespan be denoted by $C_{\max}(S)$. A schedule with the smallest makespan is called optimal and is denoted by $S^*$.

In the basic model studied in this article, it is assumed that an additional renewable speeding-up resource can be allocated to a job. There are $\sigma \ge 1$ units of the resource available at any time. If a job $j$ is not given the resource, its processing time remains equal to $p_j$; otherwise, the resource will speed up the processing. A job that is not given the resource is called a nonresource job; otherwise, it is called a resource job.

In this article we mainly concentrate on the simplest scenario of resource consumption, which we call binary. We assume that exactly one unit of the resource is available at any time, i.e., $\sigma = 1$. Each job $j \in N$ is associated with a value $\pi_j \le p_j$. If a job $j$ is given the resource, its processing time becomes $p_j - \pi_j$ and exactly one unit of the resource is required at any time of this processing. No two resource jobs can be processed simultaneously. Unless stated otherwise, all time parameters $p_j$ and $\pi_j$ are assumed to be integer.

To help the reader grasp the main features of our model, we provide a small example.

EXAMPLE 1: Consider the problem of processing five jobs on three parallel dedicated machines $M_1$, $M_2$, and $M_3$. Machine $M_1$ has to process only job 1, machine $M_2$ processes jobs 2 and 3, while machine $M_3$ processes jobs 4 and 5; i.e., $N_1 = \{1\}$, $N_2 = \{2, 3\}$ and $N_3 = \{4, 5\}$. The values of the processing parameters $p_j$ and $\pi_j$ are given in Table 1. Later in this article we use this example for the purpose of numerical illustrations. Here we give two meaningful interpretations of the example presented in Table 1.

Correspondence to: V.A. Strusevich ([email protected] ac.uk)

Table 1. Data for Example 1.

          N1     N2          N3
j         1      2     3     4     5
p_j       5      3     5     3     5
π_j       3      1     4     1     4

Human Resource Management. Three teams $M_1$, $M_2$, and $M_3$ have to implement five projects. Team $M_i$ is responsible for set $N_i$ of projects, $i = 1, 2, 3$. The senior management may allocate an extra employee, currently not a member of any of the teams, to take part in any project. If project $j$ is done by the existing workforce of the corresponding team, it will take $p_j$ weeks. If the additional employee is assigned to project $j$, this will reduce the duration of the project by $\pi_j$ weeks. The extra employee remains assigned to the project until it is completed and cannot be assigned to take part in more than one project at a time. The purpose is to complete all five projects as early as possible.

Power-Aware Scheduling. A computing device consists of three parallel processors $M_1$, $M_2$, and $M_3$ and has to run five tasks. The tasks of set $N_i$ are assigned to run on processor $M_i$, $i = 1, 2, 3$. If task $j$ is run on the corresponding processor at a standard speed, it will take $p_j$ seconds. It is possible to increase the speed of any processor, so that task $j$ will require $\pi_j$ seconds less. In order not to overheat the device, only one processor at a time can be sped up. The goal is to finish all five tasks as early as possible.

See Fig. 1 for a schedule that is optimal for the instance presented in Example 1. The resource jobs are hatched with vertical lines.

If the resource allocations are known in advance, then all processing times are also known, and we refer to this kind of resource as the renewable static resource. The study of scheduling problems with this type of resource constraint was initiated by Błażewicz et al. [3]; see also [1] and [2] for the most recent reviews of research in this area. The problems of minimizing the makespan on parallel dedicated machines with static renewable resources have been studied in [11, 12].
A fairly complete computational classification of relevant problems has been obtained and a number of approximation algorithms have been designed and analyzed. In the case of a single static resource, the problems studied in [11, 12] are denoted by PDm|res1σρ|Cmax (if the number of machines is fixed and equal to m) and by PD|res1σρ|Cmax (if the number of machines is arbitrary, i.e., part of the problem input). Here "PD" stands for "parallel dedicated machines," while "res1σρ" implies that there is a single resource, the size of the resource does not exceed σ, each job is allocated no more than ρ units of the resource and at any time the total amount of the allocated resource is at most σ. In particular, PDm|res111|Cmax denotes the problem of minimizing the makespan on m parallel dedicated machines, provided that some jobs are known to require one unit of the additional resource at any time of their processing.

The main problem studied in this article will be denoted by PDm|res111, Bi|Cmax, where we write "Bi" in the middle field to stress that the resource is speeding-up and the binary scenario of its consumption is applied. If the number of machines is arbitrary, the problem is denoted by PD|res111, Bi|Cmax.

In the scheduling literature, there are at least two general models with a nonrenewable speeding-up resource. A feature common to both models is that a single resource has to be divided between the jobs in advance, and the processing time of each job that receives the resource is reduced, depending on how many units of the resource are allocated. This situation is typical if the resource to be divided represents money or energy. These models have numerous applications in manufacturing, supply chain management, imprecise computing and other areas.

The first model with the nonrenewable speeding-up resource is related to scheduling with controllable processing times; see [13] for a review. Formally, for each job $j \in N$ we are given the "standard" value of processing time $\bar{p}_j$ that can be compressed to the minimum value $\underline{p}_j$, where $\underline{p}_j \le \bar{p}_j$. Crashing $\bar{p}_j$ to some actual processing time $p_j$, $\underline{p}_j \le p_j \le \bar{p}_j$, may decrease job completion time(s) but incurs additional cost $\alpha_j x_j$, where $x_j = \bar{p}_j - p_j$ is the compression amount of job $j$ and $\alpha_j$ is a given unit compression cost. A number of authors, see, e.g., [9], argue that the compression is achieved due to the additional resource allocated to a job. Usually, this problem area deals with the trade-off between the improved quality of the obtained schedule and the cost of the used resource, normally represented by a linear function $\sum_{j \in N} \alpha_j x_j$.
Figure 1. An optimal schedule for Example 1.

Another model defines the processing time of a job $j$ to which $u_j$ units of the resource are allocated as $p_j = (a_j/u_j)^k$, where $a_j$ is a known job-dependent constant, while $k$ is a positive constant. Here the main issue is to find a resource-feasible schedule, i.e., one with $\sum_{j \in N} u_j \le \sigma$, that minimizes a certain function that measures the quality of a schedule. This scheduling model has been studied in a series of papers; here we only refer to [15], which is the earliest paper we are aware of, and [16], the most recent one.

Returning to problem PDm|res111, Bi|Cmax with a single renewable speeding-up resource, notice that some preliminary results on this problem are reported in [5]. In particular, problem PD2|res111, Bi|Cmax is shown to be NP-hard in the ordinary sense; we give an alternative proof in Section 2.1.

The fact that the problem under consideration is NP-hard stimulates the search for approximation algorithms that deliver solutions fairly close to the optimum. Recall some relevant definitions. A polynomial-time algorithm that creates a schedule with the makespan that is at most $\rho \ge 1$ times the optimal value is called a ρ-approximation algorithm; the value of ρ is called a worst-case ratio bound. A family of algorithms is called a polynomial-time approximation scheme, or a PTAS, if for a given $\varepsilon > 0$ it contains an algorithm that has the running time that is polynomial in the length of the problem input and delivers a worst-case ratio bound of $1 + \varepsilon$. If additionally the running time is polynomial in $1/\varepsilon$, a PTAS is called a fully polynomial-time approximation scheme (FPTAS).

While most of the presented results concern problem PD2|res111, Bi|Cmax, in this article we also address a more general problem, in which there are $\sigma \ge 1$ units of the renewable resource available at a time. We are given a matrix with the nonnegative elements $p_{j\tau}$, where $j \in N$ and $0 \le \tau \le \sigma$. If job $j$ is allocated an integer number $\tau$, $0 \le \tau \le \sigma$, of units of the resource at any time of its processing, then its actual processing time is equal to $p_{j\tau}$. Jobs can be scheduled to run in parallel only if the total number of resource units allocated to them is at most σ.
We denote this problem by PDm|res1σ σ , Int|Cmax , where “Int” stands for the integer scenario of resource allocation. For the speeding-up resource, the values pj τ are non-increasing in τ for each job j ; however, in general we do not have to make this assumption. A special case of problem PDm|res1σ σ , Int|Cmax is studied in [7], where it is assumed that the actual processing time of job j that is given τ units of the resource depends linearly on τ , i.e., pj τ = pj − τ πj , where pj and πj have the same meaning as for problem PDm|res111, Bi|Cmax . This problem can be denoted by PDm|res1σ σ , Lin|Cmax , where “Lin” stresses that the actual processing times depend linearly on the number of units of the speeding-up resource allocated to the job. Notice that in the case of σ = 1, both scenarios, integer and linear, coincide and become the binary scenario. For problem PDm|res1σ σ , Lin|Cmax , a (3 + ε)-approximation algorithm is presented in [7]. A similar speeding-up scenario applied to other scheduling models, e.g., unrelated parallel machines, is considered in [6].
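To make the resource constraint concrete, the following minimal Python sketch checks a given schedule for feasibility under the linear scenario (which coincides with the binary scenario for σ = 1). The job data are those of Table 1; the function names, the `PLAN` layout (job mapped to start time and number of resource units) and the sample schedule are illustrative assumptions. The sample schedule is hand-made to be feasible; it is not claimed to be optimal or to coincide with the schedule of Fig. 1.

```python
from typing import Dict, List, Tuple

# Data of Example 1 (Table 1): job -> (p_j, pi_j)
JOBS: Dict[int, Tuple[int, int]] = {1: (5, 3), 2: (3, 1), 3: (5, 4), 4: (3, 1), 5: (5, 4)}
# Dedicated machines: machine index -> jobs it must process
MACHINES: Dict[int, List[int]] = {1: [1], 2: [2, 3], 3: [4, 5]}


def duration(job: int, units: int) -> int:
    """Linear scenario: p_j - tau * pi_j (binary scenario when units is 0 or 1)."""
    p, pi = JOBS[job]
    return p - units * pi


def check_schedule(plan: Dict[int, Tuple[int, int]], sigma: int = 1) -> int:
    """plan maps job -> (start time, resource units); returns the makespan.

    Raises ValueError if a machine runs two jobs at once or if more than
    sigma resource units are in use at some moment.
    """
    ends = {j: s + duration(j, u) for j, (s, u) in plan.items()}
    for jobs in MACHINES.values():
        order = sorted(jobs, key=lambda j: plan[j][0])
        for prev, nxt in zip(order, order[1:]):
            if ends[prev] > plan[nxt][0]:
                raise ValueError(f"jobs {prev} and {nxt} overlap on one machine")
    # Resource usage can only increase at a job start, so checking starts suffices.
    for t in sorted(s for s, _ in plan.values()):
        used = sum(u for j, (s, u) in plan.items() if s <= t < ends[j])
        if used > sigma:
            raise ValueError(f"{used} resource units in use at time {t}")
    return max(ends.values())


# Hypothetical feasible schedule: jobs 3 and 5 receive the resource.
PLAN = {1: (0, 0), 3: (0, 1), 2: (1, 0), 4: (0, 0), 5: (3, 1)}
print(check_schedule(PLAN))  # prints 5
```

Giving the resource to job 1 at time zero as well would make jobs 1 and 3 use it simultaneously, and the checker rejects such a plan.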


The remainder of this article is organized as follows. Section 2 addresses the two-machine version of problem PDm|res111, Bi|Cmax. We show that problem PD2|res111, Bi|Cmax is NP-hard, offer a pseudopolynomial-time dynamic programming algorithm and convert it into an FPTAS. In Section 3 we present an approximation algorithm for problem PD|res111, Bi|Cmax with an arbitrary number of machines that delivers a worst-case ratio close to 3/2. We also show how to extend our approach to handle problem PDm|res1σσ, Int|Cmax and give a (3 + ε)-approximation algorithm that does not involve the approximation schemes for quadratic programming problems employed in [7] for a less general problem PDm|res1σσ, Lin|Cmax. In Section 4 we describe a PTAS for problem PDm|res111, Bi|Cmax with a fixed number of machines. Section 5 contains some concluding remarks.

In our reasoning and design of the algorithms we often use the linear knapsack problem, in both minimization and maximization settings, as well as the multiple-choice knapsack problem and the bicriteria knapsack problem. Notice that each of these problems admits an FPTAS; see [10] for details. Additionally, the algorithms that we describe here for the problems with the speeding-up resource rely on the algorithms from [11] developed for the counterparts with the static resource; the latter algorithms are not presented here in full detail, and we briefly review the results from [11] in relevant sections of this article.

2. TWO MACHINES: COMPLEXITY AND FPTAS

In this section, we study problem PD2|res111, Bi|Cmax. Recall that if the resource is static, i.e., is allocated in advance, problem PD2|res111|Cmax is solvable in O(n) time; on the other hand, problem PDm|res111|Cmax is NP-hard for any fixed m ≥ 3, while problem PD|res111|Cmax with an arbitrary number of machines is NP-hard in the strong sense; see [11]. Notice also that the reduction used in [11] to prove strong NP-hardness of problem PD|res111|Cmax is straightforwardly extendable to problem PD|res111, Bi|Cmax.

For the two-machine case of problem PDm|res111, Bi|Cmax we define

\[ a_j = p_j, \quad \alpha_j = \pi_j, \quad j \in N_1; \qquad b_k = p_k, \quad \beta_k = \pi_k, \quad k \in N_2. \]

2.1. Complexity

To resolve the complexity issue of problem PDm|res111, Bi|Cmax we only need to establish the status of its version with two machines. In this subsection, we show that problem PD2|res111, Bi|Cmax is NP-hard in the ordinary sense. In the proof of NP-hardness the following well-known NP-complete problem is used for reduction.

Partition: Given $t$ integers $e_j$ such that $\sum_{j=1}^{t} e_j = 2E$, does there exist a partition of the index set $T = \{1, 2, \ldots, t\}$ into two subsets $T_1$ and $T_2$ such that $\sum_{j \in T_1} e_j = \sum_{j \in T_2} e_j = E$?

For a nonempty set $Q \subseteq T$, define $e(Q) = \sum_{j \in Q} e_j$; additionally define $e(\emptyset) = 0$.

The theorem below holds not only for the speeding-up model that is of primary concern in this article, but also for an alternative model in which the actual processing times are derived by reducing the original times by the same factor.

THEOREM 1: Problem PD2|res111, Bi|Cmax is NP-hard even if there exists a $\lambda$, $0 < \lambda < 1$, such that $\alpha_j = \lambda a_j$ for all $j \in N_1$ and $\beta_j = \lambda b_j$ for all $j \in N_2$.

PROOF: Take an arbitrary $\gamma$, $0 < \gamma < 1$, and given an instance of Partition, define
• $N_1 := T$ and $N_2 := \{t + 1, t + 2\}$;
• $a_j := e_j/\gamma$, $j \in N_1$;
• $b_{t+1} := E/\gamma^2$ and $b_{t+2} := E$.

Also define $\lambda = 1 - \gamma$, so that
• $\alpha_j := \lambda a_j = \frac{1-\gamma}{\gamma} e_j$, $j \in N_1$;
• $\beta_{t+1} := \lambda b_{t+1} = \frac{1-\gamma}{\gamma^2} E$ and $\beta_{t+2} := \lambda b_{t+2} = (1 - \gamma)E$.

Notice that if job $j \in N_1$ is given the resource then its processing time becomes

\[ a_j - \alpha_j = \frac{e_j}{\gamma} - \frac{1-\gamma}{\gamma} e_j = e_j = \gamma a_j. \]

Similarly,

\[ b_{t+1} - \beta_{t+1} = \frac{E}{\gamma^2} - \frac{1-\gamma}{\gamma^2} E = \frac{E}{\gamma} = \gamma b_{t+1}; \qquad b_{t+2} - \beta_{t+2} = E - (1-\gamma)E = \gamma E = \gamma b_{t+2}. \]

We show that Partition has a solution if and only if in the constructed problem there exists a schedule $S_0$ such that $C_{\max}(S_0) \le (1 + \frac{1}{\gamma})E$.

Suppose that Partition has a solution represented by the sets $T_1$ and $T_2$. Then schedule $S_0$ with $C_{\max}(S_0) = (1 + \frac{1}{\gamma})E$ exists and can be found as follows. Both machines operate in the time interval $[0, (1 + \frac{1}{\gamma})E]$ with no idle time. The resource is assigned to the jobs of set $T_1$ on machine $M_1$ and to job $t + 1$ on machine $M_2$. Machine $M_1$ processes the block $T_2$ of jobs in the time interval $[0, E/\gamma]$ followed by the block $T_1$ of the resource jobs. Machine $M_2$ processes the sequence of jobs $(t + 1, t + 2)$.

Suppose now that schedule $S_0$ with $C_{\max}(S_0) \le (1 + \frac{1}{\gamma})E$ exists. Assume first that both jobs $t + 1$ and $t + 2$ are given the resource, so that the total load on machine $M_2$ becomes

\[ (b_{t+1} - \beta_{t+1}) + (b_{t+2} - \beta_{t+2}) = E/\gamma + \gamma E. \]

Denote the total processing time of the resource jobs on machine $M_1$ by $z$. Since the total duration of all resource jobs does not exceed $(1 + \frac{1}{\gamma})E$ and the resource jobs on machine $M_2$ take $E/\gamma + \gamma E$ time units, it follows that in schedule $S_0$ we have $z \le (1 - \gamma)E$, and the sum of the original processing times for the resource jobs is equal to $z/\gamma$. We now estimate the total load on machine $M_1$ as

\[ z + \frac{2E}{\gamma} - \frac{z}{\gamma} = \frac{2E}{\gamma} - \frac{1-\gamma}{\gamma} z \ge \frac{2E}{\gamma} - \frac{(1-\gamma)^2}{\gamma} E. \]

Since $2 - (1 - \gamma)^2 > \gamma + 1$ for $0 < \gamma < 1$, we derive that

\[ \frac{2E}{\gamma} - \frac{(1-\gamma)^2}{\gamma} E > \left(1 + \frac{1}{\gamma}\right) E, \]

which is impossible. If job $t + 1$ is not given the resource then the smallest total load on machine $M_2$ is $\frac{E}{\gamma^2} + \gamma E$, which is larger than $(1 + \frac{1}{\gamma})E$ for each $\gamma$, $0 < \gamma < 1$. Thus, in schedule $S_0$ the only job on machine $M_2$ that is given the resource is job $t + 1$, and its processing takes $E/\gamma$ time.

Consider an arbitrary partition $T_1$ and $T_2$ of the index set $T$, such that on machine $M_1$ the jobs $j \in T_1$ are given the resource, and the jobs of the other subset are not. If $e(T_1) > E$, then the total processing time of all resource jobs on both machines exceeds $(1 + \frac{1}{\gamma})E$. Assume that $e(T_1) = E - x$ for some positive $x$. Then the total load on machine $M_1$ is equal to

\[ E - x + \frac{E + x}{\gamma} = \left(1 + \frac{1}{\gamma}\right)E + \left(\frac{1}{\gamma} - 1\right)x > \left(1 + \frac{1}{\gamma}\right)E, \]

a contradiction. Therefore, if schedule $S_0$ exists, then Partition must have a solution.

Notice that Theorem 1 implies that problem PD2|res111, Bi|Cmax is NP-hard even if for any job the reduced processing time is obtained from the standard processing time by multiplying it by the same value $1 - \lambda$. The proof outlined in [5] deals with a less restricted version of the problem.

2.2. Dynamic Programming

We now show that problem PD2|res111, Bi|Cmax can be solved by a dynamic programming algorithm in pseudopolynomial time. Given an instance of problem PD2|res111, Bi|Cmax define

\[ A = \sum_{j \in N_1} (a_j - \alpha_j), \qquad B = \sum_{j \in N_2} (b_j - \beta_j), \qquad R = \max\{A, B\}. \tag{1} \]

THEOREM 2: Problem PD2|res111, Bi|Cmax can be solved in O(nR) time.

PROOF: As follows from [11], for problem PD2|res111|Cmax with the static resource there exists an optimal schedule in which on each machine the jobs are organized in two blocks: one consisting of the resource jobs and the other consisting of the nonresource jobs. Moreover, it suffices to look for an optimal schedule among those schedules in which (i) the jobs in each block are processed without intermediate idle time; (ii) machine $M_1$ processes the resource jobs starting at time zero, and then processes the block of the non-resource jobs; and (iii) machine $M_2$ starting at time zero processes the block of the non-resource jobs followed by the block of the resource jobs which starts as early as possible.

For problem PD2|res111, Bi|Cmax, consider an arbitrary resource allocation. Associate job $j \in N_1$ with a binary variable $x_j$ such that $x_j = 1$ if job $j$ is a resource job and $x_j = 0$ otherwise. Similarly, for each $k \in N_2$ introduce a binary variable $y_k$ such that $y_k = 1$ if job $k$ is a resource job and $y_k = 0$ otherwise.

For some $t_1$, $0 \le t_1 \le A$, suppose that machine $M_1$ processes the block of the resource jobs in the time interval $[0, t_1]$. Similarly, suppose that on machine $M_2$ the block of the resource jobs is processed during a period of $t_2$ time units, where $0 \le t_2 \le B$. The makespan of a schedule of this structure is no smaller than the workload on machine $M_1$, equal to $\sum_{j \in N_1} (a_j - \alpha_j)x_j + \sum_{j \in N_1} a_j(1 - x_j)$, and no smaller than the workload on machine $M_2$, equal to $\sum_{k \in N_2} (b_k - \beta_k)y_k + \sum_{k \in N_2} b_k(1 - y_k)$. Besides, since the resource jobs cannot be processed simultaneously, the makespan is at least as large as $t_1 + t_2$.

To minimize the workload on machine $M_1$ we need to solve the following knapsack problem:

\[ W_1 = \min_x \Big\{ \sum_{j \in N_1} a_j - \sum_{j \in N_1} \alpha_j x_j \Big\} \]

or equivalently

\[ \max_x \sum_{j \in N_1} \alpha_j x_j \quad \text{subject to} \quad \sum_{j \in N_1} (a_j - \alpha_j) x_j \le t_1; \quad x_j \in \{0, 1\},\ j \in N_1. \]

To minimize the workload on $M_2$ we also need to solve the knapsack problem:

\[ W_2 = \min_y \Big\{ \sum_{k \in N_2} b_k - \sum_{k \in N_2} \beta_k y_k \Big\} \]

or equivalently

\[ \max_y \sum_{k \in N_2} \beta_k y_k \quad \text{subject to} \quad \sum_{k \in N_2} (b_k - \beta_k) y_k \le t_2; \quad y_k \in \{0, 1\},\ k \in N_2. \]
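Both knapsack problems above can be solved for every right-hand side at once by the textbook 0/1 knapsack recursion. The sketch below is only an illustration of that idea under stated assumptions: the function name `workload_table` and the representation of a machine as a list of pairs $(a_j, \alpha_j)$ are ours, and the two sample calls use the data of the small two-machine instance considered later in this section (Example 2).

```python
from typing import List, Tuple


def workload_table(jobs: List[Tuple[int, int]]) -> List[int]:
    """jobs: list of (a_j, alpha_j) for one machine.

    Returns W with W[t] = smallest workload of the machine when its
    resource jobs take at most t time units, for t = 0, ..., A.
    """
    cap = sum(a - alpha for a, alpha in jobs)      # A (or B) from (1)
    saved = [0] * (cap + 1)                        # saved[t] = max total alpha_j x_j
    for a, alpha in jobs:
        w = a - alpha                              # reduced processing time
        for t in range(cap, w - 1, -1):            # backwards: each job used once
            saved[t] = max(saved[t], saved[t - w] + alpha)
    total = sum(a for a, _ in jobs)
    return [total - s for s in saved]


print(workload_table([(7, 3)]))          # [7, 7, 7, 7, 4]
print(workload_table([(3, 1), (5, 4)]))  # [8, 4, 4, 3]
```

The returned list is non-increasing in t, which is the property the search over pairs $(t_1, t_2)$ relies on.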

We can solve the first knapsack problem as an all-capacities knapsack problem in $O(|N_1|A)$ time by a dynamic programming algorithm for $t_1 = A$; see [10] for more details. Such an algorithm outputs an optimal solution to each knapsack problem with an integer right-hand side value $t_1 \in \{0, 1, \ldots, A\}$. Let $W_1(t_1)$ denote the optimal value of the objective function for $t_1 \in \{0, 1, \ldots, A\}$. It is convenient to represent these values as a table with two rows and $A + 1$ columns, where column $t_1$, $0 \le t_1 \le A$, in the first row contains the value of $t_1$ and in the second row the value of $W_1(t_1)$. This table will be called the A-matrix. It is clear that the larger the value of $t_1$ is taken, the smaller the value of $W_1(t_1)$ is found, i.e., the entries of the second row of the A-matrix form a non-increasing array.

Similarly, for the second knapsack problem in $O(|N_2|B)$ time we can find solutions to all knapsack problems with all integer right-hand side values $t_2$ between 0 and $B$. Similarly to the above, we associate the found solutions with the B-matrix that has two rows and $B + 1$ columns, where column $t_2$, $0 \le t_2 \le B$, in the first row contains the value of $t_2$ and in the second row contains the corresponding optimal value $W_2(t_2)$ of the objective function. The entries of the second row of the B-matrix are also non-increasing. The time required for building these matrices is at most O(nR).

To find the overall solution we need to find the integer values $t_1$, $0 \le t_1 \le A$, and $t_2$, $0 \le t_2 \le B$, such that $\max\{t_1 + t_2, W_1(t_1), W_2(t_2)\}$ is as small as possible.

Assume that there exists an overall optimal solution in which $W_1(t_1) \ge W_2(t_2)$. We describe a simple procedure that finds $C^{(1)}$ equal to the smallest value of $\max\{t_1 + t_2, W_1(t_1)\}$, provided that $W_1(t_1) \ge W_2(t_2)$. From the A-matrix, read $W_1(0)$. In the B-matrix, find the largest value of $W_2(t_2)$ such that $W_1(0) \ge W_2(t_2)$ and read the corresponding value of $t_2$. Compute $C^{(1)} = \max\{t_2, W_1(0)\}$. Take the next value of $t_1$ and read the value of $W_1(t_1)$ from the corresponding column of the A-matrix. In the B-matrix, find the largest value of $W_2(t_2)$ such that $W_1(t_1) \ge W_2(t_2)$ and read the corresponding value of $t_2$. Update $C^{(1)} := \min\{C^{(1)}, \max\{t_1 + t_2, W_1(t_1)\}\}$. This process is repeated for all integer $t_1$ up to $A$. Since the second row of the B-matrix is ordered, finding the largest values of $W_2(t_2)$ that do not exceed $W_1(t_1)$ for all values of $t_1$ from zero to $A$ requires no more than $B$ comparisons, because each time the search may start from the value found in the previous iteration. Thus, the final value of $C^{(1)}$ will be found in no more than $O(A + B) = O(R)$ time.

In a symmetric way, we can find $C^{(2)}$ equal to the smallest value of $\max\{t_1 + t_2, W_2(t_2)\}$, provided that $W_1(t_1) < W_2(t_2)$. This will also take O(R) time.

The optimal value of the makespan is then equal to $\min\{C^{(1)}, C^{(2)}\}$. The corresponding resource assignment can be found by determining the values of the decision variables.

EXAMPLE 2: To illustrate the algorithm, take the data from Table 1 with machine $M_3$ removed and with the value $p_1$ changed to 7, so that $a_1 = p_1 = 7$, $\alpha_1 = \pi_1 = 3$, $b_2 = p_2 = 3$, $\beta_2 = \pi_2 = 1$, $b_3 = p_3 = 5$, and $\beta_3 = \pi_3 = 4$. For machine $M_1$ we need to solve the all-capacities knapsack problem

\[ \max\ 3x_1 \quad \text{subject to} \quad 4x_1 \le 4, \quad x_1 \in \{0, 1\}, \]

which results in the A-matrix shown in Table 2. Similarly, for machine $M_2$ we have to solve the all-capacities knapsack problem

\[ \max\ y_2 + 4y_3 \quad \text{subject to} \quad 2y_2 + y_3 \le 3, \quad y_2, y_3 \in \{0, 1\}, \]

which results in the B-matrix shown in Table 3.

Table 2. A-matrix.

t_1          0    1    2    3    4
W_1(t_1)     7    7    7    7    4

Table 3. B-matrix.

t_2          0    1    2    3
W_2(t_2)     8    4    4    3

Assuming that $W_1(t_1) \ge W_2(t_2)$ we find the value $C^{(1)} = 5$ achieved for $t_1 = 4$ and $t_2 = 1$. Assuming that $W_1(t_1) < W_2(t_2)$ we find the value $C^{(2)} = 8$ achieved for $t_2 = 0$ and $t_1 = 0$. The optimal makespan is equal to 5. See Fig. 2 for the resulting optimal schedule.

Figure 2. An optimal schedule for Example 2.

Notice that the dynamic programming algorithm outlined in [5] does not take advantage of all features of the two knapsack problems involved and therefore requires much more time.

2.3. FPTAS

We now convert the pseudopolynomial dynamic programming algorithm for problem PD2|res111, Bi|Cmax into a fully polynomial-time approximation scheme for this problem. To achieve this purpose we use a popular rounding technique introduced by Ibarra and Kim [8]. In this subsection, we refer to an instance of the original problem PD2|res111, Bi|Cmax as Instance $I$.

Given an $\varepsilon > 0$ and Instance $I$, use (1) to compute $A$, $B$ and $R$, and define $\delta = \varepsilon R/n$. Define an instance of problem PD2|res111, Bi|Cmax with the processing times

\[ \tilde{a}_j = \lfloor a_j/\delta \rfloor, \quad \tilde{a}_j - \tilde{\alpha}_j = \lfloor (a_j - \alpha_j)/\delta \rfloor, \quad j \in N_1; \]
\[ \tilde{b}_k = \lfloor b_k/\delta \rfloor, \quad \tilde{b}_k - \tilde{\beta}_k = \lfloor (b_k - \beta_k)/\delta \rfloor, \quad k \in N_2, \]

and call it Instance $\tilde{I}$. Here $\lfloor x \rfloor$ denotes the largest integer that does not exceed $x$. Our algorithm solves Instance $\tilde{I}$ and converts the resulting schedule into an approximate solution to the original Instance $I$. Notice that due to the performed rounding, the dynamic programming algorithm that solves Instance $\tilde{I}$ generates a reduced number of states.

Algorithm FPTPD2
1. Given Instance $I$ and an $\varepsilon > 0$, define Instance $\tilde{I}$ as above.
2. For Instance $\tilde{I}$, run the dynamic programming algorithm from Section 2.2. Call the found schedule $\tilde{S}$. Let $N_i^R$ and $N_i^0$ be the subsets of the resource and the non-resource jobs of set $N_i$, $i = 1, 2$, in schedule $\tilde{S}$.
3. Process the jobs from the original Instance $I$ in accordance with the subsets $N_i^R$ and $N_i^0$. Call the resulting schedule $S_\varepsilon$. Stop.
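The three steps of Algorithm FPTPD2, together with the dynamic program of Section 2.2 that Step 2 calls, can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: all function names are ours, a job is represented by the pair (full time, reduced time), the chosen job sets are stored inside the tables for brevity (so the sketch does not attain the stated running time), both passes of the pointer search use a non-strict comparison, and $R > 0$ is assumed.

```python
from math import floor


def tables(jobs):
    """All-capacities 0/1 knapsack for one machine; jobs = [(full, reduced)].
    Entry t is (workload, resource jobs) when resource jobs take at most t."""
    cap = int(sum(red for _, red in jobs))
    best = [(0, ())] * (cap + 1)                 # (saved time, chosen indices)
    for idx, (full, red) in enumerate(jobs):
        w, gain = int(red), full - red
        for t in range(cap, w - 1, -1):
            cand = (best[t - w][0] + gain, best[t - w][1] + (idx,))
            if cand[0] > best[t][0]:
                best[t] = cand
    total = sum(full for full, _ in jobs)
    return [(total - s, chosen) for s, chosen in best]


def half(Wa, Wb):
    """Smallest max(ta + tb, Wa[ta]) over pairs with Wa[ta] >= Wb[tb]."""
    best, tb = None, 0
    for ta, (wa, set_a) in enumerate(Wa):
        while tb < len(Wb) and Wb[tb][0] > wa:
            tb += 1                              # pointer only moves forward
        if tb == len(Wb):
            break
        cand = (max(ta + tb, wa), set_a, Wb[tb][1])
        if best is None or cand[0] < best[0]:
            best = cand
    return best


def solve_exact(jobs1, jobs2):
    """Returns (makespan, resource jobs on M1, resource jobs on M2)."""
    W1, W2 = tables(jobs1), tables(jobs2)
    c1, c2 = half(W1, W2), half(W2, W1)
    if c2 is not None:
        c2 = (c2[0], c2[2], c2[1])               # restore (M1, M2) order
    return min((c for c in (c1, c2) if c is not None), key=lambda c: c[0])


def makespan(jobs1, jobs2, set1, set2):
    """Makespan of the block-structured schedule for given resource sets."""
    def load_and_res(jobs, chosen):
        res = sum(jobs[i][1] for i in chosen)
        rest = sum(f for i, (f, _) in enumerate(jobs) if i not in chosen)
        return res + rest, res
    (l1, r1), (l2, r2) = load_and_res(jobs1, set1), load_and_res(jobs2, set2)
    return max(l1, l2, r1 + r2)


def fptas(jobs1, jobs2, eps):
    n = len(jobs1) + len(jobs2)
    R = max(sum(r for _, r in jobs1), sum(r for _, r in jobs2))   # from (1)
    delta = eps * R / n                                           # assumes R > 0
    rnd = lambda jobs: [(floor(f / delta), floor(r / delta)) for f, r in jobs]
    _, set1, set2 = solve_exact(rnd(jobs1), rnd(jobs2))           # Steps 1-2
    return makespan(jobs1, jobs2, set1, set2)                     # Step 3


M1 = [(7, 4)]                 # Example 2: a_1 = 7, a_1 - alpha_1 = 4
M2 = [(3, 2), (5, 1)]         # b_2 = 3, b_3 = 5 with reduced times 2 and 1
print(solve_exact(M1, M2)[0])  # 5
print(fptas(M1, M2, eps=0.5))
```

On the data of Example 2 the exact routine returns the makespan 5 reported there; the rounded run returns the makespan of a feasible schedule for the original data.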

THEOREM 3: For schedule $S_\varepsilon$ found by Algorithm FPTPD2 the inequality

\[ \frac{C_{\max}(S_\varepsilon)}{C_{\max}(S^*)} \le 1 + \varepsilon \]

holds. The running time of the algorithm does not exceed $O(n^2/\varepsilon)$.

PROOF: Given Instance $I$, introduce an instance of problem PD2|res111, Bi|Cmax with the processing times defined as

\[ \bar{a}_j = \delta \tilde{a}_j, \quad \bar{a}_j - \bar{\alpha}_j = (\tilde{a}_j - \tilde{\alpha}_j)\delta, \quad j \in N_1; \]
\[ \bar{b}_k = \delta \tilde{b}_k, \quad \bar{b}_k - \bar{\beta}_k = (\tilde{b}_k - \tilde{\beta}_k)\delta, \quad k \in N_2, \]

respectively, and call it Instance $\bar{I}$. It follows that a schedule $\bar{S}$ that is optimal for Instance $\bar{I}$ is also associated with the same subsets $N_i^R$ and $N_i^0$ of jobs as in schedule $\tilde{S}$, so that

\[ C_{\max}(\bar{S}) = \max\Big\{ \sum_{j \in N_1^R} (\bar{a}_j - \bar{\alpha}_j) + \sum_{j \in N_1^0} \bar{a}_j,\ \ \sum_{k \in N_2^R} (\bar{b}_k - \bar{\beta}_k) + \sum_{k \in N_2^0} \bar{b}_k,\ \ \sum_{j \in N_1^R} (\bar{a}_j - \bar{\alpha}_j) + \sum_{k \in N_2^R} (\bar{b}_k - \bar{\beta}_k) \Big\}. \]

Recall that the processing times $a_j$ in Instance $I$ are obtained by extending the times $\bar{a}_j$ to their original values by no more than $\delta$ each. The same holds for other time parameters in Instances $I$ and $\bar{I}$. Thus, writing $n_1 = |N_1|$ and $n_2 = |N_2|$, for schedule $S_\varepsilon$ we have that

\[ C_{\max}(S_\varepsilon) \le \max\Big\{ \sum_{j \in N_1^R} (\bar{a}_j - \bar{\alpha}_j) + \sum_{j \in N_1^0} \bar{a}_j + n_1\delta,\ \ \sum_{k \in N_2^R} (\bar{b}_k - \bar{\beta}_k) + \sum_{k \in N_2^0} \bar{b}_k + n_2\delta,\ \ \sum_{j \in N_1^R} (\bar{a}_j - \bar{\alpha}_j) + \sum_{k \in N_2^R} (\bar{b}_k - \bar{\beta}_k) + (n_1 + n_2)\delta \Big\} \]
\[ \le C_{\max}(\bar{S}) + n\delta = C_{\max}(\bar{S}) + \varepsilon R. \]

Since both $C_{\max}(\bar{S})$ and $R$ are lower bounds on the optimal makespan for the original Instance $I$, we have that $C_{\max}(S_\varepsilon) \le (1 + \varepsilon)C_{\max}(S^*)$.

The running time of Algorithm FPTPD2 is determined by the running time of the dynamic programming algorithm used in Step 2, which requires $O(n\tilde{R})$ time, where

\[ \tilde{R} = \max\Big\{ \sum_{j \in N_1} (\tilde{a}_j - \tilde{\alpha}_j),\ \sum_{k \in N_2} (\tilde{b}_k - \tilde{\beta}_k) \Big\}. \]

By definition, $\tilde{R} \le n/\varepsilon$. Thus, we conclude that the running time of Algorithm FPTPD2 does not exceed $O(n^2/\varepsilon)$.

Notice that the FPTAS outlined in [5] requires $O(n^4/\varepsilon^3)$ time.

3. ARBITRARY NUMBER OF MACHINES: APPROXIMATION ALGORITHMS

In this section, we show that problem PD|res111, Bi|Cmax in which the number of machines is part of the input admits an approximation algorithm with a worst-case ratio 1.5 + ε for any given positive ε. We also show how our approach can be extended to develop a (3 + ε)-approximation algorithm for a more general problem PD|res1σσ, Int|Cmax based on a simpler reasoning than that employed in [7] for deriving the same bound for a less general problem PD|res1σσ, Lin|Cmax.

3.1. Binary Scenario

Our algorithm for finding an approximate solution of problem PD|res111, Bi|Cmax consists of two phases. In the first phase, we apply an FPTAS to an integer programming problem in which the optimal value of the objective function is a lower bound on the optimal makespan $C_{\max}(S^*)$ for the original problem PD|res111, Bi|Cmax. The solution found in the first phase generates an instance of the static problem PD|res111|Cmax with fixed processing times. In the second phase we apply a 1.5-approximation algorithm from [11] to the obtained instance of problem PD|res111|Cmax. As a result we determine a suboptimal schedule $S_H$ for the original problem PD|res111, Bi|Cmax, such that for any positive ε the bound

\[ \frac{C_{\max}(S_H)}{C_{\max}(S^*)} \le \frac{3}{2} + \varepsilon \tag{2} \]

holds. Consider the following integer linear programming problem (ILP):

Problem ILP:
\[ \begin{aligned} \text{Minimize} \quad & C \\ \text{s.t.} \quad & \sum_{i=1}^{m} \sum_{j \in N_i} (p_j - \pi_j) x_j \le C, \\ & \sum_{j \in N_i} p_j - \sum_{j \in N_i} \pi_j x_j \le C, \quad i = 1, \ldots, m, \\ & x_j \in \{0, 1\}, \quad j \in N. \end{aligned} \]
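For a very small instance, Problem ILP can be solved by enumerating all 0/1 vectors. The following sketch does this for the data of Example 1 (Table 1); the names `ilp_value` and `solve_ilp` are illustrative assumptions, and plain enumeration is used only because the instance has five jobs.

```python
from itertools import product

# Data of Example 1 (Table 1)
P = {1: 5, 2: 3, 3: 5, 4: 3, 5: 5}        # p_j
PI = {1: 3, 2: 1, 3: 4, 4: 1, 5: 4}       # pi_j
MACHINES = [[1], [2, 3], [4, 5]]          # N_1, N_2, N_3


def ilp_value(x):
    """Smallest C that satisfies all constraints of Problem ILP for a fixed x."""
    resource_time = sum((P[j] - PI[j]) * x[j] for j in P)
    loads = [sum(P[j] - PI[j] * x[j] for j in jobs) for jobs in MACHINES]
    return max(resource_time, *loads)


def solve_ilp():
    """Enumerate all x in {0, 1}^n and keep the first one with the smallest C."""
    jobs = sorted(P)
    best = None
    for bits in product((0, 1), repeat=len(jobs)):
        x = dict(zip(jobs, bits))
        value = ilp_value(x)
        if best is None or value < best[0]:
            best = (value, x)
    return best


value, x = solve_ilp()
print(value)  # 4
print(x)      # {1: 1, 2: 0, 3: 1, 4: 0, 5: 1}
```

For a fixed x the first constraint bounds the total time during which the resource is in use, while the remaining m constraints bound the machine workloads, so the smallest feasible C is simply the maximum of these quantities.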


Problem ILP can be seen as a relaxation of problem PDm|res111, Bi|Cmax . Here xj is equal to 1 if job j is given the resource and is equal to zero, otherwise. In problem ILP we simultaneously minimize the maximum workload over all machines and the total processing time of all resource jobs. Let C ∗ be the optimal objective value of problem ILP. Notice that for problem PDm|res111, Bi|Cmax with m ≥ 3 there may exist an optimal schedule S ∗ such that Cmax (S ∗ ) > C ∗ . To see this, turn to the data set of Example 1 given in Table 1. The value of C ∗ is equal to 4, achieved for x1 = x3 = x5 = 1 and x2 = x4 = 0. However, the optimal makespan is 5; see Fig. 1. Thus, C ∗ is a lower bound on the optimal makespan, which in general is not tight. Our approximation algorithm for problem PD|res111, Bi|Cmax consists of two phases. First, we ﬁnd an approximate solution to problem ILP with the value of the function that does not exceed (1 + δ)C ∗ , where δ is an appropriately chosen fraction of a given accuracy parameter ε. The output of the phase is a resource allocation that is fairly close to that in an optimal schedule. To ﬁnd a suboptimal schedule for this resource allocation we use an approximation algorithm for problem PD|res111, Bi|Cmax , which essentially behaves as a 1.5-approximation algorithm. To achieve a required ratio of 1.5 + ε for a given positive ε, we deﬁne δ = ε/1.5. We start the ﬁrst phase by introducing problem ILP(C), a parametric integer linear programming problem with a parameter C Problem ILP(C) m : (p − πj )xj Minimize i=1 j ∈Ni j s.t. p − j ∈Ni j j ∈Ni πj xj ≤ C, i = 1, . . . , m xj ∈ {0, 1}, j ∈ N. In problem ILP(C) we minimize the total processing time of all resource jobs, provided that the workload on each machine is bounded by C. For a ﬁxed C, let z∗ (C) denote the optimal value of the objective function in problem ILP(C). 
By decomposing problem ILP($C$) into $m$ knapsack problems, we find an approximate solution to that problem with the value of the objective function $z(C)$ such that the bound $z(C) \le (1 + \delta) z^*(C)$ holds. Associate problem ILP($C$) with the series of the following $m$ knapsack problems KP$_i(C)$, $i = 1, \ldots, m$:

$$
\begin{aligned}
\text{Problem KP}_i(C):\quad \text{Minimize}\quad & \sum_{j \in N_i} (p_j - \pi_j) x_j \\
\text{subject to}\quad & \sum_{j \in N_i} p_j - \sum_{j \in N_i} \pi_j x_j \le C, \\
& x_j \in \{0, 1\}, \quad j \in N_i.
\end{aligned}
$$

In each problem KP$_i(C)$ we minimize the total processing time of the resource jobs on machine $M_i$, provided that the workload on that machine does not exceed $C$. Let $z_i^*(C)$ denote the optimal value of the objective function in problem KP$_i(C)$. Comparing problems ILP($C$) and KP$_i(C)$, observe that $z^*(C) = \sum_{i=1}^{m} z_i^*(C)$.

Each problem KP$_i(C)$ is a minimization knapsack problem, so that using the corresponding FPTAS we can find a solution with a value of the objective function $z_i(C)$ such that $z_i(C) \le (1 + \delta) z_i^*(C)$. Running such an FPTAS takes $O(n(\log n + (1/\delta^2) \log(1/\delta)))$ time for each $i$; see, e.g., [10]. The values of the decision variables obtained for all problems KP$_i(C)$, $i = 1, \ldots, m$, define a feasible solution of problem ILP($C$), so that for the obtained value of the objective function $z(C)$ we deduce

$$
z(C) = \sum_{i=1}^{m} z_i(C) \le \sum_{i=1}^{m} (1 + \delta) z_i^*(C) = (1 + \delta) z^*(C),
$$

i.e., the described procedure is an FPTAS for solving problem ILP($C$) with a fixed $C$.

In problem ILP the largest value of $C$ does not exceed the sum of all reduced processing times, as if each job were a resource job. Besides, $C$ cannot be larger than the workload of a machine, provided that none of the jobs assigned to that machine has been given the resource. On the other hand, $C$ cannot be smaller than the workload of a machine, provided that each of the jobs assigned to that machine has been given the resource. Thus, for problem ILP we have that $C \in [Y_1, Y_2]$, where

$$
Y_1 = \max\Big\{ \sum_{j \in N_i} (p_j - \pi_j) \;\Big|\; 1 \le i \le m \Big\},
$$

$$
Y_2 = \max\Big\{ \sum_{i=1}^{m} \sum_{j \in N_i} (p_j - \pi_j),\; \max\Big\{ \sum_{j \in N_i} p_j \;\Big|\; 1 \le i \le m \Big\} \Big\}.
$$
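The decomposition $z^*(C) = \sum_i z_i^*(C)$ and the interval $[Y_1, Y_2]$ can be illustrated on Example 1. The sketch below is an assumption-laden toy: each KP$_i(C)$ is solved exactly by enumeration (standing in for the knapsack FPTAS), the data are as transcribed from Table 1, and a bisection over integer $C$ looks for the smallest $C$ with $z^*(C) \le C$. All function names are hypothetical.

```python
from itertools import product
import math

# Data as transcribed from Table 1 (Example 1): job -> (p_j, pi_j).
JOBS = {1: (5, 3), 2: (3, 1), 3: (5, 4), 4: (3, 1), 5: (5, 4)}
MACHINES = [[1], [2, 3], [4, 5]]


def kp_min(jobs, C):
    """Exact optimum z_i*(C) of KP_i(C) by enumeration; math.inf if infeasible."""
    best = math.inf
    for bits in product((0, 1), repeat=len(jobs)):
        workload = sum(JOBS[j][0] - JOBS[j][1] * b for j, b in zip(jobs, bits))
        if workload <= C:
            cost = sum((JOBS[j][0] - JOBS[j][1]) * b for j, b in zip(jobs, bits))
            best = min(best, cost)
    return best


def z_star(C):
    """z*(C) as the sum of the per-machine optima z_i*(C)."""
    return sum(kp_min(jobs, C) for jobs in MACHINES)


def search_interval():
    """The bounds Y_1 and Y_2 on C for Problem ILP."""
    y1 = max(sum(JOBS[j][0] - JOBS[j][1] for j in jobs) for jobs in MACHINES)
    all_reduced = sum(p - pi for p, pi in JOBS.values())
    y2 = max(all_reduced, max(sum(JOBS[j][0] for j in jobs) for jobs in MACHINES))
    return y1, y2


def smallest_feasible_c():
    """Bisection over integer C in [Y_1, Y_2]; valid because z*(C) is non-increasing in C."""
    lo, hi = search_interval()
    while lo < hi:
        mid = (lo + hi) // 2
        if z_star(mid) <= mid:
            hi = mid
        else:
            lo = mid + 1
    return lo


print(search_interval(), smallest_feasible_c())
```

With exact subproblem solutions the test $z^*(C) \le C$ is monotone in $C$, which is what makes bisection applicable; with approximate values $z(C)$ the acceptance threshold has to be relaxed accordingly.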

The FPTAS for problem ILP($C$) can be embedded into a binary search procedure that for problem ILP determines a value of $C$ that does not exceed $(1 + \delta)C^*$. The search starts from the midpoint of the current interval $[Y_1, Y_2]$. If for the current value of $C$, the value $z(C)$ found by the FPTAS for problem ILP($C$) does not exceed $(1 + \delta)C$, then the obtained solution is also feasible for problem ILP and a better value of $C$ can be found by taking the midpoint of the left part of the current interval; otherwise, we take the midpoint of the right part of the current interval. The search stops having found a solution to problem ILP with the value of the objective function $C_\delta \le (1 + \delta)C^*$. The running time for finding $C_\delta$ does not exceed $O(nm \log(Y_2 - Y_1)(\log n + (1/\delta^2) \log(1/\delta)))$, which is polynomial in both the length of the input of problem ILP and $1/\varepsilon$, i.e., the described procedure is an FPTAS for problem ILP.

We now pass to the second phase of the algorithm. The obtained approximate solution to problem ILP generates the split of the jobs in the original problem into the resource jobs (those with $x_j = 1$) and the remaining non-resource jobs. We obtain an instance of the problem $PD|res111|C_{\max}$ with parallel dedicated machines and a single resource studied in [11]. To find an approximate solution to the original problem $PD|res111, Bi|C_{\max}$ we apply Algorithm GT from [11] to the obtained instance of problem $PD|res111|C_{\max}$. Recall that Algorithm GT, if applied to problem $PD|res111|C_{\max}$, is essentially a group technology heuristic algorithm that schedules on each machine a batch of the resource jobs and a batch of the non-resource jobs. It creates a schedule $S_H$ such that $C_{\max}(S_H) \le \rho L$, where

$$
\rho =
\begin{cases}
\dfrac{3}{2} - \dfrac{1}{2m}, & \text{if } m \text{ is odd}, \\[2mm]
\dfrac{3}{2} - \dfrac{1}{2(m-1)}, & \text{if } m \text{ is even},
\end{cases}
$$

and $L$ is a lower bound on the optimal makespan of the instance of problem $PD|res111|C_{\max}$ computed as the maximum between the largest machine workload and the total processing time of the resource jobs. In our case, such a lower bound $L$ does not exceed $C_\delta$. Thus, we have

$$
C_{\max}(S_H) \le \rho C_\delta \le 1.5(1 + \delta)C^* = (1.5 + \varepsilon)C^* \le (1.5 + \varepsilon) C_{\max}(S^*),
$$

which corresponds to (2). The second phase of our procedure, i.e., running Algorithm GT, requires only $O(nm)$ time, therefore the overall running time of the algorithm is polynomial.

3.2. Integer Scenario

We now turn to problem $PD|res1\sigma\sigma, Int|C_{\max}$ under the integer scenario of resource allocation. If a job $j \in N$ is given $\tau$ units of the resource, $0 \le \tau \le \sigma$, then its actual processing time is equal to a given value $p_{j\tau}$. Under the linear scenario of resource allocation, we are given the values $p_j$ and $\pi_j$, $j \in N$, and the actual processing time of job $j$ that is allocated $\tau$ units of the resource is equal to $p_{j\tau} = p_j - \tau\pi_j$.

For problem $PD|res1\sigma\sigma, Lin|C_{\max}$, a $(3 + \varepsilon)$-approximation algorithm is designed in [7]. The algorithm is essentially a two-phase procedure. In the first phase, the resource allocations are found by solving a quadratic integer programming problem by an FPTAS. The solution found in the first phase generates an instance of the static problem $PD|res1\sigma\sigma|C_{\max}$ with fixed processing times. In the second phase, the jobs are allocated to the machines using a simple greedy algorithm. As a result a suboptimal schedule $S_H$ is found, such that for any positive $\varepsilon$ the bound

$$
\frac{C_{\max}(S_H)}{C_{\max}(S^*)} \le 3 + \varepsilon \tag{3}
$$

holds.

Later we demonstrate that in fact the use of non-linear models in the first phase of this process is not necessary, even for a more general problem $PD|res1\sigma\sigma, Int|C_{\max}$. In fact, we reduce the first-phase actions to solving a series of integer linear programming problems by an FPTAS. Our reasoning is quite similar to that described in Section 3.1; however, here we apply an FPTAS not to linear knapsack problems, but rather to so-called multiple-choice linear knapsack problems. As a result we gain in the overall running time and simplify the justification of the algorithm. To achieve the required ratio (3) for a given positive $\varepsilon$, we define $\delta = \varepsilon/2$.

Consider the following integer linear programming problem ILP$_\sigma$. It has a feasible solution if there is a feasible schedule for problem $PD|res1\sigma\sigma, Int|C_{\max}$ with a makespan at most $C$. A variable $x_{j\tau}$, where $j \in N$ and $0 \le \tau \le \sigma$, is equal to 1 if job $j$ is given $\tau$ units of the resource and is equal to zero otherwise.

$$
\sum_{i=1}^{m} \sum_{j \in N_i} \sum_{\tau=0}^{\sigma} \tau p_{j\tau} x_{j\tau} \le \sigma C, \tag{4}
$$

$$
\sum_{j \in N_i} \sum_{\tau=0}^{\sigma} p_{j\tau} x_{j\tau} \le C, \quad i = 1, \ldots, m, \tag{5}
$$

$$
\sum_{\tau=0}^{\sigma} x_{j\tau} = 1, \quad j \in N, \tag{6}
$$

$$
x_{j\tau} \in \{0, 1\}. \tag{7}
$$

Problem ILP$_\sigma$ can be seen as a relaxation of the original problem $PD|res1\sigma\sigma, Int|C_{\max}$. The left-hand side of (4) represents the total resource consumption of all jobs. Since at most $\sigma$ units of the resource can be allocated at any time, the total resource consumption is at most $\sigma C$, which is expressed in (4). If there is a schedule with makespan not exceeding $C$, the workload on each machine is at most $C$. This is guaranteed by the inequalities (5).

Introduce problem ILP$_\sigma(C)$, a parametric integer linear programming problem with a parameter $C$:

$$
\begin{aligned}
\text{Minimize}\quad & \sum_{i=1}^{m} \sum_{j \in N_i} \sum_{\tau=0}^{\sigma} \tau p_{j\tau} x_{j\tau} \\
\text{subject to}\quad & \sum_{j \in N_i} \sum_{\tau=0}^{\sigma} p_{j\tau} x_{j\tau} \le C, \quad i = 1, \ldots, m, \\
& \sum_{\tau=0}^{\sigma} x_{j\tau} = 1, \quad j \in N, \\
& x_{j\tau} \in \{0, 1\}.
\end{aligned}
$$

In problem ILP$_\sigma(C)$ we minimize the total resource consumption of all jobs, provided that the workload on each machine is bounded by $C$. By decomposing problem


ILP$_\sigma(C)$ into $m$ integer programs, we find an approximate solution to that problem with the value of the objective function $z(C)$ such that the bound $z(C) \le (1 + \delta) z^*(C)$ holds, where $z^*(C)$ denotes the optimal value of the objective function in problem ILP$_\sigma(C)$. Associate problem ILP$_\sigma(C)$ with the series of the following $m$ integer programs MCKP$_i(C)$, $i = 1, \ldots, m$:

$$
\begin{aligned}
\text{Minimize}\quad & \sum_{j \in N_i} \sum_{\tau=0}^{\sigma} \tau p_{j\tau} x_{j\tau} \\
\text{subject to}\quad & \sum_{j \in N_i} \sum_{\tau=0}^{\sigma} p_{j\tau} x_{j\tau} \le C, \\
& \sum_{\tau=0}^{\sigma} x_{j\tau} = 1, \quad j \in N_i, \\
& x_{j\tau} \in \{0, 1\}.
\end{aligned}
$$

In each problem MCKP$_i(C)$ we minimize the total resource consumption on machine $M_i$, provided that the workload on that machine does not exceed $C$. Let $z_i^*(C)$ denote the optimal value of the objective function in problem MCKP$_i(C)$. It follows that $z^*(C) = \sum_{i=1}^{m} z_i^*(C)$.

Each of the programs MCKP$_i(C)$ is a multiple-choice knapsack problem (MCKP) in the minimization form. The MCKP is a generalization of the classical knapsack problem. We are given $\sigma + 1$ mutually disjoint classes $Q_0, Q_1, \ldots, Q_\sigma$ of items to be packed into a knapsack of capacity $C$. Each item $j \in Q_\tau$, $0 \le \tau \le \sigma$, has a cost $c_{j\tau}$ and a weight $w_{j\tau}$, and the problem is to choose exactly one item from each class so that the total cost is minimized without exceeding the weight capacity $C$. In the case of problem MCKP$_i(C)$, there are $|N_i|$ items in each of the $\sigma + 1$ classes so that the total number of items is $t = O(|N_i|\sigma)$. An item $j$ of class $Q_\tau$ is associated with job $j \in N_i$ that is given $\tau$ units of the resource, where $\tau = 0, 1, \ldots, \sigma$. The weights correspond to the processing times $p_{j\tau}$ and the costs are equal to $\tau p_{j\tau}$.

An FPTAS for the MCKP is outlined in [10, page 338]. To achieve an accuracy of $1 + \delta$ for the MCKP an overall running time of $O(t\sigma/\delta)$ is required. Thus, program MCKP$_i(C)$ can be approximated with performance ratio $1 + \delta$ in $O(|N_i|\sigma^2/\delta)$ time. The values of the decision variables obtained for all problems MCKP$_i(C)$, $i = 1, \ldots, m$, define a feasible solution of problem ILP$_\sigma(C)$, so that for the obtained value of the objective function $z(C)$ we deduce $z(C) \le (1 + \delta) z^*(C)$, i.e., the described procedure is an FPTAS for solving problem ILP$_\sigma(C)$ with a fixed $C$.

Embedding the FPTAS for ILP$_\sigma(C)$ into a binary search on $C$ we can calculate the smallest integer $C^*$ for which problem ILP$_\sigma(C)$ has a solution with the objective function value

$z^*(C) \le (1 + \delta)\sigma C^*$. Consequently, $C^*$ is a lower bound on the optimal makespan $C_{\max}(S^*)$ and the corresponding solution vector $x^*$ is a feasible solution of ILP$_\sigma$ with constraint (4) relaxed to

$$
\sum_{i=1}^{m} \sum_{j \in N_i} \sum_{\tau=0}^{\sigma} \tau p_{j\tau} x_{j\tau} \le (1 + \delta)\sigma C^*. \tag{8}
$$

Thus, we have found a resource allocation to the jobs, i.e., we have obtained an instance of problem $PD|res1\sigma\sigma|C_{\max}$, such that the total processing time on each machine is at most $C^*$ and the total resource consumption does not exceed $(1 + \delta)\sigma C^*$ with $C^* \le C_{\max}(S^*)$. A resource allocation found in [7] possesses the same properties, so that a greedy algorithm from [7] can be used to find a suboptimal schedule. In the greedy algorithm, each time the first job is scheduled which can be assigned to the corresponding machine without violating the resource constraints, and ties are broken arbitrarily. We summarize the results of this section as the following statement.

THEOREM 4: Problem $PD|res1\sigma\sigma, Int|C_{\max}$ admits a $(3 + \varepsilon)$-approximation algorithm, while problem $PD|res111, Bi|C_{\max}$ admits a $(3/2 + \varepsilon)$-approximation algorithm. The running times of both algorithms depend polynomially on both the length of the input and $1/\varepsilon$.
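The per-machine subproblem MCKP$_i(C)$ of this section can be sketched as follows. This is a minimal illustration under stated assumptions, not the FPTAS of [10]: it uses an exact pseudopolynomial dynamic program over integer workloads, the linear scenario $p_{j\tau} = p_j - \tau\pi_j$, and hypothetical job data chosen so that all durations stay positive. The names `mckp_min_cost` and `linear_durations` are illustrative.

```python
import math


def linear_durations(p, pi, sigma):
    """Durations p_{j,tau} = p - tau * pi for tau = 0..sigma (linear scenario)."""
    return [p - tau * pi for tau in range(sigma + 1)]


def mckp_min_cost(durations, C):
    """Minimum total resource consumption sum(tau * p_{j,tau}) on one machine.

    durations[j][tau] is p_{j,tau}; exactly one tau is chosen per job and the
    total workload must not exceed C. Returns math.inf if no choice fits.
    """
    # best[w] = cheapest resource consumption among partial choices of workload w.
    best = {0: 0}
    for options in durations:
        nxt = {}
        for w, cost in best.items():
            for tau, p in enumerate(options):
                w2 = w + p
                if w2 > C:
                    continue  # this choice would violate the workload bound
                c2 = cost + tau * p
                if c2 < nxt.get(w2, math.inf):
                    nxt[w2] = c2
        best = nxt
    return min(best.values(), default=math.inf)


# Hypothetical machine with two jobs and sigma = 2 units of the resource.
machine = [linear_durations(6, 2, 2), linear_durations(5, 1, 2)]
print(mckp_min_cost(machine, 8))
```

For this toy machine and $C = 8$ the cheapest choice gives two units to the first job and none to the second, for a workload of 7 and a resource consumption of 4. The dynamic program keeps one state per reachable workload, so its running time depends on $C$; an FPTAS would scale the costs to remove that dependence.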

4. FIXED NUMBER OF MACHINES: PTAS

In this section, we present a PTAS for problem $PDm|res111, Bi|C_{\max}$. For fixed $m$ and $\varepsilon \in\, ]0, 1[$ the running time of our scheme is polynomial in the size of the problem input, but not in $m$ and $1/\varepsilon$.

Let $S^*$ be a schedule that is optimal for problem $PDm|res111, Bi|C_{\max}$. For job $j \in N$ its actual processing time in schedule $S^*$ is denoted by $f_j^*$, i.e., $f_j^* = p_j$ if $j$ is a non-resource job and $f_j^* = p_j - \pi_j$ if it is a resource job. For a subset of jobs $Q \subseteq N$, denote the sum of actual processing times in an optimal schedule by $f^*(Q)$.

Let $S_H$ be a heuristic schedule found by the algorithm presented in Section 3 with $\varepsilon = 0.5$. It follows that

$$
C := \tfrac{1}{2} C_{\max}(S_H)
$$

is a lower bound on the optimal makespan $C^* := C_{\max}(S^*)$. On the other hand, $C_{\max}(S_H) \ge C^*$, so that $C \le C^* \le 2C$. Recall that for a static version of our problem, i.e., problem $PDm|res111|C_{\max}$, a PTAS is developed in [11]. To design

Table 4. Six possible classes of a job: B, big; M, medium; S, small.

Class of job j                 1   2   3   4   5   6
Non-resource duration pj       B   B   B   M   M   S
Resource duration pj − πj      B   M   S   M   S   S

a PTAS for problem $PDm|res111, Bi|C_{\max}$ we will "guess" a resource allocation that is very close to that in an optimal schedule $S^*$. After that, our algorithm follows the lines of the PTAS for the obtained static problem.

The PTAS splits the jobs into big, medium and small according to their durations in such a way that the processing time of every big job is by a factor larger than the duration of any small job, while the total processing time of all medium jobs is a very small multiple of the optimal makespan. Introduce the sequence of real numbers $\delta_1, \delta_2, \ldots$ such that $\delta_t := (\varepsilon/m)^{2^t}$. For $\delta$ equal to one of these values $\delta_t$, the processing time of a job is called big if it exceeds $\delta C$; it is called small if it is no greater than $\delta^2 C$; otherwise it is called medium. Since we do not know an optimal resource allocation, it is, for instance, possible that the processing time of a job is big (provided that no resource is given to that job), while if the job is given the resource its duration becomes medium or even small. Thus, we have to split the jobs into six classes as shown in Table 4.

For each integer $t$, $t \ge 1$, introduce the set of jobs $N^t := \{ j \in N \mid \delta_t^2 C < f_j^* \le \delta_t C \}$. It can be verified that the sets of jobs $N^1, N^2, \ldots$ are mutually disjoint. There exists an integer $t_0$, $1 \le t_0 \le \lceil m/\varepsilon \rceil$, such that $f^*(N^{t_0}) \le \varepsilon f^*(N)/m \le \varepsilon C^*$ holds. Otherwise, we would have $f^*(N) > \lceil m/\varepsilon \rceil \cdot \varepsilon f^*(N)/m$, which is impossible. Taking $t$ from 1 to $\lceil m/\varepsilon \rceil$, eventually we will find a value of $t$ that allows us to find the values of actual processing times as in an optimal schedule for most of the jobs. The explanation below is related to such a $t$. Note that the number of values of $t$ to be examined is constant for fixed $m$ and $\varepsilon$. Define $\delta = \delta_t$ and determine the resource allocation to the jobs according to their class.

Classes 1–3. For each job $j$ in these classes its non-resource duration is big, i.e., $p_j > \delta C$. In an optimal schedule the total processing time of all jobs of big duration on each machine does not exceed $C^* \le 2C$; therefore, there will be at most $\mu := 2/\delta$ of such jobs on each machine. This means that to generate resource allocations for the jobs in these classes we need to verify at most $O(n^{m\mu})$ options, one of which will coincide with the allocation in an optimal schedule. Notice that the number of options to be considered is polynomial.

Classes 4 and 5. In an optimal schedule the total processing time of all jobs of medium duration, including those from


Class 2, on all machines does not exceed $\varepsilon C^*$. Since each medium duration exceeds $\delta^2 C$, it follows that at most $\nu := O(\varepsilon/\delta^2)$ jobs will be of medium duration in an optimal schedule. Thus, we have $O(n^{\nu})$ resource allocations to be verified.

Class 6. The duration of each job of this class remains small irrespective of the resource allocation. Let $N_i^{(6)}$ denote the set of jobs of this class to be processed on machine $M_i$, $1 \le i \le m$. Unlike for the previous classes, here we cannot afford full enumeration since Class 6 may contain too many jobs. Instead, to achieve an optimal resource allocation we will try to minimize the total workload on each machine simultaneously with the total processing time of all resource jobs. This can be done for each machine $M_i$ separately by solving the following bicriteria problem of Boolean programming that we call problem $V_i$:

$$
\begin{aligned}
\text{Problem } V_i:\quad \text{minimize}\quad & F_1(z) = \sum_{j \in N_i^{(6)}} p_j (1 - z_j) \\
\text{minimize}\quad & F_2(z) = \sum_{j \in N_i^{(6)}} (p_j - \pi_j) z_j \\
& z_j \in \{0, 1\}, \quad j \in N_i^{(6)},
\end{aligned}
$$

where $z_j$ is equal to 1 if job $j$ is assigned the resource and is equal to 0 otherwise. Thus, for machine $M_i$, the function $F_1$ represents the total processing time of all non-resource jobs of Class 6, while the function $F_2$ represents the total processing time of all resource jobs of this class.

Recall that a solution $z'$ is called Pareto-optimal if there exists no solution $z''$ such that $F_1(z'') \le F_1(z')$ and $F_2(z'') \le F_2(z')$, where at least one of these relations holds as a strict inequality. The set of all Pareto-optimal solutions is known as the efficiency frontier.

It can be seen that there exists an optimal schedule for the original problem that is related to a resource allocation that is Pareto-optimal for all problems $V_i$. Otherwise, it would be possible to find a resource allocation for the jobs of Class 6 such that both the total duration of the resource jobs and the total duration of the non-resource jobs on some machine $M_i$ are not larger and one duration is strictly smaller than the optimal values of the functions of problem $V_i$.

For problem $V_i$, minimizing one of the objective functions subject to a constraint that the value of the other function is bounded is essentially a knapsack problem. Thus, problem $V_i$ as a problem of simultaneous minimization of these functions is NP-hard and solving it in polynomial time is unlikely. For our purposes, however, it suffices to find a solution to problem $V_i$ that can be seen as an approximation of the set of Pareto-optimal solutions.

Given a positive $\varepsilon$, a feasible solution $z'$ is called a $(1 + \varepsilon)$-approximation of solution $z''$ if $F_1(z') \le (1 + \varepsilon) F_1(z'')$ and $F_2(z') \le (1 + \varepsilon) F_2(z'')$. The set of $(1 + \varepsilon)$-approximations of all Pareto-optimal points is called a $(1 + \varepsilon)$-approximation of the efficiency frontier. As follows from [14], there exists a $(1 + \varepsilon)$-approximation of the efficiency frontier that consists of a number of solutions


that is polynomial in both $1/\varepsilon$ and the number of jobs in set $N_i^{(6)}$. For problem $V_i$, an FPTAS finds a $(1 + \varepsilon)$-approximation of the efficiency frontier, and its running time is polynomial in both $1/\varepsilon$ and the number of jobs in set $N_i^{(6)}$. For our purposes, we may adapt an FPTAS for a more general multi-objective knapsack problem [4]. See also Section 13.1 of the book [10] for more information on multi-objective problems of integer programming.

Solve each problem $V_i$, $i = 1, \ldots, m$, by an FPTAS. For the original value of $\varepsilon$, we may set the accuracy of an FPTAS to $\varepsilon/m$. For any $i$, a solution delivered by the FPTAS generates a resource allocation for all jobs of Class 6. Examining all found solutions, we determine a resource allocation which guarantees the following property.

Class 6 Property. For jobs of Class 6 the sum of actual processing times of these jobs on each machine is at most $2\varepsilon C^*/m$ away from the corresponding value $f^*(N_i^{(6)})$ in an optimal schedule, while the total processing time of all resource jobs of this class exceeds the corresponding value in an optimal schedule by at most $\varepsilon C^*$.

Thus, trying all values of $t$ we perform full enumeration of possible resource allocations of the jobs of Classes 1–5 and apply the approximation approach to the jobs of Class 6. As a result, an allocation will be found such that for each job $j$ of Classes 1–5 its actual processing time becomes equal to $f_j^*$, the duration in some optimal schedule. For the same allocation, the remaining jobs will satisfy Class 6 Property above. This allocation generates an instance of the static problem $PDm|res111|C_{\max}$ that we call Instance $I_\varepsilon$.

Let us refer to the PTAS from [11] for problem $PDm|res111|C_{\max}$ as the static PTAS. As an approximate solution to the original problem, we accept a solution delivered by the static PTAS applied to instance $I_\varepsilon$. In fact, the static PTAS should be applied to every instance associated with every resource allocation, but for our purposes it suffices to study the performance of the static PTAS applied to $I_\varepsilon$. Take the value of $\delta$ that initiates instance $I_\varepsilon$, and define big, medium and small jobs as the jobs that have big, medium and small duration, respectively. Recall that this definition coincides with the definition of these jobs in the description of the static PTAS. Recall that the static PTAS does the following:

• find all schedules of big jobs with starting times that are multiples of $\delta^2 C$ by full enumeration (the number of the big jobs is constant);
• for each such schedule, determine the amounts of small resource jobs to be processed in the gaps of that schedule by solving a linear programming problem;
• schedule small resource jobs preemptively in these gaps;
• schedule small non-resource jobs preemptively in the remaining gaps;
• get rid of preemptions (some of the jobs can be temporarily discarded);
• append all unscheduled jobs (i.e., all temporarily discarded jobs and all medium jobs);
• select the best of all generated schedules.

Here we do not present a detailed analysis of the performance of our PTAS, since it is basically identical to that given in [11]. The only point that needs clarification concerns the jobs of Class 6, since in Instance $I_\varepsilon$ their durations are not exact. However, since the static PTAS assigns the small jobs preemptively, we can use Class 6 Property to guarantee that for the obtained schedule $S_\varepsilon$ the error $C_{\max}(S_\varepsilon) - C^*$ does not exceed $r\varepsilon C^*$, where $r$ is a constant that only depends on $m$ and $1/\varepsilon$. By normalizing the original value of $\varepsilon$, we conclude that our approach leads to a PTAS for the original problem.
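To make the bicriteria problem $V_i$ of this section concrete, the sketch below computes its exact efficiency frontier by full enumeration. This is only an illustration for a handful of jobs: the section itself relies on an FPTAS that approximates the frontier, which is not reproduced here. The job data `SMALL_JOBS` and the function name `pareto_frontier` are hypothetical.

```python
from itertools import product


def pareto_frontier(jobs):
    """Exact efficiency frontier of problem V_i for a list of (p_j, pi_j) pairs.

    Returns the non-dominated triples (F1, F2, z), sorted by F1, where
    F1 is the total duration of the non-resource jobs and
    F2 is the total duration of the resource jobs under allocation z.
    """
    points = []
    for z in product((0, 1), repeat=len(jobs)):
        f1 = sum(p * (1 - zj) for (p, _), zj in zip(jobs, z))
        f2 = sum((p - pi) * zj for (p, pi), zj in zip(jobs, z))
        points.append((f1, f2, z))

    frontier = []
    for f1, f2, z in points:
        # A point is dominated if another point is no worse in both
        # objectives and strictly better in at least one of them.
        dominated = any(
            g1 <= f1 and g2 <= f2 and (g1 < f1 or g2 < f2)
            for g1, g2, _ in points
        )
        if not dominated:
            frontier.append((f1, f2, z))
    return sorted(frontier)


# Hypothetical Class 6 jobs on one machine, given as (p_j, pi_j).
SMALL_JOBS = [(3, 1), (2, 1), (4, 3)]
for f1, f2, z in pareto_frontier(SMALL_JOBS):
    print(f1, f2, z)
```

Of the eight allocations for this toy data, five are Pareto-optimal. The enumeration is exponential in the number of jobs, which is why an approximate frontier of polynomial size is used for Class 6.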

5. CONCLUSION

In this article, we address approximability issues of the problem of scheduling jobs on parallel dedicated machines in the presence of a single renewable speeding-up resource of unit amount, as well as of its generalization with several available units of the resource. In all our algorithms we rely on a certain version of the linear knapsack problem. It is an attractive research goal to extend our approaches to other models of parallel processing. A study of models, including multicriteria ones, that take into consideration either the amount or the cost of the used resources is very appealing; the models of this kind are the main topic of research in scheduling with controllable processing times.

ACKNOWLEDGEMENTS

The authors are grateful to Alexander Grigoriev and Marc Uetz of the University of Maastricht for useful discussions at the early stages of this research. The comments by two anonymous referees and an associate editor have contributed to improving the presentation.

REFERENCES

[1] J. Błażewicz, N. Brauner, and G. Finke, "Scheduling with discrete resource constraints," J.Y.-T. Leung (Editor), Handbook of scheduling: Algorithms, models and performance analysis, Chapman & Hall/CRC, London, 2004, pp. 23-1–23-18.
[2] J. Błażewicz, K.H. Ecker, G. Schmidt, and J. Węglarz, Scheduling in computer and manufacturing systems, Springer, Berlin, 1994.
[3] J. Błażewicz, J.K. Lenstra, and A.H.G. Rinnooy Kan, Scheduling subject to resource constraints, Discrete Appl Math 5 (1983), 11–24.
[4] T. Erlebach, H. Kellerer, and U. Pferschy, Approximating multi-objective knapsack problems, Management Sci 48 (2002), 1603–1612.
[5] A. Grigoriev, H. Kellerer, and V.A. Strusevich, Dedicated parallel machine scheduling with a single speeding-up resource, 6th Workshop on models and algorithms for planning and scheduling problems, book of abstracts, 2003, pp. 131–132.
[6] A. Grigoriev, M. Sviridenko, and M. Uetz, Machine scheduling with resource dependent processing times, Math Program 110 (2007), 209–228.
[7] A. Grigoriev and M. Uetz, "Scheduling parallel jobs with linear speedup," T. Erlebach and P. Persiano (Editors), Approximation and online algorithms (WAOA 2005), Lecture Notes Comput Sci 3879 (2006), 203–215.
[8] O.H. Ibarra and C.E. Kim, Fast approximation algorithms for the knapsack and sum of subsets problem, J ACM 22 (1975), 463–468.
[9] A. Janiak and M.Y. Kovalyov, Single machine scheduling subject to deadlines and resource dependent processing times, Eur J Oper Res 94 (1996), 284–291.
[10] H. Kellerer, U. Pferschy, and D. Pisinger, Knapsack problems, Springer, Berlin, 2004.
[11] H. Kellerer and V.A. Strusevich, Scheduling parallel dedicated machines under a single non-shared resource, Eur J Oper Res 147 (2003), 345–364.
[12] H. Kellerer and V.A. Strusevich, Scheduling problems for parallel dedicated machines under multiple resource constraints, Discrete Appl Math 113 (2004), 45–68.
[13] E. Nowicki and S. Zdrzalka, A survey of results for sequencing problems with controllable processing times, Discrete Appl Math 26 (1990), 271–287.
[14] C.H. Papadimitriou and M. Yannakakis, On the approximability of trade-offs and optimal access of web sources, Proc 41st Symp Foundations of Computer Science, 2000, pp. 86–92.
[15] D. Shabtay, Single and a two-resource allocation algorithms for minimizing the maximal lateness in a single machine-scheduling problem, Comput Oper Res 31 (2004), 1303–1315.
[16] D. Shabtay and M. Kaspi, Parallel machine scheduling with a convex resource consumption function, Eur J Oper Res 173 (2006), 92–107.


1. INTRODUCTION

In this article we consider the problem of scheduling jobs on parallel dedicated machines, provided that the processing of jobs can be sped up by allocating an additional resource. We are given a set $N = \{1, 2, \ldots, n\}$ of jobs and $m$ processing machines $M_1, M_2, \ldots, M_m$. The machines are parallel and dedicated. Each job has to be processed on exactly one machine, and the set $N$ of jobs is in advance partitioned into $m$ subsets, $N_1, N_2, \ldots, N_m$, so that the jobs of set $N_i$ and only these are processed on machine $M_i$, $1 \le i \le m$. The processing time for performing job $j$ is equal to $p_j \ge 0$ time units. No machine processes more than one job at a time and preemption is not allowed. For all problems considered in this article, the goal is to find a schedule that minimizes the makespan, i.e., the maximum completion time. For a schedule $S$, let the makespan be denoted by $C_{\max}(S)$. A schedule with the smallest makespan is called optimal and is denoted by $S^*$.

In the basic model studied in this article, it is assumed that an additional renewable speeding-up resource can be allocated to a job. There are $\sigma \ge 1$ units of the resource available at any time. If a job $j$ is not given the resource, its processing time remains equal to $p_j$; otherwise, the resource will speed up the processing. A job that is not given the resource is called a nonresource job; otherwise, it is called a resource job.

In this article we mainly concentrate on the simplest scenario of resource consumption, which we call binary. We assume that exactly one unit of the resource is available at any time, i.e., $\sigma = 1$. Each job $j \in N$ is associated with a value $\pi_j \le p_j$. If a job $j$ is given the resource, its processing time becomes $p_j - \pi_j$ and exactly one unit of the resource is required at any time of this processing. No two resource jobs can be processed simultaneously. Unless stated otherwise, all time parameters $p_j$ and $\pi_j$ are assumed to be integer. To help the reader to grasp the main features of our model, we provide a small-sized example.

EXAMPLE 1: Consider the problem of processing five jobs on three parallel dedicated machines $M_1$, $M_2$, and $M_3$. Machine $M_1$ has to process only job 1, machine $M_2$ processes jobs 2 and 3, while machine $M_3$ processes jobs 4 and 5; i.e., $N_1 = \{1\}$, $N_2 = \{2, 3\}$ and $N_3 = \{4, 5\}$. The values of the processing parameters $p_j$ and $\pi_j$ are given in Table 1. Later in this article we use this example for the purpose of numerical illustrations. Here we give two meaningful interpretations of the example presented in Table 1.

Table 1. Data for Example 1.

        N1    N2         N3
j       1     2     3    4     5
pj      5     3     5    3     5
πj      3     1     4    1     4

Human Resource Management. Three teams M1 , M2 , and M3 have to implement ﬁve projects. Team Mi is responsible for set Ni of projects, i = 1, 2, 3. The senior management may allocate an extra employee, currently not a member of any of the teams, to take part in any project. If project j is done by the existing workforce of the corresponding team, it will take pj weeks. If the additional employee is assigned to project j , this will reduce the duration of the project by πj weeks. The extra employee remains assigned to the project until it is completed and cannot be assigned to take part in more than one project at a time. The purpose is to complete all ﬁve projects as early as possible. Power-Aware Scheduling. A computing device consists of three parallel processors M1 , M2 , and M3 and has to run ﬁve tasks. The tasks of set Ni are assigned to run on processor Mi , i = 1, 2, 3. If task j is run on the corresponding processor at a standard speed, it will take pj seconds. It is possible to increase the speed of any processor, so that task j will require πj seconds less. In order not to overheat the device, only one processor at a time can be speed up. The goal is to ﬁnish all ﬁve tasks as early as possible. See Fig. 1 for a schedule that is optimal for the instance presented in Example 1. The resource jobs are hatched with vertical lines. If the resource allocations are known in advance then all processing times are also known, and we refer to this kind of resource as the renewable static resource. Study on scheduling problems with this type of resource constraints has been initiated by Bła˙zewicz et al. [3]; See also [1] and [2] for the most recent reviews of research in this area. The problems of minimizing the makespan on parallel dedicated machines with static renewable resources have been studied in [11, 12]. 
A fairly complete computational classification of relevant problems has been obtained and a number of approximation algorithms have been designed and analyzed. In the case of a single static resource, the problems studied in [11, 12] are denoted by PDm|res1σρ|Cmax (if the number of machines is fixed and equal to m) and by PD|res1σρ|Cmax (if the number of machines is arbitrary, i.e., part of the problem input). Here “PD” stands for “parallel dedicated machines,” while “res1σρ” implies that there is a single resource, the size of the resource does not exceed σ, each job is allocated no more than ρ units of the resource and at any time the total amount of the allocated resource is at most σ. In particular, PDm|res111|Cmax denotes the problem of minimizing the makespan on m parallel dedicated machines, provided that some jobs are known to require one unit of the additional resource at any time of their processing.

The main problem studied in this article will be denoted by PDm|res111, Bi|Cmax, where we write “Bi” in the middle field to stress that the resource is speeding-up and the binary scenario of its consumption is applied. If the number of the machines is arbitrary, the problem is denoted by PD|res111, Bi|Cmax.

In the scheduling literature, there are at least two general models with a nonrenewable speeding-up resource. A feature common to both models is that a single resource has to be divided between the jobs in advance, and the processing time of each job that receives the resource is reduced, depending on how many units of the resource are allocated. This situation is typical if the resource to be divided represents money or energy. These models have numerous applications in manufacturing, supply chain management, imprecise computing and other areas.

The first model with the nonrenewable speeding-up resource is related to scheduling with controllable processing times; see [13] for a review. Formally, for each job j ∈ N we are given the “standard” value of processing time $\bar{p}_j$ that can be compressed to the minimum value $\underline{p}_j$, where $\underline{p}_j \le \bar{p}_j$. Crashing $\bar{p}_j$ to some actual processing time $p_j$, $\underline{p}_j \le p_j \le \bar{p}_j$, may decrease job completion time(s) but incurs additional cost $\alpha_j x_j$, where $x_j = \bar{p}_j - p_j$ is the compression amount of job j and $\alpha_j$ is a given unit compression cost. A number of authors, see, e.g., [9], argue that the compression is achieved due to the additional resource allocated to a job. Usually, this problem area deals with the trade-off between the improved quality of the obtained schedule and the cost of the used resource, normally represented by a linear function $\sum_{j \in N} \alpha_j x_j$.
Another model defines the processing time of a job j to which uj units of the resource are allocated as $p_j = (a_j/u_j)^k$, where aj is a known job-dependent constant, while k is a positive constant. Here the main issue is to find a resource-feasible schedule, i.e., one with $\sum_{j \in N} u_j \le \sigma$, that minimizes a certain function that measures the quality of a schedule. This scheduling model has been studied in a series of papers; here we only refer to [15], which is the earliest paper we are aware of, and [16], the most recent one.

Figure 1. An optimal schedule for Example 1.

Returning to problem PDm|res111, Bi|Cmax with a single renewable speeding-up resource, notice that some preliminary results on this problem are reported in [5]. In particular, problem PD2|res111, Bi|Cmax is shown to be NP-hard in the ordinary sense; we give an alternative proof in Section 2.1. The fact that the problem under consideration is NP-hard stimulates the search for approximation algorithms that deliver solutions fairly close to the optimum.

Recall some relevant definitions. A polynomial-time algorithm that creates a schedule with a makespan that is at most ρ ≥ 1 times the optimal value is called a ρ-approximation algorithm; the value of ρ is called a worst-case ratio bound. A family of algorithms is called a polynomial-time approximation scheme, or a PTAS, if for a given ε > 0 it contains an algorithm whose running time is polynomial in the length of the problem input and which delivers a worst-case ratio bound of 1 + ε. If additionally the running time is polynomial in 1/ε, a PTAS is called a fully polynomial-time approximation scheme (FPTAS).

While most of the presented results concern problem PD2|res111, Bi|Cmax, in this article we also address a more general problem, in which there are σ ≥ 1 units of the renewable resource available at a time. We are given a matrix with the nonnegative elements pjτ, where j ∈ N and 0 ≤ τ ≤ σ. If job j is allocated an integer number τ, 0 ≤ τ ≤ σ, of units of the resource at any time of its processing, then its actual processing time is equal to pjτ. A set of jobs can be scheduled to run in parallel only if together they are allocated at most σ units of the resource.
We denote this problem by PDm|res1σσ, Int|Cmax, where “Int” stands for the integer scenario of resource allocation. For the speeding-up resource, the values pjτ are non-increasing in τ for each job j; however, in general we do not have to make this assumption. A special case of problem PDm|res1σσ, Int|Cmax is studied in [7], where it is assumed that the actual processing time of job j that is given τ units of the resource depends linearly on τ, i.e., pjτ = pj − τπj, where pj and πj have the same meaning as for problem PDm|res111, Bi|Cmax. This problem can be denoted by PDm|res1σσ, Lin|Cmax, where “Lin” stresses that the actual processing times depend linearly on the number of units of the speeding-up resource allocated to the job. Notice that in the case of σ = 1, both scenarios, integer and linear, coincide and become the binary scenario. For problem PDm|res1σσ, Lin|Cmax, a (3 + ε)-approximation algorithm is presented in [7]. A similar speeding-up scenario applied to other scheduling models, e.g., unrelated parallel machines, is considered in [6].
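The three consumption scenarios differ only in how the actual processing time of a job is obtained. The following minimal sketch illustrates this; the function names and the sample job (pj = 5, πj = 2) are hypothetical and serve only as an illustration of the definitions above.

```python
def time_binary(p, pi, gets_resource):
    """Binary scenario: p_j without the resource, p_j - pi_j with it."""
    return p - pi if gets_resource else p


def time_linear(p, pi, tau, sigma):
    """Linear scenario: p_{j,tau} = p_j - tau * pi_j for 0 <= tau <= sigma."""
    if not 0 <= tau <= sigma:
        raise ValueError("tau must lie in [0, sigma]")
    return p - tau * pi


def time_integer(row, tau):
    """Integer scenario: row[tau] is the given matrix entry p_{j,tau}."""
    return row[tau]


# Hypothetical job with p_j = 5 and pi_j = 2, and sigma = 2 units available.
p, pi = 5, 2
linear_row = [time_linear(p, pi, tau, sigma=2) for tau in range(3)]
```

With σ = 1 the linear rule returns pj for τ = 0 and pj − πj for τ = 1, which is exactly the binary scenario.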


The remainder of this article is organized as follows. Section 2 addresses the two-machine version of problem PDm|res111, Bi|Cmax. We show that problem PD2|res111, Bi|Cmax is NP-hard, offer a pseudopolynomial-time dynamic programming algorithm and convert it into an FPTAS. In Section 3 we present an approximation algorithm for problem PD|res111, Bi|Cmax with an arbitrary number of machines that delivers a worst-case ratio close to 3/2. We also show how to extend our approach to handle problem PDm|res1σσ, Int|Cmax and give a (3 + ε)-approximation algorithm that does not involve the approximation schemes for quadratic programming problems employed in [7] for a less general problem PDm|res1σσ, Lin|Cmax. In Section 4 we describe a PTAS for problem PDm|res111, Bi|Cmax with a fixed number of machines. Section 5 contains some concluding remarks.

In our reasoning and design of the algorithms we often use the linear knapsack problem, in both minimization and maximization settings, as well as the multiple-choice knapsack problem and the bicriteria knapsack problem. Notice that each of these problems admits an FPTAS; see [10] for details. Additionally, the algorithms that we describe here for the problems with the speeding-up resource rely on the algorithms from [11] developed for the counterparts with the static resource; the latter algorithms are not presented here in full detail, and we briefly review the results from [11] in relevant sections of this article.

2. TWO MACHINES: COMPLEXITY AND FPTAS

In this section, we study problem PD2|res111, Bi|Cmax. Recall that if the resource is static, i.e., is allocated in advance, problem PD2|res111|Cmax is solvable in O(n) time; on the other hand problem PDm|res111|Cmax is NP-hard for any fixed m ≥ 3, while problem PD|res111|Cmax with an arbitrary number of machines is NP-hard in the strong sense; see [11]. Notice also that the reduction used in [11] to prove strong NP-hardness of problem PD|res111|Cmax is straightforwardly extendable to problem PD|res111, Bi|Cmax.

For the two-machine case of problem PDm|res111, Bi|Cmax we define

\[
a_j = p_j, \quad \alpha_j = \pi_j, \quad j \in N_1; \qquad b_k = p_k, \quad \beta_k = \pi_k, \quad k \in N_2.
\]

2.1. Complexity

To resolve the complexity issue of problem PDm|res111, Bi|Cmax we only need to establish the status of its version with two machines. In this subsection, we show that problem PD2|res111, Bi|Cmax is NP-hard in the ordinary sense. In the proof of the NP-hardness the following well-known NP-complete problem is used for reduction.

Partition: Given t integers ej such that $\sum_{j=1}^{t} e_j = 2E$, does there exist a partition of the index set T = {1, 2, . . . , t} into two subsets T1 and T2 such that $\sum_{j \in T_1} e_j = \sum_{j \in T_2} e_j = E$?

For a nonempty set Q ⊆ T, define $e(Q) = \sum_{j \in Q} e_j$; additionally define e(∅) = 0.

The theorem below holds not only for the speeding-up model that is of primary concern in this article, but also for an alternative possible model for which the actual processing times are derived by reducing the original times by the same factor.

THEOREM 1: Problem PD2|res111, Bi|Cmax is NP-hard even if there exists a λ, 0 < λ < 1, such that αj = λaj for all j ∈ N1 and βj = λbj for all j ∈ N2.

PROOF: Take an arbitrary γ, 0 < γ < 1, and given an instance of Partition, define
• N1 := T and N2 := {t + 1, t + 2};
• $a_j := e_j/\gamma$, j ∈ N1;
• $b_{t+1} := E/\gamma^2$ and $b_{t+2} := E$.

Also define λ = 1 − γ, so that
• $\alpha_j := \lambda a_j = \frac{1-\gamma}{\gamma} e_j$, j ∈ N1;
• $\beta_{t+1} := \lambda b_{t+1} = \frac{1-\gamma}{\gamma^2} E$ and $\beta_{t+2} := \lambda b_{t+2} = (1-\gamma)E$.

Notice that if job j ∈ N1 is given the resource then its processing time becomes

\[
a_j - \alpha_j = \frac{e_j}{\gamma} - \frac{1-\gamma}{\gamma} e_j = e_j = \gamma a_j.
\]

Similarly,

\[
b_{t+1} - \beta_{t+1} = \frac{E}{\gamma^2} - \frac{1-\gamma}{\gamma^2} E = \frac{E}{\gamma} = \gamma b_{t+1}; \qquad
b_{t+2} - \beta_{t+2} = E - (1-\gamma)E = \gamma E = \gamma b_{t+2}.
\]

We show that Partition has a solution if and only if in the constructed problem there exists a schedule S0 such that Cmax(S0) ≤ (1 + 1/γ)E.

Suppose that Partition has a solution represented by the sets T1 and T2. Then schedule S0 with Cmax(S0) = (1 + 1/γ)E exists and can be found as follows. Both machines operate in the time interval [0, (1 + 1/γ)E] with no idle time. The resource is assigned to the jobs of set T1 on machine M1 and to job t + 1 on machine M2. Machine M1 processes the block T2 of jobs in the time interval [0, E/γ] followed by the block T1 of the resource jobs. Machine M2 processes the sequence of jobs (t + 1, t + 2).

Suppose now that schedule S0 with Cmax(S0) ≤ (1 + 1/γ)E exists. Assume first that both jobs t + 1 and t + 2 are given the resource, so that the total load on machine M2 becomes

\[
(b_{t+1} - \beta_{t+1}) + (b_{t+2} - \beta_{t+2}) = E/\gamma + \gamma E.
\]

Denote the total processing time of the resource jobs on machine M1 by z. Since the total duration of all resource jobs does not exceed (1 + 1/γ)E and the resource jobs on machine M2 take E/γ + γE time units, it follows that in schedule S0 z ≤ (1 − γ)E, and the sum of the original processing times for the resource jobs is equal to z/γ. We now estimate the total load on machine M1 as

\[
z + \left( \frac{2E}{\gamma} - \frac{z}{\gamma} \right) = \frac{2E}{\gamma} - \frac{1-\gamma}{\gamma} z \ge \frac{2E}{\gamma} - \frac{(1-\gamma)^2}{\gamma} E.
\]

Since 2 − (1 − γ)^2 > γ + 1 for 0 < γ < 1, we derive that

\[
\frac{2E}{\gamma} - \frac{(1-\gamma)^2}{\gamma} E > \left( 1 + \frac{1}{\gamma} \right) E,
\]

which is impossible. If job t + 1 is not given the resource then the smallest total load on machine M2 is $E/\gamma^2 + \gamma E$, which is larger than (1 + 1/γ)E for each γ, 0 < γ < 1. Thus, in schedule S0 the only job on machine M2 that is given the resource is job t + 1, and its processing takes E/γ time. Consider an arbitrary partition T1 and T2 of the index set T, such that on machine M1 the jobs j ∈ T1 are given the resource, and the jobs of the other subset are not. If e(T1) > E, then the total processing time of all resource jobs on both machines exceeds (1 + 1/γ)E. Assume that e(T1) = E − x for some positive x. Then the total load on machine M1 is equal to

\[
E - x + \frac{E + x}{\gamma} = \left( 1 + \frac{1}{\gamma} \right) E + \left( \frac{1}{\gamma} - 1 \right) x > \left( 1 + \frac{1}{\gamma} \right) E,
\]

a contradiction. Therefore, if schedule S0 exists, then Partition must have a solution.

Notice that Theorem 1 implies that problem PD2|res111, Bi|Cmax is NP-hard if for any job the reduced processing time is obtained from the standard processing time by multiplying it by the same value 1 − λ. The proof outlined in [5] deals with a less restricted version of the problem.

2.2. Dynamic Programming

We now show that problem PD2|res111, Bi|Cmax can be solved by a dynamic programming algorithm in pseudopolynomial time. Given an instance of problem PD2|res111, Bi|Cmax define

\[
A = \sum_{j \in N_1} (a_j - \alpha_j), \qquad B = \sum_{j \in N_2} (b_j - \beta_j), \qquad R = \max\{A, B\}. \qquad (1)
\]

THEOREM 2: Problem PD2|res111, Bi|Cmax can be solved in O(nR) time.

PROOF: As follows from [11], for problem PD2|res111|Cmax with the static resource there exists an optimal schedule in which on each machine the jobs are organized in two blocks: that of the resource jobs and the other consisting of the non-resource jobs. Moreover, it suffices to look for an optimal schedule among those schedules in which (i) the jobs in each block are processed without intermediate idle time; (ii) machine M1 processes the resource jobs starting at time zero, and then processes the block of the non-resource jobs; and (iii) machine M2 starting at time zero processes the block of the non-resource jobs followed by the block of the resource jobs which starts as early as possible.

For problem PD2|res111, Bi|Cmax, consider an arbitrary resource allocation. Associate job j ∈ N1 with a binary variable xj such that xj = 1 if job j is a resource job and xj = 0 otherwise. Similarly, for each k ∈ N2 introduce a binary variable yk such that yk = 1 if job k is a resource job and yk = 0 otherwise. For some t1, 0 ≤ t1 ≤ A, suppose that machine M1 processes the block of the resource jobs in the time interval [0, t1]. Similarly, suppose that on machine M2 the block of the resource jobs is processed during a period of t2 time units, where 0 ≤ t2 ≤ B. The makespan of a schedule of this structure is at least as large as the workload on machine M1, equal to $\sum_{j \in N_1} (a_j - \alpha_j) x_j + \sum_{j \in N_1} a_j (1 - x_j)$, and as the workload on machine M2, equal to $\sum_{k \in N_2} (b_k - \beta_k) y_k + \sum_{k \in N_2} b_k (1 - y_k)$. Besides, since the resource jobs cannot be processed simultaneously, the makespan is at least as large as t1 + t2.

To minimize the workload on machine M1 we need to solve the following knapsack problem:

\[
W_1 = \min_x \Big\{ \sum_{j \in N_1} a_j - \sum_{j \in N_1} \alpha_j x_j \Big\}
\]

or equivalently

\[
\max_x \sum_{j \in N_1} \alpha_j x_j \quad \text{subject to} \quad \sum_{j \in N_1} (a_j - \alpha_j) x_j \le t_1; \quad x_j \in \{0, 1\},\ j \in N_1.
\]

To minimize the workload on M2 we also need to solve the knapsack problem:

\[
W_2 = \min_y \Big\{ \sum_{k \in N_2} b_k - \sum_{k \in N_2} \beta_k y_k \Big\}
\]

or equivalently

\[
\max_y \sum_{k \in N_2} \beta_k y_k \quad \text{subject to} \quad \sum_{k \in N_2} (b_k - \beta_k) y_k \le t_2; \quad y_k \in \{0, 1\},\ k \in N_2.
\]

We can solve the first knapsack problem as an all-capacities knapsack problem in O(|N1|A) time by a dynamic programming algorithm for t1 = A; see [10] for more details. Such an algorithm outputs an optimal solution to each knapsack problem with an integer right-hand side value t1 ∈ {0, 1, . . . , A}. Let W1(t1) denote the optimal value of the objective function, i.e., the smallest workload on machine M1, for t1 ∈ {0, 1, . . . , A}. It is convenient to represent these values as a table with two rows and A + 1 columns, where column t1, 0 ≤ t1 ≤ A, in the first row contains the value of t1 and in the second row the value of W1(t1). This table will be called the A-matrix. It is clear that the larger the value of t1 is taken, the smaller the value of W1(t1) is found, i.e., the entries of the second row of the A-matrix form a non-increasing array.

Similarly, for the second knapsack problem in O(|N2|B) time we can find solutions to all knapsack problems with all integer right-hand side values t2 from zero to B. Similarly to the above, we associate the found solutions with the B-matrix that has two rows and B + 1 columns, where column t2, 0 ≤ t2 ≤ B, in the first row contains the value of t2 and in the second row contains the corresponding optimal value W2(t2) of the objective function. The entries of the second row of the B-matrix are also non-increasing. The time required for building these matrices is at most O(nR).

To find the overall solution we need to find the integer values t1, 0 ≤ t1 ≤ A, and t2, 0 ≤ t2 ≤ B, such that max{t1 + t2, W1(t1), W2(t2)} is as small as possible. Assume that there exists an overall optimal solution in which W1(t1) ≥ W2(t2). We describe a simple procedure that finds $C^{(1)}$ equal to the smallest value of max{t1 + t2, W1(t1)}, provided that W1(t1) ≥ W2(t2). From the A-matrix, read W1(0). In the B-matrix, find the largest value of W2(t2) such that W1(0) ≥ W2(t2) and read the corresponding value of t2 (if several columns contain that value, take the smallest t2). Compute $C^{(1)}$ = max{t2, W1(0)}. Take the next value of t1 and read the value of W1(t1) from the corresponding column of the A-matrix. In the B-matrix, find the largest value of W2(t2) such that W1(t1) ≥ W2(t2) and read the corresponding value of t2. Update $C^{(1)} := \min\{C^{(1)}, \max\{t_1 + t_2, W_1(t_1)\}\}$. This process is repeated for all integer t1 up to A. Since the second row of the B-matrix is ordered, finding the largest value of W2(t2) that does not exceed W1(t1) for all values of t1 from zero to A requires no more than B comparisons, because each time the search may start from the value found in the previous iteration. Thus, the final value of $C^{(1)}$ will be found in no more than O(A + B) = O(R) time.

In a symmetric way, we can find $C^{(2)}$ equal to the smallest value of max{t1 + t2, W2(t2)}, provided that W1(t1) < W2(t2). This will also take O(R) time. The optimal value of the makespan is then equal to min{$C^{(1)}$, $C^{(2)}$}. The corresponding resource assignment can be found by determining the values of the decision variables.

EXAMPLE 2: To illustrate the algorithm, take the data from Table 1 with machine M3 removed and with the value p1 changed to 7, so that a1 = p1 = 7, α1 = π1 = 3, b2 = p2 = 3, β2 = π2 = 1, b3 = p3 = 5, and β3 = π3 = 4. For machine M1 we need to solve the all-capacities knapsack problem

max 3x1
subject to 4x1 ≤ 4
x1 ∈ {0, 1}

which results in the A-matrix shown in Table 2. Similarly, for machine M2 we have to solve the all-capacities knapsack problem

max y2 + 4y3
subject to 2y2 + y3 ≤ 3
y2, y3 ∈ {0, 1}

which results in the B-matrix shown in Table 3. Assuming that W1(t1) ≥ W2(t2) we find the value $C^{(1)}$ = 5 achieved for t1 = 4 and t2 = 1. Assuming that W1(t1) < W2(t2) we find the value $C^{(2)}$ = 8 achieved for t2 = 0 and t1 = 0. The optimal makespan is equal to 5. See Fig. 2 for the resulting optimal schedule.

Table 2. A-matrix.
t1        0  1  2  3  4
W1(t1)    7  7  7  7  4

Table 3. B-matrix.
t2        0  1  2  3
W2(t2)    8  4  4  3

Figure 2. An optimal schedule for Example 2.

Notice that the dynamic programming algorithm outlined in [5] does not take advantage of all features of the two knapsack problems involved and therefore requires much more time.

2.3. FPTAS

We now convert the pseudopolynomial dynamic programming algorithm for problem PD2|res111, Bi|Cmax into a fully polynomial-time approximation scheme for this problem. To achieve this purpose we use a popular rounding technique introduced by Ibarra and Kim [8].

In this subsection, we refer to an instance of the original problem PD2|res111, Bi|Cmax as Instance I. Given an ε > 0 and Instance I, use (1) to compute A, B and R, and define δ = εR/n. Define an instance of problem PD2|res111, Bi|Cmax with the processing times

\[
\tilde{a}_j = \lfloor a_j/\delta \rfloor, \quad \tilde{a}_j - \tilde{\alpha}_j = \lfloor (a_j - \alpha_j)/\delta \rfloor, \quad j \in N_1; \qquad
\tilde{b}_k = \lfloor b_k/\delta \rfloor, \quad \tilde{b}_k - \tilde{\beta}_k = \lfloor (b_k - \beta_k)/\delta \rfloor, \quad k \in N_2,
\]

and call it Instance $\tilde{I}$. Here ⌊x⌋ denotes the largest integer that does not exceed x. Our algorithm solves Instance $\tilde{I}$ and converts the resulting schedule into an approximate solution to the original Instance I. Notice that due to the performed rounding, the dynamic programming algorithm that solves Instance $\tilde{I}$ generates a reduced number of states.

Algorithm FPTPD2
1. Given Instance I and an ε > 0, define Instance $\tilde{I}$ as above.
2. For Instance $\tilde{I}$, run the dynamic programming algorithm from Section 2.2. Call the found schedule $\tilde{S}$. Let $N_i^R$ and $N_i^0$ be the subsets of the resource and the non-resource jobs of set Ni, i = 1, 2, in schedule $\tilde{S}$.
3. Process the jobs from the original Instance I in accordance with the subsets $N_i^R$ and $N_i^0$. Call the resulting schedule Sε. Stop.
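The dynamic program of Section 2.2 and the rounding steps of Algorithm FPTPD2 can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not the authors' implementation: all function names are hypothetical, the data are assumed to be nonnegative integers with R > 0, and for brevity each table entry stores the chosen item set, which costs more time and memory than the O(nR) bound of Theorem 2.

```python
import math


def all_capacities(weights, profits, cap):
    """table[c] = (max profit, chosen items) for knapsack capacity c = 0..cap."""
    table = [(0, ())] * (cap + 1)
    for j, (w, v) in enumerate(zip(weights, profits)):
        for c in range(cap, w - 1, -1):  # downward: each item used at most once
            cand = table[c - w][0] + v
            if cand > table[c][0]:
                table[c] = (cand, table[c - w][1] + (j,))
    return table


def scan(Wa, Wb, strict):
    """Smallest max{ta + tb, Wa[ta]} over pairs with Wb[tb] <= Wa[ta] (< if strict)."""
    best, tb = (math.inf, 0, 0), 0
    for ta, wa in enumerate(Wa):
        # Wa is non-increasing, so the pointer tb never moves backwards.
        while tb < len(Wb) and (Wb[tb] >= wa if strict else Wb[tb] > wa):
            tb += 1
        if tb == len(Wb):
            break
        best = min(best, (max(ta + tb, wa), ta, tb))
    return best


def solve_dp(a, alpha, b, beta):
    """Return (makespan, resource jobs on M1, resource jobs on M2)."""
    wa = [x - y for x, y in zip(a, alpha)]
    wb = [x - y for x, y in zip(b, beta)]
    T1 = all_capacities(wa, alpha, sum(wa))
    T2 = all_capacities(wb, beta, sum(wb))
    W1 = [sum(a) - profit for profit, _ in T1]  # second row of the A-matrix
    W2 = [sum(b) - profit for profit, _ in T2]  # second row of the B-matrix
    c1, t1, t2 = scan(W1, W2, strict=False)     # case W1(t1) >= W2(t2)
    c2, s2, s1 = scan(W2, W1, strict=True)      # case W1(t1) <  W2(t2)
    if c2 < c1:
        c1, t1, t2 = c2, s1, s2
    return c1, set(T1[t1][1]), set(T2[t2][1])


def makespan(a, alpha, b, beta, R1, R2):
    """Makespan of the block-structured schedule for a given resource assignment."""
    res1 = sum(a[j] - alpha[j] for j in R1)
    res2 = sum(b[k] - beta[k] for k in R2)
    load1 = res1 + sum(a[j] for j in range(len(a)) if j not in R1)
    load2 = res2 + sum(b[k] for k in range(len(b)) if k not in R2)
    return max(load1, load2, res1 + res2)


def fptpd2(a, alpha, b, beta, eps):
    """Steps 1-3 of Algorithm FPTPD2 (assumes R > 0)."""
    n = len(a) + len(b)
    R = max(sum(x - y for x, y in zip(a, alpha)),
            sum(x - y for x, y in zip(b, beta)))
    delta = eps * R / n
    ta = [math.floor(x / delta) for x in a]
    tb = [math.floor(x / delta) for x in b]
    # The rounded reductions follow from a~ - alpha~ = floor((a - alpha) / delta).
    tal = [s - math.floor((x - y) / delta) for s, x, y in zip(ta, a, alpha)]
    tbe = [s - math.floor((x - y) / delta) for s, x, y in zip(tb, b, beta)]
    _, R1, R2 = solve_dp(ta, tal, tb, tbe)
    return makespan(a, alpha, b, beta, R1, R2), R1, R2
```

On the data of Example 2, `solve_dp([7], [3], [3, 5], [1, 4])` reproduces the second rows of Tables 2 and 3 internally and returns the makespan 5, with the resource given to the single job on M1 and to the job with b = 5 on M2.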

THEOREM 3: For schedule Sε found by Algorithm FPTPD2 the inequality

\[
\frac{C_{\max}(S_\varepsilon)}{C_{\max}(S^*)} \le 1 + \varepsilon
\]

holds. The running time of the algorithm does not exceed O(n^2/ε).

PROOF: Given Instance I, introduce an instance of problem PD2|res111, Bi|Cmax with the processing times defined as

\[
\bar{a}_j = \delta \tilde{a}_j, \quad \bar{a}_j - \bar{\alpha}_j = \delta (\tilde{a}_j - \tilde{\alpha}_j), \quad j \in N_1; \qquad
\bar{b}_k = \delta \tilde{b}_k, \quad \bar{b}_k - \bar{\beta}_k = \delta (\tilde{b}_k - \tilde{\beta}_k), \quad k \in N_2,
\]

respectively, and call it Instance $\bar{I}$. It follows that a schedule $\bar{S}$ that is optimal for Instance $\bar{I}$ is also associated with the same subsets $N_i^R$ and $N_i^0$ of jobs as in schedule $\tilde{S}$, so that

\[
C_{\max}(\bar{S}) = \max \Big\{ \sum_{j \in N_1^R} (\bar{a}_j - \bar{\alpha}_j) + \sum_{j \in N_1^0} \bar{a}_j,\ \ \sum_{k \in N_2^R} (\bar{b}_k - \bar{\beta}_k) + \sum_{k \in N_2^0} \bar{b}_k,\ \ \sum_{j \in N_1^R} (\bar{a}_j - \bar{\alpha}_j) + \sum_{k \in N_2^R} (\bar{b}_k - \bar{\beta}_k) \Big\}.
\]

Recall that the processing times aj in Instance I are obtained by extending the times $\bar{a}_j$ to their original values by no more than δ each. The same holds for other time parameters in Instances I and $\bar{I}$. Thus, for schedule Sε we have that

\[
\begin{aligned}
C_{\max}(S_\varepsilon) \le \max \Big\{ & \sum_{j \in N_1^R} (\bar{a}_j - \bar{\alpha}_j) + \sum_{j \in N_1^0} \bar{a}_j + n_1 \delta,\ \ \sum_{k \in N_2^R} (\bar{b}_k - \bar{\beta}_k) + \sum_{k \in N_2^0} \bar{b}_k + n_2 \delta, \\
& \sum_{j \in N_1^R} (\bar{a}_j - \bar{\alpha}_j) + \sum_{k \in N_2^R} (\bar{b}_k - \bar{\beta}_k) + (n_1 + n_2) \delta \Big\} \\
\le{} & C_{\max}(\bar{S}) + n\delta = C_{\max}(\bar{S}) + \varepsilon R.
\end{aligned}
\]

Since both $C_{\max}(\bar{S})$ and R are lower bounds on the optimal makespan for the original Instance I, we have that Cmax(Sε) ≤ (1 + ε)Cmax(S∗).

The running time of Algorithm FPTPD2 is determined by the running time of the dynamic programming algorithm used in Step 2, which requires $O(n\tilde{R})$ time, where

\[
\tilde{R} = \max \Big\{ \sum_{j \in N_1} (\tilde{a}_j - \tilde{\alpha}_j),\ \sum_{k \in N_2} (\tilde{b}_k - \tilde{\beta}_k) \Big\}.
\]

By definition, $\tilde{R} \le n/\varepsilon$. Thus, we conclude that the running time of Algorithm FPTPD2 does not exceed O(n^2/ε).

Notice that the FPTAS outlined in [5] requires O(n^4/ε^3) time.

3. ARBITRARY NUMBER OF MACHINES: APPROXIMATION ALGORITHMS

In this section, we show that problem PD|res111, Bi|Cmax in which the number of machines is part of the input admits an approximation algorithm with a worst-case ratio 1.5 + ε for any given positive ε. We also show how our approach can be extended to develop a (3 + ε)-approximation algorithm for a more general problem PD|res1σσ, Int|Cmax based on a simpler reasoning than that employed in [7] for deriving the same bound for a less general problem PD|res1σσ, Lin|Cmax.

3.1. Binary Scenario

Our algorithm for finding an approximate solution of problem PD|res111, Bi|Cmax consists of two phases. In the first phase, we apply an FPTAS to an integer programming problem in which the optimal value of the objective function is a lower bound on the optimal makespan Cmax(S∗) for the original problem PD|res111, Bi|Cmax. The solution found in the first phase generates an instance of the static problem PD|res111|Cmax with fixed processing times. In the second phase we apply a 1.5-approximation algorithm from [11] to the obtained instance of problem PD|res111|Cmax. As a result we determine a suboptimal schedule SH for the original problem PD|res111, Bi|Cmax, such that for any positive ε the bound

\[
\frac{C_{\max}(S_H)}{C_{\max}(S^*)} \le \frac{3}{2} + \varepsilon \qquad (2)
\]

holds.

Consider the following integer linear programming problem (ILP):

Problem ILP:
\[
\begin{aligned}
\text{Minimize } \ & C \\
\text{s.t. } \ & \sum_{i=1}^{m} \sum_{j \in N_i} (p_j - \pi_j) x_j \le C, \\
& \sum_{j \in N_i} p_j - \sum_{j \in N_i} \pi_j x_j \le C, \quad i = 1, \ldots, m, \\
& x_j \in \{0, 1\}, \quad j \in N.
\end{aligned}
\]

Problem ILP can be seen as a relaxation of problem PDm|res111, Bi|Cmax. Here xj is equal to 1 if job j is given the resource and is equal to zero, otherwise. In problem ILP we simultaneously minimize the maximum workload over all machines and the total processing time of all resource jobs. Let C∗ be the optimal objective value of problem ILP.

Notice that for problem PDm|res111, Bi|Cmax with m ≥ 3 there may exist an optimal schedule S∗ such that Cmax(S∗) > C∗. To see this, turn to the data set of Example 1 given in Table 1. The value of C∗ is equal to 4, achieved for x1 = x3 = x5 = 1 and x2 = x4 = 0. However, the optimal makespan is 5; see Fig. 1. Thus, C∗ is a lower bound on the optimal makespan, which in general is not tight.

Our approximation algorithm for problem PD|res111, Bi|Cmax consists of two phases. First, we find an approximate solution to problem ILP with the value of the function that does not exceed (1 + δ)C∗, where δ is an appropriately chosen fraction of a given accuracy parameter ε. The output of the phase is a resource allocation that is fairly close to that in an optimal schedule. To find a suboptimal schedule for this resource allocation we use an approximation algorithm for the static problem PD|res111|Cmax, which essentially behaves as a 1.5-approximation algorithm. To achieve a required ratio of 1.5 + ε for a given positive ε, we define δ = ε/1.5.

We start the first phase by introducing problem ILP(C), a parametric integer linear programming problem with a parameter C:

Problem ILP(C):
\[
\begin{aligned}
\text{Minimize } \ & \sum_{i=1}^{m} \sum_{j \in N_i} (p_j - \pi_j) x_j \\
\text{s.t. } \ & \sum_{j \in N_i} p_j - \sum_{j \in N_i} \pi_j x_j \le C, \quad i = 1, \ldots, m, \\
& x_j \in \{0, 1\}, \quad j \in N.
\end{aligned}
\]

In problem ILP(C) we minimize the total processing time of all resource jobs, provided that the workload on each machine is bounded by C. For a fixed C, let z∗(C) denote the optimal value of the objective function in problem ILP(C).

By decomposing problem ILP(C) into m knapsack problems, we find an approximate solution to that problem with the value of the function z(C) such that the bound z(C) ≤ (1 + δ)z∗(C) holds. Associate problem ILP(C) with the series of the following m knapsack problems KPi(C) (i = 1, . . . , m):

Problem KPi(C):
\[
\begin{aligned}
\text{Minimize } \ & \sum_{j \in N_i} (p_j - \pi_j) x_j \\
\text{subject to } \ & \sum_{j \in N_i} p_j - \sum_{j \in N_i} \pi_j x_j \le C, \\
& x_j \in \{0, 1\}, \quad j \in N_i.
\end{aligned}
\]

In each problem KPi(C) we minimize the total processing time of the resource jobs on machine Mi, provided that the workload on that machine does not exceed C. Let $z_i^*(C)$ denote the optimal value of the objective function in problem KPi(C). Comparing problems ILP(C) and KPi(C), observe that $z^*(C) = \sum_{i=1}^{m} z_i^*(C)$.

Each problem KPi(C) is a minimization knapsack problem, so that using the corresponding FPTAS we can find a solution with a value of the objective function zi(C) such that $z_i(C) \le (1 + \delta) z_i^*(C)$. Running such an FPTAS takes O(n(log n + (1/δ^2) log(1/δ))) time for each i; see, e.g., [10]. The values of the decision variables obtained for all problems KPi(C), i = 1, . . . , m, define a feasible solution of problem ILP(C), so that for the obtained value of the objective function z(C) we deduce

\[
z(C) = \sum_{i=1}^{m} z_i(C) \le \sum_{i=1}^{m} (1 + \delta) z_i^*(C) = (1 + \delta) z^*(C),
\]

i.e., the described procedure is an FPTAS for solving problem ILP(C) with a fixed C.

In problem ILP the largest value of C does not exceed the sum of all reduced processing times, as if each job is a resource job. Besides, C cannot be larger than the workload of a machine, provided that none of the jobs assigned to that machine has been given the resource. On the other hand, C cannot be smaller than the workload of a machine, provided that each of the jobs assigned to that machine has been given the resource. Thus, for problem ILP we have that C ∈ [Y1, Y2], where

\[
Y_1 = \max \Big\{ \sum_{j \in N_i} (p_j - \pi_j) \ \Big|\ 1 \le i \le m \Big\}; \qquad
Y_2 = \max \Big\{ \sum_{i=1}^{m} \sum_{j \in N_i} (p_j - \pi_j),\ \max \Big\{ \sum_{j \in N_i} p_j \ \Big|\ 1 \le i \le m \Big\} \Big\}.
\]
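The decomposition of ILP(C) into the problems KPi(C) and the role of the bounds Y1 and Y2 can be illustrated on a small instance. The sketch below is not the FPTAS-based procedure of this section: it swaps the knapsack FPTAS for an exact dynamic program (so that δ = 0), assumes integer data, and bisects over integer C in [Y1, Y2] for the smallest C with z∗(C) ≤ C, which is C∗. It relies on the observation that z∗(C) does not increase as C grows, so the test z∗(C) ≤ C is monotone in C. The function names and the three-machine instance are hypothetical.

```python
def kp_min(p, pi, C):
    """Exact KP_i(C): least total reduced time of resource jobs with workload <= C."""
    need = max(0, sum(p) - C)            # processing time that must be saved
    INF = float("inf")
    dp = [0] + [INF] * need              # dp[s]: min cost to save at least s
    for pj, qj in zip(p, pi):
        for s in range(need, 0, -1):     # downward: each job used at most once
            cand = dp[max(0, s - qj)] + (pj - qj)
            if cand < dp[s]:
                dp[s] = cand
    return dp[need]


def z_star(machines, C):
    """z*(C) as the sum of the m independent knapsack optima z_i*(C)."""
    return sum(kp_min(p, pi, C) for p, pi in machines)


def solve_ilp(machines):
    """Smallest integer C in [Y1, Y2] with z*(C) <= C."""
    reduced = [sum(pj - qj for pj, qj in zip(p, pi)) for p, pi in machines]
    y1 = max(reduced)
    y2 = max(sum(reduced), max(sum(p) for p, _ in machines))
    lo, hi = y1, y2
    while lo < hi:
        mid = (lo + hi) // 2
        if z_star(machines, mid) <= mid:  # feasible for ILP: try a smaller C
            hi = mid
        else:
            lo = mid + 1
    return lo


# Hypothetical instance: one (p, pi) pair of lists per machine.
machines = [([7], [3]), ([3, 5], [1, 4]), ([4, 2], [2, 1])]
c_star = solve_ilp(machines)
```

For this instance Y1 = 4 and Y2 = 10; the values C = 4 and C = 5 are rejected because z∗(4) = 7 and z∗(5) = 6, while z∗(6) = 5 ≤ 6.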

The FPTAS for problem ILP(C) can be embedded into a binary search procedure that for problem ILP determines a value Cδ such that Cδ ≤ (1 + δ)C∗. The search starts from the midpoint of the current interval [Y1, Y2]. If for the current value of C, the value z(C) found by the FPTAS for problem ILP(C) does not exceed (1 + δ)C, then the obtained solution is also feasible for problem ILP and a better value of C can be found by taking the midpoint of the left part of the current interval; otherwise, we take the midpoint of the right part of the current interval. The search stops having found a solution to problem ILP with the value of the objective function Cδ ≤ (1 + δ)C∗. The running time for finding Cδ does not exceed O(nm log(Y2 − Y1)(log n + (1/δ^2) log(1/δ))), which is polynomial in both the length of the input of problem ILP and 1/ε, i.e., the described procedure is an FPTAS for problem ILP.

We now pass to the second phase of the algorithm. The obtained approximate solution to problem ILP generates the split of the jobs in the original problem into the resource jobs (those with xj = 1) and the remaining non-resource jobs. We obtain an instance of the problem PD|res111|Cmax with parallel dedicated machines and a single resource studied in [11]. To find an approximate solution to the original problem PD|res111, Bi|Cmax we apply Algorithm GT from [11] to the obtained instance of problem PD|res111|Cmax. Recall that Algorithm GT if applied to problem PD|res111|Cmax is essentially a group technology heuristic algorithm that schedules on each machine a batch of the resource jobs and a batch of the non-resource jobs. It creates a schedule SH such that Cmax(SH) ≤ ρL, where

\[
\rho = \begin{cases} \dfrac{3}{2} - \dfrac{1}{2m}, & \text{if } m \text{ is odd}, \\[2mm] \dfrac{3}{2} - \dfrac{1}{2(m-1)}, & \text{if } m \text{ is even}, \end{cases}
\]

and L is a lower bound on the optimal makespan of the instance of problem PD|res111|Cmax computed as the maximum between the largest machine workload and the total processing time of the resource jobs. In our case, such a lower bound L does not exceed Cδ. Thus, we have

\[
C_{\max}(S_H) \le \rho C_\delta \le 1.5 (1 + \delta) C^* = (1.5 + \varepsilon) C^* \le (1.5 + \varepsilon) C_{\max}(S^*),
\]

which corresponds to (2). The second phase of our procedure, i.e., running Algorithm GT, requires only O(nm) time; therefore the overall running time of the algorithm is polynomial.

3.2. Integer Scenario

We now turn to problem PD|res1σσ, Int|Cmax under the integer scenario of resource allocation. If a job j ∈ N is given τ units of the resource, 0 ≤ τ ≤ σ, then its actual processing time is equal to a given value pjτ. Under the linear scenario of resource allocation, we are given the values pj and πj, j ∈ N, and the actual processing time of job j that is allocated τ units of the resource is equal to pjτ = pj − τπj.

For problem PD|res1σσ, Lin|Cmax, a (3 + ε)-approximation algorithm is designed in [7]. The algorithm is essentially a two-phase procedure. In the first phase, the resource allocations are found by solving a quadratic integer programming problem by an FPTAS. The solution found in the first phase generates an instance of the static problem PD|res1σσ|Cmax with fixed processing times. In the second phase, the jobs are allocated to the machines using a simple greedy algorithm. As a result a suboptimal schedule SH is found, such that for any positive ε the bound

\[
\frac{C_{\max}(S_H)}{C_{\max}(S^*)} \le 3 + \varepsilon \qquad (3)
\]

holds.

Later we demonstrate that in fact the use of non-linear models in the first phase of this process is not necessary, even for a more general problem PD|res1σσ, Int|Cmax. In fact, we reduce the first-phase actions to solving a series of integer linear programming problems by an FPTAS. Our reasoning is quite similar to that described in Section 3.1; however, here we apply an FPTAS not to linear knapsack problems, but rather to so-called multiple-choice linear knapsack problems. As a result we gain in the overall running time and simplify the justification of the algorithm. To achieve the required ratio (3) for a given positive ε, we define δ = ε/2.

Consider the following integer linear programming problem ILPσ. It has a feasible solution if there is a feasible schedule for problem PD|res1σσ, Int|Cmax with a makespan at most C. A variable xjτ, where j ∈ N and 0 ≤ τ ≤ σ, is equal to 1 if job j is given τ units of the resource and is equal to zero, otherwise.

\[
\begin{aligned}
& \sum_{i=1}^{m} \sum_{j \in N_i} \sum_{\tau=0}^{\sigma} \tau p_{j\tau} x_{j\tau} \le \sigma C, && (4) \\
& \sum_{j \in N_i} \sum_{\tau=0}^{\sigma} p_{j\tau} x_{j\tau} \le C, \quad i = 1, \ldots, m, && (5) \\
& \sum_{\tau=0}^{\sigma} x_{j\tau} = 1, \quad j \in N, && (6) \\
& x_{j\tau} \in \{0, 1\}. && (7)
\end{aligned}
\]
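For a concrete allocation, conditions (4)–(7) are easy to check directly. The following minimal sketch does so for an allocation in which job j receives tau[j] units, i.e., xjτ = 1 exactly when τ = tau[j]; the function name, its argument layout and the sample data are hypothetical and only illustrate how the constraints are read.

```python
def satisfies_ilp_sigma(p, machines, tau, sigma, C):
    """Check (4)-(7) for the allocation x[j][t] = 1 iff t == tau[j].

    p[j][t]  -- processing time of job j when given t units (t = 0..sigma)
    machines -- list of job-index lists N_1, ..., N_m
    tau      -- tau[j] is the number of resource units given to job j
    """
    # (6)-(7): exactly one level per job holds by construction if tau is valid.
    if any(not 0 <= tau[j] <= sigma for jobs in machines for j in jobs):
        return False
    # (4): the total resource consumption is at most sigma * C.
    consumption = sum(tau[j] * p[j][tau[j]] for jobs in machines for j in jobs)
    if consumption > sigma * C:
        return False
    # (5): the workload on every machine is at most C.
    return all(sum(p[j][tau[j]] for j in jobs) <= C for jobs in machines)


# Hypothetical data: three jobs, two machines, sigma = 2.
p = [[6, 4, 3], [5, 3, 2], [4, 4, 1]]
machines = [[0], [1, 2]]
ok = satisfies_ilp_sigma(p, machines, tau=[1, 0, 2], sigma=2, C=6)
```

In the sample, the allocation tau = [1, 0, 2] consumes 1·4 + 0·5 + 2·1 = 6 ≤ 2·6 units of resource-time and gives machine workloads 4 and 6, so all constraints hold for C = 6 but (5) fails for C = 5.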

Problem $ILP_\sigma$ can be seen as a relaxation of the original problem $PD|res1\sigma\sigma, Int|C_{\max}$. The left-hand side of (4) represents the total resource consumption of all jobs. Since at most $\sigma$ units of the resource can be allocated at any time, the total resource consumption is at most $\sigma C$, which is expressed in (4). If there is a schedule with makespan not exceeding $C$, the workload on each machine is at most $C$. This is guaranteed by the inequalities (5).

Introduce problem $ILP_\sigma(C)$, a parametric integer linear programming problem with a parameter $C$:

Minimize

$$\sum_{i=1}^{m} \sum_{j \in N_i} \sum_{\tau=0}^{\sigma} \tau p_{j\tau} x_{j\tau}$$

subject to

$$\sum_{j \in N_i} \sum_{\tau=0}^{\sigma} p_{j\tau} x_{j\tau} \le C, \quad i = 1, \ldots, m,$$

$$\sum_{\tau=0}^{\sigma} x_{j\tau} = 1, \quad j \in N,$$

$$x_{j\tau} \in \{0, 1\}.$$

In problem $ILP_\sigma(C)$ we minimize the total resource consumption of all jobs, provided that the workload on each machine is bounded by $C$. By decomposing problem $ILP_\sigma(C)$ into $m$ integer programs, we find an approximate solution to that problem with the value of the objective function $z(C)$ such that the bound $z(C) \le (1 + \delta) z^*(C)$ holds, where $z^*(C)$ denotes the optimal value of the objective function in problem $ILP_\sigma(C)$.

Associate problem $ILP_\sigma(C)$ with the series of the following $m$ integer programs $MCKP_i(C)$, $i = 1, \ldots, m$:

Minimize

$$\sum_{j \in N_i} \sum_{\tau=0}^{\sigma} \tau p_{j\tau} x_{j\tau}$$

subject to

$$\sum_{j \in N_i} \sum_{\tau=0}^{\sigma} p_{j\tau} x_{j\tau} \le C,$$

$$\sum_{\tau=0}^{\sigma} x_{j\tau} = 1, \quad j \in N_i,$$

$$x_{j\tau} \in \{0, 1\}.$$

In each problem $MCKP_i(C)$ we minimize the total resource consumption on machine $M_i$, provided that the workload on that machine does not exceed $C$. Let $z_i^*(C)$ denote the optimal value of the objective function in problem $MCKP_i(C)$. It follows that $z^*(C) = \sum_{i=1}^{m} z_i^*(C)$.

Each of the programs $MCKP_i(C)$ is a multiple-choice knapsack problem (MCKP) in the minimization form. The MCKP is a generalization of the classical knapsack problem. We are given $\sigma + 1$ mutually disjoint classes $Q_0, Q_1, \ldots, Q_\sigma$ of items to be packed into a knapsack of capacity $C$. Each item $j \in Q_\tau$, $0 \le \tau \le \sigma$, has a cost $c_{j\tau}$ and a weight $w_{j\tau}$, and the problem is to choose exactly one item from each class so that the total cost is minimized without exceeding the weight capacity $C$. In the case of problem $MCKP_i(C)$, there are $|N_i|$ items in each of the $\sigma + 1$ classes, so that the total number of items is $t = O(|N_i|\sigma)$. An item $j$ of class $Q_\tau$ is associated with job $j \in N_i$ that is given $\tau$ units of the resource, where $\tau = 0, 1, \ldots, \sigma$. The weights correspond to the processing times $p_{j\tau}$ and the costs are equal to $\tau p_{j\tau}$.

An FPTAS for the MCKP is outlined in [10, page 338]. To achieve an accuracy of $1 + \delta$ for the MCKP an overall running time of $O(t\sigma/\delta)$ is required. Thus, program $MCKP_i(C)$ can be approximated with performance ratio $1 + \delta$ in $O(|N_i|\sigma^2/\delta)$ time. The values of the decision variables obtained for all problems $MCKP_i(C)$, $i = 1, \ldots, m$, define a feasible solution of problem $ILP_\sigma(C)$, so that for the obtained value of the objective function $z(C)$ we deduce $z(C) \le (1 + \delta) z^*(C)$, i.e., the described procedure is an FPTAS for solving problem $ILP_\sigma(C)$ with a fixed $C$.

Embedding the FPTAS for $ILP_\sigma(C)$ into a binary search on $C$ we can calculate the smallest integer $C^*$ for which problem $ILP_\sigma(C)$ has a solution with the objective function value $z^*(C) \le (1 + \delta)\sigma C^*$. Consequently, $C^*$ is a lower bound on the optimal makespan $C_{\max}(S^*)$ and the corresponding solution vector $x^*$ is a feasible solution of $ILP_\sigma$ with constraint (4) relaxed to

$$\sum_{i=1}^{m} \sum_{j \in N_i} \sum_{\tau=0}^{\sigma} \tau p_{j\tau} x_{j\tau} \le (1 + \delta)\sigma C^*. \tag{8}$$
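The first phase described above can be illustrated by the following Python sketch. It is not the FPTAS of [10]: to keep the example short, each $MCKP_i(C)$ is solved exactly by a pseudopolynomial dynamic program over workloads, which assumes integer processing times. Each job contributes one row of $\sigma + 1$ alternatives, one per value of $\tau$, so that exactly one amount is chosen per job as required by constraint (6). The function names `mckp_min`, `ilp_value` and `smallest_C` are hypothetical.

```python
import math
from typing import List, Optional, Sequence


def mckp_min(durations: Sequence[Sequence[int]], C: int) -> Optional[int]:
    """Exact DP for MCKP_i(C); durations[k][t] is p_{j,t} for the k-th job of N_i.

    Returns the minimum of sum(t * p_{j,t}) subject to sum(p_{j,t}) <= C,
    or None if no choice of amounts fits into the workload bound C.
    """
    best = [0] + [math.inf] * C  # best[w]: min cost with total duration w
    for row in durations:
        nxt = [math.inf] * (C + 1)
        for w, cost in enumerate(best):
            if cost == math.inf:
                continue
            for t, p in enumerate(row):  # give t units to this job
                if w + p <= C:
                    nxt[w + p] = min(nxt[w + p], cost + t * p)
        best = nxt
    z = min(best)
    return None if z == math.inf else z


def ilp_value(machines: List[Sequence[Sequence[int]]], C: int) -> Optional[int]:
    """Value of ILP_sigma(C) as the sum of the m independent MCKP_i(C) values."""
    total = 0
    for rows in machines:
        z = mckp_min(rows, C)
        if z is None:
            return None
        total += z
    return total


def smallest_C(machines: List[Sequence[Sequence[int]]],
               sigma: int, delta: float = 0.0) -> int:
    """Binary search for the smallest integer C with z(C) <= (1 + delta) * sigma * C."""
    lo = 0
    # With no resource at all the consumption is 0, so this C always works.
    hi = max(sum(row[0] for row in rows) for rows in machines)
    while lo < hi:
        mid = (lo + hi) // 2
        z = ilp_value(machines, mid)
        if z is not None and z <= (1 + delta) * sigma * mid:
            hi = mid
        else:
            lo = mid + 1
    return lo
```

For example, with $\sigma = 1$, two jobs of durations 4 (no resource) or 2 (one unit) on the first machine and one job of duration 3 on the second, `smallest_C([[[4, 2], [4, 2]], [[3, 3]]], sigma=1)` returns 4. With the exact dynamic program the search predicate is monotone in $C$; an approximate solver would need the same care as in the paper's argument.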

Thus, we have found a resource allocation to the jobs, i.e., we have obtained an instance of problem $PD|res1\sigma\sigma|C_{\max}$, such that the total processing time on each machine is at most $C^*$ and the total resource consumption does not exceed $(1 + \delta)\sigma C^*$ with $C^* \le C_{\max}(S^*)$. A resource allocation found in [7] possesses the same properties, so that the greedy algorithm from [7] can be used to find a suboptimal schedule. In the greedy algorithm, each time the first job is scheduled which can be assigned to the corresponding machine without violating the resource constraints, and ties are broken arbitrarily.

We summarize the results of this section as the following statement.

THEOREM 4: Problem $PD|res1\sigma\sigma, Int|C_{\max}$ admits a $(3 + \varepsilon)$-approximation algorithm, while problem $PD|res111, Bi|C_{\max}$ admits a $(3/2 + \varepsilon)$-approximation algorithm. The running times of both algorithms depend polynomially on both the length of the input and $1/\varepsilon$.
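The greedy rule of the second phase, as described before Theorem 4, can be sketched in Python as a simple list-scheduling simulation for the static problem. This is an interpretation of the one-sentence description given here, not a transcription of the algorithm in [7]; the function name `greedy_schedule`, the job encoding `(machine, duration, units)` and the choice to scan the job list in its given order are assumptions.

```python
import heapq
from typing import List, Tuple


def greedy_schedule(jobs: List[Tuple[int, float, int]], m: int, sigma: int) -> float:
    """jobs[k] = (machine index, fixed duration, resource units); returns the makespan."""
    pending = list(range(len(jobs)))
    free_at = [0.0] * m                      # time at which each machine becomes idle
    running: List[Tuple[float, int]] = []    # heap of (finish time, units held)
    available = sigma
    now = 0.0
    makespan = 0.0
    while pending:
        # Start the first job in the list that fits; repeat until none fits.
        started = True
        while started:
            started = False
            for k in pending:
                machine, duration, units = jobs[k]
                if free_at[machine] <= now and units <= available:
                    finish = now + duration
                    free_at[machine] = finish
                    available -= units
                    heapq.heappush(running, (finish, units))
                    makespan = max(makespan, finish)
                    pending.remove(k)
                    started = True
                    break
        if pending:
            if not running:
                raise ValueError("a pending job needs more than sigma units")
            # Advance to the next completion and release the resource it held.
            now, units = heapq.heappop(running)
            available += units
            while running and running[0][0] <= now:
                available += heapq.heappop(running)[1]
    return makespan
```

For instance, two resource jobs of duration 3 on different machines give a makespan of 6 when $\sigma = 1$ and of 3 when $\sigma = 2$.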

4. FIXED NUMBER OF MACHINES: PTAS

In this section, we present a PTAS for problem $PDm|res111, Bi|C_{\max}$. For fixed $m$ and $\varepsilon \in\, ]0, 1[$ the running time of our scheme is polynomial in the size of the problem input, but not in $m$ and $1/\varepsilon$.

Let $S^*$ be a schedule that is optimal for problem $PDm|res111, Bi|C_{\max}$. For job $j \in N$ its actual processing time in schedule $S^*$ is denoted by $f_j^*$, i.e., $f_j^* = p_j$ if $j$ is a non-resource job and $f_j^* = p_j - \pi_j$ if it is a resource job. For a subset of jobs $Q \subseteq N$, define the sum of actual processing times in an optimal schedule by $f^*(Q)$.

Let $S_H$ be a heuristic schedule found by the algorithm presented in Section 3 with $\varepsilon = 0.5$. It follows that

$$C := \frac{1}{2} C_{\max}(S_H)$$

is a lower bound on the optimal makespan $C^* := C_{\max}(S^*)$. On the other hand, $C_{\max}(S_H) \ge C^*$, so that $C \le C^* \le 2C$.

Recall that for a static version of our problem, i.e., problem $PDm|res111|C_{\max}$, a PTAS is developed in [11]. To design a PTAS for problem $PDm|res111, Bi|C_{\max}$ we will "guess" a resource allocation that is very close to that in an optimal schedule $S^*$. After that, our algorithm follows the lines of the PTAS for the obtained static problem.

The PTAS splits the jobs into big, medium and small according to their durations in such a way that the processing time of every big job exceeds the duration of any small job by a factor of more than $1/\delta$ (for the threshold $\delta$ defined below), while the total processing time of all medium jobs is a very small multiple of the optimal makespan. Introduce the sequence of real numbers $\delta_1, \delta_2, \ldots$ such that $\delta_t := (\varepsilon/m)^{2^t}$. For $\delta$ equal to one of these values $\delta_t$, the processing time of a job is called big if it exceeds $\delta C$; it is called small if it is no greater than $\delta^2 C$; otherwise it is called medium. Since we do not know an optimal resource allocation, it is, for instance, possible that the processing time of a job is big (provided that no resource is given to that job), while if the job is given the resource its duration becomes medium or even small. Thus, we have to split the jobs into six classes as shown in Table 4.

Table 4. Six possible classes of a job: B, big; M, medium; S, small.

Class of job j                   1   2   3   4   5   6
Non-resource duration p_j        B   B   B   M   M   S
Resource duration p_j − π_j      B   M   S   M   S   S

For each integer $t$, $t \ge 1$, introduce the set of jobs $N^t := \{j \in N \mid \delta_t^2 C < f_j^* \le \delta_t C\}$. It can be verified that the sets of jobs $N^1, N^2, \ldots$ are mutually disjoint. There exists an integer $t_0$, $1 \le t_0 \le m/\varepsilon$, such that $f^*(N^{t_0}) \le \varepsilon f^*(N)/m \le \varepsilon C^*$ holds, where the last inequality follows from $f^*(N) \le m C^*$. Otherwise, we would have $f^*(N) > \frac{m}{\varepsilon} \cdot \frac{\varepsilon f^*(N)}{m}$, which is impossible. Taking $t$ from 1 to $m/\varepsilon$, eventually we will find a value of $t$ that allows us to find the values of actual processing times as in an optimal schedule for most of the jobs. The explanation below is related to such a $t$. Note that the number of values of $t$ to be examined is constant for fixed $m$ and $\varepsilon$.

Define $\delta = \delta_t$ and determine the resource allocation to the jobs according to their class.

Classes 1–3. For each job $j$ in these classes its non-resource duration is big, i.e., $p_j > \delta C$.
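As a small illustration of the thresholds and of the six classes of Table 4, the following Python sketch computes $\delta_t$ and the class of a single job. The function names `delta_t`, `size` and `job_class` are hypothetical, and the sketch assumes $0 \le \pi_j \le p_j$, so that giving the resource never lengthens a job.

```python
def delta_t(eps: float, m: int, t: int) -> float:
    """delta_t = (eps / m) ** (2 ** t); note that delta_{t+1} = delta_t ** 2."""
    return (eps / m) ** (2 ** t)


def size(duration: float, delta: float, C: float) -> str:
    """Return 'B', 'M' or 'S' for one duration and threshold delta."""
    if duration > delta * C:
        return "B"          # big: exceeds delta * C
    if duration <= delta ** 2 * C:
        return "S"          # small: at most delta^2 * C
    return "M"              # medium: everything in between


# (size without the resource, size with the resource) -> class of Table 4
CLASS_OF = {("B", "B"): 1, ("B", "M"): 2, ("B", "S"): 3,
            ("M", "M"): 4, ("M", "S"): 5, ("S", "S"): 6}


def job_class(p: float, pi: float, delta: float, C: float) -> int:
    """Class of a job with non-resource duration p and resource duration p - pi."""
    return CLASS_OF[(size(p, delta, C), size(p - pi, delta, C))]
```

Since $\delta_{t+1} = \delta_t^2$, the interval $]\delta_t^2 C, \delta_t C]$ of medium durations for one value of $t$ ends exactly where the interval for $t + 1$ begins, which is why the sets $N^t$ are disjoint.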
In an optimal schedule the total processing time of all jobs of big duration on each machine does not exceed $C^* \le 2C$; therefore, there will be at most $\mu := 2/\delta$ such jobs on each machine. This means that to generate resource allocations for the jobs in these classes we need to verify at most $O(n^{m\mu})$ options, one of which will coincide with the allocation in an optimal schedule. Notice that the number of options to be considered is polynomial for fixed $m$ and $\varepsilon$.

Classes 4 and 5. In an optimal schedule the total processing time of all jobs of medium duration, including those from Class 2, on all machines does not exceed $\varepsilon C^*$. Since each medium duration exceeds $\delta^2 C$, it follows that at most $\nu := O(\varepsilon/\delta^2)$ jobs will be of medium duration in an optimal schedule. Thus, we have $O(n^\nu)$ resource allocations to be verified.

Class 6. The duration of each job of this class remains small irrespective of the resource allocation. Let $N_i^{(6)}$ denote the set of jobs of this class to be processed on machine $M_i$, $1 \le i \le m$. Unlike for the previous classes, here we cannot afford full enumeration since Class 6 may contain too many jobs. Instead, to achieve an optimal resource allocation we will try to minimize the total workload on each machine simultaneously with the total processing time of all resource jobs. This can be done for each machine $M_i$ separately by solving the following bicriteria problem of Boolean programming that we call problem $V_i$:

$$\begin{array}{ll} \text{minimize} & F_1(z) = \sum_{j \in N_i^{(6)}} p_j (1 - z_j) \\ \text{minimize} & F_2(z) = \sum_{j \in N_i^{(6)}} (p_j - \pi_j) z_j \\ & z_j \in \{0, 1\}, \quad j \in N_i^{(6)}, \end{array}$$

where $z_j$ is equal to 1 if job $j$ is assigned the resource and equal to 0 otherwise. Thus, for machine $M_i$, the function $F_1$ represents the total processing time of all non-resource jobs of Class 6, while the function $F_2$ represents the total processing time of all resource jobs of this class.

Recall that a solution $z'$ is called Pareto-optimal if there exists no solution $z''$ such that $F_1(z'') \le F_1(z')$ and $F_2(z'') \le F_2(z')$, where at least one of these relations holds as a strict inequality. The set of all Pareto-optimal solutions is known as the efficiency frontier.

It can be seen that there exists an optimal schedule for the original problem that is related to a resource allocation that is Pareto-optimal for all problems $V_i$. Otherwise, it would be possible to find a resource allocation for the jobs of Class 6 such that both the total duration of the resource jobs and the total duration of the non-resource jobs on some machine $M_i$ are not larger, and one duration is strictly smaller, than the optimal values of the functions of problem $V_i$.

For problem $V_i$, minimizing one of the objective functions subject to a constraint that the value of the other function is bounded is essentially a knapsack problem. Thus, problem $V_i$ as a problem of simultaneous minimization of these functions is NP-hard and solving it in polynomial time is unlikely. For our purposes, however, it suffices to find a solution to problem $V_i$ that can be seen as an approximation of the set of Pareto-optimal solutions.

Given a positive $\varepsilon$, a feasible solution $z'$ is called a $(1 + \varepsilon)$-approximation of solution $z''$ if $F_1(z') \le (1 + \varepsilon) F_1(z'')$ and $F_2(z') \le (1 + \varepsilon) F_2(z'')$. The set of $(1 + \varepsilon)$-approximations of all Pareto-optimal points is called a $(1 + \varepsilon)$-approximation of the efficiency frontier. As follows from [14], there exists a $(1 + \varepsilon)$-approximation of the efficiency frontier that consists of a number of solutions that is polynomial in both $1/\varepsilon$ and the number of jobs in set $N_i^{(6)}$. For problem $V_i$, an FPTAS finds a $(1 + \varepsilon)$-approximation of the efficiency frontier, and its running time is polynomial in both $1/\varepsilon$ and the number of jobs in set $N_i^{(6)}$. For our purposes, we may adapt an FPTAS for a more general multi-objective knapsack problem [4]. See also Section 13.1 of the book [10] for more information on multi-objective problems of integer programming.

Solve each problem $V_i$, $i = 1, \ldots, m$, by an FPTAS. For the original value of $\varepsilon$, we may set the accuracy of an FPTAS to $\varepsilon/m$. For any $i$, a solution delivered by the FPTAS generates a resource allocation for all jobs of Class 6. Examining all found solutions, we determine a resource allocation which guarantees the following property.

Class 6 Property. For jobs of Class 6 the sum of actual processing times of these jobs on each machine is at most $2\varepsilon C^*/m$ away from the corresponding value $f^*(N_i^{(6)})$ in an optimal schedule, while the total processing time of all resource jobs of this class exceeds the corresponding value in an optimal schedule by at most $\varepsilon C^*$.

Thus, trying all values of $t$ we perform full enumeration of possible resource allocations of the jobs of Classes 1–5 and apply the approximation approach to the jobs of Class 6. As a result, an allocation will be found such that for each job $j$ of Classes 1–5 its actual processing time becomes equal to $f_j^*$, the duration in some optimal schedule. For the same allocation, the remaining jobs will satisfy Class 6 Property above. This allocation generates an instance of the static problem $PDm|res111|C_{\max}$ that we call Instance $I_\varepsilon$.

Let us refer to the PTAS from [11] for problem $PDm|res111|C_{\max}$ as the static PTAS. As an approximate solution to the original problem, we accept a solution delivered by the static PTAS applied to instance $I_\varepsilon$.
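The idea of approximating the efficiency frontier of problem $V_i$ can be illustrated by the following Python sketch. It is not the FPTAS of [4]; it uses a generic trimming scheme in the spirit of [14], in which the pairs $(F_1, F_2)$ generated job by job are grouped into geometric cells and one representative is kept per cell. The function name `approx_frontier`, the per-stage factor $1 + \varepsilon/(2n)$ and the restriction $0 < \varepsilon \le 1$ are assumptions made for this example, and the returned set may contain dominated points.

```python
import math
from typing import Dict, List, Tuple

Point = Tuple[float, float, Tuple[int, ...]]  # (F1, F2, z)


def approx_frontier(p: List[float], pi: List[float], eps: float) -> List[Point]:
    """Return a (1 + eps)-approximation of the efficiency frontier of V_i.

    p[k] and pi[k] describe the k-th job of N_i^(6); 0 < eps <= 1 is assumed.
    """
    n = len(p)
    if n == 0:
        return [(0.0, 0.0, ())]
    # Per-stage rounding factor: n stages compound to at most 1 + eps.
    base = 1.0 + eps / (2 * n)

    def cell(v: float) -> float:
        # Zero gets a cell of its own; positive values use a geometric grid.
        return -math.inf if v <= 0 else math.floor(math.log(v) / math.log(base))

    points: List[Point] = [(0.0, 0.0, ())]
    for pk, pik in zip(p, pi):
        kept: Dict[Tuple[float, float], Point] = {}
        for f1, f2, z in points:
            candidates = (
                (f1 + pk, f2, z + (0,)),         # z_k = 0: non-resource job
                (f1, f2 + pk - pik, z + (1,)),   # z_k = 1: resource job
            )
            for cand in candidates:
                key = (cell(cand[0]), cell(cand[1]))
                kept.setdefault(key, cand)       # one representative per cell
        points = list(kept.values())
    return points
```

Two pairs in the same cell differ by a factor of at most $1 + \varepsilon/(2n)$ in each coordinate, so after $n$ stages every exact pair $(F_1(z), F_2(z))$ has a kept pair within a factor $(1 + \varepsilon/(2n))^n \le 1 + \varepsilon$ in both coordinates, while the number of cells stays polynomial in $n$, $1/\varepsilon$ and the logarithm of the largest value.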
In fact, the static PTAS should be applied to every instance associated with every resource allocation, but for our purposes it suffices to study the performance of the static PTAS applied to $I_\varepsilon$. Take the value of $\delta$ that initiates instance $I_\varepsilon$, and define big, medium and small jobs as the jobs that have big, medium and small duration, respectively. Recall that this definition coincides with the definition of these jobs in the description of the static PTAS. Recall that the static PTAS does the following:

• find all schedules of big jobs with starting times that are multiples of $\delta^2 C$ by full enumeration (the number of the big jobs is constant);
• for each such schedule, determine the amounts of small resource jobs to be processed in the gaps of that schedule by solving a linear programming problem;
• schedule small resource jobs preemptively in these gaps;
• schedule small non-resource jobs preemptively in the remaining gaps;
• get rid of preemptions (some of the jobs can be temporarily discarded);
• append all unscheduled jobs (i.e., all temporarily discarded jobs and all medium jobs);
• select the best of all generated schedules.

Here we do not present a detailed analysis of the performance of our PTAS, since it is basically identical to that given in [11]. The only point that needs clarification concerns the jobs of Class 6, since in Instance $I_\varepsilon$ their durations are not exact. However, since the static PTAS assigns the small jobs preemptively, we can use Class 6 Property to guarantee that for the obtained schedule $S_\varepsilon$ the error $C_{\max}(S_\varepsilon) - C^*$ does not exceed $r\varepsilon C^*$, where $r$ is a constant that only depends on $m$ and $1/\varepsilon$. By normalizing the original value of $\varepsilon$, we conclude that our approach leads to a PTAS for the original problem.

5. CONCLUSION

In this article, we address approximability issues of the problem of scheduling jobs on parallel dedicated machines in the presence of a single renewable speeding-up resource of unit amount, as well as of its generalization with several available units of the resource. In all our algorithms we rely on a certain version of the linear knapsack problem. It is an attractive research goal to extend our approaches to other models of parallel processing. A study of models, including multicriteria ones, that take into consideration either the amount or the cost of the used resources is very appealing; models of this kind are the main topic of research in scheduling with controllable processing times.

ACKNOWLEDGEMENTS

The authors are grateful to Alexander Grigoriev and Marc Uetz of the University of Maastricht for useful discussions at the early stages of this research. The comments by two anonymous referees and an associate editor have contributed to improving the presentation.

REFERENCES

[1] J. Błażewicz, N. Brauner, and G. Finke, "Scheduling with discrete resource constraints," J.Y.-T. Leung (Editor), Handbook of scheduling: Algorithms, models and performance analysis, Chapman & Hall/CRC, London, 2004, pp. 23-1–23-18.
[2] J. Błażewicz, K.H. Ecker, G. Schmidt, and J. Węglarz, Scheduling in computer and manufacturing systems, Springer, Berlin, 1994.
[3] J. Błażewicz, J.K. Lenstra, and A.H.G. Rinnooy Kan, Scheduling subject to resource constraints, Discrete Appl Math 5 (1983), 11–24.
[4] T. Erlebach, H. Kellerer, and U. Pferschy, Approximating multi-objective knapsack problems, Management Sci 48 (2002), 1603–1612.
[5] A. Grigoriev, H. Kellerer, and V.A. Strusevich, Dedicated parallel machine scheduling with a single speeding-up resource, 6th Workshop on models and algorithms for planning and scheduling problems, book of abstracts, 2003, pp. 131–132.
[6] A. Grigoriev, M. Sviridenko, and M. Uetz, Machine scheduling with resource dependent processing times, Math Program 110 (2007), 209–228.
[7] A. Grigoriev and M. Uetz, "Scheduling parallel jobs with linear speedup," T. Erlebach and P. Persiano (Editors), Approximation and online algorithms (WAOA 2005), Lecture Notes Comput Sci 3879 (2006), 203–215.
[8] O.H. Ibarra and C.E. Kim, Fast approximation algorithms for the knapsack and sum of subsets problem, J ACM 22 (1975), 463–468.
[9] A. Janiak and M.Y. Kovalyov, Single machine scheduling subject to deadlines and resource dependent processing times, Eur J Oper Res 94 (1996), 284–291.
[10] H. Kellerer, U. Pferschy, and D. Pisinger, Knapsack problems, Springer, Berlin, 2004.
[11] H. Kellerer and V.A. Strusevich, Scheduling parallel dedicated machines under a single non-shared resource, Eur J Oper Res 147 (2003), 345–364.
[12] H. Kellerer and V.A. Strusevich, Scheduling problems for parallel dedicated machines under multiple resource constraints, Discrete Appl Math 113 (2004), 45–68.
[13] E. Nowicki and S. Zdrzalka, A survey of results for sequencing problems with controllable processing times, Discrete Appl Math 26 (1990), 271–287.
[14] C.H. Papadimitriou and M. Yannakakis, On the approximability of trade-offs and optimal access of web sources, Proc 41st Symp Foundations of Computer Science, 2000, pp. 86–92.
[15] D. Shabtay, Single and a two-resource allocation algorithms for minimizing the maximal lateness in a single machine-scheduling problem, Comput Oper Res 31 (2004), 1303–1315.
[16] D. Shabtay and M. Kaspi, Parallel machine scheduling with a convex resource consumption function, Eur J Oper Res 173 (2006), 92–107.