Journal of Computer and System Sciences 66 (2003) 349–370 http://www.elsevier.com/locate/jcss

An efficient fully polynomial approximation scheme for the Subset-Sum Problem

Hans Kellerer,a,* Renata Mansini,b Ulrich Pferschy,a and Maria Grazia Speranzac

a Institut für Statistik und Operations Research, Universität Graz, Universitätsstr. 15, A-8010 Graz, Austria
b Dipartimento di Elettronica per l'Automazione, Università di Brescia, via Branze 38, I-25123 Brescia, Italy
c Dipartimento Metodi Quantitativi, Università di Brescia, Contrada S. Chiara 48/b, I-25122 Brescia, Italy

Received 20 January 2000; revised 24 June 2002

Abstract

Given a set of n positive integers and a knapsack of capacity c, the Subset-Sum Problem is to find a subset the sum of which is closest to c without exceeding c. In this paper we present a fully polynomial approximation scheme which solves the Subset-Sum Problem with accuracy ε in time O(min{n · 1/ε, n + 1/ε² log(1/ε)}) and space O(n + 1/ε). This scheme has a better time and space complexity than previously known approximation schemes. Moreover, the scheme always finds the optimal solution if it is smaller than (1 − ε)c. Computational results show that the scheme efficiently solves instances with up to 5000 items with a guaranteed relative error smaller than 1/1000.
© 2003 Elsevier Science (USA). All rights reserved.

Keywords: Subset-sum problem; Worst-case performance; Fully polynomial approximation scheme; Knapsack problem

1. Introduction

Given a set of n items E_n = {1, …, n}, each having a positive integer weight w_j (j = 1, …, n), and a knapsack of capacity c, the Subset-Sum Problem (SSP) is to select a subset E of E_n such that the corresponding total weight w(E) is closest to c without exceeding c. Formally, the SSP

* Corresponding author.
E-mail addresses: [email protected] (H. Kellerer), [email protected] (R. Mansini), pferschy@uni-graz.at (U. Pferschy), [email protected] (M.G. Speranza).

0022-0000/03/$ - see front matter © 2003 Elsevier Science (USA). All rights reserved.
doi:10.1016/S0022-0000(03)00006-0


is defined as follows:

    maximize    ∑_{j=1}^n w_j x_j

    subject to  ∑_{j=1}^n w_j x_j ≤ c,
                x_j ∈ {0, 1}  (j = 1, …, n),

where

    x_j = 1 if item j is selected,
    x_j = 0 otherwise.
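To make the definition concrete, here is a minimal brute-force illustration in Python; the weights and capacity are made-up toy values, and exhaustive enumeration is of course only viable for tiny n:

```python
from itertools import combinations

# Toy SSP instance (made-up numbers): the optimal value is the
# largest subset sum not exceeding the capacity c.
w, c = [4, 5, 7, 8], 14
best = max(sum(s) for r in range(len(w) + 1)
           for s in combinations(w, r) if sum(s) <= c)
print(best)  # 13, attained by the subset {5, 8}
```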

We assume, without loss of generality, that ∑_{j=1}^n w_j ≥ c and w_j ≤ c for j = 1, …, n. The SSP is a special case of the 0–1 Knapsack Problem, arising when the profit and the weight associated with each item j are identical. A large number of theoretical and practical papers have appeared on this problem. An extensive overview of the literature is contained in the excellent book by Martello and Toth [18].

The SSP is well known to be NP-hard [4]. Therefore, all exact algorithms for the SSP are pseudopolynomial. The classical dynamic programming approach has running time O(nc) and requires O(nc) memory. An optimal algorithm with improved complexity is due to Pisinger [19].

As for all NP-hard problems, it is interesting to look for suboptimal solutions which are within a predefined range of the optimal value, provided that the time and space requirements are reasonably small, i.e. bounded by a polynomial. The most common way to judge the quality of an approximation algorithm is its worst-case performance. Denote by X* the optimal set of items and by z* = ∑_{j∈X*} w_j the optimal solution value of the SSP. Analogously, let X^H be the set of items selected by a heuristic H and z^H the corresponding solution value. A heuristic H for the SSP is a (1 − ε)-approximation algorithm (0 < ε < 1) if for any instance

    z^H ≥ (1 − ε) z*                                                  (1)

holds. The parameter ε is called the worst-case relative error. A fully polynomial approximation scheme is a heuristic H which, given an instance I and any relative error ε, returns a solution value obeying (1) in time polynomial both in the length of the encoded input and in 1/ε.

The first fully polynomial approximation scheme for the Subset-Sum Problem was suggested by Ibarra and Kim [8]. They partition the items into small and large items. The weights of the large items are scaled, and then the problem with scaled weights and capacity is solved optimally through dynamic programming. The small items are added afterwards using a greedy-type algorithm. Their approach has time complexity O(n · 1/ε²) and space complexity O(n + 1/ε³). Lawler [14] improved the scheme of Ibarra and Kim by a direct transfer of a scheme for the knapsack problem which uses a more efficient method of scaling. His algorithm has only O(n + 1/ε⁴) time and O(n + 1/ε³) memory requirement. Note that the special algorithm proposed in his paper for the subset-sum problem does not work, since he makes the erroneous proposal to round up the item values. As an improvement, Lawler claims in his paper that a combination of his approach (which is not correct) with a result by Karp [10] would give a running time of O(n + 1/ε² log(1/ε)). Karp presents in [10] an algorithm for the subset-sum problem with running time n((1 + ε)/ε) log_{1+ε} 2, which is O(n · 1/ε²). Lawler states that replacing n by the number of large items, O((1/ε) log(1/ε)), would give a running time of O(n + 1/ε² log(1/ε)). It can easily be checked that a factor of 1/ε is missing in the second term of this expression. Possibly, this mistake originates from a misprint in Karp's paper, which gives a running time of n((1 + ε)/2) log_{1+ε} 2 instead of the correct n((1 + ε)/ε) log_{1+ε} 2.

The approach by Gens and Levner [5,6] is based on a different idea. They use a dynamic programming procedure where at each iteration solution values are eliminated which differ from each other by at least a threshold value depending on ε. The corresponding solution set is then determined by standard backtracking. Their algorithm solves the Subset-Sum Problem in O(n · 1/ε) time and space. In 1994 Gens and Levner [7] presented an improved fully polynomial approximation scheme based on the same idea. The algorithm finds an approximate solution with relative error less than ε in time O(min{n/ε, n + 1/ε³}) and space O(min{n/ε, n + 1/ε²}).

Our algorithm requires O(min{n · 1/ε, n + 1/ε² log(1/ε)}) time and O(n + 1/ε) space. A short description of the algorithm has appeared as an extended abstract in [13].

The paper is organized as follows: In Section 2 we first present the general structure of the algorithm in an informal way; afterwards our fully polynomial approximation scheme is described extensively in a technical way. Its correctness, its asymptotic running time and its space requirements are analyzed in Section 3. Section 4 contains computational results and, finally, concluding remarks are given in Section 5.

2. The fully polynomial approximation scheme

2.1. Informal description of the algorithm

As our approach is rather involved, we try to give an intuition of the approximation scheme in an informal way. The detailed algorithm is presented in Section 2.2. We will explain the algorithm step by step, starting from Bellman's procedure for calculating the optimal solution, then making several modifications which yield better time and space requirements, and finally reaching the FPTAS with the claimed time and space bounds.

The well-known original dynamic programming approach by Bellman [1] solves the Subset-Sum Problem optimally in the following way: The set R of reachable values consists of integers i less than or equal to the capacity c for which a subset of items exists with total weight equal to i. Starting from the empty set, R is constructed iteratively in n iterations by adding, in iteration j, the weight w_j to all elements of R and keeping only partial sums not exceeding the capacity. For each value i ∈ R a corresponding solution set with total weight equal to i is stored. This gives a pseudopolynomial algorithm with time O(nc) and space O(nc); a sketch is given below.

In order to obtain an FPTAS, the items are first separated into small items (having weight ≤ εc) and large items. It can easily be seen that any (1 − ε)-approximation algorithm for the large items remains a (1 − ε)-approximation algorithm for the whole item set if we assign the small items at the end of the algorithm in a greedy way. (This is done in Step 4 of our algorithm.) Therefore, we will deal only with large items in the further considerations. The interval (εc, c] containing the large items is partitioned into O(1/ε) subintervals of equal length εc (see Step 1).
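The following Python fragment is a direct transcription of the Bellman recursion described above (the function name and data layout are our own, for illustration only); it stores one solution set per reachable value, which is exactly the expense the refined algorithm will later avoid:

```python
def bellman_with_sets(weights, c):
    """Exact Bellman recursion for the SSP: O(n*c) time and space,
    storing one solution set per reachable value."""
    reach = {0: []}                      # reachable value -> list of item indices
    for j, w in enumerate(weights):
        for value, sol in list(reach.items()):
            if value + w <= c and value + w not in reach:
                reach[value + w] = sol + [j]
    best = max(reach)                    # largest reachable value <= c
    return best, [weights[j] for j in reach[best]]
```

For the toy instance from the introduction, bellman_with_sets([4, 5, 7, 8], 14) returns (13, [5, 8]).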


Then, from each subinterval I_j := (jεc, (j + 1)εc], the (at most) ⌈k/j⌉ − 1 smallest and ⌈k/j⌉ − 1 biggest items are selected, where k = ⌈1/ε⌉ is the number of intervals. All these large items are collected in the so-called set of relevant items ℒ, and the other items are discarded (see Step 2). Lemma 1 ensures that an optimal solution for ℒ is at most εc smaller than an optimal solution for the large items. Hence, we have to consider only |ℒ| ≤ O(min{n, 1/ε log(1/ε)}) items in the further computation. Consequently, the corresponding modification of the Bellman algorithm requires time O(min{nc, (n + 1/ε log(1/ε)) c}) and space O(n + (1/ε) c), but still approximates the optimal solution with accuracy εc. (For each partial sum we have to store at most 1/ε large items.)

The next step is to get an approximation algorithm with time and space complexity not depending on c. For this reason not all reachable values are stored; only the smallest value d⁻(j) and the largest value d⁺(j) in each subinterval I_j are kept in each iteration, and they are updated if in a later iteration smaller or larger values in I_j are obtained. In principle, we have replaced the c possible reachable values by 1/ε reachable intervals and perform a so-called "relaxed" dynamic programming. This procedure relaxed dynamic programming returns the array D = {d[1] ≤ d[2] ≤ … ≤ d[k′]} of reduced reachable values, i.e. D consists of the values d⁻(j), d⁺(j) sorted in non-decreasing order. Lemma 2 together with Corollary 3 shows that d[k′] is at least (1 − ε)c or is even equal to the optimal solution value. Replacing c by 1/ε, this modified algorithm yields a (1 − ε)-approximation algorithm which runs in time O(min{n/ε, 1/ε² log(1/ε)}) and space O(n + 1/ε²).

We have already achieved an FPTAS with the claimed running time; only the memory requirement is too large, by a factor of 1/ε, which is due to the fact that we store for each reduced reachable value the corresponding solution set. Thus, if we were satisfied with calculating only the maximal solution value, and were not interested in the corresponding solution set, we could finish the algorithm after this step.

One way of reducing the space is to store for each reduced reachable value d[j] only the index d(j) of the last item by which the reachable value was generated. Starting from the maximal solution value we then try to reconstruct the corresponding solution set by backtracking. But the partial sum (with value in I_i) which remains after each step of backtracking may not be stored in D anymore. So, if original values in I_i are no longer available, we can choose one of the updated values d⁻(i), d⁺(i). Let y^B denote the total weight of the current solution set determined by backtracking. Lemma 4 shows, in principle, that there exists y^R ∈ {d⁻(i), d⁺(i)} with (1 − ε)c ≤ y^R + y^B ≤ c. Hence, we can continue backtracking with the stored value y^R.

However, during backtracking another problem may occur: The sequence of indices from which we construct our solution set may increase after a while, which means that an entry for I_i may have been updated after considering the item with index d(j). This opens the unfortunate possibility that we take an item twice into the solution. Therefore, we can run procedure backtracking only as long as the values d(j) are decreasing.
Then we have to recompute the remaining part of the solution by running the relaxed dynamic programming procedure again on a reduced item set L̂, consisting of all items from ℒ with smaller index than the last value d(j), and for the smaller subset capacity ĉ := c − y^B. In the worst case it may happen that backtracking always stops after identifying only a single item of the solution set. This would increase the running time by a factor of 1/ε. For this reason, we apply a more clever way to reconstruct the approximate solution set, namely the following divide and conquer approach.


After performing backtracking until the values d(j) increase, procedure divide and conquer is called. It performs not a single run of relaxed dynamic programming for the complete remaining problem with item set L̂, but splits this task into two subproblems by partitioning L̂ into two subsets L_1, L_2 of (almost) the same cardinality. Relaxed dynamic programming is then performed for both item sets independently with capacity ĉ, returning two arrays of reduced reachable values D_1, D_2. By Lemma 5 we can find entries u_1 ∈ D_1, u_2 ∈ D_2 with ĉ − εc ≤ u_1 + u_2 ≤ ĉ.

To find the solution sets corresponding to the values u_1 and u_2, we first perform backtracking for item set L_1 with capacity ĉ − u_2, which reconstructs the part of the solution contributed by L_1 with value y_1^B. If y_1^B is not close enough to ĉ − u_2, and hence does not fully represent the solution value generated by the items in L_1, we perform a recursive execution of divide and conquer for item set L_1 with capacity ĉ − u_2 − y_1^B, which finally produces y_1^DC such that y_1^B + y_1^DC is close to u_1. The same strategy is carried out for L_2, producing a partial solution value y_2^B by backtracking and possibly performing divide and conquer recursively, which again returns a value y_2^DC. Altogether, we derive the solution contributed by item set L̂ as y^DC = y_1^B + y_1^DC + y_2^B + y_2^DC.

In every recursive execution of divide and conquer we start as above with two runs of relaxed dynamic programming and backtracking for both subsets of items. If backtracking completely delivers the solution for the desired capacity, we have completely solved one subproblem; otherwise we continue the splitting process of the item set recursively by performing another execution of divide and conquer for the remaining subproblem. As each execution of divide and conquer returns at least one item of the solution through backtracking (usually more than one), the depth of the recursion is bounded by O(log(1/ε)).

We can represent the recursive structure of divide and conquer as a binary rooted tree. Each node in the tree corresponds to one call of divide and conquer, with the root indicating the first call in Step 3 of the algorithm. Every node may have up to two child nodes, the left child corresponding to a call of divide and conquer to resolve L_1 and the right child corresponding to a call for L_2. During the recursive execution of the algorithm this tree is traversed by a depth-first-search strategy. Every node returns a part of the solution set computed directly through backtracking and returns as another part the results of its child nodes.

Theorem 7 and Lemma 6 guarantee that the final solution value y^L := y^B + y^DC, returned by the first backtracking phase and the first application of divide and conquer, either fulfills (1 − ε)c ≤ y^L ≤ c or is optimal for the large items. In this way our algorithm is still an FPTAS, but we do not have to store solution sets of items, thus requiring only O(n + 1/ε) space. Finally, Theorem 8 shows the rather surprising fact that introducing divide and conquer does not increase the running time. This can be intuitively explained by the fact that the size of the subproblems for which relaxed dynamic programming is performed during the recursive executions of divide and conquer decreases systematically, both with respect to the number of items and with respect to the required subset capacity.

2.2. Technical description of the algorithm

This section is devoted to the detailed description of the fully polynomial approximation scheme (A) outlined before.

Algorithm (A)

Input: n, w_j (j = 1, …, n), c, ε.
Output: z^A, X^A.

Step 1: Partition into intervals.
  Compute the number of intervals k := ⌈1/ε⌉.
  Set the interval length t := εc.
  Partition the interval [0, c] into the interval [0, t], into k − 2 intervals I_j := (jt, (j + 1)t] (j = 1, …, k − 2) of length t, and the (possibly smaller) interval I_{k−1} := ((k − 1)t, c].
  Denote the items in [0, t] by S and call them small items.
  Denote the items in I_j by L_j with n_j := |L_j|.
  Set L := ∪_{j=1}^{k−1} L_j and call the elements of L large items.
  If L = ∅ then go to Step 4.

Step 2: Determination of the relevant item set ℒ.
  For every j = 1, …, k − 1 do
    If n_j > 2(⌈k/j⌉ − 1) then
      let K_j consist of the ⌈k/j⌉ − 1 smallest and the ⌈k/j⌉ − 1 biggest items in L_j;
    else let K_j consist of all items in L_j.
  Define the set of relevant items ℒ by ℒ := ∪_{j=1}^{k−1} K_j. Set ℓ := |ℒ|.
  Discard the remaining items L \ ℒ.

Step 3: Dynamic programming recursion.
  P_L := ∅ (current solution set of large items)
  L_E := ∅ (set of relevant items excluded from further consideration)
  These two sets are updated only by procedure backtracking.
  Perform procedure relaxed dynamic programming (ℒ, c), returning the dynamic programming arrays d⁻(·), d⁺(·) and d(·), with entries d⁻(j), d⁺(j) (j = 1, …, k − 1) and d(i) (i = 1, …, k′, k′ ≤ 2k − 1).
  Let the array D := {d[1] ≤ d[2] ≤ … ≤ d[k′]} of reduced reachable values represent the values d⁻(j), d⁺(j) (unequal to zero) sorted in non-decreasing order.
  If d[k′] < (1 − ε)c, then set c := d[k′] + εc.
  Perform procedure backtracking (d⁻(·), d⁺(·), d(·), ℒ, c), returning y^B.
  If c − y^B > εc then perform procedure divide and conquer (ℒ \ L_E, c − y^B), returning y^DC.
  y^L := y^B + y^DC.

Step 4: Assignment of the small items.
  Apply a greedy-type algorithm to S and a knapsack with capacity c − y^L, i.e. examine the small items in any order and insert each new item into the knapsack if it fits. Let y^S be the greedy solution value and P_S the corresponding solution set.
  Finish with z^A := y^L + y^S and X^A := P_L ∪ P_S.

Comment. It will be clear from Corollary 3 that the possible redefinition of c in Step 3 of the algorithm is used to find the exact solution in case z* < (1 − ε)c.
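As an illustration of Steps 1 and 2, the following Python sketch (our own naming, not from the paper) partitions the items and extracts the relevant set ℒ; note how each interval I_j contributes at most 2(⌈k/j⌉ − 1) items. Integer weights and a rational ε would avoid the floating-point interval arithmetic used here for brevity:

```python
import math

def split_small_large(weights, c, eps):
    """Step 1: items of weight <= eps*c are small, the rest are large."""
    t = eps * c
    return [w for w in weights if w <= t], [w for w in weights if w > t]

def relevant_items(large, c, eps):
    """Step 2: from each interval I_j = (j*t, (j+1)*t] keep only the
    ceil(k/j)-1 smallest and ceil(k/j)-1 biggest items; any feasible
    solution uses fewer than ceil(k/j) items of I_j, so the rest are
    redundant."""
    k, t = math.ceil(1 / eps), eps * c
    buckets = {}
    for w in large:
        j = min(math.ceil(w / t) - 1, k - 1)   # index of the interval containing w
        buckets.setdefault(j, []).append(w)
    relevant = []
    for j, items in buckets.items():
        m = math.ceil(k / j) - 1
        items.sort()
        if len(items) > 2 * m:
            items = items[:m] + items[-m:]     # m smallest and m biggest
        relevant.extend(items)
    return relevant
```

The bound ℓ ≤ 2 ∑_{j<k} (⌈k/j⌉ − 1) = O((1/ε) log(1/ε)) used in Section 3 is immediate from this construction.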


Procedure relaxed dynamic programming (L̃, c̃)

Input: L̃: subset of items, c̃: subset capacity.
Output: d⁻(·), d⁺(·), d(·): dynamic programming arrays.

(Forward recursion)
Let L̃ = {v_1, v_2, …, v_ℓ̃} and ℓ̃ := |L̃|.
Compute k̃ with c̃ ∈ I_k̃.
d⁻(j) := d⁺(j) := 0 (j = 1, …, k̃).
For i := 1 to ℓ̃ do
begin
  Form the set D_i := {d⁺(j) + v_i | d⁺(j) + v_i ≤ c̃, j = 1, …, k̃}
                    ∪ {d⁻(j) + v_i | d⁻(j) + v_i ≤ c̃, j = 1, …, k̃} ∪ {v_i}.
  For all u ∈ D_i do
  begin
    Compute j with u ∈ I_j.
    If d⁻(j) = 0 (and therefore also d⁺(j) = 0) then
      d⁻(j) := d⁺(j) := u and d(d⁻(j)) := d(d⁺(j)) := i.
    If u < d⁻(j) then d⁻(j) := u and d(d⁻(j)) := i.
    If u > d⁺(j) then d⁺(j) := u and d(d⁺(j)) := i.
  end
end
return d⁻(·), d⁺(·), d(·).

Comment. In each interval I_j we keep only the current biggest iteration value d⁺(j) and the current smallest iteration value d⁻(j), respectively. The value d(d) represents the index of the last item which was used to compute the iteration value d; it is stored for further use in procedure backtracking. Note that the last interval I_k̃ contains only values smaller than or equal to c̃.
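A compact Python version of this forward recursion might look as follows (a minimal sketch with our own data layout: lists indexed by interval, and a dictionary last_item playing the role of the array d(·)):

```python
import math

def relaxed_dp(items, cap, eps, c):
    """Relaxed dynamic programming sketch: per interval I_j = (j*t, (j+1)*t]
    keep only the smallest (d_minus[j]) and biggest (d_plus[j]) reachable
    value, and for every stored value the index of the last item used."""
    t = eps * c
    k = max(1, math.ceil(cap / t) - 1)          # cap lies in interval I_k
    d_minus, d_plus = [0] * (k + 1), [0] * (k + 1)
    last_item = {}                              # value -> item index, the d(.) array
    for i, v in enumerate(items):
        cands = {d + v for j in range(1, k + 1)
                 for d in (d_minus[j], d_plus[j]) if d and d + v <= cap}
        if v <= cap:
            cands.add(v)
        for u in cands:
            j = min(math.ceil(u / t) - 1, k)    # interval containing u
            if d_minus[j] == 0:                 # interval reached for the first time
                d_minus[j] = d_plus[j] = u
                last_item[u] = i
            if u < d_minus[j]:
                d_minus[j], last_item[u] = u, i
            if u > d_plus[j]:
                d_plus[j], last_item[u] = u, i
    return d_minus, d_plus, last_item
```

Calling relaxed_dp(relevant, c, eps, c) produces the arrays used in Step 3; the largest non-zero entry of d_plus corresponds to the value d[k′] of the reduced reachable array D.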

Procedure backtracking (d⁻(·), d⁺(·), d(·), L̃, y^T)

Input: d⁻(·), d⁺(·), d(·): dynamic programming arrays,
  L̃ = {v_1, v_2, …, v_ℓ̃}: subset of items as in relaxed dynamic programming,
  y^T: target point for backtracking.
Output: y^B: collected partial solution value.

This is the only procedure where items are added to P_L and to L_E.

(Backward recursion)
u := max_j {u′_j | u′_j = d⁺(j) and u′_j ≤ y^T}
y^B := 0; stop := false.

Repeat
  i := d(u)
  P_L := P_L ∪ {v_i}
  y^B := y^B + v_i
  u := u − v_i
  If u > 0 then
    Compute j with u ∈ I_j.
    If d⁺(j) + y^B ≤ y^T and d(d⁺(j)) < i then u := d⁺(j)
    else if d⁻(j) + y^B ≥ y^T − εc and d(d⁻(j)) < i then u := d⁻(j)
    else stop := true.
until u = 0 or stop
L_E := L_E ∪ {v_j ∈ L̃ | j ≥ i}
return (y^B).

Comment. A part of the sequence of items which led to a value within εc of y^T is reconstructed. The backtracking stops in particular if, in the dynamic programming arrays, an entry is found which, while meeting the condition on the solution value, was updated after the generation of the "forward arc" v_i. Such an updated entry must not be used because it may originate from a smaller entry which was generated by an item already used in the partial solution, and hence this item would appear twice in the solution vector.
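Continuing the sketch above (and reusing its relaxed_dp output), backtracking can be written as below; the tests on last_item are the guard d(d⁺(j)) < i against following an entry that was updated after item i was applied, exactly as the comment explains:

```python
import math

def backtrack(d_minus, d_plus, last_item, items, y_target, eps, c):
    """Backtracking sketch: recover part of a solution within eps*c of
    y_target; returns (y_B, chosen, i_min) where i_min is the smallest item
    index used, so items[i_min:] correspond to the excluded set L_E."""
    t = eps * c
    k = len(d_plus) - 1
    u = max((d for d in d_plus if 0 < d <= y_target), default=0)
    y_B, chosen, i_min = 0, [], len(items)
    while u > 0:
        i = last_item[u]                        # last item on the path to u
        chosen.append(items[i]); y_B += items[i]; u -= items[i]; i_min = i
        if u <= 0:
            break
        j = min(math.ceil(u / t) - 1, k)        # interval containing u
        # only entries generated strictly before item i may be followed,
        # otherwise an item could enter the solution twice
        if d_plus[j] + y_B <= y_target and last_item.get(d_plus[j], i) < i:
            u = d_plus[j]
        elif d_minus[j] + y_B >= y_target - t and last_item.get(d_minus[j], i) < i:
            u = d_minus[j]
        else:
            break                               # stop; the remainder u is y_R
    return y_B, chosen, i_min
```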

Procedure divide and conquer (L̂, ĉ)

Input: L̂: subset of items, ĉ: subset capacity.
Output: y^DC: part of the solution value contained in L̂.

(Divide)
Partition L̂ into two disjoint subsets L_1, L_2 with cardinalities as equal as possible.
Perform procedure relaxed dynamic programming (L_1, ĉ), returning d_1⁻(·), d_1⁺(·), d_1(·).
Perform procedure relaxed dynamic programming (L_2, ĉ), returning d_2⁻(·), d_2⁺(·), d_2(·).

(Conquer)
Find entries u_1 and u_2 of the dynamic programming arrays d_1⁻(·), d_1⁺(·) and d_2⁻(·), d_2⁺(·), respectively, with u_1 ≥ u_2 such that

    ĉ − εc ≤ u_1 + u_2 ≤ ĉ.                                           (2)

y_1^DC := 0; y_2^B := 0; y_2^DC := 0. (local variables)

(Resolve L_1)
Perform procedure backtracking (d_1⁻(·), d_1⁺(·), d_1(·), L_1, ĉ − u_2), returning y_1^B.
If ĉ − u_2 − y_1^B > εc then perform procedure divide and conquer (L_1 \ L_E, ĉ − u_2 − y_1^B), returning y_1^DC.

(Resolve L_2)
If u_2 > 0 then
begin
  If ĉ − u_2 − y_1^B > εc then perform procedure relaxed dynamic programming (L_2, ĉ − y_1^B − y_1^DC), returning d_2⁻(·), d_2⁺(·), d_2(·).   (*)
  Perform procedure backtracking (d_2⁻(·), d_2⁺(·), d_2(·), L_2, ĉ − y_1^B − y_1^DC), returning y_2^B.
  If ĉ − y_1^B − y_1^DC − y_2^B > εc then perform procedure divide and conquer (L_2 \ L_E, ĉ − y_1^B − y_1^DC − y_2^B), returning y_2^DC.
end
y^DC := y_1^B + y_1^DC + y_2^B + y_2^DC
return (y^DC).

Comment. The recomputation in (*) is necessary if the memory for d_2⁻(·), d_2⁺(·), d_2(·) was used during the recursive execution of divide and conquer to resolve L_1.

Note that a continued bipartition of the item set with recomputation of the solution set (with increasing time requirement) was also used by Magazine and Oguz [15]. With a possible loss of practical performance, but without changing the running time analysis in Theorem 8, procedure backtracking can be simplified by stopping the loop as soon as u ≠ d⁺(j) and u ≠ d⁻(j). Also, keeping the set L_E is not strictly necessary.
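To tie the pieces together, here is a deliberately simplified top-level driver (a sketch under our own assumptions, not the paper's Algorithm (A)): it keeps Steps 1, 2 and 4, but replaces the relaxed dynamic programming, backtracking and divide and conquer of Step 3 by the exact Bellman recursion from Section 2.1 run on the relevant items only. By Lemma 1 this still yields a (1 − ε)-approximation; what is lost is precisely the improved time and space bound that the machinery above provides:

```python
def approximate_subset_sum(weights, c, eps):
    """(1 - eps)-approximation sketch: exact dynamic programming on the
    relevant items stands in for the relaxed DP / backtracking / divide
    and conquer machinery of Step 3."""
    small, large = split_small_large(weights, c, eps)    # Step 1 (sketch above)
    relevant = relevant_items(large, c, eps)             # Step 2 (sketch above)
    y_L, solution = bellman_with_sets(relevant, c)       # stand-in for Step 3
    for w in small:                                      # Step 4: greedy small items
        if y_L + w <= c:
            solution.append(w)
            y_L += w
    return y_L, solution
```

For instance, approximate_subset_sum([4, 5, 7, 8], 14, 0.1) returns the value 13, matching the brute-force toy example from the introduction.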

3. Correctness and performance

In this section we first prove the correctness of the above algorithm, which is stated in Theorem 7. Second, its asymptotic running time and space requirement are analyzed in the proof of Theorem 8.

To begin with, we show that the reduction of the large items to the relevant item set ℒ in Step 2 changes the optimal solution by at most εc. Throughout this section let y* be the optimal solution value for the item set L.

Lemma 1. Let y_ℒ be the optimal solution value for ℒ. Then y_ℒ ≥ (1 − ε)c or y_ℒ = y*.

Proof. Denote by m_j the number of items of L_j (j = 1, …, k − 1) in an optimal solution for item set L. Since ⌈k/j⌉ items of L_j have total weight strictly greater than ⌈k/j⌉ jt ≥ kt ≥ c, there are at most ⌈k/j⌉ − 1 items of L_j in any feasible solution of the SSP, and ⌈k/j⌉ − 1 ≥ m_j follows. Hence, the set C which consists of the m_j smallest elements of L_j for all j = 1, …, k − 1 is a feasible solution set of the SSP and a subset of the relevant items ℒ.

Now we exchange iteratively items of C which belong to the m_j smallest elements of some set L_j with one of the m_j biggest items of L_j. We finish this procedure either when (1 − ε)c ≤ w(C) ≤ c or when C collects the m_j biggest items of L_j for all j = 1, …, k − 1. This is possible since the weight


difference in each exchange step does not exceed t = εc. At the end, C is still a subset of ℒ and either w(C) ≥ (1 − ε)c, or C consists of the m_j biggest items of each interval and therefore w(C) ≥ y*, which completes the proof. □

Now we prove that also the reduction of the complete dynamic programming scheme to the smaller sets d⁺(·), d⁻(·) is not far away from an optimal scheme for any subset of items. The difference between an optimal dynamic programming scheme for some L̃, c̃ and the reduced version in procedure relaxed dynamic programming is the following: For each new item i from 1 to ℓ̃ an optimal algorithm would compute the sets

    D*_i := {d + v_i | d + v_i ≤ c̃, d ∈ D*_{i−1}} ∪ D*_{i−1}

with D*_0 := {0}. Thus, D*_i is the set of the values of all possible partial solutions using the first i items of set L̃.

For the procedure relaxed dynamic programming we define by D_i := {d⁻(j), d⁺(j) | j = 1, …, k̃} the reduced set of 2k̃ solution values computed in iteration i. We now renumber the elements of each D_i (i = 1, …, ℓ̃) in non-decreasing order such that D_i = {d_i[s] | s = 1, …, 2k̃} and d_i[1] ≤ d_i[2] ≤ … ≤ d_i[2k̃], setting aside 0-entries. After these preparations we show

Lemma 2. For each d* ∈ D*_i there exists some index r with d_i[r] ≤ d* ≤ d_i[r + 1] and

    d_i[r + 1] − d_i[r] ≤ t                                           (3)

or even

    d_i[2k̃] ≥ c̃ − εc.                                                (4)

Proof. The statement is shown by induction on i. The assertion is trivially true for i = 1. Let us assume it is true for all iterations from 1 to i − 1.

Let d*_i ∈ D*_i and d*_i ∉ D*_{i−1}. Then d*_i = d* + v_i for some d* ∈ D*_{i−1}. If d_{i′}[2k̃] ≥ c̃ − εc for some i′ ∈ {1, …, i − 1}, the claim follows immediately. Otherwise, we assume by the induction hypothesis that there are d_{i−1}[r], d_{i−1}[r + 1] with d_{i−1}[r] ≤ d* ≤ d_{i−1}[r + 1] and d_{i−1}[r + 1] − d_{i−1}[r] ≤ t. Set a := d_{i−1}[r] + v_i and b := d_{i−1}[r + 1] + v_i. Of course, b − a ≤ t and a ≤ d*_i ≤ b.

Assume first that d*_i is in the interval I_k̃ containing c̃. If b > c̃, then c̃ ≥ a > c̃ − εc; else, if b ≤ c̃, we get b ≥ c̃ − εc. Hence, at least one of the values a, b fulfills inequality (4).

Assume now that d*_i ∈ I_j with j < k̃. We distinguish three cases:
(i) a ∈ I_j, b ∈ I_j;
(ii) a ∈ I_j, b ∈ I_{j+1};
(iii) a ∈ I_{j−1}, b ∈ I_j.


In the remainder of the proof the values d⁻(j) and d⁺(j) are taken from iteration i. In case (i) we get that both d⁻(j) and d⁺(j) are not equal to zero, and (3) follows. For case (ii) note that d⁻(j + 1) ≤ b. If a = d⁻(j), then d⁻(j + 1) − d⁺(j) ≤ b − a ≤ t. If on the other side a > d⁻(j), then d⁺(j) ≥ a and again d⁻(j + 1) − d⁺(j) ≤ t. We conclude (3). Case (iii) is analogous to case (ii), and we have shown that (3) or (4) holds for each i ∈ {1, …, ℓ̃}. □

For any restricted Subset-Sum Problem with item set L̃ and capacity c̃ let y*_L̃ be its optimal solution value.

Corollary 3. Performing procedure relaxed dynamic programming with inputs L̃ and c̃ yields a maximal reduced reachable value ỹ with either

    c̃ − εc ≤ ỹ ≤ c̃

or

    ỹ = y*_L̃.

Proof. If we use the construction of Lemma 2 for i = |L̃|, and thus d* = y*_L̃, the corollary follows. □

From Lemma 1 and Corollary 3 it follows immediately that the first execution of procedure relaxed dynamic programming (performed in Step 3 of the algorithm) computes a solution value within εc of the optimal solution. In fact, if the optimal solution y* is smaller than (1 − ε)c, even the exact optimal solution is found. To preserve this property during the remaining part of the algorithm we continue the computation with an artificially decreased capacity, although this would not be necessary to compute just an ε-approximate solution.

Note that the arrays d⁻(·), d⁺(·) do not necessarily contain the corresponding solution items, because updates of values may have taken place after reaching the optimal solution value. Therefore, procedure backtracking usually reconstructs only a part of these items, and recursive divide and conquer computations are applied to find a set of items which actually yields such a solution value. The output of procedure backtracking is characterized by the following lemma.

Lemma 4. After performing procedure backtracking with inputs d⁻(·), d⁺(·), L̃ and y^T we have y^B > εc, and there exists a subset of items in L̃ \ L_E summing up to y^R such that

    y^T − εc ≤ y^B + y^R ≤ y^T.                                       (5)
Proof. In the first iteration one relevant item (with weight 4ec) is always added to yB : We will show that during the execution of backtracking the value u always fulfills the properties required by yR in every iteration. Procedure backtracking is always performed (almost) immediately after procedure relaxed dynamic programming. Moreover, the target value yT is either identical to the capacity in the preceding dynamic programming routine and hence with Corollary 3 the starting value of u fulfills (5) (with yR ¼ u and yB ¼ 0) or, if called while resolving L1 ; u1 has the same property. During the computation in the loop we update u at first by u  vdðuÞ : Hence, this new value must have been found during the dynamic programming routine while processing items with a smaller

360

H. Kellerer et al. / Journal of Computer and System Sciences 66 (2003) 349–370

index than that leading to the old u: Therefore, we get for u40; that d ð jÞpupdþ ð jÞ: At this point yT  ecpyB þ vdðuÞ þ u  vdðuÞ pyT still holds. Then there are two possible updates of u: We may set u :¼ dþ ð jÞ thus not decreasing u and still fulfilling (5). We may also have u :¼ d ð jÞ with the condition that d ð jÞXyT  yB  ec and hence inequality (5) still holds because u was less than yT  yB and is further decreased. Ending the loop in backtracking there is either u ¼ 0; and (5) is fulfilled with yR ¼ 0; or stop ¼ true, and by the above argument yR :¼ u yields the required properties. At the end of each iteration (except possibly the last one) u is always set to an entry in the dynamic programming array which was reached by an item with index smaller than the item previously put into PL : Naturally, all items leading to this entry must have had even smaller indices. Therefore, the final value yR must be a combination of items with indices less than that of * E: & the last item added to PL ; i.e. from the set L\L In the following, we show that the divide and conquer recursions actually generate a suitable solution for the given capacity. # with weight yˆ Lemma 5. If at the start of procedure divide and conquer there exists a subset of L such that ˆ c; cˆ  ecpypˆ then there exist u1 ; u2 fulfilling (2). Proof. Obviously, we can write yˆ ¼ y1 þ y2 with y1 being the sum of items from L1 and y2 from L2 : If y1 or y2 is 0; the result follows immediately from Lemma 2 setting u1 or u2 equal to 0, respectively. With Lemma 2 we conclude that after the Divide step there exist values a1 ; b1 from the dynamic þ programming arrays d 1 ðÞ; d1 ðÞ with a1 py1 pb1

and b1  a1 pt:

þ Analogously, there exist a2 ; b2 from d 2 ðÞ; d2 ðÞ with

a2 py2 pb2

and b2  a2 pt:

Now it is easy to see that at least one of the four pairs from fa1 ; b1 g  fa2 ; b2 g fulfills (2). & The return value of the procedure divide and conquer can be characterized in the following way. # and cˆ; Lemma 6. If, performing procedure divide and conquer with inputs L # with weight yˆ such that cˆ  ecpypˆ ˆ c; there exists a subset of L then also the returned value yDC fulfills cˆ  ecpyDC pˆc:

ð6Þ


Proof. Lemma 5 guarantees that under condition (6) there are always values u_1, u_2 satisfying (2).

The recursive structure of the divide and conquer calls can be seen as an ordered, not necessarily complete, binary rooted tree. Each node in the tree corresponds to one call of divide and conquer, with the root indicating the first call from Step 3. Furthermore, every node may have up to two child nodes, the left child corresponding to a call of divide and conquer to resolve L_1 and the right child corresponding to a call generated while resolving L_2. As the left child is always visited first (if it exists), the recursive structure corresponds to a preorder tree walk.

In the following, we will show the statement of the lemma by backwards induction, moving "upwards" in the tree, i.e. beginning with its leaves and applying induction to the inner nodes.

We start with the leaves of the tree, i.e. executions of divide and conquer with no further recursive calls. There we have y_1^DC = 0 after resolving L_1, and by considering the condition for not calling the recursion,

    ĉ − u_2 − εc ≤ y_1^B ≤ ĉ − u_2.

Resolving L_2, we either have u_2 = 0, and hence y^DC = y_1^B, and we are done with the previous inequality, or we get ĉ − y_1^B − εc ≤ y_2^B ≤ ĉ − y_1^B, and hence with y_2^DC = 0

    ĉ − εc ≤ y_1^B + y_2^B = y^DC ≤ ĉ.

For all other nodes we show that the above implication is true for an arbitrary node under the inductive assumption that it is true for all its children. To do so, we will prove that if condition (6) holds, it is also fulfilled for any child of the node, and hence by induction the child nodes return values according to the above implication. These values will be used to show that also the current node returns the desired y^DC.

If the node under consideration has a left child, we know by Lemma 4 that after performing procedure backtracking with y^T = ĉ − u_2 there exists y_1^R fulfilling

    ĉ − y_1^B − u_2 − εc ≤ y_1^R ≤ ĉ − y_1^B − u_2,

which is equivalent to the required condition (6) for the left child node. By induction, we get with the above statement for the return value of the left child (after rearranging)

    ĉ − u_2 − εc ≤ y_1^B + y_1^DC ≤ ĉ − u_2.

If there is no left child (i.e. in the case y_1^DC = 0), we get from the condition for this event

    ĉ − u_2 − εc ≤ y_1^B ≤ ĉ − u_2.

If u_2 = 0, and hence y^DC = y_1^B + y_1^DC, we are done immediately in both of these two cases.

If there is a right child node, we proceed along the same lines. From Lemma 4 we know that after performing backtracking with y^T = ĉ − y_1^B − y_1^DC while resolving L_2 there exists y_2^R with

    ĉ − y_1^B − y_1^DC − y_2^B − εc ≤ y_2^R ≤ ĉ − y_1^B − y_1^DC − y_2^B,

which is precisely condition (6) for the right child node.


Hence, by induction we can apply the above implication to the right child and get

    ĉ − y_1^B − y_1^DC − εc ≤ y_2^B + y_2^DC ≤ ĉ − y_1^B − y_1^DC,

which is (after rearranging) equivalent to the desired statement for y^DC in the current node. If there is no right child (i.e. y_2^DC = 0), the result follows from the corresponding condition. □

Applying this lemma to the beginning of the recursion and considering also the small items, we can finally state

Theorem 7. Algorithm (A) is a (1 − ε)-approximation algorithm for the Subset-Sum Problem. In particular, z^A ≥ (1 − ε)c or z^A = z*. Moreover, the bound is tight.

Proof. As shown in Lemma 1, the reduction of the total item set to ℒ in Step 2 does not eliminate all ε-approximate solutions. Hence, it is sufficient to show the claim for the set of relevant items, namely that at the end of Step 3 we have either

    y^L ≥ (1 − ε)c   or   y^L = y_ℒ.

If the first execution of relaxed dynamic programming in Step 3 returns ỹ < (1 − ε)c, we know from Corollary 3 that we have found the optimal solution value over the set of relevant items. Continuing with the updated capacity c, the claim y^L = y_ℒ then follows immediately from the first alternative for the new capacity. Therefore, we will assume in the sequel that in Step 3 we find ỹ ≥ (1 − ε)c and only prove that y^L ≥ (1 − ε)c.

If divide and conquer is not performed at all, this relation follows immediately. It remains to be shown that for y^DC, i.e. the return value of the first execution of divide and conquer,

    (1 − ε)c − y^B ≤ y^DC ≤ c − y^B

holds, because at the end of Step 3 we set y^L := y^B + y^DC. However, due to Lemma 4 the existence of a value y^R satisfying this relation is established. But this is exactly the condition required in Lemma 6 to guarantee that y^DC fulfills the above.

In Step 4 there are only two possibilities: Either there is a small item which is not chosen by the greedy-type algorithm. But this can only happen if the current solution is already greater than (1 − ε)c, and we are done. In the second case we have S ⊆ X^A, which yields, depending on the outcome of Step 3, either

    z^A = y^L + w(S) ≥ (1 − ε)c + w(S) ≥ (1 − ε)c

or

    z^A = y^L + w(S) = y_ℒ + w(S) = y* + w(S) ≥ z*  ⇒  z^A = z*

by Lemma 1.

To prove that the bound is tight, we consider the following series of instances: ε = 1/k, n = 2k − 1, c = kR with R > k − 1, and w_1 = … = w_{k−1} = R + 1, w_k = … = w_{2k−1} = R. It follows


that z^A = (k − 1)R + (k − 1) and z* = kR. The performance ratio tends to (k − 1)/k as R tends to infinity. □

The asymptotic running time of algorithm (A) is analyzed in the following theorem.

Theorem 8. For every accuracy ε > 0 (0 < ε < 1), algorithm (A) runs in time O(min{n · 1/ε, n + 1/ε² log(1/ε)}) and space O(n + 1/ε). In particular, only O(1/ε) storage locations are needed in addition to the description of the input itself.

Proof. Recall that k is in O(1/ε) and t = εc. Throughout the algorithm, the relevant space requirement consists of storing the n items and six dynamic programming arrays d_1⁻(·), d_1⁺(·), d_1(·), d_2⁻(·), d_2⁺(·) and d_2(·) of length k.

Special attention has to be paid to the implicit memory necessary for the recursion of procedure divide and conquer. To avoid using new memory for the dynamic programming arrays in every recursive call, we always use the same space for the six arrays. But this means that after returning from a recursive call while resolving L_1, the previous data in d_2⁻ and d_2⁺ is lost and has to be recomputed.

A natural bipartition of each L̂ can be achieved by taking the first half and the second half of the given sequence of items. This means that each subset L_i can be represented by the first and the last index of consecutively following items from the ground set. If for some reason a different partition scheme is desired, a labeling method can be used to associate a unique number with each call of divide and conquer and with all items belonging to the corresponding set L̂. Therefore, each call of divide and conquer requires only a constant amount of memory, and the recursion depth is bounded by O(log k). Hence, all computations can be performed within O(n + 1/ε) space.

Step 1 requires O(n + k) time. Selecting the ⌈k/j⌉ − 1 items with smallest and largest weight for L_j (j = 1, …, k − 1) in Step 2 can be done efficiently as described in [3] and takes altogether O(n + k) time. The total number ℓ of relevant items in ℒ is bounded from above by ℓ ≤ n and by

    ℓ ≤ 2 ∑_{j=1}^{k−1} (⌈k/j⌉ − 1) ≤ 2k ∑_{j=1}^{k−1} 1/j < 2k log k.        (7)

Consequently, ℓ is of order O(min{n, 1/ε log(1/ε)}).

Each call of procedure relaxed dynamic programming with parameters L̃ and c̃ (let ℓ̃ := |L̃|) takes O(ℓ̃ · c̃/t) time, as for each item i, i = 1, …, ℓ̃, we have to consider only |D_i| candidates for updating the dynamic programming arrays, a number which is clearly in O(c̃/t). Procedure backtracking always immediately follows procedure relaxed dynamic programming and can clearly be done in O(y^T/t) time with y^T ≤ c̃. Therefore, applying the above bound on ℓ, Step 3 requires O(min{n, k log k} · c/t), i.e. O(min{n · 1/ε, 1/ε² log(1/ε)}), time plus the effort of the divide and conquer execution, which will be treated below. Clearly, Step 4 requires O(n) time.

To estimate the running time of the recursive procedure divide and conquer we recall the representation of the recursion as a binary tree as used in the proof of Lemma 6.


A node is said to have level r if there are r − 1 nodes strictly between it and the root node; the root node is assigned level 0. This means that the level of a node gives its recursion depth and indicates how many bipartitionings of the item set took place to reach L̂, starting at the root with ℒ. Naturally, the maximal level is log ℓ, which is in O(log k).

Obviously, for any node with level r, the number of items considered in the corresponding execution of divide and conquer, which we denote by ℓ̂ := |L̂|, is bounded by ℓ̂ < ℓ/2^r.

Let us describe the computation time in a node, which consists of the computational effort without the at most two recursive calls to procedure divide and conquer. If the node under consideration corresponds to an execution of procedure divide and conquer with parameters L̂ and ĉ, then the two calls to relaxed dynamic programming from Divide (each with ℓ̃ ≈ ℓ̂/2) take O(ℓ̂ · ĉ/t) time (see above) and dominate the remaining computations. Therefore, the computation time in a node with level r is in O((ℓ/2^r) · ĉ/t).

For every node with input capacity ĉ, the combined input capacity of its children, i.e. the sum of the capacities of the at most two recursive calls to divide and conquer, is (by applying Lemma 6 for y_1^DC and Lemma 4 for the last inequality) at most

    (ĉ − u_2 − y_1^B) + (ĉ − y_1^B − y_1^DC − y_2^B)
      ≤ (ĉ − u_2 − y_1^B) + (ĉ − y_1^B − (ĉ − y_1^B − u_2 − εc) − y_2^B)
      = ĉ − y_1^B − y_2^B + εc ≤ ĉ.

Performing the same argument iteratively for all nodes from the root downwards, this means that for all nodes with equal level the sum of their capacities remains bounded by c. There are m_r ≤ 2^r nodes with level r in the tree. Denoting the capacity of a node i at level r by ĉ_i^r, it was shown above that

    ∑_{i=1}^{m_r} ĉ_i^r ≤ c.

Therefore, the total computation time for all nodes with level r is bounded in complexity by

    ∑_{i=1}^{m_r} (ℓ/2^r) · ĉ_i^r / t ≤ (ℓ/2^r) · c/t.

Summing up over all levels, this finally yields

    ∑_{r=0}^{log ℓ} (ℓ/2^r) · c/t ≤ 2ℓc/t,

which is of order O(min{n · 1/ε, 1/ε² log(1/ε)}) and proves the theorem. □

Remark. Algorithm (A) can also be used to approximate the bounded subset-sum problem, a generalization of the SSP where of each item j (j = 1, …, n) at most b_j copies are available. Constructing the relevant item set in Step 2, we get that the number of relevant items in this case is of order O(1/ε log(1/ε)). Therefore, the same proof as for the SSP can be applied, and we obtain an approximation scheme with running time O(n + 1/ε² log(1/ε)) and space requirement O(n + 1/ε).

4. Computational results

In this section we analyze the practical behaviour of the proposed fully polynomial approximation scheme. First the design of the experiments is described, and then the results are presented and discussed.

In order to test the algorithm on instances in which it may exhibit a very bad performance, we have considered the following classes of test problems:

(A) w_j uniformly random in (1, 10^14), c = 3 · 10^14;
(B) w_j = 2^(k+n+1) + 2^(k+j) + 1, with k = ⌊log₂ n⌋, c = ⌊(1/2) ∑_{j=1}^n w_j⌋;
(C) w_j = n(n + 1) + j, c = ⌊(n − 1)/2⌋ n(n + 1) + n(n − 1)/2;
(D) w_j uniformly random in (1, 10^F), c = 10^4 · n;
(E) w_j uniformly random in (1, 10^3) and even, c = 10^4 · n + 1.
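For reference, the two deterministic classes can be generated in a few lines of Python (a sketch with our own function names; the random classes are omitted since their capacity conventions depend on parameters not fully recoverable here):

```python
import math

def todd_instance(n):
    """Class (B): Todd's instances as described by Chvatal [2]."""
    k = int(math.log2(n))
    w = [2 ** (k + n + 1) + 2 ** (k + j) + 1 for j in range(1, n + 1)]
    return w, sum(w) // 2

def avis_instance(n):
    """Class (C): Avis' instances, w_j = n(n+1) + j."""
    w = [n * (n + 1) + j for j in range(1, n + 1)]
    return w, ((n - 1) // 2) * n * (n + 1) + n * (n - 1) // 2
```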

The randomly generated test problems of class (A) are taken from Martello and Toth [17], where the range for the items was (1, 10^5). The classes of deterministic problems (B) and (C) were constructed by Chvatal [2] to be extremely difficult to solve with branch and bound methods. These problems, as well as classes (D) and (E), are described in [18], where the authors report computational results both for the approximation schemes of Johnson [9] and Martello and Toth [16] and for the fully polynomial approximation schemes of Lawler [14] and Gens and Levner [5,6]. Their results were obtained on a CDC-Cyber 730 computer, having 48 bits available for integer operations. It is worth noticing that in their experiments the number of items is at most 1000 and the error ε equals only 1/2, 1/4 and 1/7. Hence, a direct comparison to our results is not possible.

Our fully polynomial approximation scheme was coded in FORTRAN 90 (Salford Version 2.18) and run on a PC Intel Pentium 200 MHz with 32 MB of RAM. The class of problems (A) has been tested for a number of items up to 5000 and a relative error ε equal to 1/10, 1/100 and 1/1000, respectively. For the same values of accuracy we have tested class (B) for n equal to 10, 15, 20, 25, 30, 35 (note the exploding size of the coefficients) and class (C) for n up to 100 000. While classes (B) and (C) are deterministic, and thus only single instances are generated, in class (A) for each accuracy and each number of items we generated and ran 10 instances.

In test problems (D) the value of c is such that for each pair (n, F) about n/2 items can be expected to be in the optimal solution. According to Martello and Toth [16], the difficulty of this type of problems increases with F, while in class (E), where the values both of the item weights and of the knapsack capacity are moderate, it is the odd value assigned to c and the even values taken by the items which make the class difficult (i.e. any enumerative algorithm will usually terminate only after exploring the complete decision tree (cf. [18])).

It is easy to verify that in our case both classes of problems are "easy". In fact, when the accuracy and the number of items are taken equal to ε = 1/10 and n ≥ 40, or ε = 1/100 and n ≥ 400, or ε = 1/1000 and n ≥ 4000, respectively, all the items are small items, so that the guaranteed bounds on the worst-case performance are easily obtained through Step 4 of the algorithm (i.e. by only applying the greedy-type algorithm). For this reason the results for these two classes are not reported. We only notice that, e.g. for F = 6, the errors and the running times found for each pair (n, ε) in the two classes are very similar, even if in the instances of class (E) the algorithm finds the optimal solution more often than in those of class (D).

Now we analyze classes (A)–(C), discussing their results separately. The errors were determined by first computing the difference between the approximate solution value and the upper bound c for problems of class (A), and, for classes (B) and (C), with respect to the optimal solution value computed as in [18]. This difference was then expressed as a percentage of the respective upper bound. Naturally, the error is 0 for those problems where the optimal solution was found, while a running time equal to 0.000 means that less than 1/1000 s was required.

Table 1 shows the results for class (A). The table gives four types of entries: In the first column the average and the maximum (in parentheses) percentage errors are reported, while the second column shows the average and maximum (in parentheses) running times.

Table 1
Problem class (A): percentage errors (average (maximum) values), computational times in seconds (average (maximum) values)

        ε = 1/10                          ε = 1/100                         ε = 1/1000
n       Per. error       Time            Per. error       Time             Per. error        Time
10      1.422 (6.201)    0.320 (0.385)   0.164 (0.537)    0.091 (0.123)    0.043 (0.089)     4.151 (5.160)
50      0.4459 (1.524)   0.123 (0.384)   0.1207 (0.506)   0.692 (0.769)    0.0212 (0.0732)   28.479 (33.289)
100     0.232 (1.206)    0.059 (0.091)   0.1017 (0.343)   0.707 (0.767)    0.0152 (0.0478)   54.296 (60.492)
500     0.0368 (0.165)   0.056 (0.095)   0.0997 (0.336)   2.801 (3.131)    0.0331 (0.0879)   284.693 (325.317)
1000    0.023 (0.0977)   0.088 (0.126)   0.0231 (0.06)    3.900 (4.669)    0.0276 (0.0824)   551.059 (582.445)
5000    0.0063 (0.0158)  0.823 (0.843)   0.0065 (0.0153)  6.257 (7.526)    0.0074 (0.0203)   1927.779 (2010.602)

Table 2 shows the minimum, average and maximum values of the cardinality of the relevant item set ℒ. This illustrates the considerable reduction of the number of items which enter the dynamic programming procedure with respect to the total number of items n. Moreover, it shows that for class (A) the reduction of the algorithm to the mere Step 4 (the greedy algorithm) is not systematic, as it is for classes (D) and (E). In particular, in the tested instances it never happened that there were only small items.


Table 2
Problem class (A): cardinality of the relevant item set ℒ (minimum, average and maximum values)

        ε = 1/10               ε = 1/100               ε = 1/1000
n       Min   Average  Max     Min   Average  Max      Min    Average  Max
10      4     6        8       8     8.8      9        9      9.6      10
50      21    25.1     29      44    47.1     49       49     49.2     50
100     30    31.8     32      91    94.1     97       96     98.2     99
500     32    32       32      337   345.2    357      494    497.3    499
1000    32    32       32      449   480.1    492      986    992.6    997
5000    32    32       32      706   732.1    754      3575   3614     3685

Table 3
Problem class (A): maximum depth in the tree and number of calls to divide and conquer (in brackets); minimum, average and maximum values

        ε = 1/10                        ε = 1/100                       ε = 1/1000
n       Min     Average    Max          Min     Average    Max          Min     Average    Max
10      0 (0)   0.4 (0.4)  1 (1)        0 (0)   0.7 (0.7)  1 (1)        0 (0)   0.2 (0.2)  1 (1)
50      0 (0)   1 (1)      2 (2)        1 (1)   1.4 (1.4)  3 (3)        1 (1)   1.2 (1.2)  2 (2)
100     1 (1)   1.1 (1.1)  2 (2)        1 (1)   1.9 (2)    3 (4)        0 (0)   1.8 (2.1)  2 (3)
500     0 (0)   0.8 (0.8)  1 (1)        2 (2)   3.2 (3.9)  4 (7)        2 (2)   3.2 (4.5)  4 (8)
1000    0 (0)   1.1 (1.1)  2 (2)        2 (2)   3.4 (4.1)  5 (7)        2 (2)   4.1 (6.6)  5 (14)
5000    0 (0)   0.9 (0.9)  1 (1)        3 (3)   4.5 (5.8)  6 (10)       3 (4)   5 (7.2)    6 (10)

A detailed analysis of the structure and depth of the binary tree generated by the divide and conquer recursions illustrates the contribution of this step to determining the final solution. Table 3 shows for each pair (n, ε) the minimum, average and maximum values of the maximum depth of the tree, and the minimum, average and maximum number of times (the values given in brackets) procedure divide and conquer was called.

The number of calls to divide and conquer increases proportionally with the number of items involved and with the accuracy considered. In the instances of class (A), divide and conquer was called a maximum of 14 times, for the case n = 1000 and ε = 1/1000. On the contrary, the maximum depth of the tree is never larger than 8 and seems to depend only moderately on the number of items.

Tables 4 and 5 refer to classes (B) and (C), the Todd and Avis instances. The entries in these tables give the percentage error and the computational time for each trial.

Table 4
Problem class (B): percentage errors and times in seconds, single instances

        ε = 1/10              ε = 1/100             ε = 1/1000
n       Per. error   Time     Per. error   Time     Per. error   Time
10      0.000        0.059    0.267        0.079    0.000        5.616
15      0.733        0.073    0.292        0.112    0.000        6.585
20      4.776        0.075    0.292        0.174    0.065        8.618
25      0.000        0.058    0.118        0.176    0.088        10.688
30      3.225        0.058    0.806        0.312    0.025        12.734
35      0.000        0.066    0.759        0.266    0.019        19.351

Table 5
Problem class (C): percentage errors and times in seconds, single instances

        ε = 1/10              ε = 1/100             ε = 1/1000
n       Per. error   Time     Per. error   Time     Per. error   Time
10      2.953        0.033    0.000        0.106    0.000        5.296
50      1.004        0.092    0.519        0.751    0.014        45.154
100     0.501        0.179    0.490        0.733    0.000        73.532
500     0.100        0.872    0.100        0.877    0.050        404.215
1000    0.050        1.740    0.050        1.754    0.050        414.755
5000    0.010        0.009    0.010        0.067    0.010        0.644
10000   0.050        0.018    0.005        0.140    0.005        1.319

Computational results show that the algorithm has an average performance much better than on these worst-case examples. Still, compared to previous approximation approaches, our algorithm is also successful on these instances, requiring, e.g. for class (B), never more than 20 s. Finally, as shown in Table 5, for class (C) problems the algorithm generates decreasing errors when n increases. In particular, the instances become "easy", i.e. only the greedy algorithm is applied, when 1/ε ≤ (n − 1)/2. In Table 5 all the instances with ε = 1/10 and n ≥ 50, ε = 1/100 and n ≥ 500, and ε = 1/1000 and n ≥ 5000, respectively, only required Step 4 of the algorithm. For this reason the results for n = 50 000 and 100 000 are not shown.

5. Concluding remarks

In this paper we have presented a fully polynomial approximation scheme which solves the Subset-Sum Problem with accuracy ε in time O(min{n · 1/ε, n + 1/ε² log(1/ε)}) and space O(n + 1/ε). Moreover, for all instances with subset capacity c where the optimal solution is less than (1 − ε)c, our algorithm actually computes the optimal solution.

With this scheme we could efficiently solve large instances with high accuracy on a personal computer which were intractable for previously known approximation algorithms on bigger computers (e.g. a CDC-Cyber 730). An objection raised so far against fully polynomial approximation schemes for the SSP was that "they can be impractical for relatively large values of 1/ε" [17]. Since for this scheme the memory requirement is only O(n + 1/ε), there should be no reason to prefer polynomial approximation schemes to this fully polynomial approximation scheme.

The divide and conquer technique was also used in an improved approximation algorithm for the knapsack problem (see [11,12]). Finally, an interesting open problem would be to get rid of the logarithm in the time bound of the algorithm, i.e. to reduce the time complexity to O(min{n · 1/ε, n + 1/ε²}).

References

[1] R.E. Bellman, Dynamic Programming, Princeton University Press, Princeton, 1957.
[2] V. Chvatal, Hard knapsack problems, Oper. Res. 28 (1980) 1402–1411.
[3] D. Dor, U. Zwick, Selecting the median, in: Proceedings of the Sixth ACM-SIAM Symposium on Discrete Algorithms, 1995, pp. 28–35.
[4] M.R. Garey, D.S. Johnson, Computers and Intractability: A Guide to the Theory of NP-Completeness, Freeman, San Francisco, 1979.
[5] G.V. Gens, E.V. Levner, Approximation algorithms for certain universal problems in scheduling theory, Soviet J. Comput. System Sci. 6 (1978) 31–36.
[6] G.V. Gens, E.V. Levner, Fast approximation algorithms for knapsack type problems, in: K. Iracki, K. Malinowski, S. Walukiewicz (Eds.), Optimization Techniques, Part 2, Lecture Notes in Control and Information Sciences, Vol. 74, Springer, Berlin, 1980, pp. 185–194.
[7] G.V. Gens, E.V. Levner, A fast approximation algorithm for the subset-sum problem, INFOR 32 (1994) 143–148.
[8] O.H. Ibarra, C.E. Kim, Fast approximation algorithms for the knapsack and sum of subset problems, J. ACM 22 (1975) 463–468.
[9] D.S. Johnson, Approximation algorithms for combinatorial problems, J. Comput. System Sci. 9 (1974) 339–356.
[10] R.M. Karp, The fast approximate solution of hard combinatorial problems, in: Proceedings of the Sixth Southeastern Conference on Combinatorics, Graph Theory, and Computing, Utilitas Mathematica Publishing, Winnipeg, 1975, pp. 15–31.
[11] H. Kellerer, U. Pferschy, A new fully polynomial time approximation scheme for the knapsack problem, J. Combin. Optim. 3 (1999) 59–71.
[12] H. Kellerer, U. Pferschy, Improved dynamic programming in connection with an FPTAS for the knapsack problem, J. Combin. Optim., to appear.
[13] H. Kellerer, U. Pferschy, M.G. Speranza, An efficient fully polynomial approximation scheme for the subset-sum problem, in: Proceedings of the Eighth ISAAC Symposium, Lecture Notes in Computer Science, Vol. 1350, Springer, Berlin, 1997, pp. 394–403.
[14] E. Lawler, Fast approximation algorithms for knapsack problems, Math. Oper. Res. 4 (1979) 339–356.
[15] M.J. Magazine, O. Oguz, A fully polynomial approximation algorithm for the 0–1 knapsack problem, European J. Oper. Res. 8 (1981) 270–273.
[16] S. Martello, P. Toth, Worst-case analysis of greedy algorithms for the subset-sum problem, Math. Programming 28 (1984) 198–205.
[17] S. Martello, P. Toth, Approximation schemes for the subset-sum problem: survey and experimental results, European J. Oper. Res. 22 (1985) 56–69.
[18] S. Martello, P. Toth, Knapsack Problems: Algorithms and Computer Implementations, Wiley, Chichester, 1990.
[19] D. Pisinger, Linear time algorithms for knapsack problems with bounded weights, J. Algorithms 33 (1999) 1–14.