A Fast 2-Approximation Algorithm for the Minimum ... - CiteSeerX

A Fast 2-Approximation Algorithm for the Minimum Manhattan Network Problem ⋆ Zeyu Guo1 , He Sun2 , and Hong Zhu3 1

Department of Computer Science and Engineering Fudan University, China 2 Shanghai Key Laboratory of Intelligent Information Processing Fudan University, China 3 Shanghai Key Laboratory of Trustworthy Computing East China Normal University, China {gzy,sunhe,hzhu}@fudan.edu.cn

Abstract. Given a set T of n points in IR2 , a Manhattan Network G is a network with all its edges horizontal or vertical segments, such that for all p, q ∈ T , in G there exists a path (named a Manhattan path) of the length exactly the Manhattan distance between p and q. The Minimum Manhattan Network (MMN) problem is to find a Manhattan network of the minimum length, i.e., the total length of the segments of the network is to be minimized. In this paper we present a 2-approximation algorithm with time complexity O(n2 ), which improves the 2-approximation algorithm with time complexity Ω(n8 ), proposed by Chepoi, Nouioua et al.. To the best of our knowledge, this is the best result on this problem.

1

Introduction

A rectilinear path between two points p, q ∈ IR2 is a path connecting p and q with all its edges horizontal or vertical segments. Furthermore, a Manhattan path between p and q is a rectilinear path with its length exactly dist(p, q) := |p.x − q.x| + |p.y − q.y|, i.e., the Manhattan distance between p and q. Given a set T of n points in IR2 , a network G is said to be a Manhattan network on T , if for all p, q ∈ T there exists a Manhattan path between p and q with all its segments in G. For the given network G, let the length of G, denoted by L(G), be the total length of all segments of G. For the given point set T , the Minimum Manhattan Network (MMN) Problem is to find a Manhattan network G on T with minimum L(G). From the problem description, it is easy to show that there is a close relationship between the MMN problem and planar t-spanners. For t ≥ 1, if there exists a planar graph G such that for all p, q ∈ T , there exists a path in G connecting p and q of length at most t times the distance between p and q, G is said to be ⋆

This work is supported by Shanghai Leading Academic Discipline Project(Project Number:B412) and National Natural Science Fund (grant #60496321 and #60703091). Correspondence author: He Sun

a t-spanner of T . The MMN Problem for T is exactly the problem to compute the 1-spanner of T under the L1 -norm. Motivation: The spanners and the MMN problem have applications in city planning, network layout, distributed algorithms and VLSI circuit design. Lam, Alexandersson et al. [8] showed an approximation algorithm for Manhattan networks on alignment graphs that are relevant to sequence alignments in computational biology. Related works: The MMN problem was first introduced by Gudmundsson, Levcopoulos et al. [6], and until now, it is open whether this problem belongs to the complexity class P. Gudmundsson et al. [6] proposed an O(n3 )-time 4approximation algorithm, and an O(n log n)-time 8-approximation algorithm. Kato, Imai et al. [7] presented an O(n3 )-time 2-approximation algorithm. However, the proof of their algorithm correctness is incomplete [4]. In spite of that, their paper still provided a valuable idea, that it suffices for G to be a Manhattan network if for each of O(n) certain pairs there exists a Manhattan path connecting its two points. Thus it is not necessary to enumerate all the pairs in T ×T . Following this idea, Benkert, Wolff et al. [1, 2] proposed an O(n log n)-time 3-approximation algorithm. They also described a mixed-integer programming (MIP) formulation of the MMN problem. After that, Chepoi, Nouioua et al. [4] proposed a 2-approximation rounding algorithm by solving the linear programming (LP) relaxation of the MIP. In [9], S. Seibert and W. Unger proposed a 1.5-approximation algorithm. Unfortunately, their proof is incorrect. Therefore 2-approximation is, to our best knowledge, the lowest approximation ratio for this problem. However, since the algorithm of Chepoi, Nouioua et al. is based on linear programming, its time complexity, as pointed out in [1], may be as high as Ω(n8 ). Outline of our approach: In a high-level overview, our algorithm is as follows: partition the input into several blocks (ortho-convex regions) that can be solved independently of each other. For the blocks, some can be trivially solved optimally, whereas only one type of blocks is difficult to solve. For such a nontrivial block there are some horizontal and vertical strips which can be solved by horizontal and vertical nice covers plus switch segments to connect neighboring points in the same strip. Furthermore, we add some missing connections in regions called staircases. We implement this step using the dynamic programming technique to guarantee that staircases are locally optimal. Though a similar method can be found in previous work, we define a new form of staircase, called revised staircase (RS), and use the speed-up technique of the dynamic programming introduced in [5]. For the approximation analysis, the ratio of 2 follows simply from the fact that the strips and staircases are solved locally optimally, and though these regions may overlap, the overlapping segments are counted at most twice. Our results: In this paper we present a 2-approximation algorithm with time complexity O(n2 ). Furthermore, for the given set T , let H and W be the height and the width of the minimum rectangle covering all points in T with its sides parallel to the axes. Let G⋆ be an MMN on T and G be the network

constructed by our algorithm. From the algorithm description, it is proven that L(G) ≤ 2L(G⋆ ) − H − W .

2

Preliminaries

Basic notations: For p = (p.x, p.y) ∈ IR2 , let Qk (p) denote the k-th closed quadrant with respect to the origin p, e.g., Q1 (p) := {q ∈ IR2 | p.x ≤ q.x, p.y ≤ q.y}. Define R(p, q) as a closed rectangle (possibly degenerate) where p, q ∈ IR2 are its two opposite corners. BV (p, q) is defined as the vertical closed band bounded by p, q, whereas BH (p, q) denotes the horizontal closed band bounded by p, q. For the given point set T , let Γ be the union of vertical and horizontal lines which pass through some point in T . In addition, we use [c, d] to represent the vertical or horizontal segment with endpoints c and d, as Fig. 1 shows.

a

c b

d e

Fig. 1. T = {a, b, c, d, e}. The vertical and horizontal lines compose Γ .

Pareto envelope: The Pareto envelope, originally proposed by Cheopi et al. [4], plays an important role in our algorithm and we give a brief introduction. Given the set of points T , a point p is said to be dominated by q if ∀t ∈ T : dist(q, t) ≤ dist(p, t) ∧ ∃t ∈ T : dist(q, t) < dist(p, t) . A point is said to be an efficient point if it is not dominated by any point in the plane. The Pareto envelope of T is the set of all efficient points, denoted by T P(T ). SFig. 2 shows an example of P(T ). It is not hard to prove that P(T ) = u∈T v∈T R(u, v). Chalmet et al. [3] demonstrated that P(T ) can be built in O(n log n) time. They also presented some other properties of P(T ). In particular, P(T ) is ortho-convex, i.e., the intersection of P(T ) with any vertical or horizontal line is continuous, which is equivalent to the fact that for any two points p, q ∈ P(T ), there exists a Manhattan path in P(T ) between p and q. In [4] Chepoi et al. also showed that the Pareto envelope is the union of some ortho-convex (possibly degenerate) rectilinear polygons (called blocks). Two blocks can overlap at only one point which is called a cut vertex. We denote by C the set of cut vertices, and let T + := T ∪ C. A vertical sweeping line ℓ moving from left to right may intersect two or more blocks only when ℓ ⊆ Γ , since only such kind ofP ℓ passes through the cut vertices where blocks overlap. P This fact yields W = B⊆P(T ) WB , and similarly H = B⊆P(T ) HB , where HB and WB are the height and the width of the minimum rectangle covering the block B. For a block B let TB := T + ∩ B. We say B is trivial if B is a

rectangle (or degenerate to a segment) such that |TB | = 2. It is known that the two points in TB must be two opposite corners of B when it is trivial. In Fig. 2, C = {a, b, c, d} and only the block between c and d is non-trivial.

d c a

(a)

b

(b)

Fig. 2. An example of a Pareto envelope. The black points in (a) are the set T . The two separate grey regions in (b) are non-degenerate blocks, whereas the black lines are degenerate blocks. All these blocks form the Pareto envelope P(T ).

Chepoi et al. [4] proved that an MMN on T + is also an MMN on T , and to obtain an MMN on T + , it suffices to build an MMN on TB for each B ⊆ P(T ). The MMN in any trivial block B can be built by simply connecting the two points in TB using a Manhattan path. So we have reduced the MMN problem on T to MMN on non-trivial blocks. For a non-trivial block B denote its border by ∂B and let ΓB := Γ ∩ B. We call a corner p in ∂B a convex corner if the interior angle at p equals to π2 , otherwise p is called a concave corner. Lemma 1. [4] For any non-trivial block B and any convex corner p in ∂B, it holds that p ∈ TB . Lemma 2. [4] For any non-trivial block B, there exists an MMN GB on TB such that GB ⊆ ΓB . Furthermore, any MMN GB ⊆ ΓB on TB contains ∂B. Strips and staircases: Informally, for p, q ∈ TB , p.y < q.y, we call R(p, q) a vertical strip if it does not contain any point in the region BV (p, q) except the vertical lines {(x, y)|x = p.x, y ≤ p.y} and {(x, y)|x = q.x, y ≥ q.y}. Similarly, for the points p, q ∈ TB , p.x < q.x, we call R(p, q) a horizontal strip R(p, q) if it does not contain any point in the region BH (p, q) except the horizontal lines {(x, y)|x ≤ p.x, y = p.y} and {(x, y)|x ≥ q.x, y = q.y}. Especially, we say a vertical or horizontal strip R(p, q) is degenerate if p.x = q.x or p.y = q.y. Fig. 3 gives an example of a horizontal strip. The other notion which plays a critical role in our algorithm is the staircase. There are four kinds of staircases specified by a parameter k ∈ {1, · · · , 4}, and without loss of generality we only describe the one with k = 1. Suppose R(p, q) is a vertical strip and R(p′ , q ′ ) is a horizontal strip, such that q ∈ Q1 (p), q ′ ∈ Q1 (p′ ), p, q ∈ BV (p′ , q ′ ), p′ , q ′ ∈ BH (p, q), i.e., they cross in the way as Fig. 4

p

R(p, q)

q

t

Fig. 3. The rectangle is a horizontal strip. Any point in TB within BH (p, q) can only be placed on the dashed linee, e.g., the point t.

shows. Let o be the topmost and rightmost point in R(p, q) ∩ R(p′ , q ′ ). Denote by Tpp′ |qq′ the set of v in TB ∩SQ1 (o) such that (Q3 (v)\Q3 (o)) ∩ TB = {v}. If Tpp′ |qq′ 6= ∅, then let Spp′ |qq′ = v∈Tpp′ |qq′ R(o, v), which is said to be a staircase (see Fig. 4). In this figure, no point in TB is located in the dark area and the light grey regions (i.e., the staircase and the two unbounded half-bands) except those in Tpp′ |qq′ . As was observed in [1, 4], the interiors of two staircases are disjoint, and the interior of a staircase and a strip are disjoint. For a strip R(p, q), (p, q) is called a strip pair. For each staircase Spp′ |qq′ , let v be any point in Tpp′ |qq′ , and (v, p) (also (v, p′ )) is called a staircase pair. Theorem 1. [4] For a network GB , if for any strip pair or staircase pair (p, q), p, q ∈ TB , there exists a Manhattan path in GB connecting p and q, then GB is a Manhattan network on TB .

3

Algorithm Description

Following [1] we give the following definitions: a union of vertical segments CV is said to be a vertical cover if for any horizontal line ℓ and any vertical strip R that ℓ intersects, it holds that ℓ ∩ R ∩ CV 6= ∅. A union of horizontal segments CH is said to be a horizontal cover if for any vertical line ℓ and any horizontal strip R that ℓ intersects, it holds that ℓ ∩ R ∩ CH 6= ∅. Furthermore, a minimum vertical cover (MVC) is the vertical cover of the minimum length, as Fig. 5 shows, whereas a minimum horizontal cover (MHC) is the horizontal cover of the minimum length. In addition, an MVC/MHC is said to be nice if any of its segments contains at least one point in TB . A nice cover is the union of a nice MVC and a nice MHC. It can be shown that a nice cover, expressed by EC , always exists [1, 2]. Benkert et al. [1, 2] proposed an O(n log n)-time algorithm that computes a nice cover EC for the set of points TB , |TB | = n, such that EC ⊆ Γ . For a nondegenerate strip R(p, q), we use [p, a] and [q, b] to express the segments of EC covering R(p, q). Especially, a = p if no such a segment contains p, and b = q if no such a segment contains q. Note that [p, a] ∪ [q, b] ∪ [a, b] is a Manhattan path. We say [a, b] is a switch segment. Let ES denote the union of switch segments, and it has been proven that L(ES ) ≤ HB + WB . Theorem 2. There exists a procedure CreateNC to construct a nice cover EC such that EC ⊆ ΓB .

q Tpp′ |qq′

Spp′ |qq′

p

o

′

q′

p Fig. 4. A staircase consisting of a dark grey Fig. 5. A nice MVC consisting of black lines region and its borderline

Proof. The procedure CreateNC consists of two steps. In the initial step, we run the algorithm proposed by Benkert et al. [1, 2] to get a nice cover EC . In the second step, we modify EC to guarantee that EC ⊆ ΓB . Clearly a segment e of EC which is not entirely laid on B must be used to cover a nondegenerate strip. Without loss of generality, we use e to cover R(p, q), a vertical strip with p the lower left corner and q the upper right corner. Assume that e contains q. From q ∈ B we know ∂B crosses e. From p ∈ B, ∂B turns at some point p′ down to p, as Fig. 6 shows. Let q ′ be the bottom endpoint of e. It turns out that the Manhattan path between p and q switches at the y-ordinate of q ′ which is less than that of p′ . We move the switch segment up to the y-ordinate of p′ (note that switch segments are not added yet so the movement is just done in concept). Accordingly, the part of e lying outside of B is moved left onto ∂B. Since no point of TB is put outside of B, the part that has been moved is not used to cover any other strip, which shows that EC is still a nice cover after the movement. We repeat the modification until no such e exists. Then, we have EC ⊆ ΓB . ⊓ ⊔ Corollary 1. For |TB | = n, the time complexity of CreateNC is O(n log n). Proof. From [1], the time complexity of creating a nice cover is O(n log n). Now we analyze the running time of the second step. Since each point of TB is exactly in one vertical segment and one horizontal segment in EC , there exist at most 2n segments in EC . Due to the fact that one modification step decreases one segment that is not entirely on B and the number of segments is O(n), the time complexity of the second step is O(n). In summary, the running time of CreateNC is O(n log n). ⊓ ⊔ Lemma 3. After the procedure CreateNC, ∂B ⊆ EC . Proof. For p, q ∈ ∂B, let ∂Bpq be the part of ∂B from p to q in the counterclockwise direction (p, q excluded). It suffices to prove that ∂Bpq ⊆ EC whenever p, q ∈ ∂B, TB ∩ ∂Bpq = ∅, i.e., p, q are two neighboring points. This is obviously

true if p.x = q.x or p.y = q.y. For the other cases, we know there is no convex corner in ∂Bpq since if such a convex corner exists, then it is not in TB , which contradicts Lemma 1. Thus there is a concave corner on ∂Bpq , denoted by t. Without loss of generality, assume that p, q ∈ Q4 (t), as Fig. 7 shows. Among the points in TB let p′ be the leftmost one in Q1 (p) (choose the bottommost if more than one exist) and q ′ be the topmost point in Q3 (q) (choose the rightmost if more than one exist). Note it is possible that p′ = q or q ′ = p. It is clear that R(p, p′ ) is a vertical strip and R(q, q ′ ) is a horizontal strip. From EC ⊆ B we have [p, t] ⊆ EC to cover R(p, p′ ), and [q, t] ⊆ EC to cover R(q, q ′ ). Therefore ∂Bpq ⊆ EC . ⊓ ⊔

q p′

∂B

q p′

p′

∂B t

e q′

q′ p

∂B

p

Fig. 6. Modifying the nice cover

q′

q

p Fig. 7. A concave corner

Now we add ES to G and complete the procedure constructing the Manhattan paths of all the strip pairs. Let Mpq denote the Manhattan path constructed between p and q. Next we deal with the staircase pairs. Definition 1 (revised staircase). For a staircase Spp′ |qq′ , let o′ be the intersection of Mpq and Mp′ q′ (if more than one intersectionsSexist then choose any ⋆ ′ one). Let the revised staircase (RS) Spp ′ |qq ′ be the part of v∈Tpp′ |qq′ R(o , v)\∂B enclosed by Mpq , Mp′ q′ (Mpq , Mp′ q′ excluded). Fig. 8 shows how to compute an RS from a staircase. From the definition, notice that ∂B does not belong to the revised staircase. It is not hard to prove that no point of TB falls into the interior of any RS. This fact implies that the ⋆ interiors of two RS are disjoint, and the interior of the RS Spp ′ |qq ′ does not overlap any strip except R(p, q) and R(p′ , q ′ ). Given the staircase Spp′ |qq′ , we will describe how to build the Manhattan paths of all the staircase pairs of Spp′ |qq′ . Without loss of generality, assume that R(p, q) is a vertical strip and R(p′ , q ′ ) is a horizontal strip where q, q ′ ∈ Q1 (o), as Fig. 4 shows. The other cases are symmetric. Since in the following analysis we need to deal with each revised staircase, we omit the subscript and write ⋆ S := Spp′ |qq′ , S ⋆ := Spp ′ |qq ′ and TS := Tpp′ |qq ′ . Let m := |TS |+1, v0 := q, vm := q ′ . Express the points in TS as v1 , v2 , · · · , vm−1 in the order from the topmost and leftmost one to the bottommost and rightmost one. From Lemma 2 we know there exists a Manhattan path in B for every

q

q

∂B

Mpq Spp′ |qq′

q

o

p′ p

⋆ Spp ′ |qq ′

′

o

p′ Mp′ q′

q′

′

p

Fig. 8. From a staircase to an RS. The black line in the left picture is ∂B.

two consecutive points vi , vi+1 . These Manhattan paths, together with Mpq and Mp′ q′ which are also in B, enclose a closed area covering S ⋆ . So we have S ⋆ ⊆ B, which yields that all segments added within S ⋆ are also within B. Theorem 3. There exists a procedure CompOptNet such that for the given RS S ⋆ with the point set Ts , |TS | = n, CompOptNet takes O(n2 ) time and space to compute an optimal network E such that E ∪ ∂B connects each point in TS to either Mpq or Mp′ q′ . Proof. We use dynamic programming technique to implement this procedure. Suppose [xi , vi ] is the vertical segment of S ∩ ∂B connecting vi , and [yi , vi ] is the horizontal one, i.e., [xi , vi ] and [yi , vi ] are the segments we subtracted from R(o′ , vi ) (xi = vi or yi = vi if no such a vertical or horizontal segment exists), as Fig. 9 shows. Let S(0, m) := S ⋆ and we use c(0, m) to express the length of the optimal network E ⊆ S(0, m), such that E ∪ ∂B connects v1 , v2 , · · · , vm−1 to either the left boundary or bottom boundary of S(0, m). For the region S(0, m), we need to choose a suitable k, 0 ≤ k < m, and split S(0, m) into two small areas S(0, k) and S(k + 1, m) according to the value of k such that the function c(0, m) is minimized. For 0 ≤ i ≤ j ≤ m, the notion of S(i, j) is defined recursively as described in the following paragraph. Fig. 9 gives an intuitive way to understand this partition method. For the region S(i, j) and chosen k, i ≤ k < j, we connect yk left using a horizontal segment, denoted by w(i, k), and express the area above w(i, k) by S(i, k) if k > i, otherwise S(i, k) = ∅. If k + 1 6= j, we connect xk+1 down using a vertical segment h(k + 1, j) and denote the area on the right side of h(k + 1, j) by S(k + 1, j). It is not difficult to see that c(i, j) = min c(i, k) + c(k + 1, j) + w(i, k) + h(k + 1, j) i≤k