Locating Active Sensors on Traffic Networks

Annals of Operations Research 136, 229–257, 2005. © 2005 Springer Science + Business Media, Inc. Manufactured in The Netherlands.

M. GENTILI ∗ [email protected]
Computing Science Department, University of Salerno, Via Ponte Don Melillo, 84084, Fisciano (Sa), Italy

P.B. MIRCHANDANI [email protected]
System and Industrial Engineering Department, ATLAS Research Center, The University of Arizona, Tucson, AZ 85721-0020, USA

Abstract. Sensors are used to monitor traffic in networks. For example, in transportation networks, they may be used to measure traffic volumes on given arcs and paths of the network. In this paper, a sensor is called active when it reads identifications of vehicles, including their routes in the network, that the vehicles actively provide when they use the network. On the other hand, the conventional inductance loop detectors are passive sensors that mostly count vehicles at points in a network to obtain traffic volumes (e.g., vehicles per hour) on a lane or road of the network. This paper introduces a new set of network location problems that determine where to locate active sensors in order to monitor or manage particular classes of identified traffic streams. In particular, it focuses on the development of two generic locational decision models for active sensors, which seek to answer these questions: (1) “How many and where should such sensors be located to obtain sufficient information on flow volumes on specified paths?”, and (2) “Given that the traffic management planners have already located count detectors on some network arcs, how many and where should active sensors be located to get the maximum information on flow volumes on specified paths?” The problem is formulated and analyzed for three different scenarios depending on whether there are already count detectors on arcs and, if so, whether all the arcs or only a fraction of them have them. Locating an active sensor results in a set of linear equations in path flow variables, whose solution provides the path flows. The general problem, which is related to the set-covering problem, is shown to be NP-hard, but special cases, in which an arc may carry only two routes, are shown to be polynomially solvable. New graph theoretic models and theorems are obtained for the latter cases, including the introduction of the generalized edge-covering by nodes problem on the path intersection graph for these special cases. An exact algorithm for the special cases and an approximate one for the general case are presented.

Keywords: location theory, sensors, traffic networks, network covering problems, linearly independent equations

1. Introduction

Sensors may be classified as “passive” or “active” depending on the level of information that a vehicle provides en route from its point of origin to its destination. For example, the inductance loop detector is a passive sensor because the vehicle does not actively generate a signal on its location or speed; when it passes over the inductance loop embedded in the road pavement, the change in the magnetic field sends an electrical signal that indicates the vehicle passage and, from the shape of the electrical signal, an approximate measure of the vehicle speed. Generally, such loop detectors are used to count vehicles in order to give traffic volumes (e.g., vehicles per hour) on a lane or road of a network. On the other hand, for automatic toll collection systems, vehicles actively provide an “identification” that may be picked up by roadside (or lane overhead) readers in order to charge an appropriate toll for usage of some facility. The underlying technology for active sensors could be based on microwave transmission from vehicle to reader, or on some sort of installed barcode on the vehicle that a laser-based reader decodes, or on cameras that read vehicle-specific characters in acquired visual images, or on some other sensor technology and other sensor media.

In general, whether a sensor is active or passive, it obtains an “image” of vehicle/traffic flow (using sensing media such as electrical, radar, laser, microwave, audio, etc.) and processes it to identify characteristics of the vehicle (car, bus, truck, etc.) or the traffic flow. Recognition of a bus or a truck could provide information on its path. Recognition of a vehicle’s license plate could provide its origin and/or destination. As mentioned above, “images” could also be coded transmissions from vehicles that provide additional information, which the sensors (readers) decode to obtain, for example, freight information from trucks (goods carried, their origin, their destination, container weight, customs clearance, fees paid, etc.), route/schedule information from buses (route number, schedule number, passenger count, etc.) and account information from electronic toll tags (tolls paid, credit remaining, vehicle ID, etc.).

∗ Corresponding author.
In this paper, we will refer to active sensors also as path-ID sensors, which effectively give the vehicle IDs and their paths. Passive sensors will be characterized by the processing of images at points on lanes to count vehicles or measure traffic flow, or on nodes where vehicles may be tracked in a visual image to measure turning ratios. For both active and passive sensors, a new set of location problems arises on where to locate such sensors to monitor or manage the particular class of traffic detected. In this paper, we will consider some types of generic sensors and their locations that address a large number of sensor location problems. The paper is focused, in particular, on the development of two generic locational decision models for active sensors, which seek to answer these questions: (1) “How many and where should path-ID sensors be located to obtain sufficient information on flow volumes on paths?”, and (2) “Given that the traffic management planners have already located count detectors (counting sensors) on some network arcs, how many and where should path-ID sensors be located to get the maximum information on flow volumes on paths?”

The paper is organized as follows. The rest of this section briefly describes the kinds of sensors that lead to our research questions. Sections 2 and 3 define and model specific scenarios addressed for locating path-ID sensors on a network. Polynomial instances are presented in Section 4 and a greedy-type heuristic for the general case is given in Section 5.

We now discuss the two new types of sensors that can be located, in addition to the conventional counting sensors, to measure traffic flow: (i) sensors on arcs that detect flow volumes on paths (path-ID sensors) and (ii) sensors on nodes that detect turning ratios (image sensors).

1.1. Path-ID sensors

Flow on an arc contains flow from several paths; we assume that when locating a path-ID sensor on an arc of the network we can measure the flow volumes of each path to which that arc belongs. On the simple network in figure 1 there are four paths that have a common origin O and a common destination D. The number labeled on each arc is the total flow on that arc. On arc (1, 3), for example, there are 22 units of flow (such information may be available through counting sensors). On this arc, there are two paths, path 1 and path 2. We assume that by locating a path-ID sensor on arc (1, 3), for example, we can measure the flow volumes of path 1 and of path 2, that is, we know how the 22 units of flow on that arc are decomposed.

Figure 1. A network of 7 nodes and 8 arcs, having flows on four paths from origin node O to destination node D. The number label on each arc denotes the flow volume on that arc.

For such a model to be applied, we must assume that active identification is provided by the class of vehicles being monitored. In many countries/states/cities, special vehicles such as commercial trucks, buses, emergency vehicles, and trucks carrying hazardous material have electronic “tags” or transponders installed on them that transmit some sort of identification, from which one can obtain the vehicle’s planned path through the network.

1.2. Image sensors

Video vehicle detection systems provide non-intrusive vehicle detection through machine vision. Such systems allow users to place virtual detectors in the field of view, rather than physically placing the detectors on the roadway pavement, providing flexible detector placement. Video systems overcome the maintenance and reliability concerns of conventional inductive loop technology and provide detection regardless of pavement conditions. However, the performance of video-based detection systems depends on visual conditions (daylight, dark, reflection, etc.). An image can be obtained from either (i) a fixed camera mounted on a tall building or a pole or (ii) a moving camera installed on an air-borne platform such as a helicopter. By processing the images obtained by fixed or mobile cameras, it is possible to recognize vehicles on the scene and the movement of these vehicles. In this way, for example, by locating a fixed video camera that takes images of the traffic situation at an intersection of the network we can estimate the turning ratios at the intersection. For example, locating an image sensor on node 3 of the network in figure 1, we can measure the turning ratios at this node. That is, we are able to measure the proportion of flow that goes to the out-going arcs (3, 4) and (3, 5) from each in-coming arc (1, 3) and (2, 3).

Since the problems modeled and analyzed are new, there is no previous literature on them. There are a few sensor location problems, however, that are related to estimating traffic volumes from origins (O) to destinations (D) for a general network where vehicles use shortest routes from O to D. Here sensors determine traffic volumes on arcs and the goal of the problems is to update a prior OD matrix to obtain a new OD matrix that “results” in traffic volumes closer to the measured values; by “resulting” we mean that the underlying model assumes vehicles choose shortest routes and a traffic equilibrium results where used routes have equal travel times (see, e.g., Wardrop, 1952; Cascetta and Nguyen, 1988). In this scenario, Lam and Lo (1990) proposed some heuristic procedures to define where to locate sensors on the arcs of the network in order to obtain a better estimate of the OD matrix. Yang, Iida, and Sasaki (1991) proposed the so-called “OD covering rule” to locate sensors on arcs to bound the OD estimation error. Yang and Zhou (1998) defined three other rules to locate sensors on arcs to better estimate the OD matrix and proposed some heuristic procedures. Bianco, Confessore, and Reverberi (2001) and Bianco, Confessore, and Gentili (2003) studied a combinatorial optimization problem (the Sensor Location Problem) to locate sensors on nodes to get information on arc volumes on the non-monitored portion of the network.

The models developed in this paper may be applied in such scenarios if all vehicles are equipped with active sensors and in the resulting equilibrium most vehicles from an O to a D use only a few known paths. (Such an application may arise in the future only if drivers feel comfortable with the use of active sensors and are not concerned about “big brother” monitoring them.)

2. Locating path-ID sensors on arcs

In this section we describe the general problem addressed. First we need some additional notation and terminology.

Let $R = (N, A)$ be a graph representing the traffic network, where the set $N$ of nodes has size $|N| = n$ and the set $A$ of arcs has size $|A| = m$. A path $Y = \{a_1, a_2, \dots, a_s\}$ is a sequence of arcs $a_i \in A$ such that $a_i = (v, w)$ and $a_{i+1} = (w, z)$, $\forall i = 1, \dots, s-1$. Since flow on an arc contains flow from several paths, from different origin-destination pairs, we need to define total flow in terms of path flows. We will simply let this total flow be decomposed into path flows $y_i$ on path $Y_i$, $i = 1, \dots, p$, where $p$ is the number of paths used in the network. We will let $f_a$ be the total flow on arc $a \in A$ for the time interval being considered (a counting sensor on arc $a$ measures $f_a$).

Let $\mathcal{Y} = \{Y_1, Y_2, \dots, Y_p\}$ be the set of all the paths on the network. Given an arc $a \in A$, we denote by $\mathcal{Y}_a = \{Y_i : a \in Y_i\}$ the set of paths that contain that arc. Given a subset of arcs $A' \subseteq A$, the set $\mathcal{Y}_{A'} = \bigcup_{a \in A'} \mathcal{Y}_a$ is the set of the paths that contain at least one arc in $A'$, and we say this set is covered by $A'$.

Let $B = \{b_{ij}\}$, $i \in \{1, 2, \dots, m\}$, $j \in \{1, 2, \dots, p\}$, be the $m \times p$ arc/path incidence matrix, that is, $b_{ij} = 1$ if arc $a_i$ belongs to the path $Y_j$ and $b_{ij} = 0$ otherwise. For example, the $i$-th column of matrix $B$, $B^i = [0, 1, 1, 0, 1]$, denotes a path with arcs $a_2, a_3, a_5$ in a five-arc network consisting of arcs $A = \{a_1, a_2, a_3, a_4, a_5\}$. The index set of columns, $i = 1, 2, \dots, p$, will also be denoted by $C$, and the $i$-th column of $B$ will be denoted by $B^i$.

Our aim is to determine the flow volume $y_j$ of each path $Y_j$, $j = 1, \dots, p$, on the network by locating path-ID sensors on the arcs of the network. We recall that by locating a path-ID sensor on an arc of the network we are able to measure the flow of the paths to which that arc belongs. In our notation, this means that by locating a path-ID sensor on an arc $a \in A$ such that $a \in Y_i \cap Y_j \cap Y_k$ we can know the flows $y_i, y_j, y_k$.

Figure 2(a) shows a subnetwork of $n = 10$ nodes and $m = 13$ arcs, defined by $p = 8$ given paths on which there is possible flow. With each arc $a_i \in A$ in the subnetwork, the set of paths $\mathcal{Y}_{a_i}$ is associated. For example, with arc $a_8$, the set $\mathcal{Y}_{a_8} = \{Y_3, Y_5\}$ denotes the set of paths that contain arc $a_8$. In figure 2(b) the number labeled on each arc denotes the total volume of flow on that arc. For example, on arc $a_8$ there is a flow of 21 units. We can associate with each arc a linear equation decomposing the flow volume on the arc into flows on paths. For example, with arc $a_8$ we can associate the equation $y_3 + y_5 = 21$.

Let counting sensors be located on all the arcs of the network. In this case, we know the entire vector $f = \{f_1, f_2, \dots, f_m\}$ of the arc flows. To know the path flow volumes we should solve the system of linear equations:

$$B y = f \tag{1}$$
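As a minimal sketch of how system (1) is assembled, the snippet below builds the arc/path incidence matrix for the figure 2 example and prints the equation attached to one arc. The arc lists per path and the arc flows are read off system (2) of the paper; the names `PATH_ARCS`, `ARC_FLOWS`, `incidence_matrix` and `arc_equation` are hypothetical and chosen only for this illustration.

```python
# Arcs used by each path in the figure 2 example (arc and path indices start at 1).
# Assumption: these lists are read off the columns of system (2).
PATH_ARCS = {
    1: [1, 6, 11], 2: [1, 6, 13], 3: [2, 8, 12], 4: [2, 9, 12],
    5: [3, 8, 10], 6: [3, 9, 10], 7: [4, 5, 7, 11], 8: [4, 5, 7, 13],
}
# Total arc flows f_1..f_13 of the same example (right-hand side of system (2)).
ARC_FLOWS = [12, 18, 10, 10, 10, 12, 10, 21, 7, 10, 7, 18, 15]


def incidence_matrix(path_arcs, num_arcs):
    """Return B as a list of rows: B[i][j] = 1 if arc i+1 lies on path j+1."""
    paths = sorted(path_arcs)
    return [[1 if arc in path_arcs[j] else 0 for j in paths]
            for arc in range(1, num_arcs + 1)]


def arc_equation(B, flows, arc):
    """Format the row of By = f that belongs to the given arc (1-based)."""
    terms = [f"y{j + 1}" for j, b in enumerate(B[arc - 1]) if b]
    return " + ".join(terms) + f" = {flows[arc - 1]}"


B = incidence_matrix(PATH_ARCS, 13)
print(arc_equation(B, ARC_FLOWS, 8))  # y3 + y5 = 21
```

Storing paths as arc lists and deriving $B$ from them keeps the example close to the definition of $b_{ij}$; any other representation of the incidence matrix would serve equally well.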

Let $\mathrm{rank}(B) = k$ be the rank of matrix $B$. If $k = p$ and $m \ge p$, then any $p$ independent arc flow equations allow us to compute the unique $p$ path flows $\{y_1, y_2, \dots, y_p\}$. The more complex and interesting case is $k < p$, in which our new sensors need to be located to determine the path flows if they are determinable (i.e., if a unique feasible set of path flows exists). Thus, from here on, we assume $\min(m, p) = p$ and $\mathrm{rank}(B) = k < p$. Under this assumption, system (1) does not have a unique solution. This means that even by locating counting sensors on all arcs of the network we may not know (considering all data to be consistent) the flow volume of each path $Y_j$, $j = 1, \dots, p$.

Figure 2. A network of 10 nodes and 13 arcs with flows on 8 paths. Variables Yi on each arc denote the paths that use the arc.

In the example of figure 2, the flows on paths are given by solving the following system of linear equations, in which the rows correspond to arcs $a_1, \dots, a_{13}$ and the columns to the path flows $y_1, \dots, y_8$:

$$
\begin{pmatrix}
1 & 1 & 0 & 0 & 0 & 0 & 0 & 0 \\ % a_1
0 & 0 & 1 & 1 & 0 & 0 & 0 & 0 \\ % a_2
0 & 0 & 0 & 0 & 1 & 1 & 0 & 0 \\ % a_3
0 & 0 & 0 & 0 & 0 & 0 & 1 & 1 \\ % a_4
0 & 0 & 0 & 0 & 0 & 0 & 1 & 1 \\ % a_5
1 & 1 & 0 & 0 & 0 & 0 & 0 & 0 \\ % a_6
0 & 0 & 0 & 0 & 0 & 0 & 1 & 1 \\ % a_7
0 & 0 & 1 & 0 & 1 & 0 & 0 & 0 \\ % a_8
0 & 0 & 0 & 1 & 0 & 1 & 0 & 0 \\ % a_9
0 & 0 & 0 & 0 & 1 & 1 & 0 & 0 \\ % a_10
1 & 0 & 0 & 0 & 0 & 0 & 1 & 0 \\ % a_11
0 & 0 & 1 & 1 & 0 & 0 & 0 & 0 \\ % a_12
0 & 1 & 0 & 0 & 0 & 0 & 0 & 1    % a_13
\end{pmatrix}
\begin{pmatrix} y_1 \\ y_2 \\ y_3 \\ y_4 \\ y_5 \\ y_6 \\ y_7 \\ y_8 \end{pmatrix}
=
\begin{pmatrix} 12 \\ 18 \\ 10 \\ 10 \\ 10 \\ 12 \\ 10 \\ 21 \\ 7 \\ 10 \\ 7 \\ 18 \\ 15 \end{pmatrix}
\tag{2}
$$

The rank of the matrix of system (2) is $\mathrm{rank}(B) = 6$, and the number of variables (path flows) is $p = 8$. Therefore, the system does not have a unique solution. Indeed, by fixing the values of $p - \mathrm{rank}(B) = 2$ suitably chosen variables, we can determine the values of all the other variables. This means that if we somehow know the exact flows of two such paths of the network, we can find the flow volumes of all the remaining paths (assuming that data are consistent). In the example, if we can measure, by locating path-ID sensors on some arcs of the network, the flow volumes $y_1$ and $y_4$, for instance, we can determine the volumes of all the remaining paths by using (2). That is, by locating path-ID sensors on arcs we add to system (1) a set of new linear equations in order to have a unique solution for the path flows.
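The rank claim for system (2) can be checked with a few lines of exact arithmetic. The sketch below is only an illustration: `PATH_ARCS` encodes the columns of system (2) as arc lists, and `rank` is a hypothetical helper that performs Gaussian elimination over the rationals.

```python
from fractions import Fraction

# Assumption: arcs of each path, read off the columns of system (2).
PATH_ARCS = {
    1: [1, 6, 11], 2: [1, 6, 13], 3: [2, 8, 12], 4: [2, 9, 12],
    5: [3, 8, 10], 6: [3, 9, 10], 7: [4, 5, 7, 11], 8: [4, 5, 7, 13],
}
# 13 x 8 arc/path incidence matrix B.
B = [[1 if a in PATH_ARCS[j] else 0 for j in range(1, 9)] for a in range(1, 14)]


def rank(rows):
    """Rank of a matrix (list of rows) by Gaussian elimination over the rationals."""
    m = [[Fraction(v) for v in row] for row in rows]
    ncols = len(m[0]) if m else 0
    r = 0
    for c in range(ncols):
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(len(m)):
            if i != r and m[i][c] != 0:
                factor = m[i][c] / m[r][c]
                m[i] = [x - factor * y for x, y in zip(m[i], m[r])]
        r += 1
        if r == len(m):
            break
    return r


p = len(PATH_ARCS)
k = rank(B)
print(k, p - k)  # 6 2
```

Exact fractions are used instead of floating point so that the rank is not affected by rounding; for a 0/1 matrix of this size the cost is negligible.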



In the example, if we locate a sensor on arc $a_1$, we add the following two new equations to the system: $y_1 = 2$ and $y_2 = 10$. The new $(15 \times 8)$ matrix

$$B' = \begin{pmatrix} B \\ u_1 \\ u_2 \end{pmatrix},$$

obtained by adding to matrix $B$ the two new rows $u_1 = (1, 0, 0, 0, 0, 0, 0, 0)$ and $u_2 = (0, 1, 0, 0, 0, 0, 0, 0)$, has $\mathrm{rank}(B') = 7$, and we do not get a unique solution. Note that, knowing $y_1$ and $y_2$, we can indirectly determine some of the other path volumes: $y_7 = f_{a_{11}} - y_1$ on arc $a_{11}$ and $y_8 = f_{a_7} - y_7$ on arc $a_7$. If we locate two sensors on the network, one on arc $a_1$ and one on arc $a_4$, we add 4 new equations to the original system (1), and the new $(17 \times 8)$ matrix $B'$ is such that $\mathrm{rank}(B') = 7$; thus, again, these four new equations are still insufficient to determine all path flows. In summary, the question we want to answer is:

Question 1. What is the minimum number of path-ID sensors to locate on the network, and where should they be located, in order to add new equations to system (1) that result in the new system having full rank (i.e., a unique solution)?

Note that, in the above example, by locating two path-ID sensors, one on arc $a_4$ and one on arc $a_8$, we get 4 new equations and the new matrix becomes full rank. This is an optimal solution for the problem.

Also, we assumed the knowledge of the flows on every arc of the network. This may not be true in all scenarios. Indeed, in general, we might know flows on only some arcs of the network, where counting sensors are already located. That is, we suppose that a subset $f^1 = \{f_{a_{i_1}}, f_{a_{i_2}}, \dots, f_{a_{i_k}}\}$, $i_k < m$, of arc flows corresponding to the arc subset $A^1 \subseteq A$ is known, resulting in the system:

$$B^1 y = f^1 \tag{3}$$

where $B^1$ is the submatrix of $B$ formed by the equations associated with the arcs in $A^1$. The question in this case is:

Question 2. What is the minimum number of path-ID sensors to locate on the network, and where should they be located, in order to add new equations to system (3) that result in the new system having a unique solution?

We refer to this problem as the Sensor Location on Arc Problem (SLAP). In order to better understand the complexity we distinguish three different scenarios:

• Zero count information: we do not know the flow volume on any arc;
• Total count information: we know the flows on each arc of the network (i.e., we have system (1)); and
• Partial count information: we know the flows on some arcs of the network (i.e., we have system (3)).

In the next section we give a model formulation for the problem and we will refer again to these scenarios as SLAP-zero, SLAP-tot and SLAP (or SLAP-par), respectively.
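The three sensor placements discussed for Question 1 can be checked numerically. The following is a hedged sketch rather than the authors' procedure: `rank_with_sensors` is a hypothetical helper that appends one unit row per path measured by the chosen arcs and recomputes the rank. The last lines repeat the indirect computation of $y_7$ and $y_8$ with the arc flows of system (2).

```python
from fractions import Fraction

# Assumption: arcs of each path, read off the columns of system (2).
PATH_ARCS = {
    1: [1, 6, 11], 2: [1, 6, 13], 3: [2, 8, 12], 4: [2, 9, 12],
    5: [3, 8, 10], 6: [3, 9, 10], 7: [4, 5, 7, 11], 8: [4, 5, 7, 13],
}
B = [[1 if a in PATH_ARCS[j] else 0 for j in range(1, 9)] for a in range(1, 14)]


def rank(rows):
    """Rank of a matrix (list of rows) by Gaussian elimination over the rationals."""
    m = [[Fraction(v) for v in row] for row in rows]
    ncols = len(m[0]) if m else 0
    r = 0
    for c in range(ncols):
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(len(m)):
            if i != r and m[i][c] != 0:
                factor = m[i][c] / m[r][c]
                m[i] = [x - factor * y for x, y in zip(m[i], m[r])]
        r += 1
        if r == len(m):
            break
    return r


def rank_with_sensors(sensor_arcs):
    """Rank of B after adding a unit row for every path through a sensor arc."""
    measured = sorted(j for j, arcs in PATH_ARCS.items()
                      if any(a in arcs for a in sensor_arcs))
    unit_rows = [[1 if j == h else 0 for j in range(1, 9)] for h in measured]
    return rank(B + unit_rows)


for arcs in ([1], [1, 4], [4, 8]):
    print(arcs, rank_with_sensors(arcs))  # ranks 7, 7 and 8 respectively

# Indirect determination after a sensor on a_1 reports y1 = 2 and y2 = 10.
y1, f_a11, f_a7 = 2, 7, 10
y7 = f_a11 - y1   # flow on path 7 from the equation of arc a_11
y8 = f_a7 - y7    # flow on path 8 from the equation of arc a_7
print(y7, y8)     # 5 5
```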

3. Formulations and analyses of problem scenarios

In this section we formulate and analyze each scenario (Zero, Total and Partial count information) in order to better characterize the feasible region of the problem. The problem for the general case is NP-hard as proved in Gentili (2002). We give here model formulations for the three different scenarios. Some polynomial instances and a polynomial algorithm are given in Section 4.

3.1. Zero count information: SLAP-zero

In case of zero count information, we do not know any arc flow and we need to locate path-ID sensors to determine all path flows. Consider again the network of figure 2. Suppose we locate a path-ID sensor on arc $a_1$ and determine exactly the path volumes of $Y_1$ and $Y_2$. Then, if we knew the total flow on arc $a_{11}$, we could find indirectly the volume on path $Y_7$, but in this case we do not know the total flow on arc $a_{11}$. Therefore, in the case of zero count information, the number of path volumes to be directly measured is equal to the number of paths. It follows that a feasible solution for SLAP-zero is a set of arcs that covers all the paths. We then have the following definition.

Definition 1. A set $A' \subseteq A$ of arcs is feasible for SLAP-zero if $\mathcal{Y}_{A'} = \mathcal{Y}$.

In this case the optimization problem has a set-covering formulation. We define the binary variable $x_i \in \{0, 1\}$, with $x_i = 1$ if a path-ID sensor is located on arc $a_i \in A$ and $x_i = 0$ otherwise. The minimum number of path-ID sensors required to determine all the path flows is the solution of the following problem:

[SLAP-zero]

$$\min \sum_{i=1}^{m} x_i \tag{4}$$

subject to

$$\sum_{i=1}^{m} b_{ij} x_i \ge 1, \qquad j = 1, \dots, p \tag{5}$$

$$x_i \in \{0, 1\}, \qquad i = 1, \dots, m \tag{6}$$

The objective function (4) requires us to locate the minimum number of path-ID sensors on the arcs of the network. Constraints (5) state that the location of the path-ID sensors must cover all the paths of the network.
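To make the set-covering model (4)–(6) concrete, the sketch below solves it for the figure 2 example by exhaustive search over arc subsets of increasing size. This brute-force enumeration is only an illustration for a 13-arc instance and is not an algorithm proposed in the paper; `covered_paths` and `min_cover` are hypothetical names.

```python
from itertools import combinations

# Assumption: arcs of each path, read off the columns of system (2).
PATH_ARCS = {
    1: [1, 6, 11], 2: [1, 6, 13], 3: [2, 8, 12], 4: [2, 9, 12],
    5: [3, 8, 10], 6: [3, 9, 10], 7: [4, 5, 7, 11], 8: [4, 5, 7, 13],
}
ALL_PATHS = set(PATH_ARCS)


def covered_paths(arcs):
    """Paths that use at least one of the given arcs (the set covered by A')."""
    return {j for j, path in PATH_ARCS.items() if set(path) & set(arcs)}


def min_cover(num_arcs=13):
    """Smallest arc set covering every path, i.e. constraints (5) with objective (4)."""
    for size in range(1, num_arcs + 1):
        for subset in combinations(range(1, num_arcs + 1), size):
            if covered_paths(subset) == ALL_PATHS:
                return subset
    return None


solution = min_cover()
print(len(solution), solution)  # 4 (1, 2, 3, 4)
```

Since every arc of this example carries exactly two paths, no set of fewer than four arcs can cover all eight paths, which is consistent with the result of the search.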



3.2. Total count information: SLAP-tot

In the example of figure 2 we saw that, in the case of total count information, we need to measure directly only a subset of path volumes; the others can be determined through the dependencies among the arc flow equations. We also observed that the number of path flow volumes to be directly monitored is equal to the number of free variables of system (1). More formally, if the number of paths is $p$ and the rank of the incidence matrix is $\mathrm{rank}(B) = k$, then we need to measure directly at least $p - k$ variables. Obviously, not every set of $p - k$ paths to be measured requires the same number of path-ID sensors. Our aim is to choose, among all the subsets of $p - k$ paths, that subset which requires the minimum number of path-ID sensors.

Let us observe that there are some “wrong” choices of $p - k$ variables, that is, some sets of $p - k$ variables that, if directly measured, are not enough to solve the system. In this example, it turns out that if the flows on paths $Y_1$, $Y_4$ are measured, all other path flows are known. On the other hand, by measuring directly, say, paths $Y_1$ and $Y_2$ (through a path-ID sensor on arc $a_1$) we do not know indirectly all the other path flows. This implies that there are some “wrong” choices of arcs too. The question is, “Which are the good choices of arcs that result in a feasible subset of arcs for SLAP-tot, that is, those arcs that produce a unique solution to system (1)?”. In order to answer this question we first need to characterize the good sets of variables, those subsets of $p - k$ path flows which, if directly monitored, allow us to determine all other path flows. Lemma 1 to follow will help us characterize a right subset of path flows. We refer to such a subset as a rank-filling subset of variables.

At this point, we need some additional notation and definitions. Let $H = \{Y_{i_1}, Y_{i_2}, \dots, Y_{i_h}\}$, $H \subseteq \mathcal{Y}$, be a subset of paths whose flows are $y_{i_1} = f_{i_1}, y_{i_2} = f_{i_2}, \dots, y_{i_h} = f_{i_h}$. We associate with $H$ the set of $p$-dimensional row vectors $U_H = \{u_{i_1}, u_{i_2}, \dots, u_{i_h}\}$, where $u_{i_j} = (0, 0, \dots, 1, \dots, 0, 0)$ has its single 1 in position $i_j$. Let $f_H$ denote the column vector $(f_{i_1}, f_{i_2}, \dots, f_{i_h})$.

Definition 2. Given system (1) of linear equations, where matrix $B$ is an $m \times p$ matrix and $\mathrm{rank}(B) = k < p$, a subset of variables $H = \{y_{i_1}, y_{i_2}, \dots, y_{i_h}\}$ is rank-filling for the system if the following two conditions are satisfied: (i) the size of the set is such that $|H| = h \ge p - k$; (ii) the system $\begin{pmatrix} B \\ U_H \end{pmatrix} y = \begin{pmatrix} f \\ f_H \end{pmatrix}$ has a unique solution.

Thus, a rank-filling set of variables is a “good” set of variables to solve our problem. From Definition 2 it follows directly that a subset of variables of size $h < p - k$ cannot be a rank-filling subset for the system. Note that any set of $p - k$ free variables for system (1) is a minimum rank-filling set of variables for that system, and vice versa. This is because each subset $H$ of free variables contains $h = p - k$ elements, and fixing the value of the free variables is equivalent to adding to the system the set of linearly independent equations $U_H y = f_H$. Also, note that there is a one-to-one correspondence between sets of free variables (i.e., minimum rank-filling sets) of a system of linear equations and the maximal linearly independent sets of columns of the matrix of the system. Let us illustrate this by an example. Consider system (1) with matrix $B$ of rank $k < p$. Without loss of generality, let $B = (B_1 \; B_2)$, $y = \begin{pmatrix} y_1 \\ y_2 \end{pmatrix}$, $B_1 y_1 + B_2 y_2 = f$. If $B_1$ has rank $k$ (i.e., its $k$ columns are linearly independent), then $y_2$ is the set of free variables with respect to matrix $B_1$. Hence, all the minimum rank-filling sets of variables of a system can be obtained by enumerating all the maximal linearly independent sets of columns of matrix $B$. This correspondence holds for every rank-filling set (not necessarily minimum) of variables, as stated by the following lemma, whose detailed proof is given in Gentili (2002).

Lemma 1. A set of $h$ variables $H = \{Y_{i_1}, Y_{i_2}, \dots, Y_{i_h}\}$, corresponding to columns of $B$ with index in $C_H = (i_1, i_2, \dots, i_h)$, $h \ge p - k$, $\mathrm{rank}(B) = k$, is rank-filling for system (1) if and only if the columns whose index is in $C \setminus C_H$ are linearly independent.

The corollary below follows directly.

Corollary 1. If $H$ is a rank-filling set of variables such that $|H| = h > p - k$, there exists a subset $H' \subseteq H$ such that $H'$ is a rank-filling set and $|H'| = p - k$.

The rank-filling sets of variables can be characterized by Lemma 1. In the example of figure 2, we see that by locating path-ID sensors on arcs $a_4$ and $a_8$ we can measure the flows $\{y_3, y_5, y_7, y_8\}$. Indeed, this set is a rank-filling set of variables, and the remaining columns of $B$ are linearly independent. Clearly, we want to locate path-ID sensors on arcs such that the set of variables we directly monitor is a rank-filling set. We have, in this way, characterized the feasible solutions (as sets of arcs) for problem SLAP-tot.

Definition 3. A subset of arcs $A' \subseteq A$ is feasible for SLAP-tot if the corresponding set of variables $\mathcal{Y}_{A'} \subseteq \mathcal{Y}$ is a rank-filling set for system (1).

By Corollary 1, each rank-filling set of variables contains a minimum rank-filling set; therefore, if $A' \subseteq A$ is a feasible set of arcs for SLAP-tot, each $A'' \supseteq A'$ is feasible too, because $\mathcal{Y}_{A''} \supseteq \mathcal{Y}_{A'}$. In the example of figure 2, the subset $A' = \{a_4, a_8\} \subseteq A$ is feasible for SLAP-tot. All the subsets $A'' \supseteq A'$ are also feasible. Since we are looking for the set of minimum size, we can focus only on those feasible subsets of arcs covering a minimum rank-filling set of variables.

We can now state the general model formulation for SLAP-tot, which is effectively a set-covering formulation with additional constraints. Let us define by $C_{p-k}$ the family of subsets of indices of the columns of matrix $B$ that correspond to rank-filling sets of variables for system (1) and whose size is $p - k$, and let $S$ be any element of $C_{p-k}$. Then SLAP-tot can be modelled as follows.
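Lemma 1 gives a test for rank-filling sets that can be compared with Definition 2 on the running example. The sketch below is a minimal illustration under the assumption that the columns of system (2) are as listed in `PATH_ARCS`; the helper names `is_rank_filling` and `has_unique_solution` are hypothetical.

```python
from fractions import Fraction

# Assumption: arcs of each path, read off the columns of system (2).
PATH_ARCS = {
    1: [1, 6, 11], 2: [1, 6, 13], 3: [2, 8, 12], 4: [2, 9, 12],
    5: [3, 8, 10], 6: [3, 9, 10], 7: [4, 5, 7, 11], 8: [4, 5, 7, 13],
}
P = len(PATH_ARCS)
B = [[1 if a in PATH_ARCS[j] else 0 for j in range(1, P + 1)] for a in range(1, 14)]


def rank(rows):
    """Rank of a matrix (list of rows) by Gaussian elimination over the rationals."""
    m = [[Fraction(v) for v in row] for row in rows]
    ncols = len(m[0]) if m else 0
    r = 0
    for c in range(ncols):
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(len(m)):
            if i != r and m[i][c] != 0:
                factor = m[i][c] / m[r][c]
                m[i] = [x - factor * y for x, y in zip(m[i], m[r])]
        r += 1
        if r == len(m):
            break
    return r


def is_rank_filling(H):
    """Lemma 1: H is rank-filling iff |H| >= p - k and the columns outside H are independent."""
    if len(H) < P - rank(B):
        return False
    rest = [j for j in range(P) if j + 1 not in H]
    return rank([[row[j] for j in rest] for row in B]) == len(rest)


def has_unique_solution(H):
    """Definition 2(ii): B stacked with the unit rows U_H has full column rank."""
    unit_rows = [[1 if j == h else 0 for j in range(1, P + 1)] for h in sorted(H)]
    return rank(B + unit_rows) == P


for H in ({3, 5, 7, 8}, {1, 4}, {1, 2}):
    print(sorted(H), is_rank_filling(H), has_unique_solution(H))
# {3,5,7,8} and {1,4} pass both tests; {1,2} fails both.
```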

[SLAP-tot]

$$\min \sum_{i=1}^{m} x_i \tag{7}$$

subject to

$$\sum_{i=1}^{m} b_{ij} x_i \ge 1 \qquad \forall j \in S \tag{8}$$

$$S \in C_{p-k} \tag{9}$$

$$x_i \in \{0, 1\}, \qquad i = 1, \dots, m \tag{10}$$

The objective function (7) is the same as that of SLAP-zero. Constraints (8) and (9) require that each set of path-ID sensors located on the network measure at least one subset of paths containing a minimum rank-filling set. Observe that the definition of the rank-filling set of variables of the system is useful from a model formulation point of view. Indeed, checking whether a set of variables is rank-filling requires a polynomial number of Gaussian eliminations. We will see in Section 4 that the concept of a rank-filling set of variables will be useful for the definition of the polynomial instances of the three different scenarios (zero, total and partial count information) of the problem.
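For a small instance, the SLAP-tot model can be solved by enumerating arc subsets and testing feasibility in the sense of Definition 3. The sketch below does this for the figure 2 example; it is a brute-force illustration under the stated data assumption, not the exact or heuristic algorithm of the paper, and the function names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations

# Assumption: arcs of each path, read off the columns of system (2).
PATH_ARCS = {
    1: [1, 6, 11], 2: [1, 6, 13], 3: [2, 8, 12], 4: [2, 9, 12],
    5: [3, 8, 10], 6: [3, 9, 10], 7: [4, 5, 7, 11], 8: [4, 5, 7, 13],
}
P, M = 8, 13
B = [[1 if a in PATH_ARCS[j] else 0 for j in range(1, P + 1)] for a in range(1, M + 1)]


def rank(rows):
    """Rank of a matrix (list of rows) by Gaussian elimination over the rationals."""
    m = [[Fraction(v) for v in row] for row in rows]
    ncols = len(m[0]) if m else 0
    r = 0
    for c in range(ncols):
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(len(m)):
            if i != r and m[i][c] != 0:
                factor = m[i][c] / m[r][c]
                m[i] = [x - factor * y for x, y in zip(m[i], m[r])]
        r += 1
        if r == len(m):
            break
    return r


def is_feasible_tot(arcs):
    """Definition 3: the paths covered by the arcs form a rank-filling set."""
    covered = sorted(j for j, path in PATH_ARCS.items() if set(path) & set(arcs))
    unit_rows = [[1 if j == h else 0 for j in range(1, P + 1)] for h in covered]
    return rank(B + unit_rows) == P


def min_sensors_tot():
    """Smallest feasible arc set, found by trying subsets of increasing size."""
    for size in range(1, M + 1):
        for arcs in combinations(range(1, M + 1), size):
            if is_feasible_tot(arcs):
                return arcs
    return None


best = min_sensors_tot()
print(len(best), is_feasible_tot((4, 8)), is_feasible_tot((1, 4)))  # 2 True False
```

The search confirms that two sensors are necessary and sufficient for this example, and that the pair of arcs 4 and 8 is feasible while the pair of arcs 1 and 4 is not.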

3.3. Partial count information: SLAP-par

SLAP-par, or simply SLAP, refers to the general problem of locating path-ID sensors when some arc flows are known. The analysis of SLAP-zero in Section 3.1 was useful to introduce the set-covering formulation of such problems, and the analysis of SLAP-tot in Section 3.2 to introduce the concept of minimum rank-filling subsets. We now extend the analysis and integrate the results to study the general SLAP problem.

In case of partial information, recall the system (3) of linear equations, where the dimension of the matrix $B^1$ is $m_1 \times p$, $m_1 \le m$. For example, consider again the graph in figure 2. Let us suppose we know only the total flow volume on the two arcs $a_1$ and $a_{11}$. That is, we have the following system of two linear equations:

$$
\begin{aligned}
(a_1) \qquad & y_1 + y_2 = f_{a_1} \\
(a_{11}) \qquad & y_1 + y_7 = f_{a_{11}}
\end{aligned}
\tag{11}
$$

The matrix of system (11) has rank 2 and the number of variables is 8 (variables $y_1, y_2, y_7$ have coefficient equal to 1 in (11); variables $y_3, y_4, y_5, y_6, y_8$ have coefficient equal to zero). Then, as in the case of SLAP-tot, to determine all the path flows, we must monitor a subset of paths and derive the other volumes from the arc flow dependencies (11) of the system. Also in this case, we need to characterize a set of paths which, if directly monitored, allows us to determine all the other path flows.

In this example we have to monitor directly at least $p - r(B^1) = 8 - 2 = 6$ paths to determine all the other path volumes. As in SLAP-tot, the sets of variables to be measured for SLAP are the rank-filling sets related to the system of linear equations (system (11) for the illustrative example or system (3) for the general case). We can obtain a model formulation for SLAP which is identical to the formulation obtained for SLAP-tot. However, the rank-filling sets for the case of partial count information have a defined structure, and it is possible to give an alternative model formulation that better characterizes the structure of the rank-filling sets of variables and includes the SLAP-zero and SLAP-tot formulations as special cases.

In the example of figure 2, it is easy to see that for system (11) there are only three rank-filling sets of minimum size $p - r(B^1) = 6$. These sets contain the five variables $y_i$, $i = 3, 4, 5, 6, 8$, and they differ only with respect to the remaining variable, chosen among $y_1, y_2, y_7$. That is, they contain all the variables having zero coefficients in equations (11) and one out of the three other variables, which have coefficient 1 in (11). This is a common structure of minimum rank-filling sets in case of partial information. We state these observations in Lemma 2 to follow.

Given the system of equations (3), we partition the set $\mathcal{Y}$ of variables into two subsets $Y^0$ and $Y^1$ representing, respectively, the variables with zero coefficients in the system and the variables with non-zero coefficients in the system. Consider the reduced system:

$$\bar{B} \bar{y} = f^1 \tag{12}$$

obtained from (3) by eliminating all the zero columns (that is, those columns corresponding to the variables in $Y^0$). Clearly, $r(B^1) = r(\bar{B})$, and Lemma 2 follows.

Lemma 2. Each minimum rank-filling set for system (3) is obtained by a minimum rank-filling set of system (12) plus the set $Y^0$ of variables.

Corollary 2. A subset of arcs $A' \subseteq A$ is a feasible solution for SLAP-par if and only if the corresponding set of variables $\mathcal{Y}_{A'}$ contains a minimum rank-filling set of system (12) and the set $Y^0$ of variables.

Now we are ready to give the alternative formulation for the SLAP problem. Let $C_0$ be the set of indices of columns of $B$ corresponding to variables in $Y^0$, let $C_1$ be the family of subsets of indices of the columns of matrix $B$ corresponding to minimum rank-filling sets of variables for the reduced system (12), and let $S$ be any element of $C_1$. We can now formulate SLAP as follows:

[SLAP]

$$\min \sum_{i=1}^{m} x_i \tag{13}$$

subject to

$$\sum_{i=1}^{m} b_{ij} x_i \ge 1 \qquad \forall j \in S \tag{14}$$

$$S \in C_1 \tag{15}$$

$$\sum_{i=1}^{m} b_{ij} x_i \ge 1 \qquad \forall j \in C_0 \tag{16}$$

$$x_i \in \{0, 1\}, \qquad i = 1, \dots, m \tag{17}$$

The objective function (13) minimizes the number of path-ID sensors to locate on the arcs of the network, and it is the same as for SLAP-zero and SLAP-tot. Constraints (14), (15) and (16) together require that each set of path-ID sensors located on the network measure a subset of paths containing a minimum rank-filling set for system (3). More specifically, constraints (14) and (15) require that each set of path-ID sensors located on the network measure a subset of paths containing a minimum rank-filling set for the reduced system (12), and constraints (16) require that the path-ID sensors measure all the paths in Y^0. Observe that this formulation contains as special cases the formulations SLAP-zero and SLAP-tot. Indeed, if Y^0 = ∅ and Y^1 = Y, then we have SLAP-tot. On the other hand, if Y^0 = Y and Y^1 = ∅, we have the SLAP-zero formulation. The next section analyzes the special cases for the three scenarios when we assume that on each arc there are exactly two paths. The particular structure of the rank-filling sets will help us define a polynomial algorithm to solve these particular instances.

4. Polynomially solvable cases: Exactly two paths on each arc

The assumption for these cases is that on each arc a ∈ A there are exactly two paths and, therefore, the arc/path incidence matrix B has exactly two non-zero elements in each row. These cases, which are idealistic, were selected because (a) they are possible in a sparse network, where each arc carries trucks or buses equipped for active sensors that belong to one or two paths, and (b), more importantly, they allow us to investigate whether the SLAP problems are always NP-hard or whether polynomial cases exist. In what follows, we will show that the three special cases are equivalent to solving the Generalized Vertex Covering by Edges Problem (GVCE). We will show that the latter is polynomially solvable and thus our problems are too. From the hypothesis that each arc a ∈ A has two paths associated with it, we have that

$$\sum_{j=1}^{p} b_{ij} = 2, \quad \forall i = 1, \dots, m.$$

This property allows us to work on a related graph: the intersection graph of the paths, referred to as the path intersection graph in the sequel. For clarity of presentation, the path intersection graph G will be referred to in terms of vertices and edges, whereas the underlying traffic network R in terms of nodes and arcs. Let G = (V, E) be a graph such that each vertex v_i ∈ V corresponds to the path Y_i ∈ Y and there is an edge between vertex v_i ∈ V and vertex v_j ∈ V if the corresponding paths Y_i, Y_j have a common arc. We will denote an edge of G by e(i, j) to point out the fact



Figure 3. A network of 10 nodes and 19 arcs with exactly two paths on each arc. The path intersection graph for this network is shown in figure 4.

Figure 4. The path intersection graph for the network of figure 3. Each vertex corresponds to a path, and there is an edge between two vertices if the corresponding paths have a common arc. Observe that two of the three components, {v1 , v2 , v7 , v8 } and {v3 , v4 , v5 , v6 }, are bipartite.

that e(i, j) = (v_i, v_j). Thus, |V| = |Y| = p and |E| ≤ |A| = m. Figure 4 shows the path intersection graph G related to the network R of figure 3, whose set of paths satisfies the assumption. In the sequel we will show that the three problems are easily redefined on the path intersection graph. Once we know the optimal choice of edges on the path intersection graph G, it is an immediate step to find the corresponding optimal set of arcs on the network R.

4.1. SLAP-zero with two paths for each arc: SLAP-zero(2)

We show now that an optimum set for SLAP-zero(2) can be obtained by solving the Vertex Covering by Edges Problem (VCE) on the path intersection graph G. First we



recall the decision version of VCE and then we prove the correspondence between the optimal solutions of the two problems.

The Vertex Covering by Edges problem (VCE)
Instance: A graph G = (V, E) and a positive integer K ≤ |V|.
Question: Is there a subset E′ ⊆ E of edges of size K or less that covers the set of vertices V, that is, where each vertex in V is incident to at least one edge in E′?

An optimum covering set for VCE is a feasible set of minimum size.

Theorem 1. A subset A′ ⊆ A of arcs is an optimum set for SLAP-zero(2) if and only if A′ corresponds to an optimum covering set E′ on the path intersection graph.

Proof. Let R = (N, A) be a network and Y = {Y_1, Y_2, ..., Y_p} be the set of paths defined on R. Let G = (V, E) be the intersection graph of the set of paths Y and E′ = {e_1, e_2, ..., e_s} be an optimum covering set of G. By feasibility, {v_1, v_2, ..., v_p} = V is the set of vertices covered by E′ and this set, by construction, corresponds to the entire set of paths Y. We now build from E′ a feasible set of arcs A′ for SLAP-zero(2) on R of the same cardinality. For each edge (v_i, v_j) ∈ E′ select any arc a ∈ A such that a ∈ Y_i and a ∈ Y_j. Let the selected arcs form the set A′. The set A′ has the same size as E′ and is feasible for SLAP-zero(2). Indeed, if there were a path Y_k ∉ Y_{A′}, then the solution E′ would not cover the corresponding vertex v_k, which is a contradiction. Thus, the optimum covering E′ corresponds to an optimum set for SLAP-zero(2). Conversely, let A′ = {a_1, a_2, ..., a_s} be an optimum set for SLAP-zero(2) on R, that is, Y_{A′} = Y. For each a ∈ A′, let {Y_i, Y_j} be the two paths covered by a, and select the edge e_ij = (v_i, v_j) ∈ E to obtain E′. This set E′ is such that |E′| = |A′| and it covers the entire set of vertices V, and thus is an optimum covering set for G.

The VCE problem for a graph G is polynomially solvable (see, for example, Proposition 4.5, p. 639 in Nemhauser and Wolsey (1988)).
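To make the covering problem concrete, the sketch below solves a tiny VCE instance by brute force and maps the chosen edges back to arcs in the way the proof of Theorem 1 describes. It is an illustration only: the enumeration is exponential and is not the polynomial method cited above, and the instance `edge_to_arcs` is a hypothetical example, not the network of figure 3.

```python
from itertools import combinations

def min_edge_cover(vertices, edges):
    """Return a smallest set of edges touching every vertex (brute force)."""
    target = set(vertices)
    for size in range(1, len(edges) + 1):
        for subset in combinations(edges, size):
            if {v for e in subset for v in e} == target:
                return list(subset)
    return None  # no cover exists, e.g. some vertex is isolated

# Hypothetical instance: edge (i, j) of G -> arcs of R carrying paths Y_i and Y_j.
edge_to_arcs = {(1, 2): ["a1", "a4"], (2, 3): ["a2"], (3, 4): ["a3"]}

vertices = {v for e in edge_to_arcs for v in e}
cover = min_edge_cover(vertices, sorted(edge_to_arcs))
# Any arc shared by the two paths of a chosen edge can host the sensor.
sensor_arcs = [edge_to_arcs[e][0] for e in cover]
print(cover, sensor_arcs)
```

In this toy instance two edges suffice to touch all four vertices, so two path-ID sensors observe all four paths.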
Thus, Theorem 1 defines a way to solve SLAP-zero(2) under the assumption that there are exactly two paths on each arc of the network.

4.2. SLAP-tot with two paths on each arc: SLAP-tot(2)

In this section we define some properties of the rank-filling sets of variables. We show the relation between the rank-filling sets and the connected components of the path intersection graph G. We will see that the optimal solution value for SLAP-tot(2) is closely related to the number of bipartite connected components of the path intersection graph and that an optimum choice of arcs can be easily obtained. We recall some additional terminology and some simple properties of undirected graphs. Let G = (V, E) be an undirected graph with |V| = n vertices and |E| = m edges. Let Z be the edge/vertex incidence matrix of G. A forest is an acyclic partial subgraph of G. A forest is spanning if it spans the vertex set of G. A tree is a connected forest. A



Figure 5. Example of L-forest (a) and L-odd sets (b).

spanning tree is a connected spanning forest. A spanning forest F = (V, E_T) is maximal if, by adding any edge e ∈ E\E_T, the subgraph F ∪ {e} contains a cycle. An L-forest of G is the graph obtained from a forest by adding at most one edge e to each component of the forest, such that the cycle containing e is odd (see figure 5(a)). Two vertex-disjoint odd cycles together with a path between them, or a pair of odd cycles with a common vertex, form an L-odd set (see figure 5(b)). A maximal L-forest F is a spanning L-forest such that, by adding any edge e ∈ E\E_T, the corresponding subgraph F ∪ {e} contains an L-odd set or an even cycle.

Lemma 3. If G is connected then n − 1 ≤ rank(Z) ≤ n.

Proof. Matrix Z is an m × n matrix. Thus rank(Z) ≤ n. A spanning tree of G corresponds to a set of n − 1 linearly independent rows of Z (see, for example, Proposition 6.1, p. 76 in Nemhauser and Wolsey (1988)) and thus n − 1 ≤ rank(Z).

Theorem 2. Let G be a connected graph; then rank(Z) = n if and only if G contains an odd cycle.

Proof. (⇒) Consider a set of n linearly independent rows of Z. These rows correspond to a set of edges of G and, since they are linearly independent, they are distinct, and thus define a subgraph G′ with n vertices and n distinct edges. Therefore, G′ contains a cycle. Let C be such a cycle. Consider the submatrix Z_C corresponding to this cycle. If the cycle contains an even number of edges, then the determinant of matrix Z_C is zero and the corresponding rows are linearly dependent, contradicting the hypothesis. Thus, C contains an odd number of edges.
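As a small worked check of the determinant argument (an illustration added here, assuming the vertices and edges of the cycle are listed in consecutive order), the incidence matrix of a cycle with n_c edges has the form I + P, where P is a cyclic permutation matrix, so its determinant is 1 − (−1)^{n_c}: non-zero for an odd cycle and zero for an even one. For a triangle and for a cycle of length four:

```latex
\det\begin{pmatrix} 1 & 1 & 0\\ 0 & 1 & 1\\ 1 & 0 & 1 \end{pmatrix} = 2,
\qquad
\det\begin{pmatrix} 1 & 1 & 0 & 0\\ 0 & 1 & 1 & 0\\ 0 & 0 & 1 & 1\\ 1 & 0 & 0 & 1 \end{pmatrix} = 0 .
```

In the second matrix the dependence is explicit: the first row minus the second, plus the third, minus the fourth is the zero vector.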



(⇐) Conversely, consider a maximal L-forest of G. Since G is connected, a maximal L-forest F is obtained from a spanning tree T by adding a single edge e that introduces an odd cycle. Let C be such a cycle and let n_c be the number of its edges. We want to prove that the n rows corresponding to the edges of the maximal L-forest are linearly independent. To do that, consider the system of linear equations defined on the edge set of F. This is a system with n variables (corresponding to the vertex set) and n equations. We will show that this system has a unique solution, and thus the rows of the corresponding matrix are linearly independent. Consider the subsystem formed by the equations associated with the odd cycle of the L-forest. This is a system with n_c equations and n_c variables. The corresponding submatrix is non-singular (because the cycle is odd) and thus this subsystem has a unique solution; that is, the values of the variables corresponding to the vertices of the cycle can be obtained. Consider now the subgraph G′ obtained from F by deleting the edges of the cycle C. G′ has at most n_c connected acyclic components G′_i, and each one contains at least one vertex of the cycle. Let n_i be the number of vertices of component G′_i. Each component G′_i is associated with a subsystem in n_i variables and n_i − 1 linearly independent equations. Thus, fixing the value of one variable of the subsystem, specifically that corresponding to a vertex in C, a solution for the subsystem can be obtained. Therefore, each subsystem has a unique solution. Since the subsystems are independent from each other (because they correspond to different components of the graph G′), a unique solution to the original system is obtained.

From Theorem 2 the corollary below directly follows.

Corollary 3. Let G be a connected graph; then rank(Z) = n − 1 if and only if G is bipartite.

Lemma 4.
Let G be a graph with p connected components; then rank(Z) = n − h, where h ≤ p is the number of its bipartite components.

Proof. Columns and rows of matrix Z can be rearranged to obtain the block diagonal structure

$$
Z = \begin{pmatrix}
Z_1 & 0 & \cdots & 0\\
0 & Z_2 & \cdots & 0\\
\vdots & \vdots & \ddots & \vdots\\
0 & 0 & \cdots & Z_p
\end{pmatrix}
$$

where Z_i is the edge/vertex incidence matrix of component G_i. Therefore,

$$\operatorname{rank}(Z) = \sum_{i=1}^{p} \operatorname{rank}(Z_i).$$

If Z_i corresponds to a bipartite component G_i with n_i vertices then, by Corollary 3, rank(Z_i) = n_i − 1. If Z_i corresponds to a non-bipartite component G_i then, by Theorem 2, rank(Z_i) = n_i. Hence, the lemma follows.

Let us now consider the set L of all the forests and L-forests of G, that is, L = {F : F is a forest or an L-forest of G}.
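Returning briefly to Lemma 4, the identity rank(Z) = n − h can be checked numerically on small graphs. The following is a minimal sketch under stated assumptions: vertices are numbered from 0, the function names `incidence_rank` and `bipartite_components` are illustrative, and an isolated vertex is counted as a (degenerate) bipartite component.

```python
from fractions import Fraction

def incidence_rank(n, edges):
    """Rank of the edge/vertex incidence matrix Z, by exact Gaussian elimination."""
    rows = []
    for u, v in edges:
        row = [Fraction(0)] * n
        row[u] += 1
        row[v] += 1
        rows.append(row)
    rank = 0
    for col in range(n):
        pivot = next((r for r in range(rank, len(rows)) if rows[r][col] != 0), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        for r in range(len(rows)):
            if r != rank and rows[r][col] != 0:
                factor = rows[r][col] / rows[rank][col]
                rows[r] = [a - factor * b for a, b in zip(rows[r], rows[rank])]
        rank += 1
    return rank

def bipartite_components(n, edges):
    """Number of connected components that admit a proper 2-colouring."""
    adj = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    colour = [None] * n
    count = 0
    for start in range(n):
        if colour[start] is not None:
            continue
        colour[start] = 0
        stack, ok = [start], True
        while stack:
            u = stack.pop()
            for v in adj[u]:
                if colour[v] is None:
                    colour[v] = 1 - colour[u]
                    stack.append(v)
                elif colour[v] == colour[u]:
                    ok = False  # an odd cycle lies in this component
        count += ok
    return count

# Hypothetical graph: a triangle, a path on three vertices, and an isolated vertex.
n, edges = 7, [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5)]
print(incidence_rank(n, edges), n - bipartite_components(n, edges))
```

Both printed numbers agree for this graph: the triangle contributes full rank 3, while the path and the isolated vertex each lose one unit of rank.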



Theorem 3. Each set of linearly independent rows of Z has a one-to-one correspondence to an element of L.

Proof. (⇒) Consider a set of k linearly independent rows of Z. Let E′ be the corresponding set of edges on the graph. If the subgraph G′ induced by E′ is connected, then it can have k or k + 1 vertices. If G′ has k + 1 vertices then it is acyclic; therefore it is a tree and thus a forest. If G′ has k vertices then, by Theorem 2, it has an odd cycle, and thus it is an L-forest. If G′ is not connected then, applying the same reasoning to each of the connected components, the claim follows. (⇐) Conversely, let F be an element of L. If F is connected then it is either acyclic or contains an odd cycle. If F is acyclic then the rows corresponding to its edges are linearly independent. If F contains an odd cycle then, by Theorem 2, the corresponding rows are linearly independent. If F is not connected then, applying the same reasoning to each component, the claim follows.

Thus, each maximal set of linearly independent rows of Z corresponds to a maximal L-forest or a maximal forest of G. An extension and formalization of Theorem 3 in a matroid context is given in Conforti and Rao (1987). Theorem 3 also holds when G has multiple edges, because a multiple edge introduces an even cycle and does not change the structure of the odd cycles of the graph. Now we return to our problem and show the relationship between the arc/path incidence matrix B and the edge/vertex incidence matrix Z of the path intersection graph G.

Lemma 5. The edge/vertex incidence matrix Z has the same rank as the arc/path matrix B.

Proof. We prove this by showing that each row of Z corresponds to a subset of identical rows of matrix B and, vice versa, that each row of matrix B corresponds to a row of Z. Recall that each row B_i of B corresponds to one edge of G. In particular, if row B_i is such that b_ik = 1 and b_iq = 1, then there exists in G the edge e(k, q) = (v_k, v_q). Vice versa, consider an edge e(i, j) = (v_i, v_j) ∈ E; then there exists in B a set of identical rows with b_ki = 1 and b_kj = 1. Since by definition there is a one-to-one correspondence between each row Z_k of Z and each edge of G, the claim follows.

Observation 1. By the proof of Lemma 5, it is evident that matrix Z is obtained from B by eliminating the repeated identical rows B_{i_1}, B_{i_2}, ..., B_{i_t}. Thus, the system

$$Z y = f' \tag{18}$$

where f′ is obtained from f by eliminating the elements f_{i_1}, f_{i_2}, ..., f_{i_t}, is equivalent to system (1).
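The reduction in Observation 1 can be sketched in a few lines. This is an illustration under stated assumptions: the matrix `B`, the flow vector `f` and the function name `reduce_system` are hypothetical, rows are 0/1 lists with exactly two ones, and identical rows are assumed to carry identical arc flows (as they must in a consistent system).

```python
def reduce_system(B, f):
    """Drop repeated rows of B (and the matching entries of f) to obtain Z and f'."""
    Z, f_reduced, seen = [], [], set()
    for row, flow in zip(B, f):
        if sum(row) != 2:
            raise ValueError("each arc must carry exactly two paths")
        key = tuple(row)
        if key not in seen:  # keep only the first copy of each distinct row
            seen.add(key)
            Z.append(list(row))
            f_reduced.append(flow)
    return Z, f_reduced

# Hypothetical data: arcs 0 and 2 carry the same two paths, so their rows coincide.
B = [[1, 1, 0],
     [0, 1, 1],
     [1, 1, 0],
     [1, 0, 1]]
f = [30, 50, 30, 40]

Z, f_reduced = reduce_system(B, f)
# Each remaining row is one edge of the path intersection graph.
edges = [tuple(j for j, b in enumerate(row) if b) for row in Z]
print(Z, f_reduced, edges)
```

Each surviving row of Z is read off as an edge of G, which is the correspondence used in the proof of Lemma 5.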



We know, by Lemma 4, that the rank of the edge/vertex incidence matrix of an undirected graph is equal to the number of vertices of the graph minus the number of its bipartite connected components; the consequence with respect to B is stated in the following lemma. Lemma 6. The rank of the arc/path matrix B is equal to the number of paths minus the number of bipartite connected components of the path intersection graph. Let us now analyze SLAP-tot(2) on the path intersection graph. Let h = p − k be the number of bipartite connected components of G, where p is the number of the paths and k is the rank of matrix Z . We recall that a feasible solution to SLAP-tot(2) is a subset of arcs in R covering a rank-filling set of h variables. All the rank-filling sets of variables are easily defined using the path intersection graph G. The following result is a direct consequence of Lemmas 5 and 6. Lemma 7. Every minimum rank-filling set of variables for system (1) is composed of h elements where h is the number of bipartite connected components of G. When the path intersection graph is connected we have the following result. Lemma 8. If the path intersection graph G is connected then the optimal solution value of SLAP-tot(2) is at most 1. Proof. By Lemma 3, the rank of matrix B can be either (i) rank(B) = p or (ii) rank(B) = p − 1. Therefore, if rank(B) = p then a unique solution to system (1) is already achieved and no path-ID sensors need to be located. On the other hand, if rank(B) = p − 1, then only one variable has to be directly measured, that is, each minimum rank-filling set has size 1. Then, every arc a ∈ A covers a minimum rank-filling set. Theorem 4. If the path intersection graph G has h bipartite connected components, then each set H = {Yi1 , Yi2 , . . . , Yih } of h variables is a minimum rank-filling set of variables for system (1) if and only if the corresponding vertices {vi1 , vi2 , . . . , vih } belong to the h different bipartite connected components of G. 
Proof. By Observation 1 we can consider system (18) and its coefficient matrix Z. We can define the block diagonal structure of matrix Z

$$
Z = \begin{pmatrix}
Z_1 & 0 & \cdots & 0\\
0 & Z_2 & \cdots & 0\\
\vdots & \vdots & \ddots & \vdots\\
0 & 0 & \cdots & Z_s
\end{pmatrix}
$$



where each submatrix Z_i is the edge/vertex incidence matrix of a connected component of G. Thus, a solution to system (18) (and hence to system (1)) can be obtained by solving separately the s subsystems Z_i y = f_i. Each subsystem's variables correspond to the vertex set of the associated component, and its equations correspond to the component's edge set. Each component G_i with an odd cycle (that is, each non-bipartite component) corresponds to a subsystem Z_i y = f_i that has a unique solution (by Theorem 2). For each subsystem that represents a bipartite component G_i with n_i vertices, for which rank(Z_i) = n_i − 1 (recall Corollary 3), a unique solution can be obtained by fixing the value of any one variable of the subsystem. Thus, system (18) (and hence system (1)) has a unique solution ⇔ each subsystem has a unique solution ⇔ each subsystem corresponding to a bipartite component has a unique solution. Therefore, it suffices to fix the values of h variables whose corresponding vertices belong to the h different bipartite components of G.

Necessary and sufficient conditions for a subset of arcs A′ ⊆ A to be feasible for SLAP-tot(2) follow from Theorem 4. Let G_1 = (V_1, E_1), G_2 = (V_2, E_2), ..., G_h = (V_h, E_h) be the bipartite connected components of the intersection graph, where V_i and E_i, i = 1, 2, ..., h, are the vertex and the edge sets, respectively, of the component G_i.

Corollary 4. Given the path intersection graph G = (V, E), a subset A′ ⊆ A is feasible for SLAP-tot(2) if and only if A′ corresponds to a subset of edges E′ ⊆ E such that |V_{E′} ∩ V_i| ≥ 1, i = 1, 2, ..., h, where V_{E′} ⊆ V is the subset of vertices spanned by E′.

Proof. Given A′ feasible, for each a ∈ A′ let {Y_i, Y_j} be the two paths covered by a and select the edge e(i, j) = (v_i, v_j) ∈ E to obtain E′. If there exists a V_k such that V_{E′} ∩ V_k = ∅, then none of the variables corresponding to the vertices v ∈ V_k is covered by A′. But A′ is feasible and thus, by Theorem 4, this results in a contradiction. This proves the "only if" part. By similar reasoning the "if" part of the proof holds.

The lemma below, whose proof is omitted, follows directly.

Lemma 9. Let G_1 = (V_1, E_1) and G_2 = (V_2, E_2) be any two connected components of G. There does not exist any arc a ∈ A of the original network R whose two paths Y_i and Y_j satisfy v_i ∈ V_1 and v_j ∈ V_2.

Let us denote by z* the optimal solution value of SLAP-tot(2).

Lemma 10. If the path intersection graph G has h bipartite connected components, then z* = h.

Proof. Since an edge of G corresponds to an arc of R = (N, A), by choosing one edge from each bipartite component of G we obtain a subset E′ of size h that corresponds



to a subset A′ of the same size that is feasible for SLAP-tot(2), that is, z* ≤ h. Conversely, if A′ is feasible, then the corresponding set E′ covers all the bipartite components of G, i.e. |V_{E′} ∩ V_i| ≥ 1, i = 1, 2, ..., h. By Lemma 9, no edge has its endpoints in two different components, and therefore no edge covers two components. Thus, at least h edges are required to cover all the bipartite components, that is, h ≤ z*. Hence z* = h.

Corollary 4 and Lemmas 9 and 10 result in the following theorem.

Theorem 5. The optimal solution of SLAP-tot(2) is given by choosing h arcs {a_1, a_2, ..., a_h} ⊆ A such that: (i) h is the number of bipartite connected components of the path intersection graph G; (ii) each arc a_i, i = 1, 2, ..., h, corresponds to an edge e_i ∈ E of G that belongs to the bipartite connected component G_i.

Therefore, once we obtain the path intersection graph, we can easily determine a rank-filling set of variables that corresponds to an optimal set of arcs for SLAP-tot(2). For the path intersection graph in figure 4, the optimal solution value of the problem is equal to h = 2, the number of bipartite connected components of the graph G. A rank-filling set of variables of minimum size contains two variables corresponding to two vertices, one from each of the two bipartite connected components of G. Thus, for example, {Y_1, Y_2} is not a rank-filling set, but {Y_1, Y_4} is. Since, by Lemma 9, we cannot find a single edge that covers a minimum rank-filling set, the optimal choice is to take one edge e_i from each bipartite connected component G_i and to select in A the corresponding arcs. From the graph of figure 4, an optimum subset of edges in G comprises, for example, edges (v_1, v_2) in E_1 and (v_3, v_4) in E_2, which correspond to arcs a_1 and a_2 in the original network R. Observe that a_13 also covers vertices v_3 and v_4. Observing the network R in figure 3, we see that paths Y_3 and Y_4 both use arcs a_2 and a_13, and locating a path-ID sensor on either of them gives the same information.

4.3. SLAP-par with two paths on each arc: SLAP-par(2)

In this section we show that the SLAP-par(2) problem, or simply SLAP(2), is polynomially solvable when there are exactly two paths on each arc of the network. The analysis in Section 4.1 showed the equivalence of SLAP-zero(2) and the polynomially solvable VCE problem on a graph. The analysis in Section 4.2 related the solution of SLAP-tot(2) to the number of bipartite components in the path intersection graph G and to selecting an edge in each such component, resulting in a polynomial algorithm. We now address the special case of the general problem, SLAP(2), and utilize the results of Sections 4.1 and 4.2. We will define a new problem on the path intersection graph and refer to it as the Generalized Vertex Covering by Edges Problem (GVCE). We will show that an



Figure 6. The subgraph G 1 of the path intersection graph (for the network of figure 3) in case of partial information, when total volumes on arcs a1 , a5 , a8 , a12 , a14 are known. This subgraph has 7 bipartite connected components: {v1 , v2 }, {v7 , v8 }, {v3 , v5 }, {v4 }, {v6 }, {v10 , v11 , v12 }, {v9 }.

optimum solution to SLAP(2) corresponds to an optimum set for GVCE. Then we will prove that GVCE is equivalent to the VCE problem and thus is polynomially solvable. Consider the network of figure 3 and suppose we know the flows on a subset of the arcs, say a_1, a_5, a_8, a_12, a_14. Matrix B^1 of system (3) in this case is:

$$
B^1 =
\begin{array}{c|cccccccccccc}
 & y_1 & y_2 & y_3 & y_4 & y_5 & y_6 & y_7 & y_8 & y_9 & y_{10} & y_{11} & y_{12}\\
\hline
a_1 & 1 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0\\
a_5 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 1 & 0 & 0 & 0 & 0\\
a_8 & 0 & 0 & 1 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0\\
a_{12} & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 1 & 0\\
a_{14} & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 1
\end{array}
$$

The rank of this matrix is k = 5, the number of flow variables is p = 12 and, therefore, a minimum rank-filling set of variables has size h = 12 − 5 = 7. Consider the subgraph G^1 of the path intersection graph G obtained with only the edges corresponding to arcs in the set A^1 = {a_1, a_5, a_8, a_12, a_14} (see figure 6). Note that in this case there are three isolated vertices corresponding to the three paths Y_4, Y_6 and Y_9, which are not included in the subsystem defined by G^1. The edge/vertex incidence matrix of G^1 has the same rank as matrix B^1, and all the properties of the connected components proven for the path intersection graph G with total count information still hold for G^1. If we define a degenerate component consisting of a single vertex to be a bipartite component of G^1, then, using the results of the previous section and Lemma 2, we have the following corollary.

Corollary 5. Let G^1 = (V^1, E^1) be the subgraph of the path intersection graph related to matrix B^1. A subset H = {y_{i_1}, y_{i_2}, ..., y_{i_h}} is a minimum rank-filling set of variables for system (3) if and only if, up to a relabelling of the components, v_{i_k} ∈ V^1_k, k = 1, 2, ..., h, where G^1_1, G^1_2, ..., G^1_h are all of the bipartite components of G^1 and V^1_i is the vertex set of the bipartite connected component G^1_i.

That is, as in the case of SLAP-tot(2), we can easily characterize all the minimum rank-filling sets for SLAP(2). In figure 6, the graph G^1 has seven bipartite connected components. Each minimum rank-filling set for system (3) is obtained by choosing seven variables corresponding to seven vertices of G^1 that belong to the different bipartite connected components. For example, the set H = {Y_1, Y_3, Y_4, Y_6, Y_7, Y_9, Y_12} is a minimum rank-filling set because the corresponding vertices v_1, v_3, v_4, v_6, v_7, v_9 and v_12 belong to different bipartite connected components of G^1. However, we cannot derive the same result as stated by Theorem 5 because Lemma 9 does not hold in this case. Indeed, since G^1 is a subgraph of G, there may be some edges in E, say e(i, j), such that v_i and v_j belong to different connected components of G^1 (for example the edge (v_3, v_4) or (v_4, v_6)). That is, choosing 7 arcs in A corresponding to 7 edges of G^1 that belong to different bipartite connected components, as stated by Theorem 5, may not give the optimal solution to SLAP(2). Indeed, for the network of figure 3, choosing only four arcs, say a_8, a_9, a_11 and a_16, can cover the set of paths H = {Y_1, Y_3, Y_4, Y_6, Y_7, Y_9, Y_12}, which is a minimum rank-filling set for this example.

We now define the Generalized Vertex Covering by Edges Problem. First, we will show that each optimum solution of SLAP(2) corresponds to an optimum solution of GVCE; then we will prove GVCE to be polynomially solvable (GVCE also belongs to the more general class of the Generalized Subgraph Problems as defined in Feremans (2001)).

Generalized Vertex Covering by Edges Problem (GVCE)
Instance: A graph G = (V, E), a family U = {U_1, U_2, ..., U_h} of disjoint subsets of the vertices and a positive integer K ≤ |V|.
Question.
Is there a subset E′ ⊆ E of size K or less that covers the sets of the family U? That is, does an E′ ⊆ E exist with |E′| ≤ K such that for each subset U_i there exists at least one vertex u_i ∈ U_i and a vertex v ∈ V for which (v, u_i) ∈ E′?

A subset E′ of edges of G is feasible for GVCE if at least one vertex in each set U_i is incident to one or more edges in E′. For example, for the graph in figure 6, we define the family of disjoint subsets U = {{v_1, v_2}, {v_3, v_5}, {v_4}, {v_6}, {v_7, v_8}, {v_9}, {v_10, v_11, v_12}}. The subset of edges E′ = {(v_4, v_6), (v_1, v_7), (v_3, v_4), (v_9, v_11)} is feasible for GVCE; indeed, it covers at least one vertex from each of the sets in the family.

Let G be the intersection graph of the set of paths Y. Consider the spanning subgraph G^1 = (V, E^1) of G obtained by considering those edges of G corresponding to the set of arcs A^1, and let G^1_i = (V_i, E^1_i), i = 1, 2, ..., h, be the bipartite connected components of G^1. Let U = {U_1, U_2, ..., U_h} be the family of disjoint subsets of the vertex set V such that U_i = V_i.
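The feasibility condition for GVCE can be checked mechanically on the example just given. The sketch below is illustrative only: vertices are written as integers, `known_edges` lists just those edges of G that the text mentions explicitly (figure 4 may contain more), and the brute-force search in `min_gvce` is an exponential stand-in, not the polynomial method established later in this section.

```python
from itertools import combinations

def covers_family(edge_subset, family):
    """True if every set U_i contains an endpoint of at least one chosen edge."""
    touched = {v for e in edge_subset for v in e}
    return all(touched & U for U in family)

def min_gvce(edges, family):
    """Smallest edge subset covering the family, by exhaustive search."""
    for size in range(len(edges) + 1):
        for subset in combinations(edges, size):
            if covers_family(subset, family):
                return list(subset)
    return None  # the family cannot be covered with these edges

# Family U for the graph of figure 6 and the feasible set E' quoted in the text.
family = [{1, 2}, {3, 5}, {4}, {6}, {7, 8}, {9}, {10, 11, 12}]
E_prime = [(4, 6), (1, 7), (3, 4), (9, 11)]

# Assumption: only the edges named in the text, not the full edge set of figure 4.
known_edges = [(1, 2), (7, 8), (3, 5), (10, 11), (11, 12),
               (3, 4), (4, 6), (1, 7), (9, 11)]

print(covers_family(E_prime, family), len(min_gvce(known_edges, family)))
```

Because the sets of the family are disjoint, one edge can touch at most two of them, so no fewer than four edges can cover seven sets; the four-edge set E′ therefore attains this bound.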



Lemma 11. A′ ⊆ A is feasible for SLAP(2) if and only if A′ corresponds to a set E′ ⊆ E of the same size that is feasible for the GVCE on the path intersection graph with the family U.

Proof. Let E′ = {e_1, e_2, ..., e_s} be a feasible set of edges for the GVCE on G and the family U. We now build from E′ a feasible set of arcs A′ for SLAP(2) on R of the same cardinality. For each edge (v_i, v_j) ∈ E′ select any arc a ∈ A such that a ∈ Y_i and a ∈ Y_j. The resulting set A′ is the same size as E′ and, by Corollary 5, it is feasible for SLAP(2). Conversely, let A′ = {a_1, a_2, ..., a_s} be a feasible set for SLAP(2) on R. We now build a feasible set for GVCE of the same size. For each a ∈ A′, let {Y_i, Y_j} be the two paths covered by a, and select the edge e_ij = (v_i, v_j) ∈ E. Let the selected edges form the set E′. We have that |E′| = |A′| and, by Corollary 5, E′ covers the family U, and thus is feasible.

Theorem 6. A subset A′ ⊆ A is optimum for SLAP(2) if and only if A′ corresponds to a set E′ ⊆ E that is optimum for the GVCE on the path intersection graph with the bipartite components of G^1 forming its family U.

Proof. Let A′ be an optimum set for SLAP(2). The corresponding set E′ of the same size is also optimum for GVCE, because otherwise there would exist a feasible set E* for GVCE with |E*| < |E′| and a corresponding feasible set A* with |A*| < |A′|, contradicting the optimality of A′. The same reasoning applies to the "if" (sufficiency) part of the proof.

We now prove that any instance of GVCE can be reduced to an instance of VCE and thus GVCE is polynomially solvable.

Theorem 7. The GVCE problem is polynomially solvable.

Proof. Let G = (V, E), a family U = {U_1, U_2, ..., U_h} of disjoint subsets of vertices and a positive integer K ≤ |V| be any instance of GVCE. We build a graph G′ = (Z, X) and consider a positive integer K′ such that G has a subset E′ ⊆ E of size K or less that covers the family U if and only if G′ has a subset X′ ⊆ X of size K′ or less that covers the set of vertices Z. The graph G′ is constructed by considering a vertex set Z of size h such that each vertex z_i ∈ Z corresponds to the set U_i of the family U, and an edge set X such that: (i) there is an edge (z_i, z_j) ∈ X if there exists at least one edge (u, v) ∈ E such that u ∈ U_i and v ∈ U_j; and (ii) there is a loop-edge (z_i, z_i) ∈ X if either there exists at least one edge (u, v) ∈ E such that u ∈ U_i and v ∈ U_i, or there exists an edge (u, v) ∈ E such that u ∈ U_i and v ∉ U_k for each k = 1, 2, ..., h (see the illustration in figure 7). This construction is done in polynomial time. We now have to show that G has a subset E′ ⊆ E of size K or less that covers the family U if and only if G′ has a subset X′ ⊆ X of size K′ or less that covers the set



Figure 7. The reduction of GVCE to VCE. From any instance of GVCE on graph G, a graph G′ is built for VCE. The family of disjoint sets of vertices of G is: U_1 = {v_1, v_2}, U_2 = {v_3}, U_3 = {v_6}, U_4 = {v_7, v_8}. For example, the loop edge of vertex z_1 is associated with the edges (v_1, v_2) and (v_1, v_4), and the edge (z_1, z_2) is associated with the edge (v_1, v_3).

of vertices Z. First suppose that E′, |E′| ≤ K, is a subset of edges in G that covers the family U. We choose a subset X′ ⊆ X of edges of G′ in the following way:
– for each e = (u, w) ∈ E′ such that u ∈ U_i and w ∈ U_j, select the edge (z_i, z_j) ∈ X;
– for each e = (u, w) ∈ E′ such that u ∈ U_i and w ∈ U_i, select the edge (z_i, z_i) ∈ X;
– for each e = (u, w) ∈ E′ such that u ∈ U_i and w ∉ U_k for each k = 1, 2, ..., h, select the edge (z_i, z_i) ∈ X.
Let the selected edges form the set X′. Setting K′ = K, we have now obtained a set X′ such that |X′| ≤ K′ and X′ covers the vertex set Z. Indeed, if there were a vertex in Z that is not covered, say z_k, there would be no edge in E′ covering the subset U_k, which leads to a contradiction. Conversely, let X′ ⊆ X, |X′| ≤ K′, be a subset of edges of G′ that covers the set of vertices Z. We choose a subset E′ ⊆ E of edges of G in the following way:
– for each (z_i, z_j) ∈ X′ select any edge (u, w) ∈ E such that u ∈ U_i and w ∈ U_j;
– for each (z_i, z_i) ∈ X′ select any edge (u, w) ∈ E such that either u ∈ U_i and w ∈ U_i, or u ∈ U_i and w ∉ U_k for each k = 1, 2, ..., h.
Let these selected edges form the set E′. Setting K = K′, we have obtained a set E′ such that |E′| ≤ K that covers the family U. Indeed, if there were a set in U that is not covered, say U_k, there would be no edge in X′ covering the corresponding vertex z_k, which leads to a contradiction.

Obviously, SLAP-tot(2) and SLAP-zero(2) are special cases of SLAP(2), and therefore correspond to particular instances of the GVCE problem. The SLAP-zero(2) instance is obtained when the family U = {U_1, U_2, ..., U_h} of the set of vertices of G



is such that each set corresponds to a single vertex, i.e. Ui = {vi } for each vi ∈ V . On the other hand, when each set of the family corresponds to the vertex set of a bipartite connected component of G we obtain the SLAP-tot(2) instance. Thus, we develop the following polynomial algorithm to solve SLAP(2) when there are exactly two paths on each arc of the network. Algorithm 1 for SLAP(2). Input: A network R = (N , A), a set of paths Y = {Y1 , Y2 , . . . , Y p } such that on each arc of the network there are exactly two paths, a set A1 ⊆ A of arc flow volumes. Output: An optimum set A ⊆ A that solves SLAP(2). Step 1. Build the path intersection graph G = (V, E); Step 2. Define the subset of edges E 1 ⊆ E corresponding to the subset of arcs A1 ⊆ A; Step 3. Define the family U = {U1 , U2 , . . . , Uh } of disjoint subsets of vertices, where each subset is the vertex set of a bipartite component of G 1 = (V, E 1 ); Step 4. Solve the GVCE on G with family U and let E be the optimum subset of edges; Step 5. A is a subset of arcs which corresponds to the set E . 5.

5. An algorithm for the general case

In Section 4 we showed that SLAP(2) is polynomially solvable and provided Algorithm 1, which solves it in polynomial time. To address the general case of SLAP, with more than two paths on some arcs, we present here a greedy-type approximate algorithm. We first describe the algorithm for the total count information case and then present the modified versions for the zero and partial count information scenarios. The algorithm is implemented through appropriate manipulation of the coefficient matrix B. It is based on the following observation. Consider an arc a ∈ A that is common to q paths. If we know the volumes of q − 1 paths containing arc a and we know the total flow volume of this arc, then we can compute the flow volume of the remaining path. We state this as a remark.

Observation 2. Given an equation $y_{i_1} + y_{i_2} + \cdots + y_{i_q} = f_k$ of the initial system of equations $By = f$, if we add to the system the $q-1$ equations $y_h = f_h$, $h = i_1, i_2, \ldots, i_{q-1}$, then $y_{i_q} = f_k - \sum_{h=1}^{q-1} f_{i_h}$.

Suppose we locate a path-ID sensor on arc a_i ∈ A corresponding to row B_i of the matrix. Then all path flows on arc a_i become known and the i-th row of the matrix B becomes redundant; this is effected by setting element b_ij = 0, j = 1, 2, ..., p. Also, when b_iq = 1 the flow on path Y_q is known; therefore, the column B^q is no longer needed and can be eliminated. Then, if there is a row, say B_k, such that b_kl = 1 is its only non-zero element, the flow on path Y_l is determined (by Observation 2) and we no longer need



to consider that row and also column B^l. We can repeat the process until the rows of the remaining matrix are all equal to zero or have at least two non-zero elements. We then choose one of the rows with non-zero elements, according to a given selection criterion, to locate a sensor and repeat the above process. We continue until all of the rows of matrix B become zero. The steps of Algorithm 2, based on the above observations, to obtain a feasible solution to the general problem are as follows.

Algorithm 2 for SLAP-tot
Step 1. Set k = 0.
Step 2 (Selection). Select a row B_i^(k) ≠ 0 according to the given selection criterion to locate a path-ID sensor on the corresponding arc.
Step 3 (Deletion). Update the matrix B^(k) to obtain the matrix B^(k+1) in the following way:
  Step 3.1. For each q such that b_iq^(k) = 1, eliminate the corresponding column B^(k)q, as well as row B_i^(k);
  Step 3.2. If a row B_j^(k) of the remaining matrix is such that b_jq^(k) = 1 is its only element different from zero, then set i = j and k = k + 1 and go to Step 3.
Step 4 (Stopping criterion). If b_ij = 0 for all i = 1, 2, ..., m and for all j = 1, 2, ..., p then stop. Otherwise set k = k + 1 and go to Step 2.

Theorem 8. Algorithm 2 finds a feasible solution for SLAP-tot.

The proof follows directly from the observations that led to the algorithm. For the case when all the components of the path intersection graph are bipartite, we have the following corollary.

Corollary 6. If there are exactly two paths on each arc and all the components of the path intersection graph are bipartite, then Algorithm 2 finds an optimal solution for SLAP-tot(2).

Proof. Let G be the path intersection graph and let h be the number of connected components of G. Since the algorithm defines a subset of arcs A' ⊂ A that is feasible for the problem, |A'| ≥ h. We prove that |A'| > h cannot occur. Since the connected components of the graph are all bipartite, an optimal solution for SLAP-tot(2) is obtained by choosing exactly one edge for each component (Theorem 5); therefore, if A' is not an optimum set (i.e., |A'| > h), then there are two arcs, say a_i, a_j ∈ A', corresponding to edges of G, say e_i, e_j, that belong to the same connected component. Suppose the algorithm chooses arc a_i first. Since e_j is in the same connected component as e_i, the following three cases may occur:



(i) Y_{a_j} = Y_{a_i}: the row B_j of the matrix, after Step 3.1, effectively becomes zero and the arc a_j cannot be chosen subsequently;
(ii) |Y_{a_j} ∩ Y_{a_i}| = 1: the row B_j of the matrix, after Step 3.1, has only one element different from zero. Thus, after Step 3 it will be eliminated and arc a_j cannot be chosen subsequently;
(iii) Y_{a_j} ∩ Y_{a_i} = ∅: let Y_{a_j} = {Y_{j_1}, Y_{j_2}} and Y_{a_i} = {Y_{i_1}, Y_{i_2}} be, respectively, the sets of paths associated with arc a_j and arc a_i. Since a_i and a_j belong to the same connected component, there exists in G a path, of length l say, connecting vertex v_{j_s}, s = 1, 2, to vertex v_{i_k}, k = 1, 2. Let {B_i, B_{j_1}, ..., B_{j_{l−2}}, B_j} be the corresponding rows of the matrix. These rows are such that b_{j_k i_{k−1}} = 1 and b_{j_k i_k} = 1, and thus after l iterations of Step 3.2 row B_j will be equal to zero and arc a_j cannot be chosen subsequently.
Thus, we have a contradiction and the claim follows.
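The steps of Algorithm 2 can be sketched in Python as below. This is a minimal sketch under stated assumptions rather than the authors' implementation. Each row B_i is stored as the set of paths j with b_ij = 1. The paper leaves the selection criterion open, so the default used here (`most_paths_first`, a hypothetical name) is an assumption that picks the row with the most non-zero entries. The function name `greedy_slap_tot` is also hypothetical.

```python
def most_paths_first(rows):
    """Hypothetical selection criterion: the arc whose row has most non-zero entries.

    Ties are broken by arc name so that the result is deterministic.
    """
    return max(sorted(rows), key=lambda arc: len(rows[arc]))


def greedy_slap_tot(arc_paths, select=most_paths_first):
    """Greedy sensor placement when every arc has a count detector (SLAP-tot).

    arc_paths maps each arc to the set of paths that use it.
    Returns the list of arcs chosen for path-ID sensors.
    """
    rows = {arc: set(paths) for arc, paths in arc_paths.items()}
    sensors = []
    while any(rows.values()):  # Step 4: stop when all rows are zero
        candidates = {arc: row for arc, row in rows.items() if row}
        arc = select(candidates)  # Step 2: choose a non-zero row
        sensors.append(arc)
        known = set(rows[arc])  # Step 3.1: paths read by the new sensor
        while known:
            for row in rows.values():
                row.difference_update(known)  # eliminate the known columns
            # Step 3.2: a row with a single non-zero entry determines that
            # path flow through Observation 2, so its column is removed too.
            known = set()
            for row in rows.values():
                if len(row) == 1:
                    known |= row
    return sensors


# One bipartite component with two paths per arc: a single sensor suffices.
chain = {"a1": {"Y1", "Y2"}, "a2": {"Y2", "Y3"}, "a3": {"Y3", "Y4"}}
print(greedy_slap_tot(chain))
```

On the chain example the path intersection graph is a single bipartite component, and the sketch returns one arc, in line with Corollary 6.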

In the case of zero count information we cannot apply Observation 2, and thus Step 3.2 of the algorithm cannot be applied. Algorithm 2 then reduces to the greedy algorithm for solving the set covering problem (Nemhauser and Wolsey, 1988). On the other hand, in the case of partial count information, Observation 2 can be applied only to the rows of the matrix corresponding to arcs in A_1; that is, Step 3.2 is applied only when the row B_j^(k+1) under consideration corresponds to an arc a_j ∈ A_1.
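The zero and partial count variants can be illustrated with the following Python sketch, which restricts Step 3.2 to rows whose arcs carry a count detector. It is a sketch under assumptions, not the authors' code. The function name `greedy_slap_counts`, the parameter `counted_arcs` (standing for A_1), and the "most remaining paths first" selection rule are all hypothetical choices. With an empty `counted_arcs` the procedure behaves like a greedy set-covering heuristic.

```python
def greedy_slap_counts(arc_paths, counted_arcs=()):
    """Greedy sensor placement with count detectors only on counted_arcs (A_1).

    arc_paths maps each arc to the set of paths that use it.
    counted_arcs lists the arcs whose total flow volume is already known.
    """
    rows = {arc: set(paths) for arc, paths in arc_paths.items()}
    sensors = []
    while any(rows.values()):
        # Assumed selection rule: the arc covering most still-unknown paths,
        # ties broken by arc name.
        arc = max(sorted(rows), key=lambda a: len(rows[a]))
        sensors.append(arc)
        known = set(rows[arc])
        while known:
            for row in rows.values():
                row.difference_update(known)
            # Observation 2 needs the arc's total flow, so Step 3.2 is only
            # applied to rows of arcs that have a count detector.
            known = set()
            for a in counted_arcs:
                if len(rows[a]) == 1:
                    known |= rows[a]
    return sensors


chain = {"a1": {"Y1", "Y2"}, "a2": {"Y2", "Y3"}, "a3": {"Y3", "Y4"}}
print(greedy_slap_counts(chain))                # zero count information
print(greedy_slap_counts(chain, ["a2"]))        # partial count information
print(greedy_slap_counts(chain, ["a2", "a3"]))  # counts on more arcs
```

In this small example, adding count detectors on both a2 and a3 lets a single path-ID sensor on a1 determine all four path flows, whereas with no counts two sensors are selected.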

6. Summary and further research

In this paper, we addressed the problem of locating active sensors on the arcs of a traffic network where the sensors can provide data on paths. We showed that each sensor located on an arc results in a set of linear equations in path flow variables that may be used for finding path flows. The problem then becomes the selection of the minimum number of arcs whose added linear equations result in a full rank coefficient matrix. We presented a formulation of the problem and analyzed three different scenarios depending on the number of conventional counting sensors already located on the network. The general problem was shown to be NP-hard. However, we were able to identify polynomially solvable instances of it. Through the proofs of the polynomially solvable cases, some new graph theoretic models and theorems were obtained, which in their own right add to the graph theoretic knowledge base, besides providing insight for developing an approximate algorithm for the general case.

Another related sensor location problem, which we alluded to in the introductory section, arises when image sensors can be located on the nodes of the network (intersections) to provide turning ratios. By processing the images, obtained by either fixed or mobile video cameras, it is possible to recognize vehicles on the scene and track their movements. For example, by locating a fixed camera that takes images of the traffic flows at an intersection of the network we can estimate (i) the arc flow volume of each



arc incident to the node and (ii) the turning ratios at the intersection. The turning ratio t^i_jk at node i is the proportion of flow on arc (j, i) that goes to arc (i, k). Therefore, we can associate with each node, as for the path-ID sensors, a set of equations that might be added to the linear system if an image sensor is located on that node. Thus, the image-sensor-on-node location problem becomes a generalization of the active-sensor-on-arc location problem addressed in this paper, albeit a more complex one. We are currently formulating a general model for locating image sensors on nodes, and analyzing and developing solution approaches.

References

Bianco, L., G. Confessore, and P. Reverberi. (2001). “A Network Based Model for Traffic Sensor Location with Implications on O/D Matrix Estimates.” Transportation Science 35(1), 50–60.
Bianco, L., G. Confessore, and M. Gentili. (2003). “Combinatorial Aspects of the Sensor Location Problem.” Annals of Operations Research (to appear).
Cascetta, E. and S. Nguyen. (1988). “A Unified Framework for Estimating or Updating Origin/Destination Trip Matrices from Traffic Counts.” Transportation Research B 22, 437–455.
Conforti, M. and M.R. Rao. (1987). “Some New Matroids on Graphs: Cut Sets and the Max Cut Problem.” Mathematics of Operations Research 12(2), 193–204.
Feremans, C. (2001). “Generalized Spanning Trees and Extensions.” PhD Thesis, Université Libre de Bruxelles, Institut de Statistique et de Recherche Opérationnelle.
Gentili, M. (2002). “New Models and Algorithms for the Location of Sensors on Traffic Networks.” PhD Thesis, Department of Statistic, Probability and Applied Statistics, University of Rome “La Sapienza”.
Lam, W.H.K. and H.P. Lo. (1990). “Accuracy of O-D Estimates from Traffic Counts.” Traffic Engineering and Control 31, 358–367.
Nemhauser, G.L. and L.A. Wolsey. (1988). Integer and Combinatorial Optimization. New York, N.Y.: J. Wiley and Sons.
Wardrop, J.G. (1952). “Some Theoretical Aspects of Road Traffic Research.” In Proceedings of the Institute of Civil Engineering, Part II, pp. 325–378.
Yang, H., Y. Iida, and T. Sasaki. (1991). “An Analysis of the Reliability of an Origin/Destination Trip Matrix Estimated from Traffic Counts.” Transportation Research B 25, 351–363.
Yang, H. and J. Zhou. (1998). “Optimal Traffic Counting Locations for Origin-Destination Matrix Estimation.” Transportation Research 32B(2), 109–126.
