SELF-ORGANIZING SENSOR NETWORKS WITH INFORMATION PROPAGATION BASED ON MUTUAL COUPLING OF DYNAMIC SYSTEMS

Sergio Barbarossa
INFOCOM Dpt., University of Rome "La Sapienza", Via Eudossiana 18, 00184 Rome, Italy
E-mail: @infocom.uniroma1.it

ABSTRACT

Sensor networks are typically used as distributed systems, composed of a set of cheap, lightweight components, for the detection of events of interest or the estimation of physical parameters. The most common design approach asks the sensors to collect data and send them to a fusion center, which takes the final decision. In this paper, we propose a totally different approach. Each network node is composed of a sensor, which measures the parameter of interest, and of a dynamic system (oscillator), initialized by the sensor measurement. The oscillators of nearby nodes are mutually coupled. We show that, through proper local coupling strategies, we may design networks, with no fusion center, in which the dynamic system on each node converges to the globally optimal maximum likelihood estimate that could otherwise be achieved only by an ideal fusion center with perfect access to all system parameters and observations.

1. INTRODUCTION

The fundamental challenge in research on sensor networks is to find strategies for designing networks composed of a multitude of cheap, lightweight components that may be individually unreliable but, as a whole, are capable of solving complex tasks through spontaneous self-organization [1].
It is the structural simplicity of each sensor that poses some of the basic problems in the design of the overall network, namely: i) each sensor should have minimum energy consumption and should therefore perform only simple tasks; ii) each sensor should be allowed to "fall asleep" at random times, for periodic recharge of its battery, without compromising network functionality; iii) the overall network should have detection and estimation capabilities superior to those of each individual sensor, possibly without the need for a centralized fusion center; iv) the information gathered by the sensor nodes should spread through the network without the need for complicated multiple-access or routing techniques; v) the network should be scalable, i.e., capable of operating correctly irrespective of the number of sensors.

Most current research on sensor networks aims at exporting to the sensor field part of the huge background of knowledge accumulated in telecommunication networks, with the specific goal of designing energy-efficient communication systems. However, in most applications the requirements and constraints of sensor networks are so different from those typically occurring in telecommunication networks that it may be more advisable to shift the basic paradigm and devise totally new decision and communication strategies. The scope of this paper is precisely to propose such a novel strategy, in which information spreads as a result of local coupling between adjacent nodes acting as mutually coupled adaptive oscillators. The idea is not entirely new, as it occurs in many biological systems; our goal here is to show how to translate the mathematical models of such biological systems into a sensor network scenario. One biological example useful to grasp the basic idea is the heartbeat. Even though our life depends in a fundamental way on the rhythm generated by the heart, surprisingly, there is no high-precision master clock! Nevertheless, our heart beats in a very regular fashion, is capable of adapting to external solicitations with very limited energy consumption, and lasts for a long time, even though the pacemaker cells responsible for the cardiac rhythm have a life cycle much shorter than the average lifetime of a human being. Where, then, is the source of this stability? The stability of the overall system is the result of the collective behavior of a population of mutually coupled pacemaker cells, whose individual reliability and precision are limited but which, as a whole, give rise to an extremely reliable system. The mathematics of populations of mutually coupled oscillators thus provides an interesting framework for designing sensor networks that satisfy the desired requirements.
This idea was initially proposed in [2], [3], where the sensors were made to operate as pulse-coupled integrate-and-fire oscillators, according to Peskin's model of heart physiology [6]. The approach proposed in [2], [3] encoded the information gathered by each sensor into temporal shifts. However, this could cause an ambiguity problem for distant observers, unable to discriminate

between the information-bearing time shift and the propagation delay. This potential limitation was removed in [4], where an alternative way of designing sensor networks, based on mutually coupled dynamical systems, was proposed. In [4] it was also shown that, acting on a single control parameter, the network could switch its behavior between two alternatives: local information storage, or information gathering and spreading. In this work, we extend the approach of [4]. In particular, we show how to design the coupling rules so that each node evolves towards the globally optimal maximum likelihood (ML) estimate, through local coupling between nearby nodes, without the need for any fusion center that gathers all the information present in each node to derive the ML estimate.

2. MUTUALLY COUPLED OSCILLATORS

In our proposed scheme, the network is composed of N nodes, each consisting of a sensor and a dynamical system. The sensor may work either as a detector or as an estimator. In the first case, the i-th sensor takes a decision about the event of interest (e.g., the intrusion of a person or the level of radiation) and sets a parameter, say ω_i, of the associated dynamic system accordingly: in case of detection it sets ω_i = Ω_1, whereas in case of no detection it sets ω_i = Ω_0. Alternatively, if the sensor works as an estimator, it sets ω_i proportional to the estimated variable. After sensing the environment, the dynamical system (oscillator) in the i-th node evolves from an initial condition given by the initial pulsation, set equal to ω_i, according to the following equation

$$\dot\theta_i(t) = \omega_i + \frac{K}{c_i}\sum_{j=1}^{N} a_{ij}\, F[\theta_j(t) - \theta_i(t)], \qquad (1)$$

with i = 1, ..., N, where θ_i(t) is the state function of the i-th sensor (θ_i(0) may be initialized as a random number); F(·) is, typically, a monotonically increasing, odd, nonlinear function of its argument; the coefficients a_{ij} are real variables that describe the coupling between sensors i and j; K is a control loop gain; and c_i is a coefficient that quantifies the attitude of the i-th sensor to adapt its value as a function of the signals received from the other nodes: the higher c_i, the less inclined the i-th node is to change its original decision ω_i. The function F(x) accounts for the mutual coupling between the sensors. By reciprocity, the coefficients satisfy a_{ij} = a_{ji}. The decision, or estimate, of each sensor is then encoded in its pulsation θ̇_i(t).

Approximating the derivative in (1) with finite differences, using a small time step Δt, we may rewrite (1) as the law governing the evolution of the instantaneous phase θ_i(t):

$$\theta_i(t+\Delta t) = \theta_i(t) + \omega_i \Delta t + \frac{K\Delta t}{c_i}\sum_{j=1}^{N} a_{ij}\, F[\theta_j(t) - \theta_i(t)], \qquad (2)$$

with i = 1, ..., N. Equation (2) has a straightforward interpretation when F(x) is a monotonically increasing, odd function. If, on average, most of the phases θ_j(t) are greater than θ_i(t), the last term on the right-hand side of (2) tends to be positive and θ_i(t) tends to increase. If, on the other hand, most of the phases θ_j(t) are smaller than θ_i(t), the last term is negative and θ_i(t) tends to decrease. Hence, in both cases, θ_i(t) evolves so as to reduce the difference between itself and the other phases. Since each oscillator behaves in the same way, we may expect the oscillators to synchronize. Indeed, as we will see later on, the capability of the network to synchronize depends on the value of the loop control parameter K. A possible choice for F(x) is, for example,

$$F(x) = \frac{e^{\lambda x} - 1}{e^{\lambda x} + 1}, \qquad (3)$$

with λ real and positive. With this choice, each oscillator evolves with an activation function that resembles the behavior of neurons in our brain. An alternative choice is

$$F(x) = \sin\left(2\pi \frac{x}{T}\right). \qquad (4)$$

In this case, the overall model (1), with a_{ij} = 1 for all i and j, is known as Kuramoto's model [5]. The choice (4) may seem more problematic than (3), as F(x) is no longer monotonic. Nevertheless, this does not create any real trouble: the consequence of using (4) is simply that the solution of the system of nonlinear equations (1) is forced to be periodic, with period T. In our work, the coefficients a_{ij} take into account the local coupling between oscillators, so that two oscillators are coupled (i.e., a_{ij} ≠ 0) only if their distance is smaller than the coverage radius of each sensor (the coverage radius is assumed to be the same for all sensors, even though this could be changed to accommodate different network topological models, like small worlds or scale-free networks). In the rest of the paper, we will refer to the parameters ω_i and the functions θ_i(t) as the natural pulsations and the instantaneous phases of the i-th oscillator, in accordance with Kuramoto's terminology. However, it is important to emphasize that neither ω_i nor θ_i(t) is necessarily the pulsation or instantaneous phase of a
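As a minimal illustration (a sketch, not the authors' implementation; all parameter values and the seed are arbitrary choices), the discrete-time update (2) with the sigmoidal coupling (3) can be simulated in a few lines of Python. With c_i = 1 and full connectivity, the pulsations of a small network initialized with binary decisions converge to a common value:

```python
import math
import random

def F(x, lam=1.0):
    # Sigmoidal odd coupling function of Eq. (3)
    return (math.exp(lam * x) - 1.0) / (math.exp(lam * x) + 1.0)

def simulate(omega, steps=20000, dt=0.001, K=50.0):
    """Iterate the discrete update (2) for a fully connected network
    (a_ij = 1, c_i = 1). Returns the final instantaneous pulsations."""
    N = len(omega)
    theta = [random.uniform(0, 2 * math.pi) for _ in range(N)]
    rates = omega[:]
    for _ in range(steps):
        # pulsation of node i: natural term plus coupling term of Eq. (2)
        rates = [omega[i] + K * sum(F(theta[j] - theta[i]) for j in range(N))
                 for i in range(N)]
        theta = [theta[i] + rates[i] * dt for i in range(N)]
    return rates

random.seed(0)
omega = [0.0, 100.0, 100.0, 100.0, 0.0]  # binary decisions Omega0 / Omega1
rates = simulate(omega)
print([round(r, 1) for r in rates])  # all pulsations approach mean(omega) = 60
```

With equal coefficients c_i, the common pulsation is the plain average of the initial ones, as predicted by (6) below in the paper's derivation.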

sinusoidal carrier. They are, in general, physical parameters whose choice is dictated by implementation constraints. For example, the oscillators may be pulsed oscillators, as in ultra-wideband systems, where θ_i(t) is the time at which the i-th node emits a pulse. In this case, the information is carried by the rate at which the pulse emission time varies, somehow mimicking the functioning of neurons in the brain. We say that the overall population synchronizes if all sensors end up oscillating with the same pulsation, i.e., θ̇_i(t) = θ̇*(t), ∀i. It is easy to verify that, thanks to the reciprocity a_{ij} = a_{ji} and to the oddness of F(x), if we multiply both sides of (1) by c_i and sum over the index i, the coupling terms cancel and we get

$$\sum_{i=1}^{N} c_i\, \dot\theta_i(t) = \sum_{i=1}^{N} c_i\, \omega_i. \qquad (5)$$

Hence, if the system synchronizes, the common pulsation must necessarily be constant and equal to

$$\dot\theta^*(t) := \omega^* = \frac{\sum_{i=1}^{N} c_i\, \omega_i}{\sum_{i=1}^{N} c_i}. \qquad (6)$$

If the coefficients c_i are all equal, ω* is simply the average of the initial pulsations ω_i. However, if each sensor knows the SNR with which it has taken its initial decision, it can set c_i = SNR_i, so that the final common pulsation becomes

$$\omega^* = \frac{\sum_{i=1}^{N} \mathrm{SNR}_i\, \omega_i}{\sum_{i=1}^{N} \mathrm{SNR}_i}. \qquad (7)$$

This is an interesting behavior, as it shows that the sensors with the highest SNR carry the most weight in the distributed decision. Ideally, if there is a noiseless sensor (i.e., with infinite SNR), it forces all other sensors to take its same decision and thus prevents them from making errors, even if they are noisy.

To better understand the behavior of the proposed system, we start with the simple case of two coupled oscillators and then illustrate the general case.

2.1. Two-oscillator system

A system with only two oscillators is relatively easy to analyze by adopting the insightful and elegant geometric interpretation used in [8]. Introducing the function ψ(t) := θ_2(t) − θ_1(t), and setting a_{12} = 1/2 (and, for simplicity, c_i = 1, ∀i), we can rewrite (1), with F(x) given by (3), as

$$\dot\psi(t) = \omega_2 - \omega_1 + K\, \frac{e^{-\lambda\psi(t)} - 1}{e^{-\lambda\psi(t)} + 1}. \qquad (8)$$

An example is reported in Fig. 1, showing ψ̇(t) as a function of ψ(t) (where we suppose, with no loss of generality, that ω_2 > ω_1).

[Fig. 1. Variation of ψ̇(t) as a function of ψ(t), with F(x) as in (3); the curve ranges between ω_2 − ω_1 − K and ω_2 − ω_1 + K.]

We can easily verify from the figure that there is only one equilibrium point, corresponding to ψ̇(t) = 0. At that point, the system is synchronous. Furthermore, this equilibrium is stable: as shown by the arrows representing the direction of the shift acting on ψ(t), when its value departs from the equilibrium state, the system reacts by forcing the point to move back to it. At the same time, we see that such an equilibrium exists only if

$$K > |\omega_2 - \omega_1|. \qquad (9)$$

The situation is apparently more complicated if we choose the sinusoidal F(x), as in (4). In that case, repeating the same simple analysis as before, we have

$$\dot\psi(t) = \omega_2 - \omega_1 - K \sin[\psi(t)], \qquad (10)$$

which is represented in Fig. 2. From Fig. 2, we notice that there are now two equilibria within one period. However, only one of them is stable, namely the one indicated by the circle. The equilibrium represented by the star is unstable, because any shift from that point leads to an indefinite departure from the equilibrium. Again, there exists one stable equilibrium point in each period if (9) holds true. At the equilibrium, the instantaneous phases differ only by a constant term, equal to

$$\theta_2(t) - \theta_1(t) = \arcsin\left(\frac{\omega_2 - \omega_1}{K}\right). \qquad (11)$$

This equation shows that the only way to reduce this phase difference is to choose a value of K sufficiently greater than the difference |ω_2 − ω_1|.
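The equilibrium phase offset (11) is easy to verify numerically. The following sketch (illustrative parameter values, not from the paper) integrates the phase-difference dynamics (10) with forward Euler and compares the settled value against the arcsine prediction:

```python
import math

def psi_dot(psi, dw, K):
    # Eq. (10): phase-difference dynamics under sinusoidal coupling
    return dw - K * math.sin(psi)

def settle(dw=3.0, K=5.0, dt=1e-3, steps=20000, psi0=0.0):
    """Forward-Euler integration of Eq. (10), started at psi0.
    The condition K > |dw| of Eq. (9) guarantees a stable equilibrium."""
    psi = psi0
    for _ in range(steps):
        psi += psi_dot(psi, dw, K) * dt
    return psi

psi_eq = settle()
predicted = math.asin(3.0 / 5.0)  # Eq. (11) with omega2 - omega1 = 3, K = 5
print(psi_eq, predicted)          # the two values agree
```

Rerunning with K closer to |ω_2 − ω_1| pushes the equilibrium offset towards π/2, matching the observation that only a large K keeps the residual phase difference small.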

[Fig. 2. Variation of ψ̇(t) as a function of ψ(t), with F(x) as in (4); the stable (circle) and unstable (star) equilibria lie where the curve, ranging between ω_2 − ω_1 − K and ω_2 − ω_1 + K, crosses zero.]

[Fig. 3. Variation of r_i(t) as a function of the time index, for different degrees d = 10, 16, 20.]

2.2. N-oscillator system

Whereas the case with only two sensors is easy to analyze, the general case of N sensors is much more difficult to study. Nevertheless, for large N, we may exploit the mean-field approximation [5], typically used in the study of phase transitions in thermodynamics, to derive an approximate solution. In the following derivations, we consider only the sinusoidal case, where F(x) is given by (4). For simplicity of notation, we also set c_i = 1. If we introduce the complex function

$$r_i(t)\, e^{j\alpha_i(t)} := \sum_{j=1}^{N} a_{ij}\, e^{j\theta_j(t)}, \qquad (12)$$

the mean-field approximation consists in assuming that, for large values of N, after a transient, if the system converges, the functions r_i(t) tend to a constant r, independent of the index i, and θ_j(t) = ω*t + θ_{j0}, so that, for large N, we have

$$\sum_{j=1}^{N} a_{ij}\, e^{j\theta_j(t)} \approx e^{j\omega^* t}\, r\, e^{j\alpha}. \qquad (13)$$

We will now show that, in practice, the mean-field approximation (13) is very good even for values of N that are not excessively large, provided that K is sufficiently large (the effect of the choice of K will be better illustrated next). As an example, in Fig. 3 we report the values of r_i(t) obtained over 20 independent realizations of a network composed of N = 21 sensors. In each realization, each sensor starts with a random phase θ_i(0), uniformly distributed between 0 and 2π. The natural pulsations ω_i, at the beginning of each experiment, are generated as binary random variables equal to Ω_0 = 0, with probability p_0 = 0.2, or to Ω_1 = 100, with probability 1 − p_0 = 0.8, as they are supposed to be the result of a binary decision. The network is generated as a regular graph, i.e., a graph where the number of sensors coupled to any given node is the same for all nodes. Using graph terminology, we call this number d the degree of the network. In particular, in Fig. 3 we show the behavior of r_i(t), as a function of time, for degrees d equal to 10, 16, and 20 (full connectivity). Fig. 3 shows that, after a transient, all functions r_i(t) tend to values slightly less than the network degree d. Ideally, the maximum possible value of r is exactly d, and this value is achieved when all oscillators are perfectly synchronous, so that all exponentials e^{jθ_j(t)} in (12) sum up coherently. In [4] we showed how to derive r analytically, for a network used as a distributed binary decision system. Multiplying both sides of (13) by e^{−jθ_i(t)} and taking the imaginary part, the mean-field approximation allows us to rewrite (1) as

$$\dot\theta_i(t) = \omega_i - K r \sin[\theta_i(t) - \omega^* t - \alpha], \quad i = 1, \ldots, N. \qquad (14)$$

The interesting aspect of this approximation is that the state equation of each sensor has the same behavior as in the two-sensor case, irrespective of the number of coupled oscillators. Proceeding as in the two-sensor case, there exists one stable equilibrium if

$$K r > |\omega_i - \omega^*|. \qquad (15)$$

What is important to emphasize about (15) is that the existence of an equilibrium depends on the value of r, which, in turn, depends on the collective behavior of the oscillators. Looking at the definition (12), at the beginning of the state evolution the value of r is typically small, and there might be just a few oscillators satisfying (15). However, as the number of synchronized oscillators increases, the value of r increases and it becomes more likely that other oscillators will satisfy condition (15), which increases r further, and so on. There is thus a sort of positive feedback by which more and more oscillators become locked to each other. Conversely, since r ≤ d, the network cannot synchronize if there are some oscillators for which

$$K d < |\omega_i - \omega^*|. \qquad (16)$$

The maximum value of r equals the network degree. Hence, the larger the coverage radius of each oscillator, the higher the probability that the network synchronizes. But increasing the coverage radius requires more transmission power. Alternatively, given d, we may increase K to prevent (16) from ever holding.
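The collective behavior described above can be reproduced with a small sketch (a smaller network than the paper's N = 21 experiment, chosen for speed; K and the pulsation split are illustrative assumptions). It integrates (1) with the sinusoidal coupling (4) (period T = 2π) under full connectivity, then evaluates the local order parameters r_i of (12):

```python
import cmath
import math
import random

def simulate(omega, K=40.0, dt=2.5e-4, steps=20000):
    """Forward-Euler integration of Eq. (1) with sinusoidal coupling (4),
    full connectivity a_ij = 1 for i != j, and c_i = 1."""
    N = len(omega)
    theta = [random.uniform(0, 2 * math.pi) for _ in range(N)]
    rates = omega[:]
    for _ in range(steps):
        z = sum(cmath.exp(1j * t) for t in theta)
        # sum_j sin(theta_j - theta_i) = Im(e^{-j theta_i} (z - e^{j theta_i}))
        rates = [omega[i] + K * ((z - cmath.exp(1j * theta[i]))
                                 * cmath.exp(-1j * theta[i])).imag
                 for i in range(N)]
        theta = [theta[i] + rates[i] * dt for i in range(N)]
    z = sum(cmath.exp(1j * t) for t in theta)
    # local order parameters r_i of Eq. (12), self term excluded
    r = [abs(z - cmath.exp(1j * theta[i])) for i in range(N)]
    return rates, r

random.seed(1)
omega = [100.0] * 8 + [0.0] * 2   # binary pulsations Omega1 / Omega0
rates, r = simulate(omega)
print([round(x, 1) for x in rates])  # all close to omega* = mean(omega) = 80
print([round(x, 2) for x in r])      # slightly below the degree d = N - 1
```

The run reproduces both observations made in the text: with c_i = 1 the common pulsation settles at the average of the ω_i (here 80), and the r_i approach, but never quite reach, the degree d, because the locked phases retain a small constant offset between the two decision groups.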

3. DECENTRALIZED ML ESTIMATION

Let us consider the linear observation model, where the i-th sensor observes a vector

$$y_i = A_i x + w_i, \qquad (17)$$

where x is the unknown parameter vector, assumed to be the same for all sensors; A_i is the mixing matrix of sensor i; and w_i is the observation noise vector, with zero mean and covariance matrix C_i. We assume that the noise vectors affecting different sensors are statistically independent of each other (although the noise vector within each sensor may be colored). Let us denote by L the number of unknowns, so that x is a column vector of size L, and let the observation vector y_i have dimension M. We consider the case where each single sensor must be able, in principle, to recover the parameter vector from its own observation. This requires M ≥ L and A_i full column rank. The ML estimate of each sensor alone is then

$$\hat{x}^{(i)}_{ML} = (A_i^H C_i^{-1} A_i)^{-1} A_i^H C_i^{-1} y_i, \qquad (18)$$

i.e., a weighted pseudo-inverse of A_i applied to y_i. An ideal centralized node that gathers all the observation vectors y_i without errors, and knows all mixing matrices A_i, would derive the optimal centralized ML estimate

$$\hat{x}_{ML} = \left(\sum_{i=1}^{N} A_i^H C_i^{-1} A_i\right)^{-1} \left(\sum_{i=1}^{N} A_i^H C_i^{-1} y_i\right), \qquad (19)$$

where the summation extends over all the nodes that send their information to the decision node.
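To see what (18) and (19) amount to, the following sketch specializes them to a scalar unknown (L = 1) and white noise (C_i = σ²I), so that A_i reduces to a vector a_i and no matrix inversion is needed; all sizes and the seed are illustrative choices, not the paper's:

```python
import random

def dot(u, v):
    return sum(p * q for p, q in zip(u, v))

def local_ml(a, y):
    # Eq. (18) for L = 1, C_i = sigma^2 I: least squares on one sensor
    return dot(a, y) / dot(a, a)

def central_ml(A, Y, sig2):
    # Eq. (19) for L = 1: precision-weighted fusion of all observations
    num = sum(dot(a, y) / s2 for a, y, s2 in zip(A, Y, sig2))
    den = sum(dot(a, a) / s2 for a, s2 in zip(A, sig2))
    return num / den

random.seed(0)
x_true, N, M = 2.0, 20, 9
A = [[random.gauss(0, 1) for _ in range(M)] for _ in range(N)]
sig2 = [9.0] * N
Y = [[a_m * x_true + random.gauss(0, 3.0) for a_m in a] for a in A]

x_local = [local_ml(a, y) for a, y in zip(A, Y)]
fused = central_ml(A, Y, sig2)
print(min(x_local), max(x_local))  # single-sensor estimates scatter widely
print(fused)                       # the fused estimate is far more accurate
```

The gap between the scatter of the single-sensor estimates and the accuracy of the fused one is exactly what the decentralized scheme below recovers without a fusion center.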

where the summation extends over all the nodes that send their information to the decision node. Clearly, the estimate (19) is the desired solution, but it is difficult to obtain because it requires a lot of information arriving at the decision

node, without errors. In fact, the sensor fusion center would need to know not only all the observations y i , but also the mixing matrices Ai and the noise covariance matrices C i of each sensor. Nevertheless, we will show next how to achieve the optimal estimate only through local exchange of partial information, without the need for collecting all the information in any node. Generalizing the strategy described in the previous section to the vector case, we design nodes that evolve according to the following vector state equation (i) −1 −1 ˆ M L +K(AH θ˙ i (t) = x i C i Ai )

N X

aij F [θ j (t)−θ i (t)],

j=1

(20) (i) ˆ M L given by (18). In (20), the with i = 1, . . . , N and x symbol F (x) has to be intended as the vector whose mth entry is the F (xm ), where xm is the m-th entry of x. −1 Multiplying both sides of (20) by AH i C i Ai , we obtain H −1 −1 ˙ AH i C i Ai θ i (t) = Ai C i y i +K

N X

aij F [θ j (t)−θ i (t)].

j=1

(21) Summing up all these equations over the index i, we get n X

−1 ˙ AH i C i Ai θ i (t) =

n X

−1 AH i C i yi .

(22)

i=1

i=1

Hence, if the system has the capability to reach a synchro∗ nization state, where θ˙ i (t) = θ˙ (t), for all i, that state must necessarily be θ˙ (t) = ∗

Ã

n X i=1

−1 AH i C i Ai

!−1 Ã

n X i=1

−1 AH i C i yi

!

. (23)
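The whole mechanism of (20)–(23) can be sketched end to end in the scalar case (L = 1, C_i = σ²I), where A_i^H C_i^{-1} A_i reduces to a scalar h_i = a_i·a_i/σ². The topology, gains, and seed below are illustrative assumptions, not the paper's simulation setup; the coupling uses the sigmoid (3):

```python
import math
import random

def F(x, lam=1.0):
    # sigmoidal odd coupling, Eq. (3)
    return (math.exp(lam * x) - 1.0) / (math.exp(lam * x) + 1.0)

random.seed(0)
x_true, N, M, sig2, K = 2.0, 20, 9, 9.0, 50.0
A = [[random.gauss(0, 1) for _ in range(M)] for _ in range(N)]
Y = [[a_m * x_true + random.gauss(0, math.sqrt(sig2)) for a_m in a] for a in A]

dot = lambda u, v: sum(p * q for p, q in zip(u, v))
h = [dot(a, a) / sig2 for a in A]                       # A^H C^{-1} A, scalar
x_loc = [dot(a, y) / dot(a, a) for a, y in zip(A, Y)]   # Eq. (18)
fused = sum(hi * xi for hi, xi in zip(h, x_loc)) / sum(h)  # Eq. (19)

# ring topology: node i coupled with its 4 nearest neighbors (a_ij = 1)
nbrs = [[(i + k) % N for k in (-2, -1, 1, 2)] for i in range(N)]

# Euler integration of the node dynamics, Eq. (20)
theta, rates = [0.0] * N, x_loc[:]
dt, steps = 1e-3, 10000
for _ in range(steps):
    rates = [x_loc[i] + (K / h[i]) * sum(F(theta[j] - theta[i])
             for j in nbrs[i]) for i in range(N)]
    theta = [theta[i] + rates[i] * dt for i in range(N)]

print(fused)
print(min(rates), max(rates))  # every node's rate matches the fused estimate
```

No node ever sees another node's a_i or y_i; only the phase differences θ_j − θ_i are exchanged locally, yet each node's pulsation converges to the centralized ML value (23), just as the invariance (22) predicts.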

This equilibrium coincides with the globally optimal ML estimate. Some examples are useful to grasp the behavior of the proposed system. In Fig. 4 we report the simulation result obtained for a system with the following parameters. The number of nodes is N = 20. The number of parameters to be estimated is L = 3, and their values are −1, 1, and 2. Each sensor collects an observation vector of size M = 9, through a mixing matrix A_i generated as a random Gaussian matrix with i.i.d. entries of zero mean and unit variance. Each observation is corrupted by Gaussian noise of variance σ_n² = 9. Each node is coupled with only four other nodes. The nonlinear coupling function is the one given in (3), with λ = 1. In Fig. 4, we report the behavior of the three components of the vector θ̇_i(t), as a function of t, for all nodes of the network. Each color refers to one parameter. The black lines represent the estimates of the three parameters obtainable with a fusion center that computes the ML estimate knowing all matrices A_i and all observation vectors y_i. It is interesting to see that each sensor reaches, after a transient, the global ML estimate, even though it is coupled to only a few nodes and no node sends information about its own mixing matrix to the other nodes. To better quantify the behavior of the

[Fig. 4. Behavior of the three components of θ̇_i(t), as a function of the time index, for all nodes of the network; the distributed estimates converge to the centralized ML estimates (black lines).]

[Fig. 5. Estimation variance vs. noise variance σ_n² (dB): centralized ML (dashed lines, for N = 5 and N = 11) and distributed ML (stars).]

proposed system, in Fig. 5 we report the estimation variance of the three parameters as a function of the additive noise variance. We compare the variance obtained with the optimal global ML (lines) and with the sensor network (stars). The setup is the same as in Fig. 4, except that here we used the sine function (4) instead of (3). We can check that, also in terms of variance, the distributed ML obtained with the network of mutually coupled oscillators performs as well as the ideal centralized scheme. As expected, even though each node is coupled with the same number of neighbors (four in both cases), we observe from Fig. 5 that increasing the overall number of nodes in the network decreases the estimation variance on each node!

In conclusion, in this paper we have shown how a properly designed sensor network, composed of mutually coupled dynamic systems, gives rise to a distributed estimator that performs as well as a globally optimal fusion center. In a way, the information propagates through the network in analog form, as a result of local coupling. Interestingly, increasing the number of nodes in the network yields a performance improvement, even if each node remains coupled with a fixed, small number of neighbors. In a sensor network designed this way, each node is characterized by a reliability that goes well beyond that of the single sensor, without the need to collect all the information in a fusion center. In a parallel work [4], we have also shown how to use the network as a

way to get spatial smoothing of the sensor estimates, again through the simple coupling mechanism.

4. REFERENCES

[1] Iyengar, S. S., Brooks, R. R. (Eds.), Distributed Sensor Networks, Chapman & Hall/CRC, Boca Raton, 2005.

[2] Hong, Y.-W., Scaglione, A., "Distributed change detection in large scale sensor networks through the synchronization of pulse-coupled oscillators," Proc. of ICASSP 2004, pp. III-869–872, Lisbon, Portugal, July 2004.

[3] Hong, Y.-W., Cheow, L. F., Scaglione, A., "A simple method to reach detection consensus in massively distributed sensor networks," Proc. of ISIT 2004, Chicago, July 2004.

[4] Barbarossa, S., Celano, F., "Self-organizing sensor networks designed as a population of mutually coupled oscillators," Proc. of IEEE Signal Processing Advances in Wireless Communications (SPAWC 2005), New York, June 2005.

[5] Kuramoto, Y., Chemical Oscillations, Waves, and Turbulence, Dover Publications, August 2003.

[6] Peskin, C. S., Mathematical Aspects of Heart Physiology, Courant Institute of Mathematical Sciences, New York Univ., 1975.

[7] Mirollo, R., Strogatz, S. H., "Synchronization of pulse-coupled biological oscillators," SIAM Journal on Applied Mathematics, vol. 50, pp. 1645–1662, 1990.

[8] Strogatz, S. H., Nonlinear Dynamics and Chaos, pp. 273–278, Perseus Books Publishing, Cambridge, MA, Dec. 2000.
