
2008 Second International Conference on Future Generation Communication and Networking

An Efficient Self-healing Scheme for Wireless Sensor Networks*

Jian Wan, Wanyong Chen, Xianghua Xu, Miaoqi Fang
Grid and Services Computing Lab
Hangzhou Dianzi University, Hangzhou 310018, China
E-mail: [email protected], [email protected]

Abstract

The reliability of wireless sensor networks is affected by faults that may occur for various reasons. For ZigBee sensor networks, the ZigBee specification defines a self-healing process that relies on the tree topology: when a ZigBee node fails, all of the descendants of the faulty node must rejoin the network. Instead, we present an efficient scheme for network self-repair after a node failure in which only one or some of the descendants need to rejoin the network. It allows the subnet of a faulty node to rejoin the network with minimum message exchanges and, therefore, saves energy. Simulation results show that most subnets rejoin the network at the first level of the sub-tree, and confirm the efficiency of the proposed scheme.

1. Introduction

Wireless sensor networks (WSNs), which consist of devices with sensing, data processing, and communication components, allow us to instrument, observe, and respond to phenomena in the natural environment and physical infrastructure. Because they are low-cost, low-power, robust and reliable, WSNs have a broad range of applications, such as military sensing, air traffic control, video surveillance, traffic surveillance, industrial and manufacturing automation, environment and habitat monitoring, physical security and scientific exploration in dangerous environments [1]. One key feature of a robust and reliable sensor network is the ability to repair itself when one or more of the nodes fail or lose contact with the rest of the network. This self-healing ability is essential to ensure network coverage and continued network functionality after one or multiple nodes have been disconnected from the network due to communication breakdown, battery drainage or node malfunction. As an industrial standard for ad-hoc networks built upon IEEE 802.15.4, ZigBee [2] has been widely adopted for low data-rate, low-power wireless sensor networks. The built-in self-healing process of the ZigBee standard causes the descendant nodes of a disconnected node to be disassociated as well and to rejoin from scratch, so the cost of communication is high. However, there is no need to do this in data collection applications. This paper proposes an improved method that reduces the number of message exchanges nodes need in order to rejoin the network. The proposed process, therefore, has the potential to save energy and bandwidth.

The rest of the paper is organized as follows. We first review the literature on self-healing in WSNs in Section 2. Then, we explain the assumptions under which we are operating in Section 3, before describing our scheme in Section 4. We give the simulation results in Section 5. Finally, the paper is concluded in Section 6.

2. Related Work

In this section, we briefly review related work on self-healing in wireless sensor networks and other networks. The ZigBee standard has a built-in self-healing process which specifies how a disconnected node should rejoin the network [2]. In particular, when a node is disconnected from the network, its descendant nodes have to be disassociated as well, and all of these nodes rejoin from scratch. This rejoin process is inefficient, since it can result in many unnecessary message exchanges. In [3], an approach to trace failed nodes in sensor networks was proposed, with which corrupted routes can be recovered effectively. However, the approach in [3] is centralized: the control center (base station) monitors the health of the sensors and recovers the corrupted routes, so its communication cost is high. In the self-repair procedure of the multicast protocol in [4], when a parent node leaves gracefully, the leaving parent broadcasts the address of a randomly chosen available child to its other children. The “elected” child replaces the leaving node and the other children attach to it directly. However, when a parent node leaves suddenly, the repair algorithm lets each child pick a potential parent and rejoin. The authors of [5] proposed an improved method to reduce the number of exchanged messages. The method tries to bring along as many descendants of the disconnected node as possible when it rejoins the network. Because of the depth limitation, descendants whose new depth would exceed the limit under that method have to execute the built-in self-healing process of the ZigBee standard to rejoin the network. However, in data collection applications the end devices only need to transfer their measurements to the coordinator over the parent-child links. Therefore, keeping the initial topology as far as possible is important, and the method in [5] does not meet this need of data collection.

* This work is supported by the Science and Technology Research and Development Program of Zhejiang Province, China (Grant No. 2007C11023, 2007C21G3230005 and 2007R40G2040097).

978-0-7695-3431-2/08 $25.00 © 2008 IEEE DOI 10.1109/FGCN.2008.138

3. Model

We are interested in ZigBee networks used for data collection that contain three types of devices: a coordinator, routers and end devices. In a ZigBee network, the coordinator starts a PAN, and other nodes join the network by becoming children of an existing node, which acts as the parent. A router can further accommodate child nodes. An end device, however, cannot have any children of its own. Fig. 1 shows the general topology of the ZigBee network.

Figure 1. The general topology of the ZigBee network

As shown in Fig. 1, the dark circles represent nodes that are faulty because of the sensor device itself or the harsh environment in which the sensor nodes are deployed. We only consider fault causes such as communication breakdown, battery failure or node malfunction. A child node that has failed to communicate with its parent goes through an “orphaning” process, during which it broadcasts its status and waits for a reply from its parent. If the reply is not received within a pre-configured time period, the child node considers itself disconnected. The coordinator is considered to be mains-powered and will not disconnect itself or be disconnected.

4. The Self-healing Scheme for ZigBee Networks

This section proposes a more efficient self-healing scheme for ZigBee networks. As described in Section 2, when a node is disconnected in a ZigBee network, it uses the built-in self-healing process to rejoin the network, which disassociates its descendants and lets them rejoin from scratch. Instead, our goal is to obtain the best repair at the minimum cost by keeping the initial topology of the sub-tree as far as possible.

In the proposed self-healing scheme, we divide the non-coordinator nodes into three types: repair sponsors, repair supporters and response neighbors. The repair sponsors are the nodes whose parents are faulty; they initiate the self-healing process when they find their parents faulty. The repair supporters are the descendants of the repair sponsors. The response neighbors are the neighbors of the repair sponsors and repair supporters (i.e., nodes within their transmission range). We describe the process for each of the three types of non-coordinator nodes in turn; every node is capable of handling all three types of requests. Fig. 2 illustrates the relationship among the three types of nodes in the self-healing process and the final purpose of the scheme. In Fig. 2, we show the states of first-level success and other-level success, which are distinguished by whether the repair sponsor itself has a response neighbor.

Figure 2. The purpose of the proposed self-healing scheme
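The three node roles distinguished in Section 4 (repair sponsors, repair supporters and response neighbors) can be illustrated with a small Python sketch. It is only an illustration under stated assumptions, not the authors' implementation: the tree is represented by a hypothetical `parent` map, radio connectivity by a hypothetical `in_range` map, and the function name `classify_roles` is invented here.

```python
def classify_roles(parent, in_range, faulty):
    """Split nodes into the three roles for one faulty node.

    parent:   dict node -> parent node (the coordinator maps to None)
    in_range: dict node -> set of nodes within transmission range
    faulty:   the failed node (assumed not to be the coordinator)
    All names here are assumptions made for this sketch.
    """
    # Repair sponsors: nodes whose parent is the faulty node.
    sponsors = {n for n, p in parent.items() if p == faulty}

    # Repair supporters: all descendants of the repair sponsors,
    # collected one tree level at a time.
    supporters = set()
    frontier = set(sponsors)
    while frontier:
        frontier = {n for n, p in parent.items() if p in frontier}
        supporters |= frontier

    # Response neighbors: non-coordinator nodes in range of a sponsor or
    # supporter that lie outside the isolated sub-trees and are not the
    # faulty node itself.
    isolated = sponsors | supporters
    coordinators = {n for n, p in parent.items() if p is None}
    neighbors = set()
    for n in isolated:
        neighbors |= in_range.get(n, set())
    neighbors -= isolated | {faulty} | coordinators
    return sponsors, supporters, neighbors
```

For example, if node A fails and its only child is S, then S is the repair sponsor, every node below S is a repair supporter, and any still-connected node within range of that sub-tree is a response neighbor.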

When a node considers itself disconnected through the process described in Section 3, it acts as a repair sponsor and goes through the procedure outlined in Fig. 3 to rejoin the network.

Step 1. Broadcast a JOIN_REQUEST (ieee_addr) to the neighbors, where ieee_addr is the joining node’s IEEE address.
Step 2. Wait for the responses from the neighbors, JOIN_RESPONSE (ieee_addr, depth), where ieee_addr is the neighbor’s IEEE address and depth indicates the neighbor’s depth.
Step 3. When all the JOIN_RESPONSE (ieee_addr, depth) messages are received, compare the depths of the response neighbors, and then send a JOIN (ieee_addr) to the one with the minimum depth to join it. If no JOIN_RESPONSE (ieee_addr, depth) is received, go to Step 4.
Step 4. Send a FIND_REQUEST (ieee_addr) to the children, where ieee_addr is the repair sponsor’s IEEE address.
Step 5. Wait for the responses from the repair supporters that find a response neighbor, FIND_RESPONSE (ieee_addr, depth), where ieee_addr is the IEEE address of the repair supporter and depth indicates the response neighbor’s depth.
Step 6. When all the FIND_RESPONSE (ieee_addr, depth) messages are received, compare the depths of the response neighbors, and then exchange the parent-child relationship with the child on the route to the repair supporter whose response neighbor has the minimum depth. Meanwhile, send an UPDATE_ROUTE (ieee_addr) to the new parent and keep the other parent-child relationships unchanged. The parameter ieee_addr in UPDATE_ROUTE (ieee_addr) is the IEEE address of the responder.

Figure 3. The procedure outline of repair sponsor

The response neighbors are the non-coordinator nodes next to the faulty sub-tree. They accept the rejoin request from the faulty sub-tree. Their procedure is outlined in Fig. 5.

Step 1. Listen to the parent and neighbors. When a message is received, go to Step 2.
Step 2. switch (message) {
    case ‘JOIN_REQUEST’ : Send a CHECK_REQUEST (ieee_addr) to check whether it can communicate with the coordinator, and wait for the CHECK_RESPONSE (ieee_addr), where ieee_addr is its own IEEE address.
    case ‘CHECK_RESPONSE’ : Send a JOIN_RESPONSE (ieee_addr, depth) to the joining requester.
    case ‘JOIN’ : Set the requester to be a new child.
}

Figure 5. The procedure outline of response neighbor
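The repair sponsor's procedure (Fig. 3), together with the replies of the response neighbors (Fig. 5), can be sketched as one centralized Python function. This is a simplified illustration, not the authors' implementation: message passing is replaced by direct lookups, the `reachable` set stands in for the CHECK_REQUEST / CHECK_RESPONSE exchange, the depth limit Lm is not checked, depths of moved nodes are not recomputed, and all names (`repair`, `parent`, `depth`, `in_range`, `reachable`) are hypothetical.

```python
def repair(sponsor, parent, depth, in_range, reachable):
    """Rejoin the sub-tree rooted at `sponsor`.

    Returns the number of nodes whose parent link changed (0 if the
    sub-tree stays isolated). `parent` is updated in place.

    parent:    dict node -> parent node
    depth:     dict node -> depth, for nodes still attached to the network
    in_range:  dict node -> set of radio neighbors
    reachable: set of attached nodes that can still reach the coordinator
    """
    children = {}
    for n, p in parent.items():
        children.setdefault(p, []).append(n)

    def best_neighbor(node):
        # Steps 1-3: JOIN_REQUEST / JOIN_RESPONSE, keep the minimum depth.
        replies = [n for n in in_range.get(node, ()) if n in reachable]
        return min(replies, key=lambda n: depth[n], default=None)

    # First-level success: the sponsor itself has a response neighbor.
    target = best_neighbor(sponsor)
    if target is not None:
        parent[sponsor] = target
        return 1

    # Steps 4-5: FIND_REQUEST travels down the sub-tree, but only through
    # supporters that found no response neighbor themselves.
    found = []  # entries: (neighbor depth, supporter, neighbor)
    stack = list(children.get(sponsor, []))
    while stack:
        node = stack.pop()
        nb = best_neighbor(node)
        if nb is not None:
            found.append((depth[nb], node, nb))
        else:
            stack.extend(children.get(node, []))
    if not found:
        return 0

    # Step 6: reverse the parent-child links on the single branch from the
    # chosen supporter up to the sponsor; all other links stay unchanged.
    _, cur, new_parent = min(found, key=lambda f: f[0])
    changed = 0
    while True:
        old_parent = parent[cur]
        parent[cur] = new_parent
        changed += 1
        if cur == sponsor:
            return changed
        cur, new_parent = old_parent, cur
```

In this sketch only the nodes on one branch change their parent, which mirrors the idea of keeping the initial topology of the sub-tree as far as possible.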

As a descendant of the repair sponsor, a repair supporter finds a response neighbor through which to rejoin the network and relays the messages of the self-healing procedure. It goes through the procedure outlined in Fig. 4.

Step 1. Listen to the parent and children. When a message is received, go to Step 2.
Step 2. switch (message) {
    case ‘FIND_REQUEST’ : Same as Steps 1-2 in Fig. 3. When all the JOIN_RESPONSE (ieee_addr, depth) messages are received, send a FIND_RESPONSE (ieee_addr, depth) to the parent, where depth is the minimum depth among the received JOIN_RESPONSE (ieee_addr, depth) messages. If no JOIN_RESPONSE (ieee_addr, depth) is received, forward the FIND_REQUEST (ieee_addr) to the children.
    case ‘UPDATE_ROUTE’ :
        if (local address == ieee_addr in UPDATE_ROUTE)
            Join the response neighbor that has the minimum depth and set the former parent to be a child.
        else
            Forward the UPDATE_ROUTE (ieee_addr) to the children and set the former parent to be a child.
        end if
    case ‘FIND_RESPONSE’ : Forward the FIND_RESPONSE (ieee_addr, depth) to the parent.
}

Figure 4. The procedure outline of repair supporter

5. Simulation Results

We used Visual Basic as the simulation tool. The simulations assume that all of the nodes are randomly deployed in a region of 1000 × 1000 m², and that every node has a communication range of 100 m. We consider the performance of our algorithm in small networks. Each parameter setting is repeated 100 times to obtain average results. We use three settings of the topological parameters, which stand for tall (Cm=4, Rm=2, Lm=9), regular (Cm=6, Rm=4, Lm=5) and flat (Cm=8, Rm=6, Lm=4) trees. Fig. 6 shows the percentage of nodes that succeed in rejoining the network. When the number of nodes increases, the number of neighbors of each node increases, so the children of the faulty node have more neighbors through which they can rejoin the network. The ZigBee standard and our algorithm achieve the same percentage.
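The deployment assumptions of the simulation (random placement in a 1000 × 1000 m region, a 100 m communication range, averaging over repeated runs) can be sketched as follows. This is an illustrative Python sketch, not the authors' Visual Basic tool; the function names `deploy`, `neighbors_in_range` and `mean_neighbor_count` and the use of a seed are assumptions made here.

```python
import math
import random


def deploy(num_nodes, side=1000.0, seed=None):
    """Place nodes uniformly at random in a side x side square (metres)."""
    rng = random.Random(seed)
    return [(rng.uniform(0, side), rng.uniform(0, side))
            for _ in range(num_nodes)]


def neighbors_in_range(positions, radio_range=100.0):
    """Map each node index to the set of node indices within radio_range."""
    result = {i: set() for i in range(len(positions))}
    for i, (xi, yi) in enumerate(positions):
        for j in range(i + 1, len(positions)):
            xj, yj = positions[j]
            if math.hypot(xi - xj, yi - yj) <= radio_range:
                result[i].add(j)
                result[j].add(i)
    return result


def mean_neighbor_count(num_nodes, runs=100, seed=0):
    """Average number of neighbors per node over `runs` random deployments."""
    total = 0.0
    for r in range(runs):
        nbrs = neighbors_in_range(deploy(num_nodes, seed=seed + r))
        total += sum(len(s) for s in nbrs.values()) / num_nodes
    return total / runs
```

Calling `mean_neighbor_count` for increasing node counts gives a quick check of the observation that denser networks offer each node more neighbors, and hence more candidate response neighbors.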

Figure 6. The percentage of nodes that rejoin successfully

Fig. 7 shows the percentage of first-level children that rejoin the network when one node is faulty.


Figure 7. The percentage of first-level children that rejoin the network

From Fig. 7, we find that the percentage of first-level children that rejoin the network is high, so the proposed scheme can effectively reduce the number of messages exchanged during self-healing. Fig. 8 shows the number of exchanged messages in the proposed scheme compared with the built-in process of ZigBee. We normalize the message count of ZigBee’s built-in process to 100%.

Figure 8. The number of exchange messages when nodes rejoin the network

Because the depth of the tree and the number of the faulty node’s children are small when the network’s scale is small, the decrease in exchanged messages under our scheme is not obvious in Fig. 8. However, as the number of nodes increases, the difference between our scheme and ZigBee’s built-in process becomes obvious. When the depth of the tree is large, as we know from Fig. 7, our scheme ensures a low cost, with a high percentage of rejoining the network at the first level, whereas the built-in process of ZigBee needs all of the faulty node’s children to rejoin the network, so its cost is high. Meanwhile, Fig. 8 shows that our scheme has a greater advantage in a tall tree than in a flat tree of the same scale when a node fails. Overall, the number of exchanged messages in our scheme is small, especially in networks with large depth and scale.

6. Conclusion

In this paper, we proposed a self-healing scheme for ZigBee sensor networks used for data collection. The scheme heals the isolated sub-tree by letting the one of its nodes whose neighbor has the minimum depth rejoin the network, and it needs to update at most one branch of the sub-tree. Unlike the built-in self-healing process in the ZigBee standard, it keeps the initial topology as far as possible, so it requires minimum message exchanges and saves time and energy. The example in the paper illustrates the execution of the scheme, and the simulation shows that the scheme performs effectively. Other applications of ZigBee sensor networks may need routing between sensor nodes. In future work, we plan to study how to adjust the IEEE addresses of the nodes after self-healing so that the scheme can meet this routing requirement.

7. References

[1] C. Chong and S. P. Kumar, Sensor Networks: Evolution, Opportunities, and Challenges, Proceedings of the IEEE, Vol. 91, No. 8, August 2003, pp. 1247-1256.
[2] ZigBee Specification Version 1.0, ZigBee Alliance, 2004.
[3] J. Staddon, D. Balfanz, G. Durfee, Efficient tracing of failed nodes in sensor networks, Proceedings of First ACM International Workshop on Wireless Sensor Networks and Applications (WSNA), 2002.
[4] T.T.-M. Kwan and K. L. Yeung, On overlay multicast tree construction and maintenance, Proceedings of International Conference on Collaborative Computing: Networking, Applications and Worksharing, 2005.
[5] W.Z. Qiu, P. Hao and R. J. Evans, An Efficient Self-healing Process for ZigBee Sensor Networks, Proceedings of International Symposium on Communications and Information Technologies, 2007.
