A Dominating Set Based Peer-to-Peer Protocol for Real-time Multi-source Collaboration

Dewan Tanvir Ahmed, Shervin Shirmohammadi and Abdulmotaleb El Saddik
Distributed and Collaborative Virtual Environments Research Laboratory
School of Information Technology and Engineering, University of Ottawa, ON, Canada
{dahmed, shervin}@discover.uottawa.ca, [email protected]

Abstract

Designing a collaborative architecture for real-time applications is an intricate challenge that usually involves dealing with real-time constraints, resource limitations and complex synchronization problems. In multi-source collaboration applications, users interact with each other to share their states, which is essential for synchronous communication. In this paper, we present a real-time multi-participant communication architecture that efficiently manages these interactions in a peer-to-peer fashion. We introduce a graph-theoretic framework for provisioning overlay-network-based collaboration services to heterogeneous receivers. Considering resource limitations and exploiting geographical positions, the protocol greedily builds a degree-constrained minimum-cost connected graph, shaping the topology to a significant extent by selecting mesh neighbors and changing the metrics. Data delivery routes are picked using a dominating set. We name the protocol the Dominating Set based Peer-to-Peer Protocol (DS-P2P). Simulation is used to show that the framework is robust, responsive to tree partitions, and suitable for multi-participant real-time collaboration.

1. Introduction

Different technologies support different forms of collaboration, such as telephone conferencing, ISDN video conferencing, voice over IP and IP video conferencing, to name a few. Improved real-time collaboration brings significant productivity and effectiveness to the system and to its communities. Multi-participant interaction allows users to act together concurrently over the Internet in a variety of applications such as conferencing, application sharing, collaborative design, tele-learning, tele-training, and many others. Many of these applications are synchronous and must be carried out in real time. As such, they pose hard challenges to system designers. A scalable system is highly desirable, but scaling requires exchanging many messages to maintain proper collaboration state. A fundamental requirement of any real-time collaboration tool is the exchange of frequent update messages among the participants, yet it is challenging to keep the data rate low without affecting the collaboration experience. Synchronous communication and proper coordination among the parties can be characterized through end-to-end latency. Moreover, the interactions among parties are highly reactive; therefore, the requirement for frequent updates with a moderate end-to-end delay imposes hard time constraints.

Bandwidth-intensive applications have now become practical for home users thanks to improvements in communication technologies. The need for efficient support of one-to-many and many-to-many applications led to the proposal to implement multicasting on the global internetwork, called IP Multicast. The lack of applicability of IP multicasting on the Internet has been well documented in [1]. Because the Internet lacks a practical multicasting infrastructure, an alternative has been proposed that shifts multicast support from the networking layer to the end systems and the proxies. The key idea of overlay networks is to form a virtual network on top of the physical network so that overlay nodes can be customized to incorporate complex functionality without modifying the native routers. The overlay framework considers different issues relating to the application domain, such as potential group size, the unreliability of members and mechanisms to control their behavior, as well as pragmatic aspects like the heterogeneity of the peers.

Application layer multicast (ALM) is a form of overlay network that works in a peer-to-peer manner and overcomes the functionality limitations of IP Multicast [1]. Current deployment practices of IP Multicasting require manual configuration at routers to form the MBONE, which makes it expensive to set up and maintain. ALM instead moves multicast support from core routers to end systems: data packets are replicated at end-hosts rather than at routers. The popularity of ALM is growing in various fields as an alternative to native IP multicast. These include newsgroups, video conferencing, Internet games, Internet jukebox, interactive chat lines, distance learning, video on demand and so on.

In this paper, we present a real-time multi-participant peer-to-peer communication architecture. This architecture includes a graph-theoretic framework for provisioning overlay-network-based collaboration services to a diverse set of receivers by incorporating the features of a dominating set. It reduces the system's dependency on the end-hosts by minimizing the size of the forwarding node set. Its low diameter compared with other approaches, such as tree-based approaches, justifies its significance.

The rest of this paper is organized as follows. A literature review is given in Section 2. Section 3 describes the real-time multi-participant collaboration architecture. We present the simulation results and the analysis of the proposed architecture in Section 4. Finally, we conclude the paper in Section 5.

2. Literature review

There has been significant research activity in ALM-based protocols, some of it directed at real-time applications. Yusung et al. [2] exploit landmarks to construct topologically aware data paths among the multicast group members. Their approach does not need the exact network topology, but requires the relative position of the members with respect to the landmarks. The protocol partitions members into topologically aware clusters based on the ordering of their closest landmarks. DS-P2P, on the other hand, applies a greedy algorithm to build a degree-constrained minimum-cost connected graph. The focus of our approach is to define routing paths that reduce network latency and lessen redundant use of network resources compared with other existing scalable approaches.

Wierzbicki et al. introduce Fastcast [3] for efficient peer-to-peer applications. It is a root-based, online, and topology-aware ALM protocol. It is controlled by a parameter that trades off the traffic used against the worst-case length of the application layer path. DS-P2P considers the resource limitations of the end-hosts and constructs a mesh, which is more robust and responsive to tree partitions and more suitable for multi-source applications.

Yoid [4] defines a protocol for building a multicast tree using distributed end hosts. It uses hop count as a measure of distance. DS-P2P, on the other hand, uses physical distance to perceive the topology without any extra measuring technique. ALMI [5] uses a centralized algorithm to build a spanning tree, which makes it difficult to scale. Although DS-P2P registers nodes in a centralized manner, it can handle ad hoc issues in a distributed way. This increases the robustness of the system and lowers the dependency on other nodes. FT-ALM is a hybrid p2p collaboration protocol that carries time-dependent data in multicast fashion [6]. It is designed for heterogeneous end-hosts and constructs a tree, which is vulnerable in highly dynamic environments. To overcome this problem it makes use of backup parents. We found that a mesh-like structure is more robust and responsive to tree partitions. Thus, DS-P2P constructs a mesh structure and benefits from it.

We observed that many ALM protocols (such as RMX [7], Gossamer [8], Bayeux [9], Borg [10], Scribe [11]) operate on an existing peer-to-peer substrate that serves as a mesh, on top of which an overlay multicast tree can be constructed using either a reverse-path forwarding scheme (Gossamer [8], RMX [7], Scribe [11]), a forward-path forwarding scheme (Bayeux [9]), or both (Borg [10]). The advantages of these approaches include low control overhead and distributed management of the multicast tree, but they do not restrict the degree of each node and are sub-optimal.

Minimum Spanning Tree (MST) and Shortest Path Tree (SPT) routing algorithms can be modified to respect the degree constraints of each node. The problem of finding minimum-cost degree-constrained multicast trees or degree-constrained Steiner trees is NP-complete [12]. There are several heuristic approximation algorithms addressing this problem [13][14][15]. Some of these algorithms (such as [14][15]) do not provide exact guarantees on the degree of each node in the tree and instead provide a bound on the worst-case degree. Others focus on constructing a single tree and do not consider multiple trees over the same graph ([13][16]). Though there has been some research on constructing multiple trees on a shared graph [17], these approaches still only provide a bound on the worst-case (maximum) degree of any node, as opposed to guarantees on the individual maximum degree of every node, which is required for a protocol supporting multi-source applications.

DS-P2P is designed for multi-source communication. It has low dependency on other end-hosts because it uses a minimum number of forwarding nodes. The other key feature of DS-P2P is its low diameter, which is one of the key constraints of synchronous real-time applications such as networked games [18].

3. Collaboration and communication

We propose an effective p2p mechanism for real-time multi-participant collaboration. The stability of the system depends on the cooperation of other parties, so the key goal should be to reduce the dependency on other end-hosts. In the following subsections, we describe such a system.

3.1. General policy and node registering

A fault-tolerant, high-quality overlay structure is the foundation of any overlay-based application. To create an improved overlay, we assume every joining node registers with a central rendezvous point, i.e. the coordinator. Whenever a new node comes to the system, it makes an explicit join request to the coordinator. The joining node also specifies the necessary information, such as its identity, the bandwidth available for the session and its physical position in the geographical system. The coordinator processes the join request and accommodates the node in the system (covered in Section 3.2). Using this system, a node can register in advance or even after the start of the collaboration session.

Figure 1. Disconnected graph (nodes a, b and c are labeled 2/2; node d is labeled 2/0)

3.2. Mesh construction

The coordinator, a negotiator in DS-P2P, constructs a mesh based on the participants' geographical positions. Instead of end-to-end delay or hop count, the geographical position serves as an alternative way to perceive physical distance. The coordinator determines a node's fan-out (degree) using the following relation, where b (an application-specific parameter) stands for the bandwidth required to serve each client: degree = ⌊uploadBW / b⌋.

If we allow each node to choose the closest nodes as its neighbors, the resulting graph may be disconnected. Consider Figure 1, where Nodes a, b, c and d each have a degree of 2. Nodes a, b and c are close to each other and use up their degrees on one another. Node d cannot make an edge to any of them, even though two of the three nodes are close to it, so the resulting graph is disconnected. To avoid such a disconnected graph, the coordinator constructs a degree-constrained minimum-cost connected mesh using a heuristic approach. The objective of this construction is to determine the edges that form a connected mesh subject to the degree constraints while minimizing the overall distance.

In the first phase, considering a completely connected graph, the coordinator sorts the edges in ascending order. It greedily selects N-1 edges (N is the number of nodes) to span all the members. This policy ensures a connected graph, in fact a tree according to our algorithm, while satisfying the degree constraints. The next step is to include as many edges as possible to form a mesh while obeying the degree constraints and greedily optimizing the overall distance. The pseudocode is given in Figure 2 (trivial cases are ignored).

    Function Degree-Constrained MinCost Connected Mesh
    begin
        sort edges of E in ascending order
        e = minHeap(E)                      // Let e = Edge(n1,n2)
        add edge e to G′
        vertexSet = {n1,n2}
        while (|vertexSet| != N)            // N: total # of nodes
        begin
            e = minHeap(E)
            if (n1∈vertexSet ⊕ n2∈vertexSet)
            begin
                if (n2∉vertexSet and n1.freeDegree>0)
                    vertexSet = vertexSet ∪ {n2}
                    add edge e to G′
                elseif (n1∉vertexSet and n2.freeDegree>0)
                    vertexSet = vertexSet ∪ {n1}
                    add edge e to G′
            end
            update data structure
        end
        greedily add many edges to G′ considering degree constraints
    end

Figure 2. Mesh construction - Heuristic approach
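The two phases of the heuristic can be sketched in Python as follows. This is an illustrative reading of the pseudocode, not the authors' implementation: the function names `fan_out` and `build_mesh`, the use of planar (x, y) coordinates with Euclidean distance, and the sample bandwidth values are all assumptions. Two details also differ from the pseudocode and are choices of this sketch: it rescans the sorted edge list on every step instead of popping a heap, and it checks the free degree of both endpoints before adding an edge. It assumes at least two nodes, each with a degree of at least 1, and raises an error if the greedy tree phase gets stuck.

```python
import math
from itertools import combinations


def fan_out(upload_bw, b):
    """Fan-out from Section 3.2: degree = floor(uploadBW / b)."""
    return int(upload_bw // b)


def build_mesh(positions, degree):
    """Return a set of undirected edges (u, v) forming a connected mesh.

    positions: {node: (x, y)} planar coordinates (an assumption of this sketch)
    degree:    {node: maximum number of mesh neighbors}
    """
    # Complete graph, edges sorted by ascending distance.
    edges = sorted(
        (math.dist(positions[u], positions[v]), u, v)
        for u, v in combinations(positions, 2)
    )
    free = dict(degree)  # remaining degree per node
    mesh = set()

    def add(u, v):
        mesh.add((u, v))
        free[u] -= 1
        free[v] -= 1

    # Phase 1: grow a spanning tree, starting from the cheapest edge.
    _, u, v = edges[0]
    add(u, v)
    in_tree = {u, v}
    while len(in_tree) < len(positions):
        for _, u, v in edges:
            # Exactly one endpoint is in the tree and both have spare degree.
            if (u in in_tree) != (v in in_tree) and free[u] > 0 and free[v] > 0:
                add(u, v)
                in_tree |= {u, v}
                break
        else:
            raise ValueError("degree constraints prevent a spanning tree")

    # Phase 2: greedily add remaining edges that respect the degree limits.
    for _, u, v in edges:
        if (u, v) not in mesh and free[u] > 0 and free[v] > 0:
            add(u, v)
    return mesh


# A layout in the spirit of Figure 1: a, b and c are close, d is farther away.
positions = {"a": (0, 0), "b": (1, 0), "c": (0, 1.2), "d": (5, 4)}
degree = {n: fan_out(1000, 400) for n in positions}  # hypothetical bandwidths
mesh = build_mesh(positions, degree)
```

On this layout, a purely nearest-neighbor choice would let a, b and c form a triangle and leave d isolated. The tree phase instead spends only three edges to span all four nodes, so d is attached before the remaining degrees are used up.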

3.3. Data delivery path

It is hard to construct stable and efficient data delivery paths while incorporating fault tolerance features. Viewing the network model as a graph makes it possible to use the full arsenal of algorithms and concepts that are well known in graph theory. In graph theory, a dominating set for a graph G = (V, E) is a subset V′ of V such that every vertex not in V′ is joined to at least one member of V′ by some edge. We construct an undirected graph G = (V, E) using the above heuristic approach. Mathematically, the open neighbor set and the closed neighbor set of a vertex v∈V are represented by N(v)={u|(v,u)∈E} and N[v]=N(v)∪{v}, respectively.

The following marking scheme is carried out to determine the core nodes (i.e. gateway nodes) in the system that are responsible for data forwarding. Nodes are labeled either T or F, where T and F stand for gateway and non-gateway nodes, respectively. Initially, F is assigned to each v∈V. The marker m(v) is changed to T if v has two neighbors that are not connected to each other. The graph G′⊆G is induced by V′, where V′ = {v | v∈V, m(v)=T}. However, the dominating set constructed in this way is not minimal. The subset relation is used to reduce its cardinality; hence the next two rules are applied:

Rule 1: Consider two vertices u and v in G′. If N[v]⊆N[u] in G and key(v)