Technical University of Sofia
Faculty of Telecommunications
Department of "Communication Networks"

Algorithms and Heuristics for Data Mining and Decision Making in Next Generation Telecommunication Networks

Dissertation for the award of the scientific degree "Doctor of Philosophy" in the field of "Communication and Computer Science", doctoral program "Communication Systems and Networks"

Author: Vladislav Vasilev

Scientific Supervisor: Prof. Georgi Iliev

December 12, 2016

Acknowledgments

I would like to express my sincerest gratitude to Prof. Georgi Iliev for the unconditional support and valuable advice that he gave me in the process of exploring the vast scientific field, and that enabled me to produce my dissertation. I consider the practical experience and the knowledge gained to be indispensable and absolutely fundamental to my future professional development. Special thanks to Prof. Vladimir Poulkov for his efforts in examining and correcting my work; without them my work would surely have been of lesser quality, which would have diminished the interest in it. Special thanks to Prof. Albena Mihovska for finding the good qualities of my theories and for helping me to develop them further. Special thanks to Assoc. Prof. Alexander Tsenov for the speed with which he reviewed my dissertation, so that I could defend it in time; his notes were precise and added great value to the final text. I give my deepest gratitude to everyone who helped me to finish and defend my PhD thesis. I hope that my findings will make life better for everyone.

Contents

1 Overview of Unsupervised Clustering Methods in Sensor Networks
  1.1 Introduction
  1.2 Defining the Problem
  1.3 Overview of Mathematical Methods in Sensor Networks
  1.4 Advantages and Disadvantages of Graph Clustering with Cliques and Cuts
  1.5 Comparison of Methods for Graph Clustering
      1.5.1 Girvan-Newman Method
      1.5.2 Karger's Algorithm for Minimal Cuts of a Graph
      1.5.3 Spectral Clustering
      1.5.4 Probability That the Packet Will Reach Its Destination
      1.5.5 Variance of the Mean Delay Time
  1.6 Conclusion of the Comparison. Aims of the Dissertation. Novelty of Chapter 1

2 A Novel Software Package for Sensor Network Simulation
  2.1 Challenges in the Simulation of Sensor Networks
  2.2 Data Structure of the Novel Package for WSN
  2.3 Functions that Simulate the Behavior of an SN in Time
      2.3.1 Function markov_transition
      2.3.2 Function randsample_vv
      2.3.3 Function prop_states
      2.3.4 Functions bfs_nl and bfs_nld
      2.3.5 Function wsn_packets_v2
      2.3.6 Function pack_stat
      2.3.7 Function bin_inc_s
  2.4 Package of Functions that Cluster with Karger's Algorithm, Girvan-Newman and Spectral Clustering
      2.4.1 Karger's Algorithm
      2.4.2 Girvan-Newman Method
      2.4.3 Spectral Clustering
  2.5 Functions Used in Chapter 3
      2.5.1 Function simp_d_nld
      2.5.2 Function fi_si_v
      2.5.3 Function is_subset_exc_e
      2.5.4 Function rem_nld
      2.5.5 Function fi_nons
  2.6 Novelty of Chapter 2

3 Forecasting the Behavior of Sensor Networks
  3.1 Clique Clustering with Polynomial Computational Complexity that Includes the Time Variable
  3.2 Simulative Evaluation of the ClC
  3.3 Summary of the Results from the ClC
  3.4 Latent Variable Clustering Model
  3.5 Index Invariance
  3.6 Practical Interpretation of the Theoretical Results
      3.6.1 Probability that a Packet Will Reach Its Destination Using the HCRF
      3.6.2 Variance of the Mean Delay Time with HCRF
  3.7 Summary of the Comparison and Novelty of Chapter 3

4 Clustering with the HCRF for Maximizing the Throughput
  4.1 Philosophy of the Unsupervised Method with HCRF
  4.2 Simulation Modeling of the Problem of Finding the Maximal Throughput and Suppressing Inter-Cell Interference
  4.3 Redefining the Terms IIG and IIT to Work in Vector Spaces
      4.3.1 Stochastic Detection of IITs in Large Graphs
      4.3.2 Isomorphism as a Method for Re-indexing
      4.3.3 Mixed Complete r-Uniform Hyper-graphs
      4.3.4 Vector Detection of IITs
      4.3.5 Special Notes on MCMC
  4.4 Finding the Maximal Total Throughput in a Sensor Network
  4.5 Conclusion and Novelty of Chapter 4

Conclusion
Novelty
Future Work
List of Novel Papers
List of Abbreviations

A Appendix (in Bulgarian)
  A.1 …
  A.2 …
  A.3 …
  A.4 A paper signed by a notary

List of Figures

1.1.1 Principles of SN optimization.
1.2.1 Semantic structure of Ch. 1.
1.2.2 A diagram of a chain of effects in an SN.
1.2.3 Sensor network on the i-th iteration.
1.2.4 Sensor network on the (i + k)-th iteration.
1.3.1 Diagram of the mathematical methods.
1.4.1 Example graph and its cliques. In some cases the cut search (e.g. cuts 1 and 2) gives wrong results.
1.5.1 The distribution of the mean probability that a packet will reach its destination.
1.5.2 The distribution of the variance of the probability that a packet will reach its destination.
1.5.3 The distribution of the skewness of the probability that a packet will reach its destination.
1.5.4 Distribution of the variance of the mean delay time.
2.3.1 Diagram of the structure variables and functions from the package for SN simulations.
2.3.2 A sample from the set X with distribution P = [p_1, p_2, ..., p_n], resulting in the set X = (x_1, x_2, x_n, x_2).
2.3.3 A first-order Markov chain, discrete in space and time. The edges that show the probabilities of staying in the same state are not drawn.
2.3.4 Iteration 1.
2.3.5 Iteration 2.
2.3.6 Breadth-first search initialized at V_0, with the iterations shown as differently colored hyper-edges.
2.3.7 Binary search in an increasing vector. The search value is X = V_3. On the first iteration X is compared with the middle element of the array, V_4. Since V_4 > X, on iteration 2 we only observe the subset with indexes ranging from 1 to 3. The middle element of the subset in iteration 2 is V_2. Since V_2 < X, on the next iteration we examine only the remaining subset. On the third iteration we have a match X = V_3 and the search is terminated.
2.4.1 Minimal cuts have fewer edges.
2.4.2 Contraction of the edge E_{3,4} = (V_3, V_4).
2.5.1 A relational diagram of the structural variables and the functions of the package WSN used in Chapter 3 and Chapter 4.
2.5.2 For the set of neighbors Ne(V_1) = (V_2, V_3, V_4) of vertex V_1 and the set of neighbors Ne(V_2) - V_1 + V_2 = (V_2, V_3, V_4, V_6) of V_2 without V_1 and including V_2, one can see that Ne(V_1) ⊆ Ne(V_2) - V_1 + V_2.
2.5.3 For the set of neighbors Ne(V_1) = (V_2, V_3, V_4) of vertex V_1 and the set of neighbors Ne(V_3) - V_1 + V_3 = (V_2, V_3, V_4, V_8) of V_3 without V_1 and including V_3, one can see that Ne(V_1) ⊆ Ne(V_3) - V_1 + V_3.
2.5.4 Finding a simplicial vertex. For the set of neighbors Ne(V_1) = (V_2, V_3, V_4) of vertex V_1 and the set of neighbors Ne(V_4) - V_1 + V_4 = (V_2, V_3, V_4, V_7) of vertex V_4 without V_1 and including V_4, one can see that Ne(V_1) ⊆ Ne(V_4) - V_1 + V_4. Combining the results of Figures 2.5.2 and 2.5.3, one concludes that V_1 is simplicial.
2.5.5 The difference from Fig. 2.5.2, 2.5.3 and 2.5.4 is that the vertexes V_2 and V_3 are not neighbors. Therefore Ne(V_1) ⊄ Ne(V_2) - V_1 + V_2. Hence V_1 is not simplicial.
3.1.1 Factorization of a Markov field with cliques.
3.1.2 A dynamical procedure that finds the connected component.
3.1.3 T* = {X* ∪ Z* ∪ E*}.
3.2.1 Probability to successfully transmit a packet as a function of time for 20 different scenarios and a curved interpolated line passing through the data.
3.2.2 The defined chromatic metric as a function of time for 20 different scenarios and a curved interpolated line passing through the data.
3.4.1 A diagram of the HCRF model.
3.4.2 A tree of conditional probabilities.
3.5.1 The index invariant hyper-graph S_σ.
3.5.2 An example of a sub-hyper-graph.
3.5.3 An example of an IIT built with conditional IIGs S_{σ,1}.
3.6.1 Adding an element S_1 = (V_0, E_0) to the inmost edge of an IIG S_σ gives S_σ + S_1. Adding an S_1 = (V_0, E_0) to the outmost part of the dual S_σ^D gives S_σ^D + S_1, which is isomorphic to S_σ + S_1.
3.6.2 The distribution of the mean probability that a packet will reach its destination.
3.6.3 The distribution of the variance of the probability that a packet will reach its destination.
3.6.4 The distribution of the skewness of the probability that a packet will reach its destination.
3.6.5 Distribution of the variance of the mean delay time.
4.1.1 Association of an IIG with the Babylon Tower.
4.1.2 Ishango bone (Lebombo bone). Source: Wikipedia.
4.2.1 Noise suppression in a WSN.
4.3.1 Detecting an IIT.
4.3.2 Index invariant hyper-graph S_σ.
4.3.3 Example: a clique tree where the cliques/IIGs have only one intersecting vertex and hence form an IIT.
4.3.4 Indexing of the vertexes {V_1, V_2} with the help of a hyper-graph.
4.3.5 Index invariant hyper-graph S_σ.
4.3.6 Index invariant hyper-graph S_σ.
4.3.7 Example of the application of the hyper-graph relation.
4.3.8 Method with random deposition [1].
4.3.9 Sequential movement of the spheres [1].
4.4.1 Histogram of the power [dBm] of a sensor.
4.4.2 Throughput [bits/s/Hz] as a function of time [s] and dissipated power [dBm].
4.4.3 Throughput [bits/s/Hz] as a function of time [s] and dissipated power [dBm].
4.4.4 Distance between clusters in relation to the distance in total throughput [bits/s/Hz].
A.1.1 … (caption in Bulgarian)
A.3.1 G* = {X* ∪ Y* ∪ E*}.

List of Tables

2.2.1 Descriptions of the structure variable graph of Fig. 2.3.1.
2.2.2 Descriptions of the structure variable graph of Fig. 2.3.1.
2.2.3 Descriptions of the structure variable graph of Fig. 2.3.1.
2.2.4 Descriptions of the structure variable q of Fig. 2.3.1.

Chapter 1

Overview of Unsupervised Clustering Methods in Sensor Networks

1.1 Introduction

The development of modern Mathematical Methods (MM) in the field of communications clearly increases the intelligence of the individual devices in a system. In a broader perspective, when these methods are applied to the whole network, they enable a huge number of devices to cooperate as a single indivisible whole that is capable of performing more complicated network and user functions at the expense of the same or diminishing resources. Since network functions are directly related to resource management, better cooperation capabilities inevitably increase the efficiency and the utilization of the available resources. The Internet of Things (IoT) is a new-generation telecommunication network that depends greatly on the cooperative functions of its elements. The IoT is characterized as a set of many devices that make independent intelligent decisions in parallel, which in turn optimize processes, self-configure, self-heal the system, and so on [2–5]. Because of this, a Sensor Network (SN) is a big part of the IoT. SNs are mainly used for data acquisition, which enables intelligent decision making in the IoT. Hence, cooperative methods are actively applied in SNs. A computationally efficient and theoretically sound implementation of these methods allows SNs to work with a huge number of devices. In this way the ratio of cost to functionality drops, which broadens the possible applications of such systems in real-life solutions.


There are two types of mathematical methods that are used to increase the efficiency of a system such as an SN: supervised and unsupervised methods. The supervised methods require input data that is annotated, meaning that each data entry must be assigned to a class (for example, high, low, strong, weak). A supervised method is then trained on the data entries and their labels/classes. Because of the need for annotation, these methods have a limited field of application in SNs, where the ultimate goal is network autonomy. On the other hand, unsupervised methods require no annotation and hence have unlimited application possibilities in SNs. The current dissertation focuses only on unsupervised methods for cooperation in time-dependent SNs. At the time of writing this is a fundamental and unsolved problem in such networks. Fig. 1.1.1 shows the causal flow through which the mathematical methods influence an SN's functionality.


Figure 1.1.1: Principles of SN optimization.

Because each sensor makes autonomous intelligent decisions without having annotated data about its cooperation with other sensors, the functional relation that matters most for the quality of the SN is the one between the cooperative capabilities of the SN and the unsupervised methods. That is why Ch. 1 gives an overview of the most commonly used unsupervised learning methods applied in SNs. In order to compare the methods of Ch. 1 by simulation, a simulation model of an SN is built in Ch. 2. Ch. 2 also describes the difficulties of such a model and gives ways to avert them, both from the perspective of the algorithms used and the required software.


In Ch. 3 novel unsupervised methods are derived, such as the method with index invariance and the Lotka-Volterra method. The main goal of that chapter is to model the dynamic size of a connected component in the network. The given methods solve the stated problem efficiently, because they are mathematically derived from the assumptions of the model. In order to solve the problem, the sensors are modeled as social entities in a time-dependent environment, to which the Lotka-Volterra model is then applied. Next, more complicated mathematical tools are applied to the task through the use of a Hidden Conditional Random Field (HCRF) and the social principles, which leads to a novel clustering method. The chapter also contains a simulation comparison of the performance of the novel method with the methods of Ch. 1.

Chapter 4 examines another important aspect of the SN that is vital to its quality, namely the relation between system efficiency and the wireless interface. The main practical argument in this chapter is that, because of the required high efficiency-to-cost ratio, this network cannot be wired: a wired connection between many devices would cause huge installation cost inefficiencies. For this reason, SNs are also called Wireless Sensor Networks (WSN), where wireless means that the sensors are untethered with respect to both communication and source of energy. WSNs use a radio interface for communication and are battery powered; the batteries can often be recharged from an alternative power source, such as a solar panel. The nature of such communication techniques poses additional high requirements on the mathematical methods used. These tasks are the main subject of Ch. 4.

1.2 Defining the Problem

Ch. 1 follows the semantic structure given in Fig. 1.2.1. A key factor for the quality of the simulation results is the relation between the real world and the assumptions of a model. For this reason we first need to define an SN in a rigorous way, and then, based on this definition, we need to evaluate the relation between the model assumptions and the real world. As a result of this evaluation we shall have a workable simulation model that can be used for time-dependent network analysis.

Next, we examine an SN example, in which we find the three main problems that arise in such systems, namely the presence of differential processes, the need for energy efficiency, and the need to forecast such effects. Following that, we give an overview of the standard mathematical methods that could be used and we make a comparative analysis between them.

Figure 1.2.1: Semantic structure of Ch. 1.

SN Model

The definition of an SN is the following [6–13]:

Definition 1.2.1. Sensor networks are networks comprising a large number of devices for data gathering, that cooperate together, and the quality of the system is based on the mathematical methods for the efficient use of the shared resources.

With the use of Definition 1.2.1 we can make simplifying assumptions in order to obtain an SN model.

Assumption 1.2.1. All sensors are the same with respect to their computational capabilities, memory and battery capacity.

This assumption should hold, because each sensor has an extremely low price per device. Otherwise, a diversity of devices is associated with production costs for the different production lines.


Assumption 1.2.2. Each sensor is able to recharge its battery from an alternative power source, e.g. a solar panel.

In the general case, if the battery is not recharged, the lifetime of the sensor is limited with respect to time and to the maximum total data size that can be routed through the network. Therefore, in many real-life situations the sensors must be able to replenish their power.

Assumption 1.2.3. Depending on the available battery power, the coverage radius of each sensor changes according to the same first-order Markov Chain (MC).

This assumption implies that the available energy of a sensor is independent of the energy of all other sensors, which is not necessarily true. For example, sensors that belong to the most commonly used routes in the network will definitely have dependent energy levels. It is worth noting at this point that this is a limiting assumption of the simulation model. Nevertheless, when the network carries low traffic, the assumption will hold. This scenario is very probable in an intelligent network, where the mathematical methods implemented in the devices require more resources than the data transmission.

Assumption 1.2.4. The change of the coverage range depends only on the current state and not on previous ones.

This assumption follows from the use of a first-order MC and strengthens the assumption that the energy depends only on the computational processes on each sensor.

Assumption 1.2.5. Two sensors form a connection for a period of time if they pairwise fall in each other's coverage radius.

Hence, we assume that the connection is duplex.

Assumption 1.2.6. Each sensor has a limited queue length for processing its own packets and the packets routed by other sensors.

This assumption allows us to include in the model the functional relation between the traffic in the network and the network quality. With this assumption we can easily relate the dynamic computational resources to the packet loss.

Definition 1.2.2. For a given SN scenario, the position of each sensor does not change. From this point on, if not stated otherwise, when we speak about an SN, we mean a specific scenario of an SN.

This definition is motivated by the large number of sensors in the network: in general, once the sensors are installed, it is not feasible to move them until the end of their exploitation.


Effect 1.2.1. For a given scenario and for each pair of sensors, there exists a constant probability that a connection will arise.

This stems from the fact that the MC has a constant stationary distribution and the distances between the sensors do not change. Therefore, a topology obtained with this model forms a random graph in which the probability of each edge differs. Hence, the model is more complicated than the Erdos-Renyi model [14], where the probability of an edge is constant and the same for all edges. Therefore, this model requires simulation modeling. Finally, we also assume that:

Assumption 1.2.7. The network is decentralized.
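To make Assumptions 1.2.3–1.2.5 and Effect 1.2.1 concrete, the following minimal Python sketch simulates a coverage radius driven by a first-order Markov chain and estimates the resulting per-pair connection probability from the chain's stationary distribution. It is only an illustration and is not part of the simulation package of Ch. 2; the three-state radius model, the transition matrix and all function names are hypothetical.

```python
import numpy as np

# Hypothetical three-state model of the coverage radius (Assumption 1.2.3):
# every sensor follows the same first-order Markov chain over these states.
RADII = np.array([5.0, 10.0, 15.0])      # coverage radius per state [m]
P = np.array([[0.7, 0.2, 0.1],           # row-stochastic transition matrix
              [0.3, 0.5, 0.2],
              [0.1, 0.3, 0.6]])

def stationary_distribution(P):
    """Left eigenvector of P for eigenvalue 1, normalized to sum to 1."""
    vals, vecs = np.linalg.eig(P.T)
    pi = np.real(vecs[:, np.argmax(np.real(vals))])
    return pi / pi.sum()

def connection_probability(d):
    """Probability that two sensors at distance d fall pairwise in each
    other's coverage radius (Assumption 1.2.5), evaluated under the
    stationary distribution of the radius chain (Effect 1.2.1)."""
    pi = stationary_distribution(P)
    # The chains of the two sensors are independent (Assumption 1.2.3),
    # so the edge probability is the square of one marginal coverage event.
    p_cover = pi[RADII >= d].sum()
    return p_cover ** 2

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    state, trace = 0, []
    for _ in range(10):
        # First-order dynamics: the next state depends only on the current
        # state (Assumption 1.2.4).
        state = rng.choice(len(RADII), p=P[state])
        trace.append(RADII[state])
    print("radius trace [m]:", trace)
    print("P(edge) at d = 8 m:", connection_probability(8.0))
```

Because the stationary distribution and the pairwise distances are constant, the edge probability differs from pair to pair but stays fixed over time, which is exactly why the resulting topology is a random graph with non-uniform edge probabilities rather than an Erdos-Renyi graph.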

SN Example

Using the assumptions made above, we can build a software package for SN simulations in the way described in Ch. 2. With this package we can simulate an SN as a function of time. Let us consider the SN simulation example in Fig. 1.2.3 and Fig. 1.2.4, which show two consecutive states of the network. For each SN we should consider its behavior in time from the viewpoint of a single Initial Sensor Node (ISN). In the figures the ISN is labeled "Initial Node", and from the perspective of the network the ISN is referred to with label 1. To make the ISN easy to find in the figures, its coverage radius is also marked in green. Every sensor position is marked with "x" and its label is shown next to it. Each sensor's coverage radius is drawn as a blue circle. Red lines indicate that two sensors are pairwise in each other's coverage range and therefore form a connection at that moment. Black lines indicate the covering tree with respect to the ISN; this tree is used to route the ISN's packets through the network (a minimal sketch of its computation is given after Fig. 1.2.2).

Next, we describe the effects that play a crucial role in the behavior of the network:
(a) In the model, a change of energy causes a change in the coverage range.
(b) The coverage range changes the topology.
(c) The topology gives rise to different connected components.
(d) The change of the connected component changes the traffic.
(e) The traffic gives rise to different high-traffic (bottleneck) points.
(f) The traffic and the bottleneck points are the basis for the network quality.


This chain of effects is shown in Fig. 1.2.2.

Figure 1.2.2: A diagram of a chain of effects in a SN.
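To relate the covering tree of the example above to the topology produced by the model, the following sketch (again in Python, with hypothetical names, and not taken from the package of Ch. 2) builds the adjacency implied by Assumption 1.2.5 and the fixed positions of Definition 1.2.2, and then rebuilds the covering tree rooted at the ISN with a breadth-first search.

```python
from collections import deque
import numpy as np

def adjacency(positions, radii):
    """An edge exists where two sensors fall pairwise in each other's
    coverage radius (Assumption 1.2.5). positions: (n, 2), radii: (n,)."""
    d = np.linalg.norm(positions[:, None, :] - positions[None, :, :], axis=-1)
    adj = (d <= radii[:, None]) & (d <= radii[None, :])
    np.fill_diagonal(adj, False)          # no self-connections
    return adj

def covering_tree(adj, isn=0):
    """Breadth-first search from the Initial Sensor Node; returns the parent
    of every reachable sensor, i.e. the tree used to route the ISN's packets."""
    parent = {isn: None}
    queue = deque([isn])
    while queue:
        u = queue.popleft()
        for v in np.flatnonzero(adj[u]):
            if v not in parent:
                parent[v] = u
                queue.append(v)
    return parent

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    pos = rng.uniform(0.0, 50.0, size=(20, 2))    # fixed positions (Def. 1.2.2)
    rad = rng.choice([5.0, 10.0, 15.0], size=20)  # radii drawn from the MC states
    tree = covering_tree(adjacency(pos, rad), isn=0)
    print("sensors reachable from the ISN:", sorted(int(v) for v in tree))
```

Re-running these two functions after every Markov transition of the radii reproduces the behavior illustrated by Fig. 1.2.3 and Fig. 1.2.4: the edges, the connected component of the ISN and hence the covering tree all change from iteration to iteration.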


