Designing Information Secure Networks with Graph Theory

Short Paper, CyCON 2015

Visa Vallivaara
Mathematical Sciences of University of Oulu
VTT Technical Research Centre of Finland
visa.vallivaara@vtt.

Abstract

Graph theory studies the properties of graphs and networks. Graphs are an excellent tool for designing, analysing and optimizing data networks. In a data network there are unequal components, which need different amounts of information security depending on their properties. Information security includes all measures that try to prevent the destruction, manipulation or theft of information. At the same time, the information must remain available to those who have permission to access it. A computer network can be modelled as a graph whose vertices are components of the network, such as a computer, switch or database. Edges illustrate the allowed connections between those components. Every vertex has an importance value; a smaller value means that the protection of the component is more important. Software-defined networking (SDN) is a fairly new approach to designing, creating and controlling computer networks. In a software-defined network all components can be directly connected. The main result of this Thesis is the algorithm twintrees, which makes software-defined networks more secure and reliable by producing a topology in which it is hard for malware to advance. Security levels make the network segmented, which protects critical parts from threats that could spread from the less protected parts of the network. It takes time for malware to spread from higher levels to lower levels, and thus it is easier to react to a threat before anything catastrophic happens. The algorithm presented in this Thesis transforms complete graphs into 2-edge-connected graphs by combining two edge-independent spanning trees. For a software-defined network, 2-edge-connectivity is an excellent property: any single connection between two components can be removed and the network still remains connected.


Introduction

In a data network there are unequal components that require different amounts of information security. The most valuable ones are usually databases, which store confidential and valuable information. The valuable components of the network need to be protected as well as possible. One way to do this is to put them in the most sheltered corner of the network structure, far away from threats coming from the Internet. Even if malware is able to infect a couple of the machines in the network, a catastrophe can be avoided if the network topology prevents the malware from spreading to the critical parts. For example, Filiol et al. (2007) studied the spreading of worms in large networks and found that the routing topology can have a very big impact on how fast worms spread [1]. However, the network must not be too tubular or scattered, so that data can still be transferred smoothly. In the author's Master's Thesis [2], graph theory was used to solve these problems. In this short paper based on the Thesis, most of the mathematical theory and all the proofs have been left out.

Graphs are an excellent tool for modelling and optimizing data networks. It is easy to examine and analyze the network topology with the help of graphs. For example, Ahmat (2009) used graph theory to research routing and monitoring related optimization problems of large complex networks, and showed that most of these problems are NP-complete or NP-hard [3]. Any network can be represented by a graph in which vertices represent network components and edges represent the communication links between components. Each vertex can have a security value, and we can also give the edges of the graph weight values that represent how much each link is normally used. By using these values we can improve the network topology, so that the most valuable components are in the safest place and the components that communicate a lot are close to each other.

The first section of this paper provides the basics of graph theory. The second section describes network security. In the third section we apply the optimization algorithm to different networks and analyze the results. The last section presents the conclusions of the Thesis.


1 Graph Theory

Graph theory examines the properties of graphs and networks. Graphs are a mathematical way to represent relations between distinct objects. They can be used to conveniently visualize and solve many practical problems, such as making schedules, transmitting data over the Internet or optimizing transport routes.

1.1 Properties of graphs

A graph $G = (V_G, E_G)$ consists of a vertex set $V_G$ and an edge set $E_G$. The degree $\deg(v)$ of a vertex $v$ tells how many connections the vertex has. A path $P : v_1 - v_2 - \cdots - v_{k+1}$ is a sequence of distinct vertices, $v_i \neq v_j$ for $i \neq j$, in which consecutive vertices are joined by an edge. Paths $P_1$ and $P_2$ are edge independent if they do not share an edge. A cycle $C : v_1 - v_2 - \cdots - v_k$ is like a path, except that it starts and ends in the same vertex: the vertices $v_1, \dots, v_{k-1}$ are distinct and $v_1 = v_k$. Vertices $u$ and $v$ are connected to each other if there exists a path from $v$ to $u$. A graph $G$ is connected if every two of its vertices are connected to each other. A complete graph $K_n = (V_K, E_K)$ has exactly one edge between each pair of its $n$ vertices.
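The definitions above can be made concrete with a small Python sketch. The adjacency-dictionary representation and the function names `complete_graph`, `degree` and `is_connected` are illustrative assumptions, not part of the paper.

```python
from itertools import combinations


def complete_graph(n):
    """Return K_n as an adjacency dict {vertex: set of neighbours}."""
    adj = {v: set() for v in range(n)}
    for u, v in combinations(range(n), 2):
        adj[u].add(v)
        adj[v].add(u)
    return adj


def degree(adj, v):
    """deg(v): the number of edges incident to v."""
    return len(adj[v])


def is_connected(adj):
    """True if every vertex can be reached from an arbitrary start vertex."""
    if not adj:
        return True
    start = next(iter(adj))
    seen, stack = {start}, [start]
    while stack:
        u = stack.pop()
        for w in adj[u]:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return len(seen) == len(adj)


k5 = complete_graph(5)
# Every edge is counted once from each endpoint, hence the division by two.
edge_count = sum(degree(k5, v) for v in k5) // 2
```

In this sketch every vertex of `k5` has degree four, and the edge count agrees with the formula n(n - 1)/2 for a complete graph.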

1.2 Minimum spanning tree

A tree $T = (V_T, E_T)$ is a connected acyclic graph. A subgraph $T \subseteq G$ of a graph $G$ is a spanning tree of $G$ if it is a tree and contains all the vertices of $G$. Every connected graph contains a spanning tree.

Edges of a graph can have weights, which represent the price of using them. $G^\alpha = (V_G, E_G, \alpha)$ is a weighted graph, where $\alpha : E_G \to \mathbb{R}^+$ is a weight function.

A minimum spanning tree of a weighted graph $G^\alpha$ is a spanning tree $T$ whose edges have the smallest total weight. For example, Figure 1 shows a weighted graph whose minimum spanning tree is drawn in bold. There are a couple of good algorithms for finding the minimum spanning tree. Kruskal's algorithm is from the year 1956 [4] and its execution time is $O(|E| \log |V|)$. Another one is Prim's algorithm from the year 1957 [5], and its best execution time is $O(|E| + |V| \log |V|)$.

Figure 1: Minimum spanning tree in a weighted graph.
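As an illustration of how a minimum spanning tree can be found, the following is a minimal Python sketch of Kruskal's algorithm with a simple union-find structure. The function name `kruskal`, the edge-list format and the example graph are assumptions made for this sketch; the graph is not the one in Figure 1.

```python
def kruskal(n, edges):
    """Kruskal's algorithm on vertices 0..n-1.

    edges is a list of (weight, u, v) tuples. Returns the list of tree edges.
    """
    parent = list(range(n))

    def find(x):
        # Follow parent links to the representative, halving the path on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    tree = []
    for w, u, v in sorted(edges):      # cheapest edges first
        ru, rv = find(u), find(v)
        if ru != rv:                   # the edge joins two different components
            parent[ru] = rv
            tree.append((w, u, v))
    return tree


# Hypothetical example graph with four vertices and five weighted edges.
example_edges = [(1, 0, 1), (4, 0, 2), (3, 1, 2), (2, 1, 3), (5, 2, 3)]
mst = kruskal(4, example_edges)
total = sum(w for w, _, _ in mst)
```

The sorting step dominates the running time, which is where the $O(|E| \log |V|)$ bound quoted above comes from.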

1.3 Connectivity of a graph

The edge-connectivity number $\lambda(G)$ of a graph $G$ tells how many connections have to be broken before the network becomes disconnected:

$$\lambda(G) = \min\{\, |F| : F \subseteq E_G \text{ and } G - F \text{ is disconnected} \,\}.$$

A vertex $v$ is a cut vertex (cut point) of a graph $G$ if removing $v$ and all edges incident to it from $G$ gives a disconnected subgraph $G'$. A graph is nonseparable if it is connected and does not contain a cut vertex.

Vertices can be removed from a graph while maintaining the connectivity level by using vertex shrinking. Figure 2 shows an example of how shrinking is done. By shrinking the vertices $v_1$ and $v_2$ of the graph $G$ we get a graph $G'$ in which $v_1$ and $v_2$ are replaced by a single vertex $v$ in such a way that $v$ is adjacent to all the same vertices as $v_1$ and $v_2$ were.
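The connectivity notions of this section can be checked directly from their definitions. The Python sketch below does so by brute force, removing one edge or one vertex at a time; it is meant for small examples only, and the function names and example graphs are assumptions made for illustration.

```python
def connected(vertices, edges):
    """True if the graph with the given vertices and (u, v) edges is connected."""
    vertices = set(vertices)
    if not vertices:
        return True
    adj = {v: set() for v in vertices}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    start = next(iter(vertices))
    seen, stack = {start}, [start]
    while stack:
        u = stack.pop()
        for w in adj[u] - seen:
            seen.add(w)
            stack.append(w)
    return seen == vertices


def is_two_edge_connected(vertices, edges):
    """lambda(G) >= 2: connected, and still connected without any single edge."""
    if not connected(vertices, edges):
        return False
    return all(connected(vertices, edges[:i] + edges[i + 1:])
               for i in range(len(edges)))


def cut_vertices(vertices, edges):
    """Vertices whose removal (with their incident edges) disconnects the graph."""
    result = []
    for v in vertices:
        rest = [x for x in vertices if x != v]
        kept = [(a, b) for a, b in edges if v not in (a, b)]
        if not connected(rest, kept):
            result.append(v)
    return result


cycle = [(0, 1), (1, 2), (2, 3), (3, 0)]
# Two triangles sharing vertex 2: 2-edge-connected, but vertex 2 is a cut vertex.
bowtie = [(0, 1), (1, 2), (2, 0), (2, 3), (3, 4), (4, 2)]
path = [(0, 1), (1, 2)]
```

The `bowtie` example shows that 2-edge-connectivity does not imply nonseparability: no single edge disconnects it, but removing the shared vertex does.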

Figure 2: Vertices 7 and 9 have been shrunk to the vertex $v$.

2 Information secure network

Information security is a very broad concept. It includes all the means to prevent the destruction, modification and theft of data. However, at the same time the information must be made available to those who have the right to access it. Criminals try to steal confidential information from the organization using a variety of attack techniques such as phishing, man-in-the-middle attacks, trojans and other malware.

SDN, Software-Defined Networking, is a networking paradigm that provides a software abstraction layer on top of the physical network infrastructure, keeping the control plane functions distinct from the data plane [6]. The data plane still resides on the forwarding element (switch), while routing decisions are moved to a separate controller. The basic idea is that SDN separates the network management "brain" from the packet traffic management "muscles". This means that all network devices may be directly connected, as in a complete graph. This type of network is easier to optimize, and its structure can be changed in an instant, as the network's physical structures are not a constraint.

A data network can be modelled by a graph in such a way that the vertices are components of the network, such as a computer, router, switch, printer or database. Edges represent the permitted connections between the network components. Figure 3 shows some general network topology graphs. We can give the edges weight values that indicate how often the connection is usually used, and each vertex needs a security level value that determines how important its protection is. It is not easy to describe the security of a device with a single value, because the measurement of security is not unambiguous. Nevertheless, levels of security are measured and evaluated a lot. By using these values we can customize the network topology so that the components which have a lot of traffic between them are close to each other and the valuable components are protected.

Figure 3: Different topology graphs for networks.

Initially we can think that the network is a complete graph $K_n = (V_K, E_K)$. In the complete graph there is an edge from each vertex to all other vertices, so $|V_K| = n$ and $|E_K| = n(n-1)/2$. This kind of network would have the best possible reliability, but it is not very secure, so we want to reduce the number of connections. We can use previous network traffic data to give each edge a weight, a real value from zero to one, where a smaller value means a more often used connection. Next we give each vertex a security value, for example an integer from one to five, so that the smallest value marks the most valuable component. Then we add to each edge weight the difference of the security values of its endpoints, the significance of which can be controlled with a security parameter. Now we can calculate a minimum spanning tree with either Kruskal's or Prim's algorithm. The complete graph has been replaced by a tree that contains the most used connections in such a way that the vital components are protected.

A tree certainly isn't a very reliable network, since the removal of any edge makes it disconnected. We must therefore add edges in such a way that the network remains safe. So next we calculate a second minimum spanning tree from the original graph that is edge independent of the first tree. By combining these two independent trees we get a network where the degree of every vertex is at least two and the removal of any single edge will not disconnect the network.
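As a small worked example of this weighting, consider five security levels and the security parameter $s = 4$, using the weight formula given in Section 3.1. The traffic values 0.2 and 0.8 are made up for illustration, and the edge counts are for the ten-vertex case.

```latex
\begin{align*}
\alpha(v_i v_j) &= p_{ij} + \frac{|c_i - c_j|}{s} \\
% frequently used link between a level-1 and a level-5 component
\alpha_{\text{busy, cross-level}} &= 0.2 + \frac{|1 - 5|}{4} = 1.2 \\
% rarely used link between two level-2 components
\alpha_{\text{idle, same level}} &= 0.8 + \frac{|2 - 2|}{4} = 0.8 \\
% edges of the complete graph versus two edge-independent spanning trees, n = 10
|E_K| &= \frac{10 \cdot 9}{2} = 45, \qquad |E_H| = 2(10 - 1) = 18
\end{align*}
```

With these values the busy link that crosses four security levels ends up more expensive than the idle link inside one level, so the spanning tree prefers the latter. Combining two edge-independent spanning trees keeps 18 of the 45 possible connections.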

Figure 4: Complete graph $K_{10}$.

3 Analyzing the algorithm

The basic idea of the algorithm is to form two edge-independent spanning trees from the network and to unite the trees to get a new, better network.

3.1 Pseudocode

Let $G^\alpha = (V_G, E_G)$ be a weighted graph, where $|V_G| = n$ and $|E_G| = m$. We have to choose how many security levels $c \in \mathbb{Z}^+$ are needed, and every vertex is given a security value, $V_G = (v_i, c_i)$, where $i = 1, \dots, n$. From the traffic of the links we can calculate the probability $p \in [0, 1]$ for each edge, $E_G = (e_j, p_j)$, where $j = 1, \dots, m$. In addition, we need a security parameter $s \in \mathbb{R}^+$, which we use to control the priority of security in the network topology. When $s < c - 1$, security is the priority; if $s = c - 1$, then security and speed are equally important; if $s > c - 1$, then the transmission speed is more important. MinSTree finds the minimum spanning tree $T$.

    twintrees(G^α = (V_G = (v, c), E_G = (e, p)), s)
        for each v_i v_j ∈ E_G
            α(v_i v_j) = p_ij + |c_i − c_j| / s
        T_1 = MinSTree(G^α)
        E_G = E_G − E_T1
        T_2 = MinSTree(G^α)
        H = T_1 ∪ T_2
        return H

Figure 5: Graph from Figure 4 after execution of the algorithm.
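The pseudocode above can be sketched in Python as follows. This is an illustrative reading of the steps, not the author's implementation: the edge dictionary keyed by `frozenset`, the helper `min_spanning_tree` (Kruskal's algorithm) and the example data are assumptions. The sketch raises an error if the graph left after removing the first tree is disconnected, a case the pseudocode does not address.

```python
from itertools import combinations


def min_spanning_tree(n, weights):
    """Kruskal's algorithm; weights maps frozenset({u, v}) to a weight.

    Returns the set of tree edges, or raises ValueError if no spanning tree exists.
    """
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    tree = set()
    # Ties are broken by the sorted endpoint pair to keep the result deterministic.
    for edge in sorted(weights, key=lambda e: (weights[e], sorted(e))):
        u, v = edge
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            tree.add(edge)
    if len(tree) != n - 1:
        raise ValueError("graph is not connected")
    return tree


def twintrees(p, c, s):
    """p: traffic weights per edge, c: security level per vertex, s: security parameter."""
    n = len(c)
    # New weights: traffic weight plus the scaled security-level difference.
    alpha = {}
    for edge, weight in p.items():
        i, j = edge
        alpha[edge] = weight + abs(c[i] - c[j]) / s
    t1 = min_spanning_tree(n, alpha)
    # Second tree is computed without the edges of the first one.
    remaining = {e: w for e, w in alpha.items() if e not in t1}
    t2 = min_spanning_tree(n, remaining)
    return {e: alpha[e] for e in t1 | t2}


# Hypothetical example: K_5 with made-up traffic weights and security levels.
levels = [1, 1, 3, 5, 5]
traffic = {frozenset({i, j}): ((7 * i + 3 * j) % 10) / 10
           for i, j in combinations(range(5), 2)}
h = twintrees(traffic, levels, s=4)
```

In this example the result has 2(n - 1) = 8 edges and every vertex has degree at least two.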

3.2 Twintrees algorithm

The Matlab function twintrees takes as input a weighted graph G, a list of the security levels of the vertices C, a security parameter s and the method, Prim or Kruskal. The security levels and the security parameter are used to form the new weights, which are used to calculate the graph H, which is made from two spanning trees.

    function H = twintrees(G, C, s, method)
    n = size(C, 1);
    for j = 2:n                      % New weights
        for i = 1:(j - 1)
            G(j, i) = G(j, i) + abs(C(i, 1) - C(j, 1)) / s;
        end
    end
    G = sparse(G);
    for i = 1:n                      % Name vertices as letter:level
        Nodes(i) = cellstr([char('a' + i - 1) ':' int2str(C(i, 1))]);
    end
    T = graphminspantree(G, 'Method', method);       % first spanning tree
    G = G - T;                                       % remove its edges
    H = graphminspantree(G, 'Method', method) + T;   % add the second tree
    view(biograph(round(H*100)/100, Nodes, 'ShowArrows', 'off', 'ShowWeights', 'on'))
    end

The random graphs and related security levels used for testing were created with the following Matlab commands: G = tril(rand(n, n), -1) and C = randi(c, n, 1), where n is the number of vertices and c is the number of security levels. Figures 4 and 5 show graphs with 10 vertices each and security levels from one to five. In Figure 4 the network is a complete graph, so each vertex has access to all the other vertices. In Figure 5 the algorithm has been executed to enhance network security, which leaves only the safest connections. First the edges were given random weights from the range [0, 1] in such a way that a small value means a widely used connection. Next the weights obtained from the security levels were added to the edge weights with the security parameter s = 4, meaning that security and reliability are balanced.
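For readers without Matlab, roughly equivalent test data can be generated with Python's standard library. This is only a sketch of the setup described above: the function names, the fixed seed and the list-of-lists matrix format are assumptions, and the random numbers will differ from those used in the paper.

```python
import random


def random_instance(n, c, seed=0):
    """Strictly lower-triangular random weights and random security levels 1..c."""
    rng = random.Random(seed)
    weights = [[rng.random() if col < row else 0.0 for col in range(n)]
               for row in range(n)]
    levels = [rng.randint(1, c) for _ in range(n)]
    return weights, levels


def add_security_term(weights, levels, s):
    """Add |c_i - c_j| / s to every entry below the diagonal, as in twintrees."""
    n = len(levels)
    out = [row[:] for row in weights]
    for j in range(1, n):
        for i in range(j):
            out[j][i] += abs(levels[i] - levels[j]) / s
    return out


G, C = random_instance(10, 5)
A = add_security_term(G, C, s=4)
```

With five levels and s = 4, the added security term lies between 0 and 1, the same range as the traffic weights, which matches the balanced setting described in the text.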

3.3 Analyzing the results

Security levels help to segment the network, which protects critical parts of the network from threats that could spread from the less protected parts. It takes time for malware to spread from the higher levels to the lower levels, so it is possible to react to the threat before major damage has been done. The minimum spanning trees should be computed with Prim's algorithm rather than Kruskal's, because in the starting situation the graphs have a lot of edges, which implies that Prim's algorithm is faster. On small networks this does not matter, but for example on a graph with 2000 vertices the execution time of the algorithm was nine seconds with Prim and over thirteen seconds with Kruskal.
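One plausible reason for this difference on dense graphs can be seen in the array-based variant of Prim's algorithm sketched below in Python. It runs in $O(|V|^2)$ time without sorting the edges, whereas Kruskal's algorithm has to sort all $n(n-1)/2$ edges of a complete graph. The function name, the matrix format and the example are assumptions made for this sketch; the timings quoted above come from Matlab's implementation, whose internals are not described in this paper.

```python
import math


def prim_dense(weight):
    """Array-based Prim's algorithm.

    weight is a symmetric n x n matrix with math.inf where there is no edge.
    Returns the tree edges as (parent, child) pairs in the order they were added.
    """
    n = len(weight)
    in_tree = [False] * n
    best = [math.inf] * n      # cheapest known edge from the tree to each vertex
    parent = [-1] * n
    best[0] = 0.0
    edges = []
    for _ in range(n):
        # Pick the cheapest vertex that is not yet in the tree.
        u = min((v for v in range(n) if not in_tree[v]), key=lambda v: best[v])
        if best[u] == math.inf:
            raise ValueError("graph is not connected")
        in_tree[u] = True
        if parent[u] != -1:
            edges.append((parent[u], u))
        # Update the cheapest connection of every vertex still outside the tree.
        for v in range(n):
            if not in_tree[v] and weight[u][v] < best[v]:
                best[v] = weight[u][v]
                parent[v] = u
    return edges


INF = math.inf
# Hypothetical four-vertex example.
W = [
    [INF, 1, 4, INF],
    [1, INF, 3, 2],
    [4, 3, INF, 5],
    [INF, 2, 5, INF],
]
tree = prim_dense(W)
tree_weight = sum(W[u][v] for u, v in tree)
```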

The graph $H$ created by the algorithm consists of two mutually edge-independent spanning trees, each a minimum spanning tree of the graph it was computed from, which also means that between each pair of vertices there are two edge-independent paths. So it is possible to remove any single edge and the network remains connected. The deleted edge can be replaced by a new edge in such a way that the 2-edge-connectivity is preserved, so that the network can be modified again if necessary. Alternatively, we can increase the weight of the unwanted edge and form a new network with the algorithm, in which case the connection is left out.

We also found during testing that the generated graphs were often nonseparable as well. From a nonseparable graph it is possible to remove any vertex and the graph will remain connected. But the graph may have a cut vertex if the graph is large and the security parameter is chosen to emphasize security. By varying the security parameter and the number of security levels, it is easy to generate many different kinds of networks and choose a nonseparable one.

If the graph is 2-edge-connected but not nonseparable, the infected vertex $v$ can be removed by shrinking it with an adjacent vertex $u$ that has the closest security level. Shrinking can be done in such a way that all the edges connected to $v$ are moved to the vertex $u$ one edge at a time, so that the network always remains connected. Then it is possible to calculate a new network with the twintrees algorithm without the contaminated vertex, add the edges belonging to the new network and remove the old edges.
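The shrinking step described above can be sketched in Python as follows. The function name `shrink_infected`, the adjacency-dictionary format and the example network are assumptions made for illustration. For brevity the sketch moves all edges in one step rather than one edge at a time, and it expects the infected vertex to have at least one neighbour.

```python
def shrink_infected(adj, level, v):
    """Contract the infected vertex v into its neighbour with the closest level.

    adj maps each vertex to the set of its neighbours, level maps each vertex
    to its security level. Returns (new_adj, u), where u is the chosen neighbour.
    """
    # Closest security level first; the vertex id breaks ties deterministically.
    u = min(adj[v], key=lambda w: (abs(level[w] - level[v]), w))
    new_adj = {x: set(nb) for x, nb in adj.items() if x != v}
    for x in new_adj:
        new_adj[x].discard(v)
    # Every other neighbour of v becomes a neighbour of u.
    for w in adj[v]:
        if w != u:
            new_adj[u].add(w)
            new_adj[w].add(u)
    return new_adj, u


# Hypothetical network: two triangles sharing vertex 2, which is a cut vertex.
network = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3, 4}, 3: {2, 4}, 4: {2, 3}}
security = {0: 1, 1: 2, 2: 3, 3: 3, 4: 5}
shrunk, target = shrink_infected(network, security, 2)
```

Here vertex 2 is merged into vertex 3, its neighbour with the same security level, and the remaining network stays connected even though vertex 2 was a cut vertex.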

4 Conclusion

The main result of this study is the algorithm twintrees, which makes software-defined networks more secure and reliable by making malware propagation more difficult. By using the security levels of the network components and weights derived from the known traffic, we can segment the network in such a way that the most valuable parts are protected and traffic flows smoothly. The algorithm restructures a complete graph into a 2-edge-connected graph by combining two mutually edge-independent spanning trees. For software-defined networking, 2-edge-connectivity is an excellent feature. It is possible to remove any desired edge from the network to get rid of an unwanted connection between two components, and the network will remain connected. The generated networks are often also nonseparable, but unfortunately not always. If the network is nonseparable, then any vertex can be removed and the network remains connected. This is useful, for example, when it is detected that one of the network components is infected by malware.

Even if the network is not nonseparable, the infected node can still be removed by shrinking, and 2-edge-connectivity ensures that the network remains connected during shrinking. Future research could include an improvement to the algorithm that would make the network always nonseparable. Another topic for future research could be the optimal number of security levels and the magnitude of the security parameter. I hope that in the future the algorithm can be tested in a real, changing environment and that data can be collected in practice.

References

[1] É. Filiol, E. Franc, A. Gubbioli, B. Moquet and G. Roblot: Combinatorial optimisation of worm propagation on an unknown network. Engineering and Technology, World Academy of Science, 2007.

[2] V. Vallivaara: Tietoturvallisten verkkojen suunnittelu graafiteorian avulla. M.Sc. Thesis. University of Oulu, Finland, 2014.

[3] K. Ahmat: Graph Theory and Optimization Problems for Very Large Networks. City University of New York, United States, 2009.

[4] J.B. Kruskal: On the Shortest Spanning Subtree of a Graph and the Traveling Salesman Problem. Proceedings of the American Mathematical Society 7, pages 48-50, 1956.

[5] R. Prim: Shortest Connection Networks and Some Generalizations. Bell System Technical Journal 36, pages 1389-1401, 1957.

[6] Open Networking Foundation: Software-Defined Networking: The New Norm for Networks. ONF White Paper, https://www.opennetworking.org/images/stories/downloads/sdnresources/white-papers/wp-sdn-newnorm.pdf, 2012.

[7] E. Weisstein: Vertex Contraction. A Wolfram Web Resource, http://mathworld.wolfram.com/VertexContraction.html, 2014.

[8] R. Diestel: Graph Theory. Third edition. Berlin: Springer, 2005.