Network Classes and Graph Complexity Measures

Matthias Dehmer
Center for Mathematics, University of Coimbra, Apartado 3008, 3001-454 Coimbra, Portugal
[email protected]

Stephan Borgert
Darmstadt University of Technology, Telecooperation Group, Hochschulstr. 10, D-64289 Darmstadt, Germany
[email protected]

Frank Emmert-Streib
Computational Biology and Machine Learning Group, Center for Cancer Research and Cell Biology, School of Biomedical Sciences, Queen's University Belfast, 97 Lisburn Road, Belfast BT9 7BL, UK
[email protected]

Abstract

In this paper, we propose an information-theoretic approach to discriminating graph classes structurally. For this, we use a measure for determining the structural information content of graphs. This complexity measure is based on a special information functional that quantifies certain structural information of a graph. To demonstrate that the complexity measure captures structural information meaningfully, we interpret some numerical results.

Key words: Network Modelling, Network Complexity Measures, Entropy

1 Introduction

Exploring quantitative measures for detecting network complexity has been a fascinating research topic for many decades. One important starting point was the application of Shannon's information theory to the investigation of living systems, e.g., biological and chemical networks [11, 13, 3, 9, 16]. Subsequently, methods for quantifying the structural information content of graphs became quite popular; the classical ones were developed by, e.g., [14, 16, 12, 1]. In this paper, we deal with structurally discriminating network classes [5, 10] by using an information-theoretic technique for determining the structural information content of graphs. We define the structural information content of a given graph as its topological entropy. The resulting information measure will be interpreted as a graph complexity measure.

Classical methods for measuring the structural information content of graphs are often related to the problem of determining a partitioning of the underlying vertex set in order to obtain a certain probability distribution [14, 16, 12]. For example, Rashevsky [14] defined the entropy of directed/undirected and unweighted graphs by partitioning the vertices into sets of vertices that are indistinguishable with respect to local and nonlocal degree dependencies. A probability distribution was then obtained [14] by assigning to each partition the number of vertices in this partition divided by the total number of vertices. This approach was later developed further in [12].

In this paper, we use a newly proposed entropy measure [4] to discriminate graph classes structurally. In contrast to the classical graph entropy methods mentioned above, this measure is not based on determining vertex partitionings. The main construction principle is to assign a probability value to every vertex of a graph by using a certain information functional. Such a functional quantifies structural information of the graph under consideration. Based on numerical results, we demonstrate that this entropy measure can capture important structural information of graphs and, hence, can discriminate graph classes by calculating their structural information content.

2 Information-Theoretic Complexity Measures for Graphs

In this section, we briefly outline the graph entropy method for measuring the entropy of arbitrary undirected and connected networks [4]. Before starting, we state some mathematical preliminaries [2, 6, 4]. We start with the basic definition of an undirected, finite and connected graph G: we define G as G = (V, E), |V| < ∞, E ⊆ \binom{V}{2}, i.e., edges are two-element subsets of V. G is called connected if for arbitrary vertices v_i and v_j there exists an undirected path from v_i to v_j; otherwise, we call G disconnected. In this paper, G_{UC} denotes the set of finite, undirected and connected graphs. The degree of a vertex v ∈ V is denoted by δ(v) and equals the number of edges e ∈ E incident with v. We call the quantity σ(v) = max_{u∈V} d(u, v) the eccentricity of v ∈ V, where d(u, v) denotes the shortest-path distance between u and v. We note that d(u, v) is a metric. Further, ρ(G) = max_{v∈V} σ(v) is called the diameter of G. The j-sphere of a vertex v_i regarding G ∈ G_{UC} is defined as the set

    S_j(v_i, G) := {v ∈ V | d(v_i, v) = j},  j ≥ 1.    (1)
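To make the j-sphere computation concrete, here is a minimal Python sketch (not part of the paper; the adjacency-dict graph representation and the function names are our own assumptions):

```python
from collections import deque

def j_spheres(adj, v):
    """Compute all j-spheres S_j(v, G), j = 1, ..., sigma(v), of a vertex v.

    `adj` maps each vertex of a connected undirected graph to the set of
    its neighbours; a breadth-first search yields the shortest-path
    distances d(v, u) that define the spheres.
    """
    dist = {v: 0}
    queue = deque([v])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    spheres = {}
    for u, d in dist.items():
        if d >= 1:                      # the 0-sphere {v} is excluded
            spheres.setdefault(d, set()).add(u)
    return spheres                      # spheres[j] == S_j(v, G)
```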

In order to define entropy in general, let X be a discrete random variable with alphabet A and probability mass function p(x_i) = Pr(X = x_i). Then, the entropy of X is defined as

    H(X) := − ∑_{x_i ∈ A} p(x_i) log(p(x_i)).    (2)
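For illustration, equation (2) amounts to the following one-liner (a sketch of ours, not code from the paper; natural logarithm, i.e., entropy measured in nats):

```python
import math

def shannon_entropy(p):
    """Shannon entropy H(X) of a discrete distribution, given as a
    sequence of probabilities p(x_i); zero-probability terms contribute 0."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0)

print(shannon_entropy([0.5, 0.25, 0.25]))  # about 1.04 nats
```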

To recall the novel graph entropy method recently introduced in [4], we first state the definition of a special information functional. The information functional f^V quantifies structural information of a graph G by using metrical properties of graphs [15].

Definition 2.1 Let G ∈ G_{UC} be arbitrarily labeled. For a vertex v_i ∈ V, the information functional f^V is defined as

    f^V(v_i) := α^{c_1 |S_1(v_i, G)| + c_2 |S_2(v_i, G)| + ··· + c_{ρ(G)} |S_{ρ(G)}(v_i, G)|},  c_k > 0, 1 ≤ k ≤ ρ(G), α > 0,    (3)

where the c_k are real positive coefficients.

Definition 2.2 The vertex probabilities are defined by the quantities

    p^V(v_i) := f^V(v_i) / ∑_{j=1}^{|V|} f^V(v_j).    (4)
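Continuing the sketch above, Definitions 2.1 and 2.2 translate directly into code; the coefficient map `c` (one positive value per sphere index k) and all names are our assumptions:

```python
def f_V(adj, v, alpha, c):
    """Information functional f^V(v) = alpha ** (sum_k c_k * |S_k(v, G)|),
    cf. equation (3); `c` maps each k in 1, ..., rho(G) to c_k > 0.
    Spheres beyond the eccentricity of v are empty and contribute nothing."""
    spheres = j_spheres(adj, v)
    exponent = sum(c[j] * len(s) for j, s in spheres.items())
    return alpha ** exponent

def vertex_probabilities(adj, alpha, c):
    """Vertex probabilities p^V(v_i) = f^V(v_i) / sum_j f^V(v_j), cf. (4)."""
    values = {v: f_V(adj, v, alpha, c) for v in adj}
    total = sum(values.values())
    return {v: x / total for v, x in values.items()}
```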

Now, we define the structural information content of a graph G ∈ G_{UC} as the entropy of its underlying graph topology.

Definition 2.3 Let G = (V, E) ∈ G_{UC}. Then, we define the entropy of G by

    I_{f^V}(G) := − ∑_{i=1}^{|V|} p^V(v_i) log(p^V(v_i))    (5)
                = − ∑_{i=1}^{|V|} [f^V(v_i) / ∑_{j=1}^{|V|} f^V(v_j)] log(f^V(v_i) / ∑_{j=1}^{|V|} f^V(v_j)).    (6)
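The entropy of Definition 2.3 then follows in a few lines; the three-vertex path below is only a toy check (an assumed example, not from the paper):

```python
import math

def graph_entropy(adj, alpha, c):
    """Structural information content I_{f^V}(G), cf. equations (5)-(6).
    Note: for large graphs or large alpha, f^V should be computed in the
    log domain to avoid floating-point overflow."""
    p = vertex_probabilities(adj, alpha, c)
    return -sum(pv * math.log(pv) for pv in p.values())

# Toy example: the path 1 - 2 - 3 with alpha = 2 and c_1 = 2, c_2 = 1
adj = {1: {2}, 2: {1, 3}, 3: {2}}
print(graph_entropy(adj, alpha=2.0, c={1: 2.0, 2: 1.0}))
```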

First, we remark that the process of defining information functionals and, hence, the entropy of a graph by using structural properties or graph-theoretical quantities is not unique. Clearly, each structural graph property or quantity captures certain structural information of the underlying graph differently. Considering Definition 2.1, we also observe that the information functional f^V contains the free parameters α and c_k. The c_k can be used to weight structural characteristics of the graph in question. In terms of practical applications, the free parameter α can be determined, e.g., for graphs that are to be classified, by applying an optimization method that optimizes α with respect to known class labels of the graphs in an underlying training set. The optimal α-value then corresponds to the parameter that leads to the lowest classification error. Finally, the value of α can always be determined via such an optimization procedure based on a given data set and is, hence, uniquely defined for a given classification problem [4].
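The optimization of α described above could, for instance, be realized as a simple grid search; this is our sketch, not the method of [4], and the `classify` callback (returning a training error for a given feature list) is purely hypothetical, standing in for any classifier:

```python
def select_alpha(train_graphs, labels, alphas, c, classify):
    """Hypothetical grid search for the free parameter alpha: return the
    candidate whose entropy features yield the lowest classification
    error on the training set."""
    best_alpha, best_err = None, float("inf")
    for alpha in alphas:
        features = [graph_entropy(adj, alpha, c) for adj in train_graphs]
        err = classify(features, labels)  # assumed: returns training error
        if err < best_err:
            best_alpha, best_err = alpha, err
    return best_alpha
```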

3 Numerical Results

In this section, we demonstrate that our proposed entropy measure is suitable for discriminating graph classes structurally by using so-called cumulative entropy distributions. For this, we choose the class of so-called ϑ-trees (VT) and a special class of rooted trees (RT). A ϑ-tree is defined here as follows: T_ϑ = (V, E), ϑ ∈ ℕ, is a rooted tree with the property that δ(r) = ϑ holds for the root r ∈ V. Further, δ(v) = ϑ + 1 holds for all internal vertices v ∈ V, and leaves are vertices without successors. To perform our experiment, we first define a special class of rooted trees: each tree has the characteristic property that 1 ≤ δ(v) ≤ 2 holds for every v ∈ V excluding leaves. To generate the special class of ϑ-trees, we choose ϑ = 5. Now, to determine the cumulative entropy distributions by varying the free parameters (α and c_k), we first state a definition.

[Figure 1. Cumulative entropy distributions for C_1^RT/VT – C_5^RT/VT and C_10^RT/VT by using the parameter set P_1. Left panel: % of rooted trees vs. I_{f^V}(T); right panel: % of ϑ-trees vs. I_{f^V}(T).]

[Figure 2. Cumulative entropy distributions for C_1^RT/VT – C_5^RT/VT and C_10^RT/VT by using the parameter set P_2. Axes as in Figure 1.]

[Figure 3. Cumulative entropy distributions for C_1^RT/VT – C_5^RT/VT and C_10^RT/VT by using the parameter set P_3. Axes as in Figure 1.]

[Figure 4. Cumulative entropy distributions for C_1^RT/VT – C_5^RT/VT and C_10^RT/VT by using the parameter set P_4. Axes as in Figure 1.]

Definition 3.1 The graph classes C_α^VT, C_α^RT and the parameter sets P_μ(c_k) are generated in the following way:

• Starting from a fixed height h and ϑ, each T_ϑ ∈ C_α^VT is created randomly while retaining the characteristic property of a ϑ-tree. Then, the graph entropy I_{f^V} is computed with the free parameter value α and a parameter setting P_μ(c_k).

• Starting from a fixed height h, each T ∈ C_α^RT is created randomly while retaining the characteristic property defined above. Then, the graph entropy I_{f^V} is computed with the free parameter value α and a parameter setting P_μ(c_k).

• P_1(c_k) = {c_k ∈ ℝ⁺ | c_1 := 2h, c_{k+1} := c_k − 1, k = 1, 2, . . . , ρ(G) − 1}.

• P_2(c_k) = {c_k ∈ ℝ⁺ | c_1 := 2h, c_2 := a · c_1, c_3 := c_1 − 2, c_{k+1} := c_k − 1, a > 1, k = 3, 4, . . . , ρ(G) − 3}.

• P_3(c_k) = {c_k ∈ ℝ⁺ | c_1 := 1, c_2 := 2, . . . , c_{ρ(G)} := ρ(G)}.

• P_4(c_k) = {c_k ∈ ℝ⁺ | c_1 := 2h, c_2 := 2h − 1, c_3 := 2h − 2, c_k := 0, k > 3}.

A sketch of the random generation step is given below. We mention that using the parameter set P_1 means that we weight local branching in a graph more strongly. P_2 expresses that we weight paths of the 2-sphere more strongly than shorter ones. The meaning of P_3 is similar to that of P_2. Finally, the definition of P_4 can be seen as an approximation of the information functional f^V, because we set the cardinalities of the larger j-spheres equal to zero.

The numerical results are summarized in Table 1 and Table 2. For example, from Table 1 we clearly observe that the computed entropies of ϑ-trees are on average larger than the entropies of the special rooted trees, in dependence on α. Hence, these numerical results correspond to our intuition that a ϑ-tree is generally considered structurally more complex (in terms of branching) than a rooted tree with the defined property. Regarding the parameter set P_3, we see in Table 2 that the mean values of the entropies for ϑ-trees are not always larger than the entropy values of the special rooted trees. For P_4, the results of Table 2 correspond with those produced for P_1. As a second characteristic, we find in most cases that the variances of the special rooted tree and ϑ-tree classes can be clearly distinguished. This can also be understood from the fact that a set of ϑ-trees is on average structurally more complex and diverse than a set of those special rooted trees having the same height h. To interpret the cumulative entropy distributions for the generated tree classes (for h = 8) regarding their resulting entropies, we look at Figure 1 – Figure 4. Here, the cumulative entropy distribution states the percentage of the total number of trees whose entropy value is less than or equal to a given value of I_{f^V}.
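Before turning to the tables, here is a minimal sketch (continuing the code from Section 2) of the random ϑ-tree generation of Definition 3.1. The pruning probability `p_leaf` is our modelling assumption, since the paper only fixes the height and the degree constraints:

```python
import random

def random_theta_tree(theta, height, p_leaf=0.3):
    """Randomly grow a rooted tree of height at most `height` in which the
    root has degree theta and every internal vertex has degree theta + 1
    (i.e. exactly theta children), as required for a theta-tree."""
    adj = {0: set()}                    # vertex 0 is the root r
    frontier, next_id = [0], 1
    for level in range(height):
        new_frontier = []
        for v in frontier:
            # the root always branches (delta(r) = theta); deeper vertices
            # may randomly stop branching and thus become leaves
            if level > 0 and random.random() < p_leaf:
                continue
            for _ in range(theta):
                adj[next_id] = {v}
                adj[v].add(next_id)
                new_frontier.append(next_id)
                next_id += 1
        frontier = new_frontier
    return adj

# Example: entropies of 100 small theta-trees (theta = 5, height 3) for
# alpha = 1.1 and P_3-style coefficients c_k = k; the small height keeps
# alpha ** exponent within floating-point range (the paper uses h = 8,
# where a log-domain implementation of f^V would be advisable).
c = {k: float(k) for k in range(1, 7)}
entropies = sorted(graph_entropy(random_theta_tree(5, 3), 1.1, c)
                   for _ in range(100))
# cumulative entropy distribution: fraction of trees with I_{f^V}(T) <= x
cdf = lambda x: sum(e <= x for e in entropies) / len(entropies)
```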

h = 8, P_1

       C_1^RT  C_1^VT  C_2^RT  C_2^VT  C_3^RT  C_3^VT  C_4^RT  C_4^VT  C_5^RT  C_5^VT
m̄      7.780   5.317   1.565   4.048   0.992   3.344   0.802   2.981   0.704   2.754
σ²     0.504   0.298   1.028   0.099   0.552   0.206   0.422   0.242   0.366   0.256

       C_6^RT  C_6^VT  C_7^RT  C_7^VT  C_8^RT  C_8^VT  C_9^RT  C_9^VT  C_10^RT C_10^VT
m̄      0.644   2.594   0.601   2.475   0.570   2.381   0.545   2.305   0.524   2.240
σ²     0.333   0.261   0.312   0.263   0.296   0.263   0.284   0.263   0.275   0.263

h = 8, P_2

       C_1^RT  C_1^VT  C_2^RT  C_2^VT  C_3^RT  C_3^VT  C_4^RT  C_4^VT  C_5^RT  C_5^VT
m̄      7.780   5.317   0.457   1.650   0.300   1.147   0.236   0.983   0.201   0.901
σ²     0.504   0.298   0.335   0.419   0.212   0.455   0.168   0.440   0.144   0.419

       C_6^RT  C_6^VT  C_7^RT  C_7^VT  C_8^RT  C_8^VT  C_9^RT  C_9^VT  C_10^RT C_10^VT
m̄      0.179   0.850   0.164   0.813   0.153   0.785   0.144   0.762   0.138   0.743
σ²     0.129   0.401   0.119   0.385   0.111   0.373   0.105   0.362   0.100   0.353

Table 1. m̄ denotes the mean of the entropies for the classes C_α^RT and C_α^VT, where α varies from 1 to 10 in natural numbers (step size equal to 1). σ² denotes the corresponding variance. It holds that |C_α^RT| = |C_α^VT| = 100 and ϑ = 5.

h = 8, P_3

       C_1^RT  C_1^VT  C_2^RT  C_2^VT  C_3^RT  C_3^VT  C_4^RT  C_4^VT  C_5^RT  C_5^VT
m̄      7.780   5.317   3.448   4.002   3.094   3.115   2.978   2.634   2.918   2.334
σ²     0.504   0.298   0.711   0.260   0.539   0.589   0.476   0.708   0.443   0.745

       C_6^RT  C_6^VT  C_7^RT  C_7^VT  C_8^RT  C_8^VT  C_9^RT  C_9^VT  C_10^RT C_10^VT
m̄      2.881   2.127   2.856   1.975   2.837   1.859   2.822   1.766   2.821   1.690
σ²     0.423   0.754   0.409   0.753   0.399   0.747   0.392   0.740   0.386   0.733

h = 8, P_4

       C_1^RT  C_1^VT  C_2^RT  C_2^VT  C_3^RT  C_3^VT  C_4^RT  C_4^VT  C_5^RT  C_5^VT
m̄      7.780   5.317   0.295   0.869   0.245   0.514   0.215   0.399   0.195   0.348
σ²     0.504   0.298   0.194   0.422   0.150   0.308   0.128   0.287   0.114   0.283

       C_6^RT  C_6^VT  C_7^RT  C_7^VT  C_8^RT  C_8^VT  C_9^RT  C_9^VT  C_10^RT C_10^VT
m̄      0.181   0.321   0.170   0.305   0.162   0.295   0.155   0.288   0.149   0.283
σ²     0.106   0.281   0.099   0.280   0.095   0.279   0.091   0.278   0.088   0.277

Table 2. The entities m̄ and σ² were computed under the same assumptions as stated in Table 1.

As a main result, we now find from Figure 1 – Figure 4 that for α ∈ {1, 2, 3, 4, 5, 10} the cumulative entropy distributions of C_α^RT are significantly different from the corresponding cumulative distributions of C_α^VT. We notice that the apparent near-equality of the distributions for C_1^RT and C_1^VT is related to the fact that our entropy measure always attains its maximum at α = 1. This can be proven in general for an arbitrary undirected and connected graph. Putting it all together, the computed cumulative entropy distributions imply that, in the shown cases, the entropy measure proposed in Section 2 is able to detect that the special rooted trees and ϑ-trees constitute structurally different graph classes.

4 Summary and Conclusion

In this paper, we applied a complexity measure for networks to structurally discriminate classes of networks. For this, we compared the resulting cumulative entropy distributions as a criterion for whether two graph classes can be considered equivalent or not. As future work, we are interested in defining graph similarity measures based on the entropy distributions. For this, e.g., the well-known Kullback-Leibler distance [7, 8] can be used.
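As a hedged illustration of this outlook (our sketch, not a result of this paper), the discrete Kullback-Leibler divergence between two binned entropy distributions could be computed as follows:

```python
import math

def kl_divergence(p, q):
    """Discrete Kullback-Leibler divergence D(p || q) [7, 8], where p and
    q are probability vectors over the same bins (e.g. normalized
    histograms of the entropy values of two graph classes). Assumes
    q[i] > 0 wherever p[i] > 0; note it is not symmetric, hence not a
    metric in the strict sense."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Example with two toy 3-bin distributions
print(kl_divergence([0.5, 0.3, 0.2], [0.4, 0.4, 0.2]))
```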

References

[1] D. Bonchev. Information Theoretic Indices for Characterization of Chemical Structures. Research Studies Press, Chichester, 1983.

[2] T. M. Cover and J. A. Thomas. Elements of Information Theory. Wiley Series in Telecommunications and Signal Processing. Wiley & Sons, 2006.

[3] S. M. Dancoff and H. Quastler. Information content and error rate of living things. In H. Quastler, editor, Essays on the Use of Information Theory in Biology, pages 263–274. University of Illinois Press, 1953.

[4] M. Dehmer. A novel method for measuring the structural information content of networks. Cybernetics and Systems, 2008. In press.

[5] F. Emmert-Streib. The chronic fatigue syndrome: A comparative pathway analysis. Journal of Computational Biology, 14(7), 2007.

[6] F. Harary. Graph Theory. Addison Wesley Publishing Company, 1969.

[7] S. Kullback. Information Theory and Statistics. John Wiley & Sons, 1959.

[8] S. Kullback and R. A. Leibler. On information and sufficiency. Annals of Mathematical Statistics, 22(1):79–86, 1951.

[9] H. Linshitz. The information content of a battery cell. In H. Quastler, editor, Essays on the Use of Information Theory in Biology. University of Illinois Press, 1953.

[10] A. Mehler and A. Storrer. What are ontologies good for? Evaluating terminological ontologies in the framework of text graph classification. In U. Mönnich and K.-U. Kühnberger, editors, OTT'06. Ontologies in Text Technology: Approaches to Extract Semantic Knowledge from Structured Information, Publications of the Institute of Cognitive Science (PICS), pages 11–18, Osnabrück, 2007.

[11] H. Morowitz. Some order-disorder considerations in living systems. Bull. Math. Biophys., 17:81–86, 1955.

[12] A. Mowshowitz. Entropy and the complexity of graphs I: An index of the relative complexity of a graph. Bull. Math. Biophys., 30:175–204, 1968.

[13] H. Quastler. Information Theory in Biology. University of Illinois Press, 1953.

[14] N. Rashevsky. Life, information theory, and topology. Bull. Math. Biophys., 17:229–235, 1955.

[15] V. A. Skorobogatov and A. A. Dobrynin. Metrical analysis of graphs. MATCH, 23:105–155, 1988.

[16] E. Trucco. A note on the information content of graphs. Bulletin of Mathematical Biology, 18(2):129–135, 1956.