
Ecological Informatics 2 (2007) 121–127

available at www.sciencedirect.com

www.elsevier.com/locate/ecolinf

Measuring information-based complexity across scales using cluster analysis

M.B. Dale a,b, M. Anand a,c,⁎, R.E. Desrochers a

a Department of Biology, Laurentian University, Sudbury, Ontario, Canada P3E 2C6
b Australian School of Environmental Studies, Griffith University, Nathan, Qld. 4111, Australia
c Department of Environmental Biology, University of Guelph, Guelph, Ontario, Canada N1G 2W1

ARTICLE INFO

Article history:
Received 29 June 2006
Received in revised form 19 February 2007
Accepted 15 March 2007

Keywords: Multiscale; Minimum message length (MML); Clustering; Similarity; Modifiable unit area problem (MUAP); Plant ecology; Vegetation

ABSTRACT

Scaling of ecological data can present a challenge firstly because of the large amount of information contained in an ecological data set, and secondly because of the problem of fitting data to models that we want to use to capture structure. We present a measure of similarity between data collected at several scales using the same set of attributes. The measure is based on the concept of Kolmogorov complexity and implemented through minimal message length estimates of information content and cluster analysis (the models). The similarity represents common patterns across scales, within the model class. We thus provide a novel solution to the problem of simultaneously considering data structure, model fit and scale. The methods are illustrated in application to an ecological data set.

© 2007 Elsevier B.V. All rights reserved.

1. Introduction

Scale is a concept central to all ecological studies, whether relating to space or time. Sayama et al. (2003) demonstrated that there are powerful linkages between scales, contradicting the erroneous, though commonly held, assumption that it is possible to neatly partition evolutionary effects at different spatial scales, to study a molecule, individual, population, metapopulation, species or ecosystem. It is these dynamic linkages among the levels, rather than the number of levels themselves, that should probably be the focus of attention. Hogeweg (2002) argues that ‘processes do not, in biotic systems, operate in isolation and the existence of entanglement at different time and space scales does not need explanation, being there by default’. Ignoring it by segregating time and space scales is simply a modelling artefact.

There are several ways in which scale appears as a feature in ecological studies. An omnipresent problem is the modifiable areal unit problem (MUAP; see Openshaw, 1984; Fotheringham and Wong, 1991; Nakaya, 2000; Brunsdon, 2002; Holt et al., 1996; Jelinski and Wu, 1996). Ecological units do not come in convenient packets, and the size, shape and distribution of samples will all have effects on any study; this aspect of scale has already received considerable research attention (Brunsdon, 2002; Pavlov et al., 2001; see also methods developed by Juhász-Nagy and Podani, 1983). The effects of scale can, however, be mitigated by employing fuzzy concepts. This allows any individual sample to partake of several component structures and leads to consistent estimates of cluster parameters. Bar-Yam (2002) proposed that agglomerative clustering indicates the mechanism by which information is lost as the level of uncertainty increases across scales, but a quantitative measure will depend on the similarity coefficient and the particular algorithm used for clustering. Pavlov et al. (2001) and Puzicha and Buhmann (1998) use similar ideas to obtain segmentation of images based on texture variation and using fractal concepts; specifically, Pavlov et al. (2001) suggest using wavelet decompositions. The number and distribution of samples also links to the part-whole problem (cf. Szabo, 1996) and to the relationship between habitat heterogeneity and spontaneous pattern production (Sayama et al., 2003).

Here we shall consider a different problem that concerns the estimation of common structure between levels. We ask: does common structure exist and, if so, how strong is it? And how does it decay as differences in scale increase? Some argue that scale invariance, or the presence of similar structure across different spatial or temporal scales, should be expected for complex systems (e.g., Brown et al., 2002). However, Wolpert and Macready (2000) put forward another view: that over different space and time scales, the patterns exhibited by such a complex system should vary greatly, and in ways that are unexpected given the patterns on the other scales. The degree of dissimilarity plotted against scale would therefore provide a profile, which can be used as a system descriptor and compared with other system profiles irrespective of the subject matter. Binder and Plazas (2001) recommend a similar procedure. This obviously requires a suitable measure of dissimilarity or similarity between data at two or more scales.

Several authors have suggested ordination methods for multiscale analyses (Noy-Meir and Anderson, 1971; Borcard and Legendre, 2002); however, these do not in general provide a similarity measure between scales. Another approach makes use of fractal and multifractal analysis (e.g., based on Rényi's generalized entropy functions; Borda-de-Agua et al., 2002); however, again the degree of self-similarity cannot easily be determined.

In this paper, we present a clustering approach to determine similarity between scales. The problems caused by the MUAP can be overcome by using fuzzy clustering. This allows us to identify common structure at different scales using the minimum message length principle. The method was applied to ecological data to test its efficacy at detecting changes in community structure in terms of the composition and relative abundances of species in the community.

⁎ Corresponding author. Current address: Department of Environmental Biology, University of Guelph, Guelph, Ontario, Canada N1G 2W1. E-mail address: [email protected] (M. Anand). 1574-9541/$ - see front matter © 2007 Elsevier B.V. All rights reserved. doi:10.1016/j.ecoinf.2007.03.011

2. A minimum message length similarity measure

Dale (2002; see also Dale and Anand, 2004) proposed using the minimum message length (MML) principle to estimate the Kolmogorov complexity as a sum of two components: model (structure) description and model fit. Kolmogorov complexity is a measure of the difficulty of description of a pattern or algorithm (Li and Vitányi, 1997); however, the measure has not been used very often in ecological informatics (but see Anand and Orlóci, 1996, 2000). In the present work, it has two components: one is related to structure in the data as captured in some class of models, while the other relates to the fit of the data to the model, assuming the choice of model is correct. That is, for model message H and data fit D, we have:

$$\text{minimum message length} = -\ln(p(H)) - \ln(p(D \mid H))$$
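The two-part sum above can be illustrated with a minimal Python sketch. The function name `message_length` and the probabilities are hypothetical, chosen only to show how a model with a higher description cost can still yield the shorter total message when it fits the data better; this is not the SNOB implementation.

```python
import math

def message_length(p_model: float, p_data_given_model: float) -> float:
    """Two-part message length in nits: -ln p(H) - ln p(D|H)."""
    return -math.log(p_model) - math.log(p_data_given_model)

# Hypothetical probabilities, for illustration only.
simple = message_length(p_model=0.5, p_data_given_model=1e-6)     # cheap model, poor fit
complex_ = message_length(p_model=0.01, p_data_given_model=1e-4)  # costly model, better fit

# The preferred model is the one with the shortest total message.
best = min([("simple", simple), ("complex", complex_)], key=lambda item: item[1])[0]
print(f"simple: {simple:.3f} nits, complex: {complex_:.3f} nits, preferred: {best}")
```

Using natural logarithms gives lengths in nits; dividing by ln 2 would give bits.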

MML is Bayesian (see Spiegelhalter et al., 2002) and the model probability is a prior probability of selecting one particular model from some class of models; however, it delivers posterior probability estimates rather than probability density distributions. We can make a trade-off between complexity of model and fit in order to identify ‘desirable’ models. These are the least complex models that give an adequate fit to the data; in effect, MML operationalizes Occam's Razor. Message length is directly related to probability, so that the probability of any model p(H) is given by $e^{-\text{message length}}$, or message length = −ln(p(H)). For more details on MML, see Wallace (2005).

If data are collected at two or more scales using the same attributes (e.g., species and their abundances), we can easily analyze each of these independently using some model. Here we use non-hierarchical clustering as provided by the SNOB program (Wallace and Boulton, 1968; Wallace and Dowe, 2000; Dale et al., 2001) as the class of models. This approach provides a non-hierarchical fuzzy clustering with the number of clusters estimated by the analysis. It permits continuous (numeric), multistate (classes), frequency (counts) and angular (e.g., slope or aspect) attributes and also handles missing values. Other model classes could be used, including cases where there exists dependency between the samples forming the data sets (Wallace, 1998; Edgoose and Allison, 1999; Dale et al., 2002; Ricotta and Anand, 2006). Wallace (1995) has discussed using factor analysis as a model class, while Agusta and Dowe (2002a,b, 2003a,b) present methods incorporating other distributions, including skewed and correlated ones. It is also possible to compare classes of models, for example clustering versus ordination models.

MML can be used in two distinct ways to investigate scale effects. For predictive purposes, we might fit a model obtained from data at one scale to data at another scale, provided both use the same attributes.
So, analysing data A, we obtain a model PA which we then apply to data B. Note that the result will not be the same if we first obtain model PB and apply it to data A; the function is asymmetric. If the objective is to predict from one scale to another, this may be useful. However, it does not tell us how much structure is shared between the two scales.

In order to obtain an appropriate symmetric function, we have adopted a different procedure. Assume data at two scales, A and B, and call the message lengths of the best models for each I(A) and I(B). We can also analyze the data after conjoining two (or more) data sets from different scales. Thus, if we combine A and B to form A&B, by analysing both data sets together, we obtain a message length I(A&B). If A and B are independent, they will share no patterns and this latter message length will be the sum of the lengths of the component parts; that is, $I(A\&B) = I(A) + I(B)$. However, if they do share patterns, then the combined data will have a reduced message length and $I(A\&B) < I(A) + I(B)$. We can then calculate a change in message length

$$\partial I(A,B) = I(A) + I(B) - I(A\&B),$$

where $\partial I(A,B) \le \min(I(A), I(B))$, with equality occurring only when all patterns at one scale are present in the other. Assuming $I(A) > I(B)$, we can then form a similarity measure

$$S(A,B) = \frac{\partial I(A,B)}{I(B)}.$$

There is no great difficulty in extending this to cover several scales. Calculating I(A), I(B), I(C) and I(A&B&C) poses no problems. There are, however, choices involved in performing the subtraction, especially if A, B and C have some kind of ordering, for example by size of unit sample or hierarchy (Anand and Orlóci, 2000). If we have data sets A, B and C, then we might use any of the following:

1. I(A) + I(B) + I(C) − I(A&B&C) (unordered)
2. I(A&B) + I(C) − I(A&B&C) (ordered)
3. I(B&C) + I(A) − I(A&B&C) (reverse ordered)
4. I(A&C) + I(B) − I(A&B&C) (unordered)

With the pairwise interactions (expressions 2, 3 and 4), we can still normalize by the minimum of I(A) or I(B) or by the size of the two data sets to provide an overall measure of the effects of interactions between the various scales. However, we cannot do this with the triple option, expression 1, but we can substitute the sum of the smaller components. Which of these measures is appropriate depends on the question being considered. If there is an ordering A ≺ B ≺ C, then expression 1 ignores possible common patterns related to this order, expressions 2 and 3 include possible common pattern though in opposite directions, while expression 4 is not appropriate. The procedure extends to more data sets but then the possible choices of measure become even greater. The ∂I() measure provides a similarity measure which can be used with any combination of data sets, not just those differing in scale, provided they share at least some common set of attributes.
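The pairwise similarity and the four three-set expressions can be sketched in Python as follows. This is an illustration rather than the authors' code: it assumes that I(·) corresponds to the n-class MML values reported in Tables 1 and 5, with scales 1, 2 and 3 standing in for A, B and C, and the helper names `delta_i` and `pair_similarity` are introduced here.

```python
def delta_i(parts, combined):
    """Message length saved by the joint analysis: sum of the parts minus the joint length."""
    return sum(parts) - combined

def pair_similarity(i_a, i_b, i_ab):
    """S(A,B): shared message length normalised by the smaller individual length."""
    return delta_i([i_a, i_b], i_ab) / min(i_a, i_b)

# n-class message lengths (assumed to play the role of I) from Tables 1 and 5.
I = {
    "1": 62137.8, "2": 27984.0, "3": 19321.3,
    "1&2": 82355.7, "1&3": 75408.4, "2&3": 42883.2,
    "1&2&3": 94997.9,
}

s12 = pair_similarity(I["1"], I["2"], I["1&2"])
s13 = pair_similarity(I["1"], I["3"], I["1&3"])

# The four choices for three data sets, with A, B, C = scales 1, 2, 3.
expressions = {
    "1 (unordered)": delta_i([I["1"], I["2"], I["3"]], I["1&2&3"]),
    "2 (ordered)": delta_i([I["1&2"], I["3"]], I["1&2&3"]),
    "3 (reverse ordered)": delta_i([I["2&3"], I["1"]], I["1&2&3"]),
    "4 (unordered)": delta_i([I["1&3"], I["2"]], I["1&2&3"]),
}

print(f"S(1,2) = {s12:.3f}, S(1,3) = {s13:.3f}")
for name, value in expressions.items():
    print(f"expression {name}: {value:.1f}")
```

Normalising by the smaller individual length keeps the pairwise measure at or below 1, since the shared length cannot exceed the shorter of the two messages.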

3. Data and methods

The data were modified as follows in order to examine the changes in community structure, in terms of the composition and relative abundance of species, at different scales. The primary data consist of records of the cover abundance of 119 species of understorey plants. These were collected from line transects from 6 sites located along a historic pollution gradient (Anand et al., 2003; Tucker and Anand, 2003; Desrochers and Anand, 2005). This gradient reflects decreasing historic sulphur dioxide (SO2) concentrations with distance from the source of emission, a nickel–copper smelter complex in Sudbury, Ontario, Canada. Each plot was recorded in two successive years (2001, 2002), and in each case two parallel transects of 100 quadrats, each 1 × 1 m, were positioned running over similar altitude ranges down south-facing slopes to examine topographical gradients in diversity. In total, 2400 quadrats were available.

Combinations of data sets were prepared at different scales as follows. Within each transect, adjacent quadrats were merged, using 2-, 3-, 4- and 5-tuples, using the mean relative abundance of each species. For the 3-tuple, since 100 is not exactly divisible, the final quadrat was omitted. Thus, we have data at scales of 1–5 quadrats, which translates to scales of 1–5 m.

The original data used Braun–Blanquet codes. In order to obtain the larger sample areas these have to be averaged, and this is not possible directly because the codes form an ordered category scale only. The cover-abundance codes were therefore converted using Noest et al.'s (1989) numeric equivalences. This is not an ideal procedure, since the transformation really does not affect the ordered category nature of the data, but alternatives such as using median values for consecutive quadrats lead to a large preponderance of zero values. Ideally the collected data would be numeric, but for the present investigation the transformation must suffice. SNOB can handle the original codes as numeric attributes, defining the precision of the values as 1, but a proper formal treatment of ordered category data is not yet available. For the combined values at higher scales the same precision was used, although fractional values were present.
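The merging of adjacent quadrats into k-tuples can be sketched as below. This is a minimal illustration under stated assumptions: the abundances are synthetic random numbers rather than the study data, each transect is assumed to be stored as a quadrats × species array, and the function name `merge_adjacent` is introduced here.

```python
import numpy as np

def merge_adjacent(transect: np.ndarray, k: int) -> np.ndarray:
    """Average each run of k adjacent quadrats (rows); leftover quadrats at the end are dropped."""
    n_groups = transect.shape[0] // k
    trimmed = transect[: n_groups * k]
    return trimmed.reshape(n_groups, k, transect.shape[1]).mean(axis=1)

# 6 sites x 2 years x 2 transects = 24 transects of 100 quadrats and 119 species.
n_transects, n_quadrats, n_species = 24, 100, 119
rng = np.random.default_rng(0)
# Synthetic stand-in for the numeric cover-abundance values; not the study data.
transects = [rng.random((n_quadrats, n_species)) for _ in range(n_transects)]

# Number of samples available at each scale (1 to 5 quadrats per sample).
sizes = {k: sum(merge_adjacent(t, k).shape[0] for t in transects) for k in range(1, 6)}
print(sizes)
```

Merging within each transect, rather than across the pooled data, keeps every merged sample spatially contiguous.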

Table 1 – (a) Individual analyses at the several scales and (b) pairwise analyses of scales

| Scale | Size | 1-class MML | 1-class MML/size | No. of clusters | n-class MML | n-class MML/size | Model MML | % reduction in n-class MML without and with () model MML |
|---|---|---|---|---|---|---|---|---|
| (a) | | | | | | | | |
| 1 | 2400 | 271449.8 | 113.10 | 39 | 62137.8 | 25.89 | 10767.0 | 77.1 (73.1) |
| 2 | 1200 | 112164.0 | 93.47 | 22 | 27984.0 | 23.32 | 5358.8 | 75.1 (70.3) |
| 3 | 792 | 74396.8 | 93.94 | 19 | 19321.3 | 24.40 | 4506.0 | 74.0 (68.0) |
| 4 | 600 | 50560.1 | 84.27 | 14 | 15248.2 | 25.41 | 3085.6 | 69.8 (63.7) |
| 5 | 480 | 34900.0 | 72.71 | 12 | 11666.9 | 24.31 | 2393.2 | 66.6 (59.7) |
| (b) | | | | | | | | |
| 1&2 | 3600 | 391178.3 | 108.66 | 46 | 82355.7 | 22.871 | 12767.8 | 82.2 (78.9) |
| 1&3 | 3192 | 350809.3 | 109.90 | 43 | 75408.4 | 23.624 | 11861.4 | 81.9 (78.5) |
| 1&4 | 3000 | 330202.8 | 110.07 | 43 | 73456.0 | 24.483 | 11986.1 | 81.4 (77.8) |
| 1&5 | 2880 | 318213.6 | 110.49 | 44 | 70805.6 | 24.585 | 12232.9 | 81.6 (77.7) |
| 2&3 | 1992 | 190364.8 | 95.56 | 33 | 42883.2 | 21.528 | 8378.9 | 81.9 (77.5) |
| 2&4 | 1800 | 164683.6 | 91.49 | 29 | 38703.1 | 21.502 | 7145.0 | 80.8 (76.5) |
| 2&5 | 1680 | 152131.5 | 90.55 | 31 | 37154.6 | 22.116 | 7408.3 | 80.4 (75.6) |
| 3&4 | 1392 | 129419.2 | 92.97 | 26 | 31014.0 | 22.280 | 6273.4 | 80.9 (76.0) |
| 3&5 | 1272 | 114688.9 | 90.16 | 24 | 28487.6 | 22.396 | 5742.5 | 80.2 (75.2) |
| 4&5 | 1080 | 88599.3 | 82.04 | 22 | 24647.2 | 22.822 | 4785.5 | 77.6 (72.2) |

Scale indicates the number of primary samples combined and size is the number of samples in the data; 1-class MML is the message length of the raw data before clustering; number of clusters is determined by the clustering algorithm; n-class MML is the message length for the selected number of clusters (structural complexity); model MML is the model fit component of complexity; % reduction in MML due to clustering is given in the last column with and without inclusion of model MML. It is calculated as (1-class MML − n-class MML)/1-class MML and (1-class MML − n-class MML + model MML)/1-class MML, respectively.
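As a small worked check of the derived columns, the sketch below recomputes the per-sample message lengths and the percent reduction without model MML for part (a) of Table 1. The helper names are introduced here for illustration, and only the formula that excludes model MML is used.

```python
# (size, 1-class MML, n-class MML) for scales 1 to 5, copied from Table 1a.
table_1a = {
    1: (2400, 271449.8, 62137.8),
    2: (1200, 112164.0, 27984.0),
    3: (792, 74396.8, 19321.3),
    4: (600, 50560.1, 15248.2),
    5: (480, 34900.0, 11666.9),
}

def per_sample(mml: float, size: int) -> float:
    """Message length per sample."""
    return mml / size

def percent_reduction(one_class: float, n_class: float) -> float:
    """Percent reduction in message length due to clustering, excluding model MML."""
    return 100.0 * (one_class - n_class) / one_class

summary = {
    scale: (
        round(per_sample(one_class, size), 2),
        round(per_sample(n_class, size), 2),
        round(percent_reduction(one_class, n_class), 1),
    )
    for scale, (size, one_class, n_class) in table_1a.items()
}

for scale, row in summary.items():
    print(scale, row)
```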



Each of these data sets was then clustered using MML, individually and in combinations. The similarities were then calculated between several combinations of scales. In addition, since the number of samples varied widely, it seemed desirable to examine the message length on a ‘per individual’ basis. The analyses performed are as follows. First, we calculate the model fit for each of the individual data sets 1 to 5. Then we examine the pairwise relationships, using all possible pairs. Finally, we look briefly at higher order combinations in two ways. In one (a Bush analysis), the combined analysis is compared to the individual analyses, whereas in the other (a chained analysis) it is compared to the combined analysis involving all except the last added scale. So, for the combination 1&2&3, the Bush analysis uses the separate analyses of 1, 2 and 3, whereas the chained analysis uses 1&2 and 3.
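The difference between the Bush and chained comparisons can be sketched as follows. The function names are hypothetical, and the normalisers are assumptions based on Section 2: the Bush value is divided by the sum of the smaller individual message lengths, and the chained value by the smaller of its two components. The message lengths are the n-class MML values reported in Tables 1 and 5 for scales 1 to 4.

```python
def bush_similarity(individual, combined):
    """Bush: joint analysis versus all individual analyses.

    Normalised by the sum of all individual lengths except the largest
    (an assumed reading of 'the sum of the smaller components').
    """
    saved = sum(individual) - combined
    return saved / (sum(individual) - max(individual))

def chained_similarity(previous_combined, added, combined):
    """Chained: joint analysis versus the previous joint analysis plus the newly added scale."""
    saved = previous_combined + added - combined
    return saved / min(previous_combined, added)

single = [62137.8, 27984.0, 19321.3, 15248.2]       # scales 1, 2, 3, 4
cumulative = [62137.8, 82355.7, 94997.9, 104200.0]  # 1, 1&2, 1&2&3, 1&2&3&4

results = {}
for n in range(2, 5):
    bush = bush_similarity(single[:n], cumulative[n - 1])
    chained = chained_similarity(cumulative[n - 2], single[n - 1], cumulative[n - 1])
    results[n] = (round(bush, 3), round(chained, 3))
    print(f"scales 1..{n}: Bush = {bush:.3f}, chained = {chained:.3f}")
```

For two scales the two comparisons coincide, because the previous combined analysis is simply the analysis of scale 1.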

4. Results

The results from the independent analysis of the several scales are shown in Table 1a. The number of classes and the associated n-class MML show a close relationship with the size of the population employed, and at all scales the clustering provides a markedly better n-class result compared with that for a single class; however, the MML per individual values are not as closely related. Turning to the pairwise analyses (Table 1b), we obtain similar results except that the MML per individual values are now related to size. In all cases, the n-class solution is much preferable to the single class analysis, with a much smaller MML. In general, scales are more closely related to their immediate neighbors than to scales more different; however, the relationship is monotone, though probably nonlinear.

For the similarity measures of the pairs (Table 2), scale 1 is most similar to scale 3, while scale 2 and scale 4 are most related. Scale 5 shows its strongest relationship with scale 1, although all other scales are more strongly related to scale 1. Somewhat surprisingly, scale 5 shows its lowest relationship with scale 4; similarity in size is not a clear indicator of similarity in pattern. In any case, we have a suggestion of three different interactions, 1:3, 2:4 and possibly 5. The similarities could easily be ordinated using principal coordinates analysis (Gower, 1966). It is interesting to note that the data structure component of complexity is consistently much higher than the model fit component.

The MML value is, as has been stated earlier, a combination of structure and fit components. We can consider these separately, and in Table 3 we show the pairwise similarities using the structural component only. The relationships here are not identical with those in Table 2, with scales 1, 2, 3 and 4 all similar, while scale 5 now shows most relationship to scale 3 and scales 2 and 4 are not strongly related to scale 5. This means that the clusters obtained differ at the several scales. In most cases, the structural similarities are larger than the total, suggesting that at least some commonality between clusters at the several scales does exist. This may not reflect any commonality of content. The MML for coding the clusters incorporates the prior probability of the cluster (its relative size) together with the coding of the cluster parameters (in our case, the mean and variance for each attribute at a specific precision). Thus, having the same number of clusters with a similar range of sizes will tend to produce a similar MML irrespective of attribute properties. However, with a large number of attributes, it is likely that their contribution will be dominant. The major difference is the stronger relationship between scales 3 and 5 here, compared to that between scales 1 and 5 for the total MML. Structure could also be used by itself in the more complex, multiscale analyses below, but any interpretation would be difficult.

Table 3 – Pairwise similarity between scales using data structure component of complexity

| Scales | 1 | 2 | 3 | 4 | 5 |
|---|---|---|---|---|---|
| 1 | 1 | 0.627 | 0.757 | 0.604 | 0.387 |
| 2 | 0.627 | 1 | 0.330 | 0.421 | 0.144 |
| 3 | 0.757 | 0.330 | 1 | 0.427 | 0.483 |
| 4 | 0.604 | 0.421 | 0.427 | 1 | 0.290 |
| 5 | 0.387 | 0.144 | 0.483 | 0.290 | 1 |

Two other results may be of interest. Firstly, we can ask if the clusters identified in the pair analyses have similar members to those of the single scale analyses. Because of low expected cell frequencies, a chi-square test, while significant, is untrustworthy. Accepting this, if we compare the 1&2 result with those for 1 and 2 individually, we obtain R² values of 0.78 for the 1&2/1 comparison and 0.66 for the 1&2/2 comparison. It seems likely that much of this relationship is due to the low diversity of many of the original sample quadrats obtained from ecologically damaged areas (Desrochers and Anand, 2005). Secondly, we can enquire if the pairwise analysis produces significantly more (or less) fuzzy assignments of the constituent samples. The output from the SNOB program identified 3 levels of fuzziness: well assigned, poor and very poor.
Table 4 shows the relevant numbers for the 1&2 analysis. The results suggest that, if anything, the combined analysis is providing a crisper result; combining the several data sets leads to more effective clusters being formed.

Table 2 – Pairwise similarity between scales using total Kolmogorov complexity

| Scales | 1 | 2 | 3 | 4 | 5 |
|---|---|---|---|---|---|
| 1 | 1 | 0.278 | 0.313 | 0.258 | 0.257 |
| 2 | 0.278 | 1 | 0.229 | 0.297 | 0.214 |
| 3 | 0.313 | 0.229 | 1 | 0.233 | 0.214 |
| 4 | 0.258 | 0.297 | 0.233 | 1 | 0.194 |
| 5 | 0.257 | 0.214 | 0.214 | 0.194 | 1 |

Table 4 – Fuzzy assignment frequencies for the scales 1 and 2 analyses

| Scale | Size | Well assigned | Poor | Very poor |
|---|---|---|---|---|
| 1 | 2400 | 2086 | 307 | 7 |
| 2 | 1200 | 1127 | 71 | 2 |
| 1&2 | 3600 | 3349 | 249 | 2 |

Turning to more complex interactions, the effect of ignoring any ordering of the scales was determined to be quite significant. For example, 1&2 + 3 − 1&2&3 = 82355.7 + 19321.3 − 94997.9 = 6679.1, whereas 1&3 + 2 − 1&2&3 = 75408.4 + 27984.0 − 94997.9 = 8394.5. Samples from scale 1 are smaller than those from scale 2, while those from scale 2 are smaller than those from scale 3. Maintaining the order leads to smaller message lengths than not, and surprisingly reversing the order produces the smallest message length of all.

Table 5 – Cumulative analyses

| Scales | Size | 1-class MML | No. of clusters | n-class MML | % reduction in n-class MML without model MML | Bush similarity | Chained similarity |
|---|---|---|---|---|---|---|---|
| 1 | 2400 | 271449.8 | 39 | 62137.8 | 77.1 | 1 | 1 |
| 1 + 2 − 1&2 | 3600 | 391178.3 | 46 | 82355.7 | 78.9 | 0.278 | 0.278 |
| 1 + 2 + 3 − 1&2&3 | 4392 | 469368.3 | 59 | 94997.9 | 79.8 | 0.305 | 0.346 |
| 1 + 2 + 3 + 4 − 1&2&3&4 | 4992 | 526305.0 | 58 | 104200.0 | 80.2 | 0.328 | 0.397 |
| 1 + 2 + 3 + 4 + 5 − 1&2&3&4&5 | 5472 | 570660.9 | 63 | 111837.0 | 80.4 | 0.33 | 0.353 |

The results of examining all five scales are shown in Table 5. Note that the number of clusters, unlike the MML, is not monotonic with the largest scale involved: the combination up to scale 4 has fewer clusters than the combination up to scale 3. Scales 1 and 2 are somewhat disparate and have a low similarity, while scales 4 and 5 show stronger similarities and high percent capture. For a chained analysis, the similarities are acceptable. Knowing more about patterns at several scales leads in general to a closer relationship with a larger scale, although this increase seems to be almost negligible for the two largest scales here.
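The Results note that the pairwise similarities could be ordinated using principal coordinates analysis (Gower, 1966). A minimal NumPy sketch of that step is given below, using the Table 2 values. It is an illustration only: the conversion from similarity to squared distance, d² = 2(1 − s), is the usual choice for a similarity matrix with unit diagonal and is an assumption here, as is the function name `pcoa`.

```python
import numpy as np

# Pairwise similarities between scales 1 to 5 (Table 2).
S = np.array([
    [1.000, 0.278, 0.313, 0.258, 0.257],
    [0.278, 1.000, 0.229, 0.297, 0.214],
    [0.313, 0.229, 1.000, 0.233, 0.214],
    [0.258, 0.297, 0.233, 1.000, 0.194],
    [0.257, 0.214, 0.214, 0.194, 1.000],
])

def pcoa(similarity: np.ndarray, n_axes: int = 2) -> np.ndarray:
    """Principal coordinates from a similarity matrix with unit diagonal."""
    n = similarity.shape[0]
    d2 = 2.0 * (1.0 - similarity)            # squared distances: s_ii + s_jj - 2 s_ij
    J = np.eye(n) - np.ones((n, n)) / n      # centring matrix
    B = -0.5 * J @ d2 @ J                    # double-centred matrix
    vals, vecs = np.linalg.eigh(B)           # eigenvalues in ascending order
    order = np.argsort(vals)[::-1][:n_axes]  # indices of the largest eigenvalues
    top = np.clip(vals[order], 0.0, None)    # guard against tiny negative eigenvalues
    return vecs[:, order] * np.sqrt(top)

coords = pcoa(S)
for scale, (x, y) in enumerate(coords, start=1):
    print(f"scale {scale}: axis 1 = {x:+.3f}, axis 2 = {y:+.3f}")
```

The signs of the axes are arbitrary, so only the relative positions of the scales carry meaning.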

5. Discussion

We introduce a new measure for cross-scale analysis of ecological data and the structures it defines. On the basis of a single analysis, it is not possible to decide if the results are an inherent feature of ecological systems. There is certainly some common structure as well as idiosyncratic variation, and the methods used here can separate these components: cross-scale similarity measures based on Kolmogorov complexity provide needed information. The exception is the Bush analysis, where we may need to extend the search to see if a better model is present.

The results of the application suggest that the ecological system it represents is ‘complex’ in that dissimilarity was detected across scales. This conclusion follows the logic presented by Wolpert and Macready (2000). This criterion for assessing complexity, however, needs to be further examined, both through simulations and through application to other data sets.

The results of the MML analysis also contribute to the MUAP question. The identification of three, possibly hierarchical, levels of pattern suggests that there is no single sample size that is appropriate for all analyses. Instead, examination at scales 3, 4 and 5 would seem to be necessary. We detected a non-intuitive similarity between scales 1 and 3 and between scales 2 and 4. Our results suggest a more complex structuring of scale than scale-invariance. While many ecological studies have focussed on cross-scale analysis (e.g., Greig-Smith, 1983; Orlóci et al., 2006), the theoretical focus has not specifically been on detecting the degree of dissimilarity across scales, and we thus feel that this question merits further study.

In all cases, it was interesting to note that the component of data structure complexity was always higher than the component due to model fit complexity. The relationships between these two components could be further studied for other data sets and other models to determine if this is a generality. We found that the ordering (of the combination) of scales mattered in the amount of information-based complexity detected. This is an important finding because this ordering is often not taken into account in ecological studies of scale.

It must be kept in mind that Kolmogorov complexity is not calculable and can only be estimated. In our approach, it is subject to constraints on the class of models to be employed. The use of clustering has some advantages, but it would be preferable to allow a wider range of models, including combinations of clusters and axes (for other clustering techniques that could be used, see Podani, 1991; Wallace and Dale, 2005). By using more sophisticated clustering methods, including within-cluster correlation, we can expect further improvement in the substantive information. There may, however, be consequences of changing models for the model fit component of Kolmogorov complexity. Further studies could examine the effect of changing models on Kolmogorov complexity and hence of choosing appropriate model classes for ecological analysis.

While we have concentrated here on the relationship between several spatial scales, the same approach can be applied in other circumstances. For example, if we examine data recorded at several different times, we can again compare the independent analyses with a composite one; i.e., calculate MML(t1) + MML(t2) − MML(t1&t2). This provides a significance test (message length is closely related to log probability) and a similarity measure and, given the range of data types available within the SNOB program, would be highly suitable for handling mixed data.

The biggest question that remains unresolved is whether or not cross-scale difference in clustering organization is an inherent, general feature of ecological organization. We suggest that the dissimilarity across scales reported here is important not because ecological systems may not have a coherent cross-scale organization (unlikely in our opinion) but because the MML clustering approach has uncovered a novel informatic feature that needs further scrutiny in order to assess what exactly it is indicating about the cross-scale structure that does exist.
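The remark in the Discussion that message length is closely related to log probability can be made explicit with a short derivation. It assumes message lengths measured in nits, so that a length equals a negative natural-log probability as in Section 2; the symbols $P_{\text{sep}}$ and $P_{\text{joint}}$ are introduced here for the probabilities implied by the separate and the joint codings, and $I(t)$ stands for MML(t).

```latex
\begin{align*}
I(t_1) + I(t_2) &= -\ln P_{\text{sep}}, \qquad I(t_1 \& t_2) = -\ln P_{\text{joint}}, \\
\partial I(t_1, t_2) &= I(t_1) + I(t_2) - I(t_1 \& t_2) = \ln \frac{P_{\text{joint}}}{P_{\text{sep}}}, \\
\frac{P_{\text{joint}}}{P_{\text{sep}}} &= e^{\partial I(t_1, t_2)}.
\end{align*}
```

Under these assumptions, a positive difference means the joint coding is the more probable description, and each additional nit saved multiplies the ratio by e.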

Acknowledgments

MA acknowledges funding from the Natural Sciences and Engineering Research Council of Canada, the Canada Research Chairs Program and the Ontario Ministry of Science and Technology for infrastructure and salary support for RD and MD. We thank B.C. Tucker and K. Lemire for assistance with field data collection and Steve Kaufman for technical assistance and comments on a previous version of the manuscript. An anonymous reviewer provided helpful comments.

REFERENCES

Agusta, Y., Dowe, D.L., 2002a. MML clustering of continuous-valued data using Gaussian and t distributions. In: McKay, B., Slaney, J. (Eds.), Lecture Notes in Artificial Intelligence, vol. 2557. Springer-Verlag, Berlin, Germany, pp. 143–154.
Agusta, Y., Dowe, D.L., 2002b. Clustering of Gaussian and t distributions using minimum message length. In: Sasikumar, M., Jayprasad, H.J., Kavitha, M. (Eds.), Proceedings of the International Conference KBCS-2002. Vikas Publishing House Pvt. Ltd., New Delhi, India, pp. 289–299.
Agusta, Y., Dowe, D.L., 2003a. Unsupervised learning of correlated multivariate Gaussian mixture models using MML. Lecture Notes in Artificial Intelligence, vol. 2903. Springer-Verlag, Berlin, pp. 477–489.
Agusta, Y., Dowe, D.L., 2003b. Unsupervised learning of gamma mixture models using minimum message length. In: Hamza, M.H. (Ed.), Proceedings 3rd IASTED Conference on Artificial Intelligence and Application. ACTA Press, Calgary, Canada, pp. 457–462.
Anand, M., Orlóci, L., 1996. Complexity in plant communities: the notion and quantification. Journal of Theoretical Biology 179, 179–186.
Anand, M., Orlóci, L., 2000. On hierarchical partitioning of an ecological complexity function. Ecological Modelling 132, 51–61.
Anand, M., Ma, K.-M., Okonski, A., Levin, S., McCreath, D., 2003. Characterizing biocomplexity and soil microbial dynamics along a smelter-damaged landscape gradient. The Science of the Total Environment 311, 247–259.
Bar-Yam, Y., 2002. Sum rule for multiscale representations of kinematically described systems. Advances in Complex Systems 5, 409–431.
Binder, P.M., Plazas, J.A., 2001. Multiscale analysis of complex systems. Physical Review E 63, 065203.
Borcard, D., Legendre, P., 2002. All-scale spatial analysis of ecological data by means of principal coordinates of neighbour matrices. Ecological Modelling 153, 51–68.
Borda-de-Agua, L., Hubbell, S.P., McAllister, M., 2002. Species-area curves, diversity indices, and species abundance distributions: a multifractal analysis. American Naturalist 159, 138–155.
Brown, J.H., Gupta, V.K., Li, B.L., Milne, B.T., Restrepo, C., West, G.B., 2002. The fractal nature of nature: power laws, ecological complexity and biodiversity. Philosophical Transactions of the Royal Society of London. Series B, Biological Sciences 357, 619–626.
Brunsdon, C., 2002. A Bayesian perspective on the modifiable areal unit problem using data augmentation. Regional Conference of the IGU, Durban South Africa. Geographical Renaissance at the Dawn of the Millennium.
Dale, M.B., 2002. Models, measures and messages: an essay on the role for induction. Community Ecology 3, 191–204.
Dale, M.B., Anand, M., 2004. Domain knowledge, evidence, complexity and convergence. International Journal of Ecology and Environmental Sciences 30, 141–158.
Dale, M.B., Salmina, L., Mucina, L., 2001. Minimum message length clustering: an explication and some applications to vegetation data. Community Ecology 2, 231–247.
Dale, M.B., Dale, P.E.R., Edgoose, T., 2002. Markov models for incorporating temporal dependence. Acta Oecologica 23, 261–269.

Desrochers, R.E., Anand, M., 2005. Quantifying the components of biocomplexity along ecological perturbation gradients. Biodiversity and Conservation 14 (14), 3437–3455.
Edgoose, T., Allison, L., 1999. MML Markov classification of sequential data. Statistics and Computing 9, 269–278.
Fotheringham, A.S., Wong, D.W.S., 1991. The modifiable areal unit problem in statistical analysis. Environment and Planning A 23, 1025–1044.
Gower, J.C., 1966. Some distance properties of latent root and vector methods used in multivariate analysis. Biometrika 53, 325–338.
Greig-Smith, P., 1983. Quantitative Plant Ecology, 3rd ed. Blackwell Scientific Publications, London.
Hogeweg, P., 2002. Computing an organism: on the interface between informatic and dynamic processes. BioSystems 64, 97–109.
Holt, D., Steel, D.G., Tranmer, M., 1996. Area homogeneity and the modifiable areal unit problem. Geographical Systems 3, 181–200.
Jelinski, D.E., Wu, J.-G., 1996. The modifiable areal unit problem and implications for landscape ecology. Landscape Ecology 11, 129–140.
Juhász-Nagy, P., Podani, J., 1983. Information theory methods for the study of spatial processes in succession. Vegetatio 51, 129–140.
Li, M., Vitányi, P., 1997. An Introduction to Kolmogorov Complexity and its Applications. Springer Verlag.
Nakaya, T., 2000. An information statistical approach to the modifiable areal unit problem in incidence rate maps. Environment and Planning A 32, 91–109.
Noest, V., Van der Maarel, E., Van der Meulen, F., Van der Laan, D., 1989. Optimum transformation of plant species cover-abundance values. Vegetatio 83, 167–178.
Noy-Meir, I., Anderson, D.J., 1971. Multiple pattern analysis, or multiscale ordination: toward a vegetation hologram? In: Patil, G.P., Pielou, E.C., Waters, W.E. (Eds.), Statistical Ecology III. Penn. State Univ Press, pp. 207–231.
Openshaw, S., 1984. The modifiable areal unit problem. CATMOG, vol. 38f. GeoBooks, Norwich, England.
Orlóci, L., de Patta Pillar, V., Anand, M., 2006. Multiscale analysis of palynological records: new possibilities. Community Ecology 7, 53–68.
Pavlov, A.N., Ebeling, W., Molgedey, L., Ziganshin, A.R., Anishchenko, V.S., 2001. Scaling features of texts, images and time series. Physica A 300, 310–324.
Podani, J., 1991. Introduction to the Exploration of Multivariate Biological Data. Backhuys Publishing, Leiden, Netherlands.
Puzicha, J., Buhmann, J.M., 1998. Multiscale annealing for real-time unsupervised texture segmentation. Technical Report IAI-97-4, Institut für Informatik, Rheinische Friedrich Wilhelms Univ, Bonn.
Ricotta, C., Anand, M., 2006. Spatial complexity of ecological communities: Bridging the gap between probabilistic and nonprobabilistic uncertainty measures. Ecological Modelling 197, 59–66.
Sayama, H., Kaufman, L., Bar-Yam, Y., 2003. Spontaneous pattern formation and genetic diversity in habitats with irregular geographical features. Conservation Biology 17, 893–900.
Spiegelhalter, D.J., Best, N.G., van der Linde, A., 2002. Bayesian measures of model complexity and fit. Journal of the Royal Statistical Society. Series B 64, 583–639.
Szabo, N., 1996. Introduction to algorithmic information theory. http://szabo.best.vwh.net/kolmogorov.html.
Tucker, B.C., Anand, M., 2003. The use of matrix models to detect natural and pollution-induced forest gradients. Community Ecology 4, 89–100.
Wallace, C.S., 1995. Multiple factor analysis by MML estimation. Technical Report, vol. 95/218. Dept Computer Science, Monash University, Clayton, Victoria 3168, Australia.
Wallace, C.S., 1998. Intrinsic classification of spatially-correlated data. Computer Journal 41, 602–611.


Wallace, C.S., 2005. Statistical and inductive inference by minimum message length. Series: Information Science and Statistics. Springer.
Wallace, C.S., Boulton, D.M., 1968. An information measure for classification. Computer Journal 11, 185–194.
Wallace, C.S., Dale, M.B., 2005. Hierarchical clusters of vegetation types. Community Ecology 6, 57–74.
Wallace, C.S., Dowe, D.L., 2000. MML clustering of multi-state, Poisson, von Mises circular and Gaussian distributions. Statistics and Computing 10, 73–83.
Wolpert, D.H., Macready, W.G., 2000. Self-dissimilarity: an empirically observable complexity measure. In: Bar-Yam, Y. (Ed.), Proceedings of the 1st Necsi International Conference Complex Systems: Unifying Themes in Complex Systems, Nashua NH, New England Complex Systems Institute, pp. 625–644.