
The 10th International Conference on Computer Science & Education (ICCSE 2015) July 22-24, 2015. Fitzwilliam College, Cambridge University, UK

FrP5.5

Modeling Urban Growth by Cellular Automata: A Case Study of Xiamen City, China

Xinxin Zhang, Xianli Lin, Shunzhi Zhu*
College of Computer and Information Engineering
Xiamen University of Technology
Xiamen, China

* [email protected]

Abstract: This paper focuses on modeling urban growth with regional differences. First, two land use maps of Xiamen city, for 2001 and 2007, were obtained by classifying satellite images. Second, nine driving factors, derived from points of interest (POI) and a digital elevation model (DEM), were selected using GIS distance analysis. These factors cover public service, economic, political and geographical aspects. Based on these data, the study adopts a logistic regression (LR) model to analyze the urban transition and the effects contributed by the driving factors. The overall accuracy of the LR model ranges from 81.9% to 85.1% and the ROC is 0.896, indicating that the model can quantitatively analyze the mechanism of the different driving factors and the spatial-temporal land use change. Finally, a constrained CA model is applied to simulate and predict the land use situation of Xiamen in 2020. The simulation results reveal that the increase in construction land is mainly located outside Xiamen Island. Overall land supply and demand are clearly in conflict, which may lead to increasing pressure on farmland protection. In general, land use issues are likely to become the main bottleneck for the economic and social development of Xiamen city. The prediction results can provide guidance on policy implementation for the land use planning department.

Keywords: urban growth; cellular automata; POI; Xiamen city

I. INTRODUCTION

Urbanization, closely linked to socio-economic and biophysical driving forces, has grown from a significant into a threatening process in many parts of the globe. According to the UN World Urbanization Prospects report, the urban share of the world population will rise to 60% by 2030, while urban areas cover only 1 or 2% of the global land area [1]. This huge contrast reveals that continuous migration flows have contributed to a very rapid increase in built-up area. Moreover, urban expansion raises a series of concerns about transport optimization, pollution control, energy consumption, agricultural protection, etc. Especially in developing countries such as China, urbanization, as a sign of the vitality of the regional economy, has been the fastest and largest land use change of the past decades. Therefore, understanding the mechanisms of urban land use change is critical to understanding global environmental change [2].

Nowadays, modeling of land growth, as a precondition for understanding the complexity of land use dynamics and for forecasting future trends of land use change and its ecological impact, has been widely used for the planning and management of urban regions [3]. Numerous models have been developed in both the literature and in practice [4], and they have achieved significant success in explaining the spatial and temporal dynamics of urban growth. Researchers increasingly agree that a qualified urban model should address several important features, including multi-scale analysis, driving forces, spatial and temporal effects, and level of integration. However, because urban systems are usually characterized by non-linear interactions between subsystems, many existing models capture some, but not all, of these features. It is thus important, in developing an urban growth model, to recognize and work with these complexities.

Among the various models, cellular automata (CA) models, generally defined as multidimensional grids of identical cells that act on their own and process information, have been widely adopted to investigate the logical nature of self-reproducible systems [5]. Transitions of the automata (i.e. cells) are determined by the neighboring cells within a certain distance, based on specific rules applied over a number of iterations. Because of this underlying feature, CA have become a useful tool for simulating urban spatial-temporal dynamics, and many results have been documented [6-8]. Unfortunately, there are always some intangible variables that cannot be measured and some rules that cannot be expressed mathematically. In fact, few models have been fully validated with empirical data from other areas using a statistical method [9]. In addition, dynamic changes in an urban land use system display both regularity and irregularity in temporal rate and spatial pattern. A model that drives urban growth with a single general transition rule and a uniform speed therefore cannot be adapted to a set of pre-defined scenarios. Hence, the compact city debate lacks proper tools to ensure successful implementation of the compact city, because of its complexity [10].

Therefore, this research demonstrates a constrained CA model to simulate and predict future urban growth in Xiamen city, China. Its objectives are to: a) build an operational model for regional urban growth; b) simulate future urban growth in the region based on different scenarios; and c) predict the spatial extent of future urban expansion through to the year 2020.

This work was supported in part by the National Natural Science Foundation of China under Grants 41401475 and 61373147, by Xiamen University of Technology high-level talents under Grant YKJ13022R, and by the Science and Technology Plan Project of Fujian Province under Grant JA09216.


II. METHOD

Prediction of urban transition probabilities was accomplished using an integrated framework composed of a logistic regression (LR) model and a constrained CA model. The LR model was first used to analyze the quantifiable relationship between urban growth and its driving forces. The CA model was then adopted to predict the future trend of urban growth.

A. Logistic regression model

A logistic regression model was selected to analyze and interpret the relationships between land use change and socioeconomic driving forces [11]. The function can be defined as a linear combination of attributes of land use choices:

$$z = \beta_0 + \sum_{j=1}^{k} \beta_j x_{ij} \qquad (1)$$

$$P_i = \frac{e^{z}}{1 + e^{z}} \qquad (2)$$

where the constant $\beta_0$ and the coefficients $\beta_j$ ($j = 1, 2, \ldots, k$) in (1) are the unknown parameters, which are usually estimated by maximum likelihood, and $x_{ij}$ is the value of the $j$th of the $k$ predictor variables for cell $i$. Equation (2) gives the estimated probability that the $i$th land cell ($i = 1, \ldots, n$) belongs to a certain land use type. Hence, the linear regression equation is formed by the logit, or log of the odds, as in (3):

$$\ln\left(\frac{P_i}{1 - P_i}\right) = \beta_0 + \sum_{j=1}^{k} \beta_j x_{ij} \qquad (3)$$

In the model developed for Xiamen city, 9 independent variables, derived from an initial set of POI and remote sensing images, were adopted to measure suitability, accessibility to infrastructure and facilities, market factors and policy constraints. These variables include 1) distance to major roads; 2) distance to business district; 3) distance to commercial center; 4) distance to financial district; 5) distance to service center; 6) distance to administration center; 7) distance to education district; 8) distance to medical center; and 9) DEM.

Urban growth patterns always exhibit spatial autocorrelation, which means that land use transitions occur in geographical proximity to one another. Thus, spatial autocorrelation of land use and driving forces may invalidate the assumption of independence and compromise the applicability of conventional statistics, a problem that is unfortunately often overlooked in econometric models. Several techniques are currently available to quantify spatial autocorrelation. These exploratory methods include Spatial Auto-logistic Regression (SAL), Getis and Ord statistics, Anselin's LISA, and Geographically Weighted Regression (GWR) (see the review in [12]), which normally use a spatial weights matrix to describe the relationships among the data points/pixels. However, these methods are either time consuming because they incorporate a complex autoregressive structure (such as GWR), or unable to account for the heterogeneity of local spatial effects (such as SAL). Another feasible strategy is to sample the data systematically to create a smaller dataset that consists only of non-neighbouring observations [13, 14]. In this study, an optimal sampling scheme is employed to verify the spatial autocorrelation of observations proximate in space, and a special check of joins was made each time by comparing the land use change types of 'adjacent' cells. A join is defined as the occurrence of like land use changes in adjacent cells, where adjacent cells are the central cells of the current sampling window and of its neighboring sampling windows. We calculated the join statistics for different window sizes, increasing the size by 2 each time from 3 to 30.

B. Constrained cellular automata model

Constraints on urban growth can generally be classified into three types: local, regional and global constraints. As in X. Li and A. G. Yeh's study [15], a standard constrained cellular automaton is generalized as follows:

$$P'^{t}_{d}\{x, y\} = f\left(S^{t}\{x, y\},\; CONS^{t}_{d}\{x, y\},\; N\right)$$

where $P'^{t}_{d}\{x, y\}$ is the transition probability of cell $\{x, y\}$ at time $t$; $S^{t}\{x, y\}$ is the state of cell $\{x, y\}$ at time $t$; $consl^{t}_{d}\{x, y\}$, $consr^{t}_{d}\{x, y\}$ and $consg^{t}_{d}\{x, y\}$ are the evaluated scores of the local, regional and global constraints respectively; $CONS^{t}_{d}\{x, y\}$ is the combined score of all three constraints, obtained by multi-criteria evaluation (MCE) techniques; and $N$ is the neighborhood of cell $\{x, y\}$. The scores range continuously from 0 to 1. The higher the score, the more likely urban growth is to occur. The combined score is

$$CONS^{t}_{d}\{x, y\} = consl^{t}_{d}\{x, y\} \times consr^{t}_{d}\{x, y\} \times consg^{t}_{d}\{x, y\} \qquad (4)$$

In this study, the constraints of the CA have been defined as follows:

• Local constraints are represented by the values of each cell, including the logistic regression probability, as well as the neighborhood effects derived from cells within a certain distance of the central cell.

• Regional constraints are adopted to emphasize the differences in scenarios among the districts of Xiamen city. In this study, administrative boundaries are considered a vital factor because their effects could be crucial in influencing urban growth patterns.

• Global constraints represent the temporal change of land use in the whole city, which can be derived from the statistical yearbooks of the provincial and city governments. In addition, policy constraints and protected land, such as farmland and green space, are also included in this type.
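The way the three constraint scores combine can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the form of the local score (LR probability multiplied by the urban fraction in a 3×3 neighbourhood), the threshold rule and all numeric values are assumptions made only for illustration. The one element taken from the text is that the combined score is the product of the local, regional and global scores, each in the range 0 to 1.

```python
import numpy as np

def neighbourhood_fraction(urban: np.ndarray) -> np.ndarray:
    """Fraction of urban cells among the 8 neighbours of each cell (3x3 window)."""
    rows, cols = urban.shape
    padded = np.pad(urban.astype(float), 1, mode="constant")  # cells outside the map count as rural
    total = np.zeros((rows, cols), dtype=float)
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            if dr == 0 and dc == 0:
                continue  # skip the central cell itself
            total += padded[1 + dr:1 + dr + rows, 1 + dc:1 + dc + cols]
    return total / 8.0

def ca_step(urban, p_lr, cons_regional, cons_global, threshold):
    """One hypothetical iteration: a rural cell turns urban when its combined score reaches the threshold."""
    cons_local = p_lr * neighbourhood_fraction(urban)   # assumed form of the local score
    cons = cons_local * cons_regional * cons_global     # product of the three constraint scores
    new_urban = urban.copy()
    new_urban[(urban == 0) & (cons >= threshold)] = 1
    return new_urban, cons

# Toy 3x3 map: 1 = urban, 0 = rural (illustrative values only)
urban = np.array([[1, 1, 0],
                  [1, 0, 0],
                  [0, 0, 0]])
p_lr = np.full((3, 3), 0.8)          # assumed LR transition probabilities
cons_regional = np.ones((3, 3))      # no regional restriction in this toy case
cons_global = np.ones((3, 3))
cons_global[2, 2] = 0.0              # e.g. a protected cell that can never convert
new_urban, cons = ca_step(urban, p_lr, cons_regional, cons_global, threshold=0.25)
print(new_urban)
```

In this toy case only the central cell, which has three urban neighbours, reaches the threshold. A multiplicative combination means that a zero score on any one constraint (for example, protected land) blocks conversion regardless of the other scores.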

Another essential element of CA is the definition of how neighboring cells influence the state of the central cell. In this study, we used a simple 3×3 window to count the distribution of states in the neighboring cells. Fig. 1 is the flow chart of the constrained CA model adopted in this study.

III. IMPLEMENTATION AND RESULT

A. Study area and data

Xiamen city, located on the southeastern coast of China (118.06°E, 24.27°N), is centred on an island connected by causeway to the mainland, opposite Taiwan. It has a land area of more than 1699.39 square kilometers and a sea area of 300 square kilometers. Because of these favorable conditions, Xiamen has become an international seaport and scenic city. It had a registered household population of 1.96 million at the end of 2013 and a permanent population of about 3.73 million. The municipality has jurisdiction over six districts: Siming (SM), Huli (HL), Jimei (JM), Haicang (HC), Tong'an (TAN) and Xiang'an (XA) (Fig. 1).

FIGURE 1 DISTRICTS OF XIAMEN CITY, CHINA

The initial training data on land use conversion were obtained by classifying the 2001 and 2007 TM images (30 m). These images were geo-rectified and mosaicked with high-resolution (1:10000 scale) topographic maps as references. Six land use classes (agriculture, green land, undeveloped land, built-up land, water, and mudflat aquaculture land) were derived using the supervised classification function provided by the ENVI software (Fig. 2). The first three types are considered rural areas that have potential for urbanization, the built-up type is considered urban area, and the last two are considered parcels that are stable and not suitable for development. Classification accuracy directly affects simulation accuracy. Hence, the accuracy of the land use classification was assessed with reference to available government land use maps, air photographs, and field investigation. The overall accuracy is 78.25% for 2001 and 87.62% for 2007, and the Kappa coefficients are 0.7531 and 0.8525 respectively, indicating that the land use maps are sufficiently accurate for further analysis.

FIGURE 2 LAND USE MAP OF XIAMEN IN 2001 (LEFT) AND 2007

The independent variables for determining land use conversion were retrieved via the GIS functions of the ArcInfo 10.0 software. Proximity is a prime cause of urban expansion. The proximity variables measure the minimum Euclidean distances to the nearest commercial site, residential area, industrial site and transportation network respectively. The availability of usable sites significantly influences urban growth patterns. A summary of the nine predictors is shown in Table 1.

TABLE 1 SUMMARY OF PREDICTOR VARIABLES FOR THE RURAL-URBAN LAND CONVERSION MODEL

| Variables | Description |
|---|---|
| Dist_Com | Distance from the cell to the nearest commercial site |
| Dist_Edu | Distance from the cell to the nearest education site |
| Dist_Res | Distance from the cell to the nearest restaurant area |
| Dist_Gov | Distance from the cell to the nearest government site |
| Dist_Road | Distance from the cell to the nearest road |
| Dist_Med | Distance from the cell to the nearest medical site |
| Dist_Shp | Distance from the cell to the nearest shopping site |
| Dist_Ser | Distance from the cell to the nearest service site |
| DEM | The elevation of the cell |

B. Simulation results

Fig. 3 plots the change in the number of joins against the increase in window size. The differing trends among land use types indicate that a small sampling window is insufficient for removing spatial autocorrelation. Conversely, a large sampling window leads to a loss of information. After careful comparison, we chose to sample the data using 15×15 windows, i.e. a sampling distance of 450 m × 450 m. At this distance, much of the spatial autocorrelation was filtered out, yet enough samples were retained for regression analysis. In total, 9382 samples, comprising 7422 rural and 1960 urban samples, were obtained for the LR model to represent the complex relationship between land use conversion and the independent variables.

FIGURE 3 NUMBER OF JOINS IN TERMS OF WINDOW SIZES

The LR model is implemented using ArcInfo's add-in developer kits. Spatial logistic analysis is performed to model urban growth in Xiamen city. The LR model can search for a set of optimal parameters for the rural-urban transition. The results are shown in Table 2 and Table 3.
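The systematic sampling described in this section can be sketched in a few lines of Python. This is an illustrative reading, not the authors' ArcInfo procedure. The function names are hypothetical, the raster is synthetic, and the join count assumes rook adjacency (shared edges) between sampled window centres, which the paper does not state explicitly.

```python
import numpy as np

CELL_SIZE_M = 30   # TM pixel size reported in the paper
WINDOW = 15        # sampling window size (in cells) chosen in the paper

def sample_window_centres(raster: np.ndarray, window: int) -> np.ndarray:
    """Return the central cell of every complete, non-overlapping window."""
    half = window // 2
    rows, cols = raster.shape
    # Stop at rows - half / cols - half so that only complete windows contribute.
    return raster[half:rows - half:window, half:cols - half:window]

def count_joins(sampled: np.ndarray) -> int:
    """Count pairs of edge-adjacent samples with the same land use change type."""
    horizontal = np.sum(sampled[:, :-1] == sampled[:, 1:])
    vertical = np.sum(sampled[:-1, :] == sampled[1:, :])
    return int(horizontal + vertical)

# Sampling distance implied by the window size: 15 cells x 30 m
spacing_m = WINDOW * CELL_SIZE_M

# Synthetic 45 x 60 raster whose values simply encode the cell index
grid = np.arange(45 * 60).reshape(45, 60)
samples = sample_window_centres(grid, WINDOW)

# Small hand-made change map to illustrate the join count
change = np.array([[1, 1, 0],
                   [0, 1, 0]])
joins = count_joins(change)
print(spacing_m, samples.shape, joins)
```

Repeating the sampling for a range of window sizes and plotting the join count against the size would give a curve of the kind summarized in Fig. 3.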

TABLE 2 COEFFICIENTS OF LAND CONVERSION MODELS FOR 2001-2007

| Variables | Coef | Std Err |
|---|---|---|
| Dist_Com | -1.04E-04 | 4.23E-05 |
| Dist_Edu | -2.39E-04 | 6.71E-05 |
| Dist_Res | -7.03E-05 | 1.76E-05 |
| Dist_Gov | -3.63E-05 | 4.12E-05 |
| Dist_Road | 2.14E-04 | 0.000154 |
| Dist_Med | 1.12E-05 | 5.76E-05 |
| Dist_Shp | -3.12E-04 | 6.76E-05 |
| Dist_Ser | -3.48E-04 | 6.36E-05 |
| DEM | -3.26E-02 | 0.002045 |
| Constant | 1.50E+00 | |
| -2 LL | 6010.807 | |
| R Square | 0.498 | |
| PCP (a) | 85.1% | |
| ROC | 0.896 | |

a. The cut value is 0.5

FIGURE 4 ROC CURVE (AUC=0.896)

TABLE 3 THE CLASSIFICATION TABLE (CUT VALUE IS 0.5)

| Observed | Predicted Urban | Predicted Rural | Percentage Correct |
|---|---|---|---|
| Urban | 1127 | 833 | 57.5 |
| Rural | 569 | 6853 | 92.3 |
| Overall | | | 85.1 |

TABLE 4 THE CLASSIFICATION TABLE (CUT VALUE IS 0.33)

| Observed | Predicted Urban | Predicted Rural | Percentage Correct |
|---|---|---|---|
| Urban | 1570 | 390 | 80.1 |
| Rural | 1332 | 6090 | 82.1 |
| Overall | | | 81.9 |
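To show how the fitted coefficients translate into a transition probability, the following Python sketch evaluates the logistic function with the Table 2 values. The two example cells and their distances are invented for illustration, and the units (metres for the distance variables and for DEM) are an assumption, since the paper does not state them. The form 1 / (1 + e^(-z)) used in the code is algebraically the same as e^z / (1 + e^z).

```python
import math

# Coefficients as reported in Table 2 (units of the predictors are assumed to be metres)
COEF = {
    "Dist_Com": -1.04e-04, "Dist_Edu": -2.39e-04, "Dist_Res": -7.03e-05,
    "Dist_Gov": -3.63e-05, "Dist_Road": 2.14e-04, "Dist_Med": 1.12e-05,
    "Dist_Shp": -3.12e-04, "Dist_Ser": -3.48e-04, "DEM": -3.26e-02,
}
CONSTANT = 1.50
CUT_OFF = 0.33  # cut-off value reported for the CA prediction

def transition_probability(cell: dict) -> float:
    """Logistic probability of rural-to-urban transition for one cell."""
    z = CONSTANT + sum(COEF[name] * cell[name] for name in COEF)
    return 1.0 / (1.0 + math.exp(-z))

# Two hypothetical cells: one close to all facilities and low-lying, one remote and high
near_cell = {name: 500.0 for name in COEF}
near_cell["DEM"] = 10.0
far_cell = {name: 5000.0 for name in COEF}
far_cell["DEM"] = 200.0

p_near = transition_probability(near_cell)
p_far = transition_probability(far_cell)
print(p_near >= CUT_OFF, p_far >= CUT_OFF)
```

Under these assumed inputs the nearby, low-lying cell falls above the 0.33 cut-off and the remote, elevated cell falls far below it, which matches the mostly negative signs of the distance and DEM coefficients.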

C. Model validation

Table 2 shows the statistical results of the model. The LR model is significantly different from the constant-only model, indicating that the 9 predictors as a whole are statistically reliable for predicting urban transition probabilities in the city. With 0.5 as the cut-off value for classification (Table 3), the overall prediction success rate of the LR model was 85.1%. However, the success rate for urban use is low (57.5%), because the LR model omitted more urban cells than rural cells. This is understandable because the samples are predominantly rural. To reduce classification error and to be consistent with future prediction, the ROC method was used to determine the optimum cut-off value, taking the AUC into consideration (Fig. 4). As a result, the classification of urban use improved from 57.5% to 80.1% (about 23 percentage points) when the cut-off value is 0.33, and the overall PCP remains above 80% with little change (Table 4).
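The percentages of correct prediction (PCP) quoted here follow directly from the cell counts in the two classification tables. The short Python check below recomputes them. The function name is hypothetical and the counts are copied from the classification tables for the 0.5 and 0.33 cut-off values.

```python
def pcp(urban_as_urban, urban_as_rural, rural_as_urban, rural_as_rural):
    """Return (urban, rural, overall) percentage correctly predicted."""
    n_urban = urban_as_urban + urban_as_rural
    n_rural = rural_as_urban + rural_as_rural
    urban = 100.0 * urban_as_urban / n_urban
    rural = 100.0 * rural_as_rural / n_rural
    overall = 100.0 * (urban_as_urban + rural_as_rural) / (n_urban + n_rural)
    return urban, rural, overall

# Counts for the cut-off value 0.5
urban_05, rural_05, overall_05 = pcp(1127, 833, 569, 6853)
# Counts for the cut-off value 0.33
urban_033, rural_033, overall_033 = pcp(1570, 390, 1332, 6090)

print(f"cut 0.50: urban {urban_05:.1f}, rural {rural_05:.1f}, overall {overall_05:.1f}")
print(f"cut 0.33: urban {urban_033:.1f}, rural {rural_033:.1f}, overall {overall_033:.1f}")
```

The recomputed values agree with the reported figures for the 0.5 cut-off. For the 0.33 cut-off the class rates agree, while the overall rate recomputed from the counts comes out at about 81.6%, slightly below the reported 81.9%.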

Results from both classifications suggest that the LR prediction for urban land use is sensitive to the cut-off value. Hence, the cut-off value should be selected carefully according to the sizes of the different categories within the samples. For urban growth prediction, the cut-off value of 0.33 was chosen for the CA prediction in order to account for a significant portion of the urban area in the relatively rural areas.

D. Predicted urban growth

According to the final prediction, the total population of Xiamen city will increase from 3.73 million in 2014 to 5 million in 2020. The growth is about 211,667 people, or 5.6%, per year. However, the urban area in 2020 is predicted to increase to 494.27 km², which is 29.31% of the total land area of the city. This means that the urban area expands at a rate of about 1.37% annually, which is lower than the population growth rate because of the shortage of reserve land resources.

The spatial process of future urban growth in Xiamen city up to 2020 is illustrated in Fig. 5. Fig. 5 (a) and (b) are the actual classification data obtained from the remote sensing images. Fig. 5 (c) is the simulation result for 2007, for comparison with the actual data. Fig. 5 (d) and (e) are the predictions for 2020 under different scenarios. The last two maps show that the predicted urban growth follows two patterns: the potential new urban area mainly grows outside the island, and growth follows the major roads. Meanwhile, in the HL and SM districts within the island, two districts with a strong orientation toward transformation that are geographically constrained by water and restricted by protected land, new urban development will spread into the gaps between existing urban areas and will hardly create new large urban clusters. Moreover, by adding the constraint rules to the original CA model, the rate of urban area expansion will be lower than in the natural scenario, which is a more reasonable prediction.
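The population growth figures quoted in this subsection can be reproduced with simple arithmetic. The Python sketch below uses only the population values stated in the text. The comparison with a compound annual rate is an added illustration, not something the paper reports.

```python
pop_2014 = 3.73e6   # population in 2014, from the text
pop_2020 = 5.0e6    # predicted population in 2020, from the text
years = 2020 - 2014

# Average absolute increase per year
abs_per_year = (pop_2020 - pop_2014) / years

# Simple annual rate relative to the 2014 base
simple_rate = abs_per_year / pop_2014

# Compound annual growth rate (for comparison only)
compound_rate = (pop_2020 / pop_2014) ** (1 / years) - 1

print(f"{abs_per_year:,.0f} people per year")
print(f"simple rate   {100 * simple_rate:.2f}% per year")
print(f"compound rate {100 * compound_rate:.2f}% per year")
```

The quoted annual rate corresponds to the simple rate relative to the 2014 base (about 5.67%). A compound rate over the same six years would be closer to 5.0%. Either way, it remains well above the 1.37% annual expansion of urban area.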

FIGURE 5 SIMULATED 2020 URBAN LAND MAPS USING CA MODEL WITH DIFFERENT SCENARIOS PARAMETERS: (a) 2001; (b) 2007; (c) simulation map 2007; (d) simulation map 2020 (natural growth); (e) simulation map 2020 (constrained growth)

IV. CONCLUSION

This research took a combined approach to modeling and predicting urban growth in Xiamen city, China. To make the prediction objective, realistic and flexible, a spatial logistic regression model was first used, through an optimal sampling process, to predict the probabilities of urban transformation. Future urban growth was then simulated with a constrained cellular automata model, which adopted different scenarios relating urban sprawl to population growth. The logistic regression model proved useful for identifying significant driving factors and achieved high prediction rates for the land use categories as a whole. Moreover, with a ROC curve, a cut-off value giving moderate success rates for urban use can be selected. The model validation results indicate that the LR model is reliable for prediction, but the cut-off value should be chosen carefully.

Urban growth is a complex system that poses a challenge for science and practice. A constrained CA model based on GIS techniques can provide quantified, spatial and visual information on the future dynamics of land use. The simulation results indicate that land development will inevitably exert tremendous pressure on the natural environment and on social demand. The shortage of reserve land may fail to satisfy the needs of sustainable development in Xiamen city in the future, a prospect that, it is hoped, will draw public attention and increase environmental awareness. Further studies are needed to evaluate the impacts of exogenous driving forces and local effects on urban spatial dynamics.

REFERENCES

[1] W. B. Meyer and B. L. Turner II, Changes in land use and land cover: a global perspective, vol. 4. Cambridge University Press, 1994.
[2] K. C. Seto and R. K. Kaufmann, "Modeling the drivers of urban land use change in the Pearl River Delta, China: Integrating remote sensing with socioeconomic data," Land Economics, vol. 79, pp. 106-121, Feb 2003.
[3] E. Lopez, G. Bocco, M. Mendoza, and E. Duhau, "Predicting land-cover and land-use change in the urban fringe - A case in Morelia city, Mexico," Landscape and Urban Planning, vol. 55, pp. 271-285, Aug 10 2001.
[4] P. Verburg, P. Schot, M. Dijst, and A. Veldkamp, "Land use change modelling: current practice and research priorities," GeoJournal, vol. 61, pp. 309-324, 2004.
[5] R. White and G. Engelen, "Cellular automata and fractal urban form: a cellular modelling approach to the evolution of urban land-use patterns," Environment and Planning A, vol. 25, pp. 1175-1199, 1993.
[6] F. Wu, "Calibration of stochastic cellular automata: the application to rural-urban land conversions," International Journal of Geographical Information Science, vol. 16, pp. 795-818, 2002.
[7] F. Wu, "Simulating urban encroachment on rural land with fuzzy-logic-controlled cellular automata in a geographical information system," Journal of Environmental Management, vol. 53, pp. 293-308, 1998.
[8] M. Al-shalabi, L. Billa, B. Pradhan, S. Mansor, and A. A. Al-Sharif, "Modelling urban growth evolution and land-use changes using GIS based cellular automata and SLEUTH models: the case of Sana'a metropolitan city, Yemen," Environmental Earth Sciences, vol. 70, pp. 425-437, 2013.
[9] J. Allen and K. Lu, "Modeling and prediction of future urban growth in the Charleston region of South Carolina: a GIS-based integrated approach," Ecology and Society, vol. 8, p. 2, 2003.
[10] E. Burton, M. Jenks, and K. Williams, The compact city: a sustainable urban form? Routledge, 2003.
[11] S. Serneels and E. F. Lambin, "Proximate causes of land-use change in Narok District, Kenya: a spatial statistical model," Agriculture Ecosystems & Environment, vol. 85, pp. 65-81, Jun 2001.
[12] A. Páez and D. M. Scott, "Spatial statistics for urban analysis: a review of techniques with examples," GeoJournal, vol. 61, pp. 53-67, 2005.
[13] D. Munroe, J. Southworth, and C. Tucker, "Modeling spatially and temporally complex land-cover change: The case of western Honduras," The Professional Geographer, vol. 56, pp. 544-559, 2004.
[14] B. Huang, L. Zhang, and B. Wu, "Spatiotemporal analysis of rural-urban land conversion," International Journal of Geographical Information Science, vol. 23, pp. 379-398, 2009.
[15] X. Li and A. G.-O. Yeh, "Modelling sustainable urban development by the integration of constrained cellular automata and GIS," International Journal of Geographical Information Science, vol. 14, pp. 131-152, 2000.