Journal of Property Research

ISSN: 0959-9916 (Print) 1466-4453 (Online) Journal homepage: http://www.tandfonline.com/loi/rjpr20

Property valuation with artificial neural network: the case of Athens

Angelos Mimis, Antonis Rovolis & Marianthi Stamou

To cite this article: Angelos Mimis, Antonis Rovolis & Marianthi Stamou (2013) Property valuation with artificial neural network: the case of Athens, Journal of Property Research, 30:2, 128-143, DOI: 10.1080/09599916.2012.755558

To link to this article: http://dx.doi.org/10.1080/09599916.2012.755558

Published online: 04 Jan 2013.


Journal of Property Research, 2013 Vol. 30, No. 2, 128–143, http://dx.doi.org/10.1080/09599916.2012.755558

Property valuation with artificial neural network: the case of Athens

Angelos Mimis*, Antonis Rovolis and Marianthi Stamou

Department of Economic and Regional Development, Panteion University of Social and Political Sciences, 136 SYGROU AV, Athens 17671, Greece

(Received 23 November 2011; final version received 29 November 2012)

The purpose of this article is to examine the application of an artificial neural network (ANN) approach in property valuation. The approach has been enhanced by the use of a geographic information system (GIS) to enrich the explanatory variables and model the spatial dimension of the problem. The sample data used contain information on 3150 properties in the broader area of Athens. Various internal physical (structure quality and quantity) and external environmental characteristics (neighbourhood characteristics and transportation access) of the properties are available. In order to incorporate these environmental variables, the GIS was used to derive location-based characteristics. In our approach, the multilayer perceptron network has been employed and the results have been compared with the traditional approach of the spatial lag model. The comparison demonstrates that the ANN gives more consistent predictions in the area of Athens. Our results reveal the non-linear relationships of the value of a property with respect to floor space and age. Finally, the spatial variation of the values of the properties in the broader area of Athens is illustrated.

Keywords: artificial neural networks; housing prices; geographic information systems

*Corresponding author. Email: [email protected]
© 2012 Taylor & Francis

1. Introduction

The housing sector has always been of great importance for the Greek economy. The housing stock in relation to the population is above the EU27 average, and Greece has one of the highest percentages of owner-occupied houses (75%) in Western Europe (Suarez, 2009). Housing construction activity in Greece was rather high during the last decade, with construction activity in the area of Athens alone accounting for one-third of the total, as shown in Figure 1. For a general introduction to the role of housing in Greek society, one can consult Allen, Barlow, Leal, Maloutas, and Padovani (2004).

In the last decade, the demand for housing in the metropolitan area of Athens peaked in 2007, only to experience a prolonged fall in the subsequent period (Bank of Greece, 2012).¹ The increased demand in the early 2000s was the result of the confluence of several possible factors, such as the influx of economic migrants (for an analysis see, for instance, Rovolis & Tragaki, 2006), the dramatic fall of interest rates,² the dearth of alternative investment opportunities in the country,³ and, especially for the metropolitan area of Athens, public investment for the 2004 Olympics which, among other things, created a huge transportation infrastructure for the area, adding two new subway lines to the existing one. By the end of the decade, however, the Greek economy was in dire straits, and one manifestation of the crisis was the decline in housing demand. As can be seen in Figure 1, the drop in housing construction activity, both in the area of Athens and across the country, was pronounced. In any case, and probably due to the paucity of data, there is limited applied research analysing the housing market in Athens.

[Figure 1. Built properties in Greece (in terms of surface, m²). Series: Athens and Total. Horizontal axis: year (1997–2009). Vertical axis: built surface (square metres), 0 to 30,000,000.]

Econometric methods, and hedonic models more specifically, have usually been employed to address the problem of property valuation. In its initial formulation, the hedonic approach is based on multiple regression models in which the value of a property is given in a predefined form as a weighted sum of the various characteristics of that property (Curry, 2002; Rosen, 1974). Besides neural networks, other approaches have emerged recently to extend or replace hedonic models, such as the fuzzy logic approach (Lughofer, Trawinski, Trawinski, Kempa, & Lasota, 2011), the support vector machine technique (Lam, Yu, & Lam, 2009), rule-based expert systems (Kauko, 2003), the Geographically Weighted Regression model (Huang, Wu, & Barry, 2010; Osland, 2010) and, from the geostatistics field, cokriging (Chica-Olmo, 2007), which was used to estimate housing prices. Further, spatial econometric techniques have extended the hedonic models in order to consider the spatial autocorrelation and the spatial heterogeneity of housing prices (Bourassa, Cantoni, & Hoesli, 2010; Osland, 2010; Zietz, Zietz, & Sirmans, 2008). To this effect, several types of spatial regression have been employed, most notably the spatial lag and the spatial autoregressive error models.

Neural networks can be thought of as machines that model the way in which the brain performs a particular task or function of interest. They are made up of simple processing units and they have the ability to store prior knowledge and make it available for use. Various studies have compared the performance of hedonic models with the artificial neural network (ANN) approach in property valuation, and there are studies supporting both approaches. Worzala, Lenk, and Silva (1995) compared two flavours of an ANN model with traditional multiple regression techniques. They concluded that the latter approach is superior and that caution should be taken when using the neural network approach. They also noted that different software packages provided inconsistent results. They attributed this inconsistency to the random initialisation of the weights as well as the gradient descent methods employed; a remedy is described by Peterson and Flanagan (2009).⁴

In contrast, more recent studies (Nguyen & Cripps, 2001; Peterson & Flanagan, 2009; Selim, 2009) have supported the superiority of the ANN approach in property valuation. Nguyen and Cripps (2001) concluded that if a data-set comprises a reasonably large number of values, the ANN model performs better. Their sample comprised 3900 values. Selim (2009), with a sample of 5741 cases from the urban and rural areas of Turkey, compared the two approaches using common performance measures: the mean-squared error (MSE), the root-mean-squared error (RMSE) and the mean absolute error (MAE). In all three, the ANN performed better. Peterson and Flanagan (2009), using a large sample of 46,467 residential properties for the period 1999–2005, also concluded that linear appraisal methods generate significant errors in comparison to the basic feedforward ANN approach. In their study, the RMSE and the mean absolute percentage error (MAPE) were adopted.

The ANN approach has the ability to deal with non-linear relationships and thus allows a broader range of variation than the hedonic models. It can also adapt to changing environments, handle noisy and fuzzy data, and generalise to unseen situations (Openshaw & Openshaw, 1997). On the other hand, it is not obvious how the weights of the model are related to the price and, for this reason, the ANN has been characterised in the literature as a ‘black box’. Progress has been made in interpreting the contribution of input variables to model prediction by using the weights of the neural network (e.g. Olden & Jackson, 2002).
In a recent study by Paliwal and Kumar (2011), a ranking of the importance of the independent variables was carried out. There are also various comparisons of the proposed approaches to interpreting the importance and contribution of the explanatory variables in neural network models (Gevrey, Dimopoulos, & Lek, 2003; Sung, 1998).

The hedonic as well as the neural network models can be complemented by the use of GIS, which enables the visualisation and storage of data and allows the employment of spatial analysis (Kauko, 2003; Longley, Higgs, & Martin, 1994; Rodriguez, Sirmans, & Marks, 1995). In one of the initial approaches, Longley et al. (1994) made a connection between property values and a proposed mechanism for local taxation and council tax, organised in a GIS for classifying properties into bands. In a similar manner, Rodriguez et al. (1995) emphasised the importance of location in real estate research and investigated the use of GIS in property valuation by applying a multivariable regression model in which a distance variable was adopted. This work was made more methodical by Din, Hoesli, and Bender (2001), who introduced a geoindex capturing the environmental criteria that affect prices. This was evaluated in a GIS framework and was used to compare hedonic and neural network models. More recently, Visser, Van Dam, and Hooimeijer (2008) captured the physical, social and functional characteristics of properties in a hedonic model and pointed out the spatial variation of prices.

This study combines different approaches present in the literature by connecting the ANN method with the GIS environment. Although property prices are highly correlated with their locations, locations are usually not directly included in the models. Only Garcia, Gamez, and Alfaro (2008) used position coordinates as an independent variable in the ANN model. However, their data-set comprised 591 apartments and family houses in an area of about 10 km². In our study, this has been put to the test by using the location of properties in a large area of 300 km² and by including not only property details such as area and level, but also socio-economic characteristics of the area. These characteristics have been shown to play an important role in house attractiveness and, as a result, in individual house prices (Din et al., 2001; Kauko, 2003). So, in our approach, GIS-created variables have been extracted and employed as proposed by Rodriguez et al. (1995). Further, in order to compare the results with those of a more traditional model, the spatial lag model⁵ (SAR) has been adopted and fitted with the same data and the same explanatory variables. The SAR is preferred (and thus utilised) in this study over a spatial error model, as all of the houses examined are, in effect, apartments; this means that housing units are clustered, and therefore a spatial error model is less suitable (on this point, see Liao & Wang, 2012).

In the next section, a brief introduction to the ANN models as well as the SAR is offered. The problem and data description comes next, followed by an analysis of the tools used and the presentation and discussion of results. In the last section, an overall discussion and ways of improving this model are offered.

2. Methodology

In this section the two models used in this study, the SAR and the neural network, are briefly described.

2.1. SAR

The clustering of similar or dissimilar values in geographical space has been well recognised and is referred to as ‘spatial autocorrelation’ or ‘spatial dependence’ (Can, 1990). Because of this clustering, property prices will be spatially autocorrelated; that is, there is dependency among housing prices.
Spatial dependence is considered as the existence of a functional relationship between what happens at one point in space and what happens elsewhere (Anselin, 1988). In the presence of spatial dependence, different models can be implemented (for details, see Anselin, 1988; Pace & LeSage, 2009). In this research the SAR is used in order to take into account the spatial dependence of the dependent variable. The SAR is an extension of traditional linear regression. This model implies that levels of the dependent variable y are a function not only of the independent variables X, but also of the levels of adjacent (neighbouring) y. As stated in Anselin (1988), the formal model is:

$$y = \rho W y + X\beta + \varepsilon \qquad (1)$$

where y is the dependent variable, ρ is the coefficient of the spatially lagged dependent variable, W is the standardised or unstandardised spatial weight matrix, X is the matrix of explanatory variables, β is the vector of unknown regression parameters and ε is the error term.

2.2. Neural networks

The ANNs, or simply neural networks, have their origin in biology and, more specifically, in the way the human brain operates (Haykin, 2008). Among other uses, they are employed instead of regression analysis, with the goal of modelling unknown functional forms. A neural network is a collection of simple units, called neurons or nodes, that are connected together by edges called synapses. The strength of the connection in synapses is controlled through the weights attached to them. The neurons are organised in layers, which are either input, hidden (not directly observable) or output layers. Data enter through the neurons in the input layer and pass through the synapses to the neurons in the hidden layers, where two processes take place:
(1) the weighted summation functions and
(2) the transformation functions.

Processed signals, or results, leave the network at the output layer. Neural networks can be classified based on their network architecture (feedforward, feedback or competitive) or on the way that the learning process occurs (supervised or unsupervised). One well-known and widely used type of ANN is the multilayer perceptron (MLP), which is used in our case. This is a feedforward network that is trained in a supervised manner. In the feedforward network, the architecture is that of a directed graph and, in its general form, the ANN is fully connected, meaning that every neuron is connected to all the neurons of the previous layer. Further, it is trained in a supervised manner by providing the network with a training data-set (pairs of input and desired output values) and updating the synaptic weights so that the predicted output matches the provided output. An example of an MLP can be seen in Figure 2; it adopts a 9-5-1 architecture consisting of an input layer of nine neurons (independent variables, i.e. property attributes), a hidden layer of five neurons and an output layer of one neuron (dependent variable, i.e. price of the property).

Figure 2. Multilayer perceptron with one hidden layer.

The MLP is trained by the widely used backpropagation learning algorithm. In this algorithm, the ANN is trained iteratively by first calculating the output for given inputs and current weights. As long as the training process has not been completed, the predicted output differs from the observed. The errors are propagated back through the network, altering the synaptic weights accordingly in order to minimise the sum of squared errors (SSE). The process stops when the absolute partial derivatives of the error function with respect to the weights are smaller than a given threshold.

Applications of MLP networks range from economics (forecasting) to medicine (bio-informatics, disease diagnostics) and information technology (pattern recognition, security) (Sharda, 1994; Zhang, Patuwo, & Hu, 1998).
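The forward pass and the backpropagation update described in this section can be illustrated with a minimal sketch. It assumes a 9-5-1 network with a hyperbolic tangent hidden layer and a linear output, trained by plain gradient descent on the SSE with a derivative-threshold stopping rule. All function names, the learning rate, the tolerance and the synthetic data are hypothetical choices made for illustration; this is not the software or the training algorithm used in the study.

```python
import numpy as np

def init_params(n_in=9, n_hidden=5, n_out=1, seed=0):
    """Small random synaptic weights for a 9-5-1 network (illustrative values)."""
    rng = np.random.default_rng(seed)
    return {
        "W1": rng.normal(scale=0.1, size=(n_in, n_hidden)),
        "b1": np.zeros(n_hidden),
        "W2": rng.normal(scale=0.1, size=(n_hidden, n_out)),
        "b2": np.zeros(n_out),
    }

def forward(params, X):
    """Weighted summation, then transformation (tanh) in the hidden layer."""
    H = np.tanh(X @ params["W1"] + params["b1"])
    y_hat = H @ params["W2"] + params["b2"]  # linear output layer
    return H, y_hat

def sse(params, X, y):
    """Sum of squared errors between predicted and observed outputs."""
    _, y_hat = forward(params, X)
    return float(np.sum((y_hat - y) ** 2))

def gradients(params, X, y):
    """Propagate the errors back through the network (derivatives of the SSE)."""
    H, y_hat = forward(params, X)
    d_out = 2.0 * (y_hat - y)                          # dSSE/dy_hat
    d_hid = (d_out @ params["W2"].T) * (1.0 - H ** 2)  # tanh'(z) = 1 - tanh(z)^2
    return {
        "W1": X.T @ d_hid,
        "b1": d_hid.sum(axis=0),
        "W2": H.T @ d_out,
        "b2": d_out.sum(axis=0),
    }

def train(params, X, y, lr=5e-4, tol=1e-4, max_iter=2000):
    """Plain gradient descent; stops once every |partial derivative| < tol."""
    for _ in range(max_iter):
        grads = gradients(params, X, y)
        if max(np.abs(g).max() for g in grads.values()) < tol:
            break
        for name in params:
            params[name] = params[name] - lr * grads[name]
    return params

# Synthetic stand-in data: 50 observations with 9 standardised inputs.
rng = np.random.default_rng(1)
X = rng.normal(size=(50, 9))
y = 1.0 + 0.5 * X[:, [0]]

params = init_params()
sse_before = sse(params, X, y)
params = train(params, X, y)
sse_after = sse(params, X, y)
```

Comparing `sse_before` with `sse_after` shows how the iterative weight updates reduce the error on the training pairs. Plain gradient descent is used here only because it is the simplest variant to read.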

3. Problem description

On-site property valuation is time consuming, costly, based on subjective judgement and can sometimes even be negotiated (Al-Akhras & Saadeh, 2010). So the aim of this study is to examine: (1) the effectiveness of a valuation model that is based on ANN, (2) the connection of the model with the GIS environment through the independent variables extracted from it and (3) the visualisation of the results.

The database of the property sample includes 3150 cases of apartments in the greater area of Athens.⁶ These data cover the period 2000–2006, and prices were deflated with the respective price deflator (with base year 2000) obtained from the AMECO database.⁷

The explanatory variables that are used in the literature can be categorised into four groups: structure quantity, structure quality, neighbourhood characteristics and transportation access (De Bruyne & Van Hove, 2013; Dewees, 1976). Structure quantity covers variables such as the floor space and the number of rooms (used to specify the number or the size of characteristics of the property). Structure quality covers the age and the condition of the property. The neighbourhood characteristics are defined by variables such as traffic levels, pollution levels and crime statistics. Finally, transportation access is included in the model through the variable ‘travel time to work’.

In our case, bearing in mind the available data as well as the special features of the area, the explanatory variables of floor space, level and age are included in the model for the first two groups. For the neighbourhood characteristics, two variables are incorporated: the mean income of the relevant zip code, which portrays the social status of the area, and the land value of the area used for tax purposes. Both have been extracted from the GIS, since they are attributes of polygons covering the area of study. Finally, for transportation accessibility, the distance to metro has been included as a dummy variable. The metro has been chosen because it does not cover the whole area of study and because of its speed and ability to keep to its schedule, whereas buses in Athens are highly dependent on traffic. The analysis has been enhanced by adding a location variable created in a GIS environment (Rodriguez et al., 1995). More specifically, the description of the variables used in the model follows:


• Location. The addresses of the properties have been geocoded in a GIS environment and then converted into x and y coordinates. The coordinates are in the projected coordinate system Greek grid.
• Year of construction. The age of the property.
• Year of valuation. Difference between the year of the newest property and the year the valuation took place.
• Floor space.⁸ The surface in square metres.
• Level. The level of the apartment. Most of the properties in the data-set are apartments in buildings.
• Income. Mean income of the families in the area as cited in the tax returns for the year 2006.
• Land value. The value of the land in the area of interest used for tax purposes.
• Distance to metro. An indicator with a value of one if the property is within 1000 m of a metro station.
• Value of the property.⁹ This is the dependent variable corresponding to the price in euros.

The descriptive statistics of the variables under examination can be seen in Table 1. The study area along with the sample can be seen in Figure 3. It should also be noted that the initial database was pre-processed by data cleaning, that is, by detecting and removing inaccurate or incomplete records. The most common problems were empty cells and errors in addresses due to mistakes in the data entry process. In the following section, the use of the SAR and the ANN models is described.

4. Implementation and results

The starting point of the empirical analysis is to examine the spatial autocorrelation in the residuals of an ordinary least squares (OLS) linear regression model. Moran’s I index of spatial autocorrelation was computed with the inverse distance weighting scheme and a 1 km bandwidth, and resulted in a value of 0.07 (z-score = 10.89 and p-value < 0.01), indicating a statistically significant clustering pattern of housing prices. Given the evidence of spatial effects, a SAR is applied for comparison with the ANN.
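As an illustration of the Moran's I check described above, the following is a minimal sketch assuming inverse-distance weights that are cut off at a fixed bandwidth. The function names and the four-point toy data are hypothetical; the coordinates are taken to be in metres, as in a projected coordinate system, and the significance test (z-score) is not reproduced.

```python
import numpy as np

def inverse_distance_weights(coords, bandwidth):
    """w_ij = 1 / d_ij for 0 < d_ij <= bandwidth, otherwise 0 (no self-neighbours)."""
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))
    weights = np.zeros_like(dist)
    within = (dist > 0) & (dist <= bandwidth)
    weights[within] = 1.0 / dist[within]
    return weights

def morans_i(values, weights):
    """Moran's I = (n / S0) * (z' W z) / (z' z), with z the mean-centred values."""
    z = values - values.mean()
    return (len(values) / weights.sum()) * (z @ weights @ z) / (z @ z)

# Toy example: four points on a line, 1000 m apart, with a 1 km bandwidth.
coords = np.array([[0.0, 0.0], [1000.0, 0.0], [2000.0, 0.0], [3000.0, 0.0]])
W = inverse_distance_weights(coords, bandwidth=1000.0)

clustered = morans_i(np.array([1.0, 1.0, 5.0, 5.0]), W)    # similar values adjacent
alternating = morans_i(np.array([1.0, 5.0, 1.0, 5.0]), W)  # dissimilar values adjacent
```

In this toy case the clustered series gives a positive index and the alternating series a negative one. In an application like the one in the text, `values` would be the OLS residuals; the sketch assumes that at least one pair of points lies within the bandwidth, since otherwise the weights sum to zero.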

Table 1. Descriptive statistics.

| Variable | Mean | Standard deviation | Minimum | Maximum |
| --- | --- | --- | --- | --- |
| Age (years) | 17.9 | 13.3 | 0 | 40 |
| Year of valuation (years) | 1.6 | 1.5 | 0 | 6 |
| Floor space (m²) | 81.6 | 25.8 | 20 | 283 |
| Level (0–10) | 2.2 | – | 0 | 10 |
| Mean income (€) | 14,253 | 2573 | 10,877 | 28,401 |
| Land value (€) | 1362 | 371 | 650 | 5150 |
| Distance to metro (dummy) | 0.14 | – | 0 | 1 |
| Value (€) | 133,616 | 57,603 | 20,387 | 410,712 |

Figure 3. Sample points in the study region of Athens.

In order to generate the spatial weight matrix used in the SAR, many kinds of neighbouring structures were tested; based on the maximum model fit (R²) and the Akaike information criterion (AIC) of the model, a neighbourhood distance of 1 km was used. Regression results obtained with the SAR procedure are reported in Table 2. The model specifications applied here are the same as those used within the OLS framework. Furthermore, hedonic price modelling is performed using a semi-log functional form. As can be seen from Table 2, all property attributes are statistically significant, as is the estimate of the spatial lag parameter (Rho). The SAR model was estimated in the R software using the library ‘spdep’ (Bivand, 2006; Kissling & Carl, 2008).

The ANN model that has been adopted is the MLP. As has been shown (Hornik, 1991), a single hidden layer is sufficient to approximate any complex function. So in our case, a single hidden layer was implemented and, after a number of trials, a 9-5-1 architecture was chosen in order to have the least complexity possible. This means that an input of nine independent variables is processed by a hidden layer of five neurons, resulting in an output layer corresponding to the total house price. The network has been implemented in the Matlab programming environment using the neural network toolbox. The data-set that has been used for training,
validation and testing of the model has been normalised by the software to produce zero mean and unit standard deviation (SD). During the training of the model, the data were divided randomly into three parts: the ‘training set’ was set to 60% of the sample, the ‘validation set’ to 20%, and the remaining 20% was allocated to the ‘test set’. As activation function, the hyperbolic tangent sigmoid transfer function was used in the hidden layer, with a linear transformation in the output layer. Further, the scaled conjugate gradient backpropagation algorithm (Moller, 1993) has been used for training, with the MSE as the error function. This supervised learning algorithm is an improvement on standard backpropagation, since it does not contain any user-dependent parameters and, by using a step-size scaling methodology, avoids time-consuming line searches.

Table 2. Results of the SAR model.

| | Estimate | Std. error | z-value | Pr(>\|z\|) |
| --- | --- | --- | --- | --- |
| Intercept | 5.2955e+03 | 1.5979e+03 | 3.3141 | 0.0009194 |
| Age | 1.1984e-02 | 6.5397e-04 | 18.3256 | < 2.2e-16 |
| Floor space | 9.4728e-03 | 4.9605e-04 | 19.0965 | < 2.2e-16 |
| Level | 3.3220e-02 | 2.5622e-03 | 12.9650 | < 2.2e-16 |
| Mean income | 1.7806e-05 | 1.5979e+03 | 8.4837 | < 2.2e-16 |
| Land value | 1.4357e-04 | 1.4848e-05 | 9.6693 | < 2.2e-16 |
| Distance to metro | 5.5781e-02 | 9.1096e-03 | 6.1233 | 9.167e-10 |
| Year of valuation | 6.9483e-02 | 4.0098e-03 | 17.3281 | < 2.2e-16 |
| Rho | 0.082001 | 0.014475 | 5.665 | 1.4704e-08 |
| LR test value | 14.555 | | | 0.00013614 |
| Wald statistic | 10.939 | | | 0.00094143 |
| Log-likelihood | 650.4264 | | | |
| LM test for residual autocorrelation | 110.11 | | | < 2.22e-16 |

In order to compare the two models, several measures have been employed; their values can be seen in Table 3. After the models have been fitted, 87% of the variation in the data is explained by the ANN model and 76% by the SAR model. Further, and most importantly, the mean absolute error (MAE) is €16,625 for the ANN and €17,384 for the SAR model. The distribution of errors for the test set of the ANN model can be seen in Figure 4. It describes the accuracy in predicting the value of the property for individual cases not used for training the model. Another two measures used in the literature are the root-mean-squared error (RMSE) and the MAPE, defined by the formulas (Garcia et al., 2008; Peterson & Flanagan, 2009; Selim, 2009):

$$\mathrm{RMSE} = \sqrt{\sum_i (y_i - \hat{y}_i)^2 / n} \qquad (2)$$

$$\mathrm{MAPE} = \sum_i \left| \frac{y_i - \hat{y}_i}{y_i} \right| / n \qquad (3)$$

where $y_i$ is the observed and $\hat{y}_i$ the predicted price, and n is the number of properties in the sample.
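The error measures in equations (2) and (3), together with the MAE, can be sketched in a few lines. This is a minimal illustration; the function names and the three toy prices are assumptions and are not taken from the study's data.

```python
import numpy as np

def rmse(y, y_hat):
    """Root-mean-squared error, equation (2)."""
    return float(np.sqrt(np.mean((y - y_hat) ** 2)))

def mape(y, y_hat):
    """Mean absolute percentage error, equation (3), as a fraction (not x100)."""
    return float(np.mean(np.abs((y - y_hat) / y)))

def mae(y, y_hat):
    """Mean absolute error."""
    return float(np.mean(np.abs(y - y_hat)))

# Toy observed and predicted prices (illustrative numbers only).
y_obs = np.array([100.0, 200.0, 400.0])
y_pred = np.array([110.0, 180.0, 400.0])

toy_rmse = rmse(y_obs, y_pred)
toy_mape = mape(y_obs, y_pred)
toy_mae = mae(y_obs, y_pred)
```

The MAPE is returned as a fraction, which matches the scale of the MAPE values reported in Table 3 (about 0.136); it is undefined when an observed price is zero.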

Table 3. Model statistics (total value).

| Statistic | MLP | SAR |
| --- | --- | --- |
| Error mean | 1443.9 | 1312.9 |
| Error standard deviation | 21092.7 | 27925.4 |
| Mean absolute error (MAE) | 16625.7 | 17384.8 |
| Standard deviation ratio (SD) | 0.3661 | 0.3796 |
| Mean absolute percentage error (MAPE) | 0.135575 | 0.136154 |
| Root-mean-squared error (RMSE) | 21138.8 | 27951.9 |
| Correlation (target and estimation) | 0.930573 | 0.893626 |
| R² | 0.865291 | 0.764463 |

In both measures, the ANN displays lower values. The ratio of the prediction error SD to the training data SD (SD ratio) is also presented (Garcia et al., 2008). When the SD ratio is one, the predictor is no better than the mean estimator, while a value significantly lower than one corresponds to good model performance. In our case, the SD ratio is about 0.37 for the ANN and 0.38 for the SAR, pointing to the superiority of the ANN approach.

Several studies have examined the non-linear relationship between the value of the property and floor space (Peterson & Flanagan, 2009) and age (Goodman & Thibodeau, 1995). From this perspective, this functional form has been investigated with the use of the ANN model (Figure 5). The graph is created by calculating values for given floor space and age while keeping the other parameters constant. These evaluations are made for a specific location in the study area, and consequently the land value, the mean income of the area and the distance to metro are extracted from the GIS. The level is chosen to be equal to 2. The graph shows a non-linear relationship between floor space and value, whereas the value of the property declines rapidly with age in the first couple of years and then linearly. As far as the floor space–value dependence is concerned, the results are in agreement with those reported in the literature (Garcia et al., 2008), indicating an increasing non-linear relationship. On the other hand, the age–value relationship is a decreasing linear function after the first years, contradicting existing literature (Garcia et al., 2008; Goodman & Thibodeau, 1995), where a non-monotonic and non-linear function is indicated. This is due to the fact that the predictions are limited to properties less than 40 years old; it is beyond this age that the price of a property starts to increase, as part of the older stock comprises preserved estates.
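The construction of a graph such as Figure 5 (varying floor space and age while holding the other inputs fixed) can be sketched as follows. The `toy_predict` function is a purely hypothetical stand-in for a trained model and does not reproduce the fitted ANN or its results; the input names are assumptions, and the fixed land value and mean income are simply the sample means from Table 1.

```python
import numpy as np

def response_surface(predict, floor_spaces, ages, fixed_inputs):
    """Evaluate `predict` on a floor space x age grid, other inputs held fixed."""
    surface = np.empty((len(floor_spaces), len(ages)))
    for i, floor_space in enumerate(floor_spaces):
        for j, age in enumerate(ages):
            inputs = dict(fixed_inputs, floor_space=floor_space, age=age)
            surface[i, j] = predict(inputs)
    return surface

def toy_predict(inputs):
    """Hypothetical stand-in for a trained model (NOT the study's ANN)."""
    return 1000.0 * inputs["floor_space"] - 500.0 * inputs["age"]

# Inputs held constant for one location; the level is fixed at 2 as in the text.
fixed = {"level": 2, "land_value": 1362.0, "mean_income": 14253.0, "near_metro": 0}

floor_spaces = np.linspace(20.0, 280.0, 14)  # 20, 40, ..., 280 m2
ages = np.arange(0, 41, 5)                   # 0, 5, ..., 40 years
surface = response_surface(toy_predict, floor_spaces, ages, fixed)
```

Each row of `surface` traces the predicted value against age for one floor space, and each column traces it against floor space for one age, which is the kind of slice a surface plot displays.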
In almost all the studies in the literature, the area of interest is considered either small or homogeneous in order to avoid a spatial effect. An approach to this problem has been pointed out by Friesen, Patterson, and Harmel (2011), who used the neighbourhood group code (NGC), provided by the relevant tax appraisal office, to identify areas with homogeneous properties. In our case, as can be seen from Figure 3, the study area is about 300 km² and hence the value of similar properties varies in space. For that reason, the effect of position on the price of the property has been examined by considering a typical 18-year-old apartment of 80 m², on the second level of a building block. The other variables are extracted from the GIS environment to complement the ANN model. An ANN prediction is then performed for 1600 equally spaced sample points in the study area, and the predictions are illustrated in Figure 6 in the form of an inverse distance weighted interpolation model.


Figure 4. Error frequencies for the test set of the neural network.

Figure 5. Value prediction for varying age and floor space.

This method of interpolation computes the value of a property as a weighted average of the prices of the observations, where the weights are computed as functions of their distance to the interpolating location. The result is in good agreement with the actual spatial variation of prices, pointing out high values in the north-east areas and in the south, as well as in the centre of the area. It should also be noted that the interpolation was calculated over the whole study area, without excluding restricted areas with different land use.
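The inverse distance weighted interpolation described above can be sketched as follows. This is a minimal illustration assuming weights of the form 1/d^p; the text does not state the exponent or any search radius, so `power=2.0` and the use of all sample points are assumptions, as are the function name and the toy data.

```python
import numpy as np

def idw_interpolate(sample_xy, sample_values, query_xy, power=2.0):
    """Weighted average of sample values with weights 1 / distance**power."""
    diff = query_xy[:, None, :] - sample_xy[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))
    result = np.empty(len(query_xy))
    for i, d in enumerate(dist):
        exact = d == 0
        if exact.any():
            # A query that coincides with a sample point takes that sample's value.
            result[i] = sample_values[exact].mean()
        else:
            w = 1.0 / d ** power
            result[i] = np.sum(w * sample_values) / np.sum(w)
    return result

# Toy example: two predicted values 2 units apart.
sample_xy = np.array([[0.0, 0.0], [2.0, 0.0]])
sample_values = np.array([100.0, 200.0])
query_xy = np.array([[1.0, 0.0], [0.0, 0.0]])

interpolated = idw_interpolate(sample_xy, sample_values, query_xy)
```

The midpoint receives the plain average of the two samples because both are equally distant, while a query that coincides with a sample point returns that sample's value.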


Figure 6. Value prediction for a typical apartment.

5. Conclusion

Housing construction has been one of the most important economic activities for the Greek economy. Greeks have invested heavily in real estate property, and this is reflected in the high percentage of owner-occupied houses. There are, however, few studies examining empirical property valuation in the Greek context. This is probably due to the fact that there are very few databases with relevant information regarding house prices in Greece.

In this study, a rather large database (of 3150 apartments in the greater area of Athens) is analysed with the use of the ANN approach, and specifically an MLP. The sample includes various internal and external characteristics of the properties (i.e. structure quality and quantity, neighbourhood characteristics and transportation access). The independent variables include the location as x and y coordinates, age, floor space, distance to public transport and social standing of the area; the dependent variable is the value of the property. It has to be noted that the data have been enriched with information extracted from GIS which would otherwise not be available to the models, for example, the location (addresses) of the properties.


Recent research has suggested that the ANN approach is, in many cases, superior to the traditional multiple regression framework, having the ability to deal with non-linear relationships. The ANN approach performs better especially when the data-set in use is ‘large’. To support this, the ANN has been compared with the SAR model, which is an extension of the traditional linear regression models. In all the measures, the ANN performed better, with a mean absolute error of €16,625. In the results, the functional relationship of the value of the property with respect to floor space and age was examined, illustrating a non-linear relationship between floor space and value, whereas the value of the property declines rapidly with age in the first couple of years and then linearly. Finally, the spatial variation of values for a typical apartment was examined, and the results are in good agreement with actual values.

Acknowledgement

The authors would like to thank the editor and the three anonymous reviewers for their helpful comments and suggestions during the review process. The authors also wish to thank Assistant Professor A. Karaganis for his careful reading and comments on the manuscript.

Notes

1. The indicator of sales of real estate properties of all types (residential, commercial, etc.) in Greece took very high positive values until the mid-80s; the indicators for Greece as a whole and for Athens were 29.6 and 49.5, respectively (annual percentage change). However, since 2006, the demand for real estate property has decreased dramatically; the respective figures of the sales indicator for Greece and Athens were −19.6 and −22. The annual percentage change of real estate sales has remained negative from 2006 to date (the indicator of real estate property sales is published by the Bank of Greece, based on data from the Statistical Authority of Greece; Bank of Greece, 2012). It has to be noted that this fall in real estate demand coincides with the decline of the real Gross Domestic Product growth rate; from a positive 4% in 2006, the real GDP growth rate fell to a dismal −6% in 2011. There is research in other countries suggesting that ‘the demand for housing seems much more strongly related to real income than to any other factor’ (Hendershott & Weicher, 2002); this is a field of potential future research in Greece.
2. As Greece became a member of the Eurozone, the interest rates of new housing loans fell from more than 11% in 1999 to less than 6% in 2001 and 5% in 2002, reaching a low of 3% in early 2010.
3. Especially after the 1999–2000 collapse of the Greek stock exchange market.
4. Because the network weights are initialised randomly, the gradient descent algorithm starts from different points and thus arrives at different solutions. This can be treated by ensemble averaging (Haykin, 2008).
5. The literature contains a certain amount of variation in naming this model. In our case the spatial lag model refers to the SAR model.
6. The geographical area of the sample comprises the entire Athens basin; this means that it incorporates many different municipalities and not only the borough of Athens.
7. AMECO is the annual macroeconomic database of the European Commission’s Directorate General for Economic and Financial Affairs (DG ECFIN).
8. This is the size of the property in square metres. As all properties in this data-set are apartments, there is no discrepancy between land size and built property.
9. The dependent variable is the price of the property. In Greece, there are no comprehensive databases with real estate property prices. This particular data-set is part of a larger one compiled by a Greek private bank. ‘House prices’ are the prices reported by the buyers, ‘corroborated’ by the bank’s evaluation. Having said that, residential property valuation in many cases shows a high degree of accuracy (for this point see, for instance, McGreal & Taltavull de La Paz, 2012).
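The ensemble averaging mentioned in Note 4 can be sketched as follows. This is only an illustration under stated assumptions: the tiny one-input network, the toy target y = x², the learning rate, the number of epochs and the five seeds are all hypothetical choices, and plain gradient descent is used here rather than the training algorithm of the study.

```python
import math
import random


def train_mlp(xs, ys, seed, hidden=3, lr=0.05, epochs=500):
    """Train a 1-input MLP with one tanh hidden layer by batch gradient descent."""
    rng = random.Random(seed)  # the seed fixes the random starting weights
    w1 = [rng.uniform(-1, 1) for _ in range(hidden)]  # input -> hidden weights
    b1 = [rng.uniform(-1, 1) for _ in range(hidden)]  # hidden biases
    w2 = [rng.uniform(-1, 1) for _ in range(hidden)]  # hidden -> output weights
    b2 = 0.0                                          # output bias
    n = len(xs)
    for _ in range(epochs):
        gw1, gb1, gw2, gb2 = [0.0] * hidden, [0.0] * hidden, [0.0] * hidden, 0.0
        for x, y in zip(xs, ys):
            h = [math.tanh(w1[j] * x + b1[j]) for j in range(hidden)]
            err = sum(w2[j] * h[j] for j in range(hidden)) + b2 - y
            gb2 += err
            for j in range(hidden):
                gw2[j] += err * h[j]
                back = err * w2[j] * (1 - h[j] ** 2)  # error propagated through tanh
                gw1[j] += back * x
                gb1[j] += back
        for j in range(hidden):
            w1[j] -= lr * gw1[j] / n
            b1[j] -= lr * gb1[j] / n
            w2[j] -= lr * gw2[j] / n
        b2 -= lr * gb2 / n

    def predict(x):
        return sum(w2[j] * math.tanh(w1[j] * x + b1[j]) for j in range(hidden)) + b2

    return predict


def ensemble_predict(models, x):
    """Average the predictions of networks trained from different starting points."""
    return sum(m(x) for m in models) / len(models)


def mse(predict, xs, ys):
    return sum((predict(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Toy non-linear data, invented for illustration only.
xs = [i / 10 for i in range(-10, 11)]
ys = [x ** 2 for x in xs]

models = [train_mlp(xs, ys, seed) for seed in range(5)]
member_mse = [mse(m, xs, ys) for m in models]
ensemble_mse = mse(lambda x: ensemble_predict(models, x), xs, ys)
print("member MSEs:", [round(v, 5) for v in member_mse])
print("ensemble MSE:", round(ensemble_mse, 5))
```

Because the squared error is convex, the error of the averaged prediction can never exceed the average error of the individual networks, which is why averaging is a common remedy for the dependence on the starting weights.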


Notes on contributors

Angelos Mimis (mathematician, BSc Patras, MSc Southampton, PhD Leeds) is a lecturer in Informatics in the Department of Economic and Regional Development at Panteion University of Athens. His research interests include GIS, neural networks and optimisation.

Antonios Rovolis (economist, BA Athens, MA Sussex, PhD LSE) is an assistant professor of Spatial and Urban Economics at the Department of Economic and Regional Development, Panteion University of Athens. His areas of expertise include real estate economics, urban and regional infrastructure investment, urban and regional economic growth, spatial dimensions of new technologies and new economic geography.


Marianthi Stamou (geographer, BSc Aegean, MSc Athens) is a PhD candidate conducting research on spatial econometric models of property valuation.

References

Al-Akhras, M. & Saadeh, M. (2010). Automatic valuation of Jordanian estates using a genetically-optimised artificial neural network approach. WSEAS Transactions on Systems, 9, 905–916.
Allen, J., Barlow, J., Leal, J., Maloutas, T., & Padovani, L. (2004). Housing and welfare in Southern Europe. Oxford: Blackwell – RICS foundation.
Anselin, L. (1988). Spatial econometrics: Methods and models. Netherlands: Kluwer Academic Publishers.
Bank of Greece (2012). Indices of residential property transaction (in Greek). Retrieved from http://www.bankofgreece.gr
Bivand, R. (2006). Implementing spatial data analysis software tools in R. Geographical Analysis, 38, 23–40.
Bourassa, S. C., Cantoni, E., & Hoesli, M. (2010). Predicting house prices with spatial dependence: A comparison of alternative methods. Journal of Real Estate Research, 32, 139–159.
Can, A. (1990). The measurement of neighborhood dynamics in urban prices. Journal of Economic Geography, 66, 254–272.
Chica-Olmo, J. (2007). Prediction of housing location price by a multivariate spatial method: Cokriging. Journal of Real Estate Research, 29, 91–114.
Curry, B. (2002). Neural networks and non-linear statistical methods: An application to the modeling of price–quality relationships. Computers & Operations Research, 29, 951–969.
De Bruyne, K. & Van Hove, J. (2013). Explaining the spatial variation in housing prices: An economic geography approach. Applied Economics, 45, 1673–1689.
Dewees, D. N. (1976). The effect of a subway on residential property values in Toronto. Journal of Urban Economics, 3, 357–369.
Din, A., Hoesli, M., & Bender, A. (2001). Environmental variables and real estate prices. Urban Studies, 38, 1989–2000.
Friesen, D. D., Patterson, M., & Harmel, R. (2011). A comparison of multiple regression and neural networks for forecasting real estate values. Regional Business Review, 30, 114–136.
Garcia, N., Gamez, M., & Alfaro, E. (2008). ANN+GIS: An automated system for property valuation. Neurocomputing, 71, 733–742.
Gevrey, M., Dimopoulos, I., & Lek, S. (2003). Review and comparison of methods to study the contribution of variables in artificial neural network models. Ecological Modelling, 160, 249–264.
Goodman, A. C. & Thibodeau, T. G. (1995). Age-related heteroskedasticity in hedonic house price equations. Journal of Housing Research, 6, 25–42.


Haykin, S. (2008). Neural networks and learning machines (3rd ed.). New Jersey, NJ: Pearson Prentice Hall.
Hendershott, P. H. & Weicher, J. C. (2002). Forecasting housing markets: Lessons learned. Real Estate Economics, 30(1), 1–11.
Hornik, K. (1991). Approximation capabilities of multilayer feedforward networks. Neural Networks, 4, 251–257.
Huang, B., Wu, B., & Barry, M. (2010). Geographically and temporally weighted regression for modeling spatio-temporal variation in house prices. International Journal of Geographical Information Science, 24, 383–401.
Kauko, T. (2003). On current neural network applications involving spatial modeling of property prices. Journal of Housing and the Built Environment, 18, 159–181.
Kissling, W. D. & Carl, G. (2008). Spatial autocorrelation and the selection of simultaneous autoregressive models. Global Ecology and Biogeography, 17, 59–71.
Lam, K. C., Yu, C. Y., & Lam, C. K. (2009). Support vector machine and entropy based decision support system for property valuation. Journal of Property Research, 26, 213–233.
Liao, W. C. & Wang, X. (2012). Hedonic house prices and spatial quantile regression. Journal of Housing Economics, 21, 16–27.
Longley, P., Higgs, G., & Martin, D. (1994). The predictive use of GIS to model property valuations. International Journal of Geographical Information Systems, 8, 217–235.
Lughofer, E., Trawinski, B., Trawinski, K., Kempa, O., & Lasota, T. (2011). On employing fuzzy modeling algorithms for the valuation of residential premises. Information Sciences, 181, 5123–5142.
McGreal, S. & Taltavull de La Paz, P. (2012). An analysis of factors influencing accuracy in the valuation of residential properties in Spain. Journal of Property Research, 29(1), 1–24.
Moller, M. F. (1993). A scaled conjugate gradient algorithm for fast supervised learning. Neural Networks, 6, 525–533.
Nguyen, N. & Cripps, A. (2001). Predicting housing value: A comparison of multiple regression analysis and artificial neural networks. Journal of Real Estate Research, 22, 313–336.
Olden, J. D. & Jackson, D. A. (2002). Illuminating the ‘black box’: A randomization approach for understanding variable contributions in artificial neural networks. Ecological Modelling, 154, 135–150.
Openshaw, S. & Openshaw, C. (1997). Artificial intelligence in geography. Chichester: John Wiley and Sons.
Osland, L. (2010). An application of spatial econometrics in relation to hedonic house price modeling. Journal of Real Estate Research, 32, 289–320.
Pace, R. K. & LeSage, J. P. (2009). Introduction to spatial econometrics. Boca Raton, FL: Chapman & Hall/CRC.
Paliwal, M. & Kumar, U. A. (2011). Assessing the contribution of variables in feed forward neural network. Applied Soft Computing, 11, 3690–3696.
Peterson, S. & Flanagan, A. B. (2009). Neural network hedonic pricing models in mass real estate appraisal. Journal of Real Estate Research, 31, 147–164.
Rodriguez, M., Sirmans, C. F., & Marks, A. P. (1995). Using geographic information systems to improve real estate analysis. Journal of Real Estate Research, 10, 163–173.
Rosen, S. (1974). Hedonic prices and implicit markets: Product differentiation in price competition. Journal of Political Economy, 82, 34–55.
Rovolis, A. & Tragaki, A. (2006). Ethnic characteristics and geographical distribution of immigrants in Greece. European Urban and Regional Studies, 13, 99–111.
Selim, H. (2009). Determinants of house prices in Turkey: Hedonic regression vs. artificial neural network. Expert Systems with Applications, 36, 2843–2852.
Sharda, R. (1994). Neural networks for the MS/OR analyst: An application bibliography. Interfaces, 24, 116–130.
Suarez, J. L. (2009). European real estate markets. Hampshire: Palgrave Macmillan.
Sung, A. H. (1998). Ranking importance of input parameters of neural networks. Expert Systems with Applications, 15, 405–411.


Visser, P., Van Dam, F., & Hooimeijer, P. (2008). Residential environment and spatial variation in house prices in Netherlands. Tijdschrift voor Economische en Sociale Geografie [Journal of Economic and Social Geography], 99, 348–360.
Worzala, E., Lenk, M., & Silva, A. (1995). An exploration of neural networks and its application to real estate valuation. Journal of Real Estate Research, 10, 185–201.
Zhang, G., Patuwo, B. E., & Hu, M. Y. (1998). Forecasting with artificial neural networks: The state of the art. International Journal of Forecasting, 14, 35–62.
Zietz, J., Zietz, E. N., & Sirmans, G. S. (2008). Determinants of house prices: A quantile regression approach. Journal of Real Estate Finance and Economics, 37, 317–333.