Space-time modelling without distance

D G T Denison1

P Dellaportas2

B K Mallick3

October 1998

1 Dept. of Mathematics, Imperial College, 180 Queen's Gate, London, SW7 2BZ, U.K.
2 Dept. of Statistics, Athens University of Economics and Business, Patission 76, 104 34 Athens, Greece
3 Dept. of Statistics, Texas A&M University, College Station, TX 77843-3143, U.S.A.

Abstract

We present a novel method for analysing space-time data when response data are given at a finite number of locations and the aim is to predict the response at a new location where only a short run of data is available. This is the type of dataset typically available when analysing wind velocity data, and we demonstrate our method, and compare it with that introduced by Haslett and Raftery (1989), on a set of data collected on the island of Crete in Greece. Typically the distance between locations is used to define the correlation matrix between responses at distinct locations, even though this cannot always be justified. The peculiarity of our data is that the sites lie in a complex topography, so differences in the local characteristics of the wind stations, the direction of the prevailing winds, and other unobserved covariates can all lead to unsuitable model fitting. We use a nonparametric model to avoid these problems and demonstrate its predictive power on the dataset under study.

Keywords: Bayesian methods; Markov chain Monte Carlo; Multivariate adaptive regression splines; Nonparametric model.


1 Introduction

Recently, attention to the development of sophisticated methodologies which determine the wind energy resource has increased (see, for example, Haslett and Raftery, 1989; Huang and Chalabi, 1995; Jamil, Parsa and Majidi, 1995). Interest usually lies in understanding and predicting the performance of wind turbines, which inevitably requires knowledge of the behaviour and structure of the wind itself. This knowledge is important, amongst other reasons, for the location design of wind turbines as well as for the estimation of their potential to carry the imposed loads at an economically viable cost.

Research into alternative forms of renewable energy has led to much interest in the possibility of harnessing significant power from winds. Thus, estimating the average wind speed over a long period of time is important as a general indicator of possible power at a new station. However, turbines cannot operate at extreme wind speeds: in low winds there is not enough energy to drive the turbines, and in high winds the turbines are turned off to prevent possible damage. These considerations make it useful to predict how likely such days are at a new location.

In this paper we deal with data from a large experiment which investigated the possibility of estimating the wind energy potential at a given site given long records available at other sites and a considerably smaller number of records at the new site. A similar dataset was analysed by Haslett and Raftery (1989), henceforth referred to as HR. Their study involved wind speeds at 12 stations across Ireland collected over an 18-year period from 1961 to 1978. They discovered long-range dependencies in the data and so developed a methodology based on fractional differencing to allow the model to take this dependency structure into account.
However, standard kriging techniques (Journel and Huijbregts, 1978; Ripley, 1981) were used as a starting point, and these require parametric estimation of the covariance structure between locations, which assumes isotropy across the stations. Sampson and Guttorp (1989) highlight some of the problems that may have occurred in HR's analysis. Sampson and Guttorp used a nonparametric method which allows for possibly nonstationary covariance structures by scaling the spatial dispersions between locations in such a way as to make the spatial dispersion function between the new locations stationary and isotropic. This allows correlations between spatial locations not to be a monotone function of the distance between them. This method is known as multidimensional scaling, and in Sampson and Guttorp (1992) it is used to analyse a dataset of solar radiation levels. However, it is not a completely nonparametric method since, after the scaling, parametric covariance structures need to be estimated as before.

We propose to use the completely nonparametric Bayesian multivariate adaptive regression spline (BMARS) model (Denison, Mallick and Smith, 1998b) as a basis for predictions that avoid the problems highlighted in the last paragraph. The BMARS model was developed as an extension to the classical MARS model discussed in Friedman (1991). Any nonparametric model (neural network, projection pursuit regression, local polynomial fitting) could be used in the framework we suggest, but we use BMARS because of its ability to fit the data well in the presence of, perhaps many, irrelevant predictors and its overall interpretability. The novelty of the nonparametric approach is that the distance between the locations is not used to make the inferences. This is plausible for our dataset since the ground topography is complex and, unlike the similar dataset analysed by HR, there does not seem to be a systematic relation between inter-station distance and correlation. Therefore, we felt we could not reliably assign a parametric model to this relationship. We stay completely within the nonparametric framework and use the responses at other locations to predict the response at the location where predictions are to be made. We use the short run of data at the prediction location to 'train' the nonparametric model and then predict at this location over the rest of the time points with this model. Thus we have a simple multiple regression setup, comprising a training set and a test set, with which to test the efficacy of our model.
The BMARS model produces a sample of models generated from the (approximate) posterior distribution of the model parameters given the data. The sample allows simple evaluation of both linear and nonlinear functions of the estimated responses; in particular, it allows simple computation of the mean and variance of the wind speed over the entire time interval.
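As a rough illustration of how a posterior sample of fitted responses can be turned into summaries of linear and nonlinear functionals, consider the sketch below. It is not the BMARS sampler: the array `draws` is a synthetic, hypothetical stand-in for the sampled fits, and the function name `posterior_functional` and the 95% interval are our own illustrative choices.

```python
import numpy as np

def posterior_functional(draws, g):
    """Evaluate a functional g on each posterior draw and summarise it.

    draws : array of shape (S, T); row s holds the fitted responses at
            T time points under the s-th sampled model (assumed layout).
    g     : function mapping a length-T vector to a scalar.
    """
    values = np.array([g(row) for row in np.asarray(draws, dtype=float)])
    lower, upper = np.percentile(values, [2.5, 97.5])
    return {"mean": values.mean(), "lower": lower, "upper": upper,
            "values": values}

# Hypothetical stand-in for S = 500 sampled fits over T = 300 days;
# real draws would come from the fitted model, not from a random generator.
rng = np.random.default_rng(0)
draws = rng.normal(loc=2.5, scale=0.3, size=(500, 300))

# Linear functional: the time-average of the response.
time_mean = posterior_functional(draws, np.mean)

# Nonlinear functionals: the time-average of the squared response and
# the variance of the response over the time interval.
mean_square = posterior_functional(draws, lambda x: np.mean(x ** 2))
time_var = posterior_functional(draws, lambda x: np.var(x, ddof=1))
```

Because each functional is evaluated draw by draw, its uncertainty follows directly from the spread of `values`, with no further approximation needed for nonlinear functionals.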

2 The Data

The data we analyse in this paper came from a project which took place on the island of Crete, one of Greece's prime areas for large-scale wind power development. A wind monitoring system of 10 stations was developed, spanning an approximately 3500 km$^2$ region of the island. The main criterion for selecting the sites was that each should be representative of a distinct flow. Thus, hills and ridges with different elevations and slopes, along with locations on gently undulating terrain, were chosen for mast locations, and wind speed measurements at each location were collected for one year. For more details see Glinou et al. (1995).

The complete dataset consists of 10-minute averages of the wind speeds in metres per second (m s$^{-1}$), collected for one year starting at the beginning of April 1993. However, as the analysis that we shall present focuses on the daily wind speed averages, we had to exclude 2 of the 10 stations because they had over 100 days' worth of missing values. We follow many other researchers in this field (Carlin and Haslett, 1982; Brown, Katz and Murphy, 1984; HR) and work with the square root transformation of the wind speed as the raw data. This helps to make the marginal distributions appear more normal, as without transformation they are noticeably asymmetric. In the spirit of HR we call the transformed wind speed data velocity measures. The complete dataset is plotted in Fig. 1 and shows that the variances of the velocity measures, as well as the means, are not the same for each station.

We choose not to try to deseasonalise the data, although this is a possible source of error. With only one year's worth of data we cannot accurately estimate the seasonal trends and, although there is evidence of 35 and 105 day periodicities in the data, the stations appear too different for an overall seasonal effect to be gauged.
In early work we did attempt to deseasonalise the data, and the predictions obtained were not as good as those obtained using just the standard velocity measures.

In Fig. 3 we display the relative positions of the 8 wind stations used, together with a topographical map of the area (Fig. 2). The terrain surrounding the stations is covered with sparse and low-lying, typically Mediterranean, vegetation. Image plots of the absolute distances and the correlations between the stations are given in Figs. 4 and 5. From these figures we can see that there are three distinct clusters of stations: (1,2,3), (4,5) and (6,7,8). This design reveals the effort made by the engineers to establish reference sites representative of the wind conditions that would prevail over the studied region in the absence of terrain features causing locally induced circulations. For this reason sites located in coastal areas on the northern and southern parts of the island were selected.

The stations within each of the three clusters do have higher correlations with each other than with the remaining stations; however, this does not give the full picture. Fig. 5 reveals that stations 6 and 7 seem to be more highly correlated than might be expected, and station 8 seems to be correlated with all the other stations but with none significantly. In fact its highest correlation is with station 3, which is a long distance away. A distance-correlation plot for the data is given in Fig. 6. This summarises the image plots and clearly shows that correlation and distance are not related in a simple manner, and that a linear approximation for the trend would be unsuitable. This plot corresponds to Fig. 3 in HR, where they find that in their data a linear relationship may exist, but only if the data from Rosslare are excluded.

Commonly, when a possible location for a new wind generator is under investigation, a small sample of data, usually about 2 months' worth, is recorded and the viability of the new station is assessed using these data. Wind speeds at other locations are known for a long period of time both before and after the time when the new data were collected, and the aim is to assess the possible wind resource that might have been harnessed at the proposed location. Thus a prediction of the mean wind speed at the new site over the long period is important.
The power generated by a wind turbine is not a linear function of wind speed, so a method of making predictions of the velocity measures on a smaller time scale would be useful. Standard kriging techniques, and even more refined ones (HR), only give an estimate of the overall velocity measure, which is one reason we introduce a method that makes predictions at every distinct time point at which the data are collected.
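The data preparation and the distance-correlation diagnostic described in this section can be sketched as follows. This is only an illustration under stated assumptions: the order of operations (daily averaging first, then the square root), the use of Euclidean distance on planar coordinates, and the absence of missing values are all assumptions, and the function names and synthetic inputs are hypothetical rather than taken from the study.

```python
import numpy as np

def velocity_measures(ten_min_speeds, per_day=144):
    """Daily mean wind speed followed by a square root transformation.

    ten_min_speeds : array of shape (n_days * per_day, n_stations) of
                     10-minute averages; 144 of these make up one day.
    Assumes no missing values and averaging before the square root.
    """
    x = np.asarray(ten_min_speeds, dtype=float)
    n_days = x.shape[0] // per_day
    daily = x[: n_days * per_day].reshape(n_days, per_day, -1).mean(axis=1)
    return np.sqrt(daily)

def distance_correlation_pairs(coords, velocity):
    """Return (distance, correlation) for every distinct pair of stations.

    coords   : array (n_stations, 2) of planar coordinates (assumed).
    velocity : array (n_days, n_stations) of velocity measures.
    """
    coords = np.asarray(coords, dtype=float)
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))      # Euclidean distances
    corr = np.corrcoef(velocity, rowvar=False)    # station-by-station
    i, j = np.triu_indices(coords.shape[0], k=1)  # each pair once
    return dist[i, j], corr[i, j]

# Synthetic example: 8 hypothetical stations observed for 30 days.
rng = np.random.default_rng(1)
speeds = rng.gamma(shape=2.0, scale=3.0, size=(30 * 144, 8))
coords = rng.uniform(0.0, 100.0, size=(8, 2))
v = velocity_measures(speeds)
d, r = distance_correlation_pairs(coords, v)  # 28 station pairs
```

Plotting `r` against `d` gives a distance-correlation plot of the kind discussed above; with real data the plot would show whether a simple monotone relationship is tenable.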

3 The Model

3.1 The general nonparametric model

Suppose that we have data at $m$ locations, labelled $1, \ldots, m$, and that at all of these locations apart from one, $k$ say, we have wind velocity measures on $N$ consecutive days, $t = 1, \ldots, N$. At location $k$ we only have a short run of $n$ datapoints taken on days $t = t_0, \ldots, t_0 + n - 1$. From this short run of data we wish to predict the velocity measures at location $k$ on the unobserved days; in particular, we wish to determine the potential wind power resource at location $k$ over the whole period of time.

We choose to model the wind speed at location $k$ with a general nonparametric model. This captures both the local characteristics of the site and its dependence on the other locations. The covariates chosen to model the response determine whether a purely spatial or a temporal-spatial model is used. Thus, putting $X_t = (X_{1t}, \ldots, X_{mt})^T$, where $X_{kt}$ is the velocity measure at place $k$ on day $t$, we model

$$X_{kt} = f\left(X^{-k}_{t-p}, \ldots, X^{-k}_{t}, \ldots, X^{-k}_{t+q}\right) + \epsilon_{kt} \qquad (1)$$

where the $-k$ superscript denotes all the elements in the vector except the $k$th, and $\epsilon_{kt}$ is the residual error between the regression model and the response $X_{kt}$ which, henceforth, we shall assume to be normally distributed with variance $\sigma^2$ and independent of all the other errors. The number of covariates used is $(p + q + 1)(m - 1)$, where $p$ and $q$ are the numbers of past and future lags used to model the response, respectively.

After setting up this general model we may, in theory, use any regression method we choose to model the true regression function $f$. Nonlinear regression methods are well known to be useful for prediction but can be difficult to interpret and can overfit when many of the covariates are unimportant. However, the multivariate adaptive regression spline (MARS) model of Friedman (1991) suffers from neither of these problems, which is why we concentrate on a Bayesian extension of it in this paper. Analysing nonlinear time series using classical MARS was first proposed by Lewis and Stevens (1991), and modelling $f$ with a MARS model is just the semi-multivariate adaptive spline threshold autoregressive (SMASTAR) model (Lewis and Ray, 1997).
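The regression setup of this subsection can be made concrete with a short sketch that assembles the covariates from the other stations at the chosen past and future lags and marks the training days. The function names, the 0-based day indexing, and the synthetic data are illustrative assumptions; no particular regression method is fitted here.

```python
import numpy as np

def lagged_design(X, k, p, q):
    """Build the covariates and response for station k.

    X : array (N, m); X[t, j] is the velocity measure at station j on
        day t (0-based indexing is an assumption of this sketch).
    Returns (days, Z, y): the usable days, the covariate matrix with
    (p + q + 1) * (m - 1) columns, and the response at station k.
    """
    X = np.asarray(X, dtype=float)
    N, m = X.shape
    others = np.delete(X, k, axis=1)       # all stations except k
    days = np.arange(p, N - q)             # days with every lag available
    Z = np.hstack([others[days + lag] for lag in range(-p, q + 1)])
    y = X[days, k]
    return days, Z, y

def short_run_mask(days, t0, n):
    """Boolean masks for the training run t0, ..., t0 + n - 1 and the rest."""
    train = (days >= t0) & (days < t0 + n)
    return train, ~train

# Synthetic example: m = 8 stations, N = 365 days, one past and one future lag.
rng = np.random.default_rng(2)
X = rng.normal(size=(365, 8))
days, Z, y = lagged_design(X, k=7, p=1, q=1)
train, test = short_run_mask(days, t0=100, n=60)
# Z has (1 + 1 + 1) * (8 - 1) = 21 columns; Z[train], y[train] form the
# training set and Z[test] the covariates for the days to be predicted.
```

Any regression method can then be trained on `Z[train]` and `y[train]` and evaluated on the held-out days, which is the training-set and test-set arrangement described in the text.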

3.2 Classical MARS

A full description of the MARS model can be found in the seminal paper by Friedman (1991); in this section we give a brief outline of the model, following the notation of Friedman (1991) throughout. The MARS method is both highly flexible and easily interpretable. It was motivated by the recursive partitioning approach to regression (Morgan and Sonquist, 1963; Breiman et al., 1984) but produces a continuous model which can be made to have continuous derivatives and has greater flexibility to model relationships that are nearly additive or involve at most a few variables. The model can be represented in such a way that the additive contributions of each predictor variable and the interactions between variables can be easily identified, which helps to identify the variables that are important in the model. We can write the estimate of the regression function using a MARS model as

$$\hat{f}(x) = \sum_{i=1}^{K} \beta_i B_i(x) \qquad (2)$$

where $x \in D$ and the $\beta_i$ ($i = 1, \ldots, K$) are the suitably chosen coefficients of the basis functions $B_i$, and $K$ is the number of basis functions in the model. The $B_i$ are given by