Email : elmanani_simamora@mail.ugm.ac.id. 2Department of Mathematics,. Gadjah Mada University, Yogyakarta, Indonesia;. Email : subanar@ugm.ac.id.
International Journal of Applied Mathematics and Statistics, Int. J. Appl. Math. Stat.; Vol. 53; Issue No. 5; Year 2015, ISSN 0973-1377 (Print), ISSN 0973-7545 (Online) Copyright © 2015 by CESER PUBLICATIONS

A Comparison Study of Parametric and Semiparametric Bootstrapping in Deterministic Simulation

Elmanani Simamora1, Subanar2 and Sri Haryatmi Kartiko3

1 Department of Mathematics, State University of Medan, North Sumatera, Indonesia; Email: [email protected]
2 Department of Mathematics, Gadjah Mada University, Yogyakarta, Indonesia; Email: [email protected]
3 Department of Mathematics, Gadjah Mada University, Yogyakarta, Indonesia; Email: [email protected]

ABSTRACT

In a deterministic simulation model, the kriging predictor is an exact interpolator, which ignores the randomness of errors in the sampled Input/Output (I/O) data. Plugging kriging model parameter estimates based on the sampled I/O data into the kriging predictor produces a biased variance estimator that underestimates the true variance. This paper is a comparative study of two methods for obtaining unbiased generic estimators of the kriging prediction variance, both of which insert randomness of the sampled output into the deterministic simulation model. This randomness is generated by semiparametric bootstrapping and by parametric bootstrapping. The performance of the two bootstrapping methods is measured on a small number of unsampled input points (untried points) by considering: (i) the estimated values of both generic estimators of kriging variance (bootstrap kriging variance), and (ii) the coverage probability and length of the confidence intervals of the kriging prediction for both bootstrapping methods. Simulations with bootstrap sample size B = 10000 and various data dimensions showed smaller estimates of the semiparametric bootstrap kriging variance than of the parametric one. The coverage probabilities of the semiparametric and parametric bootstrap percentile intervals were exactly the same as the nominal coverages, while the standard normal coverage based on parametric bootstrapping was very different from the nominal coverage. The estimated confidence intervals based on semiparametric bootstrapping were shorter than those based on parametric bootstrapping. In general, semiparametric bootstrapping performed better than parametric bootstrapping.

Keywords: Kriging, Variance, Bootstrapping, Parametric, Semiparametric

Mathematics Subject Classification: 62K05, 62M30 and 62F40.

1. INTRODUCTION

Kriging is a prediction method that interpolates using a linear combination of all sampled responses to predict unsampled responses; it was initially used in the field of geostatistics. The term “krigeage” (kriging) was coined by Matheron as a homage to the founder, Prof. D.G. Krige, a mining engineer from South Africa; see Forrester et al. (2008), p.50. Sacks et al. (1998) apply the kriging method to computer-based experiments or simulation to determine an approximation model, which is then called a model of the model, or metamodel. A metamodel is thus an approximation of the behavior of the sampled Input/Output (I/O) function, which supports optimization based on the simulation model. Several types of metamodel can be used, such as polynomial regression metamodels, kriging metamodels and radial basis function metamodels. Simulation models, in contrast, are of only two types, deterministic and random (stochastic); see Fang et al. (2006), Kleijnen (2008). In this study we focus only on the kriging metamodel and the deterministic simulation model. This combination is often used in simulation-optimization for the design of complex systems in Engineering and Management Science/Operations Research.

The first step in constructing a kriging metamodel, also known as a kriging model, is selecting or sampling input points using an experimental design. An experimental design is used because the correlation structure of the input points is not known with certainty, so a computer experiment is used. The input points are therefore random variables, while the outputs (simulation responses) of the selected multi-modal test function are matched to the sampling design of the input points that has been obtained. To reduce the uncertainty of these input points, we combine the input points, or experimental design scenarios, only once, so that the same scenarios produce the same outputs. Several algorithms can be used for this, for example McKay et al. (2000), Kenny et al. (2000) and Müller (2012). The result is called the sampled data, or sampled Input/Output (I/O) set, which is used to estimate the kriging model parameters and to predict at unsampled input points (untried points). A kriging model usually uses a relatively small sample size so that the deterministic simulation model can reduce simulation time.
To estimate the (hyper)parameters of the kriging model we use the sampled I/O data, where the sampled output vector and the unsampled simulation outputs (kriging predictions) are assumed to be multivariate Gaussian and a Gaussian process, respectively. The parameters are estimated by Maximum Likelihood Estimators (MLEs). Substituting these parameter estimates into the kriging predictor gives the plug-in kriging predictor, which is the Empirical Best Linear Unbiased Predictor (EBLUP). Its Mean Squared Prediction Error (MSPE), for convenience called the “plug-in kriging variance”, is biased and underestimates the true variance; see Wang and Wall (2003), Hertog et al. (2006), Kleijnen (2008). This is because plug-in kriging predictors ignore epistemic or subjective uncertainty. In other words, plug-in kriging predictors ignore the randomness of the kriging model parameters due to the sampling of output points.

To obtain kriging variance estimators that are unbiased rather than underestimates, Wang and Wall (2003) insert uncertainty into the prediction interval. Kleijnen (2011) and Dellino et al. (2011) insert uncertainty into their kriging prediction (Bootstrapped Kriging) and state that uncertainty in deterministic simulation comes from three sources: (i) computational noise, (ii) epistemic uncertainty, and (iii) simulation models that use Pseudo-Random Numbers (PRNs). Xie et al. (2013) divide the sources of uncertainty into two: computational noise (aleatory) and input uncertainty (epistemic). Wang and Wall (2003), Hertog et al. (2006), Kleijnen (2008) and Kleijnen (2011) insert epistemic uncertainty using the parametric bootstrap method to obtain bootstrap resamples of the outputs at the sampled inputs. Wang and Wall (2003) apply it to a stochastic simulation model with a two-dimensional input variable. Hertog et al. (2006), Kleijnen (2008) and Kleijnen (2011) apply it to deterministic simulation with input variables of higher dimension. Simamora et al. (2014) call this procedure “directly-parametric bootstrapping”, following Wang and Wall (2003), who state that there are two methods to estimate the unbiased kriging variance, or generic estimators of the kriging predictor variance: (i) estimating the variability of the kriging variance that arises because the semi-variogram (correlation) parameters are estimated, and (ii) using the parametric bootstrap method directly to estimate the kriging variance. As a result, Wang and Wall (2003) derive two formulas for generic estimators of the bootstrapped kriging variance, which Simamora et al. (2014) call generic estimators of kriging variance due to directly-parametric bootstrapping and nondirectly-parametric bootstrapping.

We, Simamora et al. (2014), proposed a new procedure using the semiparametric bootstrap method to insert uncertainty into deterministic simulation. We showed that the procedure for generating resamples (bootstrap samples) of the output based on directly-semiparametric bootstrapping was better than that based on nondirectly-semiparametric bootstrapping. To limit the number of resampling procedures, this study therefore focuses only on direct bootstrapping. This procedure was inspired by Schelin et al. (2010), but differs in the simulation and the input dimension. We follow the same deterministic simulation as Hertog et al. (2006), which allows input variables of higher dimension. The difference from Hertog et al. (2006) relates to Monte Carlo sampling, as described above.

The main contribution of this paper is a comparison of the performance of parametric and semiparametric bootstrapping by considering: (i) the estimated values of both generic estimators of kriging variance (bootstrap kriging variance), and (ii) the coverage probability and length of the confidence intervals of the kriging predictions for both bootstrapping methods. One way to assess the accuracy of a kriging prediction is through the kriging variance. To see which of the generic estimators of kriging variance (bootstrap kriging variance) is more efficient with respect to the accuracy of the kriging prediction, we consider which of the two yields the smaller value. To see the effects of both estimators on interval length and coverage probability, we look for the shortest estimated interval whose estimated coverage probability approaches the desired nominal probability. A review of both methods for Efficient Global Optimization (EGO) in simulation-optimization is left to a future paper.
In this paper, we focus only on statistical performance. After this introduction, the paper is organized as follows. Section 2 summarizes the kriging model with a zero-order polynomial regression model (see Lophaven et al. (2002), p.5), also known as ordinary kriging (see Cressie (1993), p.119). Section 2 also summarizes the derivation of the variance of the kriging prediction, which is the “Best Linear Unbiased Predictor (BLUP)”. Section 3 summarizes the bootstrap algorithms for the generic estimators of kriging variance (the directly-parametric and directly-semiparametric bootstrapping algorithms) and derives the formulations for coverage probability and confidence interval estimation. Section 4 presents the simulation results. Lastly, we present conclusions drawn from the simulation results.

2. ORDINARY KRIGING

A kriging model generally consists of two components, a regression component and a stochastic component, where $y(x)$ is assumed to be a realization of a random process $Y(x)$. Here $x \in X \subset \mathbb{R}^n$ denotes the factor or controlled variable, that is, the input point (input variable), of dimension $n$, where $X$ is the experimental area or input variable space; see Fang et al. (2006), p.5. Without loss of generality, we select the zero-order polynomial regression model, or ordinary kriging,

$$Y(x) = \beta + Z(x) \tag{1}$$

where the regression component is $\beta$, an unknown constant trend, and the stochastic component is $Z(x)$, a second-order stationary Gaussian process with mean zero, $E(Z(x)) = 0$, and constant process variance, $E(Z(x)Z(x)) = \sigma^2$. The covariance between $Z(x)$ and $Z(t)$ is $C(x, t) = \sigma^2 R(x, t)$, where $x$ and $t$ are two different sample locations (input points) and $R(x, t)$ is the correlation between them, which depends only on the Euclidean distance $|x - t|$. Candidate correlation functions are given in Lophaven et al. (2002). We assume the Gaussian correlation function, as it is the most popular correlation function in simulation,

$$R(x, t, \theta) = \prod_{i=1}^{n} \exp\left[-\theta_i |x_i - t_i|^2\right] \tag{2}$$
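The Gaussian correlation function in equation (2) can be illustrated with a short NumPy sketch. The function names `gaussian_correlation` and `correlation_matrix` are hypothetical and chosen here for illustration only; they are not taken from the paper or from any kriging toolbox.

```python
import numpy as np

def gaussian_correlation(x, t, theta):
    """Equation (2): prod_i exp(-theta_i * |x_i - t_i|**2) for two input points."""
    x, t, theta = (np.asarray(a, dtype=float) for a in (x, t, theta))
    # A product of exponentials equals the exponential of the summed exponents.
    return float(np.exp(-np.sum(theta * (x - t) ** 2)))

def correlation_matrix(X, theta):
    """Pairwise correlations R(x_i, x_j, theta) for the rows x_i of X."""
    X = np.asarray(X, dtype=float)          # shape (m, n): m points, n dimensions
    theta = np.asarray(theta, dtype=float)  # shape (n,), all entries positive
    diff = X[:, None, :] - X[None, :, :]    # shape (m, m, n)
    return np.exp(-np.einsum("ijk,k->ij", diff ** 2, theta))
```

A larger `theta[i]` makes the correlation decay faster along input dimension `i`, which is why the vector of these values controls the roughness of the fitted surface.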

where $\theta_i \in \theta \subset \mathbb{R}^n$, $\theta_i > 0$, and the vector $\theta$ is the roughness parameter of the kriging prediction surface. To cover the experimental area, input data are obtained at several sampled locations, $X = [x_1, x_2, \ldots, x_m]^T$, and $\psi$ is the $m \times m$ correlation matrix with elements

$$R_{ij} = R(x_i, x_j, \theta), \quad i, j = 1, 2, \ldots, m \tag{3}$$

while $r(x_0) = [R(x_1, x_0, \theta) \cdots R(x_m, x_0, \theta)]^T$ is the vector of correlations between each $Z(x_i)$ in the input data and $Z(x_0)$, where $x_0$ is an untried point. The kriging prediction at $x_0$, which is the Best Linear Unbiased Predictor (BLUP), is expressed as $\hat{y}(x_0) = \lambda^T Y_X$, where $\lambda = [\lambda_1 \cdots \lambda_m]^T$ and $Y_X = [y(x_1) \cdots y(x_m)]^T$ denote the weight vector and the sampled output, respectively. To guarantee that the kriging prediction at $x_0$ is the BLUP, it must be unbiased, $E[\hat{y}(x_0) \mid Y_X] = E[y(x_0) \mid Y_X]$, and must minimize the Mean Squared Prediction Error (MSPE),

$$\min_{\lambda} E[(\hat{y}(x_0) - y(x_0))^2]. \tag{4}$$
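As a hedged aside, the objective in (4) can be written explicitly in terms of the weights. The sketch assumes that the weights sum to one, and that the process variance $\sigma^2$, the correlation matrix $\psi$ of the sampled points and the correlation vector $r(x_0)$ are treated as known. The symbol $Z_X = [Z(x_1) \cdots Z(x_m)]^T$ is notation introduced here for illustration and does not appear in the paper.

```latex
\begin{aligned}
\hat{y}(x_0) - y(x_0)
  &= \lambda^T Y_X - Y(x_0)
   = \lambda^T Z_X - Z(x_0)
   \quad \text{(the trend } \beta \text{ cancels because } \lambda^T 1_m = 1\text{)} \\
E\big[(\hat{y}(x_0) - y(x_0))^2\big]
  &= \sigma^2 \left( \lambda^T \psi \lambda - 2 \lambda^T r(x_0) + 1 \right)
\end{aligned}
```

Minimizing this quadratic form in $\lambda$ under the sum-to-one constraint is the constrained problem that the Lagrangian approach solves.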

Lophaven et al. (2002) solve (4) using a Lagrangian function with the constraint $\sum_{i=1}^{m} \lambda_i = 1$, to produce

$$\lambda = \psi^{-1}\left[r(x_0) - 1_m \left(\frac{1_m^T \psi^{-1} r(x_0) - 1}{1_m^T \psi^{-1} 1_m}\right)\right]$$

with $1_m = [1 \cdots 1]^T$. Generalized Least Squares (GLS) produces

$$\beta = \frac{1_m^T \psi^{-1} Y_X}{1_m^T \psi^{-1} 1_m},$$

and by substituting $\lambda$ into $\hat{y}(x_0) = \lambda^T Y_X$, the BLUP kriging prediction becomes

$$\hat{y}(x_0) = \beta + r^T(x_0) \psi^{-1} (Y_X - 1_m \beta) \tag{5}$$

The kriging variance of the predictor $\hat{y}(x_0)$, where the MSPE equals the kriging variance because the kriging predictor has no bias, is

$$\mathrm{MSPE}(\hat{y}(x_0)) = \sigma^2 \left[1 + \frac{[1_m^T \psi^{-1} r(x_0) - 1]^2}{1_m^T \psi^{-1} 1_m} - r^T(x_0) \psi^{-1} r(x_0)\right].$$
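The predictor in (5) and the kriging variance formula can be sketched numerically as follows. This is a minimal illustration rather than the authors' implementation: the names `gauss_corr` and `ordinary_kriging` are hypothetical, the Gaussian correlation of equation (2) is assumed, and `theta` and `sigma2` are passed in as known values although in practice they would have to be estimated.

```python
import numpy as np

def gauss_corr(A, B, theta):
    """Gaussian correlations R(a_i, b_j, theta) between the rows of A and B."""
    diff = A[:, None, :] - B[None, :, :]
    return np.exp(-np.einsum("ijk,k->ij", diff ** 2, theta))

def ordinary_kriging(X, y, x0, theta, sigma2):
    """Return (prediction, MSPE) at the untried point x0.

    Assumption for this sketch: theta and sigma2 are given, not estimated.
    """
    X = np.asarray(X, dtype=float)                   # (m, n) sampled inputs
    y = np.asarray(y, dtype=float)                   # (m,) sampled outputs
    x0 = np.asarray(x0, dtype=float).reshape(1, -1)  # (1, n) untried point
    theta = np.asarray(theta, dtype=float)
    ones = np.ones(X.shape[0])

    psi = gauss_corr(X, X, theta)        # m x m correlation matrix
    r = gauss_corr(X, x0, theta)[:, 0]   # correlations with x0

    # Solve linear systems instead of forming the inverse of psi explicitly.
    psi_inv_1 = np.linalg.solve(psi, ones)
    psi_inv_r = np.linalg.solve(psi, r)
    denom = ones @ psi_inv_1

    beta = (psi_inv_1 @ y) / denom                   # GLS estimate of the trend
    y_hat = beta + psi_inv_r @ (y - beta * ones)     # equation (5)
    mspe = sigma2 * (1.0 + (ones @ psi_inv_r - 1.0) ** 2 / denom - r @ psi_inv_r)
    return y_hat, mspe
```

At a sampled input the sketch reproduces the observed output and returns a variance of zero up to rounding error, which matches the exact-interpolation property of kriging in deterministic simulation.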

Generally, (hyper)parameter of kriging model