History Matching Channelized Facies Models using Ensemble Smoother with a Deep Learning Parameterization

Smith W. A. Canchumuni*, Alexandre A. Emerick† and Marco Aurelio C. Pacheco*

July 2, 2018

*PUC-Rio  †Petrobras

Abstract

Ensemble data assimilation methods have been successfully applied in several real-life history-matching problems. However, because these methods rely on Gaussian assumptions, their performance is severely degraded when the prior geology is described in terms of complex facies distributions. This work introduces a novel parameterization based on deep learning for history matching facies models with ensemble methods. The proposed method consists of a parameterization of geological facies by means of a deep belief network (DBN) used as an autoencoder. The process begins with a large set of facies realizations which are used for training the DBN. The trained network has two parts: an encoder and a decoder function. The encoder is used to construct a continuous parameterization of the facies which is iteratively updated to account for observed production data using the method ensemble smoother with multiple data assimilation (ES-MDA). After each iteration of ES-MDA, the decoder is used to reconstruct the facies realizations. The proposed method is tested in three synthetic history-matching problems with channelized facies constructed with multiple-point geostatistics. We compare the results of the DBN parameterization against the standard ES-MDA (with no parameterization) and the recently proposed optimization-based principal component analysis (OPCA). Our results show that all procedures are able to match the observed production data. However, standard ES-MDA failed to generate channel facies with well-defined boundaries. OPCA and DBN parameterizations improved the facies description, resulting in the expected bi-modal distributions of log-permeability. This paper reports our initial results on an ongoing investigation with deep learning. Nevertheless, the results presented here indicate great potential for the use of deep learning technologies in the inverse modeling of petroleum reservoirs.

Introduction

Ensemble data assimilation methods are Monte Carlo techniques inspired by the Kalman filter and formulated in a Bayesian framework. These methods have been successfully applied in several real-life history-matching problems in the last decade. Among the advantages of these methods, we can highlight the simplicity of implementation and the ability to deal with a large number of model parameters while keeping the computational cost relatively low. Yet, these methods have well-documented deficiencies, which can limit their application in complex cases. These deficiencies can be associated with two main aspects: (i) the Gaussian and linearity hypotheses present in the formulation of the methods; (ii) the use of relatively small ensembles. In history matching, deficiencies associated with the nonlinearity of the forward model are usually handled with iterative forms of ensemble smoothers; see, e.g., (Chen and Oliver, 2013; Emerick and Reynolds, 2013; Stordal and Elsheikh, 2015; Luo et al., 2015), while deficiencies related to non-Gaussian priors are typically addressed with re-parameterization techniques. Problems related to the use of small ensembles are mitigated by the use of localization (Houtekamer and Mitchell, 2001).

Non-Gaussian priors are particularly evident in models with complex facies distributions, e.g., fluvial channels. The challenge of updating facies models comes from the fact that they are typically categorical, i.e., they can assume values such as "sand," "levee," "interlaminated," etc. Therefore, prior distributions of facies tend to be far from Gaussian. It is important to highlight that facies history matching is a challenge for the majority of the assisted methods, not only for those based on the Kalman filter. The number of publications proposing parameterizations for facies is quite extensive. Among these methods, truncated pluri-Gaussian simulation (TPG) (Armstrong et al., 2011) seems to be the most successful; see, e.g., (Liu and Oliver, 2005; Agbalaka and Oliver, 2008; Sebacher et al., 2013; Zhao et al., 2008). However, TPG is less appealing for re-parameterizing channelized models generated by object-based or multiple-point geostatistics algorithms. For these cases, there is intensive ongoing research dedicated to developing parameterizations based on extensions of principal component analysis (PCA); see, e.g., (Sarma et al., 2008; Vo and Durlofsky, 2014; Chen et al., 2016; Emerick, 2017).

Motivated by the results reported in (Emerick, 2017), we started an investigation into the use of nonlinear versions of PCA for facies history matching. In particular, we have focused on techniques based on deep learning, which is a subset of machine learning methods specialized in learning data representations in terms of a hierarchy of concepts (Goodfellow et al., 2016). In (Canchumuni et al., 2017), we investigated the use of autoencoders in a simple facies problem. The present paper reports the continuation of this work; in particular, we analyze the use of deep belief networks (Hinton et al., 2006; Hinton and Salakhutdinov, 2006).

The remainder of the paper is organized as follows: in the next sections we briefly review ensemble data assimilation and discuss the combination with the optimization-based PCA. Then, we present a short discussion about deep learning with emphasis on autoencoders, restricted Boltzmann machines and deep belief networks, which are the main elements of the parameterization investigated in this paper. We present three test cases with channelized facies. After that, we present some comments about the results, followed by the conclusions.

Ensemble Data Assimilation

ES-MDA

Although the parameterizations discussed in this paper could be used with any ensemble data assimilation method, here we use the ensemble smoother with multiple data assimilation (ES-MDA) (Emerick and Reynolds, 2013). The ES-MDA analysis equation for a vector of model parameters m ∈ R^{N_m} can be written as


$$\mathbf{m}_j^{k+1} = \mathbf{m}_j^k + \mathbf{R} \circ \mathbf{K}^k \left( \mathbf{d}_{obs} + \mathbf{e}_j^k - g\left(\mathbf{m}_j^k\right) \right), \qquad (1)$$

for j = 1, ..., N_e, with N_e denoting the ensemble size. The superscript k stands for the data assimilation index; d_obs ∈ R^{N_d} is the vector of observations; e_j^k ∈ R^{N_d} is the vector of random perturbations, obtained by sampling N(0, α_k C_e), with C_e ∈ R^{N_d×N_d} denoting the data-error covariance matrix and α_k denoting the ES-MDA inflation coefficient; g(m_j^k) ∈ R^{N_d} is the vector of predicted data. R is the localization matrix and "◦" denotes the Schur product. The matrix K^k is a modified version of the Kalman gain, which is given by

$$\mathbf{K}^k = \mathbf{C}_{md}^k \left( \mathbf{C}_{dd}^k + \alpha_k \mathbf{C}_e \right)^{-1}. \qquad (2)$$

In the above equation, C_md^k ∈ R^{N_m×N_d} is the matrix containing the cross-covariance values between model parameters and predicted data and C_dd^k ∈ R^{N_d×N_d} is the covariance matrix of predicted data. Both matrices are estimated using the ensemble at the kth data assimilation step. In the standard implementation of ES-MDA, the inflation coefficients α_k > 1 are applied a pre-defined number of times, N_a, and the set $\{\alpha_k\}_{k=1}^{N_a}$ must satisfy the condition $\sum_{k=1}^{N_a} \alpha_k^{-1} = 1$ (Emerick and Reynolds, 2013).
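For concreteness, the following NumPy sketch implements one ES-MDA analysis step (Eqs. 1 and 2) using the standard sample estimates of C_md^k and C_dd^k. The function name and interface are our own, and the forward-model responses are assumed to be computed externally.

```python
import numpy as np

def esmda_analysis(M, D, d_obs, C_e, alpha, R=None):
    """One ES-MDA analysis step (Eqs. 1 and 2).

    M     : (Nm, Ne) ensemble of model parameters at step k
    D     : (Nd, Ne) predicted data g(m_j^k) for each member
    d_obs : (Nd,)    observed data
    C_e   : (Nd, Nd) data-error covariance matrix
    alpha : inflation coefficient (the sum of 1/alpha_k must equal 1)
    R     : optional (Nm, Nd) localization matrix (Schur product)
    """
    Nm, Ne = M.shape
    Nd = d_obs.size

    # Sample covariances from the ensemble anomalies
    dM = M - M.mean(axis=1, keepdims=True)
    dD = D - D.mean(axis=1, keepdims=True)
    C_md = dM @ dD.T / (Ne - 1)   # cross-covariance, (Nm, Nd)
    C_dd = dD @ dD.T / (Ne - 1)   # predicted-data covariance, (Nd, Nd)

    # Modified Kalman gain, Eq. 2
    K = C_md @ np.linalg.inv(C_dd + alpha * C_e)
    if R is not None:
        K = R * K                 # Schur (element-wise) product

    # Perturb the observations with inflated noise and update, Eq. 1
    E = np.random.multivariate_normal(np.zeros(Nd), alpha * C_e, size=Ne).T
    return M + K @ (d_obs[:, None] + E - D)
```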

ES-MDA-OPCA

Vo and Durlofsky (2014) introduced the optimization-based PCA (OPCA) to construct a continuous and differentiable parameterization for facies. The idea behind OPCA is to introduce a regularization term in the least-squares function minimized by the standard PCA. This regularization term "pushes" the solution towards discrete values, typically zero or one for a model with two facies. Emerick (2017) combined OPCA with ES-MDA and showed that the resulting method outperformed ES-MDA combined with PCA and kernel PCA (Sarma et al., 2008; Schölkopf et al., 1998) for history matching a channelized facies model.

The resulting ES-MDA-OPCA method is straightforward to implement. The process starts with N_r realizations of facies, here denoted by x ∈ R^{N_x}, used to construct a low-rank approximation of the covariance matrix C_x. (In this paper, we are interested in binary facies fields, i.e., we want to determine x ∈ B^{N_x}, where B is the space of binary values, 0 and 1. However, with OPCA we write x ∈ R^{N_x} and try to obtain the entries of x as close as possible to 0 or 1.) PCA is applied to C_x such that we can construct the square root

$$\mathbf{C}_x^{1/2} = \frac{1}{\sqrt{N_r - 1}} \mathbf{U} \boldsymbol{\Sigma} \mathbf{U}^\top, \qquad (3)$$

where U is the N_x×N_x orthogonal matrix containing the left eigenvectors of C_x and (1/√(N_r−1))Σ is the N_x×N_r matrix containing the square roots of the eigenvalues of C_x. Note that in a practical implementation, we truncate small eigenvalues and it is not necessary to form the matrix C_x; see details in (Emerick, 2017). The square root in Eq. 3 was intentionally constructed as a symmetric matrix by introducing a multiplication by U^⊤ to allow the use of localization when combined with ES-MDA, as discussed in (Emerick, 2017). Given Eq. 3, we can generate PCA realizations of facies using

$$\mathbf{x}_{pca} = \bar{\mathbf{x}} + \mathbf{C}_x^{1/2} \mathbf{h}, \qquad (4)$$

where x̄ is the mean and h ∈ R^{N_x} is a vector of white noise, i.e., h ∼ N(0, I). Given a PCA realization, x_pca, we reconstruct each component of the facies vector, x̂, using


$$\hat{x}_i = \begin{cases} 0, & \text{if } x_{pca,i} \le \dfrac{\gamma}{2}, \\[4pt] 1, & \text{if } x_{pca,i} \ge 1 - \dfrac{\gamma}{2}, \\[4pt] \dfrac{x_{pca,i} - \gamma/2}{1 - \gamma}, & \text{if } \dfrac{\gamma}{2} < x_{pca,i} < 1 - \dfrac{\gamma}{2}, \end{cases} \qquad (5)$$

for i = 1, ..., N_x. γ ∈ [0, 1] is the regularization factor, typically γ = 0.8. During the data assimilation with ES-MDA-OPCA, an ensemble of N_e < N_r vectors h is updated to account for the observed production data. Before each reservoir simulation, the facies realization is reconstructed using Eqs. 4 and 5.
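A minimal sketch of the OPCA sampling and truncation steps (Eqs. 4 and 5); C_x^{1/2} is assumed to be precomputed from the training realizations as in Eq. 3, and the names are illustrative.

```python
import numpy as np

def opca_realization(x_mean, Cx_half, h, gamma=0.8):
    """Generate a PCA realization (Eq. 4) and map it to near-binary
    facies values with the OPCA truncation (Eq. 5)."""
    x_pca = x_mean + Cx_half @ h                     # Eq. 4, h ~ N(0, I)

    x_hat = (x_pca - gamma / 2.0) / (1.0 - gamma)    # middle branch of Eq. 5
    x_hat[x_pca <= gamma / 2.0] = 0.0                # facies 0
    x_hat[x_pca >= 1.0 - gamma / 2.0] = 1.0          # facies 1
    return x_hat

# Usage: h is the continuous vector updated by ES-MDA
# x_hat = opca_realization(x_mean, Cx_half, np.random.randn(Cx_half.shape[1]))
```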

Deep Learning

Deep learning (DL) is the general terminology used to refer to a series of machine learning techniques developed to represent data with a high level of abstraction. Essentially, DL refers to neural networks with multiple layers, hence the name "deep." The first successful use of a deep architecture was the deep belief network (DBN) developed by Hinton et al. (2006). Before the DBN, neural networks with multiple layers were considered too difficult to optimize (Goodfellow et al., 2016).

In (Canchumuni et al., 2017), we investigated the use of autoencoders to parameterize facies in a simple history-matching problem. The basic idea is to use the network to re-parameterize facies realizations into continuous fields which can be updated using an ensemble-based method. Before we started the investigation on autoencoders, however, we investigated the use of extensions of PCA; see (Emerick, 2017). The motivation for choosing autoencoders emerged from the fact that these networks can be interpreted as nonlinear generalizations of PCA (Deng et al., 2017). The results reported in (Canchumuni et al., 2017) were promising. However, when we tested standard autoencoders on more complex facies distributions generated with multiple-point geostatistics, such as the ones presented in this paper, we observed convergence difficulties during the training of the networks. For this reason, we moved to DBNs, while preserving the basic idea of autoencoders, that is, using a network which is able to map discrete facies into continuous parameters and reconstruct them back into discrete facies.

In the following, we present a very brief description of autoencoders, restricted Boltzmann machines (RBM) and DBNs, which are the building blocks of the proposed parameterization. For readers interested in an introductory discussion on DL, we suggest the books by Goodfellow et al. (2016) and Bengio (2009).

Autoencoders

An autoencoder is a popular unsupervised neural network. The training process does not require any class information and the target is to reproduce the input at the output. Figure 1 illustrates a standard autoencoder structure, which is composed of two mappings:

• Encoder: a mapping of the input vector, here x ∈ B^{N_x}, into a latent space, usually called the hidden layer, here denoted by the vector h ∈ R^{N_h}, and mathematically expressed as

$$\mathbf{h} = f_e(\mathbf{x}) = \phi\left(\mathbf{W}_e \mathbf{x} + \mathbf{b}_e\right). \qquad (6)$$

• Decoder: the inverse mapping, converting the latent representation from the hidden layer to the visible (output) layer, x̂ ∈ R^{N_x}, i.e.,

$$\hat{\mathbf{x}} = f_d(\mathbf{h}) = \phi\left(\mathbf{W}_d \mathbf{h} + \mathbf{b}_d\right). \qquad (7)$$

In Eqs. 6 and 7, W and b represent the weight matrices and the bias vectors, respectively. The subscripts "e" and "d" stand for encoder and decoder, respectively. φ(·) is called the activation function, usually a nonlinear function such as the logistic sigmoid, the hyperbolic tangent sigmoid or the rectified linear unit (Goodfellow et al., 2016). The training process corresponds to determining the weight matrices and the bias vectors of the encoder and decoder functions by minimizing an objective function that penalizes the difference between x and x̂ = f_d(f_e(x)). One interesting feature of an autoencoder is that when the decoder function is linear and the objective function is the mean squared error, the autoencoder becomes equivalent to PCA. For this reason, some authors refer to autoencoders with nonlinear activation functions as nonlinear generalizations of PCA; see, e.g., (Deng et al., 2017). This analogy between autoencoders and PCA was one of our original motivations for trying autoencoders for facies parameterization.

Figure 1: General architecture of an autoencoder.

Figure 2: Restricted Boltzmann machine.
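As a minimal illustration of Eqs. 6 and 7, the sketch below implements the encoder, the decoder and the mean-squared-error objective with a sigmoid activation; the variable names are ours and the weights are assumed to be given.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def encode(x, W_e, b_e):
    """Encoder, Eq. 6: h = phi(W_e x + b_e)."""
    return sigmoid(W_e @ x + b_e)

def decode(h, W_d, b_d):
    """Decoder, Eq. 7: x_hat = phi(W_d h + b_d)."""
    return sigmoid(W_d @ h + b_d)

def reconstruction_loss(x, W_e, b_e, W_d, b_d):
    """Mean squared error between x and x_hat = f_d(f_e(x)); this is the
    objective minimized during training. With a linear decoder this
    objective makes the autoencoder equivalent to PCA."""
    x_hat = decode(encode(x, W_e, b_e), W_d, b_d)
    return np.mean((x - x_hat) ** 2)
```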

Restricted Boltzmann Machine

Before we introduce the DBN, it is convenient to introduce the RBM, which is the basic network used to construct a DBN. RBMs are stochastic two-layer neural networks with symmetric connections and no intra-layer communication, hence the name "restricted." Figure 2 illustrates the typical architecture of an RBM. The network includes a forward and a backward pass. The forward pass constructs a sample of the hidden units, h, given the inputs, x, i.e., a sample from p(h|x). In the backward pass, we have the reconstruction step, where the objective is to sample p(x|h). The training process of an RBM aims to determine the weights of the connections and possible bias terms in the input and hidden layers, which is typically performed using the algorithm known as contrastive divergence (Hinton, 2002).
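To illustrate the training, here is a minimal sketch of a single contrastive-divergence (CD-1) update for a Bernoulli RBM; the function name, learning rate and sampling details are our own illustrative choices, not the DeeBNet implementation used in the paper.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cd1_update(x, W, b_v, b_h, lr=0.01, rng=np.random.default_rng()):
    """One contrastive divergence (CD-1) step for a Bernoulli RBM.

    x : (Nv,) binary visible vector
    W : (Nh, Nv) weights; b_v, b_h : visible/hidden biases
    """
    # Forward pass: sample h ~ p(h|x)
    p_h = sigmoid(W @ x + b_h)
    h = (rng.random(p_h.size) < p_h).astype(float)

    # Backward pass (reconstruction): sample x' ~ p(x|h), then p(h|x')
    p_v = sigmoid(W.T @ h + b_v)
    x_rec = (rng.random(p_v.size) < p_v).astype(float)
    p_h_rec = sigmoid(W @ x_rec + b_h)

    # Gradient approximation: positive phase minus negative phase
    W += lr * (np.outer(p_h, x) - np.outer(p_h_rec, x_rec))
    b_v += lr * (x - x_rec)
    b_h += lr * (p_h - p_h_rec)
    return W, b_v, b_h
```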

Deep Belief Networks

A DBN is typically defined as a stack of RBMs. Each RBM layer is linked to a subsequent layer, forming a deep architecture. Here, we use a particular DBN structure, where the main DBN is replicated at the hidden layer to create a deep autoencoder, as proposed in (Hinton and Salakhutdinov, 2006) and illustrated in Fig. 3. The basic idea is to use the first part of the network to construct a continuous parameterization of the facies, represented by the hidden units, h. Then, we use the second part to reconstruct the facies. The training process of a DBN is done layer-by-layer, which is sometimes referred to as pretraining (Hinton and Salakhutdinov, 2006), followed by a global fine-tuning step with backpropagation over the entire network.

Figure 3: Deep belief network as an autoencoder composed of two RBMs.
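As an illustration of the layer-by-layer pretraining, the sketch below greedily trains a stack of RBMs, reusing the hypothetical cd1_update function from the RBM sketch above; the global fine-tuning with backpropagation and the unrolling into the encoder-decoder structure of Fig. 3 are omitted.

```python
import numpy as np

def pretrain_dbn(X, layer_sizes, epochs=10, lr=0.01):
    """Greedy layer-wise pretraining of a DBN.

    X           : (Ns, Nv) training set of binary facies vectors
    layer_sizes : e.g. [Nx, N_intermediate, Nh], as used in the paper
    Returns the list of (W, b_v, b_h) for each RBM in the stack.
    """
    rng = np.random.default_rng(0)
    rbms, data = [], X
    for Nv, Nh in zip(layer_sizes[:-1], layer_sizes[1:]):
        W = 0.01 * rng.standard_normal((Nh, Nv))
        b_v, b_h = np.zeros(Nv), np.zeros(Nh)
        for _ in range(epochs):
            for x in data:
                W, b_v, b_h = cd1_update(x, W, b_v, b_h, lr, rng)
        rbms.append((W, b_v, b_h))
        # Activations of this layer become the training data for the next RBM
        data = 1.0 / (1.0 + np.exp(-(data @ W.T + b_h)))
    return rbms
```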

Ensemble Smoother with a Deep Learning Parameterization

The proposed method for history matching facies starts with training the DBN on a large set of prior realizations of facies. In this process, the facies of each gridblock of the model corresponds to an input of the network. Note that no reservoir simulations are required during the training. In our tests, we used the DBN implementation available in the MATLAB toolkit DeeBNet (Keyvanrad and Homayounpour, 2014). We selected this implementation for the convenience of using MATLAB, especially because it provides an easy-to-use GPU-based training process. Our initial tests showed that the training can be computationally very demanding, which has been handled using the GPU.

The trained networks are used for data assimilation. Figure 4 illustrates the process, which consists of introducing the ensemble smoother (ES) analysis applied to the hidden units between the encoder and decoder of the network. This process has a Bayesian interpretation, as illustrated in the figure. Each realization of the initial ensemble, which corresponds to a sample from the prior distribution of facies, here denoted by p(x), is used in the input layer of the encoder. The trained encoder is responsible for generating samples of the distribution of the hidden units given the prior realizations, i.e., h ∼ p(h|x). After that, the ES is responsible for conditioning h to the observed production data, i.e., ĥ ∼ p(h|x, d_obs). Finally, the decoder is responsible for the reconstruction step, i.e., sampling x̂ ∼ p(x|ĥ, d_obs). The scheme depicted in Fig. 4 corresponds to a single application of ES to keep the figure simple. In practice, however, we need to use ES-MDA because of the nonlinearity of the problem. In this case, the process is similar, but we have to repeat the ES analysis and the reconstruction step (decoder) multiple times. Note that the encoder is used only once, at the beginning of the data assimilation.

We designed the networks to have the same number of nodes in the input, output and hidden layers, each node corresponding to a gridblock of the model, i.e., N_x = N_h. This choice was made to allow the use of localization during the data assimilation step. The intermediate layers have fewer nodes to avoid the training resulting in the trivial solution of simply replicating the input entries in the hidden layer. One interesting feature is that if all models in the training set are constructed with the same set of hard data (facies types at well locations), then the reconstructed realizations will honor the hard data.
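To make the workflow concrete, the following sketch (our own assembly; encoder, decoder and run_simulator are placeholders, and esmda_analysis is the routine sketched after Eq. 2) encodes the prior ensemble once and then alternates ES-MDA updates of the hidden units with the reconstruction step before each simulation run.

```python
import numpy as np

def history_match_dbn(X_prior, encoder, decoder, run_simulator,
                      d_obs, C_e, alphas, R=None):
    """ES-MDA with the DBN parameterization (Fig. 4).

    X_prior       : (Nx, Ne) prior facies realizations (binary)
    encoder       : maps facies x -> hidden units h (used only once)
    decoder       : maps hidden units h -> facies x_hat
    run_simulator : maps a facies ensemble to predicted data (Nd, Ne)
    alphas        : MDA inflation coefficients, sum of 1/alpha_k = 1
    """
    # Encode the prior ensemble once: h ~ p(h|x)
    H = np.column_stack([encoder(x) for x in X_prior.T])

    for alpha in alphas:
        # Decode before each simulation run: x_hat ~ p(x|h)
        X = np.column_stack([decoder(h) for h in H.T])
        D = run_simulator(X)
        # Condition the hidden units to the production data (Eq. 1)
        H = esmda_analysis(H, D, d_obs, C_e, alpha, R)

    # Final reconstruction of the posterior facies ensemble
    return np.column_stack([decoder(h) for h in H.T])
```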


Figure 4: Combination of a DBN with ES.

Figure 5: Channel training image used to generate the reference model and the prior realizations for the test case 1.

Test Cases

Test Case 1

The first test problem corresponds to a channelized facies model generated using the algorithm snesim (Strebelle, 2002) with the training image presented by Caers and Zhang (2004) and reproduced in Fig. 5. Figure 6 shows the reference (true) permeability field. We assumed a constant permeability of 5000 mD in the channels and 500 mD in the background sand. The size of the model is 45 × 45 gridblocks, all gridblocks with 100 ft × 100 ft and constant thickness of 50 ft. We placed four oil producing and three water injection wells as shown in Fig. 6. All prior realizations were generated using the same training image of the reference model. However, we did not include any facies data (hard data) in order to make the problem more challenging. All producing wells operate at a constant bottom-hole pressure of 3000 psi. The water injection wells operate at 4000 psi. The synthetic measurements were obtained by adding Gaussian random noise with standard deviation of 5% of the data predicted by the reference model.



Figure 6: Reference permeability field (mD). Test case 1. Circles are oil producing wells and triangles are water injection wells.

Figure 7: Training process showing the first three realizations of the validation set, with panels (a) input facies, (b) hidden-layer histograms and (c) reconstructed facies. Test case 1.

We used a DBN with an architecture similar to Fig. 3 but with 2025 nodes in the input, hidden and output layers, which corresponds to the number of gridblocks of the model. For the intermediate layer, we used 1048 nodes. We selected the simplest DBN architecture such that we could balance the computational cost during training and achieve good reconstruction results. The training process required 3.2 hours on a standard GPU (GeForce GTX 1080). It is worth mentioning that at the time of this investigation we had only a GPU from a stand-alone computer, which limited the size of our networks and training sets. The training set consisted of 5000 realizations of facies plus 1000 realizations for testing. Figure 7 shows the first three realizations of the training set before and after reconstruction, indicating that the selected DBN was able to successfully reconstruct the facies. Figure 7b shows the histograms of the values in the hidden units, h, resulting from the trained encoder. This figure shows that the marginal distributions of h are essentially Gaussian with zero mean and unit variance, making h more suitable for updating with ES-MDA than the original facies with binary distributions.

We conducted history-matching experiments for this test case considering the standard ES-MDA, ES-MDA-OPCA and ES-MDA-DBN. In all cases, we used the same prior ensemble with N_e = 200 realizations and N_a = 4 data assimilations with constant MDA inflation factors. The same 5000 realizations used to train the DBN were used to construct the PCA basis for ES-MDA-OPCA. No localization was

Figure 8: First three prior realizations of permeability (mD). Test case 1.

Figure 9: First three realizations of permeability (mD) after ES-MDA. Test case 1.

used in this first test case. Figures 8, 9, 10 and 11 show the first three realizations of the prior ensemble, ES-MDA, ES-MDA-OPCA and ES-MDA-DBN, respectively. As expected, standard ES-MDA was not able to preserve the binary distributions of permeability of the prior realizations. ES-MDA-OPCA resulted in nearly binary distributions of permeability and was able to recover the position of the main channels of the model connecting oil producing and water injection wells. However, ES-MDA-OPCA generated small disconnected "pieces" of channels which are not present in the reference model nor in the prior realizations. ES-MDA-DBN also resulted in nearly binary distributions of permeability. Compared to ES-MDA-OPCA, the channels are more continuous and well-defined. However, the resulting realizations represent only approximations of the reference field and they do not seem to be actual samples from the same distribution which generated the reference and the prior realizations. Figure 12 shows boxplots of the data mismatch objective function and Fig. 13 shows the observed and predicted data for one well of the field. The objective function is computed with the standard formula

$$O_N(\mathbf{m}_j) = \frac{1}{2 N_d} \left( \mathbf{d}_{obs} - g(\mathbf{m}_j) \right)^\top \mathbf{C}_e^{-1} \left( \mathbf{d}_{obs} - g(\mathbf{m}_j) \right), \qquad (8)$$

where all terms were defined before. The results in Figs. 12 and 13 show that all methods were able to match the observed production data reasonably well.
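For reference, a one-line implementation of Eq. 8 (our own naming, using a linear solve instead of an explicit inverse):

```python
import numpy as np

def normalized_objective(d_obs, d_pred, C_e):
    """Normalized data-mismatch objective function, Eq. 8."""
    r = d_obs - d_pred
    return 0.5 * (r @ np.linalg.solve(C_e, r)) / d_obs.size
```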

Figure 10: First three realizations of permeability (mD) after ES-MDA-OPCA. Test case 1.

Figure 11: First three realizations of permeability (mD) after ES-MDA-DBN. Test case 1.

Figure 12: Normalized data mismatch objective function. Test case 1.

Figure 13: Predicted oil rate at well P3. Test case 1. Red dots are the observed data points, gray and blue curves are the predicted data from the prior and posterior ensembles, respectively. Panels: (a) ES-MDA, (b) ES-MDA-OPCA, (c) ES-MDA-DBN.


Test Case 2

The second test problem is the same used in (Emerick, 2017). This problem was originally designed to test ES-MDA-OPCA to simultaneously update the facies type and the permeability within each facies. Figure 14 shows the reference permeability field and the corresponding histogram. The facies distribution of the reference case and all prior realizations were generated using the training image of Fig. 5. The permeability values within each facies were obtained with sequential Gaussian simulation. For the channel facies, we used an isotropic spherical variogram with range of 375 meters. The prior mean of log-permeability is 7 ln-mD and the prior standard deviation is 0.5 ln-mD. The permeability of the background was generated with an isotropic spherical variogram with range of 2250 meters, prior mean of 4 ln-mD and prior standard deviation of 0.5 ln-mD. The model has 100 × 100 gridblocks with uniform size of 75 meters and constant thickness of 20 meters.

The DBN used the same architecture of Fig. 3 with 10000 nodes in the input, hidden and output layers, which corresponds to the number of gridblocks of the model. For the intermediate layer, we used 4096 units. The training set consisted of 5000 realizations plus 1000 for validation. The training required 20.4 hours of GPU time. Figure 15 illustrates the result of the training, showing that the DBN is able to reconstruct the facies, generating values in the hidden layer with Gaussian marginal distributions.


Figure 14: Reference log-permeability field (ln-mD). Test case 2. Circles are oil producing wells and triangles are water injection wells.

For standard ES-MDA we updated the log-permeability fields directly. For ES-MDA-OPCA and ES-MDA-DBN we updated an augmented model vector of the form

$$\mathbf{m} = \begin{bmatrix} \mathbf{h} \\ \mathbf{y}_c \\ \mathbf{y}_b \end{bmatrix}, \qquad (9)$$

where h ∈ R^{N_x} is the vector containing the PCA coefficients used in ES-MDA-OPCA or the vector containing the values in the hidden layer for ES-MDA-DBN. y_c and y_b are the vectors containing the log-permeability values for the channel and background facies, respectively. Given a vector m, the permeability value of the ith gridblock of the model, κ_i, is given by

$$\kappa_i = \exp\left[ y_{b,i} + \hat{x}_i \left( y_{c,i} - y_{b,i} \right) \right]. \qquad (10)$$

In the above equation, y_{c,i} and y_{b,i} are the ith components of the vectors y_c and y_b, respectively, and x̂_i ∈ [0, 1] defines the facies type. For all methods we used the same prior ensemble with N_e = 200 realizations and N_a = 10 data assimilation steps. We also applied distance-based localization to the Kalman gain with the taper function proposed by Furrer and Bengtsson (2007), using a spherical anisotropic correlation function with major range of 15000 meters oriented at 45°, i.e., aligned with the main direction of the channels, and minor range of 5000 meters. We selected relatively large localization regions to account for the long correlations in the prior realizations caused by the presence of channels. Figures 16, 17, 18 and 19 show the first three realizations of the prior ensemble and the posterior ensembles obtained
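A minimal sketch of Eq. 10, mapping the augmented vector of Eq. 9 to gridblock permeabilities; the unpacking convention and the hypothetical facies_from helper (the decoder or OPCA truncation applied to h) are ours.

```python
import numpy as np

def gridblock_permeability(y_b, y_c, x_hat):
    """Eq. 10: kappa_i = exp[y_b_i + x_hat_i * (y_c_i - y_b_i)], where
    x_hat_i in [0, 1] interpolates between the background and channel
    log-permeabilities of each gridblock."""
    return np.exp(y_b + x_hat * (y_c - y_b))

# Unpacking an augmented member m = [h; y_c; y_b] (Eq. 9) with Nx gridblocks;
# facies_from is a hypothetical helper returning x_hat:
# h, y_c, y_b = m[:Nx], m[Nx:2*Nx], m[2*Nx:]
# kappa = gridblock_permeability(y_b, y_c, facies_from(h))
```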

Figure 15: Training process showing the first three realizations of the validation set and corresponding histograms in the hidden layer. Test case 2.

by each method. Clearly, ES-MDA was not able to preserve well-defined boundaries for the channels, resulting in nearly Gaussian marginal distributions of log-permeability. ES-MDA-OPCA and ES-MDA-DBN obtained bi-modal distributions of log-permeability with some well-defined channels. Both methods, however, generated some discontinuous branches of channels, which are not present in the prior description of the field. The realizations generated by ES-MDA-DBN are more distinct from each other, i.e., there is more variability in the posterior ensemble, as indicated in Fig. 20, which shows the normalized variance of log-permeability (ratio between the posterior and prior variances). Preserving the variance of the ensemble is a desirable result, as we typically want to use the posterior ensemble to estimate the uncertainty range. In terms of the data matches, ES-MDA-DBN resulted in larger values of the objective function (Fig. 21), although the data matches of all cases are acceptable, as illustrated in Fig. 22.

Test Case 3

The last test case is also from (Emerick, 2017). This problem corresponds to the first layer (Schelde formation) of the Brugge case (Peters et al., 2010). We generated the reference model (Fig. 24) and all prior realizations using the snesim algorithm with the Ganges Delta training image shown in Fig. 23 (Mariethoz and Caers, 2014). All realizations were conditioned to the same set of hard data at the 30 well locations (same locations of the original Brugge case). The observed data for history matching correspond to water cut at 20 wells and bottom-hole pressure at 30 wells (20 producers and 10 water injectors) for a period of 10 years. We added Gaussian random noise with standard deviation of 10% of the data generated by the reference model to generate the synthetic observations. The training set of the DBN consisted of 2000 realizations of facies. The same realizations were used to construct the PCA basis for ES-MDA-OPCA. Similarly to the other two cases, we set the number of input and hidden units in the network equal to the number of gridblocks (6672) to allow the use of localization. The number of nodes in the intermediate layers is 2048. Figure 25 presents the results of the training, showing that the DBN is able to reconstruct the facies models.


Figure 16: First three prior realizations of log-permeability (ln-mD). Test case 2.

Figure 17: First three realizations of log-permeability (ln-mD) after ES-MDA. Test case 2.

Figure 18: First three realizations of log-permeability (ln-mD) after ES-MDA-OPCA. Test case 2.

Figure 19: First three realizations of log-permeability (ln-mD) after ES-MDA-DBN. Test case 2.

Figure 20: Normalized variance of log-permeability. Test case 2. Panels: (a) ES-MDA-OPCA, (b) ES-MDA-DBN.

Figure 21: Normalized data mismatch objective function. Test case 2.

Figure 22: Water cut at well P5. Test case 2. Panels: (a) ES-MDA, (b) ES-MDA-OPCA, (c) ES-MDA-DBN.

Figure 23: Ganges Delta training image used to generate the reference model and the prior realizations for the third test problem.

Figure 24: Reference log-permeability field (ln-mD). Test case 3.

We performed history-matching exercises with ES-MDA, ES-MDA-OPCA and ES-MDA-DBN. All cases started with the same prior ensemble with N_e = 200 realizations and N_a = 6 data assimilations. Localization was applied to the Kalman gain using the Gaspari-Cohn correlation function (Gaspari and Cohn, 1999) with major range of 3500 meters aligned with the main direction of the channels, and minor range of 3000 meters. Figures 26, 27, 28 and 29 show the first three realizations of each ensemble. The conclusions are somewhat similar to what we observed in the previous cases. ES-MDA-OPCA and ES-MDA-DBN were able to generate bi-modal log-permeability distributions in accordance with the prior description of the model. However, ES-MDA-DBN resulted in larger values of the data mismatch objective function (Fig. 30), hence poorer data matches. Figure 31 illustrates this fact. Note that there is a larger spread of the predicted water cut near the breakthrough time for ES-MDA-DBN. This is an indication that the parameterization with DBN makes the data assimilation more nonlinear, requiring more iterations to converge. Although the results are not reported here, it is worth mentioning that we could reduce the objective function obtained by ES-MDA-DBN by increasing the number of data assimilations. For example, for N_a = 20 we obtained a median objective function of 4.4.


Figure 25: Training process showing the first three realizations of the training set and corresponding histograms in the hidden layer. Test case 3.

Figure 26: First three prior realizations of log-permeability (ln-mD). Test case 3.


Figure 27: First three realizations of log-permeability (ln-mD) after ES-MDA. Test case 3.

Figure 28: First three realizations of log-permeability (ln-mD) after ES-MDA-OPCA. Test case 3.


Figure 29: First three realizations of log-permeability (ln-mD) after ES-MDA-DBN. Test case 3.

Figure 30: Normalized data mismatch objective function. Test case 3.

Figure 31: Water cut at well BR-P17. Test case 3. Panels: (a) ES-MDA, (b) ES-MDA-OPCA, (c) ES-MDA-DBN.

Comments

The results presented in the previous sections showed that the training step of the DBN was very successful in terms of reconstructing the input facies realizations of the training and validation sets. Unfortunately, the same performance was not observed when we updated the latent vectors of the DBN with ES-MDA. This process was able to generate bi-modal distributions of log-permeability with some well-defined channels. However, the resulting facies realizations are clearly distinguishable from the prior ones. Overall, the results obtained with ES-MDA-DBN are superior to the standard ES-MDA and comparable with ES-MDA-OPCA. Yet, there is still room for improvement in terms of preserving the geological realism of the models. One possibility we considered is to resample the posterior realizations using a geostatistical algorithm to ensure that the realizations preserve the desired characteristics. Variations of this idea include the construction of "probability maps" based on the posterior realizations, which can be used as soft constraints in geostatistical simulations (Jafarpour and Khodabakhshi, 2011; Le et al., 2015; Sebacher et al., 2015) or to generate "pseudo-wells" to constrain realizations (Tavakoli et al., 2014; Chang et al., 2015). The resampling ensures the desired geological realism, but it tends to deteriorate the data matches.

Even though the idea of resampling models using geostatistics has some appealing features, we decided to invest more in other deep network architectures in the continuation of this research before introducing a resampling step. More recently, we moved our attention to convolutional networks, in particular deep convolutional generative adversarial networks (Goodfellow et al., 2014; Radford et al., 2015; Curtó et al., 2018). Our initial tests demonstrated great potential. For example, we were able to obtain realizations conditioned to production data which are visually indistinguishable from realizations obtained with the snesim algorithm for the first test problem discussed in this paper. One problem, however, is that the computational requirement for training the network increases significantly. For example, our tests showed the need for a much larger training set (we used 30000 realizations plus another 10000 for validation) and deeper architectures for the networks (we considered cases with up to nine hidden layers). The training proved impracticable on a stand-alone computer. However, we were able to train a network in one hour using a recently acquired cluster with four GPUs (NVIDIA TESLA P100).

Conclusions

This paper reported our current results on an ongoing research project aiming to develop a robust parameterization for facies history matching. In particular, we presented a parameterization based on deep belief networks integrated with an iterative ensemble smoother (ES-MDA). The parameterization was tested on three synthetic problems and compared with the standard ES-MDA and ES-MDA combined with optimization-based PCA. The results showed clear improvements over the standard ES-MDA in terms of preserving channelized features in the realizations and a performance comparable to the parameterization with optimization-based PCA.

Acknowledgments

The authors thank Petrobras for the financial support.

References

Agbalaka, C.C. and Oliver, D.S. [2008] Application of the EnKF and localization to automatic history matching of facies distribution and production data. Mathematical Geosciences, 40(4), 353–374.
Armstrong, M., Galli, A., Beucher, H., Loc'h, G.L., Renard, D., Doligez, B., Eschard, R. and Geffroy, F. [2011] Plurigaussian simulations in geosciences. Springer-Verlag Berlin Heidelberg, 2nd edn.
Bengio, Y. [2009] Learning deep architectures for AI. Now Publishers Inc.
Caers, J. and Zhang, T. [2004] Multiple-point geostatistics: a quantitative vehicle for integrating geologic analogs into multiple reservoir models. AAPG Memoir, 80, 383–394.
Canchumuni, S.A., Emerick, A.A. and Pacheco, M.A. [2017] Integration of ensemble data assimilation and deep learning for history matching facies models. In: Proceedings of the Offshore Technology Conference, Rio de Janeiro, Brazil, 24–26 October, OTC-28015-MS.
Chang, Y., Stordal, A.S. and Valestrand, R. [2015] Facies parameterization and estimation for complex reservoirs – the Brugge field. In: Proceedings of the SPE Bergen One Day Seminar, Bergen, Norway, 22 April, SPE-173872-MS.
Chen, C., Gao, G., Gelderblom, P. and Jimenez, E. [2016] Integration of cumulative-distribution-function mapping with principal-component analysis for the history matching of channelized reservoirs. SPE Reservoir Evaluation & Engineering, 19(2), 278–293.
Chen, Y. and Oliver, D.S. [2013] Levenberg-Marquardt forms of the iterative ensemble smoother for efficient history matching and uncertainty quantification. Computational Geosciences, 17, 689–703.
Curtó, J.D., Zarza, I.C., Torre, F.D.L., King, I. and Lyu, M.R. [2018] High-resolution deep convolutional generative adversarial networks. arXiv:1711.06491v9 [cs.CV].
Deng, X., Tian, X., Chen, S. and Harris, C.J. [2017] Deep learning based nonlinear principal component analysis for industrial process fault detection. In: Proceedings of the 2017 International Joint Conference on Neural Networks (IJCNN).
Emerick, A.A. [2017] Investigation on principal component analysis parameterizations for history matching channelized facies models with ensemble-based data assimilation. Mathematical Geosciences, 49(1), 85–120.
Emerick, A.A. and Reynolds, A.C. [2013] Ensemble smoother with multiple data assimilation. Computers & Geosciences, 55, 3–15.
Furrer, R. and Bengtsson, T. [2007] Estimation of high-dimensional prior and posterior covariance matrices in Kalman filter variants. Journal of Multivariate Analysis, 98(2), 227–255.
Gaspari, G. and Cohn, S.E. [1999] Construction of correlation functions in two and three dimensions. Quarterly Journal of the Royal Meteorological Society, 125(554), 723–757.
Goodfellow, I., Bengio, Y. and Courville, A. [2016] Deep learning. MIT Press.
Goodfellow, I.J., Pouget-Abadie, J., Mirza, M., Xu, B., Warde-Farley, D., Ozair, S., Courville, A. and Bengio, Y. [2014] Generative adversarial networks. arXiv:1406.2661v1 [stat.ML].
Hinton, G.E. [2002] Training products of experts by minimizing contrastive divergence. Neural Computation, 14(8), 1771–1800.
Hinton, G.E., Osindero, S. and Teh, Y.W. [2006] A fast learning algorithm for deep belief nets. Neural Computation, 18(7), 1527–1554.
Hinton, G.E. and Salakhutdinov, R.R. [2006] Reducing the dimensionality of data with neural networks. Science, 313, 504–507.
Houtekamer, P.L. and Mitchell, H.L. [2001] A sequential ensemble Kalman filter for atmospheric data assimilation. Monthly Weather Review, 129(1), 123–137.
Jafarpour, B. and Khodabakhshi, M. [2011] A probability conditioning method (PCM) for nonlinear flow data integration into multipoint statistical facies simulation. Mathematical Geosciences, 43(2), 133–164.
Keyvanrad, M.A. and Homayounpour, M.M. [2014] A brief survey on deep belief networks and introducing a new object oriented toolbox (DeeBNet). arXiv:1408.3264v7 [cs.CV].
Le, D.H., Younis, R. and Reynolds, A.C. [2015] A history matching procedure for non-Gaussian facies based on ES-MDA. In: Proceedings of the SPE Reservoir Simulation Symposium, Houston, Texas, USA, 23–25 February, SPE-173233-MS.
Liu, N. and Oliver, D.S. [2005] Ensemble Kalman filter for automatic history matching of geologic facies. Journal of Petroleum Science and Engineering, 47(3–4), 147–161.
Luo, X., Stordal, A.S., Lorentzen, R.J. and Nævdal, G. [2015] Iterative ensemble smoother as an approximate solution to a regularized minimum-average-cost problem: Theory and applications. SPE Journal, 20(5).
Mariethoz, G. and Caers, J. [2014] Multiple-point geostatistics – Stochastic modeling with training images. John Wiley & Sons, Ltd.
Peters, L., Arts, R., Brouwer, G., Geel, C., Cullick, S., Lorentzen, R.J., Chen, Y., Dunlop, N., Vossepoel, F.C., Xu, R., Sarma, P., Alhuthali, A.H. and Reynolds, A. [2010] Results of the Brugge benchmark study for flooding optimisation and history matching. SPE Reservoir Evaluation & Engineering, 13(3), 391–405.
Radford, A., Metz, L. and Chintala, S. [2015] Unsupervised representation learning with deep convolutional generative adversarial networks. arXiv:1511.06434 [cs.LG].
Sarma, P., Durlofsky, L.J. and Aziz, K. [2008] Kernel principal component analysis for efficient differentiable parameterization of multipoint geostatistics. Mathematical Geosciences, 40(1), 3–32.
Schölkopf, B., Smola, A. and Müller, K.R. [1998] Nonlinear component analysis as a kernel eigenvalue problem. Neural Computation, 10(5), 1299–1319.
Sebacher, B.M., Hanea, R. and Heemink, A. [2013] A probabilistic parametrization for geological uncertainty estimation using the ensemble Kalman filter (EnKF). Computational Geosciences, 17(5), 813–832.
Sebacher, B.M., Stordal, A.S. and Hanea, R. [2015] Bridging multipoint statistics and truncated Gaussian fields for improved estimation of channelized reservoirs with ensemble methods. Computational Geosciences, 19(2), 341–369.
Stordal, A.S. and Elsheikh, A.H. [2015] Iterative ensemble smoothers in the annealed importance sampling framework. Advances in Water Resources, 86, 231–239.
Strebelle, S. [2002] Conditional simulation of complex geological structures using multiple-point statistics. Mathematical Geology, 34(1), 1–21.
Tavakoli, R., Srinivasan, S. and Wheeler, M.F. [2014] Rapid updating of stochastic models by use of an ensemble-filter approach. SPE Journal, 19(3), 500–513.
Vo, H.X. and Durlofsky, L.J. [2014] A new differentiable parameterization based on principal component analysis for the low-dimensional representation of complex geological models. Mathematical Geosciences, 46(7), 775–813.
Zhao, Y., Reynolds, A.C. and Li, G. [2008] Generating facies maps by assimilating production data and seismic data with the ensemble Kalman filter. In: Proceedings of the SPE Improved Oil Recovery Symposium, Tulsa, Oklahoma, 20–23 April, SPE-113990-MS.
