2013 8th International Conference on Information Technology in Asia (CITA)

Ensemble Model of Artificial Neural Networks with Randomized Number of Hidden Neurons

Fatai Anifowose and Jane Labadin
Faculty of Computer Science and Information Technology, Universiti Malaysia Sarawak, Sarawak, Malaysia
[email protected], [email protected]

Abdulazeez Abdulraheem
Department of Petroleum Engineering, King Fahd University of Petroleum and Minerals, Dhahran, Saudi Arabia
[email protected]

Abstract—Conventional artificial intelligence techniques and their hybrid models are incapable of handling several hypotheses at a time. The limitation in the performance of certain techniques has made the ensemble learning paradigm a desirable alternative for better predictions. The petroleum industry stands to gain immensely from this learning methodology due to the persistent quest for better prediction accuracies of reservoir properties for improved hydrocarbon exploration, production, and management activities. Artificial Neural Networks (ANN) has been applied in petroleum engineering but is widely reported to fall short of global optima, mainly because of the great challenge involved in determining the optimal number of hidden neurons. This paper presents a novel ensemble model of ANN that uses a randomized algorithm to generate the number of hidden neurons in the prediction of petroleum reservoir properties. Ten base learners of the ANN model were created, each using a randomly generated number of hidden neurons. Each learner contributed to solving the problem, and a single ensemble solution was evolved. The performance of the ensemble model was evaluated using standard evaluation criteria. The results showed that the performance of the proposed ensemble model is better than the average performance of the individual base learners. This study is a successful proof of concept of randomizing the number of hidden neurons and demonstrates the great potential for the application of this learning paradigm in petroleum reservoir characterization.

Keywords—Ensemble; artificial neural networks; random number of hidden neurons; porosity; permeability; petroleum reservoir characterization.

I. INTRODUCTION

While conventional and hybrid Artificial Intelligence (AI) techniques are only capable of handling a single hypothesis at a time, the ensemble machine learning paradigm has the capability of handling several hypotheses. The hypotheses are eventually combined using a combination algorithm to evolve a single ensemble hypothesis that represents the best of the individual hypotheses. The ensemble learning paradigm emulates the human social behavior of seeking various opinions on an issue before making a final decision. This methodology has performed excellently in various classification and clustering tasks [1]. The petroleum industry is well suited for this methodology given the persistent quest for increased prediction accuracy of petroleum reservoir properties, and given the strategic importance of petroleum as the world's major source of energy. The process of predicting the various properties of petroleum reservoirs is called Reservoir Characterization.

Despite its success, the ensemble machine learning paradigm has not been adequately utilized in the petroleum industry. The methodology is capable of increasing the prediction accuracy of various reservoir properties, which will in turn improve the accuracy of estimates of associated exploration and production indices such as the hydrocarbon reserve and its rate of recovery. AI application has a long and vibrant history in the petroleum industry, especially for reservoir characterization [2, 3]. Interestingly, this has been largely limited to the use of Artificial Neural Networks (ANN) [4]. Despite its wide application in petroleum engineering, ANN suffers from a number of deficiencies [5], one of which is that it gets caught in local optima, mainly because the numbers of hidden layers and neurons are determined mostly by trial and error. The various algorithms, such as the Cascade Correlation and Radial Basis Function neural networks, that were developed to address these problems did not improve the overall performance of the ANN technique [6]. The persistent quest for increased performance of predictive models for petroleum reservoir characterization has led to a continued search for better techniques. As the search continues, feature selection-based and optimization-based hybrids of AI techniques have been proposed and applied with ANN. Some of the proposed algorithms failed, while others created new problems such as increased execution time [7], memory-intensive computations [8], and a sometimes unsolved problem of local optima [6]. The ensemble learning paradigm offers a promising alternative.

978-1-4799-1092-2/13/$31.00 ©2013 IEEE
Hence, we propose an ensemble model of ANN that uses a randomly selected number of neurons in the hidden layer of each of its base learners. This is intended to remove the great effort associated with the trial-and-error method of determining the optimal number of hidden neurons. Our choice of ANN is due to its popular use in AI applications in the petroleum industry as well as the need to improve its prediction performance [2 - 4]. The main objective of this paper is to investigate the feasibility of the proposed algorithm and to prepare the ground for a more robust application and comparison with conventional and existing ensemble algorithms. Our motivation is the reported successful application of the ensemble methodology in other fields [9 - 11]. Ensemble models are especially suitable for the petroleum industry, where data are usually scarce and the few available datasets are noisy.


II. LITERATURE REVIEW

A. Petroleum Reservoir Characterization

This is an important process in the petroleum industry that aims to determine various properties of petroleum reservoirs. It supports more successful reservoir exploration, production, and management. This paper focuses on the prediction of porosity and permeability, since they are the key factors in estimating the potential quantity and quality of a reservoir. Since they are also used for the estimation of other properties, a marginal improvement in their prediction accuracy will lead to a further improvement in the accuracy of those other reservoir properties whose values depend on them [12]. The great challenge associated with the reservoir characterization process calls for the implementation of the ensemble learning methodology.

B. The Ensemble Machine Learning Paradigm

The ensemble modeling paradigm combines diverse multiple expert hypotheses to generate a single overall solution to a problem. The concept is justified by its mimicry of human behavior, where the judgment of a committee is believed to be superior to that of an individual, provided the committee members have reasonable competence [13]. It was initially introduced for classification and clustering purposes [9, 13] and later extended to time series and regression problems [14]. It reduces the possibility of an unfortunate emergence of a poor decision from a combined set of possible solutions. More details about this methodology can be found in [15, 16]. It is pertinent to examine the level of implementation of this methodology in the petroleum industry.

C. Existing Work on Ensemble Learning Applications

The petroleum industry has greatly benefited from the application of ANN [2 - 4]. However, the ensemble learning method has not been adequately explored there. The only work found in the petroleum industry-related literature was [17], which proposed an ensemble model of ANN for the prediction of open-hole triple-combos.
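The committee intuition above can be illustrated with a small, hypothetical numerical sketch (the expert count and noise level below are arbitrary choices, not taken from the paper): averaging several independent noisy opinions can never be worse than the worst individual opinion, and is usually much better.

```python
import random

def committee_estimate(true_value, n_experts, noise, rng):
    """Each 'expert' gives a noisy opinion; the committee averages them."""
    opinions = [true_value + rng.uniform(-noise, noise) for _ in range(n_experts)]
    return opinions, sum(opinions) / len(opinions)

rng = random.Random(42)
opinions, combined = committee_estimate(10.0, 7, 2.0, rng)
worst_error = max(abs(o - 10.0) for o in opinions)
combined_error = abs(combined - 10.0)
# By the triangle inequality, the averaged error cannot exceed the worst
# individual error; with independent noise it is typically far smaller.
assert combined_error <= worst_error
```

The guarantee holds for any noise draw; the typical gain over a single expert is what motivates the ensemble paradigm.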
In view of the aforementioned, this paper is among the first attempts to apply a novel ensemble method to the prediction of petroleum reservoir properties. Given the existing gap in the application of the ensemble methodology in petroleum engineering, this paper addresses one of the prevailing problems facing the ANN technique.

D. ANN and the Problem with the Number of Hidden Neurons

ANN is a Computational Intelligence (CI) technique modeled after the biological nervous system. It is made up of several layers of neurons interconnected by weighted links. The ANN architecture has been comprehensively explained in [5, 18]. The number of hidden neurons is one of the major factors that determine the performance of an ANN model. Hence, the determination of the optimal number of hidden neurons has attracted much attention. References [19, 20] investigated the effect of the number of hidden neurons on the stability of the learning performance of large-scale neural networks and established a relationship between the number of hidden neurons and the stability of ANN. Much earlier, [21] used a theoretical proof to establish an upper bound on the number of hidden neurons in a feedforward neural network using weights as the basis. Reference [22]

proposed a two-step approach to determine the optimal number of hidden neurons. However, since this approach resembles an exhaustive search procedure, it would be memory-intensive and time-consuming. Reference [23] postulated that determining the optimal number of hidden neurons is a matter of trial and error. The question remains: for how long do we depend on trial and error? To answer this question, [24] attempted to optimize the number of hidden neurons using a signal-to-noise ratio approach; however, it was not shown that the method is applicable to petroleum engineering. Most recently, [25], while attempting to approximate the number of hidden neurons, concluded that it remains a matter of guesswork. Hence, there is still no single efficient way to determine the optimal number of hidden neurons of a neural network model. We hereby hypothesize that, instead of going through a lot of subjective trial-and-error guesses, using an ensemble learning approach to combine the results of several learners whose numbers of hidden neurons are guessed by a fair and objective randomized algorithm will give acceptable results. This will also save the tedious effort, memory resources, and time required by the conventional trial-and-error methods.

E. Randomized Algorithms

Randomized algorithms have been modeled as probabilistic Turing Machines in Computational Complexity Theory [26]. Previous studies on randomization have considered both Las Vegas and Monte Carlo algorithms along with several complexity classes [27]. The fundamental class for randomized complexity is RP [28], the class of decision problems for which there is an efficient (polynomial-time) randomized algorithm, or probabilistic Turing Machine, that recognizes NO-instances with absolute certainty and recognizes YES-instances with a probability of at least ½.
The complexity class BPP (bounded-error probabilistic polynomial time) has also been defined, aiming to capture the set of decision problems efficiently solvable by probabilistic algorithms [26]. It has been established that the expected cost of a typical randomized algorithm is O(log n), where n is the number of random guesses, with the additional quality that each number is fairly selected and hence has an equal chance of being guessed without bias [29].

III. RESEARCH METHODOLOGY
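A minimal sketch of the fair selection this section relies on (the function name is ours, and the 1-50 range matches the Rand(1:50) call used later in the methodology): Python's `random.randint` draws each integer in the closed range with equal probability, so no hidden-neuron count is favored over another.

```python
import random

def random_hidden_neurons(n_learners, low=1, high=50, seed=None):
    """Draw one unbiased hidden-neuron count per base learner; every
    integer in [low, high] is equally likely, so no guess is favored."""
    rng = random.Random(seed)
    return [rng.randint(low, high) for _ in range(n_learners)]

counts = random_hidden_neurons(10, seed=123)
assert len(counts) == 10 and all(1 <= c <= 50 for c in counts)
```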

A. Description of Data

A total of six core measurements matched with their corresponding well logs for porosity and permeability were used to evaluate the proposed ensemble model. The three porosity datasets were obtained from a drilling site in the Northern Marion Platform of North America, and the three permeability datasets were from a giant carbonate reservoir in the Middle East. The porosity datasets have five predictor variables, viz. Top Interval, Grain Density, Grain Volume, Length, and Diameter, while the permeability datasets have eight predictor variables, viz. Gamma Ray, Porosity, Density, Water Saturation, Deep Resistivity, Microspherically Focused, Neutron, and Caliper Logs. Following the standard machine learning approach, each dataset was divided into training and testing subsets using a stratified sampling technique where 70% of the data was


randomly selected for training and the remaining 30% for testing. The choice of this stratification method is based on its popularity for fairness and high confidence [12].

B. Evaluation Criteria and Combination Rules

For an effective evaluation and comparison of the performance of the proposed ensemble model with respect to those of its base learners, we applied the commonly used criteria, viz. the correlation coefficient (R2), root mean-squared error (RMSE), and mean absolute error (MAE). R2 measures the statistical correlation, or degree of closeness, between the predicted and actual output values. RMSE computes the square root of the average of the squared differences between each predicted value and its corresponding actual value. MAE takes the average of all the individual absolute errors between the actual and predicted target values. The mathematical bases for these criteria can be found in the literature [4, 5, 9, 10]. The results of the base learners were combined using algebraic rules: the Max() and Mean() rules for R2, and the Min() and Mean() rules for the error measures. With these rules, the output of the base learner with the highest R2, which mostly corresponds to that with the least error, is chosen as the output of the ensemble model.

C. Design and Implementation of the Ensemble Model

To avoid the problems associated with the trial-and-error method, we created an ensemble ANN model with 10 base learners. We used the default 2-layer network with all the parameters set to their optimal values but with the number of hidden neurons randomly chosen between 1 and 50. The choice of this network architecture and of the range for the number of hidden neurons is based on the suggestion that most problems can be solved with a 2-layer architecture with no more than 50 neurons in each hidden layer [18]. To further increase the diversity of the results, a new random division of each dataset into training and testing sets was made for each base learner.
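The three criteria and the combination rules can be sketched in plain Python as follows (the function names are ours; R2 is computed here as the squared Pearson correlation, matching the "statistical correlation" description, and the learner scores are made-up illustrative numbers):

```python
import math

def r2(actual, predicted):
    """Squared Pearson correlation between actual and predicted values."""
    n = len(actual)
    ma, mp = sum(actual) / n, sum(predicted) / n
    cov = sum((a - ma) * (p - mp) for a, p in zip(actual, predicted))
    var_a = sum((a - ma) ** 2 for a in actual)
    var_p = sum((p - mp) ** 2 for p in predicted)
    return cov * cov / (var_a * var_p)

def rmse(actual, predicted):
    """Root of the mean of the squared prediction errors."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

def mae(actual, predicted):
    """Mean of the absolute prediction errors."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

# Combination rules over the base learners' scores:
# Max()/Mean() for R2, Min()/Mean() for the error measures.
learner_r2 = [0.81, 0.74, 0.88]               # hypothetical scores
learner_rmse = [1.9, 2.4, 1.6]
ensemble_r2 = max(learner_r2)                 # Max() rule
ensemble_rmse = min(learner_rmse)             # Min() rule
mean_r2 = sum(learner_r2) / len(learner_r2)   # Mean() rule
```

The Max()/Min() rules pick a single best base learner, while the Mean() rule gives the average baseline the paper compares against.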
The algorithm for the proposed ensemble method:

    For k = 1 to 6                           // there are 6 datasets
        For n = 1 to 10                      // 10 base learners
            Divide data into training and testing sets
            Number of hidden neurons = Rand(1:50)
            Train the model Sn with the training data
            Test the model Sn with the testing data
            Keep the result as hypothesis Hn
        Next n
        Compute the best of all hypotheses: Hfinal(x) =
    Next k                                   // repeat for all datasets
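The pseudocode could be realized along the following lines. Since the paper trains real ANN base learners, the stub learner below (a simple mean predictor) is purely a hypothetical stand-in so the sketch stays self-contained; in practice it would be replaced by an ANN trainer that actually uses `n_hidden`. The Min() rule on RMSE is used here to select the final hypothesis.

```python
import math
import random

def rmse(actual, predicted):
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

def train_stub_learner(train_y):
    """Hypothetical stand-in for an ANN trainer: predicts the training mean.
    A real implementation would fit a network with n_hidden hidden neurons."""
    mean_y = sum(train_y) / len(train_y)
    return lambda x: mean_y

def run_ensemble(data_x, data_y, n_learners=10, seed=0):
    rng = random.Random(seed)
    hypotheses = []
    for _ in range(n_learners):                   # 10 base learners
        idx = list(range(len(data_x)))
        rng.shuffle(idx)                          # fresh 70/30 split per learner
        cut = int(0.7 * len(idx))
        train_idx, test_idx = idx[:cut], idx[cut:]
        n_hidden = rng.randint(1, 50)             # Number of hidden neurons = Rand(1:50)
        model = train_stub_learner([data_y[i] for i in train_idx])
        err = rmse([data_y[i] for i in test_idx],
                   [model(data_x[i]) for i in test_idx])
        hypotheses.append((err, n_hidden, model))
    # Combine: keep the hypothesis with the least error (Min() rule)
    best_err, best_n_hidden, h_final = min(hypotheses, key=lambda h: h[0])
    return h_final, best_n_hidden, best_err

xs = list(range(20))
ys = [0.5 * x for x in xs]                        # toy data, not the paper's datasets
h_final, best_n, best_err = run_ensemble(xs, ys)
```

In the paper itself this loop runs once per dataset, with each base learner a full ANN and the best hypothesis returned as the ensemble output.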

IV. RESULTS AND DISCUSSION

After running the proposed ensemble model on the six datasets, the results shown in Figures 3 through 5 were obtained. Figure 3 shows that the proposed ensemble model performed better than the average of the base learners making up the ensemble: the R2 of the proposed model is higher than the average R2 of its components. In terms of RMSE, Figure 4 shows the superiority of the proposed algorithm, with a lower RMSE than the average of the base learners. With MAE,

Figure 5 shows a similar trend to RMSE, with lower values for the proposed algorithm than the average performance of the base learners. All these comparative results are in agreement with the theory of the ensemble methodology [15, 16, 30], which seeks to integrate all possible solutions to a problem and evolve the best of them. This makes sense in that the evolved overall best of an ensemble model should be either better than or at least equal to the best of its components. The obvious effect of the number of hidden neurons on the performance of ANN confirms the diversity induced by this parameter and the need for an ensemble solution for optimal performance. The results of this study show that using the objective and fair guesses of a randomized algorithm performs better than the subjective guesses of the trial-and-error method of determining the optimal number of hidden neurons. This agrees with the literature on the benefits of randomized algorithms and their low computational cost [26 - 28]. This is a successful proof of the concept of using randomized algorithms to determine the number of hidden neurons of ANN, removing the great effort, huge cost, and often poor prediction performance associated with the trial-and-error method. This work has the potential to open a new window of opportunities for continued research on further developing the ensemble methodology, especially in petroleum engineering applications.

V. CONCLUSION

In this paper, we have presented a successful proof of concept showing that the complexity of determining the optimal number of hidden neurons for ANN can be addressed by using randomization in an ensemble fashion. We also established the need for the ensemble learning methodology in petroleum reservoir characterization. Reports from past studies emphasizing the use of trial-and-error methods to optimize the number of hidden neurons were presented. To evaluate the proposed ensemble model, we used six porosity and permeability datasets obtained from different petroleum reservoirs. We then compared the performance of the ensemble model with the average performance of its base learners using well-established decision rules and standard evaluation criteria. The implementation results showed that the proposed ensemble model performed better than the average of its base learners, in agreement with the theory of the ensemble methodology. Our study indicates a great potential for more successful applications of the ensemble method in petroleum engineering. However, a more rigorous comparative investigation is needed to benchmark the performance of this ensemble model against a bagging-based ANN ensemble and the Random Forest technique, one of the best-known implementations of the ensemble paradigm.

[Figures 1 and 2 plot the randomly generated numbers of hidden neurons (y-axis: Number of Hidden Neurons, 0-50) against the 10 base learners (x-axis: Base Learners).]

Fig. 1. Random Selection of Hidden Neurons (Porosity Data 1).

Fig. 2. Random Selection of Hidden Neurons (Permeability Data 3).

Fig. 4. RMSE Comparison on all Datasets.

Fig. 5. MAE Comparison on all Datasets.

Fig. 3. Correlation Coefficient Comparison on all Datasets.

ACKNOWLEDGMENT

The authors would like to thank Universiti Malaysia Sarawak and King Fahd University of Petroleum and Minerals for providing the resources used in the conduct of this study.

REFERENCES

[1] C. Caragea, J. Sinapov, A. Silvescu, D. Dobbs, and V. Honavar, "Glycosylation site prediction using ensembles of support vector machine classifiers," BMC Bioinformatics, vol. 8, issue 2, pp. 438, 2007.
[2] J.K. Ali, "Neural networks: a new tool for the petroleum industry," in SPE European Petroleum Computer Conference, Aberdeen, pp. 217-231, 1994.
[3] S. Mohaghegh, "Virtual intelligence and its applications in petroleum engineering: artificial neural networks," in Journal of Petroleum Technology, Distinguished Lecture Series, 2000.
[4] L. Jong-Se, "Reservoir properties determination using fuzzy logic and neural networks from well data in offshore Korea," J. of Petroleum Sci. & Eng., vol. 49, pp. 182-192, 2005.
[5] J.B. Petrus, F. Thuijsman, and A.J. Weijters, Artificial Neural Networks: An Introduction to ANN Theory and Practice. Springer, Netherlands, 1995.
[6] M. Bruen and J. Yang, "Functional networks in real-time flood forecasting: a novel application," Advances in Water Resources, vol. 28, pp. 899-909, 2005.
[7] W. Gao, "Study on new improved hybrid genetic algorithm," in Advances in Information Technology and Industry Applications, D. Zeng, Ed. Lecture Notes in Electrical Engineering, vol. 136, pp. 505-512, 2012.
[8] R.R. Bies, M.F. Muldoon, B.G. Pollock, S. Manuck, G. Smith, and M.E. Sale, "A genetic algorithm-based, hybrid machine learning approach to model selection," J. of Pharmacokinetics & Pharmacodynamics, vol. 33, issue 2, pp. 195-221, 2006.
[9] L. Nanni and A. Lumini, "An ensemble of support vector machines for predicting virulent proteins," Expert Systems with Applications, vol. 36, pp. 7458-7462, 2009.
[10] I. Zaier, C. Shu, T.B.M.J. Ouarda, O. Seidou, and F. Chebana, "Estimation of ice thickness on lakes using artificial neural network ensembles," J. of Hydrology, vol. 383, pp. 330-340, 2010.
[11] J. Sun and H. Li, "Financial distress prediction using support vector machines: ensemble vs. individual," Applied Soft Computing, vol. 12, issue 8, pp. 2254-2265, 2012.
[12] T. Helmy, F. Anifowose, and K. Faisal, "Hybrid computational models for the characterization of oil and gas reservoirs," Expert Systems with Applications, vol. 37, pp. 5353-5363, 2010.
[13] M. Re and G. Valentini, "Simple ensemble methods are competitive with state-of-the-art data integration methods for gene function prediction," J. of Machine Learning Research, vol. 8, pp. 98-111, 2010.
[14] V. Landassuri-Moreno and J.A. Bullinaria, "Neural network ensembles for time series forecasting," in Genetic and Evolutionary Computation Conference (GECCO), Montréal, Québec, Canada, 2009.
[15] L. Breiman, "Bagging predictors," Machine Learning, vol. 24, issue 2, pp. 123-140, 1996.
[16] L. Breiman, "Random forests," Machine Learning, vol. 45, issue 1, pp. 5-32, 2001.
[17] D. Chen, J. Quirein, H. Hamid, H. Smith, and J. Grable, "Neural network ensemble selection using multiobjective genetic algorithm in processing pulsed neutron data," in SPWLA 45th Annual Logging Symposium, June 6-9, 2004.
[18] H. Demuth, M. Beale, and M. Hagan, Neural Network Toolbox 6 User's Guide. The MathWorks Inc., New York, 2009.
[19] K. Shibata and Y. Ikeda, "Effect of number of hidden neurons on learning in large-scale layered neural networks," in ICCAS-SICE 2009, pp. 5008-5013, August 2009.
[20] G. Panchal, A. Ganatra, Y.P. Kosta, and D. Panchal, "Behaviour analysis of multilayer perceptrons with multiple hidden neurons and hidden layers," International Journal of Computer Theory and Engineering, vol. 3, no. 2, pp. 332-337, April 2011.
[21] G.B. Huang and H.A. Babri, "Upper bounds on the number of hidden neurons in feedforward networks with arbitrary bounded nonlinear activation functions," IEEE Transactions on Neural Networks, vol. 9, no. 1, pp. 224-229, 1998.
[22] I. Rivals and L. Personnaz, "A statistical procedure for determining the optimal number of hidden neurons of a neural model," in Second International Symposium on Neural Computation (NC'2000), Berlin, May 23-26, 2000.
[23] J.T. Heaton, Introduction to Neural Networks with Java. Heaton Research, Inc., 440 pages, 2008.
[24] Y. Liu, J.A. Starzyk, and Z. Zhu, "Optimizing number of hidden neurons in neural networks," in Proceedings of the 25th IASTED International Multi-Conference on Artificial Intelligence and Applications, pp. 121-126, 2007.
[25] S. Karsoliya, "Approximating number of hidden layer neurons in multiple hidden layer BPNN architecture," International Journal of Engineering Trends and Technology, vol. 3, issue 6, pp. 714-717, 2012.
[26] S. Arora and B. Barak, Computational Complexity: A Modern Approach. Princeton University Press, 509 pages, 2009.
[27] D.P. Bovet and P. Crescenzi, Introduction to the Theory of Complexity. First Edition, Creative Commons, USA, 290 pages, 2006.
[28] D.S. Johnson, "The NP-completeness column: an ongoing guide," J. Algorithms, vol. 5, pp. 433-447, 1984.
[29] T.H. Cormen, C.E. Leiserson, R.L. Rivest, and C. Stein, Introduction to Algorithms. The MIT Press, Cambridge, Massachusetts, pp. 301-309, 2001.
[30] R. Polikar, "Ensemble based systems in decision making," IEEE Circuits & Systems Magazine, vol. 3, pp. 21-45, 2006.