Application of Neural Network for the Prediction of Eco-efficiency

Sławomir Golak1, Dorota Burchart-Korol2, Krystyna Czaplicka-Kolarz2, Tadeusz Wieczorek1

1 Silesian University of Technology, Poland
2 Central Mining Institute, Poland

Abstract. This paper presents the application of neural networks in the design process of new technologies, taking into account factors such as their influence on the environment and the economic effects of their implementation. The use of neural networks allowed eco-efficiency assessment of technologies based on a highly reduced number of descriptive design parameters, which are very difficult to collect at the conceptual design stage. The great diversity of the technologies involved, along with the small number of available examples, made it difficult to construct a neural model and demanded careful data preprocessing and network structure selection.

Keywords: eco-efficiency, prediction, feature selection, neural network structure optimization.

1 Introduction

A technology life cycle consists of many stages: from concept design, through the detailed design process, implementation and the use phase, all the way to its liquidation. So far the principal criterion for the evaluation of a technology has been its economic efficiency, i.e. the ratio of expenditures for its development and operation to the profit return. Recently, the global economy has been paying more attention to the environmental performance of various technologies. For these reasons environmental impact assessment (expressed in eco-indicator points) and economic-environmental impact assessment are becoming an inseparable element of the design process. Both these aspects together are expressed in eco-efficiency indicator points. Classic analytical methods for eco-efficiency assessment require thorough economic and environmental analyses. The main drawback of this approach, besides being laborious, is that the analyses can be carried out only on the basis of a completed design of the assessed technology. As a result, the environmental impact assessment can be obtained only after the design phase has been finished, and quite frequently not until the implementation process, which, in the case of bad indicators, means a costly re-design. Since such enterprises usually involve complex and multistage financing systems, a return to the design phase may even prove impossible.

D. Liu et al. (Eds.): ISNN 2011, Part III, LNCS 6677, pp. 380–387, 2011. © Springer-Verlag Berlin Heidelberg 2011


Therefore, it was essential to find tools that would allow the assessment of economic and environmental efficiency in the earliest possible design phase and on the basis of a smaller data set than the one required in the classic analyses. One way to replace analytical methods is to apply models based on data which describe the parameters and influences of completed and implemented technologies. In selecting such a model, a few factors must be taken into account, namely: the nonlinear or even discontinuous character of the dependencies between the parameters and the impact indicators of the technology, the high level of abstraction of the eco-efficiency indicator, and the fact that the ultimate measure of technological impact is greatly influenced by subjective, stochastic decisions of the person who carries out the classic analysis. So far neural networks have been used to predict only the ecological effects of implemented technologies [1-4]. This paper presents research concerning the prediction of both economic and environmental effects expressed in eco-efficiency indicator points.

2 Eco-efficiency

Eco-efficiency is a new concept in environmental management which integrates environmental considerations with economic analysis to improve products and technologies. It is a strategic tool and one of the key factors of sustainable development. Eco-efficiency analysis allows designers to find the most effective solution, taking into account both the economic aspects and the environmental compatibility of products and technologies. The environmental impact should be as low as possible, while the economic performance should be the highest achievable. The eco-efficiency tools used at the Central Mining Institute are Life Cycle Assessment (LCA) and Net Present Value (NPV). The main purposes of eco-efficiency analysis are: reducing the consumption of resources, reducing the environmental impact, increasing the value added of the product, and increasing the economic efficiency of production while reducing environmental impact. In short, the purpose of eco-efficiency is to maximize value creation while minimizing the use of resources and the emission of pollutants. The literature offers a number of formulas for calculating eco-efficiency, based on the various available data on environmental impact and on the size and cost of production. Eco-efficiency analysis is used to compare similar products or technologies in order to choose the solution with the lowest cost and the least environmental impact (eco-effective solutions).

The Central Mining Institute in Poland has conducted eco-efficiency analyses for several years. Its research methodology combines environmental management tools with economic tools: the environmental assessment is based on LCA, while the economic indicators are calculated on the basis of NPV [5].
Life Cycle Assessment (LCA) is an environmental assessment method for evaluation of impacts that a product, process or technology has on the environment over the entire period of its life – from the extraction of the raw material through the manufacturing, packaging and marketing processes, the use, re-use and maintenance of the


product or technology, to its eventual recycling or disposal as waste at the end of its useful life. The second element of the eco-efficiency analysis is net present value (NPV). The NPV method involves discounting the streams of revenue and expenditure and calculating their present value. It makes receipts and payments occurring at different times comparable by converting them into their value at a specified base time. Calculating NPV involves summing all the net benefits (net cash flows) associated with the technology or product over its economic life cycle, each discounted to a single moment in time before aggregation in order to unify their monetary value. To calculate the eco-indicator, specific tools should be used (e.g. SimaPro), and the input and output data of the technology should be defined very precisely for all life cycle phases: construction, operation and decommissioning. Carrying out a full LCA is very difficult, so some limitations are often introduced. In particular, obtaining data on the construction and decommissioning phases of each plant is often difficult and laborious. Since it is the operation stage which has the greatest impact on the eco-efficiency indicator for most of the analysed technologies, it should be considered the most important phase in the technology life cycle from the users' point of view.
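The NPV calculation described in this section can be illustrated with a minimal Python sketch. It uses the standard discounting formula; the function name, the cash flows and the 8% discount rate are hypothetical values chosen for illustration and do not come from the study.

```python
def net_present_value(cash_flows, discount_rate):
    """Sum of net cash flows, each discounted to the base year.

    cash_flows[t] is the net cash flow (receipts minus payments) in year t;
    t = 0 is the base year, so that flow is not discounted.
    """
    return sum(cf / (1.0 + discount_rate) ** t for t, cf in enumerate(cash_flows))


# Hypothetical technology: an initial outlay followed by three years of net receipts.
flows = [-1000.0, 400.0, 400.0, 400.0]
npv = net_present_value(flows, discount_rate=0.08)
print(f"NPV at 8%: {npv:.2f}")
```

A positive result means that the discounted receipts exceed the discounted payments over the assumed economic life cycle.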

3 Data Preprocessing

The main problem with using artificial neural networks to predict eco-efficiency is the low availability of the data necessary to create the model. It must be remembered that the undertakings in question are usually multimillion-dollar investments, which naturally limits the number of cases. An additional difficulty is that such enterprises usually involve financial secrecy and trade secrets, which prevents the creators of a given technology from sharing all the information they possess with third parties. The resulting situation is highly problematic because of the complexity of the modeled dependency combined with the small amount of data. Since in the study presented here data were collected on only a few dozen technologies, it was essential that all the cases be considered. Two major problems were taken into consideration:
─ the studied technologies belonged to three different fields: energy, material and environmental,
─ the scale of the technologies ranged from local implementations to national undertakings, which meant an analogous range of variability in their parameters and environmental impact measures.

In these particular circumstances, the primary principle of feature selection was the availability of the features at various stages of the design process and the cost of their acquisition. The set of potential features included the following:
─ predictors of resource use, energy consumption, waste and emissions for the stages of construction, operation and decommissioning,
─ environmental impact indicators for the stages of construction, operation and decommissioning, which are functionally dependent on the predictors,



─ indicators of ecological effects, impact on human health, and resource use, which are functionally dependent on the predictors and thereby on the indicators for the stages,
─ total eco-indicator of the environmental impact of the technology, which is functionally dependent on the eco-indicators for all stages and kinds of impact,
─ social impact,
─ lifetime of the technology and its annual and total product,
─ financial effect (NPV).

Since it is the operation phase that seems to have the greatest influence on the eco-efficiency indicator for most of the analysed technologies, and the data concerning this phase are relatively easily available, at the first stage of feature selection only the parameters describing this phase were chosen to predict the total impact of the technologies.

The order of magnitude of some of the features under consideration can change as much as ninefold. For this reason, among others, the basic set of features was extended with the following features derived from the primary ones:

─ Transformed to logarithmic scale

$$\xi_{i,j} = \ln\left(\left|x_{i,j}\right| + 1\right) \cdot \operatorname{sign}\left(x_{i,j}\right) \qquad (1)$$

where $x_{i,j}$ is the original j-th feature of the i-th technology and $\xi_{i,j}$ is the scaled j-th feature of the i-th technology.

─ Ratio of original features

$$\xi_{i,jk} = \frac{x_{i,j}}{x_{i,k}} \qquad (2)$$

where $x_{i,j}$ is the original j-th feature of the i-th technology, $x_{i,k}$ is the original k-th feature of the i-th technology, and $\xi_{i,jk}$ is the scaled jk-th feature of the i-th technology.

Since eco-efficiency itself varies over very different orders of magnitude, scaled variants of this value were created using analogous transformations.

The extremely small size of the resulting data set (only 39 examples), which is to be used to construct a neural model, has the following consequences:
─ statistical tools [6] cannot be used in feature selection,
─ the construction time of a single neural model (for one combination of features) is very short,
─ the brute-force search method can be used to find the optimum model-oriented set of features.

The diagram in Fig. 1 shows an algorithm for feature selection based on neural models for all subsets of the extended set of features. In generating the subsets for individual models, it was decided that each feature could be taken into account only once, either in its primary form or transformed into one of the scales.
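The two feature transformations of Eqs. (1) and (2) can be sketched in Python as follows. This is only an illustration: the function names and sample values are hypothetical, the logarithm is assumed to act on the absolute value of the feature (so that the sign factor handles negative values), and the ratio assumes a non-zero denominator. An inverse of the logarithmic transformation is included because values predicted on a scaled variant have to be mapped back to the original scale before errors are compared.

```python
import math


def signed_log(x):
    """Eq. (1): ln(|x| + 1) * sign(x); keeps the sign and maps 0 to 0."""
    return math.copysign(math.log1p(abs(x)), x)


def signed_log_inverse(xi):
    """Inverse of signed_log, used to return to the original scale."""
    return math.copysign(math.expm1(abs(xi)), xi)


def ratio(x_j, x_k):
    """Eq. (2): ratio of two original features; assumes x_k is non-zero."""
    return x_j / x_k


# Hypothetical feature values spanning several orders of magnitude.
samples = [-2500.0, 0.0, 3.0, 4.0e6]
scaled = [signed_log(v) for v in samples]
restored = [signed_log_inverse(s) for s in scaled]
print(scaled)
print(restored)
```

The transformation compresses large magnitudes while remaining monotonic and defined for negative values, which a plain logarithm is not.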


[Figure 1: flowchart with the following blocks: an initial set of features and a basic variant of eco-efficiency; manual selection of features based on data availability; the basic set extended by scaled features and eco-efficiency variants; construction of a neural model for a subset of features and an eco-efficiency variant; selection of subsets of features and an eco-efficiency variant; selection of a subset of features and an eco-efficiency variant based on the prediction errors of the neural models; model with a defined structure, input features and an eco-efficiency variant.]

Fig. 1. An algorithm of the brute-force feature selection

The decision was made on the grounds that including the same feature more than once would not introduce any new information to the model, while at the same time it would dramatically increase the number of iterations of the selection algorithm. Additionally, the feature values and the output value were standardized. Before the prediction error was calculated for individual feature subsets and variants of the estimated value, the results provided by the network were transformed back to their original scales so that the various models could be compared.
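The subset generation rule described above (each feature used at most once, in exactly one of its forms) can be sketched as a short Python generator. The feature names and the variants listed for each feature are hypothetical, since the exact sets are not listed here, and ratio features, which combine two primary features, are left out of this simplified sketch.

```python
from itertools import combinations, product


def candidate_inputs(variants_by_feature):
    """Yield every non-empty input set in which each feature occurs at most
    once, in exactly one of its variants (original or a scaled form)."""
    names = sorted(variants_by_feature)
    for size in range(1, len(names) + 1):
        for chosen in combinations(names, size):
            # One variant is picked for every feature in the chosen subset.
            for picks in product(*(variants_by_feature[n] for n in chosen)):
                yield tuple(zip(chosen, picks))


# Hypothetical features and variants, for illustration only.
variants = {
    "eco_indicator_operation": ["raw", "log"],
    "social_impact": ["raw"],
    "npv": ["raw", "log"],
}
candidates = list(candidate_inputs(variants))
print(len(candidates), "candidate input sets")
```

With v_j variants available for feature j, this rule yields (v_1 + 1)(v_2 + 1)... − 1 candidate sets, whereas treating every variant as an independent feature would give 2 to the power of the total number of variants, minus one.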

4 Creating a Neural Model

The classic multi-layer perceptron was chosen to predict eco-efficiency, as this neural model proves efficient in the prediction of continuous values and this type of network structure is also successful with small data sets [7, 8]. For the same reasons as in the case of data preprocessing, the optimal network structure for the input features and the output variant set by the superior algorithm was determined by the brute-force search method. The search limits were determined on the basis of Kolmogorov's theorem, and the only networks considered were those with one or two hidden layers with a sigmoidal transfer function and a linear output layer. The upper limits on the numbers of neurons in particular layers were chosen with regard to the time needed to run the network structure selection algorithm, which is executed within the feature selection algorithm. In order to verify that the selected maximum numbers of neurons were justified, the dependence of the network error on the number of network parameters was plotted. The number of network parameters was calculated from Formula (3) for networks with one hidden layer, and from Formula (4) for networks with two hidden layers.



$$N_P = (N_I + 2) \cdot N_{L1} + 1 \qquad (3)$$

$$N_P = (N_I + 1) \cdot N_{L1} + (N_{L1} + 2) \cdot N_{L2} + 1 \qquad (4)$$

where $N_P$ is the number of network parameters, $N_I$ is the number of network inputs, $N_{L1}$ is the number of neurons in the first hidden layer, and $N_{L2}$ is the number of neurons in the second hidden layer.
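Formulas (3) and (4) count the weights and biases of a network with a single linear output neuron. A minimal Python sketch follows; the function name is hypothetical, and the two example structures are taken from Table 1.

```python
def parameter_count(n_inputs, n_hidden1, n_hidden2=None):
    """Number of weights and biases of a perceptron with one output neuron.

    Formula (3) applies when there is one hidden layer,
    Formula (4) when there are two.
    """
    if n_hidden2 is None:
        return (n_inputs + 2) * n_hidden1 + 1
    return (n_inputs + 1) * n_hidden1 + (n_hidden1 + 2) * n_hidden2 + 1


print(parameter_count(3, 5, 17))   # structure 3-5-17-1
print(parameter_count(6, 12, 20))  # structure 6-12-20-1
```

In Formula (4), the term (N_I + 1) · N_L1 covers the weights and biases of the first hidden layer, while (N_L1 + 2) · N_L2 + 1 covers the weights and biases of the second hidden layer together with the weights and bias of the output neuron.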

Fig. 2. Dependence of a neural model error on number of its parameters

The scatter of the 420 points representing the tested network structures, visible in the exemplary graph (for the best set of input features), results from the random initialization of the neural network weights. Although the multi-start method was applied, so that each point represents an average over 10 networks of the same structure with different initial weight values, this was not sufficient to eliminate fluctuations of the error value completely. Nevertheless, despite this scatter, which is a consequence of the computational limitations, the error values can be seen to increase for larger networks, which justifies the assumed limits on network size.
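The structure search with multi-start averaging can be sketched as follows. Everything named here is hypothetical: `stand_in_train_and_score` merely imitates a training run with a made-up error, and the cap of 20 neurons per hidden layer is an assumption. The search limits are not stated explicitly, although a cap of 20 would give 20 + 20 · 20 = 420 candidate structures, which agrees with the number of points mentioned above.

```python
import random
from statistics import mean


def candidate_structures(max_neurons):
    """All hidden-layer layouts with one or two hidden layers."""
    sizes = range(1, max_neurons + 1)
    one_layer = [(n1,) for n1 in sizes]
    two_layers = [(n1, n2) for n1 in sizes for n2 in sizes]
    return one_layer + two_layers


def multi_start_error(train_and_score, hidden, starts=10, seed=0):
    """Average error of `starts` networks differing only in initial weights."""
    rng = random.Random(seed)
    return mean(train_and_score(hidden, rng.randrange(2**32)) for _ in range(starts))


def stand_in_train_and_score(hidden, init_seed):
    # Stand-in for real training (an assumption, not the paper's model):
    # a made-up error that grows with network size, plus initialization noise.
    noise = random.Random(init_seed).uniform(-0.5, 0.5)
    return 1.0 + 0.05 * sum(hidden) + noise


structures = candidate_structures(20)  # assumed cap of 20 neurons per layer
best = min(structures, key=lambda h: multi_start_error(stand_in_train_and_score, h))
print(len(structures), "structures tested; best hidden layout:", best)
```

In an actual experiment the stand-in would be replaced by a routine that trains a perceptron with the given hidden layout and returns its validation error.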

5 Results

Since the data set used in the presented study was relatively small, the constructed neural network was validated by means of leave-one-out cross-validation, which assigns only one example to the validation set at each iteration and thus provides the largest possible learning data set. An additional and very important advantage of this kind of validation is that it yields the eco-efficiency prediction error individually for each technology considered, which makes a substantive assessment of the results easier by indicating the technologies that stand out from the others. Because the neural network was used to predict the eco-efficiency of many different technologies, whose indicators differ by orders of magnitude, the only way to establish a common unit of measure was to base it on relative errors:


$$E = \frac{1}{N} \sum_{i=1}^{N} \left| \frac{y_i - d_i}{y_i} \right| \cdot 100\% \qquad (5)$$

where $E$ is the error measure, $N$ is the number of technologies, $y_i$ is the network prediction for the i-th test technology, and $d_i$ is the expected value of the indicator for the i-th test technology.

The application of the above algorithm for feature selection and network structure selection made it possible to achieve a very high accuracy of eco-efficiency prediction (a relative error of 1.54%) on the basis of data that are relatively easily available at the conceptual stage (Table 1, row 1). Encouraged by this success, we made several attempts to predict eco-efficiency relying exclusively on the input data used in the analytical calculation of the eco-indicator for the operation phase. The lower accuracy of prediction obtained for all predictors of the operation phase (Table 1, row 2) indicates that the potential additional information introduced to the model does not compensate for the growth of the network structure when the number of learning data remains unchanged. The subsequent prediction models for individual predictors are characterized by lower accuracy (rows 3-6); however, since preparing their data requires three times less labour than the eco-indicator, they are an attractive alternative.

Table 1. Accuracy of eco-efficiency prediction

No. | Input features | Structure | Relative error
1 | Eco-indicator for operation phase, social impact, NPV | 3-5-17-1 | 1.54%
2 | Predictors of resources, energy, waste and emissions, social impact, NPV | 6-12-20-1 | 2.05%
3 | Predictor of resources, social impact, NPV | 3-11-12-1 | 3.75%
4 | Predictor of energy, social impact, NPV | 3-17-6-1 | 6.51%
5 | Predictor of waste, social impact, NPV | 3-5-6-1 | 8.41%
6 | Predictor of emissions, social impact, NPV | 3-9-18-1 | 6.51%
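The leave-one-out validation and the relative error measure of Eq. (5) can be sketched in Python as follows. This is a simplified illustration: `fit` and `predict` are hypothetical callables standing for network training and prediction, and a trivial proportional model with made-up data replaces the neural network so that the sketch stays self-contained.

```python
def mean_relative_error(predictions, targets):
    """Eq. (5): mean of |(y_i - d_i) / y_i| in percent (y_i is the prediction)."""
    terms = [abs((y - d) / y) for y, d in zip(predictions, targets)]
    return 100.0 * sum(terms) / len(terms)


def leave_one_out(inputs, targets, fit, predict):
    """For each example, train on all the others and predict the one left out."""
    predictions = []
    for i in range(len(inputs)):
        train_x = inputs[:i] + inputs[i + 1:]
        train_d = targets[:i] + targets[i + 1:]
        model = fit(train_x, train_d)
        predictions.append(predict(model, inputs[i]))
    return predictions


# Stand-in model (an assumption): least-squares fit of d = k * x.
def fit_proportional(xs, ds):
    return sum(x * d for x, d in zip(xs, ds)) / sum(x * x for x in xs)


def predict_proportional(k, x):
    return k * x


xs = [1.0, 2.0, 3.0, 4.0]  # made-up inputs
ds = [2.0, 4.0, 6.0, 8.0]  # made-up expected values
loo_predictions = leave_one_out(xs, ds, fit_proportional, predict_proportional)
print(mean_relative_error(loo_predictions, ds))
```

Because every example is predicted by a model that never saw it, the individual terms of the sum can also be inspected one by one to spot technologies whose eco-efficiency is markedly over- or underestimated.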

6 Conclusions

The possibility of predicting the economic and ecological effects of new technologies at the stage of their conceptual design, relying on a simplified set of descriptive parameters, allows designers to avoid costly mistakes when the first strategic decisions are made. The main difficulty in applying a neural network to this task is the limited number of learning data, and the situation is made even more problematic by the relatively high complexity of the dependencies between the parameters and the technological impacts. On the other hand, a small learning data set, which implies a potentially small neural network, makes it possible to apply brute-force search methods both for finding the optimal network structure and for selecting the input data set. Owing to this solution, the accuracy of eco-efficiency prediction from easily available parameters is high enough for the predictions to facilitate conceptual decisions about technologies to a large degree.



It should be noted that a characteristic feature of the presented application of neural networks is that it may be a source of additional knowledge about the technologies whose data were used to create the network. Significant over- or underestimation of the eco-efficiency of a given technology indicates that the technology is not typical and needs further analysis, contrary to the classic approach, where such a case is frequently disregarded. The fact that a large part of the descriptive parameters can be disregarded proves the existence of strong dependencies between them, which are very hard to define. This situation has encouraged environmental protection specialists to undertake thorough studies in this field. Additional information on the meaning and synergic effect of individual parameters might be obtained by applying one of the methods of knowledge extraction from neural networks [9].

References

1. Seo, K.-K., Min, S.-H., Yoo, H.-W.: Artificial neural network based life cycle assessment model for product concepts using product classification method. In: Gervasi, O., Gavrilova, M.L., Kumar, V., Laganá, A., Lee, H.P., Mun, Y., Taniar, D., Tan, C.J.K. (eds.) ICCSA 2005. LNCS, vol. 3483, pp. 458–466. Springer, Heidelberg (2005)
2. Sousa, I., Wallace, D.: Product classification to support approximate life-cycle assessment of design concepts. Technological Forecasting & Social Change 73, 228–249 (2006)
3. Seo, K.-K., Kim, W.-K.: Approximate Life Cycle Assessment of Product Concepts Using a Hybrid Genetic Algorithm and Neural Network Approach. In: Szczuka, M.S., Howard, D., Ślȩzak, D., Kim, H.-k., Kim, T.-h., Ko, I.-s., Lee, G., Sloot, P.M.A. (eds.) ICHIT 2006. LNCS (LNAI), vol. 4413, pp. 258–268. Springer, Heidelberg (2007)
4. Li, J., Wu, Z.: Application of neural network on environmental impact assessment tools. Int. J. Sustainable Manufacturing 1(1/2), 100–121 (2008)
5. Czaplicka-Kolarz, K., Wachowicz, J., Bojarska-Kraus, M.: A Life Cycle Method for Assessment of a Colliery’s Balance. The International Journal of Life Cycle Assessment 9, 247–253 (2004)
6. Duch, W., Wieczorek, T., Biesiada, J., Blachnik, M.: Comparison of feature ranking methods based on information entropy. In: IEEE International Conference on Neural Networks, vol. 2, pp. 1415–1419 (2004)
7. Lanouette, R., Thibault, J., Valade, J.L.: Process modeling with neural networks using small experimental datasets. Computers and Chemical Engineering 23, 1167–1176 (1999)
8. Li, D.C., Yeh, C.W.: A non-parametric learning algorithm for small manufacturing data sets. Expert Systems with Applications 32, 391–398 (2008)
9. Wieczorek, T., Golak, S.: An algorithm of knowledge extraction from trained neural networks. In: IIPWM 2004. Advances in Soft Computing, Springer, Heidelberg (2004)