
Indian Journal of Engineering & Materials Sciences Vol.16, August 2009, pp. 237-244

Artificial neural networks for predicting low temperature performances of modified asphalt mixtures

Yuksel Tasdemir
Faculty of Engineering and Architecture, Bozok University, 66100 Yozgat, Turkey

Received 1 April 2008; accepted 23 March 2009

In this study, the estimation of the low temperature performance of modified asphalt mixtures is investigated using multi-layer perceptrons (MLP), one of the artificial neural network (ANN) techniques, and the general linear model (GLM). The Levenberg-Marquardt algorithm, the fastest MLP training algorithm, is used to optimize the network weights. The ANN test results are compared with the GLM results. The GLM has historically been used to model the low temperature performance (fracture temperature and fracture strength) of asphalt pavements. The data used in the ANN model and the GLM comprise four input parameters (additive type, asphalt binder content, aging level and air void content) and two output parameters (fracture temperature and fracture strength). Based on the comparisons, it is found that the ANN generally gives better fracture temperature and fracture strength estimates than the GLM technique.

Keywords: Artificial neural network; General linear model; Thermal stress restrained specimen test; Fracture temperature; Fracture strength

Low temperature cracking of asphalt pavements can be a serious problem in many cold regions of the world. It occurs when the thermal stress induced at low temperature exceeds the tensile strength of the asphalt concrete pavement. In the laboratory, the thermal stress restrained specimen test (TSRST) is used to characterize the low temperature performance of asphalt mixtures. In this test, the thermally induced stress gradually increases as the temperature is decreased until the specimen fractures; the temperature and the maximum stress at that point are the fracture temperature and the fracture strength, respectively. Numerous experimental studies have been performed on the fracture temperature and fracture strength of different asphalt mixtures using restrained cooling tests.1-12 Models for predicting the low temperature properties of asphalt pavements have also been proposed.1 Historically, general linear model (GLM) procedures have been used to model the fracture temperature and the fracture strength.2

Over the past two decades, there has been increasing interest in artificial neural networks (ANNs). ANNs are powerful and versatile computational tools that can organize and correlate information in ways that have proved useful for solving certain types of problems that are very complex, poorly understood, or too resource-intensive to tackle using more traditional computational methods. ANNs have been successfully used for tasks such as pattern recognition, function approximation, optimization, forecasting, data retrieval, and automatic control.13 ANNs have also become popular among researchers in the field of asphalt because of their effective results.14-16

The main purpose of the work discussed in this paper is to develop suitable models using the GLM and ANN for estimating the fracture temperature and the fracture strength. The developed models are trained and tested on experimental data, and the outputs of the GLM and ANN are compared with the experimental results.

Materials

Binder and additives

One conventional base bitumen and one polymer modified bitumen were used in the study. Both products were obtained from British Petroleum (BP), and the polymer modified bitumen was premixed at their processing facilities using the same base bitumen. The type and concentration of polymer, as well as any special mixing techniques used for this product, are proprietary. Besides the polymer, additives (loose cellulose fiber and synthetic fiber) from two different suppliers were used in the study. The loose cellulose fiber and the synthetic fiber were added at 0.3% and 0.05% by mixture weight, respectively. Traditional index and Superpave test results of the base asphalt and the polymer modified binder have been reported elsewhere.3

Asphalt concrete mixtures

Test specimens of coarse graded asphalt concrete according to the Superpave specification, with a maximum aggregate size of 25 mm, were prepared. A crushed basalt aggregate from Holbrook, Arizona was used in all mixtures. The measured particle size distribution of the aggregate is shown in Fig. 1. The Marshall mix design method was used, and the optimal bitumen content was estimated as 5.0% by weight. To determine the effect of bitumen content variation, a maximum bitumen film thickness was chosen, in addition to the optimal bitumen content, based on the Iowa Department of Transportation (IDOT) specification.17 The bitumen content for maximum film thickness was then estimated at 5.6%. Based on the producer's recommendations, the mixing and compaction temperatures used for the mixtures containing polymer modified bitumen were 145°C and 135°C, respectively. For mixtures containing unmodified bitumen, the mixing and compaction temperatures were decreased by 5°C.

To obtain homogeneous mixtures, cellulose or synthetic fibers were added to the heated aggregate and mixed for 60 s. Aggregate, fiber and bitumen were then mixed in a bucket mixer for 2.5 min. After mixing, the loose mixture was subjected to short term aging in a forced-draft oven for 4 h at 135°C. This treatment is recommended for simulating the short term aging that occurs in the field between mixing and placement.18 Beams (150 mm thick, 150 mm wide and 400 mm long) were compacted using a California kneading compactor. Finally, four TSRST specimens (50 mm × 50 mm × 250 mm) were obtained from a large beam sample.

Some of the TSRST specimens were subjected to long term oven aging (LTOA) in a forced-draft oven at 85°C for 5 and 25 days in order to investigate the effect of aging. The TSRST specimens were wrapped in a metal screen and bedded on coarse sand to prevent deformation and to allow air access underneath the specimens during the aging procedure. Specimens were turned once every day during LTOA. In total, two asphalt binder contents (5.0 and 5.6%), four types of additives (unmodified binder, cellulose fiber, polymer modified binder, and synthetic fiber) and three aging levels (0, 5 and 25 days) were considered. The target air void content was chosen as 4%.19

Test Method

Thermal stress restrained specimen test

The low temperature properties of the asphalt mixtures were evaluated using the thermal stress restrained specimen test (TSRST). The main parts of the TSRST equipment used in this study are an environmental chamber, a load frame, a screw jack, a cooling device, a temperature controller and a data acquisition and control system. The test specimen was glued to two aluminum end platens with epoxy. Before the test, the specimen was kept for one hour at approximately 2°C in the environmental chamber to ensure that the temperature inside the specimen was the same as that of the chamber. A cooling rate of 10°C/h was used. The contraction of the specimen during the cooling process was measured using two linear variable differential transformers (LVDTs). When the thermally induced stress in the specimen exceeded its strength, fracture occurred and the test was complete.

Method of Analysis

The GLM and ANN were used to model the measured fracture temperature and fracture strength, and their estimates were compared with the experimental results.

General linear model

Fig. 1–Gradation of the aggregate used

Statistical analysis of covariance was performed using a GLM procedure on the TSRST results at a significance level of 0.05 using the Statistical Analysis System (SAS) software package.20 Analysis of covariance combines some of the features of both regression and analysis of variance.

Artificial neural networks

The ANN modeling approach is a computer based methodology that attempts to simulate some important features of the human nervous system, in particular the ability to solve problems by applying information gained from past experience to new problems or case scenarios. Analogous to the human brain, an ANN uses many simple computational elements, called artificial neurons, connected by variable weights.21 ANN modeling consists of two steps: training and testing the network. During the training stage, the network uses the inductive-learning principle to learn from a set of examples called the training set. Test data are not used in training. Among the many different architectures, the multi-layer perceptron is commonly used for prediction.22

Multi-layer perceptrons (MLP)

The network consists of layers of parallel processing elements, with each layer fully connected to the preceding layer by interconnection strengths, or weights (W).21 A multilayer feed-forward network consists of an input layer, one or more hidden layers, and an output layer. The MLP can have more than one hidden layer; however, theoretical works have shown that a single hidden layer is sufficient for the ANN to approximate any complex nonlinear function.23,24 Therefore, in this study, a one-hidden-layer MLP is used, as shown in Fig. 2.
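As a minimal sketch of the network structure described above, the forward pass of an ANN(i, j, k) model with one hidden layer can be written in a few lines of Python. The log-sigmoid hidden activation follows the transfer function reported later in this paper; the bias terms, the linear output layer, the function names and the placeholder weights are assumptions made for illustration and are not taken from the study.

```python
import math

def logsig(z):
    """Logarithmic sigmoid transfer function: 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + math.exp(-z))

def mlp_forward(x, w_ij, b_j, w_jk, b_k):
    """Forward pass of an ANN(i, j, k) network with one hidden layer.

    x    : list of i input values
    w_ij : j rows of i weights (one row per hidden neuron)
    b_j  : j hidden-layer biases (assumed; not described in the text)
    w_jk : k rows of j weights (one row per output neuron)
    b_k  : k output-layer biases (assumed)
    """
    hidden = [logsig(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(w_ij, b_j)]
    # Linear output layer (an assumption made for this sketch).
    return [sum(w * h for w, h in zip(row, hidden)) + b
            for row, b in zip(w_jk, b_k)]

# ANN(4, 4, 1) with placeholder weights, not the trained values.
w_ij = [[0.0] * 4 for _ in range(4)]
b_j = [0.0] * 4
w_jk = [[1.0, 1.0, 1.0, 1.0]]
b_k = [0.0]
output = mlp_forward([0.2, 0.5, 0.8, 0.5], w_ij, b_j, w_jk, b_k)
print(output)  # each hidden neuron outputs 0.5, so the result is [2.0]
```

With all input-to-hidden weights set to zero, every hidden neuron returns logsig(0) = 0.5, which makes the example easy to check by hand; a trained network would replace the placeholder weights.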


Computations take place in the hidden and output layers only. Various network architectures were examined in order to develop the optimum ANN model. ANN(i, j, k) indicates a network architecture with i, j and k neurons in the input, hidden and output layers, respectively. Figure 2 illustrates a three-layer neural network consisting of layers i, j, and k, with the interconnection weights Wij and Wjk between layers of neurons. Initial estimated weight values are progressively corrected during a training process that compares predicted outputs with known outputs and backpropagates any errors (from bottom to top in Fig. 2) to determine the weight adjustments necessary to minimize the errors.

Throughout the ANN simulations, adaptive learning rates were used to speed up training and to reduce the risk of becoming trapped in local minima. For each epoch, if the performance measure (the error) decreases toward the goal, the learning rate is multiplied by the learning-rate increment factor. If the performance measure increases, the learning rate is multiplied by the learning-rate decrement factor.25 The number of hidden layer neurons was found using a simple trial-and-error method. The MLPs were trained using the Levenberg-Marquardt technique, as this technique is more powerful than the conventional gradient descent techniques.26-28

Levenberg-Marquardt algorithm

The Levenberg-Marquardt algorithm is an approximation to Newton's method.29 If a function V(x) is to be minimized with respect to the parameter vector x, then Newton's method would be:

$$\Delta x = -\left[\nabla^2 V(x)\right]^{-1} \nabla V(x) \tag{1}$$

where ∇²V(x) is the Hessian matrix and ∇V(x) is the gradient. If V(x) reads:

$$V(x) = \sum_{i=1}^{N} e_i^2(x) \tag{2}$$

then it can be shown that:

Fig. 2–An ANN architecture used for the fracture temperature or the fracture strength prediction.

$$\nabla V(x) = J^{T}(x)\, e(x) \tag{3}$$

$$\nabla^2 V(x) = J^{T}(x)\, J(x) + S(x) \tag{4}$$

where J(x) is the Jacobian matrix and

$$S(x) = \sum_{i=1}^{N} e_i(x)\, \nabla^2 e_i(x) \tag{5}$$

For the Gauss-Newton method it is assumed that S ( x ) ≈ 0 , and Eq. (1) becomes:

$$\Delta x = -\left[J^{T}(x)\, J(x)\right]^{-1} J^{T}(x)\, e(x) \tag{6}$$

The Levenberg-Marquardt modification to the Gauss-Newton method is:

$$\Delta x = -\left[J^{T}(x)\, J(x) + \mu I\right]^{-1} J^{T}(x)\, e(x) \tag{7}$$
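The damped Gauss-Newton step of Eq. (7) can be illustrated with a short, self-contained Python sketch. It is not the implementation used in the study: the function names, the starting value of µ, the factor β = 10, the stopping rule and the synthetic exponential data are all assumptions chosen for illustration. The sketch takes the residuals as e = model − data and J as their Jacobian, so the step carries the minus sign inherited from Eq. (1). A step is accepted and µ divided by β when it reduces V(x); otherwise µ is multiplied by β and the step is recomputed.

```python
import math

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for k in range(c, n + 1):
                m[r][k] -= f * m[c][k]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (m[r][n] - sum(m[r][k] * x[k] for k in range(r + 1, n))) / m[r][r]
    return x

def levenberg_marquardt(residuals, jacobian, x, mu=0.01, beta=10.0, steps=50):
    """Minimize V(x) = sum(e_i(x)**2) using the update of Eq. (7)."""
    def sse(p):
        return sum(e * e for e in residuals(p))

    n = len(x)
    v = sse(x)
    for _ in range(steps):
        e, jac = residuals(x), jacobian(x)
        jtj = [[sum(row[r] * row[c] for row in jac) for c in range(n)]
               for r in range(n)]
        jte = [sum(row[r] * ei for row, ei in zip(jac, e)) for r in range(n)]
        while True:
            damped = [[jtj[r][c] + (mu if r == c else 0.0) for c in range(n)]
                      for r in range(n)]
            dx = solve(damped, [-g for g in jte])   # step of Eq. (7)
            trial = [xi + di for xi, di in zip(x, dx)]
            v_trial = sse(trial)
            if v_trial < v:          # step reduced V(x): accept, relax damping
                x, v, mu = trial, v_trial, mu / beta
                break
            mu *= beta               # step increased V(x): damp more, retry
            if mu > 1e12:            # no further improvement is possible
                return x
    return x

# Synthetic, noise-free data from y = 2 exp(-0.5 t) (illustrative only).
ts = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [2.0 * math.exp(-0.5 * t) for t in ts]

def residuals(p):
    a, b = p
    return [a * math.exp(b * t) - y for t, y in zip(ts, ys)]

def jacobian(p):
    a, b = p
    return [[math.exp(b * t), a * t * math.exp(b * t)] for t in ts]

fit = levenberg_marquardt(residuals, jacobian, [1.0, 0.0])
print(fit)  # expected to approach a = 2.0, b = -0.5
```

For network training, the parameter vector would hold the weights and the Jacobian rows would come from a modified back-propagation pass rather than from a closed-form expression as in this toy problem.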

The parameter µ is multiplied by some factor (β) whenever a step would result in an increased V(x). When a step reduces V(x), µ is divided by β. When the scalar µ is very large, the Levenberg-Marquardt algorithm approximates the steepest descent method. When µ is small, it is effectively the same as the Gauss-Newton method. Since the Gauss-Newton method converges faster and more accurately towards a minimum error, the goal is to shift towards the Gauss-Newton method as quickly as possible. The value of µ is therefore decreased after each step unless the error increases.30 For the neural network mapping problem, the terms in the Jacobian matrix can be computed by a simple modification to the back-propagation algorithm.26

Application and Results

The input variables of the GLM and ANN models are additive type, asphalt binder content, aging level and air void content, and the outputs are the fracture temperature and the fracture strength. In this study, 73 data points were used. For the training set, 49 data points (approximately 67%) were randomly selected, and the remaining 24 (approximately 33%) formed the test set. The root mean square error (RMSE), mean absolute relative error (MARE) and determination coefficient (R2) statistics were used as the evaluation criteria. R2 measures the degree to which two variables are linearly related. RMSE and MARE provide different types of information about the predictive capabilities of the model and are used as measures of its accuracy. They are defined as

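$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left[(x_i)_{\mathrm{observed}} - (x_i)_{\mathrm{predicted}}\right]^2} \tag{8}$$

$$\mathrm{MARE} = \frac{1}{n}\sum_{i=1}^{n}\left|\frac{(x_i)_{\mathrm{observed}} - (x_i)_{\mathrm{predicted}}}{(x_i)_{\mathrm{observed}}}\right| \cdot 100 \tag{9}$$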

in which n denotes the number of data points, and (xi)observed and (xi)predicted are the observed and predicted fracture temperature or fracture strength for the ith test. The index of agreement, d, was also determined. It was proposed by Willmott31 to overcome the insensitivity of R2 to differences in the observed and predicted means and variances.32 The range of d is similar to that of R2 and lies between 0 (no correlation) and 1 (perfect fit). The index of agreement is based on the ratio of the mean square error to the potential error33 and is defined as:

$$d = 1 - \frac{\sum_{i=1}^{n}\left[(x_i)_{\mathrm{predicted}} - (x_i)_{\mathrm{observed}}\right]^2}{\sum_{i=1}^{n}\left(\left|(x_i)_{\mathrm{predicted}} - \bar{x}_{\mathrm{observed}}\right| + \left|(x_i)_{\mathrm{observed}} - \bar{x}_{\mathrm{observed}}\right|\right)^2} \tag{10}$$

where x̄observed is the average of the observed fracture temperature or fracture strength.

A general linear model (GLM) was developed to evaluate the effect of the variables on the test results for fracture temperature and fracture strength using the 49 training results (Eq. (11)). Since the air void content was not fully controlled, it was taken as a covariate. The regression coefficients of the GLM were estimated from the training set and then used to predict the test data.

$$Y_i = \mu + \alpha_1\,\mathrm{ADD} + \alpha_2\,\mathrm{ASP} + \alpha_3\,\mathrm{AGE} + \alpha_4\,\mathrm{AIR} + \alpha_5\,\mathrm{ADD{*}ASP} + \alpha_6\,\mathrm{ADD{*}AGE} + \alpha_7\,\mathrm{ASP{*}AGE} \tag{11}$$

where Y1 is the fracture temperature, Y2 is the fracture strength, µ is a constant, αi are the regression coefficients, ADD is the additive type, ASP is the asphalt binder content, AGE is the aging level, AIR is the air void content, ADD*ASP is the interaction between ADD and ASP, ADD*AGE is the interaction between ADD and AGE, and ASP*AGE is the interaction between ASP and AGE.

The results of the statistical analysis by the GLM indicate that the fracture temperature is most sensitive to the additive type and the degree of aging. While the polymer modification improved the low temperature cracking resistance, cellulose or synthetic fibers did not improve the low temperature performance of the mixtures. As the degree of aging increases, the fracture temperature becomes warmer. The fracture strength is most sensitive to the additive type, followed by the degree of aging. The fracture strength of the polymer or fiber modified mixtures is greater than that of the unmodified mixture. The air void content and asphalt binder content have no significant effect on the fracture temperature or the fracture strength within a reasonable range about the optimum.

ANN models were also established for the modified asphalt concrete mixtures. The Levenberg-Marquardt training algorithm was used to adjust the weights, and a logarithmic sigmoid transfer function was used as the activation function for the hidden layer. The number of hidden layer neurons was chosen by a simple trial-and-error method so as to maximize R2 and minimize RMSE. If the number of nodes in the hidden layer is too small, the network may not have sufficient degrees of freedom to learn the process correctly. If the number is too high, training will take a long time and the network may sometimes overfit the data.34 Before applying the ANN to the data, the training input and output values were normalized using the equation

$$a\,\frac{x_i - x_{\min}}{x_{\max} - x_{\min}} + b \tag{12}$$

where xmin and xmax denote the minimum and maximum of the training and test data, respectively. Different values can be assigned to the scaling factors a and b. There are no fixed rules as to which standardization approach should be used in particular circumstances.35 In this study, a and b were taken as 0.6 and 0.2, respectively, which maps the data to the range 0.2 to 0.8. Different ANN structures were tried in terms of the number of iterations and the number of hidden layer neurons. ANN(4,4,1) appeared to be the best topology for both the fracture temperature and the fracture strength. The ANN(4,4,1) model comprises 4 input, 4 hidden and 1 output layer neurons. The ANN model structure used in the study is given in Fig. 2. Training of the ANN networks was stopped after 5,000 epochs.

The RMSE, R2, MARE and index of agreement d of the GLM and ANN are summarized in Table 1. The prediction error is given as RMSE, which provides a measure of the goodness of fit of the GLM and ANN: the lower the RMSE, the better the fit. As can be seen from Table 1, the ANN has the smaller RMSE (0.935°C) and MARE (3.3%), and the higher R2 (0.92) and d (0.96), for the fracture temperature. The ANN also has the smaller RMSE (0.259 MPa) and MARE (7.6%), and the higher R2 (0.81) and d (0.93), for the fracture strength. The RMSE and MARE of the ANN model were lower than those of the GLM, indicating that the ANN model predicted the fracture temperature and fracture strength more accurately.

The relationships between the observed and predicted fracture temperature for the GLM and ANN on the testing set are linear, as shown in Fig. 3. The ANN gave an R2 of 0.92, which was higher than the value of 0.88 obtained using the GLM for the fracture temperature. The GLM and ANN estimates for the testing set were also compared with the observed fracture strength in the form of scatter plots (Fig. 3). It is seen, particularly from the fit line equations and R2 values in the scatter diagrams, that the ANN estimates are closer to the corresponding observed values than those of the GLM. Writing the fit line equation as y = a0 x + a1, the a0 and a1 coefficients for the ANN are closer to 1 and 0, respectively, than those of the GLM. As shown in Fig. 3, the R2 values of the GLM and ANN models for fracture strength are 0.58 and 0.81, respectively. As indicated in Table 1, the d values are also consistent with R2. This confirms that the ANN model performs better than the GLM in estimating the fracture temperature and the fracture strength.

The observed fracture temperature and fracture strength and the predictions by the GLM and ANN are shown in Fig. 4. Figure 4 shows that the ANN estimates follow the observed curve very closely, which suggests that the ANN training was successful. In order to evaluate the relative performances of the GLM and ANN models, residual plots are also illustrated in Fig. 5.
Residual plots are also consistent with earlier findings; therefore, it can be concluded that the ANN exhibited somewhat better performance than the GLM approach.

Table 1–Values of RMSE, R2, MARE and the index of agreement d, for each model in the testing period

Models                                     RMSE     R2      MARE (%)   d
For fracture temperature (RMSE in °C):
  GLM                                      1.011    0.88    3.7        0.87
  ANN                                      0.935    0.92    3.3        0.96
For fracture strength (RMSE in MPa):
  GLM                                      0.366    0.58    10.3       0.84
  ANN                                      0.259    0.81    7.6        0.93

Fig. 3–Plot of observed and predicted (a) fracture temperature using the GLM, (b) fracture temperature using the ANN, (c) fracture strength using the GLM, and (d) fracture strength using the ANN
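The statistics reported in Table 1 follow Eqs. (8)-(10) together with R2. As a minimal sketch, they can be computed as below. The function names and the four observed and predicted fracture temperatures are hypothetical values chosen so that the results can be checked by hand; they are not data from this study. R2 is taken here as the squared Pearson correlation between observed and predicted values, which is an assumption about how it was computed.

```python
import math

def rmse(obs, pred):
    """Root mean square error, Eq. (8)."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))

def mare(obs, pred):
    """Mean absolute relative error in percent, Eq. (9)."""
    return 100.0 * sum(abs((o - p) / o) for o, p in zip(obs, pred)) / len(obs)

def r_squared(obs, pred):
    """Squared Pearson correlation between observed and predicted values."""
    n = len(obs)
    mo, mp = sum(obs) / n, sum(pred) / n
    sxy = sum((o - mo) * (p - mp) for o, p in zip(obs, pred))
    sxx = sum((o - mo) ** 2 for o in obs)
    syy = sum((p - mp) ** 2 for p in pred)
    return sxy * sxy / (sxx * syy)

def index_of_agreement(obs, pred):
    """Index of agreement d, Eq. (10)."""
    mo = sum(obs) / len(obs)
    num = sum((p - o) ** 2 for o, p in zip(obs, pred))
    den = sum((abs(p - mo) + abs(o - mo)) ** 2 for o, p in zip(obs, pred))
    return 1.0 - num / den

# Hypothetical fracture temperatures in degrees Celsius (illustration only).
observed = [-20.0, -22.0, -24.0, -26.0]
predicted = [-21.0, -22.0, -23.0, -26.0]

print(round(rmse(observed, predicted), 4))                # 0.7071
print(round(mare(observed, predicted), 4))                # 2.2917
print(round(r_squared(observed, predicted), 4))           # 0.9143
print(round(index_of_agreement(observed, predicted), 4))  # 0.9697
```

Because the fracture temperatures are negative, the absolute value in Eq. (9) is applied to the whole ratio so that MARE stays positive.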

Fig. 4–Observed and predicted (a) fracture temperature by the GLM and ANN, and (b) fracture strength by the GLM and ANN

Fig. 5–Residual plots of (a) GLM for the fracture temperature, (b) ANN for the fracture temperature, (c) GLM for the fracture strength, and (d) ANN for the fracture strength

Conclusions

This paper examined the development and predictive capabilities of two methods, the general linear model and artificial neural networks, for modeling the fracture temperature and the fracture strength of asphalt mixtures at low temperature. Based on the results obtained in this study, the following conclusions can be drawn:

The ANN gave an R2 of 0.92 for the fracture temperature, which was higher than the value of 0.88 obtained using the GLM. The R2 values of the GLM and ANN models for fracture strength are 0.58 and 0.81, respectively. It can be concluded that the ANN may provide better performance than the GLM in the estimation of the fracture temperature and the fracture strength, considering the scatter plots between estimated and measured values.

The RMSE and MARE of the ANN model were lower than those of the GLM, indicating that the ANN model predicted the fracture temperature and fracture strength more accurately. The ANN has the smaller RMSE (0.935°C) and MARE (3.3%) for fracture temperature and also the smaller RMSE (0.259 MPa) and MARE (7.6%) for fracture strength.

The three-layer feed-forward ANN technique is capable of modeling the fracture temperature and the fracture strength with improved accuracy relative to the GLM.

The developed ANN model may be used to predict the fracture temperature and the fracture strength of asphalt mixtures made with materials similar to those used in this study.

References

1 Isacsson U & Zeng H, Mater Struct, 31 (1998) 58.
2 Jung D, Selection and performance evaluation of test method to assess thermal cracking resistance of asphalt aggregate mixtures, Ph.D. Thesis, Oregon State University, Corvallis, 1994.
3 Tasdemir Y & Agar E, Indian J Eng Mater Sci, 14 (2007) 151.
4 Kliewer J E, Zeng H & Vinson T S, J Cold Reg Eng, 10 (1996) 135.
5 Jung D & Vinson T S, J Assoc Asphalt Pav Technol, 62 (1993) 54.
6 Lu X, Isacsson U & Ekblad J, Constr Build Mater, 12 (1998) 405.
7 Hesp S, Terlouw T & Vonk W, Restrained cooling tests on SBS modified asphalt concrete, paper presented at Eurobitume Workshop 99, Luxembourg, 1999.
8 Lu X & Isacsson U, Road Mater Pavement Design, 2 (2001) 29.
9 Lu X, Isacsson U & Ekblad J, Mater Struct, 36 (2003) 652.
10 Edwards Y, Tasdemir Y & Isacsson U, Fuel, 85 (2006) 989.
11 Edwards Y, Tasdemir Y & Isacsson U, Cold Reg Sci Technol, 45 (2006) 31.
12 Edwards Y, Tasdemir Y & Isacsson U, Mater Struct, 39 (2006) 725.
13 Meier R W & Tutumluer E, Uses of artificial neural networks in the mechanistic-empirical design of flexible pavements, presented at Int Workshop on Artificial Intelligence and Mathematical Methods in Pavement and Geomechanical Systems, Florida, USA, 1998.
14 Gopalakrishnan K, Kim S & Ceylan C, Can J Civil Eng, 35 (2008) 699.
15 Ozsahin T S & Oruc S, Constr Build Mater, 22 (2008) 1436.
16 Ceylan C, Gopalakrishnan K & Kim S, Int J of Pavement Eng, (in press).
17 IDOT, Control of asphaltic concrete mixtures, Matls IM 511, 1997.
18 Bell C A, Wieder A & Fellin M J, Laboratory aging of asphalt-aggregate mixtures: field validation, (National Research Council, Washington DC), 1994.
19 Tasdemir Y, Investigation of thermal behaviour of asphalt mixtures with performance tests, Ph.D. Thesis, Istanbul Technical University, Istanbul, 2003 [in Turkish].
20 SAS Institute Inc., SAS/STAT user's guide, (Cary, NC, USA), 1998.
21 El-Din A G & Smith D W, Water Res, 36 (2002) 1115.
22 Haykin S, Neural networks: A comprehensive foundation (Prentice-Hall, Englewood Cliffs), 1998.
23 Cybenko G, Math Control Signal, 2 (1989) 303.
24 Hornik K, Stinchcombe M & White H, Neural Networks, 2 (1989) 359.
25 Kisi O, J Hydrol, 329 (2006) 636.
26 Hagan M T & Menhaj M, IEEE Trans Neural Networks, 5 (1994) 989.
27 El-Bakry M Y, Chaos Soliton Fract, 18 (2003) 995.
28 Cigizoglu H K & Kisi O, Nord Hydrol, 36 (2005) 49.
29 Marquardt D, J Soc Ind Appl Math, 11 (1963) 431.
30 Kisi O, Hydrolog Sci J, 49 (2004) 1025.
31 Willmott C J, Phys Geogr, 2 (1981) 184.
32 Legates D R & McCabe G J, Water Resour Res, 35 (1999) 233.
33 Gaile G L & Willmott C J, Spatial Statistics and Models (Springer, Dordrecht), 1984.
34 Karunanithi N, Grenney W J, Whitley D & Bovee K, J Comput Civil Eng, 8 (1994) 201.
35 Dawson W C & Wilby R, Hydrolog Sci J, 43 (1998) 47.