
International Electrical Engineering Journal (IEEJ) Vol. 6 (2015) No.3, pp. 1803-1808 ISSN 2078-2365 http://www.ieejournal.com/

Support Vector Machines and Artificial Neural Networks for Identification of Residence Time Distribution Signals

H. Kasban1, H. Arafa1, S. M. S. Elaraby1, O. Zahran2, M. El-Kordy2 and F. E. Abd El-Samie2

1 Nuclear Research Center, Atomic Energy Authority, Egypt
2 Faculty of Electronic Engineering, Menofia University, Egypt.

E-mails: [email protected], [email protected]

Abstract— This paper presents a practical comparison between Support Vector Machines (SVMs) and Artificial Neural Networks (ANNs) as identifiers of Residence Time Distribution (RTD) signals. In these identifiers, cepstral features are extracted from the signal, from its power density spectrum (PDS) estimated using the eigenvector method, or from its Discrete Cosine Transform (DCT), and the extracted features then feed the identifiers. Both identifiers have been tested using the same RTD signals. The performance of these identifiers is evaluated in the presence of different types of noise. The simulation results show that the ANN-based identifier is more reliable in RTD signal identification, but it takes more time than the SVM-based identifier.

Index Terms— Support Vector Machines, Artificial Neural Networks, Residence Time Distribution

I. INTRODUCTION

The Residence Time Distribution (RTD) describes the time spent by the particles in a system. RTD measurement is used to detect possible system malfunctions such as channeling, bypassing, short-circuiting and the existence of dead volumes in many industrial systems [1-3]. It can also be used to optimize an industrial system at the design stage. RTD measurement is performed by injecting a suitable tracer into the system and monitoring the tracer concentration using radiation detectors placed at one or more locations [4-6]. The main advantages of using radiotracers in RTD measurement are physico-chemical compatibility, high detection sensitivity, on-line detection, the availability of a number of radiotracers for different phases, and their limited memory effects. The main problems with these techniques are the difficulty of identifying the obtained signals and the need for skilled experts in the identification of the output signal. Normally, the output signal is identified manually, depending heavily on the skills and experience of a trained operator. This process is time consuming and the results typically suffer from inconsistency and errors. The RTD signal may also be subject to different sorts of noise, which leads to errors in the RTD calculations and hence to wrong analysis when determining system malfunctions.

To overcome these problems, some techniques have been presented for the treatment and identification of the RTD signal. RTD signal processing has previously been carried out using the Z-transform, the Fast Fourier Transform and some digital signal processing techniques [7, 8]. Kasban et al. presented an approach for RTD signal identification based on transform domains [9, 10]. The Discrete Wavelet Transform (DWT), the DCT and the Discrete Sine Transform (DST) were tested and compared. The cepstral features are extracted from the signal or from one of its transforms, and neural networks are then used to match the extracted features of the original RTD signal in the presence of noise. The experimental results show that the highest identification rate is obtained when the features are extracted from the DCT of the RTD signal [9, 10]. Zahran et al. used the PDS, estimated with different estimation methods, instead of the transform domains of [9]. The identification results of the different estimation methods were compared in order to select the best PDS estimation method for RTD signal identification. Neural networks were used for training and testing in that approach. The experimental results showed that features extracted from the PDS of the RTD signals estimated using the eigenvector method provide the highest identification rate [11, 12].

References [9-12] used only ANNs for identification, with different features and transforms. Although they achieved a good identification rate, we are looking for a higher identification rate within a shorter time. This paper presents a comparison between SVMs and ANNs as identifiers of RTD signals. The cepstral features [9-12] are extracted from the signal, from the PDS estimated using the eigenvector method (best results in [11, 12]) or from the DCT of the signal (best results in [9, 10]), and the extracted features then feed the identifiers.

The rest of this paper is organized as follows: Section 2 presents a discussion of the use of SVMs and ANNs in the identification process. Section 3 presents the methodology. Section 4 presents the experimental results and discussions. Finally, Section 5 gives the concluding remarks.

II. SVMS VERSUS ANNS

SVMs were developed by Vapnik and colleagues at Bell Laboratories [13] and have been used by many researchers in different applications [14-17]. SVMs are a set of related supervised learning techniques that belong to the family of generalized linear identifiers. They minimize the empirical identification error and maximize the geometric margin. The support-vector network is a learning machine for two-group classification problems: the basic SVM deals with two-class problems in which the data are separated by a hyperplane defined by a number of support vectors. It operates by mapping the data of interest to a high-dimensional space and generating a separating hyperplane in that space. Two parallel hyperplanes are constructed, one on each side of the hyperplane that separates the data, and the separating hyperplane is the one that maximizes the distance between these two parallel hyperplanes. The high-dimensional separating hyperplane can be used for hypothesis testing.

Identification using SVMs is composed of two processes: building a model that simulates the identification system, and a feature matching process that evaluates the performance of the model using a test set of signals. In the modeling step, the RTD signals are stored in the system using features that are extracted during the training phase. When an unknown set of signals arrives, a feature matching technique is applied to map the features from this set to the model.

On the other hand, the history of neural networks arguably started in the late 1800s with scientific attempts to study the workings of the human brain. In 1890, William James published the first work about brain activity patterns. The first artificial neuron was produced in 1943 by the neurophysiologist Warren McCulloch and the logician Walter Pitts [18]. ANNs have been used by many researchers in different applications [19-23]. ANNs, like people, learn by example; learning in biological systems involves adjustments to the synaptic connections that exist between the neurons. Artificial neurons are simulations of biological neurons: they receive one or more inputs and sum them to produce an output. ANNs are composed of a large number of highly interconnected processing elements (neurons) working in unison to solve specific problems. Identification using ANNs is composed of two phases: training and testing. A neural network is trained by adjusting its weights using a training algorithm, which adapts the weights by attempting to minimize the sum of the squared errors between the desired and actual outputs of the output neurons, given by:

$$E = \frac{1}{2} \sum_{o=1}^{O} \left(D_o - Y_o\right)^2 \qquad (1)$$

where Do and Yo are the desired and actual outputs of the o-th output neuron, respectively, and O is the number of output neurons. Each weight in the neural network is adjusted by adding an increment chosen to reduce E as rapidly as possible. The adjustment is carried out over several training iterations, until a satisfactorily small value of E is obtained or a given number of epochs is reached. SVMs and ANNs are similar in that an SVM with a sigmoid kernel is equivalent to a two-layer feed-forward ANN, and an SVM with a Gaussian kernel is equivalent to a radial basis function network. Table (1) summarizes the advantages and disadvantages of SVMs and ANNs.
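Returning to Eq. (1), the training rule it drives can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the paper does not name its training algorithm, so the sketch assumes a single linear output layer updated by plain gradient descent, and the function names, learning rate, stopping tolerance and toy data are all hypothetical.

```python
def squared_error(desired, actual):
    """Eq. (1): E = 1/2 * sum over output neurons of (D_o - Y_o)^2."""
    return 0.5 * sum((d - y) ** 2 for d, y in zip(desired, actual))


def forward(weights, inputs):
    """Assumed linear output layer: Y_o = sum_i w[o][i] * x[i]."""
    return [sum(w * x for w, x in zip(row, inputs)) for row in weights]


def gradient_step(weights, inputs, desired, rate=0.1):
    """One gradient-descent update: w[o][i] += rate * (D_o - Y_o) * x[i]."""
    actual = forward(weights, inputs)
    return [
        [w + rate * (d - y) * x for w, x in zip(row, inputs)]
        for row, d, y in zip(weights, desired, actual)
    ]


def train(weights, inputs, desired, max_epochs=200, tol=1e-6):
    """Iterate until E is small enough or the epoch limit is reached."""
    for _ in range(max_epochs):
        if squared_error(desired, forward(weights, inputs)) < tol:
            break
        weights = gradient_step(weights, inputs, desired)
    return weights


# Toy example (hypothetical data): two inputs, two output neurons.
x = [1.0, 0.5]
d = [1.0, 0.0]
w0 = [[0.0, 0.0], [0.2, -0.1]]
w = train(w0, x, d)
e_before = squared_error(d, forward(w0, x))
e_after = squared_error(d, forward(w, x))
```

The two stopping conditions in `train` mirror the text: a sufficiently small E, or a fixed number of epochs.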

Table (1): Advantages and disadvantages of SVMs and ANNs

Advantages of SVMs:
- Fewer parameters to consider (kernel, cost).
- Work well with fewer training samples (number of support vectors).
- Prediction accuracy is generally high.
- Fast evaluation of the learned target function.
- Effective in high dimensional spaces.
- Use a subset of training points in the decision function, so they are memory efficient.
- Different kernel functions can be specified for the decision function.
- The learning result is more robust and works when training examples contain errors.
- Overfitting is not common.

Advantages of ANNs:
- An ability to learn how to do tasks based on the data given for training or initial experience.
- Can create their own organization or representation of the information they receive during the learning time.
- Computations may be carried out in parallel, and special hardware devices are being designed and manufactured to take advantage of this capability.
- Fault tolerance via redundant information: partial damage leads to a corresponding degradation of performance.

Disadvantages of SVMs:
- The problem needs to be formulated as a 2-class classification.
- Poor performance when the number of features is greater than the number of samples.
- Difficult to understand the learned function.

Disadvantages of ANNs:
- Suffer from multiple local minima.
- Computational complexity depends on the dimensionality of the input space.
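The two-class, kernel-based decision rule discussed in this section can be sketched as follows. This is an illustrative sketch and not the identifier used in the paper: the kernel parameters (`gamma`, `coef0`) are assumed values, and the support vectors, multipliers and bias belong to a hand-solved two-point toy problem rather than to a trained model.

```python
import math


def linear_kernel(u, v):
    return sum(a * b for a, b in zip(u, v))


def gaussian_kernel(u, v, gamma=0.5):
    """K(u, v) = exp(-gamma * ||u - v||^2); gamma is an assumed value."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(u, v)))


def sigmoid_kernel(u, v, gamma=0.5, coef0=0.0):
    """K(u, v) = tanh(gamma * <u, v> + coef0); parameters are assumed."""
    return math.tanh(gamma * linear_kernel(u, v) + coef0)


def decision_value(x, support_vectors, labels, alphas, bias, kernel):
    """f(x) = sum_i alpha_i * y_i * K(s_i, x) + b; the sign of f is the class."""
    return bias + sum(
        a * y * kernel(s, x)
        for s, y, a in zip(support_vectors, labels, alphas)
    )


# Hand-solved toy problem: class -1 at (0, 0), class +1 at (2, 0).
# The maximum-margin hyperplane is x0 = 1, i.e. w = (1, 0), b = -1.
svs = [(0.0, 0.0), (2.0, 0.0)]
labels = [-1, 1]
alphas = [0.5, 0.5]
bias = -1.0

# w = sum_i alpha_i * y_i * s_i; the two margin hyperplanes are 2 / ||w|| apart.
w = [sum(a * y * s[i] for s, y, a in zip(svs, labels, alphas)) for i in range(2)]
margin = 2.0 / math.sqrt(sum(c * c for c in w))

f_new = decision_value((3.0, 1.0), svs, labels, alphas, bias, linear_kernel)
predicted_class = 1 if f_new >= 0 else -1
```

Swapping `linear_kernel` for `gaussian_kernel` or `sigmoid_kernel` changes only the similarity measure. The weighted sum over support vectors has the same form as the output of a network with one hidden layer, which is the similarity noted in the text.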



III. METHODOLOGY

The method of RTD signal identification is shown in Figure (1). The method has two phases: a training phase and a testing phase. In the training phase, features are extracted from each signal, from its PDS or from its DCT, and these features are used to train an SVM or an ANN. In the testing phase, features are extracted from every incoming signal, and a feature matching and decision making step decides whether or not it is an RTD signal.

The DCT expresses a sequence of finitely many data points as a sum of cosine functions oscillating at different frequencies. The DCT of a sequence {x(n), 0 ≤ n ≤ N-1} is given by [24]:

$$X(k) = \alpha(k) \sum_{n=0}^{N-1} x(n) \cos\left(\frac{\pi (2n+1) k}{2N}\right) \qquad (2)$$

where 0 ≤ k ≤ N-1, and

$$\alpha(0) = \sqrt{\frac{1}{N}}, \qquad \alpha(k) = \sqrt{\frac{2}{N}} \quad \text{for } 1 \le k \le N-1 \qquad (3)$$
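As a minimal sketch, Eqs. (2) and (3) can be evaluated directly with NumPy. The function name and the test curve are illustrative assumptions; the curve `t * exp(-t / 8)` is only a hypothetical tracer-response shape, not data from the experiments.

```python
import numpy as np


def dct_orthonormal(x):
    """DCT of Eqs. (2)-(3): X(k) = alpha(k) * sum_n x(n) cos(pi (2n+1) k / (2N))."""
    x = np.asarray(x, dtype=float)
    N = x.size
    n = np.arange(N)
    k = n.reshape(-1, 1)
    basis = np.cos(np.pi * (2 * n + 1) * k / (2 * N))  # basis[k, n], shape (N, N)
    alpha = np.full(N, np.sqrt(2.0 / N))               # alpha(k) for k >= 1
    alpha[0] = np.sqrt(1.0 / N)                        # alpha(0)
    return alpha * (basis @ x)


# A constant sequence has energy only in the k = 0 coefficient.
flat = dct_orthonormal(np.ones(8))

# Hypothetical RTD-like curve; with this scaling the transform preserves energy.
t = np.arange(64)
rtd_like = t * np.exp(-t / 8.0)
coeffs = dct_orthonormal(rtd_like)
```

The direct matrix form costs O(N²) operations. It is used here because it maps one-to-one onto the equation; a fast transform would normally be preferred for long signals.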

The Power Density Spectrum (PDS) describes how the power of a signal is distributed over frequency. There are many methods for estimating the PDS; since the eigenvector method provided the best results in RTD identification [11, 12], it is used in this paper. This method is based on an eigen-analysis of the autocorrelation matrix of the noise-corrupted signal. The eigen-analysis partitions the eigenvectors and eigenvalues of the autocorrelation matrix of a noisy signal into two subspaces: the signal subspace, composed of the principal eigenvectors associated with the largest eigenvalues, and the noise subspace, represented by the smallest eigenvalues. This decomposition of a noisy signal into a signal subspace and a noise subspace forms the basis of the eigen-analysis methods [25, 26]. The method can be described in the following steps:
a) Estimate the autocorrelation matrix.
b) Estimate the eigenvalues αk of the autocorrelation matrix.
c) Estimate the corresponding eigenvectors vk.
d) Estimate the PDS using the following equation [12]:

$$P_{eig}(f) = \frac{1}{\sum_{k=M+1}^{p} a_k \left| \mathbf{e}^{t}(f)\, \mathbf{v}_k \right|^2} \qquad (4)$$

where ak are the weighting factors, vk (k = M+1, ..., p) are the noise subspace eigenvectors, and e(f) is the vector of complex sinusoids at frequency f.

Feature extraction means reducing the amount of data present in the RTD signal while retaining the discriminative information of the signal. In this paper, the cepstral features [9-12] are extracted from the signal, from its PDS or from one of its transforms, and the extracted features then feed the identifiers. The SVM-based and ANN-based identifiers have been tested using the same RTD signals.
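Steps a) to d) of the eigenvector method can be sketched with NumPy as below. Several choices are assumptions, not details taken from the paper: the autocorrelation matrix is estimated from overlapping length-p snapshots, the weighting factors are taken as the reciprocals of the noise eigenvalues (the usual choice in the eigenvector method), frequencies are in cycles per sample, and the noisy sinusoid is a toy input chosen only because its spectral peak is known.

```python
import numpy as np


def autocorrelation_matrix(x, p):
    """Step a: p-by-p autocorrelation estimate from length-p snapshots of x."""
    x = np.asarray(x, dtype=float)
    snapshots = np.array([x[i:i + p] for i in range(x.size - p + 1)])
    return snapshots.T @ snapshots / snapshots.shape[0]


def eigenvector_pds(x, p, M, freqs):
    """Steps b-d: P(f) = 1 / sum_{k=M+1}^{p} a_k |e(f)^H v_k|^2, a_k = 1/eigenvalue."""
    R = autocorrelation_matrix(x, p)
    eigvals, eigvecs = np.linalg.eigh(R)       # eigenvalues in ascending order
    noise_vals = eigvals[: p - M]              # p - M smallest: noise subspace
    noise_vecs = eigvecs[:, : p - M]
    n = np.arange(p)
    pds = np.empty(len(freqs))
    for i, f in enumerate(freqs):
        e = np.exp(2j * np.pi * f * n)         # complex sinusoid vector e(f)
        proj = np.abs(e.conj() @ noise_vecs) ** 2
        pds[i] = 1.0 / np.sum(proj / noise_vals)
    return pds


# Toy input (hypothetical): a real sinusoid at 0.1 cycles/sample in white noise.
# A real sinusoid spans two complex exponentials, hence M = 2.
rng = np.random.default_rng(0)
samples = np.arange(256)
x = np.sin(2 * np.pi * 0.1 * samples) + 0.1 * rng.standard_normal(samples.size)
freqs = np.linspace(0.0, 0.5, 501)
pds = eigenvector_pds(x, p=12, M=2, freqs=freqs)
peak_frequency = freqs[np.argmax(pds)]
```

The estimate peaks where e(f) is nearly orthogonal to the noise subspace, which is why the signal subspace dimension M has to be chosen before the PDS can be computed.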

Figure (1): RTD signal identification method. Training phase: RTD signal; PDS or DCT; features extraction; ANNs training and SVMs training; ANNs and SVMs features databases. Testing phase: degraded RTD signal; PDS or DCT; features extraction; ANNs testing and SVMs testing; decision.
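The features extraction block of Figure (1) can be illustrated with a short sketch. The paper does not spell out how its cepstral features are computed (it defers to [9-12]), so the code below assumes the plain real cepstrum, that is, the inverse FFT of the log magnitude spectrum. The function name, the number of coefficients and the test curve are hypothetical.

```python
import numpy as np


def real_cepstrum_features(x, n_coeffs=13, eps=1e-12):
    """First n_coeffs real-cepstrum coefficients of x (assumed feature definition)."""
    x = np.asarray(x, dtype=float)
    spectrum = np.abs(np.fft.fft(x))
    # eps guards against log(0) when a spectral bin is exactly zero.
    cepstrum = np.fft.ifft(np.log(spectrum + eps)).real
    return cepstrum[:n_coeffs]


# Hypothetical RTD-like curve, not measured data.
t = np.arange(128)
rtd_like = t * np.exp(-t / 5.0)

feats = real_cepstrum_features(rtd_like)
# Rescaling the signal shifts only the first coefficient (by log of the gain).
feats_scaled = real_cepstrum_features(3.0 * rtd_like)
```

The same function can be applied to the DCT coefficients or to the PDS samples instead of the raw signal, which corresponds to the three feature sources compared in the paper.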


IV. EXPERIMENTAL RESULTS

Twenty RTD signals are used in the training phase for both identifiers. In the testing phase, the comparison is carried out for three signal degradations: Gaussian, Rayleigh and Rician noise. In each case, the features are extracted separately from the signal, from the DCT of the signal or from the PDS of the signal estimated by the eigenvector method, and the results are compared. The performance evaluation metric in the experiments is the identification rate:

$$\text{Identification rate} = \frac{\text{Number of successful identifications}}{\text{Total number of identification trials}} \qquad (5)$$
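A small sketch of the test setup may help: degrading a signal with Gaussian noise at a target SNR, and computing the identification rate of Eq. (5) as a percentage. The function names, the sinusoidal test signal and the trial counts are assumptions made for illustration; the Rayleigh and Rician cases are not covered.

```python
import numpy as np


def add_gaussian_noise(signal, snr_db, rng):
    """Add zero-mean Gaussian noise so that the signal-to-noise ratio is snr_db."""
    signal = np.asarray(signal, dtype=float)
    signal_power = np.mean(signal ** 2)
    noise_power = signal_power / 10 ** (snr_db / 10)
    return signal + rng.normal(0.0, np.sqrt(noise_power), signal.shape)


def identification_rate(n_success, n_trials):
    """Eq. (5), expressed as a percentage."""
    return 100.0 * n_success / n_trials


# Hypothetical clean signal degraded to 10 dB SNR.
rng = np.random.default_rng(1)
clean = np.sin(2 * np.pi * np.arange(20000) / 50.0)
noisy = add_gaussian_noise(clean, 10.0, rng)
measured_snr_db = 10 * np.log10(np.mean(clean ** 2) / np.mean((noisy - clean) ** 2))

# Example: 19 correct decisions out of 20 trials (assumed counts).
rate = identification_rate(19, 20)
```

If each test used 20 trials, which is an assumption consistent with the 20 training signals, every trial would contribute 5 percentage points, matching the 5% steps seen in the reported rates.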

Table (2) shows the identification rate versus the signal to noise ratio (SNR) when the RTD signals are degraded by Gaussian noise.

Table (2): Identification rate (%) when the signal is degraded by Gaussian noise

            ANNs                      SVMs
SNR (dB)    Signal   DCT    PDS      Signal   DCT    PDS
0           65       90     90       40       70     65
5           65       90     95       50       70     70
10          70       95     95       60       75     70
15          75       95     95       65       75     80
20          75       95     100      70       80     90
25          75       100    100      75       85     95
30          80       100    100      80       95     100

The results show that the identification rate increases with increasing SNR. With PDS features, it reaches 100% at 20 dB for the ANN-based identifier, whereas the SVM-based identifier reaches 100% only at 30 dB. The ANN-based identifier gives a higher identification rate than the SVM-based identifier, the SVM-based identifier is more affected by the SNR than the ANN-based identifier, and the highest identification rate is obtained when the features are extracted from the PDS of the signal. Table (3) shows the identification rate versus the noise variance when the RTD signals are degraded by Rayleigh noise.

Table (3): Identification rate (%) when the signal is degraded by Rayleigh noise

                  ANNs                      SVMs
Noise variance    Signal   DCT    PDS      Signal   DCT    PDS
0                 65       100    100      65       95     100
5                 65       95     95       65       95     95
10                65       90     95       65       85     95
15                65       85     95       65       85     90
20                60       85     90       65       85     85
25                55       85     90       60       85     80
30                55       85     90       60       85     80
35                55       80     90       60       80     80
40                50       70     85       55       75     80
45                40       70     85       55       75     80
50                40       65     85       50       70     80

The results show that the identification rate decreases with increasing noise variance. The ANN-based identifier gives a higher identification rate than the SVM-based identifier, and the highest identification rate is obtained when the features are extracted from the PDS of the signal. Table (4) shows the identification rate versus the Rician probability density when the signal is degraded by Rician noise.

Table (4): Identification rate (%) when the signal is degraded by Rician noise

                       ANNs                      SVMs
Probability density    Signal   DCT    PDS      Signal   DCT    PDS
0                      80       100    100      75       85     100
5                      60       80     75       70       75     75
10                     45       75     75       55       75     75
15                     35       70     75       40       60     75
20                     20       65     75       25       55     75
25                     20       70     75       20       30     70
30                     20       65     75       20       30     70
35                     15       60     75       20       20     65
40                     15       55     75       20       20     65
45                     15       45     75       10       10     60
50                     10       30     75       10       10     60

The results show that the identification rate decreases with increasing noise probability density. The results show the superiority of the ANN-based identifier over the SVM-based identifier. To compare the SVM-based and ANN-based identifiers in terms of execution time, Table (5) shows the execution time (in seconds) of the identification process for the different signal degradations. This time is the CPU processing time on a Dell laptop with an Intel Core i5 CPU and 4 GB of RAM running MATLAB 7.6. It is found that the ANN-based identifier takes more time than the SVM-based identifier.

Table (5): Testing time (sec) of identification for different signal degradations

            ANNs                      SVMs
Noise       Signal   DCT    PDS      Signal   DCT    PDS
Gaussian    24       31     32       19       20     23
Rayleigh    21       33     31       11       18     30
Rician      21       35     40       21       21     23
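For readers who want to reproduce this kind of timing comparison outside MATLAB, the sketch below measures the CPU time of a single call in Python. It is only an illustration: `cpu_time_of` and `dummy_identifier` are hypothetical names, and the workload is a stand-in, not an SVM or ANN identifier.

```python
import time


def cpu_time_of(func, *args, **kwargs):
    """Return (result, CPU seconds) for one call, using process CPU time."""
    start = time.process_time()
    result = func(*args, **kwargs)
    return result, time.process_time() - start


def dummy_identifier(n):
    """Hypothetical stand-in workload; not an actual SVM or ANN identifier."""
    return sum(i * i for i in range(n))


result, seconds = cpu_time_of(dummy_identifier, 100_000)
```

Process CPU time excludes time spent sleeping or waiting, so it is closer to the "CPU processing time" quoted above than a wall-clock measurement would be.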

V. CONCLUSION

This paper presented a practical comparison between SVMs and ANNs as identifiers of RTD signals. Previous research in RTD identification used only ANNs with different features and transforms. Although it achieved a good identification rate, we are looking for a higher identification rate within a shorter time. In this paper, the cepstral features are extracted from the signal, from the PDS estimated using the eigenvector method or from the DCT of the signal, and the extracted features then feed the identifiers. The two identifiers have been tested using the same RTD signals, and their performance is evaluated in the presence of different types of noise. The simulation results show that the ANN-based identifier gives a higher identification rate than the SVM-based identifier, and that the highest identification rate is obtained when the features are extracted from the PDS of the signal. In the future, other techniques such as hidden Markov models or fuzzy algorithms can be tested to enhance the identification process.

REFERENCES

[1] International Atomic Energy Agency, “Radiotracer applications in industry: a guidebook”, Safety Reports Series 423, IAEA, Vienna, Austria, 2004.
[2] International Atomic Energy Agency, “Diagnosis of industrial reactors by radiotracers: RTD applications”, Technical Report Series 423, IAEA, Austria, 2005.
[3] International Atomic Energy Agency, “Residence time distribution method for industrial and environmental applications”, Training Course Series 31, IAEA, Vienna, Austria, 2008.
[4] H. Kasban, O. Zahran, H. Arafa, M. El-Kordy, Sayed M. S. Elaraby and F. E. Abd El-Samie, “Laboratory experiments and modeling for industrial radiotracer applications”, Applied Radiation and Isotopes, Vol. 68, pp. 1045-1054, 2010.
[5] O. Zahran, H. Kasban, Horya A. Arafa, M. El-Kordy and F. E. Abd El-Samie, “Residence time distribution measurements in phosphate production unit using radiotracer techniques”, 48th Annual British Conference on Non-Destructive Testing, Blackpool, UK, September 2009.
[6] H. Kasban, O. Zahran, Sayed M. S. Elaraby, S. EL-Rabaie, M. El-Kordy and F. E. Abd El-Samie, “Using radiotracer in industrial applications”, 27th National Radio Science Conference, Menouf, Egypt, March 2010.
[7] L. Furman, “Z-transform and adaptive signal processing in analysis of tracer data”, Canadian Journal of Chemical Engineering, Vol. 80, No. 3, pp. 472-477, 2002.
[8] P. Viitanen, “Experiences on fast Fourier transform as a deconvolution technique in determination of process equipment residence time distribution”, Applied Radiation and Isotopes, Elsevier, Vol. 48, No. 7, pp. 893-898, 1997.
[9] H. Kasban, O. Zahran, H. Arafa, M. El-Kordy, Sayed M. S. Elaraby and F. E. Abd El-Samie, “An efficient approach for residence time distribution signal processing and identification”, 7th International Conference on Informatics and Systems (INFOS 2010), Cairo, Egypt, March 2010.
[10] H. Kasban, O. Zahran and F. E. Abd El-Samie, “New trends for on-line troubleshooting in industrial problems using radioisotopes”, Journal on Electronics and Electrical Engineering, Vol. 2, No. 3, pp. 284-292, 2010.
[11] H. Kasban, O. Zahran, H. Arafa, M. El-Kordy, Sayed M. S. Elaraby and F. E. Abd El-Samie, “RTD signal identification using linear and nonlinear modified periodograms”, International Conference on Computer Engineering & Systems (ICCES'2011), IEEE, Cairo, Egypt, November 2011.
[12] O. Zahran, H. Kasban, F. E. Abd El-Samie and M. El-Kordy, “Power density spectrum for the identification of residence time distribution signals”, Applied Radiation and Isotopes, Elsevier, Vol. 70, pp. 2638–2645, 2012.
[13] C. Cortes and V. Vapnik, “Support-vector networks”, Machine Learning, Vol. 20, pp. 273-297, 1995.
[14] V. N. Vapnik, “An overview of statistical learning theory”, IEEE Transactions on Neural Networks, Vol. 10, pp. 988–1000, 1999.
[15] J. Weston and C. Watkins, “Support vector machines for multiclass recognition”, ESANN’99, Bruges, Belgium, pp. 219–224, 1999.
[16] I. Guyon and N. Cristianini, “Survey of support vector machine applications”, NIPS’99 Special Workshop on Learning with Support Vector, 1999.
[17] B. Scholkopf, “SVMs: A practical consequence of learning theory”, IEEE Intelligent Systems, Vol. 13, pp. 18–19, 1998.
[18] W. S. McCulloch and W. H. Pitts, “A logical calculus of the ideas immanent in nervous activity”, Bulletin of Mathematical Biophysics, Vol. 5, pp. 115-137, 1943.
[19] T. Boukra, A. Lebaroud and G. Clerc, “Statistical and neural-network approaches for the classification of induction machine faults using the ambiguity plane representation”, IEEE Transactions on Industrial Electronics, Vol. 60, No. 9, pp. 4034-4042, 2013.
[20] O. S. Stamenković, K. Rajković, A. V. Veličković, P. S. Milić and V. B. Veljković, “Optimization of base-catalyzed ethanolysis of sunflower oil by regression and artificial neural network models”, Fuel Processing Technology, Vol. 114, pp. 101–108, 2013.
[21] L. H. Hassan, M. Moghavvemi, H. A. F. Almurib and O. Steinmayer, “Current state of neural networks applications in power system monitoring and control”, Electrical Power and Energy Systems, Vol. 51, pp. 134-144, 2013.
[22] J. S. Torrecilla, C. Tortuero, J. Cancilla and P.-Rodríguez, “Estimation with neural networks of the water content in imidazolium-based ionic liquids using their experimental density and viscosity values”, Talanta, Vol. 113, pp. 93–98, 2013.
[23] D. Bibicu and L. Moraru, “Cardiac cycle phase estimation in 2-D echocardiographic images using an artificial neural network”, IEEE Transactions on Biomedical Engineering, Vol. 60, No. 5, pp. 1273–1279, 2013.
[24] R. Muralishankar, “Theoretical complex cepstrum of DCT and warped DCT filters”, IEEE Signal Processing Letters, Vol. 14, No. 5, pp. 367-370, 2007.
[25] S. T. Ahsan, “Optimal window design, comparing it with the Kaiser window”, IEEE Potentials, pp. 40-44, 2002.
[26] D. G. Manolakis, V. K. Ingle and S. M. Kogon, “Statistical and adaptive signal processing”, Artech House, ISBN 978-1-58053-610-3, 2005.