Face Recognition Using Robust Convolutional Neural Network


Amin Jalali, *Minho Lee
School of Electronics Engineering, Kyungpook National University

e-mail: [email protected], [email protected]

Abstract

Face recognition remains a challenging problem, especially when images contain occlusions, illumination variations, and pose changes. We propose a robust Convolutional Neural Network (CNN) with a new cost function that combines the back-propagated output error with a gradient penalty on the hidden neurons. The gradient penalty follows Hebb's learning rule multiplied by the derivative of the sigmoid function, which prevents the weights from changing drastically when small variations of the output error are fed back to the input layer. Compared with I2DKPCA and a conventional CNN, the proposed approach outperforms existing state-of-the-art methods.

I. Introduction

Face recognition is a challenging field of study because of the many sources of variation in face images, such as pose, occlusions (glasses, scarves, moustaches), variations in shape and color, illumination, and facial expressions. Convolutional Neural Networks (CNNs) were introduced in 1995 by Yann LeCun and Yoshua Bengio [1, 2] for image processing and recognition tasks. Conceptually, the CNN is inspired by biological findings on locally sensitive, orientation-selective nerve cells in the visual cortex of the cat [3]. The CNN performs feature extraction and classification with minimal preprocessing. In recent face recognition research, Khalajzadeh et al. [4] proposed a hierarchical 4-layer CNN trained with standard back-propagation. Choi et al. [5] proposed incremental two-dimensional kernel principal component analysis (I2DKPCA) for face recognition. Patel et al. [6] presented a dictionary-based method for face recognition under variable lighting and pose. However, these approaches achieve comparatively low recognition rates.

In this study, we use a deep Convolutional Neural Network to extract features and classify face images. The dataset samples include faces with varying illumination, occlusion, and pose, to support better image labeling and classification. We propose a penalty term in the cost function of the training algorithm that reduces the sensitivity of the output error to variations of the input images and thereby improves robustness. To implement this robustness in the deep structure, the derivative of the activations is added to the cost function of the error back-propagation learning algorithm as an additional term. During training, as the error propagates back to the first layer, the weights are tuned according to Hebb's principle, producing salient features that generalize better and improve classification. Moreover, to reduce training time we use a fused convolution-subsampling method together with a cross-correlation formulation in place of convolution.

The remainder of this paper is organized as follows. Section II presents the proposed method. Experimental results and comparisons with existing work are discussed in Section III. The final section concludes the paper.

II. Proposed robustness approach in the cost function of CNN
A convolutional neural network [1] is composed of several stages of feature-map arrays followed by a classifier. Each stage serves as a feature extractor, and the invariance and stability of the architecture to scaling, distortion, and shift make it a unique structure. In the proposed design, the convolution and subsampling operations are integrated into a single step [7, 8]. This simplifies the architecture and yields lower complexity, higher generalization ability, fewer trainable parameters, and faster training. Complicating factors such as pose, occlusion, and illumination variations can be regarded as noise on the training data. We therefore add an error term, a function of the Jacobian matrix, that reduces the sensitivity of the output error to the input values. Reducing this unreasonably high sensitivity improves generalization and robustness; in particular, for classifier networks, low sensitivity of the output values helps preserve correct classification even when the input values are perturbed by noise or other unknown causes. In the proposed method, the derivatives of the sigmoid activations are added to the standard output error as a penalty term with a relative significance factor, and the total error is minimized by back-propagation. Instead of the standard output error E_n^o in the cost function, the new error \tilde{E}_n is defined as

\tilde{E}_n = E_n^o + \lambda E_n^h = \frac{1}{M}\sum_{k=1}^{M}\left(t_{nk} - a_{nk}\right)^2 + \frac{\lambda}{H}\sum_{j=1}^{H} f'(h_{nj})    (1)

where E_n^o and E_n^h are the output error and the additional penalty term defined from the hidden layer for the n-th training pattern, respectively; t_{nk} and a_{nk} are the target and actual output values of the k-th output neuron for the n-th stored pattern; h_{nj} is the corresponding post-synaptic value of the j-th element of the hidden layer; and f' is the derivative of the sigmoid activation function. M and H are the numbers of output neurons and hidden neurons, respectively, and the errors are normalized by these numbers. The factor \lambda represents the relative significance of the hidden-layer error over the output error [9].

According to equation (1), the weight update has two components: the back-propagated sensitivity and the gradient of the hidden-neuron penalty term. For a first-layer weight w_{ji} connecting input x_{ni} to hidden neuron j,

\Delta w_{ji} = -\eta \frac{\partial \tilde{E}_n}{\partial w_{ji}} = -\eta \frac{\partial E_n^o}{\partial w_{ji}} - \eta \lambda \frac{\partial E_n^h}{\partial w_{ji}}    (2)

where \eta is the learning rate. Using f''(h) = f'(h)\left(1 - 2 f(h)\right) for the sigmoid, the penalty gradient is

\frac{\partial E_n^h}{\partial w_{ji}} = \frac{1}{H} f''(h_{nj})\, x_{ni} = \frac{1}{H} f'(h_{nj})\left(1 - 2 f(h_{nj})\right) x_{ni}    (3)

The additional term follows a Hebbian learning rule [10], so the new learning algorithm incorporates two popular learning rules: back-propagation and Hebbian learning. The Hebbian term is multiplied by the derivative of the activation function, which prevents the weights from increasing or decreasing drastically; the theoretical justification for this approach is given in [9]. Hebb's principle [10] describes how to alter the weights between model neurons: the weight between two neurons increases if the two neurons activate simultaneously and decreases if they activate separately. Nodes that tend to be both positive or both negative at the same time develop strong positive weights, while those that tend to be opposite develop strong negative weights. We apply the proposed update to the first layer, and by applying the chain rule through an L-layer Convolutional Neural Network, the weights are tuned across the whole structure.
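To make the cost and its two gradient components concrete, the following is a minimal NumPy sketch for a single sigmoid hidden layer. It assumes the penalty form of equations (1)-(3) above, and all function and variable names (robust_cost, penalty_gradient, W_hid, W_out) are illustrative rather than part of the paper's implementation, which was written in C++.

```python
import numpy as np

def sigmoid(h):
    return 1.0 / (1.0 + np.exp(-h))

def robust_cost(x, t, W_hid, W_out, lam):
    """E~_n = E_o + lam * E_h for one training pattern (eq. 1).

    E_o: mean squared output error over the M output neurons.
    E_h: mean sigmoid derivative f'(h) over the H hidden neurons,
         which penalizes high sensitivity of the outputs to the inputs.
    """
    h = W_hid @ x                        # post-synaptic values, shape (H,)
    a = sigmoid(h)                       # hidden activations
    y = sigmoid(W_out @ a)               # output activations, shape (M,)
    f_prime = a * (1.0 - a)              # sigmoid derivative f'(h)
    E_o = np.mean((t - y) ** 2)          # standard output error term
    E_h = np.mean(f_prime)               # hidden-neuron penalty term
    return E_o + lam * E_h

def penalty_gradient(x, W_hid):
    """dE_h/dW for the first layer (eq. 3). Using f''(h) = f'(h)(1 - 2 f(h))
    for the sigmoid, the result is a Hebbian-like correlation with the input,
    scaled by the derivative of the sigmoid, which keeps the update bounded."""
    h = W_hid @ x
    a = sigmoid(h)
    f_second = a * (1.0 - a) * (1.0 - 2.0 * a)   # f''(h)
    return np.outer(f_second, x) / h.shape[0]    # shape (H, D)

# Example: a 29x29 input flattened to D = 841, H = 4 hidden, M = 100 outputs.
rng = np.random.default_rng(0)
x, t = rng.random(841), np.zeros(100)
W_hid = 0.01 * rng.standard_normal((4, 841))
W_out = 0.01 * rng.standard_normal((100, 4))
print(robust_cost(x, t, W_hid, W_out, lam=0.5))
print(penalty_gradient(x, W_hid).shape)          # (4, 841)
```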

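The fused convolution-subsampling step mentioned above can be sketched in a few lines: a single strided valid cross-correlation replaces a convolution followed by subsampling, in the spirit of [7, 8]. The function name and the use of NumPy are illustrative assumptions, not the paper's C++ implementation.

```python
import numpy as np

def fused_conv_subsample(img, kernel, stride):
    """Strided valid cross-correlation: one pass replaces convolution
    (which would flip the kernel) followed by subsampling by `stride`."""
    H, W = img.shape
    k = kernel.shape[0]
    out_h = (H - k) // stride + 1
    out_w = (W - k) // stride + 1
    out = np.empty((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            patch = img[i*stride:i*stride + k, j*stride:j*stride + k]
            out[i, j] = np.sum(patch * kernel)   # cross-correlation: no flip
    return out

# Layer sizes match Table 1 below: 29x29 -> 13x13 -> 5x5, 5x5 kernels, stride 2.
x = np.random.rand(29, 29)
l1 = fused_conv_subsample(x, np.random.rand(5, 5), stride=2)   # (13, 13)
l2 = fused_conv_subsample(l1, np.random.rand(5, 5), stride=2)  # (5, 5)
print(l1.shape, l2.shape)
```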


III. Experimental Results

The experiments were carried out with the conventional CNN and the robust CNN, and the results were compared with I2DKPCA and dictionary-based recognition [5, 6]. The dataset contained 2600 samples in total: 100 subjects with 26 samples each. We used 2200 samples for training and 400 for testing; all samples were chosen randomly and distributed equally across the training and test phases. The experiments were implemented in C++ with Visual Studio 2010. Table 1 presents the detailed structure shared by the conventional and robust CNNs. In the tables, the abbreviations are: No. FM: number of feature maps; L: layer; CS: convolution-subsampling; FC: fully connected; FM size: feature-map size; K size: kernel size; SF size: subsampling size.

Table 1. The structure of the CNN (5-layer structure)
  No. FM (L0/L1/L2/L3/L4):        1/4/50/300/100
  Type of layer (L1/L2/L3/L4):    CS/CS/FC/FC
  FM size (L0/L1/L2/L3/L4):       29/13/5/1/1
  K size (K1/K2/K3/K4):           5/5/5/1
  SF size (S1/S2/S3/S4):          2/2/1/1
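As a quick consistency check (a sketch, not from the paper), the feature-map sizes in Table 1 follow from the output-size formula of the fused layer used in Section II:

```python
# Sanity check: with fused convolution-subsampling, each layer's output
# size is (in_size - kernel_size) // subsample + 1. Values from Table 1.
kernel_sizes = [5, 5, 5, 1]      # K1..K4
subsample = [2, 2, 1, 1]         # S1..S4
size = 29                        # input feature-map size (L0)
for k, s in zip(kernel_sizes, subsample):
    size = (size - k) // s + 1
    print(size)                  # prints 13, 5, 1, 1 -> FM sizes L1..L4
```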

Table 2 shows the recognition accuracy with respect to the value of λ.

Table 2. Accuracy with respect to the value of λ
  CNN:                               98%
  Robust CNN (λ = 0.2, 0.4, 0.6):    99.75%
  Robust CNN (λ = 0.75, 0.85, 0.9):  99.5%

Table 3 shows the accuracy of the proposed method in comparison with other state-of-the-art methods. The results indicate the superior performance of the proposed method in dealing with illumination, occlusion, and pose variations.

Table 3. The performance of different methods on the AR dataset
  Reference/Year              Method                        Accuracy
  Y. Choi et al., 2014 [5]    I2DKPCA                       82.92%
  Patel et al., 2012 [6]      Dictionary-based recognition  93.7%
  Proposed                    CNN                           98%
  Proposed                    Robust CNN                    99.75%

IV. Conclusion

In this study, a new cost function combining the back-propagated sensitivity with the gradient of a hidden-neuron penalty term is proposed for the CNN structure. The resulting error value is used to update the kernels during the back-propagation process. To validate the proposed approach, it was applied to the AR dataset, which contains images with varying illumination, poses, and occlusions. The proposed cost function extracts better features because the new learning algorithm incorporates two popular learning rules, back-propagation and Hebbian learning. The Hebbian term is multiplied by the derivative of the activation function, which prevents the weights from increasing or decreasing indefinitely when small variations of the output error are fed back to the input layer.

Acknowledgment


This research was supported by the Software Convergence Technology Development Program through the Ministry of Science, ICT and Future Planning (S1005-14-1003) and by the Industrial Strategic Technology Development Program (10044009) funded by the Ministry of Trade, Industry and Energy (MOTIE, Korea).

References
[1] Y. LeCun, K. Kavukcuoglu, and C. Farabet, "Convolutional networks and applications in vision," Proceedings of the IEEE International Symposium on Circuits and Systems (ISCAS), pp. 253-256, 2010.
[2] Y. LeCun and Y. Bengio, "Convolutional networks for images, speech, and time series," The Handbook of Brain Theory and Neural Networks, vol. 3361, p. 310, 1995.
[3] D. Bouchain, "Character recognition using convolutional neural networks," Institute for Neural Information Processing, 2006.
[4] H. Khalajzadeh, M. Mansouri, and M. Teshnehlab, "Hierarchical structure based convolutional neural network for face recognition," International Journal of Computational Intelligence and Applications, vol. 12, no. 03, 2013.
[5] Y. Choi, S. Ozawa, and M. Lee, "Incremental two-dimensional kernel principal component analysis," Neurocomputing, vol. 134, pp. 280-288, 2014.
[6] V. M. Patel, T. Wu, S. Biswas, P. J. Phillips, and R. Chellappa, "Dictionary-based face recognition under variable lighting and pose," IEEE Transactions on Information Forensics and Security, vol. 7, no. 3, pp. 954-965, 2012.
[7] P. Y. Simard, D. Steinkraus, and J. C. Platt, "Best practices for convolutional neural networks applied to visual document analysis," Proceedings of the International Conference on Document Analysis and Recognition (ICDAR), 2003.
[8] F. Mamalet and C. Garcia, "Simplifying convnets for fast learning," Artificial Neural Networks and Machine Learning (ICANN), pp. 58-65, Springer, 2012.
[9] S.-Y. Lee and D.-G. Jeong, "Merging back-propagation and Hebbian learning rules for robust classifications," Neural Networks, vol. 9, no. 7, pp. 1213-1222, 1996.
[10] D. O. Hebb, The Organization of Behavior, Wiley, New York, 1949.
