
Proceedings of the IAPR International Conference on Pattern Recognition and Information Processing – PRIP’97, vol. 1, 1997, pp.39-47

Education Aspects: Handwriting Recognition - Neural Networks - Fuzzy Logic

J. Gilewski 1), Phil Phillips 2), S. Yanushkevich 1), D. Popel 3)

1) Institute of Computer Science & Information Systems, Technical University, Zovnierska 49, Szczecin 71-210, Poland, Fax: (+4891)4876439, E-mail: [email protected], [email protected]
2) IEE C3 Committee, Netherlaw 1, North Berwick, United Kingdom, Fax: (**44) 017-1250-1290, E-mail: [email protected]
3) Belarussian State University of Informatics and Radioelectronics, P.Brovky 6, Minsk 220069, Republic of Belarus, Fax: (+375)2-310914, 495106.

Abstract - A laboratory work for the courses Pattern Recognition and Image Processing and Artificial Intelligence is introduced. The originality of the authors' approach is that (i) one of the approaches to handwriting recognition, (ii) elements of fuzzy logic, and (iii) neural networks are studied simultaneously in the work. Another aim of the paper is to review such directions of biometric technologies as signature and handwriting authorization (identification). The laboratory work has the character of research work and frequently requires non-standard decision-making from the students. The efficiency of the approach for the educational process is estimated by such factors as the interest of the students, the broadening of their outlook, their ability to apply their knowledge in combination, and more complete preparation for examinations. The proposed technology is easy to implement and allows flexible reaction to changes and additions in the training courses¹.

Index terms - handwriting recognition, fuzzy logic, feature extraction, neural network, biometric technologies, education

¹ This work was supported in part by the Technical University of Szczecin (Poland) and in part by the State University of Informatics & Radioelectronics (Belarus).

1. Introduction

It is well known that one of the main goals of every university is to use modern scientific achievements in the education process. This is not simple, because new scientific results can be mapped into the education process only through so-called methodical pre-processing and approbation. Some principles of this mapping were formulated by the Society for Research in Higher Education (UK), and we agree with them. The methodical approach is based on the following principles: orienting, motivating, presenting, clarifying, confirming, consolidating (opportunities to develop and test personal understanding), and elaborating (introducing additional materials to develop more detailed knowledge). As is the case for most computer science departments, the Institute of Computer Science & Information Systems (Technical University of Szczecin) is regularly faced with the need to revise its courses and curriculum in order to remain current in a rapidly changing field. This paper is devoted to the problem of handwriting recognition using non-classical approaches, and we have followed these principles when developing this laboratory work.

Rosenblatt's perceptron (1958) was one of the first neural networks (NN), and it was able to recognize a fixed-font character set. Zadeh proposed the fuzzy logic theory [Zadeh], which changed many of our conceptions about methods of solving the problem. It is well known today that NN and fuzzy logic are complementary tools for solving the handwriting recognition problem. When composing this task for educational purposes we aimed to reach: (i) simplicity of implementation, (ii) flexibility, i.e. the ability to quickly make changes and additions in accordance with the requirements of the courses, and (iii) integrated use of inter-disciplinary knowledge. In this paper we present an original technology for simultaneously studying the following aspects: (i) methods of handwriting recognition, (ii) elements of fuzzy logic, (iii) NN architecture, and (iv) training methods for NN.
Students study these questions in at least two university courses: Pattern Recognition and Image Processing and Artificial Intelligence. In other words, very good preparation of the students is required to complete the given work. Moreover, it has the character of research work. Our approach to this laboratory work was also influenced by an additional factor, namely the wish to give students the possibility to learn such directions of biometric technologies as signature and handwriting authentication and recognition [Shme], [Zhou]. The reason is that significant attention is paid to problems of information safety, including biometric methods, when training experts in the area of Banking Information Technologies [ShYa], [Sold]. Biometric methods include voice and speech recognition, dynamic signature and handwriting capture, eye (iris and retinal) identification, hand geometry, fingerprint identification, face recognition and keystroke dynamics [Sold]. Therefore at least two further functions must be added to those listed above, namely: (i) studying the methods of signature identification, and (ii) studying the methods of handwriting verification. Lecture notes for the courses specified above, textbooks, numerous scientific papers in this area and methodical literature were used when composing the laboratory task [Amin], [Bern], [GadMoh], [ShmYan], and others.

Our approach is explained as follows. First, we present the common structure of the laboratory work, the main components of which are pre-processing, fuzzy feature extraction, and a feedforward NN. The functions of the components for processing and recognizing handwritten characters can be changed. This means that it is possible to „follow” the lectures and take changes and additions into account without changing the structure of the work.
Then we concentrate on realising the fuzzy feature extraction algorithm. This algorithm can be modified, and students have the opportunity to explore carefully the mathematics of fuzzy logic and the features of its application to a concrete task. Further, we explain some methodical questions of how students carry out the work and share our experience.

2. The general structure of the system

The main components of the work, or stages of learning the theme, are pre-processing, fuzzy feature extraction, and a feedforward NN (Fig. 1). The handwriting model combines both functions: segmentation and classification of a target handwritten letter. The characteristic feature of this organization is that the functions of the components can be changed substantially.

Fig. 1. The structure of the handwriting recognition system. Blocks, in order: Input image; Preprocessing (image filtering, thinning, vertex search); Fuzzy feature extraction; Features coding; Neural network; Output.

Let us assume that a binary image of a character is the input data. Before we start the segmentation process we have to pre-process the image, which includes the following steps (Fig. 3):
- filtering: the aim is to reduce noise and make the extraction of structural features easier,
- thinning (or skeletonizing): this task removes outer pixels by an iterative boundary erosion process until only a skeleton of pixel chains remains,
- searching vertices: at this step we extract the line junctions and the ends of the lines, called vertices. Every vertex has a number of branches which meet at this point.

After the pre-processing we get the thinned image with a set of vertices and, for each vertex, its number of branches. We can then perform fuzzy feature extraction. The task of our feature extraction module is to process a binary image so as to obtain a vector of features. The main features are: the kind of segment, its orientation, and its size and position relative to the character frame. The features coding block builds a complex structure for the input layer of the NN classifier. The input vector contains the details of our character extracted in the previous step. The last step is to classify the character. We apply the classic three-layer feedforward NN [Fu].
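The vertex search step can be illustrated with a short sketch. The paper does not say how vertices are detected, so the code below substitutes a common technique, the crossing number: it counts the 0-to-1 transitions met while walking around the eight neighbours of a skeleton pixel. The function name `find_vertices`, the tuple layout `(Vx, Vy, Vn)` and the sample image `T_SHAPE` are assumptions made for illustration only.

```python
from typing import List, Tuple

# 8-neighbour offsets (dy, dx) in clockwise order, starting from north.
RING = [(-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1), (-1, -1)]


def find_vertices(skeleton: List[List[int]]) -> List[Tuple[int, int, int]]:
    """Return (Vx, Vy, Vn) for every line end and junction of a thinned image.

    Assumption: `skeleton` is a one-pixel-wide binary image (1 = ink).
    Vn is estimated by the crossing number, i.e. the number of 0 -> 1
    transitions around the pixel; this is not taken from the paper.
    """
    rows, cols = len(skeleton), len(skeleton[0])

    def pixel(y: int, x: int) -> int:
        # Pixels outside the image count as background.
        return skeleton[y][x] if 0 <= y < rows and 0 <= x < cols else 0

    vertices = []
    for y in range(rows):
        for x in range(cols):
            if not skeleton[y][x]:
                continue
            ring = [pixel(y + dy, x + dx) for dy, dx in RING]
            branches = sum(
                1 for k in range(8) if ring[k] == 0 and ring[(k + 1) % 8] == 1
            )
            # One branch: end of a line. Three or more: junction.
            if branches == 1 or branches >= 3:
                vertices.append((x, y, branches))
    return vertices


# Hypothetical thinned image of the letter "T".
T_SHAPE = [
    [1, 1, 1, 1, 1],
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
]
```

Counting transitions rather than raw neighbours avoids reporting the pixels next to a junction as extra junctions, which plain neighbour counting tends to do on 8-connected skeletons.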

3. Fuzzy feature extraction algorithm

The pre-processing of the image allows students to study a fairly wide class of algorithms and to estimate the influence of the obtained results on the quality of recognition. The fuzzy feature extraction algorithm is studied next. The process of feature extraction for a character consists of distinguishing the typical elements (vertices) and the branches, or segments (Fig. 2).

Fig. 2. Example of a character (labels: curve branch, straight branch, vertex; vertices 2 and 3 are marked)

Table 1. Basic segments of a character with typical elements
line: horizontal, vertical, right sloped, left sloped
curve: right, left
loop

Several different kinds of segments can be indicated, but taking the requirements into account we defined the simple segments shown in Table 1. The number of segments in Table 1 can be extended, which is necessary, in particular, for signature authentication [Shme], [ShmYan]. Every vertex has its own co-ordinates Vx and Vy and a number of branches Vn which cross at the point (Vx, Vy). The exception is a vertex which lies at the end of a segment, where Vn = 1 (see Fig. 2 for vertex 3). The aim of this algorithm is to extract the defined segments from the character image. Before we start the extraction, the pre-processing of the image should be performed. As a result we obtain a thinned image I, a vertices structure V and the number of vertices Vc. The vertices structure V contains:
a) vertex co-ordinates Vxi, Vyi,
b) the number of branches Vni,
where i is the index of the vertex, i = 0 .. Vc-1.

The exact algorithm includes the following steps.

Input data:
a) m x n binary image matrix I,
b) vertices structure V,
c) number of vertices Vc.

Output data: fuzzy data structure F of:
a) symbols of classified segments,
b) bound boxes of segments,
c) position of segments,
d) co-ordinates of the segment (start and end points),
e) sizes of segments.

Fig. 3. Main steps of pre-processing (panels: original noisy image, filtered image, thinned image, image with found vertices).

Step 1: Initialize the following variables: index of vertices i = 0, index of the branch j = 0 for the vertex Vi, m x n temporary image matrix I’ with all zero elements, segment points co-ordinates matrix P of the current segment.

Step 2: Trace the segment of the branch j, starting from the vertex co-ordinates Vxi, Vyi to the next found vertex:
(i) Check both the image matrix I and the temporary image matrix I’ for the pixel. If the pixel is set on I’, this indicates that the pixel was previously traced and it is skipped. The tracing of the segment is stopped when the tracer has no way to go.
(ii) All traced points are set on the temporary image matrix I’ and their co-ordinates are saved to the matrix P. Save the start and the end point of the segment and the bound box which contains the path to the Fuzzy Data F.
(iii) Note the number of segment pixels as k.

Step 3: To qualify the segment as a loop, the segment must keep the condition d [...] 0,5Ms FS(size) = -2/Ms*size + 1 for size [...] 0,5Ms

Fig. 8. Membership function for fuzzy size sets S, M, L

where size = k, Ms = n, k is the number of points belonging to the segment, and n is the height of the character. Go to step 7.

Step 6: Check what kind of curve corresponds to the segment.
(i) Calculate the number of intersections of the lines LL, RL with the segment.
(ii) Calculate the coefficients of the two lines LL and RL parallel to the base line BL of the segment.

Fig. 9. Specifying the kind of curve by calculating the intersections with lines parallel to the base line BL (labels: LL, RL, base line BL, intersections)

(iii) The following curves: Horizontal Left Curve (HLC), Horizontal Right Curve (HRC), Vertical Left Curve (VLC), Vertical Right Curve (VRC), Right Slope Right Curve (RRC), Right Slope Left Curve (RLC), Left Slope Right Curve (LRC), Left Slope Left Curve (LLC) are defined by the conditions:
if b∈H and IL=0 and IR>=2 then b∈HRC
if b∈H and IL>=2 and IR=0 then b∈HLC
if b∈V and IL=0 and IR>=2 then b∈VRC
if b∈V and IL>=2 and IR=0 then b∈VLC
if b∈R and IL=0 and IR>=2 then b∈RRC
if b∈R and IL>=2 and IR=0 then b∈RLC
if b∈L and IL=0 and IR>=2 then b∈LRC
if b∈L and IL>=2 and IR=0 then b∈LLC
where b is the branch, IL is the number of intersections of the left line with the segment, IR is the number of intersections of the right line with the segment, and H, V, R, L are the kinds of base line defined by the membership functions FH, FV, FL, FR.
(iv) Calculate the size of the segment. The size of the segment is defined in the same way as at step 5 (see fig. 8) but with size = sDX * sDY, Ms = m * n, where sDX, sDY are the width and the height of the bound box (see fig. 3.4), m is the width of the character, and n is the height of the character.

Step 7: If j < Vni (there is another branch to check) go to step 2; else if i < Vc (there is another vertex to start from) increase the index of vertices (i = i + 1) and go to step 2. If i >= Vc, there is no vertex left and the algorithm stops.
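The eight conditions of step 6 (iii) follow one pattern: the kind of base line gives the first letter, and the side on which the parallel line crosses the segment at least twice gives the direction of the curve. A minimal sketch of this rule table is given below. It assumes the base line has already been reduced to a single crisp label (for example, the set with the highest membership among FH, FV, FR, FL); that reduction and the function name `classify_curve` are assumptions, not part of the algorithm as published.

```python
from typing import Optional

BASE_KINDS = ("H", "V", "R", "L")  # horizontal, vertical, right slope, left slope


def classify_curve(base: str, il: int, ir: int) -> Optional[str]:
    """Apply the rules of step 6 (iii).

    base -- crisp kind of the base line (assumed to be chosen beforehand)
    il   -- IL, number of intersections of the left line LL with the segment
    ir   -- IR, number of intersections of the right line RL with the segment
    Returns a symbol such as "HRC", or None if no rule fires.
    """
    if base not in BASE_KINDS:
        raise ValueError(f"unknown base line kind: {base!r}")
    if il == 0 and ir >= 2:
        return base + "RC"  # right curve
    if il >= 2 and ir == 0:
        return base + "LC"  # left curve
    return None
```

For example, `classify_curve("V", 0, 2)` yields the symbol for a Vertical Right Curve. A segment that crosses both parallel lines, or neither, matches none of the rules and is left unclassified here.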

4. Features coding

This stage prepares the initial data for the NN. Students are requested to learn and investigate some methods of feature coding for handwritten character recognition. The choice of the feature coding method substantially influences the learning characteristics of the NN and the quality of recognition. Let us consider briefly one of the methods that we suggest to students. First of all we need to determine the maximal number of components, because the number of neurons in the input layer is fixed. All features extracted from the character are coded into the input vector. In this vector we can distinguish two parts. The first part includes the kind of segment, its orientation and its size. The second part encodes the relationships between the components. For each pair of distinct components a relative position record is defined (with information about which direction one has to go from the first to reach the second, and whether the objects touch).
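One possible layout of such an input vector is sketched below. The paper fixes only the two-part structure, so every detail here is an assumption chosen for illustration: the one-hot coding of the segment symbol, the three size memberships (S, M, L), the sign-based direction code, the rule that two segments touch when they share an end point, and the names `Segment` and `encode`.

```python
from dataclasses import dataclass
from itertools import combinations
from typing import List, Tuple

Point = Tuple[int, int]

# Assumed symbol set: 4 lines, 8 curves and the loop.
SYMBOLS = ["H", "V", "R", "L", "HLC", "HRC", "VLC", "VRC",
           "RLC", "RRC", "LLC", "LRC", "LOOP"]
PER_SEGMENT = len(SYMBOLS) + 3  # one-hot symbol + memberships in S, M, L
PER_PAIR = 3                    # sign of dx, sign of dy, touch flag


@dataclass
class Segment:
    symbol: str
    size: Tuple[float, float, float]  # fuzzy memberships in S, M, L
    start: Point
    end: Point


def sign(v: float) -> float:
    return float((v > 0) - (v < 0))


def encode(segments: List[Segment], n_components: int) -> List[float]:
    """Code up to n_components segments into a fixed-length input vector."""
    segs = segments[:n_components]
    vec: List[float] = []
    # Part 1: kind (with orientation) and size of each component, zero padded.
    for k in range(n_components):
        block = [0.0] * PER_SEGMENT
        if k < len(segs):
            block[SYMBOLS.index(segs[k].symbol)] = 1.0
            block[len(SYMBOLS):] = list(segs[k].size)
        vec.extend(block)
    # Part 2: relative position record for each pair of distinct components.
    for a, b in combinations(range(n_components), 2):
        record = [0.0] * PER_PAIR
        if b < len(segs):
            ax = (segs[a].start[0] + segs[a].end[0]) / 2
            ay = (segs[a].start[1] + segs[a].end[1]) / 2
            bx = (segs[b].start[0] + segs[b].end[0]) / 2
            by = (segs[b].start[1] + segs[b].end[1]) / 2
            touch = bool({segs[a].start, segs[a].end} & {segs[b].start, segs[b].end})
            record = [sign(bx - ax), sign(by - ay), 1.0 if touch else 0.0]
        vec.extend(record)
    return vec
```

With this layout the vector length, and therefore the number of input neurons, is `N * 16 + N * (N - 1) / 2 * 3` for `N` components, so it stays the same however many segments a particular character contains.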

5. Multilayer feed-forward NN

Fig. 10. The structure of the NN. Output layer: one neuron for each possible character. Hidden layer: the number of neurons is chosen by experiments. Input layer (with threshold): complex structure of the features; the number of neurons depends on the number of fuzzy features.

The classic three-layer feedforward NN is used in the laboratory work (Fig. 10). The activation function of every input unit is the linear function f(x) = x, and the activation function of every hidden and output unit is the sigmoid function f(x) = 1/(1 + e^(-x)). In our case, the inputs of the NN are the components of the vector which represents a collection of segments, together with information about how the segments are related. To train a NN classifier, a set of training data and a set of test data must be prepared. The number of input nodes depends on the maximal number of components. The number of hidden nodes is a variable that can be adjusted by the user, but there are no special rules for doing that. We will propose several tests to find convenient values for the number of hidden neurons. For the NN to perform the recognition algorithm, we must adjust the weight values in order to obtain the desired network performance. For this architecture, a backpropagation training algorithm is convenient. We use a standard Kohonen self-organizing feature map, which accurately represents the fuzziness of the character classes. Students can see the weight vectors of a map that was trained using 24x18 binary images of characters.
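The forward pass of such a network can be written in a few lines. The sketch below assumes that the threshold shown in Fig. 10 is handled as an extra weight applied to a constant input of 1 in each layer; the function names and the weight layout (one row per neuron, threshold weight last) are illustrative assumptions.

```python
import math
from typing import List, Sequence


def sigmoid(x: float) -> float:
    """Activation of hidden and output units: f(x) = 1 / (1 + e^(-x))."""
    return 1.0 / (1.0 + math.exp(-x))


def forward(x: Sequence[float],
            w_hidden: Sequence[Sequence[float]],
            w_output: Sequence[Sequence[float]]) -> List[float]:
    """Propagate one feature vector through a three-layer feedforward NN.

    Input units are linear (f(x) = x), so the vector is passed on unchanged.
    Each weight row holds one weight per incoming unit plus a final
    threshold weight (an assumption about how the threshold is realised).
    """
    inputs = list(x) + [1.0]
    hidden = [sigmoid(sum(w * v for w, v in zip(row, inputs))) for row in w_hidden]
    hidden.append(1.0)
    return [sigmoid(sum(w * v for w, v in zip(row, hidden))) for row in w_output]


def recognise(outputs: Sequence[float], alphabet: Sequence[str]) -> str:
    """Pick the character whose output neuron responds most strongly."""
    best = max(range(len(outputs)), key=lambda k: outputs[k])
    return alphabet[best]
```

Since there is one output neuron per character, recognition reduces to choosing the neuron with the largest response, as `recognise` does.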

6. Educational process

The system described above, presented as a laboratory work, requires students to have knowledge of methods of pattern recognition and image processing, fuzzy logic, and NN. When the system is extended and used in the course on Banking Information Technologies, students must know some basics of handwriting and signature identification.

6.1 Studying of the fuzzy feature extractor

The aim of this subject is to acquaint students with the main steps of this algorithm and to produce a set of training and test data. To start the laboratory work we need many binary images of every character that we want the classifier to learn. A student who wants to experiment with the feature extractor should get to know the working rules of this algorithm. By experimenting we can set the following properties:

vt - vertex threshold: factor of the minimal distance between vertices; minDV = ((m * n)/2 * vt)/100, where minDV is the minimal distance between vertices and m, n are the sizes of the image. If a found vertex is closer than minDV, then the vertex is skipped.
st - segment threshold: factor of the minimal length of a path to be classified as a segment to analyse; minDS = st * minDV, where minDS is the minimal length of a path. If the traced path is shorter than minDS, then the path is not analysed (it is too short).
α - angle threshold: minimal width of an angle to classify a segment as a loop (see step 3 of the algorithm, Fig. 5).
β - curve threshold: minimal average deviation of distance to classify a segment as a line (see step 4 of the algorithm, Fig. 6).
N - number of components for the NN classifier.
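The two distance thresholds are plain arithmetic, and a short calculation shows the scale of the values students work with. The image size 24 x 18 is the one mentioned for the character images in this paper; the settings vt = 2 and st = 3 are hypothetical values chosen only for the example.

```python
def min_vertex_distance(m: int, n: int, vt: float) -> float:
    """minDV = ((m * n) / 2 * vt) / 100, with m, n the sizes of the image."""
    return ((m * n) / 2 * vt) / 100


def min_segment_length(m: int, n: int, vt: float, st: float) -> float:
    """minDS = st * minDV."""
    return st * min_vertex_distance(m, n, vt)


# Example settings (vt and st are hypothetical).
M, N_ROWS, VT, ST = 24, 18, 2, 3
MIN_DV = min_vertex_distance(M, N_ROWS, VT)     # (24 * 18 / 2 * 2) / 100
MIN_DS = min_segment_length(M, N_ROWS, VT, ST)  # 3 * minDV
```

With these settings minDV is about 4.3 pixels and minDS about 13 pixels, so raising vt merges nearby vertices and, through minDS, also discards more short paths.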

6.2 Studying of NN classifier

Any software implementation of a NN (NN Professional, the Stuttgart NN Simulator, etc.) can serve as a component of the presented system. To create a NN, students must specify the number of input nodes (dependent on N), the number of nodes in the hidden layer and the number of output nodes. The number of output nodes is the number of recognized characters. The most important step in the recognition system is the learning of the NN. The starting point for the training procedure is the data which students produce in the first part. Training data consist of input-output pairs that have been generated by the fuzzy feature extraction process. In this part students can change the parameters of the created network and, depending on the simulator used, they can obtain many kinds of graphs, charts and tables.

6.3 Training tasks

We consider below two examples of training tasks for students. These examples show how the laboratory work is done and how complex it is. Task 2 consists in estimating the influence of the number of neurons in the hidden layer on the training process of the network, and then on the recognition. A family of the simplest graphs is illustrated in Fig. 11. Students are invited to explain these results of the experiments.

Task 1
Subject: Influence of the learning coefficient on the learning progress.
• Prepare learning data using the fuzzy feature extractor with chosen settings and a chosen learning set of character images
• Create a three-layer NN with the proper number of input and output nodes (depending on how many feature components are used and how many kinds of character images we recognize)
• For each chosen learning coefficient, let the NN pass a learning task and note the output error in every learning step
• Make an output error graph (put the learning step on the horizontal axis and the output error on the vertical axis)
• Draw conclusions

Task 2
Subject: Influence of the number of nodes in the hidden layer on the learning progress.
• Prepare learning data using the fuzzy feature extractor with chosen settings and a chosen learning set of character images
• Create a three-layer NN with the proper number of input and output nodes (depending on how many feature components we are using and how many kinds of character images we recognize)
• For each NN, pass a learning task and note the output error chart
• Draw conclusions
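Both tasks come down to the same experiment: train the network repeatedly, change one setting, and record the output error at every learning step. In the laboratory this is done with a NN simulator; the sketch below is only a self-contained stand-in using plain online backpropagation. The toy data set `TOY`, the function `train`, the weight initialisation range and all parameter values are assumptions made for illustration, not the settings used in the course.

```python
import math
import random
from typing import List, Sequence, Tuple

Sample = Tuple[Sequence[float], Sequence[float]]


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def train(samples: Sequence[Sample], n_hidden: int, rate: float,
          epochs: int, seed: int = 0) -> List[float]:
    """Train a three-layer NN by backpropagation; return the error per epoch."""
    rng = random.Random(seed)
    n_in, n_out = len(samples[0][0]), len(samples[0][1])
    # One row per neuron; the last weight of each row is the threshold.
    w_h = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)] for _ in range(n_hidden)]
    w_o = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)] for _ in range(n_out)]
    errors = []
    for _ in range(epochs):
        total = 0.0
        for x, target in samples:
            xin = list(x) + [1.0]
            hid = [sigmoid(sum(w * v for w, v in zip(row, xin))) for row in w_h] + [1.0]
            out = [sigmoid(sum(w * v for w, v in zip(row, hid))) for row in w_o]
            total += sum((t - o) ** 2 for t, o in zip(target, out))
            # Error terms of the output layer, then of the hidden layer.
            d_out = [(t - o) * o * (1 - o) for t, o in zip(target, out)]
            d_hid = [hid[j] * (1 - hid[j]) * sum(d_out[k] * w_o[k][j] for k in range(n_out))
                     for j in range(n_hidden)]
            for k in range(n_out):
                for j in range(n_hidden + 1):
                    w_o[k][j] += rate * d_out[k] * hid[j]
            for j in range(n_hidden):
                for i in range(n_in + 1):
                    w_h[j][i] += rate * d_hid[j] * xin[i]
        errors.append(total)
    return errors


# Toy stand-in for coded feature vectors: four classes, one-hot inputs and targets.
TOY: List[Sample] = [([1, 0, 0, 0], [1, 0, 0, 0]), ([0, 1, 0, 0], [0, 1, 0, 0]),
                     ([0, 0, 1, 0], [0, 0, 1, 0]), ([0, 0, 0, 1], [0, 0, 0, 1])]

# Task 1: vary the learning coefficient. Task 2: vary the hidden layer size.
task1 = {rate: train(TOY, n_hidden=10, rate=rate, epochs=200) for rate in (0.1, 0.5, 1.0)}
task2 = {h: train(TOY, n_hidden=h, rate=0.5, epochs=200) for h in (5, 10, 15, 20, 25)}
```

Each value in `task1` and `task2` is one error curve, ready to be plotted with the learning step on the horizontal axis and the output error on the vertical axis.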

7. Conclusion and comments

This paper is addressed to our colleagues who are involved in the training process at universities. The main goal of the paper is to exchange experience in using modern achievements in pattern recognition and image processing, NN, fuzzy logic and biometric technologies. Preparing laboratory works for learning new scientific achievements is always difficult. These difficulties have to be classified as „technological”; they have a methodological character. In the special literature on education this problem is referred to as mapping new achievements into the educational process.

That is why the subject of the paper can be interpreted as a way to map new achievements in pattern recognition and image processing into the didactic process of a university by means of the laboratory work. The initial problem considered in the paper was to create a laboratory work on one of the chapters of the course ”Pattern recognition and image processing”. As a result, it has become possible to substantially extend the functional abilities of the created system. This allowed the system to be used for other courses: ”Artificial Intelligence” and ”Banking Information Technologies”. This was achieved owing to the fact that the system is implemented as a set of modules based on typical programme packages, and allows training of all its components and also fast modification. It means that numerous methods of image preprocessing can be learned in the course ”Pattern recognition and image processing” on the basis of such real data as handwritten symbols. Its main goal, to learn handwriting recognition technology based on fuzzy logic and NN, is kept.

Fig. 11. Influence of the number (from 5 to 25) of nodes in the hidden layer on the learning progress

Fuzzy logic and NN can be applied at many levels in the complex handwriting recognition process. Fuzzy logic systems can represent high-level knowledge, and this is a very important point of our laboratory task. While preparing this paper, we became acquainted with the paper [Gader]. It stimulated us to develop the proposed system. We would like to make the work more complex in future, and to propose that students study the technology of character recognition considered in that paper. The originality of the authors’ approach is that the system allows students to get to know better one of the directions of biometric technologies: handwriting and signature identification. Biometric technologies are one of the principal components of the lectures and practical works on Banking Information Technologies.
Unfortunately, we cannot estimate directly the efficiency of this laboratory work with respect to its use in the educational process. We have only indirect estimations. For example, during the execution of the task it was easy to notice the best prepared students, and also to reveal the students who are capable of research work. Besides this, the results of the work were frequently used as examples in examinations. The system presented here is used in the didactic process of the Institute of Computer Science & Information Systems (Technical University of Szczecin, Poland). Students universally found this task interesting and useful, and they were able to complete the contents and propose different versions of the exercises. We think this experience can be used at other universities too.

References
1. Amin A., and Wilson W.A., Hand-Printed Character Recognition System Using Artificial Neural Networks, Proc. of the 2nd Int. Conf. on Pattern Recognition, 1993, pp. 943-946
2. Bernard G., Multilayer Perceptron and Uppercase Handwritten Character Recognition, Proc. of the 2nd Int. Conf. on Pattern Recognition, 1993, pp. 935-938
3. Gader P.D., Keller J.M., Krishnapuram R., Chiang J-H., Mohamed M.A., Neural and Fuzzy Methods in Handwriting Recognition, Computer, Feb., 1997, pp. 79-86
4. Gader P.D., Mohamed M.A., and Chiang J-H., Comparison of Crisp and Fuzzy Character Neural Networks in Handwritten Word Recognition, IEEE Trans. Fuzzy Systems, Aug., 1995, pp. 357-364
5. Pepe S., and Chen C.S., Fuzzy Logic for Handwritten Numeral Character Recognition, in Fuzzy Models for Pattern Recognition, ed. J.C. Bezdek, S.K. Pal, IEEE Press, 1992
6. Shmerko V., and Yanushkevich S., Script of lectures on Artificial Intelligence (in Polish)
7. Shmerko V., Experience of Designing and Using Signature and Handwriting Person Identification Systems, The British Computer Society, Computer Security Specialist Group Annual Conference, March, London, 1995
8. Shmerko V., and Yanushkevich S. (Eds.), Banking Information Systems, Technical University of Szczecin, Poland, 1997
9. Soldek J., Shmerko V., Phillips Phil, Kukharev G., Rogers W., and Yanushkevich S., Image Analysis and Pattern Recognition in Biometric Technologies, this issue
10. Zhou R.W., and Quek C., An Automatic Fuzzy Neural Network Driven Signature Verification System, The IEEE Int. Conf. on Neural Networks, Washington, USA, vol. 2, 1996, pp. 1034-1039
11. Fu LiMin, Neural Networks in Computer Intelligence, McGraw-Hill, Inc., 1994
12. Zadeh L.A., Outline of a New Approach to the Analysis of Complex Systems and Decision Processes, IEEE Trans. Syst. Man Cybern. 3(1), 1973, pp. 28-44