
RECOGNITION OF OFF-LINE HANDWRITTEN ARABIC (INDIAN) NUMERALS USING MULTI-SCALE FEATURES AND SUPPORT VECTOR MACHINES VS. HIDDEN MARKOV MODELS

Sabri A. Mahmoud and Sameh M. Awaida*
Information and Computer Science, King Fahd University of Petroleum and Minerals, Dhahran, Saudi Arabia

ABSTRACT (Arabic, translated):
This paper describes a technique for the automatic off-line recognition of handwritten Arabic (Indian) numerals using support vector machines and hidden Markov models. Local, intermediate, and large scale features are used. The values of the support vector machine parameters that give the highest recognition rates were found experimentally using an exhaustive search algorithm. In addition, the classification results of the support vector machines were compared with the corresponding classification results of the hidden Markov models.
This research uses a database of 44 writers with 48 samples per digit, a total of 21120 samples. The support vector machine and hidden Markov model classifiers were trained with 75% of the data and tested with the remaining data. Other divisions of the training and testing data were also used and yielded similar performance.
Average recognition rates of 99.83% and 99.00% were achieved using the support vector machine and hidden Markov model classifiers, respectively, and the support vector machine classifier proved to be better for all digits. A comparison at the writer level (writers 34 to 44) showed that the support vector machine classifier outperformed the hidden Markov model classifier for all tested writers. The classification errors of the support vector machine classifier were also analyzed.
The method presented in this research, using multi-scale features and a support vector machine classifier, proved its effectiveness in recognizing off-line handwritten Arabic (Indian) numerals and its superiority over the hidden Markov model classifier.
Key words: automatic recognition of Arabic (Indian) numerals, intelligent character recognition, support vector machine classifier, hidden Markov model classifier, handwritten numeral recognition, normalization, feature extraction

* Corresponding author. E-mail: [email protected]
Paper Received December 2, 2008; Paper Revised April 15, 2009; Paper Accepted May 27, 2009


ABSTRACT
This paper describes a technique for automatic recognition of off-line handwritten Arabic (Indian) numerals using Support Vector Machines (SVM) and Hidden Markov Models (HMM). Local, intermediate, and large scale features are used. The SVM parameters that produce the highest recognition rates are found experimentally using an exhaustive search algorithm. In addition, the SVM classifier results are compared to those of the HMM classifier. The present research uses a database of 44 writers with 48 samples of each digit, totaling 21120 samples. The SVM and HMM classifiers were trained with 75% of the data and tested with the remaining data. Other divisions of the data for training and testing were performed and resulted in comparable performance. The achieved average recognition rates were 99.83% and 99.00% using, respectively, the SVM and HMM classifiers. SVM recognition rates proved to be better for all digits. Comparison at the writer level (writers 34 to 44) showed that the SVM results outperformed the HMM results for all tested writers. The classification errors of the SVM classifier were analyzed. The presented technique, using the powerful set of features and the SVM classifier, proved to be effective in the recognition of writer-independent Arabic (Indian) numerals and was shown to be superior to the HMM classifier.

Key words: Arabic (Indian) automatic numeral recognition, Intelligent Character Recognition (ICR), SVM, HMM, handwritten digit recognition, normalization, feature extraction


1. INTRODUCTION
Handwritten digit recognition is widely used, for example, in office automation, check verification, a large variety of banking applications, postal address reading, and data entry applications. The calligraphic nature of the Arabic script distinguishes it from other scripts in several ways. For example, Arabic text is written from right to left and has 28 basic characters, of which 16 have from one to three dots; these dots differentiate otherwise similar characters. The shape of an Arabic character depends on its position in the word; a character may have up to four different shapes depending on whether it is isolated, connected from the right, connected from the left, or connected from both sides. Arabic (Indian) numerals, on the other hand, are not cursive. Indian numerals are used in Arabic writing, while Arabic numerals are used in Latin-based languages; the term 'Arabic numerals' is used here to refer to the Indian numerals used in Arabic writing. Although Arabic text is written from right to left, Arabic (Indian) numbers are written left to right, with the most significant digit being the left-most one and the least significant digit the right-most, as in Latin numerals. Figure 1 shows the Arabic (Indian) and Latin digits 0 to 9 (from right to left). Digit '1' is similar in Arabic and Latin. Arabic digit '5' is similar to Latin digit '0'. The Arabic and Latin forms of digit '9' are similar, with the lower stroke projecting down-right in Arabic and down-left in Latin. There are two ways to write digit '4' in Latin and two ways to write digit '3' in Arabic, as shown in Figure 2.

Figure 1. Arabic (Indian) and Latin handwritten numerals 0 to 9

Figure 2. Different methods to write digit 4 (Latin) and 3 (Arabic)

An Arabic number may consist of an arbitrary number of digits. The recognition system classifies each digit independently, preserving its relative position with respect to the other digits, in order to obtain the actual value of the number after recognition. This paper is organized as follows: related work is presented in Section 2; feature extraction is addressed in Section 3, where three types of features are used; the theory of the SVM and HMM classifiers is presented in Section 4; training, recognition, and experimental results are addressed in Section 5; and, finally, the conclusions are presented in Section 6.

2. RELATED WORK
Various methods have been proposed, and high recognition rates reported, for the recognition of English handwritten digits [1–4]. In recent years, many researchers have addressed the recognition of Arabic text, including Arabic numerals [5–10]. In addition, several researchers have reported the recognition of Persian (Farsi) handwritten digits [11–14]. However, the reported recognition rates for Arabic/Farsi digits need further improvement to be practical. Surveys on Arabic optical text recognition may be found in [15,16], a bibliography in [17], advances in Arabic text recognition in [18], an assessment of handwriting recognition technologies in [19], and the state of the art and future trends of Arabic text recognition in [20].
Al-Omari et al. presented a recognition system for online handwritten Indian numerals from one to nine. The system skeletonizes the digits, and then geometrical features are extracted; probabilistic neural networks (PNNs) are used for classification. The authors claim that the system can be extended to Arabic characters [5]. Bouslama [6] presented an algorithm based on structural techniques and fuzzy logic for extracting local features from the geometric and topological properties of online Arabic characters.


Salah et al. developed a serial model for visual digit classification based on a primitive selective attention mechanism [8]. The technique is based on parallel scanning of a down-sampled image to find interesting locations through a saliency map, and on extracting key features of those locations at high resolution. Shirali-Shahreza et al. used the shadow coding method for the recognition of Persian handwritten digits [11]. In this method, a segment mask is overlaid on the digit image, and the features are calculated by projecting the image pixels onto these segments. In [12], the Persian digit images are represented by line segments, which are used to model and recognize the digits; additional features and classifiers are needed for discriminating the digit pairs '0–5', '7–8', and '4–6'. Said et al. fed the pixels of the normalized digit image directly into a neural network for classification, where the number of hidden units of the neural network classifier is determined dynamically [13]. Sadri et al. used a feature vector of length 16 estimated from the derivatives of the horizontal and vertical profiles of the image [14].
SVMs are modern learning systems that deliver state-of-the-art performance in real-world pattern recognition and data mining applications such as text categorization, handwritten character recognition, image classification, material identification, and bioinformatics. Camastra [21] used SVM for cursive character recognition. Recently, researchers have been experimenting with SVM for recognizing Arabic/Farsi handwritten digits. Mowlaei et al. [22] used the wavelet transform with SVM to recognize Arabic/Persian digits extracted from postal addresses. Mozaffari et al. used the direction and curvature of the skeletons of Arabic/Persian zip code numerals, then performed Principal Component Analysis to normalize their lengths before presenting them to SVM [23]. Soltanzadeh used the profiles of the digit images along with SVM to recognize Arabic/Persian digits [24], whereas Mozaffari et al. used the SVM classifier to compare fractal code features with wavelet transform features [25].
HMMs were originally developed for speech recognition. The technology soon lent itself to problems where features can be serialized, such as optical text recognition [26]. This approach was used for the recognition of printed Arabic text and other languages by Makhoul et al. [27], who achieved an error rate of 3.3 percent on data from the DARPA Arabic OCR Corpus. Recently, similar research on handwritten data achieved recognition rates of up to 89% at the word level [28]. Benouareth et al. used the sliding window technique, along with semi-continuous hidden Markov models, for off-line unconstrained handwritten Arabic word recognition [29]. Al-Ohali et al. [30] studied the effect of modifying the backward variable during training and verification to improve the discriminatory power of HMM; they achieved an 11% improvement in recognition rate on Arabic sub-word recognition by implementing the proposed modification. Hence, HMM is becoming a popular classifier for Arabic handwritten text recognition. Readers are referred to [31,32] for exemplar systems.
In this paper, we present an effective technique for the recognition of off-line handwritten Indian numerals (0, 1, ..., 9) used in Arabic writing.
Gradient, Structural, and Concavity (GSC) features are used [33]. The gradient features capture the low-level distribution of gradient directions. The structural features capture middle-level geometric characteristics, including the presence of corners and lines at several directions. The concavity features capture high-level topological and geometrical properties, including the direction of bays, the presence of holes, and large vertical and horizontal strokes [34]. While a binarized version of the features is used in [35], here we use the features without binarization. In addition, we use the SVM and HMM classifiers, whereas the k-nearest neighbour algorithm was used in [35]. The SVM parameter values that produce the highest recognition rates were estimated using an exhaustive two-stage (coarse- and fine-search) parameter estimation technique. The SVM recognition results were compared with those of the HMM classifier, and the results of the two classifiers were analyzed. The recognition rates of SVM proved to be superior to those of HMM for all digits and tested writers, as detailed in Section 5.

3. FEATURE EXTRACTION
The computed features measure the image characteristics at local, intermediate, and large scales, and hence the technique takes a quasi-multi-resolution approach. The local-scale features measure the edge curvature in the neighbourhood of a pixel. The intermediate features measure short stroke types which span several pixels. The large-scale features measure certain concavities which can span the whole image [36]. Looking at the image at different levels, i.e. quasi-multi-resolution, has been shown to provide good results in Intelligent Character Recognition (ICR) systems [36]. In this paper, we evaluate the use of the GSC features with the SVM classifier, and then compare the results with HMM results using the same dataset and features. The GSC features were designed to work with binarized images, and so it is assumed that the image has been thresholded using Otsu's method [37]. Favata et al. introduced this multiple feature/resolution approach, in which the features are binarized before use [33,35,36]. They sampled the feature maps by placing a 4×4 grid on the image, and they thresholded the features into binary form. Besides implementing the algorithm, we generalized the approach to work on any n×m grid, and we used the features without binarization.


Initially, the digit image is divided into an n×m grid such that each of the n rows contains approximately the same number of foreground pixels, and likewise for each of the m columns (a short sketch of this division follows Figure 3). Figure 3 shows Arabic digits 4 and 9 divided into 3×3, 4×4, 5×5, and 6×6 grids. Each digit is divided into n rows such that each row has nearly an equal number of black pixels; the indices of these rows specify the y-axis of the digit grid. Similarly, the digit is divided into m columns with nearly an equal number of black pixels in each column; the indices of the columns specify the x-axis of the digit grid. The digit is then divided into an n×m grid using the estimated x- and y-indices.

Figure 3. Arabic digits 4 and 9 divided into 4 different divisions
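The pixel-balanced grid division described above can be sketched as follows. This is a minimal illustration rather than the authors' C/MATLAB implementation; the function name, the use of NumPy, and the boundary handling are our assumptions.

```python
import numpy as np

def pixel_balanced_grid(img, n=3, m=3):
    """Split a binary image (foreground = 1) into an n x m grid whose
    row/column boundaries balance the number of foreground pixels."""
    def boundaries(profile, parts):
        # Cumulative foreground count along one axis.
        cum = np.cumsum(profile)
        total = cum[-1]
        # Cut index where the cumulative count reaches k/parts of the total.
        cuts = [int(np.searchsorted(cum, total * k / parts)) for k in range(1, parts)]
        return [0] + cuts + [len(profile)]

    rows = boundaries(img.sum(axis=1), n)   # y-axis cuts (rows)
    cols = boundaries(img.sum(axis=0), m)   # x-axis cuts (columns)
    return [[img[rows[i]:rows[i + 1], cols[j]:cols[j + 1]]
             for j in range(m)] for i in range(n)]
```

For a 3×3 division, the nine returned sub-images are the segments for which the GSC features of this section are computed.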

After the image is divided into a grid, the gradient, structural, and concavity features are extracted for each segment of the digit. The features of the segments are then concatenated to form the feature vector of the Arabic numeral. The implemented features are based on [33,36], apart from modifications we made to better suit the SVM and HMM classifiers. The theory and implementation of these features are explained next; the reader is referred to [33–36] for more details.

3.1. Gradient Features
The gradient features are computed by convolving two 3×3 Sobel operators with the binary image. The gradient at a center pixel is computed as a function of its eight nearest neighbours. The gradient image consists of magnitude and direction. In this work, the gradient direction is used to compute the gradient feature vector. The gradient direction is split into 12 non-overlapping ranges (1°–30°, 31°–60°, ..., 331°–360°). In each sampling region, a histogram of gradient directions is computed, resulting in 12 gradient features per image segment, each corresponding to the count of one gradient direction range in the region. These counts are concatenated to give the gradient features of a digit (a short sketch of this computation follows Table 1).

3.2. Structural Features
The structural features capture the "mini-strokes" of the image. A set of 12 rules is applied to each pixel. Each rule examines a particular pattern of the neighbouring pixels for allowed gradient ranges. For example, rule S1 states that if neighbour N0 and neighbour N4 of a pixel both have a gradient in the range 61°–150°, then the rule is satisfied and its corresponding value in the feature vector is incremented by 1. The current rule set includes four types of corners and eight types of lines. The full list of rules and the neighbourhood definitions are shown in Table 1. These values form the structural feature vector. Structural features capture stroke trajectories at an intermediate scale.

Table 1. (A) Structural Feature Definitions, (B) Neighbourhood Definitions
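As a hedged illustration of the gradient features of Section 3.1 (not the authors' code; the explicit Sobel convolution loop and the exclusion of zero-gradient pixels are our assumptions), the following sketch computes the 12-bin direction histogram for one image segment:

```python
import numpy as np

def gradient_features(segment):
    """12-bin histogram of gradient directions (30 degree bins) for one image segment."""
    sx = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)  # Sobel x
    sy = sx.T                                                         # Sobel y
    img = segment.astype(float)

    gx = np.zeros_like(img)
    gy = np.zeros_like(img)
    # Convolve the two Sobel operators over the interior pixels.
    for r in range(1, img.shape[0] - 1):
        for c in range(1, img.shape[1] - 1):
            win = img[r - 1:r + 2, c - 1:c + 2]
            gx[r, c] = (win * sx).sum()
            gy[r, c] = (win * sy).sum()

    angles = np.degrees(np.arctan2(gy, gx)) % 360     # direction in [0, 360)
    mask = (gx != 0) | (gy != 0)                      # only pixels with a gradient
    hist, _ = np.histogram(angles[mask], bins=12, range=(0, 360))
    return hist                                       # 12 gradient features
```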


3.3. Concavity Features
These features consist of segment density, maximum strokes, and concavity features, a total of 8 features per image segment. Density features capture the number of foreground pixels in an image segment; they are computed by placing a sampling grid on the image and counting the number of foreground pixels that fall into each grid cell. Stroke features capture the maximum horizontal and vertical strokes in the image segment; they are computed from the run lengths of horizontal and vertical black pixels across the image, and the presence of strokes is captured by storing the maximum horizontal and vertical run length in each region. Concavity features are extracted by convolving the image with a star-like operator. This operator shoots rays in eight directions from each background pixel and determines what each ray hits: an image pixel or the edge of the image. Upward-, downward-, left-, and right-pointing concavities are detected, along with holes. For example, if the rays from a background pixel hit image pixels in most directions, the pixel is considered to lie inside a hole; this allows broken holes to be detected as holes. Five concavity features are extracted per image segment.

4. SUPPORT VECTOR MACHINE (SVM) AND HIDDEN MARKOV MODEL (HMM) CLASSIFIERS
In this section, a concise theoretical background of the two classifiers is presented; references for further details are cited.

4.1. Support Vector Machines (SVMs)
SVMs are modern learning systems that deliver state-of-the-art performance in real-world pattern recognition and data mining applications such as text categorization, handwritten character recognition, image classification, material identification, and bioinformatics. The SVM was developed by Vapnik and other researchers [38–41]. Here we briefly describe the basic theory behind SVM for pattern recognition and refer readers to [40,42] for more details.

For a two-class classification problem, assume that we have a series of input vectors $x_i \in \mathbb{R}^d$ $(i = 1, 2, \ldots, N)$ with corresponding labels $y_i \in \{+1, -1\}$ $(i = 1, 2, \ldots, N)$, where $+1$ and $-1$ indicate the two classes. SVM maps the input vectors into a high-dimensional feature space $\Phi(x) \in H$ and constructs an Optimal Separating Hyperplane (OSH), which maximizes the margin between the hyperplane and the nearest data points of each class in the space $H$. The mapping $\Phi(\cdot)$ is performed by a kernel function $K(x_i, x_j)$ which defines an inner product in the space $H$. The decision function implemented by SVM can be written as:

$$f(x) = \operatorname{sgn}\left( \sum_{i=1}^{N} y_i \alpha_i \, K(x, x_i) + b \right) \qquad (1)$$

where the coefficients $\alpha_i$ are obtained by solving the following convex Quadratic Programming (QP) problem:

Maximize
$$\sum_{i=1}^{N} \alpha_i - \frac{1}{2} \sum_{i=1}^{N} \sum_{j=1}^{N} \alpha_i \alpha_j \, y_i y_j \, K(x_i, x_j), \qquad 0 \le \alpha_i \le C \qquad (2)$$

subject to
$$\sum_{i=1}^{N} \alpha_i y_i = 0, \qquad i = 1, 2, \ldots, N. \qquad (3)$$

C is a regularization parameter which controls the tradeoff between the margin and the misclassification error. The vectors $x_i$ are called Support Vectors only if the corresponding $\alpha_i > 0$. In this work, we used the Radial Basis Function (RBF) kernel given by:

$$K(x_i, x_j) = \exp\left( -\gamma \, \lVert x_i - x_j \rVert^{d} \right) \qquad (4)$$

where $\gamma$ and $d$ are kernel parameters.
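For concreteness, a minimal sketch of training and testing an RBF-kernel SVM on such feature vectors is shown below. It uses scikit-learn rather than the authors' implementation, the data files are placeholders, and note that scikit-learn's RBF kernel fixes the exponent d of Eq. (4) at 2:

```python
import numpy as np
from sklearn.svm import SVC

# Placeholder data: rows are GSC feature vectors, labels are digit classes 0-9.
X_train = np.load("train_features.npy")   # shape (num_samples, num_features)
y_train = np.load("train_labels.npy")
X_test = np.load("test_features.npy")
y_test = np.load("test_labels.npy")

# RBF-kernel SVM; C and gamma correspond to the values found by the
# two-stage exhaustive search reported in Section 5 (C = 2^12, gamma = 2^-12.25).
clf = SVC(kernel="rbf", C=2.0 ** 12, gamma=2.0 ** -12.25)
clf.fit(X_train, y_train)

accuracy = (clf.predict(X_test) == y_test).mean()
print(f"Recognition rate: {100 * accuracy:.2f}%")
```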


4.2. Hidden Markov Model (HMM)
A hidden Markov model assumes that the sequence of observation vectors representing each digit is generated by a Markov model. A Markov model is a finite state machine that can move to a next state (which may be the current state) at each time unit; with each move, an observation vector is generated. The probability of generating the digit observation sequence O by the model λ through the state sequence Q is the product of the output probabilities and the transition probabilities:

$$P(O, Q \mid \lambda) = \pi_1 \, b_1(o_1) \, a_{12} \, b_2(o_2) \, a_{23} \, b_3(o_3) \cdots$$

where $O = o_1, o_2, o_3, \ldots$ is the sequence of digit observations; $Q$ is the state sequence; $\lambda = (A, B, \pi)$; $\pi_i$ is the initial probability of state $i$; $a_{ij}$ is the transition probability from state $i$ to state $j$; and $b_i$ is the output probability at state $i$. Both $i$ and $j$ range from 1 (the first state of the model) to T (the last state of the model). As the state sequence is unknown, the probability is computed by summing over all possible state sequences. Since this is a time-consuming step, it is approximated by the probability of the single most likely state sequence,

$$P(O \mid \lambda) = \sum_{Q} P(O, Q \mid \lambda) \approx \max_{Q} P(O, Q \mid \lambda)$$

where $Q = q_1, q_2, q_3, \ldots$ is a state sequence of the model. This quantity is usually computed through recursion, assuming the parameters $a_{ij}$ and $b_i$ are known for each model $\lambda_i$. The model parameters are estimated in the training phase using the Baum–Welch algorithm, and the state sequence that gives the highest probability is determined by the Viterbi algorithm.

This study uses a left-to-right HMM for Arabic (Indian) handwritten numeral recognition. Figure 4 shows the case of a 5-state HMM model. This model allows relatively large variations in the horizontal position of the Arabic numeral. The sequence of state transitions in the training and testing of the model is related to each digit's feature observations. Although each digit model may have a different number of states, we decided to use the same number of states for all digits, as was done in [43,44]. In order to use the HMM classifier, the general trend is to compute the feature vectors as a function of an independent variable [45]. For speech recognition, this independent variable is normally time; this simulates the use of HMM in speech recognition, where sliding frames/windows are used. The same technique is used in off-line text recognition, where the independent variable runs along the text line direction [43,44]; this enables a speech recognition HMM engine to be used for text recognition. In this paper, we use a different technique: the features of an Arabic numeral are extracted from the image as a whole. However, the same HMM classifier is used without modification. A small illustrative scoring sketch for such a left-to-right model is given after Figure 4.

Figure 4. Bakis model HMM with 6 states for digit 4
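The sketch below scores a discrete observation sequence against a small left-to-right (Bakis) HMM using the Viterbi approximation described above. This is an illustration only: the toy transition and emission tables are invented for the example, and the experiments in this paper actually used HTK rather than this code.

```python
import numpy as np

def viterbi_log_score(obs, log_pi, log_A, log_B):
    """Log-probability of the best state path for a discrete observation sequence.

    obs    : sequence of symbol indices (e.g., codebook indices of feature observations)
    log_pi : (S,)   log initial state probabilities
    log_A  : (S, S) log transition probabilities (upper-triangular for left-to-right models)
    log_B  : (S, V) log emission probabilities over the codebook symbols
    """
    delta = log_pi + log_B[:, obs[0]]
    for o in obs[1:]:
        # Best predecessor for every state, then emit the current symbol.
        delta = np.max(delta[:, None] + log_A, axis=0) + log_B[:, o]
    return delta.max()

# Toy 3-state left-to-right model over a codebook of 4 symbols (illustrative only).
pi = np.array([1.0, 0.0, 0.0])
A = np.array([[0.6, 0.4, 0.0],
              [0.0, 0.7, 0.3],
              [0.0, 0.0, 1.0]])
B = np.array([[0.7, 0.1, 0.1, 0.1],
              [0.1, 0.7, 0.1, 0.1],
              [0.1, 0.1, 0.4, 0.4]])
with np.errstate(divide="ignore"):            # log(0) -> -inf is acceptable here
    score = viterbi_log_score([0, 1, 2, 3], np.log(pi), np.log(A), np.log(B))
print(score)
```

In a digit classifier, each digit would have its own model, and the digit whose model gives the highest score for the observation sequence would be chosen.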

5. TRAINING AND RECOGNITION
The data were collected from 44 writers using semi-transparent paper placed over a tabular grid; each writer wrote 48 samples of each digit (0–9), a total of 480 digits per writer. The database consists of 21,120 samples. The written pages were scanned at a resolution of 300 pixels per inch. For each scanned page, the horizontal histogram (projection) is computed. The resulting histogram has black and white regions: the black regions represent the text lines and the white regions represent the spaces between the text lines. The locations of the black regions give the limits of the numeral lines, which are then used to extract the lines. For each line, the vertical histogram is computed; here the black regions represent the digits and the white regions represent the spaces between the digits. The locations of these black regions are used to specify the location of each digit in the line.
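The projection-profile segmentation just described can be illustrated with the following sketch (a simplified rendering of the idea, not the authors' code; the binary-image convention and the minimum-pixel threshold are assumptions):

```python
import numpy as np

def black_runs(profile, min_pixels=1):
    """Return (start, end) index pairs of consecutive 'black' regions in a projection profile."""
    is_black = profile >= min_pixels
    runs, start = [], None
    for i, black in enumerate(is_black):
        if black and start is None:
            start = i
        elif not black and start is not None:
            runs.append((start, i))
            start = None
    if start is not None:
        runs.append((start, len(is_black)))
    return runs

def segment_digits(page):
    """Extract digit sub-images from a binary page (foreground = 1)."""
    digits = []
    for top, bottom in black_runs(page.sum(axis=1)):      # horizontal projection -> lines
        line = page[top:bottom, :]
        for left, right in black_runs(line.sum(axis=0)):  # vertical projection -> digits
            digits.append(line[:, left:right])
    return digits
```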


These digits were extracted, keeping each digit in a separate image file. In this work, the digits are normalized to a height of 64 pixels while maintaining the aspect ratio of each numeral. The reader may refer to [46] for more details on the dataset, which is freely available online [47].
The RBF kernel parameters of the SVM (C and γ) are estimated using a two-stage exhaustive search algorithm. In the first stage, a coarse parameter search is performed, and the best parameters are then refined in a second, fine search (a minimal sketch of such a two-stage search is given after Figure 5). These parameters are then used in the SVM experiments with the extracted features. The results of the SVM classifier are then compared with the results of the HMM classifier. The implementation of this work was done using the C language and MATLAB. The Hidden Markov Model Toolkit (HTK) [48] was used for the HMM experiments.

5.1. SVM Experimentation
The SVM classifier was trained with 75% of the data (i.e., the first 33 writers) and tested with the remaining data (i.e., writers 34 to 44) using the estimated SVM parameters. In order to analyze the effectiveness of the features, each feature type was tested separately, and all combinations of the feature types were tested. In addition, different image divisions were tried, and features were extracted before and after smoothing; smoothing generally reduces noise and fills small stroke gaps. We used the SVM parameters estimated in the two-stage exhaustive search (i.e., C = 2^12 and γ = 2^−12.25). Figure 5 shows the results of testing using the raw images as is and after smoothing. It is clear from the figure that the recognition rates of all feature combinations improved with smoothing, except for the concavity features alone, which dropped from 99.11% to 96.1%. In a test using smoothed gradient and structural features with unsmoothed concavity features, the recognition rate unexpectedly dropped from 99.83% to 99.60%; it had been expected that using each feature type at its individually best setting would lead to the highest recognition rate when combined. This indicates that smoothing all features is better when they are used together, even though the concavity feature is then not at its maximum individual recognition rate.

Figure 5. Average recognition rate for all feature type combinations
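A minimal sketch of such a two-stage (coarse, then fine) search over C and γ is shown below. The cross-validation criterion, the exponent ranges, and the step sizes are our assumptions for illustration, not the paper's exact settings:

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

def best_params(X, y, C_exps, gamma_exps):
    """Return the (C exponent, gamma exponent) pair with the highest cross-validated accuracy."""
    scores = {}
    for c_exp in C_exps:
        for g_exp in gamma_exps:
            clf = SVC(kernel="rbf", C=2.0 ** c_exp, gamma=2.0 ** g_exp)
            scores[(c_exp, g_exp)] = cross_val_score(clf, X, y, cv=5).mean()
    return max(scores, key=scores.get)

def two_stage_search(X, y):
    # Stage 1: coarse search over wide exponent ranges.
    c_exp, g_exp = best_params(X, y, np.arange(-5, 16, 2), np.arange(-15, 4, 2))
    # Stage 2: fine search in a narrow window around the coarse optimum.
    c_exp, g_exp = best_params(X, y,
                               np.arange(c_exp - 1, c_exp + 1.25, 0.25),
                               np.arange(g_exp - 1, g_exp + 1.25, 0.25))
    return 2.0 ** c_exp, 2.0 ** g_exp
```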

Different image divisions were tested: 3×3, 4×4, 5×5, and 6×6. The recognition rate generally improved with a higher number of segments; however, the improvement does not justify the increased number of features and the extra computational cost. The number of features using 6×6 divisions is 4 times the number using 3×3 divisions, with only a small improvement in recognition rate at a high computational cost. Hence, the following experiments were conducted using 3×3 divisions, and the data were smoothed before feature extraction. All combinations of feature types were tested. Table 2 shows the recognition rates of all digits using the features separately and in combination. In general, gradient features have better recognition rates than structural and concavity features. Combined gradient and concavity features have better recognition rates than the gradient–structural and structural–concavity combinations, and the combination of gradient, structural, and concavity features resulted in the best recognition rates among all combinations.


Table 2. Recognition Rates Per Digit For All Feature Types Combinations

The confusion matrix for the combined GSC features on the smoothed images is shown in Table 3. The symbol %c represents the recognition rate percentage, and %e the percentage of incorrectly labelled digits. The average recognition rate is 99.83%. Table 3. Confusion Matrix Using GSC Features With SVM Parameters C = 2^12 and γ = 2^−12.25

Figure 6 shows the recognition rates for the tested writers (i.e., writers 34 to 44). SVM was trained with the first 33 writers and tested with the last 11 writers to evaluate writer-independent recognition. Since the data are hard to visualize when all the results are displayed, only the recognition rates that are below 100% are shown; all other rates are 100%. The figure shows that only digits 2, 3, 5, and 9 have errors, and only writers 34, 35, 38, 39, and 40 have errors.

Figure 6. Recognition rates (per digit and writer) that are below 100%; all other values are 100%


We conducted several experiments in which SVM was trained on the data of writers 12 to 44 and tested on the same data; the achieved average recognition rate was 99.98%. In another experiment, we trained SVM on writers 12 to 44 and tested it on writers 1 to 11. Table 4 shows the results of training and testing using these different training and testing data combinations. The table confirms our conclusion that the presented features and techniques are powerful in the recognition of Arabic (Indian) handwritten digits.

Table 4. Recognition Results for Different Training and Testing Sets

            Train: writers 12-44    Train: writers 12-44    Train: writers 1-33
            Test:  writers 12-44    Test:  writers 1-11     Test:  writers 34-44
Digit         %c       %e             %c       %e             %c       %e
0            99.94     0.06          99.43     0.57          100.00    0.00
1           100.00     0.00         100.00     0.00          100.00    0.00
2           100.00     0.00         100.00     0.00           99.62    0.38
3           100.00     0.00         100.00     0.00           99.24    0.76
4            99.87     0.13         100.00     0.00           99.81    0.19
5           100.00     0.00          99.81     0.19           99.81    0.19
6           100.00     0.00         100.00     0.00          100.00    0.00
7           100.00     0.00          99.81     0.19          100.00    0.00
8           100.00     0.00         100.00     0.00          100.00    0.00
9           100.00     0.00         100.00     0.00           99.81    0.19
Avg.         99.98     0.02          99.91     0.09           99.83    0.17

5.2. SVM vs. HMM Classifier
In this section, the results of the SVM are compared with those of the HMM classifier using the same features and the same training and testing data. To assess the HMM classifier, HTK was used with different numbers of features, different codebook sizes, and different numbers of states. The 324-dimensional GSC feature vector was split into 27 observations of 12 features each (sketched below). The SVM and HMM classifiers were trained with the observations of the first 33 writers and tested with the observations of the remaining writers, 34 to 44. The HMM results were produced using the HMM model of Section 4 but with different numbers of states. Choosing the number of states and the codebook size is usually done by hand [49]. One method is to use verification data to choose the optimal number of states and codebook size and then report results on an independent testing set. An alternative is to select the maximum possible number of states and codebook size for the training dataset and use that model for the testing set. The latter was used in this paper and produced very encouraging results. In the HMM experiments, the best results were achieved with 47 states and a codebook size of 1150. Figure 7 shows the recognition rates of the different feature types using SVM and HMM. It is clear from the figure that SVM has higher recognition rates for all the feature types used.
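A minimal sketch of the reshape of one digit's GSC vector into an HMM observation sequence follows (the function name is ours; only the stated 324 = 27 × 12 split comes from the paper):

```python
import numpy as np

def to_observation_sequence(gsc_vector, obs_len=12):
    """Split a flat GSC feature vector into fixed-length observation vectors for the HMM.

    A 324-dimensional vector becomes a sequence of 27 observations of 12 features each.
    """
    gsc_vector = np.asarray(gsc_vector)
    assert gsc_vector.size % obs_len == 0, "feature length must be a multiple of obs_len"
    return gsc_vector.reshape(-1, obs_len)

# Example: a dummy 324-dimensional feature vector.
obs = to_observation_sequence(np.zeros(324))
print(obs.shape)   # (27, 12)
```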


Figure 7. Recognition rate for different features and classifiers

Figure 8 shows the recognition rates of the different digits using SVM and HMM. It is clear from the figure that SVM outperforms the HMM classifier with respect to all digits, and the improvement in recognition rates is more pronounced in the cases of digits 3 and 9 (2.27% and 2.84% more, respectively). On average, the SVM recognition rate is nearly 0.83% more than HMM.

Figure 8. Recognition rates of the SVM and HMM classifiers.

Figure 9 shows the recognition rates of the SVM and HMM classifiers for writers 34 to 44. It is clear that the recognition rates using SVM are appreciably better than those of HMM, and the improvement in recognition rates is clearest for writers 34, 35, and 41 (1.88%, 2.29%, and 2.08% higher, respectively).


Figure 9. Recognition rates per writer of the SVM and HMM classifiers, using the first 33 writers for training and writers 34 to 44 for testing

Figure 10 shows all the digits that were misclassified by the SVM classifier. The figure shows the tested digit, its image, its recognized label, and its writer; each row consists of two samples. Analyzing the misclassified digits indicates that 4 out of the 9 errors are due to digit '3', which writer 34 writes in a style different from that of the other writers. This type of error, and its solution, were reported in Mahmoud [46], where it was suggested that two models be used for digit '3': one with two upper strokes and one with three upper strokes. Three errors are confusions between digits '2' and '4', one of which is bad data (i.e., sample 2 of the 4th row); the other two errors may be attributed to disproportionate segments of the digits making one digit closer to the other. In one sample, a '5' is confused with a '0', a confusion frequently reported in the literature [12,46]; this error may be attributed to the extra internal stroke that extends into the hole of the '5'. The last sample is a thinned and elongated '9' with a tail.

Figure 10. Misclassified digits of the SVM classifier

6. CONCLUSIONS
This study has presented a system for automatic writer-independent off-line handwritten Arabic (Indian) numeral recognition, based on a quasi-multi-resolution approach to feature extraction and an SVM classifier. The database had 44 writers with 48 samples of each digit, totaling 21120 samples. A two-stage exhaustive parameter estimation technique was used to estimate the best values for the SVM parameters (C and γ). The database was split into two subsets: one was used for training (75%) and the other for testing (the remaining 25%). Other experiments were executed with different training and testing samples, and comparable results were achieved.


A numeral image was divided into several sub-regions, and gradient, structural, and concavity features were extracted for each image segment. These features measure the image characteristics at local, intermediate, and large scales: the local-scale features measure edge curvature in the neighbourhood of a pixel, the intermediate features measure strokes which span several pixels, and the large-scale features measure certain concavities which can span the whole image.
Different grid sizes were tested (3×3, 4×4, 5×5, 6×6). The recognition rate using a 4×4 grid is lower than with 3×3 while using 1.77 times the number of features. The recognition rate using a 5×5 grid is higher than with 3×3 by 0.79% but uses 2.77 times the number of features, and the rate using a 6×6 grid is higher than with 3×3 by 0.83% with 4 times the number of features. A 3×3 grid was therefore adopted to reduce the number of features and the computational cost, accepting a minor recognition rate reduction of 0.83% in exchange for avoiding a fourfold increase in the number of features.
Using SVM, all combinations of features were tested to find the combination that produces the highest recognition rates, different division sizes were tested to select the most suitable one, and smoothed and unsmoothed data were compared. A writer-independent test was conducted, and it confirmed that the system is writer-independent.
When using HMM, we did not follow the general trend of using the sliding window technique to extract the features. Instead, we extracted the features of the whole digit and then split the feature vector into observations that were fed to the HMM. This technique proved effective, which means that feature extraction techniques based on the whole digit, and not only features based on the sliding window technique, may be used with HMM. The implemented windowing technique, which differs from the generally used sliding window technique, has several advantages. The sliding window technique uses a vertically sliding window of the same size for all samples, segmented vertically into a fixed number and size of segments for all digits/characters; in our case, the segment sizes and x- and y-coordinates differ for each sample, depending on the black pixel distribution, as explained above. The sliding window has horizontal overlap with the previous window (and the vertical segments, i.e. sub-windows, may also overlap); in our case there is no overlap between segments. The number of features in the sliding window technique is proportional to the sample width and the window overlap; in our case, the number of features is fixed for all samples. In the sliding window technique, the features of each vertical segment of the digit/character are presented to the HMM; in our case, we split the features of the whole digit into observation vectors. In both cases, the observation vectors are presented to the HMM one after the other. In this work, we used 9 segments to represent the digit, whereas the sliding window technique uses many more overlapping segments.
We analyzed the performance of the HMM using different codebook sizes and different numbers of states; the best results were achieved with 47 states and a codebook size of 1150. The recognition results of SVM were compared to those of the HMM classifier. The achieved average recognition rates were 99.83% and 99.00% using, respectively, the SVM and HMM classifiers.
The SVM recognition rates were better for all digits. In a second test, SVM and HMM were tested on the individual writers' numerals (writers 34 to 44), and the average recognition rates per writer were compared; the SVM results outperformed the HMM results for all tested writers. Finally, the classification errors were analyzed. It was shown that 9 digits out of 5280 were misclassified (0.17%). Four of these errors were due to writer 34 writing digit '3' in a different style (with two upper strokes) while the model was based on three upper strokes; this can be addressed by having two models for digit '3'. The other errors may be attributed to bad data or deformed digit strokes. In general, the implemented features do not suffer from the well-known digit confusion problem (viz. 4 with 6, 5 with 0, 7 with 8, 9 with 6, etc.). The presented technique, using robust features and the SVM classifier, proved to be effective in the recognition of writer-independent Arabic (Indian) numerals; it was shown to be superior to the HMM classifier for all digits and tested writers, and also superior to published work using the same data. The researchers are currently exploring the extension of the technique to Arabic text recognition, and the use of multiple classifiers is also under investigation.

ACKNOWLEDGMENT
The authors would like to thank the referees for their constructive criticism and stimulating remarks; addressing those remarks improved the revised manuscript considerably. The authors would also like to thank King Fahd University of Petroleum and Minerals for supporting this research and providing the computing facilities. This work was partially supported by KFUPM internal project number IN060337.


REFERENCES

[1] P. Berkes, "Handwritten Digit Recognition with Nonlinear Fisher Discriminant Analysis", Artificial Neural Networks: Formal Models and Their Applications – ICANN 2005, 2005, pp. 285–287.
[2] C.-L. Liu, K. Nakashima, H. Sako, and H. Fujisawa, "Handwritten Digit Recognition: Investigation of Normalization and Feature Extraction Techniques", Pattern Recognition, 37(2004), pp. 265–279.
[3] E. Kussul and T. Baidyk, "Improved Method of Handwritten Digit Recognition Tested on MNIST Database", Image and Vision Computing, 22(2004), pp. 971–981.
[4] Qing Tang, "Two-Dimensional Penalized Signal Regression for Hand Written Digit Recognition", Master's thesis, Louisiana State University, 2006.
[5] F. A. Al-Omari and O. Al-Jarrah, "Handwritten Indian Numerals Recognition System Using Probabilistic Neural Networks", Advanced Engineering Informatics, 18(2004), pp. 9–16.
[6] F. Bouslama, "Structural and Fuzzy Techniques in the Recognition of Online Arabic Characters", International Journal of Pattern Recognition and Artificial Intelligence, 13(1999), pp. 1027–1040.
[7] S. Salourn, "Arabic Hand-Written Text Recognition", ACS/IEEE International Conference on Computer Systems and Applications, 2001, pp. 106–109.
[8] A. Salah, E. Alpaydin, and L. Akarun, "A Selective Attention-Based Method for Visual Pattern Recognition With Application to Handwritten Digit Recognition and Face Recognition", IEEE Transactions on Pattern Analysis and Machine Intelligence, 24(2002), pp. 420–425.
[9] S. Alma'adeed, C. Higgins, and D. Elliman, "Off-Line Recognition of Handwritten Arabic Words Using Multiple Hidden Markov Models", Knowledge-Based Systems, 17(2004), pp. 75–79.
[10] S. Touj, N. Amara, and H. Amiri, "Arabic Handwritten Words Recognition Based on a Planar Hidden Markov Model", International Arab Journal of Information Technology, 2(2005), pp. 318–325.
[11] M. Shirali-Shahreza, K. Faez, and A. Khotanzad, "Recognition of Handwritten Persian/Arabic Numerals by Shadow Coding and an Edited Probabilistic Neural Network", 3(1995), pp. 436–439.
[12] H. Hosseini and A. Bouzerdoum, "A Combined Method for Persian and Arabic Handwritten Digit Recognition", Australian and New Zealand Conference on Intelligent Information Systems, 1996, pp. 80–83.
[13] F. Said, A. Yacoub, and C. Suen, "Recognition of English and Arabic Numerals Using a Dynamic Number of Hidden Neurons", Proceedings of the Fifth International Conference on Document Analysis and Recognition, ICDAR '99, 1999, pp. 237–240.
[14] J. Sadri, C. Y. Suen, and T. D. Bui, "Application of Support Vector Machines for Recognition of Handwritten Arabic/Persian Digits", Proceedings of the Second Conference on Machine Vision and Image Processing & Applications (MVIP2003), Tehran, Iran, 2003, pp. 300–307.
[15] B. Al-Badr and S. A. Mahmoud, "Survey and Bibliography of Arabic Optical Text Recognition", Signal Processing, 41(1995), pp. 49–77.
[16] M. S. Khorsheed, "Off-Line Arabic Character Recognition – A Review", Pattern Analysis & Applications, 5(2002), pp. 31–45.
[17] A. Nabawi and S. A. Mahmoud, "Arabic Optical Text Recognition: A Classified Bibliography", Engineering Research Bulletin, Minufiyah University, Egypt, 23(2000), pp. 49–77.
[18] J. Trenkle, A. Gillies, E. Erl, S. Schlosser, and S. Cavin, "Advances in Arabic Text Recognition", Proceedings of the Symposium on Document Image Understanding Technology, 2001, pp. 158–168.
[19] S. N. Srihari and G. R. Ball, "An Assessment of Arabic Handwriting Recognition Technology", CEDAR Technical Report TR-03-07, 2007, pp. 1–38.
[20] H. El Abed and V. Margner, "Arabic Text Recognition Systems – State of the Art and Future Trends", 5th International Conference on Innovations in Information Technology (Innovations'08), Al Ain, UAE, 2008.
[21] F. Camastra, "A SVM-Based Cursive Character Recognizer", Pattern Recognition, 40(2007), pp. 3721–3727.
[22] A. Mowlaei and K. Faez, "Recognition of Isolated Handwritten Persian/Arabic Characters and Numerals Using Support Vector Machines", Neural Networks for Signal Processing (NNSP'03), 2003 IEEE 13th Workshop, 2003, pp. 547–554.
[23] S. Mozaffari, K. Faez, and M. Ziaratban, "Structural Decomposition and Statistical Description of Farsi/Arabic Handwritten Numeric Characters", Proceedings of the Eighth International Conference on Document Analysis and Recognition, 1(2005), pp. 237–241.
[24] H. Soltanzadeh and M. Rahmati, "Recognition of Persian Handwritten Digits Using Image Profiles of Multiple Orientations", Pattern Recognition Letters, 25(2004), pp. 1569–1576.
[25] S. Mozaffari, K. Faez, and H. Kanan, "Feature Comparison Between Fractal Codes and Wavelet Transform in Handwritten Alphanumeric Recognition Using SVM Classifier", Proceedings of the 17th International Conference on Pattern Recognition (ICPR 2004), 2(2004), pp. 331–334.
[26] A. Elms, S. Procter, and J. Illingworth, "The Advantage of Using an HMM-Based Approach for Faxed Word Recognition", International Journal on Document Analysis and Recognition, 1(1998), pp. 18–36.
[27] J. Makhoul, R. Schwartz, C. Lapre, and I. Bazzi, "A Script-Independent Methodology for Optical Character Recognition", Pattern Recognition, 31(1998), pp. 1285–1294.
[28] R. Al-Hajj, C. Mokbel, and L. Likforman-Sulem, "Combination of HMM-Based Classifiers for the Recognition of Arabic Handwritten Words", Ninth International Conference on Document Analysis and Recognition (ICDAR'07), 2007, pp. 959–963.
[29] A. Benouareth, A. Ennaji, and M. Sellami, "Semi-Continuous HMMs with Explicit State Duration for Unconstrained Arabic Word Modeling and Recognition", Pattern Recognition Letters, 29(2008), pp. 1742–1752.
[30] Y. Al-Ohali, M. Cheriet, and C. Suen, "Introducing Termination Probabilities to HMM", Proceedings of the 16th International Conference on Pattern Recognition, 3(2002), pp. 319–322.
[31] M. Pechwitz and V. Maergner, "HMM Based Approach for Handwritten Arabic Word Recognition Using the IFN/ENIT Database", Proceedings of the Seventh International Conference on Document Analysis and Recognition, Volume 2, IEEE Computer Society, 2003, p. 890.
[32] R. El-Hajj, L. Likforman-Sulem, and C. Mokbel, "Arabic Handwriting Recognition Using Baseline Dependant Features and Hidden Markov Modeling", Proceedings of the Eighth International Conference on Document Analysis and Recognition, IEEE Computer Society, 2005, pp. 893–897.
[33] J. Favata, G. Srikantan, and S. Srihari, "Handprinted Character/Digit Recognition Using a Multiple Feature/Resolution Philosophy", Proceedings of the International Workshop on Frontiers of Handwriting Recognition (IWFHR IV), 1994, pp. 57–66.
[34] B. Zhang, "Handwriting Pattern Matching and Retrieval With Binary Features", Ph.D. thesis, State University of New York at Buffalo, 2003.
[35] J. T. Favata, "Offline General Handwritten Word Recognition Using an Approximate BEAM Matching Algorithm", IEEE Transactions on Pattern Analysis and Machine Intelligence, 23(2001), pp. 1009–1021.
[36] J. Favata and G. Srikantan, "A Multiple Feature/Resolution Approach to Handprinted Digit and Character Recognition", International Journal of Imaging Systems and Technology, 7(1996), pp. 304–311.
[37] N. Otsu, "A Threshold Selection Method from Gray-Level Histograms", IEEE Transactions on Systems, Man and Cybernetics, 9(1979), pp. 62–66.
[38] V. Vapnik and A. Chervonenkis, "A Note on One Class of Perceptrons", Automation and Remote Control, 1964.
[39] V. Vapnik, Estimation of Dependences Based on Empirical Data, Springer Series in Statistics, Springer-Verlag New York, Inc., 1982.
[40] V. N. Vapnik, The Nature of Statistical Learning Theory, Springer, 1999.
[41] I. Guyon, B. E. Boser, and V. Vapnik, "Automatic Capacity Tuning of Very Large VC-Dimension Classifiers", Advances in Neural Information Processing Systems 5 (NIPS Conference), Morgan Kaufmann Publishers Inc., 1993, pp. 147–155.
[42] V. N. Vapnik, Statistical Learning Theory, Wiley-Interscience, 1998.
[43] I. Bazzi, C. LaPre, J. Makhoul, C. Raphael, and R. Schwartz, "Omnifont and Unlimited-Vocabulary OCR for English and Arabic", Proceedings of the Fourth International Conference on Document Analysis and Recognition, 2(1997), pp. 842–846.
[44] I. Bazzi, R. Schwartz, and J. Makhoul, "An Omnifont Open-Vocabulary OCR System for English and Arabic", IEEE Transactions on Pattern Analysis and Machine Intelligence, 21(1999), pp. 495–504.
[45] Zhidong Lu, R. Schwartz, and C. Raphael, "Script-Independent, HMM-Based Text Line Finding for OCR", Proceedings of the 15th International Conference on Pattern Recognition, 4(2000), pp. 551–554.
[46] S. Mahmoud, "Recognition of Writer-Independent Off-Line Handwritten Arabic (Indian) Numerals Using Hidden Markov Models", Signal Processing, 88(2008), pp. 844–857.
[47] S. Mahmoud, "Arabic Computing [Online]", http://faculty.kfupm.edu.sa/ICS/smasaad/Arabic%20Computing.htm, 2008.
[48] S. J. Young, G. Evermann, M. J. Gales, D. Kershaw, G. Moore, J. J. Odell, D. G. Ollason, D. Povey, V. Valtchev, and P. C. Woodland, The HTK Book, Version 3.4, Cambridge University Engineering Department, 2006.
[49] S. Gunter and H. Bunke, "Optimizing the Number of States, Training Iterations and Gaussians in an HMM-Based Handwritten Word Recognizer", Proceedings of the Seventh International Conference on Document Analysis and Recognition, 1(2003), pp. 472–476.