Initial Results in Offline Arabic Handwriting Recognition Using Large-Scale Geometric Features

Ilya Zavorin, Eugene Borovikov, Mark Turner

System Overview

- Based on large-scale features: robust to handwriting variations, both inter-writer and intra-writer
- Simulates the writing process: robust to overlaps
- Trainable: uses discrete Hidden Markov Models (HMMs)
- Uses high-level information:
  - Permits "filling in" gaps in recognition that are unavoidable in degraded handwritten text
  - Resolves ambiguity

- Based on the work of Khorsheed, with various modifications

Optical Handwriting Recognition Technologies Technical Exchange Meeting

April 7, 2005




Processing Flowchart

- Training path: Char/Word Image → Image Preprocessor → Ordered Large-Scale Features → Model Training
- Recognition path: Word Image → Image Preprocessor → Ordered Large-Scale Features → Recognition (using Lexicon/Word Model) → Candidate Word 1 … Candidate Word N → Correction/Fine-Tuning (using High-Level Information) → Recognized Word


Large-Scale Features

- Represent connected components of a letter or word image as a sequence of simple geometric objects, e.g. loops, turning points, and intersection points
- Example: "Mem" = LOOP, TURN, END
- Robust to:
  - Handwriting variations: size, orientation, local distribution
  - Letter variants: positional forms, kashidas
  - Individual style differences
  - Poor image quality
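As a minimal sketch, a feature sequence like the "Mem" example can be mapped to the integer observation symbols a discrete HMM consumes; the token set and alphabet ordering here are illustrative, not taken from the paper.

```python
# Hypothetical discrete token alphabet for large-scale geometric events.
LOOP, TURN, X, DOT, END = "LOOP", "TURN", "X", "DOT", "END"

# Example from the slides: the letter "Mem" reduces to three events.
mem = [LOOP, TURN, END]

def to_symbol_ids(seq, alphabet=(LOOP, TURN, X, DOT, END)):
    """Map a feature sequence to integer observation symbols for a discrete HMM."""
    index = {s: i for i, s in enumerate(alphabet)}
    return [index[s] for s in seq]

print(to_symbol_ids(mem))  # → [0, 1, 4]
```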


Simulation of Writing

- Individually process different connected components (CCs) of the image skeleton
- Use a set of consistent tracing rules to traverse each CC
- Rules resemble those of human handwriting, but any consistent set of rules can be used
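A consistent tracing rule can be sketched as a deterministic traversal of a skeleton graph. The rule used below (always visit the lowest-numbered unvisited neighbour first) is a stand-in: the slides only require that some fixed rule is applied consistently.

```python
# Sketch of a consistent tracing rule over one connected component of a
# skeleton graph given as an adjacency list (node numbering is hypothetical).
def trace(component, start):
    order, stack, seen = [], [start], set()
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        order.append(node)
        # push neighbours in reverse so the smallest-numbered one is traced first
        for nb in sorted(component[node], reverse=True):
            if nb not in seen:
                stack.append(nb)
    return order

skeleton = {0: [1], 1: [0, 2, 3], 2: [1], 3: [1]}
print(trace(skeleton, 0))  # → [0, 1, 2, 3]
```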


Preprocessing Stages

1. Image
2. Skeleton
3. Stick Approximation
4. Labeled Sequence: LOOP, TURN, X, END, …
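The four stages can be read as a simple function pipeline. The stage bodies below are placeholders (real implementations would be image-processing operators); only the composition mirrors the slide.

```python
# Hypothetical stand-ins for the four preprocessing stages; only the
# composition (image -> skeleton -> sticks -> labels) reflects the slide.
def skeletonize(image):
    return {"skeleton_of": image}          # would thin strokes to 1 px

def stick_approximate(skeleton):
    return {"sticks_of": skeleton}         # would fit straight-line segments

def label_sequence(sticks):
    return ["LOOP", "TURN", "X", "END"]    # would emit geometric event labels

def preprocess(image):
    return label_sequence(stick_approximate(skeletonize(image)))

print(preprocess("word.png"))  # → ['LOOP', 'TURN', 'X', 'END']
```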


Discrete HMMs

- "Universal" model of the lexicon, or a collection of word models, assembled from individual letter HMMs
- Letter model size proportional to letter complexity
- The universal combination reflects letter combination probabilities (bigram or other)
- Very general framework: the only requirement is an ordered set of discrete features/events to be used for training and recognition
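Since the framework only requires an ordered set of discrete observations, scoring reduces to the standard forward algorithm for a discrete HMM. The model parameters below are illustrative numbers, not values from the paper.

```python
# Minimal forward pass for a discrete HMM (all probabilities illustrative).
def forward(pi, A, B, obs):
    """P(obs | model) for a discrete HMM with start probabilities pi,
    transition matrix A[i][j], and emission probabilities B[i][symbol]."""
    alpha = [pi[i] * B[i][obs[0]] for i in range(len(pi))]
    for o in obs[1:]:
        alpha = [sum(alpha[i] * A[i][j] for i in range(len(pi))) * B[j][o]
                 for j in range(len(pi))]
    return sum(alpha)

pi = [1.0, 0.0]
A  = [[0.6, 0.4], [0.0, 1.0]]
B  = [[0.9, 0.1], [0.2, 0.8]]   # rows: states, columns: symbols 0/1
print(round(forward(pi, A, B, [0, 1]), 4))  # → 0.342
```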


Discrete HMMs

- Single letter HMM (e.g. ‫ق‬): loop → turn → dot → turn
- Character block (word) HMM (e.g. ‫ﻗﻠﻴﻞ‬): letter HMMs chained ‫ق‬ → ‫ل‬ → ‫ي‬ → ‫ل‬, terminating in an end state
- Universal (composite) HMM: letter HMMs ‫ا‬, ‫ب‬, …, ‫ي‬ connected by bigram transition probabilities:

           ‫ا‬      ‫ب‬     …     ‫ي‬
    ‫ا‬     p11    p12    …    p1n
    ‫ب‬     p21    p22    …    p2n
    …      …      …     …     …
    ‫ي‬     pn1    pn2    …    pnn

- The topology permits skipped and repeated components
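The universal model's structure can be sketched as bookkeeping: the exit state of each letter HMM feeds the entry state of every letter that can follow it, weighted by the bigram probability. Letter names and probabilities below are illustrative.

```python
import itertools

# Illustrative letters and bigram statistics (not from the paper).
letters = ["alef", "beh", "yeh"]
bigram = {("alef", "beh"): 0.5, ("alef", "yeh"): 0.5,
          ("beh", "yeh"): 1.0, ("yeh", "alef"): 1.0}

def build_universal(letters, bigram):
    """Inter-letter transitions of the composite model: exit of one letter
    HMM connects to the entry of the next with the bigram probability."""
    edges = {}
    for a, b in itertools.product(letters, repeat=2):
        p = bigram.get((a, b), 0.0)
        if p > 0:
            edges[(f"{a}.exit", f"{b}.entry")] = p
    return edges

edges = build_universal(letters, bigram)
print(len(edges))  # → 4
```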

Training Modes

- Original algorithm: train individual letter models on manually pre-segmented data
  - Straightforward, accurate training
  - Not practical for high-volume systems
- Improvement: train on unsegmented words using word models
  - Loses some accuracy, but
  - Preprocessing becomes fully automatic


Training with manual segmentation

1. For each training word:
   I.  Manually pre-segment into character sub-images
   II. Convert character sub-images into observation sequences
2. For each character:
   I.   Compute an initial model
   II.  Collect all training sequences from the data in Step 1
   III. Train the character model
3. Connect all trained character models into a universal model using bigram statistics
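With manual segmentation, each character's observation sequences are collected directly, so per-character statistics can be accumulated in a single pass. The toy trainer below estimates emission frequencies by counting, as a stand-in for full Baum-Welch re-estimation; all data is illustrative.

```python
from collections import Counter, defaultdict

# (character, observation sequence) pairs from pre-segmented words — illustrative.
segmented = [("meem", ["LOOP", "TURN", "END"]),
             ("meem", ["LOOP", "TURN", "TURN", "END"]),
             ("lam",  ["TURN", "END"])]

def train_by_counting(data):
    """Collect each character's sequences and normalise symbol counts
    into a per-character emission distribution."""
    counts = defaultdict(Counter)
    for char, seq in data:
        counts[char].update(seq)
    return {c: {s: n / sum(cnt.values()) for s, n in cnt.items()}
            for c, cnt in counts.items()}

models = train_by_counting(segmented)
print(round(models["meem"]["TURN"], 3))  # → 0.429
```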


Training with no manual segmentation

1. For each character, compute an initial model
2. For each training word:
   I.   Assemble initial character models into a word model
   II.  Convert the entire training word image into an observation sequence
   III. Use the sequence to train the word model
   IV.  Disassemble the word model into instances of trained character models
3. For each character, combine all instances into a single trained character model
4. Connect all trained character models into a universal model using bigram statistics
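The assemble/disassemble bookkeeping of this scheme can be sketched directly; the actual HMM re-estimation step is elided, and the letter names and model contents are hypothetical.

```python
# Sketch of word-model assembly and disassembly for unsegmented training.
def assemble(word, letter_models):
    """A word model here is just the ordered list of (letter, model copy)."""
    return [(ch, dict(letter_models[ch])) for ch in word]

def disassemble(word_model):
    """Group (notionally trained) letter instances by character, ready to be
    combined into one trained model per character."""
    instances = {}
    for ch, model in word_model:
        instances.setdefault(ch, []).append(model)
    return instances

letter_models = {"q": {"states": 4}, "l": {"states": 3}, "y": {"states": 3}}
word_model = assemble("qlyl", letter_models)   # transliterating قليل
instances = disassemble(word_model)
print(len(instances["l"]))  # → 2
```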


Experiments

- Proof of concept:
  - Manual preprocessing of segmented letters
  - Handwriting differences emulated by random perturbation ("shaking")
  - Training and test words created by concatenation


Manual Processing

- Uses a specially designed graphical tool
- Stick traces superimposed onto "template" images
- Segments labeled


Training/Test Word Concatenation

- Traces merged automatically


Handwriting Style Emulation

- Traces randomly perturbed ("shaken")
- Letter models trained on shaken skeletons
- Perturbation measured by its standard deviation σ, in pixels; image size ≈ 100 × 200 pixels (examples shown for σ = 5, 7, and 10)
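The "shaking" perturbation can be sketched as adding zero-mean Gaussian noise of standard deviation σ (in pixels) to each skeleton point; the points and seed below are illustrative.

```python
import random

def shake(points, sigma, seed=None):
    """Jitter each (x, y) skeleton point with zero-mean Gaussian noise
    of standard deviation sigma, measured in pixels."""
    rng = random.Random(seed)
    return [(x + rng.gauss(0, sigma), y + rng.gauss(0, sigma))
            for x, y in points]

stroke = [(10, 10), (20, 15), (30, 10)]
shaken = shake(stroke, sigma=5, seed=0)
print(len(shaken))  # → 3
```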

Test Data

- Tested on synthetic data
- 29 main Arabic characters included in the model
- Character models with 3 to 20 hidden states
- Lexicon of 2000+ words
- Various degrees of perturbation, from σ = 0 to σ = 15
- Three sets of tests:
  1. Machine-printed images as templates; train with pre-segmentation
  2. Machine-printed images as templates; train without pre-segmentation
  3. Machine-printed and handwritten images as templates; train without pre-segmentation


Test Sets 1 and 2: Results

                                With Segmentation   Without Segmentation
  Character precision                 99.5%                93.1%
  Character recall                    99.6%                92.1%
  Number of word sequences           107950                10800
  Number of matches                  105691                 8059
  Word recognition rate               97.9%                74.6%


Test Set 3: Handwritten and Machine-Printed Data

- Purpose: show that one handwriting style can be modeled with perturbed versions of other styles
- Two words, with 3 handwritten and 1 machine-printed sample of each
- Skeletons created manually
- Trained using perturbed handwritten skeletons, with no pre-segmentation (as in Test 2)
- Tested on the machine-printed skeletons


Test Set 3: Training and Test Data


Test Set 3: Results

- Tested the universal model
- Both words correctly recognized
- Perturbation during training was critical: without it, there was no path at all through the universal model


Future Work

- Automatic preprocessing:
  - Rectification
  - Feature extraction
- Study the role of high-level information
- Simple human handwriting recognition tests, using native and non-native Arabic speakers, to establish a baseline of human performance
- Tests on more realistic data:
  - IFN/ENIT
  - ISG
  - Other datasets
