2-D histogram-based segmentation of postal

0 downloads 0 Views 489KB Size Report
method based on 2-D histogram clustering and watershed transform. ..... loss of information and to join flat basins with deep basins without increase the depth of ...
2-D Histogram-based segmentation of Postal Envelopes Eduardo Akira Yonekura, Jacques Facon PUCPR-Pontif´icia Universidade Cat´olica do Paran´a, Rua Imaculada Conceic¸a˜ o 1155, Prado Velho 80215-901 Curitiba-PR, Brazil akira,facon @ppgia.pucpr.br 

Abstract Mail sorting automation is yet a partially solved challenge. Due to no fixed positions of the semantic information in envelope images, part of this challenge is the automatic location of address blocks. In this paper we approach this problem by proposing a new postal envelope segmentation method based on 2-D histogram clustering and watershed transform. Segmentation task consists in detecting the 2D histogram modes associated with homogeneous regions in envelope. A new filter used for 2-D histogram calculation is proposed. The homogeneous modes in 2-D histogram are segmented through the morphological watershed transform. Our approach is applied to complex postal envelopes. Very little a priori knowledge of the envelope images is required. The advantages of this approach will be described and illustrated with tests carried out on different images where there are no fixed position for the handwritten address block, postmarks and stamps. A groundtruth strategy is employed to evaluate the accuracy of segmentation. 





1. Introduction One of the greatest challenges to improve the postal service is solving the problem of automatic location of address blocks in envelope images. This segmentation challenge involves partitioning the envelopes into several semantic clusters, which can contain text and graphic messages in addition to the address block, postmarks and stamps. Moreover, there are no fixed positions for these items. Hence, the correct address block location is highly dependent on the correct location of other items. For instance, one can solve this problem by segmenting not only the handwritten address block, but also postmarks, stamps and background . Several authors have dealt with the problem of locating address blocks in envelope images. The existing methods

are based on Artificial Intelligence techniques, on geometrical characteristics or on texture analysis by filtering. Wang in [5] presents an expert system for automatic sorting of envelope images. The system uses a blackboard to keep the geometric attributes of blocks obtained after processing different types of envelope images. A statistical database of geometric features and a rule-based inference engine are used to assign a score to each address block candidate. Finally, the candidate with the highest score is semail lected as the address block. A database of over pieces was used to develop the rules associated with the tools. Experiments with a database of mail pieces indicated a success rate of over in locating the address block. Yeh in [12] proposes a method for locating address blocks which utilizes geometric features. Based on Laplacian operator, this method applies a connected component analysis to identify each individual eight-connected set of pixels. A merging process is used to eliminate the undesired connected components. The features used are position, size, average gray level and thickness. The connected components are grouped based on the similarities of their geometric features using a proximity tree. Cutting the tree produces several different clusters of components. Weighting functions corresponding to the different categories are used to assign scores to each cluster, based on its position. Finally, the cluster with the highest score for belonging to the “address block” category is selected as the correct address block. Jain and Bhattacharjee in [8] present a method for locating address blocks based on a multi-channel filtering approach (two-dimensional Gabor Filters). The text in the envelope image defines a textured region. Non-text components, including blank spaces, are considered as regions of different texture types. Gabor filtering decomposes an input image into a number of filtered images, each of which contains intensity variations over a narrow range of frequency and orientation. A three-cluster solution (k=3) identifies the regions consisting of printed text, non-text (blank spaces and regions which have slow intensity variations), and the

Proceedings of the XVI Brazilian Symposium on Computer Graphics and Image Processing (SIBGRAPI’03) 1530-1834/03 $17.00 © 2003 IEEE



















boundaries between the text regions and the non-text regions. The cluster corresponding to the text is isolated and a connected component analysis is used to identify individual blocks of text in the envelope image. The address block is identified from these blocks. Unfortunately, it is impossible to evaluate the accuracy of the proposed method since few images were tested and no success rate in locating the address block was provided. Yu et al in [17] present a method for locating address blocks from magazine envelopes based on the following assumptions: The address is written on a light-colored label, generally skewed. Moreover, the label may be stuck on the magazine, under the plastic envelope or displayed in a plastic window provided in the envelope. The address block follows the left alignment rule. The contrast between the ink and the spaces between characters varies according to the equipment used (laser or matrix printer, or even a typewriter). And the magazine and envelope ensemble contains other text messages in addition to the address block. The authors segment magazine envelopes using an Otsu’s modified method which reduces the influence of the grayscale distributions. Then a connected component strategy is used to identify individanalysis based on ual blocks of text. Some heuristics are used to eliminate false address block candidates. Finally, a recognizer is used to locate the address block. Experiments on magazine envelopes and other ones showed that the method was successful in locating the correct address block and , respectively. in 































The methods presented above are highly dependent on some a priori knowledge of the envelopes such as orientation and position of the address block. Consequently, they are not really suitable for complex envelopes. Due to the lack of quantitative evaluation criteria, it is difficult to do a comparison. The evaluation is only visual without groundtruth comparison. The goal of this paper is to propose and to evaluate a new postal envelope segmentation method based on 2-D histogram clustering. The aim is to locate and segment handwritten address blocks, postmarks, stamps and background with no a priori knowledge of the envelope images. A filtering based on morphological grayscale dual reconstruction in the 2-D histogram calculation is proposed. The 2-D histogram clustering is carried out by means of the morphological watershed transform. A groundtruth segmentation is employed to evaluate the accuracy of this method This paper is organized as follows: section 2 shortly reviews histogram-based techniques to solve the segmentation challenge. The new filtering used for 2-D histogram calculation is presented in section 3, while in section 4 the proposed method is detailed. Section 5 presents some experimental results and discussions. In addition, the

proposed segmentation method is evaluated by means of groundtruth images in sections 5.1 and 5.2. Finally, some conclusions are drawn in section 6.

2. Histogram-based techniques for segmentation challenge Histogram-based techniques assume that homogeneous classes merge themselves ”naturally” as clusters in an appropriate space. The advantage of this type of techniques is that there is no need of a priori knowledge. We can find histogram-based segmentation algorithms in the literature. Lim and Lee [9], Liu and Yang [10] proposed an approach to segment color images based on decomposition of 1-D color histograms into the sum of Gaussian distributions. Celenk [4] proposed another approach based on 1-D color histogram in 3-D clustering. The main problem is that all theses techniques suffer from the lack of local spatial knowledge. An effective way to overcome this problem is to employ 2-D histogram which allows not only clustering objects of an image but also preserving the local spatial knowledge. The notion of 2-D histogram is mainly used to image thresholding, where it is computed by two images: the original one and a filtered one, usually obtained by an average filter as proposed in [1], [3], [16], [7]. However, in our case, processing the 2-D histogram with the original image and averaged one has the drawback of introducing new gray levels that do not exist in the original image, which can lead to some distortions in terms of segmentation. To overcome this drawback, a filtering based on morphological reconstruction will be further proposed. The detection of clusters in a 2-D histogram is not a trivial task, once it is necessary to assist the gray level occurrences, which is computationally expensive. In the literature, there are few approaches which deal with histogram clustering. One of them is the morphological watershed transform which is able to overcome noisy and heterogeneity problems that occur in postal envelope images. It is not rare to find in postal envelope images noise, complex (and creased) background and inhomogeneous handwritten address blocks. Shafarenko et. al [13] present a segmentation algorithm for color images that uses the watershed transform to segment either the two-dimensional (2-D) or the three dimensional (3-D) color histogram of an image. This method takes place in a perceptually uniform color space (in this case the Luv space). The 2-D histogram is constructed by summing up votes for all intensities occurring at each point of the plane uv. To avoid oversegmentation usually inherent to watershed algorithm the histogram is smoothed. Luo et. al [11] proposed an approach to detect abnormality in color fundus images, which is performed on the

Proceedings of the XVI Brazilian Symposium on Computer Graphics and Image Processing (SIBGRAPI’03) 1530-1834/03 $17.00 © 2003 IEEE

object-based color difference image. In the color difference image, bright and dark objects are highlighted through Luv or Lab color spaces respectively. To suppress undesired background and minor objects which are sensitive to the watershed transform, the authors apply a pre-thresholding and also a post-verification to erase obscure candidates, blood vessels and optic disk.

Figure 1. Grayscale dual reconstruction of mask f from marker g

3. 2-D Histogram based on grayscale dual reconstruction Consider a digitized image and its , both of size ( ) filtered version with gray levels, where and are the gray levels of the pixel at location (x,y) in both images, respectively. The 2-D histogram of images F and G, is a dimensional matrix which contains some statistical information regarding the number of gray level occurrences of a pixel at the same location (x,y) in from any both images. The 2-D histogram and having respectively the pair of pixels gray levels (i,j) can be formalized as follows: 





$





!

!

$

&



























$

*





-

.



/





















*



























2























2



-

.



/

2



2



block and postmarks, stamps and background. For this purpose, we use the morphological watershed transform. To histogram as bebetter understand it, one can see the ing a topographic relief, where the highest areas correspond to peaks, while the lowest ones would be the valleys. If a drop of water is placed on any point of that relief, it will slide until it rests on a place we will call the local minimum (see [2] for more details). The cluster of all those points where the drop of water finally rests in a given region with a minimum is defined as being the watershed area , associated with the minimum (Fig 2). The meeting point of two watershed areas generates a border forming the watershed line, which is the outcome of the watershed transformation process [2].



&

]

_









-

.

/



9

;

























@













C













D

F

(1)

The 2-D histogram is highly dependent of the choice of the second image. As already mentioned, such image is usually obtained by an average filter [1] [3] [16] [7], which can introduce, in the filtered image, non-existent gray levels in the original image. To overcome this problem, we use a morphological filter. Among all morphological operators (see [14] [15] [6] for more details), the morphological grayscale dual reconstruction was chosen. This operator is carried out by iterative operations of geodesic erosion until idempotence as follows: H

I







H

(2)

I

I I









K

O

P

L

 N

R

T

U

X

!

!

! 

U W

Y

Z







W

[

O

where represents the elementary geodesic erosion of grayscale image (marker) over grayscale image (mask). U

W





The morphological grayscale dual reconstruction has the advantage of preserving peaks and modifying valleys (Fig 1). This is a crucial point if we want to preserve the background without spreading the noisy and to modify the handwriting address block, postmarks and stamps information (which look like grayscale valleys as in Fig 1).

4. Clustering

Figure 2. Watershed lines and Catchment basins

To efficiently segment the histogram by watershed transform, it is required that the markers are located in the local minima where they will be pierced. We have used the morphological grayscale dual reconstruction where histogram itself and the the mask correspond to the marker correspond to the histogram plus one occurrence ( histogram + 1) to locate these local minima. This process guarantees the exact location of the true minima.





We are aiming at automatic clustering, where all classes hisshould be extracted from the image through the togram. Ideally, these classes are the handwritten address

&

]



&

]

Proceedings of the XVI Brazilian Symposium on Computer Graphics and Image Processing (SIBGRAPI’03) 1530-1834/03 $17.00 © 2003 IEEE

&

&

]

]

&

]

5. Experiments

5.2. Numerical results and discussion

A database composed of complex postal envelope images, with creased background and no fixed position for the handwritten address blocks, postmarks and stamps was used to evaluate the efficiency of the proposed approach. pixels and was digitized Each image has about dpi. We could verify that the great majority of pixels at of these images belong to the envelope background (approximatively ) and that the address blocks, stamps , and of the and postmarks represent only envelope area, respectively. These differences appear during the construction of the 2-D histogram. The background which constitutes a great homogeneous region, remains almost identical by grayscale dual reconstruction. Consequently, a huge number of occurrences, corresponding to the background, lie on the principal diagonal of the 2-D histogram. The drawback of this process is the distortion caused by the clustering process. To reduce the influence of these unchanged regions and to equalize the importance of the searched classes, the 2-D histogram diagonal iss set to . 



























Due to the similarity of the thickness and the gray levels between handwritten address blocks and postmarks, we observed that it is difficult to segment handwritten address blocks and postmarks separately. Thus, we propose to segment the envelope images into three classes:





















2. Handwritten address block and postmarks;



The complexity of the postal envelope images is reflected over the 2-D histogram which presents a huge ). Then, a post-processing number of basins (over step was necessary to decrease the number of basins and consequent markers. For this task, we used the morphological erosion which allows to join flat basins without loss of information and to join flat basins with deep basins without increase the depth of the last ones. Hence, the system processes n erosions until it is possible to obtain the number of expected markers (i.e. expected classes) through the morphological reconstruction, which are required by the morphological watershed. This post-processing step is totally automatic without training tests.

1. Stamps;



5.1. Evaluation strategy of the proposed method

3. Background. histogramFigures 3 and 4 illustrate the proposed based segmentation on complex postal envelope images with no fixed positions for address blocks, postmarks and stamps. In Figure 3-(a) to (g), it is possible to compare the segmented classes with the groundtruth ones in case of postal envelope with creased background. On can observe that the proposed method has detected three classes as expected. Figure 4 shows interesting segmentations in case of envelope images with many postmarks ( Figure 4-(a) to (d)), with many stamps (Figure 4-(e) to (h)) and without stamp (Figure 4-(i) to (l)). It is possible to conclude that the segmentation recovered the three classes as expected, independently of the layout of the envelope images. The average segmentation rates for each class (handwritten address block and postmarks, background and stamps) are showed in Table 1. Independently of the layout and background in the input images, it can be seen that the segmentation recovered address blocks with postmarks ( ) and background ( ) with great success. In the case of adof unrecovered segdress block and postmark classes, mentation are due to unwanted parts of stamps and to the difference between the original handwritten and postmark thickness and the segmented one. The segmented handwritten address block and postmark are thinner than the ideal ones. In the case of the background, of unrecovered segmentation are due to the creased background and unwanted stamp parts. It can also be seen that the segmentation did not recovered the stamps as expected. It is caused by the fact that the stamps contain some complex drawings, gray level differences (dark objects and bright background) difficulting a good rate.

]

















To evaluate the accuracy of the proposed approach, a groundtruth strategy was employed. For each envelope image, the ideal result (ground-truth segmentation) regarding each class (handwritten address block and postmarks, background and stamps) has been generated. An example of ground-truth segmentation is illustrated in Figure 3(b) and (c) and (d). The comparison between the groundtruth segmentation and the results obtained by the proposed approach was carried out pixel by pixel. We have computed each obtained result with the ideal segmentation in terms of identical pixels at the same location. We have also computed the average of all obtained results for each class (handwritten address block and postmarks, background and stamps).

&



6. Conclusions A new postal envelope segmentation method combining 2-D histogram and morphological clustering by watershed transform concepts was proposed. A morphological reconstruction filter for 2-D histogram calculation was proposed.

Proceedings of the XVI Brazilian Symposium on Computer Graphics and Image Processing (SIBGRAPI’03) 1530-1834/03 $17.00 © 2003 IEEE

Class Stamps Handwritten Address Block and Postmarks Background

Rate 25% 75% 90%

Table 1. Average segmentation rates for each class.

Unlike the previous clustering techniques which carry out the partition based on distance measure, the proposed clustering was based on the watershed transform criterium. The proposed method was shown to be efficient and encouraging in segmenting and isolating the handwritten address blocks, postmarks and background without a priori knowledge about the position of each semantic class. By comparing the obtained results with the ideal ones (ground-truth segmentation), it is possible to see that the proposed approach is robust if we consider the handwritten address block and postmark class and background one. In addition, since this method does not use a priori knowledge, it can be employed to segment other types of document images.

7. ACKNOWLEDGMENTS

[8] A. Jain and S. Bhattacharjee. Address block location on envelopes using gabor filters. Pattern Recognition, 25(12):1459–1477, 1992. [9] Y. W. Lim and S. U. Lee. On the color image segmentation algorithm based on the thresholding and the fuzzy c-means technique. Pattern Recognition, 23:379–396, 1990. [10] J. Liu and Y. H. Yang. Multiresolution color image segmentation. IEEE Trasactions on Pattern Analysis and Machine Intelligence, 16:689–700, 1994. [11] G. Luo, O. Chutatape, H. Li, and M. Krishnan. Abnormality detection in automated mass screening system of diabetic retinopathy. In 14th Symposium on Computer Based Medical Systems, pages 132–137. 2001. [12] A. L. P.S. Yeh, S. Antoy and A. Rosenfeld. Address location on envelopes. Pattern Recognition, 20(2):213–227, 1987. [13] L. Shafarenko, M. Petrou, and J. Kittler. Histogram-based segmentation in a perceptually uniform color space. IEEE Transactions on Image Processing, 7:1354–1358, 1998. [14] P. Soille. Morphological Image Analysis: Principle and Applications. Springer, 1999. [15] L. Vincent. Morphological grayscale reconstruction in image analysis: Applications and efficient algorithms. IEEE Transactions on Image Processing, 2:176–201, 1993. [16] C. W. W.T. Chen and C. Yang. A fast two-dimensional entropic thresholding algorithm. Pattern Recognition, 27:885– 893, 1994. [17] B. Yu, A. Jain, and M. Mohiuddin. Address block location on complex mail pieces. Technical Report MSUCPS:TR9712, Dept. of Computer Science, Michigan State University, 1997.

The authors gratefully acknowledge PUCPR and CNPq (National Scientific Research Council) for the financial support.

References [1] A. S. Abutaleb. Automatic thresholding of gray-level pictures using two-dimensional entropy. Computer Vision Graphics Image Processing, 29:22–32, 1985. [2] S. Beucher and F. Meyer. The Morphological Approach to Segmentation: The Watershed Transformation. Marcel Dekker, 1993. [3] A. Brink. Thresholding of digital images using twodimensional entropies. Pattern Recognition, 25:803–808, 1992. [4] M. Celenk. A color clustering technique for image segmentation. Computer Vision Graphics Image Processing, 52:145–170, 1990. [5] P. P. C.H. Wang and S. Srihari. Object recognition in visually complex environments: An architecture for locating address blocks on mail pieces. Proc. Ninth Intl. Conf. on Pattern Recognition, pages 365–367, 1988. [6] J. Facon. Mathematical Morphology: theory and examples (in portuguese). J. Facon Editor, Curitiba-PR, 1996. [7] L. L. J. Gong and W. Chen. Fast recursive algorithms for two- dimensional thresholding. Pattern Recognition, 31:295–300, 1997.

Proceedings of the XVI Brazilian Symposium on Computer Graphics and Image Processing (SIBGRAPI’03) 1530-1834/03 $17.00 © 2003 IEEE

(a)

(b)

(e)

(c)

(f)

(d)

(g)

Figure 3. Results of the proposed approach compared with ground-truth segmentation: (a) Envelope image with creased background, (b)-(d) Ground-Truth segmentation, (e)-(g) Segmentation results.

Proceedings of the XVI Brazilian Symposium on Computer Graphics and Image Processing (SIBGRAPI’03) 1530-1834/03 $17.00 © 2003 IEEE

(a)

(b)

(c)

(d)

(e)

(f)

(g)

(h)

(i)

(j)

(k)

(l)

Figure 4. Results of the proposed approach: (a) and (e) Envelope images with creased background and wrong layout, (i) Envelope images with creased background and no stamp, (b),(f),(j) Handwritten address block and postmark classes, (c),(g), (k) Bakcground classes, (d), (h), (l) Stamp classes.

Proceedings of the XVI Brazilian Symposium on Computer Graphics and Image Processing (SIBGRAPI’03) 1530-1834/03 $17.00 © 2003 IEEE