Image features extraction using mathematical morphology - CiteSeerX

6 downloads 58209 Views 332KB Size Report
opening by reconstruction. We will call this filter later ASF by reconstruction (ASFR): ... Cluster centers obtained finally represents graytones of the output image.
Image features extraction using mathematical morphology Marcin Iwanowski, Sławomir Skoneczny, Jarosław Szostakowski Institute of Control and Industrial Electronics, Warsaw University of Technology, ul.Koszykowa 75, 00-662 Warsaw, Poland ABSTRACT Mathematical Morphology (MM) is a very efficient tool for image processing, based on non-linear local operators. In this paper MM is applied to extract the image’s features. As a feature we understand specific information about the image i.e. location, size, orientation of certain image elements. Morphological operators are applied to find and measure objects on the image’s surface. Two practical examples are considered. First is devoted to analysis of binary images, containing printed characters. Characters are separated and MM is used to extract some information (features) from each character. These features are later measured and included in a feature vector. It contains the special kind of information - the number of elements of the character with its shape modified in different ways. Second example shows how feature extraction by MM works on graytone images. Images for analysis contain human faces. Morphological operators extract some important elements of human face. This information is very important to identify the human face. Experiments show us how the morphological operators can be applied to the feature detection. The simplest operators as erosion and dilation, as well as more sophisticated tools like: morphological filtering, geodesic transformations are used for that purpose. Also directional operations are applied to extract some areas. This paper includes algorithms for feature extraction by MM, as well as the brief description of morphological tools, explication of experiments (characters and faces), and the results of them. After experiments already made, MM seems to be a powerful and promising technique for the feature extraction. Extracted features can be used for identification of objects on the image's surface. Keywords: mathematical morphology, filtering by reconstruction, feature extraction, pattern recognition

1. INTRODUCTION In most of cases MM is used as a tool for filtering and segmentation. Applications of MM for feature extraction is not as often represented in literature as both tasks mentioned above. In our paper mathematical morpholgy is used for the purpose of extraction of specific elements of an image - images' features. Two kinds of application are presented. Firstly example of application of MM in feature vector construction is described. This example shows how directional morphological operators can be used for feature extraction. Features of printed characters will be extracted. Such kind of feature extraction is particurarly useful for optical character recognition systems (OCR). Second example describes an application of advanced morphological filtering into a problem of human faces recognition. Human faces are filtered and segmented in such a way that particular elements of the face are extracted. These elements can be treated as a features of a human face. Later they can be used for human face recognition. This paper is divided into 6 parts. First three are devoted to a short introduction into tools which are applied in presented applications. Part 2 describes directional mophological operators, part 3 - morphological filtering by reconstruction, and part 4 - segmentation by k-means clustering. Next two parts describe the applications: part 5 application of feature extraction for OCR and part 6 - extraction of the features of human face. Part 7 includes a conclusion. Contact e-mail address: [email protected]

2. DIRECTIONAL MORPHOLOGICAL OPERATIONS Good introduction to mathematical morphology was published in some publications9,10. Here only the directional versions of morphological operators are described. In this case 'directional' means that a given operation in performed in a one, particular direction. In other words only neighbors from one direction are considered. The shape of neighborhood in MM is described using the notion of structuring element. Shape of the structuring element tells us how the neighborhood looks like. Two basic morphological operators are: erosion and dilation. Directional versions of them have structuring element consisting of only one neighbor of each pixel in given grid. Dilation (erosion) described in other words is a non-linear supremum (infinimum) filter. We consider supremum (infinimum) of only two points: center point and one of its neighbors, so that the image properties from one direction are considered. All possible locations of the neighbor are strictly depending on the grid type. As a consequence of that not all directions can be take under consideration only these which describe location of neighbors in given grid. Because of that number of directions is equal to the connectivity of the grid e.g. for grid with 6-connectivity (hexagonal grid) only 6 directions are available. In the example shown in one of forthcoming parts this number is enogh - we use square grid with 8 neighbors and therefore we can have 8 different directions in which directional operations are possible. But for some reasons one need to have some other directions e.g. one could need more precise definition of direction by an angle. In this case some algorithms are to find in literature15. In fact, a lot of morphological operators have its directional versions, especially filters. We are using directional versions of opening and closing. Structuring element used for both operations has the same properties as these previously described - it includes only one neighbor. But one remark is important. Opening consists of two succeded operations: erosion and after it dilation, and needs two structuring elements. First one - used for dilation should be transposed to the second - for erosion. In directional case it means that dilation is performed in the oposite direction to erosion. The same remark should be considered for closing (which consists of erosion preceeded by dilation). Size of directional operation is defined in the same manner as in traditional case. Operation of size n consists of n single operations.

3. MORPHOLOGICAL FILTERING BY RECONSTRUCTION Another morphological tool used in our experiments is filtering. In this paper morphological filtering by reconstruction is considered. The following part describes briefly theoretical background of opening and closing by reconstruction. Both operations belong to one group called geodesic morphological transformations13. Geodesic dilation (erosion) of size 1 of the marker image f with respect to the mask image g where

f ≤ g ( f ≥ g ) is defined as infinimum (supremum) image of mask image and marker image after dilation (erosion) (1) (1) of size one - with elementary structuring element, and is denoted as: δ g ( f ) ( ε g ( f ) ):

δ g(1) ( f ) = δ

(1)

( f ) ∧ g,

ε g(1) ( f ) = ε

(1)

( f ) ∨ g,

(1) Geodesic dilation (erosion) of size n consists of n succesive geodesic dilations (erosions) of size 1:

δ g (n) ( f ) = δ g (1) ( δ g(1) .... δ g (1) ( f )...)) n − times

, ε g ( f ) = ε g ( ε g .... ε g ( f )...)) ( n)

(1)

(1)

n − times

(1)

(2)

Reconstruction by dilation (reconstruction by erosion) of a mask image f from a marker image g where f ≤ g ( f ≥ g ) is defined as succesive geodesic dilations (erosions) of f with respect to g performed until idempotence and is ∗

denoted by: Rg ( f ) ( Rg ( f ) ):

Rg ( f ) = δ (gi ) ( f ) , Rg∗ ( f ) = ε (gi ) ( f ),

(3)

where i such that (idempotence):

δ (gi ) ( f ) = δ (gi +1) ( f ) , ε (gi ) ( f ) = ε (gi +1) ( f ).

More information about reconstruction is in literature13,14. Opening and closing by reconstruction are defined as follows:

γ R( n ) ( f ) = R f (ε ( n ) ( f )),

ϕ R( n ) ( f ) = R *f (δ ( n ) ( f )).

(4)

Unlike traditional opening and closing, opening by reconstruction and closing by reconstruction preserves shapes in the image. It is very important while using these operations as a first step of segmentation. It makes possible removal of the local peaks of gray-intensity without changing the borders of regions (which happens in traditional opening and closing). One of the most efficient filters are alternating sequencial filters (ASF), which are consist of two families of filters. Each family contains operations of the same type but of different, increasing size. It can be defined as follows:

ASF i = α (i )β (i ) ... α ( 2)β ( 2) α (1)β (1)

(5)

α, β are the families of filters, i is defined as a size of filter. In most of cases α and β represent opening and closing7,11. In our case we use these operation by reconstruction where α means closing by reconstruction and β opening by reconstruction. We will call this filter later ASF by reconstruction (ASFR):

ASFR i = ϕ R (i ) γ R (i ) ... ϕ R ( 2) γ R ( 2) ϕ R (1) γ R (1)

(6)

More information about morphological filters is in literature7,11.

4. SEGMENTATION Up to now a huge amount of segmentation algorithms was developed. In general one can divide them into two groups12: based on spectral and spatial properties of the image. To the first group belong all algorithms based on the intensity of image’s pixels like eg. clustering. Second group bases on the relationships between pixels and theirs neighbors. To this group belongs the most popular morphological segmentation technique - watershed1,6. This section describes only first type of segmentation - clustering. In our case not only spectral properties are considered because first step - morphological filtering was performed. Those step have removed the undesirable intensity peaks using the spatial information, so the spatial constraints are in such a manner considered. Proposed clustering method bases on the k-means technique2,3,8. K-means algorithm aims at classifying each member of given set of input data into given number of clusters. Algorithm deals with distances between samples and cluster centers. This distance is minimized in two steps performed iteratively. First step distributes samples into the clusters, second updates cluster centers. Each sample is classified to the cluster which center is the nearest in given metric. Updating of cluster centers is performed by computing a new cluster center by placing it in the middle (arithmetic mean value) of all of the samples belonging to the cluster. These two steps are performed iteratively while clusters centers change their position. If in the last performed iteration cluster centers did not move, it means that operation of samples classification is finished. In case of applying k-means algorithm for images notions of sample and cluster center are represented by pixel’s graytone. During the calculations the distance between samples and cluster centers is calculated by taking absolute value of difference between graytones of sample and cluster center. Cluster centers obtained finally represents graytones of the output image. These graytones refers to regions on the image surface.

Whole algorithm can be written as follows: 1. Set initial cluster centers (output graytones). 2. Classify pixels. Each pixel is labeled by the id number of the closest cluster. 3. Calculate new cluster centers (output graytones). 4. If in cluster centers obtained in step 3 are different to centers obtained in previous iteration then go to step 2, otherwise go to step 5. 5. Using cluster centers (output graytones) classify pixels of the input image into output one. Change value of each pixel to the value of the closest cluster center. Initial cluster centers (point 2 above) can be calculated in any way. We have placed them equally distanced in a whole range of graytones. Next remark refers to the spectral property of this kind of segmentation. Because k-means does not consider neither position of pixel nor its neighbors, there is no need to make all of the calculations on image. Image histrogram is sufficient to do this. This solution is faster because it needs smaller amout of computations. In our case k-means algorithm is applied to a binarisation (k=2).

5. FEATURE EXTRACTION FOR OCR One of the main areas of application of any feature extraction algorithm is optical character recognition (OCR) and automatic document reading. Up to now mathematical morphology is not often applied in these areas, altrough it offers interesting tools for solving some of problems. From the large variety of morphological operators, directional operators seem to be useful for that purpose. They can operate in particular directions and find directional features of objects. Features are understood here as a set of information which allows us to indetify characters and to classify it to one of the pre-defined classes. In this paper we want to show an example of feature extraction using directional morphological operators. Only two directions are considered - horizontal and vertical. For presented example this is sufficient, but in more advanced cases (e.g. hand-written characters, larger variety of different fonts) alse more directions can be considered. We use directional dilations in several steps of feature extraction. Firstly we use it for defining the character area - the smallest square including the character. This can be easily and fast obtained by taking intersection of horizontal and vertical dilations - see Fig.2a-2d. Similiar procedure can be applied to extraction of characters from the whole line of text, as well as extraction of lines from whole page. This extraction can be obtained very fast. All of characters from page can be extracted in one pass. Character area is applied firstly to define the window in which character is included. Secondly knowing the xand y-size of the area we obtain the proportions: height/width. For some of the characters like 'I','M','W' this is good hint for recognition. Finally the character area will be applied to determine the borders of the area. Borders are applied to hide a part of the character - see Fig.1. - in order to modify a character in such a manner that its shape can be changed.

Fig.1. Examples of characters modified by adding borders.

Borders are generated by directional erosions of the character area. Left and right border by horizontal erosions, upper and lower by vertical. Size of erosions is calculated in such a way that the thickness of border should be equal to about 15% of thickness of the character area in particular direction. The features which are generated are grouped into two groups. First one base on counting of holes inside the character. The hole is understood here as a part of the background surrounded by the character's curve - like a triangle inside the 'A' character. Second group of features bases also on some calculations, but in this case we calculate number of objects remaining after some character's modifications. First feature from the first group is a number of holes in the character (not-modified). In order to obtain next features the character's shape is changed by adding borders - see Fig.1. After the change of the shape number of holes is calculated once again. For most of characters number of wholes is increasing after adding one or more borders. Altogether this group of features consists of 5 elements: number of holes in not-modified character, with additional left, right, upper and lower border.

Fig.2. Set of 36 characters of Arial font (a), after vertical dilation (b), after horizontal dilation (c), character area - intersection of veritcal and horizontal dilation (d), extraction of horizontal lines - horizontal opening (e), filtered horizontal lines (f), extraction of vertical lines vertical opening (g), filtered vertical lines (h).

Second group of features base on calculations of elements remained after some filtering of character. This filtering is applied due to extract areas of the character's shape in particular directions e.g. horizontal, vertical lines (fragments of charcter's curve). In our case two features are extracted: number of horizontal and vertical fragments in character's curve. One find these fragments by directional opening in two directions: horizontal and vertical. Size of the opening should be greater than thickness of the character's curve. In the example shown on Fig.2 (e,f,g,h) this process is presented. Horizontal and vertical lines obtained are additionally filtered in order to remove thin lines which can be found e.g. in 'O' character. Number of objects is calculated on filtered image. a

b

c

d

e

f

0

1

1

1

1

1

2

1

0

0

0

0

0

1

0

2

0

0

1

0

1

0

1

2

3

0

0

0

0

1

0

2

4

1

1

1

1

1

1

1

5

0

0

1

0

1

1

1

6

1

1

2

1

1

0

7

0

0

0

0

1

8

2

2

2

2

2

9

1

1

1

3

1 1

A

1

2

B

2

2

2

C

0

0

D

1

1

E

0

F

1

1

2

2

3

a

b

c

d

e

f

g

h

I

0

0

0

0

0

1

0

0,1

0,3

J

0

0

0

0,6

K

0

1

1

0,6

L

0

0

0

0,6

M.

0

2

0

1

0

0,6

N

0

1

0

1

0

0

0,6

O

1

1

1

1

1

0

0

1

0,6

P.

1

1

1

1

1

1

0

1

0,6

Q

1

1

1

1

1

0

0

0,6 0,9

S

0

0

1

0

1

0

0

1

1

g

h

0

0,6

3

R

1

2

1

1

0 0

1

0 0

0

2

1

1

1

0,4

1

0

0

1

0

0,8

0

1

1

0,6

2

0

0,9

2

0

0,7

1

2

2 2

0,8

0 0

2

0

0

0

2

1

0,7 0,8

0

0,7 0,7

1

1

0

1

2

2

1

3

2

0,7

T

0

0

0

0

0

1

1

1

0

0

0

0

2

0,7

U

0

0

0

1

0

2

0

1

1

1

1

2

0,7

V

0

0

0

1

0

0

0

0,9

0

2

0

0

1

3

0,7

W

0

1

0

2

0

2

0

1,3

0

0

1

0

0

1

2

0,6

X

0

1

1

1

1

0

0

0,7

G

0

0

1

0

0

0

0,8

Y

0

0

0

1

0

1

0

0,7

H

0

1

0

1

0

2

0,7

Z

0

0

1

0

1

0

2

0,7

3

1

1

0 1

1,3

1

0,7 1

0,7

Fig.3. Feature vestor obtained by morphological procesing: (a)-(e) number of holes in: character (a), character with lower (b), left (c), upper (d), right (e) border; number of elements after directional vertical (f) and horizontal (g) opening; proportions of character area (h).

In our example lines in only two directions are detected. For more complicated tasks one can apply more directions. In square grid of points (8 neighbors) one can simply add two directions (+/- 45°). For some characters this can be useful. Of course other directions (angles) are also possible, but in this case one should use special algorithms for morphological operators in unrestricted direction15 Fig.3 shows the summary of all features extracted from a set of characters of Arial font. The last column includes the proportions heigh/width of each character calculated on character area. This in only an example, but it shows that this 7 features are sufficient to distinguish most of the characters. Only some pairs of characters have the same set of features. For this case additional features can be applied. At the end should be said that features (elements of the feature vector) can be different depending on the particular font. In Fig.3 also other possible values for some of the characters are included and written in italics.

6. FINDING OF SPECIFIC ELEMENTS IN HUMAN FACE Experiments, which results are presented here, were devoted to finding of specific areas on the human face as eyes, lips, etc, which can be further processed. Morphological filtering by reconstruction is an excellent tool for filtering without disturbing the borders of regions on the image5. In case of extracting characteristic elements of human face this is extremely important because any unexpected disturbance of shape can cause problems with badly recognized face (in the next step of processing). First steps of processing bases on morphological filters by reconstruction. Filters by reconstruction are extremely useful for all of the purposes where one should remove some noise or elements from the image, but without chaging the shape of remaining objects. Traditional filters remove the noise/obejects from the image, but on the other hand they cause disturbances of shape (e.g. blurring effect). Opening/closing by reconstruction in the first stage (erosion/dilation) remove areas of given size from the image and, in the second stage reconstructs not-removed objects.

Fig.4. Filtering by ASFR of human face: input image (a), ASFR size 1 (b), size 2 (c), size 3 (d), size 4 (e), ASF size 4 (f); pictures above show images, below show sections of the images above made for y=50 - see short lines between images.

First step is applied in order to remove the noise from the image (pre-filtering). This step consists of closing by reconstruction of size 1 preceeded by opening by reconstruction of size 1. First of these two operations removes

darker than background pixels from the image, second - lighter. Result of that will be later called filtered image. Result of this step is presented on Fig. 4b - input image in on Fig.4a. In the second step main filtering by reconstruction is performed (ASFR). Such kind of filtering removes in each step bigger objects. The result of ASFR is shown on Fig.4c,d,e. Face on Fig.4e is not including the characteristic features like eyes, lips etc. The shape of the face is however preserved. In order to compare with traditional alternating sequential filtering (by traditional opening/closing) Fig.4f is shown. It is visible, that the shape of the face is changed - such a operation for the recognition purposes has not a great value. Image filtered by ASFR is now binarized in order to obtain face area. Birnarisation is made by k-means algorithm (with k=2). Shape of the face in presented in Fig.5a. Now, the features of the face (eyes, lips) are going to be extracted. Image after ASFR compared with the filtered one (Fig.4b) dosn't include elements of the face. From this image initial image is now subtracted. This operation can be included to a family of top-hat transformation, because it removes the background and preserves the objects on the background. Fig.5b shows result of our top-hat. Important areas on the face are visible - eyes and lips are considerably lighter than the rest of the image. This image can be easily binarized by k-means technique - see Fig. 5c. In order to remove these parts of the binarized image, which are dos not belong to the face area, we just take the intersection of it with the face area. After this operation the image includes important features of our image - eyes and lips - see Fig. 5d.

Fig.5. Steps of proposed processing method: face area (a), top-hat (b), binarized top-hat (c), part of binarized top-hat included in face area (d), feature areas (e), feature areas + contour of face area (f).

In order to determine not only features, but whole areas where features can be found, some additional operations are performed (openings, directional and traditional dilations). After this areas are excluded from the image - see Fig. 5f.

Fig.6. Results of the extraction of human face features.

Fig.6 shows some other result of feature extraction shown on initial images. In all of them features are extracted correctly. Using our method we can find specific elements on the human face. These elements and distances between them are measured. These sizes and distances can (and will be in the future) included in feature vector used for face recognition.

Presented solution is of course not the only one to be applied to this kind of images. One of the other possibilities is an image segmentation based on watershed transformation1,6. Opposite to k-means-type segmentation it considers local dependencies (pixel's neighborhood).

7. CONCLUSIONS Experiments carried out shows that MM can be efficiently used for feature extraction. Experiments already realized show that mathematical morphology offers some interesting tools for feature extraction. These tools are: directional operations, reconstruction, and alternating sequential filters. Also clustering technique by k-means algorithm can be used for binarisation of graytone images (automatic threshold calculations). Methods presented in our paper are still developed. First method (OCR method) was already applied to different character sets. Similar method was applied to fast reading of bank-check codes written using special OCR-characters4. Second method (features of the human face) is a good basis for further experiments, some of which are realized at the moment in our institute.

ACKNOWLEDGEMENTS This work was supported by Polish Committee of Scientifical Researches (KBN), project number: 8 T11C 027 12.

REFERENCES 1. S.Beucher, "Segmentation d'images et morphologie mathematique", Ph.D. Thesis, School of Mines of Paris, Fontainebleau, 1990. 2. R.O.Duda, P.E.Hart, "Pattern classification and scene analysis" Willey and Sons, 1973. 3. M.Iwanowski, "K-means algorithm for greytone reduction", Technical Report, School of Mines of Paris, N13/96/MM, 1996. 4. M.Iwanowski, "Automatic reading of bank-check codes", Technical Report, School of Mines of Paris, N19/96/MM, 1996. 5. M.Iwanowski, S.Skoneczny, J.Szostakowski, "Image segmentation by advanced morphological filtering and clustering", in:Proceedings of International Conference of The Quantitive Description of Materials Microstructure, Warsaw 16-19 April 1997 pp.307-314. 6. F.Meyer, S.Beucher, "Morphological Segmentation". Journal of Visual Comm. and Image Representaion, Vol.1, No.1, September, pp. 21-46, 1990. 7. F.Meyer, J.Serra, "Filters: from theory to practice", Acta Stereologica 1989; 8/2; Proceeding of 5-th European Congress for Stereology, Freiburg, September 1989, pp. 503-508. 8. S.Z Selim, M.A.Ismail, "K-means-type algorithms: a generalised convergence theorem and characterisation of local optimality". IEEE Trans. Pattern Anal. Machine Intell. 6(1), pp. 81-87, 1984. 9. J.Serra, "Image analysis and mathematical morphology, Academic Press, 1982. 10. J.Serra, ed., "Image analysis and mathematical morphology Volume 2: theoretical advances", Academic Press, 1988. 11. J.Serra, L.Vincent, "An overview of morphological filtering", Circuits Systems Signal Processing Vol.11, No.1, 1992. 12. P.Soille, "Partitioning of multispectral images using mathematical morphology", Technical Report, School of Mines of Paris, N-22/92/MM, 1992. 13. P.Soille, "Geodesic Transformations in Mathematical Morphology: an Overview", Technical Report, School of Mines of Paris, N-24/94/MM, 1994. 14. L.Vincent, "Morphological Grayscale Reconstruction: Definition, Efficient Algorithm and Aplications in Image Analysis", IEEE Comp.Vis. and Patt.Rec., Proceedings, 1992, pp. 622-635. 15. Edmond J.Breen, Pierre J.Soile, "Generalisation of van Herk recursive erosion/dilation algorithm to lines at arbitrary angles", DICTA-93 Sydney - Conference Proceedings ,12 1993