Robust artificial landmark recognition using polar histograms

Pablo Suau

Departamento de Ciencia de la Computación e Inteligencia Artificial
Universidad de Alicante, Ap. de correos 99, 03080, Alicante (Spain)
[email protected]

Abstract. New results on our artificial landmark recognition approach are presented, together with new experiments that demonstrate the robustness of our method. The objective of our work is the localization and recognition of artificial landmarks to help in the navigation of a mobile robot. Recognition is based on the interpretation of histograms obtained from the polar coordinates of the landmark symbol. Experiments show that our approach is fast and robust even if the database contains a high number of landmarks to compare with.

1 Introduction

Robot navigation is a research field where a great variety of mechanisms are being studied in order to achieve an interesting goal: a physical autonomous agent capable of navigating without any human interaction, just by interpreting the surrounding environment. Information exchange with this environment relies on several types of sensors, such as sonar and laser range sensors. Vision based navigation systems can achieve a high degree of flexibility, allowing the robot to take complex decisions. Some of these systems are based on landmark recognition; however, papers explaining this kind of system (for example, [1]) focus more on the system description or on the use of environmental characteristics as landmarks (natural landmarks) than on explaining the recognition process.

Our research deals with the landmark recognition process from another point of view. We present our landmark localization and recognition approach by itself, without considering a concrete robot system in which it would be included, so it could also be integrated into other kinds of systems. Landmarks with roadsign symbols have been chosen so that, in the future, this method could be applied to the problem of recognizing this kind of sign.

Several papers address the roadsign recognition problem ([2],[3],[4]), but they describe more complex techniques than the one we present here. Our objective has been to find an efficient and robust recognition method. Although the papers on roadsign detection mentioned before show some ways of solving this problem, our method is simpler and gives us better results.

This paper is divided into the following sections: in section 2 we define polar histograms, in section 3 we explain how to compare different polar histograms, in section 4 we describe the complete approach to localize and recognize artificial landmarks, and finally, in section 5, we show some experimental results.

2 Polar histograms

Polar histograms are introduced as a way of comparing symbols without being affected by small changes in shape, orientation and displacement (scale variations are handled in the localization part of our system, which is explained in section 4). These polar histograms are created from the polar coordinates of symbols. Some works have shown that using polar coordinates allows an efficient, low computational cost comparison of two dimensional irregular shapes, invariant to displacement and rotation (on the plane of the image, no 3D rotations) [5].

The first step to build a polar histogram from a symbol is to have a binary image containing that symbol. This image is represented by means of cartesian coordinates, and it must be transformed into a polar coordinates image, using the gravitational center of the symbol as the pole and a polar axis whose origin is this pole (some examples are shown in Figure 1). Using equations (1) and (2) we can know which cartesian pair (x, y) corresponds to each polar pair (ρ, θ). This translation can be done in two ways: calculating the polar coordinates for each cartesian pair in the original image, or calculating the cartesian coordinates corresponding to each polar pair in the destination image. The second method is more efficient, and it avoids gaps in the resulting polar image because every destination pixel receives a value.

x = \rho \cdot \cos(\theta)    (1)

y = \rho \cdot \sin(\theta)    (2)

Fig. 1. Some examples of symbols represented in cartesian coordinates (left) and the same symbols represented in polar coordinates (right), using the gravitational center as a pole.
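The second translation method (one cartesian lookup per polar pair of the destination image) can be sketched as follows. This is only a minimal illustration under stated assumptions, not the implementation used in the paper: the function names `centroid` and `to_polar`, the nearest-neighbour rounding, the choice of the farthest image corner as maximum radius, and the layout (rows are angles, columns are distances) are all assumptions made for the example.

```python
import math

def centroid(image):
    """Return the (row, col) gravitational center of the 1-pixels of a binary image."""
    total = sum_r = sum_c = 0
    for r, row in enumerate(image):
        for c, value in enumerate(row):
            if value:
                total += 1
                sum_r += r
                sum_c += c
    if total == 0:
        raise ValueError("image contains no symbol pixels")
    return sum_r / total, sum_c / total

def to_polar(image, n_rho, n_theta):
    """Build a polar image by visiting every (rho, theta) of the destination."""
    height, width = len(image), len(image[0])
    pole_r, pole_c = centroid(image)
    # Assumption: the largest radius reaches the image corner farthest from the pole.
    max_rho = max(math.hypot(r - pole_r, c - pole_c)
                  for r in (0, height - 1) for c in (0, width - 1))
    polar = [[0] * n_rho for _ in range(n_theta)]
    for t in range(n_theta):
        theta = 2.0 * math.pi * t / n_theta
        for d in range(n_rho):
            rho = max_rho * d / (n_rho - 1) if n_rho > 1 else 0.0
            x = rho * math.cos(theta)  # equation (1)
            y = rho * math.sin(theta)  # equation (2)
            col = int(round(pole_c + x))
            row = int(round(pole_r + y))
            # Positions that fall outside the source image stay at 0.
            if 0 <= row < height and 0 <= col < width:
                polar[t][d] = image[row][col]
    return polar
```

Because the loops run over the destination image, every polar pixel is assigned exactly once, which is why this direction leaves no gaps.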

Finally, from the polar image, we can obtain a histogram that represents the original symbol. In the polar image, the distance ρ increases with each column, so all the pixels in the same column are at the same distance from the symbol's gravitational center in the original image. If we count the pixels with value 1 in each column of the polar image, we obtain a histogram that tells us, for each distance from the gravitational center of the symbol, how many pixels have value 1 (an example is shown in Figure 2). This histogram is rotation invariant (because we use polar coordinates and the camera is always straight) and displacement invariant (because we use the gravitational center of the symbol as polar center). We call this structure a polar histogram.

Fig. 2. An example of polar histogram created from a landmark symbol
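The column count described above can be sketched in a few lines. The function name `polar_histogram` and the list-of-lists layout (one row per angle θ, one column per distance ρ) are assumptions made for this illustration.

```python
def polar_histogram(polar_image):
    """Count the 1-pixels in each column of a polar image (one column per distance)."""
    n_rho = len(polar_image[0])
    histogram = [0] * n_rho
    for row in polar_image:          # each row corresponds to one angle theta
        for d, value in enumerate(row):
            if value:
                histogram[d] += 1
    return histogram

# A tiny hand-made polar image: 4 angles, 3 distances.
example = [
    [1, 0, 1],
    [1, 1, 0],
    [1, 0, 0],
    [1, 1, 0],
]
print(polar_histogram(example))  # [4, 2, 1]
```

A rotation of the symbol in the image plane corresponds to a cyclic shift of the rows of the polar image, which does not change the column counts; this is the rotation invariance stated above.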

We use polar histograms to recognize the symbol extracted from a landmark localized in an image, comparing its polar histogram with the polar histograms created from the symbols stored in a database.

3 Comparing polar histograms

If we have a symbol database, and we have created a polar histogram for each of those symbols, the recognition task consists of building a polar histogram for the symbol we want to recognize and then finding the symbol in the database whose polar histogram is most similar. In order to test this similarity, several histogram comparison methods, like the Kolmogorov-Smirnov test or the Chi-Square distance, could be used. As can be seen in the experimental results section, we have tested several histogram comparison methods: L1 norm, L2 norm ([6]), Prefix Sum ([7]) and Chi-Square distance (Kolmogorov-Smirnov was not suitable for our problem). We obtained the best results with the Chi-Square distance, so this is the distance we use in our system to compare polar histograms. The Chi-Square distance, applied to two histograms, gives a weighted average of the differences between all the positions of these histograms, so it tells us which of the histograms in the database is most similar to the histogram of the symbol we want to recognize. We can calculate this distance χ2 between two histograms i and j using equations (3) and (4).

\chi^2_{ij} = \sum_{k=1}^{n} \frac{(H_i(k) - \hat{H}(k))^2}{\hat{H}(k)}    (3)

\hat{H}(k) = \frac{H_i(k) + H_j(k)}{2}    (4)

Although the Chi-Square distribution is not symmetric, the Chi-Square distance is symmetric, so it fits our purposes.
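Equations (3) and (4) and the nearest-histogram search can be sketched as below. The names `chi_square_distance` and `recognize`, the dictionary used as database, and the decision to skip bins where both histograms are empty (to avoid dividing by zero) are assumptions of this example, not details given in the paper. Substituting (4) into (3), each term becomes (H_i(k) − H_j(k))² / (2 (H_i(k) + H_j(k))), which makes the symmetry explicit.

```python
def chi_square_distance(h_i, h_j):
    """Chi-Square distance between two histograms of equal length."""
    if len(h_i) != len(h_j):
        raise ValueError("histograms must have the same number of elements")
    distance = 0.0
    for a, b in zip(h_i, h_j):
        mean = (a + b) / 2.0                      # equation (4)
        if mean > 0:                              # assumption: skip empty bins
            distance += (a - mean) ** 2 / mean    # one term of equation (3)
    return distance

def recognize(histogram, database):
    """Return the name of the database symbol with the closest polar histogram."""
    return min(database,
               key=lambda name: chi_square_distance(histogram, database[name]))

# Hypothetical database with two symbols and their polar histograms.
database = {"stop": [4, 2, 1, 0], "yield": [1, 3, 3, 1]}
print(recognize([4, 2, 2, 0], database))  # stop
```

The search is linear in the number of stored symbols, and each comparison is linear in the number of histogram elements.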

4 System description

The aim of the system is to locate the nearest landmark inside a digital image containing one or more artificial landmarks, and to extract the symbol inside it in order to recognize it by comparison with a set of symbols stored in a database. The digital image is obtained by a camera placed on a mobile robot. Since this work is focused more on recognition than on localization or image segmentation, landmarks are not too complex. As we can see in the examples in Figure 3, a landmark is square shaped, with a blue border and a black symbol inside the border. The symbols inside the landmarks have been taken from real roadsigns.

Fig. 3. Landmark examples

Our approach for landmark localization and recognition is based on [2], with some changes. Figure 4 shows the complete process, from the moment the image is obtained from the camera on the robot to the moment the symbol inside the landmark is recognized. This process can be summarized in the following steps:

– Colour segmentation: after transforming the input image to the HSV colour model, a colour quantization is applied to it, reducing it to eight basic colours ([2]). A binary image is created containing the pixels of the original image which have the same colour as the landmark borders (one of these eight basic colours).

– Landmark localization: from the binary image corresponding to the landmark borders, we try to localize the nearest landmark by means of horizontal and vertical projections. We don't use stereo vision (we have only one camera on top of the robot), so we don't have depth information. Therefore we consider that the nearest landmark is the biggest one, that is, the landmark with the greatest number of pixels. The localization is explained in [2], but instead of creating projections as the total sum of blue pixels in each row and column, we have verified that using the maximum sum of consecutive blue pixels gives us better results when several landmarks are very close to each other.

– Landmark's symbol extraction: once the nearest landmark has been detected, and after checking that it is approximately square-shaped, we apply the k-means algorithm only to the part of the original image where the nearest landmark is placed, splitting its pixels into two groups: pixels having a high V value and pixels having a low V value. As a result, we create a binary image with the same size as the images stored in the database, containing only one symbol.

– Recognition: a polar histogram is created from the extracted symbol, and it is compared with the histograms created from the symbols stored in the database. At the end, the recognized landmark is shown on screen.

Fig. 4. The complete localization and recognition process

The landmark's symbol extraction step raises an issue that deserves discussion. If we followed [2], we would use the black colour plane to recognize the symbol inside the nearest landmark. However, this method has proven not to be very robust under varying light conditions. That is why we use the k-means algorithm instead.
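Two of the steps above lend themselves to a short sketch: the projections built from the longest run of consecutive border pixels, and the two-group k-means split on the V channel. The code below is an illustration under stated assumptions rather than the system's implementation: the function names, the initial centroids (minimum and maximum V value) and the fixed iteration cap are all choices made for the example.

```python
def max_run(values):
    """Length of the longest run of consecutive non-zero values."""
    best = current = 0
    for v in values:
        current = current + 1 if v else 0
        best = max(best, current)
    return best

def run_projections(mask):
    """Row and column projections using the longest run instead of the total sum."""
    rows = [max_run(row) for row in mask]
    cols = [max_run(col) for col in zip(*mask)]
    return rows, cols

def two_means_threshold(v_values, iterations=20):
    """1-D k-means with k = 2; returns the boundary between the low and high groups."""
    low, high = min(v_values), max(v_values)   # assumption: initial centroids
    for _ in range(iterations):
        boundary = (low + high) / 2.0          # nearest-centroid rule in one dimension
        low_group = [v for v in v_values if v <= boundary]
        high_group = [v for v in v_values if v > boundary]
        if not low_group or not high_group:
            break
        new_low = sum(low_group) / len(low_group)
        new_high = sum(high_group) / len(high_group)
        if (new_low, new_high) == (low, high):
            break                              # centroids stopped moving
        low, high = new_low, new_high
    return (low + high) / 2.0

# Column 3 holds two border pixels that are not adjacent, so its projection is 1, not 2.
mask = [
    [1, 1, 0, 1],
    [1, 1, 0, 0],
    [0, 0, 0, 1],
]
print(run_projections(mask))  # ([2, 2, 1], [2, 2, 0, 1])
```

With the threshold returned by `two_means_threshold`, the pixels whose V value is at or below it would form the dark symbol in the binary image. Because the boundary is recomputed from the pixels of each landmark, it adapts to the lighting of the scene, which is consistent with the robustness argument given above.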

5 Experimental results

Finally we show some experimental results. The images captured by the camera on the robot had a size of 320x240 pixels, and the images stored in the landmark database had a size of 96x96 pixels (so the nearest landmark extracted from the image was scaled to a size of 96x96). Not all the landmarks in the images captured by the robot were in a frontal position, so these results include the recognition of several landmarks slightly rotated out of the image plane. However, if the nearest landmark is rotated too much from the robot's point of view, we can consider that landmark not interesting, because the robot must only interpret landmarks in front of it.

The symbols stored in the database were obtained from SEÑALECTICA¹, a vectorial image repository of real roadsigns. A group of test sets, each containing from 89 to 380 images captured by the camera on the robot, was created to estimate the recognition error rate. The first one had 10 landmarks stored in the database, the second one 20 landmarks, the third one 30 landmarks, and so on. The last one had 100 landmarks. All the landmarks from the database appeared in at least 3 images of the corresponding test set. Each image contained from 1 to 3 landmarks, at different distances. The localization error rate was always between 1% and 3%. To calculate recognition error rates we ignore the images where the nearest landmark is not localized correctly.

First we studied the effect of changing the size of the polar images from which we calculate the polar histograms. Figure 5(a) shows the recognition error rate when we have 100 landmarks in the database, for different polar image resolutions (and, in consequence, different numbers of polar histogram elements). As we can see, with a low number of polar histogram elements there is not enough information to achieve an adequate recognition. From 100 elements onwards the error rate converges, so we use histograms with 100 elements in the rest of the experiments.

Figure 5(b) shows that it is better to use the k-means algorithm to extract the symbol inside the nearest landmark than to use the black colour plane, as in [2]. The recognition error rate is calculated for each of the test cases described before. Finally, Figure 5(c) shows the recognition error rate of our approach for each of the test cases, using different distance metrics for comparing polar histograms. As we can see, the combination of polar histograms for image characterization and Chi-Square distance for image recognition results in a low recognition error rate. Our approach is very fast and has a recognition error low enough to allow correct robot navigation guided by artificial landmarks.

Fig. 5. Experimental results

¹ http://iris.cnice.mecd.es/bancoimagenes/senales

6 Conclusions and future work

New results for our fast and robust method to recognize symbols inside artificial landmarks to help in robot navigation have been presented. The method is based on the comparison of polar histograms, using the Chi-Square distance. A high number of right guesses is achieved even when the number of symbols in the database is high. We are currently working on improving landmark localization, so that our approach can be used with images containing complex environments. Our final goal is to use this method with a real robot platform and see how it works.

References

[1] Todt, E., Torras, C., Detection of Natural Landmarks Through Multiscale Opponent Features, 15th International Conference on Pattern Recognition (ICPR00), Barcelona, Spain, 2000, Vol. 3, pp. 3988-3991.

[2] Hsien, J.C., Chen, S.Y., Road Sign Detection and Recognition Using Markov Model, 14th Workshop on Object-Oriented Technology and Applications (OOTA 2003), Taiwan, 2003, pp. 529-536.

[3] Piccioli, G., De Micheli, E., Parodi, P., Campani, M., A Robust method for road sign detection and recognition, Image and Vision Computing, Vol. 14, 1996, pp. 209-223.

[4] Zadeh, M.M., Kasvand, T., Suen, C.Y., Localization and Recognition of Traffic Signs for Automated Vehicle Control Systems, Conf. on Intelligent Transportation Systems, part of SPIE's Intelligent Systems and Automated Manufacturing, Pittsburgh, USA, 1997, pp. 272-282.

[5] Bernier, T., Landry, J.A., A New Method for Representing and Matching Shapes of Natural Objects, Pattern Recognition, Vol. 36(8), 2003, pp. 1711-1723.

[6] Fekete, S. P., Simplicity and Hardness of the Maximum Traveling Salesman Problem under Geometric Distances, Proc. Tenth ACM-SIAM Symposium on Discrete Algorithms (SODA 99), Maryland, USA, pp. 337-345.

[7] Cha, S. H., Srihari, S. N., Distance Between Histograms of Angular Measurements and its Application to Handwritten Character Similarity, 15th International Conference on Pattern Recognition (ICPR 2000), Barcelona, Spain, 2000, pp. 21-24.