
Delaunay triangulation for image object indexing: a novel method for shape representation

Yi Tao* and William I. Grosky*
Department of Computer Science, Wayne State University, Detroit, MI 48202

ABSTRACT Recent research on image databases has been aimed at the development of content-based retrieval techniques for the management of visual information. Compared with such visual information as color, texture, and spatial constraints, shape is so important a feature of the image objects of interest that shape alone may be sufficient to identify and classify an object completely and accurately. This paper presents a novel method based on feature point histogram indexing for object shape representation in image databases. In this scheme, the feature point histogram is obtained by discretizing the angles produced by the Delaunay triangulation of a set of unique feature points which characterize object shape in the context, and then counting the number of times each discrete angle occurs in the resulting triangulation. The proposed shape representation technique is translation, scale, and rotation independent. Our experiments show that the Euclidean distance performs very well as the similarity measure function in combination with the feature point histogram computed by counting the two largest angles of each individual Delaunay triangle. Through a further experiment, we also found evidence that an image object representation using a feature point histogram provides an effective cue for image object discrimination.

Keywords: shape representation, feature point, point feature map, feature point histogram, Delaunay triangulation

1. INTRODUCTION The main challenge of multimedia information systems is analysis, representation, management and retrieval of the contents of multimedia data such as audios, images, and videos. In the last few years, content-based image retrieval has seen a great deal of emphasis in the context of multimedia information systems5,15,24. Images contain not only low-level visual features such as color, texture, shape, and various spatial constraints, but also high-level meaningful semantics which conceptualizes the correspondence between image objects and real-world objects. As the semantics of images and image sequences is much richer than that of alphanumeric data, it is necessary to move from image-level into object-level interpretation. Compared with such visual information as color, texture, and spatial constraints, shape is so important a feature associated with those objects of interest that it becomes an essential part of the way we interpret and interact with the real world. Shape alone may be sufficient to identify and classify an object completely and accurately. Shape, along with the various spatial constraints of multiple objects, is important data in many applications, ranging from complex space exploration and satellite information management to medical research and entertainment. In general, object shape is determined by the context and the observer. It has been noted21, however, that shape information is highly resolution dependent, and requires elaborate processing to extract from an image. Shape cues include only a restricted set of view invariants such as corners and zeros of curvature. One of the major barriers nowadays to image databases being commonly used is shape retrieval10. Shape retrieval can be categorized into exact match searching and similarity-based searching. 
For either type of retrieval, the dynamic aspects of shape information require expensive computations and sophisticated methodologies in the areas of image processing and database systems. So far, similarity-based shape retrieval is the most popular type of search. Extraction and representation of object shape are relatively difficult tasks and have been approached in a variety of ways. In [14], shape representation techniques are broadly divided into two categories: boundary-based and region-based. To be specific, boundary-based methods concern the border or contour of the shape without considering its interior information; region-based methods concern both the border and interior of the shape. One drawback of this categorization, however, is that it puts shape attributes such as area, elongation, and compactness into both categories. We view shape representation techniques as falling into two distinct categories: measurement-based methods, ranging from simple, primitive measures such as area and circularity [15] to the more sophisticated measures of various moment invariants [14, 15]; and transformation-based methods, ranging from functional transformations such as Fourier descriptors [14] to structural transformations such as chain codes [11] and curvature scale space feature vectors [12]. An attempt to compare the various shape representation schemes is made in [14]. In this paper, we present a novel approach for object shape discrimination using feature point histogram representation and explore its potential uses through experiments. The rest of the paper is organized as follows. In the next section, we briefly review different shape representation techniques. In Section 3, some concepts from computational geometry related to Delaunay triangulation are introduced. Section 4 presents the methodology of computing the feature point histogram based on the point feature map which characterizes object shape. In Section 5, we describe various experiments to demonstrate that the Euclidean distance performs very well as the similarity measure function in combination with the feature point histogram computed by counting the two largest angles of each individual Delaunay triangle. A further prototype implementation of an image retrieval system containing 1099 fish shapes is described in Section 6. Finally, we give some concluding remarks.

* We would like to acknowledge the support of NSF Research Instrumentation Grant 97-29818.

2. RELATED WORK In the following, we give brief descriptions of current approaches to shape representation. As yet, no definitive comparisons of these methods have been made. In [9], Jagadish introduced the notion of a rectangular cover of a shape. Since it is restricted to rectilinear shapes in two dimensions, in which all of the shape angles are right angles, each shape in the database comprises an ordered set of rectangles. These rectangles are normalized, and then described by means of their relative positions and sizes. The proposed shape representation scheme supports any multi-dimensional point indexing method such as the grid file [16] and K-D-B trees [18]. This technique can be naturally extended to multiple dimensions. Besides the limitation mentioned previously, obtaining good shape descriptions of rectangular covers is not an easy task either. One of the first image retrieval projects is QBIC [15]. Provided with a visual query interface, a user can draw a sketch to find images with similar sketches in terms of color, texture, and shape. A union of heuristic shape features such as area, circularity, eccentricity, major axis orientation, and some algebraic moment invariants is computed for content-based image retrieval. Since similar moments do not guarantee similar shapes, the query results sometimes contain perceptually different matches. In [13], Mehrotra and Gary present a general and flexible shape similarity-based approach to enable the retrieval of both rigid and articulated shapes. In their scheme, each shape is coded as an ordered sequence of interest points, such as the maximum local curvature boundary points or vertices of the shape boundary’s polygonal approximation, with the indexed feature vectors representing the shape boundary. 
To answer a shape retrieval query, the query shape representation is extracted and the index structure is searched for the stored shapes that are possibly similar to the query shape, and the set of possible similar shapes is further examined to formulate the final solution to the query. In12, Mokhtarian, Abbasi and Kittler presented an approach for efficient and robust retrieval by shape content through Curvature Scale Space (CSS). They use the maxima of curvature zero_crossing contours of CSS image as a feature vector to represent the shapes of object boundary contours. The matching algorithm compares two sets of maxima and assigns a matching value as a measure of similarity. In addition, they use the aspect ratio of the CSS image, eccentricity and circularity to narrow down the range of searching. As far as the evaluation of the system performance is concerned, they reach the conclusion that because shape similarity is a subjective matter, the evaluation task is very difficult. The results of their subjective test indicated that human judgements of shape similarity noticeably differ. In11, assuming that each shape boundary is approximated by directed straight line segments, Lu introduced a unique chain coding method for shape representation by eliminating the inherent non-invariance of chain code. He also discusses the shape distance and similarity measures based on the derived shape indexes. One of the limitations is that mirror image factor is not taken into account. Additionally, if the flattest segment of boundaries does not happen to be along the major axis, this method may not work well.

In1, Imran and Grosky proposed to recursively decompose an image into a spatial arrangement of feature points while preserving the spatial relationships among its various components. In their scheme, quadtrees are used to manage the decomposition hierarchy and help in quantifying the measure of similarity. This scheme is incremental in nature and can be adopted to find a match at various levels of details, from coarse to fine. This technique can also be naturally extended to higher dimension space. One drawback of this approach could be that the set of feature points characterizing shape and spatial information in the image has to be normalized before being indexed.

3. DELAUNAY TRIANGULATION IN COMPUTATIONAL GEOMETRY Let P = { p1, p2, …, pn } be a set of points in the two-dimensional Euclidean plane, namely the sites. Partition the plane by labeling each point in the plane to its nearest site. All those points labeled as pi form the Voronoi region V(pi). V(pi) consists of all the points x’s at least as close to pi as to any other site: V(pi) = { x: | pi - x| ≤ | pj - x|, ∀j ≠ i }. Some points x’s do not have a unique nearest site. The set of all points that have more than one nearest site form the Voronoi diagram V(P) for the set of sites. Construct the dual graph G for a Voronoi Diagram V(P) as follows: the nodes of G are the sites of V(P), and two nodes are connected by an arc if their corresponding Voronoi polygons share a Voronoi edge. In 1934, Delaunay proved that when the dual graph is drawn with straight lines, it produces a planar triangulation of the Voronoi sites P, so called the Delaunay triangulation D(P). Each face of D(P) is a triangle, so called the Delaunay triangle.

Figure 1(a): A set of 26 points

Figure 1(b): Resulting Delaunay triangulation

Figure 1(c): Delaunay triangulation of the 1st variant

Figure 1(d): Delaunay triangulation of the 2nd variant

Delaunay triangulations and Voronoi diagrams are dual structures: both contain the same information in some sense, but represent it in rather different forms. To gain a grasp of these complex structures, it is necessary to have a thorough understanding of the relationships between the Delaunay triangulation and its corresponding Voronoi diagram. The proof of Delaunay’s theorems and properties is beyond the scope of this paper, but can be found in [17]. Among the various algorithms for constructing the Delaunay triangulation of a set of N points, we note that there are O(N log N) algorithms [3, 4] for solving this problem. Remarkably, the Delaunay triangulation can be computed with fewer than 30 lines of C code, as shown in [17], but at a costly time complexity of O(N^4). From the definition of the Delaunay triangulation, it is easily shown that the angles of the resulting Delaunay triangles of a set of sites (points) remain the same under uniform translations, scalings, and rotations of the point set. An example is illustrated in Figure 1: Figure 1(b) shows the resulting Delaunay triangulation for the set of 26 points shown in Figure 1(a); Figure 1(c) shows the resulting Delaunay triangulation of the transformed (translated, rotated, and scaled-up) set of 26 points in Figure 1(a); Figure 1(d) shows the resulting Delaunay triangulation of the transformed (translated, rotated, and scaled-down) set of 26 points in Figure 1(a).
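The invariance property can be checked numerically with a minimal sketch. The code below is not the C routine from [17]; it is an assumed brute-force variant of the same O(N^4) idea (keep every triple of sites whose circumcircle contains no other site), and it assumes the sites are in general position (no three collinear, no four cocircular). The function names and the five sample sites are hypothetical.

```python
import math
from itertools import combinations


def circumcircle(a, b, c):
    """Return (center_x, center_y, radius_squared), or None if a, b, c are collinear."""
    ax, ay = a
    bx, by = b
    cx, cy = c
    d = 2.0 * (ax * (by - cy) + bx * (cy - ay) + cx * (ay - by))
    if abs(d) < 1e-12:
        return None
    a2, b2, c2 = ax * ax + ay * ay, bx * bx + by * by, cx * cx + cy * cy
    ux = (a2 * (by - cy) + b2 * (cy - ay) + c2 * (ay - by)) / d
    uy = (a2 * (cx - bx) + b2 * (ax - cx) + c2 * (bx - ax)) / d
    return ux, uy, (ax - ux) ** 2 + (ay - uy) ** 2


def delaunay_bruteforce(points, eps=1e-9):
    """O(N^4) sketch: a triple is a Delaunay triangle if no other site lies
    strictly inside its circumcircle (general position assumed)."""
    triangles = []
    for i, j, k in combinations(range(len(points)), 3):
        circle = circumcircle(points[i], points[j], points[k])
        if circle is None:
            continue
        ux, uy, r2 = circle
        empty = all(
            (px - ux) ** 2 + (py - uy) ** 2 >= r2 * (1.0 - eps)
            for m, (px, py) in enumerate(points)
            if m not in (i, j, k)
        )
        if empty:
            triangles.append((i, j, k))
    return triangles


def triangle_angles(a, b, c):
    """Interior angles of triangle abc in degrees, sorted ascending."""
    def dist(p, q):
        return math.hypot(p[0] - q[0], p[1] - q[1])

    def angle(opposite, s1, s2):
        cos_v = (s1 * s1 + s2 * s2 - opposite * opposite) / (2.0 * s1 * s2)
        return math.degrees(math.acos(max(-1.0, min(1.0, cos_v))))

    la, lb, lc = dist(b, c), dist(a, c), dist(a, b)
    return sorted((angle(la, lb, lc), angle(lb, la, lc), angle(lc, la, lb)))


def similarity_transform(points, scale, theta_deg, tx, ty):
    """Uniformly scale, rotate, and translate a point set."""
    t = math.radians(theta_deg)
    c, s = math.cos(t), math.sin(t)
    return [(scale * (c * x - s * y) + tx, scale * (s * x + c * y) + ty)
            for x, y in points]


# Hypothetical sample sites and a transformed copy of them.
sites = [(0.0, 0.0), (4.0, 0.0), (5.0, 3.0), (1.0, 4.0), (2.0, 1.5)]
moved = similarity_transform(sites, scale=1.3, theta_deg=40.0, tx=7.0, ty=-2.0)
tris = delaunay_bruteforce(sites)
tris_moved = delaunay_bruteforce(moved)
```

Both point sets yield the same index triples, and the angles of corresponding triangles agree up to floating-point error. For realistic point counts an O(N log N) algorithm would be preferable.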

4. OBJECT SHAPE REPRESENTATION USING FEATURE POINT HISTOGRAM An image object is either an entire image or some other meaningful portion of an image, which could be a union of one or more disjoint regions. Typically, an image object would be a semcon (iconic data with semantics) [6]. For example, consider an image of a seashore scene consisting of some seagulls on the coast, with the sky overhead and the sea in front. Examples of image objects for this image would include the entire scene (with textual descriptor Live on the Seashore), the seagull region(s), the sand region(s), the water region(s), the sky region(s), and the bird regions (the union of all the seagull regions). Now, each image object in an image database contains a set of unique and characterizing features F = {f1, …, fk}. We believe that the nature as well as the spatial relationships of these various features can be used to characterize the corresponding image objects [1, 7, 20]. The features which characterize the shape of an image object can be classified into the following two categories:

• Global features are general in nature and depend on the characteristics of the entire image object. Area, perimeter, and major axis direction of the corresponding image region are examples of such features.

• Local features are based on the low-level characteristics of image objects. The determination of local features usually requires more involved computation. Curvatures, boundary segments, and corner points around the boundary of the corresponding image region are examples of such features.

In 2-D space, many of the features can be represented as a set of points. These points can be tagged with labels to capture any necessary semantics. Each of the individual points representing some feature of an image object we call a feature point. The entire image object is represented by a set of labeled feature points {p1, …, pk}. For example, a corner point of an image region has a precise location and can be labeled with the descriptor Corner Point, some numerical information concerning the nature of the corner in question, as well as the region's identifier. A color histogram of an image region can be represented by a point placed at the center-of-mass of the given region and labeled with the descriptor Color Histogram, the histogram itself, as well as the region's identifier. Effective semantic representation and retrieval requires labeling such feature points of each database image object. The introduction of feature points and associated labels effectively converts an image object into an equivalent symbolic representation, called its point feature map. We have devised a quadtree-based indexing mechanism to retrieve all those images from a given image database which contain image objects whose point feature map is similar to the point feature map of a particular query image object1. We note that the various spatial relationships among these feature points are the important aspect of our work. An important criterion for shape indexing schemes is that the shape representation should be translation, scale, and rotation invariant14. In addition, shape representation should possess good discriminating capabilities and must be robust. Color histogram indexing has been proven to be very useful for content-based retrieval in image databases, and is widely recognized as an image retrieval method with sufficient distinguishing capabilites8,15,21,24. 
Given a discrete color space such as RGB or HSV, a color histogram is obtained by discretizing the image colors and counting the number of times each particular color occurs in the image. Our proposed approach to indexing the spatial arrangements of shape features of an image object is also histogram-based. The methodology of feature point histogram representation for image object shape is quite simple. Within a given image, we first identify particular image objects to be indexed. For each image object, we construct a corresponding point feature map. In this study, we assume that each feature is represented by a single feature point and that each point feature map consists of a set of distinct feature points having the same label descriptor, such as Corner Point. We then construct a Delaunay triangulation of these feature points. The feature point histogram is obtained by discretizing the angles produced by this triangulation and counting the number of times each discrete angle occurs in the image object of interest, given a selection criterion for which angles contribute to the final feature point histogram. For example, the feature point histogram can be built by counting the two largest angles, the two smallest angles, or all three angles of each individual Delaunay triangle. Given the Delaunay triangulation of a set of N points, the feature point histogram can be computed in O(max(N, #bins)) time, since a planar triangulation of N points has O(N) triangles. Our idea of using a feature point histogram to represent the shape of an image object originates from the fact that if two image objects are similar in shape, then both of them should have the same set of feature points. Thus, each pair of corresponding Delaunay triangles in the two resulting Delaunay triangulations must be similar to each other, independent of the image object’s position, scale, and rotation. In this study, corner points, which are generally high-curvature points located along the crossings of an image object’s edges or boundaries, serve as the feature points for our various experiments. We have previously argued for representing an image by the collection of its corner points in [1], which proposed a technique for indexing such collections provided that the image object has been normalized. In our present approach, which is histogram-based, the image object does not have to be normalized. This technique also supports an incremental approach to image object matching, from coarse to fine, by varying the bin sizes. 
Figure 2(a) shows the Delaunay triangulation produced from the point feature map characterizing the shape of an image object, a leaf, in which corner points serve as the feature points, while Figure 2(b) shows the resulting feature point histogram built by counting all three angles of each individual Delaunay triangle with a bin size of 10°.
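The histogram construction described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: triangles are passed in as triples of (x, y) vertices, the function and parameter names are hypothetical, and an angle that falls exactly on a bin boundary is assigned to the upper bin (the paper does not specify this).

```python
import math


def triangle_angles(a, b, c):
    """Interior angles of triangle abc in degrees, sorted ascending."""
    def dist(p, q):
        return math.hypot(p[0] - q[0], p[1] - q[1])

    def angle(opposite, s1, s2):
        cos_v = (s1 * s1 + s2 * s2 - opposite * opposite) / (2.0 * s1 * s2)
        return math.degrees(math.acos(max(-1.0, min(1.0, cos_v))))

    la, lb, lc = dist(b, c), dist(a, c), dist(a, b)
    return sorted((angle(la, lb, lc), angle(lb, la, lc), angle(lc, la, lb)))


def feature_point_histogram(triangles, bin_size=10, select="two_largest"):
    """Count discretized triangle angles.

    triangles: iterable of (a, b, c) vertex triples, e.g. from a Delaunay
               triangulation of the corner points.
    select:    "two_largest", "two_smallest", or "all" (assumed names for
               the three angle selection criteria in the text).
    """
    n_bins = math.ceil(180 / bin_size)
    hist = [0] * n_bins
    for a, b, c in triangles:
        angles = triangle_angles(a, b, c)  # ascending order
        if select == "two_largest":
            chosen = angles[1:]
        elif select == "two_smallest":
            chosen = angles[:2]
        elif select == "all":
            chosen = angles
        else:
            raise ValueError("unknown selection criterion: " + select)
        for ang in chosen:
            # Clamp so that an angle of exactly 180 degrees cannot overflow.
            hist[min(int(ang // bin_size), n_bins - 1)] += 1
    return hist


# One hypothetical triangle with angles of roughly 45.0, 63.4, and 71.6 degrees.
example = [((0.0, 0.0), (4.0, 0.0), (1.0, 3.0))]
hist_all = feature_point_histogram(example, bin_size=10, select="all")
hist_two_largest = feature_point_histogram(example, bin_size=10, select="two_largest")
```

The loop touches each triangle once and the histogram has one slot per bin, which matches the O(max(N, #bins)) cost stated in the text.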

Figure 2(a): Delaunay triangulation of a leaf shape

Figure 2(b): Resulting feature point histogram (x-axis: bin number; y-axis: number of angles)

Figure 3(a): The five original fish images (numbered 1 to 5)

5. EXPERIMENTS ON SHAPE SIMILARITY OF IMAGE OBJECTS In this section, we describe various experiments to demonstrate our notion of shape similarity between image objects. Each image used in the following experiments contains a single image object. The shape database consists of 150 images: the five original fish images shown in Figure 3(a), the ten original leaf images shown in Figure 3(b), and, for each original image, nine variants: three rotated variants, three rotated and scaled-up variants, and three rotated and scaled-down variants. We note that each scaled variant is restricted to ±30% of its original size.

Figure 3(b): The ten original leaf images (numbered 1 to 10)

For each image, the contained image object is flood-filled with black, while the background is flood-filled with white. We then use SUSAN (Smallest Univalue Segment Assimilating Nucleus) [19] for corner point detection. It is pointed out in [2, 19] that SUSAN provides better results than traditional corner detection algorithms under varying levels of image brightness, and is computationally efficient as well. By conducting the initial experiments with image objects and their orthogonally transformed variants, we circumvent the human subjectivity of shape similarity by simply assuming that each shape is relevant only to itself and to its nine variants. This also enables us to employ the standard recall-precision curves [23] for the quantitative evaluation of retrieval effectiveness.

5.1 Experiment One The first experiment consists of 100 images, in which five original fish images and five original leaf images (F1, F2, F3, L4, L5, L6, F4, L8, L9, F5) are chosen first, and then their corresponding variants are included. In this experiment, each image is indexed by a feature point histogram built by counting each individual angle in the resulting Delaunay triangulation of its feature point set (map) with a bin size of 10°. In what follows, $n$ is the total number of bins, and $i$ and $j$ are subscripts of feature point histogram bins. $Q$ is the feature point histogram of the query image object, and $D$ is the feature point histogram of the database image object. $q_i$ is the $i$th bin of the query object histogram, and $d_j$ is the $j$th bin of the database object histogram. $a_{ij}$ is the angle difference between $q_i$ and $d_j$, and $a_{max}$ is the maximum angle difference between $q_i$ and $d_j$ (i.e., 180°). $w_i$ is $q_i$ if $q_i > 0$ and $d_i > 0$, and is 1 if $q_i = 0$ or $d_i = 0$. $Dis(Q, D)$ is the similarity measure between two feature point histograms. The four similarity measure functions used in this experiment are formalized as follows:

• The Euclidean distance, also known as the standard N-dimensional L2 metric [23], is widely used as a comparison function.

$$Dis(Q, D) = \left( \sum_{i=1}^{n} (q_i - d_i)^2 \right)^{1/2}$$

• The weighted cross distance function was first used for color histogram comparisons in the QBIC project [8, 15]. This metric takes the perceptual similarity between the different bins of color histograms into account.

$$Dis(Q, D) = \sum_{i=1}^{n} \sum_{j=1}^{n} \left( 1 - \frac{a_{ij}}{a_{max}} \right) (q_i - d_j)^2$$

• Another weighted distance measure is derived from the standard N-dimensional L1 metric, i.e., a city-block distance, by taking into account the relative proportion of each bin. It is used for color histogram comparisons in [24].

$$Dis(Q, D) = \sum_{i=1}^{n} w_i \left( (q_i - d_i)^2 \right)^{1/2}$$

• The histogram intersection metric was originally proposed in [21] to search efficiently for color-based matching candidates in a large image database.

$$Dis(Q, D) = \frac{\sum_{i=1}^{n} \min(q_i, d_i)}{\sum_{j=1}^{n} q_j}$$

In terms of algorithmic efficiency, the weighted cross distance function is the most expensive, with a cost of $O(n^2)$, while the other three functions differ in constant cost but are all of order $O(n)$. We note that the two weighted distance functions require the histogram bins to be normalized.
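The four measures can be sketched in a few lines of Python. This is an illustrative reading of the formulas, with hypothetical function names, and it makes one assumption the text does not state: the angle difference a_ij between bins i and j is taken to be bin_size · |i − j|. Note that the first three are distances (smaller means more similar), whereas histogram intersection is a similarity (larger means more similar).

```python
import math


def euclidean(q, d):
    """Standard L2 distance between two histograms of equal length."""
    return math.sqrt(sum((qi - di) ** 2 for qi, di in zip(q, d)))


def weighted_cross(q, d, bin_size, a_max=180.0):
    """Weighted cross distance, O(n^2).
    Assumption: a_ij = bin_size * |i - j| (difference between bin positions)."""
    total = 0.0
    for i, qi in enumerate(q):
        for j, dj in enumerate(d):
            a_ij = bin_size * abs(i - j)
            total += (1.0 - a_ij / a_max) * (qi - dj) ** 2
    return total


def weighted_city_block(q, d):
    """Weighted L1 distance with w_i = q_i if both bins are nonzero, else 1."""
    total = 0.0
    for qi, di in zip(q, d):
        w = qi if (qi > 0 and di > 0) else 1.0
        total += w * abs(qi - di)  # ((q_i - d_i)^2)^(1/2) is |q_i - d_i|
    return total


def histogram_intersection(q, d):
    """Fraction of the query histogram mass shared with the database histogram."""
    return sum(min(qi, di) for qi, di in zip(q, d)) / sum(q)


# Tiny hypothetical 3-bin histograms (bin size 60 degrees) for illustration.
q_hist = [2, 0, 1]
d_hist = [1, 1, 1]
scores = {
    "euclidean": euclidean(q_hist, d_hist),
    "weighted_cross": weighted_cross(q_hist, d_hist, bin_size=60),
    "weighted_city_block": weighted_city_block(q_hist, d_hist),
    "intersection": histogram_intersection(q_hist, d_hist),
}
```

When ranking database images against a query, results would be sorted in ascending order for the three distances and in descending order for the intersection score.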

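| Query \ Relevant image # | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
|---|---|---|---|---|---|---|---|---|---|---|
| Q1 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
| Q2 | 1 | 2 | 3 | 4 | 5 | 6 | 8 | 20 | 31 | 35 |
| Q3 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
| Q4 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 30 |
| Q5 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 20 |
| Q6 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 12 | 13 | 22 |
| Q7 | 1 | 2 | 3 | 4 | 5 | 6 | 8 | 9 | 11 | 25 |
| Q8 | 1 | 2 | 3 | 4 | 5 | 7 | 8 | 9 | 10 | 38 |
| Q9 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 12 |
| Q10 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 13 | 48 |

Table 1: Actual positions of relevant images for each of ten queries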

Assuming that each image is relevant only to itself and to its nine variants, we use each of the original ten images as a query image over the resulting database of 100 images, and then rank each match using four different metrics so as to find the proper similarity measure function. For each of the ten queries, Table 1 shows the actual positions in the 100 retrieved database images of the ten relevant images with the Euclidean distance as the similarity function, where relevant image i is the ith relevant image retrieved. From Table 1, we may calculate recall-precision curves. An example curve for Query 2 is shown in Figure 4. Based on the average positions of relevant images for each of ten queries, Figure 5 shows the

corresponding recall-precision curves for the four similarity measure functions. Obviously, the Euclidean distance performs very well as the similarity measure function.

Figure 4: Recall-precision curve for query 2 (x-axis: Recall (%); y-axis: Precision (%))

Figure 5: Recall-precision curves for four similarity measure functions: Euclidean distance, weighted distance measure one, weighted distance measure two, and histogram intersection (x-axis: Recall (%); y-axis: Precision (%))
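As a worked example of how a recall-precision curve follows from ranked positions, the sketch below uses the Query 2 row of Table 1. It assumes the usual definitions (recall is the fraction of relevant images retrieved so far; precision is the fraction of retrieved images that are relevant); the function name is hypothetical.

```python
def recall_precision(positions):
    """Return (recall, precision) pairs, one per relevant image.

    positions: 1-based ranks at which the relevant images were retrieved.
    After the k-th relevant image, found at rank p, recall is k / total
    and precision is k / p.
    """
    ranked = sorted(positions)
    total = len(ranked)
    return [(k / total, k / p) for k, p in enumerate(ranked, start=1)]


# Positions of the ten relevant images for Query 2 (Table 1).
query2_positions = [1, 2, 3, 4, 5, 6, 8, 20, 31, 35]
query2_curve = recall_precision(query2_positions)
```

For Query 2, precision stays at 100% up to 60% recall, drops to 87.5% at 70% recall (seventh relevant image at rank 8), and to 40% at 80% recall (eighth relevant image at rank 20).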

5.2 Experiment Two The second experiment consists of the same 100 images as in experiment one. As in a color histogram, an angle resolution of 180 degrees is very large and unnecessary for object shape discrimination. In this experiment, we use various bin sizes with the Euclidean distance metric. For each bin size, Table 2 shows the positions in the ranked list of 100 database images of the ten relevant images, averaged over the ten queries, where relevant image i is the ith relevant image retrieved, assuming that each image is relevant only to itself and to its nine variants. We conclude that, in some sense, the coarser the bin size of the feature point histogram is, the worse the overall effectiveness of shape matching becomes.

| Average position of relevant image # | 1st | 2nd | 3rd | 4th | 5th | 6th | 7th | 8th | 9th | 10th |
|---|---|---|---|---|---|---|---|---|---|---|
| Bin size of 5 degrees | 1 | 2 | 3 | 4 | 5 | 6 | 8 | 9 | 12 | 25 |
| Bin size of 10 degrees | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 10 | 12 | 25 |
| Bin size of 15 degrees | 1 | 2 | 3 | 4 | 5 | 6 | 8 | 11 | 13 | 27 |
| Bin size of 20 degrees | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 11 | 13 | 29 |
| Bin size of 30 degrees | 1 | 2 | 3 | 4 | 5 | 6 | 9 | 13 | 16 | 32 |

Table 2: Average positions of relevant images for each of ten queries using various bin sizes for the Euclidean distance

5.3 Experiment Three The third experiment consists of 100 images, in which the 10 original leaf images and their corresponding variants are included. In this experiment, each leaf shape is indexed by three feature point histograms: the two-largest-angle histogram, the two-smallest-angle histogram, and the three-angle histogram, built respectively by counting the two largest angles, the two smallest angles, and all three angles of each individual triangle in the resulting Delaunay triangulation of its feature point set (point feature map) with a bin size of 10°. Assuming that each image is relevant only to itself and to its nine variants, we use each of the original ten leaves as a query image over the resulting database of 100 leaf images, and rank each match using the Euclidean distance metric.

| Average position of relevant image # | 1st | 2nd | 3rd | 4th | 5th | 6th | 7th | 8th | 9th | 10th |
|---|---|---|---|---|---|---|---|---|---|---|
| 2-largest-angle histogram | 1 | 2 | 3 | 4 | 6 | 7 | 9 | 10 | 13 | 21 |
| 2-smallest-angle histogram | 1 | 2 | 4 | 5 | 7 | 9 | 12 | 15 | 20 | 37 |
| 3-angle histogram | 1 | 2 | 3 | 5 | 6 | 8 | 10 | 13 | 17 | 34 |

Table 3: Average positions of relevant images for each of ten queries using three different feature point histograms respectively

For each of the ten queries, Table 3 shows the average positions, in the 100 retrieved shape database, of the ten relevant shapes using two-largest-angle histograms, two-smallest-angle histograms, and three-angle histograms respectively for shape matching, where relevant image i is the ith relevant shape retrieved. Due to limited space, the actual position of each relevant image for each individual query is not listed. Recall-precision curves corresponding to Table 3 are shown in Figure 6.

Figure 6: Recall-precision curves for Table 3: 2-largest-angle histogram indexing, 2-smallest-angle histogram indexing, and 3-angle histogram indexing (x-axis: Recall (%); y-axis: Precision (%))

Overall retrieval effectiveness [23] is measured by using either 3-point average precision at recall values of 20%, 50%, and 80% or 11-point average precision at recall values of 0%, 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, and 100%. See Tables 4, 5, and 6 for a list of these values.

| Retrieval effectiveness (%) | Q1 | Q2 | Q3 | Q4 | Q5 | Q6 | Q7 | Q8 | Q9 | Q10 |
|---|---|---|---|---|---|---|---|---|---|---|
| 3 point | 86 | 78 | 78 | 96 | 96 | 100 | 84 | 100 | 100 | 91 |
| 11 point | 86 | 77 | 76 | 90 | 94 | 94 | 82 | 96 | 94 | 89 |

Table 4: Retrieval effectiveness using 2-largest-angle histogram indexing

| Retrieval effectiveness (%) | Q1 | Q2 | Q3 | Q4 | Q5 | Q6 | Q7 | Q8 | Q9 | Q10 |
|---|---|---|---|---|---|---|---|---|---|---|
| 3 point | 71 | 70 | 60 | 90 | 78 | 91 | 71 | 82 | 91 | 78 |
| 11 point | 72 | 67 | 62 | 88 | 72 | 83 | 71 | 82 | 84 | 76 |

Table 5: Retrieval effectiveness using 2-smallest-angle histogram indexing

| Retrieval effectiveness (%) | Q1 | Q2 | Q3 | Q4 | Q5 | Q6 | Q7 | Q8 | Q9 | Q10 |
|---|---|---|---|---|---|---|---|---|---|---|
| 3 point | 80 | 74 | 73 | 90 | 96 | 96 | 82 | 92 | 93 | 77 |
| 11 point | 83 | 68 | 69 | 88 | 89 | 88 | 75 | 88 | 88 | 76 |

Table 6: Retrieval effectiveness using 3-angle histogram indexing
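The 3-point and 11-point averages can be sketched as below. Since the per-query positions for Experiment Three are not listed, the example reuses the Query 2 positions from Table 1 of Experiment One, so its output is not expected to match any entry of Tables 4 to 6. The interpolation rule (precision at a recall level is the maximum precision at any recall at or above that level) is an assumption; the paper defers to [23] for the exact definition. Function names are hypothetical.

```python
def interpolated_precision(positions, level):
    """Assumed rule: best precision achieved at any recall >= level."""
    ranked = sorted(positions)
    total = len(ranked)
    candidates = [
        k / p
        for k, p in enumerate(ranked, start=1)
        if k / total >= level - 1e-9  # small tolerance for float comparison
    ]
    return max(candidates) if candidates else 0.0


def average_precision_at(positions, levels):
    """Mean interpolated precision over the given recall levels."""
    return sum(interpolated_precision(positions, r) for r in levels) / len(levels)


THREE_POINT_LEVELS = (0.2, 0.5, 0.8)
ELEVEN_POINT_LEVELS = tuple(i / 10 for i in range(11))

# Query 2 of Experiment One (Table 1), used purely as an illustration.
query2_positions = [1, 2, 3, 4, 5, 6, 8, 20, 31, 35]
three_point = average_precision_at(query2_positions, THREE_POINT_LEVELS)
eleven_point = average_precision_at(query2_positions, ELEVEN_POINT_LEVELS)
```

Under these assumptions, both averages for Query 2 come out at roughly 80%.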

By comparing the experimental results, both the standard recall-precision curves and the overall retrieval effectiveness of each individual query, for feature point histograms built under three different angle selection criteria, we conclude that two-largest-angle histogram indexing shows the best ability to discriminate object shape in combination with the Euclidean distance as the similarity measure function. Three-angle histogram indexing is not as good as two-largest-angle histogram indexing, but better than two-smallest-angle histogram indexing. We discuss below how the angle selection criterion influences the ability to discriminate object shape. Triangle similarity can be established in several ways, but all of them involve comparing the angles and sides of the two triangles. If all of the angles of a triangle are equal to the corresponding angles in the other triangle, then the two triangles are similar to each other. In fact, any two of the three angles are sufficient to characterize the shape of a triangle. Therefore, we expect that the feature point histogram built by counting only two angles of each individual triangle within the Delaunay triangulation provides a sufficient and effective way to discriminate image object shape. However, because the sum of the angles in a triangle is 180°, the resulting two-smallest-angle histogram has all zero values in those histogram bins covering angles of 90° or greater. For example, if the histogram bin is of size 30°, each of the bins [90°, 120°), [120°, 150°), [150°, 180°) contains no angles. Thus the number of bins actually used by the two-smallest-angle histogram is only half that of the two-largest-angle histogram and the three-angle histogram. 
Since the histogram-based representation is not unique, a histogram with a small number of bins provides less discriminating ability than one with a larger number of bins, as the similarity of two object shapes depends on the Euclidean distance between their histograms. In this case, the two-smallest-angle histogram is least effective for object shape discrimination, because the similarity measure depends only on the nonzero values of histogram bins ranging from 0° to 90°. Therefore, using the two smallest angles out of three in each Delaunay triangle to construct the feature point histogram is less resistant to false hits than the other two choices. On the other hand, both the two-largest-angle histogram and the three-angle histogram representations in general contain a positive number of angles in each bin. As for the experimental result that two-largest-angle histogram indexing produces more discriminating power than three-angle histogram indexing, we argue that contributing the two largest angles out of three in each triangle of the Delaunay triangulation to the final feature point histogram is sufficient to characterize the shape of an image object. The feature point histogram built by counting all three angles of each triangle in a Delaunay triangulation does not provide the best shape discriminating power, because the Euclidean distance then accounts for the angle difference between each pair of corresponding Delaunay triangles more than is necessary, leading to more false hits compared with the two-largest-angle histogram representation. In addition, we note that the local movement of feature points and even the presence of outliers affect the Delaunay triangulation only locally. Thus the computed feature point histogram is not appreciably changed, depending on the bin size. 
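The claim that the two smallest angles never reach 90° follows directly from the angle sum. A short derivation, using symbols α, β, γ that are introduced here only for illustration:

```latex
\begin{aligned}
&\text{Let } 0 < \alpha \le \beta \le \gamma \text{ with } \alpha + \beta + \gamma = 180^\circ. \\
&2\beta \le \beta + \gamma = 180^\circ - \alpha < 180^\circ
  \;\Longrightarrow\; \alpha \le \beta < 90^\circ, \\
&3\gamma \ge \alpha + \beta + \gamma = 180^\circ
  \;\Longrightarrow\; \gamma \ge 60^\circ .
\end{aligned}
```

So the two smallest angles α and β always fall in bins below 90°, while the two largest angles β and γ together can range over (0°, 180°), which is why the two-largest-angle histogram can populate bins across the whole angular range.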
Since the histogram-based representation is lossy and not unique, image objects of different shapes may have the same feature point histogram representation, just as in color histogram representation and indexing. To compensate for this disadvantage, sub-image objects can be searched for using this approach in combination with our previous quadtree-based technique1. For color histograms, the histogram of a sub-image of a given image object is a sub-histogram of the histogram of the original image object. This is not technically the case for feature point histograms, but we argue that good sub-image matches can usually be obtained by treating the property as if it held.

6. THE FURTHER EXPERIMENT Shape similarity faces the challenge of human subjectivity, and evaluating the effectiveness of a shape matching method is a relatively difficult task. So far, we have simply assumed that each shape is relevant only to itself and to its nine variants. However, one may argue that the 5th leaf and the 9th leaf are similar in shape, yet they are considered irrelevant when we compute the standard recall-precision curves in the previous experiments. Nevertheless, we stress that those experiments have shown the efficacy of the proposed shape representation scheme, though a much larger database is required for real testing. For this purpose, we implemented a Web-based shape retrieval system with a shape database consisting of 1099 images from the SQUID project12. Each database image contains only one fish shape. In this experiment, we randomly selected 20 images as queries, and then asked a number of volunteers to find the images in the database that are similar to each individual query shape. Based on the results of this subjective test, we reached the same conclusion as in12, namely that human judgements of shape similarity differ appreciably. However, the ranking generated by the experiment always agreed closely with a subset of the human judges. We can argue that the similar shapes chosen by the experiment are almost always part of the lists of the top shapes generated by the different human judges. The results of two sample queries are shown in Figure 7 and Figure 8. As it is difficult to include all the examples within the limited space, interested readers can test the system via the Internet URL http://www.cs.wayne.edu/~yit/fish/db.html with login name cgiuser and password web*wings if requested.

7. CONCLUSION In this paper, we have presented a novel method for shape representation. Through our experiments, we have demonstrated that the feature point histogram representation of image objects provides an effective way to discriminate object shape. In conjunction with the Department of Neurological Surgery at Wayne State University, we are initiating a project that uses this shape representation technique in the design and implementation of a system for neurological surgery training. The proposed system will work in a Web-based environment, allowing neurosurgeons to query and browse various patient-related medical records in an effective and efficient way. Specifically, the implemented system will support not only traditional text-based queries, but will also allow neurosurgeons to query by the visual contents of medical images without forcing them to know the exact values of image features. For example, a more involved query is to find prior patient records in which the segmented lesions at the same location in the MRI brain images have a shape similar to that of the lesion under treatment. We have found that this image representation scheme depends crucially on the quality of the technique used to find corner points. This is the weak link in our approach, but even so, we have shown that our shape representation scheme works well in certain environments. Thus, a potential research topic is to develop better image processing algorithms so that more precise and stable image feature points can be extracted under various shape scale spaces. As a histogram can easily be represented as a multidimensional point, standard nearest-neighbor approaches to indexing can also be used. We have not examined index creation in the context of this paper. However, we are working on various nearest-neighbor approaches for directly accessing relevant images.
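To make the remark about histograms as multidimensional points concrete, the following sketch performs a k-nearest-neighbor query by linear scan. It is only a baseline under stated assumptions, not an index structure and not the system described in this paper; the function name, the shape names, and the histogram values are hypothetical. A multidimensional index such as a grid file16 or a K-D-B tree18 would replace the scan in practice.

```python
import heapq
import math

def nearest_shapes(query_hist, database, k=3):
    """Return the k (distance, name) pairs closest to query_hist, nearest first.

    database maps a shape name to its feature point histogram; every histogram
    is treated as a point in a space with one dimension per bin.
    """
    scored = ((math.dist(query_hist, hist), name) for name, hist in database.items())
    return heapq.nsmallest(k, scored)

# Hypothetical six-bin histograms for three database shapes.
database = {
    "fish_a": [4, 10, 6, 8, 2, 0],
    "fish_b": [5, 9, 7, 7, 2, 0],
    "fish_c": [1, 3, 12, 9, 4, 1],
}
query = [4, 10, 6, 7, 2, 0]

for distance, name in nearest_shapes(query, database, k=2):
    print(f"{name}: {distance:.3f}")
```

The linear scan costs one distance computation per database shape, which is why an index becomes attractive as the database grows.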

8. ACKNOWLEDGEMENT We would like to give special thanks to Sadegh Abbasi, Farzin Mokhtarian, and Josef Kittler, at the Center for Vision, Speech, and Signal Processing, University of Surrey, for allowing their fish shape database to be used in our research activities. Fish images were downloaded from http://www.ee.surrey.ac.uk/Research/VSSP/imagedb/demo.html, and leaf images were downloaded from http://www.prip.tuwien.ac.at/prip/image.html.

9. REFERENCES
1. I. Ahmad and W.I. Grosky, “Spatial Similarity-based Retrievals and Image Indexing By Hierarchical Decomposition”, Proceedings of the International Database Engineering and Application Symposium (IDEAS’97), pp. 269-278, Montreal, Canada, August, 1997
2. I. Ahmad, “A Hierarchical Decomposition Approach for Image Indexing”, PhD Dissertation, Department of Computer Science, Wayne State University, Detroit, MI 48202, 1997
3. R.A. Dwyer, “A Faster Divide-and-Conquer Algorithm for Constructing Delaunay Triangulations”, Algorithmica, Vol. 2, No. 2, pp. 127-151, 1987
4. S. Fortune, “A Sweepline Algorithm for Voronoi Diagrams”, Algorithmica, Vol. 2, No. 2, pp. 153-174, 1987
5. W.I. Grosky, “Managing Multimedia Information in Database Systems”, Communications of the ACM, Vol. 40, No. 12, pp. 73-80, 1997
6. W.I. Grosky, F. Fotouhi, and Z. Jiang, “Using Metadata for the Intelligent Browsing of Structured Media Objects”, Managing Multimedia Data: Using Metadata to Integrate and Apply Digital Data, A. Sheth and W. Klas (Eds.), pp. 67-92, McGraw Hill Publishing Company, New York, 1998
7. W. Hsu, T.S. Chua, and H.K. Pung, “An Integrated Color-Spatial Approach to Content-based Image Retrieval”, Proceedings of ACM Multimedia, pp. 305-313, San Francisco, November, 1995
8. J. Hafner, H.S. Sawhney, W. Equitz, M. Flickner, and W. Niblack, “Efficient Color Histogram Indexing for Quadratic Form Distance Functions”, IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 17, No. 7, pp. 729-736, 1995
9. H.V. Jagadish, “A Retrieval Technique for Similar Shapes”, Proceedings of the ACM SIGMOD Conference, pp. 208-217, Denver, Colorado, June, 1991
10. M. Kliot and E. Rivlin, “Invariant-Based Shape Retrieval in Pictorial Databases”, Computer Vision and Image Understanding, Vol. 71, No. 2, pp. 182-197, 1998
11. G. Lu, “An Approach to Image Retrieval Based on Shape”, Journal of Information Science, Vol. 23, No. 2, pp. 119-127, 1997
12. F. Mokhtarian, S. Abbasi, and J. Kittler, “Efficient and Robust Retrieval by Shape Content through Curvature Scale Space”, Proceedings of International Workshop on Image Database and Multimedia Search, pp. 35-42, Amsterdam, the Netherlands, 1996
13. R. Mehrotra and J.E. Gary, “Similar-Shape Retrieval in Shape Data Management”, IEEE Computer, Vol. 28, No. 9, pp. 57-62, 1995
14. B.M. Mehtre, M.S. Kankanhalli, and W.F. Lee, “Shape Measures for Content Based Image Retrieval: A Comparison”, Information Processing & Management, Vol. 33, No. 3, pp. 319-337, 1997
15. W. Niblack, R. Barber, W. Equitz, M. Flickner, E. Glasman, D. Petkovic, P. Yanker, C. Faloutsos, and G. Taubin, “The QBIC Project: Querying Images by Content Using Color, Texture, and Shape”, Proceedings of SPIE Storage and Retrieval for Image and Video Databases, Vol. 1908, pp. 173-181, 1993
16. J. Nievergelt, H. Hinterberger, and K.C. Sevcik, “The Grid File: An Adaptable Symmetric Multikey File Structure”, ACM Transactions on Database Systems, 9(1), 1984
17. J. O’Rourke, Computational Geometry in C, Cambridge University Press, Cambridge, England, 1994
18. J.T. Robinson, “K-D-B tree: A Search Structure for Large Multidimensional Dynamic Indices”, Proceedings of ACM SIGMOD Conference on the Management of Data, 1981
19. S.M. Smith and J.M. Brady, “SUSAN – A New Approach to Low Level Image Processing”, Technical Report TR95SMS1c, Department of Clinical Neurology, Oxford University, UK, 1995
20. J.R. Smith and S.-F. Chang, “Integrated Spatial and Point Feature Map Query”, ACM Multimedia Systems Journal, To Appear
21. M.J. Swain and D.H. Ballard, “Color Indexing”, International Journal of Computer Vision, Vol. 7, No. 1, pp. 11-32, 1991
22. X. Wan and C.-C.J. Kuo, “Color Distribution Analysis and Quantization for Image Retrieval”, Proceedings of SPIE Storage and Retrieval for Image and Video Databases, Vol. 2670, pp. 8-15, 1996
23. I.H. Witten, A. Moffat, and T.C. Bell, Managing Gigabytes, Van Nostrand Reinhold, New York, New York, 1994
24. K. Wu, A.D. Narasimhalu, B.M. Mehtre, C.P. Lam and Y.J. Gao, “CORE: A Content-based Retrieval Engine for Multimedia Information Systems”, Multimedia Systems, No. 3, pp. 25-41, 1995

Figure 7: The 1st sample query
Figure 8: The 2nd sample query