
An Image Retrieval System for Artistic Database on Cultural Heritage

Edoardo Ardizzone, Antonio Chella, Roberto Pirrone and Orazio Gambino

UNIVERSITA’ DEGLI STUDI DI PALERMO
DINFO - Dipartimento di Ingegneria Informatica
viale delle Scienze, edificio 6, 90100 Palermo (ITALY)
{ardizzon,chella,pirrone}@unipa.it [email protected]

Abstract. In this work an image retrieval system based on dominant colors is presented. A novel similarity function based on the membership values of the fuzzy c-means theory is introduced in order to use the information contained in the image descriptor vector as coordinates in a conceptual space. The principal characteristics of this system are a fast retrieval phase due to light computations, easy implementation, immunity to strong luminance variations, and ease of use. The image database we used is heterogeneous and comprises 3000 images, both paintings and photos of monuments; other low-quality photos taken by webcam were also introduced.

1 Introduction

Content Based Image Retrieval consists of applications, like QBIC [8], that find images in large and heterogeneous image databases. It would be preferable for the system to be "text query free" for two reasons: first, the creation of the descriptor is more difficult because the database must be indexed; second, the text query could become fundamental for the retrieval. This situation is typical of image databases which contain photos of monuments, handmade artifacts, paintings and sculptures collected as the cultural heritage of a country. The creation of a sophisticated descriptor requires a long execution time: this is not a problem during the creation of the descriptor database, but it is a problem if a fast comparison between the query image and every image in the database is desired. In this work we use a descriptor based only on colors and a fuzzy solution, easy to implement, to compare the descriptors of the images.

Among the image retrieval systems which use fuzzy theory we can cite SIMPLIcity [6], a complex system based on fuzzy feature matching [5] that has been developed with good results. In that work the image is subdivided into 4x4 blocks and, for each block, three average colors in the LUV color model and three high-frequency measures obtained with a Daubechies-4 wavelet transform form a descriptor. The collection of descriptors for all subimages is the dataset for a fuzzy k-means algorithm that discovers clusters so as to identify image regions with the same characteristics. For each identified region, a shape descriptor based on normalized inertia is added. A membership function is modelled with a Cauchy function for each cluster, so a membership value to each cluster can be determined for a given region. The similarity between two regions is given by an algorithm inspired by fuzzy sets and named Unified Feature Matching. Recently, the authors of Blobworld [7] have developed a complex segmentation system where an image is decomposed into "blobs", which simplifies the image, and a text query is necessary.

The paper is organized as follows: Section 2 explains the descriptor vector, Section 3 presents the fuzzy similarity function between a query image and one coming from the database, Section 4 presents the experimental results and Section 5 gives some ideas to enhance this system.

2 Descriptor definition

2.1 Color reduction

In the literature the reader can often find color reduction algorithms that are applications of pattern recognition ones, like ISODATA [11], which performs the fuzzy k-means algorithm [2]. These kinds of algorithms are slow because they are based on the minimization of an objective function with an iterative process, so a lot of work has been spent to speed them up [1]. Of course, initial conditions are important, so it would be necessary to know a priori a configuration close to the optimal one. Alternative algorithms could be the one proposed in [4] or, more recently, the one in [3]. For our work, we have used the algorithm in [9], whose author provides free C source code at the internet site [10]. In this algorithm the RGB cube is subdivided into a predefined number of sub-regions with minimum variance, whose centroids are the dominant colors, 16 in our case. No dithering algorithm is applied because in this way we obtain a sort of image segmentation, even if this choice produces false contours. Results of the color reduction are shown in fig.1(b).

2.2 Image partitioning

Briefly, the image segmentation system of Blobworld is based on a low-frequency filtering driven by repeated isotropic measures on subimages of different dimensions, followed by the EM algorithm to identify clusters which contain image regions according to an L*a*b* color model. As the authors confirm in their article, this method does not always work well, especially, we say, if the image is not composed of a large and uniform background and a subject in the middle of the scene. Indeed, we have run a free distribution of the system provided by the internet site [12], which works with 162x198 images, and, after resizing the image of fig.1(a), a part of the subject is lost, as shown in fig.1(c). For this reason we prefer a simple image partitioning, like that in fig.2, to recognize only the most important parts of the image. If no part of the image is selected, the descriptor refers to the entire image.

Fig. 1. From left to right: (a) original image, (b) image using 16 colors, (c) blob segmentation.

Fig. 2. Selecting zones

2.3 Descriptor Vector

The dominant color palette is converted to the HSV color model, and the percentages of all colors are counted both for the entire image and for each zone. The user can choose to use the percentages of the entire image or freely select individual zones, and can exclude or include the luminance component.

Fig. 3. Descriptor Vector (fields: h, s, v; entire image; zone1 ··· zone5)
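As an illustration of the descriptor just described, the following is a minimal sketch and not the authors' implementation. It assumes that the image has already been reduced to palette indices, that each zone is a rectangle given as (top, left, bottom, right), and that the descriptor is stored as a dictionary; the function names, the zone geometry and the dictionary layout are all hypothetical.

```python
import colorsys

def color_percentages(indexed, box, n_colors):
    """Fraction of pixels of each palette color inside box = (top, left, bottom, right)."""
    top, left, bottom, right = box
    counts = [0] * n_colors
    for row in indexed[top:bottom]:
        for idx in row[left:right]:
            counts[idx] += 1
    total = sum(counts)
    return [c / total for c in counts] if total else [0.0] * n_colors

def build_descriptor(indexed, palette_rgb, zones):
    """HSV palette plus color percentages for the whole image and for each zone."""
    height, width = len(indexed), len(indexed[0])
    n = len(palette_rgb)
    # Convert every dominant color from 8-bit RGB to HSV (components in [0, 1]).
    hsv = [colorsys.rgb_to_hsv(r / 255, g / 255, b / 255) for r, g, b in palette_rgb]
    return {
        "hsv": hsv,
        "entire_image": color_percentages(indexed, (0, 0, height, width), n),
        "zones": [color_percentages(indexed, z, n) for z in zones],
    }

# Toy 2x4 image quantized to a 2-color palette (pure red, pure blue).
image = [[0, 0, 1, 1],
         [0, 1, 1, 1]]
palette = [(255, 0, 0), (0, 0, 255)]
zones = [(0, 0, 2, 2), (0, 2, 2, 4)]  # hypothetical left and right halves
descriptor = build_descriptor(image, palette, zones)
```

In this toy case the left zone is mostly red and the right zone is entirely blue, while the whole image mixes the two. Excluding the luminance component would amount to ignoring the third entry of each HSV triple.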

3 Fuzzy Similarity Function

In the literature there are many algorithms for the treatment of clusters, like General Pruning [13], but here we explain a new similarity function based on the fuzzy c-means theory. We suppose that a given state space in $\mathbb{R}^p$ is segmented into $c$ clusters whose centroids are $V = \{v_1 \cdots v_c\}$, where $v_i \in \mathbb{R}^p$, and we want to determine, for another vector of the same dimension $X = \{x_1 \cdots x_c\}$, a degree of similitude with that of the centroids. Under the hypothesis written above, given the Bezdek objective function:

$$J(U, V) = \sum_{k=1}^{c} \sum_{i=1}^{c} \mu_{ik}^{m} \, \|x_k - v_i\|_G^2 \tag{1}$$

subject to the constraints:

$$\mu_{ik} \in [0, 1] \;\; \forall i, k, \qquad \sum_{i=1}^{c} \mu_{ik} = 1 \;\; \forall k, \qquad \sum_{k=1}^{c} \mu_{ik} \in (0, n) \;\; \forall i$$

it can be demonstrated that it is minimized for

$$\mu_{ik} = \frac{\left( \dfrac{1}{\|x_k - v_i\|_G^2} \right)^{\frac{1}{m-1}}}{\displaystyle\sum_{j=1}^{c} \left( \dfrac{1}{\|x_k - v_j\|_G^2} \right)^{\frac{1}{m-1}}}, \qquad k = 1, \ldots, n \tag{2}$$

where $n$ is the number of data vectors (here $n = c$).
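The membership formula (2) can be sketched in a few lines. This is a minimal illustration under two assumptions that are not stated in the text: $G$ is the identity matrix, so the norm is Euclidean, and the fuzzifier is $m = 2$ by default. The function names are hypothetical, and the sketch deliberately does not handle a data vector that coincides with a centroid.

```python
def sq_dist(x, v):
    """Squared Euclidean distance, i.e. the G-norm with G = identity (an assumption)."""
    return sum((a - b) ** 2 for a, b in zip(x, v))

def memberships(data, centroids, m=2.0):
    """Return U as a list of rows: U[i][k] is the membership of data[k] to centroids[i].

    Assumes no data vector coincides exactly with a centroid (zero distance).
    """
    exponent = 1.0 / (m - 1.0)
    U = [[0.0] * len(data) for _ in centroids]
    for k, x in enumerate(data):
        # Numerators of equation (2) for every centroid, then normalize the column.
        weights = [(1.0 / sq_dist(x, v)) ** exponent for v in centroids]
        total = sum(weights)
        for i, w in enumerate(weights):
            U[i][k] = w / total
    return U

centroids = [(0.0, 0.0), (2.0, 0.0)]
data = [(0.5, 0.0), (1.0, 0.0)]
U = memberships(data, centroids)
```

With these toy values the first data vector, which lies close to the first centroid, gets a membership near 1 for it, while the second data vector, equidistant from both centroids, is split evenly. Each column of U sums to 1, as required by the constraints.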

Here $\|x\|_G^2 = x^T G x$, where $G$ is a positive definite matrix whose coefficients induce an inner product in $\mathbb{R}^p$. The $\mu_{ik}$ coefficients represent the membership degree of the generic vector $x_k$ to the centroid $v_i$, and they are contained in a square membership array $U \in \mathbb{R}^{c \times c}$. Then, the generic $k$-th column of this array expresses a measure of similitude between the vector $x_k$ and all centroids, so we can take the maximum membership value of each column and consider it as the degree of similitude $s_j$ with the nearest centroid, which can be called $\hat{v}_j$, setting to zero all the other values of the same column:

$$s_j = \max \{\mu_{1j} \ldots \mu_{cj}\} \quad \forall j = 1 \ldots c \tag{3}$$

so we obtain a similitude vector:

$$s = [s_1 \ldots s_c] \tag{4}$$

where each component $s_i$ represents the membership value between $x_i$ and the closest centroid $\hat{v}_i$. The Bezdek objective function can be rewritten as follows:

$$\hat{J}(s, \hat{v}) = s \cdot D, \qquad D = [\|x_1 - \hat{v}_1\|_G \ldots \|x_c - \hat{v}_c\|_G]^T \tag{5}$$

It is simple to demonstrate that if $J$, which has all positive terms, is minimized then $\hat{J}$ is also minimized, because:

$$\hat{J}(s, \hat{v}) < J(U, V)$$

At first sight $\hat{J}$ could be a good measure of similitude; indeed, in the first example in fig.4, when the data matches the centroids, we obtain $\hat{J}=0$, but if the placement of the data is different, as in the other examples, $\hat{J}$ presents values proportional to the degree of dissimilarity.

Fig. 4. Four scatter plots with both axes from 0 to 3 (+ centroid, O data): (a) data matches with centroids, $\hat{J}=0$; (b) data similar to centroids, $\hat{J}=0.54$; (c) data different from centroids, $\hat{J}=0.88$; (d) data concentrated around one centroid, $\hat{J}=0.39$.
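To make the behavior of the measure in equation (5) concrete, here is a small sketch that computes it on toy data. It assumes a Euclidean norm (identity $G$) and $m = 2$. The points below are hypothetical and are not the ones plotted in fig.4, so the numerical values differ from those in the caption.

```python
import math

def sq_dist(x, v):
    """Squared Euclidean distance (G = identity is an assumption)."""
    return sum((a - b) ** 2 for a, b in zip(x, v))

def j_hat(data, centroids, m=2.0):
    """Sum over data vectors of s_j times the distance to the nearest centroid."""
    exponent = 1.0 / (m - 1.0)
    total = 0.0
    for x in data:
        d2 = [sq_dist(x, v) for v in centroids]
        if min(d2) == 0.0:
            continue  # exact match: a zero distance contributes nothing to the sum
        weights = [(1.0 / d) ** exponent for d in d2]
        s = max(weights) / sum(weights)  # largest membership = nearest centroid
        total += s * math.sqrt(min(d2))
    return total

centroids = [(1.0, 1.0), (2.0, 2.0)]
similar = [(1.1, 1.0), (2.0, 2.1)]       # one data vector near each centroid
concentrated = [(1.1, 1.0), (1.0, 1.1)]  # both data vectors near the first centroid

j_match = j_hat(centroids, centroids)
j_similar = j_hat(similar, centroids)
j_concentrated = j_hat(concentrated, centroids)
```

On this toy data the two non-matching configurations give almost the same value, since each data vector is equally close to its nearest centroid in both cases. The measure cannot tell that one centroid has been left without any nearby data.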

But this measure is clearly wrong when the data is concentrated around only one centroid, as shown in fig.4(d). The reason for this undesirable behavior is that the formula takes into account only the distance from the closest centroid. Even considering only the membership values, using the sum of the $s_j$ values between centroids and data:

$$\hat{J}(s) = \sum_{j=1}^{c} s_j \tag{6}$$

is useless, because both in fig.4(b) and in fig.4(d) the similitude values of the vector $s$ are near to 1. The fact is that the model we have devised does not take into account that every centroid should have only one nearest data vector, and this observation must be inserted in the formulation of the similarity model (6). From the membership array $U$ we consider a crisp membership array $U_{crisp}$, which is a sparse array of zeros with ones only in the positions where the maxima of the columns of $U$ are.

Fig. 5. Crisp membership arrays for the examples in Fig.4, panels (a) to (d).

In the ideal case of fig.4(a), where the data vector matches perfectly with the centroids, $U_{crisp}$ is the identity array, as shown in fig.5(a); in the case of fig.4(b), where the data is similar to the centroids, you can observe a sparseness of values in fig.5(b). Both cases have the property that each row has only one membership value, that is, each centroid has only one data vector with a membership value close to 1. The diagonal alignment of fig.5(a) does not matter, because it is due only to the identical sorting of the two vectors. So, if several data vectors are near to one centroid, we must penalize the membership values where this happens, using penalty coefficients $\hat{p} = [p_1 \ldots p_c]$. The feature of these coefficients is that each must take the maximum value 1 if a data vector belongs to only one centroid, and less otherwise. We have chosen the following penalty vector:

$$f = [m_1 \ldots m_c]$$

where:

$$m_k = \left( \sum_{l=1}^{c} U_{crisp}(i, l) \right)^{-1} \quad \text{for } i : s_k = \max \{\mu_{ik}\}, \quad k = 1 \ldots c \tag{7}$$

and at the end this is the similarity function we were looking for:

$$J = f \cdot s$$

For our purpose the centroid vector is the descriptor vector of a query image and the data vector is that of a generic image which comes from the descriptor database.

3.1 Singularity Treatment

Programming libraries often provide implementations of fuzzy c-means which do not take into account the singularity case, because it is considered a rare event. Substantially, the membership definition in (2) becomes singular simply when a centroid matches a data value and the distance is null. It is sufficient that only one centroid matches a data value and the membership array cannot be calculated. We think that neglecting this case is careless for our application, because the input RGB data is quantized and this produces a quantization in the HSV model too. To prevent the singularity condition, the null distance must be detected and the corresponding $U_{crisp}$ column inserted.
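The similarity function $J = f \cdot s$ of Section 3, together with the singularity treatment described above, can be sketched as follows. This is an illustrative reading rather than the authors' code. It assumes a Euclidean norm (identity $G$) and $m = 2$, treats the query and candidate descriptors as lists of points, and interprets "inserting the $U_{crisp}$ column" as assigning membership 1 to the matching centroid. The function names and the toy data are hypothetical.

```python
def sq_dist(x, v):
    """Squared Euclidean distance (G = identity is an assumption)."""
    return sum((a - b) ** 2 for a, b in zip(x, v))

def similarity(query, candidate, m=2.0):
    """J = f . s, with the query vectors as centroids and the candidate vectors as data."""
    exponent = 1.0 / (m - 1.0)
    nearest = []  # index of the closest centroid for each data vector
    s = []        # maximum membership of each data vector
    for x in candidate:
        d2 = [sq_dist(x, v) for v in query]
        i = min(range(len(query)), key=d2.__getitem__)
        if d2[i] == 0.0:
            # Singularity: use the crisp column directly (membership 1 to centroid i).
            s.append(1.0)
        else:
            weights = [(1.0 / d) ** exponent for d in d2]
            s.append(weights[i] / sum(weights))
        nearest.append(i)
    # Row sums of U_crisp: how many data vectors pick each centroid as their nearest.
    row_sum = [nearest.count(i) for i in range(len(query))]
    # Penalty m_k: inverse of the row sum of the centroid nearest to data vector k.
    f = [1.0 / row_sum[i] for i in nearest]
    return sum(fk * sk for fk, sk in zip(f, s))

query = [(1.0, 1.0), (2.0, 2.0), (3.0, 1.0)]
spread = [(1.1, 1.0), (2.0, 2.1), (3.0, 1.1)]        # one data vector per centroid
concentrated = [(1.1, 1.0), (1.0, 1.1), (0.9, 1.0)]  # all around the first centroid

j_identical = similarity(query, query)
j_spread = similarity(query, spread)
j_concentrated = similarity(query, concentrated)
```

Under these assumptions a larger $J$ means a better match: identical descriptors reach the maximum $c$ (3 here), data spread one per centroid stays close to it, and data concentrated around a single centroid is penalized down to roughly 1.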

4 Dataset and Experimental Setup

We have used an archive of more than 3000 images in .jpg format, together with low-quality webcam photos. Most images have a resolution ranging from 640x400 to 800x600 at 24 bit. The boxed results are valid matches. In the last row we show the result for a query image with an attenuated luminance channel: the system finds in the first position an image which matches the original perfectly.

5 Future works

The algorithm [9] over-segments large regions with uniform color; this produces false contours and consumes centroids which could be used to describe details, like the red of Eve's apple in fig.1(a). So a post-processing step should be executed to merge these regions, setting free some centroids for a cleverer creation of the dominant color palette. Enhancing the descriptor by introducing shape or texture analysis should increase its functionality. In fact, the system works fine with color information only, without any other information about the image. Each artist has a particular painting technique which distinguishes his paintings, or some preferred topics that are present in his works. A texture analysis could capture information about the brush stroke which distinguishes, for example, a Van Gogh from another artist. This implies that the painting style of each artist could be recognized in order to identify some descriptor parameters which are characteristic of each artist. A shape identification could identify a particular object represented in the picture, so a shape descriptor should be developed that is invariant to both rigid and scale transforms. In this way an object, even if it is located in a different position in another painting, can be found using a visual query like the one in this work.

6 Acknowledgement

This work has been partially supported by Te.S.C.He.T. (Technology System for Cultural Heritage in Tourism) project ENGINEERING INGEGNERIA INFORMATICA.


References

[1] R.L. Cannon, J.V. Dave, and J.C. Bezdek, “Efficient implementation of the fuzzy c-means clustering algorithm,” IEEE Trans. on Pattern Analysis and Machine Intelligence, vol. 8, no. 2, pp. 248–255, 1986.
[2] J.C. Bezdek, “Pattern Recognition with Fuzzy Objective Function,” Plenum Press, 1981.
[3] N. Papamarkos, A.E. Atsalakis, C.P. Strouthopoulos, “Adaptive color reduction,” Systems, Man and Cybernetics, Part B, IEEE Transactions on, vol. 32, Issue: 1, pp. 44–56, Feb. 2002.
[4] S. Wan, K. Wong, P. Prusinkiewicz, “An algorithm for multidimensional data clustering,” ACM Transaction on Mathematical Software, 14(2), pp. 153–162, June 1988.
[5] Yixin Chen, J.Z. Wang, “A region-based fuzzy feature matching approach to content-based image retrieval,” Pattern Analysis and Machine Intelligence, IEEE Transactions on, vol. 24, Issue: 9, pp. 1252–1267, Sept. 2002.
[6] J.Z. Wang, Jia Li, G. Wiederhold, “SIMPLIcity: semantics-sensitive integrated matching for picture libraries,” Pattern Analysis and Machine Intelligence, IEEE Transactions on, vol. 23, Issue: 9, pp. 947–963, Sept. 2001.
[7] C. Carson, S. Belongie, H. Greenspan, J. Malik, “Blobworld: image segmentation using expectation-maximization and its application to image querying,” Pattern Analysis and Machine Intelligence, IEEE Transactions on, vol. 24, pp. 1026–1038, Aug. 2002.
[8] M. Flickner, H. Sawhney, W. Niblack, J. Ashley, Qian Huang, B. Dom, M. Gorkani, J. Hafner, D. Lee, D. Petkovic, D. Steele, P. Yanker, “Query by image and video content: the QBIC system,” IEEE Trans. on Medical Imaging, vol. 28, Issue: 9, pp. 23–32, Sept. 1995.
[9] Xiaolin Wu, “Greedy orthogonal bipartition of RGB space for variance minimization aided by inclusion-exclusion tricks,” Graphics Gems vol. II, pp. 126–133.
[10] www.csd.uwo.ca/faculty/wu/cq.c
[11] S. Phillips, “Reducing the computation time of the Isodata and K-means unsupervised classification algorithms,” Geoscience and Remote Sensing Symposium 2002. IGARSS ’02. 2002 IEEE International, vol. 3, pp. 1627–1629, June 2002.
[12] http://elib.cs.berkeley.edu/src/blobworld
[13] A. Kimura et al., “Very quick audio searching: introducing global pruning to the Time-series Active Search,” Proc. of ICASSP, vol. 3, no. 3, pp. 1429–1432, 2001.
