Texture Image Retrieval Using Novel Non-separable Filter Banks Based on Centrally Symmetric Matrices

Zhenyu He1, Xinge You1,2, Yuan Yan Tang1, Patrick S. Wang3, Yun Xue4
1 Department of Computer Science, Hong Kong Baptist University, Hong Kong. Email: zyhe, xyou, [email protected]
2 Faculty of Mathematics and Computer Science, Hubei University, China
3 College of Computer and Information Science, Northeastern University, Boston, USA. Email: [email protected]
4 Department of Mathematics, Hong Kong Baptist University, Hong Kong. Email: [email protected]

Abstract Although millions of images are stored in large digital image libraries today, users cannot access or make full use of this image information unless the library is organized to allow efficient browsing, searching and retrieval. Research in image retrieval has therefore been an active discipline since the 1970s. Image retrieval is a typical pattern recognition problem consisting of two parts: feature extraction (FE) and similarity measurement (SM). In this paper, we develop new non-separable filter banks based on centrally symmetric matrices and apply them to extract the features of texture images. Compared with tensor product wavelets, the new filter banks capture more directional texture information, which is helpful for texture image retrieval. Experiments show that the new non-separable filter banks perform satisfactorily and achieve better retrieval effectiveness than Daubechies wavelets.

1. Introduction With the rapid growth of digital image libraries, effective and precise methods are needed to browse, search and retrieve image collections so that the image information can be fully used. Image retrieval has been an active research area since the 1970s. There are two main techniques for image retrieval: text-based and content-based. Though simple and efficient, text-based image retrieval suffers from limitations that may lead to unrecoverable mismatches [9].

To overcome the limitations of the text-based technique, content-based image retrieval was proposed. Several typical features can characterize the content of an image, such as texture, color and shape. In this paper, we consider only texture image retrieval. Because texture is very important and useful in pattern recognition and computer vision, a rich body of research in this field has been produced over the past three decades [9]. In the early 1970s, Haralick et al. proposed the co-occurrence matrix representation of texture features [8]. In the early 1990s, both Gabor wavelets and tensor product wavelets came into wide use for texture analysis and texture image retrieval. In [3], Smith and Chang used statistics (mean and variance) extracted from the wavelet subbands as the texture representation. More advanced and precise statistical models in the wavelet domain were later applied to texture classification: [2], [6] and [5] employed the generalized Gaussian density model and the hidden Markov tree model for texture image retrieval. Unfortunately, all of these works are based on tensor product wavelets, which capture only a limited set of directions and cannot fully reflect the directional features of texture images.

2. Centrally symmetric matrices and non-separable bivariate filter banks based on centrally symmetric matrices In digital image processing, we want the analysis tool to have both multiresolution and multidirection properties. The analysis tools generally used in image processing are bivariate filter banks and wavelets. However, most current

0-7695-2521-0/06/$20.00 (c) 2006 IEEE

bivariate filter banks and wavelets are constructed as tensor products of univariate filters. That is, the convolution of a bivariate filter bank or wavelet with an image is implemented by first convolving the rows of the image with the univariate filters and then convolving the columns. A consequence of the tensor product construction is that such filter banks and wavelets capture only limited image orientation information, namely the vertical, horizontal and diagonal directions. Therefore, non-tensor product approaches to constructing multivariate filter banks or wavelets are desirable.

Definition 1: An n × n matrix $B = (b_{j,l})_{j,l=1}^{n}$ is called a centrally symmetric matrix if it satisfies

$$b_{j,l} = b_{n+1-j,\,n+1-l}, \quad j, l = 1, 2, \ldots, n \tag{1}$$
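Condition (1) can be checked mechanically. The following is a minimal sketch in plain Python; the function name `is_centrally_symmetric` and the tolerance parameter `tol` are illustrative choices, not part of the paper. With 0-based indices, entry (j, l) is compared with entry (n-1-j, n-1-l).

```python
def is_centrally_symmetric(B, tol=1e-12):
    """Check b[j][l] == b[n-1-j][n-1-l] for all j, l (0-based form of Eq. (1))."""
    n = len(B)
    return all(
        abs(B[j][l] - B[n - 1 - j][n - 1 - l]) <= tol
        for j in range(n)
        for l in range(n)
    )


# Small hypothetical examples: the first two matrices satisfy Eq. (1),
# the third does not.
even_example = [[1, 2], [2, 1]]
odd_example = [[1, 2, 3], [4, 5, 4], [3, 2, 1]]
counter_example = [[1, 2], [3, 4]]
```

Equivalently, a matrix is centrally symmetric when rotating it by 180 degrees leaves it unchanged.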

A matrix B of order n is called a centrally symmetric orthogonal matrix if it is both centrally symmetric and orthogonal. To describe centrally symmetric matrices and centrally symmetric orthogonal matrices, we use the n × n matrix $S_n$, which takes one of two block forms depending on whether n is odd or even:

$$S_{2k+1} = \begin{pmatrix} \frac{\sqrt{2}}{2} I_k & 0 & -\frac{\sqrt{2}}{2} H_k \\ 0 & 1 & 0 \\ \frac{\sqrt{2}}{2} H_k & 0 & \frac{\sqrt{2}}{2} I_k \end{pmatrix} \tag{2}$$

$$S_{2k} = \begin{pmatrix} \frac{\sqrt{2}}{2} I_k & -\frac{\sqrt{2}}{2} H_k \\ \frac{\sqrt{2}}{2} H_k & \frac{\sqrt{2}}{2} I_k \end{pmatrix} \tag{3}$$

where $I_n$ is the unit matrix of order n and $H_n$ is the unit antidiagonal matrix of order n.

Proposition 1: A matrix B of order n is centrally symmetric if and only if there exist a $\lfloor \frac{n+1}{2} \rfloor \times \lfloor \frac{n+1}{2} \rfloor$ matrix $Z_1$ and a $\lfloor \frac{n}{2} \rfloor \times \lfloor \frac{n}{2} \rfloor$ matrix $Z_2$ such that B has the factorization

$$B = S_n \begin{pmatrix} Z_1 & 0 \\ 0 & Z_2 \end{pmatrix} S_n^{T} \tag{4}$$

with the matrix $S_n$ defined in (2) and (3). Here $\lfloor x \rfloor$ denotes the largest integer not exceeding the real number x. Further, if $Z_1$ and $Z_2$ are orthogonal matrices of orders $\lfloor \frac{n+1}{2} \rfloor$ and $\lfloor \frac{n}{2} \rfloor$ respectively, then B is centrally symmetric and orthogonal. For example, in the case n = 4, write

$$Z_1 = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \tag{5}$$

and

$$Z_2 = \begin{pmatrix} e & f \\ g & h \end{pmatrix} \tag{6}$$

If B is a centrally symmetric orthogonal matrix, then the real numbers a, b, c, d, e, f, g, h satisfy

$$a^2 + b^2 = c^2 + d^2 = 1, \quad ac + bd = 0 \tag{7}$$

$$e^2 + f^2 = g^2 + h^2 = 1, \quad eg + fh = 0 \tag{8}$$

The choice a = d = cos µ, b = −sin µ, c = sin µ, e = h = cos ν, f = −sin ν and g = sin ν is a parametrized solution of (7) and (8). Thus, the centrally symmetric orthogonal matrix B can be denoted $B_{(\mu,\nu)}$.

Based on centrally symmetric matrices, we can build a general construction of bivariate non-tensor product wavelet filter banks with linear phase. These filter banks have a matrix factorization and can be applied to texture images. We define $V_0 = (1, 1, 1, 1)^T$, $V_1 = (1, -1, 1, -1)^T$, $V_2 = (1, 1, -1, -1)^T$, $V_3 = (1, -1, -1, 1)^T$ and denote the matrix of trigonometric polynomials D(ξ, η) as

$$D(\xi, \eta) = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & e^{-i\xi} & 0 & 0 \\ 0 & 0 & e^{-i\eta} & 0 \\ 0 & 0 & 0 & e^{-i(\xi+\eta)} \end{pmatrix}, \quad (\xi, \eta) \in \mathbb{R}^2 \tag{9}$$

For any fixed positive integer N and arbitrarily chosen real number pairs $(\mu_k, \nu_k)$, the low-pass filter can be expressed as

$$m_0(\xi, \eta) = \frac{1}{4}\,\big(1, e^{-i\xi}, e^{-i\eta}, e^{-i(\xi+\eta)}\big) \times \Big(\prod_{k=1}^{N} B_{(\mu_k,\nu_k)}\, D(2\xi, 2\eta)\, B_{(\mu_k,\nu_k)}^{T}\Big) \times V_0, \quad (\xi, \eta) \in \mathbb{R}^2 \tag{10}$$

where $B_{(\mu_k,\nu_k)}$ is the centrally symmetric orthogonal matrix defined through (4). Since D(0, 0) is the identity matrix and each $B_{(\mu_k,\nu_k)}$ is orthogonal, $m_0(0, 0) = 1$, which means that $m_0$ is a low-pass filter. Define

$$m_j(\xi, \eta) = \frac{1}{4}\,\big(1, e^{-i\xi}, e^{-i\eta}, e^{-i(\xi+\eta)}\big) \times \Big(\prod_{k=1}^{N} B_{(\mu_k,\nu_k)}\, D(2\xi, 2\eta)\, B_{(\mu_k,\nu_k)}^{T}\Big) \times V_j, \quad j = 1, 2, 3, \quad (\xi, \eta) \in \mathbb{R}^2 \tag{11}$$

It is easy to see that $m_j(0, 0) = 0$ for j = 1, 2, 3, because the entries of each $V_j$ sum to zero. That is, $m_j$, j = 1, 2, 3, are high-pass filters. For simplicity, in this paper we only consider the case N = 1. Fig. 1 shows that the new non-separable filter banks (NFB) can capture directional information beyond the three directions (horizontal, vertical and diagonal) captured by tensor product wavelets, and that different parameter pairs (µ, ν) yield different directional features of the digital image.
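As a numerical illustration of the construction for n = 4 and N = 1, the following plain-Python sketch builds B(µ,ν) from the factorization (4) with the rotation parametrization of (7) and (8), and evaluates the frequency responses in (10) and (11). The helper names (`matmul`, `transpose`, `centrally_symmetric_orthogonal`, `filter_response`) are illustrative assumptions and do not come from the paper; the sketch only evaluates the formulas and is not the authors' implementation.

```python
import cmath
import math


def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def transpose(A):
    return [list(row) for row in zip(*A)]


def centrally_symmetric_orthogonal(mu, nu):
    """B = S_4 diag(Z1, Z2) S_4^T with Z1, Z2 rotations by mu and nu (Eq. (4))."""
    a = math.sqrt(2) / 2
    S = [[a, 0, 0, -a],   # S_4 from Eq. (3): blocks I_2 and -H_2
         [0, a, -a, 0],
         [0, a, a, 0],    # blocks H_2 and I_2
         [a, 0, 0, a]]
    cm, sm = math.cos(mu), math.sin(mu)
    cn, sn = math.cos(nu), math.sin(nu)
    Z = [[cm, -sm, 0, 0],
         [sm, cm, 0, 0],
         [0, 0, cn, -sn],
         [0, 0, sn, cn]]
    return matmul(matmul(S, Z), transpose(S))


# V_0 .. V_3 as defined in the text.
V = [[1, 1, 1, 1], [1, -1, 1, -1], [1, 1, -1, -1], [1, -1, -1, 1]]


def filter_response(j, xi, eta, mu, nu):
    """Evaluate m_j(xi, eta) from Eqs. (10)-(11) for N = 1."""
    B = centrally_symmetric_orthogonal(mu, nu)
    d = [1, cmath.exp(-2j * xi), cmath.exp(-2j * eta),
         cmath.exp(-2j * (xi + eta))]            # diagonal of D(2*xi, 2*eta)
    D = [[d[r] if r == c else 0 for c in range(4)] for r in range(4)]
    M = matmul(matmul(B, D), transpose(B))
    row = [1, cmath.exp(-1j * xi), cmath.exp(-1j * eta),
           cmath.exp(-1j * (xi + eta))]
    col = [sum(M[r][c] * V[j][c] for c in range(4)) for r in range(4)]
    return sum(row[r] * col[r] for r in range(4)) / 4
```

Evaluating `filter_response` on a grid of (ξ, η) for several (µ, ν) pairs is one way to inspect how the orientation selectivity changes with the parameters.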


Figure 1. One-level decomposition using the new non-separable filter banks. (a) µ = π/4, ν = π/4, N = 1; (b) µ = 5π/8, ν = 3π/4, N = 1.

As is well known, directional information is an important feature of texture images. The new non-separable filter banks can therefore reflect more texture features, which benefits texture image retrieval.

3. Our algorithm for texture image retrieval Texture image retrieval is a typical pattern recognition problem with two main parts: feature extraction (FE) and similarity measurement (SM). The generalized Gaussian density (GGD) model in the wavelet domain is an effective approach to extracting features of texture images, as has been shown in a series of papers [4] [6]. Our experiments show that the histogram of a texture image's coefficients after decomposition with our non-separable filter banks (NFB) can also be closely approximated by the GGD model, as illustrated in Fig. 2. For simplicity, we refer to the coefficients obtained after decomposition with the NFB as coefficients in the NFB domain. The key steps of our algorithm are as follows. Step 1: Extracting features using the GGD model in the NFB domain. The GGD model is given by

$$p(x; \alpha, \beta) = \frac{\beta}{2\alpha\,\Gamma(1/\beta)} \exp\!\big(-(|x|/\alpha)^{\beta}\big) \tag{12}$$

where Γ(·) is the Gamma function. The basic idea of this step is to use the GGD model to approximate the statistical distribution of the NFB coefficients in one subband and then take the parameter pair {α, β} of the GGD model as the features representing this subband. The parameter pairs {α, β} of all chosen subbands together form the features of one texture image. Several methods can be used to estimate the parameter pair {α, β}; maximum likelihood estimation (MLE) is the most common. For details, please refer to [6].
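To make the feature extraction in Step 1 concrete, here is a minimal sketch that evaluates the density (12) and estimates {α, β} from a list of subband coefficients. Note the swapped-in technique: instead of the MLE described in [6], the sketch uses simple moment matching, which relies on the standard GGD moments E|x| = αΓ(2/β)/Γ(1/β) and E[x²] = α²Γ(3/β)/Γ(1/β). The function names and the search interval [0.1, 10] for β are assumptions made for illustration.

```python
import math


def ggd_pdf(x, alpha, beta):
    """Generalized Gaussian density of Eq. (12)."""
    return beta / (2 * alpha * math.gamma(1 / beta)) * math.exp(-(abs(x) / alpha) ** beta)


def moment_ratio(beta):
    """E|x| / sqrt(E[x^2]) for a GGD with shape beta (independent of alpha)."""
    return math.exp(math.lgamma(2 / beta)
                    - 0.5 * (math.lgamma(1 / beta) + math.lgamma(3 / beta)))


def solve_beta(ratio, lo=0.1, hi=10.0, iters=80):
    """Bisection for the beta whose moment ratio matches `ratio`.

    Assumes the ratio increases with beta on [lo, hi]; values outside the
    attainable range converge to the nearest bound.
    """
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if moment_ratio(mid) < ratio:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def estimate_ggd(coeffs):
    """Moment-matching estimate of (alpha, beta) for one subband.

    This is NOT the MLE of [6]; it is a simpler stand-in for illustration.
    """
    n = len(coeffs)
    m1 = sum(abs(c) for c in coeffs) / n      # sample mean of |x|
    m2 = sum(c * c for c in coeffs) / n       # sample mean of x^2
    beta = solve_beta(m1 / math.sqrt(m2))
    alpha = m1 * math.exp(math.lgamma(1 / beta) - math.lgamma(2 / beta))
    return alpha, beta
```

Applying `estimate_ggd` to each selected subband and collecting the resulting pairs gives one feature vector per texture image. A moment-matching estimate of this kind could also serve as a starting point for an iterative MLE solver.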

[Plot titled M1: histogram of coefficients with the fitted GGD distribution]

Figure 2. An example showing that the histogram of a texture image's coefficients in the NFB domain can be well fitted by the GGD model. The texture image in this example is bark0.

Step 2: Measuring the similarity of two texture images. The Kullback-Leibler distance (KLD) is a natural choice for measuring the similarity of two statistical models. The KLD between two GGD models is defined as

$$D\big(p(\cdot; \alpha_1, \beta_1)\,\|\,p(\cdot; \alpha_2, \beta_2)\big) = \log\!\left(\frac{\beta_1\,\alpha_2\,\Gamma(1/\beta_2)}{\beta_2\,\alpha_1\,\Gamma(1/\beta_1)}\right) + \left(\frac{\alpha_1}{\alpha_2}\right)^{\beta_2} \frac{\Gamma((\beta_2 + 1)/\beta_1)}{\Gamma(1/\beta_1)} - \frac{1}{\beta_1} \tag{13}$$

The total KLD between two texture images is the sum of the KLDs over all selected NFB subbands. Step 3: Generating the retrieval result based on the total KLD value. We consider only the k-nearest neighbor classifier, since it is a robust and efficient scheme. That is, image retrieval amounts to finding the top S images that are most similar to the query image, i.e. those with the smallest total KLD.
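Steps 2 and 3 can be sketched as follows. The code evaluates the closed-form KLD between two GGD models, sums it over subbands, and ranks a small database by total distance. The data layout (each image as a list of (α, β) pairs, the database as a dictionary keyed by image name) and all function names are assumptions for illustration; the feature values in the example are made up.

```python
import math


def kld_ggd(a1, b1, a2, b2):
    """Closed-form KLD between GGD(a1, b1) and GGD(a2, b2)."""
    log_term = (math.log(b1 / b2) + math.log(a2 / a1)
                + math.lgamma(1 / b2) - math.lgamma(1 / b1))
    ratio_term = (a1 / a2) ** b2 * math.exp(
        math.lgamma((b2 + 1) / b1) - math.lgamma(1 / b1))
    return log_term + ratio_term - 1 / b1


def total_kld(query, candidate):
    """Sum of subband KLDs; both arguments are lists of (alpha, beta) pairs."""
    return sum(kld_ggd(a1, b1, a2, b2)
               for (a1, b1), (a2, b2) in zip(query, candidate))


def top_matches(query, database, s):
    """Names of the s database images with the smallest total KLD to the query."""
    ranked = sorted(database, key=lambda name: total_kld(query, database[name]))
    return ranked[:s]


# Hypothetical two-subband features, for illustration only.
database = {
    "bark": [(1.0, 1.0), (2.0, 1.5)],
    "brick": [(3.0, 0.8), (0.5, 2.0)],
    "fabric": [(1.1, 1.0), (2.1, 1.5)],
}
query = [(1.0, 1.0), (2.0, 1.5)]
```

Because the KLD is not symmetric, the sketch fixes the query model as the first argument throughout; a different convention would need to be applied consistently.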

4. Experiments To compare our method with other researchers' work, we use the same texture images as in [6]. That is, 40 texture images of size 512 × 512 pixels from the MIT Vision Texture (VisTex) database are used in our experiments. Each texture image is divided into 16 non-overlapping subimages of size 128 × 128 pixels, so that 640 texture subimages in total are involved in our experiments. Following the evaluation criteria in [1], for each query subimage the top S matches are retrieved. If S < 15, the retrieval rate is the percentage of correct retrievals (the other 15 subimages of the same original texture) among the top S matches. Otherwise, the retrieval rate is the ratio of the number of correct retrievals within the top S retrieval results to 15. For instance, in the case of S = 5, if 4 correct retrievals are ranked among the top 5 matches, then the retrieval rate equals 4/5 × 100% = 80%. In the case of S = 15, the retrieval rate is 10/15 × 100% = 66.67% if 10 correct retrievals are among the top 15 matches.

There are several successful methods for texture image retrieval, including wavelet-based energy, wavelet-based GGD and wavelet-based HMT. Which of them is the best? [6] and [7] offered two answers to this question. In [6], M. N. Do evaluated the performance of three wavelet-based energy methods (L1, L2, L1 + L2) and two wavelet-based GGD methods (GGD&KLD, GGD&ED), and concluded that GGD&KLD always outperformed the other four methods. In [7], O. Commowick tested the energy-based method, the GGD method and the WD-HMT method on the VisTex database, and found the GGD method to be the most efficient. Based on these conclusions, to show the efficiency of our method we compare it only with the wavelet-based GGD&KLD method. The comparison of retrieval results is given in Table 1.

Table 1. Texture image average retrieval rate (%)

Number of top matches | db wavelet | [π/2, 0] | [π/6, 2π/7] | [π/2, 0] & [−π/2, 0] | [π/2, 0] & [−π/2, 0] & [π/6, 2π/7] & [2π/7, 2π/3]
3  | 90.8 | 95.2 | 93.9 | 96.8 | 97.6
5  | 87.6 | 92.6 | 90.5 | 94.4 | 95.0
7  | 84.6 | 88.3 | 85.8 | 90.2 | 91.6
10 | 80.5 | 83.3 | 81.0 | 85.3 | 86.2
15 | 71.7 | 75.2 | 72.8 | 77.5 | 79.5
20 | 80.2 | 82.7 | 81.1 | 84.3 | 86.1
50 | 91.7 | 94.5 | 92.6 | 95.2 | 96.3

In the header row of Table 1, db wavelet denotes the db wavelet-based GGD&KLD method; [π/2, 0] denotes the NFB with parameters [µ = π/2, ν = 0]; and [π/2, 0] & [−π/2, 0] denotes two NFBs with parameters [µ = π/2, ν = 0] and [µ = −π/2, ν = 0], respectively, which are combined to retrieve the texture image. The other column headers have analogous meanings. The table shows that every NFB configuration achieves a higher retrieval rate than the db wavelet for every S, that combining more NFBs improves retrieval, and that the results of NFBs with different parameters differ: NFB [π/2, 0] is somewhat better than NFB [π/6, 2π/7].
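The retrieval-rate rule described above can be written as a short function. This is a minimal sketch: the names `count_correct` and `retrieval_rate` are illustrative, and the default of 15 relevant subimages follows from the 16 subimages per texture used in the experiments.

```python
def count_correct(ranked_labels, query_label, s):
    """Number of the top-s ranked results that share the query's texture label."""
    return sum(1 for label in ranked_labels[:s] if label == query_label)


def retrieval_rate(num_correct, s, relevant=15):
    """Retrieval rate in percent.

    For s < relevant the denominator is s; otherwise it is the number of
    relevant subimages (15 here), as in the evaluation rule above.
    """
    denominator = s if s < relevant else relevant
    return 100.0 * num_correct / denominator


# The two worked examples from the text.
rate_s5 = retrieval_rate(4, 5)                  # 4 of the top 5 are correct
rate_s15 = round(retrieval_rate(10, 15), 2)     # 10 of the top 15 are correct
```

This normalization also explains why the rates in Table 1 fall as S grows to 15 and rise again for S = 20 and S = 50: beyond 15 the denominator stays fixed while more correct subimages can enter the result list.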

5. Conclusion In this paper, we have reviewed the origins of image retrieval and previous research on texture image retrieval. More importantly, we have developed novel non-separable filter banks (NFB) based on centrally symmetric matrices and proposed a GGD model in the NFB domain for texture image retrieval. Our experiments on the VisTex subimages show that the new method is effective: it achieves higher average retrieval rates than the db wavelet-based GGD&KLD method (Table 1).

Acknowledgments This research was partially supported by a grant (60403011) from the National Natural Science Foundation of China and by grants (RGC and FRG) from Hong Kong Baptist University.

References
[1] B. S. Manjunath. Texture features for browsing and retrieval of image data. IEEE Trans. Pattern Analysis and Machine Intelligence, 18:837–842, Aug. 1996.
[2] G. V. Wouwer. Statistical texture characterization from discrete wavelet representations. IEEE Trans. Image Processing, 8(4):592–598, Apr. 1999.
[3] J. R. Smith. Automated binary texture feature sets for image retrieval. Proc. ICASSP-96, Mar. 1996.
[4] S. Mallat. A theory of multiresolution signal decomposition: the wavelet representation. IEEE Trans. Pattern Analysis and Machine Intelligence, 11:674–693, Jul. 1989.
[5] M. N. Do. Rotation invariant texture characterization and retrieval using steerable wavelet-domain hidden Markov models. IEEE Trans. Multimedia, 4:517–527, Dec. 2002.
[6] M. N. Do. Wavelet-based texture retrieval using generalized Gaussian density and Kullback-Leibler distance. IEEE Trans. Image Processing, 11:146–158, Feb. 2002.
[7] O. Commowick and C. Louchet. Wavelet-based texture classification and retrieval. Technical Report, Ecole Nationale Superieure des Telecommunications, 2003.
[8] R. M. Haralick. Textural features for image classification. IEEE Trans. Systems, Man, and Cybernetics, SMC-3(6), 1973.
[9] Y. Rui. Image retrieval: current techniques, promising directions, and open issues. Journal of Visual Communication and Image Representation, 10:39–62, Mar. 1999.
