Invariant Features for Texture Image Retrieval using Steerable Pyramid

S. Areepongsa!, D. Park#, and K.R. Rao*

! The National Electronics and Computer Technology Center, National Science and Technology Development Agency, Ministry of Science, Technology and Environment, Bangkok, Thailand
# Hannam University, Dept. ICE, Ojung-Dong, Taeduk-Ku, Taejon, South Korea 306-791
* The University of Texas at Arlington, Box 19016, TX 76019. Email:[email protected]

Abstract – In this paper, rotation, translation and luminance invariant features for texture image retrieval are investigated. The features are derived from the statistics (standard deviation and shape parameter) of the distribution of the transform coefficients extracted from each steerable pyramid subband. With the proposed invariant features, the similarity measure between query and database images provides reliable retrieval results even when the lighting conditions and orientation of the images are changed.

Keywords: Invariant features, Retrieval system, Rotation invariance, Steerable pyramid, Texture image.

I. INTRODUCTION

In view of the overwhelming accumulation of digital databases, retrieval systems that allow efficient browsing, searching and retrieval of digital images are needed. Generally, the existing retrieval methods use the attributes of an image to describe and retrieve similar images. Among these attributes, texture is one of the most discriminating. Even though texture analysis methods are mainly classified into two approaches, structural and statistical, both of them lack the human visual constraints [1]. Recently, since a multi-resolution paradigm with a pyramid structure matches human texture perception well, a number of researchers have introduced this structure to retrieval applications [2][3]. Unfortunately, the wavelet transform lacks translation and rotation invariance. This results in mismatches in the retrieval process when the image orientation is altered. To overcome this problem, a combination of wavelets and moments has been proposed [3]. Alternatively, the steerable pyramid structure, which has translation and rotation invariant properties, has also been introduced [1]. However, this algorithm works well only when the images have the same luminance intensity. To incorporate all the invariant properties, we investigated the statistical parameters (standard deviation and shape parameter) extracted from the histogram of each steerable pyramid subband. These parameters not only have the translation and rotation invariant properties but also support illumination invariance.

In the next section, the background, structure and properties of the steerable pyramid are addressed. Then, in section 3, the statistical parameters are extracted from the histograms of the steerable subbands in order to analyze the illumination invariant properties. The implementation of these parameters in texture image retrieval is described in section 4. The conclusions on the proposed system are highlighted in section 5.

II. THE STEERABLE PYRAMID

In signal processing, a signal can be decomposed into subbands, for example by the wavelet transform. The wavelet transform is widely used in many applications, including retrieval systems, since the pyramid structure of wavelets corresponds well to the human visual system. However, one drawback of (orthogonal) wavelets is the lack of translation invariance, especially for two-dimensional (2-D) signals [4]. To overcome this problem, the “steerable” pyramid wavelet, a class of arbitrary orientation filters generated by linear combination of a set of basis filters, has been proposed [4].

The system diagram of the steerable pyramid for a single stage is shown in Figure 1. The system is divided into two parts, analysis and synthesis. The left side of the diagram is the analysis part. The image is decomposed into lowpass and highpass subbands using the steerable filters L0 and H0. The lowpass band is further broken down into a set of bandpass subbands B0, …, BN and a lower lowpass subband L1. The lower lowpass subband is subsampled by a factor of 2 along the x and y directions. Repeating the shaded area provides the recursive structure. The right side of the diagram is the synthesis part. The synthesized image is reconstructed by upsampling the lower lowpass subband by a factor of 2 and adding it to the set of bandpass subbands and the highpass subband. Due to its invariant properties, the pyramid structure of the steerable wavelet has been used in texture analysis [5]. It captures both the structural and the random aspects of textures [5]. The histogram of the highpass and bandpass subbands in the steerable pyramid can be modeled as the generalized Gaussian, f(x) [6].

$$ f(x) = a\,e^{-\left[\,b\,|x-\mu|\,\right]^{\gamma}} \tag{1} $$

where a and b can be expressed as follows:

$$ a = \frac{b\gamma}{2\,\Gamma(1/\gamma)} \tag{2} $$

$$ b = \frac{1}{\sigma}\sqrt{\frac{\Gamma(3/\gamma)}{\Gamma(1/\gamma)}} \tag{3} $$

Note that µ, σ² and γ are the mean, variance and shape parameter of the probability density function (pdf) of the subbands. The Gamma function is defined as

$$ \Gamma(x) = \int_{0}^{\infty} t^{x-1} e^{-t}\,dt $$

Because the mean of the subbands is close to zero, the pdf of the steerable subbands can be represented simply by two parameters: the standard deviation (σ) and the shape parameter (γ).

Figure 1. First level of steerable pyramid decomposition system: a) analysis part, b) synthesis part. (Block diagram: input signal, filters H0, L0, B0, …, Bn and L1, resampling by a factor of 2, synthesized signal.)

III. ILLUMINATION INVARIANCE PARAMETERS

Though the statistical features of the steerable subbands (standard deviation and shape parameter) are robust to changes in rotation and translation, they do not usually tolerate lighting variations. In this section, we present further modifications of the statistical parameters so as to obtain illumination invariance. As proposed in [7], the histograms of images at different lighting conditions can be approximately modeled as translated and scaled versions of each other. Let a signal f(x) be translated by ε and scaled by α; the translated and scaled version of f(x) is then expressed as f′(x) = αf(x) + ε. However, in [7] it is reported that the translation (ε) of the gray levels of an image does not affect the statistics of the wavelet bands. On the other hand, the scaling (α) of the gray levels has a direct relation with both the standard deviation and the shape parameter, respectively, as follows:

$$ \sigma^{*} = \alpha\sigma \tag{4} $$

$$ \gamma^{*} = \gamma + c(1-\alpha) \tag{5} $$

where σ and σ* are the standard deviations at different lighting conditions and γ* is the new shape parameter derived from the shape parameter γ.

IV. IMPLEMENTATIONS AND RESULTS

In this section, the implementation of the proposed features in the retrieval system is explained. To generate the image database, each image is decomposed into 3-level, 4-orientation subbands (Figure 2) using the steerable pyramid transform. Then, the standard deviation and shape parameters are computed from the histograms of the bandpass subbands. Finally, all these parameters (texture attributes) are stored in the database along with the images (possibly in the compressed domain). To run a query, a query image is chosen from the sample images, as in Query-by-Example. Then, the query texture attributes are extracted and compared to the other texture attributes contained in the database. The comparison starts by computing the scale factor (α) between the histograms of the query and database images using the second order normalized central moment (β₂), as expressed in (6).

$$ \beta_2' = \alpha^{2}\beta_2 \tag{6} $$

The compensation for the standard deviation and shape parameter of the bandpass subbands of the steerable pyramid transform is performed using (4) and (5). Then, the distance between the query object Q and the candidate object C is evaluated using the following:

$$ d(Q,C) = \min_{m} \sum_{p=1}^{N_L} \left[ \sum_{k=1}^{N_B} A_k \left( \sigma'_{Q_k} - \sigma'^{\,m}_{C_k} \right)^{2} + \sum_{k=1}^{N_B} B_k \left( \gamma'_{Q_k} - \gamma'^{\,m}_{C_k} \right)^{2} \right], \quad m = 0, \ldots, N_B \tag{7} $$

where σ′_{Q_k} and γ′_{Q_k} are the standard deviation and shape parameter of the query object at bandpass k. The subscript C_k stands for the candidate object. The parameters N_B and N_L refer to the number of bandpasses in each level and the number of levels in the steerable pyramid. The superscript m is the circular shift applied to the bands of the steerable pyramid. The weighting factors, A_k and B_k, are set by the user. Finally, the images with the smallest distance d(Q,C) are retrieved.

To evaluate the invariant properties and the performance of the proposed method, 13 types of test images are obtained by rotating at 30, 60, 90, 120, 150, and 200 degrees (Table 1a), and then by making each rotated image of three textures (Bark, Brick and Bubble) 20%, 40% and 60% brighter and 20%, 40% and 60% darker (Table 1b); some examples are shown in Figure 3. In the simulation, the query image, Brick (see Figure 3a), is compared against all images in the database. Table 1a shows the distance measure between the query image and all rotated images in the database. The shaded area indicates the smaller distances. These results show that the proposed invariant features can identify the image at different orientations. Table 1b lists the distance measures between the query image and the database images that are subject to both orientation and illumination changes. The table shows that combining the feature model in [7] with the proposed features provides stronger robustness to both rotation and illumination variations. The first 9 images with the minimum distance to the query image (Brick: normal orientation and illumination) are shown in Figure 4.
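The feature pipeline described in this section can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the shape parameter is estimated by moment matching on the generalized Gaussian of (1)-(3), which is one common estimator and is not confirmed by the text as the one used here; the constant c in (5) is not specified in the paper and is passed in as a parameter; and all function names are hypothetical. Since a circular shift by N_B equals a shift by 0, the loop over m covers the N_B distinct shifts.

```python
import math
import random

def ggd_ratio(gamma):
    """Gamma(2/g)^2 / (Gamma(1/g) * Gamma(3/g)); increases with g."""
    return math.exp(2 * math.lgamma(2 / gamma)
                    - math.lgamma(1 / gamma) - math.lgamma(3 / gamma))

def estimate_sigma_gamma(coeffs, lo=0.1, hi=10.0):
    """Moment-matching estimate of (sigma, gamma), assuming zero-mean data."""
    n = len(coeffs)
    m1 = sum(abs(x) for x in coeffs) / n      # mean absolute value
    m2 = sum(x * x for x in coeffs) / n       # second moment (variance)
    target = m1 * m1 / m2
    for _ in range(60):                       # bisection on the monotone ratio
        mid = 0.5 * (lo + hi)
        if ggd_ratio(mid) < target:
            lo = mid
        else:
            hi = mid
    return math.sqrt(m2), 0.5 * (lo + hi)

def scale_factor(beta2_ref, beta2_other):
    """Solve (6) for alpha: beta2_other = alpha**2 * beta2_ref."""
    return math.sqrt(beta2_other / beta2_ref)

def compensate(sigma, gamma, alpha, c):
    """Apply (4) and (5). The constant c is an assumption; the text gives no value."""
    return alpha * sigma, gamma + c * (1.0 - alpha)

def distance(query, cand, A, B):
    """Equation (7): query and cand are lists (levels) of lists (bands) of (sigma, gamma)."""
    n_bands = len(query[0])
    best = math.inf
    for m in range(n_bands):                  # circular shift of candidate bands
        total = 0.0
        for q_level, c_level in zip(query, cand):
            for k in range(n_bands):
                sq, gq = q_level[k]
                sc, gc = c_level[(k + m) % n_bands]
                total += A[k] * (sq - sc) ** 2 + B[k] * (gq - gc) ** 2
        best = min(best, total)
    return best

# Usage: Gaussian samples should give sigma near 2 and a shape parameter near 2.
random.seed(0)
samples = [random.gauss(0.0, 2.0) for _ in range(20000)]
print(estimate_sigma_gamma(samples))
```

Taking the minimum over circular shifts is what makes the distance insensitive to rotations that permute the orientation bands: a candidate whose bands are a shifted copy of the query's bands has distance zero.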

Figure 2. The decomposition of the steerable pyramid [6] (3 levels and 4 orientations). (Subbands in the figure are labeled “Std, Shape”.)

Figure 3. The Brick image with different orientations and illuminations. Panels: a) Normal orientation; b) Brighter 40%; c) Darker 40%; d) 60° orientation; e) Brighter 40%; f) Darker 40%; g) 90° orientation; h) Brighter 40%; i) Darker 40%.

Figure 4. Nine most similar images retrieved from the database (query image: Brick, normal orientation and illumination). Panels: a) 0°, 20% brighter; b) 0°, 40% brighter; c) 0°, 20% darker; d) 0°, 60% brighter; e) 0°, 40% darker; f) 90°, 20% brighter; g) 90°, 40% brighter; h) 90°, normal lighting; i) 90°, 20% darker.

To evaluate the performance of the proposed features, we use the criterion introduced in [3].

$$ \eta_R = \frac{\sum_{q=0}^{K} n_q}{\sum_{q=0}^{K} N_q} \tag{8} $$

where K, q, n_q, N_q and T are the size of the image database, the query image, the number of successfully retrieved images, the number of similar images found in the database, and a positive integer used as a tolerance, respectively. The similar images in the database were listed manually, using the images with normal orientation and illumination as query images. The resulting retrieval efficiency of the proposed system (Figure 5) depends on the tolerance factor: allowing more tolerance (that is, allowing more images to be retrieved) increases the retrieval efficiency.
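As a rough illustration of (8), the sketch below computes the retrieval efficiency from ranked retrieval lists. It assumes, following the description of the tolerance above, that for each query the first N_q + T ranked images are examined and that n_q counts how many of them appear in the manually listed set of similar images; this reading of the criterion in [3], the function name and the toy data are all assumptions made for the example.

```python
def retrieval_efficiency(rankings, relevant, tolerance):
    """Equation (8): sum of n_q over queries divided by sum of N_q.

    rankings:  dict mapping query id -> list of image ids, best match first
    relevant:  dict mapping query id -> set of manually listed similar images
    tolerance: T, extra images allowed beyond N_q (assumed reading of [3])
    """
    hits = 0
    total = 0
    for q, ranked in rankings.items():
        similar = relevant[q]
        top = ranked[: len(similar) + tolerance]          # first N_q + T results
        hits += sum(1 for image in top if image in similar)   # n_q
        total += len(similar)                                 # N_q
    return hits / total

# Toy data (hypothetical image ids, not taken from the paper's database).
rankings = {
    "brick": ["b1", "x", "b2", "b3"],
    "bark": ["k1", "y", "z", "k2"],
}
relevant = {"brick": {"b1", "b2", "b3"}, "bark": {"k1", "k2"}}

for T in (0, 1, 2):
    print(T, retrieval_efficiency(rankings, relevant, T))
```

With this definition the efficiency cannot decrease as T grows, which matches the trend reported for Figure 5.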

Figure 5. The evaluation on the performance of retrieving. (Plot of retrieval efficiency in %, axis from 87 to 94, against tolerance, axis from 0 to 20.)

V. CONCLUSIONS

It is shown that the features are highly invariant to orientation and illumination changes. The texture features are represented by a generalized Gaussian pdf which is estimated from the steerable subbands. Based on our simulations, the combined features can improve the retrieval efficiency, in terms of robustness to orientation and illumination changes, by adding more features.

VI. APPENDIX

The set of 13 images (normal and oriented) was downloaded from the web site http://sipi.usc.edu/services/database/Database.html and added to the illumination-changed images (Bark, Brick and Bubble) to form the database (total of 163 images).

VII. REFERENCES

[1] P. Blancho, H. Komik and K. Knoblauch, “Steerable pyramid-based features for image retrieval from a texture database,” SPIE Human Vision and Electronic Imaging III, Jan. 1998, pp. 552-562.
[2] J. Z. Wang et al., “Wavelet-based indexing technique with partial sketch retrieval capability,” in Proc. the Forum on Research and Technology Advances in Digital Libraries, May 1997, pp. 13-24.
[3] M. K. Mandal, T. Aboulnasr and S. Phanchanathan, “Image indexing using moments and wavelets,” IEEE Trans. on Consumer Electronics, vol. 42, pp. 557-565, Aug. 1996.
[4] W. T. Freeman and E. H. Adelson, “The design and use of steerable filters,” IEEE Trans. on PAMI, vol. 13, pp. 891-906, Sept. 1991.
[5] E. P. Simoncelli and J. Portilla, “Texture characterization via joint statistics of wavelet coefficient magnitudes,” IEEE ICIP, Oct. 1998, pp. 62-66.
[6] E. P. Simoncelli, “Noise removal via Bayesian wavelet coding,” IEEE ICIP, Sept. 1996, pp. 379-382.
[7] M. K. Mandal, Wavelet Based Coding and Indexing of Images and Video, Ph.D. thesis, Dept. of Electrical Engineering, University of Ottawa, Ottawa, Canada, May 1998.
[8] K. Sharifi and A. L. Garcia, “Estimation of shape parameters for generalized Gaussian distributions in subband decompositions of video,” IEEE Trans. on CSVT, vol. 5, pp. 52-56, Feb. 1995.

Table 1. The distance measure of the query image (Brick, normal orientation and illumination) against the rotated and illumination changed image database.

1a. The distance measure of the query image and rotated images (columns: orientation in degrees).

| Texture | 0 | 30 | 60 | 90 | 120 | 150 | 200 |
|---|---|---|---|---|---|---|---|
| Brick | (Query) | 1311.86 | 1404.66 | 5.23 | 1378.48 | 1167.14 | 2631.04 |
| Bark | 6796.56 | 7654.93 | 6592.10 | 6682.43 | 7976.14 | 6717.57 | 6361.36 |
| Bubble | 6839.45 | 5764.36 | 5683.16 | 7298.84 | 6045.11 | 5792.68 | 6704.09 |
| Grass | 15828.25 | 15245.39 | 14851.87 | 11662.58 | 11754.68 | 11335.50 | 12451.09 |
| Leather | 5441.46 | 6182.99 | 7999.16 | 6877.93 | 7016.53 | 7730.50 | 7055.56 |
| Pigskin | 2855.25 | 3023.34 | 2923.48 | 2871.52 | 3003.99 | 2880.10 | 2939.11 |
| Raffia | 3969.77 | 3858.77 | 3687.47 | 4319.16 | 3897.45 | 3816.92 | 3739.73 |
| Sand | 3239.05 | 3328.62 | 3280.33 | 3241.15 | 3202.05 | 3201.05 | 3207.91 |
| Straw | 5196.44 | 3871.80 | 2856.90 | 5275.66 | 3024.32 | 2523.14 | 2201.58 |
| Water | 3479.23 | 3643.00 | 3614.10 | 3513.67 | 3620.42 | 3600.53 | 3725.98 |
| Weave | 3864.95 | 3751.97 | 3945.56 | 4207.71 | 4009.92 | 4163.38 | 4220.24 |
| Wood | 1640.86 | 2355.09 | 2053.14 | 1554.69 | 2427.57 | 1926.45 | 2553.75 |
| Wool | 2975.87 | 2960.59 | 3011.28 | 3052.93 | 3046.68 | 3078.23 | 3025.52 |

1b. The distance measure of the query image and the rotated and illumination changed images (columns: orientation in degrees).

| Texture | Illumination | 0 | 30 | 60 | 90 |
|---|---|---|---|---|---|
| Brick | Normal | (Query) | 1311.86 | 1404.66 | 5.23 |
| Brick | 20% Brighter | 0.00 | 1311.86 | 1404.66 | 5.23 |
| Brick | 40% Brighter | 0.00 | 1311.86 | 1404.66 | 5.23 |
| Brick | 60% Brighter | 0.59 | 1301.65 | 1392.75 | 7.31 |
| Brick | 20% Darker | 0.04 | 1308.34 | 1399.99 | 5.59 |
| Brick | 40% Darker | 2.11 | 1284.08 | 1376.22 | 9.91 |
| Brick | 60% Darker | 39.81 | 1216.72 | 1314.67 | 47.95 |
| Bark | Normal | 6796.56 | 7654.93 | 6592.10 | 6682.43 |
| Bark | 20% Brighter | 6796.56 | 7654.93 | 6592.10 | 6682.43 |
| Bark | 40% Brighter | 6796.56 | 7654.93 | 6592.10 | 6682.43 |
| Bark | 60% Brighter | 6796.40 | 7654.18 | 6591.60 | 6682.14 |
| Bark | 20% Darker | 6267.81 | 7217.15 | 6179.98 | 6180.85 |
| Bark | 40% Darker | 4988.53 | 5967.47 | 4993.83 | 4938.19 |
| Bark | 60% Darker | 3499.61 | 4237.69 | 3502.68 | 3456.40 |
| Bubble | Normal | 6839.45 | 5764.36 | 5683.16 | 7298.84 |
| Bubble | 20% Brighter | 6839.45 | 5764.36 | 5683.16 | 7298.84 |
| Bubble | 40% Brighter | 6839.45 | 5764.36 | 5683.16 | 7298.84 |
| Bubble | 60% Brighter | 6823.89 | 5752.11 | 5670.75 | 7288.44 |
| Bubble | 20% Darker | 5874.46 | 5309.53 | 5081.82 | 5953.62 |
| Bubble | 40% Darker | 4463.72 | 4448.04 | 4099.37 | 4311.53 |
| Bubble | 60% Darker | 3258.94 | 3388.54 | 3171.32 | 3133.37 |