Consumer Electronics, IEEE Transactions on - IEEE Xplore


Park and Lee: Image Compression Based on Best Wavelet Packet Bases

IMAGE COMPRESSION BASED ON BEST WAVELET PACKET BASES

Daechul Park* and Moon Ho Lee**

*Han Nam University, Dept. Info. and Comm. Engr., 133 Ojung-dong, Taeduk-ku, Taejon 300-791, Korea
**Chunbuk National University, Dept. Info. and Comm. Engr., Chonju, Chonbuk 560-756, Korea

Abstract An adaptive subband decomposition technique for efficient signal representation and compression is introduced and tested in a rate-distortion framework. Best wavelet packet bases exhibit an SFFT (Short-time Fast Fourier Transform) subband decomposition at one source instance, a wavelet decomposition at another instance, or any intermediate wavelet packet decomposition at yet other instances, to best match the signal's characteristics. For a given subband decomposition, commonly used information measures such as entropy, distortion (mse), and rate (number of coefficients) are minimized over all subbands to search for the most efficient wavelet packet tree of the signal, and a method to provide such a wavelet packet tree is proposed. Image coding application results using the joint rate-distortion cost measure demonstrated superior performance over the entropy-only information cost measure.

1. Introduction A signal may be divided into frequency subbands by repeated application of convolution-decimation by a pair of filter banks (H and G), which are called perfect reconstruction QMFs (Quadrature Mirror Filters) if they satisfy orthogonality and perfect reconstruction conditions (to be described in Section 2). The use of an adaptive subband decomposition of a signal was recently introduced by Coifman et al. [1] using orthogonal filter banks as a family of orthonormal bases for picture compression. Ramchandran and Vetterli [2] proposed an

Manuscript received June 27, 1994


optimization algorithm to produce best wavelet packet bases in a joint rate-distortion sense, using a DCT (Discrete Cosine Transform) wavelet packet basis after quadtree segmentation of the source image. Basically, this approach never splits the bands in the form of the wavelet transform and/or wavelet packet basis mentioned in [1]; rather, the source image is preprocessed by quadtree segmentation in order to achieve variable block size DCT coding. The segmented image is then coded in a JPEG environment. This algorithm signifies that an adaptive subband tree structure tuned to the signal characteristics enables the coder to exhibit the best performance in both the human perception and distortion senses. These adaptive subband bases come with a natural quadtree-like structure and remarkable orthogonality properties. When applying subband decomposition techniques to image compression, difficult questions to be answered are 1) what

information cost measure has to be used to get the best decomposition tree (wavelet packet bases)? and 2) given the best basis tree, how can we efficiently and optimally allocate the provided coding budget to subbands? A possible answer to the first question is to introduce a cost function for band splitting and merging and then to search for a best wavelet packet basis by minimizing the cost functional over all subbands. The choice of the information cost functional that minimizes a global distortion is not trivial for a limited coding budget. The answer to the second question is more or less an application-specific matter


IEEE Transactions on Consumer Electronics, Vol. 40, No. 3 , AUGUST 1994


that requires a priori knowledge about the source images. In this paper we conduct the best wavelet packet bases search with respect to the cost measure of power concentration ability of the wavelet transformed coefficients, as was done in [3]. Unlike the method used in [3], an information cost $J(D_w, N_w, \lambda)$ is assigned to each subspace $w \in W$, where the quantity $J(D_w, N_w, \lambda)$ measures the expense of including $w$ in the decomposition of the picture $S$. $D_w$ and $N_w$ denote the mse (mean square error) distortion and the number of significant coefficients in the subspace $w$ detected at cutoff threshold $T_\varepsilon$. The parameter $\lambda$ is the Lagrange multiplier that controls the best match to the convex hull approximation of the D-N (Distortion-Number of coefficients) curve.

2. Wavelet Packets

A. Wavelet Functions

Roughly speaking, a wavelet packet is a square integrable function with zero mean and compact support both in time and frequency. It is characterized by three parameters: frequency $n$, position $b$, and scale $l$. Recall that any square integrable function $f(x)$ can be expressed as its orthogonal projection onto subspaces spanned by a set of wavelet functions $\psi_{l,b}(x)$ obtained by dilation and translation of a mother (wavelet) function [4]:

$$\psi_{l,b}(x) = 2^{-l/2}\,\psi(2^{-l}x - b), \quad (l,b) \in \mathbb{Z}^2 \tag{1}$$

Similarly, one defines a father (scaling) function as

$$\phi_{l,b}(x) = 2^{-l/2}\,\phi(2^{-l}x - b), \quad (l,b) \in \mathbb{Z}^2 \tag{2}$$

At a given scale $l$, the wavelet and scaling functions constitute an orthonormal basis of the decomposed subspaces obtained by applying the convolution-decimation operators such that

It is shown in [5] that the projections of a function $f(x)$ onto the function spaces with basis sets $\{\phi_{l,b}\}$ and $\{\psi_{l,b}\}$ are equivalent to a filtering operation followed by subsampling by 2. Let $\{h_j\}_{j=0}^{M-1}$ and $\{g_j\}_{j=0}^{M-1}$ be two finite filter sequences satisfying the orthogonality and perfect reconstruction conditions. Now define two convolution-decimation operators $H$ and $G$ from $\ell^2(\mathbb{Z})$ into $\ell^2(2\mathbb{Z})$ as follows [6]:

$$(Hf)_i = \sum_{j=0}^{M-1} h_j f_{2i+j} \tag{3.1}$$

$$(Gf)_i = \sum_{j=0}^{M-1} g_j f_{2i+j} \tag{3.2}$$

with filter coefficients $g_n = (-1)^n h_{M-1-n}$, $n = 0, 1, \ldots, M-1$.

The adjoint operators $H^*$ and $G^*$ of $H$ and $G$ correspond to upsampling by 2 followed by convolution with the filters $\tilde{h}_n = h_{M-1-n}$ and $\tilde{g}_n = g_{M-1-n}$, $n = 0, 1, \ldots, M-1$. As mentioned earlier, the filter sets $H$, $G$, $H^*$, and $G^*$ satisfy a pair of orthogonality and perfect reconstruction conditions:

$$HG^* = GH^* = 0 \tag{5.1}$$

$$H^*H + G^*G = I \tag{5.2}$$
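The operators above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the indexing $(Hf)_i = \sum_j h_j f_{2i+j}$, the periodic boundary handling, and the function names `analyze`, `synthesize`, and `reconstruct` are assumptions made for the example. The low-pass taps are the D6 values quoted in Section 4, and the high-pass taps follow the rule $g_n = (-1)^n h_{M-1-n}$.

```python
# Periodized convolution-decimation operators and their adjoints (sketch).
# Assumed indexing: (Hf)_i = sum_j h_j f_{2i+j}, with periodic boundaries.

# D6 low-pass coefficients as quoted in Section 4.
h = [0.332671, 0.806892, 0.459878, -0.135011, -0.085441, 0.035226]
M = len(h)
# High-pass filter from the stated rule g_n = (-1)^n h_{M-1-n}.
g = [(-1) ** n * h[M - 1 - n] for n in range(M)]


def analyze(f, taps):
    """Filter a periodic signal with `taps` and keep every second sample."""
    n = len(f)
    return [sum(taps[j] * f[(2 * i + j) % n] for j in range(len(taps)))
            for i in range(n // 2)]


def synthesize(c, taps):
    """Adjoint of `analyze`: upsample by 2, then filter (periodic signal)."""
    n = 2 * len(c)
    out = [0.0] * n
    for i, ci in enumerate(c):
        for j, t in enumerate(taps):
            out[(2 * i + j) % n] += t * ci
    return out


def reconstruct(f):
    """Apply H and G, then their adjoints, and add the two branches."""
    low = synthesize(analyze(f, h), h)
    high = synthesize(analyze(f, g), g)
    return [a + b for a, b in zip(low, high)]


signal = [3.0, 1.0, -2.0, 5.0, 0.5, 4.0, -1.0, 2.0]
restored = reconstruct(signal)
max_error = max(abs(a - b) for a, b in zip(signal, restored))
print(max_error)  # small, limited by the six-decimal filter coefficients
```

Because the taps are rounded to six decimals, the reconstruction is exact only up to a small numerical error.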

It is also shown in [5] that wavelet packets $\{W_n\}_{n=0}^{\infty}$ are generated from two recursive relations:
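In the wavelet packet literature (Coifman, Meyer, and Wickerhauser), the two relations are commonly written in the form below. The $\sqrt{2}$ normalization and the dilation convention shown are the usual ones and are an assumption here; they may differ in detail from the form intended in this paper.

```latex
\begin{align}
W_{2n}(x)   &= \sqrt{2} \sum_{j=0}^{M-1} h_j \, W_n(2x - j), \\
W_{2n+1}(x) &= \sqrt{2} \sum_{j=0}^{M-1} g_j \, W_n(2x - j).
\end{align}
```

With $W_0$ taken as the scaling function, the first relation is its two-scale equation, and the second relation with $n = 0$ yields the wavelet $W_1$.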

These recursive relations allow construction of the scaling function $\phi_{1,0}(x) = W_0(x)$ and



the wavelet function $\psi_{1,0}(x) = W_1(x)$ for $n = 0$ and $l = 1$, which can be identified with the scaling and wavelet functions in [4], along with all other wavelet packets $W_n(2^{-l}x - b)$, where $l, b \in \mathbb{Z}$ and $n \in \mathbb{N}$. These function sets constitute the library of wavelet packet bases in $L^2(\mathbb{R})$. The size of the library is infinite in the continuum limit, but finite in the approximation space restricted to a compact interval. Successive expansions of a vector $x$ in $\mathbb{R}^N$ with respect to wavelet packet bases form a set of $N \log N$ dimensional vectors. From this set we may choose more than $2^N$ orthonormal bases for $\mathbb{R}^N$.

B. Multidimensional Wavelet Packets Two dimensional convolution-decimation operators can be defined in terms of tensor products of the pair of QMFs, H and G :

$S = S(x, y) \in \ell^2(\mathbb{Z}^2)$ with width $N_x = 2^{n_x}$ and height $N_y = 2^{n_y}$. The space of such images may be decomposed into a partially ordered set $W$ of subspaces $w(l,b)$ called subbands, where $l \ge 0$ and $0 \le b$

[Figure: quadtree subband decomposition. (a) Root node, first level, and second level decomposition, with subbands labeled w(l,b), e.g. w(1,2), w(2,10), w(2,11), w(2,14), w(2,15).]

The number of coefficients $N_\varepsilon$ above the threshold $T_\varepsilon$ is defined as

$$N_\varepsilon(x) = \#\{\, x_{ij} : |x_{ij}| > T_\varepsilon,\ x_{ij} \in S_w \,\} \tag{13}$$

where $S_w$ is the sequence of transformed coefficients with respect to the basis $w$, and the threshold is given by

$$T_\varepsilon = \varepsilon \exp\!\left(-L(x)/\|x\|^2\right) \tag{14}$$

where $\varepsilon$ is a scaling constant factor which is related to the quantization type. The term $\exp(-L(x)/\|x\|^2)$ is directly related to the average energy of significant coefficients, i.e.,

$$\|x\|^2 / \exp(-H(x)), \quad H(x) = -\sum_j p_j \ln p_j, \quad p_j = |x_j|^2/\|x\|^2. \tag{15}$$

The numerator and the denominator terms in (15) represent the total energy and the entropy energy in the signal $x$, respectively. The distortion measure $D_\varepsilon(x)$ to be used is the mse (mean square error).

Both $N_\varepsilon(x)$ and $D_\varepsilon(x)$ are additive measures of information on $\ell^2$. Thus to each subspace $w \in W$ (the whole tree) we may assign an information cost

$$J_\lambda(w(l,b), \varepsilon) = \sum_{b=0}^{3} D_\varepsilon(w(l,b)) + \lambda \sum_{b=0}^{3} N_\varepsilon(w(l,b)) \tag{17}$$

for level $l$ and its associated subband $b$.

B. The Best-basis Algorithm

To each retained subspace $w \in W$, its associated information cost $J_\lambda(w(l,b), \varepsilon)$ measures the expense of including $w$ in the decomposition to represent the picture $S$. Let us define the best basis for representing $S$ with respect to $J_\lambda(w(l,b), \varepsilon)$ as the subset $B_0$ which minimizes

$$\sum_{l,b} J_\lambda(w(l,b), \varepsilon), \quad w(l,b) \in B \subset W,$$

where $B$ ranges over all admissible basis subsets.

Without a coding budget constraint, the best basis subset $B_0$ will be the one that gives the global minimum distortion over all basis subsets $B$. However, in our case with a coding budget constraint, we have to solve the constrained minimization problem

$$\min \sum_{l,b} D(w(l,b), \varepsilon) \quad \text{subject to} \quad \sum_{l,b} N_\varepsilon(w(l,b), \varepsilon) \le N_T,$$

where $N_T$ is the total coding budget in the number of coefficients retained at cutoff threshold $T_\varepsilon$. We solve the constrained problem by converting it to an unconstrained problem using the Lagrange multiplier [7]. Thus


the equivalent unconstrained problem becomes the solution to

$$\min \sum_{l,b} \left[ D_\varepsilon(w(l,b)) + \lambda\, N_\varepsilon(w(l,b)) \right].$$
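The per-subband Lagrangian cost can be illustrated with a short sketch. It assumes that the distortion of a subband is the mean squared value of the coefficients discarded by the threshold, which holds for an orthonormal transform up to normalization; the function name `subband_cost` and the sample values are hypothetical.

```python
def subband_cost(coeffs, threshold, lam):
    """Return (D, N, J) for one subband.

    N counts coefficients whose magnitude exceeds `threshold`.
    D is the mean squared value of the discarded coefficients
    (an assumed distortion measure, see the text above).
    J = D + lam * N is the Lagrangian cost of the subband.
    """
    kept = [c for c in coeffs if abs(c) > threshold]
    dropped = [c for c in coeffs if abs(c) <= threshold]
    n_kept = len(kept)
    distortion = sum(c * c for c in dropped) / len(coeffs)
    return distortion, n_kept, distortion + lam * n_kept


coeffs = [4.0, -3.0, 0.5, 0.0, -1.0, 2.0, 0.5, -0.5]
thresholds = (0.25, 0.75, 1.5, 2.5)

# For each multiplier, pick the threshold with the lowest cost J.
for lam in (0.01, 0.1, 1.0):
    best_t = min(thresholds, key=lambda t: subband_cost(coeffs, t, lam)[2])
    d, n, j = subband_cost(coeffs, best_t, lam)
    print(lam, best_t, n, round(d, 4), round(j, 4))
```

A larger multiplier penalizes retained coefficients more heavily, so the minimizing threshold moves up and fewer coefficients are kept.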

Now one can equivalently solve the above unconstrained equation for the optimal values of $\varepsilon$, $w(l,b) \in W$, and $\lambda$. For certain $\varepsilon$, $\lambda$, band $b$, and level $l$, one can obtain the best possible candidates that meet the budget constraint $N_\varepsilon$. Among those candidates, we have to search for the best wavelet packet tree that gives the minimum cost functional $J^*(\lambda)$ over the scales (levels). A fast tree pruning (and/or growing) algorithm to find the optimal basis $B_0$ is well described in [7] and omitted here. The first part of the algorithm finds the optimal value of the Lagrange multiplier $\lambda(l)$ for given levels. The second part computes the total cost for each parent block (that is, one level up) and its child nodes, and then compares the costs to select the minimal cost wavelet packet. In other words, let

$$J_p(l) = D_{\varepsilon(l)}(w(l,b)) + \lambda(l)\, N_{\varepsilon(l)}(w(l,b))$$

be the parent node cost at level $l$ and band $b$. Let $J_p(l)$ be denoted by $J_{\lambda(l)}(w(l,b), \varepsilon)$. The cost of the descendant child nodes is likewise represented as

$$J_c(l+1) = \sum_{i=0}^{3} J_{\lambda(l+1)}(w(l+1, 4b+i), \varepsilon(l+1)) = \sum_{i=0}^{3} D_{\varepsilon(l+1)}(w(l+1, 4b+i)) + \lambda(l+1) \sum_{i=0}^{3} N_{\varepsilon(l+1)}(w(l+1, 4b+i))$$

The best basis algorithm simply searches for the minimum cost for band splitting and/or merging as follows:

If $J_p(l) > J_c(l+1)$ then
    split parent band
else
    merge child bands
end
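The split/merge rule can be sketched as a bottom-up recursion over the quadtree. This is an illustrative sketch in the spirit of the pruning algorithm of [7], not the authors' code: the function name `best_basis`, the dictionary layout, and the cost values are hypothetical, and each cost is assumed to be an already computed Lagrangian cost $J$ for that subband. Children of band $b$ at level $l$ are indexed $4b+i$ at level $l+1$, as in the text.

```python
def best_basis(costs, level=0, band=0):
    """Return (cost, leaves) of the cheapest basis below node (level, band).

    `costs` maps (level, band) to the Lagrangian cost J of that subband.
    A node whose four children are not all present in `costs` is treated
    as a leaf of the full decomposition tree.
    """
    parent_cost = costs[(level, band)]
    children = [(level + 1, 4 * band + i) for i in range(4)]
    if not all(child in costs for child in children):
        return parent_cost, [(level, band)]

    child_cost, child_leaves = 0.0, []
    for child_level, child_band in children:
        cost, leaves = best_basis(costs, child_level, child_band)
        child_cost += cost
        child_leaves += leaves

    if parent_cost > child_cost:          # split parent band
        return child_cost, child_leaves
    return parent_cost, [(level, band)]   # merge child bands


# Hypothetical costs for a tree decomposed to level 2 in two branches.
costs = {(0, 0): 10.0,
         (1, 0): 1.0, (1, 1): 2.0, (1, 2): 3.0, (1, 3): 3.0,
         (2, 0): 1.0, (2, 1): 1.0, (2, 2): 1.0, (2, 3): 1.0,
         (2, 8): 0.5, (2, 9): 0.5, (2, 10): 0.5, (2, 11): 0.5}

total, leaves = best_basis(costs)
print(total)   # 8.0
print(leaves)  # [(1, 0), (1, 1), (2, 8), (2, 9), (2, 10), (2, 11), (1, 3)]
```

In this example band w(1,0) stays merged because its children cost more in total, while w(1,2) is split because its four children are cheaper than the parent.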

C. Wavelet Packet Signal Compression

So far we have searched for the most efficient representation by the best-basis algorithm using the Lagrange multiplier method. Note that no quantization was applied to the wavelet packet coefficients. The algorithm simply picks the transformed coefficients over the subspaces that provide the least mse distortion subject to the coding budget for the reconstruction of the original signal. The retained coefficient streams can be quantized and coded before transmission, along with overhead side information that describes the wavelet packet tree structure. The side information consists of an array of integers which describe level, frequency (or band), starting position, and cutoff threshold index, and the best basis is coded in depth-first order as shown in Fig. 3. Those coefficients with symbols P and M are quantized using a scalar quantizer, and then entropy coding is performed on those symbol streams for further redundancy removal. The code map symbol stream (P, M, Z, R, but not C) can be encoded using an entropy coder such as the adaptive arithmetic coder [9], where the symbols P, M, Z, and R denote a plus coefficient, a minus one, an insignificant one, and a root node with insignificant descendant coefficients, respectively. As we saw in Fig. 3, the various subbands present in the best basis tree provide a segmentation of the picture in the time-frequency domain. Some selection criterion may be applied to preselect a desired feature of the signal, such as a given texture or a feature detected at a selected scale. Then the coefficients within the subband may be used as a signature of the selected feature, and at the same time the positions of large amplitude coefficients can be used to locate the feature more precisely.
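As a rough illustration of the code map, the sketch below assigns the symbols P, M, and Z to the coefficients of a single subband. It assumes that significance means a magnitude above the cutoff threshold; the symbols R and C depend on the tree structure and are left out. The function name `symbol_map` is hypothetical.

```python
def symbol_map(coeffs, threshold):
    """Map each coefficient to P (plus), M (minus), or Z (insignificant)."""
    symbols = []
    for c in coeffs:
        if c > threshold:
            symbols.append("P")
        elif c < -threshold:
            symbols.append("M")
        else:
            symbols.append("Z")
    return "".join(symbols)


print(symbol_map([5.2, -3.1, 0.4, -0.2, 7.0], 1.0))  # PMZZP
```

Only the coefficients marked P or M would then be passed to the scalar quantizer.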



(b) Coded symbol map in depth-first search order (P = Plus, M = Minus, C = Child, Z = Isolated Zero, R = Root Node): PMMZMZRPZZMMZRZZPZZZZMMZCCZZCCZZZZPMZZZZMPP PZZZZPZZMCCCCMMZZPZZM.

Fig. 3 Best basis search order and code map symbol stream

4. Image Coding Application Results

We now describe an image coding application of the proposed best-basis algorithm, which finds orthonormal subspaces with minimal information cost for a given coding budget. Significant coefficients in the best basis tree are detected by the entropy criterion (energy concentration measure), as well as overhead tree information. The implementation parameters for the orthogonal wavelet packet transforms are the periodized D6 QMFs (Daubechies's filter set), whose coefficients are given below:

h0 = g5 = 0.332671, h1 = g4 = 0.806892, h2 = g3 = 0.459878,
h3 = g2 = -0.135011, h4 = g1 = -0.085441, h5 = g0 = 0.035226

The test image "mit" of size 256x256 with 8 bits/pixel was subband decomposed only up to level 3. The information cost functions we used in the simulation are (i) the entropy cost criterion, i.e., $L(x) = -\sum_j x_j^2 \ln x_j^2$, and (ii) the joint rate-distortion cost $J_\lambda(w, \varepsilon)$ defined in (17).

As application experiments, Tables 1-3 show typical results of both the entropy criterion and the joint rate-distortion criterion. We carried out four experiments for the best wavelet packet basis search by the proposed best basis algorithm described in Section 3.B. For computational efficiency, enough memory was allocated up to the deepest level of the tree of subspaces that contain significant coefficients, and the coefficients detected at the cutoff threshold were inserted into their appropriate locations. Then we reconstruct the best basis tree by the adjoints of the convolution-decimation operators, which produces part of the next deepest level. This procedure is repeated until the root node is reached. Fig. 4 shows the optimal wavelet packet bases for the "mit" image found using the entropy criterion and the joint rate-distortion criterion, respectively. The overall performance was evaluated after reconstruction from both best wavelet packet bases by computing the number of transmitted coefficients and the PSNR (Peak Signal-to-Noise Ratio), defined as $\mathrm{PSNR} = 10 \log_{10}(255^2/\mathrm{mse})$. The experimental results are summarized in Table 4. The results of the rate-distortion measure method in Table 4 indicate that even though the number of retained coefficients was increased, the overall picture quality does not directly reflect the significant coefficients' contribution to PSNR until a certain number of



Table 2. Best wavelet packets tree

Table 1. Best wavelet packets tree (a) Entropy Criterion at

a(3,O) w(3,l: 15=881 N=40( me= mse= j803.7 ___ 416.1 ~ ( 3 2 ) w(3,3: 15=435 N=3& mse= mse= 513.0 138.8 a(3,8) w(3,9: 15=426 N=B$ mse= mse= 75.0 50.8 __ v(3,lO: W(3J N=419 N=Xd mse= mse= 72.0 167.1 ~

0.1

(a) Entropy Criterion at

w(2,4) N = 574 mse = 39.8

w(2,5) N = 889 nse = 45.4


E =

E =

0.05


w(2,3) N = 941 mse = 37.4

w(2,6)

N=% mse = 35.9

w(2,7) N = 692 nse = 27.1

w(3,4) w(3,5) N=427 N=427 w(2,5) mse=1844.3mse=177.4 nse=39.2 mse=89.1 w(2,4) N=871 N=1169 w(3,2) w(3,3) w(3,6) w(3,7) mse=21.8 mse=25.4 N=543 N=506 N=450 N=428 mse=236.9 mse=57.60 nse=15.4 mse=42.3 w(3,16) ~(3,171 N=541 N=393 w(2,7) w(Z,3) w(2,6) mse=33.5 mse=25.0 N=1012 N=459 N=1304 ~(3,181 ~(3,191 mse=21.7 mse=13.8 mse=19.2 N=476 N=543 mse=72.00 mse=32.9


w(2,8) N = 1035 mse = 30.9

w(2,9) N = 980 mse = 10.7

W(2,lO) N=1299 mse = 35.1

W(2,ll) N = 994 mse = 17.1

w(2,12) N=352 mse=12.9

w(2,15) N=790 mse=13.1

(b) Rate-distortion criterion

Q=12 w(2,2) w(2,3) N=2163,mse=18.9N=1311,mse=19.2 Q=9 Q=8 w(1.2) N=2359,mse=56.0

Q=5

w(2,13) N=753 mse=9.2

N=1442 mse=68.0 Q=5

w(1,3) N=786, mse=33.4 Q=5

1

w(2,9) N=1330 mse=5.30

w(2,12) w(2,13) N=563 N=1047 mse=7.70 mse=5.10

W(2,ll) N=1356 mse=8.50

~(2,141 ~(2,151 N=1092 N=lE mse=33.6 mse=7.00

1

~(3,401 ~(3,411 N=454 N=473 mse=30.3 mse=11.6 w(3,43) w(3,42) N=W N=463 mse=17.3 mse=9.30

(b) Rate-distortion criterion w(2,O) W(2,l) N=4063,mse=4.0 N=2@5p==7.10 w(l,l) Q=12 Q=10 N=2556,mse=32.5 w(2,2) w(2,3) Q=6 N=2621,mse=6.80N=1704pse=8.70 Q=10 Q=9 w(1,2) w(13 N=786,mse=33.4 N=3801,mse=24.0 Q=6 I Q=5

1


coefficients are reached. As mentioned before, N is not the same as the bit rate R used in

Table 3. Best wavelet packets tree (a) Entropy Criterion at

E

w(3,O) w(3,l) N=984 N=623 ise=1844.3 mse=177.4

=

[7].

0.03 w(2,4) w(Z,5) N=lZ2 N=11535 mse=21.8 mse=25.4

w(3,2) w(3,3) N=653 N=599 nse=236.9 mse=57.60 w(3,8)

w(3,9) N=499 mse=33.5 mse=25.0

N=660

~(3,101 w(3,3) N=576 N=658 nse=72.00 mse=32.9

1

w(2,8) N=1775 mse=15.5

~(3,401 ~(3,411 N=559 mse=30.3 mse=ll.6 w(3,42) w(3,43) N=609 mse=17.3 mse=9.30

w(2,3) N=1717 mse=P9.2

w(2,6) w(2,7) N=737 N=1329 mse=21.7 mse=13.8

W(Z,9) N=1695 mse=5.30

w(2,12) ~ ( 2 , 1 3 1 N=834 N=1396 mse=7.70 mse=5.10

W(2,ll) N=1731 mse=8.50

~(2,141 w(2,15) N=312 N=l43O mse=33.6 mse=7.00

w(1,0) N=15991, mse=1.4 Q=12 w(1,2) N=4391, mse=17.6 Q=7

Entropy !Method

25,472


Another interesting observation on the selection of the Q type (i.e., the $\varepsilon$ value) is that even when only a small number of significant coefficients (1,386 coefficients) were added, the overall PSNR increased by more than 3 dB. Therefore the right choice of Q makes a significant contribution to image reconstruction quality (see Table 5) for a given coding budget, since there exists a "critical coefficients threshold" in each subspace that must be included in the best-basis tree for high PSNR. This property must be investigated further in order to help understand the relation between image quality and the critical coefficients threshold with respect to human perception.

Table 5. Selection of Q and its effect on quality

Q type          10            12
coefficients    N = 12,596    N = 13,981
PSNR (dB)       31.74         35.20

w(1,1) N=3604, mse=17.0 Q=8    w(1,3) N=1901, mse=13.9 Q=7

Measure 37.2 35.2
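The PSNR figures reported in the tables follow the definition PSNR = 10 log10(255^2/mse) given in Section 4. The helper below is a small sketch of that formula and its inverse for 8-bit images; the function names `psnr` and `mse_from_psnr` are hypothetical.

```python
import math


def psnr(mse, peak=255.0):
    """Peak signal-to-noise ratio in dB for a given mean square error."""
    return 10.0 * math.log10(peak * peak / mse)


def mse_from_psnr(psnr_db, peak=255.0):
    """Invert the PSNR definition to recover the mean square error."""
    return peak * peak / 10.0 ** (psnr_db / 10.0)


# Convert the two PSNR values listed in Table 5 back to mse.
for value in (31.74, 35.20):
    print(value, round(mse_from_psnr(value), 2))
```

Since the scale is logarithmic, a gain of about 3 dB corresponds to roughly halving the mse.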

5. Conclusions

Rate-di s t o r t i on IMeasure Method PSNR ( dB ) PSNR( dB) 25,887

no.of


An efficient scheme is proposed for representing images with the best wavelet basis obtained by the best basis search algorithm using the entropy information cost criterion and the rate-distortion criterion. The image quality reconstructed by the joint rate (N)-distortion (D) cost outperforms that of the entropy cost criterion by more than 10 dB in PSNR with respect to signal approximation by a wavelet packet basis. As the number of retained coefficients in the subspace increases, the corresponding distortion decreases monotonically in the case of the entropy criterion. On the other hand, in the case of the joint rate-distortion criterion, the distortion does not drop monotonically as the number of retained coefficients in the subspace increases within a certain range of the number of coefficients. This indicates that there may



exist a "critical coefficients threshold" peculiar to the retained basis subsets that must be included in the best basis tree for high PSNR. Further study must be performed on a systematic allocation of the minimally required coefficients in the best basis subsets to avoid a perceptual coefficients threshold effect.

Acknowledgment

Preprints and part of the routines for the entropy cost criterion were obtained by anonymous ftp from the Mathematics Department, Yale University, New Haven, CT. The author acknowledges useful discussions on the Lagrange multiplier method with Mr. Jinho Choi (KAIST).

References

1. R. Coifman, Y. Meyer, S. Quake, and M. V. Wickerhauser, "Signal Processing and Compression with Wavelet Packets," Proc. of the Conference on Wavelets, Marseilles, Spring, 1989
2. K. Ramchandran and M. Vetterli, "Best Wavelet Packets Bases Using Rate-Distortion Criteria," IEEE Symposium on Circuits and Systems, ISCAS'92, San Diego, CA, May, 1992
3. M. V. Wickerhauser, "Picture Compression by Best Basis Subband Coding," Preprint, Yale University, Jan., 1990
4. I. Daubechies, "Orthogonal Bases of Compactly Supported Wavelets," Communications on Pure and Applied Mathematics, XLI, 1988, pp. 909-996
5. M. V. Wickerhauser, "INRIA Lectures on Wavelet Packets Algorithms," Preprint, Yale University, Mar., 1991, pp. 40-52
6. M. V. Wickerhauser and R. Coifman, "Entropy Based Methods for Best Basis Selection," IEEE Trans. on Information Theory, Vol. 38, No. 2, Mar., 1992, pp. 713-718
7. K. Ramchandran and M. Vetterli, "Best Wavelet Packet Bases in a Rate-Distortion Sense," submitted to IEEE Trans. on Image Processing, 1992
8. P. Haddad, M. Barlaud, and P. Mathieu, "Optimization of Distortion-rate in Image Coding: An Application of Wavelet Packets," Signal Processing VI: Theories and Applications, Elsevier Science Publisher, 1992, pp. 1493-1496
9. I. H. Witten, R. Neal, and J. G. Cleary, "Arithmetic Coding for Data Compression," Comm. ACM, Vol. 30, June, 1987, pp. 520-540

Biographies

Daechul Park received the B.S. degree in Electronics Engineering from Sogang University, Seoul, in 1977, and the M.S. and Ph.D. degrees from the University of New Mexico, Albuquerque, NM, in 1985 and 1989, respectively. From 1989 to 1993, he was with ETRI (Electronics and Telecommunications Research Institute) as a senior researcher in the visual communications section. During 1991-1992, he worked at CTR (Center for Telecommunication Research), Columbia University, New York City, as a visiting scholar. In 1993 he joined Hannam University, Taejon, Korea, where he is currently an assistant professor in the Information and Communication Engineering department. His research interests are in image coding, video compression, and 3D video display.

Moon Ho Lee was born in Cheju, Korea, in 1945. He received the B.S. and M.S. degrees, both in Electrical Engineering, from Chonbuk National University in 1967 and 1976, respectively. He received the Ph.D. degree in Electronics Engineering from the University of Tokyo in 1990. He was a recipient of paper prize awards from KICS in 1986 and from KITE in 1992. He also studied at the University of Hannover during the summer of 1990 and at the University of Aachen during the winter of 1992 in Germany. Currently he is a professor and chairman in the Department of Information and Communication Engineering. His research interests include image processing, mobile communication, and high speed communication networks. Dr. Lee is a member of Sigma Xi.


Fig. 4 Optimal wavelet packet basis decompositions and reconstructions: (a) original "mit" test image, (b) best wavelet packet decomposition by entropy measure, (c) reconstructed "mit" image from (b), (d) best wavelet packet decomposition by D-N measure, (e) reconstructed "mit" image from (d)