Lossless Data Hiding Using Histogram Shifting Method Based on Integer Wavelets

Guorong Xuan¹, Qiuming Yao¹, Chengyun Yang¹, Jianjiong Gao¹, Peiqi Chai¹, Yun Q. Shi², and Zhicheng Ni²

¹ Dept. of Computer Science, Tongji University, Shanghai, P.R. China
² Dept. of Electrical & Computer Engineering, New Jersey Institute of Technology, Newark, New Jersey, USA

Abstract. This paper¹ proposes a histogram shifting method for lossless image data hiding in the integer wavelet transform domain. The algorithm hides data in the wavelet coefficients of the high-frequency subbands. It shifts part of the histogram of a high-frequency wavelet subband to create a histogram zero-point and embeds data at that zero-point. This shifting process may be carried out repeatedly if necessary. A histogram modification technique is applied to prevent overflow and underflow. The performance of the proposed technique, in terms of data embedding payload versus visual quality of the marked images, is compared with that of existing lossless data hiding methods implemented in the spatial domain, the integer cosine transform domain, and the integer wavelet transform domain. The experimental results demonstrate the superiority of the proposed method over the existing methods: it achieves a larger embedding payload at the same visual quality (measured by PSNR, the peak signal-to-noise ratio) or a higher PSNR at the same payload.

Keywords: Histogram Shifting, Lossless Data Hiding, Integer Wavelets.

1 Introduction

This paper focuses on lossless image data hiding, which requires not only correct retrieval of the hidden data but also restoration of the marked image to the original cover image without any distortion. Recently, Ni et al. [1,2] proposed a lossless image data hiding algorithm using pairs of zero-points and peak-points, in which part of the image histogram is shifted to embed data. Independently, Leest et al. [3] proposed a similar method. However, both methods are implemented in the spatial domain. It is well known that the histogram distribution varies dramatically from image to image. Consequently, it is hard for these two methods to achieve a high data embedding payload (often also referred to as capacity) with a reasonably high visual quality (often measured by PSNR, the peak signal-to-noise ratio).

Since the wavelet coefficients of high-frequency subbands have a Laplacian-like distribution, meaning that the histogram has a high peak around zero and small magnitudes on both sides, we propose to apply the histogram shifting technique in the wavelet domain. Because of the losslessness requirement, we chose to work in the integer wavelet transform domain. During the shifting of histograms of high-frequency integer wavelet subbands, overflow (e.g., a pixel grayscale value exceeding 255 for an 8-bit image) and/or underflow (e.g., a pixel grayscale value below 0 for an 8-bit image) may take place, thus violating the losslessness requirement. To overcome overflow and underflow, we adopt the histogram modification technique, which has been used in our previous works on lossless image data hiding using the integer wavelet transform [4, 5, 6, 7, 8]. Experiments have been conducted to compare the performance of the proposed technique with that of the existing techniques [1, 2, 7, 9, 10], showing the superiority of the proposed technique.

The rest of the paper is organized as follows. The integer wavelet transform and histogram modification are introduced in Section 2. In Section 3, the algorithm of wavelet histogram shifting is presented. Some experimental results are reported in Section 4. Conclusions are drawn in Section 5.

¹ This research is supported partly by National Natural Science Foundation of China (NSFC) on the project “The Research of Theory and Key Technology of Lossless Data Hiding (90304017)”.

Y.Q. Shi and B. Jeon (Eds.): IWDW 2006, LNCS 4283, pp. 323–332, 2006. © Springer-Verlag Berlin Heidelberg 2006

2 Integer Wavelet Transform and Histogram Modification

2.1 Integer Wavelet Transform

Since the original image must be reconstructed without distortion, we use the integer lifting scheme wavelet transform in this framework. Specifically, we adopt CDF(2,2) and the similar series used in the JPEG2000 standard [11]. Table 1 lists the forward and inverse CDF(2,2) integer wavelet transforms.

Table 1. CDF(2,2) integer wavelet transform

Forward transform
  Splitting:               s_i ← x_{2i};  d_i ← x_{2i+1}
  Dual lifting:            d_i ← d_i - {(s_i + s_{i+1})/2}
  Primal lifting:          s_i ← s_i + {(d_{i-1} + d_i)/4}

Inverse transform
  Inverse primal lifting:  s_i ← s_i - {(d_{i-1} + d_i)/4}
  Inverse dual lifting:    d_i ← d_i + {(s_i + s_{i+1})/2}
  Merging:                 x_{2i} ← s_i;  x_{2i+1} ← d_i

After the integer wavelet transform, the image is decomposed into four subbands. We embed the information into the three high-frequency subbands.
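The lifting steps of Table 1 can be illustrated with a minimal one-dimensional sketch. It rests on assumptions that are not stated in the paper: the braces in Table 1 are read as rounding down (floor), the signal has even length, and boundary samples are repeated at the edges. The function names cdf22_forward and cdf22_inverse are illustrative only. Because the inverse subtracts exactly the integer terms that the forward transform added, reconstruction is exact whatever rounding rule is chosen.

```python
import numpy as np

def cdf22_forward(x):
    """One level of 1-D CDF(2,2) integer lifting (even-length input assumed)."""
    x = np.asarray(x, dtype=np.int64)
    s = x[0::2].copy()                         # splitting: even samples
    d = x[1::2].copy()                         # splitting: odd samples
    s_next = np.concatenate((s[1:], s[-1:]))   # s_{i+1}, last sample repeated
    d -= (s + s_next) // 2                     # dual lifting (floor assumed)
    d_prev = np.concatenate((d[:1], d[:-1]))   # d_{i-1}, first sample repeated
    s += (d_prev + d) // 4                     # primal lifting (floor assumed)
    return s, d

def cdf22_inverse(s, d):
    """Undo cdf22_forward step by step, in reverse order."""
    s = np.array(s, dtype=np.int64)
    d = np.array(d, dtype=np.int64)
    d_prev = np.concatenate((d[:1], d[:-1]))
    s -= (d_prev + d) // 4                     # inverse primal lifting
    s_next = np.concatenate((s[1:], s[-1:]))
    d += (s + s_next) // 2                     # inverse dual lifting
    x = np.empty(s.size + d.size, dtype=np.int64)
    x[0::2] = s                                # merging
    x[1::2] = d
    return x
```

A two-dimensional transform would apply the same steps along rows and then along columns, which yields the four subbands mentioned above.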


2.2 Histogram Modification

For a given image, data embedding in some integer wavelet transform (IWT) coefficients may cause overflow or underflow; that is, after the inverse wavelet transform, the grayscale values of some pixels in the marked image may exceed the upper bound (255 for an eight-bit grayscale image) or fall below the lower bound (0 for an eight-bit grayscale image). To prevent overflow and underflow, we adopt histogram modification, which narrows the histogram from both sides as shown in Figure 1. Please refer to [8] for the detailed algorithm. The bookkeeping information is embedded into the cover media together with the information data.

Fig. 1. Grayscale histogram modification over the range 0 to 255: (a) original histogram, with the parts to be merged at both ends; (b) modified histogram after the merge, with G/2 marked at each end; (c) histogram after data embedding
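The idea of narrowing the histogram while keeping bookkeeping information can be illustrated with a deliberately simplified stand-in. This is not the merging algorithm of [8], whose details are not given here: the sketch simply clips pixel values into [G/2, 255 - G/2] and records the position and original value of every changed pixel so that the step can be undone. The names narrow_histogram and restore_histogram and the parameter g are hypothetical.

```python
import numpy as np

def narrow_histogram(image, g):
    """Clip an 8-bit image into [g//2, 255 - g//2] and keep undo information.

    Simplified stand-in for histogram modification, not the method of [8].
    """
    img = np.array(image, dtype=np.int64)
    low, high = g // 2, 255 - g // 2
    narrowed = np.clip(img, low, high)
    changed = np.flatnonzero(img.ravel() != narrowed.ravel())
    bookkeeping = (changed, img.ravel()[changed])   # positions, original values
    return narrowed, bookkeeping

def restore_histogram(narrowed, bookkeeping):
    """Put the recorded original values back at the recorded positions."""
    img = np.array(narrowed, dtype=np.int64)
    positions, originals = bookkeeping
    img.flat[positions] = originals
    return img
```

After narrowing, each pixel has a margin of g//2 gray levels before it would leave the range 0 to 255.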

3 Lossless Data Hiding Based on Integer Wavelet Histogram Shifting

3.1 Introduction to Wavelet Histogram Shifting

After the integer wavelet transform, the histograms of the high-frequency subbands, referred to as wavelet histograms in the rest of this paper, are calculated. The horizontal axis represents the wavelet coefficient value and the vertical axis the number of occurrences of that value. As mentioned, Ni et al. [1, 2] proposed the histogram shifting method in the spatial domain, while Leest et al. [3] independently proposed the histogram gap function method, also in the spatial domain.

In the following discussion, we consider the simple example shown in Figure 2 to demonstrate the principle of data embedding using histogram shifting. Figure 2 (a) is the original histogram of an integer wavelet high-frequency subband. In Figure 2 (b), a zero-point (a value that no coefficient in this subband assumes) has been created next to the value Z: we shift the part of the histogram with values larger than Z towards the right-hand side by one unit, so that the original value Z+1 becomes Z+2, the original Z+2 becomes Z+3, and so on, leaving Z+1 empty. The part of the histogram with values less than or equal to Z remains unchanged.

Fig. 2. An example showing how a zero-point is generated (the value Z is marked on the horizontal axis): (a) original histogram; (b) histogram after a zero-point is created
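The shift that creates the zero-point in Figure 2 can be sketched in a few lines. The helper names create_zero_point and remove_zero_point are illustrative; the second function assumes that no coefficient currently holds the value Z+1, which is the case before embedding or after all embedded bits have been removed.

```python
import numpy as np

def create_zero_point(coeffs, z):
    """Shift every coefficient larger than z one unit to the right."""
    c = np.array(coeffs, dtype=np.int64)
    c[c > z] += 1          # z+1 -> z+2, z+2 -> z+3, ...; z+1 becomes empty
    return c

def remove_zero_point(shifted, z):
    """Shift every coefficient >= z+2 back to the left (z+1 assumed empty)."""
    c = np.array(shifted, dtype=np.int64)
    c[c >= z + 2] -= 1
    return c
```

For the subband [0, 1, 1, 2, 3, -1, 2] and Z = 1, the shifted subband is [0, 1, 1, 3, 4, -1, 3]: the value 2 no longer occurs, and the two coefficients equal to Z are free to carry one bit each.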

In data embedding, we scan all of the IWT coefficients in the high-frequency subband. Once an IWT coefficient of value “Z” is encountered, if the to-be-embedded bit is “1”, the coefficient’s value is increased by 1, becoming “Z+1”. If the to-be-embedded bit is “0”, the coefficient’s value remains “Z”. Data extraction is the reverse process of data embedding. When an IWT coefficient of value “Z+1” is met, bit “1” is extracted and the coefficient’s value is reduced to “Z”. When a coefficient of value “Z” is met, bit “0” is extracted. After all data have been extracted, the part of the histogram equal to or larger than “Z+2” is shifted towards the left-hand side by one unit. Clearly, the histogram shifting can also be carried out towards the left-hand side. The payload is the number of occurrences of coefficients having value “Z” in the histogram. Note that the order in which the wavelet coefficients are scanned during data embedding can be controlled by a key in order to make the hidden data secure. If the number of to-be-embedded bits is large, multiple zero-points and the corresponding shifts are usually needed to accommodate the payload.

The process of data embedding and data extraction illustrated above can be summarized as follows. We first shift the histogram shown in Figure 2, starting from value “Z+1”, towards the right-hand side by one unit, leaving the value “Z+1” empty, i.e., creating a zero-point at “Z+1” in the histogram. Then, according to the to-be-embedded bit sequence, we either keep a coefficient having value “Z” unchanged (if embedding a bit “0”) or change it from value “Z” to value “Z+1” (if embedding a bit “1”). During data retrieval, we extract a bit “0” from each coefficient having value “Z” and a bit “1” from each coefficient having value “Z+1”, and we reduce the value of the latter from “Z+1” back to “Z”. After all the hidden bits have been extracted, we shift the part of the histogram larger than “Z+1” towards the left-hand side by one unit.

Since the histogram of IWT high-frequency subbands follows a Laplacian-like distribution, the algorithm can embed data on both sides of the histogram alternately until all the to-be-embedded bits are embedded. The proposed data embedding and data extraction algorithms are presented below in detail.

3.2 Data Embedding Algorithm

Assume there are M bits to be embedded into a high-frequency subband of the IWT. We embed the data in the following way, as shown in Figure 3.

Fig. 3. Data embedding flowchart (boxes: Choose T; Peak = T; Generate a zero-point & embed data; Finished?; Peak > 0?; Peak = -Peak; Peak = -Peak-1; S = Peak)

(1) Set a threshold T > 0 such that the number of high-frequency wavelet coefficients in [-T, T] is greater than M, and set Peak = T.
(2) In the wavelet histogram, move the part of the histogram with values greater than Peak to the right-hand side by one unit to leave a zero-point at the value Peak+1. Then embed data at this point.
(3) If there are to-be-embedded data remaining, let Peak = -Peak and move the part of the histogram with values less than Peak to the left-hand side by one unit to leave a zero-point at the value Peak-1 (that is, -Peak-1 in terms of the previous Peak). Then embed data at this point.
(4) If all the data are embedded, stop and record the Peak value as the stop peak value S. Otherwise, let Peak = -Peak-1 and go back to (2) to continue embedding the remaining data.

3.3 Data Extraction Algorithm

Data extraction is the reverse process of data embedding. Assume the stop peak value is S and the threshold is T. Figure 4 shows the data extraction flowchart.

Fig. 4. Data extraction flowchart (boxes: Peak = S; Extract data & backfill the zero-point; Finished?; Peak > 0?; Peak = -Peak; Peak = -Peak-1)

(1) Set Peak = S.
(2) Decode with the stopping value Peak. (In what follows, assume Peak > 0. If Peak < [...]

[...] > 0 and W > 0) can be classified into three categories. When the wavelet coefficient W < [...] > T, the modification of W is T-S+1. From the above, we conclude that if the payload is small (S is close to T), the majority of coefficients are in category 1, and hence the PSNR is high. If the payload is so large that S is close to 0, the advantage of the proposed method is not obvious.
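The embedding steps (1) to (4) of Section 3.2 and their reversal can be sketched as follows. Several points are assumptions of this sketch rather than statements from the paper: the peaks are visited in the order T, -T, T-1, -(T-1), ..., 1, -1, 0 and embedding stops at 0; coefficients are scanned in plain raster order (no key); the number of embedded bits is passed to the extractor as side information; and the extraction routine is derived here by undoing the embedding steps in reverse order. The names peak_sequence, embed and extract are illustrative.

```python
import numpy as np

def peak_sequence(t):
    """Peaks visited during embedding: T, -T, T-1, -(T-1), ..., 1, -1, 0."""
    seq = []
    for p in range(t, 0, -1):
        seq += [p, -p]
    return seq + [0]

def embed(coeffs, bits, t):
    """Embed bits into one subband; returns (marked subband, stop peak S)."""
    c = np.array(coeffs, dtype=np.int64).ravel()
    bits = np.asarray(bits, dtype=np.int64)
    pos = 0
    for peak in peak_sequence(t):
        step = 1 if peak >= 0 else -1
        outer = c > peak if step == 1 else c < peak
        c[outer] += step                       # open a zero-point at peak + step
        carriers = np.flatnonzero(c == peak)   # coefficients that can hold a bit
        k = min(carriers.size, bits.size - pos)
        c[carriers[:k]] += step * bits[pos:pos + k]
        pos += k
        if pos == bits.size:
            return c.reshape(np.shape(coeffs)), peak
    raise ValueError("payload exceeds the number of coefficients in [-T, T]")

def extract(marked, s, t, n_bits):
    """Recover the bits and the original subband from a marked subband."""
    c = np.array(marked, dtype=np.int64).ravel()
    seq = peak_sequence(t)
    chunks = []
    for peak in reversed(seq[:seq.index(s) + 1]):
        step = 1 if peak >= 0 else -1
        carriers = np.flatnonzero((c == peak) | (c == peak + step))
        chunks.append(np.abs(c[carriers] - peak))   # 0 at peak, 1 at peak + step
        c[carriers] = peak
        outer = c > peak + 1 if step == 1 else c < peak - 1
        c[outer] -= step                       # backfill the zero-point
    bits = np.concatenate(chunks[::-1])[:n_bits]
    return bits, c.reshape(np.shape(marked))
```

With the subband [0, 1, -1, 2, 0, -2, 1, 0, 3, -1], T = 1 and the bits 1, 0, 1, 1, embedding uses the peaks 1 and -1, stops with S = -1, and produces [0, 2, -2, 3, 0, -3, 1, 0, 4, -2]; extraction returns the four bits and the original subband. Coefficients beyond T on the positive side end up shifted once per positive peak visited, which is where the modification T-S+1 for large coefficients comes from.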

4 Some Experimental Results

Experiments on six frequently used images, i.e., Lena, Baboon, Airplane, House, Peppers, and Sailboat, are reported here. From Figure 5, we can see that the visual quality of the marked images is still acceptable when 131k bits are embedded into these 512x512x8 grayscale images, i.e., when the embedding payload is 0.5 bpp. Tables 3 and 4 show the PSNR for different payloads in the images Lena and Baboon. They show that an increase of the threshold T does not always lead to an increase of the payload. When the payload is smaller, we can choose a larger threshold T. Hence fewer coefficients are changed during data embedding and the resultant PSNR is higher. When the payload is larger, the total number of needed zero-points and the threshold T also need to be larger.

Figure 6 depicts the performance comparison between our method and several other state-of-the-art lossless data hiding methods, including Ni et al.’s method [1,2], Tian’s Difference Expansion [9], Companding on integer DCT [10], and Threshold Embedding [7]. Note that the performance in terms of PSNR versus data embedding payload achieved by threshold embedding [7] is superior to that achieved by the methods reported in [4, 5, 6]. Figure 6 therefore indicates that the method proposed in this paper has the best performance in terms of PSNR versus payload among these prior methods.
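As an aid for reading the payload and PSNR figures in this section, the two quantities can be computed with the standard definitions: payload in bits per pixel is the number of embedded bits divided by the number of pixels, and PSNR for an 8-bit image is 10 log10(255^2 / MSE). The function names below are illustrative and the code is a minimal sketch, not the evaluation code used for the reported results.

```python
import numpy as np

def payload_bpp(n_bits, height, width):
    """Embedding payload in bits per pixel."""
    return n_bits / (height * width)

def psnr_db(original, marked, peak=255.0):
    """Peak signal-to-noise ratio in dB between two images of equal shape."""
    a = np.asarray(original, dtype=np.float64)
    b = np.asarray(marked, dtype=np.float64)
    mse = np.mean((a - b) ** 2)
    if mse == 0:
        return float("inf")        # identical images
    return 10.0 * np.log10(peak ** 2 / mse)
```

For example, payload_bpp(131072, 512, 512) gives 0.5, matching the 0.5 bpp quoted above for a 512x512 image, and an image in which every pixel is off by exactly one gray level has a PSNR of about 48.13 dB.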

Fig. 5. PSNR of marked images with a payload of 0.5 bpp: (a) Lena: 41.07 dB, (b) Baboon: 31.18 dB, (c) Airplane: 42.71 dB, (d) House: 40.90 dB, (e) Peppers: 39.71 dB, (f) Sailboat: 36.91 dB

Table 3. Experimental results on Lena image

Payload (bpp)   PSNR (dB)   Threshold T   Stop value S
0.1             50.71       4              3
0.15            49.14       2              1
0.2             47.77       1              0
0.3             45.08       2             -1
0.4             42.85       3             -1
0.5             41.07       4              0
0.6             39.38       5              0

Table 4. Experimental results on Baboon image

Payload (bpp)   PSNR (dB)   Threshold T   Stop value S
0.1             45.31       3             -2
0.15            42.19       4             -2
0.2             40.16       3              0
0.3             39.61       3              0
0.4             33.80       8              0
0.5             31.18       13             0

Fig. 6. Performance comparison of the histogram shifting method vs. several other state-of-the-art lossless data hiding methods on the Lena image (PSNR in dB versus payload size in bpp; curves: the proposed method, Difference Expansion, Companding on DCT, Threshold Embedding, Ni's method)

Tables 5 and 6 give the detailed comparison with Ni et al.’s method. Table 5 shows that our method has a higher PSNR at the same payload. Table 6 shows that, at the same PSNR, the payload of our method is about three to nine times that of Ni et al.’s method. Hence, our method performs better than Ni et al.’s method.

Table 5. PSNR for same payload

Image      Payload (bits)   PSNR (dB), ours   PSNR (dB), Ni et al.’s
Lena       5,460            56.90             48.2
Airplane   16,171           53.60             48.3
Baboon     5,421            51.80             48.2
Peppers    5,449            55.29             48.2
Sailboat   7,301            52.79             48.2
House      14,310           53.74             48.3

Table 6. Payload for same PSNR

Image      PSNR (dB)   Payload (bits), ours   Payload (bits), Ni et al.’s
Lena       48.2        47,186                 5,460
Airplane   48.3        65,536                 16,171
Baboon     48.2        17,040                 5,421
Peppers    48.2        39,321                 5,449
Sailboat   48.2        31,457                 7,301
House      48.3        57,664                 14,310


5 Summary

This paper proposed a novel lossless data hiding method based on histogram shifting, the integer wavelet transform, and histogram modification. The experimental results and theoretical analysis show that the proposed method performs better than comparable methods in the spatial domain, the integer DCT domain, and the integer wavelet domain. The proposed method has a larger payload at the same PSNR. In particular, it has a very high PSNR when the payload is small.

References

1. Z. Ni, Y. Q. Shi, N. Ansari and W. Su: Reversible Data Hiding. IEEE International Symposium on Circuits and Systems (ISCAS03), May 2003, Bangkok, Thailand.
2. Z. Ni, Y. Q. Shi, N. Ansari and W. Su: Reversible Data Hiding. IEEE Transactions on Circuits and Systems for Video Technology, vol. 16, no. 3, pp. 354-362, March 2006.
3. A. Leest, M. Veen, and F. Bruekers: Reversible Image Watermarking. IEEE Proceedings of ICIP’03, vol. 2, pp. 731-734, September 2003.
4. G. Xuan, J. Zhu, J. Chen, Y. Q. Shi, Z. Ni and W. Su: Distortionless Data Hiding Based on Integer Wavelet Transform. IEE Electronics Letters, vol. 38, no. 25, pp. 1646-1648, December 2002.
5. G. Xuan, Y. Q. Shi, Z. Ni: Lossless Data Hiding Using Integer Wavelet Transform and Spread Spectrum. IEEE International Workshop on Multimedia Signal Processing (MMSP04), Siena, Italy, September 2004.
6. G. Xuan, Y. Q. Shi, Z. Ni: Reversible Data Hiding Using Integer Wavelet Transform and Companding Technique. Proceedings of International Workshop on Digital Watermarking (IWDW04), Korea, October 2004.
7. G. Xuan, Y. Q. Shi, C. Yang, Y. Zheng, D. Zou, P. Chai: Lossless Data Hiding Using Integer Wavelet Transform and Threshold Embedding Technique. IEEE International Conference on Multimedia and Expo (ICME05), Amsterdam, Netherlands, July 2005.
8. G. Xuan, C. Yang, Y. Q. Shi and Z. Ni: High Capacity Lossless Data Hiding Algorithms. IEEE International Symposium on Circuits and Systems (ISCAS04), Vancouver, Canada, May 2004.
9. J. Tian: Reversible Data Embedding Using a Difference Expansion. IEEE Transactions on Circuits and Systems for Video Technology, Aug. 2003, 890-896.
10. B. Yang, M. Schmucker, W. Funk, C. Busch, S. Sun: Integer DCT-based Reversible Watermarking for Images Using Companding Technique. Proceedings of SPIE, Security and Watermarking of Multimedia Content, Electronic Imaging, San Jose (USA), 2004.
11. M. Rabbani and R. Joshi: An Overview of the JPEG2000 Still Image Compression Standard. Signal Processing: Image Communication 17 (2002) 3–48.