A novel robust scaling image watermarking scheme

Expert Systems with Applications 42 (2015) 1960–1971


Expert Systems with Applications journal homepage: www.elsevier.com/locate/eswa

A novel robust scaling image watermarking scheme based on Gaussian Mixture Model

Maryam Amirmazlaghani a,⇑, Mansoor Rezghi b, Hamidreza Amindavar c

a Department of Computer Engineering and Information Technology, Amirkabir University of Technology, Tehran, Iran
b Department of Computer Science, Tarbiat Modares University, Tehran, Iran
c Department of Electrical Engineering, Amirkabir University of Technology, Tehran, Iran

Article info

Article history: Available online 24 October 2014

Keywords: Gaussian Mixture Model (GMM); Statistical modeling; Maximum Likelihood detector; Wavelet transform; L-curve method

Abstract

In this paper, we propose a novel scaling watermarking scheme in which the watermark is embedded in the low-frequency wavelet coefficients to achieve improved robustness. We demonstrate that these coefficients have significantly non-Gaussian statistics that are efficiently described by the Gaussian Mixture Model (GMM). By modeling the coefficients using the GMM, we calculate the distribution of the watermarked noisy coefficients analytically and design a Maximum Likelihood (ML) watermark detector using channel side information. We also extend the proposed watermarking scheme to a blind version. Since the efficiency of the proposed method depends on a good selection of the scaling factor, we propose an L-curve method to find the trade-off between the imperceptibility and robustness of the watermarked data. Experimental results demonstrate the high efficiency of the proposed scheme and the performance improvement obtained with the new strategy in comparison with some recently proposed techniques.

© 2014 Elsevier Ltd. All rights reserved.

⇑ Corresponding author. E-mail addresses: [email protected] (M. Amirmazlaghani), rezghi@modares.ac.ir (M. Rezghi), [email protected] (H. Amindavar).
http://dx.doi.org/10.1016/j.eswa.2014.10.015
0957-4174/© 2014 Elsevier Ltd. All rights reserved.

1. Introduction

Nowadays, easy access to the Internet makes digital media easy to distribute and share. However, this has made the protection and authentication of multimedia content and copyright a matter of great concern. Digital watermarking, which embeds hidden secondary data into digital multimedia products, has been applied as a technology for post-distribution protection of digital media. Imperceptibility and robustness are the two main requirements of watermarking schemes, and usually there is a trade-off between them. Watermarks play two categories of roles. In the first category, the main goal is to determine whether or not a specific watermark is present in the received media content (integrity verification) (Cheng & Huang, 2003; Merhav & Sabbag, 2008). In the second category, the embedded watermark is considered a hidden unknown message that should be decoded accurately (Akhaee, Sahraeian, Sankur, & Marvasti, 2009; Barni, Bartolini, Rosa, & Piva, 2003; Valizadeh & Wang, 2012). In this paper, we focus on watermarking for data hiding. Usually, a watermarking approach can be seen as a communication task consisting of two steps, watermark embedding and watermark retrieval (Mahbubur Rahman, Omair Ahmad, & Swamy, 2009).

There are several methods of watermark embedding, such as quantization-based (Chen & Wornell, 2001; Okman & Akar, 2007), additive (Mahbubur Rahman et al., 2009; Mairgiotis, Galatsanos, & Yang, 2008), and multiplicative (Barni, Bartolini, Rosa, & Piva, 2001; Cheng & Huang, 2003; Cox, Kilian, Leighton, & Shammoon, 1997; Ng & Garg, 2005). In multiplicative watermarks, the power of the watermark is proportional to the corresponding image feature samples. Multiplicative watermarks are therefore image-content dependent and more robust than additive watermarks. Another embedding approach is based on scaling. In scaling-based watermarking, the watermark data is embedded into the cover media by slightly scaling the cover (Akhaee et al., 2009).

The watermark is often embedded in a transform domain. The transforms usually employed for digital watermarking are the discrete Fourier transform (DFT) (Doncel, Nikolaidis, & Pitas, 2007), the discrete cosine transform (DCT) (Cheng & Huang, 2003), and the discrete wavelet transform (DWT) (Akhaee et al., 2009; Mahbubur Rahman et al., 2009). Wavelet-domain watermarking schemes can embed the watermark with a higher strength while preserving the imperceptibility requirement; hence, a large number of wavelet-domain watermarking schemes have been developed over the last decade (Cheng & Huang, 2003; Mahbubur Rahman et al., 2009; Nezhadarya, Wang, & Ward, 2011).


Wavelet-domain scaling watermarking was employed in Akhaee et al. (2009), and its high performance was confirmed by the reported experimental results. The weakness of this method is its use of a Gaussian model for the wavelet coefficients, which cannot capture wavelet densities efficiently (Allili, 2012; Amirmazlaghani, Amindavar, & Moghaddamjoo, 2009). In this paper, our goal is to propose and analyze a new wavelet-domain robust scaling watermarking scheme based on statistical modeling. Our proposed scheme preserves the desirable properties of the scaling watermarking approach (Akhaee et al., 2009) while extending the statistical model, which results in a better characterization of the low-pass image subbands and an improved watermarking scheme. The watermark data is inserted in the low frequency wavelet coefficients of the higher-entropy blocks of the image (Akhaee et al., 2009; Watson, Yang, Solomon, & Villasenor, 1997). On the decoder side, we use an ML detector in the wavelet domain. For the ML detector to be successful, the correct choice of prior for the wavelet coefficients is a very important factor. As mentioned above, in Akhaee et al. (2009) a Gaussian distribution was assumed for the low frequency wavelet coefficients. Although the Gaussian distribution leads to a simple detector and closed-form analysis, it is, as we discuss in Section 2.2, far from the real distribution of the low frequency coefficients. So, to design an efficient ML detector, we first study the statistical properties of the wavelet coefficients. We demonstrate through extensive experiments that the low frequency coefficients have significantly non-Gaussian statistics that can be modeled efficiently using the Gaussian Mixture Model (GMM). This model can capture both Gaussian and non-Gaussian data. Under this assumption, we propose our ML detector and analyze its performance.

Since our proposed method is based on scaling, tuning the scaling parameter is essential. There is a trade-off between imperceptibility and robustness in tuning this parameter. To resolve this trade-off, a multiobjective optimization method was suggested in Akhaee et al. (2009) and Akhaee, Sahraeian, and Marvasti (2010); it is based on constrained optimization and on manually tuning some parameters that affect the efficiency of the method. To overcome these limitations, we propose an L-curve method to select this parameter. Experimental results verify the efficiency of the proposed method. It should be mentioned that although the GMM has been used previously in the watermarking literature, in papers such as Yuan and Zhang (2006), Zhang, Li, and Wang (2008) and Yuan and Zhang (2004), our proposed approach is quite different from them because we use this model in a robust scaling watermarking framework.

The paper is organized as follows. In Section 2, we study the statistical modeling of the low frequency wavelet coefficients. The watermark insertion and detection processes are described in Sections 3 and 4, respectively. The L-curve method for adjusting the strength factor is studied in Section 5. A blind extension of our watermarking scheme is proposed in Section 6. Experimental results are reported in Section 7, and Section 8 concludes the paper.

2. Statistical modeling of the low frequency wavelet coefficients

The performance of our proposed watermarking scheme depends on efficiently modeling the low frequency wavelet coefficients. Therefore, this section is dedicated to studying the statistical properties of these coefficients. Previously, the Gaussian distribution was used for these coefficients in Akhaee et al. (2009). In the following, we first define the Gaussian Mixture Model (GMM), which includes the Gaussian model as a special case. Then, by studying the compatibility of the low frequency wavelet coefficients with the Gaussian and GM distributions, we demonstrate that the Gaussian distribution cannot capture the statistical properties of these coefficients, whereas the GM can model them efficiently. So, in the proposed method, we relax the limitation of assuming a Gaussian distribution for the low frequency wavelet coefficients and use the GMM.

2.1. Gaussian Mixture Model (GMM)

A random process $\mathbf{x}_i$ follows a GMM with $M$ components if

$$
f_{\mathbf{x}_i}(x_i) = \sum_{j=1}^{M} \frac{p_j}{\sqrt{2\pi}\,\sigma_j} \exp\left(-\frac{(x_i-\mu_j)^2}{2\sigma_j^2}\right), \tag{1}
$$

where $f_{\mathbf{x}_i}(\cdot)$ denotes the probability density function (PDF) of $\mathbf{x}_i$. Moreover, $\mu_j$, $\sigma_j$, and $p_j$ denote the mean, the standard deviation, and the weight of the $j$th Gaussian component, respectively. To abbreviate the notation, we use $\mathbf{x}_i \sim GM(\vec{p}, \vec{\mu}, \vec{\sigma})$, where

$$
\vec{p} = \begin{pmatrix} p_1 \\ p_2 \\ \vdots \\ p_M \end{pmatrix}, \qquad
\vec{\mu} = \begin{pmatrix} \mu_1 \\ \mu_2 \\ \vdots \\ \mu_M \end{pmatrix}, \qquad
\vec{\sigma} = \begin{pmatrix} \sigma_1 \\ \sigma_2 \\ \vdots \\ \sigma_M \end{pmatrix}.
$$
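As a concrete reading of the definition above, the following Python sketch evaluates the mixture density (1) and estimates its parameters with a plain EM loop, the standard estimator for such mixtures. The function names (`component_terms`, `gmm_pdf`, `em_fit`), the chunk-based initialisation, the iteration count, and the synthetic two-cluster data are illustrative assumptions and are not taken from the paper.

```python
import math
import random


def component_terms(x, weights, means, sigmas):
    """Weighted Gaussian terms p_j * N(x; mu_j, sigma_j^2), one per component."""
    return [
        p / (math.sqrt(2.0 * math.pi) * s) * math.exp(-((x - m) ** 2) / (2.0 * s * s))
        for p, m, s in zip(weights, means, sigmas)
    ]


def gmm_pdf(x, weights, means, sigmas):
    """Mixture density of Eq. (1) at a scalar x."""
    return sum(component_terms(x, weights, means, sigmas))


def em_fit(data, n_components, n_iter=50, min_sigma=1e-6):
    """Plain EM for a 1-D GMM (needs len(data) >= n_components).

    Returns (weights, means, sigmas, trace), where trace[k] is the
    log-likelihood at the start of iteration k.
    """
    xs = sorted(data)
    n = len(xs)
    # Assumed initialisation: one component per equal-size chunk of sorted data.
    chunk = n // n_components
    means = [sum(xs[k * chunk:(k + 1) * chunk]) / chunk for k in range(n_components)]
    sigmas = [((xs[-1] - xs[0]) / (2.0 * n_components)) or 1.0] * n_components
    weights = [1.0 / n_components] * n_components
    trace = []
    for _ in range(n_iter):
        # E-step: responsibility of each component for each sample.
        resp, loglik = [], 0.0
        for x in xs:
            terms = component_terms(x, weights, means, sigmas)
            total = max(sum(terms), 1e-300)  # guard against underflow
            loglik += math.log(total)
            resp.append([t / total for t in terms])
        trace.append(loglik)
        # M-step: re-estimate weight, mean and standard deviation per component.
        for j in range(n_components):
            nj = max(sum(r[j] for r in resp), 1e-300)
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, xs)) / nj
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, xs)) / nj
            sigmas[j] = max(math.sqrt(var), min_sigma)
    return weights, means, sigmas, trace


# Synthetic, clearly bimodal data (an assumption made only for this demo).
random.seed(0)
samples = [random.gauss(-5.0, 1.0) for _ in range(150)]
samples += [random.gauss(5.0, 1.0) for _ in range(150)]
w, mu, sd, trace = em_fit(samples, n_components=2)
print("weights:", w)
print("means:  ", mu)
print("sigmas: ", sd)
```

With a single component of weight 1, `gmm_pdf` reduces to the ordinary Gaussian density, which is the special case mentioned above.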

The Expectation Maximization (EM) algorithm is employed to estimate the GMM parameters.

2.2. GMM for the low frequency wavelet coefficients

In this section, we first demonstrate the inefficiency of the Gaussian distribution, previously used in Akhaee et al. (2009), for modeling the low frequency coefficients. Then, we check the compatibility of these coefficients with the GMM. We make use of histogram plots, which approximate PDFs, and the Kolmogorov–Smirnov test. We modeled a series of natural images, but because of the limited space and the similarity between the modeling results of different images, we report the results for some representative images only. We use the Daubechies (Db4) wavelet with three levels of decomposition. The histograms of the low frequency subbands at the third level of decomposition for eight high-entropy 16 × 16 blocks of the Cameraman image, together with the best-fitted Gaussian probability density function, are shown in Fig. 1. This figure also shows the best-fitted GMM with three components. It is obvious from Fig. 1 that the low frequency coefficients do not follow the Gaussian distribution, but they can be efficiently modeled by the GMM.

To quantify these results, we use the Kolmogorov–Smirnov (KS) test, a well-known nonparametric test for assessing the compatibility between the distribution of a sample set and a reference PDF. We performed the KS test on the low frequency coefficients of non-overlapping blocks of different natural images to study their compatibility with the Gaussian and GM distributions. The percentages of image blocks whose low frequency coefficients are compatible with each of these two distributions (Gaussian and GM) are summarized in Table 1. It is clear from this table that these coefficients do not follow a Gaussian distribution but are compatible with the GMM.

Fig. 1. Vertical bars show the normalized histogram of low frequency subbands at the third level of decomposition related to eight high entropy 16 × 16 blocks of Cameraman image. The best fitted Gaussian distribution is depicted in black solid lines and the histogram of the best fitted GMM is depicted in red dashed lines. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)

Table 1. Kolmogorov–Smirnov test results: the compatibility percentage of the image blocks' low frequency wavelet coefficients (subband L3) density with Gaussian and GM densities.

| Block size | Image | Gaussian (%) | Gaussian mixture (%) |
|---|---|---|---|
| 8 × 8 | Cameraman | 41.70 | 99.44 |
| 8 × 8 | Baboon | 46.07 | 99.73 |
| 8 × 8 | Barbara | 40.28 | 99.46 |
| 16 × 16 | Cameraman | 28.81 | 93.46 |
| 16 × 16 | Baboon | 37.30 | 99.12 |
| 16 × 16 | Barbara | 20.31 | 92.58 |

3. Watermark insertion

Suppose that we want to insert a message M into an image. This message is mapped by an encoder into a binary codeword vector. The image is divided into non-overlapping blocks. Entropy is a measure of the amount of information in a signal. Watson, Borthwick, and Taylor (1997) noticed that the complexity and uncertainty of the original image can increase the invisibility of the mark. They referred to this phenomenon as "entropy masking". So, after dividing the image into non-overlapping blocks, the higher-entropy blocks are selected. We then apply the DWT to each of the selected blocks and insert a single bit of the watermark in each block by scaling the low frequency wavelet coefficients. If $x_i$, $1 \le i \le N$, denotes the $i$th low frequency coefficient in a selected block of size $N$, we use the following formula to embed the watermark bit $b$ in this block:

$$
w_i = \begin{cases} x_i - c\,x_i, & \text{if } b = 0, \\ x_i + c\,x_i, & \text{if } b = 1, \end{cases} \qquad i = 1, \ldots, N, \tag{2}
$$

or equivalently

$$
w_i = x_i + (-1)^{b+1} c\,x_i, \qquad b = 0, 1; \quad i = 1, \ldots, N, \tag{3}
$$

where $w_i$ denotes the $i$th coefficient after embedding the watermark bit $b$. In other words, the low frequency coefficients are scaled by the factor $1 - c$ or $1 + c$ according to the watermark bit. Finally, to obtain the watermarked image, we perform the inverse DWT. In this paper, boldface type is used to identify random variables and normal fonts are used for deterministic variables.

The performance of the proposed method depends on efficiently tuning the scaling factors $1 \pm c$, or equivalently the parameter $c$. It is clear that $c$ controls the watermark strength, and we call $c$ the strength factor. An increase in the value of $c$ results in a more robust but more perceptible watermark, so it should be selected carefully. In Section 5, we propose an L-curve method to tune this factor efficiently.

4. Watermark detection

Because of the good properties of ML detectors, we use one in the decoder. The more accurate the probability density function (PDF) of the image coefficients incorporated in the detector, the higher the reliability of watermark detection at a predefined value of $c$ in our proposed scheme. In the following, we design and analyze a new detector based on the GMM. It should be mentioned that our proposed approach is semi-blind and requires side information containing the strength factor, the block positions, and the GMM parameters.
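To make the role of this side information concrete, the following Python sketch runs one block through a simplified round trip: the scaling rule (3), an additive white Gaussian noise channel, and a decision based on the log-likelihood ratio of the two bit hypotheses under a GMM prior. It is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the synthetic two-component GMM standing in for real low frequency wavelet coefficients, and the values of `c` and `sigma_n` are all hypothetical, and the DWT, block selection, and parameter estimation steps are omitted.

```python
import math
import random


def embed_block(coeffs, bit, c):
    """Eq. (3): scale every coefficient by 1 - c (bit 0) or 1 + c (bit 1)."""
    scale = 1.0 + (-1.0) ** (bit + 1) * c
    return [scale * x for x in coeffs]


def received_params(means, sigmas, c, sigma_n, bit):
    """GMM means and standard deviations of a received coefficient.

    Scaling by C_b scales each mean and standard deviation; adding
    independent Gaussian noise adds its variance to each component.
    """
    scale = 1.0 + (-1.0) ** (bit + 1) * c
    new_means = [scale * m for m in means]
    new_sigmas = [math.sqrt((scale * s) ** 2 + sigma_n ** 2) for s in sigmas]
    return new_means, new_sigmas


def log_gmm_pdf(y, weights, means, sigmas):
    """Log of a GMM density, using log-sum-exp for numerical stability."""
    logs = [
        math.log(p) - math.log(math.sqrt(2.0 * math.pi) * s) - (y - m) ** 2 / (2.0 * s * s)
        for p, m, s in zip(weights, means, sigmas)
    ]
    top = max(logs)
    return top + math.log(sum(math.exp(v - top) for v in logs))


def decode_block(received, weights, means, sigmas, c, sigma_n):
    """Decide the bit from the summed log-likelihood ratio of the block."""
    mu1, sd1 = received_params(means, sigmas, c, sigma_n, 1)
    mu0, sd0 = received_params(means, sigmas, c, sigma_n, 0)
    llr = sum(
        log_gmm_pdf(y, weights, mu1, sd1) - log_gmm_pdf(y, weights, mu0, sd0)
        for y in received
    )
    return 1 if llr > 0 else 0


def sample_gmm(n, weights, means, sigmas, rng):
    """Draw n samples from a GMM (stand-in for host coefficients)."""
    out = []
    for _ in range(n):
        j = rng.choices(range(len(weights)), weights=weights)[0]
        out.append(rng.gauss(means[j], sigmas[j]))
    return out


# Hypothetical side information: strength factor, GMM parameters, noise level.
rng = random.Random(0)
weights, means, sigmas = [0.5, 0.5], [800.0, 1000.0], [30.0, 30.0]
c, sigma_n = 0.05, 5.0
host = sample_gmm(64, weights, means, sigmas, rng)
for bit in (0, 1):
    noisy = [w + rng.gauss(0.0, sigma_n) for w in embed_block(host, bit, c)]
    print("embedded", bit, "decoded", decode_block(noisy, weights, means, sigmas, c, sigma_n))
```

The decoder never sees the host coefficients; it uses only the strength factor, the GMM parameters, and a noise estimate, which is why the scheme is described as semi-blind.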

4.1. Proposed ML decoder

We demonstrated that the GM distribution fits the empirical PDFs of the image low frequency wavelet coefficients reasonably well. Motivated by these results, we design an ML decoder for the proposed image watermarking scheme using the GM as the prior distribution. In the detection step of the proposed watermarking scheme, the decoder should decide whether a specific embedded watermark bit is equal to 0 or 1. However, one of the biggest challenges in watermark detection is that the strength of the watermark signal changes after it is distorted by an attacker in the watermarking channel. Many analyses leading to the optimum detection algorithm do not take possible attacks into account, and only experimental results have been considered to evaluate the real effectiveness of the detector (Barni et al., 2001). We model the attack noise as additive white Gaussian noise (AWGN) (Akhaee et al., 2009, 2010). Hence, the decoder receives the coefficients contaminated by noise, and using (3), we have:

$$
y_i = w_i + n_i = x_i + (-1)^{b+1} c\,x_i + n_i, \qquad b = 0, 1, \tag{4}
$$

where $y_i$ and $n_i$ denote the received coefficient and the attack noise, respectively. To design an ML decoder, we require the distribution of $y_i$ for $b = 0, 1$. Modeling the low frequency coefficients of the original image ($x_i$) using a GMM with $M$ components, from (1) we have:

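$$
\mathbf{x}_i \sim GM(\vec{p}, \vec{\mu}, \vec{\sigma}) \;\Rightarrow\; f_{\mathbf{x}_i}(x_i) = \sum_{j=1}^{M} \frac{p_j}{\sqrt{2\pi}\,\sigma_j} \exp\left(-\frac{(x_i-\mu_j)^2}{2\sigma_j^2}\right).
$$

According to (3), we can simply obtain the distribution of the watermarked coefficients $w_i$:

$$
\mathbf{w}_i \sim GM(\vec{p}, C_b\vec{\mu}, C_b\vec{\sigma}) \;\Rightarrow\; f_{\mathbf{w}_i}(w_i) = \sum_{j=1}^{M} \frac{p_j}{\sqrt{2\pi}\,C_b\sigma_j} \exp\left(-\frac{(w_i - C_b\mu_j)^2}{2C_b^2\sigma_j^2}\right),
$$

where $C_b = 1 + (-1)^{b+1} c$ for $b = 0, 1$. Since $\mathbf{w}_i$ and $\mathbf{n}_i$ are independent random variables and $y_i = w_i + n_i$, the PDF of $\mathbf{y}_i$ can be obtained by convolving the PDF of $\mathbf{w}_i$ with the PDF of $\mathbf{n}_i$:

$$
\begin{aligned}
f_{\mathbf{y}_i}(y_i) &= f_{\mathbf{w}_i}(y_i) * f_{\mathbf{n}_i}(y_i) \\
&= \int_{-\infty}^{\infty} \sum_{j=1}^{M} \frac{p_j}{\sqrt{2\pi}\,C_b\sigma_j} \exp\left[-\frac{(\alpha - C_b\mu_j)^2}{2C_b^2\sigma_j^2}\right] \frac{1}{\sqrt{2\pi}\,\sigma_n} \exp\left[-\frac{(y_i-\alpha)^2}{2\sigma_n^2}\right] d\alpha \\
&= \sum_{j=1}^{M} \int_{-\infty}^{\infty} \frac{p_j}{\sqrt{2\pi}\,C_b\sigma_j} \exp\left[-\frac{(\alpha - C_b\mu_j)^2}{2C_b^2\sigma_j^2}\right] \frac{1}{\sqrt{2\pi}\,\sigma_n} \exp\left[-\frac{(y_i-\alpha)^2}{2\sigma_n^2}\right] d\alpha,
\end{aligned} \tag{5}
$$

where $*$ stands for convolution and $\sigma_n$ denotes the standard deviation of the noise, which can be computed using the recommendation by Donoho (1994). Omitting some details and using the fact that the convolution of two Gaussian functions is again a Gaussian function, we obtain a GMM for $\mathbf{y}_i$:

$$
f_{\mathbf{y}_i}(y_i) = \sum_{j=1}^{M} \frac{p_j}{\sqrt{2\pi}\,\sigma_j^b} \exp\left[-\frac{(y_i - \mu_j^b)^2}{2(\sigma_j^b)^2}\right], \tag{6}
$$

where

$$
\mu_j^b = C_b\,\mu_j, \qquad (\sigma_j^b)^2 = C_b^2\sigma_j^2 + \sigma_n^2, \qquad \text{for } b = 0, 1.
$$

Employing the ML decision for $N$ samples in the low frequency subband, we have:

$$
f_{\mathbf{y}}(y_1, \ldots, y_N \mid s_1) \;\underset{0}{\overset{1}{\gtrless}}\; f_{\mathbf{y}}(y_1, \ldots, y_N \mid s_0), \tag{7}
$$

where $s_1$ and $s_0$ denote the embedding bit "1" and "0", respectively. Substituting (6) in (7) and assuming that the wavelet coefficients are i.i.d. (Akhaee et al., 2009), we have:

Let us define the function $g$ as:

$$
g_s(x) = \ln\left( \frac{\displaystyle\sum_{j=1}^{M} \frac{p_j}{\sqrt{2\pi}\,\sigma_j^1} \exp\left[-\frac{(x - \mu_j^1)^2}{2(\sigma_j^1)^2}\right]}{\displaystyle\sum_{j=1}^{M} \frac{p_j}{\sqrt{2\pi}\,\sigma_j^0} \exp\left[-\frac{(x - \mu_j^0)^2}{2(\sigma_j^0)^2}\right]} \right), \tag{9}
$$

where $s$ denotes the set of parameters of the function $g$, i.e., $\{\mu_j^b, \sigma_j^b, p_j\}_{j=1}^{M}$ for $b = 0, 1$. Using the definition in (9), we can rewrite the decision rule (8) as:

$$
\sum_{i=1}^{N} g_s(y_i) \;\underset{0}{\overset{1}{\gtrless}}\; 0. \tag{10}
$$

In the following section, we analyze the performance of the proposed ML detector.

4.2. Performance analysis of the proposed decoder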

9 9 8 2 2 N