Fast Johnson-Lindenstrauss Transform for Robust and Secure Image Hashing

Xudong Lv, Z. Jane Wang
Department of Electrical and Computer Engineering, University of British Columbia, Vancouver, Canada
[email protected], [email protected]
Abstract—Dimension-reduction-based techniques, such as singular value decomposition (SVD) and non-negative matrix factorization (NMF), have been shown to provide excellent performance for robust and secure image hashing by retaining the essential features of the original image matrix while resisting intentional attacks. In this paper, we introduce a recently proposed low-distortion dimension reduction technique, referred to as the Fast Johnson-Lindenstrauss Transform (FJLT), and propose its use for image hashing. FJLT shares the low-distortion characteristics of a random projection but requires much lower computational complexity; these two properties make it well suited to image hashing. Our experimental results show that the proposed FJLT-based hash is robust under a wide range of attacks. Furthermore, the influence of the secret key on the proposed hashing algorithm is evaluated with a receiver operating characteristic (ROC) graph, confirming the effectiveness of the proposed approach.

I. INTRODUCTION

Because multimedia data are easy to copy, their tremendous growth and wide spread have led to an urgent, challenging research issue: how to efficiently manage such abundant data (e.g. fast media searching, indexing and identification) while also providing sufficient protection of intellectual property. Among the variety of available techniques, image hashing, a content-based compact representation of an image [1], has proven to be an efficient tool because of its robustness and security. For example, image hashing provides an alternative to watermarking for image authentication, and can achieve better robustness against various attacks than watermarking, which is sensitive to the position and intensity of the embedded information. Unlike a traditional cryptographic hash, an image hash is not sensitive to graceful degradation of the original data, because of two desired properties: perceptual robustness and fragility to visually distinct images [2]-[4]. The robustness property requires that two images that are perceptually identical to the human visual system (HVS) map to the same hash value, while perceptually distinct images map to different hash values. An image hash is thus a compact, exclusive, secret-key-dependent feature descriptor of a specific image. In general, the first step in image hash construction is feature extraction [1]. Because the hash is built on features extracted from the image content itself, it offers superior robustness

978-1-4244-2295-1/08/$25.00 © 2008 IEEE

performance against geometric attacks and other malicious manipulations. The more invariant the extracted features are, the more robust the hash scheme is. However, generating the hash from the features alone could leave the scheme susceptible to forgery attacks. Therefore, key-dependent randomization should be incorporated into the hash scheme as a second step to guarantee security. Various approaches for constructing image hashes have been proposed in the literature, though none is robust against all types of attacks and distortions. For example, the Radon soft hash algorithm (RASH) [5] is robust against geometric transformations and some image-processing attacks by using the Radon transform and principal component analysis (PCA). Swaminathan's hashing scheme [6] incorporates pseudo-randomization into the Fourier-Mellin transform to achieve better robustness to geometric operations. Generating hash values by detecting invariant feature points has also been proposed [3], though the expensive feature search, and the removal of feature points by attacks such as cropping and blurring, limit its performance in practice. Recently, several hashing schemes based on dimension reduction were developed that outperform previous ones. Using low-rank matrix approximations obtained via singular value decomposition for hashing was explored in [7]; its robustness against geometric attacks motivates other solutions in this direction. Monga et al. [4] introduced another dimension reduction technique, non-negative matrix factorization (NMF), into their hash algorithm. Based on the results in [4] and our own experiments, NMF hashing possesses excellent robustness under a large class of perceptually insignificant attacks while significantly reducing misclassification of perceptually distinct images.
Inspired by these promising dimension reduction techniques, we propose a new robust and secure image hashing algorithm based on the Fast Johnson-Lindenstrauss transform (FJLT) [8], a recently proposed dimension reduction technique. FJLT shares the low-distortion characteristics of a random projection but requires lower computational complexity. Our experimental results show that FJLT hashing is robust to various manipulations such as noise, blurring, JPEG compression and geometric attacks. It is also well suited to practical implementation because of its computational efficiency, while its random projection provides high security.


Authorized licensed use limited to: The University of British Columbia Library. Downloaded on February 1, 2010 at 01:24 from IEEE Xplore. Restrictions apply.

MMSP 2008

The rest of this paper is organized as follows. Section II introduces the background and theoretical details of FJLT. Section III describes the proposed hash algorithm based on random sampling and FJLT. Section IV presents our experimental results and uses an ROC graph to evaluate the influence of the secret key on the proposed hashing scheme.

II. FAST JOHNSON-LINDENSTRAUSS TRANSFORM (FJLT): BACKGROUND AND THEORY

The Johnson-Lindenstrauss (JL) theorem has found numerous applications, including approximate nearest neighbour (ANN) search [8] and dimension reduction in databases. By the JL lemma [9], n points in Euclidean space can be projected from the original d dimensions down to k = O(ε⁻² log n) dimensions while incurring a distortion of at most 1 ± ε in their pairwise distances, where 0 < ε < 1. Based on this result, Ailon and Chazelle [8] proposed a new low-distortion embedding of ℓ₂^d into ℓ_p^k (p = 1, 2), called the Fast Johnson-Lindenstrauss transform (FJLT), which preconditions a sparse projection matrix with a randomized Fourier transform. We consider only the ℓ₂ case (p = 2) because our hash distances are measured in the ℓ₂ norm; for the ℓ₁ case, interested readers may refer to [8].

A. Fast Johnson-Lindenstrauss Transform

Briefly speaking, FJLT is a random embedding Φ = FJLT(n, d, ε) that can be obtained as the product of three real-valued matrices:

Φ = PHD. (1)

The matrices P and D are random and H is deterministic [8].

• P is a k-by-d matrix whose elements P_ij are drawn independently according to the following distribution, where N(0, q⁻¹) denotes a normal distribution with zero mean and variance q⁻¹:

  P_ij ~ N(0, q⁻¹) with probability q,
  P_ij = 0 with probability 1 − q,

  where

  q = min{ c log²(n) / d, 1 }

  for a large enough constant c.

• H is a d-by-d normalized Hadamard matrix with elements

  H_ij = d^(−1/2) (−1)^⟨i−1, j−1⟩,

  where ⟨i, j⟩ is the dot product of the m-bit vectors i, j expressed in binary.

• D is a d-by-d diagonal matrix, where each diagonal element D_ii is drawn independently from {−1, 1} with probability 0.5.
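The construction of Φ = PHD from the three matrices above can be sketched in NumPy as follows. This is a minimal illustration, not the authors' implementation: the constants c (inside q) and c′ (inside k, taken as 1 here) are only required to be "large enough" in the paper, and the defaults below are assumptions.

```python
import numpy as np

def fjlt(n, d, eps, c=1.0, seed=None):
    """Sketch of the FJLT embedding Phi = P H D (l2 case).

    d must be a power of two (for the Hadamard matrix). The constants
    c (inside q) and c' (inside k) are illustrative guesses.
    """
    assert d & (d - 1) == 0, "d must be a power of two"
    rng = np.random.default_rng(seed)
    k = max(1, int(np.ceil(eps ** -2 * np.log(n))))  # c' = 1 assumed

    # D: d x d diagonal matrix of independent +/-1 signs
    D = np.diag(rng.choice([-1.0, 1.0], size=d))

    # H: normalized d x d Hadamard matrix (Sylvester construction)
    H = np.array([[1.0]])
    while H.shape[0] < d:
        H = np.block([[H, H], [H, -H]])
    H /= np.sqrt(d)

    # P: sparse k x d matrix; entry ~ N(0, 1/q) with probability q, else 0
    q = min(c * np.log(n) ** 2 / d, 1.0)
    mask = rng.random((k, d)) < q
    P = mask * rng.normal(0.0, np.sqrt(1.0 / q), size=(k, d))

    return P @ H @ D
```

Since H is applied by matrix multiplication here, this sketch does not attain the fast O(d log d) transform time of the real FJLT; a fast Walsh-Hadamard transform would be used in practice.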

Therefore, Φ = FJLT(n, d, ε) is a k-by-d matrix, where d is the original dimension of the data and k is the lower dimension, set to c′ε⁻² log n. Here, n is the number of data points, ε is the distortion rate, and c′ is a constant.

B. The FJLT Lemma

Lemma [8]: Fix any set X of n vectors in ℝ^d and 0 < ε < 1, and let Φ = FJLT(n, d, ε). With probability at least 2/3, the following two events occur:

1) For all x ∈ X,
   (1 − ε) k ‖x‖₂ ≤ ‖Φx‖₂ ≤ (1 + ε) k ‖x‖₂.

2) The mapping Φ : ℝ^d → ℝ^k requires
   O(d log d + min{ d ε⁻² log n, ε⁻² log³ n }) operations.

A proof can be found in [9]. Note that the success probability (at least 2/3) arises from the random projection and can be amplified to 1 − δ for any δ > 0 by repeating the construction O(log(1/δ)) times [8]. Since the random projection in our case is a pseudo-random process determined by a secret key, the lemma means that most keys (at least a 2/3 fraction) satisfy the distortion bound and can be used in our hashing algorithm. The FJLT therefore makes our scheme widely applicable across keys and suitable for practical use; we evaluate this property by ROC analysis later.

III. IMAGE HASHING VIA FJLT

Motivated by the hashing schemes based on SVD [7] and NMF [4], we regard dimension reduction as a key approach for capturing essential features that are invariant to geometric manipulations and other image-processing attacks. Three advantages of FJLT facilitate its application to hashing. First, FJLT is a random projection, which enhances the security of the hashing scheme. Second, its low distortion supports robustness to most malicious attacks. Third, its computational cost in practice is low. Therefore, we propose using FJLT for hashing.
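The distortion bound in the FJLT lemma can be illustrated empirically. As a hedged sketch we use a plain dense Gaussian random projection as a simpler JL-type stand-in for Φ (entries N(0, 1), so E‖Φx‖² = k‖x‖²); the dimensions and ε below are illustrative choices, not the paper's settings.

```python
import numpy as np

# Gaussian random projection as a JL-type stand-in for Phi
rng = np.random.default_rng(0)
d, k, eps = 1024, 1000, 0.3
Phi = rng.normal(size=(k, d))

# Squared-norm ratio for a random test vector: chi-square concentration
# makes it land close to 1, i.e. within the [1 - eps, 1 + eps] band.
x = rng.normal(size=d)
ratio = np.linalg.norm(Phi @ x) ** 2 / (k * np.linalg.norm(x) ** 2)
```

For the set of n points in the lemma, the same check would be repeated per pairwise difference vector; concentration tightens as k grows (the relative deviation scales roughly as √(2/k)).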
Given an image, the proposed hashing algorithm consists of three steps: random sampling, dimension reduction by FJLT, and random weight incorporation.

A. Random Sampling

The idea of selecting a few subimages as the original feature by random sampling is not new [4], [7], as shown in Fig. 1. However, we treat a subimage as a point in a high-dimensional space rather than as a two-dimensional matrix as in SVD [7] and NMF [4]. For instance, Fig. 1(b) shows an m-by-m subimage. In our


case, the subimage is a point in m² dimensions (colour images are first converted to grayscale). Given a colour image, we convert it to a grayscale image and pseudo-randomly select N subimages depending on the key, obtaining Sᵢ ∈ ℝ^(m²) for 1 ≤ i ≤ N. Each Sᵢ is an m² × 1 vector formed by concatenating the columns of the corresponding subimage. We then construct our original feature as

Feature = {S₁, S₂, …, S_N}, with size m² × N. (2)

The advantage of forming such a feature is that the Feature matrix captures global information while each component Sᵢ captures local information. Even if we lose portions of the original image to geometric attacks such as cropping, only one or a few components of the Feature matrix are affected, with little influence on the global information. However, the Feature matrix is too large to store and match, which motivates us to employ a dimension reduction technique.
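The sampling step above can be sketched as follows. The uniform choice of top-left corners is an assumption for illustration; the paper only states that the positions depend pseudo-randomly on the key.

```python
import numpy as np

def random_sampling(gray, m=64, N=20, key=5):
    """Pseudo-randomly select N m-by-m subimages from a grayscale image
    and stack each, column-concatenated, as an m^2-vector, giving the
    m^2 x N Feature matrix of Eq. (2)."""
    rng = np.random.default_rng(key)
    h, w = gray.shape
    cols = []
    for _ in range(N):
        r = int(rng.integers(0, h - m + 1))  # top-left row of subimage
        c = int(rng.integers(0, w - m + 1))  # top-left column
        cols.append(gray[r:r + m, c:c + m].flatten(order="F"))
    return np.column_stack(cols)  # shape (m*m, N)
```

For a 256×256 image with m = 64 and N = 20, this yields a 4096 × 20 Feature matrix, each column one subimage-point.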

Fig. 1. An example of random sampling for Lena. (a) Original Lena. (b), (c) Subimages selected by random sampling with size m-by-m.

B. Dimension Reduction by FJLT

Based on the analysis of Section II, FJLT can capture the essential features of the original data in a low dimension with low distortion. Recalling the construction Φ = FJLT(n, d, ε), we first obtain the three real-valued matrices P, H, and D from the parameters (N, m², ε), where H is deterministic while P and D depend pseudo-randomly on the secret key. The lower dimension k is set to cε⁻² log N, with c a constant. We then obtain our intermediate hash (IH) as

IH = Φ(Feature) = PHD × Feature, with size k × N. (3)

Here, the advantage of FJLT is that we can tune the lower dimension k by adjusting the number of data points and the distortion rate ε, which gives us room to seek better performance. However, the smaller ε is, the larger k becomes, so a trade-off is needed.

C. Random Weight Incorporation

Although the original feature has been mapped to a lower dimension with small distortion, the intermediate hash is still somewhat large: if we set N = 20, ε = 0.1 and c′ = 2, the size of IH will be 600-by-20. Following the NMF-NMF-SQ hashing in [4], we introduce pseudo-random weight vectors {wᵢ}, i = 1, …, N, with wᵢ ∈ ℝᵏ, and obtain our final secure hash as

Hash = {⟨IH₁, w₁⟩, ⟨IH₂, w₂⟩, …, ⟨IH_N, w_N⟩}, (4)

where IHᵢ is the ith column of IH and ⟨IHᵢ, wᵢ⟩ is the inner product of IHᵢ and wᵢ. However, the weights wᵢ would distort the distance between two hash vectors IHᵢ and IHᵢ′ and degrade the later classification accuracy. A favourable solution is to sort the elements of IHᵢ and wᵢ in descending order before taking the inner product, ensuring that large weights are assigned to large components. The perceptual quality of the hash vector is thus retained by minimizing the effect of the weights. The final hash is a 1-by-N vector per image, which is very compact and secure. We exhibit its robustness to a large class of manipulations in our experimental results.

IV. EXPERIMENT AND ANALYSIS

A. Database and Results

To evaluate the performance of our hashing algorithm, we apply it to the miscellaneous volume of the USC-SIPI image database, which includes classic benchmark images (such as Lena, Baboon, Pepper and House). For each original colour image of size 256×256, we generate 50 similar attacked versions using Photoshop. The seven classes of manipulations and attacks considered are: additive Gaussian noise, additive uniform noise, Gaussian blurring, JPEG compression, rotation, cropping, and scaling. The corresponding parameters and PSNR values are listed in Table I.

TABLE I
PARAMETERS AND PSNR FOR MANIPULATED IMAGES

Manipulation                    | Baboon      | Lena        | Pepper      | House
Gaussian Noise (5%~40%, 8)      | 26.01~10.03 | 25.91~10.11 | 26.15~10.24 | 26.07~10.13
Uniform Noise (5%~40%, 8)       | 30.0~13.1   | 29.97~13.24 | 29.96~13.48 | 30.04~13.31
Gaussian Blurring (0.5~4.0, 8)  | 27.22~20.05 | 33.23~21.97 | 33.26~20.73 | 29.59~19.40
JPEG Compression                | QF = {1, 3, 5, 7, 10, 15, 25, 50}
Rotation                        | 5°, 10°, 15°, 20°, 25°, 30°, 45°
Cropping                        | 5%, 10%, 20%, 30%, 35%, 40%
Scaling                         | 25%, 50%, 75%, 150%, 200%
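The random weight incorporation of Section III-C (Eq. 4), which compresses the k × N intermediate hash IH = Φ(Feature) into the final 1 × N hash, can be sketched as below. The uniform weight distribution is an assumption; the paper does not specify how the wᵢ are drawn.

```python
import numpy as np

def weighted_hash(IH, key=5):
    """Eq. (4): compress the k x N intermediate hash IH to a length-N
    final hash. Each column and its pseudorandom weight vector are
    sorted in descending order before the inner product, so large
    weights meet large components (Section III-C)."""
    k, N = IH.shape
    rng = np.random.default_rng(key)
    W = rng.random((k, N))  # weight vectors w_i, one column each (assumed uniform)
    out = np.empty(N)
    for i in range(N):
        ih_sorted = np.sort(IH[:, i])[::-1]  # descending order
        w_sorted = np.sort(W[:, i])[::-1]
        out[i] = ih_sorted @ w_sorted        # <IH_i, w_i> after sorting
    return out
```

With the paper's example sizes (IH of size 600 × 20), this yields a 20-element hash vector per image.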

After generating the hash vectors for the image database, we use the Euclidean distance to measure the distance between the hash vectors of the original images and the manipulated


images, and then solve the classification problem using popular classifiers such as K-nearest neighbours (KNN). Ideally, to provide robustness to malicious attacks, a distorted image should be classified to its corresponding original image no matter what manipulation is applied. In the experiments, we test the proposed algorithm on the database above and compare the results with the NMF hashing in [4], since NMF hashing has been shown to outperform other existing algorithms. The results are shown in Table II.

TABLE II
IDENTIFICATION ACCURACY RATE FOR MANIPULATED IMAGES

Manipulation                 | FJLT   | NMF
Gaussian Noise (5%~40%)      | 100%   | 93.75%
Uniform Noise (5%~40%)       | 100%   | 100%
Gaussian Blurring (0.5~4.0)  | 100%   | 100%
JPEG Compression             | 100%   | 100%
Rotation                     | 96.43% | 75%
Cropping                     | 100%   | 91.67%
Scaling                      | 85%    | 75%
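The identification step described above, Euclidean distance to each original's hash with the nearest one winning, can be sketched as follows; the dict-based interface is an illustrative choice.

```python
import numpy as np

def identify(query_hash, original_hashes):
    """Nearest-neighbour identification as in Section IV-A: assign the
    (possibly attacked) image to the original whose hash vector is
    closest in Euclidean distance. original_hashes maps name -> vector."""
    names = list(original_hashes)
    dists = [np.linalg.norm(query_hash - original_hashes[n]) for n in names]
    return names[int(np.argmin(dists))]
```

Reading Tables III and IV column-wise is exactly this rule: a column is correctly identified when its minimum distance lies on the diagonal.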

Following the steps in Section III, we test our algorithm with the parameters m = 64, N = 20, ε = 0.1, and key = 5. For the NMF, the parameters are m = 64, p = 20, r₁ = 2, r₂ = 1, and M = 20, as in [4]. Each image is therefore represented by a hash vector of length 20. From the results above, the proposed FJLT hashing achieves better performance, with an additional advantage in computational cost: computing the hash vectors for 50 images takes about 300 seconds with NMF-NMF-SQ but only about 90 seconds with the proposed FJLT.

To further evaluate the performance of the proposed algorithm, we show four example images in Fig. 2, where each image is jointly manipulated by rotation by 25 degrees, cropping, Gaussian noise 30%, and JPEG compression with QF = 5. The results are listed in Table III. We observe that the diagonal elements are the minimum in each column, indicating that the proposed algorithm remains robust to the strong combined attacks illustrated in Fig. 2. The same test was applied to NMF, as shown in Table IV. NMF also achieves good identification results except for Fig. 2 (a) and (b), which are misclassified as Red House and Pepper, respectively. This example thus indicates that the proposed FJLT hashing provides at least comparable performance to NMF.

Fig. 2. Strongly jointly-attacked examples by rotation, cropping, noise, and JPEG compression: (a) Baboon, (b) Lena, (c) Peppers and (d) House

TABLE III
HASH DISTANCE BETWEEN MANIPULATED AND ORIGINAL IMAGES BY FJLT

Original Images | (a)     | (b)     | (c)     | (d)
Baboon          | 816.62  | 1509.17 | 1146.44 | 1354.07
Lena            | 1116.97 | 1565.42 | 916.56  | 1756.72
Pepper          | 1350.73 | 1231.35 | 590.47  | 2003.7
House           | 1730.12 | 2267.36 | 2165.72 | 985.11
Airplane        | 2587.55 | 2857.36 | 2940.95 | 1541.56
Red House       | 1531.53 | 1500.57 | 1230.36 | 1680.19

TABLE IV
HASH DISTANCE BETWEEN MANIPULATED AND ORIGINAL IMAGES BY NMF

Original Images | (a)     | (b)     | (c)     | (d)
Baboon          | 1152.46 | 3041.63 | 3072.38 | 2206.89
Lena            | 1512.33 | 515.18  | 732.47  | 4796.23
Pepper          | 1962.01 | 412.65  | 474.3   | 5222.59
House           | 4265.77 | 6148.70 | 6190.24 | 975.15
Airplane        | 7195.54 | 9075.38 | 9118.24 | 3866.92
Red House       | 489.78  | 1998.39 | 2034.07 | 3273.65

B. ROC Analysis for the Secret Key

Another important property of a hashing algorithm is stable performance under different secret keys: the more keys that can be used, the higher the security obtained. Based on our analysis of the FJLT lemma, our scheme should be robust for most keys. We therefore apply the receiver operating characteristic (ROC) graph [10] to analyse and demonstrate the detection performance of the proposed FJLT hashing, as shown in Fig. 3. Here, we assume that all the steps of hash generation in Section III depend on the same secret key. We test the algorithm under 200 secret keys, drawn from 10 to 10000 at intervals of 50, with the other parameters the same as before. Note that only 12 points are visible in Fig. 3 because many of the 200 points coincide and overlap in the figure. Clearly, the proposed hashing algorithm provides stable identification performance with little influence from the secret key: the true positive rate is always above 0.94 and the false positive rate is generally below 0.02. That means our strategy nearly approaches the


perfect classification point (0, 1) [10]. This observation suggests that most secret keys can be used in our hash generation, which is consistent with the analysis in Section II and thus guarantees high security in practice.

Fig. 3. True positive rates and false positive rates in the ROC graph under 200 different secret keys

V. CONCLUSION

In this paper, we have introduced a new dimension reduction technique, FJLT, and applied it to the image hashing problem. Our experimental results show that FJLT-based hashing is robust to a large class of manipulations. Compared with the NMF-based approach, the proposed approach achieves comparable, sometimes even better, performance while requiring less computation. Moreover, the random projection and low-distortion properties of FJLT make it more suitable for practical hashing than NMF. Furthermore, ROC analysis reveals that the FJLT approach provides stable performance that is nearly independent of the choice of secret key, which further supports the security of the proposed algorithm. However, we also notice that the robustness of FJLT hashing to geometric attacks such as scaling and rotation needs further improvement. Future work will include testing the FJLT hash over a larger database of natural images and photos, and exploring its application in image/video authentication. In addition, we plan to improve our hashing algorithm by optimizing the random sampling to better capture significant local and global information, and by incorporating transforms that are invariant to geometric manipulations to enhance robustness to geometric attacks.

REFERENCES

[1] M. Wu, Y. Mao, and A. Swaminathan, "A signal processing and randomization perspective of robust and secure image hashing," in Proc. IEEE/SP 14th Workshop on Statistical Signal Processing, Aug. 2007, pp. 166-170.
[2] V. Monga, A. Banerjee, and B. L. Evans, "A clustering based approach to perceptual image hashing," IEEE Trans. Information Forensics and Security, vol. 1, no. 1, pp. 68-79, Mar. 2006.
[3] V. Monga and B. L. Evans, "Perceptual image hashing via feature points: performance evaluation and tradeoffs," IEEE Trans. Image Processing, vol. 15, no. 11, pp. 3453-3466, Nov. 2006.
[4] V. Monga and M. K. Mihcak, "Robust and secure image hashing via non-negative matrix factorizations," IEEE Trans. Information Forensics and Security, vol. 2, no. 3, pp. 376-390, Sep. 2007.
[5] F. Lefebvre, J. Czyz, and B. Macq, "A robust soft hash algorithm for digital image signature," in Proc. IEEE Int. Conf. Image Processing, Sep. 2003, vol. 2, pp. 495-498.
[6] A. Swaminathan, Y. Mao, and M. Wu, "Robust and secure image hashing," IEEE Trans. Information Forensics and Security, vol. 1, no. 2, pp. 215-230, June 2006.
[7] S. S. Kozat, R. Venkatesan, and M. K. Mihcak, "Robust perceptual image hashing via matrix invariants," in Proc. IEEE Int. Conf. Image Processing, Singapore, Oct. 2004, vol. 5, pp. 3443-3446.
[8] N. Ailon and B. Chazelle, "Approximate nearest neighbors and the fast Johnson-Lindenstrauss transform," in Proc. 38th Annual ACM Symposium on Theory of Computing (STOC), Seattle, WA, 2006, pp. 557-563.
[9] S. Dasgupta and A. Gupta, "An elementary proof of the Johnson-Lindenstrauss lemma," UC Berkeley, Tech. Rep. 99-006, Mar. 1999.
[10] T. Fawcett, "An introduction to ROC analysis," Pattern Recognition Letters, vol. 27, pp. 861-874, 2006.
