A Multilayer Evolutionary Homomorphic Encryption ...

2014 International Conference on Cyber-Enabled Distributed Computing and Knowledge Discovery

A Multilayer Evolutionary Homomorphic Encryption Approach for Privacy Preserving over Big Data

Amine Rahmani, Abdelmalek Amine, Reda Mohamed Hamou
GeCode Laboratory, Department of Computer Science
Tahar Moulay University of Saida, Algeria

tion [01][04][15], private information retrieval [01][03][11] and access control [05][09][21][25][27].

Abstract: One of the biggest impediments to the evolution of big data is the privacy of users. Much advanced research has been done on this topic and many concepts have emerged. One of them is a cryptographic concept known as homomorphic encryption, which allows operations to be applied to ciphered data without deciphering it. However, from the cryptographic point of view, homomorphic encryption has defects that make it only a potential solution: some studies have shown the weakness of these cryptosystems against certain kinds of attacks, such as chosen-plaintext attacks (IND-CPA), chosen-ciphertext attacks (IND-CCA) and, for the majority of homomorphic cryptosystems that rely on the user's identity, chosen-identity attacks. On the other hand, a new type of cryptosystem was recently introduced whose aim is to improve classic cryptographic techniques, such as substitution and transposition, using evolutionary methods of data mining, e.g., genetic algorithms. The security of this kind of scheme was proved against IND-CPA and IND-CCA attacks. In this paper, we improve the efficiency of a homomorphic cryptosystem known as TSZ (To, Safavi-Naini, and Zhang) by proposing a new approach that combines it with evolutionary cryptography in order to exploit the advantages of both categories.

Keywords: homomorphic encryption, privacy preserving, evolutionary encryption

I.

Another domain of computer science has proved its importance and success in solving problems, especially those related to time and complexity: the optimisation techniques of data mining. Optimisation algorithms are inspired by real life and fall into several categories, such as trajectory methods and evolutionary methods, e.g., genetic algorithms. The latter are inspired by the Darwinian evolution of biological populations [12]. In light of the success of these methods, researchers started to apply them in different domains, including cryptography. This gave birth to evolutionary cryptography, in which researchers try to incorporate evolutionary methods into cryptographic schemes in order to enhance their security level, as we will see later on. In this work, we improve the security level of one of the known homomorphic encryption schemes, named TSZ (To, Safavi-Naini and Zhang), by combining obfuscation and encryption of data through the introduction of genetic algorithms. This paper is organised as follows: we first present some related works, then we introduce the essential background needed to understand this work. After that, we describe our proposal by presenting the cryptosystem, followed by a scenario in which the system is used in a secure retrieval protocol, namely a PIR protocol. Finally, we discuss its performance against known attacks by evaluating it in two steps: first theoretically, and then through a set of experiments whose results we compare with some existing cryptosystems.

privacy

INTRODUCTION

With the creation of the concept of big data and of its services, such as cloud computing, which came with plenty of advantages, many challenges have appeared and led to a lot of criticism, especially regarding the level of security offered for the privacy of users [24]. Losing control over data while it resides in the service is a real problem for researchers in the privacy preserving domain. Even if we assume that a third party cannot access the data because of a licence agreement, the transmission of queries and the downloading of documents are still insecure, and a third party can still access them. As a result, a lot of research [01] has been published, and more is still in progress, in order to improve this privacy. Some of those works focus on a specific solution that resides in a cryptographic concept known as homomorphic encryption [07]. This kind of scheme was proposed theoretically for the first time in 1984 but was not realised until the beginning of the 21st century because there was less need for it at that time. This concept uses complex mathematical techniques, such as elliptic curves and bilinear maps, which make it possible to execute operations on ciphered data without deciphering it. This concept is used in different areas of privacy, such as data obfuscation [01][04][15], private information retrieval [01][03][11] and access control [05][09][21][25][27].

978-1-4799-6236-5/14 $31.00 © 2014 IEEE
DOI 10.1109/CyberC.2014.14

II.

RELATED WORKS

Privacy preserving is a broad and sensitive domain, and many works have addressed it. In [01], the authors give a general review of privacy preserving over published data and its techniques. In [15], the authors present a general discussion of privacy challenges in user profiling with big data techniques, using the EEXCESS use case. In [17], the authors give a general survey on the use of data mining techniques and algorithms in the privacy preserving domain. The paper in [21] is a state of the art of privacy preserving data mining in which the authors present a general overview of privacy


Encrypt (x) * Encrypt (y) = Encrypt (x + y) (1)

preserving techniques and focus on the role of data mining in this area.

where Encrypt () represents the encryption function and x and y are plain texts.
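As an illustration of property (1), the following minimal sketch uses textbook Paillier encryption, one of the schemes compared later in this paper, with deliberately tiny primes. The function names and the parameter values are illustrative assumptions and are not part of the proposed system. Because Paillier is randomized, the equality in (1) is checked after decryption.

```python
import math
import random

def keygen(p=293, q=433):
    """Toy Paillier key generation (tiny primes, NOT secure)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    mu = pow(lam, -1, n)  # valid inverse because the generator is g = n + 1
    return n, (lam, mu)

def encrypt(n, m, rng=random):
    """Encrypt(m) = (n + 1)^m * r^n mod n^2 for a random r coprime to n."""
    n2 = n * n
    while True:
        r = rng.randrange(1, n)
        if math.gcd(r, n) == 1:
            break
    return (pow(n + 1, m, n2) * pow(r, n, n2)) % n2

def decrypt(n, sk, c):
    """Recover m from c using L(u) = (u - 1) / n."""
    lam, mu = sk
    u = pow(c, lam, n * n)
    return ((u - 1) // n * mu) % n

def add_encrypted(n, c1, c2):
    """Multiplying ciphertexts adds the underlying plaintexts."""
    return (c1 * c2) % (n * n)
```

For example, decrypting `add_encrypted(n, encrypt(n, 20), encrypt(n, 22))` yields 42 without either operand being deciphered along the way.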

Concerning homomorphic encryption, many works have been carried out in this area. The majority of homomorphic proposals use the identity of the user, or specific attributes related to the user, as in [02], where the authors give a general presentation of identity based encryption (IBE), then propose two IBE schemes based on fuzzy identity and study their security in terms of key sizes and security proofs. Homomorphic encryption has been applied in several areas of the privacy preserving domain. In access control, for example, the authors of [25] proposed a new attribute based encryption (ABE) scheme for fine-grained access control, and [27] proposes an encryption scheme based on hierarchical key assignment (HKA) for access control. Other works study the homomorphic schemes themselves, such as the papers in [13] and [07], where the authors survey several homomorphic encryption schemes and evaluate them in terms of complexity and security level.

B. Evolutionary cryptography
From [12], we can say that evolutionary cryptography is a cryptographic domain that starts from classic cryptography, such as transposition or substitution, or from refined cryptography that encodes texts. It then applies genetic algorithms in order to obtain a full disorder of the text. This disorder leads to a new key, named the genetic key, which is a permutation used in the decryption phase.

C. Description of TSZ scheme
As a homomorphic scheme, the TSZ cryptosystem [10] is founded on a bilinear map [10] ê: G1 x G1 → G2 in which G1 and G2 are two groups of prime order q. The main goal of this scheme is to allow a user with identity 'u' to share data with k users. The steps are:
Initialization: The user u generates two random generators P and Q from G1 and a monic polynomial F(x) of degree 2k-1 of the following form:

Concerning evolutionary cryptography, as said in the introduction, the efficiency of evolutionary methods in many domains led to their incorporation into cryptography. In [12], the authors present an introduction to the application of evolutionary methods of data mining to security issues and cryptography. In [17], the authors survey the application of evolutionary methods to cryptographic schemes and their use in real-world scenarios. In [28], the authors study the general model of evolutionary cryptography in synchronous communications, deriving the model and analysing its influence on the considered communication model.

III.

$F(x) = a_0 + a_1 x + \ldots + a_{2k-2} x^{2k-2} + x^{2k-1}$   (2)
Master secret key: It consists of P and the polynomial F(x).
The encryption key: It is the public key, in the form of a set of parameters $(g, Q, Q_0, Q_1, \ldots, Q_{2k-1})$ where:
$Q_i = a_i * Q$   (3)
and
$g = \hat{e}(P, Q)$   (4)
Secret key for user u: $Sk_u = F(u)^{-1} * P$
Encryption algorithm: The user generates a random number r and computes:
$C = (m * g^r, r*Q, r*Q_0, \ldots, r*Q_{2k-1})$   (5)
where m is the plaintext.

PRELIMINARIES

In this section, we introduce the homomorphic encryption, evolutionary encryption, and the homomorphic scheme that we want to improve.

Decryption algorithm: To decrypt the ciphered text, the user computes $g^r$ using his/her own key as follows:

A. Homomorphic encryption
Homomorphic encryption is, as Fontaine and Galand define it in [07], a conventional cryptographic scheme based on three essential algorithms: KeyGen, Encrypt, and Decrypt. KeyGen is a randomized algorithm that generates the public and secret keys; Encrypt takes as input a plaintext and the public key and generates a ciphered text; and Decrypt takes as input the ciphered text and the secret key and recovers the original plaintext. Other sources add a fourth algorithm, named Setup, that runs before the other three. It generates the public key, in the form of a set of parameters, and a master secret key, which KeyGen uses together with other parameters, such as the identity of the user, to generate the secret key corresponding to one and only one user.

$g^r = \hat{e}(Sk_u, r*Q_0) * \ldots * \hat{e}(u^{2k-2} Sk_u, r*Q_{2k-2}) * \hat{e}(u^{2k-1} Sk_u, r*Q)$   (6)
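The correctness of the decryption formula (6) can be checked numerically. The sketch below is only a toy model under stated assumptions: elements of G1 are represented by their scalar multiples of a fixed base point (so discrete logarithms are exposed and nothing here is secure), and the pairing is simulated by exponentiation in a small subgroup of the integers modulo a prime. The names `Q_ORDER`, `P_MOD`, `H`, `pairing` and `tsz_round_trip` are hypothetical and chosen for illustration.

```python
import random

Q_ORDER = 1019   # toy prime group order q
P_MOD = 2039     # prime with P_MOD = 2 * Q_ORDER + 1
H = 4            # element of order q in the integers modulo P_MOD

def pairing(a, b):
    """Toy symmetric bilinear map: e(a*B, b*B) = H^(a*b)."""
    return pow(H, (a * b) % Q_ORDER, P_MOD)

def tsz_round_trip(k=3, u=7, m=1234, seed=0):
    """Run TSZ initialization, encryption (5) and decryption (6) once."""
    rng = random.Random(seed)
    P = rng.randrange(1, Q_ORDER)    # generator P, stored as a scalar
    Qg = rng.randrange(1, Q_ORDER)   # generator Q, stored as a scalar
    while True:
        # Monic polynomial of degree 2k-1: coefficients a_0..a_{2k-2}, then 1.
        coeffs = [rng.randrange(Q_ORDER) for _ in range(2 * k - 1)] + [1]
        f_u = sum(a * pow(u, i, Q_ORDER) for i, a in enumerate(coeffs)) % Q_ORDER
        if f_u != 0:                 # F(u) must be invertible modulo q
            break
    g = pairing(P, Qg)                                  # equation (4)
    q_i = [(a * Qg) % Q_ORDER for a in coeffs]          # equation (3); last entry is Q
    sk_u = (pow(f_u, -1, Q_ORDER) * P) % Q_ORDER        # Sk_u = F(u)^-1 * P

    r = rng.randrange(1, Q_ORDER)
    c0 = (m * pow(g, r, P_MOD)) % P_MOD                 # m * g^r
    r_q_i = [(r * x) % Q_ORDER for x in q_i]            # r * Q_i

    g_r = 1
    for i, rq in enumerate(r_q_i):                      # product in equation (6)
        g_r = (g_r * pairing((pow(u, i, Q_ORDER) * sk_u) % Q_ORDER, rq)) % P_MOD
    return (c0 * pow(g_r, -1, P_MOD)) % P_MOD           # recovered plaintext
```

The product of pairings collapses to ê(P, Q) raised to r because the exponents sum to r * F(u) * F(u)^-1, which is why the recovered value equals m.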

IV.

OUR PROPOSITION

TABLE I lists the key notations that we will use: Notations PK MSK Sk CT PCT SC

Such a cryptosystem is called homomorphic if it satisfies the following property:


TABLE I. NOTATIONS
Notation   Description
PK         Public parameters; represents the public key of the homomorphic scheme
MSK        Master secret key
Sk         Secret key corresponding to a user
CT         The final ciphered text
PCT        Pre-ciphered text
SC         Stop criterion

Figure 1 describes the general idea of our cryptosystem.

choose the best individuals. The selection works as follows: first we choose a random r between 0 and 1, then we compute:

$\text{selected} = \begin{cases} x_1 & \text{if } r \le q_1 \\ x_i & \text{if } q_{i-1} \le r \le q_i,\ 2 \le i \le n \end{cases}$   (9)
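The evaluation and selection steps, equations (7) to (9), correspond to what is usually called roulette-wheel selection. The sketch below is a minimal illustration under that reading; the function names are assumptions, and the fitness values are passed in directly rather than computed from individuals.

```python
import bisect
import itertools
import random

def selection_probabilities(fitness):
    """Equation (7): P(x_k) = F(x_k) / sum of all F(x_i)."""
    total = sum(fitness)  # assumed to be strictly positive
    return [f / total for f in fitness]

def cumulative_probabilities(fitness):
    """Equation (8): q_i is the running sum of P(x_1..x_i)."""
    return list(itertools.accumulate(selection_probabilities(fitness)))

def select(individuals, fitness, rng=random):
    """Equation (9): pick the first individual whose q_i is at least r."""
    q = cumulative_probabilities(fitness)
    r = rng.random()
    index = bisect.bisect_left(q, r)
    # Guard against floating-point rounding leaving the last q_i below r.
    return individuals[min(index, len(individuals) - 1)]
```

An individual with a larger fitness value occupies a wider interval between consecutive q values and is therefore selected more often.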

4. Cross-over: This process works as follows: first we choose two individuals from the selected ones, then for each individual we randomly choose a position and split the individual at it. After that, new individuals are generated as shown in Figure 2:

Figure 1: Progress of our cryptography scheme

Our cryptosystem is based on two levels. The first level uses evolutionary encryption in order to obfuscate the text, and the second is a homomorphic encryption that produces the final ciphered text. Overall, our cryptographic scheme is built from five algorithms (evolutionary, setup, KeyGen, encrypt and decrypt).

Figure 2: Cross-Over Progress

Finally, INDchild1 replaces INDparent1 and INDchild2 replaces INDparent2.
5. Mutation: This is the last process of this step: for each individual, we randomly choose two positions and swap their contents.
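The cross-over and mutation processes can be sketched as follows. Since Figure 2 is not reproduced here, the way the split parts are recombined (head of one parent joined to the tail of the other) is an assumption, as are the function names; individuals are modelled as plain lists of codes.

```python
import random

def crossover(parent1, parent2, rng=random):
    """Split each parent at its own random position and swap the tails."""
    cut1 = rng.randint(0, len(parent1))
    cut2 = rng.randint(0, len(parent2))
    child1 = parent1[:cut1] + parent2[cut2:]
    child2 = parent2[:cut2] + parent1[cut1:]
    return child1, child2

def mutate(individual, rng=random):
    """Swap the contents of two randomly chosen positions."""
    mutated = list(individual)
    if len(mutated) >= 2:
        i, j = rng.sample(range(len(mutated)), 2)
        mutated[i], mutated[j] = mutated[j], mutated[i]
    return mutated
```

Under this assumed recombination, both operators only move codes around: the combined content of the two children is the same as that of the two parents, which matches the observation that the text is disordered without its content being modified.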

Algorithm 1: Evolutionary encryption
Input: plaintext M
Output: pre-ciphered text PCT, session key SK
Begin
  Codification (M) → Initial population P;
  Repeat
    Evaluation (P);
    Selection (P);
    Cross-over (P);
    Mutation (P);
  Until (stop criterion is verified)
  Concatenation (P) → PCT;
  Comparison (M, PCT) → SK
END

This step is repeated in a loop until the stop criterion is verified. Finally, the concatenation of the final individuals gives the pre-ciphered text PCT, and the comparison of M and PCT gives a permutation called the session key SK, which is essentially the final population resulting from this process. We define the stop criterion as follows:
$F(x_k) = \sum_{i=0}^{n} |card(l_k) - card(l_i)|$

Algorithm 1: Evolutionary cryptosystem

(10)

As Algorithm 1 shows, the first step in our cryptosystem is the evolutionary encryption, which is organised into several processes as follows:

where xk is an individual and li is the corresponding list and card (li) the cardinality of li and n the number of individuals.
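A minimal sketch of the function F of equation (10), assuming each individual is given directly as its list l_i and that the sum runs over all individuals of the population (the function name is illustrative):

```python
def fitness(lists):
    """For each individual k, sum |card(l_k) - card(l_i)| over all i."""
    sizes = [len(l) for l in lists]
    return [sum(abs(size_k - size_i) for size_i in sizes) for size_k in sizes]
```

For three individuals whose lists have 1, 2 and 4 elements, the values are 4, 3 and 5: the individual whose size is closest to the others gets the smallest value.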

1. Codification: The codification process consists of tokenizing the plaintext M using a bag of words, then transforming each word into a set of characters encoded by their ASCII codes. The final result of this process is a collection of code sets, where each set is considered an individual for the next processes.
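The codification process can be illustrated with a short sketch. It assumes that tokenizing means splitting on whitespace and that every character is ASCII; both the assumption and the function name are illustrative.

```python
def codify(plaintext):
    """Turn each whitespace-separated word into a list of character codes."""
    return [[ord(ch) for ch in word] for word in plaintext.split()]
```

With this reading, `codify("big data")` gives two individuals, `[98, 105, 103]` and `[100, 97, 116, 97]`.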

The next step in our algorithm is the homomorphic encryption using TSZ algorithm with modification:


Let's consider the user's identity ID.
Algorithm 2: Setup
Input: secret parameter 1
Outputs: PK and MSK
Begin
  Generate a random prime integer n and two big prime integers p and q
  Let G1 and G2 be two sets of order n such that G1[0] ← p and G1[1] ← q
  Fill the rest of G1: G1[i] ← G1[i-1] + G1[i-2] for i = 2 … n
  Let e: G1 x G1 → G2 in which e(x, y) = x * y
  Fill G2 using e
  Choose a random integer a0
  Choose a random e1 from G1 and put g1 = e1^2
  Choose another e2 randomly from G1 and put g2 = 4 * e2 - 1
  Master secret key MSK = (g1, a0)
  Public parameters PK = (gt, g2, g2' = g2^a0) in which gt is a random generator in G2
END

2. Evaluation: This process consists of computing the fitness function of each individual. The fitness function is chosen as follows: consider P(x_k) the probability of individual x_k, in which
$P(x_k) = \frac{F(x_k)}{\sum_{i} F(x_i)}$   (7)
Then, consider q_i the cumulated probability of x_i, in which
$q_i = \sum_{k=1}^{i} P(x_k)$   (8)
3. Selection: After computing the cumulated probability of each individual, this process is used to

Algorithm 2: Setup function in TSZ cryptosystem

  Put M' = C[0] / gt^r
  Starting from M', deduce M using the reciprocal permutation of KS
END

As Algorithm 2 indicates, the setup function works as follows. The user enters a secret parameter, which is used as an identification for the user. The user then generates an integer n, which is taken as the size of the two groups G1 and G2. The next step fills G1: the user generates two big integers p and q and puts them in the first two positions of G1, and the rest of the group is filled using the Fibonacci principle, so that each element starting from the third one equals the sum of the two preceding elements. The next step defines a bilinear map e from G1 x G1 to G2 that takes two elements of G1 and returns their product. A random integer a0 is then generated for use in the next algorithm. The user must now choose two random elements e1 and e2 in G1 and put g1 = e1^2 and g2 = 4 * e2 - 1; in homomorphic encryption, g1 and g2 are called generating functions. After that, the user puts gt = e(g1, g2), which is automatically in G2.

Algorithm 5: Decryption step

The last algorithm in our system is the decryption, presented in Algorithm 5. This step works as follows: the user computes gt^r as the product of e(Sku, g2'^r) and e(Sku^ID, g2^r), then computes the text M', which equals the first component of C divided by gt^r. Finally, it only remains to apply the reciprocal permutation of KS, which is essentially the initial population resulting from the codification process in Algorithm 1, in order to get the final plaintext M.
The following scenario gives an example of the use of our system in a PIR protocol:
CLIENT

SERVER

- Generate a request Q
- Cipher the request with the evolutionary cryptosystem and get PCT
- Generate the session key KS from Q and PCT
- Execute setup and generate PK and MSK
- Execute encrypt on PCT using PK and get QF

The results returned by this algorithm are the public parameters, which are respectively gt, g2, and a new element g2' equal to g2^a0, and the master secret key, which is composed of g1 and a0.
Algorithm 3: KeyGen
Input: user's ID, MSK
Output: user's secret key Sku
Begin
  F = ID + a0
  Sku = g1^(1/F)
  gt = e(Sku, g2^(ID+a0))
END

Send (QF, KS, PK)
- Execute encrypt on the warehouse using KS and get WH'
- Execute encrypt on WH' using PK
- Apply QF and get the encrypted result ER

- Execute KeyGen using MSK and ID and get Sku

Algorithm 3: KeyGen Algorithm

This algorithm is used to generate the secret key corresponding to a given user. The user has an ID, and a secret function F computes the sum of the ID and a0, which is retrieved from MSK. After that, the user calculates his own secret key Sku = g1^(1/F), then updates gt by setting it equal to e(Sku, g2^(ID+a0)). The problem with this algorithm is that it accepts only an identity in numeric form. For that reason, we introduce a special treatment in which the ID of the user is the result of applying the XOR operation to the characters of his real identity, but this treatment has a security problem that we will discuss later on.
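The XOR treatment of the identity can be sketched in a few lines. The function name is an assumption, and the characters are folded using their Unicode code points.

```python
from functools import reduce
from operator import xor

def numeric_id(identity):
    """Fold the character codes of an identity string with XOR."""
    return reduce(xor, (ord(ch) for ch in identity), 0)
```

The sketch also makes the weakness visible: XOR is commutative, so `numeric_id("ab")` and `numeric_id("ba")` are both 3, and the result never needs more bits than a single character code.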

Send (ER)
- Execute decrypt using Sku and get R'
- Decrypt R' using the reciprocal KS and get the final result R

Figure 3: Scenario of PIR protocol using Our System

First, the client generates a request Q, then applies the evolutionary cryptosystem in order to obtain a form that we call the pre-ciphered request, denoted PCT. Since the first cipher serves to create a disorder of the text, the client compares PCT and Q in order to get a permutation that we call the session key KS. After that, the client executes the first function of the second algorithm, the homomorphic one, in order to get the public parameters PK and the master secret key MSK, and uses PK to cipher PCT in order to obtain the final ciphered request QF. The client now sends the set composed of QF, PK and KS. The server, in turn, first encrypts its warehouse using KS to get the pre-ciphered warehouse (WH'), then ciphers it using PK and applies QF to extract the results, which are ciphered, and sends them to the client. While the server is working, the client executes the second function of the homomorphic cryptosystem in order to get the secret key Sku, which is used to decipher the received

Algorithm 4: Encrypt
Input: pre-ciphered text PCT, PK
Output: ciphered text CT
Begin
  Choose a random r ∈ Z_q
  Compute CT = (PCT * gt^r, g2'^r, g2^r)
END

Algorithm 4: Encryption step

The encryption algorithm works as follows: the user first chooses a random r from Z_q, then computes gt^r, g2'^r and g2^r. The ciphered text C, unlike in most cryptosystems, is a set composed of PCT * gt^r and the last two computed elements, which are used in the decryption step.
Algorithm 5: Decrypt
Input: ciphered text CT, Sku, PK, SK
Output: plaintext M
Begin
  gt^r = e(Sku, g2'^r) * e(Sku^ID, g2^r)


results. After deciphering them, the client finally uses the reciprocal permutation in order to get the plaintext results.

V.

through this part of the discussion we want to demonstrate the resistance of our secret keys against attacks.

C. Genetic algorithms
The security of this algorithm lies in the randomness of genetic algorithms, which create a total disorder of the text, so that even an attacker who succeeds in building an anonymous decoder still needs to reveal the session key. In addition, the use of genetic algorithms gives more security to classic cryptography, because the permutation derived from this algorithm is similar to the one used in symmetric cryptosystems such as DES, except that it changes every time we execute the algorithm.

EXPERIMENTS AND RESULTS

The security of our cryptosystem lies in the worst enemy of cryptanalysts: the use of randomness in almost all the algorithms. In this section, we discuss the performance of our cryptosystem in terms of security. To do that, we first give a general view of the attack on TSZ described in [10], in order to better understand what we try to improve in this cryptosystem. After that, we present the results of a set of experiments that we conducted in order to evaluate this cryptosystem. The evaluation is based on criteria such as the size of the text in plaintext, pre-ciphered and ciphered form, the size of the initial population, and time. We also vary the cross-over probability in order to obtain more reliable results and to study the effect of introducing genetic algorithms into the TSZ cryptosystem.

D. Setup
The security of this algorithm lies in the difficulty of deducing two prime numbers p and q from their sum. In addition, the use of randomness to generate the first two elements of G1 makes deducing p and q very hard. Another security advantage, also related to the group G1, is that not the whole group is used: random elements are chosen and then passed to other functions, named generative functions, which makes exhaustive-search attacks very hard. Even though setup generates a secret key, this key cannot decrypt a text, which can be considered another security feature.

A. Attack against TSZ cryptosystem
The founders of the TSZ system demonstrated that the collaboration of at least k users cannot produce an anonymous decoder. However, in [10] the authors demonstrate an attack, called the anonymous pirate decoder, in which a single user with ID = u can build the decoder and generate the decryption key. First of all, the user generates a set of random numbers Z_0, Z_1, …, Z_{2k-2} ∈ Z_q, then calculates a set of elements X_i in which
$X_i = u^i Sk_u + Z_i * Q$

E. KeyGen
As we mentioned earlier in this paper, the mechanism that we use to obtain a numeric identity creates a problem: applying the XOR operation to characters leaves the attacker only 2^16 possibilities to guess the identity of the user, which is a small number if we take into consideration the use of big data services. Even so, this algorithm still has a security property that lies in the complexity of deriving the secret key, which is exponential.

(11)

where i = 0, …, 2k-2. Now the user calculates
$X_{2k-1} = u^{2k-1} * Sk_u - (Z_0 * Q_0 + Z_1 * Q_1 + \ldots + Z_{2k-2} * Q_{2k-2})$   (12)

Finally, the user shares the vector (X_0, X_1, …, X_{2k-2}, X_{2k-1}) so that any user can calculate

F. Results of experiments
In this subsection, we present the results of different experiments in which we vary the cross-over probability and fix the size of the homomorphic keys at 2048 bits. The evaluation of the results is based on six criteria: size of plaintext (SP), counted in characters; size of pre-ciphered text (SPC), counted in characters; size of ciphered text (SC), also counted in characters; size of initial population (IP), counted in individuals; time of encryption (TE), counted in seconds; and time of decryption (TD), counted in seconds.

$g^r = \hat{e}(X_0, r*Q_0) \ldots \hat{e}(X_{2k-2}, r*Q_{2k-2}) * \hat{e}(X_{2k-1}, r*Q)$   (13)

The authors show that this attack is untraceable by proving that the map from Z_q to G1 is a bijection. In fact, such an attacker can directly generate a set of random X_i without going through formulas 11 and 12, in which case the generation of the X_i is independent of the user's identity. A decoder attacking the TSZ scheme with one user can easily be constructed as follows: the attacker chooses a random z0 from Z_q and a random ID, then also takes a random pair (Sku, Sku^ID). After that, the attacker puts X0 = Sku * g2^z0 and X1 = Sku^ID * g2'^(-z0). The decoder is finally g^r = e(X0, g2'^r) * e(X1, g2^r).
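The single-user decoder described above can be checked numerically. The sketch below uses a toy model under stated assumptions: group elements are represented by their exponents, the pairing is simulated by exponentiation modulo a small prime, and the key pair (Sku, Sku^ID) comes from a legitimate KeyGen run rather than being chosen at random. All names are illustrative and nothing here is secure.

```python
import random

Q_ORDER, P_MOD, H = 1019, 2039, 4   # toy primes with P_MOD = 2*Q_ORDER + 1; H has order Q_ORDER

def e(x, y):
    """Toy bilinear map on exponents: e(B^x, B^y) = H^(x*y)."""
    return pow(H, (x * y) % Q_ORDER, P_MOD)

def pirate_decoder(sk_u, user_id, pk, z0):
    """Blind the key with z0: X0 = Sku * g2^z0 and X1 = Sku^ID * g2'^(-z0)."""
    _, g2, g2_prime = pk
    x0 = (sk_u + g2 * z0) % Q_ORDER
    x1 = (sk_u * user_id - g2_prime * z0) % Q_ORDER
    return x0, x1

def decode(ct, x0, x1):
    """Recover the pre-ciphered text with g^r = e(X0, g2'^r) * e(X1, g2^r)."""
    c0, c1, c2 = ct
    g_r = (e(x0, c1) * e(x1, c2)) % P_MOD
    return (c0 * pow(g_r, -1, P_MOD)) % P_MOD

def demo(seed=0, user_id=42, pct=777):
    """Build keys and one ciphertext, then decrypt with the blinded pair only."""
    rng = random.Random(seed)
    g1, g2 = rng.randrange(1, Q_ORDER), rng.randrange(1, Q_ORDER)
    a0 = rng.randrange(1, Q_ORDER)
    while (user_id + a0) % Q_ORDER == 0:   # ID + a0 must be invertible
        a0 = rng.randrange(1, Q_ORDER)
    gt = e(g1, g2)
    pk = (gt, g2, (g2 * a0) % Q_ORDER)
    sk_u = (g1 * pow(user_id + a0, -1, Q_ORDER)) % Q_ORDER
    r = rng.randrange(1, Q_ORDER)
    ct = ((pct * pow(gt, r, P_MOD)) % P_MOD, (pk[2] * r) % Q_ORDER, (g2 * r) % Q_ORDER)
    x0, x1 = pirate_decoder(sk_u, user_id, pk, rng.randrange(1, Q_ORDER))
    return decode(ct, x0, x1)
```

The blinding terms in z0 cancel between the two pairings, so the pair (X0, X1) decrypts exactly like the original key while looking unrelated to it.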

TABLE II. RESULTS OF FIVE EXPERIMENTS WITH PROBABILITY OF CROSS-OVER= 0.6

B. Security issues
Since Kerckhoffs's principle of open design in computer security says that the security of a cryptosystem rests on the security of its keys, we will discuss the key generation algorithms; in other words,


Experiment   SP     IP   SPC    SC         TE      TD
001          221    27   530    10590404   11462   12141
002          225    25   280    654624     5479    5258
003          607    35   650    10457274   16432   24715
004          757    32   1426   24447477   8574    11704
005          1414   33   4014   3055789    4214    7814

In the case of TABLE III and Figure 5, we increase the probability of cross-over to 0.8 and repeat the same experiments. We notice the same thing as in TABLE II: the sizes of the encrypted and pre-encrypted forms of a text do not depend directly on the size of its plaintext form, and neither does the time of encryption, which in this case is related to the size of the pre-encrypted form.


As a global comparison, we can conclude that the size of the results of the encryption steps and the execution time have no direct relation with the size of the plaintext, and that the number of individuals depends on the number of distinct characters used (alphanumeric and special characters) but always stays between 25 and 35.


The preceding results come from experiments with different plaintext sizes. We now turn to other experiments in which we fix the size of the plaintext and change the cryptographic system: the original TSZ scheme without genetic algorithms, our scheme, Okamoto-Uchiyama, Waters, Goldwasser-Micali, and Paillier. In this case, we evaluate our system based on three criteria: size of ciphered text, time of encryption, and time of decryption.

Figure 4: Results of five experiments with probability of crossover= 0.6

For the results presented in TABLE II and Figure 4, we fix the probability of cross-over at 0.6 and execute five different experiments by changing the size of the plaintext; in other words, in each experiment we encrypt a text whose size differs from the others. From the results, we can see that the size of the plaintext influences the size of the final and pre-encrypted texts, but these do not simply increase and decrease with the size of the plaintext. In fact, the largest encrypted form corresponds to a plaintext of medium size, whereas the largest plaintext has an encrypted form of medium size but the largest pre-encrypted form. The same remark holds for the times of encryption and decryption, even if we take into consideration external effects such as system interruptions.

TABLE IV: RESULTS OF EXPERIMENTS WITH SIZE OF PLAINTEXT = 751 CHARACTERS

TABLE III: RESULTS OF FIVE EXPERIMENTS WITH PROBABILITY OF CROSS-OVER= 0.8
Experiment   SP     IP   SPC    SC         TE      TD
001          221    27   510    1940843    3336    4603
002          225    25   2270   1601647    3570    6053
003          607    35   2802   10116031   7010    9119
004          757    32   9562   38477007   14834   11829
005          1414   33   4348   27312475   9417    6836

Algorithm           SC          SP    TE      TD
Our system          5 208 476   751   5 765   3 224
Original TSZ        4 187 638   751   1699    409
Okamoto-Uchiyama    1 496       751   25      70
Waters              13 583      751   235     153
Goldwasser-Micali   15 393      751   102     434
Paillier            2 251       751   276     670


Figure 6: Results of experiments with size of plaintext = 751 characters


From the results in TABLE IV and Figure 6, we notice that our system increases the size of the encrypted text by about 24% compared with the original TSZ scheme (about 5.2 million characters for our system against about 4.2 million characters for the original TSZ), and that it also takes more time than any other scheme that we use


Figure 5: Results of five experiments with probability of crossover= 0.8


because of its iterative multilevel principle. We also notice that the Okamoto-Uchiyama scheme gives the lowest results in terms of the size of the encrypted text and the time taken to encrypt and decrypt it.

JS¾&âåÝ2Ý ƒ¶ýg%F¡çßSJksßyWéòs/îÐÔ r¬_§vv$ãGF’Q™v çì·¤âžÄ=Ï,‚—â0Ô²“í¿}é¡¶J8Su¸xÑ,ùŒa~½k‚‹_éGNòº=UjÞíbÓ5®Ç(NùÁ²tÛô-þA‚ÛCIŒ #æSjù ÝËN¶-&Õ×,

The following figures present instances of a plaintext and of its pre-encrypted and encrypted forms.

The most impediment that prevent the evolution of big data is the privacy of users in spite of the invention of concept of homomorphic encryption which allows the application of operations on ciphered data without need to decipher it but this presents just a potentially solution, in fact some researches proved the inefficiency of those cryptosystems against some kind of attacks such as attacks with chosen plaintext (IND-CPA) and attacks with chosen ciphered text (INDCCA) and even for the majority of homomorphic cryptosystems which are based on identity.

Figure 9: Part of the encrypted form of Figure 7

Figure 9 presents part of the final encrypted form after applying the homomorphic encryption. We notice that the resulting text is full of special and incomprehensible characters and words. We could not capture a full image of the encrypted form because of its large size, which is reported in the tables above.

Figure 7: Part of plaintext form example

VI.

Figure 7 shows a plaintext used in one of our experiments; the text is part of the introduction of this paper.

CONCLUSION AND FUTURE WORKS

Since the invention of the big data concept and its services, many impediments have stood in the way of its evolution. One of these impediments is security, which is now a highly active research domain: many works have been done, and others are still in progress, in order to improve different security aspects, such as access control and confidentiality of stored data, using several techniques such as data mining and cryptography.

m m mosm impmdimmnm m am prmvmnm m m mvolumionmofmbigmdamamismm mmprivacymofmusmrsminmspimmmofmm mminvmnmionmofmconcmpmmofm omomorp icmmncrypmionmwmicmmallowsmmmmm applicamionmofmopmramionsmonmcipmmr mdmdamamwimmoummnmmdmmomdmcip mmrmimmbummmmismprmsmnmsmjusmm ampommnmiallymsolumion,minmfacmmsommmrmsmarcmms mprov

In this paper, we have presented a new approach in cryptography that combines evolutionary cryptography and homomorphic cryptography. The integration of these two paradigms can enhance the security level of each. Theoretically, it seems that encrypting a datum several times creates more protection, although this proposition still needs more study in terms of computational complexity and resistance to attacks.

Figure 8: Part of pre-encrypted form example

REFERENCES

Figure 8 presents part of the pre-encrypted form of the text presented in Figure 7 after applying the genetic algorithms. We can see that the cross-over applied to the population, which consists of the positions of the characters in the text, creates a disorder of the text without modifying its content. The benefit of this step is that, unlike in classic cryptography, a character moved from its position is not necessarily always replaced by the same character or by nearby characters, which makes the cryptanalysis of this form hard.

[01] Amar Paul Singh, M. D. (2013). A Review of Privacy Preserving Data Publishing Technique. International Journal of Emerging Research in Management & Technology, 32-38.
[02] Amit Sahai, B. W. (2005). Fuzzy Identity-Based Encryption. In R. Cramer, Advances in Cryptology – EUROCRYPT 2005 (pp. 457-473). Aarhus, Denmark: Springer.
[03] Benny Chor, O. G. (1995). Private Information Retrieval. the 36th Annual IEEE Conference on Foundations of Computer Science (pp. 41–50). New York: IEEE.
[04] Blough, R. P. (2008). A Robust Data-obfuscation Approach for Privacy Preservation. International Journal of Information and Computer Security, 4-26.
[05] Crampton, J. (2009). Cryptographically-enforced hierarchical access control with multiple keys. The Journal of Logic and Algebraic Programming, 690–700.


[06] G. Shoba, R. M. (2014). A Survey of Safeguarding the Privacy in Data Mining Using Decision Tree Learning Algorithms. International Journal of Computer Applicatio, 21-28.
[07] Galand, C. F. (2007). A Survey of Homomorphic Encryption for Nonspecialists. EURASIP Journal on Information Securit.
[08] Gentry, C. (2009). Fully homomorphic encryption using ideal lattices. the forty-first annual ACM symposium on Theory of computing (pp. 169-178). Bethesda, Maryland, Washington: ACM.
[09] Gerome Miklau, D. S. (2003). Controlling Access to Published Data Using. the 29th international conference on Very large data bases (pp. 898-909). Berlin, Germany: IEEE.
[10] Hervé Chabanne, D. H. (2005). Public Traceability in Traitor Tracing Schemes. Eurocrypt'05 (pp. 542-558). Aarhus, Denmark: Springer.
[11] Hiroaki Kikuchi, D. K. (2013). Scalable Privacy-Preserving Data Mining with Asynchronously Partitioned Datasets. IEICE TRANSACTIONS on Fundamentals of Electronics, Communications and Computer Sciences, 111-120.
[12] Isasi Pedro, J. C. (2004). Introduction to the Applications of Evolutionary Computation in Computer Security and Cryptography. international journal of Computational Intelligence, 445–449.
[13] J. Loftus, A. M. (2011). On CCA-Secure Somewhat Homomorphic Encryption. Selected Areas in Cryptography, 55-72.
[14] Jian Wang, Y. L. (2009). A Survey on Privacy Preserving Data Mining. the First International Workshop on Database Technology and Applications (pp. 111-114). Wuhan, Hubei, China: IEEE.
[15] Mohammad Ali Kadampur, S. D. (2010). A Noise Addition Scheme in Decision Tree for Privacy Preserving Data Mining. journal of computing, 137-144.
[16] Omar Hasan, B. H. (2013). A Discussion of Privacy Challenges in User Profiling with Big Data Techniques: The EEXCESS Use Case. 2nd International Congress on Big Data. Santa Clara Marriott, CA, USA: IEEE.
[17] Picek, S. G. (2011). On evolutionary computation methods in cryptography. MIPRO the 34th International Convention (pp. 1496-1501). Opatija, Croatia: IEEE.
[18] Quisquater, G. H.-J. (2000). Montgomery Exponentiation with no Final Subtractions: Improved Results. Cryptographic Hardware and Embedded Systems, 293-301.
[19] Rakesh Agrawal, R. S. (2000). Privacy-preserving data mining. the 2000 ACM SIGMOD international conference on Management of data (pp. 439-450). Dallas, Texas, USA: ACM.
[20] Richard Chow, M. A. (2012). A Practical System for Privacy-Preserving Collaborative Filtering. IEEE 12th International Conference on Data Mining Workshops (pp. 547-554). Brussels, Belgium: IEEE.
[21] Sabrina De Capitani di Vimercati, S. F. (2007). Overencryption: management of access control evolution on outsourced data. the 33rd international conference on Very large data bases (pp. 123-134). Vienna, Austria: IEEE.
[22] Suciu, G. M. (2003). Controlling access to published data using cryptography. the 29th international conference on Very large data bases (pp. 898-909). Berlin, Germany: IEEE.
[23] Tristan Allard, B. N. (2011). Towards a Safe Realization of Privacy-Preserving Data Publishing Mechanisms. 12th International Conference on Mobile Data Management 2 (pp. 31-34). Luleå, Sweden: IEEE.
[24] Vassilios S. Verykios, E. B. (2004). State-of-the-art in privacy preserving data mining. SIGMOD Record, 50-57.
[25] Vipul Goyal, O. P. (2006). Attribute-Based Encryption for Fine-Grained Access Control of Encrypted Data. the 13th ACM conference on Computer and communications security (pp. 89-98). Alexandria, VA, USA: ACM.

[26] Xuyun Zhang, C. L. (2014). Privacy Preservation over Big Data in Cloud systems. Security, Privacy and Trust in Cloud Systems, 239-257.
[27] Yi-Ruei Chen, C.-K. C.-G. (2013). CloudHKA: A Cryptographic Approach for Hierarchical Access Control in Cloud Computing. 11th International Conference on Applied Cryptography and Network Security (pp. 37-52). Banff, AB, Canada: Springer.
[28] Zhang Guoping, Z. X., Si, G., & Qing, D. (2010). Synchronous Communication Research and Implementation of Evolutionary Cryptography. International Symposium on Computational Intelligence and Design (ISCID) (pp. 79-82). Hangzhou, Zhejiang, China: IEEE.
