
Deanship of Graduate Studies

Al-Quds University

Self Generating Multi Key Cryptosystem For Non-Invertible Matrices Based On Hill Cipher

Mousa Mohammad Farajallah

M.Sc. Thesis

Jerusalem- Palestine

1430 - 2010

Self Generating Multi Key Cryptosystem For Non-Invertible Matrices Based On Hill Cipher

Prepared By: Mousa Mohammad Farajallah

B.Sc. Electrical Engineering

Palestine Polytechnic University

Palestine

Supervisor: Dr. Rushdi Hamamreh

A thesis submitted in partial fulfillment of the requirements for the degree of Master of Electronics and Computer Engineering, Faculty of Engineering, Al-Quds University

1430 - 2010

Al-Quds University

Deanship of Graduate Studies

Electronic and Computer Engineering Master Program

Thesis Approval

Self Generating Multi Keys Cryptosystem For Non-Invertible Matrices based on Hill Cipher

Prepared By

Student Name: Mousa Mohammad Farajallah

Registration No: 20714094

Supervisor: Dr. Rushdi Hamamreh

Master thesis submitted and accepted. Date:

The names and signatures of the examining committee members are as follows:

1- Dr. Rushdi Hamamreh: Head of Committee. Signature:
2- Dr. Internal Examiner: External Examiner. Signature:
3- Dr. External Examiner: External Examiner. Signature:

Jerusalem- Palestine

1430 - 2010

Dedication

I dedicate this work to

My wife,
My baby,
My parents,
My brothers,
My sisters,
My friend.

Eng. Mousa Farajallah


Declaration: I certify that this thesis, submitted for the degree of Master, is the result of my own research, except where otherwise acknowledged, and that this study (or any part of it) has not been submitted for a higher degree to any other university or institution.

Signed: ……………………

Mousa Mohammad Farajallah Date: ……………………


Acknowledgement

I would like to express my sincere gratitude to my supervisor, Dr. Rushdi Hamamreh. He gave me great freedom in choosing my research topics and developing my research interests, and he continuously provided help and encouragement backed by extensive knowledge. He also guided and helped me to publish my first two papers during this thesis, and his ideas have been a source of inspiration for this work. I also thank the examining committee and all my colleagues and relatives for their support. I further thank all my lecturers, especially Dr. Ali Jamoos, who gave me a great deal of knowledge and answered all my questions.

Many thanks to my mother for her prayers and support, and to my father: thank you very much.

I am deeply grateful to my wife for her love, constant support, and encouragement, no matter how hard our situation was, and of course to all my friends.


Abstract

Information security is an important issue. Many aspects of today's society have come to rely on networking infrastructures: health, financial, and many other institutions use local networks and the global Internet. The security of this infrastructure has been called into question over the last decade, in particular the question of how to send data securely.

The Hill cipher is one of the best-known symmetric cryptosystems and can be used to protect information from unauthorized access. It was invented by Lester S. Hill in 1929 and was the first polygraphic cipher that was practical to operate on more than three symbols at once. Despite its age, only a few systems have used it, because the Hill cipher requires the inverse of the key matrix for decryption. This inverse is not the ordinary matrix inverse but an inverse relative to a chosen modulus: a matrix whose determinant is not relatively prime to that modulus has no such inverse. Not all matrices are therefore invertible, and those that are not cannot serve as key matrices in the Hill cipher scheme. Furthermore, because of its linear nature, the basic Hill cipher succumbs to known-plaintext attacks.

Several considerations encouraged me to work on the Hill cipher, since it has many advantages for data encryption. First, it resists letter-frequency analysis. Second, it is very simple, since it uses only matrix multiplication. Finally, it offers high speed and high throughput. However, the noninvertible key matrix is its main disadvantage, and this problem leads to several sub-problems: ciphertext that cannot be decrypted, additional constraints on the selection of the key matrix, and easy recovery of the key through known-plaintext attacks.

The key contribution of this thesis is a new technique that modifies the Hill cipher algorithm to overcome its major problem, the noninvertible key matrix. The technique replaces the modular inverse of the matrix with the ordinary mathematical inverse and changes other steps of the original Hill cipher to accommodate this. In particular, the processing of the plaintext vector changes: the plaintext vector used on the encryption side is twice the size of the key matrix and is divided into two vectors, one of which is multiplied by the modular value and then added to the other. After all these changes, the new Hill cipher algorithm is called MRHC.

Building on the first contribution, the second contribution improves the security of the Hill cipher against known-plaintext attacks. This enhancement is achieved through multi-key generation based on a secure hash function. I chose a powerful one, SHA-512, and use it to produce 128 different numbers as input to the key matrix generation code.

Thesis Summary

Data security is one of the most important issues of the modern era, since many institutions rely on local networks and the Internet in many aspects of their work. A question has therefore arisen in recent years: how do we protect data from intrusion?

The Hill Cipher algorithm is one of the best-known symmetric encryption algorithms from its invention to the present day. It is used to encrypt data so that unauthorized persons cannot read data sent over the Internet. It was invented in 1929 by Lester S. Hill and was the first algorithm to encrypt more than three symbols at once. Despite its age, it is rarely used, because decryption requires the inverse of every matrix used to encrypt the data, and not all matrices have an inverse. The inverse here is not the ordinary inverse but the inverse relative to a given number: no inverse exists when the determinant of the matrix is not relatively prime to that number, so the probability that a matrix fails as a key for this algorithm is very high. In addition to this major flaw, because the Hill Cipher algorithm is based on linear algebra, the encryption key is easy to discover if the attacker has part of the original text and the corresponding encrypted part.

Many things encouraged me to work on solving this algorithm's problems, since it has many merits and advantages. It resists letter-frequency analysis, it is very simple to compute and handle because it relies on elementary linear algebra, and its throughput is very high, so encrypting data with it is also fast. For some matrices, however, the algorithm has a major flaw that makes it impossible to recover data encrypted with such a key, and this flaw leads to many secondary problems and drawbacks. This motivated me to try to find a solution to the main flaw, through which I was also able to solve the secondary problems. The results obtained in this thesis confirmed the validity of this robust new mathematical model.

List Of Tables

2.1 Caesar Table
4.1 Sample of time consumption for each algorithm at s = 2
4.2 Abbreviation of testing algorithm
4.3 Mean values of testing sample for each algorithm at s = 2
4.4 Sample of time consumption for each algorithm at s = 3
4.5 Mean values of testing sample for each algorithm at s = 3
4.6 Sample of time consumption for each algorithm at s = 4
4.7 Mean values of testing sample for each algorithm at s = 4
4.8 Sample of time consumption for each algorithm at s = 5
4.9 Mean values of testing sample for each algorithm at s = 5
4.10 Sample of time consumption for each algorithm at s = 6
4.11 Mean values of testing sample for each algorithm at s = 6
4.12 Sample of time consumption for each algorithm at s = 7
4.13 Mean values of testing sample for each algorithm at s = 7
4.14 Sample of time consumption for each algorithm at s = 8
4.15 Mean values of testing sample for each algorithm at s = 8
4.16 Sample of time consumption for each algorithm at s = 9
4.17 Mean values of testing sample for each algorithm at s = 9
4.18 Summary of number of calling increase factor
4.19 Summary of number of iteration decreasing factor
4.20 Number of tries needed by the hacker to cryptanalyze MRHC3
4.21 Number of tries needed by the hacker to cryptanalyze MRHC2

List Of Figures

2.1 The first page of al-Kindi's manuscript On Deciphering Cryptographic Messages
2.2 Trithemius system
2.3 ADFGVX cipher system
2.4 Diffie-Hellman key generation model
2.5 Simplified model of conventional encryption
2.6 Simplified model of asymmetric encryption
2.7 Common divisors of two integers
2.8 Flowchart of finding greatest common divisor
2.9 Flowchart of finding inverse of matrix relative to modular value
3.1 Flowchart of main program of MRHC
3.2 Flowchart of main program of first technique of MRHC
3.3 Generating new key on the first technique of MRHC
4.1 Simulation figure for algorithms at s = 2
4.2 Simulation figure for algorithms at s = 3
4.3 Simulation figure for algorithms at s = 4
4.4 Simulation figure for algorithms at s = 5
4.5 Simulation figure for algorithms at s = 6
4.6 Simulation figure for algorithms at s = 7
4.7 Simulation figure for algorithms at s = 8
4.8 Simulation figure for algorithms at s = 9
4.9 AES vs MRHC2 vs MRHC3 for all key matrix sizes

Contents

Dedication
Declaration
Acknowledgment
Abstract
List of Tables
List of Figures
Table of Contents

1 Introduction
  1.1 Overview
  1.2 Motivation
  1.3 Problem Statement
  1.4 Suggested Solution
  1.5 Research Methodology
  1.6 Objectives
  1.7 Contributions
  1.8 Outline of Thesis

2 Cryptography and Hill Cipher Model
  2.1 Introduction To Cryptography
  2.2 Cryptography Goals
  2.3 Basic Terminology of Cryptography
  2.4 A Brief History of Cryptography
  2.5 Symmetric and Asymmetric Encryption
  2.6 Mathematics of Cryptography
    2.6.1 Integer Operations
    2.6.2 Matrix Operations
    2.6.3 Modular Arithmetic
  2.7 Hill Cipher Algorithm
    2.7.1 Concept of Hill Cipher Model
    2.7.2 Hill Cipher Algorithm Problems
  2.8 State Of The Art

3 Modified Hill Cipher
  3.1 Introduction To MRHC
  3.2 Secure Hash Algorithm-512
  3.3 MRHC Techniques
    3.3.1 First Technique Of MRHC
    3.3.2 Second Technique Of MRHC
    3.3.3 Third Technique Of MRHC

4 Simulation and Testing
  4.1 Simulation Results of Encryption and Decryption Process
  4.2 Time Analysis of Simulation Results
  4.3 Security Analysis of Simulation Results

5 Conclusions and Future Work
  5.1 Conclusions
  5.2 Future Work

Acronyms and Abbreviations
Appendix A
Appendix B
Bibliography

Chapter One

Introduction

Contents

1.1 Overview
1.2 Motivation
1.3 Problem Statement
1.4 Suggested Solution
1.5 Research Methodology
1.6 Objectives
1.7 Contributions
1.8 Outline of Thesis

1.1 Overview

Networks arose from the need to share resources and exchange data between computers, and this exchange must be secure. Such security is provided by one or more encryption programs. In what is called the information age, information must be hidden from unauthorized access and protected from unauthorized changes, while remaining accessible to authorized users at any time. Critical data therefore needs sufficient protection and security.

Two major changes have shaped the need for security in our world. The first was the introduction of the computer, which created a need for software to protect data and files, especially on shared systems. The need is even more urgent when computers can be accessed via the public telephone network.

The second change was the introduction of distributed systems, in which networks carry data between terminals and computers and between one computer and another. This data needs protection while it is in transit.

To secure a network, cryptographic techniques such as symmetric-key and asymmetric-key encryption are used. Both are discussed in Chapter Two. In brief, symmetric-key encryption is well developed and efficient, but its main problem is how to share the secret key. Asymmetric, or public-key, encryption requires more computation and is less efficient for encrypting large messages. Nevertheless, public-key cryptography is used to solve the key-sharing problem faced by symmetric-key encryption.

Over time, attacks on the Internet and Internet-attached systems have grown rapidly; attack techniques have become more automated and cause greater damage, and hackers have been able to penetrate systems with less information. Designers of cryptosystems therefore try to increase the resistance of these systems to cryptanalysis, at the cost of increased complexity and running time.

Cryptology is the science of codes and ciphers. It comprises many techniques for transmitting data in a way that makes the content unreadable to anyone who does not have permission to read or write it, and cryptographic algorithms are the basis for protecting computer systems during network data transmission. Cryptographic algorithms, or application-specific security mechanisms, appear in many application areas, such as electronic mail, Kerberos (client-server), web access through the Secure Sockets Layer, and electronic data exchange. In these applications encryption is a built-in function, but it may also be provided as separate standalone software.

1.2 Motivation

Information security is an important issue. Critical data needs sufficient protection, and as computer systems have become widespread and complex, the importance of data security has increased. Cryptology is the science of codes and ciphers, comprising many techniques for transmitting data in a way that makes the content unreadable to anyone who does not have permission to read or write it, and cryptographic algorithms are the basis for protecting computer systems during network data transmission [1]. Many aspects of today's society involve sensitive information and rely on networking infrastructures for data transmission, so the security of this infrastructure is the most important challenge.

The challenge of data security has become more complex because the exchange of data has increased significantly in the last decade, as has the number of hackers active in this area. Specialists in data encryption are therefore eager to design cryptographic algorithms that are computationally simple yet strongly resistant to attack. These challenges are themselves the best motivation.

I have chosen to work on encryption and decryption algorithms, specifically the Hill cipher model. As mentioned above, specialists in data encryption are eager to design simple cryptographic algorithms, and the Hill cipher has this property, since it depends on simple linear algebra. The difficult tasks are to make the Hill cipher strongly resistant to attack and to make every matrix eligible to act as a key. These two challenges motivated me to work on this encryption algorithm.

1.3 Problem Statement

The Hill cipher is an application of modular linear algebra to cryptology. Many studies and papers have tried to use the Hill cipher algorithm to build a comprehensive cryptosystem, since it has many advantages. It is simple, since it uses matrix multiplication; it is fast and has high throughput; and it is a very strong substitution technique against ciphertext-only attacks [32, 43].

However, the Hill cipher has two compound problems, the second of which depends indirectly on the first. The first problem is that the Hill cipher requires the inverse of every matrix used on the encryption side. This inverse is not the ordinary mathematical inverse but the inverse relative to the modular value. A matrix therefore lacks an inverse not only when its determinant is zero but also whenever its determinant is not relatively prime to the modular value of the system, so many matrices have no inverse. As a result, the secret key cannot be generated either randomly or by formula, because its validity would be uncertain. Moreover, when the key matrix is not invertible, two different plaintext vectors can be mapped to the same ciphertext vector, and the recipient cannot determine which plaintext vector produced a given ciphertext vector. The second problem is that the key remains constant during encryption, so a hacker can recover it from pairs of plaintext and ciphertext. Given s plaintext vectors and the s corresponding ciphertext vectors, where s is the key matrix size, the hacker can form a plaintext matrix and a ciphertext matrix and solve the resulting matrix equation for the secret key [18].
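The first problem can be illustrated with a small sketch. The Python code below is not taken from the thesis, whose simulations were written in Matlab; the 2 × 2 matrices, the modular value 26, and the function names are assumptions chosen only for illustration. It checks the classical condition that the determinant be relatively prime to the modular value, and shows two different plaintext vectors colliding under a non-invertible key.

```python
from math import gcd

def det2(k):
    """Determinant of a 2x2 integer matrix given as nested lists."""
    return k[0][0] * k[1][1] - k[0][1] * k[1][0]

def is_valid_hill_key(k, m=26):
    """Classical Hill cipher condition: gcd(det(K) mod m, m) must equal 1."""
    return gcd(det2(k) % m, m) == 1

def encrypt_block(k, p, m=26):
    """Encrypt one plaintext vector p of length 2 as c = K * p mod m."""
    return [(k[0][0] * p[0] + k[0][1] * p[1]) % m,
            (k[1][0] * p[0] + k[1][1] * p[1]) % m]

# Illustrative keys (hypothetical, not from the thesis).
good_key = [[3, 3], [2, 5]]  # det = 9, gcd(9, 26) = 1: invertible mod 26
bad_key = [[2, 4], [6, 8]]   # det = -8 = 18 mod 26, gcd(18, 26) = 2: not invertible

# With the non-invertible key, two different plaintext vectors
# produce the same ciphertext vector.
c1 = encrypt_block(bad_key, [0, 0])
c2 = encrypt_block(bad_key, [13, 0])
```

Here `c1` and `c2` are equal even though the plaintext vectors differ, which is exactly the ambiguity the recipient cannot resolve.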

1.4 Suggested Solution

The suggested solution is a new cryptosystem called Mousa-Rushdi-Hill-Cipher (MRHC), which makes many modifications to the original Hill cipher. It solves the major problem of the Hill cipher, the noninvertible key matrix, which, to the author's knowledge, nobody had previously solved. Because all matrices become usable as keys, MRHC provides a higher degree of security while keeping the simplicity of the original Hill cipher; apart from the method of finding the matrix inverse, the cipher takes very little time compared with other systems. MRHC is in fact not merely a modification of the original Hill cipher: it is a complete new system that keeps the simple style of encryption but uses a new mathematical model that changes how the key matrix is handled.

1.5 Research Methodology

In this thesis we introduce the original Hill cipher model and point out its problems. We then formulate three techniques that modify the original Hill cipher to overcome these problems, present their mathematical models and the main program flowchart, prove certain restrictions that the second technique places on the modular value of the system, and give one numerical example for each technique.

We simulated the three techniques of our MRHC system, AES, and the original Hill cipher in Matlab version 7.6.0.324, and then analyzed the simulation results from the viewpoints of encryption time and security level. We used a very large data sample to make the results as accurate as possible. To complete the simulations before the deadline, we used forty nearly identical computers at Palestine Polytechnic University from 11 October 2009 until 14 December 2009.

The simulation proceeded in two steps. The first step compared the five encryption algorithms. In the second step we excluded the original Hill cipher, for two reasons. The main reason is that it took more time than the other algorithms no matter how we changed the system parameters: its time consumption grew very quickly, whereas that of the other algorithms grew very slowly. The secondary reason is that its key matrix cannot be selected randomly, which would restrict the other algorithms to keys eligible for the original Hill cipher; as a result, the data sample could never be large enough to reach credible and reliable results.

To analyze the results of the Matlab simulation, we used the mean and standard deviation to produce tables and figures comparing the five algorithms. We also defined new parameters to measure the effect of changing the key matrix on encryption time and the effect of the required calls to SHA-512.

1.6 Objectives

• Study the characteristics of the original Hill cipher, analyze its mathematical model with one example for each analysis, and identify the points of failure in the algorithm.

• Develop a new cryptosystem based on the original Hill cipher that retains all of its advantages and removes its disadvantages. The new cryptosystem has three techniques, each suitable for a certain range of parameters: plaintext length, key matrix size, and modular value of the system.

• Ensure error-free decryption. Unlike communication systems, which tolerate some error rate during data or image transfer, a cryptosystem must have no errors, since the decrypted ciphertext must be exactly the same as the plaintext. Our system takes this into account and uses the Bit_rate function to ensure that no bit errors occur.

• Target deployment in Internet browsers, security applications such as Kerberos, authentication ticket applications, and many other security-based environments.

• Simulate the MRHC techniques, AES, and the original Hill cipher in Matlab, a high-performance language for technical computing, in order to test our techniques and compare them comprehensively with AES and the original Hill cipher.

• Define the effective working range of each technique, to allow dynamic switching from one technique to another when developing complete standalone software in the future, in my doctoral research or in a business project.

1.7 Contributions

Many papers have tried to overcome the noninvertible matrix problem of the Hill cipher. Bibhudendra Acharya and his colleagues used self-invertible matrices for image encryption, but this technique is restricted to self-invertible matrices. Chu-Hsing Lin and his colleagues enhanced the security of the Hill cipher using a one-way hash function, which was a very good idea but still did not overcome the noninvertible matrix problem. This thesis builds on these previous works and makes the following contributions:

• Develop a new technique that modifies the Hill cipher algorithm to overcome its major problem, the noninvertible key matrix. The technique replaces the modular inverse of the matrix with the ordinary mathematical inverse and changes other steps of the original Hill cipher to accommodate this. In particular, the plaintext vector used on the encryption side is twice the size of the key matrix and is divided into two vectors, one of which is multiplied by the modular value and then added to the other. After all these changes, the new Hill cipher algorithm is called MRHC.

• Improve the security of the Hill cipher against known-plaintext attacks, building on the first contribution. This enhancement is achieved through multi-key generation based on a secure hash function. I chose a powerful one, SHA-512, and use it to produce 128 different numbers as input to the key matrix generation code.

• Remove the known plaintext-ciphertext pair problem. The number of equations and data available to the hacker is half the number of unknowns, so the hacker cannot calculate the key matrix by solving a mathematical model. The number of encryption trials the hacker needs to find a single 3 × 3 key matrix when the modular value is 26 is 26^9 ≈ 5.43e+12. This is less than for AES, but with a modular value of 52 (only the capital and small English letters) and a 5 × 5 key matrix the hacker needs approximately 52^25 ≈ 7.95e+42 encryption trials, which exceeds the security of AES by a factor of six million.

• Exploit the simple model of the Hill cipher to show that the MRHC algorithm takes less time than AES and the original Hill cipher and, for some parameter ranges, is also more secure.

• Develop three techniques of MRHC and define their critical points, that is, which technique is suitable for a given key matrix size, range of plaintext length, and modular value of the system.
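As a rough illustration of how a single SHA-512 call can yield 128 numbers, the sketch below splits the 512-bit digest into its 128 hexadecimal digits. This is only one plausible reading and an assumption on our part: the construction actually used in MRHC is described in Chapter Three, the function name and seed here are hypothetical, and the resulting values (each between 0 and 15) are not guaranteed to be distinct.

```python
import hashlib

def sha512_nibbles(seed):
    """Return the 128 hexadecimal digits of SHA-512(seed) as integers 0..15.

    Assumption for illustration only: the thesis may derive its
    128 numbers from the digest in a different way.
    """
    digest_hex = hashlib.sha512(seed).hexdigest()  # 64 bytes = 128 hex characters
    return [int(ch, 16) for ch in digest_hex]

# Hypothetical seed; both sides would need to agree on it in advance.
values = sha512_nibbles(b"example shared secret")
```

Because the hash is deterministic, sender and receiver holding the same seed obtain the same list, while a different seed gives an unrelated list.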


1.8 Outline of Thesis

The organization of this thesis can be summarized as follows:

1. Chapter One gives a brief overview of network security, the motivation of the thesis, the problem statement, the suggested solution, the research methodology, and the objectives of the thesis.

2. Chapter Two covers an introduction to cryptography, cryptography goals, a brief history of cryptography, symmetric and asymmetric encryption models, the mathematics of cryptography required in this thesis, the Hill cipher algorithm, and finally a literature review of previous studies on the Hill cipher.

3. Chapter Three introduces the modified Hill cipher system (MRHC), explains its mathematical model, presents the three techniques of MRHC with the necessary mathematical proofs or justifications, and finally gives examples of the system.

4. Chapter Four presents the simulation results of the three techniques of MRHC versus the original Hill cipher and AES, followed by an analysis of these results from the viewpoints of time and security level.

5. Chapter Five presents conclusions and future work.

Chapter Two

Cryptography and Hill Cipher Model

Contents

2.1 Introduction To Cryptography
2.2 Cryptography Goals
2.3 Basic Terminology Of Cryptography
2.4 A Brief History Of Cryptography
2.5 Symmetric and Asymmetric Encryption
2.6 Mathematics of Cryptography
2.7 Hill Cipher Algorithm
2.8 State Of The Art

2.1 Introduction To Cryptography

In the past, the use of cryptography was limited to protecting military information, diplomatic correspondence, and national security. Its range of application has expanded greatly in the modern era with the communications revolution: cryptography is now required to guard against intrusion, espionage, and piracy, and it is a powerful means of securing e-commerce.

Today cryptography plays the major role in protecting information technology applications, and information security is a top priority. Many applications, such as e-commerce, e-banking, e-mail, and medical databases, require the exchange of private information. For example, consider a sender, Alice, who wants to send a message m to a receiver, Bob, over an insecure communication channel. The channel may be a telephone line, a computer network, or any other channel. If the message contains secret data, it could be intercepted and read by hackers. They could also modify the message in transit in such a way that Bob would not be able to detect the change.

Cryptography is used to ensure that the contents of a message are transmitted confidentially and are not altered. Confidentiality means that nobody can understand the received message except the holder of the decryption key. Integrity means that the original information has not been changed or modified; it is achieved when the sender attaches to the message the output of a cryptographic operation called a hash function. A hash function maps the information to a short value that represents it. When the information arrives, the receiver recalculates the hash value. If the receiver's hash value equals the sender's, the integrity of the message is assured.
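The integrity check described above can be sketched in a few lines. The thesis does not name a particular hash function, so SHA-256 from Python's standard library is used here purely as an assumption, and the message contents and the helper names digest and is_intact are hypothetical.

```python
import hashlib

def digest(message: bytes) -> str:
    """Return the SHA-256 hash value of the message as a hex string."""
    return hashlib.sha256(message).hexdigest()

def is_intact(received: bytes, sent_digest: str) -> bool:
    """Receiver side: recompute the hash and compare it with the sender's."""
    return digest(received) == sent_digest

# Sender side: hash the original message.
sent = b"Transfer 100 to Bob"
sent_digest = digest(sent)

print(is_intact(b"Transfer 100 to Bob", sent_digest))  # unchanged message
print(is_intact(b"Transfer 900 to Bob", sent_digest))  # modified message
```

Any change to the message, even a single character, produces a different hash value, which is what lets the receiver detect the modification.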


2.2 Cryptography Goals

Cryptography can achieve several goals. An application may provide all of them simultaneously or only some of them. These goals are:

1. Confidentiality: the most commonly addressed goal; it ensures that nobody can understand the received message except the holder of the decryption key.

2. Authentication: the process of proving identity, that is, the assurance that the communicating entity is the one it claims to be. The user or system can prove its identity to another party that has no personal knowledge of that identity. (The primary forms of host-to-host authentication on the Internet today are name-based or address-based, both of which are notoriously weak.)

3. Data Integrity: ensures that the received message has not been altered in any way and is identical to the original plaintext message that was sent. This can be achieved by hashing at the sender side to create a message digest; the recipient uses the same method to create a digest and compares it with the one received.

4. Non-Repudiation: a mechanism to prove that the sender really sent the message, so that a later claim that the message was not sent is not accepted; it also covers proof that the message was received by the specified party [2].

5. Access Control: the prevention of unauthorized use of resources. This goal controls who can access the resources, when, under which restrictions and conditions the access can occur, and what level of permission is granted.

2.3 Basic Terminology Of Cryptography

Computers are used by millions of people for many purposes, such as banking, shopping, military applications and student records. Privacy is a critical issue in many of these applications: we need to make sure that unauthorized persons cannot read or secretly modify messages intended for other recipients.

Cryptography is the transformation of readable and understandable data into data that cannot be understood, for the purpose of securing it; more precisely, it refers to the methodology of concealing the content of messages. The word cryptography comes from the Greek words kryptos, meaning hidden, and graphikos, meaning writing [3]. The information that we need to hide is called the plaintext (𝒑). It is the original text, and it may be characters, numerical data, executable programs, pictures or any other kind of information; for example, the plaintext is the message at the sender before encryption, or the text at the receiver after decryption. The data that is transmitted is called the cipher text (𝒄). The term refers to the string of meaningless or unclear data that nobody except the intended recipients should be able to understand; it is the form in which the data travels through the network after encryption. Many algorithms are used to transform plaintext into cipher text [4].

A cipher is the algorithm used to transform plaintext into cipher text. This process is called encryption or enciphering (encoding); in other words, it is a mechanism for converting readable and understandable data into meaningless data, and it is represented as follows:

$$C = E_{K}(P) \qquad (1)$$

The opposite mechanism is called deciphering (decoding): it is the algorithm that recovers the plaintext from the cipher text. This process is called decryption; in other words, it is a mechanism for converting meaningless data back into readable and understandable data:

$$P = D_{K^{-1}}(C) \qquad (2)$$

The key is an input to the encryption algorithm, and its value must be independent of the plaintext. It is used to transform the plaintext into cipher text, and different keys yield different cipher texts. On the deciphering side, the inverse of the key is used inside the algorithm instead of the key.

Computer security is the generic name for the collection of tools designed to protect data from hackers, theft, corruption or natural disaster, while keeping the data available to its users; the Avast antivirus program is one example of such tools.

Network security refers to any activity designed to protect the usability, integrity, reliability and safety of data during its transmission over the network. It combines hardware and software, and its activities include anti-virus and anti-spyware software, firewalls, intrusion prevention systems and virtual private networks.

Internet security comprises the measures and procedures that protect data during its transmission over a collection of interconnected networks, while information security is about how to prevent attacks on information-based systems or, failing that, to detect them.

Cryptanalysis (code breaking) is the study of the principles and methods of deciphering cipher text without knowing the key. Typically this includes finding or guessing the secret key. It is a complex process involving statistical analysis, analytical reasoning, mathematical tools and pattern finding. The field that covers both cryptography and cryptanalysis is called cryptology.

Symmetric encryption converts plaintext into cipher text at the sender with the same key that is used to retrieve the plaintext from the cipher text at the recipient, while asymmetric encryption converts plaintext into cipher text at the sender with a key different from the one used to retrieve the plaintext at the recipient.

In a passive attack, the attacker or unauthorized party only monitors the traffic between the sender and the recipient and does not attempt to breach or shut down a service. A passive attack is very difficult to discover, since the unauthorized party leaves no traces. In an active attack, on the other hand, the attacker actively attempts to cause harm to a network or its data: the attacker does not just monitor the traffic but attempts to breach or shut down a service.

Authentication is the process of determining whether someone is who they declare themselves to be, for example by a user name and password on a login page, while authorization is the process of ensuring that this person is permitted to perform a given action.

A brute-force attack is one in which the attacker tries all of the possible keys that may have been used to encrypt or decrypt the information.

2.4 A Brief History Of Cryptography

Encryption is as old as writing itself. In this short historical overview, we review the most important milestones in the progress of data encryption. It is believed that the first texts to use any encryption technique were written about 4000 years ago by the ancient Egyptians: the hieroglyphic inscriptions on the tomb of the nobleman Khnumhotep II were written with a number of unusual symbols to confuse or obscure their meaning.

About 2000 years ago, the Greeks used a cylindrical device called the Scytale; the sender's cylinder was identical to the recipient's. A narrow strip of parchment or leather was wound around the Scytale and the message was written across it, so that anyone who tried to read the unwound strip would find only meaningless letters. The only person who could read the text was the one who had a matching Scytale. This technique is similar to the transposition technique, which is discussed under symmetric encryption later [5].

The Arab contribution to data encryption dates back to early times. Through the analysis of the texts of the Qur'an, Muslim scholars were able to invent the frequency analysis technique for breaking monoalphabetic substitution ciphers about 1200 years ago. It was described by Sheikh Al-Kindi in his famous book "Risalah fi Istikhraj al-Mu'amma" (Manuscript for the Deciphering Cryptographic Messages); he was the most advanced figure in cryptography from his time until the Second World War. Figure 2.1 shows the first page of Al-Kindi's book. After Al-Kindi's invention, all cipher texts became vulnerable to this cryptanalytic technique until the development of the polyalphabetic cipher in 1465 by Leone Battista Alberti, who is known as "The Father of Western Cryptology" [6].

Figure 2.1: The first page of al-Kindi's manuscript On Deciphering Cryptographic Messages

The next step was taken in 1518 by Trithemius, a German monk, who wrote a table of twenty-six columns and twenty-six rows in which each row repeats the row above it shifted by one letter. Figure 2.2 shows the Trithemius table.

Figure 2.2: Trithemius table

Thirty-five years later, Giovan Batista Belaso developed the previous technique by writing a keyword above the plaintext, letter for letter, with the keyword restarted at the beginning of each new plaintext word. The idea is that the keyword letter above a plaintext letter selects the cipher row that begins with that letter. In other words, if the plaintext letter is 'd' and its keyword letter is 'h', then the row of the Trithemius table beginning with 'h' is used to encipher the letter 'd'.

In 1585, Blaise de Vigenere developed the Trithemius table further by changing the way the keyword system worked. One of his techniques used the plaintext as its own key; another used the cipher text.

Forty-three years later, a Frenchman named Antoine Rossignol helped his army defeat the Huguenots by deciphering a captured message. After that victory, Rossignol deciphered messages for the French government many times. He used two lists to solve his ciphers: "one in which the plain elements were in alphabetical order and the code elements randomized, and one to facilitate decoding in which the code elements stood in alphabetical or numerical order while their plain equivalents were disarranged" [7].

In the eighteenth century, many "Black Chambers" were founded around the world to decipher encrypted messages; the most famous was the English Black Chamber, which was a pioneer and leader in this field.

In 1795, Thomas Jefferson invented the wheel cipher. The system was not used at that time but remained among Jefferson's papers until it was rediscovered accidentally in 1922 by the US Army. A system very similar to the Jefferson wheel cipher was developed by the U.S. Navy and is still in use today.

The wheel cipher is a cylinder composed of twenty-six cylindrical pieces of wood, on each of which the letters of the alphabet are inscribed in random order (Monticello Research Department) [8].

The development of data encryption began to accelerate after the invention of the telegraph: messages sent by telegraph are not secure, so a means of encrypting data before transmission was needed.

In 1854, Charles Wheatstone and Lyon Playfair invented the Playfair system, which uses a 5×5 square key, while the plaintext message is divided into adjacent pairs of letters; this system will be discussed later.

In 1859, Pliny Earle Chase developed the tomographic cipher system, in which a two-digit number is assigned to each character of the plaintext message by means of a table. These numbers are written so that the first digits form a row on top of the second digits. The bottom row is multiplied by nine, and the corresponding pairs are looked up in the table again to form the cipher text.

Before 1883, encryption often depended on hiding the algorithm to protect the data, which is not practical. The first major advance in cryptography was made in 1883 by Kerckhoffs, who developed a set of principles now known as Kerckhoffs' principles. The most important of them is to keep the key secret instead of hiding the algorithm itself [5].

Kerckhoffs' Principles [9]:
1. The ciphertext should be unbreakable.
2. The cryptosystem should be convenient for the correspondents.
3. The key should be easily remembered and changeable.
4. The ciphertext should be transmissible by telegraph.
5. The cipher apparatus should be easily portable.
6. The cipher machine should be relatively easy to use.

In 1915, two Dutch navy officers invented the rotor machine, which combined electrical and mechanical systems. In a simplified view, the rotor machine is an electrical system with twenty-six switches that are pressed according to the plaintext; each switch is attached by a wire to a random contact letter on the output. The wiring is placed inside a rotor, which is rotated by a gear every time a letter is pressed. So while pressing A the first time might generate the character D, the next time it might generate the character S [10].

In 1918, during World War I, the German Army used the ADFGVX cipher system, which consists of a table whose first row and first column hold the key letters while the remaining cells are filled in random order. Each plaintext character is replaced by the pair of letters at the head of its row and its column. Figure 2.3 illustrates the ADFGVX cipher system by replacing the character T with the pair AD [11].

Figure 2.3 : Example of Using ADFGVX cipher system.

Lester Hill was one of the few scientists who concluded that mathematics is essential to the success of encryption. Encryption remained as it was until 1941, when Adrian Albert built on Hill's work and designed an encryption system based on mathematics [12].

The Hill Cipher was the first cipher devised to make use of algebraic methods. It was created by the mathematician Lester Hill in 1929 and then published in his paper in 1931. It is an example of a block cipher, because it encrypts blocks of plaintext characters simultaneously. The Hill Cipher is not widely used, despite its simplicity and ease of use, because of two problems. First, owing to its linear nature, the secret key is easy to recover by solving a system of equations once a pair of plaintext and cipher text is known. Second, the Hill Cipher suffers from non-invertible key matrices: not only matrices with zero determinant, but all matrices whose determinant is not relatively prime to the modular value, cannot be used as keys. These two problems are the subject of my thesis: we try to make the Hill Cipher usable for all data in our system and also secure enough [13].

In 1949, Shannon published "Communication Theory of Secrecy Systems". In this paper, Shannon's analysis demonstrates several important features of the statistical nature of language that make the solution of nearly all previous ciphers very straightforward. The most important result of the paper is a measure of cryptographic strength called the "unicity distance" [14].

In 1976, a collaboration between Whitfield Diffie and Martin Hellman produced the Diffie-Hellman key agreement. The sender selects three values (x, a, P), computes s = a^x mod P and sends (s, a, P) to the recipient. The recipient chooses y, computes r = a^y mod P and sends r to the sender. The sender then uses r with (x, P) to compute the shared secret key k = r^x mod P, and the recipient uses s with (y, P) to compute the same key k = s^y mod P. Figure 2.4 explains this idea [15].

Figure 2.4 : Diffie-Hellman key generation (the sender holds x, a, p and computes s = a^x mod p and k = r^x mod p; the recipient holds y and computes r = a^y mod p and k = s^y mod p)
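The exchange can be traced with small numbers. The following is a minimal sketch only: the values P = 23, a = 5, x = 6 and y = 15 are illustrative assumptions that do not come from the thesis and are far too small to be secure.

```python
# Toy Diffie-Hellman key agreement with illustrative (insecure) parameters.
P = 23   # public prime modulus (assumed value)
a = 5    # public base (assumed value)
x = 6    # sender's secret (assumed value)
y = 15   # recipient's secret (assumed value)

s = pow(a, x, P)          # sender computes s = a^x mod P and sends it
r = pow(a, y, P)          # recipient computes r = a^y mod P and sends it

k_sender = pow(r, x, P)     # sender:    k = r^x mod P
k_recipient = pow(s, y, P)  # recipient: k = s^y mod P

print(s, r, k_sender, k_recipient)
```

Both sides arrive at the same key because r^x = a^(xy) = s^y (mod P), while x and y themselves are never transmitted.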

After the Diffie-Hellman approach, cryptography was divided into symmetric and asymmetric cryptography, and many techniques and methods were then developed. The next section discusses symmetric and asymmetric encryption [16].

2.5 Symmetric and Asymmetric Encryption

Encryption is one of the strongest and most common ways of securing data. Encryption systems are divided into two main types: symmetric and asymmetric.

In symmetric encryption, also known as secret-key or single-key encryption, the receiver decrypts the message with the same key that the sender used to encrypt it. This was the only kind of system in use before the discovery and development of public-key encryption. In symmetric encryption, a secure method must be used to move the secret key between the sender and the receiver. Figure 2.5 shows how the system works. Symmetric encryption works by substitution, by transposition, or by a mixture of both techniques. Substitution maps each plaintext element to a cipher text element, while transposition rearranges the positions of the plaintext elements.

Figure 2.5 : Simplified model of conventional encryption (Plaintext, Encryption Algorithm, Ciphertext over an insecure channel, Decryption Algorithm, Plaintext; the shared key reaches both sides over a secure channel)

The simplest common cipher algorithm is the Caesar cipher. It assigns a numerical value to each plaintext character, adds the key value to it, and takes the remainder of the division by the modular value as the cipher text character, where the modular value is the largest numerical value plus one [17]. The mathematical model of the Caesar cipher is:

At the encryption side:

$$E_{n}(x) = (x + n) \bmod p \qquad (3)$$

At the decryption side:

$$D_{n}(x) = (x - n) \bmod p \qquad (4)$$

The following example illustrates the Caesar cipher model:

Example 2.1: Let the plaintext message be "Palestine" and the key value be 12, and use the simplest symmetric encryption algorithm, the Caesar cipher. The Caesar table is as follows:

Table 2.1: Caesar Table

a   b   c   d   e   f   g   h   i   j   k   l   m
0   1   2   3   4   5   6   7   8   9   10  11  12

n   o   p   q   r   s   t   u   v   w   x   y   z
13  14  15  16  17  18  19  20  21  22  23  24  25

Plaintext     Encryption Process     Ciphertext
p  15         (15+12) mod 26          1  b
a   0         ( 0+12) mod 26         12  m
l  11         (11+12) mod 26         23  x
e   4         ( 4+12) mod 26         16  q
s  18         (18+12) mod 26          4  e
t  19         (19+12) mod 26          5  f
i   8         ( 8+12) mod 26         20  u
n  13         (13+12) mod 26         25  z
e   4         ( 4+12) mod 26         16  q

The cipher text that arrives at the receiver is "bmxqefuzq". At the receiver, the cipher text enters the decryption process, which decrypts it as follows:

Cipher text   Decryption Process     Plaintext
b   1         ( 1 - 12) mod 26       15  p
m  12         (12 - 12) mod 26        0  a
x  23         (23 - 12) mod 26       11  l
q  16         (16 - 12) mod 26        4  e
e   4         ( 4 - 12) mod 26       18  s
f   5         ( 5 - 12) mod 26       19  t
u  20         (20 - 12) mod 26        8  i
z  25         (25 - 12) mod 26       13  n
q  16         (16 - 12) mod 26        4  e
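Equations (3) and (4) and Example 2.1 can be reproduced with a short sketch. It assumes the numbering of Table 2.1 (a = 0 up to z = 25) and input that contains letters only; the function name caesar is hypothetical. The last lines also illustrate the brute-force attack defined in section 2.3, since only 26 keys exist.

```python
import string

ALPHABET = string.ascii_lowercase  # a=0 ... z=25, as in Table 2.1

def caesar(text: str, key: int) -> str:
    """Shift every letter by `key` positions modulo the alphabet size."""
    p = len(ALPHABET)  # modular value: largest numerical value plus one
    return "".join(ALPHABET[(ALPHABET.index(ch) + key) % p]
                   for ch in text.lower())

cipher = caesar("Palestine", 12)   # equation (3): (x + n) mod p
plain = caesar(cipher, -12)        # equation (4): (x - n) mod p

# Brute force: an attacker without the key simply tries all 26 shifts.
candidates = [caesar(cipher, -k) for k in range(len(ALPHABET))]

print(cipher, plain, candidates.index("palestine"))
```

Decryption reuses the same function with the negated key, which mirrors the way equation (4) differs from equation (3) only in the sign of n.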

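A more sophisticated symmetric technique is the advance rail fence technique. It writes the original plaintext row by row under the digits of the key and reads the cipher text column by column, taking the columns in increasing order of their key digits. At the decryption side, the cipher text is written column by column and the plaintext is retrieved by reading the message row by row. The mathematical model of the advance rail fence for 𝒌𝒆𝒚 = 𝒅𝟏 𝒅𝟐 𝒅𝟑 ⋯ 𝒅𝒏, where, for illustration, (𝒅𝟑 > 𝒅𝟏 > 𝒅𝒏 > 𝒅𝟐), is:

$$
\begin{array}{c|ccccc}
key & d_1 & d_2 & d_3 & \cdots & d_n \\
\hline
 & p_1 & p_2 & p_3 & \cdots & p_n \\
 & p_{n+1} & p_{n+2} & p_{n+3} & \cdots & p_{2n} \\
 & \vdots & \vdots & \vdots & \ddots & \vdots \\
 & p_{l-n+1} & p_{l-n+2} & p_{l-n+3} & \cdots & p_{l}
\end{array}
\qquad (5)
$$

$$
\begin{array}{c|ccccc}
key & d_1 & d_2 & d_3 & \cdots & d_n \\
\hline
 & C_{2 \times l/n + 1} & C_{1} & C_{3 \times l/n + 1} & \cdots & C_{l/n + 1} \\
 & C_{2 \times l/n + 2} & C_{2} & C_{3 \times l/n + 2} & \cdots & C_{l/n + 2} \\
 & \vdots & \vdots & \vdots & \ddots & \vdots \\
 & C_{3 \times l/n} & C_{l/n} & C_{4 \times l/n} & \cdots & C_{2 \times l/n}
\end{array}
\qquad (6)
$$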

Where 𝒅𝟐 is the smallest digit among the 𝒏 digits of 𝒌𝒆𝒚 in this illustration, so its column is read first, 𝒍 is the number of characters in the plaintext message (after padding, so that 𝒏 divides 𝒍), 𝒑𝒊 is the 𝒊𝒕𝒉 character of the plaintext message and 𝑪𝒊 is the 𝒊𝒕𝒉 character of the cipher text output.

Example 2.2: To illustrate the advance rail fence technique, consider (𝒌𝒆𝒚 = 𝟓𝟐𝟑𝟔𝟒𝟏𝟕) and the plaintext (𝒑) "AES is a block cipher intended to replace DES for commercial application":

Using equation (5), the encryption of the message is:

Key:        5 2 3 6 4 1 7
Plaintext:  a e s i s a b
            l o c k c i p
            h e r i n t e
            n d e d t o r
            e p l a c e d
            e s f o r c o
            m m e r c i a
            l a p p l i c
            a t i o n x x

Output:     Aitoeciixeoedpsmatscrelfepiscntcrclnalhneemlaikidaorpobperdoacx

Using equation (6), the cipher text is written column by column under the key digits, and the decrypted message (plaintext) is read row by row:

Key:        5 2 3 6 4 1 7
Plaintext:  a e s i s a b
            l o c k c i p
            h e r i n t e
            n d e d t o r
            e p l a c e d
            e s f o r c o
            m m e r c i a
            l a p p l i c
            a t i o n x x

Output:     aesisablockcipherintendedtoreplacedesforcommercialapplication
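Example 2.2 can be checked with the following sketch of the technique as described above: write the text row by row, then read the columns in increasing order of their key digits. The function names, the removal of spaces and the padding character 'x' are assumptions taken from the example rather than a definitive implementation, and the key digits are assumed to be distinct.

```python
def encrypt(plaintext: str, key: str, pad: str = "x") -> str:
    """Write the text row by row under the key, read columns in key order."""
    n = len(key)
    text = "".join(ch for ch in plaintext.lower() if ch.isalpha())
    text += pad * (-len(text) % n)            # pad the last row, as in Example 2.2
    order = sorted(range(n), key=lambda col: key[col])  # smallest digit first
    return "".join(text[col::n] for col in order)       # text[col::n] is one column

def decrypt(ciphertext: str, key: str) -> str:
    """Write the text column by column in key order, read it row by row."""
    n = len(key)
    rows = len(ciphertext) // n
    order = sorted(range(n), key=lambda col: key[col])
    columns = [""] * n
    for rank, col in enumerate(order):
        columns[col] = ciphertext[rank * rows:(rank + 1) * rows]
    return "".join(columns[col][row] for row in range(rows) for col in range(n))

message = "AES is a block cipher intended to replace DES for commercial application"
cipher = encrypt(message, "5236417")
print(cipher)
print(decrypt(cipher, "5236417"))
```

The decrypted text still carries the two padding characters; removing them is left to the receiver, as in the example.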

In the previous examples, the plaintext is transformed into a cipher text and then transferred through an insecure channel to the receiver, while the secret key used in the encryption process is transferred through a secure channel. At the receiver side, the inverse of the secret key and/or the inverse of the encryption process is used to decrypt the cipher text and retrieve the original plaintext. The same mechanism underlies all encryption models, from the simplest to the most complicated: the encryption process needs a key to convert the plaintext into cipher text, while at the receiver the inverse of the process is applied to retrieve the original plaintext.

Symmetric encryption has several advantages over asymmetric encryption. First, it is faster, since it does not consume much time in data encryption and decryption. Second, secret key generation is easier than in asymmetric encryption. However, it has some disadvantages, for example the distribution and sharing of the secret key between the sender and the receiver. Symmetric key encryption is also incomplete, since some applications, such as authentication, cannot be fully implemented using symmetric encryption alone [18].

In 1976, Diffie and Hellman invented a new encryption technique called public-key encryption or asymmetric encryption. Unlike symmetric encryption, it does not require the sender and the receiver to share a secret key, and this is the main difference between the two. The sender holds the public key of the receiver, while the receiver holds his own secret key, which is extremely difficult or impossible to derive from the public key. No shared key is needed: the receiver is responsible for establishing his private and public keys, and he sends the public key to all senders over any channel he chooses, since a secure channel is not needed for the public key. Asymmetric encryption can use either the public or the secret key to encrypt the data, with the other key used for decryption. It can be used to implement the authentication and non-repudiation security services, as well as digital signatures and other applications that cannot be implemented using symmetric encryption. Figure 2.6 shows how the system works.

Figure 2.6 : Simplified model of asymmetric encryption (key ring labels: Joy, Alice, Ted, Mike and Bob's public key; Plaintext, Encryption Algorithm with Bob's public key, Ciphertext over an insecure channel, Decryption Algorithm with Bob's private key, Plaintext)

However, asymmetric encryption is slower and computationally much more complicated. Finally, asymmetric encryption treats the plaintext as a group of numbers that are manipulated mathematically, while symmetric encryption treats the plaintext as a group of symbols and characters that the encryption process may permute or substitute one for another.

So the nature of the data determines the type of encryption system, and every system has its own uses. For example, asymmetric encryption may be used for authentication or for sending the secret key used for decryption.

To understand asymmetric encryption, let us take the RSA model as an example. Its main steps are:

RSA Model Steps:
• Each user generates a public/private key pair by selecting two large primes $p$ and $q$ at random.
• Compute the modular value $n = p \times q$.
• Calculate Euler's function $\phi(n) = (p - 1) \times (q - 1)$.
• Select at random the public encryption key $e$, where $1 < e < \phi(n)$ and $e$ is relatively prime to $\phi(n)$.
• Solve the following equation to find the private decryption key $d$: $e \times d \equiv 1 \bmod \phi(n)$, with $0 < d < \phi(n)$.
• Publish the public encryption key: $PK = (e, n)$.
• Keep secret the private decryption key: $PR = (d, n)$.
• At the encryption side, the sender uses the encryption equation $C = P^{e} \bmod n$.
• At the decryption side, the receiver uses the decryption equation $P = C^{d} \bmod n$.

Example 2.3: Let part of the plaintext message be "Palestine". The RSA key generation process is then:
• Select two prime numbers: $p = 23$ and $q = 17$.
• Compute $n = p \times q = 23 \times 17 = 391$.
• Compute $\phi(n) = (p - 1) \times (q - 1) = 22 \times 16 = 352$.
• Select $e$ such that $\gcd(e, 352) = 1$; choose $e = 7$.
• Determine $d$ such that $d \times e \equiv 1 \bmod 352$ and $d < 352$. The value is $d = 151$, since $151 \times 7 = 1057 = 352 \times 3 + 1$.
• Publish the public key $PK = (7, 391)$.
• Keep the private key secret: $PR = (151, 391)$.

The encryption and decryption processes are then applied with the previously calculated parameters as follows:

Plaintext     Encryption Process
p → 15        15^7 mod 391 = 195
a → 00        00^7 mod 391 = 000
l → 11        11^7 mod 391 = 122
e → 04        04^7 mod 391 = 353
s → 18        18^7 mod 391 = 052
t → 19        19^7 mod 391 = 383
i → 08        08^7 mod 391 = 219
n → 13        13^7 mod 391 = 055
e → 04        04^7 mod 391 = 353

The cipher text arrives at the receiver, where it enters the decryption process and is decrypted as follows:

Decryption Process           Plaintext
195^151 mod 391 = 015        015 → p
000^151 mod 391 = 000        000 → a
122^151 mod 391 = 011        011 → l
353^151 mod 391 = 004        004 → e
052^151 mod 391 = 018        018 → s
383^151 mod 391 = 019        019 → t
219^151 mod 391 = 008        008 → i
055^151 mod 391 = 013        013 → n
353^151 mod 391 = 004        004 → e
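The numbers of Example 2.3 can be reproduced with Python's built-in modular exponentiation. This is a sketch under the example's own assumptions (each letter encrypted separately with a = 0 up to z = 25, toy-sized primes); the helper names are hypothetical, and pow(e, -1, phi) requires Python 3.8 or later.

```python
from math import gcd

p, q = 23, 17
n = p * q                      # modular value, 391
phi = (p - 1) * (q - 1)        # Euler's function, 352
e = 7                          # public exponent chosen in Example 2.3
assert gcd(e, phi) == 1        # e must be relatively prime to phi(n)
d = pow(e, -1, phi)            # private exponent: e * d = 1 mod phi(n)

def to_numbers(text: str) -> list:
    """Map letters to 0..25 (a=0), as in Table 2.1."""
    return [ord(ch) - ord("a") for ch in text.lower()]

def to_text(numbers) -> str:
    return "".join(chr(v + ord("a")) for v in numbers)

cipher = [pow(m, e, n) for m in to_numbers("Palestine")]   # C = P^e mod n
plain = to_text(pow(c, d, n) for c in cipher)              # P = C^d mod n

print(d, cipher, plain)
```

Encrypting one small letter value at a time, as the example does, keeps the arithmetic easy to follow but is not how RSA is used in practice.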

The mathematical model for symmetric and asymmetric encryption consists of a key, an encryption algorithm, a decryption algorithm, and either a strongly secured channel for transmitting the secret key or any channel for transmitting the public key between the sender and the receiver. The mathematical model is similar to equations (1) and (2):

At the encryption side:

$$C = E_{K}(P)$$

At the decryption side:

$$P = D_{K}(C)$$

Where 𝑪 is the cipher text to be sent, 𝑬 is the encryption algorithm, 𝑷 is the plaintext, 𝑫 is the decryption algorithm, and 𝑲 is the key used inside the encryption and/or decryption process.

2.6 Mathematics of Cryptography

Cryptography is based on mathematics. Many branches of mathematics are used in cryptography, such as number theory, modular arithmetic, linear algebra and matrices, and they play an important role in it. In this part I discuss the mathematical topics and operations that are used in my thesis and give the algorithm steps for these operations as flowcharts [19].

2.6.1 Integer operations

In this part, I discuss one integer operation, the greatest common divisor, since this operation is needed in my thesis to understand and implement the MATLAB code of the thesis's mathematical model.

Any two positive integers have one or more common divisors, but we are interested in the greatest common divisor. For example, the common divisors of 120 and 27 are shown in the following figure.

Figure 2.7 : Common divisors of two integers (divisors of 120: 1, 2, 3, 4, 5, 6, 8, 10, 12, 15, 20, 24, 30, 40, 60, 120; divisors of 27: 1, 3, 9, 27; common region: 1 and 3)

From the common region, we note that 27 and 120 have two common divisors, of which the greatest is 3. The greatest common divisor ($\gcd$) can be calculated using the Euclidean algorithm as follows:
1- Set the first value in $r_1$ and the second value in $r_2$.
2- $g = \operatorname{fix}(r_1 / r_2)$
3- $r = r_1 - g \times r_2$
4- $r_1 = r_2$
5- $r_2 = r$
6- If $r_2 > 0$, repeat from step 2; otherwise go to step 7.
7- $\gcd(\text{first value}, \text{second value}) = r_1$

Figure 2.8 : Flowchart of Finding Greatest Common Divisor (START; read a, b; r1 = a, r2 = b; while r2 > 0: g = fix(r1/r2), r = r1 - g*r2, r1 = r2, r2 = r; return r1; END)
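The seven steps of the Euclidean algorithm translate almost line for line into code. The sketch below is written in Python rather than the MATLAB used in the thesis, so MATLAB's fix is replaced by integer division, which gives the same result for the non-negative integers assumed here; the function name gcd_euclid is hypothetical.

```python
def gcd_euclid(first: int, second: int) -> int:
    """Greatest common divisor of two non-negative integers (steps 1 to 7)."""
    r1, r2 = first, second          # step 1
    while r2 > 0:                   # step 6
        g = r1 // r2                # step 2: fix(r1 / r2)
        r = r1 - g * r2             # step 3: remainder of r1 divided by r2
        r1 = r2                     # step 4
        r2 = r                      # step 5
    return r1                       # step 7

print(gcd_euclid(120, 27))
```

For 120 and 27 the pairs (r1, r2) run through (120, 27), (27, 12), (12, 3) and (3, 0), so the function returns 3, in agreement with Figure 2.7.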

2.6.2 Matrix operations

Most of the work in this thesis depends on matrix operations. Some of these operations are familiar to everyone; others are harder to understand, especially when we need to find the inverse of a matrix relative to a specific modular value. I therefore mention here the matrix properties that I need in my thesis.

First of all, any matrix consists of one or more columns and one or more rows. When the number of columns equals the number of rows, the matrix is called a square matrix. All keys used in the original Hill cipher, or in any modification based on the Hill cipher, are square matrices, while the plaintext and cipher text data are converted into numerical values and stored in column vectors.

Example 2.4: Find the inverse of

$$A = \begin{pmatrix} 1 & 121 & 44 \\ 12 & 65 & 87 \\ 125 & 33 & 99 \end{pmatrix}$$

To find the inverse of A, we find the adjoint matrix of A and the determinant of A, and then divide the adjoint of A by the determinant of A, as follows:

$$\operatorname{adj}(A) = \begin{pmatrix} 3564 & -10527 & 7667 \\ 9687 & -5401 & 441 \\ -7729 & 15092 & -1387 \end{pmatrix}$$

$$\det(A) = 835615$$

$$A^{-1} = \begin{pmatrix} 0.0043 & -0.0126 & 0.0092 \\ 0.0116 & -0.0065 & 0.0005 \\ -0.0092 & 0.0181 & -0.0017 \end{pmatrix}$$

The inverse of this matrix will be recalculated in the next section, this time relative to a modular value.
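The adjoint, determinant and inverse of Example 2.4 can be verified with a small pure-Python sketch. It handles 3×3 matrices only, the helper names are hypothetical, and exact fractions are used so that the product of the matrix and its inverse can be compared with the identity without rounding error.

```python
from fractions import Fraction

A = [[1, 121, 44], [12, 65, 87], [125, 33, 99]]

def minor(m, i, j):
    """2x2 matrix left after deleting row i and column j of a 3x3 matrix."""
    return [[m[r][c] for c in range(3) if c != j] for r in range(3) if r != i]

def det2(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def det3(m):
    """Determinant by cofactor expansion along the first row."""
    return sum((-1) ** j * m[0][j] * det2(minor(m, 0, j)) for j in range(3))

def adjugate3(m):
    """Adjoint (adjugate): the transpose of the cofactor matrix."""
    return [[(-1) ** (i + j) * det2(minor(m, j, i)) for j in range(3)]
            for i in range(3)]

def inverse3(m):
    """Ordinary inverse: adjoint divided by determinant."""
    d = det3(m)
    if d == 0:
        raise ValueError("matrix is singular")
    return [[Fraction(a, d) for a in row] for row in adjugate3(m)]

inv = inverse3(A)
print(det3(A))
print([[round(float(x), 4) for x in row] for row in inv])
```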

2.6.3 Modular arithmetic

In mathematics, modular arithmetic is a system of arithmetic for integers in which the integers wrap around after they reach a certain value called the modulus. Modular arithmetic was developed by Carl Friedrich Gauss in his book "Disquisitiones Arithmeticae", published in 1801. In other words, when one integer is divided by another, the result may leave a remainder, called the residue. For example, if we have two integers 𝒂 and 𝒃 and divide 𝒂 by 𝒃, then if 𝒂 is positive the remainder is the result of 𝒂 modulo 𝒃, but if 𝒂 is a negative integer and the remainder of |𝒂| divided by 𝒃 is not zero, then (𝒃 − 𝒓𝒆𝒎𝒂𝒊𝒏𝒅𝒆𝒓) is the result of 𝒂 modulo 𝒃. Here 𝒃 is called the modulus and must be a positive integer, and 𝒂 may be any integer [20].

Example 2.5:
11 mod 5 = 1        -11 mod 5 = 4
13 mod 17 = 13      -13 mod 17 = 4
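The rule for negative integers can be written out explicitly and compared with Example 2.5. The sketch below is illustrative only and the function name mod is hypothetical; Python's own % operator already follows the same convention for a positive modulus.

```python
def mod(a: int, b: int) -> int:
    """Residue of a modulo b, following the rule described in the text."""
    if b <= 0:
        raise ValueError("the modulus must be a positive integer")
    remainder = abs(a) % b            # remainder of |a| divided by b
    if a >= 0 or remainder == 0:
        return remainder
    return b - remainder              # negative a: b - remainder

for a, b in [(11, 5), (13, 17), (-11, 5), (-13, 17)]:
    print(a, "mod", b, "=", mod(a, b))
```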

In my thesis, I use the modulo operation frequently, for example to find the inverse of a matrix in the original Hill cipher, to distribute the cipher text at the encryption side of the modified Hill cipher (MRHC), and in many other places.

Algorithm steps to find the inverse of a matrix relative to a modular value:
1- Receive the key matrix $k$ and the modular value $n$.
2- Find the determinant of the matrix and save it in $d$.
3- Find the adjoint of the matrix and save it in $j$.
4- Calculate the value of $l$ from $l = \operatorname{mod}(d, n)$.
5- Set the counter to zero: $y = 0$.
6- Calculate the value of $c$ from $c = \dfrac{1 - n \times y}{l}$.
7- Increment the value of the counter by one: $y = y + 1$.
8- If the value of $c$ is an integer, go to step 9; otherwise go to step 6.
9- Calculate $key\_inv$ from $key\_inv = j \times c$.
10- Reduce $key\_inv$ modulo $n$ using $key\_inv = \operatorname{mod}(key\_inv, n)$.
11- Return $key\_inv$.

Figure 2.9 : Flowchart of Finding inverse of matrix relative to modular value (START; read n, k; d = det(k), j = adj(k), l = mod(d, n), y = 0, c = (1 - n*y)/l; while fix(c) - c is not 0: y = y + 1, c = (1 - n*y)/l; then key_inv = j*c, key_inv = mod(key_inv, n); return key_inv; END)

Example2.6: 𝟏 𝟏𝟐𝟏 Find the inverse of 𝑨 = 𝟏𝟐 𝟔𝟓 𝟏𝟐𝟓 𝟑𝟑

𝟒𝟒 𝟖𝟕 with respect to 126 𝟗𝟗

Note that there are many ways to find the inverse of a matrix relative to a modular value. This example uses the fast method that I use in the original Hill cipher; another, slower method is introduced in the second part of this chapter. With 𝑘 = 𝐴 and 𝑛 = 126:

\[ d = \det(k) = 835615, \qquad j = \operatorname{adj}(k) = \begin{bmatrix} 3564 & -10527 & 7667 \\ 9687 & -5401 & 441 \\ -7729 & 15092 & -1387 \end{bmatrix} \]

\[ l = \operatorname{mod}(835615, 126) = 109, \qquad y = 0 \]

\[ c = \frac{1 - n \times y}{l} = \frac{1 - 126 \times 0}{109} = 0.0092 \]

Note that the value of 𝒄 is not an integer, so increment the counter 𝒚 by one and recalculate 𝒄:

\[ y = y + 1 = 1, \qquad c = \frac{1 - n \times y}{l} = \frac{1 - 126 \times 1}{109} = -1.1468 \]

Again the value of 𝒄 is not an integer, so the counter 𝒚 is incremented and 𝒄 recalculated repeatedly. Finally, when 𝒚 reaches 77, the value of 𝒄 becomes an integer:

\[ y = 77, \qquad c = \frac{1 - n \times y}{l} = \frac{1 - 126 \times 77}{109} = -89 \]

\[ key\_inv = j \times c = \begin{bmatrix} 3564 & -10527 & 7667 \\ 9687 & -5401 & 441 \\ -7729 & 15092 & -1387 \end{bmatrix} \times (-89) = \begin{bmatrix} -317196 & 936903 & -682363 \\ -862143 & 480689 & -39249 \\ 687881 & -1343188 & 123443 \end{bmatrix} \]

\[ key\_inv = \operatorname{mod}(key\_inv, 126) = \begin{bmatrix} 72 & 93 & 53 \\ 75 & 125 & 63 \\ 47 & 98 & 89 \end{bmatrix} \]

The value of 𝒌𝒆𝒚_𝒊𝒏𝒗 is then returned to the main program that needs it. Note that, compared with calculating the ordinary mathematical inverse, this method adds an overhead equal to the number of loop iterations needed to find the value of 𝒚.
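The counter-based search can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the thesis implementation: the function names (`minor`, `det`, `adjugate`, `inverse_mod`) are hypothetical, the determinant is computed by cofactor expansion (adequate only for small key matrices), and the loop is bounded by 𝒍 because a solution, when it exists, has 𝒚 < 𝒍. Integer arithmetic replaces the `fix(c) - c == 0` test of the flowchart to avoid floating-point rounding.

```python
def minor(m, i, j):
    """Matrix m without row i and column j."""
    return [row[:j] + row[j + 1:] for r, row in enumerate(m) if r != i]


def det(m):
    """Integer determinant by cofactor expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j] * det(minor(m, 0, j)) for j in range(len(m)))


def adjugate(m):
    """Adjoint (adjugate) matrix: transpose of the cofactor matrix."""
    s = len(m)
    if s == 1:
        return [[1]]
    return [[(-1) ** (i + j) * det(minor(m, j, i)) for j in range(s)]
            for i in range(s)]


def inverse_mod(k, n):
    """Return (key_inv, y): the inverse of k modulo n and the counter value used."""
    d = det(k)
    j = adjugate(k)
    l = d % n
    for y in range(l):                      # steps 5-8: search for an integer c
        if (1 - n * y) % l == 0:
            c = (1 - n * y) // l            # exact integer division
            key_inv = [[(e * c) % n for e in row] for row in j]
            return key_inv, y
    raise ValueError("determinant is not relatively prime to the modular value")


A = [[1, 121, 44], [12, 65, 87], [125, 33, 99]]
key_inv, y = inverse_mod(A, 126)
print(det(A), y)   # 835615 77
print(key_inv)     # [[72, 93, 53], [75, 125, 63], [47, 98, 89]]
```

The search takes at most 𝒍 iterations, which is the overhead mentioned above; an extended Euclidean algorithm would find the same multiplier in far fewer steps.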

2.7 Hill Cipher Algorithm.

The Hill cipher was invented by Lester S. Hill in 1929. It is considered a kind of monoalphabetic polygraphic substitution cipher and uses an algebraic method. It is also a good example of encrypting data in blocks, since it encrypts a group of characters at once. The idea of the Hill cipher is matrix multiplication, in which every character or group of characters in the plaintext is substituted by a character or group of characters in the cipher text. Each character is assigned a numerical value [21].

The Hill cipher is one of the well-known symmetric encryption systems and has several advantages. It is simple, since it uses only matrix multiplication. It is fast and has high throughput, and it resists cipher text-only attacks well, since it largely hides the letter frequency properties of English. However, it has two related problems, the second of which depends indirectly on the first. The first problem is that the Hill cipher requires an inverse for each matrix used at the encryption side, and many matrices have no inverse. Therefore, the secret key can be produced neither randomly nor mathematically without uncertainty about its validity. The second problem appears when the key remains constant during the encryption process: it is then easy for an attacker to recover the key once he obtains pairs of plaintext and cipher text [22].

2.7.1 Concept of Hill Cipher Model.

To encrypt a plaintext message of length 𝒍, we divide it into 𝒃 blocks of 𝒔 characters each. If 𝒍 is not a multiple of 𝒔, then 𝒍 − 𝒔 × 𝒃 characters are left over after the 𝒃 full blocks, so the message is padded with 𝒔 − (𝒍 − 𝒔 × 𝒃) extra characters and the number of blocks increases by one. We also need an 𝒔 × 𝒔 key matrix, and each character is coded with a unique integer in the range {𝟎, 𝟏, …, 𝒏 − 𝟏}, where 𝒏 is the modular value. During the decryption process we need the inverse of the matrix used in the encryption. It is important to notice that this inverse is calculated relative to 𝒏: the matrices that have an inverse are those whose determinant is relatively prime to the modular value 𝒏 [23].
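The block preparation and the key validity condition can be sketched as follows. This is only an illustration: the coding A = 0, …, Z = 25, the padding character "X", and the function names `to_blocks` and `is_valid_key_determinant` are assumptions made for the example and are not taken from the thesis code.

```python
from math import gcd


def to_blocks(text, s, pad="X"):
    """Code A..Z as 0..25, pad to a multiple of s, and split into blocks of s values."""
    values = [ord(ch) - ord("A") for ch in text.upper() if "A" <= ch <= "Z"]
    extra = (-len(values)) % s              # number of padding characters needed
    values += [ord(pad) - ord("A")] * extra
    return [values[i:i + s] for i in range(0, len(values), s)]


def is_valid_key_determinant(d, n):
    """A key is usable only if its determinant is relatively prime to n."""
    return gcd(d % n, n) == 1


print(to_blocks("HELLO", 2))                   # [[7, 4], [11, 11], [14, 23]]
print(is_valid_key_determinant(15, 26))        # True
print(is_valid_key_determinant(4, 26))         # False
print(is_valid_key_determinant(835615, 126))   # True
```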

The encryption and decryption processes are described by the following mathematical equations.

At the encryption side:

\[ c = k \times p \bmod n \tag{7} \]

If we assume a plaintext block of three values and a key matrix that consists of nine values, then the equations at the encryption side are:

\[ c_1 = (k_{11} \times p_1 + k_{12} \times p_2 + k_{13} \times p_3) \bmod n \tag{8} \]

\[ c_2 = (k_{21} \times p_1 + k_{22} \times p_2 + k_{23} \times p_3) \bmod n \tag{9} \]

\[ c_3 = (k_{31} \times p_1 + k_{32} \times p_2 + k_{33} \times p_3) \bmod n \tag{10} \]

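These equations follow a model similar to the Caesar cipher. Hill converted them into the general form of a matrix operation, which gives the following model:

\[ \begin{bmatrix} c_1 \\ c_2 \\ \vdots \\ c_s \end{bmatrix} = \begin{bmatrix} k_{11} & k_{12} & \dots & k_{1s} \\ k_{21} & k_{22} & \dots & k_{2s} \\ \vdots & \vdots & & \vdots \\ k_{s1} & k_{s2} & \dots & k_{ss} \end{bmatrix} \times \begin{bmatrix} p_1 \\ p_2 \\ \vdots \\ p_s \end{bmatrix} \bmod n \tag{11} \]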

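Where 𝒄 is the cipher text, 𝒑 is the plaintext, 𝒌 is the key matrix, 𝒏 is the modular value, and 𝒔 is the size of the key matrix.

At the decryption side:

\[ p = k^{-1} \times c \bmod n \tag{12} \]

For a key of nine values, with \(k^{-1}_{ij}\) denoting entry (𝑖, 𝑗) of the inverse key matrix:

\[ p_1 = (k^{-1}_{11} \times c_1 + k^{-1}_{12} \times c_2 + k^{-1}_{13} \times c_3) \bmod n \tag{13} \]

\[ p_2 = (k^{-1}_{21} \times c_1 + k^{-1}_{22} \times c_2 + k^{-1}_{23} \times c_3) \bmod n \tag{14} \]

\[ p_3 = (k^{-1}_{31} \times c_1 + k^{-1}_{32} \times c_2 + k^{-1}_{33} \times c_3) \bmod n \tag{15} \]

And in general matrix notation:

\[ \begin{bmatrix} p_1 \\ p_2 \\ \vdots \\ p_s \end{bmatrix} = \begin{bmatrix} k_{11} & k_{12} & \dots & k_{1s} \\ k_{21} & k_{22} & \dots & k_{2s} \\ \vdots & \vdots & & \vdots \\ k_{s1} & k_{s2} & \dots & k_{ss} \end{bmatrix}^{-1} \times \begin{bmatrix} c_1 \\ c_2 \\ \vdots \\ c_s \end{bmatrix} \bmod n \tag{16} \]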

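The general mathematical model for recovering the key matrix in a known plaintext-cipher text attack, where each column of 𝑷 is a known plaintext block, the matching column of 𝑪 is its cipher text block, and the inverse of 𝑷 is taken modulo 𝒏, is:

\[ \begin{bmatrix} C_{11} & C_{12} & \dots & C_{1s} \\ C_{21} & C_{22} & \dots & C_{2s} \\ \vdots & \vdots & & \vdots \\ C_{s1} & C_{s2} & \dots & C_{ss} \end{bmatrix} \times \begin{bmatrix} p_{11} & p_{12} & \dots & p_{1s} \\ p_{21} & p_{22} & \dots & p_{2s} \\ \vdots & \vdots & & \vdots \\ p_{s1} & p_{s2} & \dots & p_{ss} \end{bmatrix}^{-1} \bmod n = \begin{bmatrix} k_{11} & k_{12} & \dots & k_{1s} \\ k_{21} & k_{22} & \dots & k_{2s} \\ \vdots & \vdots & & \vdots \\ k_{s1} & k_{s2} & \dots & k_{ss} \end{bmatrix} \tag{17} \]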

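To understand the Hill cipher mechanism, consider the following example.

Example 2.7: Let 𝒏 = 𝟐𝟔, let the key matrix be

\[ k = \begin{bmatrix} 1 & 2 \\ 2 & 19 \end{bmatrix} \]

and let part of the plaintext be

\[ p = \begin{bmatrix} 2 \\ 19 \end{bmatrix} \]

Note that to use a key in the Hill cipher, its determinant must be relatively prime to 26. This holds here, since the determinant of the key is 15 and gcd(15, 26) = 1. Using encryption equation (7):

\[ c = \begin{bmatrix} 1 & 2 \\ 2 & 19 \end{bmatrix} \times \begin{bmatrix} 2 \\ 19 \end{bmatrix} \bmod 26 = \begin{bmatrix} 14 \\ 1 \end{bmatrix} \]

This vector is transferred across the network from the sender to the receiver. At the decryption side we need to find 𝑲−𝟏 by applying the row echelon form to the augmented matrix as follows (𝑅₁ and 𝑅₂ denote the rows, and 7 is the inverse of 15 modulo 26):

\[ \begin{aligned}
\left[\begin{array}{cc|cc} 1 & 2 & 1 & 0 \\ 2 & 19 & 0 & 1 \end{array}\right]
&\rightarrow \left[\begin{array}{cc|cc} 1 & 2 & 1 & 0 \\ 0 & 15 & -2 & 1 \end{array}\right] \bmod 26 && (R_2 - 2R_1) \\
&= \left[\begin{array}{cc|cc} 1 & 2 & 1 & 0 \\ 0 & 15 & 24 & 1 \end{array}\right] \\
&\rightarrow \left[\begin{array}{cc|cc} 1 & 2 & 1 & 0 \\ 0 & 105 & 168 & 7 \end{array}\right] \bmod 26 && (7 \times R_2) \\
&= \left[\begin{array}{cc|cc} 1 & 2 & 1 & 0 \\ 0 & 1 & 12 & 7 \end{array}\right] \\
&\rightarrow \left[\begin{array}{cc|cc} 1 & 0 & -23 & -14 \\ 0 & 1 & 12 & 7 \end{array}\right] \bmod 26 && (R_1 - 2R_2) \\
&= \left[\begin{array}{cc|cc} 1 & 0 & 3 & 12 \\ 0 & 1 & 12 & 7 \end{array}\right]
\end{aligned} \]

\[ K^{-1} = \begin{bmatrix} 3 & 12 \\ 12 & 7 \end{bmatrix} \]

Now, using decryption equation (12) at the decryption side, the plaintext vector is calculated as follows:

\[ p = \begin{bmatrix} 3 & 12 \\ 12 & 7 \end{bmatrix} \times \begin{bmatrix} 14 \\ 1 \end{bmatrix} \bmod 26 = \begin{bmatrix} 2 \\ 19 \end{bmatrix} \]

This is the original vector that was sent.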
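Equations (7) and (12) can be checked on the numbers of Example 2.7 with a few lines of Python. The helper names `mat_vec_mod` and `mat_mul_mod` are hypothetical and serve only this sketch; the inverse key is the one obtained in the example rather than computed here.

```python
def mat_vec_mod(m, v, n):
    """Multiply matrix m by column vector v and reduce modulo n."""
    return [sum(a * b for a, b in zip(row, v)) % n for row in m]


def mat_mul_mod(x, y, n):
    """Multiply two matrices and reduce modulo n."""
    cols = list(zip(*y))
    return [[sum(a * b for a, b in zip(row, col)) % n for col in cols] for row in x]


n = 26
k = [[1, 2], [2, 19]]
k_inv = [[3, 12], [12, 7]]

c = mat_vec_mod(k, [2, 19], n)      # encryption, equation (7)
p = mat_vec_mod(k_inv, c, n)        # decryption, equation (12)
print(c)                            # [14, 1]
print(p)                            # [2, 19]
print(mat_mul_mod(k, k_inv, n))     # [[1, 0], [0, 1]]
```

The last line confirms that the key and its inverse multiply to the identity matrix modulo 26, which is why decryption returns the original block.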

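Example 2.8: Let 𝒏 = 𝟏𝟐𝟔, let the key matrix be

\[ k = \begin{bmatrix} 1 & 121 & 44 \\ 12 & 65 & 87 \\ 125 & 33 & 99 \end{bmatrix} \]

and let part of the plaintext be

\[ p = \begin{bmatrix} 21 \\ 17 \\ 87 \end{bmatrix} \]

Note that the key is chosen so that its determinant is relatively prime to 126: the determinant of the key is 835615 and gcd(835615, 126) = 1.

Using encryption equation (11):

\[ c = \begin{bmatrix} 1 & 121 & 44 \\ 12 & 65 & 87 \\ 125 & 33 & 99 \end{bmatrix} \times \begin{bmatrix} 21 \\ 17 \\ 87 \end{bmatrix} \bmod 126 = \begin{bmatrix} 5906 \\ 8926 \\ 11799 \end{bmatrix} \bmod 126 = \begin{bmatrix} 110 \\ 106 \\ 81 \end{bmatrix} \]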

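This vector is called the cipher text and is transferred across the network from the sender to the receiver. At the decryption side we need to find 𝑲−𝟏 by applying the row echelon form to the augmented matrix as follows (125 is the inverse of 125 modulo 126, and 89 is the inverse of 17 modulo 126):

\[ \begin{aligned}
&\left[\begin{array}{ccc|ccc} 1 & 121 & 44 & 1 & 0 & 0 \\ 12 & 65 & 87 & 0 & 1 & 0 \\ 125 & 33 & 99 & 0 & 0 & 1 \end{array}\right] \\
&\rightarrow \left[\begin{array}{ccc|ccc} 1 & 121 & 44 & 1 & 0 & 0 \\ 0 & -1387 & -441 & -12 & 1 & 0 \\ 0 & -15092 & -5401 & -125 & 0 & 1 \end{array}\right] \bmod 126 && (R_2 - 12R_1,\ R_3 - 125R_1) \\
&= \left[\begin{array}{ccc|ccc} 1 & 121 & 44 & 1 & 0 & 0 \\ 0 & 125 & 63 & 114 & 1 & 0 \\ 0 & 28 & 17 & 1 & 0 & 1 \end{array}\right] \\
&\rightarrow \left[\begin{array}{ccc|ccc} 1 & 121 & 44 & 1 & 0 & 0 \\ 0 & 15625 & 7875 & 14250 & 125 & 0 \\ 0 & 28 & 17 & 1 & 0 & 1 \end{array}\right] \bmod 126 && (125 \times R_2) \\
&= \left[\begin{array}{ccc|ccc} 1 & 121 & 44 & 1 & 0 & 0 \\ 0 & 1 & 63 & 12 & 125 & 0 \\ 0 & 28 & 17 & 1 & 0 & 1 \end{array}\right] \\
&\rightarrow \left[\begin{array}{ccc|ccc} 1 & 0 & -7579 & -1451 & -15125 & 0 \\ 0 & 1 & 63 & 12 & 125 & 0 \\ 0 & 0 & -1747 & -335 & -3500 & 1 \end{array}\right] \bmod 126 && (R_1 - 121R_2,\ R_3 - 28R_2) \\
&= \left[\begin{array}{ccc|ccc} 1 & 0 & 107 & 61 & 121 & 0 \\ 0 & 1 & 63 & 12 & 125 & 0 \\ 0 & 0 & 17 & 43 & 28 & 1 \end{array}\right] \\
&\rightarrow \left[\begin{array}{ccc|ccc} 1 & 0 & 107 & 61 & 121 & 0 \\ 0 & 1 & 63 & 12 & 125 & 0 \\ 0 & 0 & 1513 & 3827 & 2492 & 89 \end{array}\right] \bmod 126 && (89 \times R_3) \\
&= \left[\begin{array}{ccc|ccc} 1 & 0 & 107 & 61 & 121 & 0 \\ 0 & 1 & 63 & 12 & 125 & 0 \\ 0 & 0 & 1 & 47 & 98 & 89 \end{array}\right] \\
&\rightarrow \left[\begin{array}{ccc|ccc} 1 & 0 & 0 & -4968 & -10365 & -9523 \\ 0 & 1 & 0 & -2949 & -6049 & -5607 \\ 0 & 0 & 1 & 47 & 98 & 89 \end{array}\right] \bmod 126 && (R_1 - 107R_3,\ R_2 - 63R_3) \\
&= \left[\begin{array}{ccc|ccc} 1 & 0 & 0 & 72 & 93 & 53 \\ 0 & 1 & 0 & 75 & 125 & 63 \\ 0 & 0 & 1 & 47 & 98 & 89 \end{array}\right]
\end{aligned} \]

\[ \therefore\ k^{-1} = \begin{bmatrix} 72 & 93 & 53 \\ 75 & 125 & 63 \\ 47 & 98 & 89 \end{bmatrix} \]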

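Now use this inverse key at the decryption side to find the original plaintext as follows:

\[ p = \begin{bmatrix} 72 & 93 & 53 \\ 75 & 125 & 63 \\ 47 & 98 & 89 \end{bmatrix} \times \begin{bmatrix} 110 \\ 106 \\ 81 \end{bmatrix} \bmod 126 = \begin{bmatrix} 22071 \\ 26603 \\ 22767 \end{bmatrix} \bmod 126 = \begin{bmatrix} 21 \\ 17 \\ 87 \end{bmatrix} \]

This is the original vector that was sent from the sender.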
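The row-reduction method of Examples 2.7 and 2.8 can be sketched in Python as a Gauss-Jordan elimination in which every row operation is reduced modulo 𝒏. The function name `gauss_jordan_inverse_mod` is hypothetical, and the sketch makes one simplifying assumption: it only accepts pivots that are themselves invertible modulo 𝒏, so for a composite 𝒏 it may reject some matrices that do have an inverse. The built-in `pow(a, -1, n)` (Python 3.8 or later) supplies the modular inverse of each pivot.

```python
from math import gcd


def gauss_jordan_inverse_mod(k, n):
    """Invert square matrix k modulo n by row reduction of [k | I]."""
    s = len(k)
    a = [[x % n for x in row] + [int(i == j) for j in range(s)]
         for i, row in enumerate(k)]
    for col in range(s):
        # Choose a row whose entry in this column is invertible modulo n.
        pivot = next((r for r in range(col, s) if gcd(a[r][col], n) == 1), None)
        if pivot is None:
            raise ValueError("no invertible pivot in column %d" % col)
        a[col], a[pivot] = a[pivot], a[col]
        inv = pow(a[col][col], -1, n)               # e.g. 125 and 89 in Example 2.8
        a[col] = [(x * inv) % n for x in a[col]]    # scale the pivot row to 1
        for r in range(s):
            if r != col and a[r][col]:
                f = a[r][col]
                a[r] = [(x - f * y) % n for x, y in zip(a[r], a[col])]
    return [row[s:] for row in a]


print(gauss_jordan_inverse_mod([[1, 2], [2, 19]], 26))
# [[3, 12], [12, 7]]
print(gauss_jordan_inverse_mod([[1, 121, 44], [12, 65, 87], [125, 33, 99]], 126))
# [[72, 93, 53], [75, 125, 63], [47, 98, 89]]
```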

2.7.2 Hill Cipher Algorithm Problem.

As mentioned earlier, the Hill cipher has two main problems. The first problem is that of non-invertible matrices: when the key matrix is not invertible, the encrypted text cannot be decrypted, because two different plaintext vectors can map to the same cipher text vector. To illustrate this problem, consider the following example:

Example 2.9: In Example 2.7, change the key matrix to

\[ k = \begin{bmatrix} 1 & 2 \\ 2 & 8 \end{bmatrix} \]

The determinant of this key is 4, which is not relatively prime to 𝒏 = 𝟐𝟔 (gcd(4, 26) = 2), so no inverse can be found for this key. To show the side effect of the missing inverse at the decryption side, consider the following two plaintext vectors:

\[ p_1 = \begin{bmatrix} 2 \\ 19 \end{bmatrix}, \qquad p_2 = \begin{bmatrix} 2 \\ 6 \end{bmatrix} \]

\[ c_1 = \begin{bmatrix} 1 & 2 \\ 2 & 8 \end{bmatrix} \times \begin{bmatrix} 2 \\ 19 \end{bmatrix} \bmod 26 = \begin{bmatrix} 14 \\ 0 \end{bmatrix} \]

\[ c_2 = \begin{bmatrix} 1 & 2 \\ 2 & 8 \end{bmatrix} \times \begin{bmatrix} 2 \\ 6 \end{bmatrix} \bmod 26 = \begin{bmatrix} 14 \\ 0 \end{bmatrix} \]

We see that 𝒄𝟏 = 𝒄𝟐, so if the decryption side receives this vector, it cannot determine whether it came from 𝒑𝟏 or from 𝒑𝟐. This problem arises because the matrix determinant is not relatively prime to the modular value (26 in this example) [24].

The second problem of the Hill cipher is the known plaintext attack. Because of the linear nature of the Hill cipher, the cryptosystem can be broken under a known plaintext-cipher text pair attack [25]. For the case 𝒔 = 𝟐, an analyst who knows only two pairs of plaintext and cipher text can calculate the key matrix from equation (17), as the following example shows.

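Example 2.10:

Let the two known pairs be

\[ p_1 = \begin{bmatrix} 9 \\ 3 \end{bmatrix},\quad c_1 = \begin{bmatrix} 15 \\ 23 \end{bmatrix},\quad p_2 = \begin{bmatrix} 15 \\ 6 \end{bmatrix},\quad c_2 = \begin{bmatrix} 1 \\ 14 \end{bmatrix} \]

and 𝑛 = 26. Writing each plaintext vector as a row of 𝑃 and each cipher text vector as the matching row of 𝐶, the relation 𝑐 = 𝑘 × 𝑝 mod 𝑛 becomes 𝑃 × 𝑘ᵀ = 𝐶 mod 𝑛, which is equation (17) written for row vectors:

\[ \begin{bmatrix} 9 & 3 \\ 15 & 6 \end{bmatrix} \times k^{T} \bmod 26 = \begin{bmatrix} 15 & 23 \\ 1 & 14 \end{bmatrix} \]

The determinant of 𝑃 is 9, and its inverse modulo 26 is 3, so

\[ P^{-1} = 3 \times \begin{bmatrix} 6 & -3 \\ -15 & 9 \end{bmatrix} \bmod 26 = \begin{bmatrix} 18 & 17 \\ 7 & 1 \end{bmatrix} \]

\[ k^{T} = \begin{bmatrix} 18 & 17 \\ 7 & 1 \end{bmatrix} \times \begin{bmatrix} 15 & 23 \\ 1 & 14 \end{bmatrix} \bmod 26 = \begin{bmatrix} 1 & 2 \\ 2 & 19 \end{bmatrix} \]

Since this matrix is symmetric, the calculated key is

\[ k = \begin{bmatrix} 1 & 2 \\ 2 & 19 \end{bmatrix} \]

which is the key matrix of Example 2.7.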

2.8 State Of The Art

Although research in information security has made massive progress over the past years, and the design of symmetric block cipher algorithms has been part of this progress, one of the most complicated problems facing the designers of symmetric block ciphers is cryptanalysis, since it attacks the structure of the algorithm.

Many researchers have tried to find solutions to the Hill cipher problems or have used the Hill cipher algorithm in their systems. More than five hundred papers have dealt with the Hill cipher, many of them aiming to overcome cryptanalytic attacks. In this section I introduce the most important papers and theses that have worked on the Hill cipher algorithm, from its first use until today.

In 1929, Lester Hill published his paper "Cryptography in an Algebraic Alphabet", in which he derived cryptographic constructions from linear algebra by taking a model similar to the Caesar cipher and converting it into algebraic equations, which also increased the resistance of his model to the cryptanalysis of that time [26]. Two years later, Hill published a new paper, "Concerning Certain Linear Transformation Apparatus of Cryptography", which builds on the previous one. The new paper includes the basic mathematical model of the Hill cipher algorithm: it uses a transformation process and converts the single shift equation into matrix form, which increases the number of characters encrypted in each round of the encryption process and makes the system resistant to cryptanalysis based on English letter frequencies [27].

In 1991, Yi-Shiung Yeh et al., in their paper "A New Cryptosystem Using Matrix Transformation", tried to overcome the known plaintext attack by dividing the plaintext message into blocks of suitable length and concatenating each block with a random string; the numbers in their system also use different bases. They generate the key by starting with the identity matrix for both the encryption key and the inverse used in decryption, then go through a loop process to generate a pair consisting of the key and its inverse, and use the matrix transformation to resist the known and chosen plaintext attacks [17].

In 2000, Shahrokh Saeednia, in his paper "How To Make The Hill Cipher Secure", proposed a new system that increases the security of the Hill cipher by using permutations of the columns and rows of the key matrix to generate a new key for each block encryption. He introduced new parameters (𝒕, 𝒖) into the original Hill cipher model to meet the requirements of the key generation subsystem; these parameters are sent from the sender to the recipient [28].

In 2004, Chu-Hsing Lin et al. made two comments on Saeednia's system in their paper "Comments On Saeednia's Improved Scheme for the Hill Cipher". First, Saeednia's scheme costs a lot of time in matrix computation. Second, they proved that if an attacker collects pairs of the (𝒕, 𝒖) parameters, Saeednia's scheme has the same problem as the original Hill cipher. They then introduced a new scheme to overcome these weaknesses, using two encryption variables (𝒉𝒊𝒋, 𝒗), where 𝒉𝒊𝒋 is a random value and 𝒗 is generated from 𝒉𝒊𝒋 by a one-way hash function: the plaintext message is multiplied by 𝒉𝒊𝒋, then by the key, and finally 𝒗 is added to the result [19].

In 2005, Jeffrey Overbey et al. discussed the key space of the Hill cipher. In their paper they calculate and prove the total number of eligible Hill cipher matrices, and compare it with the total number of matrices and the number of involutory matrices for any dimension and any modular value. These proofs and calculations make it possible to determine the effect of the matrix dimension and of the modular value. They conclude that the key space grows sufficiently when the modular value is a prime integer and the key size is as large as possible, and note that the complexity of the Hill cipher model increases as the key matrix size increases [29].

In August 2006, Ismail et al. tried to repair the Hill cipher by adjusting the encryption key matrix to form a different key for each encryption block. The adjustment modifies each row of the key matrix by multiplying the current key by an initial vector 𝑰 ∗ 𝑽; as a result, the resistance to several attacks, such as the known plaintext attack, increases significantly [22].

In the same year, Bibhudendra Acharya and others focused on the main problem behind the weaknesses and disadvantages of the Hill cipher, namely that the inverse of the matrix does not always exist. In their paper "Novel Methods of Generating Self-Invertible Matrix for Hill Cipher Algorithm" they present a new way to generate self-invertible matrices, that is, matrices with 𝒌 = 𝒌−𝟏. This solves the non-invertible matrix problem and eliminates the computational time needed to calculate the inverse of the matrix in the decryption process, but it reduces the key space of the model dramatically [30].

In 2007, Charlie Obimbo and Behzad Salami introduced the idea of a parallel algorithm to determine the inverse of the key matrix in the Hill cipher; they show that with their system the time required for the encryption and decryption processes decreases significantly [31].

In the same year, Sastry and Ravi Shankar modified the Hill cipher algorithm by interlacing and iteration. In this method the authors choose a large key matrix of dimension 𝒏 × 𝒏 and a plaintext containing n rows and two columns; at each stage of the iteration, the plaintext vectors operated on by the key matrix are thoroughly interlaced. Their algorithm works at the bit level [32].

In late 2007, Andru Putra Twinanda used the Romantic Tantalizer method to find the inverse of the key matrix at the decryption side; the improvement that Twinanda adds is in the decryption time. The main idea of his work is to translate the original message into a matrix instead of a vector, but he mentions that the plaintext matrix must not be larger than 𝟒 × 𝟒, since the computation time increases significantly when the dimension exceeds 𝟒 × 𝟒 [33].

Jyotirmayee Majhi, from the Electronics and Communication Engineering Department at the National Institute of Technology, Rourkela, wrote the thesis "Modified Hill-Cipher And CRT Methods In Galois Field GF(2^M) For Cryptography", which uses the method of "Novel Methods of Generating Self-Invertible Matrix for Hill Cipher Algorithm" and combines it with the Chinese Remainder Theorem to implement the system [34].

In 2008, Saroj Kumar Panigrahy et al. used the principle of the self-invertible matrix of the Hill cipher algorithm for image encryption. They point out and solve one of the drawbacks of the Hill cipher algorithm, namely that it encrypts identical plaintext blocks into identical cipher text blocks and therefore cannot properly encrypt images that contain large areas of a single color [35].

Also in the same year, Bibhudendra Acharya returned to the Hill cipher, this time using the previous method of generating a self-invertible matrix that can be used in both the encryption and decryption processes, and adjusting the encryption key to generate a new key matrix for each block of the message. This adjustment significantly increases the security level, since the key changes with every encrypted plaintext block, but all generated matrices must be self-invertible [36].

In the same year, Y. Ranger Romero and others proved that the modified Hill cipher presented by Ismail and others in their paper "How to repair the Hill cipher" still suffers from the known plaintext attack, and that its security level is the same as that of the original Hill cipher [18].

Three months after the comments of Y. Ranger Romero on "How to repair the Hill cipher", Cheng-qing and others presented a paper with five comments on the Ismail paper: first, a number of secret keys are invalid in the Ismail system, which they prove; second, the system is insensitive to changes in the secret key; third, the system is also insensitive to changes in the plaintext image; fourth, the system can be broken under a known plaintext or chosen plaintext attack; and finally, the system includes many minor defects that make it unusable [37].

In the same year, Sastry and Janaki presented a new paper, "A Modified Hill Cipher with Multiple Keys". The new method generates multiple secret keys by mixing them with bits of the plaintext at the encryption side; these keys are used at the encryption side and then transferred so that their inverses can be used at the decryption side. The paper does not deal with the non-invertible matrix problem, only with the known plaintext attack [38].

In the same year, Bibhudendra Acharya returned to the Hill cipher again, applying his comments on the previous two papers to the encryption of images instead of plaintext. He uses the novel method that randomly generates self-invertible matrices; these keys are used in the Hill cipher to encrypt images with a higher security level and better quality compared with the original Hill cipher [39].

In the same year and the same journal, but at a different conference, Bao Ngoc Tran and Thuc Dinh Nguyen relied on the Hill cipher algorithm for encryption, since matrix cryptosystems are resistant to frequency analysis. They developed a new approach to generate modular non-singular key matrices quickly and used this approach in an authentication protocol [40].

Yellapu Naveen Kumar and Bikkina Narendra, at the National Institute of Technology, Rourkela, wrote a thesis in 2008 on the Hill cipher using the technique presented in the paper "Novel Methods of Generating Self-Invertible Matrix for Hill Cipher Algorithm", with some changes that generate self-repetitive matrices instead of self-invertible matrices. They simulated matrices of size 𝟑 × 𝟑; their thesis focuses on how to generate eligible matrices for use in the Hill cipher [41].

In 2009, Saroj Kumar Panigrahy and others wrote a paper in the biometrics area that applies an advanced Hill cipher algorithm to hiding information in images; the advanced Hill cipher algorithm in this paper depends on Saeednia's technique [42].

In the same year, Ahmed S. Hadi and Ali H. Mahdi presented the idea of combining error-free transmission and encryption in the same system. They built a mathematical model that uses the Hill cipher algorithm in a different way: the plaintext is encoded and encrypted before it is sent, and the encryption process uses both the Hill cipher and permutation, while at the decryption side only the Hill cipher is used [43].

Ramchandra S. Mangrulkar and Pallavi V. Chavan, in May 2009 in the International Journal of Recent Trends in Engineering, wrote a paper that implements the Hill cipher algorithm to hide the plaintext behind a cover image at the encryption side; the decryption side then decrypts the received cover image to retrieve the original plaintext hidden behind it [44].

Ahmed Y. Mahmoud and Alexander G. Chefranov, at the International Conference on Security of Information and Networks in 2009, published a paper that analyzes a new modification of the Hill cipher algorithm, which generates dynamic encryption and decryption key matrices by exponentiation, performed efficiently with the help of eigenvalues. The security of their system is improved by the use of a large number of dynamic keys [45].

Chapter Three

Modified Hill Cipher

Contents
3.1 Introduction To MRHC
3.2 Secure Hash Algorithm-512
3.3 MRHC Techniques

3.1 Introduction To MRHC

In our first two papers, we presented a new technique to overcome the Hill cipher problems by making every key matrix invertible for any modular value and any key matrix data. There are two main methods to make all matrices invertible. The first proposed method converts every character at the encryption side into two characters at the decryption side. At the encryption side the key matrix is used, while at the decryption side the ordinary inverse of the key matrix is used rather than the key matrix itself. This technique places no restriction on the maximum value allowed in the key matrix. The second method does restrict the values of the key matrix, since it converts every two characters at the encryption side into three characters at the decryption side.

After publishing these papers, we continued working and finally decided to convert everything into numerical data. When the plaintext is received, it is converted into numerical data, these numerical data are converted into other values, and the values are sent as they are, without converting them back to characters. When these values are received at the decryption side, the algorithm performs all the needed calculations on them and finally converts the output values into characters to form the original message. This simple modification avoids the time consumed in converting numerical values into characters at the end of encryption and at the start of decryption.

The third technique increases the security of the Hill cipher beyond that of AES for some range, the second technique saves time at both the encryption and decryption sides, and the first technique reduces the size of the data transferred across the network. To explain and support these claims, I will introduce the first, second and third techniques and give many examples; the evidence comes in the next chapter, where all codes are simulated and compared with each other and with other algorithms. The algorithms and flowcharts of the three techniques are relatively similar.

General Idea Of MRHC Three Techniques

First of all, we must explain the mathematical models and functions used in the Mousa-Rushdi Hill cipher (MRHC), which are similar in the three techniques. The first script is the padding code, which makes the plaintext length suitable for the encryption matrix; this code is in the main function that encrypts the message. The main function calls gen_key, which may use SHA-512 to generate 128 different integers, as in the second and third techniques of MRHC, or calls the min_max function, which generates a new key from the old one, as in the first technique; these two functions are used to formulate the key matrix. Next comes a code segment that produces two vectors to be encrypted at the same time, instead of encrypting one vector. A second segment is responsible for formulating the key matrix for encryption, and a small code segment checks whether the determinant of the key matrix is zero in order to convert it into a non-singular matrix. Finally, the code outputs the cipher text to be sent over the network. The following flowchart explains the main steps of the MRHC encryption algorithm.

Flowchart of the MRHC encryption algorithm (recoverable blocks, in order):

START
Receive plaintext, key, modular value: p = plaintext, K = key, N = modular value
s = size of the key matrix, P_l = plaintext length
if (mod(P_l, 2s) == 0): NO / YES
i = 1, g = 1, j = 1, l = 1, m = 1
p1 = p(i:1:i+s-1), p2 = p(i+s:1:i+2s-1), i = i + 2s
if (m + s^2 > 64): YES: m = 1, call SHA-512, l = l + 1; NO: m = m + s^2
Formulate the key matrix
if (det(k) == 0): YES: k = k + eye(s); NO: continue
CT = k*(p1*n + p2), C1 = mod(CT, n), C2 = mod(fix(CT/n), n), C3 = mod(fix(CT/n^2), n), C4 = fix(CT/n^3)
x = j + 4*s - 1, c(j:1:x) = [C1; C2; C3; C4], j = x + 1

if(i