
An Efficient Message Digest Algorithm (MD) for Data Security

Abdul Hamid M. Ragab, Nabil A. Ismail, Senior Member, IEEE, and Osama S. Farag Allah

Abstract--This paper presents a new proposed message digest algorithm (MD). Many of its characteristics (applications domain, performance and implementation structure) are similar to those of the MD4-family of hash functions. The proposed algorithm takes as input a message of arbitrary length and produces as output a 128/160-bit fingerprint or message digest. New features of the proposed algorithm include the heavy use of data-dependent rotations and the inclusion of integer multiplication as an additional primitive operation. These features are expected to provide a high security level together with improved throughput. The proposed algorithm is intended for digital signature applications, where a large file must be compressed in a secure manner before being signed (encrypted) with a private (secret) key under a public-key cryptosystem. The proposed algorithm is designed to be quite fast on 32-bit machines. In addition, it does not require any large substitution tables, so the algorithm can be coded quite compactly. We describe the general characteristics, architecture and implementation, and give a complete specification for MD-160/128. Several test vectors are used to check the validity of the proposed algorithm. We also compare the software performance of several MD4-based algorithms, which is of independent interest. Simulation results show that the throughput of the proposed MD-128 is about 76.4 Mbit/sec, while that of RIPEMD-128 is about 69.8 Mbit/sec.

Index Terms--Cryptographic Hash Functions, Cryptanalysis, Software Performance.

This work was supported by the Department of Computer Sci. & Eng., Faculty of Electronic Engineering, Menouf-32952, Egypt. Abdul Hamid M. Ragab is the head of the Computer Sci. & Eng. Department, Faculty of Electronic Engineering, Menouf-32952, Egypt. Nabil A. Ismail is with the Department of Computer Sci. & Eng., Faculty of Electronic Engineering, Menouf-32952, Egypt (e-mail: nabil_is@hotmail.com). Osama S. Farag Allah is with the Department of Computer Sci. & Eng., Faculty of Electronic Engineering, Menouf-32952, Egypt (e-mail: osami-salah_faragallah@hotmail.com).

IEEE Catalogue No. 01CH37239. 0-7803-7101-1/01/$10.00 © 2001 IEEE.

I. INTRODUCTION

CRYPTOGRAPHIC hash functions are an important tool in cryptography for applications such as digital fingerprinting of messages, message authentication, and key derivation. Hash functions map bitstrings of arbitrary finite length into strings of fixed length. A hash value is generated by a function H of the form h = H(M), where M is a variable-length message and H(M) is the fixed-length hash value. The purpose of a hash function is to produce a fingerprint of a file, message, or other block of data. To be useful for message authentication, a hash function must satisfy the following properties:

1- H can be applied to a block of data of any size.
2- H produces a fixed-length output.

3- H(x) is relatively easy to compute for any given input x.
4- One-way: for any given code h, it is computationally infeasible to find x such that H(x) = h.
5- Weak collision resistance: for any given block x, it is computationally infeasible to find y ≠ x with H(y) = H(x).
6- Strong collision resistance: it is computationally infeasible to find any pair (x, y) such that H(y) = H(x).

The strength of a hash function against brute-force attacks [1] depends solely on the length of the hash code produced by the algorithm. Table I summarizes the level of effort required to mount a birthday or square-root attack (referred to as the strength of the hash code) for different types of hash functions, assuming an m-bit result.

TABLE I. COMPARISON OF HASH-CODE STRENGTH BETWEEN DIFFERENT TYPES OF HASH FUNCTIONS

Type of hash function          Strength of hash function
One-way                        $2^{m}$
Strong collision resistance    $2^{m/2}$
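The work factors in Table I can be checked with a little arithmetic and a small experiment. The sketch below is only illustrative: the function names are hypothetical, and SHA-256 truncated to a few bits (via Python's hashlib) stands in for an m-bit hash, since the proposed algorithm is not reproduced here. It prints the one-way and collision efforts for m = 128 and m = 160, then searches for a collision on a 16-bit truncated hash, which by the square-root bound should typically take on the order of 2^8 trials rather than 2^16.

```python
import hashlib


def brute_force_effort(m: int) -> dict:
    """Approximate brute-force work factors for an m-bit hash code."""
    return {"one_way": 2 ** m, "strong_collision": 2 ** (m // 2)}


def truncated_hash(data: bytes, bits: int) -> int:
    """Stand-in m-bit hash: the top `bits` bits of SHA-256 (an assumption)."""
    digest = hashlib.sha256(data).digest()
    return int.from_bytes(digest, "big") >> (256 - bits)


def find_collision(bits: int):
    """Birthday search: hash counter values until two digests coincide."""
    seen = {}
    counter = 0
    while True:
        msg = counter.to_bytes(8, "big")
        h = truncated_hash(msg, bits)
        if h in seen:
            # Return both colliding messages and the number of trials used.
            return seen[h], msg, counter + 1
        seen[h] = msg
        counter += 1


if __name__ == "__main__":
    for m in (128, 160):
        print(f"m={m}: one-way ~2^{m}, strong collision ~2^{m // 2}")
    x, y, tries = find_collision(16)
    print(f"16-bit collision found after {tries} trials")
```

The dictionary of previously seen digests is what turns the search into a birthday attack: every new digest is compared against all earlier ones at once, so a match is expected after roughly the square root of the number of possible digests.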

Almost all hash functions are iterative processes, which hash inputs of arbitrary length by processing successive fixed-size blocks of the input. The input X is padded to a multiple of the block length and subsequently divided into t blocks $x_1$ through $x_t$. The overall structure of a typical secure hash function is indicated in Fig. 1. The hash function h can then be described as follows:

$$H_0 = \text{initial } n\text{-bit value}, \qquad H_i = f(H_{i-1}, x_i), \quad 1 \le i \le t, \qquad h(X) = H_t$$

Fig. 1. General structure of secure hash function. (H = chaining variable; x_i = ith input block; f = compression function; t = number of input blocks; n = length of hash value; b = length of input block.)

The first constructions for hash functions were based on block ciphers (such as DES) [1]. Although some trust has been built up in the security of these proposals, their software performance is not very good, since they are typically 2 to 4 times slower than the corresponding block cipher. The most popular hash functions, which are currently used in a wide variety of applications, are the custom-designed hash functions from the MD4-family. MD4 was proposed by R. Rivest [4,5]. It is a very fast hash function tuned towards 32-bit processors. Because of unexpected vulnerabilities identified in [6,7] (namely collisions for two rounds out of three), a strengthened version of MD4 was designed, which is called MD5 [8,9]. MD5 is slightly slower than MD4, but it is more conservative in design. It was quickly adopted in products. MD5 is probably the most widely used hash function, in spite of the fact that the compression function of MD5 is not collision resistant [10]. This does not pose a threat for standard applications of MD5, but still implies a violation of one of the design principles.

A second alternative to MD5 is the Secure Hash Algorithm (SHA-1), which was designed by NSA and published by NIST [11]. The two main improvements are the increased size of the result (160 bits compared to 128 bits for the other schemes), and the fact that the message words in the different rounds are not permuted but computed as the sum of previous message words. The main consequence is that it is much harder to make local changes confined to a few bits. The first version of SHA had a weaker form of this property (no mixing was done between bits at different positions in a word), and apparently this can be exploited to produce collisions faster than $2^{80}$ operations. However, no details have been made available. This weakness was removed in the improved version [11].

Another proposal was RIPEMD, which was developed in the framework of the EU project RIPE (RACE Integrity Primitives Evaluation). The RIPE consortium had the goal of proposing a portfolio of recommended integrity primitives [12,22]. Based on its independent evaluation of MD4 and MD5 [6,10], it proposed a strengthened version of MD4, which was called RIPEMD. RIPEMD consists of essentially two parallel versions of MD4, with some improvements to the shifts and the order of the message words; the two parallel instances differ only in the round constants. At the end of the compression function, the words of the left and right halves are added.

II.

Algorithm MD4 MD5

RIPEMD RIPEMD-128 RIPEMD-160

step function A = (A + f(B,C,D) + x_i + k)
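To make the step-function form above and the iteration H_i = f(H_{i-1}, x_i) from the Introduction concrete, the following is a minimal toy sketch, not MD4 itself and not the proposed MD algorithm. Several details are assumptions added for illustration: the left rotation after the addition and the Boolean function f follow MD4 round 1 as specified in RFC 1320, the initial values are MD4's, the additive constant K and the shift amounts are arbitrary illustrative choices, and a single 16-step round is used per block. All function names are hypothetical.

```python
import struct

MASK = 0xFFFFFFFF
IV = (0x67452301, 0xEFCDAB89, 0x98BADCFE, 0x10325476)  # MD4 initial values
K = 0x5A827999           # illustrative additive constant (assumption)
SHIFTS = (3, 7, 11, 19)  # illustrative rotation amounts (assumption)


def rotl(x: int, s: int) -> int:
    """Rotate a 32-bit word left by s bits (0 < s < 32)."""
    return ((x << s) | (x >> (32 - s))) & MASK


def f(b: int, c: int, d: int) -> int:
    """Bitwise selection: where b has a 1 bit take c, otherwise take d."""
    return ((b & c) | (~b & d)) & MASK


def step(a, b, c, d, x, k, s):
    """One step: A = (A + f(B,C,D) + x_i + k), then rotate left by s."""
    return rotl((a + f(b, c, d) + x + k) & MASK, s)


def pad(msg: bytes) -> bytes:
    """MD-style padding: 0x80, zeros, then the 64-bit message bit length."""
    bit_len = (8 * len(msg)) & 0xFFFFFFFFFFFFFFFF
    padded = msg + b"\x80"
    padded += b"\x00" * ((56 - len(padded)) % 64)
    return padded + struct.pack("<Q", bit_len)


def compress(state, block: bytes):
    """Toy compression function f(H_{i-1}, x_i) over one 64-byte block."""
    words = struct.unpack("<16I", block)
    a, b, c, d = state
    for i, x in enumerate(words):
        a = step(a, b, c, d, x, K, SHIFTS[i % 4])
        a, b, c, d = d, a, b, c  # rotate register roles for the next step
    # Feed-forward: add the previous chaining value word by word.
    return tuple((h + v) & MASK for h, v in zip(state, (a, b, c, d)))


def toy_hash(msg: bytes) -> bytes:
    """Iterate the compression function over all padded blocks."""
    state = IV
    data = pad(msg)
    for off in range(0, len(data), 64):
        state = compress(state, data[off:off + 64])
    return struct.pack("<4I", *state)


if __name__ == "__main__":
    print(toy_hash(b"abc").hex())
```

The output is always 16 bytes regardless of input length, which mirrors properties 1 and 2 of the Introduction. The sketch makes no security claim: a single round with one constant is far weaker than any of the algorithms discussed.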