On DPA-Resistive Implementation of FSR-based ...

1 downloads 0 Views 842KB Size Report
Trivium and Garin stream ciphers, using the WDDL styles. These ciphers are designed by two submicron technologies and simulated by HSpice software.
International Review on Communications Antenna and Propagation (I.R.E.C.A.P.), Vol. 1, n. 1

On DPA-Resistive Implementation of FSR-based Stream Ciphers using WDDL Logic Styles R.Ebrahimi Atani1,*, S.Ebrahimi Atani2 Abstract – Cryptographic embedded systems are vulnerable to Differential Power Analysis (DPA) attacks. In this paper, we will implement two of the focus eSTREAM candidates, the Trivium and Garin stream ciphers, using the WDDL styles. These ciphers are designed by two submicron technologies and simulated by HSpice software. The paper presents the trade-offs involved in designing the architecture, and design for performance issues. Copyright © 2010 Praise worthy Prize S.r.l. - All rights reserved.

Keywords: DPA attack, Stream cipher, eSTREAM, WDDL, CMOS,Trivium, Grain.

I.

Introduction

Side Channel Attacks (SCA) are the strongest types of implementation attacks. They require only a passive attacker; one that monitors inputs, outputs, and sidechannels of a device executing a cryptographic algorithm. Of all side-channel attacks, differential power analysis (DPA) is a SCA of particular concern as it is very effective in finding a secret key. In 1998, Kocher et al. reported that the power consumption of a smart card could reveal the secret key of the cryptographic algorithm (first DPA attack) [1]. The attack is based on the fact that logic operations in standard static CMOS have power characteristics that depend on the input data. Power is only drawn from the power supply when a 0 ® 1 output transition occurs. (During 0 ® 0 and 1 ® 1 transitions, no power is drawn). During a 1 ® 0 transition, the stored capacitance is discharged to ground). Therefore, by measuring the power supply of an IC as it encrypts, and then performing statistical analysis of the measured power traces, the secret key can be determined. DPA is a well-known and thoroughly studied threat for implementations of block ciphers (DES and AES), public key algorithms (RSA) and stream ciphers (Grain and Trivium [2]). Recently by the use of Power Analysis in the Real World a Complete Break of the KeeLoq Code Hopping Scheme has also come true [21]. Stream ciphers as part of the symmetric key cryptography family, have always had the reputation of efficiency in hardware and speed. They have attracted much attention since the beginning of the eSTREAM project in 2004. Although there is vast literature about DPA of block ciphers and public key algorithms, only few publications can be found about DPA attacks and its countermeasures on stream ciphers. Stream ciphers require frequent synchronization to prevent synchronization loss between sender and receiver

Manuscript received November 2010, revised January 2011

[3]. Normally the initialization will be done with the same secret key and with a different initial value IV. So an attacker can disrupt the synchronization and apply a new known IV and measure the power traces in the initialization phase to apply a DPA on the embedded system of the stream cipher [4]. In recent years, some countermeasures against DPA at the logic level have been proposed. Logic level countermeasures are applied to the basic components of hardware and aim to cut off DPA leakage at its source. The main goal of this article is to investigate the applicability of Wave Dynamic Differential Logic (WDDL) [5] in the implementation of Grain v.1and Trivium stream ciphers. Then an efficient and secure implementation of these Stream Ciphers using WDDL is presented. To examine the applicability of WDDL styles, we will observe circuit simulations (using HSpice) on transistor level implementation of Trivium stream ciphers. WDDL provides a constant power dissipating logic i.e. in one clock cycle the power consumption of each individual logic gate is constant and independent of its input signals. In order to measure the effect of minimum feature size in DPA resistivity, two different sub-micron technologies are used for simulations. In [10], some ideas about DPA-resistive implementation of FSR-based Stream Ciphers using Sense Amplifier Based Logic (SABL) is presented. In [6], [10], and [16] an efficient SABL-based circuit for Trivium [8], Grain v.1 [7] and Grain-128 [31] stream ciphers are presented. Although the proposed architectures show good security features but the security is highly dependent to full-custom design flow and it is relatively expensive. Since WDDL styles need a semicustom design flow we expect to achieve cheaper design costs with good DPA resistivity. This paper is structured as follows. Section II describes the basic concepts of differential power analysis of stream ciphers. Then a brief overview of

Copyright © 2011 Praise Worthy Prize S.r.l. - All rights reserved

R. E. Atani, S. E. Atani

WDDL logic styles is presented in section III. In section IV the structure of two well known H/W based estream candidates are explained. Design and simulation issues are described in section V and finally, conclusions are drawn in Section VI.

II.

DIFFERENTIAL POWER ANALYSIS OF STREAM CIPHERS

DPA is based on the fact that CMOS logic and application specific details cause logic operations to have power characteristics that depend on the input data. It relies further on statistical analysis and error correction to extract the information from the power consumption that is correlated to the secret key [1]. In a DPA a hypothetical model of the device under attack is used to predict the power consumption. The classical setup for a DPA on stream ciphers is illustrated in Fig. 1. Output power traces are determined by the input data, initial value (IV), private key, and output of the device and by many other parameters. An attacker to some extent has the potential knowledge of some of them (e.g. IV, input data and output data) while others are unknown. Regarding the DPA attack, multiple measurements of the power consumption of a cryptographic device are made. For each measurement, different chosen IV’s are sent to the device. Since the cryptographic algorithm is known, a hypothesis on intermediate values can be used to calculate the targeted data values based on the random input values. If the correct hypothesis is used, the targeted data values are calculated correctly for all measurements. According to (1), the total power consumption of an embedded device depends on 3 factors: PTotal = PCons. + PNoise + PDD

(1)

With the help of statistical methods (calculation of correlations, mean values, etc.), the randomness of the data values that are not targeted ( PCons. : leakage currents and data independent power consumption and PNoise : which comes from electrical noise) is exploited to reduce their effects on the power consumption traces. PDD is the data dependent power consumption and is targeted in statistical analysis. After all, the result of the statistical operation indicates which key hypothesis is correct. Normally, a hamming distance power model is used to map the transitions that occur at the outputs of cells of a netlist to power consumption values. In CMOS gates, it is reasonable to assume that the main component of the data dependent power consumption is the dynamic power consumption which is the power dissipation of charging and discharging of output capacitance nodes ( P0®1 or P1®0 ). In a CMOS gate, we can express dynamic power consumption by: 2 PDynamic = N × CL × f × VDD

Manuscript received November 2010, revised January 2011

(2)

Fig.1. Differential power analysis model of stream ciphers.

Where C L , the gate load capacitance and N is the probability of 0 ® 1 or 1 ® 0 output transition and f is the clock frequency. This equation shows that the power consumption of CMOS circuits is data dependent. Note that N is the most important factor in the hypothetical model [15]. There are different techniques for calculation of it. For example, a variable gate delay model can be used for measuring the number of transitions and glitches of a circuit. This technique can be easily applied to circuits by using a VHDL simulator in Register Transfer Level.

III. WDDL Logic Styles Tiri et.al. proposed WDDL applying Differential Cascode Voltage Switch Logic (DCVSL) as a countermeasure against DPA [5]. WDDL is part of the dynamic differential logic, also known as dual-rail with precharge logic, which has a single charging event per cycle. All WDDL gates use only NAND and NOR static CMOS gates and inversion comes for free. Fig. 2 shows WDDL logic. A WDDL gate consists of a parallel combination of two positive complementary gates. In the precharge phase, both true and false inputs are set to 0. This puts the output of the gate at 0. This 0 precharge value travels as the input to the next gate, creating a precharge wave. In the evaluation phase, each input signal is differential and the WDDL gate calculates a differential output. In fact, precharge signal controls the precharge phase to transmit (0, 0) and the evaluation phase to transmit (0, 1) or (1, 0). The precharge operation is performed at the first step in combinational circuit and, the components to be used are limited to AND, OR, and NOT operations. The number of transitions in all circuits generated during an operation cycle is constant without depending on the values of input signals. But power consumption in the CMOS circuits is generally proportional to the number of transitions at the gates. Therefore, the WDDL circuits are effective as a countermeasure against DPA since the power consumption may become constant without depending on the values of input signals as described in the feature above. Precharge wave plays a big role in DPA resistivity of WDDL. There are two ways to integrate the precharge logic into the overall circuit. The first method is to insert Copyright © 2011 Praise Worthy Prize S.r.l. - All rights reserved

R. E. Atani, S. E. Atani

a precharge operator at the start of every combinatorial logic tree [5]. So the clock net will have to be routed to every gate which increases the area and cost of design. The second method is to have precharge signal only at primary inputs (circuit inputs and register outputs). In this way, it is sufficient to solely precharge the input signals of the encryption module. Once the precharged signals have propagated, the encryption module is in stable operation mode. From then on, the registers launch the precharge wave. They store the precharged 0's, sampled at the end of the preceding precharge phase, during the evaluation phase. The input signals and all the combinatorial logic concurrently interleave precharge mode and evaluation mode. This method will lead to a much smaller area overhead. In this work the second method is preferred despite the double clock frequency for the same data rate. WDDL gates can be freely interconnected and special design rules, like NP-rules or domino logic rules, used to cascade conventional dynamic gates are unnecessary. Besides, every compound standard cell has only 1 switching event per cycle.

In [10], [12] two main factors of the DPA leakage in WDDL were reported: (1) Leakage caused by the difference of loading capacitance between two complementary logic gates in WDDL gate, (2) Leakage caused by the difference of delay time between the input signals of WDDL gates. But Tiri et al. and Guilley et al. proposed "Fat Wire" [10] and "Backend Duplication" [11], [18], [19] respectively as a countermeasure in the place and route to improve the DPA resistance. Besides all this, in [8] it has been shown that a reported flaw against WDDL [9] cannot be exploited. Figure 2 shows the timing diagram of WDDL MasterSlave Dynamic and Differential Logic (DDL) register (consist of four CMOS flip flops Fig. 1). The simulations are done with Hspice using typical BSIM3 0.13 mm 1.2V CMOS SOI technology. And finally, because of the architecture of WDDL gates, there will not be any possible glitches in the design. Also the switching factor of WDLL is 100% [5].

Fig.2. WDDL Logic style

IV.

H/W Based Stream Ciphers

In the design of a cryptographic primitive there are many different properties that have to be addressed. The most important ones are speed, security and simplicity. Many stream ciphers are based on Linear Feedback Shift Registers (LFSR). The main reason is not only for the good statistical properties of the sequences they produce, but also for the simplicity and speed of their hardware implementation. In this section the structure of two well known FSR-based stream ciphers are presented. They are: two eSTREAM [30] finalist: Grain v.1 [7] and Trivium [8] stream ciphers.

V.

eSTREAM Project

eSTREAM, as part of the EU-funded network of excellence ECRYPT [29], was initiated in 2004 to design and analyze new proposals of stream ciphers suitable for widespread adoption. The project was a successor of the Manuscript received November 2010, revised January 2011

NESSIE project [32] initiated in 2000, where no stream cipher was elected for the final portfolio. A call for primitives was put forward in the fall of 2004, and attracted 34 submissions from all over the world. The candidate algorithms were divided in two categories: software-based algorithms, and hardwarebased algorithms. The design goals for software profile should offer a key size of at least 128 bits, and provide some significant advantage over AES Counter Mode with respect to throughput. On the other hand, Candidates for the hardware profile should outperform AES in restricted environments with respect to relevant parameters such as gate count, power consumption and speed, and provide a key size of at least 80 bits. In April 15 2008 and after three evaluation phases the eSTREAM competition ended and the final portfolio [18] contained 8 promising ciphers: 4 in each profile. Copyright © 2011 Praise Worthy Prize S.r.l. - All rights reserved

R. E. Atani, S. E. Atani

One of the H/W profile finalists, F-FCSR [17], was removed from the portfolio in September 2008 after new cryptanalytic results were found. Thus, according to the final modification in November 2008, the three finalists in H/W profile are Grain v.1 [7], Mickey v.2 [23], and Trivium [8]. The final portfolio of eSTREAM projects for S/W and H/W profiles are listed in Table I and Table II respectively. Since this paper is on DPA-resistive implementation of H/W-based Stream Ciphers by the use WDDL logic styles, we briefly review the structures of the eSTREAM finalist for H/W profile. TABLE I FINAL ESTREAM CANDIDATES - SOFTWARE PROFILE (NOVEMBER 2008) Stream cipher Key size HC-128 [33] 128-bit Rabbit [28]

128-bit

Salsa20/12 [27]

128-256 bit

SOSEMANUK [25]

128-256 bit

g : bi +80 = si Å bi Å bi +9 Å bi +14 Å bi + 21 Å bi + 28 Å bi +33 Å bi +37 Å bi + 45 Å bi +52 Å bi +60 Å bi +62 Å bi + 63 · bi + 60 Å bi +37 · bi +33 Å bi +15 · bi +9 Å bi +60 · bi +52 · bi + 45 Å bi +33 · bi + 28 · bi + 21 Å bi + 63 · bi + 45 · bi + 28 · bi +9 Å bi +60 · bi +52 · bi +37 · bi +33 Å bi +63 · bi +60 · bi + 21 · bi +15 Å bi +63 · bi +60 · bi +52 · bi + 45 · bi +37 Å bi +33 · bi + 28 · bi + 21 · bi +15 · bi +9 Å bi +52 · bi + 45 · bi +37 · bi +33 · bi + 28 · bi + 21

The output function h(x) uses as input selected bits from both feedback shift registers: h( x) = x1 Å x4 Å x0 · x3 Å x2 · x3 Å x3 · x4 Å x0 · x1 · x2 Å x0 · x2 · x3 Å x0 · x2 · x4 Å x1 · x2 · x4 Å x2 · x3 · x4

(4)

Where the variables x0 , x1 , x2 , x3 , and x4 corresponds to the tap positions si + 3 , si + 25 , si + 46 , si + 64 , and bi + 63 respectively.

TABLE II FINAL ESTREAM CANDIDATES - HARDWARE PROFILE (NOVEMBER 2008)

Stream cipher Trivium [8]

Key size 80-bit

Grain v1 [7]

80-bit

MICKEY v2 [23]

80-bit

V.1.

Fig.3. The key initialization of Grain stream cipher

GRAIN V.1 STREAM CIPHER

Grain v.1 [7] is a stream cipher introduced in 2005 as a candidate for the hardware profile of eSTREAM project. Grain v.1 is a binary additive synchronous stream cipher with an internal state of 160 bits si , si +1 ,K, si + 79 and bi , bi +1 ,K , bi + 79 residing in a linear feedback shift register (LFSR) and a nonlinear feedback shift register (NLFSR), respectively. The design of the algorithm mainly targets hardware environments where gate count, power consumption and memory are very limited. The key size of Grain is 80 bits (ki : 0 £ i £ 79) . Additionally, an initial value of 64 bits ( IVi : 0 £ i £ 63) is required. In initialization phase, all 80 NLFSR elements are loaded with the key bits, (bi = ki : 0 £ i £ 79) , then the first 64 LFSR elements are loaded with the IV bits, ( si = IVi : 0 £ i £ 63) . The last 16 bits of the LFSR are filled with ones. f(x) and g(x) are two polynomials used as feedback function for the LFSR and NLFSR. f : si +80 = si + 62 Å si +51 Å si + 38 Å si + 23 Å si +13 Å si

Manuscript received November 2010, revised January 2011

(3)

Fig.4. Grain stream cipher

The output of the filter function is masked with the some state bits from the NFSR to produce the key stream z i : z i = bi +1 Å bi + 2 Å bi + 4 Å bi +10 Å bi + 31 Å bi + 43 Å bi + 56 Å h( si + 3 , si + 25 , si + 46 , si + 64 , bi + 63 )

(5)

This output is used during the initialization phase as additional feedback to LFSR and NLFSR. During normal operation this value is used as key stream output. The generated output bits per clock cycle are called Radix. By implementing the small feedback functions, f (x ) and g (x) , and the output function several times the speed of Grain can easily reach up to Radix-32. Copyright © 2011 Praise Worthy Prize S.r.l. - All rights reserved

R. E. Atani, S. E. Atani

V.2.

TRIVIUM STREAM CIPHER

Trivium [8] is a stream cipher introduced in 2005 as a candidate for the hardware profile of the eSTREAM project. Trivium has an internal state of 288 bits ai , ai +1 ,K, ai +92 , bi , bi +1 ,K, bi +83 and ci , ci +1 , K , ci +110 residing in three coupled NLFSRs A, B, and C of 93, 84, and 111 bits respectively. Trivium has a key k = (k 0 ,K, k79 ) of 80 bits as well as an initial value IV = ( IV0 ,K, IV79 ) of 80 bits. The initialization of the key

and IV is done as follows: ì (a0 , K, a92 ) = (0, K,0, k 79 ,K, k 0 ) ï í(b0 ,K, b83 ) = (0,0,0,0, IV79 ,K, IV0 ) ï (c ,K, c ) = (1,1,1,0,0, K,0,0) 0 110 î

(6)

Then, the state is updated over 4 full cycles, according to (2), but without generating key stream bits. After 1152 clocking it outputs a key stream bit

zi (3):

ìai +93 = ai + 24 Å ci Å (ci +1 · ci + 2 ) Å ci + 45 ï í bi +84 = bi + 6 Å ai Å (ai +1 · ai + 2 ) Å ai + 27 ïc î i +111 = ci + 24 Å bi Å (bi +1 · bi + 2 ) Å bi +15

(7)

zi = ai Å bi Å ci Å ai + 27 Å bi +15 Å ci + 45 (8) Trivium has a very simple structure that is well suited for different Radix (The generated output bits per clock cycle) implementations from Radix-1 to Radix-64 without noticeable hardware penalties. The basic structure of the Trivium stream ciphers is shown in Fig. 5.

Fig.5. Trivium stream cipher

VI.

Circuit Design and Simulation Results

Since WDDL needs a Semi-Custom design flow it can also be used in FPGA and PLA based implementations [12, 21]. In order to have accurate observation of the power consumption, the whole cipher, including the control units for initialization phase, is modeled at transistor level using a spice netlist. In order to specify the impact of transistor sizing on the design, both ciphers Manuscript received November 2010, revised January 2011

are designed using two technologies: typical BSIM3 0.13 mm CMOS SOI technology and typical 0.35 mm CMOS SOI technology. As mentioned in the last section, Trivium has three NLFSRs of A: 93 flip-flops, B: 84 flipflops, and C: 111 flip-flops. The whole initialization needs 1152 cycles. Spice simulations were run to test the circuit by test vectors provided by the inventors of Trivium using Hspice circuit simulator and C compiler. First a new standard gate library based on WDDL logic is designed. Minimum possible sized transistors are used to lower the total capacitance to get lower dynamic power consumption. In the design of WDDL gates and DDL registers, it is important that the compound standard cell charge a total capacitance with a constant value. For example, AND gate and the OR gate (which make up a compound standard cell), are not identical [9]. The internal and the intrinsic capacitances cannot be identical between both CMOS gates. Yet, with shrinking channel length of the transistors, the interconnect capacitance becomes more and more the dominant capacitance. This makes it appropriate to primarily concentrate on the interconnect capacitances. Under the assumption that the differential signals travel in the same environment, the interconnect capacitances are equivalent [9, 11]. The first clock cycle is spent in evaluation phase, the second in pre-charge phase. Note that the precharge phase is fundamental to any dynamic logic, to achieve a switching factor of 100%. In order to monitor all current variations, one sample has been taken every 50ps. For each cipher, simulations were run for four different 80-bit keys and IV pairs. The selected key and IV pairs are shown in Table I. Table II and Table III show the summary of final statistical power analysis results. All power simulations are observed by 5 MHz clock signal. In order to compare WDDL with SABL and CMOS implementations [6], [9], the average power consumption per cycle was extracted by averaging the power consumption on 100 consecutive clock cycles. Then, the Mean Power Consumption (MPC), the Power Consumption Standard Deviation (PCSD), the Normalized Energy Deviation (NED) (9) and Normalized Standard Deviation (NSD) were extracted for each simulated logic style (10) [34]. For example for Trivium implementation and in 0.13 mm technology and for the key and IV pair of (K1, IV1):

PCSDWDDL = 0.094 PCSDS -CMOS PCSDWDDL = 3.485 PCSDSABL

(11)

Copyright © 2011 Praise Worthy Prize S.r.l. - All rights reserved

R. E. Atani, S. E. Atani

TABLE I DIFFERENT 80-BIT HEXADECIMAL KEY AND IVS USED FOR THE INITIALIZATION OF TRIVIUM IN SIMULATIONS 80-bit Key and IV K1 AAA…A

NED =

K2

80000..0

IV1

64-bit IV for Grain V.1 555..5

IV2

FFF…F

IV3

000…0

IV4 IV5

1111…1 80-bit IV for Trivium 555..5

IV6

FFF…F

IV7

000…0

IV8

1111…1

max( Energy / Cycle) - min( Energy / Cycle) max( Energy / Cycle) NSD =

PCSD MPC

(9) (10)

This shows that the power consumption fluctuation of WDDL implementation is nearly 10 % of standard CMOS. This is a major improvement but still PCSDWDDL ¹ 0 and very small current variations are detectable. On the other hand WDDL cannot damp current fluctuations as much as SABL. This is true but we should take into account that process of the design of SABL is full custom which is expensive. WDDL is designed in RTL level and can be implemented on FPGAs which make it a cheap countermeasure against DPA with reasonable DPA resistivity. DPA resistivity factor is calculated in (12): 1 DPA Resistancy µ (12) PCSD Increasing the security is never free. The tradeoff is an increase in area, time and power. One of the fundamental parameters of a cryptographic algorithm is the amount of data it can process within a given period. The total throughput of the algorithm is expressed as Mbits / s and can be calculated from where f is the clock frequency of the design (e.g. 5 MHz). Trivium throughput rate for SABL and Standard CMOS designs are equal [9], but for WDDL is decreased with a factor of two (to prepare the precharge wave).: Power = Current ´ Constant Supply Voltage

(13)

T = f ´ Radix (14) Where, A , P and DR corresponds to transistor cost (Area), power consumption, and DPA resistancy respectively. Normally for an ASIC, the lower bound on the area increase of WDDL implementation is a factor of 2 (the number of gates in a WDDL gate) and the higher Manuscript received November 2010, revised January 2011

bound is 4 (WDDL Flip-flops use 4 S-CMOS Flip-flops). But as mentioned in the section 3, Trivium consists of 288 Flip-flops and a couple of AND2 and Xor2 gates. So the required area is approximately 4 times bigger than Standard CMOS. The overall comparison between WDDL, SABL and standard CMOS design for 4 different key and IV choices (Table II and III) are shown in Table IV. At the end of the simulations and data analysis we exhibited that WDDL logic styles allow to significantly decrease the supply current variations of both Trivium and Grain circuits without a major increase on the process of design compared to SABL styles. But still in both designs (WDDL and SABL) very small current variations are detectable. As a disadvantage this could be a start of a DPA attack since the predictability of the energy variations is more critical than their amplitude. Since DPA efficiency depends on the possibility to predict the power consumption of a device in function of its input data and the value of the correlation coefficient, the attack is still theoretically feasible against WDDL circuits. And finally the overall comparison for WDDL, SABL and S-CMOS design of Grain v.1 and Trivium stream ciphers in 0.35μm and 0.13μm CMOS technologies is in Tables IV and V. Stream ciphers always had the reputation of efficiency in hardware. But as mentioned before, area over head for WDDL Flipflops are 4 times higher than standard CMOS so WDDL does not look promising in the design FSR based stream ciphers. On the other hand there are some nice advantages of using WDDL which make it still interesting to be used. Finally the total power consumption is approximately 3.5 times larger than standard CMOS.

VII. Conclusion This paper investigated the use of WDDL logic to counteract power analysis attacks on stream ciphers. In particular, an efficient DPA resistive circuit for Grain v.1 and Trivium stream ciphers are designed and compared with their standard CMOS and SABL implementations. Experimental results have demonstrated that WDDL is an effective technique to achieve an important reduction in the power variation for stream ciphers. WDDL can simply be used in RTL level design flow and can reduce design costs for secure stream cipher implementations. Specifically, comparison with a standard CMOS design shows WDDL is a good technique for stream cipher implementation. The tradeoff is in an increase in area, time and power consumption. Specially in FSR based stream ciphers like Trivium, since Flip-flops are major building blocks in the cipher, area overhead is ≈4 times higher.

Copyright © 2011 Praise Worthy Prize S.r.l. - All rights reserved

R. E. Atani, S. E. Atani

TABLE II

Statistical power analysis of Standard CMOS, SABL, and WDDL implementation of Grain V.1 stream cipher for different key and IV’s for both 0.35μm and 0.13μm CMOS technology. Stream Cipher

GRAIN V.1

Technology Statistical Power Analysis Features

CMOS, 0.35ΜM

CMOS 0.13ΜM

(VDD = 3.3V, VTN = 0.6V, VTP = −0.85V )

(VDD = 1.2V, VTN = 0.4V, VTP = −0.39V )

PMC [mW ]

PCSD [mW ]

NED

PMC [mW ]

NSD

PCSD [mW ]

NED

NSD

Standard CMOS implementation [9] K1, IV1

421

26.8

0.2784

0.0636

258

22.50

0.7889

0.0872

K1, IV2

402

18.5

0.1905

0.0460

246

14.14

0.8026

0.0545

K2, IV3

397

17.1

0.2187

0.0430

242

12.31

0.8402

0.0509

K2, IV4

415

29.6

0.3882

0.0713

252

21.72

0.7924

0.0862

Sense Amplifier Based Logic (SABL) implementation [9] K1, IV1

616

0.9497

0.0136

0.00154

378

0.7610

0.0124

0.0020

K1, IV2

605

0.9375

0.0111

0.00155

371

0.7424

0.0082

0.0020

K2, IV3

601

0.9318

0.0122

0.00155

369

0.7291

0.0079

0.0019

611

0.9393

0.0120

0.00155

374

0.7482

0.0101

0.0020

K2, IV4

Wave Dynamic Differential Logic (WDDL) implementation (Our Work) K1, IV1

1492

2.86

0.0103

0.00191

951

2.12

0.0121

0.00222

K1, IV2

1424

2.92

0.0110

0.00205

927

1.97

0.0139

0.00213

K2, IV3

1431

2.76

0.0126

0.00193

931

2.05

0.0107

0.00220

K2, IV4

1476

2.80

0.0119

0.00189

949

2.07

0.0134

0.00218

TABLE III

Statistical power analysis of Standard CMOS, SABL, and WDDL implementation of Trivium stream cipher for different key and IV’s for both 0.35μm and 0.13μm CMOS technology. Stream Cipher

TRIVIUM

Technology Statistical Power Analysis Features

CMOS, 0.35ΜM

CMOS 0.13ΜM

(VDD = 3.3V, VTN = 0.6V, VTP = −0.85V )

(VDD = 1.2V, VTN = 0.4V, VTP = −0.39V )

PMC [mW ]

PCSD [mW ]

NED

NSD

PMC [mW ]

PCSD [mW ]

NED

NSD

Standard CMOS implementation [9] K1, IV5

641

49.1

0.3292

0.0766

337

31.20

0.8945

0.0926

K1, IV6

637

19.3

0.2351

0.0303

321

16.12

0.8191

0.0190

K2, IV7

629

23.5

0.3139

0.0374

319

17.14

0.8614

0.0537

635

41.7

0.2842

0.0657

326

29.94

0.9218

0.0918

K2, IV8

Sense Amplifier Based Logic (SABL) implementation [9] K1, IV5

949

1.2632

0.0091

0.00133

545

0.8435

0.0061

0.0015

K1, IV6

940

1.2469

0.0106

0.00132

537

0.7927

0.0054

0.0014

K2, IV7

938

1.2403

0.0089

0.00131

536

0.7831

0.0050

0.0014

943

1.2531

0.0117

0.00133

541

0.8237

0.0059

0.0015

K2, IV8

Wave Dynamic Differential Logic (WDDL) implementation (Our Work) K1, IV5

2323

4.5

0.0112

0.0019

1413

2.94

0.0215

0.0020

K1, IV6

2285

3.9

0.0151

0.0017

1402

3.01

0.0180

0.0021

K2, IV7

2301

4.2

0.0139

0.0018

1382

3.17

0.0224

0.0022

K2, IV8

2378

4.2

0.0142

0.0017

1439

3.23

0.0256

0.0022

Manuscript received November 2010, revised January 2011

Copyright © 2011 Praise Worthy Prize S.r.l. - All rights reserved

R. E. Atani, S. E. Atani

TABLE IV

Overall comparison for WDDL, SABL and S-CMOS design of Grain v.1 stream cipher in 0.35μm and 0.13μm CMOS technologies. (All data are normalized) Logic Style

WDDL

SABL [9]

Standard CMOS [9]

WDDL

SABL [9]

Standard CMOS [9]

CMOS, 0.35ΜM

CMOS 0.13ΜM

(VDD = 3.3V, VTN = 0.6V, VTP = −0.85V )

(VDD = 1.2V, VTN = 0.4V, VTP = −0.39V )

Transistor Cost (A)

1

0.70

0.31

1

0.70

0.31

Power Consumption (P)

1

0.325

0.281

1

0.397

0.266

DPA Resistancy (DR)

0.269

1

0.045

0.276

1

0.043

Throughput Rate (T)

0.5

1

1

0.5

1

1

TABLE V

Overall comparison for WDDL, SABL and S-CMOS design of Trivium stream cipher in 0.35μm and 0.13μm CMOS technologies. (All data are normalized) Logic Style

WDDL

SABL [9]

Standard CMOS [9]

WDDL

SABL [9]

Standard CMOS [9]

CMOS, 0.35ΜM

CMOS 0.13ΜM

(VDD = 3.3V, VTN = 0.6V, VTP = −0.85V )

(VDD = 1.2V, VTN = 0.4V, VTP = −0.39V )

Transistor Cost (A)

1

0.71

0.27

1

0.71

0.27

Power Consumption (P)

1

0.406

0.274

1

0.383

0.231

DPA Resistancy (DR)

0.298

1

0.0375

0.263

1

0.0343

Throughput Rate (T)

0.5

1

1

0.5

1

1

References [1]

P.C. Kocher, J. Jaffe, B. Jun, Differential Power Analysis, Advances in Cryptology - CRYPTO, LNCS Vol. 1666, 1999, pp. 388-397. [2] W. Fischer, B.M. Gammel, O. Kniffler, J. Velton, Differential Power Analysis of Stream Ciphers, Topics in Cryptology - CTRSA, LNCS Vol. 4377, 2007, pp. 257-270. [3] J. Daemen, R. Govaerts, J. Vandewalle, Resynchronization Weaknesses in Synchronous Stream Ciphers, Advances in Cryptology - Eurocrypt, LNCS Vol. 765, 1993, pp. 159-167. [4] J. Lano, N. Mentens, B. Preneel, I. Verbauwhede, Power Analysis of Synchronous Stream Ciphers with Resynchronization Mechanism, SASC'04, 2004, pp. 327-333. [5] K. Tiri, I. Verbauwhede, A Logic Level Design Methodology for a Secure DPA Resistant ASIC or FPGA Implementation, Design, Automation, and Test in Europe conference (DATE), 2004, pp. 246-251. [6] R.E. Atani, S. Mirzakuchaki, S.E. Atani, W. Meier, Design and simulation of a DPA resistive circuit for Trivium stream cipher based on SABL styles, Mixed Design of Integrated Circuits and Systems (MIXDES), 2008, p.p. 203-208. [7] M. Hell, Th. Johansson, W. Meier, Grain - A Stream Cipher for Constrained Environments, eSTREAM project, 2006. [8] C. DeCanniere, B. Preneel, Trivium Specifications, eSTREAM project, 2005. [9] S. Guilley, L. Sauvage, P. Hoogvorst, R. Pacalet, G. M. Bertoni, S. Chaudhuri, Security Evaluation of WDDL and SecLib Countermeasures against Power Attacks, IEEE Tran. Computers, Vol. 57, Issue: 11, 2008, pp. 1482-1497. [10] R.E. Atani, S. Mirzakuchaki, S.E. Atani, W. Meier, On DPAResistive Implementation of FSR-based Stream Ciphers using Manuscript received November 2010, revised January 2011

SABL Logic Styles, International Journal of Computers, Communications & Control (IJCCC), Vol III, No. 4, 2008, pp. 324-335. [11] D. Suzuki, M. Saeki, Security Evaluation of DPA Countermeasures Using Dual-Rail Pre-Charge Logic Style, Workshop on Cryptographic Hardware and Embedded Systems, LNCS Vol. 4249, 2006, pp. 255-269. [12] K. Tiri, I. Verbauwhede, Place and Route for Secure Standard Cell Design, CARDIS, 2004, pp.143-158. [13] S. Guilley, P. Hoogvorst, Y. Mathieu, v Pacalet, The Backend Duplication Method, Workshop on Cryptographic Hardware and Embedded Systems, LNCS Vol. 3659, 2005, pp. 383-397. [14] D. Suzuki, M. Saeki, T. Ichikawa, DPA Leakage Models for CMOS Logic Circuits, Workshop on Cryptographic Hardware and Embedded Systems, LNCS, Vol. 3659, 2005, pp. 366-382. [15] S. Mangard, E. Oswald, T. Popp, Power Analysis Attacks: Revealing the Secrets of Smart Cards, Springer, 2007. [16] R.E. Atani, W. Meier, S. Mirzakuchaki, S.E. Atani, Design and Implementation of DPA Resistive Grain-128 Stream cipher Based on SABL Logic, International Journal of Computers, Communications & Control (IJCCC), Vol. III - Suppl. Issue, 2008, pp. 293-298. [17] F. Arnault, T. Berger and C. Lauradoux, F-FCSR Stream Ciphers, New Stream Cipher Designs, LNCS Vol. 4986, 2008, pp. 170-178. [18] S. Babbage, The eSTREAM Portfolio, April 2008, eSTREAM project website. [19] P. Yu, Implementation of DPA-Resistant Circuit for FPGA, MSc. Thesis, Virginia Tech., 2007. [20] P. Yu, P. Schaumont, Secure FPGA circuits using controlled placement and routing, CODES+ISSS, 2007, pp. 45-50. [21] K. Tiri, I. Verbauwhede, A digital design flow for secure integrated circuits, IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, Vol. 25, no. 7, 2006, pp. 11971208. Copyright © 2011 Praise Worthy Prize S.r.l. - All rights reserved

R. E. Atani, S. E. Atani

[22] T. Eisenbarth, T. Kasper, A. Moradi, C. Paar, M. Salmasizadeh, and M. Manzuri Shalmani. On the Power of Power Analysis in the Real World: A Complete Break of the KeeLoq Code Hopping Scheme, In Advances in Cryptology – CRYPTO, LNCS Vol. 5157, 2008, pp. 203-220. [23] S. Babbage and M. Dodd. The MICKEY stream ciphers, New Stream Cipher Designs, LNCS Vol. 4986, 2008, pp. 191-209. [24] M. Beck and E. Tews. Practical attacks against WEP and WPA, 2008. http://dl.aircrack-ng.org/breakingwepandwpa.pdf. [25] C. Berbain, O. Billet, A. Canteaut, N. Courtois, H. Gilbert, L. Goubin, A. Gouget,L. Granboulan, C. Lauradoux, M. Minier, T. Pornin, and H. Sibert. SOSEMANUK, a fast software-oriented stream cipher, New Stream Cipher Designs, LNCS Vol. 4986, 2008, pp. 98-118. [26] D. Bernstein. Notes on the ECRYPT Stream Cipher project (eSTREAM). http://cr.yp.to/streamciphers.html. [27] D. Bernstein. The Salsa20 family of stream ciphers. New Stream Cipher Designs, LNCS Vol. 4986, 2008, pp. 84-97. [28] M. Boesgaard, M. Vesterager, and E. Zenner. The Rabbit stream cipher. New Stream Cipher Designs, LNCS Vol. 4986, 2008, pp. 69-83. [29] ECRYPT Network of Excellence in Cryptology. http://www.ecrypt.eu.org/. [30] ``eSTREAM - the ECRYPT Stream Cipher Project", 2004 – 2008, the eSTREAM project. http://www.ecrypt.eu.org/stream. [31] M. Hell, T. Johansson, A. Maximov, and W. Meier. The Grain family of stream ciphers, New Stream Cipher Designs, LNCS Vol. 4986, 2008, pp. 179-190. [32] New European Schemes for Signatures, Integrity and Encryption. http://www.cosic.esat.kuleuven.be/nessie/. [33] H. Wu. The stream cipher HC-128, New Stream Cipher Designs, LNCS Vol. 4986, 2008, pp.39-47. [34] F. Mace, F. X. Standaert, I. Hassoune, J.J. Quisquater, and J. Legat, A Dynamic Current Mode Logic to Counteract Power Analysis Attacks, DCIS 2004 - 19th Conference on Design of Circuits and Integrated Systems, 2004, pp. 186 - 191.

Manuscript received November 2010, revised January 2011

Authors’ information *

Reza Ebrahimi Atani (Corresponding author) Department of Computer Engineering The University of Guilan P.O. Box 3756 Rasht, Iran [email protected] 1

Shahabaddin Ebrahimi Atani 2 Department of Mathematics The University of Guilan P.O. Box 1914 Rasht, Iran [email protected]

Reza Ebrahimi Atani was born in 1980. He received the B.S. degree in electrical engineering from the University of Guilan in 2002 and the M.Sc and PhD degrees in electronics from Iran University of Science and Technology in 2004 and 2010 respectively. Since 2010, he is an assistant professor in Computer Engineering Department at the University of Guilan. His current research interests include stream cipher design and cryptanalysis, cryptographic hardware and embedded system (CHES), side channel attacks (Power and fault attacks), and design of VLSI circuits. Shahabaddin Ebrahimi Atani Shahabaddin Ebrahimi Atani received the B.S. degree in Mathematics from the University of Tabriz, in 1975 and the M.Sc degrees in Mathematics from the Institute of Mathematics, Teacher Training University. He received his PhD (1996) in Mathematics from University of Manchester. During 1978 - 1981 he was a lecturer in Arak University. He joined Mathematics department of Guilan University in 1981 as an Assistant Professor. Since 2007, he is professor in that department. His current research interests include Module theory, Modules over pullback rings and Pure-injective modules, Zero-divisor graphs, semiring theory and cryptography.

Copyright © 2011 Praise Worthy Prize S.r.l. - All rights reserved