PHYSICAL UNCLONABLE FUNCTIONS, FPGAS AND ... - CiteSeerX

12 downloads 37 Views 165KB Size Report
sist of many random components that are present in the man- ufacturing process and ..... δH(x, x ) counts the number of functions h ∈ H for which x and x collide.
PHYSICAL UNCLONABLE FUNCTIONS, FPGAS AND PUBLIC-KEY CRYPTO FOR IP PROTECTION Jorge Guajardo, Sandeep S. Kumar, Geert-Jan Schrijen, Pim Tuyls Philips Research Laboratories, Eindhoven, The Netherlands {Jorge.Guajardo,Sandeep.Kumar,Geert.Jan.Schrijen,Pim.Tuyls}@philips.com

ABSTRACT

PROM or flash. When the FPGA is powered up, the bitstream is loaded onto the FPGA and the FPGA is configured. During loading, an attacker can easily tap the bitstream and make a copy of it, which he can then use to (illegally) program other FPGAs without paying the required licensing fees to the IP owner. This attack is called a cloning attack and it is a serious concern to IP developers nowadays. From a security perspective, the counterfeiting threat can be best explained as an authentication problem. In general, we can identify the following security services required by different parties in the overall IP protection chain: Hardware IP authentication (S1): a hardware design runs only on a specific hardware device, hence it can not be cloned. Hardware platform authentication (S2): the hardware platform (FPGA) allows only authentic designs to execute. Complete design confidentiality (S3): the intended design recipient (this could be the system integrator, the end user, etc.) has only access to the design as a black box (input/output behavior). No other party (in addition to the design developer) knows anything about the hardware IP. Secure hardware IP updating (S4): given that there is already an authentic design running on the FPGA, the IP provider would like to update it and at a minimum keep all the security guarantees that the previous design kept. Design traceability (S5): given an IP block, the designer can trace back who the intended recipient of the design was. User privacy (S6): A design should not be linkable to the identity of the end-user Notice that S5 has been considered in other works [19] and that S1 and S2 were originally introduced in [40] as key services in the IP authentication problem. Early solutions [21] to the IP authentication problem (and to this date the most widely available one) are based on bitstream encryption. The key used to encrypt the bitstream is stored in either non-volatile memory added to the FPGA [2] or the FPGA stores a long-term key in a few hundred bits of dedicated RAM backed-up by an externally connected battery [25, 26]. Both solutions come with a price penalty and are

In recent years, IP protection of FPGA hardware designs has become a requirement for many IP vendors. To this end solutions have been proposed based on the idea of bitstream encryption, symmetric-key primitives, and the use of Physical Unclonable Functions (PUFs). In this paper, we propose new protocols for the IP protection problem on FPGAs based on public-key (PK) cryptography, analyze the advantages and costs of such an approach, and describe a PUF intrinsic to current FPGAs based on SRAM properties. We observe that a major advantage in using PK-based protocols is that it allows for an implementation in which the private key stored in the FPGA never has to leave the device, thus increasing security. Finally, notice that this comes at the cost of additional hardware resources but not at significant performance degradation. 1. INTRODUCTION In today’s globalized economy, it has become standard business practice to include third party Intellectual Property (IP) into products. This trend has led to the realization that internally developed IP is of strategic importance, for two reasons: (i) it decreases the design cycle by implementing reuse strategies and (ii) it is a source of additional licensing income from external parties. However, IP creators must face the counterfeiting challenge. For example, it is estimated that as much as 10% of all high tech products sold globally are counterfeit. This translates into a conservative estimate of US$100 billion of global IT industry revenue lost due to counterfeiting [23]. The same paper advises to employ anti-counterfeiting technologies to mitigate the effects of counterfeiters. This paper deals explicitly with one such technology and its implementation on FPGAs. SRAM based FPGAs, which are the majority of the market, offer a very flexible solution for implementation of valuable designs since they can be reprogrammed (configured) in the field. This allows for instance to update current designs with new and improved ones and stands in sharp contrast with implementations on ASICs. FPGA designs are represented as bitstreams and stored in external memory e.g. 1

therefore not very attractive. The second solution has the additional disadvantage that the battery has only a limited life time and that batteries can get damaged which further shortens the design operating life. Both effects have as a consequence that the key and the design are lost after some time, rendering the overall IP block non-functional. CONTRIBUTIONS. In this paper, we will focus on providing services S1, S2 and S3. In particular, we propose new and improved protocols for IP protection on FPGAs. We depart from previous works and study the advantages that asymmetric cryptography provides in this setting. In particular, this allows that secret-key information never has to leave the FPGA in contrast to the proposals in [40, 13]. As usual this comes at the cost of having to implement publikey cryptography on the FPGA, which requires more hardware resources than symmetric-key based constructions. However, we also show that it is possible to obtain a performance comparable to that of symmetric-key cryptography by encrypting and scrambling in the sense of [18]. In addition, we describe two possible implementations of Physical Unclonable Functions (PUFs) one of which does not require modification of the hardware and thus, it is intrinsic to the FPGA. Finally, we show some of the trade-offs that can be made when implementing a fuzzy extractor [10, 29]. NOTATION. We will denote an IP block by SW and use this terminology interchangeably. We write EncK (·) to mean the symmetric encryption of the argument under key K. We assume that Enc(·) provides semantic security against chosen plaintext attacks (written IND-CPA) as the standard CBC mode of operation for symmetric ciphers provides. Similarly, we will write EncKpubI (·) to indicate the PK encryption using I’s public key and SigKpriv (·) to indicate the a I signature on the argument using the private key KprivI of I. Decryption and verification operations are then written DecKprivI (·) and VerKpubI (·), respectively. ORGANIZATION. Section 2 provides an overview of PUFs, security assumptions and their properties. We also define fuzzy extractors as an integral part of our solution. In Section 3, we introduce a protocol that is based on PK cryptography primitives that allows for a solution in which the private key associated with a device does not need to leave it, even during enrollment. This has the additional advantage that our solution does not require fuses, which against advanced attacks do not offer enough guarantees (see e.g. [3]). Section 4 describes several PUF constructions including intrinsic PUFs which are based on the properties of SRAM memories present on FPGAs. We end in Sect. 5 analyzing the cost of the fuzzy extractor implementation.

They inherit their unclonability from the fact that they consist of many random components that are present in the manufacturing process and can not be controlled. When a stimulus is applied to the system, it reacts with a response. Such a pair of a stimulus C and a response R is called a challengeresponse pair (CRP). In particular, a PUF is considered as a function that maps challenges to responses. The following assumptions are made on the PUF: 1. It is assumed that a response Ri (to a challenge Ci ) gives only a negligible amount of information on another response Rj (to a different challenge Cj ) with i 6= j. 2. Without having the corresponding PUF at hand, it is impossible to come up with the response Ri corresponding to a challenge Ci , except with negligible probability. 3. Finally, it is assumed that PUFs are tamper evident. This implies that when an attacker tries to investigate the PUF to obtain detailed information of its structure, the PUF is destroyed. In other words, the PUF’s challenge-response behavior is changed substantially. We distinguish between two different situations. First, we assume that there is a large number of challenge response pairs (Ci , Ri ), i = 1, . . . , N available for the PUF; i.e. a strong PUF has so many CRPs such that an attack (performed during a limited amount of time) based on exhaustively measuring the CRPs only has a negligible probability of success and in particular, 1/N ≈ 2−k for large k ≈ 100 [34, 41]. We refer to this case as strong PUFs. If the number of different CRPs N is rather small, we refer to it as a weak PUF. Due to noise, PUFs are observed over a noisy measurement channel i.e. when a PUF is challenged with Ci a response Ri0 which is a noisy version of Ri is obtained. Section 4 introduces several PUF constructions. 2.1. Fuzzy Extractor and Helper Data Algorithm In [44] it was explained that PUFs can be used to store a secret key in a secure way. Since, PUF responses are noisy as explained above and the responses are not fully random, a Fuzzy Extractor or Helper Data Algorithm is needed to extract one (or more) secure keys from the PUF responses. For the precise definition of a Fuzzy Extractor and Helper Data algorithm we refer to [10, 29]. Informally, two basic primitives need to be implemented: (i) Information Reconciliation or error correction and (ii) Privacy Amplification or randomness extraction. In order to implement those two primitives, helper data W are generated during the enrollment phase. Later during the key reconstruction phase, the key is reconstructed based on a noisy measurement Ri0 and the helper data W . During the enrollment phase (carried out in a trusted environment), a probabilistic procedure called Gen is run. It takes as input a PUF response R and produces as output a key K and helper data W : (K, W ) ← Gen(R). During the key reconstruction phase a procedure called Rep is run. It

2. PHYSICAL UNCLONABLE FUNCTIONS Physical Unclonable Functions, introduced by Pappu et al. [34, 35], consist of inherently unclonable physical systems. 2

takes as input a noisy response R0 and helper data W and reconstructs the key K (if R0 originates from the same source as R) i.e. K ← Rep(R0 , W ). In order to implement the procedures Gen and Rep we need an error correcting code C and a set H of universal hash functions [8]. The parameters1 [n, k, d] of the code C are determined by the length of the responses R and the number t of errors that have to be corrected. The distance d of the code is chosen such that t errors can be corrected. During the enrollment phase a response R is obtained and a random code word CS ← C is chosen from C. Then, a first helper data vector equal to W1 = CS ⊕ R is generated. Furthermore a hash function hi is chosen at random from H and the key K is defined as K ← hi (R). The helper data W2 = i. Summarizing the procedure Gen is defined as follows, (K, W1 , W2 ) ← Gen(R). Finally, during the key reconstruction phase R0 is obtained. During the procedure Rep the following steps are carried out. 1. Information Reconciliation: Using the helper data W1 , W1 ⊕ R0 is computed. Then the decoding algorithm of C is used to obtain CS . From CS , R is reconstructed as R = W1 ⊕ CS . 2. Privacy amplification: The helper data W2 is used to choose the correct hash function hi ∈ H and to reconstruct the key as follows: K = hi (R).

3.1. Public-Key Based Approaches Previous protocols for IP protection [40, 13] are based on the use of symmetric-key cryptography and PUFs (see Sect. 3.2). In this section, we investigate the advantages that PK cryptography provides in this setting. Figure 1 shows the proposed protocol. Notice that in contrast to Fig. 2, the PKbased protocol in Fig. 1 requires the TTP only to interact during the enrollment phase. In addition, we do not require an additional nonce to guarantee privacy from the TTP, as PK crypto gives us this for “free”. Assumptions and Notation: – Communication channels between SYS-TTP, TTP-IPP, and SYS-IPP are authenticated (no man-in-the-middle attack possible). – The TTP and IPP have public-private key pairs (Kpub , Kpriv ) and (Kpub , Kpriv ), reTTP TTP IPP IPP spectively. – The SYS can generate (with the help of the internal FPGA hardware security module) a public-private key pair (Kpub , Kpriv ) internally in the FPGA. C1 C1 – We write InfoHW to mean IDHW kC1kW1kW2kKpub . C1 – We write InfoSW to mean IDSW kKpub . IPP – System parameters (E(F k ) finite field F k , generator point P ∈ E(F k )) are published by the TTP. 2 2 2 Enrollment Protocol: SYS

TTP

IPP InfoSW  Proof of Knowledge of Kpriv IPP  (InfoSW ) CertK privTTP -

InfoHW Proof of Knowledge of Kpriv

C1 -

 (Info ) Cert  KprivTTP HW Authentication Protocol: SYS

IPP IDSW k CertK

3. PUF-BASED IP AUTHENTICATION FOR FPGAS

Dk SigK

 In this section, we introduce protocols based on PK encryption for protection of IP blocks. We also compare our protocol to previous protocols in the literature and analyze the advantages that PK cryptography provides in this setting. Before we continue, we briefly summarize the parties involved in the IP protection chain. These include: the end user, the FPGA customer, the system integrator or designer (SYS), the hardware IP-Provider or core vendor (IPP), the hardware (FPGA) manufacturer (HWM) or vendor, the CAD software vendor, and a Trusted Third Party (TTP) [21]. In the protocols that we propose, we will only deal with the SYS, IPP, HWM, and TTP. We refer to [21] for a detailed description of all the parties. Finally, notice that in the protocol proposed in Sect. 3.1 as well as in the work presented in [40, 13], it is assumed the existence of an internal security module with access to the PUF circuit and either an AES module [40, 13] or elliptic curve (EC) and hash modules (this work). In contrast to [40, 13], the EC-based module never has to output the private key of the device, even during enrollment. Notice that AES hardware modules are already present in some FPGA devices now days [25, 2]. For the reader unfamiliar with EC crypto, the appendix provides a brief overview of the subject.

privIPP

privTTP

(D)kCertK

(InfoHW )

privTTP

(InfoSW )

D ← EncK

pubC

(SW ) 1

Fig. 1. PK-Based Authentication Protocol for FPGA IP Protection. During enrollment the SYS obtains the system parameters published by the TTP, which include: the elliptic curve, finite field F2k and an EC point P ∈ E(F2k ) of prime order. Then, SYS instructs the internal FPGA security module to generate a new private-public key pair, by choosing a challenge C1 and helper data W1 , W2 , deriving a secret key KprivC1 (which corresponds to hi (R), where R is the PUF response as in Sect. 2.1) and computing KpubC1 = KprivC1 · P . Notice that KprivC1 is an integer and KpubC1 a point on an EC. The SYS requests a new certificate from the TTP by sending C1 , W1 , W2 , KpubC1 to the TTP and executing a zero knowledge proof (and proving) [37, 32] that he is in possession of a device that knows the private key KprivC1 corresponding to the public key KpubC1 (without disclosing the private key). Notice that neither SYS nor HWM have direct access to KprivC1 . Upon successful completion of this proof, the TTP sends the SYS a certificate certifying public key and helper data information. The IPP goes through a similar certification procedure for his public key. Then, the authentication protocol between the SYS and IPP consists in exchanging: (i) certificates, which proof authenticity of

1 Given a [n, k, d]-code C over F its words are n-tuples of F elements. q q The code has minimum distance d, it can correct up to b(d − 1)/2c errors, and it has cardinality q k ; i.e. it can encode up to q k possible messages.

3

public keys2 , (ii) the signature on the encrypted bitstream D, which provides authenticity and integrity of D (it did not originate from a bogus IP provider), and (iii) the encrypted bitstream itself, which guarantees confidentiality (even from the TTP).

curious” (the TTP follows the protocol in an honest manner but tries to find out as much information as possible). Notice that this implies that the TTP is not allowed to tamper with EncKpubIP P (η). Finally, notice the last message of the protocol is also sent in an off-line fashion, when the bitstream is loaded onto the FGPA from insecure non-volatile storage.

3.2. Previous Symmetric-Key (SK) Approaches In [40], Sympson and Schaumont describe a protocol which provides hardware IP authentication (S1) and hardware platform authentication (S2). This protocol has been recently simplified in [13]. In addition, the authors in [13] introduce a new protocol which allows for the parties involved in the IP-block exchange to communicate without the TTP having access to the contents of the communication. For completeness, the protocol of [13] is shown in Fig. 2. In Fig. 2, we

3.3. Comparison

Assumptions: – Communication channels between SYS-TTP, TTP-IPP, and SYS-IPP are authenticated. – Communication channel HWM-TTP is secure and authenticated. – Honest but Curious TTP. – Both TTP and SYS obtain the authentic IPP’s public key, Kpub . IPP – Random nonce η. Enrollment Protocol: HWM

TTP IDHW k {{C1 , R1 }, . . . {Cn , Rn }}

-

IPP IDSW k Hash(SWkIDSW )



Authentication Protocol: SYS

TTP

IPP

IDSW kIDHW k (η) EncK pubIPP

-

IDSW kCikCj k (RikRj )k EncK pubIPP (η) EncK pubIPP

-



CikCj kDkMACK (CikCj kD) j

Ki ← Hash(Rikη), Kj ← Hash(Rj kη), D ← EncK (SWkIDSW ) i

Fig. 2. Authentication Protocol for FPGA IP Protection [13] write Ci to denote the PUF challenge and the corresponding helper data required to reconstruct the PUF response Ri from a noisy version Ri0 . It is implicitly assumed that the circuit used to obtain CRPs during the enrollment protocol is destroyed after enrollment and that subsequently, given a challenge Ci the corresponding response Ri0 is only available internally to the decryption circuit in the FPGA. Without this assumption, anyone could access Ri , and the protocols proposed in [40, 13] would be completely broken. The TTPprivate protocol in [13] does not assume any secure channels (except for HWM-TTP). However, it is assumed that the channels HWM-TTP, TTP-SYS, TTP-IPP, SYS-IPP are authentic (e.g. man-in-the-middle attacks are not possible) and that it is possible to obtain IPP’s public key in an authenticated way. It is also assumed that the TTP is “honest-but-

The memory requirements are to a large extent the same for both SK and PK approaches. In particular, the same number of bits of helper data (see Sect. 5) are required to generate the private key in the PK based solution and the secretkey in the SK based solution. Similarly, the error correcting codes necessary in both cases have the same complexity. The PK-based solution requires certificates, which are not present in the SK-based solution. This implies if using straight forward certificates an additional memory requirement of 2 × 492 = 984 bits3 , assuming an EC defined over a 163-bit field. This however can be reduced to 328 bits by using implicit certificates [7]. Regarding performance, we have to perform a signature verification and a decryption. First, notice that efficient implementations of ECC for FPGAs are well known (see [33, 14]) as well as hash functions [27]. Signature performance should not constitute a bottleneck since it requires the computation of a hash (comparable in speed to a MAC in the symmetric-key case) plus a PK operation on the hash (at most 2 msec [33]). However, the PK decryption operation could constitute a heavy burden on the application as it has to be performed on L/163 blocks, where L is the length of the bitstream. To minimize the impact of PK decryption, there are two possibilities. First, we could modify the protocol in Fig. 1 to simply exchange public keys between IPP and SYS, and then fall back to a protocol similar to [13] for encryption and authentication. This would require the implementation of a symmetric encryption algorithm in addition to ECC and hash function modules. The second possibility is to use the construction introduced in [18]. The idea here is to encrypt a single 163bit block and use hashes (or a stream cipher seeded with the hash of the unencrypted bitstream) to generate a pseudorandom sequence to encrypt the bitstream with a one-time pad. Such a scheme would require a single public-key encryption and two hash computations. Finally, we notice that our public-key based scheme allows us to reduce the number of rounds in the protocol compared to [40, 13] and as previously mentioned, the private key does not ever need to be outside the device. 3 This includes only the length of the public key and associated signature. The certificate information has to be included in both SK and PK solutions, as previously argued.

2 It

is assumed that the TTP’s public key is stored in non-tamperable storage.

4

3.4. How Does Everything Work Together?

also to protect a ”weak” PUF from external attacks. Recently, Su et al. [43] present a custom built circuit array of cross-coupled NOR gate latches to uniquely identify an IC. Here, small transistor threshold voltage Vt differences that are caused due to process variations lead to a mismatch in the latch to store a 1 or a 0.

For completeness we describe how the combination of bitstream encryption and a key extracted from a PUF works for IP protection applications on FPGAs. First, we describe the symmetric-key scenario and then we generalize to the PK case. The process consists of the following steps: (i) load the encrypted bitstream and MAC on the FPGA, (ii) challenge the PUF with a challenge Ci , (iii) measure the PUF response Ri0 , (iv) retrieve helper data W1 , W2 from memory, (v) use a fuzzy extractor to extract the key K ← Rep(Ri0 , W1 , W2 ), (vi) decrypt the bitstream and check the associated MAC (if the MAC re-computation fails, the FPGA aborts configuration), and finally (vii) configure the FPGA. In the case of PK encryption, the process is as follows: (i) load the encrypted bitstream D, signature SigKpriv (D), and certifiIPP cate CertKprivTTP (InfoSW ), (ii) verify the certificate information and signature using the public key KpubTTP of the TTP, (iii) verify the signature on the encrypted bitstream D, using the public key KpubIPP of IPP (if any of these checks fails configuration is aborted), (iv) after this steps (ii) to (vii) of the symmetric-key case are followed.

4.2. Coating PUFs In [44], Tuyls et al. present coating PUFs in which an IC is covered with a protective matrix coating, doped with random dielectric particles at random locations. The IC also has a top metal layer with an array of sensors to measure the local capacitance of the coating matrix that is used to characterize the IC. The measurement circuit is integrated in the IC, making it a controlled PUF. Figure 3 shows a schematic diagram of the PUF construction. In Fig. 3, it is possible to see how the upper metal layer contains aluminum sensor structures (Al) that are used to measure the local capacitance of the coating. It is shown in [44] that it is possible to extract

4. PUF CONSTRUCTIONS In this section, we describe known PUF constructions including: Optical and Silicon PUFs, Coating PUFs, and SRAMbased PUFs. Because these last ones have been recently proposed and because of their relevance to the FPGA environment, we describe SRAM-based PUFs in detail. 4.1. Silicon PUFs and Optical PUFs Pappu et al. [34, 35] introduce the idea of a Physical OneWay Function. They use a bubble-filled transparent epoxy wafer and shine a laser beam through it leading to a response interference pattern. This kind of optical PUF is hard to use in the field because of the difficulty to have a tamper resistant measuring device. Gassend et al. introduce Silicon Physical Random Functions (SPUF) [12] which use manufacturing process variations in ICs with identical masks to uniquely characterize each chip. The statistical delay variations of transistors and wires in the IC were used to create a parameterized self oscillating circuit to measure frequency which characterizes each IC. Silicon PUFs are very sensitive to environmental variations like temperature and voltage. Lim et al. [28] introduce arbiter based PUFs which use a differential structure and an arbiter to distinguish the difference in the delay between the paths. Gassend et al. [11] also define a Controlled Physical Random Function (CPUF) which can only be accessed via an algorithm that is physically bound to the randomness source in an inseparable way. This control algorithm can be used to measure the PUF but

Fig. 3. Schematic cross-section of a Coating PUF IC. up to three key bits from each sensor in the IC. A key observation in [44] is that the coating can be used to store keys (rather than as a challenge-response repository as in previous works) and that these keys are not stored in memory. Rather, whenever an application requires the key, the key is generated on the fly. This makes it much more difficult for an attacker to compromise key material in security applications. Finally, Tuyls et al. [44] show that active attacks on the coating can be easily detected, thus, making it a good countermeasure against probing attacks. Although coating PUFs are very cheap to produce they still need a small additional manufacturing step. 4.3. FPGA Intrinsic PUFs and SRAM Memories The disadvantage of most of the previous approaches is the use of custom built circuits or the modification of the IC manufacturing process to generate a reliable PUF. Guajardo et al. [13] approach the problem by identifying an Intrinsic PUF which is defined as a PUF already present in the device and that requires no modification to satisfy the security goals. We describe next how SRAM memories, which are widely available in almost every computing device including modern FPGAs, can be used as an Intrinsic PUF. A CMOS SRAM cell is a six transistor device [4] formed of two cross-coupled inverters and two access transistors

5

connecting to the data bit-lines based on the word-line signal. Previous research on process variations in SRAM has been aimed at reducing the static-noise margin, defined as the minimum DC noise voltage to flip the cell state. In [5], the authors show that microscopic variations in the dopant atoms in the channel region of the MOSFET induce differences in the threshold voltage Vt of the transistors of an SRAM cell. The transistors forming the cross-coupled inverters are constructed particularly weak to allow driving them easily to 0 or 1 during a write process. Hence, these transistors are extremely vulnerable to atomic level intrinsic fluctuations which are outside the control of the manufacturing process and independent of the transistor location on the chip (see e.g. [9]). In practice, SRAM cells are constructed with proper width/length ratios between the different transistors [38] such that these fluctuations do not affect the reading and writing process under normal operation. However, during power-up, the cross-coupled inverters of a SRAM cell are not subject to any externally exerted signal. Therefore, any minor voltage difference that shows up on the transistors due to intrinsic parameter variations will tend towards a 0 or a 1 caused by the amplifying effect of each inverter acting on the output of the other inverter. Hence with high probability an SRAM cell will start in the same state upon power-up. On the other hand, different SRAM cells will behave randomly and independently from each other. In [13], the authors consider as a challenge a range of memory locations within a SRAM memory block. For example, we show in Sect. 5 that to derive a 163-bit secret we require about 6132 SRAM memory bits (under extreme conditions). The response are the start-up values at these locations. If the memory block used is about 512 kbits, we can expect to have about 85 CRPs. Notice also that SRAM-based PUFs produce a binary string as result of a measurement, in contrast, to other PUFs reported in the literature, which have to go through a quantization process before obtaining a bit string from the measurement. This results in a reduction in the complexity of the measurement circuit. For our proof of concept, we use the Altera Stratix II EP2S60 FPGA [1], which has three different types of memory: M512, M4K and Mega-RAM (MRAM). Among these only MRAM is of interest as it is uninitialized during power-up by the reset logic. Each Stratix II chip has two 64 kilobyte MRAM blocks.

The main criteria here is to check the stability of the startup values over a series of intra-class measurements done over a long time period. In [13], the authors compared the Hamming distance between a first measurement and repeated measurements of the same SRAM block carried over approximately two days. The experiment was done with four different MRAM blocks, located in two different FPGAs. The measurements showed that less than 4% of the startup bit values change over time. Similarly, preliminary data indicates that measurements at temperatures ranging from −20◦ C to 80◦ C result in bit strings with maximum fractional Hamming distances of 12% when compared to a reference measurement performed at 20◦ C. Finally, we notice that intra-class Hamming distances of the SRAM startup values should remain small, even when other data has been written into the memory before the FPGA was restarted. In particular, it is important that the startup values are unaffected by aging and the use of the SRAM blocks to store data. SRAM memory retention has been previously considered in [15, 16, 42]. The tests in [13] indicate that storing zeros or ones into the memory has very little influence in the SRAM start-up values in agreement with [42]. The fractional Hamming distance between bit strings from an enrollment (reference) measurement and any of the other measurements does not exceed 4.5% in this test. The fractional Hamming distance between bit strings of different SRAM blocks (and different FPGAs) should be close to 50%, such that each SRAM block (and thus each FGPA) can be uniquely identified. In order to get an idea of identification capabilities of SRAM blocks, [13] investigated the distribution of Hamming distances between bit strings of length 8190 bytes derived from different SRAM blocks (inter-class distribution). A histogram of inter-class Hamming distances is depicted in Fig. 4. The startup bit values of seventeen different SRAM blocks were used to create this graph. The analysis shows that the inter-class fractional Hamming distance distribution closely matches a normal distribution with mean 49.97% and a standard deviation of 0.3%. Fig. 4 also shows the histogram of intra-class Hamming distance measurements. This histogram was created by comparing 92 repeated measurements of the same SRAM block. The intra-class fractional Hamming distance distribution of startup bit strings has an average of 3.57% and a standard deviation of 0.13%.

In order to be useful as a PUF, SRAM startup values should have good statistical properties and be robust over time, to temperature variations, and have good identification performance. These properties were studied in [13]. Here we summarize their findings. Regarding robustness, the Hamming distance between bit strings from repeated measurements of the same SRAM block (intra-class measurements) should be small enough, such that errors between enrollment and authentication measurements can be corrected by an error correcting code admitting efficient decoding.

5. ON THE COST OF EXTRACTING A 163-BIT KEY It is well known that due to the noisy nature of PUFs a fuzzy extractor is required. A fuzzy extractor, as explained in Sect. 2.1, provides error correction capabilities to take care of the noisy measurements and privacy amplification to guarantee the uniform distribution of the final secret. We refer to Sect. 2.1 for the details but, in general, we will need to 6

Histogram of Hamming Distances

code is a good candidate (see for example [6, 36]) with N bit code words and a minimum distance at least d = 2t + 1, t the number of errors that C can correct. Since we need to generate in the end at least 216 information bits, it becomes an optimization problem to choose the best code in terms of hardware resources, number of SRAM bits required, performance, etc. For example, using [511, 19, t = 119]-BCH, we would need 12 × 511 = 6132 bits to generate 228 information bits. On the other hand, if we assume pb = 0.06 (i.e. assume that we only need to operate at 20◦ C), then we could use the binary [1023, 278, t = 102]-BCH code, which requires only 1023 bits of SRAM memory to generate 278 bits of information.

Histogram of Hamming Distances

30

20

18 25

Between−class distribution, count (%)

Within−class distribution, count (%)

16

20

15

10

14

12

10

8

6

4 5 2

0 0.025

0.03 0.035 0.04 0.045 Fractional Hamming Distance

0.05

0 0.48

0.49 0.5 0.51 Fractional Hamming Distance

0.52

Fig. 4. Histogram of intra-class (left) and inter-class (right) Hamming distances between startup bit strings of SRAM blocks and their approximating normal distributions.

PRIVACY AMPLIFICATION. A universal hash function, introduced by Carter and Wegman in [8], is a map from a finite set A of size a to a finite set B of size b. For a given hash function h and two strings x, x0 with x 6= x0 , we define the function δh (x, x0 ) as equal to 1 if h(x) = h(x0 ) and 0 otherwise. For a finite set P (or family) of hash functions H, δH (x, x0 ) is defined to be h∈H δh (x, x0 ). In other words, δH (x, x0 ) counts the number of functions h ∈ H for which x and x0 collide. For a random h ∈ H and any two distinct x, x0 , the probability that h(x) = h(x0 ) is δH (x, x0 )/|H|, where |H| denotes the size of the set H. There has been extensive research on universal hash functions (see for example [39, 31]). However, their suitability for hardware implementations has not been thoroughly investigated. To our knowledge, the work of [24] and the recent work of Kaps et al. [20] are the only ones that consider their hardware implementation. However, no one seems to have considered their implementation on FPGAs. Thus, we will consider what the best architecture for FPGAs is in future work.

choose an error correcting code which accepts efficient decoding, implement its decoding algorithm on the FPGA, and implement a universal hash function, chosen at random from a set H during enrollment. In the following, we describe the choices that can be made to derive a 163-bit key, which can be used in combination with elliptic curve cryptography and the protocols proposed in Sect. 3.1. SECRECY RATE AND ERROR CORRECTION. The fuzzy extractor derives a key K from the SRAM startup bits R by compressing these bits with a hash function hi . The minimal amount of compression that needs to be applied by the hash function is expressed in the secrecy rate SR , see [17]. The maximum achievable secrecy rate SR is given by the mutual information between bit strings derived during enrollment and reconstruction, I(R, R0 ). In [17], a method was presented for estimating this secrecy rate using a universal source coding algorithm called the ContextTree Weighting Method [45]. We have applied this method to the SRAM startup values. By estimating the mutual information I(R, R0 ) between repeated measurements of the same memory block, we find an average secrecy rate of 0.76 bits per SRAM memory bit. That means that to derive a secret of size N , we need at least d1.32N e source bits. In order to choose an adequate error correcting code, we first consider the number of bits of information, which have to be at least d1.32N eN =163 = 216 bits. Assuming that all bits are independent, the probability that a string of S bits will  than t errors, denoted Pt by Ptotal , is PShave more given by i=t+1 Si pib (1 − pb )S−i = 1 − i=0 Si pib (1 − pb )S−i , where pb denotes the bit error probability. Notice that the maximum number of errors that we have experimentally seen is about 12%. Thus, assume that we have a bit error probability pb = 0.15, to be conservative and that we are willing to accept a failure rate of Ptotal = 10−6 . Since, we are assuming that the errors are independent, a binary BCH

6. CONCLUSIONS In this paper, we have proposed new and efficient protocols for the IP-protection problem based on PK cryptographic primitives. In addition, we have described known PUF constructions with particular attention to those that are based on the properties of SRAM because of their presence in current FPGAs. We have tested this construction on FPGAs with embedded block RAM memories which are not reset at power-up. We have seen similar phenomena in ASICs and expect similar behavior on any other device which contains uninitialized SRAM memory. At present, we have identified other properties of SRAM memory, which have the potential to be used as a PUF-source. This will be investigated in future work. We will also explore in the future the exact complexity of implementing a fuzzy extractor on an FPGA. Finally, we notice that the unique identifiers derived from the PUFs could be useful for tracking purposes. 7

7. REFERENCES [1]

ALTERA. Stratix II Device Handbook vol 1: Stratix II Architecture, August 2006. Available at http://www. altera.com/literature/hb/stx2/stx2_sii51002.pdf.

[2]

ALTERA. Application Note 341 v2.0. Using the Design Security Feature in Stratix II and Stratix II GX Devices, February 2007. Available at http://www.altera.com/literature/an/an341.pdf.

[3]

R. J. Anderson and M. G. Kuhn. Low Cost Attacks on Tamper Resistant Devices. In B. Christianson, B. Crispo, T. M. A. Lomas, and M. Roe, editors, Security Protocols Workshop, volume 1361 of LNCS, pages 125–136. Springer, 1997.

[29]

J.-P. M. G. Linnartz and P. Tuyls. New Shielding Functions to Enhance Privacy and Prevent Misuse of Biometric Templates. In J. Kittler and M. S. Nixon, editors, Audio-and Video-Based Biometrie Person Authentication — AVBPA 2003, volume 2688 of LNCS, pages 393–402. Springer, June 9-11, 2003.

[30]

V. S. Miller. Use of Elliptic Curves in Cryptography. In H. C. Williams, editor, Advances in Cryptology — CRYPTO ’85, volume 218 of Lecture Notes in Computer Science, pages 417–426. Springer, August 18-22, 1985.

[31]

W. Nevelsteen and B. Preneel. Software Performance of Universal Hash Functions. In J. Stern, editor, Advances in Cryptology — EUROCRYPT’99, volume 1592 of LNCS, pages 24–41. Springer, May 2-6, 1999.

[32]

T. Okamoto. Provably Secure and Practical Identification Schemes and Corresponding Signature Schemes. In E. F. Brickell, editor, Advances in Cryptology — CRYPTO’92, volume 740 of LNCS, pages 31–53. Springer, 1992.

[33]

G. Orlando and C. Paar. A High Performance Reconfigurable Elliptic Curve Processor for GF(2 m ). In C¸. K. Koc¸ and C. Paar, editors, Cryptographic Hardware and Embedded Systems — CHES 2000, volume 1965 of LNCS, pages 41–56. Springer, August 17-18, 2000.

[4]

A. Bellaouar and M. I. Elmasry. Low-Power Digital VLSI Design. Circuits and Systems. Kluwer Academic Publishers, first edition, 1995.

[5]

A. J. Bhavnagarwala, X. Tang, and J. D. Meindl. The Impact of Intrinsic Device Fluctuations on CMOS SRAM Cell Stability. IEEE Journal of Solid-State Circuits, 36(4):658–665, April 2001.

[34]

R. S. Pappu. Physical one-way functions. PhD thesis, Massachusetts Institute of Technology, March 2001. Available at http://pubs.media.mit.edu/pubs/papers/01.03.pappuphd.powf.pdf.

[6]

R. E. Blahut. Theory and Practice of Error Control Codes. Addison-Wesley Publishing Company, first edition, 1985.

[35]

R. S. Pappu, B. Recht, J. Taylor, and N. Gershenfeld. Physical one-way functions. Science, 297(6):2026–2030, 2002. Available at http://web.media.mit.edu/˜brecht/papers/02.PapEA.powf.pdf.

[36]

W. W. Peterson and E. J. Weldon, Jr. Error-Correcting Codes. The MIT Press, second edition, 1972.

[37]

C.-P. Schnorr. Efficient Identification and Signatures for Smart Cards. In G. Brassard, editor, Advances in Cryptology — CRYPTO ’89, volume LNCS 435, pages 239–252. Springer, 1989.

[38]

E. Seevinck, F. J. List, and J. Lohstroh. Static-Noise Margin Analysis of MOS SRAM Cells. IEEE Journal of Solid-State Circuits, 22(5):748–754, Oct 1987.

[39]

V. Shoup. On Fast and Provably Secure Message Authentication Based on Universal Hashing. In N. Koblitz, editor, Advances in Cryptology - CRYPTO ’96, volume 1109 of LNCS, pages 313–328. Springer, August 18-22, 1996.

[40]

E. Simpson and P. Schaumont. Offline Hardware/Software Authentication for Reconfigurable Platforms. In L. Goubin and M. Matsui, editors, Cryptographic Hardware and Embedded Systems — CHES 2006, volume 4249 of LNCS, pages 311–323. Springer, October 10-13, 2006.

[41]

B. Skoric, P. Tuyls, and W. Ophey. Robust Key Extraction from Physical Uncloneable Functions. In J. Ioannidis, A. D. Keromytis, and M. Yung, editors, Applied Cryptography and Network Security — ACNS 2005, volume 3531 of LNCS, pages 407–422, June 7-10, 2005.

[42]

S. P. Skorobogatov. Low temperature data remanence in static RAM. Technical Report 536, University of Cambridge, Computer Laboratory, June 2002.

[43]

Y. Su, J. Holleman, and B. Otis. A 1.6pJ/bit 96% Stable Chip-ID Generating Cicuit using Process Variations. In ISSCC ’07: IEEE International Solid-State Circuits Conference, pages 406–408, Washington, DC, USA, 2007. IEEE Computer Society.

[44]

P. Tuyls, G.-J. Schrijen, B. Skoric, J. van Geloven, N. Verhaegh, and R. Wolters. Read-Proof Hardware from Protective Coatings. In Cryptographic Hardware and Embedded Systems — CHES 2006, volume 4249 of Lecture Notes in Computer Science, pages 369–383. Springer, October 10-13, 2006.

[45]

F. Willems, Y.M. Shtarkov, and Tj. J. Tjalkens. The Context-Tree Weighting method: Basic Properties. IEEE Trans. Inform. Theory, IT-41:653–664, May 1995.

[7]

D. R. L. Brown, R. P. Gallant, and S. A. Vanstone. Provably Secure Implicit Certificate Schemes. In F. Syverson P, editor, Financial Cryptography — FC 2001, volume 2339 of LNCS, pages 156–165. Springer, February 19-22, 2001.

[8]

L. Carter and M. N. Wegman. Universal Classes of Hash Functions. J. Comput. Syst. Sci., 18(2):143–154, 1979.

[9]

B. Cheng, S. Roy, and A. Asenov. The impact of random doping effects on CMOS SRAM cell. In European Solid State Circuits Conference, pages 219–222, Washington, DC, USA, 2004. IEEE Computer Society.

[10]

Y. Dodis, M. Reyzin, and A. Smith. Fuzzy extractors: How to generate strong keys from biometrics and other noisy data. In C. Cachin and J. Camenisch, editors, Advances in Cryptology —- EUROCRYPT 2004, volume 3027 of LNCS, pages 523–540. Springer-Verlag, 2004.

[11]

B. Gassend, D. Clarke, M. van Dijk, and S. Devadas. Controlled Physical Random Functions. In ACSAC ’02: Proceedings of the 18th Annual Computer Security Applications Conference, page 149, Washington, DC, USA, 2002. IEEE Computer Society.

[12]

B. Gassend, D. E. Clarke, M. van Dijk, and S. Devadas. Silicon physical unknown functions. In V. Atluri, editor, ACM Conference on Computer and Communications Security — CCS 2002, pages 148–160. ACM, November 2002.

[13]

J. Guajardo, S. S. Kumar, G.-J. Schrijen, and P. Tuyls. FPGA Intrinsic PUFs and Their Use for IP Protection. March 12, 2006. Submitted for publication.

[14]

N. Gura, S. C. Shantz, H. Eberle, S. Gupta, V. Gupta, D. Finchelstein, E. Goupy, and D. Stebila. An End-to-End Systems Approach to Elliptic Curve Cryptography. In S. Kaliski Jr. B, C¸. K. Koc¸, and C. Paar, editors, Cryptographic Hardware and Embedded Systems — CHES 2002, volume 2523 of LNCS, pages 349–365. Springer, August 13-15, 2002.

[15]

P. Gutmann. Secure deletion of data from magnetic and solid-state memory. In Sixth USENIX Workshop on Smartcard Technology Proceedings, pages 77–89, San Jose, California, July 1996. Available at http://www. cs.cornell.edu/people/clarkson/secdg/papers.sp06/secure_deletion.pdf.

[16]

P. Gutmann. Data remanence in semiconductor devices. In 10th USENIX Security Symposium, pages 39–54, August 2001. Available at http://www.cryptoapps.com/ ˜peter/usenix01.pdf.

[17]

T. Ignatenko, G.J. Schrijen, B. Skoric, P. Tuyls, and F. Willems. Estimating the Secrecy-Rate of Physical Unclonable Functions with the Context-Tree Weighting Method. In IEEE International Symposium on Information Theory, pages 499–503, Seattle, USA, July 2006.

[18]

M. Jakobsson, J. P. Stern, and M. Yung. Scramble All, Encrypt Small. In L. R. Knudsen, editor, Fast Software Encryption — FSE’99, volume 1636 of LNCS, pages 95–111. Springer, March 24-26, 1999.

[19]

A. B. Kahng, J. Lach, W. H. Mangione-Smith, S. Mantik, I. L. Markov, M. Potkonjak, P. Tucker, H. Wang, and G. Wolfe. Watermarking techniques for intellectual property protection. In Design Automation Conference — DAC ’98, pages 776–781, New York, NY, USA, 1998. ACM Press.

[20]

J.-P. Kaps, K. Y., and B. Sunar. Energy Scalable Universal Hashing. IEEE Trans. Computers, 54(12):1484–1495, 2005.

[21]

T. Kean. Cryptographic rights management of FPGA intellectual property cores. In ACM/SIGDA tenth international symposium on Field-programmable gate arrays — FPGA 2002, pages 113–118, 2002.

[22]

N. Koblitz. A Family of Jacobians Suitable for Discrete Log Cryptosystems. In S. Goldwasser, editor, Advances in Cryptology — CRYPTO ’88, volume 403 of LNCS, pages 94–99. Springer, August 21-25, 1988.

[23]

KPMG Electronics, Software & Services and Alliance for Gray Market and Counterfeit Abatement. Managing the Risks of Counterfeiting in the Information Technology Industry. White paper, 2005. Available at http: //www.agmaglobal.org/.

[24]

H. Krawczyk. LFSR-based Hashing and Authentication. In Y. Desmedt, editor, Advances in Cryptology CRYPTO ’94, volume 839 of LNCS, pages 129–139. Springer, August 21-25, 1994.

[25]

R. Krueger. Using High Security Features in Virtex-II Series FPGAs. Xapp766 (v1.0), Xilinx, July 8, 2004. Available at http://www.xilinx.com/bvdocs/appnotes/xapp766.pdf.

[26]

A. Lesea. IP Security in FPGAs. White paper 261 (v1.0), Xilinx, February 16, 2007. Available at http: //direct.xilinx.com/bvdocs/whitepapers/wp261.pdf.

[27]

R. Lien, T. Grembowski, and K. Gaj. A 1 Gbit/s Partially Unrolled Architecture of Hash Functions SHA-1 and SHA-512. In T. Okamoto, editor, Topics in Cryptology — CT-RSA 2004, volume 2964 of LNCS, pages 324–338. Springer, February 23-27, 2004.

[28]

D. Lim, J. W. Lee, B. Gassend, G. E. Suh, M. van Dijk, and S. Devadas. Extracting secret keys from integrated circuits. IEEE Transactions on Very Large Scale Integration (VLSI) Systems, 13(10):1200–1205, October 2005.

A. ELLIPTIC CURVE CRYPTOGRAPHY PRIMER Elliptic curves were introduced in cryptography by Miller [30] and Koblitz [22]. In this work we limit ourselves to EC defined over binary finite fields because of their advantages when implemented in hardware. A non-supersingular elliptic curve E(F2n ) is defined as the set of solutions (x, y) ∈ F2n × F2n to the equation: y 2 + xy = x3 + ax2 + b where a, b ∈ F2n , b 6= 0, together with the point at infinity, denoted by O. It is possible to define an addition operation on the group of points of the elliptic curve [22]. Given such an addition operation, we write k · P , where k is an integer and P ∈ E(F2n ) to mean the addition of the point P , k times to itself. The order of P , written ord(P ), is the least integer k such that k · P = O. Given an elliptic curve point P ∈ E(F2n ) of prime order, the discrete logarithm problem in the additive group of points of the EC is defined as: given two points Q, P ∈ E(F2n ), find the integer k such that Q = k · P . For security reasons, k and the order of the point P are chosen such that they are ≈ 2160 . Notice that k corresponds to the private key of a user and it is an integer. Q corresponds to the user’s public key. P is a system parameter chosen by the TTP. 8