Multiparty Computation of Fixed-Point Multiplication and Reciprocal∗

Octavian Catrina and Claudiu Dragulin
International University in Germany, School of Information Technology
76646 Bruchsal, Germany

Abstract

Collaborative business applications can use secure multiparty computation to preserve input privacy. These applications need protocols that provide all the basic operations with integers and rational numbers and allow secure composition and efficient application development. Secure computation with rational numbers is a long-standing open problem. We present in this paper several components of a protocol family for secure computation with fixed-point numbers based on secret sharing.

1. Introduction

Secure multiparty computation deals with cryptographic protocols that enable a group of parties to run joint applications and preserve input privacy. Their core components offer the combined ability to protect data privacy and compute with protected data, e.g., secret sharing or homomorphic encryption schemes. Among the applications that benefit from input privacy we find auctions, collaborative linear programming, supply chain management, benchmarking and forecasting [3, 11, 2].

These applications need a family of protocols for secure computation with integer and rational numbers. The protocols proposed so far support arithmetic in a field or ring and computation with binary and integer values. Protocols for computing the reciprocal and division were studied only in the context of particular applications and integer arithmetic [1, 8, 7]. Secure computation with rational numbers has remained, essentially, an open problem.

In this paper, we present several protocols for multiparty computation with fixed-point numbers. Fixed-point arithmetic needs specific protocols only for multiplication and division, while the other operations can be efficiently achieved using integer computation. Overall, the family of protocols allows addition, subtraction, multiplication, division, and comparison of (signed) integers and fixed-point numbers. These protocols use a secure computation framework that relies on secret sharing and provides a collection of basic building blocks, partly adapted from solutions proposed in the literature (e.g., [5, 11, 10, 1, 4]). Division is achieved by multiplying the dividend with the reciprocal of the divisor, to improve efficiency when dividing many values by a common divisor. This solution is mainly motivated by one of our target applications, the Simplex algorithm for linear programming.

Section 2 provides an introduction to data representation for secure computation with fixed-point numbers. Protocols for fixed-point multiplication and division are discussed in Sections 3 and 4, respectively. We conclude in Section 5 with an analysis of performance measurements for an implementation of these protocols.

∗ Part of the work presented in this paper was funded by the European Commission through the grant FP7-213531 to the SecureSCM project.

2. Data Representation

Secure computation has to encode application data as values in the input domains of the cryptographic protocols. We assume a secure computation framework that uses secret sharing over a field ($\mathbb{Z}_q$ or $\mathbb{F}_{2^m}$) and offers as core functionality the following operations: input (share distribution), output (secret reconstruction), and secure arithmetic with field elements. We summarize several encoding methods for binary and fixed-point values.

Logical values $x \in \{false, true\}$ are encoded as integer values, $false = 0$, $true = 1$. Binary values $x \in \{0, 1\}$ can be encoded in $\mathbb{Z}_q$ or $\mathbb{F}_{2^m}$ with the identity mapping. For this encoding, boolean functions can be evaluated using arithmetic in $\mathbb{Z}_q$ or $\mathbb{F}_{2^m}$. It is often useful (albeit inefficient) to encode integers as arrays of secret-shared bits corresponding to their binary representation. Protocols for boolean functions and binary arithmetic can be constructed using the protocols for secure arithmetic in $\mathbb{Z}_q$ or $\mathbb{F}_{2^m}$.

Fixed-point numbers are rational numbers represented as a sequence of digits split by a virtual radix point into an integer part and a fractional part with fixed length. We denote by $e$ the length of the integer part, by $f$ the length of the fractional part, and $k = e + f$. In computer arithmetic, fixed-point numbers are usually represented using binary digits and the two's complement system [6]. We use a similar solution for secure computation, because it allows efficient arithmetic.

Integers are fixed-point numbers with $f = 0$. An integer $x$ is encoded on $k$ bits using the mapping $x_{\langle k \rangle} = x \bmod 2^k$, which provides a unique representation on $k$ bits of signed integers in $\mathbb{Z}_{\langle k \rangle} = [-2^{k-1}, 2^{k-1} - 1]$.

The set of all rational numbers is $\mathbb{Q} = \{x = \frac{u}{v} \mid u, v \in \mathbb{Z}, v \neq 0\}$. A $k$-bit fixed-point representation provides the value of the numerator $u \in \mathbb{Z}_{\langle k \rangle}$, while the denominator has a fixed value $v = 2^f$, $f < k$. We can thus represent the subset of rational numbers $\mathbb{Q}_{\langle k,f \rangle} = \{x_{\langle k,f \rangle} = u \cdot 2^{-f} \mid u \in \mathbb{Z}_{\langle k \rangle}, k > f\}$. We define the maps $int : \mathbb{Q}_{\langle k,f \rangle} \to \mathbb{Z}$, $int(x) = x \cdot 2^f$, and $fxp : \mathbb{Z}_{\langle k \rangle} \to \mathbb{Q}_{\langle k,f \rangle}$, $fxp(y) = y \cdot 2^{-f}$.

Arithmetic with fixed-point numbers in $\mathbb{Q}_{\langle k,f \rangle}$ can be defined using integer arithmetic. For $u, v \in \mathbb{Z}$, we denote $u/v = \frac{u}{v}$. We consider the following basic set of operations, for any $a, b \in \mathbb{Q}_{\langle k,f \rangle}$:

$$
\begin{aligned}
a + b &= fxp(int(a) + int(b)) \\
a - b &= fxp(int(a) - int(b)) \\
a \cdot b &= fxp((int(a) \cdot int(b)) / 2^f) \\
a / b &= fxp((int(a) \cdot 2^f) / int(b)) \\
(a = b) &= (int(a) = int(b)) \\
(a < b) &= (int(a) < int(b)).
\end{aligned}
$$

Addition, subtraction, equality, and inequality in $\mathbb{Q}_{\langle k,f \rangle}$ are computed like in $\mathbb{Z}_{\langle k \rangle}$. It is sufficient to find an encoding for secure integer computation. Fixed-point numbers use the same encoding, and protocols for fixed-point arithmetic are constructed using those for integer arithmetic.

Signed integers in $\mathcal{A} = [-A/2, A/2]$ can be encoded in the field $\mathbb{Z}_q$ using the function $\rho(a) = a \bmod q$, where $q > A > 0$. If $a \in [0, A/2)$ then $\rho(a) = a$ and if $a \in [-A/2, 0)$ then $\rho(a) = q + a$. For this encoding, $\rho(\phi_{\mathbb{Z}}(a, b)) = \phi_{\mathbb{Z}_q}(\rho(a), \rho(b))$, where $\phi_{\mathbb{Z}}$, $\phi_{\mathbb{Z}_q}$ are addition, subtraction, multiplication, and equality in $\mathbb{Z}$ and $\mathbb{Z}_q$, respectively. Also, if $a \bmod b = 0$, then $\rho(a/b) = \rho(a)/\rho(b)$. For these operations, secure arithmetic in $\mathcal{A} \subset \mathbb{Z}$ can be computed using the protocols for secure arithmetic in $\mathbb{Z}_q$. An alternative solution is to encode signed integers in a field with modular arithmetic centered around zero [1].

We specify the protocols assuming that all private data is encoded and secret-shared in the same field $\mathbb{Z}_q$ using Shamir sharing. We denote $[x]$ a sharing of $x \in \mathbb{Z}_q$ and $[x]_B$ a bitwise sharing. For $a \in \mathbb{Z}$ we also use $a = \rho(a)$, for convenience. Linear combinations of secret values with public coefficients are locally computed. Other operations with secret values require interaction. We measure communication complexity as the number of secure multiplications and equivalent operations, called invocations. Round complexity is the number of sequential interactions.

3. Secure Multiplication

Multiplication of fixed-point values in $\mathbb{Q}_{\langle k,f \rangle}$ consists of an integer multiplication followed by division of the result by $2^f$, i.e., truncation of the $f$ least significant bits. We need $q > 2^{k+f}$ to avoid overflow. The number of truncations can sometimes be reduced: $\sum_{i=1}^{k} a_i \cdot b_i = fxp((\sum_{i=1}^{k} int(a_i) \cdot int(b_i)) / 2^{f})$ and, if $q > 2^{k+2f}$, then $a \cdot b \cdot c = fxp((int(a) \cdot int(b) \cdot int(c)) / 2^{2f})$. Also, if $a \in \mathbb{Z}_{\langle k \rangle}$, $b \in \mathbb{Q}_{\langle k,f \rangle}$ then $a \cdot b = fxp(a \cdot int(b))$.

Protocol 3.1, Div2m, truncates the $m$ least significant bits of $a \in \mathbb{Z}_{\langle k \rangle}$ by computing $\lfloor a/2^m \rfloor = (a - (a \bmod 2^m)) \cdot 2^{-m}$. Div2m uses Protocol 3.2, Mod2m, to compute $a \bmod 2^m$. The input $a$ and the outputs of the protocols are encoded in $\mathbb{Z}_q$ and secret-shared, while $k$ and $m$ are public positive integers. The protocols provide statistical privacy and require $q > 2^{k+\kappa+1}$, where $\kappa$ is a security parameter.

Protocol 3.1. Div2m($[a]$, $k$, $m$)
1. $[b] \leftarrow$ Mod2m($[a]$, $k$, $m$)
2. $[a'] \leftarrow ([a] - [b]) \cdot 2^{-m}$
3. Return $[a']$

Protocol 3.2. Mod2m($[a]$, $k$, $m$)
1. For $0 \le i < k + \kappa$ do $[r_i] \leftarrow$ RandBitZ()
2. $[r] \leftarrow \sum_{i=0}^{k+\kappa-1} 2^i \cdot [r_i]$; $[r']_B \leftarrow ([r_{m-1}], \ldots, [r_0])$; $[r'] \leftarrow \sum_{i=0}^{m-1} 2^i \cdot [r_i]$
3. $[b] \leftarrow 2^{k-1} + [a]$
4. $c \leftarrow$ output($[b] + [r]$); $c' \leftarrow c \bmod 2^m$
5. $[u] \leftarrow$ BitLT($c'$, $[r']_B$); $[a'] \leftarrow c' - [r'] + [u] \cdot 2^m$
6. Return $[a']$

The idea of Mod2m is to compute and open $a + r$, for a random secret $r$, then compute $(a + r) \bmod 2^m = (a \bmod 2^m + r \bmod 2^m) \bmod 2^m$ and subtract $[r \bmod 2^m]$. With our encoding, this works only if $a \ge 0$. Mod2m replaces $a$ with $b = (2^{k-1} + a) \bmod q = 2^{k-1} + a$. Observe that $b \in [0, 2^k - 1]$ and $b \bmod 2^m = a \bmod 2^m$ for any $m < k$. In steps 1-2 the parties generate the random secrets $r \in [0, 2^{\kappa+k} - 1]$ and $r' = r \bmod 2^m$. In steps 3-4 they mask $b$ using $r$ and reveal $c = b + r$ (no wraparound modulo $q$, since $q > 2^{k+\kappa+1}$). Let $a' = b \bmod 2^m$ and $c' = c \bmod 2^m$. Observe that $a' = c' - r' + u \cdot 2^m$, where $u \in \{0, 1\}$ and $u = (c' < r')$. Mod2m computes $[u]$ using the comparison protocol BitLT (bitwise less than).

For large $\kappa$ the distribution of $c$ is close to uniform in $[0, 2^{k+\kappa} + 2^k - 2]$. Since no other value is opened, the protocol provides statistical privacy (assuming privacy of the sub-protocols). With current building blocks Mod2m runs in $\log(m) + 4$ rounds with communication complexity $0.5 m \log(m) + 2k + 2\kappa$ invocations.

An efficient approximate truncation protocol was proposed in [1]. The absolute error is $\epsilon \le n + 1$, for $n$ parties. We adapted the protocol for signed integers encoded in $\mathbb{Z}_q$. We call this variant AppDiv2m. AppDiv2m runs in 3 rounds with communication complexity 4 invocations.
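To make the arithmetic of Mod2m and Div2m concrete, the following sketch traces both protocols in the clear, i.e., on plain field elements instead of secret shares, and then uses Div2m for one fixed-point multiplication. It is only an illustration under stated assumptions: the modulus ($2^{127} - 1$), the value $\kappa = 40$, and all function names are choices made here and do not come from the paper's implementation, and BitLT is replaced by an ordinary comparison.

```python
import secrets

Q = 2**127 - 1   # assumed prime modulus; satisfies q > 2^(k+kappa+1) for the sizes used here
KAPPA = 40       # assumed statistical security parameter (illustrative value)

def rho(a: int) -> int:
    """Encode a signed integer in Z_q: rho(a) = a mod q."""
    return a % Q

def rho_inv(x: int) -> int:
    """Decode: field elements above q/2 stand for negative integers."""
    return x - Q if x > Q // 2 else x

def mod2m(a: int, k: int, m: int) -> int:
    """Plaintext trace of Mod2m on an encoded k-bit signed value a."""
    r = secrets.randbits(k + KAPPA)       # steps 1-2: r built from k + kappa random bits
    r_low = r % (1 << m)                  # r' = r mod 2^m
    b = (a + (1 << (k - 1))) % Q          # step 3: shift a into [0, 2^k - 1]
    c = (b + r) % Q                       # step 4: the only value that would be opened
    c_low = c % (1 << m)                  # c' = c mod 2^m
    u = 1 if c_low < r_low else 0         # step 5: stands in for BitLT(c', [r']_B)
    return (c_low - r_low + (u << m)) % Q

def div2m(a: int, k: int, m: int) -> int:
    """Plaintext trace of Div2m: (a - (a mod 2^m)) * 2^(-m) in Z_q."""
    inv = pow(1 << m, -1, Q)              # modular inverse of 2^m (Python 3.8+)
    return (a - mod2m(a, k, m)) * inv % Q

def fxp_mul(a: int, b: int, k: int, f: int) -> int:
    """Fixed-point product: integer multiplication, then truncation of f bits."""
    return div2m(a * b % Q, k + f, f)

k, f = 32, 16
x = rho(int(1.5 * 2**f))
y = rho(int(-2.25 * 2**f))
print(rho_inv(fxp_mul(x, y, k, f)) / 2**f)   # -3.375
```

Because the mask `r` is removed again in step 5, the result does not depend on the random bits. In the real protocol, `r` and `b` stay secret-shared and only `c` is revealed.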

4. Secure Reciprocal

The algorithms for dividing fixed-point numbers follow two main approaches: digit recurrence algorithms (subtractive division) and functional iteration algorithms (multiplicative division) [6]. Digit recurrence algorithms calculate one quotient digit per iteration, and hence the number of iterations increases linearly with the length of the quotient. Functional iteration algorithms converge quadratically to the quotient, starting with an initial approximation and doubling the number of quotient bits at each iteration.

We compute a secure division in two steps: (1) determine the reciprocal of the divisor using the Newton-Raphson method and (2) multiply it with the dividend. This solution is motivated by our main target application, which requires efficient computation of multiple divisions with a common divisor. Other solutions are currently being investigated.

The Newton-Raphson method is generally used for evaluating the root of an equation $f(x) = 0$ based on the recurrence $x_{i+1} = x_i - \frac{f(x_i)}{f'(x_i)}$. In particular, for $f(x) = 1/x - d$ we can compute an approximation of the reciprocal $1/d$ using the recurrence $x_{i+1} = x_i (2 - x_i d)$. The relative error is $\epsilon_i = 1 - d x_i$ after iteration $i$, and becomes $\epsilon_{i+1} = \epsilon_i^2$ after the next iteration. Therefore, if $|\epsilon_0| < 1$ the algorithm converges and the relative error decreases quadratically. For an initial approximation $x_0$ with error $\epsilon_0 < 2^{-m}$, a reciprocal with relative error $\epsilon < 2^{-p}$ is obtained after $\lceil \log_2(p/m) \rceil$ iterations. Each iteration roughly doubles the accuracy, and iteration $i$ yields an approximation of the reciprocal with accuracy $2^i \times m$ bits.

A key issue is to determine an initial approximation that ensures quadratic convergence. Moreover, performance can be improved by starting with an accurate initial approximation, in order to reduce the number of iterations. The usual approach is to compute a normalized input $r \in [0.5, 1)$ (or $r \in [1, 2)$) and then find an approximation of $1/r$.

A linear approximation $1/r \approx \alpha - \beta r$ can easily be computed and offers quite good accuracy. For example, $x_0 = 2.9142 - 2r$ approximates $1/r$ for $r \in [0.5, 1)$ with error $\epsilon_0 < 0.08578$, i.e., accuracy of 3.5 bits. This linear approximation can be computed without interaction for secret-shared $r$.

We describe a basic version of the protocol using this simple linear approximation. More accurate initial approximations can be obtained by table lookup [9]. Secure table lookup can be achieved quite efficiently and we intend to add piece-wise linear approximation.

The computation of the reciprocal for $d \in \mathbb{Q}_{\langle k,f \rangle}$, $d > 0$ consists of the following main steps:

1. Range reduction: Scale the input $d$ to the range $[1/2, 1)$. Let $r = 2^s \cdot d \in [1/2, 1)$ be the scaled value.
2. Iterations: Let $x_0 \in (1, 2]$ be an initial approximation of $1/r$ with $\epsilon_0 \le 2^{-m}$. Accuracy of $p$ bits is obtained after $c = \lceil \log_2(p/m) \rceil$ iterations. Compute $x_{i+1} = x_i (2 - x_i \cdot r)$, for $0 \le i < c$.
3. Range expansion: The iterations compute $x_c \approx 1/r = 1/(d \cdot 2^s)$. Scale $x_c$ to obtain the (approximate) reciprocal $1/d \approx x_c \cdot 2^s$.

We slightly change the notation to simplify the presentation. We denote $v_{\langle k,f \rangle}$ a $k$-bit positive fixed-point value with $f$-bit resolution and $v_{\langle k \rangle}$ a $k$-bit positive integer.

Protocol 4.1, RecNR, computes the reciprocal of a positive fixed-point value $d_{\langle k,f \rangle}$, where $k = 2f$ in order to avoid overflow. The range reduction scales $d_{\langle k,f \rangle}$ and keeps $p \in [f, k]$ most significant bits, obtaining the normalized input $r_{\langle p,p \rangle} \in [1/2, 1)$. The iterations yield $x_{\langle p+2,p \rangle} \approx 1/r_{\langle p,p \rangle}$, $x_{\langle p+2,p \rangle} \in (1, 2]$. The range expansion scales $x_{\langle p+2,p \rangle}$ to obtain $z_{\langle k,f \rangle} \approx 1/d_{\langle k,f \rangle}$. The parameter $p$ allows different trade-offs between accuracy and efficiency, e.g., $p = f$ offers an important efficiency gain for a relative error $< 2^{-f}$.

Protocol 4.1. RecNR($[d]$, $k$, $f$, $p$)
1. $[v] \leftarrow$ ScaleUpFactor($[d]$, $k$)
2. $[r] \leftarrow [d] \cdot [v]$
3. If ($k - p > 0$) then $[r] \leftarrow$ Div2m($[r]$, $k$, $k - p$)
4. $[x] \leftarrow \alpha - \beta \cdot [r]$
5. For $1 \le i \le \lceil \log(p/m) \rceil$ do
   (a) $[y] \leftarrow 2^{2p+1} - [x] \cdot [r]$
   (b) $[y] \leftarrow [x] \cdot [y]$
   (c) $[x] \leftarrow$ Div2m($[y]$, $3 + 3p$, $2p$)
6. $[z] \leftarrow [x] \cdot [v]$
7. $[z] \leftarrow$ Div2m($[z]$, $k + p$, $k + p - 2f$)
8. Return $[z]$

Protocol RecNR carries out the algorithm outlined above using integer arithmetic. The input and the output are the integers $d_{\langle k \rangle} = int(d_{\langle k,f \rangle})$ and $z_{\langle k \rangle} = int(z_{\langle k,f \rangle})$, encoded in $\mathbb{Z}_q$ and secret-shared.

Range reduction (normalization). The range reduction step computes the normalized input $r_{\langle p,p \rangle} = 2^s \cdot d_{\langle k,f \rangle}$ so that $r_{\langle p,p \rangle} \in [1/2, 1)$. If $d_{\langle k,f \rangle} < 1/2$ then $s > 0$ (multiplication by $2^s$) and if $d_{\langle k,f \rangle} \ge 1$ then $s < 0$ (division by $2^{-s}$). The values $d$, $s$, and $r$ must remain secret. Protocol 4.1 starts with $d_{\langle k \rangle} = int(d_{\langle k,f \rangle}) \in [1, 2^k - 1]$ and computes $r_{\langle p \rangle} = 2^s \cdot d_{\langle k \rangle} \in [2^{p-1}, 2^p - 1]$. This implies $r_{\langle p,p \rangle} = 2^{-p} \cdot r_{\langle p \rangle} \in [1/2, 1)$, as required. The protocol has to deal with two issues: (1) determine the secret factor $2^s$ and (2) compute the normalized input using only multiplication by a secret or public value and division by a public value. The input is first scaled up by a secret factor $2^u$, obtaining $r_{\langle k \rangle} = 2^u \cdot d_{\langle k \rangle} \in [2^{k-1}, 2^k - 1]$. If $p = k$ the normalized input is $r_{\langle k \rangle}$. If $p < k$, the result is divided by the public value $2^{k-p}$ to obtain $r_{\langle p \rangle} = 2^{-(k-p)} \cdot r_{\langle k \rangle} \in [2^{p-1}, 2^p - 1]$. The secret factor $2^u$ is computed using Protocol 4.2. The protocol starts with the bit decomposition of $[d]$, obtaining $[d]_B = ([d_{k-1}], \ldots, [d_0])$, then determines the most significant non-zero bit of $[d]_B$ using prefix-OR. Suppose that this bit is $d_j$, where $j \in [0, k - 1]$. Then $[2^u] = [2^{k-1-j}]$. The secret integer value $[v] = [2^u]$ is saved for the range expansion in step 6 of Protocol 4.1.
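As a sanity check of Protocol 4.1 and of the normalization just described, the sketch below runs the same integer arithmetic in the clear. It is a plaintext illustration, not the secure protocol: there is no secret sharing, exact truncation by Div2m is replaced by a right shift, and the scale-up factor is derived from the bits of `d` with an ordinary prefix-OR loop. The constants $\alpha = 2.9142$, $\beta = 2$ and $m = 3.5$ are taken from the linear approximation given in the text, while the function names and default arguments are assumptions made for this example.

```python
from math import ceil, log2

def scale_up_factor(d: int, k: int) -> int:
    """Return v = 2^(k-1-j), where j is the index of the most significant set bit of d."""
    bits = [(d >> i) & 1 for i in range(k)]           # d_0 .. d_{k-1}
    a = [0] * k
    acc = 0
    for i in range(k - 1, -1, -1):                    # prefix-OR from the top bit down
        acc |= bits[i]
        a[i] = acc
    # b_i = a_i - a_{i+1} is 1 only at the most significant set bit
    b = [a[i] - a[i + 1] for i in range(k - 1)] + [a[k - 1]]
    return sum(b[k - 1 - i] << i for i in range(k))

def rec_nr(d: int, k: int, f: int, p: int, alpha: float = 2.9142, m: float = 3.5) -> int:
    """Plaintext integer trace of RecNR for d = int(d_<k,f>) > 0; returns int(z_<k,f>)."""
    v = scale_up_factor(d, k)                         # step 1
    r = d * v                                         # step 2: r in [2^(k-1), 2^k - 1]
    if k - p > 0:
        r >>= k - p                                   # step 3: keep the p top bits
    x = round(alpha * 2**p) - 2 * r                   # step 4: x0 = alpha - beta*r, beta = 2
    for _ in range(ceil(log2(p / m))):                # step 5
        y = (1 << (2 * p + 1)) - x * r                # (a) 2 - x*r, scaled by 2^(2p)
        y = x * y                                     # (b) x*(2 - x*r), scaled by 2^(3p)
        x = y >> (2 * p)                              # (c) single truncation back to p bits
    z = x * v                                         # step 6: range expansion
    return z >> (k + p - 2 * f)                       # step 7

k, f, p = 32, 16, 16
d = 3 << f                                            # fixed-point encoding of 3.0
print(rec_nr(d, k, f, p) / 2**f)                      # approximately 1/3
```

The loop body performs both multiplications of an iteration before truncating once, which mirrors the single-truncation variant used in step 5 of Protocol 4.1.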

Protocol 4.2. ScaleUpFactor($[d]$, $k$)
1. $([d_{k-1}], \ldots, [d_0]) \leftarrow$ BitDec($[d]$, $k$, $k$)
2. $([a_{k-1}], \ldots, [a_0]) \leftarrow$ PreOR($[d_{k-1}], \ldots, [d_0]$)
3. For $0 \le i \le k - 2$ do $[b_i] \leftarrow [a_i] - [a_{i+1}]$
4. $[b_{k-1}] \leftarrow [a_{k-1}]$
5. $[v] \leftarrow \sum_{i=0}^{k-1} [b_{k-1-i}] \cdot 2^i$
6. Return $[v]$

BitDec extracts $m$ bits from a $k$-bit shared integer in $\log(m) + 4$ rounds and $\frac{m}{2} \log(m) + 2k + 2\kappa$ invocations, with statistical privacy. PreOR computes $a_{k-j} = \bigvee_{i=1}^{j} d_{k-i}$, $1 \le j \le k$, in $\log(k)$ rounds and $\frac{k}{2} \log(k)$ invocations.

Iterations. Steps 4-5 of Protocol 4.1 compute the iterations: $x_0 = \alpha - \beta r$, $x_{i+1} = x_i (2 - x_i r)$. In the protocol specification we denote $\alpha = \lceil \alpha \cdot 2^p \rfloor$, where $\lceil \cdot \rfloor$ means rounding to the nearest integer. We assume $\epsilon_0 < 2^{-m}$, hence $p$ bits of precision are obtained after $\lceil \log(p/m) \rceil$ iterations. The precision is a public value, so the protocol may reveal the number of iterations. Protocol 4.1 is a variant that computes a single truncation for the two fixed-point multiplications in an iteration (as shown in the previous section), to reduce round complexity. The disadvantage of this solution is that it requires a larger modulus $q$. Observe that throughout the iterations $1.0 < x \le 2.0$ and $0 < y \le 4.0$. Therefore, using a single truncation requires about $3 + 3p$ bits to avoid overflow.

Range expansion. The Newton-Raphson iterations yield $x_{\langle p+2,p \rangle} \approx 1/r_{\langle p,p \rangle}$, where $x_{\langle p+2,p \rangle} \in (1, 2]$ and $r_{\langle p,p \rangle} = 2^{u-(k-f)} \cdot d_{\langle k,f \rangle}$. The range expansion in steps 6-7 of Protocol 4.1 scales $x_{\langle p+2,p \rangle}$ to obtain $z_{\langle k,f \rangle} \approx 1/d_{\langle k,f \rangle}$. Let $z_{\langle k \rangle} = z_{\langle k,f \rangle} \cdot 2^f$ and $x_{\langle p+2 \rangle} = x_{\langle p+2,p \rangle} \cdot 2^p$. Using integer arithmetic, the protocol computes $z_{\langle k \rangle} = x_{\langle p+2,p \rangle} \cdot 2^{u-(k-f)} \cdot 2^f = x_{\langle p+2 \rangle} \cdot 2^u \cdot 2^{-(k+p-2f)}$.

Complexity. The most relevant complexity metric for RecNR is the number of rounds. For $p = f = 56$ bits (4 iterations) and exact truncation using Div2m, the protocol needs 92 rounds, which can be reduced to 74 rounds by preprocessing the shared random bits. For approximate truncation using AppDiv2m the protocol needs 46 rounds.

Reciprocal of signed values. Signed inputs are handled by a simple extension of Protocol 4.1, requiring a secure integer comparison and two secure integer multiplications: determine and save the secret sign of the input, compute the reciprocal, and then set the correct sign for the result.

5. Evaluation and Conclusions

We implemented and tested the protocols using our Java implementation of secure computation based on secret sharing. We summarize the results of performance measurements for the main building blocks used in fixed-point multiplication and division. The family of protocols was also used and tested in a protocol for privacy preserving linear programming based on the Simplex algorithm.

We measured the protocol execution time for secure computation with 5 parties. The parties are processes that run on different PCs with full mesh interconnection. The secure computation proceeds in rounds consisting of local computation and data exchange. In every round, the parties execute a batch of operations and exchange the associated data in a single interaction.

The experiments were carried out in an isolated network, for two settings: LAN with 100 Mbps and WAN with 10 Mbps links and 15 ms end-to-end delay. LAN experiments give an upper bound for protocol performance, in networks with very low delay. WAN experiments show how the performance degrades when the delay increases. We used a heterogeneous group of PCs, and performance was determined by the slowest PC, equipped with a Pentium 4 HT processor, at 2.8 GHz.

| Batch size | q length (bits) | m (bits) | Div2m LAN (ms) | Div2m WAN (ms) | AppDiv2m LAN (ms) | AppDiv2m WAN (ms) |
|---|---|---|---|---|---|---|
| 1 | 128 | 24 | 135 | 334 | 21.5 | 50.4 |
| 1 | 256 | 56 | 246 | 565 | 21 | 48.9 |
| 10 | 128 | 24 | 85 | 169 | 2.5 | 5.9 |
| 10 | 256 | 56 | 180 | 361 | 2.6 | 7.0 |
| 100 | 128 | 24 | 71 | 148 | 0.6 | 1.9 |
| 100 | 256 | 56 | 154 | 330 | 0.7 | 2.7 |

Table 1. Performance of Div2m and AppDiv2m.

The implementation uses several optimizations for protocols that compute with bitwise shared values, but these protocols remain expensive for large inputs. Binary computations use bits shared in $GF(2^8)$ (8-bit shares) and the protocols for generating shared random bits and converting bit-shares work in small fields, hence with low complexity. Still, due to the binary computation, the exact truncation using Div2m is much slower, especially for large input values, than the approximate truncation using AppDiv2m.

Two modulus lengths were used in the experiments: 128 bits, the minimum length suitable for secure arithmetic with fixed-point numbers, and 256 bits, which provides better accuracy for our target application. The communication and computation complexities increase with the modulus length. However, the protocols that use binary computation, like Div2m, are affected more than the others, due to the large number of shared random bits generated and of binary operations. Precomputation of shared random bits can substantially improve the performance.

We measured the execution time for a single protocol instance and for batches of 10 and 100 parallel instances. The performance gain obtained by batch processing is substantial but varies a lot depending on the complexity of the protocol's rounds. The gain is more important for AppDiv2m than for Div2m, which has to generate or process 100-200 shared bits in some rounds. A fixed-point multiplication takes slightly longer than a truncation (one more round).

The measurements for RecNR show the execution time of the basic protocol for positive input. The variant using approximate truncation is 2.5-3 times faster and provides suitable accuracy. The accuracy can be adjusted to application requirements by selecting appropriately the bit-length of the fixed-point representation.

| Truncation | q length (bits) | f = p (bits) | LAN (ms) | WAN (ms) |
|---|---|---|---|---|
| AppDiv2m | 128 | 24 | 295 | 842 |
| AppDiv2m | 256 | 56 | 532 | 1413 |
| Div2m | 128 | 24 | 794 | 2196 |
| Div2m | 256 | 56 | 1773 | 4032 |

Table 2. Performance of RecNR.

Further developments will include improving the efficiency and/or accuracy of the truncation protocols, improvements of the reciprocal protocol (initial approximation), alternative division algorithms, and computation of other functions (e.g., square root).

References

[1] J. Algesheimer, J. Camenisch, and V. Shoup. Efficient computation modulo a shared secret with application to the generation of shared safe-prime products. In CRYPTO 2002, volume 2442 of LNCS, pages 417–432. Springer-Verlag, 2002.
[2] M. Atallah, M. Blanton, V. Deshpande, K. Frikken, J. Li, and L. Schwarz. Secure Collaborative Planning, Forecasting, and Replenishment (SCPFR). In Proc. of Multi-Echelon/Public Applications of Supply Chain Management Conference, Atlanta, USA, 2006.
[3] F. Brandt. Fundamental Aspects of Privacy and Deception in Electronic Auctions. PhD dissertation, Technical University Munich, 2003.
[4] R. Cramer, I. Damgård, and Y. Ishai. Share conversion, pseudorandom secret-sharing and applications to secure computation. In TCC 2005, volume 3378 of LNCS, pages 342–362. Springer-Verlag, 2005.
[5] I. Damgård, M. Fitzi, E. Kiltz, J. Nielsen, and T. Toft. Unconditionally secure constant-rounds multi-party computation for equality, comparison, bits and exponentiation. In TCC 2006, volume 3876 of LNCS, pages 285–304. Springer-Verlag, 2006.
[6] M. D. Ercegovac and T. Lang. Digital Arithmetic. Morgan Kaufmann, 2003.
[7] S. L. From and T. Jakobsen. Secure Multi-Party Computation on Integers. Master's thesis, University of Aarhus, Denmark, BRICS, Department of Computer Science, 2006.
[8] E. Kiltz, G. Leander, and J. Malone-Lee. Secure Computation of the Mean and Related Statistics. In TCC 2005, volume 3378 of LNCS. Springer-Verlag, 2005.
[9] M. Ito, N. Takagi, and S. Yajima. Efficient Initial Approximation for Multiplicative Division and Square Root by a Multiplication with Operand Modification. IEEE Transactions on Computers, 46(4), 1997.
[10] T. Nishide and K. Ohta. Multiparty Computation for Interval, Equality, and Comparison Without Bit-Decomposition Protocol. In PKC 2007, volume 4450 of LNCS, pages 343–360. Springer-Verlag, 2007.
[11] T. Toft. Primitives and Applications for Multi-party Computation. PhD dissertation, University of Aarhus, Denmark, Department of Computer Science, 2007.


Abstract Collaborative business applications can use secure multiparty computation to preserve input privacy. These applications need protocols that provide all the basic operations with integers and rational numbers and allow secure composition and efficient application development. Secure computation with rational numbers is a long-standing open problem. We present in this paper several components of a protocol family for secure computation with fixed-point numbers based on secret sharing.

1. Introduction Secure multiparty computation deals with cryptographic protocols that enable a group of parties to run joint applications and preserve input privacy. Their core components offer the combined ability to protect data privacy and compute with protected data, e.g., secret sharing or homomorphic encryption schemes. Among the applications that benefit from input privacy we find auctions, collaborative linear programming, supply chain management, benchmarking and forecasting [3, 11, 2]. These applications need a family of protocols for secure computation with integer and rational numbers. The protocols proposed so far support arithmetic in a field or ring and computation with binary and integer values. Protocols for computing the reciprocal and division were studied only in the context of particular applications and integer arithmetic [1, 8, 7]. Secure computation with rational numbers has remained, essentially, an open problem. In this paper, we present several protocols for multiparty computation with fixed-point numbers. Fixed-point arithmetic needs specific protocols only for multiplication and division, while the other operations can be efficiently achieved using integer computation. Overall, the family of ∗ Part of the work presented in this paper was funded by the European Commission through the grant FP7-213531 to the SecureSCM project.

protocols allows addition, subtraction, multiplication, division, and comparison of (signed) integers and fixed-point numbers. These protocols use a secure computation framework that relies on secret sharing and provides a collection of basic building blocks, partly adapted from solutions proposed in the literature (e.g., [5, 11, 10, 1, 4]). Division is achieved by multiplying the dividend with the reciprocal of the divisor, to improve efficiency when dividing many values by a common divisor. This solution is mainly motivated by one of our target applications, the Simplex algorithm for linear programming. Section 2 provides an introduction to data representation for secure computation with fixed-point numbers. Protocols for fixed-point multiplication and division are discussed in Sections 3 and 4, respectively. We conclude in Section 5 with an analysis of performance measurements for an implementation of these protocols.

2. Data Representation Secure computation has to encode application data as values in the input domains of the cryptographic protocols. We assume a secure computation framework that uses secret sharing over a field (Zq or F2m ) and offers as core functionality the following operations: input (share-distribution), output (secret reconstruction) and secure arithmetic with field elements. We summarize several encoding methods for binary and fixed-point values. Logical values x ∈ {f alse, true} are encoded as integer values, f alse = 0, true = 1. Binary values x ∈ {0, 1} can be encoded in Zq or F2m with identity mapping. For this encoding, boolean functions can be evaluated using arithmetic in Zq or F2m . It is often useful (albeit inefficient) to encode integers as arrays of secret-shared bits corresponding to their binary representation. Protocols for boolean functions and binary arithmetic can be constructed using the protocols for secure arithmetic in Zq or F2m . Fixed-point numbers are rational numbers represented as a sequence of digits split by a virtual radix point into an integer part and a fractional part with fixed length. We denote

3. Secure Multiplication

e the length of the integer part, f the length of the fractional part, and k = e + f . In computer arithmetic, fixed-point numbers are usually represented using binary digits and the two’s complement system [6]. We use a similar solution for secure computation, because it allows efficient arithmetic. Integers are fixed-point numbers with f = 0. An integer x is encoded on k bits using the mapping xhki = x mod 2k , which provides a unique representation on k bits of signed integers in Zhki = [−2k−1 , 2k−1 − 1]. The set of all rational numbers is Q = {x = uv | u, v ∈ Z, v 6= 0}. A k-bit fixed-point representation provides the value of the numerator u ∈ Zhki , while the denominator has a fixed value v = 2f , f < k. We can thus represent the subset of rational numbers Qhk,f i = {xhk,f i = u · 2−f | u ∈ Zhki , k > f }. We define the maps int : Qhk,f i → Z, int(x) = x · 2f , and fxp : Zhki → Qhk,f i , fxp(y) = y · 2−f . Arithmetic with fixed-point numbers in Qhk,f i can be defined using integer arithmetic. For u, v ∈ Z, we denote u/v = uv . We consider the following basic set of operations, for any a, b ∈ Qhk,f i : a+b a−b a·b a/b (a = b) (a < b)

= = = = = =

Multiplication of fixed-point values in Qhk,f i consists of an integer multiplication followed by division of the result by 2f , i.e., truncation of the f least significant bits. We need q > 2k+f to avoid overflow. can sometimes be reduced: PkThe number of truncations Pk −f a · b = fxp(( int(a ) and if i i ) · int(bi ))/2 i=1 i i=1 k+2f q>2 then a·b·c = fxp((int(a)·int(b)·int(c))/2−2f ). Also, if a ∈ Zhki , b ∈ Qhk,f i then a · b = fxp(a · int(b)). Protocol 3.1, Div2m, truncates m least significant bits of a ∈ Zhki by computing ⌊a/2m ⌋ = (a − (a mod 2m ))2−m . Div2m uses protocol 3.2, Mod2m, to compute a mod 2m . The input a and the outputs of the protocols are encoded in Zq and secret-shared, while k and m are public positive integers. The protocols provide statistical privacy and require q > 2k+κ+1 , where κ is a security parameter. Protocol 3.1. Div2m([a], k, m) 1. [b] ← Mod2m([a], k, m) 2. [a′ ] ← ([a] − [b])2−m

fxp(int(a) + int(b)) fxp(int(a) − int(b)) fxp((int(a) · int(b))/2f ) fxp((int(a) · 2f )/int(b)) (int(a) = int(b)) (int(a) < int(b)).

3. Return [a′ ] Protocol 3.2. Mod2m([a], k, m) 1. For 0 ≤ i < k + κ do [ri ] ← RandBitZ() Pk+κ−1 2. [r] ← i=0 2i · [ri ] [r′ ]B ← ([rm−1 ], . . . , [r0 ]) Pm−1 [r′ ] ← i=0 2i · [ri ]

Addition, subtraction, equality, and inequality in Qhk,f i are computed like in Zhki . It is sufficient to find an encoding for secure integer computation. Fixed-point numbers use the same encoding, and protocols for fixed-point arithmetic are constructed using those for integer arithmetic. Signed integers in A = [−A/2, A/2] can be encoded in the field Zq using the function ρ(a) = a mod q, where q > A > 0. If a ∈ [0, A/2) then ρ(a) = a and if a ∈ [−A/2, 0) then ρ(a) = q + a. For this encoding, ρ(φZ (a, b)) = φZq (ρ(a), ρ(b)), where φZ , φZq are addition, subtraction, multiplication, and equality in Z and Zq , respectively. Also, if a mod b = 0, then ρ(a/b) = ρ(a)/ρ(b). For these operations, secure arithmetic in A ⊂ Z can be computed using the protocols for secure arithmetic in Zq . An alternative solution is to encode signed integers in a field with modular arithmetic centered around zero [1]. We specify the protocols assuming that all private data is encoded and secret-shared in the same field Zq using Shamir sharing. We denote [x] a sharing of x ∈ Zq and [x]B a bitwise sharing. For a ∈ Z we also use a = ρ(a), for convenience. Linear combinations of secret values with public coefficients are locally computed. Other operations with secret values require interaction. We measure communication complexity as the number of secure multiplication and equivalent operations, called invocations. Round complexity is the number of sequential interactions.

3. [b] ← 2k−1 + [a] 4. c ← output([b] + [r]) c′ ← c mod 2m 5. [u] ← BitLT(c′ , [r′ ]B ) [a′ ] ← c′ − [r′ ] + [u] · 2m 6. Return [a′ ] The idea of Mod2m is to compute and open a + r, for a random secret r, then compute (a + r) mod 2m = (a mod 2m + r mod 2m ) mod 2m and subtract [r mod 2m ]. With our encoding, this works only if a ≥ 0. Mod2m replaces a with b = (2k−1 + a) mod q = 2k−1 + a. Observe that b ∈ [0, 2k − 1] and b mod 2m = a mod 2m for any m < k. In steps 1-2 the parties generate the random secrets r ∈ [0, 2κ+k − 1] and r′ = r mod 2m . In steps 3-4 they mask b using r and reveal c = b+r (no wraparound modulo q, since q > 2k+κ+1 ). Let a′ = b mod 2m and c′ = c mod 2m . Observe that a′ = c′ − r′ + u · 2m , where u ∈ {0, 1} and u = (c′ < r′ ). Mod2m computes [u] using the comparison protocol BitLT (bitwise less than). 2

For large $\kappa$ the distribution of $c$ is statistically close to uniform on $[0, 2^{k+\kappa} + 2^k - 2]$. Since no other value is opened, the protocol provides statistical privacy (assuming privacy of the sub-protocols). With current building blocks, Mod2m runs in $\log(m) + 4$ rounds with communication complexity $0.5\,m \log(m) + 2k + 2\kappa$ invocations.

An efficient approximate truncation protocol was proposed in [1]. Its absolute error is $\epsilon \le n + 1$ for $n$ parties. We adapted the protocol for signed integers encoded in $\mathbb{Z}_q$ and call this variant AppDiv2m. AppDiv2m runs in 3 rounds with communication complexity 4 invocations.

4. Secure Reciprocal

The algorithms for dividing fixed-point numbers follow two main approaches: digit recurrence algorithms (subtractive division) and functional iteration algorithms (multiplicative division) [6]. Digit recurrence algorithms calculate one quotient digit per iteration, so the number of iterations increases linearly with the length of the quotient. Functional iteration algorithms converge quadratically to the quotient, starting with an initial approximation and doubling the number of quotient bits at each iteration.

We compute a secure division in two steps: (1) determine the reciprocal of the divisor using the Newton-Raphson method and (2) multiply it with the dividend. This solution is motivated by our main target application, which requires efficient computation of multiple divisions with a common divisor. Other solutions are currently being investigated.

The Newton-Raphson method is generally used for evaluating the root of an equation $f(x) = 0$ based on the recurrence $x_{i+1} = x_i - f(x_i)/f'(x_i)$. In particular, for $f(x) = 1/x - d$ we can compute an approximation of the reciprocal $1/d$ using the recurrence $x_{i+1} = x_i (2 - x_i d)$. The relative error is $\epsilon_i = 1 - d x_i$ after iteration $i$, and becomes $\epsilon_{i+1} = \epsilon_i^2$ after the next iteration. Therefore, if $|\epsilon_0| < 1$ the algorithm converges and the relative error decreases quadratically. For an initial approximation $x_0$ with error $\epsilon_0 < 2^{-m}$, a reciprocal with relative error $\epsilon < 2^{-p}$ is obtained after $\lceil \log_2(p/m) \rceil$ iterations. Each iteration roughly doubles the accuracy, and iteration $i$ yields an approximation of the reciprocal with an accuracy of $2^i \cdot m$ bits.

A key issue is to determine an initial approximation that ensures quadratic convergence. Moreover, performance can be improved by starting with an accurate initial approximation, in order to reduce the number of iterations. The usual approach is to compute a normalized input $r \in [0.5, 1)$ (or $r \in [1, 2)$) and then find an approximation of $1/r$.

A linear approximation $1/r \approx \alpha - \beta r$ can easily be computed and offers quite good accuracy. For example, $x_0 = 2.9142 - 2r$ approximates $1/r$ for $r \in [0.5, 1)$ with error $\epsilon_0 < 0.08578$, i.e., an accuracy of 3.5 bits. This linear approximation can be computed without interaction for secret-shared $r$.

We describe a basic version of the protocol using this simple linear approximation. More accurate initial approximations can be obtained by table lookup [9]. Secure table lookup can be achieved quite efficiently and we intend to add piece-wise linear approximation.

The computation of the reciprocal for $d \in \mathbb{Q}_{\langle k,f\rangle}$, $d > 0$, consists of the following main steps:

1. Range reduction: Scale the input $d$ to the range $[1/2, 1)$. Let $r = 2^s \cdot d \in [1/2, 1)$ be the scaled value.
2. Iterations: Let $x_0 \in (1, 2]$ be an initial approximation of $1/r$ with $\epsilon_0 \le 2^{-m}$. Accuracy of $p$ bits is obtained after $c = \lceil \log_2(p/m) \rceil$ iterations. Compute $x_{i+1} = x_i (2 - x_i \cdot r)$, for $0 \le i < c$.
3. Range expansion: The iterations compute $x_c \approx 1/r = 1/(d \cdot 2^s)$. Scale $x_c$ to obtain the (approximate) reciprocal $1/d \approx x_c \cdot 2^s$.

We slightly change the notation to simplify the presentation. We denote by $v_{\langle k,f\rangle}$ a $k$-bit positive fixed-point value with $f$-bit resolution and by $v_{\langle k\rangle}$ a $k$-bit positive integer.

Protocol 4.1, RecNR, computes the reciprocal of a positive fixed-point value $d_{\langle k,f\rangle}$, where $k = 2f$ in order to avoid overflow. The range reduction scales $d_{\langle k,f\rangle}$ and keeps the $p \in [f, k]$ most significant bits, obtaining the normalized input $r_{\langle p,p\rangle} \in [1/2, 1)$. The iterations yield $x_{\langle p+2,p\rangle} \approx 1/r_{\langle p,p\rangle}$, $x_{\langle p+2,p\rangle} \in (1, 2]$. The range expansion scales $x_{\langle p+2,p\rangle}$ to obtain $z_{\langle k,f\rangle} \approx 1/d_{\langle k,f\rangle}$. The parameter $p$ allows different trade-offs between accuracy and efficiency, e.g., $p = f$ offers an important efficiency gain for a relative error $< 2^{-f}$.

Protocol 4.1. RecNR([d], k, f, p)
1. $[v] \leftarrow \mathrm{ScaleUpFactor}([d], k)$
2. $[r] \leftarrow [d] \cdot [v]$
3. If $(k - p > 0)$ then $[r] \leftarrow \mathrm{Div2m}([r], k, k - p)$
4. $[x] \leftarrow \alpha - \beta \cdot [r]$
5. For $1 \le i \le \lceil \log(p/m) \rceil$ do
   (a) $[y] \leftarrow 2^{2p+1} - [x] \cdot [r]$
   (b) $[y] \leftarrow [x] \cdot [y]$
   (c) $[x] \leftarrow \mathrm{Div2m}([y], 3 + 3p, 2p)$
6. $[z] \leftarrow [x] \cdot [v]$
7. $[z] \leftarrow \mathrm{Div2m}([z], k + p, k + p - 2f)$
8. Return $[z]$

Protocol RecNR carries out the algorithm outlined above using integer arithmetic. The input and the output are the integers $d_{\langle k\rangle} = \mathrm{int}(d_{\langle k,f\rangle})$ and $z_{\langle k\rangle} = \mathrm{int}(z_{\langle k,f\rangle})$, encoded in $\mathbb{Z}_q$ and secret-shared.
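The real-valued recurrence that RecNR emulates can be checked numerically before looking at the integer details. The sketch below uses ordinary floating-point numbers, not fixed-point or secret-shared values; the function names are hypothetical, and only the coefficients 2.9142 and 2 and the iteration count formula are taken from the text.

```python
import math

ALPHA, BETA = 2.9142, 2.0  # linear initial approximation quoted in the text

def iterations_needed(p, m):
    """Number of iterations for p bits, starting from m bits of accuracy."""
    return math.ceil(math.log2(p / m))

def newton_reciprocal(r, iterations):
    """Run x_{i+1} = x_i (2 - x_i r); return x and the errors 1 - r x_i."""
    x = ALPHA - BETA * r
    errors = [1.0 - r * x]
    for _ in range(iterations):
        x = x * (2.0 - x * r)
        errors.append(1.0 - r * x)
    return x, errors

def worst_initial_error(samples=10_000):
    """Largest |1 - r x_0| on a uniform grid over [0.5, 1)."""
    grid = (0.5 + 0.5 * i / samples for i in range(samples))
    return max(abs(1.0 - r * (ALPHA - BETA * r)) for r in grid)

if __name__ == "__main__":
    x, errors = newton_reciprocal(0.75, iterations_needed(56, 3.5))
    print(x, errors)
    print(worst_initial_error())
```

Each entry of `errors` is, up to floating-point rounding, the square of the previous one, which is the quadratic convergence stated above. Starting from about 3.5 bits, four iterations are needed for 56 bits.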

Range reduction (normalization). The range reduction step computes the normalized input $r_{\langle p,p\rangle} = 2^s \cdot d_{\langle k,f\rangle}$ so that $r_{\langle p,p\rangle} \in [1/2, 1)$. If $d_{\langle k,f\rangle} < 1/2$ then $s > 0$ (multiplication by $2^s$) and if $d_{\langle k,f\rangle} \ge 1$ then $s < 0$ (division by $2^{-s}$); the values $d$, $s$, and $r$ must remain secret. Protocol 4.1 starts with $d_{\langle k\rangle} = \mathrm{int}(d_{\langle k,f\rangle}) \in [1, 2^k - 1]$ and computes $r_{\langle p\rangle} = 2^s \cdot d_{\langle k\rangle} \in [2^{p-1}, 2^p - 1]$. This implies $r_{\langle p,p\rangle} = 2^{-p} \cdot r_{\langle p\rangle} \in [1/2, 1)$, as required.

The protocol has to deal with two issues: (1) determine the secret factor $2^s$ and (2) compute the normalized input using only multiplication by a secret or public value and division by a public value. The input is first scaled up by a secret factor $2^u$, obtaining $r_{\langle k\rangle} = 2^u \cdot d_{\langle k\rangle} \in [2^{k-1}, 2^k - 1]$. If $p = k$ the normalized input is $r_{\langle k\rangle}$. If $p < k$, the result is divided by the public value $2^{k-p}$ to obtain $r_{\langle p\rangle} = 2^{-(k-p)} \cdot r_{\langle k\rangle} \in [2^{p-1}, 2^p - 1]$.

The secret factor $2^u$ is computed using Protocol 4.2. The protocol starts with the bit decomposition of $[d]$, obtaining $[d]_B = ([d_{k-1}], \ldots, [d_0])$, then determines the most significant non-zero bit of $[d]_B$ using prefix-OR. Suppose that this bit is $d_j$, where $j \in [0, k - 1]$. Then $[2^u] = [2^{k-1-j}]$. The secret integer value $[v] = [2^u]$ is saved for the range expansion in step 6 of Protocol 4.1.
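The computation of the scale-up factor described above (bit decomposition, prefix-OR, then differences of adjacent prefix-OR bits) can be traced in the clear. This is a plaintext sketch with hypothetical function names `scale_up_factor` and `normalize`; in the secure protocol every bit is secret-shared and BitDec and PreOR are interactive sub-protocols.

```python
def scale_up_factor(d, k):
    """Return v = 2^(k-1-j), where d_j is the most significant 1 bit of d."""
    if not 1 <= d < 2**k:
        raise ValueError("d must be a positive k-bit integer")
    bits = [(d >> i) & 1 for i in range(k)]   # d_0 ... d_{k-1}
    # Prefix-OR from the most significant side: a_i = d_{k-1} OR ... OR d_i.
    a = [0] * k
    acc = 0
    for i in range(k - 1, -1, -1):
        acc |= bits[i]
        a[i] = acc
    # b_i = a_i - a_{i+1} for i <= k-2, b_{k-1} = a_{k-1}; only b_j equals 1.
    b = [a[i] - a[i + 1] for i in range(k - 1)] + [a[k - 1]]
    # v = sum of b_{k-1-i} * 2^i, i.e. the bits of b in reversed order.
    return sum(b[k - 1 - i] << i for i in range(k))

def normalize(d, k, p):
    """Range reduction: r = (d * v) / 2^(k-p), so r lies in [2^(p-1), 2^p - 1]."""
    v = scale_up_factor(d, k)
    return (d * v) >> (k - p), v

if __name__ == "__main__":
    print(scale_up_factor(5, 8), normalize(5, 8, 4))
```

For example, with `k = 8` and `d = 5` the most significant bit is at position 2, so the factor is $2^{8-1-2} = 32$ and $d \cdot v = 160 \in [128, 255]$.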

Range expansion. The Newton-Raphson iterations yield $x_{\langle p+2,p\rangle} \approx 1/r_{\langle p,p\rangle}$, where $x_{\langle p+2,p\rangle} \in (1, 2]$ and $r_{\langle p,p\rangle} = 2^{u-(k-f)} \cdot d_{\langle k,f\rangle}$. The range expansion in steps 6-7 of Protocol 4.1 scales $x_{\langle p+2,p\rangle}$ to obtain $z_{\langle k,f\rangle} \approx 1/d_{\langle k,f\rangle}$. Let $z_{\langle k\rangle} = z_{\langle k,f\rangle} \cdot 2^f$ and $x_{\langle p+2\rangle} = x_{\langle p+2,p\rangle} \cdot 2^p$. Using integer arithmetic, the protocol computes $z_{\langle k\rangle} = x_{\langle p+2,p\rangle} \cdot 2^{u-(k-f)} \cdot 2^f = x_{\langle p+2\rangle} \cdot 2^u \cdot 2^{-(k+p-2f)}$.

Complexity. The most relevant complexity metric for RecNR is the number of rounds. For $p = f = 56$ bits (4 iterations) and exact truncation using Div2m, the protocol needs 92 rounds, which can be reduced to 74 rounds by preprocessing the shared random bits. For approximate truncation using AppDiv2m the protocol needs 46 rounds.

Reciprocal of signed values. Signed inputs are handled by a simple extension of Protocol 4.1, requiring a secure integer comparison and two secure integer multiplications: determine and save the secret sign of the input, compute the reciprocal, and then set the correct sign for the result.
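Taken as a whole, Protocol 4.1 can be traced in the clear for a positive input. The sketch below is a plaintext simulation under stated assumptions, not the secure protocol: values are plain Python integers, exact truncation (Div2m) is modelled by a right shift, the scale-up factor is obtained with `int.bit_length` instead of ScaleUpFactor, and the name `rec_nr_clear` and the constant `M_BITS = 3.5` (accuracy of the linear initial approximation) are chosen for illustration.

```python
import math

ALPHA, BETA, M_BITS = 2.9142, 2, 3.5  # initial approximation from the text

def rec_nr_clear(d_int, k, f, p):
    """Plaintext trace of RecNR: d_int = int(d) with f fractional bits, k = 2f.

    Returns z_int, an approximation of (1/d) * 2^f.
    """
    # Steps 1-3: range reduction to r in [2^(p-1), 2^p - 1].
    v = 1 << (k - d_int.bit_length())       # 2^(k-1-j), shortcut for ScaleUpFactor
    r = (d_int * v) >> (k - p)              # division by the public value 2^(k-p)
    # Step 4: x0 = alpha - beta * r, with alpha scaled to p fractional bits.
    x = round(ALPHA * 2**p) - BETA * r
    # Step 5: Newton-Raphson, one truncation per iteration.
    for _ in range(math.ceil(math.log2(p / M_BITS))):
        y = (1 << (2 * p + 1)) - x * r      # 2 - x*r with 2p fractional bits
        y = x * y                           # x*(2 - x*r) with 3p fractional bits
        x = y >> (2 * p)                    # back to p fractional bits
    # Steps 6-7: range expansion.
    z = x * v
    return z >> (k + p - 2 * f)

if __name__ == "__main__":
    f = 24
    z = rec_nr_clear(3 * 2**f, k=2 * f, f=f, p=f)
    print(z / 2**f)  # approximately 1/3
```

With `p = f` the simulation keeps only the `f` most significant bits of the normalized input, which matches the efficiency trade-off described for Protocol 4.1.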

Protocol 4.2. ScaleUpFactor([d], k)
1. $([d_{k-1}], \ldots, [d_0]) \leftarrow \mathrm{BitDec}([d], k, k)$
2. $([a_{k-1}], \ldots, [a_0]) \leftarrow \mathrm{PreOR}([d_{k-1}], \ldots, [d_0])$
3. For $0 \le i \le k - 2$ do $[b_i] \leftarrow [a_i] - [a_{i+1}]$
4. $[b_{k-1}] \leftarrow [a_{k-1}]$
5. $[v] \leftarrow \sum_{i=0}^{k-1} [b_{k-1-i}] \cdot 2^i$
6. Return $[v]$

BitDec extracts $m$ bits from a $k$-bit shared integer in $\log(m) + 4$ rounds and $\frac{m}{2} \log(m) + 2k + 2\kappa$ invocations, with statistical privacy. PreOR computes $a_{k-j} = \bigvee_{i=1}^{j} d_{k-i}$, $1 \le j \le k$, in $\log(k)$ rounds and $\frac{k}{2} \log(k)$ invocations.

Iterations. Steps 4-5 of Protocol 4.1 compute the iterations: $x_0 = \alpha - \beta r$, $x_{i+1} = x_i (2 - x_i r)$. In the protocol specification, the coefficient $\alpha$ stands for the integer $\lceil \alpha \cdot 2^p \rfloor$, where $\lceil \cdot \rfloor$ means rounding to the nearest integer. We assume $\epsilon_0 < 2^{-m}$, hence $p$ bits of precision are obtained after $\lceil \log(p/m) \rceil$ iterations. The precision is a public value, so the protocol may reveal the number of iterations. Protocol 4.1 is a variant that computes a single truncation for the two fixed-point multiplications in an iteration (as shown in the previous section), to reduce round complexity. The disadvantage of this solution is that it requires a larger modulus $q$. Observe that throughout the iterations $1.0 < x \le 2.0$ and $0 < y \le 4.0$. Therefore, using a single truncation requires about $3 + 3p$ bits to avoid overflow.

5. Evaluation and Conclusions

We implemented and tested the protocols using our Java implementation of secure computation based on secret sharing. We summarize the results of performance measurements for the main building blocks used in fixed-point multiplication and division. The family of protocols was also used and tested in a protocol for privacy preserving linear programming based on the Simplex algorithm.

We measured the protocol execution time for secure computation with 5 parties. The parties are processes that run on different PCs with full mesh interconnection. The secure computation proceeds in rounds consisting of local computation and data exchange. In every round, the parties execute a batch of operations and exchange the associated data in a single interaction.

Batch size   q length (bits)   m (bits)   Div2m LAN (ms)   Div2m WAN (ms)   AppDiv2m LAN (ms)   AppDiv2m WAN (ms)
1            128               24         135              334              21.5                50.4
1            256               56         246              565              21                  48.9
10           128               24         85               169              2.5                 5.9
10           256               56         180              361              2.6                 7.0
100          128               24         71               148              0.6                 1.9
100          256               56         154              330              0.7                 2.7

Table 1. Performance of Div2m and AppDiv2m.

The experiments were carried out in an isolated network, for two settings: LAN with 100 Mbps and WAN with 10 Mbps links and 15 ms end-to-end delay. LAN experiments give an upper bound for protocol performance, in networks with very low delay. WAN experiments show how the performance degrades when the delay increases. We used a heterogeneous group of PCs, and performance was determined by the slowest PC, equipped with a Pentium 4 HT processor at 2.8 GHz.

The implementation uses several optimizations for protocols that compute with bitwise shared values, but these protocols remain expensive for large inputs. Binary computations use bits shared in $GF(2^8)$ (8-bit shares) and the protocols for generating shared random bits and converting bit-shares work in small fields, hence with low complexity. Still, due to the binary computation, the exact truncation using Div2m is much slower, especially for large input values, than the approximate truncation using AppDiv2m.

Two modulus lengths were used in the experiments: 128 bits, the minimum length suitable for secure arithmetic with fixed-point numbers, and 256 bits, which provides better accuracy for our target application. The communication and computation complexities increase with the modulus length. However, the protocols that use binary computation, like Div2m, are affected more than the others, due to the large number of shared random bits generated and of binary operations. Precomputation of shared random bits can substantially improve the performance.

We measured the execution time for a single protocol instance and for batches of 10 and 100 parallel instances. The performance gain obtained by batch processing is substantial but varies a lot depending on the complexity of the protocol's rounds. The gain is more important for AppDiv2m than for Div2m, which has to generate or process 100-200 shared bits in some rounds. A fixed-point multiplication takes slightly longer than a truncation (one more round).

The measurements for RecNR show the execution time of the basic protocol for positive input. The variant using approximate truncation is 2.5-3 times faster and provides suitable accuracy. The accuracy can be adjusted to application requirements by selecting appropriately the bit-length of the fixed-point representation.

Truncation   q length (bits)   f = p (bits)   LAN (ms)   WAN (ms)
AppDiv2m     128               24             295        842
AppDiv2m     256               56             532        1413
Div2m        128               24             794        2196
Div2m        256               56             1773       4032

Table 2. Performance of RecNR.

Further developments will include improving the efficiency and/or accuracy of the truncation protocols, improvements of the reciprocal protocol (initial approximation), alternative division algorithms, and computation of other functions (e.g., square root).

References

[1] J. Algesheimer, J. Camenisch, and V. Shoup. Efficient computation modulo a shared secret with application to the generation of shared safe-prime products. In CRYPTO 2002, volume 2442 of LNCS, pages 417–432. Springer-Verlag, 2002.
[2] M. Atallah, M. Blanton, V. Deshpande, K. Frikken, J. Li, and L. Schwarz. Secure Collaborative Planning, Forecasting, and Replenishment (SCPFR). In Proc. of Multi-Echelon/Public Applications of Supply Chain Management Conference, Atlanta, USA, 2006.
[3] F. Brandt. Fundamental Aspects of Privacy and Deception in Electronic Auctions. PhD dissertation, Technical University Munich, 2003.
[4] R. Cramer, I. Damgård, and Y. Ishai. Share conversion, pseudorandom secret-sharing and applications to secure computation. In TCC 2005, volume 3378 of LNCS, pages 342–362. Springer-Verlag, 2005.
[5] I. Damgård, M. Fitzi, E. Kiltz, J. Nielsen, and T. Toft. Unconditionally secure constant-rounds multi-party computation for equality, comparison, bits and exponentiation. In TCC 2006, volume 3876 of LNCS, pages 285–304. Springer-Verlag, 2006.
[6] M. D. Ercegovac and T. Lang. Digital Arithmetic. Morgan Kaufmann, 2003.
[7] S. L. From and T. Jakobsen. Secure Multi-Party Computation on Integers. Master's thesis, University of Aarhus, Denmark, BRICS, Department of Computer Science, 2006.
[8] E. Kiltz, G. Leander, and J. Malone-Lee. Secure Computation of the Mean and Related Statistics. In TCC 2005, volume 3378 of LNCS. Springer-Verlag, 2005.
[9] M. Ito, N. Takagi, and S. Yajima. Efficient Initial Approximation for Multiplicative Division and Square Root by a Multiplication with Operand Modification. IEEE Transactions on Computers, 46(4), 1997.
[10] T. Nishide and K. Ohta. Multiparty Computation for Interval, Equality, and Comparison Without Bit-Decomposition Protocol. In PKC 2007, volume 4450 of LNCS, pages 343–360. Springer-Verlag, 2007.
[11] T. Toft. Primitives and Applications for Multi-party Computation. PhD dissertation, University of Aarhus, Denmark, Department of Computer Science, 2007.