LTSC-128: Stream Cipher Based on the Intractable Shortest Vector Problem in Lattice

Khaled Suwais, Azman Samsudin
School of Computer Sciences, Universiti Sains Malaysia, Penang 11800, Malaysia

Journal of Digital Information Management

Abstract: LTSC-128 is a highly secure stream cipher based on the hardness of the Shortest Vector Problem in lattice space. The cipher is based on vector multiplication over a finite field, where the vectors are represented by polynomials to enhance the performance of keystream generation. The key size is 128 bits, and no attack faster than exhaustive key search has been identified. Experimental results show that the encryption rate of LTSC-128 is about 50 Mbit/second. LTSC-128 has the additional feature that its performance can be increased by applying multithreading techniques on multi-core processors. The multithreaded version of LTSC-128 achieved a better encryption rate of 71 Mbit/second on a dual-core processor.

Categories and Subject Descriptors
E.3 [Data Encryption] Data encryption standard; D.4.6 [Security and Protection] Cryptographic controls

General Terms: Data encryption; Cryptographic analysis, Vector problem, Lattice

Keywords: Stream cipher, Shortest vector problem, Lattice, NP-hard problem, Multithreading, Encryption

Received: 11 November 2010; Revised 17 December 2010; Accepted 20 December 2010

1. Introduction

A stream cipher algorithm is responsible for generating a cryptographically strong keystream to perform encryption operations on a given plaintext. Generally, a stream cipher is used in environments where encryption needs to be done on a continuous stream of bits (rather than by block as in a block cipher). Stream ciphers can be categorized into two categories: hardware-based and software-based. The hardware-based stream ciphers are commonly based on Linear Feedback Shift Registers (LFSRs) (e.g. SNOW (Ekdahl and Johansson 2000), LILI-128 (Dawson et al. 2000), etc.). On the other hand, software-based stream ciphers (e.g. RC4 (Rivest 1992), SEAL (Rogaway and Coppersmith 1998), etc.) rely on Boolean functions and bit manipulation in their computations.

A cryptographically secure stream cipher should satisfy two essential requirements. These requirements are summarized by two questions: how secure is the generated keystream against cryptanalysis attacks, and how strong is the generated keystream statistically? In addition, one should also take the performance factor into consideration so that the performance and the level of security provided by the algorithm are balanced.

Both categories of stream ciphers are proven to provide high performance according to the results presented in (Rogaway and Coppersmith 1998, Boesgaard et al. 2003). However, research shows that those ciphers either do not reach the required level of security or that the generated keystream is statistically weak (Scott et al. 2001, Christophe 2001). Therefore the current design of stream ciphers, which rests on using simple logical and mathematical operations or LFSRs, shows the imperfection of the stream cipher.

The core of any stream cipher is the keystream generation procedure, where the input key is treated according to the specific, pre-defined formula of that cipher. In this paper we present a new stream cipher based on the hardness of the Shortest Vector Problem (SVP) in a lattice system. SVP is an NP-hard problem (Khot 2005) and it is the core of the proposed cipher, to ensure higher security compared to the existing stream ciphers discussed earlier.

The SVP, along with three other intractable problems, is considered one of the four most important pivots of current public-key cryptosystems in the literature. The other three problems are the Integer Factorization Problem (IFP), the Discrete Logarithm Problem (DLP) and the Elliptic Curve Discrete Logarithm Problem (ECDLP). The first use of SVP was formally introduced in 1998 (Hoffstein et al. 1998), in a public-key cryptosystem called the NTRU cryptosystem. The general idea of SVP rests on finding the shortest non-zero vector $\vec{b}$ in the lattice $\delta$, where $B$ is the basis of $\delta$. The basis $B$ can be seen as an $n \times m$ matrix, where the vectors $\vec{b}_i$ form the columns of $B$:

$$B = \begin{bmatrix} | & & | \\ \vec{b}_1 & \cdots & \vec{b}_m \\ | & & | \end{bmatrix} \tag{1}$$

The workflow of LTSC-128 is divided into three phases: Initialization Phase (IP), Keystream Generation Phase (KGP) and Encryption Phase (EP) (see Section 3). The SVP is applied in the second phase of LTSC-128 to generate the keystream. In LTSC-128, the mathematical operations and the vector representation of SVP in lattice space were inspired by the NTRU public-key cryptosystem, where the vectors are represented by polynomials for efficiency purposes, as discussed later in this document.

Employing an NP-hard problem such as SVP in a stream cipher strengthens the security of the generated keystream against cryptanalysis attacks, since there is no known method that can solve NP-hard problems in polynomial time. LTSC-128 provides higher security than several other well-known ciphers intended to be used in hardware and software implementations.

From a technical point of view, the use of NP-hard problems in cryptographic primitives decreases the performance of the corresponding primitives due to their costly operations. Thus, the use of the new multi-core technology becomes necessary in order to enhance the performance of cryptographic primitives based on NP-hard problems.

Journal of Digital Information Management Volume 9 Number 1 February 2011


Technically, gaining higher performance on a multi-core platform can be achieved by using multithreading techniques, where multiple threads are created to accomplish the internal calculations of the corresponding cryptographic primitives. From another perspective, multithreading techniques can be applied to the structural design of the algorithm (e.g. stream cipher) such that multiple keystreams are generated and used to encrypt a stream of plaintext, as described by the model proposed in (Suwais and Samsudin 2008).

The paper is organized as follows: Section 2 provides the background and notations used. Section 3 describes the structure of the proposed algorithm. Implementation and performance evaluation are discussed in Section 4. Security evaluation and statistical analysis are presented in Sections 5 and 6 respectively. Finally, concluding remarks are presented in Section 7.

2. Definitions and Preliminaries

A lattice $\delta$ is the set of all integer linear combinations of linearly independent vectors $\vec{b}_i$, defined as:

$$\delta = \left\{ \sum_{i=1}^{n} r_i \vec{b}_i ,\ r_i \in \mathbb{Z} \right\} \tag{2}$$

The matrix $B$ is known as the basis of lattice $\delta$ and it consists of all vectors $\vec{b}_i$ (see Eq. (1)). The dimension of $\delta$ is the number of vectors in $B$. The vectors in the lattice can be represented as truncated polynomials $\vec{\rho}$ of degree $n-1$, as shown in Eq. (3):

$$\vec{\rho}(x) = a_{n-1} x^{n-1} + \cdots + a_2 x^2 + a_1 x + a_0 \tag{3}$$

where $a_0, a_1, \cdots, a_{n-1}$ are binary coefficients in $\{0, 1\}$ and all the operations are defined over the ring $R = \mathbb{Z}[X]/(X^n - 1)$. The sum of two polynomials is computed by adding the coefficients of variables with the same powers. One important operation that is needed during keystream generation is the convolution multiplication of polynomials in $B$. The convolution multiplication of polynomials in $B$ is denoted by $\otimes$ and its $k$-th coefficient is defined as follows:

$$(\vec{L} \otimes \vec{M})_k = \sum_{i=0}^{k} L_i M_{k-i} + \sum_{i=k+1}^{n-1} L_i M_{n+k-i} = \sum_{i+j \equiv k \ (\mathrm{mod}\ n)} L_i M_j \tag{4}$$

where $i$ and $j$ denote the positions of coefficients in polynomials $\vec{L}$ and $\vec{M}$ respectively, and $k$ denotes the position of the resulting coefficient in the product. The convolution product of two polynomials is treated as coefficient multiplication, where a polynomial is represented by a binary string of coefficients $[a_{n-1}, \cdots, a_1, a_0]$. For example, let $\vec{M}$ and $\vec{L}$ be two truncated polynomials with 8 binary coefficients (degree 7) as follows:

$$\vec{M}(x) = x^7 + x^6 + x^2 + 1 \tag{5}$$

$$\vec{L}(x) = x^7 + x^5 + x^4 + 1 \tag{6}$$

Then $\vec{M}$ and $\vec{L}$ can be represented as two vectors as follows:

$$\vec{M} = [1, 1, 0, 0, 0, 1, 0, 1] \tag{7}$$

$$\vec{L} = [1, 0, 1, 1, 0, 0, 0, 1] \tag{8}$$

To illustrate the idea of the convolution product of two polynomials, let $[k_1, k_2, k_3]$ and $[u_1, u_2, u_3]$ be two polynomials; then the convolution multiplication is performed as follows:

$$\begin{aligned}
[k_1, k_2, k_3] \otimes [u_1, u_2, u_3] &= k_1 \cdot [u_1, u_2, u_3] + k_2 \cdot [u_3, u_1, u_2] + k_3 \cdot [u_2, u_3, u_1] \\
&= [k_1 u_1, k_1 u_2, k_1 u_3] + [k_2 u_3, k_2 u_1, k_2 u_2] + [k_3 u_2, k_3 u_3, k_3 u_1] \\
&= [k_1 u_1 + k_2 u_3 + k_3 u_2,\ k_1 u_2 + k_2 u_1 + k_3 u_3,\ k_1 u_3 + k_2 u_2 + k_3 u_1]
\end{aligned}$$

The polynomial resulting from the convolution multiplication or any other operation is reduced modulo $q$, where $q$ is a pre-defined value. The other important operation carried out in our implementation is the inverse of a given polynomial. The inverse of polynomial $\vec{M}$ modulo $q$ is denoted by $\vec{M}_q^{-1}$ such that:

$$\vec{M}_q^{-1} \otimes \vec{M} \equiv 1 \pmod{q} \tag{9}$$

For a given polynomial $\vec{M}$, we write $\vartheta$ to denote the number of one coefficients in the polynomial. In practice, the convolution product of two polynomials with binary coefficients requires $\vartheta n$ operations. Thus, for further speed optimization we apply the concept of Small Hamming Weight Polynomial multiplication (Hoffstein and Silverman 2000) by dividing the polynomials used in our system into smaller polynomials with fewer one coefficients. Therefore, the computation of the multiplication is defined as:

$$\vec{L}(X) \times \vec{M}(X) = \vec{L}_1(X) \times (\vec{L}_2(X) \times \vec{M}(X)) \tag{10}$$

where $\vec{L}_1(X)$ and $\vec{L}_2(X)$ are obtained by splitting $\vec{L}(X)$, with fewer ones $\vartheta_1$ and $\vartheta_2$ for $\vec{L}_1$ and $\vec{L}_2$ respectively. Therefore the multiplication of the smaller polynomials requires $(\vartheta_1 + \vartheta_2)n$ operations, where $n$ is the degree of the polynomial. Such polynomial multiplication can be implemented in computer programs as a set of shift and XOR operations, which improves the performance of the polynomial multiplication process.

3. Proposed Stream Cipher LTSC-128

The use of SVP in a high-dimensional lattice as the hard problem of LTSC-128 is promising since there is no algorithm that is able to solve the SVP problem in polynomial time. Moreover, this problem has been implemented in public-key cryptography (e.g. the NTRU cryptosystem) and has shown good results in terms of performance and security. LTSC-128 implements the lattice and its vectors in polynomial representation for higher efficiency, as discussed in Section 2. The Low Hamming Weight polynomial multiplication concept is applied by dividing the polynomial of degree $n$ into smaller polynomials with fewer ones ($\vartheta = n/3$).

LTSC-128 works as follows: an input key of 128 bits generates a keystream of 128-bit length per round. The keystream is divided into 4 words (sub-keystreams). These sub-keystream bits are XOR'ed with the plaintext bits (one word of plaintext at a time) to generate the ciphertext bits. LTSC-128 is designed in three phases: IP, KGP and EP, as described in the following sections.

3.1 Initialization Phase (IP)

At this phase, we aim to extract two polynomials, $\vec{M}$ and $\vec{L}$, in order to use them in the keystream generation phase. The two polynomials are extracted from the input key $K$ of 128-bit length. The key $K$ is transformed to its binary representation, and from those binary bits we obtain the two polynomials of degree $n = 347$ (the value of $n$ is chosen to achieve higher security as proved in Section 5, such that $(n - 1)/2$ is prime). Let $\vartheta_M$ and $\vartheta_L$ be the number of one coefficients of polynomials $\vec{M}$ and $\vec{L}$ respectively, such that $\vartheta_M = \vartheta_L = 116 \approx n/3$.


The extraction of polynomial $\vec{L}$ (setting the coefficients of $\vec{L}$) is done by selecting the 128 bits of the input key from the Least Significant Bit (LSB) position to the Most Significant Bit (MSB) position, such that the first selected bit from $K$ is the first coefficient of $\vec{L}$, until we obtain $\vartheta_L = 116$. If $\vartheta_L$ is less than 116, which is the common case, we set the last coefficients of $\vec{L}$ by selecting $r$ bits of $K$ from the MSB position to the LSB position until we satisfy the condition $\vartheta_L = 116$. In rare situations, the majority of the input key bits are zeros with few ones, which means the input key is not capable of providing polynomial $\vec{L}$ with a sufficient number of ones. In that case, we reset (0 to 1) the bits of polynomial $\vec{L}$ starting from the location $n/2$ and moving toward the LSB of $\vec{L}$. This procedure continues until we hit the threshold $\vartheta_L = 116$. The purpose of this procedure is to achieve a preferable distribution of one coefficients in the newly generated polynomial.

On the other hand, the coefficients of the second polynomial $\vec{M}$ are also extracted from the input key by using a similar mechanism. We set the first coefficients by selecting the 128 bits of the input key from the MSB position to the LSB position, such that the first selected bit from $K$ is the first coefficient of $\vec{M}$, until we obtain $\vartheta_M = 116$. If $\vartheta_M$ is less than 116, we set the last coefficients of $\vec{M}$ by selecting $r$ bits of $K$ from the LSB position to the MSB position until we satisfy the condition $\vartheta_M = 116$. Similar to the rare case discussed in generating polynomial $\vec{L}$, we need to reset some bits in $\vec{M}$ to hit the threshold $\vartheta_M = 116$. However, the procedure of resetting the zero bits in $\vec{M}$ starts from the location $n/2$ and moves toward the MSB of $\vec{M}$. Figure 1 illustrates the extraction of the initialization values for polynomials $\vec{L}$ and $\vec{M}$ from the input key.

Figure 1. Initializing polynomials $\vec{L}$ and $\vec{M}$ by extracting the coefficients from $K$.

Finally, another two parameters ($\vec{P}$, $q$) are needed during the KGP phase to generate the keystreams. As proved in (Hoffstein and Silverman 2000), polynomial $\vec{P} = x + 2$ and integer $q = 128$ are two ideal units in $\mathbb{Z}[X]$. The generated polynomial $\vec{M}$ will be in the form $\vec{M} = 1 + (2 + x) \times \vec{M}$.

3.2 Keystream Generation Phase (KGP)

The polynomials and variables generated and initialized in the IP phase form the main parameters for the KGP phase. The formula for a single round of keystream $K_s$ generation is given as follows:

$$K_s = \vec{P} \otimes \vec{M}_q^{-1} \otimes \vec{L} \tag{11}$$

where $\vec{P}$, $\vec{M}$ and $\vec{L}$ are extracted and initialized in the previous phase, and $\vec{M}_q^{-1}$ is the inverse polynomial ($\vec{M}^{-1} \bmod q$) (see Eq. (9)).

In order to generate a new keystream, KGP uses a counter $C_k$, which is first initialized from the input key $K$ by Eq. (12). When KGP is required to generate a new keystream for encryption, counter $C_k$ is incremented and added (addition modulo $q$) to $\vec{L}$ and $\vec{M}$. At this point, another auxiliary counter ($C_d$) is required in generating the new keystream. The counter $C_d$ is first initialized such that $C_d = C_k$. The two counters $C_k$ and $C_d$ are then used by Eq. (13) for one round of keystream generation ($K_{s_i}$) as follows:

$$K_{s_i} = \left[ \vec{P} \otimes (\vec{M} \oplus C_{k_{i+1}})_q^{-1} \otimes (\vec{L} \oplus C_{d_{i+1}}) \right] \pmod{q} \tag{13}$$

where $C_{k_{i+1}} = \vec{M} \oplus [C_{k_i} \ll LCK5] + 1$ and $C_{d_{i+1}} = \vec{L} \oplus [C_{d_i} \ll LCD5] + 1$. The two variables $LCK5$ and $LCD5$ are the integer values represented by the last 5 bits of $C_k$ and $C_d$ respectively. The resulting $K_s$ is a 347-bit block. The first 340 bits are divided into 4 chunks of 85 bits each. Only the second and third segments (Seg2 and Seg3) are considered in the final keystream, since they showed better statistical properties compared to the rest of the bits. The remaining bits (Seg1 and Seg4) were found to be statistically weak, and therefore they are not considered in the final keystream.

3.3 Encryption Phase (EP)

The encryption process of LTSC-128 is based on applying the exclusive-OR (XOR) operation between the keystream bits and the plaintext bits. The generated $K_s$ of 160-bit length is divided into five words (32 bits each). The encryption is accomplished by XOR'ing one word of plaintext with one word of the keystream, as portrayed by Figure 2.

Figure 2. The design of the EP phase implemented in LTSC-128.

Notice that when EP has used all the generated bits at time $t_i$, the KGP will be called to generate a new keystream as highlighted in the KGP phase.

4. Implementation and Performance Evaluation

We compare the performance of LTSC-128 against the widely adopted RC4 stream cipher, which is based on simple substitutions and shift operations. Both ciphers were coded in C++ using MinGW-2.05. The NTL library is used to handle the polynomial arithmetic in LTSC-128. For testing purposes, we ran both ciphers on a workstation with an Intel Core 2 Duo 2.13 GHz processor and 2 GB RAM.

In terms of performance, single-threaded LTSC-128 is about three times slower than RC4. Relatively, LTSC-128 is a good alternative with regard to the high security provided compared to the broken RC4. The encryption rate of LTSC-128 is 49.671 Mbit/second compared to the

n � i=1

(K[i](mod n))

(12)

RC4 encryption rate which is 152.173 MBits/second on the same machine. Figure 3 illustrates the perform ance (12) comparison between LTSC-128 and RC4 with different plaintext sizes.

When the KGP is required to generate a new keystream for encryption, → − →Volume 9 Number 1 − Journal of Digital Information Management February 2011 ounter Ck is incremented and added (addition modulo q) to L and M . At this
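As a rough illustration of Sections 3.2 and 3.3, the following Python sketch shows how Seg2 and Seg3 could be cut out of a 347-bit keystream block and how five 32-bit keystream words could be XORed with plaintext words. The function names, the bit-string representation, the MSB-first word packing, and the choice to use the first 160 of the 170 selected bits are all assumptions made for this example; the paper does not specify them, and its own implementation is written in C++ with NTL.

```python
def select_segments(ks_bits: str) -> str:
    """Return Seg2 || Seg3 (170 bits) of a 347-bit block given as a '0'/'1' string."""
    if len(ks_bits) != 347:
        raise ValueError("expected a 347-bit keystream block")
    # The first 340 bits form four 85-bit segments; the last 7 bits are unused.
    segments = [ks_bits[i * 85:(i + 1) * 85] for i in range(4)]
    return segments[1] + segments[2]


def to_words(bits: str) -> list:
    """Pack 160 bits into five 32-bit integers (MSB-first packing is an assumption)."""
    if len(bits) != 160:
        raise ValueError("expected 160 bits")
    return [int(bits[i:i + 32], 2) for i in range(0, 160, 32)]


def xor_words(data_words, key_words):
    """XOR data words with keystream words; the same call encrypts and decrypts."""
    return [(d ^ k) & 0xFFFFFFFF for d, k in zip(data_words, key_words)]


# Toy keystream block with a recognisable pattern (not real cipher output).
block = "0" * 85 + "1" * 85 + "0" * 85 + "1" * 85 + "0" * 7
selected = select_segments(block)
key_words = to_words(selected[:160])  # assumption: keep the first 160 bits

plain_words = [0x48656C6C, 0x6F2C2077, 0x6F726C64, 0x21212121, 0x00000000]
cipher_words = xor_words(plain_words, key_words)
recovered = xor_words(cipher_words, key_words)
```

Because XOR is its own inverse, applying `xor_words` twice with the same keystream words returns the original plaintext words.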


Figure 3. Encryption rate for RC4 and LTSC-128 stream ciphers

Recently, a high performance multithreaded model for stream ciphers was proposed in (Suwais and Samsudin 2008) to enhance the performance of stream ciphers, where multiple tasks are performed concurrently. The model is divided into three components: thread creator, keystream generator and plaintext encoder.

Applying the multithreaded model on LTSC-128 requires an immediate trigger of the thread creator component of the multithreaded model, resulting in four threads. Two threads are responsible for generating keystreams, while the other two threads are responsible for encrypting the stream of plaintext. Subsequently, the four threads are directly connected to the keystream generation and encryption phases of LTSC-128, via the keystream generator and plaintext encoder components of the multithreaded model. However, LTSC-128 uses a single counter to generate a new keystream in each round, whereas the multithreaded model is designed to use two counters in parallel.

The multithreaded model helps LTSC-128 to perform faster, since the workload of LTSC-128 is divided among multiple threads and all those threads operate concurrently on multi-core processors. Thus multithreading on a multi-core machine provides LTSC-128 with higher utilization of the new generation of processors. Applying this model on LTSC-128 shows better performance compared to the results discussed above. Figure 4 illustrates the performance enhancement obtained by applying the multithreaded model on a dual-core machine. The encryption rate of the multithreaded LTSC-128 increased significantly to 71.602 Mbit/s, compared to the sequential LTSC-128 encryption rate of 49.671 Mbit/s shown previously.

Figure 4. Performance Improvement gained from applying the multithreaded model on M(LTSC-128)

5. Security Analysis

The security of LTSC-128 rests on the intractability of the SVP, which is an NP-hard problem. At present, no known algorithm can solve the SVP in polynomial time. Besides the NP-hard problem, the security of LTSC-128 is also measured by its statistical properties. In this section, we examine LTSC-128 against cryptanalysis attacks and test the statistical attributes of the keystream generated by the KGP phase.

5.1 Brute Force Attack

The length of the input key used in LTSC-128 was chosen specifically to thwart brute force attack. A brute force attack on the 128-bit input key of LTSC-128 is infeasible due to the huge key space of $2^{128}$ possible keys. Hypothetically, an attacker can recover the secret key by trying all possible polynomials $\vec{M} \in \zeta_M$ or $\vec{L} \in \zeta_L$ (where $\zeta$ denotes the polynomial space), and testing whether $\vec{P} \otimes \vec{M}_q^{-1} \otimes \vec{L} \pmod{q}$ has small entries (a polynomial with a small number of one's coefficients). Since $\vec{L}$ and $\vec{M}$ are of the same order and have an equal number of non-zero coefficients, we consider only polynomial $\vec{M}$ in our analysis. The security of SVP is determined by $\#\zeta_M$. The security level (Hoffstein et al. 1998) for SVP of dimension $n$ is then given by Eq.(14):

$$\#\zeta_M = \frac{1}{\vartheta_M!} \sqrt{\frac{n!}{(n - 2\vartheta_M)!}} \tag{14}$$

Based on Eq.(14), the attacker is required to carry out approximately $2^{132}$ operations to brute force LTSC-128. The same procedure has to be applied on polynomial $\vec{L}$ to attack our generator, which doubles the calculation requirements. Therefore, LTSC-128 is secure against brute force attack due to the infeasibility of carrying out such a huge amount of calculations on both polynomials $\vec{M}$ and $\vec{L}$.

5.2 Time-memory Trade-off Attack

The success of this kind of attack depends on the size of the state space. Ciphers with a small state space are highly susceptible to this attack. The attack is carried out as follows. Create a sorted list of output sequences and their corresponding states. Then, the received output sequence is matched against those stored in the list. If a match is found, the cipher state before producing the output sequence can also be found in the list, resulting in a successful key recovery (Ekdahl and Johansson 2000).

We found that the time-memory trade-off attack is not feasible on the LTSC-128 stream cipher. The state space is found to be too large ($2^{347+128}$) compared to the size of the key space ($2^{128}$).

5.3 Balance

The initialization phase of LTSC-128 extracts two polynomials $\vec{M}$ and $\vec{L}$ from the input key $K$. We found it hard to recover the input key by attacking polynomials $\vec{M}$ and $\vec{L}$ due to the balance property that we added to this phase. The extracted polynomials have equal number of 0's and 1's ($\vartheta_M = \vartheta_L = n/3 \approx 116$). The initialization phase eliminates the statistical bias that could be found during the process of polynomial extraction. From the other perspective, the attacker is required to perform $2^{347} + 2^{347}$ operations to recover the input key, based on the fact that polynomials $\vec{M}$ and $\vec{L}$ are of degree 347. Therefore, this kind of attack is infeasible due to the huge number of operations required to recover the input key.
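To put the operation count quoted in Section 5.3 next to the size of the key space, the arithmetic can be written out as a short worked step. It uses only the figures stated in the text ($2^{347}$ operations per polynomial and a 128-bit key) and is meant as an illustration, not as an additional security claim:

$$2^{347} + 2^{347} = 2 \cdot 2^{347} = 2^{348}, \qquad \frac{2^{348}}{2^{128}} = 2^{220}.$$

Under these figures, attacking both polynomials costs a factor of $2^{220}$ more operations than exhaustively searching the $2^{128}$ key space.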

5.4 Chosen Input Differential Attack

This is yet another powerful attack, in which small changes to the input values are expected to propagate through the cipher. In LTSC-128, changing one bit in the input key affects all six variables in the keystream generator. These variables are the counters ($C_k$, $C_d$), the two polynomials ($\vec{M}$, $\vec{L}$) of degree 347 and the two shifting variables ($LCK_5$, $LCD_5$). LTSC-128 is found to be secure against this kind of attack, since changing one bit of the input key affects all the variables of the sequence generation cycle. The resulting sequence, which is mapped from the input sequence, is obtained in a nonlinear fashion by performing bit-shifting, modulo operations and convolution multiplications. In other words, the $i$-th bit of the output sequence depends on the first, second, third, ..., and $j$-th bit of the input (where $0 \le i \le j$).

5.5 LLL Lattice Basis Reduction Algorithm

Lenstra et al. (1982) described a polynomial time algorithm, called the LLL-reduction algorithm, for transforming a basis $B = (\vec{b}_1, \vec{b}_2, \ldots, \vec{b}_n)$ of lattice $L$ into an LLL-reduced basis $B' = (\vec{b}'_1, \vec{b}'_2, \ldots, \vec{b}'_n)$. The LLL-reduction algorithm is commonly [...]

[...] it consists of 18 core tests. These 18 tests produce a total of 215 p-values uniform on [0,1). The Diehard Suite requires approximately 68 million 32-bit numbers for a successful run of the 18 tests. The tests included in the Suite, with their abbreviations, are: Birthday Spacing Test (BDAY), Overlapping 5-Permutation Test (OPERM), Binary Rank Test for matrices (RANK31), Binary Rank Test for matrices (RANK32), Binary Rank Test for matrices (RANK68), Bitstream Test (BSTM), DNA Test (DNA), OPSO Test (OPSO), OQSO Test (OQSO), Count-The-1's Test on stream of bytes (CBYTE), Count-The-1's Test for specific bytes (CSBYTE), Parking Lot Test (PARKL), Minimum Distance Test (MDIST), 3Dspheres Test (3DS), Squeeze Test (SQEZ), Overlapping Sums Test (OSUM), Run Test (RUN) and Craps Test (CRAP). Results obtained from the 18 tests are summarized in Table 1. The p-value of each test shows that LTSC-128 passed the tests with confidence of 99%, within the success range 0.01


Abstract: LTSC-128 is a highly secure stream cipher based on the hardness of the Shortest Vector Problem in lattice space. The cipher is based on vector multiplication over a finite field, where these vectors are represented by polynomials to enhance the performance of keystream generation. The key size is 128 bits and no attack faster than exhaustive key search has been identified. The experimental results showed that the encryption rate of LTSC-128 is about 50Mbit/second. The LTSC-128 cipher has the additional feature that its performance can be increased by applying multithreading techniques on multi-core processors. The multithreaded version of LTSC-128 has achieved a better encryption rate of 71Mbit/second on a dual-core processor.

Categories and Subject Descriptors: E.3 [Data Encryption] Data encryption standard; D.4.6 [Security and Protection] Cryptographic controls

General Terms: Data encryption; Cryptographic analysis, Vector problem, Lattice

Keywords: Stream cipher, Shortest vector problem, Lattice, NP-hard problem, Multithreading, Encryption

Received: 11 November 2010; Revised 17 December 2010; Accepted 20 December 2010

1. Introduction

A stream cipher algorithm is responsible for generating a cryptographically strong keystream to perform encryption operations on a given plaintext. Generally, a stream cipher is used in environments where encryption needs to be done on a continuous stream of bits (rather than by block, as in a block cipher). Stream ciphers can be categorized into two categories: hardware-based and software-based. The hardware-based stream ciphers are commonly based on Linear Feedback Shift Registers (LFSRs) (e.g. SNOW (Ekdahl and Johansson 2000), LILI-128 (Dawson et al. 2000), etc.). On the other hand, software-based stream ciphers (e.g. RC4 (Rivest 1992), SEAL (Rogaway and Coppersmith 1998), etc.) rely on Boolean functions and bit manipulation in their computations.

A cryptographically secure stream cipher should satisfy two essential requirements. These requirements are summarized by two questions: how secure is the generated keystream against cryptanalysis attacks, and how strong is the generated keystream statistically? In addition, one should also take the performance factor into consideration, so that the performance and the level of security provided by the algorithm are balanced.

Both categories of stream ciphers are proven to provide high performance ciphers according to the results presented in (Rogaway and Coppersmith 1998, Boesgaard et al. 2003). However, research shows that those ciphers either do not reach the required level of security or that the generated keystream is statistically weak (Scott et al. 2001, Christophe 2001). Therefore the current design of stream ciphers, which rests on using simple logical and mathematical operations or LFSRs, shows the imperfection of the stream cipher.

The core of any stream cipher is its keystream generation procedure, where the input key is processed according to a specific, pre-defined formula of that cipher. In this paper we present a new stream cipher based on the hardness of the Shortest Vector Problem (SVP) in a lattice system. SVP is an NP-hard problem (Khot 2005) and it is the core of the proposed cipher, to ensure higher security compared to the existing stream ciphers discussed earlier.

The SVP, along with three other intractable problems, is considered one of the four most important pivots in current public-key cryptosystems in the literature. The other three problems are the Integer Factorization Problem (IFP), the Discrete Logarithm Problem (DLP) and the Elliptic Curve Discrete Logarithm Problem (ECDLP). The first use of SVP was formally introduced in 1998 (Hoffstein et al. 1998), in a public-key cryptosystem called the NTRU cryptosystem. The general idea of SVP rests on finding the shortest non-zero vector $\vec{b}$ in a lattice $\delta$, where $B$ is the basis of the lattice $\delta$. The basis $B$ can be seen as an $n \times m$ matrix, where the vectors $\vec{b}$ form the columns of $B$ such as:

$$B = \begin{bmatrix} | & \cdots & | \\ \vec{b}_1 & \cdots & \vec{b}_m \\ | & \cdots & | \end{bmatrix} \tag{1}$$

The workflow of LTSC-128 is divided into three phases: Initialization Phase (IP), Keystream Generation Phase (KGP) and Encryption Phase (EP) (see Section 3). The SVP is applied in the second phase of LTSC-128 to generate the keystream. In LTSC-128, the mathematical operations and vector representation of SVP in lattice space were inspired by the NTRU public-key cryptosystem, where the vectors are represented by polynomials for efficiency purposes, as will be discussed later in this document.

Employing an NP-hard problem such as SVP in a stream cipher will strengthen the security of the generated keystream against cryptanalysis attacks, since there is no known method that can solve NP-hard problems in polynomial time. LTSC-128 provides higher security than several other well known ciphers intended to be used in hardware and software implementations.

From a technical point of view, the use of NP-hard problems in cryptographic primitives decreases the performance of the corresponding primitives due to their costly operations. Thus, the use of the new multi-core technology becomes necessary in order to enhance the performance of cryptographic primitives based on NP-hard problems. Technically, gaining higher performance on a multi-core platform can be achieved by using multithreading techniques, where multiple threads are created to accomplish the internal calculations of the corresponding cryptographic primitives. From another perspective, multithreading techniques can be applied to the structural design of the algorithm (e.g. stream cipher) such that multiple keystreams are generated and used to encrypt a stream of plaintext, as described by the model proposed in (Suwais and Samsudin 2008).

The paper is organized as follows: Section 2 provides the background and notations used. Section 3 describes the structure of the proposed algorithm. Implementation and performance evaluation are discussed in Section 4. Security evaluation and statistical analysis are presented in Sections 5 and 6 respectively. Finally, concluding remarks are presented in Section 7.

2. Definitions and Preliminaries

Lattice $\delta$ is the set of linear independent vectors $\vec{b}_i$ defined as:

$$\delta = \left\{ \sum_{i=1}^{n} r_i \vec{b}_i,\; r_i \in \mathbb{Z} \right\} \tag{2}$$

[...] $[k_3 u_2, k_3 u_3, k_3 u_1] = [k_1 u_1 + k_2 u_3 + k_3 u_2,\; k_1 u_2 + k_2 u_1 + k_3 u_3,\; k_1 u_3 + k_2 u_2 + k_3 u_1]$

The polynomial resulting from the convolution multiplication or any other operation is reduced modulo $q$, where $q$ is a pre-defined value. The other important operation carried out in our implementation is the inverse of a given polynomial. The inverse of polynomial $\vec{M}$ modulo $q$ is denoted by $\vec{M}_q^{-1}$ such that:

$$\vec{M}_q^{-1} \otimes \vec{M} \equiv 1 \pmod{q} \tag{9}$$

For a given polynomial $\vec{M}$, we write $\vartheta$ to denote the number of one's coefficients in the polynomial. In practice, the convolution product of two polynomials with binary coefficients requires $\vartheta n$ operations. Thus, for further speed optimization we apply the concept of Small Hamming Weight Polynomial multiplication (Hoffstein and Silverman 2000) by dividing the polynomials used in our system into smaller polynomials with fewer $\vartheta$ coefficients. Therefore, the computation of the multiplication is defined as:

$$\vec{L}(X) \times \vec{M}(X) = \vec{L}_1(X) \times \left( \vec{L}_2(X) \times \vec{M}(X) \right) \tag{10}$$

where $\vec{L}_1(X)$ and $\vec{L}_2(X)$ are obtained by splitting $\vec{L}(X)$, with fewer ones, $\vartheta_1$ and $\vartheta_2$, for $\vec{L}_1$ and $\vec{L}_2$ respectively. Therefore the multiplication of the smaller polynomials requires $(\vartheta_1 + \vartheta_2)n$ operations, where $n$ is the degree of the polynomial.
Such polynomial multiplication can be implemented in com he matrix bThe is matrix knownBas the basis of lattice and itδconsists all vectors b− is � known basis ofδ lattice and itofconsists → →as the − → − iSuch− the polynomial. polynomial multiplication can be implemented comoperations, where n is the degree of the polynomial. Such poly→ The matrixδas b known the basisδδofand lattice δ and itputer consists of all vectors bi of shift and XOR operations, which efﬁciently in bi , ras rbasis ∈of Zlattice = is → e matrix isall the ititconsists ofof(2) all vectors bbas ibasis i of i i a set affect th he matrixbb isknown known as the lattice and allprograms vectors ofThe vectors bi (see Eq.(1)). The dimension of δconsists is the number ee Eq.(1)). dimension of δ is the number of vectors in b. The vectors in nomial multiplication can be implemented in com p uter programs puter programs as a set of shift and XOR operations, which efﬁciently affect the i=1 dimension of δ is the number of vectors in b. The vectors in (see dimension Eq.(1)). The ee ofofvectors δδisisthe ofofvectors ininb. in ofThe vectors in B. as The lattice can be represented as vectors performance of the multiplication process. eeEq.(1)). Eq.(1)). The dimension theinnumber number vectors b.The The vectors in ofpolynomial → − as a set shift and XOR operations, which efﬁciently affect the tice can be represented truncated polynomial ρ of degree n-1 as shown → − performance of the polynomial multiplication process. lattice can be represented aspolynomial truncated polynomial ρ of − degree n-1 as shown → → − → → truncated polynomial ρ of degree as shown indegree Eq.(3): tice can be represented as ρ− as performance of the polynomial multiplication process. atrix b is known as the basis lattice δ and itn-1 consists of all vectors bn-1 ttice can be represented asoftruncated truncated polynomial ρ of of degreen-1 asshown shown i Eq.(3): in Eq.(3): Eq.(3): q.(1)). The dimension of δ is the number of vectors in b. 
The vectors in n−1 Eq.(3): → − → +n−1 a2 x+22 ·+ x+ (3) ρ (x) = an x− ··a +1 a x2 a+o a1 x + a(3) (3) ρ (x)+=· a· n· x o → → n−1 → can be represented− as truncated polynomial of n-1 shown 3. Proposed Stream Cipher LTSC-128 ++· ·· ·· ·++− aρa22x ++aa11xx2+ (3) ρ− ==aannxxn−1 x2degree +aaoas (3) ρ(x) (x) o a0,a ancoefﬁcients are binary coefﬁcients in{0,1} {0, and all the operations are dehere a0 , a1 ,where ·where · · an aare binary in {0, 1} in and all1} the operations are de3): ,··· binary coefﬁcients and all 0 ,1a 1 , ·a·n· are Thedeuse of SVP of high dimensional lattice as the hard problem n the operations are here aa00, ,aa11,operations ·− · ·· ·aannover are are binary coefﬁcients inn{0, 1} and all n−1 2 ring → n+ here ·ρring {0, and all operations are de−1). The sum ofthe two polynomials is computed ring R Z[X]/(X de ﬁned over R+= Z[X ]/(X −1). The sum xbinary + ·coefﬁcients ·n ·= a2 xthe + ain x a1} (3)is computed (x)R =are anthe −1). The sum of two polynomials ed over the,ﬁned = Z[X]/(X 1 o of LTSC-128 is promising since there is no algorithm that is n −1). The sum of two polynomials is computed ed over the ring R = Z[X]/(X ofbyring two polynomials is computed bysum adding thethe coefﬁcients ofis important −1).ofThe of two polynomials computed ned over the R = the Z[X]/(X adding coefﬁcients variables with same powers. One adding the coefﬁcients of variables with the same powers. One ableimportant to solve the SVP problem in polynomial time. Moreover, a0adding , a1 , · · · athe are binary coefﬁcients inpowers. {0, 1}with and allimportant the operations areOne denvariables with theof same One operation that coefﬁcients variables the same powers. important operation that is needed during the keystream generation is the Convolution yeration addingthat the coefﬁcients of variables with the same powers. One important this problem has been implemented in public-key cryptography n during the keystream generation is the Convolution is needed −1). 
The sum of two polynomials computed ver the ring is R needed = Z[X]/(X during the keystream is theisConvolution eration that isis needed during the keystream isis the Convolution Multiplication of polynomials in generation b. The generation convolution multiplication of(e.g. polynomials NTRU cryptosystem) and has shown good results in peration that needed during the keystream generation the Convolution ultiplication of polynomials in b. The convolution multiplication of polynomials ng the coefﬁcients of variables with the same powers. One important Multiplication of polynomials inconvolution B. The multiplication in is denoted by and deﬁned asconvolution follows: ultiplication ofofbpolynomials inin⊗ b. The multiplication ofofpolynomials terms of performance and security. LTSC-128 implements ultiplication polynomials b. The convolution multiplication polynomials on that is needed during theB keystream generation is the as Convolution b is denoted by ⊗ and deﬁned as follows: of polynomials in is denoted by ⊗ and deﬁned follows: lattice and vectors in polynomial representation for higher bbisisdenoted by deﬁned denoted by⊗⊗and and deﬁned asfollows: follows: n−1 k as cation of polynomials in b. The convolution multiplication of polynomials � � � → − → − →− − → → − − → → − − → efﬁciency as discussed in Section 2. Low Hamming Weight n−1 k �= � � L Li M n+k−1 L→ ⊗− M + (4) i M k−1 iMi denoted− and deﬁned as = follows: →by ⊗− → → →− − → →− − → Lpolynomial n−1 kk − multiplication concept is applied by dividing the n−1 � � � L ⊗ M = L M L M L M + �− (4) � i+j≡1(mod �− → − → − → → → → → − → i− k−1 i=0 i− n+k−1 i− i k) i=k+1 = → − → − → − → − → − − → → − → − LL ⊗⊗� M L M n+k−1 LM + = (4) polynomial of degree n into smaller polynomials with fewer k−1 kM = = i=0 LLi iMMn−1 (4) k−1 +i=k+1 Li i Mn+k−1 � = i+j≡1(mod k) Li i Mi i � n

→− − → →i=k+1 − → − →− − → − − → → one’sM(ϑ = –3 ). i=0 i+j≡1(mod ki=0 and the poisitions of coefﬁcients ink)(4) polynomials and L Li M Li M Li M i k) L ⊗ M = where + i denotes = (4) i=k+1 k−1 n+k−1 i+j≡1(mod i=0 i=k+1 i+j≡1(mod k) Thepoisitions convolutionofproduct of twoinpolynomials is treated as here k and respectively. i denotes the coefﬁcients polynomials M and L coefﬁLTSC-128 works as follows: an input key of 128-bit generates here i i denotes the of coefﬁcients MM and Lstring here kk and and denotes the poisitions poisitions coefﬁcients in polynomials polynomials L where k and i denotes the poisitions ofpolynomials coefﬁcients in polynomicients multiplication, where aofpolynomial is in represented by aas binary of of 128-bit length per round. The keystream is aand keystream spectively. The convolution product of two is treated coefﬁk and i denotes the poisitions of coefﬁcients in polynomials M and L → − → − spectively.als The convolution product of two polynomials is treated as coefﬁM and L respectively. The convolution product of two polyespectively. The convolution product of two polynomials is treated as coefﬁcoefﬁcients [an , aan−1 , · · · , a1 , a0 ].isFor example, letbyMa and L be two truncated divided into 4-words (sub-keystream). These sub-keystream ents multiplication, where polynomial represented binary string of ively.multiplication, The convolution product ofcoefﬁ twocpolynomials is − treated asacoefﬁ→ → − ents aas is by string ofof XOR’ed with the plaintext bits (one word of plaintext at nomials iswhere treated ients multiplication, where a binary polybits are 8 with binary coefﬁcients the following: ents multiplication, a polynomial polynomial is represented represented by a binary string efﬁcients [apolynomials , an−1 ,awhere · ·polynomial · of , adegree example, let M andas− L be two truncated → − → nwhere 1 , a0 ]. 
For multiplication, is represented by a binary string of → − → − nomial is represented by a binary string of coefﬁcients [a ,a ,··· n n−1 a time) efﬁcients [a , a , · · · , a , a ]. For example, let M and L be two truncated → 7 let − → 6M and − n−1 , · · · , a11 , a00 ]. For − oefﬁcients [ann,degree example, L be two truncated to generate the ciphertext bits. LTSC-128 is designed → → coefﬁcients → 2 truncated lynomials of as following: ents [an , an−1 ·,0]. ·a·n−1 , a1 ,example, a80 ].with For binary example, let M=two and Lxthe be two M (x) x + + x + 1 (5) IP, KGP and EP as described in the following ,a ,a For let M and L be truncated polynomials in three phases: 1 lynomials ofofdegree 88with binary as olynomials degree withcoefﬁcients binarycoefﬁcients coefﬁcients asthe thefollowing: following: mials of degree 8 with8binary as theasfollowing: of degree with − binary coefﬁcients sections. → 7 6 2the following: M (x) = x7 + x6 + x2 + 1 (5) → − →7(x) − → − 6 2 (5) 6 xx7+ (5) M (5) M (x) =M x (x) + x== + x2++xx1 ++xx ++11 (5) 3.1 Initialization Phase (IP) → → − At this phase, we aim to extract two polynomials, M and (6) L (x) = x7 + x5 + x4 + 1 → (6) L in order to use them in the keystream generation phase. → − → − → → The two polynomials are extracted from the input key K can be represented as two vectors follow: Then MThen and ML and canLbe represented as two vectors asas follow: of 128-bit length. The key K is transformed to its binary → (7) → M =[1,1,0,0,0,1,0,1] − representation, and from those binary bits we obtain the M→ = [1, 1, 0, 0, 0, 1, 0, 1] (7) two polynomials of degree n = 347 (the value of n is chosen L =[1,0,1,1,0,0,0,1] (8) → − to achieve higher security as proved in Section 5, such L of = convolution [1, 0, 1, 1, 0,product 0, 0, 1] of two polynomi(8) To illustrate the idea that (n − 1)/2 is prime). 
Let ϑM and ϑL be the number of 1’s → → als, let [k1, k2, k3] and [u1, u2, u3] be two polynomials, then the of polynomials M and L respectively, such that To illustrate the idea of convolution product of two polynomials, let [k1 coefﬁcients , k2 , k3 ] convolution multiplication is performed as follows: = ϑL = n/3= 116. isM pernd [u1 , u2 , u3 ] be two polynomials, then the convolution multiplication ϑ → → [k1, k2, k3] ⊗ [u1, u2, u3] = k1 · [u1, u2, u3] + k2 · [u3, u1, u2] + k3 · [u2, u3, u1] The extraction of polynomial L (setting the coefﬁcients of L ) ormed as follows: = [k1u1, k1u2, k1u3] + [k2u3, k2u1, k2u2] + [k3u2, k3u3, is done by selecting the 128-bit of the input key from the Least ] Signiﬁcant Bit (LSB) position to the Most Signiﬁcant Bit (MSB) po[k1 , k2 , k3 ] ⊗ [u1 , u2 , u3 ] = k1k3·u1[u 1 , u2 , u3 ] + k2 · [u3 , u1 , u2 ] + k3 · [u2 , u3 , u1 ] sition, such that the ﬁrst selected bit from K is the ﬁrst coefﬁcient = [k1u1 + k2u3 + k3u2, k1u2 + k2u1 + k3u3, k1u3 + k2u2 → = [k+1 u , k u , k u ] + [k2 u3 , k2 u1 , k2 u2 ] + of L until we obtain ϑL = 116. If ϑL is less than 116, which is the k31u1] 1 2 1 3

28

[k3 u2 , k3 u3 , k3 u1 ]

Journal Volume 9 Number 1 February 2011 = [k1of u1 Digital + k2 u3 Information + k3 u2 , k1 u2 Management + k2 u1 + k3 u3 , k1 u3 + k2 u2 + k3 u1 ]

Resulted polynomial from the convolution multiplication or any other opera-
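The convolution product $\otimes$ of Eq.(4), on which the phases of LTSC-128 rely, can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' NTL-based implementation: the function names, the low-order-first coefficient layout, the optional modulus argument and the sparse factors `L1` and `L2` are assumptions made here. The code adds one rotated copy of the second operand for every non-zero coefficient of the first, which is where the $\vartheta n$ operation count comes from.

```python
def cyclic_convolution(a, b, q=None):
    """Multiply a and b in Z[X]/(X^n - 1); index i holds the coefficient of x^i."""
    n = len(a)
    if len(b) != n:
        raise ValueError("operands must have the same length n")
    result = [0] * n
    for i, a_i in enumerate(a):
        if a_i == 0:
            continue  # only the non-zero coefficients of a cost any work
        for j, b_j in enumerate(b):
            # x^i * x^j wraps around to x^((i + j) mod n) because x^n = 1
            result[(i + j) % n] += a_i * b_j
    if q is not None:
        result = [c % q for c in result]  # optional reduction modulo q
    return result


def hamming_weight(a):
    """Number of non-zero coefficients (the quantity written as theta in the text)."""
    return sum(1 for c in a if c != 0)


# M(x) = x^7 + x^6 + x^2 + 1 and L(x) = x^7 + x^5 + x^4 + 1, stored low-order first.
M = [1, 0, 1, 0, 0, 0, 1, 1]
L = [1, 0, 0, 0, 1, 1, 0, 1]
product = cyclic_convolution(L, M)

# Product form in the spirit of Eq.(10): multiply by two sparse factors in turn.
# L1 and L2 are hypothetical factors chosen for illustration, not a split of L above.
L1 = [1, 0, 0, 0, 1, 0, 0, 0]
L2 = [1, 1, 0, 0, 0, 0, 0, 0]
two_step = cyclic_convolution(L1, cyclic_convolution(L2, M))
one_step = cyclic_convolution(cyclic_convolution(L1, L2), M)
```

In this layout the polynomial of Eq.(5) is stored as `[1, 0, 1, 0, 0, 0, 1, 1]`, the reverse of the high-order-first vector in Eq.(7). Both `two_step` and `one_step` hold the same product; the two-step form touches only `hamming_weight(L1) + hamming_weight(L2)` rotations of the operand instead of one rotation per non-zero coefficient of the expanded factor.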

If $\vartheta_L$ is less than 116, which is the common case, we set the last coefficients of $\vec{L}$ by selecting $r$ bits of $K$ from the MSB position to the LSB position until we satisfy the condition $\vartheta_L = 116$. In rare situations, the majority of the input key bits are zeros with few ones, which means the input key cannot provide polynomial $\vec{L}$ with a sufficient number of ones. In that case, we flip (0 to 1) the bits of polynomial $\vec{L}$, starting from the location $n/2$ and moving toward the LSB of $\vec{L}$. This procedure continues until we hit the threshold $\vartheta_L = 116$. The purpose of this procedure is to achieve a preferable distribution of one coefficients in the newly generated polynomial.

On the other hand, the coefficients of the second polynomial $\vec{M}$ are also extracted from the input key by using a similar mechanism. We set the first coefficients by selecting the 128 bits of the input key from the MSB position to the LSB position, such that the first selected bit from $K$ is the first coefficient of $\vec{M}$, until we obtain $\vartheta_M = 116$. If $\vartheta_M$ is less than 116, we set the last coefficients of $\vec{M}$ by selecting $r$ bits of $K$ from the LSB position to the MSB position until we satisfy the condition $\vartheta_M = 116$. Similar to the rare case discussed in generating polynomial $\vec{L}$, we need to flip some bits in $\vec{M}$ to hit the threshold $\vartheta_M = 116$. However, the procedure of flipping the zero bits in $\vec{M}$ starts from the location $n/2$, moving toward the MSB of $\vec{M}$. Figure 1 illustrates the extraction of the initialization value for polynomials $\vec{L}$ and $\vec{M}$ from the input key.

Figure 1. Initializing polynomials $\vec{L}$ and $\vec{M}$ by extracting the coefficients from $K$

Finally, another two parameters ($\vec{P}$, $q$) are needed during the KGP phase to generate the keystreams. As proved in (Hoffstein and Silverman 2000), polynomial $\vec{P} = x + 2$ and integer $q = 128$ are two ideal units in $\mathbb{Z}[X]$. The generated polynomial $\vec{M}$ will be in the form $\vec{M} = 1 + (2 + x) \times \vec{M}$, with the extracted polynomial on the right-hand side.

3.2 Keystream Generation Phase (KGP)

The generated and initialized polynomials and variables in the IP phase form the main parameters for the KGP phase. The formula for a single round of keystream ($K_s$) generation is given as follows:

$$K_s = \vec{P} \otimes \vec{M}_q^{-1} \otimes \vec{L} \tag{11}$$

where $\vec{P}$, $\vec{M}$ and $\vec{L}$ are extracted and initialized in the previous phase, and $\vec{M}_q^{-1}$ is the inverse polynomial ($\vec{M}^{-1} \bmod q$) (see Eq.(9)).

In order to generate a new keystream, KGP uses a counter $C_k$, which is firstly initialized from the input key $K$ by Eq.(12):

$$C_k = \sum_{i=1}^{n} \left( K[i] \ (\mathrm{mod}\ n) \right) \tag{12}$$

When the KGP is required to generate a new keystream for encryption, counter $C_k$ is incremented and added (addition modulo $q$) to $\vec{L}$ and $\vec{M}$. At this point, another auxiliary counter ($C_d$) is required in generating the new keystream. The counter $C_d$ is firstly initialized such that $C_d = C_k$. The two counters $C_k$ and $C_d$ are then used by Eq.(13) for one round of keystream generation ($K_{s_i}$) as follows:

$$K_{s_i} = \left[ \vec{P} \otimes (\vec{M} \oplus C_{k_{i+1}})_q^{-1} \otimes (\vec{L} \oplus C_{d_{i+1}}) \right] (\mathrm{mod}\ q) \tag{13}$$

where $C_{k_{i+1}} = \vec{M} \oplus [C_{k_i} \ll LCK_5] + 1$ and $C_{d_{i+1}} = \vec{L} \oplus [C_{d_i} \ll LCD_5] + 1$. The two variables $LCK_5$ and $LCD_5$ are the integer values represented by the last 5 bits of $C_k$ and $C_d$ respectively. The resulting $K_s$ is a 347-bit block. The first 340 bits are divided into 4 chunks of 85 bits each. Only the second and third segments (Seg2 and Seg3) are considered in the final keystream since they showed better statistical properties compared to the rest of the bits. The remaining segments (Seg1 and Seg4) are found statistically weak, and therefore they are not considered in the final keystream.

3.3 Encryption Phase (EP)

The encryption process of LTSC-128 is based on applying the exclusive-OR (XOR) operation between the keystream bits and the plaintext bits. The generated $K_s$ of 160-bit length is divided into five words (32 bits each). The encryption is accomplished by XOR'ing one word of plaintext with one word of the keystream as portrayed by Figure 2.

Figure 2. The design of EP Phase implemented in LTSC-128

Notice that when EP has used all the generated bits at time $t_i$, the KGP will be called to generate a new keystream as highlighted in the KGP phase.

4. Implementation and Performance Evaluation

We compare the performance of LTSC-128 against the widely adopted RC4 stream cipher, which is based on simple substitutions and shift operations. Both ciphers were coded in C++ using MinGW-2.05. The NTL library is used to handle the polynomial arithmetic in LTSC-128. For testing purposes, we ran both ciphers on a workstation with an Intel Core 2 Duo 2.13 GHz processor and 2GB RAM.

In terms of performance, single-threaded LTSC-128 is about three times slower than RC4. Relatively, LTSC-128 is a good alternative with regard to the high security provided compared to the broken RC4. The encryption rate of LTSC-128 is 49.671 Mbit/s compared to the RC4 encryption rate, which is 152.173 Mbit/s on the same machine. Figure 3 illustrates the performance comparison between LTSC-128 and RC4 with different plaintext sizes.
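The "about three times slower" figure follows directly from the two measured rates. The short calculation below is only a worked check of that arithmetic; the helper name, the 1 MiB example size and the reading of one Mbit as 10^6 bits are assumptions made for illustration.

```python
LTSC128_RATE_MBIT_S = 49.671   # single-threaded LTSC-128, as reported in the text
RC4_RATE_MBIT_S = 152.173      # RC4 on the same machine, as reported in the text


def seconds_to_encrypt(num_bytes, rate_mbit_s):
    """Time needed to encrypt num_bytes at a given rate (1 Mbit taken as 10**6 bits)."""
    return (num_bytes * 8) / (rate_mbit_s * 1_000_000)


slowdown = RC4_RATE_MBIT_S / LTSC128_RATE_MBIT_S  # roughly 3.06
one_mib = 1024 * 1024
t_ltsc = seconds_to_encrypt(one_mib, LTSC128_RATE_MBIT_S)
t_rc4 = seconds_to_encrypt(one_mib, RC4_RATE_MBIT_S)

print(f"RC4 / LTSC-128 rate ratio: {slowdown:.2f}")
print(f"1 MiB: LTSC-128 {t_ltsc:.3f} s, RC4 {t_rc4:.3f} s")
```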

Journal of Digital Information Management, Volume 9, Number 1, February 2011

Figure 3. Encryption rate for RC4 and LTSC-128 stream ciphers

Recently, a high performance multithreaded model for stream ciphers was proposed in (Suwais and Samsudin 2008) to enhance the performance of stream ciphers, where multiple tasks are performed concurrently. The model is divided into three components: thread creator, keystream generator and plaintext encoder.

Applying the multithreaded model on LTSC-128 requires an immediate trigger of the thread creator component of the multithreaded model, resulting in four threads. Two threads are responsible for generating keystreams, while the other two threads are responsible for encrypting the stream of plaintext. Subsequently, the four threads will be directly connected to the keystream generation and encryption phases of LTSC-128, via the keystream generator and plaintext encoder components of the multithreaded model. However, LTSC-128 uses a single counter to generate a new keystream each round, whereas the multithreaded model is designed to use two counters in parallel.

The multithreaded model helps LTSC-128 to perform faster since the workload of LTSC-128 is divided among multiple threads, and all those threads operate concurrently on multi-core processors. Thus multithreading on a multi-core machine provides LTSC-128 with higher utilization of the new generation of processors. Applying this model on LTSC-128 shows better performance compared to the results discussed above. Figure 4 illustrates the performance enhancement obtained by applying the multithreaded model on a dual core machine. The encryption rate of the multithreaded LTSC-128 increased significantly to 71.602 Mbit/s compared to the sequential LTSC-128 encryption rate, 49.671 Mbit/s, as shown previously.

5. Security Analysis

The security of LTSC-128 rests on the intractability of the SVP problem, which is an NP-hard problem. At present, no algorithm can solve the SVP in polynomial time. Besides the NP-hard problem, the security of LTSC-128 is also measured by its statistical properties. In this section, we examine LTSC-128 against cryptanalysis attacks and test the statistical attributes of the keystream generated by the KGP phase.

5.1 Brute Force Attack

The length of the input key used in LTSC-128 was chosen specifically to thwart Brute Force attack. A Brute Force attack on the 128-bit input key of LTSC-128 is infeasible due to the huge key space of $2^{128}$ possible keys. Hypothetically, an attacker can recover the secret key by trying all possible polynomials $\vec{M} \in \zeta_M$ or $\vec{L} \in \zeta_L$ (where $\zeta$ denotes the polynomial space), and testing whether $\vec{P} \otimes \vec{M}_q^{-1} \otimes \vec{L} \pmod{q}$ has small entries (a polynomial with few one coefficients). Since $\vec{L}$ and $\vec{M}$ are of the same order and equal in the number of non-zero coefficients, we consider only polynomial $\vec{M}$ in our analysis. The security of SVP is determined by $\#\zeta_M$. The security level (Hoffstein et al. 1998) for SVP of dimension $n$ is then given by Eq.(14):

$$\#\zeta_M = \frac{1}{\vartheta_M!} \sqrt{\frac{n!}{(n - 2\vartheta_M)!}} \tag{14}$$

Based on Eq.(14), the attacker is required to carry out approximately $2^{132}$ operations to brute force LTSC-128. The same procedure has to be applied on polynomial $\vec{L}$ to attack our generator, which doubles the calculation requirements. Therefore, LTSC-128 is secure against brute-force attack due to the infeasibility of carrying out such a huge amount of calculations on both polynomials $\vec{M}$ and $\vec{L}$.

5.2 Time-memory Trade-off Attack

The success of this kind of attack depends on the size of the state space. Ciphers with a small state space are highly subject to this attack. The procedure of this attack is carried out as follows. Create a sorted list of output sequences and their corresponding states. Then, the received output sequence is matched against the ones stored in the list. If a match is found, the cipher state before producing the output sequence can also be found in the list, resulting in a successful key recovery (Ekdahl and Johansson 2000).

We found that the time-memory trade-off attack is not feasible on the LTSC-128 stream cipher. The state space is found to be too large ($2^{347+128}$) compared to the size of the key space ($2^{128}$).
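The table-lookup procedure of the time-memory trade-off attack can be sketched on a deliberately tiny generator. The 16-bit toy generator below is a hypothetical stand-in, chosen so that the whole state space fits in memory; it is not LTSC-128, and its constants and function names are assumptions made for illustration.

```python
from bisect import bisect_left

STATE_BITS = 16
MASK = (1 << STATE_BITS) - 1


def step(state):
    """Hypothetical toy state update (a 16-bit linear congruential step)."""
    return (state * 25173 + 13849) & MASK


def output_prefix(state, length=4):
    """Run the toy generator from `state` and return `length` output bytes."""
    out = []
    for _ in range(length):
        state = step(state)
        out.append(state >> 8)  # emit the high byte of each new state
    return tuple(out)


# Precomputation: a sorted list of (output sequence, state that produced it).
table = sorted((output_prefix(s), s) for s in range(1 << STATE_BITS))
keys = [entry[0] for entry in table]


def recover_state(observed):
    """Binary-search the sorted list for a state consistent with the observed output."""
    pos = bisect_left(keys, observed)
    if pos < len(keys) and keys[pos] == observed:
        return table[pos][1]
    return None


secret_state = 0xBEEF
recovered = recover_state(output_prefix(secret_state))
```

The lookup returns a state that reproduces the observed output, which is all the attacker needs. The precomputed list has one entry per state, so the same approach against a state space of $2^{347+128}$ would require a table far beyond any practical storage.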

5.3 Balance

The initialization phase of LTSC-128 intends to extract two polynomials $\vec{M}$ and $\vec{L}$ from the input key $K$. We found it hard to recover the input key by attacking polynomials $\vec{M}$ and $\vec{L}$ due to the balance property that we added to this phase. The two extracted polynomials contain the same number of 0's and 1's as each other ($\vartheta_M = \vartheta_L = n/3 \approx 116$). The initialization phase has eliminated the statistical bias that could be found during the process of polynomial extraction. From the other perspective, an attacker is required to perform $2^{347} + 2^{347}$ operations to recover the input key, based on the fact that polynomials $\vec{M}$ and $\vec{L}$ are of degree 347. Therefore, this kind of attack is infeasible due to the huge number of operations required to recover the input key.

this kind of attacks is infeasible due to the huge number of operations which are 5.4 Chosen Input Differential Attack required to recover the input key.

Figure 4. Performance Improvement gained from applying the multithreaded model on M(LTSC-128)

This is yet another powerfull attack where small changes on the input values is expected to propagate through the cipher.

5.4 Chosen Input Differential Attack

30

Journal of Digital Information Management Volume 9 Number 1 February 2011 This is yet another powerfull attack where small changes on the input values

is expected to propagate through the cipher. In LTSC-128, changing one bit in the input key will affect the entire six variables in the keystream generator.

In LTSC-128, changing one bit in the input key will affect the it consists of 18 core tests. These 18 tests are producing →→ − → − → − → − entire six variables in the keystream generator. a total → − → of 215 p-values uniform on [0,1). The Diehard Suite − → − − hese variables are thecounters counters (C ,C two ( M, ,LL) )ofofthe These variables are counters (C ,,CCdd),), two (the ,,deLL)) of the These variables are the the (Cpolynomials two polynomials polynomials (M Mdeof the deded ),two kkpolynomials ese variables are the (C ,kC (M kcounters d ), requires approximately 68 million 32-bit numbers for a suc → These variables are the counters (C ,C ), two polynomials ( M , → − → − k d , LCD ). LTSC-128 is found secure ree 347 and two shifting variables (LCK , LCD ). LTSC-128 is found secure gree 347 and two shifting variables (LCK , LCD ). LTSC-128 is found secure gree 347 and two shifting variables (LCK 5 5 5 5 , LCD ). LTSC-128 is found secure ee 347 and two shifting variables (LCK 5 5 cessful run of the 18 tests. The tests included in the Suite, 5 → are the counters (Ck , Cd5), two polynomials → → hese variables (− M ,− L).) of the deLkind ) ofare the degcounters ree 347 and two shifting variables (LCKinput LCD 5(,M 5 ) of the deese variables the (C polynomials ,ofLkey k, C d ), two gainst this of attacks since changing one bit of the will affect against this kind of attacks since changing one bit of the input key will affect against this kind of attacks since changing one bit the input key will affect with their abbrevi a tion are: Birthday Spacing Test (BDAY), ainst this kind of attacks since changing one bit of the input key will affect is found secure ee 347 and two shifting variables (LCK 5 , LCD 5 ). LTSC-128 LTSC-128 is found secure(LCK against this kind of attacks is since , LCD ). LTSC-128 found secure ee 347 and two shifting variables 5 5 Overlapping 5-Permutation Test (OPE RM), Binary Rank he entire variables of the sequence generation cycle. 
The resulting sequence the entire variables of the sequence generation cycle. The resulting sequence the entire variables of the sequence generation cycle. The resulting sequence egainst entirethis variables thebit sequence generation cycle. The kind ofofattacks changing bit the resulting input keysequence will affect changing one ofsince the input key will one affect theofentire variables Test for matrices (RANK31), Binary Rank Test for matrices ainst this kind of attacks since changing one bit of the input key will affect hich is mapped from the input sequence is obtained in a nonlinear fashion which mapped from the input isin ininsequence aa nonlinear which is mapped from the inputissequence sequence is obtained obtained nonlinear fashion fashion is mapped from the sequence input sequence obtained a nonlinear fashion of theis sequence generation cycle. The resulting sequence eichentire variables of the generation cycle. The resulting (RANK32), Binary Rank Test for matrices (RANK68), eyperforming entire variables of the sequence generation cycle. The resulting sequence performing bit-shifting, modulo operation convolution multiplications. which is mapped from sequence the input sequence is obtained in a by performing bit-shifting, modulo operation and multiplications. In by performing bit-shifting, modulo operation and convolution multiplications. In bit-shifting, modulo operation and convolution multiplications. 
InIn Test (BSTM), hich is mapped from the input isand obtained in aconvolution nonlinear fashion Bitstream DNA Test (DNA), OPSO Test (OPSO), hich is mapped from the input sequence is obtained in a nonlinear fashion nonlinear fashion by performing bit-shifting, modulo operation ther words, the i − th bit of the output sequence depends on the ﬁrst, second, other words, the i − th bit of the output sequence depends on the ﬁrst, second, other words, the i − th bit of the output sequence depends on the ﬁrst, second, her words, thebit-shifting, i − th bit ofmodulo the output sequence depends on the ﬁrst, second, OQSO Test (OQSO), Count-The-1’s Test on stream of bytes y performing operation and convolution multiplications. In and convolution multiplications. In0other the i − multiplications. th bit of performing bit-shifting, modulo operation and convolution In hird,..., and jthe bit the input (where i ≤words, j). 00≤≤ third,..., and jthe bit of the input i i≤≤ j). third,..., and j− − th bitoutput of thesequence input (where j).the ﬁrst, second, (CBYTE), Count-The-1’s Test for speciﬁc bytes (CSBYTE), rd,..., and jthe −−output thth bit ofof input (where 0the ≤≤(where iﬁrst, ≤ j).second, her words, i− thsequence bit ofth the depends on depends on third,..., her words, the i − th bit of the output sequence depends on theand ﬁrst, second, Parking Lot Test (PARKL), Minimum Distance Test (MDIST), ird,..., andj j−− of the input (where ≤ i ≤ j). th th bit bit of the input (where 0 ≤ i ≤0 j). 3Dspheres Test (3DS), Squeeze Test (SQEZ), Overlapping rd,..., and j − th bit of the input (where 0 ≤ i ≤ j). .5 LLL Lattice Basis Reduction Algorithm 5.5 LLL Lattice Basis Reduction 5.5 LLL Lattice Basis Reduction Algorithm 5 LLL Lattice Basis Reduction Algorithm Algorithm Sums Test (OSUM), Run Test (RUN) and Crap Test (CRAP). 5.5 LLL Lattice Basis Reduction Algorithm Results obtained from the 18 tests are summarized in Table 5 LLL Lattice Basis Reduction Algorithm Lenstra al. 
described a polynomial time algorithm called al. [1982] apolynomial polynomial timealgorithm algorithm called LLL-reduction Lenstra et al. [1982] aatime polynomial time algorithm called LLL-reduction Lenstra etdescribed al.[1982] [1982] described polynomial time algorithm called LLL-reduction nstra etetLattice al. [1982] described adescribed called LLL-reduction 5enstra LLL Basis Reduction Algorithm 1.− The p-value of each test shows that LTSC-128 passed → − → → − → − LLL-reduction algorithm, for transforming basis B = (b ,b ,...,b ) 1 2 n , b , ..., b ) of Lattice L into an LLLgorithm, for transforming basis b = (b , b , ..., b ) of Lattice L into an algorithm, for transforming basis b = (b , b , ..., b ) of Lattice L into an LLLLLLalgorithm, for transforming basis b = (b 1 2 n 1 2 n the tests conﬁdence of 99%, within the success range , b , ..., b ) of Lattice L into an LLLorithm, for transforming basis b = (b 1 2 n → 1 2time algorithm enstra et al.of[1982] described aLLL-reduced polynomial called LLL-reduction with `n = (b` 1,b` 2,..., Lattice L` into an basisalgorithm B b` n).− The ` =described `bb `b(`b`nb1). nstra et al. [1982] a polynomial time called LLL-reduction ` ` ` ` → ` ` ` ` ` ` , , . . . , The LLL-reduction algorithm is commonly educed basis b ( b 0.01