On the Computational Complexity of Minimal Cumulative Cost Graph Pebbling

arXiv:1609.04449v1 [cs.CR] 14 Sep 2016

Jeremiah Blocki∗

Samson Zhou†

September 16, 2016

Abstract We consider the computational complexity of ﬁnding a legal black pebbling of a DAG G = (V, E) with minimum cumulative cost. A black pebbling is a sequence P0 , . . . , Pt ⊆ V of sets of nodes whichSmust satisfy the following properties: P0 = ∅ (we start oﬀ with no pebbles on G), sinks(G) ⊆ j≤t Pj (every sink node was pebbled at some point) and parents Pi+1 \Pi ⊆ Pi (we can only place a new pebble on a node v if all of v’s parents had a pebble during the last round). The cumulative cost of a pebbling P0 , P1 , . . . , Pt ⊆ V is cc(P) = |P1 | + . . . + |Pt |. The cumulative pebbling cost is an especially important security metric for data-independent memory hard functions, an important primitive for password hashing. Thus, an eﬃcient (approximation) algorithm would be an invaluable tool for the cryptanalysis of password hash functions as it would provide an automated tool to establish tight bounds on the amortized space-time cost of computing the function. We show that such a tool is unlikely to exist. In particular, we prove the following results. • It is NP-Hard to ﬁnd a pebbling minimizing cumulative cost.

˜ • The natural linear program relaxation for the problem has integrality gap O(n), where n is the number of nodes in G. We conjecture that the problem is hard to approximate. • We show that a related problem, ﬁnd the minimum size subset S ⊆ V such that depth(G − S) ≤ d, is also NP-Hard. In fact, under the unique games conjecture there is no (2 − ǫ)approximation algorithm.

1

Introduction

Given a directed acyclic graph (DAG) G = (V, E) the goal of the (parallel) black pebbling game is to place pebbles on all sink nodes of G (not necessarily simultaneously). The game is played in rounds and we use Pi ⊆ V to denote the set of currently pebbled nodes on round i. Initially all nodes are unpebbled, P0 = ∅, and in each round i ≥ 1 we may only include v ∈ Pi if all of v’s parents were pebbled in the previous conﬁguration (parents(v) ⊆ Pi−1 ) or if v was already pebbled in the last round (v ∈ Pi−1 ). In the sequential pebbling game we can place at most one new pebble on the graph in any round (i.e., |Pi \Pi−1 | ≤ 1), but in the parallel pebbling game no such restriction applies. ∗

Department of Computer Science, Purdue University, West Lafayette, IN. Email: [email protected] Department of Computer Science, Purdue University, West Lafayette, IN. Email: [email protected] Research supported by NSF CCF-1649515. †

1

k

We deﬁne the cumulative cost (respectively space-time cost) of a pebbling P = (P1 , . . . , Pt ) ∈ PG to be cc(P) = |P1 | + . . . + |Pt | (resp. st(P) = t × max1≤i≤t |Pi |), that is, the sum of the number of k pebbles on the graph during every round. Here, PG (resp. PG ) denotes the set of all valid parallel k (resp. sequential) pebblings of G. The parallel cumulative pebbling cost of G, denoted Πcc (G) (resp. Πst (G) = minP∈PG st(G)), is the cumulative cost of the best legal pebbling of G. Formally, k

Πcc (G) = min cc(P) , and

Πst (G) = min st(P) . P∈PG

k

P∈PG k

In this paper, we consider the computational complexity of Πcc (G), showing that the value is k NP-Hard to compute. We also provide evidence that Πcc (G) is hard to approximate by demonstrating that the natural linear programming relaxation for the problem has a large integrality gap.

1.1

Motivation

The pebbling cost of a DAG G is closely related to the cryptanalysis of data-independent memory hard functions (iMHF) [AS15], a particularly useful primitive for password hashing [PHC, BDK16]. k In particular, an eﬃcient algorithm for (approximately) computing Πcc (G) would enable us to automate the cryptanalysis of candidate iMHFs. The question is particularly timely as the Internet Research Task Force considers standardizing Argon2i [BDK16], the winner of the password hashing competition [PHC], despite recent attacks [CGBS16, AB16a, AB16b] on the construction. Despite recent progress [AB16b, ABP16] the precise security of Argon2i and alternative constructions is poorly understood. An iMHF is deﬁned by a DAG G (modeling data-dependencies) on n nodes V = {1, . . . , n} and a compression function H (usually modeled as a random oracle in theoretical analysis)1 . The label ℓ1 of the ﬁrst node in the graph G is simply the hash H(x) of the input x. A vertex i > 1 with parents i1 < i2 < · · · < iδ has label ℓi (x) = H(i, ℓi1 (x), . . . , ℓiδ (x)). The output value is the label ℓn of the last node in G. It is easy to see that any legal pebbling of G corresponds to an algorithm computing the corresponding iMHF. Placing a new pebble on node i corresponds to computing the label ℓi and keeping (resp. discarding) a pebble on node i corresponds to storing the label in memory (resp. freeing memory). Alwen and Serbinenko proved that in the parallel random oracle model (pROM) of computation, any algorithm evaluating such an iMHF could be reduced to a pebbling strategy with (approximately) the same cumulative memory cost. It should be noted that any graph G on n nodes has a sequential pebbling strategy P ∈ PG that ﬁnishes in n rounds and has cost cc(P) ≤ st(P) ≤ n2 . Ideally, a good iMHF construction provides ˜ n2 ) even if the the guarantee that the amortized cost of computing the iMHF remains high (i.e., Ω adversary evaluates many instances (e.g.,diﬀerent password guesses) of the iMHF. Unfortunately, k neither large Πst (G) nor large minP∈P k st(P), are suﬃcient to guarantee that Πcc (G) is large [AS15]. G

More recently Alwen and Blocki [AB16a] showed that Argon2i [BDK16], the winner of the recently completed password hashing competition [PHC], has much lower than desired amortized space-time k ˜ n1.75 . complexity. In particular, Πcc (G) ≤ O

1 Because the data-dependencies in an iMHF are specified by a static graph, the induced memory access pattern does not depend on the secret input (e.g., password). This makes iMHFs resistant to side-channel attacks. Datadependent memory hard functions (MHFs) like scrypt [Per09] are potentially easier to construct, but they are potentially vulnerable to cache-timing attacks.

2

k

In the context of iMHFs, it is important to study Πcc (G), the cumulative pebbling cost of a graph G, in additional to Πst (G). Traditionally, pebbling strategies have been analyzed using space-time complexity or simply space complexity. While sequential space-time complexity may be a good model for the cost of computing a single-instance of an iMHF on a standard single-core machine (i.e., the costs incurred by the honest party during password authentication), it does not model the amortized costs of a (parallel) oﬄine adversary who obtains a password hash value and would like to evaluate the hash function on many diﬀerent inputs (e.g., password guesses) to crack k the user’s password [AS15, AB16a]. Unlike Πst (G), Πcc (G) models the amortized cost of evaluating a data-independent memory hard function on many instances [AS15, AB16a]. k An eﬃcient algorithm to (approximately) compute Πcc (G) would be an incredible asset when developing and evaluating iMHFs. For example, the Argon2i designers argued that the AlwenBlocki attack [AB16a] was not particularly eﬀective for practical values of n (e.g., n ≤ 220 ) because the constant overhead was too high [BDK16]. However, they could not rule out the possibility k that more eﬃcient attacks might exist2 . An eﬃcient algorithm to (approximately) compute Πcc (G) would have allowed us to immediately resolve such debates by automatically generating upper/lower bounds on the cost of computing Argon2i for each running time parameters (n) that one might select k n2 in practice. Alwen et al. [ABP16] showed how to construct graphs G with Πcc (G) = Ω log n . This construction is essentially optimal in theory as results of Alwen and Blocki [AB16a] imply 2 k log n that any constant indegree graph has Πcc (G) = O n log . However, the exact constants one log n k

−6

2

could obtain through a theoretical analysis are most-likely small. A proof that Πcc (G) ≥ 10log×n n would be an underwhelming security guarantee in practice, where we may have n ≈ 106 . An k eﬃcient algorithm to compute Πcc (G) would allow us to immediately determine whether these new constructions provide meaningful security guarantees in practice.

1.2

Results k

Our primary contribution is a proof that the decision problem “is Πcc (G) ≤ k for a positive integer ” is NP-Complete.3 In fact, our result holds even if the DAG G has constant indeg.4 k ≤ n(n+1) 2 k

We also provide evidence that Πcc (G) is hard to approximate. Thus, it is unlikely that it will be possible to automate the cryptanalysis process for iMHF candidates. In particular, we deﬁne a k its linear programming relaxation. We natural integer program to compute Πcc (G) and consider n then show that the integrality gap is at least Ω log n leading us to conjecture that it is hard to k

approximate Πcc (G). We also give an example of a DAG G on n nodes with the property that any k optimal pebbling (minimizing Πcc ) requires more than n pebbling rounds. The computational complexity of several graph pebbling problems has been explored previously in various settings [GLT80, HP10]. However, we stress that the structure of a pebbling minimizing cumulative cost can be very diﬀerent from the structure of a pebbling minimizing space-time cost or 2

Indeed, Alwen and Blocki [AB16b] subsequently introduced heuristics to improve their attack and demonstrated that their attacks were effective even for smaller (practical) values of n by simulating their attack against real Argon2i instances. k 3 Note that for any G with n nodes we have Πcc (G) ≤ 1 + 2 + . . . + n = n(n+1) since we can always pebble G in 2 topological order in n steps if we never remove pebbles. 4 For practical reasons most iMHF candidates are based on a DAG G with constant indegree.

3

space. Thus, we need fundamentally new ideas to construct appropriate gadgets for our reduction.5 We ﬁrst introduce a new problem which we call Bounded 2-Linear Covering (B2LC) and show that it is NP-Complete by reduction from 3-PARTITION. We then show that we can encode a B2LC instance as a graph pebbling problem. In Appendix B we also investigate the computational complexity of determining how “depthreducible” a DAG G is showing that the problem is NP-Complete even if G has constant indegree. A DAG G is (e, d)-reducible if we there exists a subset S ⊆ V of size |S| ≤ e such that depth(G−S) < d. That is, after removing nodes in the set S from G, any remaining directed path has length less than d. If G is not (e, d)-reducible, we say that it is (e, d)-depth robust. It is known that a ˜ n2 ) if and only if the graph is highly depth robust graph has high cumulative cost (e.g., Ω ˜ (n)) [AB16a, ABP16]. Our reduction from Vertex Cover preserves approximation (e.g., e, d = Ω 6 hardness. Thus, assuming that P 6= NP it is hard to 1.3-approximate e, the minimum size of a set S ⊆ such that depth(G − S) < d [DS05]. Under the Unique Games Conjecture [Kho02], it is hard to (2 − ǫ)-approximate e for any ﬁxed ǫ > 0 [KR08].

2

Preliminaries

Given a directed acyclic graph (DAG) G = (V, E) and a node v ∈ V we use parents(v) = {u : (u, v) ∈ E} to denote the set of nodes u with directed edges into node v and we use indeg(v) = |parents(v)| to denote the number of directed edges into node v. We use indeg(G) = maxv∈V indeg(v) to denote the maximum indegree of any node in G. For convenience, we use indeg instead of indeg(G) when G is clear from context. We say that a node v ∈ V with indeg(v) = 0 is a source node and a node with no outgoing edges is a sink node. We use sinks(G) (resp. sources(G)) to denote the set of all sink nodes (resp. source nodes) in G. We will use n = |V| to denote the number of nodes in a graph, and for convenience we will assume that the nodes V = {1, 2, 3, . . . , n} are given in topological order (i.e., 1 ≤ j < i ≤ n implies that (i, j) ∈ / E). We use depth(G) to denote the length of the longest directed path in G. Given a positive integer k ≥ 1 we will use [k] = {1, 2, . . . , k} to denote the set of all integers 1 to k (inclusive). Definition 1 Given a DAG G = (V, E) on n nodes a legal pebbling of G is a sequence of sets P = P0 , . . . , Pt such that: (1) P0 = ∅, (2) ∀i > 0, v ∈ Pi \Pi−1 we have parents(v) ⊆ Pi−1 , and (3) P ∀v ∈ sinks(G) ∃0 < j ≤ t such that v ∈ Pj . The cumulative cost of the pebbling P is cc(P) = ti=1 |Pi |, and the space-time cost is st(P) = t × max0 0 by a reduction from Cubic Vertex Cover, deﬁned below. Note that when d = 0 REDUCIBLEd is Vertex Cover. The decision problem VC (resp. CubicVC) is deﬁned as follows: Input: a graph G on n vertices (CubicVC: each with degree 3) and a positive integer k ≤ n2 . Output: Yes, if G has a vertex cover of size at most k; otherwise No. To show that minCC is NP-Complete we introduce a new decision problem B2LC. We will show that the decision problem B2LC is NP-Complete and we will give a reduction from B2LC to minCC. The decision problem Bounded 2-Linear Covering (B2LC) is deﬁned as follows: Input: n variables x1 , . . . , xn , k positive constants 0 ≤ c1 , . . . , ck , an integerPm ≤ k and k equations of the form xαi + ci = xβi , where αi , βi ∈ [n] and i ∈ [k]. We require that ki=1 ci ≤ p(n) for some ﬁxed polynomial n. Output: Yes, if we can ﬁnd mn values xy,z (for each 1 ≤ y ≤ m and 1 ≤ z ≤ n) such that for each i ∈ [k] there exists 1 ≤ y ≤ m such that xy,αi + ci = xy,βi (that is the assignment x1 , . . . , xn = xy,1 , . . . , xy,n satisﬁes the ith equation); otherwise No. To show that B2LC is NP-Complete we will reduce from the problem 3-PARTITION, which is known to be NP-Complete. The decision problem 3-PARTITION is deﬁned as follows: T T < xi < 2n Input: A multi-set S of m = 3n positive integers x1 , . . . , xm ≥ 1 such that (1) we have 4n for each 1 ≤ i ≤ m, where T = x1 +. . .+xm , and (2) we require that T ≤ p(n) for a ﬁxed polynomial p.7 P Output: Yes, if there is a partition of [m] into n subsets S1 , . . . , Sn such that j∈Si xj = nT for each 1 ≤ i ≤ n; otherwise No. T We may assume 4n < xi < terms, as described in [Dem14]. 7

T 2n

by taking any set of positive integers and adding a large fixed constant to all

5

Fact 3 [GJ75, Dem14] 3-PARTITION is NP-Complete.8

3

Related Work

The sequential black pebbling game was introduced by Hewitt and Paterson [HP70], and by Cook [Coo73]. It has been particularly useful in exploring space/time trade-oﬀs for various problems like matrix multiplication [Tom78], fast fourier transformations [SS78, Tom78], integer multiplication [SS79b] and many others [Cha73, SS79a]. In cryptography it has been used to construct/analyze proofs of space [DFKP15, RD16], proofs of work [DNW05, MMV13] and memorybound functions [DGN03] (functions that incur many expensive cache-misses [ABW03]). More recently, the black pebbling game has been used to analyze memory hard functions [AS15, AB16a].

3.1

Password Hashing and Memory Hard Functions

Users often select low-entropy passwords which are vulnerable to oﬄine attacks if an adversary obtains the cryptographic hash of the user’s password. Thus, it is desirable for a password hashing algorithm to involve a function f(.) which is moderately expensive to compute. The goal is to ensure that, even if an adversary obtains the value (username, f(pwd, salt), salt) (where salt is some randomly chosen value), it is prohibitively expensive to evaluate f(., salt) for millions (billions) of diﬀerent password guesses. PBKDF2 (Password Based Key Derivation Function 2) [Kal00] is a popular moderately hard function which iterates the underlying cryptographic hash function many times (e.g., 210 ). Unfortunately, PBKDF2 is insuﬃcient to protect against an adversary who can build customized hardware to evaluate the underlying hash function. The cost computing a hash function H like SHA256 or MD5 on an Application Speciﬁc Integrated Circuit (ASIC) is dramatically smaller than the cost of computing H on traditional hardware [NB+ 15]. [ABW03], observing that cache-misses are more egalitarian than computation, proposed the use of “memory-bound” functions for password hashing — a function which maximizes the number of expensive cache-misses. Percival [Per09] observed that memory costs tend to be stable across diﬀerent architectures and proposed the use of memory-hard functions (MHFs) for password hashing. Presently, there seems to be a consensus that memory hard functions are the ‘right tool’ for constructing moderately expensive functions. Indeed, all entrants in the password hashing competition claimed some form of memory hardness [PHC]. As the name suggests, the cost of computing a memory hard function is primarily memory related (storing/retrieving data values). Thus, the cost of computing the function cannot be signiﬁcantly reduced by constructing an ASIC. Percival [Per09] introduced a candidate memory hard function called scrypt, but scrypt is potentially vulnerable to side-channel attacks as its computation yields a memory access pattern that is datadependent (i.e., depends on the secret input/password). Due to the recently completed password hashing competition [PHC] we have many candidate data-dependent memory hard functions such as Catena [FLW13] and the winning contestant Argon2i-A [BDK15].9 8

The 3PARTITION problem is called P[3,1] in [GJ75]. The specification of Argon2i has changed several times. We use Argon2i-A to refer to the version of Argon2i from the password hashing competition, and we use Argon2i-B to refer to the version that is currently being considered for standardization by the Cryptography Form Research Group (CFRG) of the IRTF[BDKJ16]. 9

6

3.1.1

iMHFs and Graph Pebbling

All known candidate iMHFs can be described using a DAG G and a hash function H. Graph pebbling is a particularly useful as a tool to analyze the security of an iMHF [AS15, CGBS16, FLW13]. A pebbling of G naturally corresponds to an algorithm to compute the iMHF. Alwen and Serbinenko [AS15] showed that in the pROM model of computation, any algorithm to compute the iMHF corresponds to a pebbling of G. 3.1.2

Measuring Pebbling Costs

In the past, MHF analysis has focused on space-time complexity [Per09, FLW13, CGBS16]. For example, the designers of Catena [FLW13] showed that their DAG G had high sequential spacetime pebbling cost Πst (G) and Corigan-Gibbs et al. [CGBS16] showed that Argon2i-A and their own iMHF candidate iBH (“balloon hash”) have (parallel) space-time cost Ω n2 . Alwen and Serbinenko [AS15] observed that these guarantees are insuﬃcient for two reasons: (1) the adversary may be parallel, and (2) the adversary might amortize his costs over multiple iMHF instances (e.g., multiple password guesses). Indeed, there are now multiple known attacks on Catena [BK15, AS15, AB16a]. Alwen and Blocki [AB16a, AB16b] gave attacks on Argon2i-A, Argon2i-B, iBH, k and Catena with lower than desired amortized space-time cost — Πcc (G) ≤ O n1.8 for Argon2i-B, k ˜ n1.75 for Argon2i-A and iBH and Πkcc (G) ≤ O n5/3 for Catena. This motivates the Πcc (G) ≤ O k

need to study cumulative cost Πcc instead of space-time cost since amortized space-time complexity k approaches Πcc as the number of iMHF instances being computed increases. k n2 Alwen et al. [ABP16] recently constructed a constant indegree graph G with Πcc (G) = Ω log n . k

From a theoretical standpoint, this is essentially optimal as any constant indeg DAG has Πcc = 2 log n O n log [AB16a], but from a practical standpoint the critically important constants terms in log n the lower bound are not well understood.

3.2

Computational Complexity of Pebbling

The computational complexity of various graph pebbling has been explored previously in diﬀerent settings [GLT80, HP10]. Gilbert et al. [GLT80] focused on space-complexity of the black-pebbling game. Here, the goal is to ﬁnd a pebbling which minimizes the total number of pebbles on the graph at any point in time (intuitively this corresponds to minimizing the maximum space required during computation of the associated function). Gilbert et al. [GLT80] showed that this problem is PSPACE complete by reducing from the truly quantiﬁed boolean formula (TQBF) problem. We need diﬀerent tools to analyze the computational complexity of the problem of ﬁnding a pebbling with low cumulative cost. By contrast, observe that minCC ∈ NP because any DAG G with n nodes this algorithm has a pebbling P with cc(P) ≤ st(P) ≤ n2 . Thus, if we are minimizing cc or st cost, the optimal pebbling of G will trivially never require more than n2 steps. By contrast, the optimal (space-minimizing) pebbling of the graphs from the reduction of Gilbert et al. [GLT80] often require exponential time. In Appendix D, we show that the optimal pebbling from [GLT80] does take polynomial time if the TQBF formula only uses existential quantiﬁers (i.e., if we reduce from 3SAT). Thus, the reduction of Gilbert et al. [GLT80] can also be extended to show that it is NP-Complete to check whether a DAG G admits a pebbling P with st(P) ≤ k for some parameter k. The reduction, 7

which simply appends a long chain to the original graph, exploits the fact that that if we increase space-usage even temporarily we will dramatically increase st cost. However, this reduction does not extend to cumulative cost because the penalty for temporarily placing large number of pebbles can be quite small as we do not keep these pebbles on the graph for a long time.

4

NP-Hardness of minCC

In this section we prove that minCC is NP-Complete by reduction from B2LC. Suppose we are given an instance of B2LC. That is, we are given n variables x1 , . . . , xn and k equations of the form xα + ci = xβ , where α, β ∈ [n] and i ∈ [k], where ci are positive integers bounded by some polynomial in n. Our goal is to to decide if there exist a set of m < k variable assignments: xi,j 1 ≤ i ≤ m and 1 ≤ j ≤ n so that each equation xα + ci = xβ is satisﬁed by some assignment xi,1 , . . . , xi,n . We shall construct a graph GB2LC so that pebbling nodes corresponds to synchronized variables which are “oﬀset” by the summand in satisﬁed equations. P Namely, our graph consists of several components. Our ﬁrst gadget is a chain of length c = ci so that each node is connected to the previous node, and can only be pebbled if there exists a pebble on the previous node in the previous step, such as in Figure 1. We will use another gadget to ensure that, in any legal pebbling, we need to “walk” a pebble down each chain m diﬀerent times. Intuitively, each chain represents a variable xi and the time at which we start walking the jth pebble down the chain will correspond to the value of the variable xj in the th assignment. In any “non-cheating” pebbling strategy we will keep at most one pebble on each chain at any point in time. We could “cheat” by placing multiple pebbles along a chain representing a speciﬁc variable to satisfy the equation nodes. To discourage cheating, we create τ copies of the path gadget representing each variable. Let C1i , . . . , Cτi denote j,c j the τ chains representing variable xi , and vj,1 i , . . . , vi denote the nodes in chain Ci . xi : Fig. 1: A chain of length 5 for variable xi . For the jth equation xα + ci = xβ , create a chain Ej of length c − ci , so that the ath node in chain l,a+ci for all 1 ≤ l ≤ τ, as demonstrated in Figure 2. Ej has incoming edges from vertices vl,a α and vβ Intuitively, if the equation xα + ci = xβ is satisﬁed by the jth assignment then on the jth time we walk pebbles down the chain xα and xβ the pebbles on each chain will be synchronized (i.e., when th we have a pebble on vl,a α , the a link in the chain representing xα we will have a pebble on the node l,a+ci during the same round) so that we can pebble all of these new nodes. We create a single sink vβ node linked from each of the k equation chains, which can only be pebbled if all equation nodes are pebbled. Our ﬁnal gadget is a path of length cm so that each node is connected to the previous node. We create a path Mi of length cm for each variable xi , and for every node vip+qc in the path with position p + qc > 1, where 1 ≤ p ≤ c and 0 ≤ q < m − 1, we create an edge to the node from node th vl,p i , 1 ≤ l ≤ τ (that is, the p in all τ chains representing the variable). We connect the ﬁnal node in each of the n paths to the ﬁnal sink node. This forces any valid pebbling to walk along each path of length cm, in the process walking along each chain Cji , 1 ≤ j ≤ τ, at least m times. 8

Equation ej x2 : x1 : Fig. 2: A gadget for equation ej : x1 + 2 = x2 . Chain for xi :

Fig. 3: A path Mi of length 9 = cm representing m = 3 passes for variable xi , which can take c = 3 values. k Lemma 4 If the B2LC instance has a valid solution, then Πcc GB2LC ≤ τcmn + 2cmn + 2ckm + 1. k Lemma 5 If the B2LC instance does not have a valid solution, then Πcc GB2LC ≥ τcmn + τ.

We refer to the appendix for the formal proofs of Lemma 4 and Lemma 5. Intuitively, any solution to B2LC corresponds to m walks across the τn chains Cji , 1 ≤ i ≤ n, 1 ≤ j ≤ τ of length c. If the B2LC instance is satisﬁable then we can synchronize each of these walks so that we can pebble every equation chain Ej and path Mj along the way. Thus, the total cost is τcmn plus the cost to pebble the k equation chains Ej (≤ 2ckm), the cost to pebble the n paths Mj (≤ 2cmn) plus the cost to pebble the sink node 1. If the B2LC instance is not satisﬁed then we must adopt a “cheating” pebbling strategy. This means that for some i ≤ n the pebbling either (1) keeps at least 2 pebbles on each of the chains C1i , . . . , Cτi during some pebbling round, or (2) delay walking the pebble down the chains Cji , . . . , Cτi at some point (mid-walk) so that there are mc + 1 pebbling rounds with 1 pebble on each of these chains. In either case we pay at least τ(mc + 1) to pebble the chains C1i , . . . , Cτi and at least τmc(n − 1) to pebble the other chains Cji ′ with i ′ 6= i and 1 ≤ j ≤ τ. Thus, the cumulative cost is at least τcmn + τ. Theorem 6 minCC is NP-Complete. Proof : It suﬃces to show that there is a polynomial time reduction from B2LC to minCC since B2LC is NP-Complete (see Theorem 8). Given an instance P of B2LC, we create the corresponding graph G as described above. This is clearly achieved in polynomial time. By Lemma 4, if P has a k valid solution, then Πcc (G) ≤ τcmn + 2cmn + 2ckm + 1. On the other hand, by Lemma 5, if P k does not have a valid solution, then Πcc (G) ≥ τcmn + τ. Therefore, setting τ > 2cmn + 2ckm + 1 k (such as τ = 2cmn + 2ckm + 2) allows one to solve B2LC given an algorithm which outputs Πcc (G). ✷ Theorem 7 minCCδ is NP-Complete.

9

τ copies: Chain for x2 : Equation 2: x2 + 1 = x1 Sink Equation 1: x1 + 0 = x2 τ copies: Chain for x1 :

Fig. 4: An example of a complete reduction Note that the only possible nodes with indegree greater than two are the nodes in the equation gadgets E1 , . . . , Em , and the ﬁnal sink node. The equation gadgets can have indegree up to 2τ + 1, while the ﬁnal sink node has indegree n + m. We replace the incoming edges to each of these nodes with a binary tree, so that all vertices have indegree at most two. By changing τ appropriately, we can still distinguish between instances of B2LC using the output of minCCδ . We refer to the appendix for a sketch of the proof of Theorem 7. Theorem 8 B2LC is NP-Complete. We refer to the appendix for the proof of Theorem 8, where we show that there is a polynomial time reduction from 3-PARTITION to B2LC.

5

LP Relaxation has Large Integrality Gap

Let G = (V, E) be a DAG with maximum indegree δ and with V = {1, . . . , n}, where 1, 2, 3, . . . , n is topological ordering of V. We start with an integer program for DAG pebbling, taking Step 1a in Figure 5. Intuitively, xtv = 1 if we have a pebble on node v during round t. Thus, Constraint 1 says that we must have a pebble on the ﬁnal node at some point. Constraint 2 says that we do not start with any pebbles on G and Constraint 3 enforces the validity of the pebbling. That is, if v has parents we can only have a pebble on v in round t + 1 if either (1) v already had a pebble during round t, or (2) all of v’s parents had pebbles in round t. It is clear that the above Integer Program yields the optimal pebbling solution. If we want the optimal pebbling solution that terminates in exactly t∗ ≥ depth(G) steps then we can simply change Constraint 1 to say that xt∗ n ≥ 1. We would like to convert our Integer Program to a Linear Program. The natural relaxation is n t ˜ simply to allow 0 ≤ xv ≤ 1. However, we show that this LP has a large integrality GAP Ω log n even for DAGs G with constant indegree. In particular, 2 there exist DAGs with constant indegree k n ˜ δ for which the optimal pebbling has Πcc (G) = Ω log n [ABP16], but we will provide a fractional solution to the LP relaxation with cost O n . 10

(1) Variables: For 1 ≤ v ≤ n and 0 ≤ t ≤ n2 , (a) Integer Program: xtv ∈ {0, 1}

(b) Relaxed Linear Program: 0 ≤ xtv ≤ 1 (2) Goal (minimize cumulative pebbling cost): min (3) Constraint 1 (Must Finish):

Pn2

t t=0 xn

(4) Constraint 2 (No Pebbles At Start):

≥ 1. P

0 v>0 xv

P

v∈V

Pn2

t t=0 xv .

≤ 0.

(5) Constraint 3 (Pebbling Is Valid): For all v s.t |Parents(v)| ≥ 1 and 0 ≤ t ≤ n2 − 1 we have P t v ′ ∈Parents(v) xv ′ t+1 t xv ≤ xv + . |Parents(v)| Fig. 5: Integer Program for Pebbling. Theorem 9 Let G be with constant indegree δ. Then there is a fractional solution to our LP Relaxation (of the Integer Program in Figure 5) with cost at most 3n. The proof of Theorem 9 is in Appendix A, we sketch the proof here. We ﬁrst “fractionally” pebble G in topological order using fractional pebbles with weight n1 . Since there are n pebbles, this process requires n time steps. Once this process is complete, we may increase xtn by n1 for each t−1 = 1 . Thus, after n − 1 subsequent time step xt−1 v , since all of its parents v must have value xv n more time steps, we have xn = 1. We refer to the appendix for a formal proof. One tempting way to “ﬁx” the linear program is to require that the pebbling take at most n steps since the fractional assignment used to establish the integrality gap takes 2n steps. There are two issues with this approach: (1) It is not true in general that the optimal pebbling of G takes at most n steps. See Appendix C for a counter example and discussion. (2) For constant indegree DAGs we can give a fractional assignment that takes exactly n steps and costs O(n log n). Thus, ˜ the integrality gap is still Ω(n). Brieﬂy, in this assignment we set xii = 1 for i ≤ n and for i < t we 1 −dist(i,t+j)−j+2 : j ≥ 1} , where dist(x, y) is the length of the shortest path from set xti = max n , {2 x to y. In particular, if j = 1 (we want to place a ‘whole’ pebble on node j + t in the next round by setting xj+t j+t = 1 ) and node i is a parent of node t + j then we have dist(i, t + j) = 1 so we will have xj+t = 1 (e.g., a whole pebble on node i during the previous round). i

6

Conclusions

We initiate the study of the computational complexity of cumulative cost minimizing pebbling in the parallel black pebbling model. This problem is motivated by the urgent need to develop and analyze secure data-independent memory hard functions for password hashing. We show that it is NP-Hard to ﬁnd a parallel black pebbling minimizing cumulative cost, and we provide evidence that the problem is hard to approximate. Thus, it seems unlikely that we will be able to develop tools to automate the task of a cryptanalyst to obtain strong upper/lower bounds on the security 11

of a candidate iMHF. However, we cannot absolutely rule out the possibility that an eﬃcient approximation algorithm exists. The primary remaining challenge is to either give an eﬃcient αk k approximation algorithm to ﬁnd a pebbling P ∈ P k with cc(P) ≤ αΠcc (G) or show that Πcc (G) is hard to approximate. We believe that the problem oﬀers many interesting theoretical challenges and a solution could have very practical consequences for secure password hashing. It is our hope that this work encourages others in the TCS community to explore these questions.

References [AB16a]

Jo¨el Alwen and Jeremiah Blocki. Eﬃciently computing data-independent memory-hard functions. In Advances in Cryptology CRYPTO’16. Springer, 2016.

[AB16b]

Jo¨el Alwen and Jeremiah Blocki. Towards practical attacks on argon2i and balloon hashing. Cryptology ePrint Archive, Report 2016/759, 2016. http://eprint.iacr.org/2016/759.

[ABP16]

Jo¨el Alwen, Jeremiah Blocki, and Krzysztof Pietrzak. The pebbling complexity of depth-robust graphs. Cryptology ePrint Archive, Report 2016/, 2016. http://eprint.iacr.org/.

[ABW03] Mart´ın Abadi, Michael Burrows, and Ted Wobber. Moderately hard, memory-bound functions. In Proceedings of the Network and Distributed System Security Symposium, NDSS 2003, San Diego, California, USA, 2003. [AS15]

Jo¨el Alwen and Vladimir Serbinenko. High Parallel Complexity Graphs and MemoryHard Functions. In Proceedings of the Eleventh Annual ACM Symposium on Theory of Computing, STOC ’15, 2015. http://eprint.iacr.org/2014/238.

[BDK15]

Alex Biryukov, Daniel Dinu, and Dmitry Khovratovich. Fast and tradeoﬀ-resilient memory-hard functions for cryptocurrencies and password hashing. Cryptology ePrint Archive, Report 2015/430, 2015. http://eprint.iacr.org/2015/430.

[BDK16]

Alex Biryukov, Daniel Dinu, and Dmitry Khovratovich. Argon2 password hash. Version 1.3, 2016. https://www.cryptolux.org/images/0/0d/Argon2.pdf.

[BDKJ16] Alex Biryukov, Daniel Dinu, Dmitry Khovratovich, and Simon Josefsson. The memoryhard Argon2 password hash and proof-of-work function. Internet-Draft draft-irtf-cfrgargon2-00, Internet Engineering Task Force, March 2016. [BK15]

Alex Biryukov and Dmitry Khovratovich. Tradeoﬀ cryptanalysis of memory-hard functions. Cryptology ePrint Archive, Report 2015/227, 2015. http://eprint.iacr.org/.

[CGBS16] Henry Corrigan-Gibbs, Dan Boneh, and Stuart Schechter. Balloon hashing: Provably space-hard hash functions with data-independent access patterns. Cryptology ePrint Archive, Report 2016/027, Version: 20160601:225540, 2016. http://eprint.iacr.org/. [Cha73]

Ashok K. Chandra. Eﬃcient compilation of linear recursive programs. (FOCS), pages 16–25. IEEE Computer Society, 1973. 12

In SWAT

[Coo73]

Stephen A. Cook. An observation on time-storage trade oﬀ. In Proceedings of the Fifth Annual ACM Symposium on Theory of Computing, STOC ’73, pages 29–33, New York, NY, USA, 1973. ACM.

[Dem14]

Erik Demaine. 6.890 algorithmic lower bounds: http://ocw.mit.edu, September 2014.

Fun with hardness proofs.

[DFKP15] Stefan Dziembowski, Sebastian Faust, Vladimir Kolmogorov, and Krzysztof Pietrzak. Proofs of space. In Rosario Gennaro and Matthew J. B. Robshaw, editors, CRYPTO 2015, Part II, volume 9216 of LNCS, pages 585–605. Springer, Heidelberg, August 2015. [DGN03]

Cynthia Dwork, Andrew Goldberg, and Moni Naor. On memory-bound functions for ﬁghting spam. In Advances in Cryptology - CRYPTO 2003, 23rd Annual International Cryptology Conference, Santa Barbara, California, USA, August 17-21, 2003, Proceedings, volume 2729 of Lecture Notes in Computer Science, pages 426–444. Springer, 2003.

[DNW05] Cynthia Dwork, Moni Naor, and Hoeteck Wee. Pebbling and proofs of work. In Victor Shoup, editor, CRYPTO 2005, volume 3621 of LNCS, pages 37–54. Springer, Heidelberg, August 2005. [DS05]

Irit Dinur and Samuel Safra. On the hardness of approximating minimum vertex cover, 2005.

[EGS75]

Paul Erd¨os, Ronald L. Graham, and Endre Szemeredi. On sparse graphs with dense long paths. Technical report, Stanford, CA, USA, 1975.

[FLW13]

Christian Forler, Stefan Lucks, and Jakob Wenzel. Catena: A memory-consuming password scrambler. IACR Cryptology ePrint Archive, 2013:525, 2013.

[GJ75]

Michael R Garey and David S. Johnson. Complexity results for multiprocessor scheduling under resource constraints. SIAM Journal on Computing, 4(4):397–411, 1975.

[GLT80]

John R Gilbert, Thomas Lengauer, and Robert Endre Tarjan. The pebbling problem is complete in polynomial space. SIAM Journal on Computing, 9(3):513–524, 1980.

[HP70]

Carl E. Hewitt and Michael S. Paterson. Record of the project mac conference on concurrent systems and parallel computation. chapter Comparative Schematology, pages 119–127. ACM, New York, NY, USA, 1970.

[HP10]

Philipp Hertel and Toniann Pitassi. The pspace-completeness of black-white pebbling. SIAM Journal on Computing, 39(6):2622–2682, 2010.

[Kal00]

Burt Kaliski. Pkcs# 5: Password-based cryptography speciﬁcation version 2.0. 2000.

[Kho02]

Subhash Khot. On the power of unique 2-prover 1-round games. In Proceedings of the Thirty-Fourth annual ACM Symposium on Theory of Computing, pages 767–775. ACM, 2002.

[KR08]

Subhash Khot and Oded Regev. Vertex cover might be hard to approximate to within 2- ǫ, 2008. 13

[MMV13] Mohammad Mahmoody, Tal Moran, and Salil P. Vadhan. Publicly veriﬁable proofs of sequential work. In Robert D. Kleinberg, editor, ITCS 2013, pages 373–388. ACM, January 2013. [NB+ 15]

Arvind Narayanan, Joseph Bonneau, , Edward W Felten, Andrew Miller, and Steven Goldfeder. Bitcoin and Cryptocurrency Technology (manuscript). 2015. Retrieved 8/6/2015.

[Per09]

C. Percival. Stronger key derivation via sequential memory-hard functions. In BSDCan 2009, 2009.

[PHC]

Password hashing competition. https://password-hashing.net/.

[PR80]

Wolfgang J. Paul and R¨ udiger Reischuk. On alternation II. A graph theoretic approach to determinism versus nondeterminism. Acta Inf., 14:391–403, 1980.

[RD16]

Ling Ren and Srinivas Devadas. Proof of space from stacked bipartite graphs. Cryptology ePrint Archive, Report 2016/333, 2016. http://eprint.iacr.org/.

[Sch82]

Georg Schnitger. A family of graphs with expensive depth reduction. Theor. Comput. Sci., 18:89–93, 1982.

[Sch83]

Georg Schnitger. On depth-reduction and grates. In 24th Annual Symposium on Foundations of Computer Science, Tucson, Arizona, USA, 7-9 November 1983, pages 323–328. IEEE Computer Society, 1983.

[SS78]

John E. Savage and Sowmitri Swamy. Space-time trade-oﬀs on the ﬀt algorithm. IEEE Transactions on Information Theory, 24(5):563–568, 1978.

[SS79a]

John E. Savage and Sowmitri Swamy. Space-time tradeoﬀs for oblivious interger multiplications. In Hermann A. Maurer, editor, ICALP, volume 71 of Lecture Notes in Computer Science, pages 498–504. Springer, 1979.

[SS79b]

Sowmitri Swamy and John E. Savage. Space-time tradeoﬀs for linear recursion. In Alfred V. Aho, Stephen N. Zilles, and Barry K. Rosen, editors, POPL, pages 135–142. ACM Press, 1979.

[Tom78]

Martin Tompa. Time-space tradeoﬀs for computing functions, using connectivity properties of their circuits. In Proceedings of the Tenth Annual ACM Symposium on Theory of Computing, STOC ’78, pages 196–204, New York, NY, USA, 1978. ACM.

A

Missing Proofs

k Reminder of Lemma 4. If the B2LC instance has a valid solution, then Πcc GB2LC ≤ τcmn + 2cmn + 2ckm + 1. Proof of Lemma 4. Suppose the given instance of B2LC has a valid solution, {xi,j }. Recall that a pebble must pass m times through each of the τ chains of length c representing each variable. For a set k, let xk,j1 ≤ xk,j2 ≤ . . . ≤ xk,jn . We start the kth pass through the τ chains by placing a pebble 14

on C1jn , . . . , Cτjn , the τ chains representing variable jn . At each subsequent time step, we move the existing pebbles to the next node along the chain. When the pebbles on chains C1jn , . . . , Cτjn reach the (xk,jn − xk,jn−1 + 1)th nodes, we place a pebble on C1jn−1 , . . . , Cτjn−1 , the τ chains representing variable jn−1 . We continue this process by moving existing pebbles to the next node along each chain for each subsequent time step. When pebbles on chains C1jl , . . . , Cτjl reach the (xk,jl − xk,jl−1 + 1)th nodes, we place a pebble on C1jl−1 , . . . , Cτjl−1 , the τ chains representing variable jl−1 . Thus, the gadgets E1 , . . . , Em which are satisﬁed by xk,1 , . . . , xk,m+1 can be pebbled during this pass, since by construction, we oﬀset the positions of the pebbles by the appropriate distances. When a pebble reaches the end of its chain, we remove the pebble. Hence, we see that each chain has c time steps with pebbles, and each of those steps needs only one pebble. There are n variables, τ chains representing each variable, and m passes through each chain. Across all variable chains, the cumulative complexity is τcmn since there are n variables and τ chains representing each variable. Similarly, making m walks through the paths C1j , . . . , Cτj allows us to pebble the path Mj (j ≤ n). These paths each have length cm and we keep at most one pebble on Mj at any point in time. There may be a delay of up to c time steps between consecutive walks through the paths C1j , . . . , Cτj during which we cannot progress our pebble through the path Mj . However, there are at most 2cm total steps in which we have a pebble on Mj . Thus, the cumulative complexity across all paths M1 , ..., Mn is at most 2cmn. Likewise, since each equation gadget Ej is represented by a chain, one pebble at each time step suﬃces for each of these paths. We may have to keep a pebble on the gadget Ej while we make m walks through the paths, but we keep this pebble on Ej for at most 2cm steps (each walk takes c steps and the delay between consecutive walks is at most c steps). Since there are k equations, and there is a chain for each equation, the cumulative complexity across all equation chains is at most 2ckm. There is one ﬁnal sink node, so the cumulative complexity for an instance of B2LC with a valid solution is at most τcmn + 2cmn + 2ckm + 1. k Reminder of Lemma 5. If the B2LC instance does not have a valid solution, then Πcc GB2LC ≥ τcmn + τ. Proof of Lemma 5. Suppose the given instance of B2LC does not have a valid solution. We ﬁrst observe that in any legal pebbling P = (P0 , . . . , Pt ) ∈ P k we must walk a pebble down each of the paths M1 , . . . , Mn of length cm. Let tzi denote the ﬁrst time step in which we place a pebble on vzi — the z’th node on path Mi . Clearly, tzi < tz+1 for z < cm and at time tzi − 1 we must have a i 1,y pebble on all nodes in parents vzi . In particular, vi , . . . , vτ,y ∈ Ptz −1 where y − 1 = z mod c so i that the pebbling conﬁguration at time tz − 1 includes at least one pebble on each of the chains C1i , . . . , Cni . There are n variables xi , τ chains Cji of length c representing each variable xi , and m passes through each chain. Thus, across all variable chains, the cumulative complexity is at least τcmn, and any “cheating” pebbling has cumulative complexity at least τcmn + τ. Recall that a “cheating” pebbling either has (1) one step in which we have at least 2 pebbles on each of the chains C1i , . . . , Cτi for some i ≤ n, or (2) cm + 1 time steps in which we have a pebble on each of the chains C1i , . . . , Cτi for some i ≤ n. Any non-cheating pebbling corresponds to a valid B2LC solution. If an equation ej is not satisﬁed by the solution then there is no legal way to pebble the equation chain Ej without cheating because at no point during the m walks are both variables in the equation oﬀset by the correct amount. Thus, any instance of B2LC which does not have a valid solution requires at least τcmn + τ cumulative complexity. 15

Reminder of Theorem 7. minCCδ is NP-Complete. Proof of Theorem 7. (sketch) We sketch the construction for δ = 2. Although we maintain τ chains representing each variable, we can no longer maintain the gadgets for the equation, E1 , . . . , Em , for which each vertex has indegree 2τ + 1, one edge from its predecessor in the gadget, and 2τ edges from the chains representing the variables involved in the equation. Instead, in place of the 2τ edges, we construct a binary tree, where the bottom layer of the tree has at least τ2 nodes. Each of the τ2 connects to a separate instance of the chains representing the variables involved in the equation. Thus, if the equation involves variables xi and xj , then each of the τ2 nodes will have an edge from one of the τ chains representing xi , and an edge from one of the τ chains representing xj , oﬀset by an appropriate amount. This construction ensures that the root of the tree is pebbled only if all equations are satisﬁed, but any “dishonest” walk along the chains will require at least τ additional pebbles. Similarly, we replace the 2τ edges in the paths P1 , . . . , Pn of length cm with τ binary trees with base 2 . Finally, we replace the m + n incident edges to the sink node with a binary tree with base m+n . Since each of these terms are polynomial in c, m, n, there exists a τ 2 that is also polynomial in c, m, n which allows us to distinguish between honest walks and dishonest walks. As a result, we can decide between instances of B2LC. Reminder of Theorem 8. B2LC is NP-Complete. ProofPof Theorem 8. First, sort S so that S = {s1 , s2 , . . . , sm } and si ≤ sj for any i ≤ j. Let 2 T= m i=1 sm . We then create mn = 3n equations: x1 + s1 = x2 ,

x2 + s2 = x3 ,

...,

xm + sm = xm+1 ,

x1 + 0 = x2 ,

x2 + 0 = x3 ,

...,

xm + 0 = xm+1 ,

x1 + T = x2 ,

x2 + T = x3 ,

...,

xm + T = xm+1 ,

x1 + 2T = x2 ,

x2 + 2T = x3 ,

...,

xm + 2T = xm+1 ,

x1 + (n − 2)T = x2 ,

x2 + (n − 2)T = x3 ,

...,

xm + (n − 2)T = xm+1 ,

Finally, we create the additional n equations: x1 +

T + 3(i − 1)(n − 2)T = xm+1 , n

(1)

for i ∈ [n]. This gives a total of 3n2 + n equations so that the reduction is clearly achieved in polynomial time. Recall that B2LC is true if and only if there exist {ai,j }, i ∈ [n], j ∈ [m] so that each equation xi + ci,j = xj is satisﬁed by the assigning xi = ai,k and xj = aj,k for some k. We show that there exists a solution to 3-PARTITION if and only if there exists a solution to B2LC. Suppose there exists a partition of S into n triplets S1 , S2 , . . . , Sn so that the sum of the integers in each triplet is the same, and equals nT . We set a1,1 = 0 and for each si that appears in S1 , we take the equation, a1,i + si = a1,i+1 . Otherwise, if si does not appear in S1 , we take the equation a1,i + 0 = a1,i+1 . Suppose that ai,j are deﬁned for all i < k. Then we set ak,1 = 0 and for each si that appears in Sk , we take the equation, ak,i + si = ak,i+1 . If si appears in Si for some i < k, then we take the equation ak,i + (i − 2)T = ak,i+1 . Otherwise, if si does not appear in S1 , . . . , Sk , we take the equation ak,i + (i − 1)T = ak,i+1 . Thus, to get ai,m+1 from ai,1 , we add in three elements whose sum is nT . We then add in 3(i − 1) instances of (i − 2)T , for each of the elements which appear in S1 , . . . , Si−1 . Finally, we add in 16

(3n − 3i) instances of (i − 1)T . Hence, it follows that ai,1 +

T + 3(i − 1)(i − 2)T + (3n − 3i)(i − 1)T = ai,m+1 n T ai,1 + + (3in − 3n − 6i + 6)T = ai,m+1 n T ai,1 + + 3(i − 1)(n − 2)T = ai,m+1 , n

so that all equations in 1 are satisﬁed. Thus, a solution for 3-PARTITION yields a solution for B2LC. Suppose there exists a solution for the above instance of B2LC. Without loss of generality, assume each equation in 1, x1 + nT + 3(i − 1)(n − 2)T = xm+1 holds for ai,1 , ai,m+1 , since two diﬀerent equations clearly cannot hold for the same ai,1 , ai,m+1 . We observe that ai,1 + nT ≡ ai,m+1 (mod T ) for all i. Hence, to obtain a1,m+1 from a1,1 , we must take three equations of the form xi + si = xi+1 since the summing less than three elements in S is less than nT , while the summing more than three T and elements in S is more than nT but less than T + nT (as each element of S is greater than 4n T less than 2n and the sum of all elements in S is T ). Say that these three equations use the terms si1 , si2 , si3 . Then we let S1 = {si1 , si2 , si3 }, which indeed sums to nT . Similarly, to obtain ak,m+1 from ak,1 , we must take three equations of the form xi + si = xi+1 . Since there are 3n equations of this form which must be satisﬁed and each of the n assignments of the form ai,1 , . . . , ai,m+1 (where 1 ≤ i ≤ n) uses exactly three of these equations, then each assignment of ai,1 , . . . , ai,m+1 must satisfy a disjoint triplet of the 3n equations. Say that the three satisﬁed equations for ai,1 , . . . , ai,m+1 are sk1 , sk2 , sk3 . Then we let Sk = {sk1 , sk2 , sk3 }, which indeed sums to nT . Therefore, we have a partition of S into triplets, which each sum to nT , as desired. Thus, a solution for B2LC yields a solution for 3-PARTITION. Reminder of Theorem 9. Let G be with constant indegree δ. Then there is a fractional solution to our LP Relaxation (of the Integer Program in Figure 5) with cost at most 3n. Proof of Theorem 9. In particular, for all time steps t ≤ n, we set xti = n1 for all i ≤ t, and xti = 0 for i > t. For time steps n < t ≤ 2n, we set xtn = t−n+1 and xtv = n1 for v 6= n. We ﬁrst argue n that this is a feasible solution. Trivially, Constraints 1 and 2 are satisﬁed. Note that if xt+1 = n1 , v t P xv ′ then xtu = n1 for all u < v, so certainly v ′ ∈Parents(v) |Parents(v)| = t−n = n1 . Moreover, xt−1 n n for n < t ≤ 2n implies that setting xtn = xt−1 n +

X v ′ ∈Parents(v)

xtv ′ t−n 1 t−n+1 = + = |Parents(v)| n n n

is valid. Therefore, Constraint is satisﬁed. P 3P Finally, we claim that v∈V t≤2n xtv ≤ 3n. To see this, note that for every round t ≤ n, we have xtv ≤ n1 for all v ∈ V. Thus, XX v∈V t≤n

xtv ≤

XX 1 X ≤ 1 = n. n v∈V t≤n

17

v∈V

For time steps n < t ≤ 2n, note that xtn ≤ 1 and xtv = n1 for all v 6= n. Thus, X X X n−1 t xv ≤ 1+ ≤ 2n. n t≤n

v∈V n 0 [KR08], under the Unique Games Conjecture [Kho02]. We also produce a reduction from Cubic Vertex Cover to show REDUCIBLEd is NP-Complete even when the input graph has bounded indegree. Theorem 10 REDUCIBLEd is NP-Complete. k

10

Alwen et al. [ABP16] also gave tighter upper and lower bounds on Πcc (G) for the Argon2i-A, iBH and Catena k k iMHFs. For example, Πcc (G) = Ω n1.66 and Πcc (G) = O n1.71 for a random Argon2i-A DAG G (whp).

18

Proof : We provide a reduction from VC to REDUCIBLEd . Given an instance G = (V, E) of VC, arbitrarily label the vertices 1, . . . , n, where n = |V|, and direct each edge of E so that 1, . . . , n is a topological ordering of the nodes. For each node i we add two directed paths: (1) a path of length i − 1 with an edge from the last node on the path to node i, (2) a path of length n − i with an edge from node i to the ﬁrst node of the path. To avoid abuse of notation, let U represent the original n vertices (from the given instance of VC) in the modiﬁed graph. We claim VC has a vertex cover of size at most k if and only if the resulting graph is (k, n)reducible. Indeed, if there exists a vertex cover of size k, we remove the corresponding k vertices in the resulting construction. Then there are no edges connecting vertices of U. By construction, any path of length n must contain at least two vertices of U. Hence, the resulting graph is (k, n)reducible. On the other hand, suppose that there is no vertex cover of size k. Given a set S of k = |S| vertices we say that a node u ∈ U is “untouched” by S if u ∈ / S and S does not contain any vertex from the chain(s) we connected to node u. If there is no vertex cover of size k, then removing any set S of k = |S| vertices from the graph leaves an edge (u, v) with the following properties: (1) u, v ∈ U, (2) u and v are both untouched by S. Suppose u has label i. Then the (untouched) directed path which ends at u has length i − 1. Similarly, there still exists some directed path of length ≥ n − i which begins at v. Thus, there exists a path of length at least n, so the resulting graph is not (k, n)-reducible. ✷ Theorem 11 Even for bounded indegree graphs, REDUCIBLEd is NP-Complete. Proof : We provide a reduction from CubicVC to REDUCIBLEd , using graphs of bounded indegree. Given an instance of CubicVC, partition the vertices of G = (V, E) into four color classes: c1 , c2 , c3 , c4 ⊆ V such that no edge in E is incident to two vertices of the same color. Orient each edge in G so that an edge points from a vertex colored ci to a vertex colored cj only if j > i. Let Vi be the vertices of V which are colored with color ci . For each vertex v ∈ V1 , add a new directed path of length n2 and add a directed edge from the last vertex of this path to v. Thus, each v ∈ V1 has indegree 1. For each vertex v ∈ V2 , create a new directed path of length n4 and add a directed edge from the last vertex of this path to v. Thus, each v has indegree ≤ 4. Furthermore, for each v ∈ V2 create a new directed path of length n2 and add a directed edge from v ∈ V2 to the ﬁrst vertex in this path. For each vertex v ∈ V3 , create a new directed path of length n8 and add an edge from the last vertex in this path to v, so that v has indegree ≤ 4. Moreover, for each v ∈ V3 create a new directed path of length 3n 4 and add a directed edge from v to the ﬁrst vertex in this path. Finally, for each v ∈ V4 create a new directed path of length 7n 8 and add a directed edge from v ∈ V4 to the ﬁrst vertex in this path — observe that v has indegree ≤ 3. Clearly, the resulting structure is a DAG which can be constructed in polynomial time. We claim CubicVC has a vertex cover of size at most k if and only if the above construction is (k, n)-reducible. Indeed, if there exists a vertex cover of size k, we remove the corresponding k vertices in the resulting construction. Then there are no edges connecting vertices of diﬀerent colors. By construction, any path of length n must contain at least two vertices of diﬀerent colors. Hence, the resulting graph is (k, n)-reducible. On the other hand, suppose that there is no vertex cover of size k. Given a set S of k = |S| vertices we say that a node u ∈ V1 ∪ V2 ∪ V3 ∪ V4 is “untouched” by S if u ∈ / S and S does not contain any vertex from the chain(s) we connected to node u. If there is no vertex cover of 19

size k, then removing any set S of k = |S| vertices from the graph leaves an edge (u, v) with the following properties: (1) u, v ∈ V and have diﬀerent color classes, (2) u and v are both untouched by S. Suppose u has color i. Then the (untouched) directed path which ends at u has length 2ni . Similarly, there still exists some directed path of length ≥ n − 2ni which begins at v. Thus, there exists a path of length at least n, so the resulting graph is not (k, n)-reducible. ✷

C

On Pebbling Time and Cumulative Cost

In this section we address the following question: is is necessarily case that an optimal pebbling of G (minimizing cumulative cost) takes n steps? For example, the pebbling attacks of Alwen and Blocki [AB16a, AB16b] on depth-reducible graphs such as Argon2i-A, Argon2i-B, Catena all take exactly n steps. Thus, it seems natural to conjecture that the optimal pebbling of G always ﬁnishes in depth(G) steps. In this section we provide a concrete example of an DAG G (with n nodes and depth(G) = n) with the property that any optimal pebbling of G must take more than n steps. More formally, let Πk (G, t) denote the set of all legal pebblings of G that take at most t pebbling k rounds. We prove that minP∈Πk (G,t) cc(P) > minP∈Πk (G) cc(P) = Πcc (G). Theorem 12 There exists a graph G with n = 16 nodes and depth(G) = n s.t. 28 27 .

minP∈Πk (G,t) cc(P) k

Πcc (G)

≥

Proof : Consider the following DAG G on 16 nodes {1, . . . , 16} with the following directed edges (1) (i, i + 1) for 1 ≤ i < 16, (2) (i, i + 9) for 1 ≤ i ≤ 5 and (3) (i, i + 7) for 8 ≤ i ≤ 9. We ﬁrst show in Claim 13 that there is a pebbling P with cost 27 that takes 18 rounds. Theorem 12 then follows from Claim 14 where we show that any P ∈ Πk (G, 16) has cumulative cost at least 28. Intuitively, any legal pebbling must either delay before pebbling nodes 15 and 16 or during rounds 15 − i (for 0 ≤ i ≤ 5) we must have at least one pebble on some nodes in the set {9, 8, ..., 9 − i}. k

Claim 13 For the above DAG G we have Πcc (G) ≤ 27. Proof : Consider the following pebbling: P0 = ∅, P1 = {1}, P2 = {2}, P3 = {3}, P4 = {4}, P5 = {5}, P6 = {6}, P7 = {7}, P{ 8} = {8}, P9 = {1, 9}, P10 = {2, 10}, P11 = {3, 11}, P12 = {4, 12}, P13 = {5, 13}, P14 = {6, 14}, P15 = {7, 14}, P16 = {8, 14}, P17 = {9, 15}, P18 = {16}. It is easy to verify that cc(P) = 9 + 2 ∗ 9 = 27 since there are 9 steps in which we have one pebble on G and 9 steps in which we have two pebbles on G. ✷ Claim 14 For the above DAG G we have minP∈Πk (G,16) cc(P) ≥ 28. Proof : Let P = (P1 , . . . , P16 ) ∈ Πk (G, 16) be given. Clearly, i ∈ Pi for each round 1 ≤ i ≤ 16. To pebble nodes 15 and 16 on steps 15 and 16 we must have 9 ∈ P15 and 8 ∈ P14 . By induction, this means that P14−i ∩ {8, ..., 8 − i} 6= ∅ for each i > 0. To pebble node 9 + i at time 9 + i we require that i ∈ P8+i for each 1 ≤ i ≤ 5. These observations imply that |P9 | ≥ 3, |P10 | ≥ 3, . . ., |P13 | ≥ 3. We also have |P15 | ≥ 2 and |P14 | ≥ 2. We also have |Pi | ≥ 1 for all 1 ≤ i ≤ 16. The cost of rounds 1–8 and round 16 is at least 9. The cost of rounds 9–13 is at least 15 and the cost of rounds 14 and 15 is at least 4. Thus, cc(P) ≥ 28. ✷ 20

✷ k

Open Questions: Theorem 12 shows that Πcc (G) can be smaller than minP∈Πk (G,t) cc(P), but how large can this gap be in general? Can we prove upper/lower bounds on the ratio: for any n node DAG G? Is it true that

minP∈Πk (G,t) cc(P) k

Πcc (G)

minP∈Πk (G,t) cc(P) k

Πcc (G)

≤ c for some constant c? If not does this

hold for n node DAGs G with constant indegree? Is it true that the optimal pebbling of G always takes at most cn steps for some constant c?

D

NP-Hardness of minST

Recall that the space-time complexity of a graph pebbling is deﬁned as st(P) = t×max1≤i≤t |Pi |. We deﬁne minST and minSST based on whether the graph pebbling is parallel or sequential. Formally, the decision problem minST is deﬁned as follows: Input: a DAG G on n nodes and an integer k < n(n + 1)/2. Output: Yes, if minP∈P k st(P) ≤ k; otherwise No. G

The decision problem minSST is deﬁned as follows: Input: a DAG G on n nodes and an integer k < n(n + 1)/2. Output: Yes, if Πst (G) ≤ k; otherwise No. Gilbert et al.[GLT80] provide a construction from any instance of TQBF to a DAG GTQBF with pebbling number 3n + 3 if and only if the instance is satisﬁable. Here, the pebbling number of a DAG G is minP=(P1 ,...,Pt )∈P k maxi≤t |Pi | is the number of pebbles necessary to pebble G. An important gadget in their construction is the so-called pyramid DAG. We use a triangle with the number k inside to denote a k-pyramid (see Figure 6 for an example of a 3-pyramid). The key property of these DAGs is that any legal pebbling P = (P0 , . . . , Pt ) ∈ P k (Pyramidk ) of a k-pyramid requires at least mini |Pi | ≥ k pebbles on the DAG at some point in time. Another gadget, which appears in Figure 7, is the existential quantiﬁer gadget, which requires that si , si − 1, and si − 2 pebbles must be placed in each of the pyramids to ultimately pebble qi . Remark: We note that [GLT80] focused on sequential pebblings (P ∈ P) in their analysis, but their analysis extends to parallel pebblings (P ∈ P k ) as well.

≡ k

Fig. 6: A 3-Pyramid. We observe that any instance of TQBF where each quantiﬁer is an existential quantiﬁer requires at most a quadratic number of pebbling moves. Speciﬁcally, we look at instances of 3-SAT, such 21

qi

qi

xi

xi

xi

xi′

xi′

xi′ si

si − 1

xi xi′ si

si − 1

si − 2

si − 2

qi+1

qi+1

Fig. 7: An existential quantiﬁer, with xi set to true in the left ﬁgure and xi set to false in the right ﬁgure. as in Figure 8. In such a graph representing an instance of 3-SAT, the sink node to be pebbled is qn . By design of the construction, any true statement requires exactly three pebbles for each pyramid representing a clause. On the other hand, a false clause requires four pebbles, so that false statements require more pebbles. Thus, by providing extraneous additions to the construction which force the number of pebbling moves to be a known constant, we can extract the pebbling number, given the space-time complexity. For more details, see the full description in [GLT80]. Lemma 15 [GLT80] The quantified Boolean formula Q1 x1 Q2 x2 · · · Qn xn Fn is true if and only if the corresponding DAG GTQBF has pebbling number 3n + 3. Lemma 16 Suppose that we have a satisfiable TQBF formula Q1 x1 Q2 x2 · · · Qn xn Fn with Qi = ∃ for all i ≤ n. Then there is a legal sequential pebbling P = (P0 , . . . , Pt ) ∈ P GTQBF of the corresponding DAG GTQBF from [GLT80] with t ≤ 6n2 +33n pebbling moves and maxi≤t |Pi | ≤ 3n+3. Proof : We describe the pebbling strategy of Gilbert et al. [GLT80], and analyze the pebbling time of their strategy. Let T (i) be the time it takes to pebble qi in the proposed construction for any instance with i variables, i clauses, and only existential quantiﬁers. Suppose that xi is allowed to be true for the existential quantiﬁer Qi = ∃. Then vertex xi′ is pebbled using si moves, where si = 3n − 3i + 6. Similarly, vertices di and fi are pebbled using si − 1 and si − 2 moves respectively. Additionally, fi is moved to xi′ and then xi is moved to xi′ in the following step, for a total of two more moves. We then pebble qi+1 using T (i + 1) moves and ﬁnish by placing a pebble on xi and moving it to ci , bi , ai , and qi , for ﬁve more moves. Finally, we use six more moves to pebble an additional clause. Thus, in this case, Ttrue (i) = si + (si − 1) + (si − 2) + 13 + T (i + 1). On the other hand, if xi is allowed to be false for the existential quantiﬁer Qi = ∃, then ﬁrst we pebble xi′ , di , and fi sequentially, using si , si − 1, and si − 2 moves respectively. We then move 22

the pebble from fi to xi′ and then to xi , for a total of two more moves. We then pebble qi+1 using T (i + 1) moves. The pebble on qi+1 is subsequently moved to ci and then bi , using two more moves. Picking up all pebbles except those on bi and xi′ , and using them to pebble fi takes si − 2 more moves. Additionally, the pebble on fi is moved to xi′ and then ai , while the pebble on xi′ is moved to xi and then qi , for four more moves. Finally, we use six more moves to pebble an additional clause. In total, Tfalse (i) = si + (si − 1) + (si − 2) + (si − 2) + 14 + T (i + 1). Therefore, T (i) ≤ 4si + 10 + T (i + 1),

where si = 3n − 3i + 6. Thus,

T (i) ≤ 12(n − i) + 34 + T (i + 1). Writing R(i) = T (n − i) then gives

Pn

R(i) ≤ 12i + 34 + R(i − 1),

so that R(n) ≤ i=1 (12i + 34) = 6n2 + 40n. Hence, it takes at most 6n2 + 40n moves to pebble the given construction for any instance of TQBF which only includes existential quantiﬁers. ✷ Theorem 17 minST is NP-Complete. Proof : We provide a reduction from 3-SAT to minST. Given an instance I of 3-SAT with at most n clauses or variables, we create the corresponding graph from [GLT80]. Additionally, we append a chain of length 300n3 + 6n2 + 40n + 100 to the graph with an edge from the sink node qn from [GLT80] to the ﬁrst node in our chain. By adding a chain of length 300n3 + 6n2 + 40n + 100 we can ensure that for any legal pebbling P = (P0 , . . . , Pt ) ∈ P k (G) of our graph G we have t ≥ 300n3 + 6n2 + 40n + 100. By Lemma 16, at most 6n2 + 40n moves are necessary to pebble the 3−SAT portion of the graph, while the chain requires exactly 300n3 +6n2 +40n+100 moves. Thus, if I is satisﬁable then we can ﬁnd a legal pebbling P = (P0 , . . . , Pt ) with space maxi≤t |Pi | ≤ 3n + 3 and t ≤ 300n3 + 12n2 + 80n + 100 moves. First, pebble the sink qn in t ′ = 6n2 + 40n steps and maxi≤t ′ |Pi | ≤ 3n + 3 space by Lemma 16 and then walk single pebble down the chain in 300n3 + 6n2 + 40n + 10 steps keeping at most one pebble on the DAG in each step. The space time cost is at most st(P) ≤ 900n4 +936n3 +276n2 +540n+300. If I is not satisﬁable then, by Lemma 15 for any legal pebbling P = (P0 , . . . , Pt ) ∈ P k (G) of our graph we have maxi≤t |Pi | ≥ 3n + 4 and t ≥ 300n3 + 6n2 + 40n + 100. Thus, st(P) ≥ 900n4 + 1218n3 + 144n2 + 460n + 400 we have 900n4 + 1218n3 + 144n2 + 460n + 400 − 900n4 + 936n3 + 276n2 + 540n + 300 > 0 . for all n > 0. Thus, I is satisﬁable if and only if there exists a legal pebbling with st(P) ≤ 900n4 + 936n3 + 276n2 + 540n + 300. Clearly, this reduction can be done in polynomial time, and so there is indeed a polynomial time reduction from 3-SAT to minST. ✷

We note that the same reduction from TQBF to minST fails, since there exist instances of TQBF where the pebbling time is 2Ω(n) . However, the same relationships do hold for sequential pebbling. The proof of Theorem 18 is the same as the proof of Theorem 17 because we can exploit the fact that the pebbling of GTQBF from Lemma 16 is sequential. Theorem 18 minSST is NP-Complete. 23

Why Doesn’t the Gilbert et al. Reduction Work for Cumulative Cost? The construction from [GLT80] is designed to minimize the number of pebbles simultaneously on the graph, at the expense of larger number of necessary time steps and/or a larger average number of pebbles on the graph during each pebbling round. As a result, one can bypass several time steps by simply adding additional pebbles during some time step. For example, it may be beneﬁcial to temporarily keep repebbling the si − 2-pyramid pebbles on all three nodes xi , xi and xi′ at times so that we can avoid si later. Also if we do not need the pebble on node xi for the next 2 steps then it is better to discard any pebbles on xi′ and xi entirely to reduce cumulative cost. Because we would need to repebble the si -pyramid later our maximum space usage will increase, but our cumulative cost would decrease. Our reduction to space-time cost works because st cost is highly sensitive to an increase in the number of pebbles on the graph even if this increase is temporary. Cumulative, unlike space cost or space-time cost, is not very sensitive to such temporary increases in the number of pebbles on the graph.

24

q1 : Sink x1

14

x1

15

q2

13

p0

x2

x2 p1

11

12

q3

10 x3

8

x3

9

q4

p2 = q5

7 x4

5

x4

6 4

Fig. 8: Graph GTQBF for ∃x1 , x2 , x3 , x4 s.t. (x1 ∨ x2 ∨ x4 ) ∧ (x2 ∨ x3 ∨ x4 ). 25

arXiv:1609.04449v1 [cs.CR] 14 Sep 2016

Jeremiah Blocki∗

Samson Zhou†

September 16, 2016

Abstract We consider the computational complexity of ﬁnding a legal black pebbling of a DAG G = (V, E) with minimum cumulative cost. A black pebbling is a sequence P0 , . . . , Pt ⊆ V of sets of nodes whichSmust satisfy the following properties: P0 = ∅ (we start oﬀ with no pebbles on G), sinks(G) ⊆ j≤t Pj (every sink node was pebbled at some point) and parents Pi+1 \Pi ⊆ Pi (we can only place a new pebble on a node v if all of v’s parents had a pebble during the last round). The cumulative cost of a pebbling P0 , P1 , . . . , Pt ⊆ V is cc(P) = |P1 | + . . . + |Pt |. The cumulative pebbling cost is an especially important security metric for data-independent memory hard functions, an important primitive for password hashing. Thus, an eﬃcient (approximation) algorithm would be an invaluable tool for the cryptanalysis of password hash functions as it would provide an automated tool to establish tight bounds on the amortized space-time cost of computing the function. We show that such a tool is unlikely to exist. In particular, we prove the following results. • It is NP-Hard to ﬁnd a pebbling minimizing cumulative cost.

˜ • The natural linear program relaxation for the problem has integrality gap O(n), where n is the number of nodes in G. We conjecture that the problem is hard to approximate. • We show that a related problem, ﬁnd the minimum size subset S ⊆ V such that depth(G − S) ≤ d, is also NP-Hard. In fact, under the unique games conjecture there is no (2 − ǫ)approximation algorithm.

1

Introduction

Given a directed acyclic graph (DAG) G = (V, E) the goal of the (parallel) black pebbling game is to place pebbles on all sink nodes of G (not necessarily simultaneously). The game is played in rounds and we use Pi ⊆ V to denote the set of currently pebbled nodes on round i. Initially all nodes are unpebbled, P0 = ∅, and in each round i ≥ 1 we may only include v ∈ Pi if all of v’s parents were pebbled in the previous conﬁguration (parents(v) ⊆ Pi−1 ) or if v was already pebbled in the last round (v ∈ Pi−1 ). In the sequential pebbling game we can place at most one new pebble on the graph in any round (i.e., |Pi \Pi−1 | ≤ 1), but in the parallel pebbling game no such restriction applies. ∗

Department of Computer Science, Purdue University, West Lafayette, IN. Email: [email protected] Department of Computer Science, Purdue University, West Lafayette, IN. Email: [email protected] Research supported by NSF CCF-1649515. †

1

k

We deﬁne the cumulative cost (respectively space-time cost) of a pebbling P = (P1 , . . . , Pt ) ∈ PG to be cc(P) = |P1 | + . . . + |Pt | (resp. st(P) = t × max1≤i≤t |Pi |), that is, the sum of the number of k pebbles on the graph during every round. Here, PG (resp. PG ) denotes the set of all valid parallel k (resp. sequential) pebblings of G. The parallel cumulative pebbling cost of G, denoted Πcc (G) (resp. Πst (G) = minP∈PG st(G)), is the cumulative cost of the best legal pebbling of G. Formally, k

Πcc (G) = min cc(P) , and

Πst (G) = min st(P) . P∈PG

k

P∈PG k

In this paper, we consider the computational complexity of Πcc (G), showing that the value is k NP-Hard to compute. We also provide evidence that Πcc (G) is hard to approximate by demonstrating that the natural linear programming relaxation for the problem has a large integrality gap.

1.1

Motivation

The pebbling cost of a DAG G is closely related to the cryptanalysis of data-independent memory hard functions (iMHF) [AS15], a particularly useful primitive for password hashing [PHC, BDK16]. k In particular, an eﬃcient algorithm for (approximately) computing Πcc (G) would enable us to automate the cryptanalysis of candidate iMHFs. The question is particularly timely as the Internet Research Task Force considers standardizing Argon2i [BDK16], the winner of the password hashing competition [PHC], despite recent attacks [CGBS16, AB16a, AB16b] on the construction. Despite recent progress [AB16b, ABP16] the precise security of Argon2i and alternative constructions is poorly understood. An iMHF is deﬁned by a DAG G (modeling data-dependencies) on n nodes V = {1, . . . , n} and a compression function H (usually modeled as a random oracle in theoretical analysis)1 . The label ℓ1 of the ﬁrst node in the graph G is simply the hash H(x) of the input x. A vertex i > 1 with parents i1 < i2 < · · · < iδ has label ℓi (x) = H(i, ℓi1 (x), . . . , ℓiδ (x)). The output value is the label ℓn of the last node in G. It is easy to see that any legal pebbling of G corresponds to an algorithm computing the corresponding iMHF. Placing a new pebble on node i corresponds to computing the label ℓi and keeping (resp. discarding) a pebble on node i corresponds to storing the label in memory (resp. freeing memory). Alwen and Serbinenko proved that in the parallel random oracle model (pROM) of computation, any algorithm evaluating such an iMHF could be reduced to a pebbling strategy with (approximately) the same cumulative memory cost. It should be noted that any graph G on n nodes has a sequential pebbling strategy P ∈ PG that ﬁnishes in n rounds and has cost cc(P) ≤ st(P) ≤ n2 . Ideally, a good iMHF construction provides ˜ n2 ) even if the the guarantee that the amortized cost of computing the iMHF remains high (i.e., Ω adversary evaluates many instances (e.g.,diﬀerent password guesses) of the iMHF. Unfortunately, k neither large Πst (G) nor large minP∈P k st(P), are suﬃcient to guarantee that Πcc (G) is large [AS15]. G

More recently Alwen and Blocki [AB16a] showed that Argon2i [BDK16], the winner of the recently completed password hashing competition [PHC], has much lower than desired amortized space-time k ˜ n1.75 . complexity. In particular, Πcc (G) ≤ O

1 Because the data-dependencies in an iMHF are specified by a static graph, the induced memory access pattern does not depend on the secret input (e.g., password). This makes iMHFs resistant to side-channel attacks. Datadependent memory hard functions (MHFs) like scrypt [Per09] are potentially easier to construct, but they are potentially vulnerable to cache-timing attacks.

2

k

In the context of iMHFs, it is important to study Πcc (G), the cumulative pebbling cost of a graph G, in additional to Πst (G). Traditionally, pebbling strategies have been analyzed using space-time complexity or simply space complexity. While sequential space-time complexity may be a good model for the cost of computing a single-instance of an iMHF on a standard single-core machine (i.e., the costs incurred by the honest party during password authentication), it does not model the amortized costs of a (parallel) oﬄine adversary who obtains a password hash value and would like to evaluate the hash function on many diﬀerent inputs (e.g., password guesses) to crack k the user’s password [AS15, AB16a]. Unlike Πst (G), Πcc (G) models the amortized cost of evaluating a data-independent memory hard function on many instances [AS15, AB16a]. k An eﬃcient algorithm to (approximately) compute Πcc (G) would be an incredible asset when developing and evaluating iMHFs. For example, the Argon2i designers argued that the AlwenBlocki attack [AB16a] was not particularly eﬀective for practical values of n (e.g., n ≤ 220 ) because the constant overhead was too high [BDK16]. However, they could not rule out the possibility k that more eﬃcient attacks might exist2 . An eﬃcient algorithm to (approximately) compute Πcc (G) would have allowed us to immediately resolve such debates by automatically generating upper/lower bounds on the cost of computing Argon2i for each running time parameters (n) that one might select k n2 in practice. Alwen et al. [ABP16] showed how to construct graphs G with Πcc (G) = Ω log n . This construction is essentially optimal in theory as results of Alwen and Blocki [AB16a] imply 2 k log n that any constant indegree graph has Πcc (G) = O n log . However, the exact constants one log n k

−6

2

could obtain through a theoretical analysis are most-likely small. A proof that Πcc (G) ≥ 10log×n n would be an underwhelming security guarantee in practice, where we may have n ≈ 106 . An k eﬃcient algorithm to compute Πcc (G) would allow us to immediately determine whether these new constructions provide meaningful security guarantees in practice.

1.2

Results k

Our primary contribution is a proof that the decision problem “is Πcc (G) ≤ k for a positive integer ” is NP-Complete.3 In fact, our result holds even if the DAG G has constant indeg.4 k ≤ n(n+1) 2 k

We also provide evidence that Πcc (G) is hard to approximate. Thus, it is unlikely that it will be possible to automate the cryptanalysis process for iMHF candidates. In particular, we deﬁne a k its linear programming relaxation. We natural integer program to compute Πcc (G) and consider n then show that the integrality gap is at least Ω log n leading us to conjecture that it is hard to k

approximate Πcc (G). We also give an example of a DAG G on n nodes with the property that any k optimal pebbling (minimizing Πcc ) requires more than n pebbling rounds. The computational complexity of several graph pebbling problems has been explored previously in various settings [GLT80, HP10]. However, we stress that the structure of a pebbling minimizing cumulative cost can be very diﬀerent from the structure of a pebbling minimizing space-time cost or 2

Indeed, Alwen and Blocki [AB16b] subsequently introduced heuristics to improve their attack and demonstrated that their attacks were effective even for smaller (practical) values of n by simulating their attack against real Argon2i instances. k 3 Note that for any G with n nodes we have Πcc (G) ≤ 1 + 2 + . . . + n = n(n+1) since we can always pebble G in 2 topological order in n steps if we never remove pebbles. 4 For practical reasons most iMHF candidates are based on a DAG G with constant indegree.

3

space. Thus, we need fundamentally new ideas to construct appropriate gadgets for our reduction.5 We ﬁrst introduce a new problem which we call Bounded 2-Linear Covering (B2LC) and show that it is NP-Complete by reduction from 3-PARTITION. We then show that we can encode a B2LC instance as a graph pebbling problem. In Appendix B we also investigate the computational complexity of determining how “depthreducible” a DAG G is showing that the problem is NP-Complete even if G has constant indegree. A DAG G is (e, d)-reducible if we there exists a subset S ⊆ V of size |S| ≤ e such that depth(G−S) < d. That is, after removing nodes in the set S from G, any remaining directed path has length less than d. If G is not (e, d)-reducible, we say that it is (e, d)-depth robust. It is known that a ˜ n2 ) if and only if the graph is highly depth robust graph has high cumulative cost (e.g., Ω ˜ (n)) [AB16a, ABP16]. Our reduction from Vertex Cover preserves approximation (e.g., e, d = Ω 6 hardness. Thus, assuming that P 6= NP it is hard to 1.3-approximate e, the minimum size of a set S ⊆ such that depth(G − S) < d [DS05]. Under the Unique Games Conjecture [Kho02], it is hard to (2 − ǫ)-approximate e for any ﬁxed ǫ > 0 [KR08].

2

Preliminaries

Given a directed acyclic graph (DAG) G = (V, E) and a node v ∈ V we use parents(v) = {u : (u, v) ∈ E} to denote the set of nodes u with directed edges into node v and we use indeg(v) = |parents(v)| to denote the number of directed edges into node v. We use indeg(G) = maxv∈V indeg(v) to denote the maximum indegree of any node in G. For convenience, we use indeg instead of indeg(G) when G is clear from context. We say that a node v ∈ V with indeg(v) = 0 is a source node and a node with no outgoing edges is a sink node. We use sinks(G) (resp. sources(G)) to denote the set of all sink nodes (resp. source nodes) in G. We will use n = |V| to denote the number of nodes in a graph, and for convenience we will assume that the nodes V = {1, 2, 3, . . . , n} are given in topological order (i.e., 1 ≤ j < i ≤ n implies that (i, j) ∈ / E). We use depth(G) to denote the length of the longest directed path in G. Given a positive integer k ≥ 1 we will use [k] = {1, 2, . . . , k} to denote the set of all integers 1 to k (inclusive). Definition 1 Given a DAG G = (V, E) on n nodes a legal pebbling of G is a sequence of sets P = P0 , . . . , Pt such that: (1) P0 = ∅, (2) ∀i > 0, v ∈ Pi \Pi−1 we have parents(v) ⊆ Pi−1 , and (3) P ∀v ∈ sinks(G) ∃0 < j ≤ t such that v ∈ Pj . The cumulative cost of the pebbling P is cc(P) = ti=1 |Pi |, and the space-time cost is st(P) = t × max0 0 by a reduction from Cubic Vertex Cover, deﬁned below. Note that when d = 0 REDUCIBLEd is Vertex Cover. The decision problem VC (resp. CubicVC) is deﬁned as follows: Input: a graph G on n vertices (CubicVC: each with degree 3) and a positive integer k ≤ n2 . Output: Yes, if G has a vertex cover of size at most k; otherwise No. To show that minCC is NP-Complete we introduce a new decision problem B2LC. We will show that the decision problem B2LC is NP-Complete and we will give a reduction from B2LC to minCC. The decision problem Bounded 2-Linear Covering (B2LC) is deﬁned as follows: Input: n variables x1 , . . . , xn , k positive constants 0 ≤ c1 , . . . , ck , an integerPm ≤ k and k equations of the form xαi + ci = xβi , where αi , βi ∈ [n] and i ∈ [k]. We require that ki=1 ci ≤ p(n) for some ﬁxed polynomial n. Output: Yes, if we can ﬁnd mn values xy,z (for each 1 ≤ y ≤ m and 1 ≤ z ≤ n) such that for each i ∈ [k] there exists 1 ≤ y ≤ m such that xy,αi + ci = xy,βi (that is the assignment x1 , . . . , xn = xy,1 , . . . , xy,n satisﬁes the ith equation); otherwise No. To show that B2LC is NP-Complete we will reduce from the problem 3-PARTITION, which is known to be NP-Complete. The decision problem 3-PARTITION is deﬁned as follows: T T < xi < 2n Input: A multi-set S of m = 3n positive integers x1 , . . . , xm ≥ 1 such that (1) we have 4n for each 1 ≤ i ≤ m, where T = x1 +. . .+xm , and (2) we require that T ≤ p(n) for a ﬁxed polynomial p.7 P Output: Yes, if there is a partition of [m] into n subsets S1 , . . . , Sn such that j∈Si xj = nT for each 1 ≤ i ≤ n; otherwise No. T We may assume 4n < xi < terms, as described in [Dem14]. 7

T 2n

by taking any set of positive integers and adding a large fixed constant to all

5

Fact 3 [GJ75, Dem14] 3-PARTITION is NP-Complete.8

3

Related Work

The sequential black pebbling game was introduced by Hewitt and Paterson [HP70], and by Cook [Coo73]. It has been particularly useful in exploring space/time trade-oﬀs for various problems like matrix multiplication [Tom78], fast fourier transformations [SS78, Tom78], integer multiplication [SS79b] and many others [Cha73, SS79a]. In cryptography it has been used to construct/analyze proofs of space [DFKP15, RD16], proofs of work [DNW05, MMV13] and memorybound functions [DGN03] (functions that incur many expensive cache-misses [ABW03]). More recently, the black pebbling game has been used to analyze memory hard functions [AS15, AB16a].

3.1

Password Hashing and Memory Hard Functions

Users often select low-entropy passwords which are vulnerable to oﬄine attacks if an adversary obtains the cryptographic hash of the user’s password. Thus, it is desirable for a password hashing algorithm to involve a function f(.) which is moderately expensive to compute. The goal is to ensure that, even if an adversary obtains the value (username, f(pwd, salt), salt) (where salt is some randomly chosen value), it is prohibitively expensive to evaluate f(., salt) for millions (billions) of diﬀerent password guesses. PBKDF2 (Password Based Key Derivation Function 2) [Kal00] is a popular moderately hard function which iterates the underlying cryptographic hash function many times (e.g., 210 ). Unfortunately, PBKDF2 is insuﬃcient to protect against an adversary who can build customized hardware to evaluate the underlying hash function. The cost computing a hash function H like SHA256 or MD5 on an Application Speciﬁc Integrated Circuit (ASIC) is dramatically smaller than the cost of computing H on traditional hardware [NB+ 15]. [ABW03], observing that cache-misses are more egalitarian than computation, proposed the use of “memory-bound” functions for password hashing — a function which maximizes the number of expensive cache-misses. Percival [Per09] observed that memory costs tend to be stable across diﬀerent architectures and proposed the use of memory-hard functions (MHFs) for password hashing. Presently, there seems to be a consensus that memory hard functions are the ‘right tool’ for constructing moderately expensive functions. Indeed, all entrants in the password hashing competition claimed some form of memory hardness [PHC]. As the name suggests, the cost of computing a memory hard function is primarily memory related (storing/retrieving data values). Thus, the cost of computing the function cannot be signiﬁcantly reduced by constructing an ASIC. Percival [Per09] introduced a candidate memory hard function called scrypt, but scrypt is potentially vulnerable to side-channel attacks as its computation yields a memory access pattern that is datadependent (i.e., depends on the secret input/password). Due to the recently completed password hashing competition [PHC] we have many candidate data-dependent memory hard functions such as Catena [FLW13] and the winning contestant Argon2i-A [BDK15].9 8

The 3PARTITION problem is called P[3,1] in [GJ75]. The specification of Argon2i has changed several times. We use Argon2i-A to refer to the version of Argon2i from the password hashing competition, and we use Argon2i-B to refer to the version that is currently being considered for standardization by the Cryptography Form Research Group (CFRG) of the IRTF[BDKJ16]. 9

6

3.1.1

iMHFs and Graph Pebbling

All known candidate iMHFs can be described using a DAG G and a hash function H. Graph pebbling is a particularly useful as a tool to analyze the security of an iMHF [AS15, CGBS16, FLW13]. A pebbling of G naturally corresponds to an algorithm to compute the iMHF. Alwen and Serbinenko [AS15] showed that in the pROM model of computation, any algorithm to compute the iMHF corresponds to a pebbling of G. 3.1.2

Measuring Pebbling Costs

In the past, MHF analysis has focused on space-time complexity [Per09, FLW13, CGBS16]. For example, the designers of Catena [FLW13] showed that their DAG G had high sequential spacetime pebbling cost Πst (G) and Corigan-Gibbs et al. [CGBS16] showed that Argon2i-A and their own iMHF candidate iBH (“balloon hash”) have (parallel) space-time cost Ω n2 . Alwen and Serbinenko [AS15] observed that these guarantees are insuﬃcient for two reasons: (1) the adversary may be parallel, and (2) the adversary might amortize his costs over multiple iMHF instances (e.g., multiple password guesses). Indeed, there are now multiple known attacks on Catena [BK15, AS15, AB16a]. Alwen and Blocki [AB16a, AB16b] gave attacks on Argon2i-A, Argon2i-B, iBH, k and Catena with lower than desired amortized space-time cost — Πcc (G) ≤ O n1.8 for Argon2i-B, k ˜ n1.75 for Argon2i-A and iBH and Πkcc (G) ≤ O n5/3 for Catena. This motivates the Πcc (G) ≤ O k

need to study cumulative cost Πcc instead of space-time cost since amortized space-time complexity k approaches Πcc as the number of iMHF instances being computed increases. k n2 Alwen et al. [ABP16] recently constructed a constant indegree graph G with Πcc (G) = Ω log n . k

From a theoretical standpoint, this is essentially optimal as any constant indeg DAG has Πcc = 2 log n O n log [AB16a], but from a practical standpoint the critically important constants terms in log n the lower bound are not well understood.

3.2

Computational Complexity of Pebbling

The computational complexity of various graph pebbling has been explored previously in diﬀerent settings [GLT80, HP10]. Gilbert et al. [GLT80] focused on space-complexity of the black-pebbling game. Here, the goal is to ﬁnd a pebbling which minimizes the total number of pebbles on the graph at any point in time (intuitively this corresponds to minimizing the maximum space required during computation of the associated function). Gilbert et al. [GLT80] showed that this problem is PSPACE complete by reducing from the truly quantiﬁed boolean formula (TQBF) problem. We need diﬀerent tools to analyze the computational complexity of the problem of ﬁnding a pebbling with low cumulative cost. By contrast, observe that minCC ∈ NP because any DAG G with n nodes this algorithm has a pebbling P with cc(P) ≤ st(P) ≤ n2 . Thus, if we are minimizing cc or st cost, the optimal pebbling of G will trivially never require more than n2 steps. By contrast, the optimal (space-minimizing) pebbling of the graphs from the reduction of Gilbert et al. [GLT80] often require exponential time. In Appendix D, we show that the optimal pebbling from [GLT80] does take polynomial time if the TQBF formula only uses existential quantiﬁers (i.e., if we reduce from 3SAT). Thus, the reduction of Gilbert et al. [GLT80] can also be extended to show that it is NP-Complete to check whether a DAG G admits a pebbling P with st(P) ≤ k for some parameter k. The reduction, 7

which simply appends a long chain to the original graph, exploits the fact that that if we increase space-usage even temporarily we will dramatically increase st cost. However, this reduction does not extend to cumulative cost because the penalty for temporarily placing large number of pebbles can be quite small as we do not keep these pebbles on the graph for a long time.

4

NP-Hardness of minCC

In this section we prove that minCC is NP-Complete by reduction from B2LC. Suppose we are given an instance of B2LC. That is, we are given n variables x1 , . . . , xn and k equations of the form xα + ci = xβ , where α, β ∈ [n] and i ∈ [k], where ci are positive integers bounded by some polynomial in n. Our goal is to to decide if there exist a set of m < k variable assignments: xi,j 1 ≤ i ≤ m and 1 ≤ j ≤ n so that each equation xα + ci = xβ is satisﬁed by some assignment xi,1 , . . . , xi,n . We shall construct a graph GB2LC so that pebbling nodes corresponds to synchronized variables which are “oﬀset” by the summand in satisﬁed equations. P Namely, our graph consists of several components. Our ﬁrst gadget is a chain of length c = ci so that each node is connected to the previous node, and can only be pebbled if there exists a pebble on the previous node in the previous step, such as in Figure 1. We will use another gadget to ensure that, in any legal pebbling, we need to “walk” a pebble down each chain m diﬀerent times. Intuitively, each chain represents a variable xi and the time at which we start walking the jth pebble down the chain will correspond to the value of the variable xj in the th assignment. In any “non-cheating” pebbling strategy we will keep at most one pebble on each chain at any point in time. We could “cheat” by placing multiple pebbles along a chain representing a speciﬁc variable to satisfy the equation nodes. To discourage cheating, we create τ copies of the path gadget representing each variable. Let C1i , . . . , Cτi denote j,c j the τ chains representing variable xi , and vj,1 i , . . . , vi denote the nodes in chain Ci . xi : Fig. 1: A chain of length 5 for variable xi . For the jth equation xα + ci = xβ , create a chain Ej of length c − ci , so that the ath node in chain l,a+ci for all 1 ≤ l ≤ τ, as demonstrated in Figure 2. Ej has incoming edges from vertices vl,a α and vβ Intuitively, if the equation xα + ci = xβ is satisﬁed by the jth assignment then on the jth time we walk pebbles down the chain xα and xβ the pebbles on each chain will be synchronized (i.e., when th we have a pebble on vl,a α , the a link in the chain representing xα we will have a pebble on the node l,a+ci during the same round) so that we can pebble all of these new nodes. We create a single sink vβ node linked from each of the k equation chains, which can only be pebbled if all equation nodes are pebbled. Our ﬁnal gadget is a path of length cm so that each node is connected to the previous node. We create a path Mi of length cm for each variable xi , and for every node vip+qc in the path with position p + qc > 1, where 1 ≤ p ≤ c and 0 ≤ q < m − 1, we create an edge to the node from node th vl,p i , 1 ≤ l ≤ τ (that is, the p in all τ chains representing the variable). We connect the ﬁnal node in each of the n paths to the ﬁnal sink node. This forces any valid pebbling to walk along each path of length cm, in the process walking along each chain Cji , 1 ≤ j ≤ τ, at least m times. 8

Equation ej x2 : x1 : Fig. 2: A gadget for equation ej : x1 + 2 = x2 . Chain for xi :

Fig. 3: A path Mi of length 9 = cm representing m = 3 passes for variable xi , which can take c = 3 values. k Lemma 4 If the B2LC instance has a valid solution, then Πcc GB2LC ≤ τcmn + 2cmn + 2ckm + 1. k Lemma 5 If the B2LC instance does not have a valid solution, then Πcc GB2LC ≥ τcmn + τ.

We refer to the appendix for the formal proofs of Lemma 4 and Lemma 5. Intuitively, any solution to B2LC corresponds to m walks across the τn chains Cji , 1 ≤ i ≤ n, 1 ≤ j ≤ τ of length c. If the B2LC instance is satisﬁable then we can synchronize each of these walks so that we can pebble every equation chain Ej and path Mj along the way. Thus, the total cost is τcmn plus the cost to pebble the k equation chains Ej (≤ 2ckm), the cost to pebble the n paths Mj (≤ 2cmn) plus the cost to pebble the sink node 1. If the B2LC instance is not satisﬁed then we must adopt a “cheating” pebbling strategy. This means that for some i ≤ n the pebbling either (1) keeps at least 2 pebbles on each of the chains C1i , . . . , Cτi during some pebbling round, or (2) delay walking the pebble down the chains Cji , . . . , Cτi at some point (mid-walk) so that there are mc + 1 pebbling rounds with 1 pebble on each of these chains. In either case we pay at least τ(mc + 1) to pebble the chains C1i , . . . , Cτi and at least τmc(n − 1) to pebble the other chains Cji ′ with i ′ 6= i and 1 ≤ j ≤ τ. Thus, the cumulative cost is at least τcmn + τ. Theorem 6 minCC is NP-Complete. Proof : It suﬃces to show that there is a polynomial time reduction from B2LC to minCC since B2LC is NP-Complete (see Theorem 8). Given an instance P of B2LC, we create the corresponding graph G as described above. This is clearly achieved in polynomial time. By Lemma 4, if P has a k valid solution, then Πcc (G) ≤ τcmn + 2cmn + 2ckm + 1. On the other hand, by Lemma 5, if P k does not have a valid solution, then Πcc (G) ≥ τcmn + τ. Therefore, setting τ > 2cmn + 2ckm + 1 k (such as τ = 2cmn + 2ckm + 2) allows one to solve B2LC given an algorithm which outputs Πcc (G). ✷ Theorem 7 minCCδ is NP-Complete.

9

τ copies: Chain for x2 : Equation 2: x2 + 1 = x1 Sink Equation 1: x1 + 0 = x2 τ copies: Chain for x1 :

Fig. 4: An example of a complete reduction Note that the only possible nodes with indegree greater than two are the nodes in the equation gadgets E1 , . . . , Em , and the ﬁnal sink node. The equation gadgets can have indegree up to 2τ + 1, while the ﬁnal sink node has indegree n + m. We replace the incoming edges to each of these nodes with a binary tree, so that all vertices have indegree at most two. By changing τ appropriately, we can still distinguish between instances of B2LC using the output of minCCδ . We refer to the appendix for a sketch of the proof of Theorem 7. Theorem 8 B2LC is NP-Complete. We refer to the appendix for the proof of Theorem 8, where we show that there is a polynomial time reduction from 3-PARTITION to B2LC.

5

LP Relaxation has Large Integrality Gap

Let G = (V, E) be a DAG with maximum indegree δ and with V = {1, . . . , n}, where 1, 2, 3, . . . , n is topological ordering of V. We start with an integer program for DAG pebbling, taking Step 1a in Figure 5. Intuitively, xtv = 1 if we have a pebble on node v during round t. Thus, Constraint 1 says that we must have a pebble on the ﬁnal node at some point. Constraint 2 says that we do not start with any pebbles on G and Constraint 3 enforces the validity of the pebbling. That is, if v has parents we can only have a pebble on v in round t + 1 if either (1) v already had a pebble during round t, or (2) all of v’s parents had pebbles in round t. It is clear that the above Integer Program yields the optimal pebbling solution. If we want the optimal pebbling solution that terminates in exactly t∗ ≥ depth(G) steps then we can simply change Constraint 1 to say that xt∗ n ≥ 1. We would like to convert our Integer Program to a Linear Program. The natural relaxation is n t ˜ simply to allow 0 ≤ xv ≤ 1. However, we show that this LP has a large integrality GAP Ω log n even for DAGs G with constant indegree. In particular, 2 there exist DAGs with constant indegree k n ˜ δ for which the optimal pebbling has Πcc (G) = Ω log n [ABP16], but we will provide a fractional solution to the LP relaxation with cost O n . 10

(1) Variables: For 1 ≤ v ≤ n and 0 ≤ t ≤ n2 , (a) Integer Program: xtv ∈ {0, 1}

(b) Relaxed Linear Program: 0 ≤ xtv ≤ 1 (2) Goal (minimize cumulative pebbling cost): min (3) Constraint 1 (Must Finish):

Pn2

t t=0 xn

(4) Constraint 2 (No Pebbles At Start):

≥ 1. P

0 v>0 xv

P

v∈V

Pn2

t t=0 xv .

≤ 0.

(5) Constraint 3 (Pebbling Is Valid): For all v s.t |Parents(v)| ≥ 1 and 0 ≤ t ≤ n2 − 1 we have P t v ′ ∈Parents(v) xv ′ t+1 t xv ≤ xv + . |Parents(v)| Fig. 5: Integer Program for Pebbling. Theorem 9 Let G be with constant indegree δ. Then there is a fractional solution to our LP Relaxation (of the Integer Program in Figure 5) with cost at most 3n. The proof of Theorem 9 is in Appendix A, we sketch the proof here. We ﬁrst “fractionally” pebble G in topological order using fractional pebbles with weight n1 . Since there are n pebbles, this process requires n time steps. Once this process is complete, we may increase xtn by n1 for each t−1 = 1 . Thus, after n − 1 subsequent time step xt−1 v , since all of its parents v must have value xv n more time steps, we have xn = 1. We refer to the appendix for a formal proof. One tempting way to “ﬁx” the linear program is to require that the pebbling take at most n steps since the fractional assignment used to establish the integrality gap takes 2n steps. There are two issues with this approach: (1) It is not true in general that the optimal pebbling of G takes at most n steps. See Appendix C for a counter example and discussion. (2) For constant indegree DAGs we can give a fractional assignment that takes exactly n steps and costs O(n log n). Thus, ˜ the integrality gap is still Ω(n). Brieﬂy, in this assignment we set xii = 1 for i ≤ n and for i < t we 1 −dist(i,t+j)−j+2 : j ≥ 1} , where dist(x, y) is the length of the shortest path from set xti = max n , {2 x to y. In particular, if j = 1 (we want to place a ‘whole’ pebble on node j + t in the next round by setting xj+t j+t = 1 ) and node i is a parent of node t + j then we have dist(i, t + j) = 1 so we will have xj+t = 1 (e.g., a whole pebble on node i during the previous round). i

6

Conclusions

We initiate the study of the computational complexity of cumulative cost minimizing pebbling in the parallel black pebbling model. This problem is motivated by the urgent need to develop and analyze secure data-independent memory hard functions for password hashing. We show that it is NP-Hard to ﬁnd a parallel black pebbling minimizing cumulative cost, and we provide evidence that the problem is hard to approximate. Thus, it seems unlikely that we will be able to develop tools to automate the task of a cryptanalyst to obtain strong upper/lower bounds on the security 11

of a candidate iMHF. However, we cannot absolutely rule out the possibility that an eﬃcient approximation algorithm exists. The primary remaining challenge is to either give an eﬃcient αk k approximation algorithm to ﬁnd a pebbling P ∈ P k with cc(P) ≤ αΠcc (G) or show that Πcc (G) is hard to approximate. We believe that the problem oﬀers many interesting theoretical challenges and a solution could have very practical consequences for secure password hashing. It is our hope that this work encourages others in the TCS community to explore these questions.

References [AB16a]

Jo¨el Alwen and Jeremiah Blocki. Eﬃciently computing data-independent memory-hard functions. In Advances in Cryptology CRYPTO’16. Springer, 2016.

[AB16b]

Jo¨el Alwen and Jeremiah Blocki. Towards practical attacks on argon2i and balloon hashing. Cryptology ePrint Archive, Report 2016/759, 2016. http://eprint.iacr.org/2016/759.

[ABP16]

Jo¨el Alwen, Jeremiah Blocki, and Krzysztof Pietrzak. The pebbling complexity of depth-robust graphs. Cryptology ePrint Archive, Report 2016/, 2016. http://eprint.iacr.org/.

[ABW03] Mart´ın Abadi, Michael Burrows, and Ted Wobber. Moderately hard, memory-bound functions. In Proceedings of the Network and Distributed System Security Symposium, NDSS 2003, San Diego, California, USA, 2003. [AS15]

Jo¨el Alwen and Vladimir Serbinenko. High Parallel Complexity Graphs and MemoryHard Functions. In Proceedings of the Eleventh Annual ACM Symposium on Theory of Computing, STOC ’15, 2015. http://eprint.iacr.org/2014/238.

[BDK15]

Alex Biryukov, Daniel Dinu, and Dmitry Khovratovich. Fast and tradeoﬀ-resilient memory-hard functions for cryptocurrencies and password hashing. Cryptology ePrint Archive, Report 2015/430, 2015. http://eprint.iacr.org/2015/430.

[BDK16]

Alex Biryukov, Daniel Dinu, and Dmitry Khovratovich. Argon2 password hash. Version 1.3, 2016. https://www.cryptolux.org/images/0/0d/Argon2.pdf.

[BDKJ16] Alex Biryukov, Daniel Dinu, Dmitry Khovratovich, and Simon Josefsson. The memoryhard Argon2 password hash and proof-of-work function. Internet-Draft draft-irtf-cfrgargon2-00, Internet Engineering Task Force, March 2016. [BK15]

Alex Biryukov and Dmitry Khovratovich. Tradeoﬀ cryptanalysis of memory-hard functions. Cryptology ePrint Archive, Report 2015/227, 2015. http://eprint.iacr.org/.

[CGBS16] Henry Corrigan-Gibbs, Dan Boneh, and Stuart Schechter. Balloon hashing: Provably space-hard hash functions with data-independent access patterns. Cryptology ePrint Archive, Report 2016/027, Version: 20160601:225540, 2016. http://eprint.iacr.org/. [Cha73]

Ashok K. Chandra. Eﬃcient compilation of linear recursive programs. (FOCS), pages 16–25. IEEE Computer Society, 1973. 12

In SWAT

[Coo73]

Stephen A. Cook. An observation on time-storage trade oﬀ. In Proceedings of the Fifth Annual ACM Symposium on Theory of Computing, STOC ’73, pages 29–33, New York, NY, USA, 1973. ACM.

[Dem14]

Erik Demaine. 6.890 algorithmic lower bounds: http://ocw.mit.edu, September 2014.

Fun with hardness proofs.

[DFKP15] Stefan Dziembowski, Sebastian Faust, Vladimir Kolmogorov, and Krzysztof Pietrzak. Proofs of space. In Rosario Gennaro and Matthew J. B. Robshaw, editors, CRYPTO 2015, Part II, volume 9216 of LNCS, pages 585–605. Springer, Heidelberg, August 2015. [DGN03]

Cynthia Dwork, Andrew Goldberg, and Moni Naor. On memory-bound functions for ﬁghting spam. In Advances in Cryptology - CRYPTO 2003, 23rd Annual International Cryptology Conference, Santa Barbara, California, USA, August 17-21, 2003, Proceedings, volume 2729 of Lecture Notes in Computer Science, pages 426–444. Springer, 2003.

[DNW05] Cynthia Dwork, Moni Naor, and Hoeteck Wee. Pebbling and proofs of work. In Victor Shoup, editor, CRYPTO 2005, volume 3621 of LNCS, pages 37–54. Springer, Heidelberg, August 2005. [DS05]

Irit Dinur and Samuel Safra. On the hardness of approximating minimum vertex cover, 2005.

[EGS75]

Paul Erd¨os, Ronald L. Graham, and Endre Szemeredi. On sparse graphs with dense long paths. Technical report, Stanford, CA, USA, 1975.

[FLW13]

Christian Forler, Stefan Lucks, and Jakob Wenzel. Catena: A memory-consuming password scrambler. IACR Cryptology ePrint Archive, 2013:525, 2013.

[GJ75]

Michael R Garey and David S. Johnson. Complexity results for multiprocessor scheduling under resource constraints. SIAM Journal on Computing, 4(4):397–411, 1975.

[GLT80]

John R Gilbert, Thomas Lengauer, and Robert Endre Tarjan. The pebbling problem is complete in polynomial space. SIAM Journal on Computing, 9(3):513–524, 1980.

[HP70]

Carl E. Hewitt and Michael S. Paterson. Record of the project mac conference on concurrent systems and parallel computation. chapter Comparative Schematology, pages 119–127. ACM, New York, NY, USA, 1970.

[HP10]

Philipp Hertel and Toniann Pitassi. The pspace-completeness of black-white pebbling. SIAM Journal on Computing, 39(6):2622–2682, 2010.

[Kal00]

Burt Kaliski. Pkcs# 5: Password-based cryptography speciﬁcation version 2.0. 2000.

[Kho02]

Subhash Khot. On the power of unique 2-prover 1-round games. In Proceedings of the Thirty-Fourth annual ACM Symposium on Theory of Computing, pages 767–775. ACM, 2002.

[KR08]

Subhash Khot and Oded Regev. Vertex cover might be hard to approximate to within 2- ǫ, 2008. 13

[MMV13] Mohammad Mahmoody, Tal Moran, and Salil P. Vadhan. Publicly veriﬁable proofs of sequential work. In Robert D. Kleinberg, editor, ITCS 2013, pages 373–388. ACM, January 2013. [NB+ 15]

Arvind Narayanan, Joseph Bonneau, , Edward W Felten, Andrew Miller, and Steven Goldfeder. Bitcoin and Cryptocurrency Technology (manuscript). 2015. Retrieved 8/6/2015.

[Per09]

C. Percival. Stronger key derivation via sequential memory-hard functions. In BSDCan 2009, 2009.

[PHC]

Password hashing competition. https://password-hashing.net/.

[PR80]

Wolfgang J. Paul and R¨ udiger Reischuk. On alternation II. A graph theoretic approach to determinism versus nondeterminism. Acta Inf., 14:391–403, 1980.

[RD16]

Ling Ren and Srinivas Devadas. Proof of space from stacked bipartite graphs. Cryptology ePrint Archive, Report 2016/333, 2016. http://eprint.iacr.org/.

[Sch82]

Georg Schnitger. A family of graphs with expensive depth reduction. Theor. Comput. Sci., 18:89–93, 1982.

[Sch83]

Georg Schnitger. On depth-reduction and grates. In 24th Annual Symposium on Foundations of Computer Science, Tucson, Arizona, USA, 7-9 November 1983, pages 323–328. IEEE Computer Society, 1983.

[SS78]

John E. Savage and Sowmitri Swamy. Space-time trade-oﬀs on the ﬀt algorithm. IEEE Transactions on Information Theory, 24(5):563–568, 1978.

[SS79a]

John E. Savage and Sowmitri Swamy. Space-time tradeoﬀs for oblivious interger multiplications. In Hermann A. Maurer, editor, ICALP, volume 71 of Lecture Notes in Computer Science, pages 498–504. Springer, 1979.

[SS79b]

Sowmitri Swamy and John E. Savage. Space-time tradeoﬀs for linear recursion. In Alfred V. Aho, Stephen N. Zilles, and Barry K. Rosen, editors, POPL, pages 135–142. ACM Press, 1979.

[Tom78]

Martin Tompa. Time-space tradeoﬀs for computing functions, using connectivity properties of their circuits. In Proceedings of the Tenth Annual ACM Symposium on Theory of Computing, STOC ’78, pages 196–204, New York, NY, USA, 1978. ACM.

A

Missing Proofs

k Reminder of Lemma 4. If the B2LC instance has a valid solution, then Πcc GB2LC ≤ τcmn + 2cmn + 2ckm + 1. Proof of Lemma 4. Suppose the given instance of B2LC has a valid solution, {xi,j }. Recall that a pebble must pass m times through each of the τ chains of length c representing each variable. For a set k, let xk,j1 ≤ xk,j2 ≤ . . . ≤ xk,jn . We start the kth pass through the τ chains by placing a pebble 14

on C1jn , . . . , Cτjn , the τ chains representing variable jn . At each subsequent time step, we move the existing pebbles to the next node along the chain. When the pebbles on chains C1jn , . . . , Cτjn reach the (xk,jn − xk,jn−1 + 1)th nodes, we place a pebble on C1jn−1 , . . . , Cτjn−1 , the τ chains representing variable jn−1 . We continue this process by moving existing pebbles to the next node along each chain for each subsequent time step. When pebbles on chains C1jl , . . . , Cτjl reach the (xk,jl − xk,jl−1 + 1)th nodes, we place a pebble on C1jl−1 , . . . , Cτjl−1 , the τ chains representing variable jl−1 . Thus, the gadgets E1 , . . . , Em which are satisﬁed by xk,1 , . . . , xk,m+1 can be pebbled during this pass, since by construction, we oﬀset the positions of the pebbles by the appropriate distances. When a pebble reaches the end of its chain, we remove the pebble. Hence, we see that each chain has c time steps with pebbles, and each of those steps needs only one pebble. There are n variables, τ chains representing each variable, and m passes through each chain. Across all variable chains, the cumulative complexity is τcmn since there are n variables and τ chains representing each variable. Similarly, making m walks through the paths C1j , . . . , Cτj allows us to pebble the path Mj (j ≤ n). These paths each have length cm and we keep at most one pebble on Mj at any point in time. There may be a delay of up to c time steps between consecutive walks through the paths C1j , . . . , Cτj during which we cannot progress our pebble through the path Mj . However, there are at most 2cm total steps in which we have a pebble on Mj . Thus, the cumulative complexity across all paths M1 , ..., Mn is at most 2cmn. Likewise, since each equation gadget Ej is represented by a chain, one pebble at each time step suﬃces for each of these paths. We may have to keep a pebble on the gadget Ej while we make m walks through the paths, but we keep this pebble on Ej for at most 2cm steps (each walk takes c steps and the delay between consecutive walks is at most c steps). Since there are k equations, and there is a chain for each equation, the cumulative complexity across all equation chains is at most 2ckm. There is one ﬁnal sink node, so the cumulative complexity for an instance of B2LC with a valid solution is at most τcmn + 2cmn + 2ckm + 1. k Reminder of Lemma 5. If the B2LC instance does not have a valid solution, then Πcc GB2LC ≥ τcmn + τ. Proof of Lemma 5. Suppose the given instance of B2LC does not have a valid solution. We ﬁrst observe that in any legal pebbling P = (P0 , . . . , Pt ) ∈ P k we must walk a pebble down each of the paths M1 , . . . , Mn of length cm. Let tzi denote the ﬁrst time step in which we place a pebble on vzi — the z’th node on path Mi . Clearly, tzi < tz+1 for z < cm and at time tzi − 1 we must have a i 1,y pebble on all nodes in parents vzi . In particular, vi , . . . , vτ,y ∈ Ptz −1 where y − 1 = z mod c so i that the pebbling conﬁguration at time tz − 1 includes at least one pebble on each of the chains C1i , . . . , Cni . There are n variables xi , τ chains Cji of length c representing each variable xi , and m passes through each chain. Thus, across all variable chains, the cumulative complexity is at least τcmn, and any “cheating” pebbling has cumulative complexity at least τcmn + τ. Recall that a “cheating” pebbling either has (1) one step in which we have at least 2 pebbles on each of the chains C1i , . . . , Cτi for some i ≤ n, or (2) cm + 1 time steps in which we have a pebble on each of the chains C1i , . . . , Cτi for some i ≤ n. Any non-cheating pebbling corresponds to a valid B2LC solution. If an equation ej is not satisﬁed by the solution then there is no legal way to pebble the equation chain Ej without cheating because at no point during the m walks are both variables in the equation oﬀset by the correct amount. Thus, any instance of B2LC which does not have a valid solution requires at least τcmn + τ cumulative complexity. 15

Reminder of Theorem 7. minCCδ is NP-Complete. Proof of Theorem 7. (sketch) We sketch the construction for δ = 2. Although we maintain τ chains representing each variable, we can no longer maintain the gadgets for the equation, E1 , . . . , Em , for which each vertex has indegree 2τ + 1, one edge from its predecessor in the gadget, and 2τ edges from the chains representing the variables involved in the equation. Instead, in place of the 2τ edges, we construct a binary tree, where the bottom layer of the tree has at least τ2 nodes. Each of the τ2 connects to a separate instance of the chains representing the variables involved in the equation. Thus, if the equation involves variables xi and xj , then each of the τ2 nodes will have an edge from one of the τ chains representing xi , and an edge from one of the τ chains representing xj , oﬀset by an appropriate amount. This construction ensures that the root of the tree is pebbled only if all equations are satisﬁed, but any “dishonest” walk along the chains will require at least τ additional pebbles. Similarly, we replace the 2τ edges in the paths P1 , . . . , Pn of length cm with τ binary trees with base 2 . Finally, we replace the m + n incident edges to the sink node with a binary tree with base m+n . Since each of these terms are polynomial in c, m, n, there exists a τ 2 that is also polynomial in c, m, n which allows us to distinguish between honest walks and dishonest walks. As a result, we can decide between instances of B2LC. Reminder of Theorem 8. B2LC is NP-Complete. ProofPof Theorem 8. First, sort S so that S = {s1 , s2 , . . . , sm } and si ≤ sj for any i ≤ j. Let 2 T= m i=1 sm . We then create mn = 3n equations: x1 + s1 = x2 ,

x2 + s2 = x3 ,

...,

xm + sm = xm+1 ,

x1 + 0 = x2 ,

x2 + 0 = x3 ,

...,

xm + 0 = xm+1 ,

x1 + T = x2 ,

x2 + T = x3 ,

...,

xm + T = xm+1 ,

x1 + 2T = x2 ,

x2 + 2T = x3 ,

...,

xm + 2T = xm+1 ,

x1 + (n − 2)T = x2 ,

x2 + (n − 2)T = x3 ,

...,

xm + (n − 2)T = xm+1 ,

Finally, we create the additional n equations: x1 +

T + 3(i − 1)(n − 2)T = xm+1 , n

(1)

for i ∈ [n]. This gives a total of 3n2 + n equations so that the reduction is clearly achieved in polynomial time. Recall that B2LC is true if and only if there exist {ai,j }, i ∈ [n], j ∈ [m] so that each equation xi + ci,j = xj is satisﬁed by the assigning xi = ai,k and xj = aj,k for some k. We show that there exists a solution to 3-PARTITION if and only if there exists a solution to B2LC. Suppose there exists a partition of S into n triplets S1 , S2 , . . . , Sn so that the sum of the integers in each triplet is the same, and equals nT . We set a1,1 = 0 and for each si that appears in S1 , we take the equation, a1,i + si = a1,i+1 . Otherwise, if si does not appear in S1 , we take the equation a1,i + 0 = a1,i+1 . Suppose that ai,j are deﬁned for all i < k. Then we set ak,1 = 0 and for each si that appears in Sk , we take the equation, ak,i + si = ak,i+1 . If si appears in Si for some i < k, then we take the equation ak,i + (i − 2)T = ak,i+1 . Otherwise, if si does not appear in S1 , . . . , Sk , we take the equation ak,i + (i − 1)T = ak,i+1 . Thus, to get ai,m+1 from ai,1 , we add in three elements whose sum is nT . We then add in 3(i − 1) instances of (i − 2)T , for each of the elements which appear in S1 , . . . , Si−1 . Finally, we add in 16

(3n − 3i) instances of (i − 1)T . Hence, it follows that ai,1 +

T + 3(i − 1)(i − 2)T + (3n − 3i)(i − 1)T = ai,m+1 n T ai,1 + + (3in − 3n − 6i + 6)T = ai,m+1 n T ai,1 + + 3(i − 1)(n − 2)T = ai,m+1 , n

so that all equations in 1 are satisﬁed. Thus, a solution for 3-PARTITION yields a solution for B2LC. Suppose there exists a solution for the above instance of B2LC. Without loss of generality, assume each equation in 1, x1 + nT + 3(i − 1)(n − 2)T = xm+1 holds for ai,1 , ai,m+1 , since two diﬀerent equations clearly cannot hold for the same ai,1 , ai,m+1 . We observe that ai,1 + nT ≡ ai,m+1 (mod T ) for all i. Hence, to obtain a1,m+1 from a1,1 , we must take three equations of the form xi + si = xi+1 since the summing less than three elements in S is less than nT , while the summing more than three T and elements in S is more than nT but less than T + nT (as each element of S is greater than 4n T less than 2n and the sum of all elements in S is T ). Say that these three equations use the terms si1 , si2 , si3 . Then we let S1 = {si1 , si2 , si3 }, which indeed sums to nT . Similarly, to obtain ak,m+1 from ak,1 , we must take three equations of the form xi + si = xi+1 . Since there are 3n equations of this form which must be satisﬁed and each of the n assignments of the form ai,1 , . . . , ai,m+1 (where 1 ≤ i ≤ n) uses exactly three of these equations, then each assignment of ai,1 , . . . , ai,m+1 must satisfy a disjoint triplet of the 3n equations. Say that the three satisﬁed equations for ai,1 , . . . , ai,m+1 are sk1 , sk2 , sk3 . Then we let Sk = {sk1 , sk2 , sk3 }, which indeed sums to nT . Therefore, we have a partition of S into triplets, which each sum to nT , as desired. Thus, a solution for B2LC yields a solution for 3-PARTITION. Reminder of Theorem 9. Let G be with constant indegree δ. Then there is a fractional solution to our LP Relaxation (of the Integer Program in Figure 5) with cost at most 3n. Proof of Theorem 9. In particular, for all time steps t ≤ n, we set xti = n1 for all i ≤ t, and xti = 0 for i > t. For time steps n < t ≤ 2n, we set xtn = t−n+1 and xtv = n1 for v 6= n. We ﬁrst argue n that this is a feasible solution. Trivially, Constraints 1 and 2 are satisﬁed. Note that if xt+1 = n1 , v t P xv ′ then xtu = n1 for all u < v, so certainly v ′ ∈Parents(v) |Parents(v)| = t−n = n1 . Moreover, xt−1 n n for n < t ≤ 2n implies that setting xtn = xt−1 n +

X v ′ ∈Parents(v)

xtv ′ t−n 1 t−n+1 = + = |Parents(v)| n n n

is valid. Therefore, Constraint is satisﬁed. P 3P Finally, we claim that v∈V t≤2n xtv ≤ 3n. To see this, note that for every round t ≤ n, we have xtv ≤ n1 for all v ∈ V. Thus, XX v∈V t≤n

xtv ≤

XX 1 X ≤ 1 = n. n v∈V t≤n

17

v∈V

For time steps n < t ≤ 2n, note that xtn ≤ 1 and xtv = n1 for all v 6= n. Thus, X X X n−1 t xv ≤ 1+ ≤ 2n. n t≤n

v∈V n 0 [KR08], under the Unique Games Conjecture [Kho02]. We also produce a reduction from Cubic Vertex Cover to show REDUCIBLEd is NP-Complete even when the input graph has bounded indegree. Theorem 10 REDUCIBLEd is NP-Complete. k

10

Alwen et al. [ABP16] also gave tighter upper and lower bounds on Πcc (G) for the Argon2i-A, iBH and Catena k k iMHFs. For example, Πcc (G) = Ω n1.66 and Πcc (G) = O n1.71 for a random Argon2i-A DAG G (whp).

18

Proof : We provide a reduction from VC to REDUCIBLEd . Given an instance G = (V, E) of VC, arbitrarily label the vertices 1, . . . , n, where n = |V|, and direct each edge of E so that 1, . . . , n is a topological ordering of the nodes. For each node i we add two directed paths: (1) a path of length i − 1 with an edge from the last node on the path to node i, (2) a path of length n − i with an edge from node i to the ﬁrst node of the path. To avoid abuse of notation, let U represent the original n vertices (from the given instance of VC) in the modiﬁed graph. We claim VC has a vertex cover of size at most k if and only if the resulting graph is (k, n)reducible. Indeed, if there exists a vertex cover of size k, we remove the corresponding k vertices in the resulting construction. Then there are no edges connecting vertices of U. By construction, any path of length n must contain at least two vertices of U. Hence, the resulting graph is (k, n)reducible. On the other hand, suppose that there is no vertex cover of size k. Given a set S of k = |S| vertices we say that a node u ∈ U is “untouched” by S if u ∈ / S and S does not contain any vertex from the chain(s) we connected to node u. If there is no vertex cover of size k, then removing any set S of k = |S| vertices from the graph leaves an edge (u, v) with the following properties: (1) u, v ∈ U, (2) u and v are both untouched by S. Suppose u has label i. Then the (untouched) directed path which ends at u has length i − 1. Similarly, there still exists some directed path of length ≥ n − i which begins at v. Thus, there exists a path of length at least n, so the resulting graph is not (k, n)-reducible. ✷ Theorem 11 Even for bounded indegree graphs, REDUCIBLEd is NP-Complete. Proof : We provide a reduction from CubicVC to REDUCIBLEd , using graphs of bounded indegree. Given an instance of CubicVC, partition the vertices of G = (V, E) into four color classes: c1 , c2 , c3 , c4 ⊆ V such that no edge in E is incident to two vertices of the same color. Orient each edge in G so that an edge points from a vertex colored ci to a vertex colored cj only if j > i. Let Vi be the vertices of V which are colored with color ci . For each vertex v ∈ V1 , add a new directed path of length n2 and add a directed edge from the last vertex of this path to v. Thus, each v ∈ V1 has indegree 1. For each vertex v ∈ V2 , create a new directed path of length n4 and add a directed edge from the last vertex of this path to v. Thus, each v has indegree ≤ 4. Furthermore, for each v ∈ V2 create a new directed path of length n2 and add a directed edge from v ∈ V2 to the ﬁrst vertex in this path. For each vertex v ∈ V3 , create a new directed path of length n8 and add an edge from the last vertex in this path to v, so that v has indegree ≤ 4. Moreover, for each v ∈ V3 create a new directed path of length 3n 4 and add a directed edge from v to the ﬁrst vertex in this path. Finally, for each v ∈ V4 create a new directed path of length 7n 8 and add a directed edge from v ∈ V4 to the ﬁrst vertex in this path — observe that v has indegree ≤ 3. Clearly, the resulting structure is a DAG which can be constructed in polynomial time. We claim CubicVC has a vertex cover of size at most k if and only if the above construction is (k, n)-reducible. Indeed, if there exists a vertex cover of size k, we remove the corresponding k vertices in the resulting construction. Then there are no edges connecting vertices of diﬀerent colors. By construction, any path of length n must contain at least two vertices of diﬀerent colors. Hence, the resulting graph is (k, n)-reducible. On the other hand, suppose that there is no vertex cover of size k. Given a set S of k = |S| vertices we say that a node u ∈ V1 ∪ V2 ∪ V3 ∪ V4 is “untouched” by S if u ∈ / S and S does not contain any vertex from the chain(s) we connected to node u. If there is no vertex cover of 19

size k, then removing any set S of k = |S| vertices from the graph leaves an edge (u, v) with the following properties: (1) u, v ∈ V and have diﬀerent color classes, (2) u and v are both untouched by S. Suppose u has color i. Then the (untouched) directed path which ends at u has length 2ni . Similarly, there still exists some directed path of length ≥ n − 2ni which begins at v. Thus, there exists a path of length at least n, so the resulting graph is not (k, n)-reducible. ✷

C

On Pebbling Time and Cumulative Cost

In this section we address the following question: is is necessarily case that an optimal pebbling of G (minimizing cumulative cost) takes n steps? For example, the pebbling attacks of Alwen and Blocki [AB16a, AB16b] on depth-reducible graphs such as Argon2i-A, Argon2i-B, Catena all take exactly n steps. Thus, it seems natural to conjecture that the optimal pebbling of G always ﬁnishes in depth(G) steps. In this section we provide a concrete example of an DAG G (with n nodes and depth(G) = n) with the property that any optimal pebbling of G must take more than n steps. More formally, let Πk (G, t) denote the set of all legal pebblings of G that take at most t pebbling k rounds. We prove that minP∈Πk (G,t) cc(P) > minP∈Πk (G) cc(P) = Πcc (G). Theorem 12 There exists a graph G with n = 16 nodes and depth(G) = n s.t. 28 27 .

minP∈Πk (G,t) cc(P) k

Πcc (G)

≥

Proof : Consider the following DAG G on 16 nodes {1, . . . , 16} with the following directed edges (1) (i, i + 1) for 1 ≤ i < 16, (2) (i, i + 9) for 1 ≤ i ≤ 5 and (3) (i, i + 7) for 8 ≤ i ≤ 9. We ﬁrst show in Claim 13 that there is a pebbling P with cost 27 that takes 18 rounds. Theorem 12 then follows from Claim 14 where we show that any P ∈ Πk (G, 16) has cumulative cost at least 28. Intuitively, any legal pebbling must either delay before pebbling nodes 15 and 16 or during rounds 15 − i (for 0 ≤ i ≤ 5) we must have at least one pebble on some nodes in the set {9, 8, ..., 9 − i}. k

Claim 13 For the above DAG G we have Πcc (G) ≤ 27. Proof : Consider the following pebbling: P0 = ∅, P1 = {1}, P2 = {2}, P3 = {3}, P4 = {4}, P5 = {5}, P6 = {6}, P7 = {7}, P{ 8} = {8}, P9 = {1, 9}, P10 = {2, 10}, P11 = {3, 11}, P12 = {4, 12}, P13 = {5, 13}, P14 = {6, 14}, P15 = {7, 14}, P16 = {8, 14}, P17 = {9, 15}, P18 = {16}. It is easy to verify that cc(P) = 9 + 2 ∗ 9 = 27 since there are 9 steps in which we have one pebble on G and 9 steps in which we have two pebbles on G. ✷ Claim 14 For the above DAG G we have minP∈Πk (G,16) cc(P) ≥ 28. Proof : Let P = (P1 , . . . , P16 ) ∈ Πk (G, 16) be given. Clearly, i ∈ Pi for each round 1 ≤ i ≤ 16. To pebble nodes 15 and 16 on steps 15 and 16 we must have 9 ∈ P15 and 8 ∈ P14 . By induction, this means that P14−i ∩ {8, ..., 8 − i} 6= ∅ for each i > 0. To pebble node 9 + i at time 9 + i we require that i ∈ P8+i for each 1 ≤ i ≤ 5. These observations imply that |P9 | ≥ 3, |P10 | ≥ 3, . . ., |P13 | ≥ 3. We also have |P15 | ≥ 2 and |P14 | ≥ 2. We also have |Pi | ≥ 1 for all 1 ≤ i ≤ 16. The cost of rounds 1–8 and round 16 is at least 9. The cost of rounds 9–13 is at least 15 and the cost of rounds 14 and 15 is at least 4. Thus, cc(P) ≥ 28. ✷ 20

✷ k

Open Questions: Theorem 12 shows that Πcc (G) can be smaller than minP∈Πk (G,t) cc(P), but how large can this gap be in general? Can we prove upper/lower bounds on the ratio: for any n node DAG G? Is it true that

minP∈Πk (G,t) cc(P) k

Πcc (G)

minP∈Πk (G,t) cc(P) k

Πcc (G)

≤ c for some constant c? If not does this

hold for n node DAGs G with constant indegree? Is it true that the optimal pebbling of G always takes at most cn steps for some constant c?

D

NP-Hardness of minST

Recall that the space-time complexity of a graph pebbling is deﬁned as st(P) = t×max1≤i≤t |Pi |. We deﬁne minST and minSST based on whether the graph pebbling is parallel or sequential. Formally, the decision problem minST is deﬁned as follows: Input: a DAG G on n nodes and an integer k < n(n + 1)/2. Output: Yes, if minP∈P k st(P) ≤ k; otherwise No. G

The decision problem minSST is deﬁned as follows: Input: a DAG G on n nodes and an integer k < n(n + 1)/2. Output: Yes, if Πst (G) ≤ k; otherwise No. Gilbert et al.[GLT80] provide a construction from any instance of TQBF to a DAG GTQBF with pebbling number 3n + 3 if and only if the instance is satisﬁable. Here, the pebbling number of a DAG G is minP=(P1 ,...,Pt )∈P k maxi≤t |Pi | is the number of pebbles necessary to pebble G. An important gadget in their construction is the so-called pyramid DAG. We use a triangle with the number k inside to denote a k-pyramid (see Figure 6 for an example of a 3-pyramid). The key property of these DAGs is that any legal pebbling P = (P0 , . . . , Pt ) ∈ P k (Pyramidk ) of a k-pyramid requires at least mini |Pi | ≥ k pebbles on the DAG at some point in time. Another gadget, which appears in Figure 7, is the existential quantiﬁer gadget, which requires that si , si − 1, and si − 2 pebbles must be placed in each of the pyramids to ultimately pebble qi . Remark: We note that [GLT80] focused on sequential pebblings (P ∈ P) in their analysis, but their analysis extends to parallel pebblings (P ∈ P k ) as well.

≡ k

Fig. 6: A 3-Pyramid. We observe that any instance of TQBF where each quantiﬁer is an existential quantiﬁer requires at most a quadratic number of pebbling moves. Speciﬁcally, we look at instances of 3-SAT, such 21

qi

qi

xi

xi

xi

xi′

xi′

xi′ si

si − 1

xi xi′ si

si − 1

si − 2

si − 2

qi+1

qi+1

Fig. 7: An existential quantiﬁer, with xi set to true in the left ﬁgure and xi set to false in the right ﬁgure. as in Figure 8. In such a graph representing an instance of 3-SAT, the sink node to be pebbled is qn . By design of the construction, any true statement requires exactly three pebbles for each pyramid representing a clause. On the other hand, a false clause requires four pebbles, so that false statements require more pebbles. Thus, by providing extraneous additions to the construction which force the number of pebbling moves to be a known constant, we can extract the pebbling number, given the space-time complexity. For more details, see the full description in [GLT80]. Lemma 15 [GLT80] The quantified Boolean formula Q1 x1 Q2 x2 · · · Qn xn Fn is true if and only if the corresponding DAG GTQBF has pebbling number 3n + 3. Lemma 16 Suppose that we have a satisfiable TQBF formula Q1 x1 Q2 x2 · · · Qn xn Fn with Qi = ∃ for all i ≤ n. Then there is a legal sequential pebbling P = (P0 , . . . , Pt ) ∈ P GTQBF of the corresponding DAG GTQBF from [GLT80] with t ≤ 6n2 +33n pebbling moves and maxi≤t |Pi | ≤ 3n+3. Proof : We describe the pebbling strategy of Gilbert et al. [GLT80], and analyze the pebbling time of their strategy. Let T (i) be the time it takes to pebble qi in the proposed construction for any instance with i variables, i clauses, and only existential quantiﬁers. Suppose that xi is allowed to be true for the existential quantiﬁer Qi = ∃. Then vertex xi′ is pebbled using si moves, where si = 3n − 3i + 6. Similarly, vertices di and fi are pebbled using si − 1 and si − 2 moves respectively. Additionally, fi is moved to xi′ and then xi is moved to xi′ in the following step, for a total of two more moves. We then pebble qi+1 using T (i + 1) moves and ﬁnish by placing a pebble on xi and moving it to ci , bi , ai , and qi , for ﬁve more moves. Finally, we use six more moves to pebble an additional clause. Thus, in this case, Ttrue (i) = si + (si − 1) + (si − 2) + 13 + T (i + 1). On the other hand, if xi is allowed to be false for the existential quantiﬁer Qi = ∃, then ﬁrst we pebble xi′ , di , and fi sequentially, using si , si − 1, and si − 2 moves respectively. We then move 22

the pebble from fi to xi′ and then to xi , for a total of two more moves. We then pebble qi+1 using T (i + 1) moves. The pebble on qi+1 is subsequently moved to ci and then bi , using two more moves. Picking up all pebbles except those on bi and xi′ , and using them to pebble fi takes si − 2 more moves. Additionally, the pebble on fi is moved to xi′ and then ai , while the pebble on xi′ is moved to xi and then qi , for four more moves. Finally, we use six more moves to pebble an additional clause. In total, Tfalse (i) = si + (si − 1) + (si − 2) + (si − 2) + 14 + T (i + 1). Therefore, T (i) ≤ 4si + 10 + T (i + 1),

where si = 3n − 3i + 6. Thus,

T (i) ≤ 12(n − i) + 34 + T (i + 1). Writing R(i) = T (n − i) then gives

Pn

R(i) ≤ 12i + 34 + R(i − 1),

so that R(n) ≤ i=1 (12i + 34) = 6n2 + 40n. Hence, it takes at most 6n2 + 40n moves to pebble the given construction for any instance of TQBF which only includes existential quantiﬁers. ✷ Theorem 17 minST is NP-Complete. Proof : We provide a reduction from 3-SAT to minST. Given an instance I of 3-SAT with at most n clauses or variables, we create the corresponding graph from [GLT80]. Additionally, we append a chain of length 300n3 + 6n2 + 40n + 100 to the graph with an edge from the sink node qn from [GLT80] to the ﬁrst node in our chain. By adding a chain of length 300n3 + 6n2 + 40n + 100 we can ensure that for any legal pebbling P = (P0 , . . . , Pt ) ∈ P k (G) of our graph G we have t ≥ 300n3 + 6n2 + 40n + 100. By Lemma 16, at most 6n2 + 40n moves are necessary to pebble the 3−SAT portion of the graph, while the chain requires exactly 300n3 +6n2 +40n+100 moves. Thus, if I is satisﬁable then we can ﬁnd a legal pebbling P = (P0 , . . . , Pt ) with space maxi≤t |Pi | ≤ 3n + 3 and t ≤ 300n3 + 12n2 + 80n + 100 moves. First, pebble the sink qn in t ′ = 6n2 + 40n steps and maxi≤t ′ |Pi | ≤ 3n + 3 space by Lemma 16 and then walk single pebble down the chain in 300n3 + 6n2 + 40n + 10 steps keeping at most one pebble on the DAG in each step. The space time cost is at most st(P) ≤ 900n4 +936n3 +276n2 +540n+300. If I is not satisﬁable then, by Lemma 15 for any legal pebbling P = (P0 , . . . , Pt ) ∈ P k (G) of our graph we have maxi≤t |Pi | ≥ 3n + 4 and t ≥ 300n3 + 6n2 + 40n + 100. Thus, st(P) ≥ 900n4 + 1218n3 + 144n2 + 460n + 400 we have 900n4 + 1218n3 + 144n2 + 460n + 400 − 900n4 + 936n3 + 276n2 + 540n + 300 > 0 . for all n > 0. Thus, I is satisﬁable if and only if there exists a legal pebbling with st(P) ≤ 900n4 + 936n3 + 276n2 + 540n + 300. Clearly, this reduction can be done in polynomial time, and so there is indeed a polynomial time reduction from 3-SAT to minST. ✷

We note that the same reduction from TQBF to minST fails, since there exist instances of TQBF where the pebbling time is 2Ω(n) . However, the same relationships do hold for sequential pebbling. The proof of Theorem 18 is the same as the proof of Theorem 17 because we can exploit the fact that the pebbling of GTQBF from Lemma 16 is sequential. Theorem 18 minSST is NP-Complete. 23

Why Doesn’t the Gilbert et al. Reduction Work for Cumulative Cost? The construction from [GLT80] is designed to minimize the number of pebbles simultaneously on the graph, at the expense of larger number of necessary time steps and/or a larger average number of pebbles on the graph during each pebbling round. As a result, one can bypass several time steps by simply adding additional pebbles during some time step. For example, it may be beneﬁcial to temporarily keep repebbling the si − 2-pyramid pebbles on all three nodes xi , xi and xi′ at times so that we can avoid si later. Also if we do not need the pebble on node xi for the next 2 steps then it is better to discard any pebbles on xi′ and xi entirely to reduce cumulative cost. Because we would need to repebble the si -pyramid later our maximum space usage will increase, but our cumulative cost would decrease. Our reduction to space-time cost works because st cost is highly sensitive to an increase in the number of pebbles on the graph even if this increase is temporary. Cumulative, unlike space cost or space-time cost, is not very sensitive to such temporary increases in the number of pebbles on the graph.

24

q1 : Sink x1

14

x1

15

q2

13

p0

x2

x2 p1

11

12

q3

10 x3

8

x3

9

q4

p2 = q5

7 x4

5

x4

6 4

Fig. 8: Graph GTQBF for ∃x1 , x2 , x3 , x4 s.t. (x1 ∨ x2 ∨ x4 ) ∧ (x2 ∨ x3 ∨ x4 ). 25