The Conditional Uncertainty Principle

The Conditional Uncertainty Principle

Gilad Gour,1, ∗ Varun Narasimhachar,1, † Andrzej Grudka,2 Michał Horodecki,3, 4 Waldemar Kłobus,2 and Justyna Łodyga2

arXiv:1506.07124v1 [quant-ph] 23 Jun 2015

1 Department of Mathematics and Statistics and Institute for Quantum Science and Technology, University of Calgary, 2500 University Drive NW, Calgary, Alberta, Canada T2N 1N4
2 Faculty of Physics, Adam Mickiewicz University, 61-614 Poznań, Poland
3 Institute for Theoretical Physics and Astrophysics, University of Gdańsk, 80-952 Gdańsk, Poland
4 National Quantum Information Centre of Gdańsk, 81-824 Sopot, Poland
(Dated: June 24, 2015)

The uncertainty principle, which states that certain sets of quantum-mechanical measurements have a minimal joint uncertainty, has many applications in quantum cryptography. But in such applications, it is important to consider the effect of a (sometimes adversarially controlled) memory that can be correlated with the system being measured: The information retained by such a memory can in fact diminish the uncertainty of measurements. Uncertainty conditioned on a memory was considered in the past by Berta et al. [1], who found a specific uncertainty relation in terms of the von Neumann conditional entropy. But this entropy is not the only measure that can be used to quantify conditional uncertainty. In the spirit of recent work by several groups [2–6], here we develop a general operational framework that formalizes the concept of conditional uncertainty in a measure-independent form. Our formalism is built around a mathematical relation that we call conditional majorization. We define and characterize conditional majorization, and use it to develop tools for the construction of measures of the conditional uncertainty of individual measurements, and also of the joint conditional uncertainty of sets of measurements.
We demonstrate the use of this framework by deriving measure-independent conditional uncertainty relations of two types: (1) A lower bound on the minimal joint uncertainty that two remote parties (Bob and Eve) have about the outcome of a given pair of measurements performed by a third remote party (Alice), conditioned on arbitrary measurements that Bob and Eve make on their own systems. This lower bound is independent of the initial state shared by the three parties; (2) An initial state–dependent lower bound on the minimal joint uncertainty that Bob has about Alice’s pair of measurements in a bipartite setting, conditioned on Bob’s quantum system.

I. INTRODUCTION

The discovery of quantum mechanics in the early twentieth century brought about a profound change in the way we view the physical world. The theory contains many counter-intuitive features for which we are still struggling to find satisfying interpretations. Nevertheless, it is remarkably successful in its agreement with observation, and in fact, its counter-intuitive features have found many practical applications. Among these features is the uncertainty principle, which states that there is a “minimum uncertainty” inherent in certain pairs of measurements [7]. If the outcome of one of the measurements can be predicted with high precision, then the other measurement necessarily becomes highly imprecise. This principle has applications in cryptographic tasks [8–10], the study of quantum correlations and nonlocality [11, 12], and continuous-variable quantum information processing [13–15]. An extensive amount of work has been done on the quantum uncertainty principle in connection with an isolated system [16–27]. But if the quantum system in consideration is correlated with another system, this second system can act as a “memory”. The initial (pre-measurement) correlations can cause the memory to retain some information about the system after it has been measured, effectively “alleviating” the constraints imposed by the usual uncertainty principle. Berta et al. [1] were the first to consider this situation, and formulated a modified version of an entropic uncertainty relation that applies in the presence of a correlated memory: If ρAB is a bipartite quantum state shared by two parties Alice and Bob, and σ XB and τ ZB are classical–quantum states that result, respectively, from Alice

∗ Electronic address: [email protected]
† Electronic address: [email protected]

measuring her subsystem A for the observables X and Z, then

H(X|B)σ + H(Z|B)τ ≥ log2(1/c) + H(A|B)ρ,  (1)

where H(·|·) is the conditional von Neumann entropy, and c is a constant determined by the incompatibility of the observables X and Z. The interpretation of this inequality is as follows: The first term on the left represents the uncertainty about Alice’s X measurement, given access to Bob’s system B. Similarly, the second term measures the uncertainty about Alice’s Z, given access to B. The last term on the right, H(A|B), can be negative, e.g. when ρAB is maximally entangled, mitigating (and sometimes completely eliminating) the effect of the log2(1/c) term. This means that if Alice and Bob get along well enough, they can defeat the uncertainty constraint that otherwise plagues all incompatible quantum measurements! This result was subsequently demonstrated experimentally in [28].

Berta et al.’s uncertainty relation uses the von Neumann conditional entropy to measure the uncertainty of a variable conditioned on a memory (we shall call this “conditional uncertainty”), and the sum of individual conditional entropies to measure the joint conditional uncertainty of X and Z. But this measure is only one of infinitely many ways to measure joint conditional uncertainty, and does not capture its complete operational meaning. This is similar to how specific entanglement measures, such as the entanglement cost and the relative entropy of entanglement, do not completely capture the meaning of entanglement. On the other hand, the operational framework of local operations and classical communication (LOCC), operations that do not create entanglement, transcends specific measures and completely characterizes entanglement.

In [2–6], uncertainty without a memory (or “non-conditional uncertainty”) was characterized through a class of operations acting on probability distributions pX of a classical variable X. These “certainty-nonincreasing” operations are random relabelings of X, which result in doubly stochastic maps acting on pX.
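As a quick sanity check of relation (1), one can verify it numerically in the simplest memoryless case: a qubit in state |0⟩, measured in two mutually unbiased bases (for which c = 1/2, so log2(1/c) = 1). This is our own illustration, not from the paper; with no memory, H(X|B) reduces to H(X), and H(A|B) = 0 for a pure product state.

```python
import numpy as np

def shannon(p):
    """Shannon entropy (base 2) of a probability vector."""
    p = np.asarray(p, float)
    p = p[p > 1e-15]
    return float(-np.sum(p * np.log2(p)))

# Qubit in |0>, measured in the computational (X) and Hadamard (Z) bases.
ket0 = np.array([1.0, 0.0])
had = np.array([[1, 1], [1, -1]]) / np.sqrt(2)

pX = np.abs(ket0) ** 2           # outcome distribution, computational basis
pZ = np.abs(had @ ket0) ** 2     # outcome distribution, Hadamard basis

lhs = shannon(pX) + shannon(pZ)  # H(X) + H(Z) = 0 + 1
rhs = np.log2(2) + 0.0           # log2(1/c) + H(A|B); H(A|B) = 0 here
assert lhs >= rhs - 1e-12        # relation (1) holds (with equality)
```

Here the bound is saturated: perfect certainty in one basis forces a full bit of uncertainty in the unbiased one.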
Given two distributions pX and qX, the proper way of comparing their uncertainties is not through any entropy or other such measure, but rather through the relation of majorization [31]: pX can be said to be more certain than qX only if the former majorizes the latter, denoted pX ≻ qX. The quantum uncertainty principle was then derived by applying this operational framework to pairs (or larger sets) of distributions that result from quantum-mechanical measurements applied on quantum states. This majorization-based approach also enabled the construction of so-called universal uncertainty relations. These are relations of the form

v(pX, qZ) ≺ ω,  (2)

where pX and qZ are the probability distributions over the outcomes of measurements X and Z, respectively, applied on an arbitrary quantum state ρ. Here v is some vector (not necessarily in the same space as p and q) constructed out of p and q, and ω is some constant vector, independent of ρ. The universality of such a relation lies in the fact that for any measure U of uncertainty, one can get a lower bound on the joint uncertainty of (pX, qZ) (measured in terms of U(v)) by evaluating U on the fixed argument ω. One type of universal relation, reported in [2, 3], uses the construction

v(pX, qZ) ≡ pX ⊗ qZ,  (3)

while another type, derived in [4–6], uses

v(pX, qZ) ≡ pX ⊕ qZ.  (4)

These “direct product”- and “direct sum”-type relations yield many known entropic uncertainty relations, via the application of appropriate entropy functions on the two sides of the majorization relation.

In this paper, we will approach the conditional quantum uncertainty principle using a combination of the ideas of [1] and [2, 3, 6]: incorporating a correlated memory and defining conditional uncertainty in a measure-independent way. We start by setting up an operational framework for conditional uncertainty in Section II. We will define a mathematical relation, which we call conditional majorization, that acts as the basis for comparing the conditional uncertainties of variables, analogous to how ordinary majorization enables the comparison of non-conditional uncertainties. If σ1XB and σ2XB are two classical–quantum states that carry distributions of a classical variable X correlated with a quantum system B, then the relation “σ1XB conditionally majorizes σ2XB” in our formalism, denoted σ1XB ≻c σ2XB, will signify that the variable X is more certain in the former state than in the latter, conditioned upon access to the quantum system B.
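Ordinary majorization, the baseline notion that conditional majorization generalizes, reduces to comparing partial sums of the sorted distributions. A minimal sketch (numpy assumed; the function name is ours):

```python
import numpy as np

def majorizes(p, q, tol=1e-12):
    """Return True iff p ≻ q: every partial sum of the non-increasingly
    sorted p dominates the corresponding partial sum of sorted q."""
    p = np.sort(np.asarray(p, float))[::-1]
    q = np.sort(np.asarray(q, float))[::-1]
    return bool(np.all(np.cumsum(p) >= np.cumsum(q) - tol))
```

For example, a sharp distribution majorizes the uniform one, so it counts as “more certain”; the uniform distribution majorizes no other distribution.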

We will derive several results about conditional majorization, including a thorough characterization of the case of a classical memory (Section II A). Note that the case of a classical memory, i.e., classical correlations between X and B, is still non-trivially different from the non-conditional case (from the viewpoint of uncertainty). Moreover, the classical-memory case helps in constructing measures of conditional uncertainty that apply to the quantum case. For some special cases, we will be able to further simplify the characterization, condensing the conditions to a finite number of inequalities. In the case with a quantum memory, we will derive a sufficient condition for conditional majorization (Section II B). We will also develop a toolkit for constructing measures of conditional uncertainty (Section II C). This will lead us to the investigation of measures of the joint conditional uncertainty of two or more states (Section II D). The concept of joint uncertainty is central to the quantum uncertainty principle, which is really about more than one measurement considered together.

In Section III, we will apply our calculus of conditional majorization to a typical cryptographic scenario, also considered in [1, 29]. Here, two legitimate parties, Alice and Bob, are engaged in an LOCC protocol (e.g., quantum key distribution), and must contend with an “eavesdropper”, Eve, who potentially has access to a purification of Alice and Bob’s initial state. Thus, the overall state shared by the three parties is of the form ΨABE. In [1], a bound was derived on the amount of information (measured in terms of the von Neumann conditional entropy) about Alice’s system that can “leak” to Eve, based on how much initial correlation Alice and Bob share:

H(X|B)σ + H(Z|E)τ ≥ log2(1/c).  (5)

This inequality is, remarkably, equivalent to (1). In this latter form, the first term still quantifies the uncertainty about Alice’s X measurement conditioned on access to Bob’s system B. But the second term now represents the uncertainty about Alice’s Z measurement given access to the purification of Alice and Bob’s composite system, in other words, the uncertainty that Eve has about Alice’s Z! While the form (1) tells us that Alice’s measurement outcome can become completely certain, even for incompatible measurements, given complete access to a purification of her system, (5) shows that access to disjoint parts of the purification based on her different measurement choices (which is what Bob and Eve get individually, unless they collaborate) can never completely lift the joint uncertainty of two incompatible measurements. This fact is related to the monogamy of quantum correlations: if Bob has access to a part of the environment that is highly correlated with Alice’s system, then the rest of the environment, visible to Eve, is necessarily very weakly correlated with Alice’s system. The extreme case of this is when Alice and Bob share maximal entanglement; then Eve gets no information at all!

In Section III we will derive a similar result, but not involving any specific measure of conditional uncertainty. Instead, our “universal conditional uncertainty relation” will state that, if Bob and Eve (respectively) perform measurements {Mb} and {Me} on their own systems when Alice measures {Ma1} and {Ma2} (respectively), the joint distributions of the variables (a1, b) in the first case, and (a2, e) in the second, obey

pA1B ⊗ qA2E ≺c Ω.  (6)

This is similar to the universal uncertainty relation (2), except that here we have a minimum joint conditional uncertainty that can be generated from any conditional uncertainty measure U simply by evaluating U(ω1, ω2). Just by applying different measures U to this single relation, one could derive different results similar to the one in [1]. Note, however, that the measures of joint conditional uncertainty generated this way are all of a particular special type: they quantify the uncertainty that Bob and Eve suffer when they try to learn about Alice’s measurement by also subjecting their own systems to measurements. In this sense, these measures of uncertainty are more in the spirit of the accessible information than of the von Neumann conditional entropy.

In Section IV, on the other hand, we will use our results on quantum-conditioned majorization to derive a completely quantum universal conditional uncertainty relation. This relation will apply to a bipartite setting, where Alice and Bob initially share a state ρAB, and then Alice makes either measurement MX or MZ (Bob makes no measurement). The relation we will prove is of the form

σXB ⊗ τZB ≺c Ω(ρAB),  (7)

where σXB and τZB are the post-measurement states for the two choices of Alice’s measurement. This result demonstrates the power and versatility of the operational conditional uncertainty framework. We will also note that it can easily be adapted to a tripartite universal uncertainty relation, similarly to how (5) is derived from (1). Finally, in our concluding section, we will discuss several other possible operational scenarios that can be subjected to our methods, leaving their study for future work.


FIG. 1: The operational schematic for conditional uncertainty: Alice and Bob initially share a quantum-quantum state ρAB that could contain quantum correlations. Alice’s measurement MX turns her subsystem A into a classical register X; the correlations become classical, too. She then carries out random relabeling operations on X, possibly conditioned on classical outputs obtained by Bob.

II. CONDITIONAL MAJORIZATION: AN OPERATIONAL FRAMEWORK

A schematic of the operational set-up for conditional uncertainty is shown in Fig. 1. Alice holds the quantum system of interest, A; Bob holds another system, B, which is the memory. The initial state shared by Alice and Bob, ρAB [32], could contain classical or quantum correlations, even entanglement. But Alice will eventually perform some measurement on A. Let MX ≡ {Mx}x∈{1...n} be the POVM associated with Alice’s potential measurement. The act of measurement maps her quantum system A to an n-dimensional classical system, X, that carries the measurement outcome. Alice and Bob’s post-measurement state is a classical–quantum (CQ) state, i.e., a state of the form

σXB = Σ_{x=1}^n |x⟩⟨x| ⊗ TrA[(Mx^A ⊗ 1^B)ρAB] ≡ Σ_{x=1}^n px |x⟩⟨x| ⊗ σx^B,  (8)

where {|x⟩}x is an orthonormal basis consisting of “classical outcome flags” on the classical register X. We define px ≡ Tr[Mx^A ρA] and σx^B ≡ px^{-1} TrA[(Mx^A ⊗ 1^B)ρAB].

The conditional uncertainty is really a property of this post-measurement CQ state, and so such states are the objects in our formalism. In the remainder, we denote by CQn the set of all CQ states with classical dimension |X| = n. Similarly, we will denote by CCn the set of all classical–classical (CC) states with the first subsystem n-dimensional. We will now identify the class of operations that cannot increase conditional certainty. Alice can perform classical certainty-nonincreasing operations, i.e., random relabelings, on X. Bob, on the other hand, is allowed to perform arbitrary quantum operations on B. Moreover, Bob’s action can in particular produce a classical output, and Alice’s action can depend on this output. This results in the following general form for the allowed operations.

Definition 1. For a state σXB ∈ CQn, a quantum-conditioned random relabeling operation (QCR) is a completely positive (CP) trace-preserving (TP) map of the form

σXB ↦ τX′B ≡ N(σXB) = Σ_j Σ_{x,x′=1}^n px D^(j)_{x′x} |x′⟩⟨x′|^X ⊗ [E^(j)(σx)]^B,  (9)

where j can run over an arbitrary number of values, each of whose corresponding D^(j) is an n × n doubly stochastic matrix and E^(j) a trace-nonincreasing CP map, such that the map E(·) ≡ Σ_j E^(j)(·) is TP.

Note that τX′B is also in CQn. For a pair of CQ states (σXB, τX′B) that satisfy (9) for some QCR N, we will say that σXB quantum-conditionally majorizes (or simply conditionally majorizes) τX′B, denoted σXB ≻c τX′B (or, equivalently, τX′B ≺c σXB). It is worth mentioning that the “quantum” element in the relation comes only from the memory upon which the (un)certainty of X is conditioned; to stress this, we use “quantum” (or, when appropriate, “classical”) as a sort of adverbial qualifier on the adjective “conditional”. Note that {QCR} ⊂ {LOCC}. Note also that, without loss of generality, we can assume that each E^(j) has only one Kraus operator: if E^(j)(·) = Σ_ℓ K_ℓ^(j) (·) K_ℓ^(j)†, then we can define D^(j,ℓ) ≡ D^(j) for all ℓ. In order to characterize conditional majorization, we must find the conditions under which σXB ≻c τX′B for a given pair (σXB, τX′B) of initial and final states.

A. Classical-conditional majorization

Let us first consider a classical memory B ≡ Y, so that the states of XY are in CCn, i.e., just joint probability distributions pXY. In this case we will dispense with the (here) cumbersome bra-ket notation, and instead represent the state using the matrix P whose components are pxy. We use the notation py ≡ Σ_x pxy for the marginal probability of Y = y, and p|y ≡ (px|y)x = py^{-1}(pxy)x for the conditional distribution of X given Y = y. In this case, since Y is classical, we replace the E^(j)’s of (9) with classical trace-nonincreasing operations (i.e., sub-stochastic maps). This results in transformations of the following form.

Definition 2. For a classical register X and a classical memory Y initially in some joint distribution P ∈ CCn, a classical-conditioned random relabeling (CCR) is a stochastic map of the form

P ↦ Q = Σ_j D^(j) P R^(j),  (10)

where j can run over an arbitrary number of values, each of whose corresponding D^(j) is an n × n doubly stochastic matrix and R^(j) a matrix with non-negative entries, such that Σ_j R^(j) is row-stochastic.

Remark (1). Any CC-to-CC state transformation P ↦ Q that can be achieved via QCR can also be achieved by some CCR. For, let Q = N(P) for some QCR N. Then, merely follow N with a rank-1 projection map in the classical basis of Q; this overall amounts to a CCR, while still achieving the transformation P ↦ Q. In light of this fact, we will use the same symbol “≻c” to denote CC-to-CC convertibility via CCR.

Remark (2). Arbitrarily reordering the elements of each column of P is a CCR: just choose D^(j) = Πj (permutations), and R^(j) = ejj, i.e., a matrix with zeros everywhere except the (j, j) element, which is chosen to be 1.

The number of rows of P is just the n that was defined before; this equals the number of rows in Q, because the effective transformation acting on X is always doubly stochastic. However, the arbitrary classical channels allowed on Y can change the number of columns in P to a different one in Q, transforming Y to a different classical system altogether, which we call Z. We use ℓ (respectively, m) to denote the dimension of Y (respectively, Z). In the following, we will prove several results that provide a characterization of classical-conditional majorization, as well as offering insight into its nature. Providing most of the technical proofs in the appendix, here we limit ourselves to a high-level discussion of the essence of the results.

We say that P and Q are conditionally equivalent, and write P ∼c Q, if P ≻c Q and P ≺c Q. Identifying cases of such equivalence leads to further simplification, resulting in the following standard form:

Definition 3 (Standard Form). Let P = [pxy] be an n × ℓ joint distribution matrix. Its standard form P↓ = [p↓xw] is an n × ℓ′ matrix, where ℓ′ ≤ ℓ. To obtain P↓, we apply the following transformations on P: 1.
Reordering within columns: Arrange the elements of each column of P in non-increasing order. Since this is a reversible CCR, we can just assume WLOG that P has this form. That is, for all y = 1, ..., ℓ, p1y ≥ p2y ≥ ··· ≥ pny.

2. Combining proportional columns: If two columns of P, say those corresponding to y and y′, are proportional to each other (i.e., p|y = p|y′), then we replace both by a single column (py + py′)p|y. We do this until no two columns are proportional. The resulting matrix P̃ contains ℓ′ ≤ ℓ columns, each of the form p̃w p̃|w.

3. Reordering the columns: Reorder the ℓ′ columns of P̃ in non-increasing order of p̃w. If there are ties, resolve them by ranking the tied columns in non-increasing order of their first component (p̃1w). If there remain ties, we rank by non-increasing (p̃1w + p̃2w), and so on.

The resulting n × ℓ′ matrix is the final standard form P↓. Lemma A.2 in the appendix shows that P↓ ∼c P.
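The three steps of Definition 3 translate directly into code; a numpy sketch (function name ours; columns are assumed to carry nonzero weight):

```python
import numpy as np

def standard_form(P):
    """Standard form P↓ of a joint distribution matrix (Definition 3):
    sort within columns, merge proportional columns, then order columns."""
    P = -np.sort(-np.asarray(P, float), axis=0)   # step 1: columns non-increasing
    cols = []
    for c in P.T:                                 # step 2: merge proportional columns
        for i, d in enumerate(cols):
            if np.allclose(c / c.sum(), d / d.sum()):
                cols[i] = d + c
                break
        else:
            cols.append(c.copy())
    # Step 3: order by column weight; break ties by partial sums of conditionals.
    def key(c):
        w = c.sum()
        return (w,) + tuple(np.cumsum(c / w))
    cols.sort(key=key, reverse=True)
    return np.column_stack(cols)
```

For instance, two proportional columns collapse into one, so an n × 2 matrix whose columns carry the same conditional distribution reduces to a single column.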

Hereafter, we will (often implicitly) assume all states to be in their standard form, without loss of generality. This enables an elegant characterization of conditional majorization through the n × n lower-triangular matrix

L = [ 1 0 0 ··· 0
      1 1 0 ··· 0
      1 1 1 ··· 0
      ⋮ ⋮ ⋮ ⋱ ⋮
      1 1 1 ··· 1 ].  (11)

Lemma 1. For P and Q in the standard form, P ≻c Q if and only if there exists a row-stochastic matrix T such that

L P T ≥ L Q,  (12)

where the inequality is entrywise.

The predicate of the existence of such a T is really the question of feasibility of the linear programming problem defined by the condition (12). Since the number of free parameters in T, and the specification of the linear program, are all linear in the sizes of the instances of P and Q, one can numerically decide the predicate efficiently using standard linear programming algorithms. However, the existential clause renders the condition rather opaque to intuition and qualitative understanding. In the following, we will try to overcome this obstacle. Firstly, the standard form, together with the simplification afforded by Lemma 1, leads to some nice properties of the conditional majorization relation:

Theorem 2. Let P, Q, R be three probability matrices. Then,

Reflexivity: P ≻c P.
Transitivity: P ≻c Q and Q ≻c R ⇒ P ≻c R.
Antisymmetry: P ≻c Q and Q ≻c P ⇒ P↓ = Q↓.  (13)

That is, ≻c is a partial order with respect to the standard form. The long, but straightforward, proof is provided in the appendix.

There is a finite set of inequalities that prove sufficient as conditions for ordinary majorization between a given pair of vectors; these are in terms of the partial sums of the vector components [30]. In conditional majorization, it turns out to be difficult to find a finite sufficient set of conditions that are as simple to understand as those of majorization. While any instance of conditional majorization is efficiently solvable by linear programming algorithms, we would like conditions in some form that provide more insight; for example, conditions of the form P ≻c Q ⇒ Φ(P) ≤ Φ(Q). To this end, our next, main result is that a restricted collection of such Φ-like measures together form a sufficient condition:

Theorem 3. Let P and Q be n × ℓ and n × m joint probability matrices. Denote by R^{n×m}_{+,1} the set of all n × m row-stochastic matrices. Then, the following conditions are equivalent:

1. Q ≺c P.

2. For all matrices A = [a1, ..., am] ∈ R^{n×m}_{+,1},

Σ_{y=1}^ℓ py ΦA(p|y) ≥ Σ_{w=1}^m qw ΦA(q|w),  (14)

where ΦA is a sub-linear functional given by

ΦA(p|y) ≡ max_{k≤m} (a↓k)^T p↓|y.  (15)

3. For all convex symmetric functions Φ,

Σ_{y=1}^ℓ py Φ(p|y) ≥ Σ_{w=1}^m qw Φ(q|w).  (16)
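Lemma 1 turns any concrete instance of conditional majorization into a linear-programming feasibility problem, which is easy to check numerically. A sketch using scipy (our own code; both matrices are assumed to be in standard form, as the lemma requires):

```python
import numpy as np
from scipy.optimize import linprog

def cond_majorizes(P, Q):
    """Check P ≻c Q via Lemma 1: does a row-stochastic T exist with L P T ≥ L Q?"""
    n, ell = P.shape
    _, m = Q.shape
    L = np.tril(np.ones((n, n)))              # lower-triangular matrix of Eq. (11)
    # Variables: x = vec(T), row-major, length ell*m; zero objective (feasibility).
    A_ub = -np.kron(L @ P, np.eye(m))         # -(L P T) <= -(L Q), entrywise
    b_ub = -(L @ Q).ravel()
    A_eq = np.kron(np.eye(ell), np.ones(m))   # each row of T sums to 1
    b_eq = np.ones(ell)
    res = linprog(np.zeros(ell * m), A_ub=A_ub, b_ub=b_ub,
                  A_eq=A_eq, b_eq=b_eq, bounds=(0, None), method="highs")
    return res.status == 0                    # status 0: feasible optimum found
```

In the special case m = 1 this reduces, as in Eq. (17) below, to ordinary majorization of the marginal p = P e over q.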

Remark. The matrix A can be assumed to be in standard form. Furthermore, by taking Φ in (16) to be the Shannon entropy, the left-hand side of (16) becomes H(X|Y).

The above theorem can be proved using standard methods from the field of linear programming, particularly the Farkas lemma. The details are provided in the appendix. We now look at a few special cases of the matrix dimensions n, ℓ and m, wherein conditional majorization is relatively easier to verify:

1. n = 1, with ℓ and m arbitrary: In this case P↓ = Q↓ always holds, and therefore P ∼c Q for any (P, Q).

2. m = 1, with n and ℓ arbitrary: Here Q = q is a one-column matrix equivalent to a probability vector q. Moreover, T in Lemma 1 also has only one column, and since it is row-stochastic, T = e = (1, ..., 1)^T. Denoting p ≡ P e, we have

P ≻c Q ⟺ p ≻ q.  (17)

3. ℓ = 1, with n and m arbitrary: In this case P = p is a probability vector, and T = (t1, ..., tm) is a probability row vector. We therefore get

P ≻c Q ⟺ p ≻ q|w ∀ w = 1, ..., m.  (18)

4. ℓ = 2, with n and m arbitrary: In this case, we have the following theorem, proved in the appendix.

Theorem 4. Let P and Q be n × 2 and n × m probability matrices given in their standard form. Define

μk ≡ Σ_{x=1}^k (px|1 − px|2) and νk^(w) ≡ Σ_{x=1}^k (qx|w − px|2).  (19)

Denote by I⁺, I⁰, and I⁻ the sets of indices {k} for which μk is positive, zero, and negative, respectively. Furthermore, define p ≡ p1 = Σ_x px1, so that 1 − p = p2. Also define:

αw ≡ (qw/p) max{0, max_{k∈I⁺} νk^(w)/μk};  βw ≡ (qw/p) min{1, min_{k∈I⁻} νk^(w)/μk},  (20)

and through these,

W+(P, Q) ≡ 1 − Σ_{w=1}^m αw;  W−(P, Q) ≡ (Σ_{w=1}^m βw) − 1;  (21)

W0(P, Q) ≡ − max_{w; k∈I⁰} {νk^(w)};  W1(P, Q) ≡ min_{w∈{1,...,m}} (βw − αw).  (22)

Then, Q ≺c P if and only if W0, W1, W+, and W− are all non-negative.

The conditions in the theorem above can be simplified in special cases. The following special case will be helpful in constructing “universal uncertainty bounds” similar to those in [2].

Corollary 5. If p|2 ≺ p|1 and also q|w ≺ p|1 for all w = 1, ..., m, then Q ≺c P if and only if W+(P, Q) ≥ 0. Furthermore, W+(P, Q) ≥ 0 if and only if

p ≥ Σ_{w=1}^m qw max{0, hw}, where hw ≡ max_{k∈I⁺} νk^(w)/μk.  (23)

Moreover, if p|1 = e1 ≡ (1, 0, ..., 0)^T, then

hw = max_k { (Σ_{x=1}^k qx|w − πk) / (1 − πk) }, with πk ≡ Σ_{x=1}^k px|2.  (24)

Proof. If p|2 ≺ p|1 then I⁻ = ∅. If in addition q|w ≺ p|1 then μk ≥ νk^(w). In this case, we always have W0 ≥ 0 and W1 ≥ 0. Then, from Theorem 4, Q ≺c P if and only if W+(P, Q) ≥ 0. Eqs. (23) and (24) follow from direct calculations, with Σ_{x=1}^k px|1 = 1.
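In the special case p|1 = e1 of Corollary 5, condition (23) is straightforward to evaluate numerically; a numpy sketch (helper names ours; Q is the joint n × m matrix, p|2 a probability vector, and we assume πk < 1 for some k):

```python
import numpy as np

def h_w(q_cond, p_cond2):
    """h_w of Eq. (24): max_k (Σ_{x≤k} q_{x|w} − π_k)/(1 − π_k),
    with π_k = Σ_{x≤k} p_{x|2}; only k with μ_k = 1 − π_k > 0 contribute."""
    pi = np.cumsum(p_cond2)
    s = np.cumsum(q_cond)
    ks = pi < 1 - 1e-12
    return float(np.max((s[ks] - pi[ks]) / (1 - pi[ks])))

def rhs_of_23(p_cond2, Q):
    """Right-hand side of condition (23): Σ_w q_w max{0, h_w}."""
    qw = Q.sum(axis=0)
    return sum(q * max(0.0, h_w(Q[:, w] / q, p_cond2))
               for w, q in enumerate(qw) if q > 0)
```

Condition (23) then reads p ≥ rhs_of_23(p_cond2, Q): the more certain Q is, the larger the weight p that the sharp first column of P must carry.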

Using this corollary, for any given arbitrary n-dimensional distribution ω, we can construct a “minimal P” (which we call Ω) that conditionally majorizes a given Q, with the requirement that ω be one of the conditionals of this Ω:

Corollary 6. For an arbitrary n-dimensional distribution ω, define Wω ≡ {w : q|w ≺c ω}. If

Σ_{w∈Wω} qw ≥ 1 − α,  (25)

then Ω ≻c Q, where

Ω = ( α        0        ...  0
      (1−α)ω1  (1−α)ω2  ...  (1−α)ωn )^T,  (26)

i.e., the n × 2 matrix whose first column is α e1 and whose second column is (1 − α)ω.

This last corollary will be applied directly in Section III to obtain a universal uncertainty relation.

B. Quantum-conditional majorization

In the general case, where the memory B can be quantum, conditional majorization becomes rather complex. In particular, it could be hard to find simple conditions that are both necessary and sufficient for conditional majorization. Nevertheless, we can find some sufficient conditions that prove useful in building quantum uncertainty relations. Let σXB ≡ Σ_{x=0}^{n−1} qx |x⟩⟨x| ⊗ σx^B be a classical–quantum state where the σx^B are n normalized states acting on Bob’s Hilbert space HB with dim HB = d. Each σx can be written as:

σx^B = Σ_{y=0}^{d−1} λy|x |y; x⟩⟨y; x| = Σ_{y=0}^{d−1} qy|x |uy|x⟩^B⟨uy|x|,  (27)

where {|y; x⟩}_{y=0}^{d−1} are the eigenvectors of σx^B corresponding to the eigenvalues λy|x. The set of vectors {|uy|x⟩^B}_{y=0}^{d−1} generates another possible decomposition of σx^B. In general, the set {|uy|x⟩^B}_{y=0}^{d−1} is not orthonormal, and in fact, in special cases it is even possible to have |uy|x⟩ = |uy′|x⟩ for y ≠ y′.

Lemma 7. Let σx^B be a density matrix and denote by λ|x = (λ0|x, ..., λn−1|x) its vector of eigenvalues. Then, there exist states {|uy|x⟩}_{y=0}^{d−1} such that σx^B = Σ_{y=0}^{d−1} qy|x |uy|x⟩^B⟨uy|x| if and only if

q|x ≺ λ|x,  (28)

where q|x = (q0|x, ..., qn−1|x).

The uncertainty principle is about there being a “minimum possible uncertainty” in a state. Therefore, (conditional) uncertainty relations involve finding a lower bound on the (conditional) uncertainty of a measurement outcome. Here we are interested in finding a “state-valued” lower bound that can apply in general, without regard to what measure of conditional uncertainty is used. In other words, we are interested in constructing some ΩXB′ ∈ CQn such that ΩXB′ ≻c σXB. We would like this Ω to be nontrivial, yet simple enough that it affords some insight. For this reason, we take as Ansatz the following:

ΩXB = ω|0⟩⟨0|X ⊗ |0⟩⟨0|B + (1 − ω)|1⟩⟨1|X ⊗ |ψ⟩⟨ψ|B,  (29)

where |ψ⟩B is some fixed state in Bob’s Hilbert space, HB. We also complete |0⟩B (i.e., the state appearing in ΩXB) to an orthonormal basis, which we denote by {|y⟩B}_{y=0}^{d−1}. In the theorem below we assume that d = n, since if d < n we can always embed HB in C^n, and if n < d we can always add terms to σXB with zero probabilities. We will therefore use the same index n for both d and n.

Theorem 8. Let σXB and ΩXB be CQ states as above, and for each σx let {qy|x, |uy|x⟩}_{y=0}^{n−1} be one of its many decompositions. Denote by cy = ⟨y|ψ⟩ the y component of |ψ⟩ when written in the basis {|y⟩B} discussed above. Finally, for any j, k ∈ {0, ..., n − 1}, with j ≠ k, denote

ry^(j,k) ≡ qy|k |⟨uy|j|uy|k⟩|²;  r^(j,k) ≡ Σ_{y=0}^{n−1} qy|k |⟨uy|j|uy|k⟩|²;  ω*jk ≡ r^(j,k) qj min_{{y | ry^(j,k) > 0}} (qy|j / ry^(j,k)).  (30)

Then, σXB ≺c ΩXB if there exist j, k ∈ {0, ..., n − 1}, with j ≠ k, such that ω ≤ ω*jk and the following condition holds:

|ψ⟩ = |Ψjk(ω)⟩ ≡ (1/√(1 − ω)) ( √(qk r^(j,k)) |0⟩B + √(1 − ω − r^(j,k) qk) |1⟩B ).  (31)

Corollary 9. Using notation as in Theorem 8, if all σx^B = |φx⟩^B⟨φx| then for any j ∈ {0, ..., n − 1} and ω ∈ [0, qj],

σXB ≺c ω|0⟩⟨0|X ⊗ |0⟩⟨0|B + (1 − ω)|1⟩⟨1|X ⊗ |Ψj(ω)⟩⟨Ψj(ω)|B,  (32)

where

|Ψj(ω)⟩B = (1/√(1 − ω)) ( cj |0⟩B + √(1 − ω − cj²) |1⟩B )  (33)

and

cj ≡ max_{k≠j} |⟨φj|φk⟩| √qk.  (34)
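For pure memory states, Corollary 9 is easy to instantiate numerically; a numpy sketch (function names ours) computing cj of Eq. (34) and the qubit vector |Ψj(ω)⟩ of Eq. (33):

```python
import numpy as np

def c_j(phis, q, j):
    """c_j of Eq. (34): max over k ≠ j of |<φ_j|φ_k>| √q_k."""
    return max(abs(np.vdot(phis[j], phis[k])) * np.sqrt(q[k])
               for k in range(len(q)) if k != j)

def psi_j(phis, q, j, omega):
    """|Ψ_j(ω)> of Eq. (33) in the {|0>, |1>} basis; requires ω ≤ 1 − c_j²."""
    c = c_j(phis, q, j)
    return np.array([c, np.sqrt(1.0 - omega - c ** 2)]) / np.sqrt(1.0 - omega)
```

For identical memory states |φ0⟩ = |φ1⟩ with q = (1/2, 1/2), one gets cj = √(1/2), and |Ψj(ω)⟩ is normalized as it should be.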

C. Measures of conditional uncertainty

Just as Schur-convex functions provide a way of understanding ordinary majorization, we can understand the structure of conditional majorization through analogous functions:

Definition 4. A measure of the uncertainty of X conditioned on B is a function U_{X|B} : CQ_n → R with the following properties:
1. U_{X|B}(|0⟩⟨0|^X ⊗ |0⟩⟨0|^B) = 0.
2. U_{X|B}(σ^{XB}) ≤ U_{X|B'}(τ^{XB'}) whenever σ^{XB} ≻c τ^{XB'}.

Condition 2 above, for any conditional uncertainty measure U_{X|B}, is a necessary condition for σ^{XB} ≻c τ^{XB'}. But since the conditional majorization relation is not a total order, it is not completely characterized by any single measure. How do we construct such measures? One way is to first consider the classical-conditional case:

Definition 5. A measure of the classical-conditioned uncertainty of X given Y is a function U_{X|Y} : CC_n → R with the following properties:
1. U_{X|Y}(P) = 0 if p_{xy} = p_y δ_{xy} (or, equivalently, p_{x|y} = δ_{xy}).
2. U_{X|Y}(P) ≤ U_{X|Y}(Q) whenever P ≻c Q.

Now, define the following:

Definition 6. Let σ^{XB} ∈ CQ_n. For a classical-conditioned uncertainty measure U_{X|Y} (as in Definition 5), the minimum U-classical uncertainty of X given B is the function M_U : CQ_n → R defined by

  M_U(σ^{XB}) := min_{M_Y} U_{X|Y}(P),   (35)

where the minimization is over all POVMs M_Y describing possible measurements on B, and P ∈ CC_n results from σ^{XB} through a particular choice of this measurement.

Lemma 10. It is sufficient to take the minimization in (35) over all rank-1 POVM measurements.

The proof will be provided in Appendix C. Lemma C.1 of the appendix establishes that such a measure is nondecreasing under QCR, thereby justifying its nomenclature. Indeed, from Theorem 3 it follows that for any symmetric concave function Φ, the function

  U_Φ(P) ≡ Σ_y p_y Φ(p_{|y})   (36)

is a measure of uncertainty of X|Y. For this choice of U, the minimal conditional uncertainty is given by

  M_Φ(σ^{XB}) = min_{M_Y} U_Φ(P).   (37)

In the case that Φ is the Shannon entropy H,

  M_{Φ=H}(σ^{XB}) = min_{M_Y} H(X|Y) = H(X) − max_{M_Y} H(X : Y) ≥ S(X|B)_σ,   (38)

where the last inequality follows from the Holevo bound on the accessible information, max_{M_Y} H(X : Y).

A useful method for generating conditional uncertainty measures is to use contractive distance measures (or even pseudo-distance measures, such as the relative entropy) on the space of density matrices. Let Λ(·, ·) be a measure that is contractive under quantum operations, i.e., Λ(E(ρ_1), E(ρ_2)) ≤ Λ(ρ_1, ρ_2) for any two density matrices ρ_1, ρ_2 and any quantum channel E. Now consider the following class of states:

  F_n ≡ { τ^{XB} = (1/n) 1_n^X ⊗ ρ^B } ⊂ CQ_n.

This is the maximal subset of CQ_n that is closed under the set of QCR operations; that is, ∀ E ∈ {QCR}, E(F_n) ⊆ F_n. This is the set of states that can be considered the most conditionally uncertain. Now define U_Λ : CQ_n → R by

  U_Λ(σ^{XB}) = c − inf_{τ^{XB} ∈ F_n} Λ(σ^{XB}, τ^{XB}),   (39)

with c a suitable constant. Then, by virtue of the contractivity of Λ, U_Λ is a measure of the conditional uncertainty of X given B. Many important entropic "distances", such as the relative entropy and the Rényi divergences in certain parametric regimes, can be used in place of Λ to yield such measures.
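As a sanity check on (36), the following sketch (ours; Python, standard library only) evaluates U_Φ with Φ the Shannon entropy, in which case U_Φ(P) is precisely the classical conditional entropy H(X|Y):

```python
import math

def shannon(p):
    """Shannon entropy (in bits) of a probability vector."""
    return -sum(x * math.log2(x) for x in p if x > 0)

def u_phi(P, phi=shannon):
    """Eq. (36): U_Phi(P) = sum_y p_y * Phi(p_{|y}) for a joint matrix
    P[x][y]; with Phi = Shannon entropy this equals H(X|Y)."""
    n, ell = len(P), len(P[0])
    total = 0.0
    for y in range(ell):
        p_y = sum(P[x][y] for x in range(n))
        if p_y > 0:
            cond = [P[x][y] / p_y for x in range(n)]
            total += p_y * phi(cond)
    return total

# Perfectly correlated X, Y: zero conditional uncertainty (Def. 5, cond. 1)
assert u_phi([[0.5, 0.0], [0.0, 0.5]]) == 0.0
# Independent uniform X, Y: one full bit of uncertainty about X
assert abs(u_phi([[0.25, 0.25], [0.25, 0.25]]) - 1.0) < 1e-12
```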

D. Joint conditional uncertainty

Since the uncertainty principle concerns more than one potential (actual or counterfactual) measurement on Alice's side, we require a notion not of the conditional uncertainty of a single measurement, but of the joint conditional uncertainty of two or more measurements. Here we will consider the case of two measurements, but the same methods can be naturally adapted to more than two. We would first like to define the most general notion of joint conditional uncertainty, one that is independent of whether the two measurements on Alice's system are actual or counterfactual, and is not restricted to a particular scenario or task. We therefore require only that such a measure be monotonic under joint QCR:

Definition 7. Let J : CQ_n × CQ_m → R, and let σ^{XB} ∈ CQ_n and γ^{ZC} ∈ CQ_m. Then J is a measure of the joint conditional uncertainty if:
(1) J(σ^{XB}, γ^{ZC}) = 0 if both σ^{XB} and γ^{ZC} are product states of the form |0⟩⟨0| ⊗ ρ.
(2) J(σ^{XB}, γ^{ZC}) ≥ J(σ̃^{XB'}, γ̃^{ZC'}) if both σ^{XB} ≺c σ̃^{XB'} and γ^{ZC} ≺c γ̃^{ZC'}.

Similarly to the minimum classical conditional uncertainty measures, one can construct measures of joint conditional uncertainty from classical measures. This can be done as follows. Let J^cl : CC_n × CC_m → R be a measure of classical joint uncertainty (i.e., satisfying the conditions of Def. 7 for classical states/distributions P and Q). Then,

Theorem 11. Let σ^{XB} ∈ CQ_n and γ^{ZC} ∈ CQ_m be two classical–quantum states; let M_Y and M_W be POVMs acting, respectively, on B and C; and let P = (p_{xy}) and Q = (q_{zw}) be two probability matrices with probabilities p_{xy} = Tr[σ^{XB}(|x⟩⟨x| ⊗ M_y)] and q_{zw} = Tr[γ^{ZC}(|z⟩⟨z| ⊗ M_w)]. Then the function

  J(σ^{XB}, γ^{ZC}) := min_{M_Y, M_W} J^cl(P, Q)   (40)

is a measure of joint conditional uncertainty.

The basic definition of a joint conditional uncertainty measure is independent of any operational task, game, or scenario that involves some facet of the conditional uncertainty principle. But specific tasks often admit the construction of useful measures that naturally reflect the structure of those tasks. For example, consider the scenario in Fig. 2. Here Alice and Bob share two copies of the same state, ρ^{A1B1} ⊗ ρ^{A2B2}. On the first copy, Alice performs the measurement M_X; on the other copy, she performs M_Z. Effectively, the two measurements M_X and M_Z have been combined into one joint measurement on a tensor-product system. Then the conditional uncertainty of XZ given B1B2 in the resulting CCQQ state is actually a manifestation of the joint conditional uncertainty of X|B and Z|B; therefore, any measure of conditional uncertainty of this state is effectively a measure of the joint uncertainty of the two measurements conditional on B. In Section IV we will find a universal conditional uncertainty relation based on this scenario.

FIG. 2: Alice and Bob share two copies of the same state. Different measurements by Alice lead to a joint conditional uncertainty that manifests as the conditional uncertainty of the global state.

One can make variations on the above scenario, depending on the particular task. For example, Bob could perform measurements on his systems. In another scenario (Fig. 3), Alice and Bob actually share only one copy of the state, but a shared random bit r determines the measurements that they each perform. Here again, Alice's two measurements have been combined into one; but this time, instead of the two measurements happening on different copies of the same state, one randomly-chosen measurement is performed. Any measure of conditional uncertainty of this randomized hybrid measurement is then a measure of the joint conditional uncertainty of the two individual measurements.

FIG. 3: Alice and Bob share only one copy of a state, but their measurements are chosen probabilistically.

A particular scenario that is relevant in cryptographic tasks is depicted in Fig. 4: Three parties, Alice, Bob and Eve (who can be interpreted innocently as the "environment", or maliciously as an "eavesdropper") share a pure state Ψ^{ABE}. Such a scenario is considered, for example, in security proofs of cryptographic protocols [1, 29], where one must assume, to account for the worst case, that an eavesdropper has access to a purification of Alice and Bob's shared system. Even within this tripartite scenario, one can consider variations based on the way in which multiple measurements are incorporated. The setup in Fig. 4, for example, involves a shared random bit r that decides whether Alice will make measurement M_X or M_Z. If she happens to choose M_X, then Bob makes a measurement M_Y on his system; else, Eve measures her system through M_W. Once again, any measure of the conditional uncertainty of the CQ state resulting from the hybrid measurement amounts to a measure of the joint uncertainty of the two individual measurements, conditional on Bob's and Eve's systems, respectively.

FIG. 4: Alice, Bob and Eve share a pure state |Ψ⟩. Alice chooses randomly between measurements M_X and M_Z. Upon her choosing M_X, Bob makes the measurement M_Y. If Alice chooses the measurement M_Z, Eve performs M_W.

In the scenario of Fig. 5, as in Fig. 2, the three parties share two copies of the same pure state |Ψ⟩. Now the two alternative r-cases of Fig. 4 both occur, but on different copies of the state. In the following section, as a demonstration of the application of the methods we have developed thus far, we will derive a universal uncertainty relation based on this tripartite scenario.

FIG. 5: Alice, Bob and Eve share two copies of a pure state |Ψ⟩. On the copy where Alice makes measurement {M_{a1}}, Bob makes the measurement {M_b}. On the other copy, Alice performs the measurement {M_{a2}}, whereas Eve performs {M_e}.

III. APPLICATION I: UNIVERSAL, STATE-INDEPENDENT TRIPARTITE MEASUREMENT-BASED CONDITIONAL UNCERTAINTY RELATIONS

We will now construct a universal conditional uncertainty relation for the scenario in Fig. 5: Alice, Bob, and Eve initially share two copies of a pure state Ψ ≡ |Ψ⟩⟨Ψ|, i.e., a state Ψ^{A1B1E1} ⊗ Ψ^{A2B2E2}. Alice performs measurements {M_{a1}} and {M_{a2}} on her two copies. Bob performs {M_b} on his first copy and does nothing with his second. Eve does nothing with her first part, and performs the measurement {M_e} on her second. Treating x ≡ (a1, a2) as one variable and w ≡ (b, e) as another, we will apply our characterization of minimum classical conditional uncertainty (Section II C) to the joint CC state on XW, to get a lower bound on the joint minimum classical uncertainty of Alice's two measurements conditioned separately on Bob's and Eve's systems. We will assume that Alice's measurements are both rank-1 projective measurements. The post-measurement CC state Q on XW is given by

  q_{xw} = q_{a1b} q_{a2e},   (41)

with q_{a1b} = Tr[ (M_{a1}^{A1} ⊗ M_b^{B1} ⊗ 1^{E1}) Ψ^{A1B1E1} ] and q_{a2e} = Tr[ (M_{a2}^{A2} ⊗ 1^{B2} ⊗ M_e^{E2}) Ψ^{A2B2E2} ].

In order to lower-bound the joint conditional uncertainty of a1|b and a2|e, we will lower-bound the conditional uncertainty of X|W, which in turn we will accomplish by constructing a "state-valued" upper bound on Q under the "≻c" partial order. That is, some Ω^{XC} ∈ CC_n such that

  Ω^{XC} ≻c Q^{XW}.   (42)

One possible upper bound is of the form Ω_Triv = |0⟩⟨0|^X ⊗ |0⟩⟨0|^C, which conditionally majorizes every σ^{XB} ∈ CC_n, let alone Q^{XW} ∈ CC_n. But this bound would be trivial, because this Ω contains no conditional uncertainty. On the other hand, Q itself is another trivial upper bound: It is as tight as one could desire, but is state-dependent (in the strongest way possible), and as such provides little insight. We would like to find some Ω that strikes a balance between these factors. The simplest kind of state that has some conditional uncertainty is some Ω^{XY} ∈ CC_n such that ℓ ≡ |Y| = 1 and Ω↓ ≠ (1, 0, 0, ..., 0)^T. Let us try to find an upper bound Ω of this form. From Section II A, recall (18), which gives the condition for Ω ≻c Q in the case ℓ = 1:

  ω ≻ q_{|w},   (43)

where ω is the lone column of Ω. If we wish to make this vector state-independent, and dependent only on Alice's measurements {M_{a1}} and {M_{a2}}, then it must majorize a Q that results from any Ψ and any choice of Bob's and Eve's measurements. We will now show that this results in the trivial upper bound Ω_Triv, irrespective of Alice's measurements. Recall our assumption that Alice's measurements are rank-1 projective. Then, let |α1⟩⟨α1| be one of the POVM elements from her first measurement, and |α2⟩⟨α2| one from her second. Now consider the tripartite pure state

  Ψ^{ABE} = (1/√2) ( |α1⟩|0⟩|1⟩ + |α2⟩|1⟩|0⟩ ).   (44)

If Bob's and Eve's measurements are both just projective measurements with respect to their respective canonical bases, then the conditional distribution q_{|w≡(b,e)=(0,0)} is given by

  q_{(a1,a2)|(0,0)} = δ_{a1,α1} δ_{a2,α2}.   (45)
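The ℓ = 1 condition (43) is easy to test numerically: the single column ω must majorize every conditional column of Q. A minimal sketch (Python; the CC state below is our own toy example, not a distribution from the paper):

```python
def majorizes(p, q, tol=1e-12):
    """Vector majorization p > q: compare partial sums of the
    nonincreasingly sorted vectors (footnote [31])."""
    ps = sorted(p, reverse=True)
    qs = sorted(q, reverse=True)
    acc_p = acc_q = 0.0
    for a, b in zip(ps, qs):
        acc_p += a
        acc_q += b
        if acc_p < acc_q - tol:
            return False
    return True

# Toy CC state Q[x][w] (our own numbers) and a candidate single column omega
Q = [[0.30, 0.10],
     [0.15, 0.25],
     [0.05, 0.15]]
omega = [0.7, 0.2, 0.1]

cols = range(len(Q[0]))
def cond_col(w):
    p_w = sum(Q[x][w] for x in range(len(Q)))
    return [Q[x][w] / p_w for x in range(len(Q))]

# Eq. (43): omega must majorize every conditional column q_{|w}
assert all(majorizes(omega, cond_col(w)) for w in cols)
```

For the deterministic conditional distribution of (45), only ω = (1, 0, ..., 0) passes this test, which is exactly why the ℓ = 1 bound collapses to Ω_Triv.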

Now considering (43), we see that the above conditional distribution forces Ω to be Ω_Triv. So we have to look beyond the ℓ = 1 case; let us now try ℓ = 2. Corollary 6 assures us of the existence of such an Ω given a suitable vector ω. In the following, we will construct such a vector, to finally get our desired Ω.

Theorem 12. Define c ≡ max_{α1,α2} |⟨α1|α2⟩|, where α1 and α2 run over the eigenvectors of Alice's A1 and A2 measurements, respectively. Then for an arbitrary state ψ^{ABE} and measurements {M_b} and {M_e}, we have

  Q ≺c Ω,   (46)

with

  Ω = [ α, 0, ..., 0 ; (1−α)ω_1, (1−α)ω_2, ..., (1−α)ω_n ]^T   (47)

(i.e., the two columns of Ω are (α, 0, ..., 0)^T and (1−α)ω), where

  ω = (β, ..., β, 1 − lβ, 0, ..., 0),   (48)

with β repeated l times, l being the largest integer such that βl ≤ 1, and α, β satisfying

  αβ = (1/4)(1 + c)².   (49)

Proof. We apply Lemma E.2, obtaining that

  Σ_w q_w max_x q_{x|w} ≤ (1/4)(1 + c)².   (50)

We then apply Lemma E.1 with r = (1 + c)²/4.

The relation (46) constitutes a nontrivial uncertainty relation when α and β are strictly less than 1. This happens for

  (1/4)(1 + c)² < α < 1.   (51)

Example: In the case of two qubit measurements {M_{a1}} and {M_{a2}}, we get

  [ α, (1−α)β ; 0, (1−α)(1−β) ] ≻c Q.   (52)
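For two mutually unbiased qubit bases (computational and Hadamard, which is our illustrative choice of Alice's measurements), c = 1/√2 and the bound in (49)-(51) can be evaluated directly:

```python
import math

# Two qubit bases (our illustrative choice): computational and Hadamard.
basis1 = [[1.0, 0.0], [0.0, 1.0]]
s = 1 / math.sqrt(2)
basis2 = [[s, s], [s, -s]]

# c = max overlap between eigenvectors of the two measurements
c = max(abs(sum(u[i] * v[i] for i in range(2)))
        for u in basis1 for v in basis2)

bound = (1 + c) ** 2 / 4          # Eq. (49): alpha * beta must equal this
assert abs(c - s) < 1e-12         # c = 1/sqrt(2) for mutually unbiased bases
assert abs(bound - 0.72855339) < 1e-7

# Eq. (51): the relation is nontrivial whenever bound < alpha < 1,
# e.g. alpha = 0.9 gives a valid beta = bound / alpha < 1
alpha = 0.9
beta = bound / alpha
assert 0 < beta < 1
```

With these α and β the matrix in (52) is a genuinely non-deterministic upper bound, so the resulting uncertainty relation is nontrivial.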

IV. APPLICATION II: UNIVERSAL UNCERTAINTY RELATIONS BASED ON QUANTUM-CONDITIONAL MAJORIZATION

Here we will construct another type of universal conditional uncertainty relation, this time for the scenario of Fig. 2, wherein Alice and Bob initially share two copies of a state, ρ^{A1B1} ⊗ ρ^{A2B2}, and then only Alice makes measurements M_X and M_Z on her parts, Bob retaining his systems unmeasured. We provide a state-valued upper bound on the post-measurement state σ^{XB1} ⊗ τ^{ZB2}. Consider first the case of a pure ρ = |ψ⟩⟨ψ|. We will consider here orthogonal projective measurements of the form M_x = |s_x⟩⟨s_x| and M_z = |t_z⟩⟨t_z|, where {|s_x⟩}_{x=0}^{n−1} and {|t_z⟩}_{z=0}^{n−1} are two orthonormal bases of H^A. We write |ψ⟩ in its Schmidt form

  |ψ⟩^{AB} = Σ_{y=0}^{n−1} √(λ_y) |y⟩^A |y⟩^B.   (53)

Then we have

  σ^{XB1} ≡ Σ_{x=0}^{n−1} p_x |x⟩⟨x|^X ⊗ |φ_x⟩⟨φ_x|^{B1},   (54)

where

  |φ_x⟩^{B1} = (1/√p_x) Σ_{y=0}^{n−1} √(λ_y) ⟨s_x|y⟩ |y⟩^{B1}   with   p_x ≡ Σ_{y=0}^{n−1} λ_y |⟨s_x|y⟩|².   (55)

Similarly,

  τ^{ZB2} ≡ Σ_{z=0}^{n−1} q_z |z⟩⟨z|^Z ⊗ |ϕ_z⟩⟨ϕ_z|^{B2},   (56)

where

  |ϕ_z⟩^{B2} = (1/√q_z) Σ_{y=0}^{n−1} √(λ_y) ⟨t_z|y⟩ |y⟩^{B2}   with   q_z ≡ Σ_{y=0}^{n−1} λ_y |⟨t_z|y⟩|².   (57)
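Equations (53)-(55) are straightforward to evaluate numerically. The sketch below (ours; real amplitudes for simplicity) does so for a maximally entangled two-qubit state measured in the Hadamard basis, where the outcome distribution must come out uniform:

```python
import math

# Schmidt form (53) of a maximally entangled 2-qubit state: lambda = (1/2, 1/2)
lam = [0.5, 0.5]

# Alice's measurement basis {|s_x>} written in the Schmidt basis {|y>}:
# here the Hadamard basis (our illustrative choice), with real amplitudes
s = 1 / math.sqrt(2)
s_basis = [[s, s], [s, -s]]

# Eq. (55): p_x = sum_y lambda_y |<s_x|y>|^2
p = [sum(lam[y] * s_basis[x][y] ** 2 for y in range(2)) for x in range(2)]

# For the maximally entangled state, every basis gives uniform outcomes
assert all(abs(px - 0.5) < 1e-12 for px in p)
assert abs(sum(p) - 1.0) < 1e-12
```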

Hence, from Corollary 9 the following universal uncertainty relation follows:

Theorem 13. For any choice of four indices (x1, z1) ≠ (x2, z2), it holds for all ω ∈ [0, p_{x1} q_{z1}] that

  σ^{XB1} ⊗ τ^{ZB2} ≺c Ω^{XZB1B2}_{x1x2z1z2},   (58)

where Ω^{XZB1B2}_{x1x2z1z2} ≡ ω|0⟩⟨0|^{XZ} ⊗ |x1 z1⟩⟨x1 z1|^{B1B2} + (1−ω)|1⟩⟨1|^{XZ} ⊗ |Ψ_{x1x2z1z2}(ω)⟩⟨Ψ_{x1x2z1z2}(ω)|^{B1B2}, with

  |Ψ_{x1x2z1z2}(ω)⟩^{B1B2} = (1/√(1−ω)) [ |⟨φ_{x1}|φ_{x2}⟩⟨ϕ_{z1}|ϕ_{z2}⟩| √(p_{x2} q_{z2}) |x1⟩^{B1}|z1⟩^{B2}
    + √( p_{x1} q_{z1} + (1 − |⟨φ_{x1}|φ_{x2}⟩⟨ϕ_{z1}|ϕ_{z2}⟩|²) p_{x2} q_{z2} − ω ) |x2⟩^{B1}|z2⟩^{B2}
    + Σ_{(x,z) ∉ {(x1,z1),(x2,z2)}} √(p_x q_z) |x⟩^{B1}|z⟩^{B2} ].   (59)
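The vector in (59) is normalized by construction: before the 1/√(1−ω) prefactor, its squared amplitudes sum to 1 − ω. A quick numeric check with toy values (all numbers below are our own choices):

```python
# Toy values (ours): outcome distributions p, q and the overlap
# A = |<phi_{x1}|phi_{x2}><varphi_{z1}|varphi_{z2}>|
p = [0.5, 0.3, 0.2]
q = [0.6, 0.4]
A = 0.45
x1, z1 = 0, 0
x2, z2 = 1, 1
omega = 0.1  # must lie in [0, p[x1] * q[z1]] = [0, 0.3]

# Squared amplitudes of |Psi(omega)> from Eq. (59), before the prefactor
amp2 = A ** 2 * p[x2] * q[z2]                                 # on |x1>|z1>
amp2 += p[x1] * q[z1] + (1 - A ** 2) * p[x2] * q[z2] - omega  # on |x2>|z2>
amp2 += sum(p[x] * q[z] for x in range(3) for z in range(2)
            if (x, z) not in [(x1, z1), (x2, z2)])            # the rest

# Total squared norm is (1 - omega)/(1 - omega) = 1
assert abs(amp2 / (1 - omega) - 1.0) < 1e-9
```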

We can rephrase this universal uncertainty relation in terms of measures, as

Theorem 13 (Alternate form). Let U_{XZ|B1B2} be a measure of conditional uncertainty as per Definition 4. Then,

  U_{XZ|B1B2}(σ^{XB1} ⊗ τ^{ZB2}) ≥ max_{(x1,z1)≠(x2,z2)} max_{ω∈(0, p_{x1}q_{z1})} U_{XZ|B1B2}(Ω^{XZB1B2}_{x1x2z1z2}) ≡ η(M_X, M_Z, ψ).   (60)

This relation can also be extended to mixed bipartite states, if we restrict to jointly-concave conditional uncertainty measures, i.e., those for which

  U_{XZ|B1B2}(σ^{XB1} ⊗ τ^{ZB2}) ≥ Σ_{k,k'} r_k r_{k'} U_{XZ|B1B2}(σ_k^{XB1} ⊗ τ_{k'}^{ZB2}),   (61)

where σ^{XB1} = Σ_k r_k σ_k^{XB1} and τ^{ZB2} = Σ_k r_k τ_k^{ZB2}. Consider a mixed bipartite state

  ρ^{AB} = Σ_k r_k |ψ_k⟩⟨ψ_k|^{AB}.   (62)

Let σ_k^{XB1} be the CQ state obtained by the measurement M_X^{A1} = {|s_x⟩⟨s_x|} on |ψ_k⟩^{A1B1}, and τ_k^{ZB2} the CQ state obtained by the measurement M_Z^{A2} = {|t_z⟩⟨t_z|} on |ψ_k⟩^{A2B2}. The corresponding CQ states obtained by measuring M_X^{A1} and M_Z^{A2}, respectively, on ρ^{AB} are then σ^{XB1} and τ^{ZB2}. Therefore,

  σ^{XB1} ⊗ τ^{ZB2} = Σ_{k,k'} r_k r_{k'} σ_k^{XB1} ⊗ τ_{k'}^{ZB2}   (63)

is a convex mixture of "conditionally pure" CQ states. If U_{XZ|B1B2} is concave, it follows that (60) holds for any ρ. Corollary 9 can be used in other scenarios as well, to yield fully quantum conditional uncertainty relations. For example, we can apply the same corollary to the tripartite tensor-product scenario of Section III, without Bob and Eve making measurements.

V. DISCUSSION AND CONCLUSION

We developed an operational framework that defines conditional uncertainty—that is, uncertainty of a classical variable (a measurement outcome) conditioned on access to a quantum memory—through a mathematical relation called conditional majorization. This relation, while a natural generalization of majorization, is much harder to characterize. The relation, denoted "≺c", is defined between classical–quantum (CQ) states: a CQ state σ^{XB} is more conditionally uncertain than another, τ^{XB'}, if and only if σ^{XB} ≺c τ^{XB'}. We found several useful results about conditional majorization in the classical-memory case. These include an efficiently-computable linear program to verify classical-conditioned majorization, a class of measures that provide necessary and sufficient conditions for the relation, and much simpler conditions in some special cases. In the quantum-memory case, we proved a sufficient condition for conditional majorization. We also found a method of constructing a tight state-valued upper bound (under the "≺c" relation) Ω^{XB'} to a given state σ^{XB}.

We then developed a toolkit that can be used to construct measures of conditional uncertainty. Measures of quantum-conditional uncertainty (which are defined on CQ states σ^{XB}) can be constructed from measures of classical-conditional uncertainty (defined on CC states P^{XY}) by applying the latter to probability distributions obtained by measuring the memory system (B), and then optimizing over this measurement. We called such measures "measures of the minimum classical-conditioned uncertainty". In addition, contractive metric or pseudo-metric functions on the space of density matrices can be used to define conditional uncertainty measures. These measures are in the spirit of "least distance to the set of maximally conditionally-uncertain states", which are states of the form (1/n) 1_n^X ⊗ σ^B.

We then extended the idea of measures to quantify the joint conditional uncertainty of two or more measurements. After defining the most general notion of joint conditional uncertainty measures, we considered some special operational scenarios that naturally represent some aspect of joint conditional uncertainty. These include a few bipartite and tripartite operational schemes (Figs. 2, 3, 4 and 5). We used these operational schemes in two illustrative applications.

The first one applied our characterization of minimum classical-conditioned uncertainty to a three-player game (Fig. 5) where Alice, Bob, and Eve initially share two copies of a pure tripartite state, Ψ^{A1B1E1} ⊗ Ψ^{A2B2E2}. Subsequently, Alice makes measurements {M_{a1}} and {M_{a2}} on her copies, while Bob and Eve each measure only one of their respective copies (with measurements {M_b} and {M_e}, respectively), discarding the other. We derived a state-valued upper bound on the resulting post-measurement 4-partite classical–classical state, effectively lower-bounding the joint uncertainty of Alice's two measurements conditioned on the systems owned by Bob and Eve, respectively. This universal conditional uncertainty relation is of the form

  p^{A1B} ⊗ q^{A2E} ≺c Ω.   (64)

The other application was to the bipartite scenario (Fig. 2), where Alice and Bob share two copies of some pure state, ψ^{A1B1} ⊗ ψ^{A2B2}. Alice makes two different measurements on her copies, while Bob keeps his system as it is. We then found a state-valued upper bound on the resulting 4-partite CQ–CQ state, resulting in the following universal conditional uncertainty relation:

  σ^{XB} ⊗ τ^{ZB} ≺c Ω(ρ^{AB}).   (65)

An open problem that immediately suggests itself regards the state-dependent nature of the universal relation (60). It is not hard to see that, in the present form of the relation, a state-independent bound Ω would necessarily have to be trivial: Just consider the initial state ψ^{AB} to be maximally entangled, in which case the conditional uncertainty is zero for any projective measurement on Alice's part. On the other hand, Berta et al.'s relation

  H(X|B)_σ + H(Z|B)_τ ≥ log₂(1/c) + H(A|B)_ρ   (66)

retains a non-trivial element even for the maximally entangled case, essentially by splitting this zero conditional uncertainty between the two terms on the right: The first (positive) term representing the incompatibility of the two measurements, and the second (negative) term embodying the initial correlation between A and B. It would be interesting if our universal relation could be made nontrivially state-independent using a similar “split” between measurement incompatibility and initial correlation. Note that the tripartite case does not suffer from the same problem, enabling us to formulate a state-independent universal relation in Section III, where we considered accessible information–like measures of conditional uncertainty. We expect a state-independent tripartite relation to be obtainable also for other kinds of measures, using Corollary 9 or similar results. We leave this, too, for future work. We expect that the characterization of quantum-conditioned majorization simplifies in special cases, such as the case of pure initial states, qubit measurements, etc. These cases are left for future work, as well. Another potential future work is to find other ways of constructing measures of individual and joint conditional uncertainty. For example, one could take measures defined on CQ states whose conditional quantum states are pure, and then extend these measures via the convex-roof extension. An important project to be undertaken is the application of our methods to concrete cryptographic tasks, such as key distribution and coin-tossing. Replacing the existing, entropy-based methods with majorization-based methods is likely to improve the analysis of the single-shot or finite-size case of such tasks, possibly beyond the improvement already afforded by the smooth R´enyi entropy calculus. The scenarios we considered in our applications, and indeed, all the ones depicted in the figures, are but a handful of examples that we contrived to illustrate our methods. 
The framework of conditional uncertainty (and joint conditional uncertainty) that we have developed is very general and versatile, and can be used in many other scenarios. This opens up plenty of avenues for future investigation.

Acknowledgments

This work is supported by ERC Advanced Grant QOLAPS and National Science Centre project Maestro DEC2011/02/A/ST2/00305.

[1] Mario Berta, Matthias Christandl, Roger Colbeck, Joseph M. Renes, and Renato Renner. The uncertainty principle in the presence of quantum memory. Nature Physics, 6(9):659–662, July 2010.
[2] Shmuel Friedland, Vlad Gheorghiu, and Gilad Gour. Universal uncertainty relations. Physical Review Letters, 111(23):230401, December 2013.
[3] Zbigniew Puchała, Łukasz Rudnicki, and Karol Życzkowski. Majorization entropic uncertainty relations. Journal of Physics A: Mathematical and Theoretical, 46(27):272002, 2013.
[4] Łukasz Rudnicki, Zbigniew Puchała, and Karol Życzkowski. Strong majorization entropic uncertainty relations. Physical Review A, 89(5):052115, 2014.
[5] Łukasz Rudnicki. Majorization approach to entropic uncertainty relations for coarse-grained observables. Physical Review A, 91(3):032123, 2015.
[6] Varun Narasimhachar, Alireza Poostindouz, and Gilad Gour. The principle behind the uncertainty principle. arXiv preprint arXiv:1505.02223, 2015.
[7] W. Heisenberg. Über den anschaulichen Inhalt der quantentheoretischen Kinematik und Mechanik. Zeitschrift für Physik, 43(3-4):172–198, March 1927.
[8] Ivan Damgård, Serge Fehr, Louis Salvail, and Christian Schaffner. Cryptography in the bounded quantum-storage model. page 26, August 2005.
[9] Masato Koashi. Simple security proof of quantum key distribution via uncertainty principle. page 4, May 2005.
[10] David P. DiVincenzo, Michał Horodecki, Debbie W. Leung, John A. Smolin, and Barbara M. Terhal. Locking classical correlations in quantum states. Physical Review Letters, 92(6):067902, February 2004.
[11] Jonathan Oppenheim and Stephanie Wehner. The uncertainty principle determines the nonlocality of quantum mechanics. Science, 330(6007):1072–1074, November 2010.
[12] Otfried Gühne. Characterizing entanglement via uncertainty relations. Physical Review Letters, 92(11):117903, March 2004.
[13] A. I. Lvovsky. Squeezed light. Photonics: Scientific Foundations, Technology and Applications, Volume 1, pages 121–163.
[14] Samuel L. Braunstein and Peter van Loock. Quantum information with continuous variables. Reviews of Modern Physics, 77(2):513, 2005.
[15] Ulrik L. Andersen, Gerd Leuchs, and Christine Silberhorn. Continuous-variable quantum information processing. Laser & Photonics Reviews, 4(3):337–354, 2010.
[16] H. Robertson. The uncertainty principle. Physical Review, 34(1):163–164, July 1929.
[17] I. I. Hirschman. A note on entropy. American Journal of Mathematics, 79(1):152, January 1957.
[18] Iwo Bialynicki-Birula and Jerzy Mycielski. Uncertainty relations for information entropy in wave mechanics. Communications in Mathematical Physics, 44(2):129–132, June 1975.
[19] David Deutsch. Uncertainty in quantum measurements. Physical Review Letters, 50(9):631–633, February 1983.
[20] Hans Maassen and J. Uffink. Generalized entropic uncertainty relations. Physical Review Letters, 60(12):1103–1106, March 1988.
[21] Stephanie Wehner and Andreas Winter. Entropic uncertainty relations: a survey. New Journal of Physics, 12(2):025009, 2010.
[22] I. D. Ivanovic. An inequality for the sum of entropies of unbiased quantum measurements. Journal of Physics A: Mathematical and General, 25(7):L363–L364, April 1992.
[23] Jorge Sánchez. Entropic uncertainty and certainty relations for complementary observables. Physics Letters A, 173(3):233–239, February 1993.
[24] Manuel Ballester and Stephanie Wehner. Entropic uncertainty relations and locking: Tight bounds for mutually unbiased bases. Physical Review A, 75(2):022319, February 2007.
[25] Shengjun Wu, Sixia Yu, and Klaus Mølmer. Entropic uncertainty relation for mutually unbiased bases. Physical Review A, 79(2):022104, February 2009.
[26] Marco Tomamichel and Renato Renner. Uncertainty relation for smooth entropies. Physical Review Letters, 106(11):110506, March 2011.
[27] Patrick J. Coles, Roger Colbeck, Li Yu, and Michael Zwolak. Uncertainty relations from simple entropic properties. Physical Review Letters, 108(21):210405, May 2012.
[28] Chuan-Feng Li, Jin-Shi Xu, Xiao-Ye Xu, Ke Li, and Guang-Can Guo. Experimental investigation of the entanglement-assisted entropic uncertainty principle. Nature Physics, 7(10):752–756, 2011.
[29] Marco Tomamichel, Charles Ci Wen Lim, Nicolas Gisin, and Renato Renner. Tight finite-key analysis for quantum cryptography. Nature Communications, 3:634, 2012.
[30] Albert W. Marshall, Ingram Olkin, and Barry C. Arnold. Inequalities: Theory of Majorization and Its Applications. Springer Series in Statistics. Springer New York, New York, NY, 2011.

[31] An n-dimensional vector p is said to majorize another vector, q, denoted "p ≻ q", if for all k ∈ {1, 2, ..., n},

  Σ_{j=1}^{k} p↓_j ≥ Σ_{j=1}^{k} q↓_j,

where p↓ is a vector consisting of the components of p arranged in nonincreasing order, and likewise, q↓ . [32] Slightly deviating from convention, we will use superscripts to specify the names of subsystems A, B, etc.

Appendix A: Classical-conditional majorization

Notation: P is the matrix whose components are the joint probabilities p_{xy}. We use the notation p_y ≡ Σ_x p_{xy} for the marginal probability of Y = y, and p_{|y} ≡ (p_{x|y})_x = p_y^{−1} (p_{xy})_x for the conditional distribution of X given Y = y. Denote by R^{n×ℓ}_{+,1} the set of all n × ℓ row-stochastic matrices, and by R^{n×ℓ}_+ the set of all n × ℓ matrices with non-negative entries. In particular, P ∈ R^{n×ℓ}_+ in our generic example, while Q ∈ R^{n×m}_+. We are interested in the following condition for the conditional majorization relation P ≻c Q:

  Q = Σ_j D^{(j)} P R^{(j)},   (A1)

where each D^{(j)} is an n × n doubly stochastic matrix, and each R^{(j)} is an ℓ × m matrix of non-negative entries, with Σ_j R^{(j)} row-stochastic.

Remark. Arbitrarily reordering the elements of each column of P is a CCR: Just choose D^{(j)} = Π_j (permutations), and R^{(j)} = e_{jj}, i.e., a matrix with zeros everywhere except the (j, j) element, which is chosen to be 1.

In componentwise form:

  q_{xw} = Σ_j Σ_{y=1}^{ℓ} (D^{(j)} p_y)_x R^{(j)}_{yw}.   (A2)

Here (D^{(j)} p_y)_x is the x-th component of the vector D^{(j)} p_y, where p_y is the y-th column of P (we will similarly use q_w for the columns of Q). To simplify this relation, we denote

  t_{yw} ≡ Σ_j R^{(j)}_{yw}   and   D^{(y,w)} ≡ Σ_j (R^{(j)}_{yw} / t_{yw}) D^{(j)}.   (A3)

Note that D^{(y,w)} is an n × n doubly stochastic matrix and T = [t_{yw}] is an ℓ × m row-stochastic matrix. With these notations the relation in (A2) becomes

  q_w = Σ_{y=1}^{ℓ} t_{yw} D^{(y,w)} p_y.   (A4)

Lemma A.1. P ≻c Q if and only if there exists a set of n × n doubly stochastic matrices D^{(y,w)} (ℓm in number) and an ℓ × m row-stochastic matrix T ≡ [t_{yw}] such that the columns of P (denoted p_y) are related to those of Q (q_w) through (A4).

Proof. We already proved that P ≻c Q implies the existence of D^{(y,w)} and T with the desired properties, satisfying (A4). It remains to show that if such matrices exist, then there exist D^{(j)} and R^{(j)} that satisfy (A1). Indeed, since the D^{(y,w)} are doubly stochastic, we can write them as D^{(y,w)} = Σ_j s^{(j)}_{yw} Π^{(j)}, where the Π^{(j)} are permutation matrices and Σ_j s^{(j)}_{yw} = 1. Hence, taking R^{(j)} to be the matrix whose components are t_{yw} s^{(j)}_{yw}, and D^{(j)} the permutation matrices Π^{(j)}, completes the proof.

Lemma A.2. Consider an n × ℓ probability matrix P = [p_1, ..., p_ℓ] such that one of the columns, say p_1, is a multiple of another column, say p_2. That is, there exists a non-negative real number λ such that p_1 = λ p_2. Then,

  P ∼c P' ≡ [(1+λ) p_2, p_3, ..., p_ℓ],   (A5)

where P' is an n × (ℓ−1) probability matrix.

Proof. Let T be the ℓ × (ℓ−1) row-stochastic matrix whose first row is (1, 0, ..., 0) and whose remaining ℓ−1 rows form the identity matrix I_{ℓ−1}:

  T = [ 1 0 ··· 0 ;
        I_{ℓ−1} ].   (A6)

Note that P T = P' and therefore P' ≺c P. Let S be the following (ℓ−1) × ℓ row-stochastic matrix:

  S = [ λ/(1+λ)  1/(1+λ)  0 ··· 0 ;
        0        0        1 ··· 0 ;
        ⋮                 ⋱      ;
        0        0        0 ··· 1 ].   (A7)

Note that P' S = P and therefore P ≺c P'. Hence, P ∼c P'.

Note that the elements of each column of P can be arbitrarily reordered under the allowed transformations. Together with the above lemma, this allows us to define the following standard form:

Definition 3. Let P = [p_{xy}] be an n × ℓ joint distribution matrix. Its standard form P↓ = [p↓_{xw}] is an n × ℓ' matrix, where ℓ' ≤ ℓ. To obtain P↓, we apply the following transformations on P:
1. Reordering within columns: Arrange the elements of each column of P in non-increasing order. Since this is a reversible CCR, we can just assume WLOG that P has this form. That is, for all y = 1, ..., ℓ, p_{1y} ≥ p_{2y} ≥ ··· ≥ p_{ny}.
2. Combining proportional columns: If two columns of P, say those corresponding to y and y', are proportional to each other (i.e., p^{|y} = p^{|y'}), then we replace both by a single column (p_y + p_{y'}) p^{|y}. We do this until no two columns are proportional. The resulting matrix P̃ contains ℓ' ≤ ℓ columns, each of the form p̃_w p̃^{|w}.
3. Reordering the columns: Reorder the ℓ' columns of P̃ in non-increasing order of p̃_w. If there are ties, resolve them by ranking the tied columns in non-increasing order of their first component (p̃_{1w}). If there remain ties, we rank by non-increasing (p̃_{1w} + p̃_{2w}), and so on.
The resulting n × ℓ' matrix is the final standard form P↓. Lemma A.2 implies that P↓ ∼c P. In the remainder we will assume that P and Q are given in this standard form.

Since all the D^{(y,w)} in (A4) are doubly stochastic, we get that

  Σ_{x=1}^{k} q_{xw} ≤ Σ_{y=1}^{ℓ} t_{yw} Σ_{x=1}^{k} p_{xy},   ∀ k = 1, ..., n.   (A8)

Note that by taking the sum over w on both sides, we get that the marginal distributions p = (Σ_y p_{xy})_x and q = (Σ_w q_{xw})_x satisfy p ≻ q. Denote by P̃ and Q̃ the matrices whose components are

  p̃_{ky} = Σ_{x=1}^{k} p_{xy}   and   q̃_{kw} = Σ_{x=1}^{k} q_{xw}.   (A9)

That is,

  P̃ = L P   and   Q̃ = L Q,   (A10)

where L is the following n × n lower-triangular matrix of ones:

  L = [ 1 0 0 ··· 0 ;
        1 1 0 ··· 0 ;
        1 1 1 ··· 0 ;
        ⋮       ⋱   ;
        1 1 1 ··· 1 ].   (A11)
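The steps of Definition 3 can be turned into a small routine. The following is a sketch (ours; the helper name `standard_form` and the test matrix are our own) of the three transformations:

```python
def standard_form(P, tol=1e-12):
    """Sketch of Definition 3: sort within columns, merge proportional
    columns, then sort columns by weight (ties broken by cumulative
    components). Returns the list of columns of the standard form."""
    n, ell = len(P), len(P[0])
    # Step 1: reorder within each column (non-increasing)
    cols = [sorted((P[x][y] for x in range(n)), reverse=True)
            for y in range(ell)]
    # Step 2: merge columns with equal conditional distributions
    merged = []
    for col in cols:
        w = sum(col)
        cond = [c / w for c in col] if w > 0 else col
        for m in merged:
            if all(abs(a - b) < tol for a, b in zip(m['cond'], cond)):
                m['w'] += w
                break
        else:
            merged.append({'w': w, 'cond': cond})
    # Step 3: sort by weight; ties by cumulative components p~_{1w}, ...
    def key(m):
        acc, cum = 0.0, []
        for c in m['cond']:
            acc += c
            cum.append(m['w'] * acc)
        return [m['w']] + cum
    merged.sort(key=key, reverse=True)
    return [[m['w'] * c for c in m['cond']] for m in merged]

# Columns (0.3, 0.1) and (0.05, 0.15) are proportional after step 1,
# so they merge into a single column of weight 0.6
P = [[0.3, 0.15, 0.05],
     [0.1, 0.25, 0.15]]
Pd = standard_form(P)
assert len(Pd) == 2                   # ell' = 2 columns remain
```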

We will now prove an important simplification due to the standard form, which we stated as Lemma 1 in the main text:

Lemma 1. For P and Q in the standard form, P ≻c Q if and only if there exists a row-stochastic matrix T such that

  L Q ≤ L P T,   (A12)

where the inequality is entrywise.

Proof. We have seen in Eq. (A8) that if Q ≺c P then there exists a row-stochastic matrix T such that Q̃ ≤ P̃T. Conversely, suppose there exists a row-stochastic matrix T satisfying (A12). Denote A ≡ PT, with components

    a_{kw} = \sum_{y=1}^{\ell} p_{ky}\, t_{yw}.    (A13)

The condition Q̃ ≤ P̃T is equivalent to q_w ≺_w a_w, where a_w = (a_{kw})_k and the symbol ≺_w stands for weak majorization (i.e., instead of equality we only have \sum_{k=1}^{n} q_{kw} \le \sum_{k=1}^{n} a_{kw}). It is known (see e.g. [30]) that q_w ≺_w a_w if and only if there exist an entrywise non-negative matrix S^{(w)} and a doubly stochastic matrix D^{(w)} such that q_w = S^{(w)} a_w and S^{(w)} ≤ D^{(w)} entrywise. Therefore, there exists a doubly stochastic matrix D^{(w)} such that q_w ≤ D^{(w)} a_w. In components this reads

    q_{xw} \le \sum_{k=1}^{n} D^{(w)}_{xk} a_{kw} = \sum_{k=1}^{n} \sum_{y=1}^{\ell} D^{(w)}_{xk}\, p_{ky}\, t_{yw} \equiv q'_{xw}.    (A14)

However, since the components q_{xw} and q'_{xw} both sum to one, the condition 0 ≤ q_{xw} ≤ q'_{xw} implies that q_{xw} = q'_{xw}. Hence, this equation is equivalent to (A4) with D^{(y,w)} ≡ D^{(w)}, and from Lemma A.1 it follows that Q ≺c P. This completes the proof.
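As an illustration (not part of the paper), the steps of Definition 3 and the forward direction of Lemma 1 can be sketched numerically. The helper names and example matrices below are our own choices; the check builds Q by applying a row-stochastic T followed by column-wise doubly stochastic matrices to P, and verifies the entrywise inequality LQ ≤ LPT of Eq. (A12):

```python
import numpy as np

rng = np.random.default_rng(0)

def standard_form(P, tol=1e-12):
    """Definition 3: sort within columns, merge proportional columns, order columns."""
    P = -np.sort(-np.asarray(P, dtype=float), axis=0)   # step 1: sort each column descending
    cols = []                                           # step 2: merge proportional columns
    for col in P.T:
        w = col.sum()
        if w <= tol:
            continue
        cond = col / w                                  # conditional distribution p^{|y}
        for entry in cols:
            if np.allclose(entry[1], cond, atol=1e-9):
                entry[0] += w
                break
        else:
            cols.append([w, cond])
    # step 3: order by weight, breaking ties by the cumulative sums of the column
    cols.sort(key=lambda e: (e[0], *np.cumsum(e[1])), reverse=True)
    return np.column_stack([w * c for w, c in cols])

P = np.array([[0.1, 0.2, 0.2],
              [0.3, 0.1, 0.1]])
print(standard_form(P))        # columns 2 and 3 merge: [[0.4, 0.3], [0.2, 0.1]]

def random_doubly_stochastic(n, k=20):
    # average of random permutation matrices (Birkhoff-von Neumann)
    return sum(np.eye(n)[rng.permutation(n)] for _ in range(k)) / k

# Forward direction of Lemma 1: if Q = D[PT] with P's columns sorted, then LQ <= LPT.
n, ell, m = 4, 3, 2
P2 = rng.random((n, ell))
P2 = -np.sort(-P2, axis=0)
P2 /= P2.sum()
T = rng.random((ell, m)); T /= T.sum(axis=1, keepdims=True)   # row stochastic
A = P2 @ T
Q = np.column_stack([random_doubly_stochastic(n) @ A[:, w] for w in range(m)])
L = np.tril(np.ones((n, n)))                                  # the matrix L of Eq. (A11)
assert np.all(L @ Q <= L @ A + 1e-10)                         # Eq. (A12), entrywise
```

The entrywise check relies on the columns of P being sorted (step 1 of the standard form); without that sorting the inequality can fail, which is exactly why the lemma is stated for matrices in standard form.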

Remark. Note that the proof of Lemma 1 also implies that Q ≺c P if and only if there exist m doubly stochastic matrices D^{(w)} and a row-stochastic matrix T such that

    q_w = D^{(w)} \sum_{y=1}^{\ell} t_{yw}\, p_y, \qquad \forall\, w = 1, \ldots, m.    (A15)

This is a simpler version of (A4). For any set of m doubly stochastic matrices D = {D^{(1)}, ..., D^{(m)}} and an n × m matrix A whose columns are a_z, we define

    D[A] := \left[ D^{(1)} a_1, \ldots, D^{(m)} a_m \right].    (A16)

With these notations,

    Q \prec_c P \iff Q = D[PT]    (A17)

for some set of m doubly stochastic matrices D and a row-stochastic matrix T.

We now prove Theorem 2 of the main text:

Theorem 2. Let P, Q, R be three probability matrices. Then,

    Reflexivity: P ≺c P
    Transitivity: P ≺c Q and Q ≺c R ⇒ P ≺c R
    Antisymmetry: P ≺c Q and Q ≺c P ⇒ P↓ = Q↓    (A18)

That is, ≺c is a partial order with respect to the standard form.

Proof. Reflexivity and transitivity of ≺c follow directly from its definition in (A1). To prove antisymmetry, suppose P ≺c Q and Q ≺c P (i.e., Q ∼c P), and suppose Q and P are given in their standard form; that is, we assume P = P↓ and Q = Q↓. From Lemma 1 we know that there exist row-stochastic matrices T and R such that LQ ≤ LPT and LP ≤ LQR. Combining these two inequalities gives

    LQ \le LQRT \quad \text{and} \quad LP \le LPTR.    (A19)

Since RT is a row-stochastic matrix, the sums of the columns of LQ and of LQRT are the same vector. Therefore, we must have LQ = LQRT, and by the same argument LP = LPTR. Since L is invertible, this in turn gives

    Q = QRT \quad \text{and} \quad P = PTR.    (A20)

We next prove that RT and TR must be the identity matrices.

Lemma A.3. Let A be an m × m row-stochastic matrix, and let Q = Q↓ be an n × m matrix with non-negative components. Then,

    Q = QA \;\Rightarrow\; A = I_m.    (A21)

Proof. The proof is by induction over m. For m = 1, Q is an n × 1 column matrix, and A is a 1 × 1 row-stochastic matrix, i.e., the number 1; therefore the lemma holds for m = 1. Next, suppose the lemma holds for all n × m non-negative matrices Q^{(m)} (in their standard form); i.e., if Q^{(m)} = Q^{(m)} A^{(m)} for some m × m row-stochastic matrix A^{(m)}, then A^{(m)} = I_m. We need to show that the same holds for all n × (m+1) non-negative matrices Q^{(m+1)} (in their standard form). Indeed, let A^{(m+1)} be an (m+1) × (m+1) row-stochastic matrix, and denote

    Q^{(m+1)} = \begin{pmatrix} Q^{(m)} & q_{m+1} \end{pmatrix} \quad \text{and} \quad A^{(m+1)} = \begin{pmatrix} A^{(m)} & v \\ u^T & 1-u \end{pmatrix},    (A22)

where u, v ∈ R^m_+ and u denotes the sum of the components of u. Note that while A^{(m)} above is non-negative, it is not necessarily row-stochastic. More precisely, the sum of the columns of A^{(m)} is e − v, where e = (1, ..., 1)^T. Now, suppose that Q^{(m+1)} = Q^{(m+1)} A^{(m+1)}. With the notations above this is equivalent to

    Q^{(m)} = Q^{(m)} A^{(m)} + q_{m+1} u^T, \qquad q_{m+1} = Q^{(m)} v + (1-u)\, q_{m+1}.    (A23)

The second equation is equivalent to u\,q_{m+1} = Q^{(m)} v. Therefore, if u = 0 then v = 0, since Q^{(m)} has no zero columns. Clearly, in this case Q^{(m)} = Q^{(m)} A^{(m)}, so A^{(m)} = I_m by the induction hypothesis, and this also gives A^{(m+1)} = I_{m+1}. We therefore assume now that u > 0 and hence v ≠ 0. Substituting q_{m+1} = (1/u) Q^{(m)} v into the first equation of (A23) gives

    Q^{(m)} = Q^{(m)} \left( A^{(m)} + \frac{1}{u} v u^T \right).    (A24)

Since the sum of the columns of A^{(m)} is e − v, the matrix A^{(m)} + (1/u) v u^T is an m × m row-stochastic matrix, and therefore from the induction hypothesis we get

    A^{(m)} + \frac{1}{u} v u^T = I_m.    (A25)

However, since u, v, A^{(m)} ≥ 0, the components of u and v must satisfy

    u_i v_j = 0 \quad \text{for } i \ne j.    (A26)

Since u ≠ 0 and v ≠ 0, there must exist an index i_0 such that u_{i_0} ≠ 0, v_{i_0} ≠ 0, and u_j = v_j = 0 for all j ≠ i_0. However, in this case,

    q_{m+1} = \frac{1}{u} Q^{(m)} v = \frac{v_{i_0}}{u_{i_0}}\, q_{i_0}.    (A27)

That is, q_{m+1} is a multiple of another column of Q^{(m+1)}, contrary to the assumption that Q^{(m+1)} is given in its standard form. Therefore, we must have u = 0, which as discussed above corresponds to A^{(m+1)} = I_{m+1}. This completes the proof of the lemma.

Now, using this lemma in the relation (A20) gives that both RT = I_m and TR = I_ℓ. Hence, m = ℓ and R = T^{-1}. But since both R and T are row-stochastic, they must be permutation matrices. Finally, since Q and P are given in their standard form, the permutation matrices R and T must be the identity matrices. This completes the proof of the antisymmetry of ≺c.

For any P ∈ R^{n×ℓ}_+, denote by

    E(P, k) := \left\{ Q' \in \mathbb{R}^{n \times k}_+ : Q' \prec_c P \right\},    (A28)

which we call the Markotop of P. Note that the Markotop of P is a compact convex set. Its vertices are the matrices of the form D[PT], where D consists of m permutation matrices and T is a matrix whose rows are elements of the standard basis of R^m.

Lemma A.4. Given an n × ℓ matrix P and an n × m matrix Q,

    Q \prec_c P \iff E(Q, k) \subseteq E(P, k), \qquad \forall\, k \in \mathbb{N}.    (A29)

Proof. Suppose Q ≺c P and let A ∈ E(Q, k). Then, by definition, A ≺c Q, and Theorem 2 gives A ≺c P; that is, A ∈ E(P, k). Conversely, Q ∈ E(Q, m) ⊆ E(P, m) implies Q ≺c P.

Let S_{E(P,m)} : R^{n×m} → R be the support function of the Markotop of P, defined by

    S_{E(P,m)}(A) := \max\left\{ \mathrm{Tr}(A^T Q) : Q \in E(P, m) \right\}    (A30)

for any A ∈ R^{n×m}. Support functions of non-empty compact convex sets have the following property:

    E(Q, m) \subseteq E(P, m) \iff S_{E(Q,m)} \le S_{E(P,m)}.    (A31)

By the previous lemma, the support function therefore provides a characterization of conditional majorization. We calculate

    \mathrm{Tr}(A^T Q) = \sum_{k=1}^{n} \sum_{w=1}^{m} a_{kw}\, q_{kw} = \sum_{w=1}^{m} \sum_{y=1}^{\ell} t_{yw} \sum_{x=1}^{n} \sum_{k=1}^{n} D^{(w)}_{kx}\, a_{kw}\, p_{xy}.    (A32)

Since the set of doubly stochastic matrices is the convex hull of the permutation matrices, we get

    \max_{D^{(w)}} \sum_{x=1}^{n} \sum_{k=1}^{n} D^{(w)}_{kx}\, a_{kw}\, p_{xy} = \max_{\pi} \sum_{x=1}^{n} a_{\pi(x)w}\, p_{xy} = \max_{\Pi} (\Pi a_w)^T p_y,    (A33)

where the second maximum is over all permutations π, and the last maximum is over all n × n permutation matrices Π. Therefore,

    S_{E(P,m)}(A) = \max_{D,T} \mathrm{Tr}\left( A^T D[PT] \right) = \sum_{y=1}^{\ell} \max_{\Pi} \max_{w \le m} (\Pi a_w)^T p_y = \sum_{y=1}^{\ell} p_y\, \Phi_A\!\left( p^{|y} \right),    (A34)

where

    \Phi_A\!\left( p^{|y} \right) \equiv \max_{w \le m} \max_{\Pi} (\Pi a_w)^T p^{|y} = \max_{w \le m} \left( a_w^{\downarrow} \right)^T \left( p^{|y} \right)^{\downarrow}.    (A35)

Note that Φ_A is positively homogeneous (i.e., λΦ_A(p^{|y}) = Φ_A(λp^{|y}) for λ ≥ 0), convex, and symmetric (under permutations of the components of p^{|y}). This leads to our main result:

Theorem 3. Let P and Q be n × ℓ and n × m joint probability matrices. Then, the following conditions are equivalent:

1. Q ≺c P.

2. For all matrices A = [a_1, ..., a_m] ∈ R^{n×m}_+,

    \sum_{y=1}^{\ell} p_y\, \Phi_A\!\left( p^{|y} \right) \ge \sum_{w=1}^{m} q_w\, \Phi_A\!\left( q^{|w} \right),    (A36)

where Φ_A(p^{|y}) is as in (A35).

3. For all convex symmetric functions Φ,

    \sum_{y=1}^{\ell} p_y\, \Phi\!\left( p^{|y} \right) \ge \sum_{w=1}^{m} q_w\, \Phi\!\left( q^{|w} \right).    (A37)

Remark. The matrix A can be assumed to be in standard form.

Proof. It remains to be shown that (3) follows from (1). Indeed, suppose Q ≺c P. Then there exist a family D of m n × n doubly stochastic matrices and an ℓ × m stochastic matrix T ≡ [t_1, ..., t_m] such that

    Q = D[PT] = \left[ D^{(1)} P t_1, \ldots, D^{(m)} P t_m \right].    (A38)

Denoting q_w ≡ \sum_{x=1}^{n} q_{xw}, we therefore get that

    q^{|w} = \frac{1}{q_w} D^{(w)} P t_w = \sum_{y=1}^{\ell} \frac{t_{yw}\, p_y}{q_w}\, D^{(w)} p^{|y}.    (A39)

Note that the RHS of the above expression is a convex combination of the vectors D^{(w)} p^{|y}, since q_w = \sum_{y=1}^{\ell} t_{yw}\, p_y. We therefore have

    \sum_{w=1}^{m} q_w \Phi\!\left( q^{|w} \right) = \sum_{w=1}^{m} q_w \Phi\!\left( \sum_{y=1}^{\ell} \frac{t_{yw} p_y}{q_w} D^{(w)} p^{|y} \right) \le \sum_{w=1}^{m} \sum_{y=1}^{\ell} t_{yw}\, p_y\, \Phi\!\left( D^{(w)} p^{|y} \right) \le \sum_{w=1}^{m} \sum_{y=1}^{\ell} t_{yw}\, p_y\, \Phi\!\left( p^{|y} \right) = \sum_{y=1}^{\ell} p_y \Phi\!\left( p^{|y} \right),    (A40)

where the first inequality follows from the convexity of Φ and the second from its Schur convexity (a convex symmetric function satisfies Φ(Dp) ≤ Φ(p) for any doubly stochastic D).
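Condition (3) can be spot-checked numerically for particular convex symmetric functions Φ. The sketch below (our own illustration, with arbitrary random matrices) builds Q = D[PT] and verifies that the Φ-score of Q never exceeds that of P, for Φ(v) = max(v) and Φ(v) = Σv²:

```python
import numpy as np

rng = np.random.default_rng(1)

def random_doubly_stochastic(n, k=20):
    # average of random permutation matrices (Birkhoff-von Neumann)
    return sum(np.eye(n)[rng.permutation(n)] for _ in range(k)) / k

def score(M, phi):
    # sum_y  p_y * Phi(p^{|y})  for a joint distribution matrix M
    marg = M.sum(axis=0)
    return sum(w * phi(M[:, j] / w) for j, w in enumerate(marg) if w > 1e-15)

n, ell, m = 5, 3, 4
P = rng.random((n, ell)); P /= P.sum()
T = rng.random((ell, m)); T /= T.sum(axis=1, keepdims=True)   # row stochastic
Q = np.column_stack([random_doubly_stochastic(n) @ (P @ T)[:, w] for w in range(m)])

# Phi convex and symmetric => the conditional "score" cannot increase under Q = D[PT].
for phi in (lambda v: float(np.max(v)), lambda v: float(np.sum(v ** 2))):
    assert score(P, phi) >= score(Q, phi) - 1e-10
print("condition (3) of Theorem 3 holds for the sampled monotones")
```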

Insights from linear programming

Theorem 3 provides a characterization of conditional majorization in terms of the functions Φ_A defined in (A35). It is, however, possible to obtain a number of further conditions using standard methods of linear programming. Consider the condition (A12) for conditional majorization, and let

    t \equiv \begin{pmatrix} t_1 \\ t_2 \\ \vdots \\ t_m \end{pmatrix} \in \mathbb{R}^{m\ell},    (A41)

where t_1, ..., t_m are the columns of T. Let Γ be the (nm + ℓ) × ℓm real matrix

    \Gamma = \begin{pmatrix} -LP & & & \\ & -LP & & \\ & & \ddots & \\ & & & -LP \\ I_\ell & I_\ell & \cdots & I_\ell \end{pmatrix},    (A42)

where I_ℓ is the ℓ × ℓ identity matrix. Finally, let

    b \equiv \begin{pmatrix} -Lq_1 \\ \vdots \\ -Lq_m \\ e \end{pmatrix} \in \mathbb{R}^{nm+\ell},    (A43)

where q_1, ..., q_m are the columns of Q, and e ≡ (1, ..., 1)^T ∈ R^ℓ. With these notations we have the following proposition.

Proposition A.5. With the same notations as above, Q ≺c P if and only if there exists 0 ≤ t ∈ R^{mℓ} such that

    \Gamma t \le b.    (A44)

Proof. The proof follows from Lemma 1, with the observation that if there exists a matrix T' with non-negative entries that satisfies (A12) with \sum_{k=1}^{m} t'_k \le e, then there also exists a matrix T with non-negative entries that satisfies Eq. (A12) and \sum_{k=1}^{m} t_k = e.

The proposition above has a dual formulation via one of the variants of the Farkas lemma [see, for example, p. 91 in J. Matousek and B. Gärtner, "Understanding and Using Linear Programming"]:

Lemma A.6 (Farkas). Let Γ, b be as above. Then Γt ≤ b has a non-negative solution if and only if every non-negative s ∈ R^{nm+ℓ}_+ with s^T Γ ≥ 0 also satisfies s^T b ≥ 0.

Denote by

    s = \begin{pmatrix} s_1 \\ \vdots \\ s_m \\ r \end{pmatrix} \in \mathbb{R}^{nm+\ell}_+ \quad \text{with } s_1, \ldots, s_m \in \mathbb{R}^n_+ \text{ and } r \in \mathbb{R}^\ell_+.    (A45)

With these notations,

    s^T \Gamma \ge 0 \iff r^T \ge s_w^T L P \;\;\forall\, w = 1, \ldots, m \iff r_y \ge \max_{w \in \{1,\ldots,m\}} s_w^T L p_y \;\;\forall\, y = 1, \ldots, \ell,    (A46)

where the r_y are the components of r. Similarly,

    s^T b \ge 0 \iff r^T e \ge \sum_{w=1}^{m} s_w^T L q_w.    (A47)

Note that if (A47) holds for r with r_y = max_{w∈{1,...,m}} s_w^T L p_y, then it also holds for any r with r_y ≥ max_{w∈{1,...,m}} s_w^T L p_y. Therefore, defining the matrix S = [s_1, ..., s_m] ∈ R^{n×m}_+, we get that Q ≺c P if and only if for all S ∈ R^{n×m}_+

    \sum_{y=1}^{\ell} \max_{w \in \{1,\ldots,m\}} s_w^T L p_y \ge \mathrm{Tr}\left( S^T L Q \right).    (A48)

Denoting A ≡ S^T L, the above condition becomes

    \sum_{y=1}^{\ell} \max_{w \in \{1,\ldots,m\}} a_w^T p_y \ge \mathrm{Tr}(AQ) = \sum_{w=1}^{m} a_w^T q_w.    (A49)

Since a_w = a_w^↓ and p_y = p_y^↓, the condition above is equivalent to

    \sum_{y=1}^{\ell} p_y\, \Phi_A\!\left( p^{|y} \right) \ge \sum_{w=1}^{m} a_w^T q_w.    (A50)

Note that also

    \sum_{w=1}^{m} q_w\, \Phi_A\!\left( q^{|w} \right) \ge \sum_{w=1}^{m} a_w^T q_w.    (A51)

Therefore, the condition in Eq. (A36) is stronger than the one given in Eq. (A50), and yet Eq. (A50) is already sufficient for conditional majorization.

Several special cases of n, ℓ and m were considered in the main text. The case ℓ = 2 is the most involved of these special cases, and so here we look at it in detail. We assume that P = P↓ (n × 2) and Q = Q↓ (n × m) are given in their standard form. For ℓ = 2, Q ≺c P if and only if there exist two m-dimensional probability vectors a and b such that LQ ≤ LPT, where

    T = \begin{pmatrix} a_1 & \cdots & a_m \\ b_1 & \cdots & b_m \end{pmatrix}.    (A52)

We denote q_w ≡ \sum_{x=1}^{n} q_{xw} for w = 1, ..., m, and similarly p_y ≡ \sum_{x=1}^{n} p_{xy} for y = 1, 2. Since p_1 + p_2 = 1, we simplify the notation and write p_1 ≡ p and p_2 ≡ 1 − p. With these notations the inequality LQ ≤ LPT is equivalent to

    q_w\, L q^{|w} \le a_w\, p\, L p^{|1} + b_w (1-p)\, L p^{|2}, \qquad \forall\, w = 1, \ldots, m.    (A53)

Note that the last component of the vector inequality above reads q_w ≤ a_w p + b_w(1−p) for all w = 1, ..., m. This is possible only if

    q_w = a_w\, p + b_w (1-p)    (A54)

for all w = 1, ..., m; indeed, summing over w gives 1 on both sides, so each inequality must be saturated. Substituting this into (A53), we conclude that Q ≺c P if and only if there exists a probability vector a satisfying the following conditions:

    a_w \le \frac{q_w}{p},    (A55)

    a_w\, L\!\left( p^{|1} - p^{|2} \right) \ge \frac{q_w}{p}\, L\!\left( q^{|w} - p^{|2} \right).    (A56)

We now prove the following theorem, stated in the main text:

Theorem 4. Let P and Q be n × 2 and n × m probability matrices given in their standard form, and let p and q_w be as defined above. Define

    \mu_k \equiv \sum_{x=1}^{k} \left( p_{x|1} - p_{x|2} \right) \quad \text{and} \quad \nu_k^{(w)} \equiv \sum_{x=1}^{k} \left( q_{x|w} - p_{x|2} \right).    (A57)

Denote by I^+, I^0, and I^- the sets of indices k for which μ_k is positive, zero, and negative, respectively. Furthermore, define

    \alpha_w \equiv \frac{q_w}{p} \max\left\{ 0,\; \max_{k \in I^+} \frac{\nu_k^{(w)}}{\mu_k} \right\}; \qquad \beta_w \equiv \frac{q_w}{p} \min\left\{ 1,\; \min_{k \in I^-} \frac{\nu_k^{(w)}}{\mu_k} \right\},    (A58)

and through these,

    W_+(P,Q) \equiv 1 - \sum_{w=1}^{m} \alpha_w; \qquad W_-(P,Q) \equiv \left( \sum_{w=1}^{m} \beta_w \right) - 1;    (A59)

    W_0(P,Q) \equiv - \max_{w;\, k \in I^0} \left\{ \nu_k^{(w)} \right\}; \qquad W_1(P,Q) \equiv \min_{w \in \{1,\ldots,m\}} \left( \beta_w - \alpha_w \right).    (A60)

Then, Q ≺c P if and only if W_0, W_1, W_+, and W_- are all non-negative.

Proof. To prove the theorem we need to show that the existence of a probability vector a satisfying (A55)–(A56) is equivalent to the non-negativity of the four quantities defined in (A59)–(A60). To prove the implication "⇒", we assume (A55)–(A56) and express the condition (A56), using the notation introduced in (A57), as

    a_w\, \mu_k \ge \frac{q_w}{p}\, \nu_k^{(w)} \qquad \forall\, k,\; \forall\, w.    (A61)

Next, we consider the three parts of the partition of the set {k}:

1. k ∈ I^0, i.e., μ_k = 0. Then for all w = 1, ..., m,

    \frac{q_w}{p}\, \nu_k^{(w)} \le 0 \;\;\forall\, k \in I^0 \;\Rightarrow\; \max_{k \in I^0} \frac{q_w}{p}\, \nu_k^{(w)} \le 0.    (A62)

Since q_w/p is positive for all w, we get that W_0(P,Q) ≡ −max_{w; k∈I^0} {ν_k^{(w)}} is non-negative.

2. k ∈ I^+, i.e., μ_k > 0. Then for all w = 1, ..., m,

    a_w - \frac{q_w}{p} \frac{\nu_k^{(w)}}{\mu_k} \ge 0 \qquad \forall\, k \in I^+.    (A63)

But since a_w ≥ 0 must also hold, we have

    a_w - \alpha_w \ge 0.    (A64)

Summing over w and requiring \sum_{w=1}^{m} a_w = 1, we see that W_+ ≡ 1 − \sum_{w=1}^{m} \alpha_w must be non-negative.

3. k ∈ I^-, i.e., μ_k < 0. Then for all w = 1, ..., m,

    \frac{q_w}{p} \frac{\nu_k^{(w)}}{\mu_k} - a_w \ge 0 \qquad \forall\, k \in I^-.    (A65)

But (A55) states that a_w ≤ q_w/p must also hold. Therefore,

    \beta_w - a_w \ge 0.    (A66)

Again, summing over w and requiring \sum_{w=1}^{m} a_w = 1 entails that W_- ≡ \sum_{w=1}^{m} \beta_w − 1 be non-negative.

Finally, combining (A64) with (A66) for all w, we obtain the condition W_1 ≡ min_{w∈{1,...,m}} (β_w − α_w) ≥ 0. This completes the proof of one direction.

To prove the other direction, define the matrix-valued function Θ : R^m → R^{2×m}, given by

    \Theta(v) \equiv \begin{pmatrix} v_1 & v_2 & \cdots & v_m \\ \frac{q_1 - p v_1}{1-p} & \frac{q_2 - p v_2}{1-p} & \cdots & \frac{q_m - p v_m}{1-p} \end{pmatrix}.    (A67)

Each such matrix satisfies all the properties desired of the matrix T in (A52), except possibly row stochasticity. Now, the first-row sum \sum_w v_w is a continuous function of v, monotonically increasing in each v_w. Moreover, by construction, if the first row adds up to 1, so does the second row. Let α ≡ (α_1, ..., α_m), and similarly define β. By the premise, Θ(α) and Θ(β) contain only non-negative entries. Furthermore, \sum_w α_w = 1 − W_+ ≤ 1 and \sum_w β_w = W_- + 1 ≥ 1, and β_w ≥ α_w for all w (since W_1 ≥ 0). The intermediate value theorem then ensures the existence of some v̄, with α ≤ v̄ ≤ β, such that Θ(v̄) is a row-stochastic T with the desired properties.
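The quantities W_0, W_1, W_+, W_- of Theorem 4 are directly computable. The sketch below is our own illustration (the example matrices are arbitrary, not from the text): discarding the memory column of P is a valid classical processing, so the resulting Q must satisfy Q ≺c P and all four witnesses must be non-negative, while a "perfectly certain" Q must fail the test:

```python
import numpy as np

def theorem4_witnesses(P, Q, tol=1e-12):
    """W0, W1, W+, W- of Theorem 4, for an n x 2 matrix P and an n x m matrix Q
    (both assumed to be given in their standard form)."""
    p = P.sum(axis=0)[0]
    qw = Q.sum(axis=0)
    p1 = P[:, 0] / p                              # conditional p^{|1}
    p2 = P[:, 1] / (1 - p)                        # conditional p^{|2}
    mu = np.cumsum(p1 - p2)                       # Eq. (A57)
    nu = np.cumsum(Q / qw - p2[:, None], axis=0)  # nu_k^{(w)} in column w
    Ip, I0, Im = mu > tol, np.abs(mu) <= tol, mu < -tol
    alpha = (qw / p) * np.maximum(0.0, (nu[Ip] / mu[Ip, None]).max(axis=0)) if Ip.any() else np.zeros_like(qw)
    beta = (qw / p) * np.minimum(1.0, (nu[Im] / mu[Im, None]).min(axis=0)) if Im.any() else qw / p
    W0 = -nu[I0].max() if I0.any() else 0.0
    return W0, (beta - alpha).min(), 1.0 - alpha.sum(), beta.sum() - 1.0

P = np.array([[0.4, 0.3],
              [0.1, 0.2]])               # standard form; p = 0.5
Qm = P.sum(axis=1, keepdims=True)        # discarding the memory is a valid processing
print(theorem4_witnesses(P, Qm))         # all four non-negative (up to rounding)
```

When the set I^- is empty, β_w reduces to q_w/p, i.e., only the constraint (A55) remains active; the code handles this edge case explicitly.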

Appendix B: Quantum-conditional majorization

Let σ^{XB} ≡ \sum_{x=0}^{n-1} q_x |x⟩⟨x|^X ⊗ σ_x^B be a classical–quantum state, where the σ_x^B are n normalized states acting on Bob's Hilbert space H^B with dim H^B = d. Each σ_x can be written as

    \sigma_x^B = \sum_{y=0}^{d-1} \lambda_{y|x}\, |y;x\rangle\langle y;x| = \sum_{y=0}^{d-1} q_{y|x}\, |u_{y|x}\rangle^B \langle u_{y|x}|,    (B1)

where {|y;x⟩}_{y=0}^{d-1} are the eigenvectors of σ_x corresponding to the eigenvalues λ_{y|x}. The set of vectors {|u_{y|x}⟩^B}_{y=0}^{d-1} generates another possible decomposition of σ_x. In general, the set {|u_{y|x}⟩^B}_{y=0}^{d-1} is not orthonormal, and in special cases it is even possible to have |u_{y|x}⟩ = |u_{y'|x}⟩ for y ≠ y'.

Lemma 7. Let σ_x^B be a density matrix and denote by λ^{|x} = (λ_{0|x}, ..., λ_{d-1|x}) its vector of eigenvalues. Then, there exist states {|u_{y|x}⟩}_{y=0}^{d-1} such that σ_x^B = \sum_{y=0}^{d-1} q_{y|x} |u_{y|x}⟩^B⟨u_{y|x}| if and only if

    q^{|x} \prec \lambda^{|x}, \quad \text{where } q^{|x} = (q_{0|x}, \ldots, q_{d-1|x}).    (B2)
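The "only if" direction of Lemma 7 can be spot-checked numerically: any concrete ensemble decomposition of a density matrix has a weight vector majorized by the spectrum. The sketch below (our own hypothetical example, with a padded eigenvalue vector so the two vectors have equal length) verifies the partial-sum inequalities:

```python
import numpy as np

rng = np.random.default_rng(2)

# A mixture of 3 pure states in C^2 with weights q (a hypothetical example).
q = np.array([0.5, 0.3, 0.2])
vecs = rng.normal(size=(3, 2)) + 1j * rng.normal(size=(3, 2))
kets = [v / np.linalg.norm(v) for v in vecs]
sigma = sum(w * np.outer(k, k.conj()) for w, k in zip(q, kets))

lam = np.sort(np.linalg.eigvalsh(sigma))[::-1]            # eigenvalues, descending
lam = np.concatenate([lam, np.zeros(len(q) - len(lam))])  # pad for comparison
qs = np.sort(q)[::-1]

# q ≺ λ: every partial sum of q↓ is bounded by the matching partial sum of λ↓.
assert np.all(np.cumsum(qs) <= np.cumsum(lam) + 1e-10)
print("q is majorized by the spectrum, as Lemma 7 requires")
```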

Let Ω^{XB} be another classical–quantum state, given by

    \Omega^{XB} = \omega\, |0\rangle\langle 0|^X \otimes |0\rangle\langle 0|^B + (1-\omega)\, |1\rangle\langle 1|^X \otimes |\psi\rangle\langle\psi|^B,    (B3)

where |ψ⟩^B is some fixed state in Bob's Hilbert space H^B. We also complete |0⟩^B (i.e., the state appearing in Ω^{XB}) to an orthonormal basis, which we denote by {|y⟩^B}_{y=0}^{d-1}. In the theorem below we assume that d = n, since if d < n we can always embed H^B in C^n, and if n < d we can always add terms to σ^{XB} with zero probabilities. We will therefore use the same symbol n for both d and n.

Theorem 8. Let σ^{XB} and Ω^{XB} be classical–quantum states as above, and for each σ_x let {q_{y|x}, |u_{y|x}⟩}_{y=0}^{n-1} be one of its many decompositions. Denote by c_y = ⟨y|ψ⟩ the y-th component of |ψ⟩ when written in the basis {|y⟩^B} discussed above. Finally, for any j, k ∈ {0, ..., n−1} with j ≠ k, denote by

    r_y^{(j,k)} \equiv q_{y|k} \left| \langle u_{y|j} | u_{y|k} \rangle \right|^2; \quad r^{(j,k)} \equiv \sum_{y=0}^{n-1} q_{y|k} \left| \langle u_{y|j} | u_{y|k} \rangle \right|^2; \quad \omega^{\star}_{jk} \equiv r^{(j,k)}\, q_j \min_{\{y \,:\, r_y^{(j,k)} > 0\}} \frac{q_{y|j}}{r_y^{(j,k)}}.    (B4)

Then, σ^{XB} ≺c Ω^{XB} if there exist j, k ∈ {0, ..., n−1}, with j ≠ k, such that ω ≤ ω^⋆_{jk} and the following condition holds:

    |\psi\rangle = |\Psi_{jk}(\omega)\rangle \equiv \frac{1}{\sqrt{1-\omega}} \left( \sqrt{q_k\, r^{(j,k)}}\, |0\rangle^B + \sqrt{1 - \omega - r^{(j,k)} q_k}\, |1\rangle^B \right).    (B5)

Proof. Without loss of generality, take j = 0 and k = 1. Let N be a CQO. Then,

    N\!\left( \Omega^{XB} \right) = \omega \sum_{j,x=0}^{n-1} D^{(j)}_{0x}\, |x\rangle\langle x|^X \otimes E^{(j)}\!\left( |0\rangle^B\langle 0| \right) + (1-\omega) \sum_{j,x=0}^{n-1} D^{(j)}_{1x}\, |x\rangle\langle x|^X \otimes E^{(j)}\!\left( |\psi\rangle^B\langle\psi| \right),    (B6)

where \sum_{j=0}^{n-1} E^{(j)} is a CPTP map. We now consider a specific CQO for which D^{(0)}_{0x} = δ_{0x}, D^{(0)}_{1x} = δ_{1x}, D^{(1)}_{1x} = δ_{0x}, and, for j > 1, D^{(j)}_{1x} = δ_{xj}. Furthermore, we will only consider trace-decreasing CP maps E^{(j)} such that E^{(j)}(|0⟩⟨0|) is zero unless j = 0. Therefore, for these choices we get:

    N\!\left( \Omega^{XB} \right) = \omega\, |0\rangle\langle 0|^X \otimes E^{(0)}\!\left( |0\rangle^B\langle 0| \right) + (1-\omega)\, |0\rangle\langle 0|^X \otimes E^{(1)}\!\left( |\psi\rangle^B\langle\psi| \right) + (1-\omega)\, |1\rangle\langle 1|^X \otimes E^{(0)}\!\left( |\psi\rangle^B\langle\psi| \right) + (1-\omega) \sum_{x=2}^{n-1} |x\rangle\langle x|^X \otimes E^{(x)}\!\left( |\psi\rangle^B\langle\psi| \right).    (B7)

Consider now the following choices of E^{(x)}. Each E^{(x)} is given by

    E^{(x)}(\cdot) = \sum_{y=0}^{n-1} K_{y|x} (\cdot) K_{y|x}^{\dagger},    (B8)

where the Kraus operators K_{y|x} have the following form:

    K_{y|0} = \sqrt{t_y}\, |u_{y|0}\rangle\langle 0| + \sum_{z=1}^{n-1} a_{yz}\, |v_{z|y}\rangle^B \langle z|, \qquad K_{y|1} = \sqrt{b_1 s_y}\, |u_{y|0}\rangle^B \langle 1|, \qquad \text{and for } x > 1,\;\; K_{y|x} = \sqrt{b_x\, q_{y|x}}\, |u_{y|x}\rangle^B \langle x|,    (B9)

where s_y, t_y are probabilities, for each y the n−1 vectors {|v_{z|y}⟩}_{z=1}^{n-1} are orthonormal and perpendicular to |u_{y|0}⟩, and the non-negative numbers b_x and the complex numbers a_{yz} satisfy the relation

    b_x + \sum_{y=0}^{n-1} |a_{yx}|^2 = 1 \qquad \text{for all } x = 1, \ldots, n-1.    (B10)

Note that the above relations ensure that \sum_{y,x} K_{y|x}^{\dagger} K_{y|x} = I_n. We therefore get so far that:

    N\!\left( \Omega^{XB} \right) = \left[ \omega + (1-\omega) b_1 |\langle 1|\psi\rangle|^2 \right] |0\rangle\langle 0|^X \otimes \sigma_0^B + (1-\omega)\, |1\rangle\langle 1|^X \otimes E^{(0)}\!\left( |\psi\rangle^B\langle\psi| \right) + (1-\omega) \sum_{x=2}^{n-1} b_x |\langle x|\psi\rangle|^2\, |x\rangle\langle x|^X \otimes \sigma_x^B,    (B11)

where we have chosen the probabilities t_y and s_y such that

    q_{y|0} = \frac{\omega\, t_y + (1-\omega)\, b_1 |\langle 1|\psi\rangle|^2\, s_y}{\omega + (1-\omega)\, b_1 |\langle 1|\psi\rangle|^2}.    (B12)

It is therefore our goal to make E^{(0)}(|ψ⟩^B⟨ψ|) proportional to σ_1^B. Denote

    |\psi\rangle^B = \sum_{z=0}^{n-1} c_z\, |z\rangle.    (B13)

Then

    K_{y|0} |\psi\rangle = c_0 \sqrt{t_y}\, |u_{y|0}\rangle + \sum_{z=1}^{n-1} a_{yz}\, c_z\, |v_{z|y}\rangle.    (B14)

We would like K_{y|0}|ψ⟩ to be proportional to the state |u_{y|1}⟩; more precisely, we want

    K_{y|0} |\psi\rangle = g \sqrt{q_{y|1}}\, |u_{y|1}\rangle \qquad \text{for some } g \in \mathbb{C}.    (B15)

We therefore write each |u_{y|1}⟩ in the form

    |u_{y|1}\rangle = \sqrt{r_y}\, |u_{y|0}\rangle + e^{i\theta_y} \sqrt{1 - r_y}\, |y^{\perp}\rangle,    (B16)

where |y^⊥⟩ is some normalized state orthogonal to |u_{y|0}⟩, and r_y ∈ [0, 1], θ_y ∈ [0, 2π]. Note that the states {|y^⊥⟩} are completely determined by {|u_{y|1}⟩}. We choose |v_{1|y}⟩ ≡ |y^⊥⟩, and for this choice we must take a_{yz} = a_y δ_{z1}. We then have

    K_{y|0} |\psi\rangle = c_0 \sqrt{t_y}\, |u_{y|0}\rangle + a_y c_1\, |y^{\perp}\rangle.    (B17)

Hence, from (B15) we have

    g \sqrt{q_{y|1}} \sqrt{r_y} = c_0 \sqrt{t_y} \quad \text{and} \quad g \sqrt{q_{y|1}}\, e^{i\theta_y} \sqrt{1 - r_y} = a_y c_1.    (B18)

Note that we can always take a_y = e^{iθ_y}|a_y| to absorb the phase, and by taking ratios we get

    \frac{c_0 \sqrt{t_y}}{\sqrt{r_y}} = \frac{|a_y|\, c_1}{\sqrt{1 - r_y}}.    (B19)

The first equation of (B18), together with the normalization \sum_y t_y = 1, gives:

    t_y = \frac{q_{y|1}\, r_y}{\sum_{z=0}^{n-1} q_{z|1}\, r_z}, \qquad \frac{c_0^2}{g^2} = \sum_{z=0}^{n-1} q_{z|1}\, r_z.    (B20)

For this value of t_y, we must have s_y ≥ 0. This leads to the first constraint on ω, q_{y|0}, q_{y|1}, and r_z:

    \omega \left( q_{y|0} - \frac{q_{y|1}\, r_y}{\sum_{z=0}^{n-1} q_{z|1}\, r_z} \right) + q_{y|0}\, (1-\omega)\, b_1 |\langle 1|\psi\rangle|^2 \ge 0.    (B21)

g 2 qy|1 c20 ty (1 − ry ) = (1 − ry ) ry c21

(B22)

29 Therefore, the sum over y gives d−1 X

g2 |ay | = 2 c1 y=0 2

1−

n−1 X

! qy|1 ry

=

y=0

g2 c21

  c2 g 2 − c20 1 − 02 = g c21

(B23)

We therefore get the constraints on the coefficients c0 , c1 , and g: n−1 X c20 = qz|1 rz 2 g z=0

g 2 ≤ c20 + c21

(B24)

With these choices we get N ΩXB



  ω + (1 − ω)b1 |h1|ψi|2 |0ih0|X ⊗ σ0B

=

+ (1 − ω)g 2 |1ih1|X ⊗ σ1B + (1 − ω)

n−1 X

|hx|ψi|2 |xihx|X ⊗ σxB

(B25)

x=2

where we substitute bx = 1. Therefore, q0 = ω + (1 − ω)b1 c21

q1 = (1 − ω)g 2

qx = (1 − ω)|cx |2

(x ≥ 2)

(B26)

Now, b1 = 1 −

n−1 X

|ay |2 = 1 −

y=0

c2 + c21 − g 2 g 2 − c20 = 0 2 c1 c21

(B27)

Hence, q0 = ω + (1 − ω) c20 + c21 − g 2



(B28)

To summarize: q0 + q1 = ω + (1 − ω) c20 + c21



2

q1 = (1 − ω)g qx = (1 − ω)|cx |2 c20 = g 2

n−1 X

qy|1 ry

for x = 2, ..., n − 1 ;

ry = |huy|0 |uy|1 i|2

(B29) (B30) (B31) (B32)

z=0

g 2 ≤ c20 + c21

(B33)

qy|1 ry ω qy|0 − Pn−1 z=0 qz|1 rz

! + qy|0 (1 − ω)b1 |h1|ψi|2 ≥ 0

(B34)

After some simple algebra, these conditions can be shown to be equivalent to the condition of the theorem. Corollary 9. Using notations as in Theorem 8, if all σxB = |φx iB hφx | then for any j ∈ {0, ..., n − 1} and ω ∈ [0, qj ] σ XB ≺c ω|0ih0|X ⊗ |0ih0|B + (1 − ω)|1ih1|X ⊗ |Ψj (ω)ihΨj (ω)|B

(B35)

where q   1 cj |0iB + 1 − ω − c2j |1iB 1−ω

(B36)

√ cj ≡ max |hφj |φk i| qk

(B37)

|Ψj (ω)iB = √ and

k6=j

30 Appendix C: Certainty accessible via Bob’s measurements

We consider here the case of a classical memory Y that results from Bob measuring a quantum B. The essential idea is to take the best possible measurement on B and then use the previous section’s methods on the resulting CC state. To this end, we define the following. Definition 6. Let σXB ∈ CQn result from the measurement MX on system A of ρAB . The minimal conditional uncertainty associated with the measurement MX given accessibility to the quantum system B, is a function MU : CQn → R defined by MU (σXB ) := min UX|Y (P ) ,

(C1)

{My }

where the minimization is over all POVMs {My } describing Bob’s measurement, where P ∈ CC n results from σXB through a particular choice of this measurement. UX|Y is a classical measure of conditional uncertainty as in Definition 5. We now prove a lemma in this connection, which was stated in the main text. Lemma 10. It is sufficient to take the minimization in (C1) over all rank 1 POVM measurements. Proof. To prove the statement we need to show that min UX|Y (P ) ≤ min UX|Y (P 0 ),

(C2)

˜ 0} {Γ y

{My }

˜ y0 } for arbitrary POVMs. To this end, we can where {My } stand for all rank-1 POVM measurements, whereas {Γ ˜ y0 = P Ryy0 My , where Ryy0 are define general POVM elements as a linear combination of rank-1 POVM elements Γ y P components of row-stochastic matrix R ( y0 Ryy0 = 1). Then the components p0xy0 of matrix P 0 are given by h i X X ˜ y0 ) = p0xy0 = Tr ρAB (Mx ⊗ Γ Ryy0 Tr [ρAB (Mx ⊗ My )] = Ryy0 pxy , y

(C3)

y

where in the second equality we used linearity of the trace. Now, since pxy are the components of the matrix P , we have that P 0 = P R. From the property (1) of a measure of the uncertainty of X|Y (see Definition 5) we have that it is nondecreasing under processing of Y : UX|Y (P ) ≤ UX|Y (P R),

(C4)

and after minimizing over POVM measurements we get (C2). Lemma C.1. MU is monotonically non-decreasing under QCR. Proof. To prove the statement we need to show that min UX|Y (P ) ≤ min UX|Y (Q), ˜ 0} {Γ y

{My }

(C5)

˜ y0 ) on a state σXB where P (Q) is a matrix of probabilities pxy (qx0 y0 ) obtained by measuring POVM elements My (Γ (N (σXB )). First we will express the probability pxy in the product form pxy = Tr(

n X

px0 |x0 ihx0 | ⊗ σx0 )(|xihx| ⊗ My )

x0 =1

= Tr px |xihx| ⊗ (σx My ) = px Tr σx My = px py|x .

(C6)

31 ˜ y0 that gives the minimum in the RHS of (C5). The probabilities qx0 y0 are calculated Let us now take the POVMs Γ from ˜ y0 ) qx0 y0 = Tr N (σXB )(|x0 ihx0 | ⊗ Γ n X X (j) ˜ y0 ) = Tr( Dx00 x px |x00 ihx00 | ⊗ Kj σx Kj† )(|x0 ihx0 | ⊗ Γ = Tr

=

x,x00 =1 j n X X

˜ y0 Dx0 x px |x0 ihx0 | ⊗ Kj σx Kj† Γ

x=1 j n XX

(j)

˜ y0 Kj ). Dx0 x px Tr(σx Kj† Γ

j

(j)

(C7)

x=1

˜ 0y } and trace preserving operation P K † (.)Kj there exist We will now show that for any POVM measurements {Γ j j P POVM measurements {My } and a set of matrices {R(j) } with non-negative entries with the property that j R(j) is row stochastic, such that X (j) ˜ y0 Kj = Kj† Γ Ryy0 My , (C8) y (j)

where Ryy0 are the components of the matrix R(j) . To show this define the POVM elements My and the matrices R(j) as the following ˜ z Kk , My ≡ M(z,k) := Kk† Γ (j) Ryy0



(j) R(z,k)y0

(C9)

:= δzy0 δjk .

(C10)

Note that such defined My is a valid POVM: X † X X † ˜ z Kk = Kk Kk = I, My = Kk Γ y

while

P

j

(C11)

k

z,k

R(j) is row stochastic: XX y0

(j)

Ryy0 =

XX y0

j

δzy0 δjk = 1, ∀y.

(C12)

j

Having the above we can write ˜ y 0 Kj ) = Tr(σx Kj† Γ

l X

(j)

Ryy0 Tr(σx My ),

(C13)

y=1

hence for such defined My qx0 y0 =

n XX j

=

(j)

Dx0 x px

x=1

n X l XX j

l X

(j)

Ryy0 py|x

y=1 (j)

(j)

Dx0 x px py|x Ryy0 ,

(C14)

x=1 y=1

P which means that the matrix Q can be expressed as Q = j D(j) P R(j) . Recall that the measure of the uncertainty of X|Y is non-decreasing under correlated classical processing of Y with random relabelling of X and since we assumed ˜ y0 were optimal we get that Γ UX|Y (P ) ≤ min UX|Y (Q). ˜ 0} {Γ y

Therefore, minimizing UX|Y (P ) over all POVMs completes the proof.

(C15)

32 Appendix D: Joint conditional uncertainty

Since the uncertainty principle is about more than one potential (actual or counterfactual) measurement on Alice’s side, we require a notion not of the conditional uncertainty of a single measurement, but of the joint conditional uncertainty of two or more measurements. We would like first to define the most general notion of joint conditional uncertainty that is independent on whether the two measurements on Alice’s system are actual or counterfactual, and is not restricted to a particular scenario or task. We therefore only require from such a measure to be monotonic under joint QCR: Definition 7. Let J : CQn × CQm → R, and let σXB ∈ CQn and γZC ∈ CQm . Then, J is a measure of the joint conditional uncertainty if: (1) J (σXB , γZC ) = 0 if both σXB and γZC are product states of the form |0ih0| ⊗ ρ. (2) J (σXB , γZC ) ≥ J (˜ σXB 0 , γ˜ZC 0 ) if both σXB ≺c σ ˜XB 0 and γZC ≺c γ˜ZC 0 . One can construct measures of joint conditional uncertainty from classical measures. This can be done as follows. Let J cl : CC n × CC m → R be a measure of classical joint uncertainty (i.e. satisfying the conditions of Def. 7 for classical states/distributions P and Q). Then, (0) (1) Theorem 11. Let σXB ∈ CQn and γZC ∈ CQm be two classical–quantum states, let {My } and {Mw } be htwo POVMs, and i let P = (pxy ) h and Q = (qwz ) i be two probability matrices with probabilities pxy =  (0)

Tr σXB |xihx| ⊗ My

(1)

and qzw = Tr γZC |zihz| ⊗ Mw J (σXB , γZC ) :=

. Then, the function J cl (P, Q)

min

(D1)

M (0) ,M (1)

is a measure of conditional joint uncertainty. The proof follows from the same idea as in Lemma C.1. Appendix E: Miscellaneous auxiliary results for universal tripartite relation

Here we include a couple of sundry lemmas that are useful in proving our universal tripartite minimum classical uncertainty relation in Section III. First, we state without proof the following elementary application of Markov’s inequality: Lemma E.1. Let Q ∈ CC n . Suppose that X

qw max qx|w ≤ r

(E1)

x

w

For arbitrary β > 0, define Wβ ≡ {w : max qx|w ≤ β}. Then, x

X

qw ≥ 1 −



r . β

(E2)

Recall the scenario in Fig. 5: Alice, Bob, and Eve initially share two copies of a pure state Ψ ≡ |Ψi hΨ|, i.e., a state ΨA1 B1 E1 ⊗ ΨA2 B2 E2 . Alice performs measurements {Ma1 } and {Ma2 } on her two copies. Bob performs {Mb } on his first copy and does nothing with his second. Eve does nothing with her first part, and measurement {Me } on her second. We are interested in the following quantity, for a given pair of {Ma1 } and {Ma2 }: X η ≡ max max qb qe max qa1 |b qa2 |e (E3) Ψ

{Mb },{Me }

b,e

a1 ,a2

Lemma E.2. η = 14 [1 + c]2 . Proof. Consider the following auxiliary quantities: ηlower =

max

Ψ=ψ A ⊗ψ B ⊗ψ C

max

{Mb },{Me }

X b,e

qb qe max qa1 |b qa2 |e , a1 ,a2

where the maximum is taken over all product states, and X X ηupper = max max qb qe qe0 |b qb0 |e max qa1 |be0 qa2 |b0 e , Ψ

{Mb },{Me }

b,e

b0 ,e0

a1 ,a2

(E4)

(E5)

33 where Tr [Ψ(I ⊗ Mb ⊗ Me0 )] , Tr [Ψ(I ⊗ Mb ⊗ I)] Tr [Ψ(I ⊗ Mb0 ⊗ Me )] = , Tr [Ψ(I ⊗ I ⊗ Me )] Tr [Ψ(Ma1 ⊗ Mb ⊗ Me0 )] = , Tr [Ψ(I ⊗ Mb ⊗ Me0 )] Tr [Ψ(Ma2 ⊗ Mb0 ⊗ Me )] . = Tr [Ψ(I ⊗ Mb0 ⊗ Me )]

qe0 |b =

(E6)

qb0 |e

(E7)

qa1 |be0 qa2 |b0 e

(E8) (E9)

It is clear that η_lower ≤ η, while since

    \max_{a_1,a_2} q_{a_1|b}\, q_{a_2|e} = \max_{a_1,a_2} \sum_{b',e'} q_{e'|b}\, q_{b'|e}\, q_{a_1|be'}\, q_{a_2|b'e} \le \sum_{b',e'} q_{e'|b}\, q_{b'|e} \max_{a_1,a_2} q_{a_1|be'}\, q_{a_2|b'e},    (E10)

we also have that η_upper ≥ η. We will show that η_lower = ¼[1+c]² and also η_upper ≤ ¼[1+c]², where c is the maximum overlap of the basis vectors defining {M_{a1}} and {M_{a2}}; since η_lower ≤ η ≤ η_upper, we immediately get η = ¼[1+c]².

Calculation of η_lower. We have the following:

    \eta_{\mathrm{lower}} = \max_{\psi^A \otimes \psi^B \otimes \psi^C} \max_{\{M_b\},\{M_e\}} \sum_{b,e} q_b\, q_e \max_{a_1,a_2} q_{a_1|b}\, q_{a_2|e}    (E11)
    = \max_{\psi^A \otimes \psi^B \otimes \psi^C} \max_{\{M_b\},\{M_e\}} \sum_{b,e} q_b\, q_e \max_{a_1,a_2} q_{a_1}\, q_{a_2}    (E12)
    = \max_{\psi^A \otimes \psi^B \otimes \psi^C} \max_{a_1,a_2} q_{a_1}\, q_{a_2} \max_{\{M_b\},\{M_e\}} \sum_{b,e} q_b\, q_e    (E13)
    = \max_{\psi^A} \max_{a_1,a_2} q_{a_1}\, q_{a_2}    (E14)
    = \frac{1}{4} [1 + c]^2,    (E15)

where in the first step we used the independence of the conditional probabilities from the conditioning for product states, while in the last step we used the result from [2].

Calculation of η_upper. We have the following:

    \eta_{\mathrm{upper}} = \max_{\Psi} \max_{\{M_b\},\{M_e\}} \sum_{b,e} q_b\, q_e \sum_{b',e'} q_{e'|b}\, q_{b'|e} \max_{a_1,a_2} q_{a_1|be'}\, q_{a_2|b'e}    (E16)
    = \max_{\Psi} \max_{\{M_b\},\{M_e\}} \sum_{b,e,b',e'} q_{be'}\, q_{b'e} \max_{a_1,a_2} q_{a_1|be'}\, q_{a_2|b'e}.    (E17)

Notice that we can use the pairs of indices (be') and (b'e) interchangeably:

    \max_{\Psi} \max_{\{M_b\},\{M_e\}} \sum_{b,e,b',e'} q_{be'}\, q_{b'e} \max_{a_1,a_2} q_{a_1|be'}\, q_{a_2|b'e} = \max_{\Psi} \max_{\{M_b\},\{M_e\}} \sum_{b,e,b',e'} q_{b'e}\, q_{be'} \max_{a'_1,a'_2} q_{a'_1|b'e}\, q_{a'_2|be'},    (E19)

and we can symmetrize the maxima over (a_1, a_2) (for any given setting (b, e, b', e')) by writing

    \eta_{\mathrm{upper}} = \max_{\Psi} \max_{\{M_b\},\{M_e\}} \sum_{b,e,b',e'} q_{be'}\, q_{b'e}\, \frac{1}{2} \left( \max_{a_1,a_2} q_{a_1|be'}\, q_{a_2|b'e} + \max_{a'_1,a'_2} q_{a'_1|b'e}\, q_{a'_2|be'} \right).    (E20)

The main advantage of doing so is that, in comparison with max_{a_1,a_2} q_{a_1|b} q_{a_2|e} in the definition of η (E3), we no longer need to handle the probabilities q_{a_1|b} and q_{a_2|e} conditioned independently on Bob's and Eve's outcomes. Instead, we now only need to evaluate ½(max_{a_1,a_2} q_{a_1|be'} q_{a_2|b'e} + max_{a'_1,a'_2} q_{a'_1|b'e} q_{a'_2|be'}) for each setting (b'e, be'). In this case any pair of settings (b'e) (and accordingly (be')) leads to preparing a particular state ψ (respectively φ) on Alice's side. Thus, we need to find the maximum of

    K_{\psi,\varphi} = \frac{1}{2} \left( \max_{a_1,a_2} q(a_1|\psi)\, q(a_2|\varphi) + \max_{a'_1,a'_2} q(a'_1|\varphi)\, q(a'_2|\psi) \right)    (E21)

over all states ψ and φ. Suppose then that K_{ψ,φ} is attained, in the first maximum, for the vector corresponding to a particular a*_1. The second term is then maximized by choosing the vector corresponding to a'*_2 for which the overlap |⟨a*_1|a'*_2⟩| is maximal with respect to the other choices of a'_2. This is guaranteed by the non-uniqueness of the state ψ for which the optimal probability q(a*_1|ψ) is attained. W.l.o.g. we can set a*_1 = a'*_1 and a*_2 = a'*_2; thus the four vectors corresponding to a*_1, a*_2, ψ, φ become coplanar. Let us denote the following angles:

    \angle(a^*_1, a^*_2) = \alpha \in [0, \tfrac{\pi}{4}], \quad \angle(a^*_1, \psi) = \theta \in [0, \alpha], \quad \angle(\varphi, a^*_2) = \phi \in [0, \alpha],    (E22)–(E24)

then

    \max_{\psi,\varphi} K_{\psi,\varphi} = \max_{\theta,\phi} \frac{1}{2} \left( \cos^2\theta \cos^2\phi + \cos^2(\theta-\alpha) \cos^2(\phi-\alpha) \right).    (E25)

Notice that a function of the form cos²(θ−α)cos²(φ−α), for any given parameter α, has a maximum at θ − φ = 0; therefore

    \max_{\theta,\phi} \frac{1}{2} \left( \cos^2\theta \cos^2\phi + \cos^2(\theta-\alpha) \cos^2(\phi-\alpha) \right) = \max_{\theta} \frac{1}{2} \left( \cos^4\theta + \cos^4(\theta-\alpha) \right).    (E26)

By differentiating the last expression with respect to θ and equating to 0, we arrive at

    \cos^3\theta \sin\theta + \cos^3(\theta-\alpha) \sin(\theta-\alpha) = 0.    (E27)

Since the cosines are non-negative, the only non-trivial solution is θ − α = −θ. Thus, for θ = α/2 we obtain

    \max_{\theta} \frac{1}{2} \left( \cos^4\theta + \cos^4(\theta-\alpha) \right) = \frac{1}{2} \left( \cos^4\tfrac{\alpha}{2} + \cos^4\!\left( \tfrac{\alpha}{2} - \alpha \right) \right) = \frac{1}{4} [1 + \cos\alpha]^2.

Recall that α is chosen such that from the definition it maximizes the overlap between the vectors corresponding to a∗1 and a∗2 , and so cos α = c. Notice that in fact we obtained Kψ,φ ≤ 14 [1 + c]2 which immediately leads to conclusion ηupper ≤ 14 [1 + c]2 .