Testing finitary probabilistic processes

Yuxin Deng, Rob van Glabbeek, Matthew Hennessy, Carroll Morgan

February 13, 2009

Abstract. This paper provides modal- and relational characterisations of may- and must-testing preorders for recursive CSP processes with divergence, featuring probabilistic as well as nondeterministic choice. May testing is characterised in terms of simulation, and must testing in terms of failure simulation. To this end we develop weak transitions between probabilistic processes, elaborate their topological properties, and capture divergence in terms of partial distributions.

Contents

1 Introduction
2 The language pCSP
3 Lifted transitions, and weak moves over distributions
  3.1 Lifted relations
  3.2 Weak transitions defined
  3.3 Elementary properties of weak derivations
4 Testing probabilistic processes
  4.1 Extremal testing
  4.2 Resolution-based testing
  4.3 Comparison
    4.3.1 Scalar versus Vector testing
    4.3.2 Must testing
    4.3.3 May testing
  4.4 Testing via weak-p derivatives
5 Further properties of weak derivation
  5.1 Finite generability and closure
  5.2 Distillation of divergence
6 The failure simulation preorder
  6.1 Two equivalent definitions and their rationale
  6.2 A simpler characterisation of failure similarity for finitary processes
  6.3 Precongruence
  6.4 Soundness
7 Failure simulation is complete for must testing
  7.1 Inductive characterisation
  7.2 The modal logic
  7.3 Characteristic tests for formulae
8 Simulations and may testing
  8.1 Soundness
  8.2 Completeness
9 Conclusion and Related Work
A Assorted counter-examples
  A.1 Finite state space is necessary; otherwise. . .
    A.1.1 Distillation of divergence
    A.1.2 Transitivity of failure simulation
    A.1.3 Equivalence of finite and infinite interpolation
    A.1.4 Soundness of failure simulation
    A.1.5 Pre-congruence of simple failure similarity
  A.2 Finite branching is necessary; otherwise. . .
    A.2.1 Closure of derivatives
    A.2.2 Distillation of divergence
    A.2.3 Equivalence of failure simulation and extended failure simulation
    A.2.4 Transitivity of failure simulation
    A.2.5 Soundness of failure simulation
    A.2.6 Coincidence of failure simulation and the limit of its approximation
B Technical lemmas and proofs for Section 3
  B.1 Infinitary properties of lifting
  B.2 Elementary properties of weak derivations
  B.3 Structural properties of weak derivations

1 Introduction

It has long been a challenge for theoretical computer scientists to provide a firm mathematical foundation for process description languages that incorporate both nondeterministic and probabilistic behaviour in such a way that processes are semantically distinguished only if they can be told apart by some notion of testing. In our earlier work [3, 2] a semantic theory was developed for one particular language with these characteristics, a finite process calculus called pCSP: nondeterminism is present in the form of the standard choice operators inherited from CSP [9], that is P ⊓ Q and P □ Q, while probabilistic behaviour is added via a new choice operator P p⊕ Q in which P is chosen with probability p and Q with probability 1−p.

The intensional behaviour of a pCSP process is given in terms of a probabilistic labelled transition system [21, 3], or pLTS, a generalisation of labelled transition systems [17]. In a pLTS, performing an action in a given state results in a probability distribution over states, rather than a single state; thus the relations s −α→ t of an LTS are replaced by relations s −α→ ∆, with ∆ a distribution. Closed pCSP expressions P are interpreted as probability distributions ⟦P⟧ in the associated pLTS. Our semantic theory [3, 2] naturally generalises the two preorders of standard testing theory [5] to pCSP:


• P ⊑pmay Q indicates that Q is at least as good as P from the point of view of possibly passing probabilistic tests; and

• P ⊑pmust Q indicates instead that Q is at least as good as P from the point of view of guaranteeing probabilistic tests.

The most significant result of [2] was an alternative characterisation of these preorders as particular forms of coinductively defined simulations over the underlying pLTS. We also provided a characterisation in terms of a modal logic.

The object of the current paper is to extend the above results to a version of pCSP with recursive process descriptions: we add a construct rec x. P for recursion, and extend the intensional semantics of [2] in a straightforward manner. We restrict ourselves to finitary pCSP processes, having finitely many states and displaying finite branching.


• The simulation relations in [2] were defined in terms of weak transitions =τ̂⇒ between distributions, obtained as the transitive closure of a relation −τ̂→ between distributions that allows part of a distribution to make a τ-move while the remaining part stays in place. This definition is however inadequate for processes that can perform an unbounded number of τ-steps. The problem is highlighted by the process Q1 = rec x. (τ.x 1/2⊕ a.0) illustrated in Figure 1.

Figure 1: The pLTS of the process Q1

Process Q1 is indistinguishable, using tests, from the simple process a.0; we have Q1 ≃pmay a.0 and Q1 ≃pmust a.0. This is because the process Q1 will eventually perform the action a with probability 1. However, the action a.0 −a→ 0 cannot be simulated by a corresponding move Q1 =τ̂⇒ −a→: no matter which distribution ∆ we obtain from executing a finite sequence of internal moves Q1 =τ̂⇒ ∆, part of it is unable to subsequently perform the action a. To address this problem we propose a new relation ∆ =⇒ Θ, indicating that Θ can be derived from ∆ by performing an unbounded sequence of internal moves; we call Θ a weak derivative of ∆. For example ⟦a.0⟧ will turn out to be a weak derivative of ⟦Q1⟧, that is ⟦Q1⟧ =⇒ ⟦a.0⟧, via the infinite sequence of internal moves

⟦Q1⟧ −τ→ ⟦Q1 1/2⊕ a.0⟧ −τ→ ⟦Q1 1/2²⊕ a.0⟧ −τ→ · · · ⟦Q1 1/2^n⊕ a.0⟧ −τ→ · · ·

One of our contributions here is the significant use of "subdistributions" that sum to no more than one [10, 16]. For example, the empty subdistribution ε elegantly represents the chaotic behaviour of processes that in CSP and in must-testing semantics is tantamount to divergence, because we have ε −a→ ε for any action a, and a process like rec x. x that diverges via an infinite τ-path gives rise to the weak transition rec x. x =⇒ ε. Our weak transition relation =⇒ can be regarded as an extension of the one from Lynch, Segala & Vaandrager [15] to partial distributions; the latter, although defined in a very different way, can be obtained from ours by requiring weak derivatives to be total distributions.

We end this introduction with a brief glimpse at our proof strategy. In [2] the characterisations for finite pCSP processes were obtained using a probabilistic extension of the Hennessy-Milner logic [17]. Moving to recursive processes, we know that process behaviour can be captured by a finite modal logic only if the underlying LTS is finitely branching, or at least image-finite [17]. Thus to take advantage of a finite probabilistic HML we need a property of pLTSs corresponding to finite branching in LTSs; this is topological compactness. Subdistributions over (derivatives of) finitary pCSP processes inherit the standard (complete) Euclidean metric. One of our key results is that

Theorem 1.1 For every finitary pCSP process P, the set { ∆ | ⟦P⟧ =⇒ ∆ } is convex and compact.

Indeed, using techniques from Markov Decision Theory [19] we can show that the potentially infinite set { ∆ | ⟦P⟧ =⇒ ∆ } is nevertheless the convex closure of a finite set of subdistributions, from which Theorem 1.1 follows immediately. This in turn implies that the simulation preorder ⊑S is compact in the following sense:

Theorem 1.2 For every finitary pCSP process P, the set { ∆ | ⟦P⟧ ⊑S ∆ } is convex and compact.

This key result allows an inductive characterisation of the simulation preorder: we can construct a sequence of approximations ⊑kS for k ≥ 0 with the property that

Theorem 1.3 For finitary distributions ∆ and Θ we have ∆ ⊑S Θ if and only if ∆ ⊑kS Θ for every k ≥ 0.

Our main characterisation results can then be obtained by extending the probabilistic modal logic used in [2], so that for example

• it characterises ⊑kS for every k ≥ 0, and therefore it also characterises ⊑S;
• every probabilistic modal formula can be mimicked by a may-test.

Similar results accrue for must testing: details are given in Section 7.


P ::= S | P p⊕ P
S ::= 0 | x ∈ Var | a.P | P ⊓ P | S □ S | S |A S | rec x. P

Figure 2: Syntax of pCSP

2 The language pCSP

Let Act be a set of visible actions which a process can perform, and let Var be an infinite set of variables. The language pCSP is then given by the two-sorted syntax in Figure 2. It is essentially the finite language of [2, 3], to which has been added the recursive construct rec x. P, in which x is a variable and P a term. Intuitively rec x. P represents the solution of the fixed-point equation x = P. The notions of free and bound variables are standard, as is that of the substitution of terms for free occurrences of variables (with renaming if necessary). By Q[x ↦ P] we mean the result of substituting the term P for the variable x in Q. We write pCSP for the set of closed terms defined by this grammar, the pCSP processes, and sCSP for its subset of pCSP states: the sub-sort S above.

The process P p⊕ Q, for 0 ≤ p ≤ 1, represents a probabilistic choice between P and Q: with probability p it will act like P and with probability 1−p it will act like Q.¹ Any process is a probabilistic combination of state-based processes built by repeated application of the operator p⊕. The state-based processes have a CSP-like syntax, involving the stopped process 0, action prefixing a. for a ∈ Act, internal and external choices ⊓ and □, and a parallel composition |A for A ⊆ Act. The process P ⊓ Q will first do a so-called internal action τ ∉ Act, choosing nondeterministically between P and Q. Therefore ⊓, like a., acts as a guard, in the sense that it converts any process arguments into a state-based process. The same applies to rec x. P as, following CSP [18], our recursion construct performs an internal action when unfolding. As our testing semantics will abstract from internal actions, these τ-steps are harmless and merely simplify the semantics.

The process s □ t on the other hand does not perform actions itself but rather allows its arguments to proceed, disabling one argument as soon as the other has done a visible action. In order for this process to start from a state rather than a probability distribution of states, we require its arguments to be state-based as well; the same requirement applies to |A. Finally, the expression s |A t, where A ⊆ Act, represents processes s and t running in parallel. They may synchronise by performing the same action from A simultaneously; such a synchronisation results in τ. In addition s and t may independently do any action from (Act\A) ∪ {τ}.

Although formally the operators □ and |A can only be applied to state-based processes, informally we use expressions of the form P □ Q and P |A Q, where P and Q are not state-based, as syntactic sugar for expressions in the above syntax obtained by distributing □ and |A over p⊕. Thus for example s □ (t1 p⊕ t2) abbreviates the term (s □ t1) p⊕ (s □ t2).

The full language of CSP [1, 9, 18] has many more operators; we have simply chosen a representative selection, and have added probabilistic choice. Our parallel operator is not a CSP primitive, but it can easily be expressed in terms of them: in particular P |A Q = (P ‖A Q)\A, where ‖A and \A are the parallel composition and hiding operators of [18]. It can also be expressed in terms of the parallel composition, renaming and restriction operators of CCS. We have chosen this (non-associative) operator for convenience in defining the application of tests to processes.

As usual we may elide 0; the prefixing operator a. binds stronger than any binary operator; and precedence between binary operators is indicated via brackets or spacing. We will also sometimes use indexed binary operators, such as ⊕_{i∈I} pi·Pi with Σ_{i∈I} pi = 1 and all pi > 0, and ⊓_{i∈I} Pi, for some finite index set I.

Our language is interpreted as a probabilistic labelled transition system [3, 2]. Essentially the same model has appeared in the literature under different names such as NP-systems [11], probabilistic processes [12], simple probabilistic automata [20], probabilistic transition systems [13] etc. Furthermore, there are strong structural similarities with Markov Decision Processes [19, 4].

¹ In our semantics we have P 0⊕ Q = Q and P 1⊕ Q = P, so without limitation of generality we could have required that 0 < p < 1.

3 Lifted transitions, and weak moves over distributions

The convex closure ↿{s̄i | i > 2} consists of all total distributions whose support is included in {si | i > 2}. However, with a definition of convex closure that requires only binary interpolations of distributions to be included, ↿{s̄i | i > 2} would merely consist of all such distributions with finite support. Convex closure is a closure operator in the standard sense, in that it satisfies

• X ⊆ ↿X
• X ⊆ Y implies ↿X ⊆ ↿Y
• ↿↿X = ↿X.

We say a set X is convex if ↿X = X. Furthermore, we say that a relation R ⊆ Y × D(S) is convex whenever the set {∆ | y R ∆} is convex for every y in Y, and ↿R denotes the smallest convex relation containing R.
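For finite sets of subdistributions over finitely many states, convex closure is ordinary convex-hull membership, which can be decided by linear programming. The following sketch is our illustration, not part of the paper; it assumes numpy and scipy are available, and represents subdistributions as vectors indexed by states.

```python
# Sketch (ours): membership in the convex closure of a finite set X,
# decided as a linear-programming feasibility question.
import numpy as np
from scipy.optimize import linprog

def in_convex_closure(X, y):
    """Is y a convex combination of the rows of X?"""
    X = np.asarray(X, dtype=float)          # shape (m, n): candidate vertices
    y = np.asarray(y, dtype=float)
    m = X.shape[0]
    # Find weights w >= 0 with  w @ X = y  and  sum(w) = 1.
    A_eq = np.vstack([X.T, np.ones((1, m))])
    b_eq = np.concatenate([y, [1.0]])
    res = linprog(c=np.zeros(m), A_eq=A_eq, b_eq=b_eq,
                  bounds=[(0, None)] * m, method="highs")
    return res.status == 0                  # 0 = feasible optimum found

# Point distributions on three states, and their uniform interpolation:
s1, s2, s3 = np.eye(3)
print(in_convex_closure([s1, s2, s3], [1/3, 1/3, 1/3]))  # True
print(in_convex_closure([s1, s2], s3))                   # False
```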

3.1 Lifted relations

In a pLTS actions are only performed by states, in that actions are given by relations from states to distributions. But pCSP processes in general correspond to distributions over states, so in order to define what it means for a process to perform an action, we need to lift these relations so that they also apply to distributions. In fact we will find it convenient to lift them to subdistributions.

Definition 3.2 Let (S, L, →) be a pLTS and R ⊆ S × D(S) a relation from states to subdistributions. Then R̄ ⊆ D(S) × D(S) is the smallest relation that satisfies:

(1) s R Θ implies s̄ R̄ Θ, and
(2) (Linearity) ∆i R̄ Θi for i ∈ I implies (Σ_{i∈I} pi·∆i) R̄ (Σ_{i∈I} pi·Θi) for any pi ∈ [0, 1] (i ∈ I) with Σ_{i∈I} pi ≤ 1.

Remark 3.3 For R1, R2 ⊆ S × D1(S), if R1 ⊆ R2 then R̄1 ⊆ R̄2.

By construction R̄ is convex. Moreover, because s (↿R) Θ implies s̄ R̄ Θ, the relations R and ↿R have the same lifting; this means that when considering a lifted relation we can w.l.o.g. assume the original relation to have been convex. In fact when R is indeed convex, we have that s R Θ and s̄ R̄ Θ are equivalent.

An application of this notion is when the relation is −α→ for α ∈ Actτ; in that case we also write −α→ for its lifted version. Thus, as source of a relation −α→ we now also allow distributions, and even subdistributions. A subtlety of this approach is that for any action α we have

ε −α→ ε    (1)

simply by taking I = ∅ or Σ_{i∈I} pi = 0 in Definition 3.2. That will turn out to make ε especially useful for modelling the "chaotic" aspects of divergence, in particular that in the must-case a divergent process can simulate any other.

Definition 3.2 is very similar to our previous definition in [2], although there it applied only to (full) distributions:

Lemma 3.4 ∆ R̄ Θ if and only if

(i) ∆ = Σ_{i∈I} pi·s̄i, where I is an index set and Σ_{i∈I} pi ≤ 1,
(ii) for each i ∈ I there is a subdistribution Θi such that si R Θi,
(iii) Θ = Σ_{i∈I} pi·Θi.

Proof: Straightforward. □
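Lemma 3.4 is constructive enough to execute: from any finite witness, a list of triples (pi, si, Θi) with si R Θi assumed, one can assemble the pair (∆, Θ). A minimal sketch (ours, not the paper's), anticipating the remark just below that a single state may be split into several pieces:

```python
# Sketch: assembling a lifted pair (Delta, Theta) from a Lemma 3.4 witness.
# A subdistribution is a dict mapping states to probabilities, mass <= 1.

def add_weighted(target, sub, p):
    for state, prob in sub.items():
        target[state] = target.get(state, 0.0) + p * prob

def lift(witness):
    """witness: list of (p_i, s_i, Theta_i) with s_i R Theta_i assumed."""
    assert sum(p for p, _, _ in witness) <= 1.0 + 1e-12
    delta, theta = {}, {}
    for p, s, th in witness:
        delta[s] = delta.get(s, 0.0) + p      # clause (i)
        add_weighted(theta, th, p)            # clause (iii)
    return delta, theta

# The state s is split in two, each half related to a different target:
delta, theta = lift([(0.5, "s", {"u": 1.0}), (0.5, "s", {"v": 1.0})])
print(delta)   # {'s': 1.0}
print(theta)   # {'u': 0.5, 'v': 0.5}  -- note |delta| >= |theta| (Lemma 3.5)
```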

An important point here is that a single state can be split into several pieces: that is, the decomposition of ∆ into Σ_{i∈I} pi·s̄i is not unique. One important property of this lifting operation is the following:

Lemma 3.5 Suppose ∆ R̄ Θ, where R is any relation in S × D(S). Then

(i) |∆| ≥ |Θ|;
(ii) if R is a relation in S × D1(S), then |∆| = |Θ|.

Proof: This follows immediately from the characterisation in Lemma 3.4. □

So for example if ε R̄ Θ then 0 = |ε| ≥ |Θ|, whence Θ is also ε.

Remark 3.6 From Lemma 3.4 it also follows that lifting enjoys the following two properties:

(Scaling) If ∆ R̄ Θ, p ∈ ℝ and |p·∆| ≤ 1, then p·∆ R̄ p·Θ.
(Additivity) If ∆i R̄ Θi for i ∈ I and |Σ_{i∈I} ∆i| ≤ 1, then (Σ_{i∈I} ∆i) R̄ (Σ_{i∈I} Θi).

In fact, we could have presented Definition 3.2 using scaling and additivity instead of linearity.

The lifting operation has yet another characterisation, this time in terms of choice functions.

Definition 3.7 Let R ⊆ S × D(S) be a relation from states to subdistributions. Then f : S → D(S) is a choice function for R, written f ∈S R, if s R f(s) for every s ∈ S.

Proposition 3.8 Suppose R ⊆ S × D(S) is a convex relation. Then for any ∆ ∈ D(S), ∆ R̄ Θ if and only if there is some choice function f ∈⌈∆⌉ R such that Θ = Exp∆(f).

Proof: First suppose Θ = Exp∆(f) for some choice function f ∈⌈∆⌉ R, that is Θ = Σ_{s∈⌈∆⌉} ∆(s)·f(s). It now follows from Lemma 3.4 that ∆ R̄ Θ, since s R f(s) for each s.

Conversely suppose ∆ R̄ Θ; we have to find a choice function f ∈⌈∆⌉ R such that Θ = Exp∆(f). Applying Lemma 3.4 we know that

(i) ∆ = Σ_{i∈I} pi·s̄i, for some index set I, with Σ_{i∈I} pi ≤ 1,
(ii) Θ = Σ_{i∈I} pi·Θi for some Θi satisfying si R Θi.

Now define the function f : ⌈∆⌉ → D(S) by letting

f(s) = Σ_{{i∈I | si=s}} (pi/∆(s))·Θi.

Note that ∆(s) = Σ_{{i∈I | si=s}} pi and therefore by convexity s R f(s); so f is a choice function for R. Moreover, a simple calculation shows that Exp∆(f) = Σ_{i∈I} pi·Θi, which by (ii) above is Θ. □

An important further property is the following:

Proposition 3.9 If (Σ_{i∈I} pi·∆i) R̄ Θ then Θ = Σ_{i∈I} pi·Θi for some subdistributions Θi such that ∆i R̄ Θi for i ∈ I.

Proof: Let ∆ R̄ Θ where ∆ = Σ_{i∈I} pi·∆i. By Proposition 3.8, using that R and ↿R have the same lifting, there is a choice function f ∈⌈∆⌉ ↿R such that Θ = Exp∆(f). Take Θi := Exp∆i(f) for i ∈ I. Using that ⌈∆i⌉ ⊆ ⌈∆⌉, Proposition 3.8 yields ∆i R̄ Θi for i ∈ I. Finally,

Σ_{i∈I} pi·Θi = Σ_{i∈I} pi · Σ_{s∈⌈∆i⌉} ∆i(s)·f(s) = Σ_{s∈⌈∆⌉} Σ_{i∈I} pi·∆i(s)·f(s) = Σ_{s∈⌈∆⌉} ∆(s)·f(s) = Exp∆(f) = Θ. □

The converse to the above is not true in general: from ∆ R̄ (Σ_{i∈I} pi·Θi) it does not follow that ∆ can correspondingly be decomposed. For example, we have a.(b 1/2⊕ c) −a→ 1/2·b̄ + 1/2·c̄, yet a.(b 1/2⊕ c) cannot be written as 1/2·∆1 + 1/2·∆2 such that ∆1 −a→ b̄ and ∆2 −a→ c̄. In fact a simplified form of Proposition 3.9 holds for un-lifted relations, provided they are convex:

Corollary 3.10 If (Σ_{i∈I} pi·s̄i) R̄ Θ and R is convex, then Θ = Σ_{i∈I} pi·Θi for subdistributions Θi with si R Θi for i ∈ I.

Proof: Take ∆i to be s̄i in Proposition 3.9, whence Θ = Σ_{i∈I} pi·Θi for some subdistributions Θi such that s̄i R̄ Θi for i ∈ I. Because R is convex, we then have si R Θi by the remarks following Definition 3.2. □

Lifting satisfies the following monadic property with respect to composition.

Lemma 3.11 Let R1, R2 ⊆ S × D(S). Then the forward relational composition R̄1;R̄2 of the liftings is equal to the lifting of the composite relation R1;R̄2 ⊆ S × D(S).

Proof: Suppose ∆ R̄1;R̄2 Φ. Then there is some Θ such that ∆ R̄1 Θ R̄2 Φ. By Lemma 3.4 we have the decompositions ∆ = Σ_{i∈I} pi·s̄i and Θ = Σ_{i∈I} pi·Θi with si R1 Θi for each i ∈ I. By Proposition 3.9 we obtain Φ = Σ_{i∈I} pi·Φi with Θi R̄2 Φi. It follows that si (R1;R̄2) Φi, and thus ∆ is related to Φ by the lifting of R1;R̄2. So we have shown that R̄1;R̄2 is included in the lifting of R1;R̄2; the other direction can be proved similarly. □

3.2 Weak transitions defined

Definition 3.12 (Weak τ moves to derivatives) Suppose we have subdistributions ∆, ∆→k, ∆×k, for k ≥ 0, with the following properties:

∆ = ∆→0 + ∆×0
∆→0 −τ→ ∆→1 + ∆×1
. . .
∆→k −τ→ ∆→k+1 + ∆×k+1
. . .

In total: ∆′ = Σ_{k=0}^∞ ∆×k.

At each stage the ×-component stops "here" (even if it could have moved), the →-component moves on, and finally all the stopped-somewhere components are summed. The −τ→ moves above with subdistribution sources are lifted in the sense of the previous section. We call ∆′ := Σ_{k=0}^∞ ∆×k a derivative of ∆, and write ∆ =⇒ ∆′ to mean that ∆ can make a weak τ move to its derivative ∆′.
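Definition 3.12 can be run directly once one fixes, at each stage, how to split the current subdistribution and which lifted τ-move the moving part takes. The sketch below (our illustration, not the paper's) makes both choices explicit as parameters and truncates the infinite sum at finitely many stages; with the scheduler for rec x. x, which never stops any mass, the accumulated derivative stays empty, matching Example 3.14 below.

```python
# Sketch (ours): running Definition 3.12 for finitely many stages.
# Subdistributions are dicts state -> mass; 'split' chooses the stopping
# component, and 'tau' is one lifted tau-step applied to the moving part.

def weak_derivative(delta, split, tau, stages):
    derivative = {}                      # accumulates sum_k of the stopped parts
    moving = dict(delta)
    for _ in range(stages):
        stopped, moving = split(moving)  # Delta_k split into x- and ->-parts
        for s, p in stopped.items():
            derivative[s] = derivative.get(s, 0.0) + p
        moving = tau(moving)             # the ->-part makes a lifted tau-move
    return derivative

# rec x. x: the only tau-move loops, and no mass ever stops.
loop_split = lambda d: ({}, d)           # stop nothing, move everything
loop_tau   = lambda d: d                 # rec x. x --tau--> rec x. x
print(weak_derivative({"rec x. x": 1.0}, loop_split, loop_tau, 100))  # {}
```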

There is always at least one derivative of any distribution (the distribution itself) and there can be many. Using Lemma 3.5 it is easily checked that Definition 3.12 is well-defined, in that derivatives do not sum to more than one.

Example 3.13 Let −τ→* denote the reflexive transitive closure of the relation −τ→ over subdistributions. By the judicious use of the empty subdistribution ε in the definition of =⇒, and property (1) above, it is easy to see that ∆ −τ→* Θ implies ∆ =⇒ Θ. For ∆ −τ→* Θ means the existence of a finite sequence of subdistributions ∆ = ∆0, ∆1, . . . , ∆k = Θ, k ≥ 0, for which we can write

∆ = ∆0 + ε
∆0 −τ→ ∆1 + ε
. . .
∆k−1 −τ→ ε + ∆k
ε −τ→ ε + ε
. . .

In total: Θ.

This implies that =⇒ is indeed a generalisation of the standard notion for non-probabilistic transition systems of performing an indefinite sequence of internal τ moves. □

In Definition 3.12 we can see that ∆′ = ε iff ∆×k = ε for all k. Thus ∆ =⇒ ε iff there is an infinite sequence of subdistributions ∆k such that ∆ = ∆0 and ∆k −τ→ ∆k+1; that is, iff ∆ can give rise to a divergent computation.

Example 3.14 Consider the process rec x. x, which recall is a state, and for which we have rec x. x −τ→ ⟦rec x. x⟧ and thus ⟦rec x. x⟧ −τ→ ⟦rec x. x⟧. Then ⟦rec x. x⟧ =⇒ ε. □

Example 3.15 Recall the process Q1 = rec x. (τ.x 1/2⊕ a) from the introduction. We have ⟦Q1⟧ =⇒ ⟦a⟧ because of the infinite derivation

⟦Q1⟧ = ⟦Q1⟧ + ε
⟦Q1⟧ −τ→ 1/2·⟦τ.Q1⟧ + 1/2·⟦a⟧
1/2·⟦τ.Q1⟧ −τ→ 1/2·⟦Q1⟧ + ε
1/2·⟦Q1⟧ −τ→ 1/2²·⟦τ.Q1⟧ + 1/2²·⟦a⟧
. . .
1/2^k·⟦τ.Q1⟧ −τ→ 1/2^k·⟦Q1⟧ + ε
1/2^k·⟦Q1⟧ −τ→ 1/2^{k+1}·⟦τ.Q1⟧ + 1/2^{k+1}·⟦a⟧
. . .

which means that by definition we have

⟦Q1⟧ =⇒ ε + Σ_{k≥1} 1/2^k·⟦a⟧,

which is just ⟦a⟧ as claimed. □
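A quick numeric check of this derivation (our sketch): the mass stopped on ⟦a⟧ after the first n rounds is Σ_{k≤n} 1/2^k, which converges to 1.

```python
# Sketch: mass accumulated on [a] in the derivation of Example 3.15.
mass_on_Q1, stopped_on_a = 1.0, 0.0
for _ in range(50):
    # Q1 --tau--> 1/2 [tau.Q1] + 1/2 [a]: the [a]-part stops,
    # and tau.Q1 --tau--> Q1 feeds the next round.
    stopped_on_a += mass_on_Q1 / 2
    mass_on_Q1   /= 2
print(stopped_on_a)   # ~ 1.0, so [Q1] ==> [a]
```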

Example 3.16 Consider states sk and probabilities pk for k ≥ 2 such that

sk −τ→ pk·⟦a⟧ + (1−pk)·s̄k+1,

where we choose pk so that, starting from any sk, the probability of eventually taking a left-hand branch, and so reaching a ultimately, is just 1/k in total. Thus pk must satisfy 1/k = pk + (1−pk)/(k+1), whence by arithmetic pk := 1/k² will do. Therefore in particular s̄2 =⇒ 1/2·⟦a⟧, with the remaining 1/2 lost in divergence. □
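The arithmetic is easy to confirm mechanically (our sketch): with pk = 1/k² the recurrence 1/k = pk + (1−pk)/(k+1) holds, and simulating the chain from s2 drives the mass reaching a to 1/2.

```python
# Sketch: checking p_k = 1/k**2 in Example 3.16.
for k in range(2, 10):
    p = 1 / k**2
    assert abs((p + (1 - p) / (k + 1)) - 1 / k) < 1e-12

reached_a, alive, k = 0.0, 1.0, 2
for _ in range(10_000):            # follow the chain s_2, s_3, ...
    p = 1 / k**2
    reached_a += alive * p         # left-hand branch: stop at a
    alive     *= (1 - p)           # right-hand branch: move to s_{k+1}
    k += 1
print(reached_a)                   # ~ 0.5: s_2 ==> (1/2).[a]
```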

Our final example demonstrates that derivatives of (interpretations of) pCSP processes may have infinite support, and hence that we can have ⟦P⟧ =⇒ ∆′ such that there is no P′ ∈ pCSP with ⟦P′⟧ = ∆′.

Example 3.17 Let P denote the process rec x. (b 1/2⊕ (x |∅ 0)). Then we have the derivation

⟦P⟧ = ⟦P⟧ + ε
⟦P⟧ −τ→ 1/2·⟦P |∅ 0⟧ + 1/2·⟦b⟧
1/2·⟦P |∅ 0⟧ −τ→ 1/2²·⟦P |∅ 0²⟧ + 1/2²·⟦b |∅ 0⟧
. . .
1/2^k·⟦P |∅ 0^k⟧ −τ→ 1/2^{k+1}·⟦P |∅ 0^{k+1}⟧ + 1/2^{k+1}·⟦b |∅ 0^k⟧
. . .

where 0^k represents k instances of 0 running in parallel. This implies that ⟦P⟧ =⇒ Θ where

Θ = 1/2·⟦b⟧ + Σ_{k≥1} 1/2^{k+1}·⟦b |∅ 0^k⟧,

a distribution with infinite support. □

3.3 Elementary properties of weak derivations

Here we develop some properties of the weak move relation =⇒ which will be important later on in the paper; full proofs and technical lemmas have in some cases been placed in Appendix B. We wish to use weak derivation as much as possible in the same way as the lifted action relations −α→, and therefore we start by showing that =⇒ enjoys two of the most crucial properties of −α→: the linearity of Definition 3.2 and the decomposition property of Proposition 3.9.

Theorem 3.18 Let pi ∈ [0, 1] for i ∈ I with Σ_{i∈I} pi ≤ 1. Then

(i) if ∆i =⇒ Θi for all i ∈ I then Σ_{i∈I} pi·∆i =⇒ Σ_{i∈I} pi·Θi;
(ii) if Σ_{i∈I} pi·∆i =⇒ Θ then Θ = Σ_{i∈I} pi·Θi for subdistributions Θi such that ∆i =⇒ Θi for all i ∈ I.

Proof: See Lemma B.3. □

With Theorem 3.18, the relation =⇒ ⊆ D(S) × D(S) can be obtained as the lifting of a relation =⇒S from S to D(S), defined by writing s =⇒S Θ just when s̄ =⇒ Θ.

Proposition 3.19 The lifting of =⇒S is exactly =⇒.

Proof: That ∆ related to Θ by the lifting of =⇒S implies ∆ =⇒ Θ is a simple application of part (i) of Theorem 3.18. For the other direction, suppose ∆ =⇒ Θ: given that ∆ = Σ_{s∈⌈∆⌉} ∆(s)·s̄, part (ii) of the same theorem enables us to decompose Θ into Σ_{s∈⌈∆⌉} ∆(s)·Θs where s̄ =⇒ Θs for each s in ⌈∆⌉. But the latter means that s =⇒S Θs, and so by definition ∆ is related to Θ by the lifting of =⇒S. □

Corollary 3.20 The relation =⇒ is convex.

Proof: This is immediate from its being a lifting. □

We proceed with the important properties of reflexivity and transitivity of weak derivations.

Theorem 3.21 (Reflexivity and transitivity of =⇒) For any ∆ ∈ D(S) we have ∆ =⇒ ∆. Moreover, if ∆ =⇒ Θ and Θ =⇒ Λ then ∆ =⇒ Λ.

Proof: The first statement is trivial: just take ∆→0 := ε in Definition 3.12. For the second, see Theorem B.4. □

Finally, we need a property that is the converse of transitivity: if one executes a given weak derivation partly, by stopping more often and moving on less often, one makes another weak transition that can be regarded as an initial segment of the given one. We need the property that after executing such an initial segment, it is still possible to complete the given derivation.

Definition 3.22 A weak derivation Φ =⇒ Γ is called an initial segment of a weak derivation Φ =⇒ Ψ if for k ≥ 0 there are Γk, Γ→k, Γ×k, Ψk, Ψ→k, Ψ×k ∈ D(S) such that Γ0 = Ψ0 = Φ and

Γk = Γ→k + Γ×k,   Γ→k −τ→ Γk+1,   Γ = Σ_{k=0}^∞ Γ×k,
Ψk = Ψ→k + Ψ×k,   Ψ→k −τ→ Ψk+1,   Ψ = Σ_{k=0}^∞ Ψ×k,
Γ→k ≤ Ψ→k,   Γk ≤ Ψk,   and   (Ψ→k − Γ→k) −τ→ (Ψk+1 − Γk+1).

Proposition 3.23 If Φ =⇒ Γ is an initial segment of Φ =⇒ Ψ, then Γ =⇒ Ψ.

Proof: To be placed in Appendix B. □

Further properties of weak derivations from finitary subdistributions are given in Section 5.


4 Testing probabilistic processes

We follow our earlier approach [2, 3] to the testing of probabilistic processes. A test is simply a process from the language pCSP, except that it may use special "success" actions for reporting the outcome. Thus we assume a set Ω of fresh success actions not already in Actτ. We refer to the augmented language as pCSPΩ, and the pLTS it generates as pltst. Now formally a test T is a process from this language, and to apply it to a process P we form the process T |Act P, in which all visible actions of P must synchronise with T. This gives rise to a pLTS in which the only possible actions are τ and elements of Ω:

Definition 4.1 A pLTS of the form ⟨S, Ωτ, −→⟩ is referred to as a computation structure.

To determine the outcome of applying a test to a process we therefore have to extract some result from a computation structure. However, because of the presence of recursion in pCSP these may now be of infinite depth. Consequently we can no longer use our earlier approach [2], because it assumed both processes and tests to be finite. In the following two subsections we outline two different ways in which outcomes can be calculated from the infinite (but finite-branching) computation structure of T |Act P; for finite-state systems they will turn out to be equivalent.

4.1 Extremal testing

Here we assume that tests are allowed to use a single success action ω; thus Ω = {ω}. Let S be the set of states in a computation structure. We view the unit interval [0, 1], ordered in the standard manner, as a complete lattice (with least element 0), and this induces the same structure on the set of functions S → [0, 1]; the induced order is given by f ≤ g whenever f(s) ≤ g(s) for every s ∈ S. Now consider the functional Rmin : (S → [0, 1]) → (S → [0, 1]) defined by

Rmin(f)(s) = 1 if s −ω→;  0 if s has no transition;  min{ f(∆) | s −τ→ ∆ } otherwise,

where f(∆) = Exp∆(f). In a similar fashion we can define the functional Rmax : (S → [0, 1]) → (S → [0, 1]), which uses the max function in place of min. Both these functionals are monotone, and therefore have least fixed points, which we abbreviate to Vmin, Vmax respectively. Furthermore, it can be shown that both Rmin and Rmax are continuous (see Lemma 4.19 below), so we have the characterisation

Vmin = ⊔_{n∈N} Vmin^n    and    Vmax = ⊔_{n∈N} Vmax^n    (2)

where both Vmin^0 and Vmax^0 denote the bottom function ⊥ defined by ⊥(s) = 0 for all s ∈ S, and

• Vmin^(k+1) = Rmin(Vmin^k)
• Vmax^(k+1) = Rmax(Vmax^k).

Now for a test T and a process P, we have two ways of defining the outcome of the application of T to P:

Aemin(T, P) = Vmin(⟦T |Act P⟧)
Aemax(T, P) = Vmax(⟦T |Act P⟧)

Here Aemin(T, P) returns a single probability p, estimating the minimal probability of success; it is a pessimistic estimate. On the other hand Aemax(T, P) is optimistic, in that it gives the maximal probability of success.

Figure 4: Testing the process Q1. (i) The process Q1; (ii) the computation structure of a.ω |Act Q1.

Definition 4.2

1. P ⊑epmay Q if for every test T, Aemax(T, P) ≤ Aemax(T, Q).
2. P ⊑epmust Q if for every test T, Aemin(T, P) ≤ Aemin(T, Q).

Example 4.3 Consider the process Q1 = rec x. (τ.x 1/2⊕ a), described graphically in Figure 4. When we apply the test T = a.ω to it we get the computation structure also described there. Note that this structure is deterministic, and consequently the pessimistic and optimistic approaches coincide: we have Vmax(⟦T |Act Q1⟧) = Vmin(⟦T |Act Q1⟧) = v, where v is the least probability (indeed the only probability, in this case) that satisfies

v = 1/2·v + 1/2,

so that v = 1. In general these solutions are unique just when the process is almost free of divergence, that is almost (surely) convergent, so that the infinite internal paths have total probability zero. In fact it is possible to show that Q1 ≃epmay a and Q1 ≃epmust a. □
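Both extremal values are ordinary least-fixed-point computations, so Example 4.3 can be checked by value iteration. A sketch (ours; the state names are our own encoding of the computation structure of a.ω |Act Q1):

```python
# Sketch (ours): value iteration for V_min / V_max on Example 4.3.
# Each state lists a success flag and its tau-choices, each choice being
# a distribution over successor states.
states = {
    "q":   {"omega": False, "tau": [{"q'": 0.5, "a": 0.5}]},  # unfold Q1
    "q'":  {"omega": False, "tau": [{"q": 1.0}]},             # tau.Q1 -> Q1
    "a":   {"omega": False, "tau": [{"win": 1.0}]},           # sync on a
    "win": {"omega": True,  "tau": []},                       # reports omega
}

def iterate(choose, rounds=200):
    v = {s: 0.0 for s in states}
    for _ in range(rounds):
        for s, info in states.items():
            if info["omega"]:
                v[s] = 1.0
            elif info["tau"]:
                v[s] = choose(sum(d[t] * v[t] for t in d) for d in info["tau"])
            else:
                v[s] = 0.0
    return v

print(iterate(min)["q"], iterate(max)["q"])  # both converge to ~1.0 = v
```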

4.2 Resolution-based testing

Here tests are allowed to use a finite collection of success actions, Ω = {ω1, . . . , ωn}, although it will be convenient to assume that in any given state at most one of the actions ωi can be executed. Then, when calculating the result of applying the test T to the process P, we use the collection of resolutions of T |Act P; intuitively a resolution represents a run of the combined process T |Act P, and as such gives exactly one probability for each success action. So in general the application of T to P yields a set of probability vectors, the result vectors, not necessarily finite. Here we adapt to pLTSs the notion of resolution defined in [21, 4] for probabilistic automata.

A computation structure ⟨S, Ωτ, →⟩ is called deterministic if for every s ∈ S and every α ∈ Ωτ there is at most one ∆ such that s −α→ ∆.

Definition 4.4 A resolution of a computation structure ⟨S, Ωτ, →⟩ is a deterministic computation structure ⟨T, Ωτ, →⟩ such that there is a resolving function f ∈ T → S which satisfies the following conditions:

1. if t −ω→ Θ for some ω ∈ Ω then f(t) −ω→ f(Θ);
2. if t −τ→ Θ then f(t) has no ω-transition for any ω ∈ Ω, and f(t) −τ→ f(Θ);
3. if t has no transition then neither has f(t);

where f(Θ) is the distribution defined by f(Θ)(s) = Σ_{f(t)=s} Θ(t).

The reader is referred to Section 2 of [4] for a detailed discussion of this concept of resolution, and the manner in which it represents computation runs of a process; in particular, in a resolution states in S are allowed to be resolved into distributions, and computation steps can be probabilistically interpolated. We often use the meta-variable R to refer to a resolution, with resolving function fR; the computation structure involved will usually be understood from the context, and in most cases it is that generated by pltst.

Figure 5: Testing the process Q2. (i) The process Q2; (ii) the computation structure of a.ω |Act Q2.

Let S denote the set of states in a deterministic computation structure. Then, by analogy with the functionals Rmin and Rmax of the previous subsection, we can define R : (S → [0, 1]Ω) → (S → [0, 1]Ω) by

R(f)(s) = ~ωi if s −ωi→;  ~0 if s has no transition;  f(∆) if s −τ→ ∆.    (3)

Here we use notation originally introduced in [4] for denoting result vectors in [0, 1]Ω: ~0 is the vector which is everywhere 0, while ~ωi has 1 in the ωi-th position but is otherwise 0. Once more this functional has a least fixed point, which we denote by V. When the deterministic computation structure concerned is given by a resolution R, we say that R realises the function V. Now let AΩ(T, P) denote the set

{ V(∆) | fR(∆) = ⟦T |Act P⟧, R a resolution of pltst }.    (4)

Definition 4.5

1. P ⊑Ωpmay Q if for every Ω-test T, AΩ(T, P) ≤Ho AΩ(T, Q).
2. P ⊑Ωpmust Q if for every Ω-test T, AΩ(T, P) ≤Sm AΩ(T, Q).

Here ≤Ho and ≤Sm are the standard Hoare and Smyth preorders on sets of vectors: A ≤Ho B iff every element of A is dominated by some element of B, and A ≤Sm B iff every element of B dominates some element of A. These preorders are abbreviated to P ⊑pmay Q and P ⊑pmust Q when |Ω| = 1.

Example 4.6 Consider the process Q2 = rec x. (τ.(τ.x 1/2⊕ a) □ τ.(0 1/2⊕ a)) and the application of the test T = a.ω to it; this is outlined in Figure 5. In the computation structure of T |Act Q2, for each k ≥ 1 there is a resolution Rk whose value is 1 − 1/2^k; intuitively it goes around the loop (k − 1) times before at last taking the right-hand τ-action. Thus AΩ(T, Q2) contains 1 − 1/2^k for every k ≥ 1. But it also contains 1, because of the resolution which takes the left-hand τ-move every time. Thus

AΩ(T, Q2) = {(1 − 1/2), (1 − 1/2²), . . . , (1 − 1/2^k), . . . , 1}.

Since AΩ(T, a) = {1} it follows that

AΩ(T, a) ≤Ho AΩ(T, Q2)    and    AΩ(T, Q2) ≤Sm AΩ(T, a).

Indeed it is possible to show that a ⊑pmay Q2 and Q2 ⊑pmust a. □
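The two set orders behave differently on exactly this kind of example; a small sketch (ours), using finite truncations of the result sets above:

```python
# Sketch: Hoare and Smyth comparisons of (truncated) result sets.
def hoare_le(A, B):    # A <=Ho B: each a in A is below some b in B
    return all(any(a <= b for b in B) for a in A)

def smyth_le(A, B):    # A <=Sm B: each b in B is above some a in A
    return all(any(a <= b for a in A) for b in B)

results_a  = {1.0}
results_Q2 = {1 - 2**-k for k in range(1, 30)} | {1.0}

print(hoare_le(results_a, results_Q2))   # True:  a  may-below  Q2
print(smyth_le(results_Q2, results_a))   # True:  Q2 must-below a
```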

4.3 Comparison

In this section we compare the two approaches to testing introduced in the previous two subsections. Our first result is that in the most general setting they lead to different testing preorders.

Example 4.7 [Monster B] Consider the infinite-state pLTS defined as follows: in addition to the states a and 0 it has the infinite set b1, b2, . . . , with each of these having two transitions:

• bk −τ→ b̄k+1
• bk −τ→ ⟦0 1/2^k⊕ a⟧, i.e. with probability 1/2^k it deadlocks and with probability 1 − 1/2^k it reaches the state a.

Now let us compare the state b1 with the process a. With the test a.ω, using resolutions, we get

AΩ(a.ω, b1) = {0, (1 − 1/2), . . . , (1 − 1/2^k), . . .}
AΩ(a.ω, a) = {1}    (5)

which means that it is not the case that a ⊑Ωpmay b1. However, when we use extremal testing the test a.ω cannot distinguish these processes. It is straightforward to see that Vmax(⟦a.ω |Act a⟧) = 1. Moreover, for every n > 0 one can calculate Vmax^(n+1)(⟦a.ω |Act bk⟧) to be 1 − 1/2^(k+n), which in turn means that Vmax(⟦a.ω |Act b1⟧) also evaluates to 1. With some more work one can go on to show that no test can distinguish between these processes using optimistic extremal testing, meaning that a ⊑epmay b1. □
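The contrast is easy to see numerically (our sketch): every individual resolution of b1 is capped strictly below 1, yet the extremal approximants climb to 1 because deferring the exit one more step always improves the bound.

```python
# Sketch: V_max value iteration on a truncated Monster B chain b_1..b_N.
# At b_k the test can either follow b_k --tau--> b_{k+1} or take the exit
# branch, which succeeds with probability 1 - 2**-k.
N = 60
v = [0.0] * (N + 2)                    # v[k] approximates Vmax(a.w |Act b_k)
for _ in range(N + 2):                 # enough rounds to propagate back
    for k in range(N, 0, -1):
        v[k] = max(v[k + 1], 1 - 2**-k)
print(v[1])                            # 1 - 2**-60: tends to 1 as N grows

# A resolution, however, commits to one exit point k (or to none), so its
# value is 0 or 1 - 2**-k, always strictly below 1, matching (5).
```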

In the remainder of this section we show that, provided some finitary constraints are imposed on the pLTS, extremal testing and resolution-based testing coincide. It is convenient to break the material up into three subsections. In the first we show that for resolution-based testing it is sufficient to use a singleton set of success actions, |Ω| = 1. In the second we examine must testing, which is easier than the may case, which in turn is treated in the final subsection.

4.3.1 Scalar versus Vector testing

In [4] it was shown that for finite-branching probabilistic automata and resolution-based testing it is sufficient to use result sets Ω of size 1. We wish to apply this result in our setting, to obtain Theorem 4.11 below; however, to do so we need to demonstrate that the manner in which a value is obtained from a resolution in that paper coincides with our use of least fixed points. First of all, we recall the notion of occurrences of actions and the results-gathering function W given in Definition 5 of [4].

Definition 4.8 Given a fully probabilistic automaton R = ⟨S, ∆◦, →⟩, the probability that R starts with a sequence of actions ℵ ∈ Σ* is given by PrR(ℵ, ∆◦), where PrR : Σ* × S → [0, 1] is defined inductively:

PrR(ε, s) := 1    and    PrR(αℵ, s) := PrR(ℵ, ∆) if s −α→ ∆, and 0 otherwise,

where PrR(ℵ, ∆) = Σ_{s∈⌈∆⌉} ∆(s)·PrR(ℵ, s). The notation ε denotes the empty sequence of actions, and αℵ the sequence starting with α ∈ Σ and continuing with ℵ ∈ Σ*. The value PrR(ℵ, s) is the probability that R proceeds with sequence ℵ from state s. Let Σ*α be the set of finite sequences in Σ* that contain α just once, namely at the end. Then the probability that the fully probabilistic automaton R ever performs an action α is given by Σ_{ℵ∈Σ*α} PrR(ℵ, ∆◦).

Definition 4.9 For a fully probabilistic automaton R, let its success tuple W.R ∈ [0, 1]^n be such that (W.R)i is the probability that R performs the action ωi. Then for a (not necessarily fully probabilistic) automaton M we define the set of its success tuples to be those resulting as above from all its resolutions: W.M := {W.R | R is a resolution of M}.

Proposition 4.10 If R = ⟨S, ∆◦, →⟩ is a fully probabilistic automaton, then W.R = V(∆◦).

Proof: We need to show that (W.R)i = (V(∆◦))i for every i, i.e. Σ_{ℵ∈Σ*ωi} PrR(ℵ, ∆◦) = (V(∆◦))i, for which it suffices to show that

Σ_{ℵ∈Σ*ωi} PrR(ℵ, s) = (V(s))i for all s ∈ S.    (6)

Since V = ⊔_{n∈N} V^n, we have (V(s))i = lim_{n→∞} (V^n(s))i. For a sequence of actions ℵ ∈ Σ* we write |ℵ| for its length. The sequence of reals {Σ_{ℵ∈Σ*ωi, |ℵ|≤n} PrR(ℵ, s)}_{n=0}^∞ is increasing and bounded by 1, so it converges, and we have Σ_{ℵ∈Σ*ωi} PrR(ℵ, s) = lim_{n→∞} Σ_{ℵ∈Σ*ωi, |ℵ|≤n} PrR(ℵ, s). We now prove by induction on n that

Σ_{ℵ∈Σ*ωi, |ℵ|≤n} PrR(ℵ, s) = (V^n(s))i for all n ∈ N,    (7)

which will yield (6) immediately.

• The base case is n = 0. Then Σ_{ℵ∈Σ*ωi, |ℵ|≤0} PrR(ℵ, s) = 0 and V^0(s) = ~0, for every i.

• Now supposing (7) holds for some n, we consider the case n + 1. If s has no transition, then Σ_{ℵ∈Σ*ωi, |ℵ|≤n+1} PrR(ℵ, s) = 0 = (V^{n+1}(s))i. If s −α→ ∆ for some action α and distribution ∆, then there are two possibilities:

  – α = ωi. We then have (V^{n+1}(s))i = 1. Note that if ℵ is a finite non-empty sequence without any occurrence of ωi, then PrR(ℵωi, s) = 0. In other words, Σ_{ℵ∈Σ*ωi, |ℵ|≤n+1} PrR(ℵ, s) = PrR(⟨ωi⟩, s) = 1.

  – α ≠ ωi. Then (V^{n+1}(s))i = (V^n(∆))i. On the other hand, PrR(α′ℵ, s) = 0 if α ≠ α′. Therefore

    Σ_{ℵ∈Σ*ωi, |ℵ|≤n+1} PrR(ℵ, s) = Σ_{αℵ∈Σ*ωi, |αℵ|≤n+1} PrR(αℵ, s)
    = Σ_{αℵ∈Σ*ωi, |αℵ|≤n+1} PrR(ℵ, ∆)
    = Σ_{ℵ∈Σ*ωi, |ℵ|≤n} PrR(ℵ, ∆)
    = (V^n(∆))i    (by induction)
    = (V^{n+1}(s))i.    □

As a corollary of Proposition 4.10, we have AΩ(T, P) = W.(⟦T |Act P⟧) for any process P and test T. Therefore the testing preorders ⊑Ωpmay, ⊑Ωpmust defined in Section 4.2 coincide with those in Definition 6 of [4]. Now Theorem 4 of [4] (to be accurate, the variant of that theorem for state-based testing) tells us that when testing finite-state processes it suffices to use a single success action rather than multiple success actions. That is,

Theorem 4.11 For finite-state processes:

• P ⊑Ωpmay Q if and only if P ⊑pmay Q
• P ⊑Ωpmust Q if and only if P ⊑pmust Q    □

4.3.2 Must testing

Here we show that, provided we restrict our attention to finite-branching processes, there is no difference between extremal must testing and resolution-based must testing. In view of Theorem 4.11, we will restrict our attention to resolution-based testing in which there is only one success action ω, so |Ω| = 1.

Let us consider a computation structure ⟨S, Ωτ, →⟩, obtained perhaps from applying a test T to a process P in ⟦T |Act P⟧. We have two ways of obtaining a result for a distribution of states from S: by applying the function Vmin, or by using resolutions of the computation structure to realise V. Our first result says that, regardless of the actual resolution used, the value obtained from the latter will always dominate the former.

Proposition 4.12 Vmin(fR(∆)) ≤ V(∆), for any resolution R.

Proof: Suppose R is a resolution of ⟨S, Ωτ, →⟩, giving the deterministic computation structure ⟨T, Ωτ, →⟩. First we show by induction on n that for every state t ∈ T

Vmin^n(fR(t)) ≤ V^n(t).    (8)

For n = 0 this is trivial; we consider the inductive step. First, if t −ω→ Θ, then fR(t) −ω→ fR(Θ), and thus Vmin^{n+1}(fR(t)) = 1 = V^{n+1}(t). A similar argument applies if t has no transition; so let us assume t −τ→ Θ for some Θ, with t having no ω-transition. Then

Vmin^{(n+1)}(fR(t)) = min{ Vmin^n(∆) | fR(t) −τ→ ∆ }
≤ Vmin^n(fR(Θ))
= Σ_{s∈S} fR(Θ)(s)·Vmin^n(s)
= Σ_{t′∈T} Θ(t′)·Vmin^n(fR(t′))
≤ Σ_{t′∈T} Θ(t′)·V^n(t′)    (by induction)
= V^n(Θ)
= V^{(n+1)}(t).

Now by continuity we have from (8) that

Vmin(fR(t)) ≤ V(t),    (9)

and this result is then easily generalised to distributions:

Vmin(fR(∆)) = Σ_{s∈S} fR(∆)(s)·Vmin(s) = Σ_{t∈T} ∆(t)·Vmin(fR(t)) ≤ Σ_{t∈T} ∆(t)·V(t) = V(∆),

using (9) in the inequality. □

Our next result says that in any finite-branching computation structure we can find a resolution which realises the function Vmin; moreover this resolution will be of a particularly simple form. (This material is also relevant to the proof of Lemma 5.3.)

A resolution R is said to be static if its resolving function fR is injective. Again we refer the reader to [4] for a discussion of the power of this restriction. Static resolutions are particularly simple, in that they do not allow states to be resolved into distributions, or computation steps to be interpolated. Moreover they are very easy to describe.

Definition 4.13 A (static) policy for a computation structure ⟨S, Ωτ, →⟩ is a partial function pp : S ⇀ D1(S) satisfying:

• if s −→ then pp(s) is defined;
• if s −ω→ then s −ω→ pp(s);
• otherwise, if s −τ→ then s −τ→ pp(s).

Intuitively a policy pp decides between the choices available at a given state, with a preference for reporting success. It is easy to see that a policy pp determines a static resolution of the computation structure: the deterministic computation structure ⟨S, Ωτ, →pp⟩, where →pp is determined by s →pp pp(s); note that the associated resolving function is the identity. Indeed it is possible to show that every static resolution is determined in this manner by some policy.

Proposition 4.14 In any finite-branching computation structure there exists a static resolution R such that V(fR^{-1}(∆)) = Vmin(∆) for any distribution ∆.

Proof: Let ∆ be a distribution over the computation structure ⟨S, Ωτ, →⟩. We exhibit the required resolution by defining a policy over S. We say a policy pp(−) is min-seeking if its domain is { s ∈ S | s −→ } and it satisfies:

if s has no ω-transition but s −τ→, then Vmin(pp(s)) ≤ Vmin(∆) whenever s −τ→ ∆.

Note that by design a min-seeking policy satisfies:

if s has no ω-transition but s −τ→, then Vmin(s) = Vmin(pp(s)).    (10)

In a finite-branching computation structure it is straightforward to define a min-seeking policy:

(i) If s −ω→ then let pp(s) be any ∆ such that s −ω→ ∆.
(ii) Otherwise, if s −τ→, let {∆1, . . . , ∆n} be the finite non-empty set { ∆ | s −τ→ ∆ }. Now let pp(s) be any ∆k satisfying the property Vmin(∆k) ≤ Vmin(∆j) for every 1 ≤ j ≤ n; at least one such ∆k must exist.

We now show that the static resolution determined by such a policy satisfies the requirements of the proposition. For the sake of clarity let us write Vpp(∆) for the value realised for ∆ in the resolution determined by the policy pp(−). We already know, from Proposition 4.12, that Vmin(∆) ≤ Vpp(∆), and so we concentrate on the converse, Vpp(∆) ≤ Vmin(∆). Recall that the function Vpp is the least fixed point of the functional R defined in (3) above, interpreted in the resolution determined by pp(−). So the result follows if we can show that the function Vmin is also a fixed point. Since |Ω| = 1 this amounts to proving

Vmin(s) = 1 if s −ω→;  0 if s has no transition;  Vmin(pp(s)) otherwise.

However this is a straightforward consequence of (10) above. □
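Computationally the proof is a recipe: first compute Vmin by value iteration, then read off a min-seeking policy by choosing an argmin successor at each state. A sketch (ours; the example structure is invented):

```python
# Sketch: extracting a min-seeking policy from a computed V_min.
# Each state maps to (reports omega?, list of tau-successor distributions).
succ = {
    "s": (False, [{"t": 1.0}, {"u": 0.5, "w": 0.5}]),  # nondeterministic
    "t": (False, [{"t": 1.0}]),                        # divergent loop
    "u": (True,  []),                                  # reports success
    "w": (False, []),                                  # deadlocked
}

def vmin(rounds=100):
    v = {s: 0.0 for s in succ}
    for _ in range(rounds):
        for s, (omega, dists) in succ.items():
            exp = lambda d: sum(p * v[t] for t, p in d.items())
            v[s] = 1.0 if omega else (min(map(exp, dists)) if dists else 0.0)
    return v

v = vmin()
policy = {s: min(dists, key=lambda d: sum(p * v[t] for t, p in d.items()))
          for s, (omega, dists) in succ.items() if dists and not omega}
print(v["s"], policy["s"])   # 0.0 {'t': 1.0}: the policy prefers divergence
```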

Theorem 4.15 For finite-branching processes, P ⊑epmust Q if and only if P ⊑pmust Q.

Proof: This is a consequence of the two previous propositions. First suppose P ⊑epmust Q. To show P ⊑pmust Q we must show that for any value v in AΩ(T, Q), for an arbitrary test T, there exists some v′ ∈ AΩ(T, P) such that v′ ≤ v. The value v must be of the form V(∆), where there is a resolution R such that fR(∆) = ⟦T |Act Q⟧. From Proposition 4.12 we know that Vmin(⟦T |Act Q⟧) ≤ v, and from the hypothesis P ⊑epmust Q we have that Vmin(⟦T |Act P⟧) ≤ Vmin(⟦T |Act Q⟧) ≤ v. Now employing Proposition 4.14 we can find some (static) resolution R′ such that V(Θ) = Vmin(⟦T |Act P⟧), where Θ is fR′^{-1}(⟦T |Act P⟧). So we can take the required v′ to be V(Θ).

The converse, that P ⊑pmust Q implies P ⊑epmust Q, is equally straightforward, and is left to the reader. □

4.3.3 May testing

Here we can try to apply the same proof strategy as in the previous section. The analogue of Proposition 4.12 goes through:

Proposition 4.16 V(∆) ≤ Vmax(fR(∆)), for any resolution R.

Proof: Similar to the proof of Proposition 4.12. □

However the proof strategy used in Proposition 4.14 cannot be used to show that Vmax can be realised by some static resolution, as the following example shows.

Example 4.17 In analogy with the definition used in the proof of Proposition 4.14, we say that a policy pp(−) is max-seeking if its domain is precisely { s ∈ S | s −→ }, and

if s has no ω-transition but s −τ→, then Vmax(∆) ≤ Vmax(pp(s)) whenever s −τ→ ∆.

This ensures that Vmax(s) = Vmax(pp(s)) whenever s −τ→ and s has no ω-transition, and again it is straightforward to define a max-seeking policy in a finite-branching computation structure. However the resulting resolution does not in general realise the function Vmax. To see this, let us turn the (finite-branching) pLTS used in Example 4.7 into a computation structure. Here in addition to the two states ω and 0 there is the infinite set c1, . . . , ck, . . . with the transitions

• ck −τ→ c̄k+1
• ck −τ→ ⟦0 pk⊕ ω⟧, where again pk is the probability 1/2^k.

One can calculate Vmax(ck) to be 1 for every k, and a max-seeking policy is determined by pp(ck) = c̄k+1; indeed this is essentially the only such policy. However this resolution does not realise Vmax, as Vpp(ck) = 0. □

XG

fi (s) =

s∈S i∈I

GX

fi (s).

i∈I s∈S

P F 2. If S is countable and the partial sum Sn := nj=1 i∈I fi (sj ) is bounded, i.e. there exists some c ∈ R+ such that Sn ≤ c for any n, then XG GX fi (s) = fi (s). s∈S i∈I

i∈I s∈S

Proof: 1. Since S is finite, we can assume that |S| F = N for some N ∈ N. Let ǫ be any positive real number. For each s ∈ S, there is some index is F such that 0 ≤ i∈I fi (s) − fi (s) ≤ Nǫ for all i > is . Let iS = max{is | s ∈ ǫ . Summing up over all P s ∈ S, we get S}. For P 0 ≤ i∈I fi (s) − fi (s) ≤ N for all iF> iSP P anyFs ∈ S, we have f (s) = lim f (s) ≤ ǫ for all i > i . Therefore, f (s)− 0P≤ F i i→∞ i S i s∈S fi (s) = i∈I s∈S s∈S s∈S i∈I f (s). i s∈S i∈I 19

P F 2. Since the sequence {Sn }n∈N is increasing and bounded, it converges to s∈S i∈I fi (s). Let ǫ be any positive real number. We can take a finite subset S ′ of S which is large enough so that XG XG ǫ 0≤ fi (s) − (12) fi (s) ≤ . 2 ′ s∈S i∈I

s∈S i∈I

With the same argument as in the proof the first clause, we can choose an index iS ′ so that XG X ǫ 0≤ fi (s) − fi (s) ≤ 2 ′ ′ s∈S i∈I

(13)

s∈S

Pn F for all i > iS ′ . We observe that fi (s) ≤P i∈I fi (s), so the sequence { j=1 fi (s)}n∈N , for any i ∈ I, is increasing and bounded, thus converges to s∈S fi (s). Therefore, there exists some N ∈ N such that 0≤

X

fi (s) −

N X

fi (s) ≤

j=1

s∈S

ǫ 2

(14)

for all i ∈ I. Without loss of generality, we assume that {s1 , ..., sN } ⊆ S ′ . It follows from (14) that −

X X ǫ ≤ fi (s) − fi (s) ≤ 0 2 ′ s∈S

(15)

s∈S

for all i ∈ I. Take the sum of the three inequalities (12), (13) and (15), we obtain XG X ǫ − ≤ fi (s) − fi (s) ≤ ǫ 2 s∈S i∈I

for all i > iS ′ . Therefore,

F

i∈I

P

s∈S

(16)

s∈S

fi (s) = limi→∞

P

s∈S

fi (s) =

P

s∈S

F

i∈I

fi (s). 2

The functionals Rδ and Rδmax have the following properties.

Lemma 4.19

1. For any δ ∈ (0, 1], the functionals Rδ and Rδmax are continuous.
2. If δ1, δ2 ∈ (0, 1] and δ1 ≤ δ2, then we have Rδ1 ≤ Rδ2 and Rδ1max ≤ Rδ2max.
3. Let {δn}_{n≥1} be a nondecreasing sequence of discount factors converging to 1. It holds that ⊔_{n∈N} Rδn = R and ⊔_{n∈N} Rδnmax = Rmax.

Proof: We only consider R; the case of Rmax is similar.

1. Let f0 ≤ f1 ≤ . . . be an increasing chain in S → [0, 1]. We need to show that

Rδ(⊔_{n≥0} fn) = ⊔_{n≥0} Rδ(fn).    (17)

For any s ∈ S we are in one of the following three cases:

(a) s −ω→. By (11) we have Rδ(⊔_{n≥0} fn)(s) = 1 = ⊔_{n≥0} Rδ(fn)(s) = (⊔_{n≥0} Rδ(fn))(s).

(b) s has no transition. Similarly to the last case, we have Rδ(⊔_{n≥0} fn)(s) = 0 = (⊔_{n≥0} Rδ(fn))(s).

(c) Otherwise s −α→ ∆ for some action α and distribution ∆ ∈ D1(S). Then we infer that

Rδ(⊔_{n≥0} fn)(s) = δ·(⊔_{n≥0} fn)(∆)    by (11)
= δ·Σ_{s∈⌈∆⌉} ∆(s)·(⊔_{n≥0} fn)(s)
= δ·Σ_{s∈⌈∆⌉} ∆(s)·(⊔_{n≥0} fn(s))
= δ·Σ_{s∈⌈∆⌉} ⊔_{n≥0} ∆(s)·fn(s)
= δ·⊔_{n≥0} Σ_{s∈⌈∆⌉} ∆(s)·fn(s)    by Lemma 4.18
= δ·⊔_{n≥0} fn(∆)
= ⊔_{n≥0} Rδ(fn)(s)
= (⊔_{n≥0} Rδ(fn))(s).

2. Straightforward from the definition of R.

3. For any f ∈ S → [0, 1] and s ∈ S we show that

R(f)(s) = (⊔_{n∈N} Rδn)(f)(s).    (18)

We focus on the non-trivial case in which s −α→ ∆ for some action α and distribution ∆ ∈ D1(S):

(⊔_{n∈N} Rδn)(f)(s) = ⊔_{n∈N} Rδn(f)(s) = ⊔_{n∈N} δn·f(∆) = f(∆)·(⊔_{n∈N} δn) = f(∆)·1 = R(f)(s).    □

Lemma 4.20 Let {δn}_{n≥1} be a nondecreasing sequence of discount factors converging to 1. Then

• V = ⊔_{n∈N} Vδn
• Vmax = ⊔_{n∈N} Vmaxδn.

Proof: We only consider V; the case of Vmax is similar. We use the notation lfp(f) for the least fixed point of the function f over a complete lattice. Recall that V and Vδn are the least fixed points of R and Rδn respectively, so we need to prove that

lfp(R) = ⊔_{n∈N} lfp(Rδn).    (19)

We now show the two inequalities. For any n ∈ N we have δn ≤ 1, so Lemma 4.19(2) yields Rδn ≤ R. It follows that lfp(Rδn) ≤ lfp(R), and thus ⊔_{n∈N} lfp(Rδn) ≤ lfp(R).

For the other direction, lfp(R) ≤ ⊔_{n∈N} lfp(Rδn), it suffices to show that ⊔_{n∈N} lfp(Rδn) is a pre-fixed point of R, i.e. that R(⊔_{n∈N} lfp(Rδn)) ≤ ⊔_{n∈N} lfp(Rδn), which we derive as follows:

R(⊔_{n∈N} lfp(Rδn))
= (⊔_{m∈N} Rδm)(⊔_{n∈N} lfp(Rδn))    by Lemma 4.19(3)
= ⊔_{m∈N} Rδm(⊔_{n∈N} lfp(Rδn))
= ⊔_{m∈N} ⊔_{n∈N} Rδm(lfp(Rδn))    by Lemma 4.19(1)
= ⊔_{m∈N} ⊔_{n≥m} Rδm(lfp(Rδn))
≤ ⊔_{m∈N} ⊔_{n≥m} Rδn(lfp(Rδn))    by Lemma 4.19(2)
= ⊔_{n∈N} Rδn(lfp(Rδn))
= ⊔_{n∈N} lfp(Rδn).

This completes the proof of (19). □
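Lemma 4.20 is easy to watch numerically (our sketch): collapsing Example 4.3's structure to a single recursive state gives the recurrence v = δ·(v/2 + 1/2), whose unique discounted value rises to the undiscounted value 1 as δ approaches 1.

```python
# Sketch: V_delta for the one-loop recurrence v = delta*(v/2 + 1/2),
# whose exact fixed point is v = delta / (2 - delta).
for delta in (0.9, 0.99, 0.999, 1.0):
    v = 0.0
    for _ in range(100_000):          # contraction: iteration converges
        v = delta * (0.5 * v + 0.5)
    print(delta, round(v, 6))         # tends to 1.0 as delta -> 1
```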

In the rest of this section we consider probabilistic automata with actions τ and ω only.

Lemma 4.21 Suppose δ ∈ (0, 1]. If ⟨T, Θ◦, →⟩ is a resolution of ⟨S, ∆◦, →⟩, then we have Vδ,n(Θ◦) ≤ Vmaxδ,n(∆◦).

Proof: Let f : T → S be the resolving function associated with the resolution ⟨T, Θ◦, →⟩. We show by induction on n that

Vmaxδ,n(f(t)) ≥ Vδ,n(t) for any t ∈ T.    (20)

The base case n = 0 is trivial. We consider the inductive step. If t −ω→ Θ, then f(t) −ω→ f(Θ), and thus Vmaxδ,n+1(f(t)) = 1 = Vδ,n+1(t). Now suppose t −τ→ Θ. Then f(t) has no ω-transition and f(t) −τ→ f(Θ). We can infer that

Vmaxδ,(n+1)(f(t)) = δ·max{ Vmaxδ,n(∆) | f(t) −τ→ ∆ }
≥ δ·Vmaxδ,n(f(Θ))
= δ·Σ_{s∈S} f(Θ)(s)·Vmaxδ,n(s)
= δ·Σ_{t′∈T} Θ(t′)·Vmaxδ,n(f(t′))
≥ δ·Σ_{t′∈T} Θ(t′)·Vδ,n(t′)    by induction
= δ·Vδ,n(Θ)
= Vδ,(n+1)(t).

So we have proved (20), from which it follows that

Vmaxδ(f(t)) ≥ Vδ(t) for any t ∈ T.    (21)

Therefore we have that

Vmaxδ(∆◦) = Vmaxδ(f(Θ◦))
= Σ_{s∈S} f(Θ◦)(s)·Vmaxδ(s)
= Σ_{t∈T} Θ◦(t)·Vmaxδ(f(t))
≥ Σ_{t∈T} Θ◦(t)·Vδ(t)    by (21)
= Vδ(Θ◦).    □

We say a resolution of a process is static if its associated resolving function is injective.

Lemma 4.22 Suppose δ < 1. Given a probabilistic automaton ⟨S, ∆◦, →⟩, there exists a static resolution ⟨T, Θ◦, →⟩ such that Vmaxδ(∆◦) = Vδ(Θ◦).

Proof: Let ⟨T, Θ◦, →⟩ be a resolution with an injective resolving function f such that if t −τ→ Θ then Vmaxδ(f(Θ)) = max{ Vmaxδ(∆) | f(t) −τ→ ∆ }. The finite-branching assumption ensures the existence of such a resolving function f.

Let g : T → [0, 1] be the function defined by g(t) = Vmaxδ(f(t)) for all t ∈ T. Below we show that g is a fixed point of Rδ. If t −ω→ then f(t) −ω→; therefore Rδ(g)(t) = 1 = Vmaxδ(f(t)) = g(t). Now suppose t −τ→ Θ. By the definition of f, the state f(t) has no ω-transition and f(t) −τ→ f(Θ) with Vmaxδ(f(Θ)) = max{ Vmaxδ(∆) | f(t) −τ→ ∆ }. Therefore

Rδ(g)(t) = δ·g(Θ)
= δ·Σ_{t∈T} Θ(t)·g(t)
= δ·Σ_{t∈T} Θ(t)·Vmaxδ(f(t))
= δ·Σ_{s∈S} f(Θ)(s)·Vmaxδ(s)
= δ·Vmaxδ(f(Θ))
= δ·max{ Vmaxδ(∆) | f(t) −τ→ ∆ }
= Vmaxδ(f(t))
= g(t).

Since Rδ has a unique fixed point Vδ, we derive that g coincides with Vδ, i.e. Vδ(t) = g(t) = Vmaxδ(f(t)) for all t ∈ T, from which we can obtain the required result Vδ(Θ◦) = Vmaxδ(∆◦). □

Theorem 4.23 Given a finite-state probabilistic automaton ⟨S, ∆◦, →⟩, there exists a static resolution ⟨T, Θ◦, →⟩ such that Vmax(∆◦) = V(Θ◦).

Proof: By Lemma 4.22, for every discount factor δ ∈ (0, 1) there exists a static resolution which achieves the maximum probability of success. Since there are finitely many states in S, there are finitely many static resolutions, so there must exist a static resolution that achieves the maximum probability of success for infinitely many discount factors. In other words, for every nondecreasing sequence {δn}_{n≥1} converging to 1, there exists a subsequence {δnk}_{k≥1} and a static resolution ⟨T, Θ◦, −→⟩ with resolving function f0 such that Vδnk(t) = Vmaxδnk(f0(t)) for all t ∈ T and k = 1, 2, . . . . By Lemma 4.20 we have that, for every t ∈ T,

V(t) = ⊔_{k∈N} Vδnk(t) = ⊔_{k∈N} Vmaxδnk(f0(t)) = Vmax(f0(t)).

It follows that V(Θ◦) = Vmax(∆◦). □

Along the same lines we can obtain a similar theorem for Vmin; however, the use of discount factors is not necessary in this case.

Lemma 4.24 If ⟨T, Θ◦, →⟩ is a resolution of ⟨S, ∆◦, →⟩, then it holds that Vmin(∆◦) ≤ V(Θ◦).

Proof: Analogous to the proof of Lemma 4.21. □

Theorem 4.25 Given a finite-state probabilistic automaton ⟨S, ∆◦, −→⟩, there exists a static resolution ⟨T, Θ◦, →⟩ such that Vmin(∆◦) = V(Θ◦).

Proof: Similar to the proof of Lemma 4.22. Let ⟨T, Θ◦, →⟩ be a resolution with an injective resolving function f such that if t −τ→ Θ then Vmin(f(Θ)) = min{ Vmin(∆) | f(t) −τ→ ∆ }. Let g : T → [0, 1] be the function defined by g(t) = Vmin(f(t)). As in the proof of Lemma 4.22, we show that g is a fixed point of R. Since V is the least fixed point of R, it holds that V(t) ≤ g(t) = Vmin(f(t)) for all t ∈ T, from which we can obtain V(Θ◦) ≤ Vmin(∆◦). Using Lemma 4.24, we derive that Vmin(∆◦) = V(Θ◦). □

From Lemmas 4.21 and 4.24, as well as Theorems 4.23 and 4.25, we obtain the following corollary.

Corollary 4.26 Let M = ⟨S, ∆◦, →⟩ be a finitary probabilistic automaton. Then

Vmax(∆◦) = max{ V(Θ◦) | Θ◦ is the initial distribution of a static resolution of M }
Vmin(∆◦) = min{ V(Θ◦) | Θ◦ is the initial distribution of a static resolution of M }.

The following theorem states that for finitary processes extremal testing yields the same preorders as resolutionbased testing. Theorem 4.27 For finite-state processes: • P ⊑epmay Q if and only if P ⊑pmay Q • P ⊑epmust Q if and only if P ⊑pmust Q Proof: An immediate consequence of Corollary 4.26.

2

Because of Theorems 4.11 and 4.27, we can choose the variation of the testing preorders which are most convenient to the task at hand. So to prove the soundness of the simulation preorders we will use extremal testing, while to prove that modal formulae can be characterised by tests we use the resolution-based testing.

4.4 Testing via weak-p derivatives We now show how resolutions, used in Section 4.2 to determine the outcome of tests, can be realised as the derivatives of a restrictive class of weak-p moves. With this alternative point of view, we can effectively test processes by comparing their possible derivatives directly: in effect the inductive V is replaced by the induction implicit in the definition of derivative, and the evaluation of ω-move possibilities is done directly on the subdistributions the derivations produce. Definition 4.28 In a pLTS a state s is called stable if s −τ→, 6 and a subdistribution Θ is called stable if every state in its support is stable. We write ∆ =⇒≻ ∆′ whenever ∆ =⇒ ∆′ and ∆′ is stable, and call ∆′ an extreme derivative of ∆. Referring to Definition 3.12, we see this means that in the derivation of ∆′ from ∆ at every stage a state must move on if it can, so that every stopping component can contain only states which must stop: we have s ∈ ∆× k if and now also only if s −τ→. 6 Lemma 4.29 [Existence of extreme derivatives] (i) For every subdistribution ∆ there exists some (stable) ∆′ such that ∆ =⇒≻ ∆′ . (ii) In a deterministic pLTS we have that ∆ =⇒≻ ∆′ and ∆ =⇒≻ ∆′′ implies ∆′ = ∆′′ . Proof: The construction of derivatives in Definition 3.12 is simply specialised by choosing at every stage ∆× k uniquely × := ∆ to be ∆k restricted exactly to those states that must stop, i.e. those s for which s −τ→. 6 Then ∆→ k − ∆k and k τ ∆k+1 is chosen arbitrarily so that ∆→ k −→ ∆k+1 . That establishes (i). For (ii) we observe that in a deterministic pLTS the above choice of ∆k+1 is unique, so that the whole derivative construction becomes unique. 2 It is worth pointing out that the use of subdistributions, rather than distributions, is essential here. For example if ∆ τ τ τ diverges, that is if there is an infinite sequence of derivations ∆ −→ ∆1 −→ . . . ∆k −→ . . ., then the only extreme derivative of ∆ is the empty subdistribution ε. Lemma 4.30 Let ∆ be a subdistribution in a fully probabilistic pLTS. If ∆ =⇒≻ ∆′ then VΩ (∆) = VΩ (∆′ ). F Proof: Recall that VΩ is defined over deterministic pLTSs (only), and that V = n≥0 Vn because its defining functional is continuous, where we are now (in this proof) dropping the superscript · Ω to reduce clutter: the Vn ’s are τ the nth approximants to V. By inspection we have that s −→ ∆ implies Vn+1 (s) = Vn (∆), whence by lifting and linearity we get τ If ∆ −→ ∆′ then Vn+1 (∆) = Vn (∆′ ) for all n ≥ 0. (22) 24

Now suppose ∆ =⇒≻ ∆′ . Referring to Definition 3.12 and carrying out a straightforward induction based on (22), we have n n X X Vn+1 (∆) = V0 (∆n+1 ) + Vn−k+1 (∆× ) = Vn−k+1 (∆× (23) k k) k=0

k=0

for all n ≥ 0, with the second step depending on V0 ’s being identically ~0. Now because ∆′ is an extreme derivative we know that all the ∆× whence immediately for each k ≤ n k ’s are Pstable, n × × we have Vn−k+1 (∆× k=0 V(∆k ). We conclude by reasoning k ) = V(∆k ), simplifying the last step above to just F F n+1 n (∆) V(∆) = n≥0 V n≥0 V (∆) =

= = = = =

F Pn V(∆× k) Pn Fn≥0 k=0 V( k=0 ∆× n≥0 k) F Pn V( n≥0 k=0 ∆× k) P∞ V( k=0 ∆× ) k V(∆′ ) .

(23) and immediately above finite linearity of V least fixed point of continuous functional is itself continuous

2 Now according to (4) and Definition 4.5, the outcome of applying a test to a process is determined by the values V(Θ) for distributions Θ over particular deterministic pLTSs, namely resolutions. Then determinacy and Lemma 4.30 immediately above ensures that V(Θ) coincides with V(Θ≻ ), where Θ≻ is the unique extreme derivative of Θ. Moreover because of its stability the calculation of V(Θ≻ ) is simply the weighted sum of all the successful states in its support. The final link between the resolution- and derivation views of testing is that, in an ω-respecting computation structure, resolutions and extreme derivatives are essentially the same thing. Theorem 4.31 (Resolutions and extreme derivatives) Consider an ω-respecting computation structure hS, Ωτ , →i. Then ∆ =⇒≻ ∆′ if and only if there is a resolution hT, Ωτ , →T i with resolving function f and subdistributions Θ, Θ′ : D(T ) such that (i) f (Θ), f (Θ′ ) = ∆, ∆′ (ii) Θ =⇒≻T Θ′ . Proof: For only if, suppose hT, Ωτ , →T i is a resolution of hS, Ωτ , →i, let Θ be a subdistribution over T , and suppose Θ =⇒≻T Θ′ . By linearity we can apply f to that whole derivation structure, preserving its validity and establishing f (Θ) =⇒T f (Θ′ ). Now we observe that each state in the support of f (Θ′ ) is stable: consider any state t ∈ ⌈Θ× k ⌉ for any k ≥ 0. Since t is stable by assumption, either t is a deadlock state or t −ω→ Γ for some success action ω ∈ Ω and subdistribution Γ. By Definition 4.4, it must be the case that f (t) is a deadlock state in the first case, and that f (t) −ω→ f (Γ) in the second. Additionally, since hS, Act, →i is a “non-scooting” computation structure and f (t) ∈ S, we have f (t) −τ→ 6 in the second case. Therefore, f (t) is stable in both cases. Thus, we have in fact f (Θ) =⇒T f (Θ′ ). For if, consider an extreme derivation ∆ =⇒≻ ∆′ , as given in Definition 3.12 where all ∆× k are assumed to be × stable; for convenience we let ∆k denote ∆→ + ∆ . To define the corresponding resolution hT, Ωτ , −→T i we refer k k to Definition 4.4. First let the set of states T be S × N and the resolving function f : T → S be given by f (s, k) = s. To complete the description we must define the two partial functions −α →, for α = ω and α = τ . These are always defined so that if (s, k) −α → Γ then the only states in the support of Γ are of the form (s, k + 1). In the definition we use Θ↓k , for any subdistribution Θ over S, to be the subdistribution over T given by ( Θ(s) if t = (s, k) Θ↓k (t) = 0 otherwise Note that by definition 25

(a) f (Θ↓k ) = Θ →↓k (b) ∆↓k + ∆×↓k k = ∆k k

The definition of −ω→T is straightforward: its domain consists of states (s, k) where s ∈ ⌈∆× k ⌉ and is defined by ω letting (s, k) −ω→ ∆↓k+1 for some arbitrarily chosen s − → ∆ . s s τ → τ The definition of −→ T is more complicated, and is determined by the moves ∆k −→ ∆k+1 . For a given k this move means that X X τ ∆k+1 = pi ·Γi , si −→ Γi ∆→ pi ·si , k = i∈I

i∈I

So for each k we let τ (s, k) −→ T

X

pi ·Γ↓k+1 i

si =s

This definition ensures ↓k τ 1. (∆→ −→T (∆k+1 )↓k+1 k )

2. (∆× )↓k is stable. This completes our definition of the resolution; it remains to find distributions Θ, Θ′ over T such that f (Θ) = ∆, f (Θ′ ) = ∆′ and Θ =⇒≻ Θ′ . Because of (a) (c) and (d) we have the following extreme derivation, which by Part (ii) of Lemma 4.29 is the unique one from ∆↓0 0 : ↓0 ↓0 ∆↓0 = (∆→ + (∆× 0 ) 0 ) τ × ↓1 → ↓1 → ↓0 −→T (∆1 ) + (∆1 ) (∆0 ) .. .. . . ↓k (∆→ k )

τ −→ T

↓k+1 ↓k+1 (∆→ + (∆× k+1 ) k+1 ) .. .

Θ′ =

P∞

× ↓k k=0 (∆k )

Letting Θ be ∆↓0 , we see that Note (a) above ensures f (Θ) = ∆; the same note and the linearity of f applied to distributions also gives f (Θ′ ) = ∆′ . 2 Corollary 4.32 In an ω-respecting computation structure, the following statements hold. 1. If ∆ =⇒≻ ∆′ then there is a resolution Θ of ∆ such that VΩ (Θ) = VΩ (∆′ ). 2. For any resolution Θ of ∆, there exists an extreme derivative ∆′ such that ∆ =⇒≻ ∆′ and VΩ (Θ) = VΩ (∆′ ). Proof: Suppose ∆ =⇒≻ ∆′ . By Theorem 4.31, there is a resolution of ∆ with resolving function f and a subdistribution Θ such that Θ =⇒≻ Θ′ and f (Θ′ ) = ∆′ and f (Θ) = ∆; this last statement means that by definition Θ is a resolution of ∆. By Lemma 4.30, we have VΩ (Θ) = VΩ (Θ′ ). Since Θ′ and ∆′ are extreme derivatives, all the states in their supports are stable. Therefore, for any t ∈ ⌈Θ′ ⌉ we have that (i) t −ω→ with ω ∈ Ω iff f (t) −ω→, (ii) t −→ 6 iff f (t) −→. 6 It follows that VΩ (Θ′ ) = VΩ (∆′ ). As a result, we obtain that VΩ (Θ) = VΩ (∆′ ). To prove 2, suppose Θ is a resolution of ∆; that is there is a resolution as in Definition 4.4 with a resolving function f such that f (Θ) = Θ′ . We know from Lemma 4.29 that there exists a (unique) subdistribution Θ′ such that Θ =⇒≻ Θ′ . By Theorem 4.31 we have that ∆ = f (Θ) =⇒≻ f (Θ′ ). The same arguments as in the other direction show that VΩ (Θ) = VΩ (f (Θ′ )). 2

26

Definition 4.33 Let ∆ be a subdistribution in a computation structure. We write V Ω (∆) for the set of testing outcomes {VΩ (Γ) | Γ is a resolution of ∆} obtained from ∆. Lemma 4.34 Let ∆ be a subdistribution in an ω-respecting computation structure. Then V Ω (∆) = {VΩ (∆′ ) | ∆ =⇒≻ ∆′ }. Proof: This is immediate from Corollary 4.32.

2

Finally we have the basis of a derivation-style definition of testing. Lemma 4.35 Let P be a process and T a compatible test. Then AΩ (T, P ) = {VΩ (∆′ ) | [ T |Act P ] =⇒≻ ∆′ }.

[



Proof: This is immediate from (4) and Lemma 4.34.

2

5 Further properties of weak derivation In this section we expose some less obvious properties of derivations, relating to their behaviour at infinity. One important property is that the set of derivations from a single starting point is closed in the sense (from analysis) of containing all its limit points where, in turn, limit depends on a Euclidean-style metric defining the distance between two distribuitions in a straightforward way. The other property is “distillation of divergence,” allowing us to find in any derivation that partially diverges (by no matter how small an amount) a point at which the divergence is “distilled” into a state which wholly diverges. Both properties depend on our working within finitary pLTSs — that is, ones in which the state space is finite and the (unlifted) transition relation is finite-branching. (The examples of Appendix A show this is necessary, for our approach at least.) Again, some proofs and technical lemmas are deferred to Appendix B.

5.1 Finite generability and closure a Now let us restrict our attention to finitary pLTSs. Here by definition the sets s· −→ are finite, for every s and a. This a a of course is no longer true for the lifted relations −→; nevertheless the sets s· −→ can be finitely represented.

Definition 5.1 A resolution R is said to be static if its resolving function fR is injective. This means that the transition used at a state is always the same one, no matter how often or from where the state is reached, and that it is pure, that is not interpolated. Static resolutions are particularly easy to describe because they are given effectively by subsets of the transition relation of the original pLTS. For us a convenient formulation of this is in terms of the following definition. Definition 5.2 A static derivative policy for a computation structure hS, Act, →i, or an SDP, is a partial function pp : S ⇀ D1 (S) satisfying: τ • If pp is defined at s then s −→ pp(s), and

• If s −τ→ 6 then pp is undefined at s. Intuitively a policy pp decides for each state, once and for all, which of the available τ -choices to take if any: since it chooses a specific transition, or inaction, it does not interpolate; and since it is a function of the state, it makes the same choice on every visit. There is a close relationship between SDP’s and static resolutions. One difference is that (here) we are concerned only with policies for τ transitions (although clearly the idea generalises). The other difference is that an SDP can select a “do not take any τ -transition” option, even if some are available, which it does by setting pp to be undefined at s even when s enables some transition. On the other hand, a static resolution must take a transition if it can. In this respect SDP’s are close to derivations, which also have the “stopping-here” option. The great importance for us of SDP’s is that they give a particularly simple characterisation of derivatives, provided the state-space is finite and the pLTS is finite-branching. This is essentially a result of Markov Decision Processes [19], which we now translate into our context. 27

Lemma 5.3 (=⇒ realised by interpolation of finitely many static policies) Suppose s =⇒ ∆′ for some state s and subdistribution ∆′ . Then there is a finite index set I, probabilities pi summing to 1 and static strategies ppi such that P ′ ∆ = i∈I pi ·∆′i where uniquely s =⇒ppi ∆′i for each i. Proof: See Lemma B.20 and its preceding material.

2

Lemma 5.4 (Closure of =⇒) For any state s the set (s =⇒) of derivatives of s is closed. Proof: From Lemma 5.3 that set is a subset of the convex closure of the set of subdistributions reachable via =⇒ using one of the finitely many static policies of the pLTS, and that latter set is closed since it is the convex closure of finitely many points. But each of those subdistributions is trivially a derivative itself; and the set of derivatives is convex by Corollary 3.20. Hence the two sets are equal, and thus the former is closed as well. 2 a

Lemma 5.5 [Closure of =⇒]

a

For any state s the set {∆′ | s =⇒ ∆′ } is closed.

a

a Proof: The relation =⇒ is a composition of three stages: there must be ∆′1 , ∆′2 with s =⇒ ∆′1 −→ ∆′2 =⇒ ∆′ . For ′ the first stage, from Lemma 5.3 we know any such ∆1 is the interpolation of fixed finite set of (other) subdistributions; call them “principal” for that starting point (s) and type of transition (=⇒). b 1 from the first stage, there are only a finite For each state s1 in the support of some principal subdistribution ∆ a b 1 ⌉ itself is finite, there are thus only because of finite branching. Because ⌈∆ number of distributions in (s1 −→), a b 1 ’s, b finitely many subdistributions necessary to generate all of (∆1 −→); and because there are only finitely many ∆ b we still need only a finite number of principal subdistributions ∆2 to generate the results of the first and second stages. For the third stage we make effectively the same argument as for the second, except that instead of appealing to b 2 =⇒) instead we use the finite generability that we appealed to in the first stage. finite branching of (∆ Then Lemma 5.4 applies (analogously) for closure. 2

Lemma 5.6 [Zero-one law, static case] If for some static derivative policy pp over a finite-state pLTS there is for some s a derivation s =⇒pp ∆′ with |∆′ | < 1 then in fact for some (possibly different) state sε we have sε =⇒pp ε. Proof: Suppose that for no state s do we have s =⇒pp ε. This means that for every state s there is some other state s′ (possibly depending on s), some number Ns ≥ 0 (no matter how large) and probability ps > 0 (no matter how small) τ such that s can follow policy pp to reach s′ in Ns transitions −→ with aggregated probability ps and then can go no ′ further because pp is undefined at s — for if this were not true, policy pp would generate an unbounded τ -tree of transitions rooted at s with aggregate probability 1. Set N ≥ 0 to be the maximum of Ns over all states, and p > 0 to be the minimum of ps over all states. (That p > 0 is one place we use finiteness of the state space; the other is in the existence of both the Ns ’s in the first place, and of their maximum.) Now pick an arbitrary s and consider the (unique) derivation s =⇒pp ∆′ . If it is finite, then |∆′ |=1 trivially. If τ not, group the infinite sequence of pp-generated −→ pp moves into blocks of N . In any such block the probability of reaching a pp-undefined state is at least p > 0, and so the probability of eventually reaching a pp-undefined state (by taking many N -blocks of moves in succession) is in fact 1. Thus s =⇒pp ∆′ implies |∆′ | = 1. Our lemma then is immediate from the contrapositive. 2

5.2 Distillation of divergence Although it is possible to have processes that diverge with some probability strictly between zero and one, in a finitary system it is possible to “distill” that divergence in the sense that in many cases we can limit our analyses to processes that either wholly diverge (can do so with probability one) or wholly converge (can diverge only with probability zero). This property is based on the zero-one law for finite-state probabilistic systems, and in this section we present the aspects of it that we need here.

28

Lemma 5.7 [Distillation of divergence, static case] If for some state s and static derivative policy pp over a finite-state pLTS there is a derivation s =⇒pp ∆′ then there is a probability p and full distributions ∆′1 , ∆′ε such that s =⇒ (∆′1 p ⊕ ∆′ε ) and ∆′ = p·∆′1 and ∆′ε =⇒ ε. Proof: See Lemma B.21.

2

Lemma 5.8 [Distillation of divergence, general case] For any s, ∆′ in a finitary pLTS with s =⇒ ∆′ there is a probability p and full distributions ∆′1 , ∆′ε such that s =⇒ (∆′1 p ⊕ ∆′ε ) and ∆′ = p·∆′1 and ∆′ε =⇒ ε. Proof: From Lemma 5.3 we know that the derivation s =⇒ ∆′ is an interpolation of a finite number of static derivations. Lemma 5.7 then applies to each separately, and the result follows by interpolating the derivations from each static case. 2 Corollary 5.9 If in a finitary pLTS we have ∆, ∆′ with ∆ =⇒ ∆′ and |∆|>|∆′ | then there is some state s′ reachable with non-zero probability from ∆ such that s′ =⇒ ε. That is, the pLTS based on ∆ must have a wholly diverging state somewhere. Proof: Assume at first that |∆|=1; then the result is immediate from Lemma 5.8 since any s′ ∈⌈∆′ε ⌉ will do. The general result is obtained by dividing the given derivation by |∆|. 2

6 The failure simulation preorder This section is divided in four: the first subsection presents the definition of the failure simulation preorder in an arbitrary pLTS, together with some explanatory examples. It gives two equivalent characterisations of this preorder: a coinductive one as a largest relation between subdistributions satisfying certain transfer properties, and one that is obtained through lifting and an additional closure property from a relation between states and subdistributions that we call failure similarity. It also investigates some elementary properties of the failure simulation preorder and of failure similarity. In the second subsection we restrict attention to finitary processes, and on this realm characterise the failure simulation preorder in terms of simple failure similarity. All further results on the failure simulation preorder, in particular precongruence for the operators of pCSP and soundness and completeness w.r.t. the must testing preorder, are in terms in this characterisation, and hence pertain to finitary processes only. The third subsection establishes monotonicity of the operators of pCSP with respect to the failure simulation preorder — in other words: shows that the failure simulation preorder is a precongruence with respect to these operators — and the last subsection is devoted to showing soundness with respect to must testing. Completeness is the subject of Section 7.

6.1 Two equivalent definitions and their rationale α

We start with defining the weak action relations =⇒ for α ∈ Actτ and the refusal relations −A→ 6 for A ⊆ Act that are the key ingredients in the definition of the failure simulation preorder [6, 2]. Definition 6.1 Let ∆ and its variants be subdistributions in a pLTS hS, Actτ , →i. a

a • For a ∈ Act write ∆ =⇒ ∆′ whenever ∆ =⇒ ∆pre −→ ∆post =⇒ ∆′ . Extend this to Actτ by allowing as a τ τ special case that =⇒ is simply =⇒, i.e. including identity (rather than requiring at least one −→).

• For A ⊆ Act and s ∈ S write s −A→ 6 if s −α→ 6 for every α ∈ A ∪ {τ }; write ∆ −A→ 6 if s −A→ 6 for every s ∈⌈∆⌉. • More generally write ∆ =⇒−A→ 6 if ∆ =⇒ ∆pre for some ∆pre such that ∆pre −A→. 6

[ ℄

a

[ ℄

[ ℄

a

For example, referring to Example 3.15 we have Q1 =⇒ 0 , while in Example 3.16 we have s2 =⇒ well as s2 =⇒−B→ 6 for any set B not containing a, because s2 =⇒ 21 a .

[ ℄

a

[℄

Proposition 6.2 The relation =⇒, for a ∈ Act, can be obtained as a lifting. 29

1 2

[ 0 ℄ as

a

a a =⇒. By Proposition 3.19 this equals =⇒S −→ Proof: Relation =⇒ is in fact =⇒ −→ =⇒S which, by three applicaa a a =⇒S hence =⇒S −→ =⇒S and finally =⇒S −→ =⇒. tions of Lemma 3.11, equals =⇒S −→

2

a

Corollary 6.3 The relation =⇒ is convex. Proof: This is immediate from its being a lifting.

2

Definition 6.4 (Failure Simulation Preorder) Define ⊒FS to be the largest relation in D(S) × D(S) such that if ∆ ⊒FS Θ then P P α 1. whenever ∆ =⇒ ( i pi ∆′i ), for α ∈ Actτ and certain pi with ( i pi ) ≤ 1, then there are Θ′i ∈ D(S) with P α Θ =⇒ ( i pi Θ′i ) and ∆′i ⊒FS Θ′i for each i, and 2. whenever ∆ =⇒−A→ 6 then also Θ =⇒−A→. 6

Naturally Θ ⊑FS ∆ just means ∆ ⊒FS Θ. We have chosen the orientation of the preorder symbol to match that of must testing, which goes back to the work of De Nicola & Hennessy [5]. This orientation also matches the one used in CSP [9] and related work, were we have S PECIFICATION ⊑ I MPLEMENTATION. At the same time, we like to stick to the convention popular in the CCS community of writing the simulated process to the left of the preorder symbol and the simulating process (that mimics moves of the simulated one) on the right. This helps when comparing with may testing and the simulation preorder in Section 8. We achieve this by writing I MPLEMENTATION ⊒FS S PECIFICATION. In the first case of the above definition the summation is allowed to be empty, which has the following useful consequence. Lemma 6.5 If ∆ diverges and ∆ ⊒FS Θ, then also Θ diverges. Proof: Divergence of ∆ means that ∆ =⇒ ε, whence with ∆ ⊒FS Θ we can take the empty summation in Definition 6.4 to conclude that also Θ =⇒ ε. 2 Although the regularity of Definition 6.4 is appealing — for example it is trivial to see that ⊑FS is reflexive and transitive, as it should be — in practice, for specific processes, it is easier to work with a characterisation of the failure simulation preorder in terms of a relation between between states and distributions. Definition 6.6 (Failure Similarity) Define FS to be the largest relation in S × D(S) such that if s FS Θ then α α 1. whenever s =⇒ ∆′ , for α ∈ Actτ , then there is a Θ′ ∈ D(S) with Θ =⇒ Θ′ and ∆′ FS Θ′ , and 2. whenever s =⇒−A→ 6 then Θ =⇒−A→. 6 Any relation R ⊆ S × D(S) that satisfies the two clauses above is called a failure simulation. Obviously, for any failure simulation R we have R ⊆ FS . The following two lemmas show that the lifted failure similarity relation FS ⊆ D(S) × D(S) has simulating properties analogous to 1 and 2 above. α

α

Lemma 6.7 Suppose ∆ FS Θ and ∆ =⇒ ∆′ for α ∈ Actτ . Then Θ =⇒ Θ′ for some Θ′ such that ∆′ FS Θ′ . X X Proof: ∆ FS Θ implies by Lemma 3.4 that ∆= si FS Θi , Θ= p i · si , pi · Θ i . i∈I i∈I P α α By Propositions 6.2 and 3.9 we know from ∆ =⇒ ∆′ that si =⇒ ∆′i for ∆′i ∈ D(S) such that ∆′ = i∈I pi · ∆′i . α α ′ ′ ′ ′ ′ For each i and ∆i FS Θ . Let Pi ∈ I we′ infer from si FS Θi and si =⇒ ∆i that there is a Θi ′∈ D(S)′ with Θi α=⇒ Θ ′ ′ Θ := i∈I pi ·Θi . Then Definition 3.2(2) and Theorem 3.18(i) yield ∆ FS Θ and Θ =⇒ Θ . 2 6 Then Θ =⇒−A→. 6 Lemma 6.8 Suppose ∆ FS Θ and ∆ =⇒−A→.

′ A ′ ′ ′ ′ −→. 6 By Lemma 6.7 there exists some Proof: Suppose ∆ FS Θ and ∆ =⇒ ∆X XΘ such that Θ =⇒ Θ and ∆ FS Θ . From Lemma 3.4 we know that ∆′ = p i · si , si FS Θi , Θ′ = pi · Θi , with si ∈ ⌈∆′ ⌉ for all i ∈ I. i∈I

i∈I

Since ∆′ −A→, 6 P we have that siP −A→ 6 for all i ∈ I. It follows from si FS Θi that Θi =⇒ Θ′i −A→. 6 By Theorem 3.18(i) we 6 By the transitivity of =⇒ we have that Θ =⇒−A→. 6 2 obtain that i∈I pi · Θi =⇒ i∈I pi · Θ′i −A→. 30

The next result shows how the failure simulation preorder can alternatively be defined in terms of failure similarity. Proposition 6.9 For ∆, Θ ∈ D(S) we have ∆ ⊒FS Θ just when there is a Θmatch with Θ =⇒ Θmatch and ∆ FS Θmatch . ′ ′ Proof: Let ′FS ⊆ S × D(S) be the relation given P by s FS Θ iff s ⊒FS Θ. Then FS isPa failure simulation; hence ′ FS ⊆ FS . Now suppose ∆ ⊒FS Θ. Let ∆ := i pi ·si . ThenPthere are Θi with Θ =⇒ i pi ·Θi and s ⊒FS Θi for each i, whence si ′FS Θi , and thus si FS Θi . Take Θmatch := i pi ·Θi . Definition 3.2 yields ∆ FS Θmatch . For the other direction it suffices to show that FS ; =⇒−1 satisfies the two clauses of Definition 6.4, yielding FS ; =⇒−1 ⊆ ⊒FS . So suppose, for given ∆, Θ ∈ D(S), there is a Θmatch with Θ =⇒ Θmatch and ∆ FS Θmatch . α P α ′ there is some Θ′ such that Θmatch =⇒ Θ′ and P Suppose ′∆ =⇒ ′ i∈I pi ·∆i for some α ∈ Actτ . By Lemma′ 6.7 P ′ ′ 3.9 we know that Θ = ( i∈I pi ·∆i ) FS Θ . From Proposition i∈I pi ·Θi for subdistributions Θi such that α P ∆′i FS Θ′i for i ∈ I. Thus Θ =⇒ i pi ·Θ′i by the transitivity of =⇒ (Theorem 3.21) and ∆′i (FS ; =⇒−1 )Θ′i for each i ∈ I by the reflexivity of =⇒. Suppose ∆ =⇒−A→. 6 By Lemma 6.8 we have Θmatch =⇒−A→. 6 It follows that Θ =⇒−A→ 6 by the transitivity of =⇒. 2

Note the appearance of the “anterior step” Θ =⇒ Θmatch in Proposition 6.9 immediately above; the following example shows it necessary in the sense that defining ⊒FS simply to be FS (i.e. without anterior step) would not have been suitable. Example 6.10 Compare the two processes P := a 12 ⊕ b and Q := τ.P . They are testing equivalent, and so for FS to be complete we would have to have P FS Q . But we do not, for by Proposition 3.9 that would require a a FS Q , which must fail since the former’s move −→ 0 cannot be matched by the latter. We do however have P ⊒FS Q because of the anterior step Q =⇒ P and of course P FS P . 2

[℄

[ ℄

[ ℄

[ ℄

[ ℄

[ ℄

[ ℄

Remark 6.11 For s ∈ S and Θ ∈ D(S) we have s FS Θ iff s ⊒FS Θ; here no anterior step is needed. One direction of this statement has been obtained in the beginning of the proof of Proposition 6.9; for the other note that s FS Θ implies s FS Θ by Definition 3.2(1) which implies s ⊒FS Θ by Proposition 6.9 and the reflexivity of =⇒. Example 6.10 shows that ⊒FS cannot be obtained as the lifting of any relation: it lacks the decomposition property of Proposition 3.9. Nevertheless, ⊒FS enjoys the property of linearity, as occurs in Definition 3.2: P P P Lemma 6.12 If ∆i ⊒FS Θi for i ∈ I then i∈I pi ·∆i ⊒FS i∈I pi ·Θi for any pi ∈[0, 1] (i ∈ I) with i∈I pi ≤ 1.

Proof: This follows immediately from the linearity of FS and =⇒ (cf. Theorem 3.18(i)), using Proposition 6.9.

[



2

Example 6.13 (Divergence) From Example 3.14 we know that rec x. x =⇒ ε. This, together with (1) in Section 3.1, and the fact that ε −A→ 6 for any set of actions A, ensures that s FS rec x. x for any s, hence Θ FS rec x. x for any τ τ τ · · · −→ ··· ∆1 −→ Θ, and thus that Θ ⊒FS rec x. x . Indeed similar reasoning applies to any ∆ with ∆ = ∆0 −→ because — as explained right before Example 3.14 — this also ensures that ∆ =⇒ ε. In particular, we have ε =⇒ ε and hence rec x. x ≃FS ε. Yet rec x. x 6⊒FS 0, because the move rec x. x =⇒ ε cannot be matched by a corresponding move from 0 — see Lemma 6.5. 2

[

[

[







[

[



[





[ ℄

[



Example 6.13 shows again that the anterior move in Proposition 6.9 is necessary: although ε ⊒FS rec x. x we do not have ε FS rec x. x , since by Lemma 3.5 any Θ with ε FS Θ must have |Θ| = 0.

[



Example 6.14 Referring to the process Q1 of Example 3.15, with Proposition 6.9 we easily see that a ⊒FS Q1 because we have a FS Q1 . Note that the move Q1 =⇒ a is crucial, since it enables us to match the move a a a −→ 0 with Q1 =⇒ a −→ 0 . It also enables us to match refusals: if a −B→ 6 then B can not contain the B action a, and therefore also Q1 =⇒−→. 6 The converse, that a ⊑FS Q1 , is also true because it is straightforward to verify that the relation

[℄

[ ℄

[ ℄

[ ℄ [℄ [ ℄

[ ℄

[ ℄

[℄

[℄

[℄

[℄

[℄

[ ℄

{(Q1 , a ), (τ.Q1 , a ), (a, a ), (0, 0 )} is a failure simulation and thus is a subset of FS . We therefore have Q1 ≃FS a. 31

2

Example 6.15 Let P be the process a 12 ⊕ rec x. x and consider the state s2 introduced in Example 3.16. First note that P FS 21 · a , since rec x. x FS ε. Then because s2 =⇒ 12 · a we have P ⊒FS s2 . The converse, that s2 ⊒FS P holds, is true because s2 FS P follows from the fact that the relation

[ ℄ [ ℄

[℄

[℄ [ ℄ [ ℄ {(sk , [a℄ ⊕ [rec x. x℄) | k ≥ 2} ∪ {(a, [a℄), (0, [ 0 ℄)} is a failure simulation that contains the pair (s2 , [P ℄). 1/k

2

Our final examples pursue the consequences of the fact that the empty distribution ε is behaviourally indistinguishable from divergent processes like rec x. x .

[



Example 6.16 (Subdistributions formally unnecessary) For any subdistribution ∆, let ∆e denote the (full) distribution defined by ∆e := ∆ + (1− |∆|)· rec x. x .

[



[



Intuitively it is obtained from ∆ by padding the missing support with the divergent state rec x. x . Then ∆ ≃FS ∆e . This follows because ∆e =⇒ ∆, which is sufficient to establish ∆ ⊒FS ∆e ; but also ∆e FS ∆ because rec x. x FS ε, and that implies the converse ∆e ⊒FS ∆. The equivalence shows that formally we have no need for subdistributions, and that our technical development could be carried out using (full) distributions only. 2

[



But abandoning subdistributions comes at a cost: the definition of ∆ =⇒ Θ on p.9 would be much more complex if expressed with full distributions, as would syntactic manipulations such as those used in the proof of Theorem 3.21. More significant, however, is that diverging processes have a special character in failure simulation semantics. Placing them at the bottom of the ⊑FS preorder — as we do – requires that they failure-simulate every processes, thus allowing all visible actions and all refusals and so behaving in a sense “chaotically”; yet applying the operational semantics of Figure 3 to rec x. x literally would suggest exactly the opposite, since rec x. x allows no visible actions (all its derivatives enable only τ ) and no refusals (all its derivatives have τ enabled). The case analyses that discrepancy would require are entirely escaped by allowing subdistributions, as the chaotic behaviour of the diverging ε follows naturally from the definitions, as we saw in Example 6.13. We conclude with an example involving divergence and subdistributions.

[ ℄

[ ℄

Example 6.17 For 0 ≤ c ≤ 1 let Pc be the process 0 c ⊕ rec x. x. We show that Pc ⊑FS Pc′ just when c ≤ c′ . (Refusals can be ignored, since Pc refuses every set of actions, for all c.) Suppose first that c ≤ c′ , and split the two processes as follows:

[Pc ℄ = [Pc ℄ =

[ ℄ [ ℄ . Because 0  [rec x. x℄ (the middle terms), we have immediately [Pc ℄  [Pc ℄, whence [Pc ℄ ⊑FS [Pc ℄. For the other direction, note that [Pc ℄ =⇒ c′ · [ 0 ℄. If [Pc ℄ ⊑FS [Pc ℄ then from Definition 6.4 we would have to have [Pc ℄ =⇒ c′ ·Θ′ for some subdistribution Θ′ , a derivative of weight no more than c′ . But the smallest weight Pc ′ ′

[ ℄ [ ℄

[ ℄ [ ℄

c· 0 + (c′ −c)· rec x. x + (1−c′ )· rec x. x c· 0 + (c′ −c)· 0 + (1−c′ )· rec x. x ′

FS



FS





can reach via =⇒ is just c, so that we must have in fact c ≤ c .

2

We end this subsection with two properties of failure similarity that will be useful later on. Proposition 6.18 The relation FS is convex. P P Proof: Suppose s FS Θi and pi ∈ [0, 1] for i ∈ I, with i∈I pi = 1. We need to show that s FS i∈I pi ·Θi . α α ′ ′ I such that Θi =⇒ Θ′i and If s =⇒ ∆′ , then therePexist Θ′i for i ∈P P ∆ FS′ Θi . By Proposition 6.2 and Definiα ′ ′ tion 3.2(2), we obtain that i∈I pi ·Θi =⇒ i∈I pi ·Θi and ∆ FS i∈I pi ·Θi . P ′ A If s =⇒−A→ 6 6 for some 6 for all i ∈ I. By definition we have i∈I pi ·Θ′i −A→. P A ⊆ Act, then P Θi =⇒ ′ Θi −→ Theorem 3.18(i) yields i∈I pi ·ΘP p ·Θ . i =⇒ i i i∈I 2 So we have checked that s FS i∈I pi ·Θ′i . It follows that FS is convex. Proposition 6.19 The relation FS ⊆ D(S) × D(S) is reflexive and transitive. 32

Figure 6: Illustration of Theorem 6.21 Proof: Reflexivity is easy; it relies on the fact that s FS s for every state s. α For transitivity, we first show that FS ; FS is a failure simulation. Suppose s FS Θ FS Φ. If s =⇒ ∆′ then there α α is a Θ′ such that Θ =⇒ Θ′ and ∆′ FS Θ′ . By Lemma 6.7, there exists a Φ′ such that Φ =⇒ Φ′ and Θ′ FS Φ′ . Hence, ∆′ FS ; FS Φ′ . By Lemma 3.11 we know that FS ; FS = FS ; FS

(24)

Therefore, we obtain ∆′ FS ; FS Φ′ . If s =⇒−A→ 6 for some A ⊆ Act, then Θ =⇒−A→ 6 and hence Φ =⇒−A→ 6 by Lemma 6.8. So we established that FS ; FS ⊆ FS . It now follows from Remark 3.3 and (24) that FS ; FS ⊆ FS .

2

6.2 A simpler characterisation of failure similarity for finitary processes Here we present a simpler characterisation of failure similarity, valid when considering finitary processes only. It is in terms of this characterisation that we will establish soundness and completeness of the failure simulation preorder with respect to the must testing preorder; consequently we have these results for finitary processes only. Definition 6.20 (Simple Failure Similarity) Let sFS be the largest relation in S × D(S) such that if s sFS Θ then 1. whenever s =⇒ ε then also Θ =⇒ ε, otherwise α

2. whenever s −α → ∆′ , for α ∈ Actτ , then there is a Θ′ with Θ =⇒ Θ′ and ∆′ sFS Θ′ , and 3. whenever s −A→ 6 then Θ =⇒−A→. 6 Theorem 6.21 (Equivalence of failure- and simple failure similarity) For finitary distributions ∆, Θ ∈ D(S) in a pLTS hS, Actτ , →i we have ∆ sFS Θ iff ∆ FS Θ. → ∆′ and s −A→ 6 implies s =⇒−A→ 6 it is trivial that FS satisfies the conditions of Proof: Because s −α → ∆′ implies s −α s Definition 6.20, so that FS ⊆ FS . For the other direction we need to show that sFS satisfies Clause 1 of Definition 6.6 with α = τ , that is if s sFS Θ and s =⇒ ∆′ then there is some Θ′ ∈ D(S) with Θ =⇒ Θ′ ∆′ sFS Θ′ . Once we have this, the relation sFS clearly satisfies both clauses of Definition 6.6, so that we have sFS ⊆ FS . So suppose that s sFS Θ and that s =⇒ ∆′ where — for the moment — we assume |∆′ | = 1. Referring to × × → → τ Definition 3.12, must be ∆k , ∆→ k and ∆k for k ≥ 0 such that s = ∆0 , ∆k = ∆k + ∆k , ∆k −→ ∆k+1 P∞ there × × × ′ → s and ∆ = k=1 ∆k . Since ∆0 + ∆0 = s FS Θ, using Proposition 3.9 we can define Θ =: Θ0 + Θ→ 0 so that {×,→} {×,→} → τ → → → s s s ∆0 FS Θ0 . Since ∆0 −→ ∆1 and ∆0 FS Θ0 we have Θ0 =⇒ Θ1 with ∆1 FS Θ1 .

33

× Repeating the above procedure gives us inductively a series Θk , Θ→ 0, such that k , Θk of subdistributions, for k ≥ P τ × × × × → → → ′ s s s = ⇒ Θ . We define Θ := Θ0 = Θ, ∆k FS Θk , Θk = Θk + Θk , ∆k FS Θk , ∆k FS Θk and Θ→ k k i Θi . By ′ ′ ′ s Additivity (Remark 3.6) we have ∆ FS Θ . It remains to be shown that Θ =⇒ Θ . For that final step, because (Θ =⇒) is closed (Lemma 5.4) we can establish Θ =⇒ Θ′ by exhibiting a sequence ′ ′ Θi with Θ =⇒ Θ′i forPeach i and with the Θ′i ’s being arbitrarily close to Θ establishes for each i that P.∞Induction × ′ ′ → | = 0 and limi→∞ |∆→ Θ =⇒ Θi := (Θi + k≤i Θk ). Since |∆ | = 1, we must have limi→∞ | k=i ∆→ i i | = 0, P ∞ → → → whence by Lemma 3.5, using that ∆i sFS Θi , also limi→∞ | k=i Θi | = 0 and limi→∞ |Θ→ | = 0. Thus these i Θ′i ’s form the sequence we needed. That concludes the case for |∆′ | = 1. If on the other hand ∆′ = ε, i.e. we have |∆′ | = 0, then Θ =⇒ ε follows immediately from s sFS Θ, and ε sFS ε trivially. In the general case, if s =⇒ ∆′ then by Lemma 5.8 we have s =⇒ ∆′1 p ⊕ ∆′ε for some probability p and full distributions ∆′1 , ∆′ε , with ∆′ = p·∆′1 and ∆′ε =⇒ ε. From the mass-1 case above we have Θ =⇒ Θ′1 p ⊕ Θ′ε with ∆′{1,ε} sFS Θ′{1,ε} ; from the mass-0 case we have Θ′ε =⇒ ε and hence Θ′1 p ⊕ Θ′ε =⇒ p·Θ′1 by Theorem 3.18(i); thus transitivity yields Θ =⇒ p·Θ′1 , with ∆′ = p·∆′1 sFS p·Θ′1 as required, using Definition 3.2(2). 2

Add the counterexample from Appendix A.2.3 here.

6.3 Precongruence The purpose of this section is to show that the semantic relation ⊒FS is preserved by the constructs of pCSP. The proofs follow closely the corresponding proofs in Section 4 of [2], but here there is a significant extra proof obligation: in order to relate two processes we have to demonstrate that if the first diverges then so does the second. This is often non-trivial; for example in the development of the testing theory for non-probabilistic processes, this proof obligation caused considerable difficulty and was only achieved by an appeal to K¨onig’s Lemma (see Lemma 4.4.13 of [7]). Here, in order to avoid such complications, we introduce yet another version of failure simulation; it modifies Definition 6.20 by checking divergence co-inductively instead of using a predicate. Definition 6.22 Define cFS to be the largest relation in S × D(S) such that if s cFS Θ then τ τ 1. whenever s =⇒ ε, there are some ∆′ , Θ′ such that s =⇒−→= ⇒ ∆′ =⇒ ε, Θ =⇒−→= ⇒ Θ′ and ∆′ cFS Θ′ ; otherwise α

2. whenever s −α → ∆′ , for α ∈ Actτ , then there is a Θ′ with Θ =⇒ Θ′ and ∆′ cFS Θ′ , and 3. whenever s −A→ 6 then Θ =⇒−A→. 6 Lemma 6.23 The following statements about divergence are equivalent. (1) ∆ =⇒ ε. τ τ τ (2) There is an infinite sequence ∆ −→ ∆1 −→ ∆2 −→ . . .. τ τ τ (3) There is an infinite sequence ∆ =⇒−→= ⇒ ∆1 =⇒−→= ⇒ ∆2 =⇒−→= ⇒ . . ..

Proof: By the definition of weak transition, it is immediate that (1) ⇔ (2). Clearly we have (2) ⇒ (3). To show that (3) ⇒ (2), we introduce another characterisation of divergence. Let ∆ be a subdistribution in a pLTS L. A pLTS induced by ∆ is a pLTS that has ∆ as initial subdistribution and whose states and transitions are subsets of those in L. (4) There is a pLTS induced by ∆ where all states have outgoing τ transitions. It holds that (3) ⇒ (4) because we can construct a pLTS whose states and transitions are just those used in deriving the infinite sequence in (3). For this pLTS, each state has an outgoing τ transition, which gives (4) ⇒ (2). 2 The next lemma shows the usefulness of the relation cFS by checking divergence in a co-inductive way.

34

τ Lemma 6.24 Suppose ∆ cFS Θ and ∆ =⇒ ε. Then there exist ∆′ , Θ′ such that ∆ =⇒−→= ⇒ ∆′ =⇒ ε, τ ′ ′ ′ c Θ =⇒−→=⇒ Θ , and ∆ FS Θ . P Proof: Suppose ∆ cFS Θ and ∆ =⇒ ε. By Proposition 3.9(2), we can decompose Θ as s∈⌈∆⌉ ∆(s)·Θs and τ ′ ′ s cFS Θs for each s ∈ ⌈∆⌉. Now each s must also diverge. So there sP =⇒−→= ⇒ ∆′s =⇒ ε, P exist ∆s , Θs′ such that τ ′ ′ ′ ′ ′ c Θs =⇒−→=⇒ Θs and ∆s FS Θs for each s ∈ ⌈∆⌉. Let ∆ = s∈⌈∆⌉ ∆(s)·∆s and Θ = s∈⌈∆⌉ ∆(s)·Θ′s . By τ τ ⇒ ∆′ , and Θ =⇒−→= ⇒ Θ′ . We also have that ∆′ =⇒ ε because Proposition 3.9(1), we have ∆′ cFS Θ′ , ∆ =⇒−→= ′ ′ ′ ′ for each state s in ∆ it holds that s ∈ ⌈∆s ⌉ for some ∆s and ∆s =⇒ ε, which means s =⇒ ε. 2

Lemma 6.25 cFS coincides with sFS . Proof: We only need to check that the first clause in Definition 6.20 is equivalent to the first clause in Definition 6.22. For one direction, we consider the relation R := {(s, Θ) | s =⇒ ε, Θ =⇒ ε} τ τ τ and show R ⊆ cFS . Suppose s R Θ. By Lemma 6.23 there are two infinite sequences s −→ ∆1 −→ ∆2 −→ . . . and τ τ Θ = ⇒ ε. Note that ∆ = ⇒ ε if and only if t = ⇒ ε for each Θ −→ Θ1 −→ . . .. Then we have both ∆1 =⇒ ε and 1 1 P P t ∈ ⌈∆1 ⌉. Therefore, ∆1 R Θ1 as we have ∆1 = t∈⌈∆1 ⌉ ∆1 (t)·t, Θ1 = t∈⌈∆1 ⌉ ∆1 (t)·Θ1 , and t R Θ1 . Here |∆1 | = 1 because ∆1 , like s, is a distribution. For the other direction, we show that ∆ cFS Θ and ∆ =⇒ ε imply Θ =⇒ ε. Then as a special case, we get s cFS Θ and s =⇒ ε imply Θ =⇒ ε. By repeated application of Lemma 6.24, we can obtain two infinite sequences τ τ τ τ ∆ =⇒−→= ⇒ ∆1 =⇒−→= ⇒ . . . and Θ =⇒−→= ⇒ Θ1 =⇒−→= ⇒ . . . such that ∆i cFS Θi for all i ≥ 1. By Lemma 6.23 this implies Θ =⇒ ε. 2

The advantage of this new relation cFS over sFS is that in order to check s cFS Θ when s diverges it is sufficient τ to find a single matching move Θ =⇒−→= ⇒ Θ′ , rather than an infinite sequence of moves. However to construct this matching move we can not rely on clause 2. in Definition 6.22, as the move generated there might actually be empty, as we have seen in Example 3.13. Instead we need a method for generating p-weak moves which contain at least one occurrence of a τ -action. τ τ Definition 6.26 [Productive moves] Let us write s |A t −→ p Θ whenever we can infer s |A t −→p Θ from rule (par.r) or (par.i). In effect this means that t must contribute to the action. τ These productive actions are extended to subdistributions in the standard manner, giving ∆ −→ p Θ.

First let us recall the following lemma which appeared as Lemma 6.12 in [3]; it still holds in our current setting. The following lemma appeared as Lemma 6.12 in [3]. It still holds in our current setting. Lemma 6.27 (1) If Φ =⇒ Φ′ then Φ |A ∆ =⇒ Φ′ |A ∆ and ∆ |A Φ =⇒ ∆ |A Φ′ . a a a (2) If Φ −→ Φ′ and a 6∈ A then Φ |A ∆ −→ Φ′ |A ∆ and ∆ |A Φ −→ ∆ |A Φ′ . a a τ (3) If Φ −→ Φ′ , ∆ −→ ∆′ and a ∈ A then ∆ |A Φ −→ ∆′ |A Φ′ . P P P P (4) ( j∈J pj ·Φj ) |A ( k∈K qk ·∆k ) = j∈J k∈K (pj ·qk )·(Φj |A ∆k ).

(5) Given relations R, R′ ⊆ S × D(S) satisfying uRΨ whenever u = s |A t and Ψ = Θ |A t with s R′ Θ and t ∈ S. Then ∆ R′ Θ and Φ ∈ D(S) implies (∆ |A Φ) R (Θ |A Φ). 2 τ τ Proposition 6.28 Suppose ∆ cFS Θ and ∆ |A t −→ p Γ. Then Θ |A t =⇒−→=⇒ Ψ for some Ψ such that Γ R Ψ, c where R is the relation given by {(s |A t, Θ |A t) | s FS Θ}. τ Proof: We first show a simplified version of the result. Suppose s cFS Θ and s |A t −→ p Γ; we prove this entails τ Θ |A t =⇒−→=⇒ Ψ such that Γ R Ψ. There are only two possibilities for inferring the above productive move from s |A t:

35

τ (i) Γ = s |A Φ where t −→ Φ a a (ii) or Γ = ∆ |A Φ where for some a ∈ A, s −→ ∆ and t −→ Φ. τ In the first case we have Θ |A t −→ Θ |A Φ by Lemma 6.27(2) and (s |A Φ) R (Θ |A Φ) by Lemma 6.27(5), a c whereas in the second case s FS Θ implies Θ =⇒−→= ⇒ Θ′ for some Θ′ ∈ D(S) with ∆ cFS Θ′ , and we have τ ′ Θ |A t =⇒−→=⇒ Θ |A Φ by Lemma 6.27(1) and (3), and (∆ |A Φ) R (Θ′ |A Φ) by Lemma 6.27(5). τ The general case now follows using a standard decomposition/recomposition argument. Since ∆ |A t −→ p Γ, Lemma 3.4 yields X X τ ∆= pi ·si , si |A t −→ Γ= pi ·Γi , p Γi , i∈I

i∈I

P

for certain si ∈ S, Γi ∈ D(S) and i∈I pi ≤ 1. For finitary processes cFS is convex by combination P of Proposition 6.18, Theorem 6.21 and Lemma 6.25. Hence, since ∆ cFS Θ, Corollary 3.10 yields that Θ = i∈I pi ·Θi for τ some Θi ∈ D(S) such that si cFS Θi for i ∈ I. By the above argument we have Θi |A t =⇒−→= ⇒ Ψi for some P Ψi ∈ D(S) such that Γi R Ψi . The required Ψ can be taken to be i∈I pi ·Ψi as Definition 3.2(2) yields Γ R Ψ and τ Theorem 3.18(i) and Definition 3.2(2) yield Θ |A t =⇒−→= ⇒ Ψ. 2 Our next result shows that we can always factor out productive moves from an arbitrary action of a parallel process. τ Lemma 6.29 Suppose ∆ |A t −→ Γ. Then there exists subdistributions ∆→ , ∆× , ∆next , Γ× (possibly empty) such that

(i) ∆ = ∆→ + ∆× τ (ii) ∆→ −→ ∆next τ × (iii) ∆× |A t −→ p Γ

(iv) Γ = ∆next |A t + Γ× τ Proof: By Lemma 3.4 ∆ |A t −→ Γ implies that X τ si |A t −→ Γi , ∆= pi ·si ,

Γ=

X

pi ·Γi ,

i∈I

i∈I

P τ for certain si ∈ S, Γi ∈ D(S) and i∈I pi ≤ 1. Let J = { i ∈ I | si |A t −→ p Γi }. Note that for each i ∈ (I − J) τ ′ ′ Γi has the form Γi |A t, where si −→ Γi . Now let X X pi ·si , ∆→ = ∆× = pi ·si i∈J

i∈(I−J)

next



=

X

pi ·Γ′i ,

×

Γ =

X

pi ·Γi

i∈J

i∈(I−J)

By construction (i) and (iv) are satisfied, and (ii) and (iii) follows by property (2) of Definition 3.2.

2

τ Lemma 6.30 If ∆ |A t =⇒ ε then there is a ∆′ ∈ D(S) such that ∆ =⇒ ∆′ and ∆′ |A t −→ p =⇒ ε.

Proof: Suppose ∆0 |A t =⇒ ε. By Lemma 6.23 there is an infinite sequence τ τ τ ... Ψ2 −→ Ψ1 −→ ∆0 |A t −→ × × By induction on k ≥ 0, we find distributions Γk+1 , ∆→ k , ∆k , ∆k+1 , Γk+1 such that τ (i) ∆k |A t −→ Γk+1

(ii) Γk+1 ≤ Ψk+1 × (iii) ∆k = ∆→ k + ∆k τ (iv) ∆→ k −→ ∆k+1

36

(25)

τ × (v) ∆× k |A t −→p Γk+1

(vi) Γk+1 = ∆k+1 |A t + Γ× k+1 . Induction Base: Take Γ1 := Ψ1 and apply Lemma 6.30. τ Induction Step: Assume we already have Γk , ∆k and Γ× k . Since ∆k |A t ≤ Γk ≤ Ψk and Ψk −→ Ψk+1 , Propoτ sition 3.9 gives us a Γk+1 ⊆ Ψk+1 such that ∆k |A t −→ Γk+1 and Γk+1 ≤ Ψk+1 . Now apply Lemma 6.30. P∞ P∞ Let ∆′ := k=0 ∆× a weak τ move ∆0 =⇒ ∆′ . Since ∆′ |A t = k=0 (∆× k . By (iii) and (iv) above we obtain k |A t), P ∞ × τ ′ by (v) and Definition 3.2 we have ∆′ |A t −→p k=1 Γk . Note that here it does not matter if ∆ = ε. Since × Γ× k ≤ Γk ≤ Ψk and Ψk =⇒ ε it follows by Theorem 3.18(ii) that Γk =⇒ ε. Hence Theorem 3.18(i) yields P ∞ × 2 k=1 Γk =⇒ ε.

We are now ready to prove the main result of this section, namely that ⊑FS is preserved by the parallel operator. Proposition 6.31 If ∆ ⊒FS Θ then ∆ |A Φ ⊒FS Θ |A Φ. Proof: We first construct the following relation R := {(s |A t, Θ |A t) | s cFS Θ}

and check that R ⊆ cFS . As in the proof of Proposition 4.6 in [2], one can check that each strong transition from s |A t can be matched up by a transition from Θ |A t, and the matching of failures can also be established. So we concentrate on the requirement involving divergence. Suppose s cFS Θ and s |A t =⇒ ε. We need to find some Γ, Ψ such that τ (a) s |A t =⇒−→= ⇒ Γ =⇒ ε, τ (b) Θ |A t =⇒−→= ⇒ Ψ and Γ R Ψ. τ c By Lemma 6.30 there are ∆′ , Γ ∈ D(S) such that s =⇒ ∆′ and ∆′ |A t −→ p Γ =⇒ ε. Since FS coincides s ′ ′ ′ ′ c with FS and FS , there must be a Θ ∈ D(S) such that Θ =⇒ Θ and ∆ FS Θ . By Proposition 6.28 we have τ τ Θ′ |A t =⇒−→= ⇒ Ψ for some Ψ such that Γ R Ψ. Now s |A t =⇒ ∆′ |A t −→ Γ =⇒ ε and Θ |A t =⇒ Θ′ |A τ t =⇒−→=⇒ Ψ with Γ R Ψ, which had to be shown. Therefore, we have shown that R ⊆ cFS . Now let us focus our attention on the statement of the proposition, which involves ⊒FS . Suppose ∆ ⊒FS Θ. By definition this means that there is some Θmatch such that Θ =⇒ Θmatch and ∆ sFS Θmatch . By Lemma 6.25 we have ∆ cFS Θmatch and Lemma 6.27(5) yields (∆ |A Φ) R (Θmatch |A Φ). Therefore, we have (∆ |A Φ) cFS (Θmatch |A Φ), i.e. (∆ |A Φ) sFS (Θmatch |A Φ) by Lemma 6.25. By Lemma 6.27(1) we also have (Θ |A Φ) =⇒ (Θmatch |A Φ), which had to be established. 2

Proposition 6.32 (Precongruence) If P ⊒FS Q then α.P ⊒FS α.Q for α ∈ Act, and similarly if P1,2 ⊒FS Q1,2 respectively then P1 ⊙ P2 ⊒FS Q1 ⊙ Q2 for ⊙ any of the operators ⊓, 2, p ⊕ and |A . Proof: The most difficult case is the closure of failure simulation under parallel composition, which is proved in Lemma 6.31. The other cases are simpler, thus omitted. 2 Lemma 6.33 If P ⊑FS Q then for any test T it holds that [P |Act T ] ⊑FS [Q |Act T ]. Proof: We first construct the following relation R := {(s |Act t, Θ |Act t) | s cFS Θ} where s |Act t is a state in [P |Act T ] and Θ |Act t is a subdistribution in [Q |Act T ], and show that R⊆cFS . 1. The matching of divergence between s |Act t and Θ |Act t is almost the same as the proof of Lemma 6.31, besides that we need to check the requirements t −ω→ 6 and Γ −ω→ 6 are always met there. 37

2. We now consider the matching of transitions. • If s |Act t −ω→ then this action is actually performed by t. Suppose t −ω→ Γ. Then s |Act t −ω→ s |Act Γ and Θ |Act t −ω→ Θ |Act Γ. Obviously we have (s |Act Γ, Θ |Act Γ) ∈R. τ • If s |Act t −→ then we must have s |Act t −ω→, 6 otherwise the τ transition would be a “scooting” transition. 6 There are three subcases. It follows that t −ω→. τ τ τ – t −→ Γ. So the transition s |Act t −→ s |Act Γ can simply be matched up by Θ |Act t −→ Θ |Act Γ. τ c ′ ′ – s −→ ∆. Since s FS Θ, there exists some Θ such that Θ =⇒ Θ and ∆ cFS Θ′ . Note that in this case t −ω→. 6 It follows that Θ |Act t =⇒ Θ′ |Act t which can match up the transition s |Act t −→ ∆ |Act t because (∆ |Act t, Θ′ |Act t) ∈R. a a – s −→ ∆ and t −→ Γ for some action a ∈ Act. Since s cFS Θ, there exists some Θ′ such that a ′ c Θ =⇒ Θ and ∆ FS Θ′ . Note that in this case t −ω→. 6 It follows that Θ |Act t =⇒ Θ′ |Act Γ which can match up the transition s |Act t −→ ∆ |Act Γ because (∆ |Act Γ, Θ′ |Act Γ) ∈R.

• Suppose s |Act t −A→ 6 for any A ⊆ Act ∪ {ω}. There are two possibilities. A2 A1 − − 6 → and – If s |Act t −ω→, 6 then t −ω→ 6 and there are two subsets A1 , A2 of A such that s − 6 →, t − c ′ ′ ′ A1 A = A1 ∪ A2 . Since s FS Θ there exists some Θ such that Θ =⇒ Θ and Θ −− 6 →. Therefore, we have Θ |Act t =⇒ Θ′ |Act t −A→. 6 – If s |Act t −ω→ then t −ω→ and ω 6∈ A. Therefore, we have Θ |Act t −ω→ and Θ |Act t −τ→ 6 because there is no “scooting” transition in Θ |Act t. It follows that Θ |Act t −A→. 6

Therefore, we have shown that R⊆cFS , from which our expected result can be establishing using similar arguments in the last part of the proof of Lemma 6.31. 2

6.4 Soundness In this section we prove that failure simulations are sound for showing that processes are related via the failure-based testing preorder. We assume initially that we are using only one success action ω, so that |Ω| = 1. Because we prune our computation structures before extracting values from them, we will be concerned mainly with ω-respecting structures. For those we have the following Lemma 6.34 Let ∆ and Θ be subdistributions in an ω-respecting computation structure. If subdistribution ∆ is stable and ∆ sFS Θ, then V(∆) ∈ V(Θ). Proof: We first show that if s is stable and s sFS Θ then V(s) ∈ V(Θ). Since s is stable, we have only two cases: (i) s −→ 6 Here V(s)=0 and since s sFS Θ we have Θ =⇒ Θ′ with Θ′ −→, 6 whence in fact Θ =⇒≻ Θ′ and V(Θ′ ) = 0. Thus from Lemma 4.34 we have V(s) = 0 ∈ V(Θ). (ii) s −ω→ ∆′ for some ∆′ Here V(s)=1 and Θ =⇒ Θ′ −ω→ with V(Θ′ )=1. Because the pLTS is ω-respecting, ′ in fact Θ =⇒≻ Θ and so again V(s) = 1 ∈ V(Θ). P Now for the general case we suppose ∆ sFS Θ. Use Proposition 3.9 to decompose Θ into s∈⌈∆⌉ ∆(s)·Θs such that s sFS Θs for each sP ∈ ⌈∆⌉, and recall each P such state s is stable. From above we have that V(s) ∈ V(Θs ) for 2 those s, and so V(∆) = ∈⌈∆⌉ ∆(s)·V(s) ∈ s∈⌈∆⌉ ∆(s)·V(Θs ) = V(Θ). Lemma 6.35 Let ∆ be a subdistribution in an ω-respecting computation structure. If ∆ =⇒ ∆′ then V(∆′ ) ⊆ V(∆).

Proof: Note that if ∆′ =⇒≻ ∆′′ then ∆ =⇒ ∆′ =⇒≻ ∆′′ , so that every extreme derivative of ∆′ is also an extreme derivative of ∆. The result follows from Lemma 4.34. 2 Lemma 6.36 Let ∆ and Θ be subdistributions in an ω-respecting computation structure. If ∆ sFS Θ, then we have V(∆) ⊆ V(Θ). 38

Proof: Since ∆ sFS Θ, for any ∆ =⇒≻ ∆′ we have the matching transition Θ =⇒ Θ′ such that ∆′ sFS Θ′ . It follows from Lemmas 6.34 and 6.35 that V(∆′ ) ∈ V(Θ′ ) ⊆ V(Θ). By Lemma 4.34 we obtain V(∆) ⊆ V(Θ). 2 Lemma 6.37 Let ∆ and Θ be subdistributions in an ω-respecting computation structure. If Θ ⊑FS ∆, then it holds that V(Θ) ⊇ V(∆). Proof: Suppose Θ ⊑FS ∆. By Proposition 6.9 and Theorem 6.21, there exists some Θ′ such that Θ =⇒ Θ′ and ∆ sFS Θ′ . By Lemmas 6.36 and 6.35 we obtain V(∆) ⊆ V(Θ′ ) ⊆ V(Θ). 2 Theorem 6.38 If P ⊑FS Q then P ⊑pmust Q. Proof: We reason as follows. implies implies iff implies iff

P ⊑FS Q [P |Act T ] ⊑FS [Q |Act T ] V([P |Act T ]) ⊇ V([Q |Act T ]) A(T, P ) ⊇ A(T, Q) A(T, P ) ≤Sm A(T, Q) P ⊑pmust Q .

Lemma 6.33, for any test T [·] is ω-respecting; Lemma 6.37 (4) Def. Smyth order Definition 4.5

2 Corollary 6.39 If P ⊑FS Q then P ⊑Ω pmust Q. Proof: Section 4.3.1 recalled our earlier result that Ω-testing is reducible to scalar testing.

2

7 Failure simulation is complete for must testing This section establishes the completeness of the failure simulation preorder w.r.t. the must testing preorder. It does so in three steps. First we provide a characterisation of the preorder relation ⊑FS by finite approximations. Secondly, using this, we develop a modal logic which can be used to characterise the failure simulation preorder on finitary pLTSs. Finally, we adapt the results of [2] to show that the modal formulae can in turn be characterised by tests; again this result depends on the underlying pLTS being finite-state. From this, completeness follows.

7.1 Inductive characterisation sFS FS kFS ′FS cFS The relation sFS of Definition 6.20 is given coinductively: it is the largest fixpoint of an equation R= F(R). An alternative approach is to use that F(−) to define sFS as a limit of approximants: Definition 7.1 For every k ≥ 0 we define the relations kFS ⊆ S × D(S) as follows: (i) 0FS

:= S × D(S)

:= F(kFS ) (ii) k+1 FS T k := ∞ Finally let ∞ FS k=0 FS .

. The converse is A simple inductive argument ensures that sFS ⊆ kFS , for every k ≥ 0, and therefore that sFS ⊆ ∞ FS however not true in general. A (non-probabilistic) example is well-known in the literature: it makes essential use of an infinite branching. Let P be the process rec x. a.x and s a state in a pLTS which starts by making an infinitary choice, namely for each k ≥ 0 it has the option to perform a P sequence of k a actions in succession and then deadlock. This can be described by ∞ a the infinitary CCS expression k=0 ak . Then P 6sFS s, because the move P −→ P can not be matched by s. s. However an easy inductive argument shows that P kFS ak for every k, and therefore that P ∞ FS

[ ℄ [ ℄

[ ℄

39

[ ℄ [ ℄

Once we restrict our non-probabilistic systems to finitely branching state spaces, however, a simple counting argument will show that sFS coincides with ∞ ; see [8, Theorem 2.1] for the argument applied to bisimulation FS equivalence. In the probabilistic case we restrict to both finite-state and finite-branching -systems, and the effect of that is captured by topological compactness. Finiteness is lost unavoidably when we remember that for example the process τ.a + τ.b can move via =⇒ to a distribution a p ⊕ b for any of the non-denumerably many probabilities p∈[0, 1] — and that is hardly finite. Even worse, in probabilistic systems one can have infinite branching over a finitestate system (not possible without probability), just e.g. by picking some infinite subset of [0, 1] to be the allowed p’s in the example above; that is why we must now impose finite branching explicitly. The effect is what one could call “finitely generated” transitions (in this case by arbitrary interpolation of the two extreme possibilities a and b, two being a finite number), and that is the key structural property that compactness captures. Because compactness follows from closure and boundedness , we approach this topic via closure. Note that the metric space (D(S), d1 ) with d1 (∆, Θ) = maxs∈S |∆(s) − Θ(s)| and (S → D(S), d2 ) with d2 (f, g) = maxs∈S d1 (f (s), g(s)) are complete. Let X be a subset of either D(S) or S → D(S). Clearly, X is bounded. So if X is closed, it is also compact.


Definition 7.2 A relation R ⊆ S × D(S) is closed if for every s ∈ S the set s·R is closed.

Two examples of closed relations are =⇒ and =a⇒ for any a, as shown by Lemmas 5.4 and 5.5. Our next step is to show that each of the relations ⊲kFS is closed. This requires some results to be established first.

Lemma 7.3 Let R ⊆ S × D(S) be closed. Then its set of choice functions { f : S → D(S) | f ∈S R } is also closed.
Proof: Straightforward, using closure with respect to the metric d2 on S → D(S) defined above. □


Corollary 7.4 Let R ⊆ S × D(S) be closed and convex. Then its lifting R̄ is also closed.
Proof: For any ∆ ∈ D(S), we know from Proposition 3.8 that ∆·R̄ = { Exp∆(f) | f ∈⌈∆⌉ R }. The function Exp∆(−) is continuous. By Lemma 7.3 the set of choice functions of R is closed, and it is also bounded, hence compact. Its image under Exp∆(−) is therefore also compact, hence closed. □

Lemma 7.5 Let R ⊆ S × D(S) be closed and convex, and let C ⊆ D(S) be closed. Then the set { ∆ | ∆·R̄ ∩ C ≠ ∅ } is also closed.
Proof: First define E : D(S) × (S → D(S)) → D(S) by E(Θ, f) = ExpΘ(f), which is obviously continuous. Then let F = { f | f ∈⌈∆⌉ R }, which by the previous lemma is closed. Finally let Z = π1(E−1(C) ∩ (D(S) × F)), where π1 is the projection onto the first component of a pair. We observe that the continuity of E ensures that the inverse image of the closed set C is closed. Furthermore, E−1(C) ∩ (D(S) × F) is compact because it is both closed and bounded. Its image under the continuous function π1 is also compact. It follows that Z is closed. But Z = { ∆ | ∆·R̄ ∩ C ≠ ∅ } because

∆ ∈ Z  iff  (∆, f) ∈ E−1(C) for some f ∈⌈∆⌉ R
       iff  E(∆, f) ∈ C for some f ∈⌈∆⌉ R
       iff  Exp∆(f) ∈ C for some f ∈⌈∆⌉ R
       iff  ∆ R̄ ∆′ for some ∆′ ∈ C.

The last line is an application of Proposition 3.8, which requires convexity of R. □

An immediate corollary of this last result is:


Corollary 7.6 In a finitary pLTS the following sets are closed:
(i) { ∆ | ∆ =⇒ ε }
(ii) { ∆ | ∆ =A⇏ }
Proof: By Lemma 5.4 and Corollary 3.20 we know that =⇒ is closed and convex. Therefore, we can apply the previous lemma with C = {ε} to obtain the first result. To obtain the second we apply it with C = { Θ | Θ −A↛ }, which is easily seen to be closed. □

The result is also used in the proof of:

Proposition 7.7 For every k ≥ 0 the relation ⊲kFS is closed and convex.
Proof: By induction on k. For k = 0 it is obvious. So let us assume that ⊲kFS is closed and convex. We have to show that

s·⊲k+1FS is closed and convex, for every state s.   (26)

If s =⇒ ε then this follows from the corollary above, since in this case s·⊲k+1FS coincides with { ∆ | ∆ =⇒ ε }. So let us assume that this is not the case.
For every A ⊆ Act let RA = { ∆ | ∆ =A⇏ }, which we know by the corollary above to be closed, and which is obviously convex. Also for every Θ, α let GΘ,α = { ∆ | (∆·=α⇒) ∩ (Θ·⊲kFS) ≠ ∅ }. By Proposition 6.2, the relation =α⇒ is lifted from a closed convex relation. By Corollary 7.4, the assumption that ⊲kFS is closed and convex implies that its lifting is also closed. So we can appeal to Lemma 7.5 and conclude that each GΘ,α is closed. By Definition 3.2(2) it is also easy to see that GΘ,α is convex. But it follows that s·⊲k+1FS is also closed and convex, as it can be written as

⋂{ RA | s −A↛ } ∩ ⋂{ GΘ,α | s −α→ Θ }.   □

Before the main result of this section we need one more technical result.

Lemma 7.8 Let S be a finite set of states. Suppose Rk ⊆ S × D(S) is a sequence of closed convex relations such that Rk+1 ⊆ Rk, and write R∞ for ⋂∞k=0 Rk. Then, for the liftings, ⋂∞k=0 R̄k ⊆ R̄∞.
Proof: Suppose ∆ R̄k Θ for every k ≥ 0. We have to show that ∆ R̄∞ Θ. Let G = { f : S → D(S) | Θ = Exp∆(f) }, which is easily seen to be a closed set. For each k let Fk = { f : S → D(S) | f ∈⌈∆⌉ Rk }, which by Lemma 7.3 we also know to be closed. Finally consider the collection of closed sets Hk = Fk ∩ G; since ∆ R̄k Θ, Proposition 3.8 assures us that all of these are non-empty. Also Hk+1 ⊆ Hk, and therefore by the finite-intersection property [14] ⋂∞k=0 Hk is also non-empty. Let f be an arbitrary element of this intersection. For any state s ∈ ⌈∆⌉, and for every k ≥ 0, we have s Rk f(s), that is s R∞ f(s). So f is a choice function for R∞: f ∈⌈∆⌉ R∞. From convexity and Proposition 3.8 it follows that ∆ R̄∞ Exp∆(f). But from the definition of G we know that Θ = Exp∆(f), and the required result follows. □

Theorem 7.9 In a finitary pLTS, s ⊲sFS Θ if and only if s ⊲∞FS Θ.
Proof: Since ⊲sFS ⊆ ⊲∞FS it is sufficient to show the opposite inclusion, which by definition holds if ⊲∞FS is a failure simulation, viz. if ⊲∞FS ⊆ F(⊲∞FS). Suppose s ⊲∞FS Θ, which means that s ⊲kFS Θ for every k ≥ 0. According to Definition 6.20, in order to show s F(⊲∞FS) Θ we have to establish three properties, the first and last of which are trivial (for they are independent of the argument of F).
So suppose s −α→ ∆′. We have to show that Θ =α⇒ Θ′ for some Θ′ such that ∆′ ⊲∞FS Θ′.

For every k ≥ 0 there exists some Θ′k such that Θ =α⇒ Θ′k and ∆′ ⊲kFS Θ′k. Now construct the sets

Dk = { Θ′ | Θ =α⇒ Θ′ and ∆′ ⊲kFS Θ′ }.

From Lemma 5.4 and Proposition 7.7 we know that these are closed. They are also non-empty, and Dk+1 ⊆ Dk. So by the finite-intersection property the set ⋂∞k=0 Dk is non-empty. For any Θ′ in it we know Θ =α⇒ Θ′ and ∆′ ⊲kFS Θ′ for every k ≥ 0. By Proposition 7.7, the relations ⊲kFS are all closed and convex. Therefore, Lemma 7.8 may be applied to them, which enables us to conclude ∆′ ⊲∞FS Θ′. □

Analogously to what we did for ⊲sFS, we also give an inductive characterisation of ⊒FS: for every k ≥ 0 let ∆ ⊒kFS Θ if there exists a Θmatch with Θ =⇒ Θmatch and ∆ ⊲kFS Θmatch, and let ⊒∞FS denote ⋂∞k=0 ⊒kFS.

Corollary 7.10 In a finitary pLTS, ∆ ⊒FS Θ if and only if ∆ ⊒∞ FS Θ.

Proof: Since ⊲sFS ⊆ ⊲kFS for every k ≥ 0, it is straightforward to prove one direction: ∆ ⊒FS Θ implies ∆ ⊒∞FS Θ.
For the converse, ∆ ⊒∞FS Θ means that for every k we have some Θk satisfying Θ =⇒ Θk and ∆ ⊲kFS Θk. By Proposition 6.9 we have to find some Θ∞ such that Θ =⇒ Θ∞ and ∆ ⊲kFS Θ∞ for every k. This can be done exactly as in the proof of Theorem 7.9. □

7.2 The modal logic

Let F be the set of modal formulae defined inductively as follows:
• div, ⊤ ∈ F,
• ref(A) ∈ F when A ⊆ Act,
• ⟨a⟩ϕ ∈ F when ϕ ∈ F and a ∈ Act,
• ϕ1 ∧ ϕ2 ∈ F when ϕ1, ϕ2 ∈ F,
• ϕ1 p⊕ ϕ2 ∈ F when ϕ1, ϕ2 ∈ F and p ∈ [0, 1].

This generalises the modal language used in [2] by the addition of the new constant div, representing the ability of a process to diverge. In [2] there is the probabilistic choice operator ⊕i∈I pi·ϕi, where I is a non-empty finite index set and Σi∈I pi = 1. This can be simulated in our language by nested use of the binary probabilistic choice. Relative to a given pLTS ⟨S, Actτ, →⟩ the satisfaction relation |= ⊆ D(S) × F is given by:
• ∆ |= ⊤ for any ∆ ∈ D(S),
• ∆ |= div iff ∆ =⇒ ε,
• ∆ |= ref(A) iff ∆ =⇒ −A↛,
• ∆ |= ⟨a⟩ϕ iff there is a ∆′ with ∆ =a⇒ ∆′ and ∆′ |= ϕ,
• ∆ |= ϕ1 ∧ ϕ2 iff ∆ |= ϕ1 and ∆ |= ϕ2,
• ∆ |= ϕ1 p⊕ ϕ2 iff there are ∆1, ∆2 ∈ D(S) with ∆1 |= ϕ1 and ∆2 |= ϕ2, such that ∆ =⇒ p·∆1 + (1−p)·∆2.

We write ∆ ⊒F Θ when ∆ |= ϕ implies Θ |= ϕ for all ϕ ∈ F — note the opposing directions. This is because the modal formulae express "bad" properties of our processes, ultimately divergence and refusal: thus Θ ⊑F ∆ means that any bad thing implementation ∆ does must have been allowed by the specification Θ. For pCSP processes we use P ⊑F Q to abbreviate [P] ⊑F [Q] in the pLTS given in Section 2.
The set of formulae used here is obtained from that in Section 7 of [2] by adding one operator, div, and relaxing the constraint on the construction of probabilistic choice formulae. But the interpretation is quite different, as it uses the new silent move relation =⇒. As a result our satisfaction relation no longer enjoys a natural, and expected, property. Recall that if a recursive CCS process P satisfies a modal formula from HML, then there is a recursion-free finite unwinding of P which also satisfies it. Intuitively this reflects the fact that if a non-probabilistic process does a bad thing, then at some (finite) point it must actually do it. But this is not true in our new, probabilistic setting: for example Q1 given in Example 4.3 can do an a and then refuse anything; but all finite unwindings of it achieve that with probability strictly less than one. That is, whereas [Q1] |= ⟨a⟩⊤, no finite unwinding of Q1 will satisfy ⟨a⟩⊤.
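Purely to fix notation for the sketches below, the grammar of F can be rendered as a small datatype; the tuple encoding here is our own and only illustrative:

```python
# Illustrative encoding of the modal language F as nested tuples (our own
# representation, not from the paper): ("top",), ("div",), ("ref", A),
# ("may", a, phi), ("and", phi1, phi2) and ("pchoice", p, phi1, phi2).
def show(phi):
    kind = phi[0]
    if kind == "top":
        return "⊤"
    if kind == "div":
        return "div"
    if kind == "ref":
        return "ref({%s})" % ",".join(sorted(phi[1]))
    if kind == "may":
        return "⟨%s⟩%s" % (phi[1], show(phi[2]))
    if kind == "and":
        return "(%s ∧ %s)" % (show(phi[1]), show(phi[2]))
    if kind == "pchoice":
        return "(%s %g⊕ %s)" % (show(phi[2]), phi[1], show(phi[3]))
    raise ValueError(kind)

# ⟨a⟩(div 1/2⊕ ⊤): after an a-move, half the mass may diverge.
print(show(("may", "a", ("pchoice", 0.5, ("div",), ("top",)))))
```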

Our first task is to show that the interpretation of the logic is consistent with the operational semantics of processes.

Theorem 7.11 If ∆ ⊒FS Θ then ∆ ⊒F Θ.
Proof: We must show that if ∆ ⊒FS Θ then whenever ∆ |= ϕ we have Θ |= ϕ. The proof proceeds by induction on ϕ:
• The case when ϕ = ⊤ is trivial.
• Suppose ϕ is div. Then ∆ |= div means that ∆ =⇒ ε and we have to show Θ =⇒ ε, which is immediate from Lemma 6.5.
• Suppose ϕ is ⟨a⟩ϕa. In this case we have ∆ =a⇒ ∆′ for some ∆′ satisfying ∆′ |= ϕa. The existence of a corresponding Θ′ is immediate from Definition 6.4 Case 1 and the induction hypothesis.
• The case when ϕ is ref(A) follows by Definition 6.4 Clause 2, and the case ϕ1 ∧ ϕ2 by induction.
• When ϕ is ϕ1 p⊕ ϕ2 we appeal again to Definition 6.4 Case 1, using α := τ to infer the existence of suitable Θ′{1,2}. □

We proceed to show that the converse to this theorem also holds, so that the failure simulation preorder ⊑FS coincides with the logical preorder ⊑F. The idea is to mimic the development in Section 7 of [2], by designing characteristic formulae which capture the behaviour of states in a pLTS. But here the behaviour is not characterised relative to ⊲sFS, but rather to the sequence of approximating relations ⊲kFS.

Definition 7.12 In a finitary pLTS ⟨S, Actτ, →⟩, the k-th characteristic formulae ϕks, ϕk∆ of states s ∈ S and subdistributions ∆ ∈ D(S) are defined inductively as follows:
• ϕ0s = ⊤ and ϕ0∆ = ⊤,
• ϕk+1s = div, provided s =⇒ ε,
• ϕk+1s = ref(A) ∧ ⋀s−a→∆ ⟨a⟩ϕk∆, where A = { a ∈ Act | s −a↛ }, provided s −τ↛,
• ϕk+1s = ⋀s−a→∆ ⟨a⟩ϕk∆ ∧ ⋀s−τ→∆ ϕk∆ otherwise,
• and ϕk+1∆ = div 1−|∆|⊕ (⊕s∈⌈∆⌉ (∆(s)/|∆|)·ϕk+1s).
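To make Definition 7.12 concrete, the sketch below computes ϕks and ϕk∆ for a finite pLTS, using the tuple encoding of formulae introduced above. The transition table and the set of wholly divergent states (those s with s =⇒ ε) are assumed given; deciding the latter is a separate matter (cf. Section 5). The n-ary probabilistic choice is realised by nested binary choices, as noted after the grammar of F.

```python
def nary_pchoice(weighted):
    """[(p_i, phi_i)] with sum p_i = 1, folded into nested binary pchoice."""
    (p, f), rest = weighted[0], weighted[1:]
    if not rest or p >= 1.0:
        return f
    return ("pchoice", p, f, nary_pchoice([(q / (1 - p), g) for (q, g) in rest]))

def conj(fs):
    """Fold a list of formulae into nested binary conjunctions (top if empty)."""
    fs = list(fs)
    if not fs:
        return ("top",)
    return fs[0] if len(fs) == 1 else ("and", fs[0], conj(fs[1:]))

def phi_state(s, k, trans, divergent, Act):
    """phi^k_s: trans[s] is a finite list of (action, distribution) moves."""
    if k == 0:
        return ("top",)
    if s in divergent:                       # s =⇒ ε
        return ("div",)
    a_part = conj([("may", a, phi_dist(d, k - 1, trans, divergent, Act))
                   for (a, d) in trans[s] if a != "tau"])
    if all(a != "tau" for (a, _) in trans[s]):
        A = frozenset(Act) - {a for (a, _) in trans[s]}
        return ("and", ("ref", A), a_part)
    t_part = conj([phi_dist(d, k - 1, trans, divergent, Act)
                   for (a, d) in trans[s] if a == "tau"])
    return ("and", a_part, t_part)

def phi_dist(delta, k, trans, divergent, Act):
    """phi^k_Delta for a non-empty subdistribution (a dict of probabilities)."""
    if k == 0:
        return ("top",)
    mass = sum(delta.values())               # |Delta|
    parts = [(p / mass, phi_state(s, k, trans, divergent, Act))
             for (s, p) in delta.items()]
    return nary_pchoice([(1.0 - mass, ("div",))] + parts)
```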

Lemma 7.13 For every k ≥ 0, s ∈ S and ∆ ∈ D(S) we have s |= ϕks and ∆ |= ϕk∆ .

Proof: By induction on k, the case k = 0 being trivial. The inductive case of the first statement proceeds by an analysis of the possible moves from s, from which that of the second statement follows immediately. □

Lemma 7.14 For k ≥ 0,
(i) Θ |= ϕks implies s ⊲kFS Θ,
(ii) Θ |= ϕk∆ implies Θ =⇒ Θmatch such that ∆ ⊲kFS Θmatch,
(iii) Θ |= ϕk∆ implies ∆ ⊒kFS Θ.
Proof: For every k part (iii) follows trivially from (ii). We prove (i) and (ii) simultaneously, by induction on k, with the case k = 0 being trivial. The inductive case, for k + 1, follows the argument in the proof of Lemma 7.3 of [2].
(i) First suppose s =⇒ ε. Then ϕk+1s = div and therefore Θ |= div, which gives the required Θ =⇒ ε.
Now suppose s −τ→ ∆. Here there are two cases: if in addition s =⇒ ε we have already seen that Θ =⇒ ε, and this is the required matching move from Θ, since ∆ ⊲kFS ε. So let us assume that s =⇏ ε. Then by the definition of ϕk+1s we must have that Θ |= ϕk∆, and we obtain the required matching move from Θ from the inductive hypothesis: induction on part (ii) gives some Θ′ such that Θ =⇒ Θ′ and ∆ ⊲kFS Θ′.


The matching move for s −a→ ∆ is obtained in a similar manner.
Finally suppose s −A↛. Since this implies s −τ↛, by the definition of ϕk+1s we must have that Θ |= ref(A), which actually means that Θ =⇒ −A↛.
(ii) By definition ϕk+1∆ = div 1−|∆|⊕ (⊕s∈⌈∆⌉ (∆(s)/|∆|)·ϕk+1s), and thus Θ =⇒ (1−|∆|)·Θdiv + Σs∈⌈∆⌉ ∆(s)·Θs such that Θdiv |= div and Θs |= ϕk+1s. By definition, Θdiv =⇒ ε, so by Theorem 3.18(i) and the reflexivity and transitivity of =⇒ we obtain Θ =⇒ Σs∈⌈∆⌉ ∆(s)·Θs. By part (i) we know that s ⊲k+1FS Θs for every s in ⌈∆⌉, which in turn means that ∆ ⊲k+1FS Σs∈⌈∆⌉ ∆(s)·Θs. □

Theorem 7.15 In a finitary pLTS, ∆ ⊒F Θ if and only if ∆ ⊒FS Θ.

Proof: One direction follows immediately from Theorem 7.11. For the opposite direction suppose ∆ ⊒F Θ. By Lemma 7.13 we have ∆ |= ϕk∆, and hence Θ |= ϕk∆, for all k ≥ 0. By part (iii) of the previous lemma we thus know that ∆ ⊒∞FS Θ. That ∆ ⊒FS Θ now follows from Corollary 7.10. □

7.3 Characteristic tests for formulae

The import of Theorem 7.15 is that we can obtain completeness of the failure simulation preorder with respect to the must-testing preorder by designing for each formula ϕ a test which in some sense characterises the property of a process of satisfying ϕ. This has been achieved for the pLTS generated by the recursion-free fragment of pCSP in Section 8 of [2]. Here we generalise this technique to the pLTS generated by the set of finitary pCSP terms. As in [2], the generation of these tests depends on crucial characteristics of the testing function A(−, −), which are summarised in the following two lemmas, 7.16 and 7.19, corresponding to Lemmas 6.7 and 6.8 in [2] respectively.

Lemma 7.16 Let ∆ be a pCSP process, and T, Ti be tests.
1. o ∈ A(ω, ∆) iff o = |∆|·ω⃗.
2. 0⃗ ∈ A(τ.ω, ∆) iff ∆ =⇒ ε.
3. 0⃗ ∈ A(□a∈A a.ω, ∆) iff ∆ =⇒ −A↛.
4. Suppose the action ω does not occur in the test T. Then o ∈ A(τ.ω □ a.T, ∆) with o(ω) = 0 iff there is a ∆′ ∈ D(sCSP) with ∆ =a⇒ ∆′ and o ∈ A(T, ∆′).
5. o ∈ A(T1 p⊕ T2, ∆) iff o = p·o1 + (1−p)·o2 for certain oi ∈ A(Ti, ∆).
6. o ∈ A(T1 ⊓ T2, ∆) if there are a q ∈ [0, 1] and ∆1, ∆2 ∈ D(sCSP) such that ∆ =⇒ q·∆1 + (1−q)·∆2 and o = q·o1 + (1−q)·o2 for certain oi ∈ A(Ti, ∆i).

Proof:
1. Since [ω |Act P] −ω→ for any P, the states in the support of [ω |Act ∆] have no outgoing transitions other than ω-moves. Therefore [ω |Act ∆] is the unique extreme derivative of itself, and as $[ω |Act ∆] = |∆|·ω⃗ we have A(ω, ∆) = { |∆|·ω⃗ }.
2. (⇐) Assume ∆ =⇒ ε. By Lemma 6.27(1) we have τ.ω |Act ∆ =⇒ τ.ω |Act ε. All states involved in this derivation (that is, all states u in the support of the intermediate distributions ∆→i and ∆×i of Definition 3.12) have the form τ.ω |Act s, and thus satisfy u −ω↛ for all ω ∈ Ω. Therefore we have [τ.ω |Act ∆] =⇒ [τ.ω |Act ε]. Trivially, [τ.ω |Act ε] = ε is stable, and hence an extreme derivative of [τ.ω |Act ∆]. Moreover, $ε = 0⃗, so 0⃗ ∈ A(τ.ω, ∆).
(⇒) Suppose 0⃗ ∈ A(τ.ω, ∆), i.e., there is some extreme derivative Γ of [τ.ω |Act ∆] such that $Γ = 0⃗. Given the operational semantics of pCSP, all states u ∈ ⌈Γ⌉ must have one of the forms u = [τ.ω |Act t] or u = [ω |Act t]. As $Γ = 0⃗, the latter possibility cannot occur. It follows that all transitions contributing to the derivation [τ.ω |Act ∆] =⇒≻ Γ are obtained by means of the rule (par.r), and in fact Γ has the form [τ.ω |Act ∆′] for some distribution ∆′ with ∆ =⇒ ∆′. As Γ must be stable, yet none of the states in its support are, it follows that ⌈Γ⌉ = ∅, i.e. ∆′ = ε.

3. Let T := □a∈A a.ω.
(⇐) Assume ∆ =⇒ ∆′ −A↛ for some ∆′. Then T |Act ∆ =⇒ T |Act ∆′ by Lemma 6.27(1), and by the same argument as in the previous case, [T |Act ∆] =⇒ [T |Act ∆′]. All states in the support of T |Act ∆′ are deadlocked. So [T |Act ∆] =⇒≻ [T |Act ∆′] and $(T |Act ∆′) = 0⃗. Thus we have 0⃗ ∈ A(T, ∆).
(⇒) Suppose 0⃗ ∈ A(T, ∆). By the very same reasoning as in Case 2 we find that ∆ =⇒ ∆′ for some ∆′ such that T |Act ∆′ is stable. This implies ∆′ −A↛.
4. Let T be a test in which the success action ω does not occur, and let U := τ.ω □ a.T.
(⇐) Assume there is a ∆′ ∈ D(sCSP) with ∆ =a⇒ ∆′ and o ∈ A(T, ∆′). W.l.o.g. (why?) we may assume that ∆ =⇒ ∆pre −a→ ∆′. Using Lemma 6.27(1) and (3), and the same reasoning as in the previous cases, [U |Act ∆] =⇒ [U |Act ∆pre] −τ→ [T |Act ∆′] =⇒ Γ for a stable subdistribution Γ with $Γ = o. It follows that o ∈ A(U, ∆).
(⇒) Suppose o ∈ A(U, ∆) with o(ω) = 0. Then there is a stable subdistribution Γ such that [U |Act ∆] =⇒ Γ and $Γ = o. Since o(ω) = 0 there is no state in the support of Γ of the form ω |Act t. Hence there must be a ∆′ ∈ D(sCSP) such that ∆ =⇒ −a→ ∆′ and [T |Act ∆′] =⇒ Γ. It follows that o ∈ A(T, ∆′).
5. (⇐) Assume oi ∈ A(Ti, ∆) for i = 1, 2. Then [Ti |Act ∆] =⇒ Γi for some stable Γi with $Γi = oi. By Theorem 3.18(i) we have [(T1 p⊕ T2) |Act ∆] = p·[T1 |Act ∆] + (1−p)·[T2 |Act ∆] =⇒ p·Γ1 + (1−p)·Γ2, and p·Γ1 + (1−p)·Γ2 is stable. Moreover, $(p·Γ1 + (1−p)·Γ2) = p·o1 + (1−p)·o2, so o ∈ A(T1 p⊕ T2, ∆).
(⇒) Suppose o ∈ A(T1 p⊕ T2, ∆). Then there is a stable Γ with $Γ = o such that [(T1 p⊕ T2) |Act ∆] = p·[T1 |Act ∆] + (1−p)·[T2 |Act ∆] =⇒ Γ. By Theorem 3.18(ii) there are Γi for i = 1, 2, such that [Ti |Act ∆] =⇒ Γi and Γ = p·Γ1 + (1−p)·Γ2. As Γ1 and Γ2 are stable, we have $Γi ∈ A(Ti, ∆) for i = 1, 2. Moreover, o = $Γ = p·$Γ1 + (1−p)·$Γ2.
6. Suppose q ∈ [0, 1] and ∆1, ∆2 ∈ D(pCSP) with ∆ =⇒ q·∆1 + (1−q)·∆2 and oi ∈ A(Ti, ∆i). Then there are stable Γi with [Ti |Act ∆i] =⇒ Γi and $Γi = oi. Now [(T1 ⊓ T2) |Act ∆] =⇒ q·[(T1 ⊓ T2) |Act ∆1] + (1−q)·[(T1 ⊓ T2) |Act ∆2] −τ→ q·[T1 |Act ∆1] + (1−q)·[T2 |Act ∆2] =⇒ q·Γ1 + (1−q)·Γ2. The latter subdistribution is stable and satisfies $(q·Γ1 + (1−q)·Γ2) = q·o1 + (1−q)·o2. Hence q·o1 + (1−q)·o2 ∈ A(T1 ⊓ T2, ∆). □

We also have the converse to part (6) of this lemma, again mimicking Lemma 6.8 of [2]. For that purpose, we use two technical lemmas whose proofs are similar to those of Lemmas 6.29 and 6.30 respectively.

Lemma 7.17 Suppose ∆ |A (T1 ⊓ T2) −τ→ Γ. Then there exist subdistributions ∆→, ∆×1, ∆×2, ∆next (possibly empty) such that
(i) ∆ = ∆→ + ∆×1 + ∆×2
(ii) ∆→ −τ→ ∆next
(iii) Γ = ∆next |A (T1 ⊓ T2) + ∆×1 |A T1 + ∆×2 |A T2

Proof: By Lemma 3.4, ∆ |A (T1 ⊓ T2) −τ→ Γ implies that

∆ = Σi∈I pi·si,   si |A (T1 ⊓ T2) −τ→ Γi,   Γ = Σi∈I pi·Γi,

for certain si ∈ S, Γi ∈ D(sCSP) and Σi∈I pi ≤ 1. Let J1 = { i ∈ I | Γi = si |A T1 } and J2 = { i ∈ I | Γi = si |A T2 }. Note that for each i ∈ (I − J1 − J2) we have Γi in the form Γ′i |A (T1 ⊓ T2), where si −τ→ Γ′i. Now let

∆→ = Σi∈(I−J1−J2) pi·si,   ∆×k = Σi∈Jk pi·si (for k = 1, 2),   ∆next = Σi∈(I−J1−J2) pi·Γ′i.

By construction (i) and (iii) are satisfied, and (ii) follows by property (2) of Definition 3.2. □


Lemma 7.18 If ∆ |A (T1 ⊓ T2) =⇒≻ Ψ then there are Φ1 and Φ2 such that
(i) ∆ =⇒ Φ1 + Φ2
(ii) Φ1 |A T1 + Φ2 |A T2 =⇒≻ Ψ

Proof: Suppose ∆0 |A (T1 ⊓ T2) =⇒≻ Ψ. We know from Definition 3.12 that there is a collection of subdistributions Ψk, Ψ→k, Ψ×k, for k ≥ 0, satisfying the properties

∆0 |A (T1 ⊓ T2) = Ψ0 = Ψ→0 + Ψ×0
Ψ→0 −τ→ Ψ1 = Ψ→1 + Ψ×1
...
Ψ→k −τ→ Ψk+1 = Ψ→k+1 + Ψ×k+1
...
Ψ = Σ∞k=0 Ψ×k

and Ψ is stable. Take Γ0 := Ψ0. By induction on k ≥ 0, we find distributions Γk+1, ∆→k, ∆×k1, ∆×k2, ∆k+1 such that
(i) ∆k |A (T1 ⊓ T2) −τ→ Γk+1
(ii) Γk+1 ≤ Ψk+1
(iii) ∆k = ∆→k + ∆×k1 + ∆×k2
(iv) ∆→k −τ→ ∆k+1
(v) Γk+1 = ∆k+1 |A (T1 ⊓ T2) + ∆×k1 |A T1 + ∆×k2 |A T2

Induction step: assume we already have Γk and ∆k. Note that ∆k |A (T1 ⊓ T2) ≤ Γk ≤ Ψk = Ψ→k + Ψ×k and that T1 ⊓ T2 can make a τ move. Since Ψ is stable, we know that either Ψ×k = ε or Ψ×k −τ↛. In both cases it holds that ∆k |A (T1 ⊓ T2) ≤ Ψ→k. Proposition 3.9 gives a subdistribution Γk+1 ≤ Ψk+1 such that ∆k |A (T1 ⊓ T2) −τ→ Γk+1. Now apply Lemma 7.17.

Let Φ1 = Σ∞k=0 ∆×k1 and Φ2 = Σ∞k=0 ∆×k2. By (iii) and (iv) above we obtain a weak τ move ∆ =⇒ Φ1 + Φ2. For k ≥ 0, let Γ→k := ∆k |A (T1 ⊓ T2), let Γ×0 := ε and let Γ×k+1 := ∆×k1 |A T1 + ∆×k2 |A T2. Moreover, let Γ := Φ1 |A T1 + Φ2 |A T2. Now all conditions of Definition 3.22 are fulfilled, so ∆0 |A (T1 ⊓ T2) =⇒ Γ is an initial segment of ∆0 |A (T1 ⊓ T2) =⇒ Ψ. By Proposition 3.23 we have Φ1 |A T1 + Φ2 |A T2 =⇒≻ Ψ. □

Lemma 7.19 If o ∈ A(T1 ⊓ T2, ∆) then there are a q ∈ [0, 1] and ∆1, ∆2 ∈ D(sCSP) such that ∆ =⇒ q·∆1 + (1−q)·∆2 and o = q·o1 + (1−q)·o2 for certain oi ∈ A(Ti, ∆i).
Proof: If o ∈ A(T1 ⊓ T2, ∆) then there is an extreme derivative Ψ of [(T1 ⊓ T2) |Act ∆] such that $Ψ = o. By Lemma 7.18 there are Φ1,2 such that
(i) ∆ =⇒ Φ1 + Φ2
(ii) [T1 |Act Φ1] + [T2 |Act Φ2] =⇒≻ Ψ.
By Theorem 3.18(ii) there are some subdistributions Ψ1 and Ψ2 such that Ψ = Ψ1 + Ψ2 and Ti |Act Φi =⇒≻ Ψi for i = 1, 2. Let o′i = $Ψi. As Ψi is stable we obtain o′i ∈ A(Ti, Φi). We also have o = $Ψ = $Ψ1 + $Ψ2 = o′1 + o′2. We now distinguish two cases:
• If Ψ1 = ε, then we take ∆i = Φi, oi = o′i for i = 1, 2 and q = 0. Symmetrically, if Ψ2 = ε, then we take ∆i = Φi, oi = o′i for i = 1, 2 and q = 1.
• If Ψ1 ≠ ε and Ψ2 ≠ ε, then we let q = |Φ1|/|Φ1 + Φ2|, ∆1 = (1/q)·Φ1, ∆2 = (1/(1−q))·Φ2, o1 = (1/q)·o′1 and o2 = (1/(1−q))·o′2. It is easy to check that q·∆1 + (1−q)·∆2 = Φ1 + Φ2, q·o1 + (1−q)·o2 = o′1 + o′2 and oi ∈ A(Ti, ∆i) for i = 1, 2. □


Proposition 7.20 For every formula ϕ ∈ F there exists a pair (Tϕ, vϕ) with Tϕ an Ω-test and vϕ ∈ [0, 1]Ω such that

∆ |= ϕ if and only if ∃o ∈ A(Tϕ, ∆) : o ≤ vϕ.   (27)

Tϕ is called a characteristic test of ϕ and vϕ its target value.
Proof: The proof is adapted from that of Lemma 8.1 in [2], from where we take the following remarks: As Ω is countable and Ω-tests are finite expressions, for every Ω-test there is an ω ∈ Ω not occurring in it. Furthermore, if a pair (Tϕ, vϕ) satisfies requirement (27), then any pair obtained from (Tϕ, vϕ) by bijectively renaming the elements of Ω also satisfies that requirement. Hence two given characteristic tests can be assumed to be Ω-disjoint, meaning that no ω ∈ Ω occurs in both of them.
Our modal logic F is identical to that used in [2], with the addition of one extra constant div. So we need a new characteristic test and target value for this latter formula, and reuse those from [2] for the rest of the language:²
• Let ϕ = ⊤. Take Tϕ := ω for some ω ∈ Ω, and vϕ := ω⃗.
• Let ϕ = div. Take Tϕ := τ.ω for some ω ∈ Ω, and vϕ := 0⃗.
• Let ϕ = ref(A) with A ⊆ Act. Take Tϕ := □a∈A a.ω for some ω ∈ Ω, and vϕ := 0⃗.
• Let ϕ = ⟨a⟩ψ. By induction, ψ has a characteristic test Tψ with target value vψ. Take Tϕ := τ.ω □ a.Tψ where ω ∈ Ω does not occur in Tψ, and vϕ := vψ.
• Let ϕ = ϕ1 ∧ ϕ2. Choose an Ω-disjoint pair (Ti, vi) of characteristic tests Ti with target values vi, for i = 1, 2. Furthermore, let p ∈ (0, 1] be chosen arbitrarily, and take Tϕ := T1 p⊕ T2 and vϕ := p·v1 + (1−p)·v2.
• Let ϕ = ϕ1 p⊕ ϕ2. Again choose an Ω-disjoint pair (Ti, vi) of characteristic tests Ti with target values vi, i = 1, 2, this time ensuring that there are two distinct success actions ω1, ω2 that do not occur in any of these tests. Let T′i := Ti 1/2⊕ ωi and v′i := 1/2·vi + 1/2·ω⃗i. Note that for i = 1, 2 we have that T′i is also a characteristic test of ϕi with target value v′i. Take Tϕ := T′1 ⊓ T′2 and vϕ := p·v′1 + (1−p)·v′2.
Note that vϕ(ω) = 0 whenever ω ∈ Ω does not occur in Tϕ.
As in the proof of Lemma 8.1 of [2] we now check by induction on ϕ that (27) above holds; the proof relies on Lemmas 7.16 and 7.19.
• Let ϕ = ⊤. For all ∆ ∈ D(sCSP) we have ∆ |= ϕ as well as ∃o ∈ A(Tϕ, ∆) : o ≤ vϕ, using Lemma 7.16(1).
• Let ϕ = div. Suppose ∆ |= ϕ. Then we have that ∆ =⇒ ε. By Lemma 7.16(2), 0⃗ ∈ A(Tϕ, ∆). Now suppose ∃o ∈ A(Tϕ, ∆) : o ≤ vϕ. This implies o = 0⃗, so by Lemma 7.16(2), ∆ =⇒ ε. Hence ∆ |= ϕ.
• Let ϕ = ref(A) with A ⊆ Act. Suppose ∆ |= ϕ. Then ∆ =⇒ −A↛. By Lemma 7.16(3), 0⃗ ∈ A(Tϕ, ∆). Now suppose ∃o ∈ A(Tϕ, ∆) : o ≤ vϕ. This implies o = 0⃗, so ∆ =⇒ −A↛ by Lemma 7.16(3). Hence ∆ |= ϕ.
• Let ϕ = ⟨a⟩ψ with a ∈ Act. Suppose ∆ |= ϕ. Then there is a ∆′ with ∆ =a⇒ ∆′ and ∆′ |= ψ. By induction, ∃o ∈ A(Tψ, ∆′) : o ≤ vψ. By Lemma 7.16(4), o ∈ A(Tϕ, ∆). Now suppose ∃o ∈ A(Tϕ, ∆) : o ≤ vϕ. This implies o(ω) = 0, so by Lemma 7.16(4) there is a ∆′ with ∆ =a⇒ ∆′ and o ∈ A(Tψ, ∆′). By induction, ∆′ |= ψ, so ∆ |= ϕ.
• Let ϕ = ϕ1 ∧ ϕ2 and suppose ∆ |= ϕ. Then ∆ |= ϕi for i = 1, 2 and hence, by induction, ∃oi ∈ A(Ti, ∆) : oi ≤ vi. Thus o := p·o1 + (1−p)·o2 ∈ A(Tϕ, ∆) by Lemma 7.16(5), and o ≤ vϕ. Now suppose ∃o ∈ A(Tϕ, ∆) : o ≤ vϕ. Then, using Lemma 7.16(5), o = p·o1 + (1−p)·o2 for certain oi ∈ A(Ti, ∆). Recall that T1, T2 are Ω-disjoint tests. One has oi ≤ vi for both i = 1, 2, for if oi(ω) > vi(ω) for some i = 1 or 2 and ω ∈ Ω, then ω must occur in Ti and hence cannot occur in T3−i. This implies v3−i(ω) = 0 and thus o(ω) > vϕ(ω), in contradiction with the assumption. By induction, ∆ |= ϕi for i = 1, 2, and hence ∆ |= ϕ.
• Let ϕ = ϕ1 p⊕ ϕ2. Suppose ∆ |= ϕ. Then there are ∆1, ∆2 ∈ D(sCSP) with ∆1 |= ϕ1 and ∆2 |= ϕ2 such that ∆ =⇒ p·∆1 + (1−p)·∆2. By induction, for i = 1, 2 there are oi ∈ A(Ti, ∆i) with oi ≤ vi. Hence, there are o′i ∈ A(T′i, ∆i) with o′i ≤ v′i. Thus o := p·o′1 + (1−p)·o′2 ∈ A(Tϕ, ∆) by Lemma 7.16(6), and o ≤ vϕ.
Now suppose ∃o ∈ A(Tϕ, ∆) : o ≤ vϕ. Then, by Lemma 7.19, there are q ∈ [0, 1] and ∆1, ∆2 ∈ D(sCSP) such that ∆ =⇒ q·∆1 + (1−q)·∆2 and o = q·o′1 + (1−q)·o′2 for certain o′i ∈ A(T′i, ∆i). Now ∀i : o′i(ωi) = v′i(ωi) = 1/2, so, using that T1, T2 are Ω-disjoint tests, 1/2·q = q·o′1(ω1) = o(ω1) ≤ vϕ(ω1) = p·v′1(ω1) = 1/2·p and likewise 1/2·(1−q) = (1−q)·o′2(ω2) = o(ω2) ≤ vϕ(ω2) = (1−p)·v′2(ω2) = 1/2·(1−p). Together, these inequalities say that q = p. Exactly as in the previous case one obtains o′i ≤ v′i for both i = 1, 2. Given that T′i = Ti 1/2⊕ ωi, using Lemma 7.16(5), it must be that o′i = 1/2·oi + 1/2·ω⃗i for some oi ∈ A(Ti, ∆i) with oi ≤ vi. By induction, ∆i |= ϕi for i = 1, 2, and hence ∆ |= ϕ. □

² However, because we employ state-based testing here, as opposed to action-based testing in [2], we translate the action-based test ω □ a.Tψ for the action modality ⟨a⟩ψ into the state-based test τ.ω □ a.Tψ.

Theorem 7.21 If ∆ ⊒Ωpmust Θ then ∆ ⊒F Θ.

Proof: Suppose ∆ ⊒Ωpmust Θ and ∆ |= ϕ for some ϕ ∈ F. Let Tϕ be a characteristic test of ϕ with target value vϕ. Then Proposition 7.20 yields ∃o ∈ A(Tϕ, ∆) : o ≤ vϕ, and hence, given that ∆ ⊒Ωpmust Θ, by the Smyth preorder we have ∃o′ ∈ A(Tϕ, Θ) : o′ ≤ vϕ. Thus Θ |= ϕ. □
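The test construction of Proposition 7.20 is directly implementable. The sketch below (our own tuple encodings for tests; the constructor names are ours) follows the bullet items of the proof, using a global counter for fresh success actions so that recursively generated tests are automatically Ω-disjoint; target values are finite dicts over the success actions they mention, all other components being 0. It consumes the tuple-encoded formulae of the earlier sketches.

```python
from itertools import count

_fresh = count()
def fresh_omega():
    return "w%d" % next(_fresh)

def mix(p, v1, v2):
    """p·v1 + (1−p)·v2, where v1 and v2 mention disjoint success actions."""
    out = {w: p * x for (w, x) in v1.items()}
    out.update({w: (1 - p) * x for (w, x) in v2.items()})
    return out

def char_test(phi, p=0.5):
    """Return (T_phi, v_phi); p stands in for the arbitrary probability
    allowed in the conjunction case of Proposition 7.20."""
    kind = phi[0]
    if kind == "top":                      # T := omega, v := unit vector
        w = fresh_omega()
        return ("succ", w), {w: 1.0}
    if kind == "div":                      # T := tau.omega, v := 0-vector
        return ("tau", ("succ", fresh_omega())), {}
    if kind == "ref":                      # T := external choice of a.omega
        w = fresh_omega()
        return ("ext", [("prefix", a, ("succ", w)) for a in sorted(phi[1])]), {}
    if kind == "may":                      # T := tau.omega [] a.T_psi
        T, v = char_test(phi[2])
        return ("ext", [("tau", ("succ", fresh_omega())),
                        ("prefix", phi[1], T)]), v
    if kind == "and":                      # T := T1 p(+) T2
        (T1, v1), (T2, v2) = char_test(phi[1]), char_test(phi[2])
        return ("pchoice", p, T1, T2), mix(p, v1, v2)
    if kind == "pchoice":                  # T := (T1 1/2(+) w1) |~| (T2 1/2(+) w2)
        q, (T1, v1), (T2, v2) = phi[1], char_test(phi[2]), char_test(phi[3])
        w1, w2 = fresh_omega(), fresh_omega()
        T1p, v1p = ("pchoice", 0.5, T1, ("succ", w1)), mix(0.5, v1, {w1: 1.0})
        T2p, v2p = ("pchoice", 0.5, T2, ("succ", w2)), mix(0.5, v2, {w2: 1.0})
        return ("int", T1p, T2p), mix(q, v1p, v2p)
    raise ValueError(kind)
```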

8 Simulations and may testing

In this section we follow the same strategy as for failure simulations and testing (Section 6), except that we restrict our treatment to full distributions: this is possible because partial distributions are not necessary for this case; and it is desirable because the approach becomes simpler as a result.

Definition 8.1 [Simulation preorder] Define ⊑S to be the largest relation in D1(S) × D1(S) such that if ∆ ⊑S Θ then whenever ∆ =α⇒ (Σi pi·∆′i), for finitely many pi with Σi pi = 1, there are Θ′i with Θ =α⇒ (Σi pi·Θ′i) and ∆′i ⊑S Θ′i for each i.

Note that, unlike in Definition 6.4, this summation cannot be empty. Again it is trivial to see that ⊑S is reflexive and transitive; and again it is sometimes easier to work with an equivalent formulation based on a state-level "simulation" defined as follows.

Definition 8.2 [Simulation] Define ⊲S to be the largest relation in S × D1(S) such that if s ⊲S Θ then whenever s −α→ ∆′ there is a Θ′ with Θ =α⇒ Θ′ and ∆′ ⊲S Θ′.

Definition 8.2 differs from the analogous Definition 6.20 in three ways: it is missing the clause for divergence, and that for refusal; and it is (implicitly) limited to =⇒-transitions that simulate by producing full distributions only. Without that latter limitation, any simulation relation could be scaled down uniformly without losing its simulation properties, for example allowing counter-intuitively a to be simulated by a 1/2⊕ ε.

Lemma 8.3 The above preorder and simulation are equivalent in the following sense: for distributions ∆, Θ we have ∆ ⊑S Θ just when there is a Θmatch with Θ =⇒ Θmatch and ∆ ⊲S Θmatch.
Proof: The proof is as for the failure case, except that in Theorem 6.21 we can assume total distributions, and so do not need the second part of its proof where divergence is treated. □

8.1 Soundness

In this section we prove that simulations are sound for showing that processes are related via the may-testing preorder. We assume initially that we are using only one success action ω, so that |Ω| = 1. Because we prune our computation structures before extracting values from them, we will be concerned mainly with ω-respecting structures, and for those we have the following.


Lemma 8.4 Let ∆ and Θ be two distributions. If ∆ is stable and ∆ ⊲S Θ, then V(∆) ≤Ho V(Θ).
Proof: We first show that if s is stable and s ⊲S Θ then V(s) ≤Ho V(Θ). Since s is stable, we have only two cases:
(i) s −↛. Here V(s) = {0}, and since V(Θ) is not empty we have V(s) ≤Ho V(Θ).
(ii) s −ω→ ∆′ for some ∆′. Here V(s) = {1}, and Θ =⇒ Θ′ −ω→ with V(Θ′) = {1}. By Lemma 6.35 specialised to full distributions, we have 1 ∈ V(Θ). Therefore V(s) ≤Ho V(Θ).
Now for the general case we suppose ∆ ⊲S Θ. Use Proposition 3.9 to decompose Θ into Σs∈⌈∆⌉ ∆(s)·Θs such that s ⊲S Θs for each s ∈ ⌈∆⌉, and recall each such state s is stable. From above we have that V(s) ≤Ho V(Θs) for those s, and so V(∆) = Σs∈⌈∆⌉ ∆(s)·V(s) ≤Ho Σs∈⌈∆⌉ ∆(s)·V(Θs) = V(Θ). □
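For reference, the Hoare order used here, and the Smyth order used in the must-case of Section 7, are easily stated over finite outcome sets; the sketch below uses our own dict representation of outcome vectors.

```python
# X ≤Ho Y iff every x in X is below some y in Y (may testing);
# X ≤Sm Y iff every y in Y is above some x in X (must testing).
def leq(o1, o2):
    """Pointwise order on outcome vectors (dicts over success actions)."""
    return all(o1.get(w, 0.0) <= o2.get(w, 0.0) for w in set(o1) | set(o2))

def leq_hoare(X, Y):
    return all(any(leq(x, y) for y in Y) for x in X)

def leq_smyth(X, Y):
    return all(any(leq(x, y) for x in X) for y in Y)

X = [{"w": 0.3}, {"w": 0.5}]
Y = [{"w": 0.5}, {"w": 0.9}]
print(leq_hoare(X, Y), leq_smyth(X, Y))   # True True
```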

Lemma 8.5 Let ∆ and Θ be distributions in an ω-respecting computation structure. If ∆ ⊲S Θ, then we have V(∆) ≤Ho V(Θ).

Proof: Since ∆ ⊲S Θ, we consider subdistributions ∆′′ with ∆ =⇒≻ ∆′′; by distillation of divergence (Lemma 5.8) we have full distributions ∆′ and ∆′1,2 and probability p such that ∆ =⇒ ∆′ = (∆′1 p⊕ ∆′2) and ∆′′ = p·∆′1 and ∆′2 =⇒ ε. There is thus a matching transition Θ =⇒ Θ′ such that ∆′ ⊲S Θ′. By Proposition 3.9, we can find distributions Θ′1, Θ′2 such that Θ′ = Θ′1 p⊕ Θ′2 and ∆′1,2 ⊲S Θ′1,2. Since ⌈∆′1⌉ = ⌈∆′′⌉ we have that ∆′1 is stable. It follows from Lemma 8.4 that V(∆′1) ≤Ho V(Θ′1). Thus we finish off with

V(∆′′)
=   V(p·∆′1)      (∆′′ = p·∆′1)
=   p·V(∆′1)      (linearity of V)
≤Ho p·V(Θ′1)      (above argument based on distillation)
=   V(p·Θ′1)      (linearity of V)
≤Ho V(Θ′)         (Θ′ = Θ′1 p⊕ Θ′2)
≤Ho V(Θ) .        (Lemma 6.35 specialised to full distributions)

Since ∆′′ was arbitrary, we have our result. □

Lemma 8.6 Let ∆ and Θ be distributions in an ω-respecting computation structure. If ∆ ⊑S Θ, then it holds that V(∆) ≤Ho V(Θ).
Proof: Suppose ∆ ⊑S Θ. By Lemma 8.3, there exists some Θmatch such that Θ =⇒ Θmatch and ∆ ⊲S Θmatch. By Lemmas 8.5 and 6.35 we obtain V(∆) ≤Ho V(Θmatch) ⊆ V(Θ). □

Theorem 8.7 If P ⊑S Q then P ⊑pmay Q.
Proof: We reason as follows.

P ⊑S Q
implies  [P |Act T] ⊑S [Q |Act T]           (the counterpart of Lemma 6.33 for simulation, for any test T)
implies  V([P |Act T]) ≤Ho V([Q |Act T])    ([·] is ω-respecting; Lemma 8.6)
iff      A(T, P) ≤Ho A(T, Q)                ((4))
iff      P ⊑pmay Q .                        (Definition 4.5)
□

Corollary 8.8 If P ⊑S Q then P ⊑Ωpmay Q.
Proof: Section 4.3.1 recalled our earlier result that Ω-testing is reducible to scalar testing. □

8.2 Completeness

Let L be the subclass of F obtained by skipping the div and ref(A) clauses. We write P ⊑L Q just when [P] |= ϕ implies [Q] |= ϕ for all ϕ ∈ L. We have the counterparts of Theorems 7.15 and 7.21, with similar proofs.

Theorem 8.9 In a finitary pLTS, ∆ ⊑L Θ if and only if ∆ ⊑S Θ.

Theorem 8.10 If P ⊑Ωpmay Q then P ⊑L Q.

Corollary 8.11 If P ⊑pmay Q then P ⊑S Q.
Proof: From Theorems 8.9 and 8.10 we know that if P ⊑Ωpmay Q then P ⊑S Q. Section 4.3.1 recalled our earlier result that Ω-testing is reducible to scalar testing. So the required result follows. □

9 Conclusion and Related Work

In this paper we continued our previous work [3, 4, 2] in our quest for a testing theory for processes which exhibit both nondeterministic and probabilistic behaviour. We have generalised our results in [2] of characterising the may preorder as a simulation relation and the must preorder as a failure-simulation relation, from finite processes to finitary processes. To do this it was necessary to investigate fundamental structural properties of derivation sets (finite generability) and similarities (infinite approximations), which are of independent interest. The use of Markov Decision Processes and Zero-One laws was essential in obtaining our results.
Segala [21] defined two preorders called trace distribution precongruence (⊑TD) and failure distribution precongruence (⊑FD). He proved that the former coincides with an action-based version of ⊑Ωpmay, and that for "probabilistically convergent" systems the latter coincides with an action-based version of ⊑Ωpmust. The condition of probabilistic convergence amounts in our framework to the requirement that for ∆ ∈ D1(S) and ∆ =⇒ ∆′ we have |∆′| = 1. In [15] it has been shown that ⊑TD coincides with a notion of simulation akin to ⊑S. Other probabilistic extensions of simulation occurring in the literature are reviewed in [3, 2].

References

[1] S.D. Brookes, C.A.R. Hoare & A.W. Roscoe (1984): A theory of communicating sequential processes. Journal of the ACM 31(3), pp. 560–599.
[2] Y. Deng, R.J. van Glabbeek, M. Hennessy & C.C. Morgan (2008): Characterising testing preorders for finite probabilistic processes. Logical Methods in Computer Science 4(4:4).
[3] Y. Deng, R.J. van Glabbeek, M. Hennessy, C.C. Morgan & C. Zhang (2007): Remarks on testing probabilistic processes. ENTCS 172, pp. 359–397.
[4] Y. Deng, R.J. van Glabbeek, C.C. Morgan & C. Zhang (2007): Scalar outcomes suffice for finitary probabilistic testing. In Proc. ESOP'07, LNCS 4421, Springer, pp. 363–378.
[5] R. De Nicola & M. Hennessy (1984): Testing equivalences for processes. Theoretical Computer Science 34, pp. 83–133.
[6] R.J. van Glabbeek (1993): The linear time – branching time spectrum II; the semantics of sequential systems with silent moves. In Proc. CONCUR'93, LNCS 715, Springer, pp. 66–81.
[7] M. Hennessy (1988): An Algebraic Theory of Processes. MIT Press.
[8] M. Hennessy & R. Milner (1985): Algebraic Laws for Nondeterminism and Concurrency. Journal of the ACM 32(1), pp. 137–161.
[9] C.A.R. Hoare (1985): Communicating Sequential Processes. Prentice-Hall.
[10] C. Jones (1990): Probabilistic Non-determinism. Ph.D. Thesis, University of Edinburgh.
[11] B. Jonsson, C. Ho-Stuart & Wang Yi (1994): Testing and refinement for nondeterministic and probabilistic processes. In Proc. FTRTFT'94, LNCS 863, Springer, pp. 418–430.
[12] B. Jonsson & Wang Yi (1995): Compositional testing preorders for probabilistic processes. In Proc. LICS'95, IEEE Computer Society Press, pp. 431–441.
[13] B. Jonsson & Wang Yi (2002): Testing preorders for probabilistic processes can be characterized by simulations. Theoretical Computer Science 282(1), pp. 33–51.
[14] S. Lipschutz (1965): Schaum's outline of theory and problems of general topology. McGraw-Hill.
[15] N. Lynch, R. Segala & F.W. Vaandrager (2007): Observing Branching Structure through Probabilistic Contexts. SIAM Journal on Computing 37(4), pp. 977–1013.
[16] A.K. McIver & C.C. Morgan (2005): Abstraction, Refinement and Proof for Probabilistic Systems. Springer.
[17] R. Milner (1989): Communication and Concurrency. Prentice-Hall.
[18] E.-R. Olderog & C.A.R. Hoare (1986): Specification-oriented semantics for communicating processes. Acta Informatica 23, pp. 9–66.
[19] M. Puterman (1994): Markov Decision Processes. Wiley.
[20] R. Segala (1995): Modeling and Verification of Randomized Distributed Real-Time Systems. PhD thesis, MIT.
[21] R. Segala (1996): Testing probabilistic automata. In Proc. CONCUR'96, LNCS 1119, Springer, pp. 299–314.
[22] Wang Yi & K.G. Larsen (1992): Testing probabilistic and nondeterministic processes. In Proc. PSTV'92, IFIP Transactions C-8, North-Holland, pp. 47–61.

A Assorted counter-examples

A.1 Finite state space is necessary; otherwise. . .

A.1.1 Distillation of divergence
Distillation of divergence is the notion that if a system contains any divergence, no matter how small, it can be "distilled out" into an equivalent presentation in which states either wholly converge or wholly diverge; the relevant lemma is Lemma 5.8. It's used to show the equivalence of failure simulation and failure e-simulation (Theorem 6.21), and to justify the full-distribution approach in the may-case (Section 8).
Example 3.16 is an infinite-state system over states s2,3,··· where the probability of convergence is 1/k from any state sk, thus a situation where distillation of divergence fails because all the states partially diverge, yet there is no single state which wholly diverges.
In spite of that, there is an infinite-state version of Lemma 5.6 (the underlying fact on which Lemma 5.8 depends). In it the assumption is that the partial divergences are bounded away from zero, i.e. there is some ε > 0 such that every state converges with probability at least ε. But Example 3.16 fails this assumption, because the 1/k probabilities of convergence become arbitrarily small.

A.1.2 Transitivity of failure simulation
I think Section A.2.4 below can be adapted to provide a counter-example here: replace the intermediate infinitely branching but finite-state process (28) by the finitely branching but infinite-state process of Example 3.16.

A.1.3 Equivalence of finite and infinite interpolation
Let the state space be the positive integers and consider the infinite interpolant Σi≥1 i/2^i of the set { i | i ≥ 1 }. It cannot be realised by any finite interpolation, thus invalidating Lemma B.1 when the state space is infinite.

A.1.4 Soundness of failure simulation
See Section A.2.5 below.

A.1.5 Pre-congruence of simple failure similarity
We use a modification of Example 3.16 which as before has states sk with k ≥ 2, but we add an extra a-looping state sa to give all together the system

sk −τ→ (sa 1/k²⊕ sk+1) for k ≥ 2    and    sa −a→ sa.

There is a failure simulation sk ⊲sFS (sa 1/k⊕ 0), because the move sk −τ→ (sa 1/k²⊕ sk+1) can be matched by a move to (sa 1/k²⊕ (sa 1/(k+1)⊕ 0)), where the τ-simulating move of sa would be the identity; this simplifies to just (sa 1/k⊕ 0) again — i.e. a sufficient resolution of =⇒.
Now s2 |a sa diverges even though s2 itself does not, and (recall from above) we have s2 ⊲sFS (sa 1/2⊕ 0). Yet (sa 1/2⊕ 0) |a sa does not diverge, which is a contra-indication for the truth of Lemma 6.31 unless it uses finiteness of the state space somewhere. The role of the infinitely many states seems to be to escape distillation of divergence in the system.
To get this kind of counter-example, we want to "trick" our constructions into allowing the failure-simulation of an unboundedly deep τ-tree by deadlock. But in a finite-state system such a tree must diverge either with probability one (in which case the divergence clause of ⊲sFS kicks in, and prevents the tricky simulation because deadlock does not diverge), or with probability zero (in which case it has no substantive effect anyway). Example 3.16 uses infinitely many states to avoid the two extremes.
Note that this counter-example does not go through if we use failure similarity ⊲FS instead of simple failure similarity ⊲sFS, since s2 ⋪FS (sa 1/2⊕ 0) — the former has s2 =⇒ sa 1/2⊕ ε, but this move cannot be matched by sa 1/2⊕ 0.


A.2 Finite branching is necessary; otherwise. . .
In the non-probabilistic world, it's not possible to have infinite branching if the state space is finite; so finite branching, as a restriction, is imposed only when the state space is infinite. In the probabilistic world, even a three-element state space allows infinite branching of an associated pLTS: consider the pLTS containing transitions s1 −τ→ (s2 p⊕ s3) for all p in some infinite set. Here are some examples of what happens if we don't impose finite branching as well as finiteness of the state space.

A.2.1 Closure of derivatives
An obvious outcome is that we lose the guaranteed closure of derivatives: just take the example above where the p's come from some open set like (0, 1).

A.2.2 Distillation of divergence
We also lose distillation of divergence without finite branching. The essence of the counter-example here is that we collapse all the sk-states of Example 3.16 onto a single state, which then becomes infinitely branching: we take a system comprising just two states and a k-indexed set of transitions

s −τ→k (0 1/k²⊕ s) for k ≥ 2,   (28)

as illustrated in Figure 7. As we have seen, by taking transitions s −τ→K · −τ→K+1 · −τ→K+2 ··· we can achieve divergence with probability 1 − 1/K for arbitrary K ≥ 2; but (contradicting distillation) neither of the two states s, 0 wholly diverges.
The zero-one law Lemma 5.6 is not affected by infinite branching, because it is restricted to the deterministic case (i.e. the case of no branching at all). What fails is the combination of a number of deterministic distillations to make a non-deterministic one, in Lemma 5.8: it depends on Lemma 5.3, which in turn requires finite branching.

For those with some knowledge of sequential probabilistic/demonic semantics, there is an instructive comparison to be made. Straightforward generalisation of the invariant/variant principle for loop correctness to probabilistic programs allows a direct proof of the zero-one law in the fully nondeterministic case [16, Sec. 2.6], even in an infinite state space, yet there seems to be no explicit mention of finite branching. This is explained by the fact that probabilistic/demonic sequential programs are guaranteed to be continuous (as predicate transformers) only when finite branching is imposed (by analogy with the well known connection between continuity and bounded nondeterminism for standard programs); and failure of that continuity invalidates the soundness proof of the loop-correctness rule referred to above, since it affects the definition of loop itself (ω-limit, or not). Finally –pursuing this point– note that deterministic (but probabilistic) outcomes with infinite support are unproblematic: for example the sequential program b, n := true, 0; while b do n := n+1; b := true 1/2 ⊕ false end is perfectly legitimate (i.e. continuous), even though it has infinitely many potential final values of n.
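The escape probabilities tabulated in Figure 7 can be checked directly: after the last loop n, the mass that has escaped to 0 from petal K is 1 − Π over k = K..n of (1 − 1/k²), and the telescoping product gives a limit of 1/K. A short sketch reproducing the table (assuming Python 3.8+ for math.prod):

```python
# Reproduces the table in Figure 7 for system (28), starting at petal K = 3.
from math import prod

def escape(K, n):
    """Probability of having escaped to 0 after using loops K..n."""
    return 1.0 - prod(1.0 - 1.0 / (k * k) for k in range(K, n + 1))

for n in (3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 30, 40):
    print(n, round(escape(3, n), 2))
# 0.11, 0.17, 0.20, 0.22, ..., 0.32 -- matching the table; the limit is
# 1/K = 1/3, i.e. divergence with probability 1 - 1/K = 2/3 as in the text.
```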

A.2.3 Equivalence of failure simulation and extended failure simulation
By relating both s and 0 (simulated states) to 0 (simulating state) we can see that s from (28) above is failure-simulated by just 0 itself. Yet we have for example s =⇒ 0 1/2⊕ ε, a move which cannot be matched by any extended failure-simulating =⇒-move from 0 — and that means that Theorem 6.21 fails in this case. As we'd expect, its proof refers to Lemma 5.8 which in turn refers to Lemma 5.3 where finite branching is assumed.

A.2.4 Transitivity of failure simulation
Define a process t0 −τ→ (0 1/2⊕ t1) and t1 −τ→ t1, and argue that t0 ⊲sFS s from (28) above; but also –as observed there– we have s ⊲sFS 0. Yet we do not have t0 ⊲sFS 0.

A.2.5 Soundness of failure simulation
Given Section A.2.4, failure simulation and the failure-testing preorder cannot coincide for infinitary processes, since the preorder is transitive trivially. Is it soundness, or completeness that fails? If we compare (28) with 0, we have failure simulation of the former by the latter (Appendix A.2.3). But the test τ.ω gives (0, 1] as an outcome set for the former and just {1} for the latter, breaking the preorder.


[Figure 7 (diagram omitted): Infinitely branching flower, from Section A.2.2. There are two states s (large black node) and 0 (small black nodes); the white nodes indicate probabilistic choice; and all transitions are internal. To diverge from s with probability 1 − 1/K, start at "petal" K and take successive τ-loops anti-clockwise from there. Yet, although divergence with arbitrarily high probability is present, complete probability-1 divergence is nowhere possible. Either infinite states or infinite branching is necessary for this anomaly.

Example starting from k = 3:
Loop last used:              3     4     5     6     7     8     9     10    ···15  ···20  ···30  ···40
Overall escape probability:  0.11  0.17  0.20  0.22  0.24  0.25  0.26  0.27  0.29   0.30   0.31   0.32 ]
A.2.6 Coincidence of failure simulation and the limit of its approximation
Rob used the following example to show the necessity of the finite-branching condition. Consider a pLTS with five states s, t, u, v, 0 whose transitions are
• s −a→ 0 1/2⊕ s
• t −a→ 0 and t −a→ t
• u −a→ u
• v −τ→ u p⊕ t for all p ∈ (0, 1).
This is a finite-state but not finitely branching system, due to the infinite branching in v. We have that s ⊲kFS v for all k ≥ 0, but we do not have s ⊲sFS v.
We first observe that s ⊲sFS v does not hold, because s will eventually deadlock with probability 1, whereas a fraction of v will go to u and never deadlock.
We now show that s ⊲kFS v for all k ≥ 0. For any k we start the simulation by choosing the move v −τ→ (u 1/2^k⊕ t). By induction on k we show that

s ⊲kFS (u 1/2^k⊕ t).   (29)

The base case k = 0 is trivial. So suppose we already have (29). We now show that s ⊲k+1FS (u 1/2^(k+1)⊕ t). Neither s nor t nor u can diverge or refuse {a}, so the only relevant move is the a-move. We know that s can do the move s −a→ 0 1/2⊕ s. This can be matched up by (u 1/2^(k+1)⊕ t) −a→ (0 1/2⊕ (u 1/2^k⊕ t)).

B Technical lemmas and proofs for Section 3

B.1 Infinitary properties of lifting

Lemma B.1 [Infinite interpolation in finite-dimensional space] In finite-dimensional Euclidean space, infinite interpolation reduces to finite interpolation.
Proof: Suppose we have a set X in N-dimensional Euclidean space and a point x such that x is an infinite interpolation of points in X but is not a finite interpolation of any such points. Since the set of finite interpolants of X is convex (in the usual sense), the point x can therefore be non-strictly separated from it: there is a hyperplane H (thus of dimension N−1) containing x such that all the finite interpolants of X lie non-strictly on one side of H. Since x is however an (infinite) interpolant of X, and is in H with all of X on one side of H, in fact x must be an (infinite) interpolant of X ∩ H. But x is not, of course, a finite interpolant of X ∩ H (since otherwise it would have been a finite interpolant of X in the first place). Since X ∩ H is a space of (at least) one dimension lower than our original, the argument reduces eventually to a simple line, where it holds trivially. □

Lemma B.2 [Infinitary linearity of lifting] Let R be a relation between S and D(S) and, as usual, let R̄ be its lifted version, thus a relation in D(S) × D(S). Let I be an index set, possibly infinite. Then ∆i R̄ Θi for all i ∈ I implies that (Σi pi·∆i) R̄ (Σi pi·Θi), where Σi pi ≤ 1.

Proof: This is immediate from the current definition of lifting.
We note first that s R̄ Θ just when Θ = Σj pj·Θj for j ranging over a finite index set J such that s R Θj for each index j and 1 = Σj pj. Because we are in a finite state space, however, we can by Lemma B.1 allow J to be infinite.
Now more generally we have that ∆ R̄ Θ just when Θ = Σs Σjs pjs·Θjs for js ranging over a (possibly infinite) index set Js, and s itself ranging over the support of ∆, such that s R Θjs for each js and ∆(s) = Σjs pjs. The result then follows by straightforward (if intricate) rearrangement of absolutely convergent series:

∆i R̄ Θi for all i
iff      for each i: Θi = Σs Σj∈J(i,s) pj·Θj with s R Θj for each j ∈ J(i,s), and ∆i(s) = Σj∈J(i,s) pj for each s in the support of ∆i
implies  Σi pi·Θi = Σs Σi Σj∈J(i,s) pi·pj·Θj with s R Θj, and (Σi pi·∆i)(s) = Σi Σj∈J(i,s) pi·pj for each s in the support of some ∆i   (some cheating here. . . )
implies  Σi pi·Θi = Σs Σjs pjs·Θjs with s R Θjs, and (Σi pi·∆i)(s) = Σjs pjs for each s in the support of Σi pi·∆i   (rearrangement; suitable definitions of pjs and Θjs)
implies  (Σi pi·∆i) R̄ (Σi pi·Θi) . □
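The weighted recombinations in this proof are instances of the expectation Exp∆(f) of a choice function f, used from Proposition 3.8 onwards; the small Python sketch below (dict representations are our own) computes that expected value.

```python
# Exp_Delta(f) = sum over s of Delta(s)·f(s): mix the target distributions
# chosen by a choice function f with Delta's weights.
def exp_delta(delta, f):
    out = {}
    for s, p in delta.items():
        for t, q in f[s].items():
            out[t] = out.get(t, 0.0) + p * q
    return out

delta = {"s": 0.5, "t": 0.5}
f = {"s": {"u": 1.0}, "t": {"u": 0.5, "v": 0.5}}   # one target per state
print(exp_delta(delta, f))                          # {'u': 0.75, 'v': 0.25}
```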

B.2 Elementary properties of weak derivations

Lemma B.3 Suppose p1 + p2 ≤ 1. Then
(i) ∆1,2 =⇒ ∆′1,2 implies (p1·∆1 + p2·∆2) =⇒ (p1·∆′1 + p2·∆′2).
(ii) If (p1·∆1 + p2·∆2) =⇒ ∆′ then ∆′ = p1·∆′1 + p2·∆′2 for subdistributions ∆′1,2 such that ∆1,2 =⇒ ∆′1,2.

Proof: For (i) we note that lifted transitions −τ→ have that property directly from Clause (2) of Definition 3.2. Thus the structures whose existence is implied by Definition 3.12 for ∆1 =⇒ ∆′1 and ∆2 =⇒ ∆′2 separately can be added with the p1, p2 scaling given to form a single composite structure establishing (p1·∆1 + p2·∆2) =⇒ (p1·∆′1 + p2·∆′2).
For (ii) we use an inductive argument, here presented informally. To avoid confusion of subscripts we will effect some renaming and simplification in the demonstrandum, making it read

If Γ + Λ =⇒ Π then Π = ΠΓ + ΠΛ with Γ, Λ =⇒ ΠΓ,Λ.   (30)

(The p1,2 will be re-introduced further below.)
Now from Definition 3.12 we have Γ + Λ = Π0 = Π×0 + Π→0 for some Π×0, Π→0 with, further, Π→0 −τ→ Π1 for some Π1. Define

Γ→ := Γ ⊓ Π→0,   Γ× := Γ − Γ→,   Λ× := Λ ⊓ Π×0,   Λ→ := Λ − Λ×,   (31)

and then check these elementary facts: that Γ× + Γ→ = Γ and Λ× + Λ→ = Λ, and that all the introduced subdistributions are in their proper ranges. What remains is to show that they combine properly, and for that we fix a state s and distinguish two cases: either (a) Π→0.s ≥ Γ.s or (b) Π→0.s ≤ Γ.s. In Case (a) the definitions (31) simplify to Γ→.s, Γ×.s, Λ×.s, Λ→.s := Γ.s, 0, Π×0.s, (Λ.s − Π×0.s), whence immediately Γ→.s + Λ→.s = Π→0.s and Γ×.s + Λ×.s = Π×0.s. Case (b) is similar.
Because Π→0 is −τ→-enabled, we see that s −τ↛ implies Π→0.s = 0, whence also Γ→.s = Λ→.s = 0, so that both Γ→, Λ→ are −τ→-enabled also. Thus we appeal to Proposition 3.9 to find Γ1, Λ1 with Γ→ −τ→ Γ1 and Λ→ −τ→ Λ1 and Π1 = Γ1 + Λ1. Being now in the same position with Π1 as we were with Π0, we can continue this procedure (here the informal induction) to induce derivation structures for Γ, Λ separately that establish (30) when added together as in Part (i) of this lemma.
Finally, we let Γ, Λ, Π be p1·∆1, p2·∆2, ∆′, and scale the induced derivation structures up by 1/p1, 1/p2 respectively. Because later subdistributions in those structures can never be bigger than earlier ones, the proper bounding of ∆1,2 themselves guarantees that the up-scaling does not make any subsequent subdistributions too big. □

Theorem B.4 [Transitivity of =⇒]

In a finite-state pLTS, if ∆ =⇒ Θ and Θ =⇒ Λ then ∆ =⇒ Λ.

Proof: By definition ∆ =⇒ Θ means that some ∆k, ∆×k, ∆→k exist for all k ≥ 0 such that

∆ = ∆0,   ∆k = ∆×k + ∆→k,   ∆→k −τ→ ∆k+1,   Θ = Σ∞k=0 ∆×k.   (32)

Since Θ = ∆×0 + Σk≥1 ∆×k and Θ =⇒ Λ, by Theorem 3.18 there are Λ0, Λ≥1 such that

∆×0 =⇒ Λ0,   Σk≥1 ∆×k =⇒ Λ≥1,   Λ = Λ0 + Λ≥1.

Using Theorem 3.18 again, we have Λ1, Λ≥2 such that

∆×1 =⇒ Λ1,   Σk≥2 ∆×k =⇒ Λ≥2,   Λ≥1 = Λ1 + Λ≥2,

thus in combination Λ = Λ0 + Λ1 + Λ≥2. Continuing this process we have that

∆×k =⇒ Λk,   Σj>k ∆×j =⇒ Λ≥k+1,   Λ = Σkj=0 Λj + Λ≥k+1   (33)

for all k ≥ 0. Lemma 3.5 and Proposition 3.19 ensure that ∆ =⇒ Θ implies |∆| ≥ |Θ| for any subdistributions ∆ and Θ, and therefore that |Σj>k ∆×j| ≥ |Λ≥k+1| for all k ≥ 0. But since Θ = Σ∞k=0 ∆×k from (32), we know that the tail sum Σj>k ∆×j converges to ε when k approaches ∞, and therefore that limk→∞ Λ≥k+1 = ε. Thus by taking that limit we conclude that

Λ = Σ∞k=0 Λk.   (34)

Now for each k ≥ 0, we know that ∆×k =⇒ Λk gives us some ∆kl, ∆×kl, ∆→kl for l ≥ 0 such that

∆×k = ∆k0,   ∆kl = ∆×kl + ∆→kl,   ∆→kl −τ→ ∆k,l+1,   Λk = Σl≥0 ∆×kl.   (35)

Therefore we can put all this together with

Λ = Σ∞k=0 Λk = Σk,l≥0 ∆×kl = Σi≥0 ( Σk,l|k+l=i ∆×kl ),   (36)

where the last step is a straightforward diagonalisation. Now from the decompositions above we re-compose an alternative trajectory of ∆′i's to take ∆ via =⇒ to Λ directly. Define

∆′i = ∆′→i + ∆′×i,   ∆′→i = ( Σk,l|k+l=i ∆→kl ) + ∆→i,   ∆′×i = Σk,l|k+l=i ∆×kl,   (37)

so that from (36) we have immediately that

Λ = Σi≥0 ∆′×i.   (38)

We now show that
(i) ∆ = ∆′0
(ii) ∆′→i −τ→ ∆′i+1
from which, with (38), we will have ∆ =⇒ Λ as required.
For (i) we observe that

∆ = ∆0                                                        (32)
  = ∆×0 + ∆→0                                                 (32)
  = ∆00 + ∆→0                                                 (35)
  = ∆×00 + ∆→00 + ∆→0                                         (35)
  = ( Σk,l|k+l=0 ∆×kl ) + ( Σk,l|k+l=0 ∆→kl ) + ∆→0           (index arithmetic)
  = ∆′×0 + ∆′→0                                               (37)
  = ∆′0.                                                      (37)

For (ii) we observe that

∆′→i = ( Σk,l|k+l=i ∆→kl ) + ∆→i                                                        (37)
 −τ→ ( Σk,l|k+l=i ∆k,l+1 ) + ∆i+1                                                       ((32), (35), Proposition 3.9)
   = ( Σk,l|k+l=i (∆×k,l+1 + ∆→k,l+1) ) + ∆×i+1 + ∆→i+1                                 ((32), (35))
   = ( Σk,l|k+l=i ∆×k,l+1 ) + ∆×i+1 + ( Σk,l|k+l=i ∆→k,l+1 ) + ∆→i+1                    (rearrange)
   = ( Σk,l|k+l=i ∆×k,l+1 ) + ∆i+1,0 + ( Σk,l|k+l=i ∆→k,l+1 ) + ∆→i+1                   ((35))
   = ( Σk,l|k+l=i ∆×k,l+1 ) + ∆×i+1,0 + ∆→i+1,0 + ( Σk,l|k+l=i ∆→k,l+1 ) + ∆→i+1        ((35))
   = ( Σk,l|k+l=i+1 ∆×kl ) + ( Σk,l|k+l=i+1 ∆→kl ) + ∆→i+1                              (index arithmetic)
   = ∆′×i+1 + ∆′→i+1                                                                    (37)
   = ∆′i+1,                                                                             (37)

which concludes the proof. □

B.3 Structural properties of weak derivations

Definition B.5 [Reward function] Let a reward function be a function $ : S → [−1, 1] from the state space into the real interval [−1, 1].³

³ In later sections we will restrict reward functions to the non-negatives [0, 1]; but here we need the more general range. (Maybe more explanations for the reason of needing this general range?)

Define empty suprema over sets of reward-like values to be −∞. This is only a convenience in the middle of calculations; the infinities always disappear in the end. Write $.∆ for the expected value of reward function $ over distribution ∆. Write (s =⇒) for { ∆′ | s =⇒ ∆′ } and write W⊓.$.s for ⊓{ $.∆′ | s =⇒ ∆′ }. Write (s =⇒pp) for the single subdistribution that results from using policy pp to construct (s =⇒); note that (=⇒pp) is a function (and is total), whereas (=⇒) is in general a relation (also total). This function is made precise in Definition B.8 below.

Lemma B.6 [Linearity of W⊓ and W⊔] For any reward function $ the associated W⊓.$ and W⊔.$ are linear.

Proof: We prove the binary case for W⊓; extension to any finite linear combination is straightforward, and the argument for W⊔ is the same. Suppose ∆ = ∆1 p⊕ ∆2 for some p and ∆{1,2}. Define m{1,2} := W⊓.$.∆{1,2} respectively. Then for any ε > 0 there are ∆′{1,2} with ∆{1,2} =⇒ ∆′{1,2} and $.∆′{1,2} ≤ m{1,2} + ε. By linearity of =⇒ we therefore have ∆′ := ∆′1 p⊕ ∆′2 with ∆ =⇒ ∆′ and $.∆′ ≤ (m1 p⊕ m2) + ε. Since ε is arbitrary, the result follows. □

Our overall aim is to establish that every element in (s =⇒) is the interpolation of a finite number of static derivatives ∆′i, that is in each case s =⇒ppi ∆′i. In order to do that, however, we must introduce the notion of discounted derivatives.

Definition B.7 [Discounted derivatives] Define s =⇒δ ∆′ for discount 0 ≤ δ ≤ 1 by analogy with our earlier Definition 3.12 of derivative, except that each −τ→-move discounts its outcome by δ. That is, we revise that earlier definition by including multiplications by δ at the appropriate points:

∆ = ∆→0 + ∆×0                    (the × component stops "here,")
δ·∆→0 −τ→ ∆→1 + ∆×1              (but the → component moves on, discounted by δ)
...
δ·∆→k −τ→ ∆→k+1 + ∆×k+1
...
In total: ∆′ = Σ∞k=0 ∆×k          (finally, all the stopped components are summed).

Then we call ∆′ := Σ∞k=0 ∆×k a δ-discounted derivative of ∆, and write ∆ =⇒δ ∆′ to mean that ∆ can make a weak δ-discounted move to a δ-discounted derivative ∆′.

Note that (=⇒1) and (=⇒) agree trivially. Finally, we combine the two notions of static policy and discount:

Definition B.8 [Discounted SDP-derivatives] Define s =⇒δ,pp ∆′ as follows:

∆0 = ∆
∆×k.s = 0 if pp is defined at s, else ∆k.s
∆k+1 = δ · Σ{ ∆k(s)·pp(s) | s ∈ ⌈∆k⌉, pp defined at s }

Then, as usual, we sum the stopped distributions to define ∆′ := Σ∞k=0 ∆×k.
Note that (s =⇒pp), which we defined informally above, is given by (s =⇒1,pp).

Definition B.9 [Discounted payoff] For reward function $ and discount δ define the discounted maximising payoff function Wδ⊔.$.s := ⊔{ $.∆′ | s =⇒δ ∆′ }.

The payoff function Wδ⊔, applied to s, thus gives the supremal expected reward determined by $ over all distributions reached by starting at s and using =⇒δ. Pre-calculating this function is the key technique in defining important policies: for us, they will be the maximising ones. For the moment we will just show that the function Wδ⊔.$ (over S) is a fixed point.

Lemma B.10 [Wδ⊔.$ as a fixed point] In the usual context (suitable S, δ, $, underlying pLTS and s ∈ S) we have

Wδ⊔.$.s = $s ⊔ δ·⊔{ Wδ⊔.$.∆′ | s −τ→ ∆′ },

that is, Wδ⊔.$ for any δ, $ (including δ = 1) is a fixed point of the function λW.(λs. $s ⊔ δ·⊔{ W.∆′ | s −τ→ ∆′ }). Note that if s −τ↛ on the right-hand side (for some argument) then the supremum is empty, giving −∞, as we mentioned above; multiplied by δ it is still −∞; then ⊔'d with $s the −∞ disappears. (Doing this avoids a case analysis in at least two places.)

Proof: We rely on splitting =⇒δ up in a "first one step, and then the rest" style. This gives us first

(s =⇒δ) = { ∆×0 + δ·∆′ | s = ∆×0 + ∆→0 ∧ ∆→0 −τ→ ∆1 ∧ ∆1 =⇒δ ∆′ for some ∆×0, ∆→0, ∆1 },

which allows us to calculate further as follows:

Wδ⊔.$.s
= ⊔{ $.∆′ | s =⇒δ ∆′ }
= ⊔{ $.(∆×0 + δ·∆′) | s = ∆×0 + ∆→0 ∧ ∆→0 −τ→ ∆1 ∧ ∆1 =⇒δ ∆′ for some ∆→0, ∆1 }   (definition above)
= ⊔{ $.∆×0 + δ·$.∆′ | s = ∆×0 + ∆→0 ∧ ∆→0 −τ→ ∆1 ∧ ∆1 =⇒δ ∆′ for some ∆→0, ∆1 }
= ⊔{ $.∆×0 + δ·⊔{ $.∆′ | ∆1 =⇒δ ∆′ } | s = ∆×0 + ∆→0 ∧ ∆→0 −τ→ ∆1 for some ∆→0, ∆1 }
= ⊔{ $.∆×0 + δ·⊔{ Wδ⊔.$.∆1 | s = ∆×0 + ∆→0 ∧ ∆→0 −τ→ ∆1 for some ∆→0 } }   (lift Wδ⊔.$; linearity of =⇒δ)
= ⊔{ $s p⊕ δ·⊔{ Wδ⊔.$.∆1 | s −τ→ ∆1 } for some 0 ≤ p ≤ 1 }   (s can be split only into s p⊕ s)
= $s ⊔ δ·⊔{ Wδ⊔.$.∆1 | s −τ→ ∆1 },   (⊔ over interpolation of scalars must be one or the other)

as required. □
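For δ < 1 the function of which Wδ⊔.$ is a fixed point is a contraction (with factor δ), so the fixed point is unique and computable by standard value iteration, as in the following sketch (states, rewards, transitions and the convergence threshold are illustrative choices of ours):

```python
# Value iteration for the discounted payoff of Lemma B.10.
# tau[s] lists the distributions D' (as dicts) with s -tau-> D'.
def w_sup(states, tau, reward, delta=0.9, eps=1e-9):
    W = {s: reward[s] for s in states}
    while True:
        W2 = {}
        for s in states:
            branches = [sum(p * W[t] for (t, p) in d.items()) for d in tau[s]]
            # $s join delta·join{W.D' | s -tau-> D'}; an empty join
            # contributes nothing, matching the -infinity convention above.
            W2[s] = max([reward[s]] + [delta * b for b in branches])
        if max(abs(W2[s] - W[s]) for s in states) < eps:
            return W2
        W = W2

states = ["s", "t"]
tau = {"s": [{"t": 1.0}], "t": []}     # s -tau-> t; t is stable
reward = {"s": 0.0, "t": 1.0}
print(w_sup(states, tau, reward))      # {'s': 0.9, 't': 1.0}
```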

Given that Lemma B.10 expresses Wδ⊔ .$ as a fixed point, and works for W1⊔ .$ in particular, one could ask why we bother with the δ. The answer is that in the δ=1 case it’s not clear which fixed point we have, whereas for δ