Embedding Finite Automata within Regular Expressions

Shoham Ben-David¹, Dana Fisman¹,², Sitvanit Ruah¹

¹ IBM Haifa Research Lab
² Weizmann Institute of Science

Category: Technical Paper

Abstract. Regular expressions and their extensions have become a major component of industry-standard specification languages such as PSL/Sugar [2]. The model checking procedure for regular-expression-based formulas, as well as for many LTL and CTL formulas, involves constructing an automaton which runs in parallel with the model. In this paper we re-examine this construction. Instead of directly translating a regular expression into an automaton, as is traditionally done, we propose an algorithm which allows an intermediate representation mixing both regular expressions and automata. This representation can be thought of as plugging an automaton inside a regular expression, to replace an existing sub-expression. In order to be verified, the intermediate representation is then translated into another automaton, resulting in a set of automata running in parallel. A key feature of this algorithm is that the plug-in automaton is independent of the regular expression from which it originated, and thus can be used in several different properties. The contribution of our work is manifold, as demonstrated in the paper. It allows modularity and flexibility of the automata construction, and can increase expressiveness when SEREs are mixed with CTL. In many cases it significantly reduces the size of the automata built for formulas, thus reducing the overall run time of verification.

1 Introduction

Symbolic model checking has been found extremely efficient in the verification of hardware designs, and has been widely adopted in industry in recent years. While traditional model checkers ([13]) used the temporal logics CTL or LTL as their specification language, contemporary industrial languages have sought ways to make the specification language easier to learn and use. The industry-standard language PSL/Sugar [2], as well as other industry-oriented languages (e.g. [4]), augment the logic with the use of Extended Regular Expressions (EREs, or SEREs using the formulation of [2]). A SERE specification can be viewed as a sequence of Boolean events describing a desirable behavior of the model. For example, the SERE formula ϕ = {req·¬ack∗·ack}! asserts that on all execution paths of the model, req is active on the first cycle, ack is then inactive for zero or more cycles, and then ack becomes active. Similarly, SEREs can be used to describe an undesirable behavior of the model. The formula not {req·¬ack∗·ack}! asserts that there does not exist an execution path on which req is active on the first cycle, ack is then inactive for zero or more cycles, and then ack becomes active.


In this paper we consider formulas given as not SERE!. These formulas have a special importance for two main reasons. The first is that a large subset of LTL, CTL and other SERE-based properties can be automatically reduced to an equivalent formula of the form not r!¹. The second reason is the efficient model checking methods they enjoy, as explained below. To conclude that a not r! formula does not hold in the model, it is sufficient to find one execution path of the model satisfying r. This allows a not r! formula to be modelled by a non-deterministic finite automaton (NFA) N_r, which accepts sequences satisfying r, and which is linear in the size of r. Running it together with the model, we then verify the invariant property AG ¬bad, where bad is a boolean expression asserting that N_r is in an accepting state. The reduction to an invariant property is important, since invariant properties are easier to verify by different model checking engines [7]. The efficient verification algorithm, together with the wide variety of properties which can be transformed into not SERE! formulas, have made them the major translation path of the IBM model checking toolset RuleBase [5] and the topic of this paper.

The translation of a not SERE! formula into an NFA is re-examined in this paper. Instead of directly translating a formula into an automaton, as is traditionally done, we propose an embedding algorithm which allows the translation to be modular, by allowing an intermediate representation of the formula which mixes both SERE and NFA. Let r be a SERE and s a sub-SERE of r. We show a process in which s is plugged out of r, a separate side-NFA is built for it, and a simpler SERE referring to the side-NFA is plugged into r, replacing s. A key feature of this algorithm is the fact that the side-NFA built is independent of the SERE it originated from. This allows the side-NFA to be plugged into any other SERE when appropriate.

The use of the embedding algorithm has several advantages, on which we elaborate in this paper. It allows modularity of the automata construction, as automata previously built can be plugged into a more complex SERE. The modularity leads to flexibility in the way the automaton is constructed, since different parts of the SERE can use different translations. It can increase expressiveness when a not SERE! property is translated to CTL. Since SEREs are more expressive than CTL [15], the embedding is needed for the parts which are not expressible in CTL. Finally, the use of embedding can significantly reduce the size of the automata built for formulas, thus reducing the overall run time of verification.

The rest of the paper is organized as follows. Section 2 covers some preliminaries. Section 3 gives our embedding algorithm, and in Section 4 we discuss the benefits of the algorithm and give example applications. Section 5 concludes the paper. The proofs appear in Appendix A.

¹ Beer et al. [6] have shown how to translate a subset of CTL and SERE-based formulas into not SERE! formulas. Since those CTL formulas lie in the common fragment of CTL and LTL [12], their LTL counterparts also have an equivalent in this form.


2 Preliminaries

2.1 The Computational Model - DTS

We represent a finite-state program by a discrete transition system. A discrete transition system (DTS) is a symbolic representation of a finite automaton on finite or infinite words. The definition is derived from the definition of a fair discrete system (FDS) [11]. A DTS D : ⟨V, Θ, ρ, A, J⟩ consists of the following components:

– V = {u_1, …, u_n}: A finite set of typed state-variables over possibly infinite domains. We define a state s to be a type-consistent interpretation of V, assigning to each variable u ∈ V a value s[u] in its domain. We denote by Σ_V the set of all states, and by B_V the set of all boolean expressions over the state-variables in V (when V is understood from the context we write simply Σ and B, respectively).

Example 1. Let V denote the set {a, b, c} of Boolean state variables. Then Σ_V, the set of interpretations of these variables, contains the 8 interpretations giving different truth values to a, b and c. That is, Σ_V = {∅, {a}, {b}, {c}, {a, b}, {a, c}, {b, c}, {a, b, c}}. B_V is the set of all boolean expressions over these variables. For example, the boolean expressions a, a ∨ b, ¬c and (a → ¬b) ∧ c are all in B_V.

– Θ: The initial condition. This is an assertion characterizing all the initial states of the DTS.
– ρ: The transition relation. This is an assertion ρ(V, V′) relating a state s ∈ Σ_V to its D-successor s′ ∈ Σ_V by referring to both unprimed and primed versions of the state-variables. The transition relation ρ(V, V′) identifies state s′ as a D-successor of state s if ⟨s, s′⟩ ⊨ ρ(V, V′), where ⟨s, s′⟩ is the joint interpretation which interprets u ∈ V as s[u] and u′ as s′[u].
– A: The accepting condition for finite words. This is an assertion characterizing all the accepting states for runs of the DTS satisfying finite words.
– J = {J_1, …, J_k}: The justice (Büchi) accepting condition for infinite words. This is a set of assertions characterizing the sets of accepting states for runs of the DTS satisfying infinite words. The justice requirement J ∈ J stipulates that every infinite computation contains infinitely many states satisfying J.

Let D be a DTS for which the above components have been identified. We define a run of D to be a finite or infinite non-empty sequence of states σ : s_0 s_1 s_2 … satisfying the requirements of initiality, i.e. that s_0 ⊨ Θ, and of consecution, i.e. that for each j = 0, 1, …, the state s_{j+1} is a D-successor of state s_j. A run satisfying the requirement of maximality, i.e. that it is either infinite or terminates at a state s_k which has no D-successors, is termed a maximal run.

Let U ⊆ V be a subset of the state-variables. A run σ : s_0 s_1 s_2 … s_n … is said to be satisfying a finite word w = b_0 b_1 … b_n over B_U iff for every i, 0 ≤ i ≤ n, s_i ⊨ b_i. A run σ : s_0 s_1 s_2 … s_{n+1} … satisfying a finite word w = b_0 b_1 … b_n is said to be accepting w iff s_{n+1} satisfies A. An infinite run σ : s_0 s_1 s_2 … is said to be satisfying an infinite word w = b_0 b_1 … over B_U iff for every i ≥ 0, s_i ⊨ b_i. An infinite run σ satisfying an infinite word w is said to be accepting w iff for each J ∈ J, the run σ contains infinitely many states satisfying J.
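The finite-word notions above can be sketched in a few lines of Python. This is only an illustrative sketch: states are modelled as dictionaries, assertions as predicates, and the names `is_run`, `satisfies`, `accepts` and the modulo-3 counter are hypothetical and do not appear in the paper.

```python
from typing import Callable, Dict, List

State = Dict[str, int]                      # an interpretation of the variables in V
Assertion = Callable[[State], bool]         # a boolean expression over V
Relation = Callable[[State, State], bool]   # rho(V, V'): current state, next state


def is_run(states: List[State], theta: Assertion, rho: Relation) -> bool:
    """Initiality (s_0 |= Theta) and consecution (s_{j+1} is a successor of s_j)."""
    if not states or not theta(states[0]):
        return False
    return all(rho(s, t) for s, t in zip(states, states[1:]))


def satisfies(states: List[State], word: List[Assertion]) -> bool:
    """The run satisfies w = b_0 ... b_n iff s_i |= b_i for every 0 <= i <= n."""
    if len(states) < len(word):
        return False
    return all(b(s) for s, b in zip(states, word))


def accepts(states: List[State], word: List[Assertion], accepting: Assertion) -> bool:
    """A run satisfying w = b_0 ... b_n accepts w iff s_{n+1} satisfies A."""
    nxt = len(word)  # index n + 1
    return satisfies(states, word) and len(states) > nxt and accepting(states[nxt])


# Hypothetical DTS with a single variable x that counts modulo 3.
theta = lambda s: s["x"] == 0
rho = lambda s, t: t["x"] == (s["x"] + 1) % 3
accepting = lambda s: s["x"] == 2

run = [{"x": 0}, {"x": 1}, {"x": 2}]
word = [lambda s: s["x"] == 0, lambda s: s["x"] < 2]   # b_0 b_1, so n = 1
```

Here `run` satisfies the two-letter word and accepts it because the state following the word, s_2, satisfies the accepting condition. Justice requirements for infinite words are outside the scope of this sketch.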


For a state s, we denote by s|_U the restriction of s to the state-variables in U, i.e. the state s|_U agrees with s on the interpretation of the state-variables in U, and does not provide an interpretation for variables in V \ U. For a run σ = s_0 s_1 s_2 … we denote by σ|_U the run s_0|_U s_1|_U s_2|_U ….

Discrete transition systems can be composed in parallel. Let D_i = ⟨V_i, Θ_i, ρ_i, A_i, J_i⟩, i ∈ {1, 2}, be two discrete transition systems. We denote the synchronous parallel composition of D_1 and D_2 by D_1 ||| D_2 and define it to be D_1 ||| D_2 = ⟨V_1 ∪ V_2, Θ_1 ∧ Θ_2, ρ_1 ∧ ρ_2, A_1 ∧ A_2, J_1 ∪ J_2⟩. We can view the execution of D_1 ||| D_2 as the joint execution of D_1 and D_2.

From Finite Automata to DTS

Given a non-deterministic finite automaton on finite words (NFA) [10] whose alphabet is a set of boolean expressions over a given set of variables V, it is straightforward to construct the discrete transition system corresponding to it. The same holds for a generalized Büchi automaton on infinite words (GBA) [14].

Example 2. Assume we have the set of state variables V = {a, b, c}. Let N be the NFA over B_V described in Figure 1. Then the DTS D = ⟨V, Θ, ρ, A, J⟩ described below

Fig. 1. An NFA over BV where V = {a, b, c}.

corresponds to N, where st_N is a new variable with domain {0, 1, 2, 3}: V = {a, b, c, st_N}; Θ = (st_N = 1); A = (st_N = 3); J = ∅; and

\[
\rho \;=\;
\begin{array}{l}
(st_N = 1 \land a \land st'_N = 2) \lor (st_N = 2 \land b \land st'_N = 2) \;\lor \\
(st_N = 2 \land c \land st'_N = 3) \lor (st_N = 1 \land \neg a \land st'_N = 0) \;\lor \\
(st_N = 2 \land \neg b \land st'_N = 0) \lor (st_N = 2 \land \neg c \land st'_N = 0)
\end{array}
\]

Formally, let V be a set of state-variables and let B be the corresponding set of boolean expressions. Let N = ⟨B, Q, Q_0, δ, A⟩ be an NFA. Let state be a new variable (not in V) whose domain is Q ∪ {q_sink}, where q_sink ∉ Q. Then N can be represented as the DTS D_N = ⟨V^N, Θ^N, ρ^N, A^N, J^N⟩ where

\[
V^N = V \cup \{state\}; \qquad
\Theta^N = \bigvee_{q_0 \in Q_0} (state = q_0); \qquad
A^N = \bigvee_{q \in A} (state = q); \qquad
J^N = \emptyset;
\]

\[
\rho^N = \bigvee_{(q_1,\sigma,q_2) \in \delta}
\Big[ (state = q_1 \land \sigma \land state' = q_2) \;\lor\; (state = q_1 \land \neg\sigma \land state' = q_{sink}) \Big]
\]
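The construction of D_N from an NFA can be sketched in Python as follows. This is a sketch under assumptions rather than the authors' implementation: letters are modelled as predicates over a state dictionary, the function name `nfa_to_dts` and the helper `st` are hypothetical, and the three transitions of the Figure 1 NFA are inferred from the transition relation given in Example 2 (the figure itself is not reproduced here).

```python
from typing import Callable, Dict, Iterable, Set, Tuple

State = Dict[str, object]                   # interpretation of V plus the variable "state"
Letter = Callable[[State], bool]            # a boolean expression over V
Transition = Tuple[object, Letter, object]  # (q1, sigma, q2) in delta


def nfa_to_dts(initial: Set[object], transitions: Iterable[Transition],
               accepting: Set[object], sink: object):
    """Return (theta, rho, acc) following the definitions of Theta^N, rho^N and A^N."""
    trans = list(transitions)

    def theta(s: State) -> bool:
        return s["state"] in initial

    def rho(s: State, t: State) -> bool:
        for q1, sigma, q2 in trans:
            if s["state"] != q1:
                continue
            if sigma(s) and t["state"] == q2:        # state=q1 and sigma and state'=q2
                return True
            if not sigma(s) and t["state"] == sink:  # state=q1 and not sigma and state'=q_sink
                return True
        return False

    def acc(s: State) -> bool:
        return s["state"] in accepting

    return theta, rho, acc


# Transitions inferred from Example 2: 1 -a-> 2, 2 -b-> 2, 2 -c-> 3; the sink is 0.
FIG1 = [(1, lambda s: s["a"], 2), (2, lambda s: s["b"], 2), (2, lambda s: s["c"], 3)]
theta, rho, acc = nfa_to_dts({1}, FIG1, {3}, sink=0)


def st(q, a=False, b=False, c=False) -> State:
    """Hypothetical helper building a state of the DTS."""
    return {"a": a, "b": b, "c": c, "state": q}
```

As in the formula, a step may move to the sink whenever the letter of some outgoing transition is false, so from state 2 with b true and c false both the self loop and the move to the sink are allowed.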

Similarly, we can construct the DTS corresponding to a Büchi automaton. Let G = ⟨B, Q, Q_0, δ, F⟩ be a GBA with F = {F_1, …, F_k}. Then G can be represented as the discrete transition system D_G = ⟨V^G, Θ^G, ρ^G, A^G, J^G⟩ where:

\[
V^G = V \cup \{state\}; \qquad
\Theta^G = \bigvee_{q_0 \in Q_0} (state = q_0); \qquad
A^G = \emptyset;
\]

\[
J^G = \{J_1, \ldots, J_k\} \text{ where for each } 1 \le i \le k:\; J_i = \bigvee_{q \in F_i} (state = q);
\]

\[
\rho^G = \bigvee_{(q_1,\sigma,q_2) \in \delta}
\Big[ (state = q_1 \land \sigma \land state' = q_2) \;\lor\; (state = q_1 \land \neg\sigma \land state' = q_{sink}) \Big]
\;\lor\; (state = q_{sink} \land state' = q_{sink})
\]

In this paper, given an NFA N = ⟨B, Q, Q_0, δ, A⟩ we first construct a terminal Büchi automaton [9] by adding a self loop on all accepting states of N and defining the Büchi accepting sets to be the singleton set of accepting states (i.e. {A}). This Büchi automaton accepts all words which have a finite prefix accepted by N. Then we construct a DTS for the resulting terminal Büchi automaton. We denote the resulting DTS D^N. Let σ = s_0 s_1 … be a run of D^N. We say that the "step" (s_i, s_{i+1}) of D^N corresponds to the transition (q_1, σ, q_2) ∈ δ of N iff (s_i, s_{i+1}) ⊨ (state = q_1 ∧ σ ∧ state′ = q_2).

Example 3. The DTS D_N = ⟨V_N, Θ_N, ρ_N, A_N, J_N⟩ for the terminal Büchi automaton constructed for the NFA N described in Figure 1 is as follows, where D = ⟨V, Θ, ρ, A, J⟩ is the DTS from Example 2: V_N = V; Θ_N = Θ; A_N = A; J_N = {A_N}; and

\[
\rho_N = \rho \;\lor\; (st_N = 3 \land st'_N = 3) \;\lor\; (st_N = q_{sink} \land st'_N = q_{sink})
\]
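The effect of the added self loops can be illustrated with a small Python sketch. It assumes that the self loops are labelled true, that letters are predicates over a valuation of V, and that the Figure 1 NFA has the three transitions inferred from Example 2; the function names are hypothetical.

```python
from typing import Callable, Dict, List, Set, Tuple

Valuation = Dict[str, bool]
Letter = Callable[[Valuation], bool]
Transition = Tuple[int, Letter, int]


def add_accepting_self_loops(transitions: List[Transition],
                             accepting: Set[int]) -> List[Transition]:
    """Terminal Büchi step: add a self loop (assumed labelled true) on every accepting state."""
    always: Letter = lambda v: True
    return list(transitions) + [(q, always, q) for q in accepting]


def reachable_after(initial: Set[int], transitions: List[Transition],
                    word: List[Valuation]) -> Set[int]:
    """Set of automaton states reachable after reading a finite sequence of valuations."""
    current = set(initial)
    for valuation in word:
        current = {q2 for q1, sigma, q2 in transitions
                   if q1 in current and sigma(valuation)}
    return current


# Figure 1 NFA as inferred from Example 2: 1 -a-> 2, 2 -b-> 2, 2 -c-> 3.
FIG1: List[Transition] = [(1, lambda v: v["a"], 2),
                          (2, lambda v: v["b"], 2),
                          (2, lambda v: v["c"], 3)]
ACCEPTING = {3}
TERMINAL = add_accepting_self_loops(FIG1, ACCEPTING)
JUSTICE = [ACCEPTING]  # the single Büchi accepting set {A}

prefix = [{"a": True, "b": False, "c": False}, {"a": False, "b": False, "c": True}]
tail = [{"a": False, "b": False, "c": False}]
```

After `prefix` the NFA is in its accepting state 3. Reading `tail` leaves the original NFA with no state, whereas the terminal Büchi automaton stays in state 3, which is why every extension of an accepted prefix is accepted.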

2.2 The Logic

The logic considered in this paper is the fragment of the industry-standard temporal logic PSL/Sugar [2] which consists of only not r! formulas, with r being a Sugar extended regular expression (SERE). These formulas are of special interest for several reasons. First, they are a convenient and widely used way to write specifications. Second, a large subset of the PSL/Sugar properties can be automatically translated into not r! properties [6, 12]. Finally, the verification of these properties can be reduced to verification of invariant properties and is thus very efficient (see e.g. [6]). The semantics of these formulas over a given DTS is defined below. The definition assumes a set of state variables V, the corresponding set Σ of interpretations of the state-variables in V and the set B of boolean expressions over V. We assume two designated boolean expressions true and false belong to B, such that for every s ∈ Σ, s ⊨ true and s ⊭ false.

Definition 1 (SEREs).
– The empty set ∅ and the empty regular expression λ are SEREs.
– Every boolean expression b ∈ B is a SERE.
– If r, r_1, and r_2 are SEREs, then the following are also SEREs:
  1. {r} (encapsulation)
  2. r_1 ∪ r_2 (union)
  3. r_1 · r_2 (concatenation)
  4. r∗ (Kleene closure)
  5. r_1 ◦ r_2 (fusion)
  6. r_1 ∩ r_2 (intersection)

We denote by REs standard regular expressions, i.e. the subset of SEREs with no fusion or intersection operators. To define the semantics of SEREs, we use the following notations.
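The grammar of Definition 1 can be sketched as a small Python data type. This is an illustrative encoding only: the class names are hypothetical, boolean expressions are kept as opaque strings, and encapsulation {r} is treated as plain grouping and therefore has no node of its own. The helper `is_plain_re` checks membership in the RE subset, i.e. the absence of fusion and intersection.

```python
from dataclasses import dataclass


class Sere:
    """Base class for SERE syntax trees."""


@dataclass(frozen=True)
class EmptySet(Sere):      # the empty set
    pass


@dataclass(frozen=True)
class Lambda(Sere):        # the empty regular expression
    pass


@dataclass(frozen=True)
class Bool(Sere):          # a boolean expression b in B, kept as text
    expr: str


@dataclass(frozen=True)
class Star(Sere):          # Kleene closure r*
    body: Sere


@dataclass(frozen=True)
class Binary(Sere):        # common shape of the binary operators
    left: Sere
    right: Sere


class Union(Binary): pass       # r1 ∪ r2
class Concat(Binary): pass      # r1 · r2
class Fusion(Binary): pass      # r1 ◦ r2
class Intersect(Binary): pass   # r1 ∩ r2


def is_plain_re(r: Sere) -> bool:
    """True iff r uses no fusion or intersection operator."""
    if isinstance(r, (Fusion, Intersect)):
        return False
    if isinstance(r, Binary):
        return is_plain_re(r.left) and is_plain_re(r.right)
    if isinstance(r, Star):
        return is_plain_re(r.body)
    return True


# The SERE {req · ¬ack* · ack} from the introduction.
handshake = Concat(Bool("req"), Concat(Star(Bool("!ack")), Bool("ack")))
```

The handshake SERE is a standard regular expression, while fusing it with another SERE, for example `Fusion(handshake, Bool("done"))`, leaves the RE subset.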


Notations

We denote a letter from Σ by s (possibly with subscripts) and a word from Σ by u, v, or w. The concatenation of u and v is denoted by uv. If u is infinite, then uv = u. The empty word is denoted by ε, so that wε = εw = w. The overlapping concatenation of us and sv, denoted by us ◦ sv, is the word usv. Let L_1 and L_2 be sets of words. The concatenation of L_1 and L_2, denoted L_1 L_2, is the set {w | ∃w_1 ∈ L_1, ∃w_2 ∈ L_2 and w = w_1 w_2}. The overlapping concatenation of L_1 and L_2, denoted L_1 ◦ L_2, is the set {w | ∃w_1 s ∈ L_1, ∃s w_2 ∈ L_2 and w = w_1 s w_2}. Define L^0 = {ε} and L^i = L L^{i−1} for i ≥ 1. The Kleene closure of L denoted L∗ is the set i 0 and u = truem1 w0

w1

v0

v1

n −1

1 1 p0 −→ p1 −→ p2

qS −→ q1 −→ q2

··· ···

w1 1 true true pn1 −1 −→ q −→ qI . . . qI −→ |I {z } m1 times v n2 −1

qn−1 −→ qn2

is a run of N′_r accepting w.

For the other direction, assume σ is an accepting run of N′_r over w. Let pred(A) denote the set {q ∈ Q | ∃b ∈ B, q_A ∈ A, s.t. (q, b, q_A) ∈ δ}. By the transition relation, σ is of the form σ = σ̄_1 … σ̄_k for some k ≥ 1, where for every i ≤ k, σ̄_i is in one of the following forms:

1.
\[
q_S \xrightarrow{v_i^{0}} q_1^{i} \cdots q_{n_i-1}^{i} \xrightarrow{v_i^{n_i-1}} q_{n_i}^{i}
\]

2.
\[
\underbrace{q_I \xrightarrow{true} q_I \cdots q_I \xrightarrow{true}}_{m_i \text{ times}}
q_S \xrightarrow{v_i^{0}} q_1^{i} \cdots q_{n_i-1}^{i} \xrightarrow{v_i^{n_i-1}} q_{n_i}^{i}
\]

where q^i_{n_i−1} ∈ pred(A) and q^k_{n_k} ∈ A. The rest of the proof is for case (2). The proof for case (1) is similar.

Define w_k = true^{m_1} v_1^0 … v_1^{n_1−1} … true^{m_k} v_k^0 … v_k^{n_k−1} (the word accepted by σ). We show w_k ∈ Lng({true∗·r}^k) by induction on k.

If k = 1, then σ is

\[
\underbrace{q_I \xrightarrow{true} q_I \cdots q_I \xrightarrow{true}}_{m_1 \text{ times}}
q_S \xrightarrow{v^{0}} q_1^{1} \cdots q_{n_1-1}^{1} \xrightarrow{v^{n_1-1}} q_{n_1}^{1}
\]

and w_1 = true^{m_1} v^0 … v^{n_1−1}. By the transition relation, ∃q_0 ∈ Q_0 such that (q_0, v^0, q^1_1) ∈ δ and ∀1 ≤ i ≤ n_1 − 1, (q^1_i, v^i, q^1_{i+1}) ∈ δ, so q_0, …, q^1_{n_1} is an accepting run of N_r over v^0 … v^{n_1−1}. Therefore v^0 … v^{n_1−1} ∈ Lng(r) and w_1 ∈ Lng({true∗·r}).

Assume that if σ = σ̄_1 … σ̄_k is an accepting run over w_k then w_k ∈ Lng({true∗·r}^k). Let

\[
\bar\sigma_k = \hat\sigma_k \xrightarrow{v_k^{n_k-1}} q_{n_k}^{k}.
\]

That is, σ̂_k is the prefix of σ̄_k obtained by chopping the last state of σ̄_k. The last state of σ̂_k is q^k_{n_k−1} ∈ pred(A). Let

\[
\sigma^{k+1} = \bar\sigma_1 \cdots \bar\sigma_{k-1}\, \hat\sigma_k \xrightarrow{v_k^{n_k-1}}
\underbrace{q_I \xrightarrow{true} q_I \cdots q_I \xrightarrow{true}}_{m_{k+1} \text{ times}}
q_S \xrightarrow{v_{k+1}^{0}} q_1^{k+1} \cdots q_{n_{k+1}-1}^{k+1} \xrightarrow{v_{k+1}^{n_{k+1}-1}} q_{n_{k+1}}^{k+1}
\]

be an accepting run of N′_r over w_{k+1} = w_k · true^{m_{k+1}} v^0_{k+1} … v^{n_{k+1}−1}_{k+1}. We have q^k_{n_k−1} ∈ Q, and there exists q ∈ {q_I, q_S} such that (q^k_{n_k−1}, v_k^{n_k−1}, q) ∈ δ′. By the definition of δ′, there exists q^k_{n_k} ∈ A such that (q^k_{n_k−1}, v_k^{n_k−1}, q^k_{n_k}) ∈ δ. Therefore

\[
\sigma' = \bar\sigma_1 \cdots \bar\sigma_{k-1}\, \hat\sigma_k \xrightarrow{v_k^{n_k-1}} q_{n_k}^{k}
\]

is an accepting run of N′_r over w_k. By the induction hypothesis w_k ∈ Lng({true∗·r}^k). Moreover,

\[
\underbrace{q_I \xrightarrow{true} q_I \cdots q_I \xrightarrow{true}}_{m_{k+1} \text{ times}}
q_S \xrightarrow{v_{k+1}^{0}} q_1^{k+1} \cdots q_{n_{k+1}-1}^{k+1} \xrightarrow{v_{k+1}^{n_{k+1}-1}} q_{n_{k+1}}^{k+1}
\]

is an accepting run of N′_r over u^0_{k+1} u^1_{k+1} … u^{m_{k+1}−1}_{k+1} v^0_{k+1} … v^{n_{k+1}−1}_{k+1}. By the induction hypothesis, true^{m_{k+1}} · v^0_{k+1} … v^{n_{k+1}−1}_{k+1} ∈ Lng({true∗·r}) and w ∈ Lng({true∗·r}^{k+1}). □

Proof of Theorem 10

We make use of the following notations and assumptions in the proof of Theorem 10 and its lemmas (Lemmas 16-18). Let r be a SERE over B, and t a sub-SERE of r such that for every intersection and fusion operator, t is not a sub-SERE of both operands. Let N = ⟨B, Q, Q_0, δ, A⟩ be an NFA for t and N′ = ⟨B, Q′, Q′_0, δ′, A′⟩ the cruising NFA constructed from N (as defined in Proposition 7). Let D_{N′} be the DTS of N′ and let start and end be the boolean expressions defined in Section 3. Let t′ be the corresponding placeholder. Define idle to be the boolean expression asserting state = q_I. We use s (possibly with subscripts) to denote states/boolean expressions in a run of a DTS. We use σ (possibly with subscripts) to denote sequences of states/boolean expressions in a run of a DTS. We use b (possibly with subscripts) to denote letters/boolean expressions and w (possibly with subscripts) to denote words, i.e. sequences of boolean expressions.

Lemma 16. Let σ′_1, σ′_2 be runs of D_{N′} satisfying (idle ∨ start) ◦ true∗ ◦ (end ∨ idle) as well as w_1 and w_2, respectively. Then σ′_1 σ′_2 is a run of D_{N′} satisfying (idle ∨ start) ◦ true∗ ◦ (end ∨ idle) as well as w_1 w_2.

Proof. A run of D_{N′} satisfying (idle ∨ start) ◦ true∗ ◦ (end ∨ idle) corresponds to a run of N′ starting with state q_I or state q_0 ∈ Q_0 and ending in state q ∈ pred(A) or in state q_I. Denote by σ_1, σ_2 the runs of N′ corresponding to σ′_1, σ′_2 respectively. σ_2 starts in q_0 ∈ Q′_0. Let b_0 be the first letter of w_2; then there is a transition from q ∈ pred(A) ∪ {q_I} to q_0 via b_0, thus the concatenation σ_1 σ_2 is possible. The run σ′_1 σ′_2 starts with a state s_0 such that s_0[state] = q_0 ∈ Q_0 or s_0[state] = q_I and ends in a state s_n such that s_n[state] ∈ pred(A) ∪ {q_I}. It thus satisfies (idle ∨ start) ◦ true∗ ◦ (end ∨ idle). Clearly if σ′_1 satisfies w_1 and σ′_2 satisfies w_2, the concatenated run σ′_1 σ′_2 satisfies the concatenated word w_1 w_2. □

Lemma 17.
– Let σ′_1 s_1 be a run of D_{N′} satisfying (idle ∨ start) ◦ true∗ ◦ (end ∨ idle) as well as w_1 b. Let s_2 σ′_2 be a run of D_{N′} satisfying (idle∗) as well as b w_2. Then σ′_1 s_1 σ′_2 is a run of D_{N′} satisfying (idle ∨ start) ◦ true∗ ◦ (end ∨ idle) as well as w_1 b w_2.
– Let σ′_1 s_1 be a run of D_{N′} satisfying (idle∗) as well as w_1 b. Let s_2 σ′_2 be a run of D_{N′} satisfying (idle ∨ start) ◦ true∗ ◦ (end ∨ idle) as well as b w_2. Then σ′_1 s_2 σ′_2 is a run of D_{N′} satisfying (idle ∨ start) ◦ true∗ ◦ (end ∨ idle) as well as w_1 b w_2.

Proof. We show only the first case; the second case is symmetric. The run σ′_1 s_1 of D_{N′} corresponds to a run σ_1 q_1 of N′ where s_1[state] = q_1 and σ_1 q_1 starts with a state q_0 ∈ Q_0 ∪ {q_I} and ends in state q ∈ pred(A) ∪ {q_I}. The run s_2 σ′_2 of D_{N′} corresponds to a run q_I σ_2 of N′ of the form q_I∗. Since from every q ∈ pred(A) ∪ {q_I} there are transitions to q_I, we can concatenate the second run, after chopping its first state, to the end of the first run. The resulting run σ_1 q_1 σ_2 thus starts with a state q_0 ∈ Q_0 ∪ {q_I} and ends in state q_I (or in a state q ∈ pred(A), if |σ_2| = 0). Therefore σ′_1 s_1 σ′_2 satisfies (idle ∨ start) ◦ true∗ ◦ (end ∨ idle). Clearly the concatenated run σ′_1 s_1 σ′_2 satisfies the concatenated word w_1 b w_2. □

Lemma 18. Let r‴ denote the SERE (idle ∨ start) ◦ r[t ← t′] ◦ (end ∨ idle). Let r″ denote the SERE r‴ if S(r) = false and the SERE λ ∪ r‴ otherwise. Then

w |≡ r ⟺ there exists a run of D_{N′} satisfying w which satisfies also r″.

Proof. The proof is by induction on the structure of r with respect to t. The base case is r = t. The induction step can be decomposed into 7 cases, where r_1 and r_2 are SEREs such that t may be a sub-SERE of them and n_1 and n_2 are SEREs such that t is not a sub-SERE of them:

r = r_1·r_2,  r = r_1 ∪ r_2,  r = r_1∗,  r = r_1 ◦ n_2,  r = n_1 ◦ r_2,  r = r_1 ∩ n_2,  r = n_1 ∩ r_2.

Denote r_1[s ← s′] and r_2[s ← s′] by r′_1 and r′_2 respectively.

– Base case: r = t
  w |≡ r
  ⟺ [By correctness of N as an NFA recognizing Lng(t)] If S(r) = false There exists ŵ ∈ Lng(r) and an accepting run q_0 q_1 … q_n of N over ŵ such that for each q̄_0 ∈ {q_I, q_S} and q̄_n ∈ Q′_0 ∪ {q_n} there exists a run w′ = s_0 s_1 … s_n of D_{N′} satisfying w such that
     1. s_0[state] = q̄_0,
     2. ∀1 ≤ i ≤ n − 1 : s_i[state] = q_i,
     3. s_n[state] = q̄_n.
  ⟺ [By the definitions of idle, start and end] If S(r) = false then there exists a run w′ of D_{N′} satisfying w which also satisfies {start ∧ end} ∪ {start·¬start∗·end}; otherwise there exists a run w′ of D_{N′} satisfying w which also satisfies λ ∪ {start ∧ end} ∪ {start·¬start∗·end}
  ⟺ There exists a run w′ of D_{N′} satisfying w which also satisfies t‴ = (idle ∨ start) ◦ t′ ◦ (end ∨ idle) (if S(r) = false, and λ ∪ t‴ otherwise).
  ⟺ There exists a run w′ of D_{N′} satisfying w which also satisfies r″

– Induction step:

1. r = r_1·r_2
  w |≡ r_1·r_2
  ⟺ There exist w_1, w_2 s.t. w = w_1 w_2, w_1 |≡ r_1 and w_2 |≡ r_2
  ⟺ [By the inductive hypothesis] There exist w_1, w_2 s.t. w = w_1 w_2, there exists a run w′_1 of D_{N′} satisfying w_1 which also satisfies r″_1, and there exists a run w′_2 of D_{N′} satisfying w_2 which also satisfies r″_2
  ⟺ [By Lemma 16] There exist w_1, w_2 s.t. w = w_1 w_2 and there exists a run w′ = w′_1 w′_2 of D_{N′} satisfying w_1 w_2 which also satisfies r″_1·r″_2
  ⟺ There exists a run w′ of D_{N′} satisfying w which also satisfies r″

2. r = r_1 ∪ r_2
  w |≡ r_1 ∪ r_2
  ⟺ w |≡ r_1 or w |≡ r_2
  ⟺ [By the inductive hypothesis] There exists a run w_1 of D_{N′} satisfying w which also satisfies r″_1 or there exists a run w_1 of D_{N′} satisfying w which also satisfies r″_2
  ⟺ There exists a run w_1 of D_{N′} satisfying w which also satisfies r″_1 or r″_2
  ⟺ There exists a run w_1 of D_{N′} satisfying w which also satisfies r″_1 ∪ r″_2
  ⟺ There exists a run w′ of D_{N′} satisfying w which also satisfies r″

3. r = r_1∗
  w |≡ r_1∗
  ⟺ Either w = ε or there exist w_1, w_2, …, w_k s.t. w = w_1 w_2 … w_k and for all 1 ≤ i ≤ k, w_i |≡ r_1.
  ⟺ [By the inductive hypothesis] Either w = ε or there exist w_1, w_2, …, w_k s.t. w = w_1 w_2 … w_k and for all 1 ≤ i ≤ k there exists a run w′_i of D_{N′} satisfying w_i which also satisfies r″_1
  ⟺ [By repetitive applications of Lemma 16] There exists a run w′ of D_{N′} such that either |w′| = 0 or w′ = w′_1 w′_2 … w′_k, which satisfies ε or w_1 w_2 … w_k, respectively, and which also satisfies λ or r″_1·r″_1· … ·r″_1
  ⟺ There exists a run w′ of D_{N′} satisfying w which also satisfies r″_1∗
  ⟺ There exists a run w′ of D_{N′} satisfying w which also satisfies r″

4. r = r_1 ◦ n_2
  w |≡ r_1 ◦ n_2
  ⟺ There exist w_1, w_2 and b s.t. w = w_1 b w_2, w_1 b |≡ r_1 and b w_2 |≡ n_2
  ⟺ [By the inductive hypothesis] There exist w_1, w_2 and b s.t. w = w_1 b w_2, there exists a run σ_1 s_1 of D_{N′} satisfying w_1 b which also satisfies r″_1, and b w_2 |≡ n_2
  ⟺ [By the transition relation of N′ and the correctness of D_{N′} as its representation] There exist w_1, w_2 and b s.t. w = w_1 b w_2, there exists a run σ_1 s_1 of D_{N′} satisfying w_1 b which also satisfies r″_1, and there exists a run s_2 σ_2 of D_{N′} satisfying b w_2 which also satisfies idle∗ and n_2
  ⟺ [By Lemma 17] There exist w_1, w_2 and b s.t. w = w_1 b w_2 and there exists a run w′ = σ_1 s_1 σ_2 of D_{N′} satisfying w_1 b w_2 which also satisfies r″_1 ◦ n_2
  ⟺ There exists a run w′ of D_{N′} satisfying w which also satisfies r″

5. r = n_1 ◦ r_2
  Symmetric to the above case.

6. r = r_1 ∩ n_2
  w |≡ r_1 ∩ n_2
  ⟺ w |≡ r_1 and w |≡ n_2
  ⟺ [By the inductive hypothesis] There exists a run w_1 of D_{N′} satisfying w which also satisfies r″_1, and w |≡ n_2
  ⟺ There exists a run w_1 of D_{N′} satisfying w which also satisfies r″_1 and n_2
  ⟺ There exists a run w_1 of D_{N′} satisfying w which also satisfies r″_1 ∩ n_2
  ⟺ There exists a run w′ of D_{N′} satisfying w which also satisfies r″

7. r = n_1 ∩ r_2
  Symmetric to the above case. □

Theorem 10 D_M ⊨ not r! ⟺ D_M ||| D_{N′} ⊨ not r[t ← t′]!

Proof.
  D_M ⊭ not r!
  ⟺ There exists a finite run w of D_M s.t. w |≡ r
  ⟺ [By Lemma 18] There exists a finite run w of D_M s.t. there exists a run of D_{N′} satisfying w which also satisfies r″
  ⟺ [By definition of r″] There exists a finite run w of D_M s.t. there exists a run of D_{N′} satisfying w which also satisfies r[t ← t′]
  ⟺ D_M ||| D_{N′} ⊭ not r[t ← t′]! □


Lemma 19. Let r be a SERE, and t = r_1 ∩ r_2 a sub-SERE of r. Let N′_1 and N′_2 be the cruising NFAs for r_1 and r_2 respectively. Let D_{N′_1} and D_{N′_2} be their corresponding DTSs. Let idle_1 and idle_2 denote, respectively, that D_{N′_1} and D_{N′_2} are in a state s such that s[state] = q_I. Let idle denote the conjunction idle_1 ∧ idle_2. Let t′ denote the SERE placeholder(r_1 ∩ r_2, start, middle, end). Let r‴ denote the SERE (idle ∨ start) ◦ r[t ← t′] ◦ (end ∨ idle) and let r″ denote the SERE r‴ if S(r) = false and the SERE λ ∪ r‴ otherwise. Then

w |≡ r ⟺ there exists a run of D_{N′_1} ||| D_{N′_2} satisfying w which satisfies also r″.

Proof. We show only the base case (r = r_1 ∩ r_2). The rest of the proof proceeds similarly to the proof of Lemma 18.

  w |≡ r_1 ∩ r_2
  ⟺ w |≡ r_1 and w |≡ r_2
  ⟺ [By Lemma 18] If S(r_1) = false then there exists a run w′ of D_{N′_1} satisfying w which also satisfies {start_1 ∧ end_1} ∪ {start_1·¬start_1∗·end_1}; otherwise there exists a run w′ of D_{N′_1} satisfying w which also satisfies λ ∪ {start_1 ∧ end_1} ∪ {start_1·¬start_1∗·end_1}. And if S(r_2) = false then there exists a run w′ of D_{N′_2} satisfying w which also satisfies {start_2 ∧ end_2} ∪ {start_2·¬start_2∗·end_2}; otherwise there exists a run w′ of D_{N′_2} satisfying w which also satisfies λ ∪ {start_2 ∧ end_2} ∪ {start_2·¬start_2∗·end_2}
  ⟺ If S(r_1 ∩ r_2) = false then there exists a run w′ of D_{N′_1} ||| D_{N′_2} satisfying w which also satisfies {(start_1 ∧ start_2) ∧ (end_1 ∧ end_2)} ∪ {(start_1 ∧ start_2)·(¬start_1 ∧ ¬start_2)∗·(end_1 ∧ end_2)}; otherwise there exists a run w′ of D_{N′_1} ||| D_{N′_2} satisfying w which also satisfies λ ∪ {(start_1 ∧ start_2) ∧ (end_1 ∧ end_2)} ∪ {(start_1 ∧ start_2)·(¬start_1 ∧ ¬start_2)∗·(end_1 ∧ end_2)}
  ⟺ If S(r_1 ∩ r_2) = false then there exists a run w′ of D_{N′_1} ||| D_{N′_2} satisfying w which also satisfies {start ∧ end} ∪ {start·middle∗·end}; otherwise there exists a run w′ of D_{N′_1} ||| D_{N′_2} satisfying w which also satisfies λ ∪ {start ∧ end} ∪ {start·middle∗·end}
  ⟺ There exists a run of D_{N′_1} ||| D_{N′_2} satisfying w which satisfies also r″. □

Proposition 12 Let D_M be a DTS, r a SERE and r_1 ∩ r_2 a sub-SERE of r. Let N′_1 and N′_2 be the cruising NFAs for r_1 and r_2 respectively. Let D_{N′_1} and D_{N′_2} be their corresponding DTSs. Then

D_M ⊨ not r! ⟺ D_M ||| D_{N′_1} ||| D_{N′_2} ⊨ not r[r_1 ∩ r_2 ← placeholder(r_1 ∩ r_2, start, middle, end)]!

Proof. The proof is similar to the proof of Theorem 10, this time making use of Lemma 19. □

Proposition 13 The NFA N_{p[=i..j]} accepts Lng({p[= i..j]}).

Proof. Let w ∈ Lng({p[= i..j]}); we show there exists a run of N_{p[=i..j]} on w which is accepting. Let k be the number of p's in w. Since w ∈ p[= i..j], we have that i ≤ k ≤ j. Denote w = ¬p^{m_0}, p, ¬p^{m_1}, …, p, ¬p^{m_k}, where ¬p^b denotes b consecutive repetitions of ¬p. By the transition relation, q_0^{m_0+1}, q_1^{m_1+1}, q_2^{m_2+1}, …, q_k^{m_k} is an accepting run of N_{p[=i..j]} on w.

Let w ∉ p[= i..j] and assume towards contradiction that s_0 s_1 … s_n is an accepting run of N_{p[=i..j]} on w. Therefore s_n ∈ {q_i, …, q_j}. It is easy to see that for all 0 ≤ t ≤ n and ∀1 ≤ m ≤ j, if s_t = q_m then there are exactly m states in s_0 s_1 … s_t in which p is asserted. Therefore there are between i and j states in s_0 s_1 … s_n in which p is asserted. That is, w ∈ p[= i..j], in contradiction to the assumption. □

Proofs of Section 4.3

Definition 20 (Semantics of CTL formulas). The semantics of CTL formulas are defined with respect to a model (DTS) and a state in the model. Let D_M = ⟨V_M, Θ_M, ρ_M, A_M⟩ be a DTS. The notation D_M, s ⊨ ϕ means that formula ϕ holds in state s of model D_M. The notation D_M ⊨ ϕ is equivalent to D_M, s ⊨ ϕ for all s such that s ⊨ Θ_M. In other words, ϕ is valid for every initial state of D_M. The semantics of a CTL formula over D_M are defined as follows, where b denotes a boolean expression and ϕ, ϕ_1, and ϕ_2 denote CTL formulas.

– D_M, s ⊨ b ⟺ s ⊨ b
– D_M, s ⊨ ¬ϕ ⟺ D_M, s ⊭ ϕ
– D_M, s ⊨ ϕ_1 ∧ ϕ_2 ⟺ D_M, s ⊨ ϕ_1 and D_M, s ⊨ ϕ_2
– D_M, s ⊨ EX ϕ ⟺ ∃ run σ of D_M s.t. |σ| > 1, σ^0 = s, and D_M, σ^1 ⊨ ϕ
– D_M, s ⊨ E[ϕ_1 U ϕ_2] ⟺ ∃ run σ of D_M s.t. σ^0 = s and ∃k < |σ| s.t. D_M, σ^k ⊨ ϕ_2 and ∀j s.t. j < k: D_M, σ^j ⊨ ϕ_1
– D_M, s ⊨ EG ϕ ⟺ ∃ run σ of D_M s.t. σ^0 = s and ∀j s.t. 0 ≤ j < |σ|: D_M, σ^j ⊨ ϕ

Definition 21 Let D_M be a DTS and s ∈ Σ_V. Let r be a SERE such that ε |≢ r. Then D_M, s ⊨ not {r}! ⟺ for every finite run σ of D_M s.t. σ^0 = s, σ |≢ r.

Proposition 15 Let D_M be a discrete system and r a SERE with no starred sub-SEREs, such that ε |≢ r. Then D_M ⊨ not r! ⟺ D_M ⊨ ¬T(r).

Proof. First we note that the translation procedure (Definition 14) covers all SEREs r such that ε |≢ r and r does not contain any starred sub-SEREs. Second, we note that Definition 21 is legitimate, since although SEREs are defined over non-empty words, the definition D_M ⊨ not r! relies on the fact that runs are by definition non-empty and further assumes ε |≢ r. For the same reason we can first show that for every s ∈ Σ_V,

D_M, s ⊨ not {r}! ⟺ D_M, s ⊨ ¬T(r).

We show this by induction on the structure of r. Let b be a boolean expression and let r_1, r_2 be SEREs.

– Base case.
  1. D_M, s ⊨ not {b}! ⟺ D_M, s ⊨ ¬b ⟺ D_M, s ⊨ ¬T(b)
– Induction step.
  2. D_M, s ⊨ not {r·b∗}! ⟺ D_M, s ⊨ not {r}! ⟺ [by the induction hypothesis] D_M, s ⊨ ¬T(r)
  3. D_M, s ⊨ not {r_1 ∪ r_2}! ⟺ D_M, s ⊨ not {r_1}! ∧ not {r_2}! ⟺ [by the induction hypothesis] D_M, s ⊨ ¬T(r_1) ∧ ¬T(r_2) ⟺ D_M, s ⊨ ¬T(r_1 ∪ r_2)
  4. D_M, s ⊨ not {b·r_1}! ⟺ for every finite run σ = s s′ σ′, s ⊨ ¬b or D_M, s′ ⊨ not {r_1}! ⟺ [by the induction hypothesis] s ⊨ ¬b or for every σ = s s′ σ′, D_M, s′ ⊨ ¬T(r_1) ⟺ D_M, s ⊨ ¬b ∨ AX¬T(r_1) ⟺ D_M, s ⊨ ¬T(b·r_1)
  5. D_M, s ⊭ not {b∗·r_1}!

⟺ there exists a run σ s.t. either σ |≡ r_1 or σ = s_0 s_1 … s_n σ′ (where s_0 = s) and ∀0 ≤ j < n : σ^j ⊨ b and σ^n σ′ |≡ r_1 ⟺ D_M, s ⊭ not {r_1}! or there exists a run σ = s_0 s_1 … s_n σ′ (where s_0 = s) and ∀0 ≤ j < n : σ^j ⊨ b and D_M, σ^n ⊭ not {r_1}! ⟺ [by the induction hypothesis] D_M, s ⊨ T(r_1) or there exists a run σ = s_0 s_1 … s_n σ′ (where s_0 = s) and ∀0 ≤ j < n : σ^j ⊨ b and D_M, σ^n ⊨ T(r_1) ⟺ D_M, s ⊨ T(r_1) ∨ E[b U T(r_1)] ⟺ D_M, s ⊨ T(b∗·r_1)
  6. D_M, s ⊨ not {r_1 ∪ r_2·r}! ⟺ D_M, s ⊨ not {r_1·r}! ∧ not {r_2·r}! ⟺ [by the induction hypothesis] D_M, s ⊨ ¬T(r_1·r) ∧ ¬T(r_2·r) ⟺ D_M, s ⊨ ¬T(r_1 ∪ r_2·r)

In particular, for every initial state s, D_M, s ⊨ not {r}! ⟺ D_M, s ⊨ ¬T(r). Therefore D_M ⊨ not {r}! ⟺ D_M ⊨ ¬T(r). □
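The translation used in the proof above can be illustrated with a short Python sketch. Definition 14 is not reproduced in this text, so the clauses of `T` below are read off the cases of the proof of Proposition 15 and should be taken as an assumption. The tuple encoding of SEREs and CTL formulas, the function names, the atomic proposition `busy` (standing in for ¬ack) and the four-state model are all hypothetical.

```python
# SEREs as nested tuples (hypothetical encoding, concatenation nested to the right):
#   ("b", name)   ("cat", r1, r2)   ("or", r1, r2)   ("star", ("b", name))
# CTL formulas: ("b", name), ("and", f, g), ("or", f, g), ("EX", f), ("EU", f, g)

def T(r):
    """Translate a SERE into CTL, with clauses inferred from the proof of Proposition 15."""
    kind = r[0]
    if kind == "b":                       # T(b) = b
        return r
    if kind == "or":                      # T(r1 ∪ r2) = T(r1) ∨ T(r2)
        return ("or", T(r[1]), T(r[2]))
    if kind == "cat":
        left, right = r[1], r[2]
        if right[0] == "star":            # T(r·b*) = T(r)
            return T(left)
        if left[0] == "b":                # T(b·r1) = b ∧ EX T(r1)
            return ("and", left, ("EX", T(right)))
        if left[0] == "star":             # T(b*·r1) = T(r1) ∨ E[b U T(r1)]
            t = T(right)
            return ("or", t, ("EU", left[1], t))
        if left[0] == "or":               # T((r1 ∪ r2)·r) = T(r1·r) ∨ T(r2·r)
            return ("or", T(("cat", left[1], right)), T(("cat", left[2], right)))
    raise ValueError("SERE outside the fragment handled by this sketch")


def sat(phi, succ, label):
    """Set of states of an explicit finite model that satisfy the CTL formula phi."""
    kind = phi[0]
    if kind == "b":
        return {s for s in succ if phi[1] in label[s]}
    if kind == "and":
        return sat(phi[1], succ, label) & sat(phi[2], succ, label)
    if kind == "or":
        return sat(phi[1], succ, label) | sat(phi[2], succ, label)
    if kind == "EX":
        target = sat(phi[1], succ, label)
        return {s for s in succ if succ[s] & target}
    if kind == "EU":                      # least fixpoint of phi2 ∨ (phi1 ∧ EX Z)
        hold = sat(phi[1], succ, label)
        result = set(sat(phi[2], succ, label))
        while True:
            new = {s for s in hold if succ[s] & result} - result
            if not new:
                return result
            result |= new
    raise ValueError("unknown CTL operator")


# Hypothetical model: 0 -> 1, 1 -> {1, 2}, 2 -> 0, 3 -> 3.
succ = {0: {1}, 1: {1, 2}, 2: {0}, 3: {3}}
label = {0: {"req"}, 1: {"busy"}, 2: {"ack"}, 3: {"req"}}

handshake = ("cat", ("b", "req"), ("cat", ("star", ("b", "busy")), ("b", "ack")))
formula = T(handshake)
```

On this model `sat(formula, succ, label)` contains only state 0: from state 0 the run 0 1 2 matches the handshake, so not r! fails there, while state 3 never leaves req and not r! holds. The evaluator covers only the existential operators needed by `T` and assumes a total successor relation.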