Finite Automata and Non-Self-Embedding Grammars*

Marcella Anselmo1, Dora Giammarresi2, and Stefano Varricchio2

1 Dipartimento di Informatica ed Applicazioni, Università di Salerno, I-84081 Baronissi (SA), Italy. [email protected]
2 Dipartimento di Matematica, Università di Roma “Tor Vergata”, via della Ricerca Scientifica, 00133 Roma, Italy. giammarr/[email protected]

Abstract. We consider non-self-embedding (NSE) context-free grammars as a representation of regular sets. We point out the advantages of this representation with respect to the more classical representation by finite automata, in particular when considering the efficient realization of the rational operations. We give a characterization in terms of composition of regular grammars and state relationships between NSE grammars and push-down automata. Finally, we show a polynomial-time algorithm to decide whether a context-free grammar is self-embedding or not.

1 Introduction

Regular languages play a central role in formal language theory and, indeed, in theoretical computer science. Evidence is given by a wide and continuing literature, together with a variety of practical problems dealing with regular languages (e.g. compiling, text editing, DNA sequences, and so on). Different properties and questions have been investigated and many points of view have been considered: logical, algebraic, analytical or algorithmic. The fundamental question we deal with in this paper is the representation of a regular language, which is in itself an abstract concept. There are several ways of representing a regular language, each one with its own peculiarities, advantages and disadvantages, and this results in a richness of the theory. Every time we deal with a regular language, we can choose the most convenient type of representation for it. In this context, all the procedures that transform a representation into an equivalent one, together with their time and/or space complexity, are of great interest. Regular languages are classically represented by: regular expressions, finite automata (deterministic, non-deterministic, two-way, ...), logical formalisms and regular grammars. We observe that regular grammars are only a different way to represent non-deterministic finite automata. Therefore, if we look for a representation of regular languages which is “better” with respect to finite automata, we have to consider a larger class of grammars. In this paper we focus on a particular family of grammars called non-self-embedding (NSE) grammars.

* This work was partially supported by the MIUR project Linguaggi formali e automi: teoria e applicazioni.

NSE grammars strictly include the regular grammars, but still generate only regular languages. A context-free grammar is self-embedding (SE) if there is a derivation for a variable A of type A ⇒∗ αAβ with both α and β non-empty. A context-free grammar is non-self-embedding (NSE) if it is not SE. From a result due to Chomsky [2], we know that any NSE grammar generates a regular language. Notice that the converse is always true: every regular language admits an NSE grammar representing it (a right-linear grammar is NSE). Despite the scarce literature on NSE grammars (we are not aware of any other result), we believe that NSE grammars can be regarded as an interesting representation for regular languages. Indeed, we exhibit a simple example showing that the representation of a regular language can be much more concise (an exponential gap!) via NSE grammars than via finite automata. This example is, actually, a simple case of a more general situation where the representation by NSE grammars is always more “compact”. The idea is to exploit the structure of a grammar whose variables can be used to generate different instances of the same language.

In this paper we first discuss the consequences of the SE (NSE) property on the grammar structure. We show that an NSE grammar can be expressed in terms of regular grammars. More precisely, we introduce an operation between grammars, called ⊕-composition, that corresponds to the substitution operation between languages. Then we characterize the NSE grammars as those grammars that can be obtained by a finite number of ⊕-compositions of regular grammars. As an immediate consequence, one obtains Chomsky's result, since regular languages are closed under regular substitutions. We also investigate the realization of the rational operations on languages using NSE grammars and we highlight the advantages with respect to finite automata in many situations: for instance, in the representation of the square of a language or, more generally, in the representation of regular expressions containing different instances of the same language. Moreover, we remark that the rational operations have a very simple representation in terms of ⊕-composition of NSE grammars and that NSE grammars are in general much more concise when we exploit the operation of composition on grammars some of which are identical.

We then study what the SE property yields on push-down automata (PDA). We show that a PDA obtained by the canonical construction from an NSE grammar has stack size bounded by some constant. Recall that the converse is always true: PDA with constant stack size recognize regular languages. In the last part of the paper we give an algorithm that tests whether a context-free grammar is NSE or not. This algorithm has a running time polynomial in the size of the grammar. As a consequence, this result also relates the SE property to decidability questions on context-free grammars: indeed, it is well-known that it is undecidable whether a CFG generates a regular language or not. For lack of space, most of the proofs are omitted from this extended abstract. We refer the interested reader to the full paper [1].

2 Basic Notations and Definitions

We assume the reader is familiar with basic formal language theory, including finite-state automata, push-down automata, regular expressions and grammars. We will mainly use notations as in [4]. A context-free grammar (CFG) will be denoted by G = (V, T, P, S) where V, T, P, S are the set of variables, the set of terminals, the set of productions and the start symbol, respectively. We always assume that V ∩ T = ∅. We will use informally the notion of size of a grammar as the space needed for its description. By regular grammar we indicate a grammar that is either left- or right-linear. We now recall the definition of self-embedding grammar.

Definition 1. A context-free grammar G = (V, T, P, S) is self-embedding (SE) if there exists a variable A such that A ⇒∗ αAβ with α, β ∈ (V ∪ T)+.

In this paper we will be interested in non-self-embedding (NSE) context-free grammars, i.e. grammars satisfying the property that, for all variables A, any derivation A ⇒∗ αAβ implies that either α = ε or β = ε. Notice that, given a regular language L, there exists an NSE grammar for L: in particular, a right-linear grammar is NSE. A well-known result states that also the converse is true. The following theorem is due to Chomsky [2] (see also [3]).

Theorem 1. The language generated by a NSE grammar is regular.
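As a simple illustration of these notions, consider the grammar with productions S → aSb | ab: it is SE, since S ⇒ aSb with α = a and β = b both non-empty, and indeed it generates the non-regular language {a^n b^n | n ≥ 1}. By contrast, the right-linear grammar with productions S → aS | b is NSE, since in any derivation S ⇒∗ αSβ we have β = ε.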

3 A characterization for NSE grammars

In this section we revisit Chomsky's theorem in a more general framework. In particular, we give a Decomposition Theorem stating that the NSE grammars are exactly the context-free grammars obtained as a particular composition of regular grammars. We emphasize that such a composition operation corresponds to the operation of substitution on languages. As a consequence, NSE grammars are a formalism which is exponentially more compact than non-deterministic finite automata.

Definition 2. Let G1 = (V1, T1, P1, S1) and G2 = (V2, T2, P2, S2) be two context-free grammars, with V1 ∩ V2 = ∅. The ⊕-composition of G1 and G2 is given by G = G1 ⊕ G2 = (V, T, P, S) where V = V1 ∪ V2, T = (T1 \ V2) ∪ T2, P = P1 ∪ P2 and S = S1.

Notice that, if T1 ∩ V2 = ∅ then L(G1 ⊕ G2) = L(G1). Moreover, by definition, if G is a grammar and G = G1 ⊕ G2, then the corresponding sets of variables, V1 and V2, are disjoint. This implies that a derivation of a word w in G can be split in two phases. First, we apply only rules of G1 (to variables in V1) starting from its start symbol and get a word w′ (on the alphabet T1). Second, we apply only rules of G2 (starting from the symbols in T1 ∩ V2) and get the word w.
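To fix ideas, the ⊕-composition of Definition 2 can be rendered in a few lines of Python. This is only an illustrative sketch, under the assumption that a grammar is stored as a 4-tuple of its variables, terminals, productions (pairs of a left-hand side and a tuple of symbols) and start symbol; the names Grammar and oplus are ours and not taken from the paper.

from collections import namedtuple

# A grammar G = (V, T, P, S): V and T are sets of symbols, P is a set of
# (lhs, rhs) pairs with rhs a tuple of symbols, S is the start symbol.
Grammar = namedtuple('Grammar', ['V', 'T', 'P', 'S'])

def oplus(G1, G2):
    """Definition 2: G1 ⊕ G2, assuming V1 and V2 are disjoint."""
    assert not (G1.V & G2.V), "the two sets of variables must be disjoint"
    return Grammar(V=G1.V | G2.V,
                   T=(G1.T - G2.V) | G2.T,   # T = (T1 \ V2) ∪ T2
                   P=G1.P | G2.P,
                   S=G1.S)

# Toy usage: G1 generates A* over the "terminal" A, and G2 rewrites A into ab;
# by Remark 1 below, L(G1 ⊕ G2) = (ab)*.
G1 = Grammar({'S1'}, {'A'}, {('S1', ('A', 'S1')), ('S1', ())}, 'S1')
G2 = Grammar({'A'}, {'a', 'b'}, {('A', ('a', 'b'))}, 'A')
G = oplus(G1, G2)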

More formally:

Remark 1. If G = G1 ⊕ G2 then the language L(G) can be obtained from L(G1) by applying a substitution that maps each symbol A ∈ T1 ∩ V2 to LA(G2), where LA(G2) is the language generated by the grammar G2 using A as start symbol.

Similarly, we consider sequences of n applications of the ⊕-composition, for some n. It is not difficult to verify that the ⊕-composition is associative. Let G1 ⊕ G2 ⊕ . . . ⊕ Gn = G with Gi = (Vi, Ti, Pi, Si), for i = 1, 2, . . . , n, and G = (V, T, P, S). By definition, we must have Vi ∩ Vi+1 = ∅ for i = 1, . . . , n − 1. Moreover, V = V1 ∪ V2 ∪ . . . ∪ Vn, T = (T1 ∪ T2 ∪ . . . ∪ Tn) \ (V2 ∪ . . . ∪ Vn), P = P1 ∪ P2 ∪ . . . ∪ Pn and S = S1. The proof of the next lemma can be found in the full paper [1].

Lemma 1. Let G1 and G2 be two NSE grammars; then G1 ⊕ G2 is NSE.

As a consequence of the above lemma, observe that the ⊕-composition of two regular grammars G1 and G2 (either left- or right-linear) gives, in general, a non-self-embedding grammar. The main result of this section states that the converse is also true: every NSE grammar can be obtained as the ⊕-composition of a finite number of regular grammars. We now give the Decomposition Theorem.

Theorem 2. Let G = (V, T, P, S) be a NSE grammar. Then there exist n regular grammars G1, G2, . . . , Gn, for some n, such that G = G1 ⊕ G2 ⊕ . . . ⊕ Gn.

Proof. (Sketch) The idea of the proof is to “extract” all the grammars G1, G2, . . . , Gn from G, one after the other. We start with G1 = (V1, T1, P1, S1), which is obtained from G by considering some of G's variables as terminals. More precisely, S1 = S, while the variables in V1 are only those that are both “reachable” from the start symbol and from which the start symbol can be “reached”. All the other variables of G will be added to the set of terminals, together with G's terminals. Then we proceed by defining, in order, G2, . . . , Gn. In general, grammar Gi = (Vi, Ti, Pi, Si), i = 2, . . . , n, is defined after Gi−1 as follows. The start symbol Si is one of the variables of G that is considered terminal in Gi−1. Then the variables in Vi are the variables of G that are both “reachable” from Si and from which Si can be “reached”. The productions in Pi ⊆ P are defined by selecting the rules having Vi variables in the left-hand side. All the symbols on the right-hand sides that are not in Vi will constitute the set of terminals Ti. The key idea behind the proof lies in the order of the Gi's (i.e. the order in which the start symbols Si are chosen), which is based on a topological ordering of the vertices of a particular production graph. This guarantees that, once a grammar Gi has been chosen, the variables of G possibly occurring in the right-hand sides of rules in Pi do not belong to any set Vj with j < i. □

The complete proof can be found in the full paper [1]. We only remark that such a proof is constructive, i.e. it gives an effective and efficient (polynomial time) procedure to decompose a NSE grammar in terms of regular grammars. Observe that Theorem 2 together with Lemma 1 implies that NSE grammars are exactly those grammars obtained as a ⊕-composition of regular grammars (either right- or left-linear). Moreover, by Remark 1, one obtains Chomsky's result (Theorem 1) as a corollary, since regular languages are closed under substitution.
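The constructive idea behind this proof can be sketched in Python as follows. This is our own rendering of the idea (not the procedure of the full paper), reusing the Grammar tuple introduced above; for simplicity it only decomposes the part of G reachable from the start symbol, and the names production_graph and decompose are ours.

def production_graph(G):
    # Edge A -> B whenever some production A -> alpha B beta is in P.
    succ = {A: set() for A in G.V}
    for lhs, rhs in G.P:
        succ[lhs] |= {X for X in rhs if X in G.V}
    return succ

def decompose(G):
    succ = production_graph(G)
    pred = {A: {B for B in G.V if A in succ[B]} for A in G.V}

    def reach(start, edges):
        seen, todo = set(), [start]
        while todo:
            v = todo.pop()
            if v not in seen:
                seen.add(v)
                todo.extend(edges[v])
        return seen

    # Group the variables reachable from S into strongly connected components
    # of the production graph (reachable-and-co-reachable sets).
    left, comps = reach(G.S, succ), []
    while left:
        A = next(iter(left))
        C = reach(A, succ) & reach(A, pred)
        comps.append(C)
        left -= C

    # Emit the components in a topological order of the condensation: a component
    # is taken only when no remaining component still has an edge into it, so the
    # variables in the right-hand sides of G_i always end up in some G_j with j > i.
    pieces = []
    while comps:
        for C in comps:
            if not any(succ[b] & C for D in comps if D is not C for b in D):
                break
        comps.remove(C)
        Pi = {(l, r) for (l, r) in G.P if l in C}
        Ti = {X for (_, r) in Pi for X in r if X not in C}
        Si = G.S if G.S in C else next(iter(C))  # start symbols of later pieces are immaterial
        pieces.append(Grammar(C, Ti, Pi, Si))
    return pieces                                # G = pieces[0] ⊕ pieces[1] ⊕ ... ⊕ pieces[-1]

When G is NSE, each returned piece is a regular (left- or right-linear) grammar, as the theorem states; the sketch itself only performs the decomposition and does not verify this.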

To conclude, we remark that, although the decomposition of NSE grammars into left- and right-regular grammars evokes a similarity with two-way finite automata, the two representations are quite different (see Example 1 in the next section).

4 Advantages of NSE representations

In this section we compare the representation of regular languages by NSE grammars with the representation by finite automata (regular grammars). We will see that NSE grammars are much more concise when we exploit the operation of composition on grammars some of which are identical. We start by defining the rational operations between languages represented via NSE grammars. The rational operations of union, concatenation and star of languages are basic in the construction of regular languages. Let G1 = (V1, T1, P1, S1) and G2 = (V2, T2, P2, S2) be two context-free grammars. Without loss of generality, we assume that the sets of variables V1 and V2 are disjoint. We define the following grammars, corresponding to the standard constructions for the union, concatenation and star of context-free grammars [4]:

– Gu = (Vu, Tu, Pu, Su) with Vu = V1 ∪ V2 ∪ {Su}, Tu = T1 ∪ T2, Pu = P1 ∪ P2 ∪ {Su → S1 | S2};
– Gc = (Vc, Tc, Pc, Sc) with Vc = V1 ∪ V2 ∪ {Sc}, Tc = T1 ∪ T2, Pc = P1 ∪ P2 ∪ {Sc → S1S2};
– Gs = (Vs, Ts, Ps, Ss) with Vs = V1 ∪ {Ss}, Ts = T1, Ps = P1 ∪ {Ss → SsS1 | ε}.

Proposition 1. If the grammars G1 and G2 are NSE then the grammars Gu, Gc and Gs are NSE.

Observe that, when applying rational operations to NSE grammars, the resulting grammar is of the “same” type as the starting ones (while in the automata case we get non-determinism or ε-transitions). Moreover, the size of the resulting grammar only increases by an additive constant: we add only one production (while in the automata case, we have to add a number of transitions that depends on the number of transitions in the starting automata). The most interesting case is the concatenation when L(G1) = L(G2): when we define the grammar Gc for the square L′ = LL of a regular language L = L(G), we do not need to make two disjoint copies of the grammar G. Then the size of the NSE grammar Gc differs from the size of G only by an additive constant. On the other hand, classical constructions give an NFA for L′ = LL whose size is at least twice the size of an NFA for L. Observe that this is only one simple case of a typical situation. In fact, by definition, any regular language is obtained as the application of rational operations to simpler languages (some of which are often identical!). We observe that the rational operations can also be given in terms of ⊕-compositions. More precisely, we define:

G+ = ({S}, {S1, S2}, {S → S1 | S2}, S)
G• = ({S}, {S1, S2}, {S → S1S2}, S)
G∗ = ({S}, {S1}, {S → SS1 | ε}, S).
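For concreteness, the three constructions Gu, Gc and Gs above can be written down directly on the Grammar tuples of the previous section. Again this is only a sketch: the helper names union_g, concat_g and star_g and the fresh start symbols are ours, and the fresh symbols must be chosen so as not to clash with existing variables.

def union_g(G1, G2, fresh='Su'):
    # Gu: add the single production Su -> S1 | S2
    return Grammar(G1.V | G2.V | {fresh}, G1.T | G2.T,
                   G1.P | G2.P | {(fresh, (G1.S,)), (fresh, (G2.S,))}, fresh)

def concat_g(G1, G2, fresh='Sc'):
    # Gc: add the single production Sc -> S1 S2
    return Grammar(G1.V | G2.V | {fresh}, G1.T | G2.T,
                   G1.P | G2.P | {(fresh, (G1.S, G2.S))}, fresh)

def star_g(G1, fresh='Ss'):
    # Gs: add Ss -> Ss S1 | epsilon
    return Grammar(G1.V | {fresh}, G1.T,
                   G1.P | {(fresh, (fresh, G1.S)), (fresh, ())}, fresh)

In particular, for the square L′ = L·L of a single language L = L(G) it suffices to call concat_g(G, G): no disjoint copy of G is created, and the grammar grows by a single production, in accordance with the discussion above.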

It is immediate to verify that Gu = G+ ⊕ G1 ⊕ G2, Gc = G• ⊕ G1 ⊕ G2 and Gs = G∗ ⊕ G1. We observed that NSE representations are more concise than finite-automata ones when the language contains concatenations of copies of the same languages. Similar arguments apply to ⊕-compositions: NSE grammars are in general much more concise when we exploit the operation of composition on grammars some of which are identical. The next example shows that the difference in size between the two representations can even be exponential.

Example 1. Let L = {a^{2^k}}. Any NFA for L has at least 2^k states; otherwise it would have a loop on its states and would recognize an infinite language. Let G = (V, T, P, Ak), where V = {A0, A1, . . . , Ak}, T = {a}, and P = {Ak → Ak−1Ak−1, Ak−1 → Ak−2Ak−2, . . . , A1 → A0A0, A0 → a}.

It is not difficult to see that G is NSE and that L(G) = {a^{2^k}}. This shows that the minimal size of an NFA accepting a regular language L can be exponential with respect to an NSE grammar generating L. Notice that G is obtained as a composition of very simple grammars, exploiting the effect of generating and substituting two copies of the same language in different occurrences. More precisely, G can be decomposed as G = Gk ⊕ Gk−1 ⊕ · · · ⊕ G1, where G1 = ({A1}, {a}, {A1 → aa}, A1) and Gi = ({Ai}, {Ai−1}, {Ai → Ai−1Ai−1}, Ai), for i = 2, . . . , k. Remark that, for every i = 2, . . . , k, L(Gi ⊕ · · · ⊕ G1) can be obtained from L(Gi) by applying a substitution that maps Ai−1 to L(Gi−1 ⊕ · · · ⊕ G1) (see Remark 1).
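As a quick check of this example, the following sketch (reusing the Grammar tuple and oplus from Section 3; power_grammar and expand are our own helper names) builds G = Gk ⊕ · · · ⊕ G1 and expands it, confirming that k productions suffice for the single word a^{2^k}.

def power_grammar(k):
    G = Grammar({'A1'}, {'a'}, {('A1', ('a', 'a'))}, 'A1')          # G1
    for i in range(2, k + 1):
        Gi = Grammar({'A%d' % i}, {'A%d' % (i - 1)},
                     {('A%d' % i, ('A%d' % (i - 1),) * 2)}, 'A%d' % i)
        G = oplus(Gi, G)                                             # Gi ⊕ (Gi−1 ⊕ ... ⊕ G1)
    return G

def expand(G, sym):
    # Each variable has exactly one production here, so the generated word is unique.
    if sym in G.T:
        return sym
    (rhs,) = [r for (l, r) in G.P if l == sym]
    return ''.join(expand(G, x) for x in rhs)

G = power_grammar(10)
assert expand(G, G.S) == 'a' * 2 ** 10 and len(G.P) == 10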

5 NSE grammars and push-down automata

In this section we analyze the relationships between NSE grammars and push-down automata (PDA). The main result states that a context-free grammar is NSE if and only if the corresponding equivalent PDA has a bounded stack. We recall that a grammar G = (V, T, P, S) is in canonical form if its productions are of the kind A → aγ or A → γ, with A ∈ V, a ∈ T, and γ ∈ V∗. As is well known [4], for any context-free grammar G, there exists a context-free grammar G′ in canonical form such that L(G′) = L(G). Moreover, a grammar G = (V, T, P, S) in canonical form can be easily transformed into an equivalent PDA MG [4]. More precisely, MG = ({q}, T, V, δ, q, S, ∅), where the transition function δ is defined by: (q, γ) ∈ δ(q, a, A) if and only if A → aγ ∈ P, with a ∈ T ∪ {ε} and A ∈ V. The PDA MG simulates the leftmost derivations of G:

S ⇒∗ℓ wγ  ⇐⇒  (q, w, S) ⊢∗MG (q, ε, γ),   for w ∈ T∗ and γ ∈ V∗.   (1)

Proposition 2. Let G be a NSE grammar in canonical form and let MG be the corresponding equivalent PDA defined as above. Then there exists a constant K > 0 such that, in any computation of MG, the string contained in the stack has length upper-bounded by K.

Proof. Let G = (V, T, P, S) be an NSE grammar in canonical form and let k = |V| and h = max{|γ| : A → γ or A → aγ in P, A ∈ V, a ∈ T}. One can prove by induction on k that for any leftmost derivation A ⇒∗ℓ wγ, with A ∈ V, w ∈ T∗, and γ ∈ V∗, one has |γ| ≤ hk. Let us set K = hk. By Eq. (1), if γ appears in the stack in some computation, then S ⇒∗ℓ wγ for some w ∈ T∗ and then |γ| ≤ K. □

Since any PDA with bounded stack accepts a regular language, Proposition 2 gives a new proof of Chomsky's Theorem (Theorem 1). The converse of the previous proposition holds under the hypothesis that all the symbols of the grammar G are useful, i.e. they appear inside a derivation of a string of L(G). More precisely, let G be a grammar in canonical form whose symbols are all useful, and let MG be the corresponding equivalent PDA defined as above. If there exists a constant K > 0 such that in any computation of MG the string contained in the stack has length upper-bounded by K, then G is NSE. The proof goes by contradiction. Furthermore, it is well-known that, given a PDA M, one can construct a context-free grammar GM such that L(M) = L(GM) [4]. We remark that, in this construction, the first step consists in building a PDA M′ having only one state such that L(M) = L(M′). One can easily prove that M′ has a bounded stack if and only if M has a bounded stack. Moreover, from M′ one constructs the grammar GM, which is in canonical form and such that the corresponding equivalent PDA is exactly M′. Therefore, using Proposition 2 and its converse, one can state the following proposition.

Proposition 3. Let M be a PDA and GM the corresponding equivalent grammar. Then M has bounded stack if and only if GM is NSE.

It is interesting to observe that the classical constructions on the equivalence of PDA's and CFG's [3] take polynomial time. Therefore, for a CFG G the equivalent PDA MG has polynomial size w.r.t. G. Conversely, for a PDA M the equivalent CFG GM has polynomial size w.r.t. M. Therefore, the representations of regular languages by PDA with bounded stack and by NSE grammars are equivalent in size up to a polynomial.
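To make the construction and the stack bound concrete, here is a small Python sketch of the one-state PDA MG above, together with a breadth-first exploration of its configurations that records the largest stack it ever uses. The input format, the names pda_delta and run, and the max_stack cut-off are ours; acceptance is by empty stack, as in the construction above.

from collections import deque

def pda_delta(productions):
    # (q, gamma) ∈ delta(q, a, A) iff A -> a gamma is a production; `productions`
    # is a list of triples (A, a, gamma) with a a terminal or '' (epsilon) and
    # gamma a tuple of variables (canonical form).
    delta = {}
    for A, a, gamma in productions:
        delta.setdefault((a, A), set()).add(gamma)
    return delta

def run(delta, word, start, max_stack=100):
    # Breadth-first search over the configurations (input position, stack) of M_G.
    # Returns whether `word` is accepted (empty stack, all input read) and the
    # largest stack height seen along the way.
    first = (0, (start,))
    seen, queue = {first}, deque([first])
    accepted, highest = False, 1
    while queue:
        pos, stack = queue.popleft()
        highest = max(highest, len(stack))
        if not stack:
            accepted = accepted or pos == len(word)
            continue
        A, rest = stack[0], stack[1:]
        moves = [('', pos)]                               # epsilon-moves: A -> gamma
        if pos < len(word):
            moves.append((word[pos], pos + 1))            # reading moves: A -> a gamma
        for a, new_pos in moves:
            for gamma in delta.get((a, A), ()):
                conf = (new_pos, gamma + rest)
                if len(conf[1]) <= max_stack and conf not in seen:
                    seen.add(conf)
                    queue.append(conf)
    return accepted, highest

# The grammar of Example 1 with k = 3 is already in canonical form; here h = 2 and
# |V| = 4, so Proposition 2 bounds the stack by K = hk = 8 (this run stays at 4).
P = [('A3', '', ('A2', 'A2')), ('A2', '', ('A1', 'A1')),
     ('A1', '', ('A0', 'A0')), ('A0', 'a', ())]
print(run(pda_delta(P), 'a' * 8, 'A3'))                   # (True, 4)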

6 Test for self-embedding property

We recall that, given a context-free grammar G generating a language over the alphabet Σ, it is not decidable whether L(G) is regular. (Notice that it is even undecidable whether L(G) = Σ∗.) In this section we describe an algorithm to test whether G is non-self-embedding; if G is non-self-embedding then L(G) is regular. The test is based on associating with G a labelled directed graph, whose labels are taken in a properly introduced semi-ring. The SE property for G is characterized by the existence of some special paths in this graph, and these paths are detected by powering an associated matrix with entries in the semi-ring. Let us introduce the semi-ring C = {ℓ, r, b, 0} equipped with the operations of sum and product given by the following tables.

 +   ℓ   r   b   0
 ℓ   ℓ   b   b   ℓ
 r   b   r   b   r
 b   b   b   b   b
 0   ℓ   r   b   0

 ×   ℓ   r   b   0
 ℓ   ℓ   b   b   0
 r   b   r   b   0
 b   b   b   b   0
 0   0   0   0   0

Let G be a unit-free context-free grammar, where the set of variables is V = {A1, A2, · · · , An}. We associate a graph and a matrix with G as follows. The Labelled Production Graph H(G) is a labelled directed graph whose vertices are the variables of G; there is an edge A → B iff there exists a production rule A → αBβ in P, and the label of the edge A → B is defined in C by:

lab(A → B) = ℓ  if for every A → αBβ in P it holds α ≠ ε, β = ε;
lab(A → B) = r  if for every A → αBβ in P it holds α = ε, β ≠ ε;
lab(A → B) = b  otherwise.

The Transition Matrix M(G) is an n × n matrix whose entries are defined in C as follows, for any i, j ∈ {1, 2, · · · , n}:

M(G)i,j = 0              if there is no production rule of type Ai → αAjβ in P;
M(G)i,j = lab(Ai → Aj)   otherwise.

The following algorithm tests whether the grammar G is NSE.

Se-Test(G = (V, T, P, S))
 1  for i ← 1 to |V|
 2      do for j ← 1 to |V|
 3          do M(G)i,j ← 0
 4  for each production Ai → αAjβ in P
 5      do if α ≠ ε and β = ε
 6          then if M(G)i,j = ℓ or M(G)i,j = 0
 7              then M(G)i,j ← ℓ
 8              else M(G)i,j ← b
 9         if α = ε and β ≠ ε
10          then if M(G)i,j = r or M(G)i,j = 0
11              then M(G)i,j ← r
12              else M(G)i,j ← b
13         if α ≠ ε and β ≠ ε
14          then M(G)i,j ← b
15  M^1 ← M(G)
16  M ← M^1
17  for i ← 2 to |V|
18      do M^i ← M^{i−1} M^1
19         M ← M + M^i
20  for i ← 1 to |V|
21      do if Mi,i = b
22          then return “G is SE and Ai is SE”
23  return “G is NSE”

The algorithm Se-Test constructs the transition matrix M(G) in lines 4-14. After the execution of the for loop in lines 17-19, we find M = M(G)^{≤|V|}, where for a matrix M and an integer n we use the notation M^{≤n} = M + M^2 + · · · + M^n. Finally, Se-Test either returns the message “G is SE and Ai is SE” when it finds that M(G)^{≤|V|}_{i,i} = b for some i, or the message “G is NSE” otherwise. It is easy to see that the running time of the algorithm Se-Test on G = (V, T, P, S) is polynomial in the size of G (indeed O(|P| + |V|^4)).

We now want to prove the correctness of the algorithm Se-Test. More precisely, we claim that the algorithm Se-Test on a unit-free grammar G returns the message “G is NSE” iff G is a non-self-embedding grammar. A grammar G is self-embedding, by definition, iff there exists a variable A and a derivation A ⇒∗ αAβ with α, β ≠ ε. Note that no constraint limits such a derivation, and in particular its length. The following proposition restricts the type and the length of the derivations to be tested in order to decide whether a grammar is self-embedding. We say that a derivation α0A0β0 ⇒ α1A1β1 ⇒ · · · ⇒ αnAnβn is simple if Ah = Ak, with 1 ≤ h < k ≤ n, implies h = 1, k = n. Therefore the length of a simple derivation in a grammar G = (V, T, P, S) is at most |V|. In what follows, ⇒[m] denotes a derivation of length m.

Proposition 4. Let G = (V, T, P, S) be a unit-free context-free grammar. A variable A ∈ V is self-embedding iff there exists a derivation A ⇒∗ αAβ such that one of the following two cases holds:
1. the derivation is simple and α, β ≠ ε;
2. the derivation A ⇒∗ αAβ can be split into

A ⇒[i] α′Bβ′ ⇒[j] α′α″Bβ″β′ ⇒[k] α′α″α‴Aβ‴β″β′

so that the derivations A ⇒[i] α′Bβ′ ⇒[k] α′α‴Aβ‴β′ and B ⇒[j] α″Bβ″ are both simple and either α′α‴ = β″ = ε and α″, β‴β′ ≠ ε, or α″ = β‴β′ = ε and α′α‴, β″ ≠ ε.

Proof. (Sketch) Let A be a self-embedding variable of G and let A ⇒[n] αAβ be a derivation with α, β ≠ ε of minimal length n. If the derivation is not simple then it can be split according to the statement. One can prove that the resulting derivations are both simple by supposing the contrary, removing the derivation from a repeated variable to its next occurrence and showing that we then obtain some new derivation that contradicts the minimality of n. □

Proposition 4 can be translated in terms of the labelled production graph. Let us say that a path in the labelled production graph H(G) is of type ℓ (r, resp.) if all the edges composing it are labelled ℓ (r, resp.); it is of type b otherwise. Proposition 4 implies that G is self-embedding iff there exists a vertex X in H(G) such that either X has a loop of type b and of length at most |V|, or X has two loops of length at most |V|, one of type ℓ and the other one of type r. Furthermore, this characterization can be easily tested on the transition matrices. Indeed, G is SE iff there exists a variable Ai such that M(G)^{≤|V|}_{i,i} = b.
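A direct transcription of Se-Test in Python may help in following the example below. It is only a sketch (symbols are single characters, the grammar is assumed unit-free, and the helper names are ours); lines 4-14 of the pseudocode are replaced by one semiring addition per occurrence of a variable in a right-hand side, which is equivalent.

def s_add(x, y):            # semiring sum: 0 is neutral, equal labels survive, mixtures give b
    if x == '0': return y
    if y == '0': return x
    return x if x == y else 'b'

def s_mul(x, y):            # semiring product: 0 annihilates, equal labels survive, mixtures give b
    if x == '0' or y == '0': return '0'
    return x if x == y else 'b'

def se_test(variables, productions):
    n, V = len(variables), set(variables)
    idx = {A: i for i, A in enumerate(variables)}
    M1 = [['0'] * n for _ in range(n)]
    for A, rhs in productions:                       # lines 4-14: build M(G)
        for p, X in enumerate(rhs):
            if X in V:
                alpha, beta = rhs[:p], rhs[p + 1:]   # G unit-free: not both empty
                lab = 'b' if (alpha and beta) else ('l' if alpha else 'r')
                i, j = idx[A], idx[X]
                M1[i][j] = s_add(M1[i][j], lab)

    def mat_mul(X, Y):
        Z = [['0'] * n for _ in range(n)]
        for i in range(n):
            for k in range(n):
                for j in range(n):
                    Z[i][j] = s_add(Z[i][j], s_mul(X[i][k], Y[k][j]))
        return Z

    M, Mi = [row[:] for row in M1], M1
    for _ in range(2, n + 1):                        # lines 17-19: M = M(G) + ... + M(G)^n
        Mi = mat_mul(Mi, M1)
        M = [[s_add(M[i][j], Mi[i][j]) for j in range(n)] for i in range(n)]
    for i, A in enumerate(variables):                # lines 20-22: look for b on the diagonal
        if M[i][i] == 'b':
            return 'G is SE and %s is SE' % A
    return 'G is NSE'

# Run on the grammar of Example 2 below:
print(se_test('SAB', [('S', 'aSb'), ('S', 'AB'), ('A', 'aB'), ('A', 'a'),
                      ('B', 'bA'), ('B', 'Bb'), ('B', 'b')]))   # "G is SE and S is SE"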

Example 2. Let G = (V, T, P, S) where V = {S, A, B}, T = {a, b} and P is given by: S → aSb | AB, A → aB | a and B → bA | Bb | b. Suppose the vertices in V are numbered as A1 = S, A2 = A and A3 = B. The matrix M(G) is the following one:

M(G) = ( b  r  ℓ )
       ( 0  0  ℓ )
       ( 0  ℓ  r )

The grammar G is SE and all its variables are self-embedding. As an example, A is SE and the derivation A ⇒ aB ⇒ aBb ⇒ abAb satisfies case 2) of Proposition 4. Further, this derivation yields in H(G) two loops on B, each of length at most |V|: B → B (labelled r) of type r and length 1 < |V|, and B → A → B (both edges labelled ℓ) of type ℓ and length 2 < |V|. Finally, these loops on B = A3 imply M(G)^{≤|V|}_{3,3} = b. Indeed M(G)^{≤|V|}_{i,i} = b for i = 1, 2, 3. Applying the algorithm Se-Test on G we have M^1 = M(G) and

M^2 = ( b  b  b )      M^3 = ( b  b  b )
      ( 0  ℓ  b )            ( 0  b  b )
      ( 0  b  b )            ( 0  b  b )

After the execution of the for loop of lines 17-19, M = M(G)^{≤3} = M^3. Hence the algorithm finds M1,1 = b in line 21 and returns: “G is SE and A1 is SE”.

The previous considerations allow us to claim the correctness of the algorithm Se-Test and to state the main result of this section.

Theorem 3. It is decidable whether a context-free grammar is NSE or not.

7 Conclusions and Further Directions

The next step of our work will be the comparison of the representation of regular languages by NSE grammars with all the other known formalisms. In particular, it would be very interesting to develop efficient algorithms to transform regular expressions to/from NSE grammars, as well as finite automata to/from NSE grammars. It could also be interesting to study the complexity of an algorithm that transforms a finite automaton into a regular expression using an NSE grammar as an intermediate stage of the transformation.

References

1. M. Anselmo, D. Giammarresi, S. Varricchio. Non-Self-Embedding Grammars as a Representation for Regular Languages. Full paper available at www.mat.uniroma2.it/~giammarr/Papers/nse.ps
2. N. Chomsky. A note on phrase-structure grammars. Information and Control, Vol. 2, pp. 393-395, 1959.
3. M. A. Harrison. Introduction to Formal Language Theory. Addison-Wesley, Reading, MA, 1978.
4. J. E. Hopcroft, R. Motwani and J. D. Ullman. Introduction to Automata Theory, Languages and Computation, 2nd Edition. Addison-Wesley, 2001.