Formal Languages, Natural Grammars, and Axiomatic Systems

Diego Gabriel Krivochen
University of Reading, UK / CINN
e-mail: [email protected]

This project addresses the formal nature of grammars from a mathematical and computational point of view, and asks what requirements natural languages impose on the theoretician developing a grammar. We propose to revisit the so-called Chomsky Hierarchy, an inclusive hierarchy of formal grammars, in the light of recent developments in Biolinguistics and computational / mathematical linguistics; for the kind of transformational system advocated in generative linguistics is often not appropriate to account for problematic data from natural languages, particularly when a comparative perspective is adopted.

1. Definitions: Grammar of L: a finite set of rules that generate structural descriptions of formulae belonging to a (formal / natural) language L, such that:

Constructivist version: Given a generative system Σ and a finite set S = {α1…αn} of well-formed formulae, Σ generates S and, crucially, no α such that α ∉ S.

E.g.: “x + y = z” is a well-formed formula; “- + =” is not.

Problem: S is finite by stipulation. Additional criteria are needed to determine decidability over a string in a closed system (the so-called Entscheidungsproblem). Within generative grammar, it has been said that:

“We must require of such a linguistic theory that it provide for:
(i) an enumeration of the class S1, S2, … of possible sentences
(ii) an enumeration of the class SD1, SD2, … of possible structural descriptions
(iii) an enumeration of the class G1, G2, … of possible generative grammars
(iv) specification of a function f such that SDf(i, j) is the structural description assigned to sentence Si by grammar Gj, for arbitrary i, j
(v) specification of a function m such that m(i) is an integer associated with the grammar Gi as its value (with, let us say, lower value indicated by higher number)” (Chomsky, 1965: 31)

There is no reason to assume that the requirements above do not apply to formal languages.



Restrictivist version: given a generative system Σ and a set S of discrete units of whatever nature, Σ manipulates members of S freely and unboundedly, all constraints being determined by external systems

Problem: Are there external systems in mathematical expressions? Halting problem → Cf. Frege’s Function & Object.

Language: a finite set of well-formed formulae, where a ‘formula’ is:
a) A string of terminal nodes: #a#⏜#b#
b) A set of strings of terminal nodes: A = {#a#⏜#b#}
c) A set of sets of strings of terminal nodes: {A, B}, where A = {#a#} & B = {#b#} (Note: if A ≡ #a#, then we write |a|)
d) An object x of arbitrary computational complexity

General problem: Are languages really finite (or countably infinite)? In which respect? To what extent? Formally, if a language is a Cantor set, and that set is infinite, then there is no way to represent formulae that do not belong to the set (i.e., there are no ill-formed formulae). Let us assume a set S of well-formed formulae. If S = {a1, a2, …∞}, then ¬an ∉ S; but an infinite set has no decidability criteria, as none are needed, for ∀(a), a ∈ S. Therefore, either we fall into a contradiction, or the decidability criterion ‘grammaticality’ is to be eliminated. However, when we deal with terminals instead of formulae, S = {#a1#, #a2#, …∞} and S’ = {#a’1#, #a’2#, …∞}, given a’1 ∉ S, are acceptable sets, as long as S ⊇ S’ or vice versa. E.g.: ℤ ⊃ ℕ.

Is it possible for a language to be only partially countably infinite (e.g., that there is a 1-to-1 correspondence between S and ℕ)? Or that it is of a higher order (e.g., a 1-to-1 correspondence between S and ℝ)? It all depends on the grammar we consider for each set… These considerations lead us to consider types of formal languages:

Type 3 languages: regular languages. Type 3 algorithms: Peano’s successor function, S(x) = x + 1 (generates ℕ from a finite set of axioms).
Type 2 languages: context-free systems. Type 2 algorithms: Chomsky-style rewriting grammars (sequential, distinguishing terminals and nonterminals).
Chomsky normal form: every context-free language is generated by a grammar in which all productions are of the form A → BC or A → b (A, B, C nonterminals; b a terminal).
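The normal-form restriction is what makes cubic-time recognition possible: the CYK algorithm decides membership bottom-up for any CNF grammar. The sketch below uses a toy CNF grammar of our own choosing for {aⁿbⁿ : n ≥ 1}, not a grammar from the text:

```python
# Toy grammar in Chomsky normal form for {a^n b^n : n >= 1}:
#   S -> A T | A B,  T -> S B,  A -> a,  B -> b
binary = {("A", "T"): {"S"}, ("A", "B"): {"S"}, ("S", "B"): {"T"}}
unary = {"a": {"A"}, "b": {"B"}}

def cyk(word, start="S"):
    """CYK recognition in O(n^3): table[i][j] holds the nonterminals
    deriving the substring word[i : i + j + 1]."""
    n = len(word)
    if n == 0:
        return False
    table = [[set() for _ in range(n)] for _ in range(n)]
    for i, ch in enumerate(word):                  # length-1 substrings
        table[i][0] = set(unary.get(ch, set()))
    for span in range(2, n + 1):                   # substring length
        for i in range(n - span + 1):              # start position
            for split in range(1, span):           # binary split point
                for L in table[i][split - 1]:
                    for R in table[i + split][span - split - 1]:
                        table[i][span - 1] |= binary.get((L, R), set())
    return start in table[0][n - 1]
```

Because every production is binary or terminal, each cell is filled by combining exactly two smaller cells, which is what bounds the runtime.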

Greibach normal form: every context-free language is generated by a grammar in which all productions are of the form A → bα, where b is a terminal and α is a string of nonterminal variables.
[Type 2.5: Joshi’s (1985) Tree Adjoining Grammars]
Type 1 languages: context-sensitive systems. Type 1 algorithms: Σ → F / C_C, that is, ‘rewrite Σ as F in the context {#C#⏜#Σ#⏜#C#}’ (these generate, among other things, syllabic structures with allophony). (Note: Type 1-2 languages are known as phrase structure languages.)
Type 0 languages: recursively enumerable languages. Type 0 algorithms: recursive algorithms in the sense of Gödel (1931); transformational grammars; Turing machine instructions.

Inclusion theorem (Chomsky, 1959: 143): “THEOREM 1. For both grammars and languages, type 0 ⊇ type 1 ⊇ type 2 ⊇ type 3.”

Type 1 languages are also decidable, as there are contextual restrictions upon well-formed formulae. Type 1 and 2 languages can be ‘pumped’ (the so-called Pumping Lemma).

Problems:
a) Not all Turing Machines work in the same way (e.g., probabilistic vs. deterministic TMs).
b) The hierarchy does not take into account hypercomputation, parallel computation, or interactive computation models.
c) There are problems that arise only when dealing with formal languages, which do not appear when analyzing natural languages (e.g., the halting problem). Can we assume the hierarchy is valid for natural languages? Or is it an empirical question? (as Lasnik & Uriagereka, 2012 and Krivochen, 2014 propose)
d) The Church-Turing thesis (CTT), which equates effective computation with TMs, applies only to function-based computation (finite input, finite output, closed derivations) (Turing, 1936), and it is not obvious (in fact, it is quite unlikely) that it applies to production and / or comprehension in natural language, which appeals to several kinds of information at once in parallel computations.

2. Languages and Automata   

Turing Machine: unlimited memory tape
Linear Bounded Automaton: memory limited to the input, although free within a local domain
Push-Down Automaton: memory limited to a single derivational step



Markov processes: memory-less

Notice that, crucially, we are always talking about ROM-like memory, as working memory (RAM) is not taken into account in these formalisms. Neurocognitively, RAM is a dynamic workbench whose neurological bases seem to be located in the frontal neocortex (D’Esposito, 2007), in coactivation with other brain areas in order to process different stimuli.

Recursion, as defined by Gödel, appears in Markov chains in the form of head-tail recursion, where an element can appear as Σ and F in the same chain. However, free structural embedding (i.e., […X…[…X…]]) is a feature of PDAs and up (true recursion). Let us review Gödel’s original formulation of a number-theoretic recursive function (which, if TMs are function-based, applies to them, and is therefore to be taken into account for Turing-like approaches to linguistic computation):

“[a] number theoretic function φ is said to be recursive if there is a finite sequence of number-theoretic functions φ1, φ2, . . . , φn that ends with φ and has the property that every function φn of the sequence is recursively defined in terms of [...] preceding functions, or [...] is the successor function x + 1” (Gödel, 1931: 159)

Recursion is not to be confused with infinitude: a formal object / system can be infinite without being recursive (e.g., ℕ) and vice versa (e.g., natural languages implemented in a human mind). In the definitions above, processing time is not taken into account when considering an automaton’s capacity; only memory is considered.

3. Arithmetic, geometry, and formal languages:

ℕ: generable via the successor function (Type 3)
ℤ: generable via a context-free function (Type 2)
ℚ: generable via a context-sensitive function (Type 1)
ℝ: generable via a TM (Type 0)

Therefore, according to Chomsky’s theorem, ℝ ⊃ ℚ ⊃ ℤ ⊃ ℕ.

Problem: Can a TM generate ℂ? In other words: is a TM necessary to generate all members of ℂ?
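Gödel's recursive build-up from the successor function can be illustrated directly. The sketch below is ours (standard textbook material, not code from the text): each function is defined recursively in terms of the preceding one, with S(x) = x + 1 as the seed:

```python
# Godel-style number-theoretic recursion, built up from the successor function.
def succ(x):
    return x + 1          # Peano's S(x) = x + 1, the Type 3 seed

def add(x, y):
    # addition, defined recursively in terms of the preceding function succ
    return x if y == 0 else succ(add(x, y - 1))

def mul(x, y):
    # multiplication, defined recursively in terms of the preceding function add
    return 0 if y == 0 else add(x, mul(x, y - 1))
```

The sequence succ, add, mul is exactly a finite sequence of functions each recursively defined from its predecessors, in the sense of the quotation above.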
Crucial note to be taken into account: the fact that a formal object O (or its structural description) can be generated with a grammar G does not mean that:
a) The computational nature of O is the same as that of the rest of the objects G potentially generates (consider fuzzy set theory, in the best of cases; classification mistakes, in the worst)
b) G is the simplest grammar that can generate O

E.g.: Fibonacci (and similar patterns); iteration and coordination in natural languages (Krivochen, 2015a).

4. Grammaticality, ungrammaticality, formal axiomatic systems

1) “Any number either is prime or is measured by some prime number.” (Euclid, Elements VII, Proposition 32) Language: ℕ
2) “That, if a straight line falling on two straight lines make the interior angles on the same side less than two right angles, the two straight lines, if produced indefinitely, meet on that side on which are the angles less than the two right angles.” (Euclid, Elements I, Postulate 5) Language: Euclidean Geometry
3) “English is not a finite state language.” (Chomsky, 1957: 21) Language: English

Notice that only English is a language whose members (well-formed formulae) receive interpretation in a mind, and it is thus limited not formally but neuro-cognitively. In any case, that is not a property of the system, but of the implementational level of that system (Marr, 1982). What if we negate these propositions (1-3)? Do we get consistent systems?

5. Different systems, the same properties

a) The Scale of Natural Harmonics:

Fig. 1: Scale of natural harmonics

Harmonic: a sinusoidal function with a frequency expressed in terms of integer multiples. A harmonic of a wave is a component frequency of the signal that is an integer multiple of the fundamental frequency (i.e., if the fundamental frequency is f, the harmonics have frequencies 2f, 3f, 4f, etc.). As the integer increases, wavelength decreases and frequency increases.

4) Grammar: SNHn = n/(n-1)

In plain words, the value of the nth member of the scale is defined as a ratio between itself and the immediately preceding member. That is:

5) 1/2, 2/3, 3/4, 4/5, 5/6... = C3/C2, G3/C3, C4/G3, E4/C4, G4/E4...

The (geometrical) ratio among the members of SNH, seen in terms of Hz, is the following:

6) 2 — 1.5 — 1.33 — 1.25 — 1.2 — 1.17 — 1.14 — 1.125…

As can be seen, as values in Hz increase, the ratio tends asymptotically to 1.
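As a quick check: the harmonics of a fundamental f are f, 2f, 3f, …, so adjacent frequencies stand in the ratio (n+1)/n, which decreases towards 1. A short computation (the fundamental, A2 = 110 Hz, is an arbitrary choice of ours for illustration) reproduces the sequence in (6):

```python
# Successive frequency ratios of the natural harmonic series over a
# fundamental f: harmonics are f, 2f, 3f, ..., so the ratio between
# adjacent harmonics is (n+1)/n, which tends asymptotically to 1.
f = 110.0  # illustrative fundamental (A2), in Hz
harmonics = [n * f for n in range(1, 10)]
ratios = [harmonics[i + 1] / harmonics[i] for i in range(len(harmonics) - 1)]
# ratios: 2.0, 1.5, 1.33, 1.25, 1.2, 1.17, 1.14, 1.125 -- the sequence in (6)
```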

Fig. 2: Harmonics in a string of length 1. http://en.wikipedia.org/wiki/Harmonic#mediaviewer/File:Moodswingerscale.svg

Is the SNH a Markov chain? Crucially, there is no hierarchy among the members of the scale; tonal relations appear at a different computational level (PDA-level, perhaps). Memory is not needed either. The generative logic follows that of the series of Fibonacci integers (Fibn = Fibn-1 + Fibn-2), but Fib is an arithmetic ratio → does that imply that SNH is Markovian in nature?

Crucially, the fact that an object is describable by a formal procedure does not mean the nature of the object is consistent with that formal procedure. Nor does it imply that all parts of the object are equally describable by the same formal means (Krivochen, 2015a)

Is there a formal halting point? → No, there need not be one. However, the string (if physically real) cannot be infinite in length (among other reasons, because the amount of energy needed to move an infinitely long string is infinite). Therefore, as long as the string is physically finite, there is a necessary halting point.

b) Theodorus’ spiral:

Fig. 3: Theodorus’ spiral

7) Area of Tn = (1 · √n) / 2

Variation in the radii for Tn:

8) Δr = √(n+1) − √n

Each state of the system (i.e., each triangle) depends on the immediately preceding state. No memory is required, and the computation is strictly sequential.

c) Biological systems: “The laws of physics determine brain wiring. In fact, they are the same laws that operate in order to optimize river confluence and electrical charge patterns (…)” (adapted from Uriagereka, 1998: 17)
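The strictly sequential character of the spiral can be made explicit: triangle n has legs √n and 1 and hypotenuse √(n+1), so each step below (a sketch of ours) is computed solely from the preceding hypotenuse:

```python
import math

# Spiral of Theodorus: each state depends only on the immediately
# preceding one; nothing is stored beyond the current hypotenuse.
def theodorus(n_steps):
    r = 1.0  # hypotenuse of the current triangle; sqrt(1) at the start
    for n in range(1, n_steps + 1):
        area = r / 2                   # (7): legs sqrt(n) and 1
        r_next = math.sqrt(r * r + 1)  # new hypotenuse, sqrt(n + 1)
        yield n, area, r_next - r      # (8): delta r = sqrt(n+1) - sqrt(n)
        r = r_next
```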

Fig. 4: Real and optimal dispositions (Cherniak, 2009: 111)

Notice the word optimize, which does not entail perfection, but best use of available resources given certain output conditions. 

They are all generable via L-grammars, but…are they really Type-3 objects?

(Note: Given the traveling salesman problem, optimized Markov chain algorithms which use local-search heuristic sub-algorithms can find a route extremely close to the optimal route for 700 to 800 cities.)

Steiner trees: recurrent structures in biological systems (see Prusinkiewicz & Lindenmayer, 1990). (Given n points in a phase space, connect them with lines of the shortest possible length; additional nodes, so-called ‘Steiner points’, can be included in order to minimize segments.)

Generation of biological systems via L-grammars: “Organic form itself is found, mathematically speaking, to be a function of time [...] We might call the form of an organism an event in space-time, and not merely a configuration in space.” (D’Arcy Thompson; emphasis in the original)

Prusinkiewicz & Lindenmayer (1990) provide a grammar for the developmental pattern of the bacterium Anabaena catenula, where a and b represent cytological states of the cell (size and division possibility), and l and r indicate cell polarity (differences in form, structure, and function of the cell, which allow for different, specialized functions):

9) ω : ar
p1 : ar → al br
p2 : al → bl ar
p3 : br → ar
p4 : bl → al

d) Fractals

Let us clarify what are customarily taken to be the basic properties of fractal structures (based on Falconer, 2014; Strogatz, 1994: Chapter 11; Lapidus and van Frankenhuijsen, 2006):

10)
a. They display structure at arbitrarily small scales
b. They cannot be described by ‘traditional geometric language’, either locally or globally
c. They display self-similarity (either strictly geometrical / topological, or statistical)
d. Their dimensionality is often¹ a non-integer
e. They are not defined in Euclidean metric spaces, such that d(x, y) can range from 0 to ∞

¹ This depends on the notion of ‘dimensionality’ we consider. In general, Hausdorff dimensionality is the path to take, and we will adopt that formulation. In any case, there are fractals with an integer number of Hausdorff dimensions: a Hilbert or a Peano curve filling an n-dimensional space will have n dimensions. Perhaps less trivially, Julia sets have Hausdorff dimension 2 for some critical values, and so does the boundary of the Mandelbrot set (Shishikura, 1991).

f. They are almost always² obtained via iterative procedures

From our perspective, these are issues to take into consideration before giving a definitive answer to the question whether a relevant natural object is a fractal in any relevant respect.

“One begins with two shapes, an initiator and a generator. The latter is an oriented broken line made up of N equal sides of length r. Thus each stage of the construction begins with a broken line and consists in replacing each straight interval with a copy of the generator, reduced and displaced so as to have the same end points as those of the interval being replaced.” (Mandelbrot, 1982: 93)

A crucial point here is that the initiator and the generator are finite, but the application of the replacement operation yields infinite complexity representationally, once a certain threshold has been reached. Just as in half-lines, infinity is to be found only in one ‘derivational direction’: consider a light beam, originated at point A and travelling in a straight line through an infinite vacuum. The existence of a finite starting point does not imply that the object is representationally / derivationally seen as a fractal. Grammar: Σ → F
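Mandelbrot's initiator-plus-generator procedure can be sketched as string rewriting. The generator below is the standard quadratic Koch rule (our illustrative choice, under the usual turtle-graphics reading: F = draw forward, + / − = turn 60 degrees); every F is replaced simultaneously at each stage:

```python
# Initiator "F" plus the Koch generator "F+F--F+F": str.replace rewrites
# every F at once, mirroring Mandelbrot's replacement of each straight
# interval with a copy of the generator.
def koch(axiom="F", steps=2):
    s = axiom
    for _ in range(steps):
        s = s.replace("F", "F+F--F+F")
    return s
```

The string length quadruples in F-count at every stage: a finite initiator and generator, but unbounded growth under iterated replacement.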

Fig. 5: Koch snowflakes

Notice that the final object is in fact a fractal, self-similar and displaying infinite complexity. However, there are two possible approaches to the complexity of a Mandelbrot set as described by Mandelbrot himself (i.e., initiator + generator + algorithm):

- A grammar for the derivation
- A grammar for the final object

² The graphical representation of the motion of a Brownian particle is fractal, but not obtained by means of recursive or iterative procedures. Falconer (2014: xviii) also leaves space for these kinds of fractals when claiming that ‘In most cases of interest, F [a fractal set] is defined in a very simple way, perhaps recursively’ (our highlighting).

Notice that, if we devise a grammar for the final object and leave aside the procedure that led to it (which is methodologically valid), we might think that a fractal is an object that displays infinity (and, in this case, embedding, as the generator is contained within the final form infinitely many times) ‘all-the-way-down’. However, if we provide a grammar for the derivation of the object, we see that infinite complexity is the result of infinite applications of a simple generative procedure to a simple generator (initial string) — thus, the Σ → F grammar we propose. Fractals do not have a halting point; therefore, the application of the generative procedure is unbounded. However, natural languages do have necessary halting points (both formally and in processing), provided by sound-meaning systems. This essential point is missed in the ‘provide a grammar for the final object’ approach, and leads to the claim that language cannot be seen as a fractal. Prusinkiewicz & Lindenmayer (1990: 11) provide derivations for dragon curves and Sierpinski pyramids from L-grammars.

Penrose tilings: A Penrose tiling is a non-periodic tiling generated by an aperiodic set of prototiles (originally pentagons; then kites and darts; more tiles have appeared in recent times). Penrose tilings are self-similar fractal structures with non-integer dimensionality. As we said before, each of the initial tiles (analogous to Mandelbrot’s generators) is describable by a formal system (Euclidean Geometry) that cannot account for the final stage. Derivationally, the tessellation of a space by means of Penrose tiles is Euclidean-compatible, but representationally, the final stage is not.

e) Syntactic Structures

Early Chomskyan grammars: sequential; strongly derivational, but limited to a single workspace.
Derivational steps are strictly subjacent, where subjacency is defined as follows: a process p is an ordered series of steps [x⏜x+1⏜ … x+n], where ⏜ is linear ordering; and x is subjacent to x+n via x+y iff x+y is a sine qua non condition for the linear derivation to reach x+n departing from x.

a) Standard Theory (1965-1973)

11) S → NP⏜Aux⏜VP
VP → V⏜NP
NP → Det⏜N
Det → the
N → man, ball
V → hit
Aux → Ø

12) Sentence
NP⏜Aux⏜VP
Det⏜N⏜VP
Det⏜N⏜Verb⏜NP
the⏜N⏜Verb⏜NP
the⏜man⏜Verb⏜NP
the⏜man⏜hit⏜NP
the⏜man⏜hit⏜Det⏜N
the⏜man⏜hit⏜the⏜N
the⏜man⏜hit⏜the⏜ball

Each line represents a derivational step, which is subjacent to the previous one.

b) Government and Binding / Minimalist Program (1980-nowadays)

13) XP → Specifier, X’ (, Adjunct) (X a lexical / functional terminal node)
X’ → X0, Complement

[XP ZP [X’ X YP]] (ZP = Specifier, YP = Complement)

So, from VP on:

14) VP → Subject, V’
V’ → V, Object

In successive derivational stages, we proceed from the developed theorem towards the axiom (i.e., derivations are bottom-up, until we get to the label Sentence or, in more recent terms, CP). However, Chomskyan generative grammars do not operate in real time (Chomsky, 2007), unlike other forms of generative grammar, more concerned with the implementational capacity of the theory.

Natural language, insofar as it builds complexity out of atomic units with no internal halting necessity (but for those given by interpreting systems, sound and meaning, and finite memory capacities), can be taken to be a Mandelbrot set, with the following caveats:

a) There is more than a single initiator
b) More than a single initiator can combine in a derivational step
c) There is a single generative algorithm
d) Structure expands in more than 2-D (Krivochen, 2015a, b)
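The sequential, one-rule-per-step character of grammars like (11) can be sketched in a few lines. The code below is our own organization, not from the text; it applies leftmost rewriting, so the order of intermediate lines differs slightly from (12), though the terminal string is the same:

```python
# Sequential Chomsky-style rewriting: exactly one (leftmost) nonterminal
# is rewritten per derivational step, as in grammar (11).
rules = {
    "Sentence": ["NP", "Aux", "VP"],
    "NP": ["Det", "N"],
    "VP": ["Verb", "NP"],
    "Det": ["the"],
    "Verb": ["hit"],
    "Aux": [],                   # Aux -> null
}
lexicon_N = ["man", "ball"]      # N rewritten in order of appearance

def derive(start="Sentence"):
    line, steps, nouns = [start], [[start]], iter(lexicon_N)
    while True:
        # find the leftmost symbol that can still be rewritten
        idx = next((i for i, s in enumerate(line) if s in rules or s == "N"), None)
        if idx is None:
            return steps         # only terminals left: derivation halts
        expansion = [next(nouns)] if line[idx] == "N" else rules[line[idx]]
        line = line[:idx] + expansion + line[idx + 1:]
        steps.append(line)
```

Each element of `steps` is one derivational line, subjacent to the previous one in the sense defined above.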

If a version of copy-Merge (Stroik & Putnam, 2013; Krivochen, 2015b) is on the right track, then the generative engine actually copies the initiator n times, where n < ∞ iff the generated structure Σ is to be interpreted by Phon-Sem. Halting is thus an interface phenomenon, external to the syntactic generator itself.

L-grammars: rules operate simultaneously; rewriting applies in parallel (as happens in cell division). Should we assume a starting state ar for the grammar in (9), the resulting derivation is as follows:

15) ar
al br
bl ar ar
al al br al br
bl ar bl ar ar bl ar ar

Take into account that Lindenmayer’s generative procedure crucially includes the T factor, which is essential in the development of a biological system. Prusinkiewicz & Lindenmayer (1990: 3) explicitly claim that the replacement of the initial sequence with final strings occurs in parallel and simultaneously, as cell division, for instance, occurs in more than a single cell at any time T. Notice that the form of the grammar is simply Σ → F, that is, “rewrite Σ as F”, very much in line with the first phrase structure models (Chomsky, 1957, 1965).

The same kind of grammar, always with the added assumption that all rules apply simultaneously (Prusinkiewicz & Lindenmayer, 1990: 3), is also useful to provide a descriptive generative (i.e., explicit) procedure for amino acids. We will take only Phenylalanine (Phe) and Leucine (Leu) (Smith, 2008):

16) Grammar: Σ → F, where |F| = 3 (that is, Σ is always rewritten as 3 terminals)
Phe → U3; U2C (i.e., UUU, UUC)³
Leu → U2A; U2G (i.e., UUA, UUG)

While this does not mean RNA’s structure is itself finite-state, it does show that the descriptive limits of Markov models go well beyond what was initially considered in traditional transformational grammar. In other words, the fact that an object X is modelable by means of a grammar G does not necessarily mean that X has the computational complexity of G: X can be more complex than G (as in the case of DNA) or it can be simpler. The model is not the object, and the formal features of the model are not necessarily the formal features of the object.

Axiomatizing grammar within Radical Minimalism:

Axioms:
a) An operation cannot apply to an operation. Operations always apply to objects, and operations are never objects themselves, just as objects are never themselves operations. An operation determines a relation ℜ holding for a finite set of representations S = {α, β, …n}, where ∀(x) | x ∈ S, x ≠ ℜ. S is determined by interface conditions.
In this framework, an atomic object is a terminal {#x#}.
b) Operations are free. Unless there is an interface requirement, there is no point in limiting the scope and potential of a generative operation.
c) Operations should be kept to the minimum: Occam’s Razor applied at the methodological level. If we find a certain phenomenon at different levels of organization, the first assumption is that the operation is the same. E.g., structural complexity (syntax, cells, DNA, plant development, galaxies, fractals…) is always derived by the same algorithm, despite material instantiation.
d) If an operation can apply, it must apply. If it may not apply, it must not apply. Theoretical possibility equals necessity.

³ U: Uracil; A: Adenine; G: Guanine

In Fibonacci 2-D trees like Uriagereka’s (2012), complexity grows arithmetically, in such a way that each derivational step D, in a top-down fashion, can be described as D + k, k a constant determined by the sequence. In DNA-based trees, 3-dimensional and self-bending, complexity grows geometrically.

Theorems:

Conservation Principle:

Information cannot be eliminated in derivational procedures or transfer, but it must be instantiated in such a way that it can be read by the relevant level. 

Dynamic (Full) Interpretation:

Any derivational step is justified only insofar as it increases the information and/or it generates an interpretable object. 

∀(x) | x ∈ Fx [a certain mental faculty] & ∀(y) | y ∈ Fy: if Format(x) = Format(y), then Merge(x, y) = {x, y} is a possible construction
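The format condition on Merge can be sketched minimally as follows; the dictionary representation and the "format" / "label" keys are our hypothetical illustration, not part of the formalism itself:

```python
# Format-conditioned Merge: two objects combine only if they share a
# format (i.e., belong to faculties with matching formats).
def merge(x, y):
    if x["format"] == y["format"]:
        return frozenset({x["label"], y["label"]})  # the unordered set {x, y}
    return None  # no possible construction across mismatched formats
```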

Krivochen’s (2015a) n-ary branching:

17) Σ → F, where F = {α, β, k}

As argued in Krivochen (2011), {α, β} is interface-minimal, but all of its instances can be seen as a subset of (17), where k = 0. For iteration and coordination, k = n, where n is an arbitrary number of paratactically related constituents. The scheme for X-bar is thus to be reformulated (along the lines of Uriagereka, 2011; Krivochen, 2015a, b):

18) [Σ α β k]

A simple substitution mechanism can replace the terminal k with an n number of non-terminals, since k is a variable and not semantically interpretable per se. Notice that if k = 0, then we are left with the traditional scheme.

f) Iteration and structuring in numbers and sequences:

The problem of whether “---X---X---” sequences are to be interpreted as containing a hierarchical relation between both instances of X (which, under most circumstances, can be represented via binary-branching tree structures) or as iteration (thus, no scope between Xs and therefore multiple branching) does not arise only in language. Let us consider the first nine numbers of the Fibonacci series (Fib):

19) 0, 1, 1, 2, 3, 5, 8, 13, 21…

Let us now write only the instances of “1”, replacing all other digits with X:

20) X, 1, 1, X, X, X, X, 1X, X1

Are those instances hierarchically related or not? Does Fib exhibit both hierarchy and iteration? Let us examine the matter. The place a digit occupies in a number gives us information about the numerical group the digit belongs to: units, groups of ten, groups of a hundred, etc. So:

21) X1 = X tens, 1 unit
1X = 1 ten, X units

Thus, there is a crucial difference between the string {1, 1, 1} and the string {111}, namely:

22) 1, 1, 1 = 1 unit, 1 unit, 1 unit
23) 111 = 1 hundred, 1 ten, 1 unit

There being no structural relation between the numbers in (22), which we prove by the fact that they all denote the same digit in the same position, there is iteration. The structure of Fib is thus flat up to the eighth number, at which point the value of 1 is determined relationally, in terms of its position with respect to X.

The limits of L-grammars: Interestingly enough, the contrast shown in the previous section does not appear if Fib is generated via an L-grammar, of the kind discussed by Uriagereka (2012):

24) 0 → 1
1 → 0, 1

Let us develop the tree a bit, to visualize the problem:

25) [0 [1 [0 [1]] [1 [0] [1]]]]

(successive generations, read level by level: 0; 1; 0 1; 1 0 1; …)
Problem: an L-grammar either cannot represent iteration (if interpreted through a tree) or cannot represent hierarchy (if interpreted as linear adjacency). Possibility: dominance in a tree does not entail hierarchy. Recursion is not a direct consequence of dominance, nor is dominance a sine qua non condition for hierarchy. Possibly, recursion is relevant not for the generative algorithm, but at the semantic interface, which reads structural configurations but is not itself computational.

Matlach and Krivochen (2015) suggest L-systems are orthogonal to the Chomsky Hierarchy, contra Prusinkiewicz and Lindenmayer (1991). L-systems are not Chomsky-normal grammars:

“In Chomsky grammars productions are applied sequentially, whereas in L-systems they are applied in parallel and simultaneously replace all letters in a given word. This difference reflects the biological motivation of L-systems. Productions are intended to capture cell divisions in multicellular organisms, where many divisions may occur at the same time.” (Prusinkiewicz and Lindenmayer, 1991: 3)

While only one rule can apply per generation in a Chomsky-normal grammar, even if there is more than one nonterminal that can be rewritten, L-grammars rewrite all possible symbols per generation, yielding a completely different growth pattern. The question is, where do L-systems fall as formal grammars?⁴ Are they finite-state? Context-free? Context-sensitive? Anticipating provisional conclusions and summarizing discussion from Matlach and Krivochen (2015), we will defend here the idea that L-grammars as such cannot be implemented in any kind of automaton contemplated in traditional automata theory (as Rozenberg and Salomaa, 1980: x, point out, ‘there is no way of presenting iterated morphisms or parallel rewriting in a natural way within the framework of sequential rewriting’). In that work, we proved the following hypothesis:

Theorem 2: no tape-based machine can compute simultaneous rule application

⁴ That is, we are asking about the properties of the grammar, not of the strings the grammar can generate. We will see it is possible to find a context-sensitive grammar that generates at least some strings that are also derivable by L-grammars. This does not mean, of course, that the relevant L-grammar and the relevant CSG are equivalent in any way.

We also developed a sample proof involving state and transition redundancy, for good measure:

Proof: Let DTM be a deterministic single-tape Turing Machine DTM = (S, Σ, Γ, δ, s0, F), where S is the set of states, Σ ⊆ Γ is the set of input symbols, Γ is the full set of tape symbols, s0 is the initial state, F ⊆ S is the set of final states, and δ is a transition function from S × Γ […]. If δ(s ∈ S × γ ∈ Γ) = δ’(s ∈ S × γ’ ∈ Γ), then either γ = γ’, or δ = δ’, or both. But such a machine is either contradictory or redundant. (Matlach and Krivochen, 2015: 2)

Another argument against the assimilation of L-systems to the CH can go along the lines of the characteristics of the alphabet and the concept of a Chomsky-normal grammar: since terminals and nonterminals are defined contextually, depending on the ‘side’ of the transition relation in which they appear (thus, we cannot establish whether ‘1’ is a terminal or a nonterminal outside the context of a rule in a grammar like (24)), L-grammars are not Chomsky/Greibach-normal, and thus a special kind of context-sensitivity is required to interpret each symbol, insofar as we have to look at the symbol in the context of a rule and know its position with respect to the transition function.
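The sequential / parallel contrast can be made concrete: a few lines suffice to apply the Anabaena grammar in (9) in parallel, rewriting every symbol of the word at each generation and reproducing derivation (15). The implementation is our own sketch, not code from Matlach and Krivochen (2015):

```python
# Parallel rewriting for the Anabaena catenula L-grammar in (9):
# unlike sequential Chomsky-style rewriting, ALL symbols of the word
# are replaced simultaneously at each generation.
rules = {"ar": ["al", "br"], "al": ["bl", "ar"], "br": ["ar"], "bl": ["al"]}

def lsystem(axiom, generations):
    word = list(axiom)
    history = [word]
    for _ in range(generations):
        # every symbol rewritten in the same step (parallel application);
        # symbols without a rule are carried over unchanged
        word = [s for sym in word for s in rules.get(sym, [sym])]
        history.append(word)
    return history

# lsystem(["ar"], 4) reproduces (15):
# ar / al br / bl ar ar / al al br al br / bl ar bl ar ar bl ar ar
```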

6. Bibliography:

Cherniak, Christopher (2009) Brain Wiring Optimization and Non-Genomic Nativism. In M. Piattelli-Palmarini et al. (eds.), Of Minds and Language. Oxford: OUP. 108-119.
Chomsky, Noam (1956) Three models for the description of language. IRE Transactions on Information Theory 2. 113-124.
Chomsky, Noam (1957) Syntactic Structures. The Hague: Mouton.
Chomsky, Noam (1959) On Certain Formal Properties of Grammars. Information and Control 2. 137-167.
Chomsky, Noam (1963) Formal Properties of Grammars. In R. D. Luce, R. R. Bush, and E. Galanter (eds.), Handbook of Mathematical Psychology. New York: John Wiley & Sons. 323-418.
Chomsky, Noam (1965) Aspects of the Theory of Syntax. Cambridge, Mass.: MIT Press.
Chomsky, Noam (2007) Approaching UG from below. In Uli Sauerland and Hans-Martin Gärtner (eds.), Interfaces + Recursion = Language? Chomsky’s Minimalism and the View from Syntax-Semantics. Berlin: Mouton de Gruyter. 1-29.
D’Esposito, Mark (2007) From cognitive to neural models of working memory. Phil. Trans. R. Soc. B 362. 761-772.
Deutsch, David (1985) Quantum theory, the Church-Turing principle and the universal quantum computer. Proceedings of the Royal Society of London A 400. 97-117.

Gödel, Kurt (1931/1986) Über formal unentscheidbare Sätze der Principia Mathematica und verwandter Systeme. In S. Feferman, J. W. Dawson, S. C. Kleene, G. H. Moore, R. M. Solovay, and J. van Heijenoort (eds.), Kurt Gödel: Collected Works, Vol. I, Publications 1929-1936. Oxford: Oxford University Press. 144-195.
Gödel, Kurt (1934/1986) On undecidable propositions of formal mathematical systems. In S. Feferman, J. W. Dawson, S. C. Kleene, G. H. Moore, R. M. Solovay, and J. van Heijenoort (eds.), Kurt Gödel: Collected Works, Vol. I, Publications 1929-1936. Oxford: Oxford University Press. 346-371.
Krivochen, Diego (2013) A frustrated mind. Ms. Under review. http://ling.auf.net/lingbuzz/001932
Krivochen, Diego (2015a) On Phrase Structure building and Labeling algorithms: towards a non-uniform theory of syntactic structures. The Linguistic Review 32(3). 515-572.
Krivochen, Diego (2015b) Types vs. Tokens: Displacement Revisited. Studia Linguistica. DOI: 10.1111/stul.12044
Mandelbrot, Benoit (1982) The Fractal Geometry of Nature. San Francisco: W. H. Freeman.
Marr, David (1982) Vision: A Computational Investigation into the Human Representation and Processing of Visual Information. New York: Freeman.
Matlach, Vladimir and Diego Krivochen (2015) L-systems, parallel computing, and the limits of the Chomsky Hierarchy. Ms. Submitted.
Prusinkiewicz, Przemyslaw and Aristid Lindenmayer (1990) The Algorithmic Beauty of Plants. Springer-Verlag. [Electronic edition: 2006]
Smith, Ann (2008) Nucleic acids to amino acids: DNA specifies protein. Nature Education 1(1). 126.
Stroik, Thomas and Michael T. Putnam (2013) The Structural Design of Language. Cambridge: CUP.
Turing, Alan (1936) On Computable Numbers, with an Application to the Entscheidungsproblem. Proceedings of the London Mathematical Society 42. 230-265.
Uriagereka, Juan (1998) Rhyme and Reason. Cambridge, Mass.: MIT Press.
Uriagereka, Juan (2011) A sketch of the grammar under non-classical conditions. Ms. UMD.
Uriagereka, Juan (2012) Spell-Out and the Minimalist Program. Oxford: OUP.
Wegner, Peter (1997) Why Interaction is More Powerful Than Algorithms. Communications of the ACM, May 1997. 80-91.
Wegner, Peter (1998) Interactive foundations of computing. Theoretical Computer Science 192. 315-351.