(2020) Generative vs. model-theoretic approaches to L-systems

2020, Workshop on Hierarchical Structure Processing (Geneva)



In this talk we briefly present two approaches to Lindenmayer systems: the rule-based (or 'generative') approach, which treats L-systems as Thue rewriting systems, and a constraint-based (or 'model-theoretic') approach, in which rules are abandoned in favour of conditions over allowable expressions in the language (Pullum, 2019). We will argue that it is possible, for at least a subset of L-systems and the languages they generate, to map string admissibility conditions to local tree admissibility conditions (cf. Rogers, 1997). This is equivalent to defining a model for those languages. We will work out how to construct structure assuming only superficial constraints on expressions, and define a set of constraints that well-formed expressions of specific L-languages must satisfy. Spoiler alert: we will see that some L-systems that other methods distinguish turn out to satisfy the same model.

Generative vs. model-theoretic approaches to L-systems
From n-grams to trees
Diego Gabriel Krivochen
Workshop on Hierarchical Structure Processing, 26-28 August 2020

Proof theoretical vs. Model theoretical syntax

• Model theory is concerned with finding interpretations for well-formed formulae which make such formulae true
• If an interpretation I makes a WFF S true, we say that I is a model of S (or, alternatively, that S satisfies I).
• 'Interpretations' in a formal sense are constraints or sets thereof over expressions (WFF)
• Grammars, in this view, consist of finite sets of 'admissibility conditions', in the sense of 'what an expression must look like in order to satisfy the grammar' (Pullum, 2007: 1-2).

• Proof theory is concerned with the enumeration of WFF by means of recursive operations implemented through a Turing machine.
• More often than not, these operations are combinatoric, and inspired by the syntactic rather than the semantic side of logic
• A grammar is a set of rules that recursively enumerates the set of sentences which constitutes a language L (in Post systems, 'generate' means 'recursively enumerate')
• Proof-theoretic models of syntax adopt a procedural view, which translates into the central role of derivations.

A working example

• Consider the phrase structure rule S → NP, VP
• Let us interpret it proof-theoretically and model-theoretically
  Proof-theoretically (deterministic procedure): a symbol S in line φ_i of a derivation can only be replaced by the string NP⌒VP in line φ_{i+1} (Chomsky, 1959)
  Model-theoretically: a node labelled S is admissible in a well-formed structure T iff it immediately dominates nodes labelled NP and VP in T (McCawley, 1968)
• Note how the MT interpretation actually reformulates a grammar not as a set of rules that recursively enumerate structural descriptions or produce trees, but as a set of local admissibility conditions for nodes in graphs (or basic expressions in derived expressions)
• Now, because MT grammar is concerned with the development of models of expressions and structures, we can think of constraints as ways to filter structures
• Here we depart from Pullum & Scholz (2001) in the definition of the model: not the expressions themselves, but their complement set. The point is that there are some differences between the application of MTS here and in their work.

L-systems

• Let us consider the rules of the so-called Fibonacci grammar:
  0 → 1
  1 → 0 1
• We know very well how that works proof-theoretically: we just rewrite stuff.
• Model-theoretically, however, we can reformulate things as follows:
  0 → 1 (a tree T with a node labelled 0 is well formed iff every node labelled 0 immediately dominates a node labelled 1 in T)
  1 → 0 1 (a tree T with a node labelled 1 is well formed iff every node labelled 1 immediately dominates a node labelled 0 and a node labelled 1 in T)
• All we've done is re-interpret the rules as admissibility conditions for local trees
• That is, local trees with roots 0 and 1 respectively
• We'll call these Node Admissibility Conditions (NAC)

This is cool in its own right, but we can do more!
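
The two readings of the Fibonacci rules can be put side by side in a minimal Python sketch. The function names (`rewrite`, `generations`, `local_tree_admissible`), the choice of axiom 0, and the encoding of a local tree as a mother label plus a list of daughter labels are assumptions made for illustration; they are not part of the talk.

```python
# Rules of the Fibonacci grammar as given on the slide: 0 -> 1, 1 -> 0 1
RULES = {"0": "1", "1": "01"}


def rewrite(string):
    """Proof-theoretic reading: rewrite every symbol in parallel."""
    return "".join(RULES[symbol] for symbol in string)


def generations(axiom, steps):
    """Return the axiom followed by `steps` successive rewritings."""
    result = [axiom]
    for _ in range(steps):
        result.append(rewrite(result[-1]))
    return result


def local_tree_admissible(mother, daughters):
    """Model-theoretic reading: a local tree is admissible iff the
    mother immediately dominates exactly the daughters that the
    corresponding rule lists (order of daughters is checked too,
    which is an assumption of this sketch)."""
    return RULES.get(mother) == "".join(daughters)


if __name__ == "__main__":
    print(generations("0", 5))
    print(local_tree_admissible("1", ["0", "1"]))  # NAC for root 1
    print(local_tree_admissible("0", ["0"]))       # not licensed
```

Nothing is enumerated in the second function: it only accepts or rejects a local tree that is handed to it, which is the sense in which the rules are read as admissibility conditions.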

Describing a language

• The 'constructive' way to describe a language model-theoretically would be to characterise the expressions or trees that are well-formed
• And then, define a set of statements over expressions
• Things like:
  If α ∈ P_{FC/NP} and β ∈ P_{NP}, then F_1(α, β) ∈ P_{FC}, for all α, β.
  (read: if α is an expression of the category FC/NP and β is an expression of category NP, then the result of applying rule F_1 to the pair α, β will be an expression of category FC)
  ∀(x, y) [(x ⨞* y ∧ y ⨞* x) → x = y] (Rogers, 1998: 17)
  (read: for every x, y [in a well-formed tree T] if x dominates y and y dominates x, then x is identical to y)
• Rogers (1997) provides an MT formulation of a system that is strongly equivalent to a PSG
• His statements pertain to allowed relations in trees rather than expressions

But it is equally possible to start from the things that we forbid in the formalism, and define the complement set of those
• Making sure that a system does not assign a well-formed status to a set of expressions or structures means defining the well-formed expressions / structures as the complement set of the banned ones
• Sometimes that's not too practical, because the set of ill-formed expressions is recursively enumerable…however…

Restricting ill-formedness: The Laws

• Strings produced by the Fibonacci grammar present two local restrictions on expressions, which in an Asimovian move I called the 'Three Laws'
  First Law: Every 0 is followed by a 1 (*00)
  Second Law: Two 1s are always followed by a 0 (*111)
  Third Law: A single 1 may be followed by either a 0 or a 1
• We care about the deterministic ones, the First and Second Laws.
• But these do not refer to trees… do they?
• Well…sorta. We'll see how.
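
The First and Second Laws can be stated as simple checks on substrings. The sketch below is illustrative only: the helper names are hypothetical, the Laws are read strictly as bans on the substrings 00 and 111 (following the starred annotations), and the strings being tested are obtained by rewriting from the axiom 0, which is an assumption.

```python
RULES = {"0": "1", "1": "01"}  # Fibonacci grammar from the slides


def rewrite(string):
    """Rewrite every symbol in parallel."""
    return "".join(RULES[symbol] for symbol in string)


def obeys_first_law(string):
    """First Law, read as the ban *00."""
    return "00" not in string


def obeys_second_law(string):
    """Second Law, read as the ban *111."""
    return "111" not in string


def obeys_laws(string):
    """The Third Law imposes no restriction, so it is not checked."""
    return obeys_first_law(string) and obeys_second_law(string)


if __name__ == "__main__":
    string = "0"  # assumed axiom
    for generation in range(10):
        print(generation, string, obeys_laws(string))
        string = rewrite(string)
```

The checks are purely local: each one inspects windows of two or three symbols, which is what allows the Laws to be treated as conditions on 2-grams and 3-grams.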

But we need some preliminary definitions first:
• Let the breadth of a tree be the number of sister nodes at every generation
• Let the depth of a tree be the number of dominance relations in a given structure
• Then, we can characterise local trees as follows:
  [Figure: two local trees over the nodes x, y and z; one with Breadth: 2, Depth: 1 and one with Breadth: 1, Depth: 1]
• We can think of the First and Second Laws as NACs for trees of depth 0. That means that, for the First and Second Laws, we can define the set of allowed strings as the complement of the forbidden ones, for 2- and 3-grams (since these are the n-gram sizes that the Laws refer to):
  2-grams: {11, 01, 10}
  3-grams: {101, 110, 011, 010}
  *00 is excluded by the First Law, and therefore so are all 3-grams containing it.
• Then, the complement of the set of strings forbidden by the First Law should define exhaustively the set of trees with breadth 2 and depth 0 over the alphabet Σ.
• And the complement set of the Second Law does the same for trees with breadth 3 and depth 0 over Σ.
• Not too impressive so far, but bear with me

Now, we can make our description more powerful if, like Rogers (1997), we allow for Boolean connectives in the metalanguage
• Specifically, we care about 'AND', since it allows us to concatenate strings
• At this point, we can define sets of conditions that are equivalent to CFGs
• It is easy to see that the set of allowable strings is not closed under concatenation:
  *01-11    *10-010    *010-011   *10-01
  *011-11   *010-010   *11-10     *011-10
  *011-110  *11-11     *101-11    *011-101
  *01-110   *110-010   *10-011    *110-011
• This is important: by allowing for AND in the metalanguage we can restrict the allowed expressions
• Knowing what can precede what is important
• This basically does it for strings

What about trees?

• So far we have been dealing exclusively with conditions over (sub-)strings; we can now introduce some conditions on trees (informally):
  1. Every node (other than the root) has a mother
     • What node can be a mother? Well, in principle both of the symbols in our alphabet can be mother-of
  2. Every mother has at least one daughter
     • Again, both of our symbols can be daughter-of
  3. If a node has m daughters in a treelet T and n daughters in a treelet T', for n < m, it will have exactly m daughters in every T" that properly contains T or properly contains T'
• Trees are graph-theoretic objects
• Thus, sets of vertices and edges (G = (E, V); E = {e1, e2, …en}, V = {v1, v2, …vn})

Specifically, we can define walks in those trees
• A vx-vy walk in a directed tree T is a finite ordered alternating sequence of vertices and edges that begins in vx and ends in vy.
• Then, if A (immediately) precedes B in a walk in T, we can say that A (immediately) dominates B in T.
• In other words: if T is a rooted, directed graph, and if A dominates B in T, then A will be walked on before B in a walk defined for T (this walk may be a trail or a path, depending on whether re-visiting a node is allowed or not).
• In the Fib grammar we have a rule in which a node dominates only one node (0 → 1) and another rule in which a node dominates two nodes (1 → 0 1). How do we capture this?
• Let us assume, at this point, that an L-grammar may include something like Pullum's (2019: 69) Lonely Beta condition:
  LONELY BETA ≡def (∃x)[β(x) ∧ (∀y)[(β(y) ⇒ (y = x)) ∧ (¬β(y) ⇒ α(y))]]
  'There is an x that is labeled β, and x is the only node labeled β (i.e., any y labeled β is identical with x), and any other node (i.e., any y not labeled β) is labeled α.'
• Let's try a simpler way. Define a binary relation ρ(x, y). Then, we have that:
  (x, y) ∈ ρ
  (y, x) ∈ ρ
  (x, x) ∉ ρ
  (y, y) ∈ ρ
• We have thus characterised our lonely beta (LB) in a different way: an LB is the only indexed category that does not allow for a loop arc (in the sense of Arc Pair Grammar).
• All other configurations involving LB are permitted, as long as they do not violate any of the constraints independently derived from superficial regularities (a.k.a. the Three Laws).
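
The sets of allowed 2-grams and 3-grams given earlier, and the observation that they are not closed under concatenation, can be recomputed mechanically. The following is a small sketch under stated assumptions: the only forbidden substrings are 00 and 111 (First and Second Laws), and the names `admissible`, `allowed_ngrams` and `bad_concatenations` are hypothetical.

```python
from itertools import product

FORBIDDEN = ("00", "111")  # First and Second Laws as banned substrings


def admissible(string):
    """True iff the string contains no forbidden substring."""
    return not any(bad in string for bad in FORBIDDEN)


def allowed_ngrams(n):
    """Complement of the forbidden set among all n-grams over {0, 1}."""
    candidates = ("".join(symbols) for symbols in product("01", repeat=n))
    return {gram for gram in candidates if admissible(gram)}


def bad_concatenations(pool):
    """Pairs of allowed n-grams whose concatenation is not admissible."""
    return sorted(
        (left, right)
        for left in pool
        for right in pool
        if not admissible(left + right)
    )


BIGRAMS = allowed_ngrams(2)
TRIGRAMS = allowed_ngrams(3)

if __name__ == "__main__":
    print(sorted(BIGRAMS))
    print(sorted(TRIGRAMS))
    for left, right in bad_concatenations(BIGRAMS | TRIGRAMS):
        print(f"*{left}-{right}")
```

Every starred pair on the slide should appear in the output of `bad_concatenations`; pairs such as 01-01 do not, which is the sense in which knowing what can precede what matters.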

Characterising Fib-trees

• This means that we allow for the following trees of depth 1 and breadth 1, using [0] and [1] instead of x and y:
  Set 𝑇11:
  [Figure: tree (a), 1 immediately dominating 0; tree (b), 1 immediately dominating 1; tree (c), 0 immediately dominating 1]
  As per our LB, the following tree is ill-formed:
  [Figure: tree (d), 0 immediately dominating 0]
• A crucial ingredient is the possibility of having tree composition
• Nodes in a tree are assigned to indexed categories (in our case, 0 and 1, but also NP, VP, S, IV/NP, etc…)
• The important part is that the grammar needs to be capable of identifying identical indexes
• More specifically, tree nodes can be conceived of as Gorn addresses, which basically means that each corresponds to a memory address in a procedure. We will not get into details about this here

With Sarkar & Joshi (1997), we assume that if a node A in T is assigned to the indexed category C, and a node B in T' is assigned to the same category C, when a tree T" is constructed from T and T', A and B can be collapsed as a single node
• Exactly the same is assumed in Unification-style grammars (Shieber, 1986 and much of HPSG), but I personally like TAGs more.
• Recall that the set of allowable strings is not closed under concatenation…
• Is the set of admissible trees closed under substitution?
• We can construct the following set of trees 𝑇21 by root substitution
  Set 𝑇21:
  [Figure: tree (e) = (a) ⋃ (b), root 1 with daughters 0 1; tree (f) = (b) ⋃ (a), root 1 with daughters 1 0]

This can be generalised: substitution can target any tree from any set and operate at the root or at the frontier.
• Why? Because we are simply unifying graphs, which are sets of addresses plus edges. The concepts of 'root' and 'frontier' do not play any role
• This is intimately related to the fact that there is no terminal vs. non-terminal distinction in L-systems

This means that the following trees are also legitimate:
[Figure: three derived trees, labelled (c) ⋃ (f), (c) ⋃ (f) ⋃ (e) and (c) ⋃ (e) ⋃ (f)]

• At this point it should be clear that the definitions of trees are recursive
• Recall that we have stipulated that if a node has m daughters in T and n daughters in T', for n < m, it will have exactly m daughters in every T" that properly contains T or properly contains T'
• This is because of the way in which we have defined the trees: outputs of union can be used as inputs for subsequent operations
• But elementary trees do not grow beyond strict limits:
• In particular, we do not need to define anything other than the sets 𝑇11, 𝑇21, and the operation union, which applies exactly like LSLT-style substitution

As a caveat, this is so because of the lack of terminal vs. non-terminal distinction in Lindenmayer grammars

This is why we can get away with doing all of this in (a version of) L^2_{K,P}, which is a strictly context-free language (Rogers, 1997)
• Nothing prevents, in principle, the 'illegal' n-grams from being obtained via tree composition
• All we have established is that, if those conditions over n-grams are interpreted as NACs, then an elementary tree cannot feature them. We have said nothing about constraints on derived trees (i.e., trees which are the result of composition).
• Thus, the Second Law would ban an elementary tree like
  [Figure: elementary tree over the labels 1 1 1]
  Note: verticality is an illusion. We are dealing with paths in graphs; what matters is how we define the relation precedes in T
• But any derived tree which contains 1 preceding 1 preceding 1 in T as a proper subpart should indeed be permitted, as long as no other condition is violated. For instance:
  [Figure: derived tree over the labels 1 0 1 1 0 1]

'What was all that?'

• Our goal was to provide a way to build a set of trees starting from conditions over expressions.
• We argued that there is a mapping between a set of n-grams and a set of elementary trees
• There is also a function from a set of elementary trees to a set of derived trees, but we did not come up with that one
• This mapping is done through constraints on possible bi-grams and tri-grams; all we have is a set of restrictions on strings, not a procedure to proof-theoretically get from strings to strings
• We defined, in this way, a model for the Fibonacci grammar
• That is, a set of constraints that a well-formed expression of the language (a string in a string language, a tree in a tree language) must satisfy
• In doing so, we have started from conditions on n-grams and interpreted them as NACs
• It is essential to note that we are not generating anything.
• There is no recursive enumeration of strings or production of strings at all

A note: Fib vs. bif

• Both Fib and bif languages satisfy the Three Laws.
• The optionality between [0 1] and [1 0] in the rules arises as two equally legitimate ways to compose trees of depth 2 and breadth 2 (trees (e) and (f) above)
• We have said nothing about constituency, which is where the differences between Fib and bif arise
• The primitives here are n-grams and constraints over expressions; the primitives in procedurally-based syntax are categories and rules of combination
• See Hockett (1954) and Schmerling (1983), who distinguish Item-and-Arrangement grammars from Item-and-Process grammars. 'Constituent' is an IA notion, not an IP notion.

In other words: Fib and bif satisfy the same model
• That does not make them identical or equivalent or the same or anything like that
• It just says that, from the perspective of building elementary trees from conditions on n-gram co-occurrence, both languages satisfy the same set of constraints
• This is not surprising, again.
Conditions on expressions are the same (the Three Laws)

Conclusions

• As a conclusion, we are simply pointing out that there is a way to build elementary and derived trees starting from n-grams if there are well-defined restrictions on possible n-grams at the local level (2- or 3-grams; we have not tried more complex grammars where larger n-grams would need to be considered).
• This is possible if we interpret rules as NACs
• Looking at n-grams and looking at trees are not mutually exclusive things; this is an important conclusion for the analysis of L-systems
  • Thus, if we look at the grammar not procedurally, but model-theoretically
  And also for the construction of syntactic meta-theory

There are advantages in adopting a model-theoretic approach, which does not entail abandoning a proof-theoretic one
• They just answer different questions (in L-systems and natural language grammars)

Thank you!

…and what about…?

• XOR?
  • Well, an XOR grammar is an irreducible L-system without a lonely beta. If these two conditions (being irreducible and having an LB) are met, we have a grammar of the Fib-family. If not, then the grammar will be symmetrical.

…Skip and non-Skip?

• Non-Skip belongs to the same family as the grammar it is expanded from, since it has been obtained by means of exploiting transitivity of dominance.
• Skip is a bit more interesting. The question is whether the local trees we have defined are a model for Skip, that is, whether they can take us from n-grams to structure. The minimal trees defined here, which were constructed using only the First and Second Laws, are not suitable for Skip. This is because we get things like the following:
  0 → 01 (Fib G2)
  1 → 01101 (Fib G4)
  G3 of Skip = 01011010101101011010101101
  The bolded substrings, given our trees for asymmetric irreducible L-grammars (i.e., given our model), would construct a structure where we find the substring *111, in direct violation of the Second Law.
Note: I bolded [01] to remain Fibby, but there are also [101010] if bif is what you prefer.
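
The Skip string quoted above can be reproduced by parallel rewriting. This is only a sketch: it assumes that the axiom is 0 and that the axiom counts as G0 (neither is stated on the slide), and the names `SKIP_RULES` and `generation` are hypothetical.

```python
# Skip rules as given on the slide: 0 -> 01, 1 -> 01101
SKIP_RULES = {"0": "01", "1": "01101"}


def rewrite(string, rules):
    """Rewrite every symbol in parallel using the given rules."""
    return "".join(rules[symbol] for symbol in string)


def generation(axiom, rules, n):
    """Apply parallel rewriting n times, starting from the axiom (G0)."""
    string = axiom
    for _ in range(n):
        string = rewrite(string, rules)
    return string


skip_g3 = generation("0", SKIP_RULES, 3)

if __name__ == "__main__":
    print(skip_g3)
    print("contains 00:", "00" in skip_g3)
    print("contains 111:", "111" in skip_g3)
```

Under these assumptions the surface string itself contains neither 00 nor 111, which suggests that the violation mentioned above concerns the structure that the local trees would assign to the string rather than the string as such.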