Generative vs. model-theoretic approaches to
L-systems
From n-grams to trees
Diego Gabriel Krivochen
Workshop on Hierarchical Structure Processing
26-28 August 2020
Proof theoretical vs. Model theoretical syntax
Model theory is concerned with finding interpretations for well-formed formulae
which make such formulae true
If an interpretation I makes a WFF S true, we say that I is a model of S (or, alternatively,
that I satisfies S).
‘Interpretations’ in a formal sense are constraints or sets thereof over expressions (WFF)
Grammars, in this view, consist of finite sets of ‘admissibility conditions’, in the sense of
‘what an expression must look like in order to satisfy the grammar’ (Pullum, 2007: 1-2).
Proof theory is concerned with the enumeration of WFF by means of recursive
operations implemented through a Turing machine.
More often than not, these operations are combinatoric, and inspired by the syntactic rather
than the semantic side of logic
A grammar is a set of rules that recursively enumerates the set of sentences which
constitutes a language L (in Post systems, ‘generate’ means ‘recursively enumerate’)
Proof-theoretic models of syntax adopt a procedural view, which translates into the central
role of derivations.
A working example
Consider the phrase structure rule
S → NP, VP
Let us interpret it proof-theoretically and model-theoretically
Proof-theoretically (deterministic procedure): a symbol S in line φi of a derivation can only be
replaced by the string NP⌒VP in line φi+1 (Chomsky, 1959)
Model-theoretically: a node labelled S is admissible in a well-formed structure T iff it
immediately dominates nodes labelled NP and VP in T (McCawley, 1968)
Note how the MT interpretation actually reformulates a grammar not as a set of rules that
recursively enumerate structural descriptions or produce trees, but as a set of local
admissibility conditions for nodes in graphs (or basic expressions in derived expressions)
Now, because MT grammar is concerned with the development of models of expressions and
structures, we can think of constraints as ways to filter structures
Here we depart from Pullum & Scholz (2001) in the definition of the model: not the expressions themselves,
but their complement set. Point is, there are some differences in the application of MTS here and in their work.
L-systems
Let us consider the rules of the so-called Fibonacci grammar:
0→1
1→01
We know very well how that works proof-theoretically: we just rewrite stuff.
Model-theoretically, however, we can reformulate things as follows:
0 → 1 (a tree T with a node labelled 0 is well formed iff every node labelled 0
immediately dominates a node labelled 1 in T)
1 → 0 1 (a tree T with a node labelled 1 is well formed iff every node labelled 1
immediately dominates a node labelled 0 and a node labelled 1 in T)
All we’ve done is re-interpret the rules as admissibility conditions for local trees
That is, local trees with roots 0 and 1 respectively
We’ll call these Node Admissibility Conditions (NAC)
This is cool in its own right, but we can do more!
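A small sketch of both readings of the Fibonacci grammar, under assumptions that are mine rather than the talk's: trees are encoded as `(label, [daughters])`, and leaves are exempt from the NACs so that finite trees can be checked at all.

```python
RULES = {"0": "1", "1": "01"}  # the Fibonacci grammar from the slide

def rewrite(s):
    """Proof-theoretic: rewrite every symbol in parallel (one L-system step)."""
    return "".join(RULES[c] for c in s)

def generations(axiom="0", n=6):
    """The first n generations, starting from the axiom."""
    out = [axiom]
    for _ in range(n - 1):
        out.append(rewrite(out[-1]))
    return out

def satisfies_nacs(tree):
    """Model-theoretic: every non-leaf node must immediately dominate
    exactly the labels licensed for its own label. Leaves are left
    unconstrained (an assumption made for this sketch)."""
    label, daughters = tree
    if not daughters:
        return True
    if "".join(d[0] for d in daughters) != RULES[label]:
        return False
    return all(satisfies_nacs(d) for d in daughters)

ok_tree = ("0", [("1", [("0", []), ("1", [])])])
bad_tree = ("0", [("0", [])])

print(generations())             # ['0', '1', '01', '101', '01101', '10101101']
print(satisfies_nacs(ok_tree))   # True
print(satisfies_nacs(bad_tree))  # False
```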
Describing a language
The ‘constructive’ way to describe a language model-theoretically would be to characterise
the expressions or trees that are well-formed
And then, define a set of statements over expressions
Things like:
If α ∈ P_FC/NP and β ∈ P_NP, then F_1(α, β) ∈ P_FC, for all α, β.
(read: if α is an expression of the category FC/NP and β is an expression of category NP, then
the result of applying rule F_1 to the pair α, β will be an expression of category FC)
∀(x, y) [(x ⨞* y ∧ y ⨞* x) → x = y] (Rogers, 1998: 17)
(read: for every x, y [in a well-formed tree T] if x dominates y and y dominates x, then x is identical to y)
Rogers (1997) provides a MT formulation of a system that is strongly equivalent to a PSG
His statements pertain to allowed relations in trees rather than expressions
But it is equally possible to start from the things that we forbid in the formalism, and
define the complement set of those
Making sure that a system does not assign a well-formed status to a set of expressions or
structures means defining the well-formed expressions / structures as the complement set of the
banned ones
Sometimes that’s not too practical, because the set of ill-formed expressions is recursively
enumerable…however…
Restricting ill-formedness: The Laws
Strings produced by the Fibonacci grammar obey local conditions on expressions (two of them
genuine restrictions), which in an Asimovian move I called the ‘Three Laws’
First Law: Every 0 is followed by a 1 (*00)
Second Law: Two 1s are always followed by a 0 (*111)
Third Law: A single 1 may be followed by either a 0 or a 1
We care about the deterministic ones, the First and Second laws.
But these do not refer to trees… do they?
Well…sorta. We’ll see how. But we need some preliminary definitions first:
Let the breadth of a tree be the number of sister nodes at every generation
Let the depth of a tree be the number of dominance relations in a given structure
Then, we can characterise local trees as follows:
Local tree with mother x and daughters y, z: Breadth: 2, Depth: 1
Local tree with mother x and single daughter y: Breadth: 1, Depth: 1
We can think of the First and Second Laws as NACs for trees of depth 0. That means
that for the First and Second laws we can define the set of allowed strings (the complement of
the forbidden ones) for 2- and 3-grams (since these are the n-gram sizes that the Laws refer to):
{11, 01, 10}
{101, 110, 011, 010}
(*00 is excluded by the First Law, and therefore so are all 3-grams containing it.)
Then, the complement of the set of strings forbidden by the First Law should define
exhaustively the set of trees with breadth 2 and depth 0 over the alphabet Σ.
And the complement set of the Second Law does the same for trees with breadth 3 and
depth 0 over Σ.
Not too impressive so far, but bear with me
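The two sets can be computed directly as complements of the forbidden substrings. This is a minimal sketch; the function names are illustrative only:

```python
from itertools import product

FORBIDDEN = ("00", "111")  # First Law (*00) and Second Law (*111)

def allowed_ngrams(n, alphabet="01"):
    """All n-grams over the alphabet that contain no forbidden substring."""
    grams = ("".join(p) for p in product(alphabet, repeat=n))
    return sorted(g for g in grams if not any(f in g for f in FORBIDDEN))

def obeys_laws(s):
    """True iff the string contains neither 00 nor 111."""
    return not any(f in s for f in FORBIDDEN)

print(allowed_ngrams(2))       # ['01', '10', '11']
print(allowed_ngrams(3))       # ['010', '011', '101', '110']
print(obeys_laws("10101101"))  # True
```

Nothing is generated from rules here: the sets are obtained purely by filtering all candidate n-grams against the two constraints.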
Now, we can make our description more powerful if, like Rogers (1997), we allow for
Boolean connectives in the metalanguage
Specifically, we care about ‘AND’, since it allows us to concatenate strings
At this point, we can define sets of conditions that are equivalent to CFGs
It is easy to see that the set of allowable strings is not closed under concatenation:
*01-11    *10-010   *010-011  *10-01
*011-11   *010-010  *11-10    *011-10
*011-110  *11-11    *101-11   *011-101
*01-110   *110-010  *10-011   *110-011
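The failure of closure can be checked mechanically. The sketch below concatenates every pair of allowed 2- and 3-grams and tests the result against the First and Second Laws; the starred pairs from the slide are included as a list so that they can be compared against the computed result (variable names are mine):

```python
from itertools import product

FORBIDDEN = ("00", "111")
ALLOWED = ["01", "10", "11", "010", "011", "101", "110"]

def well_formed(s):
    """True iff s violates neither the First nor the Second Law."""
    return not any(f in s for f in FORBIDDEN)

# Every ordered pair whose concatenation creates *00 or *111.
bad_pairs = [(a, b) for a, b in product(ALLOWED, repeat=2)
             if not well_formed(a + b)]

# The starred pairs shown on the slide.
SLIDE = ("01-11 10-010 010-011 10-01 011-11 010-010 11-10 011-10 "
         "011-110 11-11 101-11 011-101 01-110 110-010 10-011 110-011").split()
slide_pairs = {tuple(p.split("-")) for p in SLIDE}

print(len(bad_pairs))                 # 21
print(slide_pairs <= set(bad_pairs))  # True
```

Under this check the pairs listed on the slide are a subset of the ill-formed concatenations; pairs such as 010-01 or 11-101 fail for the same reasons.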
This is important: by allowing for AND in the metalanguage we can restrict the
allowed expressions
Knowing what can precede what is important
This basically does it for strings
What about trees?
So far we have been dealing exclusively with conditions over (sub-)strings; we can now introduce
some conditions on trees (informally):
1. Every node (other than the root) has a mother
• What node can be a mother? Well, in principle both of the symbols in our alphabet can be mother-of
2. Every mother has at least one daughter
• Again, both of our symbols can be daughter-of
3. If a node has m daughters in a treelet T and n daughters in a treelet T’, for n < m, it will have
exactly m daughters in every T” that properly contains T or properly contains T’
Trees are graph-theoretic objects
Thus, sets of vertices and edges (G = (E, V); E = {e1, e2, …en}, V = {v1, v2, …vn})
Specifically, we can define walks in those trees
A vx-vy walk in a directed tree T is a finite ordered alternating sequence of vertices and edges that begins in vx
and ends in vy.
Then, if A (immediately) precedes B in a walk in T, we can say that A (immediately) dominates B in T.
In other words: if T is a rooted, directed graph, and if A dominates B in T, then A will be walked on before B in a walk
defined for T (this walk may be a trail or a path, depending on whether re-visiting a node is allowed or not).
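The link between walks and dominance can be illustrated with a small sketch. The tree, its vertex names and the function names are all hypothetical, and the walk is returned as a sequence of vertices only (the edges between consecutive vertices are left implicit):

```python
# Hypothetical rooted, directed tree: edges run from mother to daughter.
EDGES = {("r", "a"), ("r", "b"), ("a", "c")}

def daughters(v, edges=EDGES):
    return sorted(d for m, d in edges if m == v)

def walk(vx, vy, edges=EDGES):
    """Vertices visited on the downward walk from vx to vy, or None."""
    if vx == vy:
        return [vx]
    for d in daughters(vx, edges):
        rest = walk(d, vy, edges)
        if rest is not None:
            return [vx] + rest
    return None

def immediately_dominates(a, b, edges=EDGES):
    return (a, b) in edges

def dominates(a, b, edges=EDGES):
    """a dominates b iff a non-trivial walk leads from a to b."""
    w = walk(a, b, edges)
    return w is not None and len(w) > 1

print(walk("r", "c"))       # ['r', 'a', 'c']
print(dominates("r", "c"))  # True
print(dominates("b", "c"))  # False
```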
In the Fib grammar we have a rule in which a node dominates only one node (0 → 1) and
another rule in which a node dominates two nodes (1 → 0 1). How do we capture this?
Let us assume, at this point, that an L-grammar may include something like Pullum’s
(2019: 69) Lonely Beta condition:
LONELY BETA ≡def (∃x)[β(x) ∧ (∀y)[(β(y) ⇒ (y = x)) ∧ (¬β(y) ⇒ α(y))]]
‘There is an x that is labeled β, and x is the only node labeled β (i.e., any y labeled β is
identical with x), and any other node (i.e., any y not labeled β) is labeled α.’
Let’s try a simpler way. Define a binary relation ρ(x, y). Then, we have that:
(x, y) ∈ ρ
(y, x) ∈ ρ
(x, x) ∉ ρ
(y, y) ∈ ρ
We have thus characterised our lonely beta (LB) in a different way: an LB is the only indexed
category that does not allow for a loop arc (in the sense of Arc Pair Grammar).
All other configurations involving LB are permitted, as long as they do not violate any of the
constraints independently derived from superficial regularities (a.k.a. the Three Laws).
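The relation ρ can be written out as a set of ordered pairs. In this sketch x is identified with the label 0 and y with the label 1, and the helper names are illustrative only:

```python
# rho over labels: x = "0", y = "1". The pair ("0", "0") is absent.
RHO = {("0", "1"), ("1", "0"), ("1", "1")}

def licensed(mother, daughter):
    """True iff the pair (mother, daughter) belongs to rho."""
    return (mother, daughter) in RHO

def has_loop(label):
    """True iff the label may stand in rho with itself."""
    return (label, label) in RHO

# The lonely beta is whichever label lacks a loop.
lonely = [label for label in "01" if not has_loop(label)]

print(lonely)              # ['0']
print(licensed("0", "0"))  # False
print(licensed("1", "1"))  # True
```

The missing loop on 0 is the relational counterpart of the First Law (*00).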
Characterising Fib-trees
This means that we allow for the following trees of depth 1 and breadth 1, and
using [0] and [1] instead of x and y:
Set 𝑇11:
a. 1 immediately dominating 0
b. 1 immediately dominating 1
c. 0 immediately dominating 1
As per our LB, the following tree is ill-formed:
d. *0 immediately dominating 0
A crucial ingredient is the possibility of having tree composition
Nodes in a tree are assigned to indexed categories (in our case, 0 and 1, but also NP, VP,
S, IV/NP, etc…)
The important part is that the grammar needs to be capable of identifying identical
indexes
More specifically, tree nodes can be conceived of as Gorn addresses, which basically means that
each corresponds to a memory address in a procedure. We will not get into details about this
here
With Sarkar & Joshi (1997), we assume that if a node A in T is assigned to the indexed
category C, and a node B in T’ is assigned to the same category C, when a tree T” is
constructed from T and T’, A and B can be collapsed as a single node
Exactly the same is assumed in Unification-style grammars (Shieber, 1986 and much of HPSG),
but I personally like TAGs more.
Recall that the set of allowable strings is not closed under concatenation…
Is the set of admissible trees closed under substitution?
We can construct the following set of trees 𝑇21 by root substitution
Set 𝑇21
e. 1 immediately dominating 0 and 1, in that order: (a) ⋃ (b)
f. 1 immediately dominating 1 and 0, in that order: (b) ⋃ (a)
This can be generalised: substitution can target any tree from any set and operate at the root or at
the frontier.
Why? Because we are simply unifying graphs, which are sets of addresses plus edges. The concepts of ‘root’
and ‘frontier’ do not play any role
This is intimately related to the fact that there is no terminal vs. non-terminal distinction in L-systems
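Root substitution can be sketched as the collapse of two identically labelled roots. The encoding below (a local tree as a root label plus an ordered tuple of daughter labels) and the function names are assumptions of this sketch, as is the reading of trees (a) and (b) as 1 over 0 and 1 over 1 respectively:

```python
def root_unify(t, u):
    """Collapse the roots of two local trees that carry the same label;
    the daughters of t precede those of u in the result."""
    (root_t, kids_t), (root_u, kids_u) = t, u
    if root_t != root_u:
        raise ValueError("roots carry different labels")
    return (root_t, kids_t + kids_u)

def frontier(tree):
    """The string of daughter labels, read left to right."""
    return "".join(tree[1])

a = ("1", ("0",))  # assumed shape of tree (a)
b = ("1", ("1",))  # assumed shape of tree (b)

e = root_unify(a, b)
f = root_unify(b, a)

print(e, frontier(e))  # ('1', ('0', '1')) 01
print(f, frontier(f))  # ('1', ('1', '0')) 10
```

The order of the operands is the only difference between the two outputs, which is one way to picture how [0 1] and [1 0] can arise from the same elementary trees.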
This means that the following trees are also legitimate:
[Figure: three derived trees, labelled (c) ⋃ (f), (c) ⋃ (f) ⋃ (e), and (c) ⋃ (e) ⋃ (f)]
At this point it should be clear that the definitions of trees are recursive
Recall that we have stipulated that
If a node has m daughters in T and n daughters in T’, for n < m, it will have exactly m
daughters in every T” that properly contains T or properly contains T’
This is because of the way in which we have defined the trees: outputs of
union can be used as inputs for subsequent operations
But elementary trees do not grow beyond strict limits:
In particular, we do not have the need to define anything other than the sets 𝑇11 , 𝑇21 ,
and the operation union, which applies exactly like LSLT-style substitution
As a caveat, this is so because of the lack of terminal vs. non-terminal distinction in
Lindenmayer grammars
This is why we can get away with doing all of this in (a version of) L^2_{K,P}, which is a
strictly context-free language (Rogers, 1997)
Nothing prevents, in principle, the ‘illegal’ n-grams from being obtained via tree composition
All we have established is that, if those conditions over n-grams are interpreted as NACs,
then an elementary tree cannot feature them. We have said nothing about constraints on
derived trees (i.e., trees which are the result of composition).
Thus, the Second Law would ban an elementary tree like
[Tree: three nodes labelled 1, each preceding the next]
Note: verticality is an illusion. We are dealing with paths in graphs; what matters
is how we define the relation precedes in T
But any derived tree which contains 1 preceding 1 preceding 1 in T as a proper subpart
should indeed be permitted, as long as no other condition is violated. For instance:
[Tree: a derived tree over nodes labelled 1, 0, 1, 1, 0, 1]
‘What was all that?’
Our goal was to provide a way to build a set of trees starting from conditions over
expressions.
We provided arguments that there is a mapping between a set of n-grams and a set of
elementary trees
There is also a function from a set of elementary trees to a set of derived trees, but we did
not come up with that one
This mapping is done through constraints on possible bi-grams and tri-grams; all we
have is a set of restrictions on strings, not a procedure to proof-theoretically get from
strings to strings
We defined, in this way, a model for the Fibonacci grammar
That is, a set of constraints that a well-formed expression of the language (a string in a
string language, a tree in a tree language) must satisfy
In doing so, we have started from conditions on n-grams and interpreted them as
NACs
It is essential to note that we are not generating anything.
There is no recursive enumeration of strings or production of strings at all
A note: Fib vs. bif
Both Fib and bif languages satisfy the Three Laws.
The optionality between [0 1] and [1 0] in the rules arises as two equally legitimate ways
to compose trees of depth 2 and breadth 2 (trees (e) and (f) above)
We have said nothing about constituency, which is where the differences between Fib
and bif arise
The primitives here are n-grams and constraints over expressions; the primitives in
procedurally-based syntax are categories and rules of combination
See Hockett (1954) and Schmerling (1983), who distinguish Item-and-Arrangement grammars from
Item-and-Process grammars. ‘Constituent’ is an IA notion, not an IP notion.
In other words: Fib and bif satisfy the same model
That does not make them identical or equivalent or the same or anything like that
It just says that, from the perspective of building elementary trees from conditions
on n-gram co-occurrence, both languages satisfy the same set of constraints
This is not surprising, again. Conditions on expressions are the same (the Three Laws)
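A quick check of the claim that both languages satisfy the same constraints. The bif rule set used here (0 → 1, 1 → 10) is an assumption of this sketch, taken to be the mirror image of the Fibonacci rules; the function names are illustrative:

```python
FORBIDDEN = ("00", "111")
FIB = {"0": "1", "1": "01"}
BIF = {"0": "1", "1": "10"}  # assumed mirror-image rule set

def generations(rules, axiom="0", n=10):
    """The first n generations of the L-system, starting from the axiom."""
    s, out = axiom, []
    for _ in range(n):
        out.append(s)
        s = "".join(rules[c] for c in s)
    return out

def obeys_laws(s):
    """True iff s violates neither the First nor the Second Law."""
    return not any(f in s for f in FORBIDDEN)

fib, bif = generations(FIB), generations(BIF)

print(all(map(obeys_laws, fib)), all(map(obeys_laws, bif)))  # True True
print(fib[4], bif[4])                                        # 01101 10110
```

For the generations tested, both sets of strings pass the same filter even though the strings themselves differ, which is the sense in which Fib and bif satisfy the same model without being the same language.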
Conclusions
As a conclusion, we are simply pointing out that there is a way to build elementary
and derived trees starting from n-grams if there are well-defined restrictions on
possible n-grams at the local level (2- or 3-grams, we have not tried more complex
grammars where larger n-grams would need to be considered).
This is possible if we interpret rules as NACs
Looking at n-grams and looking at trees are not mutually exclusive things; this is an
important conclusion for the analysis of L-systems
(that is, if we look at the grammar not procedurally, but model-theoretically)
and also for the construction of syntactic meta-theory
There are advantages in adopting a model-theoretic approach, which does not entail
abandoning a proof-theoretic one
They just answer different questions (in L-systems and natural language grammars)
Thank you!
…and what about…?
XOR?
Well, an XOR grammar is an irreducible L-system without a lonely beta. If these two conditions (being
irreducible and having an LB) are met, we have a grammar of the Fib-family. If not, then the grammar
will be symmetrical.
…Skip and non-Skip?
Non-Skip belongs to the same family as the grammar it is expanded from, since it has been obtained by
means of exploiting transitivity of dominance.
Skip is a bit more interesting. The question is whether the local trees we have defined are a model for
Skip, if they can take us from n-grams to structure. The minimal trees defined here, which were
constructed using only the First and Second Laws, are not suitable for Skip. This is because we get things
like the following:
0 → 01 (Fib G2)
1 → 01101 (Fib G4)
G3 of Skip = 01011010101101011010101101
The bolded substrings, given our trees for asymmetric irreducible L-grammars (i.e., given our model) would
construct a structure where we find the substring *111, in direct violation of the Second Law.
Note: I bolded [01] to remain Fibby, but there are also [101010] if bif is what you prefer.