
An introduction to minimalist grammar

Eric Reuland (UiL OTS, March 2003)

Since the publication of Chomsky (1981) a conception of grammar has
developed in which the language faculty is conceived as an essentially
modular system with the following components (see Haegeman 1991 for an
overview):

1. Lexicon
2. Phrase structure (in the form of the X'-schema)
3. Movement theory (reduced to the general rule schema Move α)
4. Theta-theory
5. Case theory
6. Binding Theory
7. Control theory
8. Bounding theory

To these, two interpretive components should be added: i) a component
handling the expression of language in a medium (Phonetic Form = PF),
and ii) a component handling semantic interpretation (mapping a
possibly abstract syntactic structure, Logical Form = LF, into
extralinguistic meaning). These subtheories were conceived of as
independent modules. This so-called Principles and Parameters (P&P)
model was very successful from a descriptive and heuristic point of
view. The model has proven very fruitful in bringing an ever-increasing
body of facts into the scope of syntactic theory. Yet it became
necessary to reassess the foundations of the theory, specifically by
examining the computational mechanisms and attempting to reduce and
unify them as much as possible. To mention a few reasons:
- there was occasional overlap between components (Movement theory
and binding theory)

- certain components were sometimes conceptually similar (Binding
theory and Control theory).
- some components were hybrid in making use of both typically syntactic
and typically interpretive mechanisms (binding theory)
- in the course of the years, the limits of the theory became increasingly
unclear. If researchers introduced some new notion in order to capture a
certain body of facts, it was hard to determine whether such an enrichment
could still be motivated given the other components of the theory. This led
to the need for setting clearer standards of explanation.

Developing these standards gave rise to the Minimalist program as
developed in Chomsky (1995, 1998, 1999) and by many others. The
driving force of minimalism is that no mechanism should be introduced
into the theory unless it is proved to be absolutely necessary, and any
mechanism already in the theory should be eliminated as soon as it is
shown to be superfluous.
This change definitely caused a shift in the style of argumentation and
in the type of data usually discussed. A P&P-style argument typically
starts from some intriguing pattern in language x, y, or z; it is then
shown that this pattern can be accommodated or explained if the theory
is adapted in a certain way. A minimalist-style argument usually runs:
consider the following property of the theory; conceptually it seems
superfluous; what happens if we try to get rid of it? What follows are
long discussions of the formal properties of derivations, covering all
the angles of what could go wrong where, very often in areas of the
grammar that are by themselves well studied. Precisely because the ins
and outs of such areas are well known, one is unlikely to miss an
option.
This is one of the reasons that one often finds hybrid approaches, in
which P&P-style representations and arguments are mixed. To put it a bit
differently, for macro-analyses of lesser-known areas of grammar, a
P&P-style approach is often still very fruitful. Minimalist explorations
are often intended to derive certain macro-properties from the
nitty-gritty details of the computation. But if such macro-properties
can be derived from more fundamental principles of the grammatical
system, this makes them no less valid by themselves.

Conception of the status of the grammar:


Human language exists by virtue of a systematic relation between form
and meaning. Forms are standardly realized and perceived as auditory
stimuli, but a route via the visual system is also available (sign
language). Since neither the auditory nor the visual system is dedicated
specifically to language, these may be considered external to the
language system per se. Yet the language system is apparently able to
"talk" to the articulatory-perceptual systems via some interface. This
interface is usually referred to as the PF-interface (PF = Phonetic Form).
On the meaning side, at least this much is clear: our cognitive system
is able to process information, form concepts, feel emotions, form
intentions, and perhaps even experience thoughts, independently of
language (although language definitely helps in articulating those).
Following Chomsky and others we may dub this part of the system the
system of thought (with perhaps also a "language of thought" in so far as
concepts can be combined without using the linguistic system proper). By
the same token this system is in some relevant sense external to the
language system. Yet, again, the systems must be able to communicate,
which takes place using the Conceptual-Intentional Interface (C-I
interface).
So, the traditional conception of language expressing a systematic
relation between sound and meaning, now gets its expression in the
following manner:

A grammar G is a generative procedure specifying a set of pairs of
representations <P, L>, where P is an expression that can be
interpreted at the PF interface as a set of instructions for the
perceptual or articulatory systems, and L is an expression that can be
interpreted at the C-I interface as a set of instructions for the
system of thought (update its information state, do something, change
emotional state, etc.). Both P and L must obey any constraints imposed
at the level where they are interpreted. Both are expressions
consisting of discrete linguistic elements. Every element P contains
must be interpretable at the PF interface; every element L contains
must be interpretable at the C-I interface. A derivation leading to a
pair <P, L> where one member does not meet this requirement crashes.
This requirement on representations that are fed into the interface
levels is known as the principle of Full Interpretation (FI). A
derivation (computation) that leads to both P and L meeting the
interface requirements is said to converge.
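The convergence criterion lends itself to a small executable sketch. The following Python fragment is purely illustrative: the feature inventories and the function names obeys_fi and converges are invented for exposition and are not part of the theory's formal apparatus.

```python
# Toy sketch of Full Interpretation (FI): a derivation converges only if
# every element of P is legible at PF and every element of L at C-I.
# The feature labels below are illustrative placeholders.

PF_LEGIBLE = {"p"}   # features readable by the articulatory-perceptual systems
CI_LEGIBLE = {"l"}   # features readable by the system of thought

def obeys_fi(expression, legible):
    """An expression obeys FI at an interface iff all its elements
    carry only features legible at that interface."""
    return all(elem <= legible for elem in expression)

def converges(P, L):
    """A derivation <P, L> converges iff both members obey FI."""
    return obeys_fi(P, PF_LEGIBLE) and obeys_fi(L, CI_LEGIBLE)

# A leftover uninterpretable g-feature in L makes the derivation crash:
P = [{"p"}, {"p"}]
L_good = [{"l"}, {"l"}]
L_bad = [{"l"}, {"l", "g"}]   # an unchecked g-feature survives
print(converges(P, L_good))   # True
print(converges(P, L_bad))    # False
```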

An interesting picture of the position of the language system among the
other cognitive systems is provided by the following evolutionary fable
(Chomsky 1998:6):

Given a primate with the human mental architecture and sensori-motor
apparatus in place, but not yet a language organ. What specifications
does some language organ FL have to meet if, upon insertion, it is to
work properly?

Thus the theoretical aim of the linguist is to arrive at a precise
characterization of that part of our cognitive system that is dedicated
to language, broadly conceived. A working hypothesis is that language is
optimally designed: that it is a ‘perfect system’, the optimal solution
to the task of relating the two interface levels. An effect of this
working hypothesis is that it discourages solving descriptive problems
by introducing substantive hypotheses specific to language per se. Any
such substantive hypothesis is like an imperfection in the design.
Presumably there are such imperfections, but they should only be
postulated as a last resort. Rather, whenever one sees such a problem,
the first step should always be to investigate how it can be reduced to
issues concerning the design of natural language in view of its general
task.
The grammatical system itself consists of a lexicon and a system
specifying how elements from the lexicon may be combined. This
combinatory system is referred to as CHL, the Computational system of
Human Language. The lexicon is a repository of idiosyncrasies (but see
recent work in Distributed Morphology for a different view). It is an
inventory of elements that are triples of features: <p, g, l>.
p-features reflect instructions to the PF-interface, l-features reflect
instructions to the C-I interface, and g-features reflect instructions
for the computation itself. g-features are not interpretable; hence, by
FI, they must all be used up by the time the expression is sent off to
an interface.
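As an illustration, a lexical item as a triple <p, g, l> could be modeled as follows. The concrete features given for she are invented examples for exposition, not an official feature inventory.

```python
from dataclasses import dataclass

# Illustrative encoding of a lexical item as a triple <p, g, l>:
# p-features instruct PF, l-features instruct C-I, and g-features
# drive the computation and must be checked away before an expression
# reaches an interface.
@dataclass
class LexicalItem:
    form: str
    p: frozenset = frozenset()   # PF instructions (e.g. phonology)
    g: frozenset = frozenset()   # formal features (e.g. Case)
    l: frozenset = frozenset()   # C-I instructions (e.g. category)

she = LexicalItem("she",
                  p=frozenset({"/ʃi/"}),
                  g=frozenset({"Case:nom"}),       # uninterpretable
                  l=frozenset({"D", "3sg", "fem"}))

# By FI, g must be empty by the time the expression is interpreted;
# here the Case feature of she still awaits checking:
print(len(she.g) == 0)   # False
```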
The first step in building a sentence is the selection of the building
blocks from the lexicon: a collection of words and functional elements.
This selection is called the numeration.
Constructing a sentence is carrying out a computation using the
building blocks provided in the numeration.
The computation stops precisely when all the elements of the
numeration have been used up.
Due to the principle of Full Interpretation, any element in the
numeration should either contribute to the interpretation, or play a
role in driving the computation itself and thus be "used up"
(= eliminated) (see below).
Computations obey the inclusiveness condition. That is, a computation
consists of manipulating/rearranging elements or parts of elements
obtained from the numeration; it cannot add new elements that are not
part of the numeration.
A consequence of a strict interpretation of the inclusiveness
condition is that syntactic computations cannot use any element that
could not be morphologically reflected in a human language. Much of the
apparatus of logical syntax, such as indices, lambdas, etc., cannot,
therefore, be part of a natural language computation.
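The inclusiveness condition itself amounts to a simple subset check. The helper name below is invented for exposition.

```python
# Sketch of the inclusiveness condition: the output of a computation may
# contain only (copies of) elements drawn from the numeration; no
# indices, lambdas, or other new symbols may be introduced.

def respects_inclusiveness(output_atoms, numeration):
    """True iff every atom in the derived structure came from the numeration."""
    return set(output_atoms) <= set(numeration)

numeration = ["she", "loves", "the", "man"]
print(respects_inclusiveness(["she", "loves", "the", "man"], numeration))   # True
# Adding an index (she_1) violates inclusiveness:
print(respects_inclusiveness(["she_1", "loves", "the", "man"], numeration)) # False
```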

N.B. Not every numeration will lead to a grammatical sentence. For
instance, a numeration that, in addition to some functional material,
only contains the verb come and two arguments (for instance John and
Mary) will never lead to a well-formed result. Similarly, a numeration
that only contains a finite verb (comes) and an accusative pronominal
such as me can never lead to a correct result. Such numerations always
lead to what is called a crash.

The basic operation of the syntactic computation consists of combining
two elements from the numeration. This operation is called Merge. Some
operation like Merge must by necessity be part of any discrete symbolic
system. In addition to Merge, there is the operation Move. Move
reflects the so-called displacement property of natural language:
predicates and their arguments are not always realized "together", i.e.
in their position of first Merge. Move can best be conceptualized as a
form of Merge where the element to be merged is not taken from the
numeration but is an element (or a copy of an element) that was already
part of the structure. Unlike Merge, Move is not part of the linguistic
system by necessity: it is easy to think of symbolic systems without
Move. Thus it gives rise to two questions: i) why does it occur at all
in natural language, and ii) how is it implemented in the computational
system? Another way of putting the first question is whether the
displacement property is an imperfection or whether it can be shown to
be a necessary feature of an optimal system that is able to meet the
requirements imposed by the two interface levels. Chomsky's position
leans towards the latter; below we will present some further
discussion. Although, as we will see, movement can be motivated in
terms of requirements imposed by the C-I interface, within the
computational system it is just a blind process, enforced by formal
properties of the structure, in the form of the requirement that
certain features require checking.
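The contrast between (external) Merge and Move as internal Merge can be sketched as follows. The tuple encoding and the helper names merge, contains, and move are illustrative assumptions, not the formal definitions.

```python
# Sketch: Move as "internal Merge". External Merge takes an element from
# the numeration; Move re-merges (a copy of) an element already inside
# the structure. Structures are encoded as nested tuples.

def merge(a, b):
    """Combine two syntactic objects into one (labels omitted here)."""
    return (a, b)

def contains(tree, x):
    """True iff x occurs somewhere inside tree."""
    if tree == x:
        return True
    return isinstance(tree, tuple) and any(contains(t, x) for t in tree)

def move(tree, x):
    """Internal Merge: re-merge a copy of x, which must already be in tree.
    The lower occurrence remains in place as a copy (trace)."""
    assert contains(tree, x), "Move may only target material already built"
    return merge(x, tree)

vp = merge("she", merge("loves", merge("the", "man")))
tp = move(vp, "she")   # the subject raises, leaving a copy inside VP
print(tp)              # ('she', ('she', ('loves', ('the', 'man'))))
```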
For the sake of concreteness we will go through an example. Suppose the
numeration contains the lexical items the and man. Merge may combine
them into the constituent the man. Minimalism retains the condition of
endocentricity; that is, one of the two elements should be designated
as the head. Assuming that man is characterized as a noun and the as a
determiner, and assuming that the selects for a noun, we add the
convention that it is the selecting element that projects. Informally
this is rendered as in (1a); formally a set-theoretic expression is
used, as in (1b):

(1) a.      the
           /   \
        the     man

    b.  {the, {the, man}}

(1b) states that Merging the and man formed a complex object, with the as
a designated member, where this designation is interpreted as having
head-status.
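The set-theoretic object in (1b) can be built directly with nested frozensets; the helper name merge_project is invented for exposition.

```python
# Sketch of labeled Merge as in (1b): merging the selector "the" with
# "man" yields {the, {the, man}}, where the designated member "the"
# doubles as the label (head).

def merge_project(head, other):
    """Return {head, {head, other}}: the selecting element projects."""
    return frozenset({head, frozenset({head, other})})

dp = merge_project("the", "man")
print(frozenset({"the", "man"}) in dp)   # True: the unordered pair {the, man}
print("the" in dp)                       # True: 'the' is the designated member
```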
Since, by assumption, the has D-features and man has N-features, and
it is the D-features that have projected, we can still say that [the man] is a
determiner phrase, hence DP, and that man is a noun, and also a maximal
N-projection, but the idea is that this information can be constructed from
the information in (1), hence need not be stated in the form of category
labels. Nevertheless, for practical reasons a notation with labels is still
being used.
Suppose the numeration also contains the verb form loves and the
pronominal she, assuming for concreteness that pronominals are analyzed
as Ds that do not select a complement. The man is now an element in the
construction space; if we select loves from the numeration, Merge will
be able to combine loves and the man. By assumption loves has a
collection of features, including various verbal features, but also
features for Tense and Agreement. Its verbal features select for a D.
Hence Merge forms a complex constituent, with loves, being the
selector, projecting.

(2)         loves
           /     \
      loves       the
                 /   \
              the     man

One of the issues arising in this connection is how the system can be
sure that the verbal features of loves, rather than, say, the Tense
features, are used in the computation. This issue arises in a system
that assumes that lexical items, and not individual morphemes, are
listed in the numeration. (Distributed Morphology pursues the latter
course, but discussing it here would lead us too far afield.) The
question can be resolved if one considers the fact that selection
determines what projects. The T-features in loves do not select for a
D; nor does D select for a T. Hence, for this choice of features, it
remains undefined which of the two members projects; this, one may
assume, leads to a crash.
Suppose next that she is taken from the numeration and merged with
loves the man. The V-features in loves do indeed select for an
additional argument, namely the experiencer of the love. Hence loves
projects. From this perspective the next step is (3):

(3)        loves
          /     \
       she       loves
                /     \
           loves       the
                      /   \
                   the     man

Suppose the numeration were depleted at this point. This would mean
that the predicational core of the sentence would be fully specified.
However, at this point we still have an incomplete sentential
structure. In order for a predication to be properly semantically
evaluated, more information is needed. Informally: one needs to know
for which coordinates in space/time it is supposed to hold and with
which degree of certainty, and whether it represents an assertion, a
question, or a request. The former type of information is encoded in
the T-system, the latter in the C-system or Force-system. So, in a
complete sentence, the core predication is contained in two layers, or
shells: [Force [ Tense/mood [ Predication ] ] ]. In order for the
derivation to be completed, values for Tense and Force also have to be
specified by selecting appropriate functional heads.1 Another reason
why, for a proper derivation, the numeration should not be depleted at
this point is that in (3) as it is now, not all uninterpretable
features have been used up; for instance, the Case feature of she. This
warrants the following excursion:

Excursion
Case
She is specified for nominative Case. In traditional grammar,
nominative is assigned by the finite verb to the subject. Case is a
formal property of a DP which by itself does not contribute to the
interpretation. For instance, whereas some Cases, such as the
instrumental in Russian, may convey that the DP carrying them has a
specific thematic role (instrument),2 nominative is compatible with a
variety of roles (here we will be assuming, as is standard, that a DP
that appears in a position where a pronominal would have visible Case
is itself marked for that Case):

Agent: The man opened the door / he opened the door
Instrument: The key opened the door
Theme: The door was opened; this door opens easily
Goal: Mary was given a key / she was given a key

1 Traditionally, the Force domain is represented by one functional
head, namely C, and the T-domain by either one head (I or T) or two (T
and AgrS). Currently, many researchers take it that the number of
functional projections in these domains may be substantially greater.
We will not systematically discuss these issues here.

So nominative Case does not provide an independent contribution to the
interpretation of the sentence. Yet it does something: it provides
information about which argument position the DP bearing it should be
linked up to. In traditional terms, the element with nominative Case is
the one agreeing with the finite verb. In current terms: (nominative)
Case is an uninterpretable feature that is used in the computation.
Being uninterpretable, it cannot survive at the C-I interface, since at
the C-I interface the expression should only contain elements that can
be interpreted there. As a consequence it must be deleted. Within the
current conception of the grammatical system, it can only delete if it
is brought together with an element licensing it. We know that
nominative Case is, in traditional terms, assigned by the finite verb;
more precisely, by the finiteness feature on the verb, often in the
form of the verbal agreement. Here, this relation is implemented by the
notion of checking: i) nominative Case is checked by the finiteness
feature on the verb; ii) as soon as it is checked it is
deleted/erased.3 More generally:

Checking

For each uninterpretable feature uF it holds that it must at some point
be checked by a matching feature F. Upon checking it is erased.

2 Note that the Russian instrumental has other uses as well.
3 There is a technical difference between deletion and erasure that
does not concern us here; we use the terms interchangeably.
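This checking rule can be given a toy rendering. Feature names such as "Case:nom" and the helper name check are illustrative, not the theory's actual inventory.

```python
# Sketch of feature checking: an uninterpretable feature uF is erased as
# soon as it is matched by a feature F of an appropriate checker.

def check(bearer_ufeatures, checker_features):
    """Return the bearer's uninterpretable features that survive checking.
    Matched features are erased; by FI, the result must end up empty."""
    return bearer_ufeatures - checker_features

# Nominative Case on the subject is checked by finiteness on the verb:
finite_T = {"Case:nom", "Tense:pres"}
print(check({"Case:nom"}, finite_T))   # set(): Case erased, FI satisfiable
print(check({"Case:acc"}, finite_T))   # {'Case:acc'}: unchecked, so a crash
```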

We may speculate as to why the grammatical system uses uninterpretable
features. One could easily conceive of a "communicative system" without
uninterpretable features, and one might think that such a system would
in fact be more efficient; that is, uninterpretable features would
constitute imperfections. Perhaps they are. Yet the effect of the
presence of uninterpretable features, given the present theory, is that
certain connections between constituents are enforced, established in a
blind fashion, without recourse to interpretation. It may well be that
having a certain number of "blind" operations at one's disposal allows
for an amount of rough, hard-and-fast computation that is in fact more
efficient than always having to consult other components of the system.

Case
In current conceptions of the grammar, nominative Case is generally
considered an uninterpretable feature. Another uninterpretable Case
feature is objective/accusative, the typical Case of the direct object
licensed by verbs, which can be observed in English in the object form
of pronominals such as him, me, etc. The Case licensed by prepositions
is also generally taken to be uninterpretable.
For objective Case, one way to implement the checking relation has
been to postulate an object agreement feature on the verb, along the
lines of the subject agreement feature. This was inspired by the
existence of languages that do indeed have separate object agreement
morphology. Alternatively, ways have been conceived to enforce checking
without postulating a specific object agreement feature. Below we will
discuss in some more detail how checking in such cases proceeds.
Nominative and objective are the canonical instantiations of the
notion of structural Case: Case whose licensing depends only on
structural factors. Their structural character shows up in the fact
that they are par excellence the Cases involved in Case alternations:
passive, middle, nominalizations.
The Case of the indirect object in English should also be considered a
structural Case, given that it too is involved in passivization. In
other languages (Dutch, German) the nature of dative Case is less
clear, although it has been argued to be structural as well. Some
languages have Cases that are clearly semantic, hence not
uninterpretable; these should be analyzed along quite different lines,
and for current purposes a discussion would lead us too far afield. For
the sake of completeness it should be mentioned that the literature
also distinguishes a type of Case called inherent Case. Inherent Case
does not strictly determine the thematic role of its bearer (hence it
is not a semantic Case), but it is clearly linked to properties of that
role, hence not uninterpretable. For present purposes it suffices to
note that it is not checked by the same system as structural Case.
Since its theoretical properties are still rather unclear, we will not
discuss it here.

Tense and Agreement


Just like structural Case is an uninterpretable feature of the DP/NP (note
that Case may be encoded both on D and on N), Agreement is an
uninterpretable feature on the verbal system. The type of agreement we
see in Indo-European languages is typically dependent on the grammatical
subject. As already discussed, tense, mood and agreement are not part of
the predicational core of the sentence. Standard analyses of English
sentences with an auxiliary, as in she will love the man assign these a
structure as in (4) (omitting reference to the C-system), where T, is a
category subsuming tense, agreement and mood4:

4
T is also referred to as I, especially in older literature. Nothing much hinges of this choice of label.
(4)          TP
            /  \
          DP    T'
          she  /  \
              T    VP
              |     |
             Aux   love the man
             will

Along the same lines it is often assumed that in a sentence without an
auxiliary, the Tense/agreement morpheme is realized under T and the
verbal stem under V, with some further process bringing them together,
as in (5):

(5)          TP
            /  \
          DP    T'
          she  /  \
              T    VP
              |     |
             -s    love the man
          [+PRES]
          [+3rd]

It has been argued that Tense and agreement are in fact separate
functional categories, each with their own projection. In such work the
structure of (5) is usually rendered as (6). Here it is in any case no
longer possible to identify a segment and associate it with a
particular morpheme. A common assumption is that it is the finite verb
which carries the features, and that these have to be matched with the
features of the higher functional heads.5 There are, however,
alternative conceptions. Often, making empirical distinctions between
such views is not trivial.

5 Arguments are based on the fact that certain languages split
finiteness from agreement, and that also in languages where tense and
agreement are fused, one still finds evidence that the computational
system accesses them differently.

(6)          TP
            /  \
          DP    T'
               /  \
              T    AgrP
                     |
                    Agr'
                   /    \
                Agr      VP
                          |
                      V-infl ..........

In order to proceed, it has to be made explicit that there is both
theory-internal and theory-external evidence that the subject in a
transitive clause such as she loves the man is moved from a position to
the right of I. The theory-internal argument is that assignment of
thematic roles is local and requires sisterhood (the DP in (6) is not a
sister of a V-projection). External evidence is that certain
quantificational expressions such as all are interpreted as part of the
subject, as in they all will contact their parents, but may also be
realized to the right of the auxiliary, as in they will all contact
their parents. Note that before we entered this excursion we had
already arrived at such a conclusion for other theory-internal reasons.
A structure reflecting the relevant relations is given in (7) with the
occurrences of she in brackets being copies (or traces) of the moved
subject.

(7)           TP
             /  \
           DP    T'
           she  /  \
               T    AgrP
               |    /   \
          [+PRES] (she)  Agr'
          [+3rd]        /    \
                     Agr      VP
                               |
                         (she) loves the man

Already in preminimalist approaches it was therefore assumed that a
subject moved from inside the VP via a specifier of Agr (where it
entered agreement) to a specifier of T. Movement to the specifier of T
must be assumed, given the rather leftward position which subjects
crosslinguistically occupy within the T-domain. At present there is no
clear consensus about WHY this is the case. The movement to Spec,TP was
enforced preminimalistically, and still is minimalistically, by the
postulation of a so-called EPP feature on T.
EPP stands for Extended Projection Principle, which originated as a
stipulation expressing that all sentences have subjects. So far it
seems no more, but also no less, than a feature that is postulated to
make the computation work, since the subject property in a thematic
sense is already satisfied independently within VP. (However, it also
expresses that verbs have subjects even if there is no thematic
requirement, as in it rains.) Note, however, that from the perspective
of the C-I interface it is not inconceivable that there is a reason.
For interpretation, the space/time coordinates of a predication must be
fixed. T carries Tense information, but no spatial information. A DP
carries spatial information, namely the space (possibly virtual, in the
case of abstract objects) it occupies. In order to fix a space/time
coordinate, both temporal and spatial information must be used. It
would seem pretty good design to copy a DP from the core predication to
the Tense shell in order to have the required information locally
available for the relevant part of the computation. Merge is available
as an operation independently. The fact that the information in the DP
must be entered into the computation in order to meet the interface
requirement implies the necessity of having some copying process.
Therefore, the only arbitrary feature so far is that the copying is
reflected in a surface displacement, whereas one could also imagine a
more abstract relation. All we need now is an independent reason why,
at least in some cases, an abstract relation will not do. It would not
seem implausible that such reasons can be found in the relation between
surface constituency and prosody on the one hand and information
structure (topic-comment, theme-rheme) on the other (inspired by the
discussion of object shift in Chomsky (1999)): a man is in the garden
is about a man in a way in which there is a man in the garden is not.
If so, having a formal uninterpretable feature “—need a DP” is indeed
an optimal solution to meeting the conditions posed by the interfaces
involved.
Given the presence of this feature, the relevant operations apply
blindly. One DP from the core predication is copied and merged with T.
Note that this procedure does not specify which DP; any DP will do.
Which DPs are available is determined by other properties of the
computational system, specifically economy. Note that the EPP can also
be satisfied by Merge, as happens in sentences such as there is a man
in the garden or there are children in the garden. Here the EPP is
satisfied by merging there, with concomitant interpretive effects. In
these cases the thematic subject stays down in the VP. Nevertheless the
verb agrees with the thematic subject. Chomsky (1998, 1999) takes this
as evidence that agreement and Case checking are independent of
movement, and governed by a relation AGREE (feature movement in earlier
versions of the theory).

So far the long version. The short version is that both the Agreement
features of Agr and the EPP feature of T are uninterpretable. Hence
they must be checked by the subject she, moving up step by step and
erasing them.
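This step-by-step checking can be sketched as follows. Treating the EPP as a feature that the DP directly "matches" is a simplification made purely for illustration (the EPP is satisfied by any DP), and the feature names are invented.

```python
# Sketch of the "short version": the subject moves up step by step,
# erasing the uninterpretable Agreement features of Agr and the EPP
# feature of T. Each head carries a set of uninterpretable features.

heads = [("Agr", {"phi:3sg"}), ("T", {"EPP"})]
# Features the subject can check; listing "EPP" here stands in for the
# fact that any DP satisfies the EPP:
subject = {"phi:3sg", "EPP"}

for name, ufs in heads:
    erased = ufs & subject
    ufs -= erased               # checking erases the matched features
    print(f"{name}: erased {erased}, remaining {ufs}")

# The derivation can converge only if nothing uninterpretable remains:
print(all(not ufs for _, ufs in heads))   # True
```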

Force
The Force domain is the highest shell of the clause. Here, as noted
above, the status of a sentence as declarative, question, imperative,
etc. is marked. In line with the logic of the system developed so far,
the obligation to move a wh-element to Comp is also encoded by an
uninterpretable feature. Languages vary as to whether question
formation requires overt movement of a constituent into the Force
domain. For languages that do, it seems inevitable to resort to some
arbitrary feature within the Force domain that triggers movement. In
recent work this feature has been given a status similar to that of the
EPP feature of T. Thus, Chomsky (2001) also speaks of an EPP feature of
C triggering the movement of an XP with no reference to content. Only
interpretively must the moved element be compatible with question
formation; the process by itself is blind.
The upshot is that the less expressive power a system has, and thus
the less reference to content such triggers allow, the closer one
approaches explanation. It is an empirical matter whether such an
impoverished system can still be made to work.

Little v
Just as arguments have been put forward that the T-domain must be split
up, many authors assume that the V-domain is more finely articulated,
namely as [ v [ V....]]. "Little v" is then seen as the locus of
transitivity, and as the element instrumental in assigning structural
accusative Case to the object. The structure of a canonical VP is then
as in (3'):

(3')          v
             /  \
          she    v
                / \
               v   loves
                  /     \
             loves       the
                        /   \
                     the     man

Here she is the specifier of the v-projection v loves the man. It is
"little v", then, that allows the accusative Case of the man to be
checked and erased (by Agree, or, as the case may be, abstract feature
movement).
With unaccusative verbs, such as arrive and come, little v is
generally taken to be absent (sometimes it is postulated to be present,
but to lack the accusative Case property).

Concluding the excursion:

- Structural Case on the N/D-projection is an uninterpretable feature
that must be checked by an appropriate element and erased. Nominative
Case must be checked by the Tense system; accusative Case must be
checked by the V-system.
- Agreement is an uninterpretable feature on the Tense system (more
particularly reflected in the Agr category), and must be checked and
erased by a suitable DP.
- EPP is an uninterpretable feature on T and must also be checked and
erased by a DP.
- EPP is also an uninterpretable feature of C (or perhaps of a +wh C);
it must be checked and erased by a +wh DP. (It may well be the case
that the EPP puts no restrictions on this, but that moving a -wh DP,
though it succeeds in erasing the EPP feature, clashes with the +wh
feature of C, thus making the derivation crash.)

Back to where we left off: structure (3)

(3) [loves she [loves loves [the the man]]]

As we said: “Suppose the numeration were depleted at this point. This
would mean that the predicational core of the sentence would be fully
specified. However, in order for a predication to be properly semantically
evaluated, its temporal/modal and Force shells must also be specified. So
we would still have an incomplete sentential structure”. Moreover, we can
now add, the subject she still has an uninterpretable nominative Case
feature, and the verb loves has uninterpretable agreement features. They
must be used, that is, checked and erased, before the derivation can be
successfully terminated. So, in order for convergence to be possible, the
numeration must contain more material. And further material is also
necessary in order for the derivation to lead to a full sentence. Thus, here
formal and substantive requirements go hand in hand.
It is good to note that there is no guarantee that the numeration
contains all the necessary material. One suggestive metaphor is that
assembling a numeration is comparable to grabbing a handful of Lego
pieces from a bag and seeing whether they yield a proper object. Some
grabs may, others may not. This seems like a rather inefficient procedure
if the numeration reflects at the conceptual level what the speaker has in
mind (for instance in the sense of Levelt 1989). For those who would
wish to pursue that interpretation it would seem more reasonable if at
least access to functional material were free during the computation.
Note that this idea of a numeration that limits one’s options may seem a
bit arbitrary at the moment, but it is essential when it comes to
implementing the notion of economy as a restriction on admissible
derivations. (I will leave this aside for the moment.)
Since it is easy to see what goes wrong if the necessary inventory is
not available, I will keep assuming that the right elements are indeed in
the numeration.
If we go beyond the predicative core, functional material will have to
be added. The standard assumption is that, just as a verb may select an
object, functional elements too have selectional properties, and that here
too the selector projects. So, assuming that the inventory of functional
categories above V is just Agr, T and C, in that order, we have to assume
that Agr selects V, T selects Agr, and C selects T. It should be noted that
many authors argue that the inventory of, broadly speaking, verbal
features that the computational system has to take into account is far
richer. Moreover, it has been argued that while the order of these features
is fixed cross-linguistically, relative to other material (verbs) they give
rise to word order variation, which means that such features must be
syntactically realized as categories (Cinque 1999).6 For the moment we
will leave it at this, since the technicalities will not be fundamentally
different whether we have 3 or 13 functional categories.
So, if as a next step we grab Agr from the numeration, this will work
out, since Agr selects for V and hence can project. This yields the
structure in (8):

6 But see Nilsen (2003) for an alternative approach.

(8) [Agr Agr [loves she [loves loves [the the man]]]]
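As an aside, the bottom-up assembly of (8) can be mimicked with a minimal sketch of binary Merge in which the selecting element projects its label. The Node class and the bracketed printing convention are my own illustration, not part of any formal proposal:

```python
# A minimal sketch of Merge with projection: the selecting head
# projects, i.e. supplies the label of the newly formed object.

class Node:
    def __init__(self, label, left=None, right=None):
        self.label = label          # label of the projecting element
        self.left = left
        self.right = right

    def __repr__(self):
        if self.left is None:       # lexical item
            return self.label
        return f"[{self.label} {self.left!r} {self.right!r}]"

def merge(selector, selectee):
    """Merge two objects; the selector projects its label."""
    return Node(selector.label, selector, selectee)

# Build (8) bottom-up.
dp     = merge(Node("the"), Node("man"))   # the selects man
vp     = merge(Node("loves"), dp)          # loves selects its object
# she merges as specifier; the loves-projection still projects:
pred   = Node("loves", Node("she"), vp)
clause = merge(Node("Agr"), pred)          # Agr selects V, so Agr projects

print(clause)
# [Agr Agr [loves she [loves loves [the the man]]]]
```

Note that the specifier step had to be written out by hand here, because in that step the projecting element is the second daughter; a fuller sketch would let selectional features decide which daughter projects.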

In the next few steps a configuration must be created in which she can
check the uninterpretable features of the verb. Recall that in the current
position of loves these are inaccessible, since it is the predicational part of
the verb that determines that loves selects she and projects. A standard
implementation is the following: Agr encodes not only the relevant
features, but also the fact that it depends on a verb. This dependency has
the status of an uninterpretable v-feature, which could in principle be
checked and eliminated by moving loves, deriving (9):

(9) [Agr [Agr loves Agr] [loves she [loves (loves) [the the man]]]]

Such overt movement can be observed in a number of languages. In
English, however, we have to assume that loves is spelled out in the
position indicated in (8). The upshot is that, for a minimally simple
computational system to work, the agreement and Tense features of loves
must be accessible in the position of Agr and still higher, whereas the verb
itself is realized within the predicational core. In order to resolve this,
three implementations have been proposed, each quite simple in its own
right.

i) The uninterpretable v-feature of Agr is weak. On this implementation,
weakness of an uninterpretable feature implies that it must be checked
and eliminated, but that it is sufficient for the movement to be abstract:
it takes place in the syntax, but has no effect on spell-out/surface
order.

Implementation i) led to a discussion as to why the computational system
would work that way. That is, why would there be a distinction between
overt and covert movement (apart from the fact that it seems needed in
many areas, for instance quantification)? The upshot of this discussion
was that the choice between overt and covert movement is governed by
economy. Overt movement came out as the operation that must be
specifically enforced (by a strong feature); in the absence of a strong
feature, movement is covert. In earlier stages of minimalist theory
formation it was also assumed that covert movements would take place
after all overt movements; that is, movement is to be postponed whenever
possible. This is the principle of Procrastinate. This assumption led to
various difficulties of implementation, which we will not discuss; more
importantly, the notion of economy itself was not invoked in a natural
manner. Thinking about why covert movement would be more economical,
the intuitive reason is that less is moved: no PF-accessible information,
as encoded in the p-member of the lexical triple, is carried along.
Pursuing this further, one could say that the less you move, the more
economical the derivation. Suppose, then, that you only move what you
have to move. What you have to move are the grammatical features
involved in the checking; the rest is superfluous. This led to
implementation ii):
ii) Movement is triggered by an attracting (uninterpretable) feature. If the
attracting feature is weak, all it attracts are the formal syntactic features of
the attractee; covert movement is thus formal feature movement. If the
attracting feature is strong, the full constituent is moved along with the
attracted feature (referred to as pied-piping).
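As a toy rendering of this distinction: a weak attractor copies only the formal features of its goal (covert movement), while a strong attractor pied-pipes the full constituent. The dictionary representation of a lexical item is my own simplifying assumption:

```python
# Weak vs. strong attraction, sketched: strong features displace the
# whole constituent (pied-piping), weak ones only its formal features.

def attract(goal, strong):
    """Return what is displaced to the attractor's projection."""
    if strong:
        return goal                    # overt movement: full constituent
    return {"FF": goal["FF"]}          # covert movement: formal features only

loves = {"phon": "loves", "FF": {"person": 3, "number": "sg"}}

# Weak v-feature (as assumed for English Agr): spell-out is unaffected.
print(attract(loves, strong=False))    # {'FF': {'person': 3, 'number': 'sg'}}

# Strong v-feature (overt verb-movement languages): the phonological
# material goes along.
print(attract(loves, strong=True)["phon"])   # loves
```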

iii) Current theory investigates whether certain dependencies are to be
understood in a way different from movement, namely as Agree, where
agreement is thought of as an independently needed type of dependency.

In this overview we will essentially use the implementation in terms of
feature movement. Given this, the first step after the formation of (8) is
(10), with FF-X standing for "formal features of X":

(10) [Agr [Agr FF-loves Agr] [loves she [loves loves [the the man]]]]

Since the lowest Agr in this structure is a head, that is, an element directly
taken from the numeration, the movement depicted here instantiates head-
movement: the formal features of loves are adjoined to the head of the
Agr projection.
As the next step, the uninterpretable features of Agr/loves must be
checked and erased. This is implemented by assuming that these features
attract the formal features of she. Note that person and number (and,
where relevant, gender) are uninterpretable on the verb, but interpretable
on the DP: varying one of these features on, for instance, a pronominal
changes the denotation (compare he and she). So, nothing precludes that a
feature that is interpretable on one type of category is used to erase its
uninterpretable counterpart on another category. It is assumed that
checking is an operation that can only take place if a certain formal
condition on the configuration is satisfied. It is an essentially local
operation and can only apply in a so-called checking configuration. For
current purposes it is sufficient to note that the canonical case of such a
checking configuration obtains by merging an element with the first
projection of a head. This is the position corresponding to the traditional
specifier position. This is represented in (11): the intermediate Agr
projection, whose daughters are the Agr-head and the loves projection, is
the first category dominating the Agr-head.
Note that, to be precise, one more issue has to be clarified, namely the
status of the subtree rooted in the complex Agr formed by adjoining
FF-loves to the Agr-head. In order for the computation to be effective, she
has to be able to "see" FF-loves. The question is whether or not FF-loves
is shielded by the intervening Agr node. The predicament, here and
elsewhere, is this: we want the computational processes to be local, yet
sometimes a process must be able to look a little deeper. This is resolved
by defining two ways in which an element can be merged. One type of
merger is called substitution (although by now this is really a misnomer,
the term is still used), the other adjunction. Formally, in terms of the
set-theoretic notation introduced earlier, adjunction is defined as follows:
if α and β are merged by adjunction and α projects, this is represented as
{ <α, α>, { α, β}} (<α, α> stands for an ordered pair). Informally
speaking, the difference between a structure resulting from standard
(substitution) Merge and a structure resulting from adjunction is that
substitution Merge forms a new category. The intermediate Agr projection
and the loves projections in (11) are full categories, which fully count for
calculations of locality. The complex Agr formed by adjunction, by
contrast, heads an adjunction structure. Adjunction splits a category into
segments. So this Agr and the Agr below it are segments of the same
category. she can look down into it and see FF-loves, since it is only a
segment of a category, not a category. Since it is a segment of the lower
Agr, it does not dominate the latter (nor does it dominate FF-loves). So,
given the segment-category distinction, the intermediate Agr is indeed the
first category dominating the Agr-head.

(11) [Agr she [Agr [Agr FF-loves Agr] [loves (she) [loves loves [the the man]]]]]
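The set-theoretic contrast between substitution Merge and adjunction can be mimicked directly with Python frozensets, with an ordered pair (a tuple) as the adjunction label; this encoding of the notation is my own illustration:

```python
# Substitution Merge of α and β with α projecting: {α, {α, β}}.
# Adjunction: {<α, α>, {α, β}} — the ordered-pair label marks a
# two-segment category rather than a new category.

def merge_subst(alpha, beta):
    """Substitution Merge: the label is the projecting element α itself."""
    return frozenset([alpha, frozenset([alpha, beta])])

def merge_adjoin(alpha, beta):
    """Adjunction: the label is the ordered pair <α, α>."""
    return frozenset([(alpha, alpha), frozenset([alpha, beta])])

def is_adjunction_structure(obj):
    """Adjunction structures are recognizable by their ordered-pair label."""
    return any(isinstance(member, tuple) for member in obj)

plain    = merge_subst("Agr", "VP")          # a full category
adjoined = merge_adjoin("Agr", "FF-loves")   # segments of one category

print(is_adjunction_structure(plain))     # False
print(is_adjunction_structure(adjoined))  # True
```

The segment-category distinction thus requires no extra machinery: whether locality calculations see a node as a full category can be read off the form of its label.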

Since she is now in a checking configuration with Agr and FF-loves
(which is also called a sublabel of Agr), the uninterpretable formal
features on the latter can be checked and erased as required.
The next step is getting T from the numeration and merging it with the
root of the tree. It is stipulated to be part of the computational procedure
that substitution Merge always takes place at the root. This property is
also known as the extension condition: there is no adding of elements
halfway up the structure. Only adjunction is exempt from this requirement
(as it must be, since otherwise an operation adding material to the head of
a phrase could not exist).7 Merging T with (11), with subsequent
movement of FF-loves to check a v-feature of T and of she to check and
erase the EPP feature of T, yields (12):

7 This is one of the reasons that many researchers seek to eliminate head-movement as a grammatical process.

(12) [T she [T [T FF-loves T] [Agr (she) [Agr [Agr (FF-loves) Agr] [loves (she) [loves loves [the the man]]]]]]]

Given that a sentence must also be specified for Force, the next step
would be to merge (12) with C. However, if C has the value Declarative,
no indications of further checking and movement can be observed.
Further discussion would not add much to the points I wanted to establish,
so I will leave it at this.

Summary and overview of minimalist principles
In what follows I give a brief and very much condensed overview of a
number of important notions and issues. Some of it recapitulates what was
discussed above, but much of it is new. In its present form this text is not
self-explanatory, but may serve as a means to generate discussion. In the
future it will be expanded, though.

Summarizing, this version of minimalism has a very simple set of
grammatical devices:

• Select
• Merge (comes in fact for free, given that we want to put stuff together)
• Project
• Move/Attract (= Copy followed by Merge)
• Check
• Delete/Erase

There is an interface requirement:

• Full Interpretation

Features may have the following properties (Chomsky 1995):

- (Un)interpretability
  Issue for discussion: drawback that this property can only be checked
  at the interface
- Strong: triggers overt movement
  Issue for discussion: strength is a feature of a feature (a drawback?)
- Weak: triggers covert (abstract) movement
  Drawback: the same?

Recent versions of minimalist theory (Chomsky 1998, 1999) modified
some of the notions involving features:
- Uninterpretable becomes Unvalued (lacking a value is visible by
  "inspection")
- EPP triggers movement
- Agree is a non-movement dependency

Dependencies (Move, Agree) are encoded as follows:

A probe at the root (active = having at least one unvalued feature)
searches for a goal in its c-command domain (active = having at least one
unvalued feature).

Exercise: Compare this to the previous version: uninterpretable features
at the root must be checked and erased by attracting a matching candidate.
Go back to cases of movement discussed earlier, and establish what would
be the goal and what the probe.

Operations establishing dependencies are local.

The following section discusses the proper notion of locality. We know
that all linguistic operations are local in some intuitive sense. However, it
is non-trivial to define locality formally in such a way that it allows all
the possible operations and forbids the impossible ones. There is only one
way to figure this out, namely drawing trees, figuring out what goes
wrong, and correcting the locality notion in such a way that it works,
without overgenerating again.

Conditions on attraction:
I. P and G match in features (have identical features, irrespective of value)
II. G is in Domain (P) = G is contained in a sister of P
III. Locality reduces to closest c-command

Exercise: Draw a tree and indicate for each constituent its domain.

More precisely: for D(P) the c-command domain of P, a matching feature
G is closest to P if there is no G’ in D(P) matching P such that G is in
D(G’) (Minimal Link Condition: MLC).

Exercise: Draw a tree; indicate the c-command relations and the
interveners. Show possible effects of the matching condition.
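In the same spirit as the exercise, the MLC can be sketched as a search for the first matching goal in the probe's c-command domain. Representing that domain as a list ordered from higher to lower is a simplifying assumption of mine:

```python
# Attract-closest: the probe finds the nearest goal whose features
# match its own by feature name, irrespective of value.

def closest_goal(probe_features, domain):
    """domain: (name, features) pairs in c-command order, higher first.
    Return the first constituent sharing a feature name with the probe."""
    for name, features in domain:
        if probe_features & set(features):
            return name
    return None

# A C probe with an uninterpretable wh-feature: 'who' c-commands
# 'what', so 'what' can never be attracted across 'who'.
domain = [("who", {"wh": True}), ("what", {"wh": True})]
print(closest_goal({"wh"}, domain))   # who
```

Non-matching interveners are skipped, as condition I requires: a constituent bearing only, say, phi-features does not block a wh-probe.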

Fine-tuning of the Minimal Link Condition:

- Terms of the same minimal domain (e.g. specifiers of the same head) are
equidistant to probes
- Defective intervention constraint:
  In a structure α > β > Γ, where > is c-command and β and Γ both match
  the probe α, but β is rendered inactive (because it has already been
  checked), the effects of matching between α and Γ are blocked.

Exercise: Draw a tree in which you illustrate the effect of the notion of
equidistance.
Exercise: Draw a tree in which you illustrate the effect of the defective
intervention constraint.

Issue: To what extent is the grammar designed so as to minimize
computational complexity?

Very generally, if you allow a lot of computational operations to interact,
you can get a combinatorial explosion. In recent years there has been
considerable discussion of this issue. In Chomsky (1995) it is considered
not necessarily significant, but in subsequent work it received more
attention. The notion of derivation by phase is intended to reduce the
search space for operations in a principled manner.

- Minimal Link Condition
- Derivation by phase: parts of the structure that are “ready” are shipped
off to other components of the system (PF, interpretation), i.e. spelled out.
Once spelled out, structures are inaccessible to further syntactic operations.
The parts that are shipped off when "ready" are called the strong phases.

Questions: i) What are the (strong) phases? ii) How does the timing work?

The strong phases are CP and vP; perhaps there are others. TP and VP
cannot be strong phases.

Interpretation/evaluation for some strong phase PH1 takes place at the
next relevant phase PH2.

Access to material in a phase is regulated by the Phase Impenetrability
Condition:

Phase Impenetrability Condition (PIC)
The domain of a strong phase head H is not accessible to operations at
the next higher strong phase ZP; only H and its edge (= specifiers,
adjoined elements) are accessible.
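The PIC can be stated as a one-line accessibility condition; the dictionary encoding of a phase is my own simplification:

```python
# Phase Impenetrability Condition, sketched: from outside a strong
# phase, only the phase head and its edge remain accessible; the
# domain (the head's complement) has been spelled out.

def accessible_from_outside(phase):
    """phase: {'head': ..., 'edge': [...], 'domain': [...]}"""
    return [phase["head"]] + phase["edge"]

# A vP phase whose object has moved to the edge, as discussed below:
vP = {"head": "v", "edge": ["OBJ"], "domain": ["V", "t-OBJ"]}

print(accessible_from_outside(vP))   # ['v', 'OBJ']
# The domain is closed off: 't-OBJ' is no longer a possible target.
```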

The following paragraph goes over the details of moving out of a phase.
Consider the structure in (13), with HP and ZP strong phases:

(13) [ZP Z ... [HP α [H YP]]]

Suppose the computation has completed HP and moves on to a stage Γ
beyond HP. It can access the edge α and the head H of HP. But the PIC
makes a distinction between Γ = ZP and Γ within ZP (e.g. TP): for Γ =
TP, T can access an element in the domain YP, but for Γ = CP, C cannot.
Note that, under this strict theory, in order for wh-movement from the
object position in YP to be possible, the object has to move to the edge of
HP before the computation moves on to CP. Assuming that H = v, this
entails that v must have an optional EPP feature; optional, since in
English the object does not standardly end up at the left edge of the vP.
Economy will require that using the option is only possible if there is an
effect on interpretation, i.e. if the position is subsequently vacated by
further movement to the C-domain. The ensuing configuration might
seem to pose a problem for the MLC, since one way or the other the
subject will have to cross the object on its way to T, as in (14):

(14) [CP C ... [T ... [vP OBJ [SUB [v YP]]]]]

The problem is overcome as follows. We already know that all evaluation
is done at the next higher strong phase level, here CP. Let’s assume now
that this is fully general and that “evaluation” includes assessing whether
the MLC has been respected. For independent reasons, only the head of a
chain is visible for intervention effects. At the CP level the structure is as
in (15):

(15) [CP OBJ C [SUB [T ... [vP tOBJ [tSUB [v YP]]]]]]

This structure satisfies the MLC as now understood, since SUB did not
cross the head of the OBJ chain.

A note on economy
In the first versions of the minimalist program economy considerations
were a major concern. Attempts were made to capture a number of
grammatical effects by comparing derivations in terms of economy. For
instance, a derivation containing fewer steps than some other derivation
over the same numeration was taken to yield the preferred outcome.
Upon proper consideration it was concluded that the ensuing model was
computationally inefficient, leading to insufficiently controlled back-
tracking. In the version of the MP described here, economy considerations
have largely been incorporated into the mechanics of the system itself:
locality is part of the definition of Move/Attract, phases close off certain
domains for rule application, etc. Hence, the search spaces have been
substantially limited. Reminiscent of the earlier stages is the reference to
optionality being allowed only if there is an effect on interpretation (the
optional EPP feature of v), and also a preference for Merge over Move if
there is an option (Move being more complex than Merge, since it
consists of Copy followed by Merge). Apart from that, economy has
largely been built into the system along the lines discussed, making a
detailed discussion of economy as found in the literature less relevant for
present purposes.

A note on Subjacency and locality

Island phenomena are examples of macro-phenomena that, from the
current perspective, require an explanation on the micro-level. Within the
P&P model the key word for island phenomena was subjacency:
movement may not cross more than one bounding node in a single step.
Bounding nodes (for English) were taken to be IP and NP (DP).
Consequently, the derivation in (15’) violates subjacency, since what can
only move to the matrix C by crossing both IP2 and IP1; landing in the
intermediate C-domain is excluded, since Spec-CP is already occupied by
who.

(15’) *[CP1 whatj did [IP1 you ask [CP2 whoi [IP2 ti read tj ]]]]

Among the instruments of the minimalist program we have sketched, the
notion of a bounding node is lacking. The direction the minimalist
program takes is to eliminate all reference to properties that are not
intrinsic to the lexical elements in the computation, so stating any
conception like a bounding node would go against the whole idea of the
program. The question, then, is how one can derive such facts using just
the devices that are available.
The direction subjacency research takes is to explain the facts on the
basis of locality considerations. Consider again the wh-island pattern of
(15’), and let’s take as a starting point the situation where both
wh-elements are still outside the Force domain.

(16) [CP2 C [IP2 whoi [read whatj]]]

Assume C has a feature attracting wh. As a result one of the two must
move to Spec-C. Take it to be the subject, yielding (17):

(17) [CP2 whoi C [IP2 ti read whatj]]

Suppose structure building goes further, yielding (18), where the matrix
C also has an uninterpretable feature attracting wh:

(18) [CP1 C did [IP1 you ask [CP2 whoi (C) [IP2 ti read whatj]]]]

However, given the way Attract is defined, the matrix C cannot attract
what: Attract can only see the nearest element with the required feature,
namely who. Hence attraction of what fails and (19) cannot be derived.

(19) *[CP1 whatj C did [IP1 you ask [CP2 whoi (C) [IP2 ti read tj]]]]

Suppose who is attracted and moves on to the matrix C as in (20):

(20) [CP1 whoi C did [IP1 you ask [CP2 ti (C) [IP2 ti read whatj]]]]

Even if multiple specifiers are allowed, what cannot move to Spec-CP2.
Clearly such movement would violate cyclicity. However, cyclicity need
not be stipulated, since the embedded C-head’s attracting feature has been
used up (= erased) by who’s passing through.
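The derivation just discussed can be simulated by combining attract-closest with one-shot attracting features: each C attracts only the nearest wh-element, and its feature is erased once used. The representation is my own illustration:

```python
# Wh-island effect from attract-closest plus feature erasure.

def attract_wh(c_head, candidates):
    """candidates: wh-elements in c-command order, higher first.
    Attract the closest one, consuming the head's attracting feature."""
    if not c_head.get("wh"):
        return None                # feature already erased: no landing site
    c_head["wh"] = False           # the attracting feature is used up
    return candidates[0] if candidates else None

C2 = {"wh": True}                  # embedded interrogative C
C1 = {"wh": True}                  # matrix interrogative C

# Step 1: embedded C attracts the closest wh-phrase.
print(attract_wh(C2, ["who", "what"]))   # who

# Step 2: matrix C again sees 'who' as the closest wh-element,
# so 'what' can never be targeted and (19) is underivable.
print(attract_wh(C1, ["who", "what"]))   # who

# Step 3: nor can 'what' later pass through Spec-CP2:
# C2's attracting feature is gone.
print(attract_wh(C2, ["what"]))          # None
```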

Literature
Chomsky, Noam. 1957. Syntactic Structures. The Hague: Mouton.
Chomsky, Noam. 1965. Aspects of the Theory of Syntax. Cambridge,
Mass.: MIT Press.
Chomsky, Noam. 1981. Lectures on Government and Binding. Dordrecht:
Foris.
Chomsky, Noam. 1986. Knowledge of Language: Its Nature, Origin and
Use. New York: Praeger.
Chomsky, Noam. 1986. Barriers. Cambridge, Mass.: MIT Press.
Chomsky, Noam. 1995. The Minimalist Program. Cambridge, Mass.:
MIT Press.
Chomsky, Noam. 1998. Minimalist Inquiries. MIT Working Papers in
Linguistics. Cambridge, Mass.: MIT.
Chomsky, Noam. 1999. Derivation by Phase. MIT Working Papers in
Linguistics. Cambridge, Mass.: MIT.
Cinque, Guglielmo. 1999. Adverbs and Functional Heads: A Cross-
linguistic Perspective. Oxford: Oxford University Press.
Haegeman, Liliane. 1991. Introduction to Government & Binding Theory.
Oxford: Blackwell.
Levelt, Willem J. M. 1989. Speaking: From Intention to Articulation.
Cambridge, Mass.: MIT Press.
Nilsen, Øystein. 2003. Eliminating Positions. LOT Dissertation Series.
Utrecht: Roquade.
