This squib discusses the clitic DU (-du/-ðu/-tu) in Icelandic and compares it with other reduced forms of personal pronouns, such as ’ann (for hann ‘he’) and ’ún (for hún ‘she’). We show that there are various restrictions found on DU which are not found on other reduced forms of personal pronouns in Icelandic. We argue that whereas reduced forms such as ’ann and ’ún are morphophonologically conditioned, DU is syntactically conditioned; it is not only clause-bounded but also phase-bounded.
We analyze genitive of negation (GN) in Lithuanian. When the verb is negated, GN is realized on an object that would otherwise be realized as accusative. We demonstrate that Lithuanian GN is a syntactic (in line with Arkadiev 2016) and morphological phenomenon in contrast to Russian GN, whose realization is influenced by semantic factors (e.g. Kagan 2013). It differs from Russian (Pesetsky 1982) in that (i) it is always assigned to a DP which would otherwise bear structural accusative regardless of its semantic properties, and (ii) it cannot affect a structural nominative DP regardless of whether it is an external or internal argument. Lithuanian GN, in this respect, is similar to Polish GN (e.g. Przepiórkowski 2000, Witkoś 2008). We offer a three-layered approach to case, arguing that GN is a reflection of structural object case, assigned in syntax, then translated to morphological genitive case at PF and, finally, realized at Vocabulary Insertion (Halle & Marantz 1993). Thus, stru...
Proceedings of the Linguistic Society of America, 2018
Dative case on indirect objects (IO) in Lithuanian is preserved under passivization, which is not the case with dative direct objects (DO) of monotransitive verbs, suggesting that the two datives are not alike. Although DAT-to-NOM conversion is taken as an indicator of structural case, we show that DO datives behave differently from DOs bearing structural accusative in that the former exhibit inherent case properties as well (see also Anderson 2015). We develop an account for the contrast between the two datives by using two types of derivational mechanisms: structure-building features, triggering Merge, and probe features, triggering Agree (Heck & Müller 2007; Müller 2010). This study demonstrates that structural vs. non-structural conversion can be dependent on not only how case is assigned but also on the Voice system of a language (in line with Alexiadou et al. 2014). We argue that the DO dative in Lithuanian is in fact non-structural. Even though the result of DAT-to-NOM conver...
The Reflexive Passive in Icelandic is reminiscent of the so-called New Passive (or New Impersonal) in that the oblique case of a passivized object NP is preserved. As is shown by recent surveys, however, speakers who accept the Reflexive Passive do not necessarily accept the New Passive, whereas conversely, speakers who accept the New Passive do also accept the Reflexive Passive. Based on these results we suggest that there is a hierarchy in the acceptance of passive sentences in Icelandic, termed the Passive Acceptability Hierarchy. The validity of this hierarchy is confirmed by our diachronic corpus study of open access digital library texts from Icelandic journals and newspapers dating from the 19th and 20th centuries (tímarit.is). Finally, we sketch an analysis of the Reflexive Passive, proposing that the different acceptability rates of the Reflexive and New Passives lie in the argument status of the object. Simplex reflexive pronouns are semantically dependent on the verbs whi...
This paper presents ongoing work that aims to improve machine parsing of Faroese using a combination of Faroese and Icelandic training data. We show that even if we only have a relatively small parsed corpus of one language, namely 53,000 words of Faroese, we can obtain better results by adding information about phrase structure from a closely related language which has a similar syntax. Our experiment uses the Berkeley parser. We demonstrate that the addition of Icelandic data without any other modification to the experimental setup results in an f-measure improvement from 75.44% to 78.05% in Faroese and an improvement in part-of-speech tagging accuracy from 88.86% to 90.40%.
We describe the background for and building of IcePaHC, a one million word parsed historical corpus of Icelandic which has just been finished. This corpus, which is completely free and open, contains fragments of 60 texts ranging from the late 12th century to the present. We describe the text selection and text collecting process and discuss the quality of the texts and their conversion to modern Icelandic spelling. We explain why we chose to use a phrase structure Penn-style annotation scheme and briefly describe the syntactic annotation process. Furthermore, we advocate the importance of an open source policy as regards language resources.
We describe the background for and building of IcePaHC, a one million word parsed historical corpus of Icelandic which has just been finished. This corpus, which is completely free and open, contains fragments of 60 texts ranging from the late 12th century to the present. We describe the text selection and text collecting process and discuss the quality of the texts and their conversion to modern Icelandic spelling. We explain why we chose to use a phrase structure Penn-style annotation scheme and briefly describe the syntactic annotation process. We also describe a spin-off project which is only in its beginning stages: a parsed historical corpus of Faroese. Finally, we advocate the importance of an open source policy as regards language resources.
Cross-linguistically, reflexive verbs frequently show puzzling behavior when they are embedded under causatives. We focus on two ways that this pattern manifests itself in Icelandic Indirect Causatives, formed with the light verb láta ‘let/make/have’: (i) verbs that normally cannot be embedded are allowed with reflexives, and (ii) a pleonastic use of the causative verb becomes available in imperatives with oblique subjects. We propose that these facts follow from the syntax of long-distance reflexives (which involves a “point-of-view” operator OP_POV), and a Voice-stacking analysis of indirect causatives, where two Voice heads are added on top of a single vP. The claim is that there is a limited set of ways to interpret the Voice-stacking structure, and reflexives provide one particular way to do this that is not otherwise available. Assuming that either Voice head can introduce a thematic interpretation or be expletive, we propose that in principle, there are four ways to interpret ...
Recent work in Distributed Morphology (DM) proposes an important role for phase theory in constraining special realization of morphemes at the interfaces in contextual allomorphy (Embick 2010) and polysemy resolution (Marantz 2013). Our case study on verb-derived agent nominals (VAN) (1a) and root-derived agent nominals (RAN) (1b) in Icelandic supports this line of research and provides a novel type of argument by showing that Icelandic palatalization applies if and only if the undergoing morpheme is phase-local to the trigger. If our analysis is on the right track, any analysis of segmental phonology like palatalization potentially requires an understanding of syntax and semantics. The findings are interesting because they suggest that surface-oriented phonological processes within a word are constrained by the same universal principles as superficially unrelated phenomena like locality constraints on syntactic displacement; island constraints and palatalization may play by the sa...
The topic of this paper is a rule-based pipeline for converting constituency treebanks based on the Penn Treebank format to Universal Dependencies (UD). We describe an Icelandic constituency treebank, its annotation scheme and the UD scheme. We discuss the conversion, the methods used to deliver a fully automated UD corpus, and the complications involved. To show its applicability to corpora in different languages, we extend the pipeline and convert a Faroese constituency treebank to a UD corpus. The result is an open-source conversion tool, published under an Apache 2.0 license, applicable to a Penn-style treebank for conversion to a UD corpus, along with the two new UD corpora.
The new Icelandic Word Web (IW) is a language-technology-focused redesign of a lexicosemantic database of semantically related entries. The IW’s entities, relations, metadata and categorization scheme have all been implemented from scratch in two systems, OntoLex and SKOS. After certain adjustments were made to OntoLex and SKOS interoperability, it was also possible to implement specific IW features that, while potentially nonstandard, form an integral part of the Word Web’s lexicosemantic functionality. Also new in this implementation are access to a larger amount of linguistic data, a greater variety of search options, the possibility of automated processing, and the ability to conduct research through SPARQL without possessing a mastery of Icelandic.
This paper discusses idiomatic expressions in natural language where a particular verb and its particular direct object are interpreted in a special way which cannot be predicted based on the meaning of individual words. We discuss the interplay of certain syntactic constructions in Icelandic and the interpretation of these syntactic constructions. We show that idioms that lose their idiomatic meaning in a canonical passive retain it in the new impersonal construction. Properties that characterize verbal idioms are discussed as well as theories put forth by Chomsky and Lebeaux regarding the analysis of such data. Those theories are compatible with the view that the new impersonal construction in Icelandic involves a silent subject because a silent element in the subject position should block A-movement of the direct object. If the object cannot move, the verb and its object are adjacent and nothing prevents an interpretation where the two su...
In this paper, we re-examine the case for a split theory of concord (Baier 2015) which distinguishes between concord by Agree and Morphological Feature Copying as well as a split theory of affixation which distinguishes between head-to-head Lowering and Local Dislocation under linear adjacency (Embick and Noyer 2001). We show that both complications are redundant once we adopt our structural typology of adjectives. The empirical differences which gave rise to those split theories can in fact be accounted for by systematic variation in the structure of adjectives.