SUSA/JSFOu 96, 2017

Gwen Eva Janda, Axel Wisiorek & Stefanie Eckmann (Munich)

Reference tracking mechanisms and automatic annotation based on Ob-Ugric information structure

The following paper is concerned with information structure in the Ob-Ugric languages and its manifestation in reference tracking and its mechanisms. We will show how knowledge of both information structure and reference tracking mechanisms can be used to develop a system for the (semi-)automatic annotation of syntactic, semantic and pragmatic functions. We assume that the principles of information structure, i.e., the balancing of the content of an utterance, are indicated by the use of anaphoric devices to mark participants in an on-going discourse. This process, in which participants are encoded by the speaker and decoded by the hearer, is called reference tracking. Our model distinguishes four important factors that play a role in reference tracking: inherent (linguistic) features of a referent, information structure, referential devices and referential strategies. We call the interaction between these factors reference tracking mechanisms. Here, the passive voice and the dative shift are used to exemplify this complex interaction system. Drawing conclusions from this, rules are developed to annotate the syntactic, semantic and pragmatic roles of discourse participants (semi-)automatically.

1. Introduction

This paper deals with information structure in Ob-Ugric, its effects on reference tracking and the application of both in the development of (semi-)automatic annotation tools. It covers three main topics: (i) an analysis of linguistic devices used for reference tracking in Ob-Ugric, (ii) a description of information structure and its impact on reference tracking, and, concluding from these, (iii) a discussion of how these can be used to design and set rules for a (semi-)automatic annotation of syntactic, semantic and pragmatic functions.
After a short introduction to the Ob-Ugric languages and their specific typological features (section 1), we will define the notions of information structure and reference tracking (section 2). Then we will describe the complex system of reference tracking mechanisms exemplified by the passive voice and the so-called dative shift (section 3). The last section shows how the regularities of reference tracking mechanisms can be used to develop a (semi-)automatic annotation tool for Ob-Ugric text samples.

The Ob-Ugric languages form a branch of the Uralic language family and are spoken in Western Siberia on the river Ob and its tributaries. The two languages Khanty and Mansi are further divided into dialectal groups, which in turn consist of several sub-dialects. Like many of the other minority languages spoken in the Russian Federation, both Khanty and Mansi are highly endangered (Moseley 2010).

The material of our investigation is taken from the corpus of the EUROCORES project “Ob-Ugric languages: conceptual structures, lexicon, constructions, categories” and its subsequent DFG/FWG project “Ob-Ugric database: analysed text corpora and dictionaries for less described Ob-Ugric dialects”.1 The corpus offers glossed and translated texts from several Khanty and Mansi varieties, such as Northern and Western Mansi as well as Surgut Khanty and its subdialect Yugan Khanty.

From a typological point of view, Khanty and Mansi are, like most Uralic languages, mainly agglutinative languages. As is also known from Mordvin and from the Ugric and Samoyedic branches, there is subject as well as object agreement on the verb, resulting in two sets of verbal paradigms. The first set of endings agrees only with the subject of the sentence (traditionally called subjective conjugation), while the second set of endings agrees with both the subject and the direct object of the sentence (traditionally called objective conjugation).
Nouns are marked for person (traditionally referred to as possessive inflection), number and case. In addition to a large inventory of adverbial cases, Khanty and Mansi dialects also use postpositions, mainly to mark spatial and other adverbial relations. However, regardless of their large inventory of suffixes, the Ob-Ugric languages tend not to mark syntactic core roles by case, but by word order (SOV) and verbal inflection. In Northern Mansi, for instance, as a result of the loss of the Proto-Ob-Ugric accusative *-m(V), both the subject and the direct object role are unmarked on nouns. In these dialects, however, the accusative is still found with pronominals. The same applies to Surgut and Yugan Khanty. In the Western Mansi dialects, another strategy to mark direct objects has developed: in Pelym Mansi, e.g., some nouns in direct object role are marked with the dative-lative, but not all of them. In other dialects of Mansi, e.g., in the Eastern group or the Middle Lozva dialect of the Western group, the accusative suffix can still be found; it was still in use at the time of the extinction of the dialect. For now, we will mainly focus on those dialects with unmarked nominal direct objects.

Furthermore, the Ob-Ugric languages are pro-drop languages, i.e., the subject does not need to be expressed overtly in the sentence. The same applies to certain types of direct objects. The fundamental premise is that subject and direct object have already been mentioned in the discourse, i.e., they represent previously given information.

2. Basic notions of information structure and reference tracking

2.1. Information structure

In our understanding, a discourse can be defined as comprising grammar and its communicative value. The latter is based on (a) the text coherence and (b) the balanced content of given and new information.
This balanced content is what is called information structure and is determined by the speaker’s intention and his assumption about the hearer’s knowledge. A topic is what the discourse is about and is chosen according to what the speaker intends to talk about, i.e., to share information about. The comment provides new information on the topic (cf. Loos 2004). The comment depends on the speaker’s intention as well. The speaker also decides how to convey this intention to the hearer. The speaker must signal what he speaks about (his topic) and, at the same time, he must structure the information he wants to share in such a way that it can be recognized and processed correctly by the hearer. If there is too much new information which the hearer cannot connect to anything known to him, communication fails. Therefore, the speaker must establish a common ground for the hearer to connect to any new information. If, on the contrary, there is too much given information, there is no communicative value in a discourse either, since the hearer does not learn anything new (cf. Krifka 2008, Kern 2010).

1. <http://www.oudb.gwi.uni-muenchen.de/>

Linguistically speaking, these principles result in and are indicated by the use of various anaphoric devices to refer to participants in an on-going discourse. This process is called reference tracking, which in Ob-Ugric is based on co-referentiality (cf. Comrie 1988).

2.2. Reference tracking

Reference tracking can be defined as the monitoring of a participant in an on-going discourse (Nagaya 2006: 3). The notion of reference tracking thus describes the encoding (speaker) and decoding (hearer) of referents in discourse. Based on Comrie (1988, 1989), our model distinguishes four important factors that have an effect on reference tracking and its mechanisms.
Firstly, in every language there are certain inherent features of a referent that are marked linguistically (Comrie 1988). In the Ob-Ugric languages, these are number (Example 1) and, to a limited extent, animacy: there is a distinction between animate comitatives (a postpositional phrase is used instead of an adverbial case suffix, Example 2) and inanimate instrumentals (marked with the instrumental case suffix, Example 3). Additionally, the repeated occurrence of the noun ‘knife’ instead of the use of a pronominal in Example 3 indicates (apart from other factors) the distinction between animate and inanimate referents.

(1) OUDB Surgut Khanty Corpus. Text ID 735, Nr. 1
kɐːt iːmiɣən βɑɬɬəɣən.
kɐːt iːmi -ɣən βɑɬ -ɬ -əɣən
two old_woman -DU live -PRS -3DU
‘Once upon a time there lived two women.’

(2) OUDB Northern Mansi Corpus. Text ID 750, Nr. 81
neːmatər sir (…) maːn jotuw at weːriti
neːmatər sir maːn jot -uw at weːrit -i
no_one 1PL with -1PL NEG trust -PRS[3SG]
‘No manner of water shrine, no manner of land shrine can stand against us.’

(3) OUDB Northern Mansi Corpus. Text ID 1238, Nr. 59
janəɣ eːkʷa kasaj wis, piɣe kasajil ta puwtmaste.
janəɣ eːkʷa kasaj wi -s piɣ -e kasaj -il ta puwtm -as -te
elder woman knife take -PST[3SG] boy -SG<3SG knife -INST EMPH1 push -PST -SG<3SG
‘The old woman took a knife and stabbed her son with it.’

Secondly, there are (anaphoric) devices that can be used to refer to a referent. In Ob-Ugric, these are noun (phrases), pronouns, personal endings and the zero morpheme. Noun (phrases) are mostly used to introduce a new referent into the discourse or to point out its non-subject status. Unlike nouns, pronouns and personal endings do not bear any reference of their own; instead, they are substitutes for the respective noun. Whilst pronouns are overtly realized on the surface of the sentence, referents encoded in personal endings are marked on the word they are attached to, e.g., on the verb (Skribnik 2001a).
No additional overt marker is needed (Example 4).2

(4) OUDB Northern Mansi Corpus. Text ID 1234, Nr. 155
matər met woweɣən.
ø matər met wow -eɣ -ən
ø what fee demand -PRS -2SG
‘(You will get) what kind of fee you demand.’

3SG pro-forms, however, have a limited competence regarding disambiguation and expression of co-referentiality (Kovgan 2001: 152; Kovgan 2005: 558; Kibrik 2008: 1130). Zero morphemes exhibit the least overt realization, i.e., none (Example 5).3 Additionally, the inherent features are mirrored in the referential devices.

(5) OUDB Pelym Mansi Corpus. Text ID 1278, Nr. 33
oːs i oːli.
ø oːs i oːl -i
ø and PTCL live -PRS[3SG]
‘And he lives on.’

2. Zero anaphoras are marked with the wildcard ø in the glossing.
3. Information encoded with zero morphemes is indicated with square brackets in the glossing.

Thirdly, the underlying information structure needs to be considered. The referent’s status of accessibility and givenness is in functional relation with the morpho-phonological size of the referential device: the less accessible a referent is, i.e., new information, the more encoding material is needed, i.e., noun phrases, nouns or pronouns which are overtly realized. The more accessible it is, i.e., given information, the less encoding material is needed, i.e., zero anaphora.

Finally, information structure decides which referential strategy is to be employed. Referential strategies are the application of a certain syntactic role, agreement, ellipsis (i.e., pro-drop), word order and/or diathesis, which will be explained in more detail in the following section. Closing the circle, the referential strategies, of course, make use of the referential devices. The combination and use of all of these factors is what we call reference tracking mechanisms. They are illustrated in the following model (Figure 1):

Figure 1. Model of the reference tracking mechanisms in Ob-Ugric languages.

3. Ob-Ugric reference tracking mechanisms

From the previous section we can conclude that reference tracking can be viewed as the visualization of information structure in a text. Consequently, the referent serving as the topic is the one which is monitored by reference tracking mechanisms. We understand the notion of topic as a referent’s pragmatic role. Note that in this conception we assume that there is a hierarchy of pragmatic roles (pragmatic hierarchy). These have to be distinguished according to the referent’s level of involvement in the discourse.

Our text corpus mainly consists of narratives and tales. Texts of this type are well suited to demonstrating reference tracking throughout a coherent text, since in Ob-Ugric narratives there is usually only one main hero. This main hero is typically established at the beginning of the story and continuously mentioned in the plot. He therefore takes the role of the primary topic or discourse topic (cf. Nikolaeva 2001). Other referents, mainly those with a relation to the main hero expressed by the verbal action, may also hold a certain topical status and serve as the secondary topic (cf. Nikolaeva 2001). They are not necessarily part of the whole story but may occur only in certain paragraphs and can alternatively be called paragraph topics. While usually there is only one discourse topic, there can be several paragraph topics, since the main hero may interact with several participants or the speaker’s attention may shift from one participant to another. The lowest role in the pragmatic hierarchy is represented by the sentence topic (cf. Reinhart 1982), which is, as its name suggests, only found in a few consecutive sentences.

Reference tracking mechanisms are used to track both kinds of referents, primary topics as well as secondary topics. The devices used to refer to the primary and secondary topics, however, differ.
The principles of information structure in general apply to other text genres as well, but reference tracking mechanisms may differ slightly or may not be as obvious. Owing to a lack of comparable data, our analysis focuses on narrative texts.

3.1. Passive

In general, the passive voice is used to indicate that the semantic role of the subject is not the agent but the “patient or recipient of the action denoted by the verb” (Loos 2004). Passive voice is very frequent in Ob-Ugric and has been described in detail by, e.g., Kulonen (1989). Therefore, we will only focus on the most characteristic features of the passive voice and its relevance as a reference tracking mechanism. Examples of a patient or a recipient as subject of a passive sentence can also be found in Ob-Ugric (Example 6). The passive is also known to demote an agent, as the agent is not obligatory in the sentence. If it occurs, the agent is represented in a non-core syntactic role. In Mansi, for instance, the agent is marked with the dative-lative case (DLAT) (Example 6); in Khanty, with the locative case.

(6) OUDB Pelym Mansi Corpus. Text ID 1277, Nr. 27
oɒ̯mpnə purx itwəs.
ø oɒ̯mp -nə pur -x it -w -əs
ø dog -DLAT bite -INF want -PASS -PST[3SG]
Roles: ø = S/PAT, oɒ̯mpnə = ADV/AG
‘A dog wanted to bite him.’

Additionally, other semantic roles in subject position can also be found in passive sentences. There are, e.g., passive sentences in Ob-Ugric with a so-called locative subject (cf. Kulonen 1989). A locative subject takes the semantic role of goal in sentences with verbs of motion:

(7) OUDB Pelym Mansi Corpus. Text ID 1335, Nr. 11
ta keːm wuləmnə joxtows.
ø ta keːm wuləm -nə joxt -ow -s
ø to an extent sleep -DLAT come -PASS -PST[3SG]
Roles: ø = S/LOC, wuləmnə = ADV/AG
‘She was completely overcome by sleep.’

Another instance where the passive is used in Ob-Ugric is when the subject takes the role of addressee with verba dicendi:

(8) OUDB Surgut Khanty Corpus. Text ID 734, Nr. 20
mʉβəɬinə təɣə muːnʲtʲo?
ø mʉβəɬi -nə təɣə muːnʲtʲ -ø -o
ø what -LOC here tell tales -PST -PASS.2SG
Roles: ø = S/, mʉβəɬinə = ADV/AG
‘What has told you (to come) here?’

To conclude, the Ob-Ugric passive does not necessarily promote either the patient or the recipient to the syntactic role of subject. The passive is not restricted to promoting the patient/recipient or to demoting the agent. Rather, the passive is considered a means to decrease verb valency: since the agent is represented with a non-core syntactic role, there is only one syntactic core role left, the subject, performed by the patient. This is why the passive is usually associated with transitive verbs. The sample of verbs of motion (Example 7), however, proves that in Ob-Ugric passivization is not limited to transitives, nor is transitivity required for passivization. Instead, the passive voice is used to maintain the correlation between subject and primary topic: the assignment of a certain syntactic role is a referential strategy determined by information structure. Consequently, the primary topic is assigned the role of subject of a sentence in Ob-Ugric. If the semantic role is not the agent, there is a change in diathesis, i.e., passivization. The passive in Ob-Ugric therefore is a reference tracking mechanism and is not limited by verb valency or the semantic role of the referent promoted to the role of the subject.

3.2. Dative Shift

In the previous section, we saw that there is a strong correlation between the pragmatic role of primary topic and the syntactic role of subject. The correlation between pragmatic and syntactic roles, however, goes beyond subject and primary topic: there is also a correlation between the secondary topic and the direct object. The correlation between direct object and secondary topic is maintained with a reference tracking mechanism comparable to the passive: the so-called dative shift (Skribnik 2001a; Givón 1984).
Consequently, there is no restriction of the patient role to the direct object, either. In a ditransitive sentence, the subject generally takes the semantic role of agent, the direct object is assigned the semantic role of patient and the indirect object takes the semantic role of recipient:

(9) OUDB Surgut Khanty Corpus. Text ID 1083, Nr. 22
nʉŋ mɐːntem məje ɐːj nʲeːβreməle.
nʉŋ mɐːntem məj -e ɐːj nʲeːβrem -əle
2SG 1SG.DAT give -IMP.SG<2SG small child -DIM.MEL
Roles: nʉŋ = S/AG, mɐːntem = IO/REC, ɐːj nʲeːβreməle = DO/PAT
‘You give me your little child.’

However, if the referent in indirect object position is the secondary topic, dative shift occurs in Ob-Ugric: the primary topic is still in subject position taking the semantic role of agent, but the secondary topic, being the recipient of the action, changes from indirect object to direct object role (Skribnik 2001: 227–229). The referent semantically serving as patient is then syntactically in adverbial position:

(10) OUDB Surgut Khanty Corpus. Text ID 1083, Nr. 23
tʲuːt mɐː nʉŋɐt tʉβətɐt məɬəm.
tʲuːt mɐː nʉŋɐt tʉβət -ɐt mə -ɬ -əm
then 1SG 2SG.ACC fire -INSC give -PRS -1SG
Roles: mɐː = S/AG, nʉŋɐt = DO/REC, tʉβətɐt = ADV/PAT
‘Then I will give you fire.’

The formal marking of the dative shift differs between Khanty and Mansi. In Surgut Khanty, the recipient as direct object (DO) is marked by the accusative (if pronominal) or remains unmarked. Patients then are adverbials and marked with the instructive case (INSC) (Example 10). In Northern Mansi, the direct object is unmarked, while the adverbial patient is inflected in the instrumental case (Example 11).

(11) OUDB Northern Mansi Corpus. Text ID 1229, Nr. 6
nʲaːləl waːrilum.
ø ø nʲaːl -əl waːr -i -lum
ø ø arrow -INST make -PRS -SG<1SG
Roles: ø = S/AG, ø = DO/REC, nʲaːləl = ADV/PAT
‘I make you an arrow.’

As previously mentioned, the choice and the morpho-phonological size of the referential device as well as the referential strategy are determined by information structure (Givón 1983). Therefore, not only can the subject as primary topic be dropped, but also the direct object as secondary topic. Furthermore, the topical status of both referents (subject and direct object) triggers the use of the objective conjugation (subject and direct object agreement on the verb), which is obligatory in this case.

In summary, the secondary topic referent triggers (i) dative shift if its semantic role is not the patient and, in Northern Mansi, (ii) objective conjugation because of its direct object role owing to the dative shift. Thus, in these uses, the objective conjugation is not only a referential strategy by itself but can be regarded as the formal indication of dative shift. In a nutshell, the dative shift is a reference tracking mechanism that is used to assign the secondary topic to the syntactic role of direct object, and it is not limited by the semantic role of the referent promoted into the role of the direct object.

4. (Semi-)automatic annotation rules for Ob-Ugric texts

The previous sections have shown that there are certain regularities regarding the use of referential devices, referential strategies and reference tracking mechanisms, all of which are mainly based on information structure. With the help of these regularities, it is possible to formulate a set of (semi-)automatic annotation rules. Such an annotation tool has recently been developed for the corpus of the EUROCORES project “Ob-Ugric languages: conceptual structures, lexicon, constructions, categories” and its subsequent DFG/FWG project “Ob-Ugric database: analysed text corpora and dictionaries for less described Ob-Ugric dialects” (see Wisiorek & Schön 2017: 389f).
In this annotation system, these regularities were used as heuristic rules to tag the functional, semantic and pragmatic values of each referent throughout the whole text, followed by a review and, if necessary, a correction by an annotator. This section will deal with the most significant regularities and their transformation into annotation rules.

One referential strategy which has not been mentioned in detail yet is word order: the unmarked word order in Ob-Ugric is SOV. We can use this regularity as a heuristic device to determine clause boundaries in complex sentences as well as to differentiate the core syntactic roles of zero-marked nominal phrases: a subject tag is set when there is an unmarked nominal phrase, and the next unmarked nominal phrase is tagged as direct object. Any nominal phrase with a case suffix is tagged as adverbial.

Yet, there are more referential devices which replace recurring nouns. Personal pronouns, which in Ob-Ugric differentiate subject and direct object by case, are tagged according to their case form. The most common references to subject as well as direct object in pro-drop languages, however, are ellipsis as a referential strategy or zero morphemes as referential devices. Therefore, since subject and direct object do not necessarily occur on the surface of the sentence in Ob-Ugric, their position has to be visualized first in order to be able to tag the majority of syntactic arguments. Another annotation rule is thus to add a subject zero position in any clause which does not feature any unmarked nominal or pronominal phrases. If the verb is marked with objective conjugation (visible in the glossing), two zero positions are added (or one, if there is one unmarked nominal or pronominal phrase). Referring back to word order, the position is preverbal and the sentence-initial zero is tagged as subject, the following one as direct object.
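The syntactic rules just described can be illustrated with a minimal Python sketch. It is not the OUDB tool's actual implementation: the `Phrase` class, the role labels `S`, `DO`, `IO`, `ADV` and `PRED`, the case labels and the function `tag_syntactic_roles` are hypothetical names chosen for illustration, and the treatment of dative pronouns as `IO` is an assumption based on Example 9. Clause segmentation and the glossing that identifies case and conjugation type are assumed to be given.

```python
from dataclasses import dataclass, replace
from typing import List, Optional

@dataclass
class Phrase:
    form: str
    kind: str                    # "noun", "pronoun", "verb" or "zero"
    case: Optional[str] = None   # None = no case suffix (unmarked)
    objective: bool = False      # verbs only: objective conjugation
    role: Optional[str] = None   # filled in by the tagger

# Assumption: pronouns are tagged by case form; the DAT -> IO mapping follows Example 9.
PRONOUN_ROLE = {None: "S", "ACC": "DO", "DAT": "IO"}

def is_core(p: Phrase) -> bool:
    """Unmarked nominal phrases and nominative/accusative pronouns fill core roles."""
    return (p.kind == "noun" and p.case is None) or \
           (p.kind == "pronoun" and p.case in (None, "ACC"))

def tag_syntactic_roles(clause: List[Phrase]) -> List[Phrase]:
    verb = next(p for p in clause if p.kind == "verb")
    # Objective conjugation implies subject and direct object, otherwise subject only.
    needed = 2 if verb.objective else 1
    n_zero = max(0, needed - sum(1 for p in clause if is_core(p)))
    # Zero positions are added clause-initially; the input clause is copied, not mutated.
    tagged = [Phrase("ø", "zero") for _ in range(n_zero)] + [replace(p) for p in clause]
    open_roles = ["S", "DO"]
    # Pass 1: roles determined by morphology (verb, pronoun case, case-marked nouns).
    for p in tagged:
        if p.kind == "verb":
            p.role = "PRED"
        elif p.kind == "pronoun":
            p.role = PRONOUN_ROLE.get(p.case, "ADV")
        elif p.kind == "noun" and p.case is not None:
            p.role = "ADV"
        if p.role in open_roles:
            open_roles.remove(p.role)
    # Pass 2: SOV order; the first open position is subject, the next direct object.
    for p in tagged:
        if p.role is None and open_roles:
            p.role = open_roles.pop(0)
    return tagged

# Example 11: nʲaːləl waːrilum 'I make you an arrow.'
clause = [Phrase("nʲaːləl", "noun", case="INST"),
          Phrase("waːrilum", "verb", objective=True)]
print([(p.form, p.role) for p in tag_syntactic_roles(clause)])
# [('ø', 'S'), ('ø', 'DO'), ('nʲaːləl', 'ADV'), ('waːrilum', 'PRED')]
```

For Example 11 the sketch reproduces the role assignment given in the glossing (two zero positions for subject and direct object, and the instrumental noun as adverbial). Phrases that remain without a role after both passes are left for the annotator.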
With regard to semantic roles, it is suggested that subjects be tagged as agents and direct objects as patients. If there is a passive marker on the verb, however, it is suggested that the subject be tagged as patient. If there is an adverbial in the passive sentence, its suggested tag is agent. In certain cases (e.g. the combination of dative shift and passive), the suggestion will have to be corrected by the annotator.

Pragmatic roles are difficult to tag automatically. Since the size of the referential device correlates with the referent’s accessibility, any zero is tagged with the topic role, whereas personal pronouns are tagged as contrast. The comment part of the sentence most often corresponds with the focus part. If a sentence consists only of the predicate, this component must be new information and focus (since subject and probably direct object are given and thus dropped). This coincides with the assumption of a preverbal focus and a topic-initial word order in Ob-Ugric and is the reason why zeros are placed not only in front of the verb but in sentence-initial position. Any components (mostly adverbials) directly preceding the verb are also tagged as focus.

On the other hand, there must be a reason why topics are sometimes realized as nouns within the text even though they have been dropped before. This is considered in the evaluation of the tagging results by computing the value of the nominally coded referent on the accessibility scale based on the data of the referent tagging (see below); introduced referents are tagged as ‘RESUME’, or as ‘REPEAT’ if they have been mentioned immediately before. Each referent’s frequency of occurrence can be determined by numbering all participants in the text. This is where the automatic tagging reaches its limits and has to be done manually, since present anaphora resolution techniques are not sufficient to recognize and disambiguate all referents correctly.
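The suggestion rules for semantic and pragmatic roles described above can be sketched in the same way. Again this is only an illustration under stated assumptions, not the project's code: the `Referent` class, the function `suggest_roles` and the labels `AG`, `PAT`, `TOPIC`, `CONTRAST` and `FOCUS` are hypothetical, the syntactic roles are assumed to have been assigned already, and the passive marker is assumed to be read off the glossing. Where a pronoun directly precedes the verb, the sketch keeps the contrast tag, which is a choice the text does not settle.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Referent:
    form: str
    kind: str                   # "noun", "pronoun", "zero" or "verb"
    syn: str                    # syntactic role: "S", "DO", "ADV", "PRED"
    sem: Optional[str] = None   # suggested semantic role
    prag: Optional[str] = None  # suggested pragmatic role

def suggest_roles(clause: List[Referent], passive: bool) -> List[Referent]:
    """Fill in suggestions in place; an annotator reviews them afterwards."""
    for r in clause:
        # Semantic roles: S = agent and DO = patient; the passive turns S into patient
        # and suggests agent for an adverbial.
        if r.syn == "S":
            r.sem = "PAT" if passive else "AG"
        elif r.syn == "DO":
            r.sem = "PAT"
        elif r.syn == "ADV" and passive:
            r.sem = "AG"
        # Pragmatic roles follow the size of the referential device.
        if r.kind == "zero":
            r.prag = "TOPIC"
        elif r.kind == "pronoun":
            r.prag = "CONTRAST"
    verb_idx = next(i for i, r in enumerate(clause) if r.kind == "verb")
    overt = [r for r in clause if r.kind not in ("zero", "verb")]
    if not overt:
        # Predicate-only sentence: the predicate itself is new information.
        clause[verb_idx].prag = "FOCUS"
    elif verb_idx > 0 and clause[verb_idx - 1].prag is None:
        # The component directly preceding the verb is tagged as focus.
        clause[verb_idx - 1].prag = "FOCUS"
    return clause

# Example 6: oɒ̯mpnə purx itwəs 'A dog wanted to bite him.' (passive)
clause = [Referent("ø", "zero", "S"),
          Referent("oɒ̯mpnə", "noun", "ADV"),
          Referent("purx itwəs", "verb", "PRED")]
for r in suggest_roles(clause, passive=True):
    print(r.form, r.syn, r.sem, r.prag)
```

For Example 6, the sketch suggests patient and topic for the zero subject and agent and focus for the dative-lative adverbial, matching the S/PAT and ADV/AG labels in the glossing. Cases such as a passive combined with dative shift would still need manual correction.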
Referent tagging instead relies on suggestion lists consisting of the previously identified referents in the form of their first mention, the use of which significantly speeds up the annotation procedure. Since proofreading of the results is always needed, too, the term (semi-)automatic annotation is used.

5. Conclusions and further research

The aim of this paper was to examine the Ob-Ugric reference tracking mechanisms as a set of rules based on the principles of information structure. From the interaction between reference tracking strategies and mechanisms and our knowledge of the information structure system, we can detect certain correlations: those between syntactic and pragmatic role as well as those between referential device and pragmatic role. These principles were then used to set up a tool for (semi-)automatic annotation: in combination with syntactic rules (e.g. standard word order), it is possible to tag functional parameters. In the tagging process during the OUDB project (which was conducted in connection with syntactic parsing), these regularities were successfully applied to give suggestions for syntactic, semantic and pragmatic values (semi-automatic tagging).

The annotated data can be used to draw further conclusions on the complexity of Ob-Ugric sentence structure and also serve as a basis for further analyses of, e.g., frequency of reference in general, the introduction of reference, referential chains, disruption in reference and re-introduction of referents in particular, as well as the choice of the actual referential device for introduction, sustainability and re-introduction. We are just at the beginning of this kind of analysis; therefore, the insights are not limited to those examples mentioned.

A next task could be the inclusion of data from those (Mansi) dialects which do have direct object marking on the noun.
With Middle Lozva Mansi data, the task should be easily manageable: we already tag pronominals with the accusative marker as direct objects; this should be applicable to nouns as well. In Pelym Mansi, however, the dative-lative case is occasionally used to mark direct objects. The dative-lative gloss triggers tagging the noun phrase as adverbial, and thus more thorough research is required before we can achieve a correct annotation of the direct objects in question. Another valuable next step could be an analysis of further variants and dialects (e.g. Northern and Eastern Khanty, Eastern Mansi) and the comparison of the obtained data with other recent studies in this field of research (e.g. Virtanen 2015, Filchenko 2012).

References

Comrie, Bernard 1988: Coreference and conjunction reduction in grammar and discourse. – John Hawkins (ed.), Explaining language universals. Oxford: Blackwell. 186–208.
Comrie, Bernard 1989: Some general properties of reference-tracking systems. – Doug Arnold, Martin Atkinson, Jacques Durand, Claire Grover & Louisa Sadler (eds.), Essays on Grammatical Theory and Universal Grammar. Oxford: Clarendon Press. 37–51.
Filchenko, Andrey 2012: Continuity of information structuring strategies in Eastern Khanty. – Pirkko Suihkonen, Bernard Comrie & Valery Solovyev (eds.), Argument structure and grammatical relations: A crosslinguistic typology. Amsterdam: John Benjamins. 115–132.
Givón, Talmy 1983: Topic continuity in discourse: an introduction. – Talmy Givón (ed.), Topic continuity in discourse. A quantitative cross-language study. Typological Studies in Language 3. Amsterdam: John Benjamins. 1–41.
Givón, Talmy 1992: The grammar of referential coherence as mental processing instructions. – Linguistics 30: 5–55.
Kern, Beate 2010: Metonymie und Diskurskontinuität im Französischen. Linguistische Arbeiten 531. Berlin – New York: De Gruyter.
Kibrik, Andrej 2008: Reference maintenance in discourse. – Martin Haspelmath, Ekkehard König, Wulf Oesterreicher & Wolfgang Raible (eds.), Language typology and language universals 2. Handbücher zur Sprach- und Kommunikationswissenschaft (HSK) 20/2. 1123–1141.
Kovgan, Elena 2001: Reference-tracking in Khanty. – Tõnu Seilenthal (ed.), Congressus Nonus Internationalis Fenno-Ugristarum. 7.–13.8.2000. Tartu: Auctores. 145–152.
Kovgan, Elena 2005: The textual structure of Khanty: co-reference and anaphora. – M. M. Jocelyne Fernandez-Vest (ed.), The Uralic languages today. A linguistic and cognitive approach. Paris: Honoré Champion. 547–560.
Krifka, Manfred 2008: Basic notions of information structure. – Acta Linguistica Hungarica 55 (3–4): 243–276.
Kulonen, Ulla-Maija 1989: The passive in Ob-Ugrian. Mémoires de la Société Finno-Ougrienne 203. Helsinki: Société Finno-Ougrienne.
Loos, Eugene E. et al. (ed.) 2004: Topic. – Glossary of linguistic terms. <http://www.glossary.sil.org/term/topic> 26th July 2017.
Moseley, Christopher (ed.) 2010: Atlas of the world’s languages in danger. Paris: UNESCO Publishing. Online version: <http://www.unesco.org/culture/en/endangeredlanguages/atlas>.
Nagaya, Naonori 2006: Topicality and reference-tracking in Tagalog. Tokyo: University of Tokyo, Department of Linguistics. <http://www.ruf.rice.edu/~nn1/download/Nagaya2006Topicality_and_reference-tracking_in_Tagalog.pdf>.
Nikolaeva, Irina 2001: Syntaktische Analyse der objektiven Konjugation im Ob-Ugrischen. – Tõnu Seilenthal (ed.), Congressus Nonus Internationalis Fenno-Ugristarum. 7.–13.8.2000. Tartu: Auctores. 145–152.
Reinhart, Tanya 1982: Pragmatics and linguistics: An analysis of sentence topics. – Philosophica 27: 53–94.
Skribnik, Jelena 2001: Pragmatic Structuring in Northern Mansi. – Tõnu Seilenthal (ed.), Congressus Nonus Internationalis Fenno-Ugristarum. 7.–13.8.2000. Tartu: Auctores. 222–239.
Virtanen, Susanna 2015: Transitivity in Eastern Mansi: an information structural approach. Dissertation. University of Helsinki. <http://urn.fi/URN:ISBN:978-951-51-0548-6>.
Wisiorek, Axel & Zsófia Schön 2017: Ob-Ugric database: Corpus and lexicon databases of Khanty and Mansi dialects. – Acta Linguistica Academica 64 (3): 383–396.