Abstract
We present HamleDT—a HArmonized Multi-LanguagE Dependency Treebank. HamleDT is a compilation of existing dependency treebanks (or dependency conversions of other treebanks), transformed so that they all conform to the same annotation style. In the present article, we provide a thorough investigation and discussion of a number of phenomena that are comparable across languages, though their annotation in treebanks often differs. We claim that transformation procedures can be designed to automatically identify most such phenomena and convert them to a unified annotation style. This unification is beneficial both to comparative corpus linguistics and to machine learning of syntactic parsing.
Notes
The initial version has been described in Zeman et al. (2012).
HamleDT v1.5 does not include the harmonization of verbal groups (see Sect. 5.4).
The transformations are not robust to coordination styles.
So far, there are only two differences between the PDT style (used in [cs]) and the HamleDT v1.5 style: handling of appositions (see Table 3) and marking of conjuncts (in HamleDT, the root of a conjunct subtree is marked as conjunct even if it is a preposition or subordinating conjunction; in PDT, only content words are marked as conjuncts). By conjunct, we mean a member of a coordination (unlike Quirk et al. 1985). By content word, we mean an autosemantic word, i.e. a word with full lexical meaning, as contrasted with an auxiliary word. Note that PDT also has a more abstract layer of annotation (called tectogrammatical), but in this work we use only the shallow dependencies (called the analytical layer in PDT).
Unless we explicitly say otherwise, by “original” we mean the data source indicated in Table 1. It may differ from the truly original treebank. For instance, some of the CoNLL data were converted to the CoNLL format from other formats, and some information may have been lost in the process.
In the Pāṇinian tradition, karta is the agent, doer of the action, and karma is the “deed” or patient. See Bharati et al. (1994).
Ideally we would also want to distinguish objects (Obj) from adverbials. Unfortunately, this particular source annotation does not provide enough information to make such a distinction.
In Chomskian (constituency-based) approaches, it is the standard analysis that determiners function as the head of a noun phrase.
Note however that numerals governing nouns are not restricted to [da]. Czech has a complex set of rules for numerals (motivated by the morphological agreement), which may result under some circumstances in the numeral serving as the head.
In [ja], the previous token essentially means the main predicate, but if it is followed by a question particle then the punctuation node is attached to the particle.
http://ufal.mff.cuni.cz/tred/ with EasyTreex extension.
We do not attempt reversibility when unifying dependency relations.
References
Aduriz, I., Aranzabe, M. J., Arriola, J. M., Atutxa, A., Díaz de Ilarraza, A., Garmendia, A., & Oronoz, M. (2003). Construction of a Basque dependency treebank. In Proceedings of the 2nd workshop on treebanks and linguistic theories.
Afonso, S., Bick, E., Haber, R., & Santos, D. (2002). “Floresta sintá(c)tica”: A treebank for Portuguese. In Proceedings of the 3rd international conference on language resources and evaluation (LREC) (pp. 1698–1703).
Atalay, N. B., Oflazer, K., Say, B., & Inst, I. (2003). The annotation process in the Turkish treebank. In Proceedings of the 4th international workshop on linguistically interpreted corpora (LINC).
Bamman, D., & Crane, G. (2011). The ancient Greek and Latin dependency treebanks. In C. Sporleder, A. Bosch, & K. Zervanou (Eds.), Language technology for cultural heritage, theory and applications of natural language processing (pp. 79–98). Berlin, Heidelberg: Springer.
Bengoetxea, K., & Gojenola, K. (2009). Exploring treebank transformations in dependency parsing. In Proceedings of the international conference RANLP-2009. Borovets, Bulgaria (pp. 33–38). Association for Computational Linguistics.
Bharati, A., Chaitanya, V., & Sangal, R. (1994). Natural language processing: A Paninian perspective. New Delhi: Prentice-Hall of India.
Bick, E., Uibo, H., & Müürisep, K. (2004). Arborest—A VISL-style treebank derived from an Estonian constraint grammar corpus. In Proceedings of treebanks and linguistic theories.
Boguslavsky, I., Grigorieva, S., Grigoriev, N., Kreidlin, L., & Frid, N. (2000). Dependency treebank for Russian: Concept, tools, types of information. In Proceedings of the 18th conference on computational linguistics (Vol. 2, pp. 987–991).
Bosco, C., Montemagni, S., Mazzei, A., Lombardo, V., Lenci, A., Lesmo, L., Attardi, G., Simi, M., Lavelli, A., Hall, J., Nilsson, J., & Nivre, J. (2010). Comparing the influence of different treebank annotations on dependency parsing.
Brants, S., Dipper, S., Eisenberg, P., Hansen, S., König, E., Lezius, W., et al. (2004). TIGER: Linguistic interpretation of a German corpus. Journal of Language and Computation, 2(4), 597–620. Special Issue.
Buchholz, S., & Marsi, E. (2006). CoNLL-X shared task on multilingual dependency parsing. In Proceedings of CoNLL (pp. 149–164).
Călăcean, M. (2008). Data-driven dependency parsing for Romanian. Master’s thesis, Uppsala University.
Civit, M., Martí, M. A., & Bufí, N. (2006). Cat3LB and Cast3LB: From constituents to dependencies. In T. Salakoski, F. Ginter, S. Pyysalo, & T. Pahikkala (Eds.), FinTAL, Vol. 4139 of Lecture notes in computer science (pp. 141–152). Berlin: Springer.
Csendes, D., Csirik, J., Gyimóthy, T., & Kocsor, A. (2005). The Szeged treebank. In V. Matoušek, P. Mautner, & T. Pavelka (Eds.), TSD, Vol. 3658 of Lecture notes in computer science (pp. 123–131). Berlin: Springer.
de Marneffe, M.-C., & Manning, C. D. (2008). Stanford typed dependencies manual.
Džeroski, S., Erjavec, T., Ledinek, N., Pajas, P., Žabokrtský, Z., & Žele, A. (2006). Towards a Slovene dependency treebank. In Proceedings of the fifth international language resources and evaluation conference, LREC 2006. Genova, Italy (pp. 1388–1391). European Language Resources Association (ELRA).
Hajič, J., Ciaramita, M., Johansson, R., Kawahara, D., Martí, M. A., Màrquez, L., Meyers, A., Nivre, J., Padó, S., Štěpánek, J., Straňák, P., Surdeanu, M., Xue, N., & Zhang, Y. (2009). The CoNLL-2009 shared task: Syntactic and semantic dependencies in multiple languages. In Proceedings of the 13th conference on computational natural language learning (CoNLL-2009), June 4–5. Boulder, Colorado, USA.
Hajič, J., Panevová, J., Hajičová, E., Sgall, P., Pajas, P., Štěpánek, J., Havelka, J., Mikulová, M., Žabokrtský, Z., & Ševčíková-Razímová, M. (2006). Prague dependency treebank 2.0. CD-ROM, Linguistic Data Consortium, LDC Catalog No.: LDC2006T01, Philadelphia.
Haverinen, K., Viljanen, T., Laippala, V., Kohonen, S., Ginter, F., & Salakoski, T. (2010). Treebanking Finnish. In M. Dickinson, K. Müürisep, & M. Passarotti (Eds.), Proceedings of the ninth international workshop on treebanks and linguistic theories (TLT9) (pp. 79–90).
Hudson, R. (2004). Are determiners heads? Functions of Language, 11(1).
Hudson, R. (2010). An encyclopedia of word grammar and English grammar. London, UK: University College London. http://tinyurl.com/wg-encyc.
Husain, S., Mannem, P., Ambati, B., & Gadde, P. (2010). The ICON-2010 tools contest on Indian language dependency parsing. In Proceedings of ICON-2010 tools contest on Indian language dependency parsing. Kharagpur, India.
Hwa, R., Resnik, P., Weinberg, A., Cabezas, C. I., & Kolak, O. (2005). Bootstrapping parsers via syntactic projection across parallel texts. Natural Language Engineering, 11(3), 311–325.
Kawata, Y., & Bartels, J. (2000). Stylebook for the Japanese treebank in Verbmobil. In Report 240. Tübingen, Germany.
Kromann, M. T., Mikkelsen, L., & Lynge, S. K. (2004). Danish dependency treebank.
Marcus, M. P., Santorini, B., & Marcinkiewicz, M. A. (1993). Building a large annotated corpus of English: The Penn treebank. Computational Linguistics, 19(2), 313–330.
Mareček, D., & Žabokrtský, Z. (2012). Exploiting reducibility in unsupervised dependency parsing. In Proceedings of EMNLP-CoNLL’12 (pp. 297–307).
McDonald, R., Nivre, J., Quirmbach-Brundage, Y., Goldberg, Y., Das, D., Ganchev, K., Hall, K., Petrov, S., Zhang, H., Täckström, O., Bedini, C., Castelló, N. B., & Lee, J. (2013). Universal dependency annotation for multilingual parsing. In Proceedings of the ACL 2013. Association for Computational Linguistics.
McDonald, R., Petrov, S., & Hall, K. (2011a). Multi-source transfer of delexicalized dependency parsers. In Proceedings of the conference on empirical methods in natural language processing (pp. 62–72). Stroudsburg, PA, USA. Association for Computational Linguistics.
McDonald, R., Petrov, S., & Hall, K. (2011b). Multi-source transfer of delexicalized dependency parsers. In Proceedings of the 2011 conference on empirical methods in natural language processing (pp. 62–72). Edinburgh, Scotland, UK. Association for Computational Linguistics.
Mel’čuk, I. A. (1988). Dependency syntax: Theory and practice. New York: State University of New York Press.
Montemagni, S., Barsotti, F., Battista, M., Calzolari, N., Corazzari, O., Lenci, A., et al. (2003). Building the Italian syntactic-semantic treebank. In A. Abeillé (Ed.), Building and using parsed corpora (pp. 189–210). Dordrecht: Kluwer.
Nilsson, J., Hall, J., & Nivre, J. (2005). MAMBA Meets TIGER: Reconstructing a Swedish treebank from antiquity. In Proceedings of the NODALIDA special session on treebanks.
Nilsson, J., Nivre, J., & Hall, J. (2006). Graph transformations in data-driven dependency parsing. In Proceedings of the 21st international conference on computational linguistics and the 44th annual meeting of the association for computational linguistics (pp. 257–264).
Nivre, J., Hall, J., Kübler, S., McDonald, R., Nilsson, J., Riedel, S., & Yuret, D. (2007). The CoNLL 2007 shared task on dependency parsing. In Proceedings of the CoNLL 2007 shared task. Joint conference on empirical methods in natural language processing and computational natural language learning (EMNLP-CoNLL).
Popel, M., & Žabokrtský, Z. (2010). TectoMT: Modular NLP framework. In Advances in natural language processing (pp. 293–304).
Popel, M., Mareček, D., Štěpánek, J., Zeman, D., & Žabokrtský, Z. (2013). Coordination structures in dependency treebanks. In Proceedings of the 51st annual meeting of the association for computational linguistics (pp. 517–527). Sofia, Bulgaria. Association for Computational Linguistics.
Prokopidis, P., Desipri, E., Koutsombogera, M., Papageorgiou, H., & Piperidis, S. (2005). Theoretical and practical issues in the construction of a Greek dependency treebank. In Proceedings of the 4th workshop on treebanks and linguistic theories (TLT) (pp. 149–160).
Quirk, R., Greenbaum, S., Leech, G., & Svartvik, J. (1985). A comprehensive grammar of the English language. London: Longman.
Ramasamy, L., & Žabokrtský, Z. (2012). Prague dependency style treebank for Tamil. In Proceedings of LREC 2012. İstanbul, Turkey.
Rasooli, M. S., Moloodi, A., Kouhestani, M., & Minaei-Bidgoli, B. (2011). A syntactic valency lexicon for Persian verbs: The first steps towards Persian dependency treebank. In 5th language and technology conference (LTC): Human language technologies as a challenge for computer science and linguistics (pp. 227–231). Poznań, Poland.
Schwartz, R., Abend, O., & Rappoport, A. (2012). Learnability-based syntactic annotation design. In Proceedings of COLING 2012: Technical papers (pp. 2405–2422). Mumbai, India.
Seginer, Y. (2007). Learning syntactic structure. Ph.D. thesis, University of Amsterdam.
Simov, K., & Osenova, P. (2005). Extending the annotation of BulTreeBank: Phase 2. In The fourth workshop on treebanks and linguistic theories (TLT 2005), Barcelona (pp. 173–184).
Smrž, O., Bielický, V., Kouřilová, I., Kráčmar, J., Hajič, J., & Zemánek, P. (2008). Prague Arabic dependency treebank: A word on the million words. In Proceedings of the workshop on Arabic and local languages (LREC 2008) (pp. 16–23). Marrakech, Morocco. European Language Resources Association.
Surdeanu, M., Johansson, R., Meyers, A., Màrquez, L., & Nivre, J. (2008). The CoNLL-2008 shared task on joint parsing of syntactic and semantic dependencies. In Proceedings of CoNLL.
Taulé, M., Martí, M.A., & Recasens, M. (2008). AnCora: Multilevel annotated corpora for Catalan and Spanish. In LREC. European Language Resources Association.
Tesnière, L. (1959). Éléments de syntaxe structurale. Paris: Klincksieck.
Tsarfaty, R., Nivre, J., & Andersson, E. (2011). Evaluating dependency parsing: Robust and heuristics-free cross-annotation evaluation. In Proceedings of the 2011 conference on empirical methods in natural language processing (pp. 385–396). Edinburgh, Scotland, UK. Association for Computational Linguistics.
van der Beek, L., Bouma, G., Daciuk, J., Gaustad, T., Malouf, R., van Noord, G., Prins, R., & Villada, B. (2002). Chapter 5. The Alpino dependency treebank. In Algorithms for linguistic processing NWO PIONIER progress report. Groningen, The Netherlands.
Zeman, D. (2008). Reusable tagset conversion using tagset drivers. In N. Calzolari, K. Choukri, B. Maegaard, J. Mariani, J. Odijk, S. Piperidis, & D. Tapias (Eds.), Proceedings of the sixth international language resources and evaluation conference, LREC 2008 (pp. 28–30). Marrakech, Morocco. European Language Resources Association (ELRA).
Zeman, D., Mareček, D., Popel, M., Ramasamy, L., Štěpánek, J., Žabokrtský, Z., & Hajič, J. (2012). HamleDT: To parse or not to parse? In N. Calzolari, K. Choukri, T. Declerck, M. U. Doğan, B. Maegaard, J. Mariani, J. Odijk, & S. Piperidis (Eds.), Proceedings of the eighth international conference on language resources and evaluation (LREC’12). İstanbul, Turkey. European Language Resources Association (ELRA).
Acknowledgments
The authors wish to express their gratitude to all the creators and providers of the respective corpora. The work on this project was supported by the Czech Science Foundation Grant Nos. P406/11/1499 and P406/14/06548P, by the European Union Seventh Framework Programme under Grant Agreement FP7-ICT-2013-10-610516 (QTLeap), and by research resources of the Charles University in Prague (PRVOUK). This work has been using language resources developed and/or stored and/or distributed by the LINDAT/CLARIN project of the Ministry of Education of the Czech Republic (Project LM2010013). Finally, we are very grateful for the numerous valuable comments provided by the anonymous reviewers.
Appendices
Appendix 1: List of included languages and treebanks
- Arabic [ar]: Prague Arabic Dependency Treebank 1.0/CoNLL 2007 (Smrž et al. 2008), http://padt-online.blogspot.com/2007/01/conll-shared-task-2007.html
- Basque [eu]: Basque Dependency Treebank, a larger version than the one included in CoNLL 2007, generously provided by IXA Group (Aduriz et al. 2003)
- Bengali [bn], Hindi [hi] and Telugu [te]: Hyderabad Dependency Treebank/ICON 2010 (Husain et al. 2010)
- Bulgarian [bg]: BulTreeBank (Simov and Osenova 2005)
- Catalan [ca] and Spanish [es]: AnCora (Taulé et al. 2008)
- Czech [cs]: Prague Dependency Treebank 2.0/CoNLL 2009 (Hajič et al. 2006)
- Danish [da]: Danish Dependency Treebank/CoNLL 2006 (Kromann et al. 2004), now part of the Copenhagen Dependency Treebank
- Dutch [nl]: Alpino Treebank/CoNLL 2006 (van der Beek et al. 2002)
- English [en]: Penn TreeBank 3/CoNLL 2007 (Marcus et al. 1993)
- Estonian [et]: Eesti keele puudepank/Arborest (Bick et al. 2004)
- Finnish [fi]: Turku Dependency Treebank (Haverinen et al. 2010)
- German [de]: Tiger Treebank/CoNLL 2009 (Brants et al. 2004), http://www.ims.uni-stuttgart.de/forschung/ressourcen/korpora/tiger.html
- Greek (modern) [el]: Greek Dependency Treebank (Prokopidis et al. 2005)
- Greek (ancient) [grc] and Latin [la]: Ancient Greek and Latin Dependency Treebanks (Bamman and Crane 2011)
- Hindi [hi]: see Bengali
- Hungarian [hu]: Szeged Treebank (Csendes et al. 2005)
- Italian [it]: Italian Syntactic-Semantic Treebank/CoNLL 2007 (Montemagni et al. 2003)
- Japanese [ja]: Verbmobil (Kawata and Bartels 2000)
- Latin [la]: see Greek (ancient)
- Persian [fa]: Persian Dependency Treebank (Rasooli et al. 2011)
- Portuguese [pt]: Floresta sintá(c)tica (Afonso et al. 2002), http://www.linguateca.pt/floresta/info_floresta_English.html
- Romanian [ro]: Romanian Dependency Treebank (Călăcean 2008)
- Russian [ru]: Syntagrus (Boguslavsky et al. 2000)
- Slovene [sl]: Slovene Dependency Treebank/CoNLL 2006 (Džeroski et al. 2006)
- Spanish [es]: see Catalan
- Swedish [sv]: Talbanken05 (Nilsson et al. 2005)
- Tamil [ta]: TamilTB (Ramasamy and Žabokrtský 2012)
- Telugu [te]: see Bengali
- Turkish [tr]: METU-Sabanci Turkish Treebank (Atalay et al. 2003)
Appendix 2: Examples of harmonization of dependency relations
Appendix 3: List of dependency relation labels in figures
Language | Label | Description | Example |
---|---|---|---|
| | X | Our meta-label that represents the unknown relation of the depicted subtree to its unshown parent | |
bg | comp | Complement, i.e. argument of non-verbal head, non-finite verbal head, copula | Figure 18 |
bg | indobj | Child is indirect object of parent | Figure 18 |
bg | mod | Child is modifier, e.g. of a noun phrase, or a negative particle modifying a verb etc. | Figure 18 |
bg | prepcomp | Child is noun phrase, parent is preposition | Figure 18 |
bg | subj | Child is subject of parent | Figure 18 |
bg | xcomp | Child is clausal complement; this includes complements of modal verbs | Figure 18 |
ca | CO | Child is coordinating conjunction, parent is the first conjunct | Figure 4 |
ca | CONJUNCT | Parent is the first conjunct, child is one of the other conjuncts | Figure 4 |
ca | PUNC | Child is punctuation symbol | Figure 4 |
cs, sl, la, ta | Adv | Child is adverbial modifier of parent | Figure 2 |
cs, sl, la, ta | Atr | Parent is noun, child is its attribute | Figure 9 |
cs, sl, la, ta | AuxC | Child is subordinating conjunction, parent is governing predicate. The relation of the subordinate clause to the parent is labeled at the grandchild | Figure 19 |
cs, sl, la, ta | AuxP | Child is preposition. The relation of the prepositional phrase to the parent is labeled at the grandchild | Figure 2 |
cs, sl, la, ta | AuxV | Child is auxiliary verb or negative particle, parent is content verb | Figure 19 |
cs, sl, la, ta | AuxX | Child is comma and does not serve as coordination root | Figure 2 |
cs, sl, la, ta | AuxZ | Emphasizing word | Figure 8 |
cs, sl, la, ta | Coord | Child serves as root of a coordinate structure | Figure 1 |
cs, sl, la, ta | Obj | Child is object of parent | Figure 2 |
cs, sl, la, ta | Pred | Child is predicate of a main clause | Figure 2 |
cs, sl, la, ta | Sb | Child is subject of parent | Figure 19 |
cs, ta | _M | Suffix to a label, saying that the child is a conjunct. The main label tags its relation to the parent of the coordinate structure | Figure 1 |
da | appr | Restrictive apposition (no comma) | Figure 28 |
da | conj | Child is conjunct, parent is first conjunct or coordinating conjunction | Figure 6 |
da | coord | Parent is conjunct, child is coordinating conjunction | Figure 6 |
da | dobj | Child is direct object of parent | Figure 28 |
da | expl | Child is expletive subject of parent | Figure 28 |
da | mod | Modifier, e.g. attribute of noun, adverbial modifier of verb, adjective attached to determiner etc. | Figure 28 |
da | nobj | Child is noun phrase or infinitive, parent is e.g. determiner, numeral, preposition etc. | Figure 28 |
da | pnct | Child is punctuation symbol | Figure 6 |
da | possd | Child is argument of possessive parent, i.e. child is the thing possessed | Figure 28 |
de | CD | Child is coordinating conjunction, parent is one conjunct and right sibling is the other conjunct | Figure 3 |
de | CJ | Parent and child are conjuncts | Figure 3 |
de | MO | Modifier. In NPs only focus particles are annotated as modifiers | Figure 23 |
de | NG | Child is negative particle, parent is negated verb | Figure 23 |
de | NK | Noun Kernel. Child attached within a noun phrase or a prepositional phrase | Figure 10 |
de | OA | Child is accusative object of parent | Figure 23 |
de | OC | Clausal object. Also verb tokens building a complex verbal form and modal constructions | Figure 23 |
de | PUNC | Child is punctuation symbol | Figure 3 |
de | SB | Child is subject of parent | Figure 23 |
es | atr | Attribute. E.g. child is adverbial/prepositional phrase, parent is verb | Figure 12 |
es | cd | Child is direct object of parent | Figure 12 |
es | conj | Child is subordinating conjunction | Figure 12 |
es | s.a | Child is adjectival phrase, parent is not verb | Figure 12 |
es | sn | Child is noun phrase. Parent may be e.g. preposition | Figure 12 |
es | spec | Specifier. E.g. child is determiner and parent is noun | Figure 12 |
es | suj | Child is subject of parent | Figure 12 |
fa | NPREMOD | Child is premodifier of parent noun | Figure 26 |
fa | NVE | Child is non-verbal element of compound verb. Parent is verbal element | Figure 26 |
fa | SBJ | Child is subject of parent | Figure 26 |
hi | lwg_cont | Child is additional node of a complex expression; child and parent together perform certain function | Figure 27 |
hi | lwg_psp | Child is postposition and modifies a noun | Figure 11 |
hi | lwg_vaux | Child is auxiliary verb, parent is content verb | Figure 27 |
hi | pof | Part of relation, e.g. part of conjunct verb | Figure 27 |
hi | pof_cn | Part of relation | Figure 27 |
hi, bn, te | adv | Child is adverbial modifier (only adverbs of manner) of parent | Figure 29 |
hi, bn, te | ccof | Child is conjunct, parent is coordinating conjunction or comma | Figure 29 |
hi, bn, te | k1 | Child is karta (doer/agent/subject) of parent predicate | Figure 27 |
hi, bn, te | k2 | Child is karma (patient/object) of parent predicate | Figure 27 |
hi, bn, te | k7p | Child is deshadhikarana (location in space) of the parent predicate | Figure 30 |
hi, bn, te | k7t | Child is kaalaadhikarana (location in time) of the parent predicate | Figure 31 |
hi, bn, te | nmod | Parent is noun, child is its attribute | Figure 29 |
hi, bn, te | nmod_adj | Child is adjective and modifies a noun | Figure 11 |
hi, bn, te | r6 | Shashthi (possessive). Child is possessor in genitive, parent is the possessed noun | Figure 30 |
hu | ATT | Attribute | Figure 15 |
hu | CONJ | Child is conjunction (coordinating or subordinating) | Figure 5 |
hu | DET | Child is determiner, parent is noun | Figure 15 |
hu | ILL | Child is verbal argument in illative case | Figure 15 |
hu | OBJ | Child is object of parent | Figure 15 |
hu | PUNCT | Child is punctuation symbol | Figure 5 |
hu | SUBJ | Child is subject of parent | Figure 15 |
it | cong_sub | Parent is subordinating conjunction | Figure 13 |
it | det | Child is determiner, parent is noun | Figure 13 |
it | modal | Child is modal (dovere, volere, potere) or aspectual (andare, venire, stare) verb, parent is content verb | Figure 13 |
it | pred | Parent is verb (often it is copula), child is predicative complement (nominal predicate) | Figure 13 |
it | sogg | Child is subject of parent | Figure 13 |
ja | ADJ | Child is adjunct of parent | Figure 25 |
ja | COMP | Complement, e.g. verb attached to another verb form, noun attached to postposition etc. | Figure 25 |
ja | SBJ | Child is subject of parent | Figure 25 |
nl | det | Child is determiner, parent is noun | Figure 21 |
nl | mod | Child is adverbial modifier (bijwoordelijke bepaling) of parent | Figure 21 |
nl | obj1 | Child is direct object; this includes nouns attached to prepositions! | Figure 21 |
nl | predm | Child determines state (adverbial modifier), parent is predicate | Figure 22 |
nl | su | Child is subject of parent | Figure 21 |
nl | vc | Verbal complement. Example: parent is modal, child is infinitive | Figure 21 |
pt | >N | Child is left dependent of nominal core | Figure 24 |
pt | ADVL | Child is adverbial adjunct (adjunto adverbial) of parent | Figure 24 |
pt | MV | Child is main verb, parent may be e.g. modal verb | Figure 24 |
pt | N< | Child is right dependent of nominal core | Figure 24 |
pt | P< | Child is right dependent of preposition | Figure 24 |
pt | PRT-AUX< | Child is verbal particle (partícula de ligação verbal), e.g. between modal and content verb, parent would be modal | Figure 24 |
pt | PUNC | Child is punctuation symbol | Figure 24 |
pt | SC | Child is nominal predicate (predicativo do sujeito), parent is copula | Figure 24 |
pt | SUBJ | Child is subject of parent | Figure 24 |
ro | rel.conj. | Parent is coordinating conjunction, child is conjunct | Figure 7 |
ru | | Child is argument other than subject. Also: genitive noun modifier of another noun | Figure 17 |
ru | | Child is agent-object of passive parent | Figure 17 |
ru | | Parent is noun, child is its attribute | Figure 17 |
ru | | Child is passive participle, parent is finite auxiliary verb | Figure 17 |
ru | | Parent is predicate, child is subject | Figure 17 |
ta | AComp | Child is (obligatory) adverbial complement of parent | Figure 8 |
tr | OBJECT | Child is object of parent | Figure 16 |
tr | QUESTION.PARTICLE | Child is question particle, parent is verb | Figure 16 |
tr | SUBJECT | Child is subject of parent | Figure 16 |
tr | VOCATIVE | Child is vocative noun phrase serving as doer (actor) of parent verb | Figure 16 |
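
The table above can be read as a lookup from treebank-specific labels to shared syntactic functions. The following minimal sketch illustrates that reading for a single function, the subject. It is not HamleDT's actual implementation: the names `SUBJECT_LABELS`, `split_conjunct_suffix` and `unify_subject` are hypothetical, and the choice of the PDT-style label `Sb` as the unified target is an assumption made for illustration. Only rows whose description is exactly "Child is subject of parent" are included.

```python
# Labels copied from rows of the table above whose description is exactly
# "Child is subject of parent". Using the PDT-style "Sb" (cs, sl, la, ta rows)
# as the unified target label is an assumption made for this sketch.
SUBJECT_LABELS = {
    "bg": "subj", "cs": "Sb", "sl": "Sb", "la": "Sb", "ta": "Sb",
    "de": "SB", "es": "suj", "fa": "SBJ", "hu": "SUBJ", "it": "sogg",
    "ja": "SBJ", "nl": "su", "pt": "SUBJ", "tr": "SUBJECT",
}

# Languages for which the table lists the "_M" conjunct suffix.
CONJUNCT_SUFFIX_LANGS = {"cs", "ta"}


def split_conjunct_suffix(lang, label):
    """Return (main_label, is_conjunct), stripping "_M" where the table defines it."""
    if lang in CONJUNCT_SUFFIX_LANGS and label.endswith("_M"):
        return label[:-2], True
    return label, False


def unify_subject(lang, label):
    """Map a treebank-specific subject label to "Sb"; leave other labels unchanged."""
    base, is_conjunct = split_conjunct_suffix(lang, label)
    if SUBJECT_LABELS.get(lang) == base:
        base = "Sb"
    return base, is_conjunct
```

Under these assumptions, `unify_subject("de", "SB")` and `unify_subject("it", "sogg")` both yield `("Sb", False)`, while `unify_subject("cs", "Sb_M")` yields `("Sb", True)`, keeping conjunct membership separate from the relation itself.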
Zeman, D., Dušek, O., Mareček, D. et al. HamleDT: Harmonized multi-language dependency treebank. Lang Resources & Evaluation 48, 601–637 (2014). https://doi.org/10.1007/s10579-014-9275-2