Papers by M. Antonia Marti
Computational Linguistics
In the context of text representation, Compositional Distributional Semantics models aim to fuse the Distributional Hypothesis and the Principle of Compositionality. Text embedding is based on co-occurrence distributions and the representations are in turn combined by compositional functions taking into account the text structure. However, the theoretical basis of compositional functions is still an open issue. In this article we define and study the notion of Information Theory–based Compositional Distributional Semantics (ICDS): (i) We first establish formal properties for embedding, composition, and similarity functions based on Shannon’s Information Theory; (ii) we analyze the existing approaches under this prism, checking whether or not they comply with the established desirable properties; (iii) we propose two parameterizable composition and similarity functions that generalize traditional approaches while fulfilling the formal properties; and finally (iv) we perform an empiric...
Lecture Notes in Computer Science, 2004
In this paper we present two Spanish corpora, MiniCors and Cast3LB, semantically tagged according to different annotation criteria and objectives. In order to guarantee the quality of the results, we have established a methodology for the development of these corpora. The resulting resources consist of a semantically tagged corpus according to the lexical sample task, and a semantically tagged corpus
Proceedings of the Thirteenth Conference on Computational Natural Language Learning Shared Task - CoNLL '09, 2009
Lingua, 2011
Language, Cognition and Neuroscience, 2013
Behavior Research Methods, 2013
Proceedings of Corpus …, 2007
The concept of Named Entity (NE) has its origin in the Named Entity Recognition and Classification (NERC) tasks, an offspring of Information Retrieval systems, and became one of the main interest points in the Sixth and Seventh Message Understanding Conference (MUC-6, MUC-7) ...
Piek Vossen, University of Amsterdam; Salvador Climent, Maria Antonia Marti, Mariona Taule, Universitat de Barcelona; Julio Gonzalo, Irina Chugur, M. Felisa Verdejo, UNED; Gerard Escudero, German Rigau, Horacio Rodriguez, Universitat Politecnica de Catalunya; Antonietta Alonge, Francesca Bertagna, Rita Marinelli, Adriana Roventini, Luca Tarasi, Istituto di Linguistica del CNR, Pisa ... Deliverable D029, D030, WP3, WP4 EuroWordNet, LE2-4003 ... Title: Comparison of the Final Wordnets Dutch, Spanish and Italian ... Authors: Piek Vossen, Laura ...
The goal of the project is to analyze, experiment with, and develop intelligent, interactive and multilingual Text Mining technologies, as a key element of the next generation of search engines, systems with the capacity to find "the need behind the query". This new generation will provide specialized services and interfaces according to the search domain and type of information needed. Moreover, it will integrate textual search (websites) and multimedia search (images, audio, video), it will be able to find and organize information, rather than ...
Proceedings of Interdisciplinary Workshop on the Identification and Representation of Verb Features and Verb Classes, 2005
In this position paper we present the research on verb predicates that we have carried out so far for Catalan, Spanish, and Basque, and we outline the framework of our future research, which is based on the idea that it is necessary to include syntagmatic and statistical information in lexical resources, such as WordNet, in order to use them in tasks of information extraction from annotated corpora, and in automatic syntactic and semantic tagging of corpora.
Abstract: Historical and dialectal Catalan texts cannot be morphosyntactically annotated automatically, because there is no standard reference variety that would allow a homogeneous and systematic treatment. The aim of the HistoCat and DialCat projects has been to develop a semi-automatic annotation environment that takes advantage of existing tools for the morphosyntactic annotation of Catalan texts and keeps manual annotation to a minimum. Keywords: Corpus ...
Proceedings of Corpus Linguistics, 2003
Procesamiento del lenguaje natural, 1999
Abstract: This article presents a class of predicates, that of change, on the basis of the elements we have defined as basic for describing verbal behavior (meaning components, diathesis, and event structure). We start from the hypothesis that these three aspects interact with one another and that they are fundamental in accounting for the actual use of predicates. This information has been incorporated into the lexical entry of a lexical knowledge base, of which we present the ...
Language Design, 1999
In this article we present our conception of diathesis alternations and how they intervene in the definition of a model of lexical entries. We consider that diathesis alternations are the syntactic realizations of oppositions of a more general semantic nature. We will see how they interact with other components such as event structure and how different semantic classes of predicates arise from that interaction. Keywords: Theoretical Model of Lexical Entries, Computational Lexicography, Diathesis Alternations.
Proceedings from First International Conference on Language Resources & Evaluation, 1998
We evaluate two types of lexical resources with respect to their applicability to interlingual machine translation: (1) a EuroWordNet-based database of bilingual links between Spanish and English words; and (2) a repository of semantically classified verbs with their corresponding Lexical Conceptual Structure (LCS) representations. We examine the utility of these two resources for the task of lexical selection in machine translation. Our approach uses a coarse-grained graph-matching scheme that selects target-language words based ...
Proceedings of TMI ‘92, Montreal, Canada, 1992
We propose a strongly lexicalist treatment of translation equivalence where mismatches due to diverging lexicalization patterns are dealt with by means of translation links which capture crosslinguistic generalizations across sets of semantically related lexical items. We show how this treatment can be developed within a unification-based, multilingual lexical knowledge base which is integrated with facilities for semi-automatic development of bilingual lexicons, and describe an approach to machine translation where generation ...
We present an extension of the coreference annotation in the English NP4E and the Catalan AnCora-CA corpora with near-identity relations, which are borderline cases of coreference. The annotated subcorpora have 50K tokens each. Near-identity relations, as presented by Recasens et al. (2010; 2011), build upon the idea that identity is a continuum rather than an either/or relation, thus introducing a middle-ground category to explain currently problematic cases. The first annotation effort that we describe shows that it is not ...
Proceedings of 6th International Conference on Language Resources and Evaluation, May 1, 2008
In this paper we present two large-scale verbal lexicons, AnCora-Verb-Ca for Catalan and AnCora-Verb-Es for Spanish, which are the basis for the semantic annotation with arguments and thematic roles of the AnCora corpora. In the AnCora-Verb lexicons, the mapping between syntactic functions, arguments and thematic roles of each verbal predicate is established taking into account the verbal semantic class and the diathesis alternations in which the predicate can participate. Each verbal predicate is related to one ...