Netizens, Michael and Ronda Hauben's foundational treatise on Usenet and the Internet, was first published in print 25 years ago. In this piece, we trace the history and impact of the book and of Usenet itself, contextualising them within the contemporary and modern-day scholarship on virtual communities, online culture, and Internet history. We discuss the Net as a tool of empowerment, and touch on the social, technical, and economic issues related to the maintenance of shared network infrastructures and to the preservation and commodification of Usenet archives. Our interview with Ronda Hauben offers a retrospective look at the development of online communities, their impact, and how they are studied. She recounts her own introduction to the online world, as well as the impetus and writing process for Netizens. She presents Michael Hauben's conception of “netizens” as contributory citizens of the Net (rather than mere users of it) and the “electronic commons” they built up, and argues that this collaborative and collectivist model has been overwhelmed and endangered by the privatisation and commercialisation of the Internet and its communities.
In 1979 and 1980, Word Ways: The Journal of Recreational Linguistics printed a series of articles on the early history, religious symbolism, and cultural significance of the rotas square, an ancient Latin-language palindromic word square.  The articles were attributed to Dmitri A. Borgmann, the noted American writer on wordplay and former editor of Word Ways.  While they attracted little attention at the time, some 35 years after their publication (and 29 years after Borgmann's death), questions began to be raised about their authorship.  There is much internal and external evidence that, taken together, compellingly supports the notion that Borgmann did not write the articles himself.  This paper surveys this evidence and solicits help in identifying the articles' original source.
In computer science, a preprocessor (or macro processor) is a tool that programmatically alters its input, typically on the basis of inline annotations, to produce data that serves as input for another program. Preprocessors are used in software development and document processing workflows to translate or extend programming or markup languages, as well as for conditional or pattern-based generation of source code and text. Early preprocessors were relatively simple string replacement tools that were tied to specific programming languages and application domains, and while these have since given rise to more powerful, general-purpose tools, the latter often require the user to learn and use complex macro languages with their own syntactic conventions. In this paper, we present GPP, an extensible, general-purpose preprocessor whose principal advantage is that its syntax and behaviour can be customized to suit any given preprocessing task. This makes GPP of particular benefit to research applications, where it can be easily adapted for use with novel markup, programming, and control languages.
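To make the notion of annotation-driven text rewriting concrete, here is a minimal, self-contained sketch of a toy macro preprocessor in Python; it is not GPP itself (whose syntax is configurable), and the #define convention and function names below are illustrative assumptions only.

    # Toy illustration of preprocessor-style macro expansion (not GPP itself):
    # '#define NAME VALUE' lines declare macros, and later occurrences of NAME
    # are replaced in the output. All names here are hypothetical.
    import re

    def preprocess(text):
        macros = {}
        output = []
        for line in text.splitlines():
            m = re.match(r'\s*#define\s+(\w+)\s+(.*)', line)
            if m:
                macros[m.group(1)] = m.group(2)
                continue
            for name, value in macros.items():
                line = re.sub(r'\b%s\b' % re.escape(name), value, line)
            output.append(line)
        return '\n'.join(output)

    print(preprocess("#define GREETING Hello, world\nGREETING from the preprocessor"))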
Most humour processing systems to date make at best discrete, coarse-grained distinctions between the comical and the conventional, yet such notions are better conceptualized as a broad spectrum. In this paper, we present a probabilistic approach, a variant of Gaussian process preference learning (GPPL), that learns to rank and rate the humorousness of short texts by exploiting human preference judgments and automatically sourced linguistic annotations. We apply our system, which is similar to one that had previously shown good performance on English-language one-liners annotated with pairwise humorousness annotations, to the Spanish-language data set of the HAHA@IberLEF2019 evaluation campaign. We report system performance for the campaign's two subtasks, humour detection and funniness score prediction, and discuss some issues arising from the conversion between the numeric scores used in the HAHA@IberLEF2019 data and the pairwise judgment annotations required for our method.
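As a concrete illustration of the score-to-preference conversion issue mentioned above, the following Python sketch derives pairwise preference judgments from numeric funniness scores; the function name and the tie-handling threshold are illustrative choices, not the system's actual pipeline.

    # Sketch (an assumption about the general idea, not the exact pipeline):
    # derive pairwise preference judgments from numeric funniness scores, as
    # required by preference-learning methods such as GPPL.
    from itertools import combinations

    def scores_to_pairs(items, min_gap=0.0):
        """items: list of (text_id, score); returns (preferred, other) pairs."""
        pairs = []
        for (id_a, s_a), (id_b, s_b) in combinations(items, 2):
            if abs(s_a - s_b) <= min_gap:
                continue  # treat near-equal scores as ties and skip them
            pairs.append((id_a, id_b) if s_a > s_b else (id_b, id_a))
        return pairs

    print(scores_to_pairs([("t1", 2.5), ("t2", 1.0), ("t3", 1.0)]))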
Lexical polysemy, a fundamental characteristic of all human languages, has long been regarded as a major challenge to machine translation, human–computer interaction, and other applications of computational natural language processing (NLP).  Traditional approaches to automatic word sense disambiguation (WSD) rest on the assumption that there exists a single, unambiguous communicative intention underlying every word in a document.  However, writers sometimes intend for a word to be interpreted as simultaneously carrying multiple distinct meanings.  This deliberate use of lexical ambiguity—i.e., punning—is a particularly common source of humour, and therefore has important implications for how NLP systems process documents and interact with users.  In this paper we make a case for research into computational methods for the detection of puns in running text and for the isolation of the intended meanings.  We discuss the challenges involved in adapting principles and techniques from WSD to humorously ambiguous text, and outline our plans for evaluating WSD-inspired systems in a dedicated pun identification task.  We describe the compilation of a large manually annotated corpus of puns and present an analysis of its properties.  While our work is principally concerned with simple puns which are monolexemic and homographic (i.e., exploiting single words which have different meanings but are spelled identically), we touch on the challenges involved in processing other types.
Latent semantic analysis (LSA) is an automated, statistical technique for comparing the semantic similarity of words or documents. In this paper, I examine the application of LSA to automated essay scoring. I compare LSA methods to earlier statistical methods for assessing essay quality, and critically review contemporary essay-scoring systems built on LSA, including the Intelligent Essay Assessor, Summary Street, State the Essence, Apex, and Select-a-Kibitzer. Finally, I discuss current avenues of research, including LSA's application to computer-measured readability assessment and to automatic summarization of student essays.
For many years, the non-monotonic reasoning community has focussed on highly expressive logics. Such logics have turned out to be computationally expensive, and have given little support to the practical use of non-monotonic reasoning. In this work we discuss defeasible logic, a less-expressive but more efficient non-monotonic logic. We report on two new implemented systems for defeasible logic: a query answering system employing a backward-chaining approach, and a forward-chaining implementation that computes all conclusions. Our experimental evaluation demonstrates that the systems can deal with large theories (up to hundreds of thousands of rules). We show that defeasible logic has linear complexity, which contrasts markedly with most other non-monotonic logics and helps to explain the impressive experimental results. We believe that defeasible logic, with its efficiency and simplicity, is a good candidate to be used as a modelling language for practical applications, including modelling of regulations and business rules.
Automatic generation of Bayesian network (BN) structures (directed acyclic graphs) is an important step in the experimental study of algorithms for inference in BNs and algorithms for learning BNs from data. Previously known simulation algorithms do not guarantee connectedness of generated structures or even successful generation according to a user specification. We propose a simple, efficient and well-behaved algorithm for automatic generation of BN structures. The performance of the algorithm is demonstrated experimentally.
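For illustration only, here is a naive Python sketch that generates a random DAG and retries until it is weakly connected; the paper's algorithm is designed to avoid this kind of rejection sampling and to respect a user specification, so treat the parameters and approach below as assumptions.

    # Minimal sketch (assumed parameters, not the paper's algorithm): generate a
    # random DAG over n nodes and keep resampling until it is weakly connected.
    import random

    def random_connected_dag(n, edge_prob=0.3, seed=None):
        rng = random.Random(seed)
        while True:
            order = list(range(n))
            rng.shuffle(order)                      # random topological order
            edges = set()
            for i in range(n):
                for j in range(i + 1, n):
                    if rng.random() < edge_prob:
                        edges.add((order[i], order[j]))  # edge follows the order, so the graph is acyclic
            # check weak connectedness by BFS over the undirected version
            adj = {v: set() for v in range(n)}
            for u, v in edges:
                adj[u].add(v)
                adj[v].add(u)
            seen, stack = {0}, [0]
            while stack:
                for w in adj[stack.pop()]:
                    if w not in seen:
                        seen.add(w)
                        stack.append(w)
            if len(seen) == n:
                return edges

    print(sorted(random_connected_dag(5, seed=42)))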
We present and evaluate PunCAT, an interactive electronic tool for the translation of puns. Following the strategies known to be applied in pun translation, PunCAT automatically translates each sense of the pun separately; it then allows the user to explore the semantic fields of these translations in order to help construct a plausible target-language solution that maximizes the semantic correspondence to the original. Our evaluation is based on an empirical pilot study in which the participants translated puns from a variety of published sources from English into German, with and without PunCAT. We aimed to answer the following questions: Does the tool support, improve, or constrain the translation process, and if so, in what ways? And what are the tool's main benefits and drawbacks as perceived and described by the participants? Our analysis of the translators' cognitive processes gives us insight into their decision-making strategies and how they interacted with the tool. We find clear evidence that PunCAT effectively supports the translation process in terms of stimulating brainstorming and broadening the translator's pool of solution candidates. We have also identified a number of directions in which the tool could be adapted to better suit translators' work processes.
We describe a language-neutral automatic summarization system which aims to produce coherent extracts. It builds an initial extract composed solely of topic sentences, and then recursively fills in the topical lacunae by providing linking material between semantically dissimilar sentences. While experiments with human judges did not show a statistically significant increase in textual coherence with the use of a latent semantic analysis module, we found a strong positive correlation between coherence and overall summary quality.
The goal of the JOKER track series is to bring together linguists, translators, and computer scientists to foster progress on the automatic interpretation, generation, and translation of wordplay. Building on lessons learned from last year's edition, JOKER-2023 held three shared tasks aligned with the human approaches to the translation of wordplay, or more specifically of puns in English, French, and Spanish: detection, location and interpretation, and finally translation. In this paper, we define these three tasks and describe our approaches to corpus creation and evaluation. We then present an overview of the participating systems, including summaries of their approaches and a comparison of their performance. As in JOKER-2022, this year's track also solicited contributions making further use of our data (an “unshared task”), which we also report on.
Despite recent advances in information retrieval and natural language processing, rhetorical devices that exploit ambiguity or subvert linguistic rules remain a challenge for such systems. However, corpus-based analysis of wordplay has been a perennial topic of scholarship in the humanities, including literary criticism, language education, and translation studies. The immense data-gathering effort required for these studies points to the need for specialized text retrieval and classification technology, and consequently for appropriate test collections. In this paper, we introduce and analyze a new dataset for research and applications in the retrieval and processing of wordplay. Developed for the JOKER track at CLEF 2023, our annotated corpus extends and improves upon past English wordplay detection datasets in several ways. First, we introduce hundreds of additional positive examples; second, we provide French translations for the examples; and third, we provide negative examples with characteristics closely matching those of the positive examples. This last feature helps ensure that AI models learn to effectively distinguish wordplay from non-wordplay, and not simply texts differing in length, style, or vocabulary. Our test collection thus represents a step towards wordplay-aware multilingual information retrieval.
Understanding and translating humorous wordplay often requires recognition of implicit cultural references, knowledge of word formation processes, and discernment of double meanings – issues which pose challenges for humans and computers alike. This paper introduces the CLEF 2023 JOKER track, which takes an interdisciplinary approach to the creation of reusable test collections, evaluation metrics, and methods for the automatic processing of wordplay. We describe the track's interconnected shared tasks for the detection, location, interpretation, and translation of puns. We also describe associated data sets and evaluation methodologies, and invite contributions making further use of our data.
While humour and wordplay are among the most intensively studied problems in the field of translation studies, they have been almost completely ignored in machine translation. This is partly because most AI-based translation tools require a quality and quantity of training data (e.g., parallel corpora) that has historically been lacking for humour and wordplay. The goal of the JOKER@CLEF 2022 workshop was to bring together translators and computer scientists to work on an evaluation framework for wordplay, including data and metric development, and to foster work on automatic methods for wordplay translation. To this end, we defined three pilot tasks: (1) classify and explain instances of wordplay, (2) translate single terms containing wordplay, and (3) translate entire phrases containing wordplay (punning jokes). This paper describes and discusses each of these pilot tasks, as well as the participating systems and their results.
Humour remains one of the most difficult aspects of intercultural communication: understanding humour often requires understanding implicit cultural references and/or double meanings, and this raises the question of the (un)translatability of humour. Wordplay is a common source of humour in literature, journalism, and advertising due to its attention-getting, mnemonic, playful, and subversive character. The translation of humour and wordplay is therefore in high demand. Modern translation depends heavily on technological aids, yet few works have treated the automation of humour and wordplay translation and the creation of humour corpora. The goal of the JOKER workshop is to bring together translators and computer scientists to work on an evaluation framework for creative language, including data and metric development, and to foster work on automatic methods for wordplay translation. We propose three pilot tasks: (1) classify and explain instances of wordplay, (2) translate single words containing wordplay, and (3) translate entire phrases containing wordplay.
In this work, we design an end-to-end model for poetry generation based on conditioned recurrent neural network (RNN) language models whose goal is to learn stylistic features (poem length, sentiment, alliteration, and rhyming) from examples alone. We show this model successfully learns the ‘meaning’ of length and sentiment, as we can control it to generate longer or shorter as well as more positive or more negative poems. However, the model does not grasp sound phenomena like alliteration and rhyming, but instead exploits low-level statistical cues. Possible reasons include the size of the training data, the relatively low frequency and difficulty of these sublexical phenomena, as well as model biases. We show that more recent GPT-2 models also have problems learning sublexical phenomena such as rhyming from examples alone.
Disagreement between coders is ubiquitous in virtually all datasets annotated with human judgements in both natural language processing and computer vision. However, most supervised machine learning methods assume that a single preferred interpretation exists for each item, which is at best an idealization. The aim of the SemEval-2021 shared task on Learning with Disagreements (Le-wi-Di) was to provide a unified testing framework for methods for learning from data containing multiple and possibly contradictory annotations covering the best-known datasets containing information about disagreements for interpreting language and classifying images. In this paper we describe the shared task and its results.
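One common way to exploit such disagreement information, shown here purely as a hedged sketch rather than the task's prescribed method, is to train against the distribution of annotator labels ("soft labels") instead of a single majority-vote label:

    # Sketch: soft-label training from contradictory annotations (illustrative
    # numbers and model; not the shared task's official baseline).
    import torch
    import torch.nn.functional as F

    logits = torch.randn(4, 3, requires_grad=True)        # model outputs for 4 items, 3 classes
    counts = torch.tensor([[3., 1., 0.],                   # per-item annotator label counts
                           [0., 2., 2.],
                           [1., 1., 2.],
                           [4., 0., 0.]])
    soft_targets = counts / counts.sum(dim=1, keepdim=True)

    loss = F.kl_div(F.log_softmax(logits, dim=1), soft_targets, reduction="batchmean")
    loss.backward()
    print(float(loss))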
The translation of wordplay is one of the most extensively researched problems in translation studies, but it has attracted little attention in the fields of natural language processing and machine translation. This is because today's language technologies treat anomalies and ambiguities in the input as things that must be resolved in favour of a single “correct” interpretation, rather than preserved and interpreted in their own right. But if computers cannot yet process such creative language on their own, can they at least provide specialized support to translation professionals? In this paper, I survey the state of the art relevant to computational processing of humorous wordplay and put forth a vision of how existing theories, resources, and technologies could be adapted and extended to support interactive, computer-assisted translation.
Most humour processing systems to date make at best discrete, coarse-grained distinctions between the comical and the conventional, yet such notions are better conceptualized as a broad spectrum. In this paper, we present a probabilistic approach, a variant of Gaussian process preference learning (GPPL), that learns to rank and rate the humorousness of short texts by exploiting human preference judgments and automatically sourced linguistic annotations. We apply our system, which had previously shown good performance on English-language one-liners annotated with pairwise humorousness annotations, to the Spanish-language data set of the HAHA@IberLEF2019 evaluation campaign. We report system performance for the campaign's two subtasks, humour detection and funniness score prediction, and discuss some issues arising from the conversion between the numeric scores used in the HAHA@IberLEF2019 data and the pairwise judgment annotations required for our method.
The inability to quantify key aspects of creative language is a frequent obstacle to natural language understanding. To address this, we introduce novel tasks for evaluating the creativeness of language, namely scoring and ranking text by humorousness and metaphor novelty. To sidestep the difficulty of assigning discrete labels or numeric scores, we learn from pairwise comparisons between texts. We introduce a Bayesian approach for predicting humorousness and metaphor novelty using Gaussian process preference learning (GPPL), which achieves a Spearman's ρ of 0.56 against gold using word embeddings and linguistic features. Our experiments show that given sparse, crowdsourced annotation data, ranking using GPPL outperforms best–worst scaling. We release a new dataset for evaluating humor containing 28,210 pairwise comparisons of 4,030 texts, and make our software freely available.
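The evaluation figure quoted above can be reproduced in form (with toy numbers only) by computing Spearman's ρ between gold and predicted scores, for example with SciPy:

    # Sketch of the kind of evaluation reported above: Spearman's rho between
    # gold humorousness scores and predicted scores (toy numbers, not the data).
    from scipy.stats import spearmanr

    gold      = [0.10, 0.35, 0.50, 0.80, 0.95]
    predicted = [0.20, 0.30, 0.55, 0.70, 0.90]

    rho, p_value = spearmanr(gold, predicted)
    print(f"Spearman's rho = {rho:.2f} (p = {p_value:.3f})")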
The study of argumentation and the development of argument mining tools depend on the availability of annotated data, which is challenging to obtain in sufficient quantity and quality. We present a method that breaks down a popular but relatively complex discourse-level argument annotation scheme into a simpler, iterative procedure that can be applied even by untrained annotators. We apply this method in a crowdsourcing setup and report on the reliability of the annotations obtained. The source code for a tool implementing our annotation method, as well as the sample data we obtained (4909 gold-standard annotations across 982 documents), are freely released to the research community. These are intended to serve the needs of qualitative research into argumentation, as well as of data-driven approaches to argument mining.
Argument mining is a core technology for automating argument search in large document collections. Despite its usefulness for this task, most current approaches are designed for use only with specific text types and fall short when applied to heterogeneous texts. In this paper, we propose a new sentential annotation scheme that is reliably applicable by crowd workers to arbitrary Web texts. We source annotations for over 25,000 instances covering eight controversial topics. We show that integrating topic information into bidirectional long short-term memory networks outperforms vanilla BiLSTMs by more than 3 percentage points in F1 in two- and three-label cross-topic settings. We also show that these results can be further improved by leveraging additional data for topic relevance using multi-task learning.
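A minimal sketch of the general idea of injecting topic information into a BiLSTM sentence classifier is given below in PyTorch; the wiring (mean pooling, concatenation of a topic embedding) and all hyperparameters are illustrative assumptions, not the paper's exact architecture.

    # Sketch: topic-aware BiLSTM sentence classifier (illustrative only).
    import torch
    import torch.nn as nn

    class TopicBiLSTM(nn.Module):
        def __init__(self, vocab_size, n_topics, n_labels,
                     emb_dim=100, hidden=64, topic_dim=16):
            super().__init__()
            self.word_emb = nn.Embedding(vocab_size, emb_dim)
            self.topic_emb = nn.Embedding(n_topics, topic_dim)
            self.lstm = nn.LSTM(emb_dim, hidden, batch_first=True, bidirectional=True)
            self.out = nn.Linear(2 * hidden + topic_dim, n_labels)

        def forward(self, token_ids, topic_ids):
            h, _ = self.lstm(self.word_emb(token_ids))   # (B, T, 2*hidden)
            sent = h.mean(dim=1)                         # mean-pool over tokens
            topic = self.topic_emb(topic_ids)            # (B, topic_dim)
            return self.out(torch.cat([sent, topic], dim=1))

    model = TopicBiLSTM(vocab_size=5000, n_topics=8, n_labels=3)
    logits = model(torch.randint(0, 5000, (2, 12)), torch.tensor([1, 5]))
    print(logits.shape)   # torch.Size([2, 3])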
Argument mining is a core technology for enabling argument search in large corpora. However, most current approaches fall short when applied to heterogeneous texts. In this paper, we present an argument retrieval system capable of retrieving sentential arguments for any given controversial topic. By analyzing the highest-ranked results extracted from Web sources, we found that our system covers 89% of arguments found in expert-curated lists of arguments from an online debate portal, and also identifies additional valid arguments.
A pun is a form of wordplay in which a word suggests two or more meanings by exploiting polysemy, homonymy, or phonological similarity to another word, for an intended humorous or rhetorical effect. Though a recurrent and expected feature in many discourse types, puns stymie traditional approaches to computational lexical semantics because they violate their one-sense-per-context assumption. This paper describes the first competitive evaluation for the automatic detection, location, and interpretation of puns. We describe the motivation for these tasks, the evaluation methods, and the manually annotated data set. Finally, we present an overview and discussion of the participating systems' methodologies, resources, and results.
In this paper, we propose using metaheuristics (in particular, simulated annealing and the new D-Bees algorithm) to solve word sense disambiguation as an optimization problem within a knowledge-based lexical substitution system.  We are the first to perform such an extrinsic evaluation of metaheuristics, for which we use two standard lexical substitution datasets, one English and one German.  We find that D-Bees has robust performance for both languages, and performs better than simulated annealing, though both achieve good results.  Moreover, the D-Bees–based lexical substitution system outperforms state-of-the-art systems on several evaluation metrics.  We also show that D-Bees achieves competitive performance in lexical simplification, a variant of lexical substitution.
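To illustrate how WSD can be cast as an optimization problem for a metaheuristic, here is a hedged Python sketch of simulated annealing over sense assignments; the scoring function and cooling schedule are stand-ins, and the D-Bees algorithm is not reproduced here.

    # Sketch: WSD as combinatorial optimization solved by simulated annealing.
    # The "score" function is a placeholder for a knowledge-based objective.
    import math, random

    def anneal(words, senses_of, score, steps=5000, t0=1.0, cooling=0.999, seed=0):
        rng = random.Random(seed)
        state = {w: rng.choice(senses_of[w]) for w in words}   # one sense per word
        best, best_score, t = dict(state), score(state), t0
        for _ in range(steps):
            w = rng.choice(words)
            candidate = dict(state)
            candidate[w] = rng.choice(senses_of[w])
            delta = score(candidate) - score(state)
            if delta >= 0 or rng.random() < math.exp(delta / max(t, 1e-9)):
                state = candidate
                if score(state) > best_score:
                    best, best_score = dict(state), score(state)
            t *= cooling
        return best

    # Toy usage: the objective simply rewards assignments that pick sense "A".
    words = ["bank", "deposit"]
    senses = {"bank": ["A", "B"], "deposit": ["A", "B"]}
    print(anneal(words, senses, lambda s: sum(v == "A" for v in s.values())))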
When processing arguments in online user interactive discourse, it is often necessary to determine their bases of support. In this paper, we describe a supervised approach, based on deep neural networks, for classifying the claims made in online arguments. We conduct experiments using convolutional neural networks (CNNs) and long short-term memory networks (LSTMs) on two claim data sets compiled from online user comments. Using different types of distributional word embeddings, but without incorporating any rich, expensive set of features, we achieve a significant improvement over the state of the art for one data set (which categorizes arguments as factual vs. emotional), and performance comparable to the state of the art on the other data set (which categorizes claims according to their verifiability). Our approach has the advantages of using a generalized, simple, and effective methodology that works for claim categorization on different data sets and tasks.
We describe the construction of GLASS, a newly sense-annotated version of the German lexical substitution data set used at the GermEval 2015: LexSub shared task. Using the two annotation layers, we conduct the first known empirical study of the relationship between manually applied word senses and lexical substitutions. We find that synonymy and hypernymy/hyponymy are the only semantic relations directly linking targets to their substitutes, and that substitutes in the target's hypernymy/hyponymy taxonomy closely align with the synonyms of a single GermaNet synset. Despite this, these substitutes account for a minority of those provided by the annotators. The results of our analysis accord with those of a previous study on English-language data (albeit with automatically induced word senses), leading us to suspect that the sense–substitution relations we discovered may be of a universal nature. We also tentatively conclude that relatively cheap lexical substitution annotations can be used as a knowledge source for automatic WSD. Also introduced in this paper is Ubyline, the web application used to produce the sense annotations. Ubyline presents an intuitive user interface optimized for annotating lexical sample data, and is readily adaptable to sense inventories other than GermaNet.
Lexical substitution is a task in which participants are given a word in a short context and asked to provide a list of synonyms appropriate for that context. This paper describes GermEval 2015: LexSub, the first shared task for automated lexical substitution on German-language text.  We describe the motivation for this task, the evaluation methods, and the manually annotated data set used to train and test the participating systems.  Finally, we present an overview and discussion of the participating systems' methodologies, resources, and results.
Traditional approaches to word sense disambiguation (WSD) rest on the assumption that there exists a single, unambiguous communicative intention underlying every word in a document.  However, writers sometimes intend for a word to be interpreted as simultaneously carrying multiple distinct meanings.  This deliberate use of lexical ambiguity—i.e., punning—is a particularly common source of humour.  In this paper we describe how traditional, language-agnostic WSD approaches can be adapted to "disambiguate" puns, or rather to identify their double meanings.  We evaluate several such approaches on a manually sense-annotated corpus of English puns and observe their performance.
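A toy version of the adaptation described above, assuming NLTK's WordNet data is installed, ranks senses by a Lesk-style gloss–context overlap and keeps the top two as the pun's candidate double meaning; the scoring is deliberately simplistic and not the paper's actual method.

    # Sketch: return the two best-scoring senses instead of the single best one.
    # Requires the NLTK WordNet data (nltk.download('wordnet')).
    from nltk.corpus import wordnet as wn

    def top_two_senses(target, context_words):
        context = set(w.lower() for w in context_words)
        scored = []
        for synset in wn.synsets(target):
            gloss = set(synset.definition().lower().split())
            scored.append((len(gloss & context), synset))
        scored.sort(key=lambda x: x[0], reverse=True)
        return [s for _, s in scored[:2]]

    print(top_two_senses("interest", "I used to be a banker but I lost interest".split()))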
We present a method for clustering word senses of a lexical-semantic resource by mapping them to those of another sense inventory. This is a promising way of reducing polysemy in sense inventories and consequently improving word sense disambiguation performance. In contrast to previous approaches, we use Dijkstra-WSA, a parameterizable alignment algorithm which is largely resource- and language-agnostic.  To demonstrate this, we apply our technique to GermaNet, the German equivalent to WordNet. The GermaNet sense clusterings we induce through alignments to various collaboratively constructed resources achieve a significant boost in accuracy, even though our method is far less complex and less dependent on language-specific knowledge than past approaches.
We explore the contribution of distributional information for purely knowledge-based word sense disambiguation. Specifically, we use a distributional thesaurus, computed from a large parsed corpus, for lexical expansion of context and sense information. This bridges the lexical gap that is seen as the major obstacle for word overlap–based approaches.
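The following toy Python sketch shows the general idea of lexical expansion: both the context and a sense's gloss are expanded with distributionally similar words before the overlap is computed. The miniature thesaurus is invented for illustration and stands in for one computed from a large parsed corpus.

    # Sketch: lexical expansion for overlap-based WSD (toy thesaurus, invented data).
    THESAURUS = {
        "bank": ["lender", "branch", "account"],
        "river": ["stream", "shore", "water"],
    }

    def expand(words, thesaurus, k=3):
        expanded = set(words)
        for w in words:
            expanded.update(thesaurus.get(w, [])[:k])
        return expanded

    def overlap(context_words, gloss_words, thesaurus):
        return len(expand(context_words, thesaurus) & expand(gloss_words, thesaurus))

    print(overlap({"river", "fishing"}, {"shore", "water", "land"}, THESAURUS))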
Implementations of word sense disambiguation (WSD) algorithms tend to be tied to a particular test corpus format and sense inventory. This makes it difficult to test their performance on new data sets, or to compare them against past algorithms implemented for different data sets. In this paper we present DKPro WSD, a freely licensed, general-purpose framework for WSD which is both modular and extensible. DKPro WSD abstracts the WSD process in such a way that test corpora, sense inventories, and algorithms can be freely swapped. Its UIMA-based architecture makes it easy to add support for new resources and algorithms. Related tasks such as word sense induction and entity linking are also supported.
Current word completion tools rely mostly on statistical or syntactic knowledge. Can using semantic knowledge improve the completion task? We propose a language-independent word completion algorithm which uses latent semantic analysis (LSA) to model the semantic context of the word being typed. We find that a system using this algorithm alone achieves keystroke savings of 56% and a hit rate of 42%. This represents improvements of 4.3% and 12%, respectively, over existing approaches.
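Under one common set of definitions (exact definitions vary between studies), the two metrics quoted above can be computed as in this small sketch, with toy inputs chosen to reproduce the headline numbers:

    # Sketch of the two metrics mentioned above, under assumed definitions:
    #   keystroke savings = fraction of characters the user did not have to type
    #   hit rate          = fraction of words for which a correct prediction appeared
    def keystroke_savings(chars_typed, chars_total):
        return 1.0 - chars_typed / chars_total

    def hit_rate(hits, words_total):
        return hits / words_total

    print(f"KS = {keystroke_savings(440, 1000):.0%}, HR = {hit_rate(42, 100):.0%}")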
We describe eFISK, an automated keyword extraction system which unobtrusively measures the user's attention in order to isolate and identify those areas of a written document the reader finds of greatest interest. Attention is measured by use of eye-tracking hardware consisting of a desk-mounted infrared camera which records various data about the user's eye. The keywords thus identified are subsequently used in the back end of an information retrieval system to help the user find other documents which contain information of interest to him. Unlike traditional IR techniques which compare documents simply on the basis of common terms withal, our system also accounts for the weights users implicitly attach to certain words or sections of the source document. We describe a task-based user study which compares the utility of standard relevance feedback techniques to the keywords and keyphrases discovered by our system in finding other relevant documents from a corpus.
We investigate the use of topic models, such as probabilistic latent semantic analysis (PLSA) and latent Dirichlet allocation (LDA), for word completion tasks. The advantage of using these models for such an application is twofold. On the one hand, they allow us to exploit semantic or contextual information when predicting candidate words for completion. On the other hand, these probabilistic models have been found to outperform classical latent semantic analysis (LSA) for modeling text documents. We describe a word completion algorithm that takes into account the semantic context of the word being typed. We also present evaluation metrics to compare different models being used in our study. Our experiments validate our hypothesis of using probabilistic models for semantic analysis of text documents and their application in word completion tasks.
Good hypertext writing style mandates that link texts clearly indicate the nature of the link target. While this guideline is routinely ignored in HTML, the lightweight markup languages used by wikis encourage or even force hypertext authors to use semantically appropriate link texts. This property of wiki hypertext makes it an ideal candidate for processing with latent semantic analysis, a factor analysis technique for finding latent transitive relations among natural-language documents. In this study, we design, implement, and test an LSA-based information retrieval system for wikis. Instead of a full-text index, our system indexes only link texts and document titles. Nevertheless, its precision exceeds that of a popular full-text search engine, and is comparable to that of PageRank-based systems such as Google.
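A rough sketch of such an LSA pipeline, using scikit-learn over a toy "index" of link texts and document titles (the corpus, dimensionality, and query are invented for illustration), might look as follows:

    # Sketch: LSA retrieval over link texts and titles (toy data, assumed sizes).
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.decomposition import TruncatedSVD
    from sklearn.metrics.pairwise import cosine_similarity

    index_texts = [
        "latent semantic analysis information retrieval",
        "wiki markup languages and link text",
        "PageRank and web search engines",
    ]
    vectorizer = TfidfVectorizer()
    tfidf = vectorizer.fit_transform(index_texts)
    lsa = TruncatedSVD(n_components=2, random_state=0)
    doc_vecs = lsa.fit_transform(tfidf)

    query_vec = lsa.transform(vectorizer.transform(["semantic search for wikis"]))
    print(cosine_similarity(query_vec, doc_vecs))   # similarity of query to each document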
Technological progress allows us to equip any mobile phone with new functionalities, such as storing personalized information about its owner and using the corresponding personal profile for enabling communication to persons whose mobile phones represent similar profiles. However, this raises very specific security issues, in particular relating to the use of Bluetooth technology. Herein we consider such scenarios and the related privacy and security problems. We analyze in which respect certain design approaches may fail or succeed at solving these problems. We concentrate on methods for designing the user-related part of the communication service appropriately in order to enhance confidentiality.
The coverage and quality of conceptual information contained in lexical semantic resources is crucial for many tasks in natural language processing. Automatic alignment of complementary resources is one way of improving this coverage and quality; however, past attempts have always been between pairs of specific resources.  In this paper we establish some set-theoretic conventions for describing concepts and their alignments, and use them to describe a method for automatically constructing n-way alignments from arbitrary pairwise alignments.  We apply this technique to the production of a three-way alignment from previously published WordNet–Wikipedia and WordNet–Wiktionary alignments.  We then present a quantitative and informal qualitative analysis of the aligned resource. The three-way alignment was found to have greater coverage, an enriched sense representation, and coarser sense granularity than both the original resources and their pairwise alignments, though this came at the cost of accuracy.  An evaluation of the induced word sense clusters in a word sense disambiguation task showed that they were no better than random clusters of equivalent granularity.  However, use of the alignments to enrich a sense inventory with additional sense glosses did significantly improve the performance of a baseline knowledge-based WSD algorithm.
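One simple way to realise an n-way alignment from pairwise alignments, offered here only as a hedged sketch of the general idea rather than the paper's exact set-theoretic construction, is to take the transitive closure of matched concept pairs with a union–find structure; the concept identifiers below are invented.

    # Sketch: merge pairwise concept alignments into n-way clusters via union-find.
    from collections import defaultdict

    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x

    def union(x, y):
        parent[find(x)] = find(y)

    pairwise = [
        (("wn", "bank.n.01"), ("wiki", "Bank_(institution)")),
        (("wn", "bank.n.01"), ("wikt", "bank:noun:1")),
    ]
    for a, b in pairwise:
        union(a, b)

    clusters = defaultdict(list)
    for concept in parent:
        clusters[find(concept)].append(concept)
    print(list(clusters.values()))   # one cluster spanning all three resources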
A major problem with automatically-produced summaries in general, and extracts in particular, is that the output text often lacks textual coherence. Our goal is to improve the textual coherence of automatically produced extracts. We developed and implemented an algorithm which builds an initial extract composed solely of topic sentences, and then recursively fills in the lacunae by providing linking material from the original text between semantically dissimilar sentences. Our summarizer differs in architecture from most others in that it measures semantic similarity with latent semantic analysis (LSA), a factor analysis technique based on the vector-space model of information retrieval. We believed that the deep semantic relations discovered by LSA would assist in the identification and correction of abrupt topic shifts in the summaries. However, our experiments did not show a statistically significant difference in the coherence of summaries produced by our system as compared with a non-LSA version.
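The gap-filling criterion can be illustrated with a small sketch: when two adjacent extract sentences are too dissimilar in the latent space, linking material from the source text is inserted between them. The threshold and the toy "LSA vectors" below are illustrative assumptions.

    # Sketch: decide whether linking material is needed between adjacent sentences.
    import numpy as np

    def needs_link(vec_a, vec_b, threshold=0.3):
        cos = float(np.dot(vec_a, vec_b) /
                    (np.linalg.norm(vec_a) * np.linalg.norm(vec_b)))
        return cos < threshold

    s1 = np.array([0.9, 0.1, 0.0])    # pretend LSA vectors of adjacent sentences
    s2 = np.array([0.1, 0.9, 0.2])
    print(needs_link(s1, s2))          # True -> insert a linking sentence here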
Informal empirical and anecdotal evidence from the (male) scientific community has long pointed to the difficulty in securing decent, long-term female companionship. To date, however, no one has published a rigorous study of the matter. In this essay, the author investigates himself as a case study and presents a proof, using simple statistical calculus, of why it is impossible to find a girlfriend.
This paper presents details of Task 1 of the JOKER-2023 Track, which aims to detect sentences in English, French, and Spanish that contain wordplay. With applications in humour generation, sentiment analysis, conversational agents, content filtering, and linguistic creativity, this task is still challenging despite significant recent progress in information retrieval and natural language processing. Building on the lessons learned from last year's edition of the JOKER track, our overall goal is to foster progress in the automatic interpretation, generation, and translation of wordplay in English, Spanish, and French. In this paper, we define our task and describe our approaches to corpus creation and evaluation in the three languages. We then present an overview of the participating systems, including summaries of their approaches and a comparison of their performance.
This paper presents an overview of Task 2 of the JOKER-2023 track on automatic wordplay analysis. The goal of the JOKER track series is to bring together linguists, translators, and computer scientists to foster progress in the automatic interpretation, generation, and translation of wordplay. Task 2 is focussed on pun location and interpretation. Automatic pun interpretation is important for advancing natural language understanding, enabling humor generation, aiding in translation and cross-linguistic understanding, enhancing information retrieval, and contributing to the field of computational creativity. In this overview, we present the general setup of the shared task we organized as part of the CLEF-2023 evaluation campaign, the participants' approaches, and the quantitative results.
This paper provides a comprehensive overview of Task 3 of the JOKER-2023 track. The overarching objective of the JOKER track series is to facilitate collaboration among linguists, translators, and computer scientists to advance the development of automatic interpretation, generation, and translation of wordplay. Task 3 specifically concentrates on the automatic translation of puns from English into French and Spanish. In this overview, we outline the overall structure of the shared task that we organized as part of the CLEF-2023 evaluation campaign. We discuss the approaches employed by the participants and present and analyze the results they achieved.
The translation of the pun is one of the most challenging issues for translators and for this reason has become an intensively studied phenomenon in the field of translation studies. Translation technology aims to partially or even totally automate the translation process, but relatively little attention has been paid to the use of computers for the translation of wordplay. The CLEF 2022 JOKER track aims to build a multilingual corpus of wordplay and evaluation metrics in order to advance the automation of creative-language translation. This paper provides an overview of the track's Pilot Task 3, where the goal is to translate entire phrases containing wordplay (particularly puns). We describe the data collection, the task setup, the evaluation procedure, and the participants' results. We also cover a side product of our project, a homogeneous monolingual corpus for wordplay detection in French.
Onomastic wordplay has been widely used as a rhetorical device by novelists, poets, and playwrights, from character names in Shakespeare and other classic literature to named entities in Pokémon, Harry Potter, Asterix, and video games. The translation of such wordplay is problematic both for humans and algorithms due to its ambiguity and unorthodox morphology. In this paper, we present an overview of Pilot Task 2 of the JOKER@CLEF 2022 track, where participants had to translate wordplay in named entities from English into French. For this, we constructed a parallel corpus of wordplay in named entities from movies, video games, advertising slogans, literature, etc. Five teams participated in the task. The methods employed by the participants were based on state-of-the-art transformer models, which have the advantage of subword tokenisation. The participants' models were pre-trained on large corpora and fine-tuned on the JOKER training set. We observed that in many cases the models produced the exact official translations, suggesting that they had been pre-trained on corpora containing the source texts used in the JOKER corpus. Those translations that differed from the official ones only rarely contained wordplay.
As a multidisciplinary field of study, humour remains one of the most difficult aspects of intercultural communication. Understanding humour often involves understanding implicit cultural references and/or double meanings, which raises the questions of how to detect and classify instances of this complex phenomenon. This paper provides an overview of Pilot Task 1 of the CLEF 2022 JOKER track, where participants had to classify and explain instances of wordplay. We introduce a new classification of wordplay and a new annotation scheme for wordplay interpretation suitable both for phrase-based wordplay and wordplay in named entities. We describe the collection of our data, our task setup, and the evaluation procedure, and we give a brief overview of the participating teams' approaches and results.
How do we know when a translation is good? This seemingly simple question has long dogged human practitioners of translation, and has arguably taken on even greater importance in today’s world of fully automatic, end-to-end machine translation systems. Much of the difficulty in assessing translation quality is that different translations of the same text may be made for different purposes, each of which entails a unique set of requirements and constraints. This difficulty is compounded by ambiguities in the source text, which must be identified and then preserved or eliminated according to the needs of the translation and the (apparent) intent of the source text. In this talk, I survey the state of the art in linguistics, computational linguistics, translation, and machine translation as it relates to the notion of linguistic ambiguity in general, and intentional humorous ambiguity in particular. I describe the various constraints and requirements of different types of translations and provide examples of how various automatic and interactive techniques from natural language processing can be used to detect and then resolve or preserve linguistic ambiguities according to these constraints and requirements. In the vein of the “Translator’s Amanuensis” proposed by Martin Kay, I outline some specific proposals concerning how the hitherto disparate work in the aforementioned fields can be connected with a view to producing “machine-in-the-loop” computer-assisted translation (CAT) tools to assist human translators in selecting and implementing pun translation strategies in furtherance of the translation requirements. Throughout the talk, I will attempt to draw links with how this research relates to the requirements engineering community.
After the working group on “What is missing in ML&AI to understanding Jokes?”, we discussed the possibility of surveying the expressiveness of existing models of meaning representation, contrasting them with what existing theories in cognitive science predict about the relevant cognitive activities and processes. Spatial stimuli activate the zoo of spatial cells in the hippocampus, forming a cognitive map or collage in memory and producing spatial descriptions in language. We need to survey existing models of Mental Spatial Representation (MSR) in the cognitive psychology literature. On the other hand, we need to analyse vector embeddings of spatial entities and relations in large-scale pre-trained world models, and identify the gap between MSR and vector embeddings via machine learning.
Why can current Machine Learning and AI (ML&AI) techniques not understand jokes as we humans do? What is missing? The knowledge that is needed to understand jokes is neither in the joke texts nor in the neural networks. Acquisition of and reasoning with commonsense knowledge is still an open problem for Machine Learning and AI. Meaning representation based on embeddings is insufficient; we need meaning representation formats that go beyond vector representations. Vectors are only shadows. Information processing and meaning understanding are embodied. The discussion guides us to develop novel embodied ML&AI techniques to understand spatial jokes first.
Cartoons can be understood without language. That is, a suitably arranged scene of simple objects, with no accompanying text, is often enough to make us laugh – evidence that thinking (mental activity) happens before language. This raises the question of non-linguistic diagrammatic representation of spatial humour, along with the mechanism of neural computation. In particular, we raise the following questions: (1) How can we diagrammatically formalise spatial humour? (2) How can these diagrammatic formalisms be processed by neural networks? (3) How can this neural computation deliver high-level schemas similar to the script-opposition semantic theory of humour? The spatial knowledge encoded in the scene can activate the necessary spatial and non-spatial knowledge. By what neural associative mechanism or process of reasoning do we put this all together to “get” the joke? During the seminar, we aimed to make some headway towards establishing (1) exactly what sort of scene-specific and common-sense knowledge is required to understand any given cartoon, (2) what part of this knowledge could in principle be acquired by existing machine learning (ML) techniques, and which could be acquired or encoded through symbolic structures, (3) what activation process acquires the rest of the knowledge required to interpret the humour, and (4) whether there is a unified representation that could represent this knowledge in a computer’s working memory.
The existing n-ball embedding approach can precisely encode a large symbolic tree structure into tree node embeddings. In this working group, we discussed how to apply the idea of n-balls to solve NLP tasks, in particular word sense disambiguation (WSD). WSD, which determines the intended meaning of words in context, is a fundamental task in Natural Language Processing (NLP) that impacts a variety of downstream NLP applications. To tackle the WSD task, researchers have investigated knowledge-based approaches as well as supervised, semi-supervised, and unsupervised machine learning, but these methods suffer from a number of limitations besides their costly computation. We let the n-balls rotate, resulting in the Rotating Spheres Model (RSM). Under RSM, the embeddings of a word's senses work like gestures of that word: given a context, the word chooses the best gesture. WSD then amounts to determining the best rotating axis in a given context, where each rotating axis represents a sense in the predefined sense inventory.
RPM is a package management system which provides a uniform, automated way for users to install, upgrade, and uninstall programs. Because RPM is the default software distribution format for many operating systems (particularly GNU/Linux), users may find it useful to manage their library of TeX-related packages using RPM. This article explains how to produce RPM files for TeX software, either for personal use or for public distribution. We also explain how a (La)TeX user can find, install, and remove TeX-related RPM packages.
We present Biblet, a set of BibTeX bibliography styles (bst) which generate XHTML from BibTeX databases. Unlike other BibTeX to XML/HTML converters, Biblet is written entirely in the native BibTeX style language and therefore works “out of the box” on any system that runs BibTeX. Features include automatic conversion of LaTeX symbols to HTML or Unicode entities; customizable graphical hyperlinks to PostScript, PDF, DVI, LaTeX, and HTML resources; support for nonstandard but common fields such as day, isbn, and abstract; hideable text blocks; and output of the original BibTeX entry for sharing citations. Biblet's highly structured XHTML output means that bibliography appearance can be drastically altered simply by specifying a Cascading Style Sheet (CSS), or easily postprocessed with third-party XML, HTML, or text processing tools. We compare and contrast Biblet to other common converters, describe basic usage of Biblet, give examples of how to produce custom-formatted bibliographies, and provide a basic overview of Biblet internals for those wishing to modify the style file itself.
In this paper, we present HA-prosper, a LaTeX package for creating overhead slides. We describe the features of the package and give examples of their use. We also discuss what advantages there are to producing slides with LaTeX versus the presentation software typically bundled with today's office suites.
In this article we present HA-prosper, a LaTeX package for creating sophisticated slides. We describe its features and show some examples of their use. We also discuss the advantages that can be gained from this approach, in keeping with the LaTeX philosophy, compared with the other kinds of presentation programs generally found in today's office suites.