Alzheimer’s Disease is the most common form of dementia. Automatic detection from speech could help to identify symptoms at early stages, so that preventive actions can be carried out. This research is a contribution to the ADReSSo Challenge: we analyze the use of a state-of-the-art (SotA) ASR system to transcribe participants’ spoken descriptions of a picture. We analyze the loss of performance with respect to the use of human transcriptions (measured using transcriptions from the 2020 ADReSS Challenge). Furthermore, we compare decoding the ASR hypothesis with a language model, which tends to correct non-standard sequences of words, against decoding without one. This aims at studying the language bias and obtaining more meaningful transcriptions based only on the acoustic information from patients. The proposed system combines acoustic features, based on prosody and voice quality, with lexical features based on the first occurrence of the most common words. The reported results show the effect ...
The use of peer-review methods in the evaluation of students' work is becoming increasingly popular, mainly due to pedagogical considerations. This paper argues for the integration of several statistical techniques in the peer-review process that would increase the efficiency and reliability of this process by providing the teacher with tools that make the necessary supervision task less daunting and more ...
Review sites include thousands of opinions about different products. Mining such user-generated content from these informal social networks may provide very relevant and timely information about the market, products, or companies. To go beyond basic statistics, we perform a low-level interpretation of the content by extracting meaningful terms and applying K-means clustering to the opinions. This produces an overview of the collective perceptions of the market, suggesting the strengths and weaknesses of a product, how it compares with others, and which concepts are associated with it in the consumer’s mind. In this paper we describe the techniques, resources, and tools developed to produce different perspectives that help understand what is going on behind the scenes in reviewer interactions.
The importance of collocations in the context of second language learning is generally acknowledged. Studies show that the “collocation density” in learner corpora is nearly the same as in native corpora, i.e., that learners use collocations as often as native speakers do, while the collocation error rate in learner corpora is about ten times as high as in native reference corpora. Therefore, computer-assisted language learning (CALL) could be of great help in supporting learners to better master collocations. However, surprisingly few works specifically address research on CALL-oriented collocation learning assistants that detect miscollocations in learners' writing and suggest corrections, or that let the learner verify whether a word co-occurrence is a correct collocation and obtain suggestions for its correction if it is determined to be a miscollocation. This disregard is likely to be, on the one hand, due to the focu...
This paper presents an implementation of the widely used speech analysis tool Praat as a web application with extended functionality for feature annotation. In particular, Praat on the Web addresses some of the central limitations of the original Praat tool and provides (i) enhanced visualization of annotations in a dedicated window for feature annotation at interval and point segments, (ii) dynamic script composition, exemplified with a modular prosody tagger, and (iii) portability and an operational web interface. Speech annotation tools with such functionality are key for exploring large corpora and designing modular pipelines.
Collocations in the sense of idiosyncratic lexical co-occurrences of two syntactically bound words traditionally pose a challenge to language learners and many Natural Language Processing (NLP) applications alike. Reliable ground truth (i.e., ideally manually compiled) resources are thus of high value. We present a manually compiled bilingual English–French collocation resource with 7,480 collocations in English and 6,733 in French. Each collocation is enriched with information that facilitates its downstream exploitation in NLP tasks such as machine translation, word sense disambiguation, natural language generation, relation classification, and so forth. Our proposed enrichment covers: the semantic category of the collocation (its lexical function), its vector space representation (for each individual word as well as their joint collocation embedding), a subcategorization pattern of both its elements, as well as their corresponding BabelNet id, and finally, indices of their occurr...
In this paper we present an overview of a UIMA-based system for Sentiment Analysis in hotel customer reviews. It extracts object-opinion/attribute-polarity triples using a variety of UIMA modules, some of which are adapted from freely available open-source components and others developed fully in-house. A Solr-based graphical interface is used to explore and visualize the collection of reviews and the opinions expressed in them.
Alzheimer's Disease (AD) is currently the most common form of dementia, and its automatic detection can help to identify symptoms at early stages, so that preventive actions can be carried out. Moreover, non-intrusive techniques based on spoken data are crucial for the development of automatic AD detection systems. In this light, this paper is presented as a contribution to the ADReSS Challenge, aiming at improving automatic AD detection from spontaneous speech. To this end, recordings from 108 participants, balanced for age, gender, and AD condition, were used as the training set for two different tasks: classification into AD/non-AD conditions, and regression over the Mini-Mental State Examination (MMSE) scores. Both tasks were performed by extracting 28 features from speech (based on prosody and voice quality) and 51 features from the transcriptions (based on lexical and turn-taking information). Our results achieved up to 87.5 % of classification accura...
This paper describes the system implemented by Fundació Barcelona Media (FBM) for classifying the polarity of opinion expressions in tweets and SMSs, supported by a UIMA pipeline for rich linguistic and sentiment annotations. FBM participated in the SEMEVAL 2013 Task 2 on polarity classification. It ranked 5th in Task A (constrained track) using an ensemble system combining ML algorithms with dictionary-based heuristics, and 7th in Task B (constrained track) using an SVM classifier with features derived from the linguistic annotations and some heuristics.
Paper presented at: LREC 2016, Tenth International Conference on Language Resources and Evaluation, held 23 to 28 May 2016 in Portorož, Slovenia.
Intelligent Components and Instruments for Control Applications, 1994
The use of artificial neural networks (ANN) for nonlinear system modeling is a field where much theoretical work remains to be done. A structured ANN that obtains neural models of nonlinear systems is presented. These neural models are Fourier-series based. To check the quality of the method, conventional difference equations are re-modeled via ANN and their respective inputs/outputs are compared. Their Fourier series expansions are also compared. Because the Fourier coefficients are optimal for series truncation, this makes it possible to estimate the quality of the models obtained. Preliminary tests give encouraging results.