In this paper, we classify speech into several emotional states based on the statistical properties of prosody features estimated on utterances extracted from the Danish Emotional Speech (DES) data collection and a subset of the Speech Under Simulated and Actual Stress (SUSAS) data collection. The proposed novelties are: 1) speeding up sequential floating feature selection by up to 60%, 2) fusing decisions taken on short speech segments to derive a single decision for longer utterances, and 3) demonstrating that gender and accent information reduce the classification error. Indeed, the classification error is lowered by 1% to 11% when decisions are combined over long phrases, and by 2% to 11% when gender and accent information is exploited. The total classification error reported on DES is 42.8%; the same figure on SUSAS is 46.3%. The reported human errors are 32.3% on DES and 42% on SUSAS. For comparison purposes, a...
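The second novelty, deriving one decision for a long utterance from decisions made on its short segments, can be illustrated with a minimal sketch. The abstract does not state which fusion rule the authors use, so the majority vote below is an assumption chosen for simplicity, and the function name `fuse_decisions` and the example labels are hypothetical.

```python
from collections import Counter


def fuse_decisions(segment_labels):
    """Fuse per-segment emotion labels into one utterance-level label.

    Assumption: plain majority vote. The abstract does not specify the
    fusion rule, so this is only one plausible choice. Ties are broken
    in favour of the label that appears earliest in the sequence.
    """
    if not segment_labels:
        raise ValueError("need at least one segment decision")
    counts = Counter(segment_labels)
    best = max(counts.values())
    # Walk the segments in order so that ties resolve deterministically.
    for label in segment_labels:
        if counts[label] == best:
            return label


# Hypothetical per-segment decisions for one long utterance.
segment_labels = ["anger", "neutral", "anger", "sadness", "anger"]
utterance_label = fuse_decisions(segment_labels)
```

A vote of this kind lets isolated segment-level mistakes be outvoted, which is consistent with the error reduction the abstract reports for long phrases, although the actual rule behind those figures may differ.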
Thirty-two emotional speech databases are reviewed. Each database consists of a corpus of human speech pronounced under different emotional conditions. A basic description of each database and its applications is provided. The conclusion of this study is that automated emotion recognition cannot achieve a correct classification that exceeds 50% for the four basic emotions, i.e., twice as much as
This Special Issue will collect interdisciplinary efforts across the IT, sociology, and humanities sectors on the topic of social virtual reality environments in education. The recent pandemic, along with migration issues, has increased the need for high-quality distance-learning educational services in which video telepresence is not enough and needs to be enhanced with immersive and native collaboration features. Virtual reality is the most suitable technology to achieve this goal, but it suffers from several drawbacks, such as increased hardware cost, the need for educators to have programming skills to author such VR environments, and an increased cognitive and navigational burden for learners. However, the pressure exerted by COVID-19 will result in public investments that will lead to significant research in the field and innovative solutions. In the past decade, virtual labs have been widely used to minimize the cost of accessing expensive equipment or to avoid human risk, e.g., chemistry equipment or surgical environments, respectively. In the future, however, innovations will seek to replace real-space activities with social VR environments, targeting primary, secondary, vocational, and job training activities. This Special Issue will supplement the existing literature with results focused on social VR technology issues where the literature is still limited, such as quality of experience, authoring tools, accessibility, efficiency, and ethics.