Shruti et al., 2023

A comparative study on Bengali speech sentiment analysis based on audio data

Document ID: 4448871846053908594
Author: Shruti A; Rifat R; Kamal M; Alam M
Publication year: 2023
Publication venue: 2023 IEEE International Conference on Big Data and Smart Computing (BigComp)

Snippet

Sentiment analysis is one of the most researched areas for every language. With the rise of AI, the use of speech is growing rapidly in every sector, and so is the importance of speech sentiment analysis. Despite being the seventh most spoken language in the world, Bengali …

Classifications

- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L15/18—Speech classification or search using natural language modelling
- G10L15/1822—Parsing for meaning understanding
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/06—Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
- G10L15/065—Adaptation
- G10L15/07—Adaptation to the speaker
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computer systems based on biological models
- G06N3/02—Computer systems based on biological models using neural network models
- G06N3/08—Learning methods
- G06N3/082—Learning methods modifying the architecture, e.g. adding or deleting nodes or connections, pruning
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00
- G10L25/48—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 specially adapted for particular use
- G10L25/51—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 specially adapted for particular use for comparison or discrimination
- G10L25/66—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 specially adapted for particular use for comparison or discrimination for extracting parameters related to health condition
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification
- G10L17/26—Recognition of special voice characteristics, e.g. for use in lie detectors; Recognition of animal voices
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/04—Segmentation; Word boundary detection
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N99/00—Subject matter not provided for in other groups of this subclass
- G06N99/005—Learning machines, i.e. computer in which a programme is changed according to experience gained by the machine itself during a complete run
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/62—Methods or arrangements for recognition using electronic means
- G06K9/6267—Classification techniques
- G06K9/6268—Classification techniques relating to the classification paradigm, e.g. parametric or non-parametric approaches
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00
- G10L25/03—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 characterised by the type of extracted parameters
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/62—Methods or arrangements for recognition using electronic means
- G06K9/6267—Classification techniques
- G06K9/6279—Classification techniques relating to the number of classes
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/62—Methods or arrangements for recognition using electronic means
- G06K9/6217—Design or setup of recognition systems and techniques; Extraction of features in feature space; Clustering techniques; Blind source separation
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/62—Methods or arrangements for recognition using electronic means
- G06K9/68—Methods or arrangements for recognition using electronic means using sequential comparisons of the image signals with a plurality of references in which the sequence of the image signals or the references is relevant, e.g. addressable memory
- G06K9/6807—Dividing the references in groups prior to recognition, the recognition taking place in steps; Selecting relevant dictionaries
- G06K9/6842—Dividing the references in groups prior to recognition, the recognition taking place in steps; Selecting relevant dictionaries according to the linguistic properties, e.g. English, German
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L13/00—Speech synthesis; Text to speech systems
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N5/00—Computer systems utilising knowledge based models

Similar Documents

Publication	Publication Date	Title
Jothimani et al.	2022	MFF-SAug: Multi feature fusion with spectrogram augmentation of speech emotion recognition using convolution neural network
Ho et al.	2020	Multimodal approach of speech emotion recognition using multi-level multi-head fusion attention-based recurrent neural network
Chatziagapi et al.	2019	Data augmentation using GANs for speech emotion recognition
Hazarika et al.	2018	Self-attentive feature-level fusion for multimodal emotion detection
Pan et al.	2024	Spanish MEACorpus 2023: A multimodal speech–text corpus for emotion analysis in Spanish from natural environments
Ram et al.	2020	Neural network based end-to-end query by example spoken term detection
Shruti et al.	2023	A comparative study on bengali speech sentiment analysis based on audio data
Tran et al.	2018	Ensemble application of ELM and GPU for real-time multimodal sentiment analysis
Valles et al.	2021	An audio processing approach using ensemble learning for speech-emotion recognition for children with ASD
Vasuki et al.	2012	Improving emotion recognition from speech using sensor fusion techniques
Kundu et al.	2024	Enhanced speech emotion recognition with efficient channel attention guided deep CNN-BiLSTM framework
Zhou et al.	2020	Speech Emotion Recognition with Discriminative Feature Learning
Sultana et al.	2025	BanSpEmo: A Bangla audio dataset for speech emotion recognition and its baseline evaluation
Dvoynikova et al.	2023	Bimodal sentiment and emotion classification with multi-head attention fusion of acoustic and linguistic information
Matsane et al.	2020	The use of automatic speech recognition in education for identifying attitudes of the speakers
Vlasenko et al.	2021	Fusion of acoustic and linguistic information using supervised autoencoder for improved emotion recognition
Nguyen et al.	2024	Enhancing speech emotion recognition through knowledge distillation
Shinde et al.	2023	Design and validation of HindiSER: speech emotion recognition dataset for Hindi language
Yin	2019	Steps towards end-to-end neural speaker diarization
Kokate et al.	2022	An Algorithmic Approach to Audio Processing and Emotion Mapping
Hossain	2022	Classification of bangla regional languages and recognition of artificial bangla speech using deep learning
Syamala et al.	2021	An Efficient Aspect based Sentiment Analysis Model by the Hybrid Fusion of Speech and Text Aspects
Karbhari et al.	2023	Age, Gender and Emotion Recognition by Speech Spectrograms Using Feature Learning
Luo et al.	2023	Multimodal emotion recognition based on 2D kernel density estimation for multiple labels fusion
Roken et al.	2022	Arabic multimodal emotion recognition using deep learning