Togare et al., 2023

Machine Learning Approaches for Audio Classification in Video Surveillance: A Comparative Analysis of ANN vs. CNN vs. LSTM

Document ID: 14448484411148949246
Authors: Togare S; Andurkar A
Publication year: 2023
Publication venue: 2023 International Conference on Integration of Computational Intelligent System (ICICIS)

Snippet

This study aims to develop a robust audio classification system capable of accurately analyzing and categorizing audio events in video streams. Leveraging machine learning techniques, we construct a model that can recognize and classify various audio events …

Classifications

- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/62—Methods or arrangements for recognition using electronic means
- G06K9/6217—Design or setup of recognition systems and techniques; Extraction of features in feature space; Clustering techniques; Blind source separation
- G06K9/6232—Extracting features by transforming the feature space, e.g. multidimensional scaling; Mappings, e.g. subspace methods
- G06K9/6247—Extracting features by transforming the feature space, e.g. multidimensional scaling; Mappings, e.g. subspace methods based on an approximation criterion, e.g. principal component analysis
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/62—Methods or arrangements for recognition using electronic means
- G06K9/6267—Classification techniques
- G06K9/6279—Classification techniques relating to the number of classes
- G06K9/6284—Single class perspective, e.g. one-against-all classification; Novelty detection; Outlier detection
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/62—Methods or arrangements for recognition using electronic means
- G06K9/6267—Classification techniques
- G06K9/6268—Classification techniques relating to the classification paradigm, e.g. parametric or non-parametric approaches
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification
- G10L17/26—Recognition of special voice characteristics, e.g. for use in lie detectors; Recognition of animal voices
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/00624—Recognising scenes, i.e. recognition of a whole field of perception; recognising scene-specific objects
- G06K9/00771—Recognising scenes under surveillance, e.g. with Markovian modelling of scene activity
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/36—Image preprocessing, i.e. processing the image information without deciding about the identity of the image
- G06K9/46—Extraction of features or characteristics of the image
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/00221—Acquiring or recognising human faces, facial parts, facial sketches, facial expressions
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/02—Feature extraction for speech recognition; Selection of recognition unit
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00
- G10L25/48—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 specially adapted for particular use
- G10L25/51—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 specially adapted for particular use for comparison or discrimination
- G10L25/66—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 specially adapted for particular use for comparison or discrimination for extracting parameters related to health condition
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00
- G10L25/03—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 characterised by the type of extracted parameters
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/00496—Recognising patterns in signals and combinations thereof

Similar Documents

Publication	Publication Date	Title
US11386916B2 (en)	2022-07-12	Segmentation-based feature extraction for acoustic scene classification
Ntalampiras et al.	2009	On acoustic surveillance of hazardous situations
Carletti et al.	2013	Audio surveillance using a bag of aural words classifier
Abbasi et al.	2022	A large-scale benchmark dataset for anomaly detection and rare event classification for audio forensics
Huang et al.	2010	Scream detection for home applications
Choi et al.	2012	Selective background adaptation based abnormal acoustic event recognition for audio surveillance
Rahman et al.	2021	Hybrid system for automatic detection of gunshots in indoor environment
Vozáriková et al.	2011	Acoustic events detection using MFCC and MPEG-7 descriptors
Lin et al.	2012	Improving faster-than-real-time human acoustic event detection by saliency-maximized audio visualization
Rafi et al.	2024	Exploring classification of vehicles using horn sound analysis: a deep learning-based approach
Feroze et al.	2018	Sound event detection in real life audio using perceptual linear predictive feature with neural network
Mielke et al.	2013	Smartphone application for automatic classification of environmental sound
Korycki	2013	Time and spectral analysis methods with machine learning for the authentication of digital audio recordings
Rodríguez-Hidalgo et al.	2018	Echoic log-surprise: A multi-scale scheme for acoustic saliency detection
Kotus et al.	2016	Processing of acoustical data in a multimodal bank operating room surveillance system
Moritz et al.	2016	Acoustic scene classification using time-delay neural networks and amplitude modulation filter bank features
Ranasinghe et al.	2023	Enhanced frequency domain analysis for detecting wild elephants in Asia using acoustics
Patil et al.	2013	Task-driven attentional mechanisms for auditory scene recognition
Suhaimy et al.	2020	Classification of ambulance siren sound with MFCC-SVM
Dedeoglu et al.	2008	Surveillance using both video and audio
Uzkent et al.	2011	Pitch-range based feature extraction for audio surveillance systems
Iliev et al.	2021	Acoustic Event Detection and Sound Separation for security systems and IoT devices
Iqbal et al.	2023	Identification and Categorization of Unusual Internet of Vehicles Events in Noisy Audio
Ren et al.	2021	Learning target template for acoustic event detection from low-SNR training data