Togare et al., 2023 - Google Patents
Machine Learning Approaches for Audio Classification in Video Surveillance: A Comparative Analysis of ANN vs. CNN vs. LSTMTogare et al., 2023
- Document ID
- 14448484411148949246
- Author
- Togare S
- Andurkar A
- Publication year
- Publication venue
- 2023 International Conference on Integration of Computational Intelligent System (ICICIS)
External Links
Snippet
This study aims to develop a robust audio classification system capable of accurately analyzing and categorizing audio events in video streams. Leveraging machine learning techniques, we construct a model that can recognize and classify various audio events …
- 238000013459 approach 0 title abstract description 7
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/62—Methods or arrangements for recognition using electronic means
- G06K9/6217—Design or setup of recognition systems and techniques; Extraction of features in feature space; Clustering techniques; Blind source separation
- G06K9/6232—Extracting features by transforming the feature space, e.g. multidimensional scaling; Mappings, e.g. subspace methods
- G06K9/6247—Extracting features by transforming the feature space, e.g. multidimensional scaling; Mappings, e.g. subspace methods based on an approximation criterion, e.g. principal component analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/62—Methods or arrangements for recognition using electronic means
- G06K9/6267—Classification techniques
- G06K9/6279—Classification techniques relating to the number of classes
- G06K9/6284—Single class perspective, e.g. one-against-all classification; Novelty detection; Outlier detection
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/62—Methods or arrangements for recognition using electronic means
- G06K9/6267—Classification techniques
- G06K9/6268—Classification techniques relating to the classification paradigm, e.g. parametric or non-parametric approaches
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification
- G10L17/26—Recognition of special voice characteristics, e.g. for use in lie detectors; Recognition of animal voices
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/00624—Recognising scenes, i.e. recognition of a whole field of perception; recognising scene-specific objects
- G06K9/00771—Recognising scenes under surveillance, e.g. with Markovian modelling of scene activity
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/36—Image preprocessing, i.e. processing the image information without deciding about the identity of the image
- G06K9/46—Extraction of features or characteristics of the image
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/00221—Acquiring or recognising human faces, facial parts, facial sketches, facial expressions
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/02—Feature extraction for speech recognition; Selection of recognition unit
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00
- G10L25/48—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 specially adapted for particular use
- G10L25/51—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 specially adapted for particular use for comparison or discrimination
- G10L25/66—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 specially adapted for particular use for comparison or discrimination for extracting parameters related to health condition
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00
- G10L25/03—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 characterised by the type of extracted parameters
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/00496—Recognising patterns in signals and combinations thereof
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US11386916B2 (en) | Segmentation-based feature extraction for acoustic scene classification | |
Ntalampiras et al. | On acoustic surveillance of hazardous situations | |
Carletti et al. | Audio surveillance using a bag of aural words classifier | |
Abbasi et al. | A large-scale benchmark dataset for anomaly detection and rare event classification for audio forensics | |
Huang et al. | Scream detection for home applications | |
Choi et al. | Selective background adaptation based abnormal acoustic event recognition for audio surveillance | |
Rahman et al. | Hybrid system for automatic detection of gunshots in indoor environment | |
Vozáriková et al. | Acoustic events detection using MFCC and MPEG-7 descriptors | |
Lin et al. | Improving faster-than-real-time human acoustic event detection by saliency-maximized audio visualization | |
Togare et al. | Machine Learning Approaches for Audio Classification in Video Surveillance: A Comparative Analysis of ANN vs. CNN vs. LSTM | |
Rafi et al. | Exploring classification of vehicles using horn sound analysis: a deep learning-based approach | |
Feroze et al. | Sound event detection in real life audio using perceptual linear predictive feature with neural network | |
Mielke et al. | Smartphone application for automatic classification of environmental sound | |
Korycki | Time and spectral analysis methods with machine learning for the authentication of digital audio recordings | |
Rodríguez-Hidalgo et al. | Echoic log-surprise: A multi-scale scheme for acoustic saliency detection | |
Kotus et al. | Processing of acoustical data in a multimodal bank operating room surveillance system | |
Moritz et al. | Acoustic scene classification using time-delay neural networks and amplitude modulation filter bank features | |
Ranasinghe et al. | Enhanced frequency domain analysis for detecting wild elephants in Asia using acoustics | |
Patil et al. | Task-driven attentional mechanisms for auditory scene recognition | |
Suhaimy et al. | Classification of ambulance siren sound with MFCC-SVM | |
Dedeoglu et al. | Surveillance using both video and audio | |
Uzkent et al. | Pitch-range based feature extraction for audio surveillance systems | |
Iliev et al. | Acoustic Event Detection and Sound Separation for security systems and IoT devices | |
Iqbal et al. | Identification and Categorization of Unusual Internet of Vehicles Events in Noisy Audio | |
Ren et al. | Learning target template for acoustic event detection from low-SNR training data |