[go: up one dir, main page]

Togare et al., 2023 - Google Patents

Machine Learning Approaches for Audio Classification in Video Surveillance: A Comparative Analysis of ANN vs. CNN vs. LSTM

Togare et al., 2023

Document ID
14448484411148949246
Author
Togare S
Andurkar A
Publication year
Publication venue
2023 International Conference on Integration of Computational Intelligent System (ICICIS)

External Links

Snippet

This study aims to develop a robust audio classification system capable of accurately analyzing and categorizing audio events in video streams. Leveraging machine learning techniques, we construct a model that can recognize and classify various audio events …
Continue reading at ieeexplore.ieee.org (other versions)

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06KRECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
    • G06K9/00Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
    • G06K9/62Methods or arrangements for recognition using electronic means
    • G06K9/6217Design or setup of recognition systems and techniques; Extraction of features in feature space; Clustering techniques; Blind source separation
    • G06K9/6232Extracting features by transforming the feature space, e.g. multidimensional scaling; Mappings, e.g. subspace methods
    • G06K9/6247Extracting features by transforming the feature space, e.g. multidimensional scaling; Mappings, e.g. subspace methods based on an approximation criterion, e.g. principal component analysis
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06KRECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
    • G06K9/00Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
    • G06K9/62Methods or arrangements for recognition using electronic means
    • G06K9/6267Classification techniques
    • G06K9/6279Classification techniques relating to the number of classes
    • G06K9/6284Single class perspective, e.g. one-against-all classification; Novelty detection; Outlier detection
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06KRECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
    • G06K9/00Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
    • G06K9/62Methods or arrangements for recognition using electronic means
    • G06K9/6267Classification techniques
    • G06K9/6268Classification techniques relating to the classification paradigm, e.g. parametric or non-parametric approaches
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L17/00Speaker identification or verification
    • G10L17/26Recognition of special voice characteristics, e.g. for use in lie detectors; Recognition of animal voices
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06KRECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
    • G06K9/00Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
    • G06K9/00624Recognising scenes, i.e. recognition of a whole field of perception; recognising scene-specific objects
    • G06K9/00771Recognising scenes under surveillance, e.g. with Markovian modelling of scene activity
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06KRECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
    • G06K9/00Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
    • G06K9/36Image preprocessing, i.e. processing the image information without deciding about the identity of the image
    • G06K9/46Extraction of features or characteristics of the image
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06KRECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
    • G06K9/00Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
    • G06K9/00221Acquiring or recognising human faces, facial parts, facial sketches, facial expressions
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00Speech recognition
    • G10L15/02Feature extraction for speech recognition; Selection of recognition unit
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00Speech recognition
    • G10L15/08Speech classification or search
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00
    • G10L25/48Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 specially adapted for particular use
    • G10L25/51Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 specially adapted for particular use for comparison or discrimination
    • G10L25/66Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 specially adapted for particular use for comparison or discrimination for extracting parameters related to health condition
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00
    • G10L25/03Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 characterised by the type of extracted parameters
    • GPHYSICS
    • G06COMPUTING; CALCULATING; COUNTING
    • G06KRECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
    • G06K9/00Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
    • G06K9/00496Recognising patterns in signals and combinations thereof

Similar Documents

Publication Publication Date Title
US11386916B2 (en) Segmentation-based feature extraction for acoustic scene classification
Ntalampiras et al. On acoustic surveillance of hazardous situations
Carletti et al. Audio surveillance using a bag of aural words classifier
Abbasi et al. A large-scale benchmark dataset for anomaly detection and rare event classification for audio forensics
Huang et al. Scream detection for home applications
Choi et al. Selective background adaptation based abnormal acoustic event recognition for audio surveillance
Rahman et al. Hybrid system for automatic detection of gunshots in indoor environment
Vozáriková et al. Acoustic events detection using MFCC and MPEG-7 descriptors
Lin et al. Improving faster-than-real-time human acoustic event detection by saliency-maximized audio visualization
Togare et al. Machine Learning Approaches for Audio Classification in Video Surveillance: A Comparative Analysis of ANN vs. CNN vs. LSTM
Rafi et al. Exploring classification of vehicles using horn sound analysis: a deep learning-based approach
Feroze et al. Sound event detection in real life audio using perceptual linear predictive feature with neural network
Mielke et al. Smartphone application for automatic classification of environmental sound
Korycki Time and spectral analysis methods with machine learning for the authentication of digital audio recordings
Rodríguez-Hidalgo et al. Echoic log-surprise: A multi-scale scheme for acoustic saliency detection
Kotus et al. Processing of acoustical data in a multimodal bank operating room surveillance system
Moritz et al. Acoustic scene classification using time-delay neural networks and amplitude modulation filter bank features
Ranasinghe et al. Enhanced frequency domain analysis for detecting wild elephants in Asia using acoustics
Patil et al. Task-driven attentional mechanisms for auditory scene recognition
Suhaimy et al. Classification of ambulance siren sound with MFCC-SVM
Dedeoglu et al. Surveillance using both video and audio
Uzkent et al. Pitch-range based feature extraction for audio surveillance systems
Iliev et al. Acoustic Event Detection and Sound Separation for security systems and IoT devices
Iqbal et al. Identification and Categorization of Unusual Internet of Vehicles Events in Noisy Audio
Ren et al. Learning target template for acoustic event detection from low-SNR training data