Trowitzsch et al., 2017 - Google Patents
Robust detection of environmental sounds in binaural auditory scenes
- Document ID: 17894340640657088902
- Authors: Trowitzsch I, Mohr J, Kashef Y, Obermayer K
- Publication year: 2017
- Publication venue: IEEE/ACM Transactions on Audio, Speech, and Language Processing
Snippet
In realistic acoustic scenes, the detection of particular types of environmental sounds is often impeded by the simultaneous presence of multiple sound sources. In this work, we use simulations to systematically investigate the impact of superimposed distractor sources on …
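The snippet describes simulations in which distractor sources are superimposed on a target sound in a binaural scene. The paper's own pipeline (binaural rendering, auditory features, classifiers) is not reproduced here; the sketch below only illustrates the basic superposition step of mixing a target and a distractor at a controlled energy ratio. The function name, placeholder signals, and SNR-based scaling are assumptions made for illustration, not details taken from the paper.

```python
import numpy as np

def mix_at_snr(target, distractor, snr_db):
    """Superimpose a distractor on a binaural target signal at a given SNR.

    target, distractor: arrays of shape (n_samples, 2), i.e. left/right channels.
    snr_db: desired target-to-distractor energy ratio in dB (assumed convention).
    """
    # Trim both signals to a common length (a simplification for the sketch).
    n = min(len(target), len(distractor))
    target, distractor = target[:n], distractor[:n]

    # Scale the distractor so the energy ratio matches the requested SNR.
    p_target = np.mean(target ** 2)
    p_distractor = np.mean(distractor ** 2)
    gain = np.sqrt(p_target / (p_distractor * 10 ** (snr_db / 10.0)))
    return target + gain * distractor

# Example: a tonal placeholder "target" buried in noise at 0 dB SNR.
rng = np.random.default_rng(0)
fs = 16000
t = np.arange(fs) / fs
target = np.stack([np.sin(2 * np.pi * 2000 * t)] * 2, axis=1)  # placeholder target
distractor = rng.standard_normal((fs, 2))                      # placeholder distractor
scene = mix_at_snr(target, distractor, snr_db=0.0)
```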
Classifications
- G06K9/6261—Design or setup of recognition systems and techniques; Extraction of features in feature space; Clustering techniques; Blind source separation partitioning the feature space
- G06K9/6268—Classification techniques relating to the classification paradigm, e.g. parametric or non-parametric approaches
- G10L17/00—Speaker identification or verification
- G06K9/46—Extraction of features or characteristics of the image
- G10H2210/031—Musical analysis, i.e. isolation, extraction or identification of musical elements or musical parameters from a raw acoustic signal or from an encoded audio signal
- G10L15/02—Feature extraction for speech recognition; Selection of recognition unit
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00
- G10L19/00—Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signal, using source filter models or psychoacoustic analysis
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
Similar Documents
Publication | Title
---|---
Pak et al. | Sound localization based on phase difference enhancement using deep neural networks
Barchiesi et al. | Acoustic scene classification: Classifying environments from the sounds they produce
Markov et al. | Music genre and emotion recognition using Gaussian processes
Rafii et al. | Repeating pattern extraction technique (REPET): A simple method for music/voice separation
Shi et al. | On the importance of phase in human speech recognition
Wang et al. | Sound event recognition using auditory-receptive-field binary pattern and hierarchical-diving deep belief network
Lim et al. | Music-genre classification system based on spectro-temporal features and feature selection
Tengtrairat et al. | Single-channel blind separation using pseudo-stereo mixture and complex 2-D histogram
Rosner et al. | Classification of music genres based on music separation into harmonic and drum components
Trowitzsch et al. | Robust detection of environmental sounds in binaural auditory scenes
Weninger et al. | Recognition of nonprototypical emotions in reverberated and noisy speech by nonnegative matrix factorization
Schröder et al. | Classifier architectures for acoustic scenes and events: implications for DNNs, TDNNs, and perceptual features from DCASE 2016
Trowitzsch et al. | Joining sound event detection and localization through spatial segregation
Marxer et al. | Low-latency instrument separation in polyphonic audio using timbre models
Martin-Morato et al. | Adaptive mid-term representations for robust audio event classification
Yang et al. | Domain agnostic few-shot learning for speaker verification
Abidin et al. | Local binary pattern with random forest for acoustic scene classification
Podwinska et al. | Acoustic event detection from weakly labeled data using auditory salience
Rosner et al. | Influence of low-level features extracted from rhythmic and harmonic sections on music genre classification
Liu et al. | Sound event classification based on frequency-energy feature representation and two-stage data dimension reduction
Varzandeh et al. | Speech-aware binaural DOA estimation utilizing periodicity and spatial features in convolutional neural networks
Sarno et al. | Music fingerprinting based on bhattacharya distance for song and cover song recognition
Sandhan et al. | Audio bank: A high-level acoustic signal representation for audio event recognition
Vargas et al. | A compressed encoding scheme for approximate TDOA estimation
Shirali-Shahreza et al. | Fast and scalable system for automatic artist identification