VOICe: A sound event detection dataset for generalizable domain adaptation
Gharib et al., 2019
- Document ID
- 3371341711941907316
- Author
- Gharib S
- Drossos K
- Fagerlund E
- Virtanen T
- Publication year
- 2019
- Publication venue
- arXiv preprint arXiv:1911.07098
Snippet
The performance of sound event detection methods can degrade significantly when they are used in unseen conditions (e.g., recording devices, ambient noise). Domain adaptation is a promising way to tackle this problem. In this paper, we present VOICe, the first dataset for the …
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification
- G10L17/26—Recognition of special voice characteristics, e.g. for use in lie detectors; Recognition of animal voices
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification
- G10L17/04—Training, enrolment or model building
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00
- G10L25/78—Detection of presence or absence of voice signals
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/36—Image preprocessing, i.e. processing the image information without deciding about the identity of the image
- G06K9/46—Extraction of features or characteristics of the image
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/00624—Recognising scenes, i.e. recognition of a whole field of perception; recognising scene-specific objects
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US11386916B2 (en) | | Segmentation-based feature extraction for acoustic scene classification |
Mesaros et al. | | DCASE 2017 challenge setup: Tasks, datasets and baseline system |
CN113140226B (en) | | An Acoustic Event Labeling and Recognition Method Using Dual Token Labels |
Kao et al. | | A comparison of pooling methods on LSTM models for rare acoustic event classification |
Ntalampiras | | Universal background modeling for acoustic surveillance of urban traffic |
Mesaros et al. | | Assessment of human and machine performance in acoustic scene classification: DCASE 2016 case study |
Lagrange et al. | | The bag-of-frames approach: a not so sufficient model for urban soundscapes |
Tran et al. | | Audio-vision emergency vehicle detection |
Gharib et al. | | VOICe: A sound event detection dataset for generalizable domain adaptation |
Waldekar et al. | | Two-level fusion-based acoustic scene classification |
Cartwright et al. | | Tricycle: Audio representation learning from sensor network data using self-supervision |
Shabbir et al. | | Smart city traffic management: Acoustic-based vehicle detection using stacking-based ensemble deep learning approach |
Hou et al. | | Transfer learning for improving singing-voice detection in polyphonic instrumental music |
Soni et al. | | Automatic audio event recognition schemes for context-aware audio computing devices |
CN114898737A (en) | | Acoustic event detection method, apparatus, electronic device and storage medium |
Luitel et al. | | Sound event detection in urban soundscape using two-level classification |
Jeong et al. | | Constructing an Audio Dataset of Construction Equipment from Online Sources for Audio-Based Recognition |
Kawale et al. | | Analysis and simulation of sound classification system using machine learning techniques |
CN114125368B (en) | | Conference audio participant association method and device and electronic equipment |
Martin-Morato et al. | | On the robustness of deep features for audio event classification in adverse environments |
Chen et al. | | Overlapped Speech Detection Based on Spectral and Spatial Feature Fusion |
Tiwari et al. | | Evaluating robustness of you only hear once (YOHO) algorithm on noisy audios in the voice dataset |
Kovalenko et al. | | Analysis of the sound event detection methods and systems |
Marković et al. | | Reverberation-based feature extraction for acoustic scene classification |
Kek et al. | | Acoustic scene classification using bilinear pooling on time-liked and frequency-liked convolution neural network |