Specpatch: Human-in-the-loop adversarial audio spectrogram patch attack on speech recognition
Guo et al., 2022
- Document ID
- 18269816339703812489
- Author
- Guo H
- Wang Y
- Ivanov N
- Xiao L
- Yan Q
- Publication year
- 2022
- Publication venue
- Proceedings of the 2022 ACM SIGSAC conference on computer and communications security
Snippet
In this paper, we propose SpecPatch, a human-in-the-loop adversarial audio attack on automated speech recognition (ASR) systems. Existing adversarial audio attacks assume that users cannot notice the adversarial audio, and hence allow the successful …
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L15/18—Speech classification or search using natural language modelling
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/06—Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
- G10L15/065—Adaptation
- G10L15/07—Adaptation to the speaker
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/28—Constructional details of speech recognition systems
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/22—Procedures used during a speech recognition process, e.g. man-machine dialogue
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signal, using source filter models or psychoacoustic analysis
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F21/00—Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
Similar Documents
Publication | Title | Publication Date |
---|---|---|
Guo et al. | Specpatch: Human-in-the-loop adversarial audio spectrogram patch attack on speech recognition | |
Carlini et al. | Audio adversarial examples: Targeted attacks on speech-to-text | |
Zheng et al. | Black-box adversarial attacks on commercial speech platforms with minimal information | |
Schönherr et al. | Imperio: Robust over-the-air adversarial examples for automatic speech recognition systems | |
Li et al. | Advpulse: Universal, synchronization-free, and targeted audio adversarial attacks via subsecond perturbations | |
Chen et al. | Metamorph: Injecting inaudible commands into over-the-air voice controlled systems | |
Neekhara et al. | Universal adversarial perturbations for speech recognition systems | |
Carlini et al. | Hidden voice commands | |
Zhang et al. | Dangerous skills: Understanding and mitigating security risks of voice-controlled third-party functions on virtual personal assistant systems | |
US10546585B2 (en) | Localizing and verifying utterances by audio fingerprinting | |
Shi et al. | Audio-domain position-independent backdoor attack via unnoticeable triggers | |
US9263055B2 (en) | Systems and methods for three-dimensional audio CAPTCHA | |
Janicki et al. | An assessment of automatic speaker verification vulnerabilities to replay spoofing attacks | |
Ahmed et al. | Towards more robust keyword spotting for voice assistants | |
Wang et al. | Vsmask: Defending against voice synthesis attack via real-time predictive perturbation | |
Zhang et al. | Generating robust audio adversarial examples with temporal dependency | |
Zong et al. | Trojanmodel: A practical trojan attack against automatic speech recognition systems | |
WO2019173304A1 (en) | Method and system for enhancing security in a voice-controlled system | |
Li et al. | Inaudible adversarial perturbation: Manipulating the recognition of user speech in real time | |
Cheng et al. | UniAP: Protecting speech privacy with non-targeted universal adversarial perturbations | |
Guo et al. | Phantomsound: Black-box, query-efficient audio adversarial attack via split-second phoneme injection | |
Zong et al. | Targeted universal adversarial perturbations for automatic speech recognition | |
Testa et al. | Privacy against real-time speech emotion detection via acoustic adversarial evasion of machine learning | |
Wang et al. | A practical survey on emerging threats from ai-driven voice attacks: How vulnerable are commercial voice control systems? | |
Cai et al. | Vsvc: backdoor attack against keyword spotting based on voiceprint selection and voice conversion |