Learning normality is enough: a software-based mitigation against inaudible voice attacks
Li et al., 2023
- Document ID
- 5986533926879498456
- Author
- Li X
- Ji X
- Yan C
- Li C
- Li Y
- Zhang Z
- Xu W
- Publication year
- 2023
- Publication venue
- 32nd USENIX Security Symposium (USENIX Security 23)
Snippet
Inaudible voice attacks silently inject malicious voice commands into voice assistants to manipulate voice-controlled devices such as smart speakers. To alleviate such threats for both existing and future devices, this paper proposes NormDetect, a software-based …
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
- G10L21/0208—Noise filtering
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signal, using source filter models or psychoacoustic analysis
- G10L19/018—Audio watermarking, i.e. embedding inaudible data in the audio signal
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00
- G10L25/48—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 specially adapted for particular use
- G10L25/51—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 specially adapted for particular use for comparison or discrimination
- G10L25/66—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 specially adapted for particular use for comparison or discrimination for extracting parameters related to health condition
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/02—Feature extraction for speech recognition; Selection of recognition unit
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification
- G10L17/04—Training, enrolment or model building
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F21/00—Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G06F21/50—Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
- G06F21/55—Detecting local intrusion or implementing counter-measures
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/06—Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
Similar Documents
Publication | Title | Publication Date |
---|---|---|
Abdullah et al. | SoK: The faults in our ASRs: An overview of attacks against automatic speech recognition and speaker identification systems | |
Chen et al. | Who is real Bob? Adversarial attacks on speaker recognition systems | |
Ahmed et al. | Void: A fast and light voice liveness detection system | |
Koffas et al. | Can you hear it? backdoor attacks via ultrasonic triggers | |
Yan et al. | A survey on voice assistant security: Attacks and countermeasures | |
Li et al. | Adversarial music: Real world audio adversary against wake-word detection system | |
Li et al. | Learning normality is enough: a software-based mitigation against inaudible voice attacks | |
Zhang et al. | Hearing your voice is not enough: An articulatory gesture based liveness detection for voice authentication | |
KR102386155B1 (en) | How to protect your voice assistant from being controlled by machine learning-based silent commands | |
Wang et al. | Secure your voice: An oral airflow-based continuous liveness detection for voice assistants | |
Roy et al. | Listening through a vibration motor | |
US20200243067A1 (en) | Environment classifier for detection of laser-based audio injection attacks | |
Jiang et al. | Securing liveness detection for voice authentication via pop noises | |
Zong et al. | TrojanModel: A practical trojan attack against automatic speech recognition systems | |
CN116868265A (en) | Systems and methods for data enhancement and speech processing in dynamic acoustic environments | |
Liu et al. | Defending against microphone-based attacks with personalized noise | |
Guo et al. | PhantomSound: Black-box, query-efficient audio adversarial attack via split-second phoneme injection | |
He et al. | Fast and lightweight voice replay attack detection via time-frequency spectrum difference | |
Salvi et al. | Poliphone: A dataset for smartphone model identification from audio recordings | |
Mathur et al. | On robustness of cloud speech APIs: An early characterization | |
Shahid et al. | "Is this my president speaking?" Tamper-proofing Speech in Live Recordings | |
Cao et al. | LiveProbe: Exploring continuous voice liveness detection via phonemic energy response patterns | |
WO2025031170A1 (en) | Voiceprint recognition system evaluation method and apparatus, storage medium, and electronic device | |
US20240086759A1 (en) | System and Method for Watermarking Training Data for Machine Learning Models | |
Nagakrishnan et al. | Generic speech based person authentication system with genuine and spoofed utterances: different feature sets and models |