Li et al., 2023

Learning normality is enough: a software-based mitigation against inaudible voice attacks

Document ID
5986533926879498456
Author
Li X
Ji X
Yan C
Li C
Li Y
Zhang Z
Xu W
Publication year
2023
Publication venue
32nd USENIX Security Symposium (USENIX Security 23)

Snippet

Inaudible voice attacks silently inject malicious voice commands into voice assistants to manipulate voice-controlled devices such as smart speakers. To alleviate such threats for both existing and future devices, this paper proposes NormDetect, a software-based …

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0208 Noise filtering
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signal, using source filter models or psychoacoustic analysis
    • G10L19/018 Audio watermarking, i.e. embedding inaudible data in the audio signal
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00
    • G10L25/48 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 specially adapted for particular use
    • G10L25/51 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 specially adapted for particular use for comparison or discrimination
    • G10L25/66 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 specially adapted for particular use for comparison or discrimination for extracting parameters related to health condition
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/02 Feature extraction for speech recognition; Selection of recognition unit
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L17/00 Speaker identification or verification
    • G10L17/04 Training, enrolment or model building
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/50 Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
    • G06F21/55 Detecting local intrusion or implementing counter-measures
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/08 Speech classification or search
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/06 Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice

Similar Documents

Publication Publication Date Title
Abdullah et al. SoK: The faults in our ASRs: An overview of attacks against automatic speech recognition and speaker identification systems
Chen et al. Who is real Bob? Adversarial attacks on speaker recognition systems
Ahmed et al. Void: A fast and light voice liveness detection system
Koffas et al. Can you hear it? backdoor attacks via ultrasonic triggers
Yan et al. A survey on voice assistant security: Attacks and countermeasures
Li et al. Adversarial music: Real world audio adversary against wake-word detection system
Li et al. Learning normality is enough: a software-based mitigation against inaudible voice attacks
Zhang et al. Hearing your voice is not enough: An articulatory gesture based liveness detection for voice authentication
KR102386155B1 (en) How to protect your voice assistant from being controlled by machine learning-based silent commands
Wang et al. Secure your voice: An oral airflow-based continuous liveness detection for voice assistants
Roy et al. Listening through a vibration motor
US20200243067A1 (en) Environment classifier for detection of laser-based audio injection attacks
Jiang et al. Securing liveness detection for voice authentication via pop noises
Zong et al. TrojanModel: A practical trojan attack against automatic speech recognition systems
CN116868265A (en) Systems and methods for data enhancement and speech processing in dynamic acoustic environments
Liu et al. Defending against microphone-based attacks with personalized noise
Guo et al. PhantomSound: Black-box, query-efficient audio adversarial attack via split-second phoneme injection
He et al. Fast and lightweight voice replay attack detection via time-frequency spectrum difference
Salvi et al. Poliphone: A dataset for smartphone model identification from audio recordings
Mathur et al. On robustness of cloud speech APIs: An early characterization
Shahid et al. "Is this my president speaking?" Tamper-proofing Speech in Live Recordings
Cao et al. LiveProbe: Exploring continuous voice liveness detection via phonemic energy response patterns
WO2025031170A1 (en) Voiceprint recognition system evaluation method and apparatus, storage medium, and electronic device
US20240086759A1 (en) System and Method for Watermarking Training Data for Machine Learning Models
Nagakrishnan et al. Generic speech based person authentication system with genuine and spoofed utterances: different feature sets and models