Malah, 2003 - Google Patents

Time-domain algorithms for harmonic bandwidth reduction and time scaling of speech signals

Document ID: 15815725201522184645
Author: Malah D
Publication year: 2003
Publication venue: IEEE Transactions on Acoustics, Speech, and Signal Processing

Snippet

Frequency scaling of speech signals by methods based on short-time Fourier analysis (STFA), analytic rooting, and harmonic compression using a bank of filters, is a complex operation which requires a large amount of computation in a digital implementation. It is …

Classifications

- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/003—Changing voice quality, e.g. pitch or formants
- G10L21/007—Changing voice quality, e.g. pitch or formants characterised by the process used
- G10L21/013—Adapting to target pitch
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00
- G10L25/03—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 characterised by the type of extracted parameters
- G10L25/18—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 characterised by the type of extracted parameters the extracted parameters being spectral information of each sub-band
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
- G10L21/038—Speech enhancement, e.g. noise reduction or echo cancellation using band spreading techniques
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signal, using source filter models or psychoacoustic analysis
- G10L19/02—Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signal, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders
- G10L19/0212—Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signal, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders using orthogonal transformation
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
- G10L21/0208—Noise filtering
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00
- G10L25/90—Pitch determination of speech signals
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00
- G10L25/93—Discriminating between voiced and unvoiced parts of speech signals
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L13/00—Speech synthesis; Text to speech systems
- G10L13/06—Elementary speech units used in speech synthesisers; Concatenation rules
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signal, using source filter models or psychoacoustic analysis
- G10L19/04—Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signal, using source filter models or psychoacoustic analysis using predictive techniques
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L13/00—Speech synthesis; Text to speech systems
- G10L13/02—Methods for producing synthetic speech; Speech synthesisers
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/02—Feature extraction for speech recognition; Selection of recognition unit

Similar Documents

Publication	Publication Date	Title
Malah	2003	Time-domain algorithms for harmonic bandwidth reduction and time scaling of speech signals
McAulay et al.	2003	Speech analysis/synthesis based on a sinusoidal representation
Moulines et al.	1990	Pitch-synchronous waveform processing techniques for text-to-speech synthesis using diphones
Evangelista	2002	Pitch-synchronous wavelet representations of speech and music signals
Talkin et al.	1995	A robust algorithm for pitch tracking (RAPT)
Smith et al.	1987	PARSHL: An analysis/synthesis program for non-harmonic sounds based on a sinusoidal representation
Goh et al.	1999	Kalman-filtering speech enhancement method based on a voiced-unvoiced speech model
Charpentier et al.	1989	Pitch-synchronous waveform processing techniques for text-to-speech synthesis using diphones
Van Immerseel et al.	1992	Pitch and voiced/unvoiced determination with an auditory model
US5029509A (en)	1991-07-09	Musical synthesizer combining deterministic and stochastic waveforms
US4885790A (en)	1989-12-05	Processing of acoustic waveforms
US5933808A (en)	1999-08-03	Method and apparatus for generating modified speech from pitch-synchronous segmented speech waveforms
JPH0677200B2 (en)	1994-09-28	Digital processor for speech synthesis of digitized text
AU597573B2 (en)	1990-06-07	Acoustic waveform processing
Seneff	2003	System to independently modify excitation and/or spectrum of speech waveform without explicit pitch extraction
Syrdal et al.	1998	TD-PSOLA versus harmonic plus noise model in diphone based speech synthesis
Abe et al.	2006	Sinusoidal model based on instantaneous frequency attractors
Mittal et al.	2015	Study of characteristics of aperiodicity in Noh voices
Fitz et al.	2009	A unified theory of time-frequency reassignment
Meseguer	2009	Speech analysis for automatic speech recognition
d'Alessandro et al.	2002	Effectiveness of a periodic and aperiodic decomposition method for analysis of voice sources
Kadiri et al.	2023	Analysis of instantaneous frequency components of speech signals for epoch extraction
Ghitza	2003	Auditory nerve representation criteria for speech analysis/synthesis
CN120148484B (en)	2025-07-11	Speech recognition method and device based on microcomputer
Ferreira	2002	An odd-DFT based approach to time-scale expansion of audio signals