Malah, 2003 - Google Patents
Time-domain algorithms for harmonic bandwidth reduction and time scaling of speech signalsMalah, 2003
View PDF- Document ID
- 15815725201522184645
- Author
- Malah D
- Publication year
- Publication venue
- IEEE Transactions on Acoustics, Speech, and Signal Processing
External Links
Snippet
Frequency scaling of speech signals by methods based on short-time Fourier analysis (STFA), analytic rooting, and harmonic compression using a bank of filters, is a complex operation which requires a large amount of computation in a digital implementation. It is …
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/003—Changing voice quality, e.g. pitch or formants
- G10L21/007—Changing voice quality, e.g. pitch or formants characterised by the process used
- G10L21/013—Adapting to target pitch
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00
- G10L25/03—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 characterised by the type of extracted parameters
- G10L25/18—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 characterised by the type of extracted parameters the extracted parameters being spectral information of each sub-band
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
- G10L21/038—Speech enhancement, e.g. noise reduction or echo cancellation using band spreading techniques
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signal, using source filter models or psychoacoustic analysis
- G10L19/02—Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signal, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders
- G10L19/0212—Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signal, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders using orthogonal transformation
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
- G10L21/0208—Noise filtering
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00
- G10L25/90—Pitch determination of speech signals
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00
- G10L25/93—Discriminating between voiced and unvoiced parts of speech signals
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L13/00—Speech synthesis; Text to speech systems
- G10L13/06—Elementary speech units used in speech synthesisers; Concatenation rules
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signal, using source filter models or psychoacoustic analysis
- G10L19/04—Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signal, using source filter models or psychoacoustic analysis using predictive techniques
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L13/00—Speech synthesis; Text to speech systems
- G10L13/02—Methods for producing synthetic speech; Speech synthesisers
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/02—Feature extraction for speech recognition; Selection of recognition unit
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| Malah | Time-domain algorithms for harmonic bandwidth reduction and time scaling of speech signals | |
| McAulay et al. | Speech analysis/synthesis based on a sinusoidal representation | |
| Moulines et al. | Pitch-synchronous waveform processing techniques for text-to-speech synthesis using diphones | |
| Evangelista | Pitch-synchronous wavelet representations of speech and music signals | |
| Talkin et al. | A robust algorithm for pitch tracking (RAPT) | |
| Smith et al. | PARSHL: An analysis/synthesis program for non-harmonic sounds based on a sinusoidal representation | |
| Goh et al. | Kalman-filtering speech enhancement method based on a voiced-unvoiced speech model | |
| Charpentier et al. | Pitch-synchronous waveform processing techniques for text-to-speech synthesis using diphones. | |
| Van Immerseel et al. | Pitch and voiced/unvoiced determination with an auditory model | |
| US5029509A (en) | Musical synthesizer combining deterministic and stochastic waveforms | |
| US4885790A (en) | Processing of acoustic waveforms | |
| US5933808A (en) | Method and apparatus for generating modified speech from pitch-synchronous segmented speech waveforms | |
| JPH0677200B2 (en) | Digital processor for speech synthesis of digitized text | |
| AU597573B2 (en) | Acoustic waveform processing | |
| Seneff | System to independently modify excitation and/or spectrum of speech waveform without explicit pitch extraction | |
| Syrdal et al. | TD-PSOLA versus harmonic plus noise model in diphone based speech synthesis | |
| Abe et al. | Sinusoidal model based on instantaneous frequency attractors | |
| Mittal et al. | Study of characteristics of aperiodicity in Noh voices | |
| Fitz et al. | A unified theory of time-frequency reassignment | |
| Meseguer | Speech analysis for automatic speech recognition | |
| d'Alessandro et al. | Effectiveness of a periodic and aperiodic decomposition method for analysis of voice sources | |
| Kadiri et al. | Analysis of instantaneous frequency components of speech signals for epoch extraction | |
| Ghitza | Auditory nerve representation criteria for speech analysis/synthesis | |
| CN120148484B (en) | Speech recognition method and device based on microcomputer | |
| Ferreira | An odd-DFT based approach to time-scale expansion of audio signals |