CN101266798B - A method and device for gain smoothing in voice decoder - Google Patents
- Publication number
- CN101266798B (application CN200710088039A)
- Authority
- CN
- China
- Prior art keywords
- speech
- frame
- fixed codebook
- codebook gain
- current
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L19/04—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
- G10L19/08—Determination or coding of the excitation function; Determination or coding of the long-term prediction parameters
- G10L19/083—Determination or coding of the excitation function; Determination or coding of the long-term prediction parameters the excitation function being an excitation gain
Landscapes
- Engineering & Computer Science (AREA)
- Computational Linguistics (AREA)
- Signal Processing (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Physics & Mathematics (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Compression, Expansion, Code Conversion, And Decoders (AREA)
Abstract
The invention discloses a method for gain smoothing in a speech decoder. The method comprises: calculating a speech parameter change factor for the current speech frame and applying an initialization correction to the fixed codebook gain of the current speech frame; determining the state of the current speech frame according to the speech parameter change factor; and smoothing the fixed codebook gain of the current speech frame using the initialization-corrected fixed codebook gain and the smoothing factor set for speech frames in that state. The invention also discloses a device for gain smoothing in speech decoding. Embodiments of the invention do not need to record the fixed codebook gains of multiple consecutive frames, so storage complexity is low. They also do not need to calculate both a stability factor and a voicing factor, since only the speech parameter change factor is required, so algorithmic complexity is reduced as well.
Description
Technical Field
The present invention relates to the field of speech decoding, and more specifically to a method and device for gain smoothing in a speech decoder.
Background
As shown in Figure 1, in a speech communication system the encoder encodes the input speech signal and sends the encoded bitstream over a communication channel; the decoder decodes the bitstream received from the channel and synthesizes a speech signal from it.
An encoder that encodes speech signals is referred to below as a speech encoder. A coding principle commonly used by speech encoders is Algebraic Code Excited Linear Prediction (ACELP); encoders of this kind include G.729, EVRC, AMR, AMR-WB, and AMR-WB+. G.729 is a speech coding standard of the International Telecommunication Union (ITU-T); EVRC is a speech coding standard of the 3rd Generation Partnership Project 2 (3GPP2); AMR, AMR-WB, and AMR-WB+ are speech coding standards of the 3rd Generation Partnership Project (3GPP).
The bitstreams generated by ACELP-based speech encoders are organized in speech frames; some codecs, such as AMR, divide each frame into several subframes and operate per subframe. Each frame of input data, typically tens of milliseconds of PCM data, is encoded by the speech encoder at the sending end into a set of parameters, which are generally quantized and transmitted. The decoder at the receiving end re-synthesizes a speech signal from these parameters, commonly as PCM data.
The parameters of a speech frame generated by an ACELP-based speech encoder generally include spectral parameters, adaptive codebook parameters, algebraic codebook parameters, the adaptive codebook gain, and the algebraic codebook gain.
Because the encoding process introduces quantization noise, which degrades speech quality, the decoder generally applies some post-processing when re-synthesizing the speech signal, such as fixed codebook gain smoothing and periodicity enhancement, to improve the quality of the synthesized speech. The purpose of fixed codebook gain smoothing is to avoid unnatural energy fluctuations in steady-state speech.
Speech decoders currently use two methods to smooth the fixed codebook gain. One smooths the gain based on the stability of the short-term LSP (Line Spectral Pair); the other smooths the gain based on the stability and voicing characteristics of the speech.
The steps for smoothing the fixed codebook gain based on short-term LSP stability are as follows:
(1) For each frame, calculate the average LSP.
[equation not preserved in the source text]
The formula uses the LSP of the 4th subframe of the current frame; q(n-1) is the average LSP of the previous frame, and q(n) is the LSP of the current frame.
(2) For subframe m, calculate the difference $\mathrm{diff}_m$ between the average LSP vector and the LSP of subframe m.
(3) Calculate the smoothing factor $k_m$:
$k_m = \min(0.25, \max(0, \mathrm{diff}_m - 0.4)) / 0.25$
(4) Calculate the average fixed codebook gain of the most recent 5 subframes [equation not preserved in the source text].
(5) Smooth the fixed codebook gain of the current subframe.
The disadvantage of this method of gain smoothing is that the fixed codebook gains of several past subframes must be recorded in order to form the average, so storage complexity is relatively high.
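The smoothing factor of step (3) is the only formula of this prior-art method that survives in the text, and it can be sketched directly. The function name below is illustrative and not from the patent; the constants 0.4 and 0.25 are taken from the formula above.

```python
def lsp_smoothing_factor(diff_m: float) -> float:
    """k_m = min(0.25, max(0, diff_m - 0.4)) / 0.25, as in step (3).

    The result is clamped to [0, 1]: it is 0 while diff_m <= 0.4
    and saturates at 1 once diff_m >= 0.65.
    """
    return min(0.25, max(0.0, diff_m - 0.4)) / 0.25


if __name__ == "__main__":
    for diff in (0.2, 0.5, 0.9):
        print(diff, lsp_smoothing_factor(diff))
```

The clamping means small LSP differences (a stable spectrum) give a factor of 0 and large differences give a factor of 1, with a linear ramp in between.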
The steps for smoothing the fixed codebook gain based on the stability and voicing characteristics of the speech are as follows:
(1) Calculate the voicing factor $\lambda = 0.5(1 - r_v)$, with $r_v = (E_v - E_c)/(E_v + E_c)$,
where $E_v$ is the energy of the adaptive codebook and $E_c$ is the energy of the fixed codebook.
(2) Calculate the stability factor $\theta$ and limit it to the range $0 \le \theta \le 1$. The calculation formula is [equation not preserved in the source text],
where isf_new is the ISF (Immittance Spectral Frequency) of the current frame and isf_old is the ISF of the previous frame.
(3) Calculate the gain smoothing factor $S_m = \lambda\theta$.
(4) Apply an initialization correction to the fixed codebook gain. The correction has two cases; neither the conditions nor the formulas are preserved in the source text.
Here, $g_0$ denotes the fixed codebook gain of the current frame after initialization correction.
(5) Finally, smooth the fixed codebook gain.
The disadvantage of this second gain smoothing method is that both the stability factor and the voicing factor must be calculated, so algorithmic complexity is relatively high.
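Steps (1) and (3) of this second prior-art method can be sketched as follows. The function names are illustrative. Because the formula for the stability factor θ is not preserved in the text, θ is taken here as an input and only clamped to [0, 1] as step (2) requires.

```python
def voicing_factor(e_v: float, e_c: float) -> float:
    """lambda = 0.5 * (1 - r_v), with r_v = (E_v - E_c) / (E_v + E_c)."""
    total = e_v + e_c
    if total <= 0.0:
        raise ValueError("codebook energies must sum to a positive value")
    r_v = (e_v - e_c) / total
    return 0.5 * (1.0 - r_v)


def gain_smoothing_factor(lam: float, theta: float) -> float:
    """S_m = lambda * theta, with theta limited to 0 <= theta <= 1."""
    theta = min(1.0, max(0.0, theta))
    return lam * theta


if __name__ == "__main__":
    lam = voicing_factor(e_v=3.0, e_c=1.0)  # r_v = 0.5
    print(lam, gain_smoothing_factor(lam, theta=0.8))
```

Both factors must be computed for every frame, which is the complexity cost the text attributes to this method.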
In summary, in the prior art, smoothing the fixed codebook gain requires either recording the fixed codebook gains of several past subframes or calculating both a stability factor and a voicing factor, so the processing is complex.
Summary of the Invention
The main purpose of the present invention is to provide a method and device for gain smoothing in a speech decoder, in order to simplify gain smoothing in speech coding.
A method for gain smoothing in speech decoding provided by an embodiment of the present invention is implemented as follows:
A. Calculate the speech parameter change factor of the current speech frame, and apply an initialization correction to the fixed codebook gain of the current speech frame;
B. Determine the state of the current speech frame according to the speech parameter change factor;
C. Smooth the fixed codebook gain of the current speech frame using the initialization-corrected fixed codebook gain and the smoothing factor corresponding to speech frames in that state.
A device for gain smoothing in speech decoding provided by an embodiment of the present invention includes:
a speech parameter change factor acquisition unit, configured to acquire the speech parameter change factor of the current frame;
a fixed codebook gain initialization correction unit, configured to apply an initialization correction to the fixed codebook gain of the current speech frame;
a speech frame state determination unit, configured to determine the state of the current speech frame according to the acquired speech parameter change factor of the current frame;
a fixed codebook gain smoothing unit, configured to smooth the fixed codebook gain of the current speech frame according to the initialization-corrected fixed codebook gain and the smoothing factor corresponding to speech frames in that state.
As the above technical solution shows, embodiments of the present invention only need to record the corrected fixed codebook gain of the previous frame, rather than the fixed codebook gains of multiple consecutive frames, so storage is simple. Moreover, embodiments of the present invention achieve gain smoothing with a single speech parameter change factor of the current speech frame, without calculating both a stability factor and a voicing factor, so algorithmic complexity is also reduced.
Description of the Drawings
Figure 1 is a schematic diagram of a speech communication system;
Figure 2 is a flowchart of smoothing the fixed codebook gain according to an embodiment of the present invention;
Figure 3 is a flowchart of smoothing the fixed codebook gain based on a spectral parameter change factor according to an embodiment of the present invention;
Figure 4 is a flowchart of smoothing the fixed codebook gain based on a spectral parameter change factor according to another embodiment of the present invention;
Figure 5 is a flowchart of smoothing the fixed codebook gain based on a pitch delay parameter change factor according to an embodiment of the present invention;
Figure 6 is a structural diagram of a device for gain smoothing in a speech decoder according to an embodiment of the present invention;
Figure 7 is a structural diagram of a specific embodiment of the device of the present invention.
Detailed Description
In embodiments of the present invention, in a speech communication system, the speech parameter change factor of the current speech frame is calculated and an initialization correction is applied to the fixed codebook gain of the current speech frame; the state of the current speech frame is determined according to the speech parameter change factor; and the fixed codebook gain of the current speech frame is smoothed using the initialization-corrected fixed codebook gain and the smoothing factor set for speech frames in that state.
Here, the speech parameter change factor of the current speech frame can be calculated from the speech parameters of the current frame and those of the previous frame. The speech parameter may be a spectral parameter, a pitch delay parameter, or a voicing factor.
The smoothing factor may be calculated with a formula or obtained from simulation results.
As shown in Figure 2, the procedure for smoothing the fixed codebook gain in an embodiment of the present invention is as follows:
Step 201: Calculate the speech parameter change factor of the current speech frame from the speech parameters of the current frame and of the previous frame, and apply an initialization correction to the fixed codebook gain of the current speech frame.
If the speech parameter change factor is a spectral parameter change factor, step 201 can calculate it from the spectral parameters of the current frame and of the previous frame. If the speech parameter change factor is a pitch delay parameter change factor, step 201 can calculate it from the pitch delay parameters of the current frame and of the previous frame.
For example: if the fixed codebook gain of the current frame is greater than the initialization-corrected fixed codebook gain of the previous speech frame, the initialization correction sets the fixed codebook gain of the current frame to the larger of two values: the initialization-corrected fixed codebook gain of the previous speech frame, and the ratio of the current fixed codebook gain to the gain scaling factor.
If the fixed codebook gain of the current frame is less than or equal to the initialization-corrected fixed codebook gain of the previous speech frame, the initialization correction sets the fixed codebook gain of the current frame to the smaller of two values: the initialization-corrected fixed codebook gain of the previous speech frame, and the product of the current fixed codebook gain and the gain scaling factor.
Step 202: Determine the state of the current speech frame according to the speech parameter change factor.
Here, speech frames can be divided in advance into several states according to the speech parameter change factor, and a correspondence can be set between each speech frame state and a speech parameter change range. Determining the state of the current speech frame in step 202 can then be implemented as follows:
determine the speech parameter change range in which the speech parameter change factor lies; then, from the correspondence between speech frame states and speech parameter change ranges, obtain the state of the current speech frame corresponding to that range.
Step 203: Smooth the fixed codebook gain of the current speech frame using the initialization-corrected fixed codebook gain and the smoothing factor corresponding to speech frames in that state.
For example, the smoothing may follow the formula [equation not preserved in the source text].
As shown in Figure 3, the procedure for smoothing the fixed codebook gain based on the spectral parameter change factor in an embodiment of the present invention is as follows:
Step 301: Calculate the spectral parameter change factor s_diff of the current frame from the spectral parameters of the current frame and of the previous frame. The calculation formula is:
$s\_diff = f(s\_new_i, s\_old_i)$
where s_new is the spectral parameter of the current frame, s_old is the spectral parameter of the previous frame, and f is a function of s_new and s_old.
The spectral parameter may be ISF, ISP, LSP, LSF, or LPC; different speech codecs may use one or more of these to represent the short-term correlation of the speech signal.
Step 302: Apply an initialization correction to the fixed codebook gain of the current speech frame. The correction may use the initialization-corrected fixed codebook gain of the previous speech frame and has two cases; neither the conditions nor the formulas of the two cases are preserved in the source text.
Here, $g_0$ denotes the initialization-corrected fixed codebook gain of the current speech frame and $g_{-1}$ denotes the initialization-corrected fixed codebook gain of the previous speech frame; the two correction expressions are functions of $g_{-1}$ and the fixed codebook gain of the current speech frame.
Step 303: Determine the state of the current speech frame according to the speech parameter change factor.
For example: speech frames are divided in advance into n+1 states according to the speech parameter change factor, where n is a natural number and $t_1, \ldots, t_n$ are the speech frame state thresholds. The speech parameter change range can therefore be divided into n+1 ranges: less than $t_1$, greater than $t_1$ and less than $t_2$, ..., and greater than $t_{n-1}$, with each range corresponding to one speech frame state.
The speech parameter change range in which the current speech frame lies is therefore determined from the speech parameter change factor, and the state of the current speech frame is the state corresponding to that range.
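The range lookup of step 303 can be sketched with a sorted threshold list. This is a minimal illustration under stated assumptions: thresholds are taken to be in ascending order, a value equal to a threshold falls into the upper range, and states are numbered from 0. The function name is hypothetical and does not come from the patent.

```python
from bisect import bisect_right
from typing import Sequence


def frame_state(change_factor: float, thresholds: Sequence[float]) -> int:
    """Return the 0-based index of the range containing change_factor.

    With n ascending thresholds t_1..t_n there are n + 1 ranges:
    index 0 is (-inf, t_1), index k is [t_k, t_{k+1}), index n is [t_n, inf).
    """
    return bisect_right(thresholds, change_factor)


if __name__ == "__main__":
    t = [0.3, 0.6]  # illustrative thresholds, not values from the patent
    for x in (0.1, 0.3, 0.45, 0.9):
        print(x, frame_state(x, t))
```

Only the threshold list needs to be stored; the lookup itself is a binary search over a handful of constants.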
Step 304: Smooth the fixed codebook gain of the current speech frame using the initialization-corrected fixed codebook gain and the smoothing factor corresponding to the state.
Here, the value of the smoothing factor may be determined from simulation results. The correspondence between speech parameter change ranges and smoothing factors must also be set in advance.
For example, with s_diff as the spectral parameter change factor:
when the speech parameter change factor is less than $t_1$, the frame is in the first state, and the smoothing factor in this state is $S_1$;
when the speech parameter change factor is greater than or equal to $t_1$ and less than $t_2$, the frame is in the second state, and the smoothing factor in this state is $S_2$;
...
when the speech parameter change factor is greater than or equal to $t_{m-1}$ and less than $t_m$, the frame is in the m-th state, and the smoothing factor in this state is $S_m$;
when the speech parameter change factor is greater than or equal to $t_{n-1}$, the frame is in the n-th state, and the smoothing factor in this state is $S_n$.
If the smoothing follows the formula [equation not preserved in the source text], then:
when $s\_diff > t_1$, smoothing follows the formula [equation not preserved in the source text];
when $t_2 < s\_diff \le t_1$, smoothing follows the formula [equation not preserved in the source text];
when $s\_diff \le t_{n-1}$, smoothing follows the formula [equation not preserved in the source text].
As shown in Figure 4, a method for smoothing the fixed codebook gain based on the spectral parameter change factor according to another embodiment of the present invention includes the following steps:
Step 401: Calculate the spectral parameter change factor s_diff from the spectral parameters of the current frame and of the previous frame. The spectral parameter change factor may be the change factor of LSF, ISF, LPC, ISP, LSP, and so on. The calculation formula may be:
[equation not preserved in the source text]
where s_new is the spectral parameter of the current frame and s_old is the spectral parameter of the previous frame. s_scale is a normalization factor, which may be a constant, for example 40000.
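Since the formula for s_diff is not preserved, the following is only one plausible instance of a normalized spectral change measure: the sum of squared differences between the current and previous spectral parameter vectors, divided by s_scale. The squared-difference form is an assumption made for illustration; only the names s_new, s_old, s_scale and the example value 40000 come from the text.

```python
from typing import Sequence


def spectral_change_factor(
    s_new: Sequence[float],
    s_old: Sequence[float],
    s_scale: float = 40000.0,
) -> float:
    """ASSUMED form: sum_i (s_new[i] - s_old[i])**2 / s_scale.

    The patent's actual formula is not available in this text; any
    function f(s_new, s_old) normalized by s_scale would fit step 401.
    """
    if len(s_new) != len(s_old):
        raise ValueError("spectral parameter vectors must have equal length")
    return sum((a - b) ** 2 for a, b in zip(s_new, s_old)) / s_scale


if __name__ == "__main__":
    prev = [300.0, 700.0, 1200.0]  # illustrative values only
    curr = [400.0, 900.0, 1200.0]
    print(spectral_change_factor(curr, prev))
```

Whatever form f takes, only the previous frame's spectral parameters need to be kept to evaluate it.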
Step 402: Take as the corrected fixed codebook gain of the current frame one of the following: the initialization-corrected fixed codebook gain of the previous speech frame, the ratio of the current fixed codebook gain to the gain scaling factor, or the product of the current fixed codebook gain and the gain scaling factor.
If the fixed codebook gain of the current frame is greater than the initialization-corrected fixed codebook gain $g_{-1}$ of the previous speech frame, the initialization-corrected fixed codebook gain of the current speech frame is the larger of $g_{-1}$ and the ratio of the current fixed codebook gain to the gain scaling factor.
If the fixed codebook gain of the current frame is less than or equal to $g_{-1}$, the initialization-corrected fixed codebook gain of the current speech frame is the smaller of $g_{-1}$ and the product of the current fixed codebook gain and the gain scaling factor.
Writing $\hat{g}_c$ for the fixed codebook gain of the current frame (the original symbol is not preserved in the source text), the two cases stated above can be written as:
when $\hat{g}_c > g_{-1}$: $g_0 = \max(g_{-1},\ \hat{g}_c / g\_scale)$
when $\hat{g}_c \le g_{-1}$: $g_0 = \min(g_{-1},\ \hat{g}_c \times g\_scale)$
where $g_0$ denotes the initialization-corrected fixed codebook gain of the current speech frame and $g_{-1}$ denotes the initialization-corrected fixed codebook gain of the previous speech frame. g_scale is the gain scaling factor, which may be a constant such as 1.06.
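The initialization correction of step 402 follows directly from the two cases described in the text. A minimal sketch, with an illustrative function name and parameter names g_hat (current fixed codebook gain) and g_prev (previous frame's corrected gain) chosen here for readability:

```python
def init_correct_gain(g_hat: float, g_prev: float, g_scale: float = 1.06) -> float:
    """Initialization correction of the fixed codebook gain (step 402).

    Rising gain:  g0 = max(g_prev, g_hat / g_scale)
    Falling gain: g0 = min(g_prev, g_hat * g_scale)
    The corrected gain moves toward g_prev, but never past it.
    """
    if g_hat > g_prev:
        return max(g_prev, g_hat / g_scale)
    return min(g_prev, g_hat * g_scale)


if __name__ == "__main__":
    g_prev = 1.0
    for g_hat in (2.12, 1.03, 0.99, 0.5):
        print(g_hat, init_correct_gain(g_hat, g_prev))
```

When the current gain is within a factor of g_scale of the previous corrected gain, the result is exactly the previous value, so small fluctuations are suppressed; larger changes pass through scaled by g_scale.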
Step 403: Determine the state of the current speech frame according to the speech parameter change factor.
Here, a speech frame state threshold can be preset from simulation results, and speech frames are divided into two classes according to s_diff: steady state and non-steady state. When s_diff is greater than the threshold, the spectral parameters are in the non-steady state; when s_diff is less than or equal to the threshold, the spectral parameters are in the steady state. Fixed codebook smoothing factors can therefore be set separately for the steady and non-steady states from simulation results, with the steady-state smoothing factor smaller than the non-steady-state smoothing factor.
Step 404: Smooth the fixed codebook gain of the current speech frame using the initialization-corrected fixed codebook gain and the smoothing factor set for speech frames in that state.
When $s\_diff > thr$, the current speech frame is in the non-steady state, and smoothing follows the formula [equation not preserved in the source text].
When $s\_diff \le thr$, the current speech frame is in the steady state, and smoothing follows the formula [equation not preserved in the source text].
Here, thr is the speech frame state threshold, which may be a constant such as 0.58. s_diff greater than thr indicates that the spectral parameters are in the non-steady state; s_diff less than or equal to thr indicates that they are in the steady state. $S_1$ and $S_2$ are the fixed codebook gain smoothing factors for the two frame types; both are constants, for example 0.17 and 0.83 respectively.
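The two-state smoothing of steps 403 and 404 can be sketched as below, with two assumptions stated plainly. First, the smoothing formula is not preserved in the text, so a convex blend S * g_hat + (1 - S) * g0 is assumed. Second, the smaller example value (0.17) is assigned to the steady state, following the statement that the steady-state factor is smaller than the non-steady-state factor. Under this assumed blend, a smaller factor pulls the output more strongly toward the corrected gain g0.

```python
THR = 0.58            # example state threshold from the text
S_STEADY = 0.17       # ASSUMED mapping: smaller example value -> steady state
S_NON_STEADY = 0.83   # ASSUMED mapping: larger example value -> non-steady state


def smooth_gain(g_hat: float, g0: float, s_diff: float, thr: float = THR) -> float:
    """Two-state smoothing with an ASSUMED blend S*g_hat + (1-S)*g0.

    g_hat is the current fixed codebook gain, g0 its initialization-
    corrected value, s_diff the spectral parameter change factor.
    """
    s = S_NON_STEADY if s_diff > thr else S_STEADY
    return s * g_hat + (1.0 - s) * g0


if __name__ == "__main__":
    print(smooth_gain(g_hat=2.0, g0=1.0, s_diff=0.9))  # non-steady frame
    print(smooth_gain(g_hat=2.0, g0=1.0, s_diff=0.1))  # steady frame
```

With these assumptions, a steady frame stays close to the corrected gain, while a non-steady frame mostly keeps its decoded gain, which matches the stated goal of avoiding unnatural energy fluctuation in steady-state speech.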
As shown in Figure 5, the procedure for smoothing the fixed codebook gain based on the pitch delay parameter change factor in an embodiment of the present invention is as follows:
Step 501: Calculate the pitch delay parameter change factor delay_diff from the pitch delay parameter of the current frame and that of the previous frame. The formula may be:
$delay\_diff = f(delay\_new_i, delay\_old_i)$
where delay_new is the pitch delay parameter of the current frame, delay_old is the pitch delay parameter of the previous frame, and f is a function that can be chosen as required.
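The text leaves f open, so the following shows just one possible choice for illustration: the mean absolute difference between corresponding pitch delay values of the current and previous frames. This choice, and the idea of passing one delay per subframe, are assumptions and are not specified by the patent.

```python
from typing import Sequence


def pitch_delay_change_factor(
    delay_new: Sequence[float],
    delay_old: Sequence[float],
) -> float:
    """One ASSUMED choice of f: mean absolute difference of pitch delays.

    delay_new and delay_old hold the pitch delay values of the current
    and previous frame (for example, one value per subframe).
    """
    if len(delay_new) != len(delay_old) or not delay_new:
        raise ValueError("delay sequences must be non-empty and of equal length")
    total = sum(abs(a - b) for a, b in zip(delay_new, delay_old))
    return total / len(delay_new)


if __name__ == "__main__":
    print(pitch_delay_change_factor([40, 42, 41, 43], [40, 40, 40, 40]))
```

A small value indicates a stable pitch contour, which plays the same role for state selection as a small s_diff does in the spectral variant.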
Step 502: Apply an initialization correction to the fixed codebook gain. The correction has two cases; neither the conditions nor the formulas of the two cases are preserved in the source text.
Here, $g_0$ denotes the initialization-corrected fixed codebook gain of the current speech frame and $g_{-1}$ denotes the initialization-corrected fixed codebook gain of the previous speech frame; the two correction expressions are functions of $g_{-1}$ and the fixed codebook gain of the current speech frame.
Step 503: Determine the state of the current speech frame according to the pitch delay parameter change factor delay_diff. The determination can follow step 303.
Step 504: Smooth the fixed codebook gain of the current speech frame according to the smoothing factor corresponding to the state.
When $delay\_diff > t_1$, smoothing follows the formula [equation not preserved in the source text].
When $t_2 < delay\_diff \le t_1$, smoothing follows the formula [equation not preserved in the source text].
When $delay\_diff \le t_{n-1}$, smoothing follows the formula [equation not preserved in the source text].
Here, delay_diff is the pitch delay parameter change factor of the current speech frame, $S_1, \ldots, S_n$ are the smoothing factors for the different frame types, $t_1, \ldots, t_n$ are the n speech frame state thresholds, and n is a natural number.
As shown in Figure 6, a device for gain smoothing in a speech decoder according to an embodiment of the present invention includes:
a speech parameter change factor acquisition unit 61, a fixed codebook gain initialization correction unit 62, a speech frame state determination unit 63, and a fixed codebook gain smoothing unit 64.
The speech parameter change factor acquisition unit 61 is configured to acquire the speech parameter change factor of the current frame. The fixed codebook gain initialization correction unit 62 is configured to apply an initialization correction to the fixed codebook gain of the current speech frame. The speech frame state determination unit 63 is configured to determine the state of the current speech frame according to the acquired speech parameter change factor of the current frame. The fixed codebook gain smoothing unit 64 is configured to smooth the fixed codebook gain of the current speech frame according to the initialization-corrected fixed codebook gain and the smoothing factor corresponding to speech frames in that state.
As shown in Figure 7, the speech parameter change factor acquisition unit 61 may include a first speech parameter acquisition unit 71, a second speech parameter acquisition unit 72, and a speech parameter change factor calculation unit 73.
The first speech parameter acquisition unit 71 is configured to acquire the speech parameters of the current frame; the second speech parameter acquisition unit 72 is configured to acquire the speech parameters of the previous frame; the speech parameter change factor calculation unit 73 is configured to calculate the speech parameter change factor of the current frame from the speech parameters of the current frame and of the previous frame.
The speech frame state determination unit 63 may include a storage unit 74 and a speech frame state resolution unit 75.
The storage unit 74 is configured to store the correspondence between speech frame states and speech parameter change ranges. The speech frame state resolution unit 75 is configured to determine the speech parameter change range in which the acquired speech parameter change factor lies and, from the stored correspondence, obtain the state of the current speech frame corresponding to that range.
The fixed codebook gain smoothing unit 64 may include a smoothing factor storage unit 76, a smoothing factor acquisition unit 77, and a smoothing processing unit 78.
The smoothing factor storage unit 76 is configured to store the correspondence between speech frame states and smoothing factors. The smoothing factor acquisition unit 77 is configured to obtain, from that correspondence and the state of the current speech frame, the smoothing factor corresponding to speech frames in that state. The smoothing processing unit 78 is configured to smooth the fixed codebook gain of the current speech frame according to the formula [equation not preserved in the source text].
The fixed codebook gain initialization correction unit 62 includes a comparison unit 79 and a correction processing unit 70.
The comparison unit 79 is configured to determine whether the fixed codebook gain of the current frame is greater than the initialization-corrected fixed codebook gain of the previous speech frame. The correction processing unit 70 is configured as follows: when the fixed codebook gain of the current frame is greater than the initialization-corrected fixed codebook gain of the previous speech frame, it sets the fixed codebook gain of the current frame to the larger of the previous frame's initialization-corrected fixed codebook gain and the ratio of the current fixed codebook gain to the gain scaling factor; when the fixed codebook gain of the current frame is less than or equal to the previous frame's initialization-corrected fixed codebook gain, it sets the fixed codebook gain of the current frame to the smaller of the previous frame's initialization-corrected fixed codebook gain and the product of the current fixed codebook gain and the gain scaling factor.
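The unit structure described above can be sketched as a single small class, mainly to show that the only per-frame state carried between calls is the previous corrected gain. This is an illustrative composition, not the patent's implementation: the class and field names are hypothetical, the smoothing blend S * g_hat + (1 - S) * g0 is an assumption (the formula is not preserved in the text), and thresholds are assumed ascending with one smoothing factor per range.

```python
from bisect import bisect_right
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass
class GainSmoother:
    # Unit 61: computes the change factor from current and previous parameters.
    change_factor: Callable[[Sequence[float], Sequence[float]], float]
    thresholds: Sequence[float]   # unit 74: ascending state thresholds
    factors: Sequence[float]      # unit 76: one smoothing factor per state
    g_scale: float = 1.06         # example gain scaling factor from the text
    g_prev: float = 0.0           # the only gain state kept between frames

    def process(self, g_hat: float, params_new: Sequence[float],
                params_old: Sequence[float]) -> float:
        diff = self.change_factor(params_new, params_old)
        # Unit 62 (comparison unit 79 + correction unit 70).
        if g_hat > self.g_prev:
            g0 = max(self.g_prev, g_hat / self.g_scale)
        else:
            g0 = min(self.g_prev, g_hat * self.g_scale)
        # Unit 63: map the change factor to a state index.
        state = bisect_right(self.thresholds, diff)
        # Unit 64: ASSUMED blend of decoded gain and corrected gain.
        s = self.factors[state]
        self.g_prev = g0
        return s * g_hat + (1.0 - s) * g0


if __name__ == "__main__":
    smoother = GainSmoother(
        change_factor=lambda new, old: abs(new[0] - old[0]),  # toy f
        thresholds=[0.58],
        factors=[0.17, 0.83],
        g_prev=1.0,
    )
    print(smoother.process(2.12, [1.0], [0.0]))
    print(smoother.g_prev)
```

Keeping the thresholds and factors as static tables mirrors storage units 74 and 76, while `g_prev` is the single value that must be recorded from the previous frame.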
In the embodiments of the present invention, the speech parameter change factor of the current speech frame is calculated and an initialization correction is applied to the fixed codebook gain of the current speech frame; the state of the current speech frame is determined from the speech parameter change factor; and the fixed codebook gain of the current speech frame is smoothed using the initialization-corrected fixed codebook gain and the smoothing factor set for speech frames in that state. The smoothing factor may be calculated from a formula or obtained from simulation results. During smoothing, only the corrected fixed codebook gain of the previous frame needs to be recorded, and the smoothing uses statically configured smoothing factors, so the fixed codebook gains of multiple consecutive frames need not be recorded. Moreover, a single speech parameter change factor of the current speech frame is enough to smooth the gain; the stability factor and the voicing factor do not both have to be calculated. Compared with the prior art, both the storage complexity and the algorithmic complexity are therefore lower.
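The whole per-frame procedure summarized above can be sketched in one small class. Several details are assumptions made for illustration: the range boundaries, the per-state smoothing factors, the scaling factor, and in particular the final smoothing formula, which is taken here to be a convex combination of the corrected gain and the decoded gain because this passage does not state it. The class name `GainSmoother` is hypothetical.

```python
from bisect import bisect_right


class GainSmoother:
    """Per-frame gain smoothing that stores a single previous value."""

    def __init__(self, bounds=(0.1, 0.5), factors=(0.9, 0.5, 0.0),
                 scale=1.19, initial_gain=0.0):
        # bounds, factors, and scale are hypothetical, statically configured values.
        if len(factors) != len(bounds) + 1:
            raise ValueError("need exactly one smoothing factor per range")
        self.bounds = list(bounds)
        self.factors = list(factors)
        self.scale = scale
        # The only history kept: the previous frame's corrected gain.
        self.prev_corrected = initial_gain

    def process(self, gain: float, change_factor: float) -> float:
        # Step 1: initialization correction against the previous corrected gain.
        prev = self.prev_corrected
        if gain > prev:
            corrected = max(prev, gain / self.scale)
        else:
            corrected = min(prev, gain * self.scale)
        self.prev_corrected = corrected
        # Step 2: the change factor selects the frame state (a range index).
        state = bisect_right(self.bounds, change_factor)
        # Step 3: look up the statically configured smoothing factor.
        k = self.factors[state]
        # Step 4 (assumed formula): blend the corrected and decoded gains.
        return k * corrected + (1.0 - k) * gain
```

With these illustrative settings, a frame whose change factor falls in the lowest range is pulled strongly toward the corrected gain, while a frame in the highest range keeps its decoded gain unchanged.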
Obviously, those skilled in the art can make various changes and modifications to the present invention without departing from its spirit and scope. Thus, if these modifications and variations fall within the scope of the claims of the present invention and their technical equivalents, the present invention is intended to include them as well.
Claims (11)
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN 200710088039 CN101266798B (en) | 2007-03-12 | 2007-03-12 | A method and device for gain smoothing in voice decoder |
PCT/CN2008/070458 WO2008110109A1 (en) | 2007-03-12 | 2008-03-10 | A method and apparatus for smoothing gains in a speech decoder |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN 200710088039 CN101266798B (en) | 2007-03-12 | 2007-03-12 | A method and device for gain smoothing in voice decoder |
Publications (2)
Publication Number | Publication Date |
---|---|
CN101266798A (en) | 2008-09-17 |
CN101266798B (en) | 2011-06-15 |
Family
ID=39759021
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN 200710088039 Active CN101266798B (en) | 2007-03-12 | 2007-03-12 | A method and device for gain smoothing in voice decoder |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN101266798B (en) |
WO (1) | WO2008110109A1 (en) |
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106887233B (en) * | 2015-12-15 | 2020-01-24 | 广州酷狗计算机科技有限公司 | Audio data processing method and system |
CN113205824B (en) * | 2021-04-30 | 2022-11-11 | 紫光展锐(重庆)科技有限公司 | Sound signal processing method, device, storage medium, chip and related equipment |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN1391689A (en) * | 1999-11-18 | 2003-01-15 | 语音时代公司 | Gain-smoothing in wideband speech and audio signal decoder |
EP1688918A1 (en) * | 1999-09-10 | 2006-08-09 | Nec Corporation | Speech decoding |
Family Cites Families (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6351731B1 (en) * | 1998-08-21 | 2002-02-26 | Polycom, Inc. | Adaptive filter featuring spectral gain smoothing and variable noise multiplier for noise reduction, and method therefor |
JP3365360B2 (en) * | 1999-07-28 | 2003-01-08 | 日本電気株式会社 | Audio signal decoding method, audio signal encoding / decoding method and apparatus therefor |
AU2547201A (en) * | 2000-01-11 | 2001-07-24 | Matsushita Electric Industrial Co., Ltd. | Multi-mode voice encoding device and decoding device |
JP3558031B2 (en) * | 2000-11-06 | 2004-08-25 | 日本電気株式会社 | Speech decoding device |
CN1322488C (en) * | 2004-04-14 | 2007-06-20 | 华为技术有限公司 | Method for strengthening sound |
- 2007-03-12: CN 200710088039 (CN101266798B), status Active
- 2008-03-10: WO PCT/CN2008/070458 (WO2008110109A1), status Application Filing
Non-Patent Citations (1)
Title |
---|
Bessette, B. et al. Efficient methods for high quality low bit rate wideband speech coding. Speech Coding, 2002, IEEE Workshop Proceedings. 2002. * |
Also Published As
Publication number | Publication date |
---|---|
WO2008110109A1 (en) | 2008-09-18 |
CN101266798A (en) | 2008-09-17 |
Legal Events
Code | Title |
---|---|
C06 | Publication |
PB01 | Publication |
C10 | Entry into substantive examination |
SE01 | Entry into force of request for substantive examination |
C14 | Grant of patent or utility model |
GR01 | Patent grant |