JP3209247B2 - Excitation signal coding for speech - Google Patents

Excitation signal coding for speech

Info

Publication number
JP3209247B2
Authority
JP
Japan
Prior art keywords
speech
vector
excitation vector
noise excitation
noise
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Lifetime
Application number
JP16558193A
Other languages
Japanese (ja)
Other versions
JPH0720895A (en)
Inventor
健弘 守谷
章俊 片岡
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nippon Telegraph and Telephone Corp
Original Assignee
Nippon Telegraph and Telephone Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nippon Telegraph and Telephone Corp filed Critical Nippon Telegraph and Telephone Corp
Priority to JP16558193A priority Critical patent/JP3209247B2/en
Publication of JPH0720895A publication Critical patent/JPH0720895A/en
Application granted granted Critical
Publication of JP3209247B2 publication Critical patent/JP3209247B2/en
Anticipated expiration legal-status Critical
Expired - Lifetime legal-status Critical Current

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

BACKGROUND OF THE INVENTION 1. Field of the Invention: The present invention relates to a high-efficiency speech coding method that digitally encodes a speech signal with as little information as possible, and more particularly to an excitation signal coding method for determining the excitation signal supplied to a linear prediction synthesis filter.

[0002]

2. Description of the Related Art: CELP (Code Excited Linear Prediction) coding is well known as a speech coding method for rates of about 8 kbit/s and below. As shown in FIG. 5, this coding method selects excitation vectors from a noise codebook 11 and an adaptive codebook 12 on a frame-by-frame basis, adjusts their gains in gain adjusters 13 and 14, and feeds them to a linear prediction synthesis filter 15 to synthesize speech. The difference between this synthesized speech and the input speech is obtained in a subtractor 16, the difference output is passed through a perceptual weighting filter 17, distortion is computed from the filter output in a distortion calculator 18, and the excitation signal is selected and the gains adjusted so that this distortion becomes small. The selection criterion for the excitation vectors is to minimize the perceptual error between the synthesized signal and the input signal. Because the excitation vectors are determined by feeding back the waveform that is finally synthesized in this way, high quality is obtained.
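
As a point of reference, the analysis-by-synthesis structure just described can be sketched as follows (a minimal illustration assuming NumPy; the variable names and the use of an explicit impulse response matrix are conventions of this sketch, not taken from the patent):

```python
import numpy as np

def celp_frame_error(x_weighted, H, adaptive_vec, noise_vec, g_a, g_n):
    """Perceptually weighted error for one candidate excitation, per FIG. 5.

    x_weighted   : input speech of the frame after perceptual weighting
    H            : impulse response matrix of synthesis filter 15 combined
                   with perceptual weighting filter 17 (lower triangular, n x n)
    adaptive_vec : excitation vector from adaptive codebook 12
    noise_vec    : excitation vector from noise codebook 11
    g_a, g_n     : gains applied by gain adjusters 14 and 13
    """
    excitation = g_a * adaptive_vec + g_n * noise_vec   # summed excitation
    synthesized = H @ excitation                         # filters 15 + 17
    error = x_weighted - synthesized                     # subtractor 16
    return float(error @ error)                          # distortion calculator 18
```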

[0003] On the other hand, this approach has the drawbacks that a large number of noise excitation vectors must be stored and that the amount of computation required to search the excitation vectors is enormous. Several techniques have therefore been proposed to reduce the storage capacity and the amount of computation; the basic principle of the search is described below, together with an organized introduction of representative computation-reduction methods related to the present invention.

[0004] In the search for a noise excitation vector in CELP coding, as described above, the difference between the output synthesized waveform of the linear prediction synthesis filter 15 and the input speech is taken, and the vector that minimizes the distortion after this difference output has passed through the perceptual weighting filter 17 is selected. Specifically, in a given frame (frame length n samples), let X be the n-dimensional column vector that is the target of quantization, namely the current frame's input speech with the response component from the previous frame and the synthesized component of the adaptive code vector subtracted, passed through the perceptual weighting filter; let C_j (j = 0, ..., m−1) be the n-dimensional column vectors of the m noise excitation vectors in the codebook; let H be the impulse response matrix (n × n) of the synthesis filter 15 with the perceptual weighting filter 17 folded in; and let g be the gain of the noise excitation vector. The search then finds the j that minimizes the distortion d given below.
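
The expression for d itself appears in the source only as an image and is not reproduced there; a form consistent with the gain and distortion expressions (3) and (4) of the next paragraph, stated here as an assumption rather than as the patent's own equation, is

```latex
d = \lVert X - g\,H C_j \rVert^2
  = X^{\mathsf{T}} X - 2 g\, X^{\mathsf{T}} H C_j + g^{2}\,(H C_j)^{\mathsf{T}} (H C_j)
```

Setting the derivative with respect to g to zero gives equation (3), and substituting that gain back in gives equation (4).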

[0005] Here, if the optimal gain is assumed to be available and the gain is quantized after the noise excitation vector has been determined, the optimal gain is
g = X^T H C_j / ((H C_j)^T (H C_j))   (3)
(where X^T denotes the transpose of X), and the distortion in that case is
d = X^T X − (X^T H C_j)^2 / ((H C_j)^T (H C_j))   (4)
It therefore suffices to search for the j that maximizes the second term,
f = (X^T H C_j)^2 / ((H C_j)^T (H C_j))   (5)
Here the synthesis H C_j requires on the order of n^2/2 multiply-accumulate operations, so synthesizing C_j for every j requires an enormous amount of computation. Various proposals have therefore been made.

[0006] First, for the numerator of equation (5), a technique is known that computes X^T H only once in advance and then takes its inner product with C_j; this reduces the amount of computation without introducing any approximation error, since each inner product requires only n operations. However, attempting to reduce the computation of the denominator introduces approximation error. For example, a known method writes the denominator term as (C_j)^T H^T H C_j, approximates H^T H by a Toeplitz-type correlation matrix, and computes the term as the inner product of the autocorrelation function of H and the autocorrelation function of C_j. The denominator term can then be computed with n multiply-accumulate operations, but the approximation error is large, increasing the distortion and degrading quality. Moreover, because a correlation value must be stored for each noise excitation vector C_j, additional storage comparable to that of the codebook itself is required.
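
A sketch of this autocorrelation-based approximation of the denominator (assuming NumPy; the names are illustrative, and the factor of two on the nonzero lags follows from the symmetry of the Toeplitz approximation of H^T H):

```python
import numpy as np

def autocorr(v, max_lag):
    """Autocorrelation of a vector up to max_lag."""
    return np.array([v[: len(v) - l] @ v[l:] for l in range(max_lag + 1)])

def approx_energy(r_h, r_c):
    """Approximate (H C_j)^T (H C_j) = C_j^T H^T H C_j with H^T H taken as Toeplitz.

    r_h : autocorrelation of the combined impulse response (computed once per frame)
    r_c : autocorrelation of the noise excitation vector C_j; storing this per
          codebook entry is the extra memory cost noted above
    """
    return float(r_h[0] * r_c[0] + 2.0 * (r_h[1:] @ r_c[1:]))
```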

[0007] Separately from this, a technique is also known that keeps excitation vectors obtained by low-pass filtering and downsampling C and approximates the denominator term with these vectors. The amount of computation is reduced in proportion to the downsampling ratio, but this method too suffers from approximation error and from the extra storage needed for the downsampled vectors. Another known method divides the search into two stages: in the first stage, several C_j are preselected as candidates by an approximate calculation, and in the second stage equation (5) is evaluated only for those candidates and the maximizing vector is chosen. Possible first-stage approximations include using only the numerator term, or using the approximation methods introduced above. The distortion can be reduced by making the first-stage approximation error smaller or by increasing the number of candidates, but this diminishes the computational savings.

[0008]

SUMMARY OF THE INVENTION: An object of the present invention is, when speech is coded with a small amount of information, to improve the quality of the coded speech while keeping the amount of computation as low as possible, and in particular to provide a speech excitation signal coding method which, combined with the two-stage search introduced among the conventional methods above, greatly reduces the amount of computation at the cost of only a small degradation in distortion.

[0009]

MEANS FOR SOLVING THE PROBLEMS: In the present invention, when a noise excitation vector is selected (or preselected), the denominator term of equation (5), that is, the energy term of the vector, is also taken into account; the characteristic feature is that this energy term is obtained by reading out an estimate stored in advance. The energy estimates are stored beforehand for each noise excitation vector and for each class of synthesis filter. FIG. 1 compares this invention with the conventional methods for the case of a two-stage search. Conventionally, either the energy term was ignored entirely and preselection was based only on the computed numerator of equation (5), as shown in FIG. 1B, or the energy term was obtained by an approximate computation and used to approximate equation (5) for preselection, as shown in FIG. 1C. In this invention, as shown in FIG. 1A, the energy term is estimated by referring to a table based on the classification of the impulse response, and equation (5) is approximated with this estimate for preselection.

[0010]

DESCRIPTION OF THE PREFERRED EMBODIMENTS: FIG. 2 shows the configuration of the essential part of the most basic embodiment of the present invention. In addition to the usual codebook 11 of m n-dimensional noise excitation vectors, this embodiment provides a storage unit 21 in which several kinds of impulse response patterns are stored to represent the classes of the synthesis filter, a selector 22 that selects from the storage unit 21 the code corresponding to an impulse response sequence, and an energy term table 23 whose rows correspond to the impulse response patterns and whose columns correspond to the noise excitation vectors, each element holding a precomputed estimate of the post-synthesis energy or of its reciprocal. The search procedure when a two-stage search is performed is as follows.

[0011] 1. When the characteristics of the synthesis filter and the perceptual weighting filter are given for a frame, the impulse response of the synthesis filter is determined. The selector 22 matches this impulse response against the response patterns stored in advance in the storage unit 21 and determines the impulse response pattern of the current frame.
2. Next, the numerator term of equation (5) is computed for all noise excitation vectors.

[0012] 3. Referring to the table 23 of energy estimates corresponding to the impulse response determined by the selector 22, the value (numerator term / energy estimate) is computed for each noise excitation vector, and the preset number of candidates with the largest values are retained.
4. Equation (5) is computed only for the preselected candidates, and the code of the optimal noise excitation vector is determined.
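
Putting steps 1 to 4 together, a minimal sketch of this two-stage search might look like the following (assuming NumPy; the data layout, the function name, and the use of an explicit H matrix are assumptions of this sketch, not taken from the patent):

```python
import numpy as np

def two_stage_search(x, H, codebook, patterns, energy_table, n_candidates=8):
    """Two-stage noise-excitation search with a table of energy estimates.

    x            : target vector X (perceptually weighted, past response removed)
    H            : n x n impulse response matrix of synthesis + weighting filter
    codebook     : m x n array of noise excitation vectors C_j (codebook 11)
    patterns     : M x n array of representative impulse responses (storage unit 21)
    energy_table : M x m array of precomputed (H C_j)^T (H C_j) per pattern (table 23)
    """
    h = H[:, 0]                       # first column = impulse response of this frame

    # step 1: classify the impulse response (nearest representative pattern, cf. eq. (6) below)
    i = int(np.argmin(np.sum((patterns - h) ** 2, axis=1)))

    # step 2: numerator of eq. (5) for all j: compute X^T H once, then one inner product each
    xth = x @ H
    numer = (codebook @ xth) ** 2

    # step 3: preselect using stored energy estimates in place of exact (H C_j)^T (H C_j)
    score = numer / energy_table[i]
    candidates = np.argsort(score)[-n_candidates:]

    # step 4: exact evaluation of eq. (5) only for the candidates
    best_j, best_f = -1, -np.inf
    for j in candidates:
        hc = H @ codebook[j]
        f = (x @ hc) ** 2 / (hc @ hc)
        if f > best_f:
            best_j, best_f = j, f
    return best_j
```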

[0013] In the above processing, the impulse response can be classified, that is, its pattern determined, either by a method using a distance measure on the impulse response itself, i.e. on the FIR filter coefficients, or by a method using a measure based on an all-pole spectral model that approximates the impulse response. Let h be the impulse response vector determined for each frame (with h^T = {h_0, ..., h_{n-1}}), and let h'_i (i = 0, ..., M−1) be the impulse response vectors stored as the M representative patterns; in the former case, the representative pattern that minimizes e below is selected.

[0014]
e = ‖h − h'_i‖^2   (6)
As a variant of this, one may apply a triangular window W whose weight decreases for the higher-order coefficients, as in the following equation, or compute the distance over the low-order coefficients only:
e = ‖W(h − h'_i)‖^2   (7)
This takes into account that, because the impulse response matrix is lower triangular, the low-order coefficients contribute more to the energy computation.

[0015] On the other hand, when an all-pole spectral model is used, the representative pattern is chosen with the maximum-likelihood distance measure that minimizes the following e:
e = Σ α_i Φ_i   (8)
where the sum runs from i = 0 to p, p is the order of the model, α_i is the i-th linear prediction coefficient of the representative model, and Φ_i is the i-th autocorrelation function of the impulse response; this measure is frequently used in speech recognition. The small number of representative patterns prepared in advance can be created by using, as they are, the representative-pattern design algorithms used in ordinary vector quantization.

[0016] The above embodiment has been described on the basis of the basic CELP coding shown in FIG. 5, but the present invention can also be applied to a scheme in which all noise excitation vectors are orthogonalized to the pitch component before the noise excitation vector search, and to the case where there are noise excitation vectors in a plurality of channels. As a second embodiment, the case where the noise excitation vectors are orthogonalized to the pitch component (the output of the adaptive codebook), an n-dimensional vector P, is now described. In this case, the expression (5) that is finally maximized is modified as follows.

[0017]
f = (X^T H C_j − ρ X^T H P)^2 / ((H C_j)^T (H C_j) − ρ (H C_j)^T (H P))   (9)
where
ρ = (H C_j)^T (H P) / ((H P)^T (H P))   (10)
In equation (9) as well, the numerator term can be computed exactly with comparatively few operations, just as for equation (5), and the same holds for the second term of the denominator, the part multiplied by ρ, which is likewise a sum-of-products computation. Therefore, if the first term of the denominator is read from the table by the same procedure as in the first embodiment, the amount of computation can be reduced. Moreover, even if the computation of the second term of the denominator is omitted altogether and only the first term is used in the evaluation, the distortion hardly increases.

[0018] As a third embodiment, the case of two channels of noise excitation vectors is described. Here too, performing a two-stage search leads to the processing shown in FIG. 3. With C_j as the excitation vector of the first channel and C'_k as the excitation vector of the second channel, the final evaluation measure is as follows.

[0019]

(Equation 11, given in the source only as an image)
However, the preliminary selection for each channel performs the same processing as in the first embodiment: the approximate value of the energy term in the denominator is read from the table, and several candidates maximizing equation (5) are selected. Then, among the combinations of candidates from the two channels, the combination of j and k that maximizes equation (11) is searched for.

[0020] In the above, the energy term table 23 may instead store, for each excitation vector and for each class of synthesis filter, the reciprocal of the estimated post-synthesis energy of the noise excitation vector. In some cases, the preliminary selection may also be taken as the final selection without computing equation (5) exactly.

[0021]

EFFECTS OF THE INVENTION: FIG. 4 shows, in comparison with the conventional method, the relationship between the SNR of the input speech relative to the coding distortion and the total amount of computation of the speech coding process (in MOPS, Mega Operations Per Second: the number of operations, in units of one million per second, required for real-time processing) when this invention is used. In this case the noise excitation consists of two channels, with 7 bits (excluding polarity) allocated to each channel. The dimension n of the noise excitation vectors is 40, and orthogonalization to the pitch component is used in combination. The impulse responses are classified into four types. The conventional method here is the one that performs preselection using only the numerator term of equation (5). The numbers in the figure are the numbers of candidates retained out of 128 for each channel. The figure shows that the present invention improves the SNR with almost no increase in the amount of computation compared with the conventional preselection. If the number of retained candidates is the same, the breakdown of the computation added by this invention is one distance calculation per frame for classifying the impulse response, plus one multiplication by the reciprocal of the denominator term per noise excitation vector in the preselection, which is a very small amount compared with the distance computations of the search. The storage added by this invention is the capacity of the table 21 for classifying the impulse response and of the energy table 23. For example, when the impulse responses are classified into four types as in the case of FIG. 3 and 40-dimensional noise excitation vectors are used, the increase in storage capacity is about 10%.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a diagram showing the processing of the present invention in a two-stage search in comparison with the conventional methods.

FIG. 2 is a block diagram showing the main part of an embodiment of the present invention.

FIG. 3 is a block diagram showing the main part of an embodiment in which the present invention is applied to the case of two-channel noise excitation vectors.

FIG. 4 is a diagram comparing the relationship between SNR and the amount of computation according to the present invention with that of the conventional method.

FIG. 5 is a block diagram showing the basic principle of the CELP coding method.

Continued on the front page (58) Field surveyed (Int.Cl.7, DB name): G10L 19/12

Claims (1)

(57) [Claims]
[Claim 1] In a speech excitation signal coding method having a speech synthesis model in which a pitch period component vector and a noise excitation vector serve as the excitation source of a linear prediction synthesis filter, and in which a noise excitation vector is selected from a codebook so as to minimize the error between the synthesized speech and the input speech, the method comprising: obtaining, for each noise excitation vector, the inner product of its post-synthesis waveform and the input speech waveform; determining the class of the synthesis filter; reading out, for each noise excitation vector under the determined class, an estimate from a table of post-synthesis energy estimates of the excitation vectors, or from a table of the reciprocals of those estimates, prepared in advance for each noise excitation vector and for each class of synthesis filter; and selecting or preselecting a noise excitation vector on the basis of the value obtained by dividing the square of the inner product by the energy estimate.
JP16558193A 1993-07-05 1993-07-05 Excitation signal coding for speech Expired - Lifetime JP3209247B2 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP16558193A JP3209247B2 (en) 1993-07-05 1993-07-05 Excitation signal coding for speech

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
JP16558193A JP3209247B2 (en) 1993-07-05 1993-07-05 Excitation signal coding for speech

Publications (2)

Publication Number Publication Date
JPH0720895A JPH0720895A (en) 1995-01-24
JP3209247B2 true JP3209247B2 (en) 2001-09-17

Family

ID=15815080

Family Applications (1)

Application Number Title Priority Date Filing Date
JP16558193A Expired - Lifetime JP3209247B2 (en) 1993-07-05 1993-07-05 Excitation signal coding for speech

Country Status (1)

Country Link
JP (1) JP3209247B2 (en)

Also Published As

Publication number Publication date
JPH0720895A (en) 1995-01-24

Similar Documents

Publication Publication Date Title
US5208862A (en) Speech coder
US6510407B1 (en) Method and apparatus for variable rate coding of speech
JP3114197B2 (en) Voice parameter coding method
JP2746039B2 (en) Audio coding method
EP0422232B1 (en) Voice encoder
US5323486A (en) Speech coding system having codebook storing differential vectors between each two adjoining code vectors
JP3094908B2 (en) Audio coding device
KR100194775B1 (en) Vector quantizer
JPH056199A (en) Voice parameter coding system
JP2800618B2 (en) Voice parameter coding method
JP3180786B2 (en) Audio encoding method and audio encoding device
JP2624130B2 (en) Audio coding method
JP2002268686A (en) Voice coder and voice decoder
JP3209248B2 (en) Excitation signal coding for speech
JP3209247B2 (en) Excitation signal coding for speech
JPH0854898A (en) Voice coding device
JP2931059B2 (en) Speech synthesis method and device used for the same
JP3153075B2 (en) Audio coding device
JP3471889B2 (en) Audio encoding method and apparatus
JP3144284B2 (en) Audio coding device
JP3192051B2 (en) Audio coding device
JP3299099B2 (en) Audio coding device
JP3194930B2 (en) Audio coding device
JP3089967B2 (en) Audio coding device
JP3471542B2 (en) Audio coding device

Legal Events

Date Code Title Description
FPAY Renewal fee payment (event date is renewal date of database)

Free format text: PAYMENT UNTIL: 20070713

Year of fee payment: 6

FPAY Renewal fee payment (event date is renewal date of database)

Free format text: PAYMENT UNTIL: 20080713

Year of fee payment: 7

FPAY Renewal fee payment (event date is renewal date of database)

Free format text: PAYMENT UNTIL: 20090713

Year of fee payment: 8

FPAY Renewal fee payment (event date is renewal date of database)

Free format text: PAYMENT UNTIL: 20100713

Year of fee payment: 9

FPAY Renewal fee payment (event date is renewal date of database)

Free format text: PAYMENT UNTIL: 20110713

Year of fee payment: 10

FPAY Renewal fee payment (event date is renewal date of database)

Free format text: PAYMENT UNTIL: 20120713

Year of fee payment: 11

FPAY Renewal fee payment (event date is renewal date of database)

Free format text: PAYMENT UNTIL: 20130713

Year of fee payment: 12

EXPY Cancellation because of completion of term