JP2001013999A

JP2001013999A - Device and method for voice coding

Info

Publication number: JP2001013999A
Application number: JP11185114A
Authority: JP
Inventors: Kimio Miseki; 公生三関; Masahiro Oshikiri; 正浩押切
Original assignee: Toshiba Corp
Current assignee: Toshiba Corp
Priority date: 1999-06-30
Filing date: 1999-06-30
Publication date: 2001-01-19
Anticipated expiration: 2019-06-30
Also published as: JP3552201B2

Abstract

PROBLEM TO BE SOLVED: To obtain a speech coder in which coding distortion is hard to perceive even at a low bit rate. SOLUTION: First autocorrelation coefficients, computed from the input speech signal in an autocorrelation calculating section 101, are windowed with differently shaped autocorrelation windows in first and second windowing sections 102 and 103 to obtain modified second and third autocorrelation coefficients. A coding spectral parameter calculating section 104 uses the second autocorrelation coefficients to compute the spectral parameters to be encoded by a coding section 106. A perceptual weighting spectral parameter calculating section 105 uses the third autocorrelation coefficients to compute the spectral parameters for setting the perceptual weighting characteristic. Based on these parameters, a perceptual weight setting section 107 sets the perceptual weighting characteristic used when a coding section 108 encodes the residual component.

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

BACKGROUND OF THE INVENTION

1. Field of the Invention

The present invention relates to a speech coding method and apparatus that represent a speech signal by spectral parameters and a residual component and encode it efficiently at a low bit rate.

[0002]

2. Description of the Related Art

CELP (Code Excited Linear Prediction) is known as a low-bit-rate coding scheme for storing or transmitting speech signals with a small amount of information: M. R. Schroeder and B. S. Atal, "Code Excited Linear Prediction (CELP): High Quality Speech at Very Low Bit Rates," Proc. ICASSP, pp. 937-940, 1985 (Reference 1). The CELP scheme is also explained on pp. 33-42 of "Sound Communication Engineering," edited by the Acoustical Society of Japan, Corona Publishing, 1996 (Reference 2).

[0003] The CELP scheme is a coding scheme based on linear predictive analysis. It represents a speech signal by spectral parameters, which describe the spectral envelope carrying the phonemic information, and a residual component, which describes the pitch and similar properties of the sound, and it encodes both. Various spectral parameters can represent the spectral envelope of a speech signal, but LPC coefficients (linear prediction coefficients) are the most commonly used in the field of speech coding.

[0004] In the CELP scheme, the LPC coefficients are obtained from autocorrelation coefficients that have been modified by applying a window to the autocorrelation coefficients of the speech signal. The LPC coefficients are derived from the autocorrelation coefficients by the method known as the Levinson-Durbin algorithm or Durbin's recursive solution. Details of this method are given, for example, on p. 75 of "Digital Speech Processing" by Sadaoki Furui, Tokai University Press (Reference 3). The LPC coefficients obtained in this way are converted into equivalent parameters suited to coding, such as LSP coefficients (see pp. 89-92 of Reference 3), and these are encoded to yield the code of the spectral parameters.

[0005] In coding the residual component, on the other hand, codes are selected using a perceptually weighted distortion measure so that the coding distortion becomes less audible. A characteristic of conventional speech coding techniques such as CELP is that the LPC coefficients, as they are before being encoded, are also used for the perceptual weighting.

[0006] In decoding the speech signal, the code of the spectral parameters and the code of the residual component are decoded, and the speech signal is reproduced by imparting a spectral envelope to the decoded residual component in accordance with the decoded spectral parameters.

[0007] Thus, in conventional speech coding techniques, LPC coefficients that were computed primarily in order to be encoded are reused to set the perceptual weighting characteristic, and because of this constraint an adequate perceptual weighting characteristic cannot always be expressed. Consequently, when a conventional technique is used for low-bit-rate coding at, for example, about 4 kbit/s or below, degradation of the residual component, which strongly affects the coding distortion, can no longer be fully masked by perceptual weighting, and high-quality decoded speech cannot be obtained.

[0008]

PROBLEMS TO BE SOLVED BY THE INVENTION

As described above, conventional speech coding techniques reuse LPC coefficients that were computed primarily in order to be encoded for setting the perceptual weighting characteristic, so an adequate perceptual weighting characteristic cannot always be expressed. When the bit rate is lowered further, degradation of the residual component, where the coding distortion is large, can no longer be fully masked by perceptual weighting, and the quality of the decoded speech deteriorates.

[0009] An object of the present invention is to provide a speech coding method and apparatus in which coding distortion is hard to perceive even as the bit rate is lowered.

[0010]

MEANS FOR SOLVING THE PROBLEMS

To solve the above problem, the present invention concerns speech coding that represents an input speech signal by spectral parameters describing the spectral envelope and a residual component, and encodes both. Its basic features are as follows. Spectral parameters are computed from second autocorrelation coefficients, obtained by modifying first autocorrelation coefficients computed from the input speech signal, and are encoded. A perceptual weighting characteristic is derived from third autocorrelation coefficients, obtained by modifying the first autocorrelation coefficients under a condition different from that used to obtain the second autocorrelation coefficients. The residual component is then encoded using these spectral parameters and this perceptual weighting characteristic.

[0011] Here, the autocorrelation coefficients are modified using, for example, an autocorrelation window. Windowing the first autocorrelation coefficients with an autocorrelation window yields the modified second or third autocorrelation coefficients. In this case, the first autocorrelation window, used to obtain the second autocorrelation coefficients, and the second autocorrelation window, used to obtain the third autocorrelation coefficients, have different shapes.

[0012] More specifically, in the present invention first autocorrelation coefficients are computed from the input speech signal for each predetermined time unit. A first windowing section windows the first autocorrelation coefficients with a first autocorrelation window to obtain second autocorrelation coefficients, and a second windowing section likewise windows them with a second autocorrelation window, whose shape differs from that of the first, to obtain third autocorrelation coefficients.

[0013] First spectral parameters, which are to be encoded, are computed from the second autocorrelation coefficients and are encoded. Separately, second spectral parameters are computed from the third autocorrelation coefficients, the perceptual weighting characteristic is set from these second spectral parameters, and the residual component is encoded using the first spectral parameters and the perceptual weighting characteristic.

[0014] According to the present invention, the first autocorrelation window is given a shape optimized for obtaining the first spectral parameters to be encoded (for example, LPC coefficients), and the second autocorrelation window is given a shape optimized for obtaining the second spectral parameters used to set the perceptual weighting characteristic. Both the first spectral parameters and the perceptual weighting characteristic can therefore be obtained accurately. As a result, speech coding becomes possible in which, even at a very low coding bit rate, coding distortion is hard to perceive in the decoded output and high-quality decoded speech can be reproduced.

[0015]

DESCRIPTION OF THE EMBODIMENTS

Embodiments of the present invention will be described below with reference to the drawings.

[0016] (First Embodiment) FIG. 1 is a block diagram showing the configuration of a speech coding apparatus according to a first embodiment of the present invention. The apparatus comprises an autocorrelation calculating section 101, a first windowing section 102, a second windowing section 103, a coding spectral parameter calculating section 104, a perceptual weighting spectral parameter calculating section 105, a spectral parameter coding section 106, a perceptual weight setting section 107, a residual component coding section 108, and a multiplexing section 109.

[0017] The autocorrelation calculating section 101 computes first autocorrelation coefficients r_i (r_0, r_1, ..., r_N), as given by the following equation, for each predetermined time unit from an input speech signal that has been sampled at a predetermined sampling frequency and digitized.

[0018]

(Equation 1)

[0019] Here, {x_n} is the input speech signal sequence cut out by applying a time window of length L to the input speech signal, and N is the order of the autocorrelation. When the sampling frequency of the input speech signal is 8 kHz, a typical value is N = 10.
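The equation image for Equation 1 is not reproduced in this text. As a minimal sketch, assuming the usual autocorrelation-method definition r_i = Σ x_n x_{n-i} (summed over n = i, ..., L-1) and a Hamming time window (the window type and the test signal are illustrative assumptions, not taken from the source), the computation for one 20 ms frame at 8 kHz could look like this:

```python
import math


def autocorrelation(frame, order):
    """Return [r_0, ..., r_order] with r_i = sum_{n=i}^{L-1} x[n] * x[n-i]."""
    length = len(frame)
    return [
        sum(frame[n] * frame[n - i] for n in range(i, length))
        for i in range(order + 1)
    ]


def hamming_frame(samples):
    """Cut out a frame with a Hamming time window (one possible choice)."""
    length = len(samples)
    return [
        s * (0.54 - 0.46 * math.cos(2.0 * math.pi * n / (length - 1)))
        for n, s in enumerate(samples)
    ]


# Toy input: 160 samples (20 ms at 8 kHz) of a 400 Hz sine wave.
samples = [math.sin(2.0 * math.pi * 400.0 * n / 8000.0) for n in range(160)]
r = autocorrelation(hamming_frame(samples), order=10)
```

With N = 10 this yields eleven coefficients r_0 to r_10, where r_0 is the frame energy and bounds the magnitude of every other lag.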

[0020] Next, the first windowing section 102 windows the first autocorrelation coefficients r_i obtained by the autocorrelation calculating section 101 with the first autocorrelation window, thereby modifying them to obtain second autocorrelation coefficients φ_i (φ_0, φ_1, ..., φ_N). An example of this windowing is given by the following equation.

φ_i = r_i × w_i (i = 0, 1, ..., N)  (2)

Here, w_i denotes the first autocorrelation window.

[0021] Next, the coding spectral parameter calculating section 104 uses the second autocorrelation coefficients φ_i to obtain the spectral parameters to be encoded. Many kinds of spectral parameters are known, such as the power spectrum, LPC cepstrum, mel-scale spectral parameters, and subband energies; LPC coefficients (linear prediction coefficients) are used as the example here. The LPC coefficients are computed by solving the following linear equation.

Φα = ψ  (3)

Here, Φ is the autocorrelation matrix formed from the second autocorrelation coefficients φ_i, as shown in the following equation.

[0022]

(Equation 2)
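The image for Equation 2 is not reproduced in this text. Assuming the standard autocorrelation-method normal equations, in which Φ is the symmetric Toeplitz matrix of the windowed coefficients and ψ collects lags 1 to N (the sign convention of ψ is an assumption and is not confirmed by the source), equation (3) would expand as:

```latex
\Phi\,\alpha=\psi,\qquad
\Phi=\begin{pmatrix}
\phi_0 & \phi_1 & \cdots & \phi_{N-1}\\
\phi_1 & \phi_0 & \cdots & \phi_{N-2}\\
\vdots & \vdots & \ddots & \vdots\\
\phi_{N-1} & \phi_{N-2} & \cdots & \phi_0
\end{pmatrix},\quad
\alpha=\begin{pmatrix}\alpha_1\\ \alpha_2\\ \vdots\\ \alpha_N\end{pmatrix},\quad
\psi=\begin{pmatrix}\phi_1\\ \phi_2\\ \vdots\\ \phi_N\end{pmatrix}
```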

[0023] The LPC coefficients {α_i} can be obtained from equation (3) by, for example, the Levinson-Durbin algorithm or Durbin's recursive solution. These methods are described on p. 75 of Reference 3 cited above, so a detailed description is omitted.
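The source defers the details of the recursion to Reference 3. As a hedged illustration, the following sketch solves equation (3) under the assumption that Φ is the Toeplitz matrix of φ_0, ..., φ_{N-1} and ψ = (φ_1, ..., φ_N); the function name and return convention are hypothetical.

```python
def levinson_durbin(phi, order):
    """Solve sum_j phi[|i-j|] * a_j = phi[i] for i = 1..order.

    Returns (a, err): a[0..order-1] holds a_1..a_order, and err is the
    final prediction-error energy. Assumes phi[0] > 0 and a positive
    definite Toeplitz system.
    """
    a = [0.0] * (order + 1)  # a[0] is unused padding
    err = phi[0]
    for m in range(1, order + 1):
        # Reflection coefficient for stage m.
        acc = phi[m] - sum(a[j] * phi[m - j] for j in range(1, m))
        k = acc / err
        prev = a[:]
        a[m] = k
        for j in range(1, m):
            a[j] = prev[j] - k * prev[m - j]
        err *= 1.0 - k * k
    return a[1:], err


# Autocorrelation of a first-order process: only a_1 should be non-zero.
alpha, energy = levinson_durbin([1.0, 0.5, 0.25], order=2)
```

The recursion needs on the order of N² operations, compared with N³ for a general linear solver, which is why it is the usual choice for this Toeplitz system.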

[0024] The spectral parameters to be encoded that are obtained in this way (hereinafter, coding spectral parameters; here, the LPC coefficients {α_i}) are encoded by the spectral parameter coding section 106, which outputs the quantized spectral parameters and a spectral parameter code A representing them.

[0025] In encoding the spectral parameters, when they are LPC coefficients, for example, the LPC coefficients are converted into equivalent parameters such as LSP (line spectrum pair) coefficients (see pp. 89-92 of Reference 3) and these are encoded by vector quantization, so that the spectral parameters can be encoded with less quantization distortion for the same number of bits.

[0026] Next, to set the perceptual weighting characteristic from the autocorrelation coefficients, the second windowing section 103 first windows the first autocorrelation coefficients r_i obtained by the autocorrelation calculating section 101 with the second autocorrelation window, thereby modifying them to obtain third autocorrelation coefficients φ'_i (φ'_0, φ'_1, ..., φ'_N). An example of this windowing is given by the following equation.

φ'_i = r_i × v_i (i = 0, 1, ..., N)  (5)

Here, v_i denotes the second autocorrelation window.

[0027] The second autocorrelation window v_i is the window used to set the perceptual weighting characteristic, and its shape differs from that of the first autocorrelation window w_i. More specifically, the two windows w_i and v_i are preferably set so that the second autocorrelation window v_i modifies the autocorrelation coefficients to a lesser degree than the first autocorrelation window w_i. The reason is as follows.

[0028] First, the coding spectral parameters ultimately obtained with the first autocorrelation window w_i by the coding spectral parameter calculating section 104 are quantized in the spectral parameter coding section 106 and then used as the filter characteristic of the synthesis filter that generates the speech signal. The first autocorrelation window w_i therefore preferably has a shape that modifies the autocorrelation coefficients relatively strongly, so that the frequency response does not have excessively strong resonances.

[0029] The second autocorrelation window v_i, in contrast, is ultimately used to set a perceptual weighting characteristic that reflects the frequency masking effect corresponding to the spectral shape of the speech signal. Excessively strong resonances must still be avoided, but because the result is never used as the filter characteristic of the synthesis filter, the window preferably has a shape that modifies the autocorrelation coefficients to a lesser degree than the first autocorrelation window w_i.
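The source does not give concrete window shapes. As a sketch only, assuming Gaussian windows over the lag index (a common choice in CELP coders) with hypothetical bandwidths of 60 Hz for w_i and 20 Hz for v_i, the relationship described above, with v_i modifying the coefficients less than w_i, can be illustrated as follows:

```python
import math


def gaussian_lag_window(order, bandwidth_hz, sample_rate_hz=8000.0):
    """Gaussian window over lags 0..order; a larger bandwidth decays faster."""
    return [
        math.exp(-0.5 * (2.0 * math.pi * bandwidth_hz * i / sample_rate_hz) ** 2)
        for i in range(order + 1)
    ]


def apply_lag_window(r, window):
    """Equations (2) and (5): element-wise product of coefficients and window."""
    return [ri * wi for ri, wi in zip(r, window)]


# Hypothetical settings: a stronger first window w, a weaker second window v.
w = gaussian_lag_window(10, bandwidth_hz=60.0)  # for the coded LPC coefficients
v = gaussian_lag_window(10, bandwidth_hz=20.0)  # for perceptual weighting
```

Both windows equal 1 at lag 0, so the frame energy is unchanged, while v stays closer to 1 than w at every other lag and therefore leaves the spectral peaks sharper.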

[0030] Next, the perceptual weighting spectral parameter calculating section 105 uses the third autocorrelation coefficients φ'_i obtained by the second windowing section 103 to obtain the spectral parameters needed to set the perceptual weighting (hereinafter, perceptual weighting spectral parameters). When LPC coefficients are used as the perceptual weighting spectral parameters, the same algorithm as that used to compute the LPC coefficients serving as coding spectral parameters can of course be used. The linear equation in this case is as follows.

Φ'β = ψ'  (6)

Here, Φ' is the autocorrelation matrix formed from the third autocorrelation coefficients φ'_i, as shown in the following equation.

[0031]

(Equation 3)

[0032] Because the second autocorrelation window v_i differs from the first autocorrelation window w_i, the LPC coefficients {β_i} computed as perceptual weighting spectral parameters from the third autocorrelation coefficients φ'_i, which are modified by the second autocorrelation window v_i, have spectral characteristics different from those of the LPC coefficients {α_i} computed as coding spectral parameters using the first autocorrelation window w_i. Setting the second autocorrelation window v_i appropriately for perceptual weighting therefore has the effect that a more accurate perceptual weighting characteristic can be used in encoding the residual component.

[0033] The perceptual weight setting section 107 uses the perceptual weighting spectral parameters (in this example, the LPC coefficients {β_i}) to set the perceptual weighting characteristic used for perceptual weighting in the residual component coding section 108. When the residual component coding section 108 encodes the residual component with perceptual weighting applied in the time domain, the perceptual weighting is realized as filtering by a weighting filter with characteristic W(z). A typical example of the perceptual weighting filter characteristic W(z) using the LPC coefficients {β_i} is given by the following equation.

[0034]

(Equation 4)

[0035] Here, B(z) is given by the following equation.

[0036]

(Equation 5)

[0037] γ_1 and γ_2 are parameters that set the perceptual weighting characteristic in the residual component coding section 108 and must satisfy 1 ≥ γ_1 > γ_2 > 0. As a typical example, γ_1 = 0.94 and γ_2 = 0.6 can be used.
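The images for Equations 4 and 5 are not reproduced in this text. A form widely used in CELP coders, and consistent with the constraint 1 ≥ γ_1 > γ_2 > 0, is W(z) = B(z/γ_1) / B(z/γ_2) with B(z) = 1 − Σ β_i z^{-i}; both the form and the sign convention are assumptions here, not confirmed by the source. Under those assumptions, time-domain weighting can be sketched as:

```python
def bandwidth_expand(beta, gamma):
    """Coefficients of B(z/gamma): beta_i * gamma**i for i = 1..N."""
    return [b * gamma ** (i + 1) for i, b in enumerate(beta)]


def perceptual_weighting(x, beta, gamma1=0.94, gamma2=0.6):
    """Filter x through W(z) = B(z/gamma1) / B(z/gamma2), zero initial state.

    Assumes B(z) = 1 - sum_i beta_i z^-i, which gives the difference equation
    y[n] = x[n] - sum_i num_i x[n-i] + sum_i den_i y[n-i].
    """
    num = bandwidth_expand(beta, gamma1)
    den = bandwidth_expand(beta, gamma2)
    y = []
    for n in range(len(x)):
        acc = x[n]
        for i in range(1, len(beta) + 1):
            if n - i >= 0:
                acc -= num[i - 1] * x[n - i]
                acc += den[i - 1] * y[n - i]
        y.append(acc)
    return y


# Impulse response of W(z) for a single hypothetical coefficient beta_1 = 0.5.
impulse_response = perceptual_weighting([1.0, 0.0, 0.0], [0.5])
```

Because γ_1 > γ_2, the numerator keeps sharper spectral valleys than the denominator keeps peaks, so W(z) de-emphasizes formant regions, where noise is better masked, relative to the valleys between them.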

[0038] The residual component coding section 108 receives the input speech signal, the quantized spectral parameters from the spectral parameter coding section 106, and the perceptual weighting information. It encodes the residual component that is needed, together with the quantized spectral parameters, to represent the speech signal, and outputs the resulting residual component code B.

[0039] The spectral parameter code A obtained by the spectral parameter coding section 106 and the residual component code B obtained by the residual component coding section 108 in this way are multiplexed by the multiplexing section 109 and output as coded data representing the input speech signal. The coded data is sent to a storage system or a transmission system.

[0040] Next, the processing procedure for implementing in software the same speech coding as that of the speech coding apparatus of this embodiment will be described with reference to the flowchart shown in FIG. 2.

[0041] First, first autocorrelation coefficients r_i (r_0, r_1, ..., r_N) are obtained from the input speech signal for each predetermined time unit (step S1). Next, these autocorrelation coefficients r_i are windowed with the first autocorrelation window w_i (w_0, w_1, ..., w_N) to obtain the modified second autocorrelation coefficients φ_i (φ_0, φ_1, ..., φ_N) (step S2). Next, the coding spectral parameters to be encoded are obtained from the second autocorrelation coefficients φ_i (step S3). Next, the coding spectral parameters are encoded, yielding the quantized spectral parameters produced in the course of this encoding and the spectral parameter code representing them (step S4).

[0042] Meanwhile, the processing from the first autocorrelation coefficients r_i obtained in step S1 to the setting of the perceptual weighting characteristic is performed as follows. The autocorrelation coefficients r_i are windowed with the second autocorrelation window v_i (v_0, v_1, ..., v_N) to obtain the modified third autocorrelation coefficients φ'_i (φ'_0, φ'_1, ..., φ'_N) (step S5). Next, the perceptual weighting spectral parameters needed to set the perceptual weighting are obtained from the third autocorrelation coefficients φ'_i (step S6). Next, these perceptual weighting spectral parameters are used to set the perceptual weighting characteristic used in residual component coding (step S7). Next, using the input speech signal, the quantized spectral parameters, and the perceptual weighting characteristic, the residual component that is needed, together with the quantized spectral parameters, to represent the speech signal is encoded (step S8). Finally, the spectral parameter code obtained in step S4 and the residual component code obtained in step S8 are multiplexed and output as the coded data of the speech signal (step S9).

[0043] When steps S1 to S9 have been completed, the coding of one time unit of the speech signal (typically 20 ms when the input speech signal is sampled at 8 kHz) is finished. By repeating this series of steps for each time unit until it is determined in step S10 that the next time unit is not to be processed, a continuously input speech signal can be encoded.

[0044] (Second Embodiment) FIG. 3 is a block diagram showing the configuration of a speech coding apparatus in which the present invention is applied to the CELP scheme. This figure shows the residual component coding section, which is the distinctive part of the CELP scheme, in more detail than FIG. 1. Details of the CELP scheme are described in References 1 and 2, as mentioned above.

[0045] This speech coding apparatus comprises an autocorrelation calculating section 301, a first windowing section 302, a second windowing section 303, a coding LPC coefficient calculating section 304, a perceptual weighting LPC coefficient calculating section 305, an LPC coefficient coding section 306, a perceptual weight setting section 307, a residual component coding section 308, and a multiplexing section 309.

[0046] Here, the autocorrelation calculating section 301, first windowing section 302, second windowing section 303, coding LPC coefficient calculating section 304, perceptual weighting LPC coefficient calculating section 305, LPC coefficient coding section 306, and perceptual weight setting section 307 are the same as the autocorrelation calculating section 101, first windowing section 102, second windowing section 103, coding spectral parameter calculating section 104, perceptual weighting spectral parameter calculating section 105, spectral parameter coding section 106, and perceptual weight setting section 107 of the first embodiment, respectively, so their description is omitted.

[0047] The residual component coding section 308 comprises a target signal generating section 311, an adaptive excitation coding section 312, a noise excitation coding section 313, a gain coding section 314, a drive signal generating section 315, and a weighted synthesis filter 316. The configuration of each part of the residual component coding section 308 is described in detail below.

[0048] The target signal generating section 311 has a perceptual weighting filter whose characteristic is set by the perceptual weight setting section 307. It filters the input speech signal with this filter to produce a perceptually weighted speech signal, and subtracts from that signal the influence of the coding in the preceding time unit to generate the target signal {f_n} for the coding of the residual component.

[0049] The adaptive excitation coding section 312 has an adaptive codebook, well known in CELP speech coding, and uses the target signal {f_n} (target vector f) to search the adaptive codebook for the optimal adaptive code vector c_0 that reduces, and preferably minimizes, the magnitude of the error vector e_0 in the following equation.

e_0 = f − H_w c_0(i)  (10)

Here, i is the index of a code vector that is a candidate for the adaptive code vector, and H_w is the impulse response matrix formed from the impulse response of a filter having the perceptually weighted speech spectral envelope characteristic H_w(z) (the characteristic of the perceptually weighted synthesis filter).
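The search in equation (10) can be sketched as an exhaustive loop over the candidates. The sketch below assumes that H_w c is the zero-state convolution of a candidate vector with the truncated impulse response h of H_w(z), and that the error magnitude is the squared Euclidean norm; the function names and the toy codebook are hypothetical. Practical CELP coders usually fold a gain into this search, whereas this sketch follows equation (10) as written.

```python
def filter_with_impulse_response(h, c):
    """Compute H_w c: zero-state convolution of c with h, truncated to len(c)."""
    return [
        sum(h[k] * c[n - k] for k in range(min(n, len(h) - 1) + 1))
        for n in range(len(c))
    ]


def search_codebook(target, codebook, h):
    """Return (index, squared error) of the candidate minimizing |f - H_w c(i)|^2."""
    best_index, best_err = None, float("inf")
    for i, c in enumerate(codebook):
        synth = filter_with_impulse_response(h, c)
        err = sum((t - s) ** 2 for t, s in zip(target, synth))
        if err < best_err:
            best_index, best_err = i, err
    return best_index, best_err


# Toy example: three unit-pulse candidates and a two-tap impulse response.
h = [1.0, 0.5]
codebook = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
f = [0.0, 1.0, 0.5]  # equals H_w applied to the second candidate
index, error = search_codebook(f, codebook, h)
```

The same loop applies to the noise codebook search when the target is replaced by the vector left over after the adaptive stage.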

[0050] The perceptually weighted spectral envelope characteristic H_w(z) is given by the following equation.

[0051]

(Equation 6)

[0052] Here, W(z) is the perceptual weighting filter characteristic given in Equation 4, and A_q(z) is given by the following equation.

[0053]

(Equation 7)

[0054] Here, α_qi denotes the quantized LPC coefficients.
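The images for Equations 6 and 7 are not reproduced in this text. Since H_w(z) is described as the characteristic of the perceptually weighted synthesis filter, one consistent reading is the cascade of the weighting filter W(z) and the quantized synthesis filter 1/A_q(z). The forms below, including the sign convention of A_q(z), are assumptions and are not confirmed by the source:

```latex
H_w(z)=\frac{W(z)}{A_q(z)},\qquad
A_q(z)=1-\sum_{i=1}^{N}\alpha_{qi}\,z^{-i}
```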

[0055] The index I of the adaptive code vector selected from the candidates in this way, and the corresponding adaptive code vector c_0(I), are output from the adaptive excitation coding section 312.

[0056] Next, the noise excitation coding section 313 encodes the component that the adaptive excitation coding section 312 could not represent. For this it uses a noise codebook constructed by a predetermined method well known in CELP speech coding, or a pulse excitation or the like that can represent noise in a pseudo manner. The target vector d used here can be set to d = f − c0(I). Using this target vector d, the section searches the noise code vector candidates for the optimal noise code vector c1 that reduces, and preferably minimizes, the magnitude of the error vector e1 given by

$$e_1 = d - H_w c_1(j) \tag{13}$$

where j is the index of a code vector that is a candidate for the noise code vector.
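The following sketch illustrates the second-stage search with a pulse excitation in place of a stored noise codebook. The candidate set (a single pulse of amplitude +1 or −1 at each sample position) and all function names are assumptions chosen for brevity; the text only says that a pulse excitation "or the like" may be used. The target is formed as d = f − c0(I), exactly as stated above.

```python
def filter_zero_state(h, c):
    """Convolve c with impulse response h, truncated to len(c) samples.
    This plays the role of the product Hw c."""
    return [sum(h[k] * c[t - k] for k in range(min(t + 1, len(h))))
            for t in range(len(c))]


def single_pulse_candidates(n):
    """Hypothetical pulse excitation: one pulse of +1 or -1 per candidate."""
    candidates = []
    for pos in range(n):
        for sign in (1.0, -1.0):
            c = [0.0] * n
            c[pos] = sign
            candidates.append(c)
    return candidates


def search_noise_vector(f, c0, h):
    """Return (J, c1(J), squared error) minimizing |d - Hw c1(j)|^2."""
    d = [ft - ct for ft, ct in zip(f, c0)]  # d = f - c0(I), as in the text
    candidates = single_pulse_candidates(len(d))
    best_j, best_err = -1, float("inf")
    for j, c1 in enumerate(candidates):
        y = filter_zero_state(h, c1)
        err = sum((dt - yt) ** 2 for dt, yt in zip(d, y))
        if err < best_err:
            best_j, best_err = j, err
    return best_j, candidates[best_j], best_err


if __name__ == "__main__":
    print(search_noise_vector([1.0, 0.0, -1.0, -0.5], [1.0, 0.0, 0.0, 0.0], [1.0, 0.5]))
```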

[0057] The index J of the noise code vector selected in this way from among the candidates, together with the corresponding noise code vector c1(J), is output from the noise excitation coding section 313.

[0058] Next, the gain coding section 314 has a gain codebook constructed by a predetermined method well known in CELP speech coding, and encodes the gains by which the adaptive code vector c0(I) output from the adaptive excitation coding section 312 and the noise code vector c1(J) output from the noise excitation coding section 313 are respectively multiplied. In this coding, it searches the gain vector candidates g0(k), g1(k) stored in the gain codebook (where k is the gain vector index) for the optimal gains that reduce, and preferably minimize, the magnitude of the error vector eg given by

$$e_g = f - g_0(k) H_w c_0(I) - g_1(k) H_w c_1(J) \tag{14}$$

The gain index K found in this search, together with the corresponding gain vector g0(K), g1(K), is output from the gain coding section 314.
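A minimal sketch of the gain search of equation (14) follows. It assumes the filtered vectors Hw c0(I) and Hw c1(J) have already been computed and are passed in as `y0` and `y1`; these names, the function name, and the tiny example codebook are hypothetical.

```python
def search_gains(f, y0, y1, gain_codebook):
    """Return (K, squared error) for the gain pair minimizing
    |f - g0 * y0 - g1 * y1|^2, where y0 = Hw c0(I) and y1 = Hw c1(J)."""
    best_k, best_err = -1, float("inf")
    for k, (g0, g1) in enumerate(gain_codebook):
        err = sum((ft - g0 * a - g1 * b) ** 2 for ft, a, b in zip(f, y0, y1))
        if err < best_err:
            best_k, best_err = k, err
    return best_k, best_err


if __name__ == "__main__":
    codebook = [(0.5, 0.5), (1.0, 2.0), (2.0, 1.0)]  # illustrative (g0, g1) pairs
    print(search_gains([1.0, 2.0], [1.0, 0.0], [0.0, 1.0], codebook))
```

Searching the two gains jointly as one vector, rather than one after the other, lets the codebook exploit the dependence between the two gain values.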

[0059] The adaptive code vector c0(I) output from the adaptive excitation coding section 312, the noise code vector c1(J) output from the noise excitation coding section 313, and the gain vector g0(K), g1(K) output from the gain coding section 314 are input to the drive signal generating section 315. As shown in the following equation, the drive signal generating section 315 multiplies the adaptive code vector c0(I) and the noise code vector c1(J) by the gains g0(K) and g1(K), respectively, and adds the products to obtain the quantized residual vector ex. This residual vector ex is input to the adaptive excitation coding section 312 and stored in the adaptive codebook, and is also input to the weighted synthesis filter 316 as the drive signal.

[0060]

$$e_x = g_0(K) c_0(I) + g_1(K) c_1(J) \tag{15}$$

Finally, using the residual vector ex and the characteristics W(z) and Aq(z) of the weighted synthesis filter, the internal state of the weighted synthesis filter is computed in order to determine the influence carried over to the coding of the next time unit of the input speech signal, and this state is supplied to the target signal generating section 311.
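Equation (15) and the storage of the result in the adaptive codebook can be sketched as follows. Representing the adaptive codebook as a fixed-length buffer of past excitation samples is an assumption made here for illustration, as are the function names.

```python
def build_excitation(g0, c0, g1, c1):
    """Quantized residual vector ex = g0 * c0 + g1 * c1, as in equation (15)."""
    return [g0 * a + g1 * b for a, b in zip(c0, c1)]


def update_adaptive_codebook(history, ex):
    """Append ex to the stored past excitation and drop the oldest samples
    so that the buffer keeps its (non-zero) length."""
    return (history + ex)[-len(history):]


if __name__ == "__main__":
    ex = build_excitation(0.5, [2.0, 0.0], 2.0, [0.0, 1.0])
    print(ex)
    print(update_adaptive_codebook([0.0, 0.0, 0.0, 0.0], ex))
```

Because the buffer is updated with the quantized vector ex rather than with the unquantized residual, a decoder that applies the same update from the same indices holds the same adaptive codebook contents.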

[0061] Finally, the code A of the spectral parameter (LPC coefficients) obtained as described above, and the adaptive code vector index I, the noise code vector index J, and the gain vector index K, which together correspond to the code B of the residual component in FIG. 1, are multiplexed by the multiplexing section 309 and output as coded data representing the input speech signal. This coded data is sent to a storage system or a transmission system.
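One simple way to multiplex the code A and the indices I, J, K is to pack them into a single integer word with fixed field widths. The sketch below is illustrative: the field widths are hypothetical and are not taken from the text, which does not specify a bit allocation, and the function names are likewise assumptions.

```python
def pack_fields(values, widths):
    """Concatenate non-negative integers into one word, first field in the
    most significant bits."""
    word = 0
    for value, width in zip(values, widths):
        if not 0 <= value < (1 << width):
            raise ValueError("value does not fit in its field")
        word = (word << width) | value
    return word


def unpack_fields(word, widths):
    """Inverse of pack_fields for the same list of widths."""
    values = []
    for width in reversed(widths):
        values.append(word & ((1 << width) - 1))
        word >>= width
    return values[::-1]


if __name__ == "__main__":
    widths = [20, 7, 10, 6]          # hypothetical widths for A, I, J, K
    frame = pack_fields([123456, 42, 777, 13], widths)
    print(unpack_fields(frame, widths))
```

The unpacking routine corresponds to the separation performed on the decoding side, which must use the same field order and widths as the coder.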

[0062] Next, the speech decoding apparatus according to this embodiment will be described. FIG. 4 is a block diagram showing the configuration of a speech decoding apparatus corresponding to the speech coding apparatus of FIG. 3.

[0063] The present invention is characterized basically by the way the spectral parameter (for example, LPC coefficients) and the residual component are extracted on the coding side; the coded data output from the speech coding apparatus of FIG. 3 is itself basically no different from that of the conventional CELP system. The speech decoding apparatus may therefore have the same configuration as in the conventional CELP system.

[0064] The speech decoding apparatus shown in FIG. 4 comprises a demultiplexing section 400, an LPC coefficient decoding section 401, an adaptive excitation decoding section 402, a noise excitation decoding section 403, a gain decoding section 404, a drive signal generating section 405, a synthesis filter 406, and a post filter 407.

[0065] The demultiplexing section 400 separates the coded data, input from the speech coding apparatus of FIG. 3 via the storage system or transmission system, into the code A of the spectral parameter (LPC coefficients) and the adaptive code vector index I, the noise code vector index J, and the gain vector index K corresponding to the code of the residual component. These are input to the LPC coefficient decoding section 401, the adaptive excitation decoding section 402, the noise excitation decoding section 403, and the gain decoding section 404, respectively.

[0066] The LPC coefficient decoding section 401 reproduces, in the same manner as the speech coding apparatus, the quantized LPC coefficients corresponding to the code A of the spectral parameter, and supplies them to the synthesis filter 406 and the post filter 407.

[0067] The adaptive excitation decoding section 402 has an adaptive codebook like that of the adaptive excitation coding section 312 of FIG. 3; it obtains the adaptive code vector c0(I) corresponding to the index I and supplies it to the drive signal generating section 405. The noise excitation decoding section 403 has a noise codebook like that of the noise excitation coding section 313 of FIG. 3; it obtains the noise code vector c1(J) corresponding to the index J and supplies it to the drive signal generating section 405. Likewise, the gain decoding section 404 has a gain codebook like that of the gain coding section 314 of FIG. 3; it obtains the gain vector g0(K), g1(K) corresponding to the index K and supplies it to the drive signal generating section 405.

[0068] Like the drive signal generating section 315 of FIG. 3, the drive signal generating section 405 obtains the quantized residual vector ex from the adaptive code vector c0(I), the noise code vector c1(J), and the gain vector g0(K), g1(K) in accordance with equation (15). This residual vector ex is input to the adaptive excitation decoding section 402 and stored in the adaptive codebook, and is also input to the synthesis filter 406 as the drive signal.

[0069] The synthesis filter 406 synthesizes the decoded speech signal by applying to the drive signal (residual vector ex) a filtering with the characteristic 1/Aq(z), the inverse of equation (12), using the quantized LPC coefficients α_qi obtained by the LPC coefficient decoding section 401. The output signal of the synthesis filter 406 then has its spectral shape enhanced by the post filter 407, whose characteristic is set using the quantized LPC coefficients α_qi obtained by the LPC coefficient decoding section 401, and the final decoded speech signal is thereby generated.
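The all-pole filtering performed by the synthesis filter can be sketched as below. Since equation (12) is not reproduced in this text, the sketch assumes the sign convention Aq(z) = 1 + Σ α_qi z^(−i); with the opposite convention the sign in the recursion flips. The function name and the explicit state argument are hypothetical. The post filter is not modelled.

```python
def synthesize(ex, alpha, state=None):
    """Filter the drive signal ex through 1/Aq(z).

    Assumes Aq(z) = 1 + sum_i alpha[i-1] * z^-i, so that
    y[n] = ex[n] - sum_i alpha[i-1] * y[n-i].
    state holds past outputs, most recent first; returns (output, new state).
    """
    order = len(alpha)
    memory = list(state) if state is not None else [0.0] * order
    out = []
    for sample in ex:
        y = sample - sum(a * m for a, m in zip(alpha, memory))
        out.append(y)
        memory = ([y] + memory)[:order]
    return out, memory


if __name__ == "__main__":
    first, saved = synthesize([1.0, 0.0], [-0.5])
    second, _ = synthesize([0.0], [-0.5], saved)
    print(first + second)
```

Carrying the filter memory from one call to the next keeps the output continuous across time units, as the usage lines illustrate.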

[0070] An embodiment of the present invention has been described above, but the invention is not limited to it and can be carried out with various modifications. For example, the embodiment above described, as the method of modifying the autocorrelation coefficients, windowing in which the autocorrelation coefficients are multiplied by an autocorrelation window; the modification method is not limited to this, however. In short, any method will do in which the autocorrelation coefficients used for the spectral parameter to be coded and those used for setting the perceptual weighting characteristic are each modified under different conditions suited to their purpose, and both are derived from common autocorrelation coefficients.
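The idea of deriving two parameter sets from one common set of autocorrelation coefficients can be sketched as follows. This is an illustration under stated assumptions, not the embodiment's actual windows: Gaussian lag windows are used as one possible window shape, the two bandwidths and the sampling rate are hypothetical values, and the LPC sign convention A(z) = 1 + Σ a_i z^(−i) is assumed. All function names are likewise introduced here.

```python
import math


def autocorrelation(x, order):
    """First autocorrelation coefficients r[0..order] of the frame x."""
    return [sum(x[n] * x[n - k] for n in range(k, len(x)))
            for k in range(order + 1)]


def gaussian_lag_window(order, bandwidth_hz, sample_rate_hz):
    """Gaussian autocorrelation (lag) window; a wider bandwidth decays faster."""
    return [math.exp(-0.5 * (2.0 * math.pi * bandwidth_hz * k / sample_rate_hz) ** 2)
            for k in range(order + 1)]


def levinson_durbin(r):
    """Solve the normal equations for A(z) = 1 + sum a[i] z^-i.
    Returns (a, prediction error energy)."""
    a, err = [1.0], r[0]
    for m in range(1, len(r)):
        acc = r[m] + sum(a[i] * r[m - i] for i in range(1, m))
        k = -acc / err
        a = [1.0] + [a[i] + k * a[m - i] for i in range(1, m)] + [k]
        err *= 1.0 - k * k
    return a, err


def two_lpc_sets(x, order, fs=8000.0, bw_coding=60.0, bw_weighting=120.0):
    """One autocorrelation computation, two differently shaped windows."""
    r1 = autocorrelation(x, order)
    r2 = [r * w for r, w in zip(r1, gaussian_lag_window(order, bw_coding, fs))]
    r3 = [r * w for r, w in zip(r1, gaussian_lag_window(order, bw_weighting, fs))]
    return levinson_durbin(r2)[0], levinson_durbin(r3)[0]


if __name__ == "__main__":
    frame = [math.sin(0.3 * n) + 0.5 * math.sin(1.1 * n) for n in range(160)]
    a_coding, a_weighting = two_lpc_sets(frame, 4)
    print(a_coding)
    print(a_weighting)
```

The autocorrelation, which is the costly step, is computed once; only the inexpensive windowing and recursion are repeated for the second parameter set.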

[0071] It goes without saying that the present invention is also applicable when the definition of the autocorrelation coefficient differs somewhat from that described in the embodiment above, or when normalized autocorrelation coefficients are used in place of the autocorrelation coefficients.
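The remark about normalized autocorrelation coefficients can be checked from the normal equations of linear prediction. The following is a sketch under the assumed convention A(z) = 1 + Σ a_i z^(−i); the symbols a_i, r(k), ρ(k) and E are introduced here for illustration and do not appear in the text.

```latex
\sum_{i=1}^{p} a_i \, r(|j-i|) = -r(j), \quad j = 1,\dots,p
\qquad\Longleftrightarrow\qquad
\sum_{i=1}^{p} a_i \, \rho(|j-i|) = -\rho(j), \quad \rho(k) = \frac{r(k)}{r(0)}
```

Dividing every equation by r(0) > 0 leaves the solution a_i unchanged, so the same LPC coefficients result. Only the prediction error energy E = r(0) + Σ a_i r(i) is affected, being scaled by 1/r(0).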

[0072]

[Effects of the Invention] As described above, in the present invention the spectral parameter to be coded and the perceptual weighting characteristic used for coding the residual component are obtained separately, using respectively the second and third autocorrelation coefficients obtained by modifying, under mutually different conditions, the first autocorrelation coefficients obtained from the input speech signal. Both the spectral parameter to be coded and the perceptual weighting characteristic can thereby be obtained with good accuracy.

[0073] Accordingly, the present invention makes it possible to realize speech coding that yields high-quality decoded speech in which coding distortion is hard to perceive, even in low-bit-rate coding at about 4 kbit/s or below.

[Brief description of the drawings]

FIG. 1 is a block diagram showing the configuration of a speech coding apparatus according to a first embodiment of the present invention.

FIG. 2 is a flowchart showing the speech coding procedure according to the second embodiment.

FIG. 3 is a block diagram showing the configuration of a speech coding apparatus according to a second embodiment of the present invention.

FIG. 4 is a block diagram showing the configuration of a speech decoding apparatus corresponding to the speech coding apparatus according to the second embodiment.

[Explanation of symbols]

101, 301: autocorrelation calculating section
102, 302: first windowing section
103, 303: second windowing section
104, 304: coding spectral parameter calculating section
105, 305: perceptual weighting spectral parameter calculating section
106: spectral parameter coding section
306: LPC coefficient coding section
107, 307: perceptual weight setting section
108, 308: residual component coding section
109, 309: multiplexing section

Continued from the front page. F-terms (reference): 5D045 CA01 CB01; 5J064 BA10 BB03 BC02 BC03 BC11 BD02; 5K041 AA04 CC01 EE23 HH21 JJ14 JJ21

Claims

[Claims]

1. A speech coding method for representing an input speech signal by a spectral parameter representing a spectral envelope and a residual component, and coding the spectral parameter and the residual component, wherein the spectral parameter is calculated and coded from a second autocorrelation coefficient obtained by modifying, under a given condition, a first autocorrelation coefficient obtained from the input speech signal; a perceptual weighting characteristic is obtained from a third autocorrelation coefficient obtained by modifying the first autocorrelation coefficient under a condition different from said condition; and the residual component is coded using the spectral parameter and the perceptual weighting characteristic.

2. A speech coding method for representing an input speech signal by a spectral parameter representing a spectral envelope and a residual component, and coding the spectral parameter and the residual component, wherein the spectral parameter is calculated and coded from a second autocorrelation coefficient obtained by modifying, using a first autocorrelation window, a first autocorrelation coefficient obtained from the input speech signal; a perceptual weighting characteristic is obtained from a third autocorrelation coefficient obtained by modifying the first autocorrelation coefficient using a second autocorrelation window different from the first autocorrelation window; and the residual component is coded using the spectral parameter and the perceptual weighting characteristic.

3. A speech coding method for representing an input speech signal by a spectral parameter representing a spectral envelope and a residual component, and coding the spectral parameter and the residual component, comprising the steps of: calculating a first autocorrelation coefficient from the input speech signal; obtaining a second autocorrelation coefficient by windowing the first autocorrelation coefficient using a first autocorrelation window; calculating a first spectral parameter using the second autocorrelation coefficient; coding the first spectral parameter; obtaining a third autocorrelation coefficient by windowing the first autocorrelation coefficient using a second autocorrelation window differing in shape from the first autocorrelation window; calculating a second spectral parameter using the third autocorrelation coefficient; setting a perceptual weighting characteristic based on the second spectral parameter; and coding the residual component using the first spectral parameter and the perceptual weighting characteristic.

4. A speech coding apparatus for representing an input speech signal by a spectral parameter representing a spectral envelope and a residual component, and coding the spectral parameter and the residual component, comprising: autocorrelation calculating means for calculating a first autocorrelation coefficient from the input speech signal; first windowing means for obtaining a second autocorrelation coefficient by windowing the first autocorrelation coefficient using a first autocorrelation window; first spectral parameter calculating means for calculating a first spectral parameter using the second autocorrelation coefficient; spectral parameter coding means for coding the first spectral parameter; second windowing means for obtaining a third autocorrelation coefficient by windowing the first autocorrelation coefficient using a second autocorrelation window differing in shape from the first autocorrelation window; second spectral parameter calculating means for calculating a second spectral parameter using the third autocorrelation coefficient; perceptual weighting characteristic setting means for setting a perceptual weighting characteristic based on the second spectral parameter; and residual component coding means for coding the residual component using the first spectral parameter and the perceptual weighting characteristic set by the perceptual weighting characteristic setting means.