JP4935329B2

JP4935329B2 - Speech coding apparatus, speech decoding apparatus, speech coding method, speech decoding method, and program

Info

Publication number: JP4935329B2
Application number: JP2006325696A
Authority: JP
Inventors: 博康井手
Original assignee: Casio Computer Co Ltd
Current assignee: Casio Computer Co Ltd
Priority date: 2006-12-01
Filing date: 2006-12-01
Publication date: 2012-05-23
Anticipated expiration: 2026-12-01
Also published as: JP2008139562A

Abstract

PROBLEM TO BE SOLVED: To reduce the amount of information transmitted and received while maintaining sound quality in analysis-synthesis speech encoding and decoding.

SOLUTION: A speech encoding device represents a residual signal, obtained by applying prediction analysis to a speech signal, by a flag indicating a noise band for each noise band and by frequency conversion coefficients for each non-noise band. The speech encoding device encodes the flag and the frequency conversion coefficients together with the prediction coefficients and transmits the result to a speech decoding device. The speech decoding device decodes the received code and, based on the decoding result, restores the residual signal by generating a noise sequence and inverse-converting the frequency conversion coefficients. It then restores the speech signal by inputting the residual signal to a synthesis filter section defined by the prediction coefficients.

COPYRIGHT: (C)2008,JPO&INPIT

Description

The present invention relates to a speech encoding device, a speech decoding device, a speech encoding method, a speech decoding method, and a program used for analysis-synthesis speech compression and decompression.

In the field of mobile communications, low-bit-rate (about 8 kbps) speech compression coding methods are needed, for example to cope with the growing number of users. One 8 kbps speech coding method is the one specified in ITU-T Recommendation G.729. The speech coding method of that recommendation basically decomposes a speech signal into prediction coefficients and a residual signal by prediction analysis and then transmits them. Known prediction analyses include, for example, linear prediction analysis and MLSA (Mel Log Spectrum Approximation) analysis (see, for example, Non-Patent Document 1).

Non-Patent Document 1: Satoshi Imai, Kazuo Sumita, Chieko Furuichi, "Mel Log Spectrum Approximation (MLSA) Filter for Speech Synthesis", Transactions of the Institute of Electronics and Communication Engineers of Japan, Vol. J66-A, No. 2, pp. 122-129, 1983

To enable such low-bit-rate communication in analysis-synthesis encoding and decoding, some way must be devised to convey information about the residual signal efficiently from the encoding device to the decoding device. The residual signal carries a large amount of information, so transmitting it as is would exceed the available transmission capacity.

One such approach is for the encoding-side device (transmitter) and the decoding-side device (receiver) to share a codebook of typical residual signals, with the transmitter telling the receiver which residual signal was selected from the codebook. However, when a dynamic codebook is used to improve sound quality, the codebooks of the transmitter and the receiver can become mismatched.

The residual signal generated by the transmitter is used in the receiver as the excitation signal for speech reproduction. In other words, the terms "residual signal" and "excitation signal" differ only in the context in which they are used and refer to the same thing. The two terms are therefore used interchangeably below.

As a way to convey information about the residual signal efficiently without such a codebook, the transmitter could frequency-convert the residual signal and then encode it.

However, when the speech is a consonant, for example, the residual signal often has a pronounced noise-like character, and uniformly transmitting frequency conversion coefficients to the receiver even in such cases is wasteful. That is, the limited transmission capacity is not used efficiently.

The present invention has been made in view of the above circumstances. Its object is to provide a speech encoding device, a speech decoding device, a speech encoding method, a speech decoding method, and a program that, in speech compression and decoding, divide the residual signal into a plurality of bands, transmit mainly a flag marking the band as noise for each noise band, and transmit frequency conversion coefficients for each non-noise band, thereby conveying information about the residual signal efficiently while ensuring sufficient quality of the reproduced speech.

To achieve the above object, a speech encoding device according to a first aspect of the present invention comprises:
a prediction analysis unit that decomposes a speech signal into prediction coefficients and a residual signal by prediction analysis;
a band-specific residual signal generation unit that divides the residual signal into band-specific residual signals;
a noise determination unit that determines, for each band of the residual signal, whether the band is a noise band;
a flag generation unit that, for each band determined by the noise determination unit to be a noise band, generates a flag indicating that the band is a noise band and obtains the gain of the band-specific residual signal of that band;
a non-noise band conversion unit that superimposes, in the time domain, the band-specific residual signals of the bands determined by the noise determination unit not to be noise bands and then frequency-converts the result to generate frequency conversion coefficients in the non-noise bands; and
an encoding unit that encodes the prediction coefficients obtained by the prediction analysis unit, the flag and gain obtained by the flag generation unit, and the frequency conversion coefficients generated by the non-noise band conversion unit.

By dividing the residual signal into a plurality of bands, transmitting in principle only a flag marking the band as noise for each noise band, and transmitting frequency conversion coefficients for each non-noise band, the device achieves both efficient transmission of information about the residual signal and good quality of the reproduced speech.

To achieve the above object, a speech encoding device according to a second aspect of the present invention comprises:
a prediction analysis unit that decomposes a speech signal into prediction coefficients and a residual signal by prediction analysis;
a full-band conversion unit that frequency-converts the residual signal to generate frequency conversion coefficients;
a band-specific residual signal generation unit that divides the residual signal into band-specific residual signals;
a noise determination unit that determines, for each band of the residual signal, whether the band is a noise band;
a flag generation unit that, for each band determined by the noise determination unit to be a noise band, generates a flag indicating that the band is a noise band and obtains the gain of the band-specific residual signal of that band;
a totaling unit that collects, from the frequency conversion coefficients obtained by the full-band conversion unit, the frequency conversion coefficients of the bands determined by the noise determination unit not to be noise bands; and
an encoding unit that encodes the prediction coefficients obtained by the prediction analysis unit, the flag and gain obtained by the flag generation unit, and the frequency conversion coefficients collected by the totaling unit.

Because the frequency conversion coefficients are obtained for the entire band at once, before the residual signal is divided into band-specific residual signals, they can be obtained with high accuracy regardless of band filter characteristics such as resolution.

The noise determination unit determines, for example, whether each band is a noise band based on the shape of the autocorrelation function of the band-specific residual signal of that band.

In this way, as described in detail later, voiced/unvoiced discrimination can be performed easily by adopting a predetermined criterion.
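The autocorrelation-based determination can be sketched in Python as follows. The criterion used here (no normalized autocorrelation peak above a threshold away from lag 0) and the names `normalized_autocorrelation`, `looks_like_noise`, `min_lag`, `max_lag`, and `threshold` are assumptions made for illustration only; this passage says only that a predetermined criterion is applied to the shape of the autocorrelation function.

```python
import random


def normalized_autocorrelation(x, max_lag):
    """Return r[k] / r[0] for k = 0..max_lag (biased estimate)."""
    r0 = sum(v * v for v in x)
    if r0 == 0.0:
        return [0.0] * (max_lag + 1)
    return [sum(x[n] * x[n + k] for n in range(len(x) - k)) / r0
            for k in range(max_lag + 1)]


def looks_like_noise(x, min_lag=16, max_lag=160, threshold=0.3):
    """Hypothetical criterion: a band is treated as noise when its
    autocorrelation has no strong peak at any lag in [min_lag, max_lag]."""
    if len(x) <= min_lag:
        raise ValueError("signal too short for the chosen lag range")
    max_lag = min(max_lag, len(x) - 1)
    r = normalized_autocorrelation(x, max_lag)
    return max(r[min_lag:]) < threshold


if __name__ == "__main__":
    random.seed(0)
    noise = [random.gauss(0.0, 1.0) for _ in range(800)]
    pulses = [1.0 if n % 80 == 0 else 0.0 for n in range(800)]
    print(looks_like_noise(noise), looks_like_noise(pulses))
```

A periodic (voiced-like) signal produces a peak near its period, while a noise-like signal does not. The lag range and threshold would need tuning for real band-limited residuals.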

The prediction analysis unit obtains, for example, MLSA filter coefficients as the prediction coefficients by MLSA (Mel Log Spectrum Approximation) analysis and obtains the residual signal using an inverse filter defined by the MLSA filter coefficients.

Alternatively, the prediction analysis unit obtains, for example, linear prediction coefficients as the prediction coefficients by linear prediction analysis and obtains the residual signal using an inverse filter defined by the linear prediction coefficients.

To achieve the above object, a speech decoding device according to a third aspect of the present invention comprises:
a receiving unit that receives a code obtained by encoding prediction coefficients generated from a speech signal by prediction analysis, a flag indicating that a specific band of the residual signal generated from the speech signal by the prediction analysis is a noise band, the gain of the band-specific residual signal in each noise band, and frequency conversion coefficients in the non-noise bands;
a decoding unit that decodes, from the code, the prediction coefficients, the flag, the gain, and the frequency conversion coefficients in the non-noise bands;
a noise sequence generation unit that generates, for each band indicated by the flag to be a noise band, a noise sequence whose amplitude is adjusted by the gain;
an inverse conversion unit that generates frequency conversion coefficients for the entire band by storing all frequency conversion coefficients as 0 in each band indicated by the flag to be a noise band and storing the decoded frequency conversion coefficients in each non-noise band, and that applies an inverse spectrum conversion to the generated coefficients to obtain the residual signal in the non-noise bands;
a residual signal restoration unit that generates a restored residual signal by superimposing the noise sequence generated by the noise sequence generation unit and the residual signal in the non-noise bands obtained by the inverse conversion unit; and
a synthesis unit that generates a restored speech signal by synthesis from the prediction coefficients decoded by the decoding unit and the restored residual signal generated by the residual signal restoration unit.
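The residual restoration performed by the decoder of the third aspect can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the transform is taken to be a plain DFT, bands are 1 kHz wide at a 16 kHz sampling rate, and the band-limited noise is produced by giving the bins of the band random phases. The function names (`band_of_bin`, `inverse_dft_half`, `band_noise`, `restore_residual`) and the RMS interpretation of the gain are hypothetical and not taken from the source. The final synthesis filtering with the prediction coefficients is omitted.

```python
import cmath
import math
import random


def band_of_bin(k, n, sample_rate=16000, band_width=1000):
    """1-based band number of DFT bin k for an n-point transform."""
    num_bands = sample_rate // (2 * band_width)
    return min(k * sample_rate // (n * band_width) + 1, num_bands)


def inverse_dft_half(half, n):
    """Real signal of even length n from DFT bins 0..n/2."""
    out = []
    for t in range(n):
        acc = half[0].real + (half[n // 2] * (-1) ** t).real
        for k in range(1, n // 2):
            acc += 2.0 * (half[k] * cmath.exp(2j * math.pi * k * t / n)).real
        out.append(acc / n)
    return out


def band_noise(band, gain, n, rng):
    """Noise confined to one band, scaled so that its RMS equals gain."""
    half = [0j] * (n // 2 + 1)
    for k in range(n // 2 + 1):
        if band_of_bin(k, n) == band:
            half[k] = cmath.exp(2j * math.pi * rng.random())
    noise = inverse_dft_half(half, n)
    rms = math.sqrt(sum(v * v for v in noise) / n)
    return [v * gain / rms for v in noise] if rms > 0.0 else noise


def restore_residual(n, decoded_coeffs, noise_gains, rng):
    """decoded_coeffs: {bin: coefficient} for the non-noise bands.
    noise_gains: {band: gain} for the bands flagged as noise."""
    half = [0j] * (n // 2 + 1)      # coefficients in noise bands stay 0
    for k, c in decoded_coeffs.items():
        half[k] = c
    residual = inverse_dft_half(half, n)
    for band, gain in noise_gains.items():
        noise = band_noise(band, gain, n, rng)
        residual = [r + v for r, v in zip(residual, noise)]
    return residual
```

Passing a seeded `random.Random` instance as `rng` keeps the generated noise reproducible, which is convenient for testing but is not something the source requires.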

According to the present invention, in analysis-synthesis speech signal communication, the sound quality of the reproduced speech can be improved while the amount of transmitted information is kept low.

Speech encoding devices and speech decoding devices according to embodiments of the present invention are described in detail below. Three functional configuration examples of the speech encoding device are given (Embodiments 1, 2, and 3), along with two functional configuration examples of the speech decoding device (Embodiments 4 and 5), and a speech encoding/decoding device (Embodiment 6) is given as a physical configuration example for realizing these functional configurations. In describing these embodiments in turn, identical members are in principle given identical reference numerals and their description is not repeated.

The signals transmitted by the speech encoding devices according to Embodiments 1 to 3 are essentially the same, and the signals received by the speech decoding devices according to Embodiments 4 and 5 are likewise essentially the same. The speech encoding/decoding device according to Embodiment 6 may therefore be any combination of these speech encoding and speech decoding devices.

(Embodiment 1)
FIG. 1 is a functional configuration diagram of the speech encoding device 111 according to the present embodiment.

As illustrated, the speech encoding device 111 includes a microphone 121, an A/D conversion unit 123, a prediction analysis unit 125, a band filter unit 127, a noise determination unit 129, an encoding A switch unit 131, a flag and gain generation unit 133, a total conversion unit 135, an encoding unit 137, and a transmission unit 139.

The prediction analysis unit 125 incorporates a prediction analysis inverse filter calculator 141.

The band filter unit 127 includes a first band filter 143, a second band filter 145, and as many further band filters (third and onward, omitted from FIG. 1) as are needed.

The noise determination unit 129 includes a first noise discriminator 147, a second noise discriminator 149, and as many further noise discriminators (third and onward, omitted from FIG. 1) as are needed.

The encoding A switch unit 131 includes a first A switch 151, a second A switch 153, and as many further A switches (third and onward, omitted from FIG. 1) as are needed.

The flag and gain generation unit 133 includes a first flag generation and first gain calculator 155, a second flag generation and second gain calculator 157, and as many further flag generation and gain calculators (third and onward, omitted from FIG. 1) as are needed. It also includes a flag and noise gain totalizer 159.

The total conversion unit 135 includes a non-noise residual signal totalizer 161, a spectrum converter 163, and a non-noise band cutout unit 162.

The analog speech signal input to the microphone 121 is converted into a digital speech signal by the A/D conversion unit 123, for example by sampling at 16 kHz, and is then passed to the prediction analysis unit 125. The prediction analysis unit 125 applies prediction analysis, such as linear prediction analysis or MLSA (Mel Log Spectrum Approximation) analysis, to the digital speech signal. It divides the digital speech signal into predetermined time intervals (for example, 5 ms), giving S_i = {s_{i,0}, ..., s_{i,l-1}} (0 ≤ i ≤ M-1), and then calculates prediction coefficients, for example linear prediction coefficients or MLSA coefficients, for each time interval. Next, the prediction analysis inverse filter calculator 141 derives a prediction analysis inverse filter from the prediction coefficients. Inputting the digital speech signal S_i to this inverse filter yields the residual signal D_i = {d_{i,0}, ..., d_{i,l-1}} (0 ≤ i ≤ M-1). In this way, the prediction analysis unit 125 decomposes the digital speech signal into prediction coefficients and a residual signal.
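For the linear prediction case, the decomposition into prediction coefficients and a residual can be sketched in Python as follows. The autocorrelation method with the Levinson-Durbin recursion is one common way to obtain linear prediction coefficients and is assumed here. The source does not say which method is used, and the function names and the prediction order are hypothetical. With 16 kHz sampling and 5 ms intervals, each frame S_i holds l = 80 samples.

```python
import math

SAMPLE_RATE_HZ = 16000
FRAME_LEN = SAMPLE_RATE_HZ * 5 // 1000   # 5 ms at 16 kHz -> l = 80 samples


def split_frames(signal, frame_len=FRAME_LEN):
    """Split into M full frames S_0..S_{M-1}; a trailing partial frame is dropped."""
    count = len(signal) // frame_len
    return [signal[i * frame_len:(i + 1) * frame_len] for i in range(count)]


def autocorrelation(frame, order):
    return [sum(frame[n] * frame[n - k] for n in range(k, len(frame)))
            for k in range(order + 1)]


def levinson_durbin(r, order):
    """Coefficients a[0..order], a[0] = 1, of the inverse filter
    A(z) = 1 + a[1] z^-1 + ... + a[order] z^-order."""
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        if err <= 0.0:
            break
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err
        updated = a[:]
        for j in range(1, i):
            updated[j] = a[j] + k * a[i - j]
        updated[i] = k
        a = updated
        err *= 1.0 - k * k
    return a


def inverse_filter(frame, a):
    """Residual d[n] = sum_j a[j] * s[n-j]; samples before the frame count as 0."""
    return [sum(a[j] * frame[n - j] for j in range(len(a)) if n - j >= 0)
            for n in range(len(frame))]


def synthesis_filter(residual, a):
    """Inverse of inverse_filter: rebuilds s[n] from d[n] and the coefficients."""
    out = []
    for n, d in enumerate(residual):
        out.append(d - sum(a[j] * out[n - j]
                           for j in range(1, len(a)) if n - j >= 0))
    return out


if __name__ == "__main__":
    frame = [math.sin(0.3 * n) for n in range(FRAME_LEN)]
    coeffs = levinson_durbin(autocorrelation(frame, 2), 2)
    print(coeffs, sum(v * v for v in inverse_filter(frame, coeffs)))
```

The `synthesis_filter` function is included only to show that the decomposition is invertible; on the decoding side the residual would be the restored one rather than the original.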

The prediction coefficients generated by the prediction analysis unit 125 are passed to the encoding unit 137 as they are.

The residual signal, on the other hand, is passed to the band filter unit 127 and divided into residual signals for each band. The band filter unit 127 preferably divides the residual signal into, for example, bands 1 to 8, where band 1 is 0 to 1 kHz, band 2 is 1 to 2 kHz, band 3 is 2 to 3 kHz, band 4 is 3 to 4 kHz, band 5 is 4 to 5 kHz, band 6 is 5 to 6 kHz, band 7 is 6 to 7 kHz, and band 8 is 7 to 8 kHz. Passing the residual signal through the first band filter 143 produces the band 1 residual signal, passing it through the second band filter 145 produces the band 2 residual signal, and so on.
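The example band layout above can be written down as a small Python helper. The constants follow the example values in the text (16 kHz sampling, eight 1 kHz bands); the function names and the mapping from DFT bins to bands are assumptions added for illustration.

```python
SAMPLE_RATE_HZ = 16000
NUM_BANDS = 8
BAND_WIDTH_HZ = 1000


def band_edges_hz(band):
    """Lower and upper edge of a band; bands are 1-based as in the text."""
    if not 1 <= band <= NUM_BANDS:
        raise ValueError("band must be in 1..%d" % NUM_BANDS)
    return ((band - 1) * BAND_WIDTH_HZ, band * BAND_WIDTH_HZ)


def band_of_bin(k, n_fft):
    """Band number of DFT bin k (0 <= k <= n_fft // 2).
    Integer arithmetic avoids rounding at band edges; the Nyquist bin
    is assigned to the top band."""
    return min(k * SAMPLE_RATE_HZ // (n_fft * BAND_WIDTH_HZ) + 1, NUM_BANDS)
```

With an 80-point transform the bin spacing is 200 Hz, so each band covers five bins.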

The variable that identifies a band is written ω. For example, the signal produced by the first band filter 143 is the signal of band ω = 1, and the signal produced by the second band filter 145 is the signal of band ω = 2.

The band 1 residual signal is passed to the first noise discriminator 147 in the noise determination unit 129 and to the first A switch 151 in the encoding A switch unit 131. The band 2 residual signal is passed to the second noise discriminator 149 in the noise determination unit 129 and to the second A switch 153 in the encoding A switch unit 131. The same applies to the remaining bands.

The first noise discriminator 147 determines whether the band 1 residual signal it has received is noise and sends a switching command to the first A switch 151. The switching command controls the first A switch 151 so that it closes to the a1 side when the band 1 residual signal is determined to be noise and to the b1 side when it is determined not to be noise. Likewise, the second noise discriminator 149 determines whether the band 2 residual signal it has received is noise and sends a switching command to the second A switch 153. This command controls the second A switch 153 so that it closes to the a2 side when the band 2 residual signal is determined to be noise and to the b2 side when it is determined not to be noise. The same applies to band 3 onward.

When the first A switch 151 is closed to the a1 side, the band 1 residual signal is passed to the first flag generation and first gain calculator 155 in the flag and gain generation unit 133 but not to the non-noise residual signal totalizer 161 in the total conversion unit 135. When the first A switch 151 is closed to the b1 side, the band 1 residual signal is passed to the non-noise residual signal totalizer 161 but not to the first flag generation and first gain calculator 155. Similarly, when the second A switch 153 is closed to the a2 side, the band 2 residual signal is passed to the second flag generation and second gain calculator 157 in the flag and gain generation unit 133 but not to the non-noise residual signal totalizer 161. When the second A switch 153 is closed to the b2 side, the band 2 residual signal is passed to the non-noise residual signal totalizer 161 but not to the second flag generation and second gain calculator 157. The same applies to band 3 onward.

On receiving the band 1 residual signal, the first flag generation and first gain calculator 155 in the flag and gain generation unit 133 generates a flag indicating that band 1 is a noise band, calculates the gain of the band 1 residual signal, and passes the flag and the gain to the flag and noise gain totalizer 159, which is also in the flag and gain generation unit 133. If it does not receive the band 1 residual signal, it does nothing. Likewise, on receiving the band 2 residual signal, the second flag generation and second gain calculator 157 generates a flag indicating that band 2 is a noise band, calculates the gain of the band 2 residual signal, and passes the flag and the gain to the flag and noise gain totalizer 159. If it does not receive the band 2 residual signal, it does nothing. The same applies to band 3 onward.

The gain of each band represents the strength of the residual signal component in that band. In a speech signal, different bands generally have different gains. The gains are conveyed to the speech decoding device described later, which can then reproduce a speech signal that reflects the band-to-band differences in strength of the original residual signal. Obtaining a gain for each band in the speech encoding device 111 therefore helps the speech decoding device reproduce a higher-quality speech signal than would be possible under, for example, the assumption that the gain is a constant independent of the band. In the present embodiment, the character of the residual signal in the non-noise bands is conveyed to the speech decoding device numerically as frequency conversion coefficients, so no separate gain needs to be obtained for those bands.

Because the noise determination unit 129 and the encoding A switch unit 131 operate as described above, the flag and noise gain totalizer 159 ends up collecting the flags and gains of the noise bands. These flags and gains are passed to the encoding unit 137.
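The switch routing and the collection of flags and gains described above can be sketched in Python as follows. The source does not define how the gain is computed; root-mean-square amplitude is assumed here purely for illustration, and the names `rms_gain` and `route_bands` are hypothetical.

```python
import math


def rms_gain(samples):
    """Assumed gain measure: root-mean-square amplitude of the band signal."""
    if not samples:
        return 0.0
    return math.sqrt(sum(v * v for v in samples) / len(samples))


def route_bands(band_signals, is_noise):
    """band_signals: {band: samples}; is_noise: {band: bool}.

    Mirrors the A switches: a noise band (a side) contributes only a flag
    and a gain, and a non-noise band (b side) is forwarded unchanged for
    superposition. A band's presence in noise_info acts as its flag."""
    noise_info = {}
    non_noise = {}
    for band, samples in band_signals.items():
        if is_noise[band]:
            noise_info[band] = rms_gain(samples)
        else:
            non_noise[band] = samples
    return noise_info, non_noise
```

Returning two separate dictionaries keeps the two paths of FIG. 1 apart: the first feeds the flag and noise gain totalizer, the second the non-noise residual signal totalizer.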

The non-noise residual signal totalizer 161 in the total conversion unit 135 keeps track of which bands' residual signals have been passed to it from the encoding A switch unit 131 and generates a signal by superimposing only those bands' residual signals in the time domain.

Because the noise determination unit 129 and the encoding A switch unit 131 operate as described above, the bands tracked by the non-noise residual signal totalizer 161 are the non-noise bands, and the signal it generates by superposition is the residual signal in the non-noise bands.

The residual signal in the non-noise bands is passed from the non-noise residual signal totalizer 161 to the spectrum converter 163, also in the total conversion unit 135, where it is converted into frequency conversion coefficients by a frequency conversion technique such as the fast Fourier transform (FFT), the discrete cosine transform (DCT), or the modified discrete cosine transform (MDCT). The coefficients are then passed to the non-noise band cutout unit 162, also in the total conversion unit 135.

The non-noise band cutout unit 162 also receives from the non-noise residual signal totalizer 161 the information on which bands are non-noise bands. Referring to this information, it extracts only the frequency conversion coefficients in the non-noise bands from those passed to it by the spectrum converter 163 and passes them to the encoding unit 137. In other words, of the frequency conversion coefficients over the entire band obtained by frequency-converting the residual signal in the non-noise bands, only those in the non-noise bands are cut out and passed to the encoding unit 137.

Since the noise determination unit 129 and the encoding A switch unit 131 operate as described above, the residual signal in the non-noise bands generated by superposition in the non-noise residual signal totalizer 161 contains no noise band components to begin with. In principle, therefore, the frequency conversion coefficients generated by the spectrum converter 163 are all 0 in the noise bands. The non-noise band cutout unit 162 could thus dispense with the non-noise band information from the non-noise residual signal totalizer 161 and simply remove the zero-valued coefficients from those passed to it by the spectrum converter 163 before passing the rest to the encoding unit 137. However, given the limited accuracy of the band filters in the band filter unit 127, in the present embodiment the non-noise band cutout unit 162 refers to the non-noise band information when cutting out the frequency conversion coefficients, so that operation is reliable.
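The superposition, spectrum conversion, and cutout steps of the total conversion unit can be sketched in Python as follows. A naive DFT stands in for the FFT, DCT, or MDCT named in the text, and the 16 kHz sampling rate and 1 kHz band width follow the example values of this embodiment. All function names are hypothetical.

```python
import cmath
import math


def superimpose(signals):
    """Sample-wise sum of equally long time-domain band signals."""
    return [sum(samples) for samples in zip(*signals)]


def dft_half(x):
    """Naive DFT, bins 0..N/2 (sufficient for a real input)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n // 2 + 1)]


def band_of_bin(k, n, sample_rate=16000, band_width=1000):
    """1-based band number of DFT bin k for an n-point transform."""
    num_bands = sample_rate // (2 * band_width)
    return min(k * sample_rate // (n * band_width) + 1, num_bands)


def cut_out_non_noise(coeffs, n, non_noise_bands):
    """Keep (bin, coefficient) pairs whose bin lies in a non-noise band.
    The band list is consulted explicitly, rather than dropping zero-valued
    coefficients, which matches the choice made in this embodiment."""
    return [(k, c) for k, c in enumerate(coeffs)
            if band_of_bin(k, n) in non_noise_bands]
```

Selecting by band rather than by value means that small leakage from imperfect band filters into a noise band is discarded instead of being transmitted.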

符号化部１３７は、予測分析部１２５からは予測係数を、フラグ及びゲイン生成部１３３からは雑音帯域におけるフラグ及びゲインを、集計変換部１３５からは非雑音帯域における周波数変換係数を、受け取る。符号化部１３７はこれらをまとめて所定の符号化手法、例えばベクトル量子化手法やハフマン符号化手法、により符号化し、生成された符号を送信部１３９に引き渡す。 The encoding unit 137 receives the prediction coefficient from the prediction analysis unit 125, the flag and gain in the noise band from the flag and gain generation unit 133, and the frequency conversion coefficient in the non-noise band from the total conversion unit 135. The encoding unit 137 collectively encodes them using a predetermined encoding method, for example, a vector quantization method or a Huffman encoding method, and delivers the generated code to the transmission unit 139.

送信部１３９は、符号化部１３７から、予測係数、フラグ、ゲイン、及び、非雑音帯域における周波数変換係数、が符号化されたもの、を受け取り、これを、後述の実施形態４又は５に係る音声復号装置、あるいは、かかる音声復号装置として機能する、後述の実施形態６に係る他の音声符号化兼復号装置に向けて、送信する。送信方法は、本実施形態においては、無線通信であるとするが、他の、有線や、有線と無線の併用など、様々な通信方法であってもよい。 The transmission unit 139 receives from the encoding unit 137 the encoded prediction coefficients, flags, gains, and non-noise-band frequency conversion coefficients, and transmits them to the speech decoding apparatus according to Embodiment 4 or 5 described later, or to another speech encoding / decoding apparatus according to Embodiment 6 described later that functions as such a speech decoding apparatus. In this embodiment the transmission method is wireless communication, but various other communication methods, such as wired communication or a combination of wired and wireless communication, may be used.

本実施形態に係る音声符号化装置１１１は、残差信号を複数の帯域に分割し、雑音帯域についてはその旨を示すフラグと、該帯域におけるゲインのみを送信対象とし、非雑音帯域については周波数変換係数を送信対象とすることにより、残差信号に関する情報の効率的な伝達と、再生音声の品質の確保と、を両立させることができる。 The speech encoding apparatus 111 according to the present embodiment divides the residual signal into a plurality of bands. For a noise band it transmits only a flag indicating that the band is a noise band and the gain in that band; for a non-noise band it transmits the frequency conversion coefficients. This makes it possible to achieve both efficient transmission of information about the residual signal and adequate quality of the reproduced speech.

（実施形態２） (Embodiment 2)
図２は、本実施形態に係る音声符号化装置１１３の機能構成図である。音声符号化装置１１３は、概ね、実施形態１に係る音声符号化装置１１１と同じ構成を有する。主な相違点は、符号化用Ａスイッチ１３１が符号化用Ｃスイッチ部１６５に置き換えられていることと、それに伴い定電圧源（電圧V_C）が設けられていることと、Band Elimination Filter１６９が設けられていることと、それに伴い集計変換部１３５としてまとめられていた部分が分解された上で構成要素の仕様が一部変更されていること、である。 FIG. 2 is a functional configuration diagram of the speech encoding apparatus 113 according to the present embodiment. The speech encoding apparatus 113 has largely the same configuration as the speech encoding apparatus 111 according to Embodiment 1. The main differences are that the encoding A switch unit 131 is replaced by an encoding C switch unit 165, that a constant voltage source (voltage V_C) is provided accordingly, that a Band Elimination Filter 169 is provided, and that, as a result, the portion that was grouped together as the total conversion unit 135 has been broken up and the specifications of some of its components have been changed.

実施形態１において雑音判別部１２９は符号化用Ａスイッチ部１３１に切替命令を送っていたが、本実施形態においては、雑音判別部１２９は符号化用Ｃスイッチ部１６５に対して選択オン命令を送る。ここで、選択オン命令とは、帯域１についての場合であれば、帯域１の残差信号が雑音であると判別された場合には第１Ｃスイッチ１７１のｃ１を閉じｄ１を開き、帯域１の残差信号が雑音ではないと判別された場合には第１Ｃスイッチ１７１のｃ１を開きｄ１を閉じるように、第１Ｃスイッチ１７１を制御する命令である。そして、帯域２についての場合であれば、帯域２の残差信号が雑音であると判別された場合には第２Ｃスイッチ１７３のｃ２を閉じｄ２を開き、帯域２の残差信号が雑音ではないと判別された場合には第２Ｃスイッチ１７３のｃ２を開きｄ２を閉じるように、第２Ｃスイッチ１７３を制御する命令である。帯域３以降についての場合も同様である。 In Embodiment 1 the noise determination unit 129 sent a switching command to the encoding A switch unit 131; in the present embodiment it sends a selection-on command to the encoding C switch unit 165. For band 1, the selection-on command controls the first C switch 171 so that c1 is closed and d1 is opened when the residual signal of band 1 is determined to be noise, and so that c1 is opened and d1 is closed when it is determined not to be noise. For band 2, the command likewise controls the second C switch 173 so that c2 is closed and d2 is opened when the residual signal of band 2 is determined to be noise, and so that c2 is opened and d2 is closed when it is determined not to be noise. The same applies to band 3 and subsequent bands.

第１Ｃスイッチ１７１においてｃ１が閉じｄ１が開いた場合は、帯域１の残差信号がフラグ及びゲイン生成部１３３に引き渡されるが、非雑音帯域決定器１６７には電圧V_Cが印加されず、非雑音帯域決定器１６７は帯域１が雑音帯域である旨を記憶する。一方、第１Ｃスイッチ１７１においてｃ１が開きｄ１が閉じた場合は、帯域１の残差信号がフラグ及びゲイン生成部１３３に引き渡されず、非雑音帯域決定器１６７には電圧V_Cが印加され、非雑音帯域決定器１６７は帯域１が非雑音帯域である旨を記憶する。第２Ｃスイッチ１７３においてｃ２が閉じｄ２が開いた場合は、帯域２の残差信号がフラグ及びゲイン生成部１３３に引き渡されるが、非雑音帯域決定器１６７には電圧V_Cが印加されず、非雑音帯域決定器１６７は帯域２が雑音帯域である旨を記憶する。一方、第２Ｃスイッチ１７３においてｃ２が開きｄ２が閉じた場合は、帯域２の残差信号がフラグ及びゲイン生成部１３３に引き渡されず、非雑音帯域決定器１６７には電圧V_Cが印加され、非雑音帯域決定器１６７は帯域２が非雑音帯域である旨を記憶する。帯域３以降についても同様である。 When c1 is closed and d1 is open in the first C switch 171, the residual signal of band 1 is delivered to the flag and gain generation unit 133, the voltage V_C is not applied to the non-noise band determiner 167, and the non-noise band determiner 167 stores that band 1 is a noise band. Conversely, when c1 is open and d1 is closed in the first C switch 171, the residual signal of band 1 is not delivered to the flag and gain generation unit 133, the voltage V_C is applied to the non-noise band determiner 167, and the non-noise band determiner 167 stores that band 1 is a non-noise band. When c2 is closed and d2 is open in the second C switch 173, the residual signal of band 2 is delivered to the flag and gain generation unit 133, the voltage V_C is not applied to the non-noise band determiner 167, and the non-noise band determiner 167 stores that band 2 is a noise band. Conversely, when c2 is open and d2 is closed in the second C switch 173, the residual signal of band 2 is not delivered to the flag and gain generation unit 133, the voltage V_C is applied to the non-noise band determiner 167, and the non-noise band determiner 167 stores that band 2 is a non-noise band. The same applies to band 3 and subsequent bands.

このように、定電圧源（電圧V_C）は、非雑音帯域決定器１６７が、どの帯域が非雑音帯域であるかを把握するためのものである。図示した定電圧源は模式的なものであって、非雑音帯域決定器１６７によるかかる把握を可能にするものであれば他の機構のもので代用してよい。 In this way, the constant voltage source (voltage V_C) serves to let the non-noise band determiner 167 know which bands are non-noise bands. The constant voltage source shown in the figure is schematic; any other mechanism may be substituted as long as it allows the non-noise band determiner 167 to obtain this information.

非雑音帯域決定器１６７は、把握した非雑音帯域に関する情報を、実施形態１において非雑音残差信号集計器１６１が行ったのと同様に、非雑音帯域切り出し器１６２に引き渡す。 The non-noise band determiner 167 delivers information regarding the recognized non-noise band to the non-noise band cutout unit 162 in the same manner as the non-noise residual signal totalizer 161 performs in the first embodiment.

非雑音帯域決定器１６７はしかし、実施形態１における非雑音残差信号集計器１６１とは異なり、非雑音帯域の残差信号を受け取っていない。本実施形態においては、スペクトル変換器１６３が非雑音帯域の残差信号を取得するために、まず残差信号全体が予測分析部１２５からBand Elimination Filter１６９に引き渡される。非雑音帯域決定器１６７はどの帯域が非雑音帯域であるかを把握しているのであるから、逆に、どの帯域が雑音帯域であるかを把握しているともいえる。そこで、非雑音帯域決定器１６７は、かかる雑音帯域、すなわちスペクトル変換器１６３に残差信号が入力される前に削除されておくべき帯域を、Band Elimination Filter１６９に一括して指定する命令を送る。Band Elimination Filter１６９は、削除すべき帯域を自在に選択設定することができるフィルタであり、本実施形態においては、前記命令にしたがって、予測分析部１２５から引き渡された残差信号のうち削除すべき帯域を削除した上で、スペクトル変換器１６３に引き渡す。 Unlike the non-noise residual signal totalizer 161 in Embodiment 1, however, the non-noise band determiner 167 does not receive the non-noise-band residual signals. In the present embodiment, so that the spectrum converter 163 can obtain the non-noise-band residual signal, the entire residual signal is first delivered from the prediction analysis unit 125 to the Band Elimination Filter 169. Because the non-noise band determiner 167 knows which bands are non-noise bands, it conversely also knows which bands are noise bands. The non-noise band determiner 167 therefore sends the Band Elimination Filter 169 a command that specifies, all at once, these noise bands, that is, the bands that should be removed before the residual signal is input to the spectrum converter 163. The Band Elimination Filter 169 is a filter whose elimination bands can be freely selected and set. In the present embodiment, following this command, it removes the specified bands from the residual signal delivered from the prediction analysis unit 125 and passes the result to the spectrum converter 163.
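As a rough illustration of a filter whose elimination bands can be selected freely, the sketch below removes the specified bands by zeroing DFT bins. This is only one possible realization and an assumption made here; the patent does not state how the Band Elimination Filter 169 is implemented. The names `eliminate_bands` and `band_edges`, the naive DFT helpers, and the test signal are all hypothetical.

```python
import cmath
import math


def dft(x):
    """Naive discrete Fourier transform (adequate for short illustrative frames)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft_real(X):
    """Inverse DFT, returning the real part of each sample."""
    n = len(X)
    return [(sum(X[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n).real
            for t in range(n)]


def eliminate_bands(residual, band_edges, noise_bands):
    """Remove every band listed in noise_bands from the residual in one pass."""
    X = dft(residual)
    n = len(X)
    for band in noise_bands:
        lo, hi = band_edges[band]      # half-open bin range [lo, hi) (assumed layout)
        for k in range(lo, hi):
            X[k] = 0
            X[(n - k) % n] = 0         # mirrored bin, so the output stays real
    return idft_real(X)


# Residual with one component in band 0 (bin 1) and one in band 1 (bin 5).
n = 16
residual = [math.cos(2 * math.pi * t / n) + math.cos(2 * math.pi * 5 * t / n)
            for t in range(n)]
band_edges = [(0, 4), (4, 9)]
cleaned = eliminate_bands(residual, band_edges, noise_bands=[1])
```

Here only the bin-1 component survives, so `cleaned` is what would be handed to the spectrum converter in this toy setting.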

本実施形態は、実施形態１の変形例といえる。ただし、本実施形態においては、実施形態１の場合と異なり、スペクトル変換器１６３に入力される非雑音残差信号として、帯域フィルタ部１２７を経由したものを用いずに、Band Elimination Filter１６９を経由したものを用いている。したがって、帯域フィルタ部１２７で使用される多数の帯域フィルタに比べフィルタとしての性能が高いBand Elimination Filterが１個だけでも入手できる場合には、本実施形態を採用する方が、実施形態１の場合に比べて非雑音帯域における周波数変換係数が高い精度で求まるという利点がある。また、実施形態１の場合、非雑音残差信号集計器１６１が非雑音帯域の残差信号の重ね合わせ処理を行う際に誤差が生じ得るが、本実施形態の場合にはかかる重ね合わせ処理がないので、この点でも、周波数変換係数がより高い精度で求まると期待される。 The present embodiment can be regarded as a modification of Embodiment 1. Unlike Embodiment 1, however, the non-noise residual signal input to the spectrum converter 163 is one that has passed through the Band Elimination Filter 169 rather than through the band filter unit 127. Therefore, if even a single Band Elimination Filter is available whose filter performance is higher than that of the many band filters used in the band filter unit 127, adopting the present embodiment has the advantage that the frequency conversion coefficients in the non-noise bands are obtained with higher accuracy than in Embodiment 1. In addition, in Embodiment 1 an error may arise when the non-noise residual signal totalizer 161 superposes the non-noise-band residual signals, whereas the present embodiment involves no such superposition; in this respect too, the frequency conversion coefficients are expected to be obtained with higher accuracy.

（実施形態３） (Embodiment 3)
図３は、本実施形態に係る音声符号化装置１１５の機能構成図である。音声符号化装置１１５は、実施形態１に係る音声符号化装置１１１に類似した構成を有するが、符号化用Ａスイッチ１３１が符号化用Ｅスイッチ部１７７に置き換えられている点と、集計変換部１３５としてまとめられていた部分が完全に分解された上でスペクトル変換器１６３を除く構成要素の仕様が一部変更されている点と、において相違する。 FIG. 3 is a functional configuration diagram of the speech encoding apparatus 115 according to the present embodiment. The speech encoding apparatus 115 has a configuration similar to that of the speech encoding apparatus 111 according to Embodiment 1, but differs in that the encoding A switch unit 131 is replaced by an encoding E switch unit 177, and in that the portion that was grouped together as the total conversion unit 135 has been completely broken up and the specifications of its components other than the spectrum converter 163 have been partly changed.

本実施形態においては、予測分析部１２５からの残差信号が直接にスペクトル変換器１６３に引き渡される。そして、スペクトル変換器１６３は、先の２つの実施形態の場合と異なり、引き渡された残差信号に雑音帯域が含まれているか否かには拘泥せずに、残差信号全体から周波数変換係数を求める。 In the present embodiment, the residual signal from the prediction analysis unit 125 is delivered directly to the spectrum converter 163. Unlike in the previous two embodiments, the spectrum converter 163 obtains the frequency conversion coefficients from the entire residual signal, regardless of whether the delivered residual signal contains noise bands.

求まった周波数変換係数は、周波数変換係数切り分け器１７５に引き渡される。この周波数変換係数切り分け器１７５は、先の２つの実施形態における非雑音帯域切り出し器１６２と比べ、受け取った周波数変換係数を所定の帯域に対応づける点において似ている。しかし、後者が雑音帯域に属する周波数変換係数を削除するのに対して、前者は周波数変換係数の全てを所定の複数の帯域に分類するだけであって、帯域１における周波数変換係数、帯域２における周波数変換係数、・・・、のように結局全ての周波数変換係数を出力する点が異なる。 The frequency conversion coefficients thus obtained are delivered to the frequency conversion coefficient discriminator 175. The frequency conversion coefficient discriminator 175 resembles the non-noise band cutout unit 162 of the previous two embodiments in that it associates the received frequency conversion coefficients with predetermined bands. However, whereas the latter discards the frequency conversion coefficients belonging to the noise bands, the former merely sorts all of the frequency conversion coefficients into the predetermined bands and in the end outputs every coefficient: the frequency conversion coefficients in band 1, the frequency conversion coefficients in band 2, and so on.

周波数変換係数切り分け器１７５から出力された帯域１の周波数変換係数が伝送される信号線は、符号化用Ｅスイッチ部１７７の中の第１Ｅスイッチ１８１に接続され、帯域２の周波数変換係数が伝送される信号線は第２Ｅスイッチ１８３に接続されている。帯域３以降についても同様である。 The signal line for transmitting the frequency conversion coefficient of band 1 output from the frequency conversion coefficient discriminator 175 is connected to the first E switch 181 in the encoding E switch unit 177, and the frequency conversion coefficient of band 2 is transmitted. The signal line to be connected is connected to the second E switch 183. The same applies to bands 3 and after.

実施形態１において雑音判別部１２９は符号化用Ａスイッチ部１３１に切替命令を送っていたが、本実施形態においては、雑音判別部１２９は符号化用Ｅスイッチ部１７７に対して選択オン命令を送る。ここで、選択オン命令とは、帯域１についての場合であれば、帯域１の残差信号が雑音であると判別された場合には第１Ｅスイッチ１８１のｅ１を閉じｆ１を開き、帯域１の残差信号が雑音ではないと判別された場合には第１Ｅスイッチ１８１のｅ１を開きｆ１を閉じるように、第１Ｅスイッチ１８１を制御する命令である。そして、帯域２についての場合であれば、帯域２の残差信号が雑音であると判別された場合には第２Ｅスイッチ１８３のｅ２を閉じｆ２を開き、帯域２の残差信号が雑音ではないと判別された場合には第２Ｅスイッチ１８３のｅ２を開きｆ２を閉じるように、第２Ｅスイッチ１８３を制御する命令である。帯域３以降についての場合も同様である。 In Embodiment 1 the noise determination unit 129 sent a switching command to the encoding A switch unit 131; in the present embodiment it sends a selection-on command to the encoding E switch unit 177. For band 1, the selection-on command controls the first E switch 181 so that e1 is closed and f1 is opened when the residual signal of band 1 is determined to be noise, and so that e1 is opened and f1 is closed when it is determined not to be noise. For band 2, the command likewise controls the second E switch 183 so that e2 is closed and f2 is opened when the residual signal of band 2 is determined to be noise, and so that e2 is opened and f2 is closed when it is determined not to be noise. The same applies to band 3 and subsequent bands.

第１Ｅスイッチ１８１においてｅ１が閉じｆ１が開いた場合は、帯域１の残差信号がフラグ及びゲイン生成部１３３に引き渡されるが、非雑音周波数変換係数集計器１７９には帯域１における周波数変換係数が引き渡されず、非雑音周波数変換係数集計器１７９は帯域１における周波数変換係数を記憶しない。一方、第１Ｅスイッチ１８１においてｅ１が開きｆ１が閉じた場合は、帯域１の残差信号はフラグ及びゲイン生成部１３３に引き渡されず、非雑音周波数変換係数集計器１７９には帯域１における周波数変換係数が引き渡され、非雑音周波数変換係数集計器１７９は帯域１における周波数変換係数を記憶する。第２Ｅスイッチ１８３においてｅ２が閉じｆ２が開いた場合は、帯域２における残差信号がフラグ及びゲイン生成部１３３に引き渡されるが、非雑音周波数変換係数集計器１７９には帯域２の周波数変換係数が引き渡されず、非雑音周波数変換係数集計器１７９は帯域２における周波数変換係数を記憶しない。一方、第２Ｅスイッチ１８３においてｅ２が開きｆ２が閉じた場合は、帯域２の残差信号はフラグ及びゲイン生成部１３３に引き渡されず、非雑音周波数変換係数集計器１７９には帯域２における周波数変換係数が引き渡され、非雑音周波数変換係数集計器１７９は帯域２における周波数変換係数を記憶する。帯域３以降についても同様である。 When e1 is closed and f1 is open in the first E switch 181, the residual signal of band 1 is delivered to the flag and gain generation unit 133, the frequency conversion coefficients in band 1 are not delivered to the non-noise frequency conversion coefficient totalizer 179, and the totalizer 179 does not store them. Conversely, when e1 is open and f1 is closed in the first E switch 181, the residual signal of band 1 is not delivered to the flag and gain generation unit 133, the frequency conversion coefficients in band 1 are delivered to the non-noise frequency conversion coefficient totalizer 179, and the totalizer 179 stores them. When e2 is closed and f2 is open in the second E switch 183, the residual signal of band 2 is delivered to the flag and gain generation unit 133, the frequency conversion coefficients in band 2 are not delivered to the non-noise frequency conversion coefficient totalizer 179, and the totalizer 179 does not store them. Conversely, when e2 is open and f2 is closed in the second E switch 183, the residual signal of band 2 is not delivered to the flag and gain generation unit 133, the frequency conversion coefficients in band 2 are delivered to the non-noise frequency conversion coefficient totalizer 179, and the totalizer 179 stores them. The same applies to band 3 and subsequent bands.

この結果、非雑音周波数変換係数集計器１７９には、非雑音帯域における周波数変換係数が集計され記憶される。非雑音周波数変換係数集計器１７９は、かかる周波数変換係数を符号化部１３７に引き渡す。 As a result, the non-noise frequency conversion coefficient totalizer 179 counts and stores frequency conversion coefficients in the non-noise band. The non-noise frequency conversion coefficient totalizer 179 passes the frequency conversion coefficient to the encoding unit 137.
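The flow of this embodiment (transform first, then sort the coefficients into bands, then keep only the non-noise bands) might be sketched as follows. The function names, the `band_edges` layout, and the sample coefficients are hypothetical; the transform itself is assumed to have been computed already over the whole residual.

```python
def split_into_bands(coeffs, band_edges):
    """Role of the discriminator 175: sort every coefficient into its band, dropping none."""
    return [coeffs[lo:hi] for lo, hi in band_edges]


def collect_non_noise(per_band, band_is_noise):
    """Role of the E switches plus totalizer 179: store coefficients of non-noise bands only."""
    return {band: c
            for band, (c, noisy) in enumerate(zip(per_band, band_is_noise))
            if not noisy}


# Coefficients obtained from the entire residual (values are illustrative).
coeffs = [0.9, -0.4, 0.2, 0.1, 0.3, 0.7]
band_edges = [(0, 2), (2, 4), (4, 6)]   # half-open index ranges (assumed layout)
band_is_noise = [False, True, False]    # output of the noise determination (assumed)

per_band = split_into_bands(coeffs, band_edges)
to_encoder = collect_non_noise(per_band, band_is_noise)
```

Because the split happens after the transform, no band filter touches the signal before the coefficients are computed, which is the accuracy argument the embodiment makes.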

本実施形態に係る音声符号化装置１１５が符号化の対象とするものは、先の２つの実施形態に係る音声符号化装置１１１及び音声符号化装置１１３が符号化の対象とするものと同じである。 What the speech encoding apparatus 115 according to the present embodiment encodes is the same as what the speech encoding apparatus 111 and the speech encoding apparatus 113 according to the previous two embodiments encode.

ただし、本実施形態の場合、残差信号を帯域別残差信号に分割する前に周波数変換係数を全帯域に渡って一括して求める。よって、実施形態１の場合と異なり、周波数変換係数が、帯域フィルタ部１２７の性能とは関係なく、高精度で求まる。また、本実施形態においては、周波数変換係数は、実施形態１における非雑音残差信号集計器１６１（図１）での信号の重ね合わせ処理に伴う誤差の影響も受けずに、高精度で求まる。さらに、実施形態２と比較しても、本実施形態においては、周波数変換係数が、Band Elimination Filterの性能とは関係なく、高精度で求まる。 In the present embodiment, however, the frequency conversion coefficients are obtained over the entire band at once, before the residual signal is divided into band-specific residual signals. Unlike in Embodiment 1, therefore, the frequency conversion coefficients are obtained with high accuracy regardless of the performance of the band filter unit 127. They are also unaffected by the error that accompanies the signal superposition in the non-noise residual signal totalizer 161 (FIG. 1) of Embodiment 1. Furthermore, compared with Embodiment 2 as well, the frequency conversion coefficients in the present embodiment are obtained with high accuracy regardless of the performance of the Band Elimination Filter.

（実施形態４） (Embodiment 4)
図４は、本実施形態に係る音声復号装置２１１の機能構成図である。 FIG. 4 is a functional configuration diagram of the speech decoding apparatus 211 according to the present embodiment.

音声復号装置２１１は、図示するように、受信部２２１と、復号部２２３と、フラグ存否判別部２２５と、復号用Ｇスイッチ部２２７と、帯域別雑音列生成部２２９と、集計逆変換部２３１と、残差信号復元部２３３と、合成用フィルタ算出部２３５と、合成用フィルタ部２３７と、Ｄ／Ａ変換部２３９と、スピーカ２４１と、を備える。 As shown in the figure, the speech decoding apparatus 211 includes a receiving unit 221, a decoding unit 223, a flag presence / absence determination unit 225, a decoding G switch unit 227, a band-specific noise sequence generation unit 229, and an aggregate inverse conversion unit 231. A residual signal restoration unit 233, a synthesis filter calculation unit 235, a synthesis filter unit 237, a D / A conversion unit 239, and a speaker 241.

フラグ存否判別部２２５は、第１フラグ存否判別器２４３と、第２フラグ存否判別器２４５と、図４では省略するが、第３フラグ存否判別器以降の必要なフラグ存否判別器と、を備える。 The flag presence / absence determination unit 225 includes a first flag presence / absence discriminator 243, a second flag presence / absence discriminator 245, and, although omitted from FIG. 4, as many further flag presence / absence discriminators (third and onward) as are required.

復号用Ｇスイッチ部２２７は、第１Ｇスイッチ２４７と、第２Ｇスイッチ２４９と、図４では省略するが、第３Ｇスイッチ以降の必要なＧスイッチと、を備える。 The decoding G switch unit 227 includes a first G switch 247, a second G switch 249, and, although omitted from FIG. 4, as many further G switches (third and onward) as are required.

帯域別雑音列生成部２２９は、第１雑音列生成器２５１と、第２雑音列生成器２５３と、図４では省略するが、第３雑音列生成器以降の必要な雑音列生成器と、を備える。 The band-specific noise sequence generation unit 229 includes a first noise sequence generator 251, a second noise sequence generator 253, and, although omitted from FIG. 4, as many further noise sequence generators (third and onward) as are required.

集計逆変換部２３１は、周波数変換係数集計及び補充器２５５と、スペクトル逆変換器２５７と、を備える。 The tabulation inverse conversion unit 231 includes a frequency conversion coefficient tabulation and supplementer 255 and a spectrum inverse converter 257.

受信部２２１は、実施形態１に係る音声符号化装置１１１（図１）や、実施形態２に係る音声符号化装置１１３（図２）や、実施形態３に係る音声符号化装置１１５（図３）が、送信部１３９から無線通信手段等により送信した符号を受け取り、復号部２２３に引き渡す。かかる符号は、予測係数、フラグ、ゲイン、及び、非雑音帯域における周波数変換係数、が符号化されたものである。 The receiving unit 221 receives the code that the speech encoding apparatus 111 according to Embodiment 1 (FIG. 1), the speech encoding apparatus 113 according to Embodiment 2 (FIG. 2), or the speech encoding apparatus 115 according to Embodiment 3 (FIG. 3) has transmitted from its transmission unit 139 by wireless communication or the like, and delivers the code to the decoding unit 223. This code is the encoded form of the prediction coefficients, flags, gains, and non-noise-band frequency conversion coefficients.

復号部２２３は、受信部２２１から引き渡された符号を復号して、上述のように変数i(0≦i≦M-1)により識別した各時間区分における、予測係数と、フラグと、ゲインと、非雑音帯域における周波数変換係数と、を生成する。 The decoding unit 223 decodes the code delivered from the receiving unit 221 and generates, for each time segment identified by the variable i (0 ≦ i ≦ M−1) as described above, the prediction coefficients, the flags, the gains, and the frequency conversion coefficients in the non-noise bands.

復号部２２３は、生成した予測係数を、合成用フィルタ算出部２３５に引き渡す。それとともに、復号部２２３は、生成したフラグとゲインと非雑音帯域における周波数変換係数とを、帯域毎の情報として、フラグ存否判別部２２５及び復号用Ｇスイッチ部２２７に引き渡す。概ね、各帯域の情報のうちフラグの有無に関する情報がフラグ存否判別部２２５に引き渡され、各帯域の情報のうちフラグ以外に関する情報が復号用Ｇスイッチ部２２７に引き渡される。 The decoding unit 223 delivers the generated prediction coefficient to the synthesis filter calculation unit 235. At the same time, the decoding unit 223 delivers the generated flag, gain, and frequency conversion coefficient in the non-noise band to the flag presence / absence determination unit 225 and the decoding G switch unit 227 as information for each band. In general, information regarding the presence / absence of a flag among the information of each band is delivered to the flag presence / absence determination unit 225, and information relating to other than the flag among the information of each band is delivered to the decoding G switch unit 227.

なお、フラグ、ゲイン、周波数変換係数は、帯域毎にみると、復号部２２３により生成されていたり生成されていなかったりする。つまり、音声符号化装置１１１（図１）等の送信側の装置において、雑音帯域についてのみフラグとゲインが生成され符号化され、かつ、非雑音帯域についてのみ周波数変換係数が生成され符号化されているので、受信側の装置である本実施形態に係る音声復号装置２１１の中で復号部２２３による復号が行われても、雑音帯域における周波数変換係数や、非雑音帯域におけるフラグ及びゲインは、生成されることはない。 Note that the flag, gain, and frequency conversion coefficient may or may not be generated by the decoding unit 223 in each band. That is, in the transmission side device such as the speech encoding device 111 (FIG. 1), the flag and the gain are generated and encoded only for the noise band, and the frequency conversion coefficient is generated and encoded only for the non-noise band. Therefore, even if decoding by the decoding unit 223 is performed in the speech decoding apparatus 211 according to the present embodiment which is a receiving-side apparatus, the frequency conversion coefficient in the noise band and the flag and gain in the non-noise band are generated. It will never be done.

復号部２２３の役割のひとつは、帯域１の情報のうち、帯域１におけるフラグの有無を、フラグ存否判別部２２５の中の第１フラグ存否判別器２４３に通知することである。より正確には、復号部２２３は、帯域１におけるフラグが生成された場合にはその旨を第１フラグ存否判別器２４３に通知し、帯域１におけるフラグが生成されなかった場合には第１フラグ存否判別器２４３に何らの通知も行わない。帯域２については、復号部２２３は、帯域２におけるフラグが生成された場合にはその旨を第２フラグ存否判別器２４５に通知し、帯域２におけるフラグが生成されなかった場合には第２フラグ存否判別器２４５に何らの通知も行わない。帯域３以降についても同様である。 One of the roles of the decoding unit 223 is to notify the first flag presence / absence discriminator 243 in the flag presence / absence discriminating unit 225 of the presence / absence of the flag in the band 1 in the band 1 information. More precisely, the decoding unit 223 notifies the first flag presence / absence discriminator 243 when the flag in the band 1 is generated, and the first flag when the flag in the band 1 is not generated. No notification is made to the presence / absence discriminator 243. For the band 2, the decoding unit 223 notifies the second flag presence / absence discriminator 245 when the flag for the band 2 is generated, and the second flag when the flag for the band 2 is not generated. No notification is made to the presence / absence discriminator 245. The same applies to bands 3 and after.

復号部２２３はまた、帯域１の情報のうち、フラグ以外の情報、すなわち、帯域１が雑音帯域であった場合には帯域１におけるゲイン、帯域１が非雑音帯域であった場合には帯域１における周波数変換係数、を、復号用Ｇスイッチ部２２７の中の第１Ｇスイッチ２４７に通知する。帯域２については、フラグ以外の情報、すなわち、帯域２が雑音帯域であった場合には帯域２におけるゲイン、帯域２が非雑音帯域であった場合には帯域２における周波数変換係数、を、復号用Ｇスイッチ部２２７の中の第２Ｇスイッチ２４９に通知する。帯域３以降についても、同様である。 The decoding unit 223 also passes the band 1 information other than the flag, that is, the gain in band 1 if band 1 is a noise band or the frequency conversion coefficients in band 1 if band 1 is a non-noise band, to the first G switch 247 in the decoding G switch unit 227. For band 2, it passes the information other than the flag, that is, the gain in band 2 if band 2 is a noise band or the frequency conversion coefficients in band 2 if band 2 is a non-noise band, to the second G switch 249 in the decoding G switch unit 227. The same applies to band 3 and subsequent bands.

第１フラグ存否判別器２４３は、復号部２２３から帯域１のフラグが生成された旨の通知を受けたか否かを判別し、第１Ｇスイッチ２４７に対して切替命令を送る。ここで、切替命令とは、帯域１のフラグが生成された旨の通知を受けたと判別された場合には第１Ｇスイッチ２４７をｇ１側に閉じ、帯域１のフラグが生成された旨の通知を受けなかったと判別された場合には第１Ｇスイッチをｈ１側に閉じるように、第１Ｇスイッチ２４７を制御する命令である。第２フラグ存否判別器２４５は、復号部２２３から帯域２のフラグが生成された旨の通知を受けたか否かを判別し、第２Ｇスイッチ２４９に対して切替命令を送る。ここで、切替命令とは、帯域２のフラグが生成された旨の通知を受けたと判別された場合には第２Ｇスイッチ２４９をｇ２側に閉じ、帯域２のフラグが生成された旨の通知を受けなかったと判別された場合には第２Ｇスイッチをｈ２側に閉じるように、第２Ｇスイッチ２４９を制御する命令である。帯域３以降についても同様である。 The first flag presence / absence discriminator 243 determines whether it has received notification from the decoding unit 223 that a flag was generated for band 1, and sends a switching command to the first G switch 247. This switching command controls the first G switch 247 so that it closes to the g1 side when the notification is determined to have been received, and closes to the h1 side when the notification is determined not to have been received. The second flag presence / absence discriminator 245 determines whether it has received notification from the decoding unit 223 that a flag was generated for band 2, and sends a switching command to the second G switch 249. This switching command controls the second G switch 249 so that it closes to the g2 side when the notification is determined to have been received, and closes to the h2 side when the notification is determined not to have been received. The same applies to band 3 and subsequent bands.

第１Ｇスイッチ２４７がｇ１側に閉じた場合は、帯域別雑音列生成部２２９の中の第１雑音列生成器２５１に、帯域１のゲインが届けられる。なぜならば、第１Ｇスイッチ２４７がｇ１側に閉じたということは、上述の通り、第１フラグ存否判別器２４３が帯域１におけるフラグの存在を検知したからであり、かかるフラグが存在する以上、復号部２２３は帯域１において周波数変換係数ではなくゲインを復号したことになり、したがって、復号部２２３から帯域１の情報として第１Ｇスイッチ２４７に通知されるのは周波数変換係数ではなくゲインだったということであり、ゆえに、第１Ｇスイッチ２４７がｇ１側に閉じたことにより復号部２２３と第１雑音列生成器２５１との間で接続された信号線を流れる情報は帯域１のゲインということになるからである。 When the first G switch 247 is closed to the g1 side, the gain of band 1 is delivered to the first noise sequence generator 251 in the band-specific noise sequence generation unit 229. The reason is as follows. As described above, the first G switch 247 closes to the g1 side only when the first flag presence / absence discriminator 243 has detected that a flag exists for band 1. Since the flag exists, the decoding unit 223 has decoded a gain rather than frequency conversion coefficients for band 1, so what it passes to the first G switch 247 as the band 1 information is the gain. The information flowing over the signal line that closing the first G switch 247 to the g1 side connects between the decoding unit 223 and the first noise sequence generator 251 is therefore the gain of band 1.

一方、第１Ｇスイッチ２４７がｈ１側に閉じた場合は、集計逆変換部２３１の中の周波数変換係数集計及び補充器２５５に、帯域１における周波数変換係数が届けられる。なぜならば、第１Ｇスイッチ２４７がｈ１側に閉じたということは、上述の通り、第１フラグ存否判別器２４３が帯域１におけるフラグの不在を検知したからであり、かかるフラグが存在しない以上、復号部２２３は帯域１においてゲインではなく周波数変換係数を復号したことになり、したがって、復号部２２３から帯域１の情報として第１Ｇスイッチ２４７に通知されるのはゲインではなく周波数変換係数だったということであり、ゆえに、第１Ｇスイッチ２４７がｈ１側に閉じたことにより復号部２２３と周波数変換係数集計及び補充器２５５との間で接続された信号線を流れる情報は帯域１における周波数変換係数ということになるからである。 On the other hand, when the first G switch 247 is closed to the h1 side, the frequency conversion coefficients in band 1 are delivered to the frequency conversion coefficient tabulation and supplementer 255 in the tabulation inverse conversion unit 231. The reason is as follows. As described above, the first G switch 247 closes to the h1 side only when the first flag presence / absence discriminator 243 has detected that no flag exists for band 1. Since no flag exists, the decoding unit 223 has decoded frequency conversion coefficients rather than a gain for band 1, so what it passes to the first G switch 247 as the band 1 information is the frequency conversion coefficients. The information flowing over the signal line that closing the first G switch 247 to the h1 side connects between the decoding unit 223 and the frequency conversion coefficient tabulation and supplementer 255 is therefore the frequency conversion coefficients in band 1.

同様に、第２Ｇスイッチ２４９がｇ２側に閉じた場合は、帯域別雑音列生成部２２９の中の第２雑音列生成器２５３に、帯域２のゲインが届けられる。なぜならば、第２Ｇスイッチ２４９がｇ２側に閉じたということは、上述の通り、第２フラグ存否判別器２４５が帯域２におけるフラグの存在を検知したからであり、かかるフラグが存在する以上、復号部２２３は帯域２において周波数変換係数ではなくゲインを復号したことになり、したがって、復号部２２３から帯域２の情報として第２Ｇスイッチ２４９に通知されるのは周波数変換係数ではなくゲインだったということであり、ゆえに、第２Ｇスイッチ２４９がｇ２側に閉じたことにより復号部２２３と第２雑音列生成器２５３との間で接続された信号線を流れる情報は帯域２のゲインということになるからである。 Similarly, when the second G switch 249 is closed to the g2 side, the gain of band 2 is delivered to the second noise sequence generator 253 in the band-specific noise sequence generation unit 229. As described above, the second G switch 249 closes to the g2 side only when the second flag presence / absence discriminator 245 has detected that a flag exists for band 2. Since the flag exists, the decoding unit 223 has decoded a gain rather than frequency conversion coefficients for band 2, so what it passes to the second G switch 249 as the band 2 information is the gain. The information flowing over the signal line that closing the second G switch 249 to the g2 side connects between the decoding unit 223 and the second noise sequence generator 253 is therefore the gain of band 2.

一方、第２Ｇスイッチ２４９がｈ２側に閉じた場合は、集計逆変換部２３１の中の周波数変換係数集計及び補充器２５５に、帯域２における周波数変換係数が届けられる。なぜならば、第２Ｇスイッチ２４９がｈ２側に閉じたということは、上述の通り、第２フラグ存否判別器２４５が帯域２におけるフラグの不在を検知したからであり、かかるフラグが存在しない以上、復号部２２３は帯域２においてゲインではなく周波数変換係数を復号したことになり、したがって、復号部２２３から帯域２の情報として第２Ｇスイッチ２４９に通知されるのはゲインではなく周波数変換係数だったということであり、ゆえに、第２Ｇスイッチ２４９がｈ２側に閉じたことにより復号部２２３と周波数変換係数集計及び補充器２５５との間で接続された信号線を流れる情報は帯域２における周波数変換係数ということになるからである。 On the other hand, when the second G switch 249 is closed to the h2 side, the frequency conversion coefficients in band 2 are delivered to the frequency conversion coefficient tabulation and supplementer 255 in the tabulation inverse conversion unit 231. As described above, the second G switch 249 closes to the h2 side only when the second flag presence / absence discriminator 245 has detected that no flag exists for band 2. Since no flag exists, the decoding unit 223 has decoded frequency conversion coefficients rather than a gain for band 2, so what it passes to the second G switch 249 as the band 2 information is the frequency conversion coefficients. The information flowing over the signal line that closing the second G switch 249 to the h2 side connects between the decoding unit 223 and the frequency conversion coefficient tabulation and supplementer 255 is therefore the frequency conversion coefficients in band 2.

The same applies to band 3 and subsequent bands.
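
The switching behavior described above can be sketched in software as follows. This is only an illustrative sketch: the function name `route_band_payloads` and the representation of a decoded frame as a list of `(flag_present, payload)` pairs are assumptions made here, not part of the embodiment.

```python
def route_band_payloads(decoded_bands):
    """Route each band's decoded payload the way the G switches do.

    decoded_bands: list of (flag_present, payload) pairs, band 1 first.
    (This data layout is an assumption made for illustration.)
    Returns (gains, coeffs), two dicts keyed by band number.
    """
    gains = {}   # destined for the noise sequence generators (g side)
    coeffs = {}  # destined for the coefficient aggregator (h side)
    for band, (flag_present, payload) in enumerate(decoded_bands, start=1):
        if flag_present:
            gains[band] = payload    # flag found: the payload is a gain
        else:
            coeffs[band] = payload   # no flag: the payload is coefficients
    return gains, coeffs
```

Each band lands in exactly one of the two dictionaries, which mirrors the fact that each G switch closes to only one side.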

When the first noise sequence generator 251 in the band-specific noise sequence generation unit 229 receives the gain of band 1, it generates a noise sequence for band 1 and passes it to the residual signal restoration unit 233. To do so, it uses a built-in white noise generator (not shown), a built-in band-pass filter (not shown) that extracts the band 1 signal from the white noise, and a built-in multiplier (not shown) that adjusts the signal amplitude according to the received gain. When it does not receive the gain of band 1, it does nothing.

When the second noise sequence generator 253 in the band-specific noise sequence generation unit 229 receives the gain of band 2, it generates a noise sequence for band 2 and passes it to the residual signal restoration unit 233. To do so, it uses a built-in white noise generator (not shown), a built-in band-pass filter (not shown) that extracts the band 2 signal from the white noise, and a built-in multiplier (not shown) that adjusts the signal amplitude according to the received gain. When it does not receive the gain of band 2, it does nothing.
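
A noise sequence generator of this kind might be sketched as below. Several points are assumptions made for illustration: the band-pass filter is realized by zeroing DFT bins outside the band (the text only says a band filter is used), the band is given as a range of DFT bin indices, and the gain is taken to be the mean-square level in decibels, matching the example gain formula given for the encoder. The name `band_noise` is hypothetical.

```python
import cmath
import math
import random


def band_noise(gain_db, length, lo_bin, hi_bin, seed=None):
    """Return `length` samples of noise confined to DFT bins lo_bin..hi_bin.

    Assumes 0 < lo_bin <= hi_bin < length / 2 and that gain_db is
    10 * log10(mean square), so the output mean square is 10 ** (gain_db / 10).
    """
    rng = random.Random(seed)
    white = [rng.gauss(0.0, 1.0) for _ in range(length)]  # white noise source

    # Forward DFT (naive O(n^2) form, adequate for short frames).
    spec = [sum(white[n] * cmath.exp(-2j * math.pi * k * n / length)
                for n in range(length)) for k in range(length)]

    # Band filter: keep the band and its mirror image so the output stays real.
    kept = [spec[k] if (lo_bin <= k <= hi_bin or lo_bin <= length - k <= hi_bin)
            else 0.0 for k in range(length)]

    # Inverse DFT back to the time domain.
    band = [(sum(kept[k] * cmath.exp(2j * math.pi * k * n / length)
                 for k in range(length)) / length).real for n in range(length)]

    # Multiplier: scale so that the mean square matches the received gain.
    power = sum(x * x for x in band) / length
    scale = math.sqrt(10.0 ** (gain_db / 10.0) / power) if power > 0 else 0.0
    return [scale * x for x in band]
```

The generated samples differ from the original residual, but their level in the band matches the transmitted gain.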

Because the flag presence/absence determination unit 225 and the decoding G switch unit 227 operate as described above, the residual signal restoration unit 233 receives, for every noise band, a signal that can be regarded as a plausible residual signal for that band. The word "plausible" is used for the following reason. A transmission-side apparatus such as the speech encoding apparatus 111 (FIG. 1) according to Embodiment 1 characterizes the residual signal in a noise band by its gain alone and conveys only that gain to the speech decoding apparatus 211 according to the present embodiment, which is the reception-side apparatus. The speech decoding apparatus 211 therefore cannot restore the residual signal in a noise band exactly. What it generates is the most appropriate residual signal obtainable when only the gain is known, that is, a plausible signal, or a pseudo residual signal.

When the frequency conversion coefficient aggregator and supplementer 255 in the aggregation and inverse conversion unit 231 receives frequency conversion coefficients for band 1, it stores them. When it does not receive frequency conversion coefficients for band 1, it sets all the frequency conversion coefficients of band 1 to 0 and stores them. Likewise, when it receives frequency conversion coefficients for band 2, it stores them, and when it does not, it sets all the frequency conversion coefficients of band 2 to 0 and stores them. The same applies to band 3 and subsequent bands.

In this way, the frequency conversion coefficient aggregator and supplementer 255 stores the received frequency conversion coefficients as they are for each band in which coefficients were received, and stores 0 as the frequency conversion coefficients for each band in which none were received. It thereby produces a set of frequency conversion coefficients with no missing band. The frequency conversion coefficient aggregator and supplementer 255 passes these coefficients to the spectrum inverse converter 257, which is also in the aggregation and inverse conversion unit 231. The spectrum inverse converter 257 converts the coefficients into a signal in the real time domain, using the inverse conversion method that is paired with the predetermined frequency conversion method used by the spectrum converter 163 in a transmission-side apparatus such as the speech encoding apparatus 111 (FIG. 1) according to Embodiment 1. Because the frequency conversion coefficients received by the spectrum inverse converter 257 have no missing band, as described above, the inverse conversion proceeds without difficulty. The spectrum inverse converter 257 passes the resulting real-time-domain signal to the residual signal restoration unit 233.
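
The zero-filling behavior of the aggregator and supplementer can be sketched as follows. The function name `assemble_spectrum`, the description of bands as `(start, stop)` index ranges, and the dictionary of received coefficients are assumptions made here for illustration. The inverse conversion itself is not shown, because the text leaves the conversion method open.

```python
def assemble_spectrum(band_edges, received):
    """Build a gap-free coefficient list, filling unreceived bands with 0.

    band_edges: list of (start, stop) coefficient index ranges, band 1 first
                (stop is exclusive); this layout is an assumption.
    received:   dict mapping band number to its list of coefficients,
                containing only the bands that actually arrived.
    """
    total = max(stop for _, stop in band_edges)
    spectrum = [0.0] * total  # every band starts out zero-filled
    for band, (start, stop) in enumerate(band_edges, start=1):
        coeffs = received.get(band)
        if coeffs is None:
            continue  # noise band: leave the zeros in place
        if len(coeffs) != stop - start:
            raise ValueError(f"band {band} has the wrong number of coefficients")
        spectrum[start:stop] = coeffs
    return spectrum
```

For example, with three two-coefficient bands of which only band 2 arrived, the result is zero everywhere except in the band 2 positions.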

Because the flag presence/absence determination unit 225 and the decoding G switch unit 227 operate as described above, the real-time-domain signal passed from the spectrum inverse converter 257 to the residual signal restoration unit 233 is the restored residual signal of the non-noise bands.

As a result, the residual signal restoration unit 233 receives the residual signal of the noise bands (a pseudo signal, as described above) from the band-specific noise sequence generation unit 229 and the restored residual signal of the non-noise bands from the aggregation and inverse conversion unit 231, so that it ultimately receives residual signals covering all bands. The residual signal restoration unit 233 superposes the residual signals of these bands to generate the restored residual signal D'_i = {d'_{i,0}, ..., d'_{i,l-1}} (0 ≤ i ≤ M-1). The generated restored residual signal is passed to the synthesis filter unit 237.

The prediction coefficients are passed from the decoding unit 223 to the synthesis filter calculation unit 235. Based on these prediction coefficients, the synthesis filter calculation unit 235 determines the specification of the synthesis filter by any known method and notifies the synthesis filter unit 237 of the result. The synthesis filter unit 237 configures itself according to that notification.

The restored residual signal from the residual signal restoration unit 233 is input to the synthesis filter unit 237 as an excitation signal. As already noted, the residual signal and the excitation signal are simply the same signal seen from different viewpoints. Given this excitation signal, the synthesis filter unit 237 outputs a restored digital speech signal. This signal is converted into an analog speech signal by the D/A conversion unit 239 and then sent to the speaker 241. The speaker 241 thus emits the restored speech signal in a form audible to the human ear.
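
For the linear prediction case, the synthesis step might look like the sketch below. The text allows any known method, so this is only one possibility. It assumes the sign convention in which the residual is d[n] = s[n] + a_1 s[n-1] + ... + a_p s[n-p], so that the synthesis filter is the all-pole recursion s[n] = d[n] - a_1 s[n-1] - ... - a_p s[n-p]. The name `synthesize` is hypothetical.

```python
def synthesize(excitation, a):
    """Run an all-pole synthesis filter over the excitation (residual) signal.

    a: prediction coefficients [a_1, ..., a_p] under the assumed convention
       d[n] = s[n] + sum_k a_k * s[n - k].
    Samples before the start of the frame are taken to be zero.
    """
    out = []
    for n, e in enumerate(excitation):
        acc = e
        for k, ak in enumerate(a, start=1):
            if n - k >= 0:
                acc -= ak * out[n - k]  # feedback from earlier output samples
        out.append(acc)
    return out
```

With a single coefficient a_1 = -0.5, a unit impulse at the input produces the decaying sequence 1, 0.5, 0.25, and so on.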

(Embodiment 5)
FIG. 5 is a functional configuration diagram of the speech decoding apparatus 213 according to the present embodiment.

The present embodiment is a modification of Embodiment 4. Whereas Embodiment 4 generates a real-time signal for each noise band, the present embodiment first generates frequency conversion coefficients that match the gain of each noise band and then inverse-converts them to the real time domain in a single pass, together with the frequency conversion coefficients of the non-noise bands.

Compared with the speech decoding apparatus 211 (FIG. 4) according to Embodiment 4, the speech decoding apparatus 213 according to the present embodiment replaces the band-specific noise sequence generation unit 229 (FIG. 4) with a band-specific constant frequency conversion coefficient generation unit 259, partly changes the configuration of the signal lines downstream of the decoding G switch unit 227, replaces the aggregation and inverse conversion unit 231 (FIG. 4) with a batch aggregation and inverse conversion unit 261, and omits the residual signal restoration unit 233.

When the first constant frequency conversion coefficient generator 263 in the band-specific constant frequency conversion coefficient generation unit 259 receives the gain of band 1, it first generates, as the frequency conversion coefficients of band 1, a sequence of unit-valued frequency conversion coefficients. It then multiplies the sequence by the gain to obtain the frequency conversion coefficients of band 1 and passes them to the frequency conversion coefficient aggregator 267 in the batch aggregation and inverse conversion unit 261. When it does not receive the gain of band 1, it performs no operation.

When the second constant frequency conversion coefficient generator 265 in the band-specific constant frequency conversion coefficient generation unit 259 receives the gain of band 2, it first generates, as the frequency conversion coefficients of band 2, a sequence of unit-valued frequency conversion coefficients. It then multiplies the sequence by the gain to obtain the frequency conversion coefficients of band 2 and passes them to the frequency conversion coefficient aggregator 267 in the batch aggregation and inverse conversion unit 261. When it does not receive the gain of band 2, it performs no operation.

The frequency conversion coefficients can be handed over in this way because the signal lines connecting the decoding G switch unit 227 and the batch aggregation and inverse conversion unit 261 are provided with the connection points denoted g'1 and g'2 in FIG. 5. The first G switch 247 closes to only one of g1 and h1, and the second G switch 249 closes to only one of g2 and h2. Therefore only one of two signals flows into each of the connection points g'1 and g'2, either the signal from the decoding G switch unit 227 or the signal from the band-specific constant frequency conversion coefficient generation unit 259, and the two kinds of signal never mix.

The frequency conversion coefficient aggregator 267 in the batch aggregation and inverse conversion unit 261 has a function very similar to that of the frequency conversion coefficient aggregator and supplementer 255 in the aggregation and inverse conversion unit 231 of the speech decoding apparatus 211 (FIG. 4) according to Embodiment 4. The frequency conversion coefficient aggregator and supplementer 255 (FIG. 4) of Embodiment 4 had to generate frequency conversion coefficients consisting of the value 0 in order to fill in the missing bands.

In the present embodiment, however, as is clear from the operation of the flag presence/absence determination unit 225, the decoding G switch unit 227, and the band-specific constant frequency conversion coefficient generation unit 259, the frequency conversion coefficient aggregator 267 only has to store the coefficients it receives in order to produce frequency conversion coefficients with no missing band and supply them to the spectrum inverse converter 257 for inverse conversion.
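
The difference from Embodiment 4 can be illustrated with the sketch below, in which every band contributes coefficients and no zero-filling is needed. The function name `assemble_full_spectrum`, the `(start, stop)` band ranges, and the `unit` parameter are assumptions made for illustration. The sketch also assumes the gain is already a linear scale factor. If the gain is transmitted on a logarithmic scale, it would have to be converted first, and the text does not spell out that step.

```python
def assemble_full_spectrum(band_edges, coeffs, gains, unit=1.0):
    """Collect coefficients for all bands, as in Embodiment 5.

    band_edges: list of (start, stop) index ranges, band 1 first,
                assumed contiguous (an assumption of this sketch).
    coeffs:     dict band -> decoded coefficients (non-noise bands).
    gains:      dict band -> gain, assumed linear (noise bands).
    """
    spectrum = []
    for band, (start, stop) in enumerate(band_edges, start=1):
        if band in gains:
            # Noise band: a run of unit coefficients multiplied by the gain.
            spectrum.extend([unit * gains[band]] * (stop - start))
        else:
            # Non-noise band: the decoded coefficients are simply stored.
            spectrum.extend(coeffs[band])
    return spectrum
```

Because each band appears in exactly one of the two dictionaries, the aggregator never has to invent values of its own.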

As is also clear from the operation of the flag presence/absence determination unit 225, the decoding G switch unit 227, and the band-specific constant frequency conversion coefficient generation unit 259, the residual signal generated by the spectrum inverse converter 257 in the batch aggregation and inverse conversion unit 261 already contains both the noise band components and the non-noise band components.

This residual signal can therefore be passed as it is to the synthesis filter unit 237 as the restored residual signal, or excitation signal. The present embodiment does not need the residual signal restoration unit 233 (FIG. 4), which Embodiment 4 used to generate the restored residual signal by superposing the noise band components and the non-noise band components.

In principle, then, the present embodiment operates in the same way as Embodiment 4. However, whereas Embodiment 4 requires a white noise source and band-pass filters (neither shown) in the band-specific noise sequence generation unit 229 (FIG. 4), the present embodiment needs no such components and is therefore simpler.

(Embodiment 6)
FIG. 6 shows a speech encoding/decoding device 311 according to the present embodiment. The speech encoding apparatuses 111, 113, and 115 and the speech decoding apparatuses 211 and 213 according to Embodiments 1 to 5, described so far with reference to the functional configuration diagrams of FIGS. 1 to 5, are physically realized by the speech encoding/decoding device 311 according to the present embodiment, which integrates the functions of both kinds of apparatus for ease of use. The following description assumes that the speech encoding/decoding device 311 is a mobile phone.

The speech encoding/decoding device 311 includes the microphone 121 already shown in FIGS. 1 to 3 and the speaker 241 already shown in FIGS. 4 and 5. The device further includes an antenna 335 and operation keys 337. The device also includes a CPU 321, a ROM (Read Only Memory) 323, a storage unit 325, a voice processing unit 329, a wireless communication unit 327, and an operation key input processing unit 331, which are interconnected by a system bus 333. The storage unit 325 includes, for example, a RAM (Random Access Memory) 339 and a hard disk 341. In addition to the components shown in FIG. 6, the speech encoding/decoding device 311 may include, for example, the Band Elimination Filter 169 described in Embodiment 2 as separate dedicated hardware.

The ROM 323 stores an operation program for speech encoding and decoding. The CPU 321 operates according to this program. Exchanging data as needed between its built-in registers (not shown) and the storage unit 325, the CPU 321 performs numerical computation that causes the speech encoding/decoding device 311 to function as the speech encoding apparatuses 111, 113, and 115 and the speech decoding apparatuses 211 and 213 shown in FIGS. 1 to 5. In doing so, the CPU 321 exchanges data with the voice processing unit 329, the wireless communication unit 327, and the operation key input processing unit 331 as necessary.

The voice processing unit 329 in FIG. 6 can operate as the A/D conversion unit 123 in FIGS. 1 to 3 and as the D/A conversion unit 239 in FIGS. 4 and 5. The wireless communication unit 327 can operate as the transmission unit 139 in FIGS. 1 to 3 and as the reception unit 221 in FIGS. 4 and 5. Codes are basically transmitted and received by wireless communication using the antenna 335 in FIG. 6, but another method, for example wired communication, may be used. The operation key input processing unit 331 receives an operation signal from the operation keys 337 and sends the corresponding key code signal to the CPU 321. The operation keys 337 are used to specify the speech encoding/decoding device 311 of the other party, that is, to enter a so-called telephone number, and may also be used to change various items that are basically preset, according to the user's preference.

(Predictive analysis procedure)
The prediction analysis performed by the prediction analysis unit 125 in FIGS. 1 to 3 is described below with reference to the flowchart shown in FIG. 7. Known forms of prediction analysis include, for example, linear prediction analysis and MLSA (Mel Log Spectrum Approximation) analysis. FIG. 7 shows both analyses together, with the latter in parentheses.

Assume that the digital speech signal (input waveform) S_i = {s_{i,0}, ..., s_{i,l-1}} (0 ≤ i ≤ M-1) is already stored in the storage unit 325 (FIG. 6). The CPU 321 (FIG. 6) uses a built-in counter register (not shown) to hold the input signal sample counter i and sets its initial value to i = 0 (step S411 in FIG. 7).

The CPU 321 loads the input signal samples S_i = {s_{i,0}, ..., s_{i,l-1}} from the storage unit 325 into a built-in general-purpose register (not shown) (step S413 in FIG. 7).

In the case of linear prediction analysis, the CPU 321 calculates the linear prediction coefficients A_i = {a_{i,1}, ..., a_{i,n}} from the input signal samples S_i (step S415), where n is the order of the linear prediction analysis. Any known calculation method may be used, provided that it yields a residual signal that is judged sufficiently small by a predetermined measure. For example, the well-known method that combines calculation of the autocorrelation function with the Levinson-Durbin algorithm is suitable.
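
A minimal sketch of the autocorrelation method with the Levinson-Durbin recursion is given below. It is one textbook formulation rather than the specific implementation of the embodiment. The function names are hypothetical, and the returned coefficients follow the assumed convention in which the prediction error filter is A(z) = 1 + a_1 z^-1 + ... + a_n z^-n.

```python
def autocorrelation(x, order):
    """Return the autocorrelation values r[0], ..., r[order] of one frame."""
    return [sum(x[n] * x[n - k] for n in range(k, len(x)))
            for k in range(order + 1)]


def levinson_durbin(r, order):
    """Solve the normal equations for the prediction coefficients.

    r: autocorrelation values r[0..order].
    Returns ([a_1, ..., a_order], final prediction error power).
    """
    a = [1.0] + [0.0] * order
    err = r[0]
    for m in range(1, order + 1):
        if err <= 0.0:
            break  # silent or degenerate frame: stop early
        acc = r[m] + sum(a[k] * r[m - k] for k in range(1, m))
        refl = -acc / err  # reflection coefficient of stage m
        new_a = a[:]
        for k in range(1, m):
            new_a[k] = a[k] + refl * a[m - k]
        new_a[m] = refl
        a = new_a
        err *= 1.0 - refl * refl
    return a[1:], err
```

A typical call would be `levinson_durbin(autocorrelation(frame, n), n)` for a frame of samples and analysis order n.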

In the case of MLSA analysis, the CPU 321 first calculates the cepstrum C_i = {c_{i,0}, ..., c_{i,(l/2)-1}} from the input signal samples S_i. Any known method may be used for this calculation. In every such method the procedure is broadly to take the discrete Fourier transform, take the absolute value, take the logarithm, and take the inverse discrete Fourier transform. Next, the CPU 321 calculates the MLSA filter coefficients M_i = {m_{i,0}, ..., m_{i,p-1}} from the obtained cepstrum C_i by any known method (step S415).
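
The four steps named above (discrete Fourier transform, absolute value, logarithm, inverse discrete Fourier transform) can be sketched as follows. The function names are hypothetical, the DFT is written in its naive form for clarity, and the small `floor` that guards against taking the logarithm of zero is an assumption added here. The conversion from the cepstrum to MLSA filter coefficients is not shown.

```python
import cmath
import math


def dft(x, inverse=False):
    """Naive discrete Fourier transform (or its inverse) of a sequence."""
    n = len(x)
    sign = 1 if inverse else -1
    out = [sum(x[t] * cmath.exp(sign * 2j * math.pi * k * t / n)
               for t in range(n)) for k in range(n)]
    return [v / n for v in out] if inverse else out


def real_cepstrum(frame, floor=1e-12):
    """Return the first len(frame) // 2 real cepstrum coefficients."""
    spectrum = dft(frame)                                       # DFT
    log_mag = [math.log(max(abs(v), floor)) for v in spectrum]  # |.| then log
    cep = dft(log_mag, inverse=True)                            # inverse DFT
    return [c.real for c in cep[:len(frame) // 2]]
```

Keeping l/2 coefficients matches the index range of the cepstrum C_i given in the text.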

The linear prediction coefficients A_i = {a_{i,1}, ..., a_{i,n}} in the case of linear prediction analysis, or the MLSA filter coefficients M_i = {m_{i,0}, ..., m_{i,p-1}} in the case of MLSA analysis, are stored in the storage unit 325 as the prediction coefficients (step S417).

Next, in the case of linear prediction analysis, the inverse linear prediction filter for prediction analysis AIA_i is calculated from the linear prediction coefficients A_i by any known method. In the case of MLSA analysis, the inverse MLSA filter for prediction analysis AIM_i is calculated from the MLSA filter coefficients M_i by any known method (step S419). These calculations correspond to those performed by the prediction analysis inverse filter calculator 141 in FIGS. 1 and 2.

Passing the input signal samples S_i = {s_{i,0}, ..., s_{i,l-1}} through the inverse linear prediction filter AIA_i or the inverse MLSA filter AIM_i thus obtained yields the residual signal D_i = {d_{i,0}, ..., d_{i,l-1}} (step S421 in FIG. 7). The residual signal D_i is stored in the storage unit 325 (step S423).
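
For the linear prediction case, passing a frame through the inverse filter might be sketched as below. This is an illustration under an assumed sign convention, d[n] = s[n] + a_1 s[n-1] + ... + a_n s[n-n], with samples before the frame taken as zero. The function name `residual` is hypothetical, and the MLSA case is not covered.

```python
def residual(signal, a):
    """Apply the prediction error (inverse) filter to one frame.

    a: prediction coefficients [a_1, ..., a_n] under the assumed convention
       d[n] = s[n] + sum_k a_k * s[n - k].
    """
    out = []
    for n, s in enumerate(signal):
        acc = s
        for k, ak in enumerate(a, start=1):
            if n - k >= 0:
                acc += ak * signal[n - k]  # remove the predictable part
        out.append(acc)
    return out
```

When the coefficients fit the signal well, most of the frame's energy is removed and the residual is small.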

It is then determined whether the input signal sample counter i has reached M-1 (step S425). If it has (step S425; Yes), the procedure ends. If it has not (step S425; No), i is incremented by 1 (step S427) in order to process the input signal samples of the next time interval, and the processing from step S413 onward is repeated.

(Procedure for generating flags and other data from the residual signal)
The following describes the procedure by which the speech encoding apparatuses 111, 113, and 115 according to Embodiments 1 to 3 generate flags, gains, and frequency conversion coefficients from the residual signal.

As a premise, assume that the A/D conversion unit 123 (FIGS. 1 to 3) has already obtained the digital speech signal (input signal) S_i = {s_{i,0}, ..., s_{i,l-1}} (0 ≤ i ≤ M-1) and stored it in the storage unit 325 (FIG. 6). Assume also that the prediction analysis unit 125 (FIGS. 1 to 3), having received this signal, has already obtained the prediction coefficients and the residual signal D_i = {d_{i,0}, ..., d_{i,l-1}} (0 ≤ i ≤ M-1), and that these are likewise stored in the storage unit 325.

First, the processing performed by the speech encoding apparatus 111 (FIG. 1) according to Embodiment 1 is described with reference to the flowcharts shown in FIGS. 8 and 9.

The CPU 321 (FIG. 6) sets the input signal sample counter i to i = 0 in a built-in counter register (not shown) (step S431 in FIG. 8).

The CPU 321 loads the residual signal D_i = {d_{i,0}, ..., d_{i,l-1}} from the storage unit 325 (FIG. 6) into a built-in general-purpose register (not shown) (step S433 in FIG. 8).

The CPU 321 sets the band identification variable ω to ω = 1 in the counter register (step S435).

Functioning as the band filter unit 127 (FIG. 1), the CPU 321 generates the residual signal of band ω, D_{i,ω} = {d_{i,ω,0}, ..., d_{i,ω,l-1}} (step S437 in FIG. 8).

Functioning as the noise determination unit 129 (FIG. 1), the CPU 321 determines whether D_{i,ω} is noise (step S439). If D_{i,ω} is determined to be noise (step S439; Yes), the CPU 321 functions as the flag and gain generation unit 133 (FIG. 1), generating the flag Flag_{i,ω} for band ω and calculating the gain G_{i,ω} for band ω (step S441 in FIG. 8).

After Flag_{i,ω} and the gain G_{i,ω} are stored in the storage unit (step S443), the process proceeds to step S447. If D_{i,ω} is determined not to be noise (step S439; No), the CPU 321 functions as the aggregation and conversion unit 135 (FIG. 1), stores D_{i,ω} in the storage unit 325 (step S445 in FIG. 8), and then proceeds to step S447.

Various methods are conceivable for determining in step S439 whether D_{i,ω} is noise. One preferable example is as follows. For the residual signal D_{i,ω}, the normalized autocorrelation function

\[ C_{REG}(t) = \frac{C(t)}{REG(t)} \]

is calculated, where

\[ C(t) = d_{i,\omega,0}\, d_{i,\omega,t} + \cdots + d_{i,\omega,l-1-t}\, d_{i,\omega,l-1} \]

and

\[ REG(t) = \left\{ \left( d_{i,\omega,0}^{2} + \cdots + d_{i,\omega,l-1-t}^{2} \right) \times \left( d_{i,\omega,t}^{2} + \cdots + d_{i,\omega,l-1}^{2} \right) \right\}^{0.5}. \]

If C_{REG}(t) has a local maximum greater than, for example, 0.5, the signal is determined not to be noise. If C_{REG}(t) has no local maximum greater than 0.5, the signal is determined to be noise.
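
The normalized autocorrelation test might be sketched as follows. The text does not state which lags t are examined or how a local maximum is detected, so the choices made here are assumptions: lags from 1 up to half the frame length, and a local maximum defined as a value greater than its left neighbor and not smaller than its right neighbor. The function names are hypothetical.

```python
import math


def normalized_autocorrelation(d, t):
    """Compute C_REG(t) = C(t) / REG(t) for one band-limited residual frame."""
    n = len(d)
    c = sum(d[k] * d[k + t] for k in range(n - t))
    reg = math.sqrt(sum(d[k] ** 2 for k in range(n - t))
                    * sum(d[k] ** 2 for k in range(t, n)))
    return c / reg if reg > 0.0 else 0.0


def is_noise(d, threshold=0.5, max_lag=None):
    """Return False if C_REG has a local maximum above the threshold."""
    max_lag = max_lag or len(d) // 2  # lag range is an assumption
    values = [normalized_autocorrelation(d, t) for t in range(max_lag + 1)]
    for t in range(1, max_lag):
        is_peak = values[t] > values[t - 1] and values[t] >= values[t + 1]
        if is_peak and values[t] > threshold:
            return False  # strong periodicity: not noise
    return True
```

A strongly periodic frame produces a peak near 1 at its period and is classified as non-noise, whereas a frame without such a peak is classified as noise.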

Various methods are likewise conceivable for calculating the gain G_{i,ω} in step S441. One preferable example is

\[ G_{i,\omega} = 10 \log_{10} \left\{ \mathrm{Avg}\left( d_{i,\omega}^{2} \right) \right\}, \]

\[ \mathrm{Avg}\left( d_{i,\omega}^{2} \right) = \frac{ d_{i,\omega,0}^{2} + \cdots + d_{i,\omega,l-1}^{2} }{ l }. \]

The logarithm is taken in consideration of the relationship between loudness and the sensitivity of human hearing.
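
The gain formula translates directly into code, as in the sketch below. The function name `band_gain_db` is hypothetical, and the `floor` that avoids taking the logarithm of zero for an all-zero frame is an assumption added here.

```python
import math


def band_gain_db(d, floor=1e-12):
    """Return 10 * log10 of the mean square of the band residual d."""
    mean_square = sum(x * x for x in d) / len(d)
    return 10.0 * math.log10(max(mean_square, floor))
```

For instance, a frame whose samples all have magnitude 0.1 has a mean square of 0.01 and therefore a gain of about -20.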

In step S447, it is determined whether ω has reached the predetermined natural number ω_fin that indicates the final band. If it has not (step S447; No), ω is incremented by 1 (step S449) and the process returns to step S437. If it has (step S447; Yes), the process proceeds to step S451. The bands are basically numbered from the low-frequency side, so that ω = 1 denotes the lowest frequency band and ω = ω_fin denotes the highest frequency band.

In step S451, it is determined whether i has reached M-1. If it has not (step S451; No), i is incremented by 1 (step S453) and the process returns to step S433. If it has (step S451; Yes), the process proceeds to step S461 in FIG. 9. At this point, flags and gains have been generated and stored in the storage unit 325 only for the noise bands, and of the band components of the residual signal, only those of the non-noise bands are stored in the storage unit 325.
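
The control flow of steps S431 to S453 can be summarized in the sketch below. The band filter, the noise test, and the gain calculation are passed in as functions so that the sketch stays independent of how they are realized. The function name `classify_bands` and the use of dictionaries keyed by `(i, band)` to stand in for the storage unit are assumptions made for illustration.

```python
def classify_bands(residual_frames, num_bands, band_filter, is_noise, gain_of):
    """Split each residual frame into bands and store flag/gain or signal.

    residual_frames: list of residual frames D_i, i = 0 .. M-1.
    band_filter(frame, band) -> band-limited residual D_{i,band}.
    is_noise(d) -> bool, gain_of(d) -> gain value.
    Returns (gains, stored_residuals), both keyed by (i, band).
    A key present in `gains` plays the role of the flag for that band.
    """
    gains = {}
    stored_residuals = {}
    for i, frame in enumerate(residual_frames):      # S431, S451, S453
        for band in range(1, num_bands + 1):         # S435, S447, S449
            d = band_filter(frame, band)             # S437
            if is_noise(d):                          # S439
                gains[(i, band)] = gain_of(d)        # S441, S443
            else:
                stored_residuals[(i, band)] = d      # S445
    return gains, stored_residuals
```

After the loops finish, every band of every frame appears in exactly one of the two dictionaries, which corresponds to the state described at the end of step S451.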

In step S461, the CPU 321 sets the input signal sample counter i to i = 0.

The CPU 321 places the non-noise residual signal D_{i,v} = {d_{i,v,0}, ..., d_{i,v,l-1}} in a general-purpose register and initializes d_{i,v,0}, ..., d_{i,v,l-1} all to 0 (step S463). It also sets the band identification variable ω to ω = 1 (step S465).

The CPU 321 searches the storage unit 325 to check whether D_{i,ω} is stored (step S467). If the check (step S469) finds that D_{i,ω} is stored (step S469; Yes), band ω is a non-noise band, and superposition is required in order to obtain the non-noise residual signal. This processing corresponds to the function of the non-noise residual signal aggregator 161 in FIG. 1. D_{i,ω} is therefore loaded into a register (step S471) and superposed on the D_{i,v} held in the register so far. After D_{i,v} is updated to the sequence resulting from this superposition (step S473), the process proceeds to step S475.

一方、ステップＳ４６９において、D_i、ωが格納されてないと判別された場合（ステップＳ４６９；Ｎｏ）、帯域ωは雑音帯域であるから、非雑音残差信号を求めるための重ね合わせは行われずに、ステップＳ４７５に進む。 On the other hand, when it is determined in step S469 that Di _{and ω} are not stored (step S469; No), since the band ω is a noise band, the superposition for obtaining the non-noise residual signal is not performed. Then, the process proceeds to step S475.

ステップＳ４７５において、ωがω_finに達したか否かが判別される。達していないと判別された場合（ステップＳ４７５；Ｎｏ）は、ωを1増加してから（ステップＳ４７７）、ステップＳ４６７に戻り、非雑音帯域の探索と、非雑音帯域が見つかった場合の上述の重ね合わせ処理と、が繰り返される。ωがω_finに達したと判別された場合（ステップＳ４７５；Ｙｅｓ）、ステップＳ４７９に進む。 In step S475, it is determined whether or not ω has reached ω _fin . If it is determined that it has not been reached (step S475; No), ω is incremented by 1 (step S477), and the process returns to step S467 to search for the non-noise band and the above-described case where the non-noise band is found. The superposition process is repeated. When it is determined that ω has reached ω _fin (step S475; Yes), the process proceeds to step S479.
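The loop of steps S463 to S477 can be sketched as follows. This is only an illustrative sketch: the function name `sum_non_noise_bands` and the use of a dictionary keyed by band number to stand in for the storage unit 325 are assumptions, not part of the described apparatus.

```python
def sum_non_noise_bands(stored, num_bands, length):
    """Superimpose the stored band signals of one time section (steps S463 to S477)."""
    d_v = [0.0] * length                      # S463: D_{i,v} starts as all zeros
    for band in range(1, num_bands + 1):      # S465, S475, S477: band = 1 .. omega_fin
        band_signal = stored.get(band)        # S467, S469: is D_{i,omega} stored?
        if band_signal is None:
            continue                          # noise band: nothing to superimpose
        # S471, S473: add the band signal sample by sample
        d_v = [a + b for a, b in zip(d_v, band_signal)]
    return d_v


# Hypothetical data: bands 2 and 3 are non-noise, bands 1 and 4 are noise.
stored_bands = {2: [1.0, 0.0, -1.0], 3: [0.5, 0.5, 0.5]}
non_noise = sum_non_noise_bands(stored_bands, num_bands=4, length=3)
print(non_noise)
```

Bands that are absent from the dictionary play the role of noise bands, for which no D_{i,ω} was stored.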

In step S479, the CPU 321 functions as the spectrum converter 163 (FIG. 1) and frequency-converts D_{i,v} by a predetermined method to obtain the spectrum F[D_{i,v}]. F[D_{i,v}] consists of frequency conversion coefficients.

In the following description and in FIG. 8 and the subsequent drawings, to keep the terminology simple, a set of frequency conversion coefficients forming a spectrum, or a group of frequency conversion coefficients forming part of a spectrum, may be referred to simply as frequency conversion coefficients. An individual frequency coefficient belonging to all or part of a spectrum may be referred to as a component.

The CPU 321 sets the band identification variable ω to ω = 1 (step S481 in FIG. 9), searches the storage unit 325 to check whether D_{i,ω} is stored (step S483), and determines whether it is stored (step S485). Insofar as it concerns processing that branches between noise bands and non-noise bands, this procedure (steps S481 to S485) is exactly the same as steps S465 to S469 described above.

For this branching, when searching the storage unit 325, the CPU 321 may check whether the flag Flag_{i,ω} is found instead of checking whether D_{i,ω} is found as described above. This is because, as is clear from steps S437 to S449 described above, exactly one of D_{i,ω} and Flag_{i,ω} is stored in the storage unit.

If it is determined that D_{i,ω} is stored (step S485; Yes), ω is a non-noise band. The CPU 321 therefore functions as the non-noise band extractor 162, cuts out from F[D_{i,v}] the frequency conversion coefficients of band ω, F[D_{i,v}](ω) = {f_{D,i,v,ω,1}, ..., f_{D,i,v,ω,p(ω)}} (step S487), and stores them in the storage unit 325 (step S489). Here, p(ω) is the number of frequency conversion coefficients in band ω. The process then proceeds to step S491.

On the other hand, if it is determined that D_{i,ω} is not stored (step S485; No), ω is a noise band, so no frequency conversion coefficients are cut out and the process proceeds to step S491.

In step S491, it is determined whether ω has reached ω_fin. If it has not (step S491; No), ω is incremented by 1 (step S493) and the process returns to step S483. If it has (step S491; Yes), the per-band processing for the i-th time section is complete, so the process proceeds to step S495.

In step S495, it is determined whether i has reached M-1. If it has not (step S495; No), i is incremented by 1 (step S497) and the process returns to step S463. If it has (step S495; Yes), the processing for all time sections is complete, so the entire process of obtaining the flags and other data from the residual signal ends.
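The cut-out of steps S481 to S493 can be sketched as below, using the flag-based check that the text allows as an alternative. The mapping `band_edges` from a band number to a half-open range of coefficient indices is a hypothetical representation; the description does not specify how band boundaries are stored.

```python
def extract_non_noise_coeffs(spectrum, band_edges, noise_flags):
    """Keep F[D_{i,v}](omega) only for bands that carry no noise flag."""
    kept = {}
    for band in sorted(band_edges):           # S481, S491, S493: loop over bands
        if band in noise_flags:               # S485 (No branch): noise band, skip
            continue
        start, stop = band_edges[band]
        kept[band] = spectrum[start:stop]     # S487, S489: cut out p(omega) coefficients
    return kept


# Hypothetical spectrum of 6 coefficients split into 3 bands; band 2 is noise.
spectrum = [0.9, 0.1, 0.4, 0.3, 0.0, 0.2]
band_edges = {1: (0, 2), 2: (2, 4), 3: (4, 6)}
kept = extract_non_noise_coeffs(spectrum, band_edges, noise_flags={2})
print(kept)
```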

Next, the procedure performed by the speech encoding device 113 (FIG. 2) according to Embodiment 2 is described with reference to the flowchart shown in FIG. 10.

The CPU 321 sets the input signal sample counter i to i = 0 (step S511 in FIG. 10).

The CPU 321 loads the input signal sample D_i = {d_{i,0}, ..., d_{i,l-1}} from the storage unit 325 into a register, sets the band identification variable ω to ω = 1 (step S513), and generates the residual signal of band ω, D_{i,ω} = {d_{i,ω,0}, ..., d_{i,ω,l-1}} (step S515).

The CPU 321 determines whether D_{i,ω} is noise (step S517). If D_{i,ω} is determined to be noise (step S517; Yes), the flag Flag_{i,ω} for band ω is generated, the gain G_{i,ω} is calculated, and both are stored in the storage unit 325 (step S519), after which the process proceeds to step S523.

On the other hand, if D_{i,ω} is determined not to be noise (step S517; No), the Band Elimination Filter 169 (FIG. 2) is set so that band ω is not subject to elimination (step S521), after which the process proceeds to step S523.

This setting of the Band Elimination Filter 169 is made cumulatively on each pass of the loop over ω indicated by step S525 and related steps described later. For example, if the noise determination in step S517 finds that, of all the bands, bands 2, 5, and 6 are not noise, the Band Elimination Filter 169 is ultimately set to eliminate every band other than those three bands (band 2, band 5, and band 6).
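The cumulative setting in this example can be sketched as follows. The total of 8 bands and the callable `is_noise` are assumptions made only for illustration; the text does not state the number of bands.

```python
def bands_to_eliminate(num_bands, is_noise):
    """Accumulate the bands to keep (step S521) and return the bands left to eliminate."""
    keep = set()
    for band in range(1, num_bands + 1):      # loop over omega (steps S523, S525)
        if not is_noise(band):                # S517; No: band is not noise
            keep.add(band)                    # S521: exclude this band from elimination
    return set(range(1, num_bands + 1)) - keep


# Bands 2, 5 and 6 are judged not to be noise, as in the example in the text.
eliminated = bands_to_eliminate(8, lambda band: band not in {2, 5, 6})
print(sorted(eliminated))
```

With these assumed inputs the filter would eliminate bands 1, 3, 4, 7, and 8, leaving only the three non-noise bands.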

In step S523, it is determined whether ω has reached ω_fin. If it has not (step S523; No), ω is incremented by 1 (step S525) and the process returns to step S515. If it has (step S523; Yes), the process proceeds to step S527.

In step S527, it is determined whether i has reached M-1. If it has not (step S527; No), i is incremented by 1 (step S529) and the process returns to step S513. If it has (step S527; Yes), the process proceeds to step S531.

At this point, the CPU 321 has almost completed its function as the flag and gain generation unit 133 and, functioning as the non-noise band determiner 167 (FIG. 2), has completed the operation of sending the elimination band designation command to the Band Elimination Filter 169.

In step S531, the input signal sample counter i is set to i = 0. The CPU 321 then loads the residual signal D_i and passes it through the Band Elimination Filter 169, whose cumulative setting in step S521 is already complete, to generate the non-noise residual signal D_{i,v} = {d_{i,v,0}, ..., d_{i,v,l-1}}.

The Band Elimination Filter 169 may be a function or the like included in the CPU operation program stored in the ROM 323 (FIG. 6), or it may be provided separately as hardware.

The CPU 321 further frequency-converts the generated non-noise residual signal D_{i,v} to obtain the spectrum F[D_{i,v}] (step S533).

In step S535, the band identification variable ω is set to ω = 1.

In steps S537 and S539, it is determined whether band ω is a noise band or a non-noise band. In FIG. 2, this corresponds to the non-noise band determiner 167 sending information on the non-noise bands to the non-noise band extractor 162. In the flowcharts shown in FIGS. 8 and 9, this determination is made by searching for the per-band residual signal D_{i,ω}. As described above, the determination may instead be made by searching for the flag Flag_{i,ω}.

In the procedure shown in FIG. 10, by contrast, as is also clear from FIG. 2 showing the corresponding Embodiment 2, there is no step of storing the per-band residual signal D_{i,ω} in the storage unit 325 (step S445 in FIG. 8), because Embodiment 2 does not need to superimpose the D_{i,ω}. Consequently, the only available method for the determination in steps S537 and S539 of FIG. 10 is to search for the flag Flag_{i,ω}. The same applies to FIGS. 11 and 12, the flowcharts for Embodiment 3 described later.

In step S539, therefore, it is determined whether the flag Flag_{i,ω} is stored in the storage unit 325. If it is determined that Flag_{i,ω} is not stored (step S539; No), band ω is a non-noise band, so the frequency conversion coefficients of band ω, F[D_{i,v}](ω) = {f_{D,i,v,ω,1}, ..., f_{D,i,v,ω,p(ω)}}, are cut out from the spectrum F[D_{i,v}] obtained in step S533 and stored in the storage unit 325 (step S541), after which the process proceeds to step S543.

On the other hand, if it is determined that Flag_{i,ω} is stored (step S539; Yes), band ω is a noise band, so the process proceeds directly to step S543.

In step S543, it is determined whether ω has reached ω_fin. If it has not (step S543; No), ω is incremented by 1 (step S545) and the process returns to step S537. If it has (step S543; Yes), the per-band processing for the i-th time section is complete, so the process proceeds to step S547.

In step S547, it is determined whether i has reached M-1. If it has not (step S547; No), i is incremented by 1 (step S549) and the process returns to step S533. If it has (step S547; Yes), the processing for all time sections is complete, so the entire process of obtaining the flags and other data from the residual signal ends.

Next, the procedure performed by the speech encoding device 115 (FIG. 3) according to Embodiment 3 is described with reference to the flowcharts shown in FIGS. 11 and 12.

First, the input signal sample counter i is set to i = 0 (step S540 in FIG. 11).

The input signal sample D_i = {d_{i,0}, ..., d_{i,l-1}} is loaded (step S542), its spectrum F[D_i] is obtained (step S544), and the process proceeds to step S546. As is clear from the position of the spectrum converter 163 in FIG. 3, Embodiment 3 is characterized in that the frequency conversion is performed at an earlier stage than in the two preceding embodiments. As already described, because the frequency conversion is performed before any preprocessing, the frequency conversion coefficients are free of errors caused by such preprocessing and are obtained accurately.

In step S546, the band identification variable ω is set to ω = 1.

In step S548, the frequency conversion coefficients of band ω, F[D_i](ω) = {f_{D,i,ω,1}, ..., f_{D,i,ω,p(ω)}}, are generated by cutting them out from F[D_i]. At this point it has not yet been determined whether band ω is a noise band or a non-noise band, so the frequency conversion coefficients F[D_i](ω) are generated for every band ω. The generated F[D_i](ω) is stored in the storage unit 325 (step S550).

In step S552, it is determined whether ω has reached ω_fin. If it has not (step S552; No), ω is incremented by 1 (step S555) and the process returns to step S548. If it has (step S552; Yes), the process proceeds to step S557.

In step S557, it is determined whether i has reached M-1. If it has not (step S557; No), i is incremented by 1 (step S559) and the process returns to step S542. If it has (step S557; Yes), the process proceeds to step S561.

In step S561, the input signal sample counter i is set to i = 0. In step S563, the residual signal D_i is loaded into a register. In step S565, the band identification variable ω is set to ω = 1. In step S567, the residual signal of band ω, D_{i,ω} = {d_{i,ω,0}, ..., d_{i,ω,l-1}}, is generated from the input signal sample D_i by the ω-th band filter in the band filter unit 127 (FIG. 3).

It is then determined whether the residual signal D_{i,ω} is noise (step S569). If D_{i,ω} is determined to be noise (step S569; Yes), the flag Flag_{i,ω} is generated and the gain G_{i,ω} is calculated (step S571), Flag_{i,ω} and G_{i,ω} are stored in the storage unit 325 (step S573), and the process proceeds to step S575. If D_{i,ω} is determined not to be noise (step S569; No), the process proceeds immediately to step S575.

In step S575, it is determined whether ω has reached ω_fin. If it has not (step S575; No), ω is incremented by 1 (step S577) and the process returns to step S567. If it has (step S575; Yes), the process proceeds to step S579.

In step S579, it is determined whether i has reached M-1. If it has not (step S579; No), i is incremented by 1 (step S581) and the process returns to step S563. If it has (step S579; Yes), the process proceeds to step S591 in FIG. 12.

In step S591, the input signal sample counter i is set to i = 0. In step S593, the band identification variable ω is set to ω = 1.

The storage unit 325 is searched to check whether the flag Flag_{i,ω} is stored (step S595), and the process then proceeds to the step of determining whether Flag_{i,ω} is stored (step S597).

If it is determined that Flag_{i,ω} is not stored (step S597; No), the frequency conversion coefficients of band ω, F[D_i](ω) = {f_{D,i,ω,1}, ..., f_{D,i,ω,p(ω)}}, are loaded into a register (step S599). Separately from F[D_i](ω), the frequency conversion coefficients of band ω, F[D_{i,v}](ω) = {f_{D,i,v,ω,1}, ..., f_{D,i,v,ω,p(ω)}}, are prepared in a register, and F[D_{i,v}](ω) is determined by F[D_{i,v}](ω) = F[D_i](ω) (step S601).

Note that, whereas F[D_i](ω) is generated for all bands as described above, F[D_{i,v}](ω) defined in step S601 is generated only for the non-noise bands because of the branch in step S597. After F[D_{i,v}](ω) is stored in the storage unit 325 (step S603), the process proceeds to step S605.

On the other hand, if it is determined that Flag_{i,ω} is stored (step S597; Yes), the process proceeds directly to step S605.

In step S605, it is determined whether ω has reached ω_fin. If it has not (step S605; No), ω is incremented by 1 (step S607) and the process returns to step S595. If it has (step S605; Yes), the process proceeds to step S609.

In step S609, it is determined whether i has reached M-1. If it has not (step S609; No), i is incremented by 1 (step S611) and the process returns to step S593. If it has (step S609; Yes), the process ends.
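The final pass of Embodiment 3 (steps S593 to S607) for one time section can be sketched as below. The dictionaries standing in for the storage unit 325 and the function name `select_non_noise` are illustrative assumptions.

```python
def select_non_noise(all_band_coeffs, flags, num_bands):
    """Copy F[D_i](omega) to F[D_{i,v}](omega) for bands that carry no noise flag."""
    non_noise = {}
    for band in range(1, num_bands + 1):          # S593, S605, S607: loop over bands
        if band in flags:                         # S597; Yes: noise band, nothing stored
            continue
        # S599 to S603: F[D_{i,v}](omega) = F[D_i](omega)
        non_noise[band] = list(all_band_coeffs[band])
    return non_noise


# Hypothetical coefficients stored for all 3 bands in step S550; band 2 is flagged.
all_band_coeffs = {1: [0.7, 0.2], 2: [0.1, 0.1], 3: [0.4, 0.0]}
selected = select_non_noise(all_band_coeffs, flags={2}, num_bands=3)
print(selected)
```

Because the coefficients were already cut out for every band before the noise determination, this pass only needs to copy, not transform.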

(Procedure for restoring the residual signal from the flags and other data)
The following describes the procedure by which the speech decoding device 211 according to Embodiment 4 and the speech decoding device 213 according to Embodiment 5 restore the residual signal from the flags, gains, and frequency conversion coefficients.

It is assumed that the decoding unit 223 (FIGS. 4 and 5) has already decoded the following and stored them in the storage unit 325: the prediction coefficients; the flags Flag_{i,ω} (0 ≤ i ≤ M-1, 1 ≤ ω ≤ ω_fin, where ω is a noise band); the gains G_{i,ω} (0 ≤ i ≤ M-1, 1 ≤ ω ≤ ω_fin, where ω is a noise band); and the frequency conversion coefficients of the non-noise bands, F[D_{i,v}](ω) = {f_{D,i,v,ω,1}, ..., f_{D,i,v,ω,p(ω)}} (0 ≤ i ≤ M-1, 1 ≤ ω ≤ ω_fin, where ω is a non-noise band).

First, the procedure performed by the speech decoding device 211 (FIG. 4) according to Embodiment 4 is described with reference to the flowcharts shown in FIGS. 13 and 14.

First, in step S621 (FIG. 13), the input signal sample counter i is set to i = 0.

The CPU 321 prepares the restored noise residual signal D'_{i,uv} and the non-noise residual signal spectrum F[D_{i,v}] in registers and initializes all components of D'_{i,uv} and all components of F[D_{i,v}] to 0 (step S623).

As stated above, it is generally appropriate to set the initial values of all components of F[D_{i,v}] to 0. The value 0 is chosen on the grounds that no offset is needed, because the residual signal in the noise bands is restored separately from the gains; it is not an absolute rule.

For example, a modification of Embodiments 1 to 5 is conceivable in which the gain-related processing is omitted, that is, only the flags are transmitted between the speech encoding device and the speech decoding device (in other words, between transmitter and receiver) as the information on the noise bands. In that case, the initial values of the components of F[D_{i,v}] may be set to a predetermined constant other than 0, chosen in view of human auditory characteristics, and this constant may differ from component to component. Then, because components are replaced only in the non-noise bands, as described later, the components in the noise bands of the finally generated F[D_{i,v}] retain the predetermined constant. In other words, the predetermined constant is a gain fixed in advance for the case where no gain is exchanged between transmitter and receiver.
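The flag-only modification can be illustrated with the sketch below, in which the spectrum starts from preset per-component constants and only the non-noise bands are overwritten. The preset values, the `band_edges` layout, and the function name `assemble_spectrum` are hypothetical; the text does not say how the constants would be chosen beyond taking human auditory characteristics into account.

```python
def assemble_spectrum(initial, band_edges, received):
    """Overwrite only the bands whose coefficients were received (non-noise bands)."""
    spectrum = list(initial)                      # preset constants instead of zeros
    for band, (start, stop) in band_edges.items():
        if band in received:                      # non-noise band: replace components
            spectrum[start:stop] = received[band]
    return spectrum                               # noise bands keep the preset values


band_edges = {1: (0, 2), 2: (2, 4), 3: (4, 6)}
preset = [0.3, 0.3, 0.2, 0.2, 0.1, 0.1]           # hypothetical per-component constants
received = {1: [0.9, -0.4], 3: [0.05, 0.0]}       # band 2 is a noise band (flag only)
spectrum = assemble_spectrum(preset, band_edges, received)
print(spectrum)
```

In this example, the two components of band 2 keep their preset value 0.2, which plays the role of the gain fixed in advance.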

After the band identification variable ω is set to ω = 1 (step S625), the storage unit 325 is searched to check whether the flag Flag_{i,ω} is stored (step S627), and the process proceeds to the determination step (step S629).

If it is determined that Flag_{i,ω} is stored (step S629; Yes), band ω is a noise band, so the gain G_{i,ω} should be stored in the storage unit 325. G_{i,ω} is therefore loaded into a register (step S631). In FIG. 4, this corresponds to the fact that what the decoding unit 223 passes to the decoding G switch unit 227 as the information for band ω is a gain rather than frequency conversion coefficients.

The CPU 321 functions as the band-specific noise sequence generation unit 229 (FIG. 4) and uses the loaded G_{i,ω} as a cue to generate the noise sequence D'_{i,uv,ω} in band ω (step S633 in FIG. 13). The specific generation method is described later with reference to FIG. 14.

The CPU 321 superimposes the generated D'_{i,uv,ω} on the D'_{i,uv} held in the register to generate a new D'_{i,uv}; that is, it updates D'_{i,uv} (step S635 in FIG. 13). In FIG. 4, this corresponds to the process in which the noise sequences generated by the band-specific noise sequence generation unit 229 are superimposed in the residual signal restoration unit 233 and the residual signal in the noise bands is progressively restored. Once D'_{i,uv} has been updated in this way, the process proceeds to step S641.

On the other hand, if it is determined in step S629 that Flag_{i,ω} is not stored (step S629; No), band ω is a non-noise band, so the frequency conversion coefficients F[D_{i,v}](ω) of band ω should be stored in the storage unit 325. F[D_{i,v}](ω) is therefore loaded into a register (step S637). In FIG. 4, this corresponds to the fact that what the decoding unit 223 passes to the decoding G switch unit 227 as the information for band ω is frequency conversion coefficients rather than a gain.

The CPU 321 updates F[D_{i,v}] by replacing, among the components of the non-noise residual signal spectrum F[D_{i,v}] held in the register, the group of components in band ω with the F[D_{i,v}](ω) loaded in step S637 (step S639).

Because the initial values of all components of F[D_{i,v}] were set to 0 in step S623, the loop over ω formed by step S643 and related steps described later replaces part of F[D_{i,v}] with F[D_{i,v}](ω) on each pass, so that F[D_{i,v}] finally becomes a proper non-noise residual signal spectrum.

Note that this replacement occurs only for the components of the non-noise bands; the components of the noise bands remain at their initial value of 0.

In FIG. 4, the processing in steps S637 and S639 corresponds to the process in which the frequency conversion coefficient totalizing and supplementing unit 255 receives the frequency conversion coefficients of the non-noise bands from the decoding G switch unit 227, totals them, and supplements 0 as the frequency conversion coefficients of the noise bands, thereby obtaining the frequency conversion coefficients for all bands. Once F[D_{i,v}] has been updated in this way, the process proceeds to step S641.

The reason F[D_{i,v}] is completed gradually in this way, after the search for Flag_{i,ω} in step S627, is that the speech encoding devices and speech decoding devices of Embodiments 1 to 5 are premised on not exchanging frequency conversion coefficients for the noise bands.

Not exchanging these coefficients may place a search load on the CPU 321, but it serves the object of the present invention in that less information needs to be transmitted from the speech encoding device acting as transmitter to the speech decoding device acting as receiver. Moreover, for a typical CPU, the load of searching for the presence or absence of information as simple as a flag is hardly a problem in practice.

However, as a modification of Embodiments 1 to 5, the information on some of the noise bands may be transmitted, instead of as a flag, in the form of frequency conversion coefficients of that band set to 0. This increases the amount of information transmitted between transmitter and receiver, but it reduces the CPU search load described above and allows part of the replacement processing in steps S637 and S639 to be omitted, which helps speed up processing in the speech decoding device. Moreover, although the amount of transmitted information increases, only the small numerical value 0 is transmitted, so this modification may be the more efficient choice when the available transmission capacity has headroom.

In step S641, it is determined whether ω has reached ω_fin. If it has not (step S641; No), ω is incremented by 1 (step S643) and the process returns to step S627. If it has (step S641; Yes), the process proceeds to step S645.

In step S645, the CPU 321 functions as the spectrum inverse transformer 257 in FIG. 4 and obtains the non-noise residual signal D_{i,v} from the non-noise residual signal spectrum F[D_{i,v}]. The CPU 321 then obtains the restored residual signal D'_i by superimposing D'_{i,uv} and D_{i,v} (step S647), and stores D'_i in the storage unit (step S649).

In step S651, it is determined whether i has reached M-1. If it has not (step S651; No), i is incremented by 1 (step S653) and the process returns to step S623. If it has (step S651; Yes), the process ends.
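The superposition of step S647 can be sketched as follows. This is a minimal illustration that assumes "superimposing" means sample-wise addition of the summed noise sequence D'_{i,uv} and the inverse-transformed non-noise residual D_{i,v} of one frame; the function name `restore_residual` is hypothetical and does not appear in the specification.

```python
def restore_residual(noise_part, non_noise_part):
    """Sample-wise sum of D'_{i,uv} and D_{i,v} for one frame (step S647).

    Assumption: superposition is plain addition of two sequences that
    both cover the same l samples of frame i.
    """
    if len(noise_part) != len(non_noise_part):
        raise ValueError("both parts must cover the same l samples")
    return [n + v for n, v in zip(noise_part, non_noise_part)]
```

Calling this once per frame i, for i = 0 to M-1, mirrors the outer loop closed by steps S651 and S653.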

The specific procedure for generating the band-specific noise sequence D'_{i,uv,ω} in step S633 above is described below with reference to the flowchart shown in FIG. 14.

First, a basic noise sequence R_i = {R_{i,0}, ..., R_{i,l-1}} whose magnitude is ±1 and whose time intervals are random numbers is generated (step S655).

Here, R_i is generated with the same sampling interval as the original residual signal. In practice, therefore, each of the elements R_{i,0}, ..., R_{i,l-1} is 0, +1, or -1. Moreover, in this time-ordered sequence of elements, +1 or -1 appears at intervals of a random number of samples, and all other elements are 0.

The obtained basic noise sequence R_i is passed through a band-pass filter that extracts the components of band ω, which generates the basic noise sequence of band ω, R_{i,ω} = {R_{i,ω,0}, ..., R_{i,ω,l-1}} (step S657).

The generated basic noise sequence R_{i,ω} of band ω is multiplied by the gain G_{i,ω} loaded in step S631 of FIG. 13 to generate the noise sequence D'_{i,uv,ω} = {d'_{i,uv,ω,0}, ..., d'_{i,uv,ω,l-1}} (step S659), and the process ends.
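Steps S655 to S659 can be sketched as below. This is only an illustration under stated assumptions: the specification does not fix the distribution of the random intervals or the band-pass filter design, so the `mean_gap` parameter, the uniform interval distribution, and the DFT-bin masking used as a band-pass filter are all choices made here for the example. The function names are hypothetical.

```python
import cmath
import random


def basic_noise(length, mean_gap, rng):
    """Step S655: +1 or -1 at random sample intervals, 0 elsewhere."""
    seq = [0.0] * length
    pos = rng.randint(0, mean_gap)
    while pos < length:
        seq[pos] = rng.choice((1.0, -1.0))
        pos += rng.randint(1, 2 * mean_gap)  # assumed interval distribution
    return seq


def dft(x):
    """Naive discrete Fourier transform (adequate for short frames)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(spec):
    """Naive inverse DFT, returning the real part of each sample."""
    n = len(spec)
    return [sum(spec[k] * cmath.exp(2j * cmath.pi * k * t / n)
                for k in range(n)).real / n
            for t in range(n)]


def band_pass(x, lo, hi):
    """Step S657 (assumed design): keep bins lo..hi-1 and their mirrors."""
    n = len(x)
    spec = dft(x)
    kept = [c if (lo <= k < hi or lo <= n - k < hi) else 0j
            for k, c in enumerate(spec)]
    return idft(kept)


def band_noise(length, lo, hi, gain, rng, mean_gap=4):
    """Noise sequence D'_{i,uv,w} for one band: S655, S657, then S659."""
    r = basic_noise(length, mean_gap, rng)
    r_band = band_pass(r, lo, hi)
    return [gain * v for v in r_band]  # step S659: scale by G_{i,w}
```

Because the gain is applied after filtering, the decoder needs only the flag and one gain value per noise band to regenerate a sequence of the right level.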

Next, the procedure of the processing performed by the speech decoding apparatus 213 (FIG. 5) according to Embodiment 5 is described with reference to the flowchart shown in FIG. 15.

First, in step S661, the input signal sample counter i is set to i = 0.

A restored residual signal spectrum F[D'_i] is prepared in a register of the CPU 321, and all of its components are initialized to 0 (step S663).

After the band identification variable ω is set to ω = 1 (step S665), the storage unit 325 is searched to check whether the flag Flag_{i,ω} is stored (step S667), and the process proceeds to the determination step (step S669).

If it is determined that Flag_{i,ω} is stored (step S669; Yes), the gain G_{i,ω} is loaded into the register (step S671).

The CPU 321 updates F[D'_i] by replacing every component of the restored residual signal spectrum F[D'_i] that belongs to band ω with G_{i,ω} × unit component (step S673). In FIG. 5, this corresponds to the band-specific constant frequency conversion coefficient generation unit 259 passing the frequency conversion coefficients of the noise bands to the frequency conversion coefficient totalizer 267. The process then proceeds to step S679.

On the other hand, if it is determined that Flag_{i,ω} is not stored (step S669; No), the frequency conversion coefficients F[D_{i,v}](ω) of band ω are loaded into the register (step S675), F[D'_i] is updated by replacing the components of F[D'_i] that belong to band ω with F[D_{i,v}](ω) (step S677), and the process proceeds to step S679.

In step S679, it is determined whether ω has reached ω_fin. If it has not (step S679; No), ω is incremented by 1 (step S681) and the process returns to step S667. If it has (step S679; Yes), the process proceeds to step S683.

In step S683, the residual signal D'_i is obtained from the restored residual signal spectrum F[D'_i] by inverse transformation. After D'_i is stored in the storage unit (step S685), the process proceeds to step S687.

In step S687, it is determined whether i has reached M-1. If it has not (step S687; No), i is incremented by 1 (step S689) and the process returns to step S663. If it has (step S687; Yes), the process ends.
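The per-band spectrum assembly of steps S663 to S681 can be sketched as follows for a single frame. The data layout is an assumption made for illustration: bands are given as index ranges into the coefficient array, flags as a set of band numbers, and the "unit component" is taken to be the constant 1.0. The inverse transform of step S683 is left out because this part of the description does not name the transform. The function name `build_spectrum` is hypothetical.

```python
def build_spectrum(num_bins, bands, flags, gains, coeffs, unit=1.0):
    """Assemble the restored residual spectrum F[D'_i] for one frame.

    bands:  {omega: (start, stop)} index range of each band (assumed layout)
    flags:  set of band numbers omega whose Flag_{i,omega} is stored
    gains:  {omega: G_{i,omega}} for the flagged (noise) bands
    coeffs: {omega: [decoded coefficients]} for the non-noise bands
    """
    spectrum = [0.0] * num_bins                      # step S663
    for omega in sorted(bands):                      # steps S665, S679, S681
        start, stop = bands[omega]
        width = stop - start
        if omega in flags:                           # step S669; Yes
            spectrum[start:stop] = [gains[omega] * unit] * width   # S671, S673
        else:                                        # step S669; No
            band_coeffs = list(coeffs[omega])        # step S675
            if len(band_coeffs) != width:
                raise ValueError("coefficient count must match band width")
            spectrum[start:stop] = band_coeffs       # step S677
    return spectrum
```

For example, with `bands = {1: (0, 2), 2: (2, 5)}`, band 2 flagged as noise with gain 0.5, and band 1 carrying the coefficients `[3.0, -1.0]`, the assembled spectrum is `[3.0, -1.0, 0.5, 0.5, 0.5]`.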

(Procedure for restoring the speech signal)
The procedure for restoring the speech signal, performed inside the speech decoding apparatus 211 and the speech decoding apparatus 213 shown in FIGS. 4 and 5, is described below with reference to FIG. 16. The procedure is explained for the case in which MLSA analysis is used as the prediction analysis, but it is the same when another prediction analysis such as linear prediction analysis is used.

The receiving unit 221 (FIGS. 4 and 5) receives the code obtained by encoding the prediction coefficients and other data of the original speech, and passes it to the decoding unit 223 (FIGS. 4 and 5). The decoding unit 223 decodes the code it receives and generates the prediction coefficients, the flags, and so on, which are stored in the storage unit 325. In the case of MLSA analysis, the prediction coefficients are the MLSA filter coefficients M_i = {m_{i,0}, ..., m_{i,p-1}} (0 ≤ i ≤ M-1).

After the input signal sample counter is set to i = 1 (step S711 in FIG. 16), the prediction coefficients M_i are loaded from the storage unit 325 into a register inside the CPU 321 (step S713). Next, the synthesis inverse filter CIM_i is calculated from the prediction coefficients M_i (step S715). In FIGS. 4 and 5, this corresponds to the synthesis filter calculation unit 235, which has received the prediction coefficients, determining the specification of the synthesis filter unit 237.

Next, the restored residual signal D'_i is passed through the synthesis filter CIM_i obtained in step S715. As a result, the restored digital speech signal S'_i = {s'_{i,0}, ..., s'_{i,l-1}} (0 ≤ i ≤ M-1) is generated (step S717). The restored digital speech signal S'_i is stored in the storage unit 325 (step S719). It is then determined whether i has reached M-1 (step S721). If it has not (step S721; No), i is incremented by 1 (step S723) and the process returns to step S713. If i has reached M-1 (step S721; Yes), the process ends.
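The frame loop of steps S713 to S723 can be sketched for the linear prediction case, which the text says follows the same procedure. This is an illustration, not the filter of the specification: it assumes an analysis inverse filter of the form A(z) = 1 + a_1 z^-1 + ... + a_p z^-p, so that synthesis is the all-pole recursion s[n] = d[n] - Σ a_k s[n-k], and it assumes filter memory is carried across frames. The MLSA synthesis filter of FIG. 18 is not reproduced here, and the function name `synthesize` is hypothetical.

```python
def synthesize(frames, coeffs_per_frame):
    """Pass each restored residual frame D'_i through an all-pole filter.

    frames:           list of M residual frames (lists of samples)
    coeffs_per_frame: list of M coefficient lists [a_1, ..., a_p]
    Assumed sign convention: s[n] = d[n] - sum_k a_k * s[n-k].
    """
    history = []   # past output samples, most recent last
    restored = []
    for d_frame, a in zip(frames, coeffs_per_frame):    # loop over i
        s_frame = []
        for d in d_frame:                               # step S717
            past = history[::-1][:len(a)]               # s[n-1], s[n-2], ...
            s = d - sum(ak * sk for ak, sk in zip(a, past))
            history.append(s)
            s_frame.append(s)
        restored.append(s_frame)                        # step S719
    return restored
```

With a single coefficient a_1 = -0.5 and a unit impulse as the residual, the output decays as 1, 0.5, 0.25, ..., which is the impulse response of the corresponding one-pole filter.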

(Example of a procedure for obtaining MLSA coefficients from the cepstrum)
FIG. 17 is a flowchart of an example of a specific procedure for obtaining the MLSA filter coefficients M_i = {m_{i,0}, ..., m_{i,p-1}} from the cepstrum C_i = {c_{i,0}, ..., c_{i,(l/2)-1}}. The MLSA filter coefficients are obtained by performing the calculations shown in steps S811 to S835. α is a numerical value used for the approximation, and α = 0.35 is suitable when the speech signal is sampled at 10 kHz. In addition, β = 1 - α². m_i (0 ≤ i ≤ p-1) is initialized to 0.

FIG. 18 shows an example of the configuration of an MLSA filter that uses the MLSA filter coefficients obtained in this way. P₁ to P₄ are approximation coefficients, and suitable values are, for example, P₁ = 0.4999, P₂ = 0.1067, P₃ = 0.0117, and P₄ = 0.0005656.
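The text states only that P₁ to P₄ are approximation coefficients. One plausible reading, offered here as an assumption rather than as a description of FIG. 18, is that they are the coefficients of a fourth-order Padé-type rational approximation of the exponential function, exp(w) ≈ (1 + Σ P_k w^k) / (1 + Σ P_k (-w)^k), which is the form commonly used in MLSA filter designs. The sketch below merely evaluates that rational function with the quoted values; the function name `pade_exp` is hypothetical.

```python
import math

# P1..P4 exactly as quoted in the text.
P = (0.4999, 0.1067, 0.0117, 0.0005656)


def pade_exp(w, p=P):
    """Rational approximation of exp(w) (assumed role of P1..P4)."""
    num = 1.0 + sum(pk * w ** k for k, pk in enumerate(p, start=1))
    den = 1.0 + sum(pk * (-w) ** k for k, pk in enumerate(p, start=1))
    return num / den


if __name__ == "__main__":
    for w in (-1.0, -0.5, 0.5, 1.0):
        print(w, pade_exp(w), math.exp(w))
```

Under this reading the approximation stays close to exp(w) for |w| ≤ 1 and degrades as |w| grows, so the argument range of the filter stages would need to stay bounded.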

The present invention is not limited to the embodiments described above, and, as the modifications already mentioned in the text show, various modifications and applications are possible. The hardware configurations, block configurations, and flowcharts described above are examples and are not limiting.

For example, the speech encoding/decoding apparatus 311 shown in FIG. 3 has been described on the assumption that it is a mobile phone, but the present invention can equally be applied to audio processing by a PHS (Personal Handyphone System), a PDA (Personal Digital Assistant), a notebook or desktop personal computer, and the like. When the present invention is applied to a personal computer, for example, adding an audio input/output device, a communication device, and the like to the personal computer gives it, as hardware, the functions of a mobile phone. If a computer program for causing a computer to execute the processing described above is distributed on a recording medium or via communication, installing and running it on a computer also makes that computer function as the speech encoding apparatus or the speech decoding apparatus according to the present invention.

In other words, the above embodiments are for explanation and do not limit the scope of the present invention. Accordingly, those skilled in the art can adopt embodiments in which each or all of these elements are replaced with equivalents, and such embodiments are also included in the scope of the present invention.

FIG. 1 is a functional block diagram of the speech encoding apparatus according to Embodiment 1 of the present invention.
FIG. 2 is a functional block diagram of the speech encoding apparatus according to Embodiment 2 of the present invention.
FIG. 3 is a functional block diagram of the speech encoding apparatus according to Embodiment 3 of the present invention.
FIG. 4 is a functional block diagram of the speech decoding apparatus according to Embodiment 4 of the present invention.
FIG. 5 is a functional block diagram of the speech decoding apparatus according to Embodiment 5 of the present invention.
FIG. 6 is a diagram showing the physical configuration of the speech encoding/decoding apparatus according to Embodiment 6 of the present invention.
FIG. 7 is a diagram showing the flow of linear prediction analysis or MLSA analysis.
FIG. 8 is a diagram showing the first half of the flow of the processing that generates flags and the like from the residual signal in Embodiment 1 of the present invention.
FIG. 9 is a diagram showing the second half of the flow of the processing that generates flags and the like from the residual signal in Embodiment 1 of the present invention.
FIG. 10 is a diagram showing the flow of the processing that generates flags and the like from the residual signal in Embodiment 2 of the present invention.
FIG. 11 is a diagram showing the first half of the flow of the processing that generates flags and the like from the residual signal in Embodiment 3 of the present invention.
FIG. 12 is a diagram showing the second half of the flow of the processing that generates flags and the like from the residual signal in Embodiment 3 of the present invention.
FIG. 13 is a diagram showing the flow of the processing that restores the residual signal from flags and the like in Embodiment 4 of the present invention.
FIG. 14 is a diagram showing the flow of the processing that generates the noise sequence for each band in Embodiment 4 of the present invention.
FIG. 15 is a diagram showing the flow of the processing that restores the residual signal from flags and the like in Embodiment 5 of the present invention.
FIG. 16 is a diagram showing the flow of restoring the speech signal.
FIG. 17 is a diagram showing an example of the flow of calculating the MLSA filter coefficients.
FIG. 18 is a diagram showing an example of an MLSA filter.

Explanation of symbols

111: speech encoding apparatus according to Embodiment 1, 113: speech encoding apparatus according to Embodiment 2, 115: speech encoding apparatus according to Embodiment 3, 121: microphone, 123: A/D conversion unit, 125: prediction analysis unit, 127: band filter unit, 129: noise discrimination unit, 131: encoding A switch unit, 133: flag and gain generation unit, 135: totaling and conversion unit, 137: encoding unit, 139: transmitting unit, 141: inverse filter calculator for prediction analysis, 143: first band filter, 145: second band filter, 147: first noise discriminator, 149: second noise discriminator, 151: first A switch, 153: second A switch, 155: first flag generator and first gain calculator, 157: second flag generator and second gain calculator, 159: flag and noise gain totalizer, 161: non-noise residual signal totalizer, 162: non-noise band extractor, 163: spectrum transformer, 165: encoding C switch unit, 167: non-noise band determiner, 169: Band Elimination Filter, 171: first C switch, 173: second C switch, 175: frequency conversion coefficient separator, 177: encoding E switch unit, 179: non-noise frequency conversion coefficient totalizer, 181: first E switch, 183: second E switch

211: speech decoding apparatus according to Embodiment 4, 213: speech decoding apparatus according to Embodiment 5, 221: receiving unit, 223: decoding unit, 225: flag presence determination unit, 227: decoding G switch unit, 229: band-specific noise sequence generation unit, 231: totaling and inverse conversion unit, 233: residual signal restoration unit, 235: synthesis filter calculation unit, 237: synthesis filter unit, 239: D/A conversion unit, 241: speaker, 243: first flag presence discriminator, 245: second flag presence discriminator, 247: first G switch, 249: second G switch, 251: first noise sequence generator, 253: second noise sequence generator, 255: frequency conversion coefficient totalizer and supplementer, 257: spectrum inverse transformer, 259: band-specific constant frequency conversion coefficient generation unit, 261: batch totaling and inverse conversion unit, 263: first constant frequency conversion coefficient generator, 265: second constant frequency conversion coefficient generator, 267: frequency conversion coefficient totalizer

311: speech encoding/decoding apparatus according to Embodiment 6, 321: CPU, 323: ROM, 325: storage unit, 327: wireless communication unit, 329: audio processing unit, 331: operation key input processing unit, 333: system bus, 335: antenna, 337: operation keys, 339: RAM, 341: hard disk

Claims

A speech encoding apparatus comprising:
a prediction analysis unit that decomposes a speech signal into prediction coefficients and a residual signal by prediction analysis;
a band-specific residual signal generation unit that divides the residual signal into band-specific residual signals;
a noise discrimination unit that determines, for each band of the band-specific residual signals, whether the band is a noise band;
a flag generation unit that, for each band determined to be a noise band by the noise discrimination unit, generates a flag indicating that the band is a noise band and obtains the gain of the band-specific residual signal of that band;
a non-noise band conversion unit that generates frequency conversion coefficients in the non-noise bands by superimposing, in the time domain, the band-specific residual signals of the bands determined not to be noise bands by the noise discrimination unit and then performing frequency conversion; and
an encoding unit that encodes the prediction coefficients obtained by the prediction analysis unit, the flag and gain obtained by the flag generation unit, and the frequency conversion coefficients generated by the non-noise band conversion unit.

A speech encoding apparatus comprising:
a prediction analysis unit that decomposes a speech signal into prediction coefficients and a residual signal by prediction analysis;
a full-band conversion unit that frequency-converts the residual signal to generate frequency conversion coefficients;
a band-specific residual signal generation unit that divides the residual signal into band-specific residual signals;
a noise discrimination unit that determines, for each band of the band-specific residual signals, whether the band is a noise band;
a flag generation unit that, for each band determined to be a noise band by the noise discrimination unit, generates a flag indicating that the band is a noise band and obtains the gain of the band-specific residual signal of that band;
a totaling unit that collects, from the frequency conversion coefficients obtained by the full-band conversion unit, the frequency conversion coefficients of the bands determined not to be noise bands by the noise discrimination unit; and
an encoding unit that encodes the prediction coefficients obtained by the prediction analysis unit, the flag and gain obtained by the flag generation unit, and the frequency conversion coefficients collected by the totaling unit.

The speech encoding apparatus according to claim 1 or 2, wherein
the noise discrimination unit determines, for each band, whether the band is a noise band based on the shape of the autocorrelation function of the band-specific residual signal.

The speech encoding apparatus according to any one of claims 1 to 3, wherein
the prediction analysis unit obtains MLSA filter coefficients as the prediction coefficients by MLSA (Mel Log Spectrum Approximation) analysis, and obtains the residual signal using an inverse filter defined by the MLSA filter coefficients.

The speech encoding apparatus according to any one of claims 1 to 3, wherein
the prediction analysis unit obtains linear prediction coefficients as the prediction coefficients by linear prediction analysis, and obtains the residual signal using an inverse filter defined by the linear prediction coefficients.

A speech decoding apparatus comprising:
a receiving unit that receives a code obtained by encoding prediction coefficients generated from a speech signal by prediction analysis, a flag indicating that a specific band of a residual signal generated from the speech signal by prediction analysis is a noise band, the gain of the band-specific residual signal in the noise band, and frequency conversion coefficients in the non-noise bands;
a decoding unit that decodes the prediction coefficients, the flag, the gain, and the frequency conversion coefficients in the non-noise bands from the code;
a noise sequence generation unit that generates, for each band indicated by the flag to be a noise band, a noise sequence whose amplitude is adjusted by the gain;
an inverse transform unit that generates frequency conversion coefficients for the entire band by storing 0 for all frequency conversion coefficients in each band indicated by the flag to be a noise band and storing the decoded frequency conversion coefficients in the non-noise bands, and that inversely transforms the generated frequency conversion coefficients to obtain a residual signal in the non-noise bands;
a residual signal restoration unit that generates a restored residual signal by superimposing the noise sequence generated by the noise sequence generation unit and the residual signal in the non-noise bands obtained by the inverse transform unit; and
a synthesis unit that generates a restored speech signal by synthesis from the prediction coefficients decoded by the decoding unit and the restored residual signal generated by the residual signal restoration unit.

A speech encoding method comprising:
a prediction analysis step of decomposing a speech signal into prediction coefficients and a residual signal by prediction analysis;
a band-specific residual signal generation step of dividing the residual signal into band-specific residual signals;
a noise discrimination step of determining, for each band of the band-specific residual signals, whether the band is a noise band;
a flag generation step of, for each band determined to be a noise band in the noise discrimination step, generating a flag indicating that the band is a noise band and obtaining the gain of the band-specific residual signal of that band;
a non-noise band conversion step of generating frequency conversion coefficients in the non-noise bands by superimposing, in the time domain, the band-specific residual signals of the bands determined not to be noise bands in the noise discrimination step and then performing frequency conversion; and
an encoding step of encoding the prediction coefficients obtained in the prediction analysis step, the flag and gain obtained in the flag generation step, and the frequency conversion coefficients obtained in the non-noise band conversion step.

A speech decoding method comprising:
a receiving step of receiving a code obtained by encoding prediction coefficients generated from a speech signal by prediction analysis, a flag indicating that a specific band of a residual signal generated from the speech signal by prediction analysis is a noise band, the gain of the band-specific residual signal in the noise band, and frequency conversion coefficients in the non-noise bands;
a decoding step of decoding the prediction coefficients, the flag, the gain, and the frequency conversion coefficients in the non-noise bands from the code;
a noise sequence generation step of generating, for each band indicated by the flag to be a noise band, a noise sequence whose amplitude is adjusted by the gain;
an inverse transform step of generating frequency conversion coefficients for the entire band by storing 0 for all frequency conversion coefficients in each band indicated by the flag to be a noise band and storing the decoded frequency conversion coefficients in the non-noise bands, and of performing a spectrum inverse transform on the generated frequency conversion coefficients to obtain a residual signal in the non-noise bands;
a residual signal restoration step of generating a restored residual signal by superimposing the noise sequence generated in the noise sequence generation step and the residual signal in the non-noise bands obtained in the inverse transform step; and
a synthesis step of generating a restored speech signal by synthesis from the prediction coefficients decoded in the decoding step and the restored residual signal generated in the residual signal restoration step.

A computer program that causes a computer to execute:
a prediction analysis step of decomposing a speech signal into prediction coefficients and a residual signal by prediction analysis;
a band-specific residual signal generation step of dividing the residual signal into band-specific residual signals;
a noise discrimination step of determining, for each band of the band-specific residual signals, whether the band is a noise band;
a flag generation step of, for each band determined to be a noise band in the noise discrimination step, generating a flag indicating that the band is a noise band and obtaining the gain of the band-specific residual signal of that band;
a non-noise band conversion step of generating frequency conversion coefficients in the non-noise bands by superimposing, in the time domain, the band-specific residual signals of the bands determined not to be noise bands in the noise discrimination step and then performing frequency conversion; and
an encoding step of encoding the prediction coefficients obtained in the prediction analysis step, the flag and gain obtained in the flag generation step, and the frequency conversion coefficients obtained in the non-noise band conversion step.

A computer program that causes a computer to execute:
a receiving step of receiving a code obtained by encoding prediction coefficients generated from a speech signal by prediction analysis, a flag indicating that a specific band of a residual signal generated from the speech signal by prediction analysis is a noise band, the gain of the band-specific residual signal in the noise band, and frequency conversion coefficients in the non-noise bands;
a decoding step of decoding the prediction coefficients, the flag, the gain, and the frequency conversion coefficients in the non-noise bands from the code;
a noise sequence generation step of generating, for each band indicated by the flag to be a noise band, a noise sequence whose amplitude is adjusted by the gain;
an inverse transform step of generating frequency conversion coefficients for the entire band by storing 0 for all frequency conversion coefficients in each band indicated by the flag to be a noise band and storing the decoded frequency conversion coefficients in the non-noise bands, and of performing a spectrum inverse transform on the generated frequency conversion coefficients to obtain a residual signal in the non-noise bands;
a residual signal restoration step of generating a restored residual signal by superimposing the noise sequence generated in the noise sequence generation step and the residual signal in the non-noise bands obtained in the inverse transform step; and
a synthesis step of generating a restored speech signal by synthesis from the prediction coefficients decoded in the decoding step and the restored residual signal generated in the residual signal restoration step.