JPH10171497A

JPH10171497A - Background noise removing device

Info

Publication number: JPH10171497A
Application number: JP8332182A
Authority: JP
Inventors: Shinsuke Takada; 真資高田; Yoshihiro Ariyama; 義博有山
Original assignee: Oki Electric Industry Co Ltd
Current assignee: Oki Electric Industry Co Ltd
Priority date: 1996-12-12
Filing date: 1996-12-12
Publication date: 1998-06-26

Abstract

PROBLEM TO BE SOLVED: To remove background noise and obtain excellent sound quality with no unnatural auditory impression, by dividing an input signal into a plurality of small frequency regions and subtracting the background noise. SOLUTION: During a speech period, in which background noise and a speech signal are input to an input terminal 18 in mixed form, a calculation part 4 outputs the power spectrum of the mixture. The background noise estimated by a background noise estimation part 7 is subtracted from this power spectrum. A band-by-band background noise re-estimation part 12 checks the S/N difference of each channel from the S/Na of the overall frequency band and the S/Nm of each channel, and re-updates the estimated background noise of any channel with a low S/N. An adder 13 subtracts the re-updated estimated background noise, supplied by a band-by-band background noise re-updating part 122, from the m-th channel output of an adder 5, supplied by a band splitter 101.

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

[Technical Field of the Invention] The present invention relates to a background noise removing device that removes background noise from an input signal in which a speech signal and background noise are mixed.

[0002]

[Prior Art] One known method of removing background noise from an input signal in which a speech signal and background noise are mixed converts the input signal into frequency components and divides them into predetermined low-frequency and high-frequency components. The low-frequency components as a whole are multiplied by a gain smaller than 1 to reduce their amplitude, and spectral subtraction is applied to the high-frequency components to remove the background noise from each spectrum.

[0003]

[Problems to Be Solved by the Invention] However, the above background noise removal method has the following drawbacks. (a) In normal use the frequency characteristics of background noise vary and are unknown, so it is difficult to fix the low-frequency and high-frequency ranges in advance. (b) Because the low-frequency and high-frequency ranges are processed by different methods, matching at the division point is difficult. Even when the spectrum has a frequency characteristic that is continuous from the low range to the high range, the difference in processing creates a discontinuity at the division point, which sounds unnatural. (c) In the low-frequency range the S/N is not improved and only the power is reduced, so effective noise removal cannot be expected, particularly when the speech components are concentrated in the low range. Viewed over the entire band, the S/N is in fact degraded, and so is the sound quality.

[0004] The object of the present invention is to eliminate these drawbacks of the prior art and to provide a background noise removing device with excellent sound quality that removes background noise by a single method over the entire band of frequency components, without distinguishing between low and high frequency ranges, and that causes no unnatural auditory impression.

[0005]

[Means for Solving the Problems] To solve the above problems, the present invention provides a background noise removing device to which background noise is input first and then the background noise and a speech signal are input in mixed form. The device comprises: signal conversion means for sequentially converting the input signal from a time-axis signal into frequency components frame by frame; voice detection means for detecting a speech signal contained in the input signal; background noise estimation means for generating and holding an estimated background noise, during a noise period in which the voice detection means detects no speech signal, by averaging the frequency components converted by the signal conversion means with the estimated background noise generated one frame earlier; first addition means for subtracting, during a speech period in which the voice detection means detects a speech signal, the estimated background noise held by the background noise estimation means from the frequency components converted by the signal conversion means; S/N calculation means for calculating the S/N of the entire frequency band and the S/N of each of a plurality of small regions into which the entire frequency band is divided, taking the frequency components obtained by the subtraction of the first addition means as the signal and the estimated background noise held by the background noise estimation means as the noise; band-by-band background noise re-estimation means for generating, for each small region in which the difference between the S/N of the small region and the S/N of the entire frequency band is equal to or less than a predetermined value, a re-updated estimated background noise that contains, in a predetermined ratio, the frequency components obtained by the subtraction of the first addition means and the estimated background noise held by the background noise estimation means; second addition means for subtracting the re-updated estimated background noise from the frequency components obtained by the subtraction of the first addition means; and signal reproduction means for converting the per-frame frequency components obtained by the subtraction of the second addition means into a time-axis signal and outputting it. The magnitudes of the estimated background noise subtracted by the first addition means and of the re-updated estimated background noise subtracted by the second addition means are set so that the background noise is removed from the input signal.

[0006] To solve the above problems, the present invention also provides a background noise removing device to which background noise is input first and then the background noise and a speech signal are input in mixed form. The device comprises: signal conversion means for sequentially converting the input signal from a time-axis signal into frequency components frame by frame; voice detection means for detecting a speech signal contained in the input signal; background noise estimation means for generating and holding an estimated background noise, during a noise period in which the voice detection means detects no speech signal, by averaging the frequency components converted by the signal conversion means with the estimated background noise generated one frame earlier; first addition means for subtracting, during a speech period in which the voice detection means detects a speech signal, the estimated background noise held by the background noise estimation means from the frequency components converted by the signal conversion means; S/N calculation means for calculating the S/N of the entire frequency band and the S/N of each of a plurality of small regions into which the entire frequency band is divided, taking the frequency components obtained by the subtraction of the first addition means as the signal and the estimated background noise held by the background noise estimation means as the noise; band-by-band background noise re-estimation means for generating, for each small region in which the difference between the S/N of the small region and the S/N of the entire frequency band is equal to or less than a predetermined value, a noise that contains, in a predetermined ratio, the frequency components obtained by the subtraction of the first addition means and the estimated background noise held by the background noise estimation means, and for generating a re-updated estimated background noise by adding to this noise the estimated background noise held by the background noise estimation means; second addition means for subtracting the re-updated estimated background noise from the frequency components obtained by the conversion of the signal conversion means; and signal reproduction means for converting the per-frame frequency components obtained by the subtraction of the second addition means into a time-axis signal and outputting it. The magnitude of the re-updated estimated background noise subtracted by the second addition means is set so that the background noise is removed from the input signal.

[0007] In this case, the voice detection means preferably detects the speech signal contained in the input signal by detecting a speech component contained in the frequency components obtained by the conversion of the signal conversion means.

[0008] Further, the voice detection means preferably divides the frequency components obtained by the conversion of the signal conversion means into a plurality of small regions, detects a speech component in each small region, and determines that the input signal contains a speech signal when the number of small regions in which a speech component is detected is equal to or greater than a predetermined value.

[0009] Alternatively, the voice detection means preferably divides the frequency components obtained by the conversion of the signal conversion means into a plurality of small regions, detects a speech component in each small region, and determines that the input signal contains a speech signal when the ratio of the number of small regions in which a speech component is detected to the total number of small regions is equal to or greater than a predetermined value.
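The two decision rules in paragraphs [0008] and [0009] can be sketched as follows. This is only an illustration: the text does not say how a speech component is detected within a small region, so the sketch assumes that a per-region flag is already available, and the names `voice_by_band_count`, `voice_by_band_ratio`, `min_bands`, and `min_ratio` are hypothetical.

```python
def voice_by_band_count(band_flags, min_bands):
    """Rule of [0008]: speech is present when the number of small regions
    flagged as containing a speech component is at least min_bands."""
    detected = sum(1 for flag in band_flags if flag)
    return detected >= min_bands


def voice_by_band_ratio(band_flags, min_ratio):
    """Rule of [0009]: speech is present when the fraction of flagged
    small regions among all small regions is at least min_ratio."""
    detected = sum(1 for flag in band_flags if flag)
    return detected / len(band_flags) >= min_ratio


# Hypothetical per-region detection results for one frame (assumed input).
flags = [True, False, True, False]
print(voice_by_band_count(flags, min_bands=2))
print(voice_by_band_ratio(flags, min_ratio=0.6))
```

The ratio form stays meaningful if the number of small regions changes, whereas the count form needs its threshold retuned.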

[0010]

[Embodiments of the Invention] Embodiments of the background noise removing device according to the present invention will now be described in detail with reference to the accompanying drawings.

[0011] FIG. 1 is a block diagram showing a background noise removing device according to a first embodiment of the present invention. The first embodiment generates two noise estimates: an estimated background noise, from which the steep change components of the background noise have been removed, and a re-updated estimated background noise, which includes those steep changes. It removes the estimated background noise from the input signal and, for those small regions of the divided frequency band whose S/N is particularly small, additionally removes the re-updated estimated background noise. Background noise is thereby removed accurately without impairing sound quality.

[0012] In FIG. 1, background noise N1 is first input to an input terminal 18, and a speech signal S is then input mixed with the background noise N1. For example, the signal output from the microphone of a portable telephone, a car telephone, audio equipment, or the like corresponds to the speech signal S and the background noise N1. The input signal (background noise N1 alone, or speech signal S together with background noise N1) is fed to an analog-to-digital converter (A/D converter) 1 connected to the input terminal 18. The A/D converter 1 samples the signal from the input terminal 18 at a predetermined frequency, for example 8 kHz for a telephone, converts it from an analog signal to a digital signal, and outputs it to a window function calculator 2 and a time-axis voice detector 6 connected to its output side.

[0013] The window function calculator 2 groups the digital signal from the A/D converter 1 into frames of a predetermined number of samples, applies a window function to each frame, and outputs the result to a fast Fourier transform calculator (FFT calculator) 3 connected to its output side. As is well known, a window function such as a Hanning window or a Hamming window is used to prevent the generation of high-frequency components caused by data discontinuities between frames. The FFT calculator 3 applies a fast Fourier transform to the signal from the window function calculator 2 frame by frame, converting the time-axis signal, expressed as a function of time t, into frequency components (a spectrum signal). It then outputs the spectrum signal to a power spectrum calculator 4 and a phase calculator 8 connected to its output side.
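The framing, windowing, and transform steps of this paragraph can be sketched in plain Python as follows. The sketch makes several assumptions that the text does not state: frames are non-overlapping, the window is a periodic Hann window, and a naive DFT stands in for the FFT (it yields the same values, only more slowly). All function names are hypothetical.

```python
import cmath
import math


def hann_window(n):
    """Periodic Hann (Hanning) window of length n."""
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * i / n) for i in range(n)]


def split_frames(samples, frame_len):
    """Group samples into consecutive non-overlapping frames (assumption);
    trailing samples that do not fill a frame are dropped."""
    count = len(samples) // frame_len
    return [samples[i * frame_len:(i + 1) * frame_len] for i in range(count)]


def dft(frame):
    """Naive O(n^2) discrete Fourier transform, used here in place of an FFT."""
    n = len(frame)
    return [
        sum(frame[t] * cmath.exp(-2j * math.pi * f * t / n) for t in range(n))
        for f in range(n)
    ]


def frames_to_spectra(samples, frame_len):
    """Window each frame and transform it, giving one spectrum X(f, k) per frame k."""
    window = hann_window(frame_len)
    spectra = []
    for frame in split_frames(samples, frame_len):
        windowed = [x * w for x, w in zip(frame, window)]
        spectra.append(dft(windowed))
    return spectra


# Two frames of a constant signal; each spectrum has frame_len bins.
spectra = frames_to_spectra([1.0] * 16, frame_len=8)
print(len(spectra), len(spectra[0]))
```

A practical implementation would normally use overlapping frames and a library FFT; those choices are outside what this paragraph specifies.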

[0014] Let N1(f,k) (f: frequency, k: frame number) denote the spectrum signal output from the FFT calculator 3 when background noise N1 is input to the input terminal 18, and let S(f,k) denote the spectrum signal output when only the speech signal S is input. The spectrum signal X(f,k) output from the FFT calculator 3 is then given by equation (1) during a noise period, in which only background noise N1 is input to the input terminal 18, and by equation (2) during a speech period, in which background noise N1 and speech signal S are input in mixed form.

[0015]

$$X(f,k) = N1(f,k) \tag{1}$$

$$X(f,k) = N1(f,k) + S(f,k) \tag{2}$$

The power spectrum calculator 4 calculates the power spectrum X_{f,k} of the spectrum signal X(f,k) from the FFT calculator 3 for each frame, and outputs it to an adder 5 and a background noise estimation unit 7 connected to its output side. The phase calculator 8 calculates the phase Φ(f,k) of each frequency component from the in-phase and quadrature components of each frequency component of X(f,k), and outputs it to a phase holder 9 connected to its output side. The phase holder 9 holds the phase Φ(f,k) for one frame period.
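As a rough illustration of the power spectrum calculator 4 and the phase calculator 8, the sketch below assumes that the power spectrum is the squared magnitude of each complex bin and that the phase is the arctangent of the quadrature (imaginary) part over the in-phase (real) part. The text does not give either formula explicitly, and the function names are hypothetical.

```python
import math


def power_spectrum(spectrum):
    """Power X_{f,k} of each complex bin X(f,k), assumed to be |X(f,k)|^2."""
    return [c.real * c.real + c.imag * c.imag for c in spectrum]


def phase_spectrum(spectrum):
    """Phase of each bin from its in-phase (real) and quadrature (imaginary) parts."""
    return [math.atan2(c.imag, c.real) for c in spectrum]


# Example bins for one frame.
bins = [3 + 4j, 0 - 2j]
print(power_spectrum(bins))
print(phase_spectrum(bins))
```

Keeping the phase separately allows a time-axis signal to be rebuilt later from a modified power spectrum, which is presumably why the phase holder 9 stores it for one frame.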

[0016] Meanwhile, the time-axis voice detector 6 detects whether the digital signal from the A/D converter 1 contains speech and outputs the result to the background noise estimation unit 7 connected to its output side. In this embodiment speech is detected on the time axis: the difference between the average power of the input signal over a long interval, for example 500 ms, and its average power over a short interval, for example 50 ms, is calculated, and the input signal is judged to contain speech when the difference exceeds a preset threshold. The detection method is not limited to the one of this embodiment; any method that can detect speech on the time axis may be used.
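A minimal sketch of this long-term versus short-term power comparison is given below. It assumes that both averages are taken over the most recent samples and that the difference is formed as short-term minus long-term; the text fixes neither detail. The function names and the `threshold` parameter are hypothetical, and a real threshold would have to be tuned to the signal scale.

```python
def mean_power(samples):
    """Average of the squared sample values (0.0 for an empty list)."""
    if not samples:
        return 0.0
    return sum(x * x for x in samples) / len(samples)


def voice_detected(samples, threshold, sample_rate=8000, long_ms=500, short_ms=50):
    """Return True when the short-term average power exceeds the long-term
    average power by more than threshold (direction of the difference assumed)."""
    long_n = sample_rate * long_ms // 1000
    short_n = sample_rate * short_ms // 1000
    long_avg = mean_power(samples[-long_n:])
    short_avg = mean_power(samples[-short_n:])
    return (short_avg - long_avg) > threshold


# 450 ms of low-level noise followed by 50 ms of a louder signal at 8 kHz.
history = [0.01] * 3600 + [0.5] * 400
print(voice_detected(history, threshold=0.1))
```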

[0017] The background noise estimation unit 7 includes a switch 71, a background noise update unit 72, and a background noise holding unit 73. It estimates the background noise from the power spectrum X_{f,k} supplied by the power spectrum calculator 4 during noise periods, in which the time-axis voice detector 6 detects no speech. Specifically, the switch 71 closes only during noise periods, according to the decision from the time-axis voice detector 6, and connects the output of the power spectrum calculator 4 to the input of the background noise update unit 72. The power spectrum X_{f,k} from the power spectrum calculator 4 therefore reaches the background noise update unit 72 only during noise periods.

[0018] The background noise update unit 72 calculates the estimated background noise N2_{f,k} by equation (3) from the power spectrum X_{f,k} supplied by the power spectrum calculator 4 and the estimated background noise N2_{f,k-1} of the previous frame held in the background noise holding unit 73. It also replaces the previous estimate N2_{f,k-1} held in the background noise holding unit 73 with N2_{f,k}. Equation (3) is evaluated for every frame and every frequency component.

[0019]

$$N2_{f,k} = \alpha \cdot N2_{f,k-1} + (1 - \alpha) \cdot X_{f,k} \tag{3}$$

Here N2_{f,0} = 0 when k = 0. The coefficient α determines the speed of the noise estimation and is set within the range 0 < α < 1.
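Equation (3) is a first-order recursive average applied independently to each frequency bin. The sketch below illustrates it; the function name and the value of `alpha` are hypothetical choices for the example, not values from the text.

```python
def update_noise_estimate(prev_estimate, power_spec, alpha):
    """Apply equation (3) to every frequency bin:
    N2[f,k] = alpha * N2[f,k-1] + (1 - alpha) * X[f,k]."""
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must satisfy 0 < alpha < 1")
    return [alpha * n + (1.0 - alpha) * x
            for n, x in zip(prev_estimate, power_spec)]


# Start from N2[f,0] = 0 and feed a constant noise power of 4.0 per bin.
estimate = [0.0, 0.0]
for _ in range(200):
    estimate = update_noise_estimate(estimate, [4.0, 4.0], alpha=0.9)
print(estimate)  # approaches [4.0, 4.0] as frames accumulate
```

A value of α close to 1 gives a smooth but slowly adapting estimate, while a value close to 0 follows each frame closely.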

[0020] The power spectrum of background noise fluctuates over time with some variance around its mean value. If it were used as is for noise removal, the device would react too sensitively to the background noise in each frame and its operation would become unstable. In this embodiment, equation (3) therefore averages the power spectrum X_{f,k} from the power spectrum calculator 4 with the estimated background noise N2_{f,k-1} of the previous frame, and the averaged power spectrum is used as the estimated background noise. This keeps the device from responding too sensitively to the background noise of each frame and stabilizes it.

[0021] In this way the background noise estimation unit 7 estimates the background noise frame by frame with equation (3) during the noise period. Suppose the estimated background noise N2_{f,k-1} held in the background noise holding unit 73 approaches the power spectrum X_{f,k} output from the power spectrum calculator 4, so that N2_{f,k-1} ≈ X_{f,k}. Because X_{f,k} ≈ N1_{f,k} during the noise period (N1_{f,k} being the power spectrum of the background noise), the estimate N2_{f,k} can then be used to remove the background noise N1_{f,k}, and the estimation algorithm of the background noise estimation unit 7 has converged. The background noise update unit 72 outputs the converged estimate N2_{f,k} to the adder 5 and a power operation unit 10 connected to its output side, and the background noise holding unit 73 holds it.

[0022] When the noise period ends and a speech period begins, in which background noise N1 and speech signal S are input to the input terminal 18 in mixed form, the power spectrum calculator 4 outputs the power spectrum X_{f,k} of the mixed signal. The adder 5 performs the subtraction of equation (4) using this power spectrum X_{f,k} and the estimated background noise N2_{f,k} from the background noise estimation unit 7, and outputs the result E1_{f,k} to the power operation unit 10 and an adder 13 connected to its output side. In equation (4), α_c is a coefficient that adjusts how much of the estimated background noise N2_{f,k} is subtracted. It is chosen in advance within the range 0 < α_c ≤ 1, taking the operating environment and similar factors into account.

[0023]

$$E1_{f,k} = X_{f,k} - \alpha_c \cdot N2_{f,k} \tag{4}$$

As described above, during the speech period the output of the FFT calculator 3 is X(f,k) = N1(f,k) + S(f,k). Letting S_{f,k} denote the power spectrum of the speech output from the power spectrum calculator 4, the output E1_{f,k} of the adder 5 is therefore given by equation (5).

[0024]

$$E1_{f,k} \approx S_{f,k} \tag{5}$$

The adder 5 thus outputs a power spectrum S_{f,k} corresponding to the speech signal, obtained by removing the estimated background noise N2_{f,k} from the power spectrum X_{f,k} of the speech signal with background noise. However, because the estimated background noise N2_{f,k} never matches the actual background noise exactly, it is difficult to obtain the speech power spectrum S_{f,k} alone, and E1_{f,k} retains some background noise. During the noise period the output of the FFT calculator 3 is X(f,k) = N1(f,k), so the output of the adder 5 is E1_{f,k} ≈ 0.
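The subtraction of equation (4) can be sketched per frequency bin as follows. The sketch applies the equation literally and does not clamp negative results, since the text at this point describes no flooring step; the function name is hypothetical.

```python
def subtract_noise(power_spec, noise_estimate, alpha_c):
    """Apply equation (4) to every bin: E1[f,k] = X[f,k] - alpha_c * N2[f,k].
    Results are not clamped, so a bin may go negative when the estimate
    exceeds the observed power."""
    if not 0.0 < alpha_c <= 1.0:
        raise ValueError("alpha_c must satisfy 0 < alpha_c <= 1")
    return [x - alpha_c * n for x, n in zip(power_spec, noise_estimate)]


# Speech-period frame: observed power X and converged noise estimate N2.
e1 = subtract_noise([5.0, 3.0], [1.0, 1.0], alpha_c=0.5)
print(e1)  # [4.5, 2.5]
```

With α_c = 1 the full estimate is removed; smaller values leave more residual noise but reduce the risk of subtracting speech energy.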

[0025] The power operation unit 10 includes a band divider 101, a power calculator 102, a logarithm calculator 103, a switch 104, and a power holder 105. It divides the power spectra from the adder 5 and the background noise estimation unit 7 into a plurality of small frequency bands (channels) and calculates the power of each channel. Specifically, the power spectrum E1_{f,k} from the adder 5 and the power spectrum N2_{f,k} from the background noise estimation unit 7 are each input to both the band divider 101 and the power calculator 102.

[0026] The band divider 101 divides the power spectrum N2_{f,k} from the background noise estimation unit 7 during the noise period, and the power spectrum E1_{f,k} from the adder 5, input frame by frame, during the speech period, into a predetermined number of channels. It outputs them to the power calculator 102 connected to its output side and to a band-by-band background noise re-estimation unit 12 described later. Using the per-channel power spectra from the band divider 101, the power calculator 102 calculates, by equations (6) and (7), the per-channel power sum E1_{m,k} corresponding to the power spectrum E1_{f,k} and the per-channel power sum N2_{m,k} corresponding to the power spectrum N2_{f,k}.

[0027]

[Equation 1]

$$E1_{m,k} = \sum_{f=fs}^{fe} E1_{f,k} \tag{6}$$

$$N2_{m,k} = \sum_{f=fs}^{fe} N2_{f,k} \tag{7}$$

Here m denotes the m-th frequency band (m-th channel) of the divided small frequency bands, fs denotes the start frequency of the m-th channel, and fe denotes the end frequency of the m-th channel.

[0028] The power calculator 102 also calculates, by equations (8) and (9), the total power E1_{ALL} over the entire frequency band corresponding to the power spectrum E1_{f,k} from the adder 5, and the total power N2_{ALL} over the entire frequency band corresponding to the power spectrum N2_{f,k} from the background noise estimation unit 7.

[0029]

$$E1_{ALL} = \sum_{f} E1_{f,k} \tag{8}$$

$$N2_{ALL} = \sum_{f} N2_{f,k} \tag{9}$$

The power calculator 102 outputs the calculated power sums E1_{m,k} and N2_{m,k} and total powers E1_{ALL} and N2_{ALL} to the logarithm calculator 103 connected to its output side. The logarithm calculator 103 takes the logarithm of each power sum and total power by equations (10) to (13) and outputs the results to the switch 104 connected to its output side.
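The per-channel power sums and the full-band totals of equations (8) and (9) can be sketched as follows. The sketch assumes that each channel is given as an inclusive pair of bin indices (fs, fe), following the description of start and end frequencies; the channel layout in the example and the function names are hypothetical.

```python
def band_power_sums(power_spec, band_edges):
    """Sum the power of each channel m over its bins fs..fe (inclusive).
    band_edges is a list of (fs, fe) bin-index pairs, one per channel."""
    return [sum(power_spec[fs:fe + 1]) for fs, fe in band_edges]


def total_power(power_spec):
    """Total power over the entire frequency band, as in equations (8) and (9)."""
    return sum(power_spec)


# Six bins split into three channels of two bins each (illustrative layout).
spec = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
edges = [(0, 1), (2, 3), (4, 5)]
print(band_power_sums(spec, edges))  # [3.0, 7.0, 11.0]
print(total_power(spec))             # 21.0
```

The same two functions apply to E1_{f,k} during the speech period and to N2_{f,k} during the noise period.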

[0030]

$$E1_{mlog} = \log(E1_{m,k}) \tag{10}$$

$$N2_{mlog} = \log(N2_{m,k}) \tag{11}$$

$$E1_{alog} = \log(E1_{ALL}) \tag{12}$$

$$N2_{alog} = \log(N2_{ALL}) \tag{13}$$

The switch 104 is controlled by the time-axis voice detector 6. During the noise period it is set to terminal a, connecting the output of the logarithm calculator 103 to the power holder 105. The power holder 105 thus receives the output of the logarithm calculator 103 during the noise period, holds the per-channel power sums N2_{mlog} and the full-band total power N2_{alog}, and outputs them to an S/N calculation unit 11 connected to its output side. During the speech period the switch 104 is set to terminal b, connecting the output of the logarithm calculator 103 to the S/N calculation unit 11. The per-channel power sums E1_{mlog} and the full-band total power E1_{alog} from the logarithm calculator 103 are thus input to the S/N calculation unit 11 frame by frame during the speech period.

[0031] The S/N calculation unit 11 includes a full-band S/N calculator 111 and a band-by-band S/N calculator 112, and calculates the S/N from the per-channel power sums E1_{mlog} and N2_{mlog} and the full-band total powers E1_{alog} and N2_{alog} supplied by the power operation unit 10. Specifically, the full-band S/N calculator 111 calculates the full-band S/N_a by equation (14) from the full-band total power E1_{alog}, sent from the logarithm calculator 103 through the switch 104 during the speech period, and the noise-period full-band total power N2_{alog} held in the power holder 105. It outputs S/N_a to the band-by-band background noise re-estimation unit 12 connected to its output side.

【００３２】[0032]

S/N_a = E1_{alog} − N2_{alog}   (14)

The band-specific S/N calculation unit 112, in turn, calculates the S/N_m of each small region by equation (15), using the per-channel power sum E1_{mlog} output from the logarithmic calculation unit 103 through the switch 104 during a speech period and the per-channel power sum N2_{mlog} of the noise period held in the power holder 105. It outputs the result to the band-specific background noise re-estimation unit 12 connected to its output side.

【００３３】[0033]

S/N_m = E1_{mlog} − N2_{mlog}   (15)

Here, m denotes the m-th frequency band (m-th channel) among the small regions into which the frequency band is divided.
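Equations (14) and (15) are plain differences of log-domain powers. A minimal sketch follows, assuming the log-domain values are already available as Python floats and lists; the function names are hypothetical and not taken from the patent.

```python
def full_band_snr(e1_alog, n2_alog):
    """Equation (14): S/N_a = E1_alog - N2_alog."""
    return e1_alog - n2_alog


def per_band_snr(e1_mlog, n2_mlog):
    """Equation (15): S/N_m = E1_mlog - N2_mlog, one value per channel m."""
    return [e - n for e, n in zip(e1_mlog, n2_mlog)]
```

Because the inputs are logarithms, each difference corresponds to a ratio of speech-period power to noise-period power.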

【００３４】[0034]

In this embodiment, the output of the power calculation unit 102 is logarithmically converted by the logarithmic calculation unit 103 before the S/N calculation unit 11 calculates the S/N, but the S/N may also be calculated without logarithmic conversion. In that case the logarithmic calculation unit 103 becomes unnecessary, although the S/N calculation becomes somewhat more cumbersome.
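To illustrate what calculating the S/N without logarithmic conversion could look like, the sketch below compares a log-domain test of the form S/N_m < S/N_a + δ with an equivalent test on raw powers. It assumes natural logarithms and strictly positive powers; with a decibel scale the factor `math.exp(delta)` would instead be `10 ** (delta / 10)`. Both function names are hypothetical.

```python
import math


def is_low_snr_log(e1_m, n2_m, e1_all, n2_all, delta):
    # Log domain: subtraction of logarithms, as in equations (14) and (15).
    snr_m = math.log(e1_m) - math.log(n2_m)
    snr_a = math.log(e1_all) - math.log(n2_all)
    return snr_m < snr_a + delta


def is_low_snr_linear(e1_m, n2_m, e1_all, n2_all, delta):
    # Linear domain: the same inequality on power ratios, cross-multiplied
    # so that no division is needed (valid because all powers are positive).
    return e1_m * n2_all < e1_all * n2_m * math.exp(delta)
```

The linear form trades the logarithm for multiplications and a scaled threshold, which matches the remark that the calculation becomes more cumbersome.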

【００３５】[0035]

The band-specific background noise re-estimation unit 12 includes a band-specific S/N difference determination unit 121 and a band-specific background noise re-update unit 122. Based on the full-band S/N_a and the per-channel S/N_m from the S/N calculation unit 11, it determines the S/N difference for each channel and re-updates the estimated background noise for channels with a low S/N. Specifically, the band-specific S/N difference determination unit 121 uses the full-band S/N_a from the full-band S/N calculation unit 111 and the per-channel S/N_m from the band-specific S/N calculation unit 112 to determine, for each channel, the difference between S/N_m and S/N_a according to equations (16) and (17). It outputs the determination result to the band-specific background noise re-update unit 122 connected to its output side.

【００３６】[0036]

S/N_m ≥ S/N_a + δ   (16)
S/N_m < S/N_a + δ   (17)

Here, δ is a constant determined in advance in consideration of the device, the usage environment, and the like. FIG. 2 shows an example of the full-band S/N_a and the per-channel S/N_m for the case α = 0.5, α_c = 0.7, β = 0.8, δ = −6 dB, and m = 5. The unit of the vertical axis is dB.
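The channel classification of equations (16) and (17) can be sketched as below. The function name and the sample values are hypothetical. Channel indices are zero-based here, whereas the text numbers channels from 1.

```python
def channels_to_reupdate(snr_m, snr_a, delta):
    """Return the indices of channels that satisfy equation (17).

    Channels not returned satisfy equation (16) and are left untouched.
    """
    threshold = snr_a + delta
    return [m for m, snr in enumerate(snr_m) if snr < threshold]
```

With δ = −6 dB, a channel is selected only when its S/N is more than 6 dB below the full-band S/N.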

【００３７】[0037]

The band-specific background noise re-update unit 122 re-updates the estimated background noise by equation (18). As inputs it uses, from the per-channel estimated background noise output by the band divider 101 of the power operation unit 10, the estimated background noise N2_{mf,k} of each channel that satisfies equation (17), and, from the per-channel division of the adder 5 output that is likewise output by the band divider 101, the output E1_{mf,k} of each channel that satisfies equation (17). It generates the re-updated estimated background noise N3_{mf,k} and outputs it to the adder 13 connected to its output side. No re-update is performed for channels that satisfy equation (16).

【００３８】[0038]

N3_{mf,k} = (1 − β)·N2_{mf,k} + β·E1_{mf,k}   (18)

Here, m denotes the m-th frequency band (m-th channel) among the divided frequency bands. β is a coefficient representing the speed of the re-update; it controls how strongly the m-th channel output E1_{mf,k} of the adder 5 is reflected in the re-updated estimated background noise N3_{mf,k}.

【００３９】[0039]

In equation (18), increasing β increases the influence of E1_{mf,k} on the re-updated estimated background noise N3_{mf,k}, and decreasing β reduces it. The m-th channel output E1_{mf,k} of the adder 5 changes abruptly in response to abrupt changes in the background noise. Therefore, if β is large, the re-updated estimated background noise N3_{mf,k} follows an abruptly changing background noise component quickly; conversely, if β is small, it follows such a component slowly. β can thus be regarded as a coefficient representing the speed of the re-update, and it is chosen according to the usage environment and the like.
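The effect of β in equation (18) can be seen with a small numeric sketch. The function name and the sample values are hypothetical; a single scalar stands in for one frequency component of one channel.

```python
def reupdate_noise(n2, e1, beta):
    """Equation (18): N3 = (1 - beta) * N2 + beta * E1."""
    return (1.0 - beta) * n2 + beta * e1


# A sudden noise burst: the smoothed estimate N2 is still low while E1 is high.
fast = reupdate_noise(n2=1.0, e1=5.0, beta=0.8)  # close to E1
slow = reupdate_noise(n2=1.0, e1=5.0, beta=0.2)  # close to N2
```

A large β moves the re-updated estimate most of the way toward the current frame, while a small β keeps it near the smoothed estimate.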

【００４０】[0040]

For each channel whose estimated background noise is re-updated by the band-specific background noise re-update unit 122, the adder 13 performs the subtraction of equation (19), using the m-th channel output E1_{mf,k} of the adder 5 supplied by the band divider 101 and the re-updated estimated background noise N3_{mf,k} supplied by the band-specific background noise re-update unit 122. It outputs the result to the smoothing operation unit 14 connected to its output side.

【００４１】[0041]

E2_{mf,k} = E1_{mf,k} − α_k·N3_{mf,k} = X_{mf,k} − α_c·N2_{mf,k} − α_k·N3_{mf,k}   (19)

Here, α_k is a coefficient determined in advance within the range 1 ≥ α_k > 0; it adjusts the amount of the re-updated estimated background noise N3_{mf,k} that is subtracted.

【００４２】[0042]

For channels whose estimated background noise is not re-updated, the adder 13 passes the output E1_{mf,k} of the adder 5 unchanged to the smoothing operation unit 14 connected to its output side, as shown in equation (20).
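The behavior of the adder 13, subtracting on re-updated channels and passing the other channels through unchanged, can be sketched as follows. For brevity each channel is represented by a single value rather than by all of its frequency components; the function name and this simplification are assumptions made for illustration.

```python
def second_subtraction(e1, n3, reupdated, alpha_k):
    """Apply the subtraction E1 - alpha_k * N3 to re-updated channels only.

    e1, n3    : per-channel values (one float per channel)
    reupdated : set of zero-based channel indices that were re-updated
    alpha_k   : subtraction coefficient, 0 < alpha_k <= 1
    """
    if not 0.0 < alpha_k <= 1.0:
        raise ValueError("alpha_k must satisfy 0 < alpha_k <= 1")
    return [
        e - alpha_k * n if m in reupdated else e  # otherwise pass through
        for m, (e, n) in enumerate(zip(e1, n3))
    ]
```

Only the channels listed in `reupdated` are modified, so channels with a comparatively high S/N keep the result of the first subtraction.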

【００４３】[0043]

E2_{mf,k} = E1_{mf,k}   (20)

FIG. 3 illustrates the operation of the band-specific background noise re-update unit. In FIG. 3, the vertical axis represents the power level of each component and the horizontal axis represents the channel (frequency). Dashed line a is the spectrum of the estimated background noise N2_{f,k} estimated by the background noise update unit 72; solid line b is the spectrum X_{f,k} in which the speech signal and the background noise are mixed; dotted line c is the spectrum of the background noise; and dotted line d is the spectrum of the re-updated estimated background noise N3_{mf,k} produced by the band-specific background noise re-update unit 122. The hatched channels (channels 4, 6, and 9) are the channels for which the band-specific background noise re-update unit 122 re-updated the estimated background noise.

【００４４】[0044]

The estimated background noise N2_{f,k} of dashed line a is, as shown by equation (3), noise smoothed over frames, and it does not respond quickly to background noise components that change abruptly from frame to frame. Therefore, even if the adder 5 removes the estimated background noise N2_{f,k} of dashed line a from the mixed speech and background noise spectrum of solid line b, the abruptly changing background noise components remain.

【００４５】[0045]

For this reason, for the low-S/N channels that satisfy equation (17), in other words the channels that contain less speech than the other channels (channels 4, 6, and 9 in FIG. 3), the re-updated estimated background noise N3_{mf,k} shown by dotted line d, which corresponds to the abruptly changing background noise component, is generated by equation (18) from the component E1_{mf,k} of the adder 5 output for that channel. The adder 13 then removes this re-updated estimated background noise N3_{mf,k} from the output E1_{f,k} of the adder 5 according to equation (19). As a result, the S/N of the channels with little speech (channels 4, 6, and 9 in FIG. 3) is greatly improved at the output of the adder 13.

【００４６】[0046]

The smoothing operation unit 14 applies the processing of equation (21) to the signal from the adder 13 to smooth its frequency characteristic. This smoothing reduces the discontinuity between frames, so a natural sound is obtained. The smoothing operation unit 14 outputs the smoothed output E3_{f,k} to the inverse fast Fourier transform calculator (inverse FFT calculator) 15 connected to its output side.

【００４７】[0047]

E3_{f,k} = (1 − γ)·E2_{f,k−1} + γ·E2_{f,k}   (21)

For each frame, the inverse FFT calculator 15 applies an inverse fast Fourier transform to the signal E3_{f,k} from the smoothing operation unit 14, using the phase Φ(f,k) from the phase holder 9, and thereby converts the signal E3_{f,k}, which is expressed in frequency components, into a time-axis signal E4(k) expressed as a function of time t. It outputs the result to the window function overlap processing unit 16 connected to its output side. The method of converting a signal expressed in frequency components into a time-axis signal by an inverse FFT is already well known.
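The inter-frame smoothing of equation (21) can be sketched as follows. The function name is hypothetical, and so is the treatment of the first frame, which has no predecessor: it is passed through unchanged here, a choice the text does not specify.

```python
def smooth_frames(e2_frames, gamma):
    """Equation (21): E3[k] = (1 - gamma) * E2[k-1] + gamma * E2[k].

    e2_frames : list of frames, each a list of per-frequency values
    gamma     : weight of the current frame
    """
    smoothed = []
    previous = None
    for frame in e2_frames:
        if previous is None:
            # Assumption: no earlier frame exists, so pass the frame through.
            smoothed.append(list(frame))
        else:
            smoothed.append(
                [(1.0 - gamma) * p + gamma * c for p, c in zip(previous, frame)]
            )
        previous = frame  # equation (21) uses the unsmoothed previous frame
    return smoothed
```

Each output frame mixes the current spectrum with the previous one, which reduces frame-to-frame jumps.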

【００４８】[0048]

The window function overlap processing unit 16 aligns the time-axis signal E4(k) from the inverse FFT calculator 15 by overlapping frames at a predetermined ratio so that no discontinuity arises between them, and outputs the result to the digital-to-analog converter (D/A converter) 17 connected to its output side. The D/A converter 17 converts the signal from the window function overlap processing unit 16 from a digital signal into an analog signal and outputs it to the output terminal 19 connected to its output side. In this way, the speech signal input to the input terminal 18 is output from the output terminal 19 with the background noise removed.
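One common way to overlap frames at a fixed ratio is overlap-add, sketched below. Whether the window function overlap processing unit 16 uses exactly this scheme is not stated in the text, so the technique, the function name, and the `hop` parameter are all assumptions for illustration.

```python
def overlap_add(frames, hop):
    """Sum equal-length frames whose starts are `hop` samples apart.

    A hop of half the frame length corresponds to a 50% overlap.
    """
    if not frames:
        return []
    frame_len = len(frames[0])
    out = [0.0] * (hop * (len(frames) - 1) + frame_len)
    for i, frame in enumerate(frames):
        start = i * hop
        for n, sample in enumerate(frame):
            out[start + n] += sample  # overlapping regions accumulate
    return out
```

In practice the frames would already be weighted by a window function so that the overlapping regions sum to a smooth signal.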

【００４９】[0049]

As described above, according to the first embodiment, the input signal is divided into a plurality of small frequency bands (channels) and the background noise subtraction is performed on them, so the processing does not need to be changed even when the characteristics of the input signal are unknown. In addition, because a single processing method removes background noise in the low and high frequency ranges alike, a background noise removing device with excellent sound quality and no audible unnaturalness can be realized.

【００５０】[0050]

Further, according to the first embodiment, the input signal is divided into a plurality of channels, and noise subtraction is performed for each channel using the estimated background noise, which represents the average characteristics of the background noise. For channels whose S/N is small compared with the other channels, that is, channels with little speech, noise subtraction is performed again using the re-updated estimated background noise, which corresponds to the abruptly changing components of the background noise. Therefore, even when the speech components are concentrated in the low frequency range, background noise can be removed accurately in the channels with little speech.

【００５１】[0051]

FIG. 4 is a block diagram showing a background noise removing device according to a second embodiment of the present invention. In the second embodiment, the time-axis speech detector 6 of the background noise removing device of the first embodiment shown in FIG. 1 is replaced by a frequency-axis speech detector 20, and the output side of the power spectrum calculation unit 4 and the output side of the band divider 101 of the power operation unit 10 are each connected to the input side of the frequency-axis speech detector 20.

【００５２】[0052]

In FIG. 4, the frequency-axis speech detector 20 uses the output X_{f,k} of the power spectrum calculation unit 4 and the output N2_{mf,k} of the band divider 101 to calculate the output N4_{cf,k} by equation (22) for each small frequency band (channel) produced by the band divider 101.

【００５３】[0053]

N4_{cf,k} = ε·N2_{cf,k} + (1 − ε)·X_{cf,k}   (22)

where N4_{cf,0} = 0. Here, c denotes the c-th frequency band (c-th channel) among the divided frequency bands, and ε is a constant, determined in advance within the range 1 > ε > 0, that sets the speed of the noise averaging.
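Equation (22) is a weighted mix of the estimated background noise and the current power spectrum. A minimal sketch for one frame (k ≥ 1) is shown below; the function name is hypothetical, and each list holds the values of one channel.

```python
def update_n4(n2, x, epsilon):
    """Equation (22): N4 = epsilon * N2 + (1 - epsilon) * X, element-wise."""
    if not 0.0 < epsilon < 1.0:
        raise ValueError("epsilon must satisfy 0 < epsilon < 1")
    return [epsilon * n + (1.0 - epsilon) * p for n, p in zip(n2, x)]
```

An ε close to 1 keeps N4 near the estimated background noise, while an ε close to 0 lets the current frame dominate.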

【００５４】[0054]

Next, the frequency-axis speech detector 20 adds a preset threshold κ to the output N4_{cf,k} calculated by equation (22), as shown in equation (23). If the sum J_{cf,k} satisfies equation (24), the c-th channel is judged to contain a speech component; if the sum J_{cf,k} satisfies equation (25), the c-th channel is judged not to contain a speech component.

【００５５】[0055]

J_{cf,k} = κ + N4_{cf,k}   (23)
X_{cf,k} > J_{cf,k}   (24)
X_{cf,k} ≤ J_{cf,k}   (25)

Next, the frequency-axis speech detector 20 compares the number V of channels judged to contain a speech component with a predetermined number of channels M. If V satisfies equation (26), the k-th frame is judged to be a frame that contains a speech signal; if V satisfies equation (27), the k-th frame is judged to be a frame that does not contain a speech signal. The number of channels M is not limited to any particular value.

【００５６】[0056]

Number of channels with a speech component V ≥ M   (26)
Number of channels with a speech component V < M   (27)

FIG. 5 illustrates the operation of the frequency-axis speech detector 20 described above. In FIG. 5, the vertical axis is the power level and the horizontal axis is the channel number (frequency). Solid line a is the power spectrum output X_{f,k} of the k-th frame from the power spectrum calculation unit 4; solid line b is the spectrum N2_{f,k} of the estimated background noise estimated by the background noise estimation unit 7; and dotted line c is the sum J_{f,k} calculated by equation (23).

【００５７】[0057]

FIG. 5 shows an example in which equation (24) holds for the second, fourth, and sixth channels, which are therefore judged to contain a speech component, while equation (25) holds for the remaining first, third, fifth, and seventh to tenth channels, which are therefore judged not to contain one. In this case the number V of channels judged to contain a speech component is 3, so if the number of channels M in equations (26) and (27) is set to 3, for example, the k-th frame is judged to contain a speech signal. In this embodiment, whether a frame contains a speech signal is decided from the number V of channels containing a speech component, but it may instead be decided from the ratio of V to the total number of channels.
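The per-channel test of equations (23) to (25) and the frame-level decision of equations (26) and (27) can be sketched together. The function name and the numeric values are hypothetical; they were chosen only so that the second, fourth, and sixth of ten channels exceed the threshold, mirroring the pattern described for FIG. 5. Each channel is represented by a single power value.

```python
def frame_has_speech(x, n4, kappa, min_channels):
    """Return (decision, V) for one frame.

    x, n4        : per-channel power and N4 values (one float per channel)
    kappa        : threshold added to N4, equation (23)
    min_channels : M in equations (26) and (27)
    """
    thresholds = [kappa + n for n in n4]  # equation (23)
    # Equation (24): a channel contains speech when X exceeds J.
    v = sum(1 for p, j in zip(x, thresholds) if p > j)
    return v >= min_channels, v  # equation (26) if True, (27) otherwise


x = [1.0, 5.0, 1.0, 5.0, 1.0, 5.0, 1.0, 1.0, 1.0, 1.0]
n4 = [1.0] * 10
decision, v = frame_has_speech(x, n4, kappa=2.0, min_channels=3)
```

The ratio-based alternative mentioned in the text would compare `v / len(x)` with a fractional threshold instead of comparing `v` with M.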

【００５８】[0058]

As in the first embodiment, the determination result of the frequency-axis speech detector 20 is sent to the switch 71 of the background noise estimation unit 7 and the switch 104 of the power operation unit 10, and controls each switch. All units other than the frequency-axis speech detector 20 have the same configuration and operation as the corresponding units of the first embodiment shown in FIG. 1.

【００５９】[0059]

As described above, according to the second embodiment, the speech signal is detected on the frequency axis, so the threshold can be set to place emphasis on the frequency components of speech. In addition, because the speech signal is detected frame by frame, a large amount of data within a frame is used together as the basis for the decision, and the speech signal can be detected reliably. Therefore, a background noise removing device with excellent sound quality that removes background noise accurately can be provided. The second embodiment naturally also provides the same effects as the first embodiment.

【００６０】[0060]

FIG. 6 is a block diagram showing a background noise removing device according to a third embodiment of the present invention. In the background noise removing device of the first embodiment shown in FIG. 1, during a speech period the adder 5 subtracts the estimated background noise N2_{f,k} generated by the background noise estimation unit 7 from the power spectrum X_{f,k} output by the power spectrum calculation unit 4, and the adder 13 then subtracts the re-updated estimated background noise N3_{mf,k} generated by the band-specific background noise re-estimation unit 12, thereby removing the background noise.

【００６１】[0061]

In the third embodiment, by contrast, the estimated background noise N2_{f,k} and the re-updated estimated background noise N3_{mf,k} are subtracted together from the power spectrum X_{f,k} output by the power spectrum calculation unit 4 in the adder 13. For this purpose, in FIG. 6 the output side of the power spectrum calculation unit 4 is connected directly to the input side of the adder 13 without passing through the adder 5. The band-specific background noise re-estimation unit 12 generates estimated background noise (N3_{mf,k} + α_c·N2_{f,k}), obtained by adding a fixed amount of the estimated background noise N2_{f,k} to the re-updated estimated background noise N3_{mf,k}, and outputs it to the adder 13 connected to its output side. The third embodiment also provides the same effects as the first embodiment.
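That a single combined subtraction can reproduce the two-stage result follows from the expanded form of equation (19). The sketch below checks this numerically for one frequency component. The function names and values are hypothetical, and weighting N3 by α_k in the combined term is an assumption carried over from equation (19); with α_k = 1 the combined term reduces to the (N3 + α_c·N2) form given in the text.

```python
def two_stage(x, n2, n3, alpha_c, alpha_k):
    e1 = x - alpha_c * n2     # first subtraction (adder 5)
    return e1 - alpha_k * n3  # second subtraction (adder 13), equation (19)


def single_stage(x, n2, n3, alpha_c, alpha_k):
    combined = alpha_k * n3 + alpha_c * n2  # formed before the adder 13
    return x - combined                     # one subtraction in the adder 13


a = two_stage(10.0, 2.0, 3.0, alpha_c=0.7, alpha_k=1.0)
b = single_stage(10.0, 2.0, 3.0, alpha_c=0.7, alpha_k=1.0)
```

The two results agree up to floating-point rounding, so merging the subtractions changes the block diagram but not the output.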

【００６２】[0062]

FIG. 7 is a block diagram showing a background noise removing device according to a fourth embodiment of the present invention. In the background noise removing device of the second embodiment shown in FIG. 4, during a speech period the adder 5 subtracts the estimated background noise N2_{f,k} generated by the background noise estimation unit 7 from the power spectrum X_{f,k} output by the power spectrum calculation unit 4, and the adder 13 then subtracts the re-updated estimated background noise N3_{mf,k} generated by the band-specific background noise re-estimation unit 12, thereby removing the background noise.

【００６３】[0063]

In the fourth embodiment, by contrast, the estimated background noise N2_{f,k} and the re-updated estimated background noise N3_{mf,k} are subtracted together from the power spectrum X_{f,k} output by the power spectrum calculation unit 4 in the adder 13. For this purpose, in FIG. 7 the output side of the power spectrum calculation unit 4 is connected directly to the input side of the adder 13 without passing through the adder 5. The band-specific background noise re-estimation unit 12 generates estimated background noise (N3_{mf,k} + α_c·N2_{f,k}), obtained by adding a fixed amount of the estimated background noise N2_{f,k} to the re-updated estimated background noise N3_{mf,k}, and outputs it to the adder 13 connected to its output side. The fourth embodiment also provides the same effects as the second embodiment.

【００６４】[0064]

The present invention can be applied widely, not only to mobile telephones, car telephones, and audio equipment, but also to speech recognition devices, video conference systems, radio communication devices, and the like.

【００６５】[0065]

【発明の効果】[Effects of the Invention]

As described above, according to the present invention, the frequency band of the input signal is divided into a plurality of small regions and background noise subtraction is performed for each small region, so the processing does not need to be changed even when the characteristics of the input signal are unknown. In addition, because a single processing method removes background noise in the low and high frequency ranges alike, no audible unnaturalness arises.

【００６６】[0066]

Further, according to the present invention, noise subtraction is performed for each small region using the estimated background noise, which represents the average characteristics of the background noise, and, for channels whose S/N is small compared with the other channels, noise subtraction is additionally performed using the re-updated estimated background noise, which corresponds to abrupt changes in the background noise. Background noise in channels with a small S/N can therefore be removed accurately, and a background noise removing device with excellent sound quality can be realized.

【図面の簡単な説明】[Brief description of the drawings]

FIG. 1 is a block diagram showing a first embodiment of the present invention.

FIG. 2 is a diagram showing an example of the S/N of each channel and the S/N of the entire frequency band.

FIG. 3 is a diagram showing an operation example of the band-specific background noise re-update unit.

FIG. 4 is a block diagram showing a second embodiment of the present invention.

FIG. 5 is a diagram showing an operation example of the frequency-axis speech detector.

FIG. 6 is a block diagram showing a third embodiment of the present invention.

FIG. 7 is a block diagram showing a fourth embodiment of the present invention.

[Explanation of symbols]

1 A/D converter
2 Window function calculator
3 FFT calculator
4 Power spectrum calculation unit
5, 13 Adders
6 Time-axis speech detector
7 Background noise estimation unit
8 Phase calculation unit
9 Phase holder
10 Power operation unit
11 S/N calculation unit
12 Band-specific background noise re-estimation unit
14 Smoothing operation unit
15 Inverse FFT calculator
16 Window function overlap processing unit
17 D/A converter
20 Frequency-axis speech detector

Claims

[Claims]

1. A background noise removing device to which background noise is first input as an input signal and to which the background noise and a speech signal are then input in mixed form, the device comprising: signal conversion means for converting the input signal into frequency components; speech detection means for detecting a speech signal contained in the input signal; background noise estimation means for generating and holding, in a noise period in which no speech signal is detected by the speech detection means, estimated background noise by averaging the frequency component and the estimated background noise generated one frame before; first adding means for subtracting, in a speech period in which a speech signal is detected by the speech detection means, the estimated background noise held by the background noise estimation means from the frequency component converted by the signal conversion means; S/N calculation means for calculating the S/N of the entire frequency band and the S/N of each of a plurality of small regions into which the entire frequency band is divided, with the frequency component obtained by the subtraction of the first adding means as the signal and the estimated background noise held by the background noise estimation means as the noise; band-specific background noise re-estimation means for generating, for each small region in which the difference between the S/N of the small region and the S/N of the entire frequency band is equal to or smaller than a predetermined value, re-updated estimated background noise that contains, in a predetermined ratio, the frequency component obtained by the subtraction of the first adding means and the estimated background noise held by the background noise estimation means; second adding means for subtracting the re-updated estimated background noise from the frequency component obtained by the subtraction of the first adding means; and signal reproduction means for converting the frequency component of each frame obtained by the subtraction of the second adding means into a time-axis signal and outputting it, wherein the magnitude of the estimated background noise subtracted by the first adding means and the magnitude of the re-updated estimated background noise subtracted by the second adding means are set so that the background noise is removed from the input signal.

2. A background noise elimination device to which background noise is first input as an input signal and to which the background noise and a voice signal are thereafter input in mixed form, the device comprising:
signal conversion means for converting the input signal into frequency components;
voice detection means for detecting a voice signal included in the input signal;
background noise estimating means for generating and holding an estimated background noise, during a noise period in which no voice signal is detected by the voice detection means, by averaging the frequency components converted by the signal conversion means and the estimated background noise generated one frame before;
first adding means for subtracting, during a voice period in which a voice signal is detected by the voice detection means, the estimated background noise held by the background noise estimating means from the frequency components converted by the signal conversion means;
S/N calculating means for calculating the S/N of the entire frequency band and the S/N of each of a plurality of small regions into which the entire frequency band is divided, taking the frequency components obtained by the subtraction of the first adding means as the signal and the estimated background noise held by the background noise estimating means as the noise;
band-based background noise re-estimating means for generating, for each small region in which the difference between the S/N of the small region and the S/N of the entire frequency band is equal to or smaller than a predetermined value, a noise that contains, at a predetermined ratio, the frequency components obtained by the subtraction of the first adding means and the estimated background noise held by the background noise estimating means, and for generating a re-updated estimated background noise by adding to that noise the estimated background noise held by the background noise estimating means;
second adding means for subtracting the re-updated estimated background noise from the frequency components obtained by the conversion of the signal conversion means; and
signal reproducing means for converting the frequency components of each frame obtained by the subtraction of the second adding means into a time-axis signal and outputting the time-axis signal,
wherein the magnitude of the re-updated estimated background noise subtracted by the second adding means is set so that the background noise is removed from the input signal.
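The re-update step of claim 2 differs from claim 1 in that the held noise estimate is added to the mixed noise, and the sum is subtracted from the converted spectrum rather than from the output of the first adding means. The sketch below illustrates this for the bins of one selected small region. It is a hedged illustration only: the weighted mix controlled by `ratio`, the flooring at zero, and the name `claim2_band` are assumptions and do not appear in the claims.

```python
def claim2_band(spectrum, noise, ratio):
    """Apply the claim 2 re-update to the bins of one selected small region."""
    out = []
    for s, n in zip(spectrum, noise):
        first = max(s - n, 0.0)                     # output of the first adding means
        mixed = ratio * first + (1.0 - ratio) * n   # noise mixed at the assumed ratio
        reupdated = mixed + n                       # held noise estimate added back
        # Second adding means: subtract from the converted spectrum itself.
        out.append(max(s - reupdated, 0.0))
    return out
```

Under these assumptions, `s - (mixed + n)` equals `(s - n) - mixed`, so as long as no value is floored the result matches subtracting the mixed noise from the output of the first adding means.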

3. The background noise elimination device according to claim 1, wherein the voice detection means detects the voice signal included in the input signal by detecting a voice component included in the frequency components obtained by the conversion of the signal conversion means.

4. The background noise elimination device according to claim 3, wherein the voice detection means detects a voice component in each of a plurality of small regions obtained by dividing the frequency components obtained by the conversion of the signal conversion means, and determines that a voice signal is included in the input signal when the number of small regions in which a voice component is detected is equal to or greater than a predetermined value.

5. The background noise elimination device according to claim 3, wherein the voice detection means detects a voice component in each of a plurality of small regions obtained by dividing the frequency components obtained by the conversion of the signal conversion means, and determines that a voice signal is included in the input signal when the ratio of the number of small regions in which a voice component is detected to the total number of small regions is equal to or greater than a predetermined value.
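The two decision rules of claims 4 and 5 can be sketched as follows. The claims do not state how a voice component is detected within a small region, so the per-band test used here (band power exceeding the held noise power by a factor `margin`) is purely an assumption, as are the equal-width bands and all function names.

```python
def band_flags(spectrum, noise, n_bands, margin=2.0):
    """Flag each small region whose power exceeds margin times the noise power."""
    width = len(spectrum) // n_bands   # assumed equal-width small regions
    flags = []
    for b in range(n_bands):
        lo, hi = b * width, (b + 1) * width
        flags.append(sum(spectrum[lo:hi]) > margin * sum(noise[lo:hi]))
    return flags

def voice_by_count(flags, min_bands):
    """Claim 4 rule: enough small regions contain a voice component."""
    return sum(flags) >= min_bands

def voice_by_ratio(flags, min_ratio):
    """Claim 5 rule: a large enough fraction of small regions contains one."""
    return sum(flags) / len(flags) >= min_ratio
```

The count rule depends on how many small regions the band is divided into, whereas the ratio rule keeps the same threshold when the number of small regions changes.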