
JP4709714B2 - Echo canceling apparatus, method thereof, program thereof, and recording medium thereof - Google Patents


Publication number
JP4709714B2
Authority
JP
Japan
Prior art keywords
signal
sound
echo
speaker
received
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
JP2006232519A
Other languages
Japanese (ja)
Other versions
JP2008060715A (en)
Inventor
和則 小林
賢一 古家
陽一 羽田
章俊 片岡
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nippon Telegraph and Telephone Corp
Original Assignee
Nippon Telegraph and Telephone Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nippon Telegraph and Telephone Corp
Priority to JP2006232519A
Publication of JP2008060715A
Application granted
Publication of JP4709714B2
Legal status: Active
Anticipated expiration

Landscapes

  • Circuit For Audible Band Transducer (AREA)
  • Cable Transmission Systems, Equalization Of Radio And Reduction Of Echo (AREA)

Description

The present invention relates to an echo canceling apparatus for hands-free communication such as video conferencing and audio conferencing, and to a method, a program, and a recording medium therefor.

FIG. 1 shows an example of the functional configuration of a conventional echo canceling apparatus, and FIG. 2 shows a block diagram in which the speaker is modeled as a parallel connection of a linear characteristic and a nonlinear characteristic.
In the following description, the reproducing means is a speaker and the sound collecting means is a microphone. The voice signal arriving from a far-end speaker (not shown) via the communication network 2 is denoted the received signal $x(t)$. The voice uttered by the near-end speaker 7 is the speaker voice $s(t)$, and the speaker voice as picked up by the microphone is the speaker received sound signal $a(t)$. The reproduced sound that is emitted from the speaker, leaks into the microphone, and is picked up there is the echo signal $d(t)$. The total signal picked up by the microphone is the received sound signal $y(t)$, and the transmission signal sent to the far-end speaker, which is also the error signal, is $e(t)$. Here $t$ denotes discrete time.
The conventional echo canceling apparatus 14 consists of a variable filter 8, a subtracting unit 10, and a filter coefficient updating unit 12. In a loudspeaking call using the speaker 4 and the microphone 6, it cancels the echo signal $d(t)$ mixed into the received sound signal $y(t)$. The input to the echo canceling apparatus 14, that is, the received sound signal $y(t)$ picked up by the microphone 6, consists of the echo signal $d(t)$ and the speaker received sound signal $a(t)$. The apparatus estimates the echo signal $d(t)$ contained in $y(t)$ and subtracts the estimate from $y(t)$, thereby canceling the echo.

First, a voice signal from the far-end speaker (not shown) is input to the speaker 4 via the communication network 2 as the received signal $x(t)$. If the speaker characteristic is assumed to be linear with impulse response $g(t)$, and the impulse response from the speaker 4 to the microphone 6 is $r_1(t)$, the echo signal $d(t)$ that leaks from the speaker 4 into the microphone 6 is given by the following equation (1).
$$d(t) = g(t) * r_1(t) * x(t) \tag{1}$$
Here $*$ denotes convolution. Next, let $c_1(t)$ be the impulse response from the near-end speaker 7 to the microphone 6. Since the voice of the near-end speaker 7 is $s(t)$, as described above, the received sound signal $y(t)$ is given by the following equation (2).
$$y(t) = g(t) * r_1(t) * x(t) + c_1(t) * s(t) \tag{2}$$
What is required of the echo canceling apparatus 14 is to cancel the echo signal $d(t)$ contained in the received sound signal $y(t)$, that is, to cancel the first term $g(t) * r_1(t) * x(t)$ on the right side of equation (2).
The variable filter 8 convolves the estimated impulse response (hereinafter, the pseudo impulse response $h(t)$) with the received signal $x(t)$ and outputs the pseudo echo signal $h(t) * x(t)$. The subtracting unit 10 subtracts the pseudo echo signal $h(t) * x(t)$ from the received sound signal $y(t)$ and outputs the transmission signal $e(t)$, from which the echo signal $d(t)$ has been canceled. The transmission signal $e(t)$ is given by the following equation (3).
$$e(t) = \{ g(t) * r_1(t) - h(t) \} * x(t) + c_1(t) * s(t) \tag{3}$$
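As a minimal sketch of equations (1) to (3), the following Python code builds the received sound signal from short impulse responses and subtracts the pseudo echo. The function names `convolve` and `transmission_signal`, and the truncation of every signal to the length of `x`, are assumptions made for illustration and do not come from the patent.

```python
def convolve(h, x):
    """Full discrete convolution of two finite sequences."""
    out = [0.0] * (len(h) + len(x) - 1)
    for i, hi in enumerate(h):
        for j, xj in enumerate(x):
            out[i + j] += hi * xj
    return out


def transmission_signal(g, r1, h, c1, x, s):
    """Return e(t) of equation (3), truncated to len(x) samples."""
    n = len(x)
    echo = convolve(convolve(g, r1), x)[:n]        # equation (1): d = g * r1 * x
    near = convolve(c1, s)[:n]                     # near-end term c1 * s
    y = [d + a for d, a in zip(echo, near)]        # equation (2)
    pseudo = convolve(h, x)[:n]                    # pseudo echo h * x
    return [yi - pi for yi, pi in zip(y, pseudo)]  # equation (3)
```

When `h` equals the convolution of `g` and `r1`, the echo term vanishes and only the near-end term `c1 * s` remains in the output.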

The filter coefficients of the variable filter (the pseudo impulse response $h(t)$) are updated by the filter coefficient updating unit 12 using the received signal $x(t)$, the transmission signal $e(t)$, and the like. The update uses, for example, the learning identification (NLMS: Normalized Least Mean Squares) algorithm, the projection algorithm, the recursive least squares (RLS) algorithm, or the LMS (Least Mean Squares) algorithm. With the learning identification algorithm, for example, the update is given by the following equation (4).
$$H(t+1) = H(t) + a \, \frac{X(t) \, e(t)}{X(t)^T X(t)} \tag{4}$$
Here $a$ is a preset step size satisfying $0 < a < 2$, and $A^T$ denotes the transpose of a vector $A$. $H(t) = (h(0), h(1), \ldots, h(L-1))^T$ is the filter coefficient vector at time $t$, where $L$ is the tap length of the variable filter 8. $X(t) = (x(t-0), x(t-1), \ldots, x(t-L+1))^T$ is the vector of the latest $L$ samples of the received signal $x(t)$ at time $t$.
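The update of equation (4) can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the patent's implementation: the names `nlms_step` and `run_nlms`, the zero initial coefficients, and the small constant `eps` that guards against division by zero are all hypothetical additions.

```python
def nlms_step(H, X, e, a=0.5, eps=1e-12):
    """One update of equation (4): H <- H + a * X * e / (X^T X).

    eps is an assumed safeguard against a zero denominator.
    """
    power = sum(v * v for v in X) + eps
    return [h + a * v * e / power for h, v in zip(H, X)]


def run_nlms(x, y, L, a=0.5):
    """Adapt an L-tap filter so that H * x tracks y; return (H, errors)."""
    H = [0.0] * L
    errors = []
    for t in range(len(x)):
        # X(t) = (x(t), x(t-1), ..., x(t-L+1)), zero before the first sample
        X = [x[t - k] if t - k >= 0 else 0.0 for k in range(L)]
        e = y[t] - sum(h * v for h, v in zip(H, X))  # error (transmission) signal
        errors.append(e)
        H = nlms_step(H, X, e, a)
    return H, errors
```

With no near-end speech and a sufficiently varied input, `H` approaches the true echo path and the error signal decays toward zero.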

As described above, the filter coefficient updating unit 12 updates the filter coefficients of the variable filter 8 from the received signal $x(t)$ and the transmission signal $e(t)$ using an update formula such as equation (4). The variable filter 8 filters the received signal $x(t)$ sequentially with the updated coefficients. This processing cancels the echo signal $d(t)$ contained in the received sound signal $y(t)$, and the transmission signal $e(t)$, with the echo canceled, is output through the communication network 2 to the far-end speaker (not shown). Details of the conventional echo canceling apparatus are described in Patent Document 1.
Patent Document 1: Japanese Patent No. 2602750

In the conventional echo canceling apparatus 14, the echo signal $d(t)$ sometimes cannot be canceled sufficiently. The reason is considered to be as follows.
In general, a speaker has a nonlinear characteristic in which the output saturates for an input signal (received signal $x(t)$) of large amplitude, so the speaker characteristic can be divided into a linear part and a nonlinear part. As shown in FIG. 2, the speaker can be represented as the parallel combination of a linear part with impulse response $g(t)$ and a nonlinear part described by a function $f(\cdot)$. Taking this speaker characteristic into account, the echo signal $d(t)$ is given by the following equation (5).
$$d(t) = g(t) * r_1(t) * x(t) + r_1(t) * f(x(t)) \tag{5}$$
The received sound signal $y(t)$ is given by the following equation (6).
$$\begin{aligned} y(t) &= d(t) + c_1(t) * s(t) \\ &= g(t) * r_1(t) * x(t) + r_1(t) * f(x(t)) + c_1(t) * s(t) \end{aligned} \tag{6}$$
Accordingly, the transmission signal $e(t)$ is given by the following equation (7).
$$e(t) = \{ g(t) * r_1(t) - h(t) \} * x(t) + r_1(t) * f(x(t)) + c_1(t) * s(t) \tag{7}$$
However, the conventional echo canceling apparatus 14 can cancel only the echo that reaches the microphone 6 through the linear echo path; it cannot cancel the nonlinear echo. That is, only the first term on the right side of equation (6), $g(t) * r_1(t) * x(t)$, is canceled, while the second term, $r_1(t) * f(x(t))$, cannot be, so a nonlinear echo signal is mixed into the transmission signal $e(t)$. Consequently, when a strongly nonlinear speaker or the like is used, the echo signal cannot be canceled sufficiently.
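The residual described by equation (7) can be made concrete with a small Python sketch. The nonlinearity `f_nonlinear` below (the deviation of a hard clipper from the identity) is a hypothetical choice, since the patent leaves $f(\cdot)$ unspecified; the function names and the ideal setting $h = g * r_1$ are likewise assumptions for illustration.

```python
def convolve(h, x):
    """Full discrete convolution of two finite sequences."""
    out = [0.0] * (len(h) + len(x) - 1)
    for i, hi in enumerate(h):
        for j, xj in enumerate(x):
            out[i + j] += hi * xj
    return out


def f_nonlinear(v, limit=1.0):
    """Assumed saturation model: hard clipper output minus the input."""
    return max(-limit, min(limit, v)) - v


def residual_echo(g, r1, x):
    """Echo left over after an ideal linear canceller with h = g * r1."""
    n = len(x)
    linear = convolve(convolve(g, r1), x)[:n]
    nonlinear = convolve(r1, [f_nonlinear(v) for v in x])[:n]
    d = [a + b for a, b in zip(linear, nonlinear)]   # equation (5)
    pseudo = convolve(convolve(g, r1), x)[:n]        # h * x with h = g * r1
    return [di - pi for di, pi in zip(d, pseudo)]    # equals r1 * f(x)
```

For inputs that stay below the clipping limit the residual is zero, while a large-amplitude input leaves the term $r_1(t) * f(x(t))$ untouched, which is the problem the text describes.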

According to the present invention, in an echo canceling apparatus that converts a received signal into reproduced sound by reproducing means and emits it, removes the echo signal from a received sound signal consisting of the speaker received sound signal from speaker sound collecting means and the signal produced by the reproduced sound leaking in (hereinafter, the echo signal), and outputs the result as a transmission signal: a first pseudo echo signal is generated from the received signal; a second pseudo echo signal is generated from the received sound signals from among a plurality of sound collecting means; the first pseudo echo signal and the second pseudo echo signal are subtracted from the received sound signal from the speaker sound collecting means to output the transmission signal; the filter coefficients of the first variable filter are updated with at least the received signal and the transmission signal as inputs; the filter coefficients of the second variable filter are updated from at least the received sound signals from among the plurality of sound collecting means and the transmission signal; and the level of the received signal is detected and, when the detected level is lower than a predetermined threshold, the second variable filter is not operated and its output is set to 0.
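The level-dependent switching of the second variable filter can be sketched as follows. The patent does not say how the level of the received signal is measured, so the mean absolute value over a short frame used here is an assumption, as are the function names and the newest-first ordering of the frames.

```python
def signal_level(x_frame):
    """Assumed level measure: mean absolute value over a frame."""
    return sum(abs(v) for v in x_frame) / len(x_frame)


def gated_second_output(x_frame, y_m_frame, h_m, threshold):
    """Output of one second variable filter for the newest sample.

    x_frame and y_m_frame hold the latest samples, newest first.
    Below the threshold the filter is not run and 0.0 is returned.
    """
    if signal_level(x_frame) < threshold:
        return 0.0
    return sum(h * v for h, v in zip(h_m, y_m_frame))
```

One plausible reading of this design is that the nonlinear echo only matters when the received signal is large, so bypassing the second filter at low levels avoids distorting the near-end voice for no benefit.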

With the above configuration, even when a strongly nonlinear speaker or the like is used, the ability to remove the echo signal is improved, and so is the quality of the transmission signal.

The best mode for carrying out the present invention is described below.

FIG. 3 shows an example of the functional configuration of the present invention, and FIG. 4 shows its main processing flow. Components with the same function as in FIG. 1 carry the same reference numerals, and redundant description is omitted; the same applies below. The main microphone 6_1 is the speaker sound collecting means, which picks up the voice of the talker, and "sound collecting means" denotes all microphones including the main microphone.
The echo canceling apparatus 40 consists of a first variable filter 24, second variable filters 26_m (m = 2, ..., M), an adding unit 36, a subtracting unit 38, a first filter coefficient updating unit 34, and a second filter coefficient updating unit 32. There are M microphones (M ≥ 2), denoted 6_m (m = 1, ..., M).
The inputs to the echo canceling apparatus 40 are the echo signal produced by the reproduced sound from the speaker 4 leaking in and the voice signal from the near-end speaker 7, that is, the received sound signals $y_m(t)$ picked up by the microphones 6_m. The output is the transmission signal $e(t)$ to the far-end speaker (not shown). The echo canceling apparatus 40 cancels the echo signal $d(t)$ and makes conversation easier. Each input signal of the echo canceling apparatus 40 is converted from an analog signal to a discrete-time (digital) signal by an AD converter (not shown), and each output signal is converted from a discrete-time (digital) signal to an analog signal by a DA converter (not shown).

The received signal $x(t)$ is reproduced as sound by the speaker 4. The signal flow inside the speaker 4 is shown in FIG. 2, as above. As already described, the speaker output saturates for large-amplitude input, so the speaker characteristic is divided into a linear part, with impulse response $g(t)$, and a nonlinear part, described by the function $f(\cdot)$. With $r_m(t)$ the impulse response from the speaker to each microphone 6_m (m = 1, ..., M) and $c_m(t)$ the impulse response from the near-end speaker 7 to each microphone 6_m, the received sound signal $y_m(t)$ at each microphone 6_m is given by the following equation (8).
$$y_m(t) = g(t) * r_m(t) * x(t) + r_m(t) * f(x(t)) + c_m(t) * s(t) \tag{8}$$
In Embodiment 1, speaker received sound signal selecting means (not shown) selects as the main microphone 6_1, from among the microphones 6_m, the microphone whose time-averaged ratio of echo signal level to speaker received sound signal level is smaller than that of the other microphones (step S2).
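A minimal sketch of the main microphone selection in step S2 might look like the following. It assumes that per-frame echo levels and talker levels are already available for every microphone (the patent does not say how they are measured), that the talker levels are strictly positive, and that the time average is a plain mean; the function name and data layout are hypothetical.

```python
def select_main_microphone(echo_levels, speech_levels):
    """Return the index of the microphone with the smallest
    time-averaged echo-to-talker level ratio.

    echo_levels[m][k] and speech_levels[m][k] are the levels measured
    for microphone m in frame k (assumed inputs, all speech levels > 0).
    """
    def mean_ratio(m):
        ratios = [e / s for e, s in zip(echo_levels[m], speech_levels[m])]
        return sum(ratios) / len(ratios)

    return min(range(len(echo_levels)), key=mean_ratio)
```

The returned index plays the role of the main microphone 6_1; the remaining microphones are the candidates for the sub microphones.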

Echo received sound signal selecting means (not shown) selects, from among the microphones 6_m, one or more microphones whose time-averaged ratio of echo signal level to speaker received sound signal level is large. The main microphone 6_1 is excluded from this selection (step S2).
As one example of how the selection can be made, as shown in FIG. 3, one microphone is taken as the main microphone 6_1 with its high-sensitivity direction pointed toward the near-end speaker 7, and all the other microphones are taken as sub microphones 6_m (m = 2, ..., M) with their high-sensitivity directions pointed toward the speaker 4. The sensitivity directions of the main microphone 6_1 and the sub microphones 6_m are described in detail later. In this case the near-end speaker 7 speaks toward the main microphone 6_1.
The received sound signals $y_m(t)$ (m = 1, ..., M) picked up by the main microphone 6_1 and the sub microphones 6_m are expressed as in equation (8). The received sound signal $y_m(t)$ of each sub microphone 6_m is input to the corresponding second variable filter 26_m (m = 2, ..., M). The second variable filter 26_m convolves the received sound signal $y_m(t)$ with the second filter coefficients $h_m(t)$, that is, evaluates the following equation (9), to generate the second pseudo echo signal $\beta_m(t)$ (m = 2, ..., M) (step S4).

$$\beta_m(t) = y_m(t) * h_m(t) \tag{9}$$
Each second pseudo echo signal $\beta_m(t)$ is input to the adding unit 36.

Meanwhile, the received signal $x(t)$ is input to the first variable filter 24. The first variable filter 24 convolves the received signal $x(t)$ with the first filter coefficients $h_1(t)$, that is, evaluates the following equation (10), to generate the first pseudo echo signal $\alpha(t)$ (step S4).
$$\alpha(t) = x(t) * h_1(t) \tag{10}$$
The first pseudo echo signal $\alpha(t)$ and the second pseudo echo signals $\beta_m(t)$ are input to the adding unit 36, which computes $\alpha(t) + \sum_{i=2}^{M} \beta_i(t)$. The sum is input to the subtracting unit 38.

Meanwhile, the received sound signal $y_1(t)$ picked up by the main microphone 6_1 is input to the subtracting unit 38. The subtracting unit 38 subtracts the output of the adding unit 36, $\alpha(t) + \sum_{m=2}^{M} \beta_m(t)$, from the received sound signal $y_1(t)$, that is, evaluates the following equation (11), and outputs the transmission signal $e(t)$ (step S6).
$$\begin{aligned} e(t) &= y_1(t) - \alpha(t) - \sum_{i=2}^{M} \beta_i(t) \\ &= y_1(t) - h_1(t) * x(t) - \sum_{i=2}^{M} y_i(t) * h_i(t) \end{aligned} \tag{11}$$
From equation (8) above,

$$y_1(t) = g(t) * r_1(t) * x(t) + r_1(t) * f(x(t)) + c_1(t) * s(t) \tag{12}$$
$$y_i(t) = g(t) * r_i(t) * x(t) + r_i(t) * f(x(t)) + c_i(t) * s(t) \tag{13}$$
so substituting equations (12) and (13) into equation (11) gives the following equation (14).

$$\begin{aligned}
e(t) &= g(t) * r_1(t) * x(t) + r_1(t) * f(x(t)) + c_1(t) * s(t) \\
&\quad - h_1(t) * x(t) - \sum_{i=2}^{M} h_i(t) * \{ g(t) * r_i(t) * x(t) + r_i(t) * f(x(t)) + c_i(t) * s(t) \} \\
&= \Big\{ g(t) * r_1(t) - h_1(t) - \sum_{i=2}^{M} h_i(t) * g(t) * r_i(t) \Big\} * x(t) \\
&\quad + \Big\{ r_1(t) - \sum_{i=2}^{M} h_i(t) * r_i(t) \Big\} * f(x(t)) \\
&\quad + \Big\{ c_1(t) - \sum_{i=2}^{M} h_i(t) * c_i(t) \Big\} * s(t)
\end{aligned} \tag{14}$$
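The computation of the transmission signal in equation (11) can be sketched as below, assuming fixed (non-adapting) FIR filters for clarity. The function names `fir` and `transmission_signal` and the list-of-lists data layout are assumptions made for this illustration.

```python
def fir(h, x, t):
    """Output at time t of FIR filter h applied to x (zero before t = 0)."""
    return sum(h[k] * x[t - k] for k in range(len(h)) if t - k >= 0)


def transmission_signal(y, x, h1, h_sub):
    """Equation (11): e(t) = y_1(t) - h_1 * x(t) - sum_i h_i * y_i(t).

    y[0] is the main microphone signal y_1, y[1:] are the sub microphone
    signals y_2..y_M, and h_sub[j] is the filter applied to y[j + 1].
    """
    e = []
    for t in range(len(x)):
        alpha = fir(h1, x, t)                                     # equation (10)
        beta = sum(fir(h, yi, t) for h, yi in zip(h_sub, y[1:]))  # sum of equation (9)
        e.append(y[0][t] - alpha - beta)
    return e
```

Because the sub microphone signals already contain the nonlinearly distorted speaker output, filtering them can remove echo components that a filter driven only by `x` cannot reproduce.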

The first term of equation (14), $\{ g(t) * r_1(t) - h_1(t) - \sum_{i=2}^{M} h_i(t) * g(t) * r_i(t) \} * x(t)$, is the linear acoustic echo component. If $h_1(t)$ and $h_m(t)$ are set so that $g(t) * r_1(t) - h_1(t) - \sum_{i=2}^{M} h_i(t) * g(t) * r_i(t) = 0$, the linear acoustic echo is canceled.
The second term of equation (14), $\{ r_1(t) - \sum_{i=2}^{M} h_i(t) * r_i(t) \} * f(x(t))$, is the nonlinear acoustic echo component. If $h_m(t)$ is set so that $r_1(t) - \sum_{i=2}^{M} h_i(t) * r_i(t) = 0$, the nonlinear acoustic echo is canceled.
The speaker received sound signal selecting means and the echo received sound signal selecting means need not be provided. In that case, the received sound signal from the main microphone 6_1, which picks up the talker's voice, is input directly to the subtracting unit 38, and the received sound signals from the sub microphones 6_m are input directly to the second variable filters 26_m.
The third term of equation (14), $\{ c_1(t) - \sum_{i=2}^{M} h_i(t) * c_i(t) \} * s(t)$, is the voice component of the near-end speaker 7. In addition to the voice component $c_1(t) * s(t)$ picked up by the main microphone 6_1, it contains the degradation component $\sum_{i=2}^{M} h_i(t) * c_i(t) * s(t)$.
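As a worked special case, the two cancellation conditions can be combined for $M = 2$. The derivation below is an illustration that is not part of the patent text; it assumes time-invariant impulse responses and that a filter $h_2$ satisfying the nonlinear condition exactly exists, which need not hold for a finite-length filter in practice.

```latex
% Special case M = 2 of equation (14); uses commutativity and
% associativity of convolution.
\begin{align*}
\text{nonlinear term:}\quad & r_1(t) - h_2(t) * r_2(t) = 0 \\
\text{linear term:}\quad & g(t) * r_1(t) - h_1(t) - h_2(t) * g(t) * r_2(t) \\
 &= g(t) * \{ r_1(t) - h_2(t) * r_2(t) \} - h_1(t) = -h_1(t) \\
\Rightarrow\quad & h_1(t) = 0, \qquad
 e(t) = \{ c_1(t) - h_2(t) * c_2(t) \} * s(t)
\end{align*}
```

Under these idealized assumptions the sub microphone path alone removes both echo components, and what remains is the near-end voice together with its degradation component.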

If the degradation component of the near-end speaker's voice, $\sum_{i=2}^{M} h_i(t) * c_i(t) * s(t)$, contained in the third term of equation (14) becomes large, the quality of the transmitted voice deteriorates. To prevent this, the echo received sound signal selecting means described above is used. An example follows; for simplicity, the case of one main microphone 6_1 and one sub microphone 6_2 is described.
To prevent degradation of the near-end speaker's voice due to $\{ c_1(t) - \sum_{i=2}^{M} h_i(t) * c_i(t) \} * s(t)$, $h_2(t)$ and $c_2(t)$ must be made small. For this purpose, the speaker received sound signal selecting means and the echo received sound signal selecting means may be provided; for example, the placement of the main microphone 6_1 and the sub microphone 6_2 can be changed. As shown in FIG. 5, unidirectional microphones are used for both. The high-sensitivity side 6_1a of the main microphone 6_1 is pointed toward the near-end speaker 7 and its low-sensitivity side 6_1b toward the speaker 4. The high-sensitivity side 6_2a of the sub microphone 6_2 is pointed toward the speaker 4 and its low-sensitivity side 6_2b toward the near-end speaker 7. With this placement, the amplitude of $c_2(t)$ becomes smaller than that of $c_1(t)$. Furthermore, the amplitude of $r_1(t)$ becomes smaller than that of $r_2(t)$, so the amplitude of the echo-canceling filter $h_2(t)$ also becomes small, because $h_2(t)$ is set so that $r_1(t) - h_2(t) * r_2(t) = 0$ in the second term of equation (14). This placement therefore reduces the degradation component $h_2(t) * c_2(t) * s(t)$ of the near-end speaker's voice.
In Embodiment 1, as shown in FIG. 6, pointing the high-sensitivity side 6_1a of the main microphone 6_1 toward the near-end speaker 7 and the high-sensitivity side 6_ma of each microphone 6_m toward the speaker 4 likewise reduces the degradation component of the near-end speaker's voice.

Next, the method of obtaining the variable filter coefficients $h_1(t)$ and $h_m(t)$ (m = 2, ..., M) for suppressing the echo signal $d(t)$ is described. The filter coefficients $h_1(t)$ are updated by the first filter coefficient updating unit 34, and $h_m(t)$ by the second filter coefficient updating unit 32 (step S8). The coefficients $h_1(t)$ and $h_m(t)$ (m = 2, ..., M) are obtained by updating them sequentially so that the mean square of the echo component contained in the transmission signal $e(t)$ becomes small. The initial values of the filter coefficients are arbitrary and are given in advance.
Typical algorithms for this sequential update are the learning identification (NLMS: Normalized Least Mean Squares) algorithm, the projection algorithm, the recursive least squares (RLS) algorithm, and the LMS (Least Mean Squares) algorithm. Each is described briefly below.

NLMS algorithm
The NLMS algorithm updates the filter coefficients using only the most recently observed sample of the transmission signal $e(t)$ and has a low computational cost. The update by the first filter coefficient updating unit 34 is given by the following equation (15), and the update by the second filter coefficient updating unit 32 by the following equation (16).
$$H_1(t+1) = H_1(t) + a_1 \, \frac{X(t) \, e(t)}{X(t)^T X(t) + \sum_{i=2}^{M} Y_i(t)^T Y_i(t)} \tag{15}$$
$$H_m(t+1) = H_m(t) + a_m \, \frac{Y_m(t) \, e(t)}{X(t)^T X(t) + \sum_{i=2}^{M} Y_i(t)^T Y_i(t)} \tag{16}$$
Here $H_1(t)$ and $H_m(t)$ (m = 2, ..., M) are the filter coefficient vectors at time $t$, $H_m(t) = (h_m(0), h_m(1), \ldots, h_m(L-1))^T$ (m = 1, ..., M), and $L$ is the number of taps. $a_1$ and $a_m$ are preset NLMS step sizes satisfying $0 < a_1, a_m < 2$. $Y_m(t) = (y_m(t-0), y_m(t-1), \ldots, y_m(t-L+1))^T$ is the vector of the latest $L$ samples of the received sound signal $y_m(t)$ at time $t$.
Comparing the denominator on the right side of equation (4) with those of equations (15) and (16), the latter contain the additional term $\sum_{i=2}^{M} Y_i(t)^T Y_i(t)$, which is the sum of the powers of the received sound signals $y_m(t)$ picked up by the microphones 6_m (m = 2, ..., M). Adding this sum of received sound signal powers to the denominator prevents the filter coefficients from diverging.
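One step of the joint update in equations (15) and (16) can be sketched as follows. The function names, the use of a single step size `a_sub` for all second filters (the text allows a separate $a_m$ per filter), and the small constant `eps` guarding the division are assumptions for illustration.

```python
def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def nlms_multichannel_step(H1, H_sub, X, Y_sub, e, a1=0.5, a_sub=0.5, eps=1e-12):
    """One update of equations (15) and (16) with the shared denominator.

    H_sub[j] and Y_sub[j] belong to sub microphone j + 2.
    eps is an assumed safeguard against a zero denominator.
    """
    norm = dot(X, X) + sum(dot(Y, Y) for Y in Y_sub) + eps
    H1_new = [h + a1 * v * e / norm for h, v in zip(H1, X)]
    H_sub_new = [
        [h + a_sub * v * e / norm for h, v in zip(Hm, Ym)]
        for Hm, Ym in zip(H_sub, Y_sub)
    ]
    return H1_new, H_sub_new
```

Sharing one denominator across all filters normalizes the step by the total input power, which matches the text's remark that the added power term keeps the coefficients from diverging.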

LMS algorithm
Like the NLMS algorithm, the LMS algorithm updates the filter coefficients using only the most recently observed sample of the transmission signal $e(t)$ and has a low computational cost. Its update formulas are given by the following equations (17) and (18).
$$H_1(t+1) = H_1(t) + b \cdot X(t) \cdot e(t) \tag{17}$$
$$H_m(t+1) = H_m(t) + b_m \cdot Y_m(t) \cdot e(t) \tag{18}$$
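Equations (17) and (18) translate almost directly into code. In the sketch below, `b` and the list `b_sub` are treated as step sizes by analogy with $a_1$ and $a_m$; the text does not define them explicitly, so that reading, like the function name, is an assumption.

```python
def lms_step(H1, H_sub, X, Y_sub, e, b, b_sub):
    """One update of equations (17) and (18).

    b_sub[j] is the assumed step size b_m for sub microphone j + 2.
    """
    H1_new = [h + b * v * e for h, v in zip(H1, X)]
    H_sub_new = [
        [h + bm * v * e for h, v in zip(Hm, Ym)]
        for Hm, Ym, bm in zip(H_sub, Y_sub, b_sub)
    ]
    return H1_new, H_sub_new
```

Unlike the NLMS sketch, there is no power normalization here, so a stable choice of step size depends on the input signal level.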

Projection algorithm
The projection algorithm updates the filter coefficients using the transmission signal e(t) of the past u samples. Its basic idea is to remove the correlation between adjacent input signal vectors. Compared with the NLMS algorithm described above, the projection algorithm requires more computation, but the transmission signal e(t) converges faster. The first filter coefficient h_1(t) and the second filter coefficients h_m(t) (m = 2, ..., M) are updated by equation (19) below. Note that equation (19) is the update formula for the second-order case.
[Equation (19): provided as an image in the original publication]

Here, the vector u(t) is given by equation (20) below.
[Equation (20): provided as an image in the original publication]

RLS algorithm
The RLS algorithm updates the filter coefficients using all past transmission signals e(t). The transmission signal e(t) converges faster than with the projection algorithm described above, but the amount of computation is larger. The RLS algorithm finds the filter coefficient vector H^(t) that approximates all past input/output relationships in the least-squares sense. The first filter coefficient h_1(t) and the second filter coefficients h_m(t) (m = 2, ..., M) are updated by equation (21) below.
[Equation (21): provided as an image in the original publication]
Here, the vector K(t) is given by equation (22) below.
[Equation (22): provided as an image in the original publication]

Here, P(t) is defined as the inverse of the covariance matrix E[x(t)x(t)^T] of the input signal and is an L × L square matrix, where L is the number of taps. P(t) satisfies equation (23) below.
[Equation (23): provided as an image in the original publication]
Here, λ is a forgetting factor satisfying 0 ≤ λ ≤ 1.
Details of these algorithms are described in Shoji Makino, "Research on Adaptive Signal Processing for Acoustic Echo Cancellers," doctoral dissertation, Tohoku University, 1993.
The first filter coefficient update unit 34 and the second filter coefficient update unit 32 may each use a different update algorithm.
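Equations (21) to (23) are available only as images, so the following sketch uses the widely taught exponentially weighted RLS recursion for a single input vector rather than the patent's exact formulas. The function name, the initial value of P, and the restriction 0 < λ ≤ 1 (λ = 0 would divide by zero here) are assumptions of this sketch.

```python
def rls_step(h, P, x, d, lam=0.99):
    """One textbook RLS update (not the patent's exact equations).

    h   : current coefficient vector (length n).
    P   : current n x n inverse-correlation estimate, as a list of rows.
    x   : current input vector (length n).
    d   : desired sample.
    lam : forgetting factor, assumed to satisfy 0 < lam <= 1.
    """
    n = len(h)
    # Gain vector k = P x / (lam + x^T P x).
    Px = [sum(P[i][j] * x[j] for j in range(n)) for i in range(n)]
    denom = lam + sum(x[i] * Px[i] for i in range(n))
    k = [v / denom for v in Px]
    # A priori error with the coefficients before the update.
    e = d - sum(hi * xi for hi, xi in zip(h, x))
    h_new = [hi + ki * e for hi, ki in zip(h, k)]
    # P update: (P - k x^T P) / lam.
    xP = [sum(x[i] * P[i][j] for i in range(n)) for j in range(n)]
    P_new = [[(P[i][j] - k[i] * xP[j]) / lam for j in range(n)] for i in range(n)]
    return h_new, P_new, e


def initial_P(n, scale=1000.0):
    """Common initialization: a large multiple of the identity (assumed value)."""
    return [[scale if i == j else 0.0 for j in range(n)] for i in range(n)]
```

The per-sample cost grows with the square of the number of taps, which is consistent with the remark that RLS requires more computation than the other algorithms.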

By applying these update algorithms, the first filter coefficient update unit 34 and the second filter coefficient update unit 32 update the first filter coefficient h_1(t) and the second filter coefficients h_m(t) (m = 2, ..., M) until the transmission signal e(t) converges (step S8). The filter coefficients can likewise be updated with these algorithms in the second to fourth embodiments described below.
With the functional configuration of the first embodiment, both the nonlinear acoustic echo that arises when, for example, the speaker 4 has nonlinearity and the linear acoustic echo can be canceled, achieving high echo cancellation performance.

[Experimental results]
This section describes the results of an experiment in which the conventional echo canceling device 14 and the echo canceling device 40 of the second embodiment were applied to an actual hands-free device in order to compare their effects. FIG. 7 shows the arrangement, as used in this experiment, of the echo canceling device 40 described in the first embodiment, the speaker 4, the near-end speaker 7, and so on. One main microphone and one sub microphone were used. The main microphone 61, the sub microphone 62, the speaker 4, and the near-end speaker 7 were arranged so that the straight line connecting the main microphone 61 and the sub microphone 62 is orthogonal to the straight line connecting the speaker 4 and the near-end speaker 7. The distance between the main microphone 61 and the sub microphone 62 was 2 cm, the distance between the speaker 4 and the near-end speaker 7 was 50 cm, the distance between the speaker 4 and the main microphone 61 was 5 cm, and the diameter of the speaker 4 was 4 cm. The speaker characteristic was a sigmoid function simulating clipping in response to excessive input, and the spatial responses c_1(t), c_2(t), r_1(t), and r_2(t) were impulse responses measured in the arrangement of FIG. 7.
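The text does not give the exact sigmoid used for the speaker characteristic. As a minimal illustration of a clipping-like nonlinearity, the sketch below assumes a scaled hyperbolic tangent; the function name and the `limit` parameter are hypothetical.

```python
import math


def clipped_speaker(x, limit=1.0):
    """Soft-clipping loudspeaker model (assumed form, not from the source).

    Nearly linear for small |x| and saturating toward +/-limit for large |x|.
    """
    return limit * math.tanh(x / limit)
```

Small inputs pass almost unchanged, while large inputs are compressed toward the limit, which is the kind of distortion a purely linear filter driven by x(t) cannot model.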

The high-sensitivity portion 61a of the main microphone 61 was directed toward the near-end speaker, and the high-sensitivity portion 62a of the sub microphone 62 was directed toward the speaker 4. The sampling frequency was 16 kHz, the reverberation time was 200 ms, the first variable filter 24 and the second variable filter 26 each had 512 taps, and the NLMS algorithm was used as the adaptive algorithm.
FIGS. 8 and 9 show the experimental results. FIG. 8 is a graph of the amount of echo cancellation when white noise is input as the received signal x(t); a larger value indicates that more echo is canceled. The horizontal axis is time (s), and the vertical axis is the amount of echo cancellation at that time. The thin line shows the amount of echo cancellation achieved by the conventional echo canceling device 14, and the thick line shows that achieved by the echo canceling device 40 described in the first embodiment.
With the conventional echo canceling device 14, the amount of echo cancellation reaches a maximum of only about 20 dB. This is because the conventional echo canceling device 14 cannot cancel the nonlinear acoustic echo caused by the nonlinearity of the speaker 4.
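To give a feel for the figures quoted in this experiment, the sketch below computes an echo cancellation amount in dB and the time span covered by a filter. The power-ratio definition used here is an assumption, since the text does not state how the plotted quantity is defined, and both function names are hypothetical.

```python
import math


def echo_cancellation_db(echo, residual):
    """Power ratio of the echo before and after cancellation, in dB (assumed definition)."""
    p_echo = sum(v * v for v in echo)
    p_residual = sum(v * v for v in residual)
    return 10.0 * math.log10(p_echo / p_residual)


def filter_span_ms(taps, sampling_hz):
    """Length of time covered by an FIR filter with the given number of taps."""
    return 1000.0 * taps / sampling_hz
```

Under this definition, about 20 dB corresponds to a residual whose amplitude is one tenth of the echo, and about 40 dB to one hundredth. With 512 taps at 16 kHz, each filter spans 32 ms.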

In contrast, with the echo canceling device 40 described in the first embodiment, the amount of echo cancellation is about 40 dB. This is because the echo canceling device 40 described in the second embodiment can also cancel the nonlinear acoustic echo.
FIG. 9 is a graph showing the impulse response for the near-end speaker 7. The horizontal axis is frequency (Hz), and the vertical axis is the impulse response c_1(t) from the near-end speaker 7 to the main microphone 61. The thin line shows the response before processing by the echo canceling device 40, and the thick line shows the response after processing. Comparing the impulse response c_1(t) before and after processing in the graph of FIG. 9 shows almost no difference. It follows that in the first embodiment the degradation of the voice of the near-end speaker 7 is small.
As described above, according to the first embodiment, both the linear acoustic echo and the nonlinear acoustic echo caused by the nonlinearity of the speaker 4 can be canceled, realizing an echo cancellation method with high cancellation performance. Furthermore, high-quality sound pickup with little degradation of the voice of the near-end speaker 7 can be realized.

In the first embodiment, it was explained that degradation of the voice of the near-end speaker 7 can be reduced by directing the high-sensitivity portion 61a of the main microphone 61 toward the near-end speaker 7 and its low-sensitivity portion toward the speaker 4, and by directing the high-sensitivity portion 62a of the sub microphone 62 toward the speaker 4 and its low-sensitivity portion toward the near-end speaker 7. The second embodiment is a functional configuration example in which the speaker received sound signal selection means is a main beamformer and the echo received sound signal selection means is a sub-beamformer. In the second embodiment, the main beamformer and the sub-beamformer are used to reduce degradation of the voice of the near-end speaker 7.
FIG. 10 shows a functional configuration example of the second embodiment. Compared with the echo canceling device 40 described in the first embodiment, the echo canceling device 40 of the second embodiment additionally includes the sub-beamformer 50 and the main beamformer 52. In this embodiment, the microphones 6m are not divided into a main microphone (speaker sound collecting means) and sub microphones; all the microphones 6m operate in parallel.

The sub-beamformer 50 consists of sub fixed filters 50m (m = 1, ..., M) and a sub adder 500, and the main beamformer 52 consists of main fixed filters 52m (m = 1, ..., M) and a main adder 520.
The main beamformer 52 has high sensitivity in the direction of the near-end speaker 7 and low sensitivity toward the speaker 4. Conversely, the sub-beamformer 50 has high sensitivity toward the speaker 4 and low sensitivity toward the near-end speaker 7. Using the main beamformer 52 and the sub-beamformer 50 makes it possible to form regions of high and low directivity in arbitrary directions, so the configuration can be applied to a variety of speaker and microphone arrangements.
For simplicity, the second embodiment is described in the frequency domain ω. Let C_m(ω) be the transfer function from the near-end speaker to the microphone 6m, R_m(ω) the transfer function from the speaker 4 to the microphone 6m, P_m(ω) the fixed filter coefficient of the m-th channel main fixed filter 52m in the main beamformer 52, and Q_m(ω) the fixed filter coefficient of the sub fixed filter 50m in the sub-beamformer 50.
The fixed filter coefficients of the sub fixed filters 50m and the main fixed filters 52m are fixed at values given in advance. The design of the fixed filter coefficients is described below.

The main beamformer 52 is required to pick up the voice s(ω) of the near-end speaker 7 and to suppress the reproduced sound from the speaker 4. These conditions are expressed by equations (24) and (25) below.
$$\sum_{i=1}^{M} C_i'(\omega) \, P_i(\omega) = D(\omega) \tag{24}$$
$$\sum_{i=1}^{M} R_i'(\omega) \, P_i(\omega) = 0 \tag{25}$$
Here, D(ω) is the target impulse response. The target impulse response is, for example, one whose amplitude is a fixed value and whose phase is linear (a fixed delay in the time domain). C_m'(ω) and R_m'(ω) are set in advance to the theoretical direct-wave responses calculated from the arrangement of the microphones 6m, the speaker 4, and the near-end speaker 7. The same applies to equations (26) and (27) below. It suffices to set fixed filter coefficients P_m(ω) that satisfy equations (24) and (25). Next, the sub-beamformer 50 is required to suppress the voice s(ω) of the near-end speaker 7 and to pick up the reproduced sound from the speaker 4. These conditions are expressed by equations (26) and (27) below.
$$\sum_{i=1}^{M} C_i'(\omega) \, Q_i(\omega) = 0 \tag{26}$$
$$\sum_{i=1}^{M} R_i'(\omega) \, Q_i(\omega) = K(\omega) \tag{27}$$
Here, K(ω) is the target impulse response. The target impulse response is, for example, one whose amplitude is a fixed value and whose phase is linear (a fixed delay in the time domain).
The other processing is the same as in the first embodiment.
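For M = 2, conditions (24) to (27) reduce at each frequency to two 2 × 2 complex linear systems, which can be solved directly. The sketch below is one possible illustration, not the patent's design procedure: the free-field direct-wave model, the sound speed, the distances, and all function names are assumptions introduced for the example.

```python
import cmath
import math


def direct_wave(distance_m, omega, sound_speed=340.0):
    """Assumed free-field direct-wave response: 1/d attenuation and a pure delay."""
    return cmath.exp(-1j * omega * distance_m / sound_speed) / distance_m


def solve_2x2(a11, a12, a21, a22, b1, b2):
    """Solve [[a11, a12], [a21, a22]] [u1, u2]^T = [b1, b2]^T by Cramer's rule."""
    det = a11 * a22 - a12 * a21
    if abs(det) < 1e-12:
        raise ValueError("responses are not separable at this frequency")
    return (b1 * a22 - a12 * b2) / det, (a11 * b2 - a21 * b1) / det


def design_fixed_filters(C, R, D, K):
    """Return (P, Q) for M = 2 at one frequency.

    P satisfies equations (24) and (25); Q satisfies equations (26) and (27).
    """
    P = solve_2x2(C[0], C[1], R[0], R[1], D, 0.0)
    Q = solve_2x2(C[0], C[1], R[0], R[1], 0.0, K)
    return P, Q


# Hypothetical geometry: talker at 0.50 m and 0.52 m from the two microphones,
# loudspeaker at 0.05 m and 0.07 m.
omega = 2.0 * math.pi * 1000.0
C = [direct_wave(0.50, omega), direct_wave(0.52, omega)]
R = [direct_wave(0.05, omega), direct_wave(0.07, omega)]
target = cmath.exp(-1j * omega * 0.001)  # unit amplitude, 1 ms fixed delay
P, Q = design_fixed_filters(C, R, target, target)
```

Repeating the solve over a grid of frequencies would give the frequency responses of the fixed filters. For M greater than 2 the systems are underdetermined, and some additional criterion, not specified in the text, would be needed to pick a solution.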

When the main beamformer 52 and the sub-beamformer 50 are set as described above, then for any arrangement of the microphones 6m and the speaker 4, the main beamformer 52 has high sensitivity in the direction of the near-end speaker 7 and low sensitivity toward the speaker 4, while the sub-beamformer 50 has high sensitivity toward the speaker 4 and low sensitivity toward the near-end speaker 7. Degradation of the near-end speaker's voice can thus be prevented.

The functional configuration example of this embodiment is obtained by adding a reception detection unit 60 and switches 62m (m = 2, ..., M) to the echo canceling device 40 described in the first embodiment. Each switch 62m is connected to the corresponding second variable filter 26m.
FIG. 11 shows a functional configuration example of the third embodiment. The audio signal j(t) from the communication network 2 is input to the reception detection unit 60 as well as to the first variable filter 24. The reception detection unit 60 observes the level of the audio signal j(t) and detects the sections in which the received signal x(t) is present. For example, it compares a threshold given in advance with the level (power) of the audio signal j(t), and if the level of the audio signal j(t) is larger, it judges that the section contains the received signal x(t).
In sections that contain the received signal x(t), the reception detection unit 60 turns the switches 62m on and connects each second variable filter 26m to the adder 36. In sections that do not contain the received signal x(t), the reception detection unit 60 turns the switches 62m off and disconnects each second variable filter 26m from the adder 36.
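The threshold test and switching described above can be sketched as follows. Frame-based processing, mean-square power as the level measure, and all names are assumptions of this illustration. Forcing the gated outputs to zero mirrors the behavior of disconnecting the second variable filters from the adder.

```python
def frame_power(frame):
    """Mean-square level of one frame of the audio signal j(t)."""
    return sum(s * s for s in frame) / len(frame)


def received_signal_present(j_frame, threshold):
    """True when the level of j(t) exceeds the preset threshold."""
    return frame_power(j_frame) > threshold


def gate_second_pseudo_echo(q_values, j_frame, threshold):
    """Pass the second pseudo echo samples when the switches are on, else zeros.

    q_values : current outputs of the second variable filters (hypothetical name).
    """
    if received_signal_present(j_frame, threshold):
        return list(q_values)
    return [0.0] * len(q_values)
```

For example, `gate_second_pseudo_echo([0.2, -0.1], quiet_frame, 0.01)` returns zeros whenever the power of `quiet_frame` is at or below 0.01, so in that case nothing from the sub microphone path is subtracted from the main microphone signal.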

As a result of this processing, in sections where the near-end speaker 7 is speaking and the audio signal j(t) does not contain the received signal x(t), the switches 62m are off, so the second pseudo echo signal q(t) from the second variable filter 26 is not output to the adder 36. In other words, the near-end speaker's voice arriving via the sub microphone 6m is blocked, and the near-end speaker's voice in the transmission signal e(t) is no longer degraded.
In sections where the audio signal j(t) contains the received signal x(t), the switches 62m are on, so the echo signal d(t) is canceled as described in the second embodiment.
In so-called double-talk sections, where the received signal x(t) is present and the near-end speaker 7 is also speaking, the switches 62m are on and the echo signal d(t) is canceled. In double-talk sections the voice of the near-end speaker 7 is degraded. During double talk, however, both the voice signal s(t) of the near-end speaker 7 and the received signal x(t) are heard, so the auditory masking effect makes the degradation hard to hear and it is perceived less.

The functional configuration example of the fourth embodiment, shown in FIG. 12, is obtained by adding the reception detection unit 60 described in the third embodiment to the echo canceling device 40 described in the second embodiment. As in the third embodiment, the received signal x(ω) is input to the reception detection unit 60. In sections where the audio signal j(ω) from the communication network 2 contains the received signal x(ω), the switch 62 is on and the echo signal d(ω) is canceled. In sections where the audio signal j(ω) from the communication network 2 does not contain the received signal x(ω), the switch 62 is off and the voice of the near-end speaker 7 is not degraded.
The echo canceling device of the present invention is not limited to the embodiments described above and may be modified as appropriate without departing from the spirit of the invention. The processes described for the echo canceling device need not be executed in time series in the order described; they may be executed in parallel or individually, according to the processing capability of the device that executes them or as needed.
When the processing in the echo canceling device of the present invention is implemented by a computer, the processing content of the functions that the echo canceling device should have is described by a program. Executing this program on a computer realizes the processing functions of the echo canceling device on the computer.

The program describing this processing content can be recorded on a computer-readable recording medium. Any computer-readable recording medium may be used, such as a magnetic recording device, an optical disc, a magneto-optical recording medium, or a semiconductor memory. Specifically, for example, a hard disk device, a flexible disk, or magnetic tape can be used as the magnetic recording device; a DVD (Digital Versatile Disc), DVD-RAM (Random Access Memory), CD-ROM (Compact Disc Read Only Memory), or CD-R (Recordable)/RW (ReWritable) as the optical disc; an MO (Magneto-Optical disc) as the magneto-optical recording medium; and an EEP-ROM (Electronically Erasable and Programmable Read Only Memory) as the semiconductor memory.
The program is distributed, for example, by selling, transferring, or lending a portable recording medium such as a DVD or CD-ROM on which the program is recorded. Alternatively, the program may be stored in a storage device of a server computer and distributed by transferring it from the server computer to other computers via a network.

A computer that executes such a program first stores, for example, the program recorded on a portable recording medium or the program transferred from a server computer in its own storage device. When executing the processing, the computer reads the program stored in its own recording medium and executes processing according to the read program. As another mode of execution, the computer may read the program directly from the portable recording medium and execute processing according to it, or it may execute processing according to the received program each time a program is transferred to it from the server computer. The processing described above may also be executed by a so-called ASP (Application Service Provider) type service, in which no program is transferred from the server computer to the computer and the processing functions are realized only through execution instructions and the acquisition of results. The program in this embodiment includes information that is provided for processing by a computer and is equivalent to a program (such as data that is not a direct command to the computer but has the property of defining the computer's processing).
In this embodiment, the echo canceling device is configured by executing a predetermined program on a computer, but at least part of the processing content may be implemented in hardware.

FIG. 1: Block diagram showing a functional configuration example of a prior-art system.
FIG. 2: Diagram showing the flow of the received signal x(t) in the speaker 4 in the prior art and in the present invention.
FIG. 3: Block diagram showing a functional configuration example of the system of the first embodiment of the present invention.
FIG. 4: Flowchart showing the main processing flow of the first embodiment of the present invention.
FIG. 5: Diagram showing the arrangement of the main microphone 61 and the sub microphone 62 for suppressing degradation of the voice of the near-end speaker 7 in the first or third embodiment of the present invention.
FIG. 6: Block diagram showing a functional configuration example when the main microphone 61 and the sub microphone 62 used in the present invention are arranged as shown in FIG. 5.
FIG. 7: Experimental layout for showing the difference in effect between the conventional echo canceling device 14 and the echo canceling device 40 of the present invention.
FIG. 8: Graph showing the echo cancellation performance of the conventional echo canceling device 14 and the echo canceling device 40 of the present invention.
FIG. 9: Diagram showing the impulse response c_m(t) for the near-end speaker 7 before and after processing by the echo canceling device 40 of the present invention.
FIG. 10: Block diagram showing a functional configuration example of the system of the second embodiment of the present invention.
FIG. 11: Block diagram showing a functional configuration example of the system of the third embodiment of the present invention.
FIG. 12: Block diagram showing a functional configuration example of the system of the fourth embodiment of the present invention.

Claims (12)

An echo canceling device in which a received signal is converted into reproduced sound and emitted by reproduction means, and which removes an echo signal from a received sound signal and outputs the result as a transmission signal, the received sound signal consisting of a signal obtained when near-end speaker voice is received by speaker sound collecting means (hereinafter referred to as a speaker received sound signal) and a signal obtained when the reproduced sound is received by the speaker sound collecting means (hereinafter referred to as an echo signal), the echo canceling device comprising:
a first variable filter that receives the received signal and generates a first pseudo echo signal;
a second variable filter that receives the received signal as picked up by one or more echo sound collecting means and generates a second pseudo echo signal;
a subtracting unit that subtracts the first pseudo echo signal and the second pseudo echo signal from the received sound signal at the speaker sound collecting means and outputs the transmission signal;
a first filter coefficient update unit that receives at least the received signal and the transmission signal and updates the filter coefficients of the first variable filter;
a second filter coefficient update unit that receives at least the received sound signal at the echo sound collecting means and the transmission signal and updates the filter coefficients of the second variable filter; and
a reception detection unit that detects the level of the received signal and, when the detected level is smaller than a predetermined threshold, does not operate the second variable filter and sets the output of the second variable filter to 0.
The echo canceling device according to claim 1, further comprising:
speaker received sound signal selection means that selects, from among a plurality of the sound collecting means, a received sound signal whose time-averaged ratio of echo signal level to speaker sound signal level is smaller than that of the other received sound signals, and inputs it to the subtracting unit; and
echo received sound signal selection means that selects, from among the plurality of the sound collecting means, a received sound signal whose time-averaged ratio of echo signal level to speaker sound signal level is larger than that of the other received sound signals, and inputs it to the second variable filter.
The echo canceling device according to claim 2, wherein
the speaker received sound signal selection means is means for obtaining a collected sound signal from one sound collecting means (hereinafter referred to as main sound collecting means) that has high sensitivity in the direction of the speaker and serves as the speaker sound collecting means among the plurality of sound collecting means, and
the echo received sound signal selection means is means for obtaining a collected sound signal from a sound collecting means that has high sensitivity in the direction of the reproduction means and is other than the main sound collecting means.
The echo canceling device according to claim 2, wherein
the speaker received sound signal selection means is a main beamformer that receives the received sound signals from all of the plurality of sound collecting means and suppresses the component of the echo signal in the received sound signals, and
the echo received sound signal selection means is a sub-beamformer that receives the received sound signals from all of the plurality of sound collecting means and suppresses the component of the speaker received sound signal in the received sound signals.
The echo canceling device according to any one of claims 1 to 4, wherein
the first filter coefficient updating unit and the second filter coefficient updating unit update their filter coefficients successively using the learning identification (NLMS: Normalized Least-Mean-Squares) algorithm, a projection algorithm, the recursive least squares (RLS: Recursive Least Squares) algorithm, or the LMS (Least Mean Square) algorithm.
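Of the algorithms listed in this claim, NLMS is the simplest to show in a few lines. The sketch below is a textbook NLMS step, not code from the patent; the function name `nlms_update`, the step size `step`, and the regularization constant `eps` are assumptions chosen for illustration.

```python
def nlms_update(weights, reference, error, step=0.5, eps=1e-8):
    """Return FIR coefficients after one NLMS update.

    weights   : current filter coefficients (list of floats)
    reference : most recent filter input samples, newest first, same length
    error     : residual sample left after echo subtraction
    step      : step size (assumed value; typically between 0 and 2)
    eps       : small constant that avoids division by zero (assumed value)
    """
    # Normalize the correction by the input power in the filter window.
    power = sum(x * x for x in reference) + eps
    gain = step * error / power
    return [w + gain * x for w, x in zip(weights, reference)]


# One update from an all-zero filter with a unit input sample and unit error.
updated = nlms_update([0.0, 0.0], [1.0, 0.0], error=1.0)
```

Plain LMS would use the same correction without dividing by `power`, which makes its convergence depend on the input level.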
An echo canceling method in which reproducing means converts a received signal into reproduced sound and emits it, and in which an echo signal is removed from a received sound signal and the result is output as a transmission signal, the received sound signal consisting of a signal obtained when near-end speaker speech is picked up by speaker sound collecting means (hereinafter referred to as the speaker received signal) and a signal obtained when the reproduced sound is picked up by the speaker sound collecting means (hereinafter referred to as the echo signal), the method comprising:
a first pseudo echo signal generating step in which a first variable filter generates a first pseudo echo signal from the received signal;
a second pseudo echo signal generating step in which a second variable filter generates a second pseudo echo signal from the received signal as picked up by one or more echo sound collecting means;
a subtracting step in which subtracting means subtracts the first pseudo echo signal and the second pseudo echo signal from the received sound signal from the speaker sound collecting means and outputs the transmission signal;
a first filter coefficient updating step in which first filter coefficient updating means updates the filter coefficients of the first variable filter from at least the received signal and the transmission signal;
a second filter coefficient updating step in which second filter coefficient updating means updates the filter coefficients of the second variable filter from at least the received sound signal at the echo sound collecting means and the transmission signal; and
a reception detecting step in which reception detecting means detects the level of the received signal and, when the detected level is smaller than a predetermined threshold, does not operate the second variable filter and sets the output of the second variable filter to 0.
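The steps of this method claim can be sketched as a per-sample loop. This is a minimal illustration under stated assumptions, not the patented implementation: both filters are short FIR filters updated with NLMS, the received signal level is measured as the mean absolute value over the filter window, and the second filter's update is also skipped while it is not operated. The name `cancel_echo` and the parameters `taps`, `step`, `threshold`, and `eps` are hypothetical.

```python
def cancel_echo(received, main_mic, echo_mic, taps=4, step=0.5,
                threshold=1e-3, eps=1e-8):
    """Return (transmission signal, first filter, second filter)."""
    h1 = [0.0] * taps      # first variable filter, driven by the received signal
    h2 = [0.0] * taps      # second variable filter, driven by the echo microphone
    x_buf = [0.0] * taps   # recent received-signal samples, newest first
    u_buf = [0.0] * taps   # recent echo-microphone samples, newest first
    sent = []
    for x, d, u in zip(received, main_mic, echo_mic):
        x_buf = [x] + x_buf[:-1]
        u_buf = [u] + u_buf[:-1]

        # Reception detection (assumed level measure: mean absolute value).
        level = sum(abs(v) for v in x_buf) / taps
        active = level >= threshold

        # First and second pseudo echo signals.
        y1 = sum(w * v for w, v in zip(h1, x_buf))
        y2 = sum(w * v for w, v in zip(h2, u_buf)) if active else 0.0

        # Subtraction yields the transmission signal sample.
        e = d - y1 - y2
        sent.append(e)

        # Coefficient updates (NLMS chosen here as one of the claimed options).
        p1 = sum(v * v for v in x_buf) + eps
        h1 = [w + step * e * v / p1 for w, v in zip(h1, x_buf)]
        if active:
            p2 = sum(v * v for v in u_buf) + eps
            h2 = [w + step * e * v / p2 for w, v in zip(h2, u_buf)]
    return sent, h1, h2
```

When the received signal is silent, the second filter contributes nothing, so near-end speech picked up by the echo microphone is not subtracted from the transmission signal.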
The echo canceling method according to claim 6, further comprising:
a speaker received signal selecting step in which speaker received signal selection means selects, from among the plurality of sound collecting means, a received sound signal whose time-averaged ratio of echo signal level to speaker sound signal level is smaller than that of the other received sound signals; and
an echo received signal selecting step in which echo received signal selection means selects, from among the plurality of sound collecting means, a received sound signal whose time-averaged ratio of echo signal level to speaker sound signal level is larger than that of the other received sound signals.
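The selection rule in this claim can be illustrated as follows. The sketch assumes that per-frame echo and speaker sound level estimates are already available for every microphone; how those levels are measured is not stated in the claim, and the function name `select_channels` and the constant `eps` are hypothetical.

```python
def select_channels(echo_levels, speech_levels, eps=1e-12):
    """Pick the microphones with the lowest and highest echo-to-speech ratio.

    echo_levels[i][t]   : echo signal level at microphone i in frame t
    speech_levels[i][t] : speaker sound signal level at microphone i in frame t
    Returns (speaker_index, echo_index, time_averaged_ratios).
    """
    ratios = []
    for echo_ch, speech_ch in zip(echo_levels, speech_levels):
        # Per-frame ratio, then its average over time.
        per_frame = [e / (s + eps) for e, s in zip(echo_ch, speech_ch)]
        ratios.append(sum(per_frame) / len(per_frame))
    indices = range(len(ratios))
    speaker_index = min(indices, key=ratios.__getitem__)  # smallest ratio
    echo_index = max(indices, key=ratios.__getitem__)     # largest ratio
    return speaker_index, echo_index, ratios


# Three microphones, two frames each (made-up level values).
echo_levels = [[0.1, 0.2], [0.8, 0.9], [0.4, 0.4]]
speech_levels = [[1.0, 1.0], [0.5, 0.5], [1.0, 1.0]]
speaker_index, echo_index, ratios = select_channels(echo_levels, speech_levels)
```

The microphone at `speaker_index` would supply the signal for the subtracting step, and the one at `echo_index` would supply the input of the second variable filter.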
The echo canceling method according to claim 7, wherein
the speaker received signal selecting step is a step in which the speaker received signal selection means obtains a collected sound signal from one sound collecting means (hereinafter referred to as the main sound collecting means) that has high sensitivity in the direction of the speaker and serves as the speaker sound collecting means among the plurality of sound collecting means, and
the echo received signal selecting step is a step in which the echo received signal selection means obtains a collected sound signal from a sound collecting means that has high sensitivity in the direction of the reproducing means and is other than the main sound collecting means.
The echo canceling method according to claim 7, wherein
the speaker received signal selecting step is a step in which a main beamformer suppresses the echo signal component in the received sound signals from all of the plurality of sound collecting means, and
the echo received signal selecting step is a step in which a sub-beamformer suppresses the speaker received signal component in the received sound signals from all of the plurality of sound collecting means.
The echo canceling method according to any one of claims 6 to 9, wherein
the filter coefficients in the first filter coefficient updating step and the second filter coefficient updating step are updated successively using the learning identification (NLMS: Normalized Least-Mean-Squares) algorithm, a projection algorithm, the recursive least squares (RLS: Recursive Least Squares) algorithm, or the LMS (Least Mean Square) algorithm.
An echo canceling program for causing a computer to execute each step of the echo canceling method according to any one of claims 6 to 10.
A computer-readable recording medium on which the echo canceling program according to claim 11 is recorded.
JP2006232519A 2006-08-29 2006-08-29 Echo canceling apparatus, method thereof, program thereof, and recording medium thereof Active JP4709714B2 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP2006232519A JP4709714B2 (en) 2006-08-29 2006-08-29 Echo canceling apparatus, method thereof, program thereof, and recording medium thereof

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
JP2006232519A JP4709714B2 (en) 2006-08-29 2006-08-29 Echo canceling apparatus, method thereof, program thereof, and recording medium thereof

Related Child Applications (1)

Application Number Title Priority Date Filing Date
JP2011029507A Division JP2011160429A (en) 2011-02-15 2011-02-15 Echo elimination device

Publications (2)

Publication Number Publication Date
JP2008060715A JP2008060715A (en) 2008-03-13
JP4709714B2 JP4709714B2 (en) 2011-06-22

Family

ID=39243000

Family Applications (1)

Application Number Title Priority Date Filing Date
JP2006232519A Active JP4709714B2 (en) 2006-08-29 2006-08-29 Echo canceling apparatus, method thereof, program thereof, and recording medium thereof

Country Status (1)

Country Link
JP (1) JP4709714B2 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20220053268A1 (en) * 2020-08-12 2022-02-17 Auzdsp Co., Ltd. Adaptive delay diversity filter and echo cancellation apparatus and method using the same

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8693698B2 (en) * 2008-04-30 2014-04-08 Qualcomm Incorporated Method and apparatus to reduce non-linear distortion in mobile computing devices
JP5406966B2 (en) * 2012-08-15 2014-02-05 日本電信電話株式会社 Echo canceling device, echo canceling method, echo canceling program
CN103051818B (en) * 2012-12-20 2014-10-29 歌尔声学股份有限公司 Device and method for cancelling echoes in miniature hands-free voice communication system

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPS61502581A (en) * 1984-12-14 1986-11-06 モトロ−ラ・インコ−ポレ−テッド Full duplex speaker horn for radio and landline telephones
JPH04290320A (en) * 1991-03-19 1992-10-14 Fujitsu Ltd Echo canceller
JPH0775002A (en) * 1993-06-21 1995-03-17 Canon Inc Recorder
JP2003060530A (en) * 2001-08-13 2003-02-28 Fujitsu Ltd Echo suppression processing system
JP2003273782A (en) * 2002-03-14 2003-09-26 Osaka Industrial Promotion Organization Speech processor, computer program, and recording medium

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20220053268A1 (en) * 2020-08-12 2022-02-17 Auzdsp Co., Ltd. Adaptive delay diversity filter and echo cancellation apparatus and method using the same
US11843925B2 (en) * 2020-08-12 2023-12-12 Auzdsp Co., Ltd. Adaptive delay diversity filter and echo cancellation apparatus and method using the same

Also Published As

Publication number Publication date
JP2008060715A (en) 2008-03-13

Similar Documents

Publication Publication Date Title
JP5075042B2 (en) Echo canceling apparatus, echo canceling method, program thereof, and recording medium
US8842851B2 (en) Audio source localization system and method
US11297178B2 (en) Method, apparatus, and computer-readable media utilizing residual echo estimate information to derive secondary echo reduction parameters
JP5391103B2 (en) Multi-channel echo canceling method, multi-channel echo canceling apparatus, multi-channel echo canceling program and recording medium therefor
WO2019098178A1 (en) Voice communication device, voice communication method, and program
CN112863532A (en) Echo suppressing device, echo suppressing method, and storage medium
JP4709714B2 (en) Echo canceling apparatus, method thereof, program thereof, and recording medium thereof
JP5469564B2 (en) Multi-channel echo cancellation method, multi-channel echo cancellation apparatus and program thereof
JP2003188776A (en) Acoustic echo erasing method and device, and acoustic echo erasure program
JP2011160429A (en) Echo elimination device
JP2003250193A (en) Echo elimination method, device for executing the method, program and recording medium therefor
JP2006067127A (en) Method and apparatus of reducing reverberation
JP6537997B2 (en) Echo suppressor, method thereof, program, and recording medium
CN118486317A (en) Nonlinear echo suppression method and device, electronic equipment and storage medium
JP4543896B2 (en) Echo cancellation method, echo canceller, and telephone repeater
JP3403655B2 (en) Method and apparatus for identifying unknown system using subband adaptive filter
JP6356087B2 (en) Echo canceling apparatus, method and program
JP3583998B2 (en) Multi-channel echo canceling method, apparatus therefor, and program recording medium
JP6180689B1 (en) Echo canceller apparatus, echo cancellation method, and echo cancellation program
JP5925149B2 (en) Acoustic coupling amount estimating apparatus, echo canceling apparatus, method and program thereof
CN116013345A (en) Echo cancellation method and electronic equipment
JP5086969B2 (en) Echo canceling apparatus, method thereof, program thereof, and recording medium thereof
JP5264687B2 (en) Echo canceling method, echo canceling device, echo canceling program
JP2006135886A (en) Echo erasing method, echo erasing apparatus, echo erasing program, and recording medium with the program recorded thereon
JP2002252577A (en) Method and system for canceling multichannel acoustic echo, its program and its recording medium

Legal Events

Date Code Title Description
A621 Written request for application examination

Free format text: JAPANESE INTERMEDIATE CODE: A621

Effective date: 20080804

A977 Report on retrieval

Free format text: JAPANESE INTERMEDIATE CODE: A971007

Effective date: 20101215

A131 Notification of reasons for refusal

Free format text: JAPANESE INTERMEDIATE CODE: A131

Effective date: 20101221

A521 Written amendment

Free format text: JAPANESE INTERMEDIATE CODE: A523

Effective date: 20110215

TRDD Decision of grant or rejection written
A01 Written decision to grant a patent or to grant a registration (utility model)

Free format text: JAPANESE INTERMEDIATE CODE: A01

Effective date: 20110308

A61 First payment of annual fees (during grant procedure)

Free format text: JAPANESE INTERMEDIATE CODE: A61

Effective date: 20110318

R150 Certificate of patent or registration of utility model

Ref document number: 4709714

Country of ref document: JP

Free format text: JAPANESE INTERMEDIATE CODE: R150

S531 Written request for registration of change of domicile

Free format text: JAPANESE INTERMEDIATE CODE: R313531

R350 Written notification of registration of transfer

Free format text: JAPANESE INTERMEDIATE CODE: R350