JP3975153B2 - Blind signal separation method and apparatus, blind signal separation program and recording medium recording the program
- Publication number
- JP3975153B2 (JP2002312204A)
- Authority
- JP
- Japan
- Prior art keywords
- signal
- permutation
- frequencies
- frequency
- short
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Fee Related
Landscapes
- Measurement Of Velocity Or Position Using Acoustic Or Ultrasonic Waves (AREA)
Description
[0001]
[Technical field to which the invention belongs]
The present invention belongs to the field of signal processing and relates to signal separation: recovering source signals as accurately as possible from a mixture of several signals that have been mixed in space.
The technique makes it possible to extract a target signal accurately in real environments where various interfering signals are present. One application to sound signals is a source separation system that serves as the front end of a speech recognizer. Even when the speaker is far from the microphone and the microphone also picks up sounds other than the speaker's voice, such a system can extract only the speaker's voice so that the speech is recognized correctly.
[0002]
[Prior art]
[Blind signal separation]
First, the blind signal separation is formulated.
Suppose that N signals are mixed and observed by M sensors (M ≥ N). The present invention deals with the situation in which each signal is attenuated and delayed according to the distance from its source to the sensor, and in which reflections from walls and other surfaces cause reverberation. Mixing in this situation is a convolutive mixture, given by [Equation 1], of the source signals s_p(t) (t: time, 1 ≤ p ≤ N) with the impulse responses h_qp(k) from source p to sensor q, which yields the observed signals x_q(t) (1 ≤ q ≤ M).
The goal of blind signal separation is to obtain, from the observed signals x_q(t) alone and without knowing the source signals s_p(t) or the impulse responses h_qp(k), the coefficients w_rq(k) of the FIR (finite impulse response) separation filters and the separated signals y_r(t) given by [Equation 2].
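The images for [Equation 1] and [Equation 2] are not reproduced in this text. Under the definitions above, a convolutive mixture and an FIR separation stage would take the following standard form. This is a sketch consistent with the surrounding symbols, not a copy of the original equations; leaving the range of the lag index k unspecified is a choice made here.

```latex
% Convolutive mixing: source p reaches sensor q through impulse response h_{qp}
x_q(t) = \sum_{p=1}^{N} \sum_{k} h_{qp}(k)\, s_p(t-k), \qquad q = 1, \dots, M

% FIR separation: separated signal r is a filtered sum of the observations
y_r(t) = \sum_{q=1}^{M} \sum_{k} w_{rq}(k)\, x_q(t-k), \qquad r = 1, \dots, N
```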
FIG. 1 is a diagram for explaining an outline of blind signal separation when N = M = 2.
In general the source signals s_p(t) are mutually independent, so the separation filter coefficients w_rq(k) can be computed with independent component analysis (ICA). There are various ICA-based separation methods, but frequency-domain methods are effective for dealing with reverberation, because the convolutive mixing problem above can be replaced by an instantaneous mixing problem at each frequency.
[0003]
[Blind signal separation in the frequency domain]
FIG. 2 shows the configuration of a blind signal separation device using independent component analysis in the frequency domain.
In the frequency-domain approach, the filter coefficients w_rq(k) are not computed directly; instead their frequency responses W_rq(f) are computed with ICA. To that end, a short-time discrete Fourier transform is first applied to the observed signal x_q(t) at sensor q to obtain X_q(f, m), where f is the frequency and m is the frame number.
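As a minimal sketch of this step, the short-time discrete Fourier transform of one sensor signal can be computed with NumPy as follows. The frame length, hop size and Hann window are assumptions chosen for illustration; the source does not specify them, and the function name `stft` is hypothetical.

```python
import numpy as np

def stft(x, frame_len=512, hop=256):
    """Short-time DFT of a real 1-D signal.

    Returns X[f, m]: row f is the frequency bin, column m is the frame number.
    frame_len, hop and the Hann window are illustrative assumptions.
    """
    x = np.asarray(x, dtype=float)
    window = np.hanning(frame_len)
    n_frames = 1 + (len(x) - frame_len) // hop
    # Slice the signal into overlapping, windowed frames (one frame per column).
    frames = np.stack(
        [x[m * hop : m * hop + frame_len] * window for m in range(n_frames)],
        axis=1,
    )
    # Real-input FFT along the time axis of each frame.
    return np.fft.rfft(frames, axis=0)
```

Applying `stft` to each x_q(t) gives the X_q(f, m) on which ICA is then run bin by bin.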
Next, ICA for an instantaneous mixture is solved at each frequency f:
[Inline image 1 not reproduced]
The row order of the separation matrices must therefore be aligned across frequencies. This is the permutation problem.
Once it is solved, applying the inverse discrete Fourier transform to W_rq(f) finally yields the separation filter coefficients w_rq(k). Two conventional techniques for solving the permutation problem are introduced below.
[0004]
[Permutation solution by direction of arrival of signals]
The first conventional technique solves the permutation by estimating the directions of arrival of the signals (see, for example, Non-Patent Document 1).
If the sensor spacing is suitably narrow, each row of the separation matrix obtained by independent component analysis forms a frequency-domain filter that extracts the signal arriving from one direction while suppressing the signals arriving from other directions. If this situation is analyzed at each frequency and the directions of arrival Θ(f) = [θ_1(f), ..., θ_P(f)]^T of the signals extracted by the rows of the separation matrix can be estimated, the permutation can be solved.
A representative way to estimate the directions of arrival is to plot directivity patterns. The method first approximates the impulse responses of the mixing system by the direct wave alone and further assumes plane waves. Let the direction of arrival of source signal s_p be 0° ≤ θ_p ≤ 180° (the direction perpendicular to the sensor array being 90°) and let the position of sensor q be d_q. The frequency response of the mixing system can then be written H_qp(f) = exp(j2πf c^-1 d_q cos θ_p), where c is the propagation speed of the signal. From this, the frequency response from a source signal s_p at angle θ_p to the separated signal y_r,
[Equation 3]
is obtained.
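The directivity computation can be sketched as follows, assuming that the response from angle θ to output r is B_r(f, θ) = Σ_q W_rq(f) exp(j2πf c^-1 d_q cos θ), which is what the plane-wave model above implies (the image for [Equation 3] is not reproduced here). The function names, the 1° search grid and the default c = 340 m/s are assumptions for illustration.

```python
import numpy as np

def directivity(W_f, f, d, thetas_deg, c=340.0):
    """Gain |B_r(f, theta)| of each row r of the separation matrix W_f.

    W_f: (N, M) complex separation matrix at frequency f (Hz)
    d:   (M,) sensor positions in metres
    c:   propagation speed; 340 m/s is an assumed value for sound in air
    """
    theta = np.deg2rad(np.asarray(thetas_deg, dtype=float))
    # Direct-path, plane-wave mixing response for every sensor and angle.
    H = np.exp(1j * 2 * np.pi * f * np.outer(d, np.cos(theta)) / c)  # (M, A)
    return np.abs(W_f @ H)                                           # (N, A)

def null_directions(W_f, f, d, c=340.0):
    """Angle (degrees, 1-degree grid over 0..180) of minimum gain for each row."""
    thetas = np.linspace(0.0, 180.0, 181)
    gains = directivity(W_f, f, d, thetas, c)
    return thetas[np.argmin(gains, axis=1)]
```

With two sources, the direction a row extracts is taken to be the null direction of the other row, so Θ(f) is the null list in reversed order.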
[0005]
FIG. 7 plots the gain of the directivity patterns at two frequencies. At 3152 Hz, the directivity pattern of the first row of the separation matrix, Y_1, has its minimum gain at 41°, and that of the second row, Y_2, has its minimum gain at 132°. The first row Y_1 therefore suppresses the signal arriving from 41° and extracts the signal arriving from 132°, while the second row Y_2 suppresses the signal arriving from 132° and extracts the signal arriving from 41°. Hence Θ(3152 Hz) = [132, 41]^T. Similarly, at 3156 Hz the estimate is Θ(3156 Hz) = [45, 126]^T. The permutations are clearly not aligned as they stand, so the rows of the separation matrix at 3152 Hz must be swapped to align them.
With this method, the direction of arrival of the signal extracted by each row of the separation matrix is estimated at every frequency, and the permutation is solved by aligning those directions. At some frequencies, however, no angle of minimum gain exists within 0° ≤ θ_p ≤ 180°, and no direction estimate is obtained. In other cases the estimate differs greatly from those at other frequencies and is therefore unreliable. This happens often at low frequencies in particular, because the phase difference produced by a difference in direction is small. At such frequencies the permutation either cannot be determined or is determined wrongly.
[0006]
[Permutation solution by similarity of separated signals]
The second conventional technique solves the permutation using the similarity of the separated signals (see, for example, Non-Patent Document 2).
The similarity between the separated signals Y_r(f_1, m) and Y_r(f_2, m) at two frequencies is computed from the correlation between the envelopes of their absolute values.
First, the correlation is defined.
The correlation between two signals x(n) and y(n) is given by cor(x, y) = [<x·y> − <x>·<y>] / (σ_x·σ_y), where <·> denotes the time average and σ the standard deviation. cor(x, x) = 1, and cor(x, y) = 0 if x and y are uncorrelated.
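The definition translates directly into code. The sketch below assumes real-valued sequences of equal length and uses the population standard deviation so that cor(x, x) is 1.

```python
import numpy as np

def cor(x, y):
    """cor(x, y) = (<x*y> - <x><y>) / (sigma_x * sigma_y), <.> being the time average."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # np.std defaults to the population form (ddof=0), matching the time averages.
    return (np.mean(x * y) - np.mean(x) * np.mean(y)) / (np.std(x) * np.std(y))
```

This is the same quantity as the off-diagonal entry of `np.corrcoef(x, y)`.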
The separated signals Y_r(f_1, m) and Y_r(f_2, m) at two different frequencies have only a small correlation even when they correspond to the same source signal, because the Fourier transform has the properties of an orthogonal transform. In contrast, the envelope of the absolute value of the separated signal Y_r(f, m) (R is a parameter that sets the length of the moving average),
[Equation 4]
unlike Y_r(f, m) itself, is known to be highly correlated, especially at nearby frequencies, when the signals correspond to the same source. The permutation can therefore be solved by computing these correlations. In what follows, a permutation is written π: {1, ..., N} → {1, ..., N}. For example, when N = 2, leaving the order unchanged gives π(1) = 1, π(2) = 2, and swapping gives π(1) = 2, π(2) = 1. A conventional method determines the permutation π_f at each frequency f according to
[Equation 5]
so that the sum of correlations within a neighborhood of frequency difference D or less is maximized. Here π_g is the permutation at frequency g.
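A minimal sketch of the envelope and of the neighborhood-based choice of π_f follows. The images for [Equation 4] and [Equation 5] are not reproduced, so two points are assumptions: the moving average is taken as a centered window of R frames, and the objective is taken as the sum over neighbors g and outputs r of the correlation between the envelope of row π(r) at f and row r at the already aligned g. The function names are hypothetical.

```python
import itertools
import numpy as np

def envelope(Y_f, R=5):
    """Moving average over R frames of |Y_r(f, m)|; Y_f has shape (N, n_frames)."""
    kernel = np.ones(R) / R
    return np.stack([np.convolve(np.abs(y), kernel, mode="same") for y in Y_f])

def best_permutation(v_f, neighbours):
    """Permutation pi (as a tuple) maximising the summed envelope correlation.

    v_f:        (N, n_frames) envelopes at frequency f, not yet aligned
    neighbours: list of (N, n_frames) envelopes at nearby, already aligned frequencies
    New row r at frequency f is old row pi[r].
    """
    N = v_f.shape[0]

    def score(pi):
        return sum(
            np.corrcoef(v_f[pi[r]], v_g[r])[0, 1]
            for v_g in neighbours
            for r in range(N)
        )

    return max(itertools.permutations(range(N)), key=score)
```

Enumerating all N! permutations is only practical for small N, which matches the two-source examples in the text.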
[0007]
[Non-Patent Document 1]
S. Kurita, H. Saruwatari, S. Kajita, K. Takeda, and F. Itakura, "Evaluation of blind signal separation method using directivity pattern under reverberant conditions," in Proc. ICASSP 2000, 2000, pp. 3140-3143
[Non-Patent Document 2]
S. Ikeda and N. Murata, "An approach to blind source separation of speech signals," in Proc. ICANN '98, Sep. 1998, pp. 761-766
[0008]
[Problems to be solved by the invention]
The permutation solutions introduced above as conventional techniques each have the following drawbacks.
The first, based on direction of arrival, estimates directions by approximating the impulse responses of the mixing system with the direct wave alone and assuming plane waves, ignoring the attenuation and reverberation that occur in practice. As explained under the prior art, at some frequencies the direction cannot be estimated, or the estimate is unreliable even when one is obtained. At those frequencies the permutation cannot be determined or is determined wrongly. Taken as a whole, the permutation is inevitably wrong at some frequencies, so the method cannot be said to solve the permutation with high accuracy.
The second, based on the similarity of separated signals, solves the permutation according to equation (3), so the permutation can be determined in every frequency bin. Because it uses the separated signals themselves, its accuracy is higher than that of the first method, which relies on approximations. However, the permutation is determined from relative relationships with nearby frequencies, so an error at one frequency propagates to all subsequent frequencies. The method works well if the correct permutation is obtained at every frequency, but the damage from an error at any one frequency is severe, so it lacks stability and is impractical in that respect.
An object of the present invention is therefore to provide a highly accurate and stable solution to the permutation problem by integrating the two methods so that each compensates for the other's weaknesses.
[0009]
[Means for Solving the Problems]
To achieve the above object, the present invention is characterized by:
applying a short-time Fourier transform to the observed signals;
obtaining a separation matrix at each frequency by independent component analysis;
estimating the direction of arrival of the signal extracted by each row of the separation matrix at each frequency;
judging whether each estimate is sufficiently reliable;
determining the permutation from the estimated directions of arrival;
computing the envelopes of the absolute values of the separated signals across frequencies; and
leaving unchanged the permutation at the designated frequencies (those whose estimates are sufficiently reliable), and at the remaining frequencies determining the permutation from the correlation of the absolute-value envelopes of the separated signals with nearby frequencies.
[0010]
[Embodiments of the invention]
[Configuration of signal separation using independent component analysis in frequency domain]
FIG. 2 is a block diagram of a blind signal separation device using independent component analysis in the frequency domain.
Its details were described under the prior art. The present invention is characterized by the permutation resolution unit within it.
FIG. 3 shows the procedure of the blind signal separation method of the present invention.
s1: apply a short-time Fourier transform to the observed signals;
s2: obtain the separation matrix at each frequency by independent component analysis;
s3: estimate the direction of arrival of the signal extracted by each row of the separation matrix at each frequency;
s4: judge whether each estimate is sufficiently reliable;
s5: compute the similarity of the separated signals across frequencies;
s6: when solving the permutation after the separation matrices have been obtained, determine the permutation by aligning the directions at the frequencies where the direction-of-arrival estimate was judged sufficiently reliable, and at the other frequencies (those where the estimate was judged unreliable) determine the permutation so as to increase the similarity of the separated signals with nearby frequencies.
[0011]
[Configuration of the present invention]
FIG. 4 is a block diagram illustrating a configuration example of the permutation resolution unit of the present invention.
The permutation resolution unit consists of a permutation resolution unit based on signal direction of arrival and a permutation resolution unit based on the similarity of separated signals.
In the permutation resolution unit based on signal direction of arrival,
[Inline image 2 not reproduced]
[0012]
[Permutation resolution by direction of arrival of signals]
FIG. 5 is a block diagram showing the configuration of the permutation resolution unit based on signal direction of arrival.
The direction-based permutation analysis unit uses, for example, the method described under the prior art to analyze, at each frequency, which direction each row of the separation matrix extracts, and outputs Θ(f). The direction-based permutation determination unit, using the directions of arrival Θ(f) estimated at each frequency,
[Inline image 3 not reproduced]
A feature of the present invention is that a reliability judgment unit judges whether the estimated directions of arrival are sufficiently reliable and obtains the set fix of reliable frequencies. In this embodiment the judgment is made by checking whether the following conditions are satisfied.
1. There are as many direction-of-arrival estimates as there are source signals.
2. The direction-of-arrival estimates do not differ greatly from those at other frequencies.
3. At the angle given by each estimate, the signal to be suppressed is sufficiently suppressed relative to the signal to be extracted.
The first condition can be checked by whether the output Θ(f) of the direction-of-arrival estimation unit contains as many estimates as there are source signals. For the second condition, the estimated directions are sorted, their averages over all frequencies are computed, and the condition is judged to be satisfied if the estimates do not differ greatly from those averages. For example, with two source signals, suppose the averages of the estimated directions over all frequencies are 54° and 137°. Estimates of 53° and 134° at some frequency do not differ greatly and satisfy the condition, whereas estimates of 20° and 91° at another frequency differ greatly and do not. The third condition can be checked by computing the gain of the directivity pattern B_r(f, θ_p) at the angle given by each estimate. For example, in the directivity patterns of FIG. 7, the signal to be suppressed is sufficiently suppressed at both 3152 Hz and 3156 Hz, so the condition is satisfied. In the 312 Hz directivity pattern of FIG. 8, on the other hand, Θ(312 Hz) = [114, 70]^T, and the gains at those angles are B_1(312 Hz, 114) = 0.601, B_2(312 Hz, 114) = 0.537, B_1(312 Hz, 70) = 0.325 and B_2(312 Hz, 70) = 0.743. The ratios of the gain of the signal to be suppressed to that of the signal to be extracted are 0.537/0.601 = 0.894 and 0.325/0.743 = 0.437, respectively. The signals cannot be regarded as sufficiently suppressed, so the condition is not satisfied.
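The three conditions can be sketched as a single check per frequency. The thresholds `max_dev_deg` and `max_ratio` are hypothetical values chosen only so that the worked examples above come out as described; the source does not state what counts as "greatly different" or "sufficiently suppressed", and the function name is likewise an assumption.

```python
import numpy as np

def is_reliable(theta_f, mean_theta, gain_f, n_sources,
                max_dev_deg=10.0, max_ratio=0.2):
    """Apply the three reliability conditions at one frequency.

    theta_f:    estimated directions in degrees; entry r is extracted by row r
    mean_theta: per-source averages of the sorted estimates over all frequencies
    gain_f:     gain_f[r][p] = |B_r(f, theta_f[p])|
    max_dev_deg, max_ratio: illustrative thresholds, not taken from the source
    """
    # Condition 1: one estimate per source signal.
    if len(theta_f) != n_sources:
        return False
    # Condition 2: sorted estimates stay close to the all-frequency averages.
    if np.any(np.abs(np.sort(theta_f) - np.sort(mean_theta)) > max_dev_deg):
        return False
    # Condition 3: at each estimated angle, every other row's gain is small
    # compared with the gain of the row that extracts that direction.
    gain_f = np.asarray(gain_f, dtype=float)
    for p in range(n_sources):
        for r in range(n_sources):
            if r != p and gain_f[r, p] / gain_f[p, p] > max_ratio:
                return False
    return True
```

For the 312 Hz example, `gain_f = [[0.601, 0.325], [0.537, 0.743]]` gives the ratios 0.537/0.601 and 0.325/0.743, both above the assumed threshold, so the frequency is left out of fix.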
[Inline image 4 not reproduced]
the permutation has been solved and, at the same time, each estimate has been judged for reliability.
The reliable frequencies are the elements of fix. At frequencies that do not belong to fix, the direction-of-arrival estimates are not sufficiently reliable, so the permutation must instead be solved using the similarity of the separated signals, described next.
[0013]
[Resolution of permutation by similarity of separated signals]
FIG. 6 is a block diagram illustrating a configuration of a permutation resolution unit based on the similarity between separated signals.
[Inline image 5 not reproduced]
In this embodiment, the permutation is decided starting from the frequencies whose envelope correlation with frequencies that are already decided (that is, that belong to the set fix) can be made clearly large. The specific algorithm is shown in FIG. 9.
First, for every frequency f not in the set fix, find the permutation that maximizes the sum of envelope correlations with the frequencies in fix that lie within a frequency difference of D,
[Equation 6]
and its maximum value maxCor_f. Here π_g is the permutation at frequency g. Next, select the frequency i at which maxCor_f is largest, fix its permutation as π_i, and add frequency i to fix. Note that permute(W, π) is a function that reorders the rows of W according to the permutation π.
With the above method, the permutation is determined at all frequencies.
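The selection loop described above can be sketched as follows. FIG. 9 and [Equation 6] are not reproduced in this text, so this is one reading of the prose rather than the algorithm as drawn: frequencies are treated as bin indices, D is measured in bins, envelopes at bins in fix are assumed to be already aligned, and a bin with no fixed neighbor within D scores 0. The function name `resolve_by_similarity` is hypothetical.

```python
import itertools
import numpy as np

def resolve_by_similarity(env, fix, D=3):
    """Greedily align the bins outside `fix` using envelope correlations.

    env: (F, N, n_frames) envelopes; rows at bins in `fix` are already aligned
    fix: iterable of trusted bin indices; D: neighbourhood width in bins (assumed unit)
    Returns (aligned envelopes, {bin: row order applied}).
    """
    env = env.copy()
    F, N, _ = env.shape
    fixed = set(fix)
    perms = {f: tuple(range(N)) for f in fixed}

    def best(f):
        near = [g for g in fixed if abs(g - f) <= D]

        def score(pi):
            return sum(np.corrcoef(env[f, pi[r]], env[g, r])[0, 1]
                       for g in near for r in range(N))

        pi = max(itertools.permutations(range(N)), key=score)
        return score(pi), pi              # (maxCor_f, maximising permutation)

    while len(fixed) < F:
        candidates = {f: best(f) for f in range(F) if f not in fixed}
        i = max(candidates, key=lambda f: candidates[f][0])  # largest maxCor_f
        pi = candidates[i][1]
        env[i] = env[i, list(pi)]         # permute(., pi): reorder the rows
        perms[i] = pi
        fixed.add(i)                      # bin i now counts as decided
    return env, perms
```

The same row order `perms[f]` would then be applied to the separation matrix W(f) at each bin.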
[0014]
The blind signal separation apparatus of the present invention can be built from a computer having a CPU, memory and so on, a terminal used by the user, and a machine-readable recording medium such as a CD-ROM, magnetic disk device or semiconductor memory. The blind signal separation program recorded on the recording medium, or transmitted over a communication line, is read by the computer and implements the components and processing described above on the computer.
[0015]
[Effects of the invention]
FIG. 10 compares the separation performance obtained when two sound sources are separated with the prior art and with the present invention.
To obtain these results, mixed signals were created by convolving 12 pairs of 8-second speech data selected from the ASJ research speech corpus with impulse responses having a reverberation time of 300 ms. The vertical axis is the separation performance computed as SNR (signal-to-noise ratio), and the horizontal axis is the speech data pair; "av" is the average over the 12 pairs. For comparison, the permutation was solved by three methods: "dir" is the direction-of-arrival method, "cor" is the separated-signal similarity method, and "both" is the method of the present invention, which combines the two. "dir" solves the permutation stably but its performance is insufficient, whereas "cor" is sometimes very good and sometimes poor, and so lacks stability. "both" performs well in every case, confirming the effect of the present invention.
The direction-of-arrival method solves the permutation against an absolute reference, namely direction, so it is somewhat less accurate but rarely makes large errors. The separated-signal similarity method can solve the permutation with high accuracy, but an error anywhere causes great damage. The present invention integrates the two methods to exploit the advantages of both, and can therefore solve the permutation stably and with high accuracy.
[Brief description of the drawings]
FIG. 1 is a diagram for explaining an outline of blind signal separation.
FIG. 2 is a block diagram showing the configuration of a blind signal separation device using independent component analysis in the frequency domain.
FIG. 3 is a diagram showing a procedure of a blind signal separation method according to the present invention.
FIG. 4 is a block diagram showing a configuration of a permutation resolution unit in the present invention.
FIG. 5 is a block diagram showing the configuration of the permutation resolution unit based on signal direction of arrival in FIG. 4.
FIG. 6 is a block diagram showing the configuration of the permutation resolution unit based on the similarity of separated signals in FIG. 4.
FIG. 7 is a graph plotting gains of directivity characteristics at 3152 Hz and 3156 Hz.
FIG. 8 is a plot of gain of directivity at 312 Hz.
FIG. 9 is a diagram illustrating an algorithm of a permutation determination unit based on similarity of separated signals.
FIG. 10 is a diagram comparing the separation performance of the conventional method and the method according to the present invention.
Claims (4)
A blind signal separation method comprising:
a procedure for applying a short-time Fourier transform to the observed signals;
a procedure for obtaining, by independent component analysis, a separation matrix at each frequency of the short-time Fourier transform;
a procedure for estimating the direction of arrival of the signal extracted by each row of the separation matrix at each frequency;
a procedure for judging whether each estimate is sufficiently reliable;
a procedure for computing the envelopes of the absolute values of the separated signals across the frequencies of the short-time Fourier transform; and
a procedure for, when solving the permutation after the separation matrix has been obtained at each frequency, determining the permutation by aligning the directions at the frequencies where the direction-of-arrival estimate was judged sufficiently reliable, and at the other frequencies determining the permutation so as to increase the correlation of the absolute-value envelopes of the separated signals with nearby frequencies.
A blind signal separation apparatus comprising:
means for applying a short-time Fourier transform to the observed signals;
means for obtaining, by independent component analysis, a separation matrix at each frequency of the short-time Fourier transform;
means for estimating the direction of arrival of the signal extracted by each row of the separation matrix at each frequency;
means for judging whether each estimate is sufficiently reliable;
means for computing the envelopes of the absolute values of the separated signals across the frequencies of the short-time Fourier transform; and
means for, when solving the permutation after the separation matrix has been obtained at each frequency, determining the permutation by aligning the directions at the frequencies where the direction-of-arrival estimate was judged sufficiently reliable, and at the other frequencies determining the permutation so as to increase the correlation of the absolute-value envelopes of the separated signals with nearby frequencies.
A blind signal separation program for causing a computer to execute:
a process of applying a short-time Fourier transform to the observed signals;
a process of obtaining, by independent component analysis, a separation matrix at each frequency of the short-time Fourier transform;
a process of estimating the direction of arrival of the signal extracted by each row of the separation matrix at each frequency;
a process of determining whether each estimate is sufficiently reliable;
a process of calculating the envelope of the absolute value of the separated signals across the short-time Fourier-transformed frequencies; and
a process of resolving the permutation after the separation matrix has been obtained at each frequency, which, at frequencies where the direction-of-arrival estimate is judged sufficiently reliable, determines the permutation by aligning those directions and, at the other frequencies, determines the permutation so as to increase the correlation of the envelope of the absolute value of the separated signals with that of neighboring frequencies.
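The envelope-correlation part of the permutation step, used at frequencies whose direction estimates are not reliable, can be sketched as follows. This is a hedged illustration only: the names `envelope_corr` and `align_by_envelope` are hypothetical, the envelope is taken to be the raw magnitude `|Y(f, m)|` without any smoothing, and the score is assumed to be the sum of Pearson correlations against a single already-aligned neighboring bin. The claim itself does not fix any of these details.

```python
import numpy as np
from itertools import permutations

def envelope_corr(a, b):
    """Pearson correlation between two real-valued envelopes."""
    a = a - a.mean()
    b = b - b.mean()
    denom = np.sqrt((a * a).sum() * (b * b).sum())
    return float((a * b).sum() / denom) if denom > 0 else 0.0

def align_by_envelope(Y_f, ref_env):
    """Choose the permutation of one bin that best matches its neighbor.

    Y_f     (N, T) complex separated signals at the bin being fixed.
    ref_env (N, T) envelopes of an already-aligned neighboring bin.
    Returns a tuple p such that output channel i should take row p[i].
    """
    env = np.abs(Y_f)  # assumed envelope: magnitude per frame
    n_src = env.shape[0]
    best, best_score = None, -np.inf
    # Exhaustive search over all N! orderings; fine for small N.
    for p in permutations(range(n_src)):
        score = sum(envelope_corr(env[p[i]], ref_env[i]) for i in range(n_src))
        if score > best_score:
            best, best_score = p, score
    return best

# Two sources whose rows are swapped relative to the reference bin.
t = np.linspace(0.0, 6.0, 200)
ref = np.stack([np.abs(np.sin(t)) + 0.1, np.abs(np.cos(t)) + 0.1])
rng = np.random.default_rng(0)
phase = np.exp(1j * rng.uniform(0.0, 2.0 * np.pi, size=ref.shape))
Y = ref[::-1] * phase  # same envelopes, swapped rows, random phases
p = align_by_envelope(Y, ref)
```

The exhaustive search grows factorially with the number of sources, so a practical system with many sources would likely need a cheaper assignment strategy.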
A recording medium on which is recorded a blind signal separation program for causing a computer to execute:
a process of applying a short-time Fourier transform to the observed signals;
a process of obtaining, by independent component analysis, a separation matrix at each frequency of the short-time Fourier transform;
a process of estimating the direction of arrival of the signal extracted by each row of the separation matrix at each frequency;
a process of determining whether each estimate is sufficiently reliable;
a process of calculating the envelope of the absolute value of the separated signals across the short-time Fourier-transformed frequencies; and
a process of resolving the permutation after the separation matrix has been obtained at each frequency, which, at frequencies where the direction-of-arrival estimate is judged sufficiently reliable, determines the permutation by aligning those directions and, at the other frequencies, determines the permutation so as to increase the correlation of the envelope of the absolute value of the separated signals with that of neighboring frequencies.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2002312204A JP3975153B2 (en) | 2002-10-28 | 2002-10-28 | Blind signal separation method and apparatus, blind signal separation program and recording medium recording the program |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2002312204A JP3975153B2 (en) | 2002-10-28 | 2002-10-28 | Blind signal separation method and apparatus, blind signal separation program and recording medium recording the program |
Publications (2)
Publication Number | Publication Date |
---|---|
JP2004145172A (en) | 2004-05-20 |
JP3975153B2 (en) | 2007-09-12 |
Family
ID=32457166
Family Applications (1)
Application Number | Status | Priority Date | Filing Date | Title |
---|---|---|---|---|
JP2002312204A, JP3975153B2 (en) | Expired - Fee Related | 2002-10-28 | 2002-10-28 | Blind signal separation method and apparatus, blind signal separation program and recording medium recording the program |
Country Status (1)
Country | Link |
---|---|
JP (1) | JP3975153B2 (en) |
Families Citing this family (20)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
DE602004029867D1 (en) | 2003-03-04 | 2010-12-16 | Nippon Telegraph & Telephone | POSITION INFORMATION IMPRESSION DEVICE, METHOD THEREFOR AND PROGRAM |
EP2068308B1 (en) | 2003-09-02 | 2010-06-16 | Nippon Telegraph and Telephone Corporation | Signal separation method, signal separation device, and signal separation program |
JP4449871B2 (en) | 2005-01-26 | 2010-04-14 | ソニー株式会社 | Audio signal separation apparatus and method |
CN1815550A (en) | 2005-02-01 | 2006-08-09 | 松下电器产业株式会社 | Method and system for identifying voice and non-voice in envivonment |
EP1752969A4 (en) | 2005-02-08 | 2007-07-11 | Nippon Telegraph & Telephone | SIGNAL SEPARATION DEVICE, SIGNAL SEPARATION METHOD, SIGNAL SEPARATION PROGRAM, AND RECORDING MEDIUM |
JP2006337851A (en) * | 2005-06-03 | 2006-12-14 | Sony Corp | Speech signal separating device and method |
JP2007215163A (en) * | 2006-01-12 | 2007-08-23 | Kobe Steel Ltd | Sound source separation apparatus, program for sound source separation apparatus and sound source separation method |
JP4556875B2 (en) | 2006-01-18 | 2010-10-06 | ソニー株式会社 | Audio signal separation apparatus and method |
JP4630203B2 (en) * | 2006-02-24 | 2011-02-09 | 日本電信電話株式会社 | Signal separation device, signal separation method, signal separation program and recording medium, signal arrival direction estimation device, signal arrival direction estimation method, signal arrival direction estimation program and recording medium |
JP4920270B2 (en) * | 2006-03-06 | 2012-04-18 | Kddi株式会社 | Signal arrival direction estimation apparatus and method, signal separation apparatus and method, and computer program |
JP5117012B2 (en) * | 2006-08-09 | 2013-01-09 | 株式会社東芝 | Direction detection system and signal extraction method |
JP2008089312A (en) * | 2006-09-29 | 2008-04-17 | Kddi Corp | Signal arrival direction estimation apparatus and method, signal separation apparatus and method, and computer program |
JP4897519B2 (en) * | 2007-03-05 | 2012-03-14 | 株式会社神戸製鋼所 | Sound source separation device, sound source separation program, and sound source separation method |
JP4649437B2 (en) * | 2007-04-03 | 2011-03-09 | 株式会社東芝 | Signal separation and extraction device |
JP5642339B2 (en) | 2008-03-11 | 2014-12-17 | トヨタ自動車株式会社 | Signal separation device and signal separation method |
JP4975691B2 (en) * | 2008-07-11 | 2012-07-11 | 株式会社東芝 | Receiving apparatus and waveform processing method |
JP2011081293A (en) | 2009-10-09 | 2011-04-21 | Toyota Motor Corp | Signal separation device and signal separation method |
JP5374427B2 (en) * | 2010-03-18 | 2013-12-25 | 株式会社日立製作所 | Sound source separation device, sound source separation method and program therefor, video camera device using the same, and mobile phone device with camera |
EP3333850A4 (en) | 2015-10-16 | 2018-06-27 | Panasonic Intellectual Property Management Co., Ltd. | Sound source separating device and sound source separating method |
JP7509102B2 (en) | 2021-08-10 | 2024-07-02 | 日本電信電話株式会社 | SOUND SOURCE SEPARATION DEVICE, SOUND SOURCE SEPARATION METHOD, AND SOUND SOURCE SEPARATION PROGRAM |
2002-10-28: JP application JP2002312204A filed; granted as JP3975153B2 (Expired - Fee Related)
Also Published As
Publication number | Publication date |
---|---|
JP2004145172A (en) | 2004-05-20 |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
JP3975153B2 (en) | Blind signal separation method and apparatus, blind signal separation program and recording medium recording the program | |
WO2020108614A1 (en) | Audio recognition method, and target audio positioning method, apparatus and device | |
CN111474521B (en) | Sound source positioning method based on microphone array in multipath environment | |
RU2640742C1 (en) | Extraction of reverberative sound using microphone massives | |
US7626889B2 (en) | Sensor array post-filter for tracking spatial distributions of signals and noise | |
Thiergart et al. | On the spatial coherence in mixed sound fields and its application to signal-to-diffuse ratio estimation | |
Jensen et al. | Nonlinear least squares methods for joint DOA and pitch estimation | |
CN114830686B (en) | Improved localization of sound sources | |
CN106646350B (en) | A Correction Method for Inconsistent Amplitude Gains of Each Channel of a Single Vector Hydrophone | |
CN113687305A (en) | Method, device and equipment for positioning sound source azimuth and computer readable storage medium | |
CN114167356A (en) | A method and system for sound source localization based on polyhedral microphone array | |
JP4812302B2 (en) | Sound source direction estimation system, sound source direction estimation method, and sound source direction estimation program | |
JP3949074B2 (en) | Objective signal extraction method and apparatus, objective signal extraction program and recording medium thereof | |
JP3862685B2 (en) | Sound source direction estimating device, signal time delay estimating device, and computer program | |
Jia et al. | Two-dimensional detection based LRSS point recognition for multi-source DOA estimation | |
CN113707171B (en) | Airspace filtering voice enhancement system and method | |
Oliinyk et al. | Center weighted median filter application to time delay estimation in non-Gaussian noise environment | |
Firoozabadi et al. | Combination of nested microphone array and subband processing for multiple simultaneous speaker localization | |
JP2004064697A (en) | Sound source receiving position estimation method, apparatus, and program | |
CN113763982A (en) | Audio processing method and device, electronic equipment and readable storage medium | |
Peterson et al. | Analysis of fast localization algorithms for acoustical environments | |
CN107238813B (en) | Method and device for determining direction of arrival and time of arrival of near-field signal source | |
US11835625B2 (en) | Acoustic-environment mismatch and proximity detection with a novel set of acoustic relative features and adaptive filtering | |
US12328570B2 (en) | Boundary distance system and method | |
CN114822579B (en) | Signal estimation method based on first-order differential microphone array |
Legal Events
Code | Title | Description |
---|---|---|
A621 | Written request for application examination | JAPANESE INTERMEDIATE CODE: A621; effective date: 2005-01-28 |
RD03 | Notification of appointment of power of attorney | JAPANESE INTERMEDIATE CODE: A7423; effective date: 2006-10-18 |
A977 | Report on retrieval | JAPANESE INTERMEDIATE CODE: A971007; effective date: 2007-03-12 |
A131 | Notification of reasons for refusal | JAPANESE INTERMEDIATE CODE: A131; effective date: 2007-04-03 |
A521 | Request for written amendment filed | JAPANESE INTERMEDIATE CODE: A523; effective date: 2007-05-22 |
TRDD | Decision of grant or rejection written | |
A01 | Written decision to grant a patent or to grant a registration (utility model) | JAPANESE INTERMEDIATE CODE: A01; effective date: 2007-06-12 |
A61 | First payment of annual fees (during grant procedure) | JAPANESE INTERMEDIATE CODE: A61; effective date: 2007-06-18 |
R150 | Certificate of patent or registration of utility model | JAPANESE INTERMEDIATE CODE: R150 |
FPAY | Renewal fee payment (event date is renewal date of database) | Payment until: 2010-06-22; year of fee payment: 3 |
FPAY | Renewal fee payment (event date is renewal date of database) | Payment until: 2010-06-22; year of fee payment: 3 |
FPAY | Renewal fee payment (event date is renewal date of database) | Payment until: 2011-06-22; year of fee payment: 4 |
FPAY | Renewal fee payment (event date is renewal date of database) | Payment until: 2012-06-22; year of fee payment: 5 |
LAPS | Cancellation because of no payment of annual fees | |