JP2004257877A

JP2004257877A - Sound source detection method, sound source detection device, and robot

Info

Publication number: JP2004257877A
Application number: JP2003049407A
Authority: JP
Inventors: Akihiko Ikegami; 昭彦池上
Original assignee: Seiko Epson Corp
Current assignee: Seiko Epson Corp
Priority date: 2003-02-26
Filing date: 2003-02-26
Publication date: 2004-09-16

Abstract

【課題】音源方向の検出誤差を低減することのできる音源検出方法及び装置を提供する。
【解決手段】本発明の音源検出方法は、３個以上の音波検出手段Ａ，Ｂ，Ｃにおいて検出された相互に対応する音波の間の複数の検出差ｂ，ｃを求め、音波検出手段の位置及び複数の検出差に基づいて点状音源としての音源Ｐの位置情報を決定する。より具体的には、音波検出手段の位置及び前記複数の検出差に基づいて音波の音源位置を規定するための複数のパラメータのうちの一つを仮定パラメータとして値ｒ′に設定し、この仮定パラメータ及び複数の検出差に基づいて仮の音源位置Ｐ′を求めた上で、仮の音源位置と３個以上の音波検出手段の位置との関係が所定の許容範囲内で整合性を持つか否かを試算し、整合性を持たない場合には仮定パラメータの値を変更してさらに試算を続けることにより、許容範囲内で整合性を有する前記位置情報を決定する。
【選択図】図２
[Problem] To provide a sound source detection method and apparatus capable of reducing the detection error in the sound source direction.
[Solution] In the sound source detection method of the present invention, a plurality of detection differences b and c between mutually corresponding sound waves detected by three or more sound wave detecting means A, B, and C are obtained, and position information of a sound source P, treated as a point sound source, is determined based on the positions of the sound wave detecting means and the plurality of detection differences. More specifically, one of the parameters that define the sound source position is set, as an assumed parameter, to a value r′, and a provisional sound source position P′ is obtained from this assumed parameter and the detection differences. A trial calculation then checks whether the relationship between the provisional sound source position and the positions of the three or more sound wave detecting means is consistent within a predetermined allowable range. If it is not, the value of the assumed parameter is changed and the trial calculation is repeated, so that position information consistent within the allowable range is determined.
[Selected drawing] FIG. 2

Description

【０００１】
【発明の属する技術分野】
本発明は、音源検出方法、音源検出装置及びロボットに係り、特に、音源の方向を検出して音声に反応して動作するロボットに用いる場合に好適な音声入力インターフェイスの構成に関する。
【０００２】
【従来の技術】
一般に、音源を検出する従来の方法としては、複数のマイクロホンを用いて音波を検出し、この音波の検出時間の差を求め、この時間差から音源方向を求める方法が知られている。たとえば、同一平面内で三角形をなすように任意に配置された少なくとも３個以上のマイクロホンを設けて、各マイクロホンの音波到達時間の差を用いて上記平面内における音源の方向を検出するようにした音方向検出装置が知られている（たとえば、特許文献１又は特許文献２参照）。
【０００３】
また、空間的に分離して設置された複数のマイクロホンを用いて、音波の到来方向を演算する水中方向計や水中の２点間の距離を計測するように構成した水中距離計も知られている（たとえば、特許文献３参照）。
【０００４】
図６を参照して従来の方法の概略を説明すると、従来の方法では、音源を点Ｐではなく、線（あるいは面）Ｑであると仮定し、線（面）Ｑから並行に音が伝播してくるものと仮定して音源方向Ｓを求めている。たとえば、三角形ＡＢＣの各頂点にマイクロホンを置き、それぞれにおいて音波（の立ち上がり）を捉える場合、最初に音源に最も近いＡ点で音波を捉え、続いてＣ点、さらにはＢ点で捉えることになるとすれば、音波をＡ点で捉えてからＢ点、Ｃ点で捉えるまでの時間差を検出し、これらの時間差を距離ｂ，ｃに換算する。そして、これらの距離ｂ，ｃを用いて音波の波面を直線Ｗｑとして求め、この直線Ｗｑに直交する方向が音源方向Ｓとして求められる。
【０００５】
【特許文献１】
特開昭５９−１０５５７５号公報
【特許文献２】
特開昭５７−１２５３６６号公報
【特許文献３】
特開平７−２０９４０３号公報
【０００６】
【発明が解決しようとする課題】
しかしながら、上記従来の音源検出装置においては、いずれも音波の伝達を線（または面）音源Ｑからの「並行的に伝播する音」として扱っており、多くの音波が実際に示している「点音源」Ｐから「放射状に広がってくる音」としては捉えていないため、決定された音源方向Ｓに、多くの誤差（試算の結果、作図による解と比較して最大７度程度の誤差になる）を含んでいる。すなわち、人の話し声などは実際には点Ｐから伝播してくる音波であって、その音波の波面Ｗｐは円弧状（球面状）であるため、上記のように線（面）音源Ｑを前提として求められた音源方向Ｓと実際の音源方向との間には大きな誤差が生じる。
【０００７】
上記のような音源方向Ｓの検出誤差は、たとえば、音声に反応して音源方向を向いたり、音声に反応して物を掴むなどの各種動作を行うように構成されたロボットに用いる場合には、ロボットの動作に違和感を与えたり、動作不良を招いたりする可能性がある。また、上記の方法では、基本的に線（面）音源Ｑを前提としていることから、音源までの距離を特定することは不可能であり、音源位置を検出したり、音源に対して種々の動作を行ったりするといったことができないという問題点もあった。
【０００８】
そこで本発明は上記問題点を解決するものであり、その課題は、音源方向の検出誤差を低減することのできる新規の音源検出方法及び装置を提供することにある。また、音源の距離を求めることによって音源に関する位置情報をより詳細に得ることのできる音源検出方法及び装置を実現することにある。
【０００９】
【課題を解決するための手段】
上記課題を解決するために本発明の音源検出方法は、３個以上の音波検出手段を用いて音源を検出する音源検出方法であって、前記３個以上の音波検出手段において検出された相互に対応する音波の間の複数の検出差を求め、前記音波検出手段の位置及び前記複数の検出差に基づいて点状音源としての音源の位置情報を決定することを特徴とする。
【００１０】
この発明によれば、音波検出手段の位置及び複数の検出差に基づいて点状音源としての音源の位置情報を求めるので、特に音源方向の誤差が低減され、正確な位置情報を求めることが可能になる。また、音源までの距離を求めることも可能になる。
【００１１】
ここで、上記検出差とは、音波の伝播時間の差、或いは、これに対応する音波の伝播距離の差をいうものとする。これらの伝播時間の差と伝播距離の差は音波の伝播速度により直接的に関連付けられている。
【００１２】
また、音源の位置情報とは、音源の３次元位置座標、たとえば直交座標（Ｘ、Ｙ、Ｚ）、或いは極座標の音源距離Ｒ及び音源方向（θ，φ）のうちの少なくとも一つ（上記のＸ，Ｙ，Ｚ，Ｒ，θ，φのうちの少なくとも一つ）をいう。また、ある仮想平面上における音源の位置情報を考えることにすれば、上記３次元位置座標の上記仮想平面への投影位置の座標として、音源の直交座標（ｘ、ｙ）、極座標の音源距離ｒ及び音源方向θが用いられ、これらのうちの少なくとも一つとなる。
【００１３】
さらに、２次元平面上における音源位置を検出する場合には、一つの直線上に全てが配列されない態様で３個以上の音波検出手段を配置することが好ましく、また、３次元空間内における音源位置を検出するには、一つの平面上に全てが配列されない態様で４個以上の音波検出手段を配置することが好ましい。
【００１４】
本発明において、前記音波検出手段の位置及び前記複数の検出差に基づいて前記音波の音源位置を規定するための複数のパラメータのうちの少なくとも一つを仮定パラメータとして適宜の値に設定し、この仮定パラメータ及び前記複数の検出差に基づいて仮の音源位置を求めた上で、前記仮の音源位置と前記３個以上の音波検出手段の位置との関係が所定の許容範囲内で整合性を持つか否かを試算し、整合性を持たない場合には前記仮定パラメータの値を変更してさらに試算を続けることにより、前記許容範囲内で整合性を有する前記位置情報を決定することが好ましい。これによれば、音源の位置情報を求めるための演算処理を簡略化することができるとともに、許容範囲を変えることによって音源の位置情報の精度を調整することができるため、必要に応じて位置情報の精度と、音源位置算出に要する時間との関係を任意に、あるいは段階的に選定することが可能になる。
【００１５】
本発明において、前記仮定パラメータは、音波の伝播時間若しくは伝播距離であることが好ましい。音波の伝播時間若しくは伝播距離は、音源距離（上記のＲ又はｒ）に相当するパラメータであり、これを仮定パラメータとして仮定することにより、複数の検出差を用いて仮の音源位置を求めることができる。そして、この仮の音源位置と、３個以上の音源検出手段の位置との整合性を試算することによって、仮の音源位置の精度を評価できる。
【００１６】
たとえば、一つの音波検出手段の検出状態を基準として他の音波検出手段による検出状態との間の複数の検出差を求めた場合には、複数の検出差と、他の音波検出手段の位置座標と、上記仮定パラメータとによって仮の音源位置を求め、この仮の音源位置と上記基準となる音波検出手段との距離が上記仮定パラメータに対応する音源距離と整合するか否かによって仮の音源位置の整合性を判定できる。
【００１７】
本発明において、３個の前記音波検出手段を用いて２つの前記検出差を求め、前記３個の音波検出手段が配置された平面上の仮想音源の前記位置情報を決定することが好ましい。これによれば、３個の音波検出手段で平面上の仮想音源の位置情報を決定することができるため、平面的に音源がどの方向に存在するか、或いは、平面的に音源がどの程度の距離にあるかを知りたい場合には、十分な音源の位置情報が得られるとともに演算処理を簡略化することができる。
【００１８】
本発明において、周囲温度により温度補正された音速を用いて前記位置情報を決定することが好ましい。これによれば、音速の温度補正が行われることによって、より正確な音源の位置情報を得ることができる。
【００１９】
次に、本発明の音源検出装置は、３個以上の音波検出手段と、前記３個以上の音波検出手段において検出された相互に対応する音波の間の複数の検出差を求める検出差算出手段と、前記音波検出手段の位置及び前記複数の検出差に基づいて点状音源としての音源の位置情報を求める音源位置情報算出手段とを有することを特徴とする。
【００２０】
本発明において、前記音源位置情報算出手段は、前記音波検出手段の位置及び前記複数の検出差に基づいて前記音源位置を規定するための複数のパラメータのうちの少なくとも一つを仮定パラメータとして適宜の値に設定し、この仮定パラメータ及び前記複数の検出差に基づいて仮の音源位置を求めた上で、前記仮の音源位置と前記３個以上の音波検出手段の位置との関係が所定の許容範囲内で整合性を持つか否かを試算し、整合性を持たない場合には前記仮定パラメータの値を変更してさらに試算を続けることにより、前記許容範囲内で整合性を有する前記位置情報を決定することが好ましい。
【００２１】
本発明において、前記仮定パラメータは、音波の伝播時間若しくは伝播距離であることが好ましい。
【００２２】
本発明において、前記複数の検出差として、最も小さい検出差から順番に必要数だけ用いることが好ましい。これによれば、音波検出手段の間で得られる複数の検出差のうち、小さい検出差を用いることは、音源に対する見込み角が大きくなることを通常意味するので、検出差に起因する音源方向を求める際の誤差を小さくすることができる。たとえば、或る二つの音波検出手段がある程度離れて配置されていたとき、これらの二つの音波検出手段の検出差が大きいということは検出される音波の進行方向に対して二つの音波検出手段の位置が直列的に配置されていることを意味し、したがって、二つの音波検出手段の音源に対する見込み角は小さいことになる。また逆に、検出差が小さいということは音源に対して二つの音波検出手段が並列的に配置されていることを意味し、上記の見込み角は大きいことになる。従って、音源の方向によって用いるべき検出差を選択することで、常に最適な検出手段を用い、小さな誤差で音源方向を求めることができる。
【００２３】
本発明において、３個の前記音波検出手段により得られた２つの前記検出差を求め、前記３個の音波検出手段が配置された平面上の仮想音源の前記位置情報を決定することが好ましい。これによれば、３個の音波検出手段で平面上の仮想音源の位置情報を決定することができるため、平面的に音源がどの方向に存在するか、或いは、平面的に音源がどの程度の距離にあるかを知りたい場合には、十分な音源の位置情報が得られるとともに演算処理を簡略化することができる。
【００２４】
本発明において、前記３個の音波検出手段は、正三角形の頂点位置に配置されていることが好ましい。正三角形の頂点位置に３個の音波検出手段が配置されていることにより、音源方向に起因する検出精度のばらつきを低減することができる。ただし、音源方向が特定範囲内に限定されている場合には、上記特定範囲内の音源方向に対する見込み角が最も大きくなるように配置すればよい。
【００２５】
本発明において、温度検出手段をさらに有し、前記温度検出手段による検出温度により温度補正された音速を用いて前記位置情報を決定するように構成されていることが好ましい。これによれば、音速の温度依存性に起因して生ずる誤差を低減することができるため、より正確な位置情報を得ることができる。
【００２６】
次に、本発明のロボットは、上記のいずれかに記載の音源検出装置と、前記音源検出装置によって決定された前記位置情報に応じて動作することを特徴とする。これによれば、上記の音源検出装置を用いることによってより正確で的確な動作を実現できる。なお、ここで、位置情報に応じて動作する態様としては、その位置情報を出力すること（たとえば表示すること）も含まれる。
【００２７】
本発明において、前記音源検出装置によって決定された前記位置情報に基づいて音源方向を向くように構成されていることが好ましい。このロボットは、音声を検出して音源方向を向くように動作するため、外部から見ると応答性が良好で高度なロボットとして認識されるとともに、ロボット側から見ると、外部情報の検出範囲を音源方向に制限することができるため、外部情報に対する処理が容易になるというメリットがある。ここで、ロボットが音源方向を向くとは、ロボットのカメラの視野を音源方向に向けること、指向性マイクロホンなどの種々のセンサを音源方向に向けることを意味する。
【００２８】
上述の本発明の音源検出方法及び装置によれば、音源からの情報を雑音等の他の検出要素から分離し、利用しやすくなる、複数音源を弁別し、各々からの情報が個々に利用可能になる、移動する音源を追跡し、情報交換を維持あるいは、行動を監視等することが可能になるなど、きわめて顕著な効果が得られる。このような音源検出方法及び装置は、特にロボットの音声入力インターフェイスとしてきわめて適している。
【００２９】
本発明では、逐次数値演算によって数値解を得る手法を用いて音源の位置情報を取得するようにしている。したがって、音源を点状音源として把握しても、演算処理を複雑化することなく、容易に位置情報を得ることができる。
【００３０】
また、従来例は、音が並行的に伝播すると仮定して方向を算出しているので、人の話し声等、点音源として考える方がふさわしい場合には、音源の方向によって検出精度が異なり、ある程度の方位誤差が避けられなかった。これに対して、本発明は、音が点音源から出て放射状に広がることを前提に算出しているので、原理的に点音源に対する誤差が無く、特に人の話し声等の音源に対応するのに適する。また、対象が面（または線）音源である場合には、多少の方向誤差が生ずる場合も考えられるが、この場合には、音源が線状若しくは面状であるため、逆に多少の方向誤差が存在したとしても、実用的には全く問題にはならない。
【００３１】
さらに、上記の従来例では、基本的に音源方向を検出する方法が記載されているだけであるが、本発明においては、音源位置を求めることもできるため、位置情報としては、上記の音源方句に限らず、音源までの距離を求めることも可能になる。これによって、音源に対する各種動作を正確に行うことが可能になる。
【００３２】
【発明の実施の形態】
次に、添付図面を参照して本発明に係る音源検出方法及び装置並びにロボットの実施形態について詳細に説明する。
【００３３】
図１は、本実施形態に係る音源検出方法の原理を示す説明図であり、図２は、音源検出方法の演算仮定の一例を示す説明図である。ここで、以下の説明は、次に示す状況に基づいて行う。すなわち、音波検出手段（マイクロホン）を図中の点Ａ、点Ｂ及び点Ｃに配置し、点ＢがＸＹ座標の原点位置に配置され、点ＣがＸ軸上に配置され、点Ａ，Ｂ，Ｃは、ＸＹ平面上における一辺が長さＬの正三角形の頂点位置に配置されているものとする。また、音源Ｐは、ＸＹ平面上に配置されているものとする。さらに、検出すべき音源Ｐは、音波検出手段Ａに最も近いものとする。
【００３４】
図１に示すように、点状の音源Ｐから音波が伝播するとき、その波面Ｗｐは音源Ｐを中心とする円となる。そして、音波は音源Ｐに最も近い音波検出手段Ａに時刻ｔａにおいて最初に到達し、その後、時刻ｔｂにおいて音波検出手段Ｂに、時刻ｔｃにおいて音波検出手段Ｃにそれぞれ到達する。これによって、音波検出手段Ａを基準にすると、音波検出手段Ｂの音波到達時間差はｔｂ−ｔａ、音波検出手段Ｃの音波到達時間差はｔｃ−ｔａとなる。ここで、音速をＶとすれば、上記の音波到達時間差に対応する音波伝播距離の差は、ｂ＝Ｖ・（ｔｂ−ｔａ）、ｃ＝Ｖ・（ｔｃ−ｔａ）となる。ここで、音源Ｐから音波検出手段Ａまでの距離をｒとすれば、音源Ｐから音波検出手段Ｂまでの距離はｒ＋ｂ、音源Ｐから音波検出手段Ｃまでの距離はｒ＋ｃとなる。
【００３５】
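In this situation, the coordinates [Xp, Yp] of the sound source P can be obtained from the position coordinates of the sound wave detecting means A, B, and C and the propagation distance differences b and c. For example, the relations
r^2 = (Xp - L/2)^2 + (Yp - 3^(1/2)·L/2)^2 … (1)
(r + b)^2 = Xp^2 + Yp^2 … (2)
(r + c)^2 = (Xp - L)^2 + Yp^2 … (3)
hold, so if the simultaneous equations (1) to (3) can be solved, the unknowns r, Xp, and Yp are obtained. However, because b and c are values derived from actually measured time differences, an attempt to solve equations (1) to (3) directly may fail to yield a solution (for example, no solution may exist). The inventor therefore assumes a predetermined value for one of the unknowns r, Xp, and Yp that define the position of the sound source P, obtains from it the position of a provisional sound source P′ as shown in FIG. 2, and evaluates by trial calculation how well the position of P′ agrees with the positions of the three sound wave detecting means A, B, and C.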
【００３６】
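For example, the distance AP′ between the sound wave detecting means A and the provisional sound source P′ (whose true value is r) is taken as the assumed parameter and provisionally set to a value r′. The position coordinates [x, y] of P′ on the XY plane are obtained from r′, and the consistency between the positions of A, B, and C and the position of P′ is then checked.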
【００３７】
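More specifically, the following is one example of the above method. From the assumed parameter r′ and the position coordinates of the sound wave detecting means B and C,
(r′ + b)^2 = x^2 + y^2 … (4)
(r′ + c)^2 = (x - L)^2 + y^2 … (5)
hold, and solving equations (4) and (5) gives the position [x, y] of the provisional sound source P′ as follows.
x = (L^2 + b^2 - c^2 + 2(b - c) · r′) / (2L) … (6)
y = ±{(r′ + b)^2 - x^2}^(1/2) … (7)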
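Equations (6) and (7) can be evaluated directly once r′ is chosen. The following is a minimal sketch, assuming the layout of the embodiment (B at the origin, C at (L, 0)); the function name `provisional_source` is hypothetical, and taking the positive root of (7) is an assumption that places P′ on the same side of line BC as A.

```python
import math

def provisional_source(r_p, b, c, L):
    """Solve equations (4) and (5) for P' = (x, y), given an assumed r'.

    Returns None when the two circles do not intersect (no real y).
    """
    x = (L**2 + b**2 - c**2 + 2.0 * (b - c) * r_p) / (2.0 * L)  # equation (6)
    y_sq = (r_p + b) ** 2 - x**2                                # radicand of equation (7)
    if y_sq < 0.0:
        return None
    return x, math.sqrt(y_sq)                                   # positive root (assumption)

# Hypothetical check: a source at (0.5, 3.0) with L = 1, using the true r as r'.
L = 1.0
r_true = 3.0 - math.sqrt(3.0) / 2.0          # distance from A = (L/2, sqrt(3)L/2)
b = math.hypot(0.5, 3.0) - r_true            # extra path to B
c = math.hypot(0.5 - L, 3.0) - r_true        # extra path to C
x, y = provisional_source(r_true, b, c, L)
```

When r′ equals the true distance, the returned point coincides with the source used to generate b and c; any other r′ gives a different point on the same hyperbola.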
【００３８】
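It is then verified whether the distance between the position [x, y] of the provisional sound source P′ obtained above and the position [L/2, 3^(1/2)·L/2] of the sound wave detecting means A (that is, the length AP′) is consistent with the assumed parameter r′. Taking δr as the consistency index,
δr = {(x - L/2)^2 + (y - 3^(1/2)·L/2)^2}^(1/2) - r′ … (8)
If δr = 0, consistency is complete, so r = r′ and the position of P′ equals the position of the sound source P. Normally, however, δr is not zero, so the magnitude of δr is used to evaluate how close P′ is to the true sound source P.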
【００３９】
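One way to evaluate δr is to take a reference value k and regard consistency as obtained if ABS[δr] < k (the allowable range mentioned above) and as not obtained if ABS[δr] ≥ k (ABS[·] denotes the absolute value; the same applies below). The reference value k is set according to the accuracy required of the sound source position, for example k = 10 cm.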
【００４０】
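Another possibility is to evaluate the ratio of δr to a parameter describing the relative positions of the sound wave detecting means A, B, and C, for example the distance L between B and C. With a reference ratio m, consistency is regarded as obtained if ABS[δr/L] < m (the allowable range) and as not obtained if ABS[δr/L] ≥ m. The reference ratio m, for example m = 0.03 (3%), is likewise set according to the accuracy required of the sound source position.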
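The two acceptance tests can be sketched as follows. This assumes δr is read as the distance AP′ minus r′, with A at (L/2, √3·L/2); the function names are hypothetical, and the values of k and m are the examples given in the text.

```python
import math

def consistency_index(x, y, r_p, L):
    """delta_r: distance from P' = (x, y) to detecting means A, minus the assumed r'."""
    ax, ay = L / 2.0, math.sqrt(3.0) * L / 2.0
    return math.hypot(x - ax, y - ay) - r_p

def consistent_abs(delta_r, k):
    """Absolute test: ABS[delta_r] < k, with k in the same length unit as delta_r."""
    return abs(delta_r) < k

def consistent_rel(delta_r, L, m):
    """Relative test: ABS[delta_r / L] < m, with m a dimensionless ratio."""
    return abs(delta_r / L) < m

# A 5 cm mismatch passes k = 10 cm but fails m = 3% when L = 1 m.
passes_abs = consistent_abs(0.05, k=0.10)
passes_rel = consistent_rel(0.05, L=1.0, m=0.03)
```

The relative test scales the tolerance with the array size, so the same m can be reused when L changes.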
【００４１】
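If the allowable range (the reference value k, the reference ratio m, and so on) is set too small, there may be no solution, given the detection accuracy of the sound wave detecting means A, B, and C. The allowable range must therefore be set appropriately in view of the accuracy required for the position of the sound source P and the detection accuracy of the detecting means. The initial value of the assumed parameter r′ is preferably set closer to the detecting means than the actual sound source P is. This initial value may be a distance at a fixed ratio to a reference length such as L, or a fixed distance.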
【００４２】
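If, as a result, δr falls outside the allowable range, that is, if sufficient consistency is not obtained, the value of r′ is changed, the position of the provisional sound source P′ is obtained again, and δr is evaluated again. In other words, the consistency evaluation of the position of P′ is repeated. When δr comes within the allowable range, the position of P′ at that point is regarded as the detected position, and the position information of the sound source P is determined.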
【００４３】
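When consistency is not obtained, the assumed parameter is corrected in the direction that compensates for the deviation of r′ causing the inconsistency, that is, in the direction that brings the consistency index toward zero. For example, r′ is increased if δr is positive and decreased if δr is negative. The amount of correction is preferably set to correlate positively with the amount of deviation (the magnitude of the consistency index), so that the greater the inconsistency, the larger the correction to the assumed parameter.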
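The repeated trial calculation can be sketched as a loop over r′. The sketch below is not the proportional update described in the text: it swaps in plain bisection over an assumed bracket [r_lo, r_hi], which needs only the sign of δr. It also assumes δr is the distance AP′ minus r′ and takes the positive root of equation (7). The names `trial` and `locate` are hypothetical. Equations (1) to (3) can be satisfied by more than one r′ for some geometries, so the result depends on the bracket chosen.

```python
import math

def trial(r_p, b, c, L):
    """One trial calculation: (delta_r, (x, y)) for an assumed r', or None if P' is not real."""
    x = (L**2 + b**2 - c**2 + 2.0 * (b - c) * r_p) / (2.0 * L)  # equation (6)
    y_sq = (r_p + b) ** 2 - x**2                                # radicand of equation (7)
    if y_sq < 0.0:
        return None
    y = math.sqrt(y_sq)                                         # positive root (assumption)
    ax, ay = L / 2.0, math.sqrt(3.0) * L / 2.0                  # detecting means A
    return math.hypot(x - ax, y - ay) - r_p, (x, y)

def locate(b, c, L, r_lo, r_hi, k, max_iter=200):
    """Bisect on r' within [r_lo, r_hi] until |delta_r| < k. Returns (r', (x, y)) or None."""
    lo, hi = trial(r_lo, b, c, L), trial(r_hi, b, c, L)
    if lo is None or hi is None or lo[0] * hi[0] > 0.0:
        return None                       # bracket does not straddle a sign change
    for _ in range(max_iter):
        r_mid = 0.5 * (r_lo + r_hi)
        mid = trial(r_mid, b, c, L)
        if mid is None:
            return None
        if abs(mid[0]) < k:
            return r_mid, mid[1]          # consistent within the allowable range
        if mid[0] * lo[0] > 0.0:          # same sign as the lower end: raise the lower end
            r_lo, lo = r_mid, mid
        else:
            r_hi = r_mid
    return None

# Hypothetical geometry: L = 1 m, source 2 m beyond A along the array's axis of symmetry.
L = 1.0
px, py = 0.5, math.sqrt(3.0) / 2.0 + 2.0
b = math.hypot(px, py) - 2.0
c = math.hypot(px - L, py) - 2.0
result = locate(b, c, L, r_lo=0.1, r_hi=10.0, k=1e-9)
```

The tolerance here is far tighter than the 10 cm example in the text, purely so that the recovered distance can be checked against the known geometry.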
【００４４】
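In the above calculation, the sound speed V is used to obtain the propagation distance differences b and c from the arrival time differences, and this sound speed is preferably temperature-corrected. That is, the sound speed is defined as a function V = F(τ) of the temperature τ and is given a value corrected by the detected (ambient) temperature. For example, if the temperature dependence of the speed of sound in air is taken as F = 331.5 + 0.6τ (V in m/s, τ in °C), the calculation may be performed after correcting the sound speed with this expression, or the calculation result may be corrected with a correction value equivalent to applying it. In other words, temperature correction in the present invention broadly covers any correction processing equivalent to substantially correcting the sound speed in the above calculation.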
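As a small illustration of the temperature correction, the sketch below applies F = 331.5 + 0.6τ and converts a time difference into a path difference. The function names and the sample values are hypothetical.

```python
def sound_speed(tau_celsius):
    """Speed of sound in air in m/s, using F = 331.5 + 0.6 * tau from the text."""
    return 331.5 + 0.6 * tau_celsius

def path_difference(dt_seconds, tau_celsius):
    """Propagation distance difference in metres for an arrival time difference dt."""
    return sound_speed(tau_celsius) * dt_seconds

v20 = sound_speed(20.0)                    # 343.5 m/s at 20 degrees C
b_cold = path_difference(1.0e-3, 0.0)      # 1 ms at 0 degrees C
b_warm = path_difference(1.0e-3, 30.0)     # 1 ms at 30 degrees C
```

For the same 1 ms time difference, the path difference grows by 18 mm between 0 °C and 30 °C, which is the kind of shift the correction removes.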
【００４５】
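Examples of the position information of the sound source P include the direction of P (for example, the direction of P as seen from the sound wave detecting means A, or as seen from the centroid G of A, B, and C, which can be expressed as an angle θ (not shown) relative to the X axis of FIGS. 1 and 2) and the distance to P (for example, the distance r between A and P, or the distance between P and the centroid G of A, B, and C shown in FIG. 1). The position coordinates of P themselves (such as [Xp, Yp] or [r, θ]) may also be used.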
【００４６】
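FIG. 3 is a schematic block diagram showing the configuration of a sound source detection device 10 that uses the above sound source detection method. The sound source detection device 10 includes the sound wave detecting means (microphones) A, B, and C; an input signal processing unit 15 connected to them; an arithmetic processing unit 16 that performs the calculations described above based on the detection differences (the arrival time differences or the propagation distance differences) output from the input signal processing unit 15; and a calculation result output unit 17 that operates based on the position information of the sound source P obtained by the arithmetic processing unit 16.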
【００４７】
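The input signal processing unit 15 includes signal processing means 15A, which contains an amplifier, a band-pass filter, and the like and processes the sound wave signals output from A, B, and C to output shaped audio signals; specific point extraction means 15B, which extracts a specific point of the sound wave (for example, a rising edge) from each audio signal; and detection difference derivation means 15C, which, using the specific points extracted by 15B as references, obtains the time differences between the specific points at A, B, and C, or an equivalent quantity.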
【００４８】
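The arithmetic processing unit 16 has signal input/output means 16A composed of a bus, input/output circuits, and the like; a central processing unit 16B connected to the signal input/output means 16A; and storage means 16C connected to the signal input/output means 16A. The central processing unit 16B carries out the above calculations according to a program stored in advance in the storage means 16C.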
【００４９】
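The calculation result output unit 17 includes display means 17A, which displays as appropriate the state of the calculation (in progress, result obtained or not, calculation not possible, and so on) and the position information of the sound source obtained by the calculation, and operating units 17B and 17C, which operate according to the calculation result. The display means 17A and the operating units 17B and 17C need not both be provided; either one alone suffices. The sound source detection device 10 may also be configured without the calculation result output unit 17, outputting (transmitting) the calculation result directly to the outside.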
【００５０】
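FIG. 4 is a timing chart comparing the audio signals obtained from the detection signals of A, B, and C. Because A, B, and C are located at different positions, the audio signals obtained from them differ in phase. Determining this phase difference yields the arrival time differences or the propagation distance differences. Since the audio signals contain noise, a threshold S is set, for example, and only signal values exceeding S are detected; the timings ta, tb, and tc at which a predetermined feature of the signal occurs (a rising edge, a falling edge, a local maximum, or the like) are then obtained. These timings ta, tb, and tc serve as the sound wave arrival times, from which detection differences such as the arrival time differences and the propagation distance differences are obtained.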
【００５１】
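When the arrival times are to be obtained continuously, the timing at which the signal first exceeds the threshold S within each predetermined period T can be detected as the arrival times ta, tb, and tc, as illustrated. In this way, as long as the sound is loud enough to exceed S, a sound wave is detected in every period T and the calculation result is updated successively.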
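The per-period threshold detection can be sketched on a sampled signal as follows. Sampling, the sample rate, and the function name `first_crossings` are assumptions introduced for illustration; the text itself describes the signal only as a timing chart.

```python
def first_crossings(samples, sample_rate, threshold, period_samples):
    """For each period of period_samples samples, return the time (s) of the
    first sample above threshold, or None if the period has no such sample."""
    times = []
    for start in range(0, len(samples), period_samples):
        window = samples[start:start + period_samples]
        hit = next((i for i, s in enumerate(window) if s > threshold), None)
        times.append(None if hit is None else (start + hit) / sample_rate)
    return times

# Hypothetical 1 kHz signal, period T = 5 samples, threshold S = 0.5.
signal = [0.0, 0.1, 0.6, 0.9, 0.2, 0.0, 0.0, 0.7, 0.1, 0.0, 0.1, 0.2, 0.1, 0.0, 0.0]
arrivals = first_crossings(signal, sample_rate=1000.0, threshold=0.5, period_samples=5)
```

Running the same function on the signals from A, B, and C gives ta, tb, and tc for each period; a period in which any of them is None would simply be skipped.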
【００５２】
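FIG. 5 is an explanatory diagram showing a robot control method that uses the above sound source detection method. The robot has a main body 11 and, as operating units, a head 12 and arms 13 and 14. The sound wave detecting means A, B, and C are installed in the robot, forming the sound source detection device 10. The device obtains the direction of the sound source P as its position information by the method described above, and the robot is configured to turn toward that direction. Only the head 12 may be turned toward P, or the whole robot may be turned toward P. Sensors such as a camera (a CCD camera, for example) and a directional microphone are mounted on the head 12 and can capture information from the vicinity of P.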
【００５３】
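The sound source detection method, sound source detection device, and robot of the present invention are not limited to the illustrated examples described above, and various modifications may of course be made without departing from the gist of the invention. For example, the above embodiment obtains two-dimensional position information of the sound source on the XY plane, but by using four sound wave detecting means arranged so that they do not all lie in one plane, the method is readily extended to three-dimensional position information.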
【００５４】
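In the above embodiment the three sound wave detecting means A, B, and C are placed at the vertices of an equilateral triangle, but the invention is not limited to this; they may be placed at any positions provided they are spaced some distance apart. To detect a sound source in an arbitrary direction without difficulty, however, the detecting means should preferably not all lie on one line. To detect sound sources in all directions with nearly uniform accuracy, it is desirable to place the detecting means at the vertices of an equilateral triangle as described above.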
【００５５】
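The above embodiment uses only three sound wave detecting means, but four or more may be installed, with the same calculation performed on the detected values of the required number of them (three in the above example). With four or more detecting means, three or more independent detection differences are obtained, and it is preferable to select and use two of them. When the sound source direction is ultimately sought as the position information, it is desirable to select the two smallest detection differences. This is because a small detection difference means that the two detecting means from which it was obtained subtend a large angle at the sound source, and the larger that subtended angle, the smaller the error in the sound source direction obtained.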
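Selecting the smallest detection differences is a simple sort. The sketch below assumes the differences are held in a dictionary keyed by hypothetical pair labels such as "AB", and ranks them by absolute value.

```python
def smallest_differences(diffs, count=2):
    """Return the `count` (label, difference) pairs with the smallest magnitude."""
    return sorted(diffs.items(), key=lambda item: abs(item[1]))[:count]

# Hypothetical differences (metres) from four detecting means, with A as reference.
measured = {"AB": 0.30, "AC": 0.05, "AD": 0.12}
chosen = smallest_differences(measured)
```

Here the pairs AC and AD are kept and AB, the pair most nearly in line with the propagation direction, is dropped.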
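[Brief description of the drawings]
[FIG. 1] Explanatory diagram showing the premises of the sound source detection method of the embodiment.
[FIG. 2] Explanatory diagram showing the calculation performed in the sound source detection method of the embodiment.
[FIG. 3] Schematic block diagram showing the outline configuration of the sound source detection device of the embodiment.
[FIG. 4] Timing chart showing the audio signals of the embodiment.
[FIG. 5] Explanatory diagram showing the operation of the robot of the embodiment.
[FIG. 6] Explanatory diagram showing the concept of a conventional sound source direction detection method.
[Description of reference signs]
10: sound source detection device; 15: input signal processing unit; 16: arithmetic processing unit; 17: calculation result output unit; A, B, C: sound wave detecting means; P: sound source (point sound source); P′: provisional sound source; b, c: differences in sound wave propagation distance
[0001]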
[Technical field of the invention]
The present invention relates to a sound source detection method, a sound source detection device, and a robot, and more particularly to a configuration of a voice input interface suitable for use in a robot that detects a direction of a sound source and operates in response to voice.
[0002]
[Prior art]
In general, a known conventional method of detecting a sound source uses a plurality of microphones to detect a sound wave, obtains the differences between the detection times, and derives the sound source direction from those time differences. For example, sound direction detecting devices are known in which three or more microphones are arranged arbitrarily so as to form a triangle in one plane, and the direction of the sound source within that plane is detected from the differences in the sound wave arrival times at the microphones (see, for example, Patent Document 1 or Patent Document 2).
[0003]
Also known are an underwater direction meter that computes the direction of arrival of sound waves using a plurality of spatially separated microphones, and an underwater distance meter configured to measure the distance between two points in water (see, for example, Patent Document 3).
[0004]
The conventional method is outlined with reference to FIG. 6. It assumes that the sound source is not a point P but a line (or plane) Q, and obtains the sound source direction S on the assumption that sound propagates in parallel from the line (plane) Q. For example, suppose microphones are placed at the vertices of a triangle ABC and each captures the sound wave (its rising edge), first at point A, which is closest to the sound source, then at point C, and then at point B. The time differences between capture at A and capture at B and at C are detected and converted into distances b and c. Using these distances, the wavefront of the sound wave is obtained as a straight line Wq, and the direction orthogonal to Wq is taken as the sound source direction S.
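For comparison with the point-source approach, the plane-wave estimate can be written in closed form. The sketch below assumes the equilateral layout used later in the embodiment (B at the origin, C at (L, 0), A at (L/2, √3·L/2)), which the description of FIG. 6 does not itself specify. Under the plane-wave assumption, b = (A - B)·u and c = (A - C)·u for a unit vector u pointing toward the source; the function name is hypothetical.

```python
import math

def plane_wave_direction(b, c, L):
    """Direction S toward the source, in degrees from the X axis, assuming a plane wavefront."""
    ux = (b - c) / L                        # from b - c = L * ux
    uy = (b + c) / (math.sqrt(3.0) * L)     # from b + c = sqrt(3) * L * uy
    return math.degrees(math.atan2(uy, ux))

# A plane wave arriving from straight "above" the array gives b = c = sqrt(3)/2 * L.
L = 1.0
angle = plane_wave_direction(math.sqrt(3.0) / 2.0 * L, math.sqrt(3.0) / 2.0 * L, L)
```

Feeding this function the b and c produced by a nearby point source, rather than a plane wave, is one way to reproduce the kind of direction error the document attributes to the conventional method.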
[0005]
[Patent Document 1]
JP-A-59-105575
[Patent Document 2]
JP-A-57-125366
[Patent Document 3]
JP-A-7-209403
[0006]
[Problems to be solved by the invention]
However, the conventional sound source detection devices described above all treat the sound as "sound propagating in parallel" from a line (or plane) sound source Q, and not as "sound spreading radially" from a "point sound source" P, which is how most sound waves actually behave. The determined sound source direction S therefore contains considerable error (a trial calculation gave a maximum error of about 7 degrees compared with the solution obtained by drawing). That is, a human voice and the like are in fact sound waves propagating from a point P, and their wavefront Wp is arc-shaped (spherical), so a large error arises between the direction S obtained on the premise of a line (plane) sound source Q and the actual sound source direction.
[0007]
When the device is used in a robot configured to perform operations such as turning toward the sound source or grasping an object in response to a voice, such a detection error in the sound source direction S may make the robot's movements look unnatural or cause it to malfunction. Moreover, because the above method is premised on a line (plane) sound source Q, the distance to the sound source cannot be determined, so the sound source position cannot be detected and operations directed at the sound source cannot be performed.
[0008]
The present invention solves the above problems, and its object is to provide a novel sound source detection method and apparatus capable of reducing the detection error in the sound source direction. A further object is to realize a sound source detection method and apparatus that obtain more detailed position information about the sound source by determining its distance.
[0009]
[Means for Solving the Problems]
To solve the above problems, the sound source detection method of the present invention is a method of detecting a sound source using three or more sound wave detecting means, characterized in that a plurality of detection differences between mutually corresponding sound waves detected by the three or more sound wave detecting means are obtained, and position information of the sound source, treated as a point sound source, is determined based on the positions of the sound wave detecting means and the plurality of detection differences.
[0010]
According to this invention, the position information of the sound source as a point sound source is obtained from the positions of the sound wave detecting means and the plurality of detection differences, so the error in the sound source direction in particular is reduced and accurate position information can be obtained. The distance to the sound source can also be obtained.
[0011]
Here, the detection difference means a difference in sound wave propagation time or the corresponding difference in sound wave propagation distance. The two are directly related through the propagation speed of the sound wave.
[0012]
The position information of the sound source means at least one of its three-dimensional position coordinates, for example the rectangular coordinates (X, Y, Z) or, in polar coordinates, the sound source distance R and the sound source direction (θ, φ); that is, at least one of X, Y, Z, R, θ, and φ. When the position information is considered on a virtual plane, the coordinates of the projection of the three-dimensional position onto that plane are used, namely the rectangular coordinates (x, y) or the polar sound source distance r and sound source direction θ, and the position information is at least one of these.
[0013]
When detecting a sound source position on a two-dimensional plane, it is preferable to arrange three or more sound wave detecting means so that they do not all lie on one straight line. When detecting a sound source position in three-dimensional space, it is preferable to arrange four or more sound wave detecting means so that they do not all lie on one plane.
[0014]
In the present invention, it is preferable to set, as an assumed parameter, at least one of the parameters that define the sound source position based on the positions of the sound wave detecting means and the plurality of detection differences to an appropriate value, obtain a provisional sound source position from this assumed parameter and the detection differences, and check by trial calculation whether the relationship between the provisional sound source position and the positions of the three or more sound wave detecting means is consistent within a predetermined allowable range. If it is not, the value of the assumed parameter is changed and the trial calculation is continued, so that position information consistent within the allowable range is determined. This simplifies the arithmetic processing for obtaining the position information. Because the accuracy of the position information can be adjusted by changing the allowable range, the balance between that accuracy and the time required to calculate the sound source position can also be chosen freely or in steps as needed.
[0015]
In the present invention, the assumed parameter is preferably a propagation time or propagation distance of the sound wave. The propagation time or propagation distance is a parameter corresponding to the sound source distance (R or r above), and by assuming it, a provisional sound source position can be obtained using the plurality of detection differences. The accuracy of the provisional sound source position can then be evaluated by trial calculation of its consistency with the positions of the three or more sound wave detecting means.
[0016]
For example, when the detection differences are obtained between the detection state of one sound wave detecting means, taken as the reference, and the detection states of the other sound wave detecting means, a provisional sound source position is obtained from the detection differences, the position coordinates of the other detecting means, and the assumed parameter. The consistency of the provisional sound source position can then be judged by whether the distance between it and the reference detecting means matches the sound source distance corresponding to the assumed parameter.
[0017]
In the present invention, it is preferable to obtain two detection differences using three sound wave detecting means and to determine the position information of a virtual sound source on the plane in which the three detecting means are arranged. Because three detecting means suffice to determine the position information of the virtual sound source on the plane, sufficient position information is obtained, and the arithmetic processing is simplified, when what is needed is the direction of the sound source within the plane or its distance within the plane.
[0018]
In the present invention, the position information is preferably determined using a sound speed that has been temperature-corrected according to the ambient temperature. Correcting the sound speed for temperature yields more accurate position information for the sound source.
[0019]
Next, the sound source detection device of the present invention is characterized by having three or more sound wave detecting means; detection difference calculating means for obtaining a plurality of detection differences between mutually corresponding sound waves detected by the three or more sound wave detecting means; and sound source position information calculating means for obtaining position information of a sound source, treated as a point sound source, based on the positions of the sound wave detecting means and the plurality of detection differences.
[0020]
In the present invention, the sound source position information calculating means preferably sets, as an assumed parameter, at least one of the parameters that define the sound source position based on the positions of the sound wave detecting means and the plurality of detection differences to an appropriate value, obtains a provisional sound source position from this assumed parameter and the detection differences, and checks by trial calculation whether the relationship between the provisional sound source position and the positions of the three or more sound wave detecting means is consistent within a predetermined allowable range. If it is not, the means changes the value of the assumed parameter and continues the trial calculation, thereby determining position information that is consistent within the allowable range.
[0021]
In the present invention, it is preferable that the assumed parameter is a propagation time or a propagation distance of a sound wave.
[0022]
In the present invention, it is preferable to use, as the plurality of detection differences, the required number taken in order from the smallest. Among the detection differences obtained between sound wave detecting means, a small one usually means that the pair subtends a large angle at the sound source, so using small detection differences reduces the error in the sound source direction derived from them. For example, when two sound wave detecting means are placed some distance apart, a large detection difference between them means that they are arranged in series along the propagation direction of the detected sound wave, and the angle they subtend at the sound source is therefore small. Conversely, a small detection difference means that the two are arranged side by side with respect to the sound source, and the subtended angle is large. Selecting the detection differences to use according to the sound source direction thus means that the best-placed detecting means are always used and the sound source direction is obtained with a small error.
[0023]
In the present invention, it is preferable to obtain two detection differences from three sound wave detecting means and to determine the position information of a virtual sound source on the plane in which the three detecting means are arranged. Because three detecting means suffice to determine the position information of the virtual sound source on the plane, sufficient position information is obtained, and the arithmetic processing is simplified, when what is needed is the direction of the sound source within the plane or its distance within the plane.
[0024]
In the present invention, the three sound wave detecting means are preferably arranged at the vertices of an equilateral triangle. This reduces the variation in detection accuracy that depends on the sound source direction. When the sound source direction is limited to a specific range, however, the detecting means may instead be arranged so that the angle they subtend for sound source directions within that range is as large as possible.
[0025]
In the present invention, it is preferable that the apparatus further includes a temperature detection unit, and the position information is determined using a sound velocity temperature-corrected by the temperature detected by the temperature detection unit. According to this, the error caused by the temperature dependence of the sound speed can be reduced, so that more accurate position information can be obtained.
[0026]
Next, the robot of the present invention is characterized by having any one of the sound source detection devices described above and by operating according to the position information determined by that device. Using the sound source detection device makes more accurate and appropriate operation possible. Operating according to the position information here includes outputting the position information (for example, displaying it).
[0027]
In the present invention, the robot is preferably configured to turn toward the sound source based on the position information determined by the sound source detection device. Because the robot detects a voice and turns toward the sound source, it appears from the outside as a responsive, sophisticated robot. From the robot's side, the range over which external information must be detected can be restricted to the sound source direction, which makes processing of that information easier. Here, the robot turning toward the sound source means directing the field of view of its camera, or various sensors such as a directional microphone, toward the sound source.
[0028]
The sound source detection method and apparatus of the present invention described above provide very notable effects: information from a sound source can be separated from other detected components such as noise and becomes easier to use; multiple sound sources can be discriminated and the information from each used individually; and a moving sound source can be tracked so that information exchange is maintained or its behavior monitored. Such a method and apparatus are particularly well suited as a voice input interface for a robot.
[0029]
In the present invention, the position information of the sound source is acquired by obtaining a numerical solution through iterative numerical calculation. Position information can therefore be obtained easily, without complicating the arithmetic processing, even though the sound source is treated as a point sound source.
[0030]
In the conventional example, the direction is calculated on the assumption that sound propagates in parallel. For sources better regarded as point sound sources, such as a human voice, the detection accuracy therefore varied with the sound source direction and some azimuth error was unavoidable. The present invention, in contrast, calculates on the premise that sound leaves a point sound source and spreads radially, so in principle there is no such error for a point sound source, and it is particularly suited to sources such as a human voice. When the target is a plane (or line) sound source, some direction error may arise, but because the source itself is then linear or planar, a small direction error poses no practical problem.
[0031]
Furthermore, the conventional example basically describes only a method of detecting the sound source direction. In the present invention, the sound source position can also be determined, so not only the sound source direction but also the distance to the sound source can be obtained as the position information. This makes it possible to perform various operations directed at the sound source accurately.
[0032]
BEST MODE FOR CARRYING OUT THE INVENTION
Next, embodiments of a sound source detection method and apparatus and a robot according to the present invention will be described in detail with reference to the accompanying drawings.
[0033]
FIG. 1 is an explanatory diagram showing the principle of the sound source detection method according to the present embodiment, and FIG. 2 is an explanatory diagram showing an example of calculation assumptions in the sound source detection method. The following description assumes this situation: sound wave detecting means (microphones) are arranged at points A, B, and C in the figure; point B is at the origin of the XY coordinates; point C is on the X axis; and points A, B, and C lie at the vertices of an equilateral triangle of side length L on the XY plane. The sound source P is assumed to lie on the XY plane, and the sound source P to be detected is assumed to be closest to the sound wave detecting means A.
[0034]
As shown in FIG. 1, when a sound wave propagates from a point-like sound source P, its wavefront Wp is a circle centered on the sound source P. The sound wave first reaches the sound wave detecting means A, which is closest to the sound source P, at time ta, and thereafter reaches the sound wave detecting means B at time tb and the sound wave detecting means C at time tc. Taking the sound wave detecting means A as the reference, the sound wave arrival time difference at the sound wave detecting means B is tb − ta, and that at the sound wave detecting means C is tc − ta. Letting the sound speed be V, the differences in sound wave propagation distance corresponding to these arrival time differences are b = V · (tb − ta) and c = V · (tc − ta). If the distance from the sound source P to the sound wave detecting means A is r, the distance from the sound source P to the sound wave detecting means B is r + b, and the distance from the sound source P to the sound wave detecting means C is r + c.
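The conversion from arrival times to propagation distance differences can be sketched as follows. This is a minimal illustration only: the function name `path_differences` and the default sound speed of 343 m/s are assumptions made for the example and do not appear in the specification.

```python
def path_differences(ta, tb, tc, v=343.0):
    """Return (b, c): extra propagation distance to B and to C relative to A.

    ta, tb, tc are the arrival times in seconds at detecting means A, B, C.
    v is the sound speed in m/s (343.0 is an assumed default, not a value
    taken from the specification).
    """
    b = v * (tb - ta)  # b = V * (tb - ta)
    c = v * (tc - ta)  # c = V * (tc - ta)
    return b, c
```

With these definitions, an arrival at B that is 1 ms later than at A corresponds to a path difference b of roughly 0.34 m at the assumed sound speed.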
[0035]
In this situation, the coordinates [Xp, Yp] of the sound source P can be obtained from the position coordinates of the sound wave detecting means A, B, and C and the differences b and c in sound wave propagation distance. For example, since
$r^2 = (X_p - L/2)^2 + (Y_p - \sqrt{3}L/2)^2$ … (1)
$(r + b)^2 = X_p^2 + Y_p^2$ … (2)
$(r + c)^2 = (X_p - L)^2 + Y_p^2$ … (3)
hold, the unknowns r, Xp, and Yp can be obtained if the simultaneous equations (1) to (3) can be solved. However, the differences b and c in sound wave propagation distance are numerical values based on actually detected time differences, so an attempt to solve equations (1) to (3) directly may in practice yield no solution. The present inventor therefore assumes a predetermined value for one of the unknowns r, Xp, and Yp that determine the position coordinates of the sound source P, obtains the position of a temporary sound source P′ from the assumed value as shown in FIG. 2, and performs a trial calculation of how well the relationship between the position of the temporary sound source P′ and the positions of the three sound wave detecting means A, B, and C matches.
[0036]
For example, the distance $\overline{AP'}$ between the sound wave detecting means A and the temporary sound source P′ (whose true value is r) is provisionally set to a predetermined value r′ as the assumed parameter, the position coordinates [x, y] of the temporary sound source P′ on the XY plane are obtained from this value r′, and the consistency between the positions of the sound wave detecting means A, B, and C and the position of the temporary sound source P′ is then checked.
[0037]
More specifically, the following is one example of the above method. From the assumed parameter r′ and the position coordinates of the sound wave detecting means B and C,
$(r' + b)^2 = x^2 + y^2$ … (4)
$(r' + c)^2 = (x - L)^2 + y^2$ … (5)
hold. Subtracting equation (5) from equation (4) eliminates $y^2$ and gives x, and substituting x back into equation (4) gives y. The position [x, y] of the temporary sound source P′ is therefore
$x = \{L^2 + b^2 - c^2 + 2(b - c)\,r'\} / (2L)$ … (6)
$y = \pm\{(r' + b)^2 - x^2\}^{1/2}$ … (7)
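Equations (6) and (7) can be evaluated directly for a given assumed value r′. The sketch below is illustrative only: the function name `trial_position` is hypothetical, and returning only the y ≥ 0 root is an assumption of this example (the source is taken to lie on the same side of line BC as detecting means A); the specification itself keeps both signs.

```python
import math


def trial_position(r_assumed, b, c, L):
    """Solve equations (4) and (5) for the temporary source P' = (x, y).

    Returns None when (r' + b)^2 - x^2 is negative, i.e. no real y exists.
    Only the y >= 0 root of equation (7) is returned (assumption of this
    sketch: the source is on the same side of line BC as detector A).
    """
    # Equation (6): x from the difference of equations (4) and (5).
    x = (L**2 + b**2 - c**2 + 2.0 * (b - c) * r_assumed) / (2.0 * L)
    y_squared = (r_assumed + b) ** 2 - x**2
    if y_squared < 0.0:
        return None
    # Equation (7), positive root.
    return x, math.sqrt(y_squared)
```

When r′ equals the true distance r and b and c are exact, the returned point coincides with the true source position; for other values of r′ it is only a candidate whose consistency still has to be checked.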
[0038]
Next, the consistency between the assumed parameter r′ and the distance from the position coordinates [x, y] of the temporary sound source P′ obtained as described above to the position coordinates [L/2, √3·L/2] of the sound wave detecting means A (i.e., $\overline{AP'}$) is verified. That is, letting δr be the evaluation index of consistency,
$\delta r = \{(x - L/2)^2 + (y - \sqrt{3}L/2)^2\}^{1/2} - r'$ … (8)
When δr = 0, complete consistency holds, so r = r′ and the position of the temporary sound source P′ coincides with the position of the sound source P. Since δr does not normally become exactly 0, however, the value of δr is used to evaluate how close the position of the temporary sound source P′ is to the true sound source P.
[0039]
One method of evaluating δr is to take k as a reference value and judge that consistency has been obtained if ABS[δr] < k (the above-mentioned allowable range) and that it has not been obtained if ABS[δr] ≥ k, where ABS[·] denotes the absolute value. The reference value k is set according to the accuracy required for the sound source position, for example, k = 10 cm.
[0040]
A method is also conceivable in which δr is evaluated as a ratio to a parameter describing the relative positional relationship of the sound wave detecting means A, B, and C, for example, the distance L between the sound wave detecting means B and C. In this case, letting the reference ratio be m, consistency is judged to have been obtained if ABS[δr / L] < m (the above allowable range) and not to have been obtained if ABS[δr / L] ≥ m. As above, the reference ratio m is set according to the accuracy required for the sound source position, for example, m = 0.03, that is, 3%.
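The two acceptance tests described above (absolute reference value k and reference ratio m) can be written as simple predicates. The function names are hypothetical, and the defaults merely restate the example values k = 10 cm and m = 3% with lengths assumed to be in metres.

```python
def within_absolute(delta_r, k=0.10):
    """ABS[delta_r] < k, with k in metres (0.10 m is the 10 cm example)."""
    return abs(delta_r) < k


def within_relative(delta_r, L, m=0.03):
    """ABS[delta_r / L] < m, with m dimensionless (0.03 is the 3 % example)."""
    return abs(delta_r / L) < m
```

The relative test scales the tolerance with the detector spacing, so the same value of δr can pass for a widely spaced array and fail for a compact one.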
[0041]
If the allowable range (the reference value k, the reference ratio m, or the like) is set too small, no solution may be obtained because of the limited detection accuracy of the sound wave detecting means A, B, and C. The allowable range must therefore be set appropriately in consideration of the accuracy required for the position of the sound source P, the detection accuracy of the sound wave detecting means, and so on. It is also preferable to set the initial value of the hypothetical parameter r′ closer to the detecting means than the actual sound source P. The initial value may be, for example, a fixed proportion of a reference length such as L, or a fixed distance.
[0042]
When δr is outside the allowable range, that is, when sufficient consistency is not obtained, the value of the hypothetical parameter r′ is changed, the position of the temporary sound source P′ is obtained again, and the evaluation of δr, that is, the evaluation of the consistency of the position of the temporary sound source P′, is repeated. When δr falls within the allowable range, the position of the temporary sound source P′ is regarded as the detected position, and the position information of the sound source P is determined.
[0043]
When consistency is not obtained, the value of the hypothetical parameter is changed in the direction that corrects the deviation of the parameter r′ causing the mismatch (that is, in the direction in which the evaluation index of consistency approaches zero); for example, the value of r′ is increased if δr is positive and decreased if δr is negative. In addition, the correction amount of the hypothetical parameter is preferably set so as to correlate positively with the amount of deviation causing the mismatch (the magnitude of the evaluation index of consistency); that is, the larger the mismatch, the larger the correction applied to the assumed parameter.
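The repeated trial calculation can be sketched as below. This is not the correction rule described above: because the appropriate sign and size of a correction depend on how δr is defined and on the geometry, the sketch substitutes a sign-agnostic technique, namely a scan of r′ over a grid followed by bisection once a sign change of δr is bracketed. Here δr is computed as the distance AP′ minus r′. The function names, the search limits `r_min` and `r_max`, the grid size `steps`, the small default tolerance `k`, and the choice of the y ≥ 0 root are all assumptions of the example.

```python
import math


def consistency_index(r_assumed, b, c, L):
    """Return (delta_r, x, y) for an assumed distance r', or None."""
    x = (L**2 + b**2 - c**2 + 2.0 * (b - c) * r_assumed) / (2.0 * L)  # eq. (6)
    y_squared = (r_assumed + b) ** 2 - x**2
    if y_squared < 0.0:
        return None  # no real temporary position for this r'
    y = math.sqrt(y_squared)  # eq. (7), y >= 0 root (assumed)
    dist_a = math.hypot(x - L / 2.0, y - math.sqrt(3.0) * L / 2.0)
    return dist_a - r_assumed, x, y  # delta_r = AP' - r'


def locate_source(b, c, L, r_min=0.0, r_max=10.0, steps=1000, k=1e-9):
    """Scan r' upward, bracket a sign change of delta_r, then bisect.

    Returns (x, y, r') once ABS[delta_r] < k, or None if nothing is found.
    """
    prev = None  # last grid point with a real position: (r', delta_r)
    for i in range(steps + 1):
        r_try = r_min + (r_max - r_min) * i / steps
        result = consistency_index(r_try, b, c, L)
        if result is None:
            prev = None
            continue
        delta = result[0]
        if abs(delta) < k:
            return result[1], result[2], r_try
        if prev is not None and prev[1] * delta < 0.0:
            lo, lo_delta, hi = prev[0], prev[1], r_try
            for _ in range(200):  # bisection on the bracketed interval
                mid = 0.5 * (lo + hi)
                result = consistency_index(mid, b, c, L)
                if result is None:
                    return None
                if abs(result[0]) < k:
                    return result[1], result[2], mid
                if lo_delta * result[0] < 0.0:
                    hi = mid
                else:
                    lo, lo_delta = mid, result[0]
            return None
        prev = (r_try, delta)
    return None
```

In practice the tolerance `k` would be chosen from the required accuracy, as with the 10 cm example. For some values of b and c more than one r′ may satisfy the consistency test; this sketch returns the first one it brackets when scanning upward from `r_min`, so the search limits affect which candidate is reported.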
[0044]
In the arithmetic processing described above, the sound speed V is used to obtain the differences b and c in sound wave propagation distance from the differences in sound wave arrival time, and it is preferable to correct the sound speed V for temperature. That is, the sound speed is treated as a function V = F(τ) of the temperature τ, and its value is corrected using the detected temperature (ambient temperature). For example, if the temperature dependence of the sound speed in air is F = 331.5 + 0.6τ (in m/s, with τ in °C), the above calculation may be performed after the sound speed has been corrected according to this equation, or a correction value equivalent to applying this equation may be calculated and used to correct the calculation result. In other words, the temperature correction in the present invention broadly includes any correction process equivalent to substantially correcting the sound speed in the above calculation.
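The temperature correction can be illustrated with the linear relation quoted above. The function name `sound_speed` is hypothetical; the units (metres per second, degrees Celsius) are those normally used with this approximation.

```python
def sound_speed(temp_celsius):
    """Approximate sound speed in air in m/s: F = 331.5 + 0.6 * tau."""
    return 331.5 + 0.6 * temp_celsius
```

A 10 °C change in ambient temperature shifts the sound speed by 6 m/s under this relation, which is a little under 2% and therefore comparable to the 3% reference ratio given as an example above.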
[0045]
The position information of the sound source P includes the direction of the sound source P (for example, the direction of P as seen from the sound wave detecting means A, or as seen from the centroid G of the sound wave detecting means A, B, and C, which can be expressed as an angle θ (not shown) with respect to the X axis in FIGS. 1 and 2) and the distance to the sound source P (for example, the distance r between the sound wave detecting means A and the sound source P, or the distance between the centroid G of the sound wave detecting means A, B, and C and the sound source P). The position coordinates of the sound source P ([Xp, Yp], [r, θ], etc.) may also be used.
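Converting the coordinates [Xp, Yp] into a direction and distance seen from the centroid G can be sketched as follows. The function name is hypothetical, and the centroid coordinates assume the equilateral arrangement of this embodiment (B at the origin, C at [L, 0], A at [L/2, √3·L/2]).

```python
import math


def polar_from_centroid(xp, yp, L):
    """Return (distance, theta) of the source as seen from the centroid G.

    theta is the angle from the X axis in radians, computed with atan2 so
    that all four quadrants are handled.
    """
    gx, gy = L / 2.0, math.sqrt(3.0) * L / 6.0  # centroid of triangle ABC
    dx, dy = xp - gx, yp - gy
    return math.hypot(dx, dy), math.atan2(dy, dx)
```

Using the centroid rather than detecting means A as the reference gives a direction that does not depend on which detecting means happens to be closest to the source.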
[0046]
FIG. 3 is a schematic block diagram showing the configuration of the sound source detection device 10 using the above sound source detection method. The sound source detection device 10 includes sound wave detecting means (microphones) A, B, and C; an input signal processing unit 15 connected to these sound wave detecting means; an arithmetic processing unit 16 that performs the various calculations described above based on the detection differences (differences in sound wave arrival time or in sound wave propagation distance); and a calculation result output unit 17 that operates based on the position information of the sound source P obtained by the arithmetic processing unit 16.
[0047]
The input signal processing unit 15 includes a signal processing unit 15A, comprising an amplifier and a band-pass filter, that processes the sound wave signals output from the sound wave detecting means A, B, and C and outputs shaped sound signals; a specific point extracting unit 15B that extracts a specific point (for example, a rising edge) of the sound wave; and a detection difference deriving unit 15C that, based on the specific points extracted by the specific point extracting unit 15B, obtains the time differences between the specific points among the sound wave detecting means A, B, and C, or an amount equivalent to them.
[0048]
The arithmetic processing unit 16 includes a signal input/output unit 16A comprising a bus and an input/output circuit, a central processing unit 16B connected to the signal input/output unit 16A, and storage means 16C connected to the signal input/output unit 16A. The central processing unit 16B performs the above-described arithmetic processing according to a program stored in advance in the storage means 16C.
[0049]
The calculation result output unit 17 includes a display unit 17A that displays, as appropriate, the status (calculation in progress, result obtained, or calculation not possible), the position information of the sound source obtained by the calculation, and the like, and operation units 17B and 17C that operate according to the calculation result. The display unit 17A and the operation units 17B and 17C need not all be provided; only one of them may be provided. The sound source detection device 10 may also be configured without the calculation result output unit 17 and simply output (transmit) the calculation result to the outside.
[0050]
FIG. 4 is a timing chart comparing the sound signals obtained from the detection signals of the sound wave detecting means A, B, and C. Because the sound wave detecting means A, B, and C are arranged at different positions, the sound signals obtained from them differ in phase. By determining this phase difference, the difference in sound wave arrival time or in sound wave propagation distance can be obtained. Because the sound signal contains noise, for example, a predetermined threshold value S is set, only signal values exceeding the threshold S are detected, and predetermined timings ta, tb, and tc of those signal values (rising edge, falling edge, maximum peak, etc.) are obtained. These timings ta, tb, and tc are the sound wave arrival times, from which detection differences such as the difference in sound wave arrival time and the difference in sound wave propagation distance are obtained.
[0051]
To obtain the sound wave arrival times continuously, a method can be used in which, as shown in the figure, the timing at which the threshold S is first exceeded within each predetermined period T is detected as the sound wave arrival times ta, tb, and tc. Thus, whenever the sound volume exceeds the threshold S, a sound wave can be detected in every predetermined period T and the calculation result can be updated successively.
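Detection of the first threshold crossing within each period T can be sketched for a sampled signal as follows. The sampled representation, the function name `first_crossings`, and its parameters are assumptions of this example; the specification does not state how the signal is digitized.

```python
def first_crossings(samples, sample_rate, threshold, period):
    """Return the time (s) of the first sample above threshold in each period.

    samples: sequence of signal amplitudes for one detecting means.
    sample_rate: samples per second.
    period: length T of each detection window in seconds.
    A window in which no sample exceeds the threshold yields None.
    """
    window = max(1, int(round(period * sample_rate)))
    times = []
    for start in range(0, len(samples), window):
        hit = None
        for i in range(start, min(start + window, len(samples))):
            if samples[i] > threshold:
                hit = i / sample_rate  # arrival time for this window
                break
        times.append(hit)
    return times
```

Applying the function to the signals of A, B, and C with the same window boundaries yields one set of arrival times ta, tb, and tc per period, from which the detection differences can be recomputed each time.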
[0052]
FIG. 5 is an explanatory diagram showing a robot control method using the above sound source detection method. This robot includes a main body 11, a head 12, which is an operation unit, and arms 13 and 14. The sound wave detecting means A, B, and C are installed in the robot and form part of the sound source detection device 10. The sound source detection device 10 determines the direction of the sound source P as the position information of the sound source P in the same manner as described above and directs the robot toward the sound source P. In this case, only the head 12 may be moved so as to face the sound source P, or the entire robot may be moved so as to face the sound source P. Sensors such as a camera (for example, a CCD camera) and a directional microphone are attached to the head 12, and these sensors can take in information from the vicinity of the sound source P.
[0053]
The sound source detection method, the sound source detection device, and the robot according to the present invention are not limited to the illustrated examples described above, and it goes without saying that various changes can be made without departing from the gist of the present invention. For example, in the above embodiment, two-dimensional sound source position information is obtained on the XY plane; however, by setting the required number of sound wave detecting means to four and arranging them so that they do not all lie on the same plane, the method can easily be extended to three-dimensional sound source position information.
[0054]
In the above embodiment, the three sound wave detecting means A, B, and C are arranged at the vertices of an equilateral triangle. However, the present invention is not limited to this, and the sound wave detecting means can be arranged at arbitrary positions. To detect a sound source in an arbitrary direction without difficulty, it is preferable that the sound wave detecting means are not all arranged on the same line. In addition, to allow sound sources in all directions to be detected with substantially uniform accuracy, it is desirable to arrange the sound wave detecting means at the vertices of an equilateral triangle as described above.
[0055]
In the above embodiment, only three sound wave detecting means are used. However, four or more sound wave detecting means may be provided, and the arithmetic processing can be performed in the same manner using the detection values of the required number of them (three in the above example). When four or more sound wave detecting means are used, three or more independent detection differences are obtained, and it is preferable to select and use two of them. When the sound source direction is ultimately to be obtained as the position information of the sound source, it is desirable to select and use the two smallest detection differences. This is because a small detection difference means that the angle subtended at the sound source by the two sound wave detecting means that produced it is large, so that the sound source direction can be obtained more accurately.
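Selecting the smallest detection differences when more detecting means than necessary are available can be sketched as follows. The function name, the dictionary representation, and the detector labels are hypothetical, and ranking by absolute value is an assumption of the example.

```python
def smallest_differences(differences, count=2):
    """Pick the `count` detection differences with the smallest magnitude.

    differences: dict mapping a detecting means label to its detection
    difference relative to the reference detecting means.
    """
    ranked = sorted(differences.items(), key=lambda item: abs(item[1]))
    return dict(ranked[:count])
```

For example, given differences for three additional detecting means labelled B, C, and D, the function keeps the two entries with the smallest values and discards the third before the position calculation is run.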
[Brief description of the drawings]
FIG. 1 is an explanatory diagram showing prerequisites of a sound source detection method according to an embodiment.
FIG. 2 is an explanatory diagram showing calculation contents of a sound source detection method according to the embodiment.
FIG. 3 is a schematic block diagram illustrating the configuration of the sound source detection device of the embodiment.
FIG. 4 is a timing chart showing an audio signal according to the embodiment.
FIG. 5 is an operation explanatory view showing the operation of the robot of the embodiment.
FIG. 6 is an explanatory diagram showing the concept of a conventional sound source direction detection method.
[Explanation of symbols]
10: sound source detection device, 15: input signal processing unit, 16: arithmetic processing unit, 17: calculation result output unit, A, B, C: sound wave detecting means, P: sound source (point-like sound source), P′: temporary sound source, b, c: differences in sound wave propagation distance

Claims

A sound source detection method for detecting a sound source using three or more sound wave detecting means, the method comprising: determining a plurality of detection differences between mutually corresponding sound waves detected by the three or more sound wave detecting means; and determining position information of the sound source as a point-like sound source based on the positions of the sound wave detecting means and the plurality of detection differences.

The sound source detection method according to claim 1, wherein at least one of a plurality of parameters for defining the sound source position of the sound wave based on the positions of the sound wave detecting means and the plurality of detection differences is set to an appropriate value as a hypothetical parameter; a temporary sound source position is determined based on the hypothetical parameter and the plurality of detection differences; a trial calculation is then made as to whether the relationship between the temporary sound source position and the positions of the three or more sound wave detecting means is consistent within a predetermined allowable range; and, if it is not consistent, the value of the hypothetical parameter is changed and the trial calculation is continued, whereby the position information having consistency within the allowable range is determined.

The sound source detection method according to claim 2, wherein the assumed parameter is a propagation time or a propagation distance of a sound wave.

3. The method according to claim 2, wherein two detection differences are obtained using three sound wave detecting means, and the position information of a virtual sound source on a plane on which the three sound wave detecting means are arranged is determined. 4. The sound source detection device according to 3.

The sound source detection method according to any one of claims 1 to 4, wherein the position information is determined using a sound velocity that has been temperature-corrected according to an ambient temperature.

A sound source detection device comprising: three or more sound wave detecting means; detection difference calculating means for obtaining a plurality of detection differences between mutually corresponding sound waves detected by the three or more sound wave detecting means; and sound source position information calculating means for obtaining position information of a sound source as a point-like sound source based on the positions of the sound wave detecting means and the plurality of detection differences.

The sound source detection device according to claim 6, wherein the sound source position information calculating means sets at least one of a plurality of parameters for defining the sound source position based on the positions of the sound wave detecting means and the plurality of detection differences to an appropriate value as a hypothetical parameter; determines a temporary sound source position based on the hypothetical parameter and the plurality of detection differences; then makes a trial calculation as to whether the relationship between the temporary sound source position and the positions of the three or more sound wave detecting means is consistent within a predetermined allowable range; and, if it is not consistent, changes the value of the hypothetical parameter and continues the trial calculation, thereby determining the position information having consistency within the allowable range.

The sound source detection device according to claim 7, wherein the assumed parameter is a propagation time or a propagation distance of a sound wave.

The sound source detection device according to claim 6, wherein a required number of detection differences are used in order from the smallest detection difference as the plurality of detection differences.

7. The method according to claim 6, wherein two detection differences obtained by the three sound wave detecting means are obtained, and the position information of a virtual sound source on a plane on which the three sound wave detecting means are arranged is determined. The sound source detection device according to any one of claims 8 to 12.

The sound source detecting device according to claim 10, wherein the three sound wave detecting means are arranged at vertices of an equilateral triangle.

12. The apparatus according to claim 6, further comprising a temperature detecting unit, wherein the position information is determined using a sound velocity temperature-corrected by the temperature detected by the temperature detecting unit. A sound source detection device according to the item.

A robot that operates according to the sound source detection device according to any one of claims 6 to 12 and the position information determined by the sound source detection device.

The robot according to claim 13, wherein the robot is configured to face a sound source direction based on the position information determined by the sound source detection device.