JP2007318274A

JP2007318274A - Sound emission/pickup apparatus

Info

Publication number: JP2007318274A
Application number: JP2006143633A
Authority: JP
Inventors: Toshiaki Ishibashi; 利晃石橋; Makoto Tanaka; 田中　　良; Norifumi Ukai; 訓史鵜飼
Original assignee: Yamaha Corp
Current assignee: Yamaha Corp
Priority date: 2006-05-24
Filing date: 2006-05-24
Publication date: 2007-12-06

Abstract

PROBLEM TO BE SOLVED: To suppress wraparound (sneak) sound more effectively.

SOLUTION: A first filter 203 corrects the frequency characteristics of an input sound pickup beam signal so that the wraparound sound has a uniform signal level over the entire frequency band, and delivers the corrected signal to an echo cancellation circuit 200. The echo cancellation circuit 200 generates a pseudo-regression sound signal from the input sound signal and subtracts it from the sound pickup beam signal corrected by the first filter 203. Because the first filter 203 makes the frequency characteristics of the wraparound sound signal substantially uniform over the entire band, the wraparound sound signal included in the corrected sound pickup beam signal and the pseudo-regression sound signal have substantially identical frequency spectra. Consequently, the wraparound sound is removed effectively by the echo cancellation circuit 200. A second filter 204 restores the frequency characteristics of the talker's utterance included in the sound pickup beam signal, which were altered together with those of the wraparound sound, to their original state before the output sound signal is delivered.

COPYRIGHT: (C)2008, JPO&INPIT

Description

The present invention relates to a sound emission and collection device used for audio conferences and the like held between a plurality of locations via a network or the like, and more particularly to a sound emission and collection device having an echo cancellation function.

Conventionally, a common method of holding an audio conference between remote locations is to install a sound emission and collection device at each location, connect these devices via a network, and exchange sound signals. In many such devices, a speaker that emits the sound from the partner device and a microphone that collects the sound on the local side are installed together in a single housing.

For example, the audio conference apparatus (sound emission and collection device) of Patent Document 1 emits a sound signal input via a network from a speaker arranged on the top surface, collects sound with microphones arranged on the side surfaces, each facing a different direction, and transmits the sound collection signals to the outside via the network.
Patent Document 1: JP-A-8-298696

However, in the apparatus of Patent Document 1, the microphones and the speaker are close to each other, so the sound collection signal of each microphone contains a large amount of wraparound sound from the speaker. As a result, the S/N ratio of the sound collection signal with respect to the utterance of the local talker decreases, and the talker's utterance cannot be clearly collected and output.

Echo cancellation is a conventional method for removing such wraparound sound. Echo cancellation uses an adaptive filter and a post processor, which is a subtractor. The adaptive filter generates, from the input sound signal (that is, the sound emission signal), a pseudo-regression sound signal that estimates the wraparound sound, and the post processor subtracts the pseudo-regression sound signal from the sound collection signal, thereby removing the wraparound sound.
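The adaptive-filter-plus-subtractor structure described above can be sketched as follows. This is a minimal illustration that assumes a normalized LMS (NLMS) update, which the source does not specify; the function name `nlms_echo_cancel`, the parameters `taps`, `mu`, and `eps`, and the echo path used in the example are all hypothetical.

```python
import random

def nlms_echo_cancel(emitted, picked_up, taps=8, mu=0.5, eps=1e-8):
    """Subtract an adaptive estimate of the echo of `emitted` from `picked_up`.

    NLMS is an assumed adaptation rule; the source only says "adaptive filter".
    Returns the residual (echo-cancelled) signal.
    """
    w = [0.0] * taps          # adaptive filter coefficients
    history = [0.0] * taps    # most recent emitted samples, newest first
    residual = []
    for x, d in zip(emitted, picked_up):
        history = [x] + history[:-1]
        # Pseudo-regression sound signal: the adaptive filter's echo estimate.
        y = sum(wi * hi for wi, hi in zip(w, history))
        e = d - y             # subtractor ("post processor")
        power = sum(h * h for h in history) + eps
        w = [wi + mu * e * hi / power for wi, hi in zip(w, history)]
        residual.append(e)
    return residual

random.seed(0)
emitted = [random.uniform(-1.0, 1.0) for _ in range(2000)]
# Hypothetical echo path: a short impulse response from speaker to microphone.
path = [0.6, 0.3, -0.2, 0.1]
echo = [sum(path[k] * emitted[n - k] for k in range(len(path)) if n - k >= 0)
        for n in range(len(emitted))]
residual = nlms_echo_cancel(emitted, echo)
tail_power = sum(e * e for e in residual[-200:]) / 200
echo_power = sum(v * v for v in echo[-200:]) / 200
```

In this noise-free example the picked-up signal contains only the echo, so the residual power in the last 200 samples falls far below the echo power once the filter has converged.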

However, a wraparound sound signal usually has frequency characteristics as shown in FIG. 6.
FIG. 6 is a diagram showing the frequency characteristics of a typical wraparound sound signal. Although these characteristics vary with the specifications of the device, a sound emission and collection device configured as in the embodiment described later generally has the frequency characteristics shown in FIG. 6.
As shown in FIG. 6, the signal level of the wraparound sound signal in the high range above 1000 Hz is lower than its signal level in the low range around 300 Hz. If echo cancellation is performed on a wraparound sound signal with such characteristics using calculation bits set with reference to the low-frequency component, the high-frequency component, because its signal level is low, is represented by fewer bits in the actual calculation, and its calculation accuracy is markedly lower than that of the low range. Therefore, even if echo cancellation of the low-frequency component can be performed with high accuracy, echo cancellation of the high-frequency component cannot.
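The loss of accuracy for a low-level component can be illustrated numerically. The sketch below quantizes two sine waves to the same 16-bit fixed-point grid and compares their signal-to-quantization-noise ratios. The 16-bit word length and the 40 dB level difference are assumptions chosen for illustration; the source gives no numeric values for FIG. 6.

```python
import math

def quantize(x, bits=16):
    """Round x (full scale = +/-1.0) to a signed fixed-point grid of `bits` bits."""
    step = 2.0 ** -(bits - 1)
    return round(x / step) * step

def quantization_snr_db(amplitude, bits=16, n=4096, cycles=37):
    """SNR in dB of a sine wave of the given amplitude after quantization."""
    signal_power = 0.0
    noise_power = 0.0
    for i in range(n):
        x = amplitude * math.sin(2.0 * math.pi * cycles * i / n)
        err = quantize(x, bits) - x
        signal_power += x * x
        noise_power += err * err
    return 10.0 * math.log10(signal_power / noise_power)

low_band_snr = quantization_snr_db(1.0)    # component near full scale
high_band_snr = quantization_snr_db(0.01)  # component 40 dB lower (assumed)
snr_gap_db = low_band_snr - high_band_snr
```

Because the quantization step is fixed by the word length, the weaker component ends up with a correspondingly lower SNR, which is the accuracy gap the passage describes.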

On the other hand, if the number of calculation bits is increased to improve the calculation accuracy for the high-frequency component, the processing load of echo cancellation, which already requires more computation than other processes, increases further. A higher-performance arithmetic processor (DSP) is then required, which raises the cost of the sound emission and collection device itself.

Therefore, an object of the present invention is to provide a sound emission and collection device that removes the influence of wraparound sound not only in the low range but also in the high range, without requiring a high-performance arithmetic processor or a complicated configuration, and that can collect and output the talker's voice with a high S/N ratio.

The sound emission and collection device of the present invention comprises: sound emission means for emitting an input sound signal; sound collection means for collecting sound and generating a sound collection signal; a wraparound characteristic correction filter that performs, on the sound collection signal, a first filtering process that makes the frequency characteristics of the sound wrapping around from the sound emission means to the sound collection means substantially uniform; echo cancellation means for performing echo cancellation on the sound collection signal that has undergone the first filtering process; and a characteristic readjustment filter that performs, on the sound collection signal echo-cancelled by the echo cancellation means, a second filtering process having the inverse characteristic of the first filtering process.

In this configuration, part of the sound emitted from the sound emission means wraps around and is collected by the sound collection means. As described above, the wraparound sound collected by the sound collection means has a high signal level in the low range and a low signal level in the high range.

The wraparound characteristic correction filter performs a first filtering process that raises the signal level of the high-frequency component of the wraparound sound to that of the low-frequency component. That is, it raises the signal level of the high-frequency component relative to the low-frequency component, which serves as the reference, so that the signal level is substantially uniform across frequency. In doing so, the frequency characteristics of the sound collection signal containing the wraparound sound change in accordance with the first filtering process.

The echo cancellation means performs echo cancellation on the sound collection signal that has been corrected so that the wraparound sound has uniform frequency characteristics. Because the signal level of the high-frequency component has been raised to match that of the low-frequency component, the high-frequency component is represented with a larger effective number of bits than before the correction (first filtering process). Here, the effective number of bits is the minimum number of bits needed to represent the signal level at the corresponding frequency: the lower the signal level, the smaller the effective number of bits, and the higher the signal level, the larger the effective number of bits. The increase in the effective number of bits increases the signal-level resolution of the high-frequency component as well as the low-frequency component. Echo cancellation is therefore performed with high accuracy in both the low and high ranges.
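The notion of an effective number of bits can be made concrete with a small calculation. The sketch below counts the bits needed to hold a peak level in a signed fixed-point word. The 16-bit word length, the function name `effective_bits`, and the example levels are assumptions for illustration only.

```python
def effective_bits(peak_level, word_bits=16):
    """Bits needed to represent `peak_level` (relative to full scale 1.0)
    in a signed fixed-point word of `word_bits` bits, including the sign bit."""
    max_code = 2 ** (word_bits - 1) - 1       # largest positive code
    code = int(peak_level * max_code)         # peak magnitude in quantization steps
    return code.bit_length() + 1 if code > 0 else 0

bits_before = effective_bits(0.01)  # high-range level before correction (assumed)
bits_after = effective_bits(1.0)    # same component raised to the low-range level
```

With these assumed levels the high-range component occupies 10 bits before correction and the full 16 bits after it.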

The characteristic readjustment filter performs a second filtering process that is the inverse of the first filtering process. That is, it applies the reverse correction to the sound collection signal that the first filtering process corrected to substantially uniform frequency characteristics, returning it to a signal with a high signal level in the low range and a low signal level in the high range, like the original wraparound sound. As a result, the sound collection signal from which the wraparound sound was removed after the frequency-characteristic correction is restored to its original frequency characteristics, that is, those at the time of sound collection.

As a result, a sound collection signal is generated in which the wraparound sound has been removed and whose frequency characteristics are unchanged from those at the time of sound collection.
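The correct, cancel, and restore sequence can be sketched with a pair of mutually inverse filters. As a stand-in for the wraparound characteristic correction filter, this sketch uses a first-order high-boost FIR, with a one-pole IIR as its exact inverse. These filter shapes, the coefficient `a`, and the test signals are assumptions. The source only requires that the second filter have the inverse characteristic of the first.

```python
import math

def first_filter(x, a=0.9):
    """High-boost FIR (assumed shape): y[n] = x[n] - a * x[n-1]."""
    out, prev = [], 0.0
    for v in x:
        out.append(v - a * prev)
        prev = v
    return out

def second_filter(y, a=0.9):
    """Exact inverse of first_filter: x[n] = y[n] + a * x[n-1]."""
    out, prev = [], 0.0
    for v in y:
        prev = v + a * prev
        out.append(prev)
    return out

# Illustrative signals: a talker component plus a wraparound (echo) component.
talker = [math.sin(0.05 * n) for n in range(200)]
echo = [0.5 * math.sin(0.3 * n + 1.0) for n in range(200)]
picked_up = [t + e for t, e in zip(talker, echo)]

corrected = first_filter(picked_up)
# Stand-in for an ideal echo canceller operating in the corrected domain.
echo_estimate = first_filter(echo)
residual = [c - e for c, e in zip(corrected, echo_estimate)]
restored = second_filter(residual)

max_error = max(abs(r - t) for r, t in zip(restored, talker))
```

Because both filters are linear and mutually inverse, the restored signal matches the talker component to within floating-point rounding once the echo has been subtracted in the corrected domain.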

Further, the sound collection means of the sound emission and collection device of the present invention comprises a microphone array in which a plurality of microphones are installed in a predetermined arrangement, and sound collection beam signal generation means that uses the sound signals collected by the plurality of microphones to generate a plurality of sound collection beam signals, each having strong directivity in a different direction, and outputs a sound collection beam signal as the sound collection signal. The wraparound characteristic correction filter and the characteristic readjustment filter select, from a plurality of filtering characteristics set in advance for each sound collection beam signal, the filtering characteristic corresponding to the selected sound collection directivity, and perform the filtering process with it.

In this configuration, a plurality of sound collection beam signals are generated from the sound collected by the microphones of the microphone array. Because each sound collection beam signal (sound collection signal) has a different sound collection directivity, the frequency characteristics of the wraparound sound also differ. The wraparound characteristic correction filter and the characteristic readjustment filter select a filtering characteristic for each sound collection directivity, so the filtering process is adapted to the selected sound collection directivity. The wraparound sound is thereby removed with higher accuracy in accordance with the sound collection directivity.

Further, the sound emission means of the sound emission and collection device of the present invention comprises a speaker array in which a plurality of speakers are installed in a predetermined arrangement, and sound emission control means that generates the sound emission signals for the plurality of speakers so that the sound emitted by the speakers realizes a plurality of mutually different sound emission characteristics. The wraparound characteristic correction filter and the characteristic readjustment filter select, from a plurality of filtering characteristics set in advance for each sound emission characteristic, the filtering characteristic corresponding to the selected sound emission characteristic, and perform the filtering process with it.

In this configuration, a plurality of sound emission characteristics are realized using the sound emitted from the speakers of the speaker array. The frequency characteristics of the wraparound sound differ for each sound emission characteristic. The wraparound characteristic correction filter and the characteristic readjustment filter select a filtering characteristic for each sound emission characteristic, so the filtering process is adapted to the selected sound emission characteristic. The wraparound sound is thereby removed with higher accuracy in accordance with the sound emission characteristic.

Further, the sound emission and collection device of the present invention comprises: sound collection means including a microphone array in which a plurality of microphones are installed in a predetermined arrangement, and sound collection beam signal generation means that uses the sound signals collected by the plurality of microphones to generate a plurality of sound collection beam signals, each having strong directivity in a different direction, and outputs a sound collection beam signal as the sound collection signal; and sound emission means including a speaker array in which a plurality of speakers are installed in a predetermined arrangement, and sound emission control means that generates the sound emission signals for the plurality of speakers so that the sound emitted by the speakers realizes a plurality of mutually different sound emission characteristics. The wraparound characteristic correction filter and the characteristic readjustment filter of this device select, from a plurality of filtering characteristics set in advance for each combination of sound collection beam signal and sound emission characteristic, the filtering characteristic corresponding to the selected combination, and perform the filtering process with it.

In this configuration, a filtering characteristic is selected for each combination of sound collection directivity and sound emission characteristic, so the filtering process is adapted to the selected combination. The wraparound sound is thereby removed with higher accuracy in accordance with the combination of sound collection directivity and sound emission characteristic.
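Selecting a preset characteristic per combination amounts to a table lookup. The sketch below keys a table by sound collection beam and sound emission characteristic. The emission-characteristic names, the stored values (here a single filter coefficient per entry), and the function name are hypothetical placeholders, since the source does not describe the stored format.

```python
# Hypothetical table: the emission-characteristic names and all numeric values
# are placeholders and are not taken from the source.
FILTER_TABLE = {
    ("MB11", "uniform"): 0.90,
    ("MB11", "virtual_point"): 0.85,
    ("MB21", "uniform"): 0.92,
    ("MB21", "virtual_point"): 0.88,
}

def select_filter_characteristic(beam, emission):
    """Return the preset characteristic for the selected beam/emission pair."""
    try:
        return FILTER_TABLE[(beam, emission)]
    except KeyError:
        raise KeyError(
            f"no filter characteristic stored for {(beam, emission)}"
        ) from None

coefficient = select_filter_characteristic("MB21", "virtual_point")
```

The same coefficient would then configure both the correction filter and its inverse, so the two filters always stay matched when the selected combination changes.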

Further, the wraparound characteristic correction filter and the characteristic readjustment filter of the sound emission and collection device of the present invention perform the filtering process with bit arithmetic of double the precision used by the echo cancellation means.

In this configuration, performing the bit arithmetic of the filtering process with double precision corrects and restores the high-range signal level more accurately. The wraparound sound is removed with still higher accuracy by increasing the amount of bit arithmetic in the filtering process without increasing that of the echo cancellation process. Because the echo cancellation process has a higher load than the filtering process, highly accurate removal of the wraparound sound is achieved without significantly increasing the overall processing load.

According to the present invention, the wraparound sound can be removed reliably and effectively regardless of its frequency characteristics, and sound from the direction of a sound source such as a talker can be collected and output with a high S/N ratio.

A sound emission and collection device according to an embodiment of the present invention will be described with reference to the drawings.
FIG. 1(A) is a plan view showing the microphone and speaker arrangement of the sound emission and collection device 1 according to the present embodiment, and FIG. 1(B) is a diagram showing the sound collection beam regions formed by the sound emission and collection device 1 shown in FIG. 1(A).

FIG. 2 is a functional block diagram of the sound emission and collection device 1 of the present embodiment.

The sound emission and collection device 1 of the present embodiment comprises a housing 101 provided with a plurality of speakers SP1 to SP3, a plurality of microphones MIC11 to MIC17 and MIC21 to MIC27, and the functional units shown in FIG. 2.

The housing 101 has a substantially rectangular parallelepiped shape elongated in one direction. At both ends of the long sides (surfaces) of the housing 101, legs (not shown) of a predetermined height are provided to hold the lower surface of the housing 101 a predetermined distance above the installation surface. In the following description, of the four side surfaces of the housing 101, the longer surfaces are referred to as long surfaces and the shorter surfaces as short surfaces.

Omnidirectional single speakers SP1 to SP3 of the same shape are installed on the lower surface of the housing 101. The single speakers SP1 to SP3 are installed in a straight line at regular intervals along the longitudinal direction, and the straight line connecting their centers runs along the long surfaces of the housing 101 and coincides in horizontal position with the central axis 100 connecting the centers of the short surfaces. That is, the straight line connecting the centers of the speakers SP1 to SP3 lies in a vertical reference plane containing the central axis 100. Arranging the single speakers SP1 to SP3 in this way forms the speaker array SPA10. In this state, when sound is emitted from the single speakers SP1 to SP3 of the speaker array SPA10, the emitted sound propagates equally to the two long surfaces. The emitted sound propagating toward the two opposing long surfaces travels in mutually symmetric directions orthogonal to the reference plane.

Microphones MIC11 to MIC17 of the same specification are installed on one long surface of the housing 101. The microphones MIC11 to MIC17 are installed in a straight line at regular intervals along the longitudinal direction, forming the microphone array MA10. Microphones MIC21 to MIC27 of the same specification are installed on the other long surface of the housing 101. The microphones MIC21 to MIC27 are likewise installed in a straight line at regular intervals along the longitudinal direction, forming the microphone array MA20. The microphone arrays MA10 and MA20 are arranged so that the vertical positions of their array axes coincide, and the microphones MIC11 to MIC17 of the microphone array MA10 and the microphones MIC21 to MIC27 of the microphone array MA20 are arranged at positions symmetric with respect to the reference plane. Specifically, for example, the microphones MIC11 and MIC21 are symmetric with respect to the reference plane, as are the microphones MIC17 and MIC27.

In the present embodiment, the speaker array SPA10 has three speakers and each of the microphone arrays MA10 and MA20 has seven microphones. However, the invention is not limited to these numbers, and the numbers of speakers and microphones may be set as appropriate for the specifications. The speaker spacing of the speaker array and the microphone spacing of the microphone arrays also need not be constant. For example, the elements may be arranged densely at the center in the longitudinal direction and more sparsely toward both ends.

Next, as shown in FIG. 2, the sound emission and collection device 1 of the present embodiment functionally comprises a control unit 10, an input/output connector 11, an input/output I/F 12, a sound emission directivity control unit 13, D/A converters 14, sound emission amplifiers 15, the speaker array SPA10 (speakers SP1 to SP3) described above, the microphone arrays MA10 and MA20 (microphones MIC11 to MIC17 and MIC21 to MIC27) described above, sound collection amplifiers 16, A/D converters 17, sound collection beam generation units 181 and 182, a sound collection beam selection unit 19, and an echo cancellation unit 20.

The control unit 10 controls the operation and processing of the entire device, including power supply control, and gives control commands such as arithmetic processing commands to each unit of the device in response to operation input commands from an operation unit (not shown).

The input/output I/F 12 converts an input sound signal from another sound emission and collection device, received via the input/output connector 11, from the data format (protocol) used on the network, and gives it to the sound emission directivity control unit 13 via the echo cancellation unit 20. The input/output I/F 12 also converts the output sound signal generated by the echo cancellation unit 20 into the data format (protocol) used on the network and transmits it to the network via the input/output connector 11.

If no sound emission directivity is set, the sound emission directivity control unit 13 gives a sound emission signal based on the input sound signal simultaneously to the speakers SP1 to SP3 of the speaker array SPA10. When the control unit 10 specifies a sound emission directivity, such as the setting of a virtual point sound source, the sound emission directivity control unit 13 generates individual sound emission signals by applying to the input sound signal delay processing, amplitude processing, and the like specific to each of the speakers SP1 to SP3, based on the specified sound emission directivity. The sound emission directivity control unit 13 outputs these individual sound emission signals to the D/A converters 14 provided for the respective speakers SP1 to SP3.
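The per-speaker delay and amplitude processing for a virtual point sound source can be sketched geometrically. The sketch below derives each speaker's delay from its distance to the virtual source and scales its gain inversely with that distance. The speaker positions, source position, sample rate, speed of sound, and the 1/distance gain law are assumptions. The source states only that speaker-specific delay and amplitude processing are applied.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed room-temperature value

def virtual_source_delays_and_gains(speaker_x, source_xy, sample_rate=16000):
    """Per-speaker delay (in samples) and gain for a virtual point source.

    `speaker_x` lists speaker positions along the array axis (metres);
    `source_xy` is the virtual source position (x, y) relative to that axis.
    """
    sx, sy = source_xy
    distances = [math.hypot(x - sx, sy) for x in speaker_x]
    nearest = min(distances)
    # Speakers farther from the virtual source fire later and more quietly.
    delays = [round((d - nearest) / SPEED_OF_SOUND * sample_rate)
              for d in distances]
    gains = [nearest / d for d in distances]
    return delays, gains

# Hypothetical geometry: three speakers 0.1 m apart, source 0.2 m behind the centre.
delays, gains = virtual_source_delays_and_gains([-0.1, 0.0, 0.1], (0.0, -0.2))
```

With this symmetric geometry the centre speaker gets zero delay and unit gain, while the two outer speakers share the same small delay and a reduced gain.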

Each D/A converter 14 converts its individual sound emission signal into analog form and outputs it to the corresponding sound emission amplifier 15, and each sound emission amplifier 15 amplifies the individual sound emission signal and gives it to the corresponding one of the speakers SP1 to SP3.

The speakers SP1 to SP3 convert the given sound emission signal or individual sound emission signals into sound and emit it to the outside. Because the speakers SP1 to SP3 are installed on the lower surface of the housing 101, the emitted sound is reflected by the surface of the desk on which the sound emission and collection device 1 is placed and propagates obliquely upward from the sides of the device, where the conference participants are located. Part of the emitted sound also wraps around from the bottom surface of the sound emission and collection device 1 to the side surfaces on which the microphone arrays MA10 and MA20 are installed.

The microphones MIC11 to MIC17 and MIC21 to MIC27 of the microphone arrays MA10 and MA20 may be omnidirectional or directional, but are preferably directional. They collect sound from outside the sound emission and collection device 1, convert it into electrical signals, and output the collected sound signals to the respective sound collection amplifiers 16.

The microphones MIC11 to MIC17 and MIC21 to MIC27 of the microphone arrays MA10 and MA20 also collect the wraparound sound of the sound emitted from the speakers SP1 to SP3.

Each sound collection amplifier 16 amplifies its collected sound signal and gives it to the corresponding A/D converter 17, and each A/D converter 17 converts the collected sound signal into digital form and outputs it to the sound collection beam generation unit 181 or 182. The collected sound signals from the microphones MIC11 to MIC17 of the microphone array MA10, installed on one long surface, are input to the sound collection beam generation unit 181, and the collected sound signals from the microphones MIC21 to MIC27 of the microphone array MA20, installed on the other long surface, are input to the sound collection beam generation unit 182.

The sound collection beam generation unit 181 performs predetermined delay processing and the like on the collected sound signals of the microphones MIC11 to MIC17 and generates sound collection beam signals MB11 to MB14. As shown in FIG. 1(B), the sound collection beam regions of the sound collection beam signals MB11 to MB14 are set to different regions of predetermined width along the long surface on the side where the microphones MIC11 to MIC17 are installed.

The sound collection beam generation unit 182 performs predetermined delay processing and the like on the collected sound signals of the microphones MIC21 to MIC27 and generates sound collection beam signals MB21 to MB24. As shown in FIG. 1(B), the sound collection beam regions of the sound collection beam signals MB21 to MB24 are set to different regions of predetermined width along the long surface on the side where the microphones MIC21 to MIC27 are installed.
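One common way to realize such delay processing is delay-and-sum beamforming, sketched below. The source says only "predetermined delay processing and the like", so the delay-and-sum structure, the integer-sample delays, and the pulse test signal are assumptions for illustration.

```python
def delay_and_sum(mic_signals, delays):
    """Form one sound collection beam: delay each microphone signal by an
    integer number of samples, then average across microphones."""
    length = len(mic_signals[0])
    beam = [0.0] * length
    for signal, delay in zip(mic_signals, delays):
        for n in range(delay, length):
            beam[n] += signal[n - delay]
    return [v / len(mic_signals) for v in beam]

# Hypothetical scene: a pulse reaches microphone k with a lag of k samples.
num_mics, length = 7, 32
mic_signals = []
for k in range(num_mics):
    signal = [0.0] * length
    signal[10 + k] = 1.0
    mic_signals.append(signal)

# Steering delays that compensate the arrival lags align all pulses.
steered = delay_and_sum(mic_signals, [num_mics - 1 - k for k in range(num_mics)])
unsteered = delay_and_sum(mic_signals, [0] * num_mics)
```

When the steering delays match the arrival lags, the pulses add coherently and the beam peak is 1.0. Without steering, the peak is only 1/7, which is how different delay sets give different directivities.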

The sound collection beam selection unit 19 applies full-wave rectification, band-pass filtering (BPF) over the speaker's voice frequency band, and peak detection to the input sound collection beam signals MB11 to MB14 and MB21 to MB24, selects the sound collection beam signal that mainly picked up the speaker's voice, and outputs it as the sound collection beam signal MB (corresponding to the "sound collection signal" of the present invention) to the echo cancellation unit 20.
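The selection step can be sketched roughly as follows. This is only an illustration: the crude band-pass filter (the difference of two one-pole low-pass filters), the smoothing constants, the order of the stages, and all function names are assumptions, since the description names the three operations without specifying them further.

```python
def one_pole_lowpass(x, alpha):
    """Simple one-pole low-pass filter; alpha in (0, 1] is an assumed smoothing constant."""
    y, state = [], 0.0
    for v in x:
        state += alpha * (v - state)
        y.append(state)
    return y

def band_pass(x, alpha_low, alpha_high):
    """Crude band-pass: fast low-pass minus slow low-pass (illustrative stand-in for the BPF)."""
    fast = one_pole_lowpass(x, alpha_high)
    slow = one_pole_lowpass(x, alpha_low)
    return [f - s for f, s in zip(fast, slow)]

def select_beam(beams, alpha_low=0.05, alpha_high=0.5):
    """Return the name of the beam signal with the largest detected peak."""
    peaks = {}
    for name, signal in beams.items():
        rectified = [abs(v) for v in signal]            # full-wave rectification
        band = band_pass(rectified, alpha_low, alpha_high)
        peaks[name] = max(abs(v) for v in band)         # peak detection
    return max(peaks, key=peaks.get)
```

For example, calling `select_beam` with a dictionary keyed by beam names such as `"MB11"` and `"MB12"` returns the key of the beam carrying the strongest signal.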

The sound collection beam selection unit 19 also supplies sound collection directivity information corresponding to the selected sound collection beam signal MB to the control unit 10.

The control unit 10 pairs the sound emission directivity information with the sound collection directivity information so that the two are synchronized in accordance with the delay of the wraparound sound, and supplies the pair to the echo cancellation unit 20.

FIG. 3 is a block diagram showing the main configuration of the echo cancellation unit 20 of the present embodiment.
The echo cancellation unit 20 includes, in order from the input side of the sound collection beam signal, a first filter 203, an echo cancellation circuit 200, and a second filter 204, and also includes a filter characteristic storage unit 205. The echo cancellation circuit 200 includes an adaptive filter 201 and a post processor 202. The adaptive filter 201, the first filter 203, and the second filter 204 are digital filters such as FIR filters, and each filter characteristic is set by the filter coefficients of the corresponding digital filter.

The filter characteristic storage unit 205 stores, for the first filter 203 and the second filter 204, filter characteristics set individually for each combination of sound emission directivity information and sound collection directivity information. These filter characteristics depend on the structure of the speaker array SPA10 and the microphone arrays MA10 and MA20, and on the positional relationship and installation conditions of the speakers SP and the microphones MIC, so they may be set in advance when the sound emission and collection device is installed. They can be obtained, for example, by emitting sound with one sound emission directivity, collecting it, and analyzing it for each sound collection directivity, and then repeating this for every sound emission directivity.

FIG. 4 is a set of graphs showing how the frequency characteristic of the wraparound sound differs with the combination of sound emission directivity and sound collection directivity. Graphs (A), (B), and (C) correspond to the individual sound emission directivities Dv1 to Dv3, and within each graph the characteristic curves correspond to the different sound collection directivities Ds11 to Ds14 (sound collection beam signals MB11 to MB14).

As shown in FIG. 4, for every combination of sound emission directivity and sound collection directivity, the signal level varies from one frequency component to another. The filter characteristic of the first filter 203 is set to suppress this level difference between frequency components so that the signal level is substantially the same in all frequency bands. That is, because the low-band signal level is high and the high-band signal level is low, the high-band signal level is raised to the low-band signal level. More specifically, the entire frequency band is divided into partial frequency regions, each of a predetermined frequency width, and the highest signal level in the low band is taken as the reference. A level shift amount is then set for each partial frequency region so that its signal level substantially matches this reference signal level. The level shift amounts are set and stored using more bits than the number of operation bits used in the echo cancellation circuit 200.
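The level shift rule described above can be sketched as follows, under the assumption that the measured wraparound level of each partial frequency region is available in decibels. The function name, the dB representation, and the `low_band_count` parameter (how many of the leading regions count as "the low band") are hypothetical.

```python
def level_shifts_db(band_levels_db, low_band_count):
    """Return the per-region level shift (in dB) that equalizes the wraparound spectrum.

    band_levels_db: measured wraparound level of each partial frequency region,
                    ordered from low to high frequency (assumed measurement data).
    low_band_count: number of leading regions treated as the low band (assumption).
    """
    # The highest level found in the low band is the reference level.
    reference = max(band_levels_db[:low_band_count])
    # Each region is shifted by the amount needed to reach the reference.
    return [reference - level for level in band_levels_db]
```

With illustrative levels of -10, -12, -20, and -30 dB and the first two regions treated as the low band, the shifts are 0, 2, 10, and 20 dB, so the high-band regions receive the largest boost.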

As FIGS. 4(A) to 4(C) each show, the frequency characteristic differs for each combination of sound emission directivity and sound collection directivity, so the filter characteristic is set from such level shift amounts separately for each combination.

The filter characteristic of the second filter 204, on the other hand, is set so as to cancel the filter characteristic of the first filter 203. That is, it is set so that the frequency characteristic of the signal level corrected by the first filter 203 is returned to the original frequency characteristic.

FIG. 5 is a conceptual diagram showing the contents of the filter characteristic storage unit 205. This description uses three sound emission directivities and eight sound collection directivities, but the numbers are not limited to these and may be set as appropriate. The filter characteristic storage unit 205 stores a filter characteristic for each combination of an individual sound emission directivity and an individual sound collection directivity.

As shown in FIG. 5, the filter characteristic storage unit 205 stores filter characteristics for each combination of the sound emission directivities Dv1, Dv2, and Dv3 with the sound collection directivities Ds11 to Ds14 and Ds21 to Ds24. For example, for the combination of the sound emission directivity Dv1 and the sound collection directivity Ds11, the first filter characteristic Fc111 and the second filter characteristic Fr111 are stored. Similarly, for the combination of the sound emission directivity Dv3 and the sound collection directivity Ds24, the first filter characteristic Fc324 and the second filter characteristic Fr324 are stored. In the example of FIG. 5, therefore, 3 × 8 = 24 first filter characteristics Fc and 24 second filter characteristics Fr are stored.
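The storage layout of FIG. 5 can be pictured as a lookup table keyed by the directivity pair. The sketch below is an assumption about representation only: it stores the characteristic names rather than real coefficient sets, and the function name and tuple layout are hypothetical.

```python
EMISSION_DIRECTIVITIES = ["Dv1", "Dv2", "Dv3"]
COLLECTION_DIRECTIVITIES = ["Ds11", "Ds12", "Ds13", "Ds14",
                            "Ds21", "Ds22", "Ds23", "Ds24"]

def build_filter_table():
    """Map each (emission, collection) pair to the names of its Fc and Fr entries."""
    table = {}
    for dv in EMISSION_DIRECTIVITIES:
        for ds in COLLECTION_DIRECTIVITIES:
            # Follows the naming in the text: Dv1 + Ds11 -> Fc111 / Fr111.
            suffix = dv[2:] + ds[2:]
            table[(dv, ds)] = ("Fc" + suffix, "Fr" + suffix)
    return table
```

A lookup such as `build_filter_table()[("Dv3", "Ds24")]` then yields the pair `("Fc324", "Fr324")`, matching the example in the text.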

On receiving the combination of sound emission directivity information and sound collection directivity information from the control unit 10, the echo cancellation unit 20 reads the corresponding first filter characteristic Fc and second filter characteristic Fr from the filter characteristic storage unit 205. The echo cancellation unit 20 sets the filter coefficients of the first filter 203 based on the read first filter characteristic Fc, and sets the filter coefficients of the second filter 204 based on the read second filter characteristic Fr.

The first filter 203 filters the input sound collection beam signal MB according to the first filter characteristic Fc and outputs the result to the post processor 202 of the echo cancellation circuit 200. That is, it raises the high-band signal level of the sound collection beam signal MB so that the frequency characteristic of the wraparound becomes substantially uniform from the low band to the high band. This cancels the change in frequency spectrum caused by the wraparound, so that, although the signal level differs, the result is equivalent to picking up wraparound sound with the same frequency spectrum as the input audio signal supplied to the sound emission directivity control unit 13. In addition, because the low and high bands are equalized at a high signal level, the number of effective bits representing the signal level at each frequency increases, and the resolution improves across the whole frequency band.

This level shift operation uses more bits than the number of operation bits used in the echo cancellation circuit 200. For example, if the echo cancellation circuit 200 operates with 16 bits, the level shift operation is set to 32 bits or the like. Floating-point arithmetic may also be used for the level shift. Because this allows many bits to be allocated to the high band, whose signal level is originally low, the signal level of the corrected sound collection beam signal MB can be computed more accurately. That is, the quantization error introduced during signal correction can be suppressed.
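The effect of the wider word length can be illustrated with a small numerical sketch. The fixed-point model below (signed values in the range -1 to 1, rounding to the nearest step) and the function names are assumptions made for illustration; the description does not specify the number format.

```python
def quantize(value, bits):
    """Round a value in the range -1..1 to a signed fixed-point grid with `bits` bits."""
    scale = 2 ** (bits - 1)
    return round(value * scale) / scale

def level_shift_error(sample, gain, bits):
    """Absolute error after applying a gain and storing the result with `bits` bits."""
    exact = sample * gain
    return abs(quantize(exact, bits) - exact)
```

For a weak high-band sample, the rounding error of the 32-bit result is bounded by half a 32-bit step, which is far smaller than the half-step bound of a 16-bit result.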

The adaptive filter 201 of the echo cancellation unit 20 generates, from the input audio signal, a pseudo-regression sound signal based on the sound collection directivity of the selected sound collection beam signal MB. As its initial condition, the adaptive filter 201 assumes that sound wraps around equally across the entire frequency band from low to high, regardless of the sound emission directivity and sound collection directivity, and starts generating the pseudo-regression sound signal on that basis. This simplifies (unifies) the starting conditions for generating the pseudo-regression sound signal and prevents an increase in computational load. The post processor 202 functions as a subtractor: it subtracts the pseudo-regression sound signal from the sound collection beam signal MB corrected by the first filter 203 and outputs the result to the second filter 204. As described above, the sound collection beam signal input to the post processor 202 contains wraparound sound whose signal level has been corrected to be substantially uniform across the entire frequency band, so the frequency spectra of the input audio signal and the corrected wraparound sound signal substantially match. Consequently, the frequency spectra of the pseudo-regression sound signal derived from the input audio signal and the wraparound sound signal contained in the sound collection beam signal MB also substantially match. In addition, the number of effective bits increases across the entire frequency band, which improves the resolution. As a result, the wraparound sound can be removed from the sound collection beam signal MB reliably and with high accuracy.
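The interaction between the adaptive filter 201 and the subtracting post processor 202 can be sketched with a generic adaptive echo canceller. The description does not name the adaptation algorithm, so the normalized LMS (NLMS) update used here is an assumption, as are the tap count, step size, zero-initialized coefficients, the toy echo path in `simulate_echo`, and all function names.

```python
import random

def simulate_echo(n, seed=0):
    """Generate a hypothetical far-end signal and a microphone signal holding only its echo."""
    rng = random.Random(seed)
    far_end = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    # Assumed toy wraparound path: 0.5 * x[n] + 0.25 * x[n-1].
    mic = [0.5 * far_end[i] + (0.25 * far_end[i - 1] if i > 0 else 0.0)
           for i in range(n)]
    return far_end, mic

def nlms_echo_cancel(far_end, mic, taps=4, mu=0.5, eps=1e-8):
    """Subtract an adaptively estimated echo from the microphone signal (NLMS sketch)."""
    w = [0.0] * taps          # adaptive filter coefficients
    buf = [0.0] * taps        # most recent far-end samples, newest first
    residual = []
    for x, d in zip(far_end, mic):
        buf = [x] + buf[:-1]
        estimate = sum(wi * bi for wi, bi in zip(w, buf))   # pseudo echo signal
        e = d - estimate                                     # subtraction step
        norm = sum(b * b for b in buf) + eps
        w = [wi + mu * e * bi / norm for wi, bi in zip(w, buf)]
        residual.append(e)
    return residual, w
```

In this toy setting the coefficients converge toward the assumed echo path and the residual shrinks toward zero; in the embodiment, the signal fed to the subtractor would be the output of the first filter 203 rather than the raw microphone signal.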

The second filter 204 filters the sound collection beam signal from which the wraparound sound has been removed according to the second filter characteristic Fr, and outputs the result to the input/output I/F 12 as the output audio signal. The second filter characteristic Fr lowers the high-band signal level that was raised by the first filter characteristic Fc, so that it again follows the frequency characteristic of the original wraparound sound. In this way, the frequency characteristic of audio other than the wraparound sound signal, such as the speaker's voice, whose high band was raised together with the wraparound sound by the first filter 203, is readjusted to its state before correction by the first filter 203. That is, an output audio signal is obtained that conforms to the sound collection beam signal with its frequency spectrum as originally collected.

This readjustment operation also uses more bits than the number of operation bits used in the echo cancellation circuit 200. For example, if the echo cancellation circuit 200 operates with 16 bits, the readjustment operation is set to 32 bits or the like. Floating-point arithmetic may also be used for the readjustment operation. This allows the signal level of the readjusted sound collection beam signal MB to be computed with high accuracy. That is, the quantization error introduced during signal readjustment can be suppressed.

Furthermore, by using double the precision of the computationally heavy echo cancellation processing only in the first and second filter operations, whose processing load is comparatively low, a sound collection beam signal (output audio signal) with the wraparound sound removed can be obtained with high accuracy, without further increasing the computational load of the echo cancellation processing, which is inherently complex and demanding.

As described above, the configuration and processing of the present embodiment make it possible to realize a sound emission and collection device that removes wraparound sound with high accuracy and acquires and outputs only the required sound, such as the speaker's voice, at a high S/N ratio.

The above description gave an example in which both the first filter characteristic Fc and the second filter characteristic Fr are stored in the filter characteristic storage unit 205. However, because the second filter characteristic Fr is the inverse correction of the first filter characteristic Fc, it is also possible to store only the first filter characteristic Fc, compute the second filter characteristic Fr from the selected first filter characteristic Fc, and set it in the second filter 204.
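Deriving Fr from Fc can be sketched as follows, assuming each characteristic is represented as a list of per-band linear gains. That representation and the function name are hypothetical; an actual implementation would have to derive FIR coefficients for the second filter 204, which is not covered here.

```python
def inverse_characteristic(first_gains):
    """Return per-band gains that cancel the given first-filter gains (linear scale)."""
    return [1.0 / g for g in first_gains]
```

For instance, first-filter gains of 1, 2, 4, and 8 give second-filter gains of 1, 0.5, 0.25, and 0.125, so that the product of the two gains in every band is 1. Storing only Fc in this way would halve the number of stored characteristics at the cost of a small computation at selection time.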

The above description also dealt with a sound emission and collection device that has both a microphone array with sound collection directivity and a speaker array with sound emission directivity. However, the configuration described above can also be applied to a combination of a microphone or microphone array without sound collection directivity and a speaker array with sound emission directivity, or to a combination of a microphone array with sound collection directivity and a speaker or speaker array without sound emission directivity.

FIG. 1 is a plan view showing the microphone and speaker arrangement of the sound emission and collection device according to the present embodiment, together with a diagram showing the sound collection beam regions formed by the device.
FIG. 2 is a functional block diagram of the sound emission and collection device of the present embodiment.
FIG. 3 is a block diagram showing the main configuration of the echo cancellation unit 20 of the present embodiment.
FIG. 4 is a graph showing how the frequency characteristic of the wraparound sound differs with the combination of sound emission directivity and sound collection directivity.
FIG. 5 is a conceptual diagram showing the contents of the filter characteristic storage unit 205.
FIG. 6 is a diagram showing the approximate frequency characteristic of the wraparound sound.

Explanation of symbols

1: sound emission and collection device; 101: housing; 11: input/output connector; 12: input/output I/F; 13: sound emission directivity control unit; 14: D/A converter; 15: sound emission amplifier; 16: sound collection amplifier; 17: A/D converter; 181, 182: sound collection beam generation units; 19: sound collection beam selection unit; 20: echo cancellation unit; 201: adaptive filter; 202: post processor; 203, 204: filters; SP1 to SP3: speakers; SPA10: speaker array; MIC11 to MIC17, MIC21 to MIC27: microphones; MA10, MA20: microphone arrays

Claims

1. A sound emission and collection device comprising:
sound emission means for emitting an input audio signal;
sound collection means for collecting sound and generating a collected sound signal;
a wraparound characteristic correction filter that performs, on the collected sound signal, a first filtering process that makes the frequency characteristic of sound wrapping around from the sound emission means to the sound collection means substantially uniform;
echo cancellation means for performing echo cancellation processing on the collected sound signal subjected to the first filtering process; and
a characteristic readjustment filter that performs, on the collected sound signal echo-canceled by the echo cancellation means, a second filtering process having the inverse characteristic of the first filtering process.

2. The sound emission and collection device according to claim 1, wherein the sound collection means comprises a microphone array in which a plurality of microphones are installed in a predetermined arrangement, and sound collection beam signal generating means for generating, from the sound signals collected by the plurality of microphones, a plurality of sound collection beam signals having strong directivities in mutually different directions and outputting a sound collection beam signal as the collected sound signal, and
the wraparound characteristic correction filter and the characteristic readjustment filter perform filtering by selecting, from a plurality of filtering characteristics set in advance for each sound collection beam signal, the filtering characteristic corresponding to the selected sound collection beam signal.

3. The sound emission and collection device according to claim 1, wherein the sound emission means comprises a speaker array in which a plurality of speakers are installed in a predetermined arrangement, and sound emission control means for generating sound emission signals for the speakers so that the sounds emitted from the plurality of speakers form a plurality of mutually different sound emission characteristics, and
the wraparound characteristic correction filter and the characteristic readjustment filter perform filtering by selecting, from a plurality of filtering characteristics set in advance for each sound emission characteristic, the filtering characteristic corresponding to the selected sound emission characteristic.

4. The sound emission and collection device according to claim 2, wherein the sound emission means comprises a speaker array in which a plurality of speakers are installed in a predetermined arrangement, and sound emission control means for generating sound emission signals for the speakers so that the sounds emitted from the plurality of speakers form a plurality of mutually different sound emission characteristics, and
the wraparound characteristic correction filter and the characteristic readjustment filter perform filtering by selecting, from a plurality of filtering characteristics set in advance for each combination of sound collection beam signal and sound emission characteristic, the filtering characteristic corresponding to the selected combination.

5. The sound emission and collection device according to any one of claims 1 to 4, wherein the wraparound characteristic correction filter and the characteristic readjustment filter perform filtering using bit arithmetic with double the precision of the echo cancellation means.