JP5780096B2

JP5780096B2 - Audio processing apparatus, audio processing method, and audio processing program

Info

Publication number: JP5780096B2
Application number: JP2011217119A
Authority: JP
Inventors: 龍一神田
Original assignee: Brother Industries Ltd
Current assignee: Brother Industries Ltd
Priority date: 2011-09-30
Filing date: 2011-09-30
Publication date: 2015-09-16
Anticipated expiration: 2031-09-30
Also published as: JP2013078010A

Description

The present invention relates to an audio processing apparatus, an audio processing method, and an audio processing program that perform sound input/output processing and can connect to and communicate with another audio processing apparatus.

Audio processing apparatuses are known that transmit sound collected by a microphone over a network to another audio processing apparatus installed at a remote location while simultaneously receiving sound from the remote location over the network and outputting it from a speaker. Such apparatuses are widely used in remote conference systems and the like. A speakerphone is one example. An audio processing apparatus is usually equipped with an echo cancellation function, and by running it the apparatus can suppress echo and howling. For example, Patent Document 1 discloses a technique that applies the echo cancellation function to reduce the gain of acoustic coupling, thereby suppressing echo and howling in a system that includes two or more speakers and microphones.

Audio processing apparatuses are also known that can be connected to one another and used together within the same site. By distributing such apparatuses over a wide area within the site, both the audible range of the conference audio and the pickup range for speech can be widened.

Patent Document 1: JP 2001-95084 A

Conventional echo cancellation is configured to work when the speaker and microphone of a single audio processing apparatus are used. Consequently, when multiple audio processing apparatuses are connected within the same site, conventional echo cancellation may not sufficiently suppress echo and howling. With the technique described in Patent Document 1, on the other hand, echo and howling can presumably be suppressed even when multiple apparatuses are connected within the same site. In that case, however, a complex algorithm for controlling the gain of acoustic coupling must be implemented in the audio processing apparatus. The configuration of the apparatus then becomes more complicated and its component count increases, so the apparatus cannot be provided at low cost.

An object of the present invention is to provide an audio processing apparatus, an audio processing method, and an audio processing program that can suppress echo and howling when multiple audio processing apparatuses are connected to one another and used together, while limiting both the complexity of the apparatus configuration and the increase in the number of components.

An audio processing apparatus according to a first aspect of the present invention is an audio processing apparatus including a microphone and a speaker, and comprises: communication control means for communicating with another audio processing apparatus; first specifying means for specifying, by communicating with the other audio processing apparatus through the communication control means, a timing at which a predetermined sound is output from the speaker; output control means for performing control to output the predetermined sound from the speaker at a predetermined volume at the timing specified by the first specifying means; second specifying means for specifying, when the predetermined sound output at the predetermined volume from another speaker connected to the other audio processing apparatus is acquired via the microphone, the volume and the direction of arrival of the acquired predetermined sound; first adjusting means for adjusting, based on the volume and the direction of arrival specified by the second specifying means, an overall directivity, which is the directivity of sensitivity to sound arriving at the audio processing apparatus, such that sensitivity to sound arriving at the audio processing apparatus from a specific direction is attenuated; transmitting means for transmitting to the other audio processing apparatus, when the volume at which sound is output from the speaker is changed, a change notification that conveys information for specifying the changed volume; receiving means for receiving the change notification transmitted by the other audio processing apparatus; and second adjusting means for readjusting, when the change notification is received by the receiving means, the overall directivity adjusted by the first adjusting means, based on the information conveyed by the change notification.

According to the first aspect, even when multiple audio processing apparatuses are used while connected to one another, echo and howling caused by sound output from the other audio processing apparatuses can be suppressed. In addition, the well-known echo cancellation function that an audio processing apparatus normally has can be used as it is, so no new special-purpose echo cancellation function needs to be implemented. Echo and howling that arise when multiple apparatuses are connected can therefore be suppressed efficiently while limiting both the complexity of the apparatus configuration and the increase in the number of components. Furthermore, when the volume of the sound output from the speaker of another audio processing apparatus is changed, the overall directivity needed to suppress echo and howling must also be adjusted again. In this aspect, the overall directivity is readjusted based on the changed volume of the sound output from the speaker of the other apparatus, so that sensitivity to sound arriving from the direction of arrival can be attenuated. The audio processing apparatus can thus readjust the overall directivity to an optimum state. Because the overall directivity is adjusted automatically, there is no need to set it up again from scratch by having the speaker of the other apparatus output the predetermined sound once more.

In the first aspect, the first adjusting means may adjust the overall directivity by attenuating the sensitivity in the direction of arrival corresponding to the volume when the volume specified by the second specifying means is equal to or greater than a predetermined threshold. Selectively attenuating the input level of sound output from the other audio processing apparatus effectively suppresses echo and howling.

In the first aspect, the first adjusting means may adjust the overall directivity by attenuating the sensitivity in the direction of arrival corresponding to the volume so that the volume at which the predetermined sound is acquired via the microphone falls within the predetermined threshold. The input level of sound output from the other audio processing apparatus can thereby be attenuated to a level at which no echo or howling occurs, which suppresses echo and howling even more effectively.
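The two threshold-based variants above can be sketched as follows. This is only an illustration under stated assumptions, not the implementation described in the patent: levels are assumed to be expressed in decibels, directions are assumed to be keyed by an angle in degrees, and the names `attenuation_for_directions`, `measured_db`, and `threshold_db` are hypothetical.

```python
def attenuation_for_directions(measured_db, threshold_db):
    """Return the attenuation (in dB) to apply per direction of arrival.

    measured_db maps a direction (degrees, a hypothetical convention) to the
    level at which the predetermined sound from another apparatus was picked up.
    """
    attenuation = {}
    for direction, level in measured_db.items():
        if level >= threshold_db:
            # Attenuate just enough that the picked-up level drops to the
            # threshold, i.e. falls within the threshold.
            attenuation[direction] = level - threshold_db
        else:
            # Directions that are already quiet enough are left untouched.
            attenuation[direction] = 0.0
    return attenuation


# Hypothetical measurements: a loud neighbour at 90 degrees, a quiet one at 270.
levels = {90: -20.0, 270: -45.0}
print(attenuation_for_directions(levels, threshold_db=-30.0))
```

With these illustrative numbers, the direction at 90 degrees is attenuated by 10 dB and the direction at 270 degrees is left as it is.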

In the first aspect, the transmitting means may calculate the difference between the volume before and after the change and transmit the change notification to convey that difference, and the second adjusting means may readjust the overall directivity adjusted by the first adjusting means based on the difference conveyed by the change notification. The audio processing apparatus can then readjust the overall directivity based on the difference in the volume of the sound output from the speaker of the other apparatus before and after the change, and attenuate sensitivity to sound arriving from the direction of arrival. The apparatus can thus adjust the overall directivity to an optimum state according to how much the volume of the sound output from the speaker of the other apparatus has changed.
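The difference-based readjustment can be sketched as below. The sketch rests on assumptions that do not come from the source: volumes and levels are in decibels, a change of N dB at the other speaker is assumed to change the picked-up level by the same N dB, and the names `DirectionState`, `make_change_notification`, `readjust`, and the `volume_diff_db` field are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class DirectionState:
    received_db: float     # level at which the other speaker is picked up
    attenuation_db: float  # attenuation currently applied to that direction


def make_change_notification(old_volume_db, new_volume_db):
    """Sender side: report only the difference between old and new volume."""
    return {"volume_diff_db": new_volume_db - old_volume_db}


def readjust(state, notification, threshold_db):
    """Receiver side: shift the expected level by the notified difference
    and recompute the attenuation for that direction."""
    received = state.received_db + notification["volume_diff_db"]
    attenuation = max(0.0, received - threshold_db)
    return DirectionState(received_db=received, attenuation_db=attenuation)


# Hypothetical values: the neighbour raises its speaker volume by 6 dB.
state = DirectionState(received_db=-20.0, attenuation_db=10.0)
note = make_change_notification(old_volume_db=-10.0, new_volume_db=-4.0)
print(readjust(state, note, threshold_db=-30.0))
```

Because only the difference is exchanged, the receiving apparatus can update its stored state without the predetermined sound being played again.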

An audio processing apparatus according to a second aspect of the present invention is an audio processing apparatus including a microphone and a speaker, and comprises: communication control means for communicating with another audio processing apparatus; first specifying means for specifying, by communicating with the other audio processing apparatus through the communication control means, a timing at which a predetermined sound is output from the speaker; output control means for performing control to output the predetermined sound from the speaker at a predetermined volume at the timing specified by the first specifying means; second specifying means for specifying, when the predetermined sound output at the predetermined volume from another speaker connected to the other audio processing apparatus is acquired via the microphone, the volume and the direction of arrival of the acquired predetermined sound; first adjusting means for adjusting, based on the volume and the direction of arrival specified by the second specifying means, an overall directivity, which is the directivity of sensitivity to sound arriving at the audio processing apparatus, such that sensitivity to sound arriving at the audio processing apparatus from a specific direction is attenuated; and second adjusting means for calculating, when the sensitivity with which the microphone collects sound is changed, the difference in the sensitivity before and after the change, and readjusting the overall directivity adjusted by the first adjusting means based on the calculated difference. According to the second aspect, even when multiple audio processing apparatuses are used while connected to one another, echo and howling caused by sound output from the other audio processing apparatuses can be suppressed. In addition, the well-known echo cancellation function that an audio processing apparatus normally has can be used as it is, so no new special-purpose echo cancellation function needs to be implemented. Echo and howling that arise when multiple apparatuses are connected can therefore be suppressed efficiently while limiting both the complexity of the apparatus configuration and the increase in the number of components. Furthermore, when the sensitivity of the microphone is changed, the overall directivity needed to suppress echo and howling must also be adjusted again. In the present invention, the difference in sensitivity before and after the change is calculated and the overall directivity is readjusted based on that difference, so that sensitivity to sound arriving from the direction of arrival can be attenuated. The audio processing apparatus can thus readjust the overall directivity to an optimum state according to how much the microphone sensitivity has changed. Because the overall directivity is adjusted automatically according to the difference, there is no need to set it up again from scratch by having the speaker of the other apparatus output the predetermined sound once more.

In the first and second aspects, the first specifying means may determine the timing based on identification information assigned in order to the audio processing apparatus and the other audio processing apparatus by the communication control means. The audio processing apparatus can then easily determine the timing at which to output the predetermined sound. The apparatus can judge the timing on its own and output the predetermined sound, without a user of the apparatus or a control device that controls the apparatus having to make it output the predetermined sound at the predetermined timing.
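One way to derive a timing from sequentially assigned identification information is a fixed time slot per apparatus. The sketch below is purely illustrative: the source does not say how the timing is computed from the identification information, so the slot scheme, the 1-based IDs, the shared start signal, and the names `sound_start_offset`, `device_playing_at`, and `slot_seconds` are all assumptions.

```python
def sound_start_offset(device_id, slot_seconds=2.0):
    """Seconds after a shared start signal at which the apparatus with this
    ID outputs the predetermined sound (IDs assumed to start at 1)."""
    if device_id < 1:
        raise ValueError("device IDs are assumed to start at 1")
    return (device_id - 1) * slot_seconds


def device_playing_at(elapsed_seconds, device_count, slot_seconds=2.0):
    """Return the ID of the apparatus whose slot covers elapsed_seconds,
    or None when no apparatus is scheduled at that moment."""
    slot = int(elapsed_seconds // slot_seconds)
    return slot + 1 if 0 <= slot < device_count else None


# Three chained apparatuses with hypothetical IDs 1, 2, 3.
for device_id in (1, 2, 3):
    print(device_id, sound_start_offset(device_id))
```

Under such a scheme each apparatus knows both when to play its own sound and which neighbour it is currently hearing, without a user or an external controller triggering the output.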

An audio processing method according to a third aspect of the present invention comprises: a communication control step of communicating with another audio processing apparatus; a first specifying step of specifying, by communicating with the other audio processing apparatus in the communication control step, a timing at which a predetermined sound is output from a speaker of an audio processing apparatus; an output control step of performing control to output the predetermined sound from the speaker at a predetermined volume at the timing specified in the first specifying step; a second specifying step of specifying, when the predetermined sound output at the predetermined volume from another speaker connected to the other audio processing apparatus is acquired via a microphone of the audio processing apparatus, the volume and the direction of arrival of the acquired predetermined sound; a first adjusting step of adjusting, based on the volume and the direction of arrival specified in the second specifying step, an overall directivity, which is the directivity of sensitivity to sound arriving at the audio processing apparatus, such that sensitivity to sound arriving at the audio processing apparatus from a specific direction is attenuated; a transmitting step of transmitting to the other audio processing apparatus, when the volume at which sound is output from the speaker is changed, a change notification that conveys information for specifying the changed volume; a receiving step of receiving the change notification transmitted by the other audio processing apparatus; and a second adjusting step of readjusting, when the change notification is received in the receiving step, the overall directivity adjusted in the first adjusting step, based on the information conveyed by the change notification. The third aspect provides the same effects as the first aspect.

An audio processing program according to a fourth aspect of the present invention causes a computer of an audio processing apparatus to execute: a communication control step of communicating with another audio processing apparatus; a first specifying step of specifying, by communicating with the other audio processing apparatus in the communication control step, a timing at which a predetermined sound is output from a speaker of the audio processing apparatus; an output control step of performing control to output the predetermined sound from the speaker at a predetermined volume at the timing specified in the first specifying step; a second specifying step of specifying, when the predetermined sound output at the predetermined volume, under the control of the other audio processing apparatus, from another speaker connected to the other audio processing apparatus is acquired via a microphone of the audio processing apparatus, the volume and the direction of arrival of the acquired predetermined sound; a first adjusting step of adjusting, based on the volume and the direction of arrival specified in the second specifying step, an overall directivity, which is the directivity of sensitivity to sound arriving at the audio processing apparatus, such that sensitivity to sound arriving at the audio processing apparatus from a specific direction is attenuated; a transmitting step of transmitting to the other audio processing apparatus, when the volume at which sound is output from the speaker is changed, a change notification that conveys information for specifying the changed volume; a receiving step of receiving the change notification transmitted by the other audio processing apparatus; and a second adjusting step of readjusting, when the change notification is received in the receiving step, the overall directivity adjusted in the first adjusting step, based on the information conveyed by the change notification. The fourth aspect provides the same effects as the first aspect.

FIG. 1 is a diagram showing an overview of the conference system 1, which includes the audio processing apparatus 10, and the electrical configuration of the audio processing apparatus 10.
FIG. 2 is a schematic diagram showing a first example of the state table 231.
FIG. 3 is a flowchart showing the main process.
FIG. 4 is a flowchart showing the main process, continued from FIG. 3.
FIG. 5 is a flowchart showing the main process, continued from FIG. 4.
FIG. 6 is a diagram showing patterns of the overall directivity.
FIG. 7 is a schematic diagram showing a second example of the state table 231.
FIG. 8 is a diagram showing a network configuration of the audio processing apparatuses 10.

An embodiment of the present invention is described below with reference to the drawings. The drawings are used to explain technical features that the present invention may adopt. The apparatus configurations, flowcharts of the various processes, and so on shown in them are illustrative examples only and are not intended to be limiting.

An overview of the conference system 1 is given with reference to FIG. 1. The conference system 1 includes audio processing apparatuses 11, 12, and 13 and a PC 15. The audio processing apparatuses 11, 12, and 13 and the PC 15 are installed at the same site (hereinafter also called the local site) and are daisy-chained by communication cables. A daisy chain is a connection method in which multiple devices are linked in series, one after another. The audio processing apparatus 11 is connected to the PC 15 and the audio processing apparatus 12. The audio processing apparatus 12 is connected to the audio processing apparatuses 11 and 13. The PC 15 is also connected to the Internet 16. In the conference system 1, the devices are therefore connected in the order (Internet 16,) PC 15, audio processing apparatus 11, 12, 13. Hereinafter, when the audio processing apparatuses 11, 12, and 13 are not distinguished from one another or are referred to collectively, they are called the audio processing apparatus 10.

The audio processing apparatus 10 can communicate, via the PC 15 and the Internet 16, with another PC (not shown) and another audio processing apparatus (not shown) installed at another site different from the local site. The audio processing apparatus 10 transmits data of the sound collected by the microphone 25 (described later) to the other audio processing apparatus via the Internet 16 and, at the same time, receives sound data from the other audio processing apparatus via the Internet 16 and outputs the sound from the speaker 24. A user at the local site using the audio processing apparatus 10 can hold a remote audio conference with another user at the other site using the other audio processing apparatus.

In the conference system 1, the audio processing apparatuses 11, 12, and 13 can also be distributed over a wide area within the local site. Sound based on the audio data transmitted from the other audio processing apparatus at the other site can then be output from the speakers 24 (described later) of the audio processing apparatuses 11, 12, and 13, so that the sound output from the speakers 24 can be heard over a wide area. The audio processing apparatus 10 can also collect sound from every corner of the local site and transmit the audio data to the other audio processing apparatus at the other site.

The system configuration in FIG. 1 is one example of the present invention, and other system configurations may be used. For example, instead of the Internet 16, any of various well-known external communication networks, such as a fixed telephone network, a mobile telephone network, or a dedicated communication network, may be used for communication between the local site and the other site. Instead of the PC 15, the audio processing apparatus 10 may connect to the external communication network via various devices other than the audio processing apparatus 10 (a fixed telephone, a mobile phone, a router, a modem, and so on). The audio processing apparatus 10 may also connect to the external communication network directly.

In the conference system 1, a display and a camera may also be connected to the PC 15. The PC 15 may transmit video data of the local site captured by the camera to the other PC via the Internet 16 and, at the same time, receive video data from the other PC via the Internet 16 and show the video on the display. A user at the local site can then hold a remote conference with video and audio with another user at the other site.

The electrical configuration of the audio processing apparatus 10 is described next. The audio processing apparatus 10 includes a CPU 20 that controls the apparatus. The CPU 20 is electrically connected to a ROM 21, a RAM 22, a flash memory 23, a speaker 24, a microphone 25, a communication interface (hereinafter, communication I/F) 26, and an input unit 27. The ROM 21 stores a boot program, a BIOS, an OS, and the like. The RAM 22 stores timers, counters, and temporary data. The RAM 22 also stores the setting value for the volume of sound output from the speaker 24 and the setting value for the sensitivity of the microphone 25. Hereinafter, the information stored in the RAM 22 as the setting value for the volume of sound output from the speaker 24 is called the speaker volume, and the information stored in the RAM 22 as the setting value for the sensitivity of the microphone 25 is called the microphone sensitivity. The user can set the speaker volume and the microphone sensitivity by operating the input unit 27 (described later).

The flash memory 23 stores a control program for the CPU 20. The flash memory 23 also stores a state table 231 (see FIG. 2), described later, as well as test sound data and a test volume, which is the volume at which the test sound is output. The speaker 24 outputs sound at the speaker volume stored in the RAM 22. The microphones 25 collect sound at the microphone sensitivity stored in the RAM 22. A plurality of microphones 25 are provided around the periphery of the audio processing device 10. The CPU 20 can identify the arrival direction of a collected sound based on the volume of the sound collected by each microphone 25 and on the time differences between the moments at which the individual microphones 25 collect it. The CPU 20 can also adjust the directivity of the microphones 25 as a whole by adjusting the amplification applied to the sound collected by each microphone 25. Hereinafter, the directivity of the microphones 25 as a whole is referred to as the overall directivity. The communication I/F 26 is an interface for communicating with other audio processing devices 10 and with the PC 15. The audio processing device 10 realizes a daisy-chain connection by connecting to two different other devices. For this reason, two or more communication I/Fs 26 are provided, one for each of the two other devices. The input unit 27 consists of buttons for setting the speaker volume and the microphone sensitivity.
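The description does not specify how the CPU 20 converts inter-microphone time differences into an arrival direction. As an illustration only, the following minimal sketch uses the standard far-field relation for a single pair of microphones, sin θ = c·Δt / d. The function name, the two-microphone geometry, and the speed-of-sound constant are assumptions made for this example and are not taken from the patent.

```python
import math

SPEED_OF_SOUND_M_S = 343.0  # approximate speed of sound in air (assumed value)


def arrival_angle_deg(delay_s: float, mic_spacing_m: float) -> float:
    """Estimate the far-field arrival angle, measured from the broadside
    of a two-microphone pair, from the delay between the two microphones."""
    ratio = SPEED_OF_SOUND_M_S * delay_s / mic_spacing_m
    # Clamp to the valid domain of asin to tolerate measurement noise.
    ratio = max(-1.0, min(1.0, ratio))
    return math.degrees(math.asin(ratio))


# A sound reaching both microphones simultaneously arrives from broadside.
print(arrival_angle_deg(0.0, 0.1))
```

A device with microphones arranged around its periphery would presumably combine several such pairs, together with the per-microphone levels mentioned above, to resolve a full 360-degree direction.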

The audio processing device 10 is equipped with a known echo removal function. By using echo removal, the audio processing device 10 can suppress the echo and howling that arise when its microphones 25 collect the sound output from its own speaker 24. However, when, for example, the microphones 25 of the audio processing device 11 collect sound output from the speakers 24 of the audio processing devices 12 and 13, the audio processing device 11 may be unable to make echo removal work effectively. This is because the echo removal function suppresses echo and howling by adjusting, through feedback control based on the sound collected by the microphones 25, the sound output from the device's own speaker 24. Consequently, when the sound output from the speakers 24 of the other connected audio processing devices 12 and 13 is the cause of echo or howling, the audio processing device 11 cannot adjust that sound and therefore cannot suppress the echo or howling.

In contrast, the audio processing device 11 of the present embodiment adjusts its overall directivity so as to reduce the volume at which its microphones 25 collect the sound output from the speakers 24 of the audio processing devices 12 and 13. The audio processing device 11 can thereby suppress echo and howling caused by the sound output from the speakers 24 of the audio processing devices 12 and 13. Echo and howling caused by the sound output from the speaker 24 of the audio processing device 11 itself can be suppressed with the known echo removal function. Accordingly, even when the interconnected audio processing devices 12 and 13 are installed at the same site, the audio processing device 11 can hold a remote conference in a clear audio environment free of echo and howling. Details are given below.

The state table 231 stored in the flash memory 23 of the audio processing device 11 will be described with reference to FIG. 2. The state table 231 is the table that the audio processing device 11 refers to when adjusting the overall directivity. In the state table 231, a reception volume and an arrival direction are stored in association with the ID of each audio processing device 10. The reception volume is the volume measured when sound arriving from the arrival direction is collected. The arrival direction is expressed as an angle relative to a predetermined reference direction of the audio processing device 10. How the state table 231 is created and referred to is described in detail later.
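As a rough illustration of the data the state table 231 holds, the following sketch models it as a mapping from device ID to a reception volume and an arrival direction. The class and field names are hypothetical. The 75 dB / 0 deg entry for ID 2 and the 63 dB volume for ID 3 are the example values the description gives for FIG. 2; the arrival direction for ID 3 is not stated in the text and is an assumed placeholder.

```python
from dataclasses import dataclass


@dataclass
class StateEntry:
    reception_volume_db: float    # volume measured when the sound was collected
    arrival_direction_deg: float  # angle relative to the device's reference direction


# State table as it might look on the device with ID 1.
state_table = {
    2: StateEntry(reception_volume_db=75.0, arrival_direction_deg=0.0),
    3: StateEntry(reception_volume_db=63.0, arrival_direction_deg=0.0),  # direction assumed
}

for device_id, entry in state_table.items():
    print(device_id, entry.reception_volume_db, entry.arrival_direction_deg)
```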

The main process executed by the audio processing device 10 will be described with reference to FIGS. 3 to 5. The main process is executed by the CPU 20 of the audio processing device 10 in accordance with the audio processing program stored in the flash memory 23. When the audio processing device 10 is powered on, the program for the main process stored in the flash memory 23 is launched, and the main process is carried out by the CPU 20 executing that program. In addition to the main process, the CPU 20 executes various known acoustic processes (such as echo cancellation) in parallel; their description is omitted below. When the main process starts, the variables ID, X, and M are defined in the RAM 22.

When the main process starts, the CPU 20 initializes ID by storing 0 in it (S11). The CPU 20 determines whether another audio processing device 10 or the PC 15 has been connected via the communication I/F 26 (see FIG. 1) (S13). If it determines that neither is connected (S13: NO), the process returns to S13 in order to keep monitoring for a connection. If it determines that another audio processing device 10 or the PC 15 has been connected (S13: YES), the CPU 20 determines whether the connected device is the PC 15 (S15). Specifically, it makes this determination by detecting the electrical state (for example, a pull-up state) of a signal line in the communication cable. The method of determining whether the PC 15 is connected is not limited to this. For example, when devices are connected via a communication cable, they may perform an initial communication to notify each other of their types, and the CPU 20 may determine from that initial communication whether the PC 15 is connected. If it determines that the PC 15 is connected (S15: YES), the CPU 20 stores 1 in ID (S17). This determines the ID of the audio processing device 10. The CPU 20 transmits ID notification data, which notifies the determined ID, to the other audio processing device 10 connected via the communication cable (S19). The process then proceeds to S29.

If, on the other hand, it is determined in S15 that the PC 15 is not connected (S15: NO), the device that has been connected to the audio processing device 10 is another audio processing device 10. The CPU 20 determines whether it has received ID notification data transmitted from the connected PC 15 or other audio processing device 10 (S21). If it determines that ID notification data has been received (S21: YES), the CPU 20 stores in ID the value obtained by adding 1 to the ID notified by the received data (S23). This determines the ID of the audio processing device 10. The CPU 20 transmits ID notification data, which notifies the determined ID, to the other audio processing device 10 connected via the communication cable (S25). The process then proceeds to S29.

If it is determined in S21 that ID notification data has not been received (S21: NO), the CPU 20 determines whether the device is connected at the end of the daisy chain (S27). The CPU 20 determines that the device is at the end of the daisy chain when another audio processing device 10 or the PC 15 is connected to one side of the communication I/F 26 and neither another audio processing device 10 nor the PC 15 is connected to the other side. If it determines that the device is at the end of the daisy chain (S27: YES), the process returns to S13 in order to keep monitoring for a new connection of another audio processing device 10 or the PC 15. If it determines that the device is not at the end of the daisy chain (S27: NO), the process returns to S21 in order to keep monitoring for ID notification data transmitted from the PC 15 or another audio processing device 10.

In S29, the CPU 20 determines whether a predetermined time has elapsed since the ID notification data was transmitted in S19 or S25. If it determines that the predetermined time has not elapsed (S29: NO), the process returns to S29. If it determines that the predetermined time has elapsed (S29: YES), the process proceeds to S31 (see FIG. 4).

For example, when the audio processing devices 10 are daisy-chained as shown in FIG. 1, the audio processing device 11, to which the PC 15 is connected (S15: YES), determines 1 as its ID (S17) and transmits ID notification data to the audio processing device 12 connected to it via the communication cable (S19). The audio processing device 12 receives the ID notification data from the audio processing device 11 (S21: YES) and determines 2 as its ID (S23). The audio processing device 12 then transmits ID notification data to the audio processing device 13 connected to it via the communication cable (S25). The audio processing device 13 receives the ID notification data from the audio processing device 12 (S21: YES) and determines 3 as its ID (S23). In this way, the IDs of the audio processing devices 11, 12, and 13 are determined as 1, 2, and 3, respectively.
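The ID propagation in this example can be sketched as a simple simulation. This is only an illustration of the numbering rule (the device attached to the PC takes ID 1, and each subsequent device takes the notified ID plus 1); the function name is hypothetical, and the connection detection, waiting loops, and actual cable communication of S13 to S29 are not modeled.

```python
def assign_ids(chain_length: int) -> list:
    """Simulate ID assignment along a daisy chain whose first device
    is connected directly to the PC."""
    ids = []
    notified_id = None  # ID carried by the ID notification data
    for position in range(chain_length):
        if position == 0:
            device_id = 1                # S17: the PC is connected directly
        else:
            device_id = notified_id + 1  # S23: received ID plus 1
        ids.append(device_id)
        notified_id = device_id          # S19 / S25: notify the next device
    return ids


print(assign_ids(3))  # [1, 2, 3]
```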

As shown in FIG. 4, the CPU 20 initializes X by storing 1 in it (S31). The CPU 20 deletes the IDs, reception volumes, and arrival directions stored in the state table 231 (see FIG. 2), thereby clearing the state table 231 (S33). The CPU 20 stores 0 as the speaker volume in the RAM 22 (S35).

The CPU 20 determines whether X is equal to the total number of daisy-chained audio processing devices 10 (S37). If X is smaller than the total number (S37: NO), the CPU 20 determines whether X matches ID (S39). If it determines that X matches ID (S39: YES), the CPU 20 stores the test volume held in the flash memory 23 as the speaker volume in the RAM 22 (S41). The CPU 20 outputs the test sound from the speaker 24 based on the test sound data in the flash memory 23 (S43). The test sound is output from the speaker 24 at the test volume. After the output of the test sound has finished, the CPU 20 stores 0 as the speaker volume in the RAM 22 (S45). The CPU 20 updates X by adding 1 to it (S47). The process returns to S37.

If, on the other hand, it is determined in S39 that X does not match ID (S39: NO), the CPU 20 determines whether a test sound is being output from the speaker 24 of another audio processing device 10 (S49). If no test sound is received via the microphones 25, the CPU 20 determines that no other audio processing device 10 is outputting a test sound (S49: NO). In this case, the process returns to S49 in order to keep monitoring for the test sound. If a test sound is received via the microphones 25, the CPU 20 determines that a test sound has been output from the speaker 24 of another audio processing device 10 (S49: YES). The CPU 20 identifies the reception volume and the arrival direction of the test sound received via the microphones 25 (S51). The CPU 20 takes the value of X at this point as the ID of the audio processing device 10 that output the test sound. The CPU 20 stores the identified reception volume and arrival direction in the state table 231 (see FIG. 2) in association with that ID (S53). The CPU 20 updates X by adding 1 to it (S55). The process returns to S37.

When X, which is updated repeatedly in S47 and S55, becomes equal to the total number (S37: YES), the process proceeds to S61 (see FIG. 5).

For example, when the IDs of the audio processing devices 11, 12, and 13 in FIG. 1 have been determined as 1, 2, and 3, respectively, the test sound is first output at the test volume from the speaker 24 of the audio processing device 11, which has ID 1 (S43). The audio processing devices 12 and 13 receive the test sound output from the speaker 24 of the audio processing device 11 via their microphones 25 and identify its reception volume and arrival direction (S51). The audio processing devices 12 and 13 store the identified reception volume and arrival direction in the state table 231 (see FIG. 2) in the flash memory 23 in association with ID 1. Next, the test sound is output at the test volume from the speaker 24 of the audio processing device 12, which has ID 2 (S43). The audio processing devices 11 and 13 receive that test sound via their microphones 25 and identify its reception volume and arrival direction (S51). The audio processing devices 11 and 13 store the identified reception volume and arrival direction in the state table 231 (see FIG. 2) in association with ID 2. The same processing is repeated when the test sound is output from the speaker 24 of the audio processing device 13.
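The round-robin measurement in this example can be sketched as follows. This is a simplified, centralized simulation: in the described system each device runs its own loop and infers the emitting device from its counter X, whereas here a single function iterates over all devices. Following the worked example, every device, including the last one, emits the test sound. The names `run_calibration` and `fake_measure` are hypothetical, and every measurement value other than the 75 dB / 0 deg and 63 dB figures quoted for FIG. 2 is an invented placeholder.

```python
from typing import Callable, Dict, List, Tuple

Measurement = Tuple[float, float]  # (reception volume in dB, arrival direction in deg)


def run_calibration(
    device_ids: List[int],
    measure: Callable[[int, int], Measurement],
) -> Dict[int, Dict[int, Measurement]]:
    """Build one state table per device: each device emits the test sound in
    ID order while every other device records what it hears."""
    tables: Dict[int, Dict[int, Measurement]] = {dev: {} for dev in device_ids}
    for x in device_ids:                  # X: ID of the device currently emitting
        for listener in device_ids:
            if listener == x:
                continue                  # S41 to S45: this device is the emitter
            tables[listener][x] = measure(listener, x)  # S51 to S53
    return tables


# Placeholder measurements keyed by (listener ID, emitter ID).
FAKE_LEVELS = {(1, 2): (75.0, 0.0), (1, 3): (63.0, 0.0)}  # direction for ID 3 assumed


def fake_measure(listener: int, source: int) -> Measurement:
    return FAKE_LEVELS.get((listener, source), (60.0, 0.0))  # invented default


tables = run_calibration([1, 2, 3], fake_measure)
print(tables[1])  # {2: (75.0, 0.0), 3: (63.0, 0.0)}
```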

As shown in FIG. 5, the CPU 20 initializes M by storing 1 in it (S61). The CPU 20 determines whether the reception volume corresponding to ID M in the state table 231 is equal to or greater than a predetermined value (for example, 70 dB) (S63). The predetermined value is set to the minimum volume at which howling or echo may occur when sound output from the speaker 24 of another audio processing device 10 is received via the microphones 25. If the reception volume corresponding to ID M is less than the predetermined value (S63: NO), the sound output from the speaker 24 of the audio processing device 10 with ID M is sufficiently quiet and is unlikely to cause echo or howling. The CPU 20 updates M by adding 1 to it (S69). The process returns to S63 so that the reception volume determination is repeated for the updated M.

If, on the other hand, the reception volume corresponding to ID M in the state table 231 is equal to or greater than the predetermined value (S63: YES), the sound output from the speaker 24 of the audio processing device 10 with ID M is loud and is likely to cause echo or howling. It is therefore necessary to suppress echo and howling by reducing the reception volume at which the microphones 25 collect the sound output from the speaker 24 of the audio processing device 10 with ID M. The CPU 20 adjusts the overall directivity by lowering the sensitivity in the arrival direction corresponding to ID M in the state table 231 (S65). The CPU 20 determines whether M is equal to the total number of daisy-chained audio processing devices 10 (S67). If M is smaller than the total number (S67: NO), the CPU 20 updates M by adding 1 to it (S69). The process returns to S63 so that the reception volume determination is repeated for the updated M. When M, which is updated repeatedly in S69, becomes equal to the total number (S67: YES), the adjustment of the overall directivity ends and the process proceeds to S71.

FIG. 6 shows the overall directivity patterns of the audio processing devices 11 to 13 after adjustment by the processing of S61 to S69. Assume that the audio processing devices 11 (ID: 1), 12 (ID: 2), and 13 (ID: 3) are arranged in a straight line in that order from left to right, and that the state table 231 shown in FIG. 2 has been created in the audio processing device 11. Because the distance between the audio processing devices 11 and 12 is small, the reception volume at which the audio processing device 11 collects the test sound output from the audio processing device 12 (ID: 2) is high (75 dB), as shown in FIG. 2. Since this value is greater than the predetermined value (70 dB) (S63: YES, see FIG. 5), the overall directivity is adjusted (S65, see FIG. 5). The adjustment is performed as follows. The audio processing device 11 attenuates the sensitivity of the microphones 25 located near the arrival direction (0 deg, see FIG. 2) so that the reception volume of the test sound from the audio processing device 12 (75 dB) becomes equal to or less than the predetermined value (70 dB). As a result, the sensitivity of the audio processing device 11 to sound arriving from that direction (0 deg) is attenuated by 5 dB.

In contrast, because the distance between the audio processing devices 11 and 13 is sufficiently large, the reception volume at which the audio processing device 11 collects the test sound output from the audio processing device 13 (ID: 3) is low (63 dB, see FIG. 2). Since this value is smaller than the predetermined value (70 dB) (S63: NO, see FIG. 5), the overall directivity is not adjusted for it. As a result, as shown in FIG. 6, in the overall directivity pattern 31 of the audio processing device 11, the sensitivity in the direction in which the audio processing device 12 is located (the right in FIG. 6) is 5 dB lower than the sensitivity in other directions.
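The threshold comparison and attenuation amount in this worked example can be sketched as below. This is a minimal sketch that only computes how many decibels to attenuate per arrival direction; how that attenuation is distributed over the individual microphones 25 is not modeled. The function name and the tuple layout are hypothetical, and the arrival direction used for ID 3 is an assumed placeholder.

```python
THRESHOLD_DB = 70.0  # the "predetermined value" used in the example


def directional_attenuation(state_table: dict) -> dict:
    """Return {arrival direction in deg: attenuation in dB} for every entry
    whose reception volume is at or above the threshold (S63: YES)."""
    attenuation = {}
    for device_id, (volume_db, direction_deg) in sorted(state_table.items()):
        if volume_db >= THRESHOLD_DB:
            needed_db = volume_db - THRESHOLD_DB  # bring the level down to the threshold
            # If two devices share a direction, keep the larger attenuation.
            attenuation[direction_deg] = max(attenuation.get(direction_deg, 0.0), needed_db)
    return attenuation


# State table of the device with ID 1: {device ID: (volume dB, direction deg)}.
table_device_11 = {2: (75.0, 0.0), 3: (63.0, 0.0)}  # direction for ID 3 assumed
print(directional_attenuation(table_device_11))  # {0.0: 5.0}
```

Only the 75 dB entry exceeds the threshold, so the sketch yields a 5 dB attenuation toward 0 deg, matching the figure given in the text.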

For the audio processing device 12, both the distance to the audio processing device 11 and the distance to the audio processing device 13 are small. Accordingly, in the overall directivity pattern 32 of the audio processing device 12, the sensitivity is lower than in other directions both in the direction in which the audio processing device 11 is located (the left in FIG. 6) and in the direction in which the audio processing device 13 is located (the right in FIG. 6).

As described above, by attenuating its sensitivity to sound arriving from a specific direction, the audio processing device 10 can selectively reduce the reception volume of the sound output from another audio processing device 10. The audio processing device 10 can thereby effectively suppress echo and howling caused by the sound output from other audio processing devices 10. In addition, the audio processing device 10 adjusts the overall directivity by adjusting the sensitivity of the microphones 25 located near the specific direction so that the reception volume of the sound output from the other audio processing device 10 becomes equal to or less than the predetermined value. The reception volume can thus be attenuated efficiently to a level at which echo and howling do not occur, which allows the audio processing device 10 to suppress echo and howling even more effectively.

As shown in FIG. 5, after the overall directivity has been adjusted, the CPU 20 starts an audio remote conference by exchanging audio data with another audio processing device installed at another site. After the remote conference has started, the CPU 20 determines whether the user has performed an operation to change the speaker volume via the input unit 27 (S71). If it determines that such an operation has been performed (S71: YES), the CPU 20 stores the changed speaker volume in the RAM 22 and changes the volume of the sound output from the speaker 24 based on the speaker volume stored in the RAM 22.

When the speaker volume of an audio processing device 10 is changed, the other audio processing devices 10 need to readjust the overall directivity they adjusted in S65 based on the changed speaker volume. The CPU 20 therefore transmits change notification data, which notifies the difference between the speaker volume before the change and the speaker volume after the change, to the other audio processing device 10 connected via the communication cable (S73). The process returns to S61.

If it is determined in S71 that no operation to change the speaker volume has been performed (S71: NO), the CPU 20 determines whether it has received change notification data from another audio processing device 10 (S75). If it determines that change notification data has been received (S75: YES), the CPU 20 relays the change notification data so that it reaches all of the daisy-chained audio processing devices 10. For example, when the audio processing device 12 (see FIG. 1) receives change notification data from the audio processing device 11 (see FIG. 1), the CPU 20 of the audio processing device 12 relays the received data to the audio processing device 13 (see FIG. 1). The CPU 20 then updates the contents of the state table 231 based on the difference notified by the received change notification data, as follows.

The CPU 20 selects, from the state table 231, the reception volume corresponding to the ID of the audio processing device 10 that transmitted the change notification data. The CPU 20 updates the state table 231 by adding the difference notified by the received change notification data to the selected reception volume (S79). For example, suppose that the speaker volume of another audio processing device 10 has been raised by 10 dB. In this case, the difference notified by the change notification data is +10 dB. The CPU 20 updates the state table 231 by adding the notified difference of +10 dB to the reception volume selected from the state table 231. The process returns to S61.
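The table update for a speaker volume change can be sketched as follows. The function name and the tuple layout are hypothetical; the starting values are the FIG. 2 example figures quoted in the description, with the arrival direction for ID 3 assumed. Which device raises its volume is chosen here purely for illustration.

```python
def apply_speaker_change(state_table: dict, sender_id: int, diff_db: float) -> None:
    """Add the notified speaker-volume difference to the reception volume
    stored for the device that sent the change notification (S79)."""
    volume_db, direction_deg = state_table[sender_id]
    state_table[sender_id] = (volume_db + diff_db, direction_deg)


# {device ID: (reception volume dB, arrival direction deg)}
state_table = {2: (75.0, 0.0), 3: (63.0, 0.0)}  # direction for ID 3 assumed
apply_speaker_change(state_table, sender_id=3, diff_db=10.0)
print(state_table[3])  # (73.0, 0.0)
```

In this illustration the entry for ID 3 rises from 63 dB to 73 dB, which would now be at or above a 70 dB threshold when the check in S63 is repeated, without any new test sound being played.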

Based on the updated state table 231, the CPU 20 determines whether the reception volume corresponding to each ID is equal to or greater than the predetermined value (S63). If the reception volume is less than the predetermined value (S63: NO), the sound output from the speaker 24 of the other audio processing device 10 at the changed speaker volume is sufficiently quiet and is unlikely to cause echo or howling. In this case, the overall directivity is not readjusted. If the reception volume is equal to or greater than the predetermined value (S63: YES), the sound output from the speaker 24 of the other audio processing device 10 at the changed speaker volume is loud and is likely to cause echo or howling. The CPU 20 therefore readjusts the overall directivity in order to suppress echo and howling (S65). Specifically, it identifies the direction in which the audio processing device 10 whose speaker volume was changed is located from the arrival direction in the state table 231, and readjusts the overall directivity by attenuating the sensitivity of the microphones 25 located near the identified direction. The details are the same as for the overall directivity adjustment performed using the test sound, so their description is omitted.

When the speaker volume of another audio processing device 10 is changed, the overall directivity needed to suppress echo and howling must also be adjusted again. In the present embodiment, the overall directivity can be readjusted based on the changed speaker volume of the other audio processing device 10. The audio processing device 10 can thereby readjust the overall directivity to an optimum state. Moreover, because the overall directivity is adjusted according to the difference in speaker volume before and after the change, there is no need to set the overall directivity again from scratch by having the test sound output once more from the speakers 24 of the other audio processing devices 10. The audio processing device 10 can therefore readjust the overall directivity quickly.

Furthermore, by readjusting the overall directivity based on the difference between the speaker volume of another audio processing device 10 before and after the change, the audio processing device 10 can adjust the overall directivity to an optimum state according to the degree of change in the volume of the sound output from the speaker 24 of that device.

If it is determined in S75 that change notification data has not been received (S75: NO), the CPU 20 determines whether the user has performed an operation to change the microphone sensitivity via the input unit 27 (S77). If it determines that no such operation has been performed (S77: NO), the process proceeds to S81. If it determines that an operation to change the microphone sensitivity has been performed (S77: YES), the CPU 20 stores the changed microphone sensitivity in the RAM 22 and changes the sensitivity with which sound is collected via the microphones 25 based on the microphone sensitivity stored in the RAM 22.

When the microphone sensitivity of the audio processing device 10 is changed, the overall directivity adjusted in S65 must be readjusted based on the changed microphone sensitivity. The CPU 20 calculates the difference in microphone sensitivity before and after the change, and updates the state table 231 by adding the calculated difference to all the reception volumes stored in the state table 231 (S79). For example, suppose the microphone sensitivity has been lowered by 10 dB. The difference in microphone sensitivity before and after the change is then −10 dB, and the CPU 20 updates the state table 231 by adding −10 dB to every reception volume stored in it. The process then returns to S61.
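The update in S79 amounts to shifting every stored reception volume by the sensitivity difference. A minimal Python sketch follows, assuming the state table 231 can be modeled as a dictionary from device ID to a (reception volume in dB, arrival direction in degrees) pair. The function name, this layout, the sensitivity scale, and the sample values are illustrative assumptions and are not taken from the embodiment.

```python
def apply_sensitivity_change(state_table, old_sensitivity_db, new_sensitivity_db):
    """Return a new table with the sensitivity difference added to every reception volume."""
    # Difference is "after minus before", so it is negative when sensitivity is lowered.
    diff_db = new_sensitivity_db - old_sensitivity_db
    return {
        device_id: (volume_db + diff_db, direction_deg)
        for device_id, (volume_db, direction_deg) in state_table.items()
    }


# Hypothetical table: ID -> (reception volume [dB], arrival direction [deg]).
table = {2: (75.0, 0.0), 3: (85.0, 45.0)}

# Sensitivity lowered by 10 dB (the 0 dB starting point is an assumed scale).
updated = apply_sensitivity_change(table, old_sensitivity_db=0.0, new_sensitivity_db=-10.0)
print(updated)
```

Because only the difference is applied, the arrival directions recorded during the test-sound measurement are reused unchanged, which is why no new test sound is needed.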

Based on the updated state table 231, the CPU 20 determines whether the reception volume corresponding to each ID is equal to or greater than a predetermined value (S63). If a reception volume is less than the predetermined value (S63: NO), sound output from the speaker 24 of the other audio processing device 10 would be received at a sufficiently low volume with the changed microphone sensitivity and is unlikely to cause echo or howling. In this case, the overall directivity is not readjusted. If a reception volume is equal to or greater than the predetermined value (S63: YES), sound output from the speaker 24 of the other audio processing device 10 would be received at a high volume with the changed microphone sensitivity and is likely to cause echo or howling. The CPU 20 therefore readjusts the overall directivity to suppress echo and howling (S65). Specifically, the overall directivity is readjusted by attenuating the sensitivity of the microphone 25 located near the arrival direction that corresponds to a reception volume in the state table 231 larger than the predetermined value. The details are the same as in the method of adjusting the directivity of the microphone 25 using the test sound, so their description is omitted.
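The S63 decision can be sketched as a filter over the state table. The Python below assumes the same hypothetical dictionary layout (device ID to reception volume in dB and arrival direction in degrees); the threshold value and the function name are assumptions chosen only for illustration.

```python
def directions_to_attenuate(state_table, threshold_db):
    """Return the arrival directions whose reception volume is at or above the threshold."""
    return sorted(
        {
            direction_deg
            for volume_db, direction_deg in state_table.values()
            if volume_db >= threshold_db  # S63: YES for this entry
        }
    )


# Hypothetical table after a sensitivity change: ID -> (volume [dB], direction [deg]).
table = {2: (65.0, 0.0), 3: (75.0, 45.0)}
THRESHOLD_DB = 70.0  # assumed "predetermined value"

print(directions_to_attenuate(table, THRESHOLD_DB))
```

An empty result corresponds to the case in which every entry yields S63: NO and the overall directivity is left as it is.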

When the sensitivity of the microphone 25 is changed, the overall directivity required to suppress echo and howling must also be adjusted again. In this embodiment, the difference in sensitivity before and after the change is calculated, and the overall directivity can be readjusted based on that difference. The audio processing device 10 can thus readjust the overall directivity to an optimal state in accordance with the degree of change in the sensitivity of the microphone 25. In addition, because the overall directivity is adjusted according to the difference, there is no need to set the overall directivity again from scratch by having the speaker 24 of the other audio processing device 10 output the test sound once more. The audio processing device 10 can therefore readjust the overall directivity quickly.

In S81, the CPU 20 determines whether another audio processing device 10 has been newly connected to the daisy-chained audio processing devices 10. If it is determined that another audio processing device 10 has been connected (S81: YES), the process returns to S31 (see FIG. 4) so that the processing of adjusting the overall directivity by outputting the test sound is executed from the beginning. If it is determined that no other audio processing device 10 has been connected (S81: NO), the process returns to S71.
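The branching from S75 through S81 described above can be summarized as a small decision function. This Python sketch covers only the branches stated in this passage. The function name and boolean parameters are hypothetical, and the S75: YES branch is returned as a plain description because its step number is not given here.

```python
def next_step(change_notification_received, mic_sensitivity_changed, new_device_connected):
    """Return the label of the step the process moves to after the S75/S77/S81 checks."""
    if change_notification_received:   # S75: YES
        return "readjust based on notified speaker volume"
    if mic_sensitivity_changed:        # S77: YES
        return "S79"                   # update state table 231, then return to S61
    if new_device_connected:           # S81: YES
        return "S31"                   # redo the test-sound adjustment from the beginning
    return "S71"                       # S81: NO, keep monitoring


print(next_step(False, True, False))
print(next_step(False, False, True))
print(next_step(False, False, False))
```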

As described above, even when a plurality of audio processing devices 10 are connected to each other and used while installed at the same site, the audio processing device 10 can suppress echo and howling caused by sound output from the other audio processing devices 10. In addition, because the well-known echo cancellation function normally provided in the audio processing device 10 can be used as is, no special echo cancellation function needs to be newly implemented in the audio processing device 10. The occurrence of echo and howling when a plurality of audio processing devices 10 are connected can therefore be suppressed efficiently while keeping the cost of the audio processing device 10 down.

The audio processing device 10 can also easily determine the timing for outputting the test sound by communicating ID notification data. Accordingly, neither a user of the audio processing device 10 nor a control device that controls it needs to make the audio processing device 10 output the test sound at a predetermined timing; the audio processing device 10 can determine the timing on its own and output the test sound.

The CPU 20 that performs the processing of S19 and S25 corresponds to the "communication control means" of the present invention. The CPU 20 that performs the processing of S39 corresponds to the "first specifying means" of the present invention. The CPU 20 that performs the processing of S43 corresponds to the "output control means" of the present invention. The CPU 20 that performs the processing of S51 corresponds to the "second specifying means" of the present invention. The CPU 20 that performs the processing of S65 corresponds to the "first adjusting means" and the "second adjusting means" of the present invention. The CPU 20 that performs the processing of S73 corresponds to the "transmission means" of the present invention. The CPU 20 that performs the processing of S75 corresponds to the "receiving means" of the present invention. The processing of S19 and S25 corresponds to the "communication control step" of the present invention. The processing of S23 corresponds to the "first specifying step" of the present invention. The processing of S43 corresponds to the "output control step" of the present invention. The processing of S51 corresponds to the "second specifying step" of the present invention. The processing of S65 corresponds to the "first adjustment step" of the present invention.

The present invention is not limited to the embodiment described above, and various modifications are possible. In the above description, the audio processing devices 10 are daisy-chained, but they may be connected in another topology such as a tree, ring, or star topology. The audio processing devices 10 are connected to each other by communication cables in the above description, but they may instead be connected to each other wirelessly. The audio processing device 10 may be configured without the speaker 24 and the microphone 25, and may be configured so that an external speaker and an external microphone can be connected to it.

In the above description, each audio processing device 10 determined its ID by communicating ID notification data with the other audio processing devices 10, and the audio processing devices 10 transmitted the test sound in the order of the determined IDs. Alternatively, the PC 15 may determine the IDs of the audio processing devices 10 collectively by transmitting ID notification data to the audio processing devices 10. The PC 15 may also control the timing at which each audio processing device 10 outputs the test sound by transmitting a control signal to the audio processing devices 10. For example, the audio processing device 10 may output the test sound at a timing notified by the PC 15.

For example, at the timing for outputting the test sound, the audio processing device 10 may perform only the control of storing the test volume as the speaker volume in the RAM 22. The PC 15 may continuously transmit test sound data to the audio processing device 10. The audio processing device 10 may then output the test sound from the speaker 24, based on the test sound data received from the PC 15, only while the test volume is stored as the speaker volume.

The audio processing device 10 may add up the reception volumes stored in the state table 231 for each arrival direction. When the summed reception volume for a direction is equal to or greater than the predetermined value, the audio processing device 10 may adjust the overall directivity so that the sensitivity in that arrival direction is attenuated.
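The per-direction summation can be sketched as follows. The text does not say how the reception volumes are combined, so this Python example assumes, purely for illustration, that levels in dB are summed in the power domain (incoherent sources). The table layout, function names, and sample values are likewise assumptions.

```python
import math
from collections import defaultdict


def combine_by_direction(state_table):
    """Combine reception volumes per arrival direction (assumed power-domain sum of dB levels)."""
    power_by_direction = defaultdict(float)
    for volume_db, direction_deg in state_table.values():
        power_by_direction[direction_deg] += 10.0 ** (volume_db / 10.0)
    return {d: 10.0 * math.log10(p) for d, p in power_by_direction.items()}


def directions_over_threshold(state_table, threshold_db):
    """Return directions whose combined level is at or above the threshold."""
    combined = combine_by_direction(state_table)
    return sorted(d for d, level_db in combined.items() if level_db >= threshold_db)


# Hypothetical table: two devices in the same direction, each below the threshold alone.
table = {2: (68.0, 45.0), 3: (68.0, 45.0), 4: (60.0, 180.0)}
print(directions_over_threshold(table, threshold_db=70.0))
```

With these sample values, the two devices at 45 degrees stay below the assumed threshold individually but exceed it together, which is the situation this modification is meant to catch.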

The method of adjusting the overall directivity is not limited to that of the embodiment described above. In the above description, when a reception volume in the state table 231 is equal to or greater than the predetermined value, the audio processing device 10 adjusted the sensitivity of the microphone 25 so that the reception volume becomes equal to the predetermined value. Alternatively, a predetermined margin may be subtracted from the predetermined value, and the sensitivity of the microphone 25 may be adjusted so that the reception volume becomes equal to the resulting value.
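The two targets described above differ only in whether a margin is subtracted from the predetermined value. A minimal Python sketch of the resulting attenuation amount follows; the function name, parameter names, and numeric values are assumptions used only for illustration.

```python
def required_attenuation_db(reception_db, threshold_db, margin_db=0.0):
    """Return how many dB the sensitivity in a direction should be attenuated.

    No attenuation is applied when the reception volume is below the threshold.
    Otherwise the target level is the threshold minus the optional margin.
    """
    if reception_db < threshold_db:
        return 0.0
    target_db = threshold_db - margin_db
    return reception_db - target_db


# Embodiment behaviour: bring the reception volume down to the threshold itself.
print(required_attenuation_db(85.0, 80.0))
# Modified behaviour: aim below the threshold by an assumed 3 dB margin.
print(required_attenuation_db(85.0, 80.0, margin_db=3.0))
```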

In the above description, the audio processing device 10 transmitted change notification data notifying the difference in speaker volume to the other audio processing devices 10. Alternatively, the audio processing device 10 may transmit change notification data notifying the changed speaker volume itself to the other audio processing devices 10. When the audio processing device 10 receives such change notification data from another audio processing device 10, it may calculate the difference by subtracting the changed speaker volume notified by the received change notification data from the test volume. The audio processing device 10 may then readjust the overall directivity based on the calculated difference.

The network configuration of the audio processing devices 10 may be identified based on the information stored in the state table 231. For example, suppose that the state table 231 shown in FIG. 8 has been created in the audio processing device 11. The reception volume when the test sound output from the audio processing device 13 with ID 3 was collected (85 dB) is larger than the reception volume when the test sound output from the audio processing device 12 with ID 2 was collected (75 dB). In this case, the audio processing device 13 is located closer to the audio processing device 11 than the audio processing device 12 is. In addition, the arrival direction of the test sound output from the audio processing device 13 with ID 3 is 45 deg. The audio processing device 11 therefore recognizes that, as shown in FIG. 9, the audio processing devices 11 and 12 are arranged side by side and the audio processing device 13 is located closer to the audio processing device 11, at a position 45 degrees from it. In this way, the audio processing device 11 can identify the network configuration of the audio processing devices 11, 12, and 13 based on the reception volume and arrival direction of the test sound, and can adjust the overall directivity based on the identified network configuration.
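The proximity inference in this example can be sketched as a sort on reception volume. The Python below assumes that every device outputs the test sound at the same test volume, so that a louder reception implies a nearer device. The table layout and function name are hypothetical, and the arrival direction for ID 2 is left as None because this passage does not state it.

```python
def rank_by_proximity(state_table):
    """Return device IDs ordered from presumed nearest (loudest) to farthest (quietest)."""
    return sorted(
        state_table,
        key=lambda device_id: state_table[device_id][0],
        reverse=True,
    )


# Values quoted in the example: ID -> (reception volume [dB], arrival direction [deg]).
table = {2: (75.0, None), 3: (85.0, 45.0)}

for device_id in rank_by_proximity(table):
    volume_db, direction_deg = table[device_id]
    print(device_id, volume_db, direction_deg)
```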

1 Conference system
10, 11, 12, 13 Audio processing device
23 Flash memory
24 Speaker
25 Microphone
231 State table

Claims

An audio processing device including a microphone and a speaker, comprising:
Communication control means for communicating with another audio processing device;
First specifying means for specifying a timing for outputting a predetermined sound from the speaker by communicating with the other audio processing device through the communication control means;
Output control means for performing control to output the predetermined sound from the speaker at a predetermined volume at the timing specified by the first specifying means;
Second specifying means for specifying, when the predetermined sound output at the predetermined volume from another speaker connected to the other audio processing device is acquired via the microphone, the volume and the arrival direction of the acquired predetermined sound;
First adjusting means for adjusting overall directivity, which is the directivity of sensitivity to sound arriving at the audio processing device, based on the volume and the arrival direction specified by the second specifying means, such that sensitivity to sound arriving at the audio processing device from a specific direction is attenuated;
Transmission means for transmitting, to the other audio processing device, a change notification that notifies information for specifying the changed volume when the volume at which sound is output from the speaker is changed;
Receiving means for receiving the change notification transmitted by the other audio processing device; and
Second adjusting means for readjusting, when the change notification is received by the receiving means, the overall directivity adjusted by the first adjusting means based on the information notified by the change notification.

The audio processing device according to claim 1, wherein the first adjusting means
adjusts the overall directivity by attenuating sensitivity in the arrival direction corresponding to the volume when the volume specified by the second specifying means is equal to or greater than a predetermined threshold.

The audio processing device according to claim 2, wherein the first adjusting means
adjusts the overall directivity by attenuating the sensitivity in the arrival direction corresponding to the volume so that the volume at which the predetermined sound is acquired via the microphone falls within the predetermined threshold.

The audio processing device according to any one of claims 1 to 3, wherein
the transmission means calculates the difference in the volume before and after the change and transmits the change notification notifying the difference, and
the second adjusting means readjusts the overall directivity adjusted by the first adjusting means based on the difference notified by the change notification.

An audio processing device including a microphone and a speaker, comprising:
Communication control means for communicating with another audio processing device;
First specifying means for specifying a timing for outputting a predetermined sound from the speaker by communicating with the other audio processing device through the communication control means;
Output control means for performing control to output the predetermined sound from the speaker at a predetermined volume at the timing specified by the first specifying means;
Second specifying means for specifying, when the predetermined sound output at the predetermined volume from another speaker connected to the other audio processing device is acquired via the microphone, the volume and the arrival direction of the acquired predetermined sound;
First adjusting means for adjusting overall directivity, which is the directivity of sensitivity to sound arriving at the audio processing device, based on the volume and the arrival direction specified by the second specifying means, such that sensitivity to sound arriving at the audio processing device from a specific direction is attenuated; and
Second adjusting means for calculating, when the sensitivity with which the microphone collects sound is changed, the difference in the sensitivity before and after the change, and readjusting the overall directivity adjusted by the first adjusting means based on the calculated difference.

The audio processing device according to any one of claims 1 to 5, wherein the first specifying means
specifies the timing based on identification information sequentially assigned to the audio processing device and the other audio processing device by the communication control means.

An audio processing method comprising:
A communication control step of communicating with another audio processing device;
A first specifying step of specifying a timing for outputting a predetermined sound from a speaker of an audio processing device by communicating with the other audio processing device in the communication control step;
An output control step of performing control to output the predetermined sound from the speaker at a predetermined volume at the timing specified in the first specifying step;
A second specifying step of specifying, when the predetermined sound output at the predetermined volume from another speaker connected to the other audio processing device is acquired via a microphone of the audio processing device, the volume and the arrival direction of the acquired predetermined sound;
A first adjustment step of adjusting overall directivity, which is the directivity of sensitivity to sound arriving at the audio processing device, based on the volume and the arrival direction specified in the second specifying step, such that sensitivity to sound arriving at the audio processing device from a specific direction is attenuated;
A transmission step of transmitting, to the other audio processing device, a change notification that notifies information for specifying the changed volume when the volume at which sound is output from the speaker is changed;
A receiving step of receiving the change notification transmitted by the other audio processing device; and
A second adjustment step of readjusting, when the change notification is received in the receiving step, the overall directivity adjusted in the first adjustment step based on the information notified by the change notification.

An audio processing program for causing a computer of an audio processing device to execute:
A communication control step of communicating with another audio processing device;
A first specifying step of specifying a timing for outputting a predetermined sound from a speaker of the audio processing device by communicating with the other audio processing device in the communication control step;
An output control step of performing control to output the predetermined sound from the speaker at a predetermined volume at the timing specified in the first specifying step;
A second specifying step of specifying, when the predetermined sound output at the predetermined volume from another speaker connected to the other audio processing device under the control of the other audio processing device is acquired via a microphone of the audio processing device, the volume and the arrival direction of the acquired predetermined sound;
A first adjustment step of adjusting overall directivity, which is the directivity of sensitivity to sound arriving at the audio processing device, based on the volume and the arrival direction specified in the second specifying step, such that sensitivity to sound arriving at the audio processing device from a specific direction is attenuated;
A transmission step of transmitting, to the other audio processing device, a change notification that notifies information for specifying the changed volume when the volume at which sound is output from the speaker is changed;
A receiving step of receiving the change notification transmitted by the other audio processing device; and
A second adjustment step of readjusting, when the change notification is received in the receiving step, the overall directivity adjusted in the first adjustment step based on the information notified by the change notification.