JP2007078545A - Object detection system and voice conference system

Info

Publication number: JP2007078545A
Application number: JP2005267885A
Authority: JP
Inventors: Akio Yamane; 章生山根; Makoto Tanaka; 田中　　良
Original assignee: Yamaha Corp
Current assignee: Yamaha Corp
Priority date: 2005-09-15
Filing date: 2005-09-15
Publication date: 2007-03-29

Abstract

PROBLEM TO BE SOLVED: To provide an object detection system that detects the position and direction of an object more accurately without limiting the detection range of the object.

SOLUTION: The voice conference system 1 has functions for inputting a voice signal to a speaker array 2 and receiving a voice signal from a microphone array 3, and comprises: a searching signal generation section 26 that generates a search voice signal and inputs it to the speaker array 2; first and second beam adjustment sections 13, 23 that adjust the focuses of a voice beam and a sound collecting beam; a directivity control section 161 that makes the focuses of these beams overlap; a timer 163 that measures the search time taken for the voice to be output from the speaker array 2, reflected at the focus position, and input to the microphone array 3; a signal processing section 25 that checks, during the search time, whether the voice signal from the microphone array 3 includes a component of the search voice signal; and a position detection section 162 that decides that there is an object in the path of the voice beam when the component of the search voice signal is included.

COPYRIGHT: (C)2007, JPO&INPIT

Description

The present invention relates to an object detection apparatus and an audio conference apparatus that detect an object using a sound beam from a speaker array and a sound collection beam from a microphone array.

Conventionally, object detection devices are known that collect sound output from a speaker with a microphone and use the collected sound signal to detect an object such as an obstacle or a person. For example, Patent Literature 1 describes a driving support device that detects obstacles around a vehicle and presents them to the driver. In this driving support device, a transmitter (an omnidirectional speaker) transmits an audible-band signal wave over a wide angle. The signal wave reflected by an obstacle is detected by a receiver array (microphone array) made up of a plurality of receivers (microphones).

Phase correction is performed on the signal wave detected by each microphone of this microphone array. Specifically, the driving support device divides the detection area into a plurality of segments and stores, for each segment, the delay time taken for sound from that segment to reach each microphone. The device then performs phase correction by applying, to the signal wave detected by each microphone, the delay time for the corresponding segment. This phase correction is performed for every segment.

The driving support device superimposes and combines the phase-corrected signals. As a result, only the sound components from the corresponding segment are in phase across the signals and are reinforced, while other sound components are out of phase and are attenuated. This forms a sound collection beam (a beam with directivity) that selectively collects sound from the corresponding segment.

The driving support device uses the sound collection beam of each segment to collect the sound of that segment and generates an intensity distribution indicating the intensity of each collected sound. It then determines that an obstacle is present at a position where the intensity in this distribution is high. In this way, the conventional driving support device can detect both the presence and the position of an obstacle.

Patent Literature 1: JP-A-10-20034

The conventional object detection device uses an omnidirectional speaker to transmit an audible-band signal wave over a wide angle. The sound of each segment is then collected with a sound collection beam to obtain the intensity of the detected sound in each segment, and an object such as an obstacle is determined to be located in a segment where this intensity is high. Consequently, sound reflected by an obstacle leaks into segments other than the one containing the obstacle and becomes noise. Because of this multipath, in which noise from areas other than the segment being searched is collected, the obstacle may not be detected accurately.

Accordingly, to solve the above problem, an object of the present invention is to provide an object detection apparatus and an audio conference apparatus that can detect an object more accurately.

To solve the above problem, the present invention employs the following means.

(1) The present invention is an object detection apparatus having a function of inputting an audio signal to a speaker array and receiving an audio signal from a microphone array, the apparatus comprising: a signal generation unit that generates a search audio signal and inputs it to the speaker array; a first beam adjustment unit that focuses the sound beam from the speaker array; a second beam adjustment unit that focuses the sound collection beam of the microphone array; a directivity control unit that controls the focusing by the first and second beam adjustment units so that the focal point of the sound beam and the focal point of the sound collection beam overlap; a timing unit that measures the search time from when sound is output from the speaker array until it is reflected at the focal position and input to the microphone array; a search unit that checks, during the search time, whether the audio signal input from the microphone array includes a component of the search audio signal; and a determination unit that determines that an object is present on the path of the sound beam from the speaker array to the focal point when the search unit detects that the component of the search audio signal is included.

According to this configuration, the signal generation unit generates a search audio signal and inputs it to the speaker array. The first beam adjustment unit focuses the sound of the search audio signal (the search sound) emitted from the speaker array, and the second beam adjustment unit focuses the sound collection beam of the microphone array. The directivity control unit controls the two so that the focal points of the sound beam and the sound collection beam overlap.

When an object is located in the output direction of the sound beam, the sound beam is reflected by the object, and if the object is near the focal position, the reflected sound is collected by the sound collection beam. The search unit checks whether the audio signal input from the microphone array includes a component of the search audio signal. If the search unit detects such a component, it is determined that an object is present on the path of the sound beam from the speaker array to the focal point.

Because the sound beam is output from a speaker array rather than from an omnidirectional speaker as in the prior art, no sound is output to regions (directions) outside the search range, and multipath in which reflected sound from outside the search range leaks into the microphones can be effectively prevented.

In addition, the timing unit measures the search time taken for sound output from the speaker array to be reflected at the focal position and input to the microphone array. The search unit checks whether the audio signal input from the microphone array includes a component of the search audio signal only during this measured time. This prevents the microphone array from picking up sound reflected at positions beyond the focal position and thereby prevents erroneous detection of an object.
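
The time-gated check described above can be illustrated with a minimal sketch. The function name `contains_probe`, the normalized-correlation score, and the threshold of 0.5 are assumptions made for illustration; the source does not specify how the search unit tests for the component of the search audio signal.

```python
def contains_probe(received, probe, window_samples, threshold=0.5):
    """Look for `probe` in `received`, but only inside the search window.

    Returns the sample index where the probe starts, or None.
    The score is the correlation divided by the probe energy, a
    simplification that assumes the echo is not strongly attenuated.
    """
    energy = sum(p * p for p in probe)
    # The probe must fit entirely inside the gated part of the signal.
    last_start = min(window_samples, len(received)) - len(probe)
    for start in range(max(last_start + 1, 0)):
        score = sum(r * p for r, p in zip(received[start:], probe)) / energy
        if score >= threshold:
            return start
    return None


probe = [1.0, -1.0, 1.0, 1.0]

# Echo from near the focal point: arrives inside a 12-sample window.
near_echo = [0.0] * 20
near_echo[6:10] = probe

# Echo from beyond the focal point: arrives after the window closes.
far_echo = [0.0] * 20
far_echo[15:19] = probe

near_hit = contains_probe(near_echo, probe, window_samples=12)
far_hit = contains_probe(far_echo, probe, window_samples=12)
```

In this sketch the late echo is ignored simply because the loop never examines samples past the window, which mirrors the idea of checking only during the measured search time.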

(2) In the above object detection apparatus, the determination unit detects the position of the object using the output timing of the sound beam from the speaker array, the input timing of the component of the search audio signal at the microphone array, and the output direction of the sound beam.

According to this configuration, the distance of the object from the speaker array or the microphone array can be measured using the speed of sound together with the output timing and the input timing. The direction of the object can be determined from the output direction of the sound beam. The position of the object can therefore be detected from its distance and direction.

As described in (1) above, the direction of the object can be measured more accurately than with the prior art that uses an omnidirectional speaker. Because the distance of the object can also be measured accurately, the position of the object can be detected more accurately than before.

(3) In the above object detection apparatus, the signal generation unit generates an aperiodic pulse train as the search audio signal. This makes it easier for the determination unit to obtain the input timing of the component of the search audio signal. If a periodic pulse train were used as the search audio signal, the pulse shape starting from a position shifted by a whole number of periods (for example, one period) would be the same as the pulse shape starting from the true start position. The input start timing of the search audio signal from the microphone array would therefore be prone to erroneous detection.

When an aperiodic pulse train is used as the search audio signal, on the other hand, the pulse shape starting from a shifted position differs from the pulse shape starting from the start position, so the input timing of the component of the search audio signal is easier to obtain than with a periodic pulse train.
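
The timing ambiguity of a periodic pulse train can be seen by comparing each train with shifted copies of itself. The following is a small sketch; the pulse positions (0, 1, 4, 6), chosen so that all pairwise spacings differ, and the train length of 16 samples are assumptions for illustration and do not come from the source.

```python
def autocorrelation(train, lag):
    """Overlap between a pulse train and a copy of itself shifted by `lag`."""
    return sum(a * b for a, b in zip(train, train[lag:]))


# Periodic train: one pulse every 4 samples.
periodic = [1, 0, 0, 0] * 4

# Aperiodic train: pulses at positions whose pairwise spacings
# (1, 2, 3, 4, 5, 6) are all distinct.
aperiodic = [0] * 16
for position in (0, 1, 4, 6):
    aperiodic[position] = 1

# Largest overlap at any non-zero shift, compared with the peak at shift 0.
periodic_sidelobe = max(autocorrelation(periodic, lag) for lag in range(1, 16))
aperiodic_sidelobe = max(autocorrelation(aperiodic, lag) for lag in range(1, 16))
```

Both trains have four pulses, so both peak at 4 for zero shift. The periodic train still overlaps in 3 pulses when shifted by one period, which is easy to mistake for the true start, whereas the aperiodic train never overlaps in more than 1 pulse.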

(4) The present invention is also an audio conference apparatus using the object detection apparatus according to any one of claims 1 to 3, wherein the signal generation unit generates, as the search audio signal, an audio signal composed of audible-sound frequency components, and the determination unit detects a conference attendee as the object.

According to this configuration, the signal generation unit generates an audio signal composed of audible-sound frequency components as the search audio signal, and the direction in which a conference attendee is located is detected with this search sound. An audio conference apparatus normally includes a configuration for outputting audible sound. Because the present invention uses an audio signal composed of audible-sound frequency components, the search sound can be output using that existing configuration. Audible sound is not strongly directional, but because it is output as a sound beam by the speaker array, this weakness can be compensated for.

According to the present invention, because the sound beam is output from a speaker array, no sound is output outside the search range, and multipath in which reflected sound from outside the search range is collected can be effectively prevented. In addition, the audio signal input from the microphone array is checked for a component of the search audio signal only during the measured time taken for the audio output from the speaker array to be reflected at the focal position and input to the microphone array. This effectively prevents multipath in which sound reflected at positions beyond the focal position is received by the microphone array, and so prevents erroneous detection of an object.

As described above, multipath from reflections in regions (directions) other than the search range and multipath from reflections at positions beyond the focal position can both be effectively prevented, which improves the accuracy of object detection.

An audio conference apparatus according to an embodiment of the present invention will be described in detail with reference to FIGS. 1 to 6. The audio conference apparatus 1 exchanges call signals with another (counterpart) audio conference apparatus 1 at a remote location (referred to as the counterpart apparatus 1' for distinction), so that a user of the audio conference apparatus 1 (conference attendee h) can hold an audio conference with a counterpart speaker using the counterpart apparatus 1'.

FIG. 1 shows the appearance of the audio conference apparatus 1 as viewed from above, together with the propagation range of the conference sound and the sound collection range. In this figure, the audio conference apparatus 1 is placed on a conference desk and is thus positioned at a height near the heads of the seated conference attendees h. In this figure, the front side of the audio conference apparatus 1 is referred to as the -Y side, the rear side as the Y side, the right side as the X side, and the left side as the -X side.

The audio conference apparatus 1 includes a long, substantially rectangular-parallelepiped casing 1A, with a speaker array 2 in the upper tier on the -Y side of the casing 1A. Because the speaker array 2 is built into the casing 1A, it is not actually visible from outside, but it is drawn as if seen through the casing for convenience of explanation. Likewise for convenience of explanation, the speaker array 2 is drawn on the Y side rather than on the -Y side.

The speaker array 2 consists of eight speaker units SP (SP1 to SP8) arranged in a line along the longitudinal direction. Each speaker unit SP is arranged with its sound emission surface on the -Y side, and when audio signals are input to the speaker units SP1 to SP8, the speaker array 2 outputs a sound beam in the -Y direction. The sound beam from the speaker array 2 carries the voice of the counterpart speaker received from the counterpart apparatus 1'.

The directivity (direction and range) of the sound beam from the speaker array 2 can be controlled through the delay times added to the audio signals input to the speaker units SP. Specifically, each signal input to a speaker unit SP is given a delay time (indicated by the bold arrows in the figure) such that the sound from every speaker unit SP reaches the focal point P at the same time. The sound beam can thereby be focused on the focal point P.
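
As a rough sketch of how such delay times could be derived from geometry, the following computes one delay per speaker unit so that all wavefronts arrive at the focal point together. The speed of sound (343 m/s), the 5 cm unit spacing, and the coordinates are assumed values; the source gives none of them.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed room-temperature value


def focusing_delays(unit_positions, focus, c=SPEED_OF_SOUND):
    """Return per-unit delays in seconds that align arrivals at `focus`."""
    distances = [math.dist(p, focus) for p in unit_positions]
    farthest = max(distances)
    # The farthest unit emits first (zero delay); nearer units wait
    # for the difference in travel time.
    return [(farthest - d) / c for d in distances]


# Eight units in a line along X, 5 cm apart (assumed), centred on the origin.
units = [((i - 3.5) * 0.05, 0.0) for i in range(8)]
focus = (0.0, -1.0)  # 1 m in front of the array, on the -Y side
delays = focusing_delays(units, focus)

# Arrival time at the focal point for each unit: its delay plus travel time.
arrivals = [d + math.dist(u, focus) / SPEED_OF_SOUND for d, u in zip(delays, units)]
```

With the focal point on the centre line, the outermost units get zero delay and the central units get the largest delay, and every entry of `arrivals` is the same.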

By directing the sound beam from the speaker array 2 within a narrow range containing the conference attendees h (h1 to h3) in this way, the conference sound can be provided to the conference attendees h only. Leakage of the conference sound to people other than the conference attendees h is effectively prevented, so even if an audio conference with a remote location is held in an ordinary office rather than a conference room, the work of other people in the office is not disturbed.

A microphone array 3 is disposed in the lower tier on the -Y side of the casing 1A, that is, below the speaker array 2. Because the microphone array 3 is built into the casing 1A, it is not actually visible from outside, but it is drawn as if seen through the casing for convenience of explanation. Strictly speaking, the speaker array 2 lies directly above the microphone array 3, but for convenience of explanation the figure shows the speaker array 2 and the microphone array 3 side by side in the horizontal plane.

The microphone array 3 consists of eight microphones M (M1 to M8) arranged in a line along the longitudinal direction. Each microphone M is arranged with its sound collection side facing the -Y side. Using a sound collection beam with directivity toward a given search position, the microphone array 3 can selectively collect the sound from that search position. Specifically, the audio conference apparatus 1 stores the delay time taken for sound from the search position to reach each microphone M, and adjusts the phase of the audio signal collected by each microphone M using the corresponding delay time. After this phase adjustment, the audio signals collected by the microphones M are in phase with one another for sound components originating from the search position.

For sound components originating from positions away from the search position, on the other hand, the audio signals collected by the microphones M are out of phase with one another after the phase adjustment. Therefore, when the phase-adjusted signals from the microphones M are superimposed and combined, sound components from the search position are reinforced, and sound components are attenuated more the farther their origin is from the search position. The microphone array 3 can thus selectively collect the sound from the search position.
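
The phase adjustment and summation described above is commonly called delay-and-sum beamforming. A minimal sketch with whole-sample delays follows; the three-microphone setup and the pulse timings are assumptions chosen to keep the example small, and a practical implementation would also need fractional delays.

```python
def delay_and_sum(channels, delays_in_samples):
    """Align each channel by its stored delay, then average the channels.

    `delays_in_samples[k]` is how many samples later the sound from the
    search position reaches microphone k than it reaches the nearest one.
    """
    length = min(len(ch) - d for ch, d in zip(channels, delays_in_samples))
    return [
        sum(ch[n + d] for ch, d in zip(channels, delays_in_samples)) / len(channels)
        for n in range(length)
    ]


def pulse(at, length=12):
    """A single unit pulse arriving at sample index `at`."""
    return [1.0 if n == at else 0.0 for n in range(length)]


# Sound from the search position reaches the three microphones at
# samples 5, 6 and 7, so the stored delays are 0, 1 and 2.
channels = [pulse(5), pulse(6), pulse(7)]

focused = delay_and_sum(channels, [0, 1, 2])    # delays match the source
unfocused = delay_and_sum(channels, [0, 0, 0])  # delays do not match
```

When the stored delays match the source position, the three pulses line up and the averaged peak keeps its full height of 1.0. With mismatched delays the pulses stay spread out and the peak drops to one third.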

In the audio conference apparatus 1, the sound collection beam is focused on the position of the conference attendee h. The voice of the conference attendee h can thus be collected selectively, which effectively prevents the attendee's voice from becoming hard to hear because noise such as ambient sound is also collected.

When there are several conference attendees h, for example the three attendees h1 to h3 shown in the figure, the sound collection beam is focused on the position of each of the three attendees (positions P1 to P3). The audio conference apparatus 1 then compares the intensities of the sounds collected from positions P1 to P3, weights each sound according to its intensity, and adds the weighted sounds together.
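
One possible reading of this intensity-based weighting is sketched below. The source does not say how intensity is measured or how the weights are derived from it, so the use of RMS level and of weights proportional to that level are both assumptions.

```python
import math


def rms(signal):
    """Root-mean-square level, used here as the intensity measure (assumed)."""
    return math.sqrt(sum(x * x for x in signal) / len(signal))


def mix_by_intensity(beams):
    """Weight each beam output by its share of the total level and sum them."""
    levels = [rms(b) for b in beams]
    total = sum(levels)
    if total == 0.0:
        return [0.0] * len(beams[0])
    weights = [level / total for level in levels]
    return [
        sum(w * b[n] for w, b in zip(weights, beams))
        for n in range(len(beams[0]))
    ]


# Beam outputs for positions P1 to P3: only the attendee at P1 is speaking.
beam_p1 = [1.0, -1.0, 1.0, -1.0]
beam_p2 = [0.0, 0.0, 0.0, 0.0]
beam_p3 = [0.0, 0.0, 0.0, 0.0]

mixed = mix_by_intensity([beam_p1, beam_p2, beam_p3])
```

With this choice of weights, a position where nobody is speaking contributes almost nothing to the mix, so its background noise is suppressed in the signal sent to the counterpart.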

The combined audio signal is output from the audio conference apparatus 1 to the counterpart apparatus 1'. Call audio that selectively contains the voices of the conference attendees h can thus be sent to the counterpart apparatus 1'. This effectively prevents noise components from making the voices of the conference attendees h hard to hear, and allows a good audio conference between the audio conference apparatus 1 and the counterpart apparatus 1'.

As described above, in this embodiment, when an audio conference is held with the counterpart apparatus 1' using the audio conference apparatus 1, the speaker array 2 outputs a sound beam whose focal point P is at the position of the conference attendee h, and the sound collection beam is likewise focused on the position of the conference attendee h. For this purpose, the audio conference apparatus 1 performs, before the audio conference, a process of detecting the position of the conference attendee h (the object) in advance (position detection process). The audio conference apparatus 1 then focuses the sound beam and the sound collection beam using the attendee position obtained by this process.

The position detection process executed by the audio conference apparatus 1 is described below with reference to FIG. 2. FIG. 2 shows the propagation range of the sound beam and the collection range of the sound collection beam while the position detection process is being executed. The audio conference apparatus 1 outputs the search sound from the speaker array 2 as a sound beam. In the position detection process, focusing is performed so that the focal point P of the sound beam from the speaker array 2 and the focal position of the sound collection beam of the microphone array 3 overlap.

Consequently, when a conference attendee h is in the output direction of the sound beam, the sound beam is reflected by the attendee and collected by the sound collection beam. From the timing at which the speaker array 2 outputs the search sound and the timing at which the microphone array 3 receives it, the distance L1 from the conference attendee h to the front face of the audio conference apparatus 1 can be obtained by formula (1) below.

L1 = (t1 / 2) × C ... Formula (1)
Here, t1 is the time from the output timing of the search sound to the timing at which the search sound is input to the microphone array 3, and C is the speed of sound.

The position of the conference attendee h can then be detected from the output direction of the sound beam and the distance L1. In this way, the position of the conference attendee h is detected by outputting a sound beam from the speaker array 2 and collecting its reflection with the sound collection beam. Because no sound is output outside the propagation range of the sound beam (the search range), multipath in which reflected sound from outside the search range is collected can be effectively prevented.
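
Formula (1) and the beam direction together give a position, which can be sketched as follows. The angle convention (measured from the -Y axis, positive toward the X side), the origin at the centre of the front face, and the speed of sound of 343 m/s are assumptions introduced for this example.

```python
import math


def locate(t1_seconds, beam_angle_rad, c=343.0):
    """Convert round-trip time and beam direction into (x, y) coordinates.

    The distance follows formula (1): L1 = t1 / 2 * C.
    The front of the apparatus is the -Y side, so an object straight
    ahead (angle 0) has a negative y coordinate.
    """
    l1 = t1_seconds / 2.0 * c
    x = l1 * math.sin(beam_angle_rad)
    y = -l1 * math.cos(beam_angle_rad)
    return (x, y)


# A reflection received 10 ms after output, with the beam pointing straight ahead.
straight_ahead = locate(0.010, 0.0)

# The same round-trip time with the beam steered 30 degrees toward the X side.
to_the_right = locate(0.010, math.radians(30.0))
```

For a 10 ms round trip, the one-way travel time is 5 ms, so the object lies about 1.7 m from the apparatus in the direction of the beam.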

In the position detection process, the audio conference apparatus 1 also calculates, using formula (2) below, the search time t2 taken until sound reflected at the focal point P of the sound beam is collected. Whether the input signal from the microphone array 3 includes a component of the search sound is judged only within this search time t2. This prevents multipath in which the sound beam, after converging at the focal point P, propagates further forward (in the -Y direction in the figure), is reflected by an object or person other than the conference attendee h, and the reflection is collected.

t2 = (L2 / C) × 2 ... Formula (2)
Here, L2 is the distance from the focal point P to the microphone M that is farthest from the focal point P. The farthest microphone M is used as the reference because a sound collection beam with directivity toward the focal point P cannot be formed by phase-adjusting and summing the input signals from the microphones M unless the reflection from the focal point P has also reached the microphone M farthest from it.
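
A short sketch of formula (2) follows, with L2 taken as the distance from the focal point to the farthest microphone. The microphone coordinates and the speed of sound of 343 m/s are assumed values for illustration.

```python
import math


def search_time(mic_positions, focus, c=343.0):
    """Search time t2 = L2 / C * 2, with L2 measured to the farthest microphone."""
    l2 = max(math.dist(m, focus) for m in mic_positions)
    return l2 / c * 2.0


# Two microphones 0.3 m apart (assumed) and a focal point 0.4 m in front
# of the first one. The second microphone is the farther one, at 0.5 m.
mics = [(0.0, 0.0), (0.3, 0.0)]
focus = (0.0, -0.4)

t2 = search_time(mics, focus)
```

Using the farthest microphone makes the window long enough for the reflection from the focal point to reach every microphone before the check ends.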

As described above, the range in which a conference attendee h can be detected is limited to the region between the audio conference apparatus 1 and the focal point P. Even within that region, however, sound is collected by the sound collection beam, so collection becomes harder the farther the attendee is from the focal point P. This position detection method is therefore effective when the conference attendee h is near the focal point P.

The position detection method described above is executed while the focal point P is moved within the search area (the area in which conference attendees may be seated). The positions of conference attendees within the search area can thereby be detected. In the figure, the focal point P is moved by sliding it left and right while its position in the depth direction is kept fixed, but its position in the depth direction may also be moved. Because sound collection becomes harder with distance from the focal point P, as noted above, also moving the focal point in depth allows the position of the conference attendee h to be detected more accurately.
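
The scan over the search area can be sketched as a loop over candidate focal points. Here `probe_at` is a hypothetical callback standing in for the whole emit, collect, and check sequence; it is assumed to return the round-trip time when the search sound is detected within the search time and None otherwise. The grid spacing and depth are also assumed.

```python
def scan(focus_points, probe_at):
    """Probe each focal point in turn and keep those that return an echo."""
    hits = []
    for focus in focus_points:
        round_trip = probe_at(focus)
        if round_trip is not None:
            hits.append((focus, round_trip))
    return hits


# Focal points sliding left to right at a fixed depth of 1 m (assumed),
# as in the figure: x = -1.0, -0.5, 0.0, 0.5, 1.0.
focus_grid = [(x / 10.0, -1.0) for x in range(-10, 11, 5)]


def fake_probe(focus):
    """Stand-in for the hardware: one attendee sits near x = 0.5."""
    return 0.006 if focus == (0.5, -1.0) else None


detections = scan(focus_grid, fake_probe)
```

Extending `focus_grid` with additional depth values would correspond to moving the focal point in the depth direction as well.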

FIG. 3 is a block diagram schematically showing the configuration of the audio conference apparatus 1 shown in FIG. 1. As the configuration for outputting the audio received from the counterpart apparatus 1', the apparatus main body 1A includes, in addition to the speaker units SP1 to SP8, an input/output interface 11, an echo canceller 12, delay units 13, a D/A (digital/analog) converter 14, an amplifier 15, and a control unit 16.

入出力インタフェース１１は、接続端子１７に接続された通信ケーブル（図略）等を介して、この通信ケーブルに接続された相手方装置１´との間でデジタル音声信号の送受信を行う。エコーキャンセラ１２は、相手方装置１´から入出力インタフェース１１を介して受信した通話音声信号（受信音声信号）が入力される。エコーキャンセラ１２は、この入力信号を用いて、スピーカアレイ２から出力されてマイクアレイ３に帰還されるエコー成分を擬似した擬似信号を生成する。そして、エコーキャンセラ１２は、マイクアレイ３から入力した音声信号（後述）から擬似信号を除去することでエコー成分を除去する。 The input / output interface 11 transmits / receives a digital audio signal to / from a counterpart apparatus 1 ′ connected to the communication cable via a communication cable (not shown) connected to the connection terminal 17. The echo canceller 12 receives a call voice signal (received voice signal) received from the counterpart apparatus 1 ′ via the input / output interface 11. The echo canceller 12 uses this input signal to generate a pseudo signal that simulates an echo component output from the speaker array 2 and fed back to the microphone array 3. The echo canceller 12 removes an echo component by removing a pseudo signal from an audio signal (described later) input from the microphone array 3.

遅延部１３は、本願発明の第１ビーム調整部に対応し、スピーカユニットＳＰ１〜ＳＰ８の個数分だけ（８個）設けられている。以下、それぞれの遅延部１３を区別する場合には、スピーカユニットＳＰ１〜ＳＰ８のうち対応するものと同様の数字を添え字として付す。例えば、スピーカユニットＳＰ１に対応する遅延部１３は、遅延部１３−１と記載する。遅延部１３−１〜１３−８は、それぞれＤ／Ａコンバータ１４に受信音声信号を入力する。 The delay unit 13 corresponds to the first beam adjustment unit of the present invention, and is provided by the number of speaker units SP1 to SP8 (eight). Hereinafter, when distinguishing each delay part 13, the number similar to what is corresponded among speaker units SP1-SP8 is attached as a subscript. For example, the delay unit 13 corresponding to the speaker unit SP1 is referred to as a delay unit 13-1. Each of the delay units 13-1 to 13-8 inputs the received audio signal to the D / A converter 14.

遅延部１３−１〜１３−８は、それぞれエコーキャンセラ１２から受信音声信号が入力される。遅延部１３−１〜１３−８には、それぞれ遅延時間Ｄ１〜Ｄ８が設定されている。遅延部１３−１〜１３−８は、入力された受信音声信号を遅延時間Ｄ１〜Ｄ８だけ遅延させて、Ｄ／Ａコンバータ１４を介して対応するアンプ１５に入力することで、音声ビームの焦点合わせを行う。すなわち、入力した受信音声信号に付与する遅延時間Ｄ１〜Ｄ８の値によって、遅延部１３−１〜１３−８は音声ビームの焦点Ｐが会議出席者ｈの位置になるように焦点合わせを行う。 The delay units 13-1 to 13-8 each receive the received audio signal from the echo canceller 12. Delay times D1 to D8 are set in the delay units 13-1 to 13-8, respectively. The delay units 13-1 to 13-8 focus the audio beam by delaying the input received audio signal by the delay times D1 to D8 and feeding it to the corresponding amplifiers 15 via the D/A converters 14. In other words, through the values of the delay times D1 to D8 applied to the input received audio signal, the delay units 13-1 to 13-8 focus the audio beam so that its focal point P falls on the position of the conference attendee h.

遅延時間Ｄ１〜Ｄ８の値によって音声ビームの指向性が制御される原理を説明する。各スピーカユニットからは放射状（円形）に伝播するように音声が出力される。各遅延時間Ｄ１〜Ｄ８を同じ時間とし、各スピーカユニットからの各音声が同時に出力されると、互いに平行に向かって伝播する成分のみが位相が一致して強め合う。そして、この他の方向に伝播する成分は隣接するスピーカユニットＳＰからの音声同士で干渉し合って打ち消される。これによって、スピーカユニットＳＰ１〜ＳＰ８からの合成音声は音声ビームとなる。この音声ビームは正面方向に指向するとともに、焦点Ｐでのビーム幅がスピーカユニットＳＰ１−スピーカユニットＳＰ８間の距離と略同幅になる。 The principle that the directivity of the sound beam is controlled by the values of the delay times D1 to D8 will be described. Sound is output from each speaker unit so as to propagate radially (circular). When the delay times D1 to D8 are set to the same time and the sounds from the speaker units are simultaneously output, only the components propagating parallel to each other are in phase and intensified. The components propagating in the other direction interfere with each other from the adjacent speaker units SP and are canceled out. Thereby, the synthesized speech from the speaker units SP1 to SP8 becomes an audio beam. The sound beam is directed in the front direction, and the beam width at the focal point P is substantially the same as the distance between the speaker unit SP1 and the speaker unit SP8.

これに対して、各遅延時間Ｄ１〜Ｄ８を同じ時間とするのではなく、各スピーカユニットＳＰからの音声が焦点Ｐに同時にかつ所定のビーム幅で到達するような遅延時間（例えば図１の太字矢印で示す遅延時間）に遅延時間Ｄ１〜Ｄ８を設定することで、焦点Ｐに焦点するように制御することができる。 In contrast, instead of making the delay times D1 to D8 equal, the delay times D1 to D8 can be set so that the sound from each speaker unit SP reaches the focal point P simultaneously and with a predetermined beam width (for example, the delay times indicated by the bold arrows in FIG. 1). With these settings the audio beam can be controlled so that it is focused on the focal point P.
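The focusing rule described above can be sketched as follows. This is only an illustration, assuming point-source speaker units on a straight line and a speed of sound of 343 m/s; the helper name `focus_delays` and the array geometry are hypothetical and do not appear in the original. The unit farthest from the focal point P receives zero delay, and each nearer unit is delayed by the difference in travel time so that all wavefronts reach P together.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed room-temperature value


def focus_delays(unit_positions, focus, c=SPEED_OF_SOUND):
    """Return one delay (seconds) per speaker unit so that sound from
    every unit reaches `focus` at the same instant.

    unit_positions: list of (x, y) coordinates of the units, in metres.
    focus: (x, y) coordinate of the focal point P, in metres.
    """
    travel = [math.dist(p, focus) / c for p in unit_positions]
    longest = max(travel)
    # The farthest unit fires first (delay 0); closer units wait.
    return [longest - t for t in travel]


# Eight units spaced 5 cm apart on the x axis (hypothetical geometry),
# focused on a point 1 m in front of the centre of the array.
units = [(0.05 * i, 0.0) for i in range(8)]
delays = focus_delays(units, focus=(0.175, 1.0))
```

The same calculation, applied to the microphone positions, would give values playing the role of the delay times D11 to D18 used for the sound collection beam.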

Ｄ／Ａコンバータ１４は、遅延部１３の個数だけ設けられている。これらのＤ／Ａコンバータ１４（１４−１〜１４−８）は、それぞれ対応する遅延部１３から遅延時間の付与された受信音声信号が入力される。Ｄ／Ａコンバータ１４−１〜１４−８は入力された受信音声信号をデジタル信号からアナログ信号に変換して対応するアンプ１５に入力する。 There are as many D / A converters 14 as the number of delay units 13. These D / A converters 14 (14-1 to 14-8) receive the received audio signal to which the delay time is given from the corresponding delay unit 13. The D / A converters 14-1 to 14-8 convert the received reception audio signal from a digital signal to an analog signal and input it to the corresponding amplifier 15.

アンプ１５は、入力された音声信号の信号レベルを増幅する。アンプ１５は、スピーカユニットＳＰ１〜ＳＰ８に対応する個数だけ設けられている。以下、それぞれのアンプ１５を区別する場合には、スピーカユニットＳＰ１〜ＳＰ８のうち対応するものと同様の数字を添え字として付す。 The amplifier 15 amplifies the signal level of the input audio signal. The amplifier 15 is provided in the number corresponding to the speaker units SP1 to SP8. Hereinafter, when distinguishing each amplifier 15, the number similar to what is corresponded among speaker units SP1-SP8 is attached as a subscript.

アンプ１５−１〜アンプ１５−８は、遅延部１３−１〜１３−８からＤ／Ａコンバータ１４（１４−１〜１４−８）を介して受信音声信号が入力される。アンプ１５−１〜アンプ１５−８は、入力された受信音声信号の信号レベルを増幅して対応するスピーカユニットＳＰ１〜ＳＰ８に入力する。これによって、スピーカユニットＳＰ１〜ＳＰ８から受信音声信号の音声が放音され、相手方装置１´からの相手方の話声が放音される。 The amplifier 15-1 to amplifier 15-8 receives the received audio signal from the delay units 13-1 to 13-8 via the D / A converter 14 (14-1 to 14-8). The amplifiers 15-1 to 15-8 amplify the signal level of the input reception audio signal and input the amplified signal levels to the corresponding speaker units SP1 to SP8. As a result, the voice of the received voice signal is emitted from the speaker units SP1 to SP8, and the voice of the other party from the other party apparatus 1 'is emitted.

コントロール部１６は、例えばＣＰＵ（Central Processing Unit）やメモリ等の記憶部、操作部等のユーザインタフェース等を備える。メモリに記憶されたプログラムを実行することで、コントロール部１６は例えば音声会議装置１´との間の通話等、音声会議装置１の各部の動作を制御する。例えば、コントロール部１６は、位置検出処理を実行することで、会議出席者ｈの位置を検出する。そして、コントロール部１６はこの検出した位置に音声ビームが焦点となるような遅延時間Ｄ１〜Ｄ８を算出して遅延部１３−１〜１３−８に設定する。 The control unit 16 includes, for example, a CPU (Central Processing Unit), a storage unit such as a memory, and a user interface such as an operation unit. By executing a program stored in the memory, the control unit 16 controls the operation of each unit of the audio conference apparatus 1, for example during a call with the counterpart audio conference apparatus 1′. For example, the control unit 16 detects the position of the conference attendee h by executing the position detection process. The control unit 16 then calculates delay times D1 to D8 that focus the audio beam on the detected position and sets them in the delay units 13-1 to 13-8.

また、音声会議装置１は、相手方装置１´に会議出席者ｈの音声信号を出力するための構成として、マイクアレイ３及び上記構成に加えて、マイクアンプ２１、Ａ／Ｄ変換部２２、集音ビーム形成部２３、バンドパスフィルタ２４及び信号処理部２５を備える。 In addition, as the configuration for outputting the audio signal of the conference attendee h to the counterpart apparatus 1′, the audio conference apparatus 1 includes, in addition to the microphone array 3 and the components described above, a microphone amplifier 21, an A/D conversion unit 22, a sound collection beam forming unit 23, a band-pass filter 24, and a signal processing unit 25.

マイクアンプ２１は、マイクアレイ３の各マイクＭで集音した音声の各信号がマイクアレイ３から入力され、この各入力信号を増幅する。Ａ／Ｄ変換部２２は、マイクアンプ２１から入力された増幅後の各アナログ信号をデジタル信号に変換する。Ａ／Ｄ変換部２２は変換後の各デジタル信号を集音ビーム形成部２３に入力する。 The microphone amplifier 21 receives each signal of the sound collected by each microphone M of the microphone array 3 from the microphone array 3 and amplifies each input signal. The A / D conversion unit 22 converts each amplified analog signal input from the microphone amplifier 21 into a digital signal. The A / D conversion unit 22 inputs the converted digital signals to the sound collection beam forming unit 23.

集音ビーム形成部２３は、本願発明の第２ビーム調整部に対応し、位相補正部２３１及び加算部２３２を備える。位相補正部２３１は、Ａ／Ｄ変換部２２から入力した各デジタル信号の位相調整を行い、加算部２３２は位相補正後の各デジタル信号を加算合成する。この位相調整は、上述したように、探査位置からの音声が各マイクＭに至るまでの遅延時間Ｄ１１〜Ｄ１８を用いて行われる。これによって、入力信号のうち探査位置からの音声成分の位相を合致させ、その他の成分の位相を不一致にさせることができる。 The sound collection beam forming unit 23 corresponds to the second beam adjustment unit of the present invention, and includes a phase correction unit 231 and an addition unit 232. The phase correction unit 231 adjusts the phase of each digital signal input from the A / D conversion unit 22, and the addition unit 232 adds and synthesizes each digital signal after phase correction. As described above, this phase adjustment is performed using the delay times D11 to D18 until the sound from the search position reaches each microphone M. As a result, the phase of the audio component from the search position in the input signal can be matched, and the phases of the other components can be made inconsistent.

このため、加算合成後の各デジタル信号は、探査位置からの音声成分のレベルが強められ、その他の成分のレベルは弱められることになる。これによって、探査位置からの音声に特化して音声を集音することができる集音ビームが形成される。この探査位置は、上述したように、会議出席者ｈの位置とされる。 For this reason, in each digital signal after addition synthesis, the level of the sound component from the search position is strengthened, and the levels of the other components are weakened. As a result, a sound collecting beam is formed that can collect the sound specifically for the sound from the search position. This search position is the position of the meeting attendee h as described above.
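The phase alignment and addition performed by the sound collection beam forming unit 23 can be illustrated with a minimal delay-and-sum sketch. It assumes delays that are whole numbers of samples, which is a simplification; the function name `delay_and_sum` and the toy signals are hypothetical.

```python
def delay_and_sum(channels, delays):
    """Delay each microphone channel by a whole number of samples and sum.

    channels: list of equal-length sample lists, one per microphone.
    delays: list of non-negative integer delays (samples), one per channel.
    """
    length = len(channels[0])
    out = [0.0] * length
    for samples, d in zip(channels, delays):
        for n in range(d, length):
            out[n] += samples[n - d]
    return out


# A unit pulse from the search position arrives at three microphones at
# samples 2, 3 and 5. Delaying by 3, 2 and 0 lines all copies up at sample 5.
mics = [
    [0, 0, 1, 0, 0, 0, 0, 0],
    [0, 0, 0, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 1, 0, 0],
]
aligned = delay_and_sum(mics, [3, 2, 0])
unaligned = delay_and_sum(mics, [0, 0, 0])
```

With matching delays the three pulses add at a single sample, whereas without them they stay spread out and no sample is reinforced, which is the effect the text attributes to components that do not come from the search position.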

集音ビーム形成部２３は、複数チャンネル用意され（ここでは、集音ビーム形成部２３Ａ〜２３Ｃの３チャンネル）、各チャンネル２３Ａ〜２３Ｃは会議出席者ｈ１〜ｈ３の位置に焦点する集音ビームを形成する。この各チャンネル２３Ａ〜２３Ｃによる加算合成後の各音声信号はそれぞれバンドパスフィルタ２４に入力される。 The sound collection beam forming unit 23 is provided in a plurality of channels (here, three channels: sound collection beam forming units 23A to 23C), and the channels 23A to 23C form sound collection beams focused on the positions of the conference attendees h1 to h3, respectively. The audio signals produced by the addition and synthesis in the channels 23A to 23C are each input to the band-pass filter 24.

なお、位相補正部２３１は、例えば、シフトレジスタ等の遅延時間バッファメモリ（図略）等で実現される。遅延時間バッファメモリは、Ａ／Ｄ変換部２２から入力した各デジタル信号を格段に記憶するとともに、この記憶された値が位相補正の分だけ遅延させて読み出されて加算部２３２に入力される。これによって、所定の遅延時間で位相補正を行うことができるようになっている。また、この位相補正はコントロール部１６によって制御される。すなわち、コントロール部１６は後述する位置検出処理によって会議出席者ｈの位置を検出し、この検出位置を焦点位置とする集音ビームを形成するように位相補正を制御する。 The phase correction unit 231 is realized by, for example, a delay-time buffer memory (not shown) such as a shift register. The delay-time buffer memory stores each digital signal input from the A/D conversion unit 22 in successive stages, and each stored value is read out after a delay equal to the phase correction amount and input to the addition unit 232. This allows phase correction to be performed with a predetermined delay time. The phase correction is controlled by the control unit 16. That is, the control unit 16 detects the position of the conference attendee h by the position detection process described later, and controls the phase correction so as to form a sound collection beam whose focal position is the detected position.

バンドパスフィルタ２４は、入力した音声信号に対して、人の音声の周波数帯域の他の周波数帯域をカットするためのフィルタ係数を畳み込み演算して信号処理部２５（本願発明の探査部に対応）に入力する。これによって、人の音声の周波数帯域の成分のみが抽出されて信号処理部２５に入力される。 The band-pass filter 24 convolves the input audio signal with filter coefficients that cut the frequency bands outside the frequency band of the human voice, and inputs the result to the signal processing unit 25 (corresponding to the search unit of the present invention). As a result, only the components in the frequency band of the human voice are extracted and input to the signal processing unit 25.

信号処理部２５には、会議出席者ｈ１〜ｈ３それぞれの位置に特化して集音された各音声信号が各チャンネル２３Ａ〜２３Ｃからバンドパスフィルタ２４を介して入力される。信号処理部２５は、入力された各音声信号のレベルを比較し、このレベル比に応じた重み付けで、各音声信号を加算合成する。信号処理部２５は、加算合成後の音声信号（送信音声信号）をエコーキャンセラ１２に入力する。これによって、発言している会議出席者ｈ１の位置の音声信号を送信音声信号に多く反映させることができる。なお、信号処理部２５は、例えばＤＳＰ等で実現される。 Audio signals collected specifically for the positions of the conference attendees h1 to h3 are input to the signal processing unit 25 from the channels 23A to 23C via the bandpass filter 24. The signal processing unit 25 compares the levels of the input audio signals, and adds and synthesizes the audio signals with weighting according to the level ratio. The signal processing unit 25 inputs the audio signal (transmission audio signal) after the addition synthesis to the echo canceller 12. As a result, the audio signal at the position of the conference attendee h1 who is speaking can be reflected in the transmitted audio signal. The signal processing unit 25 is realized by a DSP, for example.
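The level-weighted mixing performed by the signal processing unit 25 might look like the sketch below. The original states only that the weighting follows the level ratio; the use of RMS as the level measure, the normalisation of the weights to sum to one, and the names `rms` and `mix_by_level` are assumptions made for illustration.

```python
import math


def rms(samples):
    """Root-mean-square level of one block of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def mix_by_level(channels):
    """Mix equal-length channel blocks, weighting each channel by its share
    of the total RMS level (an assumed weighting rule)."""
    levels = [rms(ch) for ch in channels]
    total = sum(levels)
    if total == 0.0:
        return [0.0] * len(channels[0])
    weights = [lv / total for lv in levels]
    return [
        sum(w * ch[n] for w, ch in zip(weights, channels))
        for n in range(len(channels[0]))
    ]


# Channel 0 (a talker) is much louder than channels 1 and 2.
blocks = [
    [0.8, -0.8, 0.8, -0.8],
    [0.1, 0.1, -0.1, -0.1],
    [0.1, -0.1, -0.1, 0.1],
]
mixed = mix_by_level(blocks)
```

Because the loud channel receives the largest weight, the mixed output is dominated by the attendee who is currently speaking.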

エコーキャンセラ１２では、上述した様に、信号処理部２５から入力された通話音声信号から擬似信号を除去することでエコー成分が除去される。このエコー成分の除去後の通話音声信号は、入出力インタフェース１１及び接続端子１７を介して相手方装置１´に送信される。これによって、本音声会議装置１´で集音した音声が相手方装置１´に入力されて、相手方との間で音声会議を行うことができる。 In the echo canceller 12, as described above, the echo component is removed by removing the pseudo signal from the call voice signal input from the signal processing unit 25. The call voice signal after the removal of the echo component is transmitted to the counterpart apparatus 1 ′ via the input / output interface 11 and the connection terminal 17. As a result, the voice collected by the voice conference apparatus 1 ′ is input to the partner apparatus 1 ′, and a voice conference can be performed with the partner party.

更に、音声会議装置１は、探査用信号生成部２６（本願発明の信号生成部に対応）を備え、この探査用信号生成部２６及び上述した各部の構成を用いて位置検出処理を実行する。以下に位置検出処理における各部の機能を説明する。 Furthermore, the audio conference apparatus 1 includes a search signal generation unit 26 (corresponding to the signal generation unit of the present invention), and executes position detection processing using the configuration of the search signal generation unit 26 and each unit described above. The function of each part in the position detection process will be described below.

コントロール部１６は、記憶したプログラムの実行によって、指向性制御部１６１、位置検出部１６２、タイマ１６３及び位置記憶部１６４として機能する。 The control unit 16 functions as a directivity control unit 161, a position detection unit 162, a timer 163, and a position storage unit 164 by executing the stored program.

指向性制御部１６１は、音声ビームの焦点が所定の探査位置になるように遅延部１３−１〜１３−８それぞれに設定する遅延時間Ｄ１〜Ｄ８を算出して遅延部１３−１〜１３−８に設定する。これとともに、指向性制御部１６１は、音声ビームの焦点位置に重なる位置に集音ビームを焦点させるように位相補正部２３１に設定する遅延時間Ｄ１１〜Ｄ１８を算出して位相補正部２３１に設定する。なお、遅延時間Ｄ１１〜Ｄ１８は集音ビーム形成部２３Ａ〜２３Ｃのうち１チャンネルにのみ設定されればよい。 The directivity control unit 161 calculates the delay times D1 to D8 for the delay units 13-1 to 13-8 so that the focal point of the audio beam falls on a predetermined search position, and sets them in the delay units 13-1 to 13-8. At the same time, the directivity control unit 161 calculates the delay times D11 to D18 for the phase correction unit 231 so that the sound collection beam is focused at a position overlapping the focal position of the audio beam, and sets them in the phase correction unit 231. The delay times D11 to D18 need only be set for one channel among the sound collection beam forming units 23A to 23C.

位置検出部１６２は、本願発明の判断部に対応し、探査用信号生成部２６に探査用音声信号の生成を指示するとともに、タイマ１６３（本願発明の計時部に対応）を作動させて計時を開始させる。位置検出部１６２は、音声ビームをスピーカアレイ２から出力（探査用音声信号を発生）して焦点位置で反射してマイクアレイ３で集音されるまでの時間、すなわち上述した探査時間ｔ２を算出する。そして、位置検出部１６２は、計時開始時から探査時間ｔ２が経過するまでの間だけ信号処理部２５に探査用音声成分の検出を行わせる。これによって、上述したように焦点より奥行き方向に位置する物や者に反射した探査用音声がマイクアレイ３に入力され、会議出席者ｈが居ると誤検出されることが効果的に防止される。 The position detection unit 162 corresponds to the determination unit of the present invention. It instructs the search signal generation unit 26 to generate a search audio signal and activates the timer 163 (corresponding to the timing unit of the present invention) to start timing. The position detection unit 162 calculates the time taken for the audio beam to be output from the speaker array 2 (that is, for the search audio signal to be generated), reflected at the focal position, and collected by the microphone array 3, namely the search time t2 described above. The position detection unit 162 then causes the signal processing unit 25 to detect the search audio component only during the period from the start of timing until the search time t2 has elapsed. As described above, this effectively prevents search sound reflected by an object or person located beyond the focal point in the depth direction from entering the microphone array 3 and being falsely detected as a conference attendee h.

また、位置検出部１６２は、信号処理部２５が探査用音声成分を検出したときに、検出開始タイミングの通知を受ける。位置検出部１６２は、探査用音声信号の発生時から探査用音声成分の検出の通知時までの時間を時間ｔ１として、上述した式（１）を用いて、会議出席者ｈから音声会議装置１の前方側面までの距離Ｌ１を算出する。 In addition, when the signal processing unit 25 detects the search audio component, the position detection unit 162 is notified of the detection start timing. Taking the time from the generation of the search audio signal to the notification of the detection of the search audio component as time t1, the position detection unit 162 uses equation (1) described above to calculate the distance L1 from the conference attendee h to the front side surface of the audio conference apparatus 1.

なお、距離Ｌ１の算出において、時間を無視できる程の軽微な時間であるため、探査用音声信号の生成からスピーカアレイ２に入力されるまでの時間、及びマイクアレイ３に探査用音声が入力されて探査用音声成分が信号処理部２５で検出されるまでの時間が無視される。しかしながら、無視できない程長時間である場合にはこの時間分補正を行った上で距離Ｌ１を算出してもよい。 In calculating the distance L1, two intervals are ignored because they are negligibly short: the time from the generation of the search audio signal until it is input to the speaker array 2, and the time from the input of the search sound to the microphone array 3 until the search audio component is detected by the signal processing unit 25. If these intervals are too long to be ignored, however, the distance L1 may be calculated after correcting for them.
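Equation (1) is defined earlier in the document and is not reproduced in this section. Purely as an illustration, the sketch below assumes the common round-trip relation (distance = speed of sound × elapsed time / 2), which may differ in detail from equation (1). The function name `distance_from_echo`, the speed of sound, and the example latency are likewise assumptions. The optional `processing_latency` argument corresponds to the correction mentioned above for internal delays that are too long to ignore.

```python
SPEED_OF_SOUND = 343.0  # m/s, assumed


def distance_from_echo(t1, processing_latency=0.0, c=SPEED_OF_SOUND):
    """Estimate the one-way distance (metres) to the reflecting object.

    t1: seconds from generating the search signal to detecting its echo.
    processing_latency: fixed internal delay (seconds) to subtract when it
        is not negligible.
    """
    time_of_flight = t1 - processing_latency
    if time_of_flight < 0:
        raise ValueError("latency exceeds measured time")
    # Sound travels to the object and back, hence the division by two.
    return c * time_of_flight / 2.0


d_ignored = distance_from_echo(0.010)           # internal latency ignored
d_corrected = distance_from_echo(0.010, 0.002)  # 2 ms latency removed
```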

位置検出部１６２は、算出した距離Ｌ１と音声ビームの出力方向とを用いて会議出席者ｈの位置検出を行う。位置検出部１６２は検出した位置を位置記憶部１６４に記憶させる。なお、位置検出処理の終了後には、指向性制御部１６１は、検出した位置に音声ビーム及び集音ビームを焦点させるような遅延時間Ｄ１〜Ｄ８，Ｄ１１〜Ｄ１８を算出して、遅延部１３及び位相補正部２３１に設定する。これによって、会議出席者ｈ１の位置に音声ビームが焦点するとともに集音ビームが焦点し、会議出席者ｈ１の位置に指向性を持たせて音声ビームを出力することができるとともに、会議出席者ｈ１の位置からの音声に特化して集音することができる。このため、相手方装置１´を用いる相手方との間で好適に音声会議を行うことができる。 The position detection unit 162 detects the position of the conference attendee h using the calculated distance L1 and the output direction of the audio beam, and stores the detected position in the position storage unit 164. After the position detection process ends, the directivity control unit 161 calculates delay times D1 to D8 and D11 to D18 that focus the audio beam and the sound collection beam on the detected position, and sets them in the delay units 13 and the phase correction unit 231. As a result, both the audio beam and the sound collection beam are focused on the position of the conference attendee h1, so that the audio beam can be output with directivity toward that position and sound from that position can be collected selectively. This allows an audio conference to be held satisfactorily with the other party using the counterpart apparatus 1′.

探査用信号生成部２６は、位置検出部１６２の指示によって、探査用音声信号を生成して各遅延部１３−１〜１３−８に入力する。この探査用音声信号は、可聴音の周波数成分から成る。これによって、音声会議装置１が音声会議を行うために通常備えるスピーカアレイ２やＤ／Ａコンバータ１４の構成を用いて探査用音声を出力することができる。なお、可聴音は直進性が悪いが、この直進性の悪さはスピーカアレイ２によって音声ビームとして出力されることで補うことができる。 The search signal generation unit 26 generates a search audio signal according to an instruction from the position detection unit 162 and inputs it to the delay units 13-1 to 13-8. This search audio signal is composed of frequency components of audible sound. As a result, the sound for exploration can be output using the configuration of the speaker array 2 and the D / A converter 14 that are normally provided for the audio conference apparatus 1 to hold an audio conference. Although the audible sound has poor straightness, this poor straightness can be compensated by being output as a sound beam by the speaker array 2.

また、探査用音声信号は、非周期的なパルス列で構成される。この非周期的なパルス列の生成方法の一例を以下に説明する。図４は、探査用音声信号の生成方法を説明するための図である。まず、探査用信号生成部２６は、波形ａ１で示すような連続的した波形のデジタル音声信号（搬送波信号）を生成する。探査用信号生成部２６は、搬送波信号とともに、擬似乱数系列ｂ１を発生させる。この擬似乱数系列ｂ１は、例えばＭ系列やゴールド符号等を用いた２値の乱数である。 The search audio signal is composed of an aperiodic pulse train. An example of this non-periodic pulse train generation method will be described below. FIG. 4 is a diagram for explaining a method for generating an audio signal for search. First, the search signal generator 26 generates a digital audio signal (carrier signal) having a continuous waveform as shown by the waveform a1. The search signal generator 26 generates a pseudo-random number sequence b1 together with the carrier wave signal. The pseudo-random number sequence b1 is a binary random number using, for example, an M sequence or a Gold code.

そして、探査用信号生成部２６は擬似乱数系列ｂ１で搬送波信号をオンオフ（振幅変調）して、探査用音声信号ｃ１を生成する。具体的には、探査用信号生成部２６はクロックタイミングの到来時に擬似乱数を発生する。クロック周期は例えば１秒間隔であり、同図において、太字矢印の位置がクロックタイミングを示す。通常、探査用信号のマイクアレイ３への入力タイミングの検出時刻精度は、クロック周期によって決まり、クロック周期の１／１０程度となる。 Then, the search signal generator 26 turns on / off (amplitude modulation) the carrier signal with the pseudo-random number sequence b1 to generate the search audio signal c1. Specifically, the search signal generator 26 generates a pseudo random number when the clock timing arrives. The clock cycle is, for example, one second interval. In the figure, the position of the bold arrow indicates the clock timing. Usually, the detection time accuracy of the input timing of the search signal to the microphone array 3 is determined by the clock cycle and is about 1/10 of the clock cycle.

すなわち、探査用信号生成部２６は擬似乱数として「０」か「１」の二つの値のいずれかを発生する。「０」を発生したときには、搬送波信号ａ１のレベルが０に変調され、「１」を発生したときには、搬送波信号ａ１のレベルは変更されない。これによって探査用音声信号ｃ１が生成される。同図の例では、擬似乱数系列ｂ１は「００１１０１０１１」であり、探査用音声信号ｃ１は、この「１」の期間のパルスで構成されたパルス列となる。なお、この探査用音声信号の周波数は例えば９００〜７ｋＨｚである。もっとも周波数はスピーカユニットＳＰ間の幅や、スピーカアレイ２の全長によって好ましい値が変わる。また、時間長さは、擬似乱数の系列長によって好ましい寸法が変わり、誤検出防止や検出タイミングの精度等を確保できる程度の長さがあり、かつ探査時間をできるだけ短く抑えることができるように長すぎないことが好ましい。 That is, the search signal generation unit 26 generates one of two values, "0" or "1", as the pseudo-random number. When "0" is generated, the level of the carrier signal a1 is modulated to 0; when "1" is generated, the level of the carrier signal a1 is left unchanged. The search audio signal c1 is generated in this way. In the example in the figure, the pseudo-random number sequence b1 is "001101011", and the search audio signal c1 is a pulse train made up of the pulses in the "1" periods. The frequency of the search audio signal is, for example, 900 Hz to 7 kHz, although the preferred value depends on the spacing between the speaker units SP and the overall length of the speaker array 2. The preferred duration of the signal depends on the sequence length of the pseudo-random numbers. It should be long enough to prevent false detection and to ensure the accuracy of the detection timing, yet not so long that the search time cannot be kept short.
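The on/off modulation described above can be sketched as follows, using the sequence b1 given in the example. The carrier frequency, bit duration, and sample rate are illustrative values chosen to keep the example small and are not taken from the original; the function name `on_off_keyed` is hypothetical. In practice the bits would come from an M-sequence or Gold-code generator rather than a fixed list.

```python
import math


def on_off_keyed(bits, carrier_hz, bit_seconds, sample_rate):
    """Gate a sine carrier with a binary sequence:
    a 1 passes the carrier unchanged, a 0 sets its level to zero."""
    per_bit = int(round(bit_seconds * sample_rate))
    signal = []
    for i, bit in enumerate(bits):
        for k in range(per_bit):
            t = (i * per_bit + k) / sample_rate
            signal.append(bit * math.sin(2 * math.pi * carrier_hz * t))
    return signal


# Sequence b1 from the example; the remaining parameters are assumptions.
b1 = [0, 0, 1, 1, 0, 1, 0, 1, 1]
c1 = on_off_keyed(b1, carrier_hz=1000.0, bit_seconds=0.01, sample_rate=16000)
```

The resulting list `c1` is silent during the "0" periods and carries the carrier during the "1" periods, giving the aperiodic pulse train described in the text.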

探査用信号生成部２６で生成された探査用音声信号は、遅延部１３に入力される。遅延部１３では、音声ビームを探査位置に焦点させるための遅延時間Ｄ１〜Ｄ８が付与され、遅延時間が付与された音声信号はＤ／Ａコンバータ１４−アンプ１５を介してスピーカアレイ２に入力される。 The search audio signal generated by the search signal generation unit 26 is input to the delay units 13. The delay units 13 apply the delay times D1 to D8 for focusing the audio beam on the search position, and the delayed audio signals are input to the speaker array 2 via the D/A converters 14 and the amplifiers 15.

スピーカアレイ２から出力された音声ビームは、音声ビームの経路に会議出席者ｈが居る場合には、この会議出席者ｈに反射してマイクアレイ３に集音される。各マイクＭで集音された各信号はマイクアンプ２１−Ａ／Ｄ変換部２２を介して集音ビーム形成部２３に入力される。なお、各信号は遅延時間Ｄ１１〜Ｄ１８が設定されているチャンネルに入力される。集音ビーム形成部２３は、設定された遅延時間Ｄ１１〜Ｄ１８で入力された各信号の位相調整を行って、位相調整後の各信号を加算合成することで、焦点Ｐからの音声に特化して集音する集音ビームを形成する。集音ビーム形成部２３は、加算合成後の音声信号をバンドパスフィルタ２４に入力する。 The audio beam output from the speaker array 2 is reflected by the conference attendee h and collected by the microphone array 3 when the conference attendee h is on the path of the audio beam. Each signal collected by each microphone M is input to the sound collection beam forming unit 23 via the microphone amplifier 21 -A / D conversion unit 22. Each signal is input to a channel in which delay times D11 to D18 are set. The sound collection beam forming unit 23 performs phase adjustment of each signal input in the set delay times D11 to D18, and adds and synthesizes each signal after phase adjustment, thereby specializing in the sound from the focal point P. To form a sound collecting beam. The sound collection beam forming unit 23 inputs the audio signal after the addition synthesis to the band pass filter 24.

バンドパスフィルタ２４は、探査用音声の周波数帯域の他の成分をカットするフィルタ係数が設定される。バンドパスフィルタ２４は入力した音声信号をこのフィルタ係数で畳み込み演算して、信号処理部２５に入力する。これによって、入力した音声信号のうち探査用音声の周波数帯域の成分のみを抽出して信号処理部２５に入力することができる。 The band pass filter 24 is set with a filter coefficient for cutting off other components in the frequency band of the search voice. The band pass filter 24 performs a convolution operation on the input audio signal with the filter coefficient and inputs the result to the signal processing unit 25. As a result, only the frequency band component of the search voice can be extracted from the input voice signal and input to the signal processing unit 25.

信号処理部２５は、バンドパスフィルタ２４から入力した音声信号の中に探査用音声の成分が含まれるかを探査する。上述した様に、非周期的なパルス列が探査用音声信号として用いられているため、探査用音声のマイクアレイ３への入力タイミングの検出が容易かつ正確になる。図５は、探査用音声の入力タイミングの検出方法を示す図である。 The signal processing unit 25 searches for whether or not the sound signal for search is included in the sound signal input from the band pass filter 24. As described above, since a non-periodic pulse train is used as the search audio signal, the input timing of the search audio to the microphone array 3 can be easily and accurately detected. FIG. 5 is a diagram illustrating a method for detecting the input timing of the search sound.

信号処理部２５は、探査用音声信号のパルス波形ａ２と入力した音声信号の波形ｂ２とを比較することで探査用音声成分を検出する。すなわち、コントロール部１６は探査用音声信号のパルス波形ａ２が探査用信号生成部２６から通知され、コントロール部１６はこのパルス波形ａ２を信号処理部２５に通知する。信号処理部２５は、通知されたパルス波形ａ２と、バンドパスフィルタ２４から入力された音声信号のパルス波形ｂ２とを比較してゆき、両パルス波形ａ２，ｂ２が一致した場合に探査用音声成分を検出したと判断する。 The signal processing unit 25 detects the sound component for search by comparing the pulse waveform a2 of the sound signal for search with the waveform b2 of the input sound signal. That is, the control unit 16 is notified of the pulse waveform a2 of the search audio signal from the search signal generation unit 26, and the control unit 16 notifies the signal processing unit 25 of this pulse waveform a2. The signal processing unit 25 compares the notified pulse waveform a2 with the pulse waveform b2 of the audio signal input from the bandpass filter 24, and when both pulse waveforms a2 and b2 match, the search audio component Is determined to have been detected.

同図（ａ）は、仮に探査用音声信号が周期的なパルス列である場合に、パルス波形ａ２とパルス波形ｂ２とを比較する様子を示す図である。同図（ｂ）は非周期的なパルス列である本実施形態の場合に、パルス波形ａ２とパルス波形ｂ２とを比較する様子を示す図である。なお、点線の位置がクロックタイミングであり、各クロックタイミングから波形の一致の比較が開始される。 FIG. 5(a) shows how the pulse waveform a2 and the pulse waveform b2 would be compared if the search audio signal were a periodic pulse train. FIG. 5(b) shows how the pulse waveform a2 and the pulse waveform b2 are compared in the present embodiment, in which the pulse train is aperiodic. The dotted lines mark the clock timings, and the comparison for a waveform match starts from each clock timing.

パルス列が周期的である場合には、（ａ）で示すように、探査用音声成分の開始時点のクロックタイミングＴ１から半周期ずれた時点のクロックタイミングＴ２では、両パルス波形ａ２，ｂ２の形状は一致しない。しかしながら、一周期ずれたクロックタイミングＴ３では、両パルス波形ａ２，ｂ２は一致してしまう。このため、探査用音声成分の開始位置がクロックタイミングＴ１であるのかクロックタイミングＴ３であるのかの判断が困難である。 When the pulse train is periodic, as shown in (a), the shapes of the two pulse waveforms a2 and b2 do not match at clock timing T2, which is offset by half a period from clock timing T1, the start of the search audio component. At clock timing T3, which is offset by one full period, however, the two pulse waveforms a2 and b2 do match. It is therefore difficult to determine whether the search audio component starts at clock timing T1 or at clock timing T3.

一方、探査用音声信号が非周期的なパルス列で構成される場合には、（ｂ）で示すように、探査用音声成分の開始時点Ｔ１からのパルス波形ｂ２のみがパルス波形ａ２と合致し、ずれた時点からのパルス波形ｂ２とパルス波形ａ２とは一致しない。このため、探査用音声成分の開始時点を検出することが容易であり、正確に開示時点を検出することができる。 On the other hand, when the search audio signal consists of an aperiodic pulse train, as shown in (b), only the pulse waveform b2 starting from the start time T1 of the search audio component matches the pulse waveform a2, and the pulse waveform b2 taken from any offset time does not match the pulse waveform a2. The start time of the search audio component is therefore easy to detect and can be detected accurately.
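The difference between the periodic and aperiodic cases can be illustrated with a small sliding comparison. Note that this sketch scores each clock offset by counting coinciding pulses (a simple correlation) rather than testing for an exact waveform match as described in the text; the function names, the patterns, and the offsets are all hypothetical.

```python
def correlation_scores(reference, received):
    """Score every clock offset by the number of pulses that coincide."""
    n = len(reference)
    return [
        sum(r * x for r, x in zip(reference, received[k:k + n]))
        for k in range(len(received) - n + 1)
    ]


def embed(pattern, offset, total):
    """Place `pattern` into a silent stream of length `total` at `offset`."""
    stream = [0] * total
    stream[offset:offset + len(pattern)] = pattern
    return stream


periodic = [1, 0, 1, 0, 1, 0, 1, 0]
aperiodic = [1, 1, 0, 1, 0, 0, 0, 1]  # illustrative pattern, not the one in the figure

scores_p = correlation_scores(periodic, embed(periodic, 5, 20))
scores_a = correlation_scores(aperiodic, embed(aperiodic, 5, 20))

# Offset with the highest score, and the runner-up score for each case.
best_a = scores_a.index(max(scores_a))
runner_up_p = sorted(scores_p)[-2]
runner_up_a = sorted(scores_a)[-2]
```

For the periodic pattern, offsets one period away from the true start score nearly as high as the true start, so a small disturbance could shift the detected timing by a whole period. For the aperiodic pattern, the true start stands well clear of every other offset.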

信号処理部２５は、波形の一致の開始時点を探査用音声の入力タイミングとしてコントロール部１６に通知する。上述したように、コントロール部１６は、通知された探査用音声の入力タイミング及び探査用音声の出力タイミングを用いて会議出席者ｈの位置を算出する。 The signal processing unit 25 notifies the control unit 16 of the waveform matching start time as the search voice input timing. As described above, the control unit 16 calculates the position of the meeting attendee h using the notified search voice input timing and search voice output timing.

図６は、図３で示す音声会議装置１の実行する位置検出処理を示すフローチャートである。この位置検出処理は、コントロール部１６を構成する操作部を用いてユーザが位置検出処理の実行を指示した場合に実行される。まず、指向性制御部１６１は、上述した遅延時間Ｄ１〜Ｄ８を算出して、遅延部１３−１〜１３−８に設定することで音声ビームの焦点合わせを行う（Ｓ１）。これとともに、指向性制御部１６１は、上述した遅延時間Ｄ１１〜Ｄ１８を算出して、位相補正部２３１に設定することで集音ビームの焦点合わせを行う（Ｓ２）。 FIG. 6 is a flowchart showing the position detection process executed by the audio conference apparatus 1 shown in FIG. 3. This position detection process is executed when the user instructs its execution using the operation unit that forms part of the control unit 16. First, the directivity control unit 161 focuses the audio beam by calculating the delay times D1 to D8 described above and setting them in the delay units 13-1 to 13-8 (S1). At the same time, the directivity control unit 161 focuses the sound collection beam by calculating the delay times D11 to D18 described above and setting them in the phase correction unit 231 (S2).

この後、位置検出部１６２は、探査用音声信号の生成を探査用信号生成部２６に指示し（Ｓ３）、これとともにタイマ１６３を用いて計時を開始する（Ｓ４）。探査用信号生成部２６は探査用音声信号を生成して遅延部１３−Ｄ／Ａコンバータ１４−アンプ１５を介してスピーカアレイ２に入力する。位置検出部１６２は、信号処理部２５に探査用音声成分の検出を開始させる（Ｓ５）。 Thereafter, the position detection unit 162 instructs the search signal generation unit 26 to generate a search audio signal (S3), and starts timing using the timer 163 (S4). The search signal generator 26 generates a search audio signal and inputs it to the speaker array 2 via the delay unit 13 -D / A converter 14 -amplifier 15. The position detection unit 162 causes the signal processing unit 25 to start detecting a sound component for search (S5).

位置検出部１６２は、探査用音声の入力タイミングが信号処理部２５から通知されたかどうかを判断する（Ｓ６）。探査用音声の入力タイミングが通知された場合には（Ｓ６でＹＥＳ）、位置検出部１６２は入力タイミング及び探査用音声の出力タイミング（探査用音声信号の発生タイミング）を用いて上述した方法によって会議出席者ｈ１の位置を算出し、位置記憶部１６４に記憶させる（Ｓ７）。 The position detection unit 162 determines whether or not the input timing of the search sound is notified from the signal processing unit 25 (S6). When the exploration audio input timing is notified (YES in S6), the position detection unit 162 uses the input timing and the exploration audio output timing (the occurrence timing of the exploration audio signal) to perform a meeting by the method described above. The position of the attendee h1 is calculated and stored in the position storage unit 164 (S7).

この後、位置検出部１６２は、指向性制御部１６１に音声ビーム及び集音ビームの焦点位置が探査領域の全ての方向に移動されたかを判断させる（Ｓ８）。全ての方向に焦点位置が移動されたと指向性制御部１６１が判断した場合には（Ｓ８でＹＥＳ）、位置検出部１６２は本処理を終了させる。一方、全ての方向に焦点位置が移動されていないと指向性制御部１６１が判断した場合には（Ｓ８でＮＯ）、位置検出部１６２は本処理をステップＳ１に戻し、ステップＳ１では焦点が別の位置に変更される。 Thereafter, the position detection unit 162 causes the directivity control unit 161 to determine whether the focal positions of the sound beam and the sound collection beam have been moved in all directions of the search area (S8). If the directivity control unit 161 determines that the focal position has been moved in all directions (YES in S8), the position detection unit 162 ends this process. On the other hand, when the directivity control unit 161 determines that the focus position has not been moved in all directions (NO in S8), the position detection unit 162 returns the process to step S1, and the focus is different in step S1. It is changed to the position.

一方、位置検出部１６２は、探査用音声の入力タイミングが信号処理部２５から通知されてないと判断した場合には（Ｓ６でＮＯ）、探査時間ｔ２が経過したかどうかを判断する（Ｓ９）。探査時間ｔ２が経過したと判断していない場合には（Ｓ９でＮＯ）、位置検出部１６２は本処理をステップＳ６に戻し、探査時間ｔ２が経過したと判断した場合には（Ｓ９でＹＥＳ）、位置検出部１６２は上述したステップＳ８を実行する。 On the other hand, when the position detection unit 162 determines that the input timing of the search sound has not been notified from the signal processing unit 25 (NO in S6), it determines whether the search time t2 has elapsed (S9). If the search time t2 has not yet elapsed (NO in S9), the position detection unit 162 returns the process to step S6; if the search time t2 has elapsed (YES in S9), the position detection unit 162 executes step S8 described above.
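The control flow of steps S1 to S9 can be summarised in the following sketch. All of the callables passed in are hypothetical stand-ins for the hardware blocks described above, and the function name `scan_for_attendees` is likewise an assumption; the sketch shows only the order of the steps and the time-limited wait for the echo.

```python
import time


def scan_for_attendees(search_positions, focus_beams, emit_search_signal,
                       poll_echo, search_time, locate):
    """Run steps S1 to S9 once per search position and return the positions found.

    Hypothetical callables:
      focus_beams(p)          sets D1-D8 and D11-D18 for position p (S1, S2)
      emit_search_signal()    starts the search audio signal (S3)
      poll_echo()             returns the echo arrival time, or None (S5, S6)
      search_time(p)          returns the search time t2 for position p
      locate(p, t_out, t_in)  returns the position estimate (S7)
    """
    found = []
    for p in search_positions:              # S8: repeat for every direction
        focus_beams(p)                      # S1, S2
        emit_search_signal()                # S3
        t_out = time.monotonic()            # S4: start timing
        deadline = t_out + search_time(p)
        while time.monotonic() < deadline:  # S9: give up once t2 has elapsed
            t_in = poll_echo()              # S5, S6
            if t_in is not None:
                found.append(locate(p, t_out, t_in))  # S7
                break
    return found
```

Limiting the wait to the search time t2 is what keeps echoes from beyond the focal point from being reported as attendees.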

In the position detection process described above, the directivity control unit 161 controls the beams so that the focus of the sound collection beam of the microphone array 3 overlaps the focus of the sound beam of the speaker array 2. A beam of search sound is then output from the speaker array 2, and if a conference attendee h is in the path of this sound beam, the search sound reflected by the conference attendee h is collected by the microphone array 3. The position detection unit 162 detects the position of the conference attendee h from the input and output timings of the search sound and the output direction of the sound beam. As a result, the position of the conference attendee h can be detected more accurately than in the prior art.

That is, unlike the prior art, in which an omnidirectional speaker outputs the search sound over a wide angle, this embodiment outputs a sound beam from the speaker array 2. Because no sound is output outside the search range, multipath in which sound reflected outside the search range is collected can be effectively prevented. In addition, the audio signal input from the microphone array 3 is searched for the component of the search sound signal only during the measured time that the audio signal output from the speaker array 2 takes to be reflected at the focal position and input to the microphone array 3. This prevents the microphone array 3 from receiving sound reflected at a position beyond the focal position (multipath) and thereby erroneously detecting a conference attendee h.
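The effect of limiting the search to the measured time can be sketched as follows. The example assumes that the microphone samples start at the output timing of the search sound, that the search time equals the round trip to the focal position at about 343 m/s, and that the search sound component is found by a simple sliding correlation against a threshold. The correlation detector is a technique swapped in for illustration and is not necessarily the detection method of the signal processing unit 25; all names are hypothetical.

```python
SPEED_OF_SOUND_M_S = 343.0  # assumption: air at roughly 20 degrees Celsius


def search_time_s(focal_distance_m, c=SPEED_OF_SOUND_M_S):
    """Round-trip time from the array to the focal position and back."""
    return 2.0 * focal_distance_m / c


def echo_lag_within_gate(mic_samples, search_pulse, sample_rate_hz,
                         focal_distance_m, threshold, c=SPEED_OF_SOUND_M_S):
    """Return the sample lag of the search sound, or None if not found.

    Only lags that fall inside the search time are examined, so a
    reflection from beyond the focal position is never considered.
    """
    gate_samples = int(search_time_s(focal_distance_m, c) * sample_rate_hz)
    last_lag = min(gate_samples, len(mic_samples) - len(search_pulse))
    best_lag, best_score = None, threshold
    for lag in range(last_lag + 1):
        window = mic_samples[lag:lag + len(search_pulse)]
        # Correlate the window with the known search pulse.
        score = sum(m * p for m, p in zip(window, search_pulse))
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag
```

With a focal distance of 2 m and a 1 kHz sample rate, for instance, the gate in this sketch covers only the first 11 lags, so an echo that arrives 40 samples after the output timing is ignored rather than reported as an object.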

As described above, multipath of sound reflected in other areas (directions) of the search range and multipath of sound reflected at positions beyond the focal position can both be effectively prevented, so the accuracy of detecting the position of the conference attendee h can be improved.

This embodiment may adopt the following modifications.

(1) In this embodiment, the number of speaker units SP is eight, but the number is not limited to this. It is sufficient to provide at least as many speaker units as are needed to control the directivity and beam width of the sound beam.

(2) In this embodiment, the object detection apparatus of the present invention is applied to an audio conference apparatus, but the invention is not limited to this and can be applied to any apparatus that has a function of detecting an object. For example, the present invention may be applied to a driving support device that detects obstacles located around a vehicle and presents them to the driver.

(3) This embodiment uses a speaker array 2 in which the speaker units SP are arranged in a line and a microphone array 3 in which the microphones M are arranged in a line, but the present invention is not limited to these configurations of the speaker array 2 and the microphone array 3. For example, a speaker array 2 and a microphone array 3 in which the speaker units SP and the microphones M are arranged in a matrix, honeycomb, circular, or similar pattern may be used. In that case, the position of the object in the height direction can also be detected.
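Modification (3) can be illustrated with a sketch of how per-element delays for focusing a beam might be computed for an arbitrary array layout. It assumes a conventional delay-and-sum scheme in which each element is delayed so that all wavefronts arrive at the focus simultaneously, with sound travelling at about 343 m/s. The function name `focusing_delays` and the coordinate convention are hypothetical; the specification states only that the delay unit 13 acts as the first beam adjustment unit.

```python
import math

SPEED_OF_SOUND_M_S = 343.0  # assumption: air at roughly 20 degrees Celsius


def focusing_delays(element_positions, focus, c=SPEED_OF_SOUND_M_S):
    """Return one delay in seconds per array element.

    element_positions: coordinates of the speaker units (2-D or 3-D tuples)
    focus: coordinates of the focal position, same dimensionality
    The element farthest from the focus gets zero delay; nearer elements
    are delayed so that all wavefronts reach the focus at the same time.
    """
    distances = [math.dist(p, focus) for p in element_positions]
    farthest = max(distances)
    return [(farthest - d) / c for d in distances]
```

Because the delays depend only on the distance from each element to the focus, the same routine covers a line array with 2-D coordinates and a matrix or circular array with 3-D coordinates, where the focus can also be moved in the height direction.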

A diagram showing the external appearance of the audio conference apparatus 1 as seen from above, together with the propagation and sound collection ranges of the conference audio.
FIG. 2 is a diagram showing the propagation range of the sound beam and the sound collection range of the sound collection beam while the position detection process is being executed.
A block diagram schematically showing the configuration of the audio conference apparatus 1 shown in FIG. 1.
A diagram for explaining the method of generating the search sound signal.
A diagram showing the method of detecting the input timing of the search sound.
A flowchart showing the position detection process executed by the audio conference apparatus shown in FIG. 3.

Explanation of symbols

1 - audio conference apparatus
1' - counterpart apparatus
2 - speaker array
3 - microphone array
13 - delay unit (first beam adjustment unit)
16 - control unit
161 - directivity control unit
162 - position detection unit (determination unit)
163 - timer (timekeeping unit)
23 (23A to 23C) - sound collection beam forming unit (second beam adjustment unit)
25 - signal processing unit (search unit)
26 - search signal generation unit (signal generation unit)
c1 - search sound signal
h (h1 to h3) - conference attendee
P - focus

Claims

An object detection apparatus having a function of inputting an audio signal to a speaker array and a function of receiving an audio signal from a microphone array, the apparatus comprising:
A signal generation unit that generates a search sound signal and inputs it to the speaker array;
A first beam adjustment unit that adjusts the focus of a sound beam from the speaker array;
A second beam adjustment unit that adjusts the focus of a sound collection beam of the microphone array;
A directivity control unit that controls the focus adjustment of the first and second beam adjustment units so that the focus of the sound beam and the focus of the sound collection beam overlap;
A timekeeping unit that measures a search time from when sound is output from the speaker array until it is reflected at the focal position and input to the microphone array;
A search unit that searches, during the search time, whether the audio signal input from the microphone array includes a component of the search sound signal; and
A determination unit that determines that an object is present in the path of the sound beam from the speaker array to the focal point when the search unit detects that the component of the search sound signal is included.

The object detection apparatus according to claim 1, wherein the determination unit detects the position of the object using the output timing of the sound beam from the speaker array, the input timing of the component of the search sound signal to the microphone array, and the output direction of the sound beam.

The object detection apparatus according to claim 2, wherein the signal generation unit generates a non-periodic pulse train as the search sound signal.

An audio conference apparatus using the object detection apparatus according to claim 1, wherein
The signal generation unit generates, as the search sound signal, an audio signal composed of frequency components of audible sound, and
The determination unit detects a conference attendee as the object.