JPH10308814A

JPH10308814A - Voice switch for talker

Info

Publication number: JPH10308814A
Application number: JP11572497A
Authority: JP
Inventors: Yasushi Yamazaki; 泰山崎; Tomonori Sato; 知紀佐藤; Hitoshi Matsuzawa; 均松澤; Masato Ito; 正人伊藤
Original assignee: Fujitsu Ltd
Current assignee: Fujitsu Ltd
Priority date: 1997-05-06
Filing date: 1997-05-06
Publication date: 1998-11-17
Anticipated expiration: 2017-05-06
Also published as: JP3460783B2

Abstract

(57) [Summary]

[Problem] The present invention relates to a voice switch used in hands-free telephones and similar devices. Its object is to suppress acoustic echo accurately in a telephone that uses frame processing, regardless of the time offset between the transmitted and received signals.

[Solution] The voice switch comprises: delay means that delays the received voice signal by a predetermined time; received voice determination means that determines, based on the received voice signal and the transmitted voice signal, whether to suppress the received voice signal; transmitted voice determination means that determines, based on the received voice signal delayed by the delay means and the transmitted voice signal, whether to suppress the transmitted voice signal; receiving-side suppression means that suppresses the received voice signal according to the result from the received voice determination means; and transmitting-side suppression means that suppresses the transmitted voice signal according to the result from the transmitted voice determination means.

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

[Technical Field of the Invention] The present invention relates to a voice switch used in hands-free telephones and similar devices. A hands-free telephone that adopts the voice switch method must be able to suppress acoustic echo accurately.

[0002]

[Prior Art] To realize a hands-free function, the speaker volume must be raised and the microphone sensitivity increased. Doing so, however, produces acoustic echo, as shown in FIG. 5: received voice emitted from an audio output unit such as a speaker leaks back into an audio input unit such as a microphone. The other party hears their own voice come back like an echo, which makes the telephone very hard to use. There are two methods for removing this acoustic echo: (1) the echo canceller method and (2) the voice switch method.

[0003] The echo canceller method removes acoustic echo using adaptive signal processing. For example, as shown in FIG. 6, the acoustic echo r, which arises when the emitted received voice leaks into the microphone, is reproduced inside the telephone as a pseudo echo r' and subtracted from the microphone input signal. The pseudo echo r' is generated by representing the transfer function from the speaker to the microphone as an FIR filter. Because this transfer function changes with the surroundings of the telephone, the filter is adapted so that the error between the pseudo echo r' and the acoustic echo r is minimized.
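The adaptive FIR idea in [0003] can be sketched as follows. The source does not say which adaptation rule is used; this sketch swaps in normalized LMS (NLMS), one common choice, purely for illustration. The function name, the tap count, and the step size `mu` are assumptions, not values from the patent.

```python
def nlms_echo_cancel(far, mic, taps=4, mu=0.5, eps=1e-8):
    """Subtract a pseudo echo r' from the microphone signal.

    far : far-end samples sent to the speaker
    mic : microphone samples (containing the acoustic echo r)
    Returns (residual, weights). NLMS is an assumed adaptation rule.
    """
    w = [0.0] * taps          # FIR model of the speaker-to-microphone path
    hist = [0.0] * taps       # most recent far-end samples, newest first
    residual = []
    for x, d in zip(far, mic):
        hist = [x] + hist[:-1]
        r_hat = sum(wk * xk for wk, xk in zip(w, hist))   # pseudo echo r'
        e = d - r_hat                                     # error r - r'
        norm = sum(xk * xk for xk in hist) + eps
        # Move the filter in the direction that reduces the error.
        w = [wk + mu * e * xk / norm for wk, xk in zip(w, hist)]
        residual.append(e)
    return residual, w
```

With a noise-free echo path shorter than `taps`, the weights should approach the true path and the residual should shrink toward zero.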

[0004] The voice switch method, by contrast, compares the power of the speaker output voice with that of the microphone input voice, as shown in FIG. 7, and removes acoustic echo by suppressing one of the two. While the speaker is emitting sound, the signal picked up by the microphone is very likely to be acoustic echo, so suppressing the microphone input signal during that period prevents the echo from being sent to the other party.

[0005] Thus there are two methods for removing the acoustic echo that stands in the way of a hands-free function: the echo canceller and the voice switch. FIG. 8 compares their strengths and weaknesses; the choice is a trade-off between processing load and performance. When cost takes priority, the voice switch method is adopted. The present invention concerns this voice switch.

[0006] FIG. 9 shows the detailed conventional configuration of a hands-free telephone equipped with this voice switch. In FIG. 9, 1 is a receiving unit, consisting of a demodulator and the like, that receives the voice signal from the other party; 2 is a power suppression unit that can suppress the power of the received signal by varying the receive gain gain_r; and 3 is an audio output unit, consisting of an amplifier, a speaker, and the like, that emits the received voice (R). 6 is an audio input unit, consisting of a microphone and an amplifier, that takes in the transmitted voice (S); 7 is a power suppression unit that can suppress the power of the transmit signal by varying the transmit gain gain_s; and 8 is a transmitting unit, consisting of a modulator and the like, that sends the transmitted voice signal to the other party.

[0007] 4 is a determination unit that decides, based on the levels of the received and transmitted voice, whether to suppress the received voice in the receiving-side power suppression unit 2 or the transmitted voice in the transmitting-side power suppression unit 7. Within it, 41 is a power calculation unit that calculates the power of the signal received by the receiving unit 1; 42 is a sound detection unit that detects, from the power calculated by the power calculation unit 41, whether the current received voice state s_r is silent or voiced; 43 is a power calculation unit that calculates the power of the voice signal input to the audio input unit 6; 44 is a sound detection unit that detects, from the power calculated by the power calculation unit 43, whether the current transmitted voice state s_s is silent or voiced; and 45 is a determination unit that decides, from the detection results of the sound detection units 42 and 44, which of the power suppression units 2 and 7 is put into the suppressing state.

[0008] The power calculation units 41 and 43 calculate the power of the input voice data with the following formula. If the input voice data is x_i, the output power p_i is

$$p_i = 10 \times \log \left[ \sum_{j=0}^{J} x_{i-j} \times x_{i-j} \right]$$

where the sum Σ runs from j = 0 to J.
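A minimal sketch of the power formula in [0008]. It assumes the logarithm is base 10, so that the result is in decibels; the text writes only "log". The `floor` parameter is a hypothetical guard against taking the logarithm of zero and is not part of the source.

```python
import math

def power_db(x, i, J, floor=1e-12):
    """p_i = 10 * log10( sum_{j=0..J} x[i-j] * x[i-j] ).

    x     : sequence of input voice samples
    i     : index of the newest sample in the window
    J     : the window covers J + 1 samples, x[i-J] .. x[i]
    floor : assumed lower bound on the energy (not in the source)
    """
    if i - J < 0:
        raise ValueError("window reaches before the start of the data")
    energy = sum(x[i - j] * x[i - j] for j in range(J + 1))
    return 10.0 * math.log10(max(energy, floor))
```

For ten samples of amplitude 1.0 the energy is 10, so `power_db([1.0] * 10, 9, 9)` gives 10 dB under the base-10 assumption.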

[0009] As shown in FIG. 10, each of the sound detection units 42 and 44 consists of a comparator that compares the input power p_i with a fixed threshold th and decides, using the rule that follows, whether the current voice state s_i is voiced or silent. Here s_i = 0 means silent and s_i = 1 means voiced. The rule is:

if (p_i < th) s_i = 0
if (p_i > th) s_i = 1

That is, the voice state s_i is set to 0 when the input power p_i is below the threshold th and to 1 when it is above th. This prevents background noise at or below the threshold th from being mistaken for voice.
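The comparator of [0009] can be sketched in a few lines. The rule in the text does not define the case p_i = th; this sketch treats it as silent, which is an assumption based on the remark that noise at or below th should not count as voice. The function name is hypothetical.

```python
def detect_voice(p, th):
    """Return the voice state s: 1 (voiced) if p > th, else 0 (silent).

    Treating p == th as silent is an assumption; the source only
    specifies the strict cases p < th and p > th.
    """
    return 1 if p > th else 0
```

For example, with th = -40, a frame power of -30 is classified as voiced and -50 as silent.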

[0010] The determination unit 45 controls the receive gain gain_r of the receiving-side power suppression unit 2 and the transmit gain gain_s of the transmitting-side power suppression unit 7 according to a determination logic table, an example of which is shown in FIG. 11. Both gains lie in the range 0.0 ≤ gain ≤ 1.0. The table of FIG. 11 applies the following control:

When the transmitted voice state s_s = 0 and the received voice state s_r = 0, the transmit gain gain_s is set to 0.0 and the receive gain gain_r to 0.0.
When s_s = 1 and s_r = 0, gain_s is set to 1.0 and gain_r to 0.0.
When s_s = 0 and s_r = 1, gain_s is set to 0.0 and gain_r to 1.0.
When s_s = 1 and s_r = 1, reception takes priority: gain_s is set to 0.0 and gain_r to 1.0.
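The four cases of the FIG. 11 example map directly onto a lookup table. The sketch below encodes exactly those cases; the names `GAIN_TABLE` and `decide_gains` are hypothetical.

```python
# (s_s, s_r) -> (gain_s, gain_r), following the four cases given for FIG. 11.
GAIN_TABLE = {
    (0, 0): (0.0, 0.0),  # both silent: suppress both
    (1, 0): (1.0, 0.0),  # only transmitted voice: pass the transmit side
    (0, 1): (0.0, 1.0),  # only received voice: pass the receive side
    (1, 1): (0.0, 1.0),  # both voiced: reception takes priority
}

def decide_gains(s_s, s_r):
    """Return (gain_s, gain_r) for the given transmit and receive voice states."""
    return GAIN_TABLE[(s_s, s_r)]
```

A table keeps the policy in one place, so a different rule for the case where both sides are voiced (for example, favoring the side with higher power) would only change one entry or replace the lookup.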

[0011] According to the result from the determination unit 45, the power suppression units 2 and 7 apply the following processing to the input voice data x_i and output the result as the output voice data x_i:

$$x_i = x_i \times \mathrm{gain}$$
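The suppression step of [0011] is a per-sample multiplication. A minimal sketch, assuming a frame is a list of samples; the range check mirrors the 0.0 to 1.0 gain range stated in the text, and the function name is hypothetical.

```python
def apply_gain(frame, gain):
    """Scale every sample of a frame: x_i = x_i * gain."""
    if not 0.0 <= gain <= 1.0:
        raise ValueError("gain must lie in the range 0.0 to 1.0")
    return [x * gain for x in frame]
```

With gain 1.0 the frame passes unchanged, and with gain 0.0 it becomes silence.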

[0012] In this way, the voice switch method suppresses either the received voice or the transmitted voice according to their states and passes the other: received voice is emitted from the speaker, and transmitted voice is sent to the other party. When both are voiced, various criteria are possible, such as giving priority to the received voice or giving priority to whichever has the higher power.

[0013]

[Problems to Be Solved by the Invention] As described above, the voice switch method compares the states of the received and transmitted voice, suppresses one, and passes the other, thereby removing acoustic echo. This normally removes the echo without difficulty. When frame processing is used, however, a time offset arises between the received and transmitted voice, and the acoustic echo may not be removed completely.

[0014] Frame processing is used, for example, in voice encoding and decoding, and means handling the data for a fixed period of time as a single batch. FIG. 12 illustrates this frame processing and shows the timing from the point where input voice is encoded on the transmitting side to the point where it is decoded and output on the receiving side. In FIG. 12, the transmitted voice captured by the audio input unit on the transmitting side is accumulated for a fixed period to form a frame. In the second frame period, this frame is encoded by the encoding unit and sent to the other party by the transmitting unit. At the other party, the frame is received by the receiving unit and decoded by the decoding unit, and in the third frame period it is emitted from the audio output unit as received voice. As FIG. 12 shows, voice input on the transmitting side is output at the other party with a delay of at least two frames.

[0015] When a voice switch operates with this frame processing, the received voice and the transmitted voice that the determination unit compares are, as shown in FIG. 13, no longer the pair that were emitted from the speaker and picked up by the microphone at the same moment. Taking the moment of the decision as the reference, the unit compares transmitted voice that entered the microphone one frame earlier with received voice that will leave the speaker one frame later. Because of this offset, a simple comparison cannot tell whether the transmitted voice captured by the audio input unit is acoustic echo that leaked from the audio output unit.

[0016] This is explained in detail below with reference to FIG. 13, in which time runs along the horizontal axis in units of frames. The explanation follows that frame-by-frame timing.

[0017] First frame: The received data arriving at the receiving unit on the receiving side is voiced, and the input data captured by the audio input unit on the transmitting side is voiced.

Second frame: The determination unit compares the voiced data on the receiving side with the voiced data on the transmitting side. Because both are voiced, reception takes priority and the unit decides to suppress the voiced data on the transmitting side.

Third frame: Following that decision, the output unit emits the received voiced data as voiced output, and the voiced data on the transmitting side is suppressed and sent as silence. Suppose that the other party pauses at this point, so that the received voice stops and the data arriving at the receiving unit is silent. At the same time, voice is observed at the audio input unit. It is not known, however, whether this voice is the local talker speaking or acoustic echo of the voiced output that leaked from the audio output unit.

Fourth frame: The determination unit compares the silent data on the receiving side with the voiced data on the transmitting side and decides not to suppress the voiced data on the transmitting side.

Fifth frame: Following that decision, the receiving side emits the silent data from the audio output unit as silence, and the transmitting side sends the voice from the audio input unit unsuppressed, as voiced data.

[0018] In the sequence above, voiced data is sent to the other party in the fifth frame, but it is not known whether that data was the local talker's speech or acoustic echo of the voiced output that leaked from the audio output unit. In the latter case, acoustic echo that should have been suppressed is sent to the other party, who then hears an echo of their own voice, for example just after they stop speaking, and finds it unpleasant.

[0019] The present invention was made in view of this problem. Its object is to suppress acoustic echo accurately in a telephone that uses frame processing, regardless of the time offset between the transmitted and received signals.

[0020]

[Means for Solving the Problems] FIG. 1 illustrates the principle of the present invention. To solve the problem described above, the voice switch for a telephone according to the present invention comprises: delay means that delays the received voice signal by a predetermined time; received voice determination means that determines, based on the received voice signal and the input transmitted voice signal, whether to suppress the received voice signal; transmitted voice determination means that determines, based on the received voice signal delayed by the delay means and the input transmitted voice signal, whether to suppress the transmitted voice signal; receiving-side suppression means that suppresses the received voice signal according to the result from the received voice determination means; and transmitting-side suppression means that suppresses the transmitted voice signal according to the result from the transmitted voice determination means.

A telephone to which this voice switch is applied processes voice data in blocks through frame processing. The predetermined delay is the time needed to compensate for the offset that frame processing introduces between the received and transmitted voice signals, so that the two are synchronized. The delay means can be implemented as storage means that holds the signal temporarily.

In this voice switch, the decision on whether to suppress the received voice signal is made by comparing the received voice just received with the transmitted voice just input. The decision on whether to suppress the transmitted voice signal is made by comparing the transmitted voice just input with the received voice that has already been output and was observed at the speaker at the same time as that transmitted voice. This corrects the time offset and synchronizes the received and transmitted voice signals at the moment of the decision. It is achieved by delaying the received voice signal with delay means such as temporary storage means.

[0021] The voice switch may further include transmitted voice estimation means that estimates the state of the transmitted voice signal synchronized with the received voice signal, and may be configured so that the output of the transmitted voice estimation means is input to the received voice determination means as the transmitted voice signal.

[0022]

[Embodiments of the Invention] Embodiments of the present invention are described below with reference to the drawings. FIG. 2 shows a hands-free telephone equipped with a voice switch as one embodiment of the present invention. In the figure, the receiving unit 1, the power suppression units 2 and 7, the audio output unit 3, the audio input unit 6, and the transmitting unit 8 are the same circuit elements as those described for the conventional device of FIG. 9, so a detailed description is omitted here.

[0023] The device of this embodiment also has a decoding unit 9 that decodes received data and an encoding unit 10 that encodes data to be sent. When, for example, personal computers are connected to each other for a call, the transmission capacity of the channel may well be small, so encoding becomes necessary. The device of this embodiment therefore implements a hands-free function using a voice switch together with a voice codec.

[0024] The device of this embodiment also has a received voice temporary storage unit 48, which delays the received voice from the power suppression unit 2 by a predetermined number of frame periods. The delay is chosen so that it corrects the time offset between the transmitted and received voice and synchronizes the two. In this embodiment, each frame of received voice data is delayed by two frame periods.
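The temporary storage unit of [0024] behaves like a fixed-length first-in, first-out buffer of frames. The sketch below is one possible shape for it, not the patent's implementation; the class name `FrameDelay` and the `fill` value returned before real frames are available are assumptions.

```python
from collections import deque

class FrameDelay:
    """Delay a stream of frames by a fixed number of frame periods."""

    def __init__(self, delay_frames=2, fill=None):
        # Pre-load the buffer so the first outputs are the fill value.
        self._buf = deque([fill] * delay_frames)

    def push(self, frame):
        """Store the newest frame and return the one from delay_frames ago."""
        self._buf.append(frame)
        return self._buf.popleft()
```

With the two-frame delay used in the embodiment, pushing frames 1, 2, 3, 4 returns the fill value twice and then frames 1 and 2.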

[0025] In place of the determination unit 4 of the conventional device, this embodiment has two units: a received voice determination unit 46 and a transmitted voice determination unit 47. The received voice determination unit 46 decides whether the receiving-side power suppression unit 2 suppresses the received voice; the received voice it takes as input is the output of the decoding unit 9. The transmitted voice determination unit 47 decides whether the transmitting-side power suppression unit 7 suppresses the transmitted voice; the received voice it takes as input is the output of the received voice temporary storage unit 48, delayed by the predetermined number of frame periods. Both units have the same configuration as the determination unit 4 described for the prior art, so a detailed description of their configuration is omitted.

[0026] The operation sequence of the device of this embodiment is described with reference to FIG. 3, in which time runs along the horizontal axis in units of frames. The explanation follows that frame-by-frame timing.

[0027] First frame: The received data arriving at the receiving unit 1 on the receiving side is voiced, and the input data captured by the audio input unit 6 on the transmitting side is voiced.

[0028] Second frame: The received voice determination unit 46 compares the voiced data on the receiving side with the voiced data on the transmitting side. Because both are voiced, reception takes priority, and the unit decides that the voiced data on the receiving side is to be emitted from the audio output unit 3 as voiced output without suppression.

[0029] Third frame: Following that decision, the audio output unit 3 emits the voiced data as voiced output, without suppression by the power suppression unit. Suppose that the other party pauses at this point, so that the received voice stops and the data arriving at the receiving unit 1 is silent. At the same time, voice is observed at the audio input unit 6. It is not yet known, however, whether this voice is the local talker speaking or acoustic echo of the voiced received output that leaked from the audio output unit 3.

[0030] Fourth frame: The received voice determination unit 46 compares the silent data on the receiving side with the voiced data on the transmitting side and decides that the silent data on the receiving side is to be suppressed and emitted from the audio output unit 3 as silence. Meanwhile, the transmitted voice determination unit 47 compares the voiced output on the receiving side, which the received voice temporary storage unit 48 has held for two frame periods, with the voiced data on the transmitting side. Following the receive-priority rule, it decides that the transmitted voice is to be suppressed by the transmitting-side power suppression unit 7 and sent as silence.

[0031] Fifth frame: Following those decisions, the receiving side emits the silent data from the audio output unit as silence, and the transmitting side suppresses the voice from the audio input unit 6 and sends it as silence.

[0032] In the sequence above, the voiced input is sent as silence in the fifth frame. The voice that entered the audio input unit 6 is very likely to be acoustic echo of the voiced output, because the audio output unit 3 was emitting that output at the same moment. The conventional device could not make this distinction. In the device of this embodiment, by contrast, the transmitted voice determination unit 47 compares the voiced output held for two frame periods in the received voice temporary storage unit 48 with the voice input to the audio input unit 6, so the voiced output on the receiving side and the voiced input on the transmitting side are aligned in time. If both are voiced, the voice on the transmitting side is judged very likely to be acoustic echo of the output emitted on the receiving side, and the unit decides to suppress it and send silence. Acoustic echo is thereby removed.
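The difference between the conventional comparison and the delayed comparison can be seen in a small frame-level simulation. This is a simplified sketch under stated assumptions, not the device itself: voice states are binary, every frame carries data, the echo is modeled as the microphone becoming voiced whenever the speaker is emitting voiced output, and the separate judging and sending latencies are collapsed so that only the alignment of the compared frames differs. Frame indices are 0-based, and all names are hypothetical.

```python
def simulate(recv, near_talk, use_delay, delay=2):
    """Return the transmitted voice state per frame (1 = voice sent, 0 = silence).

    recv      : 1 if received frame n carries far-end voice
    near_talk : 1 if the local talker speaks during frame n
    use_delay : compare against the delayed received voice (as in [0030])
                instead of the received frame arriving at the same time
    """
    sent = []
    for n in range(len(recv)):
        # Receive priority: a voiced received frame is always played,
        # `delay` frames after it arrives.
        speaker = recv[n - delay] if n >= delay else 0
        # The microphone hears the local talker and/or the speaker's echo.
        mic = 1 if (near_talk[n] or speaker) else 0
        reference = speaker if use_delay else recv[n]
        # Table of FIG. 11: the transmit side passes only when the
        # microphone is voiced and the reference received voice is silent.
        sent.append(1 if (mic == 1 and reference == 0) else 0)
    return sent

recv = [1, 1, 0, 0, 0]        # far end talks for two frames, then pauses
echo_only = [0, 0, 0, 0, 0]   # local talker never speaks
conventional = simulate(recv, echo_only, use_delay=False)
proposed = simulate(recv, echo_only, use_delay=True)
```

In this model the conventional comparison lets the echo through in the two frames after the far end pauses, while the delayed comparison sends silence throughout and still passes local speech once the speaker has gone quiet.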

[0033] FIG. 4 shows another embodiment of the present invention. Like the embodiment above, it addresses frame delay, but it also aims to remove, as far as possible, the time offset between received and transmitted voice when the received voice determination unit 46 judges the received voice. To that end it includes a transmitted voice estimation unit 49, which estimates the voice state of the transmitted voice that the microphone of the audio input unit 6 will observe at the moment the received voice currently being judged is output. This estimate can be obtained, for example, by averaging the transmitted voice power over the past several frames. Because speech varies relatively smoothly over time, the estimate is expected to be reasonably accurate.
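The averaging approach mentioned in [0033] can be sketched as a moving average over recent frame powers. The source gives neither the number of frames nor whether the average is taken on decibel or linear values; this sketch averages the values as supplied and uses a hypothetical history length of four frames.

```python
from collections import deque

class TransmitPowerEstimator:
    """Predict upcoming transmitted voice power from recent frames."""

    def __init__(self, history=4):
        # Only the most recent `history` frame powers are kept.
        self._past = deque(maxlen=history)

    def update(self, frame_power):
        """Record the power measured for the newest transmitted frame."""
        self._past.append(frame_power)

    def estimate(self):
        """Return the mean of the stored powers, or None if none exist yet."""
        if not self._past:
            return None
        return sum(self._past) / len(self._past)
```

The estimate could then be compared with the threshold th in the same way as a measured power, giving the received voice determination unit a transmitted voice state for a frame that has not yet been captured.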

[0034]

[Effects of the Invention] As described above, the present invention corrects the time offset between received and transmitted voice at decision time, which is the weakness of a voice switch that uses frame processing. This makes it possible to remove acoustic echo, to follow changes in received and transmitted voice quickly, and to provide more natural conversation.

[Brief description of the drawings]

FIG. 1 is a diagram explaining the principle of the present invention.

FIG. 2 is a diagram showing a hands-free telephone equipped with a voice switch as one embodiment of the present invention.

FIG. 3 is a diagram explaining the operation sequence of the device of the embodiment.

FIG. 4 is a diagram showing another embodiment of the present invention.

FIG. 5 is a diagram explaining acoustic echo in a hands-free telephone or similar device.

FIG. 6 is a diagram explaining the echo canceller method.

FIG. 7 is a diagram explaining the voice switch method.

FIG. 8 is a diagram comparing the echo canceller method and the voice switch method.

FIG. 9 is a diagram showing a conventional hands-free telephone equipped with a voice switch.

FIG. 10 is a diagram showing the configuration of the sound detection unit in the conventional device.

FIG. 11 is a diagram showing an example of the voiced/silent determination table.

FIG. 12 is a diagram explaining the delay caused by frame processing.

FIG. 13 is a diagram showing the operation sequence of the conventional device.

[Explanation of symbols]

1: receiving unit
2, 7: power suppression units
3: audio output unit
4: determination unit
6: audio input unit
8: transmitting unit
41, 43: power calculation units
42, 44: sound detection units
45: determination unit
46: received voice determination unit
47: transmitted voice determination unit
48: received voice temporary storage unit
49: transmitted voice estimation unit

Continuation of the front page
(72) Inventor: Hitoshi Matsuzawa, c/o Fujitsu Limited, 4-1-1 Kamikodanaka, Nakahara-ku, Kawasaki-shi, Kanagawa
(72) Inventor: Masato Ito, c/o Fujitsu Limited, 4-1-1 Kamikodanaka, Nakahara-ku, Kawasaki-shi, Kanagawa

Claims

[Claims]

1. A voice switch for a telephone, comprising: delay means for delaying a received voice signal by a predetermined time; received voice determination means for determining, based on the received voice signal and an input transmitted voice signal, whether to suppress the received voice signal; transmitted voice determination means for determining, based on the received voice signal delayed by the delay means and the input transmitted voice signal, whether to suppress the transmitted voice signal; receiving-side suppression means for suppressing the received voice signal according to the result from the received voice determination means; and transmitting-side suppression means for suppressing the transmitted voice signal according to the result from the transmitted voice determination means.

2. The voice switch for a telephone according to claim 1, further comprising transmitted voice estimation means for estimating the voice state of the transmitted voice signal synchronized with the received voice signal, wherein the transmitted voice signal from the transmitted voice estimation means is input to the received voice determination means as the transmitted voice signal.