CN104469619B

CN104469619B - Controller applied to audio device and related operating method

Info

Publication number: CN104469619B
Application number: CN201310414628.1A
Authority: CN
Inventors: 黄宏吉; 胡正伦
Original assignee: MStar Semiconductor Inc Taiwan
Current assignee: MediaTek Inc
Priority date: 2013-09-12
Filing date: 2013-09-12
Publication date: 2017-10-27
Anticipated expiration: 2033-09-12
Also published as: CN104469619A

Abstract

The present invention relates to a controller applied to an audio device and a related operating method. The controller receives a first sound collection signal and a second sound collection signal respectively provided by two microphones, and includes an echo cancellation module and a beamforming module. The echo cancellation module performs echo cancellation on the first sound collection signal and provides an intermediate signal accordingly. The beamforming module performs beamforming processing using the echo-cancelled intermediate signal and the second sound collection signal, which has not been processed by echo cancellation.

Description

Controller applied to audio device and related operating method

Technical Field

The present invention relates to a controller applied to an audio device and a related operating method, and more particularly to an audio device controller and a related operating method capable of effectively improving sound collection with a low computational load.

Background Art

Audio devices capable of collecting and/or playing sound play an important role in modern digital life; devices with voice control functions can also be regarded as audio devices. For example, audio devices include mobile phones, digital cameras and camcorders, navigation/positioning devices that produce speech and accept voice control, wearable/handheld/portable calculators, e-books, electronic dictionaries and computers, as well as voice-controlled televisions, stereos, multimedia players, toys, and even interactive artworks.

Please refer to FIG. 1, which shows a known audio device 10 that can play sound and accept voice control. The audio device 10 includes microphones 12a and 12b, speakers 14a and 14b, a controller 20, an audio output module 23 and a playback module 24. The microphones 12a and 12b collect sound and convert it into signals Si_L and Si_R, which are transmitted to the controller 20.

The controller 20 includes a beamforming module 16, an echo cancellation module 18 and a speech recognition module 22. The audio output module 23 provides signals Sp_L and Sp_R as sound source signals, and the playback module 24 plays according to the signals Sp_L and Sp_R, for example by driving the speakers 14a and 14b with the signals Sp_L and Sp_R respectively, so that the signals are played as sound.

To implement voice control, the audio device 10 must focus on the user's position in order to collect the voice commands issued by the user, and must keep the playback of the speakers 14a and 14b from affecting sound collection, because the sound played by the speakers 14a and 14b forms an echo that is picked up by the microphones 12a and 12b. In the controller 20 of the known audio device 10, the beamforming module 16 performs beamforming processing on the signals Si_L and Si_R and provides a signal Sm1 accordingly; the purpose of beamforming is to enhance, in the signal Sm1, the sound from a certain focus area and to suppress interfering sound from other, non-focus areas. The echo cancellation module 18 then performs echo cancellation on the signal Sm1 according to the signal Sp_R and provides a signal Sm2 accordingly. The speech recognition module 22 can then perform speech recognition on the signal Sm2, identifying whether the signal Sm2 contains a voice command and what the command is, so that the controller 20 can control the audio device 10 accordingly.

As can be seen from FIG. 1, the known audio device 10 performs echo cancellation after beamforming. Under this known architecture, the controller 20 needs only a single echo cancellation module 18, which reduces the computational load. However, beamforming destroys the linear characteristics of the echo and produces a nonlinear signal, so that the echo cancellation module 18 cannot completely cancel the echo, which in turn degrades the accuracy and recognition rate of speech recognition.

Summary of the Invention

To overcome the shortcomings of the known art, one objective of the present invention is to provide a controller applicable to an audio device. The controller of the present invention receives a first sound collection signal and a second sound collection signal respectively provided by two microphones, and includes an echo cancellation module and a beamforming module. The echo cancellation module performs echo cancellation on the first sound collection signal and provides an intermediate signal accordingly. The beamforming module is coupled to the echo cancellation module and to the second sound collection signal, and performs beamforming processing on the intermediate signal and the second sound collection signal to provide an output signal, wherein the second sound collection signal is not processed by echo cancellation. The controller may further include a speech recognition module, coupled to the beamforming module, that performs speech recognition on the output signal and controls the audio device according to the result of the speech recognition.

The audio device of the present invention may include one or more speakers, an audio output module and a playback module. The audio output module provides a sound source signal for each speaker, the playback module makes each speaker play the corresponding sound according to each sound source signal, and the echo cancellation module performs echo cancellation on the first sound collection signal according to the sound source signal.

Another objective of the present invention is to provide an operating method applied to an audio device, comprising: receiving a first sound collection signal and a second sound collection signal from a first microphone and a second microphone respectively; performing echo cancellation processing on the first sound collection signal to provide an intermediate signal; and performing beamforming processing according to the intermediate signal and the second sound collection signal to provide an output signal, wherein the second sound collection signal is not processed by echo cancellation.

For a better understanding of the above and other aspects of the present invention, preferred embodiments are described in detail below with reference to the accompanying drawings:

Brief Description of the Drawings

FIG. 1 shows the controller architecture of a known audio device.

FIG. 2 shows an audio device and its controller.

FIG. 3 shows an audio device and its controller according to an embodiment of the present invention.

FIG. 4 compares, by way of example, the echo cancellation performance and computational load of FIG. 1 to FIG. 3.

FIG. 5 shows the flow of an operating method according to an embodiment of the present invention.

Description of Reference Symbols

10, 30, 50: Audio device

12a-12b, 32a-32b, 52a-52b: Microphone

14a-14b, 34a-34b, 54a-54b: Speaker

16, 36, 56: Beamforming module

18, 38a-38b, 58: Echo cancellation module

20, 40, 60: Controller

22, 42, 62: Speech recognition module

23, 43, 63: Audio output module

24, 44, 64: Playback module

Si_L/Si_R, Sm1, Sm2, Sp_L/Sp_R, Sm_R/Sm_L, Si_a/Si_b, Sp_a/Sp_b, S1, S2: Signal

100: Process

102-108: Step

Detailed Description

Please refer to FIG. 2, which shows an audio device 30. The audio device 30 can also play sound and accept voice control, and includes microphones 32a and 32b, speakers 34a and 34b, a controller 40, an audio output module 43 and a playback module 44. The microphones 32a and 32b collect sound and provide electronic signals Si_L and Si_R accordingly, which are transmitted to the controller 40.

The controller 40 includes two echo cancellation modules 38a and 38b, a beamforming module 36 and a speech recognition module 42. The audio output module 43 provides signals Sp_L and Sp_R as sound source signals, and the playback module 44 controls the speakers 34a and 34b according to the signals Sp_L and Sp_R so that the signals are played as sound.

To implement voice control, the audio device 30 must likewise focus its sound collection and keep the playback echo of the speakers 34a and 34b from interfering with it. In the controller 40 of the audio device 30, the echo cancellation modules 38a and 38b first cancel the echo from the signals Si_L and Si_R according to the signals Sp_L and Sp_R respectively, producing signals Sm_L and Sm_R. The beamforming module 36 then performs beamforming processing on the signals Sm_L and Sm_R and generates the signal Sm2 as an output signal. The speech recognition module 42 can then perform speech recognition on the signal Sm2, so that the controller 40 can control the audio device 30 accordingly.

Unlike the known art of FIG. 1, the controller architecture of FIG. 2 first performs balanced echo cancellation on both channels and then performs beamforming, so that the echo characteristics are not destroyed by beamforming. However, the balanced two-channel echo cancellation of FIG. 2 may require a larger computational load.

Please refer to FIG. 3, which shows an audio device 50 according to an embodiment of the present invention. For example, the audio device 50 may be a device that can play sound and accept voice control, such as a voice-controlled television or a voice-controlled multimedia player. The audio device 50 may include one or more microphones (for example, microphones 52a and 52b), one or more speakers (for example, speakers 54a and 54b), an audio output module 63, a playback module 64 and a controller 60. The microphones 52a and 52b collect sound and convert it into electronic signals Si_a and Si_b respectively (which can be regarded as the first and second sound collection signals), which are transmitted to the controller 60.

The controller 60 may be a processor or a controller chip, and may also include peripheral support circuitry and/or hardware for the controller chip, such as volatile and/or non-volatile memory. The controller 60 may include a single echo cancellation module 58, a beamforming module 56 and a speech recognition module 62. In the audio device 50, the audio output module 63 provides signals Sp_a and Sp_b (which can be regarded as sound source signals), and the playback module 64 drives the speakers 54a and 54b according to the signals Sp_a and Sp_b so that the signals are played as the corresponding sounds. For example, the audio output module 63 may include an audio codec module that extracts the signals of different channels from a stereo audio stream (not shown) to serve as the sound source signals of different speakers, such as the signals Sp_a and Sp_b for the speakers 54a and 54b.

The audio device 50 can focus its sound collection and suppress the echo caused by speaker playback. For example, to implement voice control, the audio device 50 can focus on the user's position in order to collect the voice commands issued by the user, and keep the playback of the speakers 54a and 54b from affecting sound collection. In the controller 60, the echo cancellation module 58 is coupled to the microphone 52a, the beamforming module 56 and the audio output module 63. It receives the signal Sp_a, performs echo cancellation on the signal Si_a with reference to the signal Sp_a, and provides a signal S1 as an intermediate signal accordingly. The beamforming module 56 is coupled to the echo cancellation module 58, the microphone 52b and the speech recognition module 62, and performs beamforming processing using the signal S1 and the signal Si_b of the microphone 52b to provide a signal S2 as an output signal. The speech recognition module 62 is coupled to the beamforming module 56 and performs speech recognition on the signal S2, so that the controller 60 can control the audio device 50 according to the result of the speech recognition.

As can be seen from FIG. 3, the controller 60 of the present invention places echo cancellation before beamforming. This keeps the nonlinear signal produced by beamforming from affecting echo cancellation, and further prevents beamforming from degrading the speech recognition rate and accuracy. For example, echo cancellation can be performed with the normalized least mean squares (NLMS) algorithm. However, when echo cancellation is performed against a given input sound source signal, the more processing that signal has undergone (such as spatial reflection, nonlinear resonance and/or beamforming), the harder it is for the NLMS algorithm to use the sound source signal to approximate the adaptive filter coefficients that model the incoming echo. Therefore, if beamforming is placed before echo cancellation, it becomes harder for the echo cancellation module to learn the filter coefficients that cancel the echo, and the echo becomes harder to cancel. In contrast, the controller architecture of the present invention places beamforming after echo cancellation, and therefore effectively prevents beamforming from undermining echo cancellation.
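As a minimal sketch of the NLMS-based echo cancellation mentioned above, the following Python function adapts an FIR filter so that the filtered reference signal (such as Sp_a) approximates the echo in a microphone signal (such as Si_a), and outputs the residual (such as S1). The function name `nlms_echo_cancel`, the parameters `taps`, `mu` and `eps`, and their default values are illustrative assumptions; the patent does not specify an implementation.

```python
import numpy as np

def nlms_echo_cancel(mic, ref, taps=64, mu=0.5, eps=1e-8):
    """Remove the echo of `ref` from `mic` with an NLMS adaptive FIR filter.

    Returns the echo-cancelled signal and the final filter coefficients.
    """
    mic = np.asarray(mic, dtype=float)
    ref = np.asarray(ref, dtype=float)
    w = np.zeros(taps)      # adaptive filter coefficients (echo path estimate)
    buf = np.zeros(taps)    # most recent reference samples, newest first
    out = np.zeros_like(mic)
    for n in range(len(mic)):
        buf = np.roll(buf, 1)
        buf[0] = ref[n]
        echo_est = w @ buf              # estimated echo at sample n
        e = mic[n] - echo_est           # residual after cancellation
        # Normalized update: step size scaled by the reference energy.
        w += mu * e * buf / (buf @ buf + eps)
        out[n] = e
    return out, w
```

With a linear echo path the coefficients `w` converge toward that path. The text's argument is that nonlinear processing ahead of the canceller, such as beamforming, breaks this linear relationship between the reference and the echo and makes convergence harder.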

Furthermore, the controller 60 of the present invention can be implemented with a single echo cancellation module 58, so its computational load is reduced and the additional computation required by the multiple echo cancellers of FIG. 2 is avoided. Although the controller 60 performs echo cancellation only on the signal Si_a provided by the microphone 52a and not on the signal Si_b of the microphone 52b, according to the embodiment of the present invention the echo in the signal Si_b is still suppressed and removed by the beamforming processing of the beamforming module 56. Overall, therefore, the echoes in the signals Si_a and Si_b do not degrade the recognition rate of speech recognition.

One purpose of beamforming is to enhance sound from the focus area and to relatively suppress sound from non-focus areas. For example, the focus area may lie on the geometric centerline of the microphones 52a and 52b. In other words, the focus area is at similar distances from the microphones 52a and 52b, so a sound emitted in the focus area appears similarly in the signals Si_a and Si_b. If a sound appears differently in the signals Si_a and Si_b, or appears in only one of them, it can be judged not to come from the focus area. In the embodiment of the present invention, although the signal Si_b of the microphone 52b has not been echo-cancelled, its echo appears only in the signal Si_b from the microphone 52b and not in the signal S1 from the echo cancellation module 58, so the beamforming module 56 treats it as sound from a non-focus area. The beamforming processing of the beamforming module 56 thus filters out the echo in the signal Si_b.
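The reasoning above can be illustrated with a toy example. The sketch below uses a fixed delay-and-sum beamformer with zero steering delay (focus on the centerline), which is an assumption made for illustration and not the beamforming method of the patent. The function name `broadside_delay_and_sum` and the test signals are hypothetical.

```python
import numpy as np

def broadside_delay_and_sum(s1, si_b):
    """Average two channels with zero steering delay (focus on the centerline)."""
    s1 = np.asarray(s1, dtype=float)
    si_b = np.asarray(si_b, dtype=float)
    return 0.5 * (s1 + si_b)

# Hypothetical signals: speech from the focus area reaches both channels alike,
# while the echo is present only in the uncancelled channel Si_b.
t = np.arange(480) / 16000.0
speech = np.sin(2 * np.pi * 300 * t)
echo_b = 0.8 * np.sin(2 * np.pi * 1000 * t)

s1 = speech              # echo-cancelled channel (an ideal canceller is assumed)
si_b = speech + echo_b   # raw second channel, not echo-cancelled
s2 = broadside_delay_and_sum(s1, si_b)
# The common speech passes at full level; the one-channel echo is halved.
```

A fixed two-channel average attenuates a component present in only one channel by about 6 dB. The stronger suppression described in the text would call for an adaptive or otherwise more selective beamformer, which is not shown here.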

Please refer to FIG. 4, which compares, by way of example, the echo cancellation performance and computational load of the controllers of FIG. 1 to FIG. 3. In FIG. 4, echo cancellation performance is quantified by the echo return loss enhancement (ERLE); the higher the value, the better the echo cancellation. The computational load is expressed as the number of clock cycles required for echo cancellation; the lower the value, the less computation is consumed. As can be seen from FIG. 4, the controller architecture of the present invention (FIG. 3) achieves both good echo cancellation and a low computational load.
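For reference, ERLE is commonly computed as the ratio, in decibels, of the signal power before echo cancellation to the residual power after it, measured while only echo is present. The sketch below follows that common definition; the function name `erle_db` and the guard term `eps` are assumptions, and the patent does not state how the values in FIG. 4 were measured.

```python
import numpy as np

def erle_db(mic, residual, eps=1e-12):
    """Echo return loss enhancement in dB: input power over residual power."""
    mic = np.asarray(mic, dtype=float)
    residual = np.asarray(residual, dtype=float)
    p_in = np.mean(mic ** 2)         # power before echo cancellation
    p_out = np.mean(residual ** 2)   # power after echo cancellation
    # eps guards against division by zero for an all-zero residual.
    return 10.0 * np.log10((p_in + eps) / (p_out + eps))
```

For example, a canceller that reduces the echo amplitude by a factor of ten yields an ERLE of about 20 dB.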

In the embodiment of FIG. 3, the speech recognition module 62 may also be replaced by a module with another function, such as a recording module (which records the signal S2 to non-volatile memory), a transmission module (which transmits the signal S2 to a network) and/or an audio processing module, for example an encoding module (which encodes the signal S2 into a stream) or a spectrum conversion module (which converts the signal S2 to the frequency domain). Each module of the controller 60 may be implemented in dedicated hardware and/or by a hardware processor executing software and/or firmware.

Please refer to FIG. 5, which shows a process 100 according to an embodiment of the present invention that can be applied to the audio device of FIG. 3. The main steps of the process 100 are described below.

Step 102: Receive a plurality of sound collection signals from a plurality of microphones; for example, obtain the signals Si_a and Si_b from the microphones 52a and 52b (FIG. 3) respectively.

Step 104: Among the plurality of sound collection signals, perform echo cancellation processing on one or more of them, and leave the remaining one or more unprocessed by echo cancellation. In the example of FIG. 3, echo cancellation processing is performed on the signal Si_a according to the signal Sp_a to form the signal S1 (the intermediate signal), while the signal Si_b is not processed by echo cancellation.

Step 106: Perform beamforming processing using both the echo-cancelled signal (such as the signal S1) and the signal that has not been echo-cancelled (such as the signal Si_b) to provide an output signal, such as the signal S2 in FIG. 3.

Step 108: Use the output signal provided in step 106. For example, speech recognition can be performed on the output signal S2, and the audio device 50 can be controlled according to the result of the speech recognition.
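The steps of the process 100 can be sketched as a small pipeline. In the Python sketch below, the function `process_100` and the stand-in stages `toy_cancel_echo` and `toy_beamform` are hypothetical names chosen so that the example runs; they are not the modules of the patent, and the echo gains 0.5 and 0.4 are made-up values.

```python
import numpy as np

def process_100(si_a, si_b, sp_a, cancel_echo, beamform, consume):
    """Run steps 104-108 on signals already captured in step 102."""
    s1 = cancel_echo(si_a, sp_a)   # step 104: echo-cancel only the first signal
    s2 = beamform(s1, si_b)        # step 106: Si_b enters beamforming uncancelled
    return consume(s2)             # step 108: e.g. speech recognition

# Stand-in stages, chosen only so the example runs.
def toy_cancel_echo(mic, ref):
    return mic - 0.5 * ref         # assumes the echo gain 0.5 is known exactly

def toy_beamform(a, b):
    return 0.5 * (a + b)           # fixed average of the two channels

# Step 102 (simulated): both microphones pick up speech plus speaker echo.
rng = np.random.default_rng(1)
speech = rng.standard_normal(256)
sp_a = rng.standard_normal(256)
si_a = speech + 0.5 * sp_a
si_b = speech + 0.4 * sp_a

s2 = process_100(si_a, si_b, sp_a, toy_cancel_echo, toy_beamform, lambda s: s)
```

Only one echo canceller runs per block of samples, which is where the computational saving relative to the two-canceller architecture of FIG. 2 comes from.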

In summary, the present invention can be generalized as follows. The controller of the present invention receives a plurality of sound collection signals provided by a microphone array (which may include a plurality of microphones) and performs echo cancellation processing on some (one or more) of them, while the remaining (one or more) sound collection signals need not be processed by echo cancellation. The echo-cancelled sound collection signals and the sound collection signals that have not been echo-cancelled are then combined in beamforming processing to achieve focused sound collection and echo cancellation. In other words, the present invention applies unbalanced echo cancellation to the signals provided by different microphones, and combines it with beamforming to achieve both focused sound collection and echo cancellation. Compared with the known art, the present invention keeps echo cancellation from being affected by beamforming and does not need to perform echo cancellation on the microphones of all channels, so it achieves both good echo cancellation and a reduced computational load.

In conclusion, although the present invention has been disclosed above by way of preferred embodiments, they are not intended to limit the present invention. Those of ordinary skill in the art to which the present invention pertains may make various changes and modifications without departing from the spirit and scope of the present invention. The scope of protection of the present invention is therefore defined by the claims.

Claims

1. A controller applied to an audio device, the controller receiving a first sound collection signal and a second sound collection signal respectively provided by two microphones, and comprising:

an echo cancellation module, which performs an echo cancellation process on the first sound collection signal and provides an intermediate signal accordingly; and

a beamforming module, which performs a beamforming process according to the intermediate signal and the second sound collection signal to provide an output signal, wherein the second sound collection signal is not subjected to the echo cancellation process.

2. The controller according to claim 1, wherein the audio device comprises an audio output module and a playback module, and the playback module plays according to a sound source signal output by the audio output module, wherein the echo cancellation module performs the echo cancellation process on the first sound collection signal according to the sound source signal.

3. The controller according to claim 1, further comprising:

a speech recognition module, which performs speech recognition on the output signal.

4. The controller according to claim 3, wherein the audio device is controlled according to a result of the speech recognition.

5. An operating method applied to an audio device, comprising:

receiving a first sound collection signal and a second sound collection signal from a first microphone and a second microphone respectively;

performing an echo cancellation process on the first sound collection signal to provide an intermediate signal; and

performing a beamforming process according to the intermediate signal and the second sound collection signal to provide an output signal, wherein the second sound collection signal is not subjected to the echo cancellation process.

6. The operating method according to claim 5, wherein the audio device comprises an audio output module and a playback module, and the playback module plays according to a sound source signal output by the audio output module, wherein the step of performing the echo cancellation process on the first sound collection signal to provide the intermediate signal is carried out according to the sound source signal.

7. The operating method according to claim 5, further comprising: performing speech recognition on the output signal.

8. The operating method according to claim 7, further comprising: controlling the audio device according to a result of the speech recognition.