CN100524465C

CN100524465C - A method and device for noise elimination

Info

Publication number: CN100524465C
Application number: CNB2006101440540A
Authority: CN
Inventors: 张晨; 邓昊; 冯宇红
Original assignee: Vimicro Corp
Current assignee: Vimicro Corp
Priority date: 2006-11-24
Filing date: 2006-11-24
Publication date: 2009-08-05
Anticipated expiration: 2026-11-24
Also published as: CN1953059A

Abstract

The invention discloses a noise elimination device and method. Two microphones are used to construct two cardioid directional microphones whose nulls point to the front and to the rear respectively, so that after processing the main microphone signal mainly contains the target voice and the other signal mainly contains noise. Frequency-domain adaptive filtering is then applied, and the beam pointing range of the entire device is controlled by adjusting the gain of the adaptive filter. Finally, a single-channel denoising algorithm further eliminates the noise components in the main microphone signal. This allows the device to achieve optimum performance in both noise elimination and preservation of target voice quality. The invention can be widely used in the field of microphone noise elimination.

Description

Noise canceling device and method

Technical field

The invention relates to the field of audio processing, and in particular to a noise elimination device and method.

Technical background

In addition to the target voice signal, the sound received by a microphone often includes noise such as background voices and background music. Prior-art noise elimination methods fall into two main categories: single-microphone denoising and multi-microphone (microphone array) denoising.

The single-microphone denoising method can achieve good results in specific application scenarios. Its principle is generally to exploit the differences between the target speech and the noise components in the time-frequency domain; for example, the characteristics of a noise signal are generally assumed to change more slowly than those of a speech signal. In the general case, therefore, this kind of algorithm has the following disadvantages:

1) When the noise is non-stationary, such as background voices or background music, the denoising effect is worse than for stationary noise;

2) When the signal-to-noise ratio is low (for example, below 0), the denoising effect is not obvious;

3) Musical noise may be introduced during noise elimination, so a trade-off must be made between maintaining voice quality and suppressing noise.

Multi-microphone denoising can overcome the above problems and cope with harsher acoustic environments, so it has been widely applied in digital hearing aids, in-vehicle voice or recording equipment, communication devices for pilots and soldiers, conference microphones, speech recognition front ends, and similar fields. Compared with single-microphone denoising, multi-microphone denoising mainly exploits the spatial difference between the target sound source and the noise sources, that is, the different distances and directions from each source to the microphones, to separate the signals and thereby eliminate noise.

Existing broadside-type (side-by-side) dual-microphone denoising devices take three main forms: two omnidirectional microphones; two unidirectional microphones; or one omnidirectional and one unidirectional microphone. "Broadside type" means that the target voice signal arriving from directly in front reaches the two microphones at the same time.

The first approach, using two omnidirectional microphones, often has a considerable impact on the quality of the target voice and is essentially unable to eliminate noise coming from directly behind.

The second approach, using two forward-pointing unidirectional microphones, performs similarly to the first, except that the inherent characteristics of the unidirectional microphones provide some suppression of noise from directly behind.

The third approach combines one unidirectional and one omnidirectional microphone. Existing solutions often feed these two signals directly into adaptive filtering; since both signals contain strong voice components, the voice quality is degraded.

In summary, existing dual-microphone denoising methods do not eliminate noise satisfactorily, and while eliminating noise they degrade the quality of the target voice.

Summary of the invention

In view of the shortcomings of prior-art broadside dual-microphone denoising methods, the purpose of the present invention is to provide a noise elimination device and method that can effectively eliminate noise without degrading the quality of the target voice.

To achieve the above purpose, the present invention provides a noise elimination device, comprising:

a target speech signal acquisition module, used to acquire a sound signal whose main component is the speech signal;

a noise signal acquisition module, used to acquire a sound signal whose main component is the noise signal;

an adaptive filtering module, used to simulate, from the output signal of the noise signal acquisition module, the noise component present in the output of the target speech signal acquisition module, and then to subtract the simulated signal from the output of the target speech signal acquisition module to obtain a speech signal with the noise removed;

a single-channel denoising module, used to apply single-channel denoising to the output of the adaptive filtering module to obtain a signal with the noise further removed.

Preferably, the target speech signal acquisition module is a cardioid unidirectional microphone pointing in the direction of the target speech, used to pick up a sound signal whose main component is the target speech signal;

the noise signal acquisition module comprises an omnidirectional microphone, a gain adjustment unit, and a subtractor. The omnidirectional microphone picks up sound signals from all directions. The gain adjustment unit adjusts the gain of the omnidirectional microphone's output signal so that the omnidirectional microphone and the cardioid unidirectional microphone have the same gain for sound from the target speech direction. The subtractor takes the difference between the gain-adjusted omnidirectional signal and the output of the cardioid unidirectional microphone, yielding a sound signal whose main component is the noise signal.

Alternatively, the noise signal acquisition module is a unidirectional microphone pointing opposite to the direction of the target speech, used to pick up a sound signal whose main component is the noise signal.

Preferably, the adaptive filtering module is a frequency-domain adaptive filter.

Preferably, the frequency-domain adaptive filter includes a coefficient adjustment unit, used to check the magnitude of the filter coefficients and to reduce them when they are too large.

Preferably, the filter coefficients are reduced as follows: $W (k + 1)' = W (k + 1) * \frac{Threshold}{| | W_{\max} (k + 1) | |},$ where

W(k+1) = [W₀(k+1), W₁(k+1), ..., W_{N-1}(k+1)] is the coefficient vector of the frequency-domain adaptive filter, an N-dimensional complex vector, with N the coefficient length;

‖W_max(k+1)‖ is the largest modulus among the complex numbers W₀(k+1), W₁(k+1), ..., W_{N-1}(k+1);

$Threshold = \sqrt{\frac{1 + \cos (θ)}{1 - \cos (θ)}}$, where θ is the input angle of the speech signal that is to be protected.
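
As a rough illustration of the Threshold formula above, the following Python sketch evaluates it for two angles. The function name `threshold_from_angle` and the choice to take θ in degrees are assumptions made for this example; the text does not prescribe an implementation.

```python
import math

def threshold_from_angle(theta_deg):
    """Threshold = sqrt((1 + cos θ) / (1 - cos θ)), with θ in degrees (assumed unit)."""
    c = math.cos(math.radians(theta_deg))
    if c >= 1.0:
        # At θ = 0 the denominator is zero, so the threshold is unbounded there.
        raise ValueError("threshold is undefined for theta = 0")
    return math.sqrt((1.0 + c) / (1.0 - c))

# A wider protected angle yields a smaller threshold, i.e. a tighter gain limit.
t60 = threshold_from_angle(60.0)
t90 = threshold_from_angle(90.0)
```

Under this formula, protecting speech out to 90 degrees limits the largest coefficient modulus to about 1, while a narrower 60-degree range allows a larger value.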

The present invention also provides a noise elimination method, comprising the following steps:

acquiring the target speech signal, that is, a sound signal s(k) whose main component is the speech signal;

acquiring the noise signal, that is, a sound signal n(k) whose main component is noise;

an adaptive filtering step, in which n(k) is filtered to simulate the noise component in s(k), and the simulated noise is then subtracted from s(k);

a single-channel denoising step, in which single-channel denoising is applied to the result of the adaptive filtering step to obtain the denoised signal.

Preferably, acquiring the target speech signal consists of using a unidirectional microphone to pick up the sound signal f(k) from the target speech direction, that is, s(k) = f(k);

acquiring the noise signal comprises:

using an omnidirectional microphone to pick up the sound signal b(k) from all directions;

adjusting the gain α applied to b(k) so that its gain in the target speech direction equals that of f(k);

subtracting f(k) from the gain-adjusted signal α*b(k) to obtain the sound signal n(k) whose main component is the noise signal, that is, n(k) = α*b(k) - f(k).

Preferably, acquiring the target speech signal consists of using a unidirectional microphone to pick up the sound signal f′(k) from the target speech direction, that is, s(k) = f′(k);

acquiring the noise signal consists of using a cardioid unidirectional microphone to pick up the sound signal b′(k) from the direction opposite to the target speech, that is, n(k) = b′(k).

Preferably, the adaptive filtering step uses a frequency-domain adaptive filtering method.

Preferably, the magnitude of the filter coefficients is checked after each coefficient update, and the coefficients are reduced when they are too large.

W(k+1) = [W₀(k+1), W₁(k+1), ..., W_{N-1}(k+1)] is the coefficient vector of the frequency-domain adaptive filter, an N-dimensional complex vector, with N the coefficient length;

‖W_max(k+1)‖ is the largest modulus among the complex numbers W₀(k+1), W₁(k+1), ..., W_{N-1}(k+1);

$Threshold = \sqrt{\frac{1 + \cos (θ)}{1 - \cos (θ)}}$, where θ is the maximum input angle of the speech signal that is to be protected.

The present invention differs from the prior art in that two microphones are used such that, after processing, the main microphone path mainly contains the target voice while the auxiliary microphone path mainly contains noise. Frequency-domain adaptive filtering is then applied, and the beam pointing range of the entire device is controlled by adjusting the gain of the adaptive filter, so that the device achieves optimum performance in both eliminating noise and preserving target voice quality. With the present invention, noise can be effectively eliminated without reducing the quality of the target voice.

Description of drawings

Fig. 1: basic block diagram of the device of the present invention;

Fig. 2: circuit block diagram of embodiment one of the present invention;

Fig. 3: circuit block diagram of embodiment two of the present invention;

Fig. 4: schematic diagram of beamforming;

Fig. 5: schematic diagram of the frequency-domain LMS algorithm;

Fig. 6: polar patterns of the omnidirectional microphone and the cardioid unidirectional microphone;

Fig. 7: schematic diagram of the method of the present invention.

Detailed description

Building on the prior art, and after thorough theoretical study and extensive experiments, the present invention proposes a new broadside dual-microphone denoising device and method. The invention uses two microphones to construct two cardioid directional microphones whose nulls point forward and backward respectively, so that after processing the main microphone path mainly contains the target voice and the other path mainly contains noise. In this document, pointing forward means pointing in the direction of the target voice, and pointing backward means pointing in the opposite direction. The main microphone path is the path that captures the signal mainly containing the target voice; the auxiliary microphone path is the path whose signal, after processing, is mainly noise.

The device and method of the present invention are described in detail below with reference to the accompanying drawings.

Fig. 1 shows the basic block diagram of the device of the present invention. It comprises: a target speech signal acquisition module, used to acquire a sound signal whose main component is the speech signal;

a noise signal acquisition module, used to acquire a sound signal whose main component is the noise signal;

an adaptive filtering module, used to simulate, from the output signal of the noise signal acquisition module, the noise component in the output of the target speech signal acquisition module;

a single-channel denoising module, used to subtract the output of the adaptive filtering module from the output of the target speech signal acquisition module to obtain a speech signal with the noise removed.

Two embodiments of the device of the present invention are given here. The first embodiment, shown in Fig. 2, uses a cardioid unidirectional microphone to capture f(k), whose main component is the target speech signal, and an omnidirectional microphone to capture the sound signal b(k). A beamforming module then processes f(k) and b(k) to construct two cardioid directional microphones whose gain nulls point forward and backward respectively. The second embodiment, shown in Fig. 3, directly uses two cardioid unidirectional microphones pointing forward and backward respectively. In both cases, after processing, the main microphone path mainly contains the target voice and the other path mainly contains noise. A frequency-domain adaptive filter is then applied, and the beam pointing range of the entire device is controlled by adjusting the filter's gain. Finally, a single-channel denoising module further eliminates the noise components in the main microphone signal. This allows the device to achieve optimum performance in both noise elimination and preservation of target voice quality.

The two embodiments differ in how they construct the two cardioid directional microphones whose gain nulls point forward and backward; the construction principles are detailed below.

As shown in Fig. 4, beamforming in the first embodiment requires a cardioid unidirectional microphone, whose gain characteristic is the cardioid curve drawn to the right of the unidirectional microphone in the figure: the gain is largest toward the target speech direction (the front) and smallest, zero, for sound from the rear. It also requires an omnidirectional microphone, which collects sound from every direction with equal gain. The beamforming module shown in Fig. 2 consists, in Fig. 4, of a gain adjustment module and a subtractor. The gain adjustment module adjusts the gain of the omnidirectional microphone's output signal so that the omnidirectional microphone and the cardioid unidirectional microphone have the same gain for sound from the target speech direction. The subtractor takes the difference between the gain-adjusted omnidirectional signal and the output of the cardioid unidirectional microphone; the resulting difference contains little of the target speech signal and consists mainly of noise from other directions.

Beamforming can be expressed mathematically as:

s(k) = f(k) (1.1)

n(k) = α*b(k) - f(k) (1.2)

where:

f(k): the signal received by the cardioid unidirectional microphone pointing at the target voice (defined as the 0-degree direction);

b(k): the signal received by the omnidirectional microphone;

s(k): the output of the constructed cardioid directional microphone whose null is at 180 degrees; in theory its main component is the target voice signal arriving from the front;

n(k): the output of the constructed cardioid directional microphone whose null is at 0 degrees; in theory its main component is the noise signal arriving from the rear;

α: the gain factor, set by the gain adjustment module, used to calibrate the microphones so that both have the same gain for signals from directly in front.

After this beamforming, the main microphone path mainly contains the target speech, while the auxiliary microphone path mainly contains noise. This provides ideal conditions for the adaptive filtering that follows.
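
Equations (1.1) and (1.2) can be sketched in a few lines of Python. The example assumes ideal microphones: a cardioid response of 0.5(1 + cos θ), an omnidirectional response of 1, and therefore α = 1. The helper names and the sample values are hypothetical and chosen only for illustration.

```python
import math

def cardioid_gain(theta_deg):
    """Ideal cardioid response 0.5 * (1 + cos θ); the front is 0 degrees."""
    return 0.5 * (1.0 + math.cos(math.radians(theta_deg)))

def beamform(f, b, alpha=1.0):
    """Return (s, n) following equations (1.1) and (1.2)."""
    s = list(f)                                      # (1.1): s(k) = f(k)
    n = [alpha * bk - fk for fk, bk in zip(f, b)]    # (1.2): n(k) = α*b(k) - f(k)
    return s, n

speech = [0.3, -0.2, 0.5, 0.1]    # hypothetical source at 0 degrees (front)
noise = [0.05, 0.4, -0.3, 0.2]    # hypothetical source at 180 degrees (rear)

# What each ideal microphone would pick up from the two sources.
f = [cardioid_gain(0) * x + cardioid_gain(180) * v for x, v in zip(speech, noise)]
b = [x + v for x, v in zip(speech, noise)]

s, n = beamform(f, b, alpha=1.0)
```

In this idealized case s reproduces the front source and n reproduces the rear source. With real microphones the separation is only approximate, which is why the adaptive filtering stage follows.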

In the second embodiment of the present invention, the signals captured by the two cardioid unidirectional microphones, one pointing forward and one backward, are equivalent to the two signals produced by beamforming in embodiment one, so no beamforming is needed and adaptive filtering is applied directly. Specifically, one cardioid unidirectional microphone points forward and picks up the sound signal f′(k) from the target voice direction, that is, s(k) = f′(k); the other cardioid unidirectional microphone picks up the sound signal b′(k) from the direction opposite to the target voice, that is, n(k) = b′(k).

The frequency-domain adaptive filtering module is introduced next. The present invention uses frequency-domain adaptive filtering mainly for the following four reasons:

1) Frequency-domain adaptive filtering has low computational complexity and is more efficient;

2) Frequency-domain adaptive filtering is more robust;

3) Frequency-domain adaptive filtering has good frequency selectivity and can simultaneously eliminate the noise produced by multiple interfering noise sources with different frequency content;

4) It makes it convenient to apply the method proposed by the present invention, in which the beam pointing range of the entire device is controlled by adjusting the gain of the adaptive filter.

The frequency-domain adaptive filtering method used in the present invention is briefly introduced here. The frequency-domain LMS algorithm is used, as shown in Fig. 5, where thin arrows represent time-domain signal processing and thick arrows represent frequency-domain signal processing. With frequency-domain adaptive filtering, the signal is processed frame by frame. When a long sequence is cut into blocks, processed block by block, and then recombined, the overlap-add or overlap-save method must be used to avoid aliasing; the overlap-save method is used here.

First, suppose the adaptive filter has order M and its time-domain filter coefficients are denoted w(k). Because the overlap-save method is used, the M-th order filter is extended with M zeros to avoid aliasing, giving a filter with N = 2M coefficients. After the FFT, the frequency-domain coefficient vector of the filter is:

$W (k) = FFT [\begin{matrix} w (k) \\ 0 \end{matrix}]$ (2.1)

As the above formula shows, the frequency-domain adaptive filter coefficient vector is twice as long as the time-domain coefficient vector. In the frequency-domain adaptive filtering algorithm, both the adaptive filtering and the coefficient update are carried out in the frequency domain, so the time-domain form of the filter does not appear. Note that every FFT or inverse FFT mentioned from here on is an N-point transform.

Next consider the input signal. In the following description of the frequency-domain adaptive filtering method, the input is the framed data of n(k), the signal described above whose main component is noise. Each frame has length M. The previous frame and the current frame are merged into one large frame $\vec{u}(k)$ of length N = 2M.

Applying the FFT to $\vec{u}(k)$ converts it to the frequency domain:

$U (k) = FFT [\vec{u} (k)]$ (2.3)

The input signal is then filtered using the overlap-save method, that is, by convolution in the time domain or, equivalently, multiplication in the frequency domain:

$\vec{y} (k) = [y (kM), y (kM + 1), \ldots, y (kM + M - 1)] = IFFT [U (k) * W (k)]$ (2.4)

where only the last M values of the IFFT result are kept.
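
The overlap-save filtering of equations (2.1), (2.3), and (2.4) can be checked numerically. The sketch below is not the patent's implementation: it uses a naive O(N²) DFT from the Python standard library as a stand-in for the FFT, and it assumes the merged frame is simply the previous M samples followed by the current M samples. The function names and sample values are hypothetical.

```python
import cmath

def dft(x, inverse=False):
    """Naive O(N^2) discrete Fourier transform, standing in for an N-point FFT."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = 0j
        for t in range(n):
            acc += x[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n)
        out.append(acc / n if inverse else acc)
    return out

def overlap_save_block(w, prev_frame, cur_frame):
    """Filter one frame: the last M samples of IFFT[U(k) * W(k)]."""
    m = len(w)
    W = dft(list(w) + [0.0] * m)                   # (2.1): zero-pad w to N = 2M
    U = dft(list(prev_frame) + list(cur_frame))    # (2.3): spectrum of the merged frame
    y_full = dft([u * c for u, c in zip(U, W)], inverse=True)
    return [v.real for v in y_full[m:]]            # (2.4): keep the last M values

def direct_block(w, prev_frame, cur_frame):
    """Reference: ordinary time-domain convolution for the same M outputs."""
    m = len(w)
    u = list(prev_frame) + list(cur_frame)
    return [sum(w[i] * u[m + j - i] for i in range(m)) for j in range(m)]

w = [0.5, -0.25, 0.125, 0.0625]
prev_frame = [1.0, 2.0, 3.0, 4.0]
cur_frame = [-1.0, 0.5, 2.0, -3.0]
y_fd = overlap_save_block(w, prev_frame, cur_frame)
y_td = direct_block(w, prev_frame, cur_frame)
```

The two results agree to rounding error, which illustrates why the last M values are kept: the first M values of the circular convolution wrap around the frame boundary, while the last M are free of wrap-around.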

Here $\vec{d}(k)$ denotes the signal s(k) described above, whose main component is the speech signal:

$\vec{d} (k) = [d (kM), d (kM + 1), \ldots, d (kM + M - 1)]$ (2.5)

The filtering result signal is then:

$\vec{e} (k) = [e (kM), e (kM + 1), \ldots, e (kM + M - 1)] = \vec{d} (k) - \vec{y} (k)$ (2.6)

Applying the FFT gives the frequency-domain error signal vector:

$E (k) = FFT [\begin{matrix} 0 \\ \vec{e} (k) \end{matrix}]$ (2.7)

As in time-domain LMS, the update to the adaptive filter coefficient vector is now computed from the error signal vector E(k) and the input signal vector. In the frequency domain, this update is determined by the correlation between the error signal and the input signal. Since a linear correlation is formally equivalent to a linear convolution with one sequence reversed, it can be computed with the same FFT-based fast algorithm used for time-domain convolution. According to the overlap-save method:

$\vec{φ} (k) = IFFT [U^{H} (k) * E (k)]$ (2.8)

where only the first M values of the IFFT result are kept.

Finally, $\vec{φ}(k)$ is used to update the adaptive filter coefficients. Recall that the frequency-domain filter coefficients are generated by padding the time-domain coefficients with trailing zeros and then applying the FFT. Correspondingly, the frequency-domain form of the coefficient update, W(k+1), is:

$W (k + 1) = W (k) + μ FFT [\begin{matrix} \vec{φ} (k) \\ 0 \end{matrix}]$ (2.9)

where μ is the step size of the filter.
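
One way to see equations (2.3) through (2.9) working together is the following minimal Python sketch of a single block update, run on synthetic data. It is an illustration under stated assumptions, not the patent's implementation: a naive DFT replaces the FFT, signals are real-valued, the merged frame is the previous frame followed by the current frame, and the noise path `h`, the step size 0.05, and the random input are all hypothetical.

```python
import cmath
import random

def dft(x, inverse=False):
    """Naive O(N^2) DFT, standing in for an N-point FFT."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = 0j
        for t in range(n):
            acc += x[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n)
        out.append(acc / n if inverse else acc)
    return out

def fdaf_step(W, u_big, d_frame, mu):
    """One frequency-domain LMS block update; W holds N = 2M complex bins."""
    m = len(W) // 2
    U = dft(u_big)                                              # (2.3)
    y_full = dft([a * b for a, b in zip(U, W)], inverse=True)
    y = [v.real for v in y_full[m:]]                            # (2.4): last M values
    e = [dj - yj for dj, yj in zip(d_frame, y)]                 # (2.6)
    E = dft([0.0] * m + e)                                      # (2.7)
    corr = dft([a.conjugate() * b for a, b in zip(U, E)], inverse=True)
    phi = [v.real for v in corr[:m]]                            # (2.8): first M values
    G = dft(phi + [0.0] * m)
    W_next = [wk + mu * gk for wk, gk in zip(W, G)]             # (2.9)
    return W_next, e

rng = random.Random(0)
M = 4
h = [0.6, -0.3, 0.2, 0.1]      # hypothetical path from n(k) into the main channel
W = [0j] * (2 * M)
prev = [0.0] * M
for _ in range(400):
    cur = [rng.uniform(-1.0, 1.0) for _ in range(M)]
    u_big = prev + cur
    # Main-channel frame containing only filtered noise (no speech) in this toy case.
    d = [sum(h[i] * u_big[M + j - i] for i in range(M)) for j in range(M)]
    W, e = fdaf_step(W, u_big, d, mu=0.05)
    prev = cur

# Time-domain view of the learned filter: first M values of the inverse transform.
w_est = [v.real for v in dft(W, inverse=True)[:M]]
```

In this toy setup the main channel contains nothing but filtered noise, so the error frame shrinks toward zero and `w_est` approaches `h`. When speech is present, the error frame is the speech estimate that is passed on to the next stage.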

As the description above shows, the role of adaptive filtering is to pass the beamformed auxiliary microphone signal n(k) through the adaptive filter so that it simulates the noise signal in the main microphone path, which is then removed from the main microphone signal.

However, because microphone characteristics are not ideal, the auxiliary microphone path contains some of the target voice signal. If the adaptive filter is left uncontrolled, it may cancel part of the voice signal and degrade voice quality. Many existing algorithms have this problem.

The present invention proposes controlling the beam pointing range of the entire device by adjusting the gain of the adaptive filter, so that sound within the device's beam pointing range is not attenuated by the adaptive filter. This ensures that the quality of the target voice does not degrade.

Gain control works because, in the two signals obtained after beamforming, the target voice component in the main microphone path is far larger than in the auxiliary microphone path. For the adaptive filter to cancel the target voice in the main microphone path, the magnitude of the filter coefficients would have to be fairly large; in other words, the adaptive filter would need a large gain. If the gain of the adaptive filter is limited to a threshold, the adaptive filter cannot cancel the target voice.

The method is to check the magnitude of the coefficients after every coefficient update. If it exceeds the set threshold, the adaptive filter is judged to be trying to cancel the target voice, and the filter gain is reduced to protect target voice quality. Specifically, for the frequency-domain NLMS algorithm, the coefficients are updated as described earlier.

Here W(k+1) is the updated frequency-domain adaptive filter coefficient vector, an N-dimensional complex vector, where N is the number of FFT points, that is:

W(k+1) = [W₀(k+1), W₁(k+1), ..., W_{N-1}(k+1)]^T (2.10)

The magnitude of the coefficients is measured by the complex modulus, that is:

[‖W₀(k+1)‖, ‖W₁(k+1)‖, ..., ‖W_{N-1}(k+1)‖]^T (2.11)

Among ‖W_i(k+1)‖, i = 0, 1, ..., N-1, search for the largest coefficient modulus ‖W_max(k+1)‖. If ‖W_max(k+1)‖ > Threshold, the adaptive filter is judged to be trying to cancel the target voice, and the filter gain is reduced, that is: $W (k + 1)' = W (k + 1) * \frac{Threshold}{| | W_{\max} (k + 1) | |} .$
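
The coefficient check described above amounts to a uniform rescaling of the whole vector. A minimal Python sketch, with a hypothetical function name and made-up coefficient values, might look like this:

```python
def limit_coefficients(W, threshold):
    """Rescale W so that its largest complex modulus does not exceed threshold."""
    w_max = max(abs(c) for c in W)        # ‖W_max(k+1)‖
    if w_max > threshold:
        scale = threshold / w_max         # Threshold / ‖W_max(k+1)‖
        return [c * scale for c in W]
    return list(W)                        # already within the limit: unchanged

W = [0.5 + 0.5j, 3.0 - 4.0j, -1.0j, 0.2]  # made-up coefficients, largest modulus 5
W_limited = limit_coefficients(W, threshold=2.0)
W_small = limit_coefficients([0.1, 0.2j], threshold=2.0)
```

Because every bin is multiplied by the same real factor, the relative shape and phase of the filter response are preserved; only the overall gain is reduced.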

The selection of the Threshold value is described below.

First refer to the polar patterns of the cardioid unidirectional microphone and the omnidirectional microphone used in Embodiment 1 of the present invention, shown in Figure 6. The polarity of the omnidirectional microphone does not change with angle, while that of the cardioid unidirectional microphone is largest at 0 degrees and smallest at 180 degrees. This can be expressed mathematically as follows:

$\left\{ \begin{matrix} P_{omni} = 1 \\ P_{uni} = 0.5 (1 + \cos θ) \end{matrix} \right. \quad 0 \leq θ < 360,$ where $P_{omni}$ represents the polarity of the omnidirectional microphone and $P_{uni}$ represents the polarity of the cardioid unidirectional microphone.

Then the polarity ratio of the main microphone path to the auxiliary microphone path after beamforming is

$P (θ) = \frac{P_{uni}}{P_{omni} - P_{uni}} = \frac{1 + \cos θ}{1 - \cos θ} ,$
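As a brief check of this simplification, substituting the two polar patterns defined above gives:

$P_{omni} - P_{uni} = 1 - 0.5 (1 + \cos θ) = 0.5 (1 - \cos θ) ,$

$\frac{P_{uni}}{P_{omni} - P_{uni}} = \frac{0.5 (1 + \cos θ)}{0.5 (1 - \cos θ)} = \frac{1 + \cos θ}{1 - \cos θ} .$

The denominator vanishes at θ = 0, so the ratio grows without bound as the source approaches the target direction.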

For Embodiment 2, two cardioid unidirectional microphones with opposite polarity are used, that is, their pointing directions differ by 180 degrees. This can be expressed mathematically as follows:

$\left\{ \begin{matrix} P_{uni\_ref} = 0.5 (1 + \cos (θ + 180)) \\ P_{uni} = 0.5 (1 + \cos θ) \end{matrix} \right. \quad 0 \leq θ < 360,$ where $P_{uni}$ represents the polarity of the main microphone and $P_{uni\_ref}$ represents the polarity of the auxiliary microphone.

Then the polarity ratio of the main microphone path to the auxiliary microphone path is: $P (θ) = \frac{P_{uni}}{P_{uni\_ref}} = \frac{1 + \cos θ}{1 - \cos θ} .$
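A short numerical check can confirm that both embodiments reduce to the same closed form. The sketch below is illustrative only: the function names are hypothetical, and it assumes that in Embodiment 2 the ratio is taken directly between the two cardioid patterns, since cos(θ + 180°) = −cos θ. Angles of exactly 0 degrees are avoided because the auxiliary path is zero there and the ratio is unbounded.

```python
import math


def p_uni(theta_deg):
    """Cardioid pattern of the main microphone, pointing at 0 degrees."""
    return 0.5 * (1.0 + math.cos(math.radians(theta_deg)))


def p_uni_ref(theta_deg):
    """Cardioid pattern of the auxiliary microphone, pointing at 180 degrees."""
    return 0.5 * (1.0 + math.cos(math.radians(theta_deg + 180.0)))


def ratio_embodiment_1(theta_deg):
    """Main path over auxiliary path when the auxiliary path is omni minus cardioid."""
    p_omni = 1.0
    return p_uni(theta_deg) / (p_omni - p_uni(theta_deg))


def ratio_embodiment_2(theta_deg):
    """Main path over auxiliary path for two opposed cardioids (assumed form)."""
    return p_uni(theta_deg) / p_uni_ref(theta_deg)


def closed_form(theta_deg):
    """(1 + cos(theta)) / (1 - cos(theta))."""
    c = math.cos(math.radians(theta_deg))
    return (1.0 + c) / (1.0 - c)


for angle in (30, 60, 90, 120, 150):
    print(angle, ratio_embodiment_1(angle), ratio_embodiment_2(angle), closed_form(angle))
```

Each printed row should show three values that agree up to floating-point rounding, with a ratio of about 1 at 90 degrees.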

It can be seen that Embodiments 1 and 2 lead to the same conclusion: P(θ) is at its maximum at 0 degrees, equals 1 at 90 degrees, and equals 0 at 180 degrees. In other words, comparing the energy of the main microphone path and the auxiliary microphone path after beamforming, the former is far larger than the latter at 0 degrees, far smaller at 180 degrees, and about the same at 90 degrees. Threshold can therefore be determined from P(θ), as shown in the following formula:

$Threshold = \sqrt{P (θ)}$

For example, suppose the signals to be protected lie within 30 degrees on either side of the 0-degree direction. Then: $Threshold = \sqrt{P (π / 6)} = \sqrt{\frac{1 + \cos (π / 6)}{1 - \cos (π / 6)}} \approx 3.73 .$
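The worked value can be reproduced with a small helper. This is a sketch only; the function name `threshold_for_angle` and the choice of degrees as the input unit are assumptions made for illustration, and the square-root form follows the worked example.

```python
import math


def threshold_for_angle(theta_deg):
    """Threshold for a protected half-angle theta (in degrees, 0 < theta <= 180)."""
    c = math.cos(math.radians(theta_deg))
    return math.sqrt((1.0 + c) / (1.0 - c))


# Protect +/-30 degrees around the target direction.
print(round(threshold_for_angle(30.0), 2))  # 3.73
```

A wider protected range gives a smaller threshold: at 90 degrees the value is about 1, so the filter gain is limited more strictly.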

Single-channel denoising mainly takes three forms: Wiener filtering, spectral subtraction, and short-time spectrum adjustment. The single-channel denoising module of the present invention uses short-time spectrum adjustment to remove residual noise. The technique is widely described in the literature, so its description is omitted here.

The flow chart of the method of the present invention is shown in Figure 7. Its details are already covered in the device description above and are not repeated here.

The above are only preferred embodiments of the present invention and are not intended to limit it. Any modification, equivalent replacement, or the like made within the spirit and principles of the present invention shall fall within its scope of protection.

Claims

1. A noise elimination device, characterized in that it comprises:

a target voice signal acquisition module, used to obtain a sound signal whose main component is a voice signal;

a noise signal acquisition module, used to obtain a sound signal whose main component is a noise signal;

an adaptive filtering module, used to process the output signal of the noise signal acquisition module so as to simulate the noise component in the output of the target voice signal acquisition module, and then to subtract the simulated signal from the output of the target voice signal acquisition module, obtaining a sound signal with the noise removed;

a single-channel denoising module, used to perform single-channel denoising on the output of the adaptive filtering module, obtaining a signal with the noise further removed,

wherein the target voice signal acquisition module is a cardioid unidirectional microphone pointing in the direction of the target voice signal, used to pick up a sound signal whose main component is the target voice signal;

the noise signal acquisition module comprises an omnidirectional microphone, a gain adjustment unit, and a subtractor; the omnidirectional microphone is used to pick up sound signals from all directions; the gain adjustment unit is used to adjust the gain of the output signal of the omnidirectional microphone so that the omnidirectional microphone and the cardioid unidirectional microphone have the same gain for sound from the target voice direction; the subtractor is used to take the difference between the gain-adjusted signal of the omnidirectional microphone and the output of the cardioid unidirectional microphone, obtaining a sound signal whose main component is a noise signal; or, the noise signal acquisition module is a cardioid unidirectional microphone pointing opposite to the direction of the target voice signal, used to pick up a sound signal whose main component is a noise signal.

2. The device according to claim 1, characterized in that the adaptive filtering module is a frequency-domain adaptive filter.

3. The device according to claim 2, characterized in that the frequency-domain adaptive filter includes a coefficient adjustment unit, used to check the magnitude of the filter coefficients and to reduce their values when the filter coefficients are too large.

4. The device according to claim 3, characterized in that the value of the filter coefficients after reduction is given by:

$W (k + 1)' = W (k + 1) * \frac{Threshold}{\| W_{\max} (k + 1) \|} ,$

where $W (k + 1) = [W_0 (k + 1), W_1 (k + 1), \ldots, W_{N-1} (k + 1)]$ is the coefficient vector of the frequency-domain adaptive filter, an N-dimensional complex vector, N being the coefficient length;

$W (k + 1)'$ is the value of the filter coefficients after reduction;

$\| W_{\max} (k + 1) \|$ is the maximum of the moduli of the complex numbers $W_0 (k + 1), W_1 (k + 1), \ldots, W_{N-1} (k + 1)$;

and

$Threshold = \sqrt{\frac{1 + \cos (θ)}{1 - \cos (θ)}} ,$

where θ is the incidence angle of the sound signal to be protected.

5. A dual-microphone noise elimination method, characterized in that it comprises:

obtaining a target voice signal, namely a sound signal s(k) whose main component is a voice signal;

obtaining a noise signal, namely a sound signal n(k) whose main component is noise;

an adaptive filtering step of filtering n(k) to simulate the noise component in s(k), and then subtracting the simulated noise from s(k);

a single-channel denoising step of performing single-channel denoising on the result of the adaptive filtering step, obtaining a signal with the noise removed,

wherein obtaining the target voice signal consists of using a unidirectional microphone to pick up the sound signal f(k) from the target voice direction, i.e. s(k) = f(k);

obtaining the noise signal comprises: using an omnidirectional microphone to pick up the sound signal b(k) from all directions; adjusting the gain α of b(k) so that it matches the gain of f(k) in the target voice direction; and subtracting f(k) from the gain-adjusted signal α * b(k) to obtain the sound signal n(k) whose main component is a noise signal, i.e. n(k) = α * b(k) - f(k); or, obtaining the noise signal consists of using a cardioid unidirectional microphone pointing opposite to the target voice direction to pick up the sound signal b'(k), i.e. n(k) = b'(k).

6. The method according to claim 5, characterized in that the adaptive filtering step uses a frequency-domain adaptive filtering method.

7. The method according to claim 6, characterized in that the magnitude of the filter coefficients is checked after each filter coefficient update, and their values are reduced when the filter coefficients are too large.

8. The method according to claim 7, characterized in that the value of the filter coefficients after reduction is given by:

$W (k + 1)' = W (k + 1) * \frac{Threshold}{\| W_{\max} (k + 1) \|} ,$

where $W (k + 1)'$ is the value of the filter coefficients after reduction;

$\| W_{\max} (k + 1) \|$ is the maximum of the moduli of the complex numbers $W_0 (k + 1), W_1 (k + 1), \ldots, W_{N-1} (k + 1)$;

and

$Threshold = \sqrt{\frac{1 + \cos (θ)}{1 - \cos (θ)}} ,$

where θ is the maximum incidence angle of the sound signal to be protected.