CN114495962A - An audio noise reduction method, apparatus, system and computer-readable storage medium

Info

Publication number: CN114495962A
Application number: CN202210034896.XA
Authority: CN
Inventors: 黎绍鑫; 郝正海
Original assignee: Hefei Ustc Iflytek Co ltd
Current assignee: Hefei Ustc Iflytek Co ltd
Priority date: 2022-01-12
Filing date: 2022-01-12
Publication date: 2022-05-13
Anticipated expiration: 2042-01-12
Also published as: CN114495962B

Abstract

The application discloses an audio noise reduction method, apparatus, system, and computer-readable storage medium. The audio noise reduction method comprises: calculating a power spectrum of the data to be denoised; initializing noise estimation parameters based on the power spectrum to obtain a noise spectrum of the noise data; performing minimum tracking on the initialized noise estimation parameters in a first time period to obtain a first array; calculating an a posteriori signal-to-noise ratio and an a priori signal-to-noise ratio of the data to be denoised based on the first array; performing minimum tracking on the initialized noise estimation parameters in a second time period to obtain a second array; calculating a gain estimate for the noise-free data based on the second array, the speech-absence probability estimate, the speech-presence probability estimate, and the noise power spectrum estimate; and denoising the data based on the gain estimate, the noise spectrum, and the noise power spectrum estimate to obtain noise-free data. In this way, the noise reduction effect on audio data can be improved.

Description

An audio noise reduction method, apparatus, system and computer-readable storage medium

Technical Field

The present application relates to the technical field of audio processing, and in particular to an audio noise reduction method, apparatus, system, and computer-readable storage medium.

Background

During voice communication, audio data is subject to various kinds of interference to varying degrees, which degrades the quality and naturalness of the audio. It is therefore necessary to extract original audio data that is as clean as possible from the noisy audio data, that is, to perform audio noise reduction on the noisy audio data so as to achieve noise immunity. Existing approaches to processing audio data, however, do not achieve a satisfactory noise reduction effect.

Summary of the Invention

The present application provides an audio noise reduction method, apparatus, system, and computer-readable storage medium that can improve the noise reduction effect on audio data.

To solve the above technical problem, one technical solution adopted in the present application is to provide an audio noise reduction method comprising: acquiring data to be denoised and calculating a power spectrum of the data to be denoised, the data to be denoised comprising noise data and noise-free data; initializing noise estimation parameters based on the power spectrum to obtain a noise spectrum of the noise data; performing minimum tracking on the initialized noise estimation parameters in a first time period to obtain a first array; calculating an a posteriori signal-to-noise ratio and an a priori signal-to-noise ratio of the data to be denoised based on the first array; calculating a speech-absence probability estimate, a speech-presence probability estimate, and a noise power spectrum estimate based on the confidence of the a posteriori signal-to-noise ratio; performing minimum tracking on the initialized noise estimation parameters in a second time period to obtain a second array; calculating a gain estimate for the noise-free data based on the second array, the speech-absence probability estimate, the speech-presence probability estimate, and the noise power spectrum estimate; and denoising the data to be denoised based on the gain estimate, the noise spectrum, and the noise power spectrum estimate to obtain the noise-free data.

To solve the above technical problem, another technical solution adopted in the present application is to provide an audio noise reduction apparatus comprising a memory and a processor connected to each other, wherein the memory stores a computer program which, when executed by the processor, implements the audio noise reduction method of the above technical solution.

To solve the above technical problem, another technical solution adopted in the present application is to provide an audio noise reduction apparatus that performs a receiving task and a noise reduction task simultaneously. The apparatus comprises a scheduling circuit and a noise reduction circuit. The scheduling circuit receives the data to be denoised corresponding to the receiving task and splits it into multiple channels of sub-audio data. The noise reduction circuit is connected to the scheduling circuit and denoises all sub-audio data corresponding to the noise reduction task in parallel to obtain denoised audio data, wherein the noise reduction circuit implements the audio noise reduction method of the above technical solution.

To solve the above technical problem, another technical solution adopted in the present application is to provide an audio noise reduction system comprising an audio acquisition apparatus and an audio noise reduction apparatus. The audio acquisition apparatus captures sound in a target scene to obtain data to be denoised. The audio noise reduction apparatus is connected to the audio acquisition apparatus and denoises the data to be denoised to obtain denoised audio data, wherein the audio noise reduction apparatus is the audio noise reduction apparatus of the above technical solution.

To solve the above technical problem, another technical solution adopted in the present application is to provide a computer-readable storage medium storing a computer program which, when executed by a processor, implements the audio noise reduction method of the above technical solution.

With the above solutions, the beneficial effects of the present application are as follows. By pipelining the steps of the audio noise reduction method, the method increases the computation speed of audio noise reduction and thus its efficiency. In addition, by performing minimum tracking on the initialized noise estimation parameters multiple times, it improves the accuracy of the subsequently calculated gain estimate and thereby the noise reduction effect on the data to be denoised.

Description of the Drawings

To explain the technical solutions in the embodiments of the present application more clearly, the drawings used in the description of the embodiments are briefly introduced below. The drawings described below are only some embodiments of the present application; those of ordinary skill in the art can obtain other drawings from them without creative effort. In the drawings:

FIG. 1 is a schematic flowchart of the audio noise reduction method provided by the present application;

FIG. 2 is a schematic diagram of the speech noise reduction provided by the present application;

FIG. 3 is a schematic diagram of the implementation process of the FFT and IFFT provided by the present application;

FIG. 4 is a schematic diagram of the butterfly operation in the FFT and IFFT provided by the present application;

FIG. 5 is a schematic diagram of processing the result of the IFFT operation provided by the present application;

FIG. 6 is a schematic diagram of the noise spectrum estimation principle provided by the present application;

FIG. 7 is a schematic diagram of the gain calculation principle provided by the present application;

FIG. 8 is a schematic flowchart of an embodiment of the audio noise reduction apparatus provided by the present application;

FIG. 9 is a schematic structural diagram of an embodiment of the audio noise reduction apparatus provided by the present application;

FIG. 10 is a schematic structural diagram of another embodiment of the audio noise reduction apparatus provided by the present application;

FIG. 11 is a schematic diagram of the connection between the splitting module and the scheduling module in the embodiment shown in FIG. 10;

FIG. 12 is a schematic structural diagram of an embodiment of the audio noise reduction system provided by the present application;

FIG. 13 is a schematic structural diagram of an embodiment of the computer-readable storage medium provided by the present application.

Detailed Description

The present application is described in further detail below with reference to the drawings and embodiments. The following embodiments only illustrate the present application and do not limit its scope. Likewise, the following embodiments are only some, not all, of the embodiments of the present application; all other embodiments obtained by those of ordinary skill in the art without creative effort fall within the scope of protection of the present application.

Reference in this application to an "embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment can be included in at least one embodiment of the application. Appearances of the phrase in various places in the specification do not necessarily all refer to the same embodiment, nor to separate or alternative embodiments that are mutually exclusive of other embodiments. Those skilled in the art understand, explicitly and implicitly, that the embodiments described herein may be combined with other embodiments.

It should be noted that the terms "first", "second", and "third" in this application are used for descriptive purposes only and should not be understood as indicating or implying relative importance or implicitly specifying the number of the technical features referred to. Thus, a feature qualified by "first", "second", or "third" may explicitly or implicitly include at least one such feature. In the description of the present application, "a plurality of" means at least two, such as two or three, unless expressly and specifically defined otherwise. Furthermore, the terms "comprising" and "having", and any variations thereof, are intended to cover non-exclusive inclusion. For example, a process, method, system, product, or device that comprises a series of steps or units is not limited to the listed steps or units, but optionally also includes steps or units that are not listed, or optionally also includes other steps or units inherent to such processes, methods, products, or devices.

Referring to FIG. 1, FIG. 1 is a schematic flowchart of an embodiment of the audio noise reduction method provided by the present application. The audio noise reduction method includes:

Step 11: Acquire the data to be denoised and calculate the power spectrum of the data to be denoised.

The data to be denoised includes noise data and noise-free data. Specifically, the power spectrum of the first half of the data to be denoised (e.g., the first 128 data points) can be calculated first to obtain a first power spectrum; the power spectrum of the second half (e.g., the last 128 data points) is then derived from it by symmetry to obtain a second power spectrum; and the first and second power spectra are combined to obtain the power spectrum. Before the power spectrum is calculated, a fast Fourier transform (FFT) can be applied to the data to be denoised. The power spectrum of the first half of the FFT output is then calculated, and that of the second half is derived by symmetry, giving the power spectrum of the whole data. Because the power spectrum does not have to be calculated for every data point, computational efficiency is improved.
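The symmetry argument can be illustrated with a minimal Python sketch. It assumes a real-valued frame, for which $|Y[N-k]|^2 = |Y[k]|^2$, and uses a naive DFT in place of the FFT so that the example has no dependencies; the function names are hypothetical and are not taken from the application.

```python
import cmath

def dft_bin(x, k):
    """Single DFT bin of sequence x (naive O(N) sum, for illustration only)."""
    n = len(x)
    return sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))

def power_spectrum_by_symmetry(frame):
    """Power spectrum of a real frame, computing only bins 0..N/2 directly."""
    n = len(frame)
    # First part: bins 0..N/2 computed explicitly as Real^2 + Imag^2.
    half = [abs(dft_bin(frame, k)) ** 2 for k in range(n // 2 + 1)]
    # Second part: bins N/2+1..N-1 mirror bins N/2-1..1 for real input.
    return half + half[n // 2 - 1:0:-1]
```

For a 256-point frame this computes 129 bins directly and copies the remaining 127, which matches the 129-entry power spectrum memory mentioned later in the description.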

Step 12: Initialize the noise estimation parameters based on the power spectrum to obtain the noise spectrum of the noise data.

After the power spectrum of the data to be denoised has been calculated, the noise estimation parameters can be initialized based on the power spectrum to obtain the noise spectrum of the noise data; that is, the power spectrum is assigned to the noise spectrum. Specifically, the power spectrum can first be smoothed with a three-point mean filter, and the noise estimation parameters are then initialized from the smoothed power spectrum to obtain the noise spectrum. Three-point mean filtering is a conventional operation in this field and is not described in detail here.

Step 13: Perform minimum tracking on the initialized noise estimation parameters in the first time period to obtain the first array.

In a specific implementation, the minimum of the initialized noise estimation parameters can be calculated first to obtain a fourth array, and minimum tracking is then performed on the fourth array to obtain the first array. Minimum tracking on the fourth array checks its accuracy, that is, whether the fourth array really holds the minimum of the initialized noise estimation parameters. If it does not, minimum tracking updates the fourth array with the actual minimum.

Step 14: Based on the first array, calculate the a posteriori signal-to-noise ratio and the a priori signal-to-noise ratio of the data to be denoised.

The a posteriori signal-to-noise ratio can be calculated from the first array, and its minimum is taken to obtain a third array; the a priori signal-to-noise ratio is then calculated from the third array.

Step 15: Based on the confidence of the a posteriori signal-to-noise ratio, calculate the speech-absence probability estimate, the speech-presence probability estimate, and the noise power spectrum estimate.

Step 16: Perform minimum tracking on the initialized noise estimation parameters in the second time period to obtain the second array.

The second time period follows the first time period; that is, the second time period begins after the first time period ends. Minimum tracking of the initialized noise estimation parameters is performed once in the first time period and once more in the second time period. The minimum tracking operation in step 16 is the same as that in step 13 and is not repeated here.

In a specific implementation, minimum tracking of the initialized noise estimation parameters can be performed two or more times. The data to be denoised may contain multiple frames of audio data, and minimum tracking can be performed once every preset number of frames. Performing minimum tracking multiple times improves the accuracy of the subsequently calculated gain estimate and thereby the noise reduction effect on the data to be denoised.
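As an illustration of minimum tracking repeated every preset number of frames, the sketch below follows the windowed scheme commonly used in minima-controlled noise estimators. The application does not spell out its update rule, so the two-buffer structure, the class name, and the `window_frames` parameter are all assumptions.

```python
class MinimumTracker:
    """Per-bin minimum tracking that restarts every `window_frames` frames."""

    def __init__(self, first_frame, window_frames):
        self.s_min = list(first_frame)   # running minimum per frequency bin
        self.s_tmp = list(first_frame)   # minimum within the current window
        self.window_frames = window_frames
        self.count = 0

    def update(self, smoothed_power):
        """Feed one smoothed power spectrum; return the tracked minima."""
        self.count += 1
        self.s_min = [min(a, b) for a, b in zip(self.s_min, smoothed_power)]
        self.s_tmp = [min(a, b) for a, b in zip(self.s_tmp, smoothed_power)]
        if self.count % self.window_frames == 0:
            # Window boundary: forget minima older than the last window.
            self.s_min = [min(a, b) for a, b in zip(self.s_tmp, smoothed_power)]
            self.s_tmp = list(smoothed_power)
        return self.s_min
```

Restarting the window lets the tracked minimum rise again when the noise floor increases, which a plain running minimum cannot do.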

Step 17: Calculate the gain estimate for the noise-free data based on the second array, the speech-absence probability estimate, the speech-presence probability estimate, and the noise power spectrum estimate.

The speech-absence probability estimate is the estimate of the a priori probability that speech is absent, and the speech-presence probability estimate is the estimate of the conditional probability that speech is present.

Step 18: Based on the gain estimate, the noise spectrum, and the noise power spectrum estimate, denoise the data to be denoised to obtain the noise-free data.

In a specific implementation, after the second array is obtained, the speech-absence probability estimate can first be updated based on the second array to obtain an updated speech-absence probability estimate. The noise spectrum is then updated based on the noise power spectrum estimate to obtain an updated noise spectrum. The gain estimate is calculated from the a priori signal-to-noise ratio, the a posteriori signal-to-noise ratio, the updated speech-absence probability estimate, and the speech-presence probability estimate. Finally, the data to be denoised is denoised based on the gain estimate and the updated noise spectrum to obtain the noise-free data.

Specifically, updating the speech-absence probability estimate based on the second array may include: calculating a first speech-absence probability estimate based on the second array; smoothing the first estimate with a three-point mean filter to obtain a second speech-absence probability estimate; windowing the second estimate to obtain a third speech-absence probability estimate; and applying a numerical transformation to the third estimate to obtain the updated speech-absence probability estimate.

By pipelining the steps of the audio noise reduction method, this embodiment increases the computation speed of audio noise reduction and thus its efficiency. In addition, by performing minimum tracking on the initialized noise estimation parameters multiple times, it improves the accuracy of the subsequently calculated gain estimate and thereby the noise reduction effect on the data to be denoised.

In a specific implementation, the audio noise reduction method of the above embodiment can also be used to process the data to be denoised in parallel and so increase the noise reduction rate. Specifically, the data to be denoised is first split into multiple channels of sub-audio data; all sub-audio data is then denoised in parallel with the noise reduction method to obtain denoised sub-audio data; and the denoised sub-audio data is merged to obtain the noise-free data. The noise reduction method here is the audio noise reduction method of the above embodiment.
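The split, parallel denoise, and merge flow can be sketched in software as follows. This is only an analogy for the hardware parallelism described here: it assumes the input is channel-interleaved, and `denoise_channel` is a hypothetical placeholder that stands in for the noise reduction method.

```python
from concurrent.futures import ThreadPoolExecutor

def split_channels(interleaved, num_channels):
    """Split interleaved samples into one list per channel."""
    return [interleaved[c::num_channels] for c in range(num_channels)]

def merge_channels(channels):
    """Re-interleave per-channel sample lists into a single stream."""
    return [sample for frame in zip(*channels) for sample in frame]

def denoise_channel(samples):
    """Placeholder for the noise reduction method (here: simple attenuation)."""
    return [s * 0.5 for s in samples]

def denoise_parallel(interleaved, num_channels):
    """Denoise every channel concurrently, then merge the results."""
    channels = split_channels(interleaved, num_channels)
    with ThreadPoolExecutor(max_workers=num_channels) as pool:
        cleaned = list(pool.map(denoise_channel, channels))
    return merge_channels(cleaned)
```

`pool.map` returns results in channel order, so the merge step restores the original interleaving regardless of which worker finishes first.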

Further, the audio noise reduction method of the above embodiment can be applied in a field-programmable gate array (FPGA) to process the data to be denoised in parallel. The steps of the audio noise reduction method on an FPGA platform are described in detail below.

First, as shown in FIG. 2, the implementation of speech noise reduction is divided into the following main parts: time-domain windowing, FFT, the noise reduction operation (Log-Spectral Amplitude estimator, LSA), inverse fast Fourier transform (IFFT), weighted overlap-add, and storage. Specifically, before the IFFT operation, the audio data (i.e., the data to be denoised in the above embodiment) can be preprocessed, for example by removing low frequencies, removing high frequencies, or multiplying by the gain coefficient. After the IFFT operation, the time-domain audio data is restored by weighted overlap-add, windowing, and removal of scaling. Because the input audio data is split into frames before the FFT operation (one frame is divided into four frames), weighted overlap-add and related processing are needed after the IFFT operation to convert these four frames back into one frame, so that the amount of output audio data matches the amount of input audio data. The audio data after weighted overlap-add is stored in a manner similar to the input audio data before the FFT operation, since consecutive data are correlated.
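A minimal sketch of the framing and windowing stage is given below. It assumes a 256-sample frame advanced by 64 samples, so that each block of 256 input samples contributes to four overlapping frames, and it assumes a Hann window; the application does not name the window it uses, and the function names are hypothetical.

```python
import math

FRAME_LEN = 256  # FFT size used in the description
HOP = 64         # assumed frame advance (one quarter of the frame length)

def hann(n):
    """Periodic Hann window of length n (an assumed window choice)."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / n) for i in range(n)]

def split_into_frames(samples, frame_len=FRAME_LEN, hop=HOP):
    """Cut the signal into overlapping frames and apply the window to each."""
    window = hann(frame_len)
    frames = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        chunk = samples[start:start + frame_len]
        frames.append([c * w for c, w in zip(chunk, window)])
    return frames
```

Each frame returned here would then go through the FFT, the noise reduction operation, and the IFFT before being recombined by overlap-add.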

Specifically, the audio data to be denoised can be a noisy signal containing a noise signal and a clean signal; audio noise reduction is achieved by extracting the clean signal. The steps of speech noise reduction may include: 1) framing and windowing the input audio data (i.e., the noisy signal); 2) performing an FFT on each frame of the noisy signal; 3) estimating the a posteriori signal-to-noise ratio first and then estimating the a priori signal-to-noise ratio with the decision-directed method, where the energy spectrum of the noise is estimated in non-speech segments (such as the first few frames before speech begins, or gaps in speech); 4) estimating the amplitude of the enhanced signal with the optimal MMSE-LSA estimator (equivalent to the gain calculation step described below); 5) reconstructing the enhanced signal spectrum and performing an IFFT on it to obtain the corresponding time-domain signal of the enhanced speech (i.e., the denoised audio data).
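Step 3 can be illustrated with the textbook decision-directed estimator. The sketch below assumes the usual form, in which the a priori SNR is a weighted sum of the previous frame's clean-speech estimate and the current a posteriori SNR minus one; the smoothing factor `alpha`, the floor `xi_min`, and the function names are illustrative assumptions, not values from the application.

```python
def posterior_snr(noisy_power, noise_power, floor=1e-12):
    """A posteriori SNR per bin: noisy power divided by noise power."""
    return [y / max(n, floor) for y, n in zip(noisy_power, noise_power)]

def decision_directed_prior_snr(gamma, prev_gain, prev_gamma,
                                alpha=0.92, xi_min=1e-3):
    """A priori SNR per bin from the decision-directed recursion."""
    xi = []
    for g, g_prev, gamma_prev in zip(gamma, prev_gain, prev_gamma):
        # Clean-speech SNR implied by the previous frame's gain.
        previous_clean = g_prev * g_prev * gamma_prev
        # Instantaneous estimate from the current frame, clipped at zero.
        current = max(g - 1.0, 0.0)
        xi.append(max(alpha * previous_clean + (1.0 - alpha) * current, xi_min))
    return xi
```

A value of `alpha` close to 1 favors the previous frame and smooths the estimate over time, while a smaller value reacts faster to speech onsets.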

(I) The human ear's perception of sound intensity is proportional to the logarithm of the spectral amplitude. Assuming that the noise signal and the speech signal are uncorrelated, the noisy signal can be expressed as $y = x + d$, where $y$ is the noisy speech, $x$ is the clean speech, and $d$ is additive stationary noise.

First, let $Y_k$, $X_k$, and $D_k$ denote the $k$-th spectral components of $y$, $x$, and $d$ after the FFT operation. $Y_k$ and $X_k$ can be expressed using formulas (1) and (2):

In formulas (1) and (2), $R_k$ and $B_k$ are the amplitudes of the noisy speech and the clean speech at frequency bin $k$, and $\theta_k$ and $\alpha_k$ are the phases of the noisy speech and the clean speech at frequency bin $k$.
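The formula images for (1) and (2) are not reproduced in this text. Given the amplitude and phase definitions above, they presumably take the usual polar form, shown here as an assumption:

```latex
\begin{align}
Y_k &= R_k \, e^{j\theta_k} \tag{1} \\
X_k &= B_k \, e^{j\alpha_k} \tag{2}
\end{align}
```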

$B_k$ is then estimated from $Y_k$ using formulas (3) to (6):
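The images for formulas (3) to (6) are likewise missing from this text. For orientation only, the standard MMSE log-spectral amplitude estimator from the literature has the following form; whether the application's formulas match it term by term cannot be confirmed from this text. In particular, the surrounding description applies the gain to $R_k^2$, so its formula (5) may be stated on the power rather than the amplitude.

```latex
\begin{align}
\hat{B}_k &= \exp\!\left( E\!\left[ \ln B_k \mid Y_k \right] \right) \tag{3} \\
\hat{B}_k &= \frac{\xi_k}{1+\xi_k}
  \exp\!\left( \frac{1}{2} \int_{v_k}^{\infty} \frac{e^{-t}}{t}\, dt \right) R_k \tag{4} \\
\hat{B}_k &= G(\xi_k, \gamma_k)\, R_k \tag{5} \\
G(\xi_k, \gamma_k) &= \frac{\xi_k}{1+\xi_k}
  \exp\!\left( \frac{1}{2} \int_{v_k}^{\infty} \frac{e^{-t}}{t}\, dt \right) \tag{6}
\end{align}
```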

In formula (3), $\hat{B}_k$ is the estimate of $B_k$; formula (4) can be derived from formula (3). In formula (6), $G(\xi_k, \gamma_k)$ is the gain function, and $\xi_k$ and $\gamma_k$ are the a priori signal-to-noise ratio (SNR) and the a posteriori SNR, respectively, with $v_k = \frac{\xi_k}{1+\xi_k}\gamma_k$, $\xi_k = \lambda_s(k)/\lambda_n(k)$, and $\gamma_k = R_k^2/\lambda_n(k)$, where $\lambda_s$ is the variance of the clean speech and $\lambda_n$ is the variance of the noise. According to formula (5), multiplying $R_k^2$ by the gain function of formula (6) yields the clean speech estimate.
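To make the gain function concrete, the sketch below evaluates the standard log-spectral amplitude gain $G = \frac{\xi}{1+\xi}\exp\left(\frac{1}{2}E_1(v)\right)$ with $v = \frac{\xi}{1+\xi}\gamma$, where $E_1$ is the exponential integral. It assumes this textbook form applies; the series and continued-fraction evaluation of $E_1$ is a generic numerical method, not the application's FPGA implementation, and the function names are hypothetical.

```python
import math

EULER_GAMMA = 0.5772156649015329

def exp_integral_e1(x):
    """E1(x): integral of exp(-t)/t from x to infinity, for x > 0."""
    if x <= 0.0:
        raise ValueError("E1 is only evaluated here for x > 0")
    if x <= 1.0:
        # Power series: -gamma - ln(x) - sum_{k>=1} (-x)^k / (k * k!)
        total = -EULER_GAMMA - math.log(x)
        term = 1.0  # holds (-x)^k / k!
        for k in range(1, 60):
            term *= -x / k
            total -= term / k
        return total
    # Continued fraction evaluated with the modified Lentz method.
    b = x + 1.0
    c = 1e300
    d = 1.0 / b
    h = d
    for i in range(1, 200):
        a = -float(i * i)
        b += 2.0
        d = 1.0 / (a * d + b)
        c = b + a / c
        delta = c * d
        h *= delta
        if abs(delta - 1.0) < 1e-15:
            break
    return h * math.exp(-x)

def lsa_gain(xi, gamma):
    """Log-spectral amplitude gain for a priori SNR xi and a posteriori SNR gamma."""
    ratio = xi / (1.0 + xi)
    v = max(ratio * gamma, 1e-10)  # guard against v == 0
    return ratio * math.exp(0.5 * exp_integral_e1(v))
```

For example, `lsa_gain(1.0, 2.0)` gives $v = 1$ and a gain of roughly 0.558, so a bin with moderate SNR is attenuated by about half.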

(II) The FFT and IFFT operations can be implemented by porting C source code, using radix-2 decimation in frequency or decimation in time. As shown in FIG. 3, the FFT and IFFT operations are implemented on the FPGA platform as follows.

The original audio data before the first butterfly operation can be stored in the Norml_ram (random access memory) shown in FIG. 3. Norml_ram may contain one real-part memory and one imaginary-part memory (neither shown in the figure), each with a size of 256 × 16 bits. DATA RAM0 and DATA RAM1 shown in FIG. 3 store the real-part data and the imaginary-part data of the FFT or IFFT operation, respectively, and each may have a size of 256 × 32 bits.

For the first butterfly operation, the audio data can be fed directly into the butterfly operation module without being buffered in memory, which saves computation time. The real part and the imaginary part of the input audio data are each 32 bits wide, and the real part and the imaginary part of the FFT- or IFFT-processed output audio data are each 16 bits wide. During an FFT or IFFT operation, the sine and cosine values can be obtained by looking up an FFT cosine/sine table or an IFFT cosine/sine table, and the order of the butterfly operation data is adjusted accordingly. For example, the FFT cosine/sine table is FFTg_FFTCos or g_FFTReverse, and the sizes of FFTg_FFTCos and g_FFTReverse may be 512 × 16 bits and 256 × 16 bits, respectively.
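The table-driven approach can be sketched in floating point as a radix-2 decimation-in-time FFT that reads its twiddle factors and bit-reversed ordering from precomputed tables. The table layout and the names `make_tables` and `fft_radix2` are hypothetical; the fixed-point formats, scaling, and RAM organization of the FPGA design are not modeled.

```python
import math

def make_tables(n):
    """Bit-reversal index table plus cosine and sine tables for an n-point FFT."""
    bits = n.bit_length() - 1
    reverse = [int(format(i, f"0{bits}b")[::-1], 2) for i in range(n)]
    cos_table = [math.cos(2 * math.pi * k / n) for k in range(n // 2)]
    sin_table = [math.sin(2 * math.pi * k / n) for k in range(n // 2)]
    return reverse, cos_table, sin_table

def fft_radix2(real, imag, inverse=False):
    """Radix-2 FFT or IFFT; the length must be a power of two (at least 2)."""
    n = len(real)
    reverse, cos_t, sin_t = make_tables(n)
    # Reorder the input so the butterflies can run in place.
    re = [real[reverse[i]] for i in range(n)]
    im = [imag[reverse[i]] for i in range(n)]
    sign = 1.0 if inverse else -1.0
    size = 2
    while size <= n:
        half, step = size // 2, n // size
        for start in range(0, n, size):
            for j in range(half):
                wr, wi = cos_t[j * step], sign * sin_t[j * step]  # table lookup
                a, b = start + j, start + j + half
                tr = re[b] * wr - im[b] * wi
                ti = re[b] * wi + im[b] * wr
                re[b], im[b] = re[a] - tr, im[a] - ti
                re[a], im[a] = re[a] + tr, im[a] + ti
        size *= 2
    if inverse:
        re = [v / n for v in re]
        im = [v / n for v in im]
    return re, im
```

The same butterfly loop serves both directions; only the sign of the sine lookup and the final division by the length differ for the inverse transform.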

Further, as shown in FIG. 4, in the FFT butterfly operation the values for the first 128 points and the last 128 points of the audio data are calculated separately. The FFT values of the first 128 points are calculated first, with the first 64 points being scaled and processed with complex arithmetic; the FFT values of the last 128 points are then derived by symmetry. In the IFFT butterfly operation, by contrast, the 256 points of audio data are calculated directly in two butterfly passes to obtain the IFFT values of all 256 points. The multiplications and additions in the FFT/IFFT operations are implemented with multiplier intellectual property (IP) cores and adder IP cores. Because signed arithmetic is involved, the latency of the multipliers and adders can be set to 2 clock cycles.

As shown in FIG. 5, the data after the IFFT operation consists of 256 points. The current result is added to the previous result, and the lowest 64 points (positions 0 to 63) are output as the result of the current run of the whole noise reduction algorithm. After the result is output, the data at positions 64 to 255 is moved to positions 0 to 191, and the highest 64 positions are filled with zeros for use in the next run.
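The buffer handling described for FIG. 5 can be sketched as follows. The buffer length of 256 and the output length of 64 come from the description; the function name and the list-based buffer are illustrative assumptions.

```python
FRAME_LEN = 256
HOP = 64

def overlap_add_step(buffer, ifft_frame, hop=HOP):
    """Add the new IFFT frame to the buffer, emit `hop` samples, shift the rest."""
    summed = [b + f for b, f in zip(buffer, ifft_frame)]
    output = summed[:hop]                     # positions 0..63 are finished
    new_buffer = summed[hop:] + [0.0] * hop   # move 64..255 down, zero-fill the top
    return output, new_buffer
```

Starting from an all-zero buffer and calling the function once per IFFT frame yields 64 output samples per call, so the output rate matches the input rate when frames advance by 64 samples.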

(3) The noise reduction operation essentially consists of computing the noise spectrum estimate and the gain. Specifically, noise reduction can be implemented based on the noise spectrum estimation principle and the gain calculation principle shown in FIG. 6 and FIG. 7, where |Y|² denotes the energy of the speech signal (i.e., the data after the FFT operation), λ_d denotes the noise spectrum, λ̂_d denotes the noise spectrum estimate, G denotes the gain, and Y_a² denotes the power spectrum. The steps for computing the noise spectrum estimate λ̂_d and the gain G (i.e., the audio noise reduction method of the above embodiment) are described below:

1) Compute the power spectrum of the first 128 points of data, i.e., Y_a² = Real² + Image², and restore the data according to the scaling value used during the FFT operation to obtain the power spectrum values, which are stored in the corresponding memory; the memory size can be 129*32bit. At the same time, apply three-point mean filtering to the power spectrum, with weights of 1/4, 1/2 and 1/4 for the previous point, the middle point and the next point respectively, so that S_f[i] = (Ya²[i-1]>>2) + (Ya²[i]>>1) + (Ya²[i+1]>>2). Then set the initial values of some intermediate array variables to the values obtained after three-point mean filtering.

2) Initialize the noise spectrum from the power spectrum, i.e., λ_d = Y_a², and then initialize some intermediate array variables (i.e., the noise estimation parameters) in memory (Blockram) according to the initialized noise spectrum, for example nShiftYa2 (m_nShiftYa2) and eta (m_eta).

3) Because the number of adders and multipliers is limited, the minimum values of some array variables can be computed first, and the first minimum search over the array variables is then performed. The array variables may be computed or assigned differently in the first run, in the first 14 frames, and after frame 14.

4) Compute the minimum of the posterior signal-to-noise ratio, and compute the prior signal-to-noise ratio.

5) Based on the confidence (i.e., the value range) of the posterior signal-to-noise ratio, compute the a priori speech absence probability estimate and the conditional speech presence probability estimate, and compute the noise power spectrum estimate by recursive averaging.

6) Perform the second minimum search over the array variables. Apart from special handling of the data in frame 10, a minimum is computed once every 10 frames of data; 5 values are kept for each frequency bin and stored in five 129*32bit memories (Blockram), from which the minimum for each frequency bin is obtained.

7) Compute the a priori speech absence probability estimate: update the noise spectrum according to the noise power spectrum estimate, then apply a local window (three-point mean filtering) and a global window (windowing, with the data restored according to the scaling) to the probability estimate; finally, obtain the a priori speech absence probability estimate through column transformation and other operations.

8) Update some of the intermediate array variables used to compute the noise spectrum, for example m_gamma, m_eta and m_v.

9) Update the intermediate variables used to compute the gain estimate and compute the minimum gain estimate, from which the gain G is obtained. Five temporary array variables are involved in this computation: ivUInt32m_min_temp[129], m_lambda_d_global[129], m_GH0[129], m_GH1[129] and ivUInt16m_PH1[129]. The gain G and the temporary array variable m_PH1 can share one memory (Blockram) with a depth of 129*16bit, and the temporary array variable m_GH1 shares one memory (Blockram) with the power spectrum Ya².

11) Finally, update the intermediate variable eta_2term used to compute the noise spectrum estimate λ̂_d and the gain G, completing the noise reduction processing.
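Steps 1) and 6) above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the edge bins of the three-point filter reuse their own value (the text does not specify edge handling), the class name MinTracker is hypothetical, and the special handling of frame 10 and of the first 14 frames is not modelled.

```python
def power_spectrum(re, im):
    """Ya2[i] = Real[i]^2 + Image[i]^2 for integer FFT outputs."""
    return [r * r + i * i for r, i in zip(re, im)]

def smooth3(ya2):
    """Three-point mean filter with weights 1/4, 1/2, 1/4 implemented as shifts."""
    n = len(ya2)
    out = []
    for i in range(n):
        prev = ya2[max(i - 1, 0)]       # assumption: edge bins reuse their own value
        nxt = ya2[min(i + 1, n - 1)]
        out.append((prev >> 2) + (ya2[i] >> 1) + (nxt >> 2))
    return out

class MinTracker:
    """Per-bin minimum over the last `sub_windows` blocks of `block` frames each."""

    def __init__(self, bins, block=10, sub_windows=5):
        self.block = block
        self.sub_windows = sub_windows
        self.count = 0
        self.current = [None] * bins    # running minimum of the block in progress
        self.stored = []                # minima of completed blocks, newest last

    def update(self, s):
        self.current = [v if c is None else min(c, v)
                        for c, v in zip(self.current, s)]
        self.count += 1
        if self.count == self.block:    # block finished: store it, drop the oldest
            self.stored.append(self.current)
            self.stored = self.stored[-self.sub_windows:]
            self.current = [None] * len(s)
            self.count = 0

    def minimum(self):
        rows = list(self.stored)
        if any(c is not None for c in self.current):
            rows.append(self.current)
        return [min(col) for col in zip(*rows)]

ya2 = power_spectrum([4, 8, 4], [0, 0, 0])
s_f = smooth3(ya2)
```

Keeping one minimum per block of frames, rather than every frame, bounds the storage to a fixed number of arrays per frequency bin, which is consistent with the five 129-entry memories described in step 6).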

The FPGA-based audio noise reduction method of this embodiment ports the source C code and partitions it into blocks according to function and contextual coupling. This reduces the coupling between the noise reduction and/or operation modules so that the modules can run in parallel, which greatly improves the effect and efficiency of the noise reduction processing. In addition, all multiply-add, division and similar operations are executed in a pipelined manner, which improves computational efficiency. Some memories and operation modules can also be reused, which saves platform resources, and a clock frequency above 100 MHz can be used on the platform to further increase the computation speed.

Referring to FIG. 8, which is a schematic structural diagram of an embodiment of the audio noise reduction apparatus provided by the present application, the audio noise reduction apparatus 80 includes a memory 81 and a processor 82 connected to each other. The memory 81 stores a computer program which, when executed by the processor 82, implements the audio noise reduction method of the above embodiments.

Referring to FIG. 9, which is a schematic structural diagram of an embodiment of the audio noise reduction apparatus provided by the present application, the audio noise reduction apparatus 10 includes a scheduling circuit 11 and a noise reduction circuit 12.

The scheduling circuit 11 receives the data to be denoised corresponding to the receiving task and splits it to obtain multiple channels of sub-audio data. The noise reduction circuit 12 is connected to the scheduling circuit 11 and performs parallel noise reduction on all sub-audio data corresponding to the noise reduction task to obtain the denoised data.

In a specific implementation, multiple sub noise reduction circuits (not shown in the figure) can be provided, each processing one channel of sub-audio data, so that the sub-audio data is denoised in parallel. The number of sub noise reduction circuits can be set according to actual needs and is not limited here. It can be understood that "noise reduction" in this embodiment is not limited to denoising the data to be denoised; it may also include operations such as separation or dereverberation of that data.

Specifically, the sub-audio data may be the portion of the data to be denoised that actually requires noise reduction. The scheduling circuit 11 can split the received data to be denoised and, at the same time, identify the sub-audio data, thereby dividing the data to be denoised into multiple channels of sub-audio data. Here "multiple channels" means that each channel of sub-audio data is transmitted to the noise reduction circuit 12 in parallel over a different transmission channel, which improves data transmission efficiency. The data to be denoised can be divided into, for example, four or eight channels of sub-audio data; the number of channels can be increased or decreased according to actual needs and is not limited here.

Further, the receiving task and the noise reduction task can be executed simultaneously; that is, the audio noise reduction apparatus 10 can denoise the received data in real time. The data to be denoised may be speech-containing data collected in real time by an audio collection device (such as a pickup). In this case, the scheduling circuit 11 can transmit the collected sub-audio data directly to the noise reduction circuit 12 as soon as it is collected, and the noise reduction circuit 12 denoises all sub-audio data in parallel to obtain the denoised data. The data does not need to be relayed through a computer, which saves transmission time and greatly improves real-time performance. In practical application tests, the solution of this embodiment saved 4-8 ms compared with existing solutions. Alternatively, in other implementations, the data to be denoised may be previously collected data pre-stored in the audio collection device, which is not limited here.

In a specific implementation, for each transmission channel the scheduling circuit 11 can transmit the corresponding sub-audio data serially. Each time, a preset number (e.g., 64 data points) of the sub-audio data is transmitted to the noise reduction circuit 12 for noise reduction; the next preset number of sub-audio data is then transmitted, and so on until all sub-audio data has been denoised. Processing the data while it is being transmitted reduces the processing load on the noise reduction circuit 12 and improves the efficiency of audio noise reduction. In practical application tests, the noise reduction circuit 12 took about 250 µs to denoise each channel of sub-audio data, far less than the conventional 2 ms to 6 ms noise reduction time in the related art.
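The transmit-while-processing scheme described above can be sketched in Python as follows. The preset size of 64 data points comes from the text; the function names and the placeholder `identity` step, which stands in for the real noise reduction, are assumptions made for illustration.

```python
def chunks(samples, size=64):
    """Yield successive fixed-size portions of one channel's sub-audio data."""
    for start in range(0, len(samples), size):
        yield samples[start:start + size]

def process_stream(samples, denoise, size=64):
    """Pass each portion to `denoise` as soon as it is available and collect the results."""
    out = []
    for part in chunks(samples, size):
        out.extend(denoise(part))
    return out

def identity(part):
    """Placeholder for the noise reduction step (hypothetical)."""
    return list(part)

data = list(range(200))
result = process_stream(data, identity)
sizes = [len(c) for c in chunks(data)]
```

Because each portion is handed over as soon as it arrives, the processing stage never has to hold more than one portion of pending input per channel.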

In this embodiment, the scheduling circuit performs the receiving task: it receives the data to be denoised and splits it into multiple channels of sub-audio data, which are transmitted to the noise reduction circuit in parallel. The noise reduction circuit performs the noise reduction task: it denoises all sub-audio data in parallel to obtain the denoised data. This greatly reduces the time spent on data transmission and noise reduction and thus improves the efficiency of noise reduction. Moreover, because the receiving task and the noise reduction task are executed simultaneously, the collected data can be received as soon as it is available and passed to the noise reduction circuit without being relayed, which greatly improves real-time performance.

Referring to FIG. 10, which is a schematic structural diagram of another embodiment of the audio noise reduction apparatus provided by the present application, the audio noise reduction apparatus 20 includes a scheduling circuit 21 and a noise reduction circuit 22.

The scheduling circuit 21 includes a splitting module 211 and a scheduling module 212. The splitting module 211 splits the data to be denoised to obtain multiple channels of data to be denoised. The scheduling module 212 is connected to the splitting module 211 and the noise reduction circuit 22; it schedules the multiple channels of data to be denoised to obtain multiple channels of sub-audio data and sends them to the noise reduction circuit 22. Specifically, the splitting module 211 may be a data selector (multiplexer, MUX).

In a specific implementation, the data to be denoised may include identification information and original audio information. The identification information includes a channel number identifier and a noise reduction information identifier. The channel number identifier identifies the channel number of the data to be denoised, and each bit of the data to be denoised may correspond to one channel number. The noise reduction information identifier identifies the noise reduction information of the data, which indicates whether the data requires noise reduction.

Specifically, the splitting module 211 can read the channel number in the channel number identifier and transmit the data to be denoised over the transmission channel corresponding to that channel number. The scheduling module 212 can read the noise reduction information in the noise reduction information identifier and use it to determine whether the data to be denoised is sub-audio data, i.e., whether it requires noise reduction. If the data is sub-audio data, it is input to the noise reduction circuit 22; if not, it can be input to other circuits for other processing or output directly.
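The routing behaviour of the splitting module and the scheduling module can be illustrated with a short Python sketch. The `Packet` layout below is hypothetical: the text names the three pieces of information but does not specify field widths or encoding, and software dictionaries stand in for the hardware transmission channels.

```python
from collections import defaultdict
from typing import NamedTuple

class Packet(NamedTuple):
    channel: int          # channel number identifier
    needs_denoise: bool   # noise reduction information identifier
    audio: list           # original audio information

def dispatch(packets):
    """Group audio by channel, separating data that needs noise reduction from the rest."""
    to_denoise = defaultdict(list)   # would go to the noise reduction circuit
    bypass = defaultdict(list)       # would go to other circuits or straight to output
    for p in packets:
        target = to_denoise if p.needs_denoise else bypass
        target[p.channel].extend(p.audio)
    return dict(to_denoise), dict(bypass)

packets = [Packet(0, True, [1, 2]), Packet(1, False, [3]), Packet(0, True, [4])]
to_denoise, bypass = dispatch(packets)
```

Here channel 0 accumulates both of its packets in arrival order for noise reduction, while channel 1 is passed through untouched.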

As shown in FIG. 11, the audio noise reduction apparatus 20 further includes a storage module 23, which is connected to the splitting module 211 and stores the multiple channels of data to be denoised. Specifically, the storage module 23 may include multiple sub-storage modules 231, each corresponding to a channel number in the channel number identifier; the splitting module 211 can output the data to be denoised to the corresponding sub-storage module 231 based on the channel number.

The noise reduction circuit 22 may further include multiple sub noise reduction circuits 221 (FIG. 10 shows three sub noise reduction circuits 221 as an example), and the denoised data may include multiple pieces of sub-denoised audio data. Each sub noise reduction circuit 221 is connected to the scheduling circuit 21 and denoises the corresponding data to obtain the corresponding sub-denoised audio data. In other implementations, the audio noise reduction apparatus 20 may further include a demultiplexer (DEMUX) through which the multiple pieces of sub-denoised audio data are output.
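As a loose software analogy only, the per-channel parallelism of the sub noise reduction circuits can be sketched with a Python thread pool. Threads are not equivalent to parallel hardware circuits, and `denoise_channel` is a hypothetical placeholder for the actual noise reduction method; the worker count of three mirrors the three sub noise reduction circuits shown in FIG. 10.

```python
from concurrent.futures import ThreadPoolExecutor

def denoise_channel(samples):
    """Placeholder for one sub noise reduction circuit (hypothetical processing)."""
    return [s // 2 for s in samples]

def denoise_all(channels, workers=3):
    """Run one placeholder denoiser per channel; results keep the input channel order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(denoise_channel, channels))

denoised = denoise_all([[2, 4], [6], [8, 10]])
```

`pool.map` returns results in the order of its inputs, so each output list lines up with the channel it came from.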

In a specific implementation, the audio noise reduction apparatus 20 can be implemented on an FPGA. In practical tests, the size of the FPGA-based audio noise reduction apparatus 20 could be kept within 50mm×40mm. The apparatus can be integrated into a single IP core and ported flexibly across application scenarios, so that it is broadly applicable to any FPGA platform. Moreover, because FPGAs offer low cost, large scale, high integration, low power consumption, high flexibility and a short development cycle, and the processing power consumption does not exceed 3 W, the FPGA-based audio noise reduction apparatus 20 can reduce the cost of noise reduction and improve its real-time performance and efficiency.

Moreover, in audio noise reduction schemes based on other platforms (such as IMAX6Q or STM32), after the data to be denoised is collected, noise reduction can start only once a full frame has been received and the data has been updated. Taking a frame of 256 data points as an example, only 64 data points can be updated at a time, so four updates are needed before noise reduction can start. In this embodiment, the data to be denoised is collected within the FPGA itself, so the FPGA-based audio noise reduction apparatus 20 does not need to wait for a full frame to be updated before starting noise reduction; it can start after receiving 1/4 of a frame, which greatly improves the efficiency of data transmission and noise reduction.

Specifically, the noise reduction circuit 22 can be used to implement the audio noise reduction method of the above embodiments. Compared with other speech noise reduction algorithms (such as spectral subtraction, adaptive filtering, Wiener filtering, or minimum mean square error (MMSE) estimation), the audio noise reduction algorithm of the above embodiments improves the noise reduction result, suppresses background noise more strongly, and introduces less distortion into the data.

In this embodiment, the splitting module and the scheduling module split and schedule the data to be denoised, and the channels of data that require noise reduction are transmitted in parallel over different transmission channels to the subsequent sub noise reduction circuits. Providing multiple sub noise reduction circuits allows the channels of sub-audio data to be denoised in parallel, which greatly improves the efficiency of audio noise reduction. Moreover, the audio noise reduction apparatus of this embodiment can be implemented on an FPGA platform, whose parallel and fast data processing supports running the audio noise reduction method on multiple channels in parallel. This improves the noise reduction result while maintaining processing speed, addresses the long processing time and slow response of audio noise reduction, and further improves its efficiency.

Referring to FIG. 12, which is a schematic structural diagram of an embodiment of the audio noise reduction system provided by the present application, the audio noise reduction system 120 includes an audio collection device 121 and an audio noise reduction apparatus 122. The audio collection device 121 collects sound in a target scene to obtain the data to be denoised. The audio noise reduction apparatus 122 is connected to the audio collection device 121 and denoises the data to obtain denoised audio data. The audio noise reduction apparatus 122 is the audio noise reduction apparatus of the above embodiments.

Referring to FIG. 13, which is a schematic structural diagram of an embodiment of the computer-readable storage medium provided by the present application, the computer-readable storage medium 130 stores a computer program 131 which, when executed by a processor, implements the audio noise reduction method of the above embodiments.

The computer-readable storage medium 130 may be any medium capable of storing program code, such as a server, a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.

In the several implementations provided in this application, it should be understood that the disclosed methods and devices may be implemented in other ways. For example, the device implementations described above are merely illustrative: the division into modules or units is only a division by logical function, and other divisions are possible in an actual implementation; for instance, multiple units or components may be combined or integrated into another system, or some features may be omitted or not executed.

Units described as separate components may or may not be physically separate, and components shown as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this implementation.

In addition, the functional units in the implementations of this application may be integrated into one processing unit, may each exist physically on their own, or two or more units may be integrated into one unit. The integrated unit may be implemented in hardware or as a software functional unit.

The above are only embodiments of the present application and do not limit its patent scope. Any equivalent structure or equivalent process transformation made using the contents of the description and drawings of this application, whether applied directly or indirectly in other related technical fields, is likewise included within the scope of patent protection of this application.

Claims

1. A method for audio noise reduction, the method comprising:

acquiring data to be denoised, and calculating a power spectrum of the data to be denoised, wherein the data to be denoised comprises noise data and noise-free data;

initializing a noise estimation parameter based on the power spectrum to obtain a noise spectrum of the noise data;

carrying out minimum tracking on the initialized noise estimation parameters in a first time period to obtain a first array;

calculating a posterior signal-to-noise ratio and a prior signal-to-noise ratio of the data to be denoised based on the first array;

calculating a silence probability estimation value, a sound probability estimation value and a noise power spectrum estimation value based on the confidence coefficient of the posterior signal-to-noise ratio;

carrying out minimum tracking on the initialized noise estimation parameters in a second time period to obtain a second array;

calculating a gain estimation value of the noise-free data based on the second array, the silence probability estimation value, the sound probability estimation value and the noise power spectrum estimation value;

and performing noise reduction processing on the data to be denoised based on the gain estimation value, the noise spectrum and the noise power spectrum estimation value to obtain the noise-free data.

2. The audio noise reduction method of claim 1, further comprising:

updating the silence probability estimation value based on the second array to obtain an updated silence probability estimation value;

updating the noise spectrum based on the noise power spectrum estimated value to obtain an updated noise spectrum;

calculating the gain estimation value based on the prior signal-to-noise ratio, the posterior signal-to-noise ratio, the updated silence probability estimation value and the sound probability estimation value;

and performing noise reduction processing on the data to be denoised based on the gain estimation value and the updated noise spectrum to obtain the noise-free data.

3. The audio denoising method of claim 1, wherein the step of initializing a noise estimation parameter based on the power spectrum to obtain a noise spectrum of the noise data comprises:

smoothing the power spectrum by adopting a three-point average filtering method to obtain a smoothed power spectrum;

initializing the noise estimation parameters based on the smoothed power spectrum to obtain the noise spectrum.

4. The method of claim 1, wherein the step of calculating a posterior signal-to-noise ratio and a prior signal-to-noise ratio of the data to be denoised based on the first array comprises:

calculating a posterior signal-to-noise ratio based on the first array, and acquiring the minimum value of the posterior signal-to-noise ratio to obtain a third array;

and calculating the prior signal-to-noise ratio based on the third array.

5. The audio denoising method according to claim 1, wherein the step of calculating the power spectrum of the data to be denoised comprises:

calculating the power spectrum of the first half part of data in the data to be denoised to obtain a first power spectrum;

calculating a power spectrum of the latter half data in the data to be denoised based on symmetry to obtain a second power spectrum;

and combining the first power spectrum and the second power spectrum to obtain the power spectrum.

6. The method of claim 1, wherein the step of performing minimum tracking on the initialized noise estimation parameters in the first time period to obtain the first array comprises:

calculating the minimum value of the initialized noise estimation parameters to obtain a fourth array;

and carrying out minimum value tracking on the fourth array to obtain the first array.

7. The audio denoising method of claim 1, wherein the data to be denoised comprises a plurality of frames of audio data, the method further comprising:

and performing the minimum tracking operation at intervals of a preset number of frames of audio data.

8. The audio noise reduction method of claim 2, wherein the step of updating the silence probability estimation value based on the second array to obtain an updated silence probability estimation value comprises:

calculating a first silence probability estimation value based on the second array;

smoothing the first silence probability estimation value by adopting a three-point mean filtering method to obtain a second silence probability estimation value;

windowing the second silence probability estimation value to obtain a third silence probability estimation value;

and carrying out numerical transformation processing on the third silence probability estimation value to obtain an updated silence probability estimation value.

9. The audio noise reduction method of claim 1, further comprising:

splitting the data to be denoised to obtain multiple channels of sub-audio data;

performing parallel noise reduction processing on all the sub-audio data by adopting a noise reduction processing method to obtain noise-reduced sub-audio data, wherein the noise reduction processing method is the audio noise reduction method of any one of claims 1 to 8;

and merging the sub audio data subjected to noise reduction to obtain the noiseless data.

10. An audio noise reduction apparatus comprising a memory and a processor connected to each other, wherein the memory is adapted to store a computer program which, when executed by the processor, implements the audio noise reduction method of any of claims 1-9.

11. An audio noise reduction device for simultaneously performing a receiving task and a noise reduction task, the audio noise reduction device comprising:

a scheduling circuit configured to receive the data to be denoised corresponding to the receiving task and to perform shunt processing on the data to be denoised to obtain multi-path sub-audio data; and

a noise reduction circuit connected to the scheduling circuit and configured to perform parallel noise reduction processing on all the sub-audio data corresponding to the noise reduction task to obtain noise-reduced audio data, wherein the noise reduction circuit is configured to implement the audio noise reduction method of any one of claims 1-9.

12. The audio noise reduction device of claim 11, wherein

the noise-reduced audio data comprises a plurality of pieces of noise-reduced sub-audio data; and the noise reduction circuit comprises a plurality of sub noise reduction circuits, each of which is connected to the scheduling circuit and configured to perform noise reduction processing on the corresponding sub-audio data to obtain the corresponding noise-reduced sub-audio data.

13. The audio noise reduction device of claim 11, wherein the scheduling circuit comprises:

a shunting module configured to perform shunt processing on the data to be denoised to obtain multiple paths of data to be denoised; and

a scheduling module connected to the shunting module and the noise reduction circuit, and configured to schedule the multiple paths of data to be denoised to obtain the multi-path sub-audio data and to transmit the multi-path sub-audio data to the noise reduction circuit.

14. The audio noise reduction device of claim 13, wherein

the data to be denoised comprises identification information and original audio information, the identification information comprising a channel number identification and a noise reduction information identifier; and the shunting module is further configured to identify the channel number in the channel number identification and to transmit the data to be denoised over the transmission channel corresponding to the channel number.

15. The audio noise reduction device of claim 14, wherein

the scheduling module is further configured to identify noise reduction information in the noise reduction information identifier, to determine, based on the noise reduction information, whether the data to be denoised is the sub-audio data, and, if so, to input the sub-audio data to the noise reduction circuit.

16. The audio noise reduction device of claim 14, wherein

the audio noise reduction device further comprises a storage module connected to the shunting module and configured to store the multiple paths of data to be denoised.

17. The audio noise reduction device of claim 16, wherein

the storage module comprises a plurality of sub-storage modules, each sub-storage module corresponding to a channel number in the channel number identification; and the shunting module is further configured to output the multiple paths of data to be denoised to the corresponding sub-storage modules based on the channel number.
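The routing behaviour recited for the shunting, scheduling and storage modules can be sketched in software as below. The claims describe circuits and modules, not code, so the `Packet` layout, the class names and the use of a boolean flag for the noise reduction information are all hypothetical choices made for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Packet:
    channel: int          # stands in for the channel number identification
    needs_denoise: bool   # stands in for the noise reduction information
    audio: list           # original audio information


class ShuntingModule:
    """Routes each packet to the sub-storage that matches its channel number."""

    def __init__(self):
        self.storage = defaultdict(list)  # one sub-storage list per channel

    def route(self, packet):
        self.storage[packet.channel].append(packet)


class SchedulingModule:
    """Forwards only packets flagged for noise reduction to `denoise`."""

    def __init__(self, denoise):
        self.denoise = denoise

    def dispatch(self, packets):
        return [self.denoise(p.audio) for p in packets if p.needs_denoise]
```

Keeping one sub-storage per channel lets the scheduler hand each path to its own noise reduction worker without re-reading the identification information.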

18. An audio noise reduction system, comprising:

an audio acquisition device configured to acquire sound in a target scene to obtain data to be denoised; and

an audio noise reduction device connected to the audio acquisition device and configured to perform noise reduction processing on the data to be denoised to obtain noise-reduced audio data, wherein the audio noise reduction device is the audio noise reduction device of any one of claims 11-17.

19. A computer-readable storage medium storing a computer program, wherein the computer program, when executed by a processor, is adapted to implement the audio noise reduction method of any of claims 1-9.