CN110197670B - Audio noise reduction method and device and electronic equipment

Info

Publication number: CN110197670B
Application number: CN201910480708.4A
Authority: CN
Inventors: 侯锐
Original assignee: Volkswagen Mobvoi Beijing Information Technology Co Ltd
Current assignee: Volkswagen Mobvoi Beijing Information Technology Co Ltd
Priority date: 2019-06-04
Filing date: 2019-06-04
Publication date: 2022-06-07
Anticipated expiration: 2039-06-04
Also published as: CN110197670A

Abstract

The invention discloses an audio noise reduction method, an audio noise reduction device and electronic equipment, wherein the audio noise reduction method comprises the following steps: determining the type of a noise scene according to the characteristics of the acquired audio signals; acquiring a noise reduction parameter group corresponding to the type of the noise scene, wherein the noise reduction parameter group at least comprises a noise reduction parameter; and carrying out noise reduction on the audio signal through the noise reduction parameters in the noise reduction parameter group. The audio noise reduction method can improve the signal-to-noise ratio of the audio signal.

Description

Audio noise reduction method and device and electronic equipment

Technical Field

The present invention relates to the field of audio processing technologies, and in particular, to an audio denoising method and apparatus, and an electronic device.

Background

At present, audio noise reduction is generally performed with a digital signal processing algorithm: parameters are determined in advance according to the noise conditions of a target scene, and those parameters are then used to perform noise reduction on the audio. However, such a method is effective only for audio acquired in a noise scene with a single characteristic; its noise reduction performance is poor in scenes where the environmental noise changes dynamically.

Disclosure of Invention

In view of this, the present invention provides an audio noise reduction method, an audio noise reduction device and an electronic device, which can improve the signal-to-noise ratio of the audio subjected to noise reduction processing.

According to a first aspect of the present invention, there is provided an audio noise reduction method comprising: determining the type of a noise scene according to the characteristics of the acquired audio signals; acquiring a noise reduction parameter group corresponding to the type of the noise scene, wherein the noise reduction parameter group at least comprises a noise reduction parameter; and carrying out noise reduction on the audio signal through the noise reduction parameters in the noise reduction parameter group.

Optionally, the method further includes: before determining the type of a noise scene according to the characteristics of the acquired audio signals, dividing the scene into different types of noise scenes according to the noise intensity and the noise type of the audio signals acquired under different scenes; and establishing a corresponding relation between the type of each noise scene and a preset noise parameter group.

Optionally, the type of the noise scene is divided according to at least two kinds of information: the environment of the vehicle, the running state of the vehicle, whether windows are opened, whether air conditioners in the vehicle are opened and the magnitude of the environmental noise.

Optionally, obtaining the noise reduction parameter set corresponding to the type of the noise scene includes: extracting features of the audio signal; determining, according to the features, the type of noise scene corresponding to each frame in the audio signal; and, if the types of noise scene corresponding to a preset number of consecutive frames are the same, generating a noise reduction parameter group corresponding to that type of noise scene.

Optionally, the method further includes: acquiring audio data under different noise scenes, wherein the audio data comprises audio signals; marking the audio data according to the noise scene; and training a noise scene classification model by using the marked audio data, wherein the input of the noise scene classification model is the characteristics of the audio signal, and the output of the noise scene classification model is the noise scene corresponding to the audio signal.

Optionally, the noise reduction parameter set at least includes one of the following noise reduction parameters: an over-subtraction factor and a spectral lower limit parameter.

According to a second aspect of the present invention, there is provided an audio noise reduction apparatus comprising: the determining module is used for determining the type of a noise scene according to the characteristics of the acquired audio signals; an obtaining module, configured to obtain a noise reduction parameter set corresponding to a type of the noise scene, where the noise reduction parameter set at least includes a noise reduction parameter; and the noise reduction module is used for carrying out noise reduction on the audio signal through the noise reduction parameters in the noise reduction parameter group.

Optionally, the apparatus further comprises: the dividing module is used for dividing the scenes into different types of noise scenes according to the noise intensity and the noise types of the audio signals collected under different scenes before determining the types of the noise scenes according to the characteristics of the collected audio signals; and the establishing module is used for establishing a corresponding relation between the type of each noise scene and a preset noise parameter group.

Optionally, the obtaining module includes: an extraction unit, configured to extract features of the audio signal; a determining unit, configured to determine, according to the features, the type of noise scene corresponding to each frame in the audio signal; and a generating unit, configured to generate a noise reduction parameter group corresponding to the type of the noise scene if the types of noise scene corresponding to a preset number of consecutive frames are the same.

Optionally, the apparatus further comprises: an acquisition module, configured to acquire audio data under different noise scenes, wherein the audio data comprises audio signals; a marking module, configured to mark the audio data according to the noise scene; and a training module, configured to train a noise scene classification model using the marked audio data, wherein the input of the noise scene classification model is the features of an audio signal, and the output of the noise scene classification model is the noise scene corresponding to the audio signal.

According to a third aspect of the present invention, there is provided an electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing any of the audio noise reduction methods according to the first aspect of the present invention when executing the program.

From the above, it can be seen that the audio noise reduction method of the present invention determines the current noise scene according to the features of the acquired audio signal, determines the noise reduction parameter set for that noise scene, and performs noise reduction on the audio signal based on the noise reduction parameters in that set. The noise reduction parameters can thus be adjusted adaptively while the ambient noise is constantly changing, so that the parameter set best matched to the current noise scene is obtained. The audio signal is processed in a targeted manner according to the characteristics of the current noise scene, which improves the signal-to-noise ratio of the processed audio signal.

Drawings

In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, it is obvious that the drawings in the following description are only some embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to the drawings without creative efforts.

FIG. 1 is a flow diagram illustrating a method of audio noise reduction according to an exemplary embodiment;

FIG. 2 is a schematic diagram illustrating a method of audio noise reduction according to an exemplary embodiment;

FIG. 3 is a diagram illustrating a training process and a use process of a noise scene classification model according to an exemplary embodiment;

FIG. 4 is a flowchart illustrating a process for denoising an original noisy signal according to a denoising parameter in a denoising parameter set, according to an exemplary embodiment;

FIG. 5 is a block diagram illustrating an audio noise reduction apparatus according to an exemplary embodiment.

Detailed Description

In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to specific embodiments and the accompanying drawings.

It should be noted that all expressions using "first" and "second" in the embodiments of the present invention are used to distinguish two entities that share the same name but are not the same entity, or two parameters that are not the same. "First" and "second" are used merely for convenience of description and should not be construed as limiting the embodiments of the present invention; this is not repeated in the following embodiments.

Fig. 1 is a flow chart illustrating an audio noise reduction method according to an exemplary embodiment. As shown in Fig. 1, the method comprises:

Step 101: determining the type of a noise scene according to the characteristics of the acquired audio signals;

For example, after an audio signal (the audio signal to be denoised) is acquired, features of the audio signal may be extracted, for example MFCCs (Mel-frequency cepstral coefficients). The extracted features are input to a pre-trained noise scene classification model, which outputs the type of the noise scene.

Before step 101, noise scenes may be divided in advance into different types according to the noise intensity and the noise type of each scene. For example, when the audio noise reduction method of the present invention is applied to audio acquired in a vehicle, the noise scenes may be divided into the following types:

Noise scene one: the vehicle is in a parking lot environment and idling, the windows are open, the air conditioner is off, and the ambient noise level is 50-55 dB;

Noise scene two: the vehicle is in an urban road environment and in a low-speed running state at 40-60 km/h, the windows are closed, the air conditioner is on, and the ambient noise level is 55-60 dB;

Noise scene three: the vehicle is in an urban road environment and in a low-speed running state at 40-60 km/h, the windows are open, the air conditioner is off, and the ambient noise level is 60-70 dB;

Noise scene four: the vehicle is in an expressway environment and in a high-speed running state at 80-120 km/h, the windows are closed, the air conditioner is on, and the ambient noise level is 55-65 dB.

Step 102: acquiring a noise reduction parameter group corresponding to the type of the noise scene, wherein the noise reduction parameter group at least comprises a noise reduction parameter;

For example, a different noise reduction parameter set may be preset for each type of noise scene, and a correspondence between each type of noise scene and its noise reduction parameter set may be established, where different noise reduction parameter sets differ in at least one noise reduction parameter. For a given noise scene, several candidate sets of noise reduction parameters may be selected according to the characteristics of that scene and each used to denoise audio signals obtained in the scene; after multiple experiments, the set that gives the best noise reduction result is determined as the noise reduction parameter set corresponding to that noise scene. After the type of the noise scene is determined in step 101, the noise reduction parameter set may be generated according to the preset correspondence between noise scene types and noise reduction parameter sets.

In one implementation, the noise reduction parameter may include at least one of an over-subtraction factor and a spectral lower limit parameter.
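The correspondence between noise scene types and parameter groups described in step 102 can be pictured as a simple lookup table. The following is a minimal sketch only: the names `NoiseReductionParams`, `SCENE_PARAMS` and `params_for_scene`, the scene keys, and all numeric values are hypothetical and are not taken from the patent, which determines the values experimentally.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NoiseReductionParams:
    """One noise reduction parameter group (hypothetical container)."""
    alpha: float  # over-subtraction factor, expected to be >= 1
    beta: float   # spectral lower limit parameter, expected in (0, 1)


# Illustrative values only; in practice each group would be chosen by
# experiment for its noise scene.
SCENE_PARAMS = {
    "scene_1": NoiseReductionParams(alpha=2.0, beta=0.05),
    "scene_2": NoiseReductionParams(alpha=3.0, beta=0.03),
    "scene_3": NoiseReductionParams(alpha=4.0, beta=0.02),
    "scene_4": NoiseReductionParams(alpha=3.5, beta=0.02),
}


def params_for_scene(scene_type: str) -> NoiseReductionParams:
    """Return the preset parameter group for a noise scene type."""
    try:
        return SCENE_PARAMS[scene_type]
    except KeyError:
        raise ValueError(f"no parameter group registered for {scene_type!r}")
```

Keeping the table separate from the noise reduction routine means a new scene can be supported by adding one entry rather than changing the processing code.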

Step 103: carrying out noise reduction on the audio signal through the noise reduction parameters in the noise reduction parameter group.

For example, the noise reduction parameter set and the audio signal to be denoised may be sent to a digital signal processing (DSP) module, which performs noise reduction on the audio signal and outputs the noise-reduced audio signal.

In an implementation manner, the noise reduction method of the present invention can be applied to a speech recognition process to perform noise reduction processing on an audio signal to be recognized so as to remove noise in the audio signal to be recognized, thereby improving the accuracy of speech recognition.

The audio noise reduction method determines the current noise scene according to the features of the acquired audio signal, determines the noise reduction parameter group for that noise scene, and performs noise reduction on the audio signal based on the noise reduction parameters in that group. The noise reduction parameters can therefore be adjusted adaptively while the ambient noise keeps changing, so that the parameter group best matched to the current noise scene is obtained. The audio signal is processed in a targeted manner according to the characteristics of the current noise scene, and the signal-to-noise ratio of the processed audio signal can be improved.

In one implementation, the audio denoising method may further include: before determining the type of a noise scene according to the characteristics of the acquired audio signals, dividing the scenes into different types of noise scenes according to the noise intensity and the noise type of the audio signals acquired under different scenes. For example, the noise intensity may first be divided into n levels, and the noise types to be considered may comprise m types. It may then be preset that a scene is determined to be a specific noise scene when the noise intensity in the scene falls within a certain one of the n levels and the audio signal acquired in the scene simultaneously contains at least i (0 < i < m) of the m noise types. When the audio noise reduction method of the present invention is applied to audio acquired in a vehicle, the noise types may include tire noise, wind noise, engine noise, noisy human voices, and noise from other vehicles, and the noise intensity is the magnitude of the noise, e.g., its decibel value.
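The division rule above (one of n intensity levels plus at least i of m noise types) can be sketched as follows. This is an illustration under assumptions: the function names, the level boundaries, and the noise type labels are hypothetical, and how the noise types present in a signal are detected is outside the sketch.

```python
from typing import Iterable, Sequence


def intensity_level(noise_db: float, boundaries: Sequence[float]) -> int:
    """Map a noise level in dB to one of n = len(boundaries) + 1 levels.

    Level 0 is below the first boundary; each boundary reached adds one.
    """
    return sum(1 for bound in boundaries if noise_db >= bound)


def matches_scene(noise_db: float,
                  detected_types: Iterable[str],
                  boundaries: Sequence[float],
                  required_level: int,
                  reference_types: Iterable[str],
                  min_types: int) -> bool:
    """True if the intensity level matches and at least min_types of the
    reference noise types are present in the signal."""
    level_ok = intensity_level(noise_db, boundaries) == required_level
    n_present = len(set(detected_types) & set(reference_types))
    return level_ok and n_present >= min_types
```

For instance, with boundaries at 50, 55, 60 and 70 dB, a 57 dB signal falls in level 2, and the scene matches only if enough of the reference noise types are also present.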

In one implementation, obtaining the set of noise reduction parameters corresponding to the type of the noise scene may include: extracting features of the audio signal; determining, according to the features, the type of noise scene corresponding to each frame in the audio signal; and, if the types of noise scene corresponding to a preset number of consecutive frames are the same, generating a noise reduction parameter group corresponding to that type of noise scene. For example, features are extracted from the audio signal and sent to a decoder, which determines the current noise scene from the audio features using the pre-trained noise scene classification model. The decoder decodes the audio signal frame by frame, and the decoding result of each frame is the noise scene corresponding to that frame. When the decoding results of 3-10 consecutive frames (an example of the preset number) are the same, that decoding result is taken as the target noise scene, the noise reduction parameters corresponding to the target noise scene (for example, an over-subtraction factor α and a spectral lower limit parameter β) are generated, and the generated noise reduction parameters are output. In this way, the noise reduction parameters corresponding to the noise scene can be loaded frame by frame for the audio signal to be denoised, so that the signal is denoised according to the characteristics of the noise scene and its signal-to-noise ratio is improved.
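The consecutive-frame check described above can be sketched as a small generator. The name `stable_scenes` and the default of 5 frames are assumptions made for illustration (the text only gives 3-10 as an example range), and keeping the previous target scene until a new one is confirmed is one possible reading of the behavior between confirmations.

```python
from typing import Iterable, Iterator, Optional


def stable_scenes(frame_labels: Iterable[str],
                  min_run: int = 5) -> Iterator[Optional[str]]:
    """Yield, for each frame, the scene currently accepted as the target.

    A label becomes the target scene only after it has been decoded for
    min_run consecutive frames. Until then the previous target is kept
    (None before any scene has been accepted).
    """
    target = None
    candidate = None
    run = 0
    for label in frame_labels:
        if label == candidate:
            run += 1
        else:
            candidate, run = label, 1
        if run >= min_run:
            target = candidate
        yield target
```

Requiring a run of identical results keeps a single misclassified frame from switching the noise reduction parameters back and forth.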

In one implementation, the audio denoising method may further include: acquiring audio data under different types of noise scenes, wherein the audio data comprises audio signals. For example, the audio data may be collected separately in each of the divided noise scenes, and the collected noise may include various types of environmental noise such as tire noise, wind noise, engine noise, noisy human voices, and noise from other vehicles. The audio data is then marked according to the type of the noise scene: audio data captured in a first noise scene is labeled as the first noise scene, audio data captured in a second noise scene is labeled as the second noise scene, and likewise for the third and fourth noise scenes. A noise scene classification model is then trained using the marked audio data, wherein the input of the noise scene classification model is the features of the audio signal, and the output of the noise scene classification model is the noise scene corresponding to the audio signal. The noise scene classification model may be trained using a model training method such as DNN (deep neural network) or HMM (hidden Markov model).
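The label-then-train flow above can be illustrated with a deliberately simple stand-in. Note that this sketch uses a nearest-centroid classifier, not the DNN or HMM named in the text; it only shows the interface (features in, noise scene out). The function names and the two-dimensional feature vectors are hypothetical.

```python
import math
from collections import defaultdict


def train_centroids(features, labels):
    """Average the feature vectors of each labeled noise scene."""
    sums = {}
    counts = defaultdict(int)
    for vec, label in zip(features, labels):
        if label not in sums:
            sums[label] = [0.0] * len(vec)
        sums[label] = [s + v for s, v in zip(sums[label], vec)]
        counts[label] += 1
    return {label: [s / counts[label] for s in total]
            for label, total in sums.items()}


def classify(model, vec):
    """Return the scene whose centroid is closest to the feature vector."""
    return min(model, key=lambda label: math.dist(model[label], vec))
```

In a real system the feature vectors would be per-frame features such as MFCCs, and the stand-in model would be replaced by the trained DNN or HMM.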

In one implementation, the type of the noise scene may be divided according to at least two of the following kinds of information: the environment of the vehicle, the running state of the vehicle, whether the windows are open, whether the in-vehicle air conditioner is on, and the magnitude of the environmental noise. That is, a noise scene may be defined based on at least two of these kinds of information.

As one example, the noise scenes are divided according to two of the above kinds of information. If the vehicle is in a driving state and its windows are open, the noise scene is defined as a first noise scene; if the vehicle is in a driving state and its windows are closed, the noise scene is defined as a second noise scene; if the vehicle is stationary and its windows are open, the noise scene is defined as a third noise scene; and if the vehicle is stationary and its windows are closed, the noise scene is defined as a fourth noise scene.

As another example, the noise scenes are divided according to four kinds of information. If the vehicle is in a parking lot environment and idling, the windows are open, the air conditioner is off, and the ambient noise level is 50-55 dB, the noise scene is defined as a first noise scene. If the vehicle is in an urban road environment and in a low-speed driving state at 40-60 km/h, the windows are closed, the air conditioner is on, and the ambient noise level is 55-60 dB, the noise scene is defined as a second noise scene. If the vehicle is in an urban road environment and in a low-speed driving state at 40-60 km/h, the windows are open, the air conditioner is off, and the ambient noise level is 60-70 dB, the noise scene is defined as a third noise scene. If the vehicle is in an expressway environment and in a high-speed driving state at 80-120 km/h, the windows are closed, the air conditioner is on, and the ambient noise level is 55-65 dB, the noise scene is defined as a fourth noise scene.

The terms first noise scene, second noise scene, third noise scene, and fourth noise scene are identifiers used to distinguish different noise scenes. In addition, the ordinals may also be used to represent the intensity of the noise in the scene; for example, the noise intensity in the first noise scene is greater than that in the second noise scene.
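The example above that uses two kinds of information (driving state and window state) amounts to a four-entry table. A minimal sketch, in which the function name and the string identifiers are hypothetical:

```python
def scene_from_vehicle_state(is_driving: bool, window_open: bool) -> str:
    """Map (driving state, window state) to a noise scene identifier."""
    table = {
        (True, True): "first",     # driving, windows open
        (True, False): "second",   # driving, windows closed
        (False, True): "third",    # stationary, windows open
        (False, False): "fourth",  # stationary, windows closed
    }
    return table[(is_driving, window_open)]
```

Dividing scenes by more kinds of information would extend the key tuple, for example with the road environment or a noise level band.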

In an implementation manner, the audio noise reduction method of the present invention may be applied to perform noise reduction processing on an audio signal acquired in a vehicle, for example, may be applied to a voice interaction system in a vehicle, and is used to perform noise reduction on an audio acquired by the voice interaction system, so as to improve the accuracy of voice recognition, and further improve the response capability of the voice interaction system.

Fig. 2 is a schematic diagram illustrating an audio noise reduction method according to an exemplary embodiment, which may include the following processes, as shown in fig. 2:

A sound signal collecting device such as a microphone sends the collected audio signal to an ADC (analog-to-digital converter), which converts the analog signal (i.e., the collected audio signal) into a digital signal;

The digital signal is sent to an intelligent noise scene detection algorithm module (which can output the type of the noise scene using the noise scene classification model described above) and to a digital signal processing (DSP) algorithm module;

The intelligent noise scene detection algorithm module performs noise scene identification on the received digital signal to determine the current noise scene, generates a noise reduction parameter group matched with the determined noise scene, and sends the noise reduction parameter group to the digital signal processing algorithm module;

The digital signal processing algorithm module updates its noise reduction parameters to those in the noise reduction parameter group received from the intelligent noise scene detection algorithm module, performs noise reduction on the audio signal using the updated parameters, and sends the processed audio signal to the back-end voice wake-up/recognition processing module, which performs operations such as speech recognition or voice wake-up on the processed audio signal.

The intelligent noise scene detection algorithm module can more accurately identify the characteristics of the current environmental noise, and provides a more suitable noise reduction parameter set for the digital signal processing algorithm module.

The digital signal processing module can dynamically load noise reduction parameters according to the current noise scene, which improves the noise reduction effect and the speech signal-to-noise ratio. It can therefore provide cleaner speech signals to post-processing modules in the voice interaction system, such as back-end speech recognition and voice wake-up, so that the noise immunity of the whole voice interaction system can be improved.

Fig. 3 is a schematic diagram illustrating a process of training a noise scene classification model and a process of using the noise scene classification model according to an exemplary embodiment, which are described below with reference to fig. 3 (where the flow shown in the upper part of fig. 3 is a process of training the noise scene classification model, and the flow shown in the lower part of fig. 3 is a process of using the noise scene classification model).

The process of training the noise scene classification model comprises the following steps:

In a driving application scene, noise data can be collected separately in the four noise scenes described above (noise scene one, noise scene two, noise scene three and noise scene four), thereby obtaining a noise database covering the various in-vehicle scenes. For example, 50 hours of ambient noise data are collected for each noise scene, and the collected data may include several different types of noise, such as tire noise, wind noise, engine noise, noisy human voices, and noise from other vehicles.

Each audio recording in the noise database is labeled with the noise scene to which it belongs; that is, if noise data A was collected in a given noise scene, that noise scene is the label of noise data A.

The features of the audio signals collected in each noise scene are then extracted from the noise database, and a noise scene classification model is trained on these features. The training can be implemented using a model training method such as DNN or HMM.

The process of using the noise scene classification model may include the following processes:

Features are extracted from the original noisy audio signal (the audio signal to be denoised) and sent to a decoder, which determines the noise scene corresponding to the audio signal using the noise scene classification model. The decoder decodes the original noisy audio signal frame by frame to obtain a decoding result, which comprises the type of noise scene corresponding to the signal. When the decoding results of 3-10 consecutive frames are the same, that decoding result is determined to be the type of the current noise scene, a noise reduction parameter group corresponding to that type is generated, and the noise reduction parameter group is output to the digital signal processing module, so that the digital signal processing module performs noise reduction on the original noisy audio signal according to the noise reduction parameter group.

Fig. 4 is a flowchart illustrating a process of denoising an original noisy signal according to a denoising parameter in a denoising parameter set according to an exemplary embodiment, where, as shown in fig. 4, the process of denoising the original noisy signal according to the denoising parameter set includes the following operations:

In the digital signal processing module, the original noisy audio signal is divided into frames using a 20 ms Hamming window with a 10 ms frame shift, and a fast Fourier transform (FFT) is applied frame by frame to obtain the spectrum and phase information.
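The framing step can be sketched in plain Python as follows. The sketch assumes a frame shift of 10 ms (half of the 20 ms window) and an integer sample rate such as 16000 Hz, which the text does not specify; the function names are hypothetical, and the per-frame FFT that follows is not shown here.

```python
import math


def hamming(n: int) -> list:
    """Hamming window of length n (0.54 - 0.46 * cos form)."""
    if n == 1:
        return [1.0]
    return [0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1))
            for i in range(n)]


def split_frames(samples, sample_rate, frame_ms=20, shift_ms=10):
    """Split samples into overlapping, Hamming-windowed frames.

    Trailing samples that do not fill a whole frame are dropped.
    """
    frame_len = sample_rate * frame_ms // 1000
    shift = sample_rate * shift_ms // 1000
    window = hamming(frame_len)
    frames = []
    for start in range(0, len(samples) - frame_len + 1, shift):
        chunk = samples[start:start + frame_len]
        frames.append([s * w for s, w in zip(chunk, window)])
    return frames
```

At 16000 Hz this gives 320-sample frames that advance by 160 samples, so consecutive frames overlap by half.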

Calculating spectral subtraction according to the following formula by using noise reduction parameters in the noise reduction parameter group transmitted by the intelligent noise scene detection module, such as an over-subtraction factor alpha and a spectral lower limit parameter beta:

[Formula missing from the source text.]

Here, α (α ≥ 1) is the over-subtraction factor, which mainly affects the degree of distortion of the speech spectrum; β (0 < β < 1) is the spectral lower limit parameter, which controls the amount of residual noise and the level of musical noise; Y(ω) represents the original noisy signal, X(ω) represents the clean speech signal, and D(ω) represents the additive noise.
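For orientation, a widely used form of spectral subtraction with an over-subtraction factor and a spectral floor is shown below. It is consistent with the symbol definitions given here, but it is an assumption: the patent's own expression is not available in this text and may differ, for example in the exact threshold condition or in using magnitudes rather than powers. The hats mark estimated quantities.

```latex
|\hat{X}(\omega)|^{2} =
\begin{cases}
|Y(\omega)|^{2} - \alpha\,|\hat{D}(\omega)|^{2}, & \text{if } |Y(\omega)|^{2} > (\alpha + \beta)\,|\hat{D}(\omega)|^{2},\\[4pt]
\beta\,|\hat{D}(\omega)|^{2}, & \text{otherwise.}
\end{cases}
```

With α ≥ 1 and 0 < β < 1, the first branch always yields a value above the floor β|D̂(ω)|², so the estimated speech power never becomes negative.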

An inverse FFT (IFFT) is then performed to convert the processed spectrum, together with the phase information, back into a time-domain signal, thereby obtaining the noise-reduced audio signal.
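The transform, subtract and inverse-transform sequence for a single frame can be sketched as follows. Several things here are assumptions rather than the patent's method: the subtraction rule is the common power-domain form with threshold (α + β) times the noise power, the per-bin noise power estimate is simply passed in, and a naive O(n²) DFT written with the standard library stands in for a real FFT so that the example is self-contained. All function names are hypothetical.

```python
import cmath
import math


def dft(x):
    """Naive discrete Fourier transform (stand-in for an FFT)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(spec):
    """Inverse of dft(); returns the real part of the time-domain signal."""
    n = len(spec)
    return [(sum(spec[k] * cmath.exp(2j * math.pi * k * t / n)
                 for k in range(n)) / n).real
            for t in range(n)]


def spectral_subtract(frame, noise_power, alpha, beta):
    """Denoise one frame given a per-bin noise power estimate.

    noise_power must have one entry per DFT bin (same length as frame).
    """
    out = []
    for y, d in zip(dft(frame), noise_power):
        power = abs(y) ** 2
        if power > (alpha + beta) * d:
            clean = power - alpha * d  # over-subtraction
        else:
            clean = beta * d           # spectral lower limit (floor)
        # Keep the phase of the noisy signal, apply the new magnitude.
        out.append(cmath.rect(math.sqrt(clean), cmath.phase(y)))
    return idft(out)
```

In a complete system this would run on every windowed frame, and the denoised frames would then be recombined into a continuous signal (for example by overlap-add, which the text does not describe).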

Fig. 5 is a block diagram illustrating an audio noise reduction apparatus according to an exemplary embodiment, and as shown in fig. 5, the apparatus 50 includes the following components:

a determining module 51 (which may include the above-mentioned intelligent noise scene detection algorithm module) for determining the type of the noise scene according to the characteristics of the acquired audio signal;

an obtaining module 52, configured to obtain a noise reduction parameter set corresponding to the type of the noise scene, where the noise reduction parameter set at least includes one noise reduction parameter;

and a noise reduction module 53, configured to perform noise reduction on the audio signal according to a noise reduction parameter in the noise reduction parameter set.

The obtaining module 52 and the noise reduction module 53 can be implemented by the above-mentioned digital signal processing module.

In one implementation, the apparatus may further include: a dividing module, configured to divide scenes into different types of noise scenes according to the noise intensity and the noise type of audio signals collected under different scenes, before the type of the noise scene is determined according to the characteristics of the collected audio signals; and an establishing module, configured to establish a corresponding relation between the type of each noise scene and a preset noise parameter group.

In one implementation, the obtaining module may include: an extraction unit, configured to extract features of the audio signal; a determining unit, configured to determine, according to the features, the type of noise scene corresponding to each frame in the audio signal; and a generating unit, configured to generate a noise reduction parameter group corresponding to the type of the noise scene if the types of noise scene corresponding to a preset number of consecutive frames are the same.

In one implementation, the apparatus may further include: the acquisition module is used for acquiring audio data under different noise scenes, wherein the audio data comprises audio signals; the marking module is used for marking the audio data according to the noise scene; the training module is used for training a noise scene classification model by using the marked audio data, wherein the input of the noise scene classification model is the characteristics of an audio signal, and the output of the noise scene classification model is a noise scene corresponding to the audio signal.

In one implementation, the type of the noise scene may be divided according to at least two of the following kinds of information: the environment in which the vehicle is located, the running state of the vehicle, whether the windows are open, whether the in-vehicle air conditioner is on, and the magnitude of the environmental noise.

The apparatus of the foregoing embodiment is used to implement the corresponding method in the foregoing embodiment, and has the beneficial effects of the corresponding method embodiment, which are not described herein again.

Based on the same inventive concept, an embodiment of the present invention further provides an electronic device, which includes a memory, a processor, and a computer program stored in the memory and executable on the processor, and when the processor executes the computer program, the audio noise reduction method according to any of the above embodiments is implemented.

Based on the same inventive concept, embodiments of the present invention also provide a non-transitory computer-readable storage medium storing computer instructions for causing a computer to execute the audio noise reduction method according to any of the above embodiments.

Those of ordinary skill in the art will understand that the discussion of any embodiment above is exemplary only and is not intended to imply that the scope of the disclosure, including the claims, is limited to these examples; within the idea of the invention, features of the above embodiments or of different embodiments may also be combined, steps may be implemented in any order, and there are many other variations of the different aspects of the invention as described above, which are not provided in detail for the sake of brevity.

In addition, well known power/ground connections to Integrated Circuit (IC) chips and other components may or may not be shown within the provided figures for simplicity of illustration and discussion, and so as not to obscure the invention. Furthermore, devices may be shown in block diagram form in order to avoid obscuring the invention, and also in view of the fact that specifics with respect to implementation of such block diagram devices are highly dependent upon the platform within which the present invention is to be implemented (i.e., specifics should be well within purview of one skilled in the art). Where specific details (e.g., circuits) are set forth in order to describe example embodiments of the invention, it should be apparent to one skilled in the art that the invention can be practiced without, or with variation of, these specific details. Accordingly, the description is to be regarded as illustrative instead of restrictive.

While the present invention has been described in conjunction with specific embodiments thereof, many alternatives, modifications, and variations of these embodiments will be apparent to those of ordinary skill in the art in light of the foregoing description. For example, other memory architectures (e.g., dynamic RAM (DRAM)) may use the discussed embodiments.

The embodiments of the invention are intended to embrace all such alternatives, modifications and variations that fall within the broad scope of the appended claims. Therefore, any omissions, modifications, substitutions, improvements and the like that may be made without departing from the spirit and principles of the invention are intended to be included within the scope of the invention.

Claims

1. An audio noise reduction method, comprising:

inputting the characteristics of the collected audio signals into a noise scene classification model obtained by pre-training, and outputting the type of a noise scene;

determining the type of a noise scene corresponding to each frame in the audio signal according to the characteristics of the audio signal;

if the types of the noise scenes corresponding to the frames with the preset number are the same, generating a noise reduction parameter group corresponding to the type of the noise scenes, wherein the noise reduction parameter group at least comprises one noise reduction parameter, the noise reduction parameter comprises at least one of an over-subtraction factor and a spectrum lower limit parameter, and the types of different noise scenes correspond to different noise reduction parameter groups;

and carrying out noise reduction on the audio signal through the noise reduction parameters in the noise reduction parameter group.

2. The method of claim 1, further comprising:

before determining the type of a noise scene according to the characteristics of the acquired audio signals, dividing the scene into different types of noise scenes according to the noise intensity and the noise type of the audio signals acquired under different scenes;

and establishing a corresponding relation between the type of each noise scene and a preset noise parameter group.

3. The method of claim 2, wherein the type of the noise scene is divided according to at least two of the following information:

the environment of the vehicle, the running state of the vehicle, whether windows are opened, whether air conditioners in the vehicle are opened and the magnitude of the environmental noise.

4. The method of claim 1, further comprising:

acquiring audio data under different types of noise scenes, wherein the audio data comprises audio signals;

marking the audio data according to the type of the noise scene;

and training a noise scene classification model by using the marked audio data, wherein the input of the noise scene classification model is the characteristics of the audio signal, and the output of the noise scene classification model is the noise scene corresponding to the audio signal.

5. An audio noise reduction apparatus, comprising:

the determining module is used for inputting the characteristics of the acquired audio signals into a noise scene classification model obtained by pre-training and outputting the type of the noise scene;

an obtaining module, configured to determine, according to characteristics of the audio signal, a type of a noise scene corresponding to each frame in the audio signal, and if the types of the noise scenes corresponding to a preset number of frames are the same, generate a noise reduction parameter group corresponding to the type of the noise scene, where the noise reduction parameter group includes at least one noise reduction parameter, and different types of the noise scenes correspond to different noise reduction parameter groups;

and the noise reduction module is used for carrying out noise reduction on the audio signal through the noise reduction parameters in the noise reduction parameter group.

6. The apparatus of claim 5, further comprising:

a dividing module, used for dividing a scene into different types of noise scenes according to the noise intensity and the noise type of audio signals collected under different scenes before determining the type of the noise scene according to the characteristics of the collected audio signals;

and the establishing module is used for establishing a corresponding relation between the type of each noise scene and a preset noise parameter group.

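7. The apparatus of claim 6, wherein the type of the noise scene is divided according to at least two of the following information:

the environment of the vehicle, the running state of the vehicle, whether windows are opened, whether air conditioners in the vehicle are opened and the magnitude of the environmental noise.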

8. The apparatus of any one of claims 5 to 7, further comprising:

an acquisition module, used for acquiring audio data under different noise scenes, wherein the audio data comprises audio signals;

a marking module, used for marking the audio data according to the noise scene;

and a training module, used for training a noise scene classification model by using the marked audio data, wherein the input of the noise scene classification model is the characteristics of an audio signal, and the output of the noise scene classification model is a noise scene corresponding to the audio signal.

9. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor implements the audio noise reduction method according to any of claims 1 to 4 when executing the program.