CN117935818B - Audio encoding and decoding device, method and system with automatic gain control function
- Publication number: CN117935818B
- Application number: CN202410132693.3A
- Authority: CN (China)
- Prior art keywords: audio, signal, module, noise, automatic gain
- Legal status: Active (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L19/008—Multichannel audio signal coding or decoding using interchannel correlation to reduce redundancy, e.g. joint-stereo, intensity-coding or matrixing
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
- G10L21/0208—Noise filtering
Abstract
The present application provides an audio codec device, method and system with an automatic gain control (AGC) function. First, the application integrates environmental noise monitoring and the AGC function inside the audio codec, so that the AGC module can dynamically adjust the audio gain parameters according to real-time environmental noise data while keeping the audio output consistent and clear under varying environmental conditions, thereby significantly improving the processing quality of the audio signal and the user's listening experience. Second, integrating the AGC function into the audio codec reduces dependence on external hardware and the complexity of system integration, reduces the need for additional equipment, simplifies system deployment, and lowers the technical barrier to using AGC technology. Finally, integrating the AGC function into the audio codec makes full use of the codec's high sampling rate capability to process audio signals with lower algorithmic delay, thereby achieving a faster and more accurate response.
Description
Technical Field
The present application relates to the field of audio encoding and decoding, and in particular to an audio codec device, method and system with an automatic gain control function.
Background Art
An audio codec system enables audio data to be transmitted and stored effectively across different media and devices. It comprises an audio encoder and an audio decoder. The encoder converts the original analog audio signal into a digital format, and the decoder converts the digital signal back into an analog signal for playback. Audio codec technology allows compressed audio files to be distributed over networks quickly and efficiently. In particular, in mobile phone and conference system scenarios, audio codecs help ensure the clarity and reliability of voice communication.
Automatic gain control (AGC) is a commonly used technique in audio and signal processing. AGC monitors the strength of the input signal in real time and automatically adjusts the strength of the output signal toward a preset target level, so that the output signal stays stable, consistent and at an approximately constant level. With AGC, the receiver hears clear and consistent sound regardless of how the sender's voice volume changes. Audio clarity and consistency are particularly important in voice communication, broadcasting and conferencing systems, where AGC can effectively reduce volume fluctuations caused by factors such as microphone distance, input volume changes and background noise.
Existing AGC systems, however, have several shortcomings. First, a traditional AGC system lacks an environmental monitoring function, so it cannot distinguish between different listening environments and therefore cannot adjust the target volume to the environment the user is in. When the ambient noise level changes, the AGC system can only hold the volume at a fixed level, so the volume is too low in a noisy environment or too high in a quiet one, which degrades the listener's experience. Second, configuring a traditional AGC system has a high technical barrier: setting and tuning its parameters correctly requires specialist knowledge. Such tuning includes, but is not limited to, setting the target volume level, adjusting the response speed and handling sudden volume changes, and this limits the adoption and efficiency of AGC systems. Finally, using an AGC system as part of an audio encoding and decoding pipeline is complex and difficult to deploy.
Summary of the Invention
In view of the above shortcomings of the prior art, the purpose of the present application is to provide an audio codec device, method and system with an automatic gain control function, in order to solve the following problems in audio encoding and decoding: the automatic gain control system is difficult to deploy, its parameter setting has a high technical barrier, and it has no environmental monitoring function, all of which reduce the clarity and reliability of the transmitted audio.
To achieve the above and other related purposes, a first aspect of the present application provides an audio codec device with an automatic gain control function, comprising: an analog-to-digital conversion unit, electrically connected to an environmental noise monitoring unit and an audio interface, the analog-to-digital conversion unit being configured to receive a near-end audio analog signal, perform an analog-to-digital conversion operation, a data matching operation and a data caching operation to generate a corresponding near-end audio digital signal, and output the near-end audio digital signal to the environmental noise monitoring unit and the data interface; the environmental noise monitoring unit, electrically connected to an automatic gain control unit and configured to perform a noise monitoring operation on the received near-end audio digital signal to generate environmental noisiness data, and to output the environmental noisiness data to the automatic gain control unit; and the automatic gain control unit, electrically connected to a control interface, the audio interface and a digital-to-analog conversion unit, and configured to receive the environmental noisiness data from the environmental noise monitoring unit, receive a far-end audio digital signal from the audio interface, receive control information from the control interface, perform an automatic gain control operation on the far-end audio digital signal based on the environmental noisiness data and the control information to generate an automatic gain audio signal, and output the automatic gain audio signal to the digital-to-analog conversion unit.
In some embodiments of the first aspect of the present application, the environmental noise monitoring unit comprises: a preprocessing module, configured to perform a frequency-domain preprocessing operation on the received near-end audio digital signal to generate a near-end audio frequency-domain signal, and to send the near-end audio frequency-domain signal to a voice activity detection module; the voice activity detection module, configured to receive the near-end audio frequency-domain signal, calculate its short-time energy and zero-crossing rate, perform voice activity detection based on the short-time energy and zero-crossing rate to generate a voice activity detection result, and send the voice activity detection result to a noise power spectral density estimation module; the noise power spectral density estimation module, configured to receive the voice activity detection result, perform noise power spectral density analysis based on it to generate a noise power spectral density, and send the noise power spectral density to a noise feature extraction module; the noise feature extraction module, configured to receive the noise power spectral density, perform a noise feature extraction operation based on it to generate a comprehensive noise feature, and send the comprehensive noise feature to a noisiness measurement module; and the noisiness measurement module, configured to receive the comprehensive noise feature, calculate the environmental noisiness data from it, and send the environmental noisiness data to the automatic gain control unit.
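The voice activity detection step described above can be sketched as follows. This is only an illustration under stated assumptions: the two measures are computed here on a time-domain frame, which is how short-time energy and zero-crossing rate are conventionally defined, and the thresholds and the decision rule in `is_speech` are hypothetical values chosen for the example, not taken from the patent.

```python
import math

def short_time_energy(frame):
    """Mean squared amplitude of one frame."""
    return sum(x * x for x in frame) / len(frame)

def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0.0) != (b >= 0.0))
    return crossings / (len(frame) - 1)

def is_speech(frame, energy_threshold=0.01, zcr_threshold=0.3):
    # Hypothetical rule: treat a frame as speech when it is energetic
    # and does not change sign too often (noise-like frames tend to).
    return (short_time_energy(frame) > energy_threshold
            and zero_crossing_rate(frame) < zcr_threshold)

# Usage: a loud 200 Hz tone versus a faint, rapidly alternating signal.
fs = 8000
voiced_like = [0.5 * math.sin(2.0 * math.pi * 200.0 * t / fs) for t in range(160)]
hiss_like = [0.01 * (-1) ** t for t in range(160)]
print(is_speech(voiced_like), is_speech(hiss_like))
```

A practical detector would typically adapt its thresholds to the running noise floor rather than use fixed constants.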
In some embodiments of the first aspect of the present application, the preprocessing module comprises the following modules for performing the frequency-domain preprocessing operation: a DC component removal module, configured to receive the near-end audio digital signal, remove its DC component, and send the resulting signal to a window function processing module; the window function processing module, configured to apply a Hanning window function to the received DC-free near-end audio digital signal and send the windowed signal to a short-time Fourier transform module; and the short-time Fourier transform module, configured to receive the windowed signal, convert it from the time domain to the frequency domain to generate the near-end audio frequency-domain signal, and send the near-end audio frequency-domain signal to the voice activity detection module.
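The three preprocessing stages (DC removal, Hanning windowing, short-time Fourier transform) can be sketched in plain Python as below. The frame length, hop size and the naive DFT are assumptions made to keep the example self-contained; a real implementation would use an FFT routine.

```python
import cmath
import math

def remove_dc(frame):
    """Subtract the frame mean so the samples are centred on zero."""
    mean = sum(frame) / len(frame)
    return [x - mean for x in frame]

def hann_window(n):
    """Periodic Hanning (Hann) window of length n."""
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * i / n) for i in range(n)]

def dft(frame):
    """Naive O(n^2) DFT returning bins 0..n/2; adequate for a sketch only."""
    n = len(frame)
    return [
        sum(x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(frame))
        for k in range(n // 2 + 1)
    ]

def stft(signal, frame_len=64, hop=32):
    """Per frame: remove DC, apply the window, transform to the frequency domain."""
    window = hann_window(frame_len)
    spectra = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = remove_dc(signal[start:start + frame_len])
        windowed = [x * w for x, w in zip(frame, window)]
        spectra.append(dft(windowed))
    return spectra

# Usage: a 1 kHz tone sampled at 8 kHz, riding on a DC offset of 0.5.
fs = 8000
signal = [0.5 + math.sin(2.0 * math.pi * 1000.0 * t / fs) for t in range(256)]
spectra = stft(signal)
magnitudes = [abs(c) for c in spectra[0]]
peak_bin = magnitudes.index(max(magnitudes))  # bin index = 1000 * 64 / 8000
```

With these settings the tone falls on a single bin and the DC bin is essentially empty, which is the effect the DC removal stage is meant to have.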
In some embodiments of the first aspect of the present application, the noise feature extraction operation comprises: calculating the total noise energy, the spectral flatness and the spectral entropy based on the noise power spectral density; wherein the comprehensive noise feature comprises the total noise energy, the spectral flatness and the spectral entropy.
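A minimal sketch of the three features, using their common textbook definitions: total energy as the sum of the power spectral density (PSD), spectral flatness as the ratio of the geometric mean to the arithmetic mean, and spectral entropy as the Shannon entropy of the normalized PSD. These definitions and the small `eps` guard are assumptions; the text above does not spell out the formulas.

```python
import math

def noise_features(psd, eps=1e-12):
    """Return total energy, spectral flatness and spectral entropy of a PSD."""
    total_energy = sum(psd)
    arithmetic_mean = total_energy / len(psd)
    # Geometric mean computed in the log domain to avoid underflow.
    geometric_mean = math.exp(sum(math.log(p + eps) for p in psd) / len(psd))
    flatness = geometric_mean / (arithmetic_mean + eps)
    # Treat the normalized PSD as a probability distribution over bins.
    probs = [p / (total_energy + eps) for p in psd]
    entropy = -sum(q * math.log2(q) for q in probs if q > 0.0)
    return {
        "total_energy": total_energy,
        "spectral_flatness": flatness,
        "spectral_entropy": entropy,
    }

# Usage: a flat (white-noise-like) PSD versus one dominated by a single bin.
flat = noise_features([1.0] * 8)
peaky = noise_features([8.0] + [0.0] * 7)
```

A flat spectrum gives a flatness near 1 and the maximum entropy of log2(8) = 3 bits, while the single-bin spectrum gives values near 0 for both, so the two features separate broadband noise from tonal noise.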
In some embodiments of the first aspect of the present application, the automatic gain control unit comprises: a target volume determination module, which has three inputs and one output, the three inputs being connected to the control interface, the audio interface and the environmental noise monitoring unit respectively, and the output being connected to a gain calculation module, wherein the target volume determination module calculates a target volume value for the far-end audio digital signal based on the control information and the environmental noisiness data; the gain calculation module, which has one input connected to the target volume determination module and one output connected to a gain smoothing adjustment module, wherein the gain calculation module calculates a gain value for the far-end audio digital signal from a preset gain coefficient and the target volume value, and sends the gain value to the gain smoothing adjustment module; the gain smoothing adjustment module, which has one input connected to the gain calculation module and one output connected to a gain application module, wherein the gain smoothing adjustment module smooths the gain value of the far-end audio digital signal based on a preset smoothing factor and sends the smoothed gain value to the gain application module; and the gain application module, which has one input connected to the gain smoothing adjustment module and one output connected to the digital-to-analog conversion unit, wherein the gain application module computes the automatic gain audio signal from the smoothed gain value and sends the automatic gain audio signal to the digital-to-analog conversion unit.
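The four-stage chain (target volume, gain calculation, gain smoothing, gain application) can be sketched as follows. Every formula here is an assumption made for illustration: the linear mapping from noisiness to a target level, the decibel-based gain, the one-pole smoother, and all default values (`max_boost_db`, `gain_coefficient`, `max_gain`, `smoothing_factor`) are hypothetical and are not given in the text above.

```python
import math

def target_level_db(base_target_db, noisiness, max_boost_db=12.0):
    """Hypothetical mapping: raise the target level linearly with noisiness in [0, 1]."""
    noisiness = min(max(noisiness, 0.0), 1.0)
    return base_target_db + max_boost_db * noisiness

def rms_db(frame, eps=1e-12):
    """Frame level in dB relative to full scale (1.0)."""
    rms = math.sqrt(sum(x * x for x in frame) / len(frame))
    return 20.0 * math.log10(rms + eps)

def raw_gain(frame, target_db, gain_coefficient=1.0, max_gain=8.0):
    """Linear gain that would move the frame level to the target, capped at max_gain."""
    gain = gain_coefficient * 10.0 ** ((target_db - rms_db(frame)) / 20.0)
    return min(gain, max_gain)

def smooth_gain(previous, current, smoothing_factor=0.9):
    """One-pole smoothing so the gain changes gradually between frames."""
    return smoothing_factor * previous + (1.0 - smoothing_factor) * current

def apply_gain(frame, gain):
    """Scale the samples and clamp them to the valid range [-1, 1]."""
    return [max(-1.0, min(1.0, x * gain)) for x in frame]

# Usage: a quiet far-end tone, with the control interface asking for -20 dBFS
# and the noise monitor reporting a moderately noisy room (0.5).
fs = 8000
far_end = [0.05 * math.sin(2.0 * math.pi * 400.0 * t / fs) for t in range(80)]
target = target_level_db(base_target_db=-20.0, noisiness=0.5)
gain = 1.0
for _ in range(100):  # feed the same frame repeatedly until the gain settles
    gain = smooth_gain(gain, raw_gain(far_end, target))
output = apply_gain(far_end, gain)
```

The smoothing stage is what prevents audible pumping: the raw gain may jump from frame to frame, but the applied gain only moves a fraction of the way toward it each time.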
In some embodiments of the first aspect of the present application, the digital-to-analog conversion unit comprises: a data cache module, electrically connected to the automatic gain control unit and a data matching module, and configured to perform delay matching and data caching operations on the automatic gain audio signal and send it to the data matching module; the data matching module, electrically connected to a mixing module, and configured to perform a format conversion and matching operation on the automatic gain audio signal to generate a digital-to-analog intermediate signal and send it to the mixing module; the mixing module, electrically connected to a digital-to-analog conversion submodule, and configured to mix the digital-to-analog intermediate signal with the digital signals of other far-end audio streams or with locally stored digital audio streams to generate a mixed signal, and to send the mixed signal to the digital-to-analog conversion submodule; and the digital-to-analog conversion submodule, configured to perform a digital-to-analog conversion operation on the mixed signal to generate and output a far-end audio analog signal.
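The mixing step can be illustrated with a sample-wise sum. This sketch assumes floating-point samples in [-1, 1] and hard clipping at the limit; the function name `mix` and the clipping policy are illustrative choices, since the text above does not say how overflow is handled.

```python
def mix(primary, others, limit=1.0):
    """Sum the primary stream with the other streams sample by sample, then clip."""
    mixed = []
    for i, sample in enumerate(primary):
        # Streams shorter than the primary simply stop contributing.
        total = sample + sum(stream[i] for stream in others if i < len(stream))
        mixed.append(max(-limit, min(limit, total)))
    return mixed

# Usage: the gain-controlled far-end stream plus one locally stored stream.
agc_stream = [0.5, 0.9, -0.9]
local_stream = [0.25, 0.25, -0.25]
mixed = mix(agc_stream, [local_stream])
```

Hard clipping is the simplest safeguard; a production mixer would more likely attenuate or soft-limit the sum to avoid distortion.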
To achieve the above and other related purposes, a second aspect of the present application provides an audio encoding and decoding method with an automatic gain control function, applied to an audio codec, the method comprising: receiving a near-end audio analog signal and performing an analog-to-digital conversion operation, a data matching operation and a data caching operation to generate a corresponding near-end audio digital signal; performing a noise monitoring operation on the near-end audio digital signal to generate environmental noisiness data; receiving control information and a far-end audio digital signal, and performing an automatic gain control operation on the far-end audio digital signal based on the environmental noisiness data and the control information to generate an automatic gain audio signal; and performing a data caching operation, a format conversion and matching operation, a mixing operation and a digital-to-analog conversion operation on the automatic gain audio signal to generate and output a far-end audio analog signal that has undergone automatic gain control.
In some embodiments of the second aspect of the present application, the noise monitoring operation comprises: receiving the near-end audio digital signal and removing its DC component; applying a Hanning window function to the DC-free near-end audio digital signal; converting the windowed near-end audio digital signal from the time domain to the frequency domain to generate a near-end audio frequency-domain signal; calculating the short-time energy and zero-crossing rate of the near-end audio frequency-domain signal, and performing voice activity detection based on them to generate a voice activity detection result; performing noise power spectral density analysis based on the voice activity detection result to generate a noise power spectral density; and calculating the total noise energy, spectral flatness and spectral entropy from the noise power spectral density to generate a comprehensive noise feature comprising these three quantities, and calculating the environmental noisiness data from the comprehensive noise feature.
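One common way to make a noise power spectral density estimate depend on the voice activity detection result is to update it only during non-speech frames. The recursive averaging below is a sketch of that idea under assumptions: the estimator and the smoothing constant `alpha` are illustrative choices, not the method stated in this section.

```python
def update_noise_psd(noise_psd, frame_power, speech_active, alpha=0.9):
    """Update the per-bin noise estimate from one frame's power spectrum."""
    if speech_active:
        # Hold the estimate while speech is present so it is not contaminated.
        return list(noise_psd)
    # Otherwise move each bin a fraction (1 - alpha) toward the new observation.
    return [alpha * n + (1.0 - alpha) * p for n, p in zip(noise_psd, frame_power)]

# Usage: the same frame power, once flagged as speech and once as noise only.
estimate = [1.0, 1.0]
held = update_noise_psd(estimate, [2.0, 0.0], speech_active=True)
updated = update_noise_psd(estimate, [2.0, 0.0], speech_active=False)
```

Here `frame_power` stands for the squared magnitude of each frequency bin of the current frame.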
In some embodiments of the second aspect of the present application, the automatic gain control operation comprises: calculating a target volume value for the far-end audio digital signal based on the control information and the environmental noisiness data; calculating a gain value for the far-end audio digital signal from a preset gain coefficient and the target volume value; smoothing the gain value of the far-end audio digital signal based on a preset smoothing factor; and computing the automatic gain audio signal from the smoothed gain value.
To achieve the above and other related purposes, a third aspect of the present application provides an audio codec system with an automatic gain control function, comprising: a main processor and the above audio codec device with an automatic gain control function.
As described above, the audio codec device, method and system with an automatic gain control function of the present application have the following beneficial effects. First, the application integrates environmental noise monitoring and the AGC function inside the audio codec, so that the AGC module can dynamically adjust the audio gain parameters according to real-time environmental noise data while keeping the audio output consistent and clear under varying environmental conditions, thereby significantly improving the processing quality of the audio signal and the user's listening experience. Second, integrating the AGC function into the audio codec reduces dependence on external hardware and the complexity of system integration, reduces the need for additional equipment, simplifies system deployment, and lowers the technical barrier to using AGC technology. Finally, integrating the AGC function into the audio codec makes full use of the codec's high sampling rate capability to process audio signals with lower algorithmic delay, thereby achieving a faster and more accurate response.
Brief Description of the Drawings
FIG. 1 is a schematic structural diagram of the external connections of an audio codec device with an automatic gain control function in an embodiment of the present application.
FIG. 2 is a schematic structural diagram of the internal connections of an audio codec device with an automatic gain control function in an embodiment of the present application.
FIG. 3 is a schematic structural diagram of the analog-to-digital conversion unit of an audio codec device with an automatic gain control function in an embodiment of the present application.
FIG. 4 is a schematic structural diagram of the environmental noise monitoring unit of an audio codec device with an automatic gain control function in an embodiment of the present application.
FIG. 5 is a schematic structural diagram of the automatic gain control unit of an audio codec device with an automatic gain control function in an embodiment of the present application.
FIG. 6 is a schematic structural diagram of the digital-to-analog conversion unit of an audio codec device with an automatic gain control function in an embodiment of the present application.
FIG. 7 is a schematic flow chart of an audio encoding and decoding method with an automatic gain control function in an embodiment of the present application.
FIG. 8 is a schematic structural diagram of an audio codec system with an automatic gain control function in an embodiment of the present application.
Detailed Description
The following describes embodiments of the present application through specific examples. Those skilled in the art can readily understand other advantages and effects of the present application from the contents disclosed in this specification. The present application can also be implemented or applied through other, different embodiments, and the details in this specification can be modified or changed in various ways, based on different viewpoints and applications, without departing from the spirit of the present application. It should be noted that the following embodiments and the features in them can be combined with one another provided they do not conflict.
It should be noted that the following description refers to the accompanying drawings, which describe several embodiments of the present application. It should be understood that other embodiments may also be used, and that changes in mechanical composition, structure, electrical configuration and operation may be made without departing from the spirit and scope of the present application. The following detailed description should not be considered limiting, and the scope of the embodiments of the present application is defined only by the claims of the granted patent. The terms used here are only for describing specific embodiments and are not intended to limit the present application. Spatially relative terms, such as "up", "down", "left", "right", "below", "beneath", "lower", "above" and "upper", may be used in the text to describe the relationship between one element or feature shown in the figures and another element or feature.
In this application, unless otherwise expressly specified and limited, the terms "mounted", "coupled", "connected", "fixed", "held" and the like should be understood in a broad sense. For example, a connection may be fixed, detachable or integral; it may be mechanical or electrical; it may be direct, or indirect through an intermediate medium; or it may be an internal communication between two elements. Those of ordinary skill in the art can understand the specific meanings of these terms in this application according to the specific circumstances.
Furthermore, as used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless the context indicates otherwise. It should be further understood that the terms "comprise" and "include" indicate the presence of the stated features, operations, elements, components, items, kinds and/or groups, but do not exclude the presence, occurrence or addition of one or more other features, operations, elements, components, items, kinds and/or groups. The terms "or" and "and/or" used herein are to be interpreted as inclusive, meaning any one or any combination. Therefore, "A, B or C" or "A, B and/or C" means "any of the following: A; B; C; A and B; A and C; B and C; A, B and C". An exception to this definition occurs only when a combination of elements, functions or operations is inherently mutually exclusive in some way.
To solve the problems described in the background above, the present invention provides an audio codec device, method, and system with an automatic gain control function. It aims to address the problems that, in audio encoding and decoding, automatic gain control systems are difficult to deploy, parameter setting poses high technical barriers, and automatic gain control systems lack environmental monitoring, all of which reduce the clarity and reliability of the transmitted audio. To make the purpose, technical solutions, and advantages of the present invention clearer, the technical solutions in the embodiments are described in further detail below through the following embodiments and with reference to the accompanying drawings. It should be understood that the specific embodiments described herein are only intended to explain the present invention and not to limit it.
Before the present invention is described in further detail, the nouns and terms used in the embodiments of the present invention are explained. They are to be interpreted as follows:
<1> Automatic Gain Control (AGC): automatically adjusts the gain level of an audio signal so that the amplitude of the output signal stays within a suitable range and the signal is neither too strong nor too weak. AGC is commonly used in audio processing, wireless communication, and other fields.
<2> DC component removal: removing the DC component from an audio signal so that the mean of the signal is zero. This prevents the signal from drifting away from zero and facilitates signal processing and encoding and decoding.
<3> Short-time energy: a measure of the instantaneous energy of a signal, used to detect instantaneous changes and characteristics of the signal.
<4> Zero-crossing rate: the number of times a signal crosses zero per unit time, commonly used for feature extraction and analysis of sounds in audio signals.
<5> Noise power spectral density: the distribution of the power of a noise signal over frequency, commonly used to analyze and process the spectral characteristics of noise.
<6> Spectral flatness: describes how flat the spectrum of a signal is, commonly used for feature extraction and analysis of audio signals.
<7> Spectral entropy: describes how unevenly the spectrum of a signal is distributed (its degree of skew), and can be used for feature analysis and processing of audio signals.
Embodiments of the present invention provide an audio codec device with an automatic gain control function, a method for performing automatic gain control in an audio codec, and a system to which the audio codec device is applied. With regard to the structure of the device, exemplary implementation scenarios are described below.
Figure 1 is a schematic diagram of the external connections of an audio codec device with an automatic gain control function in an embodiment of the present invention. The device in this embodiment mainly includes the following parts.
In one embodiment of the present invention, the external connections of the audio codec device with an automatic gain control function include an audio input, an audio codec, a control interface, a data interface, a main processor, and an audio output. The audio codec and the main processor are connected through the control interface and the data interface (audio interface). The audio codec interacts with the environment through the audio input and the audio output.
Further, the audio input is an audio acquisition device through which the audio codec obtains an audio stream from the outside. The audio acquisition device may be a digital microphone, an analog microphone, a sensor, or another form of acoustic-to-electric transducer, and its audio input may be a single-channel or a multi-channel signal. The audio input module converts an acoustic signal into an electrical signal: its input is a sound wave signal and its output is a voltage signal.
Exemplarily, the audio codec converts audio data from one format into another in order to reduce the cost of data computation, transmission, and storage. It includes an uplink (recording) path, a downlink (playback) path, and a control interface. The uplink path includes an analog-to-digital conversion unit, a data format conversion unit, a data cache unit, and a data interface; the downlink path includes a data cache unit, a data format conversion unit, a mixing unit (for multiple audio streams), and a digital-to-analog conversion unit. The audio codec system of the present invention builds on the architecture of a conventional audio codec system and adds environmental noise detection and automatic gain control modules, so that environmental noisiness is evaluated and adapted to automatically and automatic gain control is performed accordingly.
Further, the control interface is the channel through which the main processor issues commands to the audio codec and through which the audio codec reports its status back to the main processor. Communication protocols that may be used include I2C, Soundwire, Slimbus, HDA, and other interface protocols. Through the control interface, the user and the codec can exchange control instructions. In the audio codec device with an automatic gain control function, the main processor sends control information to the automatic gain control module through the control interface.
Exemplarily, the data interface is the channel for exchanging audio data between the main processor and the audio codec. For example, the raw audio stream is sent to the main processor through this interface, and the main processor transmits a far-end audio signal or a local music signal to the audio codec through it. Protocols that may be used include, but are not limited to, I2S/PCM/TDM, Soundwire, Slimbus, HDA, and other interface protocols.
Further, the main processor is an SoC (System on Chip) processing chip responsible for the main control function of the equipment. Specifically, the main processor receives and processes the uplink audio data from the audio codec, or transmits audio data to the downlink path of the audio codec for further processing. It also monitors the status of the audio codec, or sends parameter configurations to it, through the control interface.
Exemplarily, the audio output is an external sound playback device to which the audio codec outputs the audio stream. The sound playback device may be a receiver, a loudspeaker, an earphone, or a combination of the three, and may be single-channel or multi-channel. It converts an electrical signal into an acoustic signal: its input is a voltage signal and its output is a sound wave signal.
The external connection structure of the audio codec device with an automatic gain control function has been introduced above. The internal structure of the device is described in detail below with reference to Figure 2. As shown in Figure 2, the internal structure of the audio codec device with an automatic gain control function includes the following.
An analog-to-digital conversion unit, electrically connected to the environmental noise monitoring unit and the audio interface. The analog-to-digital conversion unit receives a near-end audio analog signal and performs an analog-to-digital conversion operation, a data matching operation, and a data caching operation to generate the corresponding near-end audio digital signal, and outputs the near-end audio digital signal to the environmental noise monitoring unit and the data interface.
In one embodiment of the present invention, the analog-to-digital conversion unit converts the external audio input signal into a digital signal that meets the processing requirements of the current system. Specifically, as shown in Figure 3, the analog-to-digital conversion unit includes an analog-to-digital conversion module, a data matching module, and a data cache module. The input of the unit is the near-end audio analog signal and its output is the near-end audio digital signal. The number of input signals of the unit equals the number of output signals.
Exemplarily, the analog-to-digital conversion module samples and quantizes the analog audio stream captured by the audio input and converts it into a digital audio stream, so that all subsequent processing operates on discrete digital signals.
$x(n) = x(nT)$ (Formula 1)
$x_q(n) = Q[x(n)]$ (Formula 2)
Formula 1 describes the sampling process in the analog-to-digital conversion module: the continuous analog input signal $x(t)$ is sampled in discrete time with sampling period $T$, giving $x(n)$. Formula 2 describes the quantization of the audio data: the amplitude of the sampled output $x(n)$ is discretized by the quantization function $Q$, giving $x_q(n)$. Together, these two steps convert the continuous analog signal into a discrete digital output signal.
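The sampling and quantization steps of Formulas 1 and 2 can be sketched as follows. This is a minimal illustration, not the patented implementation: the uniform rounding quantizer, the 16-bit word length, the full-scale value, and the function names `sample` and `quantize` are all assumptions introduced here.

```python
import math

def sample(x_analog, period, count):
    """Formula 1: x(n) = x(nT) for n = 0 .. count-1."""
    return [x_analog(n * period) for n in range(count)]

def quantize(samples, bits=16, full_scale=1.0):
    """Formula 2: x_q(n) = Q[x(n)], assuming a uniform signed quantizer."""
    max_code = 2 ** (bits - 1) - 1
    min_code = -(2 ** (bits - 1))
    codes = []
    for value in samples:
        code = round(value / full_scale * max_code)
        # Saturate instead of wrapping when the input exceeds full scale.
        codes.append(max(min_code, min(max_code, code)))
    return codes

# Illustrative input: a 1 kHz tone sampled at 48 kHz (values chosen for the example).
def tone(t):
    return 0.5 * math.sin(2.0 * math.pi * 1000.0 * t)

x_n = sample(tone, 1.0 / 48000.0, 48)
x_q = quantize(x_n, bits=16)
```

Saturating at the code limits is one possible design choice; a real converter's behavior at overload depends on the hardware.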
Further, the data matching module converts the format of the converted digital audio stream as required, including sampling-rate matching, signal bit-width matching, and so on.
Formula 3 filters the input signal and converts its sampling rate, where $x(n)$ is the output of the analog-to-digital conversion module, $M$ is the downsampling factor, $I$ is the upsampling factor, and $h(k)$ is the unit impulse response; the output is $y_d(n)$ or $y_u(n)$. Formula 4 describes signal bit-width matching: depending on the sign of the shift parameter $B$, the data matching module decides whether the bit-width matching operation is a left shift (amplification) or a right shift (attenuation). The input of this module is the analog-to-digital intermediate signal, and its output is the digital audio signal.
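Formulas 3 and 4 are not reproduced in this text, so the sketch below only illustrates the kind of operation described: FIR filtering combined with downsampling by $M$ or upsampling by $I$, and a signed bit shift by $B$. The direct-form convolution, the zero-stuffing interpolator, and all function names are assumptions, not the patent's formulas.

```python
def decimate(x, h, m):
    """Filter x with FIR taps h(k), then keep every m-th output sample."""
    y = []
    for n in range(0, len(x), m):
        acc = 0.0
        for k, tap in enumerate(h):
            if n - k >= 0:
                acc += tap * x[n - k]
        y.append(acc)
    return y

def interpolate(x, h, i):
    """Insert i-1 zeros after each sample, then filter with FIR taps h(k)."""
    stuffed = []
    for value in x:
        stuffed.append(value)
        stuffed.extend([0.0] * (i - 1))
    return decimate(stuffed, h, 1)

def match_bit_width(code, b):
    """Shift left by b bits when b >= 0 (amplify), right by -b bits otherwise."""
    return code << b if b >= 0 else code >> (-b)

y_d = decimate([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], [0.5, 0.5], 2)
y_u = interpolate([1.0, 2.0], [1.0, 1.0], 2)
```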
Further, the data cache module buffers a certain amount of data at the audio interface, which avoids audio data being sampled repeatedly or lost because of jitter introduced by system clock design defects on the two sides of the interface. The input of this module is a digital signal, and it has no output data stream.
In one embodiment of the present invention, as shown in Figure 4, the audio codec device with an automatic gain control function further includes an environmental noise monitoring unit, electrically connected to the automatic gain control unit. The environmental noise monitoring unit performs a noise monitoring operation on the received near-end audio digital signal to generate environmental noisiness data, and outputs the environmental noisiness data to the automatic gain control unit.
It is worth noting that the environmental noise monitoring unit of the present invention analyzes and evaluates the noise level of the surrounding environment in real time in order to optimize the audio processing algorithm, in particular the adaptive automatic gain control (AGC) algorithm. The unit combines several signal processing techniques to capture and quantify the characteristics of environmental noise accurately. It first removes the DC component from the raw audio signal, eliminating the long-term average offset so that instantaneous changes in the signal are captured accurately. It then applies a Hanning window function to localize the signal in time and reduce spectral leakage. Next, it uses the short-time Fourier transform (STFT) to convert the processed time-domain signal into a frequency-domain representation, which provides the data needed for subsequent analysis. More importantly, the present invention extracts a comprehensive set of noise features, which gives a multi-dimensional view of the environmental noise and improves the accuracy with which environmental noisiness is captured.
In this embodiment, as shown in Figure 4, the environmental noise monitoring unit includes a preprocessing module, a voice activity detection module, a noise power spectral density estimation module, a noise feature extraction module, and a noisiness determination module. The preprocessing module performs a frequency-domain preprocessing operation on the received near-end audio digital signal to generate a near-end audio frequency-domain signal, and sends the near-end audio frequency-domain signal to the voice activity detection module. The voice activity detection module receives the near-end audio frequency-domain signal, calculates its short-time energy and zero-crossing rate, performs voice activity detection based on the short-time energy and zero-crossing rate to generate a voice activity detection result, and sends the result to the noise power spectral density estimation module. The noise power spectral density estimation module receives the voice activity detection result, performs noise power spectral density analysis based on it to generate the noise power spectral density, and sends the noise power spectral density to the noise feature extraction module. The noise feature extraction module receives the noise power spectral density, performs a noise feature extraction operation based on it to generate comprehensive noise features, and sends them to the noisiness determination module. The noisiness determination module receives the comprehensive noise features, calculates the environmental noisiness data from them, and sends the data to the automatic gain control unit.
Further, the preprocessing module includes the following modules for performing the frequency-domain preprocessing operation: a DC component removal module, which receives the near-end audio digital signal, removes its DC component, and sends the DC-removed near-end audio digital signal to the window function processing module; a window function processing module, which applies a Hanning window function to the received DC-removed near-end audio digital signal and sends the windowed signal to the short-time Fourier transform module; and a short-time Fourier transform module, which receives the windowed signal, converts it from the time domain to the frequency domain to generate the near-end audio frequency-domain signal, and sends the near-end audio frequency-domain signal to the voice activity detection module.
Exemplarily, the DC component of the original sound signal $s(t)$ is removed to eliminate the long-term average offset, as shown in Formula 5.
Here, $s(t)$ is the original time-domain sound signal, $s'(t)$ is the signal after DC removal, and $T$ is the length of time considered.
Then, as shown in Formula 6, a Hanning window function is applied to the DC-removed signal to localize it in time and reduce spectral leakage.
Here, $s_w(t)$ is the signal after the window function is applied, and $T$ is the length of the window function.
Next, Formula 7 is used to convert the processed time-domain signal into a frequency-domain signal for further analysis and processing.
Here, $S(f,\tau)$ is the frequency-domain representation at frequency $f$ and time window $\tau$.
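The three preprocessing steps described above (DC removal, Hanning window, short-time Fourier transform) might be sketched per frame as shown below. Formulas 5 to 7 are not reproduced in this text, so the discrete forms used here are assumptions: the DC level is taken as the frame mean, the window is the symmetric Hann window, and a direct DFT stands in for an FFT to keep the example dependency-free. Function names, frame length, and hop size are illustrative.

```python
import cmath
import math

def remove_dc(frame):
    """Subtract the frame mean so the frame has zero average."""
    mean = sum(frame) / len(frame)
    return [v - mean for v in frame]

def hann_window(frame):
    """Multiply by a symmetric Hann window (frame length must exceed 1)."""
    n = len(frame)
    return [v * 0.5 * (1.0 - math.cos(2.0 * math.pi * i / (n - 1)))
            for i, v in enumerate(frame)]

def dft(frame):
    """Direct DFT, returning the non-negative frequency bins 0 .. n/2."""
    n = len(frame)
    return [sum(v * cmath.exp(-2j * math.pi * f * i / n)
                for i, v in enumerate(frame))
            for f in range(n // 2 + 1)]

def stft(signal, frame_len, hop):
    """Return S[tau][f]: one spectrum per time window tau."""
    spectra = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = signal[start:start + frame_len]
        spectra.append(dft(hann_window(remove_dc(frame))))
    return spectra

# Illustrative input: a tone with 4 cycles per 32 samples riding on a DC offset of 1.0.
signal = [1.0 + math.sin(2.0 * math.pi * 4.0 * i / 32.0) for i in range(64)]
spectra = stft(signal, frame_len=32, hop=16)
```

In the example, the DC offset is removed before windowing, so the strongest bin of each frame is the tone's bin rather than bin 0.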
In one embodiment of the present invention, voice activity detection on the near-end audio frequency-domain signal combines short-time energy and zero-crossing rate to determine whether speech is present in the signal, as shown in Formulas 8 to 10. Formula 8 describes the detection of the near-end audio frequency-domain signal using short-time energy. Formula 9 describes the detection of the near-end audio frequency-domain signal using the zero-crossing rate. Finally, the VAD decision is made from the short-time energy and the zero-crossing rate by Formula 10.
Here, $\theta_E$ is the preset threshold applied to the energy sum (short-time energy) of the near-end audio signal, and $\theta_Z$ is the preset threshold of the zero-crossing rate.
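A possible reading of the energy and zero-crossing test is sketched below. Formulas 8 to 10 are not reproduced in this text, so the decision rule (speech when the energy exceeds $\theta_E$ and the zero-crossing rate stays below $\theta_Z$) is an assumption, as is computing both features on the time-domain frame, where zero crossings are naturally defined. The function names are hypothetical.

```python
def short_time_energy(frame):
    """Sum of squared samples in the frame."""
    return sum(v * v for v in frame)

def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)

def is_speech(frame, theta_e, theta_z):
    """Assumed rule: enough energy and few zero crossings means speech."""
    return (short_time_energy(frame) > theta_e
            and zero_crossing_rate(frame) < theta_z)

voiced = [0.5, 0.6, 0.4, -0.5, -0.6]
hiss = [0.01, -0.01, 0.01, -0.01]
flags = [is_speech(voiced, 0.5, 0.5), is_speech(hiss, 0.5, 0.5)]
```

The thresholds 0.5 and 0.5 are placeholders; in practice they would be tuned to the signal scale and frame length.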
In one embodiment of the present invention, the noise feature extraction operation includes calculating the total noise energy, the spectral flatness, and the spectral entropy from the noise power spectral density; the comprehensive noise features consist of the total noise energy, the spectral flatness, and the spectral entropy.
Exemplarily, when the voice activity detection module detects no voice activity, the noise power spectral density is estimated for each frequency $f$ and time window $\tau$ as shown in Formula 11.
Here, $N_{ns}$ is the number of samples that the VAD judges to be non-speech, and $N(f,\tau)$ is the noise power spectral density at frequency $f$ and time window $\tau$.
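Formula 11 is not reproduced in this text. One common estimator that matches the description, offered here only as an assumption, averages the squared spectral magnitude over the $N_{ns}$ frames that the VAD marked as non-speech. The function name and the return of `None` when no noise frame exists are illustrative choices.

```python
def estimate_noise_psd(spectra, speech_flags):
    """Average |S(f, tau)|^2 over the frames flagged as non-speech."""
    noise_frames = [s for s, speech in zip(spectra, speech_flags) if not speech]
    if not noise_frames:
        return None  # no non-speech frame available in this batch
    n_ns = len(noise_frames)
    bins = len(noise_frames[0])
    return [sum(abs(frame[f]) ** 2 for frame in noise_frames) / n_ns
            for f in range(bins)]

spectra = [[1 + 0j, 2j], [3 + 4j, 0j], [10 + 0j, 10 + 0j]]
psd = estimate_noise_psd(spectra, [False, False, True])
```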
Further, in the environmental noise monitoring unit, the following noise features are extracted for further analysis and processing: the total energy, the spectral flatness, and the spectral entropy.
Exemplarily, Formula 12 describes the extraction of the total energy.
$E_{noise}(\tau) = \sum_f |N(f,\tau)|^2$ (Formula 12)
Here, $E_{noise}(\tau)$ is the sum of the squares of the noise power spectral density $N(f,\tau)$ over all frequencies $f$ in time window $\tau$, that is, the total energy level of the noise within that time window.
Exemplarily, Formulas 13 and 14 describe the extraction of the spectral flatness and the spectral entropy.
Here, $F(\tau)$ is the ratio of the geometric mean to the arithmetic mean of the noise power spectral density $N(f,\tau)$ within time window $\tau$, and measures the uniformity of the spectrum. $F$ is the total number of frequency points considered in the analysis; for example, if the analyzed spectrum covers 0 Hz to 1000 Hz with one frequency point per 100 Hz band, then $F$ is 10. $H(\tau)$ is the spectral entropy in time window $\tau$, that is, the complexity of the distribution of the power spectral density $N(f,\tau)$ of each frequency component $f$ within the total power spectrum. A high spectral entropy indicates a more uniform spectral distribution, while a low value indicates that the spectrum is concentrated at certain frequencies.
After the total energy, spectral flatness, and spectral entropy have been calculated, the noisiness of the environmental noise is calculated from them by Formula 15.
$D(\tau) = \alpha \cdot E_{noise}(\tau) + \beta \cdot F(\tau) + \gamma \cdot H(\tau)$ (Formula 15)
Here, $D(\tau)$ is the environmental noisiness in time window $\tau$, and $\alpha$, $\beta$, $\gamma$ are preset weighting parameters.
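The feature extraction and the weighted combination can be sketched as follows. Formulas 12 and 15 are taken from the text; Formulas 13 and 14 are not reproduced there, so the flatness (geometric mean over arithmetic mean) follows the prose definition and the entropy uses the standard Shannon form over the normalized spectrum, which is an assumption. The small constant `eps` guarding against `log(0)` and all function names are also assumptions.

```python
import math

def total_energy(psd):
    """Formula 12: sum over f of |N(f, tau)|^2."""
    return sum(abs(v) ** 2 for v in psd)

def spectral_flatness(psd, eps=1e-12):
    """Geometric mean divided by arithmetic mean of the PSD."""
    log_mean = sum(math.log(v + eps) for v in psd) / len(psd)
    arithmetic = sum(psd) / len(psd)
    return math.exp(log_mean) / (arithmetic + eps)

def spectral_entropy(psd, eps=1e-12):
    """Shannon entropy of the PSD normalized to sum to one (assumed form)."""
    total = sum(psd) + eps
    return -sum((v / total) * math.log(v / total + eps) for v in psd)

def noisiness(psd, alpha, beta, gamma):
    """Formula 15: D = alpha * E_noise + beta * F + gamma * H."""
    return (alpha * total_energy(psd)
            + beta * spectral_flatness(psd)
            + gamma * spectral_entropy(psd))

flat_psd = [1.0, 1.0, 1.0, 1.0]    # white-noise-like spectrum
peaky_psd = [1.0, 0.0, 0.0, 0.0]   # energy concentrated in one bin
```

For the flat spectrum both flatness and entropy are at their maximum (1 and ln 4), while for the peaky spectrum both approach zero, which matches the interpretation given for $F(\tau)$ and $H(\tau)$.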
The process by which the environmental noise monitoring unit monitors environmental noise has been described in detail above. The process of applying automatic gain control to the output audio based on the environmental noisiness is described below with reference to Figure 5.
In one embodiment of the present invention, as shown in Figure 5, the automatic gain control unit includes: a target volume determination module with three inputs and one output, the three inputs being connected to the control interface, the audio interface, and the environmental noise monitoring unit respectively, and the output being connected to the gain calculation module, where the target volume determination module calculates the target volume value of the far-end audio digital signal from the control information and the environmental noisiness data; a gain calculation module with one input and one output, the input being connected to the target volume determination module and the output to the gain smoothing adjustment module, where the gain calculation module calculates the gain value of the far-end audio digital signal from a preset gain coefficient and the target volume value of the far-end audio digital signal, and sends the gain value to the gain smoothing adjustment module; a gain smoothing adjustment module with one input and one output, the input being connected to the gain calculation module and the output to the gain application module, where the gain smoothing adjustment module smooths the gain value of the far-end audio digital signal using a preset smoothing factor and sends the smoothed gain value to the gain application module; and a gain application module with one input and one output, the input being connected to the gain smoothing adjustment module and the output to the digital-to-analog conversion unit, where the gain application module computes the automatic gain audio signal from the smoothed gain value and sends the automatic gain audio signal to the digital-to-analog conversion unit.
In this embodiment, the target volume value is first determined from the environmental noisiness, so that the output audio signal after adaptive gain processing does not exceed the safe maximum volume. The specific process is shown in Formula 14.
$V_{target}(\tau) = \min\left(V_{base} + k \cdot (D(\tau) - D_{min}),\ V_{max}\right)$ (Formula 14)
Here, $V_{target}(\tau)$ is the target volume value for time window $\tau$; $V_{base}$ is the reference volume value, that is, the standard volume level at the lowest noisiness; $D(\tau)$ is the noisiness in time window $\tau$; $D_{min}$ is the minimum noisiness threshold, and when the environmental noisiness is below this value the target volume stays at the baseline level; $k$ is the coefficient that controls how strongly the target volume is adjusted; and $V_{max}$ is the configured maximum volume limit, used for hearing protection.
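The target-volume rule above can be written as a one-line function. The sketch follows the formula as given, with one addition taken from the prose rather than the formula: the noisiness excess is floored at zero so that the target stays at $V_{base}$ when $D(\tau)$ is below $D_{min}$. The numeric values in the example are placeholders.

```python
def target_volume(d, v_base, k, d_min, v_max):
    """V_target = min(V_base + k * (D - D_min), V_max).

    The excess (D - D_min) is floored at 0 so the target never drops
    below V_base; this follows the prose description, not the formula.
    """
    excess = max(d - d_min, 0.0)
    return min(v_base + k * excess, v_max)

moderate = target_volume(5.0, v_base=60.0, k=2.0, d_min=1.0, v_max=80.0)
loud = target_volume(50.0, v_base=60.0, k=2.0, d_min=1.0, v_max=80.0)
quiet = target_volume(0.5, v_base=60.0, k=2.0, d_min=1.0, v_max=80.0)
```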
After the target volume, capped at the safe maximum volume, has been determined, the required gain is calculated from the target volume and the current volume by Formula 15.
Here, $G(\tau)$ is the gain coefficient for time window $\tau$, and $E_{current}(\tau)$ is the actual sound energy of the current time window $\tau$.
To further improve the quality, stability, and applicability of the automatic gain audio signal, the gain obtained from Formula 15 is smoothed by Formula 16 to avoid abrupt changes in volume.
$G_{smooth}(\tau) = \beta \cdot G(\tau) + (1-\beta) \cdot G_{smooth}(\tau-1)$ (Formula 16)
Here, $G_{smooth}(\tau)$ is the smoothed gain for time window $\tau$, and $\beta$ is the smoothing factor of the gain adjustment (distinct from the weighting parameter $\beta$ used in the noisiness calculation).
The smoothed gain calculated by Formula 16 is then applied to the output audio to adjust the volume of the signal.
$y(t) = G_{smooth}(\tau) \cdot s(t)$ (Formula 17)
Here, $y(t)$ is the gain-adjusted signal and $s(t)$ is the signal to be output, before gain adjustment.
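The gain smoothing and application steps can be sketched as below. The smoothing recursion and the multiplication by the smoothed gain follow the formulas given in the text. The gain formula itself is not reproduced there, so `raw_gain` uses a plain ratio of target volume to current energy purely as a placeholder assumption; all function names are hypothetical.

```python
def raw_gain(v_target, e_current, eps=1e-12):
    """Placeholder gain: ratio of target volume to current energy (assumed)."""
    return v_target / (e_current + eps)

def smooth_gain(g, g_prev, beta):
    """G_smooth(tau) = beta * G(tau) + (1 - beta) * G_smooth(tau - 1)."""
    return beta * g + (1.0 - beta) * g_prev

def apply_gain(frame, g_smooth):
    """y(t) = G_smooth(tau) * s(t) for every sample in the frame."""
    return [g_smooth * v for v in frame]

# With a constant raw gain of 2.0, the smoothed gain approaches 2.0 gradually.
g = 1.0
history = []
for _ in range(50):
    g = smooth_gain(2.0, g, beta=0.25)
    history.append(g)
```

A smaller smoothing factor makes the gain react more slowly, which trades responsiveness for fewer audible jumps in volume.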
The generation of the automatic gain audio signal has been described in detail above. The process of outputting audio based on the automatic gain audio signal is described in detail below with reference to Figure 6. The automatic gain control unit sends the automatic gain audio signal to the digital-to-analog conversion unit.
In one embodiment of the present invention, as shown in Figure 6, the digital-to-analog conversion unit includes: a data cache module, electrically connected to the automatic gain control unit and the data matching module, which performs delay matching and data caching on the automatic gain audio signal and sends it to the data matching module; a data matching module, electrically connected to the mixing module, which performs format conversion and matching on the automatic gain audio signal to generate a digital-to-analog intermediate signal and sends it to the mixing module; a mixing module, electrically connected to the digital-to-analog conversion submodule, which mixes and superimposes the digital-to-analog intermediate signal with the digital signals of other far-end audio streams or with locally stored digital audio streams to generate a mixed signal, and sends the mixed signal to the digital-to-analog conversion submodule; and a digital-to-analog conversion submodule, which performs a digital-to-analog conversion operation on the mixed signal to generate and output the far-end audio analog signal.
In this embodiment, the digital-to-analog conversion unit is mainly responsible for converting the digital signal into the audio output signal (usually an analog signal). Its submodules are the digital-to-analog conversion submodule, the mixing module, the data matching module, and the data cache module. The data matching module converts the format of the digital audio stream as required, including sampling-rate matching and signal bit-width matching. The data cache module buffers a certain amount of data, which avoids audio data being sampled repeatedly or lost because of jitter introduced by system clock design defects on the two sides of the interface.
In this embodiment, the mixing module mixes multiple downlink audio streams, such as far-end audio streams that the main processor sends to the audio codec through the data interface and locally stored audio streams.
Here, Formula 18 states that M input audio streams x(n) are summed to produce the mixed output audio stream y(n).
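Read literally, this description of Formula 18 corresponds to a sample-by-sample sum, y(n) = x_1(n) + ... + x_M(n). The following Python sketch shows that sum; the function name `mix_streams` and the saturation to the sample range are assumptions added for illustration, since the text only describes the accumulation.

```python
def mix_streams(streams, sample_bits=16):
    """Sum M equal-length PCM streams sample by sample: y(n) = sum_i x_i(n)."""
    if not streams:
        return []
    length = len(streams[0])
    if any(len(s) != length for s in streams):
        raise ValueError("all streams must have the same length")
    hi = (1 << (sample_bits - 1)) - 1
    lo = -(1 << (sample_bits - 1))
    # Saturating to [lo, hi] is an added assumption to avoid wrap-around.
    return [max(lo, min(hi, sum(column))) for column in zip(*streams)]
```

For example, `mix_streams([[1, 2, 3], [10, 20, 30]])` adds the two streams element by element, while two loud 16-bit samples that would overflow are clamped to the largest representable value.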
Furthermore, the digital-to-analog conversion submodule converts the digital audio stream processed by the codec into an analog audio stream.
v(t) = k·y(t) (Formula 20)
Here, Formula 19 raises the sampling rate to the analog-processing sampling rate by interpolation: values are fitted between the digital sample points to realize the resampling. Formula 20 converts the digital signal into an analog signal: the discrete digital input y(n) is multiplied by the scale factor k to obtain the output analog voltage signal v(t), which turns the discrete digital signal into a continuous analog signal.
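The two steps described above, interpolation to a higher rate and scaling by k, can be sketched in Python as follows. Linear interpolation, the integer upsampling factor, and the function names are assumptions made for illustration; the source does not state which interpolation method Formula 19 uses, and a real converter would produce a continuous voltage rather than a list of numbers.

```python
def upsample_linear(samples, factor):
    """Raise the sampling rate by an integer factor using linear interpolation."""
    if factor < 1:
        raise ValueError("factor must be a positive integer")
    if len(samples) < 2:
        return list(samples)
    out = []
    for a, b in zip(samples, samples[1:]):
        # Emit `factor` points on the straight line from a towards b.
        for step in range(factor):
            out.append(a + (b - a) * step / factor)
    out.append(samples[-1])
    return out


def to_voltage(samples, k):
    """Scale each sample by k, mirroring v(t) = k * y(t) in Formula 20."""
    return [k * s for s in samples]
```

With 16-bit input, choosing `k = 1 / 32768` maps full-scale samples to roughly ±1 V; that value is only an example.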
The audio codec device with an automatic gain control function has been described in detail above. The audio codec method with an automatic gain control function is described below with reference to FIG. 7; the method is applied to an audio codec.
Step S71: receive a near-end audio analog signal and perform analog-to-digital conversion, data matching, and data buffering to generate the corresponding near-end audio digital signal.
Step S72: perform a noise monitoring operation on the near-end audio digital signal to generate environmental noisiness data.
In one embodiment of the present invention, the noise monitoring operation includes: receiving the near-end audio digital signal and removing its DC component; applying a Hanning window to the DC-free near-end audio digital signal; converting the windowed near-end audio digital signal from the time domain to the frequency domain to generate a near-end audio frequency-domain signal; calculating the short-time energy and zero-crossing rate of the near-end audio frequency-domain signal and performing voice activity detection based on them to generate a voice activity detection result; performing noise power spectral density analysis based on the voice activity detection result to generate a noise power spectral density; and calculating the total noise energy, spectral flatness, and spectral entropy from the noise power spectral density to form a comprehensive noise feature containing these three quantities, from which the environmental noisiness data is calculated.
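The chain of operations listed above can be sketched in plain Python. This is a simplified illustration under stated assumptions, not the patent's algorithm: all function names, the thresholds, the smoothing constant `alpha`, and the recursive noise averaging are hypothetical, and the sketch computes short-time energy and zero-crossing rate on the time-domain frame, which is the conventional choice, whereas the text attributes them to the frequency-domain signal.

```python
import cmath
import math


def remove_dc(frame):
    mean = sum(frame) / len(frame)
    return [x - mean for x in frame]


def hann_window(frame):
    """Apply a symmetric Hann (Hanning) window; needs at least 2 samples."""
    n = len(frame)
    return [x * (0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)))
            for i, x in enumerate(frame)]


def power_spectrum(frame):
    """Naive O(N^2) DFT returning |X(k)|^2 / N for k = 0..N/2."""
    n = len(frame)
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(frame))
        spectrum.append(abs(acc) ** 2 / n)
    return spectrum


def short_time_energy(frame):
    return sum(x * x for x in frame) / len(frame)


def zero_crossing_rate(frame):
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)


def is_speech(frame, energy_thresh, zcr_thresh):
    """Crude VAD: speech if energy is high and ZCR is low (hypothetical rule)."""
    return (short_time_energy(frame) > energy_thresh
            and zero_crossing_rate(frame) < zcr_thresh)


def update_noise_psd(noise_psd, frame_psd, speech, alpha=0.9):
    """Recursively average the PSD over frames classified as non-speech."""
    if speech:
        return noise_psd
    return [alpha * n + (1 - alpha) * p for n, p in zip(noise_psd, frame_psd)]


def noise_features(psd, eps=1e-12):
    """Return (total energy, spectral flatness, spectral entropy in bits)."""
    total = sum(psd)
    geometric_mean = math.exp(sum(math.log(p + eps) for p in psd) / len(psd))
    flatness = geometric_mean / (total / len(psd) + eps)
    probs = [p / (total + eps) for p in psd]
    entropy = -sum(p * math.log2(p + eps) for p in probs)
    return total, flatness, entropy
```

A flat (white-noise-like) spectrum gives a flatness near 1 and maximal entropy, while a spectrum dominated by one bin gives values near 0, which is why these features help characterize the noise. How the three features are combined into a single noisiness value is not specified in this passage, so the sketch stops at the feature tuple.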
Step S73: receive control information and a far-end audio digital signal, and perform an automatic gain control operation on the far-end audio digital signal based on the environmental noisiness data and the control information to generate an automatic gain audio signal.
In one embodiment of the present invention, the automatic gain control operation includes: calculating a target volume value for the far-end audio digital signal from the control information and the environmental noisiness data; calculating a gain value for the far-end audio digital signal from a preset gain coefficient and the target volume value; smoothing the gain value with a preset smoothing factor; and applying the smoothed gain value to the far-end audio digital signal to obtain the automatic gain audio signal.
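The four steps above (target volume, gain, smoothing, application) might be sketched as follows. Every mapping here is an assumption chosen for illustration: the linear noisiness-to-boost rule, the dB-based gain, the default parameter values, and all function names are hypothetical, because this passage names the inputs of each step but not its formulas.

```python
import math


def target_volume_db(user_volume_db, noisiness,
                     boost_db_per_unit=6.0, max_boost_db=12.0):
    """Raise the user-set volume as the environment gets noisier (hypothetical)."""
    return user_volume_db + min(max_boost_db, boost_db_per_unit * noisiness)


def frame_level_db(frame, eps=1e-12):
    """RMS level of a frame of samples in [-1, 1], in dB relative to full scale."""
    rms = math.sqrt(sum(x * x for x in frame) / len(frame))
    return 20 * math.log10(rms + eps)


def raw_gain(frame, target_db, gain_coeff=1.0):
    """Linear gain that moves the frame level to target_db, scaled by gain_coeff."""
    return gain_coeff * 10 ** ((target_db - frame_level_db(frame)) / 20)


def smooth_gain(prev_gain, new_gain, beta=0.9):
    """First-order recursive smoothing; beta closer to 1 reacts more slowly."""
    return beta * prev_gain + (1 - beta) * new_gain


def apply_gain(frame, gain, limit=1.0):
    """Multiply by the gain and clamp to [-limit, limit] to avoid overflow."""
    return [max(-limit, min(limit, gain * x)) for x in frame]
```

In use, the gain from the previous frame is carried forward: `g = smooth_gain(g, raw_gain(frame, target))` followed by `apply_gain(frame, g)`. Smoothing keeps the gain from jumping between frames, which would otherwise be audible as pumping.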
Step S74: perform data buffering, format conversion and matching, mixing, and digital-to-analog conversion on the automatic gain audio signal to generate and output a far-end audio analog signal that has undergone automatic gain control.
It should be noted that, when the audio codec method with an automatic gain control function provided in the above embodiment performs automatic gain control in an audio codec, the division into the program modules described above is only an example. In practical applications, the processing can be assigned to different program modules as needed; that is, the internal structure of the method can be divided into different program modules to complete all or part of the processing described above. In addition, the method embodiment and the device embodiment of the audio codec with an automatic gain control function belong to the same concept. The specific implementation is detailed in the device embodiment and is not repeated here.
FIG. 8 shows a schematic structural diagram of an audio codec system with an automatic gain control function in an embodiment of the present invention. In this embodiment, the audio codec system 800 with an automatic gain control function includes a main processor 801 and the audio codec device 802 with an automatic gain control function described above.
In summary, the present application provides an audio codec device, method, and system with an automatic gain control function, and thereby a way to improve the efficiency of adaptive gain control of audio signals inside an audio codec. First, the present application integrates environmental noise monitoring and the AGC function inside the audio codec, so that the AGC module can adjust the audio gain parameters dynamically according to real-time environmental noise data while keeping the audio output consistent and clear under various environmental conditions, which improves the processing quality of the audio signal and the user's listening experience. Second, integrating the AGC function into the audio codec reduces dependence on external hardware and the complexity of system integration, reduces the need for additional equipment, simplifies system deployment, and lowers the technical barrier to using AGC. Finally, integrating the AGC function into the audio codec makes full use of the codec's high sampling rate capability to process the audio signal with lower algorithmic delay, giving a faster and more accurate response. The present application therefore overcomes various shortcomings of the prior art and has high industrial utilization value.
The above embodiments merely illustrate the principles and effects of the present application and are not intended to limit it. Anyone familiar with this technology may modify or change the above embodiments without departing from the spirit and scope of the present application. Therefore, all equivalent modifications or changes made by a person of ordinary skill in the art without departing from the spirit and technical ideas disclosed in the present application shall still be covered by the claims of the present application.
Claims (9)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202410132693.3A CN117935818B (en) | 2024-01-30 | 2024-01-30 | Audio encoding and decoding device, method and system with automatic gain control function |
Publications (2)
Publication Number | Publication Date |
---|---|
CN117935818A | 2024-04-26 |
CN117935818B | 2024-10-18 |
Family
ID=90753444
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202410132693.3A Active CN117935818B (en) | 2024-01-30 | 2024-01-30 | Audio encoding and decoding device, method and system with automatic gain control function |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN117935818B (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN118800268B (en) * | 2024-09-14 | 2025-02-28 | 潍坊歌尔电子有限公司 | Voice signal processing method, voice signal processing device and storage medium |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101415045A (en) * | 2007-10-17 | 2009-04-22 | 北京三星通信技术研究有限公司 | Method and apparatus for implementing intelligent automatic level control in communication network |
CN113823307A (en) * | 2021-09-17 | 2021-12-21 | 广州华多网络科技有限公司 | Voice signal processing method and device, equipment, medium and product thereof |
Family Cites Families (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6959275B2 (en) * | 2000-05-30 | 2005-10-25 | D.S.P.C. Technologies Ltd. | System and method for enhancing the intelligibility of received speech in a noise environment |
CN103617797A (en) * | 2013-12-09 | 2014-03-05 | 腾讯科技(深圳)有限公司 | Voice processing method and device |
CN103915103B (en) * | 2014-04-15 | 2017-04-19 | 成都凌天科创信息技术有限责任公司 | Voice quality enhancement system |
WO2021013363A1 (en) * | 2019-07-25 | 2021-01-28 | Unify Patente Gmbh & Co. Kg | Method and system for avoiding howling disturbance on conferences |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
JP5706513B2 (en) | Spatial audio processor and method for providing spatial parameters based on an acoustic input signal | |
US8098813B2 (en) | Communication system | |
Jeub et al. | Noise reduction for dual-microphone mobile phones exploiting power level differences | |
WO2021179651A1 (en) | Call audio mixing processing method and apparatus, storage medium, and computer device | |
US20050108004A1 (en) | Voice activity detector based on spectral flatness of input signal | |
JP2008543194A (en) | Audio signal gain control apparatus and method | |
JP2009050013A (en) | Echo detection and monitoring | |
KR102417047B1 (en) | Signal processing method and apparatus adaptive to noise environment and terminal device employing the same | |
CN117935818B (en) | Audio encoding and decoding device, method and system with automatic gain control function | |
US20180151187A1 (en) | Audio Signal Processing | |
CN104580764B (en) | Ultrasonic pairing signal control in TeleConference Bridge | |
EP2158753B1 (en) | Selection of audio signals to be mixed in an audio conference | |
CN113012722B (en) | Sampling rate processing method, device, system, storage medium and computer equipment | |
Sauert et al. | Near end listening enhancement considering thermal limit of mobile phone loudspeakers | |
CN103093758B (en) | Electronic device and method for receiving voice signal thereof | |
KR101746178B1 (en) | APPARATUS AND METHOD OF VoIP PHONE QUALITY MEASUREMENT USING WIDEBAND VOICE CODEC | |
BR112020004703A2 (en) | time deviation estimate | |
CN115881080B (en) | Acoustic feedback processing method and device in voice communication system | |
Côté et al. | Speech communication | |
JP5298769B2 (en) | Noise estimation device, communication device, and noise estimation method | |
CN116015993A (en) | An audio signal processing method and terminal | |
CN117912474B (en) | Audio codec device, method and system with sound source directional function | |
US20030235293A1 (en) | Adaptive system control | |
US7672839B2 (en) | Detecting audio signal activity in a communications system | |
CN114584902B (en) | Method and device for eliminating nonlinear echo of intercom equipment based on volume control |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
CP03 | Change of name, title or address | | Address after: No. 11, Lane 999, Dangui Road, China (Shanghai) Pilot Free Trade Zone, Pudong New Area, Shanghai, 201203. Patentee after: Yaoxin Microelectronics (Shanghai) Electronic Technology Co.,Ltd. Country or region after: China. Address before: 201207 rooms 405 and 416, building 3, No. 1690, Cailun Road, China (Shanghai) Pilot Free Trade Zone, Pudong New Area, Shanghai. Patentee before: Yaoxin microelectronics technology (Shanghai) Co.,Ltd. Country or region before: China |