Detailed Description
For the purpose of making the objects, technical solutions and advantages of the embodiments of the present application more apparent, the technical solutions of the embodiments of the present application will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present application, and it is apparent that the described embodiments are some embodiments of the present application, but not all embodiments of the present application. All other embodiments, which can be made by those skilled in the art based on the embodiments of the application without making any inventive effort, are intended to be within the scope of the application.
The terms "first," "second," "third," "fourth" and the like in the description and in the claims and in the above drawings, if any, are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate such that the embodiments of the application described herein may be implemented, for example, in sequences other than those illustrated or otherwise described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
In the standard coding flow of the encoder, the processing procedure of the frequency spectrum quantization module for the coded audio frame comprises two parts:
The main calculation formula of the first part is as follows:

gg = 10^((gg_ind + gg_off) / 28)    (formula 1)

where gg represents the quantized global gain parameter, gg_ind represents the quantized global gain index, and gg_off represents the quantized global gain offset. Both gg_ind and gg_off are obtained by the standard coding procedure.
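As a compact illustration of the first part, the global gain computation can be sketched as follows (a sketch assuming the base-10 exponential relation gg = 10^((gg_ind + gg_off)/28) used in LC3-style coders; the function name is illustrative):

```python
def global_gain(gg_ind: int, gg_off: int) -> float:
    # Quantized global gain gg from the quantized global gain index
    # gg_ind and offset gg_off, both produced by the standard coding
    # procedure (base-10 exponent with divisor 28 assumed).
    return 10.0 ** ((gg_ind + gg_off) / 28.0)
```

Under this relation, increasing gg_ind + gg_off by 28 scales the gain by a factor of 10.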
The calculation formula of the second part is as follows:

X_q(n) = ⌊X_f(n) / gg + 0.375⌋,  if X_f(n) ≥ 0
X_q(n) = ⌈X_f(n) / gg − 0.375⌉,  if X_f(n) < 0
with n = 0, 1, ..., N_E − 1    (formula 2)

where X_f(n) denotes the spectral data samples filtered by the TNS (temporal noise shaping) module, X_q(n) denotes the quantized spectral data samples, and N_E (number of encoded spectral lines) is the number of coded spectral lines, which varies with the sampling frequency. Both silence frames and non-silence frames exist in the encoded audio; if every frame is encoded with the standard spectrum quantization process, computation is consumed in the encoder unnecessarily.
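The per-sample rule of the second part can be sketched as follows (a sketch assuming the floor/ceiling quantization with a 0.375 rounding offset used in LC3-style spectral quantization; the function name is illustrative):

```python
import math

def quantize_spectrum(x_f, gg):
    # Quantize the TNS-filtered spectral samples X_f(n),
    # n = 0..N_E-1, by the quantized global gain gg.
    x_q = []
    for x in x_f:
        if x >= 0:
            x_q.append(math.floor(x / gg + 0.375))
        else:
            x_q.append(math.ceil(x / gg - 0.375))
    return x_q
```

Note that an all-zero frame quantizes to all zeros for any positive gg, which is the property the rest of this description exploits for silence frames.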
Fig. 1 shows a schematic diagram of encoded audio containing silence frames for the method of the present application for optimizing the number of audio coding quantization operations. As shown in fig. 1, when the encoded audio includes a mute frame, it follows from the principle of the spectrum quantization module described above that the spectrum quantization result of the mute frame is a fixed value. If this fixed value is output directly, the spectrum quantization process for the mute frame can be omitted, thereby reducing the code rate of the encoder, the computational load of the encoder, and its power consumption.
Fig. 2 shows a specific embodiment of the method of optimizing the number of quantization of an audio code according to the present application.
In the embodiment shown in fig. 2, the method of the present application for optimizing the number of audio coding quantization operations includes: a process S101 of counting the output of the encoded audio frame from the time domain noise shaping module and calculating the maximum value of that output; a process S102 of determining, from the maximum value, whether the encoded audio frame is a mute frame: when the maximum value is zero, the encoded audio frame is a mute frame, and when the maximum value is non-zero, the encoded audio frame is a non-mute frame; and a process S103 of performing spectrum quantization on the encoded audio frame: when the encoded audio frame is a non-mute frame, the standard-flow spectrum quantization process is performed on it to obtain its spectrum quantization result, and when the encoded audio frame is a mute frame, the standard-flow spectrum quantization process is skipped and the spectrum quantization result is set directly to a first preset value.
In the embodiment shown in fig. 2, the method for optimizing the quantization count of audio coding according to the present application includes a process S101 of counting the output result of the encoded audio frame through the time domain noise shaping module and calculating the maximum value in the output result.
In this embodiment, the encoded audio frames are encoded according to the standard encoding flow of the encoder. The output of the encoded audio frame from the time domain noise shaping module is counted and denoted X_f(n), n = 0...N_E − 1, and the maximum value of the output X_f(n) of the time domain noise shaping module within one encoded audio frame is denoted X_f_max.
In the embodiment shown in fig. 2, the method for optimizing the quantization count of audio coding according to the present application includes a process S102 of determining whether the encoded audio frame is a mute frame according to the maximum value, including: when the maximum value is zero, the coded audio frame is a mute frame; and when the maximum value is non-zero, the encoded audio frame is a non-silence frame.
In this embodiment, based on the characteristic of an encoded audio frame that is a silence frame, whether the current encoded audio frame is a silence frame is determined from the processing result of the time domain noise shaping module for that frame. Specifically, it is judged whether the maximum value X_f_max of the output of the time domain noise shaping module is zero. When X_f_max is zero, all outputs X_f(n) of the time domain noise shaping module are zero, indicating that the current encoded audio frame is a mute frame; when X_f_max is non-zero, the current encoded audio frame is a non-mute frame, i.e., an encoded audio frame carrying speech information.
Using the time domain noise shaping module to judge whether the current encoded audio frame is a mute frame requires no additional computation: the judgment directly reuses an intermediate result produced in the normal standard encoding flow. The approach is therefore simple and easy to implement, and avoids any unnecessary increase in computational load.
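The statistics of S101 and the decision of S102 together amount to one pass over the TNS output; a minimal sketch (the maximum is taken over absolute values here, an assumption made so that frames containing only negative samples are not mistaken for silence):

```python
def is_silence_frame(x_f):
    # S101: maximum of the TNS module output X_f(n), n = 0..N_E-1,
    # within the current frame (absolute value assumed).
    x_f_max = max(abs(x) for x in x_f)
    # S102: the frame is a mute frame exactly when this maximum is
    # zero, i.e. every output sample of the TNS module is zero.
    return x_f_max == 0
```

No quantity beyond the TNS module's normal output is needed, matching the point above that the silence judgment adds no extra encoding work.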
In the embodiment shown in fig. 2, the method for optimizing the quantization count of audio coding according to the present application includes a process S103, where a spectrum quantization process is performed on a coded audio frame, including: when the encoded audio frame is a non-mute frame, performing a spectrum quantization process of a standard flow on the encoded audio frame to obtain a spectrum quantization result of the encoded audio frame; and when the encoded audio frame is a mute frame, skipping the spectrum quantization process of the standard flow, and directly setting the spectrum quantization result of the encoded audio frame as a first preset value.
In this embodiment, when the current encoded audio frame is determined to be a mute frame, X_f(n) is zero for all n. Substituting X_f(n) = 0 into formula 2, the standard operation of the spectrum quantization module, shows that the result of the spectrum quantization module for the mute frame is a fixed value, and that this value is independent of the quantized global gain parameter gg. Therefore, when the encoded audio frame is a mute frame, this value can be output directly as the spectrum quantization result of the frame, without running the computation of the spectrum quantization module.
In a specific embodiment of the present application, when the encoded audio frame is a mute frame, the operation result of the spectrum quantization module of the mute frame is set to 0 according to the standard specification of the spectrum quantization process.
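That the preset value 0 matches what the skipped computation would produce can be checked directly against the quantization rule (a sketch assuming the 0.375 rounding offset of LC3-style spectral quantization; the function name is illustrative):

```python
import math

def quantized_sample(x, gg):
    # Per-sample quantization rule of formula 2.
    if x >= 0:
        return math.floor(x / gg + 0.375)
    return math.ceil(x / gg - 0.375)

# For a mute frame every X_f(n) is 0, and floor(0 / gg + 0.375) = 0
# for any positive gain: the result is 0, independent of gg.
assert all(quantized_sample(0.0, gg) == 0 for gg in (0.5, 1.0, 10.0, 255.0))
```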
In this embodiment, a fixed value is calculated according to the standard specification of the spectrum quantization process and is used as the output result of the spectrum quantization module, so that the specific operation process of the spectrum quantization module is skipped directly, the operation amount of the encoder is reduced, and the power consumption of the encoder is reduced.
In one embodiment of the present application, prior to the spectrum quantization process, the method comprises: calculating a quantized global gain offset and a quantized global gain index according to the standard specification, calculating the quantized global gain parameter from the quantized global gain offset and the quantized global gain index, and then performing the spectrum quantization process.
In this embodiment, when the current encoded audio frame is determined to be a non-mute frame, the encoded audio frame is subjected to the spectrum quantization process according to the standard processing flow of the spectrum quantization module in the encoder, part of which is shown in formula 1 and formula 2.
In one example of the present application, according to the specification of the encoding process, when the spectrum quantization module operates on a non-mute encoded audio frame, the quantized global gain parameter gg is calculated from the quantized global gain offset and the quantized global gain index according to formula 1, and the spectrum quantization result of the encoded audio frame is then calculated according to formula 2.
In a specific embodiment of the present application, a quantized global gain minimum is calculated according to the standard specification, and the quantized global gain index is corrected according to the quantized global gain minimum.
The method for optimizing the audio coding quantization times of the application judges whether the coded audio frame is a mute frame or not by utilizing the output result of the time domain noise shaping module in the standard coding flow, and carries out different frequency spectrum quantization operation processes according to different types of the coded audio frame. When the encoded audio frame is a mute frame, skipping a spectrum quantization process, directly outputting a spectrum quantization result of the mute frame according to the working principle of a spectrum quantization module, reducing unnecessary operation amount in the encoder and reducing power consumption of the encoder; and when the encoded audio frame is a non-mute frame, carrying out a spectrum quantization operation process of the encoded audio frame according to a standard encoding flow.
Fig. 3 shows a specific example of the method of optimizing the number of quantization times of audio coding according to the present application.
In the specific example shown in FIG. 3, the maximum value X_f_max is first calculated from the output X_f(n) of the time domain noise shaping module. It is then judged whether X_f_max is zero: when X_f_max is zero, the current encoded audio frame is a mute frame; when X_f_max is non-zero, the current encoded audio frame is a non-mute frame.
When X_f_max is non-zero, the current encoded audio frame is a non-mute frame: the quantized global gain offset gg_off and the quantized global gain index gg_ind are calculated according to the standard encoding flow, the quantized global gain minimum gg_min is calculated to correct gg_ind, the quantized global gain gg is calculated according to formula 1, and the normal spectrum quantization process is performed on the non-mute frame according to formula 2. When X_f_max is zero, the current encoded audio frame is a mute frame: the quantized global gain index gg_ind is set to a second preset value, which may be 255, so that the encoding steps after the spectrum quantization process proceed normally, and the spectrum quantization result of the mute frame is output as a first preset value, which is 0 according to the audio coding specification. Outputting the spectrum quantization result of the mute frame directly omits the complex spectrum quantization computation; skipping the spectrum quantization process reduces the number of spectrum quantization operations and thus the computational load and power consumption of the encoder.
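The branch structure of FIG. 3 can be sketched end to end as follows (a hedged sketch: the gain relation, the 0.375 rounding offset, and the treatment of gg_ind for mute frames follow the description above; the gg_min correction of gg_ind is omitted for brevity, and all names are illustrative):

```python
import math

def quantize_frame(x_f, gg_ind, gg_off):
    # Returns (spectrum quantization result, gg_ind) per FIG. 3.
    if max(abs(x) for x in x_f) == 0:
        # Mute frame: skip spectral quantization entirely. gg_ind is
        # set to the second preset value 255 so the encoding steps
        # after quantization proceed normally; the spectral result
        # is the first preset value 0.
        return [0] * len(x_f), 255
    # Non-mute frame: standard flow, formula 1 then formula 2.
    gg = 10.0 ** ((gg_ind + gg_off) / 28.0)
    x_q = [math.floor(x / gg + 0.375) if x >= 0
           else math.ceil(x / gg - 0.375) for x in x_f]
    return x_q, gg_ind
```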
In one example of the present application, the effect of the method for optimizing the number of audio coding quantization operations is described using monaural encoded audio with a sampling rate of 16 kHz and a frame length of 10 ms. Suppose the audio is 10 seconds long, i.e. 1000 frames in total. If about 500 of these frames are mute frames and the other 500 are non-mute frames, then with the method of the present application the 500 mute frames output their spectrum quantization result directly, without running the spectrum quantization computation; the spectrum quantization process needs to be performed only for the other 500 non-mute frames. The number of spectrum quantization operations is thus reduced from 1000 to 500, which reduces the computational load of the encoder and its power consumption.
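The frame counting in this example is easily verified (the 500/500 split of mute and non-mute frames is the assumption stated above):

```python
frame_ms = 10
audio_ms = 10 * 1000                       # 10 seconds of audio
total_frames = audio_ms // frame_ms        # 1000 frames in total
silence_frames = 500                       # assumed split, as above
quantization_runs = total_frames - silence_frames
assert total_frames == 1000
assert quantization_runs == 500            # halved from 1000
```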
The method of the present application for optimizing the number of audio coding quantization operations judges whether an audio frame is a mute frame and, for mute frames, skips the spectrum quantization process and outputs the result directly, thereby reducing the computational load of the encoder, reducing its power consumption, and extending its operating time under a limited power budget. The method is simple to implement, can be applied to audio coding with frame lengths of 10 ms and 7.5 ms at all sampling rates, and thus has a wide application range. In addition, although the method reduces the number of spectrum quantization operations, it has no negative influence on the sound quality of the encoded audio, which is identical to the sound quality obtained by encoding according to the standard flow.
Fig. 4 shows an embodiment of the system of the application for optimizing the number of quantization of an audio code.
In the embodiment shown in fig. 4, the system of the present application for optimizing the number of audio coding quantization operations includes: a statistics module, which counts the output of the encoded audio frame from the time domain noise shaping module and calculates the maximum value of that output; a judging module, which determines from the maximum value whether the encoded audio frame is a mute frame: when the maximum value is zero, the encoded audio frame is a mute frame, and when the maximum value is non-zero, the encoded audio frame is a non-mute frame; and a spectrum quantization module, which performs spectrum quantization on the encoded audio frame: when the encoded audio frame is a non-mute frame, the standard-flow spectrum quantization process is performed on it to obtain its spectrum quantization result, and when the encoded audio frame is a mute frame, the standard-flow spectrum quantization process is skipped and the spectrum quantization result is set directly to a first preset value.
The system for optimizing the audio coding quantization times of the application judges whether the audio frame is a mute frame or not through the judging module, and then directly skips the spectrum quantization process to directly output the result, thereby reducing the operation amount of the encoder, reducing the power consumption of the encoder and prolonging the service time of the encoder under the limited power consumption.
In one embodiment of the application, a computer readable storage medium stores computer instructions operable to perform the method of optimizing the number of quantization of audio codes described in any of the embodiments. Wherein the storage medium may be directly in hardware, in a software module executed by a processor, or in a combination of the two.
A software module may reside in RAM memory, flash memory, ROM memory, EPROM memory, EEPROM memory, registers, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art. An exemplary storage medium is coupled to the processor such that the processor can read information from, and write information to, the storage medium.
The processor may be a central processing unit (CPU), another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, discrete gate or transistor logic, discrete hardware components, or any combination thereof. A general-purpose processor may be a microprocessor, but in the alternative, the processor may be any conventional processor, controller, microcontroller, or state machine. A processor may also be implemented as a combination of computing devices, e.g., a combination of a DSP and a microprocessor, a plurality of microprocessors, one or more microprocessors in conjunction with a DSP core, or any other such configuration. In the alternative, the storage medium may be integral to the processor. The processor and the storage medium may reside in an ASIC. The ASIC may reside in a user terminal. In the alternative, the processor and the storage medium may reside as discrete components in a user terminal.
In one embodiment of the application, a computer device includes a processor and a memory storing computer instructions, wherein: the processor operates the computer instructions to perform the method of optimizing the number of quantization of audio codes described in any of the embodiments.
In the embodiments provided in the present application, it should be understood that the disclosed apparatus and method may be implemented in other manners. For example, the apparatus embodiments described above are merely illustrative, e.g., the division of elements is merely a logical functional division, and there may be additional divisions of actual implementation, e.g., multiple elements or components may be combined or integrated into another system, or some features may be omitted, or not performed. Alternatively, the coupling or direct coupling or communication connection shown or discussed with each other may be an indirect coupling or communication connection via some interfaces, devices or units, which may be in electrical, mechanical or other form.
The units described as separate units may or may not be physically separate, and units shown as units may or may not be physical units, may be located in one place, or may be distributed over a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
The foregoing is only illustrative of the present application and is not to be construed as limiting the scope of the application, and all equivalent structural changes made by the present application and the accompanying drawings, or direct or indirect application in other related technical fields, are included in the scope of the present application.