US20090210235A1 - Encoding device, encoding method, and computer program product including methods thereof - Google Patents
- Publication number
- US20090210235A1 (US 12/367,963)
- Authority
- US
- United States
- Prior art keywords
- unit
- scale factor
- bands
- power
- frequency spectra
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L19/02—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders
- G10L19/032—Quantisation or dequantisation of spectral components
- G10L19/035—Scalar quantisation
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L19/02—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders
- G10L19/0212—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders using orthogonal transformation
Definitions
- the present invention relates to an encoding device, an encoding method, and a program product including an encoding method.
- Such audio coding technology is found in the Advanced Audio Coding (AAC) method and the High Efficiency-Advanced Audio Coding (HE-AAC) method.
- AAC Advanced Audio Coding
- HE-AAC High Efficiency-Advanced Audio Coding
- the AAC and HE-AAC methods are part of the ISO/IEC MPEG-2/4 audio standards and are widely used in digital broadcasting in Japan, such as digital terrestrial, BS digital, communication satellite, and one-segment broadcasting.
- a conventional encoding device for implementing the audio coding technology converts an audio signal into frequency spectra by Modified Discrete Cosine Transform (MDCT) conversion, quantizes the frequency spectra, and then performs encoding.
- MDCT Modified Discrete Cosine Transform
- the conventional encoding device quantizes the frequency spectra by utilizing auditory masking properties. Specifically, the conventional encoding device quantizes only sound that can be perceived by human hearing. In the quantization, a masking threshold is used to identify the components of sound that cannot be heard; that is, it is the threshold for whether or not sound can be heard.
- the conventional encoding device performs psychoacoustic analysis, which is a scheme for analyzing whether or not sound is acoustically heard, with respect to an audio signal (a sound source to be encoded). Then masking thresholds are determined for each frequency. Thereafter, for each band having a predetermined frequency width, the conventional encoding device determines an error limit.
- the error limit is an allowable error power that is allowed during quantization, based on the determined masking threshold. Then, using the allowable error power, the conventional encoding device quantizes only frequency spectra as a sound source that is acoustically heard.
- Japanese Laid-open Patent Publication No. 2006-18023, pages 5 to 11 and FIG. 1, discloses a scheme for adjusting a masking threshold.
- Japanese Laid-open Patent Publication No. 2001-7704, pages 5 to 9 and FIG. 1, discloses a scheme for improving encoding efficiency by reducing the number of bits used during encoding.
- Japanese Laid-open Patent Publication No. 7-202823, pages 3 to 5 and FIG. 1, discloses a scheme for specifying the amount of bit distribution.
- the above-described conventional technologies have a problem in that the sound quality deteriorates during encoding of a highly tonal audio signal.
- the conventional encoding device cannot reliably quantize frequency spectra adjacent to the peak during encoding of a tonal audio signal, and the device cannot satisfactorily perform encoding while maintaining a sufficient sound quality.
- the Japanese Laid-open Patent Publications described above do not disclose a scheme for reliably quantizing frequency spectra adjacent to the peak and cannot sufficiently improve the sound quality during encoding of a tonal audio signal.
- the present encoding device for converting an audio signal into frequency spectra and quantizing and encoding the frequency spectra includes a power correcting unit for correcting allowable error powers determined in accordance with the audio signal when a tonal frequency spectrum is detected from the frequency spectra, and a quantizing unit for quantizing each of the frequency spectra having greater powers than the allowable error powers corrected by the power correcting unit.
- an encoding device for encoding an audio signal including a frequency converting unit for converting the audio signal into frequency spectra, a power determining unit for determining allowable error powers in accordance with the audio signal, a detecting unit for detecting a tonal frequency spectrum from the frequency spectra converted by the frequency converting unit, a power correcting unit for correcting the allowable error powers by using a result of the detection performed by the detecting unit and the allowable error powers determined by the power determining unit, and a quantizing unit for quantizing each of the frequency spectra having greater powers than the allowable error powers corrected by the power correcting unit.
- FIG. 1 illustrates a diagram showing the underlying technology of an encoding device according to a first embodiment
- FIG. 2 illustrates a diagram showing the underlying technology of an encoding device according to the first embodiment
- FIGS. 3A to 3C illustrate diagrams showing the underlying technology of an encoding device according to the first embodiment
- FIG. 4 illustrates a diagram showing the underlying technology of an encoding device according to the first embodiment
- FIG. 5 illustrates a diagram showing the outline and configuration of an encoding device according to the first embodiment
- FIG. 6 illustrates a block diagram showing the configuration of the encoding device according to the first embodiment
- FIGS. 7A to 7D illustrate diagrams showing a tone detecting unit of the encoding device according to the first embodiment
- FIGS. 8A and 8B illustrate diagrams showing a psychoacoustic analyzing unit in the encoding device according to the first embodiment
- FIG. 9 illustrates a diagram showing an allowable-error-power correcting unit in the encoding device according to the first embodiment
- FIGS. 10A to 10D illustrate diagrams showing the allowable-error-power correcting unit in the encoding device according to the first embodiment
- FIGS. 11A and 11B illustrate diagrams showing a scale factor correcting unit in the encoding device according to the first embodiment
- FIG. 12 illustrates a flowchart showing a flow of processing of the encoding device according to the first embodiment
- FIG. 13 illustrates a flowchart showing a flow of processing performed by the scale factor correcting unit of the encoding device according to the first embodiment
- FIG. 14A illustrates a waveform of an audio signal
- FIG. 14B illustrates an encoded signal
- FIG. 14C illustrates frequency characteristics of an encoded signal
- FIG. 15 illustrates frequency spectra adjacent to a tonal frequency spectrum
- FIG. 16A illustrates an original sound
- FIG. 16B illustrates a generation of abnormal sound generated during a quantization using the known scheme
- FIG. 16C illustrates a reduction of the abnormal sound
- FIG. 17 illustrates a diagram showing an encoding device according to a second embodiment
- FIGS. 18A to 18C illustrate a diagram showing an encoding device according to the second embodiment
- FIG. 19 illustrates a flowchart of a scale factor correction processing in an encoding device according to the second embodiment
- FIG. 20 illustrates a diagram showing an encoding device according to a third embodiment
- FIG. 21 illustrates a diagram showing an encoding device according to the third embodiment
- FIG. 22 illustrates a diagram of a program for the encoding device according to the first embodiment
- FIGS. 23A to 23C illustrate diagrams for considering the underlying technology.
- With reference to FIGS. 23A and 23B, a conventional technology for coding an audio signal is considered in order to make clear a shortcoming that arises in quantization of frequency spectra adjacent to the peak of a tonal audio signal.
- a tonal audio signal e.g. a sinusoidal wave, a sweep wave, or the like
- in a tonal audio signal, intensities (powers in dB) concentrate in a specific band, which exhibits a relatively large peak compared to other bands. That is, a specific band has frequency spectra having high intensities, as shown in FIG. 23A, which illustrates frequency spectra obtained by performing MDCT conversion on a tonal audio signal.
- the allowable error powers determined with respect to the bands adjacent to the one including the peak also increase. Specifically, since the frequency spectra in the band including the peak are greater in power than other frequency spectra, the conventional encoding device determines large masking thresholds for the adjacent bands as well as for the band including the peak. As a result, the allowable error powers also increase. Consequently, as shown in FIG. 23C, the frequency spectra in the bands adjacent to the peak become smaller than or equal to the allowable error powers. Since the frequency spectra in the adjacent bands are regarded as spectra not to be quantized, they are not quantized.
- the resultant frequency spectra are composed of MDCT coefficients, each of which contains information on the amplitude and the phase of the audio signal, while FIGS. 23A, 23B, and 23C illustrate only the individual amplitudes.
- when the frequency spectra adjacent to the peak are not quantized, the information contained in those frequency spectra is lost. The loss of phase and amplitude information therefore affects the sound source associated with the peak and causes sound-quality deterioration, such as a sensation of trill.
- in a tonal audio signal, a sound source adjacent to a specific frequency in the band having the peak contributes effectively to the main sound source, so the loss of the information contained in the frequency spectra adjacent to the peak affects the sound quality of the encoded sound source more strongly than it does for a weakly tonal audio signal.
- An underlying technology for describing an encoding device according to a first embodiment will be described using FIGS. 1 to 4.
- the term “frequency spectrum” refers to a coefficient (e.g., an MDCT coefficient) for each frequency, obtained by converting an audio signal (a sound source) into the frequency domain by, for example, MDCT.
- the term “frequency spectral power” refers to the square of the frequency spectrum.
- the term “tonal frequency spectrum” refers to the coefficient at a frequency where peaks of the frequency spectral powers concentrate. For example, a frequency spectrum having a greater power than the average of all frequency spectral powers corresponds to the tonal frequency spectrum.
- An audio signal corresponding to a conversion source of the “tonal frequency spectrum” is referred to as a “tonal sound source”.
- the term “quantization” refers to processing that discards the fractional part of a numeric value (e.g., changing “1.8” and “2.1” to the integers “1” and “2”, respectively).
- a term “quantization value” indicates a value obtained by quantizing a frequency spectrum.
- quantization error is an error caused in each frequency spectrum by quantizing the frequency spectrum. Specifically, as shown in FIG. 1 , the difference between a pre-quantization frequency spectrum and a post-inverse-quantization spectrum corresponds to the quantization error, where the post-inverse-quantization spectrum is referred to as an “inversely quantized spectrum”.
- the term “inversely quantized spectrum” is a frequency spectrum obtained from a quantization value.
- the encoding device quantizes frequency spectra to obtain quantization values and then obtains inversely quantized spectra from the quantization values. Since the dynamic range of the frequency spectra is usually large, the encoding device first performs scaling using a predetermined “scale factor” to reduce the range as shown at ( 1 ) in FIG. 1 . Thereafter, as shown at ( 2 ) in FIG. 1 , the encoding device performs quantization to obtain quantization values. Then, as shown at ( 3 ) in FIG. 1 , the encoding device rescales (performs the reverse processing of the scaling performed at ( 1 ) in FIG. 1 ) the obtained quantization values by using the predetermined scale factor to obtain inversely quantized spectra.
- Equation 1 the inversely quantized spectrum
- Equation 2 the quantization value
- Expression 1 is an expression representing the relationship among the frequency spectrum, the quantization value, and the scale factor. “2^(scale factor)” indicates “2 raised to the power of scale factor.”
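The three steps at (1) to (3) in FIG. 1 can be sketched as follows. The bodies of Equation 1 and Equation 2 are not reproduced in this text, so this is only a minimal illustration under stated assumptions: scaling is taken to be division by 2^(scale factor), rescaling is multiplication by the same value, and negative coefficients are truncated toward zero. The function names `quantize` and `inverse_quantize` are hypothetical.

```python
import math

def quantize(spectrum, scale_factor):
    """Return the quantization value of one frequency spectrum (assumed form)."""
    # Step (1): scale the coefficient to reduce its dynamic range.
    scaled = abs(spectrum) / (2.0 ** scale_factor)
    # Step (2): quantize by discarding the fractional part (1.8 -> 1, 2.1 -> 2).
    magnitude = math.floor(scaled)
    # Assumption: the sign is kept, so negative values truncate toward zero.
    return int(math.copysign(magnitude, spectrum))

def inverse_quantize(quantization_value, scale_factor):
    """Return the inversely quantized spectrum (assumed form)."""
    # Step (3): undo the scaling applied in step (1).
    return quantization_value * (2.0 ** scale_factor)

q = quantize(7.2, 2)                 # 7.2 / 4 = 1.8, rounded down to 1
restored = inverse_quantize(q, 2)    # 1 * 4 = 4.0
quantization_error = 7.2 - restored  # difference lost by quantization
```

Under these assumptions, a larger scale factor gives a coarser quantization step and therefore a larger quantization error.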
- the frequency range in which the frequency spectra of an audio signal are analyzed is divided into a plurality of smaller frequency ranges, each having a predetermined width in frequency and each referred to as a band.
- an individual “scale factor” is given to each of the bands. For example, in the example shown in FIG. 1, one scale factor is given to a band “b” containing the frequency spectra (4) and (5) shown in FIG. 1.
- the scale factor is determined by the encoding device so that the quantization error power is smaller than an allowable error power.
- band power of frequency spectra refers to the sum of powers of frequency spectra contained in a band.
- quantization error power of a frequency spectrum refers to the value of the square of a quantization error.
- a quantization error power in one band refers to the sum of quantization error powers determined from quantization errors generated during quantization of frequency spectra contained in the band. Specifically, the relationship between a quantization error power and a quantization error in one band is given by Expression 2, where “^2” indicates a square.
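The definition above (the quantization error power of a band is the sum of the squared quantization errors of the frequency spectra in that band) can be written as a short sketch. The function name and the sample values are illustrative assumptions, not taken from the patent.

```python
def quantization_error_power(spectra, inversely_quantized):
    """Sum of squared quantization errors over the spectra of one band."""
    return sum(
        (original - restored) ** 2
        for original, restored in zip(spectra, inversely_quantized)
    )

# Hypothetical band with two frequency spectra and their inversely quantized values.
band_spectra = [3.0, -1.0]
band_restored = [2.0, 0.0]
error_power = quantization_error_power(band_spectra, band_restored)
```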
- the term “allowable error power” is a maximum quantization error power that is allowed during quantization.
- the allowable error power is the maximum quantization error power that is allowed among the quantization error powers caused when the scaled spectrum is quantized. In more detail, the allowable error power is derived for each band by converting the masking threshold corresponding to the band, where the masking threshold indicates whether or not sound can be acoustically heard.
- to determine the allowable error power, for example, the scheme described in ISO/IEC 13818-7 may be used, or other schemes may be used.
- the allowable error power is a “limit of an allowable quantization error power”.
- the allowable error power in one band is a quantization error power determined for the band and exhibits a maximum value that is allowed as an error generated during quantization of frequency spectra in the band.
- the encoding device quantizes frequency spectra so that the difference power between the power of pre-quantization frequency spectra and the power of inversely quantized spectra in one band is smaller than the allowable error power.
- the allowable error power for each band is derived from the individual masking thresholds.
- the derived allowable error power is also used to select which bands contain frequency spectra to be quantized. In this selection, the band powers are what is compared with the allowable error powers.
- encoding is processing for converting the quantization values and/or the scale factors into other values (codes) by using, for example, Huffman coding.
- a scale factor is assigned to each band, and each frequency spectrum contained in one band is quantized using the assigned scale factor.
- When the band power of a band is greater than the allowable error power, the encoding device regards the band as a band to be quantized. Also, as shown in FIG. 4, the encoding device quantizes the frequency spectra by using such a scale factor that the quantization error power becomes smaller than the allowable error power. Thus, as shown in FIG. 4, the encoding device performs quantization by using a scale factor that satisfies “Allowable Error Power > Quantization Error Power”.
- a case in which the band power equals the quantization error power corresponds to a case in which the quantization values of the frequency spectra are “0” (i.e., the frequency spectra are not quantized).
- the allowable error power serves as a threshold for determining whether or not all frequency spectra in a band are to be quantized.
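The selection rule above can be sketched as a simple search. The patent text does not say how the scale factor is searched, so the descending linear search, the search range, and the quantizer form (division by 2^(scale factor) followed by rounding down) are all assumptions made for illustration; the names are hypothetical.

```python
import math

def inversely_quantized(x, scale_factor):
    """Quantize and rescale one coefficient (assumed quantizer form)."""
    step = 2.0 ** scale_factor
    return math.copysign(math.floor(abs(x) / step) * step, x)

def error_power(band, scale_factor):
    """Quantization error power of a band for a given scale factor."""
    return sum((x - inversely_quantized(x, scale_factor)) ** 2 for x in band)

def choose_scale_factor(band, allowable, sf_max=8, sf_min=-8):
    """Return a scale factor with error power < allowable, or None if the
    band is not quantized because its band power does not exceed the limit."""
    band_power = sum(x * x for x in band)
    if band_power <= allowable:
        return None  # the band is not regarded as a band to be quantized
    # Assumed strategy: try the coarsest step first, then refine.
    for sf in range(sf_max, sf_min - 1, -1):
        if error_power(band, sf) < allowable:
            return sf
    raise ValueError("no scale factor in the searched range meets the limit")

chosen = choose_scale_factor([10.0, 3.0], allowable=5.0)
```

Searching from the coarsest step downward returns the largest scale factor that still satisfies “Allowable Error Power > Quantization Error Power” within the assumed range.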
- FIG. 5 illustrates an overview and features of the encoding device according to the embodiment.
- FIG. 5 shows the main units of the encoding device together with the signal processed in each unit when an audio signal is coded.
- a sound source an audio signal
- the device encodes the audio signal as shown in FIG. 5 .
- the encoding device has a main feature in that it can improve the encoded-sound quality of a tonal audio signal, as described below.
- a frequency converting unit converts the inputted audio signal into frequency spectra as shown in ( 1 ) in FIG. 5 .
- the frequency converting unit determines the powers of the frequency spectra for each of bands having a predetermined width in frequency as shown in ( 2 ) in FIG. 5 .
- the frequency converting unit determines the total of powers (the band power) corresponding to a sum of the individual powers of each frequency spectrum contained in a band.
- each unpainted bar indicates frequency spectra in each band.
- the power determining unit determines allowable error powers for the respective bands in accordance with the audio signal (refer to the above-described [Underlying Technology]).
- each painted bar indicates the allowable error power in each band (on a band basis).
- a detecting unit detects a tonal frequency spectrum from the frequency spectra converted by the frequency converting unit and also detects a band containing the tonal frequency spectrum. For example, the detecting unit detects a band “ 5 ” in ( 4 ) in FIG. 5 as a band containing a tonal frequency spectrum.
- a power correcting unit corrects the allowable error powers using both the result detected by the detecting unit and the allowable error powers determined by the power determining unit. Specifically, the power correcting unit individually corrects the allowable error power of each band adjacent to the band containing the tonal frequency spectrum so that the allowable error power becomes smaller than the sum of the powers of the frequency spectra in that band.
- the power correcting unit corrects the allowable error powers of the bands “4” and “6” adjacent to the band “5” so that the allowable error power of each of the bands “4” and “6” becomes smaller than the power of the frequency spectra of that band.
- the painted portions of the bars in the bands “4” and “6” at (6) in FIG. 5 show the corrected allowable error powers for those bands. That is, at (6) in FIG. 5, the unpainted portions in the bands “4” and “6” illustrate the amounts corrected by the power correcting unit.
- a quantizing unit quantizes frequency spectra having greater powers than the allowable error powers corrected by the power correcting unit. For example, the quantizing unit quantizes the frequency spectra contained in the band “ 5 ” containing the tonal frequency spectrum and the frequency spectra contained in the bands “ 4 ” and “ 6 ” that have allowable error powers corrected by the power correcting unit as shown in ( 7 ) in FIG. 5 .
- Because the allowable error powers are corrected so that the frequency spectra adjacent to a peak power are quantized, those frequency spectra can be reliably quantized, which improves the encoded-sound quality of a tonal audio signal.
- FIG. 6 is a block diagram showing the configuration of the encoding device according to the first embodiment.
- FIG. 7 is a drawing for describing a tone detecting unit in the first embodiment.
- FIG. 8 is a drawing for describing a psychoacoustic analyzing unit in the first embodiment.
- FIG. 9 is a drawing for describing an allowable-error-power correcting unit in the first embodiment.
- FIG. 10 is a drawing for describing the allowable-error-power correcting unit in the first embodiment.
- FIG. 11 is a diagram for describing a scale factor correcting unit in the first embodiment.
- the encoding device includes an input unit 101 , a Modified Discrete Cosine Transform (MDCT) unit 102 , a tone detecting unit 103 , a psychoacoustic analyzing unit 104 , an allowable-error-power correcting unit 105 , a quantization-band detecting unit 106 , a scale factor determining unit 107 , a scale factor correcting unit 108 , a quantizing unit 109 , an encoding unit 110 , and an output unit 111 .
- MDCT Modified Discrete Cosine Transform
- the MDCT unit 102 , the psychoacoustic analyzing unit 104 , and the tone detecting unit 103 may correspond to a “frequency converting unit”, a “power determining unit”, and a “detecting unit” respectively. Further the allowable-error-power correcting unit 105 and the quantizing unit 109 may correspond to a “power correcting unit” and a “quantizing unit” respectively.
- the scale factor determining unit 107 may correspond to a “first scale factor determining unit” and a “second scale factor determining unit”. The scale factor correcting unit 108 may correspond to a “third scale factor determining unit”.
- An audio signal as a sound source to be encoded is received by the input unit 101 and then fed to the MDCT unit 102 and the psychoacoustic analyzing unit 104 both of which are described below.
- the MDCT unit 102 converts the audio signal, transmitted from the input unit 101 , into frequency spectra. Specifically, through MDCT conversion, the MDCT unit 102 performs time-frequency conversion by which the audio signal transmitted from the input unit 101 is converted into frequency spectra.
- the time-frequency conversion herein means, for example, a conversion of an audio signal as a function of time variable into frequency spectra of frequency variable.
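For orientation, the time-frequency conversion can be illustrated with the standard textbook definition of the MDCT, which maps a frame of 2N time samples to N coefficients. This is a minimal sketch rather than the conversion used by the MDCT unit 102: it uses the direct O(N^2) formula and omits the windowing and frame overlap that practical encoders apply. The function name `mdct` is an assumption.

```python
import math

def mdct(frame):
    """Direct MDCT of a frame of 2N samples, returning N coefficients."""
    if len(frame) == 0 or len(frame) % 2 != 0:
        raise ValueError("frame length must be a positive even number")
    n = len(frame) // 2
    # X[k] = sum_t x[t] * cos(pi/N * (t + 1/2 + N/2) * (k + 1/2))
    return [
        sum(
            frame[t] * math.cos(math.pi / n * (t + 0.5 + n / 2.0) * (k + 0.5))
            for t in range(2 * n)
        )
        for k in range(n)
    ]

# Eight time samples yield four frequency spectra (MDCT coefficients).
coefficients = mdct([0.0, 1.0, 0.0, -1.0, 0.0, 1.0, 0.0, -1.0])
```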
- the MDCT unit 102 determines the power of the frequency spectra for each of the bands obtained by dividing the whole predetermined frequency range of the frequency spectra into bands of a predetermined width. For example, in the example shown in FIG. 7A , the frequency spectra within a width W are divided into seven bands, indicated as bands “0” to “6”, and the sum of the powers of the frequency spectra contained in each band is determined as a band power, E 0 to E 6 .
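The band power calculation can be sketched as follows. The sketch assumes that all bands have the same number of frequency spectra and that the number of spectra is a multiple of the number of bands; the function name and the sample values are illustrative only.

```python
def band_powers(spectra, num_bands):
    """Sum of squared frequency spectra in each of num_bands equal-width bands."""
    if num_bands <= 0 or len(spectra) % num_bands != 0:
        raise ValueError("spectra must divide evenly into num_bands bands")
    width = len(spectra) // num_bands
    return [
        sum(x * x for x in spectra[b * width:(b + 1) * width])
        for b in range(num_bands)
    ]

# Four frequency spectra split into two bands of two spectra each.
powers = band_powers([1.0, 2.0, 3.0, 4.0], 2)
```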
- the MDCT unit 102 transmits data of the converted frequency spectra and the band powers to both of the tone detecting unit 103 and the quantization-band detecting unit 106 described below.
- the tone detecting unit 103 determines an average value of the powers in all bands (in other words, an average value of the powers of all frequency spectra) from the determined powers in the respective bands. Specifically, when the number of bands (the number of divided bands) is indicated by “band” (e.g., “band” is 7 in the example shown in FIG. 7B ) and each band power is indicated by “E band ”, the tone detecting unit 103 determines an average power “E ave ” of the frequency spectra in all the bands in accordance with an expression shown in FIG. 7C .
- the tone detecting unit 103 determines that a band is a tonal band when the power of the band, averaged over its bandwidth, is greater than a threshold, where the threshold is the power averaged over the whole range under analysis. Specifically, in the example shown in FIG. 7B , the tone detecting unit 103 detects the band 3 as a band containing a tonal frequency spectrum, because the average power of the frequency spectra in the band 3 is greater than the determined average power E ave .
- the tone detecting unit 103 transmits the data of the detected band containing the tonal frequency spectrum to both the allowable-error-power correcting unit 105 and the scale factor correcting unit 108 described below. Furthermore, the tone detecting unit 103 transmits a flag and information for identifying the detected band, which are denoted tone_flag and tone_band, respectively.
- the flag as tone_flag indicates that a tonality is detected
- the information as tone_band indicates the band 3 having a band power E 3 in the example shown in FIG. 7B .
- both tone_flag and tone_band are sent to the allowable-error-power correcting unit 105 and the scale factor correcting unit 108 described below.
- When the tone detecting unit 103 does not detect a band containing a tonal frequency spectrum, the unit 103 does not transmit tone_flag and tone_band.
- the tone detecting unit 103 also transmits the data of the frequency spectra and the band powers, which it received from the MDCT unit 102 , to the allowable-error-power correcting unit 105 described below.
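The detection rule of the tone detecting unit 103 can be sketched as below. The expression of FIG. 7C is not reproduced in this text, so the average is assumed to be the sum of the band powers divided by the number of bands, and each band power is compared directly with that average (which matches a per-spectrum average only if all bands have the same width). Returning a list of band indices is also an assumption; the text describes a single detected band.

```python
def detect_tone(band_powers):
    """Return (tone_flag, tone_bands) from the band powers E_0 .. E_(band-1)."""
    # Assumed form of the FIG. 7C expression: E_ave = (sum of E_band) / band.
    e_ave = sum(band_powers) / len(band_powers)
    # A band is treated as tonal when its power exceeds the average power.
    tone_bands = [b for b, power in enumerate(band_powers) if power > e_ave]
    return bool(tone_bands), tone_bands

# Seven bands with the power concentrated in band 3 (hypothetical values).
tone_flag, tone_band = detect_tone([1.0, 1.0, 1.0, 10.0, 1.0, 1.0, 1.0])
```

When no band exceeds the average, the flag is false and the list is empty, which corresponds to the case in which tone_flag and tone_band are not transmitted.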
- the psychoacoustic analyzing unit 104 determines allowable error powers in accordance with the audio signal (refer to the underlying technology).
- the psychoacoustic analyzing unit 104 divides a predetermined frequency range included in the audio signal into smaller bands of a predetermined width and determines allowable error powers for the respective bands; it is preferable to use the bands determined by the MDCT unit 102 .
- the psychoacoustic analyzing unit 104 determines a masking threshold for the audio signal transmitted from the input unit 101 . Also, as shown in FIG. 8B , the unit 104 converts the determined masking threshold to determine allowable error powers.
- bands correspond to the bands used by the MDCT unit 102 .
- the psychoacoustic analyzing unit 104 preferably determines an allowable error power for each band by using the bands and the respective band powers determined by the MDCT unit 102 .
- FIGS. 8A and 8B illustrate the masking threshold and the allowable error powers, respectively, in conjunction with the frequency spectra.
- the psychoacoustic analyzing unit 104 also transmits the data of the determined allowable error powers to the allowable-error-power correcting unit 105 described below.
- the allowable-error-power correcting unit 105 has the number-of-bands storing unit (not shown in FIG. 6 ) for storing the predetermined number of bands. As shown in FIG. 9 , the allowable-error-power correcting unit 105 receives the detection results of “tone_band” and “tone_flag” from the tone detecting unit 103 ; the data of allowable error powers from the psychoacoustic analyzing unit 104 ; and the data of band powers also from the tone detecting unit 103 . “tone_band” and “tone_flag” are shown as “Detection Result” in the example shown in FIG. 9 . Using the detection results and the data of the band powers, the allowable-error-power correcting unit 105 corrects the data of the allowable error powers.
- the number-of-bands storing unit may correspond to a “number-of-bands storing unit”.
- the allowable-error-power correcting unit 105 performs the correction so that the allowable error powers determined by the psychoacoustic analyzing unit 104 for the bands adjacent to the band detected by the tone detecting unit 103 become smaller than the band powers of those adjacent bands.
- the allowable-error-power correcting unit 105 detects, as adjacent bands, the bands located within the predetermined number of bands stored by the number-of-bands storing unit, centered on the band that contains the tonal frequency spectrum detected by the tone detecting unit 103 .
- the allowable-error-power correcting unit 105 corrects the allowable error powers with respect to the detected adjacent bands.
- the pre-correction allowable error powers in the bands “ 12 ” to “ 20 ” are greater than the band powers in the detected adjacent bands.
- the allowable-error-power correcting unit 105 performs correction by equally attenuating the allowable error powers in the bands “ 12 ” to “ 20 ” (excluding the band “ 16 ”) so that the allowable error powers become smaller than the powers in the frequency spectra.
- the allowable-error-power correcting unit 105 also transmits, to the quantization-band detecting unit 106 , the data of the allowable error powers determined by the psychoacoustic analyzing unit 104 and the data of the corrected allowable error powers.
- when the allowable-error-power correcting unit 105 does not receive the flag (tone_flag) and the information for identifying the detected band from the tone detecting unit 103 , it does not perform the processing for correcting the allowable error powers and transmits the allowable error powers determined by the psychoacoustic analyzing unit 104 , unchanged, to the quantization-band detecting unit 106 described below.
- the quantization-band detecting unit 106 detects bands to be quantized from the bands of the frequency spectra upon receiving the frequency spectra and the allowable error powers.
- the frequency spectra are received from the MDCT unit 102 , and the allowable error powers (including the allowable error powers corrected by the allowable-error-power correcting unit 105 ) are received from the allowable-error-power correcting unit 105 .
- the quantization-band detecting unit 106 compares, on a band-by-band basis, the band powers transmitted from the MDCT unit 102 with the allowable error powers transmitted from the allowable-error-power correcting unit 105 , and thereby determines the bands to be quantized. More specifically, with respect to a band whose allowable error power was corrected by the allowable-error-power correcting unit 105 , the quantization-band detecting unit 106 compares the corrected allowable error power with the band power of the band. With respect to a band whose allowable error power was not corrected by the unit 105 , the unit 106 compares the allowable error power determined by the psychoacoustic analyzing unit 104 with the band power of the band. The unit 106 detects, as a band to be quantized, each band whose band power is greater than its allowable error power. The unit 106 also determines information for identifying the detected bands.
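- the band-by-band comparison can be sketched in a few lines. The function name and the use of plain Python lists are assumptions made for illustration; the list passed as `allowed` is assumed to already hold the corrected values for the bands that were corrected.

```python
def detect_bands_to_quantize(band_power, allowed):
    """Return the indices of bands whose power exceeds the allowable error power."""
    return [b for b, (power, limit) in enumerate(zip(band_power, allowed))
            if power > limit]
```

- a band whose power is equal to or below its allowable error power is left out of the returned list and is therefore not quantized.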
- the quantization-band detecting unit 106 also transmits to the scale factor determining unit 107 the information for identifying the detected bands to be quantized; the data of the allowable error powers transmitted from the allowable-error-power correcting unit 105 ; and the data of the frequency spectra transmitted from the MDCT unit 102 .
- the scale factor determining unit 107 determines, for respective bands, such scale factors that the quantization error powers become smaller than the allowable error powers.
- the scale factor determining unit 107 determines such scale factors that the quantization error powers become smaller than the corrected allowable error powers with respect to the adjacent bands.
- the scale factor determining unit 107 also transmits the information for identifying the bands to be quantized; and the sets of data of the allowable error powers, the frequency spectra, and the scale factors determined for the respective bands to the scale factor correcting unit 108 described below.
- the scale factor correcting unit 108 receives the data of the tonality detection result from the tone detecting unit 103 , and receives, from the scale factor determining unit 107 , the information for identifying the bands to be quantized and each set of data of the allowable error powers, the frequency spectra, and the scale factors for the respective bands (the information and the allowable error powers are not shown in FIG. 11 ). Upon receiving these data, the scale factor correcting unit 108 corrects the scale factor of the band containing the tonal frequency spectrum. As described above, the tonality detection result includes the data of the band containing the tonal frequency spectrum and the tone detecting signal.
- the scale factor correcting unit 108 corrects the scale factor for the band containing the tonal frequency spectrum to such a scale factor that the quantization value obtained from a largest one of the frequency spectra that constitute the band becomes the maximum value of the quantization values.
- the band containing the tonal frequency spectrum is a band “b” and the scale factor determined by the scale factor determining unit 107 with respect to the band “b” is “Sb”.
- the scale factor correcting unit 108 searches for the maximum frequency spectrum contained in the band “b” (“Maximum Frequency Spectrum Searching” corresponds thereto, in the example shown in FIG. 11A ).
- the maximum frequency spectrum is referred to as “max_pow_spec”, and the term “maximum frequency spectrum” referred to herein means the greatest-power frequency spectrum of the frequency spectra that constitute the band containing the tonal frequency spectrum.
- the scale factor correcting unit 108 determines such a scale factor “S′b” that the quantization value obtained by quantizing the maximum frequency spectrum becomes “MAX_QUANT”. “MAX_QUANT” means the maximum value of the quantization values.
- the scale factor “S′b” is determined in “Corrected Scale-Value Determination” in the example shown in FIG. 11A , and is set as a scale factor for the band containing the tonal frequency spectrum detected by the tone detecting unit 103 . For example, in accordance with an expression shown in FIG.
- the scale factor correcting unit 108 replaces the scale factor “Sb” with the scale factor “S′b”, that is, the scale factor “Sb” is corrected to the scale factor “S′b”.
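- the determination of “S′b” can be illustrated with a short sketch. The expression of the figure is not reproduced in this text, so the quantizer below is an assumed AAC-style power-law quantizer, and the value 8191 for “MAX_QUANT”, the rounding offset, the integer scale factor step, and the function names are assumptions. Because integer scale factors rarely hit “MAX_QUANT” exactly, the sketch picks the smallest scale factor whose quantization value does not exceed it.

```python
import math

MAX_QUANT = 8191      # assumed maximum quantization value
ROUNDING = 0.4054     # assumed rounding offset of an AAC-style quantizer


def quantize(spec, sf):
    """Assumed quantizer: larger sf means coarser quantization."""
    return int((abs(spec) * 2.0 ** (-sf / 4.0)) ** 0.75 + ROUNDING)


def corrected_scale_factor(band_spectra):
    """Scale factor at which the largest spectrum quantizes closest to MAX_QUANT."""
    max_pow_spec = max(band_spectra, key=abs)
    if max_pow_spec == 0:
        return 0  # nothing to scale in an all-zero band
    # Closed-form starting point, then adjust for rounding effects.
    sf = math.ceil(4.0 * math.log2(abs(max_pow_spec) / MAX_QUANT ** (4.0 / 3.0)))
    while quantize(max_pow_spec, sf) > MAX_QUANT:
        sf += 1
    while quantize(max_pow_spec, sf - 1) <= MAX_QUANT:
        sf -= 1
    return sf
```

- under these assumptions, one step finer than the returned scale factor would overflow “MAX_QUANT”, so the returned value uses as much of the quantization range as the integer step allows.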
- the scale factor correcting unit 108 also transmits, to the quantizing unit 109 , the information for identifying the bands to be quantized and each set of the allowable error powers, the frequency spectra, and the scale factors for the respective bands.
- the data of the scale factors includes the scale factor detected by the scale factor correcting unit 108 for the band containing the tonal frequency spectrum.
- upon receiving the information for identifying the bands to be quantized and each data set of the allowable error powers, the frequency spectra, and the scale factors for the respective bands, the quantizing unit 109 quantizes each frequency spectrum having a greater power than the allowable error power. Specifically, with respect to each of the bands detected by the quantization-band detecting unit 106 , the quantizing unit 109 reduces the dynamic ranges of the frequency spectra to dynamic ranges uniquely specified by the scale factors and quantizes each of the frequency spectra that constitute each band in the reduced dynamic range. In this process, each of the bands detected by the quantization-band detecting unit 106 is identified by the information for identifying the bands to be quantized.
- the quantizing unit 109 quantizes each of the frequency spectra contained in the band whose scale factor was determined by the scale factor correcting unit 108 , by using the scale factor determined by the scale factor correcting unit 108 . Furthermore, the quantizing unit 109 quantizes each of the frequency spectra contained in the bands whose scale factors were not determined by the scale factor correcting unit 108 , by using the scale factor determined by the scale factor determining unit 107 .
- the quantizing unit 109 uses the scale factors determined by the scale factor determining unit 107 and the scale factor correcting unit 108 to change the dynamic ranges on a band-by-band basis (for each band). Thereafter, the quantizing unit 109 performs the quantization itself on a spectrum-by-spectrum basis (for each frequency spectrum that constitutes each of the bands), rather than on a band-by-band basis. That is, the quantizing unit 109 obtains a quantization value for each frequency spectrum.
- the quantizing unit 109 also transmits the data of the quantization values obtained by the quantization, and the scale factors to the encoding unit 110 described below.
- upon receiving the quantization values and the scale factors from the quantizing unit 109 , the encoding unit 110 encodes the quantization values and the scale factors. For example, the encoding unit 110 uses Huffman coding to individually encode the quantization values and the scale factors. The encoding unit 110 transmits the encoded information to the output unit 111 described below.
- upon receiving the encoded information from the encoding unit 110 , the output unit 111 outputs the received information as encoded information of the audio signal input by the input unit 101 .
- the encoding device can also be realized by incorporating the functions of the MDCT unit 102 , the tone detecting unit 103 , the psychoacoustic analyzing unit 104 , the allowable-error-power correcting unit 105 , the quantization-band detecting unit 106 , the scale factor determining unit 107 , the scale factor correcting unit 108 , and the quantizing unit 109 , which are described above, into an information processing apparatus, such as a known personal computer, workstation, portable phone, PHS terminal, mobile communication terminal, or PDA.
- processing performed by the encoding device will be described next using FIGS. 12 and 13 .
- the flow of entire processing performed by the encoding device is first described using FIG. 12 , and then, the flow of processing performed by the scale factor correcting unit 108 is described using FIG. 13 .
- FIG. 12 is a flowchart showing the flow of entire processing of the encoding device according to the first embodiment.
- FIG. 13 is a flowchart showing the flow of processing performed by the scale factor correcting unit according to the first embodiment.
- when an audio signal exists (YES in step S 101 ), i.e., when an audio signal is received by the input unit 101 , the MDCT unit 102 performs MDCT conversion (step S 102 ). That is, the MDCT unit 102 converts the audio signal, transmitted from the input unit 101 , into frequency spectra.
- the MDCT unit 102 then divides the band (step S 103 ) and determines band powers (step S 104 ). That is, the MDCT unit 102 determines frequency spectral powers and further determines the sum of frequency spectral powers in each of the bands obtained by division by a predetermined width.
- the tone detecting unit 103 detects a band containing a tonal frequency spectrum (step S 105 ). That is, when there is a band having a greater frequency spectral power than a threshold, which is the average power of the frequency spectra in all the bands, the tone detecting unit 103 detects the band as a band having a high tonality.
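- steps S 103 to S 105 can be sketched as follows. The names `band_powers`, `detect_tone`, and `ratio` are hypothetical; the text states only that the threshold is the average power of the frequency spectra, so `ratio` (default 1.0, which matches the text) is an added knob, and returning the band of the single strongest spectrum is a simplifying assumption.

```python
def band_powers(spectra, band_width):
    """Sum of spectral powers in each band of band_width spectra (S103, S104)."""
    powers = [s * s for s in spectra]
    return [sum(powers[i:i + band_width])
            for i in range(0, len(powers), band_width)]


def detect_tone(spectra, band_width, ratio=1.0):
    """Return (tone_flag, tone_band) for the strongest spectrum (S105)."""
    powers = [s * s for s in spectra]
    # Threshold: average spectral power over all bands, scaled by ratio.
    threshold = ratio * sum(powers) / len(powers)
    peak_index = max(range(len(powers)), key=powers.__getitem__)
    if powers[peak_index] > threshold:
        return True, peak_index // band_width
    return False, None
```

- for a flat spectrum no spectrum exceeds the average, so no band is reported; a single strong line in an otherwise quiet spectrum reports the band that contains it.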
- the psychoacoustic analyzing unit 104 determines allowable error powers (step S 106 ). That is, upon transmission of the audio signal from the input unit 101 , the psychoacoustic analyzing unit 104 determines allowable error powers in accordance with the audio signal.
- the allowable-error-power correcting unit 105 corrects the allowable error powers (step S 108 ). That is, upon transmission of the detection result from the tone detecting unit 103 , the allowable-error-power correcting unit 105 corrects the allowable error powers with respect to adjacent bands. For example, the allowable-error-power correcting unit 105 corrects the allowable error powers with respect to adjacent bands to allowable error powers that are smaller than the band powers with respect to the adjacent bands.
- the quantization-band detecting unit 106 detects bands to be quantized (step S 109 ). That is, upon transmission of the frequency spectra from the MDCT unit 102 and transmission of the allowable error powers from the allowable-error-power correcting unit 105 , the quantization-band detecting unit 106 detects bands to be quantized from the bands of the frequency spectra.
- the scale factor determining unit 107 determines scale factors (step S 110 ). That is, upon transmission of the information for identifying the bands to be quantized, the allowable error powers, and the frequency spectra from the quantization-band detecting unit 106 , the scale factor determining unit 107 determines, for each band, such a scale factor that the quantization error power becomes smaller than the allowable error power.
- the scale factor correcting unit 108 corrects the scale factor (step S 112 ). That is, when the band containing the tonal frequency spectrum is transmitted from the tone detecting unit 103 and the information for identifying the bands to be quantized, the allowable error powers, the frequency spectra, and the scale factors for the respective bands are transmitted from the scale factor determining unit 107 , the scale factor correcting unit 108 corrects the scale factor for the band containing the tonal frequency spectrum.
- the scale factor correcting unit 108 then corrects the scale factor (step S 112 ).
- the quantizing unit 109 quantizes the frequency spectra (step S 113 ). That is, upon transmission of the information for identifying the bands to be quantized, the allowable error powers, the frequency spectra, and the scale factors for the respective bands from the scale factor correcting unit 108 , the quantizing unit 109 quantizes each frequency spectrum in each band detected by the quantization-band detecting unit 106 .
- the encoding unit 110 performs encoding (step S 114 ). That is, upon transmission of quantization values obtained by the quantization from the quantizing unit 109 , the encoding unit 110 encodes the quantization values.
- the scale factor correcting unit 108 detects a maximum frequency spectrum (step S 202 ).
- the scale factor correcting unit 108 determines a scale factor for a case in which the quantization value becomes maximum (step S 203 ). That is, the scale factor correcting unit 108 determines such a scale factor that a quantization value obtained from a largest one of the frequency spectra that constitute the band containing the tonal frequency spectrum detected by the tone detecting unit 103 becomes a maximum value.
- the scale factor correcting unit 108 then corrects the scale factor (step S 204 ). That is, the scale factor correcting unit 108 corrects the scale factor determined by the scale factor determining unit 107 to the scale factor determined with respect to the band from which the tonal frequency spectrum was detected.
- the disclosed encoding device converts an audio signal into frequency spectra and determines allowable error powers for respective bands obtained by dividing the frequency of the audio signal by a predetermined width.
- the encoding device also detects a tonal frequency spectrum from the frequency spectra and a band containing the frequency spectrum. Using the detection result and the allowable error powers, the encoding device performs correction such that the allowable error powers determined with respect to bands adjacent to the band detected by the detecting unit become smaller than the powers of the frequency spectra with respect to the adjacent bands.
- the encoding device quantizes each of the frequency spectra having greater powers than the corrected allowable error powers. Thus, it is possible to improve the encoded-sound quality of a tonal audio signal.
- because the allowable error powers are corrected for the frequency spectra that exist adjacent to a peak power, each of those frequency spectra can be reliably quantized. Furthermore, the encoded-sound quality of a tonal audio signal can be improved.
- the amplitude fluctuates and overflows (e.g., exceeds the 16-bit maximum value of PCM), resulting in clipping. Consequently, as shown in FIG. 14C , an abnormal sound such as a chi'ri'chi'ri sound (e.g., a clipping noise) is generated. Also, as shown in FIG. 14B , variations in the amplitude cause the sound to vibrate perceptually.
- frequency spectra adjacent to a tonal frequency spectrum can be reliably quantized as shown in FIG. 15 .
- the perceptual vibration of the sound and the abnormal chi'ri'chi'ri sound generated during quantization using the known schemes, as shown in FIG. 16B , are reduced as shown in FIG. 16C , and the encoded-sound quality of a tonal audio signal can be improved.
- the scale factor correcting unit 108 determines, as the scale factors for the adjacent bands, scale factors at which the quantization error powers, determined from the quantization errors generated during the quantization of the frequency spectra contained in the adjacent bands, become smaller than the allowable error powers determined by the allowable-error-power correcting unit 105 with respect to the adjacent bands. The quantizing unit 109 then quantizes each of the frequency spectra contained in a band whose scale factor was determined by the scale factor correcting unit 108 by using that scale factor.
- since the allowable error powers are corrected, it is possible to perform quantization using an appropriate scale factor.
- the tone detecting unit 103 detects the band containing the tonal frequency spectrum. Thereafter, the scale factor of the band is determined so that the quantization value obtained from the largest one of the frequency spectra that constitute the band including the tonal frequency spectrum becomes a maximum. Thus, it is possible to minimize quantization errors. Specifically, since the quantization value obtained from a peak having a tonality takes the maximum value set based on the standard, it is possible to minimize quantization errors.
- the disclosed encoding device stores a predetermined number of bands and determines, as adjacent bands, the bands located within the range of the stored predetermined number of bands centered on the band containing the detected tonal frequency spectrum. Thereafter, the encoding device corrects the allowable error powers of the adjacent bands. Thus, it is possible to easily detect the bands in which the allowable error powers are to be corrected.
- the encoding device adopts a scheme in which the scale factor correcting unit 108 corrects the scale factor for the band detected by the tone detecting unit 103 so that the value obtained by quantizing the largest one of the frequency spectra in the band becomes the maximum value based on the standard.
- the present invention is not limited to the scheme.
- an encoding device may be such that it searches for a scale factor at which the quantization error power generated is small and uses the scale factor obtained by the searching.
- the encoding device considers, as candidate scale factors, the scale factor determined by the scale factor determining unit 107 and a change scale factor obtained by changing that scale factor by a predetermined value. The encoding device then uses whichever of the two scale factors yields the smaller quantization error (or quantization error power) during quantization.
- the scale factor correcting unit 108 corresponding to an “error determining unit” determines a quantization error power generated during the quantization of the frequency spectra contained in a band by using the scale factor determined by the scale factor determining unit 107 with respect to the band. Furthermore, the scale factor correcting unit 108 determines the quantization error power using the changed scale factor obtained by changing the scale factor determined by the scale factor determining unit 107 .
- the scale factor correcting unit 108 according to the second embodiment is different from one, shown in FIG. 11B , according to the first embodiment. That is, in scale-correction value searching, the scale factor correcting unit 108 according to the second embodiment uses allowable error powers (in the second embodiment, the allowable error power corrected by the allowable-error-power correcting unit 105 and the pre-correction allowable error power).
- the scale factor correcting unit 108 quantizes each of the frequency spectra that constitute the band by using the scale factor determined by the scale factor determining unit 107 . Then the encoding device determines a quantization error power generated during the quantization (refer to the consideration on underlying technology).
- the tone detecting unit 103 detects a band “b”
- the scale factor determining unit 107 determines a scale factor “Sb” for the band “b”
- the number of frequency spectra that constitute the band “b” is “Nb”.
- the scale factor correcting unit 108 quantizes each of the frequency spectra that constitute the band “b” to determine quantization values, by using the scale factor “Sb”. Then, the unit 108 performs inverse quantization to determine inversely quantized spectra by using the determined quantization values and the scale factor “Sb”. For example, in the AAC encoding method, the scale factor correcting unit 108 determines a quantized value “quant_i” obtained from the i-th spectrum “spec_i” contained in the band “b” and an inversely quantized spectrum “ispec_i” in accordance with expressions shown in FIGS. 18A and 18B .
- the scale factor correcting unit 108 determines a quantization error power in the band from the pre-quantization frequency spectra and the inversely quantized spectra. For example, the scale factor correcting unit 108 determines a quantization error power “error_eb” in the band “b” in accordance with an expression shown in FIG. 18C . “Nb” in the expression shown in FIG. 18C indicates the number of frequency spectra contained in the band “b”.
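- the quantize, inverse-quantize, and compare sequence can be sketched as follows. The expressions of FIGS. 18A to 18C are not reproduced in this text, so the AAC-style power-law quantizer, its rounding offset, and the definition of the error power as the sum of squared differences over the “Nb” spectra of the band are assumptions, as are the function names.

```python
import math

ROUNDING = 0.4054  # assumed rounding offset of an AAC-style quantizer


def quantize(spec, sf):
    """Assumed quantizer: magnitude only, larger sf means coarser steps."""
    return int((abs(spec) * 2.0 ** (-sf / 4.0)) ** 0.75 + ROUNDING)


def dequantize(quant, sf, sign_source):
    """Assumed inverse quantizer; the sign is taken from the original spectrum."""
    magnitude = quant ** (4.0 / 3.0) * 2.0 ** (sf / 4.0)
    return math.copysign(magnitude, sign_source)


def quantization_error_power(band_spectra, sf):
    """Sum over the band of (spec_i - ispec_i)^2 for scale factor sf."""
    error = 0.0
    for spec in band_spectra:
        ispec = dequantize(quantize(spec, sf), sf, spec)
        error += (spec - ispec) ** 2
    return error
```

- when the scale factor is so coarse that every spectrum quantizes to zero, the error power under these assumptions equals the power of the band itself, which is the worst case for that band.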
- the scale factor correcting unit 108 changes the scale factor determined by the scale factor determining unit 107 by a predetermined value. The unit 108 then uses the changed scale factor (a change scale factor) to determine the quantization error power generated during the quantization with respect to the band detected by the tone detecting unit 103 .
- the scale factor correcting unit 108 compares two quantization error powers, referred to as the “first” quantization error power and the “second” quantization error power, to determine whether the “second” quantization error power is smaller.
- the first quantization error power is generated by the use of the scale factor determined by the scale factor determining unit 107 , and the second quantization error power is generated by the use of the change scale factor.
- when the “second” quantization error power is smaller, the scale factor correcting unit 108 corrects the scale factor (e.g., “Sb”) for the band detected by the tone detecting unit 103 to the change scale factor (e.g., “S′b”).
- otherwise, the scale factor correcting unit 108 does not correct the scale factor.
- the scale factor correcting unit 108 determines quantization error powers with respect to multiple scale factors by using various values of “A” and corrects the scale factor to the scale factor at which the smallest one of the quantization error powers is generated. As an example, suppose that the scale factor correcting unit 108 uses “Sb 1 ” and “Sb 2 ” as the change scale factors in the first and second operations, respectively. When the unit 108 has corrected the scale factor “Sb” determined by the unit 107 to the change scale factor “Sb 1 ” in the first operation, the unit 108 then compares the quantization error power generated by the use of “Sb 1 ” with the quantization error power generated by the use of “Sb 2 ”.
- the scale factor correcting unit 108 determines whether or not the comparison has been performed with respect to all predetermined change scale factors (e.g., the change scale factor candidates determined from the predetermined values of “A”). The scale factor correcting unit 108 continues the scale factor correction processing until the comparison has been performed with respect to all change scale factors.
- the present invention is not limited thereto.
- the arrangement may be such that quantization error powers are determined with respect to multiple scale factors, respectively, the comparison is simultaneously performed on the determined multiple (e.g., three or more) quantization error powers, and one scale factor at which the generated quantization error power is the smallest is used.
- the value of “A” is arbitrary, and not only is a value that is greater than “0” used as “A”, but also a value that is smaller than “0” may be used as “A”. Also, the scale factor correcting unit 108 may pre-store a setting regarding the number of values used as the change scale factors (the number of times for determining and comparing the quantization error powers) and may execute the scale factor correction processing based on the setting.
- the present invention is not limited to the scheme using various “As” (using multiple change scale factors).
- the scale factors determined by the scale factor determining unit 107 may be compared with only one change scale factor.
- one value that is estimated to reduce quantization errors may be pre-selected and used as the change scale factor. This makes it possible to quickly execute the scale factor correction processing.
- the quantizing unit 109 quantizes each of the frequency spectra contained in the band, by using the scale factor (or the change scale factor).
- the scale factor (or the change scale factor) is one giving the smallest one of the quantization error powers determined by the scale factor correcting unit 108 .
- the scale factor correcting unit 108 determines quantization error powers with respect to the scale factor “Sb” determined by the scale factor determining unit 107 and the value “S′b” obtained by changing the scale factor by “A”. Then, when the quantization error power generated by the use of “S′b” is the smallest, each of the frequency spectra that constitute the band detected by the tone detecting unit 103 is quantized using the scale factor “S′b”.
- FIG. 19 is a flowchart showing the flow of the scale factor correction processing performed by the encoding device according to the second embodiment.
- the tone detecting unit 103 detects a band “b”
- the scale factor determining unit 107 determines a scale factor “Sb” for the band “b”
- the number of frequency spectra that constitute the band “b” is “Nb”, unless otherwise particularly stated.
- when the scale factor correcting unit 108 is to correct the scale factor (YES in step S 301 ), it determines a quantization error power (step S 302 ). That is, the scale factor correcting unit 108 performs quantization by using the scale factor “Sb” determined by the scale factor determining unit 107 and determines the quantization error power generated during the quantization of the band “b”.
- the scale factor correcting unit 108 changes the scale factor (step S 303 ). That is, for example, the scale factor correcting unit 108 changes the scale factor “Sb” by a predetermined value “A”.
- the scale factor correcting unit 108 determines a quantization error power by using the changed scale factor (step S 304 ). That is, for example, the scale factor correcting unit 108 determines a quantization error power generated during the quantization of the band “b”, by using the obtained change scale factor “S′b”.
- the scale factor correcting unit 108 compares the quantization error powers (step S 305 ). That is, for example, with respect to the quantization error powers generated during the quantization of the band “b”, the scale factor correcting unit 108 compares a “first” quantization error power with a “second” quantization error power.
- the “first” quantization error power is generated when the scale factor “Sb” determined by the scale factor determining unit 107 is used.
- the “second” quantization error power is generated when the change scale factor “S′b” is used.
- the scale factor correcting unit 108 compares both of the quantization error powers derived by a use of the scale factor determined by the scale factor determining unit 107 and by a use of the change scale factor, to determine whether the quantization error power when the change scale factor is used is smaller (step S 306 ). That is, for example, the scale factor correcting unit 108 determines whether the “second” quantization error power is smaller than the “first” quantization error power. In this case, when the “second” quantization error power is smaller than the “first” quantization error power (affirmative in step S 306 ), the scale factor correcting unit 108 corrects the scale factor (step S 307 ). That is, for example, the scale factor correcting unit 108 corrects the scale factor “Sb” to the change scale factor “S′b”.
- when the scale factor correcting unit 108 corrects the scale factor (step S 307 ) or when the “second” quantization error power is not smaller than the “first” quantization error power (negative in step S 306 ), the scale factor correcting unit 108 determines whether the comparison has been performed with respect to all change scale factor candidates (step S 308 ). When the comparison has been performed with respect to all change scale factor candidates (affirmative in step S 308 ), the processing ends. On the other hand, when the comparison has not been performed with respect to all change scale factor candidates (negative in step S 308 ), the processing from steps S 303 to S 307 described above is repeated until the comparison has been performed with respect to all change scale factor candidates.
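- the loop of steps S 302 to S 308 can be sketched as a search over candidate offsets. This is an illustration under assumptions: the AAC-style quantizer, the sum-of-squares error power, the additive use of “A”, and the names `search_scale_factor` and `offsets` are not taken from the figures. The sketch keeps the best scale factor found so far, so each later candidate is compared against the currently corrected one.

```python
import math

ROUNDING = 0.4054  # assumed rounding offset of an AAC-style quantizer


def quantization_error_power(band_spectra, sf):
    """Assumed error power: sum of squared quantize/dequantize differences."""
    error = 0.0
    for spec in band_spectra:
        quant = int((abs(spec) * 2.0 ** (-sf / 4.0)) ** 0.75 + ROUNDING)
        ispec = math.copysign(quant ** (4.0 / 3.0) * 2.0 ** (sf / 4.0), spec)
        error += (spec - ispec) ** 2
    return error


def search_scale_factor(band_spectra, sb, offsets):
    """Return (scale factor, error power) with the smallest error power."""
    best_sf = sb
    best_err = quantization_error_power(band_spectra, sb)        # S302
    for a in offsets:                                            # until S308 is affirmative
        candidate = sb + a                                       # S303
        err = quantization_error_power(band_spectra, candidate)  # S304
        if err < best_err:                                       # S305, S306
            best_sf, best_err = candidate, err                   # S307
    return best_sf, best_err
```

- passing a single offset corresponds to the variant in which the determined scale factor is compared with only one change scale factor; passing both positive and negative offsets covers values of “A” on either side of zero.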
- the scale factor correcting unit 108 determines the quantization error powers by using the scale factor determined by the scale factor determining unit 107 and also using the change scale factor. Then, the encoding device performs quantization by using the scale factor (or the change scale factor) used when the smallest one of the determined quantization error powers was determined.
- the scheme in which the scale factor correcting unit 108 corrects the scale factor with respect to only the band detected by the tone detecting unit 103 has been described in the first and second embodiments described above.
- the present invention is not limited thereto and the scale factors may be corrected with respect to all bands. This allows an encoding device according to a third embodiment to reduce quantization errors with respect to other bands.
- the quantizing unit 109 may quantize each of the frequency spectra contained in all bands by using the scale factor determined by the scale factor correcting unit 108 .
- the scale factor correcting unit 108 determines a scale factor with respect to only the band detected by the tone detecting unit 103 and corrects the scale factor. Further the unit 108 corrects the scale factors with respect to the other bands to the scale factor determined with respect to the band detected by the tone detecting unit 103 .
- the quantizing unit 109 then quantizes each of the frequency spectra in all bands by using the scale factor determined by the scale factor correcting unit 108 with respect to the band detected by the tone detecting unit 103 .
- this allows the encoding device to reduce the number of bits used during the encoding of the scale factors.
- the scale factor is expressed by a difference from an adjacent scale factor.
- making all the scale factors the same scale factor makes it possible to reduce the number of bits required during the decoding of the scale factor set for the individual bands, compared to a scheme in which different scale factors are set for the respective bands.
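- A small sketch may help show why identical scale factors are cheap to encode when each one is expressed as a difference from its neighbor. The function name is hypothetical and the actual code table is not modeled; the sketch only computes the differences that would be passed to the encoder.

```python
def differential_scale_factors(scale_factors):
    """Return the first scale factor followed by band-to-band differences."""
    if not scale_factors:
        return []
    diffs = [scale_factors[0]]
    for previous, current in zip(scale_factors, scale_factors[1:]):
        diffs.append(current - previous)
    return diffs


if __name__ == "__main__":
    # Different scale factors per band give non-zero differences.
    print(differential_scale_factors([60, 62, 59, 59]))
    # A common scale factor gives zero for every difference after the first.
    print(differential_scale_factors([60, 60, 60, 60]))
```

When every band shares one scale factor, all differences after the first are zero, and a variable-length code such as Huffman coding would typically assign such a frequent value a short code.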
- the present invention is not limited thereto and bands that exist in a predetermined power width from a peak power may be used.
- a band width in which the allowable error powers are to be corrected is determined using a preset power width, and then, the allowable error powers are corrected.
- the allowable-error-power correcting unit 105 has a power-width storing unit.
- a predetermined power width is stored in the power-width storing unit.
- the allowable-error-power correcting unit 105 stores, for example, “G” in the power-width storing unit.
- the tone detecting unit 103 detects the band containing the tonal frequency spectrum. The allowable-error-power correcting unit 105 then regards as the adjacent band or adjacent bands those bands, including the band containing the tonal frequency spectrum, whose power values are greater than or equal to the power value obtained by attenuating the power value of the band detected by the tone detecting unit 103 by the predetermined power width stored in the power-width storing unit. The allowable-error-power correcting unit 105 corrects the allowable error power(s) for the adjacent band or adjacent bands as shown in FIG. 21.
- the allowable-error-power correcting unit 105 determines “Ethr” that is a power obtained by attenuating “G” from “Epeak” and uses “Ethr” as a power threshold for determining bands in which the allowable error powers are to be corrected.
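- The threshold test using “Ethr” can be sketched as below. This is a hedged illustration: the function name is hypothetical, band powers and “G” are assumed to be expressed in dB so that attenuation is a subtraction, and the selected bands are assumed to form one contiguous run around the peak band.

```python
def bands_within_power_width(band_powers_db, peak_band, width_db):
    """Select the contiguous bands around peak_band whose power is >= Ethr.

    Ethr = Epeak - G, with powers in dB (an assumption of this sketch).
    """
    threshold = band_powers_db[peak_band] - width_db
    low = peak_band
    while low > 0 and band_powers_db[low - 1] >= threshold:
        low -= 1
    high = peak_band
    while high < len(band_powers_db) - 1 and band_powers_db[high + 1] >= threshold:
        high += 1
    return list(range(low, high + 1))


if __name__ == "__main__":
    powers = [10, 30, 55, 80, 60, 20, 58]
    # Peak in band 3 (80 dB) and G = 30 dB give Ethr = 50 dB.
    print(bands_within_power_width(powers, peak_band=3, width_db=30))
```

In this example band 6 also exceeds the threshold but is not selected, because it is separated from the peak by a band below the threshold; that contiguity rule is a design assumption of the sketch, not something the text states.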
- the encoding device can easily detect bands in which the allowable error powers are to be corrected.
- the entire or part of the processing described as being automatically performed can be manually performed or the entire or part of the processing described as being manually performed can be automatically performed by a known method.
- the processing procedures, the control procedures, the specific names, and information including various types of data and parameters which are illustrated in the document and the drawings can be arbitrarily modified, unless otherwise particularly stated.
- the description in the first embodiment described above has been given of, for example, a case in which (1) the scheme for correcting the scale factor, (2) the scheme using the scale factor at which the quantization value becomes the maximum value, and (3) the scheme using the predetermined bandwidth during the detection of adjacent bands are implemented together during the correction of the allowable error powers.
- the present invention is not limited to the case, and during the correction of the allowable error powers, (1) to (3) do not have to be implemented together and only one or some of (1) to (3) may also be implemented.
- the present invention is not limited to a case in which one of the schemes is implemented, and multiple schemes may be implemented together.
- FIG. 22 is a diagram for describing a program for the encoding device according to the first embodiment.
- an encoding device 3000 in the first embodiment has a configuration in which an operation unit 3001 , a microphone 3002 , a speaker 3003 , a display 3005 , a communication unit 3006 , a CPU 3010 , a ROM 3011 , a HDD 3012 , and a RAM 3013 are connected through a bus 3009 and so on.
- the ROM 3011 pre-stores control programs such as an input program 3011 a, an MDCT program 3011 b, a tone detecting program 3011 c, a psychoacoustic analyzing program 3011 d, an allowable-error-power correcting program 3011 e, a quantization-band detecting program 3011 f, a scale factor determining program 3011 g, a scale factor correcting program 3011 h, a quantizing program 3011 i, an encoding program 3011 j, and an output program 3011 k.
- Each of the pre-stored control programs provides the same functions as the input unit 101 , the MDCT unit 102 , the tone detecting unit 103 , the psychoacoustic analyzing unit 104 , the allowable-error-power correcting unit 105 , the quantization-band detecting unit 106 , the scale factor determining unit 107 , the scale factor correcting unit 108 , the quantizing unit 109 , the encoding unit 110 , and the output unit 111 which are illustrated in the first embodiment described above.
- These programs 3011 a to 3011 k may be integrated together or distributed, as required, similarly to the elements that constitute the encoding device shown in FIG. 6 .
- the CPU 3010 reads these programs 3011 a to 3011 k from the ROM 3011 and executes them to thereby cause the programs 3011 a to 3011 k to function as an input process 3010 a, an MDCT process 3010 b, a tone detecting process 3010 c, a psychoacoustic analyzing process 3010 d, an allowable-error-power correcting process 3010 e, a quantization-band detecting process 3010 f, a scale factor determining process 3010 g, a scale factor correcting process 3010 h, a quantizing process 3010 i, an encoding process 3010 j, and an output process 3010 k, as shown in FIG. 22 .
- the processes 3010 a to 3010 k correspond to the input unit 101 , the MDCT unit 102 , the tone detecting unit 103 , the psychoacoustic analyzing unit 104 , the allowable-error-power correcting unit 105 , the quantization-band detecting unit 106 , the scale factor determining unit 107 , the scale factor correcting unit 108 , the quantizing unit 109 , the encoding unit 110 , and the output unit 111 which are shown in FIG. 6 , respectively.
- the encoding device described in the present embodiment can be achieved by causing a computer, such as a personal computer or workstation, to execute the prepared program.
- This program can be distributed over a network, such as the Internet.
- This program can also be recorded on a computer-readable storage medium, such as a hard disk, flexible disk (FD), CD-ROM, MO, or DVD, and can be executed by causing the computer to read the program from the recording medium.
Landscapes
- Physics & Mathematics (AREA)
- Engineering & Computer Science (AREA)
- Spectroscopy & Molecular Physics (AREA)
- Computational Linguistics (AREA)
- Signal Processing (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Compression, Expansion, Code Conversion, And Decoders (AREA)
- Signal Processing For Digital Recording And Reproducing (AREA)
Abstract
Description
- This application is based upon and claims the benefit of priority of the prior Japanese Patent Application No. 2008-037991, filed on Feb. 19, 2008, the entire contents of which are incorporated herein by reference.
- 1. Field of the Invention
- The present invention relates to an encoding device, an encoding method, and a program product including an encoding method.
- 2. Description of the Related Art
- Conventionally, various studies have been conducted on audio coding technology for compressing/decompressing audio signals as sound sources for voice, music, and so on. For example, various studies are directed to schemes for encoding audio signals through conversion into the frequency domain.
- For example, such audio coding technology is found in the Advanced Audio Coding (AAC) method and the High Efficiency-Advanced Audio Coding (HE-AAC) method. The AAC and HE-AAC methods are among the ISO/IEC MPEG-2/4 audio standards and are widely used in digital broadcasting in Japan, such as digital terrestrial, BS digital, communication satellite, and one-segment broadcasting.
- In such audio coding technology, a conventional encoding device for implementing the audio coding technology converts an audio signal into frequency spectra by Modified Discrete Cosine Transform (MDCT) conversion, quantizes the frequency spectra, and then performs encoding.
- The conventional encoding device quantizes the frequency spectra by utilizing auditory masking properties. Specifically, the conventional encoding device quantizes only sound that can be heard by human auditory perception. In the quantization, a masking threshold, namely the threshold for whether sound can be heard or not, is used to determine components of sound that cannot be acoustically heard.
- For example, the conventional encoding device performs psychoacoustic analysis, which is a scheme for analyzing whether or not sound is acoustically heard, with respect to an audio signal (a sound source to be encoded). Masking thresholds are then determined for each frequency. Thereafter, for each band having a predetermined frequency width, the conventional encoding device determines an error limit based on the determined masking threshold. The error limit is an allowable error power that is allowed during quantization. Then, using the allowable error power, the conventional encoding device quantizes only frequency spectra corresponding to a sound source that is acoustically heard.
- Japanese Laid-open Patent Publication No. 2006-18023, pages 5 to 11 and FIG. 1, discloses a scheme for adjusting a masking threshold. Japanese Laid-open Patent Publication No. 2001-7704, pages 5 to 9 and FIG. 1, discloses a scheme for improving efficiency during encoding so as to reduce the amount of bits used during encoding. In addition, Japanese Laid-open Patent Publication No. 7-202823, pages 3 to 5 and FIG. 1, and Japanese Laid-open Patent Publication No. 7-295594, pages 2 to 3 and FIG. 1, disclose schemes for specifying the amount of bit distribution.
- Meanwhile, the above-described conventional technologies have a problem in that the sound quality deteriorates during encoding of a highly tonal audio signal.
- In more detail, since the conventional encoding device cannot reliably quantize frequency spectra adjacent to the peak during encoding of a tonal audio signal, the device cannot satisfactorily perform encoding while maintaining a sufficient sound quality.
- The Japanese Laid-open Patent Publications described above do not disclose a scheme for reliably quantizing frequency spectra adjacent to the peak and cannot sufficiently improve the sound quality during encoding of a tonal audio signal.
- It is an object of the present invention to provide an encoding device capable of operating in a satisfactory state.
- According to one aspect of the invention, there is provided an encoding device for converting an audio signal into frequency spectra and quantizing and encoding the frequency spectra. The encoding device includes a power correcting unit for correcting allowable error powers determined in accordance with the audio signal when a tonal frequency spectrum is detected from the frequency spectra, and a quantizing unit for quantizing each of the frequency spectra having greater powers than the allowable error powers corrected by the power correcting unit.
- According to another aspect of the invention, there is provided an encoding device for encoding an audio signal including a frequency converting unit for converting the audio signal into frequency spectra, a power determining unit for determining allowable error powers in accordance with the audio signal, a detecting unit for detecting a tonal frequency spectrum from the frequency spectra converted by the frequency converting unit, a power correcting unit for correcting the allowable error powers by using a result of the detection performed by the detecting unit and the allowable error powers determined by the power determining unit, and a quantizing unit for quantizing each of the frequency spectra having greater powers than the allowable error powers corrected by the power correcting unit.
- FIG. 1 illustrates a diagram showing the underlying technology of an encoding device according to a first embodiment;
- FIG. 2 illustrates a diagram showing the underlying technology of an encoding device according to the first embodiment;
- FIGS. 3A to 3C illustrate diagrams showing the underlying technology of an encoding device according to the first embodiment;
- FIG. 4 illustrates a diagram showing the underlying technology of an encoding device according to the first embodiment;
- FIG. 5 illustrates a diagram showing the outline and configuration of an encoding device according to the first embodiment;
- FIG. 6 illustrates a block diagram showing the configuration of the encoding device according to the first embodiment;
- FIGS. 7A to 7D illustrate diagrams showing a tone detecting unit of the encoding device according to the first embodiment;
- FIGS. 8A and 8B illustrate diagrams showing a psychoacoustic analyzing unit in the encoding device according to the first embodiment;
- FIG. 9 illustrates a diagram showing an allowable-error-power correcting unit in the encoding device according to the first embodiment;
- FIGS. 10A to 10D illustrate diagrams showing the allowable-error-power correcting unit in the encoding unit according to the first embodiment;
- FIGS. 11A and 11B illustrate diagrams showing a scale factor correcting unit in the encoding unit according to the first embodiment;
- FIG. 12 illustrates a flowchart showing a flow of processing of the encoding device according to the first embodiment;
- FIG. 13 illustrates a flowchart showing a flow of processing performed by the scale factor correcting unit of the encoding device according to the first embodiment;
- FIG. 14A illustrates a waveform of an audio signal, FIG. 14B illustrates an encoded signal, and FIG. 14C illustrates frequency characteristics of an encoded signal;
- FIG. 15 illustrates frequency spectra adjacent to a tonal frequency spectrum;
- FIG. 16A illustrates an original sound, FIG. 16B illustrates abnormal sound generated during quantization using the known scheme, and FIG. 16C illustrates a reduction of the abnormal sound;
- FIG. 17 illustrates a diagram showing an encoding device according to a second embodiment;
- FIGS. 18A to 18C illustrate diagrams showing an encoding device according to the second embodiment;
- FIG. 19 illustrates a flowchart of scale factor correction processing in an encoding device according to the second embodiment;
- FIG. 20 illustrates a diagram showing an encoding device according to a third embodiment;
- FIG. 21 illustrates a diagram showing an encoding device according to the third embodiment;
- FIG. 22 illustrates a diagram of a program for the encoding device according to the first embodiment; and
- FIGS. 23A to 23C illustrate diagrams for the consideration of the underlying technology.
- Referring to FIGS. 23A to 23B, a conventional technology for coding an audio signal is considered in order to make clear a shortcoming that arises in quantization of frequency spectra adjacent to the peak of a tonal audio signal. When a tonal audio signal, e.g. a sinusoidal wave, a sweep wave, or the like, is encoded, intensities (powers in dB) concentrate in a specific band, which exhibits a relatively large peak compared to other bands. That is, a specific band has frequency spectra having high intensities, as shown in FIG. 23A, which illustrates frequency spectra obtained by performing MDCT conversion on a tonal audio signal.
- Also, as shown in FIG. 23B, in the conventional encoding device, the allowable error powers determined with respect to the bands adjacent to the one including the peak also increase. Specifically, since the frequency spectra in the band including the peak are greater in power than other frequency spectra, the conventional encoding device determines large masking thresholds for the adjacent bands as well as for the band including the peak. As a result, the allowable error powers also increase. Consequently, as shown in FIG. 23C, the frequency spectra in the bands adjacent to the peak become smaller than or equal to the allowable error powers. Since the frequency spectra in the adjacent bands are regarded as spectra not to be quantized, they are not quantized.
- When an audio signal is transformed through MDCT, the resultant frequency spectra are composed of MDCT coefficients, each of which contains information on the amplitude and the phase of the audio signal, while FIGS. 23A, 23B, and 23C each illustrate only individual amplitudes. For example, when the frequency spectra adjacent to the peak are not quantized, the information contained in those frequency spectra is lost. The loss of the phase and the amplitude therefore affects those of the sound source associated with the peak and causes sound-quality deterioration, such as a sensation of trill. In particular, for a tonal audio signal, a sound source adjacent to a specific frequency in a band having the peak effectively contributes to the main sound source, and the influence of the loss of the information contained in the frequency spectra adjacent to the peak is exerted on the sound quality of the encoded sound source more strongly than for a low tonal audio signal.
- Embodiments of an encoding device, an encoding method, and a program product including the method will be described below in detail with reference to the accompanying drawings. Below, the underlying technology, an overview, features, and a processing flow of the encoding device according to the first embodiment are described in order, and then other embodiments are described.
- First, an underlying technology for describing an encoding device according to a first embodiment will be described using FIGS. 1 to 4.
- A term “frequency spectrum” corresponds to a coefficient, e.g. an MDCT coefficient, for each frequency obtained through converting an audio signal (a sound source) by e.g. MDCT into the frequency domain. A term “frequency spectral power” corresponds to the value of the square of the frequency spectrum. A term “tonal frequency spectrum” is a coefficient for one frequency of frequency spectra when peaks of frequency spectral powers concentrate at the frequency. For example, a frequency spectrum having a greater power than the average of all frequency spectral powers corresponds to the tonal frequency spectrum. An audio signal corresponding to a conversion source of the “tonal frequency spectrum” is referred to as a “tonal sound source”.
- Also, a term “quantization” is processing for rounding down a numeric value after a decimal point (e.g., changing “1.8” and “2.1” to integers such as “1” and “2”, respectively). A term “quantization value” indicates a value obtained by quantizing a frequency spectrum.
- A term “quantization error” is an error caused in each frequency spectrum by quantizing the frequency spectrum. Specifically, as shown in FIG. 1, the difference between a pre-quantization frequency spectrum and a post-inverse-quantization spectrum corresponds to the quantization error, where the post-inverse-quantization spectrum is referred to as an “inversely quantized spectrum”.
- Herein, the term “inversely quantized spectrum” is a frequency spectrum obtained from a quantization value. The relationship of a frequency spectrum, a quantization value, and an inversely quantized spectrum will be described. Through the series of processing described below, the encoding device quantizes frequency spectra to obtain quantization values and then obtains inversely quantized spectra from the quantization values. Since the dynamic range of the frequency spectra is usually large, the encoding device first performs scaling using a predetermined “scale factor” to reduce the range as shown at (1) in FIG. 1. Thereafter, as shown at (2) in FIG. 1, the encoding device performs quantization to obtain quantization values. Then, as shown at (3) in FIG. 1, the encoding device rescales the obtained quantization values (performs the reverse of the scaling performed at (1) in FIG. 1) by using the predetermined scale factor to obtain inversely quantized spectra.
- In this case, the inversely quantized spectrum is given by Equation 1 shown in FIG. 2 and the quantization value is given by Equation 2 shown in FIG. 2. These equations are derived from Expression 1, which represents the relationship among the frequency spectrum, the quantization value, and the scale factor. “2^(scale factor)” indicates “2 raised to the power of scale factor.”
- Frequency Spectrum = Quantization Value × 2^(scale factor) (Expression 1)
- A frequency range in which frequency spectra of an audio signal are analyzed is divided into a plurality of smaller frequency ranges, each having a predetermined width in frequency, which are referred to as bands. An individual “scale factor” is given to each of the bands. For example, in the example shown in FIG. 1, one scale factor is given to a band “b” containing frequency spectra (4) and (5) shown in FIG. 1. The scale factor is determined by the encoding device so that the quantization error power is smaller than an allowable error power.
- A term “band power” of frequency spectra refers to the sum of powers of frequency spectra contained in a band.
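- The scaling, quantization, and rescaling steps at (1) to (3) in FIG. 1, together with the band power, can be sketched as follows. This is a minimal sketch under stated assumptions: the function names are hypothetical, the scale step is taken as 2^(scale factor) from Expression 1, and “rounding down” is assumed to mean rounding toward zero for negative values as well.

```python
import math


def scale_and_quantize(spectrum, scale_factor):
    """Steps (1) and (2): divide by 2**scale_factor, then drop the fraction."""
    step = 2.0 ** scale_factor
    return [math.trunc(x / step) for x in spectrum]


def inverse_quantize(values, scale_factor):
    """Step (3): multiply quantization values by 2**scale_factor."""
    step = 2.0 ** scale_factor
    return [q * step for q in values]


def band_power(spectrum):
    """Sum of the squared frequency spectra contained in one band."""
    return sum(x * x for x in spectrum)


if __name__ == "__main__":
    band = [7.2, -3.6]
    values = scale_and_quantize(band, scale_factor=1)
    restored = inverse_quantize(values, scale_factor=1)
    errors = [x - r for x, r in zip(band, restored)]  # quantization errors
    print(values, restored, errors, band_power(band))
```

The difference between each original value and its restored value is the quantization error of that frequency spectrum.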
- A term “quantization error power” of a frequency spectrum refers to the value of the square of a quantization error. Also, a quantization error power in one band refers to the sum of quantization error powers determined from quantization errors generated during quantization of frequency spectra contained in the band. Specifically, the relationship between a quantization error power and a quantization error in one band is given by Expression 2, where ^2 indicates a square.
- Quantization Error Power in One Band = Σ{(Quantization Errors in Frequency Spectra Contained in the Band)^2} (Expression 2)
- Also, the term “allowable error power” is the maximum quantization error power that is allowed during quantization of the scaled spectrum. In more detail, the allowable error power is derived for each band from a transformation of the masking threshold corresponding to the band, where the masking threshold indicates whether or not sound can be acoustically heard. As a scheme for determining the allowable error power from the masking threshold, for example, the scheme described in ISO/IEC 13818-7 may be used or other schemes may be used.
- Specifically, the allowable error power is a “limit of an allowable quantization error power”. For example, the allowable error power in one band is a quantization error power determined for the band and exhibits a maximum value that is allowed as an error generated during quantization of frequency spectra in the band. In other words, the encoding device according to the first embodiment quantizes frequency spectra so that the difference power between the power of pre-quantization frequency spectra and the power of inversely quantized spectra in one band is smaller than the allowable error power.
- Also, the allowable error power for each band is derived from the individual masking thresholds. The derived allowable error power is also compared with the powers of the frequency spectra to select the bands whose frequency spectra are to be quantized. What are compared with the allowable error powers during determination of frequency spectra to be quantized are band powers.
- Also, the term “encoding” is processing for converting the quantization values and/or the scale factors into other values (codes) by using, for example, Huffman coding.
- The relationship between the scale factor and the quantization error power will be briefly described. As described above, each of scale factors is assigned to each band, and each frequency spectrum contained in one band is quantized using the assigned scale factor.
- When attention is given to one frequency spectrum in a band, the relationship between the quantization value and the scale factor is given as shown in FIG. 3A and the relationships shown by Expression 3 and Expression 4 below hold.
- Attention is now given to frequency spectra contained in a band. As shown in FIG. 3B, when the scale factor is set to be large, the quantization value becomes “0” from a small-power frequency spectrum in the band and thus the quantization error increases.
- That is, as shown in FIG. 3C, the relationships in Expression 5 and Expression 6 below hold for the scale factor and the quantization error power.
- When all frequency spectra contained in a band are quantized with a quantization value “0” (i.e., are not quantized), the quantization error power has a maximum value and the relationship in Expression 7 below holds.
- Quantization Error Power = Band Power (Expression 7)
- Also, the relationship of the scale factor, the quantization error power, and the allowable error power will now be briefly described. First, when the band power is greater than the allowable error power, the encoding device regards the band as a band to be quantized. Also, as shown in FIG. 4, the encoding device quantizes the frequency spectra by using such a scale factor that the quantization error power becomes smaller than the allowable error power. Thus, as shown in FIG. 4, the encoding device performs quantization by using a scale factor that satisfies “Allowable Error Power > Quantization Error Power”.
- Now, the relationship of the quantization error power, the allowable error power, and the band power is summarized again. That is, the relationship is given by:
- (1) the maximum value of the quantization error power is the band power (Expression 7),
- (2) the relationship in Expression 7 is given when all frequency spectra are quantized with a quantization value “0” (i.e., are not quantized), and
- (3) the quantization is performed using a scale factor that satisfies Allowable Error Power > Quantization Error Power (this is referred to as “Expression A”). Now, when the relationship in Expression 7 holds, Expression A is given by Expression B below.
- Allowable Error Power > Quantization Error Power, where Quantization Error Power = Band Power (Expression B)
- A case in which the band power is equal to the quantization error power corresponds to a case in which the quantization values of the frequency spectra are “0” (i.e., the frequency spectra are not quantized). In other words, the allowable error power serves as a threshold for determining whether or not all frequency spectra in a band are to be quantized.
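- The role of the allowable error power as a threshold can be checked with a short sketch. The function names are hypothetical; the sketch only assumes that an unquantized band restores to all zeros, so that its quantization error power equals its band power (Expression 7).

```python
def band_power(spectrum):
    """Sum of the squared frequency spectra in one band."""
    return sum(x * x for x in spectrum)


def zero_quantization_error_power(spectrum):
    """Error power when every quantization value is 0 (restored spectrum is 0)."""
    return sum((x - 0.0) ** 2 for x in spectrum)


def needs_quantization(spectrum, allowable_error_power):
    """A band is quantized only when its band power exceeds the allowable error power."""
    return band_power(spectrum) > allowable_error_power


if __name__ == "__main__":
    band = [3.0, 4.0]
    # Expression 7: not quantizing the band costs exactly the band power.
    print(zero_quantization_error_power(band) == band_power(band))
    # Expression B holds (30 > 25), so leaving the band unquantized is allowed.
    print(needs_quantization(band, allowable_error_power=30.0))
    # A lower allowable error power forces the band to be quantized.
    print(needs_quantization(band, allowable_error_power=10.0))
```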
- An overview and features of the encoding device according to the first embodiment will be described next using FIG. 5, which illustrates an overview and features of the encoding device according to the embodiment.
- FIG. 5 illustrates the encoding device, in which several main units are shown together with the signal processed in each unit for coding an audio signal. When a sound source (an audio signal) to be encoded is input into the encoding device, the device encodes the audio signal as shown in FIG. 5. The encoding device has a main feature in that it can improve the encoded-sound quality of a tonal audio signal, as described below.
- That is, a frequency converting unit converts the inputted audio signal into frequency spectra as shown in (1) in FIG. 5. The frequency converting unit determines the powers of the frequency spectra for each of bands having a predetermined width in frequency as shown in (2) in FIG. 5. For example, the frequency converting unit determines the band power, i.e. the sum of the individual powers of the frequency spectra contained in a band. In the example shown in (2) in FIG. 5, each unpainted bar indicates frequency spectra in each band.
- As shown in (3) in FIG. 5, the power determining unit determines allowable error powers for the respective bands in accordance with the audio signal (refer to the above-described [Underlying Technology]). In the example shown in (3) in FIG. 5, each painted bar indicates the allowable error power in each band (on a band basis).
- A detecting unit, as shown in (4) in FIG. 5, detects a tonal frequency spectrum from the frequency spectra converted by the frequency converting unit and also detects a band containing the tonal frequency spectrum. For example, the detecting unit detects a band “5” in (4) in FIG. 5 as a band containing a tonal frequency spectrum.
- Then, a power correcting unit corrects the allowable error powers using both the result detected by the detecting unit and the allowable error powers determined by the power determining unit. Specifically, the allowable error power of each band adjacent to the band containing the tonal frequency spectrum is individually corrected by the power correcting unit so that the allowable error power becomes smaller than the sum of powers of frequency spectra in that band.
- As shown in (5) in FIG. 5, the power correcting unit corrects the allowable error powers of the bands “4” and “6” adjacent to the band “5” so that the allowable error power of each of the bands “4” and “6” becomes smaller than the power of the frequency spectra of that band. To clarify the correction, the painted portions of the bars in the bands “4” and “6” in (6) in FIG. 5 show the corrected allowable error powers for each of the bands. Namely, in (6) in FIG. 5, the unpainted portions in the bands “4” and “6” illustrate the amounts corrected by the power correcting unit.
- Then, in the encoding device, as shown in (7) in FIG. 5, a quantizing unit quantizes frequency spectra having greater powers than the allowable error powers corrected by the power correcting unit. For example, the quantizing unit quantizes the frequency spectra contained in the band “5” containing the tonal frequency spectrum and the frequency spectra contained in the bands “4” and “6” that have allowable error powers corrected by the power correcting unit as shown in (7) in FIG. 5.
- Specifically, since the allowable error powers are corrected so that the frequency spectra that exist adjacent to a peak power are quantized, it is possible to reliably quantize the frequency spectra that exist adjacent to the peak power and it is possible to improve the encoded-sound quality of a tonal audio signal.
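- The correction at (5) to (7) in FIG. 5 can be illustrated with a short sketch. The function names, the sample numbers, and in particular the `ratio` parameter are hypothetical: the text only requires the corrected allowable error power to become smaller than the band power, and does not say by how much.

```python
def correct_allowable_error_powers(band_powers, allowable, tonal_band, ratio=0.5):
    """Lower the allowable error power of the bands next to the tonal band.

    ratio is an illustrative assumption: the corrected value is set to
    ratio * band power so that it ends up below the band power.
    """
    corrected = list(allowable)
    for b in (tonal_band - 1, tonal_band + 1):
        if 0 <= b < len(band_powers) and corrected[b] >= band_powers[b]:
            corrected[b] = band_powers[b] * ratio
    return corrected


def bands_to_quantize(band_powers, allowable):
    """Bands whose band power is greater than the allowable error power."""
    return [b for b, (p, a) in enumerate(zip(band_powers, allowable)) if p > a]


if __name__ == "__main__":
    powers = [1, 1, 1, 1, 20, 100, 20]       # band "5" holds the tonal peak
    allowable = [2, 2, 2, 2, 30, 40, 30]     # raised around the peak by masking
    print(bands_to_quantize(powers, allowable))
    corrected = correct_allowable_error_powers(powers, allowable, tonal_band=5)
    print(bands_to_quantize(powers, corrected))
```

Before the correction only the tonal band exceeds its allowable error power; after it, the two adjacent bands are selected as well.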
- The configuration of the encoding device shown in
FIG. 5 will be described next usingFIGS. 6 to 11 . Here,FIG. 6 is a block diagram showing the configuration of the encoding device according to the first embodiment.FIG. 7 is a drawing for describing a tone detecting unit in the first embodiment.FIG. 8 is a drawing for describing a psychoacoustic analyzing unit in the first embodiment.FIG. 9 is a drawing for describing an allowable-error-power correcting unit in the first embodiment.FIG. 10 is a drawing for describing the allowable-error-power correcting unit in the first embodiment.FIG. 11 is a diagram for describing a scale factor correcting unit in the first embodiment. - As shown in
FIG. 6 , the encoding device includes, aninput unit 101, a Modified Discrete Cosine Transform (MDCT)unit 102, atone detecting unit 103, apsychoacoustic analyzing unit 104, an allowable-error-power correcting unit 105, a quantization-band detecting unit 106, a scalefactor determining unit 107, a scalefactor correcting unit 108, aquantizing unit 109, anencoding unit 110, and anoutput unit 111. - The
MDCT unit 102, thepsychoacoustic analyzing unit 104, and thetone detecting unit 103 may correspond to a “frequency converting unit”, a “power determining unit”, and a “detecting unit” respectively. Further the allowable-error-power correcting unit 105 and thequantizing unit 109 may correspond to a “power correcting unit” and a “quantizing unit” respectively. The scalefactor determining unit 107 may correspond to a “first scale factor determining unit” and a “second scale factor determining unit”. The scalefactor correcting unit 108 may correspond to a “third scale factor determining unit”. - An audio signal as a sound source to be encoded is received by the
input unit 101 and then fed to the MDCT unit 102 and the psychoacoustic analyzing unit 104, both of which are described below. - The
MDCT unit 102 converts the audio signal, transmitted from the input unit 101, into frequency spectra. Specifically, the MDCT unit 102 applies the MDCT to perform time-frequency conversion, by which the audio signal transmitted from the input unit 101 is converted into frequency spectra. The time-frequency conversion herein means, for example, converting an audio signal that is a function of time into frequency spectra that are a function of frequency. - The
MDCT unit 102 determines the power of the frequency spectra for each of the bands obtained by dividing the whole predetermined width of the frequency spectra by a predetermined bandwidth in frequency. For example, in the example shown in FIG. 7A, the frequency spectra within a width W are divided into seven sub-bands, indicated as bands “0” to “6”, and the sum of the powers of the frequency spectra contained in each band is determined as a band power, such as E0 to E6. - Also, the
MDCT unit 102 transmits the data of the converted frequency spectra and the band powers to both the tone detecting unit 103 and the quantization-band detecting unit 106 described below. - Upon receiving the data of the frequency spectra from the
MDCT unit 102, the tone detecting unit 103 analyzes the tonality of the frequency spectra, detects a tonal frequency spectrum, and detects the band containing the tonal frequency spectrum. - Also, for example, as shown in
FIG. 7B, the tone detecting unit 103 determines an average value of the powers in all bands (in other words, an average value of the powers of all frequency spectra) from the determined powers in the respective bands. Specifically, when the number of bands (the number of divided bands) is indicated by “band” (e.g., “band” is 7 in the example shown in FIG. 7B) and each band power is indicated by “Eband”, the tone detecting unit 103 determines an average power “Eave” of the frequency spectra in all the bands in accordance with an expression shown in FIG. 7C. - Also, as shown in
FIG. 7D, the tone detecting unit 103 determines that a band is a tonal band when the power of the band, averaged over its bandwidth, is greater than a threshold, where the threshold is the power averaged over the whole range subject to calculation. Specifically, in the example shown in FIG. 7B, the tone detecting unit 103 detects the band 3 as a band containing a tonal frequency spectrum, because the average power of the frequency spectra in the band 3 is greater than the determined average power Eave. - Also, the
tone detecting unit 103 transmits the data of the detected band containing the tonal frequency spectrum to both the allowable-error-power correcting unit 105 and the scale factor correcting unit 108 described below. More specifically, the tone detecting unit 103 transmits a flag and information for identifying the detected band, which are indicated as tone_flag and tone_band, respectively. The flag tone_flag indicates that a tonality is detected, and the information tone_band indicates the band 3, which has the band power E3, in the example shown in FIG. 7B. Both tone_flag and tone_band are sent to the allowable-error-power correcting unit 105 and the scale factor correcting unit 108. When the tone detecting unit 103 does not detect a band containing a tonal frequency spectrum, the unit 103 does not transmit tone_flag and tone_band. - The
tone detecting unit 103 also transmits the data of the frequency spectra and the band powers, which it received from the MDCT unit 102, to the allowable-error-power correcting unit 105 described below. - Upon receiving the audio signal from the
input unit 101, the psychoacoustic analyzing unit 104 determines allowable error powers in accordance with the audio signal (refer to the underlying technology). The psychoacoustic analyzing unit 104 divides a predetermined frequency bandwidth included in the audio signal into smaller bands of a predetermined width and determines an allowable error power for each of the divided bands; it is preferable to use the bands determined by the MDCT unit 102. - As shown in
FIG. 8A, the psychoacoustic analyzing unit 104 determines a masking threshold for the audio signal transmitted from the input unit 101. Also, as shown in FIG. 8B, the unit 104 converts the determined masking threshold into allowable error powers. - The term “bands” referred to herein corresponds to the bands used by the
MDCT unit 102. In other words, the psychoacoustic analyzing unit 104 preferably determines an allowable error power for each band by using the bands and the respective band powers determined by the MDCT unit 102. For ease of understanding, each of FIGS. 8A and 8B illustrates the masking threshold or the allowable error powers in conjunction with the frequency spectra. - The
psychoacoustic analyzing unit 104 also transmits the data of the determined allowable error powers to the allowable-error-power correcting unit 105 described below. - The allowable-error-
power correcting unit 105 has a number-of-bands storing unit (not shown in FIG. 6) for storing a predetermined number of bands. As shown in FIG. 9, the allowable-error-power correcting unit 105 receives the detection results “tone_band” and “tone_flag” from the tone detecting unit 103, the data of the allowable error powers from the psychoacoustic analyzing unit 104, and the data of the band powers, also from the tone detecting unit 103. “tone_band” and “tone_flag” are shown as “Detection Result” in the example shown in FIG. 9. Using the detection results and the data of the band powers, the allowable-error-power correcting unit 105 corrects the data of the allowable error powers. The number-of-bands storing unit may correspond to a “number-of-bands storing unit”. - Specifically, the allowable-error-
power correcting unit 105 performs the correction so that the allowable error powers determined by the psychoacoustic analyzing unit 104 for the bands adjacent to the band detected by the tone detecting unit 103 become smaller than the band powers of those adjacent bands. - For example, the allowable-error-
power correcting unit 105 detects, as adjacent bands, the bands located within the range of the predetermined number of bands stored in the number-of-bands storing unit, centered on the band that contains the tonal frequency spectrum detected by the tone detecting unit 103. - An example of a case in which the
tone detecting unit 103 detects the “b”th band and the predetermined number of bands stored in the number-of-bands storing unit is a correction bandwidth “B” will now be specifically described. As shown in FIG. 10A, the allowable-error-power correcting unit 105 detects the “B” bands on each side of the band “b” as adjacent bands to be corrected, with the “b”th band at the center. In other words, the allowable-error-power correcting unit 105 detects the “b-B”th to “b+B”th bands as adjacent bands to be corrected. For example, in the example shown in FIG. 10A, for “b=16” and “B=4”, the allowable-error-power correcting unit 105 detects the bands “12” to “20” as adjacent bands to be corrected. - Also, as shown in
FIG. 10B, the allowable-error-power correcting unit 105 corrects the allowable error powers of the detected adjacent bands. In the example shown in FIG. 10B, the pre-correction allowable error powers in the bands “12” to “20” (excluding the band “16”), which are the detected adjacent bands, are greater than the band powers in those bands. Thus, the allowable-error-power correcting unit 105 performs the correction by equally attenuating the allowable error powers in the bands “12” to “20” (excluding the band “16”) so that the allowable error powers become smaller than the powers of the frequency spectra. One method for the attenuation determines $M'_{b-1} = g \times M_{b-1}$ (amount of attenuation $g < 1.0$), as shown in FIG. 10C, where $M'_{b-1}$ indicates the post-correction allowable error power in the “b−1”th band and $M_{b-1}$ indicates the pre-correction allowable error power in the “b−1”th band. - The allowable-error-
power correcting unit 105 also transmits, to the quantization-band detecting unit 106, the data of the allowable error powers determined by the psychoacoustic analyzing unit 104 and the data of the corrected allowable error powers. When the allowable-error-power correcting unit 105 does not receive the flag (tone_flag) and the information for identifying the detected band from the tone detecting unit 103, the correcting unit 105 does not perform the processing for correcting the allowable error powers and transmits the allowable error powers determined by the psychoacoustic analyzing unit 104, as they are, to the quantization-band detecting unit 106 described below. - The quantization-
band detecting unit 106 detects the bands to be quantized from among the bands of the frequency spectra upon receiving the frequency spectra and the allowable error powers. The frequency spectra are received from the MDCT unit 102, and the allowable error powers (including the allowable error powers corrected by the allowable-error-power correcting unit 105) are received from the allowable-error-power correcting unit 105. - Specifically, the quantization-
band detecting unit 106 compares, on a band-by-band basis, the band powers transmitted from the MDCT unit 102 with the allowable error powers transmitted from the allowable-error-power correcting unit 105, and thereby determines the bands to be quantized. More specifically, for a band whose allowable error power was corrected by the allowable-error-power correcting unit 105, the quantization-band detecting unit 106 compares the corrected allowable error power with the band power of the band. For a band whose allowable error power was not corrected by the unit 105, the unit 106 compares the allowable error power determined by the psychoacoustic analyzing unit 104 with the band power of the band. The unit 106 detects, as a band to be quantized, each band whose band power is greater than its allowable error power, and obtains information for identifying the detected bands. - The quantization-
band detecting unit 106 also transmits, to the scale factor determining unit 107, the information for identifying the detected bands to be quantized, the data of the allowable error powers transmitted from the allowable-error-power correcting unit 105, and the data of the frequency spectra transmitted from the MDCT unit 102. - Upon transmission of the information for identifying the bands to be quantized, the allowable error powers, and the frequency spectra from the quantization-
band detecting unit 106, the scale factor determining unit 107 determines, for the respective bands, scale factors such that the quantization error powers become smaller than the allowable error powers. - When the allowable-error-
power correcting unit 105 has corrected the allowable error powers of the bands adjacent to the band containing the tonal frequency spectrum detected by the tone detecting unit 103, the scale factor determining unit 107 determines, for the adjacent bands, scale factors such that the quantization error powers become smaller than the corrected allowable error powers. - The scale
factor determining unit 107 also transmits, to the scale factor correcting unit 108 described below, the information for identifying the bands to be quantized and the sets of data of the allowable error powers, the frequency spectra, and the scale factors determined for the respective bands. - As shown in
FIG. 11A, the scale factor correcting unit 108 receives the tonality detection result from the tone detecting unit 103, and receives, from the scale factor determining unit 107, the information for identifying the bands to be quantized and the sets of data of the allowable error powers, the frequency spectra, and the scale factors for the respective bands (the information and the allowable error powers are not shown in FIG. 11). Upon receiving these data, the scale factor correcting unit 108 corrects the scale factor of the band containing the tonal frequency spectrum. As described above, the tonality detection result includes the data of the band containing the tonal frequency spectrum and the tone detection flag. In particular, the scale factor correcting unit 108 corrects the scale factor for the band containing the tonal frequency spectrum to a scale factor such that the quantization value obtained from the largest one of the frequency spectra that constitute the band becomes the maximum value of the quantization values. - Now, a description will be specifically given of an example of a case in which the band containing the tonal frequency spectrum is a band “b” and the scale factor determined by the scale
factor determining unit 107 with respect to the band “b” is “Sb”. The scale factor correcting unit 108 searches for the maximum frequency spectrum contained in the band “b” (this corresponds to “Maximum Frequency Spectrum Searching” in the example shown in FIG. 11A). The maximum frequency spectrum is referred to as “max_pow_spec”, and the term “maximum frequency spectrum” referred to herein means the frequency spectrum having the greatest power among the frequency spectra that constitute the band containing the tonal frequency spectrum. - Also, for example, upon detecting the maximum frequency spectrum, the scale
factor correcting unit 108 determines a scale factor “S′b” such that the quantization value obtained by quantizing the maximum frequency spectrum becomes “MAX_QUANT”. “MAX_QUANT” means the maximum value of the quantization values. The scale factor “S′b” is determined in “Corrected Scale-Value Determination” in the example shown in FIG. 11A, and is set as the scale factor for the band containing the tonal frequency spectrum detected by the tone detecting unit 103. For example, in accordance with an expression shown in FIG. 11B, the scale factor correcting unit 108 replaces the scale factor “Sb” with the scale factor “S′b”; that is, the scale factor “Sb” is corrected to the scale factor “S′b”. The maximum value of the quantization values is defined by the coding technology standard; in the Advanced Audio Coding (AAC) standard, MAX_QUANT=8191. - The scale
factor correcting unit 108 also transmits, to the quantizing unit 109, the information for identifying the bands to be quantized and the sets of the allowable error powers, the frequency spectra, and the scale factors for the respective bands. The data of the scale factors includes the scale factor determined by the scale factor correcting unit 108 for the band containing the tonal frequency spectrum. - Upon receiving the information for identifying the bands to be quantized and the data sets of the allowable error powers, the frequency spectra, and the scale factors for the respective bands, the
quantizing unit 109 quantizes each frequency spectrum having a greater power than the allowable error power. Specifically, for each of the bands detected by the quantization-band detecting unit 106, the quantizing unit 109 reduces the dynamic range of the frequency spectra to the dynamic range uniquely specified by the scale factor and quantizes each of the frequency spectra that constitute the band in the reduced dynamic range. In this process, each of the bands detected by the quantization-band detecting unit 106 is identified by the information for identifying the bands to be quantized. - More specifically, the
quantizing unit 109 quantizes each of the frequency spectra contained in the band whose scale factor was determined by the scale factor correcting unit 108 by using that scale factor. Furthermore, the quantizing unit 109 quantizes each of the frequency spectra contained in the bands whose scale factors were not determined by the scale factor correcting unit 108 by using the scale factors determined by the scale factor determining unit 107. - In this case, the
quantizing unit 109 uses the scale factors determined by the scale factor determining unit 107 and the scale factor correcting unit 108 to change the dynamic ranges on a band-by-band basis. Thereafter, the quantizing unit 109 performs the quantization itself on each frequency spectrum that constitutes each band, rather than on a band-by-band basis. That is, the quantizing unit 109 obtains a quantization value for each frequency spectrum. - The
quantizing unit 109 also transmits the quantization values obtained by the quantization and the scale factors to the encoding unit 110 described below. - Upon receiving the quantization values and the scale factors from the
quantizing unit 109, the encoding unit 110 encodes the quantization values and the scale factors. For example, the encoding unit 110 uses Huffman coding to individually encode the quantization values and the scale factors. The encoding unit 110 transmits the encoded information to the output unit 111 described below. - Upon receiving the encoded information from the
encoding unit 110, the output unit 111 outputs the information received from the encoding unit 110 as encoded information of the audio signal input via the input unit 101. - The encoding device can also be realized by incorporating the functions of the
MDCT unit 102, the tone detecting unit 103, the psychoacoustic analyzing unit 104, the allowable-error-power correcting unit 105, the quantization-band detecting unit 106, the scale factor determining unit 107, the scale factor correcting unit 108, and the quantizing unit 109, which are described above, into an information processing apparatus, such as a known personal computer, workstation, portable phone, PHS terminal, mobile communication terminal, or PDA. - Processing performed by the encoding device will be described next using
FIGS. 12 and 13. Here, the flow of the entire processing performed by the encoding device is first described using FIG. 12, and then the flow of processing performed by the scale factor correcting unit 108 is described using FIG. 13. FIG. 12 is a flowchart showing the flow of the entire processing of the encoding device according to the first embodiment, and FIG. 13 is a flowchart showing the flow of processing performed by the scale factor correcting unit according to the first embodiment. - As shown in
FIG. 12, in the disclosed encoding device, when an audio signal exists (YES in step S101), i.e., when an audio signal is received by the input unit 101, the MDCT unit 102 performs MDCT conversion (step S102). That is, the MDCT unit 102 converts the audio signal, transmitted from the input unit 101, into frequency spectra. The MDCT unit 102 then divides the frequency spectra into bands (step S103) and determines band powers (step S104). That is, the MDCT unit 102 determines the frequency spectral powers and further determines the sum of the frequency spectral powers in each of the bands obtained by division by a predetermined width. - The
tone detecting unit 103 then detects a band containing a tonal frequency spectrum (step S105). That is, when there is a band having a greater frequency spectral power than a threshold, which is the average power of the frequency spectra in all the bands, the tone detecting unit 103 detects the band as a band having a high tonality. - The
psychoacoustic analyzing unit 104 then determines allowable error powers (step S106). That is, upon transmission of the audio signal from the input unit 101, the psychoacoustic analyzing unit 104 determines allowable error powers in accordance with the audio signal. - In this case, when a tone exists (YES in step S107), in other words, when the
tone detecting unit 103 has detected a tonal band in step S105 described above, the allowable-error-power correcting unit 105 corrects the allowable error powers (step S108). That is, upon transmission of the detection result from the tone detecting unit 103, the allowable-error-power correcting unit 105 corrects the allowable error powers of the adjacent bands. For example, the allowable-error-power correcting unit 105 corrects the allowable error powers of the adjacent bands to values that are smaller than the band powers of the adjacent bands. - After the allowable-error-
power correcting unit 105 corrects the allowable error powers (step S108), or when no tone exists (NO in step S107), the quantization-band detecting unit 106 detects bands to be quantized (step S109). That is, upon transmission of the frequency spectra from the MDCT unit 102 and transmission of the allowable error powers from the allowable-error-power correcting unit 105, the quantization-band detecting unit 106 detects the bands to be quantized from among the bands of the frequency spectra. - The scale
factor determining unit 107 then determines scale factors (step S110). That is, upon transmission of the information for identifying the bands to be quantized, the allowable error powers, and the frequency spectra from the quantization-band detecting unit 106, the scale factor determining unit 107 determines, for each band, a scale factor such that the quantization error power becomes smaller than the allowable error power. - In this case, when a tone exists (affirmative in step S111), the scale
factor correcting unit 108 corrects the scale factor (step S112). That is, when the band containing the tonal frequency spectrum is transmitted from the tone detecting unit 103 and the information for identifying the bands to be quantized, the allowable error powers, the frequency spectra, and the scale factors for the respective bands are transmitted from the scale factor determining unit 107, the scale factor correcting unit 108 corrects the scale factor for the band containing the tonal frequency spectrum. - The scale
factor correcting unit 108 thus corrects the scale factor (step S112). After that correction, or when no tone exists (negative in step S111), the quantizing unit 109 quantizes the frequency spectra (step S113). That is, upon transmission of the information for identifying the bands to be quantized, the allowable error powers, the frequency spectra, and the scale factors for the respective bands from the scale factor correcting unit 108, the quantizing unit 109 quantizes each frequency spectrum in each band detected by the quantization-band detecting unit 106. - Then, the
encoding unit 110 performs encoding (step S114). That is, upon transmission of the quantization values obtained by the quantization from the quantizing unit 109, the encoding unit 110 encodes the quantization values. - As shown in
FIG. 13, in the disclosed encoding device, when the scale factors are corrected (YES in step S201), i.e., when the band containing the tonal frequency spectrum is transmitted from the tone detecting unit 103 and the information for identifying the bands to be quantized, the allowable error powers, the frequency spectra, and the scale factors for the respective bands are transmitted from the scale factor determining unit 107, the scale factor correcting unit 108 detects a maximum frequency spectrum (step S202). - Then, for example, the scale
factor correcting unit 108 determines the scale factor at which the quantization value becomes maximum (step S203). That is, the scale factor correcting unit 108 determines a scale factor such that the quantization value obtained from the largest one of the frequency spectra that constitute the band containing the tonal frequency spectrum detected by the tone detecting unit 103 becomes the maximum value. - The scale
factor correcting unit 108 then corrects the scale factor (step S204). That is, for the band from which the tonal frequency spectrum was detected, the scale factor correcting unit 108 replaces the scale factor determined by the scale factor determining unit 107 with the scale factor determined in step S203. - As described above, according to the first embodiment, the disclosed encoding device converts an audio signal into frequency spectra and determines allowable error powers for the respective bands obtained by dividing the frequency of the audio signal by a predetermined width. The encoding device also detects a tonal frequency spectrum from the frequency spectra and the band containing that frequency spectrum. Using the detection result and the allowable error powers, the encoding device performs correction such that the allowable error powers determined for the bands adjacent to the band detected by the detecting unit become smaller than the powers of the frequency spectra in the adjacent bands. Furthermore, the encoding device quantizes each of the frequency spectra having greater powers than the corrected allowable error powers. Thus, it is possible to improve the encoded-sound quality of a tonal audio signal.
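The tonal-band detection step in this summary can be sketched as follows. This is only an illustration under stated assumptions: spectral power is taken as the squared coefficient, the band layout is passed in as a hypothetical `band_edges` list, and every band whose average power exceeds the overall average is reported. The expressions of FIGS. 7C and 7D are not reproduced in the text, so the exact threshold used there may differ.

```python
def detect_tonal_bands(spectra, band_edges):
    """Return indices of bands whose average power exceeds the overall average.

    spectra    : list of spectral coefficients (e.g. MDCT output)
    band_edges : band b covers spectra[band_edges[b]:band_edges[b + 1]]
    """
    powers = [s * s for s in spectra]          # assumed definition of power
    overall_avg = sum(powers) / len(powers)    # threshold over the whole range
    tonal = []
    for b in range(len(band_edges) - 1):
        lo, hi = band_edges[b], band_edges[b + 1]
        band_avg = sum(powers[lo:hi]) / (hi - lo)
        if band_avg > overall_avg:
            tonal.append(b)
    return tonal


# Seven bands of two coefficients each; only band 3 carries a strong peak.
spectra = [1, 1, 1, 1, 1, 1, 6, 8, 1, 1, 1, 1, 1, 1]
band_edges = [0, 2, 4, 6, 8, 10, 12, 14]
tone_bands = detect_tonal_bands(spectra, band_edges)
tone_flag = len(tone_bands) > 0
print(tone_flag, tone_bands)
```

Here the returned list plays the role of tone_band and the boolean plays the role of tone_flag. A practical detector would likely use a stricter criterion than the plain average, since the strongest band exceeds the average for almost any input.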
- Specifically, since the allowable error powers are corrected so that each of the frequency spectra adjacent to a peak power is quantized, it is possible to reliably quantize each of the frequency spectra adjacent to the peak power. Furthermore, it is possible to improve the encoded-sound quality of a tonal audio signal.
- That is, when a tonal audio signal is encoded with the known schemes, the frequency spectra adjacent to a tonal frequency spectrum cannot be reliably quantized and the adjacent frequency spectra are lost. Consequently, in an original sound as shown in
FIG. 14A, the phase characteristic of the encoded sound is distorted as shown in FIG. 14B, which may cause the amplitude to fluctuate and the sound to vibrate or trill. - Also, for example, in the known schemes, the amplitude fluctuates so much that it overflows (e.g., exceeds the maximum value of 16-bit PCM), resulting in clipping. Consequently, as shown in
FIG. 14C, an abnormal sound such as a “chi'ri'chi'ri” sound (e.g., a clipping noise) is generated. Also, as shown in FIG. 14B, variations in the amplitude cause the sound to vibrate perceptually. - Compared to such conventional schemes, according to the disclosed encoding device, frequency spectra adjacent to a tonal frequency spectrum can be reliably quantized as shown in
FIG. 15. Thus, when an original sound shown in FIG. 16A is encoded, the perceptual vibration of the sound and the abnormal “chi'ri'chi'ri” sound generated during quantization with the known schemes, as shown in FIG. 16B, are reduced as shown in FIG. 16C, and the encoded-sound quality of a tonal audio signal can be improved. - Also, according to the first embodiment, in the disclosed encoding device, with respect to the bands adjacent to the band containing the detected tonal frequency spectrum, the scale
factor correcting unit 108 determines, as the scale factors for the adjacent bands, scale factors such that the quantization error powers, which are determined from the quantization errors generated during the quantization of the frequency spectra contained in the adjacent bands, become smaller than the allowable error powers determined by the allowable-error-power correcting unit 105 for the adjacent bands. The quantizing unit 109 then quantizes each of the frequency spectra contained in a band whose scale factor was determined by the scale factor correcting unit 108 by using that scale factor. Thus, even when the allowable error powers are corrected, it is possible to perform quantization using an appropriate scale factor. - Also, according to the first embodiment, in the disclosed encoding device, the
tone detecting unit 103 detects the band containing the tonal frequency spectrum. Thereafter, the scale factor of the band is determined so that the quantization value obtained from the largest one of the frequency spectra that constitute the band containing the tonal frequency spectrum becomes the maximum. Thus, it is possible to minimize quantization errors. Specifically, since the quantization value obtained from a peak having a tonality takes the maximum value set by the standard, quantization errors are minimized. - Also, according to the first embodiment, the disclosed encoding device stores a predetermined number of bands and determines, as adjacent bands, the bands located within the range of the stored predetermined number of bands centered on the band containing the detected tonal frequency spectrum. Thereafter, the encoding device corrects the allowable error powers of the adjacent bands. Thus, it is possible to easily detect the bands in which the allowable error powers are to be corrected.
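The scale factor correction of the first embodiment can be sketched as below. The quantizer form (a 3/4 power law with a rounding offset of 0.4054 and a step size of 2^(sf/4)) is an assumed AAC-style quantizer used purely for illustration; the expression of FIG. 11B is not reproduced in the text, and the function names are hypothetical. Because scale factors are integers here, the sketch picks the smallest scale factor whose quantized peak does not exceed MAX_QUANT rather than hitting it exactly.

```python
import math

MAX_QUANT = 8191  # maximum quantization value given in the text for AAC


def quantize(spec, sf):
    """Assumed AAC-style quantizer: a larger sf means a coarser step."""
    step = 2.0 ** (sf / 4.0)
    return int((abs(spec) / step) ** 0.75 + 0.4054)


def corrected_scale_factor(band_spectra):
    """Smallest integer scale factor keeping the band's peak within MAX_QUANT.

    Assumes the band contains at least one non-zero spectrum.
    """
    max_pow_spec = max(abs(s) for s in band_spectra)
    # Solve (max_pow_spec / 2**(sf / 4)) ** 0.75 == MAX_QUANT for sf,
    # then round up so the quantized peak stays at or below MAX_QUANT.
    exact = 4.0 * math.log2(max_pow_spec / MAX_QUANT ** (4.0 / 3.0))
    return math.ceil(exact)


# Hypothetical tonal band with one dominant spectral line.
band = [1000.0, -5.0e6, 2.0e5]
sf = corrected_scale_factor(band)
peak_q = quantize(5.0e6, sf)
print(sf, peak_q)
```

Using the full quantizer range for the peak keeps the step size for the tonal band as small as the format allows, which is the motivation the text gives for this correction.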
- The encoding device according to the first embodiment adopts a scheme in which the scale
factor correcting unit 108 corrects the scale factor for the band detected by the tone detecting unit 103 so that the value obtained by quantizing the largest one of the frequency spectra in the band becomes the maximum value defined by the standard. The present invention, however, is not limited to this scheme. For example, an encoding device according to a second embodiment may search for a scale factor at which the generated quantization error power is small and use the scale factor obtained by the search. - The encoding device according to the second embodiment determines, as the scale factors, the scale factor determined by the scale
factor determining unit 107 and a changed scale factor obtained by changing that scale factor by a predetermined value. The encoding device then uses whichever of the two scale factors yields the smaller quantization error (or quantization error power) during quantization. In the following, points that are the same as in the encoding device of the first embodiment are described only briefly or omitted. - In the encoding device according to the second embodiment, the scale
factor correcting unit 108, which corresponds to an “error determining unit”, determines the quantization error power generated during the quantization of the frequency spectra contained in a band by using the scale factor determined by the scale factor determining unit 107 for the band. Furthermore, the scale factor correcting unit 108 determines the quantization error power obtained with the changed scale factor, which is obtained by changing the scale factor determined by the scale factor determining unit 107. - Thus, as shown in
FIG. 17, the scale factor correcting unit 108 according to the second embodiment differs from the one according to the first embodiment shown in FIG. 11B. That is, in the scale-correction value search, the scale factor correcting unit 108 according to the second embodiment uses the allowable error powers (in the second embodiment, the allowable error power corrected by the allowable-error-power correcting unit 105 and the pre-correction allowable error power). - Specifically, in the encoding device according to the second embodiment, with respect to the band detected by the
tone detecting unit 103, the scale factor correcting unit 108 quantizes each of the frequency spectra that constitute the band by using the scale factor determined by the scale factor determining unit 107. The encoding device then determines the quantization error power generated during the quantization (refer to the consideration on underlying technology). - Specifically, the following describes a case in which the
tone detecting unit 103 detects a band “b”, the scale factor determining unit 107 determines a scale factor “Sb” for the band “b”, and the number of frequency spectra that constitute the band “b” is “Nb”. - First, in the encoding device according to the second embodiment, the scale
factor correcting unit 108 quantizes each of the frequency spectra that constitute the band “b” by using the scale factor “Sb” to determine quantization values. The unit 108 then performs inverse quantization, using the determined quantization values and the scale factor “Sb”, to determine inversely quantized spectra. For example, in the AAC encoding method, the scale factor correcting unit 108 determines a quantized value “quant_i” obtained from the i-th spectrum “spec_i” contained in the band “b” and an inversely quantized spectrum “ispec_i” in accordance with the expressions shown in FIGS. 18A and 18B.
- Then, in the encoding device, the scale
factor correcting unit 108 determines a quantization error power in the band from the pre-quantization frequency spectra and the inversely quantized spectra. For example, the scale factor correcting unit 108 determines a quantization error power “error_eb” in the band “b” in accordance with the expression shown in FIG. 18C. “Nb” in the expression shown in FIG. 18C indicates the number of frequency spectra contained in the band “b”.
- Also, specifically, the scale
factor correcting unit 108 changes the scale factor determined by the scale factor determining unit 107 by a predetermined value. The unit 108 then uses the changed scale factor (a change scale factor) to determine the quantization error power generated during the quantization of the band detected by the tone detecting unit 103.
- For example, the scale
factor correcting unit 108 changes the scale factor “Sb” by a predetermined value “A” and uses the resulting change scale factor “S′b” (e.g., “S′b” = “Sb” + “A”) to determine the quantization error power generated during the quantization of the band “b”.
- Also, specifically, in the encoding device according to the second embodiment, the scale
factor correcting unit 108 compares two quantization error powers, referred to as the “first” and the “second” quantization error power, to determine whether the “second” is smaller. The first quantization error power is generated by using the scale factor determined by the scale factor determining unit 107, and the second quantization error power is generated by using the change scale factor. When the “second” quantization error power is smaller than the “first” quantization error power, the scale factor correcting unit 108 corrects the scale factor (e.g., “Sb”) for the band detected by the tone detecting unit 103 to the change scale factor (e.g., “S′b”). On the other hand, when the “second” quantization error power is not smaller than the “first” quantization error power, the scale factor correcting unit 108 does not correct the scale factor.
- Also, the scale
factor correcting unit 108 determines quantization error powers with respect to multiple scale factors by using various values of “A” and corrects the scale factor to the one at which the smallest quantization error power is generated. As an example, suppose the scale factor correcting unit 108 uses “Sb1” and “Sb2” as the change scale factors in the first and second operations, respectively. If the unit 108 corrects the scale factor “Sb” determined by the unit 107 to the change scale factor “Sb1” in the first operation, it then compares the quantization error power generated by using “Sb1” with the quantization error power generated by using “Sb2”.
- Also, for example, the scale
factor correcting unit 108 determines whether the comparison has been performed with respect to all predetermined change scale factors (e.g., the change scale factor candidates determined from the predetermined values of “A”). The scale factor correcting unit 108 continues the scale factor correction processing until the comparison has been performed with respect to all change scale factors.
- Although the description for the encoding device according to the second embodiment has been given of the scheme in which the scale
factor correcting unit 108 compares the quantization error powers on a one-to-one basis, the present invention is not limited thereto. The arrangement may be such that quantization error powers are determined for multiple scale factors, the determined quantization error powers (e.g., three or more) are compared simultaneously, and the one scale factor that generates the smallest quantization error power is used.
- The value of “A” is arbitrary: “A” may be greater than “0” or smaller than “0”. Also, the scale
factor correcting unit 108 may pre-store a setting regarding the number of values used as the change scale factors (the number of times the quantization error powers are determined and compared) and may execute the scale factor correction processing based on the setting.
- The present invention is not limited to the scheme using various values of “A” (using multiple change scale factors). For example, the scale factor determined by the scale
factor determining unit 107 may be compared with only one change scale factor. For example, one value that is estimated to reduce quantization errors may be pre-selected and used as the change scale factor. This makes it possible to execute the scale factor correction processing quickly.
- One example of the flow of detailed processing performed by the scale
factor correcting unit 108 in the encoding device according to the second embodiment is described later.
- In the encoding device according to the second embodiment, the
quantizing unit 109 quantizes each of the frequency spectra contained in the band by using the scale factor (or the change scale factor) that gives the smallest of the quantization error powers determined by the scale factor correcting unit 108. For example, the scale factor correcting unit 108 determines quantization error powers with respect to the scale factor “Sb” determined by the scale factor determining unit 107 and the value “S′b” obtained by changing the scale factor by “A”. Then, when the quantization error power generated by using “S′b” is the smallest, each of the frequency spectra that constitute the band detected by the tone detecting unit 103 is quantized using the scale factor “S′b”.
- Processing performed by the scale factor correcting unit in the second embodiment will be described next using
FIG. 19. FIG. 19 is a flowchart showing the flow of the scale factor correction processing performed by the encoding device according to the second embodiment.
- Unless otherwise stated, the description below uses, as an example, a case in which the
tone detecting unit 103 detects a band “b”, the scale factor determining unit 107 determines a scale factor “Sb” for the band “b”, and the number of frequency spectra that constitute the band “b” is “Nb”.
- As shown in
FIG. 19, in the disclosed encoding device, when the scale factor correcting unit 108 is to correct the scale factor (YES in step S301), it determines a quantization error power (step S302). That is, the scale factor correcting unit 108 performs quantization by using the scale factor “Sb” determined by the scale factor determining unit 107 and determines the quantization error power generated during the quantization of the band “b”.
- Then, the scale
factor correcting unit 108 changes the scale factor (step S303). That is, for example, the scale factor correcting unit 108 changes the scale factor “Sb” by a predetermined value “A”. The scale factor correcting unit 108 then determines a quantization error power by using the changed scale factor (step S304). That is, for example, the scale factor correcting unit 108 determines the quantization error power generated during the quantization of the band “b” by using the obtained change scale factor “S′b”.
- Then, the scale
factor correcting unit 108 compares the quantization error powers (step S305). That is, for example, with respect to the quantization error powers generated during the quantization of the band “b”, the scale factor correcting unit 108 compares a “first” quantization error power with a “second” quantization error power. The “first” quantization error power is generated when the scale factor “Sb” determined by the scale factor determining unit 107 is used. The “second” quantization error power is generated when the change scale factor “S′b” is used.
- Then, the scale
factor correcting unit 108 compares the quantization error power obtained with the scale factor determined by the scale factor determining unit 107 against the one obtained with the change scale factor, to determine whether the quantization error power is smaller when the change scale factor is used (step S306). That is, for example, the scale factor correcting unit 108 determines whether the “second” quantization error power is smaller than the “first” quantization error power. When the “second” quantization error power is smaller than the “first” quantization error power (affirmative in step S306), the scale factor correcting unit 108 corrects the scale factor (step S307). That is, for example, the scale factor correcting unit 108 corrects the scale factor “Sb” to the change scale factor “S′b”.
- Then, when the scale
factor correcting unit 108 corrects the scale factor (step S307) or when the “second” quantization error power is not smaller than the “first” quantization error power (negative in step S306), the scale factor correcting unit 108 determines whether the comparison has been performed with respect to all change scale factor candidates (step S308). When the comparison has been performed with respect to all change scale factor candidates (affirmative in step S308), the processing ends. On the other hand, when it has not (negative in step S308), the processing of steps S303 to S307 described above is repeated until the comparison has been performed with respect to all change scale factor candidates.
- As described above, according to the second embodiment, in the disclosed encoding device, the scale
factor correcting unit 108 determines the quantization error powers by using the scale factor determined by the scale factor determining unit 107 and also by using the change scale factor. The encoding device then performs quantization by using the scale factor (or the change scale factor) that produced the smallest of the determined quantization error powers.
- Although the first and second embodiments have been described above, the present invention may also be implemented in various forms other than the first and second embodiments. Accordingly, other embodiments will be described below.
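- As an illustration of the second embodiment's correction flow (steps S301 to S308 of FIG. 19), the following Python sketch quantizes a band, inverse-quantizes it, measures the quantization error power, and keeps whichever candidate scale factor gives the smallest error. The expressions of FIGS. 18A to 18C are not reproduced in this text, so the power-law quantizer (exponent 3/4, rounding offset 0.4054, step size 2^(sf/4)) and the sum-of-squared-differences error are assumptions modeled on common AAC encoder practice, not the patent's exact formulas. The function names and the candidate list `deltas` (the values of “A”) are hypothetical.

```python
import math


def quantize(spec, sf):
    """Quantize spectra with an assumed AAC-style power-law quantizer."""
    step = 2.0 ** (sf / 4.0)  # assumed step size; a larger sf is coarser
    return [
        int(math.copysign(math.floor((abs(x) / step) ** 0.75 + 0.4054), x))
        for x in spec
    ]


def dequantize(quant, sf):
    """Inverse of quantize() under the same assumed step size."""
    step = 2.0 ** (sf / 4.0)
    return [math.copysign(abs(q) ** (4.0 / 3.0) * step, q) for q in quant]


def error_power(spec, sf):
    """Assumed error measure: sum of squared differences over the band."""
    ispec = dequantize(quantize(spec, sf), sf)
    return sum((x - y) ** 2 for x, y in zip(spec, ispec))


def correct_scale_factor(spec, sb, deltas):
    """Try sb + A for every A in deltas and keep the smallest-error factor."""
    best_sf = sb
    best_err = error_power(spec, sb)          # step S302
    for a in deltas:                          # loop closed by step S308
        candidate = sb + a                    # step S303
        err = error_power(spec, candidate)    # step S304
        if err < best_err:                    # steps S305 and S306
            best_sf, best_err = candidate, err  # step S307
    return best_sf, best_err


if __name__ == "__main__":
    band = [100.0, -3.0, 0.5, 0.0]  # made-up spectra for one band "b"
    print(correct_scale_factor(band, sb=8, deltas=[-2, -1, 1, 2]))
```

- Because the running best is compared against each candidate in turn, the sketch covers both the one-to-one comparison and the "smallest of several" variant described above; passing a single value in `deltas` corresponds to the quick scheme that tests only one change scale factor.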
- The scheme in which the scale
factor correcting unit 108 corrects the scale factor with respect to only the band detected by the tone detecting unit 103 has been described in the first and second embodiments described above. The present invention is not limited thereto and the scale factors may be corrected with respect to all bands. This allows an encoding device according to a third embodiment to reduce quantization errors with respect to other bands.
- Also, for example, the
quantizing unit 109 may quantize each of the frequency spectra contained in all bands by using the scale factor determined by the scale factor correcting unit 108. As a specific example, the scale factor correcting unit 108 determines a scale factor with respect to only the band detected by the tone detecting unit 103 and corrects the scale factor. The unit 108 further corrects the scale factors for the other bands to the scale factor determined for the band detected by the tone detecting unit 103. The quantizing unit 109 then quantizes each of the frequency spectra in all bands by using the scale factor determined by the scale factor correcting unit 108 for the band detected by the tone detecting unit 103.
- This allows the encoding device according to the third embodiment to reduce the number of bits used during the encoding of the scale factor. Specifically, during encoding, the scale factor is expressed as a difference from an adjacent scale factor. Making all the scale factors the same therefore reduces the number of bits required to encode the scale factors set for the individual bands, compared to a scheme in which different scale factors are set for the respective bands.
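- The bit saving from differential scale factor coding can be illustrated with a small Python sketch. The per-difference cost used here (`toy_code_length`, 1 bit for a zero difference and 2 more bits per unit of magnitude) is a hypothetical stand-in for a variable-length code and is not the actual AAC codebook; only the differencing step itself comes from the text above.

```python
def scale_factor_deltas(sfs):
    """Differences between each scale factor and the preceding band's."""
    return [cur - prev for prev, cur in zip(sfs, sfs[1:])]


def toy_code_length(delta):
    """Hypothetical code length in bits; shortest for a zero difference."""
    return 1 + 2 * abs(delta)


def side_info_bits(sfs):
    """Total toy cost of coding all scale factor differences."""
    return sum(toy_code_length(d) for d in scale_factor_deltas(sfs))


if __name__ == "__main__":
    uniform = [60] * 7                      # same scale factor in all bands
    varied = [60, 62, 58, 61, 60, 59, 63]   # made-up per-band scale factors
    print(side_info_bits(uniform), side_info_bits(varied))
```

- With identical scale factors every difference is zero, so each band costs only the shortest code under any scheme that favors small differences.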
- Although the scheme in which bands that exist in a predetermined band width, with the band detected by the
tone detecting unit 103 at the center, are used as adjacent bands has been described in the first embodiment described above, the present invention is not limited thereto and bands whose powers lie within a predetermined power width of a peak power may be used.
- In other words, as shown in
FIG. 20, first, based on the band power and the detection result from the tone detecting unit 103, a band width in which the allowable error powers are to be corrected is determined using a preset power width, and then the allowable error powers are corrected.
- Specifically, in the encoding device according to the third embodiment, the allowable-error-
power correcting unit 105 has a power-width storing unit. A predetermined power width is stored in the power-width storing unit. The allowable-error-power correcting unit 105 stores, for example, “G” in the power-width storing unit. - Then, in the encoding device, the
tone detecting unit 103 detects the band containing the tonal frequency spectrum. Further, the allowable-error-power correcting unit 105 regards as adjacent bands the band containing the tonal frequency spectrum together with any band or bands whose power values are greater than or equal to a threshold. The threshold is the power value obtained by attenuating the power value of the band detected by the tone detecting unit 103 by the predetermined power width stored in the power-width storing unit. The allowable-error-power correcting unit 105 corrects the allowable error power(s) for the adjacent band or bands as shown in FIG. 21.
- For example, a specific description is given using the example shown in
FIG. 21. The description assumes that the power of the frequency spectra in a tonal band is “Epeak”, the power-width storing unit stores “G”, and seven bands exist. In the encoding device according to the third embodiment, the allowable-error-power correcting unit 105 determines “Ethr”, the power obtained by attenuating “Epeak” by “G”, and uses “Ethr” as a power threshold for determining the bands in which the allowable error powers are to be corrected.
- For example, in the encoding device according to the third embodiment, the allowable-error-
power correcting unit 105 checks the bands adjacent to the tone band for bands whose powers exceed the power threshold. In the example shown in FIG. 21, since bands “2” and “4” exhibit greater powers than the power threshold, the allowable-error-power correcting unit 105 determines that the band width in which the allowable error powers are to be corrected is “B1 (the number of bands at the lower frequency side than the band adjacent to the tone band)=1” and “B2 (the number of bands at the higher frequency side than the band adjacent to the tone band)=1”.
- In this manner, the encoding device according to the third embodiment can easily detect bands in which the allowable error powers are to be corrected.
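- The power-width search can be sketched in Python as follows. The sketch assumes that band powers and “G” are expressed in decibels, so that attenuating “Epeak” by “G” is a subtraction, and that the search on each side stops at the first band below “Ethr”; neither assumption is stated in the text. The function name, the zero-based band indices, and the sample powers are hypothetical.

```python
def adjacent_band_width(band_power_db, tone_band, g_db):
    """Return (B1, B2): bands below/above the tone band at or above Ethr."""
    e_thr = band_power_db[tone_band] - g_db  # Ethr = Epeak attenuated by G

    b1 = 0
    i = tone_band - 1
    while i >= 0 and band_power_db[i] >= e_thr:  # lower-frequency side
        b1 += 1
        i -= 1

    b2 = 0
    i = tone_band + 1
    while i < len(band_power_db) and band_power_db[i] >= e_thr:  # higher side
        b2 += 1
        i += 1

    return b1, b2


if __name__ == "__main__":
    # Seven made-up band powers in dB; the tone band is at index 3.
    powers = [20.0, 30.0, 55.0, 70.0, 52.0, 28.0, 22.0]
    print(adjacent_band_width(powers, tone_band=3, g_db=20.0))
```

- With these sample values the threshold is 50 dB, and only the two bands next to the tone band reach it, which mirrors the “B1=1”, “B2=1” outcome of the FIG. 21 example.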
- Of the processing described in the present embodiment, all or part of the processing described as being performed automatically can be performed manually, and all or part of the processing described as being performed manually can be performed automatically by a known method. In addition, the processing procedures, the control procedures, the specific names, and information including various types of data and parameters which are illustrated in the document and the drawings (e.g.,
FIGS. 5 to 13 and FIGS. 17 to 21) can be arbitrarily modified, unless otherwise particularly stated.
- Also, the description in the first embodiment described above has been given of, for example, a case in which (1) the scheme for correcting the scale factor, (2) the scheme using the scale factor at which the quantization value becomes the maximum value, and (3) the scheme using the predetermined bandwidth during the detection of adjacent bands are implemented together during the correction of the allowable error powers. However, the present invention is not limited to this case; during the correction of the allowable error powers, (1) to (3) do not have to be implemented together, and only one or some of (1) to (3) may be implemented.
- Similarly, with respect to the schemes described in the second embodiment and the third embodiment described above, the present invention is not limited to a case in which one of the schemes is implemented, and multiple schemes may be implemented together.
- Meanwhile, although a case in which various types of processing are achieved by hardware logic has been described in the first embodiment described above, the present invention is not limited thereto and the processing may be achieved by causing a computer to execute a prepared program. Accordingly, one example of a computer for executing an encoding program having the same function as the encoding device illustrated in the first embodiment described above will be described below using
FIG. 22. FIG. 22 is a diagram for describing a program for the encoding device according to the first embodiment.
- As shown in the figure, an
encoding device 3000 in the first embodiment has a configuration in which an operation unit 3001, a microphone 3002, a speaker 3003, a display 3005, a communication unit 3006, a CPU 3010, a ROM 3011, an HDD 3012, and a RAM 3013 are connected through a bus 3009 and so on.
- The
ROM 3011 pre-stores control programs such as an input program 3011a, an MDCT program 3011b, a tone detecting program 3011c, a psychoacoustic analyzing program 3011d, an allowable-error-power correcting program 3011e, a quantization-band detecting program 3011f, a scale factor determining program 3011g, a scale factor correcting program 3011h, a quantizing program 3011i, an encoding program 3011j, and an output program 3011k. The pre-stored control programs provide the same functions as the input unit 101, the MDCT unit 102, the tone detecting unit 103, the psychoacoustic analyzing unit 104, the allowable-error-power correcting unit 105, the quantization-band detecting unit 106, the scale factor determining unit 107, the scale factor correcting unit 108, the quantizing unit 109, the encoding unit 110, and the output unit 111, respectively, which are illustrated in the first embodiment described above. These programs 3011a to 3011k may be integrated together or distributed, as required, similarly to the elements that constitute the encoding device shown in FIG. 6.
- The
CPU 3010 reads these programs 3011a to 3011k from the ROM 3011 and executes them to thereby cause the programs 3011a to 3011k to function as an input process 3010a, an MDCT process 3010b, a tone detecting process 3010c, a psychoacoustic analyzing process 3010d, an allowable-error-power correcting process 3010e, a quantization-band detecting process 3010f, a scale factor determining process 3010g, a scale factor correcting process 3010h, a quantizing process 3010i, an encoding process 3010j, and an output process 3010k, as shown in FIG. 22. The processes 3010a to 3010k correspond to the input unit 101, the MDCT unit 102, the tone detecting unit 103, the psychoacoustic analyzing unit 104, the allowable-error-power correcting unit 105, the quantization-band detecting unit 106, the scale factor determining unit 107, the scale factor correcting unit 108, the quantizing unit 109, the encoding unit 110, and the output unit 111 which are shown in FIG. 6, respectively.
- The encoding device described in the present embodiment can be achieved by causing a computer, such as a personal computer or workstation, to execute the prepared program. This program can be distributed over a network, such as the Internet. This program can also be recorded on a computer-readable storage medium, such as a hard disk, flexible disk (FD), CD-ROM, MO, or DVD, and can be executed by causing the computer to read the program from the storage medium.
- The following appendices are further disclosed with respect to illustrative embodiments including the embodiments described above.
Claims (11)
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2008-037991 | 2008-02-19 | ||
JP2008037991A JP5262171B2 (en) | 2008-02-19 | 2008-02-19 | Encoding apparatus, encoding method, and encoding program |
Publications (2)
Publication Number | Publication Date |
---|---|
US20090210235A1 true US20090210235A1 (en) | 2009-08-20 |
US9076440B2 US9076440B2 (en) | 2015-07-07 |
Family
ID=40834407
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US12/367,963 Expired - Fee Related US9076440B2 (en) | 2008-02-19 | 2009-02-09 | Audio signal encoding device, method, and medium by correcting allowable error powers for a tonal frequency spectrum |
Country Status (4)
Country | Link |
---|---|
US (1) | US9076440B2 (en) |
EP (1) | EP2093758A2 (en) |
JP (1) | JP5262171B2 (en) |
CN (1) | CN101515458A (en) |
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20110301961A1 (en) * | 2009-02-16 | 2011-12-08 | Mi-Suk Lee | Method and apparatus for encoding and decoding audio signal using adaptive sinusoidal coding |
US20130253939A1 (en) * | 2010-11-22 | 2013-09-26 | Ntt Docomo, Inc. | Audio encoding device, method and program, and audio decoding device, method and program |
US20140114652A1 (en) * | 2012-10-24 | 2014-04-24 | Fujitsu Limited | Audio coding device, audio coding method, and audio coding and decoding system |
US9633663B2 (en) | 2011-12-15 | 2017-04-25 | Fraunhofer-Gesellschaft Zur Foederung Der Angewandten Forschung E.V. | Apparatus, method and computer program for avoiding clipping artefacts |
US10468043B2 (en) | 2013-01-29 | 2019-11-05 | Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. | Low-complexity tonality-adaptive audio signal quantization |
Families Citing this family (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP6398607B2 (en) * | 2014-10-24 | 2018-10-03 | 富士通株式会社 | Audio encoding apparatus, audio encoding method, and audio encoding program |
JP7535053B2 (en) | 2019-10-16 | 2024-08-15 | パナソニック インテレクチュアル プロパティ コーポレーション オブ アメリカ | Quantization scale factor determination device and quantization scale factor determination method |
CN111402856B (en) * | 2020-03-23 | 2023-04-14 | 北京字节跳动网络技术有限公司 | Voice processing method and device, readable medium and electronic equipment |
Citations (32)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5684922A (en) * | 1993-11-25 | 1997-11-04 | Sharp Kabushiki Kaisha | Encoding and decoding apparatus causing no deterioration of sound quality even when sine-wave signal is encoded |
US5918203A (en) * | 1995-02-17 | 1999-06-29 | Fraunhofer-Gesellschaft Zur Forderung Der Angewandten Forschung E.V. | Method and device for determining the tonality of an audio signal |
US6138101A (en) * | 1997-01-22 | 2000-10-24 | Sharp Kabushiki Kaisha | Method of encoding digital data |
US6199038B1 (en) * | 1996-01-30 | 2001-03-06 | Sony Corporation | Signal encoding method using first band units as encoding units and second band units for setting an initial value of quantization precision |
US6295009B1 (en) * | 1998-09-17 | 2001-09-25 | Matsushita Electric Industrial Co., Ltd. | Audio signal encoding apparatus and method and decoding apparatus and method which eliminate bit allocation information from the encoded data stream to thereby enable reduction of encoding/decoding delay times without increasing the bit rate |
US6308150B1 (en) * | 1998-06-16 | 2001-10-23 | Matsushita Electric Industrial Co., Ltd. | Dynamic bit allocation apparatus and method for audio coding |
US6385572B2 (en) * | 1998-09-09 | 2002-05-07 | Sony Corporation | System and method for efficiently implementing a masking function in a psycho-acoustic modeler |
US20020120442A1 (en) * | 2001-02-27 | 2002-08-29 | Atsushi Hotta | Audio signal encoding apparatus |
US6456968B1 (en) * | 1999-07-26 | 2002-09-24 | Matsushita Electric Industrial Co., Ltd. | Subband encoding and decoding system |
US6629283B1 (en) * | 1999-09-27 | 2003-09-30 | Pioneer Corporation | Quantization error correcting device and method, and audio information decoding device and method |
US6725192B1 (en) * | 1998-06-26 | 2004-04-20 | Ricoh Company, Ltd. | Audio coding and quantization method |
US20040098268A1 (en) * | 2002-11-07 | 2004-05-20 | Samsung Electronics Co., Ltd. | MPEG audio encoding method and apparatus |
US6745162B1 (en) * | 2000-06-22 | 2004-06-01 | Sony Corporation | System and method for bit allocation in an audio encoder |
US6778953B1 (en) * | 2000-06-02 | 2004-08-17 | Agere Systems Inc. | Method and apparatus for representing masked thresholds in a perceptual audio coder |
US20040162720A1 (en) * | 2003-02-15 | 2004-08-19 | Samsung Electronics Co., Ltd. | Audio data encoding apparatus and method |
US6801886B1 (en) * | 2000-06-22 | 2004-10-05 | Sony Corporation | System and method for enhancing MPEG audio encoder quality |
US6826526B1 (en) * | 1996-07-01 | 2004-11-30 | Matsushita Electric Industrial Co., Ltd. | Audio signal coding method, decoding method, audio signal coding apparatus, and decoding apparatus where first vector quantization is performed on a signal and second vector quantization is performed on an error component resulting from the first vector quantization |
US6895374B1 (en) * | 2000-09-29 | 2005-05-17 | Sony Corporation | Method for utilizing temporal masking in digital audio coding |
US20060004565A1 (en) * | 2004-07-01 | 2006-01-05 | Fujitsu Limited | Audio signal encoding device and storage medium for storing encoding program |
US7110941B2 (en) * | 2002-03-28 | 2006-09-19 | Microsoft Corporation | System and method for embedded audio coding with implicit auditory masking |
US20070162277A1 (en) * | 2006-01-12 | 2007-07-12 | Stmicroelectronics Asia Pacific Pte., Ltd. | System and method for low power stereo perceptual audio coding using adaptive masking threshold |
US7333930B2 (en) * | 2003-03-14 | 2008-02-19 | Agere Systems Inc. | Tonal analysis for perceptual audio coding using a compressed spectral representation |
US20080052068A1 (en) * | 1998-09-23 | 2008-02-28 | Aguilar Joseph G | Scalable and embedded codec for speech and audio signals |
US20080133223A1 (en) * | 2006-12-04 | 2008-06-05 | Samsung Electronics Co., Ltd. | Method and apparatus to extract important frequency component of audio signal and method and apparatus to encode and/or decode audio signal using the same |
US7613605B2 (en) * | 2004-11-18 | 2009-11-03 | Canon Kabushiki Kaisha | Audio signal encoding apparatus and method |
US7627481B1 (en) * | 2005-04-19 | 2009-12-01 | Apple Inc. | Adapting masking thresholds for encoding a low frequency transient signal in audio data |
US7627469B2 (en) * | 2004-05-28 | 2009-12-01 | Sony Corporation | Audio signal encoding apparatus and audio signal encoding method |
US7630170B2 (en) * | 2005-08-09 | 2009-12-08 | Hitachi Global Storage Technologies Netherlands B.V. | Sealing method for magnetic disk drive |
US7634400B2 (en) * | 2003-03-07 | 2009-12-15 | Stmicroelectronics Asia Pacific Pte. Ltd. | Device and process for use in encoding audio data |
US7873510B2 (en) * | 2006-04-28 | 2011-01-18 | Stmicroelectronics Asia Pacific Pte. Ltd. | Adaptive rate control algorithm for low complexity AAC encoding |
US7974848B2 (en) * | 2006-06-21 | 2011-07-05 | Samsung Electronics Co., Ltd. | Method and apparatus for encoding audio data |
US20110270616A1 (en) * | 2007-08-24 | 2011-11-03 | Qualcomm Incorporated | Spectral noise shaping in audio coding based on spectral dynamics in frequency sub-bands |
Family Cites Families (25)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
DE3639753A1 (en) | 1986-11-21 | 1988-06-01 | Inst Rundfunktechnik Gmbh | METHOD FOR TRANSMITTING DIGITALIZED SOUND SIGNALS |
JP3173218B2 (en) | 1993-05-10 | 2001-06-04 | ソニー株式会社 | Compressed data recording method and apparatus, compressed data reproducing method, and recording medium |
JPH0750589A (en) | 1993-08-04 | 1995-02-21 | Sanyo Electric Co Ltd | Sub-band coding device |
JP3227291B2 (en) | 1993-12-16 | 2001-11-12 | シャープ株式会社 | Data encoding device |
JP3465341B2 (en) | 1994-04-28 | 2003-11-10 | ソニー株式会社 | Audio signal encoding method |
KR100289733B1 (en) * | 1994-06-30 | 2001-05-15 | 윤종용 | Digital audio coding method and apparatus |
JP3680374B2 (en) | 1995-09-28 | 2005-08-10 | ソニー株式会社 | Speech synthesis method |
JP3467750B2 (en) | 1997-01-17 | 2003-11-17 | 日本電信電話株式会社 | Distributed object processing system |
JP2000293199A (en) | 1999-04-05 | 2000-10-20 | Nippon Columbia Co Ltd | Voice coding method and recording and reproducing device |
JP2001007704A (en) | 1999-06-24 | 2001-01-12 | Matsushita Electric Ind Co Ltd | Adaptive audio encoding method for tone component data |
JP4211165B2 (en) * | 1999-12-10 | 2009-01-21 | ソニー株式会社 | Encoding apparatus and method, recording medium, and decoding apparatus and method |
JP2001282288A (en) | 2000-03-28 | 2001-10-12 | Matsushita Electric Ind Co Ltd | Encoding device for audio signal and processing method |
JP2001343998A (en) | 2000-05-31 | 2001-12-14 | Yamaha Corp | Digital audio decoder |
JP2002268693A (en) | 2001-03-12 | 2002-09-20 | Mitsubishi Electric Corp | Audio encoding device |
JP2004522198A (en) | 2001-05-08 | 2004-07-22 | コーニンクレッカ フィリップス エレクトロニクス エヌ ヴィ | Audio coding method |
JP3984468B2 (en) * | 2001-12-14 | 2007-10-03 | 松下電器産業株式会社 | Encoding device, decoding device, and encoding method |
US20050259819A1 (en) | 2002-06-24 | 2005-11-24 | Koninklijke Philips Electronics | Method for generating hashes from a compressed multimedia content |
JP3933072B2 (en) * | 2003-03-25 | 2007-06-20 | ヤマハ株式会社 | Wave compressor |
US20050198061A1 (en) | 2004-02-17 | 2005-09-08 | David Robinson | Process and product for selectively processing data accesses |
JP2005258158A (en) | 2004-03-12 | 2005-09-22 | Advanced Telecommunication Research Institute International | Noise removal device |
US7546240B2 (en) | 2005-07-15 | 2009-06-09 | Microsoft Corporation | Coding with improved time resolution for selected segments via adaptive block transformation of a group of samples from a subband decomposition |
US7516074B2 (en) | 2005-09-01 | 2009-04-07 | Auditude, Inc. | Extraction and matching of characteristic fingerprints from audio signals |
US20100153099A1 (en) | 2005-09-30 | 2010-06-17 | Matsushita Electric Industrial Co., Ltd. | Speech encoding apparatus and speech encoding method |
JP4398416B2 (en) | 2005-10-07 | 2010-01-13 | 株式会社エヌ・ティ・ティ・ドコモ | Modulation device, modulation method, demodulation device, and demodulation method |
JP5071960B2 (en) | 2006-08-04 | 2012-11-14 | 旭化成ケミカルズ株式会社 | Foam |
-
2008
- 2008-02-19 JP JP2008037991A patent/JP5262171B2/en not_active Expired - Fee Related
-
2009
- 2009-02-09 US US12/367,963 patent/US9076440B2/en not_active Expired - Fee Related
- 2009-02-18 EP EP09153093A patent/EP2093758A2/en not_active Withdrawn
- 2009-02-19 CN CN200910008031.0A patent/CN101515458A/en active Pending
Patent Citations (33)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5684922A (en) * | 1993-11-25 | 1997-11-04 | Sharp Kabushiki Kaisha | Encoding and decoding apparatus causing no deterioration of sound quality even when sine-wave signal is encoded |
US5918203A (en) * | 1995-02-17 | 1999-06-29 | Fraunhofer-Gesellschaft Zur Forderung Der Angewandten Forschung E.V. | Method and device for determining the tonality of an audio signal |
US6199038B1 (en) * | 1996-01-30 | 2001-03-06 | Sony Corporation | Signal encoding method using first band units as encoding units and second band units for setting an initial value of quantization precision |
US6826526B1 (en) * | 1996-07-01 | 2004-11-30 | Matsushita Electric Industrial Co., Ltd. | Audio signal coding method, decoding method, audio signal coding apparatus, and decoding apparatus where first vector quantization is performed on a signal and second vector quantization is performed on an error component resulting from the first vector quantization |
US6138101A (en) * | 1997-01-22 | 2000-10-24 | Sharp Kabushiki Kaisha | Method of encoding digital data |
US6370499B1 (en) * | 1997-01-22 | 2002-04-09 | Sharp Kabushiki Kaisha | Method of encoding digital data |
US6308150B1 (en) * | 1998-06-16 | 2001-10-23 | Matsushita Electric Industrial Co., Ltd. | Dynamic bit allocation apparatus and method for audio coding |
US6725192B1 (en) * | 1998-06-26 | 2004-04-20 | Ricoh Company, Ltd. | Audio coding and quantization method |
US6385572B2 (en) * | 1998-09-09 | 2002-05-07 | Sony Corporation | System and method for efficiently implementing a masking function in a psycho-acoustic modeler |
US6295009B1 (en) * | 1998-09-17 | 2001-09-25 | Matsushita Electric Industrial Co., Ltd. | Audio signal encoding apparatus and method and decoding apparatus and method which eliminate bit allocation information from the encoded data stream to thereby enable reduction of encoding/decoding delay times without increasing the bit rate |
US20080052068A1 (en) * | 1998-09-23 | 2008-02-28 | Aguilar Joseph G | Scalable and embedded codec for speech and audio signals |
US6456968B1 (en) * | 1999-07-26 | 2002-09-24 | Matsushita Electric Industrial Co., Ltd. | Subband encoding and decoding system |
US6629283B1 (en) * | 1999-09-27 | 2003-09-30 | Pioneer Corporation | Quantization error correcting device and method, and audio information decoding device and method |
US6778953B1 (en) * | 2000-06-02 | 2004-08-17 | Agere Systems Inc. | Method and apparatus for representing masked thresholds in a perceptual audio coder |
US6745162B1 (en) * | 2000-06-22 | 2004-06-01 | Sony Corporation | System and method for bit allocation in an audio encoder |
US6801886B1 (en) * | 2000-06-22 | 2004-10-05 | Sony Corporation | System and method for enhancing MPEG audio encoder quality |
US6895374B1 (en) * | 2000-09-29 | 2005-05-17 | Sony Corporation | Method for utilizing temporal masking in digital audio coding |
US20020120442A1 (en) * | 2001-02-27 | 2002-08-29 | Atsushi Hotta | Audio signal encoding apparatus |
US7110941B2 (en) * | 2002-03-28 | 2006-09-19 | Microsoft Corporation | System and method for embedded audio coding with implicit auditory masking |
US20040098268A1 (en) * | 2002-11-07 | 2004-05-20 | Samsung Electronics Co., Ltd. | MPEG audio encoding method and apparatus |
US20040162720A1 (en) * | 2003-02-15 | 2004-08-19 | Samsung Electronics Co., Ltd. | Audio data encoding apparatus and method |
US7634400B2 (en) * | 2003-03-07 | 2009-12-15 | Stmicroelectronics Asia Pacific Pte. Ltd. | Device and process for use in encoding audio data |
US7333930B2 (en) * | 2003-03-14 | 2008-02-19 | Agere Systems Inc. | Tonal analysis for perceptual audio coding using a compressed spectral representation |
US7627469B2 (en) * | 2004-05-28 | 2009-12-01 | Sony Corporation | Audio signal encoding apparatus and audio signal encoding method |
US20060004565A1 (en) * | 2004-07-01 | 2006-01-05 | Fujitsu Limited | Audio signal encoding device and storage medium for storing encoding program |
US7613605B2 (en) * | 2004-11-18 | 2009-11-03 | Canon Kabushiki Kaisha | Audio signal encoding apparatus and method |
US7627481B1 (en) * | 2005-04-19 | 2009-12-01 | Apple Inc. | Adapting masking thresholds for encoding a low frequency transient signal in audio data |
US7630170B2 (en) * | 2005-08-09 | 2009-12-08 | Hitachi Global Storage Technologies Netherlands B.V. | Sealing method for magnetic disk drive |
US20070162277A1 (en) * | 2006-01-12 | 2007-07-12 | Stmicroelectronics Asia Pacific Pte., Ltd. | System and method for low power stereo perceptual audio coding using adaptive masking threshold |
US7873510B2 (en) * | 2006-04-28 | 2011-01-18 | Stmicroelectronics Asia Pacific Pte. Ltd. | Adaptive rate control algorithm for low complexity AAC encoding |
US7974848B2 (en) * | 2006-06-21 | 2011-07-05 | Samsung Electronics Co., Ltd. | Method and apparatus for encoding audio data |
US20080133223A1 (en) * | 2006-12-04 | 2008-06-05 | Samsung Electronics Co., Ltd. | Method and apparatus to extract important frequency component of audio signal and method and apparatus to encode and/or decode audio signal using the same |
US20110270616A1 (en) * | 2007-08-24 | 2011-11-03 | Qualcomm Incorporated | Spectral noise shaping in audio coding based on spectral dynamics in frequency sub-bands |
Cited By (15)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20140310007A1 (en) * | 2009-02-16 | 2014-10-16 | Electronics And Telecommunications Research Institute | Method and apparatus for encoding and decoding audio signal using adaptive sinusoidal coding |
US20110301961A1 (en) * | 2009-02-16 | 2011-12-08 | Mi-Suk Lee | Method and apparatus for encoding and decoding audio signal using adaptive sinusoidal coding |
US9251799B2 (en) * | 2009-02-16 | 2016-02-02 | Electronics And Telecommunications Research Institute | Method and apparatus for encoding and decoding audio signal using adaptive sinusoidal coding |
US8805694B2 (en) * | 2009-02-16 | 2014-08-12 | Electronics And Telecommunications Research Institute | Method and apparatus for encoding and decoding audio signal using adaptive sinusoidal coding |
US9508350B2 (en) * | 2010-11-22 | 2016-11-29 | Ntt Docomo, Inc. | Audio encoding device, method and program, and audio decoding device, method and program |
US20130253939A1 (en) * | 2010-11-22 | 2013-09-26 | Ntt Docomo, Inc. | Audio encoding device, method and program, and audio decoding device, method and program |
US10115402B2 (en) | 2010-11-22 | 2018-10-30 | Ntt Docomo, Inc. | Audio encoding device, method and program, and audio decoding device, method and program |
US10762908B2 (en) | 2010-11-22 | 2020-09-01 | Ntt Docomo, Inc. | Audio encoding device, method and program, and audio decoding device, method and program |
US11322163B2 (en) | 2010-11-22 | 2022-05-03 | Ntt Docomo, Inc. | Audio encoding device, method and program, and audio decoding device, method and program |
US11756556B2 (en) | 2010-11-22 | 2023-09-12 | Ntt Docomo, Inc. | Audio encoding device, method and program, and audio decoding device, method and program |
US9633663B2 (en) | 2011-12-15 | 2017-04-25 | Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. | Apparatus, method and computer program for avoiding clipping artefacts |
US20140114652A1 (en) * | 2012-10-24 | 2014-04-24 | Fujitsu Limited | Audio coding device, audio coding method, and audio coding and decoding system |
US10468043B2 (en) | 2013-01-29 | 2019-11-05 | Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. | Low-complexity tonality-adaptive audio signal quantization |
US11094332B2 (en) | 2013-01-29 | 2021-08-17 | Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. | Low-complexity tonality-adaptive audio signal quantization |
US11694701B2 (en) | 2013-01-29 | 2023-07-04 | Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. | Low-complexity tonality-adaptive audio signal quantization |
Also Published As
Publication number | Publication date |
---|---|
JP5262171B2 (en) | 2013-08-14 |
US9076440B2 (en) | 2015-07-07 |
EP2093758A2 (en) | 2009-08-26 |
JP2009198612A (en) | 2009-09-03 |
CN101515458A (en) | 2009-08-26 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US9076440B2 (en) | | Audio signal encoding device, method, and medium by correcting allowable error powers for a tonal frequency spectrum |
US11756556B2 (en) | | Audio encoding device, method and program, and audio decoding device, method and program |
EP1939862B1 (en) | | Encoding device, decoding device, and method thereof |
US9390717B2 (en) | | Encoding device and method, decoding device and method, and program |
US7734053B2 (en) | | Encoding apparatus, encoding method, and computer product |
JP6061121B2 (en) | | Audio encoding apparatus, audio encoding method, and program |
US8606567B2 (en) | | Signal encoding apparatus, signal decoding apparatus, signal processing system, signal encoding process method, signal decoding process method, and program |
US20080164942A1 (en) | | Audio data processing apparatus, terminal, and method of audio data processing |
US11232803B2 (en) | | Encoding device, decoding device, encoding method, decoding method, and non-transitory computer-readable recording medium |
US11257506B2 (en) | | Decoding device, encoding device, decoding method, and encoding method |
US20040002859A1 (en) | | Method and architecture of digital conding for transmitting and packing audio signals |
US20060004565A1 (en) | | Audio signal encoding device and storage medium for storing encoding program |
US6922667B2 (en) | | Encoding apparatus and decoding apparatus |
US20080255860A1 (en) | | Audio decoding apparatus and decoding method |
US8225160B2 (en) | | Decoding apparatus, decoding method, and recording medium |
US20080059203A1 (en) | | Audio Encoding Device, Decoding Device, Method, and Program |
US20090326963A1 (en) | | Audio encoding device, audio encoding method, and program thereof |
JP2001100796A (en) | | Audio signal encoding device |
JPH11177435A (en) | | Quantizer |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | AS | Assignment | Owner name: FUJITSU LIMITED, JAPAN. Free format text: ASSIGNMENT OF ASSIGNORS INTEREST; ASSIGNORS: SHIRAKAWA, MIYUKI; SUZUKI, MASANAO; TSUCHINAGA, YOSHITERU; SIGNING DATES FROM 20081216 TO 20081218; REEL/FRAME: 022269/0297 |
 | STCF | Information on status: patent grant | Free format text: PATENTED CASE |
 | CC | Certificate of correction | |
 | MAFP | Maintenance fee payment | Free format text: PAYMENT OF MAINTENANCE FEE, 4TH YEAR, LARGE ENTITY (ORIGINAL EVENT CODE: M1551); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY. Year of fee payment: 4 |
 | FEPP | Fee payment procedure | Free format text: MAINTENANCE FEE REMINDER MAILED (ORIGINAL EVENT CODE: REM.); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY |
 | LAPS | Lapse for failure to pay maintenance fees | Free format text: PATENT EXPIRED FOR FAILURE TO PAY MAINTENANCE FEES (ORIGINAL EVENT CODE: EXP.); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY |
 | STCH | Information on status: patent discontinuation | Free format text: PATENT EXPIRED DUE TO NONPAYMENT OF MAINTENANCE FEES UNDER 37 CFR 1.362 |
 | FP | Lapsed due to failure to pay maintenance fee | Effective date: 20230707 |