CN110223702A - Audio decoding system and reconstructing method - Google Patents
Audio decoding system and reconstructing method
- Publication number
- CN110223702A (CN201910546611.9A)
- Authority
- CN
- China
- Prior art keywords
- audio object
- audio
- decorrelation
- signal
- weighted
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L19/008—Multichannel audio signal coding or decoding using interchannel correlation to reduce redundancy, e.g. joint-stereo, intensity-coding or matrixing
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L19/04—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
- G10L19/16—Vocoder architecture
- G10L19/18—Vocoders using multiple modes
- G10L19/20—Vocoders using multiple modes using sound class specific coding, hybrid encoders or object based coding
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04S—STEREOPHONIC SYSTEMS
- H04S3/00—Systems employing more than two channels, e.g. quadraphonic
- H04S3/02—Systems employing more than two channels, e.g. quadraphonic of the matrix type, i.e. in which input signals are combined algebraically, e.g. after having been phase shifted with respect to each other
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04S—STEREOPHONIC SYSTEMS
- H04S5/00—Pseudo-stereo systems, e.g. in which additional channel signals are derived from monophonic signals by means of phase shifting, time delay or reverberation
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04S—STEREOPHONIC SYSTEMS
- H04S7/00—Indicating arrangements; Control arrangements, e.g. balance control
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04S—STEREOPHONIC SYSTEMS
- H04S7/00—Indicating arrangements; Control arrangements, e.g. balance control
- H04S7/30—Control circuits for electronic adaptation of the sound field
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04S—STEREOPHONIC SYSTEMS
- H04S2400/00—Details of stereophonic systems covered by H04S but not provided for in its groups
- H04S2400/03—Aspects of down-mixing multi-channel audio to configurations with lower numbers of playback channels, e.g. 7.1 -> 5.1
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04S—STEREOPHONIC SYSTEMS
- H04S2400/00—Details of stereophonic systems covered by H04S but not provided for in its groups
- H04S2400/11—Positioning of individual sound objects, e.g. moving airplane, within a sound field
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04S—STEREOPHONIC SYSTEMS
- H04S2420/00—Techniques used stereophonic systems covered by H04S but not provided for in its groups
- H04S2420/03—Application of parametric coding in stereophonic audio systems
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04S—STEREOPHONIC SYSTEMS
- H04S2420/00—Techniques used stereophonic systems covered by H04S but not provided for in its groups
- H04S2420/07—Synergistic effects of band splitting and sub-band processing
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Acoustics & Sound (AREA)
- Signal Processing (AREA)
- Multimedia (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Computational Linguistics (AREA)
- Mathematical Physics (AREA)
- General Physics & Mathematics (AREA)
- Algebra (AREA)
- Mathematical Analysis (AREA)
- Mathematical Optimization (AREA)
- Pure & Applied Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Stereophonic System (AREA)
- Compression, Expansion, Code Conversion, And Decoders (AREA)
Abstract
This disclosure relates to an audio decoding system and a reconstruction method. It provides methods, devices and computer program products that offer less complex and more flexible control of the decorrelation introduced in an audio coding system. According to the disclosure, this is achieved by calculating and using two weighting factors for the decorrelation of an audio object in the audio coding system: one weighting factor for the approximated audio object and one weighting factor for the decorrelated audio object.
Description
This application is a divisional application of invention patent application No. 201480029603.2, filed on May 23, 2014 and entitled "Audio coding and decoding methods, medium, and audio encoder and decoder".
Cross-reference to related applications
This application claims priority to U.S. Provisional Patent Application No. 61/827,288, filed on May 24, 2013, the entire content of which is incorporated herein.
Technical field
The disclosure herein generally relates to audio coding. In particular, it relates to using and calculating weighting factors for the decorrelation of audio objects in an audio coding system.
This disclosure is related to U.S. Provisional Application No. 61/827,246, filed on the same day as the present application, entitled "Coding of Audio Scenes" and naming Heiko Purnhagen et al. as inventors. The entire content of the referenced application is incorporated herein by reference.
Background
Conventional audio systems use a channel-based approach. Each channel may, for example, represent the content of one loudspeaker or one loudspeaker array. Possible coding schemes for such systems include discrete multi-channel coding and parametric coding such as MPEG Surround.
More recently, a new approach has been developed. This approach is object-based. In a system using the object-based approach, a three-dimensional audio scene is represented by audio objects with their associated positional metadata. These audio objects move around in the three-dimensional scene during playback of the audio signal. The system may also include so-called bed channels, which may be described as stationary audio objects that are mapped directly to loudspeaker positions of, for example, a conventional audio system as described above. At the decoder side of such a system, the objects/bed channels may be reconstructed using downmix signals and an upmix or reconstruction matrix, where each object/bed channel is reconstructed as a linear combination of the downmix signals whose coefficients are the values of the corresponding elements of the reconstruction matrix.
A problem that may arise in an object-based audio system, especially at low target bit rates, is that the correlation between the decoded objects/bed channels can be larger than the correlation between the original objects/bed channels that were encoded. A common way to solve such problems and improve the reconstruction of the audio objects, for example in MPEG SAOC, is to introduce decorrelators in the decoder. In MPEG SAOC, the introduced decorrelation aims to restore the correct correlation between the audio objects given a specified rendering of the audio objects, that is, depending on what type of playback unit is connected to the audio system.
However, known methods for object-based audio systems are sensitive to the number of downmix signals and the number of objects/bed channels, and may also be complex operations that depend on the rendering of the audio objects. There is therefore a need for a simple and flexible method of controlling the amount of decorrelation introduced in the decoder of such a system, so that the reconstruction of the audio objects is improved.
Brief description of the drawings
Example embodiments will now be described with reference to the accompanying drawings, in which:
Fig. 1 is a generalized block diagram of an audio decoding system according to an example embodiment;
Fig. 2 shows, by way of example, the format in which the reconstruction matrix and the weighting parameters are received by the audio decoding system of Fig. 1;
Fig. 3 is a generalized block diagram of an audio encoder for generating at least one weighting parameter to be used in a decorrelation process in an audio decoding system;
Fig. 4 shows, by way of example, a generalized block diagram of the part of the encoder of Fig. 3 that generates the at least one weighting parameter;
Figs. 5a-5c show, by way of example, mapping functions used in the part of the encoder shown in Fig. 4.
All figures are schematic and generally show only the parts necessary to elucidate the disclosure, whereas other parts may be omitted or merely suggested. Unless otherwise indicated, like reference numerals refer to like parts in different figures.
Detailed description
In view of the above, an object is to provide encoders, decoders and associated methods that offer less complex and more flexible control of the introduced decorrelation, so that the reconstruction of the audio objects is improved.
I. Overview - Decoder
According to a first aspect, example embodiments propose decoding methods, decoders and computer program products for decoding. The proposed methods, decoders and computer program products may generally have the same features and advantages.
According to example embodiments, there is provided a method for reconstructing a time/frequency tile of N audio objects. The method comprises the steps of: receiving M downmix signals; receiving a reconstruction matrix enabling reconstruction of an approximation of the N audio objects from the M downmix signals; applying the reconstruction matrix to the M downmix signals in order to generate N approximated audio objects; subjecting at least a subset of the N approximated audio objects to a decorrelation process in order to generate at least one decorrelated audio object, whereby each of the at least one decorrelated audio object corresponds to one of the N approximated audio objects; for each of the N approximated audio objects that has no corresponding decorrelated audio object, reconstructing the time/frequency tile of the audio object by the approximated audio object; and, for each of the N approximated audio objects that has a corresponding decorrelated audio object, reconstructing the time/frequency tile of the audio object by: receiving at least one weighting parameter representing a first weighting factor and a second weighting factor, weighting the approximated audio object by the first weighting factor, weighting the decorrelated audio object corresponding to the approximated audio object by the second weighting factor, and combining the weighted approximated audio object with the corresponding weighted decorrelated audio object.
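The final two steps of the method can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the disclosure: the function name `reconstruct_objects`, the use of plain Python lists for the samples of one tile, and the dictionaries keyed by object index are all hypothetical choices made for the example.

```python
def reconstruct_objects(approximated, decorrelated, weights):
    """Reconstruct one time/frequency tile for each of N audio objects.

    approximated: list of N sample lists (the approximated audio objects).
    decorrelated: dict mapping an object index to its decorrelated samples;
                  objects outside the decorrelated subset are simply absent.
    weights:      dict mapping an object index to (first, second) weighting
                  factors (hypothetical container, assumed for this sketch).
    """
    reconstructed = []
    for n, approx in enumerate(approximated):
        if n not in decorrelated:
            # No decorrelated counterpart: the approximation is the result.
            reconstructed.append(list(approx))
            continue
        first, second = weights[n]
        # Weight the approximated and decorrelated signals, then combine.
        reconstructed.append(
            [first * a + second * d for a, d in zip(approx, decorrelated[n])]
        )
    return reconstructed
```

In this sketch the combination is a sample-wise sum of the two weighted signals; objects without a decorrelated counterpart pass through unchanged.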
Audio encoding/decoding systems typically divide the time-frequency space into time/frequency tiles, for example by applying suitable filter banks to the input audio signals. A time/frequency tile generally means a portion of the time-frequency space corresponding to a time interval and a frequency subband. The time interval may typically correspond to the duration of a time frame used in the audio encoding/decoding system. The frequency subband may typically correspond to one or several neighboring frequency subbands defined by the filter bank used in the encoding/decoding system. When the frequency subband corresponds to several neighboring frequency subbands defined by the filter bank, the decoding process can use non-uniform frequency subbands, for example wider frequency subbands for higher frequencies of the audio signal. In a broadband case, where the audio encoding/decoding system operates on the whole frequency range, the frequency subband of the time/frequency tile may correspond to the whole frequency range. The above method discloses the steps for reconstructing one such time/frequency tile of the N audio objects. However, it is to be understood that the method may be repeated for each time/frequency tile of the audio decoding system. It is also to be understood that several time/frequency tiles may be encoded simultaneously. Typically, neighboring time/frequency tiles may overlap somewhat in time and/or frequency. For example, an overlap in time may be equivalent to a linear interpolation of the elements of the reconstruction matrix in time, that is, from one time interval to the next. However, this disclosure targets other parts of the encoding/decoding system, and any overlap in time and/or frequency between neighboring time/frequency tiles is left for the skilled person to implement.
As used herein, a downmix signal is a signal that is a combination of one or more bed channels and/or audio objects.
The above method provides a flexible and simple way of reconstructing a time/frequency tile of N audio objects in which any unwanted correlation between the N approximated audio objects is reduced. Using two weighting factors, one for the approximated audio object and one for the decorrelated audio object, yields a simple parameterization that allows flexible control of the amount of decorrelation introduced.
Moreover, the simple parameterization in the method does not depend on how the reconstructed audio objects are rendered. An advantage of this is that the same method is used regardless of what type of playback unit is connected to the audio decoding system implementing the method, which results in a less complex audio decoding system.
According to embodiments, for each of the N approximated audio objects that has a corresponding decorrelated audio object, the at least one weighting parameter comprises a single weighting parameter from which the first weighting factor and the second weighting factor are derivable. An advantage of this is that it provides a simple parameterization for controlling the amount of decorrelation introduced in the audio decoding system. This approach uses a single parameter per object and time/frequency tile to describe the mix of the "dry" (non-decorrelated) contribution and the "wet" (decorrelated) contribution. Using a single parameter reduces the required bit rate compared with using several parameters, for example one describing the wet contribution and one describing the dry contribution.
According to embodiments, the sum of the squares of the first weighting factor and the second weighting factor equals one. In this case, the single weighting parameter comprises either the first weighting factor or the second weighting factor. This may be a simple way of implementing a single weighting parameter that describes the mix of the dry and wet contributions per object and time/frequency tile. It also implies that the reconstructed object will have the same energy as the approximated object.
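The unit sum-of-squares constraint can be illustrated with a short sketch. It assumes, for the sake of the example only, that the transmitted single parameter is the first (dry) weighting factor and lies in [0, 1]; the source allows either factor to be transmitted. The function names are hypothetical.

```python
import math


def weights_from_parameter(dry):
    """Derive (dry, wet) weighting factors from one transmitted parameter.

    Assumes the parameter is the dry factor and that dry**2 + wet**2 == 1.
    """
    if not 0.0 <= dry <= 1.0:
        raise ValueError("dry weighting factor must lie in [0, 1]")
    wet = math.sqrt(1.0 - dry * dry)
    return dry, wet


def reconstruct_tile(approximated, decorrelated, dry):
    """Mix the approximated (dry) and decorrelated (wet) samples of one tile."""
    dry, wet = weights_from_parameter(dry)
    return [dry * a + wet * d for a, d in zip(approximated, decorrelated)]
```

With a dry factor of 0.6 the derived wet factor is 0.8. If the decorrelated signal has the same energy as the approximated signal and is uncorrelated with it (an assumption of this illustration), the mixed tile keeps the energy of the approximated object.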
According to embodiments, the step of subjecting at least a subset of the N approximated audio objects to a decorrelation process comprises subjecting each of the N approximated audio objects to the decorrelation process, whereby each of the N approximated audio objects corresponds to one decorrelated audio object. This may further reduce any unwanted correlation between the reconstructed audio objects, since all reconstructed audio objects are based on both a decorrelated audio object and an approximated audio object.
According to embodiments, the first and second weighting factors vary with time and frequency. This increases the flexibility of the audio decoding system, since different amounts of decorrelation can be introduced for different time/frequency tiles. It may also further reduce any unwanted correlation between the reconstructed audio objects and improve their quality.
According to embodiments, the reconstruction matrix varies with time and frequency. This increases the flexibility of the audio decoding system, since the parameters used to reconstruct or approximate the audio objects from the downmix signals can differ between time/frequency tiles.
According to a further embodiment, the reconstruction matrix and the at least one weighting parameter are arranged in a frame when received. The reconstruction matrix is arranged in a first field of the frame using a first format, and the at least one weighting parameter is arranged in a second field of the frame using a second format, so that a decoder supporting only the first format can decode the reconstruction matrix in the first field and discard the at least one weighting parameter in the second field. Compatibility with decoders that do not implement decorrelation is thereby achieved.
According to embodiments, the method may further comprise receiving L auxiliary signals, wherein the reconstruction matrix further enables reconstruction of the approximation of the N audio objects from the M downmix signals and the L auxiliary signals, and wherein the method further comprises applying the reconstruction matrix to the M downmix signals and the L auxiliary signals in order to generate the N approximated audio objects. The L auxiliary signals may, for example, include at least one auxiliary signal that is equal to one of the N audio objects to be reconstructed. This may improve the quality of that specific reconstructed audio object. It may be advantageous where one of the N audio objects to be reconstructed represents a part of the audio signal of particular importance, for example an audio object representing the narrator's voice in a documentary. According to embodiments, at least one of the L auxiliary signals is a combination of at least two of the N audio objects to be reconstructed, thereby providing a compromise between bit rate and quality.
According to embodiments, the M downmix signals span a hyperplane, and at least one of the L auxiliary signals does not lie in the hyperplane spanned by the M downmix signals. One or more of the L auxiliary signals may thus represent signal dimensions that are not included in any of the M downmix signals, which may improve the quality of the reconstructed audio objects. In embodiments, at least one of the L auxiliary signals is orthogonal to the hyperplane spanned by the M downmix signals. The entire signal of such an auxiliary signal then represents parts of the audio signal that are not included in any of the M downmix signals. This may improve the quality of the reconstructed audio objects while reducing the required bit rate, since that auxiliary signal does not contain any information already present in any of the M downmix signals.
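One way to picture a signal that is orthogonal to the hyperplane spanned by the downmix signals is as the residual left after projecting a signal onto that span. The sketch below is an illustration only and is not taken from the disclosure, which does not say how auxiliary signals are formed; it uses Gram-Schmidt orthogonalization, and the names `dot` and `residual_outside_span` are hypothetical.

```python
def dot(a, b):
    """Inner product of two equally long sample lists."""
    return sum(x * y for x, y in zip(a, b))


def residual_outside_span(signal, downmix, eps=1e-12):
    """Return the part of `signal` orthogonal to the span of `downmix`.

    downmix: list of M sample lists. An orthonormal basis of their span is
    built with Gram-Schmidt, then the projection of `signal` onto that
    basis is subtracted.
    """
    basis = []
    for y in downmix:
        v = list(y)
        for q in basis:
            p = dot(v, q)
            v = [vi - p * qi for vi, qi in zip(v, q)]
        norm_sq = dot(v, v)
        if norm_sq > eps:  # skip downmix signals that add no new dimension
            scale = norm_sq ** 0.5
            basis.append([vi / scale for vi in v])
    residual = list(signal)
    for q in basis:
        p = dot(residual, q)
        residual = [ri - p * qi for ri, qi in zip(residual, q)]
    return residual
```

A residual computed this way has zero inner product with every downmix signal, so it carries only signal dimensions that the downmix signals do not contain.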
According to example embodiments, there is provided a computer-readable medium comprising computer code instructions adapted to carry out any method of the first aspect when executed on a device having processing capability.
According to example embodiments, there is provided an apparatus for reconstructing a time/frequency tile of N audio objects, comprising: a first receiving component configured to receive M downmix signals; a second receiving component configured to receive a reconstruction matrix enabling reconstruction of an approximation of the N audio objects from the M downmix signals; an audio object approximating component arranged downstream of the first and second receiving components and configured to apply the reconstruction matrix to the M downmix signals in order to generate N approximated audio objects; and a decorrelating component arranged downstream of the audio object approximating component and configured to subject at least a subset of the N approximated audio objects to a decorrelation process in order to generate at least one decorrelated audio object, whereby each of the at least one decorrelated audio object corresponds to one of the N approximated audio objects. The second receiving component is further configured to receive, for each of the N approximated audio objects that has a corresponding decorrelated audio object, at least one weighting parameter representing a first weighting factor and a second weighting factor. The apparatus further comprises an audio object reconstructing component arranged downstream of the audio object approximating component, the decorrelating component and the second receiving component, and configured to: for each of the N approximated audio objects that has no corresponding decorrelated audio object, reconstruct the time/frequency tile of the audio object by the approximated audio object; and, for each of the N approximated audio objects that has a corresponding decorrelated audio object, reconstruct the time/frequency tile of the audio object by weighting the approximated audio object by the first weighting factor, weighting the decorrelated audio object corresponding to the approximated audio object by the second weighting factor, and combining the weighted approximated audio object with the corresponding weighted decorrelated audio object.
II. Overview - Encoder
According to a second aspect, example embodiments propose encoding methods, encoders and computer program products for encoding. The proposed methods, encoders and computer program products may generally have the same features and advantages.
According to example embodiments, there is provided a method in an encoder for generating at least one weighting parameter, wherein the at least one weighting parameter is to be used in a decoder when reconstructing a time/frequency tile of a specific audio object by combining a weighted decoder-side approximation of the specific audio object with a corresponding weighted decorrelated version of the decoder-side approximated specific audio object. The method comprises the steps of: receiving M downmix signals that are combinations of at least N audio objects including the specific audio object; receiving the specific audio object; calculating a first quantity indicative of an energy level of the specific audio object; calculating a second quantity indicative of an energy level corresponding to an energy level of an encoder-side approximation of the specific audio object, the encoder-side approximation being a combination of the M downmix signals; and calculating the at least one weighting parameter based on the first quantity and the second quantity.
The above method discloses the steps of generating at least one weighting parameter for a specific audio object during one time/frequency tile. However, it is to be understood that the method may be repeated for each time/frequency tile of the audio encoding/decoding system and for each audio object.
It may be noted that the tiling in the audio encoding system, that is, the division of the audio signal/object into time/frequency tiles, need not be the same as the tiling in the audio decoding system.
It may also be noted that the decoder-side approximation of the specific audio object and the encoder-side approximation of the specific audio object can be different approximations, or they can be the same approximation.
In order to reduce the required bit rate and to reduce complexity, the at least one weighting parameter may comprise a single weighting parameter from which a first weighting factor and a second weighting factor are derivable, the first weighting factor being used to weight the decoder-side approximation of the specific audio object and the second weighting factor being used to weight the decorrelated version of the decoder-side approximated audio object.
In order to prevent energy from being added to the reconstructed audio object on the decoder side, which comprises the decoder-side approximation of the specific audio object and the decorrelated version of the decoder-side approximated audio object, the sum of the squares of the first and second weighting factors may equal one. In this case, the single weighting parameter may comprise either the first weighting factor or the second weighting factor.
According to embodiments, the step of calculating the at least one weighting parameter comprises comparing the first quantity with the second quantity. For example, the energy of the approximation of the specific audio object may be compared with the energy of the specific audio object.
According to example embodiments, comparing the first quantity and the second quantity comprises: calculating a ratio between the second quantity and the first quantity; raising the ratio to the power of α; and using the ratio raised to the power of α to calculate the weighting parameter. This may increase the flexibility of the encoder. The parameter α may be equal to two.
According to example embodiments, the ratio raised to the power of α is subjected to an increasing function that maps it to the at least one weighting parameter.
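The encoder-side calculation described above can be sketched as follows. This is a hedged illustration, not the encoder of the disclosure: the energy measure (sum of squared samples), the clamp to [0, 1] standing in for the increasing mapping function (the actual mapping functions are those of Figs. 5a-5c, which this text does not specify), the small `eps` guard against division by zero, and the function names are all assumptions made for the example.

```python
def energy(samples):
    """Energy of one tile, taken here as the sum of squared samples."""
    return sum(x * x for x in samples)


def weighting_parameter(original, approximation, alpha=2.0, eps=1e-12):
    """Compute a single weighting parameter for one object and one tile.

    original:      samples of the specific audio object (first quantity).
    approximation: samples of its encoder-side approximation (second quantity).
    """
    first = energy(original)
    second = energy(approximation)
    ratio = second / max(first, eps)   # ratio of second to first quantity
    powered = ratio ** alpha           # ratio raised to the power of alpha
    # Hypothetical non-decreasing mapping: clamp the value to at most 1.
    return min(1.0, powered)
```

For an approximation that carries a quarter of the original energy and α equal to two, the sketch yields 0.0625; an approximation with the same energy as the original yields 1.0.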
According to example embodiments, the first and second weighting factors vary with time and frequency.
According to example embodiments, the second quantity indicative of an energy level corresponds to the energy level of an encoder-side approximation of the specific audio object, the encoder-side approximation being a linear combination of the M downmix signals and L auxiliary signals, where the downmix signals and the auxiliary signals are formed from the N audio objects. Auxiliary signals may be included in the audio encoding/decoding system in order to improve the reconstruction of the audio objects on the decoder side.
According to example embodiments, at least one of the L auxiliary signals may correspond to a particularly important audio object, such as an audio object representing dialogue. At least one of the L auxiliary signals may thus be equal to one of the N audio objects. According to further embodiments, at least one of the L auxiliary signals is a combination of at least two of the N audio objects.
According to example embodiments, the M downmix signals span a hyperplane, and at least one of the L auxiliary signals does not lie in the hyperplane spanned by the M downmix signals. This means that at least one of the L auxiliary signals represents signal dimensions of the audio objects that were lost when the M downmix signals were generated, which may improve the reconstruction of the audio objects on the decoder side. According to further embodiments, the at least one of the L auxiliary signals is orthogonal to the hyperplane spanned by the M downmix signals.
According to example embodiments, there is provided a computer-readable medium comprising computer code instructions adapted to carry out any method of the second aspect when executed on a device having processing capability.
According to example embodiments, there is provided an encoder for generating at least one weighting parameter, wherein the at least one weighting parameter is to be used in a decoder when reconstructing a time/frequency tile of a specific audio object by combining a weighted decoder-side approximation of the specific audio object with a corresponding weighted decorrelated version of the decoder-side approximated specific audio object. The encoder comprises: a receiving component configured to receive M downmix signals that are combinations of at least N audio objects including the specific audio object, the receiving component being further configured to receive the specific audio object; and a calculating unit configured to: calculate a first quantity indicative of an energy level of the specific audio object; calculate a second quantity indicative of an energy level corresponding to an energy level of an encoder-side approximation of the specific audio object, the encoder-side approximation being a combination of the M downmix signals; and calculate the at least one weighting parameter based on the first quantity and the second quantity.
Example embodiment
Fig. 1 shows a generalized block diagram of an audio decoding system 100 for reconstructing N audio objects. The audio decoding system 100 performs time/frequency-resolved processing, meaning that it operates on individual time/frequency tiles to reconstruct the N audio objects. In the following, the processing of the system 100 for reconstructing one time/frequency tile of the N audio objects is described. The N audio objects may be one or more audio objects.
The system 100 comprises a first receiving component 102 configured to receive M downmix signals 106. The M downmix signals may be one or more downmix signals. The M downmix signals 106 may, for example, be 5.1 or 7.1 surround signals that are backward compatible with established sound decoding systems such as Dolby Digital Plus, MPEG or AAC. In other embodiments, the M downmix signals 106 are not backward compatible. The input signal to the first receiving component 102 may be a bitstream 130, from which the receiving component can extract the M downmix signals 106.
The system 100 further comprises a second receiving component 112 configured to receive a reconstruction matrix 104 enabling reconstruction of an approximation of the N audio objects from the M downmix signals 106. The reconstruction matrix 104 may also be called an upmix matrix. The input signal 126 to the second receiving component 112 may be a bitstream 126, from which the receiving component can extract the reconstruction matrix 104, or its elements, as well as additional information that is described in detail below. In some embodiments of the audio decoding system 100, the first receiving component 102 and the second receiving component 112 are combined into a single receiving component. In some embodiments, the input signals 130, 126 are combined into a single input signal, which may be a bitstream with a format that allows the receiving components 102, 112 to extract the different information from it.
The system 100 may further comprise an audio object approximating component 108 arranged downstream of the first receiving component 102 and the second receiving component 112 and configured to apply the reconstruction matrix 104 to the M downmix signals 106 in order to generate N approximated audio objects 110. More specifically, the audio object approximating component 108 may perform a matrix operation in which the reconstruction matrix is multiplied by a vector comprising the M downmix signals. The reconstruction matrix 104 may vary with time and frequency; that is, the values of its elements may differ for each time/frequency tile. The elements of the reconstruction matrix 104 thus depend on which time/frequency tile is currently being processed.
An approximated audio object $\hat{S}_n(k,l)$, that is, audio object n at frequency k and time slot l (a time/frequency tile), is computed, for example at the audio object approximating component 108, as
$\hat{S}_n(k,l) = \sum_{m=1}^{M} c_{m,b,n} \, Y_m(k,l)$
for all frequency samples k in frequency band b, b = 1, ..., B, where $c_{m,b,n}$ is the reconstruction coefficient of object n in frequency band b associated with downmix channel $Y_m$. Note that the reconstruction coefficient $c_{m,b,n}$ is assumed to be fixed over the time/frequency tile, but in further embodiments the coefficient may vary within the time/frequency tile.
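By way of illustration only, the matrix operation performed for one time/frequency tile may be sketched as follows. The function name `approximate_objects`, the nested-list layout of the coefficients and the numeric values are assumptions made for this sketch and are not taken from the disclosure.

```python
def approximate_objects(c, y):
    """Apply a reconstruction (upmix) matrix to one time/frequency tile.

    c[n][m] -- reconstruction coefficient for object n and downmix signal m
               (assumed constant over the tile)
    y[m]    -- value of downmix signal m in this tile
    Returns one approximated value per audio object.
    """
    return [sum(c_nm * y_m for c_nm, y_m in zip(row, y)) for row in c]


# Hypothetical tile with N = 3 objects and M = 2 downmix signals.
C = [
    [1.0, 0.0],  # object 0 is taken from downmix 0 only (a sparse row)
    [0.0, 1.0],  # object 1 is taken from downmix 1 only
    [0.5, 0.5],  # object 2 mixes both downmix signals
]
Y = [2.0, 4.0]
S_HAT = approximate_objects(C, Y)
```

In this sketch, rows 0 and 1 each depend on a single downmix signal, which is the sparse situation in which unwanted correlation between approximated objects becomes more likely.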
The system 100 further comprises a decorrelating component 118 arranged downstream of the audio object approximating component 108. The decorrelating component 118 is configured to subject at least a subset 140 of the N approximated audio objects 110 to a decorrelation process in order to generate at least one decorrelated audio object 136. In other words, all or only some of the N approximated audio objects 110 may be subjected to a decorrelation process. Each of the at least one decorrelated audio object 136 corresponds to one of the N approximated audio objects 110. More precisely, the set of decorrelated audio objects 136 corresponds to the set 140 of approximated audio objects that is input to the decorrelation process 118. The purpose of the at least one decorrelated audio object 136 is to reduce unwanted correlation between the N approximated audio objects 110. Such unwanted correlation may appear in particular when the audio system comprising the audio decoding system 100 has a low target bit rate. At low target bit rates the reconstruction matrix may be sparse, meaning that many of its elements may be zero. In that case, a particular approximated audio object 110 may be based on a single downmix signal, or only a few downmix signals, of the M downmix signals 106, which increases the risk of introducing unwanted correlation between the approximated audio objects 110. According to some embodiments, the decorrelating component 118 subjects each of the N approximated audio objects 110 to a decorrelation process, so that each of the N approximated audio objects 110 corresponds to a decorrelated audio object 136.
Each of the N approximated audio objects 110 that the decorrelating component 118 subjects to decorrelation may be subjected to a different decorrelation process, for example by applying a white-noise filter to the approximated audio object being decorrelated, or by applying any other suitable decorrelation process such as all-pass filtering.
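As a hedged illustration of all-pass filtering, the following sketch implements a Schroeder all-pass section in the time domain. This filter structure, the function name `schroeder_allpass` and the chosen delay and gain are assumptions made for this example; the disclosure does not specify a particular all-pass design, and the described system operates on time/frequency tiles rather than on raw time samples.

```python
def schroeder_allpass(x, delay, gain):
    """Filter x with H(z) = (-g + z^-D) / (1 - g * z^-D).

    The magnitude response is 1 at every frequency, so the filter changes
    the phase of the signal while leaving its spectrum envelope intact.
    """
    y = []
    for n, x_n in enumerate(x):
        x_d = x[n - delay] if n >= delay else 0.0  # delayed input sample
        y_d = y[n - delay] if n >= delay else 0.0  # delayed output sample
        y.append(-gain * x_n + x_d + gain * y_d)
    return y


# Impulse response of a hypothetical section (delay 3 samples, gain 0.5).
impulse = [1.0] + [0.0] * 199
response = schroeder_allpass(impulse, delay=3, gain=0.5)
energy = sum(v * v for v in response)  # close to 1.0: energy is preserved
```

Giving each object a different delay or gain is one possible way to obtain different decorrelation processes per object, although that alone does not guarantee that the outputs are mutually decorrelated.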
Examples of further decorrelation processes can be found in the MPEG Parametric Stereo coding tool (used in HE-AAC v2, as described in ISO/IEC 14496-3 and in the paper "Synthetic ambience in parametric stereo coding" by J. Engdegård, H. Purnhagen, J. Rödén and L. Liljeryd, 116th AES Convention, Berlin, Germany, May 2004), in MPEG Surround (ISO/IEC 23003-1), and in MPEG SAOC (ISO/IEC 23003-2).
To avoid introducing unwanted correlation, the different decorrelation processes are mutually decorrelated. According to other embodiments, several or all of the approximated audio objects 110 are subjected to the same decorrelation process.
The system 100 further comprises an audio object reconstructing component 128. The object reconstructing component 128 is arranged downstream of the audio object approximating component 108, the decorrelating component 118 and the second receiving unit 112. The object reconstructing component 128 is configured to reconstruct, for each approximated audio object 138 among the N approximated audio objects that has no corresponding decorrelated audio object 136, the time/frequency tile of the audio object 142 from the approximated audio object 138 alone. In other words, if a certain approximated audio object 138 has not been subjected to a decorrelation process, it is simply reconstructed as the approximated audio object 110 provided by the audio object approximating component 108. The object reconstructing component 128 is further configured to reconstruct, for each of the N approximated audio objects 110 that has a corresponding decorrelated audio object 136, the time/frequency tile of the audio object using both the decorrelated audio object 136 and the corresponding approximated audio object 110.
To facilitate this process, the second receiving unit 112 is further configured to receive at least one weighting parameter 132 for each of the N approximated audio objects 110 that has a corresponding decorrelated audio object 136. The at least one weighting parameter 132 represents a first weighting factor 116 and a second weighting factor 114. The first weighting factor 116, also called the dry factor, and the second weighting factor 114, also called the wet factor, are derived from the at least one weighting parameter 132 by a wet/dry extractor 134. The first weighting factor 116 and/or the second weighting factor 114 may vary with time and frequency, that is, the values of the weighting factors 116, 114 may differ for each time/frequency tile processed.
In some embodiments, the at least one weighting parameter 132 comprises the first weighting factor 116 and the second weighting factor 114. In some embodiments, the at least one weighting parameter 132 comprises a single weighting parameter. If so, the wet/dry extractor 134 may derive the first weighting factor 116 and the second weighting factor 114 from the single weighting parameter 132. For example, the first weighting factor 116 and the second weighting factor 114 may satisfy a relation that allows one weighting factor to be derived once the other is known. An example of such a relation is that the sum of the squares of the first weighting factor 116 and the second weighting factor 114 equals one. Thus, if the single weighting parameter 132 comprises the first weighting factor 116, the second weighting factor 114 may be derived as the square root of one minus the square of the first weighting factor 116, and vice versa.
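The derivation of the second weighting factor from a single transmitted parameter may be sketched as follows, assuming that the single weighting parameter carries the dry factor and that the sum-of-squares relation holds. The function name `extract_wet_dry` and the range check are illustrative assumptions.

```python
import math


def extract_wet_dry(w_dry):
    """Derive (dry, wet) factors from a single weighting parameter.

    Assumes the parameter is the dry factor and that
    w_dry**2 + w_wet**2 == 1, so w_wet = sqrt(1 - w_dry**2).
    """
    if not 0.0 <= w_dry <= 1.0:
        raise ValueError("dry factor must lie in [0, 1]")
    w_wet = math.sqrt(1.0 - w_dry * w_dry)
    return w_dry, w_wet


DRY, WET = extract_wet_dry(0.6)
```

The same relation can be inverted to recover the dry factor when the wet factor is the one transmitted.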
The first weighting factor 116 is used for weighting 122, that is, for multiplication with, the approximated audio object 110. The second weighting factor 114 is used for weighting 120, that is, for multiplication with, the corresponding decorrelated audio object 136. The audio object reconstructing component 128 is further configured to combine 124, for example by summation, the weighted approximated audio object 150 with the corresponding weighted decorrelated audio object 152 in order to reconstruct the time/frequency tile of the corresponding audio object 142.
In other words, for each object and each time/frequency tile, the amount of decorrelation may be controlled by a single weighting parameter 132. In the wet/dry extractor 134, this weighting parameter 132 is converted into a weight factor 116 ($w_{dry}$) applied to the approximated object 110 and a weight factor 114 ($w_{wet}$) applied to the decorrelated object 136. The sum of the squares of these weight factors is one, that is,
$w_{wet}^2 + w_{dry}^2 = 1$
This means that the final object 142, which is the output of the summation 124, has the same energy as the corresponding approximated object 110.
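The energy-preserving property of the weighted summation can be checked numerically with the sketch below. It assumes, for the purpose of the example only, that the decorrelated object has the same energy as the approximated object and is exactly uncorrelated with it; the function names and sample values are hypothetical.

```python
import math


def energy(x):
    """Sum of squared samples of one tile."""
    return sum(v * v for v in x)


def reconstruct_tile(approx, decorr, w_dry):
    """Weight the dry and wet signals and sum them sample by sample."""
    w_wet = math.sqrt(1.0 - w_dry ** 2)  # from w_wet^2 + w_dry^2 = 1
    return [w_dry * a + w_wet * d for a, d in zip(approx, decorr)]


# Hypothetical tile: the two signals are orthogonal and have equal energy.
approx = [1.0, 1.0, 1.0, 1.0]
decorr = [1.0, -1.0, 1.0, -1.0]
out = reconstruct_tile(approx, decorr, w_dry=0.6)
```

Under these assumptions the cross term in the energy of the sum vanishes, so the output energy equals the energy of the approximated object regardless of the dry factor chosen.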
In order for the input signals 126, 130 to be decodable by an audio decoding system that cannot handle decorrelation, that is, to maintain backward compatibility with such an audio decoder, the input signal 126 may be arranged in a frame 202 as depicted in Fig. 2. According to this embodiment, the reconstruction matrix 104 is arranged in a first field of the frame 202 using a first format, and the at least one weighting parameter 132 is arranged in a second field of the frame 202 using a second format. In this way, a decoder that can read the first format but not the second format can still decode the reconstruction matrix 104 and use it to upmix the downmix signals 106 in any conventional way. The second field of the frame 202 may in this case be discarded.
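The two-field frame arrangement may be illustrated with the following sketch. The byte layout used here (each field preceded by a 16-bit big-endian length) is purely hypothetical and chosen only to make the example runnable; the disclosure does not define the actual frame syntax, and the function names are likewise assumptions.

```python
import struct


def pack_frame(matrix_bytes, weight_bytes):
    """Hypothetical layout: [len1][field 1][len2][field 2]."""
    return (struct.pack(">H", len(matrix_bytes)) + matrix_bytes
            + struct.pack(">H", len(weight_bytes)) + weight_bytes)


def legacy_decode(frame):
    """Decoder that understands only the first field; the rest is discarded."""
    (n1,) = struct.unpack_from(">H", frame, 0)
    return frame[2:2 + n1]


def full_decode(frame):
    """Decoder that reads both the matrix field and the weighting field."""
    (n1,) = struct.unpack_from(">H", frame, 0)
    matrix = frame[2:2 + n1]
    (n2,) = struct.unpack_from(">H", frame, 2 + n1)
    weights = frame[4 + n1:4 + n1 + n2]
    return matrix, weights


FRAME = pack_frame(b"MATRIX", b"W")
```

Because the first field is self-delimiting in this sketch, a legacy decoder can stop after it without needing to parse the second field.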
According to some embodiments, the audio decoding system 100 in Fig. 1 may additionally receive L auxiliary signals 144, for example at the first receiving unit 102. There may be one or more such auxiliary signals, that is, L ≥ 1. These auxiliary signals 144 may be included in the input signal 130. The auxiliary signals 144 may be included in the input signal 130 in a way that maintains backward compatibility as described above, that is, such that a decoder system unable to handle auxiliary signals can still derive the downmix signals 106 from the input signal 130. The reconstruction matrix 104 may further enable the approximation of the N audio objects 110 to be reconstructed from the M downmix signals 106 and the L auxiliary signals 144. The audio object approximating component 108 may therefore be configured to apply the reconstruction matrix 104 to the M downmix signals 106 and the L auxiliary signals 144 to generate the N approximated audio objects 110.
The role of the auxiliary signals 144 is to improve the approximation of the N audio objects in the audio object approximating component 108. According to one example, at least one of the auxiliary signals 144 is equal to one of the N audio objects to be reconstructed. In that case, the vector in the reconstruction matrix 104 used to reconstruct that specific audio object contains only a single non-zero parameter, for example a parameter with the value one (1). According to other examples, at least one of the L auxiliary signals 144 is a combination of at least two of the N audio objects to be reconstructed.
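A minimal sketch of applying the reconstruction matrix to the downmix and auxiliary signals together is given below, assuming that the auxiliary signals are simply appended to the downmix vector. The function name, the column ordering and the values are assumptions made for illustration.

```python
def approximate_with_aux(c, downmix, aux):
    """Apply a reconstruction matrix to M downmix and L auxiliary signals.

    Each row of c has M + L coefficients; the first M columns act on the
    downmix signals and the last L columns on the auxiliary signals
    (this column ordering is an assumption of the sketch).
    """
    inputs = list(downmix) + list(aux)
    return [sum(c_nj * x_j for c_nj, x_j in zip(row, inputs)) for row in c]


# Hypothetical tile: M = 2, L = 1, and the auxiliary signal equals object 0.
C_AUX = [
    [0.0, 0.0, 1.0],  # object 0: single non-zero entry selects the auxiliary signal
    [0.5, 0.5, 0.0],  # object 1: approximated from the downmix signals only
]
RESULT = approximate_with_aux(C_AUX, downmix=[2.0, 4.0], aux=[7.0])
```

The first row shows the case described above in which the vector used to reconstruct a specific object contains a single non-zero parameter with the value one.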
In some embodiments, the L auxiliary signals may represent signal dimensions of the N audio objects that were lost when the M downmix signals 106 were generated from the N audio objects. This can be explained by saying that the M downmix signals 106 span a hyperplane in a signal space and that the L auxiliary signals 144 do not lie in this hyperplane. For example, the L auxiliary signals 144 may be orthogonal to the hyperplane spanned by the M downmix signals 106. Based on the M downmix signals 106 alone, only signals that lie in the hyperplane can be reconstructed, that is, an audio object that does not lie in the hyperplane will be approximated by an audio signal in the hyperplane. By additionally using the L auxiliary signals 144 in the reconstruction, signals that do not lie in the hyperplane can also be reconstructed. As a result, the approximation of the audio objects can be improved by also using the L auxiliary signals.
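The geometric argument can be made concrete with a small numeric sketch for M = 1 and L = 1. The three-sample signals and the choice of the auxiliary signal as the exact residual are assumptions made for illustration only.

```python
def dot(a, b):
    """Inner product of two equally long sample lists."""
    return sum(x * y for x, y in zip(a, b))


y = [1.0, 1.0, 0.0]  # single downmix signal; its span is the "hyperplane"
s = [3.0, 1.0, 2.0]  # audio object that does not lie in that span

# Best approximation using the downmix alone: projection of s onto y.
c = dot(s, y) / dot(y, y)
in_plane = [c * v for v in y]

# Hypothetical auxiliary signal: the part of s orthogonal to the downmix.
aux = [a - b for a, b in zip(s, in_plane)]

# With the auxiliary signal (coefficient 1) the object is recovered exactly.
with_aux = [p + a for p, a in zip(in_plane, aux)]
```

Here `in_plane` differs from `s`, while `with_aux` equals `s`, which mirrors the statement that signals outside the hyperplane become reconstructible once the auxiliary signals are used.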
Fig. 3 shows, by way of example, a generalized block diagram of an audio encoder 300 for generating at least one weighting parameter 320. The at least one weighting parameter 320 is to be used in a decoder (for example, the audio decoding system 100 described above) when reconstructing a time/frequency tile of a specific audio object by combining (reference 124 of Fig. 1) a weighted decoder-side approximation of the specific audio object (reference 150 of Fig. 1) with a corresponding weighted decorrelated version (reference 152 of Fig. 1) of the decoder-side approximation of the specific audio object.
The encoder 300 comprises a receiving unit 302 configured to receive M downmix signals 312, which are combinations of at least N audio objects including the specific audio object. The receiving unit 302 is further configured to receive the specific audio object 314. In some embodiments, the receiving unit 302 is further configured to receive L auxiliary signals 322. As discussed above, at least one of the L auxiliary signals 322 may be equal to one of the N audio objects, at least one of the L auxiliary signals 322 may be a combination of at least two of the N audio objects, and at least one of the L auxiliary signals 322 may contain information that is not present in any of the M downmix signals.
The encoder 300 further comprises a calculating unit 304. The calculating unit 304 is configured to calculate, for example at a first energy calculation component 306, a first quantity 316 indicative of an energy level of the specific audio object. The first quantity 316 may be calculated as a norm of the specific audio object. For example, the first quantity 316 may be equal to the energy of the specific audio object and may thus be calculated using the two-norm as $Q_1 = \|S\|^2$, where S denotes the specific audio object. The first quantity may alternatively be calculated as another quantity indicative of the energy of the specific audio object, such as the square root of the energy.
The calculating unit 304 is further configured to calculate a second quantity 318 indicative of an energy level corresponding to the energy level of an encoder-side approximation of the specific audio object 314. The encoder-side approximation may, for example, be a combination, such as a linear combination, of the M downmix signals 312. Alternatively, the encoder-side approximation may be a combination, such as a linear combination, of the M downmix signals 312 and the L auxiliary signals 322. The second quantity may be calculated at a second energy calculation component 308.
The encoder-side approximation may, for example, be computed using a non-energy-matched upmix matrix and the M downmix signals 312. In the context of the present specification, the term "non-energy-matched" means that the approximation of the specific audio object is not energy matched to the specific audio object itself, that is, the approximation has a different energy level, usually a lower one, than the specific audio object 314.
The non-energy-matched upmix matrix may be generated using different approaches. For example, a minimum mean squared error (MMSE) predictive approach may be used, which takes at least the N audio objects and the M downmix signals 312 (and possibly the L auxiliary signals 322) as input. This can be described as an iterative approach that aims to find the upmix matrix minimizing the mean squared error of the approximations of the N audio objects. Specifically, the approach multiplies a candidate upmix matrix by the M downmix signals 312 (and possibly the L auxiliary signals 322) to approximate the N audio objects, and compares the approximations with the N audio objects in terms of mean squared error. The candidate upmix matrix that minimizes the mean squared error is selected as the upmix matrix used to define the encoder-side approximation of the specific audio object.
When the MMSE approach is used, the prediction error e between the specific audio object S and the approximated audio object S′ is orthogonal to S′. Since S = S′ + e, this implies that:
$\|S'\|^2 + \|e\|^2 = \|S\|^2$
In other words, the energy of the audio object S equals the sum of the energy of the approximated audio object and the energy of the prediction error. Because of this relation, the energy of the prediction error e also gives an indication of the energy of the encoder-side approximation S′.
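The orthogonality and energy relation can be verified numerically for the simplest case of a single downmix signal. The sketch uses the closed-form least-squares coefficient for M = 1 instead of the iterative candidate search described above, which is a simplification made for this example; the sample values are hypothetical.

```python
def dot(a, b):
    """Inner product of two equally long sample lists."""
    return sum(x * y for x, y in zip(a, b))


def energy(x):
    """Squared two-norm of a signal."""
    return dot(x, x)


y = [1.0, 1.0, 1.0, 1.0]  # single downmix signal (M = 1)
s = [4.0, 2.0, 0.0, 2.0]  # specific audio object S

# Closed-form MMSE coefficient for one downmix signal: <S, Y> / <Y, Y>.
c = dot(s, y) / dot(y, y)
s_approx = [c * v for v in y]               # encoder-side approximation S'
e = [a - b for a, b in zip(s, s_approx)]    # prediction error e = S - S'

q1 = energy(s)         # first quantity, energy of S
q2 = energy(s_approx)  # second quantity, energy of S'
```

In this example the error is orthogonal to the approximation S′ but not to S itself, and the approximation has lower energy than the object, in line with the term "non-energy-matched".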
Consequently, the second quantity 318 may be calculated using either the approximation S′ of the specific audio object or the prediction error. The second quantity may be calculated as a norm of the approximation S′ of the specific audio object or as a norm of the prediction error e. For example, the second quantity may be calculated using the two-norm, that is, $Q_2 = \|S'\|^2$ or $Q_2 = \|e\|^2$. The second quantity may alternatively be calculated as another quantity indicative of the energy of the approximated specific audio object, such as the square root of the energy of the approximated specific audio object or the square root of the energy of the prediction error.
The calculating unit is further configured to calculate the at least one weighting parameter 320 based on the first quantity 316 and the second quantity 318, for example at a parameter computation component 310. The parameter computation component 310 may, for example, calculate the at least one weighting parameter 320 by comparing the first quantity 316 with the second quantity 318. An exemplary parameter computation component 310 is now explained in detail in conjunction with Fig. 4 and Figs. 5a-c.
Fig. 4 shows, by way of example, a generalized block diagram of the parameter computation component 310 for generating the at least one weighting parameter 320. The parameter computation component 310 compares the first quantity 316 and the second quantity 318, for example at a ratio computation component 402, by calculating a ratio r between the second quantity 318 and the first quantity 316. The ratio is then raised to the power α, that is:
$r = (Q_2 / Q_1)^{\alpha}$
where $Q_2$ is the second quantity 318 and $Q_1$ is the first quantity 316. According to some embodiments, when $Q_2 = \|S'\|$ and $Q_1 = \|S\|$, α equals 2, that is, the ratio r is the ratio between the energy of the approximated specific audio object and the energy of the specific audio object. The ratio raised to the power α, r 406, is then used to calculate the at least one weighting parameter 320, for example at a mapping component 404. The mapping component 404 subjects r 406 to an increasing function that maps r to the at least one weighting parameter 320. Such increasing functions are exemplified in Figs. 5a-c, where the horizontal axis represents the value of r 406 and the vertical axis represents the value of the weighting parameter 320. In this example, the weighting parameter 320 is a single weighting parameter corresponding to the first weighting factor 116 in Fig. 1.
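The ratio computation may be sketched as follows. The function name `energy_ratio` and the guard against a non-positive first quantity are assumptions of the sketch; the numeric values are hypothetical.

```python
import math


def energy_ratio(q2, q1, alpha=1.0):
    """Return r = (q2 / q1) ** alpha for second quantity q2 and first quantity q1."""
    if q1 <= 0.0:
        raise ValueError("first quantity must be positive")
    return (q2 / q1) ** alpha


# Quantities given as energies (squared norms): alpha = 1 yields the energy ratio.
r_from_energies = energy_ratio(16.0, 24.0, alpha=1.0)

# Quantities given as norms (square roots of the energies): alpha = 2
# yields the same energy ratio.
r_from_norms = energy_ratio(math.sqrt(16.0), math.sqrt(24.0), alpha=2.0)
```

Both calls describe the same tile, so the choice of α simply compensates for whether the quantities are expressed as energies or as norms.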
In general, the principle of the mapping function is as follows: if $Q_2 \ll Q_1$, the first weighting factor approaches 0, and if $Q_2 \approx Q_1$, the first weighting factor approaches 1.
Fig. 5a shows a mapping function 502 in which, for values of r 406 between 0 and 1, the value of the weighting parameter 320 is the same as the value of r. For values of r greater than 1, the value of the weighting parameter 320 is 1.
Fig. 5b shows another mapping function 504 in which, for values of r 406 between 0 and 0.5, the value of the weighting parameter 320 is 0. For values of r greater than 1, the value of the weighting parameter 320 is 1. For values of r between 0.5 and 1, the value of the weighting parameter 320 is (r - 0.5) * 2.
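The two mapping functions described for Figs. 5a and 5b may be written as the following sketch. The function names are hypothetical, and the behavior for negative r (clamping to 0) is an assumption, since r is a ratio of non-negative quantities.

```python
def map_5a(r):
    """Identity on [0, 1], saturating at 1 for larger r."""
    return min(max(r, 0.0), 1.0)


def map_5b(r):
    """Zero up to r = 0.5, then (r - 0.5) * 2 up to r = 1, then 1."""
    return min(max((r - 0.5) * 2.0, 0.0), 1.0)
```

Both functions are non-decreasing in r, so a better encoder-side approximation (larger r) never produces a smaller dry factor.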
Fig. 5c shows a third alternative mapping function 506 that generalizes the mapping functions of Figs. 5a-b. The mapping function 506 is defined by at least four parameters, $b_1$, $b_2$, $\beta_1$ and $\beta_2$, which may be constants tuned for the best perceptual quality of the reconstructed audio objects at the decoder side. In general, limiting the maximum amount of decorrelation in the output audio signal can be beneficial, because a decorrelated approximated audio object is usually of poorer quality than the approximated audio object when the two are listened to separately. Setting $b_1$ greater than zero controls this directly and ensures that the weighting parameter 320 (and thus the first weighting factor 116 in Fig. 1) is greater than zero in all cases. Setting $b_2$ less than 1 has the effect that there is always a minimum level of decorrelated energy in the output of the audio decoding system 100; in other words, the second weighting factor 114 in Fig. 1 is always greater than zero. $\beta_1$ implicitly controls the amount of decorrelation added to the output of the audio decoding system 100, but with different dynamics than $b_1$. Similarly, $\beta_2$ implicitly controls the amount of decorrelation in the output of the audio decoding system 100.
If a curved mapping function is desired between the values $\beta_1$ and $\beta_2$ of r, at least one further parameter is needed, which may be a constant.
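One possible reading of the four-parameter mapping is a clamped linear ramp, sketched below. The exact shape of the curve in Fig. 5c is not reproduced in this text, so the piecewise-linear form, the function name `map_5c` and the example constants are assumptions: the output is taken to be b1 for r at or below β1, b2 for r at or above β2, and linear in between.

```python
def map_5c(r, b1, b2, beta1, beta2):
    """Clamped linear ramp from (beta1, b1) to (beta2, b2).

    b1, b2       -- lower and upper output levels (assumed 0 <= b1 <= b2 <= 1)
    beta1, beta2 -- values of r where the ramp starts and ends
    """
    if not beta1 < beta2:
        raise ValueError("beta1 must be smaller than beta2")
    if r <= beta1:
        return b1
    if r >= beta2:
        return b2
    t = (r - beta1) / (beta2 - beta1)  # position along the ramp, 0..1
    return b1 + t * (b2 - b1)
```

Under this reading, the mapping described for Fig. 5a corresponds to b1 = 0, b2 = 1, β1 = 0, β2 = 1, and the mapping described for Fig. 5b to the same levels with β1 = 0.5.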
Equivalents, extensions, alternatives and miscellaneous
Further embodiments of the present disclosure will become apparent to a person skilled in the art after studying the description above. Even though the present description and drawings disclose embodiments and examples, the disclosure is not restricted to these specific examples. Numerous modifications and variations can be made without departing from the scope of the present disclosure, which is defined by the appended claims. Any reference signs appearing in the claims are not to be understood as limiting their scope.
Additionally, variations to the disclosed embodiments can be understood and effected by the skilled person in practicing the disclosure, from a study of the drawings, the disclosure and the appended claims. In the claims, the word "comprising" does not exclude other elements or steps, and the indefinite article "a" or "an" does not exclude a plurality. The mere fact that certain measures are recited in mutually different dependent claims does not indicate that a combination of these measures cannot be used to advantage.
The systems and methods disclosed hereinabove may be implemented as software, firmware, hardware or a combination thereof. In a hardware implementation, the division of tasks between functional units referred to in the above description does not necessarily correspond to the division into physical units; on the contrary, one physical component may have multiple functionalities, and one task may be carried out by several physical components in cooperation. Certain components or all components may be implemented as software executed by a digital signal processor or microprocessor, or be implemented as hardware or as an application-specific integrated circuit. Such software may be distributed on computer-readable media, which may comprise computer storage media (or non-transitory media) and communication media (or transitory media). As is well known to a person skilled in the art, the term computer storage media includes both volatile and non-volatile, removable and non-removable media implemented in any method or technology for storage of information such as computer-readable instructions, data structures, program modules or other data. Computer storage media include, but are not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, digital versatile discs (DVD) or other optical disc storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium that can be used to store the desired information and that can be accessed by a computer. Further, it is well known to the skilled person that communication media typically embody computer-readable instructions, data structures, program modules or other data in a modulated data signal such as a carrier wave or other transport mechanism, and include any information delivery media.
Claims (15)
1. An audio decoding system for reconstructing a time/frequency tile of N audio objects, comprising:
a first receiving unit (102) configured to receive a first input signal (130), the first input signal comprising M downmix signals (126) and L auxiliary signals (130);
a second receiving unit (112) configured to:
receive a second input signal (126) and extract a reconstruction matrix (104) from the second input signal; and
receive a weighting parameter (132);
an audio object approximating component (108) arranged downstream of the first receiving unit and the second receiving unit and configured to apply the reconstruction matrix to the M downmix signals and the L auxiliary signals to generate N approximated audio objects;
a wet/dry extractor component (134) arranged downstream of the second receiving unit and configured to derive a dry factor (116) and a wet factor (114) from the weighting parameter received by the second receiving unit;
a decorrelating component (118) arranged downstream of the audio object approximating component and configured to subject at least a subset of the N approximated audio objects to a decorrelation process to generate at least one decorrelated audio object, whereby each of the at least one decorrelated audio object corresponds to one of the N approximated audio objects;
an audio object reconstructing component (128) arranged downstream of the audio object approximating component, the decorrelating component and the wet/dry extractor component, the audio object reconstructing component being configured to:
weight the N approximated audio objects using the dry factor;
weight the at least one decorrelated audio object using the wet factor; and
combine the weighted N approximated audio objects and the weighted at least one decorrelated audio object to reconstruct the time/frequency tile of the N audio objects (142).
2. The system according to claim 1, wherein the wet factor and the dry factor vary with time and frequency, and wherein the reconstruction matrix varies with time and frequency.
3. The system according to claim 1, wherein at least one of the L auxiliary signals is equal to one of the N audio objects to be reconstructed.
4. The system according to claim 1, wherein at least one of the L auxiliary signals is a combination of at least two of the N audio objects to be reconstructed.
5. The system according to claim 1, wherein the M downmix signals span a hyperplane, and wherein at least one of the L auxiliary signals does not lie in the hyperplane spanned by the M downmix signals.
6. The system according to claim 5, wherein the at least one of the L auxiliary signals is orthogonal to the hyperplane spanned by the M downmix signals.
7. The system according to claim 1, wherein the reconstruction matrix and the weighting parameter, when received, are arranged in a frame, wherein the reconstruction matrix is arranged in a first field of the frame using a first format and the weighting parameter is arranged in a second field of the frame using a second format, so that a decoder supporting only the first format can decode the reconstruction matrix in the first field and discard the weighting parameter in the second field.
8. A method of reconstructing a time/frequency tile of N audio objects by an audio decoding system, comprising:
receiving, by a first receiving unit of the audio decoding system, a first input signal, the first input signal comprising M downmix signals and L auxiliary signals;
receiving, by a second receiving unit of the audio decoding system, a second input signal and a weighting parameter;
extracting, by the second receiving unit, a reconstruction matrix from the second input signal;
applying, by an audio object approximating component of the audio decoding system, the reconstruction matrix to the M downmix signals and the L auxiliary signals to generate N approximated audio objects, the audio object approximating component being arranged downstream of the first receiving unit and the second receiving unit;
deriving, by a wet/dry extractor component of the audio decoding system, a dry factor and a wet factor from the received weighting parameter, the wet/dry extractor component being arranged downstream of the second receiving unit;
decorrelating, by a decorrelating component of the audio decoding system, at least a subset of the N approximated audio objects, including generating at least one decorrelated audio object, wherein each of the at least one decorrelated audio object corresponds to one of the N approximated audio objects, the decorrelating component being arranged downstream of the audio object approximating component;
weighting, by an audio object reconstructing component, the N approximated audio objects using the dry factor, the audio object reconstructing component being arranged downstream of the audio object approximating component, the decorrelating component and the wet/dry extractor component;
weighting, by the audio object reconstructing component, the at least one decorrelated audio object using the wet factor; and
combining, by the audio object reconstructing component, the weighted N approximated audio objects and the weighted at least one decorrelated audio object to reconstruct the time/frequency tile of the N audio objects, wherein the audio decoding system comprises one or more computer processors.
9. The method according to claim 8, wherein:
weighting the N approximated audio objects using the dry factor comprises multiplying the N approximated audio objects by the dry factor;
weighting the at least one decorrelated audio object using the wet factor comprises multiplying the at least one decorrelated audio object by the wet factor; and
combining the weighted N approximated audio objects and the weighted at least one decorrelated audio object comprises summing the weighted N approximated audio objects and the weighted at least one decorrelated audio object.
10. The method according to claim 8, wherein the wet factor and the dry factor vary with time and frequency, and wherein the reconstruction matrix varies with time and frequency.
11. The method according to claim 8, wherein at least one of the L auxiliary signals is equal to one of the N audio objects to be reconstructed.
12. The method according to claim 8, wherein at least one of the L auxiliary signals is a combination of at least two of the N audio objects to be reconstructed.
13. The method according to claim 8, wherein the M downmix signals span a hyperplane, and wherein at least one of the L auxiliary signals does not lie in the hyperplane spanned by the M downmix signals.
14. The method according to claim 13, wherein the at least one of the L auxiliary signals is orthogonal to the hyperplane spanned by the M downmix signals.
15. The method according to claim 8, wherein the reconstruction matrix and the weighting parameter, when received, are arranged in a frame, wherein the reconstruction matrix is arranged in a first field of the frame using a first format and the weighting parameter is arranged in a second field of the frame using a second format, so that a decoder supporting only the first format can decode the reconstruction matrix in the first field and discard the weighting parameter in the second field.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910546611.9A CN110223702B (en) | 2013-05-24 | 2014-05-23 | Audio decoding system and reconstruction method |
Applications Claiming Priority (5)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US201361827288P | 2013-05-24 | 2013-05-24 | |
US61/827,288 | 2013-05-24 | ||
CN201480029603.2A CN105393304B (en) | 2013-05-24 | 2014-05-23 | Audio coding and coding/decoding method, medium and audio coder and decoder |
PCT/EP2014/060728 WO2014187987A1 (en) | 2013-05-24 | 2014-05-23 | Methods for audio encoding and decoding, corresponding computer-readable media and corresponding audio encoder and decoder |
CN201910546611.9A CN110223702B (en) | 2013-05-24 | 2014-05-23 | Audio decoding system and reconstruction method |
Related Parent Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201480029603.2A Division CN105393304B (en) | 2013-05-24 | 2014-05-23 | Audio coding and coding/decoding method, medium and audio coder and decoder |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110223702A (en) | 2019-09-10 |
CN110223702B (en) | 2023-04-11 |
Family
ID=50771513
Family Applications (2)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201480029603.2A Active CN105393304B (en) | 2013-05-24 | 2014-05-23 | Audio coding and coding/decoding method, medium and audio coder and decoder |
CN201910546611.9A Active CN110223702B (en) | 2013-05-24 | 2014-05-23 | Audio decoding system and reconstruction method |
Family Applications Before (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201480029603.2A Active CN105393304B (en) | 2013-05-24 | 2014-05-23 | Audio coding and coding/decoding method, medium and audio coder and decoder |
Country Status (10)
Country | Link |
---|---|
US (1) | US9818412B2 (en) |
EP (1) | EP3005352B1 (en) |
JP (1) | JP6248186B2 (en) |
KR (1) | KR101761099B1 (en) |
CN (2) | CN105393304B (en) |
BR (1) | BR112015028914B1 (en) |
ES (1) | ES2624668T3 (en) |
HK (1) | HK1216453A1 (en) |
RU (1) | RU2628177C2 (en) |
WO (1) | WO2014187987A1 (en) |
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CA2926243C (en) | 2013-10-21 | 2018-01-23 | Lars Villemoes | Decorrelator structure for parametric reconstruction of audio signals |
CN107886960B (en) * | 2016-09-30 | 2020-12-01 | Huawei Technologies Co., Ltd. | Audio signal reconstruction method and device |
Citations (13)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20060083385A1 (en) * | 2004-10-20 | 2006-04-20 | Eric Allamanche | Individual channel shaping for BCC schemes and the like |
CN101506875A (en) * | 2006-07-07 | 2009-08-12 | Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. | Apparatus and method for combining multiple parametrically coded audio sources |
CN101517637A (en) * | 2006-09-18 | 2009-08-26 | Koninklijke Philips Electronics N.V. | Encoding and decoding of audio objects |
CN101529501A (en) * | 2006-10-16 | 2009-09-09 | Dolby Sweden AB | Enhanced coding and parameter representation of multichannel downmixed object coding |
CN101543098A (en) * | 2007-04-17 | 2009-09-23 | Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. | Generation of decorrelated signals |
CN101849257A (en) * | 2007-10-17 | 2010-09-29 | Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. | Audio coding using downmix |
WO2010149700A1 (en) * | 2009-06-24 | 2010-12-29 | Fraunhofer Gesellschaft zur Förderung der angewandten Forschung e.V. | Audio signal decoder, method for decoding an audio signal and computer program using cascaded audio object processing stages |
CN102169693A (en) * | 2004-03-01 | 2011-08-31 | Dolby Laboratories Licensing Corporation | Multichannel audio coding |
CN102334158A (en) * | 2009-01-28 | 2012-01-25 | Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. | Upmixer, method and computer program for upmixing downmixed audio signals |
US20120114126A1 (en) * | 2009-05-08 | 2012-05-10 | Oliver Thiergart | Audio Format Transcoder |
CN102576532A (en) * | 2009-04-28 | 2012-07-11 | Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. | Apparatus for providing one or more adjusted parameters for a provision of an upmix signal representation on the basis of a downmix signal representation, audio signal decoder, audio signal transcoder, audio signal encoder, audio bitstream, method and computer program using an object-related parametric information |
CN102640213A (en) * | 2009-10-20 | 2012-08-15 | Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. | Apparatus for providing an upmix signal representation on the basis of a downmix signal representation, apparatus for providing a bitstream representing a multichannel audio signal, methods, computer program and bitstream using a distortion control signaling |
CN102667919A (en) * | 2009-09-29 | 2012-09-12 | Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. | Audio signal decoder, audio signal encoder, method for providing an upmix signal representation, method for providing a downmix signal representation, computer program and bitstream using a common inter-object-correlation parameter value |
Family Cites Families (30)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US4969192A (en) * | 1987-04-06 | 1990-11-06 | Voicecraft, Inc. | Vector adaptive predictive coder for speech and audio |
US7447317B2 (en) | 2003-10-02 | 2008-11-04 | Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V | Compatible multi-channel coding/decoding by weighting the downmix channel |
US7394903B2 (en) | 2004-01-20 | 2008-07-01 | Fraunhofer-Gesellschaft Zur Forderung Der Angewandten Forschung E.V. | Apparatus and method for constructing a multi-channel output signal or for generating a downmix signal |
US7391870B2 (en) * | 2004-07-09 | 2008-06-24 | Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E V | Apparatus and method for generating a multi-channel output signal |
RU2391714C2 (en) * | 2004-07-14 | 2010-06-10 | Koninklijke Philips Electronics N.V. | Audio channel conversion |
KR101407429B1 (en) | 2004-09-17 | 2014-06-17 | Koninklijke Philips N.V. | Composite audio coding to minimize perceptual distortion |
SE0402649D0 (en) * | 2004-11-02 | 2004-11-02 | Coding Tech Ab | Advanced methods of creating orthogonal signals |
JP5017121B2 (en) | 2004-11-30 | 2012-09-05 | Agere Systems Inc. | Synchronization of spatial audio parametric coding with externally supplied downmix |
KR101215868B1 (en) | 2004-11-30 | 2012-12-31 | Agere Systems LLC | A method for encoding and decoding audio channels, and an apparatus for encoding and decoding audio channels |
US7787631B2 (en) | 2004-11-30 | 2010-08-31 | Agere Systems Inc. | Parametric coding of spatial audio with cues based on transmitted channels |
US7573912B2 (en) * | 2005-02-22 | 2009-08-11 | Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschunng E.V. | Near-transparent or transparent multi-channel encoder/decoder scheme |
US7751572B2 (en) | 2005-04-15 | 2010-07-06 | Dolby International Ab | Adaptive residual audio coding |
PL2088580T3 (en) | 2005-07-14 | 2012-07-31 | Koninl Philips Electronics Nv | Audio decoding |
RU2419249C2 (en) * | 2005-09-13 | 2011-05-20 | Koninklijke Philips Electronics N.V. | Audio coding |
RU2406164C2 (en) | 2006-02-07 | 2010-12-10 | LG Electronics Inc. | Signal coding/decoding device and method |
KR101065704B1 (en) | 2006-09-29 | 2011-09-19 | LG Electronics Inc. | Method and apparatus for encoding and decoding object based audio signals |
AU2007328614B2 (en) | 2006-12-07 | 2010-08-26 | Lg Electronics Inc. | A method and an apparatus for processing an audio signal |
KR101149448B1 (en) | 2007-02-12 | 2012-05-25 | Samsung Electronics Co., Ltd. | Audio encoding and decoding apparatus and method thereof |
WO2008100100A1 (en) | 2007-02-14 | 2008-08-21 | Lg Electronics Inc. | Methods and apparatuses for encoding and decoding object-based audio signals |
CA2684975C (en) | 2007-04-26 | 2016-08-02 | Dolby Sweden Ab | Apparatus and method for synthesizing an output signal |
EP2144229A1 (en) | 2008-07-11 | 2010-01-13 | Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. | Efficient use of phase information in audio encoding and decoding |
US8315396B2 (en) | 2008-07-17 | 2012-11-20 | Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. | Apparatus and method for generating audio output signals using object based metadata |
AU2010321013B2 (en) | 2009-11-20 | 2014-05-29 | Dolby International Ab | Apparatus for providing an upmix signal representation on the basis of the downmix signal representation, apparatus for providing a bitstream representing a multi-channel audio signal, methods, computer programs and bitstream representing a multi-channel audio signal using a linear combination parameter |
AU2011206675C1 (en) | 2010-01-12 | 2016-04-28 | Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. | Audio encoder, audio decoder, method for encoding an audio information, method for decoding an audio information and computer program using a hash table describing both significant state values and interval boundaries |
WO2012110415A1 (en) * | 2011-02-14 | 2012-08-23 | Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. | Apparatus and method for processing a decoded audio signal in a spectral domain |
WO2012122397A1 (en) | 2011-03-09 | 2012-09-13 | Srs Labs, Inc. | System for dynamically creating and rendering audio objects |
US9530421B2 (en) | 2011-03-16 | 2016-12-27 | Dts, Inc. | Encoding and reproduction of three dimensional audio soundtracks |
PL3279895T3 (en) | 2011-11-02 | 2020-03-31 | Telefonaktiebolaget Lm Ericsson (Publ) | Audio encoding based on an efficient representation of auto-regressive coefficients |
RS1332U (en) | 2013-04-24 | 2013-08-30 | Tomislav Stanojević | Total surround sound system with floor loudspeakers |
EP3005355B1 (en) | 2013-05-24 | 2017-07-19 | Dolby International AB | Coding of audio scenes |
2014
Filing Date | Country | Application Number | Publication | State | Legal Status |
---|---|---|---|---|---|
2014-05-23 | BR | BR112015028914-2A | BR112015028914B1 | active | IP Right Grant |
2014-05-23 | CN | CN201480029603.2A | CN105393304B | active | Active |
2014-05-23 | WO | PCT/EP2014/060728 | WO2014187987A1 | active | Application Filing |
2014-05-23 | US | US14/890,793 | US9818412B2 | active | Active |
2014-05-23 | KR | KR1020157033532A | KR101761099B1 | active | Active |
2014-05-23 | RU | RU2015150066A | RU2628177C2 | active | |
2014-05-23 | EP | EP14725734.9A | EP3005352B1 | active | Active |
2014-05-23 | JP | JP2016514441A | JP6248186B2 | active | Active |
2014-05-23 | CN | CN201910546611.9A | CN110223702B | active | Active |
2014-05-23 | ES | ES14725734.9T | ES2624668T3 | active | Active |
2016
Filing Date | Country | Application Number | Publication | State | Legal Status |
---|---|---|---|---|---|
2016-04-18 | HK | HK16104430.2A | HK1216453A1 | unknown | |
Also Published As
Publication number | Publication date |
---|---|
EP3005352B1 (en) | 2017-03-29 |
WO2014187987A1 (en) | 2014-11-27 |
CN105393304A (en) | 2016-03-09 |
KR20160003083A (en) | 2016-01-08 |
US9818412B2 (en) | 2017-11-14 |
ES2624668T3 (en) | 2017-07-17 |
BR112015028914B1 (en) | 2021-12-07 |
CN105393304B (en) | 2019-05-28 |
US20160111097A1 (en) | 2016-04-21 |
EP3005352A1 (en) | 2016-04-13 |
HK1216453A1 (en) | 2016-11-11 |
JP2016522445A (en) | 2016-07-28 |
CN110223702B (en) | 2023-04-11 |
BR112015028914A2 (en) | 2017-08-29 |
RU2628177C2 (en) | 2017-08-15 |
KR101761099B1 (en) | 2017-07-25 |
RU2015150066A (en) | 2017-05-26 |
JP6248186B2 (en) | 2017-12-13 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
RU2660638C2 (en) | Device and method for improved spatial encoding of audio objects | |
KR102486365B1 (en) | Parametric reconstruction of audio signals | |
RU2608847C1 (en) | Encoding of audio scenes | |
BRPI0715559B1 (en) | Improved encoding and parameter representation of multi-channel downmixed object encoding | |
RU2666640C2 (en) | Multi-channel decorrelator, multi-channel audio decoder, multi-channel audio encoder, methods and computer program using premix of decorrelator input signals | |
EP3201916B1 (en) | Audio encoder and decoder | |
CN105659320B (en) | Audio coder and decoder | |
CN105518775A (en) | Artifact Removal of Comb Filters for Multichannel Downmix Using Adaptive Phase Calibration | |
TWI792006B (en) | Audio synthesizer, signal generation method, and storage unit | |
CN107771346B (en) | Internal sound channel processing method and device for realizing low-complexity format conversion | |
US10170131B2 (en) | Decoding method and decoder for dialog enhancement | |
CN105393304B (en) | Audio coding and coding/decoding method, medium and audio coder and decoder |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
REG | Reference to a national code | Ref country code: HK; Ref legal event code: DE; Ref document number: 40014360; Country of ref document: HK |
GR01 | Patent grant | ||