WO2009001292A1 - A method of merging at least two input object-oriented audio parameter streams into an output object-oriented audio parameter stream
- Publication number
- WO2009001292A1 (PCT/IB2008/052502)
- Authority
- WO
- WIPO (PCT)
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Classifications
- G10L19/20: Vocoders using multiple modes using sound class specific coding, hybrid encoders or object based coding
- G10L19/167: Audio streaming, i.e. formatting and decoding of an encoded audio signal representation into a data stream for transmission or storage purposes
- G10L19/008: Multichannel audio signal coding or decoding using interchannel correlation to reduce redundancy, e.g. joint-stereo, intensity-coding or matrixing
- H04S2420/03: Application of parametric coding in stereophonic audio systems
Definitions
- the invention relates to a method of merging at least two input object-oriented audio parameter streams into an output object-oriented audio parameter stream, each object- oriented audio parameter stream comprising object-oriented audio parameter values, said object-oriented audio parameter values representing statistical properties of audio objects as a function of time.
- Within the MPEG framework, a workgroup has been started on object-based spatial audio coding.
- the aim of this workgroup is to "explore new technology and reuse of current MPEG Surround components and technologies for the bit rate efficient coding of multiple sound sources or objects into a number of down-mix channels and corresponding spatial parameters".
- the aim is to encode multiple audio objects in a limited set of down-mix channels with corresponding parameters.
- users may interact with the content for example by repositioning the individual audio objects.
- Such interaction with the content is easily realized in object-oriented decoders by including a rendering step that follows the decoding process. Said rendering is combined with the decoding as a single processing step to prevent the need of determining individual objects.
- For loudspeaker playback such combination is described in Faller, C., "Parametric joint-coding of audio sources", Proc. 120th AES Convention, Paris, France, May 2006.
- For headphone playback, an efficient combination of decoding and head-related transfer function processing is described in Breebaart, J., Herre, J., Villemoes, L., Jin, C., Kjörling, K., Plogsties, J., Koppens, J. (2006), "Multi-channel goes mobile: MPEG Surround binaural rendering", Proc. 29th AES Conference, Seoul, Korea.
- object-oriented audio parameters as provided by object-oriented coding form a stream that reflects statistical properties of the audio objects as a function of time. Therefore, these object-oriented audio parameters are valid for one certain frame or even for a portion of a frame.
- Each object-oriented audio parameter stream comprises object-oriented audio parameter values.
- the object-oriented audio parameter values represent statistical properties of audio objects as a function of time.
- the method comprises the following steps. First, calculating a synchronized stream for each said input object-oriented audio parameter stream takes place. Said synchronized stream has object- oriented parameter values at predetermined temporal positions. Said predetermined temporal positions are the same for all synchronized streams. For each synchronized stream said object-oriented parameter values at the predetermined temporal positions are calculated by means of interpolating of the object-oriented parameter values of the corresponding input object-oriented audio parameter stream. Second, creating of the output object-oriented audio parameter stream is performed. Said output object-oriented audio parameter stream has object-oriented parameter values at the predetermined temporal position obtained by combining the object-oriented audio parameter values of the synchronized streams at said same predetermined temporal position.
- the advantage of the method of merging at least two input object-oriented audio parameter streams according to the invention is that no delaying of the object-oriented audio parameter streams is required in order to merge said streams. Instead a very simple processing is performed in order to obtain the object-oriented audio parameter values of the synchronized streams.
- the additional benefit of the proposed method is that the problem of different object-oriented audio parameter positions within frames across the object-oriented audio parameter streams is overcome.
- filtering is applied to the object-oriented audio parameter values of the synchronized stream. Applying e.g. a simple piecewise linear interpolation has the effect that the resulting interpolated object-oriented audio parameter values are low-pass filtered. Hence, in the case of linear interpolation, e.g. high-pass filtering can be employed to reduce the effect of low-pass filtering due to interpolation.
- the advantage of applying filtering to the object-oriented audio parameter values of the synchronized stream is that it ensures a similar dynamic behavior as the corresponding input object-oriented audio parameter stream. In other words it improves the quality of synchronization, as it helps the object-oriented audio parameter values of the synchronized stream to resemble the original behavior of said parameters.
- Said synchronization process provides the synchronized stream for the corresponding input object-oriented audio parameter stream.
- the applied filtering is adaptive in order to match statistical properties of the corresponding input object-oriented audio parameter values.
- the object-oriented audio parameters reflect statistical properties of the audio objects as a function of time, it is desired that the synchronization process takes these fluctuations over time into account.
- a Linear Predictive Coding analysis is used to determine an envelope of the object-oriented audio parameter values of the input object-oriented audio parameter stream and subsequently said envelope is imposed on the object-oriented audio parameter values of the corresponding synchronized stream during the filtering.
- This way of post-processing/filtering of the object-oriented audio parameter values of the synchronized stream ensures that the resulting synchronized parameters possess similar characteristics as the original input parameters.
- it can occur that the input object-oriented audio parameter streams have mutually different frequency resolutions. In this case, the frequency resolutions must be matched, e.g. by means of up-sampling the object-oriented audio parameter values to a higher frequency resolution.
- the up-sampling of object-oriented audio parameter values to a higher frequency resolution has the advantage that it is simple, and does not require an extensive computational effort. Since in most systems, frequency resolutions have common edge/threshold frequencies for parameter bands, the up-sampling can be achieved by simply copying appropriate parameter values.
- the invention further provides device claims, a computer program product enabling a programmable device to perform the method according to the invention, as well as a teleconferencing system comprising a device according to the invention.
- Fig. 1 schematically shows a flow diagram of a method of merging at least two input object-oriented audio parameter streams into an output object-oriented audio parameter stream according to the invention
- Fig. 2 schematically shows an example architecture in which two input object- oriented audio parameter streams are merged into an output object-oriented audio parameter stream;
- Fig. 3 shows object-oriented audio parameter values of the synchronized stream that are obtained by means of interpolation between object-oriented audio parameter values of the corresponding input object-oriented audio parameter stream at the predetermined temporal positions;
- Fig. 4 shows schematically an architecture in which filtering is applied to the object-oriented audio parameter values of the synchronized stream
- Fig. 5 illustrates a use of a Linear Predictive Coding analysis to determine a spectral envelope of the object-oriented audio parameter values of the input object-oriented audio parameter stream and subsequently imposing said envelope on the object-oriented audio parameter values of the corresponding synchronized stream during the filtering;
- Fig. 6 illustrates matching of the mutually different frequency resolutions by means of up-sampling the object-oriented audio parameter values to a higher frequency resolution.
- Fig. 1 schematically shows a flow diagram of a method of merging at least two input object-oriented audio parameter streams into an output object-oriented audio parameter stream according to the invention.
- Each object-oriented audio parameter stream comprises object-oriented audio parameter values.
- Said object-oriented audio parameter values represent statistical properties of audio objects as a function of time.
- the method comprises the following steps.
- the step 110 comprises calculating a synchronized stream for each said input object-oriented audio parameter stream.
- Said synchronized stream has object-oriented parameter values at predetermined temporal positions. Said predetermined temporal positions are the same for all synchronized streams.
- said object-oriented parameter values at the predetermined temporal positions are calculated by means of interpolating of the object-oriented parameter values of the corresponding input object- oriented audio parameter stream.
- the step 120 comprises creating of the output object- oriented audio parameter stream.
- Said output object-oriented audio parameter stream has object-oriented parameter values at the predetermined temporal position obtained by combining the object-oriented audio parameter values of the synchronized streams at said same predetermined temporal position.
- Said combining is realized by means of concatenating of object-oriented audio parameter values corresponding to the input object-oriented audio parameter streams.
- the object-oriented audio parameters as comprised in the object-oriented audio parameter streams are arranged according to time/frequency tiles at each temporal position.
- Each object-oriented audio parameter is associated with an audio object, and each audio object is assigned to one of the time/frequency tiles in turn. Therefore, the concatenation of object-oriented audio parameter values corresponding to the input object- oriented audio parameter streams as used in the step 120 is performed for each of the time/frequency tile separately.
- the sequence of concatenated parameter values for a specific time/frequency tile is arbitrary.
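The two steps described above, synchronization by interpolation (step 110) followed by concatenation per temporal position (step 120), can be sketched as follows. This is a minimal single-band Python sketch, not the patented implementation; the function names, the array layout (temporal positions by objects), and the use of piecewise linear `np.interp` are illustrative assumptions.

```python
import numpy as np

def synchronize(positions, values, ref_positions):
    # step 110: interpolate one input stream's parameter values at the
    # predetermined temporal positions shared by all synchronized streams.
    # values has shape (P, n_objects): one value per object at each of the
    # P temporal positions of this input stream (single frequency band).
    # np.interp is piecewise linear and holds the edge values constant
    # outside the input stream's own temporal range.
    return np.stack([np.interp(ref_positions, positions, values[:, k])
                     for k in range(values.shape[1])], axis=1)

def merge(streams, ref_positions):
    # step 120: concatenate the synchronized parameter values per temporal
    # position along the object axis; the concatenation order is arbitrary.
    return np.concatenate(
        [synchronize(p, v, ref_positions) for p, v in streams], axis=1)

# two hypothetical input streams with misaligned temporal positions
ref = np.array([0.0, 1.0, 2.0])                       # reference positions
s1 = (np.array([0.0, 1.0, 2.0]), np.array([[1.0], [2.0], [3.0]]))
s2 = (np.array([0.5, 1.5]),      np.array([[4.0], [6.0]]))
out = merge([s1, s2], ref)       # shape (3 positions, 2 objects)
```

Note that no stream is delayed: the second stream's values are simply re-evaluated at the reference positions.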
- Fig. 2 schematically shows an example architecture in which two input object- oriented audio parameter streams are merged into an output object-oriented audio parameter stream.
- the terms "object-oriented audio parameters" and "object parameters" will be used interchangeably throughout the present application.
- Object-oriented encoders 210 and 220 at the two participant sides encode various audio object sets 201 and 202 at these two sides, respectively.
- Each of the encoders generates a down-mix stream 211 and 212, respectively, and the corresponding object parameter streams 221 and 222, respectively.
- These two object-oriented audio streams are fed into the unit 230, which performs merging of the streams.
- This unit 230 generates the merged down-mix 213 and the merged object parameter stream.
- the merged object-oriented audio stream can be decoded at the receiving participant side in the object-oriented decoder 240 based on the user data 224.
- Said user data 224 regards e.g. the object positioning in the three-dimensional space.
- a rendered output 214 is generated that is presented to the user.
- the object parameters reflect statistical properties of the audio objects as a function of time. Hence, these parameters are time varying and are valid only for specific time intervals, typically called frames. Said object parameters are valid for a certain frame or even a portion of a frame. In a teleconferencing application, it is very unlikely that the frames of multiple object-oriented audio encoders are perfectly aligned in time. Moreover, different audio objects may give rise to different framing and to different object parameter positions, as object parameter positions are preferably content dependent. Without the synchronization processing of the proposed invention, merging of the object-oriented audio streams would require delaying at least one of the down-mixes and the corresponding object parameters in order to align the frame boundaries of the streams to be merged.
- Fig. 3 shows object-oriented audio parameter values 331, 332, and 333 of the synchronized stream that are obtained by means of interpolation between object-oriented audio parameter values 321, 322, 323, and 324, of the corresponding input object-oriented audio parameter stream at the predetermined temporal positions 351, 352, and 353.
- Fig. 3 depicts object parameter values for three consecutive frames 0, 1, and 2, for input object parameter streams 310 and 320.
- the input object parameter stream 310 is a reference stream that remains intact, while the second input object parameter stream 320 is going to be synchronized so that the new positions of the object parameters align with the object parameters of the input object parameter stream 310.
- the framing and the positioning of the object parameters of the two input streams 310 and 320 are not aligned.
- the framing and parameter positions of the object parameter stream 310 are copied to the synchronized stream 330.
- new object parameter values are interpolated between the object parameter values of the input object parameter stream 320, as indicated by the dashed lines.
- the interpolated object parameter values at the temporal positions 311, 312, and 313, are copied to the synchronized stream 330.
- the predetermined temporal positions correspond to temporal positions of the object-oriented audio parameter values in one of the input object- oriented audio parameter streams.
- both streams could be synchronized at the temporal positions missing in the other object parameter stream.
- Yet another option is to select the temporal positions depending on the density of synchronization positions and/or computational complexity.
- Although in this example two input streams are merged, larger numbers of input streams can be merged as well; the number is very much application dependent. For the teleconferencing application, the number of streams to be merged corresponds to the number of participants a single participant is communicating with.
- the interpolation is piecewise linear.
- interpolation of object parameters can be implemented in a different domain, for example:
- log(σ(n,b,p)) = w_{p-1} · log(σ(n,b,p-1)) + w_{p+1} · log(σ(n,b,p+1)), where σ(n,b,p) denotes the parameter value for object n and frequency band b at temporal position p, and w_{p-1} and w_{p+1} are the interpolation weights of the two neighboring positions.
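A sketch of such log-domain interpolation is given below, assuming the σ values are positive level parameters and that the weights are derived from the temporal distances to the two neighboring positions so that they sum to one (an assumption; the source does not fix the weight definition here).

```python
import numpy as np

def interp_log_domain(p_prev, p_next, p, sigma_prev, sigma_next):
    # weights proportional to the temporal distance to the *other*
    # neighbor, so that w_prev + w_next == 1 (illustrative assumption)
    w_prev = (p_next - p) / (p_next - p_prev)
    w_next = (p - p_prev) / (p_next - p_prev)
    # interpolate in the logarithmic domain, then map back to linear
    return np.exp(w_prev * np.log(sigma_prev) + w_next * np.log(sigma_next))

# halfway between the two neighbors this reduces to the geometric mean:
mid = interp_log_domain(0.0, 2.0, 1.0, 1.0, 4.0)   # -> 2.0
```

Interpolating level-like parameters in the log domain avoids the bias toward the larger value that linear-domain averaging introduces.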
- Fig. 4 shows schematically an architecture in which filtering is applied to the object-oriented audio parameter values of the synchronized stream.
- the applied filtering is adaptive to statistical properties of the corresponding input object-oriented audio parameter values.
- the elements 401 and 402 represent the input object parameter streams, whereby the stream 401 is the reference stream that provides the temporal positions for synchronizing of the second object parameter stream 402. Said temporal positions for synchronizing are comprised in the control parameters denoted by 411.
- the object parameters with the corresponding temporal positions 412 of the object parameter stream 402 are fed into synchronization unit 410 and statistics processing unit 420.
- the unit 410 synchronizes the object parameter stream 402 using an interpolation, e.g. a simple piecewise linear interpolation.
- the resulting synchronized stream 431 is fed into the statistics processing unit 420 and to a filter 440.
- the filter 440 is preferably a low-order high-pass filter.
- filter coefficients 432 are determined by the unit 420 and provided to the filter 440.
- Said filter 440 generates at its output synchronized object parameter values 441 that exhibit a similar dynamic behavior as in the original input object-oriented audio parameter stream.
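A minimal sketch of such a low-order high-pass compensation follows. The emphasis coefficient `alpha` and the variance-matching step standing in for the statistics processing unit 420 are illustrative assumptions, not the patented filter design.

```python
import numpy as np

def compensate_interpolation_smoothing(sync_vals, orig_vals, alpha=0.5):
    # low-order high-pass emphasis: boost the frame-to-frame differences
    # that piecewise linear interpolation tends to smooth away
    diff = np.diff(sync_vals, prepend=sync_vals[:1])
    boosted = sync_vals + alpha * diff
    # crude stand-in for the statistics processing unit 420: rescale so
    # the filtered values match the spread of the original parameters
    s_orig, s_boost = np.std(orig_vals), np.std(boosted)
    if s_boost > 0.0:
        boosted = (np.mean(boosted)
                   + (boosted - np.mean(boosted)) * (s_orig / s_boost))
    return boosted
```

The rescaling step is the "adaptive" part: the filter output is matched to a statistic of the original input parameter values.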
- Fig. 5 illustrates a use of a Linear Predictive Coding analysis to determine a spectral envelope of the object-oriented audio parameter values of the input object-oriented audio parameter stream and subsequently imposing said envelope on the object-oriented audio parameter values of the corresponding synchronized stream during the filtering.
- temporal positions 511 are derived and fed into the synchronization unit 510.
- object parameters and the corresponding temporal positions 512 are derived and fed into the synchronization unit 510.
- the synchronization unit 510 synchronizes the object parameters of the stream 502 at the positions 511.
- an LPC analysis is conducted on both the original input object parameters as well as the synchronized object parameters in the units 520 and 540, respectively.
- the filtering of the synchronized object parameters 531 in the unit 550, based on the LPC analysis, results in the so-called spectral whitening of the values provided at its input.
- the filter unit 550 acts as a spectral whitening filter.
- the second filter unit 570 imposes the spectral envelope of the original object parameters of the stream 502 onto the whitened object parameters 562.
- the two filter stages 550 and 570 can be combined into a single filter unit.
- the LPC analysis is preferably conducted on an auto-correlation.
- the auto-correlation estimates are performed using a sliding window over time.
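The LPC-based whitening and envelope imposition described for the units 520/540 and 550/570 can be sketched as follows, using the autocorrelation (Levinson-Durbin) method over a whole parameter track rather than a sliding window; the function names and the model order are illustrative assumptions.

```python
import numpy as np

def lpc(x, order):
    # autocorrelation-method LPC via the Levinson-Durbin recursion;
    # returns the prediction-error filter A(z) = [1, a1, ..., a_order]
    n = len(x)
    r = np.array([np.dot(x[:n - k], x[k:]) for k in range(order + 1)])
    a = np.zeros(order + 1)
    a[0] = 1.0
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + np.dot(a[1:i], r[i - 1:0:-1])
        k = -acc / err
        a[1:i] = a[1:i] + k * a[1:i][::-1]
        a[i] = k
        err *= 1.0 - k * k
    return a

def whiten(x, a):
    # FIR analysis filtering with A(z): removes the envelope (cf. unit 550)
    return np.convolve(x, a)[:len(x)]

def impose_envelope(residual, a):
    # IIR synthesis filtering with 1/A(z): imposes an envelope (cf. unit 570)
    y = np.zeros_like(residual)
    order = len(a) - 1
    for n in range(len(residual)):
        y[n] = residual[n]
        for j in range(1, min(order, n) + 1):
            y[n] -= a[j] * y[n - j]
    return y

# hypothetical parameter tracks: the synchronized track is a smoothed copy
rng = np.random.default_rng(0)
orig_params = rng.standard_normal(64)
sync_params = np.convolve(orig_params, np.ones(3) / 3.0)[:64]
restored = impose_envelope(whiten(sync_params, lpc(sync_params, 4)),
                           lpc(orig_params, 4))
```

Whitening with the synchronized track's own A(z) and resynthesizing with the original track's 1/A(z) imposes the original envelope, which is the behavior the two filter stages 550 and 570 describe.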
- the input object-oriented audio parameter streams have mutually different frequency resolutions.
- the input object-oriented audio streams to be merged are encoded by different object-oriented encoders, which might use mutually different frequency resolutions.
- alternatively, the mutually different frequency resolutions are matched by means of averaging the object-oriented audio parameter values of the stream with the higher frequency resolution down to the lower frequency resolution. It is a very simple way of dealing with this issue that does not require excessive computational effort.
- the object-oriented audio parameters comprise at least (relative) level information defined for separate time/frequency tiles of various audio objects. Said level information is related to an amount of energy in the time/frequency tile of the audio object that the parameter value refers to.
- Said frequency resolution used for the time/frequency tiles can differ between streams, or in other words the splitting into frequency bands can differ. The number of frequency bands as well as their widths can vary for different resolutions.
- Fig. 6 illustrates matching of the mutually different frequency resolutions by means of up-sampling the object-oriented audio parameter values to a higher frequency resolution. Since in most systems frequency resolutions have common edge frequencies for parameter bands up-sampling can be achieved by copying of appropriate object-oriented audio parameter values.
- the reference input object parameter stream has 9 frequency bands depicted by 610 in Fig. 6.
- the second input object parameter stream has a lower frequency resolution 630 with 6 bands. Given that the common edge frequencies of both object parameter streams are aligned, the up-sampled frequency resolution of the second stream is obtained by copying parameter values, depicted by dashed arrows, to overlapping frequency bands as depicted in 620 of the figure.
- the band b1 in 630 is an equivalent of the two bands b1 and b2 in 610. Therefore the left object parameter value from b1 of 630 is copied to the b1 band of 620, while the right object parameter value from b1 of 630 is copied to the b2 band of 620.
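A sketch of this copy-based up-sampling is shown below, assuming a hypothetical mapping from 6 coarse bands to 9 fine bands with aligned edge frequencies; the exact band layout of Fig. 6 is not reproduced here.

```python
import numpy as np

def upsample_bands(values, band_map):
    # for each fine band i, band_map[i] is the index of the coarse band
    # covering it; indexing copies the coarse value into every fine band
    return np.asarray(values)[np.asarray(band_map)]

coarse = np.array([0.9, 0.7, 0.5, 0.4, 0.3, 0.2])  # 6-band stream (630)
band_map = [0, 0, 1, 2, 2, 3, 4, 4, 5]  # e.g. coarse b1 covers fine b1, b2
fine = upsample_bands(coarse, band_map)            # 9-band resolution (620)
```

Because the matching reduces to an index lookup, it requires no arithmetic on the parameter values at all.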
- the proposed method of merging at least two input object-oriented audio parameter streams into an output object-oriented audio parameter stream can be realized in a device that comprises: synchronizing means and combining means.
- the synchronizing means calculate the synchronized stream for each input object parameter stream.
- Said synchronized stream has object-oriented parameter values at the predetermined temporal positions.
- Said predetermined temporal positions are the same for all synchronized streams.
- For each synchronized stream said object-oriented parameter values at the predetermined temporal positions are calculated by means of interpolating of the object-oriented parameter values of the corresponding input object parameter stream.
- the combining means create the output object parameter stream.
- the output object parameter stream has object-oriented parameter values at the predetermined temporal position obtained by combining the object-oriented audio parameter values of the synchronized streams at said same predetermined temporal position. Said combining is realized by means of concatenating of object-oriented audio parameter values corresponding to the input object-oriented audio parameter streams.
- a computer program product enables a programmable device to perform the method according to the invention.
- a teleconferencing system comprises a device according to the invention.
- any reference signs placed between parentheses shall not be construed as limiting the claim.
- the word “comprising” does not exclude the presence of elements or steps other than those listed in a claim.
- the word “a” or “an” preceding an element does not exclude the presence of a plurality of such elements.
- the invention can be implemented by means of hardware comprising several distinct elements, and by means of a suitably programmed computer.
Abstract
A method of merging at least two input object-oriented audio parameter streams into an output object-oriented audio parameter stream is disclosed. Each object-oriented audio parameter stream comprises object-oriented audio parameter values. Said object-oriented audio parameter values represent statistical properties of audio objects as a function of time. Said method comprises the following steps. First, calculating a synchronized stream for each said input object-oriented audio parameter stream takes place. Said synchronized stream has object-oriented parameter values at predetermined temporal positions. Said predetermined temporal positions are the same for all synchronized streams. For each synchronized stream said object-oriented parameter values at the predetermined temporal positions are calculated by means of interpolating of the object-oriented parameter values of the corresponding input object-oriented audio parameter stream. Second, creating of the output object-oriented audio parameter stream is performed. Said output object-oriented audio parameter stream has object-oriented parameter values at the predetermined temporal position obtained by combining the object-oriented audio parameter values of the synchronized streams at said same predetermined temporal position.
Description
A method of merging at least two input object-oriented audio parameter streams into an output object-oriented audio parameter stream
TECHNICAL FIELD
The invention relates to a method of merging at least two input object-oriented audio parameter streams into an output object-oriented audio parameter stream, each object- oriented audio parameter stream comprising object-oriented audio parameter values, said object-oriented audio parameter values representing statistical properties of audio objects as a function of time.
TECHNICAL BACKGROUND
Recently, techniques for processing and manipulating of individual audio objects at the decoding side have attracted significant interest. For example, within the
MPEG framework, a workgroup has been started on object-based spatial audio coding. The aim of this workgroup is to "explore new technology and reuse of current MPEG Surround components and technologies for the bit rate efficient coding of multiple sound sources or objects into a number of down-mix channels and corresponding spatial parameters". In other words, the aim is to encode multiple audio objects in a limited set of down-mix channels with corresponding parameters. At the decoder side, users may interact with the content for example by repositioning the individual audio objects.
Such interaction with the content is easily realized in object-oriented decoders. It is then realized by including a rendering step that follows the decoding process. Said rendering is combined with the decoding as a single processing step to prevent the need of determining individual objects. For loudspeaker playback, such combination is described in Faller, C., "Parametric joint-coding of audio sources", Proc. 120th AES Convention, Paris, France, May 2006. For headphone playback, an efficient combination of decoding and head-related transfer function processing is described in Breebaart, J., Herre, J., Villemoes, L., Jin, C., Kjörling, K., Plogsties, J., Koppens, J. (2006), "Multi-channel goes mobile: MPEG Surround binaural rendering", Proc. 29th AES Conference, Seoul, Korea.
When applying the object-oriented approach, as described above, to teleconferencing, it can be desired to merge multiple object-oriented audio streams, each stream comprising its own down-mix and object-oriented audio parameter stream, stemming from different far ends. As a result, a single object-oriented audio stream is created, which can be further decoded by an object-oriented decoder. Typically, a down-mix audio stream employs a frame-based structure. The object-oriented audio parameters as provided by object-oriented coding form a stream that reflects statistical properties of the audio objects as a function of time. Therefore, these object-oriented audio parameters are valid for one certain frame or even for a portion of a frame. In teleconferencing applications, it is very unlikely that the framing as used by multiple object-oriented audio encoders at far ends is perfectly aligned in time. Moreover, different audio objects may give rise to different framing and different object-oriented audio parameter positions, since said parameter positions are preferably content dependent. Merging of the object-oriented audio streams, said streams comprising down-mixes and the corresponding object-oriented audio parameter streams, would require delaying at least one of the object-oriented audio streams in order to align the frame boundaries of said streams. The disadvantage of such a merging method is that an additional delay is introduced, which is rather undesirable in real-time telecommunication/teleconferencing systems.
SUMMARY OF THE INVENTION
It is an object of the invention to provide an enhanced method of merging at least two object-oriented audio parameter streams which does not require delaying of object- oriented audio parameter streams in order to align frame boundaries.
This object is achieved by a method of merging at least two input object-oriented audio parameter streams into an output object-oriented audio parameter stream according to the invention as defined in claim 1. Each object-oriented audio parameter stream comprises object-oriented audio parameter values. The object-oriented audio parameter values represent statistical properties of audio objects as a function of time. The method comprises the following steps. First, a synchronized stream is calculated for each said input object-oriented audio parameter stream. Said synchronized stream has object-oriented parameter values at predetermined temporal positions. Said predetermined temporal positions are the same for all synchronized streams. For each synchronized stream, said object-oriented parameter values at the predetermined temporal positions are calculated by interpolating the object-oriented parameter values of the corresponding input object-oriented audio parameter stream. Second, the output object-oriented audio parameter stream is created. Said output object-oriented audio parameter stream has object-oriented parameter values at the predetermined temporal positions, obtained by combining the object-oriented audio parameter values of the synchronized streams at said same predetermined temporal positions.
The advantage of the method of merging at least two input object-oriented audio parameter streams according to the invention is that no delaying of the object-oriented audio parameter streams is required in order to merge said streams. Instead, very simple processing is performed to obtain the object-oriented audio parameter values of the synchronized streams. An additional benefit of the proposed method is that it overcomes the problem of different object-oriented audio parameter positions within frames across the object-oriented audio parameter streams. In an embodiment, filtering is applied to the object-oriented audio parameter values of the synchronized stream. Applying e.g. a simple piecewise linear interpolation has the effect that the resulting interpolated object-oriented audio parameter values are low-pass filtered. Hence, in the case of linear interpolation, high-pass filtering can be employed to reduce the effect of the low-pass filtering due to interpolation. The advantage of applying filtering to the object-oriented audio parameter values of the synchronized stream is that it ensures a dynamic behavior similar to that of the corresponding input object-oriented audio parameter stream. In other words, it improves the quality of synchronization, as it helps the object-oriented audio parameter values of the synchronized stream to resemble the original behavior of said parameters. Said synchronization process provides the synchronized stream for the corresponding input object-oriented audio parameter stream.
In an embodiment, the applied filtering is adaptive in order to match statistical properties of the corresponding input object-oriented audio parameter values.
Using the adaptive filtering brings further improvement into the synchronization process. Since the object-oriented audio parameters reflect statistical properties of the audio objects as a function of time, it is desired that the synchronization process takes these fluctuations over time into account.
In an embodiment, a Linear Predictive Coding analysis is used to determine an envelope of the object-oriented audio parameter values of the input object-oriented audio parameter stream, and subsequently said envelope is imposed on the object-oriented audio parameter values of the corresponding synchronized stream during the filtering. This way of post-processing/filtering the object-oriented audio parameter values of the synchronized stream ensures that the resulting synchronized parameters possess characteristics similar to those of the original input parameters.
In an embodiment, mutually different frequency resolutions are matched by up-sampling the object-oriented audio parameter values to the higher frequency resolution. The input object-oriented audio parameter streams may have mutually different frequency resolutions; in this case, the frequency resolutions must be matched. Up-sampling the object-oriented audio parameter values to a higher frequency resolution has the advantage that it is simple and does not require extensive computational effort. Since in most systems the frequency resolutions have common edge/threshold frequencies for parameter bands, the up-sampling can be achieved by simply copying the appropriate parameter values. The invention further provides device claims, a computer program product enabling a programmable device to perform the method according to the invention, as well as a teleconferencing system comprising a device according to the invention.
BRIEF DESCRIPTION OF THE DRAWINGS
These and other aspects of the invention will be apparent from and elucidated with reference to the embodiments shown in the drawings, in which:
Fig. 1 schematically shows a flow diagram of a method of merging at least two input object-oriented audio parameter streams into an output object-oriented audio parameter stream according to the invention;
Fig. 2 schematically shows an example architecture in which two input object-oriented audio parameter streams are merged into an output object-oriented audio parameter stream;
Fig. 3 shows object-oriented audio parameter values of the synchronized stream that are obtained by means of interpolation between object-oriented audio parameter values of the corresponding input object-oriented audio parameter stream at the predetermined temporal positions;
Fig. 4 shows schematically an architecture in which filtering is applied to the object-oriented audio parameter values of the synchronized stream;
Fig. 5 illustrates a use of a Linear Predictive Coding analysis to determine a spectral envelope of the object-oriented audio parameter values of the input object-oriented audio parameter stream and subsequently imposing said envelope on the object-oriented audio parameter values of the corresponding synchronized stream during the filtering;
Fig. 6 illustrates matching of the mutually different frequency resolutions by means of up-sampling the object-oriented audio parameter values to a higher frequency resolution.
Throughout the figures, same reference numerals indicate similar or corresponding features. Some of the features indicated in the drawings are typically implemented in software, and as such represent software entities, such as software modules or objects.
DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS Fig. 1 schematically shows a flow diagram of a method of merging at least two input object-oriented audio parameter streams into an output object-oriented audio parameter stream according to the invention. Each object-oriented audio parameter stream comprises object-oriented audio parameter values. Said object-oriented audio parameter values represent statistical properties of audio objects as a function of time. The method comprises the following steps. Step 110 comprises calculating a synchronized stream for each said input object-oriented audio parameter stream. Said synchronized stream has object-oriented parameter values at predetermined temporal positions. Said predetermined temporal positions are the same for all synchronized streams. For each synchronized stream, said object-oriented parameter values at the predetermined temporal positions are calculated by interpolating the object-oriented parameter values of the corresponding input object-oriented audio parameter stream. Step 120 comprises creating the output object-oriented audio parameter stream. Said output object-oriented audio parameter stream has object-oriented parameter values at the predetermined temporal positions, obtained by combining the object-oriented audio parameter values of the synchronized streams at said same predetermined temporal positions. Said combining is realized by concatenating the object-oriented audio parameter values corresponding to the input object-oriented audio parameter streams. The object-oriented audio parameters comprised in the object-oriented audio parameter streams are arranged according to time/frequency tiles at each temporal position. Each object-oriented audio parameter is associated with an audio object, and each audio object is in turn assigned to one of the time/frequency tiles.
Therefore, the concatenation of object-oriented audio parameter values corresponding to the input object-oriented audio parameter streams, as used in step 120, is performed for each time/frequency tile separately. The sequence of concatenated parameter values for a specific time/frequency tile is arbitrary.
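Steps 110 and 120 can be illustrated with a minimal Python sketch. This is not code from the patent: the data layout (one list of parameter values per object per stream), the function names, and the choice of piecewise linear interpolation as the synchronization method are illustrative assumptions.

```python
import bisect

def interpolate(positions, values, t):
    """Piecewise-linearly interpolate a parameter track at time t."""
    if t <= positions[0]:
        return values[0]
    if t >= positions[-1]:
        return values[-1]
    i = bisect.bisect_right(positions, t)
    t0, t1 = positions[i - 1], positions[i]
    w = (t - t0) / (t1 - t0)
    return (1 - w) * values[i - 1] + w * values[i]

def merge_streams(streams, target_positions):
    """Step 110: synchronize each stream to the common predetermined
    positions; step 120: concatenate the per-object values per position."""
    merged = []
    for t in target_positions:
        tile = []
        for positions, tracks in streams:
            # each stream carries one parameter track per audio object
            tile.extend(interpolate(positions, trk, t) for trk in tracks)
        merged.append(tile)
    return merged
```

For example, a stream sampled at positions [0, 10, 20] and one sampled only at [0, 20] can both be evaluated at the first stream's positions, after which the values are concatenated per temporal position.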
Fig. 2 schematically shows an example architecture in which two input object-oriented audio parameter streams are merged into an output object-oriented audio parameter stream. When applying the object-oriented approach in teleconferencing applications, multiple object-oriented audio streams stemming from different far ends have to be combined into a single stream. Each stream comprises its own down-mix stream and the corresponding object-oriented audio parameter stream. The term object-oriented audio parameters is used interchangeably with object parameters throughout the present application. Object-oriented encoders 210 and 220 at the two participant sides encode various audio object sets 201 and 202 at these two sides, respectively. Each of the encoders generates a down-mix stream 211 and 212, respectively, and the corresponding object parameter streams 221 and 222, respectively. These two object-oriented audio streams are fed into the unit 230, which performs the merging of the streams. An example of such a unit from the teleconferencing domain is a Multipoint Control Unit (MCU). This unit 230 generates the merged down-mix 213 and the merged object parameter stream. Subsequently, the merged object-oriented audio stream can be decoded at the receiving participant side in the object-oriented decoder 240 based on the user data 224. Said user data 224 concerns e.g. the positioning of the objects in three-dimensional space. As a result of the decoding, a rendered output 214 is generated that is presented to the user.
The object parameters reflect statistical properties of the audio objects as a function of time. Hence, these parameters are time varying and are valid only for specific time intervals. These time intervals are typically called frames. Said object parameters are valid for a certain frame or even for a portion of a frame. In a teleconferencing application, it is very unlikely that the frames of multiple object-oriented audio encoders are perfectly aligned in time. Moreover, different audio objects may give rise to different framing and to different object parameter positions, as object parameter positions are preferably content dependent. Without the synchronization processing of the proposed invention, merging the object-oriented audio streams would require delaying at least one of the down-mixes and the corresponding object parameters in order to align the frame boundaries of the streams to be merged. The advantage of the method of merging at least two object-oriented audio parameter streams according to the invention is that no delaying of the object-oriented audio parameter streams is required in order to merge said streams. Instead, very simple processing is performed to obtain the object-oriented audio parameter values of the synchronized stream. An additional benefit of the proposed method is that it overcomes the problem of different object-oriented audio parameter positions within frames across the object-oriented audio parameter streams.
Fig. 3 shows object-oriented audio parameter values 331, 332, and 333 of the synchronized stream that are obtained by means of interpolation between object-oriented audio parameter values 321, 322, 323, and 324, of the corresponding input object-oriented audio parameter stream at the predetermined temporal positions 351, 352, and 353.
Fig. 3 depicts object parameter values for three consecutive frames 0, 1, and 2 of the input object parameter streams 310 and 320. The input object parameter stream 310 is a reference stream that remains intact, while the second input object parameter stream 320 is to be synchronized so that the new positions of its object parameters align with the object parameters of the input object parameter stream 310. The framing and the positioning of the object parameters of the two input streams 310 and 320 are not aligned. The framing and parameter positions of the object parameter stream 310 are copied to the synchronized stream 330. The synchronized object parameter values are obtained by interpolating between the object parameter values of the input object parameter stream 320, as indicated by the dashed lines. The interpolated object parameter values at the temporal positions 311, 312, and 313 are copied to the synchronized stream 330.
In an embodiment, the predetermined temporal positions correspond to temporal positions of the object-oriented audio parameter values in one of the input object-oriented audio parameter streams. However, other scenarios for determining the predetermined temporal positions are also possible. For example, both streams could be synchronized at the temporal positions missing in the other object parameter stream. Yet another option is to select the temporal positions depending on the desired density of synchronization positions and/or the computational complexity. Further, although in the discussed example only two input streams are merged, larger numbers of input streams can also be merged; this is very much application dependent. In a teleconferencing application, the number of streams to be merged equals the number of participants a single participant is communicating with.
In an embodiment, the interpolation is piecewise linear. For object parameter values denoted by σ(n,b,p−1) and σ(n,b,p+1), with n the object number, b the parameter band number, and p the parameter index, the interpolated value is calculated as:
σ(n,b,p) = w_{p−1} σ(n,b,p−1) + w_{p+1} σ(n,b,p+1),
with w_{p−1}, w_{p+1} two interpolation weights that sum to 1:
w_{p−1} + w_{p+1} = 1,
and w_{p−1}, w_{p+1} inversely proportional to the distance (in samples) of the new parameter position p to the parameter positions p−1 and p+1, respectively. The advantage of this method of synchronization is that it is simple and does not require extensive computational effort.
Alternatively, interpolation of object parameters can be implemented in a different domain, for example:
σ²(n,b,p) = w_{p−1} σ²(n,b,p−1) + w_{p+1} σ²(n,b,p+1),
or
log(σ(n,b,p)) = w_{p−1} log(σ(n,b,p−1)) + w_{p+1} log(σ(n,b,p+1)).
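The interpolation weights and the three interpolation variants (linear, power domain, log domain) can be transcribed into a short Python sketch. This is an illustrative transcription of the formulas above, not code from the patent; the function names are my own.

```python
import math

def interp_weights(p_prev, p_next, p):
    """Weights inversely proportional to distance, summing to 1."""
    d = p_next - p_prev
    return (p_next - p) / d, (p - p_prev) / d

def interp_linear(s_prev, s_next, w_prev, w_next):
    # interpolate sigma directly
    return w_prev * s_prev + w_next * s_next

def interp_power(s_prev, s_next, w_prev, w_next):
    # interpolate sigma^2, then take the square root
    return math.sqrt(w_prev * s_prev ** 2 + w_next * s_next ** 2)

def interp_log(s_prev, s_next, w_prev, w_next):
    # interpolate log(sigma), then exponentiate
    return math.exp(w_prev * math.log(s_prev) + w_next * math.log(s_next))
```

A new position midway between its neighbors yields the weights (0.5, 0.5); the three variants then give the arithmetic, quadratic, and geometric means of the neighboring values, respectively.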
Fig. 4 schematically shows an architecture in which filtering is applied to the object-oriented audio parameter values of the synchronized stream. In an embodiment, the applied filtering is adaptive to statistical properties of the corresponding input object-oriented audio parameter values. The elements 401 and 402 represent the input object parameter streams, whereby the stream 401 is the reference stream that provides the temporal positions for synchronizing the second object parameter stream 402. Said temporal positions for synchronizing are comprised in the control parameters denoted by 411. The object parameters with the corresponding temporal positions 412 of the object parameter stream 402 are fed into the synchronization unit 410 and the statistics processing unit 420. The unit 410 synchronizes the object parameter stream 402 using interpolation. Applying e.g. a simple piecewise linear interpolation has the effect that the resulting interpolated object-oriented audio parameter values are low-pass filtered. The resulting synchronized stream 431 is fed into the statistics processing unit 420 and into a filter 440. The filter 440 is preferably a low-order high-pass filter. The advantage of applying filtering to the object-oriented audio parameter values of the synchronized stream is that it ensures a dynamic behavior similar to that of the corresponding input object-oriented audio parameter stream. In other words, it improves the quality of the synchronization, as it helps the object-oriented audio parameter values of the synchronized stream to resemble the original behavior of said parameters in the input stream. Based on the statistics gathered by the unit 420, filter coefficients 432 are determined by the unit 420 and provided to the filter 440. Said filter 440 generates at its output synchronized object parameter values 441 that exhibit a dynamic behavior similar to that of the original input object-oriented audio parameter stream.
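The patent does not specify the coefficients of the low-order high-pass filter 440; the sketch below is one plausible stand-in, a first-difference high-frequency emphasis that adds back some of the fast parameter variation smoothed away by the interpolation. The function name and the coefficient beta are illustrative assumptions, not the patent's filter.

```python
def emphasize(values, beta=0.5):
    """High-frequency emphasis applied to a synchronized parameter track:
    each sample gets a scaled first difference added back, countering
    the low-pass effect of piecewise linear interpolation."""
    out = [values[0]]
    for k in range(1, len(values)):
        out.append(values[k] + beta * (values[k] - values[k - 1]))
    return out
```

In the adaptive embodiment, beta would be derived from the statistics gathered by unit 420 rather than fixed.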
Fig. 5 illustrates the use of a Linear Predictive Coding (LPC) analysis to determine a spectral envelope of the object-oriented audio parameter values of the input object-oriented audio parameter stream, which envelope is subsequently imposed on the object-oriented audio parameter values of the corresponding synchronized stream during the filtering. From the reference input object parameter stream 501, temporal positions 511 are derived and fed into the synchronization unit 510. From the input object parameter stream 502, object parameters and the corresponding temporal positions 512 are derived and fed into the synchronization unit 510. The synchronization unit 510 synchronizes the object parameters of the stream 502 at the positions 511. In order to obtain similar characteristics for the synchronized object parameters as for the original input object parameters, an LPC analysis is conducted on both the original input object parameters and the synchronized object parameters, in the units 520 and 540, respectively. Filtering the synchronized object parameters 531 in the unit 550 results in the so-called spectral whitening of the values provided at its input; the filter unit 550 acts as a spectral whitening filter. Applying the inverse (synthesis) filter to spectrally white input yields the spectral envelope of the original object parameters. Hence, the second filter unit 570 imposes the spectral envelope of the original object parameters of the stream 502 onto the whitened object parameters 562. The two filter stages 550 and 570 can be combined into a single filter unit. The LPC analysis is preferably conducted on an auto-correlation, and preferably the auto-correlation estimates are obtained using a sliding window over time.
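The whitening/synthesis chain of units 540, 550, and 570 can be sketched with a first-order LPC model. This is a deliberately minimal stand-in, not the patent's implementation: a real system would use a higher-order model on sliding-window auto-correlation estimates, and the function names below are my own.

```python
def lpc1(values):
    """First-order LPC coefficient from the lag-0/lag-1 autocorrelation."""
    r0 = sum(v * v for v in values)
    r1 = sum(values[k] * values[k + 1] for k in range(len(values) - 1))
    return r1 / r0 if r0 else 0.0

def whiten(values, a):
    """Analysis (whitening) filter: e[k] = x[k] - a*x[k-1]."""
    return [values[0]] + [values[k] - a * values[k - 1]
                          for k in range(1, len(values))]

def impose(residual, a):
    """Synthesis filter, the inverse of whiten: y[k] = e[k] + a*y[k-1]."""
    out = [residual[0]]
    for e in residual[1:]:
        out.append(e + a * out[-1])
    return out

def match_envelope(synced, original):
    """Whiten the synchronized track with its own LPC model (unit 550),
    then impose the envelope estimated from the original track (unit 570)."""
    return impose(whiten(synced, lpc1(synced)), lpc1(original))
```

Note that `impose` exactly undoes `whiten` for the same coefficient, which is the property the two-stage filter relies on: whitening removes the synchronized track's envelope so that the synthesis stage can substitute the original one.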
In an embodiment, the input object-oriented audio parameter streams have mutually different frequency resolutions. The input object-oriented audio streams to be merged are encoded by different object-oriented encoders, which may use mutually different frequency resolutions. In an embodiment, the mutually different frequency resolutions are matched by averaging the object-oriented audio parameter values of the stream with the higher frequency resolution. This is a very simple way of dealing with the issue that does not require extensive computational effort.
The object-oriented audio parameters comprise at least (relative) level information defined for separate time/frequency tiles of the various audio objects. Said level information is related to the amount of energy in the time/frequency tile of the audio object that the parameter value refers to. The frequency dimension of the time/frequency tiles can have different resolutions, or in other words, different splittings into frequency bands. The number of frequency bands as well as their widths can vary between resolutions.
Fig. 6 illustrates the matching of mutually different frequency resolutions by up-sampling the object-oriented audio parameter values to the higher frequency resolution. Since in most systems the frequency resolutions have common edge frequencies for the parameter bands, up-sampling can be achieved by copying the appropriate object-oriented audio parameter values. The reference input object parameter stream has 9 frequency bands, depicted by 610 in Fig. 6. The second input object parameter stream has a lower frequency resolution 630 with 6 bands. Given that the common edge frequencies of both object parameter streams are aligned, the up-sampled frequency resolution of the second stream is obtained by copying parameter values, depicted by dashed arrows, to the overlapping frequency bands as depicted in 620 of the figure. The band b1 in 630 is the equivalent of the two bands b1 and b2 in 610. Therefore, the left object parameter value from b1 of 630 is copied to the b1 band of 620, while the right object parameter value from b1 of 630 is copied to the b2 band of 620.
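The band-copying step can be sketched as follows, assuming (as the text states) that the band edges of the coarse resolution also occur in the fine resolution. The function name and the edge-list representation are illustrative assumptions.

```python
def upsample_bands(values, coarse_edges, fine_edges):
    """Up-sample per-band parameter values to a finer band layout by
    copying each coarse band's value into every fine band it covers.
    Assumes each coarse edge also appears among the fine edges."""
    out = []
    for b in range(len(fine_edges) - 1):
        lo = fine_edges[b]
        # find the coarse band whose frequency range covers this fine band
        for c in range(len(coarse_edges) - 1):
            if coarse_edges[c] <= lo < coarse_edges[c + 1]:
                out.append(values[c])
                break
    return out
```

For example, a coarse layout with edges [0, 2000, 8000] Hz up-sampled to edges [0, 1000, 2000, 4000, 8000] Hz simply duplicates each coarse value into the two fine bands it spans.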
The proposed method of merging at least two input object-oriented audio parameter streams into an output object-oriented audio parameter stream can be realized in a device that comprises synchronizing means and combining means. The synchronizing means calculate the synchronized stream for each input object parameter stream. Said synchronized stream has object-oriented parameter values at the predetermined temporal positions. Said predetermined temporal positions are the same for all synchronized streams. For each synchronized stream, said object-oriented parameter values at the predetermined temporal positions are calculated by interpolating the object-oriented parameter values of the corresponding input object parameter stream. The combining means create the output object parameter stream. The output object parameter stream has object-oriented parameter values at the predetermined temporal positions, obtained by combining the object-oriented audio parameter values of the synchronized streams at said same predetermined temporal positions. Said combining is realized by concatenating the object-oriented audio parameter values corresponding to the input object-oriented audio parameter streams.
In an embodiment, a computer program product executes the method according to the invention.
In an embodiment, a teleconferencing system comprises a device according to the invention.
Although this application has focused on the merging of the input object-oriented audio parameter streams, for teleconferencing applications the down-mix audio streams corresponding to the object-oriented audio parameter streams also need to be merged. A method similar to the one proposed for merging the object-oriented parameter streams can be applied for merging the down-mixes.
It should be noted that the above-mentioned embodiments illustrate rather than limit the invention, and those skilled in the art will be able to design many alternative embodiments without departing from the scope of the appended claims.
In the accompanying claims, any reference signs placed between parentheses shall not be construed as limiting the claim. The word "comprising" does not exclude the presence of elements or steps other than those listed in a claim. The word "a" or "an" preceding an element does not exclude the presence of a plurality of such elements. The invention can be implemented by means of hardware comprising several distinct elements, and by means of a suitably programmed computer.
Claims
1. A method of merging at least two input object-oriented audio parameter streams into an output object-oriented audio parameter stream, each object-oriented audio parameter stream comprising object-oriented audio parameter values, said object-oriented audio parameter values representing statistical properties of audio objects as a function of time, said method comprising the steps of: calculating a synchronized stream for each said input object-oriented audio parameter stream; said synchronized stream having object-oriented parameter values at predetermined temporal positions; said predetermined temporal positions being the same for all synchronized streams; for each synchronized stream said object-oriented parameter values at the predetermined temporal positions being calculated by means of interpolating of the object-oriented parameter values of the corresponding input object-oriented audio parameter stream; creating of the output object-oriented audio parameter stream; said output object-oriented audio parameter stream having object-oriented parameter values at the predetermined temporal position obtained by combining the object-oriented audio parameter values of the synchronized streams at said same predetermined temporal position.
2. A method as claimed in claim 1, wherein the predetermined temporal positions correspond to temporal positions of the object-oriented audio parameter values in one of the input object-oriented audio parameter streams.
3. A method as claimed in claim 1, wherein the interpolation is piecewise linear.
4. A method as claimed in claim 1, wherein filtering is applied to the object- oriented audio parameter values of the synchronized stream.
5. A method as claimed in claim 4, wherein the applied filtering is adaptive in order to match statistical properties of the corresponding input object-oriented audio parameter values.
6. A method as claimed in claim 5, wherein a Linear Predictive Coding analysis is used to determine a spectral envelope of the object-oriented audio parameter values of the input object-oriented audio parameter stream and subsequently said envelope is imposed on the object-oriented audio parameter values of the corresponding synchronized stream during the filtering.
7. A method as claimed in claim 1, wherein the input object-oriented audio parameter streams have mutually different frequency resolutions.
8. A method as claimed in claim 7, wherein the mutually different frequency resolutions are matched by means of averaging the object-oriented audio parameter values for a higher frequency resolution.
9. A method as claimed in claim 7, wherein the mutually different frequency resolutions are matched by means of up-sampling the object-oriented audio parameter values to a higher frequency resolution.
10. A method as claimed in claim 9, wherein up-sampling is achieved by copying of appropriate object-oriented audio parameter values.
11. A device for merging at least two input object-oriented audio parameter streams into an output object-oriented audio parameter stream, each object-oriented audio parameter stream comprising object-oriented audio parameter values, said object-oriented audio parameter values representing statistical properties of audio objects as a function of time, said device comprising: synchronizing means for calculating a synchronized stream for each said input object-oriented audio parameter stream; said synchronized stream having object-oriented parameter values at predetermined temporal positions; said predetermined temporal positions being the same for all synchronized streams; for each synchronized stream said object- oriented parameter values at the predetermined temporal positions being calculated by means of interpolating of the object-oriented parameter values of the corresponding input object- oriented audio parameter stream; combining means for creating of the output object-oriented audio parameter stream; said output object-oriented audio parameter stream having object-oriented parameter values at the predetermined temporal position obtained by combining the object-oriented audio parameter values of the synchronized streams at said same predetermined temporal position.
12. A device as claimed in claim 11, wherein said device further comprises filtering means for filtering of the object-oriented audio parameter values of the synchronized streams.
13. A device as claimed in claim 12, wherein said filtering means are configured to adapt to statistical properties of the corresponding object-oriented audio parameter values.
14. A device as claimed in claim 12, wherein said device further comprises matching means for matching frequency resolutions of the input object-oriented audio parameter streams when said resolutions are mutually different.
15. A computer program product for executing the method of any of claims 1-10.
16. A teleconferencing system comprising a device according to any of claims 11-14.
Applications Claiming Priority (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| EP07111149 | 2007-06-27 | ||
| EP07111149.6 | 2007-06-27 |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2009001292A1 (en) | 2008-12-31 |
Family
ID=39768243
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| PCT/IB2008/052502 Ceased WO2009001292A1 (en) | 2007-06-27 | 2008-06-24 | A method of merging at least two input object-oriented audio parameter streams into an output object-oriented audio parameter stream |
Country Status (2)
| Country | Link |
|---|---|
| TW (1) | TW200921643A (en) |
| WO (1) | WO2009001292A1 (en) |
Cited By (5)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| WO2011020067A1 (en) * | 2009-08-14 | 2011-02-17 | Srs Labs, Inc. | System for adaptively streaming audio objects |
| WO2014184618A1 (en) * | 2013-05-17 | 2014-11-20 | Nokia Corporation | Spatial object oriented audio apparatus |
| US9026450B2 (en) | 2011-03-09 | 2015-05-05 | Dts Llc | System for dynamically creating and rendering audio objects |
| US9558785B2 (en) | 2013-04-05 | 2017-01-31 | Dts, Inc. | Layered audio coding and transmission |
| TWI607654B (en) * | 2011-07-01 | 2017-12-01 | 杜比實驗室特許公司 | Apparatus, method and non-transitory medium for enhancing 3D audio editing and rendering |
Citations (4)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US6463414B1 (en) * | 1999-04-12 | 2002-10-08 | Conexant Systems, Inc. | Conference bridge processing of speech in a packet network environment |
| WO2004039096A1 (en) * | 2002-10-25 | 2004-05-06 | Dilithium Networks Pty Limited | Method and apparatus for dtmf detection and voice mixing in the celp parameter domain |
| US20050102137A1 (en) * | 2001-04-02 | 2005-05-12 | Zinser Richard L. | Compressed domain conference bridge |
| WO2005078707A1 (en) * | 2004-02-16 | 2005-08-25 | Koninklijke Philips Electronics N.V. | A transcoder and method of transcoding therefore |
Cited By (20)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US8396577B2 (en) | 2009-08-14 | 2013-03-12 | Dts Llc | System for creating audio objects for streaming |
| US8396576B2 (en) | 2009-08-14 | 2013-03-12 | Dts Llc | System for adaptively streaming audio objects |
| US8396575B2 (en) | 2009-08-14 | 2013-03-12 | Dts Llc | Object-oriented audio streaming system |
| WO2011020067A1 (en) * | 2009-08-14 | 2011-02-17 | Srs Labs, Inc. | System for adaptively streaming audio objects |
| US9167346B2 (en) | 2009-08-14 | 2015-10-20 | Dts Llc | Object-oriented audio streaming system |
| US9721575B2 (en) | 2011-03-09 | 2017-08-01 | Dts Llc | System for dynamically creating and rendering audio objects |
| US9026450B2 (en) | 2011-03-09 | 2015-05-05 | Dts Llc | System for dynamically creating and rendering audio objects |
| US9165558B2 (en) | 2011-03-09 | 2015-10-20 | Dts Llc | System for dynamically creating and rendering audio objects |
| US10244343B2 (en) | 2011-07-01 | 2019-03-26 | Dolby Laboratories Licensing Corporation | System and tools for enhanced 3D audio authoring and rendering |
| TWI607654B (en) * | 2011-07-01 | 2017-12-01 | 杜比實驗室特許公司 | Apparatus, method and non-transitory medium for enhancing 3D audio editing and rendering |
| US9838826B2 (en) | 2011-07-01 | 2017-12-05 | Dolby Laboratories Licensing Corporation | System and tools for enhanced 3D audio authoring and rendering |
| US10609506B2 (en) | 2011-07-01 | 2020-03-31 | Dolby Laboratories Licensing Corporation | System and tools for enhanced 3D audio authoring and rendering |
| US11057731B2 (en) | 2011-07-01 | 2021-07-06 | Dolby Laboratories Licensing Corporation | System and tools for enhanced 3D audio authoring and rendering |
| US11641562B2 (en) | 2011-07-01 | 2023-05-02 | Dolby Laboratories Licensing Corporation | System and tools for enhanced 3D audio authoring and rendering |
| US12047768B2 (en) | 2011-07-01 | 2024-07-23 | Dolby Laboratories Licensing Corporation | System and tools for enhanced 3D audio authoring and rendering |
| US9613660B2 (en) | 2013-04-05 | 2017-04-04 | Dts, Inc. | Layered audio reconstruction system |
| US9558785B2 (en) | 2013-04-05 | 2017-01-31 | Dts, Inc. | Layered audio coding and transmission |
| US9837123B2 (en) | 2013-04-05 | 2017-12-05 | Dts, Inc. | Layered audio reconstruction system |
| US9706324B2 (en) | 2013-05-17 | 2017-07-11 | Nokia Technologies Oy | Spatial object oriented audio apparatus |
| WO2014184618A1 (en) * | 2013-05-17 | 2014-11-20 | Nokia Corporation | Spatial object oriented audio apparatus |
Also Published As
| Publication number | Publication date |
|---|---|
| TW200921643A (en) | 2009-05-16 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| RU2705007C1 (en) | | Device and method for encoding or decoding a multichannel signal using frame control synchronization |
| JP5934922B2 (en) | | Decoding device |
| AU2005280041B2 (en) | | Multichannel decorrelation in spatial audio coding |
| CN101543098B (en) | | Decorrelator and method for generation of output signal, and audio decoder for producing multi-channel output signals |
| KR101424752B1 (en) | | An Apparatus for Determining a Spatial Output Multi-Channel Audio Signal |
| JP5645951B2 (en) | | An apparatus for providing an upmix signal based on a downmix signal representation, an apparatus for providing a bitstream representing a multichannel audio signal, a method, a computer program, and a multi-channel audio signal bitstream using linear combination parameters |
| EP1851997B1 (en) | | Near-transparent or transparent multi-channel encoder/decoder scheme |
| AU2006340728B2 (en) | | Enhanced method for signal shaping in multi-channel audio reconstruction |
| US8817992B2 (en) | | Multichannel audio coder and decoder |
| JP6134867B2 (en) | | Renderer controlled space upmix |
| TW200926147A (en) | | Audio coding using downmix |
| EP1971979A1 (en) | | Decoding of binaural audio signals |
| Purnhagen et al. | | Immersive audio delivery using joint object coding |
| DK2171712T3 (en) | | A method and device for improving spatial audio signals |
| WO2009001292A1 (en) | | A method of merging at least two input object-oriented audio parameter streams into an output object-oriented audio parameter stream |
| JPWO2007043388A1 (en) | | Acoustic signal processing apparatus and acoustic signal processing method |
| WO2007080225A1 (en) | | Decoding of binaural audio signals |
| EP4621772A2 (en) | | Processing parametrically coded audio |
| Yu et al. | | Low-complexity binaural decoding using time/frequency domain HRTF equalization |
| James et al. | | Corpuscular Streaming and Parametric Modification Paradigm for Spatial Audio Teleconferencing |
| MX2008009565A (en) | | Apparatus and method for encoding/decoding signal |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 08776462; Country of ref document: EP; Kind code of ref document: A1 |
| | NENP | Non-entry into the national phase | Ref country code: DE |
| | 122 | Ep: pct application non-entry in european phase | Ref document number: 08776462; Country of ref document: EP; Kind code of ref document: A1 |