RU2575393C2

RU2575393C2 - Encoding and decoding of slot positions with events in audio signal frame

Info

Publication number: RU2575393C2
Application number: RU2013138354/08A
Authority: RU
Inventors: Achim Kuntz; Sascha Disch; Tom Bäckström
Original assignee: Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V.
Priority date: 2011-01-18
Filing date: 2012-01-17
Publication date: 2016-02-20

Abstract

FIELD: physics, acoustics.

SUBSTANCE: invention relates to audio signal processing and audio encoding. Disclosed is an apparatus for decoding, an apparatus for encoding, a method for decoding and a method for encoding positions of slots comprising events in an audio signal frame and respective programs and encoded signals, wherein the apparatus for decoding comprises: an analysing unit for analysing a frame slots number indicating the total of slots of the audio signal frame, an event slots number indicating the number of slots comprising the events of the audio signal frame, and an event state number, and a generating unit for generating an indication of a plurality of positions of slots comprising the events in the audio signal frame using the frame slots number, the event slots number and the event state number.

EFFECT: high accuracy of encoding and decoding.

17 cl, 25 dwg, 6 tbl

Description

The present invention relates to the field of audio processing and audio coding, in particular to encoding and decoding the positions of slots comprising events in an audio signal frame.

Audio processing and/or audio coding has advanced in many ways. In particular, spatial audio applications are becoming increasingly important. Audio signal processing is often used to decorrelate or render signals. Moreover, decorrelation and rendering of signals are used in mono-to-stereo upmixing, mono/stereo-to-multichannel upmixing, artificial reverberation, stereo widening, and user-interactive mixing/rendering.

Some audio signal processing systems use decorrelators. An important example is the use of signal decorrelation in parametric spatial audio decoders to restore specific decorrelation properties between two or more signals that are reconstructed from one or more downmix signals. The use of decorrelators significantly improves the perceived quality of the output signal, for example when compared with intensity stereo. Specifically, decorrelators make it possible to correctly synthesize spatial sound with a wide sound image, several concurrent sound objects and/or ambience. However, decorrelators are also known to introduce artifacts, such as changes in the temporal structure of the signal, timbre, etc.

Other examples of decorrelators in audio processing are the generation of artificial reverberation to change the spatial impression, and the use of decorrelators in multi-channel acoustic echo cancellation systems to improve the convergence behavior.

One important spatial audio coding scheme is parametric stereo (PS). FIG. 1 illustrates the structure of a mono-to-stereo decoder. A single decorrelator generates a decorrelated signal D (the "wet", i.e. processed, signal) from the mono input signal M (the "dry", i.e. unprocessed, signal). The decorrelated signal D is then fed into a mixer together with the signal M. The mixer then applies a mixing matrix H to the input signals M and D to generate the output signals L and R. The coefficients of the mixing matrix H can be fixed, signal-dependent, or controlled by the user.

Alternatively, the mixing matrix is controlled by side information that is transmitted together with the downmix and contains a parametric description of how to upmix the downmix signals to form the desired multi-channel output. The spatial side information is usually generated during the mono downmix process in a corresponding signal encoder.

Spatial audio coding as described above is widely used, for example in parametric stereo. A typical structure of a parametric stereo decoder is shown in FIG. 2. In FIG. 2, decorrelation is performed in a transform domain. The spatial parameters can be modified by the user or by additional tools, for example by post-processing for binaural rendering/presentation. In this case, the upmix parameters are combined with the parameters of the binaural filters to compute the input parameters for the mixing matrix.

The output L/R of the mixing matrix H is computed from the mono input signal M and the decorrelated signal D.
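The mixing step described above can be sketched as a 2x2 matrix applied sample by sample. This is only an illustration: the function name `upmix`, the example coefficients in `H`, and the sample values are assumptions chosen for readability, not values taken from the patent or from any standard.

```python
def upmix(m, d, h):
    """Apply a 2x2 mixing matrix h to the dry signal m and the wet signal d.

    Row 0 of h produces the left output, row 1 the right output.
    """
    left = [h[0][0] * ms + h[0][1] * ds for ms, ds in zip(m, d)]
    right = [h[1][0] * ms + h[1][1] * ds for ms, ds in zip(m, d)]
    return left, right


# Hypothetical fixed matrix: the wet signal is added to the left channel
# and subtracted from the right channel.
H = [[1.0, 1.0],
     [1.0, -1.0]]
M = [1.0, 0.5, -0.25]   # mono input ("dry")
D = [0.1, -0.2, 0.0]    # decorrelator output ("wet")

L, R = upmix(M, D, H)
```

In a parametric decoder the entries of `H` would vary over time and frequency according to the transmitted parameters; a constant matrix is used here only to keep the sketch short.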

In the mixing matrix, the amount of decorrelated sound fed to the output is controlled on the basis of transmitted parameters, for example inter-channel level differences (ILD), inter-channel correlation/coherence (ICC), and/or fixed or user-defined settings.

Conceptually, the decorrelator output D replaces a residual signal that would ideally allow perfect decoding of the original L/R signals. Using the decorrelator output D instead of a residual signal in the upmixer saves the bit rate that would otherwise be required to transmit the residual signal. The aim of the decorrelator is thus to generate, from the mono signal M, a signal D that exhibits properties similar to those of the residual signal it replaces. Reference is made to the document:

[1] J. Breebaart, S. van de Par, A. Kohlrausch, E. Schuijers, "High-Quality Parametric Spatial Audio Coding at Low Bitrates", in Proceedings of the AES 116th Convention, Berlin, Preprint 6072, May 2004.

In MPEG Surround (MPS), structures similar to PS, called one-to-two boxes (OTT boxes), are used in the spatial audio decoding trees. This can be seen as a generalization of the mono-to-stereo upmix concept to multi-channel spatial audio coding/decoding schemes. MPS also includes two-to-three upmix systems (TTT boxes), which may apply decorrelators depending on the TTT mode of operation. Details are described in the document:

[2] J. Herre, K. Kjorling, J. Breebaart, et al., "MPEG surround - the ISO/MPEG standard for efficient and compatible multi-channel audio coding", in Proceedings of the 122nd AES Convention, Vienna, Austria, May 2007.

Directional Audio Coding (DirAC) is a parametric sound field coding scheme that is not tied to a fixed number of audio output channels with fixed loudspeaker positions. DirAC applies decorrelators in the DirAC renderer, i.e. in the spatial audio decoder, to synthesize the non-coherent components of sound fields. Directional audio coding is further described in:

[3] V. Pulkki, "Spatial Sound Reproduction with Directional Audio Coding", in J. Audio Eng. Soc., Vol. 55, No. 6, 2007.

Regarding prior art decorrelators, reference is made to the documents:

[4] ISO/IEC International Standard "Information Technology - MPEG audio technologies - Part 1: MPEG Surround", ISO/IEC 23003-1:2007.

[5] J. Engdegard, H. Purnhagen, J. Roden, L. Liljeryd, "Synthetic Ambience in Parametric Stereo Coding", in Proceedings of the AES 116th Convention, Preprint, May 2004.

IIR lattice all-pass structures are used as decorrelators in spatial audio decoders such as MPS [2, 4]. Other prior art decorrelators apply (potentially frequency-dependent) delays to decorrelate signals, or convolve the input signals, for example, with exponentially decaying noise bursts. For an overview of prior art decorrelators for spatial audio upmix systems, reference is made to document [5]: "Synthetic Ambience in Parametric Stereo Coding".
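As a rough illustration of the second family of decorrelators mentioned above, the following sketch convolves an input with an exponentially decaying noise burst. The burst length, decay constant, seed, and function names are all assumptions made for this example; real decorrelators are designed far more carefully.

```python
import math
import random


def noise_burst(length, decay, seed=0):
    """Return a unit-energy, exponentially decaying Gaussian noise burst."""
    rng = random.Random(seed)
    burst = [rng.gauss(0.0, 1.0) * math.exp(-decay * n) for n in range(length)]
    norm = math.sqrt(sum(x * x for x in burst))
    return [x / norm for x in burst]


def convolve(signal, kernel):
    """Direct-form linear convolution of two finite sequences."""
    out = [0.0] * (len(signal) + len(kernel) - 1)
    for i, s in enumerate(signal):
        for j, k in enumerate(kernel):
            out[i + j] += s * k
    return out


burst = noise_burst(length=8, decay=0.5)
# A single impulse (an idealized clap) is spread over the whole burst length.
smeared = convolve([1.0, 0.0, 0.0, 0.0], burst)
```

The last line also hints at why such filters are problematic for applause: an isolated transient at the input is spread over as many samples as the burst is long.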

In general, stereo or multi-channel applause-like signals coded/decoded in parametric spatial audio coders are known to result in reduced signal quality. Applause-like signals are characterized by rather dense mixtures of transients arriving from different directions. Examples of such signals are applause, the sound of rain, galloping horses, etc. Applause-like signals often also contain sound components from distant sound sources that are perceptually fused into a noise-like, smooth background sound field.

Lattice all-pass structures used in spatial audio decoders such as MPEG Surround act as artificial reverberation generators and are therefore well suited to generating homogeneous, smooth, noise-like, immersive sounds (such as room reverberation tails). However, there are sound fields with a non-homogeneous spatio-temporal structure that still immerse the listener: one prominent example is applause-like sound fields, which create listener envelopment not only through homogeneous noise-like fields but also through rather dense sequences of single claps from different directions. The non-homogeneous component of applause sound fields can therefore be characterized as a spatially distributed mixture of transients. These distinct claps are not at all homogeneous, smooth, or noise-like.

Because of their reverberation-like behavior, lattice all-pass decorrelators are unable to generate immersive sound fields with the characteristics of, for example, applause. Instead, when applied to applause-like signals, they tend to temporally smear the transients in the signal. The undesired result is a noise-like immersive sound field without the distinctive spatio-temporal structure of applause-like sound fields. In addition, transient events such as single hand claps may excite ringing artifacts of the decorrelator filters.

USAC (Unified Speech and Audio Coding) is an audio coding standard for coding speech, audio, and mixtures thereof at different bit rates.

The perceptual quality of USAC can be further improved for stereo coding of applause and applause-like sounds at bit rates in the range of 32 kbit/s, where parametric stereo coding techniques are applied. Applause items coded with USAC tend to exhibit a narrow sound stage and a lack of envelopment if no dedicated applause handling is applied within the codec. To a large extent, the stereo coding techniques of USAC and their limitations were inherited from MPEG Surround (MPS). However, USAC does offer a dedicated adaptation to meet the requirement of proper applause handling. This adaptation is called the Transient Steering Decorrelator (TSD) and is an embodiment of this invention.

Applause signals can be thought of as composed of single, distinct nearby claps, temporally separated by a few milliseconds, and a superimposed noise-like ambience originating from very dense distant claps. In parametric stereo coding at a reasonable side information rate, the granularity of the spatial parameter sets (inter-channel level difference, inter-channel correlation, etc.) is far too low to ensure sufficient spatial redistribution of the single claps, which leads to a lack of envelopment. In addition, the claps are processed by the lattice all-pass decorrelator. This inevitably causes temporal dispersion of the transients and further reduces the subjective quality.

Using the Transient Steering Decorrelator (TSD) inside the USAC decoder modifies the MPS processing. The underlying idea of this approach is to address the applause decorrelation problem as follows:

- Separate the transients in the QMF domain before the lattice all-pass decorrelator, i.e. split the decorrelator input signal into a transient stream s2 and a non-transient stream s1.

- Feed the transient stream to a different, parameter-controlled decorrelator that is well suited to mixtures of transients.

- Feed the non-transient stream to the MPS all-pass decorrelator.

- Add the outputs of both decorrelators, D₁ and D₂, to obtain the decorrelated signal D.
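The four steps above can be sketched as follows. For brevity each QMF time slot is reduced to a single number (in practice a slot holds one sample per QMF subband), and the two decorrelators are passed in as plain functions. The stand-ins used in the example, a one-slot delay and a sign flip, are placeholders chosen only to make the data flow visible; they are not the decorrelators of the patent or of MPS.

```python
def split_transients(x, is_transient):
    """Split slot values into a non-transient stream s1 and a transient stream s2."""
    s1 = [0.0 if t else v for v, t in zip(x, is_transient)]
    s2 = [v if t else 0.0 for v, t in zip(x, is_transient)]
    return s1, s2


def tsd_decorrelate(x, is_transient, allpass_decorrelator, transient_decorrelator):
    """Route each stream to its own decorrelator and sum the two outputs."""
    s1, s2 = split_transients(x, is_transient)
    d1 = allpass_decorrelator(s1)      # D1: non-transient path
    d2 = transient_decorrelator(s2)    # D2: transient path
    return [a + b for a, b in zip(d1, d2)]


# Placeholder decorrelators (assumptions for illustration only).
def one_slot_delay(s):
    return [0.0] + s[:-1]


def sign_flip(s):
    return [-v for v in s]


x = [1.0, 2.0, 3.0, 4.0]
flags = [False, True, False, False]   # slot 1 carries a transient
D = tsd_decorrelate(x, flags, one_slot_delay, sign_flip)
```

Because the split is complementary, `s1` and `s2` always add back up to the original input, so no signal energy is lost before decorrelation.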

FIG. 3 illustrates the one-to-two (OTT) configuration inside the USAC decoder. The U-shaped transient handling box in FIG. 3 comprises the parallel signal path proposed for transient handling.

Two parameters that control the TSD process are transmitted as frequency-independent parameters from the encoder to the decoder (see FIG. 3):

- The binary transient/non-transient decision of a transient detector running in the encoder is used to control the transient separation with QMF time slot granularity in the decoder. An efficient lossless coding scheme is used to transmit the positions of the QMF slots that contain transients.

- The actual transient decorrelator parameters, which the transient decorrelator needs in order to control the spatial distribution of the transients. The transient decorrelator parameters denote the angle between the downmix and its residual. These parameters are transmitted only for time slots in which transient content was detected in the encoder.

To assess the quality of the technology described above, two MUSHRA listening tests were conducted in a controlled listening test environment using high-quality STAX electrostatic headphones. The tests were carried out at stereo configurations of 32 kbit/s and 16 kbit/s. Sixteen expert listeners took part in each test.

Since the USAC test set does not contain applause items, additional applause items were selected to demonstrate the benefit of the proposed technology. The items listed in Table 1 were included in the test:

Table 1
Listening test items

Item            Properties
ARL_applause    Applause of low to medium density (MPS test set item)
applause4s      Very dense applause containing a few distinguishable claps
Applse_2ch      Dense multi-channel applause, front channels (MPS test set item)
Applse_st       Dense multi-channel applause, downmixed to stereo (MPS test set item)
Klatschen       Sparse applause signal

For the usual twelve MPEG USAC listening test items, TSD is never active. However, these items do not remain exactly bit-identical, because the bitstream additionally contains the TSD enable bit (signalling that TSD is off), which slightly affects the bit budget of the core coder. Since these differences are very small, these items were not included in the listening test. Data on the size of these differences are provided to show that the changes are negligible and imperceptible.

The codec tool called inter-TES is part of USAC reference model 8 (RM8). Since this technique was introduced to improve the perceptual quality of transients, including applause-like signals, inter-TES was always switched on in every test condition. This setup ensures the best possible quality and demonstrates the orthogonality of inter-TES and TSD.

The systems under test have the following configurations:

- RM8: the USAC RM8 system

- CE: the USAC RM8 system extended by the Transient Steering Decorrelator (TSD)

FIGS. 4 and 5 show the MUSHRA scores together with their 95% confidence intervals for the 32 kbit/s test scenario. A Student's t-distribution was assumed for the test data. The absolute scores in FIG. 4 show a higher mean score for all items; for four of the five items the improvement is significant at the 95% confidence level. No item degraded relative to RM8. The difference scores for USAC+TSD, as evaluated in the TSD core experiment (CE) relative to USAC RM8, are plotted in FIG. 5. Here, a significant improvement can be seen for all items.

For the 16 kbit/s test scenario, FIGS. 6 and 7 show the MUSHRA scores together with their 95% confidence intervals. A Student's t-distribution was assumed for the data. The absolute scores in FIG. 6 show a higher mean score for every item. For one item, the improvement is significant at the 95% confidence level. No item scored worse than RM8. The difference scores are plotted in FIG. 7. Again, a significant improvement was demonstrated for all items with respect to the difference data.
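The significance statements above rest on 95% confidence intervals under a Student's t-distribution. A minimal sketch of that computation for one item is given below. The sixteen difference scores are made-up placeholder numbers, not data from the listening tests, and the critical value 2.131 is the two-sided 95% value for 15 degrees of freedom (16 listeners minus one).

```python
import math
import statistics

# Two-sided 95% critical value of Student's t for 15 degrees of freedom.
T_CRIT_95_DF15 = 2.131


def mean_ci(scores, t_crit):
    """Return (mean, lower bound, upper bound) of the confidence interval."""
    n = len(scores)
    mean = statistics.mean(scores)
    half_width = t_crit * statistics.stdev(scores) / math.sqrt(n)
    return mean, mean - half_width, mean + half_width


# Hypothetical per-listener difference scores (CE minus RM8) for one item.
diff_scores = [5, 10, 0, 15, 5, 10, 20, 5, 10, 0, 15, 10, 5, 10, 15, 5]

mean, lower, upper = mean_ci(diff_scores, T_CRIT_95_DF15)
# For difference scores, the improvement counts as significant at this
# level when the whole interval lies above zero.
significant = lower > 0
```

Working on difference scores rather than absolute scores removes each listener's individual offset, which is one reason an item can look inconclusive in the absolute plot and still be significant in the difference plot.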

The TSD tool is enabled by the bsTsdEnable flag transmitted in the bitstream. If TSD is enabled, the actual transient separation is controlled by the transient detection flags TsdSepData, which are also transmitted in the bitstream and are coded in bsTsdCodedPos when TSD is enabled.
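One well-known way to code a set of slot positions as a single integer, given the total number of slots and the number of event slots, is the combinatorial number system. The sketch below is offered only as an illustration of that general idea; the function names are hypothetical, and the ordering of states actually used for bsTsdCodedPos may differ from this one.

```python
from math import comb


def encode_positions(positions, num_slots):
    """Map a set of event slot positions to its rank among all such sets."""
    ordered = sorted(set(positions))
    if ordered and (ordered[0] < 0 or ordered[-1] >= num_slots):
        raise ValueError("slot position out of range")
    # Combinatorial number system: rank = sum of C(p_i, i) for i = 1..k.
    return sum(comb(p, i + 1) for i, p in enumerate(ordered))


def decode_positions(state, num_slots, num_events):
    """Recover the slot positions from the rank, slot count, and event count."""
    positions = []
    remaining = num_events
    for p in range(num_slots - 1, -1, -1):
        if remaining == 0:
            break
        c = comb(p, remaining)
        if state >= c:          # slot p belongs to the set
            positions.append(p)
            state -= c
            remaining -= 1
    return sorted(positions)


def bits_needed(num_slots, num_events):
    """Bits required to store any rank for the given slot and event counts."""
    return (comb(num_slots, num_events) - 1).bit_length()


state = encode_positions([1, 4], num_slots=6)
restored = decode_positions(state, num_slots=6, num_events=2)
```

With 6 slots and 2 events there are 15 possible position sets, so any of them fits in 4 bits, compared with 6 bits for one flag per slot. The saving grows when events are sparse relative to the frame length.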

In the encoder, the TSD enable flag bsTsdEnable is generated by a segmental classifier. The transient detection flags TsdSepData are set by a transient detector.

As already noted, TSD is not activated for the twelve MPEG USAC test items. For the five additional applause items, the TSD activation is depicted in Fig. 8, which shows the logical state of bsTsdEnable over time.

If TSD is activated, transients are detected in certain QMF time slots, and these slots are subsequently fed to the dedicated transient decorrelator. For each additional test item, Table 2 lists the percentage of slots within TSD-activated frames that contain transients.

Table 2
Percentage of slots with transients (frequency of transient slots in % of all time slots of TSD frames)

| Item | Frequency of transient slots (%) |
|---|---|
| ARL_applause | 23.4 |
| Applause4s | 20.1 |
| applse_2ch | 24.7 |
| applse_st | 23.8 |
| Klatschen | 21.3 |

Transmitting the transient separation decisions and the decorrelator parameters from the encoder to the decoder requires a certain amount of side information. However, this amount is more than compensated by the bit rate savings resulting from the transmission of broadband spatial cues within MPS.

As a consequence, the average MPS+TSD side information bit rate is even lower than the plain MPS side information bit rate in plain USAC, which is listed in the first column of Table 3. In the proposed configuration used for the subjective quality assessment, the average bit rates listed in the second column of Table 3 were measured for TSD:

Table 3
MPS(+TSD) bit rates in bit/s within the 32 kbit/s stereo codec scenario

| Item | Average MPS(+TSD) side information bit rate (bit/s), plain USAC RM8 | Average MPS(+TSD) side information bit rate (bit/s), USAC with TSD |
|---|---|---|
| ARL_applause | 2966 | 2345 |
| Applause4s | 2754 | 2278 |
| applse_2ch | 3000 | 2544 |
| applse_st | 2735 | 2253 |
| Klatschen | 2950 | 2495 |

The computational complexity of TSD arises from

- the decoding of the positions of the slots with transients,

- the complexity of the transient decorrelator.

Assuming an MPEG Surround spatial frame length of 32 time slots, decoding the slot positions requires, in the worst case, 64 divisions and 80 multiplications per spatial frame. With each division counted as 25 operations, this amounts to 64*25+80=1680 operations per spatial frame.

Neglecting copy operations and conditional statements, the complexity of the transient decorrelator is given by one complex multiplication per slot and hybrid QMF band.

This leads to the following overall complexity figures for TSD, shown in comparison with the complexity figures of plain USAC in Table 4:

Table 4
TSD decoder complexity in MOPS (million operations per second) and relative to the complexity of the plain USAC decoder

| Operating point | Plain USAC complexity (MOPS) | TSD: transient decorrelator complexity (MOPS) | TSD: slot position decoder complexity (MOPS) | ∑(TSD complexity) (MOPS) | ∑(TSD complexity) relative to plain USAC |
|---|---|---|---|---|---|
| 16 kbit/s stereo (f_s = 28.8 kHz) | 8.7 | 0.117 | 0.024 | 0.141 | 1.62% |
| 32 kbit/s stereo (f_s = 40 kHz) | 13.2 | 0.163 | 0.033 | 0.196 | 1.48% |
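The slot position decoder figures in Table 4 can be related to the worst-case count of 1680 operations per spatial frame by a short calculation. The sketch below is illustrative only: it assumes that one QMF time slot spans 64 input samples (this value is not stated in the text), and the function and constant names are hypothetical.

```python
# Worked check of the slot position decoder complexity figures.
# Assumption (not stated in the text): one QMF time slot spans 64 samples.
SAMPLES_PER_SLOT = 64
SLOTS_PER_FRAME = 32     # MPEG Surround spatial frame length used in the text
DIVISIONS = 64           # worst case per spatial frame, from the text
MULTIPLICATIONS = 80     # worst case per spatial frame, from the text
DIVISION_WEIGHT = 25     # one division counted as 25 operations (64*25+80)


def ops_per_frame():
    """Worst-case operation count for decoding one spatial frame."""
    return DIVISIONS * DIVISION_WEIGHT + MULTIPLICATIONS


def position_decoder_mops(sample_rate_hz):
    """Worst-case decoder load in million operations per second."""
    frames_per_second = sample_rate_hz / (SAMPLES_PER_SLOT * SLOTS_PER_FRAME)
    return ops_per_frame() * frames_per_second / 1e6


for rate in (28_800, 40_000):
    print(rate, round(position_decoder_mops(rate), 3))
```

Under the stated assumption, the two sampling rates give roughly 0.024 and 0.033 MOPS, which agrees with the slot position decoder column of Table 4.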

In summary, the listening test data clearly show a significant improvement in the subjective quality of applause signals in the difference scores of all items at both operating points. In terms of absolute scores, all items in the TSD condition show a higher mean score. At 32 kbit/s, a significant improvement exists for four of the five items. At 16 kbit/s, one item shows a significant improvement. No item scored worse than RM8. The improvement is achieved at negligible computational cost, as can be seen from the complexity data. This further underlines the benefit of the TSD tool for USAC.

The transient steering decorrelator described above significantly improves audio processing in USAC. However, as has also been seen above, the transient steering decorrelator requires information on whether or not a transient exists in a particular slot. In USAC, information about time slots may be transmitted on a frame-by-frame basis. A frame comprises several time slots, for example 32. It follows that the encoder should also transmit the information on which slots contain transients on a frame-by-frame basis. Reducing the number of bits to be transmitted is critical in audio signal processing. Since even a single audio recording comprises a very large number of frames, reducing the number of bits to be transmitted per frame by only a few bits can significantly reduce the overall bit rate.

The problem of decoding the positions of slots with events in an audio signal frame is, however, not limited to the problem of decoding transients. It would also be useful to decode the slot positions of other events, for example whether or not a slot of the audio signal frame is tonal, whether or not it contains noise, and the like. In fact, an apparatus for efficiently encoding and decoding the positions of slots with events in an audio signal frame would be very useful for a large number of different kinds of events.

Where this document refers to slots or slot positions of an audio signal frame, the slots in this sense may be time slots, frequency slots, time-frequency slots or any other type of slot. Furthermore, it should be understood that the present invention is not limited to audio processing and audio signal frames in USAC, but instead relates to any type of audio signal frame and any type of audio format, such as MPEG-1/2 Layer 3 ("MP3"), Advanced Audio Coding (AAC) and the like. Efficient encoding and decoding of the positions of slots with events in an audio signal frame would be very useful for any type of audio signal frame.

It is therefore an object of the present invention to provide an apparatus for encoding the positions of slots with events in an audio signal frame using a small number of bits. It is a further object of the present invention to provide an apparatus for decoding the positions of slots with events in an audio signal frame that were encoded by the apparatus for encoding according to the present invention. The objects of the present invention are achieved by an apparatus for decoding according to claim 1, an apparatus for encoding according to claim 11, a method for decoding according to claim 14, a method for encoding according to claim 15, a computer program for decoding according to claim 16, a computer program for encoding according to claim 17 and an encoded signal according to claim 18.

According to the present invention, it is assumed that a frame slots number, indicating the total number of slots of the audio signal frame, and an event slots number, indicating the number of slots of the audio signal frame that contain events, may be available in the decoding apparatus of the present invention. For example, an encoder may transmit the frame slots number and/or the event slots number to the apparatus for decoding. According to an embodiment, the encoder may indicate the total number of slots of the audio signal frame by transmitting a number that is the total number of slots of the audio signal frame minus 1. The encoder may further indicate the number of slots of the audio signal frame that contain events by transmitting a number that is the number of such slots minus 1. Alternatively, the decoder may itself determine the total number of slots of the audio signal frame and the number of slots containing events, without information from the encoder.

Based on these assumptions, according to the present invention, a given number of positions of slots containing events in an audio signal frame can be encoded and decoded using the following findings:

Let N be the total number of slots of the audio signal frame, and let P be the number of slots of the audio signal frame that contain events.

It is assumed that both the apparatus for encoding and the apparatus for decoding are aware of the values of N and P.

Knowing N and P, it can be determined that there are only $\binom{N}{P}$ different combinations of slot positions containing events in the audio signal frame.

For example, if the slot positions in a frame are numbered from 0 to N-1 and if P=8, then the first possible combination of slot positions with events is (0, 1, 2, 3, 4, 5, 6, 7), the second combination is (0, 1, 2, 3, 4, 5, 6, 8), and so on, up to the combination (N-8, N-7, N-6, N-5, N-4, N-3, N-2, N-1), so that in total there are $\binom{N}{P}$ different combinations.
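The ordering of combinations described above can be reproduced with a short enumeration. The following is a minimal sketch, assuming lexicographic ordering of the position tuples; the function name is hypothetical, and a small frame (N=10) is used only to keep the list short.

```python
from itertools import combinations
from math import comb


def event_slot_combinations(n_slots, n_events):
    """Return every set of event slot positions, in lexicographic order."""
    return list(combinations(range(n_slots), n_events))


# Illustrative values only: N = 10 slots, P = 8 slots with events.
combos = event_slot_combinations(10, 8)
print(combos[0])                  # first combination
print(combos[1])                  # second combination
print(combos[-1])                 # last combination (N-8, ..., N-1)
print(len(combos), comb(10, 8))   # both counts equal the binomial coefficient
```

An explicit list like this is only practical for small frames; it serves here to make the counting argument concrete.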

Moreover, the present invention makes use of the further finding that an event state number can be encoded by the apparatus for encoding and that this event state number is transmitted to the decoder. If each of the possible $\binom{N}{P}$ combinations is represented by a unique event state number, and if the apparatus for decoding is aware of which event state number represents which combination of slot positions containing events in the audio signal frame (for example, by applying an appropriate decoding method), then the apparatus for decoding can decode the positions of the slots containing events using N, P and the event state number. For a large number of typical values of N and P, such a coding technique uses fewer bits for encoding the positions of slots with events than other techniques (for example, a bit array with one bit for each slot of the frame, in which each bit indicates whether or not an event occurs in that slot).

In other words, the problem of encoding the positions of slots with events in an audio signal frame can be solved by encoding a discrete number P of positions $p_k$ in the range [0...N-1], such that the positions do not overlap, i.e. $p_k \neq p_h$ for $k \neq h$, with as few bits as possible. Since the ordering of the positions does not matter, the number of unique combinations of positions is the binomial coefficient $\binom{N}{P}$. The number of bits required is thus $\left\lceil \log_2 \binom{N}{P} \right\rceil$.
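To illustrate the saving, the bit count for one event state number can be compared with a bit array that spends one bit per slot. This is a hedged sketch: it takes the required number of bits to be the ceiling of the base-2 logarithm of the number of combinations, and the function name is hypothetical.

```python
from math import comb


def bits_for_event_state(n_slots, n_events):
    """Bits needed to index one of comb(n_slots, n_events) combinations."""
    n_states = comb(n_slots, n_events)
    # (n - 1).bit_length() equals ceil(log2(n)) exactly for integers n >= 1.
    return (n_states - 1).bit_length()


# Illustrative frame of N = 32 slots, compared with a 32-bit slot bit array.
for p in (1, 8, 16):
    print(p, bits_for_event_state(32, p), "bits versus", 32)
```

For N=32 and P=8 there are 10518300 combinations, so 24 bits suffice instead of 32. The saving shrinks as P approaches N/2, where the number of combinations is largest.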

In an embodiment, an apparatus for decoding is provided, wherein the apparatus for decoding is adapted to conduct a test comparing the event state number or an updated event state number with a threshold value. Such a test may be employed to derive the positions of the slots containing events from the event state number. The test may be conducted by checking whether the event state number or the updated event state number is greater than, greater than or equal to, smaller than, or smaller than or equal to the threshold value. Furthermore, it is preferred that the apparatus for decoding is adapted to update the event state number or the updated event state number depending on the result of the test.

According to an embodiment, an apparatus for decoding is provided which is adapted to conduct a test comparing the event state number or an updated event state number with a threshold value for a particular slot under consideration, wherein the threshold value depends on the frame slots number, the event slots number and the position of the considered slot within the frame. In this way, the positions of the slots containing events can be determined on a slot-by-slot basis, deciding for each slot of the frame, one after the other, whether or not the slot contains an event.
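A slot-by-slot threshold test of this kind can be sketched as follows. This is an illustrative scheme based on the combinatorial number system, not necessarily the procedure of the embodiments or of the pseudo-code in the figures; the function names and the choice of threshold (the binomial coefficient of the slot index and the number of events still to be placed) are assumptions.

```python
from math import comb


def decode_positions(n_slots, n_events, state):
    """Derive per-slot event flags from an event state number."""
    flags = [0] * n_slots
    remaining = n_events
    for k in range(n_slots - 1, -1, -1):   # visit slots from last to first
        if remaining == 0:
            break
        threshold = comb(k, remaining)     # depends on slot index and events left
        if state >= threshold:             # test against the threshold
            flags[k] = 1                   # slot k contains an event
            state -= threshold             # update the event state number
            remaining -= 1
    return flags


def encode_positions(flags):
    """Inverse mapping: per-slot event flags to an event state number."""
    state, count = 0, 0
    for k, flag in enumerate(flags):
        if flag:
            count += 1
            state += comb(k, count)
    return state


# Round trip over every state for an illustrative frame with N = 6, P = 3.
for s in range(comb(6, 3)):
    assert encode_positions(decode_positions(6, 3, s)) == s
print(decode_positions(6, 3, 0), decode_positions(6, 3, 19))
```

In this sketch the threshold for each slot is determined by the slot index and the number of events not yet placed, so every state number from 0 to $\binom{N}{P}-1$ maps to exactly one set of event positions.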

According to a further embodiment, an apparatus for decoding is provided which is adapted to split the frame into a first frame partition comprising a first set of slots of the frame and a second frame partition comprising a second set of slots of the frame, and wherein the apparatus for decoding is further adapted to determine the positions of the slots containing events separately for each of the frame partitions. In this way, the positions of the slots containing events can be determined by repeatedly splitting the frame or frame partitions into even smaller frame partitions.

In the following, embodiments of the present invention are described in more detail with reference to the drawings, in which:

Fig. 1 shows a typical application of a decorrelator in a mono-to-stereo upmixer;

Fig. 2 shows a further typical application of a decorrelator in a mono-to-stereo upmixer;

Fig. 3 shows an overview of a one-to-two (OTT) system including a transient steering decorrelator (TSD);

Fig. 4 shows a diagram illustrating absolute scores for 32 kbit/s stereo, comparing USAC RM8 and USAC RM8+TSD in the TSD core experiment (CE);

Fig. 5 shows a diagram depicting difference scores for 32 kbit/s stereo, comparing USAC employing a transient steering decorrelator with a plain USAC system;

Fig. 6 shows a diagram depicting absolute scores for 16 kbit/s stereo, comparing USAC RM8 and USAC RM8+TSD in the TSD core experiment (CE);

Fig. 7 shows a diagram depicting difference scores for 16 kbit/s stereo, comparing USAC employing a transient steering decorrelator with a plain USAC system;

Fig. 8 depicts the TSD activity for the five additional items, shown as the logical state of the bsTsdEnable flag;

Fig. 9A illustrates an apparatus for decoding positions of slots containing events in an audio signal frame according to an embodiment of the present invention;

Fig. 9B illustrates an apparatus for decoding positions of slots containing events in an audio signal frame according to a further embodiment of the present invention;

Fig. 9C illustrates an apparatus for decoding positions of slots containing events in an audio signal frame according to yet another embodiment of the present invention;

Fig. 10 shows a flow chart illustrating a decoding process conducted by an apparatus for decoding according to an embodiment of the present invention;

Fig. 11 shows pseudo-code implementing the decoding of positions of slots containing events according to an embodiment of the present invention;

Fig. 12 shows a flow chart illustrating an encoding process conducted by an apparatus for encoding according to an embodiment of the present invention;

Fig. 13 illustrates pseudo-code depicting a process for encoding positions of slots containing events in an audio signal frame according to a further embodiment of the present invention;

Fig. 14 illustrates an apparatus for decoding positions of slots containing events in an audio signal frame according to a further embodiment of the present invention;

Fig. 15 illustrates an apparatus for encoding positions of slots containing events in an audio signal frame according to an embodiment of the present invention;

Fig. 16 depicts the MPS 212 data syntax of USAC according to an embodiment;

Fig. 17 illustrates the TsdData syntax of USAC according to an embodiment;

Fig. 18 illustrates a table of nBitsTrSlots as a function of the MPS frame length;

Fig. 19 shows a table relating to bsTempShapeConfig of USAC according to an embodiment;

Fig. 20 depicts the TempShapeData syntax of USAC according to an embodiment;

Fig. 21 illustrates the decorrelator block D in an OTT decoding block according to an embodiment;

Fig. 22 depicts the EcData syntax of USAC according to an embodiment;

Fig. 23 shows a signal flow diagram for generating TSD data.

Fig. 9A illustrates an apparatus 10 for decoding positions of slots containing events in an audio signal frame according to an embodiment of the present invention. The apparatus 10 for decoding comprises an analysing unit 20 and a generating unit 30. A frame slots number FSN, indicating the total number of slots of the audio signal frame, an event slots number ESON, indicating the number of slots of the audio signal frame that contain events, and an event state number ESTN are fed into the apparatus 10 for decoding. The apparatus 10 for decoding then decodes the positions of the slots containing events using the frame slots number FSN, the event slots number ESON and the event state number ESTN. The decoding is conducted by the analysing unit 20 and the generating unit 30, which cooperate in the decoding process. While the analysing unit 20 is responsible for executing tests, for example comparing the event state number ESTN with a threshold value, the generating unit 30 generates and updates intermediate results of the decoding process, for example an updated event state number.

Furthermore, the generating unit 30 generates an indication of a plurality of positions of slots comprising events in the audio signal frame. A particular indication of a plurality of positions of slots comprising events of the audio signal frame may be referred to as an "indication state".

According to an embodiment, the indication of the plurality of positions of slots comprising events in the audio signal frame may be generated such that, at a first point in time, the generating unit 30 indicates for a first slot whether that slot comprises an event or not, at a second point in time, the generating unit 30 indicates for a second slot whether that slot comprises an event or not, and so on.

According to a further embodiment, the indication of the plurality of positions of slots comprising events may, for example, be a bit array indicating for each slot of the frame whether it comprises an event.

The analysing unit 20 and the generating unit 30 may cooperate such that the two units call each other one or more times during the decoding process to produce intermediate results.

FIG. 9B illustrates an apparatus 40 for decoding according to an embodiment of the present invention. The apparatus 40 differs from the apparatus 10 of FIG. 9A in, among other things, that it further comprises an audio signal processor 50. The audio signal processor 50 receives an audio input signal and the indication of the plurality of positions of slots comprising events in the audio signal frame, which has been generated by the generating unit 45. Depending on the indication, the audio signal processor 50 generates an audio output signal, for example by decorrelating the audio input signal. Furthermore, the audio signal processor 50 may comprise a lattice IIR decorrelator 54, a transient decorrelator 56 and a transient separator 52 for generating the audio output signal, as illustrated in FIG. 3. If the indication of the plurality of positions of slots comprising events in the audio signal frame indicates that a slot comprises a transient, the audio signal processor 50 decorrelates the audio input signal relating to this slot by means of the transient decorrelator 56. If, however, the indication indicates that a slot does not comprise a transient, the audio signal processor decorrelates the audio input signal S relating to this slot by means of the lattice IIR decorrelator 54. To this end, the audio signal processor employs the transient separator 52, which decides on the basis of the indication whether the portion of the audio input signal relating to a slot is fed into the transient decorrelator 56 (the slot comprises a transient) or into the lattice IIR decorrelator 54 (the slot does not comprise a transient).
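The per-slot routing performed by the transient separator 52 can be pictured with a minimal Python sketch. The function name `route_slots` and the two callables standing in for the transient decorrelator 56 and the lattice IIR decorrelator 54 are hypothetical placeholders introduced for illustration only; the internals of the decorrelators are not described in this passage.

```python
def route_slots(slot_signals, indication, transient_decorrelator, lattice_decorrelator):
    """Send each slot's signal portion to one of two decorrelators.

    indication[t] == 1 is assumed to mean that slot t comprises a transient
    (a per-slot bit array, as one possible form of the indication).
    Both decorrelators are hypothetical callables supplied by the caller.
    """
    if len(slot_signals) != len(indication):
        raise ValueError("one indication bit is needed per slot")
    output = []
    for portion, has_transient in zip(slot_signals, indication):
        if has_transient:
            output.append(transient_decorrelator(portion))
        else:
            output.append(lattice_decorrelator(portion))
    return output
```

Keeping the decorrelators as parameters leaves the sketch independent of any particular filter design; only the slot-wise decision is shown.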

FIG. 9C illustrates an apparatus 60 for decoding according to an embodiment of the present invention. The apparatus 60 differs from the apparatus 10 of FIG. 9A in that it further comprises a slot selector 90. Decoding is performed slot by slot, deciding for each slot of the frame, one after the other, whether the slot comprises an event. The slot selector 90 decides which slot of the frame to consider. A preferred approach is for the slot selector 90 to select the slots of the frame one after the other.

The slot-by-slot decoding of the apparatus 60 of this embodiment is based on the following findings, which may be applied to embodiments of an apparatus for decoding, an apparatus for encoding, a method for decoding and a method for encoding positions of slots comprising events in an audio signal frame. The following findings are also applicable to the corresponding computer programs and encoded signals.

Assume that N is the (total) number of slots of an audio signal frame and that P is the number of slots of the frame comprising events (that is, N may be the frame slots number FSN and P may be the event slots number ESON). The first slot of the frame is considered. Two cases can be distinguished.

If the first slot is a slot which does not comprise an event, then, with respect to the remaining N-1 slots of the frame, there are only $\binom{N-1}{P}$ different possible combinations of the P positions of slots comprising an event.

However, if the first slot is a slot which comprises an event, then, with respect to the remaining N-1 slots of the frame, there are only $\binom{N-1}{P-1} = \binom{N}{P} - \binom{N-1}{P}$ different possible combinations of the remaining P-1 slots comprising an event.
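As a small worked instance of the two cases (the values N = 4 and P = 2 are chosen here purely for illustration), the two counts add up to the total number of combinations for the whole frame:

$$\binom{3}{2} + \binom{3}{1} = 3 + 3 = 6 = \binom{4}{2}$$

That is, of the six ways to place two events in four slots, three leave the first slot empty and three occupy it.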

Building on these findings, embodiments are further based on the finding that all combinations with a first slot where no event occurred should be encoded by event state numbers that are smaller than or equal to a threshold value. Furthermore, all combinations with a first slot where an event occurred should be encoded by event state numbers that are greater than the threshold value. In an embodiment, all event state numbers may be positive integers or 0, and a suitable threshold value regarding the first slot may be $\binom{N-1}{P}$.

In an embodiment, an apparatus for decoding is configured to determine whether the first slot of a frame comprises an event by testing whether the event state number is greater than a threshold value. (Alternatively, the encoding/decoding process of embodiments may also be realised such that the apparatus for decoding tests whether the event state number is greater than or equal to, smaller than or equal to, or smaller than a threshold value.) After the first slot has been analysed, decoding continues for the second slot of the frame using adjusted values. Besides adjusting the number of slots considered (which is reduced by one), the number of slots comprising events is possibly also reduced by one (if the first slot comprises an event), and, in the case where the event state number was greater than the threshold value, the event state number is adjusted so as to remove from it the portion relating to the first slot. The decoding process may be continued for further slots of the frame in a similar manner.

In an embodiment, a discrete number P of positions p_k on a range [0...N-1] is encoded such that the positions do not overlap, p_k≠p_h for k≠h. Here, each unique combination of positions on the given range is called a state, and each possible position in the range is called a slot. According to an embodiment of an apparatus for decoding, the first slot in the range is considered. If the slot does not have a position assigned to it, then the range can be reduced to N-1, and the number of possible states reduces to $\binom{N-1}{P}$. Conversely, if the state is larger than $\binom{N-1}{P}$, then it can be concluded that the first slot has a position assigned to it. The following decoding algorithm may result from this:

For each slot h

If state > $\binom{N-h-1}{P}$, then

Assign a position to slot h

Update the remaining state as state := state - $\binom{N-h-1}{P}$

Decrease the number of positions left as P := P-1

End

Computing the binomial coefficient at each iteration would be costly. Therefore, according to embodiments, the following rules may be used to update the binomial coefficient using the value from the previous iteration:

Using these formulas, each update of the binomial coefficient costs only one multiplication and one division, whereas explicit evaluation would cost P multiplications and divisions at each iteration.
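One pair of standard binomial identities that matches the stated cost of one multiplication and one division per update is given below. It is offered as an assumption about the kind of rule meant, not as the formulas of the embodiment:

$$\binom{N-1}{P} = \binom{N}{P}\cdot\frac{N-P}{N}, \qquad \binom{N-1}{P-1} = \binom{N}{P}\cdot\frac{P}{N}$$

The first identity would apply when a slot without an event is passed (only N decreases), the second when a slot with an event is passed (both N and P decrease).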

In this embodiment, the total complexity of the decoder is P multiplications and divisions for initialising the binomial coefficient, one multiplication, one division and one if-statement for each iteration, and one multiplication, one addition and one division for each coded position. Note that, in theory, it would be possible to reduce the number of divisions needed for initialisation to one. In practice, however, this approach would result in very large integers, which are difficult to handle. The worst-case complexity of the decoder is then N+2P divisions and N+2P multiplications, P additions (which can be ignored if MAC (multiply-accumulate) operations are used) and N if-statements.
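A minimal Python sketch of a decoder with incrementally updated binomial coefficients follows. It assumes 0-based state numbers in the range 0 to $\binom{N}{P}-1$ and therefore uses a "greater than or equal to" comparison, which is one of the alternative comparisons mentioned for embodiments. The function name `decode_positions` and the exact update expressions are illustrative assumptions rather than the embodiment's procedure; `math.comb` is used only for the initial coefficient.

```python
from math import comb

def decode_positions(state, n_slots, n_events):
    """Return the slot indices (ascending) that hold events.

    Assumes n_slots >= 1 and 0 <= state < comb(n_slots, n_events).
    """
    positions = []
    n = n_slots - 1      # slots remaining after the current one
    p = n_events         # events still to be placed
    c = comb(n, p)       # threshold for the first slot: C(N-1, P)
    for h in range(n_slots):
        has_event = state >= c
        if has_event:
            positions.append(h)
            state -= c   # remove the part of the state owed to slot h
        if n > 0:
            # One multiplication and one exact integer division per update.
            if has_event:
                c = c * p // n          # C(n-1, p-1) = C(n, p) * p / n
            else:
                c = c * (n - p) // n    # C(n-1, p)   = C(n, p) * (n - p) / n
        if has_event:
            p -= 1
        n -= 1
    return positions
```

For example, with N = 3 and P = 1, the three states 0, 1 and 2 decode to the slot lists [2], [1] and [0] respectively. Python integers are unbounded, so the sketch sidesteps the large-integer concern that a fixed-width implementation would face.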

In an embodiment, the encoding algorithm employed by an apparatus for encoding does not have to iterate through all slots, but only through those that have a position assigned to them. Therefore:

For each position p_h, h = 1...P

Update the state as state := state + $\binom{p_h - 1}{h}$.

The worst-case complexity of the encoder is P∙(P-1) multiplications and P∙(P-1) divisions, as well as P-1 divisions.

FIG. 10 illustrates a decoding process conducted by an apparatus for decoding according to an embodiment of the present invention. In this embodiment, decoding is performed slot by slot.

In step 110, values are initialised. The apparatus for decoding stores the event state number, which it received as an input value, in variable s. Furthermore, the number of slots of the frame comprising events, as indicated by the event slots number, is stored in variable p. Moreover, the total number of slots contained in the frame, as indicated by the frame slots number, is stored in variable N.

In step 120, the value of TsdSepData[t] is initialised with 0 for all slots of the frame. The bit array TsdSepData is the output data to be generated. It indicates for each slot position t whether the slot at that position comprises an event (TsdSepData[t]=1) or not (TsdSepData[t]=0).

In step 130, variable k is initialised with the value N-1. In this embodiment, the slots of a frame comprising N elements are numbered 0, 1, 2, ..., N-1. Setting k=N-1 means that the slot with the highest slot number is considered first.

In step 140, it is checked whether k≥0. If k<0, the decoding of the slot positions is finished and the process terminates; otherwise, the process continues with step 150.

In step 150, it is tested whether p>k. If p is greater than k, all remaining slots comprise an event. The process then continues with step 230, in which the TsdSepData values of the remaining slots 0, 1, ..., k are all set to 1, indicating that each of the remaining slots comprises an event. In this case, the process terminates afterwards. However, if step 150 finds that p is not greater than k, the decoding process continues with step 160.

In step 160, the value c = $\binom{k}{p}$ is calculated; c is used as the threshold value.

In step 170, it is tested whether the (possibly updated) event state number s is greater than or equal to c, where c is the threshold value just calculated in step 160.

If s is smaller than c, the slot under consideration (with slot position k) does not comprise an event. In this case, no further action is needed, as TsdSepData[k] has already been set to 0 for this slot in step 120. The process then continues with step 220, in which k is set to k:=k-1 and the next slot is considered.

However, if the test in step 170 shows that s is greater than or equal to c, the slot k under consideration comprises an event. In this case, the event state number s is updated and set to s:=s-c in step 180. Furthermore, TsdSepData[k] is set to 1 in step 190 to indicate that slot k comprises an event. Moreover, in step 200, p is set to p-1, indicating that the remaining slots to be examined now comprise only p-1 slots with events.

In step 210, it is tested whether p is equal to 0. If p is equal to 0, the remaining slots do not comprise events, and the decoding process finishes. Otherwise, at least one of the remaining slots comprises an event, and the process continues with step 220, where the decoding process proceeds to the next slot (k-1).

The decoding process of the embodiment illustrated in FIG. 10 generates the array TsdSepData as its output value, indicating for each slot k of the frame whether the slot comprises an event (TsdSepData[k]=1) or not (TsdSepData[k]=0).
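The steps described for FIG. 10 can be sketched in Python as follows. This is one reading of the step descriptions, not a transcription of the pseudocode of FIG. 11; the function name `decode_tsd_positions` is hypothetical, while the variables s, p, k and c follow the names used in the steps. `math.comb` stands in for whatever binomial evaluation an implementation would use.

```python
from math import comb

def decode_tsd_positions(s, p, n):
    """Decode event state number s into a per-slot bit list of length n.

    s: event state number, p: number of slots with events, n: slots per frame.
    """
    tsd_sep_data = [0] * n            # step 120: no slot marked yet
    k = n - 1                         # step 130: highest slot number first
    while k >= 0:                     # step 140
        if p > k:                     # step 150: all remaining slots hold events
            for t in range(k + 1):    # step 230
                tsd_sep_data[t] = 1
            break
        c = comb(k, p)                # step 160: threshold value
        if s >= c:                    # step 170
            s -= c                    # step 180
            tsd_sep_data[k] = 1       # step 190
            p -= 1                    # step 200
            if p == 0:                # step 210: nothing left to place
                break
        k -= 1                        # step 220
    return tsd_sep_data
```

With N = 4 and p = 2, for instance, state 0 decodes to [1, 1, 0, 0] and state 5 to [0, 0, 1, 1].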

Returning to FIG. 9C, an apparatus 60 for decoding according to an embodiment which implements the decoding process illustrated in FIG. 10 comprises a slot selector 90, which decides which slots to consider. With respect to FIG. 10, such a slot selector would be configured to perform steps 130 and 220 of the process. A suitable analysing unit 70 of this embodiment would be configured to perform steps 140, 150, 170 and 210 of FIG. 10. The generating unit 80 of such an embodiment would be configured to perform all other processing steps of FIG. 10.

FIG. 11 shows pseudocode implementing the decoding of positions of slots comprising events according to an embodiment of the present invention.

FIG. 12 illustrates an encoding process conducted by an apparatus for encoding according to an embodiment of the present invention. In this embodiment, encoding is performed slot by slot. The purpose of the encoding process according to the embodiment illustrated in FIG. 12 is to generate an event state number.

In step 310, values are initialised. p_s is initialised with 0. The event state number is generated by successively updating variable p_s. When the encoding process is finished, p_s will carry the event state number. Step 310 also initialises variable k by setting k := (number of slots comprising events in the frame) - 1.

In step 320, variable "slots" is set to slots:=tsdPos[k], where tsdPos is an array holding the positions of the slots comprising events. The slot positions in the array are stored in ascending order.

In step 330, a test is conducted checking whether k≥slots. If so, the process terminates. Otherwise, the process continues with step 340.

In step 340, the value c = $\binom{\text{slots}}{k+1}$ is calculated.

In step 350, variable p_s is updated and set to p_s:=p_s+c.

In step 360, k is set to k:=k-1.

Then, in step 370, a test is conducted checking whether k≥0. If so, the next slot k-1 is considered. Otherwise, the process terminates.
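The encoding steps of FIG. 12 can likewise be sketched in Python. This is a reading of the step descriptions and not the pseudocode of FIG. 13; the function name `encode_tsd_positions` is hypothetical, the variable names follow the steps, and the loop condition at the top also covers the assumed corner case of a frame without events.

```python
from math import comb

def encode_tsd_positions(tsd_pos):
    """Return the event state number for event slot positions in ascending order."""
    p_s = 0                            # step 310: state number accumulator
    k = len(tsd_pos) - 1               # step 310: index of the last event slot
    while k >= 0:                      # step 370 (and guard for an empty list)
        slots = tsd_pos[k]             # step 320
        if k >= slots:                 # step 330: remaining terms are all zero
            break
        p_s += comb(slots, k + 1)      # steps 340 and 350
        k -= 1                         # step 360
    return p_s
```

For a frame of four slots with two events, the position lists [0, 1], [0, 3] and [2, 3] yield the state numbers 0, 3 and 5. The early exit in step 330 is safe because, with ascending distinct positions, tsdPos[k] = k implies that every remaining term is a binomial coefficient of the form $\binom{j}{j+1} = 0$.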

FIG. 13 depicts pseudocode implementing the encoding of positions of slots comprising events according to an embodiment of the present invention.

FIG. 14 illustrates an apparatus 410 for decoding positions of slots comprising events in an audio signal frame according to a further embodiment of the present invention. Again, as in FIG. 9A, a frame slots number FSN, indicating the total number of slots of the audio signal frame, an event slots number ESON, indicating the number of slots comprising events of the audio signal frame, and an event state number ESTN are fed into the apparatus 410 for decoding. The apparatus 410 differs from the apparatus of FIG. 9A in that it further comprises a frame partitioning unit 440. The frame partitioning unit 440 is configured to split the frame into a first frame partition comprising a first set of slots of the frame and a second frame partition comprising a second set of slots of the frame, wherein the positions of slots comprising events are determined separately for each frame partition. Furthermore, the positions of slots comprising events may be determined by repeatedly splitting the frame or frame partitions into even smaller frame partitions.

The "partition-based" decoding of the apparatus 410 of this embodiment is based on the following concepts, which may be applied to embodiments of an apparatus for decoding, an apparatus for encoding, a method for decoding and a method for encoding positions of slots comprising events in an audio signal frame. The following concepts are also applicable to the corresponding computer programs and encoded signals:

Partition-based decoding is based on the idea that a frame is split into two frame partitions A and B, each comprising a set of slots, where frame partition A comprises N_a slots, frame partition B comprises N_b slots, and N_a+N_b=N. The frame may be split arbitrarily into two partitions, preferably such that partitions A and B have nearly the same total number of slots (e.g., such that N_a=N_b or N_a=N_b-1). By splitting the frame into two partitions, the task of determining the slot positions where events occurred is also split into two subtasks, namely determining the slot positions where events occurred in frame partition A, and determining the slot positions where events occurred in frame partition B.

In this embodiment, it is again assumed that the apparatus for decoding is aware of the frame slots number, the event slots number and the event state number of the frame. To solve both subtasks, the apparatus for decoding must also be aware of the number of slots of each frame partition, the number of slots comprising events in each frame partition, and the event state number of each frame partition (such an event state number of a frame partition is referred to in the following as an "event substate number").

Since the apparatus for decoding itself splits the frame into two frame partitions, it knows per se that frame partition A comprises N_a slots and frame partition B comprises N_b slots. Determining the number of slots comprising events for each of the two frame partitions is based on the following findings.

Since the frame has been split into two partitions, each of the slots comprising events is now located either in partition A or in partition B. Furthermore, let P be the number of slots comprising events of a frame partition, let N be the total number of slots of that frame partition, and let f(P,N) be a function that returns the number of different combinations of event slot positions of a frame partition. Then the number of different combinations of event slot positions of the whole frame (which has been split into partition A and partition B) is:

Number of slots comprising events in partition A | Number of slots comprising events in partition B | Number of different combinations in the whole audio signal frame with this configuration
0 | P | f(0,N_a)∙f(P,N_b)
1 | P-1 | f(1,N_a)∙f(P-1,N_b)
2 | P-2 | f(2,N_a)∙f(P-2,N_b)
... | ... | ...
P | 0 | f(P,N_a)∙f(0,N_b)

Based on the above considerations, according to an embodiment, all combinations with the first configuration, where partition A has 0 slots comprising events and partition B has P slots comprising events, shall be encoded with an event state number smaller than a first threshold value. The event state number may be encoded as an integer value that is positive or 0. Since there are only f(0,N_a)∙f(P,N_b) combinations with the first configuration, a suitable first threshold value may be f(0,N_a)∙f(P,N_b).

All combinations with the second configuration, where partition A has 1 slot comprising events and partition B has P-1 slots comprising events, shall be encoded with an event state number greater than or equal to the first threshold value, but smaller than or equal to a second value. Since there are only f(1,N_a)∙f(P-1,N_b) combinations with the second configuration, a suitable second value may be f(0,N_a)∙f(P,N_b)+f(1,N_a)∙f(P-1,N_b). The event state numbers for combinations with other configurations are determined similarly.
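The assignment of state ranges to configurations can be sketched as follows. This is only an illustration under assumptions: f(p,N) is taken to be the binomial coefficient (which holds when event positions do not overlap), the function name config_ranges is hypothetical, and the ranges are written as half-open intervals, matching the strict "smaller than" comparison used in the decoding steps.

```python
from math import comb


def config_ranges(p, na, nb):
    """List (pulses_a, pulses_b, first_state, end_state) per configuration.

    end_state is exclusive. Assumes f(p, N) = comb(N, p).
    """
    ranges = []
    start = 0
    for pulses_a in range(p + 1):
        pulses_b = p - pulses_a
        # Number of combinations with this distribution of events.
        count = comb(na, pulses_a) * comb(nb, pulses_b)
        ranges.append((pulses_a, pulses_b, start, start + count))
        start += count  # next configuration starts at the cumulative threshold
    return ranges
```

For example, config_ranges(2, 2, 2) yields the ranges [0,1), [1,5) and [5,6) for the distributions (0,2), (1,1) and (2,0).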

According to an embodiment, decoding is performed by splitting the frame into two frame partitions A and B. It is then tested whether the event state number is smaller than a first threshold value. In a preferred embodiment, the first threshold value may be f(0,N_a)∙f(P,N_b).

If the event state number is smaller than the first threshold value, it can be concluded that partition A comprises 0 slots comprising events and partition B comprises all P slots of the frame where events occurred. Decoding is then conducted for both partitions with the respectively determined number of slots comprising events of the corresponding partition. Furthermore, a first event state number is determined for partition A and a second event state number is determined for partition B, each of which is used as the new event state number of its partition. Within this document, the event state number of a frame partition is referred to as an "event substate number".

However, if the event state number is greater than or equal to the first threshold value, the event state number may be updated. In a preferred embodiment, the event state number is updated by subtracting a value from it, preferably the first threshold value, e.g. f(0,N_a)∙f(P,N_b). In the next step, it is tested whether the updated event state number is smaller than a second threshold value. In a preferred embodiment, the second threshold value may be f(1,N_a)∙f(P-1,N_b). If the event state number is smaller than the second threshold value, it can be derived that partition A has 1 slot comprising events and partition B has P-1 slots comprising events. Decoding is then conducted for both partitions with the respectively determined number of slots comprising events of each partition. A first event substate number is used for decoding partition A, and a second event substate number is used for decoding partition B. However, if the event state number is greater than or equal to the second threshold value, the event state number may be updated again, preferably by subtracting f(1,N_a)∙f(P-1,N_b). The decoding process is applied analogously to the remaining possible distributions of the slots comprising events over the two frame partitions.

In an embodiment, an event substate number for partition A and an event substate number for partition B may be used for decoding partition A and partition B, where both event substate numbers are determined by conducting the division:

event state number / f(number of slots comprising events of partition B, N_b).

Preferably, the event substate number of partition A is the integer part of the above division, and the event substate number of partition B is the remainder of that division. The event state number used in this division may be the original event state number of the frame or an updated event state number, e.g. updated by subtracting one or more threshold values as described above.
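A minimal sketch of this division, assuming the divisor f(pulses_b, N_b) has already been computed and is passed in as no_states_b. The function names split_state and join_state are hypothetical; join_state shows the inverse operation that a matching encoder would apply.

```python
def split_state(state, no_states_b):
    """Split a (possibly updated) event state number into two substates.

    state_a is the integer part and state_b the remainder of
    state / no_states_b.
    """
    state_a, state_b = divmod(state, no_states_b)
    return state_a, state_b


def join_state(state_a, state_b, no_states_b):
    """Inverse of split_state: combine two substates into one number."""
    return state_a * no_states_b + state_b
```

For instance, with no_states_b = 3, the value 7 splits into state_a = 2 and state_b = 1, and joining them gives 7 again.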

To illustrate the partition-based decoding concept described above, consider a situation where a frame has two slots comprising events. Furthermore, let f(p,N) again be the function that returns the number of different combinations of event slot positions of a frame partition, where p is the number of slots comprising events of the frame partition and N is the total number of slots of that frame partition. Then, for each of the possible distributions of the positions, the following number of possible combinations results:

Positions in partition A | Positions in partition B | Number of combinations in this configuration
0 | 2 | f(0,N_a)∙f(2,N_b)
1 | 1 | f(1,N_a)∙f(1,N_b)
2 | 0 | f(2,N_a)∙f(0,N_b)

It can thus be concluded that if the encoded event state number of the frame is smaller than f(0,N_a)∙f(2,N_b), then the slots comprising events must be distributed as 0 and 2. Otherwise, f(0,N_a)∙f(2,N_b) is subtracted from the event state number, and the result is compared with f(1,N_a)∙f(1,N_b). If it is smaller, then the positions are distributed as 1 and 1. Otherwise, only the distribution 2 and 0 remains, and the positions are distributed as 2 and 0.
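As a worked instance of this comparison, assume purely for illustration a frame of N=4 slots split into N_a=N_b=2, with f taken to be the binomial coefficient (valid when event positions do not overlap). The three configurations then occupy the following state ranges, and their sizes add up to the total number of ways of placing two events in four slots:

```latex
\begin{align*}
f(0,2)\cdot f(2,2) &= 1\cdot 1 = 1 && \text{state } 0 \;\Rightarrow\; \text{distribution } (0,2)\\
f(1,2)\cdot f(1,2) &= 2\cdot 2 = 4 && \text{states } 1,\dots,4 \;\Rightarrow\; \text{distribution } (1,1)\\
f(2,2)\cdot f(0,2) &= 1\cdot 1 = 1 && \text{state } 5 \;\Rightarrow\; \text{distribution } (2,0)\\
1 + 4 + 1 &= 6 = \binom{4}{2}
\end{align*}
```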

The following is pseudocode according to an embodiment for decoding positions of slots comprising particular events (here: "pulses") in an audio signal frame. In this pseudocode, "pulses_a" is the (assumed) number of slots comprising events in partition A, and "pulses_b" is the (assumed) number of slots comprising events in partition B. The (eventually updated) event state number is referred to as "state". The event substate numbers of partitions A and B are still jointly encoded in the "state" variable. According to the joint coding scheme of the embodiment, the event substate number of A (herein referred to as "state_a") is the integer part of the division state/f(pulses_b, N_b), and the event substate number of B (herein referred to as "state_b") is the remainder of that division. In this way, the length (total number of slots of the partition) and the number of encoded positions (number of slots comprising events in the partition) of both partitions can be decoded by the same approach:

Function x = decodestate(state, pulses, N)

1. Split the vector into two partitions of length Na and Nb.
2. For pulses_a from 0 to pulses
   a. pulses_b = pulses - pulses_a
   b. if state < f(pulses_a,Na)*f(pulses_b,Nb), then
      break the for-loop.
   c. state := state - f(pulses_a,Na)*f(pulses_b,Nb)
3. The number of possible states for partition B is
   no_states_b = f(pulses_b,Nb)
4. The states state_a and state_b of partitions A and B, respectively,
   are the integer part and the remainder of the division state/no_states_b.
5. If Na > 1, then the decoded vector of partition A is obtained recursively as
   xa = decodestate(state_a,pulses_a,Na)
   Otherwise (Na == 1), the vector xa is a scalar, and we can set xa = state_a.
6. If Nb > 1, then the decoded vector of partition B is obtained recursively as
   xb = decodestate(state_b,pulses_b,Nb)
   Otherwise (Nb == 1), the vector xb is a scalar, and we can set xb = state_b.
7. The final output x is obtained by concatenating xa and xb as x = [xa xb].

The output of this algorithm is a vector that has a one (1) at every encoded position (i.e., the slot position of a slot comprising an event) and a zero (0) elsewhere (i.e., at slot positions that do not comprise events).
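The decoding pseudocode above can be sketched in runnable form as follows. Several points are assumptions rather than part of the description: f(p,N) is taken to be the binomial coefficient math.comb(N, p), the frame is split as Na = N // 2 and Nb = N - Na, and for a single-slot partition the sketch sets the slot value to the number of pulses assigned to it, since a one-slot partition has only one possible state (state 0) and the pulse count is what tells an event slot from an empty one.

```python
from math import comb


def f(p, n):
    """Number of ways to place p events in n slots (assumed binomial)."""
    return comb(n, p)


def decodestate(state, pulses, n):
    """Return a 0/1 list of length n with `pulses` ones, selected by `state`."""
    if n == 1:
        # Assumed base case: the only state is 0; the slot holds the pulse count.
        return [pulses]
    na = n // 2          # assumed split: partitions differ by at most one slot
    nb = n - na
    for pulses_a in range(pulses + 1):
        pulses_b = pulses - pulses_a
        count = f(pulses_a, na) * f(pulses_b, nb)
        if state < count:
            break        # configuration (pulses_a, pulses_b) found
        state -= count   # skip all states of this configuration
    else:
        raise ValueError("state is out of range for this frame")
    # Integer part -> substate of A, remainder -> substate of B.
    state_a, state_b = divmod(state, f(pulses_b, nb))
    return decodestate(state_a, pulses_a, na) + decodestate(state_b, pulses_b, nb)
```

With these assumptions, decodestate(0, 2, 4) returns [0, 0, 1, 1] and decodestate(5, 2, 4) returns [1, 1, 0, 0], the first and last of the six possible states.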

The following is pseudocode according to an embodiment for encoding positions of slots comprising events in an audio signal frame, which uses the same variable names with the same meaning as above:

Function state = encodestate(x, N)

1. Split the vector into two partitions xa and xb of length Na and Nb.
2. Count the pulses in partitions A and B in pulses_a and pulses_b,
   and set pulses = pulses_a + pulses_b.
3. Set state to 0.
4. For k from 0 to pulses_a-1
   a. state := state + f(k,Na)*f(pulses-k,Nb).
5. If Na > 1, encode partition A as
   state_a = encodestate(xa,Na);
   Otherwise (Na == 1), set state_a = xa.
6. If Nb > 1, encode partition B as
   state_b = encodestate(xb,Nb);
   Otherwise (Nb == 1), set state_b = xb.
7. Encode the states jointly
   state := state + state_a*f(pulses_b,Nb) + state_b.

Here it is assumed that, as in the decoder algorithm, every encoded position (i.e., the slot position of a slot comprising an event) is identified by a one (1) in the vector x, and all other elements are zero (0) (i.e., at slot positions that do not comprise events).
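A runnable sketch of the encoding pseudocode above, under the same kind of assumptions: f(p,N) is taken to be math.comb(N, p), the vector is split as Na = N // 2 and Nb = N - Na, the length N is derived from the vector instead of being passed in, and a single-slot partition is given substate 0 because it has only one possible state once its pulse count is known.

```python
from math import comb


def f(p, n):
    """Number of ways to place p events in n slots (assumed binomial)."""
    return comb(n, p)


def encodestate(x):
    """Map a 0/1 list x to its state index among vectors with the same pulse count."""
    n = len(x)
    if n == 1:
        return 0                      # assumed base case: one slot, one state
    na = n // 2                       # assumed split, as in the decoder sketch
    nb = n - na
    xa, xb = x[:na], x[na:]
    pulses_a, pulses_b = sum(xa), sum(xb)
    pulses = pulses_a + pulses_b
    state = 0
    for k in range(pulses_a):
        # Skip every configuration with fewer events in partition A.
        state += f(k, na) * f(pulses - k, nb)
    state_a = encodestate(xa)
    state_b = encodestate(xb)
    # Joint coding of the two substates.
    return state + state_a * f(pulses_b, nb) + state_b
```

Under these assumptions, encodestate([0, 0, 1, 1]) gives 0 and encodestate([1, 1, 0, 0]) gives 5, and every vector of a given length and pulse count receives a distinct state.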

The above recursive methods, formulated in pseudocode, can readily be implemented in a non-recursive manner using standard techniques.

According to an embodiment of the present invention, the function f(p,N) may be implemented as a lookup table. When the positions do not overlap, as in the current context, the number-of-states function f(p,N) is simply the binomial function, which can be computed on the fly. We have

f(p,N) = N!/(p!∙(N-p)!).

According to an embodiment of the present invention, both the encoder and the decoder have a for-loop in which the product f(p-k,N_a)*f(k,N_b) is computed for consecutive values of k. For efficient computation, this can be written as

f(p-k,N_a)∙f(k,N_b) = f(p-k+1,N_a)∙f(k-1,N_b)∙((p-k+1)∙(N_b-k+1))/((N_a-p+k)∙k).

In other words, successive terms for subtraction/addition (in steps 2b and 2c in the decoder, and in step 4a in the encoder) can be computed with three multiplications and one division per iteration.
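One way to realise such an incremental update is sketched below. It is a reconstruction under assumptions, not the embodiment's own code: f is taken to be the binomial coefficient, the ratio between successive terms is derived from C(N_a,p-k)/C(N_a,p-k+1) = (p-k+1)/(N_a-p+k) and C(N_b,k)/C(N_b,k-1) = (N_b-k+1)/k, and the sketch requires p <= min(N_a, N_b) so that no term is zero. The function name products is hypothetical.

```python
from math import comb


def products(p, na, nb):
    """Yield f(p-k, na) * f(k, nb) for k = 0..p, updating the previous term.

    Assumes f is the binomial coefficient and p <= min(na, nb).
    """
    if p > min(na, nb):
        raise ValueError("sketch assumes p <= min(na, nb), so no term is zero")
    term = comb(na, p)  # k = 0: f(p, na) * f(0, nb)
    yield term
    for k in range(1, p + 1):
        # Two multiplications in the numerator, one in the denominator,
        # and one exact integer division per iteration.
        term = term * (p - k + 1) * (nb - k + 1) // ((na - p + k) * k)
        yield term
```

For p = 2 and na = nb = 4 this produces 6, 16 and 6, which sum to C(8,2) = 28.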

As in the method described earlier, the state of a long vector (a frame with many slots) can be a very large integer that easily exceeds the representation length of standard processors. It will therefore be necessary to use arithmetic functions capable of handling very long integers.
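To get a feel for the sizes involved, the following sketch computes how many bits the largest state needs, assuming f is the binomial coefficient. The function name state_bits is hypothetical. Python integers have arbitrary precision, so no special library is needed here; in a fixed-width language a big-integer type would take their place.

```python
from math import comb


def state_bits(num_slots, num_events):
    """Bits needed for the largest state index in [0, C(num_slots, num_events))."""
    return (comb(num_slots, num_events) - 1).bit_length()
```

A frame of 4 slots with 2 events needs 3 bits, whereas 128 slots with 64 events already need more than 64 bits.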

Regarding complexity, the method considered here is, unlike the slot-by-slot processes above, a divide-and-conquer algorithm. Assuming that the length of the input vector is a power of two, the recursion has a depth of log2(N).

Since the number of pulses remains constant at each recursion depth, the number of for-loop iterations is the same at each recursion depth. It follows that the number of loop iterations is pulses∙log2(N).

As explained above, each update of f(p-k,N_a)∙f(k,N_b) can be done with three multiplications and one division.

It should be noted that a subtraction and a comparison in the decoder can be treated as one operation.

It can readily be seen that partitions are merged log2(N)-1 times. When jointly encoding the states in the encoder, it is thus necessary to multiply and add log2(N)-1 times. Similarly, when jointly decoding the states in the decoder, it is necessary to divide log2(N)-1 times.

It should be noted that, of the divisions, only the joint coding of the states in the decoder requires divisions where the denominator is a long integer. The other divisions always have relatively short integers in the denominator. Since divisions with long denominators are the most complex operations, they should be avoided where possible.

In summary, the following numbers of arithmetic operations with long integers occur in the decoder:

Multiplications | (3∙pulses+1)∙log2(N)-1
Divisions | (pulses+1)∙log2(N)-1
of which divisions with a long denominator | log2(N)-1
Additions and subtractions | pulses∙log2(N)

Similarly, the following occur in the encoder:

Multiplications | (3∙pulses+1)∙log2(N)-1
Divisions | (pulses+1)∙log2(N)-1
of which divisions with a long denominator | 0
Additions and subtractions | (pulses+2)∙log2(N)

Only log2(N)-1 divisions with a long denominator are thus required.
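The operation counts stated above can be evaluated for a concrete frame with a small helper. This is only a transcription of those formulas into code; the function name operation_counts and the dictionary layout are hypothetical, and N is assumed to be a power of two, as in the complexity discussion.

```python
def operation_counts(pulses, n):
    """Evaluate the stated long-integer operation counts for decoder and encoder."""
    if n < 2 or n & (n - 1):
        raise ValueError("n is assumed to be a power of two")
    depth = n.bit_length() - 1  # log2(n) for a power of two
    shared = {
        "multiplications": (3 * pulses + 1) * depth - 1,
        "divisions": (pulses + 1) * depth - 1,
    }
    decoder = dict(shared, long_divisions=depth - 1,
                   additions_subtractions=pulses * depth)
    encoder = dict(shared, long_divisions=0,
                   additions_subtractions=(pulses + 2) * depth)
    return {"decoder": decoder, "encoder": encoder}
```

For pulses = 2 and N = 8, this gives 20 multiplications and 8 divisions on both sides, of which 2 divisions in the decoder have a long denominator.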

In further embodiments, the above-described embodiments that comprise or are adapted to employ recursive processing steps are modified such that some or all of the recursive processing steps are implemented in a non-recursive manner using standard techniques.

FIG. 15 illustrates an apparatus (510) for encoding positions of slots comprising events in an audio signal frame according to an embodiment. The apparatus (510) for encoding comprises an event state number generator (530) adapted to encode the positions of the slots by encoding an event state number. Moreover, the apparatus comprises a slot information unit (520) adapted to provide a frame slots number and an event slots number to the event state number generator (530). The event state number generator may implement one of the encoding methods described above.

In a further embodiment, an encoded audio signal is provided. The encoded audio signal comprises an event state number. In another embodiment, the encoded audio signal furthermore comprises an event slots number. Moreover, a frame of the encoded audio signal may also comprise a frame slots number. The positions of the slots comprising events in the audio signal frame can be decoded according to one of the decoding methods described above. In an embodiment, the event state number, the event slots number and the frame slots number are transmitted such that the positions of the slots comprising events in the audio signal frame can be decoded by employing one of the methods described above.

The inventive encoded audio signal can be stored on a digital storage medium or a non-transitory storage medium, or can be transmitted on a transmission medium such as a wireless transmission medium or a wired transmission medium such as the Internet.

The following explains USAC syntax definitions adapted to support the transient steering decorrelator (TSD) according to an embodiment.

FIG. 16 illustrates the MPS (MPEG Surround) 212 data. The MPS 212 data is a block of data comprising the payload for the MPS 212 stereo module. The MPS 212 data comprises the TSD data.

FIG. 17 depicts the syntax of the TSD data. It comprises the number of transient slots (bsTsdNumTrSlots) and the TSD transient phase data (bsTsdTrPhaseData) for the slots in an MPS 212 data frame. If a slot contains transient data (TsdSepData[ts] is set to 1), bsTsdTrPhaseData contains the phase data; otherwise bsTsdTrPhaseData[ts] is set to 0.

nBitsTrSlots defines the number of bits used to carry the number of transient slots (bsTsdNumTrSlots). nBitsTrSlots depends on the number of slots in an MPS 212 data frame (numSlots). FIG. 18 illustrates the relationship between the number of slots in an MPS 212 data frame and the number of bits used to carry the number of transient slots.

FIG. 19 defines the values of tempShapeConfig. tempShapeConfig indicates the operating mode of temporal shaping (STP or GES) or the activation of transient steering decorrelation in the decoder. If tempShapeConfig is set to 0, no temporal shaping is applied at all; if tempShapeConfig is set to 1, subband domain temporal processing (STP) is applied; if tempShapeConfig is set to 2, guided envelope shaping (GES) is applied; and if tempShapeConfig is set to 3, transient steering decorrelation (TSD) is applied.
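The four tempShapeConfig values can be held in a small lookup. The following Python sketch is only an illustration: the enum name `TempShapeMode`, its member names and the helper `tsd_allowed` are hypothetical and are not part of the USAC syntax.

```python
from enum import IntEnum


class TempShapeMode(IntEnum):
    """Hypothetical names for the four tempShapeConfig values."""
    NONE = 0  # no temporal shaping is applied
    STP = 1   # subband domain temporal processing
    GES = 2   # guided envelope shaping
    TSD = 3   # transient steering decorrelation


def tsd_allowed(temp_shape_config: int) -> bool:
    """Return True when the configuration value selects TSD (value 3)."""
    return TempShapeMode(temp_shape_config) is TempShapeMode.TSD
```

Passing a value outside 0..3 raises `ValueError`, which is one way to reject an invalid configuration early.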

FIG. 20 illustrates the syntax of TempShapeData. If bsTempShapeConfig is set to 3, TempShapeData comprises bsTsdEnable, which indicates whether TSD is enabled in the frame.

FIG. 21 illustrates a decorrelator block D according to an embodiment. The decorrelator block D in the OTT decoding block comprises a signal separator, two decorrelator structures and a signal combiner.

D_AP denotes the all-pass decorrelator, as defined in subsection 7.11.2.5 (all-pass decorrelator).

D_TR denotes the transient decorrelator.

If the TSD tool is active in the current frame, i.e. if (bsTsdEnable == 1), the input signal is split into a transient stream $v_{X,Tr}^{n,k}$ and a non-transient stream $v_{X,nonTr}^{n,k}$ according to:

The per-slot transient separation flag TsdSepData(n) is decoded from the variable-length codeword bsTsdCodedPos by TsdTrPos_dec(), as described below. The length of the codeword bsTsdCodedPos, i.e. nBitsTsdCW, is calculated according to:
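A minimal sketch of how such a codeword length could be derived, assuming the codeword only has to distinguish every possible placement of the transient slots among the numSlots slots of the frame. Under that assumption the shortest fixed length is the number of bits needed to index C(numSlots, bsTsdNumTrSlots + 1) alternatives. The function name `n_bits_tsd_cw` is hypothetical, and the formula is this assumption rather than a quotation of the syntax definition.

```python
from math import comb


def n_bits_tsd_cw(num_slots: int, bs_tsd_num_tr_slots: int) -> int:
    """Bits needed to index all placements of the transient slots (assumed rule)."""
    num_tr_slots = bs_tsd_num_tr_slots + 1  # the field carries the count minus one
    if not 0 < num_tr_slots <= num_slots:
        raise ValueError("transient slot count must be in 1..num_slots")
    states = comb(num_slots, num_tr_slots)  # number of distinct position sets
    # (states - 1).bit_length() equals ceil(log2(states)) for states >= 1
    return (states - 1).bit_length()
```

For example, a frame of 4 slots with 2 transient slots has 6 possible placements, which fit into 3 bits.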

Returning to FIG. 11, FIG. 11 illustrates the decoding of the TSD transient slot separation data bsTsdCodedPos into TsdSepData[n] according to an embodiment. An array of length numSlots, consisting of '1' for coded transient positions and '0' otherwise, is defined as illustrated in FIG. 11.
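The decoding into a 0/1 array can be sketched in Python as follows. This is an illustrative reading, not a transcription of FIG. 11: it assumes the codeword is an index in the combinatorial number system, so that each slot is tested from the last one downward against a binomial threshold. The function and parameter names are hypothetical.

```python
from math import comb


def tsd_tr_pos_dec(coded_pos: int, num_slots: int, num_tr_slots: int) -> list:
    """Expand a coded position index into a 0/1 array of length num_slots."""
    sep_data = [0] * num_slots
    state = coded_pos          # running (updated) state number
    remaining = num_tr_slots   # transient slots still to be placed
    for k in range(num_slots - 1, -1, -1):
        if remaining == 0:
            break
        # Threshold: number of ways to place the remaining transients below slot k.
        # comb() returns 0 when remaining > k, which forces all lower slots to 1.
        threshold = comb(k, remaining)
        if state >= threshold:
            state -= threshold
            sep_data[k] = 1
            remaining -= 1
    return sep_data
```

With 4 slots and 2 transients, the codes 0 to 5 each map to a different pair of positions, so no code value is wasted.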

If the TSD tool is disabled in the current frame, i.e. if (bsTsdEnable == 0), the input signal is processed as if TsdSepData(n) = 0 for all n.

The transient signal components are processed in the transient decorrelator structure D_TR as follows:

$d_{X,Tr}^{n,k} = \begin{cases} e^{j\phi_{TSD}^{n}} \cdot v_{X,Tr}^{n,k}, & \text{if bsTsdEnable} = 1 \\ 0, & \text{otherwise} \end{cases}$,

where $\phi_{TSD}^{n}$ is the transient phase for slot $n$, obtained from bsTsdTrPhaseData.

The non-transient signal components are processed in the all-pass decorrelator D_AP, as defined in the next subsection, which yields the decorrelator output for the non-transient signal components.

The decorrelator outputs are added to form the decorrelated signal, which contains both transient and non-transient components.
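The separate, decorrelate and combine path described above can be sketched as follows. The sketch rests on several assumptions: a slot flagged in TsdSepData is routed to the transient path in all bands, the phases are given in radians, and the all-pass decorrelator is passed in as a caller-supplied function `d_ap`. All Python names are hypothetical.

```python
import cmath
from typing import Callable, List

Signal = List[List[complex]]  # indexed as signal[n][k]: time slot n, band k


def tsd_decorrelate(v: Signal, tsd_sep_data: List[int], phi_tsd: List[float],
                    d_ap: Callable[[Signal], Signal],
                    bs_tsd_enable: int = 1) -> Signal:
    """Split v by slot, decorrelate both streams, and add the results."""
    v_tr, v_non_tr = [], []
    for n, slot in enumerate(v):
        # With TSD disabled, every slot is treated as non-transient.
        is_transient = bs_tsd_enable == 1 and tsd_sep_data[n] == 1
        zeros = [0j] * len(slot)
        v_tr.append(list(slot) if is_transient else zeros)
        v_non_tr.append(zeros if is_transient else list(slot))
    # Transient path D_TR: rotate each transient slot by its phase.
    d_tr = [[cmath.exp(1j * phi_tsd[n]) * x for x in slot]
            for n, slot in enumerate(v_tr)]
    # Non-transient path D_AP: delegated to the supplied decorrelator.
    d_non_tr = d_ap(v_non_tr)
    # Combiner: sample-wise sum of both decorrelator outputs.
    return [[a + b for a, b in zip(row_tr, row_non)]
            for row_tr, row_non in zip(d_tr, d_non_tr)]
```

Passing the decorrelator as a parameter keeps the sketch independent of any particular all-pass filter design.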

FIG. 22 illustrates the syntax of EcData comprising bsFrequencyResStrideXXX. The syntax element bsFreqResStride makes it possible to use broadband cues in MPS. XXX is to be replaced by the data type (CLD, ICC, IPD).

The transient steering decorrelator in the OTT decoder structure makes it possible to apply a specialized decorrelator to the transient components of applause-like signals. Activation of this TSD feature is controlled by the encoder-generated flag bsTsdEnable, which is transmitted once per frame.

The TSD data in the two-to-one (R-OTT) module of the encoder is generated as follows:

- Run a semantic signal classifier that detects applause-like signals. The classification result is transmitted once per frame: the bsTsdEnable flag is set to 1 for applause-like signals and to 0 otherwise.

- If bsTsdEnable is set to 0 for the current frame, no further TSD data is generated or transmitted for this frame.

- If bsTsdEnable is set to 1 for the current frame, perform the following:

○ Enable broadband calculation of the spatial OTT parameters.

○ Detect transients in the current frame (a binary decision for each MPS time slot).

○ Encode the tsdPosLen transient slot positions held in the vector tsdPos according to the pseudo-code illustrated in FIG. 13, where the slot positions in tsdPos are expected in ascending order.

○ Transmit the number of transient slots (bsTsdNumTrSlots = (number of detected transient slots) - 1).

○ Transmit the coded transient positions (bsTsdCodedPos).

○ For each transient slot, compute a phase value that represents the broadband phase difference between the downmix signal and the residual signal.

○ For each transient slot, encode and transmit the broadband phase difference value (bsTsdTrPhaseData).
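The position-encoding step in the list above can be sketched as follows. This is an illustrative enumerative scheme and not a transcription of FIG. 13: it assumes the ascending positions are mapped to a single integer by summing binomial coefficients (the combinatorial number system). The function name `tsd_pos_encode` is hypothetical.

```python
from math import comb
from typing import List


def tsd_pos_encode(tsd_pos: List[int]) -> int:
    """Map strictly ascending slot positions to one non-negative integer."""
    if any(a >= b for a, b in zip(tsd_pos, tsd_pos[1:])):
        raise ValueError("slot positions must be strictly ascending")
    # The rank-th smallest position contributes C(position, rank), rank from 1.
    return sum(comb(pos, rank) for rank, pos in enumerate(tsd_pos, start=1))
```

Under this assumption the encoder would transmit `len(tsd_pos) - 1` as bsTsdNumTrSlots and the returned integer as bsTsdCodedPos. A positive value is added for each transient slot except where the binomial coefficient is zero.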

Finally, FIG. 23 illustrates a signal flow diagram for generating the TSD data in the two-to-one (R-OTT) module.

Although some aspects have been described in the context of a device, it is clear that these aspects also represent a description of the corresponding method, where a block or device corresponds to a method step or a feature of a method step. Analogously, aspects described in the context of a method step also represent a description of a corresponding block, item or feature of a corresponding device.

Depending on certain implementation requirements, embodiments of the invention can be implemented in hardware or in software. The implementation can be performed using a digital storage medium, for example a floppy disk, a DVD, a CD, a ROM, a PROM, an EPROM, an EEPROM or a flash memory, having electronically readable control signals stored thereon, which cooperate (or are capable of cooperating) with a programmable computer system such that the respective method is performed.

Some embodiments according to the invention comprise a data carrier having electronically readable control signals, which are capable of cooperating with a programmable computer system such that one of the methods described herein is performed.

Generally, embodiments of the present invention can be implemented as a computer program product with a program code, the program code being operative to perform one of the methods when the computer program product runs on a computer. The program code may, for example, be stored on a machine-readable carrier.

Other embodiments comprise the computer program for performing one of the methods described herein, stored on a machine-readable carrier or a non-transitory storage medium.

In other words, an embodiment of the inventive method is, therefore, a computer program having a program code for performing one of the methods described herein when the computer program runs on a computer.

A further embodiment of the inventive methods is, therefore, a data carrier (or a digital storage medium, or a computer-readable medium) comprising, recorded thereon, the computer program for performing one of the methods described herein.

A further embodiment of the inventive method is, therefore, a data stream or a sequence of signals representing the computer program for performing one of the methods described herein. The data stream or the sequence of signals may, for example, be configured to be transferred via a data communication connection, for example via the Internet.

A further embodiment comprises a processing means, for example a computer or a programmable logic device, configured or adapted to perform one of the methods described herein.

A further embodiment comprises a computer having installed thereon the computer program for performing one of the methods described herein.

In some embodiments, a programmable logic device (for example a field-programmable gate array) may be used to perform some or all of the functionalities of the methods described herein. In some embodiments, a field-programmable gate array may cooperate with a microprocessor in order to perform one of the methods described herein. Generally, the methods are preferably performed by any hardware device.

The above-described embodiments are merely illustrative of the principles of the present invention. It is understood that modifications and variations of the arrangements and the details described herein will be apparent to others skilled in the art. It is the intent, therefore, that the invention be limited only by the scope of the appended claims and not by the specific details presented by way of description and explanation of the embodiments herein.

References

[2] J. Herre, K. Kjorling, J. Breebaart et al., "MPEG surround - the ISO/MPEG standard for efficient and compatible multi-channel audio coding", in Proceedings of the 122nd AES Convention, Vienna, Austria, May 2007.

[5] J. Engdegard, H. Purnhagen, J. Roden, L. Liljeryd, "Synthetic Ambience in Parametric Stereo Coding", in Proceedings of the AES 116th Convention, Berlin, Preprint, May 2004.

Claims

1. A device (10; 40; 60; 410) for decoding an encoded audio signal containing an audio signal frame containing slots and events associated with these slots, comprising:
an analyzing unit (20; 42; 70; 420) for analyzing the number of frame slots indicating the total number of slots of the audio signal frame, the number of event slots indicating the number of slots of the audio signal frame that contain events, and an event state number; and
a generating unit (30; 45; 80; 430) for generating an indication of a plurality of positions of slots containing events in the audio signal frame using the number of frame slots, the number of event slots and the event state number.

2. The device (10; 40; 60; 410) for decoding according to claim 1,
wherein the device (10; 40; 60; 410) for decoding is configured to decode the positions of the slots with transients in the frame of the audio signal.

3. The device (10; 40; 60; 410) for decoding according to claim 1,
in which the analyzing unit (20; 42; 70; 420) is configured to perform a check comparing the event state number or an updated event state number with a threshold value.

4. The device (10; 40; 60; 410) for decoding according to claim 3,
in which the analyzing unit (20; 42; 70; 420) is configured to perform the check by comparing whether the event state number or the updated event state number is greater than, greater than or equal to, less than, or less than or equal to the threshold value, and
in which the generating unit (30; 45; 80; 430) is further configured to update the event state number or the updated event state number depending on the result of the check.

5. The device (10; 40; 60) for decoding according to claim 3,
wherein the device (10; 40; 60) for decoding further comprises a slot selector (90),
wherein the slot selector (90) is configured to select a slot as the considered slot,
wherein the analyzing unit (20; 42; 70) is configured to perform the check with respect to the considered slot,
and wherein the threshold value depends on the number of frame slots, the number of event slots and the position of the considered slot within the frame.

6. The device (10; 40) for decoding according to claim 5,
in which the analyzing unit (20; 42; 70) is configured to perform a check comparing the event state number or an updated event state number with a threshold value, where the threshold value is

$\binom{N-h-1}{P}$,

where N is the total number of slots of the audio signal frame, where P is the number of slots containing events of the audio signal frame or of the considered portion of the audio signal frame, and where h is the position of the considered slot within the frame.

7. The device (10; 40; 410) for decoding according to claim 1,
wherein the device for decoding (10; 40; 410) further comprises a frame splitting unit (440),
wherein the frame splitting unit (440) is configured to split the frame into a first frame section containing a first set of slots of the frame and a second frame section containing a second set of slots of the frame, and wherein the device (10; 40; 410) for decoding is further configured to determine the positions of slots containing events separately for each of the frame sections.

8. The device (10; 40; 60; 410) for decoding according to claim 1, further comprising:
an audio signal processor (50) for generating an output audio signal using the indication of the plurality of positions of slots containing events in the audio signal frame obtained using the number of frame slots, the number of event slots and the event state number.

9. The device (10; 60; 410) for decoding according to claim 8,
in which the audio signal processor (50) is configured to generate the output audio signal according to a first method if the indication of the plurality of positions of slots containing events is in a first indication state, and in which the audio signal processor (50) is configured to generate the output audio signal according to a different second method if the indication of the plurality of positions of slots containing events is in a second indication state that differs from the first indication state.

10. The device (10; 40; 60; 410) for decoding according to claim 9,
in which the audio signal processor (50) is configured such that the first method comprises using a transient decorrelator (56) to decode a slot if the first indication state indicates that the slot contains a transient, and such that the second method comprises using a second decorrelator (54) to decode a slot if the second indication state indicates that the slot does not contain a transient.

11. A device (510) for encoding slot positions containing events in an audio signal frame, comprising:
an event state number generator (530) for encoding slot positions by encoding an event state number; and
a slot information unit (520) configured to provide the number of frame slots indicating the total number of slots of the audio signal frame and the number of event slots indicating the number of slots of the audio signal frame that contain events to the event state number generator (530),
wherein the event state number, the number of frame slots and the number of event slots together indicate a plurality of positions of slots containing events in the audio signal frame.

12. The device (510) for encoding according to claim 11,
wherein the event state number generator (530) is configured to generate the event state number by adding a positive integer value for each slot containing an event.

13. The device (510) for encoding according to claim 11,
wherein the event state number generator (530) is configured to generate the event state number by determining a first event substate number for a first frame section, by determining a second event substate number for a second frame section, and by combining the first and second event substate numbers to generate the event state number.

14. A method for decoding positions of slots containing events in an audio signal frame, comprising the steps of:
analyzing the number of frame slots indicating the total number of slots of the audio signal frame, the number of event slots indicating the number of slots of the audio signal frame that contain events, and an event state number; and
generating an indication of a plurality of positions of slots containing events in the audio signal frame using the number of frame slots, the number of event slots and the event state number.

15. A method of encoding slot positions containing events in an audio signal frame, comprising the steps of:
receiving or determining the number of frame slots indicating the total number of slots of the audio signal frame,
receiving or determining the number of event slots indicating the number of slots of the audio signal frame that contain events, and
encoding an event state number based on the event state number, the number of frame slots and the number of event slots, so that the indication of the plurality of positions of slots containing events in the audio signal frame can be encoded using the number of frame slots, the number of event slots and the event state number.

16. A computer-readable medium containing a computer program recorded on it, which, when executed on a computer, instructs the computer to perform a method of decoding the positions of slots with events in an audio signal frame according to claim 14.

17. A computer-readable medium containing a computer program recorded on it, which, when executed on a computer, instructs the computer to perform a method of encoding slot positions with events in an audio signal frame according to claim 15.