
CN101361117B - Method and apparatus for processing a media signal - Google Patents

Method and apparatus for processing a media signal

Info

Publication number
CN101361117B
CN101361117B CN2007800015359A CN200780001535A
Authority
CN
China
Prior art keywords
information
rendering
signal
unit
filter
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN2007800015359A
Other languages
Chinese (zh)
Other versions
CN101361117A (en)
Inventor
吴贤午
房熙锡
金东秀
林宰显
郑亮源
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
LG Electronics Inc
Original Assignee
LG Electronics Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by LG Electronics Inc filed Critical LG Electronics Inc
Priority claimed from PCT/KR2007/000350 external-priority patent/WO2007083960A1/en
Publication of CN101361117A
Application granted
Publication of CN101361117B
Legal status: Active
Anticipated expiration legal-status Critical


Landscapes

  • Stereophonic System (AREA)

Abstract

An apparatus for processing a media signal and a method thereof are disclosed, by which the media signal can be converted into a surround signal by using spatial information of the media signal. The present invention provides a method of processing a signal, including: extracting a downmix signal from a bitstream; generating a decorrelated downmix signal by applying a decorrelator to the downmix signal; and generating a surround signal by applying rendering information for generating the surround signal to the downmix signal and the decorrelated downmix signal.

Description

Method and Apparatus for Processing a Media Signal

Technical Field

The present invention relates to an apparatus for processing a media signal and a method thereof, and more particularly, to an apparatus for generating a surround signal by using spatial information of the media signal and a method thereof.

Background Art

Generally, various kinds of apparatuses and methods have been widely used to generate a multi-channel media signal by using spatial information of the multi-channel media signal and a downmix signal, where the downmix signal is generated by downmixing the multi-channel media signal into a mono or stereo signal.

However, the above methods and apparatuses are unusable in environments unsuitable for generating multi-channel signals. For instance, they cannot be used by a device capable of generating stereo signals only. In other words, no method or apparatus exists for generating a surround signal having multi-channel characteristics in an environment where the multi-channel signal cannot be generated by using spatial information of the multi-channel signal.

Therefore, since no method or apparatus exists for generating a surround signal in a device capable of generating mono or stereo signals only, it is difficult to process a media signal efficiently.

Disclosure of the Invention

Technical Problem

Accordingly, the present invention is directed to an apparatus for processing a media signal and a method thereof that substantially obviate one or more problems due to limitations and disadvantages of the related art.

An object of the present invention is to provide an apparatus for processing a signal and a method thereof, by which a media signal can be converted into a surround signal by using spatial information of the media signal.

Additional features and advantages of the invention will be set forth in the description which follows, and in part will be apparent from the description, or may be learned by practice of the invention. The objectives and other advantages of the invention will be realized and attained by the structure particularly pointed out in the written description and claims hereof as well as the appended drawings.

Technical Solution

To achieve these and other advantages and in accordance with the purpose of the present invention, a method of processing a signal according to the present invention includes: generating source mapping information corresponding to each of a plurality of sources by using spatial information indicating features between the plurality of sources; generating sub-rendering information by applying, per source, filter information giving a surround effect to the source mapping information; generating rendering information for generating a surround signal by integrating at least one piece of the sub-rendering information; and generating the surround signal by applying the rendering information to a downmix signal generated by downmixing the plurality of sources.

To further achieve these and other advantages and in accordance with the purpose of the present invention, an apparatus for processing a signal includes: a source mapping unit generating source mapping information corresponding to each of a plurality of sources by using spatial information indicating features between the plurality of sources; a sub-rendering information generating unit generating sub-rendering information by applying, per source, filter information having a surround effect to the source mapping information; an integrating unit generating rendering information for generating a surround signal by integrating at least one piece of the sub-rendering information; and a rendering unit generating the surround signal by applying the rendering information to a downmix signal generated by downmixing the plurality of sources.

It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory and are intended to provide further explanation of the invention as claimed.

Advantageous Effects

The signal processing apparatus and method according to the present invention enable a decoder, which receives a bitstream including a downmix signal generated by downmixing a multi-channel signal and spatial information of the multi-channel signal, to generate a signal having a surround effect in an environment incapable of reconstructing the multi-channel signal.

Brief Description of the Drawings

The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this application, illustrate embodiments of the invention and together with the description serve to explain the principles of the invention.

In the drawings:

FIG. 1 is a block diagram of an audio signal encoding apparatus and an audio signal decoding apparatus according to an embodiment of the present invention;

FIG. 2 is a diagram of a structure of a bitstream of an audio signal according to an embodiment of the present invention;

FIG. 3 is a detailed block diagram of a spatial information converting unit according to an embodiment of the present invention;

FIG. 4 and FIG. 5 are block diagrams of channel configurations used in a source mapping process according to an embodiment of the present invention;

FIG. 6 and FIG. 7 are detailed block diagrams of a rendering unit for a stereo downmix signal according to an embodiment of the present invention;

FIG. 8 and FIG. 9 are detailed block diagrams of a rendering unit for a mono downmix signal according to an embodiment of the present invention;

FIG. 10 and FIG. 11 are block diagrams of a smoothing unit and an expanding unit according to an embodiment of the present invention;

FIG. 12 is a graph to explain a first smoothing method according to an embodiment of the present invention;

FIG. 13 is a graph to explain a second smoothing method according to an embodiment of the present invention;

FIG. 14 is a graph to explain a third smoothing method according to an embodiment of the present invention;

FIG. 15 is a graph to explain a fourth smoothing method according to an embodiment of the present invention;

FIG. 16 is a graph to explain a fifth smoothing method according to an embodiment of the present invention;

FIG. 17 is a diagram to explain prototype filter information corresponding to each channel;

FIG. 18 is a block diagram of a first method of generating rendering filter information in a spatial information converting unit according to an embodiment of the present invention;

FIG. 19 is a block diagram of a second method of generating rendering filter information in a spatial information converting unit according to an embodiment of the present invention;

FIG. 20 is a block diagram of a third method of generating rendering filter information in a spatial information converting unit according to an embodiment of the present invention;

FIG. 21 is a diagram to explain a method of generating a surround signal in a rendering unit according to an embodiment of the present invention;

FIG. 22 is a diagram of a first interpolating method according to an embodiment of the present invention;

FIG. 23 is a diagram of a second interpolating method according to an embodiment of the present invention;

FIG. 24 is a diagram of a block switching method according to an embodiment of the present invention;

FIG. 25 is a block diagram of positions to which a window length decided by a window length deciding unit is applied according to an embodiment of the present invention;

FIG. 26 is a diagram of filters having various lengths used in processing an audio signal according to an embodiment of the present invention;

FIG. 27 is a diagram of a method of processing an audio signal dividedly by using a plurality of subfilters according to an embodiment of the present invention;

FIG. 28 is a block diagram of a method of rendering partition rendering information generated by a plurality of subfilters to a mono downmix signal according to an embodiment of the present invention;

FIG. 29 is a block diagram of a method of rendering partition rendering information generated by a plurality of subfilters to a stereo downmix signal according to an embodiment of the present invention;

FIG. 30 is a block diagram of a first domain converting method for a downmix signal according to an embodiment of the present invention; and

FIG. 31 is a block diagram of a second domain converting method for a downmix signal according to an embodiment of the present invention.

Best Mode for Carrying Out the Invention

Reference will now be made in detail to the preferred embodiments of the present invention, examples of which are illustrated in the accompanying drawings.

FIG. 1 is a block diagram of an audio signal encoding apparatus and an audio signal decoding apparatus according to an embodiment of the present invention.

Referring to FIG. 1, an encoding apparatus 10 includes a downmixing unit 100, a spatial information generating unit 200, a downmix signal encoding unit 300, a spatial information encoding unit 400, and a multiplexing unit 500.

If a multi-source audio signal (X1, X2, ..., Xn) is input to the downmixing unit 100, the downmixing unit 100 downmixes the input signal into a downmix signal. In this case, the downmix signal includes mono, stereo, and multi-source audio signals.

A source includes a channel and is conveniently written as a channel in the following description. In this specification, a mono or stereo downmix signal is taken as a reference. However, the present invention is not limited to a mono or stereo downmix signal.

The encoding apparatus 10 can optionally use an arbitrary downmix signal provided directly from an external environment.

The spatial information generating unit 200 generates spatial information from the multi-channel audio signal. The spatial information can be generated in the course of the downmixing process. The generated downmix signal and spatial information are encoded by the downmix signal encoding unit 300 and the spatial information encoding unit 400, respectively, and are then transferred to the multiplexing unit 500.

In the present invention, 'spatial information' means information needed by a decoding apparatus to generate a multi-channel signal by upmixing a downmix signal, where the downmix signal is generated by an encoding apparatus downmixing the multi-channel signal and is transferred to the decoding apparatus. The spatial information includes spatial parameters. The spatial parameters include CLD (channel level difference) indicating an energy difference between channels, ICC (inter-channel coherence) indicating a correlation between channels, CPC (channel prediction coefficient) used in generating three channels from two channels, and the like.
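The two most-used parameters above can be sketched briefly. This is an illustrative estimate over one parameter band only; the function names and the power/correlation estimators are assumptions, not the codec's normative parameter extraction:

```python
import numpy as np

def channel_level_difference(ch1, ch2, eps=1e-12):
    """CLD in dB: ratio of the energies of two channel signals."""
    p1 = np.sum(np.abs(ch1) ** 2)
    p2 = np.sum(np.abs(ch2) ** 2)
    return 10.0 * np.log10((p1 + eps) / (p2 + eps))

def inter_channel_coherence(ch1, ch2, eps=1e-12):
    """ICC: normalized cross-correlation between two channels, in [0, 1]."""
    num = np.abs(np.sum(ch1 * np.conj(ch2)))
    den = np.sqrt(np.sum(np.abs(ch1) ** 2) * np.sum(np.abs(ch2) ** 2)) + eps
    return num / den
```

Identical channels give a CLD of 0 dB and an ICC of 1; doubling one channel's amplitude shifts the CLD by about 6 dB.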

In the present invention, 'downmix signal encoding unit' or 'downmix signal decoding unit' means a codec that encodes or decodes an audio signal rather than spatial information. In this specification, a downmix audio signal is taken as an example of the audio signal rather than spatial information. And, the downmix signal encoding or decoding unit may include MP3, AC-3, DTS, or AAC. Moreover, the downmix signal encoding or decoding unit may include codecs of the future as well as codecs that have been developed already.

The multiplexing unit 500 generates a bitstream by multiplexing the downmix signal and the spatial information, and then transfers the generated bitstream to a decoding apparatus 20. The structure of the bitstream will be explained later with reference to FIG. 2.

The decoding apparatus 20 includes a demultiplexing unit 600, a downmix signal decoding unit 700, a spatial information decoding unit 800, a rendering unit 900, and a spatial information converting unit 1000.

The demultiplexing unit 600 receives the bitstream and then separates an encoded downmix signal and encoded spatial information from the bitstream. Subsequently, the downmix signal decoding unit 700 decodes the encoded downmix signal, and the spatial information decoding unit 800 decodes the encoded spatial information.

The spatial information converting unit 1000 generates rendering information applicable to the downmix signal by using the decoded spatial information and filter information. In this case, the rendering information is applied to the downmix signal to generate a surround signal.

For instance, the surround signal is generated in the following manner. First, the process of generating the downmix signal from the multi-channel audio signal by the encoding apparatus 10 can include several steps using an OTT (one-to-two) box or a TTT (two-to-three) box. In this case, spatial information can be generated from each of the steps. The spatial information is transferred to the decoding apparatus 20. The decoding apparatus 20 then generates the surround signal by converting the spatial information and rendering the converted spatial information with the downmix signal. Instead of generating a multi-channel signal by upmixing the downmix signal, the present invention relates to a rendering method including the steps of extracting spatial information for each upmixing step and performing rendering by using the extracted spatial information. For instance, HRTF (head-related transfer function) filtering is usable in the rendering method.

In this case, the spatial information is a value applicable to a hybrid domain as well. So, the rendering can be classified into the following types according to the domain.

The first type is to perform the rendering on a hybrid domain by passing the downmix signal through a hybrid filter bank. In this case, a domain conversion of the spatial information is unnecessary.

The second type is to perform the rendering on a time domain. In this case, the second type uses the fact that an HRTF filter is modeled as an FIR (finite impulse response) filter or an IIR (infinite impulse response) filter on the time domain. So, a process of converting the spatial information into filter coefficients of the time domain is needed.

The third type is to perform the rendering on a different frequency domain. For instance, the rendering is performed on a DFT (discrete Fourier transform) domain. In this case, a process of transforming the spatial information into the corresponding domain is necessary. In particular, the third type enables fast operations by replacing filtering on the time domain with operations on the frequency domain.
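The speedup of the third type rests on the standard equivalence between time-domain convolution and DFT-domain multiplication. A minimal sketch of that equivalence, using NumPy's FFT as a stand-in for whatever transform the decoder actually uses:

```python
import numpy as np

def render_time_domain(x, h):
    # time-domain rendering: direct convolution of signal x with filter h
    return np.convolve(x, h)

def render_dft_domain(x, h):
    # DFT-domain rendering: zero-pad both, multiply spectra, inverse transform
    n = len(x) + len(h) - 1
    size = 1 << (n - 1).bit_length()  # next power of two for the FFT
    X = np.fft.rfft(x, size)
    H = np.fft.rfft(h, size)
    return np.fft.irfft(X * H, size)[:n]
```

Both routes produce the same output; the DFT route costs O(N log N) instead of O(N*M), which is why replacing time-domain filtering with frequency-domain operations is faster.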

In the present invention, filter information is information on a filter needed for processing an audio signal and includes filter coefficients provided to a specific filter. Examples of the filter information are explained as follows. First, prototype filter information is original filter information of a specific filter and can be written as GL_L or the like. Converted filter information indicates filter coefficients after the prototype filter information has been converted and can be written as GL_L′ or the like. Sub-rendering information means filter information resulting from spatializing the prototype filter information to generate a surround signal and can be written as FL_L1 or the like. Rendering information means filter information needed to perform the rendering and can be written as HL_L or the like. Interpolated/smoothed rendering information means filter information resulting from interpolating/smoothing the rendering information and can be written as HL_L′ or the like. In this specification, the above names for the filter information are used. However, the present invention is not limited by the names of the filter information. In particular, HRTF is taken as an example of the filter information. However, the present invention is not limited to HRTF.

The rendering unit 900 receives the decoded downmix signal and the rendering information and then generates a surround signal by using them. The surround signal may be a signal that provides a surround effect to an audio system capable of generating stereo signals only. And, the present invention is applicable to various systems as well as the audio system capable of generating stereo signals only.

FIG. 2 is a diagram of a structure of a bitstream of an audio signal according to an embodiment of the present invention, in which the bitstream includes an encoded downmix signal and encoded spatial information.

Referring to FIG. 2, an audio payload of one frame includes a downmix signal field and an ancillary data field. The encoded spatial information can be stored in the ancillary data field. For instance, if the audio payload is 48-128 kbps, the spatial information can have a range of 5-32 kbps. However, no limitations are imposed on the ranges of the audio payload and the spatial information.
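The two-field frame layout can be illustrated with a toy packer; the length-prefix framing below is purely an assumption for illustration, not the codec's actual bitstream syntax:

```python
def pack_frame(downmix_bytes, spatial_bytes):
    """One audio-payload frame: downmix signal field, then ancillary data field."""
    header = len(downmix_bytes).to_bytes(4, "big") + len(spatial_bytes).to_bytes(4, "big")
    return header + downmix_bytes + spatial_bytes

def unpack_frame(frame):
    """Split a frame back into (downmix signal field, ancillary data field)."""
    n_dmx = int.from_bytes(frame[0:4], "big")
    n_spa = int.from_bytes(frame[4:8], "big")
    dmx = frame[8:8 + n_dmx]
    spa = frame[8 + n_dmx:8 + n_dmx + n_spa]
    return dmx, spa
```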

FIG. 3 is a detailed block diagram of a spatial information converting unit according to an embodiment of the present invention.

Referring to FIG. 3, the spatial information converting unit 1000 includes a source mapping unit 1010, a sub-rendering information generating unit 1020, an integrating unit 1030, a processing unit 1040, and a domain converting unit 1050.

The source mapping unit 1010 generates source mapping information corresponding to each source of an audio signal by performing source mapping using the spatial information. In this case, the source mapping information means per-source information generated by using the spatial information and the like to correspond to each source of the audio signal. A source includes a channel, and in this case, source mapping information corresponding to each channel is generated. The source mapping information can be represented as coefficients. And, the source mapping process will be explained in detail later with reference to FIG. 4 and FIG. 5.

The sub-rendering information generating unit 1020 generates sub-rendering information corresponding to each source by using the source mapping information and the filter information. For instance, if the rendering unit 900 is an HRTF filter, the sub-rendering information generating unit 1020 can generate the sub-rendering information by using HRTF filter information.

The integrating unit 1030 generates rendering information by integrating the sub-rendering information to correspond to each source of the downmix signal. The rendering information, generated by using the spatial information and the filter information, means information for generating the surround signal by being applied to the downmix signal. And, the rendering information includes a filter-coefficient type. The integration can be omitted to reduce the amount of operation of the rendering process. Subsequently, the rendering information is transferred to the processing unit 1040.

The processing unit 1040 includes an interpolating unit 1041 and/or a smoothing unit 1042. The rendering information is interpolated by the interpolating unit 1041 and/or smoothed by the smoothing unit 1042.

The domain converting unit 1050 converts the domain of the rendering information into the domain of the downmix signal used by the rendering unit 900. And, the domain converting unit 1050 can be provided to one of various positions including the position shown in FIG. 3. So, if the rendering information is generated on the same domain as that of the rendering unit 900, the domain converting unit 1050 can be omitted. The domain-converted rendering information is then transferred to the rendering unit 900.

The spatial information converting unit 1000 can include a filter information converting unit 1060. In FIG. 3, the filter information converting unit 1060 is provided within the spatial information converting unit 1000. Alternatively, the filter information converting unit 1060 can be provided outside the spatial information converting unit 1000. The filter information converting unit 1060 converts arbitrary filter information, e.g., HRTF, to be suitable for generating the sub-rendering information or the rendering information. The converting process of the filter information can include the following steps.

First, a step of matching a domain to be applicable is included. If the domain of the filter information does not match the domain on which the rendering is performed, the domain matching step is needed. For instance, a step of converting a time-domain HRTF into a DFT, QMF, or hybrid domain for generating the rendering information is needed.

Secondly, a coefficient reducing step can be included. In this case, it is easy to store the domain-converted HRTF and to apply the domain-converted HRTF to the spatial information. For instance, if prototype filter coefficients have a response of a long tap number (length), the corresponding coefficients must be stored in a memory corresponding to a total of ten responses of the corresponding length in case of 5.1 channels. This increases the memory load and the amount of operation. To prevent this problem, a method of reducing the filter coefficients to be stored, while maintaining the filter characteristics, can be used in the domain converting process. For instance, the HRTF response can be converted into a small number of parameter values. In this case, the parameter generating process and the parameter values can differ according to the applied domain.
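One simple way to realize such a coefficient reduction is to keep only the shortest prefix of a long prototype response that retains most of its energy. This is a hedged sketch under that assumption, not the parameterization the patent actually specifies:

```python
import numpy as np

def reduce_taps(h, keep_energy=0.99):
    """Return the shortest prefix of response h holding `keep_energy` of its energy."""
    e = np.cumsum(np.asarray(h, dtype=float) ** 2)  # cumulative energy per tap
    n = int(np.searchsorted(e, keep_energy * e[-1])) + 1
    return h[:n]
```

For a typical decaying response, this trades a small loss of the filter's tail for a large reduction in stored coefficients and per-sample operations.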

The downmix signal passes through a domain converting unit 1100 and/or a decorrelating unit 1200 before being rendered with the rendering information. In case that the domain of the rendering information differs from that of the downmix signal, the domain converting unit 1100 converts the domain of the downmix signal to match the two domains to each other.

The decorrelating unit 1200 is applied to the domain-converted downmix signal. This may require a relatively higher amount of operation than the method of applying a decorrelator to the rendering information. However, it can prevent distortion from occurring in the process of generating the rendering information. If the amount of operation allows, the decorrelating unit 1200 can include a plurality of decorrelators differing from each other in characteristics. If the downmix signal is a stereo signal, the decorrelating unit 1200 may not be used. In FIG. 3, in case that a domain-converted mono downmix signal, i.e., a mono downmix signal on a frequency, hybrid, QMF, or DFT domain, is used in the rendering process, a decorrelator is used on the corresponding domain. And, the present invention also includes a decorrelator used on a time domain. In this case, the mono downmix signal prior to the domain converting unit 1100 is directly input to the decorrelating unit 1200. A first-order or higher-order IIR filter (or an FIR filter) is usable as the decorrelator.
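The last sentence permits a first-order IIR filter as the decorrelator. A common choice with the right property is a first-order allpass, which scrambles phase while leaving the magnitude spectrum intact; the gain and delay values below are assumptions for illustration, not values from the patent:

```python
import numpy as np

def allpass_decorrelator(x, g=0.6, d=1):
    """First-order IIR allpass: y[n] = -g*x[n] + x[n-d] + g*y[n-d]."""
    y = np.zeros(len(x))
    for n in range(len(x)):
        x_del = x[n - d] if n >= d else 0.0
        y_del = y[n - d] if n >= d else 0.0
        y[n] = -g * x[n] + x_del + g * y_del
    return y
```

Because |H(e^jw)| = 1 for an allpass, the decorrelated copy keeps the energy of the downmix signal while its correlation with the original is reduced.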

Subsequently, the rendering unit 900 generates a surround signal by using the downmix signal, the decorrelated downmix signal, and the rendering information. If the downmix signal is a stereo signal, the decorrelated downmix signal may not be used. Details of the rendering process will be described later with reference to FIGs. 6 to 9.

The surround signal is converted into a time domain by a domain inverse converting unit 1300 and then output. If so, a user can listen to a sound having a multi-channel effect through stereo earphones or the like.

FIG. 4 and FIG. 5 are block diagrams of channel configurations used in a source mapping process according to an embodiment of the present invention. The source mapping process is a process of generating source mapping information corresponding to each source of an audio signal by using spatial information. As mentioned in the foregoing description, a source includes a channel, and source mapping information can be generated to correspond to the channels shown in FIG. 4 and FIG. 5. The source mapping information is generated in a type suitable for the rendering process.

For example, if the downmix signal is a mono signal, the source mapping information can be generated using spatial information such as CLD1 to CLD5 and ICC1 to ICC5.

The source mapping information can be expressed as values such as D_L, D_R, D_C, D_LFE, D_Ls, and D_Rs. The process of generating the source mapping information may vary according to the tree structure corresponding to the spatial information, the range of spatial information used, and the like. In this specification the downmix signal is, for example, a mono signal, which does not limit the present invention.

The left and right channel outputs of the rendering unit 900 can be expressed as Equation 1.

Equation 1

Lo = L*GL_L' + C*GC_L' + R*GR_L' + Ls*GLs_L' + Rs*GRs_L'

Ro = L*GL_R' + C*GC_R' + R*GR_R' + Ls*GLs_R' + Rs*GRs_R'

In this case, the operator '*' indicates a product in the DFT domain, and can be replaced by a convolution in the QMF or time domain.
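In the DFT domain the Lo/Ro expressions above therefore reduce, per frequency bin, to a multiply-accumulate over the upmixed sources L, C, R, Ls, and Rs. A minimal sketch follows; the function name is illustrative:

```python
def render_bin(sources, gains_left, gains_right):
    """Render one frequency bin: each upmixed source value is weighted
    by its left/right rendering gain (GL_L', GC_L', ... in the text)
    and the products are summed into the two output bins Lo and Ro."""
    Lo = sum(s * g for s, g in zip(sources, gains_left))
    Ro = sum(s * g for s, g in zip(sources, gains_right))
    return Lo, Ro
```

In the QMF or time domain the per-bin product would instead be a convolution, as noted above.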

The present invention includes methods of generating L, C, R, Ls, and Rs from source mapping information that uses spatial information, or from source mapping information that uses spatial information and filter information. For example, the source mapping information can be generated using only the CLD of the spatial information, or using the CLD and ICC. The method of generating source mapping information using only the CLD is explained below.

In the case where the tree structure has the structure shown in Figure 4, a first method of obtaining the source mapping information using only the CLD can be expressed as Equation 2.

Equation 2

[L; R; C; LFE; Ls; Rs] = [D_L; D_R; D_C; D_LFE; D_Ls; D_Rs] · m

where
D_L   = c1,OTT3 · c1,OTT1 · c1,OTT0
D_R   = c2,OTT3 · c1,OTT1 · c1,OTT0
D_C   = c1,OTT4 · c2,OTT1 · c1,OTT0
D_LFE = c2,OTT4 · c2,OTT1 · c1,OTT0
D_Ls  = c1,OTT2 · c2,OTT0
D_Rs  = c2,OTT2 · c2,OTT0

In this case,

c1,OTTx^(l,m) = sqrt( 10^(CLD_x^(l,m)/10) / (1 + 10^(CLD_x^(l,m)/10)) )

c2,OTTx^(l,m) = sqrt( 1 / (1 + 10^(CLD_x^(l,m)/10)) )

and 'm' indicates the mono downmix signal.
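The c1/c2 factors above depend only on the CLD value in dB. A sketch of the mapping follows, assuming the usual energy normalization c1^2 + c2^2 = 1; the function name is illustrative:

```python
import math

def ott_gains(cld_db):
    """Convert a channel level difference CLD (in dB) into the pair of
    gains (c1, c2) of one OTT box, assuming the energy normalization
    c1**2 + c2**2 == 1."""
    r = 10.0 ** (cld_db / 10.0)      # linear power ratio between outputs
    c1 = math.sqrt(r / (1.0 + r))    # gain toward the first output
    c2 = math.sqrt(1.0 / (1.0 + r))  # gain toward the second output
    return c1, c2
```

Entries such as D_L are then products of these per-box gains along the tree path from the downmix to the channel.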

In the case where the tree structure has the structure shown in Figure 5, a second method of obtaining the source mapping information using only the CLD can be expressed as Equation 3.

Equation 3

[L; Ls; R; Rs; C; LFE] = [D_L; D_Ls; D_R; D_Rs; D_C; D_LFE] · m

where
D_L   = c1,OTT3 · c1,OTT1 · c1,OTT0
D_Ls  = c2,OTT3 · c1,OTT1 · c1,OTT0
D_R   = c1,OTT4 · c2,OTT1 · c1,OTT0
D_Rs  = c2,OTT4 · c2,OTT1 · c1,OTT0
D_C   = c1,OTT2 · c2,OTT0
D_LFE = c2,OTT2 · c2,OTT0

If the source mapping information is generated using only the CLD, the three-dimensional effect may be degraded, so the source mapping information can also be generated using the ICC and/or a decorrelator. The multi-channel information generated using a decorrelator output signal dx(m) can be expressed as Equation 4.

Equation 4

[L; R; C; LFE; Ls; Rs] =
| A_L1·m + B_L0·d0(m) + B_L1·d1(C_L1·m) + B_L3·d3(C_L3·m) |
| A_R1·m + B_R0·d0(m) + B_R1·d1(C_R1·m) + B_R3·d3(C_R3·m) |
| A_C1·m + B_C0·d0(m) + B_C1·d1(C_C1·m)                   |
| c2,OTT4 · c2,OTT1 · c1,OTT0 · m                          |
| A_Ls1·m + B_Ls0·d0(m) + B_Ls2·d2(C_Ls2·m)               |
| A_Rs1·m + B_Rs0·d0(m) + B_Rs2·d2(C_Rs2·m)               |

In this case, 'A', 'B', and 'C' are values that can be expressed using the CLD and ICC, 'd0' to 'd3' indicate decorrelators, and 'm' indicates the mono downmix signal. However, this method cannot be used to generate source mapping information such as D_L and D_R.

Therefore, a first method of generating source mapping information using the CLD, ICC, and/or a decorrelator on the downmix signal treats dx(m) (x = 0, 1, 2) as independent inputs. In this case, 'dx' can be used in the process of generating sub-rendering filter information according to Equation 5.

Equation 5

FL_L_M = d_L_M * GL_L'    (mono input → left output)

FL_R_M = d_L_M * GL_R'    (mono input → right output)

FL_L_Dx = d_L_Dx * GL_L'    (Dx output → left output)

FL_R_Dx = d_L_Dx * GL_R'    (Dx output → right output)

The rendering information can then be generated according to Equation 6 using the results of Equation 5.

Equation 6

HM_L = FL_L_M + FR_L_M + FC_L_M + FLS_L_M + FRS_L_M + FLFE_L_M

HM_R = FL_R_M + FR_R_M + FC_R_M + FLS_R_M + FRS_R_M + FLFE_R_M

HDx_L = FL_L_Dx + FR_L_Dx + FC_L_Dx + FLS_L_Dx + FRS_L_Dx + FLFE_L_Dx

HDx_R = FL_R_Dx + FR_R_Dx + FC_R_Dx + FLS_R_Dx + FRS_R_Dx + FLFE_R_Dx

Details of the rendering information generating process are explained later. The first method of generating source mapping information using the CLD, ICC, and/or a decorrelator treats the decorrelator output value 'dx(m)' as an independent input, which may increase the computational load.
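The sums above that build HM_L, HM_R, HDx_L, and HDx_R are tap-wise (or bin-wise) accumulations of the per-source sub-rendering filter information. A minimal sketch with an illustrative function name:

```python
def combine_sub_filters(sub_filters):
    """Sum equal-length sub-rendering filters tap by tap, e.g.
    HM_L = FL_L_M + FR_L_M + FC_L_M + FLS_L_M + FRS_L_M + FLFE_L_M.
    'sub_filters' is a list of equal-length coefficient lists."""
    length = len(sub_filters[0])
    return [sum(f[i] for f in sub_filters) for i in range(length)]
```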

A second method of generating source mapping information using the CLD, ICC, and/or a decorrelator employs a decorrelator applied in the frequency domain. In this case, the source mapping information can be expressed as Equation 7.

Equation 7

[L; R; C; LFE; Ls; Rs] =
| A_L1 + B_L0·d0 + B_L1·d1·C_L1 + B_L3·d3·C_L3 |
| A_R1 + B_R0·d0 + B_R1·d1·C_R1 + B_R3·d3·C_R3 |
| A_C1 + B_C0·d0 + B_C1·d1·C_C1                |  · m
| c2,OTT4 · c2,OTT1 · c1,OTT0                  |
| A_Ls1 + B_Ls0·d0 + B_Ls2·d2·C_Ls2            |
| A_Rs1 + B_Rs0·d0 + B_Rs2·d2·C_Rs2            |

In this case, by applying the decorrelator in the frequency domain, source mapping information such as D_L and D_R can be generated in the same form as before the decorrelator is applied, so the method can be implemented in a simple manner.

A third method of generating source mapping information using the CLD, ICC, and/or a decorrelator employs a decorrelator having an all-pass characteristic, like the decorrelator of the second method. Here, the all-pass characteristic means that the magnitude is fixed and only the phase varies. The present invention may also employ a decorrelator having an all-pass characteristic as the decorrelator of the first method.

A fourth method of generating source mapping information using the CLD, ICC, and/or a decorrelator performs the decorrelation with a decorrelator for each channel (e.g., L, R, C, Ls, Rs) instead of the 'd0' to 'd3' of the second method. In this case, the source mapping information can be expressed as Equation 8.

Equation 8

[L; R; C; LFE; Ls; Rs] =
| A_L1 + K_L·d_L              |
| A_R1 + K_R·d_R              |
| A_C1 + K_C·d_C              |  · m
| c2,OTT4 · c2,OTT1 · c1,OTT0 |
| A_Ls1 + K_Ls·d_Ls           |
| A_Rs1 + K_Rs·d_Rs           |

In this case, 'K' is the energy value of the decorrelated signal determined from the CLD and ICC values, and 'd_L', 'd_R', 'd_C', 'd_Ls', and 'd_Rs' indicate the decorrelators applied to the respective channels.

A fifth method of generating source mapping information using the CLD, ICC, and/or a decorrelator maximizes the decorrelation effect by configuring 'd_L' and 'd_R' of the fourth method to be mutually symmetric, and likewise 'd_Ls' and 'd_Rs'. Specifically, assuming d_R = f(d_L) and d_Rs = f(d_Ls), only 'd_L', 'd_C', and 'd_Ls' need to be designed.

A sixth method of generating source mapping information using the CLD, ICC, and/or a decorrelator configures 'd_L' and 'd_Ls' of the fifth method to be correlated with each other; 'd_L' and 'd_C' may also be configured to be correlated.

A seventh method of generating source mapping information using the CLD, ICC, and/or a decorrelator uses the decorrelator of the third method as a serial or nested structure of all-pass filters. It exploits the fact that the all-pass characteristic is maintained even when all-pass filters are arranged in a serial or nested structure. With such a structure, a greater variety of phase responses can be obtained, so the decorrelation effect can be maximized.
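That a serial (cascaded) structure keeps the all-pass characteristic while adding up the phase responses can be checked numerically. A frequency-domain sketch follows, with illustrative coefficients and function names:

```python
import cmath

def allpass_response(g, w):
    """Frequency response H(e^jw) = (-g + e^-jw) / (1 - g*e^-jw) of a
    first-order all-pass section; |H| == 1 for any real g with |g| < 1."""
    z1 = cmath.exp(-1j * w)
    return (-g + z1) / (1.0 - g * z1)

def cascade_response(gains, w):
    """Cascading sections multiplies their responses: the magnitude
    stays 1 while the phase responses add, giving a richer overall
    phase and hence a stronger decorrelation effect."""
    h = 1.0 + 0.0j
    for g in gains:
        h *= allpass_response(g, w)
    return h
```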

An eighth method of generating source mapping information using the CLD, ICC, and/or a decorrelator uses a related-art decorrelator together with the frequency-domain decorrelator of the second method. In this case, the multi-channel signal can be expressed as Equation 9.

Equation 9

[L; R; C; LFE; Ls; Rs] =
| A_L1 + K_L·d_L              |       | P_L0·d_new0(m) + P_L1·d_new1(m) + ...   |
| A_R1 + K_R·d_R              |       | P_R0·d_new0(m) + P_R1·d_new1(m) + ...   |
| A_C1 + K_C·d_C              |  · m + | P_C0·d_new0(m) + P_C1·d_new1(m) + ...   |
| c2,OTT4 · c2,OTT1 · c1,OTT0 |       | 0                                       |
| A_Ls1 + K_Ls·d_Ls           |       | P_Ls0·d_new0(m) + P_Ls1·d_new1(m) + ... |
| A_Rs1 + K_Rs·d_Rs           |       | P_Rs0·d_new0(m) + P_Rs1·d_new1(m) + ... |

In this case, the filter coefficient generating process is the same as that explained for the first method, except that 'A' is replaced by 'A + K·d'.

A ninth method of generating source mapping information using the CLD, ICC, and/or a decorrelator, when a related-art decorrelator is used, generates a further decorrelated value by applying the frequency-domain decorrelator to the output of the related-art decorrelator. The limitation of the frequency-domain decorrelator can thus be overcome, and the source mapping information generated with a small amount of computation.

A tenth method of generating source mapping information using the CLD, ICC, and/or a decorrelator is expressed as Equation 10.

Equation 10

[L; R; C; LFE; Ls; Rs] =
| A_L1·m + K_L·d_L(m)             |
| A_R1·m + K_R·d_R(m)             |
| A_C1·m + K_C·d_C(m)             |
| c2,OTT4 · c2,OTT1 · c1,OTT0 · m |
| A_Ls1·m + K_Ls·d_Ls(m)          |
| A_Rs1·m + K_Rs·d_Rs(m)          |

In this case, 'd_i(m)' (i = L, R, C, Ls, Rs) is the output value of the decorrelator applied to channel i. The output value can be processed in the time, frequency, QMF, or hybrid domain; if it is processed in a domain different from the one currently being processed, it can be converted by domain conversion. The same 'd' can be used for d_L, d_R, d_C, d_Ls, and d_Rs, in which case Equation 10 can be expressed in a very simple manner.

If Equation 10 is applied to Equation 1, Equation 1 can be expressed as Equation 11.

Equation 11

Lo = HM_L*m + HMD_L*d(m)

Ro = HM_R*m + HMD_R*d(m)

In this case, the rendering information HM_L is a value obtained by combining the spatial information and the filter information so as to generate the surround signal Lo from the input m, and the rendering information HM_R is the corresponding value for generating the surround signal Ro from the input m. 'd(m)' is a decorrelator output value generated either by converting a decorrelator output value in an arbitrary domain into the current domain, or by processing in the current domain. The rendering information HMD_L indicates the degree to which the decorrelator output value d(m) is added to 'Lo' when d(m) is rendered, and is likewise a value obtained by combining the spatial information and the filter information; the rendering information HMD_R indicates the degree to which d(m) is added to 'Ro'.

Thus, in order to perform the rendering process on a mono downmix signal, the present invention proposes a method of generating a surround signal by rendering, to the downmix signal and the decorrelated downmix signal, rendering information generated by combining spatial information with filter information (e.g., HRTF filter coefficients). This rendering process can be performed regardless of domain. If 'd(m)' is expressed as 'd*m' (a product operator) executed in the frequency domain, Equation 11 can be expressed as Equation 12.

Equation 12

Lo = HM_L*m + HMD_L*d*m = HMoverall_L*m

Ro = HM_R*m + HMD_R*d*m = HMoverall_R*m

Thus, when the rendering process is performed on the downmix signal in the frequency domain, the amount of computation can be minimized by suitably expressing the value obtained by combining the spatial information, the filter information, and the decorrelator in product form.
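For one frequency bin this pre-combination can be checked directly: because the frequency-domain decorrelator acts as a per-bin product, HM_L*m + HMD_L*d*m equals (HM_L + HMD_L*d)*m. A sketch with illustrative complex values:

```python
def overall_coefficient(HM, HMD, d):
    """Pre-combine the direct and decorrelator rendering paths into a
    single per-bin coefficient: HMoverall = HM + HMD*d."""
    return HM + HMD * d

# One bin of the mono downmix and illustrative rendering/decorrelator values:
m = 0.7 + 0.2j
HM, HMD, d = 1.1 - 0.3j, 0.4 + 0.5j, 0.9j

two_path = HM * m + HMD * (d * m)               # output computed path by path
one_path = overall_coefficient(HM, HMD, d) * m  # same output with one multiply
```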

Figures 6 and 7 are detailed block diagrams of a rendering unit for a stereo downmix signal according to an embodiment of the present invention.

Referring to Figure 6, the rendering unit 900 includes a rendering unit-A 910 and a rendering unit-B 920.

If the downmix signal is a stereo signal, the spatial information converting unit 1000 generates rendering information for the left and right channels of the downmix signal. The rendering unit-A 910 generates a surround signal by rendering the rendering information for the left channel to the left channel of the downmix signal, and the rendering unit-B 920 generates a surround signal by rendering the rendering information for the right channel to the right channel of the downmix signal. The channel names are merely exemplary and do not limit the present invention.

The rendering information may include rendering information delivered to the same channel and rendering information delivered to the other channel.

For example, the spatial information converting unit 1000 can generate rendering information HL_L and HL_R input to the rendering unit for the left channel of the downmix signal, where HL_L is delivered to the left output corresponding to the same channel and HL_R to the right output corresponding to the other channel. It can likewise generate rendering information HR_R and HR_L input to the rendering unit for the right channel of the downmix signal, where HR_R is delivered to the right output corresponding to the same channel and HR_L to the left output corresponding to the other channel.

Referring to Figure 7, the rendering unit 900 includes a rendering unit-1A 911, a rendering unit-2A 912, a rendering unit-1B 921, and a rendering unit-2B 922.

The rendering unit 900 receives the stereo downmix signal and the rendering information from the spatial information converting unit 1000, and generates a surround signal by rendering the rendering information to the stereo downmix signal.

Specifically, the rendering unit-1A 911 performs rendering using the rendering information HL_L delivered to the same channel, from among the rendering information for the left channel of the downmix signal; the rendering unit-2A 912 performs rendering using the rendering information HL_R delivered to the other channel, from among the rendering information for the left channel. The rendering unit-1B 921 performs rendering using the rendering information HR_R delivered to the same channel, from among the rendering information for the right channel, and the rendering unit-2B 922 performs rendering using the rendering information HR_L delivered to the other channel, from among the rendering information for the right channel.

In the following description, rendering information delivered to the other channel is named 'cross-rendering information'. The cross-rendering information HL_R or HR_L is applied within its own channel and then added to the other channel by an adder. In this case, the cross-rendering information HL_R and/or HR_L may be zero, which means that the corresponding path makes no contribution.

An example of the surround signal generating method shown in Figure 6 or 7 is explained as follows.

First, if the downmix signal is a stereo signal, the downmix signal defined as 'x', the source mapping information generated using the spatial information defined as 'D', the prototype filter information defined as 'G', the multi-channel signal defined as 'p', and the surround signal defined as 'y' can be represented by the matrices shown in Equation 13.

Equation 13

x = [Li; Ri],  p = [L; Ls; R; Rs; C; LFE],

D = | D_L1   D_L2   |
    | D_Ls1  D_Ls2  |
    | D_R1   D_R2   |
    | D_Rs1  D_Rs2  |
    | D_C1   D_C2   |
    | D_LFE1 D_LFE2 |

G = | GL_L  GLs_L  GR_L  GRs_L  GC_L  GLFE_L |
    | GL_R  GLs_R  GR_R  GRs_R  GC_R  GLFE_R |

y = [Lo; Ro]

In this case, if the above values are in the frequency domain, they can be developed as follows.

First, as shown in Equation 14, the multi-channel signal p can be expressed as the product of the source mapping information D, generated using the spatial information, and the downmix signal x.

Equation 14

p = D · x, i.e.

| L   |   | D_L1   D_L2   |
| Ls  |   | D_Ls1  D_Ls2  |
| R   | = | D_R1   D_R2   | · | Li |
| Rs  |   | D_Rs1  D_Rs2  |   | Ri |
| C   |   | D_C1   D_C2   |
| LFE |   | D_LFE1 D_LFE2 |

The surround signal y can be generated by rendering the prototype filter information G to the multi-channel signal p, as shown in Equation 15.

Equation 15

y = G · p

In this case, substituting Equation 14 for p yields Equation 16.

Equation 16

y = GDx

In this case, if the rendering information H is defined as H = GD, the surround signal y and the downmix signal x have the relationship of Equation 17.

Equation 17

H = | HL_L  HR_L |,   y = Hx
    | HL_R  HR_R |

Therefore, after the rendering information H is generated by computing the product of the filter information and the source mapping information, the downmix signal x is multiplied by the rendering information H to generate the surround signal y.
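A sketch of this two-stage computation with small illustrative matrices: the rendering information H = G·D is formed once, after which each stereo downmix vector needs only a 2x2 multiply. All matrix values below are illustrative:

```python
def matmul(A, B):
    """Plain matrix product, used both to pre-combine H = G*D and to
    apply y = H*x."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

# Illustrative 2x6 prototype filter gains G and 6x2 source mapping D:
G = [[1.0, 0.5, 0.2, 0.1, 0.7, 0.3],
     [0.2, 0.1, 1.0, 0.5, 0.7, 0.3]]
D = [[0.9, 0.1], [0.6, 0.2], [0.1, 0.9],
     [0.2, 0.6], [0.5, 0.5], [0.5, 0.5]]

H = matmul(G, D)                    # 2x2 rendering information
x = [[0.3], [0.8]]                  # stereo downmix bin [Li; Ri]
y_direct = matmul(matmul(G, D), x)  # y = GDx computed from scratch
y_pre = matmul(H, x)                # same result with H pre-combined
```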

From the definition of the rendering information H, it can be expressed as Equation 18.

Equation 18

H = GD, i.e.

H = | GL_L  GLs_L  GR_L  GRs_L  GC_L  GLFE_L |   | D_L1   D_L2   |
    | GL_R  GLs_R  GR_R  GRs_R  GC_R  GLFE_R | · | D_Ls1  D_Ls2  |
                                                 | D_R1   D_R2   |
                                                 | D_Rs1  D_Rs2  |
                                                 | D_C1   D_C2   |
                                                 | D_LFE1 D_LFE2 |

Figures 8 and 9 are detailed block diagrams of a rendering unit for a mono downmix signal according to an embodiment of the present invention.

Referring to Figure 8, the rendering unit 900 includes a rendering unit-A 930 and a rendering unit-B 940.

If the downmix signal is a mono signal, the spatial information converting unit 1000 generates rendering information HM_L, used when rendering the mono signal to the left channel, and rendering information HM_R, used when rendering the mono signal to the right channel.

The rendering unit-A 930 applies the rendering information HM_L to the mono downmix signal to generate the left-channel surround signal, and the rendering unit-B 940 applies the rendering information HM_R to the mono downmix signal to generate the right-channel surround signal.

The rendering unit 900 in the figure does not use a decorrelator. However, if the rendering unit-A 930 and the rendering unit-B 940 perform rendering using the rendering information HMoverall_L and HMoverall_R defined in Equation 12, respectively, outputs to which the decorrelator has been applied can be obtained.

Meanwhile, when a stereo output rather than a surround signal is desired after completing the rendering performed on the mono downmix signal, the following two methods are possible.

The first method uses values for stereo output instead of the rendering information for the surround effect. In this case, a stereo signal can be obtained simply by modifying the rendering information in the structure shown in Figure 3.

The second method is that, in the decoding process of generating a multi-channel signal using the downmix signal and the spatial information, a stereo signal can be obtained by executing the decoding process only up to the step at which the desired number of channels is obtained.

Referring to Figure 9, the rendering unit 900 corresponds to the case in which the decorrelated signal is represented as a single signal, i.e., Equation 11. The rendering unit 900 includes a rendering unit-1A 931, a rendering unit-2A 932, a rendering unit-1B 941, and a rendering unit-2B 942. It is similar to the rendering unit for a stereo downmix signal, except that it includes the rendering units 941 and 942 for the decorrelated signal.

In the case of a stereo downmix signal, one of the two channels can be regarded as a decorrelated signal, so the rendering process can be performed using the four previously defined kinds of rendering information (HL_L, HL_R, etc.) without employing an additional decorrelator. Specifically, the rendering unit-1A 931 generates a signal to be delivered to the same channel by applying the rendering information HM_L to the mono downmix signal, and the rendering unit-2A 932 generates a signal to be delivered to the other channel by applying the rendering information HM_R to the mono downmix signal. The rendering unit-1B 941 generates a signal to be delivered to the same channel by applying the rendering information HMD_R to the decorrelated signal, and the rendering unit-2B 942 generates a signal to be delivered to the other channel by applying the rendering information HMD_L to the decorrelated signal.

If the downmix signal is a mono signal, the downmix signal defined as x, the source channel information defined as D, the prototype filter information defined as G, the multi-channel signal defined as p, and the surround signal defined as y can be represented by the matrices shown in Equation 19.

Equation 19

x = [Mi],  p = [L; Ls; R; Rs; C; LFE],  D = [D_L; D_Ls; D_R; D_Rs; D_C; D_LFE],

G = | GL_L  GLs_L  GR_L  GRs_L  GC_L  GLFE_L |
    | GL_R  GLs_R  GR_R  GRs_R  GC_R  GLFE_R |

y = [Lo; Ro]

In this case, the relationship between these matrices is similar to that in the case where the downmix signal is a stereo signal, so its details are omitted.

Meanwhile, the source mapping information described with reference to Figures 4 and 5, and the rendering information generated using it, have values that differ per frequency band, parameter band, and/or transmitted time slot. If these values differ considerably between neighboring bands or between boundary time slots, distortion may occur during rendering. To prevent this distortion, a smoothing process in the frequency domain and/or time domain is needed. Besides frequency-domain and/or time-domain smoothing, other smoothing methods suitable for rendering can be used, and a value obtained by multiplying the source mapping information or the rendering information by a specific gain can also be used.

Figures 10 and 11 are block diagrams of a smoothing unit and an expanding unit according to an embodiment of the present invention.

As shown in Figures 10 and 11, the smoothing method according to the present invention is applicable to the rendering information and/or the source mapping information, and can also be applied to other types of information. In the following description, smoothing in the frequency domain is described; however, the present invention includes time-domain smoothing as well.

Referring to Figures 10 and 11, the smoothing unit 1042 can perform smoothing on the rendering information and/or the source mapping information. A detailed example of where the smoothing takes place will be described later with reference to Figures 18 to 20.

The smoothing unit 1042 can be configured to operate together with the expanding unit 1043, in which the rendering information and/or the source mapping information is expanded to a range wider than the parameter band, e.g., to the filter band. In particular, the source mapping information can be expanded to the frequency resolution corresponding to the filter information (e.g., the filter band) so that it can be multiplied by the filter information (e.g., HRTF filter coefficients). The smoothing according to the present invention is executed prior to or together with the expansion. The smoothing used together with the expansion can adopt one of the methods shown in FIGS. 12 to 16.

FIG. 12 is a graph to explain a first smoothing method according to one embodiment of the present invention.

Referring to FIG. 12, the first smoothing method uses a value having the same size as the spatial information within each parameter band. In this case, the smoothing effect can be achieved by using a suitable smoothing function.

FIG. 13 is a graph to explain a second smoothing method according to one embodiment of the present invention.

Referring to FIG. 13, the second smoothing method obtains a smoothing effect by connecting representative positions of the parameter bands. The representative position of each parameter band is its exact center, a center position proportional to a logarithmic scale, a Bark scale, or the like, its lowest frequency value, or a position determined in advance by a different method.

FIG. 14 is a graph to explain a third smoothing method according to one embodiment of the present invention.

Referring to FIG. 14, the third smoothing method performs smoothing in the form of a curve or straight line that smoothly connects the boundaries of the parameters. In this case, the third smoothing method uses a preset boundary smoothing curve, or low-pass filtering with a first-order or higher IIR filter or FIR filter.

FIG. 15 is a graph to explain a fourth smoothing method according to one embodiment of the present invention.

Referring to FIG. 15, the fourth smoothing method achieves a smoothing effect by adding a signal such as random noise to the spatial information contour. Values that differ per channel or per frequency band can be used as the random noise. When adding random noise in the frequency domain, only the magnitude value can be modified while the phase value is kept unchanged. Besides the smoothing effect in the frequency domain, the fourth smoothing method also provides an inter-channel decorrelation effect.

FIG. 16 is a graph to explain a fifth smoothing method according to one embodiment of the present invention.

Referring to FIG. 16, the fifth smoothing method combines the second to fourth smoothing methods. For example, after the representative positions of the parameter bands have been connected, random noise is added and low-pass filtering is then applied; the order of these steps can be modified. The fifth smoothing method minimizes discontinuities in the frequency domain and can enhance the inter-channel decorrelation effect.
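A minimal numerical sketch of this combined method follows. The parameter-band values, band edges, noise scale, and the simple 3-tap moving-average low-pass are all illustrative assumptions, not values prescribed by the text:

```python
import numpy as np

def smooth_profile(band_values, band_edges, n_bins, noise_scale=0.01, seed=0):
    # Representative positions: the center of each parameter band.
    centers = [(band_edges[i] + band_edges[i + 1]) / 2.0
               for i in range(len(band_values))]
    # Second method: connect the representative positions linearly.
    fine = np.interp(np.arange(n_bins), centers, band_values)
    # Fourth method: add small random noise (also decorrelates channels).
    fine = fine + np.random.default_rng(seed).normal(0.0, noise_scale, n_bins)
    # Third method: low-pass filter (here a plain 3-tap moving average).
    return np.convolve(fine, np.ones(3) / 3.0, mode="same")

# Three parameter-band values expanded onto 24 finer frequency bins.
profile = smooth_profile([1.0, 2.0, 1.5], band_edges=[0, 8, 16, 24], n_bins=24)
```
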

In the first to fifth smoothing methods, the total power of the spatial information values (e.g., CLD values) over the corresponding frequency range should remain uniform, i.e., constant, for each channel. Hence, after the smoothing method is performed per channel, power normalization should be executed. For example, if the downmix signal is a mono signal, the level values of the corresponding channels should satisfy the relation of Equation 20.

Equation 20

D_L(pb) + D_R(pb) + D_C(pb) + D_Ls(pb) + D_Rs(pb) + D_Lfe(pb) = C

In this case, pb = 0 to (the total number of parameter bands − 1), and 'C' is an arbitrary constant.
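A sketch of the per-band power normalization implied by Equation 20, with made-up level values for six channels (L, R, C, Ls, Rs, LFE) over three parameter bands and C = 1:

```python
import numpy as np

def normalize_levels(levels, c=1.0):
    # Rescale each parameter band so its sum over channels equals C.
    levels = np.asarray(levels, dtype=float)   # shape: (channels, bands)
    return levels * (c / levels.sum(axis=0))

# Hypothetical smoothed levels; rows are D_L .. D_Lfe, columns are pb = 0..2.
smoothed = np.array([
    [0.3, 0.2, 0.4],   # D_L
    [0.3, 0.2, 0.1],   # D_R
    [0.2, 0.3, 0.2],   # D_C
    [0.1, 0.1, 0.1],   # D_Ls
    [0.1, 0.1, 0.1],   # D_Rs
    [0.1, 0.2, 0.2],   # D_Lfe
])
normalized = normalize_levels(smoothed)
```
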

FIG. 17 is a diagram to explain prototype filter information for each channel.

Referring to FIG. 17, for the rendering, a signal that has passed through the GL_L filter for the left channel source is sent to the left output, while a signal that has passed through the GL_R filter is sent to the right output.

Subsequently, a left final output (e.g., Lo) and a right final output (e.g., Ro) are generated by adding up all the signals received from the respective channels. In particular, the rendered left/right channel outputs can be expressed as Equation 21.

Equation 21

Lo = L*GL_L + C*GC_L + R*GR_L + Ls*GLs_L + Rs*GRs_L

Ro = L*GL_R + C*GC_R + R*GR_R + Ls*GLs_R + Rs*GRs_R
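The summation of Equation 21 can be sketched as follows, with hypothetical per-channel sample blocks and scalar gains standing in for the G filters; in a real renderer each product would be a convolution with an HRTF-derived filter (or a multiplication in the DFT domain):

```python
import numpy as np

rng = np.random.default_rng(0)
channels = ("L", "C", "R", "Ls", "Rs")
sig = {name: rng.standard_normal(8) for name in channels}  # one block each

# Hypothetical gains G[source][output]; identical across sources here only
# to keep the sanity check below simple.
G = {name: {"L": 0.5, "R": 0.3} for name in channels}

Lo = sum(sig[name] * G[name]["L"] for name in channels)  # left final output
Ro = sum(sig[name] * G[name]["R"] for name in channels)  # right final output
```
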

In the present invention, the rendered left/right channel outputs can be generated by using L, R, C, Ls and Rs, which are obtained by decoding the downmix signal into a multi-channel signal with the spatial information. Moreover, the present invention can generate the rendered left/right channel outputs with rendering information, generated from the spatial information and the filter information, without generating L, R, C, Ls and Rs.

The process of generating rendering information using spatial information is explained with reference to FIGS. 18 to 20 as follows.

FIG. 18 is a block diagram of a first method of generating rendering information in the spatial information converting unit 1000 according to one embodiment of the present invention.

Referring to FIG. 18, as mentioned in the foregoing description, the spatial information converting unit 1000 includes a source mapping unit 1010, a sub-rendering information generating unit 1020, an integrating unit 1030, a processing unit 1040, and a domain converting unit 1050. The spatial information converting unit 1000 has the same configuration shown in FIG. 3.

The sub-rendering information generating unit 1020 includes at least one sub-rendering information generating unit (a first sub-rendering information generating unit to an N-th sub-rendering information generating unit).

The sub-rendering information generating unit 1020 generates sub-rendering information using the filter information and the source mapping information.

For example, if the downmix signal is a mono signal, the first sub-rendering information generating unit can generate sub-rendering information corresponding to the left channel of the multi-channel signal. This sub-rendering information can be represented as Equation 22 using the source mapping information D_L and the converted filter information GL_L′ and GL_R′.

Equation 22

FL_L = D_L*GL_L′

(coefficients from the mono input to the left output channel)

FL_R = D_L*GL_R′

(coefficients from the mono input to the right output channel)

In this case, D_L is a value generated from the spatial information in the source mapping unit 1010. The process of generating D_L may follow a tree structure.

The second sub-rendering information generating unit can generate sub-rendering information FR_L and FR_R corresponding to the right channel of the multi-channel signal. Likewise, the N-th sub-rendering information generating unit can generate sub-rendering information FRs_L and FRs_R corresponding to the right surround channel of the multi-channel signal.

If the downmix signal is a stereo signal, the first sub-rendering information generating unit can generate sub-rendering information corresponding to the left channel of the multi-channel signal. This sub-rendering information can be represented as Equation 23 using the source mapping information D_L1 and D_L2.

Equation 23

FL_L1 = D_L1*GL_L′

(coefficients from the left input to the left output channel)

FL_L2 = D_L2*GL_L′

(coefficients from the right input to the left output channel)

FL_R1 = D_L1*GL_R′

(coefficients from the left input to the right output channel)

FL_R2 = D_L2*GL_R′

(coefficients from the right input to the right output channel)

In Equation 23, FL_R1 is explained as an example as follows.

First, in FL_R1, 'L' indicates the position in the multi-channel signal, 'R' indicates the output channel of the surround signal, and '1' indicates the channel of the downmix signal. That is, FL_R1 is the sub-rendering information used to generate the right output channel of the surround signal from the left channel of the downmix signal.

Second, D_L1 and D_L2 are values generated from the spatial information in the source mapping unit 1010.

If the downmix signal is a stereo signal, a plurality of sub-rendering information can be generated from at least one sub-rendering information generating unit in the same manner as when the downmix signal is a mono signal. The types of sub-rendering information generated by the sub-rendering information generating units are exemplary and do not limit the present invention.

The sub-rendering information generated by the sub-rendering information generating unit 1020 is transferred to the rendering unit 900 via the integrating unit 1030, the processing unit 1040, and the domain converting unit 1050.

The integrating unit 1030 integrates the sub-rendering information generated per channel into rendering information (e.g., HL_L, HL_R, HR_L, HR_R) for the rendering process. The integrating process in the integrating unit 1030 is explained for the mono signal case and the stereo signal case as follows.

First, if the downmix signal is a mono signal, the rendering information can be expressed as Equation 24.

Equation 24

HM_L = FL_L + FR_L + FC_L + FLs_L + FRs_L + FLFE_L

HM_R = FL_R + FR_R + FC_R + FLs_R + FRs_R + FLFE_R

Second, if the downmix signal is a stereo signal, the rendering information can be expressed as Equation 25.

Equation 25

HL_L = FL_L1 + FR_L1 + FC_L1 + FLs_L1 + FRs_L1 + FLFE_L1

HR_L = FL_L2 + FR_L2 + FC_L2 + FLs_L2 + FRs_L2 + FLFE_L2

HL_R = FL_R1 + FR_R1 + FC_R1 + FLs_R1 + FRs_R1 + FLFE_R1

HR_R = FL_R2 + FR_R2 + FC_R2 + FLs_R2 + FRs_R2 + FLFE_R2
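Equation 25 is a per-path summation over the multi-channel sources. A sketch with scalar placeholders for the sub-rendering information (per-band filter coefficients in practice):

```python
# F[src][path]: hypothetical sub-rendering information per source, where
# path "L1" means downmix channel 1 -> left output, "R2" means downmix
# channel 2 -> right output, and so on.
sources = ("L", "R", "C", "Ls", "Rs", "LFE")
F = {s: {"L1": 0.1, "L2": 0.2, "R1": 0.3, "R2": 0.4} for s in sources}

HL_L = sum(F[s]["L1"] for s in sources)   # left downmix  -> left output
HR_L = sum(F[s]["L2"] for s in sources)   # right downmix -> left output
HL_R = sum(F[s]["R1"] for s in sources)   # left downmix  -> right output
HR_R = sum(F[s]["R2"] for s in sources)   # right downmix -> right output
```
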

Subsequently, the processing unit 1040 includes an interpolating unit 1041 and/or a smoothing unit 1042, and performs interpolation and/or smoothing on the rendering information. The interpolation and/or smoothing can be executed in the time domain, the frequency domain, or the QMF domain. In this specification, the time domain is taken as an example, which does not limit the present invention.

If the transferred rendering information has a wide interval in the time domain, interpolation is executed to obtain rendering information that does not exist between the transferred values. For example, assuming that rendering information exists in an n-th time slot and an (n+k)-th time slot, linear interpolation can be executed on the untransferred time slots using the generated rendering information (e.g., HL_L, HR_L, HL_R, HR_R).

The rendering information generated from the interpolation is explained for the case where the downmix signal is a mono signal and the case where the downmix signal is a stereo signal.

If the downmix signal is a mono signal, the interpolated rendering information can be expressed as Equation 26.

Equation 26

HM_L(n+j) = HM_L(n)*(1-a) + HM_L(n+k)*a

HM_R(n+j) = HM_R(n)*(1-a) + HM_R(n+k)*a

If the downmix signal is a stereo signal, the interpolated rendering information can be expressed as Equation 27.

Equation 27

HL_L(n+j) = HL_L(n)*(1-a) + HL_L(n+k)*a

HR_L(n+j) = HR_L(n)*(1-a) + HR_L(n+k)*a

HL_R(n+j) = HL_R(n)*(1-a) + HL_R(n+k)*a

HR_R(n+j) = HR_R(n)*(1-a) + HR_R(n+k)*a

In this case, 0 < j < k, where 'j' and 'k' are integers, and 'a' is a real number with 0 < a < 1, expressed as Equation 28.

Equation 28

a = j/k

According to Equation 27 and Equation 28, the value corresponding to an untransferred time slot can be obtained on the straight line connecting the values of the two transferred time slots. Details of the interpolation will be explained later with reference to FIGS. 22 and 23.
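A sketch of the linear interpolation of Equations 26 to 28 for one rendering coefficient, with illustrative slot values:

```python
def interpolate_slots(h_n, h_n_plus_k, k):
    # H(n+j) = H(n)*(1-a) + H(n+k)*a, with a = j/k and 0 < j < k.
    return [h_n * (1 - j / k) + h_n_plus_k * (j / k) for j in range(1, k)]

# Rendering information exists at slots n and n+4; fill slots n+1 .. n+3.
missing = interpolate_slots(1.0, 3.0, k=4)
```
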

When a filter coefficient value changes abruptly between two neighboring time slots in the time domain, the smoothing unit 1042 executes smoothing to prevent the distortion caused by the resulting discontinuity. The smoothing in the time domain can be executed with the smoothing methods described with reference to FIGS. 12 to 16, can be executed together with the expansion, and can differ according to where it is applied. If the downmix signal is a mono signal, the time-domain smoothing can be expressed as Equation 29.

Equation 29

HM_L(n)′ = HM_L(n)*b + HM_L(n-1)′*(1-b)

HM_R(n)′ = HM_R(n)*b + HM_R(n-1)′*(1-b)

That is, the smoothing can be executed as a one-pole IIR filter: the rendering information HM_L(n-1)′ or HM_R(n-1)′ already smoothed in the previous time slot n-1 is multiplied by (1-b), the rendering information HM_L(n) or HM_R(n) generated in the current time slot is multiplied by b, and the two products are added. In this case, 'b' is a constant with 0 < b < 1; the smaller 'b' becomes, the greater the smoothing effect, and the larger 'b' becomes, the smaller the smoothing effect. The remaining filters can be applied in the same manner.
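The one-pole IIR recursion of Equation 29 can be sketched as follows, with hypothetical slot values and b = 0.5 (the initial value is an assumption; the text does not specify how the recursion is seeded):

```python
def smooth_slots(values, b, initial):
    # H'(n) = H(n)*b + H'(n-1)*(1-b); smaller b -> stronger smoothing.
    out, prev = [], initial
    for v in values:
        prev = v * b + prev * (1 - b)
        out.append(prev)
    return out

# A step from 0 to 1 is smoothed toward 1 over several slots.
smoothed = smooth_slots([1.0, 1.0, 1.0], b=0.5, initial=0.0)
```
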

Using Equation 29 for the time-domain smoothing, the interpolation and the smoothing can be expressed as the single expression shown in Equation 30.

Equation 30

HM_L(n+j)′ = (HM_L(n)*(1-a) + HM_L(n+k)*a)*b + HM_L(n+j-1)′*(1-b)

HM_R(n+j)′ = (HM_R(n)*(1-a) + HM_R(n+k)*a)*b + HM_R(n+j-1)′*(1-b)
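Equation 30 feeds each linearly interpolated value through the same one-pole recursion. A numeric sketch (slot values chosen so the expected result is exact in binary floating point):

```python
def interp_and_smooth(h_n, h_n_plus_k, k, b, h_prev):
    # H'(n+j) = (H(n)*(1-a) + H(n+k)*a)*b + H'(n+j-1)*(1-b), a = j/k.
    out, prev = [], h_prev
    for j in range(1, k):
        a = j / k
        prev = (h_n * (1 - a) + h_n_plus_k * a) * b + prev * (1 - b)
        out.append(prev)
    return out

vals = interp_and_smooth(1.0, 3.0, k=4, b=0.5, h_prev=1.0)
```
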

If the interpolation is executed by the interpolating unit 1041 and/or the smoothing is executed by the smoothing unit 1042, rendering information whose energy differs from that of the original rendering information may be obtained. To prevent this problem, energy normalization can be executed in addition.

Finally, the domain converting unit 1050 converts the rendering information to the domain in which the rendering is executed. If the domain for executing the rendering is the same as the domain of the rendering information, this domain conversion need not be executed. Afterwards, the domain-converted rendering information is transferred to the rendering unit 900.

FIG. 19 is a block diagram of a second method of generating rendering information in the spatial information converting unit according to one embodiment of the present invention.

The second method is similar to the first method in that the spatial information converting unit 1000 includes a source mapping unit 1010, a sub-rendering information generating unit 1020, an integrating unit 1030, a processing unit 1040, and a domain converting unit 1050, and in that the sub-rendering information generating unit 1020 includes at least one sub-rendering information generating unit.

Referring to FIG. 19, the second method of generating rendering information differs from the first method in the position of the processing unit 1040. So, the interpolation and/or smoothing can be executed per channel on the sub-rendering information generated per channel in the sub-rendering information generating unit 1020 (e.g., FL_L and FL_R in the mono signal case, or FL_L1, FL_L2, FL_R1 and FL_R2 in the stereo signal case).

Subsequently, the integrating unit 1030 integrates the interpolated and/or smoothed sub-rendering information into rendering information.

The generated rendering information is transferred to the rendering unit 900 via the domain converting unit 1050.

FIG. 20 is a block diagram of a third method of generating rendering filter information in the spatial information converting unit according to one embodiment of the present invention.

The third method is similar to the first and second methods in that the spatial information converting unit 1000 includes a source mapping unit 1010, a sub-rendering information generating unit 1020, an integrating unit 1030, a processing unit 1040, and a domain converting unit 1050, and in that the sub-rendering information generating unit 1020 includes at least one sub-rendering information generating unit.

Referring to FIG. 20, the third method of generating rendering information differs from the first and second methods in that the processing unit 1040 is adjacent to the source mapping unit 1010. So, the interpolation and/or smoothing can be executed per channel on the source mapping information generated from the spatial information in the source mapping unit 1010.

Subsequently, the sub-rendering information generating unit 1020 generates sub-rendering information using the interpolated and/or smoothed source mapping information and the filter information.

The sub-rendering information is integrated into rendering information in the integrating unit 1030. The generated rendering information is then transferred to the rendering unit 900 via the domain converting unit 1050.

FIG. 21 is a diagram to explain a method of generating a surround signal in the rendering unit according to one embodiment of the present invention. FIG. 21 shows the rendering process executed in the DFT domain; however, the rendering process can be implemented in a similar manner in other domains. FIG. 21 also shows the case where the input signal is a mono downmix signal; it is applicable in a similar manner to other input channels, including a stereo downmix signal and the like.

Referring to FIG. 21, the mono downmix signal in the time domain preferably undergoes windowing with an overlap interval OL in the domain converting unit. FIG. 21 shows a case where 50% overlap is used; however, the present invention includes cases using other overlaps.

The window function for the windowing can adopt a function that is seamlessly connected without discontinuity in the time domain and has good frequency selectivity in the DFT domain. For example, a sine-squared window function can be used.

Subsequently, zero-padding ZL of the rendering filter's tap length [precisely, of (tap length) − 1] is applied in the domain converting unit to the mono downmix signal of length OL*2 obtained from the windowing, and a domain conversion to the DFT domain is then executed. FIG. 21 shows a block-k downmix signal domain-converted to the DFT domain.

The domain-converted downmix signal is rendered by a rendering filter that uses the rendering information. The rendering process can be expressed as a product of the downmix signal and the rendering information. The rendered downmix signal undergoes an IDFT (inverse discrete Fourier transform) in the inverse domain converting unit, and is then overlapped with the previously rendered downmix signal delayed by the length OL (block k-1 in FIG. 21) to generate the surround signal.
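A sketch of this windowing / zero-padding / DFT-product / overlap-add chain. An identity "rendering filter" and a sine-squared window (whose 50%-overlapped copies sum to one) are used so the output can be checked against the input; the block and filter lengths are illustrative only:

```python
import numpy as np

N, hop = 8, 4                                  # window length, 50% overlap OL
win = np.sin(np.pi * (np.arange(N) + 0.5) / N) ** 2
h = np.array([1.0, 0.0, 0.0, 0.0])             # stand-in rendering filter
n_fft = N + len(h) - 1                         # zero-padding of (taps - 1)
H = np.fft.rfft(h, n_fft)                      # domain-converted rendering info

x = np.arange(16, dtype=float)                 # mono downmix, time domain
y = np.zeros(len(x) + len(h) - 1)              # surround-signal buffer
for start in range(0, len(x) - hop, hop):      # blocks k = 0, 1, 2, ...
    X = np.fft.rfft(x[start:start + N] * win, n_fft)  # window + DFT
    seg = np.fft.irfft(X * H, n_fft)           # render = product, then IDFT
    y[start:start + n_fft] += seg              # overlap-add with block k-1
```

Only the interior samples (covered by two full windows) reconstruct the input exactly; the first and last half-windows are edge-attenuated, as in any overlap-add scheme without boundary handling.
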

Interpolation can be executed on each block that undergoes this rendering process. The interpolating method is explained as follows.

FIG. 22 is a diagram of a first interpolating method according to one embodiment of the present invention. The interpolation according to the present invention can be executed at various positions. For example, it can be executed at the various positions in the spatial information converting unit shown in FIGS. 18 to 20, or in the rendering unit. Spatial information, source mapping information, filter information, and the like can be used as the values to be interpolated. In this specification, spatial information is used for the description as an example; however, the present invention is not limited to spatial information. The interpolation is executed prior to or together with the expansion to a wider band.

Referring to FIG. 22, the spatial information transferred from the encoding apparatus can be transferred from a random position instead of being transferred in every time slot. One spatial frame is able to carry a plurality of spatial information sets (e.g., parameter sets n and n+1 in FIG. 22). In the case of a low bit rate, one spatial frame can carry a single new spatial information set. So, the values for the untransferred time slots are interpolated using the values of the neighboring transferred spatial information sets. The interval between the windows used to execute the rendering does not always match the time slots. So, as shown in FIG. 22, interpolated values at the centers of the rendering windows (K-1, K, K+1, K+2, etc.) are found and used. Although FIG. 22 shows that linear interpolation is executed between the time slots in which spatial information sets exist, the present invention is not limited to this interpolating method. For example, interpolation may not be executed on the time slots where no spatial information set exists; a previous or preset value can be used instead.

FIG. 23 is a diagram of a second interpolating method according to one embodiment of the present invention.

Referring to FIG. 23, the second interpolating method according to one embodiment of the present invention has a structure combining an interval using a previous value, an interval using a preset default value, and the like. For example, the interpolation can be executed using at least one of a method of maintaining a previous value, a method of using a preset default value, and a method of executing linear interpolation within an interval of one spatial frame. When at least two new spatial information sets exist in one window, distortion may occur. In the following description, block switching to prevent this distortion is explained.

FIG. 24 is a diagram of a block switching method according to one embodiment of the present invention.

Referring to FIG. 24(a), since the window length is greater than the time slot length, at least two spatial information sets (e.g., parameter sets n and n+1 in FIG. 24) can exist within one window interval. In this case, each of the spatial information sets should be applied to a different time slot. Yet, if a single value obtained by interpolating the at least two spatial information sets is applied, distortion may occur. Namely, distortion attributed to insufficient time resolution for the window length may occur.

To solve this problem, a switching method that varies the window size to fit the time slot resolution can be used. For example, as shown in FIG. 24(b), the window size can be switched to a shorter one for an interval requiring a high resolution. In this case, connecting windows are used at the beginning and the end of the switched windows to prevent seams from appearing in the time domain of the switched windows.

The window length can be decided by using the spatial information in the decoding apparatus instead of being transferred as separate additional information. For example, the window length can be decided using the interval of the time slots in which the spatial information is updated: if the interval for updating the spatial information is narrow, a window function of a short length is used; if the interval is wide, a window function of a long length is used. In this case, by using a variable-length window in the rendering, it is advantageous that no bits are spent on separately transferring window length information. Two types of window lengths are shown in FIG. 24(b); however, windows of various lengths can be used according to the relationship between the transmission frequency and the spatial information. The decided window length information can be applied to the various steps of generating the surround signal, which is explained in the following description.
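A toy version of this decision rule; the candidate window lengths and the thresholding scheme are invented for illustration and are not taken from the text:

```python
def choose_window_length(update_interval, candidates=(1024, 2048, 4096)):
    # Narrow update interval -> short window; wide interval -> long window.
    # No extra bits are needed: both sides can derive the same length from
    # the spatial-information update interval alone.
    for n in candidates:
        if update_interval <= n:
            return n
    return candidates[-1]

short = choose_window_length(800)    # frequent spatial-information updates
long_ = choose_window_length(3000)   # sparse updates
```
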

FIG. 25 is a block diagram of the positions to which a window length decided by a window length deciding unit is applied according to one embodiment of the present invention.

Referring to FIG. 25, the window length deciding unit 1400 can decide the window length using the spatial information. Information on the decided window length can be applied to the source mapping unit 1010, the integrating unit 1030, the processing unit 1040, the domain converting units 1050 and 1100, and the inverse domain converting unit 1300. FIG. 25 shows the case where a stereo downmix signal is used; however, the present invention is not limited to the stereo downmix signal. As mentioned in the foregoing description, even if the window length is shortened, the zero-padding length decided by the number of filter taps is not adjustable. So, a solution to this problem is explained in the following description.

FIG. 26 is a diagram of filters with various lengths used in processing an audio signal according to one embodiment of the present invention. As mentioned in the foregoing description, if the zero-padding length decided by the number of filter taps is not adjusted, an overlap amounting to the corresponding length substantially occurs, which leads to insufficient time resolution. The solution to this problem is to shorten the zero-padding length by restricting the length of the filter taps. Shortening the zero-padding length can be achieved by truncating the tail of the filter response (e.g., the part corresponding to the diffuse interval of the reverberation). In this case, the rendering process may be less accurate than when the tail of the filter response is not truncated. Yet, the filter coefficient values in the time domain are small there and mainly affect the reverberation, so the sound quality is not considerably affected by the truncation.
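A sketch of this tail truncation, using a synthetic exponentially decaying response as a stand-in for an HRTF-like filter; it shows that the discarded tail carries only a small fraction of the total energy:

```python
import numpy as np

h = np.exp(-np.arange(32) / 4.0)     # hypothetical decaying filter response
max_taps = 8                          # restrict the number of filter taps
h_short = h[:max_taps]                # truncate the reverberation-like tail

# Fraction of the energy thrown away by the truncation.
tail_energy_ratio = np.sum(h[max_taps:] ** 2) / np.sum(h ** 2)
```
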

Referring to FIG. 26, four kinds of filters are usable. These four filters can be used in the DFT domain, which does not limit the present invention.

Filter-N1 indicates a filter having a long filter length FL and a long zero-padding length 2*OL that is not restricted by the number of filter taps. Filter-N2 indicates a filter having the same filter length FL but a zero-padding length 2*OL shorter than that of Filter-N1, obtained by restricting the number of filter taps. Filter-N3 indicates a filter having a filter length FL shorter than that of Filter-N1 and a long zero-padding length 2*OL, obtained without restricting the number of filter taps. And Filter-N4 indicates a filter having a window length FL shorter than that of Filter-N1 and a short zero-padding length 2*OL, obtained by restricting the number of filter taps.

As mentioned in the foregoing description, the above four exemplary filters can be used to solve the temporal resolution problem. Moreover, for the tail of the filter response, different filter coefficients are usable in each domain.

FIG. 27 is a diagram of a method of processing an audio signal separately by using a plurality of subfilters according to one embodiment of the present invention. One filter can be divided into subfilters having mutually different filter coefficients. After the audio signal has been processed using the subfilters, a method of adding the processed results together can be used. In the case of applying spatial information to the tail of the filter response, which has small energy — that is, in the case of performing rendering with a filter having long filter taps — this method provides a function of processing the audio signal separately by predetermined length units. For instance, since the tail of the filter does not vary significantly between the HRTFs corresponding to the respective channels, the rendering can be performed by extracting a coefficient common to a plurality of windows. In this specification, the case of execution in the DFT domain is described; however, the present invention is not limited to the DFT domain.

Referring to FIG. 27, after one filter FL has been divided into a plurality of subareas, the subareas can be processed by a plurality of subfilters (Filter-A and Filter-B) having mutually different filter coefficients.

Subsequently, the output processed by Filter-A and the output processed by Filter-B are combined together. For instance, an IDFT (inverse discrete Fourier transform) is performed on each of the Filter-A-processed output and the Filter-B-processed output to generate time-domain signals, and the generated signals are added together. In this case, the position at which the Filter-B-processed output is added is time-delayed by FL relative to the position of the Filter-A-processed output. Thus, the signal processed by the plurality of subfilters brings about the same effect as if the signal had been processed by a single filter.
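This split-filter equivalence can be sketched as follows. The Python fragment is a hypothetical illustration: the filter and signal values are invented, and plain time-domain convolution stands in for the DFT/IDFT path described above.

```python
# Illustrative sketch (values invented): convolving with the front part
# of a filter and with its tail, then adding the tail output delayed by
# FL samples, reproduces convolution with the whole filter.

def convolve(x, h):
    y = [0.0] * (len(x) + len(h) - 1)
    for n, xn in enumerate(x):
        for k, hk in enumerate(h):
            y[n + k] += xn * hk
    return y

FL = 4                                    # length of the front sub-filter
h = [1.0, 0.5, -0.25, 0.125, 0.1, 0.05, 0.02, 0.01]
h_a, h_b = h[:FL], h[FL:]                 # Filter-A and Filter-B parts

x = [1.0, 0.0, -1.0, 0.5, 0.25]
y_a = convolve(x, h_a)
y_b = convolve(x, h_b)

# Add the Filter-B output FL samples later than the Filter-A output.
y = [0.0] * (len(x) + len(h) - 1)
for n, v in enumerate(y_a):
    y[n] += v
for n, v in enumerate(y_b):
    y[n + FL] += v

y_ref = convolve(x, h)                    # single-filter reference
```

The delayed sum matches the single-filter output to within floating-point rounding, which is the equivalence the paragraph above relies on.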

Moreover, the present invention includes a method of rendering the Filter-B-processed output directly to the downmix signal. In this case, the output can be rendered to the downmix signal by using coefficients extracted from the spatial information, by using the spatial information in part, or without using the spatial information.

This method is characterized in that a filter having a large number of taps can be applied separately, and that the tail of the filter, which has small energy, can be applied without conversion using spatial information. In this case, if conversion using spatial information is not applied, a different filter is not applied to each window being processed; hence it is unnecessary to apply the same scheme as block switching. FIG. 27 shows a filter divided into two areas; however, the present invention can divide a filter into a plurality of areas.

FIG. 28 is a block diagram of a method of rendering split rendering information generated using a plurality of subfilters to a mono downmix signal according to one embodiment of the present invention. FIG. 28 relates to one rendering coefficient, and the method can be executed per rendering coefficient.

Referring to FIG. 28, the Filter-A information of FIG. 27 corresponds to first split rendering information HM_L_A, and the Filter-B information of FIG. 27 corresponds to second split rendering information HM_L_B. FIG. 28 shows an embodiment of division into two subfilters; however, the present invention is not limited to the two subfilters. The two subfilters can be obtained via the splitting unit 1500 using the rendering information HM_L generated in the spatial information conversion unit 1000. Alternatively, the two subfilters can be obtained using prototype HRTF information or information decided according to a user's selection. The information decided according to a user's selection can include, for instance, spatial information selected according to the user's taste. In this case, HM_L_A is rendering information based on the received spatial information, while HM_L_B can be rendering information for providing a three-dimensional effect commonly applied to signals.

As mentioned in the foregoing description, processing using a plurality of subfilters is applicable not only in the DFT domain but also in the time and QMF domains. Specifically, the coefficient values split into Filter-A and Filter-B are applied to the downmix signal by time-domain or QMF-domain rendering and are then added together to generate the final signal.

The rendering unit 900 includes a first split rendering unit 950 and a second split rendering unit 960. The first split rendering unit 950 performs the rendering process using HM_L_A, while the second split rendering unit 960 performs the rendering process using HM_L_B.

If Filter-A and Filter-B shown in FIG. 27 result from splitting the same filter according to time, an appropriate delay corresponding to that time interval can be taken into consideration. FIG. 28 shows an example of a mono downmix signal. In the case of using a mono downmix signal together with a decorrelator, the portion corresponding to Filter-B is applied not to the decorrelator but directly to the mono downmix signal.

FIG. 29 is a block diagram of a method of rendering split rendering information generated using a plurality of subfilters to a stereo downmix signal according to one embodiment of the present invention.

The split rendering process shown in FIG. 29 is similar to that of FIG. 28 in that the two subfilters are obtained in the splitting unit 1500 using the rendering information generated by the spatial information conversion unit 1000, the prototype HRTF filter information, or the information decided by a user. The difference from FIG. 28 lies in that the split rendering process corresponding to Filter-B is applied in common to the L/R signals.

Specifically, the splitting unit 1500 generates first split rendering information corresponding to the Filter-A information, second split rendering information, and third split rendering information corresponding to the Filter-B information. In this case, the third split rendering information can be generated using filter information or spatial information commonly applicable to the L/R signals.

Referring to FIG. 29, the rendering unit 900 includes a first split rendering unit 970, a second split rendering unit 980, and a third split rendering unit 990.

The generated third split rendering information is applied to a sum of the L/R signals in the third split rendering unit 990 to generate one output signal. This output signal is added to the L/R output signals independently rendered by Filter-A1 and Filter-A2 in the first and second split rendering units 970 and 980, respectively, to generate surround signals. In this case, the output signal of the third split rendering unit 990 can be added after an appropriate delay. In FIG. 29, the expression of cross rendering information applied from the L/R inputs to the other channel is omitted for convenience of explanation.

FIG. 30 is a block diagram of a first domain converting method for a downmix signal according to one embodiment of the present invention. The rendering process executed in the DFT domain has been described so far. As mentioned in the foregoing description, the rendering process is executable in domains other than the DFT domain as well; FIG. 30, however, shows a rendering process executed in the DFT domain. The domain conversion unit 1100 includes a QMF filter and a DFT filter, and the inverse domain conversion unit 1300 includes an IDFT filter and an IQMF filter. FIG. 30 relates to a mono downmix signal, which does not limit the present invention.

Referring to FIG. 30, a time-domain downmix signal of p samples passes through the QMF filter to generate P sub-band samples. W samples are recollected per band, windowing is performed on the recollected samples, zero padding is then performed, and an M-point DFT (FFT) is executed. In this case, the DFT enables processing by the windowing of the type described above. The value obtained by connecting the M/2 frequency-domain values per band, obtained by the M-point DFT, across the P bands can be regarded as an approximate value of the frequency spectrum obtained by an M/2*P-point DFT. Hence, multiplying this spectrum by the filter coefficients represented in the M/2*P-point DFT domain brings about the same effect as rendering in the DFT domain.
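The reason zero padding is needed before the M-point DFT can be sketched with a naive DFT. The Python fragment below is a hypothetical illustration: the short block and filter, and the size M = 8, are invented for demonstration, and the `cmath`-based transform stands in for an optimized FFT.

```python
import cmath

# Illustrative sketch (values invented): zero-padding both the windowed
# block and the filter to M >= len(x) + len(h) - 1 makes the circular
# convolution implied by DFT-domain multiplication equal to linear
# convolution, i.e., to time-domain filtering.

def dft(seq):
    n_pts = len(seq)
    return [sum(v * cmath.exp(-2j * cmath.pi * k * n / n_pts)
                for n, v in enumerate(seq)) for k in range(n_pts)]

def idft(spec):
    n_pts = len(spec)
    return [sum(v * cmath.exp(2j * cmath.pi * k * n / n_pts)
                for k, v in enumerate(spec)) / n_pts for n in range(n_pts)]

x = [1.0, 2.0, -1.0, 0.5]                 # windowed block
h = [0.5, 0.25, 0.125, 0.0625]            # filter taps
M = 8                                     # M >= len(x) + len(h) - 1

X = dft(x + [0.0] * (M - len(x)))         # zero padding, then M-point DFT
H = dft(h + [0.0] * (M - len(h)))
y = [v.real for v in idft([a * b for a, b in zip(X, H)])]

# Direct time-domain convolution for comparison.
y_ref = [sum(h[k] * x[n - k] for k in range(len(h)) if 0 <= n - k < len(x))
         for n in range(len(x) + len(h) - 1)]
```

With insufficient padding, the output samples beyond the block boundary would wrap around and corrupt the start of the block, which is exactly the overlap problem the text attributes to an unadjusted zero-padding length.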

In this case, the signal having passed through the QMF filter has leakage, e.g., aliasing between adjacent bands. Specifically, values corresponding to the adjacent bands seep into the current band, and values existing in the current band shift into the adjacent bands. If QMF integration is then executed, the original signal can be recovered owing to the QMF characteristics. Yet if the filtering process is performed on the signal of the corresponding band, as in the present invention, the signal is distorted by the leakage. To minimize this problem, a process for recovering the original signal can be added by having the signal pass through a leakage-minimizing butterfly filter after the QMF in the domain conversion unit 1100, before the per-band DFT is performed, and by executing the inverse process after the IDFT in the inverse domain conversion unit 1300.

Meanwhile, to match the generating process of the rendering information generated in the spatial information conversion unit 1000 with the generating process of the downmix signal, the DFT can be performed on the signal having passed through the QMF to obtain the prototype filter information, instead of executing the M/2*P-point DFT at the beginning. In this case, delay and data spreading attributable to the QMF filter may exist.

FIG. 31 is a block diagram of a second domain converting method for a downmix signal according to one embodiment of the present invention. FIG. 31 shows a rendering process executed in the QMF domain.

Referring to FIG. 31, the domain conversion unit 1100 includes a QMF domain converting unit, and the inverse domain conversion unit 1300 includes an IQMF domain converting unit. The configuration shown in FIG. 31 is the same as in the case of using the DFT only, except that the domain converting unit is a QMF filter. In the following description, QMF refers to both the QMF and the hybrid QMF having the same bandwidth. The difference from the case of using the DFT only lies in that the generation of the rendering information is executed in the QMF domain, and that the rendering process is represented as a convolution instead of a product in the DFT domain, because the rendering process performed by the renderer-M 3012 is executed in the QMF domain.

Assuming that the QMF filter is provided with B bands, the filter coefficients can be represented as a set of filter coefficients having different characteristics (coefficients) for the B bands. Occasionally, if the number of filter taps becomes one (i.e., multiplication by a constant), the rendering process in the DFT domain having B spectra and the operating process are matched with each other. Mathematical expression 31 represents the rendering process executed in one QMF band (b) for a path performing the rendering process using the rendering information HM_L.

Mathematical expression 31

$$Lo\_m^{b}(k) \;=\; HM\_L^{b} * m \;=\; \sum_{i=0}^{\mathrm{filter\_order}-1} hm\_l^{b}(i)\, m^{b}(k-i)$$

In this case, k indicates the time order in the QMF band, i.e., the slot unit. The advantage of the rendering process executed in the QMF domain is that, if the transferred spatial information is a value applicable to the QMF domain, the application of the corresponding data is most convenient and the distortion during the application can be minimized. However, in the case of the QMF-domain conversion in the process of converting the prototype filter information (e.g., prototype filter coefficients), a considerable operational quantity is required for applying the converted values. In this case, the operational quantity can be minimized by parameterizing the HRTF coefficients in the filter-information converting process.
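The per-band convolution of expression 31 can be sketched as follows. The Python fragment is a hypothetical illustration: the band signals and coefficient sets are invented, and B = 2 bands stand in for a full QMF bank.

```python
# Illustrative sketch (values invented): per-QMF-band rendering as the
# convolution Lo_m^b(k) = sum_i hm_l^b(i) * m^b(k - i) of expression 31,
# with a distinct coefficient set per band.

def render_band(m_b, hm_b):
    """Convolve one band's slot sequence m_b with its coefficients hm_b."""
    out = []
    for k in range(len(m_b)):
        acc = 0.0
        for i, h in enumerate(hm_b):
            if k - i >= 0:                # causal: no slots before k = 0
                acc += h * m_b[k - i]
        out.append(acc)
    return out

# B = 2 bands, each with its own coefficient set.
bands = [[1.0, 0.0, -1.0, 0.5],
         [0.5, 0.5, 0.5, 0.5]]
coeffs = [[1.0, 0.25],                    # two taps: a true convolution
          [0.5]]                          # one tap: reduces to a product
rendered = [render_band(m, h) for m, h in zip(bands, coeffs)]
```

With a single coefficient per band, the sum collapses to multiplication by a constant, matching the remark above about the case where the number of filter taps becomes one.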

Industrial Applicability

Accordingly, the signal processing method and apparatus of the present invention use the spatial information provided by an encoder to generate a surround signal, in a decoding apparatus capable of generating multi-channels, by using HRTF filter information or filter information according to a user. The present invention is also usefully applicable to various decoders that can reproduce only stereo signals.

While the present invention has been described and illustrated with reference to the preferred embodiments thereof, it will be apparent to those skilled in the art that various modifications and variations can be made therein without departing from the spirit or scope of the invention. Thus, it is intended that the present invention covers all such modifications and variations that come within the scope of the appended claims and their equivalents.

Claims (15)

1. the method for a processing signals, described method comprises:
From bit stream, extract down-mix audio signal;
Generate down-mix audio signal by decorrelator being applied to described down-mix audio signal through decorrelation; And
By will play up information be applied to described down-mix audio signal and described down-mix audio signal through decorrelation generate have surrounding effect around signal.
2. the method for claim 1 is characterized in that, described application of playing up information is carried out on one of time domain, frequency domain, QMF territory and hybrid domain.
3. the method for claim 1 is characterized in that, the described information of playing up is to be used for the filter information of described surrounding effect and the spatial information that extracts from described bit stream generates by use.
4. the method for claim 1 is characterized in that, comprises also that territory with described down-mix audio signal is converted to generate described territory around signal therein.
5. method as claimed in claim 4 is characterized in that, described territory of playing up information equals the territory of generation around signal.
6. the method for claim 1 is characterized in that, described decorrelator has all-pass characteristics.
7. the method for claim 1 is characterized in that, described down-mix audio signal is a monophonic signal.
8. An apparatus for processing a signal, the apparatus comprising:
a decorrelating unit generating a decorrelated downmix signal by applying a decorrelator to a downmix signal extracted from a bitstream; and
a rendering unit generating a surround signal having a surround effect by applying rendering information to the downmix signal and the decorrelated downmix signal.
9. The apparatus of claim 8, wherein the rendering unit generates the surround signal in one of a time domain, a frequency domain, a QMF domain, and a hybrid domain.
10. The apparatus of claim 8, wherein the rendering information is generated by using filter information for the surround effect and spatial information extracted from the bitstream.
11. The apparatus of claim 8, wherein the rendering information comprises first rendering information applied to one of a downmix signal channel and a decorrelated downmix signal channel and then transferred on the same channel, and second rendering information applied to one of the downmix signal channel and the decorrelated downmix signal channel and then transferred on another channel.
12. The apparatus of claim 8, further comprising a domain converting unit converting a domain of the downmix signal into a domain in which the surround signal is generated.
13. The apparatus of claim 12, wherein a domain of the rendering information is equal to the domain in which the surround signal is generated.
14. The apparatus of claim 8, wherein the decorrelator has an all-pass characteristic.
15. The apparatus of claim 8, wherein the downmix signal is a mono signal.
CN2007800015359A 2006-01-19 2007-01-19 Method and apparatus for processing a media signal Active CN101361117B (en)

Applications Claiming Priority (15)

Application Number Priority Date Filing Date Title
US75998006P 2006-01-19 2006-01-19
US60/759,980 2006-01-19
US77672406P 2006-02-27 2006-02-27
US60/776,724 2006-02-27
US77944106P 2006-03-07 2006-03-07
US77941706P 2006-03-07 2006-03-07
US77944206P 2006-03-07 2006-03-07
US60/779,441 2006-03-07
US60/779,417 2006-03-07
US60/779,442 2006-03-07
US78717206P 2006-03-30 2006-03-30
US60/787,172 2006-03-30
US78751606P 2006-03-31 2006-03-31
US60/787,516 2006-03-31
PCT/KR2007/000350 WO2007083960A1 (en) 2006-01-19 2007-01-19 Method and apparatus for processing a media signal

Publications (2)

Publication Number Publication Date
CN101361117A CN101361117A (en) 2009-02-04
CN101361117B true CN101361117B (en) 2011-06-15

Family

ID=40332817

Family Applications (8)

Application Number Title Priority Date Filing Date
CN2007800015359A Active CN101361117B (en) 2006-01-19 2007-01-19 Method and apparatus for processing a media signal
CNA2007800027924A Pending CN101371298A (en) 2006-01-19 2007-01-19 Method and device for decoding a signal
CN2007800015397A Expired - Fee Related CN101361118B (en) 2006-01-19 2007-01-19 Method and apparatus for processing a media signal
CN200780001540XA Active CN101361119B (en) 2006-01-19 2007-01-19 Method and apparatus for processing a media signal
CN2007800015344A Active CN101361116B (en) 2006-01-19 2007-01-19 Method and apparatus for processing a media signal
CN2007800015486A Active CN101361121B (en) 2006-01-19 2007-01-19 Method and device for processing media signals
CN2007800015448A Active CN101361120B (en) 2006-01-19 2007-01-19 Method and device for processing media signals
CNA2007800015240A Pending CN101361115A (en) 2006-01-19 2007-01-19 Method and apparatus for decoding a signal

Family Applications After (7)

Application Number Title Priority Date Filing Date
CNA2007800027924A Pending CN101371298A (en) 2006-01-19 2007-01-19 Method and device for decoding a signal
CN2007800015397A Expired - Fee Related CN101361118B (en) 2006-01-19 2007-01-19 Method and apparatus for processing a media signal
CN200780001540XA Active CN101361119B (en) 2006-01-19 2007-01-19 Method and apparatus for processing a media signal
CN2007800015344A Active CN101361116B (en) 2006-01-19 2007-01-19 Method and apparatus for processing a media signal
CN2007800015486A Active CN101361121B (en) 2006-01-19 2007-01-19 Method and device for processing media signals
CN2007800015448A Active CN101361120B (en) 2006-01-19 2007-01-19 Method and device for processing media signals
CNA2007800015240A Pending CN101361115A (en) 2006-01-19 2007-01-19 Method and apparatus for decoding a signal

Country Status (1)

Country Link
CN (8) CN101361117B (en)

Families Citing this family (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR101387902B1 (en) * 2009-06-10 2014-04-22 한국전자통신연구원 Encoder and method for encoding multi audio object, decoder and method for decoding and transcoder and method transcoding
TWI557723B (en) * 2010-02-18 2016-11-11 杜比實驗室特許公司 Decoding method and system
ES2565959T3 (en) * 2010-06-09 2016-04-07 Panasonic Intellectual Property Corporation Of America Bandwidth extension method, bandwidth extension device, program, integrated circuit and audio decoding device
CN103067759A (en) * 2011-10-19 2013-04-24 深圳市快播科技有限公司 Establishment method and devices of playback link and player
WO2013108200A1 (en) * 2012-01-19 2013-07-25 Koninklijke Philips N.V. Spatial audio rendering and encoding
BR112015002367B1 (en) 2012-08-03 2021-12-14 Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung Ev DECODER AND METHOD FOR MULTI-INSTANCE SPATIAL AUDIO OBJECT ENCODING USING A PARAMETRIC CONCEPT FOR MULTI-CHANNEL DOWNMIX/UPMIX BOXES
ES2932422T3 (en) * 2013-09-17 2023-01-19 Wilus Inst Standards & Tech Inc Method and apparatus for processing multimedia signals
WO2015060654A1 (en) * 2013-10-22 2015-04-30 한국전자통신연구원 Method for generating filter for audio signal and parameterizing device therefor
GB201808897D0 (en) * 2018-05-31 2018-07-18 Nokia Technologies Oy Spatial audio parameters
EP3617871A1 (en) * 2018-08-28 2020-03-04 Koninklijke Philips N.V. Audio apparatus and method of audio processing

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5561736A (en) * 1993-06-04 1996-10-01 International Business Machines Corporation Three dimensional speech synthesis
WO2005069637A1 (en) * 2004-01-05 2005-07-28 Koninklijke Philips Electronics, N.V. Ambient light derived form video content by mapping transformations through unrendered color space
WO2005069638A1 (en) * 2004-01-05 2005-07-28 Koninklijke Philips Electronics, N.V. Flicker-free adaptive thresholding for ambient light derived from video content mapped through unrendered color space

Also Published As

Publication number Publication date
CN101361118B (en) 2011-07-27
CN101361121A (en) 2009-02-04
CN101361115A (en) 2009-02-04
CN101361120A (en) 2009-02-04
CN101361119A (en) 2009-02-04
CN101371298A (en) 2009-02-18
CN101361121B (en) 2012-01-11
CN101361120B (en) 2011-09-07
CN101361116B (en) 2011-06-22
CN101361117A (en) 2009-02-04
CN101361118A (en) 2009-02-04
CN101361116A (en) 2009-02-04
CN101361119B (en) 2011-06-15

Similar Documents

Publication Publication Date Title
US8488819B2 (en) Method and apparatus for processing a media signal
CN101361117B (en) Method and apparatus for processing a media signal
HK1127433B (en) Method and apparatus for processing a media signal

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant