
CN102760442B - Method for quantizing horizontal orientation parameters in 3D audio - Google Patents

Method for quantizing horizontal orientation parameters in 3D audio

Info

Publication number
CN102760442B
CN102760442B (application CN201210256992.5A)
Authority
CN
China
Prior art keywords
jnd
vector
code word
dimension
value
Prior art date
Legal status
Expired - Fee Related
Application number
CN201210256992.5A
Other languages
Chinese (zh)
Other versions
CN102760442A (en)
Inventors
胡瑞敏
王晓晨
刘梦颖
冯云杰
章佩
杨姗姗
涂卫平
杨玉红
李登实
Current Assignee
Wuhan University WHU
Original Assignee
Wuhan University WHU
Priority date
2012-07-24
Filing date
2012-07-24
Publication date
2014-09-03
Application filed by Wuhan University WHU
Priority to CN201210256992.5A
Publication of CN102760442A (2012-10-31)
Application granted
Publication of CN102760442B (2014-09-03)

Landscapes

  • Compression, Expansion, Code Conversion, And Decoders (AREA)

Abstract

The present invention proposes a method for quantizing horizontal orientation parameters in 3D audio. The method takes the perceptual ability of the human ear in different frequency bands into account by introducing the just noticeable difference (JND), the smallest change the ear can perceive. The objective quantization distortion is combined with the JND value to obtain a subjective perceptual distortion, and on this basis the codeword that best matches the perceptual characteristics of the human ear is selected as the quantization result. Quantization performed with this technical solution yields results with better subjective perceptual quality.

Description

A Method for Quantizing Horizontal Orientation Parameters in 3D Audio

Technical Field

The present invention relates to the technical field of quantization coding, and more specifically to a method for quantizing horizontal orientation parameters in 3D audio.

Background Art

Quantization is a principal step in compression coding: it reduces the amount of data by representing a larger data set with a smaller one. The smaller data set is usually called a codebook, and the items in the codebook are called codewords; the larger data set is the set of values to be quantized. After quantization, only the position of the selected codeword within the codebook needs to be output, and this position is usually called the index. Commonly used quantization techniques fall roughly into two categories, scalar quantization and vector quantization. Scalar quantization quantizes the data one value at a time and has low complexity; vector quantization treats several values as one vector and quantizes each vector as a whole, which preserves the correlation between the values. Depending on the application, vector quantization has many variants, such as lattice vector quantization, product-code vector quantization, and so on.
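As a concrete illustration of these concepts, the sketch below implements a generic nearest-codeword vector quantizer with a plain squared-error criterion; the two-dimensional codebook and input vector are invented for the example and are not taken from the patent.

```python
# Illustrative sketch only: a generic nearest-codeword vector quantizer using
# plain squared error. The 2-D codebook and the input vector are invented for
# this example and are not taken from the patent.
import numpy as np

def vq_encode(vector, codebook):
    """Return the index of the codeword with the smallest squared error."""
    errors = np.sum((codebook - vector) ** 2, axis=1)
    return int(np.argmin(errors))

def vq_decode(index, codebook):
    """Decoding is a simple table lookup by index."""
    return codebook[index]

codebook = np.array([[0.0, 0.0], [10.0, -10.0], [25.0, -20.0], [35.0, 22.0]])
x = np.array([23.7, -20.2])           # a 2-D vector to be quantized
idx = vq_encode(x, codebook)          # only this index needs to be transmitted
print(idx, vq_decode(idx, codebook))  # index 2, codeword (25.0, -20.0)
```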

The basic principle of 3D audio is to simulate, at the listener's ears, the sound field that a sound source at some position in space would produce, so that the listener perceives the sound as coming from that position. The horizontal direction of a source is judged mainly through the binaural effect. Horizontal orientation parameters are the key parameters used in 3D audio to express the horizontal direction of a source; they mainly include the interaural time difference (ITD) and the interaural level difference (ILD). Physiologically, the relative auditory importance of ITD and ILD is influenced by factors such as frequency and the temporal structure of the excitation signal. The frequency resolution of the human ear follows certain rules: it is high at low frequencies, where two sounds of similar frequency do not mask each other, and comparatively low at high frequencies. Practical audio codecs exploit this property by dividing the signal into several subbands and extracting an ILD and an ITD for each subband. Consequently, when quantizing the horizontal orientation parameters, the perceptual ability of the human ear in different frequency bands should also be taken into account, so that the quantization result better matches the ear's subjective perception. Existing quantization algorithms quantize the horizontal orientation parameters in isolation, ignore the influence of the frequency band and other related information on subjective listening quality, and aim only at the best objective performance; since objective performance does not fully coincide with subjective perception, the subjective perceptual performance of such quantizers is not optimal.
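For readers unfamiliar with these cues, the following simplified sketch shows how an ILD value could be extracted per subband as the left/right band-energy ratio in dB; the sampling rate, band edges, and test signal are assumptions made purely for illustration and do not come from the patent.

```python
# Simplified, hypothetical illustration of per-subband ILD extraction as the
# left/right band-energy ratio in dB; the sampling rate, band edges and test
# signal are assumptions for the demo and are not taken from the patent.
import numpy as np

def subband_ild(left, right, fs, band_edges_hz):
    """Return one ILD value (in dB) per subband of a stereo frame."""
    L, R = np.fft.rfft(left), np.fft.rfft(right)
    freqs = np.fft.rfftfreq(len(left), d=1.0 / fs)
    ilds = []
    for lo, hi in zip(band_edges_hz[:-1], band_edges_hz[1:]):
        band = (freqs >= lo) & (freqs < hi)
        e_left = np.sum(np.abs(L[band]) ** 2) + 1e-12    # epsilon avoids log(0)
        e_right = np.sum(np.abs(R[band]) ** 2) + 1e-12
        ilds.append(10.0 * np.log10(e_left / e_right))
    return ilds

fs = 16000
t = np.arange(1024) / fs
left = np.sin(2 * np.pi * 500 * t)
right = 0.5 * np.sin(2 * np.pi * 500 * t)    # right channel about 6 dB quieter
print(subband_ild(left, right, fs, [0, 1000, 4000, 8000]))   # roughly [6.0, 0.0, 0.0]
```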

In view of the above problems, there is an urgent need for a quantizer that takes the additional information associated with the horizontal orientation parameters into account, as well as the human ear's perception of sound, and that matches the ear's subjective listening experience, so as to improve the overall subjective coding performance.

Summary of the Invention

Addressing the shortcomings of existing quantization techniques, the present invention proposes a quantization scheme, suited to human auditory perception, for encoding the horizontal orientation parameters of 3D audio. During quantization, other information related to the horizontal orientation parameters is taken into account, and subjective perceptual distortion is used as the criterion for selecting the optimal codeword. The aim is to select codewords according to an evaluation criterion that matches the perceptual characteristics of the human ear by exploiting the information associated with the horizontal orientation parameters, thereby reducing perceived distortion and improving the overall subjective performance of quantization.

The technical solution of the present invention is a method for quantizing horizontal orientation parameters in 3D audio, comprising the following steps:

Step 1.1. Let the total number of subbands of the 3D audio be N. Input the horizontal orientation parameters x_1, x_2, ..., x_N of the N subbands and obtain the corresponding JND values jnd_1, jnd_2, ..., jnd_N by table lookup;

Step 1.2. According to a preset dimension k, partition the parameters into several k-dimensional horizontal orientation parameter vectors [(x_1, x_2, ..., x_k) (x_{k+1}, x_{k+2}, ..., x_{2k}) ... (x_{N-k+1}, x_{N-k+2}, ..., x_N)] and generate the corresponding codebook according to this partition; at the same time, partition the corresponding JND values into k-dimensional vectors [(jnd_1, jnd_2, ..., jnd_k) (jnd_{k+1}, jnd_{k+2}, ..., jnd_{2k}) ... (jnd_{N-k+1}, jnd_{N-k+2}, ..., jnd_N)];

Step 1.3. Quantize each horizontal orientation parameter vector (x_i, x_{i+1}, ..., x_{i+k-1}) to obtain a k-dimensional codeword vector (y_i, y_{i+1}, ..., y_{i+k-1}), where i takes the values 1, k+1, ..., N-k+1. This is implemented by performing the following sub-steps for each horizontal orientation parameter vector (x_i, x_{i+1}, ..., x_{i+k-1}):

Step 1.3.1. Read one codeword from the codebook in order and take the codeword read in as the current quantization result of the k-dimensional codeword vector (y_i, y_{i+1}, ..., y_{i+k-1});

Step 1.3.2. From the current quantization result of the k-dimensional codeword vector (y_i, y_{i+1}, ..., y_{i+k-1}) obtained in step 1.3.1 and the corresponding k-dimensional JND vector (jnd_i, jnd_{i+1}, ..., jnd_{i+k-1}) obtained in step 1.2, compute the subjective perceptual distortion D_sp of the quantization;

Step 1.3.3. Return to step 1.3.1, read the next codeword from the codebook in order and take it as the current quantization result of the k-dimensional codeword vector (y_i, y_{i+1}, ..., y_{i+k-1}), until all codewords in the codebook have been traversed; then, according to the results of each execution of step 1.3.2, select the codeword with the smallest subjective perceptual distortion D_sp as the final quantization result of the k-dimensional codeword vector (y_i, y_{i+1}, ..., y_{i+k-1});

Step 1.4. From the final quantization results of the k-dimensional codeword vectors [(y_1, y_2, ..., y_k) (y_{k+1}, y_{k+2}, ..., y_{2k}) ... (y_{N-k+1}, y_{N-k+2}, ..., y_N)] obtained in step 1.3.3, output the quantized horizontal orientation parameters {y_1, y_2, ..., y_N} of the N subbands together with the corresponding index values Index_1, Index_2, ..., Index_N, and write the index values to the bitstream.
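A minimal code sketch of steps 1.1 to 1.4 is given below. Because the patent's exact formula for D_sp appears only in a figure that is not reproduced in this text, the sketch assumes a JND-weighted squared error; the parameters, JND values, and codebook are likewise invented for demonstration.

```python
# Minimal sketch of steps 1.1-1.4, assuming a JND-weighted squared error for
# the subjective perceptual distortion D_sp; the patent's exact D_sp formula
# appears only in a figure not reproduced in this text, and the parameters,
# JND values and codebook below are invented for demonstration.
import numpy as np

def subjective_distortion(x, y, jnd):
    # Assumed form: each component error is normalised by its JND value.
    x, y, jnd = np.asarray(x), np.asarray(y), np.asarray(jnd)
    return float(np.sum(((x - y) / jnd) ** 2))

def quantize(params, jnds, codebook, k=2):
    indices, quantized = [], []
    for i in range(0, len(params), k):                   # step 1.2: k-dim grouping
        x, jnd = params[i:i + k], jnds[i:i + k]
        d = [subjective_distortion(x, cw, jnd) for cw in codebook]  # steps 1.3.1-1.3.2
        best = int(np.argmin(d))                         # step 1.3.3: smallest D_sp
        indices.append(best)
        quantized.extend(codebook[best])
    return quantized, indices                            # step 1.4: parameters + indexes

params = [23.7, -20.2, 0.9, 2.9]     # step 1.1: subband parameters ...
jnds = [0.8, 0.7, 0.6, 0.8]          # ... and their table-derived JND values
codebook = [[1.5, 2.7], [25.7, -20.2], [34.5, 22.7]]
print(quantize(params, jnds, codebook, k=2))   # -> ([25.7, -20.2, 1.5, 2.7], [1, 0])
```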

Moreover, when the preset dimension k is 2:

In step 1.2, the partition yields several two-dimensional horizontal orientation parameter vectors [(x_1, x_2) (x_3, x_4) ... (x_{N-1}, x_N)], and the corresponding codebook is generated according to this partition; at the same time, the corresponding JND values are grouped in pairs into two-dimensional vectors [(jnd_1, jnd_2) (jnd_3, jnd_4) ... (jnd_{N-1}, jnd_N)];

In step 1.3, each horizontal orientation parameter vector (x_i, x_{i+1}) is quantized to obtain a two-dimensional codeword vector (y_i, y_{i+1}), where i takes the values 1, 3, ..., N-1; this is implemented by performing the following sub-steps for each horizontal orientation parameter vector (x_i, x_{i+1}):

Step 1.3.1. Read one codeword from the codebook in order and take the codeword read in as the current value of the two-dimensional codeword vector (y_i, y_{i+1});

Step 1.3.2. From the current value of the two-dimensional codeword vector (y_i, y_{i+1}) obtained in step 1.3.1 and the corresponding two-dimensional JND vector (jnd_i, jnd_{i+1}) obtained in step 1.2, compute the subjective perceptual distortion D_sp of the quantization;

Step 1.3.3. Return to step 1.3.1, read the next codeword from the codebook in order and take it as the current value of the two-dimensional codeword vector (y_i, y_{i+1}), until all codewords in the codebook have been traversed; then, according to the results of each execution of step 1.3.2, select the codeword with the smallest subjective perceptual distortion D_sp as the final result of the two-dimensional codeword vector (y_i, y_{i+1});

In step 1.4, according to the two-dimensional codeword vectors [(y_1, y_2) (y_3, y_4) ... (y_{N-1}, y_N)] obtained in step 1.3.3, output the final quantization results, namely the quantized horizontal orientation parameters {y_1, y_2, ..., y_N} of the N subbands, together with the corresponding index values Index_1, Index_2, ..., Index_N, and write the index values to the bitstream.

The quantization coding scheme for horizontal orientation parameters provided by the present invention can be applied to all kinds of quantization methods. On top of the original quantization algorithm, which screens the optimal codeword by statistical distortion, the scheme combines the related information of the horizontal orientation parameters to compute a subjective perceptual distortion of the quantization; it thus retains the original characteristics of the various quantization devices while making the quantization results better match the subjective perceptual characteristics of the human ear. To compute the subjective distortion of the quantized horizontal orientation parameters, the present invention introduces the just noticeable difference (JND), the smallest difference perceivable by the human ear, a quantity that varies with the magnitude of the horizontal orientation parameter and with the frequency band in which it lies. The objective quantization distortion is first computed and then combined with the JND value (the perceivable difference) to obtain the subjective perceptual distortion. The computed subjective distortion reflects the subjective listening experience of the human ear well and helps improve the subjective perceptual performance of the whole quantization device.

Brief Description of the Drawings

Fig. 1 is an encoding flowchart of an embodiment of the present invention;

Fig. 2 is a decoding flowchart of an embodiment of the present invention.

Detailed Description of the Embodiments

The present invention provides a method for quantizing horizontal orientation parameters in 3D audio. The main idea is as follows: input the horizontal orientation parameter to be quantized and obtain the JND value of its frequency band by table lookup; iteratively read codewords from the codebook and quantize the horizontal orientation parameter with each one, computing the statistical distortion of quantization with the current codeword and then, by combining the statistical distortion with the JND value of the parameter to be quantized, the subjective perceptual distortion of quantization with the current codeword; after the whole codebook has been traversed, select the codeword that minimizes the subjective quantization distortion and output its index value. The corresponding decoding process receives the index values transmitted from the encoder and looks them up in the codebook to obtain the corresponding quantized horizontal orientation parameters.

In a specific implementation, those skilled in the art may use computer software to carry out the quantization coding automatically according to the technical solution provided. Since in coding applications the coding software method is often solidified into a coding device, a coding device implemented according to the method provided by the present invention also falls within the scope of protection.

The present invention groups the horizontal orientation parameters according to a preset dimension k to obtain several k-dimensional horizontal orientation parameter vectors, and then computes the subjective perceptual distortion of quantization by accumulating the error contributed by each component. The value of k therefore ranges over 1, 2, ..., N: taking k = 1 treats each horizontal orientation parameter as its own vector, while taking k = N treats all horizontal orientation parameters as one vector. In a specific implementation, those skilled in the art can choose the value according to the circumstances. N is normally made an integer multiple of k; if it is not, the last horizontal orientation parameter vector in the partition has fewer than k dimensions and is simply processed according to its actual dimension. The present invention suggests a value of 2 or 3 for k.
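A small grouping helper along these lines is sketched below, assuming that when N is not an integer multiple of k the last group is simply left shorter than k.

```python
# Grouping helper: splits a flat list of N parameters into k-dimensional
# vectors, with the last group allowed to be shorter when N is not an
# integer multiple of k (an assumption consistent with the text above).
def split_into_vectors(values, k):
    return [values[i:i + k] for i in range(0, len(values), k)]

print(split_into_vectors([1, 2, 3, 4, 5, 6, 7], k=3))   # [[1, 2, 3], [4, 5, 6], [7]]
```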

In the embodiment, the preset dimension k is 2, split vector quantization is chosen as the quantization algorithm, and the statistical distortion used is the mean square error. The technical solution of the present invention is described in detail below with reference to the accompanying drawings and the embodiment.

As shown in Fig. 1, the quantization coding steps of the embodiment are as follows:

Step 1.1. Let the total number of subbands of the 3D audio be N (N is even). Input the horizontal orientation parameters x_1, x_2, ..., x_N of the N subbands, denoted as the set X = {x_1, x_2, ..., x_N}, and look up the JND table according to the magnitude of each horizontal orientation parameter and the frequency band it belongs to, obtaining the corresponding JND values jnd_1, jnd_2, ..., jnd_N, denoted as the set JND = {jnd_1, jnd_2, ..., jnd_N}.

Step 1.2. According to the preset dimension k = 2, partition the parameters into two-dimensional horizontal orientation parameter vectors [(x_1, x_2) (x_3, x_4) ... (x_{N-1}, x_N)] and generate the corresponding codebook according to this partition; at the same time, group the corresponding JND values in pairs into two-dimensional vectors [(jnd_1, jnd_2) (jnd_3, jnd_4) ... (jnd_{N-1}, jnd_N)].

In a specific implementation, once the dimension k has been determined, the codebook can be trained for that dimension, and the trained codebook can be placed in the quantizer as one of its components. The specific training method is prior art and is not described in detail in the present invention. The number of codewords in the codebook can be set according to the required precision.
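One common prior-art way to train such a codebook is an LBG / k-means style iteration over sample vectors, sketched below purely as an illustration; the patent does not specify the training method, and the random training data here is invented for the demo.

```python
# Hedged illustration of LBG / k-means style codebook training for 2-D
# vectors; the patent leaves the training method to prior art, so this is
# only one common possibility, and the training data is random demo data.
import numpy as np

def train_codebook(samples, num_codewords, iters=20, seed=0):
    rng = np.random.default_rng(seed)
    # start from randomly chosen sample vectors as initial codewords
    codebook = samples[rng.choice(len(samples), num_codewords, replace=False)].copy()
    for _ in range(iters):
        # assign every sample to its nearest codeword (squared error)
        dists = ((samples[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
        nearest = dists.argmin(axis=1)
        # move each codeword to the centroid of the samples assigned to it
        for c in range(num_codewords):
            members = samples[nearest == c]
            if len(members) > 0:
                codebook[c] = members.mean(axis=0)
    return codebook

samples = np.random.default_rng(1).normal(size=(500, 2)) * 20.0
print(train_codebook(samples, num_codewords=8))
```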

Step 1.3. Quantize each horizontal orientation parameter vector (x_i, x_{i+1}) to obtain a two-dimensional codeword vector (y_i, y_{i+1}), where i takes the values 1, 3, ..., N-1.

This step performs vector quantization. The codebook is a set of codewords trained in advance, and quantization consists of selecting from the codebook the codeword closest to the vector to be quantized.

The criterion for "closest" is the computed distortion (error); during quantization, each codeword is therefore read in turn to compute the distortion, and the codeword that minimizes the distortion is selected iteratively as the quantization result. In the embodiment, the codeword pair with the smallest subjective perceptual distortion D_sp is selected as the two-dimensional codeword vector (y_i, y_{i+1}).

Accordingly, the embodiment performs the following sub-steps for each horizontal orientation parameter vector (x_i, x_{i+1}):

Step 1.3.1. Read one codeword from the codebook in order to quantize the horizontal orientation parameter vector (x_i, x_{i+1}), i.e. take the codeword read in as the current quantization result of the two-dimensional codeword vector (y_i, y_{i+1}). For example, for the horizontal orientation parameter vector (x_1, x_2), the first execution of step 1.3.1 reads the first codeword (y_1^1, y_2^1) from the codebook, the next execution reads the second codeword (y_1^2, y_2^2), and so on, until the last codeword (y_1^m, y_2^m) in the codebook has been read, where m is the number of codewords in the codebook.

Step 1.3.2. From the current quantization result of the two-dimensional codeword vector (y_i, y_{i+1}) obtained in step 1.3.1 and the corresponding two-dimensional JND vector (jnd_i, jnd_{i+1}) obtained in step 1.2, compute the subjective perceptual distortion D_sp of the quantization.

In ordinary vector quantization, the objective quantization distortion is computed with the mean square error formula, whereas the embodiment computes the subjective perceptual distortion on that basis, i.e. it combines the mean square error formula with the JND values (jnd_i, jnd_{i+1}) to obtain the subjective perceptual distortion D_sp.
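For reference, the mean square error of a two-dimensional vector is the standard expression below; the JND-weighted form shown after it is only an assumed illustration of how the combination might look, since the patent's actual D_sp formula is given in a figure that is not reproduced in this text.

```latex
% Objective (statistical) distortion of a 2-D vector: standard mean square error.
D = (x_i - y_i)^2 + (x_{i+1} - y_{i+1})^2
% Assumed, illustrative-only JND-weighted subjective distortion (not the patent's exact formula):
D_{sp} = \left(\frac{x_i - y_i}{\mathrm{jnd}_i}\right)^2 + \left(\frac{x_{i+1} - y_{i+1}}{\mathrm{jnd}_{i+1}}\right)^2
```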

Step 1.3.3. Return to step 1.3.1, read the codewords from the codebook in order as the current quantization result of the two-dimensional codeword vector (y_i, y_{i+1}), until all codewords in the codebook have been traversed; then select the codeword with the smallest subjective perceptual distortion D_sp as the final quantization result of the two-dimensional codeword vector (y_i, y_{i+1}), that is, the two-dimensional vector quantization result found for the horizontal orientation parameter vector (x_i, x_{i+1}). For example, for the horizontal orientation parameter vector (x_1, x_2), the codeword with the smallest subjective perceptual distortion D_sp among (y_1^1, y_2^1), (y_1^2, y_2^2), ..., (y_1^m, y_2^m) is taken as the final quantization result.

Step 1.4. According to the two-dimensional codeword vectors [(y_1, y_2) (y_3, y_4) ... (y_{N-1}, y_N)] obtained in step 1.3.3, output the final quantization results, namely the quantized horizontal orientation parameters {y_1, y_2, ..., y_N} of the N subbands, together with the corresponding index values Index_1, Index_2, ..., Index_N, and write the index values to the bitstream.

As shown in Fig. 2, the decoding steps of the embodiment can be as follows:

Step 2.1. Read the index values Index_1, Index_2, ..., Index_N from the bitstream.

Step 2.2. Look up the index values Index_1, Index_2, ..., Index_N in the codebook to obtain the quantized horizontal orientation parameters {y_1, y_2, ..., y_N}.
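Decoding therefore reduces to a table lookup, as in the sketch below; the codebook values are invented for the example.

```python
# Decoding sketch: the decoder only needs the received index values and a copy
# of the same codebook; the codebook values here are invented for the example.
def decode(indices, codebook):
    """Steps 2.1-2.2: look up each index in the codebook and concatenate."""
    out = []
    for idx in indices:
        out.extend(codebook[idx])
    return out

codebook = [[1.5, 2.7], [25.7, -20.2], [34.5, 22.7]]
print(decode([1, 0, 2], codebook))   # -> [25.7, -20.2, 1.5, 2.7, 34.5, 22.7]
```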

To illustrate the effect of the present invention, a concrete quantization example using the technical solution of the embodiment is given; the number N of input subband horizontal orientation parameters is 6.

(a) The 6-dimensional input of horizontal orientation parameters to be quantized is X = {23.709299, -20.163263, 0.947107, 2.862993, 34.412148, 22.736864}; table lookup gives the corresponding JND values JND = {0.815555417, 0.66236129, 0.601276375, 0.732569875, 0.619971, 0.722825075}.

(b) The input data are grouped into three two-dimensional vectors (23.709299, -20.163263), (0.947107, 2.862993), (34.412148, 22.736864), and the corresponding JND values are likewise grouped into three two-dimensional vectors (0.815555417, 0.66236129), (0.601276375, 0.732569875), (0.619971, 0.722825075).

(c) The three vectors in X are quantized separately. For the vector (23.709299, -20.163263) with JND value (0.619971, 0.722825075), the statistical distortion is computed against the codewords in the codebook one by one and the subjective perceptual distortion is then computed, giving a minimum subjective perceptual distortion of 0.236205, a corresponding codeword of (25.689648, -20.168681), and an output index of 5.

(d) The same operation is applied to the remaining two vectors, giving the corresponding codewords (1.542842, 2.684604) and (34.535961, 22.689129), i.e. the quantization results with the best subjective perceptual effect; the corresponding output quantization indexes are 3 and 7.

(e) The quantized horizontal orientation parameters are X' = (25.689648, -20.168681, 1.542842, 2.684604, 34.535961, 22.689129), and the corresponding output quantization indexes are 5, 3 and 7.

The specific embodiments described herein merely illustrate the spirit of the present invention. Those skilled in the art to which the present invention belongs can make various modifications or additions to the described embodiments, or replace them in similar ways, without departing from the spirit of the present invention or exceeding the scope defined by the appended claims.

Claims (2)

1. A method for quantizing horizontal orientation parameters in 3D audio, characterized in that it comprises the following steps:
Step 1.1. Let the total number of subbands of the 3D audio be N; input the horizontal orientation parameters x_1, x_2, ..., x_N of the N subbands and obtain the corresponding JND values jnd_1, jnd_2, ..., jnd_N by table lookup;
Step 1.2. According to a preset dimension k, partition the parameters into several k-dimensional horizontal orientation parameter vectors [(x_1, x_2, ..., x_k) (x_{k+1}, x_{k+2}, ..., x_{2k}) ... (x_{N-k+1}, x_{N-k+2}, ..., x_N)] and generate the corresponding codebook according to this partition; at the same time, partition the corresponding JND values into k-dimensional vectors [(jnd_1, jnd_2, ..., jnd_k) (jnd_{k+1}, jnd_{k+2}, ..., jnd_{2k}) ... (jnd_{N-k+1}, jnd_{N-k+2}, ..., jnd_N)];
Step 1.3. Quantize each horizontal orientation parameter vector (x_i, x_{i+1}, ..., x_{i+k-1}) to obtain a k-dimensional codeword vector (y_i, y_{i+1}, ..., y_{i+k-1}), where i takes the values 1, k+1, ..., N-k+1; this is implemented by performing the following sub-steps for each horizontal orientation parameter vector (x_i, x_{i+1}, ..., x_{i+k-1}):
Step 1.3.1. Read one codeword from the codebook in order and take the codeword read in as the current quantization result of the k-dimensional codeword vector (y_i, y_{i+1}, ..., y_{i+k-1});
Step 1.3.2. From the current quantization result of the k-dimensional codeword vector (y_i, y_{i+1}, ..., y_{i+k-1}) obtained in step 1.3.1 and the corresponding k-dimensional JND vector (jnd_i, jnd_{i+1}, ..., jnd_{i+k-1}) obtained in step 1.2, compute the subjective perceptual distortion D_sp of the quantization;
Step 1.3.3. Return to step 1.3.1, read the next codeword from the codebook in order and take it as the current quantization result of the k-dimensional codeword vector (y_i, y_{i+1}, ..., y_{i+k-1}), until all codewords in the codebook have been traversed; then, according to the results of each execution of step 1.3.2, select the codeword with the smallest subjective perceptual distortion D_sp as the final quantization result of the k-dimensional codeword vector (y_i, y_{i+1}, ..., y_{i+k-1});
Step 1.4. From the final quantization results of the k-dimensional codeword vectors [(y_1, y_2, ..., y_k) (y_{k+1}, y_{k+2}, ..., y_{2k}) ... (y_{N-k+1}, y_{N-k+2}, ..., y_N)] obtained in step 1.3.3, output the quantized horizontal orientation parameters {y_1, y_2, ..., y_N} of the N subbands together with the corresponding index values Index_1, Index_2, ..., Index_N, and write the index values to the bitstream.
2. The method for quantizing horizontal orientation parameters in 3D audio according to claim 1, characterized in that the preset dimension k is 2, and:
In step 1.2, the partition yields several two-dimensional horizontal orientation parameter vectors [(x_1, x_2) (x_3, x_4) ... (x_{N-1}, x_N)] and the corresponding codebook is generated according to this partition; at the same time, the corresponding JND values are grouped in pairs into two-dimensional vectors [(jnd_1, jnd_2) (jnd_3, jnd_4) ... (jnd_{N-1}, jnd_N)];
In step 1.3, each horizontal orientation parameter vector (x_i, x_{i+1}) is quantized to obtain a two-dimensional codeword vector (y_i, y_{i+1}), where i takes the values 1, 3, ..., N-1; this is implemented by performing the following sub-steps for each horizontal orientation parameter vector (x_i, x_{i+1}):
Step 1.3.1. Read one codeword from the codebook in order and take the codeword read in as the current value of the two-dimensional codeword vector (y_i, y_{i+1});
Step 1.3.2. From the current value of the two-dimensional codeword vector (y_i, y_{i+1}) obtained in step 1.3.1 and the corresponding two-dimensional JND vector (jnd_i, jnd_{i+1}) obtained in step 1.2, compute the subjective perceptual distortion D_sp of the quantization;
Step 1.3.3. Return to step 1.3.1, read the next codeword from the codebook in order and take it as the current value of the two-dimensional codeword vector (y_i, y_{i+1}), until all codewords in the codebook have been traversed; then, according to the results of each execution of step 1.3.2, select the codeword with the smallest subjective perceptual distortion D_sp as the final result of the two-dimensional codeword vector (y_i, y_{i+1});
In step 1.4, according to the two-dimensional codeword vectors [(y_1, y_2) (y_3, y_4) ... (y_{N-1}, y_N)] obtained in step 1.3.3, output the final quantization results, namely the quantized horizontal orientation parameters {y_1, y_2, ..., y_N} of the N subbands, together with the corresponding index values Index_1, Index_2, ..., Index_N, and write the index values to the bitstream.
Application CN201210256992.5A, filed 2012-07-24 (priority 2012-07-24). Title: Method for quantizing horizontal orientation parameters in 3D audio. Status: Expired - Fee Related. Granted publication: CN102760442B (en).

Priority Applications (1)

Application Number: CN201210256992.5A
Granted Publication: CN102760442B (en)
Priority Date / Filing Date: 2012-07-24
Title: Method for quantizing horizontal orientation parameters in 3D audio


Publications (2)

Publication Number Publication Date
CN102760442A CN102760442A (en) 2012-10-31
CN102760442B true CN102760442B (en) 2014-09-03

Family

ID=47054883

Family Applications (1)

Application Number: CN201210256992.5A
Title: Method for quantizing horizontal orientation parameters in 3D audio
Priority Date / Filing Date: 2012-07-24
Status: Expired - Fee Related (granted as CN102760442B)

Country Status (1)

Country Link
CN (1) CN102760442B (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101188878A (en) * 2007-12-05 2008-05-28 武汉大学 A Spatial Parameter Quantization and Entropy Coding Method of Stereo Audio Signal and Its System Structure
CN101499280A (en) * 2009-03-09 2009-08-05 武汉大学 Spacing parameter choosing method and apparatus based on spacing perception entropy judgement

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7333930B2 (en) * 2003-03-14 2008-02-19 Agere Systems Inc. Tonal analysis for perceptual audio coding using a compressed spectral representation


Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Tu Weiping, et al. Measurement and Analysis of Just Noticeable Difference of Interaural Level Difference Cue. 2010 International Conference on Multimedia Technology (ICMT), October 2010, pp. 1-3. *
Ruimin Hu, et al. Perceptual characteristic and compression research in 3D audio technology. 9th International Symposium on Computer Music Modelling and Retrieval (CMMR 2012), June 2012, pp. 241-256. *

Also Published As

Publication number Publication date
CN102760442A (en) 2012-10-31

Similar Documents

Publication Publication Date Title
US7573912B2 (en) Near-transparent or transparent multi-channel encoder/decoder scheme
US8964994B2 (en) Encoding of multichannel digital audio signals
KR101143225B1 (en) Complex-transform channel coding with extended-band frequency coding
CN101223582B (en) Audio frequency coding method, audio frequency decoding method and audio frequency encoder
CN105917408B (en) Indicating frame parameter reusability for coding vectors
RU2368074C2 (en) Adaptive grouping of parametres for improved efficiency of coding
TWI584271B (en) Encoding device and encoding method thereof, decoding device and decoding method thereof, and computer program
JP2019080347A (en) Method for parametric multi-channel encoding
ES2899286T3 (en) Temporal Envelope Configuration for Audio Spatial Encoding Using Frequency Domain Wiener Filtering
CN112997248B (en) Determining coding and associated decoding of spatial audio parameters
KR101679083B1 (en) Factorization of overlapping transforms into two block transforms
JP2022160440A (en) Multi-channel signal encoding method, multi-channel signal decoding method, encoder, and decoder
TW201145261A (en) Audio encoder, audio decoder, method for encoding and audio information, method for decoding an audio information and computer program using a hash table describing both significant state values and interval boundaries
CN106463121A (en) Higher order ambisonics signal compression
JP2022548038A (en) Determining Spatial Audio Parameter Encoding and Related Decoding
US20160111100A1 (en) Audio signal encoder
KR20190040063A (en) Quantizer with index coding and bit scheduling
JP2020518030A (en) Difference data in digital audio signal
US9214158B2 (en) Audio decoding device and audio decoding method
CN102760442B Method for quantizing horizontal orientation parameters in 3D audio
CN102708872B (en) Method for acquiring horizontal azimuth parameter codebook in three-dimensional (3D) audio
CN105336334B (en) Multi-channel sound signal coding method, decoding method and device
CN118609581B (en) Audio encoding and decoding methods, apparatuses, devices, storage medium, and products
WO2024196888A1 (en) Frame segmentation and grouping for audio encoding
Yang et al. Multi-channel object-based spatial parameter compression approach for 3d audio

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 2014-09-03