CN110139098B - Intra-frame fast algorithm selection method for high-efficiency video encoder based on decision tree - Google Patents
Intra-frame fast algorithm selection method for high-efficiency video encoder based on decision tree Download PDFInfo
- Publication number
- CN110139098B CN110139098B CN201910281249.7A CN201910281249A CN110139098B CN 110139098 B CN110139098 B CN 110139098B CN 201910281249 A CN201910281249 A CN 201910281249A CN 110139098 B CN110139098 B CN 110139098B
- Authority
- CN
- China
- Prior art keywords
- ctu
- rate
- algorithm
- encoding
- cus
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Links
- 238000003066 decision tree Methods 0.000 title claims abstract description 17
- 238000010187 selection method Methods 0.000 title claims description 5
- 238000000034 method Methods 0.000 claims abstract description 32
- 238000012216 screening Methods 0.000 claims description 6
- 230000009466 transformation Effects 0.000 claims description 6
- 238000013139 quantization Methods 0.000 claims description 4
- 230000001174 ascending effect Effects 0.000 claims 1
- 230000006835 compression Effects 0.000 description 5
- 238000007906 compression Methods 0.000 description 5
- 241000219357 Cactaceae Species 0.000 description 2
- 238000010586 diagram Methods 0.000 description 2
- 108010076504 Protein Sorting Signals Proteins 0.000 description 1
- 230000009286 beneficial effect Effects 0.000 description 1
- 230000005540 biological transmission Effects 0.000 description 1
- 238000004364 calculation method Methods 0.000 description 1
- 238000006243 chemical reaction Methods 0.000 description 1
- 238000005457 optimization Methods 0.000 description 1
- 230000000750 progressive effect Effects 0.000 description 1
- 238000010845 search algorithm Methods 0.000 description 1
- 230000008054 signal transmission Effects 0.000 description 1
- 230000000007 visual effect Effects 0.000 description 1
Images
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/102—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
- H04N19/103—Selection of coding mode or of prediction mode
- H04N19/107—Selection of coding mode or of prediction mode between spatial and temporal predictive coding, e.g. picture refresh
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/102—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
- H04N19/12—Selection from among a plurality of transforms or standards, e.g. selection between discrete cosine transform [DCT] and sub-band transform or selection between H.263 and H.264
- H04N19/122—Selection of transform size, e.g. 8x8 or 2x4x8 DCT; Selection of sub-band transforms of varying structure or type
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/134—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
- H04N19/146—Data rate or code amount at the encoder output
- H04N19/147—Data rate or code amount at the encoder output according to rate distortion criteria
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/50—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
- H04N19/503—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
- H04N19/51—Motion estimation or motion compensation
- H04N19/56—Motion estimation with initialisation of the vector search, e.g. estimating a good candidate to initiate a search
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/90—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using coding techniques not provided for in groups H04N19/10-H04N19/85, e.g. fractals
- H04N19/96—Tree coding, e.g. quad-tree coding
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Signal Processing (AREA)
- Physics & Mathematics (AREA)
- Discrete Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Compression Or Coding Systems Of Tv Signals (AREA)
Abstract
Description
技术领域technical field
本发明属于视频编码解码技术领域,具体涉及基于决策树的高效率视频编码器帧内快速算法选择方法。The invention belongs to the technical field of video encoding and decoding, and in particular relates to a fast algorithm selection method in a high-efficiency video encoder frame based on a decision tree.
背景技术Background technique
视频编码,就是指通过一些压缩的手段,将视频信号的文件转换成另一种文件格式,从而使得在信号传输过程中,减少带宽的使用,使其高效的传播。高效率视频编码(HighEfficiency Video Coding,简称HEVC)是一种新的视频压缩标准,在性能上相较于H.264更加优秀,在同等视频质量下其压缩率可达到H.264的2倍。电影、动画片等视频经HEVC视频压缩后,手机用户观看在线视频不仅流量耗费大大减少,且下载速度会更快,画质基本不会受到影响,即使在线观看也会更流畅,不易卡机。Video coding refers to the conversion of video signal files into another file format through some compression means, so that in the process of signal transmission, the use of bandwidth is reduced and its transmission is efficient. High Efficiency Video Coding (HEVC for short) is a new video compression standard, which is better than H.264 in performance, and its compression rate can reach twice that of H.264 under the same video quality. After movies, cartoons and other videos are compressed by HEVC video, not only the traffic consumption of mobile phone users watching online videos will be greatly reduced, but also the download speed will be faster, and the image quality will basically not be affected.
在HEVC中,视频信号序列以图像组(Group of Picture,简称GOP)为基本单位进行编码,其中每一帧又会被分为一系列的片(编码的独立单元),每一片又会被划分为若干个树形编码单元(Coding Tree Unit,简称CTU),CTU按照类似四叉树的结构划分成四个更小尺寸的编码单元(Coding Units,简称CU),CU是HEVC帧内/帧间预测、量化变换以及熵编码等环节共享的基本单位,可支持的编码尺寸最大为64×64,最小为8×8,编码器可以根据不同的图片内容、图片大小以及应用需求合理地选择CU的尺寸,以此获得较大程度的优化。In HEVC, the video signal sequence is encoded with the Group of Picture (GOP) as the basic unit, in which each frame will be divided into a series of slices (independent units of encoding), and each slice will be divided into It is a number of tree-shaped coding units (Coding Tree Unit, referred to as CTU). CTU is divided into four smaller-sized coding units (Coding Units, referred to as CU) according to a quadtree-like structure. CU is HEVC intra/inter frame The basic unit shared by prediction, quantization transformation, and entropy coding. The maximum supported coding size is 64×64, and the minimum is 8×8. The encoder can reasonably select the CU according to different picture content, picture size and application requirements. Size, in order to obtain a greater degree of optimization.
帧内预测技术在H.264/AVC中已经成功得到应用。HEVC中的帧内预测指利用图像的空间相关性,通过已编码的重构像素块来预测当前像素块,以去除空间冗余信息,提高图像压缩率。在HEVC中,为了更准确地描述图像的纹理特性,降低预测误差,提出了更精确的帧内预测技术,将预测模式增加到了35种,如图1所示。对于HEVC帧内编码,CTU可以迭代地划分成四个CU,一直划分到最小编码单一(8×8),每一深度的CU又可以划分成2N×2N和N×N两种PU,每种PU进行35种帧内预测编码,并根据RDCost选择最佳PU模式。判定一个CTU的最佳划分方式,一般需要进行1+4+42+43=85次RDCost计算,如图2中序号所示。Intra-frame prediction technology has been successfully applied in H.264/AVC. Intra prediction in HEVC refers to using the spatial correlation of the image to predict the current pixel block through the encoded reconstructed pixel block, so as to remove spatial redundant information and improve the image compression rate. In HEVC, in order to more accurately describe the texture characteristics of the image and reduce the prediction error, a more accurate intra-frame prediction technology is proposed, and the prediction mode is increased to 35, as shown in Figure 1. For HEVC intra-frame coding, the CTU can be iteratively divided into four CUs until the smallest coded unit (8×8), and each depth CU can be divided into 2N×2N and N×N PUs, each The PU performs 35 types of intra-frame prediction coding, and selects the best PU mode according to RDCost. To determine the optimal division method of a CTU, generally 1+4+4 2 +4 3 =85 RDCost calculations are required, as shown by the serial numbers in FIG. 2 .
HEVC的帧内编码算法从35种预测模式中选择出最佳的预测模式要经历两个步骤,一是“粗搜索”,二是“精搜索”。在粗搜索中,编码器首先从35种模式中选出N种最有可能成为最佳模式的候选类型构成“精搜索”候选集合。N取决于预测单元(Prediction Uints,简称PU)的大小,当PU的大小为{4×4,8×8,16×16,32×32,64×64}时,对应的N分别是{8,8,3,3,3}。然后将“最可能”预测模式(Most Probable Modes,简称MPMs)加入到候选集合中。在“精搜索”中,对候选集合中的模式进行完整的计算率失真代价,将率失真代价最小的模式作为帧内模式的方向。由于完整的率失真代价计算复杂度很高,在“粗搜索”中只计算了简单的率失真代价。The HEVC intra-frame coding algorithm selects the best prediction mode from 35 prediction modes to go through two steps, one is "rough search" and the other is "fine search". In the rough search, the encoder first selects N candidate types that are most likely to become the best mode from the 35 modes to form a "fine search" candidate set. N depends on the size of the prediction unit (Prediction Uints, referred to as PU). When the size of the PU is {4×4, 8×8, 16×16, 32×32, 64×64}, the corresponding N is {8 , 8, 3, 3, 3}. The "Most Probable" prediction modes (MPMs for short) are then added to the candidate set. In the "fine search", the rate-distortion cost is completely calculated for the modes in the candidate set, and the mode with the smallest rate-distortion cost is taken as the direction of the intra mode. Due to the high computational complexity of the full rate-distortion cost, only simple rate-distortion costs are calculated in the "coarse search".
出于压缩效率的考虑,HEVC编码器的计算复杂度很高,在视频会议、网络直播等对延迟有较高要求的应用中受到了限制。因此,为了提高视频编码效率,降低计算复杂度,仍需一种新的方法。For the sake of compression efficiency, the computational complexity of the HEVC encoder is very high, which is limited in applications that have high requirements for delay, such as video conferencing and webcasting. Therefore, in order to improve video coding efficiency and reduce computational complexity, a new method is still needed.
发明内容Contents of the invention
为解决现有技术中存在的问题,本发明实施例之一的目的在于提供一种基于决策树的高效率视频编码器帧内快速算法选择方法。In order to solve the problems existing in the prior art, an object of one embodiment of the present invention is to provide a fast algorithm selection method in a high-efficiency video encoder based on a decision tree.
为实现上述目的,本发明实施例之一采用以下技术方案:In order to achieve the above purpose, one of the embodiments of the present invention adopts the following technical solutions:
基于决策树的高效率视频编码器帧内快速算法选择方法,步骤包括:A method for fast algorithm selection in a high-efficiency video encoder frame based on a decision tree, the steps of which include:
(1)分别用第一算法和第二算法对训练视频序列进行编码,将编码过程中编码每个CTU时的中间信息作为特征写入文本文件;(1) encode the training video sequence with the first algorithm and the second algorithm respectively, and write the intermediate information when encoding each CTU in the encoding process into a text file as a feature;
(2)对步骤(1)的文本文件进行标记,若则标记为0,否则标记为1,得到标记的训练样本,其中RDcost1是第一算法的总率失真代价,RDcost2是第二算法的总率失真代价,T1是第一算法编码所用时间,T2分别是第二算法编码所用时间;(2) mark the text file of step (1), if Then mark it as 0, otherwise mark it as 1, and get the marked training samples, where RDcost1 is the total rate-distortion cost of the first algorithm, RDcost2 is the total rate-distortion cost of the second algorithm, T1 is the encoding time of the first algorithm, T2 respectively is the encoding time of the second algorithm;
(3)用步骤(2)标记的训练样本对决策树模型进行训练后,通过训练后的决策树模型在CTU开始编码时进行预测,决定编码流程。(3) After training the decision tree model with the training samples marked in step (2), the trained decision tree model is used to predict when the CTU starts encoding, and determine the encoding process.
步骤(1)中,训练视频序列包括:Kimono、ParkScene、Cactus、BasketballDrill、BQMall、BasketballPass、BQSquare、FourPeople、KristenAndSara、vidyo1、vidyo3、BasketballDrillText和SlideEditing。In step (1), the training video sequence includes: Kimono, ParkScene, Cactus, BasketballDrill, BQMall, BasketballPass, BQSquare, FourPeople, KristenAndSara, vidyo1, vidyo3, BasketballDrillText and SlideEditing.
测试视频的帧数等于视频一秒钟包含的图片数,即视频的帧率。The number of frames of the test video is equal to the number of pictures contained in the video in one second, that is, the frame rate of the video.
编码器的配置文件为encoder_intra_main.cfg。The configuration file of the encoder is encoder_intra_main.cfg.
优选地,所述第一算法步骤包括:Preferably, the first algorithm step comprises:
(1s)快速模式决策:(1s) Fast mode decision:
(1s.a)按HEVC标准对{0,1,2,6,10,14,18,22,26,30,34}11个模式做粗模式搜索,根据绝对变换差值,从中选出六个最佳帧内模式候选,结合当前PU左边和上边PU的最佳帧内模式组成集合A;(1s.a) According to the HEVC standard, do a rough pattern search on {0, 1, 2, 6, 10, 14, 18, 22, 26, 30, 34} 11 patterns, and select six of them according to the absolute transformation difference The best intra-mode candidates are combined with the best intra-modes of the left side of the current PU and the top PU to form a set A;
(1s.b)对集合A中的所有元素的2距邻居模式进行测验,从中进一步选出最好的两个帧内模式候选,将这两个帧内模式的1距邻居模式连同PU的最可能模式(MPM)组成集合B;(1s.b) Test the 2-distance neighbor patterns of all elements in the set A, and further select the best two intra-mode candidates, and combine the 1-distance neighbor patterns of the two intra-modes together with the PU’s best The possible patterns (MPM) form the set B;
(1s.c)对集合B中所有模式做粗模式搜索;(1s.c) Do a coarse pattern search for all patterns in set B;
(1s.d)从所有做过粗模式搜索的模式中找出SATD代价最小的M个模式,进行后续操作,M的个数由CU的大小决定:当CU大小为{64×64,32×32,16×16,8×8,4×4}时,M的值分别为{3,3,3,8,8};(1s.d) Find M patterns with the smallest SATD cost from all the patterns that have been searched for coarse patterns, and perform subsequent operations. The number of M is determined by the size of the CU: when the size of the CU is {64×64, 32× 32, 16×16, 8×8, 4×4}, the values of M are {3, 3, 3, 8, 8};
(2s)基于率失真优化量化的模式筛选:(2s) Mode screening based on rate-distortion optimized quantization:
(2s.a)从快速模式决策中得到的M个模式中选出SATD代价最小两个{m1,m2}组成集合W;(2s.a) From the M modes obtained in the fast mode decision, select two {m1, m2} with the smallest SATD cost to form a set W;
(2s.b)将M个模式中剩余的模式mi依次进行以下操作:(2s.b) Perform the following operations on the remaining patterns mi among the M patterns in sequence:
若mi与W中的所有元素的距离都大于1,则将mi加入到W集合中;If the distance between m i and all elements in W is greater than 1, add m i to the W set;
若集合中的元素包括了这些模式{m1,m2,Intra_Planar,Intra_DC,MPM},则跳出步骤(2s.b);If the elements in the collection include these patterns {m 1 , m 2 , Intra_Planar, Intra_DC, MPM}, skip step (2s.b);
(2s.c)对W中的所有元素做精模式搜索;(2s.c) Do a refined pattern search for all elements in W;
(3s)基于率失真代价的终止划分:(3s) Termination division based on rate-distortion cost:
若当前子CU的率失真代价之和大于其中某一阈值时,则跳过后续子CU的编码过程,减小计算复杂度,具体判断标准为:If the sum of the rate-distortion costs of the current sub-CU is greater than one of the thresholds, the encoding process of the subsequent sub-CU is skipped to reduce the computational complexity. The specific judgment criteria are:
其中:K的取值范围为{1,2,3,4},βK对应K={1,2,3,4}时的取值为{1.5,1.2,1.1,1},表示4个子CU的哈达马代价之和,4个子CU还没完全编码完的情况下,其值用当前CU的率失真代价代替,为前K个子CU的哈达马代价之和,表示第i个子CU的率失真代价,为当前CU的率失真代价。Among them: the value range of K is {1, 2, 3, 4}, and the value of β K corresponding to K={1, 2, 3, 4} is {1.5, 1.2, 1.1, 1}, Indicates the sum of the Hadamard costs of the 4 sub-CUs. When the 4 sub-CUs have not been completely encoded, its value is replaced by the rate-distortion cost of the current CU. is the sum of Hadamard costs of the first K sub-CUs, Indicates the rate-distortion cost of the i-th sub-CU, is the rate-distortion cost of the current CU.
某帧内模式的N距邻居表示与该帧内模式值之差的绝对值等于N的帧内模式,即满足等式|m–mi|=N的帧内模式(其中mi表示该帧内模式的值,m的值表示该帧内模式N距邻居的帧模式值),模式Intra_Planar和Intra_DC没有N距邻居。The N-distance neighbors of a certain intra-frame mode represent the intra-frame mode whose absolute value of the difference from the intra-frame mode value is equal to N, that is, the intra-frame mode that satisfies the equation |m–mi|=N (where mi represents the intra-frame mode The value of m represents the frame mode value of the intra-frame mode N-distance neighbor), and the modes Intra_Planar and Intra_DC have no N-distance neighbors.
优选地,所述第二算法步骤包括:Preferably, the second algorithm step includes:
(1s)CU深度预测:(1s) CU depth prediction:
若则当前编码块的深度范围设置为与前一帧相同位置编码块的深度范围相同,若当前编码块的深度范围设置为其中为前一帧相同位置编码块的最小编码深度,为前一帧相同位置编码块的最大编码深度,p为常数1.02;like Then the depth range of the current coding block is set to be the same as the depth range of the coding block at the same position in the previous frame, if The depth range of the current encoded block is set to in the minimum coded depth for the same-position coded block in the previous frame, is the maximum coded depth of the coded block at the same position in the previous frame, p is a constant 1.02;
(2s)基于率失真代价的终止划分:(2s) Termination division based on rate-distortion cost:
如果当前编码块深度为d的CU的率失真代价J满足:If the rate-distortion cost J of the CU with the depth of the currently coded block is d satisfies:
其中表示上一帧当前位置编码块总的率失真代价,d表示编码深度,为系数,值为上一帧当前位置编码块处于最大深度的CU数与编码块总CU数之比;in Indicates the total rate-distortion cost of the encoding block at the current position of the previous frame, d indicates the encoding depth, is a coefficient, and the value is the ratio of the number of CUs at the maximum depth of the coding block at the current position of the previous frame to the total number of CUs of the coding block;
(3s)快速候选模式筛选:(3s) Fast candidate pattern screening:
对粗搜索后得到的候选集合,其中元素按率失真代价从小到大排列,记为P={p(0),p(1),…,p(M-1)},在进行精搜索之前,对候选集中的元素进行以下操作:For the candidate set obtained after the rough search, the elements are arranged according to the rate-distortion cost from small to large, which is recorded as P={p(0),p(1),...,p(M-1)}, before the fine search , perform the following operations on the elements in the candidate set:
(3s.a)假设m为MPM中三个元素在P中最靠后的索引值,则集合P的大小可以缩小为P={p(0),p(1),…,p(m)},其中M–1>m;(3s.a) Assuming that m is the last index value of the three elements in P in the MPM, the size of the set P can be reduced to P={p(0),p(1),...,p(m) }, where M–1>m;
(3s.b)对于新的集合P中的所有元素,如果满足J(p(i))SATD>1.3J(p(0))SATD,则将元素p(i)从集合P中移除,其中J(p(i))SATD,J(p(0))SATD分别代表P中第i+1个、第0个元素的率失真代价。(3s.b) For all elements in the new set P, if J(p(i)) SATD > 1.3J(p(0)) SATD is satisfied, the element p(i) is removed from the set P, Among them, J(p(i)) SATD and J(p(0)) SATD represent the rate-distortion cost of the i+1th and 0th elements in P, respectively.
优选地,所述中间信息包括:CTU的最大编码深度,CTU的最小编码深度,最大编码深度CU的个数与CTU中所有CU的个数之比,CTU中总的率失真代价,上一帧当前位置CTU的总率失真代价,左边CTU最大编码深度与最小编码深度之差,左边CTU最大编码深度CU的个数与CTU中所有CU的个数之比,右边CTU最大编码深度与最小编码深度之差,右边CTU最大编码深度CU的个数与CTU中所有CU的个数之比,以及编码CTU所用时间。Preferably, the intermediate information includes: the maximum coded depth of the CTU, the minimum coded depth of the CTU, the ratio of the number of CUs with the maximum coded depth to the number of all CUs in the CTU, the total rate-distortion cost in the CTU, and the previous frame The total rate-distortion cost of the CTU at the current position, the difference between the maximum coded depth and the minimum coded depth of the left CTU, the ratio of the number of CUs with the maximum coded depth of the left CTU to the number of all CUs in the CTU, the maximum coded depth and the minimum coded depth of the right CTU The difference, the ratio of the number of maximum coded depth CUs in the right CTU to the number of all CUs in the CTU, and the time spent coding the CTU.
优选地,步骤(3)中,决定编码流程的步骤包括:Preferably, in step (3), the step of determining the encoding process includes:
(a)在CTU编码开始前,判断编码帧是否为第一帧,若是,则用第一算法对当前CTU进行编码,若否,则进行步骤(b);(a) before the CTU encoding starts, judge whether the encoded frame is the first frame, if so, then use the first algorithm to encode the current CTU, if not, then proceed to step (b);
(b)将前一帧当前位置CTU编码完成时保存下来的特征输入到决策树进行预测,若预测结果为0,用第一算法对CTU进行编码,若结果为1,用第二算法对CTU进行编码;(b) Input the features saved when the current position CTU encoding of the previous frame is completed to the decision tree for prediction. If the prediction result is 0, use the first algorithm to encode the CTU; if the result is 1, use the second algorithm to encode the CTU to encode;
(c)收集步骤(b)编码过程中的中间量并编码下一个CTU;(c) collecting intermediate quantities in the encoding process of step (b) and encoding the next CTU;
(d)重复步骤(b)和(c)至所有CTU编码完成。(d) Repeat steps (b) and (c) until all CTUs are coded.
步骤(1)中,用第一算法对当前CTU进行编码,编码完成后收集编码过程中的中间量并保存下来作为以后决策时的特征。In step (1), the first algorithm is used to encode the current CTU, and after the encoding is completed, the intermediate quantities in the encoding process are collected and saved as features for future decision-making.
第一算法在微观和宏观层面分别引入快速决策算法:微观上提出了一种渐进式的粗搜索算法,降低进行粗搜索的预测方向的个数;宏观上比较当前PU的绝对变换差值(SATD)与其四个子PU的绝对变换差值之和,决定CU是否进一步往下划分,减小编码深度。The first algorithm introduces fast decision-making algorithms at the micro and macro levels: a progressive coarse search algorithm is proposed at the micro level to reduce the number of prediction directions for rough search; at the macro level, the absolute transformation difference (SATD) of the current PU is compared ) and the sum of the absolute transformation differences of its four sub-PUs determine whether the CU is further divided to reduce the coding depth.
关于第二算法,编码块的最大编码深度与最小编码深度之差的平均值等于1.75,说明在编码块编码过程中,仅需要搜索2至3个深度,而不是编码所有深度。因此只要能精确的预测编码块的编码深度,就能极大的降低编码器的计算复杂度。Regarding the second algorithm, the average value of the difference between the maximum coded depth and the minimum coded depth of a coded block is equal to 1.75, indicating that only 2 to 3 depths need to be searched instead of coding all depths during the coding process of a coded block. Therefore, as long as the coding depth of the coding block can be accurately predicted, the computational complexity of the encoder can be greatly reduced.
本发明实施例之一的方法中,CTU是编码的最小单元,编码一段视频可以看成是编码一个个的CTU,不同的算法编码相同的CTU时会消耗不同的时间。如果对于每一个CTU都使用编码时间最短的算法来完成编码,那么其总的编码时间将会小于使用单一算法的编码器。In the method of one embodiment of the present invention, a CTU is the smallest unit of encoding, and encoding a piece of video can be regarded as encoding each CTU, and different algorithms will consume different time when encoding the same CTU. If the algorithm with the shortest encoding time is used for each CTU to complete the encoding, the total encoding time will be shorter than that of an encoder using a single algorithm.
本发明实施例之一的发明构思为:CTU是编码的最小单元,编码一段视频可以看成是编码一个个的CTU,不同的算法编码相同的CTU时会消耗不同的时间。如果对于每一个CTU都使用编码时间最短的算法来完成编码,那么其总的编码时间将会小于使用单一算法的编码器。The inventive idea of one of the embodiments of the present invention is: CTU is the smallest unit of encoding, encoding a video can be regarded as encoding each CTU, different algorithms will consume different time when encoding the same CTU. If the algorithm with the shortest encoding time is used for each CTU to complete the encoding, the total encoding time will be shorter than that of an encoder using a single algorithm.
本发明实施例的有益效果Beneficial effects of the embodiments of the present invention
与单一的算法相比,本发明实施例之一提供的方法能够进一步降低编码器的计算复杂度,同时视频质量损失可以忽略。Compared with a single algorithm, the method provided by one of the embodiments of the present invention can further reduce the computational complexity of the encoder, and at the same time, the loss of video quality can be ignored.
在与原HEVC编码率失真相近的情况下,本发明实施例1的方法平均能降低61.7%的编码时间,同时质量(BRBD)只损失了1.91%。Under the condition that the rate distortion is close to that of the original HEVC encoding, the method of Embodiment 1 of the present invention can reduce the encoding time by 61.7% on average, and at the same time, the quality (BRBD) is only lost by 1.91%.
附图说明Description of drawings
图1是帧内预测33种角度预测方向示意图。FIG. 1 is a schematic diagram of 33 angle prediction directions for intra prediction.
图2是CTU四叉树递归划分结构示意图。FIG. 2 is a schematic diagram of a recursive division structure of a CTU quadtree.
图3是利用决策树预测的流程图。Fig. 3 is a flowchart of prediction using a decision tree.
具体实施方式detailed description
以下是本发明的具体实施例,并结合实施例对本发明的技术方案作进一步的描述,但本发明并不限于这些实施例。The following are specific examples of the present invention, and further describe the technical solutions of the present invention in conjunction with the examples, but the present invention is not limited to these examples.
实施例1Example 1
本例提供了基于决策树的高效率视频编码器帧内快速算法选择方法,步骤包括:This example provides a fast intra-frame algorithm selection method for high-efficiency video encoders based on decision trees. The steps include:
(1)分别用第一算法和第二算法对训练视频序列进行编码,将编码过程中编码每个CTU时的中间信息作为特征写入文本文件;(1) encode the training video sequence with the first algorithm and the second algorithm respectively, and write the intermediate information when encoding each CTU in the encoding process into a text file as a feature;
(2)对步骤(1)的文本文件进行标记,若则标记为0,否则标记为1,得到标记的训练样本,其中RDcost1是第一算法的总率失真代价,RDcost2是第二算法的总率失真代价,T1是第一算法编码所用时间,T2分别是第二算法编码所用时间;(2) mark the text file of step (1), if Then mark it as 0, otherwise mark it as 1, and get the marked training samples, where RDcost1 is the total rate-distortion cost of the first algorithm, RDcost2 is the total rate-distortion cost of the second algorithm, T1 is the encoding time of the first algorithm, T2 respectively is the encoding time of the second algorithm;
(3)用步骤(2)标记的训练样本对决策树模型进行训练后,通过训练后的决策树模型在CTU开始编码时进行预测,决定编码流程,如图3所示。(3) After training the decision tree model with the training samples marked in step (2), the trained decision tree model is used to predict when the CTU starts to encode, and determine the encoding process, as shown in Figure 3.
步骤(1)中,训练视频序列包括:Kimono、ParkScene、Cactus、BasketballDrill、BQMall、BasketballPass、BQSquare、FourPeople、KristenAndSara、vidyo1、vidyo3、BasketballDrillText和SlideEditing。In step (1), the training video sequence includes: Kimono, ParkScene, Cactus, BasketballDrill, BQMall, BasketballPass, BQSquare, FourPeople, KristenAndSara, vidyo1, vidyo3, BasketballDrillText and SlideEditing.
测试视频的帧数等于视频一秒钟包含的图片数,即视频的帧率。The number of frames of the test video is equal to the number of pictures contained in the video in one second, that is, the frame rate of the video.
编码器的配置文件为encoder_intra_main.cfg。The configuration file of the encoder is encoder_intra_main.cfg.
第一算法步骤包括:The first algorithm step includes:
(1s)快速模式决策:(1s) Fast mode decision:
(1s.a)按HEVC标准对{0,1,2,6,10,14,18,22,26,30,34}11个模式做粗模式搜索,根据绝对变换差值,从中选出六个最佳帧内模式候选,结合当前PU左边和上边PU的最佳帧内模式组成集合A;(1s.a) According to the HEVC standard, do a rough pattern search on {0, 1, 2, 6, 10, 14, 18, 22, 26, 30, 34} 11 patterns, and select six of them according to the absolute transformation difference The best intra-mode candidates are combined with the best intra-modes of the left side of the current PU and the top PU to form a set A;
(1s.b)对集合A中的所有元素的2距邻居模式进行测验,从中进一步选出最好的两个帧内模式候选,将这两个帧内模式的1距邻居模式连同PU的最可能模式(MPM)组成集合B;(1s.b) Test the 2-distance neighbor patterns of all elements in the set A, and further select the best two intra-mode candidates, and combine the 1-distance neighbor patterns of the two intra-modes together with the PU’s best The possible patterns (MPM) form the set B;
(1s.c)对集合B中所有模式做粗模式搜索;(1s.c) Do a coarse pattern search for all patterns in set B;
(1s.d)从所有做过粗模式搜索的模式中找出SATD代价最小的M个模式,进行后续操作,M的个数由CU的大小决定:当CU大小为{64×64,32×32,16×16,8×8,4×4}时,M的值分别为{3,3,3,8,8};(1s.d) Find M patterns with the smallest SATD cost from all the patterns that have been searched for coarse patterns, and perform subsequent operations. The number of M is determined by the size of the CU: when the size of the CU is {64×64, 32× 32, 16×16, 8×8, 4×4}, the values of M are {3, 3, 3, 8, 8};
(2s)基于率失真优化量化的模式筛选:(2s) Mode screening based on rate-distortion optimized quantization:
(2s.a)从快速模式决策中得到的M个模式中选出SATD代价最小两个{m1,m2}组成集合W;(2s.a) From the M modes obtained in the fast mode decision, select two {m1, m2} with the smallest SATD cost to form a set W;
(2s.b)将M个模式中剩余的模式mi依次进行以下操作:(2s.b) Perform the following operations on the remaining patterns mi among the M patterns in sequence:
若mi与W中的所有元素的距离都大于1,则将mi加入到W集合中;If the distance between m i and all elements in W is greater than 1, add m i to the W set;
若集合中的元素包括了这些模式{m1,m2,Intra_Planar,Intra_DC,MPM},则跳出步骤(2s.b);If the elements in the collection include these patterns {m 1 , m 2 , Intra_Planar, Intra_DC, MPM}, skip step (2s.b);
(2s.c)对W中的所有元素做精模式搜索;(2s.c) Do a refined pattern search for all elements in W;
(3s)基于率失真代价的终止划分:(3s) Termination division based on rate-distortion cost:
若当前子CU的率失真代价之和大于其中某一阈值时,则跳过后续子CU的编码过程,减小计算复杂度,具体判断标准为:If the sum of the rate-distortion costs of the current sub-CU is greater than one of the thresholds, the encoding process of the subsequent sub-CU is skipped to reduce the computational complexity. The specific judgment criteria are:
其中:K的取值范围为{1,2,3,4},βK对应K={1,2,3,4}时的取值为{1.5,1.2,1.1,1},表示4个子CU的哈达马代价之和,4个子CU还没完全编码完的情况下,其值用当前CU的率失真代价代替,为前K个子CU的哈达马代价之和,表示第i个子CU的率失真代价,为当前CU的率失真代价。Among them: the value range of K is {1, 2, 3, 4}, and the value of β K corresponding to K={1, 2, 3, 4} is {1.5, 1.2, 1.1, 1}, Indicates the sum of the Hadamard costs of the 4 sub-CUs. When the 4 sub-CUs have not been completely encoded, its value is replaced by the rate-distortion cost of the current CU. is the sum of Hadamard costs of the first K sub-CUs, Indicates the rate-distortion cost of the i-th sub-CU, is the rate-distortion cost of the current CU.
某帧内模式的N距邻居表示与该帧内模式值之差的绝对值等于N的帧内模式,即满足等式|m–mi|=N的帧内模式(其中mi表示该帧内模式的值,m的值表示该帧内模式N距邻居的帧模式值),模式Intra_Planar和Intra_DC没有N距邻居。The N-distance neighbors of a certain intra-frame mode represent the intra-frame mode whose absolute value of the difference from the intra-frame mode value is equal to N, that is, the intra-frame mode that satisfies the equation |m–mi|=N (where mi represents the intra-frame mode The value of m represents the frame mode value of the intra-frame mode N-distance neighbor), and the modes Intra_Planar and Intra_DC have no N-distance neighbors.
第二算法步骤包括:The second algorithm step includes:
(1s)CU深度预测:(1s) CU depth prediction:
若则当前编码块的深度范围设置为与前一帧相同位置编码块的深度范围相同,若当前编码块的深度范围设置为其中为前一帧相同位置编码块的最小编码深度,为前一帧相同位置编码块的最大编码深度,p为常数1.02;like Then the depth range of the current coding block is set to be the same as the depth range of the coding block at the same position in the previous frame, if The depth range of the current encoded block is set to in the minimum coded depth for the same-position coded block in the previous frame, is the maximum coded depth of the coded block at the same position in the previous frame, p is a constant 1.02;
(2s)基于率失真代价的终止划分:(2s) Termination division based on rate-distortion cost:
如果当前编码块深度为d的CU的率失真代价J满足:If the rate-distortion cost J of the CU with the depth of the currently coded block is d satisfies:
其中表示上一帧当前位置编码块总的率失真代价,d表示编码深度,为系数,值为上一帧当前位置编码块处于最大深度的CU数与编码块总CU数之比;in Indicates the total rate-distortion cost of the encoding block at the current position of the previous frame, d indicates the encoding depth, is a coefficient, and the value is the ratio of the number of CUs at the maximum depth of the coding block at the current position of the previous frame to the total number of CUs of the coding block;
(3s)快速候选模式筛选:(3s) Fast candidate pattern screening:
对粗搜索后得到的候选集合,其中元素按率失真代价从小到大排列,记为P={p(0),p(1),…,p(M-1)},在进行精搜索之前,对候选集中的元素进行以下操作:For the candidate set obtained after the rough search, the elements are arranged according to the rate-distortion cost from small to large, which is recorded as P={p(0),p(1),...,p(M-1)}, before the fine search , perform the following operations on the elements in the candidate set:
(3s.a)假设m为MPM中三个元素在P中最靠后的索引值,则集合P的大小可以缩小为P={p(0),p(1),…,p(m)},其中M–1>m;(3s.a) Assuming that m is the last index value of the three elements in P in the MPM, the size of the set P can be reduced to P={p(0),p(1),...,p(m) }, where M–1>m;
(3s.b)对于新的集合P中的所有元素,如果满足J(p(i))SATD>1.3J(p(0))SATD,则将元素p(i)从集合P中移除,其中J(p(i))SATD,J(p(0))SATD分别代表P中第i+1个、第0个元素的率失真代价。(3s.b) For all elements in the new set P, if J(p(i)) SATD > 1.3J(p(0)) SATD is satisfied, the element p(i) is removed from the set P, Among them, J(p(i)) SATD and J(p(0)) SATD represent the rate-distortion cost of the i+1th and 0th elements in P, respectively.
中间信息包括:CTU的最大编码深度,CTU的最小编码深度,最大编码深度CU的个数与CTU中所有CU的个数之比,CTU中总的率失真代价,上一帧当前位置CTU的总率失真代价,左边CTU最大编码深度与最小编码深度之差,左边CTU最大编码深度CU的个数与CTU中所有CU的个数之比,右边CTU最大编码深度与最小编码深度之差,右边CTU最大编码深度CU的个数与CTU中所有CU的个数之比,以及编码CTU所用时间。The intermediate information includes: the maximum coded depth of the CTU, the minimum coded depth of the CTU, the ratio of the number of the maximum coded depth CU to the number of all CUs in the CTU, the total rate-distortion cost in the CTU, and the total Rate-distortion cost, the difference between the maximum coded depth and the minimum coded depth of the left CTU, the ratio of the number of CUs with the maximum coded depth of the left CTU to the number of all CUs in the CTU, the difference between the maximum coded depth and the minimum coded depth of the right CTU, and the right CTU The ratio of the number of CUs with the maximum coded depth to the number of all CUs in the CTU, and the time spent encoding the CTU.
步骤(3)中,决定编码流程的步骤包括:In step (3), the steps of determining the encoding process include:
(a)在CTU编码开始前,判断编码帧是否为第一帧,若是,则用第一算法对当前CTU进行编码,若否,则进行步骤(b);(a) before the CTU encoding starts, judge whether the encoded frame is the first frame, if so, then use the first algorithm to encode the current CTU, if not, then proceed to step (b);
(b)将前一帧当前位置CTU编码完成时保存下来的特征输入到决策树进行预测,若预测结果为0,用第一算法对CTU进行编码,若结果为1,用第二算法对CTU进行编码;(b) Input the features saved when the current position CTU encoding of the previous frame is completed to the decision tree for prediction. If the prediction result is 0, use the first algorithm to encode the CTU; if the result is 1, use the second algorithm to encode the CTU to encode;
(c)收集步骤(b)编码过程中的中间量并编码下一个CTU;(c) collecting intermediate quantities in the encoding process of step (b) and encoding the next CTU;
(d)重复步骤(b)和(c)至所有CTU编码完成。(d) Repeat steps (b) and (c) until all CTUs are coded.
步骤(1)中,用第一算法对当前CTU进行编码,编码完成后收集编码过程中的中间量并保存下来作为以后决策时的特征。In step (1), the first algorithm is used to encode the current CTU, and after the encoding is completed, the intermediate quantities in the encoding process are collected and saved as features for future decision-making.
实施例2Example 2
本例采用实施例1的方法,在Win10操作系统下,编译环境为Visual Studio 2017,通过编码同一视频,与原HEVC编码结果进行比较,HEVC参考软件HM,版本号10.0。结果如表1所示。This example adopts the method of Example 1. Under the Win10 operating system, the compilation environment is Visual Studio 2017. By encoding the same video, compare it with the original HEVC encoding result. HEVC reference software HM, version number 10.0. The results are shown in Table 1.
表1实施例1的方法与原HEVC编码结果性能比较The performance comparison between the method of Table 1 Embodiment 1 and the original HEVC encoding result
从表1可知,与原HEVC编码器相比,本发明实施例1的方法平均能降低61.7%的编码时间,同时质量(BRBD)只损失了1.91%。It can be seen from Table 1 that compared with the original HEVC encoder, the method of Embodiment 1 of the present invention can reduce the encoding time by 61.7% on average, and at the same time, the quality (BRBD) is only lost by 1.91%.
Claims (1)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910281249.7A CN110139098B (en) | 2019-04-09 | 2019-04-09 | Intra-frame fast algorithm selection method for high-efficiency video encoder based on decision tree |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910281249.7A CN110139098B (en) | 2019-04-09 | 2019-04-09 | Intra-frame fast algorithm selection method for high-efficiency video encoder based on decision tree |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110139098A CN110139098A (en) | 2019-08-16 |
CN110139098B true CN110139098B (en) | 2023-01-06 |
Family
ID=67569432
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910281249.7A Active CN110139098B (en) | 2019-04-09 | 2019-04-09 | Intra-frame fast algorithm selection method for high-efficiency video encoder based on decision tree |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110139098B (en) |
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111327909B (en) * | 2020-03-06 | 2022-10-18 | 郑州轻工业大学 | A fast depth coding method for 3D-HEVC |
CN115334308B (en) * | 2022-10-14 | 2022-12-27 | 北京大学深圳研究生院 | Learning model-oriented coding decision processing method, device and equipment |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103338371A (en) * | 2013-06-07 | 2013-10-02 | 东华理工大学 | Fast and efficient video coding intra mode determining method |
CN106131547A (en) * | 2016-07-12 | 2016-11-16 | 北京大学深圳研究生院 | The high-speed decision method of intra prediction mode in Video coding |
CN107071418A (en) * | 2017-05-05 | 2017-08-18 | 上海应用技术大学 | A kind of quick division methods of HEVC intraframe coding units based on decision tree |
Family Cites Families (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8467448B2 (en) * | 2006-11-15 | 2013-06-18 | Motorola Mobility Llc | Apparatus and method for fast intra/inter macro-block mode decision for video encoding |
US9426473B2 (en) * | 2013-02-01 | 2016-08-23 | Qualcomm Incorporated | Mode decision simplification for intra prediction |
US10142626B2 (en) * | 2014-10-31 | 2018-11-27 | Ecole De Technologie Superieure | Method and system for fast mode decision for high efficiency video coding |
-
2019
- 2019-04-09 CN CN201910281249.7A patent/CN110139098B/en active Active
Patent Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103338371A (en) * | 2013-06-07 | 2013-10-02 | 东华理工大学 | Fast and efficient video coding intra mode determining method |
CN106131547A (en) * | 2016-07-12 | 2016-11-16 | 北京大学深圳研究生院 | The high-speed decision method of intra prediction mode in Video coding |
CN107071418A (en) * | 2017-05-05 | 2017-08-18 | 上海应用技术大学 | A kind of quick division methods of HEVC intraframe coding units based on decision tree |
Non-Patent Citations (1)
Title |
---|
低复杂度的HEVC帧内编码模式决策算法;朱威等;《小型微型计算机系统》;20171215;全文 * |
Also Published As
Publication number | Publication date |
---|---|
CN110139098A (en) | 2019-08-16 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN115037932A (en) | Image encoding/decoding method and apparatus, and recording medium storing bit stream | |
CN101584218B (en) | Method and apparatus for encoding and decoding based on intra prediction | |
CN103188496B (en) | Based on the method for coding quick movement estimation video of motion vector distribution prediction | |
CN107105237A (en) | Video decoding apparatus | |
CN112738511B (en) | A fast mode decision-making method and device combined with video analysis | |
CN103533355B (en) | A kind of HEVC fast encoding method | |
CN117041564A (en) | Video encoding/decoding method, apparatus, and recording medium storing bit stream | |
CN107566846A (en) | Video coding skip mode decision-making technique, device, equipment and storage medium | |
CN113875235A (en) | Image encoding/decoding method and apparatus and recording medium storing bit stream | |
JP6212890B2 (en) | Moving picture coding apparatus, moving picture coding method, and moving picture coding program | |
WO2022166370A1 (en) | Video encoding and decoding method and apparatus, computer program product, computer-readable storage medium, and electronic device | |
CN110139098B (en) | Intra-frame fast algorithm selection method for high-efficiency video encoder based on decision tree | |
CN102075757B (en) | Video foreground object coding method by taking boundary detection as motion estimation reference | |
CN113574868A (en) | Image encoding/decoding method and apparatus, and recording medium storing bitstream | |
WO2021168817A1 (en) | Video processing method and apparatus | |
CN111885382A (en) | Intra-frame chroma prediction mode fast selection | |
CN108322740B (en) | Encoding method with controllable encoding complexity | |
CN110139099B (en) | Interframe prediction mode selection method based on precoding and coding SATD value weighting | |
CN111683245B (en) | CU Partition Decision Based on Texture Similarity | |
CN110113601B (en) | HEVC intra-frame rapid algorithm selection method based on video picture texture features | |
CN105791863B (en) | 3D-HEVC depth map intra-frame predictive encoding method based on layer | |
CN110868593A (en) | Fast Division of Video CU Based on Regional Decision Tree | |
CN108322743B (en) | A fast method for intra-frame selection of inseparable quadratic transformation modes based on mode-dependent properties | |
TWI861393B (en) | Methods and systems for combined lossless and lossy coding | |
JP6904156B2 (en) | Video coding device, video coding method, and video coding program |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
GR01 | Patent grant |