CN118590650A - A decoding and encoding method, device and equipment thereof - Google Patents
- Publication number
- CN118590650A (application CN202310217157.9A)
- Authority
- CN
- China
- Prior art keywords
- feature
- image block
- entropy
- current image
- target
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/102—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
- H04N19/13—Adaptive entropy coding, e.g. adaptive variable length coding [AVLC] or context adaptive binary arithmetic coding [CABAC]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T9/00—Image coding
- G06T9/002—Image coding using neural networks
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/134—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
- H04N19/136—Incoming video signal characteristics or properties
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/169—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
- H04N19/17—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object
- H04N19/176—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a block, e.g. a macroblock
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/44—Decoders specially adapted therefor, e.g. video decoders which are asymmetric with respect to the encoder
Abstract
Description
Technical Field
The present application relates to the technical field of encoding and decoding, and in particular to a decoding method, an encoding method, and apparatus and devices therefor.
Background Art
To save storage space, video images are encoded before transmission. A complete video encoding pipeline may include prediction, transformation, quantization, entropy coding, filtering, and other processes. The prediction process may include intra-frame prediction and inter-frame prediction. Inter-frame prediction exploits temporal correlation in the video, using pixels of adjacent already-encoded images to predict the current pixel, thereby effectively removing temporal redundancy. Intra-frame prediction exploits spatial correlation, using pixels of already-encoded blocks of the current frame to predict the current pixel, thereby removing spatial redundancy.
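As a toy illustration of the intra-frame prediction described above, a current block can be predicted from its already-reconstructed neighbours, so that only the residual is passed on to the transform, quantization, and entropy-coding stages. The DC-style predictor and all names below are illustrative, not taken from this application:

```python
import numpy as np

def dc_intra_predict(left: np.ndarray, top: np.ndarray, shape) -> np.ndarray:
    """DC-style intra prediction: predict every pixel of the current block
    as the mean of the already-encoded neighbouring pixels."""
    dc = (left.sum() + top.sum()) / (left.size + top.size)
    return np.full(shape, dc)

# Toy 4x4 block together with its reconstructed left column and top row.
block = np.array([[10.0, 11.0, 12.0, 13.0]] * 4)
left = np.full(4, 10.0)
top = np.array([10.0, 11.0, 12.0, 13.0])

pred = dc_intra_predict(left, top, block.shape)
residual = block - pred  # only this residual is transformed/quantized/coded
```

Real codecs choose among many prediction modes per block; the point here is only that the residual carries far less energy than the raw pixels.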
With the rapid development of deep learning, deep learning has achieved success on many high-level computer vision problems, such as image classification and object detection, and has gradually been applied in the field of encoding and decoding; that is, neural networks can be used to encode and decode images. Although neural-network-based encoding and decoding methods show great performance potential, they still suffer from problems such as poor encoding performance, poor decoding performance, and high complexity.
Summary of the Invention
In view of this, the present application provides a decoding method, an encoding method, and apparatus and devices therefor, to improve encoding performance and decoding performance.
The present application provides a decoding method, applied to a decoding end, the method comprising:
obtaining, based on a first bitstream corresponding to a current image block, a standard deviation corresponding to the current image block through a probabilistic hyperparameter decoding network; determining a probability distribution model based on the standard deviation, and decoding a second bitstream corresponding to the current image block based on the probability distribution model to obtain decoded image features; determining a residual feature corresponding to the current image block based on the decoded image features; and determining an initial image feature corresponding to the current image block based on the residual feature;
obtaining an entropy gradient offset step corresponding to the current image block;
performing feature enhancement on the initial image feature based on the entropy gradient offset step to obtain a target image feature; and
obtaining, based on the target image feature, a reconstructed image block corresponding to the current image block through a synthesis transformation network.
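Assuming a Gaussian entropy model parameterized by the decoded standard deviation, the decoder-side feature enhancement can be sketched as shifting the initial features along the gradient of their self-information by the signalled step length. This is a minimal sketch under that assumption; the function names and the exact enhancement form are illustrative, not this application's definition:

```python
import numpy as np

def gaussian_entropy_gradient(y_hat, mu, sigma):
    # The self-information of a Gaussian sample is (y - mu)^2 / (2 sigma^2)
    # plus a constant, so its gradient with respect to y is (y - mu) / sigma^2.
    return (y_hat - mu) / (sigma ** 2)

def enhance_features(y_hat, mu, sigma, step):
    # Shift the initial (decoded) features against the entropy gradient by
    # the step length parsed from the header information bitstream.
    return y_hat - step * gaussian_entropy_gradient(y_hat, mu, sigma)

y_hat = np.array([1.0, -0.5, 2.0])  # initial image features
mu = np.zeros(3)                    # model mean
sigma = np.ones(3)                  # decoded standard deviation
enhanced = enhance_features(y_hat, mu, sigma, step=0.5)
```

The enhanced features, not the decoded ones, are then fed to the synthesis transformation network.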
The present application provides an encoding method, applied to an encoding end, the method comprising:
obtaining, based on a first bitstream corresponding to a current image block, a standard deviation corresponding to the current image block through a probabilistic hyperparameter decoding network; determining a probability distribution model based on the standard deviation, and decoding a second bitstream corresponding to the current image block based on the probability distribution model to obtain decoded image features; determining a residual feature corresponding to the current image block based on the decoded image features; and determining an initial image feature corresponding to the current image block based on the residual feature;
for each entropy gradient offset step, performing feature enhancement on the initial image feature based on the entropy gradient offset step to obtain a target image feature, and obtaining, based on the target image feature, a reconstructed image block corresponding to the entropy gradient offset step through a synthesis transformation network; and
selecting, based on the current image block and the reconstructed image block corresponding to each entropy gradient offset step, the entropy gradient offset step corresponding to the current image block from all entropy gradient offset steps, and encoding the selected entropy gradient offset step in a header information bitstream corresponding to the current image block.
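The encoder-side selection above can be sketched as a brute-force search: reconstruct the block with every candidate step and keep the step with the lowest distortion. MSE is used here as a stand-in for whatever criterion an implementation chooses, and all names are illustrative:

```python
import numpy as np

def select_offset_step(block, reconstruct, candidate_steps):
    """Try each candidate entropy gradient offset step and return the one
    whose reconstruction is closest to the original block."""
    best_step, best_cost = None, float("inf")
    for step in candidate_steps:
        recon = reconstruct(step)                  # enhancement + synthesis
        cost = float(np.mean((block - recon) ** 2))
        if cost < best_cost:
            best_step, best_cost = step, cost
    return best_step                               # written to the header bitstream

# Toy check: with this stand-in reconstructor, the error vanishes at step 1.
block = np.zeros((2, 2))
best = select_offset_step(block, lambda s: np.full((2, 2), s - 1.0),
                          [0.0, 1.0, 2.0])
```

Only the chosen step is signalled; the decoder repeats the enhancement with it rather than re-running the search.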
The present application provides a decoding apparatus, applied to a decoding end, the apparatus comprising:
a determination module, configured to obtain, based on a first bitstream corresponding to a current image block, a standard deviation corresponding to the current image block through a probabilistic hyperparameter decoding network; determine a probability distribution model based on the standard deviation, and decode a second bitstream corresponding to the current image block based on the probability distribution model to obtain decoded image features; determine a residual feature corresponding to the current image block based on the decoded image features; and determine an initial image feature corresponding to the current image block based on the residual feature;
an acquisition module, configured to obtain an entropy gradient offset step corresponding to the current image block; and
a processing module, configured to perform feature enhancement on the initial image feature based on the entropy gradient offset step to obtain a target image feature;
wherein the acquisition module is further configured to obtain, based on the target image feature, a reconstructed image block corresponding to the current image block through a synthesis transformation network.
The present application provides an encoding apparatus, applied to an encoding end, the apparatus comprising:
a determination module, configured to obtain, based on a first bitstream corresponding to a current image block, a standard deviation corresponding to the current image block through a probabilistic hyperparameter decoding network; determine a probability distribution model based on the standard deviation, and decode a second bitstream corresponding to the current image block based on the probability distribution model to obtain decoded image features; determine a residual feature corresponding to the current image block based on the decoded image features; and determine an initial image feature corresponding to the current image block based on the residual feature;
a processing module, configured to, for each entropy gradient offset step, perform feature enhancement on the initial image feature based on the entropy gradient offset step to obtain a target image feature, and obtain, based on the target image feature, a reconstructed image block corresponding to the entropy gradient offset step through a synthesis transformation network;
a selection module, configured to select, based on the current image block and the reconstructed image block corresponding to each entropy gradient offset step, the entropy gradient offset step corresponding to the current image block from all entropy gradient offset steps; and
an encoding module, configured to encode the entropy gradient offset step corresponding to the current image block in a header information bitstream corresponding to the current image block.
The present application provides a decoding-end device, comprising a processor and a machine-readable storage medium, the machine-readable storage medium storing machine-executable instructions executable by the processor; the processor is configured to execute the machine-executable instructions to implement the above decoding method.
The present application provides an encoding-end device, comprising a processor and a machine-readable storage medium, the machine-readable storage medium storing machine-executable instructions executable by the processor; the processor is configured to execute the machine-executable instructions to implement the above encoding method.
The present application provides an electronic device, comprising a processor and a machine-readable storage medium, the machine-readable storage medium storing machine-executable instructions executable by the processor; the processor is configured to execute the machine-executable instructions to implement the above decoding method, or to implement the above encoding method.
The present application provides a machine-readable storage medium on which computer instructions are stored; when the computer instructions are executed by a processor, the above decoding method or the above encoding method is implemented.
As can be seen from the above technical solutions, in the embodiments of the present application, the initial image feature corresponding to the current image block and the entropy gradient offset step corresponding to the current image block can be obtained, the target image feature is determined based on the initial image feature and the entropy gradient offset step, and the reconstructed image block corresponding to the current image block is obtained based on the target image feature. An end-to-end video image compression method is thus proposed, in which the encoding and decoding of video images are implemented by neural networks, and encoding and decoding efficiency is improved by incorporating the entropy gradient offset step. By combining the network structure design with the header information bitstream, the neural network effectively guarantees the quality of the reconstructed image block while maintaining low complexity, thereby improving encoding and decoding performance and reducing complexity. The entropy gradient is used to enhance the quality of the features: the encoding end does not directly change the feature information, but encodes the entropy gradient offset step into the header information bitstream, and the decoding end enhances the features using the entropy gradient offset step, which improves encoding performance and the quality of the reconstructed image. Adding a regularization term improves the structural stability of the feature enhancement.
Brief Description of the Drawings
FIG. 1 is a schematic diagram of a three-dimensional feature matrix in an embodiment of the present application;
FIG. 2 is a flowchart of a decoding method in an embodiment of the present application;
FIG. 3 is a flowchart of an encoding method in an embodiment of the present application;
FIG. 4 is a schematic diagram of the processing at the encoding end in an embodiment of the present application;
FIG. 5 is a schematic diagram of the processing at the decoding end in an embodiment of the present application;
FIG. 6A is a schematic flowchart of a decoding method in an embodiment of the present application;
FIG. 6B is a schematic diagram of an end-to-end encoding and decoding framework in an embodiment of the present application;
FIG. 6C is a schematic diagram of the relationship between the self-information of a feature point and the quantization residual of the main information in an embodiment of the present application;
FIG. 6D and FIG. 6E are schematic flowcharts of an encoding method in an embodiment of the present application;
FIG. 6F shows the processing flow of adding an LEGS module to the JPEG-AI decoding end in an embodiment of the present application;
FIG. 6G is a flowchart of the reconstruction of JPEG-AI components in an embodiment of the present application;
FIG. 7A is a flowchart of a decoding method in an embodiment of the present application;
FIG. 7B is a flowchart of an encoding method in an embodiment of the present application;
FIG. 8A is a hardware structure diagram of a decoding-end device in an embodiment of the present application; and
FIG. 8B is a hardware structure diagram of an encoding-end device in an embodiment of the present application.
Detailed Description
The terms used in the embodiments of the present application are for the purpose of describing particular embodiments only and are not intended to limit the present application. The singular forms "a", "said", and "the" used in the embodiments of the present application and the claims are also intended to include the plural forms, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used herein refers to any or all possible combinations of one or more of the associated listed items. It should be understood that although the terms first, second, third, etc. may be used in the embodiments of the present application to describe various kinds of information, such information should not be limited to these terms; these terms are only used to distinguish information of the same type from one another. For example, without departing from the scope of the embodiments of the present application, first information may also be referred to as second information, and similarly, second information may also be referred to as first information, depending on the context. In addition, the word "if" as used herein may be interpreted as "when", "while", or "in response to determining".
The embodiments of the present application propose a decoding method and an encoding method, which may involve the following concepts:
JPEG (Joint Photographic Experts Group): JPEG is a standard for the compression of continuous-tone still images; the file extension may be .jpg or .jpeg, and JPEG is a commonly used image file format. JPEG combines predictive coding (DPCM), the discrete cosine transform (DCT), and entropy coding to remove redundant image and color data. It is a lossy compression format that can compress an image into a very small storage space, which causes some loss of image data. In particular, an excessively high compression ratio degrades the quality of the image recovered after decompression; if high-quality images are desired, an excessively high compression ratio should not be used.
JPEG-AI (Joint Photographic Experts Group - Artificial Intelligence): the scope of JPEG-AI is to create a learning-based image coding standard that provides a single-stream, compact compressed-domain representation, significantly improves compression efficiency over commonly used image coding standards at the same subjective quality, and effectively improves performance on image-processing and computer-vision tasks. JPEG-AI targets a wide range of applications, such as cloud storage, visual data management, autonomous vehicles and devices, image acquisition, storage and management, real-time management of visual data, and media distribution. The goal of JPEG-AI is to design a codec solution that significantly improves compression efficiency at the same subjective quality and provides efficient compressed-domain processing for machine-learning-based image-processing and computer-vision tasks. JPEG-AI requires hardware- and software-friendly encoding and decoding, support for 8-bit and 10-bit depths, and efficient encoding and progressive decoding of images containing text and graphics.
Entropy coding: entropy coding is coding that, in accordance with the entropy principle, loses no information during the encoding process; information entropy is the average amount of information of a source (a measure of uncertainty). Entropy coding methods may include, but are not limited to, Shannon coding, Huffman coding, and arithmetic coding.
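For intuition, the source entropy computed below is the lower bound, in bits per symbol, that lossless codes such as Huffman or arithmetic coding approach:

```python
import math

def entropy_bits(probs):
    """Shannon entropy H = -sum(p * log2(p)): the minimum average number of
    bits per symbol achievable by any lossless entropy code."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# A 3-symbol source with probabilities 1/2, 1/4, 1/4 needs at least
# 0.5*1 + 0.25*2 + 0.25*2 = 1.5 bits per symbol on average.
h = entropy_bits([0.5, 0.25, 0.25])
```

For this particular source a Huffman code (codeword lengths 1, 2, 2) reaches the bound exactly; in general, arithmetic coding gets arbitrarily close.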
Neural Network (NN): a neural network here refers to an artificial neural network, a computational model composed of a large number of interconnected nodes (also called neurons). In a neural network, the neuron processing units can represent different objects, such as features, letters, concepts, or meaningful abstract patterns. The processing units of a neural network fall into three categories: input units, output units, and hidden units. Input units receive signals and data from the outside world; output units output the processing results; hidden units lie between the input and output units and cannot be observed from outside the system. The connection weights between neurons reflect the connection strength between units, and the representation and processing of information are embodied in the connection relationships among the processing units. A neural network is a non-programmed, brain-like style of information processing; its essence is to obtain a parallel, distributed information-processing capability through the transformations and dynamic behavior of the network, imitating the information-processing functions of the human nervous system to varying degrees and at varying levels. In the field of video processing, commonly used neural networks include, but are not limited to, convolutional neural networks (CNN), recurrent neural networks (RNN), and fully connected networks.
Convolutional Neural Network (CNN): a convolutional neural network is a feedforward neural network and one of the most representative network structures in deep learning. The artificial neurons of a convolutional neural network respond to surrounding units within a limited receptive field, which gives excellent performance on large-scale image processing. The basic structure of a convolutional neural network includes two kinds of layers. The first is the feature-extraction layer (also called the convolution layer), in which the input of each neuron is connected to a local receptive field of the previous layer to extract local features; once a local feature has been extracted, its positional relationship to other features is also determined. The second is the feature-mapping layer (also called the activation layer): each computational layer of the network consists of multiple feature maps, each feature map is a plane, and all neurons on the plane share equal weights. The feature-mapping structure may use the Sigmoid, ReLU, Leaky-ReLU, PReLU, GDN, or similar functions as the activation function of the convolutional network. In addition, because the neurons on one mapping plane share weights, the number of free parameters of the network is reduced.
For example, one advantage of convolutional neural networks over conventional image processing algorithms is that they avoid the complex image pre-processing stage (hand-crafted feature extraction, etc.): the raw image can be fed in directly for end-to-end learning. One advantage of convolutional neural networks over ordinary neural networks is that ordinary networks are fully connected, that is, every neuron from the input layer to the hidden layer is connected, which leads to an enormous number of parameters and makes training time-consuming or even infeasible; convolutional neural networks avoid this difficulty through local connections, weight sharing, and similar techniques.
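The parameter-count gap claimed above can be made concrete with a rough back-of-the-envelope comparison (the layer sizes below are illustrative, not from this application):

```python
# Fully connected layer vs. a weight-shared single-channel convolution on the
# same 32x32 input. The FC layer connects every input pixel to every output
# unit; the convolution reuses one 3x3 kernel everywhere.
H, W = 32, 32
fc_params = (H * W) * (H * W)   # 1024 inputs x 1024 outputs
conv_params = 3 * 3             # one shared 3x3 kernel

assert fc_params == 1048576
assert conv_params == 9
```

Weight sharing shrinks the free-parameter count by five orders of magnitude in this toy setup, which is exactly why local connections make training tractable.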
Deconvolution layer: A deconvolution layer, also called a transposed convolution layer, works much like a convolution layer; the main difference is that, through padding, a deconvolution layer can make its output larger than its input (the output may also stay the same size). If the stride is 1, the output size equals the input size; if the stride is N, the width of the output feature is N times the width of the input feature, and the height of the output feature is N times the height of the input feature.
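The size relation described above can be sketched with the commonly used transposed-convolution output formula; the kernel/padding combinations below are illustrative assumptions, not parameters from this application:

```python
# Hypothetical helper: spatial output size of a transposed convolution, using
# the standard relation  out = (in - 1) * stride - 2 * padding + kernel.
def deconv_out_size(in_size: int, kernel: int, stride: int, padding: int) -> int:
    return (in_size - 1) * stride - 2 * padding + kernel

# stride 1 with matching padding keeps the size unchanged
assert deconv_out_size(16, kernel=3, stride=1, padding=1) == 16
# stride 2 with kernel 4, padding 1 doubles the size (a common upsampling setup)
assert deconv_out_size(16, kernel=4, stride=2, padding=1) == 32
```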
Generalization Ability: Generalization ability refers to the ability of a machine learning algorithm to adapt to previously unseen samples. The purpose of learning is to capture the regularities hidden behind the data, so that the trained network can also produce appropriate outputs for data outside the training set that follow the same regularities; this ability is called generalization.
Feature: A feature in this application is a three-dimensional feature matrix of size C*W*H. See Figure 1 for a schematic diagram of a three-dimensional feature matrix: C denotes the number of channels, H denotes the feature height, and W denotes the feature width. A three-dimensional feature matrix can be either the input or the output of a neural network.
Rate-distortion principle (Rate-Distortion Optimization): There are two main indicators for evaluating coding efficiency: bit rate and PSNR (Peak Signal to Noise Ratio). The smaller the bitstream, the higher the compression ratio; the higher the PSNR, the better the quality of the reconstructed image. During mode selection, the decision formula is essentially a joint evaluation of the two. For example, the cost of a mode is J(mode) = D + λ*R, where D denotes distortion, usually measured with the SSE metric, i.e., the sum of squared differences between the reconstructed block and the source; to reduce computation cost, the SAD metric, i.e., the sum of absolute differences between the reconstructed block and the source, may also be used; λ is the Lagrange multiplier; and R is the actual number of bits needed to encode the image block in this mode, including the total bits required for the coding mode information, motion information, residual, and so on. If the rate-distortion principle is used to compare and decide between coding modes during mode selection, the best coding performance is usually guaranteed.
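The cost formula above can be sketched directly; the block values, bit counts, and λ below are toy numbers chosen for illustration:

```python
# Rate-distortion cost J(mode) = D + lambda * R, with D measured as SSE
# (sum of squared differences) and SAD shown as the cheaper alternative.
def sse(src, rec):
    return sum((s - r) ** 2 for s, r in zip(src, rec))

def sad(src, rec):
    return sum(abs(s - r) for s, r in zip(src, rec))

def rd_cost(src, rec, bits, lam):
    return sse(src, rec) + lam * bits

# Two hypothetical modes for the same 4-sample block:
src = [10, 20, 30, 40]
mode_a = {"rec": [10, 21, 29, 40], "bits": 32}  # closer match, more bits
mode_b = {"rec": [12, 18, 33, 38], "bits": 8}   # coarser match, fewer bits
lam = 0.5
costs = {name: rd_cost(src, m["rec"], m["bits"], lam)
         for name, m in (("a", mode_a), ("b", mode_b))}
best = min(costs, key=costs.get)  # "a": cost 2 + 16 = 18 beats 21 + 4 = 25
```

With a larger λ the cheaper mode "b" would win instead, which is exactly the bit-rate/quality trade-off the Lagrange multiplier controls.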
A great many coding tools have been proposed for the various modules of the encoder, and each tool often has multiple modes. For different video sequences, the tools that achieve the best coding performance often differ. Therefore, during encoding, RDO (Rate-Distortion Optimization) is usually used to compare the coding performance of different tools or modes and select the best one. Once the optimal tool or mode has been determined, the decision is conveyed by encoding flag information into the bitstream. Although this approach brings higher encoding complexity, it can adaptively select the optimal combination of modes for different content and thus obtain the best coding performance. The decoder obtains the relevant mode information by directly parsing the flags, so the impact on its complexity is small.
A typical end-to-end image coding framework consists of a main feature information part and a hyperprior side information part. The main feature information part comprises the analysis network, quantization, Gaussian (normal) entropy encoding, Gaussian entropy decoding, and the synthesis network; the hyperprior side information part comprises the hyperprior analysis network, quantization, factorized entropy encoding, factorized entropy decoding, and the hyperprior synthesis network. The image component x is compressed/encoded and reconstructed/restored by the analysis network and synthesis network of the main feature part, respectively; the hyperprior side information part is mainly used to model the probability of the main feature information and to guide its entropy encoding and decoding. In this general end-to-end framework, the entropy gradient of the main feature information is not fully exploited, so the reconstruction quality is not fully improved.
In view of the above findings, this embodiment exploits the characteristics of the end-to-end image coding framework as follows: the decoder uses the entropy gradient information of the quantization residual of the main information to enhance the quality of the features, thereby improving the quality of the reconstructed image, and a regularization term is added to improve the structural stability of the feature enhancement. The encoder does not modify the main information directly; instead, it encodes the step size into the header information bitstream (that is, the entropy gradient offset step is encoded into the header bitstream), and the decoder enhances the main information features using this entropy gradient offset step.
The decoding method and the encoding method in the embodiments of the present application are described in detail below with reference to several specific embodiments.
Embodiment 1: An embodiment of the present application proposes a decoding method. FIG. 2 is a flow chart of the decoding method. The method can be applied to a decoding end (also called a video decoder) and may include the following steps:
Step 201: obtain the initial image feature corresponding to the current image block through a first neural network, and obtain the entropy gradient offset step corresponding to the current image block. The first neural network may be a probabilistic hyperparameter decoding network comprising at least one convolutional layer.
In a possible implementation, obtaining the initial image feature corresponding to the current image block through the first neural network may include, but is not limited to: obtaining, based on a first bitstream corresponding to the current image block, the standard deviation corresponding to the current image block through the probabilistic hyperparameter decoding network; determining a probability distribution model based on the standard deviation, and decoding a second bitstream corresponding to the current image block based on the probability distribution model to obtain decoded image features; determining the residual feature corresponding to the current image block based on the decoded image features; and determining the initial image feature corresponding to the current image block based on the residual feature. For example, obtaining the standard deviation corresponding to the current image block through the probabilistic hyperparameter decoding network based on the first bitstream may include, but is not limited to: decoding the first bitstream based on a factorized probability model to obtain decoded image features, and determining a coefficient hyperparameter feature based on the decoded image features; then feeding the coefficient hyperparameter feature into the probabilistic hyperparameter decoding network, which applies an inverse transform to it to obtain the standard deviation corresponding to the current image block.
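One common way a decoded standard deviation parameterizes the "probability distribution model" in such frameworks is a zero-mean Gaussian discretized to integer symbols. The sketch below illustrates that idea only; the application does not specify this exact model:

```python
import math

# Discretized zero-mean Gaussian: P(symbol = k) = CDF(k + 0.5) - CDF(k - 0.5),
# with the CDF parameterized by the standard deviation decoded from the
# hyperprior bitstream. (Illustrative sketch, not the patent's exact model.)
def gaussian_cdf(x: float, sigma: float) -> float:
    return 0.5 * (1.0 + math.erf(x / (sigma * math.sqrt(2.0))))

def symbol_pmf(symbols, sigma):
    return {k: gaussian_cdf(k + 0.5, sigma) - gaussian_cdf(k - 0.5, sigma)
            for k in symbols}

pmf = symbol_pmf(range(-3, 4), sigma=1.0)
assert abs(pmf[1] - pmf[-1]) < 1e-12   # symmetric around zero
assert pmf[0] == max(pmf.values())     # peaked at zero
```

An arithmetic or range coder would consume such a PMF per symbol when decoding the second bitstream.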
In a possible implementation, obtaining the entropy gradient offset step corresponding to the current image block may include, but is not limited to: decoding the header information bitstream corresponding to the current image block to obtain the entropy gradient offset step; or decoding the header information bitstream to obtain indication information corresponding to the entropy gradient offset step and determining the entropy gradient offset step based on that indication information. For example, determining the entropy gradient offset step based on the indication information may include, but is not limited to: if a configured entropy gradient offset step table includes at least one entropy gradient offset step, and the indication information indicates an index value of the entropy gradient offset step, determining the entropy gradient offset step corresponding to the current image block from the table based on that index value.
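The index-based variant can be sketched as a simple table lookup; the table values here are hypothetical placeholders, since the application does not enumerate them:

```python
# Hypothetical decoder-side lookup: the header carries an index into a
# pre-configured entropy-gradient offset step table.
STEP_TABLE = [0.0, 0.05, 0.1, 0.2]   # illustrative values, not from the patent

def step_from_index(index: int) -> float:
    if not 0 <= index < len(STEP_TABLE):
        raise ValueError("index out of range of the configured step table")
    return STEP_TABLE[index]

assert step_from_index(2) == 0.1
```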
Step 202: perform feature enhancement on the initial image feature based on the entropy gradient offset step to obtain the target image feature.
For example, performing feature enhancement on the initial image feature based on the entropy gradient offset step to obtain the target image feature may include: performing feature enhancement on the initial image feature based on the entropy gradient offset step and the entropy gradient feature information corresponding to the current image block to obtain the target image feature. Here, the entropy gradient offset step represents the offset step of the entropy gradient and serves as the correction step used when enhancing the initial image feature.
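A minimal sketch of this enhancement step, assuming an additive update of each feature value along the entropy gradient (the additive form is an assumption for illustration; the application only says the step corrects the feature):

```python
# Nudge each feature value by the entropy gradient scaled by the decoded step.
def enhance(features, gradients, step):
    # target = initial - step * gradient (hypothetical update rule)
    return [f - step * g for f, g in zip(features, gradients)]

assert enhance([1.0, 2.0], [0.5, -0.5], step=0.2) == [0.9, 2.1]
```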
In a possible implementation, the entropy gradient feature information includes an entropy gradient offset value, and the entropy gradient offset value corresponding to the current image block may be determined as follows (without limitation): determine the entropy gradient offset value based on the residual feature. For example, determining the entropy gradient offset value based on the residual feature may include, but is not limited to: determining a residual increment feature corresponding to the residual feature; computing the entropy of the residual feature and the entropy of the residual increment feature; determining the entropy gradient of the residual feature based on these two entropies; and determining the entropy gradient offset value based on the entropy gradient of the residual feature and a regularization term.
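The entropy-difference procedure above amounts to a finite-difference gradient. The sketch below is illustrative: the entropy of a residual value is approximated by -log2 of a Gaussian density with the decoded standard deviation, and the increment delta and the regularization form are assumptions, not taken from the application:

```python
import math

def neg_log2_density(x: float, sigma: float) -> float:
    # -log2 of a zero-mean Gaussian density, used as a stand-in entropy measure
    return 0.5 * math.log2(2 * math.pi * sigma * sigma) + \
        (x * x) / (2 * sigma * sigma * math.log(2))

def entropy_gradient(r: float, sigma: float, delta: float = 1e-6) -> float:
    # entropy of the residual increment feature minus entropy of the residual,
    # divided by the increment
    return (neg_log2_density(r + delta, sigma) - neg_log2_density(r, sigma)) / delta

def offset_value(r: float, sigma: float, reg: float = 0.1) -> float:
    g = entropy_gradient(r, sigma)
    return g / (1.0 + reg * abs(g))   # hypothetical regularized offset

# the gradient points away from zero symmetrically
assert entropy_gradient(1.0, 1.0) > 0 > entropy_gradient(-1.0, 1.0)
```

The regularizer caps the offset magnitude, which matches the stated goal of keeping the enhancement structurally stable.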
In another possible implementation, the entropy gradient feature information may include a target entropy gradient and a regularization term, and the target entropy gradient corresponding to the current image block may be determined as follows (without limitation): look up, in an entropy-rate fitting table, the target quantization scale corresponding to the standard deviation, where the target quantization scale is the quantization scale closest to the standard deviation; look up, in the entropy-rate fitting table, the target quadratic term corresponding to the target quantization scale, where the entropy-rate fitting table records the correspondence between quantization scales and quadratic terms; and determine the target entropy gradient corresponding to the current image block based on the target quadratic term.
For example, the entropy-rate fitting table may be obtained as follows (without limitation): for multiple standard deviations between the minimum standard deviation and the maximum standard deviation, quantize the standard deviations uniformly on a logarithmic scale to obtain the corresponding quantization scales; for each quantization scale, perform an entropy-rate fit to obtain the quadratic term corresponding to that quantization scale, and record the correspondence between the quantization scale and the quadratic term in the entropy-rate fitting table.
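The table construction and nearest-scale lookup can be sketched as follows. The choice of quadratic term (the curvature 1 / (2·σ²·ln 2) of -log2 of a Gaussian density) and the range/grid values are illustrative assumptions; the application does not specify the fit:

```python
import math

def build_fitting_table(sigma_min: float, sigma_max: float, num_scales: int):
    # Quantize standard deviations uniformly on a log scale, one entry per scale.
    table = {}
    for i in range(num_scales):
        t = i / (num_scales - 1)
        sigma = math.exp(math.log(sigma_min)
                         + t * (math.log(sigma_max) - math.log(sigma_min)))
        table[round(sigma, 6)] = 1.0 / (2.0 * sigma * sigma * math.log(2))
    return table

def lookup_quadratic(table, sigma: float):
    # target scale = the quantization scale closest to the given standard deviation
    nearest = min(table, key=lambda s: abs(s - sigma))
    return nearest, table[nearest]

table = build_fitting_table(0.1, 10.0, 16)
scale, quad = lookup_quadratic(table, 1.0)
```

At decode time only the lookup runs; the table itself can be built offline once.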
For example, for the above two implementations, the regularization term may be determined as follows (without limitation): determine the regularization term based on the initial image feature corresponding to the current image block, the standard deviation, and the residual feature corresponding to the current image block.
Step 203: based on the target image feature, obtain the reconstructed image block corresponding to the current image block through a synthesis transform network.
For example, the target image feature may be fed into the synthesis transform network, which applies a synthesis transform to the target image feature to obtain the reconstructed image block.
In the above embodiment, the initial image feature corresponding to the current image block is the initial image feature corresponding to the luma component, and/or the initial image feature corresponding to the chroma component. If the initial image feature corresponding to the current image block is the initial image feature corresponding to the luma component, then the target image feature is the target image feature corresponding to the luma component, and the reconstructed image block is the reconstructed image block corresponding to the luma component. And/or, if the initial image feature corresponding to the current image block is the initial image feature corresponding to the chroma component, then the target image feature is the target image feature corresponding to the chroma component, and the reconstructed image block is the reconstructed image block corresponding to the chroma component.
The above execution order is given only for convenience of description; in practical applications, the execution order of the steps may be changed, and no limitation is placed on it. Moreover, in other embodiments, the steps of the corresponding method need not be executed in the order shown and described in this specification, and the method may include more or fewer steps than described here. In addition, a single step described in this specification may be decomposed into multiple steps in other embodiments, and multiple steps described in this specification may be combined into a single step in other embodiments.
As can be seen from the above technical solution, in this embodiment of the present application, the initial image feature corresponding to the current image block and the entropy gradient offset step corresponding to the current image block are obtained; the target image feature is determined based on the initial image feature and the entropy gradient offset step; and the reconstructed image block corresponding to the current image block is obtained based on the target image feature. An end-to-end video image compression method is thus proposed that can encode and decode video images based on neural networks and improve both encoding and decoding efficiency by incorporating the entropy gradient offset step. Combined with the network structure design and the header information bitstream, the neural network effectively guarantees the quality of the reconstructed image block while maintaining low complexity, thereby improving encoding and decoding performance and reducing complexity. The entropy gradient is used to enhance feature quality: the encoder does not modify the feature information directly but encodes the entropy gradient offset step into the header information bitstream, and the decoder enhances the features using this step, improving coding performance and reconstructed image quality. Adding a regularization term improves the structural stability of the feature enhancement.
Embodiment 2: An embodiment of the present application proposes an encoding method. FIG. 3 is a flow chart of the encoding method. The method can be applied to an encoding end (also called a video encoder) and may include the following steps:
Step 301: obtain the initial image feature corresponding to the current image block through a first neural network. The first neural network may be a probabilistic hyperparameter decoding network and may include at least one convolutional layer.
In a possible implementation, obtaining the initial image feature corresponding to the current image block through the first neural network may include, but is not limited to: obtaining, based on a first bitstream corresponding to the current image block, the standard deviation corresponding to the current image block through the probabilistic hyperparameter decoding network; determining a probability distribution model based on the standard deviation, and decoding a second bitstream corresponding to the current image block based on the probability distribution model to obtain decoded image features; determining the residual feature corresponding to the current image block based on the decoded image features; and determining the initial image feature corresponding to the current image block based on the residual feature. For example, obtaining the standard deviation corresponding to the current image block through the probabilistic hyperparameter decoding network based on the first bitstream may include, but is not limited to: decoding the first bitstream based on a factorized probability model to obtain decoded image features, and determining a coefficient hyperparameter feature based on the decoded image features; then feeding the coefficient hyperparameter feature into the probabilistic hyperparameter decoding network, which applies an inverse transform to it to obtain the standard deviation corresponding to the current image block.
Step 302: for each entropy gradient offset step, perform feature enhancement on the initial image feature based on that entropy gradient offset step to obtain a target image feature, and obtain, based on the target image feature, the reconstructed image block corresponding to that entropy gradient offset step through the synthesis transform network.
Step 303: based on the current image block and the reconstructed image block corresponding to each entropy gradient offset step, select from all entropy gradient offset steps the entropy gradient offset step corresponding to the current image block (i.e., the entropy gradient offset step finally adopted), and encode the selected entropy gradient offset step into the header information bitstream corresponding to the current image block.
For example, performing feature enhancement on the initial image feature based on the entropy gradient offset step to obtain the target image feature may include: performing feature enhancement on the initial image feature based on the entropy gradient offset step and the entropy gradient feature information corresponding to the current image block to obtain the target image feature. Here, the entropy gradient offset step represents the offset step of the entropy gradient and serves as the correction step used when enhancing the initial image feature.
In a possible implementation, the entropy gradient feature information includes an entropy gradient offset value, and the entropy gradient offset value corresponding to the current image block may be determined as follows (without limitation): determine the entropy gradient offset value based on the residual feature. For example, determining the entropy gradient offset value based on the residual feature may include, but is not limited to: determining a residual increment feature corresponding to the residual feature; computing the entropy of the residual feature and the entropy of the residual increment feature; determining the entropy gradient of the residual feature based on these two entropies; and determining the entropy gradient offset value based on the entropy gradient of the residual feature and a regularization term.
In another possible implementation, the entropy gradient feature information may include a target entropy gradient and a regularization term, and the target entropy gradient corresponding to the current image block may be determined as follows (without limitation): look up, in an entropy-rate fitting table, the target quantization scale corresponding to the standard deviation, where the target quantization scale is the quantization scale closest to the standard deviation; look up, in the entropy-rate fitting table, the target quadratic term corresponding to the target quantization scale, where the entropy-rate fitting table records the correspondence between quantization scales and quadratic terms; and determine the target entropy gradient corresponding to the current image block based on the target quadratic term.
For example, the entropy-rate fitting table may be obtained as follows (without limitation): for multiple standard deviations between the minimum standard deviation and the maximum standard deviation, quantize the standard deviations uniformly on a logarithmic scale to obtain the corresponding quantization scales; for each quantization scale, perform an entropy-rate fit to obtain the quadratic term corresponding to that quantization scale, and record the correspondence between the quantization scale and the quadratic term in the entropy-rate fitting table.
For example, for the above two implementations, the regularization term may be determined as follows (without limitation): determine the regularization term based on the initial image feature corresponding to the current image block, the standard deviation, and the residual feature corresponding to the current image block.
For example, obtaining the reconstructed image block corresponding to the entropy gradient offset step through the synthesis transform network based on the target image feature may include, but is not limited to: feeding the target image feature into the synthesis transform network, which applies a synthesis transform to the target image feature to obtain the reconstructed image block corresponding to that entropy gradient offset step.
In a possible implementation, selecting the entropy gradient offset step corresponding to the current image block from all entropy gradient offset steps, based on the current image block and the reconstructed image block corresponding to each step, may include, but is not limited to: for each entropy gradient offset step, determining the cost value corresponding to that step based on the current image block and the reconstructed image block corresponding to that step; then, based on the cost value corresponding to each step, selecting the entropy gradient offset step with the minimum cost value as the entropy gradient offset step corresponding to the current image block.
In another possible implementation, selecting the entropy gradient offset step corresponding to the current image block from all entropy gradient offset steps, based on the current image block and the reconstructed image block corresponding to each step, may include, but is not limited to: for each entropy gradient offset step, determining the target cost value corresponding to that step based on the current image block and the reconstructed image block corresponding to that step, and determining the fidelity degree corresponding to that step based on the target cost value and a reference cost value; then, based on the fidelity degree corresponding to each step, selecting the entropy gradient offset step with the maximum fidelity degree as the entropy gradient offset step corresponding to the current image block. The reference cost value may be obtained as follows: after the initial image feature corresponding to the current image block is obtained, obtain a reconstructed image block through the synthesis transform network based on the initial image feature, and determine the reference cost value based on the current image block and that reconstructed image block.
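The two selection rules above can be sketched side by side. The cost function here is a toy stand-in for a real distortion measure between the source block and a reconstruction, and the fidelity form (reference cost divided by target cost) is one plausible reading; the application does not pin down the exact ratio:

```python
# Minimum-cost selection (first implementation) and maximum-fidelity selection
# (second implementation) over a candidate step set.
def select_by_min_cost(steps, cost_of_step):
    return min(steps, key=cost_of_step)

def select_by_max_fidelity(steps, cost_of_step, reference_cost):
    # hypothetical fidelity: how much cheaper a step is than the un-enhanced
    # reference reconstruction
    fidelity = {s: reference_cost / max(cost_of_step(s), 1e-12) for s in steps}
    return max(fidelity, key=fidelity.get)

steps = [0.0, 0.05, 0.1, 0.2]
def toy_cost(s):
    return 1.0 + (s - 0.1) ** 2   # toy cost with its minimum at step 0.1

assert select_by_min_cost(steps, toy_cost) == 0.1
assert select_by_max_fidelity(steps, toy_cost, reference_cost=1.0) == 0.1
```

Both rules agree on this toy cost; they can differ only in how the comparison is framed, since maximizing reference/target fidelity with a fixed reference is equivalent to minimizing the target cost.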
For example, after the entropy gradient offset step corresponding to the current image block is obtained, the entropy gradient offset step may be encoded into the header information bitstream corresponding to the current image block. For instance, the value of the entropy gradient offset step itself may be encoded into the header information bitstream; or indication information corresponding to the entropy gradient offset step may be encoded into the header information bitstream, e.g., if the configured entropy gradient offset step table includes at least one entropy gradient offset step, the indication information corresponding to the entropy gradient offset step indicates the index value of the entropy gradient offset step.
In the above embodiments, the initial image features of the current image block are the initial image features corresponding to the luminance component and/or the initial image features corresponding to the chrominance component. If the initial image features of the current image block are those of the luminance component, the target image features are the target image features of the luminance component, and the reconstructed image block is the reconstructed image block of the luminance component. Likewise, if the initial image features of the current image block are those of the chrominance component, the target image features are the target image features of the chrominance component, and the reconstructed image block is the reconstructed image block of the chrominance component.
By way of example, the above execution order is given only for convenience of description; in practical applications, the execution order of the steps may be changed, and no limitation is placed on it. Moreover, in other embodiments, the steps of the corresponding method are not necessarily performed in the order shown and described in this specification, and the method may include more or fewer steps than described herein. In addition, a single step described in this specification may be decomposed into multiple steps in other embodiments, and multiple steps described in this specification may be combined into a single step in other embodiments.
As can be seen from the above technical solution, in the embodiments of the present application, the initial image features of the current image block and the entropy gradient offset step of the current image block are obtained, the target image features are determined based on the initial image features and the entropy gradient offset step, and the reconstructed image block of the current image block is obtained based on the target image features. An end-to-end video image compression method is thereby proposed that can encode and decode video images based on neural networks and, by incorporating the entropy gradient offset step, improves encoding and decoding efficiency. Combined with the network structure design and the header information bitstream, the neural network effectively guarantees the quality of the reconstructed image block while maintaining low complexity, thereby improving encoding and decoding performance while reducing complexity. The entropy gradient is used to enhance feature quality: the encoding end does not modify the feature information directly but encodes the entropy gradient offset step into the header information bitstream, and the decoding end enhances the features using that step, improving coding performance and reconstructed image quality. Adding a regularization term improves the structural stability of the feature enhancement.
Embodiment 3: For Embodiments 1 and 2, the processing at the encoding end may be as shown in FIG. 4. Of course, FIG. 4 is only one example of the encoding-end processing and does not limit it.
After obtaining the current image block x (which may be the original, i.e., input, image block), the encoding end may apply an analysis transformation to x through an analysis transformation network (a neural network) to obtain the image feature y of the current image block. Performing this feature transformation on x through the analysis transformation network means mapping the current image block x to the image feature y in the latent domain, so that all subsequent processing can operate in the latent domain.
An image may be divided into a single image block or into multiple image blocks. If the image is divided into a single image block, the current image block x may itself be the image; that is, the encoding and decoding process for an image block can also be applied directly to an image.
After obtaining the image feature y, the encoding end performs a coefficient hyper-parameter feature transformation on y to obtain the coefficient hyper-parameter feature z. For example, y may be input to a hyper-parameter encoding network (a neural network), which performs the transformation. The hyper-parameter encoding network may be a trained neural network; its training process is not restricted, provided that it can perform the coefficient hyper-parameter feature transformation on y. In other words, the latent-domain image feature y passes through the hyper-parameter encoding network to yield the hyper-prior latent information z.

After obtaining the coefficient hyper-parameter feature z, the encoding end may quantize z to obtain the corresponding hyper-parameter quantization feature (the Q operation in FIG. 4 is the quantization process), and then encode that feature to obtain Bitstream#1 (the first bitstream) of the current image block; the AE operation in FIG. 4 denotes the encoding process, such as entropy encoding. Alternatively, the encoding end may encode z directly to obtain Bitstream#1. The hyper-parameter quantization feature or coefficient hyper-parameter feature z carried in Bitstream#1 is mainly used to obtain the parameters of the mean and the probability distribution model.
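The hyper-prior branch (y → z → Q → AE → Bitstream#1) can be sketched as below. This is a hedged illustration: `hyper_encoder` is a stand-in for the trained hyper-parameter encoding network, rounding stands in for the Q operation, and `zlib` replaces the arithmetic encoder (AE) purely to show where entropy coding happens.

```python
import zlib

def quantize(z):
    """Q operation: round each latent value to the nearest integer."""
    return [round(v) for v in z]

def hyper_encode(y, hyper_encoder):
    """Encoder-side hyper-prior branch producing Bitstream#1.

    hyper_encoder : placeholder for the hyper-parameter encoding network
    Returns (bitstream1, hyper-parameter quantization feature).
    """
    z = hyper_encoder(y)                 # coefficient hyper-parameter feature z
    z_q = quantize(z)                    # hyper-parameter quantization feature
    payload = ",".join(str(v) for v in z_q).encode()
    bitstream1 = zlib.compress(payload)  # placeholder for arithmetic coding
    return bitstream1, z_q
```

In a real codec the AE stage would use the factorized probability model mentioned later, not a general-purpose compressor.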
After obtaining Bitstream#1 of the current image block, the encoding end may send it to the decoding end; the decoding end's handling of Bitstream#1 is described in subsequent embodiments.

After obtaining Bitstream#1 of the current image block, the encoding end may also decode Bitstream#1 to obtain the hyper-parameter quantization feature (AD in FIG. 4 denotes the decoding process) and then dequantize it to obtain the coefficient hyper-parameter feature z_hat, which may be the same as or different from z; the IQ operation in FIG. 4 is the dequantization process. Alternatively, the encoding end may decode Bitstream#1 to obtain z_hat directly, without a dequantization step.

For the encoding and decoding of Bitstream#1, encoding and decoding methods based on a fixed probability density model may be used; no limitation is placed on these processes.
After obtaining z_hat, the encoding end may perform context-based prediction based on the coefficient hyper-parameter feature z_hat of the current image block and the residual feature y_hat of the preceding image block (the determination of y_hat is described in subsequent embodiments) to obtain the prediction value mu (i.e., the mean mu) of the current image block. For example, z_hat and y_hat may be input to a mean prediction network, which determines mu from them; no limitation is placed on this prediction process. For the context-based prediction, the inputs are z_hat and the decoded residual feature y_hat; their joint input yields a more accurate prediction value mu, which is subtracted from the original feature to obtain the residual, and added to the decoded residual to obtain the reconstructed y.

Note that the mean prediction network is optional: the prediction value mu need not be determined through a mean prediction network. The dashed box in FIG. 4 indicates that the mean prediction network is optional.
After obtaining the image feature y, the encoding end may determine the residual feature r based on y and the prediction value mu, e.g., taking the difference between y and mu as r. The residual feature r then undergoes feature processing to obtain the image feature s; no limitation is placed on this feature processing, which may be any method. In this case, a mean prediction network must be deployed to provide mu. Alternatively, the encoding end may apply the feature processing to y directly to obtain s, in which case no mean prediction network is needed; the dashed box indicates that the residual step is optional.
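The optional residual step can be sketched as follows; a minimal sketch in which `feature_process` defaults to identity, since the text leaves the feature-processing method unrestricted.

```python
def residual_branch(y, mu=None, feature_process=lambda r: r):
    """Encoder-side residual step.

    With a mean prediction network deployed, the residual feature r is the
    difference between image feature y and prediction value mu; without it,
    y is feature-processed directly (the residual step is skipped).
    """
    if mu is not None:
        r = [a - b for a, b in zip(y, mu)]  # r = y - mu
    else:
        r = y                                # residual step skipped
    return feature_process(r)                # image feature s
```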
After obtaining the image feature s, the encoding end may quantize s to obtain the corresponding image quantization feature (the Q operation in FIG. 4 is the quantization process), and then encode that feature to obtain Bitstream#2 (the second bitstream) of the current image block; the AE operation in FIG. 4 denotes the encoding process, such as entropy encoding. Alternatively, the encoding end may encode s directly to obtain Bitstream#2, without a quantization step.

After obtaining Bitstream#2 of the current image block, the encoding end may send it to the decoding end; the decoding end's handling of Bitstream#2 is described in subsequent embodiments.

After obtaining Bitstream#2 of the current image block, the encoding end may also decode Bitstream#2 to obtain the image quantization feature (AD in FIG. 4 denotes the decoding process) and then dequantize it to obtain the image feature s', which may be the same as or different from s; the IQ operation in FIG. 4 is the dequantization process. Alternatively, the encoding end may decode Bitstream#2 to obtain s' directly, without a dequantization step.
After obtaining s', the encoding end may perform feature recovery on s' (the inverse of the feature processing; any recovery method may be used) to obtain the residual feature r_hat, which may be the same as or different from r. The encoding end then determines the image feature y_hat based on r_hat and the prediction value mu, e.g., taking the sum of r_hat and mu as y_hat; y_hat may be the same as or different from y. In this case, a mean prediction network must be deployed to provide mu. Alternatively, the encoding end may perform feature recovery on s' to obtain y_hat directly, in which case no mean prediction network is needed; the dashed box indicates that the residual step is optional.

After obtaining y_hat, the encoding end may apply a synthesis transformation to y_hat to obtain the reconstructed image block x_hat of the current image block x: for example, y_hat is input to the synthesis transformation network, which performs the synthesis transformation to produce x_hat. This completes the image reconstruction process.
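The reconstruction path (s' → r_hat → y_hat → x_hat) mirrors the residual step and can be sketched as below; `feature_recover` and `synthesis` are identity placeholders for the feature-recovery step and the synthesis transformation network, which the text does not specify.

```python
def reconstruct(s_prime, mu=None, feature_recover=lambda s: s,
                synthesis=lambda y: y):
    """Reconstruction path shared by encoder and decoder.

    feature_recover : inverse of the earlier feature processing (placeholder)
    synthesis       : stand-in for the synthesis transformation network
    """
    r_hat = feature_recover(s_prime)                 # residual feature r_hat
    if mu is not None:
        y_hat = [a + b for a, b in zip(r_hat, mu)]   # y_hat = r_hat + mu
    else:
        y_hat = r_hat                                # residual step skipped
    return synthesis(y_hat)                          # reconstructed block x_hat
```

With identity placeholders, `reconstruct` exactly inverts `residual_branch` when the same `mu` is used, which is the lossless round-trip the pipeline approximates after quantization.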
In a possible implementation, when encoding the image quantization feature or the image feature s to obtain Bitstream#2 of the current image block, the encoding end first determines a probability distribution model and then encodes based on it. Likewise, when decoding Bitstream#2, the encoding end first determines the probability distribution model and then decodes Bitstream#2 based on it.

To obtain the probability distribution model, as shown in FIG. 4, after obtaining z_hat the encoding end may apply the inverse coefficient hyper-parameter feature transformation to z_hat to obtain the probability distribution parameter p: for example, z_hat is input to a probability hyper-parameter decoding network, which performs the inverse transformation to yield p. After p is obtained, the probability distribution model can be generated based on p. The probability hyper-parameter decoding network may be a trained neural network; its training process is not restricted, provided that it can perform the inverse coefficient hyper-parameter feature transformation on z_hat.
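Once the probability distribution parameter p is available, a concrete choice of probability distribution model (assumed here; the text leaves the model unspecified beyond the later N(mu, σ) form) is a discretized Gaussian, where each integer symbol receives the Gaussian mass on a unit-wide bin. A minimal sketch:

```python
import math

def gaussian_cdf(x, mu=0.0, sigma=1.0):
    """Cumulative distribution function of N(mu, sigma)."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def symbol_probability(v, mu, sigma):
    """Probability the entropy coder assigns to integer symbol v under a
    discretized Gaussian N(mu, sigma): the mass on [v - 0.5, v + 0.5]."""
    return (gaussian_cdf(v + 0.5, mu, sigma)
            - gaussian_cdf(v - 0.5, mu, sigma))
```

These per-symbol probabilities are exactly what an arithmetic coder consumes; a more probable symbol (one near mu) costs fewer bits.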
In a possible implementation, the above encoding-end processing may be performed by a deep learning model or a neural network model, thereby realizing an end-to-end image compression and encoding process; no limitation is placed on this encoding process.

Embodiment 4: For Embodiments 1 and 2, the processing at the decoding end may be as shown in FIG. 5. Of course, FIG. 5 is only one example of the decoding-end processing and does not limit it.
After obtaining Bitstream#1 of the current image block, the decoding end may decode Bitstream#1 to obtain the hyper-parameter quantization feature (AD in FIG. 5 denotes the decoding process) and then dequantize it to obtain the coefficient hyper-parameter feature z_hat, which may be the same as or different from z; the IQ operation in FIG. 5 is the dequantization process. Alternatively, the decoding end may decode Bitstream#1 to obtain z_hat directly, without a dequantization step.

For the decoding of Bitstream#1, a decoding method based on a fixed probability density model may be used; no limitation is placed on this.

An image may be divided into a single image block or into multiple image blocks. If the image is divided into a single image block, the current image block x may itself be the image; that is, the decoding process for an image block can also be applied directly to an image.
After obtaining z_hat, the decoding end may perform context-based prediction based on the coefficient hyper-parameter feature z_hat of the current image block and the residual feature y_hat of the preceding image block (the determination of y_hat is described in subsequent embodiments) to obtain the prediction value mu (i.e., the mean mu) of the current image block. For example, z_hat and y_hat may be input to a mean prediction network, which determines mu from them; no limitation is placed on this prediction process. For the context-based prediction, the inputs are z_hat and the decoded residual feature y_hat; their joint input yields a more accurate prediction value mu.

Note that the mean prediction network is optional: the prediction value mu need not be determined through a mean prediction network. The dashed box in FIG. 5 indicates that the mean prediction network is optional.
After obtaining Bitstream#2 of the current image block, the decoding end may decode Bitstream#2 to obtain the image quantization feature (AD in FIG. 5 denotes the decoding process) and then dequantize it to obtain the image feature s', which may be the same as or different from s; the IQ operation in FIG. 5 is the dequantization process. Alternatively, the decoding end may decode Bitstream#2 to obtain s' directly, without a dequantization step.

After obtaining s', the decoding end may perform feature recovery on s' (the inverse of the feature processing) to obtain the residual feature r_hat, which may be the same as or different from r. The decoding end then determines the image feature y_hat based on r_hat and the prediction value mu, e.g., taking the sum of r_hat and mu as y_hat; y_hat may be the same as or different from y. In this case, a mean prediction network must be deployed to provide mu. Alternatively, the decoding end may perform feature recovery on s' to obtain y_hat directly, in which case no mean prediction network is needed; the dashed box indicates that the residual step is optional.

After obtaining y_hat, the decoding end may apply a synthesis transformation to y_hat to obtain the reconstructed image block x_hat of the current image block x: for example, y_hat is input to the synthesis transformation network, which performs the synthesis transformation to produce x_hat. This completes the image reconstruction process.
In a possible implementation, when decoding Bitstream#2 the decoding end first determines a probability distribution model and then decodes Bitstream#2 based on it. To obtain the probability distribution model, as shown in FIG. 5, after obtaining z_hat the decoding end may apply the inverse coefficient hyper-parameter feature transformation to z_hat to obtain the probability distribution parameter p: for example, z_hat is input to a probability hyper-parameter decoding network, which performs the inverse transformation to yield p. After p is obtained, the probability distribution model can be generated based on p. The probability hyper-parameter decoding network may be a trained neural network; its training process is not restricted, provided that it can perform the inverse coefficient hyper-parameter feature transformation on z_hat to obtain p.

In a possible implementation, the above decoding-end processing may be performed by a deep learning model or a neural network model, thereby realizing an end-to-end image compression and decoding process; no limitation is placed on this decoding process.

Embodiment 5: An embodiment of the present application proposes a decoding method. FIG. 6A is a flowchart of the decoding method, which may be applied to a decoding end (also called a video decoder). The method may include the following steps.
Step S11: The decoding end receives the first bitstream (Bitstream#1) of the current image block; the first bitstream may also be called the side information bitstream. The decoding end decodes the first bitstream to obtain the hyper-parameter quantization feature, e.g., by lossless entropy decoding of the first bitstream according to a factorized probability model p_f(·). The decoding end dequantizes the hyper-parameter quantization feature to obtain the coefficient hyper-parameter feature z_hat, which may also be called the quantized side information. Alternatively, the first bitstream may be losslessly entropy decoded according to the factorized probability model p_f(·) to obtain z_hat directly.

In summary, the decoding end may decode the first bitstream of the current image block based on the factorized probability model to obtain decoded image features, and determine the coefficient hyper-parameter feature based on them. For example, the decoded image features may be the hyper-parameter quantization feature, which the decoding end dequantizes to obtain z_hat; alternatively, the decoded image features may directly be z_hat.
Step S12: The decoding end receives the header information bitstream of the current image block and decodes from it the entropy gradient offset step of the current image block, which may also be called the entropy gradient offset step ρ of the optimal main information.

In a possible implementation, the header information bitstream may carry the value of the entropy gradient offset step, in which case the decoding end decodes the header information bitstream of the current image block and directly obtains the value of the entropy gradient offset step.

In another possible implementation, the header information bitstream may carry indication information corresponding to the entropy gradient offset step. The decoding end decodes the header information bitstream of the current image block to obtain this indication information, which indicates the index of the entropy gradient offset step, i.e., its position in the entropy gradient offset step table. For example, if a preconfigured entropy gradient offset step table includes at least one entropy gradient offset step, the decoding end uses the index indicated by the indication information to select, from the table, the entropy gradient offset step corresponding to that index as the entropy gradient offset step of the current image block.
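Both signaling variants above can be sketched as follows. The dictionary `header` stands in for the parsed header fields, and the table values are purely illustrative; the patent does not specify the table contents.

```python
def parse_offset_step(header, step_table=(0.0, 0.25, 0.5, 1.0)):
    """Recover the entropy gradient offset step from parsed header info.

    The header either carries the step value directly ("step_value") or an
    index ("step_index") into a preconfigured step table. Field names and
    table entries are hypothetical placeholders.
    """
    if "step_value" in header:
        return header["step_value"]
    return step_table[header["step_index"]]
```

Index-based signaling trades a small table-maintenance cost for fewer header bits per block than sending the raw step value.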
Step S13: After obtaining the coefficient hyperparameter feature z_hat (i.e., the quantized side information), the decoding end performs an inverse coefficient hyperparameter transform on z_hat to obtain the standard deviation σ. For example, z_hat is input to a probability hyperparameter decoding network, which performs the inverse coefficient hyperparameter transform on z_hat to obtain σ. The probability hyperparameter decoding network is also called the hyper-prior scale decoding network, and σ is also called the probability density parameter of the main information.
Exemplarily, after obtaining the standard deviation σ, the probability distribution parameter p can be determined from it. For example, if p includes only the standard deviation, σ itself serves as p, i.e., a zero-mean probability distribution parameter p. Alternatively, if p includes both a standard deviation and a mean, the mean is obtained as well: for example, the coefficient hyperparameter feature z_hat and the residual feature y_hat are input to a mean prediction network, which determines the mean mu (i.e., the predicted value mu) from them; this prediction process is not restricted. In this way, p can be determined from σ and mu. After obtaining p, the decoding end can generate the probability distribution model N(mu, σ) based on p; this process is not restricted in this embodiment.
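A minimal sketch of the resulting probability model N(mu, σ): each quantized value is assigned the Gaussian probability mass of its unit-width bin (the unit bin width is an assumption, consistent with the ±0.5 erf expressions used later in this document):

```python
import math

def bin_probability(y, mu, sigma):
    """Probability of quantized value y under N(mu, sigma): mass of [y-0.5, y+0.5]."""
    def cdf(t):
        return 0.5 * (1.0 + math.erf((t - mu) / (sigma * math.sqrt(2.0))))
    return cdf(y + 0.5) - cdf(y - 0.5)

# With mu = 0 and sigma = 1, the integer symbols carry essentially all the mass,
# and the bin at the mean is the most probable one.
p_center = bin_probability(0, mu=0.0, sigma=1.0)
total_mass = sum(bin_probability(k, 0.0, 1.0) for k in range(-20, 21))
```

An arithmetic entropy coder would consume exactly these bin probabilities when decoding the main information bitstream.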
In summary, after the decoding end obtains the coefficient hyperparameter feature z_hat, it can input z_hat to the probability hyperparameter decoding network, which inversely transforms z_hat to obtain the standard deviation.
Step S14: The decoding end receives the second bitstream (Bitstream#2) corresponding to the current image block; the second bitstream may also be called the main information bitstream. The decoding end decodes the second bitstream based on the probability distribution model to obtain decoded image features, and determines from them the residual feature corresponding to the current image block. The residual feature is the quantized residual r_hat of the main information.
For example, the decoding end decodes the second bitstream based on the probability distribution model to obtain an image quantization feature (i.e., the decoded image feature) and dequantizes it to obtain the residual feature r_hat, i.e., the quantized residual. Alternatively, the decoding end decodes the second bitstream based on the probability distribution model to obtain an image quantization feature (i.e., the decoded image feature) and dequantizes it to obtain an image feature s'; or the decoding end decodes the second bitstream based on the probability distribution model to obtain the image feature s' directly (i.e., as the decoded image feature). After obtaining s', the decoding end can perform feature recovery on s' to obtain the residual feature r_hat, i.e., the quantized residual.
Step S15: The decoding end determines the initial image feature corresponding to the current image block based on the residual feature.
Exemplarily, after obtaining the residual feature r_hat, the decoding end determines the image feature y_hat based on r_hat and the predicted value mu (i.e., the mean mu); y_hat is the initial image feature corresponding to the current image block. For example, the sum of r_hat and mu serves as the initial image feature; the initial image feature may also be called the main information quantization value y_hat.
To obtain the predicted value mu: after obtaining the coefficient hyperparameter feature z_hat, the decoding end can perform context-based prediction using z_hat of the current image block and the residual feature y_hat of previously decoded image blocks to obtain the predicted value mu for the current image block. For example, z_hat and y_hat are input to a mean prediction network, which determines mu from them; this prediction process is not restricted. For the context-based prediction process, the input can include the coefficient hyperparameter feature z_hat and the already decoded residual feature y_hat; the joint input yields a more accurate predicted value mu.
Step S16: The decoding end determines the target image feature based on the initial image feature, the entropy gradient offset step, and the entropy gradient offset value. The entropy gradient offset step represents the step size of the offset along the entropy gradient and indicates the direction in which the feature bitrate increases.
Exemplarily, based on the entropy gradient offset step ρ, the initial image feature y_hat, and the entropy gradient offset value shift, the target image feature y can be determined using the following expression (1):

y = y_hat + ρ · shift    (1)

That is, the initial image feature y_hat can be offset based on ρ and shift, e.g., the feature value of each feature point in y_hat is offset based on ρ and shift. Of course, expression (1) is only an example and is not limiting.
In expression (1), y_hat denotes the initial image feature, obtained from the residual feature r_hat and the predicted value mu; ρ denotes the entropy gradient offset step, decoded from the header information bitstream corresponding to the current image block; and shift denotes the entropy gradient offset value. The entropy gradient offset value may be determined, for example, as follows: determine shift for the current image block based on its residual feature (i.e., the quantized residual r_hat); specifically, determine the residual increment feature corresponding to the residual feature, compute the entropy of the residual feature and the entropy of the residual increment feature, determine the entropy gradient of the residual feature from these two entropies, and determine the entropy gradient offset value from the entropy gradient of the residual feature and a regularization term.
In one possible implementation, the regularization term may be determined, for example, based on the initial image feature y_hat, the standard deviation σ, and the residual feature (i.e., the quantized residual r_hat). The regularization term is used to improve the structural stability of the feature-domain representation and to enhance performance under different distortion evaluation metrics.
Step S17: The decoding end obtains the reconstructed image block corresponding to the current image block based on the target image feature. For example, the target image feature is input to a synthesis transform network, which performs a synthesis transform on the target image feature to obtain the reconstructed image block. The target image feature is the enhanced main information.
Embodiment 6: Embodiments 1, 2, and 5 involve determining the target image feature based on the initial image feature, the entropy gradient offset step, and the entropy gradient offset value, as shown in expression (1). This process is described below.
Referring to FIG. 6B, a schematic diagram of an end-to-end encoding and decoding framework, the framework consists of two parts: a feature main information part and a hyper-prior side information part. The feature main information part includes an analysis network, quantization, Gaussian (normal) entropy encoding, Gaussian entropy decoding, and a synthesis network. The hyper-prior side information part includes a hyper-prior analysis network, quantization, factorized entropy encoding, factorized entropy decoding, and a hyper-prior synthesis network. The current image block x is compression-encoded and reconstructed by the analysis network and the synthesis network of the feature main information part, respectively; the hyper-prior side information part models the probability of the feature main information and guides its entropy encoding and entropy decoding.
In the end-to-end encoding and decoding framework, the initial image feature (i.e., the main information feature value y_hat) follows the normal distribution N(μ, σ), where N(μ, σ) is the probability distribution model. During training, the network minimizes the following loss function:

Loss = R(y_hat) + R(z_hat) + λ · D    (2)

In expression (2), R(y_hat) is the entropy-rate loss (bitrate loss) of y_hat, and R(z_hat) is the entropy-rate loss of z_hat, where y_hat is the initial image feature of the above embodiments and z_hat is the coefficient hyperparameter feature (i.e., the quantized side information). D is the distortion loss between the current image block x and its reconstructed image block, and λ is the Lagrangian coefficient. This is a multi-objective Pareto optimization problem (Pareto optimal); when training of the neural network model converges, the KKT conditions (Karush-Kuhn-Tucker conditions) yield expression (3):

∂R(y_hat)/∂y_hat = −λ · ∂D/∂y_hat    (3)

As can be seen from expression (3), the entropy gradient of the main information and the distortion gradient can be negatively correlated; that is, adjusting the initial image feature y_hat along the direction of the main information entropy gradient can reduce the distortion loss D and thereby improve coding performance.
Exemplarily, the self-information of the initial image feature y_hat is given by expression (4); that is, for each feature point y_hat_i of the initial image feature, its self-information I_i is:

I_i = −log2( (1/2) · ( erf( (r_hat_i + 0.5) / (σ_i · √2) ) − erf( (r_hat_i − 0.5) / (σ_i · √2) ) ) )    (4)

In expression (4), σ is the scale parameter of the normal distribution of the initial image feature y_hat, r_hat is the quantized residual of the main information, and erf is the error function; r_hat is the residual feature, y_hat the initial image feature, and μ the mean, with r_hat = y_hat − μ.
Referring to FIG. 6C, which plots the relationship between the self-information of a feature point and the quantized residual r_hat of the main information for different values of σ: three values of σ can be selected, and the curves sampled at the corresponding r_hat values (marked with circles in FIG. 6C) and fitted with quadratic polynomial curves (marked with dashed lines in FIG. 6C). As FIG. 6C shows, for fixed r_hat, the larger σ is, the smaller the slope of the curve, i.e., the smaller the entropy gradient of the main information.
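The behaviour shown in FIG. 6C can be checked numerically. The sketch below evaluates the self-information of a unit-width bin under N(0, σ) and estimates the curvature coefficient of its quadratic fit with a simple two-point formula (a rough stand-in for the polynomial fitting used for the figure): the coefficient shrinks as σ grows, i.e., the entropy curve flattens.

```python
import math

def self_info(r, sigma):
    """Self-information -log2 P of the unit-width bin at quantized residual r under N(0, sigma)."""
    s2 = sigma * math.sqrt(2.0)
    p = 0.5 * (math.erf((r + 0.5) / s2) - math.erf((r - 0.5) / s2))
    return -math.log2(p)

def curvature(sigma):
    """Two-point estimate of a in I(r) ~ a*r^2 + c (b ~ 0 since I is even in r)."""
    return (self_info(2.0, sigma) - self_info(0.0, sigma)) / 4.0

# Larger sigma -> flatter curve -> smaller quadratic coefficient.
a_small, a_mid, a_large = curvature(1.0), curvature(2.0), curvature(4.0)
```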
Exemplarily, the bitrate can be computed as in expression (5), and the gradient of the bitrate with respect to a given feature point can be derived as in expression (6):

R = Σ_i I_i    (5)

∂R/∂y_hat_i = ∂I_i/∂r_hat_i ≈ 2·a_i·r_hat_i + b_i ∝ r_hat_i / σ_i²    (6)

In expression (5), R is the bitrate of the main information and I_i is the self-information of the i-th feature point. In expression (6), a_i, b_i, and c_i are the quadratic coefficient, linear coefficient, and constant term of the quadratic curve fit I_i ≈ a_i·r_hat_i² + b_i·r_hat_i + c_i, and ∝ denotes proportionality. As expression (6) shows, the residual feature r_hat and the scale parameter σ of the normal distribution can be used to approximately characterize the entropy gradient of the main information.
In one possible implementation, the main information feature points can be offset using the following expressions:
1. Compute the entropy H(r_hat) of the residual feature r_hat, for example using expression (7):

H(r_hat) = −log2( (1/2) · ( erf( (r_hat + 0.5) / (σ · √2) ) − erf( (r_hat − 0.5) / (σ · √2) ) ) )    (7)

In expression (7), erf is the error function. Of course, expression (7) is only an example and is not limiting.
2. Determine the residual increment feature corresponding to the residual feature, e.g., r_hat + s, where s may be greater than 0 (e.g., 0.01 or 0.02) or less than 0 (e.g., −0.01 or −0.02). For ease of description, taking r_hat + s as an example, compute the entropy H(r_hat + s) of the residual increment feature, for example using expression (8):

H(r_hat + s) = −log2( (1/2) · ( erf( (r_hat + s + 0.5) / (σ · √2) ) − erf( (r_hat + s − 0.5) / (σ · √2) ) ) )    (8)

In expression (8), erf is the error function. Of course, expression (8) is only an example and is not limiting.
3. Determine the entropy gradient of the residual feature r_hat from the entropy H(r_hat) of the residual feature and the entropy H(r_hat + s) of the residual increment feature, for example using expression (9):

∂H/∂r_hat ≈ ( H(r_hat + s) − H(r_hat) ) / s    (9)
4. Determine the regularization term f based on the initial image feature y_hat, the standard deviation σ, and the residual feature r_hat, as shown in expression (10).
5. Determine the entropy gradient offset value shift from the entropy gradient of the residual feature and the regularization term f, for example using expression (11); that is, compute from the regularization term and the entropy gradient of the quantized residual of the main information the offset value shift to be applied to the main information quantization value. Of course, expression (11) is only an example; the determination method is not limited to it.
6. Based on the entropy gradient offset step ρ, the initial image feature y_hat, and the entropy gradient offset value shift, determine the target image feature y using expression (12), i.e., offset the main information quantization value by the entropy gradient offset value and the entropy gradient offset step:

y = y_hat + ρ · shift    (12)
This yields the expression that determines the target image feature y: the initial image feature y_hat can be offset based on the entropy gradient offset step ρ and the entropy gradient offset value shift, e.g., the feature value of each feature point in y_hat is offset based on ρ and shift. In expression (12), y_hat denotes the initial image feature, obtained from the residual feature r_hat and the predicted value mu; ρ denotes the entropy gradient offset step, decoded from the header information bitstream corresponding to the current image block; and shift denotes the entropy gradient offset value, determined as in the expressions above.
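Steps 1 to 6 can be strung together as a single sketch. The entropy terms follow the erf-based expressions (7)–(9); since the exact forms of the regularization term (expressions (10)–(11)) are not reproduced in this text, the sketch takes f = 1.0 as a placeholder, so the shift below equals the raw finite-difference entropy gradient:

```python
import math

def entropy_bits(r, sigma):
    """Expressions (7)/(8): -log2 of the Gaussian mass of the unit-width bin at residual r."""
    s2 = sigma * math.sqrt(2.0)
    p = 0.5 * (math.erf((r + 0.5) / s2) - math.erf((r - 0.5) / s2))
    return -math.log2(p)

def enhance_feature(y_hat, mu, sigma, rho, s=0.01, f=1.0):
    """Offset one main-information feature value along its entropy gradient.

    f stands in for the regularization term of expressions (10)-(11),
    whose exact form is not given here; f = 1.0 is a placeholder.
    """
    r_hat = y_hat - mu                                                        # quantized residual
    grad = (entropy_bits(r_hat + s, sigma) - entropy_bits(r_hat, sigma)) / s  # expression (9)
    shift = f * grad                                                          # placeholder for (11)
    return y_hat + rho * shift                                                # expression (12)

y = enhance_feature(y_hat=1.0, mu=0.3, sigma=2.0, rho=0.1)
```

With ρ = 0 the feature is left unchanged, which matches signalling a zero offset step in the header.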
Embodiment 7: An embodiment of the present application proposes a decoding method, which can be applied to a decoding end and includes the following steps:
Step S21: The decoding end receives the first bitstream (Bitstream#1) corresponding to the current image block; the first bitstream may also be called the side information bitstream. The decoding end decodes the first bitstream to obtain a hyperparameter quantization feature; for example, it performs lossless entropy decoding on the first bitstream according to the factorized probability model p_f(.) to obtain the hyperparameter quantization feature. The decoding end dequantizes the hyperparameter quantization feature to obtain the coefficient hyperparameter feature z_hat, which may also be called the quantized side information. Alternatively, the first bitstream may be losslessly entropy-decoded according to the factorized probability model p_f(.) to obtain z_hat directly.
In summary, the decoding end can decode the first bitstream corresponding to the current image block based on the factorized probability model to obtain decoded image features, and determine the coefficient hyperparameter feature from them. For example, the decoded image feature may be a hyperparameter quantization feature, which the decoding end dequantizes to obtain the coefficient hyperparameter feature z_hat; or the decoded image feature may directly be the coefficient hyperparameter feature z_hat.
Step S22: The decoding end receives the header information bitstream corresponding to the current image block and decodes from it the entropy gradient offset step corresponding to the current image block. The entropy gradient offset step may also be referred to as the entropy gradient offset step ρ of the optimal main information.

In one possible implementation, the header information bitstream may carry the value of the entropy gradient offset step directly; the decoding end therefore decodes the header information bitstream corresponding to the current image block and obtains the value of the entropy gradient offset step.

In another possible implementation, the header information bitstream may carry indication information corresponding to the entropy gradient offset step. The decoding end decodes the header information bitstream corresponding to the current image block to obtain this indication information, which indicates an index value identifying which entry of an entropy gradient offset step table the entropy gradient offset step is. For example, the configured entropy gradient offset step table may include at least one entropy gradient offset step, and the decoding end selects from the table, based on the index value carried by the indication information, the entropy gradient offset step corresponding to that index value as the entropy gradient offset step of the current image block.
Step S23: After obtaining the coefficient hyperparameter feature z_hat (i.e., the quantized side information), the decoding end performs an inverse coefficient hyperparameter transform on z_hat to obtain the standard deviation σ. For example, z_hat is input to a probability hyperparameter decoding network, which performs the inverse coefficient hyperparameter transform on z_hat to obtain σ. The probability hyperparameter decoding network is also called the hyper-prior scale decoding network, and σ is also called the probability density parameter of the main information.

Exemplarily, after obtaining the standard deviation σ, the probability distribution parameter p can be determined from it. For example, if p includes only the standard deviation, σ itself serves as p, i.e., a zero-mean probability distribution parameter p. Alternatively, if p includes both a standard deviation and a mean, the mean is obtained as well: for example, the coefficient hyperparameter feature z_hat and the residual feature y_hat are input to a mean prediction network, which determines the mean mu (i.e., the predicted value mu) from them; this prediction process is not restricted. In this way, p can be determined from σ and mu. After obtaining p, the decoding end can generate the probability distribution model N(mu, σ) based on p; this process is not restricted in this embodiment.

In summary, after the decoding end obtains the coefficient hyperparameter feature z_hat, it can input z_hat to the probability hyperparameter decoding network, which inversely transforms z_hat to obtain the standard deviation.
Step S24: The decoding end receives the second bitstream (Bitstream#2) corresponding to the current image block; the second bitstream may also be called the main information bitstream. The decoding end decodes the second bitstream based on the probability distribution model to obtain decoded image features, and determines from them the residual feature corresponding to the current image block. The residual feature is the quantized residual r_hat of the main information.

For example, the decoding end decodes the second bitstream based on the probability distribution model to obtain an image quantization feature (i.e., the decoded image feature) and dequantizes it to obtain the residual feature r_hat, i.e., the quantized residual. Alternatively, the decoding end decodes the second bitstream based on the probability distribution model to obtain an image quantization feature (i.e., the decoded image feature) and dequantizes it to obtain an image feature s'; or the decoding end decodes the second bitstream based on the probability distribution model to obtain the image feature s' directly (i.e., as the decoded image feature). After obtaining s', the decoding end can perform feature recovery on s' to obtain the residual feature r_hat, i.e., the quantized residual.
Step S25: The decoding end determines the initial image feature corresponding to the current image block based on the residual feature.

Exemplarily, after obtaining the residual feature r_hat, the decoding end determines the image feature y_hat based on r_hat and the predicted value mu (i.e., the mean mu); y_hat is the initial image feature corresponding to the current image block. For example, the sum of r_hat and mu serves as the initial image feature; the initial image feature may also be called the main information quantization value y_hat.

To obtain the predicted value mu: after obtaining the coefficient hyperparameter feature z_hat, the decoding end can perform context-based prediction using z_hat of the current image block and the residual feature y_hat of previously decoded image blocks to obtain the predicted value mu for the current image block. For example, z_hat and y_hat are input to a mean prediction network, which determines mu from them; this prediction process is not restricted. For the context-based prediction process, the input can include the coefficient hyperparameter feature z_hat and the already decoded residual feature y_hat; the joint input yields a more accurate predicted value mu.
Step S26: The decoding end queries the entropy-rate fitting table for the target quantization scale corresponding to the standard deviation σ, the target quantization scale being the quantization scale closest to σ. It then queries the entropy-rate fitting table (which records the correspondence between quantization scales and quadratic terms) for the target quadratic term corresponding to the target quantization scale, and determines the target entropy gradient based on the target quadratic term.
Exemplarily, the entropy-rate fitting table can be maintained in advance. It may be obtained, for example, as follows: for multiple standard deviations between a minimum standard deviation and a maximum standard deviation, uniformly quantize the multiple standard deviations on a logarithmic scale to obtain the corresponding quantization scales; then, for each quantization scale, perform an entropy-rate fit to obtain the quadratic term corresponding to that quantization scale, and record the correspondence between the quantization scale and the quadratic term in the entropy-rate fitting table.
For example, multiple standard deviations are taken between the minimum standard deviation σmin and the maximum standard deviation σmax; their number is not limited, e.g., 64. Assuming σmin is 0.11 and σmax is 256, 64 values can be taken between 0.11 and 256 as a geometric sequence, yielding 64 standard deviations. The multiple standard deviations are then uniformly quantized on a logarithmic scale to obtain the corresponding quantization scales; that is, 64 quantization scales corresponding to the 64 standard deviations are obtained, each denoting the quantized value of a standard deviation.
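A sketch of this scale construction, assuming the 64 scales are the geometric sequence itself and that "closest" is measured on the logarithmic axis (both assumptions, since the patent does not spell out the quantizer):

```python
import math

def build_scales(sigma_min=0.11, sigma_max=256.0, n=64):
    """n quantization scales between sigma_min and sigma_max as a geometric sequence."""
    ratio = (sigma_max / sigma_min) ** (1.0 / (n - 1))
    return [sigma_min * ratio ** i for i in range(n)]

def nearest_scale(sigma, scales):
    """Quantization scale closest to sigma on the log axis."""
    return min(scales, key=lambda q: abs(math.log(q) - math.log(sigma)))

scales = build_scales()
q = nearest_scale(3.0, scales)   # scale used for the table lookup of step S26
```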
Each quantization scale corresponds to a curve of entropy as a function of the residual, and this curve can be fitted with a quadratic curve; that is, an entropy rate fit can be performed for each quantization scale to obtain the quadratic coefficient a corresponding to that scale (i.e., the coefficient of the squared term of the quadratic curve). Differentiating the fitted quadratic shows how the entropy gradient is obtained: since each main information feature point has its corresponding main information feature residual, and each residual corresponds to a per-point standard deviation σi, the quantization scale closest to σi can be found among the 64 quantization scales for each σi. Each such scale corresponds to a quadratic coefficient ai; therefore, the entropy gradient at feature point i can be calculated from ai and the residual at that point as the derivative of the quadratic fit (of the form 2·ai·r_hat_i when only the quadratic term is retained).
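A minimal sketch of the fit-and-lookup procedure, assuming a Gaussian rate model with unit rounding bins and keeping only the quadratic coefficient of the fit; the exact entropy model and fitting procedure in the patent may differ, and all function names are illustrative:

```python
import math
import numpy as np

def rate_bits(r, sigma):
    """-log2 of the probability mass of the rounding bin around r
    under a zero-mean Gaussian with scale sigma (assumed model)."""
    cdf = lambda v: 0.5 * (1.0 + math.erf(v / (sigma * math.sqrt(2.0))))
    p = max(cdf(r + 0.5) - cdf(r - 0.5), 1e-12)
    return -math.log2(p)

def fit_quadratic_term(sigma):
    """Fit rate(r) with a quadratic curve; return the quadratic coefficient a."""
    r_grid = np.linspace(-4.0, 4.0, 81)
    rates = [rate_bits(r, sigma) for r in r_grid]
    a, b, c = np.polyfit(r_grid, rates, 2)  # highest-degree coefficient first
    return a

# Entropy rate fitting table: one quadratic coefficient per quantization scale.
scales = np.geomspace(0.11, 256.0, 64)
table = {i: fit_quadratic_term(s) for i, s in enumerate(scales)}

def target_entropy_gradient(sigma_i, r_i):
    """Find the scale closest to sigma_i, then differentiate the quadratic fit."""
    idx = int(np.argmin(np.abs(scales - sigma_i)))
    a_i = table[idx]
    return 2.0 * a_i * r_i  # d/dr of a*r^2, quadratic term only
```

The table is built once offline; at run time only the nearest-scale lookup and one multiplication are needed per feature point.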
In summary, when enhancing the main information, the entropy gradient can be modeled using the fitted quadratic polynomial coefficients: for each quantization scale, an entropy rate fit yields the quadratic coefficient a corresponding to that scale, and the correspondence between the quantization scale and a is recorded in the entropy rate fitting table. Assuming there are 64 quantization scales, 64 correspondences between quantization scales and quadratic coefficients can be recorded in the table.
Exemplarily, after the decoder obtains the coefficient hyper-parameter feature z_hat (the quantized side information), it applies the inverse coefficient hyper-parameter transform to z_hat to obtain the standard deviation σ. On this basis, the decoder can query the entropy rate fitting table for the target quantization scale corresponding to σ (the quantization scale closest to σ), then look up the target quadratic coefficient corresponding to the target quantization scale, and determine the target entropy gradient based on that quadratic coefficient.
Step S27: The decoder determines the target image feature based on the initial image feature, the entropy gradient offset step, the target entropy gradient, and the regularization term. Exemplarily, based on the entropy gradient offset step ρ, the initial image feature, the target entropy gradient, and the regularization term, the target image feature y can be determined using expression (13); that is, the initial image feature can be offset based on ρ, the target entropy gradient, and the regularization term, for example by offsetting the feature value of every feature point of the initial image feature. Of course, expression (13) is only an example.
In expression (13), f denotes the regularization term. The regularization term f may be determined, for example, based on the initial image feature, the standard deviation σ, and the residual feature (such as the quantized residual); it is used to improve the structural stability of the feature-domain representation and to enhance performance under different distortion evaluation metrics. Also in expression (13), the gradient term denotes the target entropy gradient, a denotes the target quadratic coefficient, and N denotes the total number of quantization scales, e.g., 64. Further, the initial image feature in expression (13) is obtained from the residual feature r_hat and the prediction mu, and ρ denotes the entropy gradient offset step, which is decoded from the header information bitstream corresponding to the current image block.
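Expression (13) itself is not reproduced in this text, so the update below is only a hedged sketch of the described operation: every feature point of the initial feature is shifted by the step ρ times a combination of the target entropy gradient and the regularization term f. The sign of the shift and the exact combination are assumptions.

```python
import numpy as np

def legs_shift(y_hat, entropy_grad, f, rho):
    """Offset every feature point of the initial image feature y_hat.
    The minus sign (descending the rate surface) is an assumption;
    expression (13) in the patent may combine the terms differently."""
    return y_hat - rho * (entropy_grad + f)

y_hat = np.array([1.0, -2.0, 0.5])   # toy initial image feature
grad = np.array([0.2, -0.4, 0.1])    # toy per-point target entropy gradient
f = np.zeros(3)                      # regularization term (zero in this toy)
y = legs_shift(y_hat, grad, f, rho=0.5)
```

Because the shift is elementwise, it applies equally to a 1-D toy vector and to a full C×H×W feature tensor.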
Step S28: The decoder obtains the reconstructed image block corresponding to the current image block based on the target image feature. For example, the target image feature is input to the synthesis transform network, which performs the synthesis transform on the target image feature to obtain the reconstructed image block. Here the target image feature is the enhanced main information, and the reconstructed image block may be denoted x_hat.
Embodiment 8: This embodiment of the present application proposes an encoding method; see Fig. 6D and Fig. 6E for schematic flowcharts of the encoding method. The method can be applied to an encoding end (also called a video encoder) and may include:
Step S31: After obtaining the current image block x, the encoder performs the analysis transform on x through the analysis transform network to obtain the image feature y corresponding to x; the image feature y may also be called the feature main information y.
Step S32: After obtaining the image feature y, the encoder may further perform the coefficient hyper-parameter feature transform on y to obtain the coefficient hyper-parameter feature z. For example, the encoder may input y to the hyper-parameter encoding network, which performs the coefficient hyper-parameter feature transform on y to obtain z. The hyper-parameter encoding network may also be called the hyper-prior encoding network, and the coefficient hyper-parameter feature z may also be called the feature side information z.
Step S33: After obtaining the coefficient hyper-parameter feature z, the encoder may quantize z to obtain the hyper-parameter quantization feature corresponding to z; the hyper-parameter quantization feature may also be called the quantized feature side information.
Step S34: After obtaining the hyper-parameter quantization feature corresponding to the coefficient hyper-parameter feature z, the encoder encodes it to obtain Bitstream#1 (the first bitstream). For example, the encoder applies a fixed factorized probability density pf(·) to the hyper-parameter quantization feature and encodes it into the first bitstream with a lossless entropy encoder; the first bitstream may also be called the side information bitstream.
After obtaining Bitstream#1 corresponding to the current image block, the encoder may send it to the decoder; for how the decoder processes Bitstream#1 corresponding to the current image block, see Embodiments 5-7.
Step S35: After obtaining Bitstream#1, the encoder may also decode it to obtain the hyper-parameter quantization feature, for example by using the fixed factorized probability density pf(·) and a lossless entropy decoder, and then dequantize the hyper-parameter quantization feature to obtain the coefficient hyper-parameter feature z_hat.
Step S36: After obtaining z_hat, the encoder may apply the inverse coefficient hyper-parameter transform to z_hat to obtain the standard deviation σ. For example, z_hat is input to the probability hyper-parameter decoding network, which performs the inverse transform to obtain σ; after σ is obtained, a probability distribution model can be generated based on σ. The probability hyper-parameter decoding network may also be called the hyper-prior scale decoding network, and σ may also be called the probability distribution scale of the main information quantization feature.
Step S37: After obtaining the image feature y, the encoder may determine the residual feature r based on y and the prediction mu; for example, the difference between y and mu can be taken as r. After obtaining r, the encoder may quantize r to obtain the image quantization feature.
Exemplarily, after obtaining z_hat, the encoder may perform context-based prediction using the coefficient hyper-parameter feature z_hat of the current image block and the residual feature y_hat of the preceding image block to obtain the prediction mu (i.e., the mean mu) corresponding to the current image block; for example, z_hat and y_hat are input to the mean prediction network, which determines mu based on them.
Step S38: After obtaining the image quantization feature, the encoder may encode it to obtain Bitstream#2 (the second bitstream) corresponding to the current image block. For example, the encoder may generate a probability distribution model based on the standard deviation σ and use that model to encode the image quantization feature into Bitstream#2.
After obtaining Bitstream#2 corresponding to the current image block, the encoder may send it to the decoder; for how the decoder processes Bitstream#2 corresponding to the current image block, see Embodiments 5-7.
Step S39: After obtaining Bitstream#2, the encoder may also decode it to obtain the decoded image feature, which may be the image quantization feature; for example, the encoder decodes Bitstream#2 based on the probability distribution model to obtain the image quantization feature. After obtaining the image quantization feature, the encoder may dequantize it to obtain the residual feature r_hat, i.e., the quantized residual.
Step S40: The encoder determines the initial image feature corresponding to the current image block based on the residual feature. For example, after obtaining r_hat, the encoder determines the image feature y_hat based on r_hat and the prediction mu (i.e., the mean mu); y_hat is the initial image feature corresponding to the current image block. For example, the encoder takes the sum of r_hat and mu as the initial image feature; the initial image feature may also be called the main information quantization value.
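Steps S37 through S40 form a quantization round trip on the residual; a minimal sketch with plain rounding as the quantizer (the patent does not fix a particular quantizer, so rounding here is an assumption):

```python
import numpy as np

def encode_residual(y, mu):
    r = y - mu          # step S37: residual feature r
    return np.round(r)  # quantize r to get the image quantization feature

def decode_residual(q, mu):
    r_hat = q           # step S39: dequantize (identity for unit-step rounding)
    return r_hat + mu   # step S40: initial image feature y_hat = r_hat + mu

y = np.array([1.7, -0.2, 3.1])   # toy image feature
mu = np.array([1.0, 0.0, 3.0])   # toy prediction
q = encode_residual(y, mu)
y_hat = decode_residual(q, mu)
```

With unit-step rounding, the reconstructed y_hat differs from y by at most half a quantization step per feature point.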
Step S41: For each entropy gradient offset step, the encoder determines a target image feature based on the initial image feature and that offset step, and obtains the reconstructed image block corresponding to that offset step based on the target image feature.
In one possible implementation, multiple entropy gradient offset steps may be preconfigured, e.g., 10 of them. For each offset step, the target image feature may be determined based on the initial image feature, that offset step, and the entropy gradient offset value. The entropy gradient offset value may be determined as follows: determine the residual increment feature corresponding to the residual feature; compute the entropy of the residual feature and the entropy of the residual increment feature; determine the entropy gradient of the residual feature based on those two entropies; and determine the entropy gradient offset value based on the entropy gradient of the residual feature and the regularization term. For how to determine the target image feature, see step S16 of Embodiment 5 and Embodiment 6, which are not repeated here.
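The gradient computation just described (entropy of the residual, entropy of an incremented residual, and their difference) amounts to a finite-difference entropy gradient. A hedged sketch under a Gaussian rate model with unit rounding bins; the patent's exact entropy model is not specified here, so both the model and the increment size are assumptions:

```python
import math

def rate_bits(r, sigma):
    """-log2 of the probability mass of the rounding bin around r
    under a zero-mean Gaussian with scale sigma (assumed model)."""
    cdf = lambda v: 0.5 * (1.0 + math.erf(v / (sigma * math.sqrt(2.0))))
    return -math.log2(max(cdf(r + 0.5) - cdf(r - 0.5), 1e-12))

def entropy_gradient(r, sigma, delta=1e-3):
    """Finite difference between the entropy of the incremented residual
    and the entropy of the residual: (H(r + delta) - H(r)) / delta."""
    return (rate_bits(r + delta, sigma) - rate_bits(r, sigma)) / delta
```

Under this model the gradient points away from zero: positive residuals have a positive gradient (rate grows as the residual grows) and negative residuals a negative one.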
In another possible implementation, multiple entropy gradient offset steps may be preconfigured. For each offset step: query the entropy rate fitting table for the target quantization scale corresponding to the standard deviation (the quantization scale closest to it); look up the target quadratic coefficient corresponding to the target quantization scale; determine the target entropy gradient corresponding to the current image block based on that coefficient; and then determine the target image feature based on the initial image feature, the offset step, the target entropy gradient, and the regularization term. The entropy rate fitting table may be obtained, for example, as follows: for multiple standard deviations between the minimum and maximum standard deviations, uniformly quantize them on a logarithmic scale to obtain the corresponding quantization scales; for each quantization scale, perform an entropy rate fit to obtain its quadratic coefficient and record the correspondence between the scale and the coefficient in the table. For how to determine the target image feature, see steps S26 and S27 of Embodiment 5, which are not repeated here.
For each entropy gradient offset step, after obtaining the target image feature corresponding to that step, the reconstructed image block corresponding to that step can be obtained based on the target image feature: for example, the target image feature is input to the synthesis transform network, which performs the synthesis transform to obtain the reconstructed image block.
Step S42: Based on the current image block and the reconstructed image block corresponding to each entropy gradient offset step, the encoder selects, from all offset steps, the entropy gradient offset step corresponding to the current image block (i.e., the offset step finally adopted).
Exemplarily, for each entropy gradient offset step, the encoder may determine the cost value corresponding to that step based on the current image block and the reconstructed image block corresponding to that step (i.e., the cost value is determined from the difference between the current image block and the reconstructed image block; the determination method is not limited). Based on the cost value of each offset step, the offset step with the minimum cost value may be selected as the entropy gradient offset step corresponding to the current image block.
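The search in steps S41 and S42 reduces to an argmin over the candidate steps. In the sketch below, `reconstruct` stands in for the LEGS shift plus the synthesis transform and `cost` for the unspecified distortion measure; both are placeholders supplied by the caller.

```python
def select_offset_step(x, steps, reconstruct, cost):
    """Return the candidate entropy gradient offset step with minimum cost.
    reconstruct(x, step) -> reconstructed block; cost(x, x_rec) -> distortion."""
    best_step, best_cost = None, float("inf")
    for step in steps:
        c = cost(x, reconstruct(x, step))
        if c < best_cost:
            best_step, best_cost = step, c
    return best_step

# Toy check: with a cost minimized near 0.28, the closest candidate wins.
steps = [0.0, 0.1, 0.2, 0.3]
best = select_offset_step(None, steps, lambda x, s: s,
                          lambda x, rec: (rec - 0.28) ** 2)
```

On ties, the earliest candidate in the list is kept, which matches a simple sequential scan.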
Step S43: The encoder encodes the entropy gradient offset step corresponding to the current image block into the header information bitstream corresponding to the current image block. For example, the value of the offset step is encoded into the header information bitstream. Alternatively, if a configured entropy gradient offset step table includes at least one offset step, indication information corresponding to the offset step is encoded into the header information bitstream; this indication information indicates the index of the offset step, i.e., which entry in the offset step table it is.
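Signalling the chosen step as an index into a configured table needs only a fixed-length code of ceil(log2(table size)) bits per block. A hedged sketch; the patent does not specify the binarization, and the table values below are purely illustrative:

```python
import math

STEP_TABLE = [0.0, 0.05, 0.1, 0.2, 0.4, 0.8, 1.6, 3.2]  # hypothetical values

def encode_step_index(step, table=STEP_TABLE):
    """Fixed-length binarization of the index of `step` in the configured table."""
    nbits = max(1, math.ceil(math.log2(len(table))))
    idx = table.index(step)
    return format(idx, f"0{nbits}b")

def decode_step_index(bits, table=STEP_TABLE):
    """Inverse of encode_step_index: parse the fixed-length index."""
    return table[int(bits, 2)]
```

With 8 entries the index costs 3 bits, so, as the text notes later, the bitstream overhead of signalling the step is negligible.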
In the above embodiments, the process of performing feature enhancement on the initial image feature with the entropy gradient offset step to obtain the target image feature may be called the LEGS (Latent Entropy Gradient Shift) enhancement method; for the use of the LEGS module in an end-to-end image coding framework, see Fig. 6D and Fig. 6E. In Fig. 6D and Fig. 6E, dashed lines distinguish new modules and new functions and do not indicate that they are optional; for example, header information encoding and header information decoding are new functions of this embodiment, while the LEGS module is a new module of this embodiment. Exemplarily, the LEGS module may be a hardware module, may be executed by a neural network, or may be implemented in software code; there is no limitation on this, as long as the function of the LEGS module can be realized. For example, at the encoding end, the LEGS module receives the main information quantization feature and the probability model parameters, evaluates the optimal offset step (i.e., the entropy gradient offset step corresponding to the current image block), and encodes that offset step into the header information bitstream.
At the decoding end, the LEGS module uses the entropy gradient offset step corresponding to the current image block to perform offset enhancement on the main information quantization feature (i.e., the initial image feature) to obtain the target image feature. Exemplarily, the LEGS module may select the ρ with the smallest distortion loss as the optimal offset step and encode the optimal offset step into the header information bitstream as a fixed-length code, i.e., explicitly into the header information bitstream.
Exemplarily, when the encoder uses the LEGS module to determine the optimal offset step, the change in bitstream size caused by encoding the optimal offset step is negligible; therefore, the following distortion loss can be used as the criterion for determining the optimal offset step:
Embodiment 9: In Embodiment 8, multiple entropy gradient offset steps need to be configured, the reconstructed image block corresponding to each offset step is determined, and the offset step corresponding to the current image block is selected from all offset steps based on those reconstructed image blocks. To select the offset step corresponding to the current image block, the following approach may be used: for each offset step, determine the target cost value corresponding to that step based on the current image block and the reconstructed image block corresponding to that step, and determine the fidelity corresponding to that step based on the target cost value and a reference cost value; then, based on the fidelity of each offset step, select the offset step with the maximum fidelity as the offset step corresponding to the current image block. The reference cost value may be obtained as follows: after obtaining the initial image feature of the current image block through a first neural network, obtain a reconstructed image block through a second neural network based on the initial image feature, and determine the reference cost value based on the current image block and that reconstructed image block.
For example, in the JPEG-AI end-to-end image coding framework, the VMAF (Video Multimethod Assessment Fusion) and FSIM (Feature Similarity Index Measure) metrics can be used as the criteria by which the LEGS module evaluates distortion; the VMAF metric rises after the LEGS adjustment, while the FSIM metric falls after the LEGS adjustment.
After obtaining the initial image feature corresponding to the current image block, the initial image feature is input directly to the synthesis transform network without adjustment to obtain a reconstructed image block. Based on the current image block and that reconstructed image block, the reconstruction metrics fsim_org and VMAF_org can be obtained and used as the reference cost values.
The initial image feature can be adjusted with entropy gradient offset step 1 to obtain target image feature 1, which is input to the synthesis transform network to obtain the reconstructed image block corresponding to offset step 1. Based on the current image block and that reconstructed image block, the reconstruction metrics fsim_now and VMAF_now can be obtained and used as the target cost values corresponding to offset step 1.
Likewise, the initial image feature can be adjusted with entropy gradient offset step 2 to obtain target image feature 2, which is input to the synthesis transform network to obtain the reconstructed image block corresponding to offset step 2. Based on the current image block and that reconstructed image block, the reconstruction metrics fsim_now and VMAF_now can be obtained and used as the target cost values corresponding to offset step 2.
By analogy, the target cost value corresponding to each entropy gradient offset step can be obtained.
Exemplarily, for each entropy gradient offset step, expression (15) can be used to determine the fidelity corresponding to that step based on its target cost value and the reference cost value:
In expression (15), fsim_now and VMAF_now denote the target cost values corresponding to the offset step, and fsim_org and VMAF_org denote the reference cost values. The constants 1 and 100 are empirically configured values used to normalize the VMAF and FSIM metrics to the same order of magnitude; they can be adjusted to other values, and no limitation is imposed on this.
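Expression (15) is not reproduced in this text, so the sketch below is only one plausible reading of the description: the improvement in VMAF (which rises after adjustment) and the change in FSIM (which falls) relative to the unadjusted reconstruction, normalized with the constants 100 and 1. The exact combination in the patent may differ.

```python
def fidelity(fsim_now, vmaf_now, fsim_org, vmaf_org):
    """Hypothetical fidelity score combining the two metric deltas.
    VMAF lives on a 0-100 scale and FSIM on a 0-1 scale, hence the
    normalizers 100 and 1; this is an assumed form of expression (15)."""
    return (vmaf_now - vmaf_org) / 100.0 + (fsim_now - fsim_org) / 1.0

# Toy values: VMAF gained 2 points, FSIM dropped by 0.01.
score = fidelity(fsim_now=0.98, vmaf_now=92.0, fsim_org=0.99, vmaf_org=90.0)
```

A positive score would mean the VMAF gain outweighs the FSIM loss after normalization; the encoder would then pick the step maximizing this score.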
Of course, expression (15) is only an example; this embodiment does not limit the expression for determining the fidelity.
Exemplarily, after obtaining the fidelity of each entropy gradient offset step, the offset step with the maximum fidelity can be selected as the entropy gradient offset step corresponding to the current image block.
Exemplarily, when multiple entropy gradient offset steps are configured (e.g., 10, 12, or 16 of them), one offset step may be 0. In that case, if the offset step with the maximum fidelity is 0, no entropy gradient offset step needs to be encoded; that is, no feature enhancement with an entropy gradient offset step is applied.
Embodiment 10: See Fig. 6F for the processing flow of adding the LEGS module to the JPEG-AI decoder; Fig. 6F also shows the position of the LEGS module in the JPEG-AI end-to-end image coding framework. The side information bitstream (the first bitstream) yields the side information feature through a fixed factorized model; the side information feature passes through the hyper-prior scale decoding network to yield the scale of the normal-distribution probability model; and the normal probability density model together with the main information bitstream (the second bitstream) undergoes lossless entropy decoding to yield the main information feature residual. The quantized main information feature residual r_hat is input to the hyper-prior decoding network, and the main information quantization feature y_hat is obtained through the autoregressive process of the context network and the prediction fusion network. After obtaining the entropy gradient offset step from the header information bitstream, the LEGS module adjusts y_hat via the entropy gradient; the adjusted feature passes through the synthesis transform network to yield the reconstructed image block.
Embodiment 11: Taking the end-to-end image coding framework JPEG-AI as an example: for an image sample x, such as the current image block x or the current image x, x may be a luminance component or a chrominance component. The size of x is Cp×H×W; it can be split into a luminance component xY and a chrominance component xUV, where xY has size 1×H×W and xUV has size 2×H×W. For the JPEG-AI per-component encode-decode reconstruction flow and the position of the LEGS module, see Fig. 6G, which is the reconstruction flowchart of a JPEG-AI component.
Taking the luminance component x_Y as an example, x_Y passes through the analysis transform network (Analysis Transform Net, g_a) to obtain the feature tensor y_Y of size C_p×h_Y×w_Y. y_Y passes through the hyper-prior encoder network (Hyper Encoder Net) to obtain the hyper-prior feature tensor z_Y of size C_p×h_hpY×w_hpY. Each element (i.e., each feature point) of z_Y has a fixed cumulative distribution function (CDF); elements in the same channel of z_Y share the same CDF, and the CDFs of all elements are given by a factorized entropy model (Factorized Entropy Model). z_Y is rounded to obtain the quantized value ẑ_Y, and ẑ_Y and the CDFs are jointly input to a lossless entropy encoder to obtain bitstream 1 (BitStream#1); bitstream 1 is passed through a lossless entropy decoder to recover ẑ_Y, which is decoded by the hyper-prior decoder into a tensor of size 2C_p×h_hpY×w_hpY. ŷ_Y, the quantized value of y_Y, has size 2C_p×h_pY×w_pY. ŷ_Y passes through the context model (Context Model Net) to obtain ζ of size 2C_p×h_hpY×w_hpY; the hyper-prior decoder output and ζ are concatenated along the channel dimension and jointly input to the prediction fusion network (Prediction Fusion Net) to obtain the mean μ_y of size C_p×h_hpY×w_hpY.
y_Y − μ_y gives the residual r_y, and r_y is rounded to obtain the quantized residual r̂_y; adding μ_y back gives ŷ_Y. The generation of μ_y, r_y, and ŷ_Y is an autoregressive process, i.e., it proceeds pixel by pixel in raster order over the two spatial dimensions. A hyper-prior scale decoding network produces σ_Y of size C_p×h_Y×w_Y, which serves as the parameter that generates a CDF for each element of r̂_y. r̂_y and the corresponding CDFs are input to the lossless entropy encoder to obtain bitstream 2 (BitStream#2); bitstream 2 is losslessly entropy decoded to recover r̂_y, which is added to μ_y to give ŷ_Y. ŷ_Y then enters the synthesis transform network to obtain the reconstructed luminance component x̂_Y.
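The mean-removal and rounding steps above can be sketched as follows. This is a minimal illustration, not the actual JPEG-AI implementation; the names `y` and `mu` simply follow the symbols y_Y and μ_y in the text:

```python
def quantize_with_mean(y, mu):
    """Per feature point: r = y - mu, r_hat = round(r), y_hat = r_hat + mu.
    r_hat is what would be entropy-coded into bitstream 2; y_hat is the
    quantized feature fed to the synthesis transform network."""
    r_hat = [round(yi - mi) for yi, mi in zip(y, mu)]
    y_hat = [rh + mi for rh, mi in zip(r_hat, mu)]
    return r_hat, y_hat

# When the prediction mu is close to y, the quantized residual is all zeros
# and the reconstruction collapses to the predicted mean.
r_hat, y_hat = quantize_with_mean([1.9, -0.4, 3.2], [2.0, 0.0, 3.0])
```

This also makes visible why a good mean prediction μ_y saves bits: a residual that rounds to zero costs almost nothing to entropy-code.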
Exemplarily, the chrominance component x_UV is processed similarly to the luminance component x_Y, yielding the reconstructed chrominance component x̂_UV. The reconstructed luminance component x̂_Y and the reconstructed chrominance component x̂_UV are concatenated and passed through the chroma upsampling module to obtain the reconstructed image x̂.
Adding the LEGS module to the end-to-end image codec network enhances the representation capability of the quantized features ŷ_Y and ŷ_UV before they enter the synthesis transform network, reducing the distortion of the reconstructed image and improving coding performance. In the encoding process, before bitstream 2 is generated, the encoder evaluates the impact of the offset on distortion and selects the optimal entropy gradient offset step size; this optimal step size is encoded into the header information bitstream, but the quantized residual written into the bitstream is not itself offset. In the decoding process, the entropy gradient offset step size is obtained by decoding the header information bitstream, and before the synthesis transform network the LEGS module applies the offset to the quantized feature. For the position of the entropy gradient offset step size in the JPEG-AI framework, see FIG. 6G, where the dashed box indicates its position.
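The encoder-side search just described (evaluate each candidate offset step, keep the one with the lowest distortion, signal it in the header) can be sketched as below. `decode_distortion` is a hypothetical stand-in for running the synthesis transform and measuring distortion against the source, and the entropy gradient is assumed precomputed:

```python
def select_offset_step(y_hat, grad, candidates, decode_distortion):
    """Return the entropy-gradient offset step minimizing distortion.
    Step 0.0 (no offset) is the baseline, since the residual written to
    the bitstream itself is never offset."""
    best_step = 0.0
    best_d = decode_distortion(y_hat)
    for s in candidates:
        shifted = [v - s * g for v, g in zip(y_hat, grad)]
        d = decode_distortion(shifted)
        if d < best_d:
            best_step, best_d = s, d
    return best_step
```

For example, with a toy squared-error distortion against a known target, the search returns whichever candidate step lands the shifted feature closest to that target.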
Embodiment 12: An embodiment of the present application proposes a decoding method. FIG. 7A is a flowchart of the decoding method. The method may be applied to a decoding end (also called a video decoder) and may include:
Step 711: Obtain the residual feature corresponding to the current image block through a first neural network.
Exemplarily, obtaining the residual feature corresponding to the current image block through the first neural network may include but is not limited to: obtaining a standard deviation through the first neural network based on a first bitstream corresponding to the current image block; determining a probability distribution model based on the standard deviation, and decoding a second bitstream corresponding to the current image block based on the probability distribution model to obtain decoded image features; and determining the residual feature corresponding to the current image block based on the decoded image features. Exemplarily, the first neural network may include but is not limited to a probability hyperparameter decoding network. Obtaining the standard deviation through the first neural network based on the first bitstream may include but is not limited to: decoding the first bitstream based on a factorized probability model to obtain decoded image features, and determining a coefficient hyperparameter feature based on the decoded image features; and inputting the coefficient hyperparameter feature into the probability hyperparameter decoding network, which inversely transforms the coefficient hyperparameter feature to obtain the standard deviation.
Step 712: Generate an initial image feature based on the residual feature and a mean feature corresponding to the current image block.
Step 713: Perform feature-domain enhancement on a target channel of the initial image feature to obtain a target image feature.
Exemplarily, the initial image feature may include multiple channels. Based on the consumed bit rate corresponding to each channel, the channels may be sorted in descending order of consumed bit rate and the top K channels selected as target channels, where K is a positive integer; alternatively, the channels whose consumed bit rate exceeds a preset bit-rate threshold may be selected as target channels.
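Both selection rules above (top-K by consumed bit rate, or a bit-rate threshold) can be sketched as follows; the rates here are illustrative placeholders:

```python
def select_target_channels(channel_rates, k=None, rate_threshold=None):
    """channel_rates[c] is the consumed bit rate of channel c.
    With k set: return the indices of the K channels with the largest rates.
    With rate_threshold set: return all channels whose rate exceeds it."""
    if k is not None:
        order = sorted(range(len(channel_rates)),
                       key=lambda c: channel_rates[c], reverse=True)
        return sorted(order[:k])
    return [c for c, r in enumerate(channel_rates) if r > rate_threshold]

rates = [5.0, 12.5, 3.1, 9.8]
top2 = select_target_channels(rates, k=2)                   # channels 1 and 3
over9 = select_target_channels(rates, rate_threshold=9.0)   # also 1 and 3
```

Because both encoder and decoder can compute the per-channel rates from the bitstream, this selection needs no explicit channel identifiers in the header (as noted later in Embodiment 14).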
Exemplarily, performing feature-domain enhancement on the target channel of the initial image feature may include but is not limited to: if the target channel includes multiple feature points, selecting target feature points from those feature points, where either all feature points of the target channel are target feature points, or, for each feature point, the feature point is a target feature point if its corresponding standard deviation is greater than a standard-deviation threshold and is not a target feature point otherwise. After the target feature points are obtained, for each target feature point, a feature compensation value corresponding to that target feature point may be determined, and the feature value of the target feature point may be compensated based on the feature compensation value.
Exemplarily, determining the feature compensation value corresponding to a target feature point may include but is not limited to: decoding the header information bitstream corresponding to the current image block to obtain the adjustment sign bit corresponding to the target feature point, and determining the sign of the feature compensation value based on the adjustment sign bit. In addition, determining the feature compensation value may further include but is not limited to: decoding the header information bitstream to obtain amplitude indication information corresponding to the target feature point and determining the amplitude of the feature compensation value based on the amplitude indication information; or obtaining a configured fixed amplitude and determining the amplitude of the feature compensation value based on the fixed amplitude.
Exemplarily, determining the amplitude of the feature compensation value based on the amplitude indication information may include but is not limited to: if the amplitude indication information includes a position index into an amplitude list (indicating which entry of the amplitude list to use), selecting the amplitude corresponding to the position index from the amplitude list and taking the selected amplitude as the amplitude of the feature compensation value.
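A minimal sketch of the amplitude lookup just described; the list contents are hypothetical placeholders (they reuse the example amplitudes 0.25/0.75/1.25/1.75 mentioned later in the text):

```python
AMPLITUDE_LIST = [0.25, 0.75, 1.25, 1.75]  # assumed example list

def amplitude_from_index(amplitude_list, position_index):
    """The header bitstream signals a position index; the decoder selects
    the amplitude at that position in the preset amplitude list."""
    return amplitude_list[position_index]
```

Signaling an index into a short preset list costs only a couple of bits per target feature point, versus transmitting an arbitrary amplitude.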
Step 714: Obtain a reconstructed image block through a second neural network based on the target image feature.
Exemplarily, the first neural network includes at least one convolutional layer, and the second neural network includes at least one convolutional layer.
Exemplarily, the second neural network may include a synthesis transform network. Obtaining the reconstructed image block through the second neural network based on the target image feature may include but is not limited to: inputting the target image feature into the synthesis transform network, which performs a synthesis transform on the target image feature to obtain the reconstructed image block corresponding to the current image block.
Exemplarily, the above execution order is given only for convenience of description. In practical applications, the execution order of the steps may be changed, and no limitation is imposed on this order. Moreover, in other embodiments, the steps of the corresponding method are not necessarily executed in the order shown and described in this specification, and the method may include more or fewer steps than described. In addition, a single step described in this specification may be decomposed into multiple steps in other embodiments, and multiple steps described in this specification may be combined into a single step in other embodiments.
It can be seen from the above technical solution that, in this embodiment of the present application, the residual feature corresponding to the current image block is obtained through the first neural network, an initial image feature is generated based on the residual feature, a target image feature is obtained by performing feature-domain enhancement on the target channel of the initial image feature, and the reconstructed image block corresponding to the current image block is obtained through the second neural network based on the target image feature. An end-to-end video image compression method is thus proposed that can encode and decode video images based on neural networks. Performing feature-domain enhancement on the target channel improves encoding and decoding efficiency, so that the neural network effectively guarantees the quality of the reconstructed image block while maintaining low complexity, thereby improving encoding and decoding performance.
Embodiment 13: An embodiment of the present application proposes an encoding method. FIG. 7B is a flowchart of the encoding method. The method may be applied to an encoding end (also called a video encoder) and may include:
Step 721: Obtain the residual feature corresponding to the current image block.
Step 722: Quantize the residual feature to obtain the initial image feature corresponding to the current image block.
Step 723: Determine, based on the target channel of the initial image feature corresponding to the current image block and the residual feature (i.e., the residual feature corresponding to the current image block), the feature compensation value corresponding to a target feature point in the target channel.
Exemplarily, the initial image feature may include multiple channels. Based on the consumed bit rate corresponding to each channel, the channels may be sorted in descending order of consumed bit rate and the top K channels selected as target channels, where K is a positive integer; alternatively, the channels whose consumed bit rate exceeds a preset bit-rate threshold may be selected as target channels.
Step 724: Encode indication information of the feature compensation value into the header information bitstream corresponding to the current image block.
Exemplarily, the indication information of the feature compensation value may include the adjustment sign bit corresponding to the target feature point. If the difference between the feature value of the target feature point in the residual feature and its feature value in the target channel is greater than 0, the adjustment sign bit corresponding to the target feature point is positive; if the difference is less than 0, the adjustment sign bit is negative.
Exemplarily, the indication information of the feature compensation value includes amplitude indication information corresponding to the target feature point; if the amplitude indication information includes a position index into an amplitude list, the position index indicates the position of the amplitude of the feature compensation value in the amplitude list.
Exemplarily, the above execution order is given only for convenience of description. In practical applications, the execution order of the steps may be changed, and no limitation is imposed on this order. Moreover, in other embodiments, the steps of the corresponding method are not necessarily executed in the order shown and described in this specification, and the method may include more or fewer steps than described. In addition, a single step described in this specification may be decomposed into multiple steps in other embodiments, and multiple steps described in this specification may be combined into a single step in other embodiments.
It can be seen from the above technical solution that, in this embodiment of the present application, by performing feature-domain enhancement on the target channel of the initial image feature, an end-to-end video image compression method is proposed that can encode and decode video images based on neural networks. Performing feature-domain enhancement on the target channel improves encoding and decoding efficiency, so that the neural network effectively guarantees the quality of the reconstructed image block while maintaining low complexity, thereby improving encoding and decoding performance.
Embodiment 14: In Embodiments 12 and 13, the encoding end and the decoding end can obtain the residual feature and the initial image feature; the acquisition process is described in Embodiments 1-11 and is not repeated here. After the target image feature is obtained, the reconstructed image block can be obtained based on the target image feature; this process is likewise described in Embodiments 1-11 and is not repeated here. The following describes the process of obtaining the target image feature based on the initial image feature.
Exemplarily, key channels of the initial image feature (referred to as target channels) may be compensated to perform feature-domain enhancement and improve the quality of the reconstructed image. For example, compensation may be carried out with the following steps:
Step S51: Select a target channel from all channels of the initial image feature. For example, the initial image feature includes multiple channels, and the target channel may be selected from all channels based on the consumed bit rate corresponding to each channel.
In one possible implementation, the initial image feature may include multiple channels. Based on the consumed bit rate corresponding to each channel, the channels may be sorted in descending order of consumed bit rate and the top K channels selected as target channels, where K is a positive integer. Alternatively, the channels may be sorted in ascending order of consumed bit rate and the last K channels selected as target channels.
In another possible implementation, the initial image feature may include multiple channels, and based on the consumed bit rate corresponding to each channel, the channels whose consumed bit rate exceeds a preset bit-rate threshold may be selected as target channels.
Of course, the above are merely two examples, and the method of selecting the target channel is not limited.
For example, the luminance component feature of the initial image feature has 128 channels. If the number of target channels is 1, the initial image feature may be used to identify the channel with the largest consumed bit rate, and that channel is taken as the target channel.
Step S52: Perform feature-domain enhancement on the target channel of the initial image feature to obtain the target image feature.
Exemplarily, the target channel of the initial image feature may be denoted ŷ_important. The target channel has height h and width w, and since the target channel is a feature channel map of the initial image feature, it includes h*w feature points; all feature points of the target channel are selected as target feature points (i.e., there are h*w target feature points in total).
For each target feature point, the feature compensation value corresponding to that target feature point may be determined, and the feature value of the target feature point may be compensated based on the feature compensation value to obtain a compensated feature value. After the compensated feature value corresponding to each target feature point is obtained, the compensated feature values of all target feature points may be assembled into the target image feature.
To determine the feature compensation value corresponding to a target feature point, the adjustment sign bit corresponding to the target feature point (positive or negative) may first be determined, and then the amplitude corresponding to the target feature point, for example an amplitude A. If the adjustment sign bit is positive, the feature compensation value corresponding to the target feature point is -A: when compensating the feature value of the target feature point based on the feature compensation value, A is subtracted from the feature value to obtain the compensated feature value; that is, when the adjustment sign bit is positive, the feature value is compensated by -A. If the adjustment sign bit is negative, the feature compensation value is +A: A is added to the feature value to obtain the compensated feature value; that is, when the adjustment sign bit is negative, the feature value is compensated by +A.
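The sign convention above (positive adjustment sign bit → compensate by -A, negative → compensate by +A) can be sketched as:

```python
def compensate(feature_value, sign_positive, amplitude):
    """Apply the feature compensation value to one target feature point:
    -A when the adjustment sign bit is positive, +A when it is negative,
    following the convention stated in the text."""
    if sign_positive:
        return feature_value - amplitude
    return feature_value + amplitude
```

With the fixed amplitude 0.25 used later as an example, `compensate(1.0, True, 0.25)` yields 0.75 and `compensate(1.0, False, 0.25)` yields 1.25.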
To determine the adjustment sign bit corresponding to a target feature point, the encoding end may calculate it as follows. If the difference between the feature value of the target feature point in the residual feature and its feature value in the target channel is greater than 0, the adjustment sign bit is positive; if the difference is less than 0, the adjustment sign bit is negative. For example, the encoding end may compute the difference between the main information value and the quantized main information value of the target channel as diff = y_important - ŷ_important, where y_important denotes the feature value of the target feature point in the target channel of the residual feature and ŷ_important denotes the feature value of the target feature point in the target channel of the initial image feature. If diff is greater than 0, the adjustment sign bit corresponding to the target feature point is positive; if diff is less than 0, it is negative.
Assuming the target channel includes h*w target feature points, the adjustment sign bit corresponding to each target feature point can be obtained, and the h*w adjustment sign bits are encoded into the header information bitstream as a 1-bit fixed-length code; that is, the header information bitstream carries h*w bits, each representing the adjustment sign bit (positive or negative) of one target feature point.
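The 1-bit fixed-length coding of the h*w sign bits can be sketched as a simple bit-packing routine. This is a sketch only; the actual header syntax and bit order are not specified in the text, so MSB-first packing is an assumption here:

```python
def pack_sign_bits(signs):
    """Pack adjustment sign bits (True = positive) into bytes, MSB first."""
    out = bytearray((len(signs) + 7) // 8)
    for i, positive in enumerate(signs):
        if positive:
            out[i // 8] |= 0x80 >> (i % 8)
    return bytes(out)

def unpack_sign_bits(data, n):
    """Recover n adjustment sign bits from the packed header bytes."""
    return [bool(data[i // 8] & (0x80 >> (i % 8))) for i in range(n)]
```

For h*w = 5 points with signs [+, -, +, +, -], a single header byte suffices, and the decoder recovers the same sign list from it.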
The decoding end can decode the header information bitstream corresponding to the current image block to obtain the adjustment sign bit corresponding to each target feature point, and based on the adjustment sign bit obtain the sign, positive or negative, corresponding to each target feature point.
To determine the amplitude corresponding to a target feature point, the decoding end may use a fixed amplitude, which can be configured empirically, e.g., 0.25, 0.75, 1.25, or 1.75. Assuming the fixed amplitude is 0.25, the decoding end uses 0.25 as the amplitude corresponding to the target feature point.
In summary, if the adjustment sign bit corresponding to the target feature point is positive, the feature compensation value corresponding to the target feature point is -0.25; if the adjustment sign bit is negative, the feature compensation value is +0.25.
Note that if the number of target channels is 1, i.e., the channel with the largest consumed bit rate serves as the target channel, the header information bitstream does not need to carry the identifier of the target channel; the decoding end can implicitly derive the target channel as the channel with the largest consumed bit rate. If there are multiple target channels, say X of them, the header information bitstream likewise does not need to carry the identifiers of the X target channels; the decoding end can implicitly derive the X target channels, with the derivation process as described in step S51.
Embodiment 15: In Embodiments 12 and 13, the encoding end and the decoding end can obtain the residual feature and the initial image feature; the acquisition process is described in Embodiments 1-11 and is not repeated here. After the target image feature is obtained, the reconstructed image block can be obtained based on the target image feature; this process is likewise described in Embodiments 1-11 and is not repeated here. The following describes the process of obtaining the target image feature based on the initial image feature.
Exemplarily, the quantization module has a residual-skip technique that directly zeroes the main information residual according to a set threshold on σ. As a result, some main information residuals are zeroed out even though they have relatively large values, causing a large per-point gap between ŷ and y. Therefore, besides the channel with the largest consumed bit rate serving as a target channel, some of the other channels may also be target channels.
Exemplarily, key channels of the initial image feature (referred to as target channels) may be compensated to perform feature-domain enhancement and improve the quality of the reconstructed image. For example, compensation may be carried out with the following steps:
Step S61: Select a target channel from all channels of the initial image feature. For example, the initial image feature includes multiple channels, and the target channel may be selected from all channels based on the consumed bit rate corresponding to each channel.
In one possible implementation, the initial image feature may include multiple channels. Based on the consumed bit rate corresponding to each channel, the channels may be sorted in descending order of consumed bit rate and the top K channels selected as target channels, where K is a positive integer. Alternatively, the channels may be sorted in ascending order of consumed bit rate and the last K channels selected as target channels.
In another possible implementation, the initial image feature may include multiple channels, and based on the consumed bit rate corresponding to each channel, the channels whose consumed bit rate exceeds a preset bit-rate threshold may be selected as target channels.
Of course, the above are merely two examples, and the method of selecting the target channel is not limited.
Step S62: The initial image feature may have one or more target channels. A target channel has height h and width w and is a feature channel map of the initial image feature, so the target channel includes h*w feature points. For each feature point, if the standard deviation corresponding to that point is greater than the standard deviation threshold, the point is selected as a target feature point; if the standard deviation is not greater than the threshold, the point is not selected. In this way, target feature points can be selected from among all feature points of the target channel.
For example, the target feature points in the target channels are determined according to a standard deviation threshold. Target feature points may be distributed over the feature channels (a feature channel is allowed to contain no target feature points). The threshold used to determine target feature points may be the same for every feature channel or differ between channels, and the number of target feature points depends on how the threshold is set.
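The per-point thresholding of step S62 can be sketched as follows; the σ map and the threshold value are illustrative (how the per-point standard deviations are obtained is described elsewhere in the text):

```python
def select_target_points(sigma_map, threshold):
    """Return (row, col) positions whose standard deviation exceeds the
    threshold, for one h x w target channel given as nested lists.

    A channel may legitimately yield no target points at all.
    """
    points = []
    for i, row in enumerate(sigma_map):
        for j, s in enumerate(row):
            if s > threshold:
                points.append((i, j))
    return points

# 2x3 channel; only two points exceed the (hypothetical) threshold 0.5.
sigma = [[0.2, 0.9, 0.4],
         [0.6, 0.1, 0.3]]
print(select_target_points(sigma, 0.5))  # -> [(0, 1), (1, 0)]
```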
Step S63: Perform feature-domain enhancement on the target channels of the initial image feature to obtain the target image feature.
For each target feature point, a feature compensation value corresponding to that point can be determined, and the point's feature value can be compensated based on that compensation value to obtain a compensated feature value. After a compensated feature value has been obtained for every target feature point, the compensated feature values of all target feature points can be assembled into the target image feature.
To determine the feature compensation value for a target feature point, the adjustment sign bit for the point is determined first; the sign bit may be positive or negative. The amplitude for the point is also determined, say amplitude A. If the adjustment sign bit is positive, the feature compensation value for the point is -A: compensation subtracts A from the point's feature value to obtain the compensated value, i.e., when the sign bit is positive, the feature value is compensated by -A. If the adjustment sign bit is negative, the feature compensation value is +A: compensation adds A to the point's feature value to obtain the compensated value, i.e., when the sign bit is negative, the feature value is compensated by +A.
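The sign/amplitude convention above (positive sign bit compensates by -A, negative by +A) can be sketched as:

```python
def compensate(value, sign_positive, amplitude):
    """Apply the feature compensation to one target feature point:
    a positive adjustment sign bit compensates by -A (subtract the
    amplitude), a negative one by +A (add the amplitude)."""
    return value - amplitude if sign_positive else value + amplitude

# Illustrative feature value 1.0 with amplitude A = 0.25.
print(compensate(1.0, True, 0.25))   # -> 0.75
print(compensate(1.0, False, 0.25))  # -> 1.25
```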
To determine the adjustment sign bit for a target feature point, the encoder can compute it as follows. If the difference between the point's feature value in the residual feature and its feature value in the target channel is greater than 0, the adjustment sign bit is positive; if that difference is less than 0, the sign bit is negative. For example, the encoder can compute the difference between the main-information value and the quantized main-information value of the target channel as diff = y_important − ŷ_important, where y_important denotes the point's feature value in the target channel of the residual feature and ŷ_important denotes its feature value in the target channel of the initial image feature. If diff is greater than 0, the adjustment sign bit of the point is positive; if diff is less than 0, it is negative.
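A minimal sketch of the encoder-side sign derivation, assuming diff is the plain difference between the two feature values as the surrounding text suggests:

```python
def adjustment_sign(y_important, y_hat_important):
    """Encoder-side sign derivation: diff = y_important - y_hat_important.

    y_important: feature value of the target point in the residual
    feature's target channel; y_hat_important: its value in the initial
    image feature's target channel. diff > 0 -> positive sign bit,
    diff < 0 -> negative.
    """
    diff = y_important - y_hat_important
    return diff > 0

print(adjustment_sign(1.3, 1.0))  # -> True  (positive sign bit)
print(adjustment_sign(0.8, 1.0))  # -> False (negative sign bit)
```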
For the target feature points of each target channel, a 1-bit fixed-length code can be used to encode the adjustment sign bits into the header-information bitstream, each bit representing the adjustment sign bit (positive or negative) of one target feature point.
The decoder can decode the header-information bitstream corresponding to the current image block to obtain the adjustment sign bit of each target feature point and, from the sign bit, the sign (positive or negative) of each target feature point.
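One way the 1-bit fixed-length code could be realized is plain bit packing; the MSB-first byte layout below is an assumption, since the text only fixes one bit per target feature point:

```python
def pack_sign_bits(signs):
    """Pack per-point adjustment sign bits (True = positive) into bytes,
    one bit per target feature point, MSB first."""
    out = bytearray((len(signs) + 7) // 8)
    for i, s in enumerate(signs):
        if s:
            out[i // 8] |= 0x80 >> (i % 8)
    return bytes(out)

def unpack_sign_bits(data, count):
    """Decoder side: recover `count` sign bits from the header bytes."""
    return [bool(data[i // 8] & (0x80 >> (i % 8))) for i in range(count)]

signs = [True, False, False, True, True]
packed = pack_sign_bits(signs)
print(unpack_sign_bits(packed, 5))  # -> [True, False, False, True, True]
```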
To determine the amplitude for a target feature point, both the encoder and the decoder maintain an amplitude list that may include multiple amplitudes, such as 0.25, 0.75, 1.25, and 1.75; these are only a few example amplitudes and impose no limitation. When the amplitude list includes the amplitudes 0.25, 0.75, 1.25, and 1.75, their order can be configured arbitrarily and is not limited; for example, the order may be 0.25, 0.75, 1.25, 1.75.
For example, the encoder can determine the amplitude for each target feature point individually (i.e., when a target channel includes multiple target feature points, a separate amplitude is determined for each point), or determine a single amplitude shared by all target feature points of the same target channel (i.e., when a target channel includes multiple target feature points, one amplitude is determined for all of them). Taking the shared-amplitude case as an example, the encoder can determine a target channel's amplitude from the average magnitude of diff over the channel (i.e., the mean of diff over all target feature points of the channel), and that amplitude is then used as the amplitude of every target feature point of the channel.
The encoder then encodes, in the header-information bitstream corresponding to the current image block, amplitude indication information for the target feature points; the indication information is a position index into the amplitude list. For example, if there are X target channels in total, X pieces of amplitude indication information are encoded in the header-information bitstream (one per target channel), and each piece may be a 2-bit codeword (the codeword may of course use more or fewer bits; this is not limited). Note that the identifiers of the X target channels need not be transmitted in the header-information bitstream: the decoder can derive the X target channels implicitly, as described in step S61. Clearly, given the threshold on σ, the encoder and the decoder arrive at the same target channels, so X itself need not be transmitted in the header-information bitstream.
Among the X pieces of amplitude indication information (each may be a 2-bit codeword), the first codeword gives the amplitude of the first target channel: "0" denotes the first amplitude in the list, e.g., 0.25, so the decoder concludes from the indication information that every target feature point of the first target channel has amplitude 0.25; "1" denotes the second amplitude, e.g., 0.75, so every target feature point of the first target channel has amplitude 0.75; "2" denotes the third amplitude, e.g., 1.25, so every target feature point of the first target channel has amplitude 1.25; and "3" denotes the fourth amplitude, e.g., 1.75, so every target feature point of the first target channel has amplitude 1.75.
The second codeword gives the amplitude of the second target channel: "0" denotes the first amplitude in the list, e.g., 0.25; "1" the second, e.g., 0.75; "2" the third, e.g., 1.25; and "3" the fourth, e.g., 1.75. The third codeword gives the amplitude of the third target channel, and so on, up to the X-th codeword, which gives the amplitude of the X-th target channel. The amplitude of every target channel is thus obtained; that is, for each target channel, the decoder can determine the amplitude of all of its target feature points.
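The codeword-to-amplitude mapping above is a simple list lookup; the amplitude list below uses the example values from the text:

```python
# Amplitude list shared by encoder and decoder (example values from the text).
AMPLITUDE_LIST = [0.25, 0.75, 1.25, 1.75]

def decode_amplitudes(codewords, amplitude_list=AMPLITUDE_LIST):
    """Map each 2-bit codeword (a position index into the amplitude list)
    to a per-channel amplitude; the i-th codeword belongs to the i-th
    target channel."""
    return [amplitude_list[c] for c in codewords]

# Three target channels whose codewords are 0, 3 and 1.
print(decode_amplitudes([0, 3, 1]))  # -> [0.25, 1.75, 0.75]
```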
To sum up, if a target feature point's adjustment sign bit is positive and its amplitude is 0.25, its feature compensation value is -0.25; if its adjustment sign bit is negative and its amplitude is 0.25, its feature compensation value is +0.25.
The embodiments above can be implemented individually or in combination; for example, each of Embodiments 1-15 can be implemented on its own, and any two or more of Embodiments 1-15 can be implemented in combination.
In the embodiments above, content described for the encoding end also applies to the decoding end, i.e., the decoding end may process in the same way, and content described for the decoding end also applies to the encoding end, i.e., the encoding end may process in the same way.
Based on the same application concept as the methods above, an embodiment of the present application further provides a decoding apparatus applied to the decoding end. The apparatus includes: a memory configured to store video data; and a decoder configured to implement the decoding methods of Embodiments 1-15 above, i.e., the decoder-side processing flow.
For example, in one possible implementation, the decoder is configured to implement:
obtaining, based on a first bitstream corresponding to a current image block, a standard deviation corresponding to the current image block through a probability hyperparameter decoding network; determining a probability distribution model based on the standard deviation, and decoding a second bitstream corresponding to the current image block based on the probability distribution model to obtain decoded image features; determining, based on the decoded image features, a residual feature corresponding to the current image block; and determining, based on the residual feature, an initial image feature corresponding to the current image block;
obtaining an entropy gradient offset step corresponding to the current image block;
performing feature enhancement on the initial image feature based on the entropy gradient offset step to obtain a target image feature; and
obtaining, based on the target image feature, a reconstructed image block corresponding to the current image block through a synthesis transform network.
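The decoder-side steps above can be sketched as a data-flow skeleton. Every component here (hyperparameter decoder, entropy coder, synthesis transform, enhancement) is a placeholder stand-in supplied by the caller, not the networks of the embodiments; only the order of the steps is taken from the text:

```python
from types import SimpleNamespace

def decode_block(bitstream1, bitstream2, header, nets):
    """Skeleton of the decoder-side flow: hyperparameter decode -> entropy
    decode -> residual feature -> initial feature -> enhancement ->
    synthesis. `nets` bundles the placeholder components."""
    sigma = nets.hyper_decode(bitstream1)             # standard deviation
    model = nets.gaussian_model(sigma)                # probability distribution model
    decoded = nets.entropy_decode(bitstream2, model)  # decoded image features
    residual = nets.to_residual(decoded)              # residual feature
    initial = nets.to_initial(residual)               # initial image feature
    enhanced = nets.enhance(initial, header.entropy_gradient_offset_step)
    return nets.synthesize(enhanced)                  # reconstructed image block

# Toy arithmetic stand-ins just to exercise the data flow end to end.
nets = SimpleNamespace(
    hyper_decode=lambda b: 1.0,
    gaussian_model=lambda s: s,
    entropy_decode=lambda b, m: b + m,
    to_residual=lambda d: 2 * d,
    to_initial=lambda r: r + 1,
    enhance=lambda x, step: x + step,
    synthesize=lambda t: t,
)
header = SimpleNamespace(entropy_gradient_offset_step=0.5)
print(decode_block(10.0, 2.0, header, nets))  # -> 7.5
```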
For example, the decoder is further configured to: decode the first bitstream corresponding to the current image block based on a factorized probability model to obtain decoded image features, and determine a coefficient hyperparameter feature based on the decoded image features; and input the coefficient hyperparameter feature into the probability hyperparameter decoding network, which inversely transforms the coefficient hyperparameter feature to obtain the standard deviation corresponding to the current image block.
For example, the decoder is further configured to: decode the header-information bitstream corresponding to the current image block to obtain the entropy gradient offset step; or decode the header-information bitstream corresponding to the current image block to obtain indication information corresponding to the entropy gradient offset step, and determine the entropy gradient offset step based on the indication information.
For example, the decoder is further configured to: if a configured entropy gradient offset step table includes at least one entropy gradient offset step and the indication information indicates an index value of an entropy gradient offset step, determine the entropy gradient offset step corresponding to the current image block from the table based on the index value.
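The index-based lookup can be sketched as follows; the step values in the table are hypothetical, only the index mechanism comes from the text:

```python
# Hypothetical configured step table; the bitstream carries only an index.
STEP_TABLE = [0.0, 0.1, 0.2, 0.4]

def step_from_index(index, table=STEP_TABLE):
    """Resolve the entropy gradient offset step from the index value
    carried by the indication information."""
    if not 0 <= index < len(table):
        raise ValueError("index outside configured step table")
    return table[index]

print(step_from_index(2))  # -> 0.2
```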
For example, the decoder is further configured to: perform feature enhancement on the initial image feature based on the entropy gradient offset step and entropy gradient feature information corresponding to the current image block to obtain the target image feature; the entropy gradient offset step denotes the offset step of the entropy gradient and is the correction step used when enhancing the initial image feature.
For example, the entropy gradient feature information includes an entropy gradient offset value, and the decoder is further configured to determine, based on the residual feature, the entropy gradient offset value corresponding to the current image block.
For example, the decoder is further configured to: determine a residual increment feature corresponding to the residual feature; compute the entropy of the residual feature and the entropy of the residual increment feature; determine the entropy gradient of the residual feature based on those two entropies; and determine the entropy gradient offset value based on the entropy gradient of the residual feature and a regularization term.
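A toy illustration of the finite-difference idea behind the entropy gradient offset value, for a single scalar residual. The Gaussian rate model, the increment δ, and the way the regularization term enters the result are all assumptions for illustration; the text does not fix them:

```python
import math

def rate(y, sigma):
    """Self-information (in bits) of residual value y under a zero-mean
    Gaussian with standard deviation sigma -- a stand-in for the entropy
    of the residual feature."""
    p = math.exp(-0.5 * (y / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))
    return -math.log2(max(p, 1e-12))

def entropy_gradient_offset(y, sigma, delta=1e-3, reg=1.0):
    """Entropy gradient via the increment: compare the entropy of the
    residual with the entropy of the incremented residual, then combine
    with a regularization term (here simply divided by `reg`)."""
    grad = (rate(y + delta, sigma) - rate(y, sigma)) / delta
    return grad / reg

# For y = 1, sigma = 1 the analytic gradient is log2(e) ≈ 1.4427 bits/unit.
print(round(entropy_gradient_offset(1.0, 1.0), 3))
```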
For example, the entropy gradient feature information includes a target entropy gradient, and the decoder is further configured to: query an entropy-rate fitting table for the target quantization scale corresponding to the standard deviation, the target quantization scale being the quantization scale closest to the standard deviation; query the entropy-rate fitting table for the target quadratic term corresponding to the target quantization scale, the table recording the correspondence between quantization scales and quadratic terms; and determine the target entropy gradient corresponding to the current image block based on the target quadratic term.
For example, the decoder is further configured to: for multiple standard deviations between a minimum standard deviation and a maximum standard deviation, uniformly quantize the standard deviations on a logarithmic scale to obtain the corresponding quantization scales; and, for each quantization scale, perform entropy-rate fitting on the quantization scale to obtain its quadratic term and record the correspondence between the quantization scale and the quadratic term in the entropy-rate fitting table.
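The table construction can be sketched as follows; the number of scales, the σ range, and the fitting routine are placeholders, only the log-scale uniform quantization and the scale-to-quadratic-term mapping come from the text:

```python
import math

def build_quant_scales(sigma_min, sigma_max, n):
    """Uniformly quantize [sigma_min, sigma_max] on a logarithmic scale
    into n quantization scales."""
    step = (math.log(sigma_max) - math.log(sigma_min)) / (n - 1)
    return [math.exp(math.log(sigma_min) + i * step) for i in range(n)]

def build_rate_fit_table(scales, fit_quadratic):
    """Record the scale -> quadratic-term correspondence; `fit_quadratic`
    stands in for the (unspecified) entropy-rate fitting routine."""
    return {s: fit_quadratic(s) for s in scales}

scales = build_quant_scales(0.1, 10.0, 5)
# With 5 log-uniform points between 0.1 and 10, the middle scale is 1.0.
print(round(scales[2], 6))  # -> 1.0
```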
For example, the decoder is further configured to determine the regularization term based on the initial image feature, the standard deviation, and the residual feature.
Based on the same application concept as the methods above, an embodiment of the present application further provides an encoding apparatus applied to the encoding end. The apparatus includes: a memory configured to store video data; and an encoder configured to implement the encoding methods of Embodiments 1-15 above, i.e., the encoder-side processing flow.
For example, in one possible implementation, the encoder is configured to implement:
obtaining, based on a first bitstream corresponding to a current image block, a standard deviation corresponding to the current image block through a probability hyperparameter decoding network; determining a probability distribution model based on the standard deviation, and decoding a second bitstream corresponding to the current image block based on the probability distribution model to obtain decoded image features; determining, based on the decoded image features, a residual feature corresponding to the current image block; and determining, based on the residual feature, an initial image feature corresponding to the current image block;
for each entropy gradient offset step, performing feature enhancement on the initial image feature based on that entropy gradient offset step to obtain a target image feature, and obtaining, based on the target image feature, a reconstructed image block corresponding to that entropy gradient offset step through a synthesis transform network; and
selecting, based on the current image block and the reconstructed image block corresponding to each entropy gradient offset step, the entropy gradient offset step corresponding to the current image block from among all the entropy gradient offset steps, and encoding the selected entropy gradient offset step in the header-information bitstream corresponding to the current image block.
For example, the encoder is further configured to: for each entropy gradient offset step, determine a target cost value corresponding to that step based on the current image block and the reconstructed image block corresponding to that step, and determine a fidelity degree corresponding to that step based on the target cost value and a reference cost value; and, based on the fidelity degree of each entropy gradient offset step, select the step with the greatest fidelity degree as the entropy gradient offset step corresponding to the current image block. The reference cost value is obtained as follows: after the initial image feature corresponding to the current image block is obtained, a reconstructed image block is obtained from the initial image feature through the synthesis transform network, and the reference cost value is determined based on the current image block and that reconstructed image block.
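The fidelity-based selection can be sketched as follows; modelling fidelity as cost reduction relative to the reference reconstruction is an assumption, since the text does not fix the exact fidelity measure:

```python
def select_offset_step(cost_by_step, reference_cost):
    """Pick the entropy gradient offset step with the greatest fidelity.

    cost_by_step: {step: target cost (e.g. distortion) of the block
    reconstructed with that step}. Fidelity is modelled here as the cost
    reduction relative to the reference reconstruction (no enhancement).
    """
    fidelity = {s: reference_cost - c for s, c in cost_by_step.items()}
    return max(fidelity, key=fidelity.get)

# Hypothetical costs per candidate step; step 0.1 improves most on the
# reference cost 10.0.
costs = {0.0: 10.0, 0.1: 8.5, 0.2: 9.2}
print(select_offset_step(costs, 10.0))  # -> 0.1
```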
Based on the same application concept as the methods above, an embodiment of the present application provides a decoding-end device (which may also be called a video decoder). At the hardware level, its architecture is shown in Fig. 8A. The device includes a processor 811 and a machine-readable storage medium 812; the machine-readable storage medium 812 stores machine-executable instructions executable by the processor 811, and the processor 811 executes those instructions to implement the decoding methods of Embodiments 1-15 of the present application.
For example, in one possible implementation, when executing the machine-executable instructions, the processor 811 implements:
obtaining, based on a first bitstream corresponding to a current image block, a standard deviation corresponding to the current image block through a probability hyperparameter decoding network; determining a probability distribution model based on the standard deviation, and decoding a second bitstream corresponding to the current image block based on the probability distribution model to obtain decoded image features; determining, based on the decoded image features, a residual feature corresponding to the current image block; and determining, based on the residual feature, an initial image feature corresponding to the current image block;
obtaining an entropy gradient offset step corresponding to the current image block;
performing feature enhancement on the initial image feature based on the entropy gradient offset step to obtain a target image feature; and
obtaining, based on the target image feature, a reconstructed image block corresponding to the current image block through a synthesis transform network.
Based on the same application concept as the methods above, an embodiment of the present application provides an encoding-end device (which may also be called a video encoder). At the hardware level, its architecture is shown in Fig. 8B. The device includes a processor 821 and a machine-readable storage medium 822; the machine-readable storage medium 822 stores machine-executable instructions executable by the processor 821, and the processor 821 executes those instructions to implement the encoding methods of Embodiments 1-15 of the present application.
For example, in one possible implementation, when executing the machine-executable instructions, the processor 821 implements:
obtaining, based on a first bitstream corresponding to a current image block, a standard deviation corresponding to the current image block through a probability hyperparameter decoding network; determining a probability distribution model based on the standard deviation, and decoding a second bitstream corresponding to the current image block based on the probability distribution model to obtain decoded image features; determining, based on the decoded image features, a residual feature corresponding to the current image block; and determining, based on the residual feature, an initial image feature corresponding to the current image block;
for each entropy gradient offset step, performing feature enhancement on the initial image feature based on that entropy gradient offset step to obtain a target image feature, and obtaining, based on the target image feature, a reconstructed image block corresponding to that entropy gradient offset step through a synthesis transform network; and
selecting, based on the current image block and the reconstructed image block corresponding to each entropy gradient offset step, the entropy gradient offset step corresponding to the current image block from among all the entropy gradient offset steps, and encoding the selected entropy gradient offset step in the header-information bitstream corresponding to the current image block.
Based on the same application concept as the methods above, an embodiment of the present application provides an electronic device including a processor and a machine-readable storage medium; the machine-readable storage medium stores machine-executable instructions executable by the processor, and the processor executes those instructions to implement the decoding methods or encoding methods of Embodiments 1-15 of the present application.
Based on the same application concept as the methods above, an embodiment of the present application further provides a machine-readable storage medium storing a number of computer instructions which, when executed by a processor, implement the methods disclosed in the examples above, such as the decoding methods or encoding methods of the embodiments above.
Based on the same application concept as the methods above, an embodiment of the present application further provides a computer application program which, when executed by a processor, implements the decoding methods or encoding methods disclosed in the examples above.
Based on the same application concept as the methods above, an embodiment of the present application further provides a decoding apparatus applicable to a decoding end (which may also be called a video decoder). The decoding apparatus may include:
a determination module, configured to obtain, based on a first bitstream corresponding to a current image block, a standard deviation corresponding to the current image block through a probability hyperparameter decoding network; determine a probability distribution model based on the standard deviation, and decode a second bitstream corresponding to the current image block based on the probability distribution model to obtain decoded image features; determine, based on the decoded image features, a residual feature corresponding to the current image block; and determine, based on the residual feature, an initial image feature corresponding to the current image block; an obtaining module, configured to obtain an entropy gradient offset step corresponding to the current image block; and a processing module, configured to perform feature enhancement on the initial image feature based on the entropy gradient offset step to obtain a target image feature; the obtaining module being further configured to obtain, based on the target image feature, a reconstructed image block corresponding to the current image block through a synthesis transform network.
Exemplarily, when obtaining the standard deviation corresponding to the current image block through the probability hyperparameter decoding network based on the first code stream corresponding to the current image block, the determination module is specifically configured to: decode the first code stream corresponding to the current image block based on a factorized probability model to obtain decoded image features, and determine coefficient hyperparameter features based on the decoded image features; and input the coefficient hyperparameter features to the probability hyperparameter decoding network, which inversely transforms the coefficient hyperparameter features to obtain the standard deviation corresponding to the current image block.
Exemplarily, when obtaining the entropy gradient offset step corresponding to the current image block, the acquisition module is specifically configured to: decode a header information code stream corresponding to the current image block to obtain the entropy gradient offset step; or decode the header information code stream corresponding to the current image block to obtain indication information corresponding to the entropy gradient offset step, and determine the entropy gradient offset step based on the indication information.
Exemplarily, when determining the entropy gradient offset step based on the indication information, the acquisition module is specifically configured to: if a configured entropy gradient offset step table includes at least one entropy gradient offset step and the indication information indicates an index value of an entropy gradient offset step, determine the entropy gradient offset step corresponding to the current image block from the entropy gradient offset step table based on the index value.
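The index-based lookup can be sketched as follows; the table contents here are hypothetical placeholders, since the patent does not specify the configured step values:

```python
# Hypothetical configured entropy gradient offset step table; the actual
# values would be agreed between encoder and decoder.
STEP_TABLE = [0.0, 0.05, 0.1, 0.2]

def step_from_index(index):
    """Map an index value decoded from the header information code
    stream to the corresponding entropy gradient offset step."""
    return STEP_TABLE[index]
```

Signalling only an index keeps the header overhead to a few bits per block, while still letting the encoder choose among several candidate steps.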
Exemplarily, when performing feature enhancement on the initial image features based on the entropy gradient offset step to obtain the target image features, the processing module is specifically configured to: perform feature enhancement on the initial image features based on the entropy gradient offset step and entropy gradient feature information corresponding to the current image block to obtain the target image features; wherein the entropy gradient offset step represents the offset step of the entropy gradient, and is the correction step used when enhancing the initial image features.
Exemplarily, the entropy gradient feature information includes an entropy gradient offset value, and when determining the entropy gradient offset value corresponding to the current image block, the processing module is specifically configured to: determine the entropy gradient offset value corresponding to the current image block based on the residual features.
Exemplarily, when determining the entropy gradient offset value corresponding to the current image block based on the residual features, the processing module is specifically configured to: determine residual increment features corresponding to the residual features; calculate the entropy of the residual features and the entropy of the residual increment features; determine the entropy gradient of the residual features based on the entropy of the residual features and the entropy of the residual increment features; and determine the entropy gradient offset value based on the entropy gradient of the residual features and a regularization term.
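The steps above can be sketched with a finite-difference entropy gradient. The Gaussian rate model and the form of the regularizer below are assumptions chosen for illustration; the patent's exact entropy model and regularization term may differ:

```python
import numpy as np

def rate(residual, sigma):
    """Negative log2-likelihood of the residual under a zero-mean
    Gaussian N(0, sigma^2) -- a common rate/entropy proxy in learned
    image codecs (an assumption here, not the patent's exact model)."""
    return float(np.sum(0.5 * np.log2(2 * np.pi * sigma ** 2)
                        + residual ** 2 / (2 * sigma ** 2 * np.log(2))))

def entropy_gradient_offset(residual, sigma, delta=1e-3, reg=1.0):
    """Entropy gradient via finite differences between the residual and
    the residual increment (residual + delta), turned into an offset
    value with a hypothetical regularization term `reg`."""
    grad = (rate(residual + delta, sigma) - rate(residual, sigma)) / delta
    return grad / (abs(grad) + reg)
```

Dividing by `abs(grad) + reg` bounds the offset value to (-1, 1), which keeps the enhancement step stable even where the rate curve is steep.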
Exemplarily, the entropy gradient feature information includes a target entropy gradient, and when determining the target entropy gradient corresponding to the current image block, the processing module is specifically configured to: query an entropy rate fitting table for the target quantization scale corresponding to the standard deviation, the target quantization scale being the quantization scale closest to the standard deviation; query the entropy rate fitting table for the target quadratic term corresponding to the target quantization scale, wherein the entropy rate fitting table includes correspondences between quantization scales and quadratic terms; and determine the target entropy gradient corresponding to the current image block based on the target quadratic term.
The entropy rate fitting table may be obtained as follows: for multiple standard deviations between a minimum standard deviation and a maximum standard deviation, uniformly quantize the multiple standard deviations on a logarithmic scale to obtain multiple quantization scales corresponding to the multiple standard deviations; and for each quantization scale, perform entropy rate fitting on the quantization scale to obtain the quadratic term corresponding to the quantization scale, and record the correspondence between the quantization scale and the quadratic term in the entropy rate fitting table.
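The table construction and the nearest-scale lookup can be sketched as below. The Gaussian rate curve, the sigma range, and the table size are illustrative assumptions; the patent only specifies log-scale uniform quantization and a quadratic fit per scale:

```python
import numpy as np

def build_entropy_rate_table(sigma_min=0.1, sigma_max=10.0, num_scales=8):
    """Quantize standard deviations uniformly on a log scale, then fit
    a quadratic to an assumed Gaussian rate curve at each quantized
    scale, recording the quadratic coefficient per scale."""
    scales = np.exp(np.linspace(np.log(sigma_min), np.log(sigma_max), num_scales))
    table = {}
    for s in scales:
        xs = np.linspace(-3 * s, 3 * s, 101)
        rates = (0.5 * np.log2(2 * np.pi * np.e * s ** 2)
                 + xs ** 2 / (2 * s ** 2 * np.log(2)))
        quad, _, _ = np.polyfit(xs, rates, 2)  # quadratic coefficient first
        table[float(s)] = float(quad)
    return table

def query_quadratic(table, sigma):
    """Return the quadratic term of the quantized scale closest to sigma."""
    nearest = min(table, key=lambda s: abs(s - sigma))
    return table[nearest]
```

Precomputing the table replaces per-block curve fitting with a single nearest-neighbour lookup at decode time.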
Exemplarily, when determining the regularization term, the processing module is specifically configured to: determine the regularization term based on the initial image features, the standard deviation, and the residual features.
Exemplarily, the initial image features corresponding to the current image block are initial image features corresponding to a luma component; or the initial image features corresponding to the current image block are initial image features corresponding to a chroma component.
Based on the same application concept as the above method, an embodiment of the present application further proposes an encoding device. The device is applied to an encoding end (which may also be referred to as a video encoder), and may include: a determination module, configured to obtain, based on a first code stream corresponding to the current image block, a standard deviation corresponding to the current image block through a probability hyperparameter decoding network; determine a probability distribution model based on the standard deviation; decode a second code stream corresponding to the current image block based on the probability distribution model to obtain decoded image features; determine residual features corresponding to the current image block based on the decoded image features; and determine initial image features corresponding to the current image block based on the residual features; a processing module, configured to, for each entropy gradient offset step, perform feature enhancement on the initial image features based on that entropy gradient offset step to obtain target image features, and obtain, based on the target image features, the reconstructed image block corresponding to that entropy gradient offset step through a synthesis transformation network; a selection module, configured to select, based on the current image block and the reconstructed image block corresponding to each entropy gradient offset step, the entropy gradient offset step corresponding to the current image block from all the entropy gradient offset steps; and an encoding module, configured to encode the entropy gradient offset step corresponding to the current image block into the header information code stream corresponding to the current image block.
Exemplarily, when selecting the entropy gradient offset step corresponding to the current image block from all the entropy gradient offset steps based on the current image block and the reconstructed image block corresponding to each entropy gradient offset step, the selection module is specifically configured to: for each entropy gradient offset step, determine a target cost value corresponding to that entropy gradient offset step based on the current image block and the reconstructed image block corresponding to that entropy gradient offset step, and determine the fidelity corresponding to that entropy gradient offset step based on the target cost value and a reference cost value; and, based on the fidelity corresponding to each entropy gradient offset step, select the entropy gradient offset step with the maximum fidelity as the entropy gradient offset step corresponding to the current image block. The reference cost value is obtained as follows: after the initial image features corresponding to the current image block are obtained, a reconstructed image block is obtained through the synthesis transformation network based on the initial image features, and the reference cost value is determined based on the current image block and the reconstructed image block.
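The encoder-side selection can be sketched as follows. The patent does not pin down the cost or fidelity measures, so mean squared error and "cost reduction relative to the reference" are used here purely as illustrative stand-ins:

```python
def mse(block, recon):
    """Mean squared error between the original and reconstructed block."""
    return sum((a - b) ** 2 for a, b in zip(block, recon)) / len(block)

def select_offset_step(block, recon_by_step, reference_cost):
    """Pick the entropy gradient offset step whose reconstruction has
    maximum fidelity; fidelity is modelled as the cost reduction
    relative to the unenhanced reference reconstruction (assumption)."""
    best_step, best_fidelity = None, float("-inf")
    for step, recon in recon_by_step.items():
        fidelity = reference_cost - mse(block, recon)
        if fidelity > best_fidelity:
            best_step, best_fidelity = step, fidelity
    return best_step
```

Since only the chosen step (or its index) is written to the header code stream, this exhaustive trial stays an encoder-side decision with no extra decoder complexity.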
Those skilled in the art will appreciate that the embodiments of the present application may be provided as methods, systems, or computer program products. The present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. The embodiments of the present application may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, optical storage, etc.) containing computer-usable program code. The above are only embodiments of the present application and are not intended to limit the present application.
Those skilled in the art may make various modifications and variations to the present application. Any modification, equivalent substitution, improvement, etc. made within the spirit and principle of the present application shall fall within the scope of the claims of the present application.
Claims (19)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310217157.9A CN118590650A (en) | 2023-03-01 | 2023-03-01 | A decoding and encoding method, device and equipment thereof |
Publications (1)
Publication Number | Publication Date |
---|---|
CN118590650A (en) | 2024-09-03 |
Family
ID=92532421
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202310217157.9A (pending) | A decoding and encoding method, device and equipment thereof | 2023-03-01 | 2023-03-01 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN118590650A (en) |
- 2023-03-01: CN application CN202310217157.9A filed; published as CN118590650A (status: pending)
Similar Documents
Publication | Publication Date | Title |
---|---|---|
JP7434604B2 (en) | Content-adaptive online training using image replacement in neural image compression | |
US11496769B2 (en) | Neural network based image set compression | |
Jia et al. | Layered image compression using scalable auto-encoder | |
WO2020261314A1 (en) | Image encoding method and image decoding method | |
JP2023053272A (en) | Image encoding device, image decoding device, and program | |
Kim et al. | Fractal coding of video sequence using circular prediction mapping and noncontractive interframe mapping | |
Kawawa-Beaudan et al. | Recognition-aware learned image compression | |
CN118590650A (en) | A decoding and encoding method, device and equipment thereof | |
Mishra et al. | An improved SVD based image compression | |
KR20160065860A (en) | Method for encoding and decoding a media signal and apparatus using the same | |
TWI853774B (en) | Decoding and encoding methods and apparatuses and devices | |
Lu et al. | Hybrid image compression scheme based on PVQ and DCTVQ | |
WO2024149245A1 (en) | Decoding method and apparatus, encoding method and apparatus, and devices thereof | |
WO2020066307A1 (en) | Image decoding device, image encoding device, image processing system, and program | |
TWI851498B (en) | Decoding and encoding methods and apparatuses, and devices | |
TWI859971B (en) | Method and apparatus for image decoding and encoding based on neural network, and device | |
TW202504315A (en) | Decoding methods and apparatuses, and devices | |
WO2024217460A1 (en) | Decoding method and apparatus, coding method and apparatus, and devices | |
Kumar et al. | High-Efficiency Video Coder in Pruned Environment Using Adaptive Quantization Parameter Selection. | |
WO2024213145A1 (en) | Decoding method and apparatus, encoding method and apparatus, device, and medium | |
Balcilar et al. | Vector Quantization and Shifting: Exploiting Latent Properties to Optimize Neural Codecs | |
CN118368434A (en) | Image decoding and encoding method, device, equipment and storage medium | |
CN109495758A (en) | The quantization of bandwidth reduction airspace and quantification method | |
CN119232942A (en) | Multi-step context prediction-based decoding and encoding method, device and equipment |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | | |
SE01 | Entry into force of request for substantive examination | | |