
CN117501692A - Template matching prediction for video encoding and decoding - Google Patents


Info

Publication number
CN117501692A
Authority
CN
China
Prior art keywords
block
current block
template
prediction
current
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202280026726.5A
Other languages
Chinese (zh)
Inventor
K. Naser
F. Le Leannec
T. Poirier
G. Martin-Cocher
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
InterDigital CE Patent Holdings SAS
Original Assignee
InterDigital CE Patent Holdings SAS
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by InterDigital CE Patent Holdings SAS
Priority claimed from PCT/EP2022/057416 (published as WO2022207400A1)
Publication of CN117501692A

Landscapes

  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

A method, apparatus, or system for processing video information is disclosed. It can involve determining a prediction block for a current block of picture information based on a comparison of a template associated with the current block and at least one other template associated with at least one other block in a region of decoded picture information, wherein the comparison is based on a constant number of comparisons per pixel regardless of the size of the current block; and decoding/encoding the current block based on the prediction block.

Description

Template matching prediction for video encoding and decoding
Technical Field
The present disclosure relates to video compression.
Background
To achieve high compression efficiency, image and video coding schemes typically employ prediction and transformation to exploit spatial and temporal redundancy in video content. In general, intra or inter prediction is used to exploit intra or inter correlation, and then differences between original picture blocks and predicted picture blocks (often denoted as prediction errors or prediction residuals) are transformed, quantized and entropy coded. To reconstruct video, the compressed data is decoded by an inverse process corresponding to entropy encoding, quantization, transformation, and prediction.
Disclosure of Invention
Generally, at least one example of an embodiment relates to a method or apparatus for video encoding or decoding, including providing an intra-prediction processing mode employing template matching prediction based on a search range determined as described herein.
At least one example of an embodiment may relate to a method or apparatus for video encoding or decoding, including providing an intra-prediction processing mode employing template matching prediction based on a template search having a fixed number of comparisons per pixel, regardless of block size.
At least one example of an embodiment may relate to a method or apparatus for video encoding or decoding, comprising: providing an intra prediction processing mode employing template matching prediction based on a template search having a fixed number of comparisons per pixel regardless of block size; and modifying the search range so that parallel processing can be performed.
At least one example of an embodiment may relate to an apparatus comprising: one or more processors configured to: determining a prediction block for the current block based on a comparison of a template associated with the current block of picture information and at least one other template associated with at least one other block in the region of decoded picture information, wherein the comparison is based on a constant number of comparisons per pixel irrespective of the size of the current block; and decoding the current block based on the prediction block.
At least one example of an embodiment may relate to a method comprising: determining a prediction block for the current block based on a comparison of a template associated with the current block of picture information and at least one other template associated with at least one other block in the region of decoded picture information, wherein the comparison is based on a constant number of comparisons per pixel irrespective of the size of the current block; and decoding the current block based on the prediction block.
At least one example of an embodiment may relate to an apparatus comprising: one or more processors configured to: determining a prediction block for the current block based on a comparison of a template associated with the current block of picture information and at least one other template associated with at least one other block in the region of reconstructed picture information, wherein the comparison is based on a constant number of comparisons per pixel irrespective of the size of the current block; and encoding the current block based on the prediction block.
At least one example of an embodiment may relate to a method comprising: determining a prediction block for the current block based on a comparison of a template associated with the current block of picture information and at least one other template associated with at least one other block in the region of reconstructed picture information, wherein the comparison is based on a constant number of comparisons per pixel irrespective of the size of the current block; and encoding the current block based on the prediction block.
According to another general aspect of at least one embodiment, there is provided an apparatus comprising: a device according to any of the decoding implementations; and at least one of the following: (i) An antenna configured to receive a signal, the signal comprising a video block; (ii) A band limiter configured to limit the received signal to a frequency band including the video block; and (iii) a display configured to display an output representing the video block.
According to another general aspect of at least one embodiment, there is provided a non-transitory computer-readable medium comprising data content generated according to any of the described coding embodiments or variants.
According to another general aspect of at least one embodiment, there is provided a computer program product storing program instructions that, when executed by a processor, are adapted to carry out one or more embodiments of the methods described herein.
According to another general aspect of at least one embodiment, there is provided a signal comprising video data generated according to any of the described coding embodiments or variants.
According to another general aspect of at least one embodiment, the bitstream is formatted to include data content generated according to any of the described coding embodiments or variants.
The above presents a simplified summary of the subject matter in order to provide a basic understanding of some aspects of the disclosure. This summary is not an extensive overview of the subject matter. It is not intended to identify key/critical elements of the embodiments or to delineate the scope of the subject matter. Its sole purpose is to present some concepts of the subject matter in a simplified form as a prelude to the more detailed description that is presented later.
Drawings
The disclosure may be better understood by considering the following detailed description in conjunction with the accompanying drawings, in which:
fig. 1 illustrates, in block diagram form, an example of an embodiment of an encoder (e.g., a video encoder) suitable for implementing various aspects, features and embodiments described herein;
fig. 2 illustrates, in block diagram form, an example of an embodiment of a decoder (e.g., a video decoder) suitable for implementing various aspects, features, and embodiments described herein;
fig. 3 shows intra prediction modes, such as in Versatile Video Coding (VVC);
FIG. 4 illustrates an example of Template Matching Prediction (TMP);
Fig. 5 shows an example of parallel processing in video coding, such as Wavefront Parallel Processing (WPP);
FIG. 6 illustrates an example of at least one embodiment of TMP that can generally involve the use of a single search area;
FIG. 7 illustrates an example of at least one embodiment of a TMP that can generally involve utilizing multiple search regions;
FIG. 8 illustrates an example of at least one embodiment that may generally involve utilizing TMPs of relatively small search scope, wherein not all pixels within a current Coding Tree Unit (CTU) may be used;
FIG. 9 illustrates an example of at least one embodiment of TMP that can generally involve the use of Wavefront Parallel Processing (WPP);
FIG. 10 illustrates an example of at least one embodiment that may generally involve TMPs that utilize a search range limited to a current CTU row such that independent CTU row decoding is allowed;
FIG. 11 is an example of at least one embodiment that can generally relate to TMP in which the upper template is ignored when it extends beyond the current CTU row;
FIG. 12 illustrates, in block diagram form, an example of an embodiment of an apparatus or device or system suitable for implementing one or more embodiments, aspects or features of the disclosure;
FIG. 13 illustrates an example in accordance with at least one embodiment of the present disclosure;
FIG. 14 illustrates an example in accordance with at least one embodiment of the present disclosure; and
Fig. 15 illustrates an example in accordance with at least one embodiment of the present disclosure.
It should be understood that the drawings are for purposes of illustrating examples according to various aspects, features, and embodiments of the disclosure, and are not necessarily the only possible configuration. The same reference indicators will be used throughout the drawings to refer to the same or like features.
Detailed Description
As will be described in more detail below, a video codec may relate to an intra prediction processing mode. One example of intra prediction may employ a template matching prediction process. Template matching prediction may be based on a template search in a particular region. At least one example of an embodiment described herein may involve template matching prediction with a fixed number of comparisons per pixel. The fixed number of comparisons per pixel may be independent of the block size. In at least one other embodiment, the template matching prediction may involve a fixed number of comparisons, regardless of block size, and the search range is modified so that parallel processing may be performed.
One example of a video coding method is provided by High Efficiency Video Coding (HEVC). More recent advances in video compression technology include the various versions of reference software and/or documents called the Joint Exploration Model (JEM), developed by the Joint Video Exploration Team (JVET) as part of the development of a new video coding standard called Versatile Video Coding (VVC). The aim of JEM is to further improve upon the existing HEVC standard, e.g., to increase coding efficiency and reduce complexity.
For ease of explanation, examples of one or more aspects and/or features of the embodiments described herein may be described in the context of particular standards (such as VVC). However, references to VVC or any other particular standard are not intended to be limiting and do not limit the scope of potential applications of the various embodiments and features described herein.
Turning now to the drawings, fig. 1 shows an example of a video encoder 100 (a High Efficiency Video Coding (HEVC) encoder). Variations of the encoder 100 are contemplated; however, for clarity, the encoder 100 is described below without covering all possible variations. For example, fig. 1 may also represent an encoder in which the HEVC standard is modified, or an encoder employing techniques similar to HEVC, such as a JEM (Joint Exploration Model) encoder developed by JVET (Joint Video Exploration Team) as part of the development of the new video coding standard known as Versatile Video Coding (VVC).
Prior to encoding, the video sequence may undergo a pre-encoding process (101), such as applying a color transform to the input color picture (e.g., converting from RGB 4:4:4 to YCbCr 4:2:0), or remapping the input picture components in order to obtain a signal distribution that is more resilient to compression (e.g., histogram equalization of one of the color components). Metadata may be associated with the preprocessing and attached to the bitstream.
In HEVC, to encode a video sequence having one or more pictures, each picture is partitioned (102) into one or more slices, where each slice may include one or more slice segments. The slice segments are organized into coding units, prediction units, and transform units. The HEVC specification distinguishes between "blocks" and "units", where a "block" addresses a specific area in a sample array (e.g., luma, Y), and a "unit" includes the collocated blocks of all coded color components (Y, Cb, Cr, or monochrome), the syntax elements, and the prediction data (e.g., motion vectors) associated with the blocks.
For encoding in HEVC, a picture is partitioned into square Coding Tree Blocks (CTBs) of configurable size, and a consecutive set of coding tree blocks is grouped into a slice. A Coding Tree Unit (CTU) contains the CTBs of the encoded color components. A CTB is the root of a quadtree partitioning into Coding Blocks (CBs), and a coding block may be partitioned into one or more Prediction Blocks (PBs) and forms the root of a quadtree partitioning into Transform Blocks (TBs). Corresponding to the coding block, prediction block, and transform block, a Coding Unit (CU) includes a tree-structured set of Prediction Units (PUs) and Transform Units (TUs); a PU includes the prediction information for all color components, and a TU includes the residual coding syntax structure for each color component. The sizes of the CBs, PBs, and TBs of the luma component apply to the corresponding CU, PU, and TU. A diagram of the division of Coding Tree Units (CTUs) into Coding Units (CUs), Prediction Units (PUs), and Transform Units (TUs) in HEVC is shown in fig. 3.
In JEM, the QTBT (quadtree plus binary tree) structure removes the concept of multiple partition types in HEVC, i.e., it removes the separation of the CU, PU, and TU concepts. A Coding Tree Unit (CTU) is first partitioned by a quadtree structure. The quadtree leaf nodes are further partitioned by a binary tree structure. The binary tree leaf node is called a Coding Unit (CU), which is used for prediction and transform without further partitioning. Thus, in the new QTBT coding block structure, the CU, PU, and TU have the same block size. In JEM, a CU consists of the Coding Blocks (CBs) of the different color components.
In this application, the term "block" may be used to refer to any one of CTU, CU, PU, TU, CB, PB and TB, for example. In addition, "blocks" may also be used to refer to macroblocks and partitions specified in the H.264/AVC or other video coding standard, and more generally to data arrays of various sizes.
In the encoder 100, a picture is encoded by the encoder elements as described below. The picture to be encoded is partitioned (102) and processed in units of, for example, CUs. Each unit is encoded using either an intra mode or an inter mode. When a unit is encoded in intra mode, intra prediction (160) is performed. In inter mode, motion estimation (175) and compensation (170) are performed. The encoder decides (105) which of the intra mode or inter mode to use for encoding the unit, and indicates the intra/inter decision by, for example, a prediction mode flag. The prediction residual is then calculated, for example, by subtracting (110) the prediction block from the original image block.
The prediction residual is then transformed (125) and quantized (130). The quantized transform coefficients, as well as the motion vectors and other syntax elements, are entropy encoded (145) to output a bitstream. The encoder may skip the transform and directly apply quantization to the untransformed residual signal. The encoder may bypass both transformation and quantization, i.e. directly encode the residual without applying a transformation or quantization process.
The encoder decodes the encoded block to provide a reference for further prediction. The quantized transform coefficients are dequantized (140) and inverse transformed (150) to decode the prediction residual. The decoded prediction residual and the prediction block are combined (155) to reconstruct the image block. An in-loop filter (165) is applied to the reconstructed picture to perform, for example, deblocking/SAO (sample adaptive offset) filtering to reduce coding artifacts. The filtered image is stored at a reference picture buffer (180).
Fig. 2 shows a block diagram of a video decoder 200. In decoder 200, the bit stream is decoded by a decoder element, as described below. Video decoder 200 typically performs decoding passes that are reciprocal to the encoding passes described in fig. 1. Encoder 100 also typically performs video decoding as part of encoding video data.
In particular, the input to the decoder comprises a video bitstream, which may be generated by the video encoder 100. First, the bitstream is entropy decoded (230) to obtain transform coefficients, motion vectors, and other encoded information. The picture partition information indicates how to partition the picture. Thus, the decoder may divide (235) the pictures according to the decoded picture partition information. The transform coefficients are dequantized (240) and inverse transformed (250) to decode the prediction residual. The decoded prediction residual and the prediction block are combined (255), reconstructing the image block. The prediction block may be obtained (270) from intra prediction (260) or motion compensated prediction (i.e., inter prediction) (275). An in-loop filter (265) is applied to the reconstructed image. The filtered image is stored at a reference picture buffer (280).
The decoded pictures may also undergo post-decoding processing (285), such as an inverse color transform (e.g., conversion from YCbCr 4:2:0 to RGB 4:4:4) or performing an inverse remapping that is inverse to the remapping process performed in the pre-encoding processing (101). The post-decoding process may use metadata derived in the pre-encoding process and signaled in the bitstream.
As described above, in the HEVC video compression standard, pictures are divided into so-called Coding Tree Units (CTUs), and each CTU is represented by Coding Units (CUs) in the compressed domain. Each CU is then given some intra or inter prediction parameters (prediction information). To this end, each CU is spatially partitioned into one or more Prediction Units (PUs), each PU being assigned some prediction information. The intra or inter coding mode is assigned at the CU level. Intra or inter prediction is used to exploit intra- or inter-frame correlation. The differences between the original block and the predicted block (typically representing the prediction error or prediction residual) are transformed, quantized, and entropy encoded in Transform Blocks (TBs). To reconstruct the video, the compressed data is decoded by an inverse process corresponding to the entropy encoding, quantization, transform, and prediction.
Intra-picture prediction is an essential part of image and video compression. Conventionally, the prediction signal is generated from the L-shaped set of reconstructed pixels (reference samples) to the left of and/or above the current block or coding unit. In angular intra prediction, the prediction is obtained by propagating the reference samples along different angular directions; this mechanism is called angular prediction. As shown in fig. 3, a video codec such as VVC provides 67 intra prediction modes, comprising 65 angular modes plus DC and planar prediction.
Conventional intra prediction in VVC is enhanced with a variety of tools:
-Cross-Component Linear Model (CCLM): the chroma prediction block is generated by applying a linear model to the reconstructed luma samples.
-Multi-Reference Line prediction (MRL): more reference sample lines are used to generate the prediction block.
-Intra Sub-Partitions (ISP): the prediction block is divided into 4 sub-blocks sharing the same prediction mode.
-Matrix-weighted Intra Prediction (MIP): the prediction block is generated by multiplying the reference samples by matrices optimized offline.
-Intra Block Copy (IBC): a prediction block is generated by copying another block from the already-reconstructed portion of the image, with a displacement vector signaled in the bitstream.
The residual block is transformed with the core transform DCT-II or with combinations of DST-VII and DCT-VIII; this choice is called Multiple Transform Selection (MTS). The transformed block may be further transformed with a secondary non-separable transform to further compress the residual block. This process is called the Low-Frequency Non-Separable Transform (LFNST).
Template Matching Prediction (TMP) is yet another powerful intra prediction mode, one that is not included in VVC. It is performed by searching over L-shaped neighborhoods (referred to as "templates") to find one or more target or candidate blocks for prediction, as shown in fig. 4. With TMP, the current template consists of the reconstructed L-shaped neighborhood of the current block. Similar templates, i.e., templates that differ little from the current template, are searched for. The blocks associated with those templates (target blocks) are then used to generate the prediction signal, either by averaging them or by taking only the one with the smallest template difference.
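As an illustration only, the template search just described can be sketched in Python. This is a minimal sketch under assumed conventions (a 2-D list of reconstructed samples, a template thickness `t`, and an externally supplied list of candidate positions); the document does not define an API, and all names here are hypothetical.

```python
def tmp_predict(recon, x, y, blk_w, blk_h, t, candidates):
    """Illustrative TMP sketch: pick the candidate block whose L-shaped
    template best matches (smallest SAD) the current block's template."""
    def l_template(px, py):
        # L-shaped template: t rows above and t columns left of the block.
        vals = []
        for j in range(py - t, py):
            vals.extend(recon[j][px - t:px + blk_w])
        for j in range(py, py + blk_h):
            vals.extend(recon[j][px - t:px])
        return vals

    cur = l_template(x, y)

    def sad(pos):
        cand = l_template(*pos)
        return sum(abs(a - b) for a, b in zip(cur, cand))

    bx, by = min(candidates, key=sad)
    # Copy the block whose template matched best as the prediction.
    return [row[bx:bx + blk_w] for row in recon[by:by + blk_h]]
```

Here a single best candidate is selected; as noted above, several target blocks could instead be averaged to form the prediction signal.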
Integration of TMP in video codecs such as VVC requires proper interoperability with existing intra tools. Namely:
Interaction with ISP, MIP, and MRL
Interaction with transform tools (MTS, implicit MTS, and LFNST)
Interaction with combined inter and intra prediction (CIIP)
In general, at least one example of an embodiment described herein addresses these interactions and enables this mode while providing an acceptable complexity/Rate-Distortion (RD) performance tradeoff, such as for a future profile of VVC or for a new codec.
One problem that may be associated with implementing TMP comes from the number of comparisons per pixel. For a given search range, the number of comparisons per pixel is much higher for a small block than for a large block. This increases the complexity for small blocks, which may become a bottleneck in the encoding process.
Another problem may be that TMP performs template searches that do not allow parallel processing. For Wavefront Parallel Processing (WPP), the decoding of a CTU must not depend on any CTU beyond the upper-right one. This is indicated in fig. 5: a CTU in a given CTU row (a "run" in fig. 5) can be decoded once the upper-right CTU in the previous row has been decoded. This limits the search range to the part of the frame that has been reconstructed so far.
In general, at least one example of an embodiment described herein may involve:
defining a number of comparisons per pixel, e.g. a fixed number or a maximum number of comparisons per pixel, independent or irrespective of block size; and/or
Defining the search range such that parallel processing can be performed, e.g. limiting the search range to reconstructed frame portions.
Examples of embodiments involving a number of comparisons per pixel (such as a fixed number or a maximum number of comparisons) may use a single search area. As an example, the search range may be located in, or limited to, a single area. One implementation may use a single search area located at the top-left of the current block (CU/PU). This avoids accessing pixels that have not yet been decoded in the current CTU. The search area is shown in fig. 6.
In an example of an embodiment, the number of comparisons may be determined based on or corresponding to the search range. For example, for a search range of width "search_w" and height "search_h", search_w x search_h comparisons are made to select the best matching block. That is, for the example described, the number of comparisons per pixel (CompPerPixel) is calculated as follows:
CompPerPixel = (search_w x search_h) / (blk_w x blk_h)
where blk_w and blk_h are the width and height of the current block.
To keep CompPerPixel fixed, the ratios search_w/blk_w and search_h/blk_h must be fixed. In other words:
search_w=const x blk_w
search_h=const x blk_h
where "const" is a constant value that controls the search range. The value of "const" may be a fixed value or signaled by a high level syntax (e.g., SPS).
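The effect of this scaling can be checked with a small Python sketch. It is illustrative only; the default `const` value of 4 is an assumption, since the text leaves the constant either fixed or signaled in high-level syntax (e.g., SPS).

```python
def search_range(blk_w, blk_h, const=4):
    """Search window scaled with the block size so that the number of
    template comparisons per pixel stays constant (here const**2)."""
    search_w = const * blk_w
    search_h = const * blk_h
    comp_per_pixel = (search_w * search_h) / (blk_w * blk_h)
    return search_w, search_h, comp_per_pixel
```

For example, a 4x4 block and a 32x8 block both yield const**2 = 16 comparisons per pixel, so small blocks no longer dominate the worst-case complexity.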
At least one other example of an embodiment may involve multiple search areas or ranges, which may provide higher coding gain. Examples of embodiments with multiple search areas may involve searching regions that include reconstructed pixels above-left, above-right, and to the left of the current CTU, in addition to reconstructed pixels within the current CTU. Fig. 7 shows an example of multiple search areas. In the example of fig. 7, four regions are defined as follows:
R1: within the current CTU, starting from the upper left of the current position
R2: left upper pixel
R3: upper right pixel
R4: left side pixel
Each of these search ranges is defined by a search range width (search_w) and a search range height (search_h). The total number of comparisons per pixel is calculated as:
CompPerPixel = (4 x search_w x search_h) / (blk_w x blk_h)
As in the case of a single search area, to obtain a fixed CompPerPixel, the ratios search_w/blk_w and search_h/blk_h must be fixed. In other words:
search_w=const x blk_w
search_h=const x blk_h
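The same invariance holds with several regions: with four regions each scaled by `const`, the total is 4 x const^2 comparisons per pixel for any block size. The sketch below is illustrative only, and its default values are assumptions, not taken from the text.

```python
def comparisons_per_pixel(blk_w, blk_h, const=2, num_regions=4):
    """Total template comparisons per pixel over num_regions search
    areas, each of size (const*blk_w) x (const*blk_h)."""
    search_w = const * blk_w
    search_h = const * blk_h
    return num_regions * search_w * search_h / (blk_w * blk_h)
```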
for small search ranges, examples of embodiments may be based on using fewer than all pixels within the current CTU, i.e., not using all, a portion, or a subset of the pixels within the current CTU. This is illustrated by way of example of the embodiment shown in fig. 8.
In general, at least one other example of an embodiment may involve template matching prediction based on providing or enabling parallel processing such as Wavefront Parallel Processing (WPP). In at least one example of an embodiment, to enable WPP, the search range should be limited so that, in each CTU row, no pixels beyond the upper-right CTU are accessed. That is, a pixel that is not yet reconstructed must not be used. Fig. 9 shows an example of an embodiment in which the CTUs allowed for the TMP search are shaded and the current block is located within the white, non-shaded CTU.
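The WPP dependency pattern described above (a CTU may rely on CTUs in rows above only up to its upper-right neighbor) can be expressed as a simple availability test in CTU coordinates. This is an illustrative sketch, not part of the document:

```python
def ctu_available_for_wpp(cur_col, cur_row, ref_col, ref_row):
    """True if the CTU at (ref_col, ref_row) is already reconstructed
    when the CTU at (cur_col, cur_row) is decoded under the WPP schedule."""
    if ref_row > cur_row:
        return False              # rows below have not started yet
    if ref_row == cur_row:
        return ref_col < cur_col  # same row: only CTUs to the left
    # Each row above is ahead by one CTU per row of vertical distance.
    return ref_col <= cur_col + (cur_row - ref_row)
```

A TMP search range compatible with WPP would then be restricted to positions whose covering CTU passes this test.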
In general, at least one other example of an embodiment may involve independent CTU rows. For many real-time encoding processes, it may be desirable for each CTU row to be independently decodable, i.e., with no dependency between the current CTU row and the row above it. To enable TMP with independent CTU rows, the search range may be reduced, controlled, or determined such that access to the CTU row above is restricted, not allowed, or disabled, as illustrated by the example shown in fig. 10.
Furthermore, in at least one other example of an embodiment, only the left template is considered when the upper template extends beyond the current CTU row. This occurs when the vertical position of the current block coincides with the top of the CTU. This is shown in fig. 11. In this case, the top template is not used to find the best candidate; only the left template is used.
At least one other example of an embodiment relates to using only part of the template. For example, when encoding a CU in the first row or first column of a frame, the top or left template, respectively, is not available. This is shown in figs. 14 and 15. In this case, a partial template is used for template matching prediction. In other words, if the reference template would extend beyond the frame boundary, only the part of the template within the frame is considered. Fig. 14 shows an example of an embodiment in which the upper template is not available, so only the left template is used for template matching prediction. That is, the template of the current block includes only a first region to the left of the current block (a first left template), and the template associated with the second block used for comparison includes only a second region to the left of the second block (a second left template); the comparison is thus based only on the first and second left templates. Fig. 15 shows an example in which the left template is not available, so only the upper template is used for template matching prediction. That is, the template of the current block includes only a first region above the current block (a first upper template), and the template associated with the second block used for comparison includes only a second region above the second block (a second upper template); the comparison is thus based only on the first and second upper templates.
Another example of an embodiment relates to the special case in which both the upper and left templates extend beyond the frame boundary, so that neither is available. This can occur, for example, when the first CU of the current frame is encoded. In this case, the prediction is treated as a DC prediction, in which the prediction value is set to:
1<<(bitDepth-1)
where bitDepth is the internal bit depth representation.
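As a quick check of the fallback value, 1 << (bitDepth - 1) is the mid-level of the internal sample range (a trivial sketch):

```python
def dc_fallback(bit_depth):
    """Mid-range value used as the DC prediction when neither the upper
    nor the left template is available (e.g., the first CU of a frame)."""
    return 1 << (bit_depth - 1)
```

For example, the value is 128 for an 8-bit and 512 for a 10-bit internal representation.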
In general, examples of embodiments described and contemplated herein may be implemented in many different forms. Figs. 1 and 2 described above and fig. 12 described below provide examples of some embodiments, but other embodiments are contemplated, and the discussion of figs. 1, 2, and 12 does not limit the breadth of the implementations. At least one aspect of one or more examples of the embodiments described herein generally relates to video encoding and decoding, and at least one other aspect generally relates to transmitting a generated or encoded bitstream. These and other aspects may be implemented in various embodiments, such as a method, an apparatus, a computer-readable storage medium having stored thereon instructions for encoding or decoding video data according to any of the methods, and/or a computer-readable storage medium having stored thereon a bitstream generated according to any of the methods.

Additionally, it should be understood that the figures provided herein, and any text or syntax sections that may relate to industry standards or standard-related documents, illustrate examples of the various aspects and embodiments and are not necessarily the only possible configurations. In this application, the terms "reconstruct" and "decode" may be used interchangeably, the terms "pixel" and "sample" may be used interchangeably, and the terms "image", "picture", and "frame" may be used interchangeably. Various methods are described herein, and each method includes one or more steps or actions for achieving the method. Unless a specific order of steps or actions is required for proper operation of the method, the order and/or use of specific steps and/or actions may be modified or combined.
Various methods and other aspects described in this patent application may be used to modify modules, such as module 160 included in the example of video encoder implementation 100 shown in fig. 1 and module 260 included in the example of video decoder implementation 200 shown in fig. 2. Furthermore, the various embodiments, features, etc. described herein are not limited to VVC or HEVC, and may be applied to, for example, other standards and recommendations (whether pre-existing or developed in the future) and extensions of any such standards and recommendations (including VVC and HEVC). The aspects described in this application may be used alone or in combination unless otherwise indicated or technically excluded. Various values are used in this patent application, such as the size of the maximum quantization matrix, the number of block sizes considered, etc. The particular values are for exemplary purposes and the described aspects are not limited to these particular values.
Fig. 12 shows a block diagram of an example of a system in which various features and embodiments are implemented. The system 1000 in fig. 12 may be embodied as a device including various components described below and configured to perform or implement one or more of the examples of embodiments, features, etc. described in this document. Examples of such devices include, but are not limited to, various electronic devices such as personal computers, laptops, smartphones, tablets, digital multimedia set-top boxes, digital television receivers, personal video recording systems, connected home appliances, and servers. The elements of system 1000 may be embodied in a single Integrated Circuit (IC), multiple ICs, and/or discrete components, alone or in combination. For example, in at least one embodiment, the processing and encoder/decoder elements of system 1000 are distributed across multiple ICs and/or discrete components. In various embodiments, system 1000 is communicatively coupled to one or more other systems or other electronic devices via, for example, a communication bus or through dedicated input ports and/or output ports. In general, system 1000 is configured to implement one or more of the examples of embodiments, features, etc. described in this document.
The system 1000 includes at least one processor 1010 configured to execute instructions loaded therein for implementing various aspects such as those described in this document. The processor 1010 may include an embedded memory, an input-output interface, and various other circuits as known in the art. The system 1000 includes at least one memory 1020 (e.g., a volatile memory device and/or a non-volatile memory device). The system 1000 includes a storage device 1040, which may include non-volatile memory and/or volatile memory, including, but not limited to, electrically erasable programmable read-only memory (EEPROM), read-only memory (ROM), programmable read-only memory (PROM), random access memory (RAM), dynamic random access memory (DRAM), static random access memory (SRAM), flash memory, a magnetic disk drive, and/or an optical disk drive. By way of non-limiting example, storage 1040 may include internal storage, attached storage (including removable and non-removable storage), and/or network-accessible storage.
The system 1000 includes an encoder/decoder module 1030 configured to process data to provide encoded video or decoded video, for example, and the encoder/decoder module 1030 may include its own processor and memory. Encoder/decoder module 1030 represents one or more modules that may be included in a device to perform encoding and/or decoding functions. As is well known, an apparatus may include one or both of an encoding module and a decoding module. Additionally, the encoder/decoder module 1030 may be implemented as a stand-alone element of the system 1000 or may be incorporated within the processor 1010 as a combination of hardware and software as known to those skilled in the art.
Program code to be loaded onto processor 1010 or encoder/decoder 1030 (e.g., to perform or implement one or more examples of embodiments, features, etc. described in this document) may be stored in storage device 1040 and subsequently loaded onto memory 1020 for execution by processor 1010. According to various implementations, one or more of the processor 1010, memory 1020, storage 1040, and encoder/decoder module 1030 may store one or more of various items during execution of the processes described in this document. Such storage items may include, but are not limited to, input video, decoded video or partially decoded video, bitstreams, matrices, variables, and intermediate or final results of processing equations, formulas, operations, and arithmetic logic.
In some embodiments, memory internal to the processor 1010 and/or encoder/decoder module 1030 is used to store instructions as well as to provide working memory for processing as needed during encoding or decoding. However, in other embodiments, memory external to the processing device (e.g., the processing device may be the processor 1010 or the encoder/decoder module 1030) is used for one or more of these functions. The external memory may be memory 1020 and/or storage device 1040, such as dynamic volatile memory and/or non-volatile flash memory. In several embodiments, external non-volatile flash memory is used to store the operating system of, for example, a television. In at least one embodiment, a fast external dynamic volatile memory such as RAM is used as working memory for video encoding and decoding operations, such as for MPEG-2 (MPEG refers to the Moving Picture Experts Group; MPEG-2 is also known as ISO/IEC 13818, with 13818-1 also known as H.222 and 13818-2 also known as H.262), HEVC (HEVC refers to High Efficiency Video Coding, also known as H.265 and MPEG-H Part 2), or VVC (Versatile Video Coding, a new standard developed by the Joint Video Experts Group (JVET)).
Input to the elements of system 1000 may be provided through various input devices as indicated in block 1130. Such input devices include, but are not limited to: (i) a radio frequency (RF) section that receives an RF signal transmitted over the air, for example, by a broadcaster; (ii) a component (COMP) input terminal (or a set of COMP input terminals); (iii) a Universal Serial Bus (USB) input terminal; and/or (iv) a High Definition Multimedia Interface (HDMI) input terminal. Other examples, not shown in fig. 12, include composite video.
In various embodiments, the input devices of block 1130 have associated respective input processing elements as known in the art. For example, the RF section may be associated with elements suitable for: (i) selecting the desired frequency (also referred to as selecting a signal, or band-limiting a signal to one frequency band), (ii) down-converting the selected signal, (iii) band-limiting again to a narrower frequency band to select, for example, a signal band that may be referred to as a channel in some embodiments, (iv) demodulating the down-converted and band-limited signal, (v) performing error correction, and (vi) demultiplexing to select the desired stream of data packets. The RF section of the various embodiments includes one or more elements for performing these functions, such as a frequency selector, a signal selector, a band limiter, a channel selector, a filter, a down-converter, a demodulator, an error corrector, and a demultiplexer. The RF section may include a tuner that performs various of these functions, including, for example, down-converting the received signal to a lower frequency (e.g., an intermediate or near-baseband frequency) or to baseband. In one set-top box embodiment, the RF section and its associated input processing elements receive an RF signal transmitted over a wired (e.g., cable) medium and perform frequency selection by filtering, down-converting, and filtering again to a desired frequency band. Various embodiments rearrange the order of the above (and other) elements, remove some of these elements, and/or add other elements performing similar or different functions. Adding elements may include inserting elements between existing elements, such as, for example, inserting amplifiers and an analog-to-digital converter. In various embodiments, the RF section includes an antenna.
Additionally, the USB and/or HDMI terminals may include respective interface processors for connecting the system 1000 to other electronic devices across USB and/or HDMI connections. It should be appreciated that various aspects of the input processing (e.g., Reed-Solomon error correction) may be implemented as necessary, for example, within a separate input processing IC or within the processor 1010. Similarly, aspects of USB or HDMI interface processing may be implemented within a separate interface IC or within the processor 1010, if desired. The demodulated, error-corrected, and demultiplexed streams are provided to various processing elements including, for example, a processor 1010 and an encoder/decoder 1030 that operate in conjunction with memory and storage elements to process the data streams as needed for presentation on an output device.
The various elements of system 1000 may be provided within an integrated housing within which the various elements may be interconnected and data transferred therebetween using a suitable connection arrangement 1140 (e.g., internal buses, including inter-IC (I2C) buses, wiring, and printed circuit boards, as is known in the art).
The system 1000 includes a communication interface 1050 that allows communication with other devices via a communication channel 1060. Communication interface 1050 may include, but is not limited to, a transceiver configured to transmit and receive data over communication channel 1060. Communication interface 1050 may include, but is not limited to, a modem or network card, and communication channel 1060 may be implemented within a wired and/or wireless medium, for example.
In various embodiments, the data stream is transmitted or otherwise provided to system 1000 using a wireless network, such as a Wi-Fi network, for example IEEE 802.11 (IEEE refers to the Institute of Electrical and Electronics Engineers). The Wi-Fi signals of these embodiments are received through a communication channel 1060 and a communication interface 1050 suitable for Wi-Fi communication. The communication channel 1060 of these embodiments is typically connected to an access point or router that provides access to external networks, including the Internet, to allow streaming applications and other over-the-top communications. Other embodiments provide streamed data to the system 1000 using a set-top box that delivers the data over the HDMI connection of input block 1130. Still other embodiments provide streamed data to system 1000 using the RF connection of input block 1130. As described above, various embodiments provide data in a non-streaming manner. In addition, various embodiments use wireless networks other than Wi-Fi, such as a cellular network or a Bluetooth network.
The system 1000 may provide output signals to various output devices including a display 1100, speakers 1110, and other peripheral devices 1120. The display 1100 of various embodiments includes, for example, one or more of a touch screen display, an Organic Light Emitting Diode (OLED) display, a curved display, and/or a foldable display. The display 1100 may be used in a television, tablet, laptop, cellular telephone (mobile phone), or other device. The display 1100 may also be integrated with other components (e.g., as in a smart phone), or may be a stand-alone display (e.g., an external monitor for a laptop). In various examples of implementations, other peripheral devices 1120 include one or more of a stand-alone digital video disc (or digital versatile disc) (DVD, which may refer to either term) player, a disc player, a stereo system, and/or a lighting system. Various embodiments use one or more peripheral devices 1120 that provide functionality based on the output of the system 1000. For example, a disc player performs the function of playing the output of system 1000.
In various embodiments, control signals are communicated between the system 1000 and the display 1100, speakers 1110, or other peripheral devices 1120 using signaling such as AV.Link, Consumer Electronics Control (CEC), or other communication protocols that allow device-to-device control with or without user intervention. Output devices may be communicatively coupled to system 1000 via dedicated connections through respective interfaces 1070, 1080, and 1090. Alternatively, the output devices may be connected to the system 1000 via the communication interface 1050 using the communication channel 1060. In an electronic device (such as, for example, a television), the display 1100 and speakers 1110 may be integrated in a single unit with the other components of the system 1000. In various embodiments, the display interface 1070 includes a display driver, such as, for example, a timing controller (TCon) chip.
For example, if the RF portion of input 1130 is part of a stand-alone set-top box, the display 1100 and speakers 1110 may alternatively be separate from one or more of the other components. In various implementations in which the display 1100 and speakers 1110 are external components, the output signals may be provided via dedicated output connections, including, for example, HDMI ports, USB ports, or COMP outputs.
The embodiments may be carried out by computer software implemented by the processor 1010, or by hardware, or by a combination of hardware and software. As a non-limiting example, these embodiments may be implemented by one or more integrated circuits. As a non-limiting example, memory 1020 may be of any type suitable to the technical environment and may be implemented using any suitable data storage technology such as optical memory devices, magnetic memory devices, semiconductor-based memory devices, fixed memory, and removable memory. As a non-limiting example, the processor 1010 may be of any type suitable to the technical environment, and may encompass one or more of microprocessors, general purpose computers, special purpose computers, and processors based on a multi-core architecture.
Fig. 13 provides another example of an embodiment. In fig. 13, at 1310, a prediction block for a current block of picture information is determined. The determination at 1310 is based on a comparison of a template associated with the current block (e.g., an L-shaped template such as shown in fig. 4 with a first portion to the left of the current block and a second portion above the current block) and at least one other template associated with at least one other block in the region of decoded or reconstructed picture information. The comparison may include searching for one or more templates in the area of decoded or reconstructed picture information that match or most closely match the template of the current block. The comparison may be based on a constant number of comparisons per pixel, independent of block size, as described herein, for example, with respect to fig. 6 or fig. 7. One or more blocks associated with the one or more templates determined by the comparison are used to generate a prediction block. At 1320, the current block is decoded (or encoded) based on the prediction block.
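The determination at 1310 can be sketched roughly as follows (a hypothetical Python illustration, not the claimed method itself: the SAD cost, the template thickness `t`, the function name, and the restriction of candidates to rows above the current block are all assumptions made for this sketch):

```python
import numpy as np

def best_template_match(recon, x0, y0, bw, bh, t=2):
    # L-shaped template of the current block: t rows above (including the
    # corner) and t columns to the left, taken from reconstructed samples.
    cur_top = recon[y0 - t:y0, x0 - t:x0 + bw]
    cur_left = recon[y0:y0 + bh, x0 - t:x0]
    best, best_cost = None, None
    # Candidate blocks are restricted to fully reconstructed rows above
    # the current block so that their samples are available at the decoder.
    for cy in range(t, y0 - bh + 1):
        for cx in range(t, recon.shape[1] - bw + 1):
            cand_top = recon[cy - t:cy, cx - t:cx + bw]
            cand_left = recon[cy:cy + bh, cx - t:cx]
            # SAD between the two L-shaped templates.
            cost = int(np.abs(cur_top - cand_top).sum()
                       + np.abs(cur_left - cand_left).sum())
            if best_cost is None or cost < best_cost:
                best_cost = cost
                best = recon[cy:cy + bh, cx:cx + bw].copy()
    return best  # used as the prediction block at 1310
```

The block returned by the search then serves as the prediction block used at 1320 to decode (or encode) the current block.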
In addition to the examples of embodiments described herein, various broad and specific embodiments are supported and contemplated throughout this disclosure. Examples of embodiments according to the present disclosure include, but are not limited to, the following embodiments.
Generally, at least one example of an embodiment relates to a method or apparatus for video encoding or decoding, including providing an intra-prediction processing mode employing template matching prediction based on a search range determined as described herein.
At least one example of an embodiment may relate to a method or apparatus for video encoding or decoding, including providing an intra-prediction processing mode employing template matching prediction based on a template search having a fixed number of comparisons per pixel, regardless of block size.
At least one example of an embodiment may relate to a method or apparatus for video encoding or decoding, comprising: providing an intra prediction processing mode employing template matching prediction based on a template search having a fixed number of comparisons per pixel regardless of block size; and modifying the search range so that parallel processing can be performed.
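The constraint of a fixed number of comparisons per pixel can be illustrated as follows (a hypothetical sketch; the per-pixel budget of 64 and the template thickness of 2 are example values, not values taken from this disclosure):

```python
def num_search_positions(bw, bh, comparisons_per_pixel=64, t=2):
    # Pixels compared per candidate position: the L-shaped template,
    # i.e., t rows above the block (including the corner) plus t columns
    # to its left.
    template_pixels = t * (bw + t) + t * bh
    # The total comparison budget grows with the block area, so the cost
    # normalized per pixel stays constant irrespective of block size.
    budget = comparisons_per_pixel * bw * bh
    return max(1, budget // template_pixels)
```

Under these example values, a 4x4 block would test about 51 candidate positions and a 32x32 block about 496, yet both spend roughly the same number of comparisons per pixel.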
At least one example of an embodiment may relate to an apparatus comprising: one or more processors configured to: determining a prediction block for the current block based on a comparison of a template associated with the current block of picture information and at least one other template associated with at least one other block in the region of decoded picture information, wherein the comparison is based on a constant number of comparisons per pixel irrespective of the size of the current block; and decoding the current block based on the prediction block.
At least one example of an embodiment may relate to a method comprising: determining a prediction block for the current block based on a comparison of a template associated with the current block of picture information and at least one other template associated with at least one other block in the region of decoded picture information, wherein the comparison is based on a constant number of comparisons per pixel irrespective of the size of the current block; and decoding the current block based on the prediction block.
At least one example of an embodiment may relate to an apparatus comprising: one or more processors configured to: determining a prediction block for the current block based on a comparison of a template associated with the current block of picture information and at least one other template associated with at least one other block in the region of reconstructed picture information, wherein the comparison is based on a constant number of comparisons per pixel irrespective of the size of the current block; and encoding the current block based on the prediction block.
At least one example of an embodiment may relate to a method comprising: determining a prediction block for the current block based on a comparison of a template associated with the current block of picture information and at least one other template associated with at least one other block in the region of reconstructed picture information, wherein the comparison is based on a constant number of comparisons per pixel irrespective of the size of the current block; and encoding the current block based on the prediction block.
At least one example of an implementation may relate to a method or apparatus as described herein, wherein the constant number of comparisons per pixel is one of a fixed value or a value signaled by high level syntax information.
At least one example of an embodiment may relate to a method or apparatus as described herein, wherein the region where the at least one other template occurs includes a region above and to the left of the current block.
At least one example of an embodiment may relate to a method or apparatus as described herein, wherein the region in which the at least one other template occurs includes a plurality of regions including a first region including pixels above and to the left of the current block and within the current CTU including the current block, a second region including pixels above and to the left of the current CTU, a third region including pixels above and to the right of the current CTU, and a fourth region including pixels to the left of the current CTU.
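As a rough illustration of such a multi-region arrangement (hypothetical Python; the 128-sample CTU size, the function name, and the rectangle conventions are assumptions of this sketch, and clipping to the picture boundary is left to the caller):

```python
def search_regions(cu_x, cu_y, ctu_size=128):
    # Top-left corner of the CTU containing the current block.
    ctu_x = (cu_x // ctu_size) * ctu_size
    ctu_y = (cu_y // ctu_size) * ctu_size
    # Each region is (x0, y0, x1, y1) with x1/y1 exclusive.
    return {
        # R1: pixels above and to the left of the current block,
        # inside the current CTU.
        "current_ctu": (ctu_x, ctu_y, cu_x + 1, cu_y + 1),
        # R2: the CTU above and to the left of the current CTU.
        "above_left": (ctu_x - ctu_size, ctu_y - ctu_size, ctu_x, ctu_y),
        # R3: the CTUs above and above-right of the current CTU.
        "above": (ctu_x, ctu_y - ctu_size, ctu_x + 2 * ctu_size, ctu_y),
        # R4: the CTU to the left of the current CTU.
        "left": (ctu_x - ctu_size, ctu_y, ctu_x, ctu_y + ctu_size),
    }
```

For a current block at (200, 300) with a 128-sample CTU, for example, the current CTU starts at (128, 256) and the four rectangles tile the reconstructed area around it.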
At least one example of an embodiment may relate to a method or apparatus as described herein, wherein the region in which the at least one other template occurs comprises a region selected to enable wavefront parallel processing.
At least one example of an embodiment may relate to a method or apparatus as described herein, wherein the region in which the at least one other template occurs comprises a region selected to enable independent decoding of each CTU row.
At least one example of an embodiment may relate to a method or apparatus as described herein, wherein the region is selected such that decoding does not require access to CTU rows above CTU rows comprising the current block.
At least one example of an embodiment may relate to a method or apparatus as described herein, wherein the template associated with the current block includes a first portion to the left of the current block and a second portion above the current block, and the comparison is based on only the first portion when the second portion extends above a row of CTUs that includes the current block.
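This restriction can be sketched as follows (hypothetical Python; the CTU size, template thickness, and function name are example assumptions): when the upper part of the template would extend above the CTU row containing the current block, only the left part participates in the comparison.

```python
def usable_template_parts(cu_y, ctu_size=128, t=2):
    # Top of the CTU row that contains the current block.
    ctu_row_top = (cu_y // ctu_size) * ctu_size
    # The upper template is usable only if its t rows stay inside the
    # current CTU row, avoiding a dependency on the CTU row above.
    if cu_y - t >= ctu_row_top:
        return ("left", "upper")
    return ("left",)
```

A block whose top edge coincides with the top of its CTU row would thus be matched using only the left template portion.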
At least one example of an embodiment may relate to an apparatus comprising an apparatus as described herein and at least one of: (i) An antenna configured to receive a signal, the signal including data representing image information; (ii) A band limiter configured to limit the received signal to a frequency band including data representing image information; and (iii) a display configured to display an image from the image information.
At least one example of an embodiment may relate to a device as described herein, wherein the device comprises one of a television, a television signal receiver, a set-top box, a gateway device, a mobile device, a cellular phone, a tablet computer, or other electronic device.
In general, another example of an embodiment may relate to a bitstream or signal formatted to include syntax elements and picture information, wherein the syntax elements are generated by processing based on any one or more of the examples of an embodiment of a method according to the present disclosure and the picture information is encoded.
In general, one or more other examples of embodiments may also provide a computer-readable storage medium, e.g., a non-volatile computer-readable storage medium, having instructions stored thereon for encoding or decoding picture information (such as video data) according to the methods or apparatus described herein. One or more embodiments may also provide a computer readable storage medium having stored thereon a bitstream generated according to the methods or apparatus described herein. One or more embodiments may also provide methods and apparatus for transmitting or receiving a bitstream or signal generated according to the methods or apparatus described herein.
Many of these examples are described with specificity and, at least to show individual characteristics, are often described in a manner that may appear limiting. However, this is for clarity of description and does not limit the application or scope of these aspects. Indeed, all of the different aspects may be combined and interchanged to provide further aspects. Furthermore, these embodiments, features, etc. may also be combined and interchanged with other embodiments, features, etc. described in earlier filings.
Various implementations involve decoding. As used in this application, "decoding" may encompass all or part of a process performed on a received encoded sequence, for example, in order to produce a final output suitable for display. In various implementations, such processes include one or more processes typically performed by a decoder, such as entropy decoding, inverse quantization, inverse transformation, and differential decoding. In various embodiments, such processes also or alternatively include processes performed by a decoder of the various embodiments described in the present application.
As a further example, in an embodiment, "decoding" refers only to entropy decoding, in another embodiment "decoding" refers only to differential decoding, and in yet another embodiment "decoding" refers to a combination of entropy decoding and differential decoding. The phrase "decoding process" is intended to refer specifically to a subset of operations or broadly to a broader decoding process, as will be clear based on the context of the specific description, and is believed to be well understood by those skilled in the art.
Various implementations involve encoding. In a manner similar to the discussion above regarding "decoding", "encoding" as used in this application may encompass, for example, all or part of a process performed on an input video sequence to produce an encoded bitstream. In various implementations, such processes include one or more processes typically performed by an encoder, such as partitioning, differential encoding, transformation, quantization, and entropy encoding.
As a further example, in an embodiment, "encoding" refers only to entropy encoding, in another embodiment, "encoding" refers only to differential encoding, and in yet another embodiment, "encoding" refers to a combination of differential encoding and entropy encoding. Whether the phrase "encoding process" refers specifically to a subset of operations or broadly refers to a broader encoding process will be apparent based on the context of the specific description and is believed to be well understood by those skilled in the art.
Note that syntax elements used herein are descriptive terms. Thus, they do not exclude the use of other syntax element names.
When the figures are presented as flow charts, it should be understood that they also provide block diagrams of corresponding devices. Similarly, when the figures are presented as block diagrams, it should be understood that they also provide a flow chart of the corresponding method/process.
In general, examples of embodiments, implementations, features, etc. described herein may be implemented in, for example, a method or process, an apparatus, a software program, a data stream, or a signal. Even if only discussed in the context of a single form of implementation (e.g., discussed only as a method), the implementation of the features discussed may also be implemented in other forms (e.g., an apparatus or program). The apparatus may be implemented in, for example, suitable hardware, software and firmware. One or more examples of the method may be implemented in, for example, a processor, which refers generally to a processing device including, for example, a computer, a microprocessor, an integrated circuit, or a programmable logic device. Processors also include communication devices such as, for example, computers, cell phones, portable/personal digital assistants ("PDAs"), and other devices that facilitate communication of information between end users. Furthermore, the use of the term "processor" herein is intended to broadly encompass a single processor or various configurations of more than one processor.
Reference to "one embodiment" or "an embodiment", as well as "one implementation" or "an implementation", and other variations thereof, means that a particular feature, structure, characteristic, etc. described in connection with the embodiment is included in at least one embodiment. Thus, the appearances of the phrase "in one embodiment" or "in an embodiment", as well as "in one implementation" or "in an implementation", and any other variations appearing in various places throughout this application, are not necessarily all referring to the same embodiment.
In addition, the present application may refer to "determining" various pieces of information. Determining the information may include, for example, one or more of estimating the information, calculating the information, predicting the information, or retrieving the information from memory.
Furthermore, the present application may refer to "accessing" various pieces of information. Accessing the information may include, for example, one or more of receiving the information, retrieving the information (e.g., from memory), storing the information, moving the information, copying the information, calculating the information, determining the information, predicting the information, or estimating the information.
In addition, the present application may refer to "receiving" various pieces of information. As with "accessing", receiving is intended to be a broad term. Receiving the information may include, for example, one or more of accessing the information or retrieving the information (e.g., from memory). Further, "receiving" is typically involved, in one way or another, during operations such as, for example, storing the information, processing the information, transmitting the information, moving the information, copying the information, erasing the information, calculating the information, determining the information, predicting the information, or estimating the information.
It should be understood that, for example, in the case of "a/B", "a and/or B", and "at least one of a and B", use of any of the following "/", "and/or" and "at least one" is intended to cover selection of only the first listed option (a), or selection of only the second listed option (B), or selection of both options (a and B). As a further example, in the case of "A, B and/or C" and "at least one of A, B and C", such phrases are intended to cover selection of only the first listed option (a), or only the second listed option (B), or only the third listed option (C), or only the first and second listed options (a and B), or only the first and third listed options (a and C), or only the second and third listed options (B and C), or all three options (a and B and C). As will be apparent to one of ordinary skill in the art and related arts, this extends to as many items as are listed.
It will be apparent to one of ordinary skill in the art that implementations may produce various signals formatted to carry, for example, storable or transmittable information. The information may include, for example, instructions for performing a method or data resulting from one of the implementations. For example, the signal may be formatted to carry the bit stream of the described embodiments. Such signals may be formatted, for example, as electromagnetic waves (e.g., using the radio frequency portion of the spectrum) or baseband signals. Formatting may include, for example, encoding the data stream and modulating the carrier with the encoded data stream. The information carried by the signal may be, for example, analog or digital information. It is known that signals may be transmitted over a variety of different wired or wireless links. The signal may be stored on a processor readable medium.
Various embodiments are described herein. The features of these embodiments may be provided separately or in any combination in the various claim categories and types. Further, embodiments may include one or more of the following features, devices, or aspects, alone or in any combination, across the various claim categories and types:
providing video encoding and/or decoding, comprising: determining a prediction block for the current block based on a comparison of a template associated with the current block of picture information and at least one other template associated with at least one other block in the region of decoded picture information, wherein the comparison is based on a constant number of comparisons per pixel irrespective of the size of the current block; and encoding/decoding the current block based on the prediction block.
Providing video encoding and/or decoding as described herein, wherein the constant number of comparisons per pixel is one of a fixed value or a value signaled by high level syntax information;
providing video encoding and/or decoding as described herein, wherein the region in which the at least one other template occurs comprises a region above and to the left of the current block;
providing video encoding and/or decoding as described herein, wherein the region in which the at least one other template appears comprises a plurality of regions including a first region including pixels above and to the left of the current block and within the current CTU including the current block, a second region including pixels above and to the left of the current CTU, a third region including pixels above and to the right of the current CTU, and a fourth region including pixels to the left of the current CTU;
providing video encoding and/or decoding as described herein, wherein the region in which the at least one other template appears comprises a region selected to enable wavefront parallel processing;
providing video encoding and/or decoding as described herein, wherein the region in which the at least one other template occurs comprises a region selected to enable independent decoding of each CTU row;
providing video encoding and/or decoding as described herein, wherein the region is selected such that decoding does not require access to CTU rows above the CTU row comprising the current block;
providing video encoding and/or decoding as described herein, wherein the template associated with the current block includes a first portion to the left of the current block and a second portion above the current block, and the comparison is based on the first portion only when the second portion extends above a row of CTUs that includes the current block;
providing a bitstream or signal comprising one or more syntax elements or variants thereof;
providing a bitstream or signal comprising a syntax conveying information generated according to any of the embodiments;
providing for inserting syntax elements in the signaling, which enable the decoder to operate in a manner corresponding to the encoding manner used by the encoder;
inserting, in the signaling, syntax elements that enable an encoder and/or decoder to provide encoding and/or decoding according to any of the implementations, features, or entities as described herein (alone or in any combination);
selecting, based on these syntax elements, features or entities (alone or in any combination) as described herein for application at the decoder;
providing for creating and/or transmitting and/or receiving and/or decoding a bitstream or signal comprising one or more of said syntax elements or variants thereof;
providing a creation and/or transmission and/or reception and/or decoding of a bitstream according to any of the embodiments;
a method, process, apparatus, medium storing instructions, medium storing data, or signal according to any one of the embodiments;
a television, set-top box, cellular telephone, tablet computer, or other electronic device that applies encoding and/or decoding according to any of the embodiments, features, or entities as described herein (alone or in any combination);
a television, set-top box, cellular telephone, tablet computer, or other electronic device that performs encoding and/or decoding according to any of the embodiments, features, or entities (alone or in any combination) as described herein and displays (e.g., using a monitor, screen, or other type of display) the resulting image;
a television, set-top box, cellular telephone, tablet computer, or other electronic device that tunes (e.g., using a tuner) a channel to receive a signal comprising an encoded image and performs encoding and/or decoding according to any of the embodiments, features, or entities (alone or in any combination) as described herein;
a television, set-top box, cellular telephone, tablet computer, or other electronic device that receives (e.g., using an antenna) an over-the-air signal comprising an encoded image and performs encoding and/or decoding according to any of the embodiments, features, or entities (alone or in any combination) as described herein;
a computer program product storing program code for execution by a computer for encoding and/or decoding according to any of the embodiments, features or entities (alone or in any combination) as described herein.
a non-transitory computer readable medium comprising executable program instructions that cause a computer executing the instructions to perform encoding and/or decoding according to any of the embodiments, features, or entities (alone or in any combination) as described herein.
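The template-matching prediction summarized in the features above can be sketched as follows. This is a simplified illustration only, not the codec's actual search: the template thickness `T`, the 8-bit depth, the closest-row-first scan order, and the way the candidate budget is derived from the constant number of comparisons per pixel are all assumptions made for this sketch. The key point it demonstrates is that the number of candidate templates tested shrinks as the template grows, so the total number of pixel comparisons stays proportional to the block area regardless of block size, and that a DC value of 1 << (bitDepth - 1) serves as the fallback when no template is available.

```python
import numpy as np

T = 2          # template thickness in samples (assumption for this sketch)
BIT_DEPTH = 8  # sample bit depth (assumption for this sketch)

def template_coords(x, y, bw, bh):
    """Coordinates of an L-shaped template: T rows above and T columns to the
    left of the bw x bh block whose top-left corner is (x, y).
    Empty if the template would fall outside the frame."""
    if x < T or y < T:
        return []
    above = [(x + dx, y - dy) for dy in range(1, T + 1) for dx in range(bw)]
    left = [(x - dx, y + dy) for dx in range(1, T + 1) for dy in range(bh)]
    return above + left

def predict_block(recon, x, y, bw, bh, cmp_per_pixel=4):
    """Template-matching prediction with a search budget derived from a
    constant number of comparisons per block pixel, independent of block size."""
    cur = template_coords(x, y, bw, bh)
    if not cur:
        # No template available (e.g. first CU of the frame): DC fallback.
        return np.full((bh, bw), 1 << (BIT_DEPTH - 1), dtype=recon.dtype)
    cur_vals = np.array([int(recon[py, px]) for px, py in cur])
    # Total pixel comparisons allowed = cmp_per_pixel * block area, so the
    # number of candidate templates tested shrinks as the template grows.
    budget = max(1, (cmp_per_pixel * bw * bh) // len(cur))
    best_xy, best_sad, tested = None, None, 0
    # Restricting cy <= y - bh keeps every candidate block (and its template)
    # inside the already-decoded region; closest rows are scanned first.
    for cy in range(y - bh, T - 1, -1):
        for cx in range(T, recon.shape[1] - bw + 1):
            if tested >= budget:
                break
            cand = template_coords(cx, cy, bw, bh)
            cand_vals = np.array([int(recon[py, px]) for px, py in cand])
            sad = int(np.abs(cur_vals - cand_vals).sum())  # SAD template cost
            tested += 1
            if best_sad is None or sad < best_sad:
                best_sad, best_xy = sad, (cx, cy)
        if tested >= budget:
            break
    if best_xy is None:
        return np.full((bh, bw), 1 << (BIT_DEPTH - 1), dtype=recon.dtype)
    bx, by = best_xy
    # The decoded block under the best-matching template is the prediction.
    return recon[by:by + bh, bx:bx + bw].copy()
```

On periodic content the search finds an exactly-matching template within its budget; doubling the block size doubles the template length but halves the candidate budget, keeping the per-pixel comparison cost constant, which is the complexity property the embodiments above describe.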

Claims (23)

1. An apparatus comprising one or more processors configured to: determine a prediction block for a current block of picture information based on a comparison of a template associated with the current block and at least one other template associated with at least one other block in a region of decoded picture information, wherein the comparison is based on a constant number of comparisons per pixel regardless of the size of the current block; and decode the current block based on the prediction block.
2. A method comprising: determining a prediction block for a current block of picture information based on a comparison of a template associated with the current block and at least one other template associated with at least one other block in a region of decoded picture information, wherein the comparison is based on a constant number of comparisons per pixel regardless of the size of the current block; and decoding the current block based on the prediction block.
3. An apparatus comprising one or more processors configured to: determine a prediction block for a current block of picture information based on a comparison of a template associated with the current block and at least one other template associated with at least one other block in a region of reconstructed picture information, wherein the comparison is based on a constant number of comparisons per pixel regardless of the size of the current block; and encode the current block based on the prediction block.
4. A method comprising: determining a prediction block for a current block of picture information based on a comparison of a template associated with the current block and at least one other template associated with at least one other block in a region of reconstructed picture information, wherein the comparison is based on a constant number of comparisons per pixel regardless of the size of the current block; and encoding the current block based on the prediction block.
5. The apparatus or method of any preceding claim, wherein the constant number of comparisons per pixel is one of a fixed value or a value signaled by high-level syntax information.
6. The apparatus or method of any preceding claim, wherein the region in which the at least one other template appears comprises a region above and to the left of the current block.
7. The apparatus or method of any one of claims 1 to 5, wherein the region in which the at least one other template appears comprises a plurality of regions including a first region comprising pixels above and to the left of the current block and within the current CTU that includes the current block, a second region comprising pixels above and to the left of the current CTU, a third region comprising pixels above and to the right of the current CTU, and a fourth region comprising pixels to the left of the current CTU.
8. The apparatus or method of any one of claims 1 to 5, wherein the region in which the at least one other template appears comprises a region selected to enable wavefront parallel processing.
9. The apparatus or method of any one of claims 1 to 5, wherein the region in which the at least one other template appears comprises a region selected to enable independent decoding of each CTU row.
10. The apparatus or method of claim 9, wherein the region is selected such that decoding does not require access to a CTU row above the CTU row that includes the current block.
11. The apparatus or method of claim 10, wherein the template associated with the current block comprises a first portion to the left of the current block and a second portion above the current block, and when the second portion extends above the CTU row that includes the current block, the comparison is based only on the first portion.
12. The apparatus or method of any one of claims 1 to 4, wherein the template of the current block comprises only a first left template to the left of the current block, the at least one other template comprises only a second left template to the left of the at least one other block, and the comparison is based only on the first left template and the second left template.
13. The apparatus or method of claim 12, wherein the current block and the prediction block are located in the first row of the current frame.
14. The apparatus or method of any one of claims 1 to 4, wherein the template of the current block comprises only a first upper template above the current block, the at least one other template comprises only a second upper template above the at least one other block, and the comparison is based only on the first upper template and the second upper template.
15. The apparatus or method of claim 14, wherein the current block is located in the first column of the current frame.
16. The apparatus or method of any one of claims 1 to 4, wherein the current block corresponds to the first coding unit of the current frame, and the template of the first block extends beyond both the upper boundary and the left boundary of the current frame.
17. The apparatus or method of claim 16, wherein determining the prediction block is based on a DC prediction with the prediction value set to 1 << (bitDepth - 1).
18. A computer program product comprising instructions which, when executed by a computer, cause the computer to perform the method of any one of claims 2 or 4 to 17.
19. A non-transitory computer-readable medium storing executable program instructions that cause a computer executing the instructions to perform the method of any one of claims 2 or 4 to 17.
20. A signal comprising data generated according to the method of claim 4, or of any one of claims 5 to 17 when dependent on claim 4.
21. A bitstream formatted according to the method of claim 5 when dependent on claim 4, the bitstream being formatted to include syntax elements indicating the constant number of comparisons, together with encoded image information.
22. A device comprising: the apparatus of claim 1 or claim 3, or of any one of claims 5 to 17 when dependent on claim 1 or claim 3; and at least one of: (i) an antenna configured to receive a signal, the signal including data representing image information; (ii) a band limiter configured to limit the received signal to a band of frequencies that includes the data representing the image information; and (iii) a display configured to display an image from the image information.
23. The device of claim 22, wherein the device comprises one of a television, a television signal receiver, a set-top box, a gateway device, a mobile device, a cellular telephone, a tablet computer, a computer, a laptop, or another electronic device.
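As a small numeric illustration of the DC fallback value referenced in the claims, the shift 1 << (bitDepth - 1) simply yields the mid-range sample value for a given bit depth. The helper name below is hypothetical, introduced only for this sketch:

```python
def dc_fallback(bit_depth: int) -> int:
    """Mid-range sample value used as the DC fallback prediction."""
    return 1 << (bit_depth - 1)

print(dc_fallback(8))   # 8-bit video -> 128 (mid-gray)
print(dc_fallback(10))  # 10-bit video -> 512
```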
CN202280026726.5A 2021-03-30 2022-03-22 Template matching prediction for video encoding and decoding Pending CN117501692A (en)

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
EP21305403.4 2021-03-30
EP21305892.8 2021-06-29
EP21305892 2021-06-29
PCT/EP2022/057416 WO2022207400A1 (en) 2021-03-30 2022-03-22 Template matching prediction for video encoding and decoding

Publications (1)

Publication Number Publication Date
CN117501692A true CN117501692A (en) 2024-02-02

Family

ID=76845168

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202280026726.5A Pending CN117501692A (en) 2021-03-30 2022-03-22 Template matching prediction for video encoding and decoding

Country Status (1)

Country Link
CN (1) CN117501692A (en)


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination