
WO2024216414A1 - Encoding method and apparatus, device, code stream, storage medium


Info

Publication number: WO2024216414A1
Authority: WIPO (PCT)
Prior art keywords: reference block, block, search, search process, blocks
Application number: PCT/CN2023/088557
Other languages: English (en), French (fr)
Inventors: 杨铀, 罗景洋, 叶杰栋, 刘琼
Original Assignee: Oppo广东移动通信有限公司
Application filed by Oppo广东移动通信有限公司 filed Critical Oppo广东移动通信有限公司
Priority to PCT/CN2023/088557 priority Critical patent/WO2024216414A1/zh
Publication of WO2024216414A1 publication Critical patent/WO2024216414A1/zh

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/503Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
    • H04N19/51Motion estimation or motion compensation

Definitions

  • the embodiments of the present application relate to video coding technology, and relate to but are not limited to coding methods and devices, equipment, code streams, and storage media.
  • Light field images are composed of a series of regularly arranged macro pixels. According to the imaging principle of light field cameras, there is a strong correlation between adjacent macro pixels. Therefore, in the search process of motion estimation, the search method based on macro pixel units can fully utilize the correlation of light field images, thereby bringing more efficient compression performance. However, although this search method takes into account the macro pixel arrangement regularity of light field images, searching in macro pixel units may lose some local optimal points, thereby reducing the compression performance of video images.
  • the encoding method, apparatus, device, code stream, and storage medium provided in the embodiments of the present application can improve the motion estimation accuracy of the current block, thereby saving the bit overhead of the code stream and improving the compression performance of the video image;
  • the encoding method, apparatus, device, code stream, and storage medium provided in the embodiments of the present application are implemented as follows:
  • a coding method which is applied to an encoder, and the method includes: searching for a second reference block of the current block in a first reference image based on multiple adjacent blocks of the first reference block of the current block in the first reference image; determining first motion information of the current block based on the second reference block; and generating a code stream based on the first motion information.
  • an encoding device which is applied to an encoder, and the device comprises: a first search module, configured to search for a second reference block of the current block in a first reference image based on multiple adjacent blocks of the first reference block of the current block in the first reference image; a second search module, configured to determine first motion information of the current block based on the second reference block; and an encoding module, configured to generate a code stream based on the first motion information.
  • an encoder comprising a first memory and a first processor; wherein the first memory is used to store a computer program that can be run on the first processor; and the first processor is used to execute the encoding method as described in the embodiment of the present application when running the computer program.
  • a code stream is provided, wherein the code stream is generated by the encoding method described in the embodiment of the present application.
  • an electronic device comprising: a processor, adapted to execute a computer program; and a computer-readable storage medium, wherein the computer-readable storage medium stores a computer program, and when the computer program is executed by the processor, the encoding method described in the embodiment of the present application is implemented.
  • a second reference block of the current block in the first reference image is searched for based on multiple adjacent blocks of the first reference block of the current block in the first reference image, rather than directly determining the first motion information of the current block based on the first reference block; in this way, the motion estimation accuracy of the current block is improved, and the first motion information is made closer to the true value, which is beneficial to saving the bit overhead of the code stream and improving the compression performance of the video image.
  • FIG. 1 is a schematic diagram of a processing flow of an encoding end of a video encoding and decoding framework provided in an embodiment of the present application;
  • FIG. 2 is a schematic diagram of a processing flow of a decoding end of a video encoding and decoding framework provided in an embodiment of the present application;
  • FIG. 3 is a schematic diagram of a network architecture of a coding and decoding system provided in an embodiment of the present application;
  • FIG. 4 is a schematic diagram of a coding and decoding system provided in an embodiment of the present application;
  • FIG. 5 is a schematic diagram of an implementation flow of an encoding method provided in an embodiment of the present application;
  • FIG. 6 is a schematic diagram of another implementation flow of an encoding method provided in an embodiment of the present application;
  • FIG. 7 is a schematic diagram of pixel-level fine correction of initial search points of a light field video provided by an embodiment of the present application;
  • FIG. 8 is a schematic diagram of a preliminary search at the macro-pixel level for a light field video according to an embodiment of the present application;
  • FIG. 9 is a schematic diagram of a macro-pixel-level fine search of a light field video with a step size of 1 provided in an embodiment of the present application;
  • FIG. 10 is a schematic diagram of a macro-pixel-level fine search of a light field video when the step size is greater than a certain threshold value provided by an embodiment of the present application;
  • FIG. 11A is a schematic diagram of a partial implementation flow of an encoding method provided in an embodiment of the present application;
  • FIG. 11B is a schematic diagram of another partial implementation flow of the encoding method provided in an embodiment of the present application;
  • FIG. 12 is a schematic diagram of a light field image;
  • FIG. 13 is a schematic diagram of the structure of an encoding device provided in an embodiment of the present application;
  • FIG. 14 is a schematic diagram of the structure of an encoder provided in an embodiment of the present application.
  • Each image or sub-image or frame in the video is divided into square largest coding units (LCU) or coding tree units (CTU) of the same size (such as 128x128 or 64x64, etc.).
  • Each largest coding unit or coding tree unit can be divided into rectangular coding units (CU) according to rules.
  • Coding units may also be divided into prediction units (PU) and/or transform units (TU).
  • the hybrid coding framework includes modules such as prediction, transform, quantization, entropy coding and in-loop filter.
  • the prediction module includes intra prediction and inter prediction.
  • Inter prediction includes motion estimation and motion compensation.
  • Since there is a strong similarity between adjacent pixels in an image, the intra-frame prediction method is used in video coding and decoding technology to eliminate the spatial redundancy between adjacent pixels. Since there is a strong similarity between adjacent images in a video, the inter-frame prediction method is used in video coding and decoding technology to eliminate the temporal redundancy between adjacent images, thereby improving coding and decoding efficiency.
  • FIG1 is a schematic diagram of the processing flow of the encoding end of the video encoding and decoding framework provided by an embodiment of the present application.
  • a frame of image 101 is divided into blocks, and intra-frame prediction or inter-frame prediction is used for the current block to generate a prediction block of the current block.
  • the prediction block is subtracted from the original block of the current block to obtain a residual block, and the residual block is transformed and quantized to obtain a quantization coefficient matrix.
  • the quantization coefficient matrix is entropy encoded and output to the code stream.
  • intra-frame prediction or inter-frame prediction is used for the current block to generate a prediction block/prediction value of the current block.
  • the code stream is parsed and entropy decoded to obtain the quantization coefficient matrix, and the quantization coefficient matrix is inversely quantized and inversely transformed to obtain a residual block.
  • the prediction block and the residual block are added to obtain a reconstructed block.
  • the reconstructed block constitutes a reconstructed image, and the reconstructed image is loop-filtered based on the image or based on the block to obtain a decoded image.
  • the encoding end also requires similar operations as the decoding end to obtain a decoded image. At the encoding end, the decoded image obtained can be used as a reference image for inter-frame prediction for subsequent images.
  • the decoded image obtained by the encoding end is also usually called a reconstructed image.
  • the current block can be divided into prediction units during prediction, and the current block can be divided into transformation units during transformation. The division of prediction units and transformation units can be different.
  • the above is the basic process of the video codec under the block-based hybrid coding framework. With the development of technology, some modules or steps of the framework or process may be optimized.
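The encode/decode round trip described above can be sketched as follows. This is a minimal illustration only: the transform and entropy coding stages are omitted, and the block values, quantization step, and function names are hypothetical.

```python
def encode_block(original, prediction, qstep):
    # Residual = original - prediction, then uniform scalar quantization.
    # A real codec would transform the residual (e.g. with a DCT) first.
    return [round((o - p) / qstep) for o, p in zip(original, prediction)]

def reconstruct_block(levels, prediction, qstep):
    # Decoder side: inverse-quantize the residual and add the prediction.
    return [p + l * qstep for l, p in zip(levels, prediction)]

original = [52, 55, 61, 59]    # hypothetical block of samples
prediction = [50, 50, 60, 60]  # prediction block from intra/inter prediction

levels = encode_block(original, prediction, qstep=2)
recon = reconstruct_block(levels, prediction, qstep=2)
print(levels, recon)
```

Note that reconstruction is lossy: quantization discards part of the residual, which is why both encoder and decoder must use the same reconstructed image as the reference for subsequent inter prediction.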
  • the coding and decoding method provided in the embodiment of the present application is applicable to the basic process of the video codec under the block-based hybrid coding framework, but is not limited to the framework and process. It is known to those skilled in the art that with the evolution of encoders and decoders and the emergence of new business scenarios, the method provided in the embodiment of the present application is also applicable to similar technical problems.
  • the current block may be a current coding unit (CU) or a current prediction unit (PU), etc.
  • the present application embodiment also provides a network architecture of a codec system including an encoder and a decoder, wherein FIG. 3 shows a schematic diagram of a network architecture of a codec system provided by the present application embodiment.
  • the network architecture includes one or more electronic devices 13 to 1N and a communication network 01, wherein the electronic devices 13 to 1N can perform video interaction through the communication network 01.
  • the electronic device can be various types of devices with video codec functions during implementation, for example, the electronic device can include a smart phone, a tablet computer, a personal computer, a personal digital assistant, a navigator, a digital phone, a video phone, a television, a sensor device, a server, etc., and the present application embodiment is not specifically limited.
  • the decoder or encoder described in the present application embodiment can be the above-mentioned electronic device.
  • the method of the embodiment of the present application is mainly applied to the inter-frame prediction module shown in Figure 1.
  • the "current block" specifically refers to the encoding block currently to be inter-frame predicted.
  • the encoding and decoding method provided in the embodiment of the present application is applicable to the encoding and decoding of light field video images/light field images.
  • light field video is video captured by a multi-view camera or a multi-view camera array, and is currently being studied by the MPEG LVC (Lenslet Video Coding) working group.
  • the light field camera adds a set of microlens arrays in front of the imaging plane, so that the light at the same point on the object plane can be captured by multiple microlenses at the same time, which is equivalent to shooting the same point from multiple angles at the same time.
  • the captured/photographed light field video can be converted into a data format that meets the input requirements of the codec, and then encoded into a bit stream through the codec and transmitted to the decoding end.
  • the decoding end parses the bit stream, obtains the light field video in the data format, and then converts it into a light field video that meets the display requirements.
  • FIG5 is a schematic diagram of an implementation flow of the encoding method provided in the embodiment of the present application. As shown in FIG5 , the method includes the following steps 501 to 503:
  • Step 501 searching for a second reference block of the current block in a first reference image according to a plurality of neighboring blocks of the first reference block of the current block in the first reference image;
  • the encoder may implement step 501 according to steps 603 to 607 described in the following embodiments.
  • Step 502 Determine first motion information of the current block according to the second reference block.
  • the encoder may implement step 502 according to steps 608 to 609 described in the following embodiments. Further, the encoder may implement step 502 according to steps 708 to 724 described in the following embodiments.
  • Step 503 Generate a bit stream according to the first motion information.
  • a second reference block of the current block in the first reference image is searched for based on multiple adjacent blocks of the first reference block of the current block in the first reference image, rather than directly determining the first motion information of the current block based on the first reference block; in this way, the motion estimation accuracy of the current block is improved, and the first motion information is made closer to the true value, which is beneficial to saving the bit overhead of the code stream and improving the compression performance of the video image.
  • the application scenario of the encoding method provided in the embodiment of the present application is not limited, and it can be light field video compression or other types of video compression.
  • the first reference image can be a light field image, and the current block is a block in a light field image. That is to say, the encoding method provided in the embodiment of the present application can be applied to light field video compression.
  • a light field image is composed of a series of regularly arranged macro pixels. According to the imaging principle of a light field camera, there is a strong correlation between adjacent macro pixels. Therefore, in the motion estimation search process, compared with the conventional pixel-based method, the macro pixel-based search method can fully utilize the correlation of the light field image, thereby bringing more efficient compression performance.
  • the search for motion estimation based on macropixels may lose some local optimal points.
  • because the content matching the current block may move across macropixels between different images, and there are also differences in parallax and size between adjacent macropixels, the best reference block obtained by the macropixel-based motion estimation search method may not be strictly arranged according to the macropixel spacing.
  • a pixel-level motion estimation search is performed in an embodiment of the present application, that is, based on multiple adjacent blocks of the first reference block of the current block in the first reference image, a second reference block of the current block in the first reference image is searched for, and then the first motion information of the current block is determined based on the second reference block; in this way, the first motion information obtained by the search is closer to the true value, which is beneficial to saving the bit overhead of the code stream and improving the compression performance of the video image.
  • FIG. 6 is a schematic diagram of an implementation flow of the encoding method provided in the embodiment of the present application. As shown in FIG. 6 , the method includes the following steps 601 to 610:
  • Step 601 perform motion estimation on the current block according to the motion information candidates of the current block to obtain second motion information of the current block; wherein the second motion information includes a first motion vector and an index value of the first reference image.
  • the encoder may adopt the Merge mode to construct a Merge list of the current block, which records the motion information candidates of the current block. In other embodiments, the encoder may also adopt the AMVP mode to construct an AMVP list of the current block, which records the motion information candidates of the current block.
  • Step 602 Determine a first reference block of the current block in the first reference image according to the second motion information.
  • the second motion information includes the first motion vector (MV) of the current block and the index value of the first reference image. Therefore, the encoder can easily find the block (i.e., the first reference block) pointed to by the first motion vector in the first reference image based on the second motion information.
  • Step 603 taking the first reference block as the central block, executing a first search process, wherein the first search process includes: searching for a neighboring block with the smallest rate distortion cost from a plurality of neighboring blocks of the central block as a first candidate reference block; wherein the size of the neighboring block is equal to the size of the first reference block.
  • the rate-distortion cost of the adjacent block refers to the rate-distortion cost caused by assuming that the current block is inter-frame predicted and encoded based on the adjacent block.
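The rate-distortion cost mentioned above can be sketched as a Lagrangian cost, distortion plus λ times rate. This is an illustrative simplification: SAD is used as the distortion measure, and the `mv_bits` rate term and the λ value are hypothetical.

```python
def sad(block_a, block_b):
    # Sum of absolute differences as a simple distortion measure.
    return sum(abs(a - b) for a, b in zip(block_a, block_b))

def rd_cost(current, candidate, mv_bits, lam=4.0):
    # Lagrangian rate-distortion cost: distortion + lambda * rate.
    # `mv_bits` approximates the bits needed to signal the motion vector.
    return sad(current, candidate) + lam * mv_bits

cur = [10, 12, 14, 16]
cand = [11, 12, 13, 15]
print(rd_cost(cur, cand, mv_bits=3))  # 15.0
```

The search processes below all compare candidates by a cost of this general shape, so a cheaper candidate is one that better trades distortion against signalling bits.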
  • Step 604 taking the first candidate reference block currently searched as the center block, executing the first search process to obtain another first candidate reference block;
  • Step 605 determining whether there are two central blocks in the selected central blocks that are the same region block; if so, taking the same region block as the second reference block and proceeding to step 608; otherwise, executing step 606;
  • Step 606 determining whether the number of executions of the first search process is equal to the first number threshold; if so, executing step 607; otherwise, taking the first candidate reference block currently searched as the center block, returning to execute the first search process;
  • the first number threshold which may be 2, 3, or 4, etc.
  • the first number threshold is a value greater than or equal to 2.
  • Step 607 taking the first candidate reference block with the minimum rate-distortion cost among the searched first candidate reference blocks as the second reference block, and proceeding to step 608;
  • rate-distortion cost of the first candidate reference block mentioned here refers to the rate-distortion cost caused by assuming that the current block is inter-frame predicted and encoded based on the first candidate reference block.
  • steps 603 to 607 may be understood as operations of performing pixel-level fine correction on the first reference block.
  • pixel point "0" is the initial pixel point (which marks the first reference block), and eight surrounding pixel points are searched with it as the center point.
  • These eight pixel points mark the eight adjacent blocks of the first reference block, and the adjacent block with the smallest rate-distortion cost is selected from these eight adjacent blocks as the first candidate reference block.
  • the same eight-point search is performed with pixel point "1" as the center point to obtain pixel point "2" (which marks another first candidate reference block); and so on to obtain pixel point "3"; finally, the same eight-point search is performed with pixel point "3" as the center point, which again yields pixel point "2", consistent with a previously selected center point, so the block marked by pixel point "2" is the second reference block.
  • steps 603 to 607 are operations of performing pixel-level fine correction on the first reference block, that is, the second reference block is the result of performing pixel-level fine correction on the first reference block; based on this, the second reference block is further corrected at the macro-pixel level through the following step 608; thus, compared with the motion estimation search based only on the macro-pixel level, the combination of pixel-level fine correction and macro-pixel-level correction is beneficial to reducing the impact of inconsistent macro-pixel spacing in the same light field image on the motion estimation accuracy, thereby improving the motion estimation accuracy, and further saving the bit overhead of the code stream and enhancing the light field video compression performance.
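The pixel-level fine correction of steps 603 to 607 can be sketched as an iterative eight-neighbour search. The cost surface, starting point, and round limit below are illustrative assumptions; in the encoder the cost of a point would be the rate-distortion cost of the block it marks.

```python
def eight_point_refine(cost_fn, start, max_rounds=6):
    """Pixel-level fine correction: repeatedly move to the cheapest of the
    eight neighbours of the current centre; stop when a centre is selected
    twice (steps 605) or after `max_rounds` search rounds (steps 606-607)."""
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, -1),
               (0, 1), (1, -1), (1, 0), (1, 1)]
    visited = [start]
    centre = start
    for _ in range(max_rounds):
        best = min(((centre[0] + dx, centre[1] + dy) for dx, dy in offsets),
                   key=cost_fn)
        if best in visited:  # same region block selected as centre twice
            return best
        visited.append(best)
        centre = best
    # Round limit reached: keep the cheapest candidate found so far.
    return min(visited[1:], key=cost_fn)

# Toy cost surface with a minimum at (2, 3); assumed for illustration.
cost = lambda p: (p[0] - 2) ** 2 + (p[1] - 3) ** 2
print(eight_point_refine(cost, (0, 0)))  # (2, 3)
```

The revisit test implements the termination condition of step 605: once a centre block recurs, the search has closed a loop around a local optimum and that block becomes the second reference block.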
  • Step 608 Using the macropixel where the second reference block is located as the starting macropixel, a second search process is performed to search for a third reference block with the minimum rate-distortion cost.
  • the second search process is performed with the macro pixel where the specific sample of the second reference block is located as the starting macro pixel.
  • the specific sample includes a sample at the upper left corner of the second reference block, for example, the specific sample is located at the upper left corner vertex of the second reference block.
  • rate-distortion cost of the third reference block mentioned here refers to the rate-distortion cost caused by assuming that the current block is inter-frame predicted and encoded based on the third reference block.
  • step 608 may include a preliminary search at the macro-pixel level, or step 608 includes a preliminary search at the macro-pixel level and a fine search at the macro-pixel level based on the preliminary search results.
  • the encoder may implement step 608 as follows: using at least one macropixel as a search step, searching the first reference image for a fourth reference block with the lowest rate-distortion cost at the macropixel level; and determining the third reference block based on the fourth reference block.
  • the rate-distortion cost of the fourth reference block refers to the rate-distortion cost caused by assuming that the current block is inter-frame predicted and encoded based on the fourth reference block.
  • the macropixel where the second reference block is located is used as the starting macropixel, and the search is performed on the first reference image at a search step equal to at least one macropixel spacing.
  • the fourth reference block can be obtained by the following macro-pixel-level search: searching for a second candidate reference block having the same size as the second reference block on macro-pixels on one or more search templates according to one or more search steps; selecting a block with the smallest rate-distortion cost from the second reference block and the second candidate reference block as the fifth reference block; wherein the one or more search steps are multiples of macro-pixels; and determining the fourth reference block based on the fifth reference block.
  • the one or more search steps are multiples of macro pixels, which can be understood as one or more search steps are multiples of macro pixel spacing.
  • the operation/process of searching for the fifth reference block can be understood as a preliminary search at the macro-pixel level, and for this search process, there is no limitation on the one or more search templates described therein.
  • the one or more search templates may be a diamond template or a square template or a search template of another shape, and blocks (the same size as the second reference block) marked by specific points (such as vertices or midpoints) of these templates may be searched to find the fifth reference block.
  • the second reference block is used as the initial search point
  • the search step starts from 1 macropixel unit, and increases in the form of an integer power of 2, and the search is performed within the specified search range according to the diamond template to obtain the second candidate reference block (i.e., the initial macropixel search), and the block with the smallest rate-distortion cost is selected from the second reference block and the second candidate reference block as the fifth reference block.
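The preliminary macro-pixel search just described can be sketched as follows. The diamond template, the power-of-2 step growth, and the competition between the start block and the candidates come from the text; the macropixel spacing, cost surface, and search range below are illustrative assumptions.

```python
def macro_preliminary_search(cost_fn, start, spacing, max_step=8):
    """Preliminary macro-pixel search: probe a diamond template around the
    starting macropixel with step sizes of 1, 2, 4, ... macropixel spacings
    (integer powers of 2), keeping the cheapest block seen; the starting
    block itself also competes for selection as the fifth reference block."""
    diamond = [(0, -1), (0, 1), (-1, 0), (1, 0)]
    best, best_cost = start, cost_fn(start)
    step = 1
    while step <= max_step:
        for dx, dy in diamond:
            cand = (start[0] + dx * step * spacing,
                    start[1] + dy * step * spacing)
            if cost_fn(cand) < best_cost:
                best, best_cost = cand, cost_fn(cand)
        step *= 2  # search step grows as an integer power of 2
    return best

# Toy cost surface whose minimum lies two macropixels to the right of the
# start; the spacing of 16 samples is a hypothetical value.
print(macro_preliminary_search(lambda p: abs(p[0] - 32) + abs(p[1]), (0, 0), 16))
```

Because the steps are multiples of the macropixel spacing, every probed block stays aligned with the macropixel grid, which is what lets this stage exploit the inter-macropixel correlation of the light field image.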
  • the fifth reference block can be directly used as the fourth reference block.
  • a macro-pixel-level fine search may be performed based on the fifth reference block to find the fourth reference block.
  • determining the fourth reference block based on the fifth reference block includes:
  • when the search step length between the fifth reference block and the second reference block is greater than 1 macropixel and less than or equal to a step length threshold, the fifth reference block is used as the fourth reference block; wherein the step length threshold is greater than 1 macropixel;
  • otherwise, a fourth candidate reference block having the same size as the fifth reference block is searched for on the remaining macropixels, different from the macropixel where the fifth reference block is located, within a specific range; wherein the macropixel where the fifth reference block is located is within the specific range; and a block with the smallest rate-distortion cost is selected from the fifth reference block and the fourth candidate reference block as the fourth reference block.
  • the specific range used when searching for the fourth candidate reference block is not limited; it may be a range within 1 macropixel spacing of the macropixel where the fifth reference block is located, or a range spanning a plurality of macropixel spacings around that macropixel.
  • as shown in Figure 9, when the search step between the fifth reference block and the second reference block is equal to one macropixel, the block with the smallest rate-distortion cost is selected, as the fourth reference block, from the fifth reference block 901 and blocks 902 and 903 (i.e., the third candidate reference blocks) on the two macropixels adjacent to the macropixel where the fifth reference block 901 is located; and as shown in Figure 10, when the search step between the fifth reference block and the second reference block is greater than the step threshold, with the fifth reference block as the center and one macropixel as the search step, the fourth candidate reference block is searched for on all macropixels adjacent to the macropixel where the fifth reference block is located, and the block with the smallest rate-distortion cost is selected from the fifth reference block and the fourth candidate reference block as the fourth reference block.
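The three cases of the macro-pixel fine search (step equal to one macropixel, step within the threshold, step above the threshold) can be sketched as below. The threshold value, the assumption that the two adjacent macropixels for the step-1 case lie along the horizontal axis, and the cost surface are all illustrative.

```python
from itertools import product

def macro_fine_search(cost_fn, second, fifth, spacing, threshold=4):
    """Macro-pixel fine search: choose the fourth reference block from the
    preliminary result (`fifth`) according to the step length from the
    starting block (`second`). A sketch with hypothetical parameters."""
    # Step length, in macropixels, between preliminary result and start.
    step = max(abs(fifth[0] - second[0]), abs(fifth[1] - second[1])) // spacing
    if 1 < step <= threshold:
        # Moderate step: keep the preliminary result (Fig. 9/10 middle case).
        return fifth
    if step <= 1:
        # Step of one macropixel: probe the two adjacent macropixels
        # (here assumed to lie along the horizontal axis).
        offs = [(-spacing, 0), (spacing, 0)]
    else:
        # Step above the threshold: probe all eight neighbouring macropixels.
        offs = [(dx * spacing, dy * spacing)
                for dx, dy in product((-1, 0, 1), repeat=2) if (dx, dy) != (0, 0)]
    cands = [fifth] + [(fifth[0] + dx, fifth[1] + dy) for dx, dy in offs]
    return min(cands, key=cost_fn)

cost = lambda p: abs(p[0] - 32) + abs(p[1])
print(macro_fine_search(cost, (0, 0), (32, 0), 16))  # step of 2: fifth kept
```

Intuitively, a very small or very large preliminary step suggests the preliminary search may have skipped over the optimum, so extra neighbouring macropixels are probed; a moderate step is trusted as-is.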
  • the encoder may directly use the fourth reference block as the third reference block, or may perform further pixel-level search based on the fourth reference block, etc.
  • for details, refer to steps 717 to 723 of the following embodiment.
  • Step 609 Determine first motion information of the current block according to the third reference block.
  • the encoder may determine the first motion information of the current block based on the motion vector of the current block relative to the third reference block; in other embodiments, the encoder may also determine the first motion information of the current block based on the motion vector of the third reference block relative to the first reference block (that is, the difference between the motion vector of the third reference block relative to the current block and the motion vector of the first reference block relative to the current block).
  • the first motion information includes: a motion vector difference of the third reference block relative to the first reference block, an index value of the first motion information, and an index value of the first reference image.
  • the first motion information includes a motion vector difference of the third reference block relative to the first reference block and an index value of the second motion information.
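The motion vector difference carried in the first motion information can be computed as a simple component-wise subtraction. The vector values below are hypothetical; in the encoder they would come from steps 602 and 608.

```python
def motion_vector_difference(mv_third, mv_first):
    # The signalled MVD is the third reference block's displacement
    # relative to the first reference block (component-wise subtraction).
    return (mv_third[0] - mv_first[0], mv_third[1] - mv_first[1])

# mv_first: first motion vector found by motion estimation (step 601);
# mv_third: vector pointing at the refined third reference block.
mvd = motion_vector_difference((5, -3), (2, 1))
print(mvd)  # (3, -4)
```

Signalling only this difference, plus an index into the already-shared candidate list, is what saves bits relative to transmitting the full refined motion vector.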
  • Step 610 Generate a bitstream according to the first motion information.
  • the encoder further determines an inter-frame prediction value of the current block according to the first motion information, determines a residual value of the current block according to the inter-frame prediction value of the current block and a sample value of the current block; and generates a bitstream according to the residual value of the current block.
  • the decoder parses the bitstream and obtains the residual value of the current block according to the parsing result; the decoder also parses the bitstream to obtain the index value of the second motion information and the motion vector difference, based on which the second motion information (which includes the first motion vector and the index value of the first reference image) is obtained; the decoder determines the third reference block of the current block based on the first motion vector and the motion vector difference (based on which the predicted value of the current block is obtained); the decoder obtains the reconstructed value of the current block based on the residual value of the current block and the predicted value of the current block.
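The decoder-side recovery of the refined motion vector described above is the inverse of the MVD computation: the parsed difference is added back to the first motion vector. The values are hypothetical and mirror the encoder-side example.

```python
def reconstruct_motion_vector(mv_first, mvd):
    # Decoder side: the motion vector of the third reference block is
    # recovered by adding the parsed MVD to the first motion vector,
    # which the decoder rebuilt from the signalled index value.
    return (mv_first[0] + mvd[0], mv_first[1] + mvd[1])

mv_third = reconstruct_motion_vector((2, 1), (3, -4))
print(mv_third)  # (5, -3)
```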
  • FIG. 11A and FIG. 11B are schematic diagrams of an implementation flow of the encoding method provided in the embodiment of the present application. As shown in FIG. 11A and FIG. 11B , the method includes the following steps 701 to 725, wherein FIG. 11A shows steps 701 to 716, and FIG. 11B shows steps 717 to 725:
  • Step 701 performing motion estimation on the current block according to the motion information candidate of the current block to obtain second motion information of the current block; wherein the second motion information includes a first motion vector and an index value of the first reference image;
  • Step 702 determining the first reference block of the current block in the first reference image according to the second motion information
  • Step 703 taking the first reference block as the central block, performing a first search process, wherein the first search process includes: searching for an adjacent block with the minimum rate distortion cost from a plurality of adjacent blocks of the central block as a first candidate reference block;
  • the size of the neighboring block is equal to the size of the first reference block.
  • Step 704 performing the first search process with the first candidate reference block currently searched as the center block;
  • Step 705 determining whether there are two central blocks in the selected central blocks that are the same region block; if so, taking the same region block as the second reference block and proceeding to step 708; otherwise, executing step 706;
  • Step 706 determining whether the number of executions of the first search process is equal to the first number threshold; if so, executing step 707; otherwise, taking the first candidate reference block currently searched as the center block, returning to execute the first search process;
  • Step 707 taking the first candidate reference block with the minimum rate-distortion cost among the first candidate reference blocks obtained by searching as the second reference block, and proceeding to step 708;
  • Step 708 taking the macropixel where the second reference block is located as the starting macropixel, and searching for a second candidate reference block having the same size as the second reference block in macropixels on one or more search templates according to one or more search steps;
  • Step 709 selecting a block with the smallest rate-distortion cost from the second reference block and the second candidate reference block as a fifth reference block; wherein the one or more search steps are multiples of macro pixels;
  • Step 710 determining whether the search step length between the fifth reference block and the second reference block is equal to one macro pixel; if yes, executing step 711; otherwise, executing step 713;
  • Step 711 using one macropixel as a search step, searching for a third candidate reference block having the same size as the fifth reference block from N adjacent macropixels of the macropixel where the fifth reference block is located, and proceeding to step 712; wherein N is greater than or equal to 1;
  • Step 712 selecting a block with the smallest rate-distortion cost from the fifth reference block and the third candidate reference block as the fourth reference block, and proceeding to step 717;
  • Step 713 determining whether the search step length between the fifth reference block and the second reference block is greater than a step length threshold; if so, executing step 714; otherwise, executing step 716; wherein the step length threshold is greater than 1 macro pixel;
  • Step 714 searching for a fourth candidate reference block having the same size as the fifth reference block on the remaining macropixels in the specific range that are different from the macropixels where the fifth reference block is located; proceeding to step 715; wherein the macropixels where the fifth reference block is located are within the specific range;
  • Step 715 select a block with the smallest rate-distortion cost from the fifth reference block and the fourth candidate reference block as the fourth reference block, and proceed to step 717 .
  • Step 716 using the fifth reference block as the fourth reference block, and proceeding to step 717; this branch applies when the search step length is greater than 1 macropixel and less than or equal to the step length threshold;
  • Step 717 taking the fourth reference block as the central block, executing the first search process to obtain a new second reference block;
  • Step 718 determining the third reference block according to the new second reference block
  • step 718 may be implemented by steps 719 to 723 as follows:
  • Step 719 taking the new second reference block as the center block, performing a third search process to obtain a new second reference block; wherein the third search process includes the first search process and the second search process;
  • Step 720 determining whether the new second reference blocks obtained by two adjacent executions of the third search process are the same region block; if so, executing step 721; otherwise, executing step 722;
  • Step 721 using the new second reference block that is the same region block as the third reference block, and proceeding to step 724;
  • Step 722 determining whether the number of executions of the third search process is equal to the second number threshold; if so, executing step 723; otherwise, taking the current new second reference block as the center block and returning to step 719;
  • the value of the second number threshold is not limited, and can be 2, 3, or 4, etc.
  • the first number threshold is a value greater than or equal to 2.
  • the first number threshold and the second number threshold can be equal or different.
  • Step 723 taking the block with the smallest rate-distortion cost in the new second reference block as the third reference block, and proceeding to step 724;
  • Step 724 determining first motion information of the current block according to the third reference block
  • Step 725 Generate a bit stream according to the first motion information.
  • The rate-distortion cost of a block described in the embodiments of the present application refers to the rate-distortion cost incurred by assuming that inter-frame prediction and encoding are performed on the current block based on that block.
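The first search process described in steps 703 to 707 can be sketched as follows. This is a minimal illustration rather than the patented implementation: `rd_cost` is an assumed caller-supplied rate-distortion cost function, candidate reference blocks are identified by their (x, y) positions, and the eight pixel-adjacent positions are assumed as the neighboring blocks.

```python
def first_search_process(start, rd_cost, max_iterations=3):
    """Iteratively move to the minimum-RD-cost neighboring block until a
    selected center repeats (the "same region block" condition) or the
    iteration count reaches the first number threshold."""
    # Eight pixel-adjacent offsets around the current center block.
    neighbors = [(-1, -1), (0, -1), (1, -1), (-1, 0),
                 (1, 0), (-1, 1), (0, 1), (1, 1)]
    visited_centers = [start]
    center = start
    best_candidate, best_cost = None, float("inf")
    for _ in range(max_iterations):
        # Search the neighboring block with the minimum rate-distortion cost.
        candidate = min(((center[0] + dx, center[1] + dy)
                         for dx, dy in neighbors), key=rd_cost)
        cost = rd_cost(candidate)
        if cost < best_cost:
            best_candidate, best_cost = candidate, cost
        if candidate in visited_centers:
            # Two selected center blocks are the same region block: done.
            return candidate
        visited_centers.append(candidate)
        center = candidate
    # Threshold reached: return the candidate with the minimum RD cost so far.
    return best_candidate
```

With this shape, the loop terminates either because a newly selected center coincides with an earlier one (the "same region block" condition of step 705) or because the number of executions reaches the first number threshold (step 706).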
  • The arrangement regularity of macropixels in the light field image can be used to improve motion estimation.
  • The specific method is as follows: based on the TZSearch algorithm of traditional motion estimation, the motion vector search step is set each time to a multiple of the light field macropixel spacing, as shown in the following formula (1):

    dx = dx′ + ΔstepX · Lx
    dy = dy′ + ΔstepY · Ly        (1)

  • where dx and dy are the motion vector offsets in this iteration (in the horizontal and vertical directions, respectively); dx′ and dy′ are the corresponding offsets in the previous iteration; ΔstepX and ΔstepY are the search steps added in this iteration; and Lx and Ly are the lengths of each step, for example set to the macropixel spacing of the light field image (with separate spacings in the horizontal and vertical directions).
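As a hedged illustration of formula (1), the per-iteration offset update can be written as below; the names `update_offset`, `dx_prev` and `dy_prev` are assumptions, since the original text reuses dx and dy for both the current and the previous iteration.

```python
def update_offset(dx_prev, dy_prev, delta_step_x, delta_step_y, Lx, Ly):
    """Formula (1): dx = dx' + ΔstepX·Lx,  dy = dy' + ΔstepY·Ly,
    where Lx and Ly are the horizontal and vertical macropixel spacings."""
    dx = dx_prev + delta_step_x * Lx
    dy = dy_prev + delta_step_y * Ly
    return dx, dy
```

Because each increment is a multiple of (Lx, Ly), every search position produced this way lands on the macropixel grid.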
  • the light field image is composed of a series of regularly arranged macro pixels, as shown in Figure 12.
  • According to the imaging principle of the light field camera, there is a strong correlation between adjacent macropixels. Therefore, in the search process of motion estimation, a search method based on macropixel units can make fuller use of the correlation of the light field image, thereby bringing more efficient compression performance.
  • In view of this, a pixel-level fine search can be appropriately added on the basis of the macropixel search, and the candidate reference block position obtained by the macropixel search can be offset and corrected at the pixel level to further increase its correlation with the current prediction unit, thereby further optimizing the motion estimation effect of light field video compression.
  • Step (1) based on HEVC's Advanced Motion Vector Prediction (AMVP) algorithm, the encoder selects the MV with the smallest rate-distortion cost, and uses the position pointed to by that MV as the initial search point;
  • Step (2) perform pixel-level fine correction on the initial search point: take the initial search point obtained in step (1) as the center, perform multiple steps of pixel-level fine search in the neighborhood around the point, and perform fine correction on it.
  • Specifically, the eight pixel points in the neighborhood of the point are searched, the optimal point with the minimum rate-distortion cost is selected from them, and that optimal point is used as the new center point to perform the same pixel-level fine search in its neighborhood again.
  • When the optimal point selected in a certain search is consistent with the previously selected center point, that optimal point is used as the result of this step; when the number of searches reaches a certain threshold, the search is terminated early, and the optimal point with the minimum rate-distortion cost is used as the result of this step.
  • As shown in FIG. 7, pixel point "0" is the initial pixel point, and the eight surrounding pixel points are searched with it as the center point to obtain the optimal pixel point "1"; then the same eight-point search is performed with "1" as the center point to obtain "2", and so on to obtain "3"; finally, "2" is obtained with "3" as the center point, which is consistent with a previously selected center point, so the motion vector corresponding to pixel point "2" is the result of this step;
  • Step (3) based on the search results of step (2), a preliminary search at the macropixel level is performed: as shown in FIG. 8, the search step size starts from 1 unit and increases as integer powers of 2, a search is performed within a specified search range according to a diamond template (or a square template), and the search point with the minimum rate-distortion cost is selected as the result of this step, where the step-size unit is set to the macropixel spacing of the light field image;
  • Step (4) based on the search results of step (3), a fine search is performed at the macropixel level: if the step size corresponding to the optimal point obtained in step (3) is 1, then as shown in FIG. 9, a two-point search is performed around the point, and the search point with the minimum rate-distortion cost is selected as the result of this step.
  • the search here is also in units of macro-pixels;
  • If the step size corresponding to the optimal point obtained in step (3) is greater than a certain threshold, then as shown in FIG. 10, a full search is performed within a certain range centered on the point, and the search point with the minimum rate-distortion cost is selected as the result of this step.
  • the search here is also in units of macro pixels;
  • Step (5) based on the search results of step (4), a conventional pixel-level fine search is performed: taking the result point obtained in step (4) as the center point, a multi-step pixel-level fine search is performed in the neighborhood around the point, and the specific method is the same as that described in step (2);
  • Step (6) take the optimal result point obtained in step (5) as the new initial search point, and repeat steps (2) to (5).
  • When the result points obtained from two consecutive searches are consistent, or the number of searches reaches a certain threshold, the search stops, and the motion vector corresponding to the last search result point is the final motion vector.
  • In this solution, a conventional pixel-level fine search is added on the basis of the macropixel search, which more fully considers local optima, further refines the search results, and improves motion estimation performance;
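The macropixel-level preliminary search of step (3) can be sketched as follows, assuming a diamond template and a step size growing as integer powers of 2; `rd_cost` and the macropixel spacing `pitch` are assumptions. This illustrates the idea, not a reference implementation.

```python
def macropixel_preliminary_search(center, rd_cost, pitch, search_range):
    """TZSearch-style raster over a diamond template where every tested
    position is a whole number of macropixel spacings away from the center."""
    diamond = [(0, -1), (-1, 0), (1, 0), (0, 1)]  # diamond template
    best, best_cost, best_step = center, rd_cost(center), 0
    step = 1
    while step * pitch <= search_range:           # step grows as 1, 2, 4, 8, ...
        for dx, dy in diamond:
            point = (center[0] + dx * step * pitch,
                     center[1] + dy * step * pitch)
            cost = rd_cost(point)
            if cost < best_cost:
                best, best_cost, best_step = point, cost, step
        step *= 2
    return best, best_step
```

The returned step size corresponds to the branch taken in step (4): a step of 1 triggers the two-point fine search, while a step above the threshold triggers the full search around the best point.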
  • FIG. 13 is a schematic diagram of the structure of the encoding device provided in the embodiment of the present application. As shown in FIG. 13 , the encoding device 13 includes:
  • a first search module 131 is configured to search for a second reference block of the current block in the first reference image based on a plurality of neighboring blocks of the first reference block of the current block in the first reference image;
  • a second search module 132 configured to determine first motion information of the current block according to the second reference block
  • the encoding module 133 is configured to generate a code stream according to the first motion information.
  • the first reference image is a light field image
  • the current block is a light field image block
  • the first search module 131 is configured to: perform a first search process with the first reference block as the central block, the first search process comprising: searching for an adjacent block with the smallest rate distortion cost from multiple adjacent blocks of the central block as a first candidate reference block; perform the first search process with the first candidate reference block currently searched as the central block; when there are two central blocks in the selected central blocks that are the same area block, the same area block is the second reference block.
  • the first search module 131 is further configured to: in a case where there are no two center blocks in the selected center blocks that are the same area block, perform the first search process with the first candidate reference block currently searched as the center block; in a case where two first candidate reference blocks in the searched first candidate reference blocks are the same area block, the same area block is the second reference block.
  • the first search module 131 is further configured to: when there are no two central blocks in the selected central blocks that are the same area blocks and the number of executions of the first search process is equal to the first number threshold, the first candidate reference block with the smallest rate-distortion cost among the first candidate reference blocks searched is the second reference block.
  • the first search module 131 is further configured to: when there are no two central blocks in the selected central blocks that are the same area blocks and the number of executions of the first search process is less than the first number threshold, use the first candidate reference block currently searched as the central block to iteratively execute the first search process until the second reference block is searched out or the number of executions of the first search process is equal to the first number threshold.
  • the size of the neighboring block is equal to the size of the first reference block.
  • the second search module 132 is configured to perform a second search process with the macropixel where the second reference block is located as the starting macropixel to search for a third reference block with the minimum rate-distortion cost; and determine the first motion information of the current block based on the third reference block.
  • the second search process includes: using at least one macropixel as a search step, searching for a fourth reference block with the minimum macropixel rate-distortion cost in the first reference image; and determining the third reference block based on the fourth reference block.
  • the method of searching for a fourth reference block with the minimum rate-distortion cost on macropixels in the first reference image with at least one macropixel as the search step size includes: searching for a second candidate reference block with the same size as the second reference block on macropixels on one or more search templates according to one or more search steps; selecting a block with the minimum rate-distortion cost from the second reference block and the second candidate reference blocks as the fifth reference block; wherein the one or more search steps are multiples of macropixels; and determining the fourth reference block based on the fifth reference block.
  • determining the fourth reference block based on the fifth reference block includes: when the search step between the fifth reference block and the second reference block is equal to one macropixel, using one macropixel as the search step, searching for a third candidate reference block with the same size as the fifth reference block from N adjacent macropixels of the macropixel where the fifth reference block is located; wherein N is greater than or equal to 1; and selecting a block with the smallest rate-distortion cost from the fifth reference block and the third candidate reference block as the fourth reference block.
  • the step of searching for a fourth reference block with the minimum macropixel rate-distortion cost in the first reference image with at least one macropixel as the search step includes: when the search step between the fifth reference block and the second reference block is greater than 1 macropixel and less than or equal to a step threshold, using the fifth reference block as the fourth reference block; wherein the step threshold is greater than 1 macropixel.
  • the step of searching for a fourth reference block with the minimum rate-distortion cost on macropixels in the first reference image with at least one macropixel as the search step size includes: when the search step size between the fifth reference block and the second reference block is greater than a step size threshold, searching for a fourth candidate reference block with the same size as the fifth reference block on the remaining macropixels that are different from the macropixels where the fifth reference block is located within a specific range; wherein the macropixels where the fifth reference block is located are within the specific range; and selecting the block with the minimum rate-distortion cost from the fifth reference block and the fourth candidate reference block as the fourth reference block.
  • a fourth candidate reference block having the same size as the fifth reference block is searched for on all neighboring macropixels of the macropixel where the fifth reference block is located.
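The branch that selects the fourth reference block, depending on the search step that produced the fifth reference block, can be sketched as below; `two_point_search` and `full_range_search` are hypothetical helpers standing in for the searches described above, both operating in macropixel units.

```python
def select_fourth_reference_block(fifth_block, step, step_threshold,
                                  two_point_search, full_range_search):
    """Dispatch on the step size that yielded the fifth reference block."""
    if step == 1:
        # Step equals one macropixel: refine among adjacent macropixels.
        return two_point_search(fifth_block)
    if step > step_threshold:
        # Large step: full search over the remaining macropixels in range.
        return full_range_search(fifth_block)
    # 1 macropixel < step <= threshold: keep the fifth reference block as-is.
    return fifth_block
```

The middle case mirrors the text: when the step is moderate, no further macropixel search is performed and the fifth reference block is used directly.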
  • determining the third reference block based on the fourth reference block includes: taking the fourth reference block as the center block, executing the first search process to obtain a new second reference block; and determining the third reference block based on the new second reference block.
  • determining the third reference block based on the new second reference block includes: taking the new second reference block as the central block, iteratively executing a third search process, wherein the third search process includes the first search process and the second search process, until the new second reference blocks obtained by two adjacent searches of the third search process are the same regional block, and the new second reference block that is the same regional block is used as the third reference block; or, until the number of executions of the third search process is equal to a second number threshold, the block with the smallest rate-distortion cost in the new second reference block is used as the third reference block.
  • the encoding device 13 also includes a motion estimation module and a determination module; wherein the motion estimation module is configured to perform motion estimation on the current block based on motion information candidates of the current block before searching for the second reference block based on multiple adjacent blocks of the first reference block in the first reference image to obtain the second motion information of the current block; wherein the second motion information includes a first motion vector and an index value of the first reference image; and the determination module is configured to determine the first reference block of the current block in the first reference image based on the second motion information.
  • the motion estimation module is configured to perform motion estimation on the current block based on motion information candidates of the current block before searching for the second reference block based on multiple adjacent blocks of the first reference block in the first reference image to obtain the second motion information of the current block
  • the second motion information includes a first motion vector and an index value of the first reference image
  • the determination module is configured to determine the first reference block of the current block in the first reference image based on the second motion information.
  • the first motion information includes a motion vector difference of the third reference block relative to the first reference block, an index value of the first motion vector, and an index value of the first reference image.
  • the first motion information includes a motion vector difference of the third reference block relative to the first reference block and an index value of the second motion information.
  • the description of the above coding device embodiment is similar to the description of the above coding method embodiment, and has similar beneficial effects as the coding method embodiment.
  • For technical details not disclosed in the coding device embodiment of the present application please refer to the description of the coding method embodiment of the present application for understanding.
  • each functional unit in each embodiment of the present application may be integrated into one processing unit, or may exist physically separately, or two or more units may be integrated into one unit.
  • the integrated unit can be implemented in the form of hardware, or in the form of a software functional unit, or in the form of a combination of software and hardware.
  • The technical solutions of the embodiments of the present application, in essence, or the part contributing to the related art, can be embodied in the form of a software product; the software product is stored in a storage medium and includes several instructions for enabling an electronic device to execute all or part of the methods described in each embodiment of the present application.
  • the aforementioned storage medium includes: various media that can store program codes, such as a USB flash drive, a mobile hard disk, a read-only memory (ROM), a magnetic disk or an optical disk. In this way, the embodiments of the present application are not limited to any specific combination of hardware and software.
  • An embodiment of the present application provides a computer-readable storage medium, wherein the computer-readable storage medium stores a computer program, and when the computer program is executed, the encoding method or the decoding method as described in the embodiment of the present application is implemented.
  • As shown in FIG. 14, the encoder 14 includes: a communication interface 141, a memory 142 and a processor 143; the components are coupled together through a bus system 144.
  • the bus system 144 is used to realize the connection and communication between these components.
  • the bus system 144 also includes a power bus, a control bus and a status signal bus.
  • For clarity of illustration, the various buses are all marked as the bus system 144 in FIG. 14. Among them:
  • the communication interface 141 is used for receiving and sending signals during the process of sending and receiving information with other external network elements;
  • a memory 142 for storing computer programs that can be run on the processor 143;
  • the processor 143 is used to execute the encoding method described in the embodiment of the present application when running the computer program.
  • the memory 142 in the embodiment of the present application can be a volatile memory or a non-volatile memory, or can include both volatile and non-volatile memories.
  • the non-volatile memory can be a read-only memory (ROM), a programmable read-only memory (PROM), an erasable programmable read-only memory (EPROM), an electrically erasable programmable read-only memory (EEPROM), or a flash memory.
  • the volatile memory can be a random access memory (RAM), which is used as an external cache.
  • By way of example rather than limitation, many forms of RAM are available, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), synchlink DRAM (SLDRAM), and direct Rambus RAM (DRRAM).
  • the processor 143 may be an integrated circuit chip with signal processing capabilities. In the implementation process, each step of the above method can be completed by the hardware integrated logic circuit in the processor 143 or the instructions in the form of software.
  • The above processor 143 can be a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field programmable gate array (FPGA) or other programmable logic devices, discrete gate or transistor logic devices, or discrete hardware components, and can implement or execute the methods, steps and logic block diagrams disclosed in the embodiments of the present application.
  • the general processor may be a microprocessor or the processor may be any conventional processor, etc.
  • the steps of the method disclosed in conjunction with the embodiments of the present application may be directly embodied as being executed by a hardware decoding processor, or may be executed by a combination of hardware and software modules in the decoding processor.
  • the software module may be located in a mature storage medium in the art such as a random access memory, a flash memory, a read-only memory, a programmable read-only memory or an electrically erasable programmable memory, a register, etc.
  • the storage medium is located in the memory 142, and the processor 143 reads the information in the memory 142, and completes the steps of the above method in conjunction with its hardware.
  • the processing unit can be implemented in one or more application specific integrated circuits (Application Specific Integrated Circuits, ASIC), digital signal processors (Digital Signal Processing, DSP), digital signal processing devices (DSP Device, DSPD), programmable logic devices (Programmable Logic Device, PLD), field programmable gate arrays (Field-Programmable Gate Array, FPGA), general processors, controllers, microcontrollers, microprocessors, other electronic units for performing the functions described in this application or a combination thereof.
  • the technology described in this application can be implemented by a module (such as a process, function, etc.) that performs the functions described in this application.
  • the software code can be stored in a memory and executed by a processor.
  • the memory can be implemented in the processor or outside the processor.
  • the processor 143 is further configured to execute any of the aforementioned encoding method embodiments when running the computer program.
  • the embodiment of the present application also provides a code stream, which is obtained by using the above-mentioned encoding method.
  • the embodiment of the present application provides an electronic device, including: a processor, adapted to execute a computer program; a computer-readable storage medium, wherein the computer-readable storage medium stores a computer program, and when the computer program is executed by the processor, the encoding method and/or decoding method described in the embodiment of the present application is implemented.
  • the electronic device may be any type of device having video encoding and/or video decoding capabilities, for example, the electronic device is a mobile phone, a tablet computer, a laptop computer, a personal computer, a television, a projection device, or a monitoring device.
  • "Object A and/or object B" can cover three kinds of situations: object A exists alone, object A and object B exist at the same time, and object B exists alone.
  • modules described above as separate components may or may not be physically separated, and the components displayed as modules may or may not be physical modules; they may be located in one place or distributed on multiple network units; some or all of the modules may be selected according to actual needs to achieve the purpose of the present embodiment.
  • all functional modules in the embodiments of the present application may be integrated into one processing unit, or each module may be a separate unit, or two or more modules may be integrated into one unit; the above-mentioned integrated modules may be implemented in the form of hardware or in the form of hardware plus software functional units.
  • the above-mentioned integrated unit of the present application is implemented in the form of a software function module and sold or used as an independent product, it can also be stored in a computer-readable storage medium.
  • The technical solutions of the embodiments of the present application, in essence, or the part contributing to the related art, can be embodied in the form of a software product; the computer software product is stored in a storage medium, including a number of instructions for enabling an electronic device to execute all or part of the methods described in each embodiment of the present application.
  • the aforementioned storage medium includes: various media that can store program codes, such as mobile storage devices, ROMs, magnetic disks, or optical disks.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

The embodiments of the present application disclose an encoding method and apparatus, a device, a bitstream, and a storage medium. The method includes: searching for a second reference block of a current block in a first reference image according to a plurality of neighboring blocks of a first reference block of the current block in the first reference image; determining first motion information of the current block according to the second reference block; and generating a bitstream according to the first motion information.

Description

Encoding Method and Apparatus, Device, Bitstream, and Storage Medium

Technical Field

The embodiments of the present application relate to video coding technology, and relate to, but are not limited to, an encoding method and apparatus, a device, a bitstream, and a storage medium.

Background

A light field image is composed of a series of regularly arranged macropixels, and according to the imaging principle of a light field camera, there is a strong correlation between adjacent macropixels. Therefore, in the search process of motion estimation, a search method based on macropixel units can fully exploit the correlation of the light field image, thereby bringing more efficient compression performance. However, although this search method takes the macropixel arrangement regularity of light field images into account, searching in units of macropixels may miss some locally optimal points, thereby reducing the compression performance of the video image.
Summary

In view of this, the encoding method and apparatus, device, bitstream, and storage medium provided in the embodiments of the present application can improve the motion estimation accuracy of the current block, which is beneficial to saving the bit overhead of the bitstream and improving the compression performance of video images. The encoding method and apparatus, device, bitstream, and storage medium provided in the embodiments of the present application are implemented as follows:

According to one aspect of the embodiments of the present application, an encoding method applied to an encoder is provided. The method includes: searching for a second reference block of a current block in a first reference image according to a plurality of neighboring blocks of a first reference block of the current block in the first reference image; determining first motion information of the current block according to the second reference block; and generating a bitstream according to the first motion information.

According to one aspect of the embodiments of the present application, an encoding apparatus applied to an encoder is provided. The apparatus includes: a first search module configured to search for a second reference block of a current block in a first reference image according to a plurality of neighboring blocks of a first reference block of the current block in the first reference image; a second search module configured to determine first motion information of the current block according to the second reference block; and an encoding module configured to generate a bitstream according to the first motion information.

According to one aspect of the embodiments of the present application, an encoder is provided, including a first memory and a first processor, wherein the first memory is configured to store a computer program executable on the first processor, and the first processor is configured to execute, when running the computer program, the encoding method described in the embodiments of the present application.

According to one aspect of the embodiments of the present application, a bitstream is provided, which is generated by the encoding method described in the embodiments of the present application.

According to one aspect of the embodiments of the present application, an electronic device is provided, including: a processor adapted to execute a computer program; and a computer-readable storage medium storing a computer program which, when executed by the processor, implements the encoding method described in the embodiments of the present application.

In the embodiments of the present application, the second reference block of the current block in the first reference image is obtained by searching according to a plurality of neighboring blocks of the first reference block of the current block in the first reference image, rather than determining the first motion information of the current block directly from the first reference block. In this way, the motion estimation accuracy of the current block is improved and the first motion information is closer to the true value, which is beneficial to saving the bit overhead of the bitstream and improving the compression performance of the video image.
Brief Description of the Drawings

The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present application and, together with the specification, serve to explain the technical solutions of the present application. Obviously, the drawings in the following description are only some embodiments of the present application; for those of ordinary skill in the art, other drawings can be obtained from these drawings without creative effort.

The flowcharts shown in the drawings are only exemplary; they do not necessarily include all contents and operations/steps, nor must they be executed in the order described. For example, some operations/steps may be decomposed, and some operations/steps may be merged or partially merged, so the actual execution order may change according to the actual situation.
FIG. 1 is a schematic diagram of the processing flow at the encoding end of the video codec framework provided in an embodiment of the present application;

FIG. 2 is a schematic diagram of the processing flow at the decoding end of the video codec framework provided in an embodiment of the present application;

FIG. 3 is a schematic diagram of the network architecture of a codec system provided in an embodiment of the present application;

FIG. 4 is a schematic diagram of a codec system provided in an embodiment of the present application;

FIG. 5 is a schematic diagram of the implementation flow of the encoding method provided in an embodiment of the present application;

FIG. 6 is a schematic diagram of the implementation flow of the encoding method provided in an embodiment of the present application;

FIG. 7 is a schematic diagram of pixel-level fine correction of the initial search point for light field video provided in an embodiment of the present application;

FIG. 8 is a schematic diagram of the macropixel-level preliminary search for light field video provided in an embodiment of the present application;

FIG. 9 is a schematic diagram of the macropixel-level fine search for light field video in the case of a step size of 1, provided in an embodiment of the present application;

FIG. 10 is a schematic diagram of the macropixel-level fine search for light field video in the case of a step size greater than a certain threshold, provided in an embodiment of the present application;

FIG. 11A is a schematic diagram of part of the implementation flow of the encoding method provided in an embodiment of the present application;

FIG. 11B is a schematic diagram of another part of the implementation flow of the encoding method provided in an embodiment of the present application;

FIG. 12 is a schematic diagram of a light field image;

FIG. 13 is a schematic diagram of the structure of the encoding apparatus provided in an embodiment of the present application;

FIG. 14 is a schematic diagram of the structure of the encoder provided in an embodiment of the present application.
Detailed Description

To make the objectives, technical solutions and advantages of the embodiments of the present application clearer, the specific technical solutions of the present application are described in further detail below with reference to the accompanying drawings in the embodiments of the present application. The following embodiments are used to illustrate the present application, but are not intended to limit the scope of the present application.

Unless otherwise defined, all technical and scientific terms used herein have the same meanings as commonly understood by those skilled in the technical field of the present application. The terms used herein are only for the purpose of describing the embodiments of the present application, and are not intended to limit the present application.

In the following description, references to "some embodiments", "this embodiment", "the embodiments of the present application" and examples describe a subset of all possible embodiments; it can be understood that "some embodiments" may be the same subset or different subsets of all possible embodiments, and may be combined with each other without conflict.

Descriptions such as "first", "second" and "third" in the embodiments of the present application are only used to illustrate and distinguish the described objects, imply no order, do not indicate a particular limitation on the number of devices in the embodiments of the present application, and do not constitute any limitation on the embodiments of the present application.
Most video coding standards adopt a block-based hybrid coding framework. Each picture, sub-picture, or frame in a video is partitioned into square largest coding units (LCUs) or coding tree units (CTUs) of the same size (such as 128x128 or 64x64). Each largest coding unit or coding tree unit may be partitioned into rectangular coding units (CUs) according to rules. A coding unit may be further partitioned into prediction units (PUs) and/or transform units (TUs), etc. The hybrid coding framework includes modules such as prediction, transform, quantization, entropy coding, and in-loop filtering. The prediction module includes intra prediction and inter prediction. Inter prediction includes motion estimation and motion compensation. Since there is a strong correlation between adjacent pixels in a video picture, intra prediction is used in video coding technology to remove the spatial redundancy between adjacent pixels. Since there is a strong similarity between adjacent pictures in a video, inter prediction is used in video coding technology to remove the temporal redundancy between adjacent pictures, thereby improving coding efficiency.
FIG. 1 is a schematic diagram of the processing flow at the encoding end of the video codec framework provided in an embodiment of the present application. As shown in FIG. 1, a frame of picture 101 is partitioned into blocks, intra prediction or inter prediction is applied to the current block to generate a prediction block of the current block, the prediction block is subtracted from the original block of the current block to obtain a residual block, and the residual block is transformed and quantized to obtain a quantized coefficient matrix, which is entropy-coded and output to the bitstream. As shown in FIG. 2, at the decoding end, intra prediction or inter prediction is applied to the current block to generate the prediction block/prediction value of the current block; on the other hand, the bitstream is parsed to obtain the quantized coefficient matrix, which is inverse-quantized and inverse-transformed to obtain the residual block, and the prediction block and the residual block are added to obtain the reconstructed block. The reconstructed blocks form a reconstructed picture, and in-loop filtering is performed on the reconstructed picture on a picture or block basis to obtain a decoded picture. The encoding end also needs operations similar to those at the decoding end to obtain the decoded picture. At the encoding end, the obtained decoded picture can serve as a reference picture for inter prediction of subsequent pictures. The block partition information and the mode or parameter information of prediction, transform, quantization, entropy coding, in-loop filtering, etc. determined by the encoding end need to be carried in the output bitstream if necessary. At the decoding end, the same block partition information and mode or parameter information as at the encoding end are determined by parsing the bitstream and analyzing existing information, so as to ensure that the decoded picture obtained by the encoding end is identical to that obtained by the decoding end. The decoded picture obtained at the encoding end is usually also called a reconstructed picture. The current block can be partitioned into prediction units for prediction and into transform units for transform, and the partitioning of prediction units and transform units can be different.
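The residual relationship at the core of this encode/decode flow can be sketched as below; this is a simplified per-sample illustration that omits transform, quantization and entropy coding, so the reconstruction here is exact, whereas a real codec's is approximate.

```python
# Simplified illustration: the encoder codes original - prediction, and the
# decoder rebuilds prediction + residual (lossless here only because the
# transform/quantization stages are omitted).

def encode_residual(original, prediction):
    """Residual block = original block - prediction block, per sample."""
    return [o - p for o, p in zip(original, prediction)]

def reconstruct(prediction, residual):
    """Reconstructed block = prediction block + residual block, per sample."""
    return [p + r for p, r in zip(prediction, residual)]
```

The better the prediction (intra or inter), the smaller the residual values, and the fewer bits the transformed and entropy-coded residual costs in the bitstream.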
上述是基于块的混合编码框架下的视频编解码器的基本流程,随着技术的发展,该框架或流程的一些模块或步骤可能会被优化,本申请实施例提供的编解码方法适用于该基于块的混合编码框架下的视频编解码器的基本流程,但不限于该框架及流程。本领域普通技术人员可知,随着编码器、解码器的演变以及新业务场景的出现,本申请实施例提供的方法对于类似的技术问题,同样适用。
当前块(current block)可以是当前编码单元(CU)或当前预测单元(PU)等。
本申请实施例还提供了一种包含编码器和解码器的编解码系统的网络架构,其中,图3示出了本申请实施例提供的一种编解码系统的网络架构示意图。如图3所示,该网络架构包括一个或多个电子设备13至1N和通信网络01,其中,电子设备13至1N可以通过通信网络01进行视频交互。电子设备在实施的过程中可以为各种类型的具有视频编解码功能的设备,例如,所述电子设备可以包括智能手机、平板电脑、个人计算机、个人数字助理、导航仪、数字电话、视频电话、电视机、传感设备、服务器等,本申请实施例不作具体限定。在这里,本申请实施例所述的解码器或编码器就可以为上述电子设备。
需要说明的是,本申请实施例的方法主要应用在如图1所示的帧间预测模块。当应用于编码端的帧间预测模块时,“当前块”具体是指当前待进行帧间预测的编码块。
本申请实施例提供的编解码方法适用于对光场视频图像/光场图像的编解码。可以理解,光场视频是由多目摄像头或多目摄像头阵列捕获的视频,目前是MPEG LVC(Lenslet video coding)工作组的研究对象。与一般的摄像机成像模型不同,光场相机在成像平面前增加了一组微透镜阵列,使得物体平面的同一个点的光线可以同时被多个微透镜捕获,相当于同时对同一点从多个角度拍摄。
由于其特殊的成像模型,光场图像的视觉效果与传统图片差距较大。这也导致对于一般图像或视频的压缩方法,在处理光场图像或视频时效果不佳。LVC小组的出现就是为了解决这一问题,研究更加适应于光场视频的压缩方法。
对于编码端而言,如图4所示,可以将捕获/拍摄的光场视频转换为符合编解码器输入要求的数据格式,然后通过编解码器将其编码为码流传输给解码端,解码端解析码流,得到该数据格式的光场视频后转换为符合显示要求的光场视频。
本申请实施例提供一种编码方法,该方法应用于编码器,图5为本申请实施例提供的编码方法的实现流程示意图,如图5所示,该方法包括如下步骤501至步骤503:
步骤501,根据当前块在第一参考图像中的第一参考块的多个相邻块,搜索得到所述当前块在所述第一参考图像中的第二参考块;
在一些实施例中,编码器可以根据如下实施例所述的步骤603至步骤607实现步骤501。
步骤502,根据所述第二参考块,确定所述当前块的第一运动信息。
在一些实施例中,编码器可以根据如下实施例所述的步骤608至步骤609实现步骤502。进一步地,编码器可以根据如下实施例所述的步骤708至步骤724实现步骤502。
步骤503,根据所述第一运动信息,生成码流。
在本申请实施例中,根据当前块在第一参考图像中的第一参考块的多个相邻块,搜索得到当前块在第一参考图像中的第二参考块,而不是直接根据第一参考块确定当前块的第一运动信息;如此,提高了当前块的运动估计精度,使得第一运动信息更加接近真实值,从而有益于节约码流的比特开销,提升视频图像的压缩性能。
需要说明的是,对于本申请实施例提供的编码方法的应用场景不做限定,可以是光场视频压缩,也可以是其他类型的视频压缩。在一些实施例中,第一参考图像可以是光场图像,当前块为光场图像块。也就是说,本申请实施例提供的编码方法可以应用于光场视频压缩。
可以理解,光场图像是由一系列规则排列的宏像素组成的,根据光场相机的成像原理,相邻宏像素之间存在很强的相关性。因此在运动估计的搜索过程中,相比于常规的基于像素单位的方式,基于宏像素单位的搜索方式能够充分地利用光场图像的相关性,从而带来更加高效的压缩性能。
然而,以宏像素为单位进行运动估计的搜索可能会丢失一些局部最优点。考虑到当前块在不同图像之间可能存在跨宏像素运动的情况,同时相邻宏像素之间也存有视差和尺寸等差异,因此,基于宏像素的运动估计搜索方法,得到的最佳参考块未必严格按照宏像素间距排列。
有鉴于此,在光场视频压缩的应用场景中,对于当前块的运动估计搜索,在本申请实施例中进行像素级的运动估计搜索,即基于当前块在第一参考图像中的第一参考块的多个相邻块,搜索得到当前块在第一参考图像中的第二参考块,进而基于第二参考块确定当前块的第一运动信息;如此,使搜索得到的第一运动信息更加接近真实值,从而有益于节约码流的比特开销,提升视频图像的压缩性能。
本申请实施例提供一种编码方法,图6为本申请实施例提供的编码方法的实现流程示意图,如图6所示,该方法包括如下步骤601至步骤610:
步骤601,根据所述当前块的运动信息候选,对所述当前块进行运动估计,得到所述当前块的第二运动信息;其中,所述第二运动信息包括第一运动矢量和所述第一参考图像的索引值。
在一些实施例中,编码器可以采用Merge模式构建当前块的Merge列表,该列表中记录了当前块的运动信息候选。在另一些实施例中,编码器也可以采用AMVP模式构建当前块的AMVP列表,该列表中记录了当前块的运动信息候选。
步骤602,根据所述第二运动信息,确定所述当前块在所述第一参考图像中的第一参考块。
可以理解,第二运动信息包括当前块的第一运动矢量(Motion Vector,MV)和第一参考图像的索引值,因此编码器基于该第二运动信息容易找到第一运动矢量在第一参考图像中指向的块(即第一参考块)。
步骤603,以所述第一参考块为中心块,执行第一搜索过程,所述第一搜索过程包括:从所述中心块的多个相邻块中,搜索出率失真代价最小的相邻块作为第一候选参考块;其中,所述相邻块的尺寸等于所述第一参考块的尺寸。
可以理解,在本申请实施例中,所述的相邻块的率失真代价是指假设基于该相邻块对当前块进行帧间预测和编码所带来的率失真代价。
步骤604,以当前搜索得到的所述第一候选参考块为中心块,执行所述第一搜索过程,得到又一第一候选参考块;
步骤605,确定所选的中心块中是否存在两个中心块为同一区域块;如果是,将所述同一区域块作为第二参考块,进入步骤608;否则,执行步骤606;
步骤606,确定所述第一搜索过程的执行次数是否等于第一次数阈值;如果是,执行步骤607;否则,以当前搜索得到的所述第一候选参考块为中心块,返回执行所述第一搜索过程;
在本申请实施例中,对于第一次数阈值的数值不做限定,可以是2,也可以是3或者4等,总之第一次数阈值是大于或等于2的数值。
步骤607,将搜索得到的第一候选参考块中率失真代价最小的第一候选参考块作为所述第二参考块,进入步骤608;
可以理解,这里所述的第一候选参考块的率失真代价是指假设基于该第一候选参考块对当前块进行帧间预测和编码所带来的率失真代价。
在本申请实施例中,步骤603至步骤607可以理解为对第一参考块进行像素级的精细修正的操作。为了便于对步骤603至步骤607描述的方案的理解,这里举例而言,如图7所示,假设所有块均采用自身块中左上角的像素/样本(如下称为“像素点”)来标定,像素点“0”为初始像素点(其标定的是第一参考块),以其为中心点搜索周围八个像素点,这八个像素点标定的是第一参考块的8个相邻块,从这8个相邻块中选出率失真代价最小的相邻块作为第一候选参考块,假设标定该第一候选参考块的像素点为像素点“1”;再以像素点“1”为中心点做同样的八点搜索得出像素点“2”(其标定的是又一第一候选参考块);以此类推得出像素点“3”;最后以像素点“3”为中心点做同样的八点搜索得出像素点“2”,与之前所选的中心点一致,故像素点“2”标定的块即为第二参考块。
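为便于理解上述像素级精细修正(八邻域迭代搜索)的流程,下面给出一个示意性的Python草图,并非本申请的规范实现。其中cost表示以某位置的块为参考块时假设的率失真代价(黑盒函数),max_iters对应第一次数阈值,refine_pixel_level等名称均为本示例为说明而自行引入的假设。

```python
def refine_pixel_level(x0, y0, cost, max_iters=4):
    """从初始点 (x0, y0) 出发, 每轮在中心点的八个相邻位置(连同中心本身,
    以便停在局部最优)中选出代价最小者作为新中心; 当新中心与历史中心重合
    (即两个中心为同一区域块)或迭代次数达到阈值时终止。"""
    neighbors = [(-1, -1), (-1, 0), (-1, 1),
                 (0, -1),           (0, 1),
                 (1, -1),  (1, 0),  (1, 1)]
    visited = [(x0, y0)]              # 已选过的中心点
    best = (x0, y0)
    for _ in range(max_iters):
        cx, cy = best
        cands = [(cx + dx, cy + dy) for dx, dy in neighbors] + [(cx, cy)]
        best = min(cands, key=lambda p: cost(*p))
        if best in visited:           # 与之前所选中心一致 -> 终止
            return best
        visited.append(best)
    # 达到次数阈值: 返回历史中心中代价最小者
    return min(visited, key=lambda p: cost(*p))
```

例如,以图7所示的场景为参照,从像素点"0"出发,该草图会依次经过像素点"1"、"2"、"3",并在再次选中"2"时终止。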
可以理解,步骤603至步骤607为对第一参考块进行像素级的精细修正的操作,即第二参考块为对第一参考块进行像素级的精细修正的结果;基于此,通过如下步骤608对第二参考块进行进一步的宏像素级的修正;如此,相比于仅基于宏像素级的运动估计搜索,基于像素级的精细修正和基于宏像素级的修正的结合,有益于降低同一光场图像中的宏像素间距不一致对运动估计精度的影响,从而有益于提高运动估计精度,进而有益于节约码流的比特开销,增强光场视频压缩性能。
步骤608,以所述第二参考块所在的宏像素为起始宏像素,执行第二搜索过程,搜索得到率失真代价最小的第三参考块。
进一步地,在一些实施例中,以所述第二参考块的特定样本所在的宏像素为起始宏像素,执行所述第二搜索过程。
示例性地,在一些实施例中,所述特定样本包括所述第二参考块中左上角的样本,例如,特定样本处于第二参考块的左上角顶点。
可以理解,这里所述第三参考块的率失真代价是指假设基于该第三参考块对当前块进行帧间预测和编码所带来的率失真代价。
可以理解,第二搜索过程为宏像素级的搜索操作。在本申请实施例中,步骤608可以包括宏像素级的初步搜索,或者,步骤608包括宏像素级的初步搜索和在初步搜索结果的基础上进行宏像素级的精细搜索。
具体地,在一些实施例中,编码器可以这样实现步骤608:以至少一个宏像素为搜索步长,在所述第一参考图像中搜索出宏像素上率失真代价最小的第四参考块;以及根据所述第四参考块,确定所述第三参考块。
可以理解,这里所述第四参考块的率失真代价是指假设基于该第四参考块对当前块进行帧间预测和编码所带来的率失真代价。
在本申请实施例中,在搜索第四参考块时,对于搜索的哪个或哪些宏像素不做限定,总之是以第二参考块所在的宏像素为起始宏像素,按照等于至少一个宏像素间距的搜索步长在第一参考图像上进行搜索即可。
示例性地,在一些实施例中,可以通过如下宏像素级的搜索得到第四参考块:按照一个或多个搜索步长,在一个或多个搜索模板上搜索出宏像素上与所述第二参考块尺寸相同的第二候选参考块;从所述第二参考块和所述第二候选参考块中选出率失真代价最小的块作为第五参考块;其中,所述一个或多个搜索步长为宏像素的倍数;以及根据所述第五参考块,确定所述第四参考块。
可以理解,所述一个或多个搜索步长为宏像素的倍数,可以理解为一个或多个搜索步长为宏像素间距的倍数。
搜索第五参考块的操作/过程可以理解为宏像素级的初步搜索,对于该搜索过程,对于其中所述的一个或多个搜索模板不做限定。在一些实施例中,所述一个或多个搜索模板可以是菱形模板或正方形模板或其他形状的搜索模板,可以搜索这些模板特定点(比如顶点或中点)标定的块(与第二参考块的尺寸相同)从而找出第五参考块。
以搜索模板为菱形模板为例,如图8所示,以第二参考块为初始搜索点,搜索步长从1个宏像素单位开始,并以2的整数次幂形式递增,按照菱形模板在规定的搜索范围内进行搜索,得到第二候选参考块(即宏像素初搜索),从第二参考块和第二候选参考块中选出率失真代价最小的块作为第五参考块。
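上述以菱形模板、步长按2的整数次幂递增的宏像素级初步搜索,可以用如下示意性的Python草图表达。其中cost为假设的率失真代价黑盒,macro_w与macro_h为假设的宏像素水平/垂直间距,search_range表示规定的搜索范围(以宏像素为单位),diamond_macro_search等名称均为本示例引入的假设。

```python
def diamond_macro_search(x0, y0, cost, macro_w, macro_h, search_range=16):
    """以 (x0, y0) 为初始搜索点, 在菱形模板的四个顶点上按 1, 2, 4, ...
    个宏像素的步长搜索, 返回代价最小的点及其对应步长(0 表示初始点最优)。"""
    best, best_cost, best_step = (x0, y0), cost(x0, y0), 0
    step = 1
    while step <= search_range:                  # 步长以宏像素为单位
        for dx, dy in [(step, 0), (-step, 0), (0, step), (0, -step)]:
            x, y = x0 + dx * macro_w, y0 + dy * macro_h
            c = cost(x, y)
            if c < best_cost:
                best, best_cost, best_step = (x, y), c, step
        step *= 2                                # 2 的整数次幂递增
    return best, best_step
```

返回的步长可用于后续判断应采用哪种宏像素级精细搜索策略。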
对于所述根据所述第五参考块,确定所述第四参考块,更进一步地,在一些实施例中可以直接将第五参考块作为第四参考块。
在另一些实施例中,也可以基于第五参考块做宏像素级的精细搜索,从而找出第四参考块,具体地,根据第五参考块,确定第四参考块,包括:
在所述第五参考块与所述第二参考块之间的搜索步长等于一个宏像素的情况下,以一个宏像素为搜索步长,从所述第五参考块所在的宏像素的N个相邻宏像素上搜索出与所述第五参考块尺寸相同的第三候选参考块;其中,N大于或等于1;从所述第五参考块和所述第三候选参考块中,选出率失真代价最小的块作为所述第四参考块;在本申请实施例中,对于N的值不做限定,可以是任意值,例如N=2。
在所述第五参考块与所述第二参考块之间的搜索步长大于1个宏像素且小于或等于步长阈值的情况下,将所述第五参考块作为所述第四参考块;其中,所述步长阈值大于1个宏像素;
在所述第五参考块与所述第二参考块之间的搜索步长大于步长阈值的情况下,在特定范围内不同于所述第五参考块所在宏像素的其余宏像素上搜索出与所述第五参考块尺寸相同的第四候选参考块;其中,所述第五参考块所在的宏像素在所述特定范围内;从所述第五参考块和所述第四候选参考块中,选出率失真代价最小的块作为所述第四参考块。
在本申请实施例中,对于搜索第四参考块时所采用的特定范围不做限定,可以是与第五参考块所在宏像素之间的间距为1个宏像素的范围内,也可以是与第五参考块所在宏像素之间的间距为多个宏像素的范围内。
举例而言,如图9所示,在第五参考块与第二参考块之间的搜索步长等于一个宏像素的情况下,从第五参考块901所在的宏像素的2个相邻宏像素上的块902和903(即第三候选参考块)以及第五参考块901中选出率失真代价最小的块作为第四参考块;又如图10所示,在所述第五参考块与所述第二参考块之间的搜索步长大于步长阈值的情况下,以第五参考块为中心,以一个宏像素为搜索步长,搜索出在第五参考块所在的宏像素的所有相邻宏像素上的第四候选参考块,从第五参考块和第四候选参考块中,选出率失真代价最小的块作为第四参考块。
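上述按步长分情况的宏像素级精细搜索逻辑,可以用如下示意性的Python草图表达。其中step为初步搜索所得最优点与初始点之间的步长(以宏像素为单位),step_threshold对应步长阈值;步长为1时只取右侧与下方两个相邻宏像素(N=2)仅为本示例的假设,具体取哪些相邻宏像素本申请不做限定。

```python
def fine_macro_search(best, step, cost, macro_w, macro_h, step_threshold=4):
    """根据初步搜索所得步长选择不同的宏像素级精细搜索策略,
    返回精细搜索后率失真代价最小的点。"""
    x, y = best
    if step <= 1:
        # 步长为1: 仅在 N 个(此处假设 N=2)相邻宏像素上做两点搜索
        cands = [(x + macro_w, y), (x, y + macro_h)]
    elif step <= step_threshold:
        return best                   # 步长不大于阈值: 直接保留初步结果
    else:
        # 步长大于阈值: 在所在宏像素的全部相邻宏像素上做全搜索
        cands = [(x + dx * macro_w, y + dy * macro_h)
                 for dx in (-1, 0, 1) for dy in (-1, 0, 1)
                 if (dx, dy) != (0, 0)]
    return min(cands + [best], key=lambda p: cost(*p))
```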
在本申请实施例中,编码器可以直接将第四参考块作为第三参考块,也可以基于第四参考块进行进一步地像素级搜索等,具体参见如下实施例的步骤717至步骤723。
步骤609,根据所述第三参考块,确定所述当前块的第一运动信息。
在一些实施例中,编码器可以根据当前块相对第三参考块的运动矢量,确定当前块的第一运动信息;在另一些实施例中,编码器也可以根据第三参考块相对第一参考块的运动矢量(也即第三参考块相对当前块的运动矢量与第一参考块相对当前块的运动矢量的差),确定当前块的第一运动信息。
示例性地,在一些实施例中,第一运动信息包括:第三参考块相对第一参考块的运动矢量差、第一运动矢量的索引值和第一参考图像的索引值。
或者,在另一些实施例中,第一运动信息包括所述第三参考块相对所述第一参考块的运动矢量差和所述第二运动信息的索引值。
步骤610,根据所述第一运动信息,生成码流。
在一些实施例中,编码器还根据第一运动信息确定当前块的帧间预测值,根据当前块的帧间预测值和当前块的样本值,确定当前块的残差值;根据当前块的残差值,生成码流。
相应地,在解码端,解码器解析码流,根据解析结果得到当前块的残差值;解码器解析码流还得到第二运动信息的索引值和运动矢量差,基于此得到第二运动信息(其包括第一运动矢量和第一参考图像的索引值);解码器基于第一运动矢量和运动矢量差,确定当前块的第三参考块(基于此得到当前块的预测值);解码器基于当前块的残差值和当前块的预测值,得到当前块的重建值。
本申请实施例提供一种编码方法,图11A和图11B为本申请实施例提供的编码方法的实现流程示意图,如图11A和图11B所示,该方法包括如下步骤701至步骤725,其中图11A示出了步骤701至步骤716,图11B示出了步骤717至步骤725:
步骤701,根据所述当前块的运动信息候选,对所述当前块进行运动估计,得到所述当前块的第二运动信息;其中,所述第二运动信息包括第一运动矢量和所述第一参考图像的索引值;
步骤702,根据所述第二运动信息,确定所述当前块在所述第一参考图像中的所述第一参考块;
步骤703,以所述第一参考块为中心块,执行第一搜索过程,所述第一搜索过程包括:从所述中心块的多个相邻块中,搜索出率失真代价最小的相邻块作为第一候选参考块;
在一些实施例中,所述相邻块的尺寸等于所述第一参考块的尺寸。
步骤704,以当前搜索得到的所述第一候选参考块为中心块,执行所述第一搜索过程;
步骤705,确定所选的中心块中是否存在两个中心块为同一区域块;如果是,将所述同一区域块作为第二参考块,进入步骤708;否则,执行步骤706;
步骤706,确定所述第一搜索过程的执行次数是否等于第一次数阈值;如果是,执行步骤707;否则,以当前搜索得到的所述第一候选参考块为中心块,返回执行所述第一搜索过程;
步骤707,将搜索得到的第一候选参考块中率失真代价最小的第一候选参考块作为所述第二参考块,进入步骤708;
步骤708,以所述第二参考块所在的宏像素为起始宏像素,按照一个或多个搜索步长,在一个或多个搜索模板上搜索出宏像素上与所述第二参考块尺寸相同的第二候选参考块;
步骤709,从所述第二参考块和所述第二候选参考块中选出率失真代价最小的块作为第五参考块;其中,所述一个或多个搜索步长为宏像素的倍数;
步骤710,确定所述第五参考块与所述第二参考块之间的搜索步长是否等于一个宏像素;如果是,执行步骤711;否则,执行步骤713;
步骤711,以一个宏像素为搜索步长,从所述第五参考块所在的宏像素的N个相邻宏像素上搜索出与所述第五参考块尺寸相同的第三候选参考块,进入步骤712;其中,N大于或等于1;
步骤712,从所述第五参考块和所述第三候选参考块中,选出率失真代价最小的块作为所述第四参考块,进入步骤717;
步骤713,确定所述第五参考块与所述第二参考块之间的搜索步长是否大于步长阈值;如果是,执行步骤714;否则,执行步骤716;其中,步长阈值大于1个宏像素;
步骤714,在特定范围内不同于所述第五参考块所在宏像素的其余宏像素上搜索出与所述第五参考块尺寸相同的第四候选参考块;进入步骤715;其中,所述第五参考块所在的宏像素在所述特定范围内;
步骤715,从所述第五参考块和所述第四候选参考块中,选出率失真代价最小的块作为所述第四参考块,进入步骤717。
步骤716,将所述第五参考块作为所述第四参考块,进入步骤717;其中,所述步长阈值大于1个宏像素。
步骤717,以所述第四参考块为中心块,执行所述第一搜索过程,得到新的第二参考块;
步骤718,根据所述新的第二参考块,确定所述第三参考块;
在一些实施例中,如图11B所示,可以通过如下步骤719至步骤723实现步骤718:
步骤719,以所述新的第二参考块为中心块,执行第三搜索过程,得到新的第二参考块;其中,所述第三搜索过程包括所述第一搜索过程和所述第二搜索过程;
步骤720,确定相邻两次所述第三搜索过程搜索得到的所述新的第二参考块是否为同一区域块;如果是,执行步骤721;否则,执行步骤722;
步骤721,将为同一区域块的所述新的第二参考块作为所述第三参考块,进入步骤724;
步骤722,确定所述第三搜索过程的执行次数是否等于第二次数阈值;如果是,执行步骤723; 否则,以当前新的第二参考块为中心块,返回执行步骤719;
在本申请实施例中,对于第二次数阈值的数值不做限定,可以是2,也可以是3或者4等,总之第二次数阈值是大于或等于2的数值。第一次数阈值与第二次数阈值可以相等,也可以不同。
步骤723,将所述新的第二参考块中率失真代价最小的块作为所述第三参考块,进入步骤724;
步骤724,根据所述第三参考块,确定所述当前块的第一运动信息;
步骤725,根据所述第一运动信息,生成码流。
需要说明的是,在本申请实施例所述的块的率失真代价,是指假设基于该块对当前块进行帧间预测和编码所带来的率失真代价。
为了提高光场视频压缩性能,在一些实施例中,利用光场图像中宏像素的排布规律对运动估计进行改进,具体方式为:在传统运动估计的TZSearch算法基础上,将每次的运动矢量搜索步长设置为光场宏像素间距的倍数,如下公式(1)所示:
dx=dx′+ΔstepX×Lx,dy=dy′+ΔstepY×Ly    (1)
式(1)中,dx和dy为该次迭代中的运动矢量偏移(包括水平方向和垂直方向),dx′和dy′为上一次迭代中的运动矢量偏移,ΔstepX和ΔstepY为该次迭代中增加的搜索步数,Lx和Ly为每一步的步长,例如该步长设置为每个光场宏像素的间距大小(分为水平方向和垂直方向上的间距)。
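式(1)描述的偏移更新可以用如下示意性的Python草图验证。其中update_offset为本示例自行引入的假设名称,各参数与式(1)中的符号一一对应。

```python
def update_offset(dx_prev, dy_prev, dstep_x, dstep_y, Lx, Ly):
    """按式(1)更新运动矢量偏移: 在上一次迭代偏移 (dx', dy') 的基础上
    增加 ΔstepX/ΔstepY 步, 每步步长为宏像素间距 Lx/Ly。"""
    return dx_prev + dstep_x * Lx, dy_prev + dstep_y * Ly
```

例如,宏像素间距为10×12、本次迭代水平方向增加2步、垂直方向增加1步时,偏移由(0, 0)更新为(20, 12)。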
该方法的依据在于,光场图像是由一系列规则排列的宏像素组成的,如图12所示。根据光场相机的成像原理,相邻宏像素之间存在较强的相关性。因此在运动估计的搜索过程中,基于宏像素单位的搜索方式能够更加充分地利用光场图像的相关性,从而带来更加高效的压缩性能。
虽然利用光场图像中宏像素的排布规律对运动估计的技术方案中考虑了光场图像的宏像素排列规律,但是仅以宏像素为单位进行搜索可能会丢失一些局部最优点。考虑到匹配块在不同图像之间可能存在跨宏像素运动的情况,以及相邻宏像素之间也存有视差和尺寸等差异,因此匹配块的最佳候选未必严格按照宏像素间距排列。有鉴于此,在本申请实施例中,可以在宏像素搜索的基础上,适当添加像素级的精细搜索,对基于宏像素搜索得出的候选参考块位置进行像素级的偏移修正,使其更大程度地增加与当前预测单元的相关性,从而进一步优化光场视频压缩的运动估计效果。
下面将说明本申请实施例在一个实际的应用场景中的示例性应用。
在考虑光场图像中宏像素间相关性的基础上,为了进一步精细运动估计的结果,在本申请实施例中提出了一种将宏像素级搜索和像素级搜索结合的快速搜索方法。该方法基于HEVC中的TZSearch算法进行改进,具体流程如下步骤(1)至步骤(6)所述:
步骤(1),基于HEVC的自适应运动矢量预测(Adaptive Motion Vector Prediction,AMVP)算法确定初始搜索点。
具体地,在一些实施例中,在AMVP给出的候选预测运动矢量(Motion Vector,MV)中,编码器选出率失真代价最小的MV,并将该MV指向的位置作为初始搜索点;
步骤(2),对初始搜索点进行像素级的精细修正:以步骤(1)所得的初始搜索点为中心,在该点周围邻域内做多步像素级精细搜索,对其进行精细修正。
具体地,在一些实施例中,搜索该点邻域内八个像素点,从中选出率失真代价最小的最优点,并把该最优点作为新的中心点,再次进行相同的邻域内像素级精细搜索。当某次搜索选出的最优点与之前所选的中心点一致,则该最优点作为本步骤的结果;当搜索次数达到某一阈值,则提前终止搜索,并以率失真代价最小的最优点作为本步骤的结果。
例如,以图7为例,像素点“0”为初始像素点,以其为中心点搜索周围八个像素点,得出最优像素点“1”;再以“1”为中心点做同样的八点搜索得出“2”;以此类推得出“3”;最后以“3”为中心点得出“2”,与之前所选的中心点一致,故像素点“2”对应的运动矢量便是本步骤的结果;
步骤(3),基于步骤(2)的搜索结果进行宏像素级的初步搜索:如图8所示,搜索步长从1个单位开始,以2的整数次幂形式递增,按照菱形模板(或正方形模板)在规定的搜索范围内进行搜索,从中选出率失真代价最小的搜索点作为该步骤的结果,这里的步长单位设置为光场图像的宏像素间距;
步骤(4),基于步骤(3)的搜索结果进行宏像素级的精细搜索:若步骤(3)所得的最优点对应步长为1,则如图9所示,在该点周围进行两点搜索,选出率失真代价最小的搜索点作为该步骤的结果,这里的搜索也是以宏像素为单位;
若步骤(3)所得的最优点对应步长大于某个阈值,则如图10所示,以该点为中心在一定范围内做全搜索,选出率失真代价最小的搜索点作为该步骤的结果,这里的搜索也是以宏像素为单位;
步骤(5),基于步骤(4)的搜索结果进行常规像素级精细搜索:以步骤(4)所得的结果点为中心点,在该点周围邻域内做多步像素级精细搜索,具体做法与步骤(2)所述一致;
步骤(6),以步骤(5)得到的最优结果点为新的初始搜索点,并重复步骤(2)~(5)。当相邻两次搜索得到的结果点一致或搜索次数达到某一阈值时,停止搜索,最后的搜索结果点对应的运动矢量就是最终运动矢量。
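步骤(6)所述的外层迭代逻辑(以新的结果点重复一轮搜索,直至相邻两次结果一致或轮数达到阈值)可以用如下示意性的Python草图表达。其中one_round表示一轮"像素级修正+宏像素级搜索"的黑盒函数,cost为假设的率失真代价,iterate_search、max_rounds等名称均为本示例引入的假设。

```python
def iterate_search(start, one_round, cost, max_rounds=8):
    """反复以上一轮结果为新的初始搜索点执行一轮搜索;
    相邻两轮结果一致则收敛返回, 轮数达到阈值则返回代价最小的结果。"""
    results = [start]
    point = start
    for _ in range(max_rounds):
        point = one_round(point)
        if point == results[-1]:      # 相邻两次搜索结果一致 -> 收敛
            return point
        results.append(point)
    return min(results, key=lambda p: cost(*p))   # 达到阈值: 取最优者
```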
在本申请实施例中,在宏像素搜索的基础上增加了常规的像素级精细搜索,其更加充分地考虑了局部最优点,进一步精细搜索结果,提高了运动估计性能;
在本申请实施例中,光场图像上各宏像素间的间距难以保证严格一致,引入像素级的精细搜索提高了对此的鲁棒性,也就是说,即使光场图像上的宏像素间的间距不同,依然能够获得较好的运动估计结果,从而有益于节约码流的比特开销。
应当注意,尽管在附图中以特定顺序描述了本申请中方法的各个步骤,但是,这并非要求或者暗示必须按照该特定顺序来执行这些步骤,或是必须执行全部所示的步骤才能实现期望的结果。附加的或备选的,可以省略某些步骤,将多个步骤合并为一个步骤执行,以及/或者将一个步骤分解为多个步骤执行等;或者,将不同实施例中步骤组合为新的技术方案。
基于前述的实施例,本申请实施例提供一种编码装置,该装置应用于编码器,图13为本申请实施例提供的编码装置的结构示意图,如图13所示,所述编码装置13包括:
第一搜索模块131,配置为根据当前块在第一参考图像中的第一参考块的多个相邻块,搜索得到所述当前块在所述第一参考图像中的第二参考块;
第二搜索模块132,配置为根据所述第二参考块,确定所述当前块的第一运动信息;
编码模块133,配置为根据所述第一运动信息,生成码流。
在一些实施例中,所述第一参考图像为光场图像,所述当前块为光场图像块。
在一些实施例中,第一搜索模块131,配置为:以所述第一参考块为中心块,执行第一搜索过程,所述第一搜索过程包括:从所述中心块的多个相邻块中,搜索出率失真代价最小的相邻块作为第一候选参考块;以当前搜索得到的所述第一候选参考块为中心块,执行所述第一搜索过程;在所选的中心块中存在两个中心块为同一区域块的情况下,所述同一区域块为所述第二参考块。
在一些实施例中,第一搜索模块131,还配置为:在所选的中心块中不存在两个中心块为同一区域块的情况下,以当前搜索得到的所述第一候选参考块为中心块,执行所述第一搜索过程;在搜索得到的第一候选参考块中的两个第一候选参考块为同一区域块的情况下,所述同一区域块为所述第二参考块。
在一些实施例中,第一搜索模块131,还配置为:在所选的中心块中不存在两个中心块为同一区域块且所述第一搜索过程的执行次数等于第一次数阈值的情况下,搜索得到的第一候选参考块中率失真代价最小的第一候选参考块为所述第二参考块。
在一些实施例中,第一搜索模块131,还配置为:在所选的中心块中不存在两个中心块为同一区域块且所述第一搜索过程的执行次数小于第一次数阈值的情况下,以当前搜索得到的所述第一候选参考块为中心块,迭代执行所述第一搜索过程,直至搜索出所述第二参考块或者所述第一搜索过程的执行次数等于第一次数阈值为止。
在一些实施例中,所述相邻块的尺寸等于所述第一参考块的尺寸。
在一些实施例中,第二搜索模块132,配置为以所述第二参考块所在的宏像素为起始宏像素,执行第二搜索过程,搜索得到率失真代价最小的第三参考块;根据所述第三参考块,确定所述当前块的第一运动信息。
在一些实施例中,所述第二搜索过程,包括:以至少一个宏像素为搜索步长,在所述第一参考图像中搜索出宏像素上率失真代价最小的第四参考块;根据所述第四参考块,确定所述第三参考块。
在一些实施例中,所述以至少一个宏像素为搜索步长,在所述第一参考图像中搜索出宏像素上率失真代价最小的第四参考块,包括:按照一个或多个搜索步长,在一个或多个搜索模板上搜索出宏像素上与所述第二参考块尺寸相同的第二候选参考块;从所述第二参考块和所述第二候选参考块中选出率失真代价最小的块作为第五参考块;其中,所述一个或多个搜索步长为宏像素的倍数;根据所述第五参考块,确定所述第四参考块。
在一些实施例中,所述根据所述第五参考块,确定所述第四参考块,包括:在所述第五参考块与所述第二参考块之间的搜索步长等于一个宏像素的情况下,以一个宏像素为搜索步长,从所述第五参考块所在的宏像素的N个相邻宏像素上搜索出与所述第五参考块尺寸相同的第三候选参考块;其中,N大于或等于1;从所述第五参考块和所述第三候选参考块中,选出率失真代价最小的块作为所述第四参考块。
在一些实施例中,所述以至少一个宏像素为搜索步长,在所述第一参考图像中搜索出宏像素上率失真代价最小的第四参考块,包括:在所述第五参考块与所述第二参考块之间的搜索步长大于1个宏像素且小于或等于步长阈值的情况下,将所述第五参考块作为所述第四参考块;其中,所述步长阈值大于1个宏像素。
在一些实施例中,所述以至少一个宏像素为搜索步长,在所述第一参考图像中搜索出宏像素上率失真代价最小的第四参考块,包括:在所述第五参考块与所述第二参考块之间的搜索步长大于步长阈值的情况下,在特定范围内不同于所述第五参考块所在宏像素的其余宏像素上搜索出与所述第五参考块尺寸相同的第四候选参考块;其中,所述第五参考块所在的宏像素在所述特定范围内;从所述第五参考块和所述第四候选参考块中,选出率失真代价最小的块作为所述第四参考块。
在一些实施例中,在所述第五参考块所在的宏像素的全部相邻宏像素上搜索出与所述第五参考块尺寸相同的第四候选参考块。
在一些实施例中,所述根据所述第四参考块,确定所述第三参考块,包括:以所述第四参考块为中心块,执行所述第一搜索过程,得到新的第二参考块;根据所述新的第二参考块,确定所述第三参考块。
在一些实施例中,所述根据所述新的第二参考块,确定所述第三参考块,包括:以所述新的第二参考块为中心块,迭代执行第三搜索过程,所述第三搜索过程包括所述第一搜索过程和所述第二搜索过程,直至相邻两次所述第三搜索过程搜索得到的所述新的第二参考块为同一区域块,将为同一区域块的所述新的第二参考块作为所述第三参考块;或者,直至所述第三搜索过程的执行次数等于第二次数阈值为止,将所述新的第二参考块中率失真代价最小的块作为所述第三参考块。
在一些实施例中,编码装置13还包括运动估计模块和确定模块;其中,所述运动估计模块,配置为在根据当前块在第一参考图像中的第一参考块的多个相邻块,搜索得到所述第二参考块之前,根据所述当前块的运动信息候选,对所述当前块进行运动估计,得到所述当前块的第二运动信息;其中,所述第二运动信息包括第一运动矢量和所述第一参考图像的索引值;所述确定模块,配置为根据所述第二运动信息,确定所述当前块在所述第一参考图像中的所述第一参考块。
在一些实施例中,所述第一运动信息包括所述第三参考块相对所述第一参考块的运动矢量差、所述第一运动矢量的索引值和所述第一参考图像的索引值。
或者,在另一些实施例中,第一运动信息包括所述第三参考块相对所述第一参考块的运动矢量差和所述第二运动信息的索引值。
以上编码装置实施例的描述,与上述编码方法实施例的描述是类似的,具有同编码方法实施例相似的有益效果。对于本申请编码装置实施例中未披露的技术细节,请参照本申请编码方法实施例的描述而理解。
需要说明的是,本申请实施例中图13所示的装置对模块的划分是示意性的,仅仅为一种逻辑功能划分,实际实现时可以有另外的划分方式。另外,在本申请各个实施例中的各功能单元可以集成在一个处理单元中,也可以是单独物理存在,也可以两个或两个以上单元集成在一个单元中。上述集成的单元既可以采用硬件的形式实现,也可以采用软件功能单元的形式实现,也可以采用软件和硬件结合的形式实现。
需要说明的是,本申请实施例中,如果以软件功能模块的形式实现上述的方法,并作为独立的产品销售或使用时,也可以存储在一个计算机可读取存储介质中。基于这样的理解,本申请实施例的技术方案本质上或者说对相关技术做出贡献的部分可以以软件产品的形式体现出来,该计算机软件产品存储在一个存储介质中,包括若干指令用以使得电子设备执行本申请各个实施例所述方法的全部或部分。而前述的存储介质包括:U盘、移动硬盘、只读存储器(Read Only Memory,ROM)、磁碟或者光盘等各种可以存储程序代码的介质。这样,本申请实施例不限制于任何特定的硬件和软件结合。
本申请实施例提供一种计算机可读存储介质,其中,所述计算机可读存储介质存储有计算机程序,所述计算机程序被执行时实现如本申请实施例所述的编码方法或解码方法。
本申请实施例提供一种编码器,如图14所示,编码器14包括:通信接口141、存储器142和处理器143;各个组件通过总线系统144耦合在一起。可理解,总线系统144用于实现这些组件之间的连接通信。总线系统144除包括数据总线之外,还包括电源总线、控制总线和状态信号总线。但是为了清楚说明起见,在图14中将各种总线都标为总线系统144。其中,
通信接口141,用于在与其他外部网元之间进行收发信息过程中,信号的接收和发送;
存储器142,用于存储能够在处理器143上运行的计算机程序;
处理器143,用于在运行所述计算机程序时,执行本申请实施例所述的编码方法。
可以理解,本申请实施例中的存储器142可以是易失性存储器或非易失性存储器,或可包括易失性和非易失性存储器两者。其中,非易失性存储器可以是只读存储器(Read-Only Memory,ROM)、可编程只读存储器(Programmable ROM,PROM)、可擦除可编程只读存储器(Erasable PROM,EPROM)、电可擦除可编程只读存储器(Electrically EPROM,EEPROM)或闪存。易失性存储器可以是随机存取存储器(Random Access Memory,RAM),其用作外部高速缓存。通过示例性但不是限制性说明,许多形式的RAM可用,例如静态随机存取存储器(Static RAM,SRAM)、动态随机存取存储器(Dynamic RAM,DRAM)、同步动态随机存取存储器(Synchronous DRAM,SDRAM)、双倍数据速率同步动态随机存取存储器(Double Data Rate SDRAM,DDRSDRAM)、增强型同步动态随机存取存储器(Enhanced SDRAM,ESDRAM)、同步连接动态随机存取存储器(Synchlink DRAM,SLDRAM)和直接内存总线随机存取存储器(Direct Rambus RAM,DRRAM)。本申请描述的系统和方法的存储器142旨在包括但不限于这些和任意其它适合类型的存储器。
而处理器143可能是一种集成电路芯片,具有信号的处理能力。在实现过程中,上述方法的各步骤可以通过处理器143中的硬件的集成逻辑电路或者软件形式的指令完成。上述的处理器143可以是通用处理器、数字信号处理器(Digital Signal Processor,DSP)、专用集成电路(Application Specific Integrated Circuit,ASIC)、现成可编程门阵列(Field Programmable Gate Array,FPGA)或者其他可编程逻辑器件、分立门或者晶体管逻辑器件、分立硬件组件,可以实现或者执行本申请实施例中公开的各方法、步骤及逻辑框图。通用处理器可以是微处理器或者该处理器也可以是任何常规的处理器等。结合本申请实施例所公开的方法的步骤可以直接体现为硬件译码处理器执行完成,或者用译码处理器中的硬件及软件模块组合执行完成。软件模块可以位于随机存储器、闪存、只读存储器、可编程只读存储器或者电可擦写可编程存储器、寄存器等本领域成熟的存储介质中。该存储介质位于存储器142,处理器143读取存储器142中的信息,结合其硬件完成上述方法的步骤。
可以理解的是,本申请描述的这些实施例可以用硬件、软件、固件、中间件、微码或其组合来实现。对于硬件实现,处理单元可以实现在一个或多个专用集成电路(Application Specific Integrated Circuits,ASIC)、数字信号处理器(Digital Signal Processing,DSP)、数字信号处理设备(DSP Device,DSPD)、可编程逻辑设备(Programmable Logic Device,PLD)、现场可编程门阵列(Field-Programmable Gate Array,FPGA)、通用处理器、控制器、微控制器、微处理器、用于执行本申请所述功能的其它电子单元或其组合中。对于软件实现,可通过执行本申请所述功能的模块(例如过程、函数等)来实现本申请所述的技术。软件代码可存储在存储器中并通过处理器执行。存储器可以在处理器中或在处理器外部实现。
可选地,作为另一个实施例,处理器143还配置为在运行所述计算机程序时,执行前述任一编码方法实施例。
本申请实施例还提供一种码流,所述码流是采用如前述编码方法得到的。
本申请实施例提供一种电子设备,包括:处理器,适于执行计算机程序;计算机可读存储介质,所述计算机可读存储介质中存储有计算机程序,所述计算机程序被所述处理器执行时,实现本申请实施例所述的编码方法和/或解码方法。该电子设备可以是各种类型的具有视频编码和/或视频解码能力的设备,例如,该电子设备为手机、平板电脑、笔记本电脑、个人计算机、电视机、投影设备或监控设备等。
这里需要指出的是:以上存储介质和设备实施例的描述,与上述方法实施例的描述是类似的,具有同方法实施例相似的有益效果。对于本申请存储介质和设备实施例中未披露的技术细节,请参照本申请方法实施例的描述而理解。
应理解,说明书通篇中提到的“一个实施例”或“一实施例”或“一些实施例”意味着与实施例有关的特定特征、结构或特性包括在本申请的至少一个实施例中。因此,在整个说明书各处出现的“在一个实施例中”或“在一实施例中”或“在一些实施例中”未必一定指相同的实施例。此外,这些特定的特征、结构或特性可以任意适合的方式结合在一个或多个实施例中。应理解,在本申请的各种实施例中,上述各过程的序号的大小并不意味着执行顺序的先后,各过程的执行顺序应以其功能和内在逻辑确定,而不应对本申请实施例的实施过程构成任何限定。上述本申请实施例序号仅仅为了描述,不代表实施例的优劣。上文对各个实施例的描述倾向于强调各个实施例之间的不同之处,其相同或相似之处可以互相参考,为了简洁,本文不再赘述。
本文中术语“和/或”,仅仅是一种描述关联对象的关联关系,表示可以存在三种关系,例如对象A和/或对象B,可以表示:单独存在对象A,同时存在对象A和对象B,单独存在对象B这三种情况。
需要说明的是,在本文中,术语“包括”、“包含”或者其任何其他变体意在涵盖非排他性的包含,从而使得包括一系列要素的过程、方法、物品或者装置不仅包括那些要素,而且还包括没有明确列出的其他要素,或者是还包括为这种过程、方法、物品或者设备所固有的要素。在没有更多限制的情况下,由语句“包括一个……”限定的要素,并不排除在包括该要素的过程、方法、物品或者设备中还存在另外的相同要素。
在本申请所提供的几个实施例中,应该理解到,所揭露的设备和方法,可以通过其它的方式实现。以上所描述的实施例仅仅是示意性的,例如,所述模块的划分,仅仅为一种逻辑功能划分,实际实现时可以有另外的划分方式,如:多个模块或组件可以结合,或可以集成到另一个系统,或一些特征可以忽略,或不执行。另外,所显示或讨论的各组成部分相互之间的耦合、或直接耦合、或通信连接可以是通过一些接口,设备或模块的间接耦合或通信连接,可以是电性的、机械的或其它形式的。
上述作为分离部件说明的模块可以是、或也可以不是物理上分开的,作为模块显示的部件可以是、或也可以不是物理模块;既可以位于一个地方,也可以分布到多个网络单元上;可以根据实际的需要选择其中的部分或全部模块来实现本实施例方案的目的。
另外,在本申请各实施例中的各功能模块可以全部集成在一个处理单元中,也可以是各模块分别单独作为一个单元,也可以两个或两个以上模块集成在一个单元中;上述集成的模块既可以采用硬件的形式实现,也可以采用硬件加软件功能单元的形式实现。
本领域普通技术人员可以理解:实现上述方法实施例的全部或部分步骤可以通过程序指令相关的硬件来完成,前述的程序可以存储于计算机可读取存储介质中,该程序在执行时,执行包括上述方法实施例的步骤;而前述的存储介质包括:移动存储设备、只读存储器(Read Only Memory,ROM)、磁碟或者光盘等各种可以存储程序代码的介质。
或者,本申请上述集成的单元如果以软件功能模块的形式实现并作为独立的产品销售或使用时,也可以存储在一个计算机可读取存储介质中。基于这样的理解,本申请实施例的技术方案本质上或者说对相关技术做出贡献的部分可以以软件产品的形式体现出来,该计算机软件产品存储在一个存储介质中,包括若干指令用以使得电子设备执行本申请各个实施例所述方法的全部或部分。而前述的存储介质包括:移动存储设备、ROM、磁碟或者光盘等各种可以存储程序代码的介质。
本申请所提供的几个方法实施例中所揭露的方法,在不冲突的情况下可以任意组合,得到新的方法实施例。
本申请所提供的几个产品实施例中所揭露的特征,在不冲突的情况下可以任意组合,得到新的产品实施例。
本申请所提供的几个方法或设备实施例中所揭露的特征,在不冲突的情况下可以任意组合,得到新的方法实施例或设备实施例。
以上所述,仅为本申请的实施方式,但本申请的保护范围并不局限于此,任何熟悉本技术领域的技术人员在本申请揭露的技术范围内,可轻易想到变化或替换,都应涵盖在本申请的保护范围之内。因此,本申请的保护范围应以所述权利要求的保护范围为准。

Claims (27)

  1. 一种编码方法,所述方法应用于编码器,所述方法包括:
    根据当前块在第一参考图像中的第一参考块的多个相邻块,搜索得到所述当前块在所述第一参考图像中的第二参考块;
    根据所述第二参考块,确定所述当前块的第一运动信息;
    根据所述第一运动信息,生成码流。
  2. 根据权利要求1所述的方法,其中,所述第一参考图像为光场图像,所述当前块为光场图像块。
  3. 根据权利要求1或2所述的方法,其中,所述根据当前块在第一参考图像中的第一参考块的多个相邻块,搜索得到所述当前块在所述第一参考图像中的第二参考块,包括:
    以所述第一参考块为中心块,执行第一搜索过程,所述第一搜索过程包括:从所述中心块的多个相邻块中,搜索出率失真代价最小的相邻块作为第一候选参考块;
    以当前搜索得到的所述第一候选参考块为中心块,执行所述第一搜索过程;
    在所选的中心块中存在两个中心块为同一区域块的情况下,所述同一区域块为所述第二参考块。
  4. 根据权利要求3所述的方法,其中,所述方法还包括:
    在所选的中心块中不存在两个中心块为同一区域块的情况下,以当前搜索得到的所述第一候选参考块为中心块,执行所述第一搜索过程;
    在搜索得到的第一候选参考块中的两个第一候选参考块为同一区域块的情况下,所述同一区域块为所述第二参考块。
  5. 根据权利要求3或4所述的方法,其中,
    在所选的中心块中不存在两个中心块为同一区域块且所述第一搜索过程的执行次数等于第一次数阈值的情况下,搜索得到的第一候选参考块中率失真代价最小的第一候选参考块为所述第二参考块。
  6. 根据权利要求5所述的方法,其中,所述方法还包括:
    在所选的中心块中不存在两个中心块为同一区域块且所述第一搜索过程的执行次数小于第一次数阈值的情况下,以当前搜索得到的所述第一候选参考块为中心块,迭代执行所述第一搜索过程,直至搜索出所述第二参考块或者所述第一搜索过程的执行次数等于第一次数阈值为止。
  7. 根据权利要求1至6任一项所述的方法,其中,所述相邻块的尺寸等于所述第一参考块的尺寸。
  8. 根据权利要求3至6任一项所述的方法,其中,所述根据所述第二参考块,确定所述当前块的第一运动信息,包括:
    以所述第二参考块所在的宏像素为起始宏像素,执行第二搜索过程,搜索得到率失真代价最小的第三参考块;
    根据所述第三参考块,确定所述当前块的第一运动信息。
  9. 根据权利要求8所述的方法,其中,所述以所述第二参考块所在的宏像素为起始宏像素,执行第二搜索过程,包括:
    以所述第二参考块的特定样本所在的宏像素为起始宏像素,执行所述第二搜索过程。
  10. 根据权利要求9所述的方法,其中,所述特定样本处于所述第二参考块中左上角的顶点。
  11. 根据权利要求8至10任一所述的方法,其中,所述第二搜索过程,包括:
    以至少一个宏像素为搜索步长,在所述第一参考图像中搜索出宏像素上率失真代价最小的第四参考块;
    根据所述第四参考块,确定所述第三参考块。
  12. 根据权利要求11所述的方法,其中,所述以至少一个宏像素为搜索步长,在所述第一参考图像中搜索出宏像素上率失真代价最小的第四参考块,包括:
    按照一个或多个搜索步长,在一个或多个搜索模板上搜索出宏像素上与所述第二参考块尺寸相同的第二候选参考块;
    从所述第二参考块和所述第二候选参考块中选出率失真代价最小的块作为第五参考块;其中,所述一个或多个搜索步长为宏像素的倍数;
    根据所述第五参考块,确定所述第四参考块。
  13. 根据权利要求12所述的方法,其中,所述根据所述第五参考块,确定所述第四参考块,包括:
    在所述第五参考块与所述第二参考块之间的搜索步长等于一个宏像素的情况下,以一个宏像素为搜索步长,从所述第五参考块所在的宏像素的N个相邻宏像素上搜索出与所述第五参考块尺寸相同的第三候选参考块;其中,N大于或等于1;
    从所述第五参考块和所述第三候选参考块中,选出率失真代价最小的块作为所述第四参考块。
  14. 根据权利要求12所述的方法,其中,所述方法还包括:
    在所述第五参考块与所述第二参考块之间的搜索步长大于1个宏像素且小于或等于步长阈值的情况下,将所述第五参考块作为所述第四参考块;其中,所述步长阈值大于1个宏像素。
  15. 根据权利要求12所述的方法,其中,所述方法还包括:
    在所述第五参考块与所述第二参考块之间的搜索步长大于步长阈值的情况下,在特定范围内不同于所述第五参考块所在宏像素的其余宏像素上搜索出与所述第五参考块尺寸相同的第四候选参考块;其中,所述第五参考块所在的宏像素在所述特定范围内;
    从所述第五参考块和所述第四候选参考块中,选出率失真代价最小的块作为所述第四参考块。
  16. 根据权利要求15所述的方法,其中,
    在所述第五参考块所在的宏像素的全部相邻宏像素上搜索出与所述第五参考块尺寸相同的第四候选参考块。
  17. 根据权利要求13、15或16所述的方法,其中,所述第五参考块所在的宏像素是指所述第五参考块的特定样本所在的宏像素。
  18. 根据权利要求17所述的方法,其中,所述特定样本处于所述第五参考块中左上角的顶点。
  19. 根据权利要求11所述的方法,其中,所述根据所述第四参考块,确定所述第三参考块,包括:
    以所述第四参考块为中心块,执行所述第一搜索过程,得到新的第二参考块;
    根据所述新的第二参考块,确定所述第三参考块。
  20. 根据权利要求19所述的方法,其中,所述根据所述新的第二参考块,确定所述第三参考块,包括:
    以所述新的第二参考块为中心块,迭代执行第三搜索过程,所述第三搜索过程包括所述第一搜索过程和所述第二搜索过程,直至相邻两次所述第三搜索过程搜索得到的所述新的第二参考块为同一区域块,将为同一区域块的所述新的第二参考块作为所述第三参考块;或者,直至所述第三搜索过程的执行次数等于第二次数阈值为止,将所述新的第二参考块中率失真代价最小的块作为所述第三参考块。
  21. 根据权利要求8所述的方法,其中,在根据当前块在第一参考图像中的第一参考块的多个相邻块,搜索得到所述第二参考块之前,所述方法还包括:
    根据所述当前块的运动信息候选,对所述当前块进行运动估计,得到所述当前块的第二运动信息;其中,所述第二运动信息包括第一运动矢量和所述第一参考图像的索引值;
    根据所述第二运动信息,确定所述当前块在所述第一参考图像中的所述第一参考块。
  22. 根据权利要求21所述的方法,其中,所述第一运动信息包括所述第三参考块相对所述第一参考块的运动矢量差和所述第二运动信息的索引值。
  23. 一种编码装置,应用于编码器,所述装置包括:
    第一搜索模块,配置为根据当前块在第一参考图像中的第一参考块的多个相邻块,搜索得到所述当前块在所述第一参考图像中的第二参考块;
    第二搜索模块,配置为根据所述第二参考块,确定所述当前块的第一运动信息;
    编码模块,配置为根据所述第一运动信息,生成码流。
  24. 一种编码器,包括第一存储器和第一处理器;其中,
    所述第一存储器,用于存储能够在所述第一处理器上运行的计算机程序;
    所述第一处理器,用于在运行所述计算机程序时,执行如权利要求1至22任一项所述的方法。
  25. 一种码流,所述码流是通过权利要求1至22任一项所述的编码方法而生成的。
  26. 一种电子设备,包括:
    处理器,适于执行计算机程序;
    计算机可读存储介质,所述计算机可读存储介质中存储有计算机程序,所述计算机程序被所述处理器执行时,实现如权利要求1至22任一项所述的方法。
  27. 一种计算机可读存储介质,其中,所述计算机可读存储介质存储有计算机程序,所述计算机程序被执行时实现如权利要求1至22任一项所述的方法。
PCT/CN2023/088557 2023-04-16 2023-04-16 编码方法及装置、设备、码流、存储介质 WO2024216414A1 (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/CN2023/088557 WO2024216414A1 (zh) 2023-04-16 2023-04-16 编码方法及装置、设备、码流、存储介质

Publications (1)

Publication Number Publication Date
WO2024216414A1 true WO2024216414A1 (zh) 2024-10-24


Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20170044599A (ko) * 2015-10-15 2017-04-25 한양대학교 산학협력단 움직임 추정을 이용한 영상 부호화 방법 및 장치
CN109996080A (zh) * 2017-12-31 2019-07-09 华为技术有限公司 图像的预测方法、装置及编解码器
CN110062243A (zh) * 2019-04-23 2019-07-26 清华大学深圳研究生院 一种基于近邻优化的光场视频运动估计方法
CN110839155A (zh) * 2018-08-17 2020-02-25 北京金山云网络技术有限公司 运动估计的方法、装置、电子设备及计算机可读存储介质
CN112218076A (zh) * 2020-10-17 2021-01-12 浙江大华技术股份有限公司 一种视频编码方法、装置、系统及计算机可读存储介质
WO2022061573A1 (zh) * 2020-09-23 2022-03-31 深圳市大疆创新科技有限公司 运动搜索方法、视频编码装置及计算机可读存储介质
CN115103196A (zh) * 2022-06-21 2022-09-23 安谋科技(中国)有限公司 图像编码方法、电子设备以及介质
CN115665424A (zh) * 2022-11-21 2023-01-31 腾讯科技(深圳)有限公司 图像处理方法、装置、设备、存储介质及程序产品
CN115720267A (zh) * 2021-08-24 2023-02-28 腾讯科技(深圳)有限公司 基于帧间预测的编码方法、编码器、设备以及存储介质



Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 23933297

Country of ref document: EP

Kind code of ref document: A1