WO2020207451A1 - H.265 encoding method and apparatus - Google Patents

Info

Publication number
WO2020207451A1
Authority
WO
WIPO (PCT)
Prior art keywords
module
frame
prediction
block
pipeline
Prior art date
Application number
PCT/CN2020/084093
Other languages
French (fr)
Chinese (zh)
Inventor
张善旭
陈恒明
张圣钦
何德龙
Original Assignee
福州瑞芯微电子股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 福州瑞芯微电子股份有限公司
Priority to US17/603,002 (US11956452B2)
Publication of WO2020207451A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/103 Selection of coding mode or of prediction mode
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/12 Selection from among a plurality of transforms or standards, e.g. selection between discrete cosine transform [DCT] and sub-band transform or selection between H.263 and H.264
    • H04N19/122 Selection of transform size, e.g. 8x8 or 2x4x8 DCT; Selection of sub-band transforms of varying structure or type
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/503 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
    • H04N19/51 Motion estimation or motion compensation
    • H04N19/513 Processing of motion vectors
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/503 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
    • H04N19/51 Motion estimation or motion compensation
    • H04N19/56 Motion estimation with initialisation of the vector search, e.g. estimating a good candidate to initiate a search

Definitions

  • the present invention relates to the field of H.265 encoding, and in particular to an H.265 encoding method and device.
  • H.265 is a new video coding standard developed by ITU-T VCEG after H.264.
  • the H.265 standard revolves around the existing video coding standard H.264, retaining some of the original technologies, while improving some related technologies.
  • the newly added technology is used to improve the relationship between code stream, coding quality, delay and algorithm complexity to achieve optimal settings.
  • Specific research contents include: improving compression efficiency, improving robustness and error recovery capabilities, reducing real-time delay, reducing channel acquisition time and random access delay, and reducing complexity.
  • the existing H.265 algorithm generally has the problems of large hardware resource consumption and low coding efficiency.
  • an H.265 encoding device is provided, which includes the following modules: a preprocessing module, a coarse selection module, and a precise comparison module.
  • the preprocessing module is connected to the coarse selection module, and the coarse selection module is connected to the precise comparison module;
  • the preprocessing module is used to divide a current frame in an original video into multiple CTU blocks;
  • the coarse selection module is used to divide each CTU block according to multiple division modes, where each division mode divides one CTU block into multiple corresponding CU blocks and divides each CU block into one or more corresponding PU blocks; the coarse selection module is also used to perform inter-frame prediction and intra-frame prediction on each division mode of each CTU block, and to generate prediction information corresponding to each division mode;
  • the precise comparison module is used to compare the costs of the prediction information corresponding to each division mode of each CTU block, select for each CTU block the division mode with the smallest cost and the coding information corresponding to that division mode, and, according to the selected division mode and its corresponding coding information, generate the entropy coding information used to generate the H.265 bitstream from the current frame and the reconstruction information used to generate the reconstructed frame from the current frame.
  • the inventor also provides an H.265 encoding method, which is applied to an H.265 encoding device.
  • the device includes the following modules: a preprocessing module, a coarse selection module, and a precise comparison module. The preprocessing module is connected to the coarse selection module, and the coarse selection module is connected to the precise comparison module; the method includes the following steps:
  • the preprocessing module divides a current frame in an original video into multiple CTU blocks;
  • the coarse selection module divides each CTU block according to multiple division modes, where each division mode divides a CTU block into multiple corresponding CU blocks and divides each CU block into one or more corresponding PU blocks; it also performs inter-frame prediction and intra-frame prediction on each division mode of each CTU block, and generates prediction information corresponding to each division mode;
  • the precise comparison module compares the costs of the prediction information corresponding to each division mode of each CTU block, selects for each CTU block the division mode with the smallest cost and the coding information corresponding to that division mode, and, according to the selected division mode and its corresponding coding information, generates entropy coding information for generating an H.265 code stream from the current frame and reconstruction information for generating a reconstructed frame from the current frame.
  • each pipeline step includes at least one pipeline stage for executing at least one module, wherein:
  • the multiple modules include a preprocessing module, a coarse selection module, a precise comparison module, and an overall control module, and the overall control module is connected to the preprocessing module, the coarse selection module, and the precise comparison module respectively;
  • the multiple pipeline steps include a preprocessing pipeline step, a coarse selection pipeline step, and a precise comparison pipeline step; the coarse selection pipeline step is performed after the preprocessing pipeline step, and the precise comparison pipeline step is performed after the coarse selection pipeline step;
  • the preprocessing pipeline step divides a current frame in an original video into multiple CTU blocks through the preprocessing module;
  • the coarse selection pipeline step uses the coarse selection module to divide each CTU block according to multiple division modes, performs coarse selection of inter prediction and coarse selection of intra prediction for each division mode of each CTU block, and generates prediction information corresponding to each division mode;
  • the precise comparison pipeline step, through the precise comparison module, calculates and compares the costs of the prediction information corresponding to each division mode of each CTU block, selects for each CTU block the division mode with the smallest cost and the coding information corresponding to that division mode, and, according to the selected division mode and its corresponding coding information, generates entropy coding information for generating the H.265 code stream from the current frame and reconstruction information for generating the reconstructed frame from the current frame;
  • the overall control module is used to control the storage and retrieval of original frame data and reference frame data, and control the preprocessing module, the coarse selection module, and the precise comparison module to sequentially execute the corresponding pipeline steps.
  • the inventor also provides an H.265 encoding method, which is applied to an H.265 encoding device; the device includes multiple modules and multiple pipeline steps, and each pipeline step includes at least one pipeline stage for executing at least one module, wherein:
  • the multiple modules include a preprocessing module, a coarse selection module, a precise comparison module, and an overall control module, and the overall control module is connected to the preprocessing module, the coarse selection module, and the precise comparison module respectively;
  • the multiple pipeline steps include a preprocessing pipeline step, a coarse selection pipeline step, and a precise comparison pipeline step; the coarse selection pipeline step is performed after the preprocessing pipeline step, and the precise comparison pipeline step is performed after the coarse selection pipeline step;
  • the method includes the following steps:
  • the preprocessing pipeline step divides a current frame in an original video into multiple CTU blocks through the preprocessing module;
  • the coarse selection pipeline step uses the coarse selection module to divide each CTU block according to multiple division modes, performs coarse selection of inter prediction and coarse selection of intra prediction for each division mode of each CTU block, and generates prediction information corresponding to each division mode;
  • the precise comparison pipeline step, through the precise comparison module, calculates and compares the costs of the prediction information corresponding to each division mode of each CTU block, and selects for each CTU block the division mode with the smallest cost and the coding information corresponding to that division mode.
  • the overall control module is used to control the storage and retrieval of original frame data and reference frame data, and to control the preprocessing module, the coarse selection module, and the precise comparison module to sequentially execute the corresponding pipeline steps.
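  • The ordering of the three pipeline steps can be illustrated with a small scheduling sketch. The text only states that the steps run in sequence under the overall control module; the overlap shown here, where a new CTU enters the pipeline on every tick, is an assumption about how such a pipeline is typically driven, and the stage names and the function `pipeline_schedule` are hypothetical.

```python
# Hypothetical stage names for the three pipeline steps named in the text.
STAGES = ["preprocess", "coarse_select", "precise_compare"]


def pipeline_schedule(num_ctus):
    """Return, for each tick, the CTU index each stage works on (None if idle).

    Assumes one CTU enters the pipeline per tick, so stage s handles
    CTU (tick - s). This overlap is an illustrative assumption.
    """
    ticks = []
    for t in range(num_ctus + len(STAGES) - 1):
        row = {}
        for s, name in enumerate(STAGES):
            idx = t - s
            row[name] = idx if 0 <= idx < num_ctus else None
        ticks.append(row)
    return ticks


if __name__ == "__main__":
    for tick, row in enumerate(pipeline_schedule(3)):
        print(tick, row)
```

  • In this sketch the coarse selection step never sees a CTU before the preprocessing step has handled it, which mirrors the stated ordering constraint.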
  • FIG. 1 is a schematic diagram of an H.265 encoding device related to an embodiment of the present invention
  • FIG. 2 is a schematic diagram of a coarse selection module of an H.265 encoding device according to an embodiment of the present invention
  • FIG. 3 is a schematic diagram of a rough search process of an H.265 encoding device according to an embodiment of the present invention
  • FIG. 4 is a schematic diagram of the fine search process of the H.265 encoding device according to an embodiment of the present invention.
  • FIG. 5 is a schematic diagram of fractional pixel search of an H.265 encoding device according to an embodiment of the present invention
  • FIG. 6-A is a schematic diagram of search prediction performed by an H.265 encoding device according to an embodiment of the present invention.
  • FIG. 6-B is a schematic diagram of search prediction performed by an H.265 encoding device according to another embodiment of the present invention.
  • FIG. 7 is a schematic diagram of an accurate comparison module of an H.265 encoding device according to an embodiment of the present invention.
  • FIG. 8 is a schematic diagram of a layered comparison module of an H.265 encoding device according to an embodiment of the present invention.
  • FIG. 9 is a flowchart of an H.265 encoding method according to an embodiment of the present invention.
  • FIG. 10 is a flowchart of a rough search method for H.265 encoding according to an embodiment of the present invention.
  • FIG. 11 is a flowchart of a fine search method for H.265 encoding according to an embodiment of the present invention.
  • FIG. 12 is a flowchart of a H.265 coded fractional pixel search method according to an embodiment of the present invention.
  • FIG. 13 is a schematic diagram of motion vector information around a current CTU block according to an embodiment of the present invention.
  • FIG. 15 is a schematic diagram of an H.265 encoding device related to another embodiment of the present invention.
  • 230, inter-frame prediction coarse selection module; 211, coarse search module; 213, fine search module; 215, fractional pixel search module;
  • 330, intra-frame prediction coarse selection module; 231, reference pixel generation module;
  • 310, reference frame; 311, down-sampling; 320, down-sampled image; 330, current CTU; 340, down-sampled CTU; 351, motion vector; 352, minimum cost pixel block;
  • 410, reference frame; 420, current PU position; 421, restored motion vector; 423, fine search motion vector; 430, fine search area; 431, starting search position; 433, minimum cost position;
  • 711, distribution module; 721, first-level calculation Level_calc0; 722, second-level calculation Level_calc1;
  • Reference frame data loading module 910.
  • Overall control module 910.
  • FIG. 1 is a schematic diagram of an H.265 encoding apparatus according to an embodiment of the present invention.
  • the device is an image encoding device 110.
  • the device may be a chip with an image encoding function, or an electronic device containing such a chip, for example a smart mobile device such as a mobile phone, a tablet computer, or a personal digital assistant.
  • the device includes the following modules: a preprocessing module 120, a coarse selection module 130, and a precise comparison module 140; the preprocessing module 120 is connected to the coarse selection module 130, and the coarse selection module 130 is connected to the precise comparison module 140, wherein:
  • the preprocessing module 120 is used to divide a current frame 102 in an original video 100 into multiple CTU blocks (Coding Tree Unit, coding tree unit).
  • the CTU is a sub-block in the current frame image, and the size can be any of 16x16 sub-blocks, 32x32 sub-blocks, and 64x64 sub-blocks.
  • the preprocessing module may obtain an original image frame 101 in the original video 100, and select a current frame 102 from the original image frame 101.
  • the coarse selection module 130 is configured to divide each CTU block according to multiple division modes, where each division mode divides a CTU block into multiple corresponding CU blocks (Coding Unit, coding unit) and divides each CU block into one or more PU blocks (Prediction Unit, prediction unit); the coarse selection module 130 is also used to perform inter-frame prediction and intra-frame prediction for each division mode of each CTU block, and to generate prediction information corresponding to each division mode.
  • the division mode is selected according to actual needs. For example, for a current CTU 121 with a size of 64x64, it can be divided into 4 32x32 sub-blocks; for each 32x32 sub-block, it can be divided into 4 16x16 sub-blocks.
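  • The 64x64 to 32x32 to 16x16 division in this example can be sketched as a recursive quadtree split. This is only an illustration of the fully split case; the function name `split_quadtree` and the `(x, y, size)` tuple layout are assumptions, and an actual encoder would also evaluate partially split trees.

```python
def split_quadtree(x, y, size, min_size):
    """Recursively split a square block into four equal sub-blocks.

    Returns the leaf blocks as (x, y, size) tuples. Only the fully split
    tree is enumerated here; partial splits are not considered.
    """
    if size <= min_size:
        return [(x, y, size)]
    half = size // 2
    leaves = []
    for dy in (0, half):
        for dx in (0, half):
            leaves.extend(split_quadtree(x + dx, y + dy, half, min_size))
    return leaves


blocks_32 = split_quadtree(0, 0, 64, 32)  # one 64x64 CTU as 32x32 sub-blocks
blocks_16 = split_quadtree(0, 0, 64, 16)  # the same CTU as 16x16 sub-blocks
```

  • Each additional level of splitting multiplies the number of leaf blocks by four, which is why the number of candidate division modes grows quickly with depth.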
  • the precise comparison module 140 is configured to compare the costs of the prediction information corresponding to each division mode of each CTU block, select for each CTU block the division mode with the smallest cost and the coding information corresponding to that division mode, and, according to the selected division mode and its corresponding coding information, generate the entropy coding information for generating the H.265 bitstream from the current frame and the reconstruction information for generating the reconstructed frame from the current frame. In this way, the search accuracy is improved through the distributed search, the details of the reconstructed image are better preserved, and the hardware resource consumption is reduced.
  • the device further includes an entropy encoding module 150, which is connected to the precise comparison module 140; the entropy encoding module 150 is configured to generate the H.265 code stream corresponding to the current frame from the entropy coding information of the current frame, which is generated according to the least-cost division mode of each CTU block and its corresponding coding information.
  • the precise comparison module 140 generates the data required for entropy coding corresponding to the CTU according to the partition mode and prediction mode with the smallest CTU cost, that is, the coding information 141 shown in FIG.
  • the entropy coding module 150 is used to generate an encoded code stream 190 corresponding to the original video according to the data required for entropy encoding corresponding to the CTU.
  • the image encoding device 110 will also output the encoded video 180, and a certain image frame of the encoded video 180 is the reconstructed image frame 145.
  • the device includes a post-processing module that is connected to the precise comparison module.
  • the post-processing module is used to generate the reconstructed frame corresponding to the current frame from the reconstruction information of the current frame, which is generated according to the least-cost division mode of each CTU block and its corresponding coding information.
  • the post-processing module includes a deblocking filter module 160 and a sample adaptive offset (SAO) module 170; the deblocking filter module 160 is connected to the sample adaptive offset module 170; the deblocking filter module 160 is used to filter the reconstructed frame using the least-cost division mode and its corresponding coding information provided by the precise comparison module; the sample adaptive offset module 170 is used to perform SAO calculation on the filtered reconstructed frame and to transmit the calculated data to the entropy encoding module 150.
  • the coarse selection module 130 includes an inter-frame prediction coarse selection module 230 and an intra-frame prediction coarse selection module 330; the inter-frame prediction coarse selection module 230 is connected to the preprocessing module 120 and to the precise comparison module 140, and the intra-frame prediction coarse selection module 330 is likewise connected to the preprocessing module 120 and to the precise comparison module 140;
  • the inter-frame prediction coarse selection module 230 is configured to perform inter-frame prediction on each PU block in each division mode and to select, relative to each PU block, one or more reference frames whose cost is less than a preset cost value; the reference information obtained and the motion vector of the selected reference PU block are used as the prediction information corresponding to the division mode.
  • Each PU block has its own corresponding motion vector.
  • the motion vector of each PU block is used to obtain prediction information from the reconstructed reference frame. Specifically, taking the location of the current PU block as the starting point, the prediction information is obtained at the position indicated by the motion vector corresponding to that PU block.
  • the intra-frame prediction coarse selection module 330 is configured to perform intra-frame prediction on each PU block in each division mode, select one or more intra-frame prediction directions whose cost is less than a preset cost value relative to each PU block, and use the selected intra prediction directions as the prediction information corresponding to the division mode.
  • the inter-frame prediction coarse selection module 230 further includes: a coarse search module 211, a fine search module 213, and a fractional pixel search module 215.
  • the coarse search module 211 is connected to the preprocessing module 120 and to the fine search module 213, and the fine search module 213 is connected to the fractional pixel search module 215.
  • the coarse search module is used to select a frame from the reference list, select either its original frame or its reconstructed frame as the reference frame, perform down-sampling on the reference frame and on the current CTU block, find in the down-sampled reference frame the pixel position with the least cost compared with the down-sampled CTU block, and calculate the coarse search vector of that pixel position relative to the current CTU block.
  • the reference list is a list storing reference frames, and the reference frame of the current frame may have multiple frames, all of which are indexed through the reference list.
  • a reference frame includes reconstructed frames and original frames. Since the reference frame and the current CTU block are both down-sampled, the coarse search vector calculated by the coarse search module is a search vector in the down-sampled domain; that is, the coarse search vector corresponding to the current CTU block is multiplied by the down-sampling ratio (such as 1/4), and the coarse search vector multiplied by this ratio is transmitted to the next processing module.
  • the coarse search module selects either the original frame or the reconstructed frame as the reference frame, performs down-sampling on the reference frame and on the current CTU respectively, and then finds in the down-sampled reference frame the pixel position with the least cost compared with the down-sampled CTU, together with the corresponding coarse search vector.
  • the down-sampling scaling ratio of the reference frame and the current CTU are the same.
  • the down-sampled image 320 is obtained from the reference frame 310 through down-sampling 311, which scales the length and width of the reference frame to 1/4; likewise, the down-sampled CTU 340 is obtained from the current CTU 330 through down-sampling 331, which scales the length and width of the current CTU 330 to 1/4. Then, with the down-sampled CTU 340 (B sub-block in FIG. 3) as a unit, prediction is performed in the down-sampled image 320 (A sub-block in FIG. 3): the cost between the down-sampled CTU 340 and each sub-block of the down-sampled image 320 is calculated in turn (each sub-block is centered on a pixel of the A sub-block and has the same size as the B sub-block). The pixel block with the smallest cost compared with the down-sampled CTU is recorded as the minimum cost pixel block 352 (C sub-block in FIG. 3), and the center pixel position of this block and the coarse search vector are recorded.
  • the coarse search vector is the vector displacement between the center pixel of the down-sampled CTU 340 (B sub-block in FIG. 3) and the center pixel of the minimum cost pixel block 352 (C sub-block in FIG. 3), that is, the motion vector 351 in FIG. 3.
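  • The coarse search described above can be sketched as follows. Several details are assumptions made for illustration and are not stated in the text: down-sampling is done by averaging 4x4 tiles, the cost is a sum of absolute differences (SAD), blocks are indexed by their top-left corner (the displacement between top-left corners equals the displacement between centers for equally sized blocks), and the returned vector is scaled back to full-resolution pixels. The function names are hypothetical.

```python
def downsample(img, factor=4):
    """Shrink a 2-D list of pixels by averaging factor x factor tiles."""
    h, w = len(img) // factor, len(img[0]) // factor
    area = factor * factor
    return [[sum(img[y * factor + j][x * factor + i]
                 for j in range(factor) for i in range(factor)) // area
             for x in range(w)] for y in range(h)]


def sad(img, bx, by, block):
    """Sum of absolute differences between `block` and `img` at (bx, by)."""
    return sum(abs(img[by + j][bx + i] - block[j][i])
               for j in range(len(block)) for i in range(len(block[0])))


def coarse_search(ref, ctu, ctu_x, ctu_y, factor=4):
    """Full search in the down-sampled reference frame.

    Returns (mv_x, mv_y) in full-resolution pixels. Assumes ctu_x and
    ctu_y are multiples of `factor`.
    """
    small_ref = downsample(ref, factor)
    small_ctu = downsample(ctu, factor)
    bh, bw = len(small_ctu), len(small_ctu[0])
    best = None
    for y in range(len(small_ref) - bh + 1):
        for x in range(len(small_ref[0]) - bw + 1):
            cost = sad(small_ref, x, y, small_ctu)
            if best is None or cost < best[0]:
                best = (cost, x, y)
    _, bx, by = best
    # Displacement in the down-sampled domain, scaled back to full resolution.
    return ((bx - ctu_x // factor) * factor, (by - ctu_y // factor) * factor)
```

  • Because the search runs on an image with 1/16 of the pixels, the exhaustive comparison is far cheaper than a full-resolution search, at the price of a vector that is only accurate to the down-sampling step.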
  • the intra-frame prediction coarse selection module 330 further includes a reference pixel generation module 231.
  • the reference pixel generating module 231 is used to generate reference pixels from the original pixels of the current frame for each PU block in each division mode, perform prediction in all intra-frame prediction directions according to the rules of the H.265 protocol using those reference pixels to obtain a prediction result for each direction, calculate the distortion cost between each prediction result and the original pixels, sort the costs from smallest to largest, and select the one or more intra-frame prediction directions with the lowest cost.
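  • The ranking of intra-frame prediction directions by distortion cost can be sketched with a deliberately reduced mode set. H.265 defines many more intra modes than the three used here (vertical, horizontal, DC), and the SAD cost, the function name `rank_intra_modes`, and the `keep` parameter are assumptions for illustration only. Reference pixels are read from the original frame, following the substitution described in the text.

```python
def rank_intra_modes(orig, x0, y0, n, keep=2):
    """Rank three simplified intra modes for the n x n PU at (x0, y0).

    Returns the `keep` cheapest modes as (mode, cost) pairs, cheapest first.
    """
    # Reference pixels come from the ORIGINAL frame, not from reconstruction.
    top = [orig[y0 - 1][x0 + i] for i in range(n)]
    left = [orig[y0 + j][x0 - 1] for j in range(n)]
    dc = (sum(top) + sum(left) + n) // (2 * n)
    predictions = {
        "vertical": [list(top) for _ in range(n)],
        "horizontal": [[left[j]] * n for j in range(n)],
        "dc": [[dc] * n for _ in range(n)],
    }

    def sad(pred):
        # Distortion cost between a prediction and the original pixels.
        return sum(abs(orig[y0 + j][x0 + i] - pred[j][i])
                   for j in range(n) for i in range(n))

    costs = {mode: sad(pred) for mode, pred in predictions.items()}
    ranked = sorted(costs, key=costs.get)
    return [(mode, costs[mode]) for mode in ranked[:keep]]
```

  • For a frame made of vertical stripes, the vertical mode reproduces the block exactly and therefore ranks first; only the surviving candidates would be passed on for the more expensive precise comparison.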
  • the coarse selection method of the intra-frame prediction coarse selection module is similar to that of the inter-frame prediction coarse selection module, and will not be repeated here.
  • the difference between the two is that, when performing intra-frame prediction, the original frame is down-sampled to obtain the down-sampled image, and the down-sampled CTU is predicted in the down-sampled image obtained from the original frame; when performing inter-frame prediction, the reference frame is down-sampled to obtain the down-sampled image, and the down-sampled CTU is predicted in the down-sampled image obtained from the reference frame.
  • the reference pixels should be reconstructed pixels, but in the process of hardware implementation, only the original pixels can be obtained at the current time point, and the reconstructed pixels are often not available. Therefore, the method of replacing reconstructed pixels with original pixels is adopted in the present invention.
  • the black-filled dots in the figure are edge pixels.
  • the 4x4 block (the shadow-filled dots in Figure 6-B) has a total of 17 boundary pixels.
  • the black-filled pixels in the figure (i.e., the edge pixels) should be filled with reconstructed pixels, but the reconstructed pixels cannot be obtained at the current time point, so original pixels are used instead.
  • the shadow filling part is a PU block of 4x4 size. After the boundary pixel filling is completed, prediction is performed according to the protocol to obtain a 4x4 block filled with the shadow part.
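  • The count of 17 boundary pixels for a 4x4 block matches the usual H.265 reference layout of one corner pixel, 2N pixels above (including above-right), and 2N pixels to the left (including below-left), i.e. 4N+1 pixels. The sketch below enumerates those positions; the function name and coordinate convention are assumptions, and the exact layout shown in FIG. 6-B is not reproduced here.

```python
def reference_pixel_positions(x0, y0, n):
    """Return the (x, y) positions of the 4n+1 boundary pixels of an n x n PU.

    The PU's top-left pixel is at (x0, y0). The layout is: one corner pixel,
    2n pixels along the row above, and 2n pixels along the column to the left.
    """
    corner = [(x0 - 1, y0 - 1)]
    top = [(x0 + i, y0 - 1) for i in range(2 * n)]
    left = [(x0 - 1, y0 + j) for j in range(2 * n)]
    return corner + top + left


positions_4x4 = reference_pixel_positions(4, 4, 4)
```

  • In the scheme described in the text, each of these positions would be read from the original frame whenever the reconstructed pixel is not yet available.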
  • the fine search module sets a fine search area in the reference frame for each PU according to the coarse search vector, and finds in that area the fine search vector with the smallest cost corresponding to the PU.
  • the fine search step is performed in the reference frame 410.
  • Each current CTU contains multiple PUs, and the fine search is performed by selecting one of these PUs as the current PU in a certain order.
  • the current PU position 420 is determined first, and then a fine search area 430 is set in the reference frame for the PU according to the previously obtained coarse search vector (or called the restored motion vector 421). Then, a starting search position 431 corresponding to the current PU position 420 is determined in the fine search area 430 according to the restored motion vector 421.
  • taking the starting search position 431 as the center, the cost between the current PU and the block of the same size as the current PU at each pixel position in the fine search area 430 is calculated in turn.
  • the minimum cost position 433 is found, and the motion vector between the current PU position 420 and the minimum cost position 433 is calculated and recorded as the fine search motion vector 423.
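  • The fine search step can be sketched as an exhaustive search in a small window around the position indicated by the restored coarse vector. The SAD cost, the square window of `radius` pixels, and the function name `fine_search` are assumptions made for illustration; the text does not specify the size or shape of the fine search area.

```python
def fine_search(ref, pu, pu_x, pu_y, coarse_mv, radius=2):
    """Search a (2*radius+1)^2 window centered on the coarse-vector position.

    `pu` is the current PU as a 2-D list, located at (pu_x, pu_y).
    Returns (cost, mv_x, mv_y) for the best match, or None if no candidate
    position lies inside the reference frame.
    """
    ph, pw = len(pu), len(pu[0])
    start_x, start_y = pu_x + coarse_mv[0], pu_y + coarse_mv[1]
    best = None
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            x, y = start_x + dx, start_y + dy
            # Skip candidates that fall outside the reference frame.
            if x < 0 or y < 0 or y + ph > len(ref) or x + pw > len(ref[0]):
                continue
            cost = sum(abs(ref[y + j][x + i] - pu[j][i])
                       for j in range(ph) for i in range(pw))
            if best is None or cost < best[0]:
                # Motion vector is measured from the current PU position.
                best = (cost, x - pu_x, y - pu_y)
    return best
```

  • The coarse vector only needs to land within `radius` pixels of the true match for the fine search to recover it, which is what allows the coarse stage to work at reduced resolution.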
  • the fine search module is configured to set, for each PU block, a fine search area in the reconstructed image of the reference frame according to the coarse search vector, and to generate in that area a fine search vector with the lowest cost corresponding to the PU block; it is also used to generate, according to the motion vector information around the current CTU block, one or more predicted motion vectors that serve the same function as the coarse search vector, and to generate a fine search vector based on each predicted motion vector; all generated fine search vectors are sent to the fractional pixel search module.
  • the adjacent CTU block on the upper left side is the same as the upper right side.
  • adjacent CTU blocks there is a corresponding rough search result and corresponding motion vector information.
  • there are 16 assisted motion vectors in the current CTU block so there are at most 28 mvs as adjacent mvs (that is, the motion vector information around the current CTU block).
  • the 28 motion vector information will undergo a certain screening, and a preset number (such as 3) of adjacent mvs will be screened out and transmitted to the fine search module to determine the same preset number of fine search motion vectors.
  • the same function means that the filtered preset number of adjacent mvs are the same as the search results obtained by the coarse search module, that is, they will be input to the interface of the fine search module for further processing.
  • the coarse search module inputs one motion vector to the fine search module, and several MVs selected from the adjacent MVs are also input to the fine search module. Assuming that a total of N MVs are input to the fine search module, the fine search module generates N fine search MVs (that is, fine search vectors) and inputs them to the FME (that is, the fractional pixel search module). The FME then compares the costs of these N fine search MVs to obtain an optimal fme_mv (i.e., fractional pixel search vector), which is finally input to the precise comparison module.
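  • The flow of candidate vectors through the fine search and FME stages can be sketched as below. The screening rule (keep the first few distinct neighbouring MVs) is a hypothetical stand-in, since the text does not say how the adjacent MVs are screened; `fine_search` and `cost_of` are caller-supplied placeholders, and the sub-pixel refinement that the FME performs is omitted so that only the N-in, N-out, pick-the-cheapest structure is shown.

```python
def screen_neighbours(neighbour_mvs, limit=3):
    """Keep up to `limit` distinct vectors in order (hypothetical rule)."""
    seen, kept = set(), []
    for mv in neighbour_mvs:
        if mv not in seen:
            seen.add(mv)
            kept.append(mv)
        if len(kept) == limit:
            break
    return kept


def select_fme_input(coarse_mv, neighbour_mvs, fine_search, cost_of, limit=3):
    """Run every candidate through the fine search and pick the cheapest.

    `fine_search(mv)` maps a candidate MV to a fine search MV and
    `cost_of(mv)` returns its cost; both are supplied by the caller.
    Returns (best_mv, all_fine_mvs).
    """
    candidates = [coarse_mv] + screen_neighbours(neighbour_mvs, limit)
    fine_mvs = [fine_search(mv) for mv in candidates]  # N in, N out
    return min(fine_mvs, key=cost_of), fine_mvs
```

  • Feeding neighbouring MVs alongside the coarse result gives the later stages a chance to recover when the down-sampled search settles on a poor match.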
  • the fractional pixel search module 215 is configured to set, for each PU block, a corresponding fractional pixel search area 530 in the reference frame according to each received fine search vector, and to generate in the fractional pixel search area 530 a fractional pixel search vector 523 with the lowest cost corresponding to the PU block.
  • the fractional pixel search area 530 can be determined in the following manner: according to the current PU position 520 and the previously acquired fine search motion vector, the starting search position 531 corresponding to the current PU position 520 is determined in the reference frame 510; with the pixel at the starting search position as the center, K pixels are expanded in each of the 4 directions (the value of K can be set according to actual needs), giving a square area with a side length of 2K as the fractional pixel search area 530.
  • taking the pixel at the starting search position 531 as the center, the cost between the current PU and the block of the same size as the current PU at each pixel point in the fractional pixel search area 530 is calculated in turn.
  • the minimum cost position 533 is found, and the motion vector between the current PU position 520 and the minimum cost position 533 is calculated and recorded as the fractional pixel search motion vector 523.
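  • The construction of the fractional pixel search area can be sketched as follows. The bounds follow the description of expanding K pixels in each direction around the starting search position; the enumeration of candidate offsets in quarter-pixel steps reflects the quarter-sample luma motion vector precision of H.265, while the function names and the tuple layout are assumptions. Sub-pixel interpolation itself is not shown.

```python
def fractional_search_area(start_x, start_y, k):
    """Return (left, top, right, bottom) of the square of side 2*k
    centered on the starting search position."""
    return (start_x - k, start_y - k, start_x + k, start_y + k)


def quarter_pel_offsets(k):
    """Enumerate candidate offsets within +/-k pixels in quarter-pixel steps.

    Offsets are returned as (dx, dy) in units of pixels.
    """
    steps = range(-4 * k, 4 * k + 1)
    return [(dx / 4, dy / 4) for dy in steps for dx in steps]


area = fractional_search_area(10, 10, 2)
offsets = quarter_pel_offsets(1)
```

  • Keeping K small bounds the number of interpolated candidates, which matters because each fractional position requires filtered reference samples rather than a direct pixel read.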
  • the precise comparison module 140 includes a distribution module 711, multiple single-stage calculation modules (such as 721, 722, 723, and 724), and multiple hierarchical comparison modules 740.
  • the distribution module 711 is connected to the coarse selection module 130 and to the multiple single-stage calculation modules; each single-stage calculation module is connected to a corresponding hierarchical comparison module 740, wherein:
  • the distribution module 711 is configured to distribute different prediction information corresponding to the CU block in each division mode to different single-stage calculation modules according to different division modes of each CTU block;
  • the single-stage calculation module is used to calculate multiple pieces of cost information according to the prediction information corresponding to the CU block received from the distribution module 711, compare them within its layer, and select the prediction mode and partition mode with the least cost corresponding to the CU block;
  • the hierarchical comparison module 740 is used to compare the cost information calculated by the single-stage calculation modules of different layers, and to select the partition mode with the smallest cost for the CTU block and the corresponding coding information.
  • the precise comparison module 140 of FIG. 7 includes four single-stage calculation modules 721, 722, 723, and 724.
  • Each single-stage calculation module 721, 722, 723, and 724 may be composed of the single-stage calculation module 810 of FIG. 8.
  • the single-stage calculation module 810 includes an inter-mode cost calculation module 820, an intra-mode cost calculation module 830, and an optimization module 840.
  • the single-stage calculation module 810 may calculate an inter-frame cost through the inter-mode cost calculation module 820 and an intra-frame cost through the intra-mode cost calculation module 830; the optimization module 840 then compares the inter-frame cost and the intra-frame cost to determine the partition mode and prediction mode with the smallest comprehensive cost, that is, the partition mode and prediction mode with the smallest cost corresponding to the currently input CU.
  • each single-stage calculation module 721, 722, 723, and 724 is used to process a CU block of a specific level.
  • the single-stage calculation module 721 can be set as a first-level calculation module for processing 64x64 CU blocks; the single-stage calculation module 722 can be set as a second-level calculation module for processing CU blocks of 32x32 size; single-stage calculation module 723 can be set as a three-level calculation module for processing 16x16 CU blocks; single-level calculation module 724 can be set as a four-level calculation module for processing 8x8 CU blocks.
  • the distribution module 711 can distribute the CU blocks to the calculation modules 721-724 of the corresponding levels according to the size of the CU in the various division modes.
  • the intra-mode cost calculation module 830 of each single-stage calculation module receives one or more pieces of intra-frame prediction information related to a CU of a certain level, and calculates and selects an intra-frame cost.
  • the inter-mode cost calculation module 820 of each single-stage calculation module receives, simultaneously and in parallel, one or more inter-frame motion vectors and reference information related to a CU of a certain level, and calculates and selects an inter-frame cost.
  • the optimization module 840 of each single-stage calculation module will select a minimum cost from the calculated intra-frame cost and inter-frame cost.
  • when the minimum cost is the intra-frame cost, using the relevant intra-frame prediction information for H.265 encoding is the better choice; when the minimum cost is the inter-frame cost, using the relevant inter-frame motion vector and reference information for H.265 encoding is the better choice.
  • the hierarchical comparison module 743 can compare the sum of the minimum costs corresponding to four 8x8 blocks calculated by the four-level calculation module 724 with the minimum cost of one 16x16 block calculated by the three-level calculation module 723, and obtain the less expensive division mode.
  • the four 8x8 blocks that form one side of the hierarchical comparison (referred to here as blocks A, B, C, and D) may all be minimum-cost blocks obtained by inter-frame prediction, may all be obtained by intra-frame prediction, or may be a mixture of the two. For example, block A may be obtained by inter-frame prediction while blocks B, C, and D are obtained by intra-frame prediction; or blocks A and C may be obtained by inter-frame prediction while blocks B and D are obtained by intra-frame prediction.
  • the hierarchical comparison module 742 can take the four minimum-cost 16x16 blocks obtained from the hierarchical comparison module 743 and compare them with the one minimum-cost 32x32 block calculated by the secondary calculation module 722.
  • each of the four 16x16 blocks (referred to here as blocks E, F, G, and H) selected by the hierarchical comparison module 742 may be a complete 16x16 CU block or may be composed of multiple 8x8 blocks. For example, the E block may be a 16x16 CU block obtained by inter-frame prediction, the F block may be a 16x16 CU block obtained by intra-frame prediction, and the G block may be a 16x16 combined block composed of four 8x8 blocks obtained by inter-frame and intra-frame prediction.
  • the hierarchical comparison module 741 can take the four minimum-cost 32x32 blocks obtained from the hierarchical comparison module 742 and compare them with the one minimum-cost 64x64 block calculated by the first-level calculation module 721.
  • each of the four 32x32 blocks selected by the hierarchical comparison module 741 (referred to here as blocks I, J, K, and L) may be a complete 32x32 CU block, or may be composed of multiple 16x16 blocks, each of which may itself be a combined block composed of multiple 8x8 blocks.
  • for example, the I block may be a 32x32 CU block obtained by inter-frame prediction; the J block may be composed of four 16x16 CU blocks obtained by inter-frame and intra-frame prediction; and one or more of the 16x16 blocks in the K block may be composed of multiple 8x8 blocks.
  • the hierarchical comparison module 740 can find the combination of the CTU, CU, and PU block with the smallest cost, and select the partition mode with the smallest cost for the CTU block and the corresponding coding information.
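The bottom-up comparison described above (8x8 against 16x16, then 16x16 against 32x32, then 32x32 against 64x64) can be sketched as a quadtree recursion. This is only an illustrative software stand-in for the parallel hardware modules: the function name `best_partition` and the `costs` dictionary layout are assumptions made for the example, and ties are arbitrarily resolved in favour of intra prediction and of the unsplit block.

```python
def best_partition(costs, size=64, x=0, y=0, min_size=8):
    """Pick the cheapest partition of the block at (x, y) with the given size.

    `costs` maps (size, x, y) -> (intra_cost, inter_cost) for every candidate
    CU (hypothetical layout). Returns (total_cost, blocks), where each block
    is a tuple (size, x, y, mode).
    """
    intra, inter = costs[(size, x, y)]
    # Role of the optimization module: cheaper of intra and inter for this CU.
    mode = "intra" if intra <= inter else "inter"
    whole = (min(intra, inter), [(size, x, y, mode)])
    if size == min_size:
        return whole

    # Role of the hierarchical comparison: sum of the four best sub-blocks
    # versus the cost of the single larger block.
    half = size // 2
    split_cost, split_blocks = 0, []
    for oy in (0, half):
        for ox in (0, half):
            cost, blocks = best_partition(costs, half, x + ox, y + oy, min_size)
            split_cost += cost
            split_blocks += blocks
    return whole if whole[0] <= split_cost else (split_cost, split_blocks)
```

Because each sub-block carries its own mode, the winning combination can freely mix inter-predicted and intra-predicted blocks of different sizes, as in the A to L block examples above.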
  • the inventor also provides an H.265 encoding method, which is applied to an H.265 encoding device, and the device includes the following modules: a preprocessing module, a coarse selection module, and a precise comparison module.
  • the preprocessing module is connected to the coarse selection module, and the coarse selection module is connected to the precise comparison module; the method includes the following steps:
  • the preprocessing module divides a current frame in an original video into multiple CTU blocks
  • the coarse selection module divides each CTU block according to multiple division modes.
  • Each division mode divides a CTU block into corresponding multiple CU blocks, and divides each CU block into one or more corresponding PU blocks; inter-frame prediction and intra-frame prediction are performed on each division mode of each CTU block, and prediction information corresponding to each division mode is generated;
  • the precise comparison module performs cost comparison on the prediction information corresponding to each partition mode of each CTU block, selects the partition mode with the smallest cost for each CTU block and the coding information corresponding to that partition mode, and, according to the selected partition mode and its corresponding coding information, generates the entropy coding information used to generate the H.265 bitstream from the current frame and the reconstruction information used to generate the reconstructed frame from the current frame.
  • the device further includes an entropy encoding module connected to the precise comparison module; the method includes the following step: the entropy encoding module generates the H.265 bitstream corresponding to the current frame from the entropy coding information of the current frame, which is generated according to the minimum-cost partition mode of each CTU block and its corresponding coding information.
  • the device further includes a post-processing module connected to the precise comparison module; the method includes: the post-processing module generates a reconstructed frame corresponding to the current frame from the reconstruction information of the current frame, which is generated according to the minimum-cost division mode of each CTU block and its corresponding coding information.
  • the post-processing module includes a deblocking filter module and a sample adaptive offset module; the deblocking filter module is connected to the sample adaptive offset module; the method includes: the deblocking filter module filters the reconstructed frame using the minimum-cost partition mode and its corresponding coding information provided by the precise comparison module; the sample adaptive offset module performs SAO calculation on the filtered reconstructed frame and transmits the calculated data to the entropy encoding module.
  • the coarse selection module includes an inter-frame prediction coarse selection module and an intra-frame prediction coarse selection module.
  • the inter-frame prediction coarse selection module is connected to the preprocessing module and to the precise comparison module.
  • the intra-frame prediction coarse selection module is connected to the preprocessing module and to the precise comparison module; the method includes: the inter-frame prediction coarse selection module performs inter prediction on each PU block in each division mode, selects, for each PU block, one or more pieces of reference information obtained from a reference frame whose cost is less than the preset cost value, and uses the motion vector of the selected reference PU block as the prediction information corresponding to the division mode; the intra-frame prediction coarse selection module performs intra prediction on each PU block in each division mode, selects, for each PU block, one or more intra prediction directions whose cost is less than the preset cost value, and uses the selected intra prediction directions as the prediction information corresponding to the division mode.
  • the intra-frame prediction coarse selection module further includes a reference pixel generation module; the method includes: for each PU block in each division mode, the reference pixel generation module uses the original pixels of the current frame to generate reference pixels, predicts all intra-frame prediction directions from the reference pixels according to the rules of the H.265 protocol to obtain the prediction result in each direction, calculates the distortion cost of each prediction result against the original pixels, and sorts the costs from smallest to largest to select one or more intra-frame prediction directions with a lower cost.
  • the inter-frame prediction coarse selection module further includes: a coarse search module, a fine search module, and a fractional pixel search module.
  • the coarse search module is connected to the preprocessing module, the coarse search module is connected with the fine search module, and the fine search module is connected with the fractional pixel search module.
  • the method includes:
  • step S201: the coarse search module selects a frame from the reference list and selects a reference frame from its original frame or reconstructed frame; then step S202: the reference frame and the current CTU block are down-sampled; then proceed to step S203.
  • step S203: in the down-sampled reference frame, find the pixel position with the least cost compared with the down-sampled CTU block, and calculate the coarse search vector of that pixel position relative to the current CTU block.
  • the method includes:
  • step S301: the fine search module sets, for each PU block and according to the coarse search vector, a fine search area in the reconstructed image of the reference frame; then step S302: a fine search vector with the least cost corresponding to the PU block is generated in the fine search area; in addition, according to the motion vector information around the current CTU block, one or more predicted motion vectors with the same function as the coarse search vector are generated, and a fine search vector is generated according to each predicted motion vector; all fine search vectors are sent to the fractional pixel search module.
  • the method includes:
  • step S401: the fractional pixel search module sets, for each PU block and according to each received fine search vector, a corresponding fractional pixel search area in the reference frame; then step S402: a fractional pixel search vector with the smallest cost corresponding to the PU block is generated in the fractional pixel search area.
  • the precise comparison module includes a distribution module, multiple layered calculation modules, and multiple layered comparison modules; the distribution module is connected to the coarse selection module, and the layered comparison modules are connected to the distribution module; the method includes:
  • the distribution module distributes each CU block in each division mode and the prediction information corresponding to the CU block to different layered calculation modules according to each division mode of each CTU block;
  • the layered calculation module calculates multiple cost information according to the received prediction information corresponding to the CU block and performs intra-layer comparison, and selects a prediction mode and partition mode with the least cost corresponding to the CU block;
  • the layered comparison module compares the prediction mode selected by the layered calculation modules of different layers and the minimum cost corresponding to the partition mode, and selects the partition mode with the smallest cost for the CTU block and the corresponding coding information.
  • the device includes the following modules: a preprocessing module, a coarse selection module, and a precise comparison module; the preprocessing module is connected to the coarse selection module, and the coarse selection module is connected to the precise comparison module; wherein: the preprocessing module is used to divide a current frame in an original video into multiple CTU blocks; the coarse selection module is used to divide each CTU block according to multiple division modes, where each division mode divides a CTU block into corresponding multiple CU blocks and divides each of the CU blocks into one or more corresponding PU blocks; the coarse selection module is also used to perform inter-frame prediction and intra-frame prediction for each division mode of each CTU block and to generate prediction information corresponding to each division mode; the precise comparison module is used to perform cost comparison on the prediction information corresponding to each division mode of each CTU block, to select the partition mode with the smallest cost for each CTU block and the coding information corresponding to that partition mode, and the selected partition
  • the H.265 encoding device designed in the present invention can adopt another implementation manner of a pipeline including multiple pipeline steps to implement multiple steps in a specific embodiment.
  • the above-mentioned "pipeline" refers to splitting the H.265 encoding process into multiple steps and executing these steps in parallel on multiple corresponding hardware processing units to speed up processing.
  • "pipeline step" refers to a specific step in a pipeline;
  • pipeline stage refers to a specific pipeline stage within a pipeline step.
  • a pipeline can include one or more pipeline steps;
  • a pipeline step can include one or more pipeline stages. When a pipeline step includes only one pipeline stage, the pipeline step and the pipeline stage can be treated as equivalent.
  • a specific hardware module can support the operation of one or more pipeline steps. That is to say, all the pipeline stages in these pipeline steps are run by the hardware module (or by the sub-modules contained therein).
  • a specific hardware module can support at least one pipeline stage. If there are multiple pipeline stages in a pipeline step, the hardware module is only responsible for the operation of one or more pipeline stages in the pipeline step. In other words, the pipeline step can be implemented by multiple hardware modules, and each hardware module is responsible for running the corresponding pipeline stage in the corresponding pipeline step.
  • the device includes multiple modules and multiple pipeline steps, each pipeline step includes at least one pipeline stage for executing at least one module, wherein:
  • the multiple modules include a preprocessing module 120, a coarse selection module 130, a precise comparison module 140, and an overall control module 920, and the overall control module 920 is connected to the preprocessing module 120, the coarse selection module 130, and the precise comparison module 140, respectively;
  • the multiple pipeline steps include a preprocessing pipeline step, a coarse selection pipeline step, and a precise comparison pipeline step; the coarse selection pipeline step is performed after the preprocessing pipeline step, and the precise comparison pipeline step is performed after the coarse selection pipeline step;
  • the preprocessing pipeline step uses the preprocessing module 120 to divide a current frame 102 in an original video 100 into multiple CTU blocks (Coding Tree Units).
  • the CTU is a sub-block in the current frame image, and the size can be any of 16x16 sub-blocks, 32x32 sub-blocks, and 64x64 sub-blocks.
  • the preprocessing module may obtain an original image frame 101 in the original video 100, and select a current frame 102 from the original image frame 101.
  • the coarse selection pipeline step uses the coarse selection module 130 to divide each CTU block according to multiple division modes, perform coarse selection of inter prediction and coarse selection of intra prediction for each division mode of each CTU block, and generate prediction information corresponding to each division mode.
  • the coarse selection module includes: an inter prediction coarse selection module and an intra prediction coarse selection module;
  • the coarse selection pipeline step includes: an inter-frame prediction coarse selection pipeline stage and an intra-frame prediction coarse selection pipeline stage;
  • the inter-frame prediction coarse selection pipeline stage uses the inter-frame prediction coarse selection module to divide each CTU block according to multiple division modes.
  • Each division mode divides a CTU block into corresponding multiple CU blocks (Coding Units) and divides each CU block into one or more corresponding PU blocks (Prediction Units); inter-frame prediction is performed for each division mode of each CTU block and reference frame information is obtained, and intra-frame prediction is performed for each division mode of each CTU block, generating prediction information corresponding to each division mode.
  • the division mode is selected according to actual needs. For example, for a current CTU 121 with a size of 64x64, it can be divided into 4 32x32 sub-blocks; for each 32x32 sub-block, it can be divided into 4 16x16 sub-blocks.
  • the intra-frame prediction coarse selection pipeline stage uses the intra-frame prediction coarse selection module to perform intra-frame prediction on each PU block in each division mode and calculate the corresponding cost, to select one or more intra prediction directions for each PU block according to the cost, and to use the selected intra prediction directions as the prediction information corresponding to the division mode.
  • Each PU block has its own corresponding motion vector, which is used to obtain prediction information from the reconstructed reference frame. Specifically, taking the location of the current PU block as the starting point, the prediction information is obtained at the position indicated by the corresponding motion vector.
  • the precise comparison pipeline step uses the precise comparison module 140 to calculate and compare the costs of the prediction information corresponding to each partition mode of each CTU block, to select the partition mode with the smallest cost for each CTU block and the coding information corresponding to that partition mode, and, according to the selected partition mode and its corresponding coding information, to generate the entropy coding information for generating the H.265 bitstream from the current frame and the reconstruction information for generating the reconstructed frame from the current frame. In this way, the search accuracy is improved through the distributed search, while the details of the reconstructed image are better preserved and the hardware resource consumption is reduced.
  • the overall control module is used to control the storage and retrieval of original frame data and reference frame data, and to control the preprocessing module, coarse selection module, and precise comparison module to sequentially execute the corresponding pipeline steps.
  • the coarse selection pipeline step is performed after the preprocessing pipeline step, and the precise comparison pipeline step is performed after the coarse selection pipeline step. Thus, while the coarse selection module executes the coarse selection pipeline step for the current frame, the preprocessing module can perform the preprocessing pipeline step for the next frame; likewise, while the precise comparison module executes the precise comparison pipeline step for the current frame, the coarse selection module can perform the coarse selection pipeline step for the next frame, and so on, achieving pipelined operation and thereby effectively improving coding efficiency.
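The overlap between pipeline steps can be illustrated with a small scheduling sketch. It assumes, purely for illustration, that every step takes one time slot per frame and that the overall control module advances all steps in lockstep; the step names and the function `pipeline_schedule` are hypothetical and do not come from the patent.

```python
STEPS = ["preprocess", "coarse_select", "precise_compare"]


def pipeline_schedule(num_frames, steps=STEPS):
    """Return, for each time slot, the frame index each step works on.

    A value of None means the step is idle in that slot (pipeline fill or
    drain). Step s always works on the frame that step s - 1 handled in the
    previous slot.
    """
    slots = []
    for t in range(num_frames + len(steps) - 1):
        slot = {}
        for s, name in enumerate(steps):
            frame = t - s
            slot[name] = frame if 0 <= frame < num_frames else None
        slots.append(slot)
    return slots
```

For example, `pipeline_schedule(3)` yields five slots, and in the third slot the three modules work on frames 2, 1, and 0 at the same time, which is the overlap the description relies on for its efficiency gain.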
  • the coarse selection module 130 further includes an inter-frame coarse selection module 230
  • the precise comparison module 140 further includes an intra-frame coarse selection module 330
  • the coarse selection pipeline step includes an inter-frame coarse selection pipeline stage.
  • the precise comparison pipeline step includes an intra-frame coarse selection pipeline stage.
  • the inter-frame prediction coarse selection pipeline stage uses the inter-frame prediction coarse selection module to divide each CTU block according to multiple division modes.
  • Each division mode divides a CTU block into corresponding multiple CU blocks and divides each CU block into one or more corresponding PU blocks; inter-frame prediction is performed for each division mode of each CTU block and reference frame information is obtained, and intra-frame prediction is performed for each division mode of each CTU block, generating prediction information corresponding to each division mode;
  • the intra-frame prediction coarse selection pipeline stage uses the intra-frame prediction coarse selection module to perform intra-frame prediction on each PU block in each division mode and calculate the corresponding cost, to select one or more intra prediction directions for each PU block according to the cost, and to use the selected intra prediction directions as the prediction information corresponding to the division mode.
  • the intra-frame coarse selection module 330 can be attached to the coarse selection module 130 or the precise comparison module 140, thereby broadening the application scenarios of the device.
  • the inter-frame prediction coarse selection module 230 includes: a coarse search module 211, a reference frame data loading module 910, a fine search module 213, and a fractional pixel search module 215.
  • the coarse selection pipeline step includes: a coarse search pipeline stage, a reference frame data loading pipeline stage, a fine search pipeline stage, and a fractional pixel search pipeline stage;
  • the coarse search pipeline stage uses the coarse search module to: select a frame from the reference list, select a reference frame from its original frame or reconstructed frame, down-sample the reference frame and the current CTU block, find in the down-sampled reference frame the pixel location with the least cost compared with the down-sampled CTU block, and calculate the coarse search vector of that pixel location relative to the current CTU block;
  • the reference frame data loading pipeline stage uses the reference frame data loading module to: obtain the coarse search vector of the coarse search pipeline stage through the overall control module, obtain one or more predicted motion vectors with the same function as the coarse search vector according to the motion vectors around the CTU block, load reference frame data according to the coarse search vector and the one or more predicted motion vectors, and pass the data to the fine search pipeline stage through the overall control module;
  • the fine search pipeline stage uses the fine search module to: set, for each PU block and according to the coarse search vector, a fine search area in the reconstructed image of the reference frame, and generate in the fine search area a fine search vector with the smallest cost corresponding to the PU block; generate one or more predicted motion vectors with the same function as the coarse search vector based on the motion vector information around the current CTU block, and generate a fine search vector based on each predicted motion vector; and send all the generated fine search vectors to the fractional pixel search module;
  • the fractional pixel search pipeline stage uses the fractional pixel search module to: set, for each PU block and according to each received fine search vector, a corresponding fractional pixel search area in the reference frame, and generate in the fractional pixel search area a fractional pixel search vector with the smallest cost corresponding to the PU block.
  • the intra-frame prediction coarse selection pipeline and the fractional pixel search pipeline are the same pipeline stage, and the intra-frame prediction coarse selection module and the fractional pixel search module are executed in parallel at the same pipeline stage.
  • the reference list is a list storing reference frames, and the reference frame of the current frame may have multiple frames, all of which are indexed through the reference list.
  • a reference frame includes reconstructed frames and original frames. Since the reference frame and the current CTU block are both obtained through down-sampling, the coarse search vector calculated by the coarse search module is likewise a down-sampled search vector; that is, the coarse search vector corresponding to the current CTU block needs to be multiplied by the down-sampling ratio (such as 1/4), and the coarse search vector multiplied by the corresponding ratio is transmitted to the next processing module.
  • the intra-frame prediction coarse selection module includes a reference pixel generation module, which is executed in the intra-frame prediction coarse selection pipeline stage; the intra-frame prediction coarse selection pipeline stage includes: for each PU block in each division mode, using the original pixels of the current frame to generate reference pixels, predicting all intra-frame prediction directions from the reference pixels according to the rules of the H.265 protocol to obtain the prediction result in each direction, calculating the distortion cost of each prediction result against the original pixels, and sorting the costs from smallest to largest to select one or more intra prediction directions with a small cost.
  • the intra-frame prediction coarse selection pipeline and the fractional pixel search pipeline are different pipeline stages, and the intra-frame prediction coarse selection module is executed at the pipeline stage after the fractional pixel search module.
  • the intra-frame prediction coarse selection module includes a reference pixel generation module, which is executed in the intra-frame prediction coarse selection pipeline stage; the reference pixel generation module is used, for each PU block in each division mode, to generate reference pixels from the reconstructed pixels of the current frame, predict all intra-frame prediction directions from the reference pixels according to the rules of the H.265 protocol to obtain the prediction result in each direction, calculate the distortion cost of each prediction result against the original pixels, and sort the costs from smallest to largest to select one or more intra-frame prediction directions with a lower cost.
  • the coarse search module selects one of the original frame or the reconstructed frame as a reference frame, down-samples the reference frame and the current CTU respectively, and then finds, in the down-sampled reference frame, the pixel location with the least cost compared with the down-sampled CTU, together with the coarse search vector.
  • the down-sampling scaling ratio of the reference frame and the current CTU are the same.
  • if the down-sampled image 320, obtained from the reference frame 310 by down-sampling 311, scales the length and width of the reference frame to 1/4, then the down-sampled CTU 340, obtained from the current CTU 330 by down-sampling 331, likewise scales the length and width of the current CTU 330 to 1/4. Then, taking the down-sampled CTU 340 (B sub-block in Figure 3) as the unit, prediction is performed in the down-sampled image (A sub-block in Figure 3): the cost between the down-sampled CTU 340 and each sub-block of the down-sampled image 320 is calculated in turn (taking each pixel in the A sub-block as the center, a sub-block with the same size as the B sub-block is taken); the pixel block with the smallest cost compared with the down-sampled CTU is found and recorded as the minimum cost pixel block 352 (C sub-block in Figure 3), and the center pixel position of the minimum cost pixel block and the coarse search vector are recorded.
  • the coarse search vector is the vector displacement between the center pixel of the down-sampled CTU 340 (B sub-block in Figure 3) and the center pixel position of the minimum cost pixel block 352 (C sub-block in Figure 3), that is, the motion vector 351 in Figure 3.
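The coarse search on down-sampled data can be sketched as below. Several details are assumptions made for the example, since the description does not fix them: down-sampling is done by averaging each 4x4 tile, the cost is a sum of absolute differences, and the search is exhaustive over the whole down-sampled image. The names `downsample`, `sad`, and `coarse_search` are hypothetical.

```python
def downsample(img, factor=4):
    """Scale width and height by 1/factor by averaging factor x factor tiles.

    Image dimensions are assumed to be multiples of `factor`.
    """
    h, w = len(img), len(img[0])
    return [[sum(img[y + j][x + i] for j in range(factor) for i in range(factor))
             // (factor * factor)
             for x in range(0, w, factor)]
            for y in range(0, h, factor)]


def sad(block_a, block_b):
    """Sum of absolute differences between two equally sized 2-D blocks."""
    return sum(abs(a - b)
               for row_a, row_b in zip(block_a, block_b)
               for a, b in zip(row_a, row_b))


def coarse_search(ref, ctu, ctu_x, ctu_y, factor=4):
    """Return (coarse_mv, cost) in down-sampled units.

    Both blocks have the same size, so the displacement between their
    top-left corners equals the displacement between their center pixels.
    """
    small_ref = downsample(ref, factor)
    small_ctu = downsample(ctu, factor)
    bh, bw = len(small_ctu), len(small_ctu[0])
    sx, sy = ctu_x // factor, ctu_y // factor  # CTU position after scaling
    best_cost, best_mv = None, None
    for y in range(len(small_ref) - bh + 1):
        for x in range(len(small_ref[0]) - bw + 1):
            candidate = [row[x:x + bw] for row in small_ref[y:y + bh]]
            cost = sad(small_ctu, candidate)
            if best_cost is None or cost < best_cost:
                best_cost, best_mv = cost, (x - sx, y - sy)
    return best_mv, best_cost
```

Searching at 1/4 scale in each dimension reduces both the number of candidate positions and the samples per cost evaluation, which is why the result is only a coarse vector that the later fine search refines.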
  • for the adjacent CTU blocks, such as those on the upper left side and the upper right side, there are corresponding coarse search results and corresponding motion vector information.
  • there are 16 assisted motion vectors in the current CTU block so there are at most 28 mvs as adjacent mvs (that is, the motion vector information around the current CTU block).
  • the 28 motion vector information will undergo a certain screening, and a preset number (such as 3) of adjacent mvs will be screened out and transmitted to the fine search module to determine the same preset number of fine search motion vectors.
  • "the same function" means that the screened preset number of adjacent MVs are treated in the same way as the search result obtained by the coarse search module, that is, they are likewise input to the interface of the fine search module for further processing.
  • the coarse search module inputs one motion vector to the fine search module, and several further MVs are selected from the adjacent MVs and also input to the fine search module. Assuming that a total of N MVs are input to the fine search module, the fine search module generates N fine search MVs (that is, fine search vectors) and inputs these N fine search vectors to the FME (that is, the fractional pixel search module). The FME then compares the costs of these N fine search MVs to obtain an optimal fme_mv (that is, the fractional pixel search vector), and this fme_mv is finally input to the precise comparison module.
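The flow of candidate vectors from the coarse search through the fine search to the FME can be sketched as below. The source does not say how the adjacent mvs are screened, so `screen_neighbors` uses a purely hypothetical rule (keep the most frequent vectors); `fine_search` and `fme_search` are placeholder callables standing in for the fine search module and the fractional pixel search module.

```python
from collections import Counter


def screen_neighbors(neighbor_mvs, keep=3):
    """Hypothetical screening rule: keep the `keep` most frequent adjacent mvs."""
    return [mv for mv, _ in Counter(neighbor_mvs).most_common(keep)]


def select_fme_mv(coarse_mv, neighbor_mvs, fine_search, fme_search, keep=3):
    """N candidate mvs go in, N fine search vectors come out, one fme_mv is kept.

    fine_search(mv) -> fine search vector
    fme_search(fine_mv) -> (fractional pixel vector, cost)
    """
    candidates = [coarse_mv] + screen_neighbors(neighbor_mvs, keep)
    fine_mvs = [fine_search(mv) for mv in candidates]
    results = [fme_search(mv) for mv in fine_mvs]
    # The FME keeps the vector with the lowest cost among the N results.
    return min(results, key=lambda r: r[1])[0]
```

Passing the search stages in as callables keeps the sketch independent of any particular cost function.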
  • the fractional pixel search module 215 is configured to set a corresponding fractional pixel search area 530 in the reference frame for each PU block according to each received fine search vector, and to generate a fractional pixel search vector 423 with the lowest cost corresponding to the PU block in the fractional pixel search area 530.
  • the fractional pixel search area 530 can be determined in the following manner: according to the current PU position 520 and the previously acquired fine search motion vector, the starting search position 531 corresponding to the current PU position 520 is determined in the reference frame 510; with the pixel at the starting search position as the center, K pixels are expanded in each of the 4 directions (the value of K can be set according to actual needs), and a square area with a side length of 2K is obtained as the fractional pixel search area 530.
  • taking the pixel at the starting search position 531 as the center, the cost of a block of the same size as the current PU is calculated in turn at the starting search position 531 and at each pixel point in the fractional pixel search area 530.
  • the minimum cost position 533 is calculated, and the motion vector between the current PU position and the minimum cost position 533 is calculated and recorded as the fractional pixel search motion vector 523.
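A minimal sketch of the fractional pixel search follows. Several details are assumptions that the text does not state: quarter-pixel steps, bilinear interpolation as a simple stand-in for the interpolation filters that H.265 actually specifies, SAD as the cost, and the names `sample` and `fractional_search`. The window extends K pixels in each direction from the starting position, giving the square of side 2K described above.

```python
def sample(ref, y, x):
    """Bilinear sample of ref at a non-negative fractional position (assumed filter)."""
    y0, x0 = int(y), int(x)
    y1 = min(y0 + 1, len(ref) - 1)
    x1 = min(x0 + 1, len(ref[0]) - 1)
    fy, fx = y - y0, x - x0
    top = ref[y0][x0] * (1 - fx) + ref[y0][x1] * fx
    bottom = ref[y1][x0] * (1 - fx) + ref[y1][x1] * fx
    return top * (1 - fy) + bottom * fy


def fractional_search(ref, pu, pu_pos, fine_mv, k=1, step=0.25):
    """Return ((mv_y, mv_x), cost) for the minimum-cost fractional position.

    pu_pos is the current PU position and fine_mv the fine search vector, so
    pu_pos + fine_mv is the starting search position.
    """
    h, w = len(pu), len(pu[0])
    start_y, start_x = pu_pos[0] + fine_mv[0], pu_pos[1] + fine_mv[1]
    steps = int(k / step)
    best = None
    for iy in range(-steps, steps + 1):
        for ix in range(-steps, steps + 1):
            top, left = start_y + iy * step, start_x + ix * step
            # Skip candidates whose block would fall outside the reference frame.
            if top < 0 or left < 0 or top + h > len(ref) or left + w > len(ref[0]):
                continue
            cost = sum(
                abs(sample(ref, top + y, left + x) - pu[y][x])
                for y in range(h)
                for x in range(w)
            )
            if best is None or cost < best[1]:
                best = ((top - pu_pos[0], left - pu_pos[1]), cost)
    return best
```

The returned vector is measured from the current PU position to the minimum cost position, matching the definition of the fractional pixel search motion vector in the text.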
  • the precise comparison module 140 includes a distribution module 711, multiple single-stage calculation modules (such as 721, 722, 723, and 724), and multiple hierarchical comparison modules 740.
  • the distribution module 711 is connected to the coarse selection module 130 and to a plurality of single-stage calculation modules; each single-stage calculation module is connected to a corresponding hierarchical comparison module 740, wherein:
  • the distribution module 711 is configured to distribute different prediction information corresponding to the CU block in each division mode to different single-stage calculation modules according to different division modes of each CTU block;
  • the single-stage calculation module is used to calculate multiple items of cost information according to the prediction information corresponding to the CU block received from the distribution module 711, to compare them, and to select the prediction mode and partition mode with the least cost corresponding to the CU block;
  • the hierarchical comparison module 740 is used to compare the cost information calculated by the single-stage calculation modules of different levels, and to select the partition mode with the smallest cost for the CTU block and the corresponding coding information.
  • the exact comparison module 140 of FIG. 7 includes four single-stage calculation modules 721, 722, 723, and 724.
  • Each single-stage calculation module 721, 722, 723, and 724 may be composed of the single-stage calculation module 810 of FIG. 8.
  • the single-stage calculation module 810 includes an inter-mode cost calculation module 820, an intra-mode cost calculation module 830, and an optimization module 840.
  • the single-stage calculation module 810 may calculate an inter-frame cost through the inter-mode cost calculation module 820 and an intra-frame cost through the intra-mode cost calculation module 830; the optimization module 840 then compares the inter-frame cost with the intra-frame cost and determines the partition mode and prediction mode with the smallest overall cost, that is, the partition mode and prediction mode with the smallest cost corresponding to the currently input CU.
  • each single-stage calculation module 721, 722, 723, and 724 is used to process a CU block of a specific level.
  • the single-stage calculation module 721 can be set as a first-level calculation module for processing 64x64 CU blocks; the single-stage calculation module 722 can be set as a second-level calculation module for processing 32x32 CU blocks; the single-stage calculation module 723 can be set as a third-level calculation module for processing 16x16 CU blocks; and the single-stage calculation module 724 can be set as a fourth-level calculation module for processing 8x8 CU blocks.
  • the distribution module 711 can distribute to the computing modules 721-724 at all levels according to the size of the CU in various division modes.
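The size-based distribution can be illustrated with a small lookup. The level numbering follows the text (64x64 at level one down to 8x8 at level four); the function name `distribute` and the representation of prediction information as opaque values are assumptions made for illustration.

```python
# CU size handled by each single-stage calculation level, as described in the text.
LEVEL_BY_CU_SIZE = {64: 1, 32: 2, 16: 3, 8: 4}


def distribute(cu_predictions):
    """Route (cu_size, prediction_info) pairs to per-level work queues."""
    queues = {level: [] for level in LEVEL_BY_CU_SIZE.values()}
    for cu_size, info in cu_predictions:
        queues[LEVEL_BY_CU_SIZE[cu_size]].append(info)
    return queues
```

Keeping one queue per level mirrors the idea that each level's calculation module only ever sees CU blocks of a single size.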
  • the intra-mode cost calculation module 830 of each single-stage calculation module receives one or more items of intra-frame prediction information related to a CU of a certain level, and calculates and selects an intra-frame cost.
  • the inter-mode cost calculation module 820 of each single-stage calculation module simultaneously (in parallel) receives one or more inter-frame motion vectors and reference information related to a CU of a certain level, and calculates and selects an inter-frame cost.
  • the optimization module 840 of each single-stage calculation module selects the minimum cost from the calculated intra-frame cost and inter-frame cost.
  • when the minimum cost is the intra-frame cost, using the relevant intra-frame prediction information for H.265 encoding is the better choice; when the minimum cost is the inter-frame cost, using the relevant inter-frame motion vector and reference information for H.265 encoding is the better choice.
  • the hierarchical comparison module 743 can compare the sum of the minimum costs corresponding to four 8x8 blocks calculated by the fourth-level calculation module 724 with the minimum cost of one 16x16 block calculated by the third-level calculation module 723, and obtain the less expensive division mode.
  • the four 8x8 blocks (referred to here as blocks A, B, C, and D) that form one of the objects of the hierarchical comparison may all be minimum-cost blocks obtained by inter-frame prediction, may all be obtained by intra-frame prediction, or may be a mixture of the two.
  • for example, block A may be obtained by inter-frame prediction while blocks B, C, and D are obtained by intra-frame prediction.
  • as another example, blocks A and C may be obtained by inter-frame prediction while blocks B and D are obtained by intra-frame prediction.
  • the hierarchical comparison module 742 can take the four 16x16 blocks with the lowest cost obtained from the hierarchical comparison module 743 and compare them with the one 32x32 block with the lowest cost calculated by the second-level calculation module 722.
  • each of the four 16x16 blocks (referred to here as blocks E, F, G, and H) selected by the hierarchical comparison module 742 may be a complete 16x16 CU block, or may be composed of multiple 8x8 blocks.
  • for example, block E may be a 16x16 CU block obtained by inter-frame prediction,
  • block F may be a 16x16 CU block obtained by intra-frame prediction,
  • and block G may be a 16x16 combined block composed of four 8x8 blocks obtained by inter-frame and intra-frame prediction.
  • the hierarchical comparison module 741 can take the four 32x32 blocks with the smallest cost obtained from the hierarchical comparison module 742 and compare them with the one 64x64 block with the smallest cost calculated by the first-level calculation module 721.
  • each of the four 32x32 blocks selected by the hierarchical comparison module 741 (referred to here as blocks I, J, K, and L) may be a complete 32x32 CU block, or may be composed of multiple 16x16 blocks, each of which may in turn be a combined block composed of multiple 8x8 blocks.
  • for example, block I may be a 32x32 CU block obtained by inter-frame prediction; block J may be composed of four 16x16 CU blocks obtained by inter-frame and intra-frame prediction; and one or more of the 16x16 blocks in block K may be composed of multiple 8x8 blocks.
  • the hierarchical comparison module 740 can find the combination of the CTU, CU, and PU block with the smallest cost, and select the partition mode with the smallest cost for the CTU block and the corresponding coding information.
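The layered comparison (four 8x8 blocks against one 16x16 block, four 16x16 results against one 32x32 block, and four 32x32 results against one 64x64 block) can be sketched as a recursion over the quadtree. The hardware described above evaluates the levels bottom-up in separate modules; the recursive form below is only an equivalent software illustration. The callable `cu_cost` is a hypothetical stand-in for a single-stage calculation module that returns the smaller of the inter-frame and intra-frame costs together with the winning prediction mode.

```python
def best_partition(size, y, x, cu_cost, min_size=8):
    """Return (cost, tree) for the cheapest partition of the square block at (y, x).

    cu_cost(size, y, x) -> (cost, mode) for coding the block as one CU.
    tree is ("cu", size, y, x, mode) or ("split", [four sub-trees]).
    """
    whole_cost, mode = cu_cost(size, y, x)
    if size == min_size:
        return whole_cost, ("cu", size, y, x, mode)

    half = size // 2
    children = [
        best_partition(half, y + dy, x + dx, cu_cost, min_size)
        for dy in (0, half)
        for dx in (0, half)
    ]
    split_cost = sum(cost for cost, _ in children)

    # Keep the four sub-blocks only if their summed cost beats the single CU.
    if split_cost < whole_cost:
        return split_cost, ("split", [tree for _, tree in children])
    return whole_cost, ("cu", size, y, x, mode)
```

Calling `best_partition(64, 0, 0, cu_cost)` yields the cheapest combination for one CTU; each leaf keeps its own mode, so inter-frame and intra-frame blocks can be mixed freely within the same CTU.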
  • the intra-frame prediction coarse selection module 330 includes a reference pixel generation module 231; the intra-frame prediction coarse selection module 330 is executed in the intra-frame prediction coarse selection pipeline;
  • the reference pixel generating module 231 is used to generate reference pixels from the original pixels of the current frame for each PU block in each division mode, to perform prediction in all intra-frame prediction directions according to the rules of the H.265 protocol based on the reference pixels so as to obtain a prediction result for each direction, to calculate the distortion cost between each prediction result and the original pixels, and to sort the costs from small to large so as to select one or more intra-frame prediction directions with lower cost.
  • the coarse selection method of the intra-frame prediction coarse selection module is similar to that of the inter-frame prediction coarse selection module, and will not be repeated here.
  • the difference between the two is that, for intra-frame prediction, the original frame is down-sampled to obtain the down-sampled image, and the down-sampled CTU is predicted within that down-sampled image; for inter-frame prediction, the reference frame is down-sampled to obtain the down-sampled image, and the down-sampled CTU is predicted within the down-sampled image obtained from the reference frame.
  • the reference pixels should be reconstructed pixels, but in the process of hardware implementation, only the original pixels can be obtained at the current time point, and the reconstructed pixels are often not available. Therefore, the method of replacing reconstructed pixels with original pixels is adopted in the present invention.
  • the black-filled dots in the figure are edge pixels.
  • the 4x4 block (the shadow-filled dots in Figure 6-B) has a total of 17 boundary pixels.
  • the black-filled pixels in the figure (that is, the edge pixels) should be filled with reconstructed pixels, but the reconstructed pixels cannot be obtained at the current time point, so original pixels are used instead.
  • the shaded part is a PU block of 4x4 size. After the boundary pixels have been filled, prediction is performed according to the protocol to obtain the predicted 4x4 block (the shaded part).
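The coarse intra-frame selection with original pixels as reference pixels can be sketched as below. For a 4x4 block the reference set is one corner pixel, eight pixels above, and eight pixels to the left, which gives the 17 boundary pixels mentioned above. This is a simplified illustration: only three of the H.265 intra modes (DC, horizontal, vertical) are evaluated, the boundary smoothing that the standard applies is omitted, SAD is assumed as the distortion cost, the block is assumed to have a row above and a column to its left, and missing neighbours further out are handled by clamping coordinates to the frame.

```python
def reference_pixels(frame, top, left, n):
    """Collect the 4n+1 reference pixels from the ORIGINAL frame (not reconstructed)."""
    h, w = len(frame), len(frame[0])
    corner = frame[top - 1][left - 1]
    above = [frame[top - 1][min(left + i, w - 1)] for i in range(2 * n)]
    left_col = [frame[min(top + i, h - 1)][left - 1] for i in range(2 * n)]
    return corner, above, left_col


def predict(mode, above, left_col, n):
    """Build an n x n prediction block for one of three simplified intra modes."""
    if mode == "DC":
        dc = (sum(above[:n]) + sum(left_col[:n]) + n) // (2 * n)
        return [[dc] * n for _ in range(n)]
    if mode == "HOR":
        return [[left_col[y]] * n for y in range(n)]
    if mode == "VER":
        return [list(above[:n]) for _ in range(n)]
    raise ValueError(f"unsupported mode: {mode}")


def coarse_intra(frame, top, left, n, keep=2):
    """Return the `keep` lowest-cost (cost, mode) pairs, sorted from small to large."""
    _, above, left_col = reference_pixels(frame, top, left, n)
    costs = []
    for mode in ("DC", "HOR", "VER"):
        pred = predict(mode, above, left_col, n)
        cost = sum(
            abs(frame[top + y][left + x] - pred[y][x])
            for y in range(n)
            for x in range(n)
        )
        costs.append((cost, mode))
    costs.sort()
    return costs[:keep]
```

Because the reference pixels come from the original frame, this stage does not have to wait for reconstruction of neighbouring blocks, which is the hardware motivation given in the text.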
  • the fine search module sets a fine search area in the reference frame for each PU according to the coarse search vector, and finds the fine search vector corresponding to the PU with the smallest cost in the fine search area.
  • the fine search step is performed in the reference frame 410.
  • Each current CTU contains multiple PUs, and the fine search is performed by selecting one of these PUs as the current PU in a certain order.
  • the current PU position 420 is determined first, and then a fine search area 430 is set in the reference frame for the PU according to the previously obtained coarse search vector (or called the restored motion vector 421). Then, a starting search position 431 corresponding to the current PU position 420 is determined in the fine search area 430 according to the restored motion vector 421.
  • taking the starting search position 431 as the center, the cost of a block of the same size as the current PU is calculated in turn at the starting search position 431 and at each pixel in the fine search area 430.
  • the minimum cost position 433 is found, and the motion vector between the current PU position 420 and the minimum cost position 433 is calculated and recorded as the fine search motion vector 423.
  • the device further includes a post-processing module 180, which is connected to the precise comparison module 140; the post-processing module 180 is executed in the post-processing pipeline step, and the post-processing pipeline step includes: generating a reconstructed frame corresponding to the current frame according to the least costly partition mode corresponding to each CTU block output by the precise comparison module and the corresponding reconstruction information.
  • the post-processing module 180 includes a deblocking filtering module 160 and a sample adaptive offset module 170;
  • the post-processing pipeline step includes a deblocking filtering pipeline step and a sample adaptive offset step;
  • the deblocking filtering pipeline step includes: performing reconstruction using the least costly partition mode provided by the precise comparison module and its corresponding reconstruction information, and filtering the reconstructed frame;
  • the sample adaptive offset pipeline step includes: performing SAO calculation on the reconstructed frame after the filtering process to obtain the final reconstructed frame for reference and display.
  • the deblocking filtering pipeline step and the sample adaptive offset pipeline step are executed one after the other in the post-processing pipeline stage.
  • the device further includes an entropy encoding module 150 connected to the precise comparison module 140.
  • the entropy encoding module 150 is executed in the entropy encoding pipeline step, and the entropy encoding pipeline step includes: generating, according to the least costly partition mode corresponding to each CTU block output by the precise comparison module 140 and the corresponding encoding information, the entropy coding information corresponding to the current frame, which is used to generate an H.265 code stream corresponding to the current frame.
  • the entropy coding pipeline step and the post-processing pipeline step are executed in parallel at the same pipeline stage.
  • the precise comparison module 140 generates the data required for entropy coding corresponding to the CTU according to the partition mode and prediction mode with the smallest CTU cost, that is, the coding information 141 shown in FIG. 1; the entropy coding module 150 is used to generate an encoded code stream 190 corresponding to the original video according to the data required for entropy encoding corresponding to the CTU.
  • the image encoding device 110 will also output the encoded video 180, and a certain image frame of the encoded video 180 is the reconstructed image frame 145.
  • the inventor also provides an H.265 encoding method, the method being applied to an H.265 encoding device; the device includes multiple modules and multiple pipeline steps, and each pipeline step includes at least one pipeline stage used to execute at least one module, where:
  • the multiple modules include a preprocessing module, a coarse selection module, an accurate comparison module, and an overall control module, and the overall control module is respectively connected to the preprocessing module, the rough selection module, and the precise comparison module;
  • the multiple pipeline steps include a preprocessing pipeline step, a coarse selection pipeline step, and a precise comparison pipeline step; the coarse selection pipeline step is executed after the preprocessing pipeline step, and the precise comparison pipeline step is executed after the coarse selection pipeline step;
  • the method includes the following steps:
  • step S101: in the preprocessing pipeline step, a current frame in an original video is divided into multiple CTU blocks by the preprocessing module;
  • step S102: in the coarse selection pipeline step, the coarse selection module divides each CTU block according to multiple division modes, performs coarse selection of inter prediction and coarse selection of intra prediction for each division mode of each CTU block, and generates prediction information corresponding to each division mode;
  • step S103: in the precise comparison pipeline step, the precise comparison module performs cost calculation and comparison on the prediction information corresponding to each division mode of each CTU block, selects the division mode with the smallest cost for each CTU block and the coding information corresponding to that division mode, and, according to the selected division mode and its corresponding coding information, generates the entropy coding information for generating the H.265 code stream from the current frame and the reconstruction information for generating the reconstructed frame from the current frame;
  • the overall control module is used to control the storage and retrieval of original frame data and reference frame data, and to control the preprocessing module, the coarse selection module, and the precise comparison module to sequentially execute the corresponding pipeline steps.
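As an illustration of the preprocessing step S101, the sketch below divides a frame into CTU blocks in raster order. The 64x64 CTU size matches the largest CU size handled by the first-level calculation module; the function name and the cropping of CTUs at the right and bottom frame edges are assumptions, since the text does not say how partial CTUs are treated.

```python
def split_into_ctus(width, height, ctu_size=64):
    """Return (top, left, ctu_height, ctu_width) for every CTU, in raster order."""
    return [
        (top, left, min(ctu_size, height - top), min(ctu_size, width - left))
        for top in range(0, height, ctu_size)
        for left in range(0, width, ctu_size)
    ]
```

For a 1920x1080 frame this gives 30 columns by 17 rows of CTUs, with the bottom row cropped because 1080 is not a multiple of 64.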
  • the coarse selection module includes: an inter prediction coarse selection module and an intra prediction coarse selection module;
  • the coarse selection pipeline includes: an inter prediction coarse selection pipeline and an intra prediction coarse selection pipeline ;
  • the method also includes:
  • the inter-frame prediction coarse selection pipeline uses the inter-frame prediction coarse selection module to divide each CTU block according to multiple division modes, where each division mode divides a CTU block into corresponding multiple CU blocks and divides each CU block into one or more corresponding PU blocks; inter-frame prediction is performed for each division mode of each CTU block and reference frame information is obtained, intra-frame prediction is performed for each division mode of each CTU block, and prediction information corresponding to each division mode is generated;
  • the intra-frame prediction coarse selection pipeline uses the intra-frame prediction coarse selection module to perform intra-frame prediction for each PU block in each division mode and calculate the corresponding cost, to select, according to the cost, one or more intra prediction directions with relatively low cost for each PU block, and to use the selected intra prediction directions as the prediction information corresponding to the division mode.
  • the coarse selection module further includes a coarse inter-frame selection module
  • the precise comparison module further includes a coarse intra-frame selection module
  • the coarse selection pipeline includes the inter-frame coarse selection pipeline
  • the precise comparison pipeline step includes the intra-frame coarse selection pipeline stage.
  • the method includes:
  • the inter-frame prediction coarse selection pipeline uses the inter-frame prediction coarse selection module to divide each CTU block according to multiple division modes, where each division mode divides a CTU block into corresponding multiple CU blocks and divides each CU block into one or more corresponding PU blocks; inter-frame prediction is performed for each division mode of each CTU block and reference frame information is obtained, intra-frame prediction is performed for each division mode of each CTU block, and prediction information corresponding to each division mode is generated;
  • the intra-frame prediction coarse selection pipeline uses the intra-frame prediction coarse selection module to perform intra-frame prediction for each PU block in each division mode and calculate the corresponding cost, to select, according to the cost, one or more intra prediction directions with relatively low cost for each PU block, and to use the selected intra prediction directions as the prediction information corresponding to the division mode.
  • the intra-frame coarse selection module can be a part of the coarse selection module or a part of the precise comparison module, thereby effectively broadening the application scenarios of the present invention.
  • the inter-frame prediction coarse selection module includes: a coarse search module, a reference frame data loading module, a fine search module, and a fractional pixel search module;
  • the rough selection pipeline includes: rough search pipeline, reference frame data loading pipeline, fine search pipeline and fractional pixel search pipeline;
  • the method includes:
  • in the coarse search pipeline stage, the coarse search module operates as follows: first, in step S201, the coarse search module selects a frame from the reference array and selects either its original frame or its reconstructed frame as the reference frame; then, in step S202, a down-sampling operation is performed on the reference frame and the current CTU block; then, in step S203, the pixel position with the least cost compared with the down-sampled CTU block is found in the down-sampled reference frame, and the coarse search vector of that pixel position relative to the current CTU block is calculated.
  • in the reference frame data loading pipeline stage, the coarse search vector of the coarse search pipeline is obtained through the overall control module, one or more predicted motion vectors with the same function as the coarse search vector are obtained according to the motion vectors around the CTU block, the reference frame data is loaded according to the coarse search vector and the one or more predicted motion vectors, and the data is passed to the fine search pipeline through the overall control module;
  • in the fine search pipeline stage, the fine search module operates as follows: first, in step S301, a fine search area is set in the reconstructed image of the reference frame for each PU block according to the coarse search vector; then, in step S302, a fine search vector corresponding to the PU block is generated in the fine search area; in addition, one or more predicted motion vectors with the same function as the coarse search vector are generated according to the motion vector information around the current CTU block, and a fine search vector is generated from each predicted motion vector; all the generated fine search vectors are sent to the fractional pixel search module;
  • in the fractional pixel search pipeline stage, the fractional pixel search module operates as follows: first, in step S401, the fractional pixel search module sets a corresponding fractional pixel search area in the reference frame for each PU block according to each received fine search vector; then, in step S402, a fractional pixel search vector corresponding to the PU block with the smallest cost is generated in the fractional pixel search area.
  • the intra-frame prediction coarse selection pipeline and the fractional pixel search pipeline are the same pipeline stage, and the intra-frame prediction coarse selection module and the fractional pixel search module are executed in parallel at the same pipeline stage.
  • the intra-frame prediction coarse selection pipeline and the fractional pixel search pipeline can be executed in parallel, that is, synchronously, or in sequence, that is, the intra-frame prediction coarse selection pipeline is executed first and the fractional pixel search pipeline is executed afterwards.
  • the intra-frame prediction coarse selection module includes a reference pixel generation module, which is executed at the intra-frame prediction coarse selection pipeline; the method includes:
  • the intra-frame prediction coarse selection pipeline includes: for each PU block in each division mode, generating reference pixels from the original pixels of the current frame; performing prediction in all intra prediction directions according to the rules of the H.265 protocol based on the reference pixels to obtain a prediction result for each direction; calculating the distortion cost between each prediction result and the original pixels; and sorting the costs from small to large to select one or more intra prediction directions with a small cost.
  • the multiple modules further include a post-processing module
  • the multiple pipeline steps further include a post-processing pipeline step
  • the method includes: in the post-processing pipeline step, the post-processing module generates a reconstructed frame corresponding to the current frame according to the least costly partition mode corresponding to each CTU block output by the precise comparison module and the corresponding reconstruction information.
  • the multiple modules further include an entropy encoding module
  • the multiple pipeline steps further include an entropy encoding pipeline step
  • the method includes: in the entropy encoding pipeline step, the entropy encoding module generates a binary code stream conforming to the H.265 protocol specification according to the least costly partition mode corresponding to each CTU block output by the precise comparison module and its corresponding entropy coding information.
  • the preprocessing module 120 belongs to the first-level pipeline and executes the preprocessing pipeline steps.
  • the coarse selection module performs rough selection pipeline steps.
  • the coarse selection module includes a coarse search module 211, a reference frame data loading module 910, a fine search module 213, and a fractional pixel search module 215.
  • the coarse selection pipeline includes a coarse search pipeline (the second pipeline stage), a reference frame data loading pipeline (the third pipeline stage), a fine search pipeline (the fourth pipeline stage), and a fractional pixel search pipeline (the fifth pipeline stage).
  • the intra-frame prediction coarse selection module and the fractional pixel search module are executed in parallel at the same pipeline stage (that is, both are executed in the fifth pipeline stage).
  • the precise comparison module 140 executes the precise comparison pipeline, which belongs to the sixth pipeline stage.
  • the entropy coding module 150 and the post-processing module are executed in the entropy coding pipeline stage and the post-processing pipeline stage respectively, and these two pipeline stages are executed in parallel in the seventh pipeline stage.
  • data transmission, scheduling, and control for pipeline stages one through seven are all handled by the overall control module 920, so that the coding process is carried out in an orderly manner, which greatly improves the coding efficiency.
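The seven pipeline stages can be summarised in a small scheduling sketch. The stage contents follow the text; the assumptions are that the unit flowing through the pipeline is a CTU, that every stage takes exactly one time slot, and the name `schedule`. It only shows how successive CTUs would overlap in an ideal pipeline, not how the overall control module is actually implemented.

```python
# Stage number -> modules executed in that stage, as listed in the text.
STAGES = [
    (1, ["preprocessing"]),
    (2, ["coarse search"]),
    (3, ["reference frame data loading"]),
    (4, ["fine search"]),
    (5, ["fractional pixel search", "intra-frame prediction coarse selection"]),
    (6, ["precise comparison"]),
    (7, ["entropy coding", "post-processing"]),
]


def schedule(num_ctus):
    """Return {time_slot: [(ctu_index, stage_number), ...]} for an ideal pipeline."""
    slots = {}
    for ctu in range(num_ctus):
        for stage, _modules in STAGES:
            # CTU `ctu` enters stage 1 at slot `ctu` and advances one stage per slot.
            slots.setdefault(ctu + stage - 1, []).append((ctu, stage))
    return slots
```

With this idealisation, N CTUs finish in N + 6 time slots instead of 7N, which is the kind of overlap the staged design is meant to provide.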
  • the present invention provides an H.265 encoding method and device.
  • the device includes multiple modules and multiple pipeline steps. Each pipeline step includes at least one pipeline stage for executing at least one module.
  • the multiple modules include a preprocessing module, a coarse selection module, a precise comparison module, and an overall control module; the multiple pipeline steps include a preprocessing pipeline step, a coarse selection pipeline step, and a precise comparison pipeline step.
  • the coarse selection pipeline step is executed after the preprocessing pipeline step, and the precise comparison pipeline step is executed after the coarse selection pipeline step.
  • the overall control module is used to control the storage and retrieval of original frame data and reference frame data, and to control the preprocessing module, the coarse selection module, and the precise comparison module to sequentially execute the corresponding pipeline steps.
  • the invention improves the search accuracy through the distributed search mode, while better retaining the details of the reconstructed image, and reduces the hardware resource consumption.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Physics & Mathematics (AREA)
  • Discrete Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

An H.265 encoding method and apparatus, the apparatus comprising the following modules: a pre-processing module (120), a rough selection module (130) and a precise comparison module (140). The pre-processing module (120) is used to segment a current frame in an original video (100) into a plurality of CTU blocks; the rough selection module (130) is used to divide each CTU block according to a plurality of partition modes, and to segment each CU block therein into one or more corresponding PU blocks; the rough selection module (130) is also used to perform inter-frame prediction and intra-frame prediction for each partition mode of each CTU block, and to generate one or more items of prediction information corresponding to each partition mode; the precise comparison module (140) is used to perform cost comparison on the prediction information corresponding to each partition mode of each CTU block, and to generate entropy encoding information used for generating an H.265 code stream from the current frame and reconstruction information for generating a reconstructed frame from the current frame. By means of a distributed search, searching accuracy is improved and hardware resource consumption is reduced.

Description

一种H.265编码方法和装置A H.265 encoding method and device 技术领域Technical field
本发明涉及H.265编码领域,尤其涉及一种H.265编码方法和装置。The present invention relates to the field of H.265 coding, in particular to a H.265 coding method and device.
背景技术Background technique
H.265是ITU-T VCEG继H.264之后所制定的新的视频编码标准。H.265标准围绕着现有的视频编码标准H.264,保留原来的某些技术,同时对一些相关的技术加以改进。新加的技术用以改善码流、编码质量、延时和算法复杂度之间的关系,达到最优化设置。具体的研究内容包括:提高压缩效率、提高鲁棒性和错误恢复能力、减少实时的时延、减少信道获取时间和随机接入时延、降低复杂度等。目前,现有的H.265算法普遍存在着硬件资源消耗大、编码效率低的问题。H.265 is a new video coding standard developed by ITU-T VCEG after H.264. The H.265 standard revolves around the existing video coding standard H.264, retaining some of the original technologies, while improving some related technologies. The newly added technology is used to improve the relationship between code stream, coding quality, delay and algorithm complexity to achieve optimal settings. Specific research contents include: improving compression efficiency, improving robustness and error recovery capabilities, reducing real-time delay, reducing channel acquisition time and random access delay, and reducing complexity. At present, the existing H.265 algorithm generally has the problems of large hardware resource consumption and low coding efficiency.
发明内容Summary of the invention
为此,需要提供一种H.265编码的技术方案,用以降低H.265算法的硬件资源消耗。For this reason, it is necessary to provide a technical solution for H.265 encoding to reduce the hardware resource consumption of the H.265 algorithm.
为实现上述目的,发明人提供了一种H.265编码装置,包括如下模块:预处理模块、粗选择模块和精确比较模块,所述预处理模块与所述粗选择模块连接,所述粗选择模块与所述精确比较模块连接;其中:In order to achieve the above objective, the inventor provides an H.265 encoding device, which includes the following modules: a preprocessing module, a coarse selection module, and an accurate comparison module. The preprocessing module is connected to the coarse selection module. The module is connected to the precise comparison module; where:
所述预处理模块用于将一个原始视频中的一个当前帧分割为多个CTU块;The preprocessing module is used to divide a current frame in an original video into multiple CTU blocks;
所述粗选择模块用于按照多个划分模式来划分每个CTU块,每个划分模式将一个CTU块分割为对应的多个CU块,以及将其中的每个CU块分割为对应的一个或多个PU块;所述粗选择模块还用于对每个CTU块的每个划分模式进行帧间预测和帧内预测,并生成一个与每个划分模式相对应的预测信息;The coarse selection module is used to divide each CTU block according to multiple division modes, each division mode divides one CTU block into corresponding multiple CU blocks, and divides each CU block into corresponding one or Multiple PU blocks; the coarse selection module is also used to perform inter-frame prediction and intra-frame prediction on each division mode of each CTU block, and generate prediction information corresponding to each division mode;
所述精确比较模块用于对与每个CTU块的各个划分模式相对应的预测信息进行代价比较,选择出对于每个CTU块代价最小的一个划分模式和与该划分模式对应的编码信息,并根据选择出的划分模式和其对应的编码信息,生成用于将当前帧生成H.265码流的熵编码信息和将当前帧生成重构帧的重构信息。The precise comparison module is used to compare the cost of prediction information corresponding to each partition mode of each CTU block, select the partition mode with the smallest cost for each CTU block and the coding information corresponding to the partition mode, and According to the selected division mode and its corresponding coding information, the entropy coding information used to generate the H.265 bitstream from the current frame and the reconstruction information for generating the reconstructed frame from the current frame are generated.
The inventors also provide an H.265 encoding method applied to an H.265 encoding device, the device comprising the following modules: a preprocessing module, a coarse selection module, and a precise comparison module, wherein the preprocessing module is connected to the coarse selection module and the coarse selection module is connected to the precise comparison module; the method comprises the following steps:
the preprocessing module divides a current frame of an original video into a plurality of CTU blocks;
the coarse selection module partitions each CTU block according to a plurality of partition modes, each partition mode dividing a CTU block into a corresponding plurality of CU blocks and dividing each of those CU blocks into one or more corresponding PU blocks; it also performs inter-frame prediction and intra-frame prediction for each partition mode of each CTU block and generates prediction information corresponding to each partition mode;
the precise comparison module compares the costs of the prediction information corresponding to the partition modes of each CTU block, selects for each CTU block the partition mode with the lowest cost together with the coding information corresponding to that partition mode, and generates, from the selected partition mode and its corresponding coding information, entropy coding information used to produce an H.265 bitstream from the current frame and reconstruction information used to produce a reconstructed frame from the current frame.
The inventors also provide an H.265 encoding device comprising a plurality of modules and a plurality of pipeline steps, each pipeline step comprising at least one pipeline stage and being used to execute at least one module, wherein:
the plurality of modules comprises a preprocessing module, a coarse selection module, a precise comparison module, and an overall control module, the overall control module being connected to each of the preprocessing module, the coarse selection module, and the precise comparison module;
the plurality of pipeline steps comprises a preprocessing pipeline step, a coarse selection pipeline step, and a precise comparison pipeline step, the coarse selection pipeline step being executed after the preprocessing pipeline step and the precise comparison pipeline step being executed after the coarse selection pipeline step;
the preprocessing pipeline step uses the preprocessing module to divide a current frame of an original video into a plurality of CTU blocks;
the coarse selection pipeline step uses the coarse selection module to partition each CTU block according to a plurality of partition modes, to perform inter-frame prediction coarse selection and intra-frame prediction coarse selection for each partition mode of each CTU block, and to generate prediction information corresponding to each partition mode;
the precise comparison pipeline step uses the precise comparison module to calculate and compare the costs of the prediction information corresponding to the partition modes of each CTU block, to select for each CTU block the partition mode with the lowest cost together with the coding information corresponding to that partition mode, and to generate, from the selected partition mode and its corresponding coding information, entropy coding information used to produce an H.265 bitstream from the current frame and reconstruction information used to produce a reconstructed frame from the current frame;
the overall control module is configured to control the storing and fetching of original frame data and reference frame data, and to control the preprocessing module, the coarse selection module, and the precise comparison module so that they execute their corresponding pipeline steps in sequence.
The inventors also provide an H.265 encoding method applied to an H.265 encoding device, the device comprising a plurality of modules and a plurality of pipeline steps, each pipeline step comprising at least one pipeline stage and being used to execute at least one module, wherein:
the plurality of modules comprises a preprocessing module, a coarse selection module, a precise comparison module, and an overall control module, the overall control module being connected to each of the preprocessing module, the coarse selection module, and the precise comparison module;
the plurality of pipeline steps comprises a preprocessing pipeline step, a coarse selection pipeline step, and a precise comparison pipeline step, the coarse selection pipeline step being executed after the preprocessing pipeline step and the precise comparison pipeline step being executed after the coarse selection pipeline step;
the method comprises the following steps:
the preprocessing pipeline step uses the preprocessing module to divide a current frame of an original video into a plurality of CTU blocks;
the coarse selection pipeline step uses the coarse selection module to partition each CTU block according to a plurality of partition modes, to perform inter-frame prediction coarse selection and intra-frame prediction coarse selection for each partition mode of each CTU block, and to generate prediction information corresponding to each partition mode;
the precise comparison pipeline step uses the precise comparison module to calculate and compare the costs of the prediction information corresponding to the partition modes of each CTU block, to select for each CTU block the partition mode with the lowest cost together with the coding information corresponding to that partition mode, and to generate, from the selected partition mode and its corresponding coding information, entropy coding information used to produce an H.265 bitstream from the current frame and reconstruction information used to produce a reconstructed frame from the current frame;
the overall control module controls the storing and fetching of original frame data and reference frame data, and controls the preprocessing module, the coarse selection module, and the precise comparison module so that they execute their corresponding pipeline steps in sequence.
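As a rough illustration of how an overall control module might sequence the three pipeline steps, the following Python sketch assumes that successive CTU blocks overlap in the pipeline, with one CTU per step per time slot. The overlap, the step names, and the function `pipeline_schedule` are assumptions made for illustration; the text above only states that the steps are executed in order.

```python
# Hypothetical step names; the source only fixes their order.
STEPS = ["preprocess", "coarse_select", "precise_compare"]


def pipeline_schedule(num_ctus, steps=STEPS):
    """Return, for each time slot, which CTU index each pipeline step works on.

    Assumption: step s handles CTU (t - s) at time slot t, so a CTU moves
    one step forward per slot and later CTUs follow right behind it.
    """
    timeline = []
    for t in range(num_ctus + len(steps) - 1):
        slot = {}
        for s, name in enumerate(steps):
            ctu = t - s
            if 0 <= ctu < num_ctus:
                slot[name] = ctu
        timeline.append(slot)
    return timeline


if __name__ == "__main__":
    for t, slot in enumerate(pipeline_schedule(3)):
        print(t, slot)
```

Under this assumption every CTU still passes through preprocessing, coarse selection, and precise comparison in that order, which is the only ordering constraint the text imposes.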
Description of the drawings
FIG. 1 is a schematic diagram of an H.265 encoding device according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of the coarse selection module of an H.265 encoding device according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of the coarse search process of an H.265 encoding device according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of the fine search process of an H.265 encoding device according to an embodiment of the present invention;
FIG. 5 is a schematic diagram of the fractional pixel search of an H.265 encoding device according to an embodiment of the present invention;
FIG. 6-A is a schematic diagram of search prediction performed by an H.265 encoding device according to an embodiment of the present invention;
FIG. 6-B is a schematic diagram of search prediction performed by an H.265 encoding device according to another embodiment of the present invention;
FIG. 7 is a schematic diagram of the precise comparison module of an H.265 encoding device according to an embodiment of the present invention;
FIG. 8 is a schematic diagram of the hierarchical comparison module of an H.265 encoding device according to an embodiment of the present invention;
FIG. 9 is a flowchart of an H.265 encoding method according to an embodiment of the present invention;
FIG. 10 is a flowchart of a coarse search method for H.265 encoding according to an embodiment of the present invention;
FIG. 11 is a flowchart of a fine search method for H.265 encoding according to an embodiment of the present invention;
FIG. 12 is a flowchart of a fractional pixel search method for H.265 encoding according to an embodiment of the present invention;
FIG. 13 is a schematic diagram of the motion vector information around a current CTU block according to an embodiment of the present invention;
FIG. 14 is a flowchart of an H.265 encoding method according to another embodiment of the present invention;
FIG. 15 is a schematic diagram of an H.265 encoding device according to another embodiment of the present invention.
Reference signs:
100, original video;
101, original image frame;
102, current frame;
110, image encoding device; 120, preprocessing module; 130, coarse selection module; 140, precise comparison module; 150, entropy encoding module; 160, deblocking filter module; 170, sample adaptive offset module; 180, post-processing module;
121, current CTU; 141, coding information; 180, encoded video; 190, bitstream; 145, reconstructed frame image;
230, inter-frame prediction coarse selection module; 211, coarse search module; 213, fine search module; 215, fractional pixel search module;
330, intra-frame prediction coarse selection module; 231, reference pixel generation module;
310, reference frame; 311, downsampling; 320, downsampled image; 351, motion vector; 352, minimum-cost pixel block; 330, current CTU; 340, downsampled CTU;
410, reference frame; 420, current PU position; 421, restored motion vector; 423, fine search motion vector; 430, fine search area; 431, starting search position; 433, minimum-cost position;
510, reference frame; 520, current PU position; 521, fine search motion vector; 523, fractional pixel search motion vector; 530, fractional pixel search area; 531, starting search position; 533, minimum-cost position;
711, distribution module; 721, first-level calculation Level_calc0; 722, second-level calculation Level_calc1;
723, third-level calculation Level_calc2; 724, fourth-level calculation Level_calc3;
740, hierarchical comparison module;
810, single-level calculation module; 820, inter-mode cost calculation module; 830, intra-mode cost calculation module; 840, optimal selection module;
910, reference frame data loading module; 920, overall control module.
Detailed description
To explain the technical content, structural features, objectives, and effects of the technical solution in detail, a description is given below with reference to specific embodiments and the accompanying drawings.
Please refer to FIG. 1, a schematic diagram of an H.265 encoding device according to an embodiment of the present invention. The device is an image encoding device 110. It may be a chip with an image encoding function, or an electronic device containing such a chip, for example a smart mobile device such as a mobile phone, tablet computer, or personal digital assistant, or another electronic device such as a personal computer or an industrial computer. The device comprises the following modules: a preprocessing module 120, a coarse selection module 130, and a precise comparison module 140. The preprocessing module 120 is connected to the coarse selection module 130, and the coarse selection module 130 is connected to the precise comparison module 140, wherein:
The preprocessing module 120 is configured to divide a current frame 102 of an original video 100 into a plurality of CTU blocks (Coding Tree Units). A CTU is a sub-block of the current frame image, and its size may be any one of 16x16, 32x32, or 64x64. Specifically, the preprocessing module may obtain the original image frames 101 of the original video 100 and select a current frame 102 from among the original image frames 101.
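The division of a frame into CTU blocks can be sketched in Python as follows. The function name `split_into_ctus` and the clipping of edge blocks to the frame boundary are assumptions for illustration; the text does not say how frames whose dimensions are not multiples of the CTU size are handled.

```python
def split_into_ctus(width, height, ctu_size=64):
    """Return (x, y, w, h) rectangles that tile the frame in raster order."""
    ctus = []
    for y in range(0, height, ctu_size):
        for x in range(0, width, ctu_size):
            # Assumption: blocks on the right and bottom edges are clipped
            # to the frame instead of being padded.
            w = min(ctu_size, width - x)
            h = min(ctu_size, height - y)
            ctus.append((x, y, w, h))
    return ctus


if __name__ == "__main__":
    blocks = split_into_ctus(1920, 1080, 64)
    print(len(blocks), blocks[0], blocks[-1])
```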
The coarse selection module 130 is configured to partition each CTU block according to a plurality of partition modes, each partition mode dividing a CTU block into a corresponding plurality of CU blocks (Coding Units) and dividing each of those CU blocks into one or more corresponding PU blocks (Prediction Units). The coarse selection module 130 is further configured to perform inter-frame prediction and intra-frame prediction for each partition mode of each CTU block and to generate prediction information corresponding to each partition mode. The partition modes are chosen according to actual needs. For example, a current CTU 121 of size 64x64 can be divided into four 32x32 sub-blocks, and each 32x32 sub-block can in turn be divided into four 16x16 sub-blocks.
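The 64x64 to 32x32 to 16x16 example above is a quadtree split. A minimal Python sketch, assuming every block at a given depth is split uniformly (real partition modes may mix depths within one CTU), with `quad_split` and `cus_at_depth` as hypothetical helper names:

```python
def quad_split(x, y, size):
    """Split one square block into its four equal quadrants."""
    half = size // 2
    return [(x, y, half), (x + half, y, half),
            (x, y + half, half), (x + half, y + half, half)]


def cus_at_depth(ctu_size, depth):
    """Return the (x, y, size) blocks obtained by splitting a CTU `depth` times."""
    blocks = [(0, 0, ctu_size)]
    for _ in range(depth):
        blocks = [child for b in blocks for child in quad_split(*b)]
    return blocks


if __name__ == "__main__":
    for depth in range(3):
        blocks = cus_at_depth(64, depth)
        print(depth, len(blocks), blocks[0][2])
```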
The precise comparison module 140 is configured to compare the costs of the prediction information corresponding to the partition modes of each CTU block, to select for each CTU block the partition mode with the lowest cost together with the coding information corresponding to that partition mode, and to generate, from the selected partition mode and its corresponding coding information, entropy coding information used to produce an H.265 bitstream from the current frame and reconstruction information used to produce a reconstructed frame from the current frame. Distributing the search across these stages in this way improves search accuracy, better preserves the details of the reconstructed image, and reduces hardware resource consumption.
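The selection step performed by the precise comparison module amounts to taking the minimum-cost candidate. A minimal Python sketch, in which the cost values, the mode labels, and the shape of the coding information are all made up for illustration:

```python
def select_partition_mode(predictions):
    """Pick the lowest-cost partition mode for one CTU.

    `predictions` maps a partition-mode label to (cost, coding_info).
    How the cost itself is computed is not modelled here.
    """
    mode = min(predictions, key=lambda m: predictions[m][0])
    _, coding_info = predictions[mode]
    return mode, coding_info


if __name__ == "__main__":
    candidates = {
        "1x64x64": (510, {"pred": "inter"}),
        "4x32x32": (430, {"pred": "intra"}),
        "16x16x16": (470, {"pred": "inter"}),
    }
    print(select_partition_mode(candidates))
```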
In some embodiments, the device further comprises an entropy encoding module 150 connected to the precise comparison module 140. The entropy encoding module 150 is configured to generate the H.265 bitstream corresponding to the current frame from the entropy coding information corresponding to the current frame, which is itself generated from the lowest-cost partition mode of each CTU block and its corresponding coding information. Specifically, the precise comparison module 140 generates, from the lowest-cost partition mode and prediction mode of a CTU, the data required for entropy coding of that CTU, i.e., the coding information 141 shown in FIG. 1, and the entropy encoding module 150 uses that data to generate the encoded bitstream 190 corresponding to the original video. The image encoding device 110 also outputs an encoded video 180; an image frame of the encoded video 180 is a reconstructed image frame 145.
In some embodiments, the device comprises a post-processing module connected to the precise comparison module. The post-processing module is configured to generate the reconstructed frame corresponding to the current frame from the reconstruction information corresponding to the current frame, which is itself generated from the lowest-cost partition mode of each CTU block and its corresponding coding information.
Preferably, the post-processing module comprises a deblocking filter module 160 and a sample adaptive offset module 170, which are connected to each other. The deblocking filter module 160 is configured to filter the reconstructed frame using the lowest-cost partition mode and its corresponding coding information provided by the precise comparison module. The sample adaptive offset module 170 is configured to perform the SAO (sample adaptive offset) calculation on the filtered reconstructed frame and to transmit the calculated data to the entropy encoding module 150.
As shown in FIG. 2, the coarse selection module 130 comprises an inter-frame prediction coarse selection module 230 and an intra-frame prediction coarse selection module 330. The inter-frame prediction coarse selection module 230 is connected to both the preprocessing module 120 and the precise comparison module 140, and the intra-frame prediction coarse selection module 330 is likewise connected to both the preprocessing module 120 and the precise comparison module 140, wherein:
The inter-frame prediction coarse selection module 230 is configured to perform inter-frame prediction on each PU block in each partition mode, to select for each PU block one or more pieces of reference information obtained from the reference frame whose cost is below a preset cost value, and to use the motion vectors of the selected reference PU blocks as the prediction information corresponding to that partition mode. Each PU block has its own motion vector, which is used to obtain prediction information from the reconstructed reference frame: starting from the position of the current PU block, the prediction information is fetched at the offset given by the PU block's motion vector.
The intra-frame prediction coarse selection module 330 is configured to perform intra-frame prediction on each PU block in each partition mode, to select for each PU block one or more intra-frame prediction directions whose cost is below a preset cost value, and to use the selected intra-frame prediction directions as the prediction information corresponding to that partition mode.
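Both coarse selection modules keep only the candidates whose cost falls below a preset value. A minimal Python sketch of that filtering step, with `coarse_select`, the candidate labels, and the cost values as illustrative assumptions:

```python
def coarse_select(costs, max_cost):
    """Return the candidates whose cost is below `max_cost`, cheapest first.

    `costs` maps a candidate label (a motion vector or an intra direction)
    to its cost; the cost metric itself is not modelled here.
    """
    kept = [(cost, name) for name, cost in costs.items() if cost < max_cost]
    return [name for cost, name in sorted(kept)]


if __name__ == "__main__":
    intra_costs = {"planar": 120, "dc": 95, "angular_10": 300, "angular_26": 110}
    print(coarse_select(intra_costs, 150))
```

The surviving candidates are what would be passed on as prediction information for the later, more exact comparison.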
In some embodiments, the inter-frame prediction coarse selection module 230 further comprises a coarse search module 211, a fine search module 213, and a fractional pixel search module 215. The coarse search module 211 is connected to the preprocessing module 120 and to the fine search module 213, and the fine search module 213 is connected to the fractional pixel search module 215.
The coarse search module is configured to select a frame from the reference list, to choose either its original frame or its reconstructed frame as the reference frame, to downsample both the reference frame and the current CTU block, to find in the downsampled reference frame the pixel position with the lowest cost relative to the downsampled CTU block, and to compute the coarse search vector of that pixel position relative to the current CTU block.
The reference list is the list in which reference frames are stored. The current frame may have multiple reference frames, all of which are indexed through the reference list. Each reference frame comprises a reconstructed frame and an original frame. Because both the reference frame and the current CTU block are obtained by downsampling, the coarse search vector computed by the coarse search module is likewise a vector in the downsampled domain; that is, relative to the coarse search vector corresponding to the current CTU block, it is scaled by the downsampling ratio (for example, 1/4), and the coarse search vector scaled by the corresponding ratio is passed to the next processing module.
As shown in FIG. 3, the coarse search module selects either the original frame or the reconstructed frame as the reference frame, downsamples the reference frame and the current CTU separately, and then finds in the downsampled reference frame the pixel position with the lowest cost relative to the downsampled CTU, along with the coarse search vector. Preferably, in this embodiment the reference frame and the current CTU are downsampled by the same ratio. For example, if the downsampled image 320 is obtained from the reference frame 310 by downsampling 311, which scales the width and height of the reference frame to 1/4 each, then the downsampled CTU is obtained from the current CTU 330 by downsampling 331, which scales the width and height of the current CTU 330 to 1/4 each. Prediction is then performed in the downsampled image (sub-block A in FIG. 3) using the downsampled CTU 340 (sub-block B in FIG. 3) as the unit: the cost between the downsampled CTU 340 and each corresponding sub-block of the downsampled image 320 (a sub-block of the same size as sub-block B, centered on each pixel of sub-block A) is computed in turn, and the pixel block with the lowest cost relative to the downsampled CTU is found and recorded as the minimum-cost pixel block 352 (sub-block C in FIG. 3). The center pixel position of this minimum-cost pixel block and the coarse search vector are recorded. The coarse search vector is the vector displacement between the center pixel of the downsampled CTU 340 (sub-block B in FIG. 3) and the center pixel of the minimum-cost pixel block 352 (sub-block C in FIG. 3), i.e., the motion vector 351 in FIG. 3.
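The coarse search described above can be sketched in Python as a downsampling step followed by an exhaustive block match. The sketch makes several assumptions that the text does not fix: downsampling is done by block averaging, the cost is the sum of absolute differences (SAD), and displacements are measured between top-left corners, which gives the same vector as measuring between block centers because both blocks have the same size.

```python
def downsample(img, factor=4):
    """Shrink a 2-D list of pixels by averaging factor x factor blocks (assumed method)."""
    h, w = len(img) // factor, len(img[0]) // factor
    area = factor * factor
    return [[sum(img[y * factor + dy][x * factor + dx]
                 for dy in range(factor) for dx in range(factor)) // area
             for x in range(w)] for y in range(h)]


def sad(ref, blk, ox, oy):
    """Sum of absolute differences between blk and the ref window at (ox, oy)."""
    return sum(abs(ref[oy + y][ox + x] - blk[y][x])
               for y in range(len(blk)) for x in range(len(blk[0])))


def coarse_search(ref_small, ctu_small, ctu_x, ctu_y):
    """Exhaustively search the downsampled reference frame.

    (ctu_x, ctu_y) is the CTU position in downsampled coordinates.
    Returns (mv_x, mv_y, cost), with the vector in downsampled units.
    """
    bh, bw = len(ctu_small), len(ctu_small[0])
    best = None
    for oy in range(len(ref_small) - bh + 1):
        for ox in range(len(ref_small[0]) - bw + 1):
            cost = sad(ref_small, ctu_small, ox, oy)
            if best is None or cost < best[2]:
                best = (ox - ctu_x, oy - ctu_y, cost)
    return best
```

A real encoder would restrict the search to a window around the CTU rather than scanning the whole downsampled frame; the full scan is used here only to keep the sketch short.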
In some embodiments, the intra-frame prediction coarse selection module 330 further comprises a reference pixel generation module 231. The reference pixel generation module 231 is configured, for each PU block in each partition mode, to generate reference pixels from the original pixels of the current frame, to predict all intra-frame prediction directions from the reference pixels according to the rules of the H.265 standard so as to obtain a prediction result for each direction, to compute a distortion cost between each direction's prediction result and the original pixels, and to sort the costs in ascending order and select the one or more intra-frame prediction directions with the lowest cost. The coarse selection method of the intra-frame prediction coarse selection module is similar to that of the inter-frame prediction coarse selection module and is not repeated here. The difference between the two is that for intra-frame prediction the original frame is downsampled to obtain the downsampled image and the downsampled CTU is predicted within that image, whereas for inter-frame prediction the reference frame is downsampled to obtain the downsampled image and the downsampled CTU is predicted within that image.
As shown in FIG. 6-A and FIG. 6-B, under the H.265 standard the reference pixels should be reconstructed pixels. In a hardware implementation, however, only the original pixels are available at the current point in time and the reconstructed pixels often are not, so the present invention uses original pixels in place of reconstructed pixels. Taking a 4x4 PU sub-block as an example, the black-filled dots in the figure are the edge pixels. According to the H.265 standard, a 4x4 block (the shaded dots in FIG. 6-B) has 17 boundary pixels in total. The black-filled pixels (i.e., the edge pixels) should be filled with reconstructed pixels, but because reconstructed pixels are not available at the current point in time, original pixels are used instead. The shaded part is the 4x4 PU block. Once the boundary pixels have been filled, prediction is performed according to the standard to obtain the shaded 4x4 block.
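The count of 17 boundary pixels for a 4x4 block follows from the H.265 reference sample layout: 2N pixels above and above-right, 2N pixels to the left and below-left, and one corner pixel, giving 4N + 1. The Python sketch below gathers these pixels from the original frame, as the text proposes, and then applies a simplified DC prediction as one example of a prediction mode. The helper names are hypothetical, the DC edge smoothing of the standard is omitted, and unavailable neighbours are simply returned as `None` rather than substituted.

```python
def reference_pixels(frame, x0, y0, n):
    """Collect the 4*n + 1 neighbours of an n x n block from the ORIGINAL frame."""
    def px(x, y):
        inside = 0 <= y < len(frame) and 0 <= x < len(frame[0])
        return frame[y][x] if inside else None  # None marks an unavailable pixel

    corner = px(x0 - 1, y0 - 1)
    top = [px(x0 + i, y0 - 1) for i in range(2 * n)]    # above and above-right
    left = [px(x0 - 1, y0 + j) for j in range(2 * n)]   # left and below-left
    return corner, top, left


def dc_predict(top, left, n):
    """Simplified DC mode: rounded mean of the n top and n left neighbours.

    Assumes n is a power of two and that those 2*n neighbours are available.
    """
    shift = n.bit_length()  # equals log2(n) + 1 when n is a power of two
    dc = (sum(top[:n]) + sum(left[:n]) + n) >> shift
    return [[dc] * n for _ in range(n)]
```

A coarse selection stage would compute such a prediction for every candidate direction and compare each against the original block to obtain the distortion costs.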
As shown in FIG. 4, the fine search module uses the coarse search vector to set a fine search area in the reference frame for each PU and finds, within that area, the fine search vector with the lowest cost for that PU. The fine search is performed within the reference frame 410. Each current CTU contains multiple PUs, and the fine search takes these PUs one at a time, in some order, as the current PU. Specifically, the current PU position 420 is first determined, and a fine search area 430 is then set in the reference frame for that PU according to the previously obtained coarse search vector (also called the restored motion vector 421). A starting search position 431 corresponding to the current PU position 420 is then determined within the fine search area 430 according to the restored motion vector 421. As in the coarse search, the search within the fine search area 430 starts from the pixel at the starting search position 431 and computes in turn the cost of each sub-block that is centered on a pixel of the fine search area 430 and has the same size as the current PU. The minimum-cost position 433 is found, and the motion vector between the current PU position 420 and the minimum-cost position 433 is computed and recorded as the fine search motion vector 423.
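A minimal Python sketch of the fine search follows. It assumes that the restored motion vector is the coarse vector scaled back to full resolution by the downsampling factor, that the fine search area is a square window of a hypothetical `radius` around the starting position, and that the cost is SAD; none of these details is fixed by the text.

```python
def restore_mv(coarse_mv, factor=4):
    """Scale a downsampled-domain vector back to full resolution (assumed meaning of 'restored')."""
    return (coarse_mv[0] * factor, coarse_mv[1] * factor)


def fine_search(ref, pu, pu_x, pu_y, start_mv, radius=2):
    """Search a window around the position that start_mv points to.

    Returns (mv_x, mv_y, cost), where the vector runs from the current PU
    position to the minimum-cost position, in full-pixel units.
    """
    bh, bw = len(pu), len(pu[0])
    sx, sy = pu_x + start_mv[0], pu_y + start_mv[1]  # starting search position
    best = None
    # Clamp the window so every candidate block lies inside the frame.
    for oy in range(max(0, sy - radius), min(len(ref) - bh, sy + radius) + 1):
        for ox in range(max(0, sx - radius), min(len(ref[0]) - bw, sx + radius) + 1):
            cost = sum(abs(ref[oy + y][ox + x] - pu[y][x])
                       for y in range(bh) for x in range(bw))
            if best is None or cost < best[2]:
                best = (ox - pu_x, oy - pu_y, cost)
    return best
```

Because the window is small, the fine search evaluates far fewer positions than a full-resolution exhaustive search would.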
在某些实施例中,所述精搜索模块用于根据粗搜索矢量,针对每个PU块在参考帧的的重构图像中设定一个精搜索区域,并在该精搜索区域中生成一个该PU块对应的代价最小的一个精搜索矢量;以及用于根据当前CTU块周围的运动矢量信息,生成与粗搜索矢量具有同样功能的一个或多个预测运动矢量,并根据预测运动矢量生成精搜索矢量;并将生成的所有精搜索矢量发送给分数像素搜索模块。In some embodiments, the fine search module is configured to set a fine search area in the reconstructed image of the reference frame for each PU block according to the coarse search vector, and generate a fine search area in the fine search area. A fine search vector with the lowest cost corresponding to the PU block; and used to generate one or more predicted motion vectors with the same function as the coarse search vector according to the motion vector information around the current CTU block, and generate a fine search based on the predicted motion vector Vector; and send all the generated fine search vectors to the fractional pixel search module.
如图13,对于一个64x64大小的当前CTU块,在位于上边的10个8x8大小的子块(图13中用1-10来标注的子块),左上边相邻的CTU块和右上边相邻的CTU块中,分别有其对应的一个粗搜索结果以及对应的运动矢量信息。此外,当前CTU块内部有16个协助运动矢量,因此最多有28个mv作为邻接mv(即当前CTU块周围的运动矢量信息)。这28个运动矢量信息会经过一定的筛选,筛选出预设数量个(如3个)的邻接mv传输给精搜索模块,从而确定同样预设数量个的精搜索运动矢量。在本实施方式中,同样功能是指筛选出的预设数量个的邻接mv与粗搜索模块得到的搜索结果的作用是一致的,即均会输入给精搜索模块的接口进行下一步处理。As shown in Figure 13, for a 64x64 current CTU block, in the upper 10 8x8 size sub-blocks (sub-blocks marked with 1-10 in Figure 13), the adjacent CTU block on the upper left side is the same as the upper right side. In adjacent CTU blocks, there is a corresponding rough search result and corresponding motion vector information. In addition, there are 16 assisted motion vectors in the current CTU block, so there are at most 28 mvs as adjacent mvs (that is, the motion vector information around the current CTU block). The 28 motion vector information will undergo a certain screening, and a preset number (such as 3) of adjacent mvs will be screened out and transmitted to the fine search module to determine the same preset number of fine search motion vectors. In this embodiment, the same function means that the filtered preset number of adjacent mvs are the same as the search results obtained by the coarse search module, that is, they will be input to the interface of the fine search module for further processing.
In this embodiment, the coarse search module inputs one motion vector to the fine search module, and several mvs selected from the neighboring mvs are also input to the fine search module. Assuming that a total of N mvs are input, the fine search module produces N fine search rmvs (fine search vectors) and passes all N of them to the FME (the fractional pixel search module). The FME compares costs across these N fine search vectors to obtain one optimal fme_mv (the fractional pixel search vector), which is finally input to the precise comparison module.
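The candidate flow described above (one coarse vector plus a few screened neighboring vectors in, N fine search vectors out, one best vector kept) can be sketched as follows. This is a minimal illustration only: the function name, the `refine` and `cost` callbacks, and the screening rule (drop duplicates, keep the first few) are assumptions, since the text does not specify how the 28 neighboring mvs are screened.

```python
def select_fme_vector(coarse_mv, neighbour_mvs, refine, cost, max_neighbours=3):
    """Feed the coarse vector plus a few neighbouring vectors through
    refinement, then keep the refined vector with the lowest cost.

    `refine` and `cost` are hypothetical callbacks standing in for the
    fine search and the cost function; they are not named in the source.
    """
    # Assumed screening rule: drop duplicates, keep order, cap the count.
    seen, candidates = set(), []
    for mv in [coarse_mv] + list(neighbour_mvs):
        if mv not in seen:
            seen.add(mv)
            candidates.append(mv)
    candidates = candidates[:1 + max_neighbours]

    refined = [refine(mv) for mv in candidates]  # N inputs -> N fine search vectors
    return min(refined, key=cost)                # a single best vector is kept
```

In hardware the N refinements would typically run in parallel or in a fixed time slot; the sequential loop here only shows the data flow.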
As shown in FIG. 5, to further improve search accuracy, the fractional pixel search module 215 is configured to set, according to each received fine search vector, a corresponding fractional pixel search area 530 in the reference frame for each PU block, and to generate within the fractional pixel search area 530 the fractional pixel search vector 423 with the lowest cost for that PU block. Specifically, the fractional pixel search area 530 may be determined as follows: according to the current PU position 520 and the previously obtained fine search motion vector, the starting search position 531 corresponding to the current PU position 520 is determined in the reference frame 510; taking the pixel at the starting search position as the center, the area is extended by K pixels in each of the four directions (up, down, left, and right; the value of K can be set as needed), and the resulting square area with a side length of 2K is the fractional pixel search area 530. As in the fine search, the cost is computed in turn for each sub-block of the same size as the current PU that is centered on the starting search position 531 or on a pixel in the fractional pixel search area 530. The minimum cost position 533 is found, and the motion vector between the current PU position and the minimum cost position 533 is computed and recorded as the fractional pixel search motion vector 523.
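The window search described above can be illustrated with a small sketch. It is deliberately simplified and rests on several assumptions: it evaluates integer sample positions only (a real fractional pixel search would interpolate half- and quarter-sample positions), it uses the sum of absolute differences (SAD) as the cost, it anchors blocks at their top-left corner rather than their center, and it includes both window endpoints. All function names are hypothetical.

```python
def sad(cur, ref, x, y):
    """Sum of absolute differences between block `cur` and the same-sized
    block of `ref` whose top-left corner is at (x, y)."""
    return sum(abs(cur[j][i] - ref[y + j][x + i])
               for j in range(len(cur)) for i in range(len(cur[0])))


def window_search(cur, ref, pu_pos, start_mv, k):
    """Scan offsets -k..k around the starting search position and return
    (cost, motion_vector) for the cheapest candidate, or None."""
    px, py = pu_pos
    sx, sy = px + start_mv[0], py + start_mv[1]  # starting search position
    h, w = len(cur), len(cur[0])
    best = None
    for dy in range(-k, k + 1):
        for dx in range(-k, k + 1):
            x, y = sx + dx, sy + dy
            # Skip candidates that fall outside the reference frame.
            if x < 0 or y < 0 or y + h > len(ref) or x + w > len(ref[0]):
                continue
            c = sad(cur, ref, x, y)
            if best is None or c < best[0]:
                best = (c, (x - px, y - py))  # vector relative to the PU position
    return best
```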
Please refer to FIG. 7, which is a schematic diagram of the precise comparison module in the H.265 encoding device according to an embodiment of the present invention. In some embodiments, the precise comparison module 140 includes a distribution module 711, multiple single-stage calculation modules (such as 721, 722, 723, and 724), and multiple hierarchical comparison modules 740. The distribution module 711 is connected to the coarse selection module 130 and to the multiple single-stage calculation modules; each single-stage calculation module is connected to a corresponding hierarchical comparison module 740. Wherein:
The distribution module 711 is configured to distribute, according to the different division modes of each CTU block, the prediction information corresponding to the different CU blocks in each division mode to different single-stage calculation modules;
The single-stage calculation module is configured to calculate multiple pieces of cost information according to the prediction information corresponding to a CU block received from the distribution module 711, to compare them within the layer, and to select the prediction mode and division mode with the lowest cost for that CU block;
The hierarchical comparison module 740 is configured to compare the cost information calculated by the single-stage calculation modules of different layers, and to select the division mode with the lowest cost for the CTU block together with the corresponding coding information.
In some embodiments, the precise comparison module 140 of FIG. 7 includes four single-stage calculation modules 721, 722, 723, and 724. Each of the single-stage calculation modules 721, 722, 723, and 724 may be implemented as the single-stage calculation module 810 of FIG. 8. As shown in FIG. 8, the single-stage calculation module 810 includes an inter mode cost calculation module 820, an intra mode cost calculation module 830, and an optimization module 840. For each input CU, the single-stage calculation module 810 may calculate an inter cost through the inter mode cost calculation module 820, calculate an intra cost through the intra mode cost calculation module 830, and compare the inter cost and the intra cost through the optimization module 840 to determine the division mode and prediction mode with the lowest overall cost, which are the lowest-cost division mode and prediction mode for the currently input CU.
Returning to the embodiment of FIG. 7, each of the single-stage calculation modules 721, 722, 723, and 724 processes CU blocks of one specific level. For example, the single-stage calculation module 721 may be set as the first-level calculation module, which processes 64x64 CU blocks; the single-stage calculation module 722 may be set as the second-level calculation module, which processes 32x32 CU blocks; the single-stage calculation module 723 may be set as the third-level calculation module, which processes 16x16 CU blocks; and the single-stage calculation module 724 may be set as the fourth-level calculation module, which processes 8x8 CU blocks. Suppose the precise comparison module 140 receives from the coarse selection module 130 a CTU together with the corresponding division modes, prediction information, and multiple inter motion vectors and reference information. The distribution module 711 can then distribute the CUs of the various division modes to the calculation modules 721-724 according to their size.
In some embodiments, the intra mode cost calculation module 830 of each single-stage calculation module receives one or more pieces of intra prediction information related to a CU of a given level, and calculates and selects one intra cost. The inter mode cost calculation module 820 of each single-stage calculation module receives, simultaneously and in parallel, one or more inter motion vectors and reference information related to a CU of a given level, and calculates and selects one inter cost. The optimization module 840 of each single-stage calculation module then selects the minimum cost from the calculated intra cost and inter cost. In other words, when the minimum cost is the intra cost, H.265 encoding with the associated intra prediction information is the better choice; when the minimum cost is the inter cost, H.265 encoding with the associated inter motion vector and reference information is the better choice.
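The selection performed inside one single-stage calculation module can be sketched as a comparison over two cost tables. This is only an illustration under assumptions: the function name, the dictionary layout, and the tie-breaking rule (inter wins on equal cost) are not taken from the source, and the cost values would come from the inter and intra mode cost calculation modules.

```python
def choose_mode(intra_costs, inter_costs):
    """Pick the cheapest candidate for one CU.

    Each argument maps a candidate (for example an intra direction, or a
    motion vector paired with a reference index) to its cost.
    Returns (kind, candidate, cost), or None if both tables are empty.
    """
    best = None
    for kind, table in (("inter", inter_costs), ("intra", intra_costs)):
        for candidate, cost in table.items():
            # Strict comparison: on a tie the earlier (inter) candidate is kept.
            if best is None or cost < best[2]:
                best = (kind, candidate, cost)
    return best
```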
For example, the hierarchical comparison module 743 may compare the sum of the minimum costs of four 8x8 blocks calculated by the fourth-level calculation module 724 with the minimum cost of one 16x16 block calculated by the third-level calculation module 723, and keep the division mode with the lower cost. Specifically, the four 8x8 blocks being compared (call them blocks A, B, C, and D) may all be minimum-cost blocks obtained by inter comparison, may all be minimum-cost blocks obtained by intra comparison, or may be a mixture of both. For example, block A may be obtained by inter prediction while blocks B, C, and D are obtained by intra prediction; or blocks A and C may be obtained by inter prediction while blocks B and D are obtained by intra prediction.
Similarly, the hierarchical comparison module 742 may take four minimum-cost 16x16 blocks obtained from the hierarchical comparison module 743, combine them, and compare the result with one minimum-cost 32x32 block calculated by the second-level calculation module 722. Specifically, the four 16x16 blocks selected by the hierarchical comparison module 742 (call them blocks E, F, G, and H) may be complete 16x16 CU blocks or may each be composed of multiple 8x8 blocks. For example, block E may be a 16x16 CU block obtained by inter prediction; block F may be a 16x16 CU block obtained by intra prediction; and block G may be a 16x16 combined block made up of four 8x8 blocks, some obtained by inter prediction and some by intra prediction.
Similarly, the hierarchical comparison module 741 may take four minimum-cost 32x32 blocks obtained from the hierarchical comparison module 742, combine them, and compare the result with one minimum-cost 64x64 block calculated by the first-level calculation module 721. Specifically, the four 32x32 blocks selected by the hierarchical comparison module 741 (call them blocks I, J, K, and L) may be complete 32x32 CU blocks, or may be combined blocks made up of multiple 16x16 blocks, each of which may in turn be made up of multiple 8x8 blocks. For example, block I may be a 32x32 CU block obtained by inter prediction; block J may be made up of four 16x16 CU blocks, some obtained by inter prediction and some by intra prediction; and one or more of the 16x16 blocks in block K may each be made up of multiple 8x8 blocks.
In this way, the hierarchical comparison module 740 can find the combination of CTU, CU, and PU blocks with the lowest cost, and select the division mode with the lowest cost for the CTU block together with the corresponding coding information.
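The bottom-up comparison across levels (four 8x8 blocks against one 16x16 block, four 16x16 results against one 32x32 block, and so on up to 64x64) can be sketched as a recursive quadtree decision. This is a software illustration of the comparison logic only, not the hardware structure: the `costs` table, keyed by `(size, x, y)`, is a hypothetical stand-in for the per-CU minimum costs delivered by the single-stage calculation modules, and keeping the unsplit block on a tie is an assumption.

```python
def best_partition(costs, size=64, x=0, y=0, min_size=8):
    """Return (total_cost, blocks) for the cheapest partition of the square
    block of `size` at (x, y).

    `costs[(size, x, y)]` is the minimum cost of coding that block as one CU.
    `blocks` lists the chosen CUs as (size, x, y) tuples.
    """
    whole = costs[(size, x, y)]
    if size == min_size:
        return whole, [(size, x, y)]

    half = size // 2
    split_cost, split_blocks = 0, []
    for dy in (0, half):
        for dx in (0, half):
            c, blocks = best_partition(costs, half, x + dx, y + dy, min_size)
            split_cost += c
            split_blocks += blocks

    # Keep the four-way split only if it is strictly cheaper than the whole block.
    if split_cost < whole:
        return split_cost, split_blocks
    return whole, [(size, x, y)]
```

Because each level only needs the winners of the level below, the three comparison stages can run as soon as the four child results are available.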
As shown in FIG. 9, the inventors also provide an H.265 encoding method applied to an H.265 encoding device. The device includes the following modules: a preprocessing module, a coarse selection module, and a precise comparison module; the preprocessing module is connected to the coarse selection module, and the coarse selection module is connected to the precise comparison module. The method includes the following steps:
First, in step S101, the preprocessing module divides a current frame of an original video into multiple CTU blocks;
Then, in step S102, the coarse selection module divides each CTU block according to multiple division modes, where each division mode splits a CTU block into multiple corresponding CU blocks and splits each of those CU blocks into one or more corresponding PU blocks; the coarse selection module also performs inter prediction and intra prediction for each division mode of each CTU block and generates prediction information corresponding to each division mode;
Then, in step S103, the precise comparison module compares the costs of the prediction information corresponding to the division modes of each CTU block, selects for each CTU block the division mode with the lowest cost and the coding information corresponding to that division mode, and, according to the selected division mode and its coding information, generates the entropy coding information used to produce the H.265 bitstream for the current frame and the reconstruction information used to produce the reconstructed frame for the current frame.
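Step S101 above, dividing a frame into CTU blocks, can be illustrated with a short sketch. The function name and the handling of frame edges (clipping CTUs that extend past the right or bottom border) are assumptions; the source does not say how partial CTUs at the border are treated.

```python
def ctu_grid(width, height, ctu_size=64):
    """List the CTUs covering a frame in raster order as (x, y, w, h).

    CTUs on the right and bottom edges are clipped to the frame size,
    which is an assumption made for this sketch.
    """
    grid = []
    for y in range(0, height, ctu_size):
        for x in range(0, width, ctu_size):
            w = min(ctu_size, width - x)
            h = min(ctu_size, height - y)
            grid.append((x, y, w, h))
    return grid
```

For a 1920x1080 frame and 64x64 CTUs this yields 30 columns and 17 rows, with the last row only 56 samples high.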
In some embodiments, the device further includes an entropy coding module connected to the precise comparison module. The method includes the following step: the entropy coding module generates the H.265 bitstream corresponding to the current frame from the entropy coding information for the current frame, which is generated according to the lowest-cost division mode of each CTU block and its corresponding coding information.
In some embodiments, the device includes a post-processing module connected to the precise comparison module. The method includes: the post-processing module generates the reconstructed frame corresponding to the current frame from the reconstruction information for the current frame, which is generated according to the lowest-cost division mode of each CTU block and its corresponding coding information.
Preferably, the post-processing module includes a deblocking filter module and a sample adaptive offset module, which are connected to each other. The method includes: the deblocking filter module filters the reconstructed frame using the lowest-cost division mode and corresponding coding information provided by the precise comparison module; the sample adaptive offset module performs the SAO calculation on the filtered reconstructed frame and transmits the calculated data to the entropy coding module.
In some embodiments, the coarse selection module includes an inter prediction coarse selection module and an intra prediction coarse selection module, each of which is connected to both the preprocessing module and the precise comparison module. The method includes: the inter prediction coarse selection module performs inter prediction on each PU block in each division mode, selects one or more pieces of reference information obtained from the reference frame whose cost relative to the PU block is below a preset cost value, and uses the motion vectors of the selected reference PU blocks as the prediction information for that division mode; the intra prediction coarse selection module performs intra prediction on each PU block in each division mode, selects one or more intra prediction directions whose cost relative to the PU block is below the preset cost value, and uses the selected intra prediction directions as the prediction information for that division mode.
In some embodiments, the intra prediction coarse selection module further includes a reference pixel generation module. The method includes: for each PU block in each division mode, the reference pixel generation module generates reference pixels from the original pixels of the current frame, predicts all intra prediction directions from those reference pixels according to the rules of the H.265 standard to obtain a prediction result for each direction, calculates the distortion cost between each prediction result and the original pixels, sorts the costs in ascending order, and selects the one or more intra prediction directions with the lowest cost.
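The ranking step described above (predict every direction, measure distortion against the original pixels, sort, keep the cheapest few) can be sketched as follows. The sketch is heavily simplified and should not be read as the H.265 prediction process: it covers only three illustrative modes (DC, horizontal, vertical) instead of the full set of planar, DC, and angular modes, it omits reference sample filtering, and it uses SAD as the distortion cost. The function names are hypothetical.

```python
def predict(mode, top, left, n):
    """Build an n x n prediction from the row above (`top`) and the column
    to the left (`left`). Simplified stand-ins for three intra modes."""
    if mode == "vertical":       # copy the row above downwards
        return [list(top[:n]) for _ in range(n)]
    if mode == "horizontal":     # copy the left column across
        return [[left[j]] * n for j in range(n)]
    dc = (sum(top[:n]) + sum(left[:n]) + n) // (2 * n)  # rounded mean
    return [[dc] * n for _ in range(n)]


def rank_intra_modes(block, top, left, keep=2):
    """Return the `keep` cheapest (mode, cost) pairs, lowest cost first."""
    n = len(block)
    costs = {}
    for mode in ("dc", "horizontal", "vertical"):
        pred = predict(mode, top, left, n)
        costs[mode] = sum(abs(block[j][i] - pred[j][i])
                          for j in range(n) for i in range(n))
    return sorted(costs.items(), key=lambda kv: kv[1])[:keep]
```

Here `top` and `left` would be taken from the original pixels of the current frame, as the text describes, rather than from reconstructed neighbors.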
As shown in FIG. 10, in some embodiments, the inter prediction coarse selection module further includes a coarse search module, a fine search module, and a fractional pixel search module. The coarse search module is connected to the preprocessing module and to the fine search module, and the fine search module is connected to the fractional pixel search module. The method includes:
First, in step S201, the coarse search module selects a frame from the reference array and chooses either its original frame or its reconstructed frame as the reference frame; then, in step S202, it downsamples the reference frame and the current CTU block; then, in step S203, it finds the pixel position in the downsampled reference frame with the lowest cost relative to the downsampled CTU block, and calculates the coarse search vector of that pixel position relative to the current CTU block.
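Steps S202 and S203 can be sketched in a few lines. The details below are assumptions made for illustration: downsampling by averaging 2x2 tiles, an exhaustive search over the whole downsampled reference frame, SAD as the cost, and scaling the winning position back to full resolution before forming the vector. The source does not fix the downsampling ratio, the search range, or the cost function.

```python
def downsample(img, f=2):
    """Average each f x f tile into one sample (integer division)."""
    return [[sum(img[y * f + j][x * f + i] for j in range(f) for i in range(f)) // (f * f)
             for x in range(len(img[0]) // f)]
            for y in range(len(img) // f)]


def coarse_search(ctu, ref, ctu_pos, f=2):
    """Return the coarse search vector (dx, dy) of the best match for `ctu`
    in `ref`, relative to the CTU's own position `ctu_pos`."""
    small_ctu, small_ref = downsample(ctu, f), downsample(ref, f)
    h, w = len(small_ctu), len(small_ctu[0])
    best_cost, best_xy = None, None
    for y in range(len(small_ref) - h + 1):
        for x in range(len(small_ref[0]) - w + 1):
            cost = sum(abs(small_ctu[j][i] - small_ref[y + j][x + i])
                       for j in range(h) for i in range(w))
            if best_cost is None or cost < best_cost:
                best_cost, best_xy = cost, (x, y)
    # Scale back to full resolution, then express relative to the CTU position.
    return (best_xy[0] * f - ctu_pos[0], best_xy[1] * f - ctu_pos[1])
```

Searching at reduced resolution cuts the number of cost evaluations and the samples per evaluation, at the price of a vector that is only accurate to `f` samples; the later fine search is what recovers that precision.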
As shown in FIG. 11, in some embodiments, the method includes:
First, in step S301, the fine search module sets, according to the coarse search vector, a fine search area in the reconstructed image of the reference frame for each PU block; then, in step S302, it generates within that fine search area the fine search vector with the lowest cost for that PU block. It also generates, according to the motion vector information around the current CTU block, one or more predicted motion vectors that serve the same function as the coarse search vector, generates fine search vectors from these predicted motion vectors, and sends all generated fine search vectors to the fractional pixel search module.
As shown in FIG. 12, in some embodiments, the method includes:
First, in step S401, the fractional pixel search module sets, according to each received fine search vector, a corresponding fractional pixel search area in the reference frame for each PU block; then, in step S402, it generates within that fractional pixel search area the fractional pixel search vector with the lowest cost for that PU block.
In some embodiments, the precise comparison module includes a distribution module, multiple hierarchical calculation modules, and multiple hierarchical comparison modules; the distribution module is connected to the coarse selection module, and the hierarchical comparison modules are connected to the distribution module. The method includes:
The distribution module distributes, according to each division mode of each CTU block, each CU block in each division mode and the prediction information corresponding to that CU block to different hierarchical calculation modules;
The hierarchical calculation module calculates multiple pieces of cost information according to the received prediction information corresponding to a CU block, compares them within the layer, and selects the prediction mode and division mode with the lowest cost for that CU block;
The hierarchical comparison module compares the minimum costs of the prediction modes and division modes selected by the hierarchical calculation modules of different layers, and selects the division mode with the lowest cost for the CTU block together with the corresponding coding information.
In the H.265 encoding method and device of the above technical solution, the device includes the following modules: a preprocessing module, a coarse selection module, and a precise comparison module; the preprocessing module is connected to the coarse selection module, and the coarse selection module is connected to the precise comparison module. The preprocessing module is configured to divide a current frame of an original video into multiple CTU blocks. The coarse selection module is configured to divide each CTU block according to multiple division modes, where each division mode splits a CTU block into multiple corresponding CU blocks and splits each of those CU blocks into one or more corresponding PU blocks; the coarse selection module is further configured to perform inter prediction and intra prediction for each division mode of each CTU block and to generate prediction information corresponding to each division mode. The precise comparison module is configured to compare the costs of the prediction information corresponding to the division modes of each CTU block, to select for each CTU block the division mode with the lowest cost and the coding information corresponding to that division mode, and, according to the selected division mode and its coding information, to generate the entropy coding information used to produce the H.265 bitstream for the current frame and the reconstruction information used to produce the reconstructed frame for the current frame. By searching in successive steps, the present invention improves search accuracy, better preserves the details of the reconstructed image, and reduces hardware resource consumption.
In another implementation, the H.265 encoding device of the present invention may use a pipeline comprising multiple pipeline steps to carry out the steps of the specific embodiments. A "pipeline" here is a hardware implementation scheme in which the H.265 encoding process is split into multiple steps that are executed in parallel by multiple corresponding hardware processing units in order to speed up processing. A "pipeline step" is one specific step of a pipeline; a "pipeline stage" is one specific stage within a pipeline step. In other words, a pipeline may include one or more pipeline steps, and a pipeline step may include one or more pipeline stages. When a pipeline step contains only one pipeline stage, the pipeline step and the pipeline stage can be treated as equivalent.
In some embodiments, a particular hardware module may support the operation of one or more pipeline steps; that is, all pipeline stages in those pipeline steps are run by that hardware module (or by the sub-modules it contains). In other embodiments, a particular hardware module may support the operation of at least one pipeline stage. If a pipeline step has multiple pipeline stages, the hardware module is responsible only for running one or more specific pipeline stages of that pipeline step. In other words, the pipeline step may be implemented by multiple hardware modules, each of which runs its corresponding pipeline stage within the pipeline step.
The device includes multiple modules and multiple pipeline steps; each pipeline step includes at least one pipeline stage and is used to run at least one module, wherein:
The multiple modules include a preprocessing module 120, a coarse selection module 130, a precise comparison module 140, and an overall control module 920; the overall control module 920 is connected to the preprocessing module 120, the coarse selection module 130, and the precise comparison module 140;
The multiple pipeline steps include a preprocessing pipeline step, a coarse selection pipeline step, and a precise comparison pipeline step; the coarse selection pipeline step is executed after the preprocessing pipeline step, and the precise comparison pipeline step is executed after the coarse selection pipeline step;
The preprocessing pipeline step uses the preprocessing module 120 to divide a current frame 102 of an original video 100 into multiple CTU blocks (Coding Tree Units). A CTU is a sub-block of the current frame image, and its size may be any of 16x16, 32x32, or 64x64. Specifically, the preprocessing module may obtain the original image frames 101 of the original video 100 and select a current frame 102 from the original image frames 101.
The coarse selection pipeline step uses the coarse selection module 130 to divide each CTU block according to multiple division modes, to perform inter prediction coarse selection and intra prediction coarse selection for each division mode of each CTU block, and to generate prediction information corresponding to each division mode.
As shown in FIG. 2, in this embodiment, the coarse selection module includes an inter prediction coarse selection module and an intra prediction coarse selection module, and the coarse selection pipeline step includes an inter prediction coarse selection pipeline stage and an intra prediction coarse selection pipeline stage;
The inter prediction coarse selection pipeline stage uses the inter prediction coarse selection module to divide each CTU block according to multiple division modes, where each division mode splits a CTU block into multiple corresponding CU blocks (Coding Units) and splits each of those CU blocks into one or more corresponding PU blocks (Prediction Units); it performs inter prediction for each division mode of each CTU block and obtains reference frame information, and performs intra prediction for each division mode of each CTU block and generates prediction information corresponding to each division mode. The division mode is chosen according to actual needs. For example, a current CTU 121 of size 64x64 can be divided into four 32x32 sub-blocks, and each 32x32 sub-block can in turn be divided into four 16x16 sub-blocks.
The intra prediction coarse selection pipeline stage uses the intra prediction coarse selection module to perform intra prediction on each PU block in each division mode and calculate the corresponding costs, to select one or more intra prediction directions for each PU block according to those costs, and to use the selected intra prediction directions as the prediction information for that division mode. Each PU block has its own motion vector, which is used to obtain prediction information from the reconstructed reference frame; specifically, the prediction information can be fetched by starting from the position of the current PU block and applying the PU block's motion vector.
The precise comparison pipeline step uses the precise comparison module 140 to calculate and compare the costs of the prediction information corresponding to the division modes of each CTU block, to select for each CTU block the division mode with the lowest cost and the coding information corresponding to that division mode, and, according to the selected division mode and its coding information, to generate the entropy coding information used to produce the H.265 bitstream for the current frame and the reconstruction information used to produce the reconstructed frame for the current frame. Searching in successive steps in this way improves search accuracy, better preserves the details of the reconstructed image, and reduces hardware resource consumption.
The overall control module controls the storing and fetching of the original frame data and the reference frame data, and controls the preprocessing module, the coarse selection module, and the precise comparison module so that each executes its corresponding pipeline step in sequence. Preferably, the coarse selection pipeline step is executed after the preprocessing pipeline step, and the precise comparison pipeline step is executed after the coarse selection pipeline step. In short, while the coarse selection module executes the coarse selection pipeline step for the current frame, the preprocessing module can execute the preprocessing pipeline step for the next frame; while the precise comparison module executes the precise comparison pipeline step for the current frame, the coarse selection module can execute the coarse selection pipeline step for the next frame; and so on. This pipelined operation effectively improves the coding efficiency.
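The overlap described above can be sketched as a simple schedule table. This is only an illustration: the step names, the fixed one-slot latency per step, and the function `pipeline_schedule` are assumptions made for the example and do not come from the specification.

```python
# Hypothetical step names; each step is assumed to take exactly one time slot.
STEPS = ["preprocess", "coarse_select", "precise_compare"]

def pipeline_schedule(num_frames, steps=STEPS):
    """Return, for each time slot, the frame index each step works on (None if idle)."""
    slots = []
    for t in range(num_frames + len(steps) - 1):
        slot = {}
        for s, name in enumerate(steps):
            frame = t - s  # step s lags the first step by s slots
            slot[name] = frame if 0 <= frame < num_frames else None
        slots.append(slot)
    return slots

schedule = pipeline_schedule(3)
for t, slot in enumerate(schedule):
    print(t, slot)
```

In slot 2 of this toy schedule, frame 2 is being preprocessed while frame 1 is in coarse selection and frame 0 is in precise comparison, which is the overlap the paragraph describes.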
In some embodiments, the coarse selection module 130 further includes an inter-frame coarse selection module 230, and the precise comparison module 140 further includes an intra-frame coarse selection module 330; the coarse selection pipeline step includes an inter-frame coarse selection pipeline stage, and the precise comparison pipeline step includes an intra-frame coarse selection pipeline stage.
The inter-frame prediction coarse selection pipeline stage uses the inter-frame prediction coarse selection module to: divide each CTU block according to multiple division modes, where each division mode divides one CTU block into multiple corresponding CU blocks and divides each of those CU blocks into one or more corresponding PU blocks; perform inter-frame prediction on each division mode of each CTU block and obtain reference frame information; and perform intra-frame prediction on each division mode of each CTU block and generate prediction information corresponding to each division mode;
The intra-frame prediction coarse selection pipeline stage uses the intra-frame prediction coarse selection module to: perform intra-frame prediction on each PU block in each division mode and calculate the corresponding cost; select, according to the cost, one or more lower-cost intra-frame prediction directions for each PU block; and use the selected intra-frame prediction directions as the prediction information corresponding to that division mode.
In short, in practical applications the intra-frame coarse selection module 330 may belong either to the coarse selection module 130 or to the precise comparison module 140, which broadens the application scenarios of the apparatus.
In some embodiments, the inter-frame prediction coarse selection module 230 includes a coarse search module 211, a reference frame data loading module 910, a fine search module 213, and a fractional pixel search module 215. The coarse selection pipeline step includes a coarse search pipeline stage, a reference frame data loading pipeline stage, a fine search pipeline stage, and a fractional pixel search pipeline stage;
The coarse search pipeline stage uses the coarse search module to: select a frame from the reference array; select its original frame or its reconstructed frame as the reference frame; down-sample the reference frame and the current CTU block; find, in the down-sampled reference frame, the pixel position with the smallest cost relative to the down-sampled CTU block; and calculate the coarse search vector of that pixel position relative to the current CTU block;
The reference frame data loading pipeline stage uses the reference frame data loading module to: obtain, through the overall control module, the coarse search vector from the coarse search pipeline stage; obtain, from the motion vectors around the CTU block, one or more predicted motion vectors that serve the same function as the coarse search vector; load the reference frame data according to the coarse search vector and the one or more predicted motion vectors; and pass the data to the fine search pipeline stage through the overall control module;
The fine search pipeline stage uses the fine search module to: set, according to the coarse search vector, a fine search area in the reconstructed image of the reference frame for each PU block, and generate within that fine search area the fine search vector with the smallest cost for that PU block; generate, according to the motion vector information around the current CTU block, one or more predicted motion vectors that serve the same function as the coarse search vector, and generate fine search vectors according to those predicted motion vectors; and send all generated fine search vectors to the fractional pixel search module;
The fractional pixel search pipeline stage uses the fractional pixel search module to: set, according to each received fine search vector, a corresponding fractional pixel search area in the reference frame for each PU block, and generate within that fractional pixel search area the fractional pixel search vector with the smallest cost for that PU block. Preferably, the intra-frame prediction coarse selection pipeline stage and the fractional pixel search pipeline stage are the same pipeline stage, and the intra-frame prediction coarse selection module and the fractional pixel search module execute in parallel in that pipeline stage.
The reference list is a list that stores reference frames. The current frame may have multiple reference frames, all of which are indexed through the reference list. A reference frame includes a reconstructed frame and an original frame. Because both the reference frame and the current CTU block have been down-sampled, the coarse search vector calculated by the coarse search module is likewise a search vector at the down-sampled scale; that is, relative to the coarse search vector corresponding to the current CTU block, it must be multiplied by the down-sampling ratio (for example 1/4), and the coarse search vector multiplied by that ratio is passed to the next processing module.
In some embodiments, the intra-frame prediction coarse selection module includes a reference pixel generation module, which executes in the intra-frame prediction coarse selection pipeline stage. The intra-frame prediction coarse selection pipeline stage includes: for each PU block in each division mode, generating reference pixels from the original pixels of the current frame; predicting all intra-frame prediction directions from the reference pixels according to the rules of the H.265 protocol to obtain a prediction result for each direction; calculating a distortion cost between each prediction result and the original pixels; and sorting the costs in ascending order to select one or more intra-frame prediction directions with lower cost.
In some embodiments, the intra-frame prediction coarse selection pipeline stage and the fractional pixel search pipeline stage are different pipeline stages, and the intra-frame prediction coarse selection module executes in a pipeline stage after that of the fractional pixel search module. The intra-frame prediction coarse selection module includes a reference pixel generation module, which executes in the intra-frame prediction coarse selection pipeline stage. The reference pixel generation module is used to: for each PU block in each division mode, generate reference pixels from the reconstructed pixels of the current frame; predict all intra-frame prediction directions from the reference pixels according to the rules of the H.265 protocol to obtain a prediction result for each direction; calculate a distortion cost between each prediction result and the original pixels; and sort the costs in ascending order to select one or more intra-frame prediction directions with lower cost.
As shown in FIG. 3, the coarse search module selects either the original frame or the reconstructed frame as the reference frame, down-samples the reference frame and the current CTU separately, and then finds, in the down-sampled reference frame, the pixel position with the smallest cost relative to the down-sampled CTU, together with the coarse search vector. Preferably, in this embodiment, the reference frame and the current CTU are down-sampled with the same scaling ratio. For example, if the down-sampled image 320 is obtained from the reference frame 310 through down-sampling 311 by scaling both the width and the height of the reference frame to 1/4, then the down-sampled CTU is obtained from the current CTU 330 through down-sampling 331 by scaling both the width and the height of the current CTU 330 to 1/4. Then, taking the down-sampled CTU 340 (sub-block B in FIG. 3) as the unit, prediction is performed in the down-sampled image (sub-block A in FIG. 3): the cost between the down-sampled CTU 340 and each corresponding sub-block of the down-sampled image 320 (a sub-block of the same size as sub-block B, centered on each pixel of sub-block A) is calculated in turn, and the pixel block with the smallest cost relative to the down-sampled CTU is found and recorded as the minimum-cost pixel block 352 (sub-block C in FIG. 3). The center pixel position of the current minimum-cost pixel block and the coarse search vector are recorded. The coarse search vector is the vector displacement between the center pixel of the down-sampled CTU 340 (sub-block B in FIG. 3) and the center pixel position of the minimum-cost pixel block 352 (sub-block C in FIG. 3), that is, the motion vector 351 in FIG. 3.
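The coarse search of FIG. 3 can be sketched as an exhaustive block match in the down-sampled domain. The sketch below is illustrative only and rests on several assumptions not stated in the text: plain decimation as the down-sampling filter, the sum of absolute differences (SAD) as the cost, and the hypothetical helper names `downsample`, `sad`, and `coarse_search`. Because the candidate block and the CTU have the same size, the displacement between their top-left corners equals the displacement between their center pixels.

```python
def downsample(img, factor=4):
    """Keep every `factor`-th pixel in each direction (plain decimation, an assumption)."""
    return [row[::factor] for row in img[::factor]]

def sad(ref, top, left, block):
    """Sum of absolute differences between `block` and the same-sized window of `ref`."""
    return sum(
        abs(ref[top + y][left + x] - block[y][x])
        for y in range(len(block))
        for x in range(len(block[0]))
    )

def coarse_search(ref_small, ctu_small, ctu_top, ctu_left):
    """Exhaustive search in the down-sampled reference; returns (dy, dx, cost)."""
    h, w = len(ctu_small), len(ctu_small[0])
    best = None
    for top in range(len(ref_small) - h + 1):
        for left in range(len(ref_small[0]) - w + 1):
            cost = sad(ref_small, top, left, ctu_small)
            if best is None or cost < best[2]:
                best = (top - ctu_top, left - ctu_left, cost)
    return best

# Toy data: a 32x32 reference frame in which every pixel value is unique.
ref = [[y * 32 + x for x in range(32)] for y in range(32)]
# The current 8x8 "CTU" sits at (8, 8) but its content equals the reference at (16, 12).
ctu = [row[12:20] for row in ref[16:24]]

result = coarse_search(downsample(ref), downsample(ctu), 8 // 4, 8 // 4)
print(result)  # (2, 1, 0): displacement in down-sampled pixels, with zero cost
```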
As shown in FIG. 13, for a current CTU block of size 64x64, each of the ten 8x8 sub-blocks located above it (the sub-blocks labeled 1-10 in FIG. 13), the adjacent CTU block to the upper left, and the adjacent CTU block to the upper right has a corresponding coarse search result and corresponding motion vector information. In addition, there are 16 auxiliary motion vectors inside the current CTU block, so there are at most 28 mvs serving as adjacent mvs (that is, the motion vector information around the current CTU block). These 28 motion vectors are screened, and a preset number (for example 3) of adjacent mvs are selected and passed to the fine search module, which determines the same preset number of fine search motion vectors. In this embodiment, "the same function" means that the selected preset number of adjacent mvs play the same role as the search result obtained by the coarse search module: they are all fed to the interface of the fine search module for the next processing step.
In this embodiment, the coarse search module inputs one motion vector to the fine search module, and several mvs selected from the adjacent mvs are also input to the fine search module. Assuming a total of N mvs are input to the fine search module, the fine search module produces N fine search mvs (that is, fine search vectors) and inputs all N of them to the FME (that is, the fractional pixel search module). The FME then obtains one optimal fme_mv (that is, the fractional pixel search vector) from these N fine search mvs by cost comparison, and this fme_mv is finally input to the precise comparison module.
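The N-candidate flow above can be sketched as follows. This is a schematic under stated assumptions: the screening rule is not given in the text, so the sketch simply keeps the first few adjacent mvs; `fine_search` and `fme_cost` are hypothetical callables standing in for the fine search module and the FME cost; and the toy refinement and cost functions exist only to make the example runnable.

```python
def select_fme_mv(coarse_mv, adjacent_mvs, fine_search, fme_cost, max_adjacent=3):
    """One coarse mv plus screened adjacent mvs -> N fine search mvs -> one best mv."""
    # Placeholder screening (assumption): keep the first `max_adjacent` adjacent mvs.
    candidates = [coarse_mv] + list(adjacent_mvs)[:max_adjacent]
    fine_mvs = [fine_search(mv) for mv in candidates]  # N in, N out
    return min(fine_mvs, key=fme_cost)                 # cost comparison picks one

# Toy stand-ins: the "true" motion is (5, -3); refinement moves one step toward it.
TARGET = (5, -3)

def toy_fine_search(mv):
    return tuple(c + (t > c) - (t < c) for c, t in zip(mv, TARGET))

def toy_cost(mv):
    return abs(mv[0] - TARGET[0]) + abs(mv[1] - TARGET[1])

best_mv = select_fme_mv((4, -4), [(0, 0), (8, -3), (5, 2), (9, 9)],
                        toy_fine_search, toy_cost)
print(best_mv)  # (5, -3)
```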
As shown in FIG. 5, to further improve the search accuracy, the fractional pixel search module 215 sets, according to each received fine search vector, a corresponding fractional pixel search area 530 in the reference frame for each PU block, and generates within that fractional pixel search area 530 the fractional pixel search vector 523 with the smallest cost for that PU block. Specifically, the fractional pixel search area 530 can be determined as follows: according to the current PU position 520 and the previously obtained fine search motion vector, the starting search position 531 corresponding to the current PU position 520 is determined in the reference frame 510; taking the pixel at the starting search position as the center, the area is extended by K pixels in each of the four directions (up, down, left, and right; the value of K can be set as needed), and the resulting square area with a side length of 2K is the fractional pixel search area 530. Similar to the fine search, with the pixel at the starting search position 531 as the center, the costs of the sub-blocks that have the same size as the current PU and are centered on each pixel of the fractional pixel search area 530 are calculated in turn, the minimum-cost position 533 is found, and the motion vector between the current PU position and the minimum-cost position 533 is calculated and recorded as the fractional pixel search motion vector 523.
Refer to FIG. 7, which is a schematic diagram of the precise comparison module in the H.265 encoding apparatus according to an embodiment of the present invention. In some embodiments, the precise comparison module 140 includes a distribution module 711, multiple single-stage calculation modules (such as 721, 722, 723, and 724), and multiple hierarchical comparison modules 740. The distribution module 711 is connected to the coarse selection module 130 and to the multiple single-stage calculation modules; each single-stage calculation module is connected to a corresponding hierarchical comparison module 740. Specifically:
The distribution module 711 is used to distribute, according to the different division modes of each CTU block, the prediction information corresponding to the different CU blocks in each division mode to the different single-stage calculation modules;
Each single-stage calculation module is used to calculate multiple pieces of cost information according to the prediction information corresponding to a CU block received from the distribution module 711, compare them within the layer, and select the prediction mode and division mode with the smallest cost for that CU block;
The hierarchical comparison module 740 is used to compare the cost information calculated by the single-stage calculation modules of different layers and to select the division mode with the smallest cost for the CTU block together with the corresponding coding information.
In some embodiments, the precise comparison module 140 of FIG. 7 contains four single-stage calculation modules 721, 722, 723, and 724. Each of the single-stage calculation modules 721, 722, 723, and 724 may be implemented as the single-stage calculation module 810 of FIG. 8. As shown in FIG. 8, the single-stage calculation module 810 includes an inter-mode cost calculation module 820, an intra-mode cost calculation module 830, and a selection module 840. For each input CU, the single-stage calculation module 810 calculates an inter-frame cost through the inter-mode cost calculation module 820, calculates an intra-frame cost through the intra-mode cost calculation module 830, and compares the inter-frame cost with the intra-frame cost through the selection module 840 to determine the division mode and prediction mode with the smallest overall cost, which are the minimum-cost division mode and prediction mode for the currently input CU.
Returning to the embodiment of FIG. 7, each of the single-stage calculation modules 721, 722, 723, and 724 processes CU blocks of one specific level. For example, the single-stage calculation module 721 may be configured as the first-level calculation module, which processes 64x64 CU blocks; the single-stage calculation module 722 may be configured as the second-level calculation module, which processes 32x32 CU blocks; the single-stage calculation module 723 may be configured as the third-level calculation module, which processes 16x16 CU blocks; and the single-stage calculation module 724 may be configured as the fourth-level calculation module, which processes 8x8 CU blocks. Suppose the precise comparison module 140 receives from the coarse selection module 130 a CTU together with the corresponding division modes, prediction information, and multiple inter-frame motion vectors and reference information. The distribution module 711 can then distribute the CUs of the various division modes to the calculation modules 721-724 of the respective levels according to their sizes.
In some embodiments, the intra-mode cost calculation module 830 of each single-stage calculation module receives one or more pieces of intra-frame prediction information related to a CU of a given level, and calculates and selects one intra-frame cost. The inter-mode cost calculation module 820 of each single-stage calculation module receives, at the same time and in parallel, one or more inter-frame motion vectors and reference information related to a CU of a given level, and calculates and selects one inter-frame cost. The selection module 840 of each single-stage calculation module then selects the minimum cost from the calculated intra-frame cost and inter-frame cost. In other words, when the minimum cost is the intra-frame cost, H.265 encoding with the associated intra-frame prediction information is the better choice; when the minimum cost is the inter-frame cost, H.265 encoding with the associated inter-frame motion vector and reference information is the better choice.
For example, the hierarchical comparison module 743 can compare the sum of the minimum costs of four 8x8 blocks calculated by the fourth-level calculation module 724 with the minimum cost of one 16x16 block calculated by the third-level calculation module 723, and obtain whichever division mode has the smaller cost. Specifically, one of the objects being compared consists of four 8x8 blocks (call them blocks A, B, C, and D). These may all be minimum-cost blocks obtained by inter-frame comparison, all be minimum-cost blocks obtained by intra-frame comparison, or be a mixture of both. For example, block A may be obtained by inter-frame comparison while blocks B, C, and D are obtained by intra-frame comparison; or blocks A and C may be obtained by inter-frame comparison while blocks B and D are obtained by intra-frame comparison.
Similarly, the hierarchical comparison module 742 can take four minimum-cost 16x16 blocks obtained from the hierarchical comparison module 743, combine them, and compare the result with one minimum-cost 32x32 block calculated by the second-level calculation module 722. Specifically, each of the four 16x16 blocks selected by the hierarchical comparison module 742 (call them blocks E, F, G, and H) may be a complete 16x16 CU block or may be composed of multiple 8x8 blocks. For example, block E may be a 16x16 CU block obtained by inter-frame comparison; block F may be a 16x16 CU block obtained by intra-frame comparison; and block G may be a 16x16 composite block made up of four 8x8 blocks, some obtained by inter-frame comparison and some by intra-frame comparison.
Similarly, the hierarchical comparison module 741 can take four minimum-cost 32x32 blocks obtained from the hierarchical comparison module 742, combine them, and compare the result with one minimum-cost 64x64 block calculated by the first-level calculation module 721. Specifically, each of the four 32x32 blocks selected by the hierarchical comparison module 741 (call them blocks I, J, K, and L) may be a complete 32x32 CU block, or may be a composite block made up of multiple 16x16 blocks, each of which may in turn be made up of multiple 8x8 blocks. For example, block I may be a 32x32 CU block obtained by inter-frame comparison; block J may be made up of four 16x16 CU blocks, some obtained by inter-frame comparison and some by intra-frame comparison; and one or more of the 16x16 blocks in block K may each be made up of multiple 8x8 blocks.
In this way, the hierarchical comparison module 740 can find the combination of CTU, CU, and PU blocks with the smallest cost, and select the division mode with the smallest cost for the CTU block together with the corresponding coding information.
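The bottom-up comparison performed by the single-stage calculation modules and the hierarchical comparison modules can be sketched as a recursive quadtree decision. The sketch is a software illustration under assumptions: the hardware described above evaluates the levels in parallel rather than recursively, the `costs` dictionary layout and the function name `best_partition` are hypothetical, and ties are resolved in favor of intra prediction and of the unsplit block purely for the sake of the example.

```python
def best_partition(costs, size=64, y=0, x=0, min_size=8):
    """costs[(size, y, x)] = (intra_cost, inter_cost) for the CU at that position.

    Returns (total_cost, leaves), where each leaf is (size, y, x, mode).
    """
    intra, inter = costs[(size, y, x)]
    # Single-stage step: keep the cheaper of the intra and inter costs for this CU.
    mode, own = ("intra", intra) if intra <= inter else ("inter", inter)
    whole = (own, [(size, y, x, mode)])
    if size == min_size:
        return whole
    # Hierarchical step: sum the best costs of the four half-size blocks.
    half = size // 2
    split_cost, split_leaves = 0, []
    for dy in (0, half):
        for dx in (0, half):
            cost, leaves = best_partition(costs, half, y + dy, x + dx, min_size)
            split_cost += cost
            split_leaves += leaves
    return whole if own <= split_cost else (split_cost, split_leaves)

# Toy example restricted to one 16x16 block and its four 8x8 children.
toy_costs = {
    (16, 0, 0): (100, 90),
    (8, 0, 0): (20, 30), (8, 0, 8): (25, 15),
    (8, 8, 0): (30, 22), (8, 8, 8): (18, 40),
}
total, leaves = best_partition(toy_costs, size=16)
print(total, leaves)
```

In this toy case the four 8x8 blocks cost 20 + 15 + 22 + 18 = 75, which beats the best 16x16 cost of 90, so the split is kept and the leaves mix intra-coded and inter-coded blocks, as in the A/B/C/D example above.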
In some embodiments, the intra-frame prediction coarse selection module 330 includes a reference pixel generation module 231; the intra-frame prediction coarse selection module 330 executes in the intra-frame prediction coarse selection pipeline stage;
The reference pixel generation module 231 is used to: for each PU block in each division mode, generate reference pixels from the original pixels of the current frame; predict all intra-frame prediction directions from the reference pixels according to the rules of the H.265 protocol to obtain a prediction result for each direction; calculate a distortion cost between each prediction result and the original pixels; and sort the costs in ascending order to select one or more intra-frame prediction directions with lower cost.
The coarse selection method of the intra-frame prediction coarse selection module is similar to that of the inter-frame prediction coarse selection module and is not repeated here. The difference is that for intra-frame prediction the original frame is down-sampled to obtain the down-sampled image, and the down-sampled CTU is predicted within the down-sampled image of the original frame; for inter-frame prediction the reference frame is down-sampled to obtain the down-sampled image, and the down-sampled CTU is predicted within the down-sampled image of the reference frame.
As shown in FIG. 6-A and FIG. 6-B, according to the H.265 protocol the reference pixels should be reconstructed pixels. In a hardware implementation, however, only the original pixels are available at the current point in time, and the reconstructed pixels often are not yet available, so the present invention uses original pixels in place of reconstructed pixels. Take a 4x4 PU sub-block as an example. The black-filled dots in the figure are the boundary pixels. According to the H.265 protocol, a 4x4 block (the shaded dots in FIG. 6-B) has 17 boundary pixels in total. The black-filled pixels (that is, the boundary pixels) should be filled with reconstructed pixels, but because the reconstructed pixels cannot be obtained at the current point in time, original pixels are used instead. The shaded part is the 4x4 PU block. After the boundary pixels are filled, prediction is performed according to the protocol to obtain the shaded 4x4 block.
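The count of 17 boundary pixels follows from the H.265 reference sample layout for an N x N block: 2N samples above and above-right, 2N samples to the left and below-left, and one corner sample, giving 4N + 1 = 17 for N = 4. The sketch below gathers these samples from the original frame, as the paragraph describes. It is a minimal illustration: the function name `reference_pixels` is hypothetical, and it assumes that every neighboring sample lies inside the frame, so the availability checks and substitution rules of the standard are omitted.

```python
def reference_pixels(frame, top, left, n):
    """Collect the 4*n + 1 boundary samples of the n x n block at (top, left).

    `frame` holds ORIGINAL pixels, standing in for the reconstructed pixels
    that are not yet available. Assumes all neighbors are inside the frame.
    """
    corner = [frame[top - 1][left - 1]]
    above = [frame[top - 1][left + i] for i in range(2 * n)]     # above and above-right
    left_col = [frame[top + i][left - 1] for i in range(2 * n)]  # left and below-left
    return corner + above + left_col

# Toy 12x12 original frame with unique pixel values.
original = [[y * 12 + x for x in range(12)] for y in range(12)]
refs = reference_pixels(original, top=2, left=2, n=4)
print(len(refs))  # 17
```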
As shown in FIG. 4, the fine search module sets, according to the coarse search vector, a fine search area in the reference frame for each PU, and finds within that fine search area the fine search vector with the smallest cost for that PU. The fine search step is performed within the reference frame 410. Each current CTU contains multiple PUs, and the fine search is performed by selecting these PUs one at a time, in some order, as the current PU. Specifically, the current PU position 420 is determined first, and then a fine search area 430 is set in the reference frame for that PU according to the previously obtained coarse search vector (also called the restored motion vector 421). A starting search position 431 corresponding to the current PU position 420 is then determined within the fine search area 430 according to the restored motion vector 421. Similar to the coarse search, within the fine search area 430 and with the pixel at the starting search position 431 as the center, the costs of the sub-blocks that have the same size as the current PU and are centered on each pixel of the fine search area 430 are calculated in turn, the minimum-cost position 433 is found, and the motion vector between the current PU position 420 and the minimum-cost position 433 is calculated and recorded as the fine search motion vector 423.
In some embodiments, the apparatus further includes a post-processing module 180, which is connected to the precise comparison module 140. The post-processing module 180 executes in the post-processing pipeline step, which includes: generating the reconstructed frame corresponding to the current frame according to the minimum-cost division mode of each CTU block output by the precise comparison module and the reconstruction information corresponding to it.
Preferably, the post-processing module 180 includes a deblocking filter module 160 and a sample adaptive offset module 170, and the post-processing pipeline step includes a deblocking filter pipeline step and a sample adaptive offset pipeline step. The deblocking filter module executes in the deblocking filter pipeline step, and the sample adaptive offset module executes in the sample adaptive offset pipeline step. The deblocking filter pipeline step includes: filtering the reconstructed frame using the minimum-cost division mode provided by the precise comparison module and the reconstruction information corresponding to it. The sample adaptive offset pipeline step includes: performing the SAO calculation on the filtered reconstructed frame to obtain the final reconstructed frame, which is used for reference and display. The deblocking filter pipeline step and the sample adaptive offset pipeline step execute serially, one after the other, in the post-processing pipeline stage.
In some embodiments, the device further includes an entropy coding module 150 connected to the precise comparison module 140. The entropy coding module 150 executes the entropy coding pipeline step, which includes: generating an H.265 code stream corresponding to the current frame from the least-cost partition mode that the precise comparison module 140 outputs for each CTU block and from the entropy coding information corresponding to the current frame that is generated from the corresponding coding information. The entropy coding pipeline step and the post-processing pipeline step are executed in parallel in the same pipeline stage.
Specifically, the precise comparison module 140 generates the data required for entropy coding of a CTU, namely the coding information 141 shown in Figure 1, from that CTU's least-cost partition mode and prediction mode. The entropy coding module 150 uses this data to generate the encoded code stream 190 corresponding to the original video. At the same time, the image encoding device 110 also outputs the encoded video 180; an image frame of the encoded video 180 is the reconstructed image frame 145.
As shown in Figure 14, the inventors also provide an H.265 encoding method applied to an H.265 encoding device. The device includes multiple modules and multiple pipeline steps; each pipeline step includes at least one pipeline stage and is used to execute at least one module, where:
the multiple modules include a preprocessing module, a coarse selection module, a precise comparison module, and an overall control module, the overall control module being connected to the preprocessing module, the coarse selection module, and the precise comparison module respectively;
the multiple pipeline steps include a preprocessing pipeline step, a coarse selection pipeline step, and a precise comparison pipeline step, the coarse selection pipeline step being executed after the preprocessing pipeline step and the precise comparison pipeline step being executed after the coarse selection pipeline step;
The method includes the following steps:
First, in step S101', the preprocessing pipeline step uses the preprocessing module to divide a current frame of an original video into multiple CTU blocks;
Then, in step S102', the coarse selection pipeline step uses the coarse selection module to divide each CTU block according to multiple partition modes, perform inter-frame prediction coarse selection and intra-frame prediction coarse selection for each partition mode of each CTU block, and generate prediction information corresponding to each partition mode;
Then, in step S103', the precise comparison pipeline step uses the precise comparison module to calculate and compare the costs of the prediction information corresponding to the partition modes of each CTU block, select for each CTU block the partition mode with the smallest cost together with the coding information corresponding to that partition mode, and, from the selected partition mode and its coding information, generate the entropy coding information used to produce the H.265 code stream for the current frame and the reconstruction information used to produce the reconstructed frame for the current frame.
The overall control module controls the storing and fetching of original frame data and reference frame data, and controls the preprocessing module, the coarse selection module, and the precise comparison module so that they execute their corresponding pipeline steps in sequence.
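Step S101' can be sketched as follows. The 64x64 CTU size is an assumption (H.265 also permits 16x16 and 32x32), as is the choice to clip CTUs at the right and bottom frame borders instead of padding the frame; the function name is hypothetical.

```python
def split_into_ctus(width, height, ctu_size=64):
    """Return (x, y, w, h) rectangles that tile the frame in raster order.

    CTUs on the right and bottom borders are clipped to the frame size.
    """
    ctus = []
    for y in range(0, height, ctu_size):
        for x in range(0, width, ctu_size):
            ctus.append((x, y,
                         min(ctu_size, width - x),
                         min(ctu_size, height - y)))
    return ctus
```

Each rectangle would then be handed to the coarse selection step (S102') and the precise comparison step (S103') in turn.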
As shown in Figure 10, the coarse selection module includes an inter-frame prediction coarse selection module and an intra-frame prediction coarse selection module, and the coarse selection pipeline step includes an inter-frame prediction coarse selection pipeline stage and an intra-frame prediction coarse selection pipeline stage;
The method further includes:
In the inter-frame prediction coarse selection pipeline stage, the inter-frame prediction coarse selection module divides each CTU block according to multiple partition modes, each partition mode dividing a CTU block into multiple corresponding CU blocks and dividing each of those CU blocks into one or more corresponding PU blocks; performs inter-frame prediction for each partition mode of each CTU block and obtains reference frame information; and performs intra-frame prediction for each partition mode of each CTU block and generates prediction information corresponding to each partition mode;
In the intra-frame prediction coarse selection pipeline stage, the intra-frame prediction coarse selection module performs intra-frame prediction on each PU block in each partition mode and calculates the corresponding cost, selects one or more intra-frame prediction directions for each PU block according to that cost, and uses the selected intra-frame prediction directions as the prediction information corresponding to the partition mode.
In some embodiments, the coarse selection module further includes an inter-frame coarse selection module and the precise comparison module further includes an intra-frame coarse selection module; the coarse selection pipeline step includes an inter-frame coarse selection pipeline stage, and the precise comparison pipeline step includes an intra-frame coarse selection pipeline stage.
The method includes:
In the inter-frame prediction coarse selection pipeline stage, the inter-frame prediction coarse selection module divides each CTU block according to multiple partition modes, each partition mode dividing a CTU block into multiple corresponding CU blocks and dividing each of those CU blocks into one or more corresponding PU blocks; performs inter-frame prediction for each partition mode of each CTU block and obtains reference frame information; and performs intra-frame prediction for each partition mode of each CTU block and generates prediction information corresponding to each partition mode;
In the intra-frame prediction coarse selection pipeline stage, the intra-frame prediction coarse selection module performs intra-frame prediction on each PU block in each partition mode and calculates the corresponding cost, selects one or more intra-frame prediction directions for each PU block according to that cost, and uses the selected intra-frame prediction directions as the prediction information corresponding to the partition mode.
In short, the intra-frame coarse selection module can be part of either the coarse selection module or the precise comparison module, which broadens the application scenarios of the present invention.
In some embodiments, the inter-frame prediction coarse selection module includes a coarse search module, a reference frame data loading module, a fine search module, and a fractional pixel search module;
the coarse selection pipeline step includes a coarse search pipeline stage, a reference frame data loading pipeline stage, a fine search pipeline stage, and a fractional pixel search pipeline stage;
The method includes:
As shown in Figure 10, in the coarse search pipeline stage the coarse search module first, in step S201, selects a frame from the reference array and selects a reference frame from that frame's original frame or reconstructed frame; then, in step S202, downsamples the reference frame and the current CTU block; and then, in step S203, finds in the downsampled reference frame the pixel position with the smallest cost relative to the downsampled CTU block and calculates the coarse search vector of that pixel position relative to the current CTU block.
In the reference frame data loading pipeline stage, the reference frame data loading module obtains, through the overall control module, the coarse search vector from the coarse search pipeline stage; derives from the motion vectors around the CTU block one or more predicted motion vectors that serve the same function as the coarse search vector; loads reference frame data according to the coarse search vector and the one or more predicted motion vectors; and passes the data to the fine search pipeline stage through the overall control module;
As shown in Figure 11, in the fine search pipeline stage the fine search module first, in step S301, sets a fine search area in the reconstructed image of the reference frame for each PU block according to the coarse search vector; then, in step S302, generates within that fine search area the fine search vector with the smallest cost for the PU block; it also generates, from the motion vector information around the current CTU block, one or more predicted motion vectors that serve the same function as the coarse search vector and generates fine search vectors from those predicted motion vectors; and it sends all generated fine search vectors to the fractional pixel search module;
As shown in Figure 12, in the fractional pixel search pipeline stage the fractional pixel search module first, in step S401, sets a corresponding fractional pixel search area in the reference frame for each PU block according to each received fine search vector; then, in step S402, generates within that fractional pixel search area the fractional pixel search vector with the smallest cost for the PU block.
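The two ends of the search chain above, the downsampled coarse search (steps S202 and S203) and the fractional pixel search (steps S401 and S402), can be sketched as follows. Several simplifications are assumed: 2x downsampling by averaging, a SAD cost, an exhaustive coarse search over the whole downsampled frame, even block coordinates, and half-pixel samples obtained by bilinear averaging. H.265 itself interpolates fractional luma samples with longer filters and supports quarter-pixel vectors, so this is an illustration of the idea, not of the standard's interpolation. All function names are hypothetical.

```python
def downsample2(img):
    """Halve width and height by averaging each 2x2 group (assumed filter)."""
    return [[(img[y][x] + img[y][x + 1] + img[y + 1][x] + img[y + 1][x + 1] + 2) // 4
             for x in range(0, len(img[0]) - 1, 2)]
            for y in range(0, len(img) - 1, 2)]


def sad(block, ref, rx, ry):
    """Sum of absolute differences against the block of `ref` at (rx, ry)."""
    return sum(abs(block[y][x] - ref[ry + y][rx + x])
               for y in range(len(block)) for x in range(len(block[0])))


def coarse_search(cur_block, block_pos, ref_frame):
    """Search the downsampled reference exhaustively and return the best
    vector scaled back to full-resolution pixels (block_pos assumed even)."""
    small_block = downsample2(cur_block)
    small_ref = downsample2(ref_frame)
    h, w = len(small_block), len(small_block[0])
    candidates = [(sad(small_block, small_ref, rx, ry), ry, rx)
                  for ry in range(len(small_ref) - h + 1)
                  for rx in range(len(small_ref[0]) - w + 1)]
    _, ry, rx = min(candidates)
    return (2 * rx - block_pos[0], 2 * ry - block_pos[1])


def sample_half(ref, x2, y2):
    """Bilinear sample at the position (x2 / 2, y2 / 2), given in half pixels."""
    x0, y0 = x2 // 2, y2 // 2
    x1, y1 = x0 + (x2 & 1), y0 + (y2 & 1)
    return (ref[y0][x0] + ref[y0][x1] + ref[y1][x0] + ref[y1][x1] + 2) // 4


def half_pel_refine(cur_pu, pu_pos, ref_frame, fine_mv):
    """Test the nine half-pixel positions around fine_mv.

    Returns (mv, cost) with mv expressed in half-pixel units. fine_mv must
    point at a block that lies inside the reference frame.
    """
    h, w = len(cur_pu), len(cur_pu[0])
    cx2 = 2 * (pu_pos[0] + fine_mv[0])
    cy2 = 2 * (pu_pos[1] + fine_mv[1])
    best = None
    for dy2 in (-1, 0, 1):
        for dx2 in (-1, 0, 1):
            bx2, by2 = cx2 + dx2, cy2 + dy2
            if bx2 < 0 or by2 < 0:
                continue  # candidate starts outside the frame
            if ((bx2 + 2 * w - 1) // 2 >= len(ref_frame[0])
                    or (by2 + 2 * h - 1) // 2 >= len(ref_frame)):
                continue  # candidate ends outside the frame
            cost = sum(abs(cur_pu[y][x]
                           - sample_half(ref_frame, bx2 + 2 * x, by2 + 2 * y))
                       for y in range(h) for x in range(w))
            if best is None or cost < best[0]:
                best = (cost, (bx2 - 2 * pu_pos[0], by2 - 2 * pu_pos[1]))
    return best[1], best[0]
```

Searching on downsampled data shrinks the number of candidate positions and the per-candidate cost, which is why it suits a first, wide-range pass; the later stages then only need to examine a small neighbourhood.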
In some embodiments, the intra-frame prediction coarse selection pipeline stage and the fractional pixel search pipeline stage are the same pipeline stage, and the intra-frame prediction coarse selection module and the fractional pixel search module execute in parallel in that stage. In short, the two stages can be executed in parallel, that is, at the same time, or one after the other, that is, the intra-frame prediction coarse selection pipeline stage first and the fractional pixel search pipeline stage second.
In some embodiments, the intra-frame prediction coarse selection module includes a reference pixel generation module, which executes in the intra-frame prediction coarse selection pipeline stage; the method includes:
The intra-frame prediction coarse selection pipeline stage includes: for each PU block in each partition mode, generating reference pixels from the original pixels of the current frame; predicting all intra-frame prediction directions from the reference pixels according to the rules of the H.265 protocol to obtain a prediction result for each direction; calculating the distortion cost of each direction's prediction result against the original pixels; and sorting the costs in ascending order to select the one or more intra-frame prediction directions with the smallest costs.
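The predict, cost, sort, and select sequence above can be sketched in reduced form. H.265 defines 35 intra prediction modes (planar, DC, and 33 angular directions); for brevity this sketch evaluates only DC, horizontal, and vertical prediction, uses SAD as the distortion cost, and takes the reference pixels as given. The function name and the `keep` parameter are hypothetical.

```python
def intra_coarse_select(block, top, left, keep=2):
    """Rank a reduced set of intra modes for one N x N PU block.

    block -- N x N original pixels of the PU
    top   -- N reference pixels directly above the PU
    left  -- N reference pixels directly to the left of the PU
    keep  -- number of lowest-cost modes to return (hypothetical parameter)
    """
    n = len(block)
    dc = (sum(top) + sum(left) + n) // (2 * n)
    predictions = {
        "DC": [[dc] * n for _ in range(n)],
        "HORIZONTAL": [[left[y]] * n for y in range(n)],
        "VERTICAL": [list(top) for _ in range(n)],
    }
    costs = []
    for mode, pred in predictions.items():
        # Distortion of this mode's prediction against the original pixels.
        cost = sum(abs(block[y][x] - pred[y][x])
                   for y in range(n) for x in range(n))
        costs.append((cost, mode))
    costs.sort()  # ascending cost; ties fall back to the mode name
    return costs[:keep]
```

Returning several candidates rather than a single winner leaves the final decision to the later precise comparison, in line with the "one or more" wording of the text.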
In some embodiments, the multiple modules further include a post-processing module and the multiple pipeline steps further include a post-processing pipeline step. The method includes: in the post-processing pipeline step, the post-processing module generates a reconstructed frame corresponding to the current frame from the least-cost partition mode that the precise comparison module outputs for each CTU block and from the reconstruction information corresponding to it. In other embodiments, the multiple modules further include an entropy coding module and the multiple pipeline steps further include an entropy coding pipeline step. The method includes: in the entropy coding pipeline step, the entropy coding module generates a binary code stream conforming to the H.265 protocol specification from the least-cost partition mode that the precise comparison module outputs for each CTU block and from the entropy coding information corresponding to it.
As shown in Figure 15, the preprocessing module 120 belongs to the first pipeline stage and executes the preprocessing pipeline step. The coarse selection module executes the coarse selection pipeline step and includes a coarse search module 211, a reference frame data loading module 910, a fine search module 213, and a fractional pixel search module 215. Correspondingly, the coarse selection pipeline step includes the coarse search pipeline stage (the second stage), the reference frame data loading pipeline stage (the third stage), the fine search pipeline stage (the fourth stage), and the fractional pixel search pipeline stage (the fifth stage). Preferably, the intra-frame prediction coarse selection module and the fractional pixel search module execute in parallel in the same pipeline stage (that is, both execute in the fifth stage). The precise comparison module 140 executes the precise comparison pipeline step, which is the sixth stage. The entropy coding module 150 and the post-processing module execute the entropy coding pipeline stage and the post-processing pipeline stage respectively, and these two stages execute in parallel in the seventh stage. Data transfer, scheduling, and control for stages one to seven all go through the overall control module 920, so that the encoding process proceeds in an orderly manner, which greatly improves encoding efficiency.
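The seven-stage arrangement can be visualized with a small scheduling sketch. It assumes an idealized pipeline in which the unit of work is one CTU, every stage takes exactly one time slot, and a new CTU enters each slot; the text does not state these timing details, so they are illustrative assumptions, as are the names `STAGES` and `schedule`.

```python
STAGES = [
    "preprocessing",                                     # stage 1
    "coarse search",                                     # stage 2
    "reference frame data loading",                      # stage 3
    "fine search",                                       # stage 4
    "fractional pixel search + intra coarse selection",  # stage 5
    "precise comparison",                                # stage 6
    "entropy coding + post-processing",                  # stage 7
]


def schedule(num_ctus, stages=STAGES):
    """Map each time slot to the (stage, ctu_index) pairs active in it."""
    timeline = {}
    for ctu in range(num_ctus):
        for depth, name in enumerate(stages):
            # CTU `ctu` reaches stage `depth` exactly `depth` slots after entry.
            timeline.setdefault(ctu + depth, []).append((name, ctu))
    return timeline
```

Under these assumptions, N CTUs finish in N + 6 slots instead of 7N, because up to seven CTUs are in flight at once.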
The present invention provides an H.265 encoding method and device. The device includes multiple modules and multiple pipeline steps; each pipeline step includes at least one pipeline stage and is used to execute at least one module. The multiple modules include a preprocessing module, a coarse selection module, a precise comparison module, and an overall control module. The multiple pipeline steps include a preprocessing pipeline step, a coarse selection pipeline step, and a precise comparison pipeline step; the coarse selection pipeline step is executed after the preprocessing pipeline step, and the precise comparison pipeline step is executed after the coarse selection pipeline step. The overall control module controls the storing and fetching of original frame data and reference frame data, and controls the preprocessing module, the coarse selection module, and the precise comparison module so that they execute their corresponding pipeline steps in sequence. The invention improves search accuracy by distributing the search across stages, better preserves the details of the reconstructed image, and reduces hardware resource consumption.
It should be noted that, although the above embodiments have been described herein, they do not limit the scope of patent protection of the present invention. Changes and modifications made to the embodiments described herein on the basis of the innovative concept of the present invention, equivalent structures or equivalent process transformations made using the contents of the description and drawings, and direct or indirect application of the above technical solutions in other related technical fields all fall within the scope of patent protection of the present invention.

Claims (42)

  1. An H.265 encoding device, characterized by comprising the following modules: a preprocessing module, a coarse selection module, and a precise comparison module, the preprocessing module being connected to the coarse selection module and the coarse selection module being connected to the precise comparison module; wherein:
    the preprocessing module is configured to divide a current frame of an original video into multiple CTU blocks;
    the coarse selection module is configured to divide each CTU block according to multiple partition modes, each partition mode dividing a CTU block into multiple corresponding CU blocks and dividing each of those CU blocks into one or more corresponding PU blocks; the coarse selection module is further configured to perform inter-frame prediction and intra-frame prediction for each partition mode of each CTU block and to generate prediction information corresponding to each partition mode;
    the precise comparison module is configured to compare the costs of the prediction information corresponding to the partition modes of each CTU block, select for each CTU block the partition mode with the smallest cost and the coding information corresponding to that partition mode, and generate, from the selected partition mode and its corresponding coding information, entropy coding information for generating an H.265 code stream from the current frame and reconstruction information for generating a reconstructed frame from the current frame.
  2. The H.265 encoding device according to claim 1, characterized by further comprising an entropy coding module connected to the precise comparison module:
    the entropy coding module is configured to generate an H.265 code stream corresponding to the current frame from the least-cost partition mode corresponding to each CTU block and from the entropy coding information corresponding to the current frame that is generated from the corresponding coding information.
  3. The H.265 encoding device according to claim 2, characterized by comprising a post-processing module connected to the precise comparison module:
    the post-processing module is configured to generate a reconstructed frame corresponding to the current frame from the least-cost partition mode corresponding to each CTU block and from the reconstruction information corresponding to the current frame that is generated from the corresponding coding information.
  4. The H.265 encoding device according to claim 3, characterized in that the post-processing module comprises a deblocking filtering module and a sample adaptive offset module, the deblocking filtering module being connected to the sample adaptive offset module;
    the deblocking filtering module is configured to filter the reconstructed frame using the least-cost partition mode provided by the precise comparison module and the coding information corresponding to it;
    the sample adaptive offset module is configured to perform the SAO calculation on the filtered reconstructed frame and to transmit the calculated data to the entropy coding module.
  5. The H.265 encoding device according to claim 1, characterized in that the coarse selection module comprises an inter-frame prediction coarse selection module and an intra-frame prediction coarse selection module, the inter-frame prediction coarse selection module being connected to the preprocessing module and to the precise comparison module, and the intra-frame prediction coarse selection module being connected to the preprocessing module and to the precise comparison module; wherein:
    the inter-frame prediction coarse selection module is configured to perform inter-frame prediction on each PU block in each partition mode, select, for each PU block, one or more items of reference information obtained from the reference frame whose cost is less than a preset cost value, and use the motion vector of the selected reference PU block as the prediction information corresponding to the partition mode;
    the intra-frame prediction coarse selection module is configured to perform intra-frame prediction on each PU block in each partition mode, select, for each PU block, one or more intra-frame prediction directions whose cost is less than a preset cost value, and use the selected intra-frame prediction directions as the prediction information corresponding to the partition mode.
  6. The H.265 encoding device according to claim 5, characterized in that the intra-frame prediction coarse selection module further comprises a reference pixel generation module;
    the reference pixel generation module is configured to, for each PU block in each partition mode, generate reference pixels from the original pixels of the current frame, predict all intra-frame prediction directions from the reference pixels according to the rules of the H.265 protocol to obtain a prediction result for each direction, calculate the distortion cost of each direction's prediction result against the original pixels, and sort the costs in ascending order to select the one or more intra-frame prediction directions with the smallest costs.
  7. The H.265 encoding device according to claim 5, characterized in that the inter-frame prediction coarse selection module further comprises a coarse search module, a fine search module, and a fractional pixel search module, the coarse search module being connected to the preprocessing module, the coarse search module being connected to the fine search module, and the fine search module being connected to the fractional pixel search module.
  8. The H.265 encoding device according to claim 7, characterized in that
    the coarse search module is configured to select a frame from the reference array, select a reference frame from that frame's original frame or reconstructed frame, downsample the reference frame and the current CTU block, find in the downsampled reference frame the pixel position with the smallest cost relative to the downsampled CTU block, and calculate the coarse search vector of that pixel position relative to the current CTU block.
  9. The H.265 encoding device according to claim 7, characterized in that
    the fine search module is configured to set, according to the coarse search vector, a fine search area in the reconstructed image of the reference frame for each PU block and to generate within that fine search area the fine search vector with the smallest cost for the PU block; to generate, from the motion vector information around the current CTU block, one or more predicted motion vectors that serve the same function as the coarse search vector and to generate fine search vectors from those predicted motion vectors; and to send all generated fine search vectors to the fractional pixel search module.
  10. The H.265 encoding device according to claim 9, characterized in that
    the fractional pixel search module is configured to set, according to each received fine search vector, a corresponding fractional pixel search area in the reference frame for each PU block and to generate within that fractional pixel search area the fractional pixel search vector with the smallest cost for the PU block.
  11. The H.265 encoding device according to claim 1, characterized in that the precise comparison module comprises a distribution module, multiple layered calculation modules, and multiple layered comparison modules, the distribution module being connected to the coarse selection module and the layered comparison modules being connected to the distribution module, wherein:
    the distribution module is configured to distribute, according to each partition mode of each CTU block, each CU block in each partition mode and the prediction information corresponding to that CU block to different layered calculation modules;
    the layered calculation module is configured to calculate multiple items of cost information from the received prediction information corresponding to a CU block, compare them within the layer, and select the prediction mode and partition mode with the smallest cost for that CU block;
    the layered comparison module is configured to compare the minimum costs corresponding to the prediction modes and partition modes selected by the layered calculation modules of different layers, and to select the partition mode with the smallest cost for the CTU block and the corresponding coding information.
  12. An H.265 encoding method, characterized in that the method is applied to an H.265 encoding device, the device comprising the following modules: a preprocessing module, a coarse selection module, and a precise comparison module, the preprocessing module being connected to the coarse selection module and the coarse selection module being connected to the precise comparison module; the method comprises the following steps:
    the preprocessing module divides a current frame of an original video into multiple CTU blocks;
    the coarse selection module divides each CTU block according to multiple partition modes, each partition mode dividing a CTU block into multiple corresponding CU blocks and dividing each of those CU blocks into one or more corresponding PU blocks; and performs inter-frame prediction and intra-frame prediction for each partition mode of each CTU block and generates prediction information corresponding to each partition mode;
    the precise comparison module compares the costs of the prediction information corresponding to the partition modes of each CTU block, selects for each CTU block the partition mode with the smallest cost and the coding information corresponding to that partition mode, and generates, from the selected partition mode and its corresponding coding information, entropy coding information for generating an H.265 code stream from the current frame and reconstruction information for generating a reconstructed frame from the current frame.
  13. 根据权利要求12所述的H.265编码方法,其特征在于,所述装置还包括熵编码模块,所述熵编码模块与精确比较模块连接;所述方法包括以下步骤:The H.265 encoding method according to claim 12, wherein the device further comprises an entropy encoding module, and the entropy encoding module is connected to an accurate comparison module; the method comprises the following steps:
    熵编码模块根据每个CTU块相对应的代价最小的划分模式和根据与其对应的编码信息生成的与当前帧相对应的熵编码信息,来生成与当前帧相对应的H.265码流。The entropy coding module generates the H.265 code stream corresponding to the current frame according to the partition mode with the lowest cost corresponding to each CTU block and the entropy coding information corresponding to the current frame generated according to the corresponding coding information.
  14. The H.265 encoding method according to claim 13, wherein the device comprises a post-processing module connected to the precise comparison module; the method comprises:
    The post-processing module generates the reconstructed frame corresponding to the current frame according to the lowest-cost partition mode of each CTU block and the reconstruction information for the current frame generated from the corresponding coding information.
  15. The H.265 encoding method according to claim 14, wherein the post-processing module comprises a deblocking filter module and a sample adaptive offset module, the deblocking filter module being connected to the sample adaptive offset module; the method comprises:
    The deblocking filter module filters the reconstructed frame using the lowest-cost partition mode provided by the precise comparison module and the corresponding coding information;
    The sample adaptive offset module performs the SAO calculation on the filtered reconstructed frame and transmits the calculated data to the entropy coding module.
  16. The H.265 encoding method according to claim 12, wherein the coarse selection module comprises an inter-frame prediction coarse selection module and an intra-frame prediction coarse selection module, the inter-frame prediction coarse selection module being connected to the preprocessing module and to the precise comparison module, and the intra-frame prediction coarse selection module being connected to the preprocessing module and to the precise comparison module; the method comprises:
    The inter-frame prediction coarse selection module performs inter-frame prediction on each PU block in each partition mode, selects one or more pieces of reference information, obtained from a reference frame, whose cost relative to that PU block is less than a preset cost value, and uses the motion vector of the selected reference PU block as the prediction information corresponding to the partition mode;
    The intra-frame prediction coarse selection module performs intra-frame prediction on each PU block in each partition mode, selects one or more intra-frame prediction directions whose cost relative to that PU block is less than a preset cost value, and uses the selected intra-frame prediction directions as the prediction information corresponding to the partition mode.
  17. The H.265 encoding method according to claim 16, wherein the intra-frame prediction coarse selection module further comprises a reference pixel generation module; the method comprises:
    The reference pixel generation module, for each PU block in each partition mode, generates reference pixels from the original pixels of the current frame, predicts all intra-frame prediction directions from the reference pixels according to the rules of the H.265 protocol to obtain a prediction result for each direction, calculates a distortion cost between each direction's prediction result and the original pixels, and sorts the costs in ascending order to select the one or more intra-frame prediction directions with the lowest costs.
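The ranking step of this claim can be sketched as follows. This is a deliberately reduced illustration, not the H.265 predictor: it evaluates only three simplified predictors (DC, horizontal, vertical) instead of all H.265 intra prediction directions, and it uses the sum of absolute differences (SAD) as the distortion cost, which is an assumption because the claim does not name a metric. All function names are hypothetical.

```python
def sad(block_a, block_b):
    """Sum of absolute differences between two equally sized 2-D blocks."""
    return sum(abs(a - b)
               for row_a, row_b in zip(block_a, block_b)
               for a, b in zip(row_a, row_b))


def predict(mode, top, left, n):
    """Build an n x n prediction from the reference pixels (simplified)."""
    if mode == "vertical":      # copy the row above downwards
        return [list(top[:n]) for _ in range(n)]
    if mode == "horizontal":    # copy the left column rightwards
        return [[left[r]] * n for r in range(n)]
    if mode == "dc":            # rounded mean of the reference pixels
        dc = (sum(top[:n]) + sum(left[:n]) + n) // (2 * n)
        return [[dc] * n for _ in range(n)]
    raise ValueError(f"unknown mode: {mode}")


def rank_intra_modes(block, top, left, keep=2):
    """Return the `keep` lowest-cost modes, cheapest first."""
    n = len(block)
    costs = [(sad(block, predict(mode, top, left, n)), mode)
             for mode in ("dc", "horizontal", "vertical")]
    costs.sort()                # ascending cost, as in the claim
    return [mode for _, mode in costs[:keep]]


# A block whose rows all repeat the reference row above it.
top = [10, 20, 30, 40]
left = [10, 10, 10, 10]
block = [[10, 20, 30, 40] for _ in range(4)]
ranked = rank_intra_modes(block, top, left, keep=2)
```

Because every row of the example block repeats the row above it, the vertical predictor reproduces it exactly and ranks first.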
  18. The H.265 encoding method according to claim 16, wherein the inter-frame prediction coarse selection module further comprises a coarse search module, a fine search module, and a fractional pixel search module, the coarse search module being connected to the preprocessing module, the coarse search module being connected to the fine search module, and the fine search module being connected to the fractional pixel search module.
  19. The H.265 encoding method according to claim 18, wherein the method comprises:
    The coarse search module selects a frame from the reference array, selects a reference frame from its original frame or its reconstructed frame, downsamples the reference frame and the current CTU block, finds in the downsampled reference frame the pixel position with the lowest cost relative to the downsampled CTU block, and calculates a coarse search vector of that pixel position relative to the current CTU block.
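The coarse search described above can be sketched under a few stated assumptions: downsampling is done by plain decimation (keeping every second pixel), the cost is the sum of absolute differences, and the whole downsampled reference frame is searched exhaustively. None of these choices is specified by the claim, and the function names are hypothetical.

```python
def downsample(img, factor=2):
    """Decimate a 2-D image by keeping every `factor`-th row and column."""
    return [row[::factor] for row in img[::factor]]


def sad_at(ref, blk, y, x):
    """SAD between `blk` and the same-sized window of `ref` at (y, x)."""
    return sum(abs(ref[y + r][x + c] - blk[r][c])
               for r in range(len(blk))
               for c in range(len(blk[0])))


def coarse_search(ref, block, cur_y, cur_x, factor=2):
    """Return ((dy, dx), cost): the full-resolution coarse vector from the
    block's own position (cur_y, cur_x) to the best downsampled match."""
    ds_ref = downsample(ref, factor)
    ds_blk = downsample(block, factor)
    h, w = len(ds_blk), len(ds_blk[0])
    best = None
    for y in range(len(ds_ref) - h + 1):
        for x in range(len(ds_ref[0]) - w + 1):
            cost = sad_at(ds_ref, ds_blk, y, x)
            if best is None or cost < best[0]:
                best = (cost, y, x)
    cost, y, x = best
    # Scale the downsampled position back to full resolution.
    return (y * factor - cur_y, x * factor - cur_x), cost


# Reference frame with a unique value per pixel; the current block is
# a copy of the 8x8 region whose top-left corner is at row 4, column 6.
ref = [[y * 16 + x for x in range(16)] for y in range(16)]
block = [row[6:14] for row in ref[4:12]]
vector, cost = coarse_search(ref, block, cur_y=2, cur_x=2)
```

The vector is only accurate to the downsampling factor, which is why the later fine search and fractional pixel search refine it.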
  20. The H.265 encoding method according to claim 18, wherein the method comprises:
    The fine search module, according to the coarse search vector, sets a fine search area in the reconstructed image of the reference frame for each PU block and generates in that fine search area the lowest-cost fine search vector for the PU block; generates, from the motion vector information around the current CTU block, one or more predicted motion vectors that serve the same function as the coarse search vector, and generates fine search vectors from the predicted motion vectors; and sends all generated fine search vectors to the fractional pixel search module.
  21. The H.265 encoding method according to claim 20, wherein the method comprises:
    The fractional pixel search module, according to each received fine search vector, sets a corresponding fractional pixel search area in the reference frame for each PU block and generates in that fractional pixel search area the lowest-cost fractional pixel search vector for the PU block.
  22. The H.265 encoding method according to claim 12, wherein the precise comparison module comprises a distribution module, a plurality of layered calculation modules, and a plurality of layered comparison modules, the distribution module being connected to the coarse selection module and the layered comparison modules being connected to the distribution module; the method comprises:
    The distribution module, according to each partition mode of each CTU block, distributes each CU block in each partition mode, together with the prediction information corresponding to that CU block, to different layered calculation modules;
    Each layered calculation module calculates a plurality of pieces of cost information from the received prediction information corresponding to a CU block, compares them within its layer, and selects the lowest-cost prediction mode and partition mode for that CU block;
    The layered comparison modules compare the minimum costs corresponding to the prediction modes and partition modes selected by the layered calculation modules of different layers, and select the lowest-cost partition mode for the CTU block and the corresponding coding information.
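One way to read the cross-layer comparison is as a bottom-up quadtree decision: a CU block stays whole when its own minimum cost is no higher than the summed minimum costs of its four sub-blocks, and is split otherwise. The sketch below follows that reading, which is an assumption; the claim states only that minimum costs from different layers are compared. The dictionary layout and function names are hypothetical.

```python
def leaf(cost):
    """A CU block with no further subdivision (hypothetical layout)."""
    return {"cost": cost, "children": []}


def best_partition(node):
    """Return (cost, decision) for one CU block.

    decision is "keep" when the block stays whole, or a list holding
    the decisions of its sub-blocks when splitting is cheaper.
    """
    own_cost = node["cost"]     # in-layer minimum, assumed precomputed
    if not node["children"]:
        return own_cost, "keep"
    results = [best_partition(child) for child in node["children"]]
    split_cost = sum(cost for cost, _ in results)
    if split_cost < own_cost:   # the deeper layer wins
        return split_cost, [decision for _, decision in results]
    return own_cost, "keep"


ctu = {
    "cost": 100,
    "children": [
        leaf(20), leaf(20), leaf(20),
        {"cost": 30, "children": [leaf(5), leaf(5), leaf(5), leaf(5)]},
    ],
}
total_cost, decision = best_partition(ctu)
```

In hardware the layers are evaluated by separate modules rather than by recursion; the recursion here only keeps the example short.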
  23. An H.265 encoding device, comprising a plurality of modules and a plurality of pipeline steps, each pipeline step comprising at least one pipeline stage for executing at least one module, wherein:
    The plurality of modules comprises a preprocessing module, a coarse selection module, a precise comparison module, and an overall control module, the overall control module being connected to the preprocessing module, the coarse selection module, and the precise comparison module respectively;
    The plurality of pipeline steps comprises a preprocessing pipeline step, a coarse selection pipeline step, and a precise comparison pipeline step, the coarse selection pipeline step being executed after the preprocessing pipeline step and the precise comparison pipeline step being executed after the coarse selection pipeline step;
    The preprocessing pipeline step, through the preprocessing module, divides a current frame of an original video into a plurality of CTU blocks;
    The coarse selection pipeline step, through the coarse selection module, partitions each CTU block according to a plurality of partition modes, performs inter-frame prediction coarse selection and intra-frame prediction coarse selection on each partition mode of each CTU block, and generates prediction information corresponding to each partition mode;
    The precise comparison pipeline step, through the precise comparison module, calculates and compares the costs of the prediction information corresponding to the partition modes of each CTU block, selects for each CTU block the partition mode with the lowest cost and the coding information corresponding to that partition mode, and, based on the selected partition mode and its corresponding coding information, generates entropy coding information for generating an H.265 bitstream from the current frame and reconstruction information for generating a reconstructed frame from the current frame;
    The overall control module is configured to control the storage and retrieval of original frame data and reference frame data, and to control the preprocessing module, the coarse selection module, and the precise comparison module to execute their corresponding pipeline steps in sequence.
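The ordering of the three pipeline steps can be pictured with a small scheduling sketch. It assumes the pipeline advances one CTU block per time slot, so that different CTU blocks occupy different steps at the same time; the claim fixes only the order of the steps, so this granularity and the names used are assumptions.

```python
STEPS = ("preprocess", "coarse_select", "precise_compare")


def schedule(num_ctus, steps=STEPS):
    """Return, for each time slot, a dict mapping step name to the
    index of the CTU block it is working on in that slot."""
    slots = []
    for t in range(num_ctus + len(steps) - 1):
        active = {}
        for depth, step in enumerate(steps):
            ctu = t - depth     # step `depth` lags the first step by `depth` slots
            if 0 <= ctu < num_ctus:
                active[step] = ctu
        slots.append(active)
    return slots


timeline = schedule(3)
for slot, active in enumerate(timeline):
    print(slot, active)
```

Each CTU block passes through the steps in the claimed order, and once the pipeline is full all three modules are busy in the same slot.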
  24. The H.265 encoding device according to claim 23, wherein:
    The coarse selection module comprises an inter-frame prediction coarse selection module and an intra-frame prediction coarse selection module; the coarse selection pipeline step comprises an inter-frame prediction coarse selection pipeline stage and an intra-frame prediction coarse selection pipeline stage;
    The inter-frame prediction coarse selection pipeline stage, through the inter-frame prediction coarse selection module, partitions each CTU block according to a plurality of partition modes, each partition mode dividing a CTU block into a corresponding plurality of CU blocks and dividing each CU block into one or more corresponding PU blocks, performs inter-frame prediction on each partition mode of each CTU block and obtains reference frame information, and performs intra-frame prediction on each partition mode of each CTU block and generates prediction information corresponding to each partition mode;
    The intra-frame prediction coarse selection pipeline stage, through the intra-frame prediction coarse selection module, performs intra-frame prediction on each PU block in each partition mode and calculates the corresponding costs, selects one or more intra-frame prediction directions for each PU block according to the costs, and uses the selected intra-frame prediction directions as the prediction information corresponding to the partition mode.
  25. The H.265 encoding device according to claim 23, wherein:
    The coarse selection module further comprises an inter-frame coarse selection module, and the precise comparison module further comprises an intra-frame coarse selection module;
    The coarse selection pipeline step comprises an inter-frame coarse selection pipeline stage, and the precise comparison pipeline step comprises an intra-frame coarse selection pipeline stage;
    The inter-frame prediction coarse selection pipeline stage, through the inter-frame prediction coarse selection module, partitions each CTU block according to a plurality of partition modes, each partition mode dividing a CTU block into a corresponding plurality of CU blocks and dividing each CU block into one or more corresponding PU blocks, performs inter-frame prediction on each partition mode of each CTU block and obtains reference frame information, and performs intra-frame prediction on each partition mode of each CTU block and generates prediction information corresponding to each partition mode;
    The intra-frame prediction coarse selection pipeline stage, through the intra-frame prediction coarse selection module, performs intra-frame prediction on each PU block in each partition mode and calculates the corresponding costs, selects one or more intra-frame prediction directions for each PU block according to the costs, and uses the selected intra-frame prediction directions as the prediction information corresponding to the partition mode.
  26. The H.265 encoding device according to claim 24 or 25, wherein:
    The inter-frame prediction coarse selection module comprises a coarse search module, a reference frame data loading module, a fine search module, and a fractional pixel search module;
    The coarse selection pipeline step comprises a coarse search pipeline stage, a reference frame data loading pipeline stage, a fine search pipeline stage, and a fractional pixel search pipeline stage;
    The coarse search pipeline stage, through the coarse search module, selects a frame from the reference array, selects a reference frame from its original frame or its reconstructed frame, downsamples the reference frame and the current CTU block, finds in the downsampled reference frame the pixel position with the lowest cost relative to the downsampled CTU block, and calculates a coarse search vector of that pixel position relative to the current CTU block;
    The reference frame data loading pipeline stage, through the reference frame data loading module, obtains the coarse search vector of the coarse search pipeline stage via the overall control module, obtains, from the motion vectors around the CTU block, one or more predicted motion vectors that serve the same function as the coarse search vector, loads reference frame data according to the coarse search vector and the one or more predicted motion vectors, and passes the data to the fine search pipeline stage via the overall control module;
    The fine search pipeline stage, through the fine search module, sets, according to the coarse search vector, a fine search area in the reconstructed image of the reference frame for each PU block and generates in that fine search area the lowest-cost fine search vector for the PU block; generates, from the motion vector information around the current CTU block, one or more predicted motion vectors that serve the same function as the coarse search vector, and generates fine search vectors from the predicted motion vectors; and sends all generated fine search vectors to the fractional pixel search module;
    The fractional pixel search pipeline stage, through the fractional pixel search module, sets, according to each received fine search vector, a corresponding fractional pixel search area in the reference frame for each PU block and generates in that fractional pixel search area the lowest-cost fractional pixel search vector for the PU block.
  27. The H.265 encoding device according to claim 26, wherein the intra-frame prediction coarse selection pipeline stage and the fractional pixel search pipeline stage are the same pipeline stage, and the intra-frame prediction coarse selection module and the fractional pixel search module execute in parallel in that same pipeline stage.
  28. The H.265 encoding device according to claim 27, wherein:
    The intra-frame prediction coarse selection module comprises a reference pixel generation module, which executes in the intra-frame prediction coarse selection pipeline stage;
    The intra-frame prediction coarse selection pipeline stage comprises: for each PU block in each partition mode, generating reference pixels from the original pixels of the current frame, predicting all intra-frame prediction directions from the reference pixels according to the rules of the H.265 protocol to obtain a prediction result for each direction, calculating a distortion cost between each direction's prediction result and the original pixels, and sorting the costs in ascending order to select the one or more intra-frame prediction directions with the lowest costs.
  29. The H.265 encoding device according to claim 26, wherein the intra-frame prediction coarse selection pipeline stage and the fractional pixel search pipeline stage are different pipeline stages, and the intra-frame prediction coarse selection module executes in a pipeline stage after that of the fractional pixel search module.
  30. The H.265 encoding device according to claim 29, wherein:
    The intra-frame prediction coarse selection module comprises a reference pixel generation module, which executes in the intra-frame prediction coarse selection pipeline stage;
    The reference pixel generation module is configured to, for each PU block in each partition mode, generate reference pixels from the reconstructed pixels of the current frame, predict all intra-frame prediction directions from the reference pixels according to the rules of the H.265 protocol to obtain a prediction result for each direction, calculate a distortion cost between each direction's prediction result and the original pixels, and sort the costs in ascending order to select the one or more intra-frame prediction directions with the lowest costs.
  31. The H.265 encoding device according to claim 23, wherein the plurality of modules further comprises a post-processing module and the plurality of pipeline steps further comprises a post-processing pipeline step,
    The post-processing pipeline step, through the post-processing module, generates the reconstructed frame corresponding to the current frame from the lowest-cost partition mode of each CTU block output by the precise comparison module and the corresponding reconstruction information.
  32. The H.265 encoding device according to claim 23, wherein the plurality of modules further comprises an entropy coding module and the plurality of pipeline steps further comprises an entropy coding pipeline step,
    The entropy coding pipeline step, through the entropy coding module, generates a binary bitstream conforming to the H.265 protocol specification from the lowest-cost partition mode of each CTU block output by the precise comparison module and the corresponding entropy coding information.
  33. An H.265 encoding method, wherein the method is applied to an H.265 encoding device, the device comprising a plurality of modules and a plurality of pipeline steps, each pipeline step comprising at least one pipeline stage for executing at least one module, wherein:
    The plurality of modules comprises a preprocessing module, a coarse selection module, a precise comparison module, and an overall control module, the overall control module being connected to the preprocessing module, the coarse selection module, and the precise comparison module respectively;
    The plurality of pipeline steps comprises a preprocessing pipeline step, a coarse selection pipeline step, and a precise comparison pipeline step, the coarse selection pipeline step being executed after the preprocessing pipeline step and the precise comparison pipeline step being executed after the coarse selection pipeline step;
    The method comprises the following steps:
    The preprocessing pipeline step, through the preprocessing module, divides a current frame of an original video into a plurality of CTU blocks;
    The coarse selection pipeline step, through the coarse selection module, partitions each CTU block according to a plurality of partition modes, performs inter-frame prediction coarse selection and intra-frame prediction coarse selection on each partition mode of each CTU block, and generates prediction information corresponding to each partition mode;
    The precise comparison pipeline step, through the precise comparison module, calculates and compares the costs of the prediction information corresponding to the partition modes of each CTU block, selects for each CTU block the partition mode with the lowest cost and the coding information corresponding to that partition mode, and, based on the selected partition mode and its corresponding coding information, generates entropy coding information for generating an H.265 bitstream from the current frame and reconstruction information for generating a reconstructed frame from the current frame;
    The overall control module controls the storage and retrieval of original frame data and reference frame data, and controls the preprocessing module, the coarse selection module, and the precise comparison module to execute their corresponding pipeline steps in sequence.
  34. The H.265 encoding method according to claim 33, wherein:
    The coarse selection module comprises an inter-frame prediction coarse selection module and an intra-frame prediction coarse selection module; the coarse selection pipeline step comprises an inter-frame prediction coarse selection pipeline stage and an intra-frame prediction coarse selection pipeline stage;
    The method further comprises:
    The inter-frame prediction coarse selection pipeline stage, through the inter-frame prediction coarse selection module, partitions each CTU block according to a plurality of partition modes, each partition mode dividing a CTU block into a corresponding plurality of CU blocks and dividing each CU block into one or more corresponding PU blocks, performs inter-frame prediction on each partition mode of each CTU block and obtains reference frame information, and performs intra-frame prediction on each partition mode of each CTU block and generates prediction information corresponding to each partition mode;
    The intra-frame prediction coarse selection pipeline stage, through the intra-frame prediction coarse selection module, performs intra-frame prediction on each PU block in each partition mode and calculates the corresponding costs, selects one or more intra-frame prediction directions for each PU block according to the costs, and uses the selected intra-frame prediction directions as the prediction information corresponding to the partition mode.
  35. The H.265 encoding method according to claim 33, wherein:
    The coarse selection module further comprises an inter-frame coarse selection module, and the precise comparison module further comprises an intra-frame coarse selection module;
    The coarse selection pipeline step comprises an inter-frame coarse selection pipeline stage, and the precise comparison pipeline step comprises an intra-frame coarse selection pipeline stage;
    The method comprises:
    The inter-frame prediction coarse selection pipeline stage, through the inter-frame prediction coarse selection module, partitions each CTU block according to a plurality of partition modes, each partition mode dividing a CTU block into a corresponding plurality of CU blocks and dividing each CU block into one or more corresponding PU blocks, performs inter-frame prediction on each partition mode of each CTU block and obtains reference frame information, and performs intra-frame prediction on each partition mode of each CTU block and generates prediction information corresponding to each partition mode;
    The intra-frame prediction coarse selection pipeline stage, through the intra-frame prediction coarse selection module, performs intra-frame prediction on each PU block in each partition mode and calculates the corresponding costs, selects one or more intra-frame prediction directions for each PU block according to the costs, and uses the selected intra-frame prediction directions as the prediction information corresponding to the partition mode.
  36. The H.265 encoding method according to claim 34 or 35, wherein:
    The inter-frame prediction coarse selection module comprises a coarse search module, a reference frame data loading module, a fine search module, and a fractional pixel search module;
    The coarse selection pipeline step comprises a coarse search pipeline stage, a reference frame data loading pipeline stage, a fine search pipeline stage, and a fractional pixel search pipeline stage;
    The method comprises:
    The coarse search pipeline stage, through the coarse search module, selects a frame from the reference array, selects a reference frame from its original frame or its reconstructed frame, downsamples the reference frame and the current CTU block, finds in the downsampled reference frame the pixel position with the lowest cost relative to the downsampled CTU block, and calculates a coarse search vector of that pixel position relative to the current CTU block;
    The reference frame data loading pipeline stage, through the reference frame data loading module, obtains the coarse search vector of the coarse search pipeline stage via the overall control module, obtains, from the motion vectors around the CTU block, one or more predicted motion vectors that serve the same function as the coarse search vector, loads reference frame data according to the coarse search vector and the one or more predicted motion vectors, and passes the data to the fine search pipeline stage via the overall control module;
    The fine search pipeline stage, through the fine search module, sets, according to the coarse search vector, a fine search area in the reconstructed image of the reference frame for each PU block and generates in that fine search area the lowest-cost fine search vector for the PU block; generates, from the motion vector information around the current CTU block, one or more predicted motion vectors that serve the same function as the coarse search vector, and generates fine search vectors from the predicted motion vectors; and sends all generated fine search vectors to the fractional pixel search module;
    The fractional pixel search pipeline stage, through the fractional pixel search module, sets, according to each received fine search vector, a corresponding fractional pixel search area in the reference frame for each PU block and generates in that fractional pixel search area the lowest-cost fractional pixel search vector for the PU block.
  37. The H.265 encoding method according to claim 36, wherein the intra-frame prediction coarse selection pipeline stage and the fractional pixel search pipeline stage are the same pipeline stage, and the intra-frame prediction coarse selection module and the fractional pixel search module execute in parallel in that same pipeline stage.
  38. The H.265 encoding method according to claim 37, wherein:
    the intra prediction coarse selection module comprises a reference pixel generation module, which executes in the intra prediction coarse selection pipeline stage;
    the method comprises:
    in the intra prediction coarse selection pipeline stage: for each PU block in each partition mode, generating reference pixels from the original pixels of the current frame; predicting every intra prediction direction from the reference pixels according to the rules of the H.265 protocol to obtain a prediction result for each direction; computing a distortion cost between each direction's prediction result and the original pixels; and sorting the costs in ascending order to select the one or more intra prediction directions with the lowest cost.
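The coarse selection in claim 38 amounts to "predict every candidate mode, score it, sort, keep the cheapest few". The Python sketch below illustrates that flow under loud simplifications: it evaluates only three modes (DC, vertical, horizontal) instead of the 35 intra modes defined by H.265, it uses SAD as the distortion cost (the claim does not name one), it skips reference sample filtering and boundary substitution, and it assumes the block is not on the frame border. The names `reference_pixels`, `predict`, and `coarse_select` are hypothetical.

```python
def reference_pixels(frame, bx, by, size):
    """Row above and column left of the block, read from `frame`.

    In claim 38 `frame` holds the original pixels of the current frame.
    Assumes bx >= 1 and by >= 1 so both neighbours exist.
    """
    top = [frame[by - 1][bx + x] for x in range(size)]
    left = [frame[by + y][bx - 1] for y in range(size)]
    return top, left


def predict(mode, top, left, size):
    """Build a size x size prediction block for one simplified intra mode."""
    if mode == "DC":
        dc = (sum(top) + sum(left) + size) // (2 * size)  # rounded mean of the neighbours
        return [[dc] * size for _ in range(size)]
    if mode == "VERTICAL":
        return [list(top) for _ in range(size)]  # copy the top row downwards
    if mode == "HORIZONTAL":
        return [[left[y]] * size for y in range(size)]  # copy the left column rightwards
    raise ValueError(f"unsupported mode: {mode}")


def coarse_select(frame, bx, by, size, keep=2):
    """Return the `keep` cheapest (cost, mode) pairs, sorted by ascending cost."""
    top, left = reference_pixels(frame, bx, by, size)
    costs = []
    for mode in ("DC", "VERTICAL", "HORIZONTAL"):
        pred = predict(mode, top, left, size)
        cost = sum(
            abs(frame[by + y][bx + x] - pred[y][x])
            for y in range(size)
            for x in range(size)
        )
        costs.append((cost, mode))
    costs.sort()
    return costs[:keep]
```

Reading the neighbours from the original frame, as here, removes the dependency on reconstruction of neighbouring blocks; the reconstructed-pixel variant would pass a reconstructed frame to `reference_pixels` while still measuring cost against the original pixels.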
  39. The H.265 encoding method according to claim 36, wherein the intra prediction coarse selection pipeline stage and the fractional pixel search pipeline stage are different pipeline stages, and the intra prediction coarse selection module executes in a pipeline stage after that of the fractional pixel search module.
  40. The H.265 encoding method according to claim 39, wherein:
    the intra prediction coarse selection module comprises a reference pixel generation module, which executes in the intra prediction coarse selection pipeline stage;
    the method comprises:
    the reference pixel generation module, for each PU block in each partition mode, generating reference pixels from the reconstructed pixels of the current frame; predicting every intra prediction direction from the reference pixels according to the rules of the H.265 protocol to obtain a prediction result for each direction; computing a distortion cost between each direction's prediction result and the original pixels; and sorting the costs in ascending order to select the one or more intra prediction directions with the lowest cost.
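Claims 37 and 39 differ only in where intra prediction coarse selection sits in the pipeline. The Python sketch below tabulates which CTU each module would work on in each time slot for the two layouts. It is an illustration only: the stage order is inferred from the data flow in claim 36, one CTU per stage per slot is an assumption, and the identifiers (`schedule`, `SHARED_STAGE`, `SEPARATE_STAGE`, and the short module names) are invented here.

```python
def schedule(stages, num_ctus):
    """For each time slot, map every module to the CTU index it processes (None when idle).

    `stages` is a list of pipeline stages; each stage is a list of module
    names that run in parallel within that stage.
    """
    slots = []
    for t in range(num_ctus + len(stages) - 1):
        slot = {}
        for depth, modules in enumerate(stages):
            ctu = t - depth  # a CTU reaches stage `depth` after `depth` slots
            for name in modules:
                slot[name] = ctu if 0 <= ctu < num_ctus else None
        slots.append(slot)
    return slots


# Claim 37: intra coarse selection shares a stage with fractional pixel search.
SHARED_STAGE = [
    ["coarse_search"],
    ["ref_load"],
    ["fine_search"],
    ["frac_search", "intra_coarse"],
]

# Claim 39: intra coarse selection runs one stage after fractional pixel search.
SEPARATE_STAGE = [
    ["coarse_search"],
    ["ref_load"],
    ["fine_search"],
    ["frac_search"],
    ["intra_coarse"],
]
```

In the shared layout both modules always hold the same CTU index; in the separate layout `intra_coarse` lags `frac_search` by one CTU, at the price of one extra slot of pipeline latency.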
  41. The H.265 encoding method according to claim 33, wherein the plurality of modules further comprises a post-processing module, and the plurality of pipeline steps further comprises a post-processing pipeline step,
    the method comprises:
    in the post-processing pipeline step, via the post-processing module, generating a reconstructed frame corresponding to the current frame from the lowest-cost partition mode that the precise comparison module outputs for each CTU block and from the reconstruction information corresponding to that partition mode.
  42. The H.265 encoding method according to claim 33, wherein the plurality of modules further comprises an entropy encoding module, and the plurality of pipeline steps further comprises an entropy encoding pipeline step;
    the method comprises: in the entropy encoding pipeline step, via the entropy encoding module, generating a binary bitstream that conforms to the H.265 protocol specification from the lowest-cost partition mode that the precise comparison module outputs for each CTU block and from the entropy coding information corresponding to that partition mode.
PCT/CN2020/084093 2018-04-11 2020-04-10 H.265 encoding method and apparatus WO2020207451A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US17/603,002 US11956452B2 (en) 2018-04-11 2020-04-10 System and method for H.265 encoding

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN2019082189 2019-04-11
CNPCT/CN2019/082189 2019-04-11

Publications (1)

Publication Number Publication Date
WO2020207451A1 (en) 2020-10-15

Family

ID=72752207

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/084093 WO2020207451A1 (en) 2018-04-11 2020-04-10 H.265 encoding method and apparatus

Country Status (1)

Country Link
WO (1) WO2020207451A1 (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101605256A (en) * 2008-06-12 2009-12-16 华为技术有限公司 Method and device for video encoding and decoding
CN104602019A (en) * 2014-12-31 2015-05-06 乐视网信息技术(北京)股份有限公司 Video coding method and device
WO2016083833A1 (en) * 2014-11-27 2016-06-02 British Broadcasting Corporation Video encoding and decoding with hierarchical coding tree
US20180084284A1 (en) * 2016-09-22 2018-03-22 Canon Kabushiki Kaisha Method, apparatus and system for encoding and decoding video data
CN108924551A (en) * 2018-08-29 2018-11-30 腾讯科技(深圳)有限公司 The prediction technique and relevant device of video image coding pattern
CN110365988A (en) * 2018-04-11 2019-10-22 福州瑞芯微电子股份有限公司 A kind of H.265 coding method and device
CN110971896A (en) * 2018-09-28 2020-04-07 福州瑞芯微电子股份有限公司 H.265 coding method and device

Similar Documents

Publication Publication Date Title
US11616960B2 (en) Machine learning video processing systems and methods
RU2708347C1 (en) Image encoding method and device and image decoding method and device
CN103327325B (en) The quick self-adapted system of selection of intra prediction mode based on HEVC standard
KR101941955B1 (en) Recursive block partitioning
CN105898325A (en) Method and device for deriving sub-candidate set of motion vector prediction
CN110365988B (en) H.265 coding method and device
CN109660799A (en) Method for estimating, device, electronic equipment and storage medium in Video coding
CN103313058B (en) The HEVC Video coding multimode optimization method realized for chip and system
CN106454349A (en) Motion estimation block matching method based on H.265 video coding
CN104837019A (en) AVS-to-HEVC optimal video transcoding method based on support vector machine
Ma et al. Residual-based video restoration for HEVC intra coding
CN105245896A (en) HEVC (High Efficiency Video Coding) parallel motion compensation method and device
CN110971896B (en) H.265 coding method and device
Katayama et al. Low-complexity intra coding algorithm based on convolutional neural network for HEVC
JPWO2020054060A1 (en) Video coding method and video coding device
CN108881908B (en) Fast Blocking Based on Coding Unit Texture Complexity in Video Coding
Chen et al. CNN-optimized image compression with uncertainty based resource allocation
WO2020207451A1 (en) H.265 encoding method and apparatus
WO2023245460A1 (en) Neural network codec with hybrid entropy model and flexible quantization
Man et al. Content-Aware Dynamic In-loop Filter with Adjustable Complexity for VVC Intra Coding
CN101472174A (en) Method and device for recuperating original image data in video decoder
CN112770115B (en) Rapid intra-frame prediction mode decision method based on directional gradient statistical characteristics
US11956452B2 (en) System and method for H.265 encoding
US20240056601A1 (en) Hierarchical motion search processing
CN114222136A (en) Motion compensation processing method, encoder, decoder and storage medium

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application (Ref document number: 20787941; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 EP: PCT application non-entry in European phase (Ref document number: 20787941; Country of ref document: EP; Kind code of ref document: A1)