Disclosure of Invention
The invention provides an encoding method and a decoding method for multi-component video, which address the problem that existing processing methods can handle only specific types of images and therefore lack generality.
The invention can be realized by the following technical scheme:
a coding method for multi-component video, the number of said components being greater than three, comprising the steps of:
step one, with a GOP image group as a unit, recombining each component of a multi-component video to obtain a single-component video sequence;
and step two, artificially adding two components U, V to the single-component video sequence to convert it into the existing three-component video format, and then coding the converted sequence according to the existing coding method to obtain corresponding coded data.
Further, the multi-component video is uniformly divided into a plurality of multi-component GOP groups of pictures, the same components of all frame pictures in each multi-component GOP group of pictures are combined into one group, and each data group formed by the same component is defined as a single-component video sequence, so that a plurality of single-component video sequences are formed.
Further, the method of obtaining corresponding encoded data comprises the steps of:
step I, artificially adding two components U, V to each data of the single-component video sequence to convert the data into the existing three-component video format;
step II, coding the converted single-component video sequence according to the existing coding method, and marking, in the first I frame image of each GOP image group, the component type of the corresponding component video, to obtain corresponding coded data;
step III, repeating the steps I and II, completing the coding of all the single-component video sequences in a multi-component GOP image group, and obtaining corresponding coded data;
and IV, repeating the step III to finish the coding of all the multi-component GOP image groups and obtain the coded data corresponding to the multi-component video.
Further, the same components U and V are added to every datum, and the value of each of the components U and V is any fixed value from 0 to 255.
Further, the existing encoding method is configured as transform coding, motion estimation and motion compensation, entropy coding, or hybrid coding, and the length of the GOP group of pictures set in the encoding process is the same as that of the multi-component GOP group of pictures.
A decoding method based on the above-described encoding method for multi-component video, comprising the steps of:
step i, decoding the coded data according to the existing decoding method to obtain decoded data in a corresponding three-component video format, and then removing the two artificially added components U, V to obtain a corresponding single-component video sequence;
and step ii, recombining all the single-component video sequences again to obtain the corresponding multi-component video.
Further, the method of obtaining a corresponding single-component video sequence comprises: setting the length of the GOP group of the decoder to be the same as that of the GOP group of the encoder corresponding to the encoded data; decoding the GOP group to be decoded according to the existing decoding method to obtain decoded data in the corresponding three-component video format, the three components of the decoded data being stored at different addresses; removing the two artificially added components U, V; and obtaining the single-component video sequence of the corresponding component type according to the component type of the multi-component video that was marked in the first I-frame image of the GOP group during encoding.
Further, the method of recombining again comprises: judging whether the component type of the first single-component video sequence is the first component of the multi-component video; if so, recombining all the single-component data into the multi-component video by applying the inverse of the recombination operation of step one; if not, deleting the first single-component video sequence and judging whether the component type of the second single-component video sequence is the first component of the multi-component video, and so on until a single-component video sequence whose component type is the first component of the multi-component video is found, and then recombining all the remaining single-component data by applying the inverse of the recombination operation.
The beneficial technical effects of the invention are as follows:
Through recombination, the multi-component video is converted into single-component video sequences, and U, V components are then artificially added so that each single-component video sequence conforms to a three-component video format that existing coding and decoding methods can process. Each single-component video can therefore be coded and decoded, and the decoded multi-component video is finally obtained by applying the inverse of the recombination. The method can thus handle the coding and decoding of different types of multi-component video without regard to the types of the additional components, so it has strong universality. Because only the recombination and its inverse are added, existing encoders and decoders can be used and the calculation process is simple. Since the images in each single-component GOP group belong to the same component type, and the correlation among images of the same component type is strongest, the compression effect during encoding is good; experiments also confirm the good compression effect of the invention.
Detailed Description
The following detailed description of the preferred embodiments will be made with reference to the accompanying drawings.
The invention provides an encoding method for multi-component video, wherein the number of components is more than three, as shown in figure 1, the method specifically comprises the following steps:
step one, with a GOP image group as a unit, recombining each component of a multi-component video to obtain a single-component video sequence.
Specifically, the multi-component video is uniformly divided into a plurality of multi-component GOP groups of pictures, the same components of all frame pictures in each multi-component GOP group of pictures are combined into one group, and each data group composed of the same component is defined as a single-component video sequence, so that a plurality of single-component video sequences are formed.
Assume that the multi-component video has four components, denoted Y, Z, H and X, and that its frame rate is F fps, as shown below:
(Y1Z1H1X1),(Y2Z2H2X2),……,(YnZnHnXn) (1)
where (Y1Z1H1X1) represents the 1st frame of the four-component video sequence, (YnZnHnXn) represents the n-th frame, and Yn, Zn, Hn, Xn represent the n-th frame data of components Y, Z, H, X, respectively. Because the four components lack correlation with one another, they cannot be sent directly to an encoder for encoding and must first be recombined. The data format after recombination is as follows:
Y1,Y2,…Yk,Z1,Z2,…,Zk,H1,H2,…,Hk,X1,X2,…Xk,Yk+1,Yk+2,…Y2k,Zk+1,Zk+2,…,Z2k,Hk+1,Hk+2,…,H2k,Xk+1,Xk+2,…X2k,Y2k+1,Y2k+2,…Y3k,…… (2)
where k denotes the length of a multi-component GOP group of pictures, and each run such as Y1, Y2, …, Yk denotes one single-component video sequence. Each component occurs for k consecutive frames, and the components are then repeated cyclically. The value of k depends on the encoder GOP size. Since the frame rate of the four-component video is F fps, the frame rate of the single-component video sequence is 4F fps.
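The ordering of formula (2) can be illustrated with a short Python sketch. It is not part of the original disclosure: the function name `regroup` and the (component, frame number) labels are hypothetical, and the sketch assumes that the frame count is an exact multiple of k.

```python
from typing import List, Tuple


def regroup(num_frames: int, components: str, k: int) -> List[Tuple[str, int]]:
    """Return the single-component frame order of formula (2).

    Each entry is (component, frame_number); frame numbers start at 1.
    """
    if num_frames % k != 0:
        # Assumption of this sketch: the video divides evenly into GOPs.
        raise ValueError("num_frames must be a multiple of k")
    order = []
    for gop_start in range(1, num_frames + 1, k):      # one multi-component GOP
        for comp in components:                        # cycle through Y, Z, H, X
            for n in range(gop_start, gop_start + k):  # k consecutive frames
                order.append((comp, n))
    return order


order = regroup(num_frames=6, components="YZHX", k=3)
print(" ".join(f"{c}{n}" for c, n in order))
```

With k = 3 and six input frames, the printed order starts Y1 Y2 Y3 Z1 Z2 Z3 and reaches Y4 only after X3, which mirrors the cyclic arrangement described above.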
And step two, artificially adding two components U, V into the single-component video sequence to convert the single-component video sequence into the existing three-component video format, and then coding the single-component video sequence according to the existing coding method to obtain corresponding coded data.
The method comprises the following specific steps:
step I, artificially adding two components U, V to each data of the single-component video sequence, and converting the data into the existing three-component video format.
After the four-component video data is decomposed, the resulting single-component video sequence is to be video encoded, but conventional encoding methods require three components, such as YUV (YCbCr4:2:0, YCbCr4:2:2 and YCbCr4:4:4). To fit the conventional encoding method, U and V components must be added to the single-component video sequence, which is therefore transformed as follows:
(Y1,U1,V1),(Y2,U2,V2),…(Yk,Uk,Vk),(Z1,U1,V1),(Z2,U2,V2),…,(Zk,Uk,Vk),(H1,U1,V1),(H2,U2,V2),…,(Hk,Uk,Vk),(X1,U1,V1),(X2,U2,V2),…(Xk,Uk,Vk),(Yk+1,Uk+1,Vk+1),(Yk+2,Uk+2,Vk+2),…(Y2k,U2k,V2k),(Zk+1,Uk+1,Vk+1),(Zk+2,Uk+2,Vk+2),…,(Z2k,U2k,V2k),(Hk+1,Uk+1,Vk+1),(Hk+2,Uk+2,Vk+2),…,(H2k,U2k,V2k),(Xk+1,Uk+1,Vk+1),(Xk+2,Uk+2,Vk+2),…(X2k,U2k,V2k),(Y2k+1,U2k+1,V2k+1),(Y2k+2,U2k+2,V2k+2),…(Y3k,U3k,V3k),…… (3)
Thus, each single-component video sequence takes the existing three-component video format, such as YUV, where (Y1, U1, V1) represents the 1st frame of the three-component video sequence and (Yk, Uk, Vk) represents the k-th frame. The UV components are added in whichever chroma format the encoder supports (YCbCr4:2:0, YCbCr4:2:2 or YCbCr4:4:4), and their value is any fixed value between 0 and 255. The U and V values are the same for every datum, and the value of U may be equal to or different from the value of V.
Finally, the three-component video sequence of formula (3) is encoded using any existing coding standard (such as MPEG-2, AVS/AVS2, H.264 or H.265), with the encoder parameters set as follows:
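As a minimal sketch of this padding step, the following Python function appends constant U and V planes to one single-component frame. The planar Y, U, V byte layout (as in I420), the 4:2:0 subsampling, 8-bit samples, and the name `pad_to_yuv420` are assumptions made for illustration; the disclosure does not fix a memory layout.

```python
def pad_to_yuv420(plane: bytes, width: int, height: int,
                  u: int = 128, v: int = 128) -> bytes:
    """Append constant U and V planes to a single-component frame.

    Assumed layout: full-resolution plane first, then the U plane, then
    the V plane, each chroma plane at half width and half height.
    """
    if width % 2 or height % 2:
        raise ValueError("4:2:0 needs even width and height")
    if len(plane) != width * height:
        raise ValueError("plane size does not match width * height")
    chroma_samples = (width // 2) * (height // 2)
    return plane + bytes([u]) * chroma_samples + bytes([v]) * chroma_samples


# A tiny 4 x 4 frame of one component, padded with U = V = 128.
component_plane = bytes(range(16))
frame = pad_to_yuv420(component_plane, 4, 4)
print(len(frame))
```

Because the added planes are constant, they carry no information of their own; the original component data is left unchanged in the first plane.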
(1) setting the size of a coding parameter group of pictures (GOP) of an encoder to be k;
(2) describing the component type (Y, Z, H, X) of each GOP in the user data area of the I frame of that GOP, and finally outputting the code stream.
And step II, coding the converted single-component video sequence according to the existing coding method such as transform coding, motion estimation and motion compensation, entropy coding or mixed coding to obtain corresponding coded data.
In order to handle the multi-component video and the artificially constructed three-component video uniformly and to simplify calculation, when encoding with the existing method, the GOP length set in the encoder is the same as the length of the multi-component GOP group of pictures, and the component type of the corresponding component video is marked in the first I frame picture of each GOP group of pictures.
And III, repeating the steps I and II to finish the coding of all the single-component video sequences in a multi-component GOP image group to obtain corresponding coded data.
And IV, repeating the step III to finish the coding of all the multi-component GOP image groups and obtain the coded data corresponding to the multi-component video.
For the above encoded data, the present invention further provides a decoding method for multi-component video, which specifically includes the following steps:
and step i, decoding the coded data according to the existing decoding method to obtain decoded data in a corresponding three-component video format, and then removing two artificially added components U, V to obtain a corresponding single-component video sequence.
First, the length of the GOP group of pictures of the decoder is set to be the same as that of the encoder that produced the encoded data. The GOP group of pictures to be decoded is then decoded according to the existing decoding method to obtain decoded data in the corresponding three-component video format, and the three components of the decoded data are stored at different addresses. The two artificially added components U, V are then removed; for example, if the data at the last two addresses are always the artificially added components U, V, they can be located by address and removed, leaving the remaining component. Finally, a single-component video sequence of the corresponding component type is obtained according to the component type of the multi-component video that was marked in the first I-frame image of the GOP group of pictures during encoding.
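The removal of the added components can be sketched in Python as follows. The sketch assumes that the decoder returns each frame as 8-bit planar data with the real component first and the U and V planes last, in 4:2:0 format; the name `strip_uv` is hypothetical.

```python
def strip_uv(frame: bytes, width: int, height: int) -> bytes:
    """Return only the first (real) plane of a decoded 4:2:0 frame.

    Assumed layout: component plane, then U plane, then V plane, so the
    artificially added data always sits at the last two plane addresses.
    """
    luma_size = width * height
    expected = luma_size + 2 * (width // 2) * (height // 2)
    if len(frame) != expected:
        raise ValueError("frame size does not match a 4:2:0 layout")
    return frame[:luma_size]


# A decoded 4 x 4 frame: 16 component samples followed by 4 + 4 chroma samples.
decoded = bytes(range(16)) + b"\x80" * 8
plane = strip_uv(decoded, 4, 4)
print(len(plane))
```

The component type read from the I frame of the GOP then tells the decoder which of the original components the recovered plane belongs to.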
And step ii, recombining all the single-component data again to obtain the corresponding multi-component video.
First, it is judged whether the component type of the first single-component video sequence is the first component of the multi-component video. If so, all the single-component data are recombined into the multi-component video by applying the inverse of the recombination operation of step one. If not, the first single-component video sequence is deleted and the component type of the second single-component video sequence is judged in the same way, and so on until a single-component video sequence whose component type is the first component of the multi-component video is found; all the remaining single-component data are then recombined by applying the inverse of the recombination operation.
Because each component of the decoded data in the three-component video format is stored in a different address, all the components are recombined again according to the difference of the addresses to form the multi-component video.
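The second recombination, including the deletion of leading sequences, can be sketched as below. The data model (a list of `(component type, frames)` pairs in decoding order) and the name `recombine` are hypothetical, and the sketch additionally assumes that an incomplete group of sequences at the end of the stream is ignored, a case the description does not address.

```python
from typing import Dict, List, Sequence, Tuple

Gop = Tuple[str, List[str]]  # (component type, decoded frames of that GOP)


def recombine(gops: Sequence[Gop], components: str) -> List[Dict[str, str]]:
    """Rebuild multi-component frames from decoded single-component GOPs."""
    first = components[0]
    start = 0
    # Delete leading GOPs until one carries the first component type.
    while start < len(gops) and gops[start][0] != first:
        start += 1
    usable = gops[start:]
    n = len(components)
    video = []
    # Take n consecutive GOPs at a time (assumption: drop a trailing partial set).
    for i in range(0, len(usable) - n + 1, n):
        group = usable[i:i + n]
        if [t for t, _ in group] != list(components):
            raise ValueError("GOPs are not in the expected component order")
        # Frame j of each component together forms multi-component frame j.
        for per_frame in zip(*(frames for _, frames in group)):
            video.append(dict(zip(components, per_frame)))
    return video


gops = [("H", ["H1", "H2"]), ("X", ["X1", "X2"]),
        ("Y", ["Y3", "Y4"]), ("Z", ["Z3", "Z4"]),
        ("H", ["H3", "H4"]), ("X", ["X3", "X4"])]
video = recombine(gops, "YZHX")
print(video[0])
```

In this small example with a GOP length of 2, decoding starts at an H sequence, so the leading H and X sequences are deleted and the output begins at frame 3.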
The embodiment of the present invention is described specifically by taking a 1000-frame four-component video as an example. Each component has a size of 3840 × 2160 and the frame rate F is assumed to be 25 fps, so that equation (1) becomes
(Y1Z1H1X1),(Y2Z2H2X2),……,(Y1000Z1000H1000X1000).
Four-component video data reorganization:
assuming that the encoder sets the GOP size to 25 frames, the recombination according to equation (2) is as follows:
Y1,Y2,…Y25,Z1,Z2,…,Z25,H1,H2,…,H25,X1,X2,…X25,Y26,Y27,…Y50,Z26,Z27,…,Z50,H26,H27,…,H50,X26,X27,…X50,Y51,Y52,…Y75,……, (4)
wherein, Y1, Y2, …, Y25, Z1, Z2, …, Z25, H1, H2, …, H25, X1, X2, … X25, etc. respectively represent the corresponding single-component video sequence, and the corresponding frame rate is 100 fps.
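The figures in this example can be checked with a few lines of Python. The variable names are illustrative only; the input values are those stated in the example.

```python
components = 4        # Y, Z, H, X
total_frames = 1000   # frames in the four-component video
k = 25                # GOP length in frames
fps = 25              # frame rate F of the four-component video

multi_gops = total_frames // k             # multi-component GOP groups
single_gops = multi_gops * components      # single-component video sequences
single_frames = total_frames * components  # frames sent to the encoder
single_fps = fps * components              # 4F

print(multi_gops, single_gops, single_frames, single_fps)
# prints: 40 160 4000 100

# The playing time is unchanged: 1000 / 25 s equals 4000 / 100 s.
print(total_frames / fps, single_frames / single_fps)
```

The 100 fps figure follows from sending four times as many frames in the same playing time of 40 seconds.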
Encoding of a single component video sequence:
assuming that an H.264 encoder is used, and since the UV components carry no meaningful data, the YCbCr4:2:0 chroma format is adopted; the U and V components each have a size of 1920 × 1080, and all their values are uniformly set to 128.
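The plane sizes for this configuration can be worked out as follows. The calculation assumes 8-bit samples, which is consistent with the 0 to 255 value range but is not stated explicitly for this example.

```python
width, height = 3840, 2160
bytes_per_sample = 1  # assumption: 8-bit samples

luma_bytes = width * height * bytes_per_sample
# In 4:2:0, each chroma plane has half the width and half the height.
chroma_w, chroma_h = width // 2, height // 2
chroma_bytes = chroma_w * chroma_h * bytes_per_sample
frame_bytes = luma_bytes + 2 * chroma_bytes

print(chroma_w, chroma_h)
# prints: 1920 1080
print(luma_bytes, chroma_bytes, frame_bytes)
# prints: 8294400 2073600 12441600
```

Before compression, the two added planes therefore increase the raw size of each frame by one half; since they are constant, they contain no information beyond the single fixed value.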
The single component video sequence is transformed using equation (3) as follows:
(Y1,U1,V1),(Y2,U2,V2),…(Y25,U25,V25),(Z1,U1,V1),(Z2,U2,V2),…,(Z25,U25,V25),(H1,U1,V1),(H2,U2,V2),…,(H25,U25,V25),(X1,U1,V1),(X2,U2,V2),…(X25,U25,V25),(Y26,U26,V26),(Y27,U27,V27),…(Y50,U50,V50),(Z26,U26,V26),(Z27,U27,V27),…,(Z50,U50,V50),(H26,U26,V26),(H27,U27,V27),…,(H50,U50,V50),(X26,U26,V26),(X27,U27,V27),…(X50,U50,V50),(Y51,U51,V51),(Y52,U52,V52),…(Y75,U75,V75),…… (5)
The GOP length in the coding parameters of the encoder is set to 25, and the component type (Y, Z, H or X) of each GOP group is described in the user data area of the first I frame of that GOP group. In the above formula, the component type of the first GOP group is Y, that of the second GOP group is Z, that of the third GOP group is H, that of the fourth GOP group is X, that of the fifth GOP group is Y, and so on. The sequence is then encoded to produce the encoded data.
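The cyclic assignment of component types to GOP groups can be expressed compactly. The helper `gop_info` below is a hypothetical illustration that maps the 1-based index of a GOP group to its component type and to the range of original frame numbers it contains; it does not show how the type is written into the user data area, which depends on the coding standard used.

```python
from typing import Tuple


def gop_info(g: int, components: str = "YZHX", k: int = 25) -> Tuple[str, int, int]:
    """Return (component type, first frame, last frame) of the g-th GOP group."""
    if g < 1:
        raise ValueError("GOP index starts at 1")
    # cycle: which multi-component GOP; pos: which component within it.
    cycle, pos = divmod(g - 1, len(components))
    first = cycle * k + 1
    return components[pos], first, first + k - 1


for g in (1, 2, 3, 4, 5, 160):
    print(g, gop_info(g))
```

For the 1000-frame example, GOP groups 1 to 4 carry Y, Z, H, X of frames 1 to 25, GOP group 5 carries Y of frames 26 to 50, and the last GOP group, number 160, carries X of frames 976 to 1000.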
Decoding of a single component video sequence:
The coded data is decoded to obtain a three-component video sequence (as shown in formula (5)). The component type (Y, Z, H, X) of each GOP group of pictures is determined from the description in the user data area of its first I frame. The three components are stored at different addresses, so the U, V components can be deleted by address to obtain a single-component video sequence. The other GOP groups of pictures are processed in the same way, finally giving a sequence similar to formula (4). For example, suppose the first GOP group of pictures decoded is identified as an H component by the description in the user data area of its I frame. Its 25 frames are then (H1, U1, V1), (H2, U2, V2), …, (H25, U25, V25); because the three components are stored at different addresses, the U, V components can be deleted by address, giving the single-component video sequence H1, H2, …, H25. Processing the other GOP groups of pictures in the same way gives the following sequence:
H1,H2,…,H25,X1,X2,…X25,Y26,Y27,…Y50,Z26,Z27,…,Z50,H26,H27,…,H50,X26,X27,…X50,Y51,Y52,…Y75,……, (6)
Four-component video data recombination:
First, it is judged whether the component type of the first single-component video sequence is the first component of the multi-component video. If so, all the single-component data are recombined by applying the inverse of the recombination operation; that is, if the component type of the first single-component video sequence is Y, the inverse operation is applied directly to obtain the four-component video sequence shown in formula (1).
If not, the first single-component video sequence is deleted and the component type of the second single-component video sequence is judged in the same way, and so on until a single-component video sequence whose component type is the first component of the multi-component video is found; the remaining single-component data are then recombined by applying the inverse of the recombination operation. That is, if the component type of the first single-component video sequence is H, as shown in formula (6), then H1, H2, …, H25, X1, X2, …, X25 are deleted and the rest is recombined by the inverse operation. The finally obtained four-component video sequence is as follows:
(Y26Z26H26X26),(Y27Z27H27X27),……,(Y1000Z1000H1000X1000)
Although particular embodiments of the present invention have been described above, it will be understood by those skilled in the art that these are by way of example only and that various changes or modifications may be made to these embodiments without departing from the spirit and scope of the invention. The scope of the invention is therefore defined by the appended claims.