CN114173131B - Video compression method and system based on inter-frame correlation - Google Patents
- Publication number: CN114173131B (application CN202111407064.XA)
- Authority: CN (China)
- Legal status: Active
Classifications
- H04N19/177 — adaptive coding where the coding unit is a group of pictures [GOP]
- H04N19/132 — sampling, masking or truncation of coding units, e.g. adaptive resampling, frame skipping, frame interpolation or high-frequency transform coefficient masking
- H04N19/176 — adaptive coding where the coding unit is an image region, the region being a block, e.g. a macroblock
- H04N19/577 — motion compensation with bidirectional frame interpolation, i.e. using B-pictures
Abstract
A video compression method and system based on inter-frame correlation. The method includes: dividing an input video sequence into several groups of pictures, each containing multiple frames; for each group, setting the 1st frame as the reference frame and the remaining frames as experimental frames, and sampling the reference frame at a fixed high sampling rate; applying identical non-overlapping block partitioning to the reference and non-reference frames, dividing each into several image blocks; for each image block of a non-reference frame, classifying the block by its degree of change relative to the corresponding block of the reference frame, and assigning different sampling rates to different classes of blocks; and reconstructing the reference frame directly, while for non-reference frames preprocessing the lower-sampling-rate blocks, performing multi-hypothesis prediction to obtain a predicted frame, correcting the error between the predicted frame and the original frame via residual reconstruction to obtain a high-quality reconstructed frame, and finally refining the reconstruction result with bidirectional motion estimation.
Description
Technical Field
The invention relates to the field of video compression, and in particular to a video compression method and system based on inter-frame correlation.
Background
With the widespread adoption of the internet and communication devices, networks have become one of the main channels through which people obtain information, and new multimedia applications have emerged continuously. Network bandwidth limits the rate of information transmission and is therefore an important factor in users' quality of experience, which motivated the development of streaming media technology. However, streaming video is characterized by huge data volume, strict real-time requirements, rapid data growth, and frequent queries, so compressing it effectively for transmission and storage is an urgent problem.
Conventional video compressed sensing algorithms generally measure and reconstruct every frame with the same measurement matrix, ignoring the strong inter-frame correlation that is a prominent characteristic of video signals; a video compressed sensing scheme based on inter-frame correlation is therefore proposed. The traditional Shannon sampling paradigm acquires a signal by sampling first and compressing afterwards, which wastes resources. Compressed sensing (Compressive Sensing, CS) theory merges sampling and compression into a single step, greatly reducing encoder complexity and resource waste, and has been widely applied in video coding. A video coding scheme based on compressed sensing theory is called a compressive video sensing (Compressive Video Sensing, CVS) scheme.
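The CS idea just described — sampling and compression merged into one linear measurement — can be illustrated with a toy NumPy sketch. The dimensions, the random Gaussian measurement matrix, and the sparse test signal are all illustrative assumptions, not taken from the patent.

```python
import numpy as np

# Toy CS acquisition: y = Φx with M << N, i.e. sampling IS compression.
rng = np.random.default_rng(42)
n, m = 64, 16                                   # N signal samples, M measurements
phi = rng.standard_normal((m, n)) / np.sqrt(m)  # random measurement matrix Φ
x = np.zeros(n)
x[[3, 17, 40]] = [1.0, -2.0, 0.5]               # sparse original signal
y = phi @ x                                     # acquisition and compression in one step
# y carries only 16 values; CS theory says a sufficiently sparse x can be
# recovered from y when Φ is suitably incoherent with the sparsity basis.
```

The point is only dimensional: the encoder stores 16 measurements instead of 64 samples, and the reconstruction burden moves to the decoder.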
Disclosure of Invention
In view of the drawbacks and shortcomings of the prior art, embodiments of the present invention provide a video compression method based on inter-frame correlation that overcomes the above problems, or at least partially solves them, and specifically comprises the following steps:
As a first aspect of the present invention, there is provided a video compression method based on inter-frame correlation, the method comprising:
Step 1: dividing an input video sequence into several groups of pictures, each containing multiple frames;
Step 2: for each group of pictures, setting the 1st frame as the reference frame and the remaining frames as experimental frames, and sampling the reference frame at a fixed high sampling rate;
Step 3: applying identical non-overlapping block partitioning to the reference frame and the non-reference frames, dividing each into several image blocks;
Step 4: classifying the image blocks of each non-reference frame by their degree of change relative to the corresponding blocks of the reference frame, and assigning different sampling rates to different classes of blocks;
Step 5: reconstructing the reference frame directly; for non-reference frames, preprocessing the lower-sampling-rate blocks, performing multi-hypothesis prediction to obtain a predicted frame, correcting the error between the predicted frame and the original frame via residual reconstruction to obtain a high-quality reconstructed frame, and finally refining the reconstruction result with bidirectional motion estimation.
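Steps 1 and 2 can be sketched as follows. The function name and the GOP size of 3 are illustrative assumptions (the embodiment below happens to use three-frame groups), not names from the patent.

```python
# Sketch of steps 1-2: split a sequence into groups of pictures (GOPs);
# frame 0 of each GOP is the reference (I) frame, the rest are
# experimental frames. Names and GOP size are illustrative.
def split_gops(frames, gop_size=3):
    """Return a list of GOPs of up to gop_size frames each."""
    return [frames[i:i + gop_size] for i in range(0, len(frames), gop_size)]

frames = list(range(7))                  # stand-in for 7 video frames
gops = split_gops(frames, gop_size=3)    # -> [[0, 1, 2], [3, 4, 5], [6]]
```

In each resulting group, element 0 would be sampled at the fixed high rate and the remaining elements at the adaptive rates assigned in step 4.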
The original frame corresponds to the I frame of the video; the subsequent P frames are all predicted frames derived from the I frame.
Further, in step 4, classifying the image blocks specifically includes: computing the residual energy difference between corresponding image blocks of the reference frame and the non-reference frame, and dividing the blocks into three classes based on this difference: basically unchanged blocks, slowly changing blocks, and rapidly changing blocks.
Further, each group of pictures contains three frames: the first frame is set as the reference frame, i.e., the I frame; the third frame as a forward-predicted frame, i.e., a P frame; and the second frame as a bidirectionally predicted frame, i.e., a B frame. The B, P, and I frames undergo identical non-overlapping block partitioning and are each uniformly divided into K image blocks of size D×D.
Classifying the image blocks of a non-reference frame by the degree of change relative to the corresponding blocks of the reference frame specifically includes:
For P frames:
The residual energy difference between the original data of the i-th image block of the P frame X_{t+1} and the co-located image block in the I frame X_{t-1} reflects the correlation between the two blocks, so the magnitude of this residual energy difference can serve as the classification basis for P-frame image blocks. The expression is:
ASSD = Σ_{i=1}^{2N} Σ_{j=1}^{2N} (CU_Cur(i,j) − CU_Col(i,j))²;
That is, the motion intensity of the current CU is judged by the sum of squared differences between the luminance pixel values of the current-depth CU and the already-encoded co-located CU, where 2N is the size of the current CU, and CU_Cur(i,j) and CU_Col(i,j) are the luminance pixel values at position (i,j) in the current and co-located CU blocks, respectively. A larger ASSD indicates a greater difference between temporally adjacent CUs, i.e., more intense motion.
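The ASSD measure above can be rendered directly in NumPy. This is a minimal sketch of the stated sum of squared luminance differences, not the patent's implementation; the function name and test arrays are illustrative.

```python
import numpy as np

# Sketch of ASSD: sum of squared differences between luminance pixels of
# the current CU and the co-located (already-encoded) CU.
def assd(cu_cur, cu_col):
    cur = np.asarray(cu_cur, dtype=np.int64)  # widen to avoid uint8 overflow
    col = np.asarray(cu_col, dtype=np.int64)
    return int(np.sum((cur - col) ** 2))

a = np.array([[1, 2], [3, 4]])
b = np.array([[1, 2], [3, 5]])
# assd(a, b) -> 1: only the bottom-right pixel differs, by 1
```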
The ratio of the residual energy between the two image blocks to the energy of the reference image block is used as the criterion, expressed as follows:
where ASSD_X denotes the value of ASSD when the optimal depth of an encoded CU in the current frame is X (0–3); that is, ASSD_X considers only CUs whose optimal coding depth is X, and the other parameters have the same meaning as in ASSD;
Since each measurement vector is a linear projection of the original signal data, the ratio of the residual energy between the measurement vectors to the measurement energy of the reference image block is used as the image-block classification criterion for the P frame, i.e., e = ‖y_{t+1,i} − y_{t−1,i}‖² / ‖y_{t−1,i}‖², where y_{t+1,i} and y_{t−1,i} denote the pre-sampled measurement vectors of the i-th blocks of the P frame and the I frame;
Th_Skip = b × Ave(ASSD_{not X});
Th_Skip denotes the average ASSD of all encoded CUs whose optimal depth is greater than X. If the ASSD of the current CU block at depth X exceeds Th_Skip, the motion of the current CU block is relatively intense, so the various PU prediction modes at depth X need not be computed and evaluated; the encoder skips directly to the next depth X+1 to evaluate smaller CUs. Here b is a tuning parameter, taken as 1.2; the larger its value, the larger the corresponding Th_Skip.
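The Th_Skip depth-skip decision just described can be sketched as follows. The function name and the sample ASSD values are illustrative assumptions; only the rule Th_Skip = b × Ave(ASSD_not_X) comes from the text.

```python
# Sketch of the depth-skip test: skip PU evaluation at depth X when the
# current CU's ASSD exceeds b times the average ASSD of CUs whose optimal
# depth was deeper than X. Names are illustrative.
def skip_to_next_depth(assd_current, assd_deeper, b=1.2):
    th_skip = b * sum(assd_deeper) / len(assd_deeper)   # Th_Skip
    return assd_current > th_skip

# e.g. deeper-CU ASSDs average 100, so Th_Skip = 120:
# an ASSD of 200 triggers the skip, an ASSD of 50 does not.
```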
For B frames:
For B frames, image-block classification is judged using the correlation between the B frame and the frames before and after it: the measurement vectors obtained by pre-sampling the I frame and the P frame are averaged, and the ratio of the energy of the difference between the B-frame measurement and this average to the energy of the average is computed. The calculation formula is:
J(m, λ_Motion) = SAD(s, c(m)) + λ_Motion·R(m − p);
where m = (m_x, m_y)^T is the motion vector; λ_Motion is a Lagrange multiplier, a constant related to the quantization parameter QP and the type of block to be encoded; SAD is the sum of absolute differences; s denotes the original video signal and c the decoded video signal; p = (p_x, p_y)^T is the predicted motion vector; and R(m − p) is the number of bits required to represent the motion vector difference;
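The rate-distortion cost J(m, λ_Motion) = SAD(s, c(m)) + λ_Motion·R(m − p) can be sketched as follows. Here R(m − p) is approximated by a crude per-component bit-length cost — that approximation, and all names, are assumptions; a real codec uses its entropy-coding model for R.

```python
import numpy as np

# Sketch of the Lagrangian motion cost: distortion (SAD) plus lambda times
# an approximate motion-vector-difference rate. Illustrative only.
def sad(block_a, block_b):
    return int(np.sum(np.abs(block_a.astype(np.int64) - block_b.astype(np.int64))))

def mv_bits(d):
    # crude signed bit cost for one MV-difference component (assumption)
    return 1 + abs(int(d)).bit_length()

def motion_cost(src, pred, mv, pmv, lam=4.0):
    mvd = (mv[0] - pmv[0], mv[1] - pmv[1])          # m - p
    rate = mv_bits(mvd[0]) + mv_bits(mvd[1])        # ~R(m - p)
    return sad(src, pred) + lam * rate              # J = SAD + λ·R
```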
The image blocks are divided into three classes by the magnitude of the value e — basically unchanged blocks, slowly changing blocks, and rapidly changing blocks — as follows: first, two thresholds T1 and T2 are set with T1 < T2, and the computed value e is compared against them; when e < T1, the block is judged to be basically unchanged; when T1 ≤ e ≤ T2, the block is judged to be slowly changing; when e > T2, the block is judged to be rapidly changing.
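The three-way threshold test above is trivially expressed in code. The class labels ("static", "slow", "fast") are illustrative shorthand for the patent's basically unchanged, slowly changing, and rapidly changing blocks.

```python
# Sketch of the three-way block classification: compare the change ratio e
# against thresholds T1 < T2. Labels are illustrative shorthand.
def classify_block(e, t1, t2):
    if e < t1:
        return "static"   # basically unchanged block
    elif e <= t2:
        return "slow"     # slowly changing block (T1 <= e <= T2)
    return "fast"         # rapidly changing block (e > T2)
```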
Further, the classes of image blocks include basically unchanged blocks, slowly changing blocks, and rapidly changing blocks, and the method further comprises:
In the adaptive-sampling-rate measurement process, the basically unchanged, slowly changing, and rapidly changing blocks are assigned sampling rates S1, S2, and S3 respectively, with S1 < S2 < S3; the image blocks of the reference frame are also sampled at S3. The numbers of sampling points for the three classes of blocks are M1 = S1·D², M2 = S2·D², and M3 = S3·D², where D is the side length of the K image blocks produced by partitioning and D² is the one-dimensional signal length of each block;
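The per-class measurement budget M = S·D² can be sketched as follows. The rates 0.05/0.15/0.40 and the rounding to whole sample counts are illustrative assumptions; the patent only requires S1 < S2 < S3.

```python
# Sketch of adaptive sample allocation: each class of D×D block receives
# M = S · D² measurements. Rates shown are illustrative, not from the patent.
def samples_per_block(d, rates):
    return {cls: round(rate * d * d) for cls, rate in rates.items()}

alloc = samples_per_block(16, {"static": 0.05, "slow": 0.15, "fast": 0.40})
# a 16×16 block has 256 samples, so the fast class keeps 102 measurements,
# the slow class 38, and the static class only 13
```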
In step 5, the reference frame is reconstructed with the BCS-SPL algorithm, i.e., a block-based image compressed-sensing reconstruction method. For a non-reference frame, the basically unchanged and slowly changing blocks are first preprocessed; multi-hypothesis prediction then yields a predicted frame; residual reconstruction corrects the error between the predicted frame and the original frame (the original frame here refers mainly to the I frame; the subsequent P frames are all predicted frames derived from the I frame) to obtain a high-quality reconstructed frame; and finally bidirectional motion estimation refines the reconstruction result.
Further, the BCS-SPL algorithm is processed as follows:
A. Obtain an approximate solution of the original signal, i.e., the initial solution x0 = Φ^T·y of the image, using minimum mean square error (MMSE) estimation;
B. Reduce the blocking artifacts introduced by partitioning with a 3×3 adaptive Wiener filter, smoothing the image;
C. Project the filtered image block onto a convex set, expressed as:
p(x, y, Φ) = x + Φ^T(ΦΦ^T)^{-1}(y − Φx);
When Φ is an orthogonal matrix, i.e., ΦΦ^T = I, the above formula simplifies to:
p(x, y, Φ) = x + Φ^T(y − Φx);
D. Filter the projection result with bivariate shrinkage thresholding to perform denoising;
E. Check whether the iteration termination condition is met; if not, iterate repeatedly until the optimal solution is obtained, with a maximum number of iterations set. Here x denotes the current image estimate, y the measurement vector, Φ the measurement matrix, and Φ^T its transpose; p(x, y, Φ) is the projection used by the BCS-SPL optimization to obtain the reconstructed frame.
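The projection step C can be sketched in NumPy, assuming the rows of Φ are linearly independent (so ΦΦ^T is invertible); this is an illustrative rendering, not the patent's code. After projection the iterate is exactly consistent with the measurements, i.e., Φ·p(x, y, Φ) = y.

```python
import numpy as np

# Sketch of step C: project the current estimate x onto the affine set
# {x : Φx = y} via p(x, y, Φ) = x + Φ^T (ΦΦ^T)^(-1) (y − Φx).
def project(x, y, phi):
    gram = phi @ phi.T                       # ΦΦ^T (assumed invertible)
    return x + phi.T @ np.linalg.solve(gram, y - phi @ x)

rng = np.random.default_rng(0)
phi = rng.standard_normal((4, 16))           # 4 measurements of a 16-sample block
x0 = rng.standard_normal(16)                 # arbitrary current estimate
y = phi @ rng.standard_normal(16)            # measurements of the "true" signal
xp = project(x0, y, phi)                     # now Φ·xp ≈ y
```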
As a second aspect of the present invention, there is provided a video compression system based on inter-frame correlation, the system including a grouping module, a frame classifying module, a blocking module, a sampling rate distributing module, and a reconstructing module;
The grouping module divides an input video sequence into several groups of pictures, each containing multiple frames;
The frame classifying module, for each group of pictures, sets the 1st frame as the reference frame and the remaining frames as experimental frames, and samples the reference frame at a fixed high sampling rate;
The blocking module applies identical non-overlapping block partitioning to the reference frame and the non-reference frames, dividing each into several image blocks;
The sampling rate distributing module, for each image block of a non-reference frame, classifies the block by its degree of change relative to the corresponding block of the reference frame and assigns different sampling rates to different classes of blocks;
The reconstructing module reconstructs the reference frame directly; for non-reference frames it preprocesses the lower-sampling-rate blocks, performs multi-hypothesis prediction to obtain a predicted frame, corrects the error between the predicted frame and the original frame via residual reconstruction to obtain a high-quality reconstructed frame, and finally refines the reconstruction result with bidirectional motion estimation.
Further, the image blocks are classified specifically as follows: the residual energy difference between corresponding image blocks of the reference frame and the non-reference frame is computed, and the blocks are divided into three classes based on this difference: basically unchanged blocks, slowly changing blocks, and rapidly changing blocks.
Further, each group of pictures contains three frames: the first frame is set as the reference frame, i.e., the I frame; the third frame as a forward-predicted frame, i.e., a P frame; and the second frame as a bidirectionally predicted frame, i.e., a B frame. The B, P, and I frames undergo identical non-overlapping block partitioning and are each uniformly divided into K image blocks of size D×D.
Classifying the image blocks of a non-reference frame by the degree of change relative to the corresponding blocks of the reference frame specifically includes:
For P frames:
The residual energy difference between the original data of the i-th image block of the P frame X_{t+1} and the co-located image block in the I frame X_{t-1} reflects the correlation between the two blocks, so the magnitude of this residual energy difference can serve as the classification basis for P-frame image blocks. The expression is:
ASSD = Σ_{i=1}^{2N} Σ_{j=1}^{2N} (CU_Cur(i,j) − CU_Col(i,j))²;
That is, the motion intensity of the current CU is judged by the sum of squared differences between the luminance pixel values of the current-depth CU and the already-encoded co-located CU, where 2N is the size of the current CU, and CU_Cur(i,j) and CU_Col(i,j) are the luminance pixel values at position (i,j) in the current and co-located CU blocks, respectively. A larger ASSD indicates a greater difference between temporally adjacent CUs, i.e., more intense motion.
The ratio of the residual energy between the two image blocks to the energy of the reference image block is used as the criterion, expressed as follows:
where ASSD_X denotes the value of ASSD when the optimal depth of an encoded CU in the current frame is X (0–3); that is, ASSD_X considers only CUs whose optimal coding depth is X, and the other parameters have the same meaning as in ASSD;
Since each measurement vector is a linear projection of the original signal data, the ratio of the residual energy between the measurement vectors to the measurement energy of the reference image block is used as the image-block classification criterion for the P frame, i.e., e = ‖y_{t+1,i} − y_{t−1,i}‖² / ‖y_{t−1,i}‖², where y_{t+1,i} and y_{t−1,i} denote the pre-sampled measurement vectors of the i-th blocks of the P frame and the I frame;
Th_Skip = b × Ave(ASSD_{not X});
Th_Skip denotes the average ASSD of all encoded CUs whose optimal depth is greater than X. If the ASSD of the current CU block at depth X exceeds Th_Skip, the motion of the current CU block is relatively intense, so the various PU prediction modes at depth X need not be computed and evaluated; the encoder skips directly to the next depth X+1 to evaluate smaller CUs. Here b is a tuning parameter, taken as 1.2; the larger its value, the larger the corresponding Th_Skip.
For B frames:
For B frames, image-block classification is judged using the correlation between the B frame and the frames before and after it: the measurement vectors obtained by pre-sampling the I frame and the P frame are averaged, and the ratio of the energy of the difference between the B-frame measurement and this average to the energy of the average is computed. The calculation formula is:
J(m, λ_Motion) = SAD(s, c(m)) + λ_Motion·R(m − p);
where m = (m_x, m_y)^T is the motion vector; λ_Motion is a Lagrange multiplier, a constant related to the quantization parameter QP and the type of block to be encoded; SAD is the sum of absolute differences; s denotes the original video signal and c the decoded video signal; p = (p_x, p_y)^T is the predicted motion vector; and R(m − p) is the number of bits required to represent the motion vector difference;
The image blocks are divided into three classes by the magnitude of the value e — basically unchanged blocks, slowly changing blocks, and rapidly changing blocks — as follows: first, two thresholds T1 and T2 are set with T1 < T2, and the computed value e is compared against them; when e < T1, the block is judged to be basically unchanged; when T1 ≤ e ≤ T2, the block is judged to be slowly changing; when e > T2, the block is judged to be rapidly changing.
Further, the classes of image blocks include basically unchanged blocks, slowly changing blocks, and rapidly changing blocks, and the reconstructing module is specifically configured to:
In the adaptive-sampling-rate measurement process, assign the basically unchanged, slowly changing, and rapidly changing blocks sampling rates S1, S2, and S3 respectively, with S1 < S2 < S3; the image blocks of the reference frame are also sampled at S3. The numbers of sampling points for the three classes of blocks are M1 = S1·D², M2 = S2·D², and M3 = S3·D², where D is the side length of the K image blocks produced by partitioning and D² is the one-dimensional signal length of each block;
Reconstruct the reference frame with the BCS-SPL algorithm, i.e., a block-based image compressed-sensing reconstruction method; for a non-reference frame, first preprocess the basically unchanged and slowly changing blocks, then perform multi-hypothesis prediction to obtain a predicted frame, then correct the error between the predicted frame and the original frame via residual reconstruction to obtain a high-quality reconstructed frame, and finally refine the reconstruction result with bidirectional motion estimation.
Further, the BCS-SPL algorithm is processed as follows:
A. Obtain an approximate solution of the original signal, i.e., the initial solution x0 = Φ^T·y of the image, using minimum mean square error (MMSE) estimation;
B. Reduce the blocking artifacts introduced by partitioning with a 3×3 adaptive Wiener filter, smoothing the image;
C. Project the filtered image block onto a convex set, expressed as:
p(x, y, Φ) = x + Φ^T(ΦΦ^T)^{-1}(y − Φx);
When Φ is an orthogonal matrix, i.e., ΦΦ^T = I, the above formula simplifies to:
p(x, y, Φ) = x + Φ^T(y − Φx);
D. Filter the projection result with bivariate shrinkage thresholding to perform denoising;
E. Check whether the iteration termination condition is met; if not, iterate repeatedly until the optimal solution is obtained, with a maximum number of iterations set;
Here x denotes the current image estimate, y the measurement vector, Φ the measurement matrix, and Φ^T its transpose; p(x, y, Φ) is the projection used by the BCS-SPL optimization to obtain the reconstructed frame.
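Putting steps A–E together, the whole BCS-SPL loop can be sketched in simplified 1-D form. Two simplifications are assumptions: the Wiener-filter smoothing of step B is omitted, and plain soft thresholding stands in for bivariate shrinkage. This is illustrative, not the patent's reconstruction code.

```python
import numpy as np

# Simplified 1-D sketch of the BCS-SPL iteration (steps A, C, D, E above;
# step B's Wiener smoothing omitted, soft thresholding replacing bivariate
# shrinkage — both simplifying assumptions).
def soft_threshold(v, t):
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def bcs_spl(y, phi, iters=50, thresh=0.01, tol=1e-6):
    x = phi.T @ y                                  # step A: x0 = Φ^T y
    for _ in range(iters):
        x_prev = x
        # step C: projection onto {x : Φx = y} (rows of Φ assumed independent)
        x = x + phi.T @ np.linalg.solve(phi @ phi.T, y - phi @ x)
        x = soft_threshold(x, thresh)              # step D: denoise
        if np.linalg.norm(x - x_prev) < tol:       # step E: termination test
            break
    return x
```

With `thresh=0` the loop reduces to a pure projection and converges in one step to a measurement-consistent estimate.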
The invention has the following beneficial effects:
1. The invention provides a bidirectional-prediction-based variable-sampling-rate block video coding scheme. The reference frame is first sampled at a fixed high rate; the non-reference frames are then partitioned into non-overlapping blocks, which are classified by their degree of change, and different sampling rates are assigned to the different classes of blocks. During block classification, bidirectional prediction produces a preliminary classification, which is then corrected using the intra-frame correlation of the video. Experimental results show that this bidirectional-prediction-based adaptive-sampling-rate measurement effectively improves block classification accuracy and brings the classification closer to the actual situation, thereby reducing the frame sampling rate to a certain extent and effectively improving video reconstruction quality.
2. The invention proposes a variable-sampling-rate multi-hypothesis predictive video decoding scheme. Because the reference frame is measured at a fixed high sampling rate, it can be reconstructed directly. For a non-reference frame, the two classes of blocks with lower sampling rates are first preprocessed; multi-hypothesis prediction then yields a predicted frame, which bidirectional motion estimation refines into a higher-precision prediction; finally, residual reconstruction corrects the error between the predicted frame and the original frame to obtain a high-quality reconstructed frame. Experimental results show that this variable-sampling-rate multi-hypothesis prediction scheme reconstructs higher-quality video frames at lower sampling rates.
Drawings
Fig. 1 is a flowchart of a video compression method based on inter-frame correlation according to an embodiment of the present invention;
Fig. 2 is a schematic flow chart of a process for extracting frames for algorithm processing according to an embodiment of the present invention.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and fully with reference to the accompanying drawings, in which it is evident that the embodiments described are only some, but not all embodiments of the invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
As shown in fig. 1, as a first embodiment of the present invention, there is provided a video compression method based on inter-frame correlation, the method including:
step 1, dividing an input video sequence into a plurality of picture groups, wherein each picture group comprises a plurality of frames of pictures;
step2, setting a1 st frame as a reference frame and the rest frames as experimental frames for each picture group, and carrying out fixed high sampling rate sampling on the reference frames;
step 3, carrying out the same non-overlapping blocking treatment on the reference frame and the non-reference frame, and dividing each reference frame and the non-reference frame into a plurality of image blocks;
step 4, classifying the image blocks of the non-reference frame based on the change degree of the image blocks corresponding to the reference frame, and distributing different sampling rates to the image blocks of different categories;
Step 5, directly reconstructing the reference frame; for non-reference frames, preprocessing the image blocks with lower sampling rates, performing multi-hypothesis prediction to obtain a predicted frame, correcting the error between the predicted frame and the original frame through residual reconstruction to obtain a high-quality reconstructed frame, and finally correcting the reconstruction result through bidirectional motion estimation.
To achieve a controllable total sampling rate, a two-stage adaptive CVS framework based on inter-frame correlation is proposed herein, as shown in fig. 2. In this framework, a fixed sampling-rate measurement is used for reference frames and a variable sampling-rate measurement is used for non-reference frames. For a given non-reference frame, the total measurement comprises a fixed sampling-rate measurement and an adaptive sampling-rate measurement. Assuming that the average sampling rate of a non-reference frame image block is S, then S = F + A, where F is the fixed sampling rate and A is the average adaptive sampling rate.
In the above embodiment, the input video sequence is first divided into several groups of pictures (GOPs), each containing N frames (N ≥ 2). The 1st frame is set as the reference frame (I frame) and the remaining frames are experimental frames (P frames). Reference frames and experimental frames are processed differently: the reference frame is separately tone-mapped, while the experimental frames of the same group are divided into blocks of the same size and block-matching motion estimation is performed against the reference frame to find, for each block, a similar matching block in the reference frame. If a perfect matching block is found through motion estimation, the block information of the reference frame is reused for that experimental-frame block; otherwise, the experimental-frame block undergoes additional tone-mapping processing.
The reference frame in each picture group is treated differently from the experimental frames: the reference frame image is tone-mapped with bilateral filtering to obtain an LDR image with a good visual effect. The bilateral filtering algorithm is an image filtering algorithm designed on the basis of the classical Gaussian filter; it takes both spatial position information and pixel-value similarity into account during filtering, preserves the edge features of the image well, and is local, simple, and non-iterative. Image-block preprocessing likewise produces an LDR image with a good visual effect through tone mapping with bilateral filtering.
Preferably, since different regions of a video have different scene complexity and variation intensity, the invention allocates sampling rates according to the degree of change between the partitioned video frames: image blocks with a smaller degree of change are allocated lower sampling rates, and blocks with a larger degree of change are allocated higher sampling rates. Image blocks are divided into 3 classes: basically unchanged blocks, slowly changing blocks, and fast-changing blocks. This ensures that the video can still be reconstructed with high quality at a low total sampling rate. When setting the adaptive measurement rate, a suitable block size B is first selected and the video frames are partitioned without overlap into K blocks of size B × B; the same measurement matrix is then used for pre-sampling to obtain a measurement vector for each image block. Since the complexity and variation intensity of the video scene are directly related to the inter-frame pixel differences, video blocks can be classified according to residual energy. The specific steps are as follows:
First, the same projection matrix is used to obtain the measurement vectors of the corresponding image blocks of the reference frame and the non-reference frame, and the image blocks are divided into three classes according to the energy difference value: basically unchanged blocks, slowly changing blocks, and fast-changing blocks;
Second, the preliminary classification result is corrected using the intra-frame correlation of the image;
Finally, the classification result is obtained; different sampling rates are allocated to the different block classes for sampling, and the number of measurement points of the current image block is adaptively adjusted to obtain its measurement vector.
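The classification step above can be sketched as follows. This is a minimal illustration rather than the patented implementation: the Gaussian pre-sampling matrix and the thresholds T1 = 0.05 and T2 = 0.2 are assumed values for demonstration only.

```python
import numpy as np

def presample(block, phi):
    """Project a flattened image block with the shared pre-sampling matrix phi."""
    return phi @ block.reshape(-1)

def classify_block(y_ref, y_cur, t1=0.05, t2=0.2):
    """Classify a non-reference-frame block by the ratio of the residual
    energy of its measurement vector to the reference block's energy."""
    e = np.sum((y_cur - y_ref) ** 2) / np.sum(y_ref ** 2)
    if e < t1:
        return "static"   # basically unchanged block
    if e <= t2:
        return "slow"     # slowly changing block
    return "fast"         # fast-changing block

# Pre-sample corresponding D x D blocks of the reference and non-reference frame
rng = np.random.default_rng(0)
D = 8
phi = rng.standard_normal((16, D * D))  # shared matrix (assumed Gaussian)
ref_block = rng.random((D, D))
cur_block = ref_block + 0.005 * rng.standard_normal((D, D))
label = classify_block(presample(ref_block, phi), presample(cur_block, phi))
```

Because both frames are projected with the same matrix, the measurement-domain residual energy tracks the pixel-domain residual energy, which is what makes classification in the compressed domain possible.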
Preferably, inter-frame correlation is exploited as follows: the first frame is set as the reference frame, i.e. an I frame; the third frame is set as a forward-predicted frame, i.e. a P frame; and the second frame is set as a bidirectionally predicted frame, i.e. a B frame.
First, the P frame and the I frame are subjected to the same non-overlapping block partitioning and uniformly divided into K image blocks of size D × D. The residual-energy difference between the original data of the i-th image block of the P frame Xt+1 and the image block at the corresponding position in the I frame Xt-1 reflects the correlation between the two corresponding image blocks; this residual-energy value can therefore serve as the classification basis for the image blocks of the P frame. The B frame, P frame, and I frame are subjected to the same non-overlapping block partitioning and uniformly divided into K image blocks of size D × D.
Classifying the image blocks of a non-reference frame based on their degree of change relative to the corresponding image blocks of the reference frame specifically includes:
For P frames:
The residual-energy difference between the original data of the i-th image block of the P frame Xt+1 and the image block at the corresponding position in the I frame Xt-1 reflects the correlation between the two corresponding image blocks, so the magnitude of this residual-energy difference is used as the classification basis for the P-frame image blocks. The expression is as follows:
The motion characteristic of the current CU is judged using the sum of squared differences of luminance pixel values between the CU at the current depth and the coded CU at the corresponding temporal position: ASSD = Σ_{i=0}^{2N-1} Σ_{j=0}^{2N-1} (CU_Cur(i,j) − CU_Col(i,j))², where 2N is the size of the current CU, and CU_Cur(i,j) and CU_Col(i,j) are the luminance pixel values at position (i,j) in the current CU and the temporally co-located CU, respectively. The larger the ASSD, the larger the difference between temporally adjacent CUs and the more intense the motion;
The ratio of the residual energy between the two image blocks to the energy of the reference image block is used as the decision criterion, i.e. e = ‖x_{t+1,i} − x_{t-1,i}‖² / ‖x_{t-1,i}‖², where x_{t+1,i} and x_{t-1,i} denote the data of the i-th blocks of the P frame and the I frame;
where ASSD_X denotes the ASSD value when the optimal depth of the coded CU in the current frame is X (0–3), i.e. ASSD_X only considers CUs whose optimal coding depth is X; the other parameters have the same meaning as in ASSD;
Since each measurement vector is a linear projection of the original signal data, the ratio of the residual energy of the measurement vectors to the measured energy of the reference image block is used as the image-block classification criterion for the P frame, expressed as: ThSkip = b × Ave(ASSD_notX)
ThSkip is the mean ASSD of all coded CUs whose optimal depth is greater than X. If the ASSD of the current CU block at depth X is greater than ThSkip, the motion of the current CU block is relatively severe, so the calculation and decision of the various PU prediction modes at depth X are unnecessary and the process skips directly to the next depth X+1 to evaluate smaller CUs. Here b is an adjustment parameter; the larger the value of b, the larger the corresponding ThSkip;
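The depth-skip decision described above can be sketched as follows. The adjustment parameter b and the CU samples are illustrative assumptions, not values from the patent.

```python
import numpy as np

def assd(cu_cur, cu_col):
    """Sum of squared luminance differences between the current CU and the
    temporally co-located coded CU."""
    cur = np.asarray(cu_cur, dtype=float)
    col = np.asarray(cu_col, dtype=float)
    return float(np.sum((cur - col) ** 2))

def skip_to_next_depth(assd_at_x, deeper_assds, b=1.5):
    """Skip PU mode decisions at depth X when motion is severe.

    ThSkip = b * mean ASSD of coded CUs whose optimal depth exceeds X;
    if the current CU's ASSD at depth X exceeds ThSkip, go straight to
    depth X+1 (smaller CUs).
    """
    th_skip = b * float(np.mean(deeper_assds))
    return assd_at_x > th_skip
```

A larger b raises ThSkip, making the encoder less eager to skip mode decisions at the current depth, which trades speed for a more exhaustive search.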
for B frames:
The B frame is classified using its correlation with the preceding and following frames. The average of the measurement vectors obtained by pre-sampling the I frame and the P frame is computed, and the classification value is the ratio of the energy of the difference between the B-frame measurement vector and this average to the energy of the average: e = ‖y_B − (y_I + y_P)/2‖² / ‖(y_I + y_P)/2‖².
For motion estimation, the Lagrangian cost is
J(m, λMotion) = SAD(s, c(m)) + λMotion · R(m − p)
where m = (mx, my)^T is the motion vector, λMotion is the Lagrangian multiplier, SAD is the sum of absolute differences, s is the original video signal, c is the decoded video signal, p = (px, py)^T is the predicted motion vector, and R(m − p) is the number of bits required for the motion-vector difference;
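The Lagrangian motion cost above can be evaluated as in the sketch below. The rate function for the motion-vector difference is a simple placeholder of my own (real codecs count entropy-coded bits), so treat it as an assumption.

```python
import numpy as np

def sad(s, c):
    """Sum of absolute differences between the source block s and candidate c(m)."""
    return float(np.sum(np.abs(np.asarray(s, float) - np.asarray(c, float))))

def motion_cost(s, c, mv, pred_mv, lam, rate_bits):
    """J(m, lambda_motion) = SAD(s, c(m)) + lambda_motion * R(m - p)."""
    dmx, dmy = mv[0] - pred_mv[0], mv[1] - pred_mv[1]
    return sad(s, c) + lam * rate_bits(dmx, dmy)

# Placeholder rate model: bits proportional to the MV-difference magnitude
rate_l1 = lambda dx, dy: abs(dx) + abs(dy)
```

The candidate with the smallest J wins, balancing prediction distortion (SAD) against the bits needed to signal the motion-vector difference.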
Image blocks are divided into three classes by the magnitude of e: basically unchanged blocks, slowly changing blocks, and fast-changing blocks. Two thresholds T1 and T2 (T1 < T2) are set and the obtained e value is compared with them: when e < T1, the block is judged to be basically unchanged; when T1 ≤ e ≤ T2, the block is judged to be slowly changing; when e > T2, the block is judged to be fast-changing.
For intra-frame correlation, the types of the four image blocks adjacent to the current block (above, below, left, and right) are corrected according to the type of the current block.
In the adaptive sampling-rate measurement process, sampling rates S1, S2, and S3 (S1 < S2 < S3) are allocated to basically unchanged blocks, slowly changing blocks, and fast-changing blocks respectively, where S3 is also the sampling rate of the reference-frame image blocks. The numbers of sampling points for the three block classes are M1 = S1·D², M2 = S2·D², and M3 = S3·D², where D is the side length of the K image blocks generated by partitioning the reference frame and D² is the one-dimensional signal length of each image block;
The adaptive sampling-rate measurement procedure for the non-reference frames (P frames and B frames) is as follows:
For a basically unchanged block, the values at positions M1+1 through M2 of the pre-sampling vector are discarded and the first M1 values are retained as the measurement vector, reducing the block's sampling rate to S1;
For a slowly changing block, all M2 measured values in the pre-sampling vector are kept unchanged, so the measurement vector equals the pre-sampling vector and the block's sampling rate remains S2;
For a fast-changing block, the measurement matrix Φ of the reference frame (I frame) is used for projection to obtain the measurement vector; the block's sampling rate is then the same as that of the I frame, namely S3;
Second, video compressed-sensing decoding based on the variable sampling rate is performed as follows:
For a reference frame, reconstruction uses the BCS-SPL algorithm, i.e. block-based compressed sensing with smoothed projected Landweber reconstruction;
For the reconstruction of a non-reference frame, the basically unchanged blocks and slowly changing blocks are first preprocessed; the preprocessed image blocks then undergo multi-hypothesis prediction to obtain a predicted frame; residual reconstruction then yields the target frame; and finally the reconstruction result is corrected using bidirectional motion estimation.
The BCS-SPL algorithm comprises the following steps:
(1) Minimum mean-square-error (MMSE) estimation is used to obtain an approximate solution of the original signal, i.e. the initial solution of the image, x0 = Φ^T y;
(2) An adaptive Wiener filter of size 3 × 3 is used to reduce the blocking artifacts caused by block partitioning and to smooth the image;
(3) Projecting the filtered image into a convex set, denoted as:
p(x,y,Φ)=x+ΦT(ΦΦT)-1(y-Φx) (5)
When Φ is an orthogonal matrix, i.e. ΦΦ^T = I, formula (5) simplifies to
p(x,y,Φ)=x+ΦT(y-Φx) (6)
(4) Filtering the projection result by using a bivariate shrinkage threshold value to realize a denoising process;
(5) Whether the iteration termination condition is met is judged; if not, the iteration is repeated until the optimal solution is obtained; a maximum number of iterations is also set.
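Steps (1)–(5) can be sketched as below. This is a simplified 1-D illustration of the skeleton only: the 3 × 3 adaptive Wiener filter and the bivariate shrinkage of the real BCS-SPL are replaced here by a mild moving-average smoothing, so it shows the initialize–filter–project–iterate structure rather than the full algorithm.

```python
import numpy as np

def bcs_spl_sketch(y, phi, n_iter=50, tol=1e-8):
    """Skeleton of BCS-SPL: x0 = Phi^T y, then iterate smoothing + projection."""
    x = phi.T @ y                                  # step (1): initial solution
    for _ in range(n_iter):
        x_prev = x.copy()
        # steps (2)/(4) stand-in: simple smoothing instead of Wiener
        # filtering and bivariate shrinkage
        x = np.convolve(x, np.ones(3) / 3.0, mode="same")
        # step (3): projection onto the convex set {x : Phi x = y}
        x = x + phi.T @ np.linalg.solve(phi @ phi.T, y - phi @ x)
        # step (5): terminate when the iterate stops changing
        if np.linalg.norm(x - x_prev) < tol:
            break
    return x
```

Because each iteration ends with the projection step, every returned solution satisfies Φx = y exactly, i.e. it is consistent with the measurements, matching formula (5).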
As a second embodiment of the present invention, there is provided a video compression system based on inter-frame correlation, wherein the system includes a grouping module, a frame classifying module, a blocking module, a sampling rate allocating module, and a reconstructing module;
The grouping module is used for dividing an input video sequence into a plurality of picture groups, and each picture group comprises multi-frame pictures;
The frame classification module is used for setting, for each picture group, the 1st frame as a reference frame and the remaining frames as experimental frames, and sampling the reference frame at a fixed high sampling rate;
The block module is used for carrying out the same non-overlapping block processing on the reference frame and the non-reference frame, and dividing each reference frame and the non-reference frame into a plurality of image blocks;
The sampling rate distribution module is used for classifying the image blocks of the non-reference frame based on the change degree of the image blocks corresponding to the reference frame for each image block of the non-reference frame, and distributing different sampling rates for the image blocks of different categories;
The reconstruction module is used for directly reconstructing the reference frame, preprocessing an image block with a lower sampling rate, then performing multi-hypothesis prediction to obtain a predicted frame, correcting errors between the predicted frame and an original frame through residual error reconstruction to obtain a high-quality reconstructed frame, and finally correcting a reconstruction result through bidirectional motion estimation.
The foregoing description of the preferred embodiments is not intended to limit the invention to the precise form disclosed; any modifications, equivalents, and alternatives falling within the spirit and scope of the invention are intended to be included within its scope.
Claims (8)
1. A method of video compression based on inter-frame correlation, the method comprising:
step 1, dividing an input video sequence into a plurality of picture groups, wherein each picture group comprises a plurality of frames of pictures;
Step 2, for each picture group, setting the 1st frame as a reference frame and the remaining frames as experimental frames, and sampling the reference frame at a fixed high sampling rate;
step 3, carrying out the same non-overlapping blocking treatment on the reference frame and the non-reference frame, and dividing each reference frame and the non-reference frame into a plurality of image blocks;
step 4, classifying the image blocks of the non-reference frame based on the change degree of the image blocks corresponding to the reference frame, and distributing different sampling rates to the image blocks of different categories;
Step 5, directly reconstructing the reference frame, preprocessing an image block with a lower sampling rate, then performing multi-hypothesis prediction to obtain a predicted frame, correcting errors between the predicted frame and a corresponding original frame through residual error reconstruction to obtain a high-quality reconstructed frame, and finally correcting a reconstruction result through bidirectional motion estimation;
Each picture group comprises three frames of pictures, wherein a first frame is set as a reference frame, namely an I frame, and a third frame is set as a forward prediction frame, namely a P frame; setting a second frame as a bi-directional prediction frame, namely a B frame, carrying out the same non-overlapping block division treatment on the B frame, the P frame and the I frame, and uniformly dividing the B frame, the P frame and the I frame into K image blocks with D multiplied by D size;
Classifying image blocks of a non-reference frame based on the magnitude of the degree of change of the image blocks corresponding to the reference frame specifically includes:
For P frames:
The residual-energy difference between the original data of the i-th image block of the P frame Xt+1 and the image block at the corresponding position in the I frame Xt-1 reflects the correlation between the two corresponding image blocks; therefore, the magnitude of this residual-energy difference is used as the classification basis for the P-frame image blocks, and the expression is as follows:
The motion characteristic of the current CU is judged using the sum of squared differences (ASSD) of luminance pixel values between the CU at the current depth and the coded CU at the corresponding temporal position, where 2N is the size of the current CU, and CU_Cur(i,j) and CU_Col(i,j) are the luminance pixel values at position (i,j) in the current CU and the temporally co-located CU, respectively; the larger the ASSD, the larger the difference between temporally adjacent CUs and the more intense the motion;
Using the ratio of the residual energy of the two image blocks to the energy of the reference image block as a judging standard;
Since each measurement vector is a linear projection of the original signal data, the ratio of the residual energy of the measurement vector to the measured energy of the reference image block is used as the image-block classification criterion for the P frame, expressed as: ThSkip = b × Ave(ASSD_notX);
ThSkip is the mean ASSD of all coded CUs whose optimal depth is greater than X. If the ASSD of the current CU block at depth X is greater than ThSkip, the motion of the current CU block is relatively severe, so the calculation and decision of the various PU prediction modes at depth X are unnecessary and the process skips directly to the next depth X+1 to evaluate smaller CUs. Here b is an adjustment parameter; the larger the value of b, the larger the corresponding ThSkip;
for B frames:
The B frame is classified using its correlation with the preceding and following frames: the average of the measurement vectors obtained by pre-sampling the I frame and the P frame is computed, and the ratio of the energy of the difference between the B-frame measurement vector and this average to the energy of the average is calculated. The motion-estimation cost is given by:
J(m,λMotion)=SAD(s,c(m))+λMotionR(m-p)
where m is the motion vector, λMotion is the Lagrangian multiplier, SAD is the sum of absolute differences, s is the original video signal, c is the decoded video signal, p is the predicted motion vector, and R(m − p) is the number of bits required for the motion-vector difference;
Image blocks are divided into three classes by the magnitude of e: basically unchanged blocks, slowly changing blocks, and fast-changing blocks. First, two thresholds T1 and T2 (T1 < T2) are set and the obtained e value is compared with them: when e < T1, the block is judged to be basically unchanged; when T1 ≤ e ≤ T2, the block is judged to be slowly changing; when e > T2, the block is judged to be fast-changing.
2. The method of video compression based on inter-frame correlation according to claim 1, wherein in step 4, classifying the image blocks is specifically: and calculating residual energy difference values between image blocks corresponding to the reference frame and the non-reference frame, and dividing the image blocks into three types based on the residual energy difference values, namely a basic unchanged block, a slow changed block and a fast changed block.
3. The method of video compression based on inter-frame correlation according to claim 1, wherein the categories of image blocks include a substantially constant block, a slowly varying block, and a rapidly varying block, the method further comprising:
In the adaptive sampling-rate measurement process, sampling rates S1, S2, and S3 (S1 < S2 < S3) are allocated to basically unchanged blocks, slowly changing blocks, and fast-changing blocks respectively, the sampling rate of the reference-frame image blocks also being S3; the numbers of sampling points for the three block classes are M1 = S1·D², M2 = S2·D², and M3 = S3·D², where D is the side length of the K image blocks generated by partitioning the reference frame and D² is the one-dimensional signal length of each image block;
In step 5, for the reference frame, reconstructing by using a BCS-SPL algorithm, namely a segmented image compressed sensing reconstruction method; for a non-reference frame, firstly, preprocessing a basic unchanged block and a slow changed block, then, performing multi-hypothesis prediction to obtain a predicted frame, then, correcting errors between the predicted frame and an original frame through residual error reconstruction to obtain a high-quality reconstructed frame, and finally, correcting a reconstruction result by using bidirectional motion estimation.
4. A video compression method based on inter-frame correlation according to claim 3, wherein the BCS-SPL algorithm is processed as follows:
a, obtaining an approximate solution of the original signal, i.e. the initial solution of the image x0 = Φ^T y, using minimum mean-square-error (MMSE) estimation;
b, reducing the blocking artifacts caused by block partitioning and smoothing the image using an adaptive Wiener filter of size 3 × 3;
c, projecting the filtered image block into a convex set, expressed as:
p(x,y,Φ)=x+ΦT(ΦΦT)-1(y-Φx);
when Φ is an orthogonal matrix, i.e. ΦΦ^T = I, the above formula simplifies to:
p(x,y,Φ)=x+ΦT(y-Φx);
d, filtering the projection result by using a bivariate shrinkage threshold value to realize a denoising process;
e, judging whether an iteration termination condition is met, if not, repeatedly iterating until an optimal solution is obtained, and setting the maximum iteration times;
where x is the image-signal estimate, y is the measurement vector of the block, Φ is the measurement matrix, and the superscript T denotes matrix transposition.
5. The video compression system based on the inter-frame correlation is characterized by comprising a grouping module, a frame classifying module, a blocking module, a sampling rate distributing module and a reconstructing module;
The grouping module is used for dividing an input video sequence into a plurality of picture groups, and each picture group comprises multi-frame pictures;
The frame classification module is used for setting, for each picture group, the 1st frame as a reference frame and the remaining frames as experimental frames, and sampling the reference frame at a fixed high sampling rate;
The block module is used for carrying out the same non-overlapping block processing on the reference frame and the non-reference frame, and dividing each reference frame and the non-reference frame into a plurality of image blocks;
The sampling rate distribution module is used for classifying the image blocks of the non-reference frame based on the change degree of the image blocks corresponding to the reference frame for each image block of the non-reference frame, and distributing different sampling rates for the image blocks of different categories;
The reconstruction module is used for directly reconstructing the reference frame, preprocessing an image block with a lower sampling rate, then performing multi-hypothesis prediction to obtain a predicted frame, correcting errors between the predicted frame and an original frame through residual error reconstruction to obtain a high-quality reconstructed frame, and finally correcting a reconstruction result through bidirectional motion estimation;
Each picture group comprises three frames of pictures, wherein a first frame is set as a reference frame, namely an I frame, and a third frame is set as a forward prediction frame, namely a P frame; setting a second frame as a bi-directional prediction frame, namely a B frame, carrying out the same non-overlapping block division treatment on the B frame, the P frame and the I frame, and uniformly dividing the B frame, the P frame and the I frame into K image blocks with D multiplied by D size;
Classifying image blocks of a non-reference frame based on the magnitude of the degree of change of the image blocks corresponding to the reference frame specifically includes:
For P frames:
The residual-energy difference between the original data of the i-th image block of the P frame Xt+1 and the image block at the corresponding position in the I frame Xt-1 reflects the correlation between the two corresponding image blocks; therefore, the magnitude of this residual-energy difference is used as the classification basis for the P-frame image blocks, and the expression is as follows:
The motion characteristic of the current CU is judged using the sum of squared differences (ASSD) of luminance pixel values between the CU at the current depth and the coded CU at the corresponding temporal position, where 2N is the size of the current CU, and CU_Cur(i,j) and CU_Col(i,j) are the luminance pixel values at position (i,j) in the current CU and the temporally co-located CU, respectively; the larger the ASSD, the larger the difference between temporally adjacent CUs and the more intense the motion;
Using the ratio of the residual energy of the two image blocks to the energy of the reference image block as a judging standard;
Since each measurement vector is a linear projection of the original signal data, the ratio of the residual energy of the measurement vector to the measured energy of the reference image block is used as the image-block classification criterion for the P frame, expressed as: ThSkip = b × Ave(ASSD_notX);
ThSkip is the mean ASSD of all coded CUs whose optimal depth is greater than X. If the ASSD of the current CU block at depth X is greater than ThSkip, the motion of the current CU block is relatively severe, so the calculation and decision of the various PU prediction modes at depth X are unnecessary and the process skips directly to the next depth X+1 to evaluate smaller CUs. Here b is an adjustment parameter; the larger the value of b, the larger the corresponding ThSkip;
for B frames:
The B frame is classified using its correlation with the preceding and following frames: the average of the measurement vectors obtained by pre-sampling the I frame and the P frame is computed, and the ratio of the energy of the difference between the B-frame measurement vector and this average to the energy of the average is calculated. The motion-estimation cost is given by:
J(m,λMotion)=SAD(s,c(m))+λMotionR(m-p)
where m is the motion vector, λMotion is the Lagrangian multiplier, SAD is the sum of absolute differences, s is the original video signal, c is the decoded video signal, p is the predicted motion vector, and R(m − p) is the number of bits required for the motion-vector difference;
Image blocks are divided into three classes by the magnitude of e: basically unchanged blocks, slowly changing blocks, and fast-changing blocks. First, two thresholds T1 and T2 (T1 < T2) are set and the obtained e value is compared with them: when e < T1, the block is judged to be basically unchanged; when T1 ≤ e ≤ T2, the block is judged to be slowly changing; when e > T2, the block is judged to be fast-changing.
6. The video compression system based on inter-frame correlation according to claim 5, wherein the image blocks are classified as: and calculating residual energy difference values between image blocks corresponding to the reference frame and the non-reference frame, and dividing the image blocks into three types based on the residual energy difference values, namely a basic unchanged block, a slow changed block and a fast changed block.
7. The video compression system based on inter-frame correlation according to claim 5, wherein the categories of image blocks include a substantially constant block, a slowly varying block, and a fast varying block, the reconstruction module being specifically configured to:
In the adaptive sampling-rate measurement process, sampling rates S1, S2, and S3 (S1 < S2 < S3) are allocated to basically unchanged blocks, slowly changing blocks, and fast-changing blocks respectively, the sampling rate of the reference-frame image blocks also being S3; the numbers of sampling points for the three block classes are M1 = S1·D², M2 = S2·D², and M3 = S3·D², where D is the side length of the K image blocks generated by partitioning the reference frame and D² is the one-dimensional signal length of each image block;
for a reference frame, reconstructing by using a BCS-SPL algorithm, namely a block image compressed sensing reconstruction method; for a non-reference frame, firstly, preprocessing a basic unchanged block and a slow changed block, then, performing multi-hypothesis prediction to obtain a predicted frame, then, correcting errors between the predicted frame and an original frame through residual error reconstruction to obtain a high-quality reconstructed frame, and finally, correcting a reconstruction result by using bidirectional motion estimation.
8. The video compression system based on inter-frame correlation as recited in claim 7, wherein the BCS-SPL algorithm is processed as follows:
a, obtaining an approximate solution of the original signal, i.e. the initial solution of the image x0 = Φ^T y, using minimum mean-square-error (MMSE) estimation;
b, reducing the blocking artifacts caused by block partitioning and smoothing the image using an adaptive Wiener filter of size 3 × 3;
c, projecting the filtered image block into a convex set, expressed as:
p(x,y,Φ)=x+ΦT(ΦΦT)-1(y-Φx);
when Φ is an orthogonal matrix, i.e. ΦΦ^T = I, the above formula simplifies to:
p(x,y,Φ)=x+ΦT(y-Φx);
d, filtering the projection result by using a bivariate shrinkage threshold value to realize a denoising process;
e, judging whether an iteration termination condition is met, if not, repeatedly iterating until an optimal solution is obtained, and setting the maximum iteration times;
where x is the image-signal estimate, y is the measurement vector of the block, Φ is the measurement matrix, and the superscript T denotes matrix transposition.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111407064.XA CN114173131B (en) | 2021-11-24 | 2021-11-24 | Video compression method and system based on inter-frame correlation |
Publications (2)
Publication Number | Publication Date |
---|---|
CN114173131A CN114173131A (en) | 2022-03-11 |
CN114173131B true CN114173131B (en) | 2024-09-17 |
Family
ID=80480814
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202111407064.XA Active CN114173131B (en) | 2021-11-24 | 2021-11-24 | Video compression method and system based on inter-frame correlation |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114173131B (en) |
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117252879B (en) * | 2023-11-17 | 2024-04-05 | 深圳禾思众成科技有限公司 | A microscopic line scanning imaging system for semiconductor detection and control method thereof |
CN118233632B (en) * | 2024-03-12 | 2024-10-18 | 北京富通亚讯网络信息技术有限公司 | Image filter based on B frame compression |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105992009A (en) * | 2015-02-05 | 2016-10-05 | 袁琳琳 | Motion-compensation-and-block-based video compressed sensing processing method |
CN106559670A (en) * | 2016-11-07 | 2017-04-05 | 湖南源信光电科技有限公司 | A kind of improved piecemeal video compress perception algorithm |
Family Cites Families (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2016029243A1 (en) * | 2014-08-26 | 2016-03-03 | Vincenzo Liguori | Video compression system that utilizes compact signature vectors for inter and intra prediction |
CN108200440B (en) * | 2017-12-31 | 2019-08-23 | 南京邮电大学 | A kind of distributed video compressed sensing reconstructing method based on temporal correlation |
CN109120931A (en) * | 2018-09-05 | 2019-01-01 | 浙江树人学院 | A kind of streaming media video compression method based on frame-to-frame correlation |
Also Published As
Publication number | Publication date |
---|---|
CN114173131A (en) | 2022-03-11 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN101855911B (en) | Treating video information | |
CN111709896B (en) | Method and equipment for mapping LDR video into HDR video | |
JP5969389B2 (en) | Object recognition video coding strategy | |
US20060188014A1 (en) | Video coding and adaptation by semantics-driven resolution control for transport and storage | |
US20140321552A1 (en) | Optimization of Deblocking Filter Parameters | |
KR20070117660A (en) | Content Adaptive Multimedia Processing | |
CN108989802B (en) | A method and system for quality estimation of HEVC video stream using inter-frame relationship | |
JP2004514351A (en) | Video coding method using block matching processing | |
CN1695381A (en) | Sharpness enhancement in post-processing of digital video signals using coding information and local spatial features | |
EP3529986A1 (en) | Low complexity mixed domain collaborative in-loop filter for lossy video coding | |
CN114173131B (en) | Video compression method and system based on inter-frame correlation | |
Lin et al. | PEA265: Perceptual assessment of video compression artifacts | |
GB2459671A (en) | Scene Change Detection For Use With Bit-Rate Control Of A Video Compression System | |
WO2017004889A1 (en) | Jnd factor-based super-pixel gaussian filter pre-processing method | |
US11792399B2 (en) | Systems and methods for quantization of video content | |
CN1482810A (en) | Method for motion estimation (me) through discrete cosine transform (dct) and an apparatus therefor | |
JP2001320713A (en) | Image preprocessing method | |
WO2016033725A1 (en) | Block segmentation mode processing method in video coding and relevant apparatus | |
Yang et al. | Content adaptive spatial–temporal rescaling for video coding optimization | |
Xie et al. | Just noticeable visual redundancy forecasting: a deep multimodal-driven approach | |
CN107509074B (en) | Adaptive 3D Video Compression Codec Method Based on Compressed Sensing | |
CN110800298A (en) | Code rate allocation method, code rate control method, encoder, and recording medium | |
KR20240024921A (en) | Methods and devices for encoding/decoding image or video | |
JP4763241B2 (en) | Motion prediction information detection device | |
CN115190298B (en) | Image encoding method, encoding device and computer readable storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||