
CN110381313B - Video Compressed Sensing Reconstruction Method Based on LSTM Network and Blind Evaluation of Image Group Quality - Google Patents


Info

Publication number
CN110381313B
CN110381313B (application CN201910610758.XA)
Authority
CN
China
Prior art keywords
image group
lstm network
frame
reconstruction
reconstructed
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910610758.XA
Other languages
Chinese (zh)
Other versions
CN110381313A (en)
Inventor
刘浩
魏冬
周健
田伟
陈根龙
黄荣
孙韶媛
李德敏
周武能
魏国林
廖荣生
黄震
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Donghua University
Original Assignee
Donghua University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Donghua University filed Critical Donghua University
Priority to CN201910610758.XA priority Critical patent/CN110381313B/en
Publication of CN110381313A publication Critical patent/CN110381313A/en
Application granted granted Critical
Publication of CN110381313B publication Critical patent/CN110381313B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/124 Quantisation
    • H04N19/126 Details of normalisation or weighting functions, e.g. normalisation matrices or variable uniform quantisers
    • H04N19/134 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/136 Incoming video signal characteristics or properties
    • H04N19/137 Motion inside a coding unit, e.g. average field, frame or block difference
    • H04N19/139 Analysis of motion vectors, e.g. their magnitude, direction, variance or reliability
    • H04N19/154 Measured or subjectively estimated visual quality after decoding, e.g. measurement of distortion
    • H04N19/169 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/177 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, the unit being a group of pictures [GOP]
    • H04N19/189 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the adaptation method, adaptation tool or adaptation type used for the adaptive coding
    • H04N19/192 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the adaptation method, adaptation tool or adaptation type used for the adaptive coding, the adaptation method, adaptation tool or adaptation type being iterative or recursive

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract


The invention relates to a video compressed sensing reconstruction method based on an LSTM network and blind evaluation of image-group quality. The reconstruction end receives the frame observation vector code stream and combines it into continuous image-group observation vectors; for each image-group observation vector it performs LSTM-network-based multi-frame joint iterative reconstruction to obtain the corresponding reconstructed image group, outputs the final reconstructed frames one by one, and decides whether to update the parameter set of the LSTM network according to whether reconstruction keeps stopping at the maximum iteration count. The invention combines sparse prior modeling with a data-driven mechanism, which helps to improve the quality of the reconstructed video.


Description

Video compressed sensing reconstruction method based on an LSTM network and blind evaluation of image-group quality
Technical Field
The invention relates to the technical field of video compressed sensing reconstruction, and in particular to a video compressed sensing reconstruction method based on an LSTM network and blind evaluation of image-group quality.
Background
The rise of compressed sensing provides a novel signal acquisition and recovery mechanism. According to compressed sensing theory, the original signal only needs to be projected onto a random basis to obtain a small number of measurements, and any signal that is sparse or nearly sparse in some transform domain can be recovered from those measurements. In compressed sensing video communication, the measurement end and the reconstruction end are highly asymmetric: the measurement end is a cyber-physical system characterized by limited physical and computing resources and by cooperative signal acquisition and transmission, whereas the resource-rich reconstruction end must recover the original signal without a feedback channel.
Video compressed sensing generally adopts a communication architecture of "independent measurement of each frame, joint reconstruction of multiple frames", shifting the computational burden from the measurement end to the reconstruction end; the extremely simple measurement-end design is well suited to resource-limited visual sensors in a sensing network. The measurement end observes and encodes each frame of the video independently with the same observation matrix, producing consecutive frame observation vectors that are sent out as a code stream. After receiving the code stream, the reconstruction end combines it into consecutive image-group observation vectors; multi-frame joint reconstruction exploits spatio-temporal redundancy to varying degrees, yielding different reconstruction speeds and qualities.
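As a concrete illustration of the "independent measurement of each frame" architecture, the sketch below observes every frame with one shared Gaussian observation matrix and stacks the resulting frame observation vectors into an image-group observation vector. All names and sizes here (measure_frames, N, M, L) are illustrative assumptions, not taken from the patent.

```python
import numpy as np

def measure_frames(frames, Phi):
    """Observe each frame independently with the same observation matrix Phi.

    frames: list of 1-D arrays of length N (vectorized images)
    Phi:    M x N observation matrix (Gaussian random here)
    Returns an M x L array whose i-th column is the i-th frame observation vector.
    """
    return np.stack([Phi @ f for f in frames], axis=1)

rng = np.random.default_rng(0)
N, M, L = 64, 16, 3                                  # toy sizes: N pixels, M measurements, L frames
Phi = rng.standard_normal((M, N)) / np.sqrt(M)       # shared Gaussian observation matrix
frames = [rng.standard_normal(N) for _ in range(L)]  # one image group, already vectorized
GMV = measure_frames(frames, Phi)                    # one image-group observation vector (M x L)
```

The measurement end only ever computes matrix-vector products, which is what makes the architecture attractive for resource-limited sensors.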
The original image signal is unavailable during the reconstruction of video compressed sensing, so reconstruction performance can hardly be evaluated against the original images. Blind video quality assessment trains on image samples from typical image databases, builds a blind evaluation model of video feature variation through supervised pattern recognition and statistical regression, and can assess the quality of multiple frames without the original images. Video-BLIINDS and VIIDEO, proposed by Bovik et al., are two typical blind video quality criteria: Video-BLIINDS is a frequency-domain statistical model of spatio-temporal natural scenes, while VIIDEO is a statistical model of the distribution of differences between consecutive frames. Blind quality evaluation of the reconstructed image group extracts intrinsic features of the multi-frame images and helps recover the structural information of the video signal.
Deep learning has shown promising performance in machine vision and image restoration tasks, and compressed sensing deep learning can fully exploit the resources of the reconstruction end to better reconstruct dynamically changing video signals. Long short-term memory (LSTM) networks perform attention-model-based modeling of long sequences and can express more complex spatio-temporal information; an LSTM-based deep learning mechanism helps recover the detailed information of video signals.
Disclosure of Invention
The technical problem to be solved by the invention is to provide a video compressed sensing reconstruction method based on an LSTM network and blind evaluation of image-group quality that combines sparse prior modeling with a data-driven mechanism and helps improve the quality of the reconstructed video.
The technical solution adopted by the invention to solve this problem is a video compressed sensing reconstruction method based on an LSTM network and blind evaluation of image-group quality, comprising the following steps:
(1) the reconstruction end receives the frame observation vector code stream and combines it into continuous image-group observation vectors;
(2) the parameter set of the LSTM network is trained with the reconstructed image group of the 1st image-group observation vector GMV_1;
(3) for the n-th image-group observation vector GMV_n (n ≥ 2), LSTM-network-based multi-frame joint iterative reconstruction is executed; the stopping condition is that the iteration count reaches the maximum K, or the residual l2-norm ||R_n,j||_2 falls below the threshold resMin, or the blind image-group quality Q_n^b rises above the threshold qMax, which completes the recovery of the n-th image group; the reconstructed frame F_n within it is taken as the final n-th reconstructed frame. After α consecutive image groups have been recovered, if each of them stopped only because the iteration count reached the maximum K, go to step (4); otherwise subsequent multi-frame joint iterative reconstruction keeps the current parameter set §* of the LSTM network, and jump to step (5);
(4) the reconstruction end trains the LSTM network with the reconstructed image group G_n of the n-th image-group observation vector GMV_n;
(5) if image-group observation vectors remain to be reconstructed, return to step (3) and continue recovering the image groups one by one; otherwise, output the remaining reconstructed frames F_n+1, ..., F_n+L-1 as the final (n+1)-th, ..., (n+L-1)-th reconstructed frames, completing the video reconstruction.
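The five steps above can be sketched as a control loop. Every callable and constant below (reconstruct_gop, train_lstm, K, ALPHA) is a hypothetical stand-in for the patent's components, shown only to make the adaptive-retraining flow concrete; it assumes at least two image groups arrive.

```python
K, ALPHA = 50, 4   # illustrative iteration cap and retraining window, not from the patent

def reconstruct_stream(gop_vectors, reconstruct_gop, train_lstm):
    """Skeleton of steps (1)-(5).

    reconstruct_gop(gmv, params) -> (list of L frames, iterations used)
    train_lstm(gmv, gop)         -> new LSTM parameter set
    """
    params = train_lstm(gop_vectors[0], None)   # step (2): train on the 1st group
    hit_max_run = 0                             # consecutive groups that stopped at K
    outputs, last = [], None
    for gmv in gop_vectors[1:]:                 # step (3): groups n >= 2
        gop, iters = reconstruct_gop(gmv, params)
        outputs.append(gop[0])                  # emit frame F_n immediately
        hit_max_run = hit_max_run + 1 if iters >= K else 0
        if hit_max_run >= ALPHA:                # step (4): retrain only when the cap
            params = train_lstm(gmv, gop)       # keeps being hit for ALPHA groups
            hit_max_run = 0
        last = gop
    outputs.extend(last[1:])                    # step (5): flush the last group's frames
    return outputs
```

The point of the run counter is that retraining is expensive, so it is triggered only when iterative reconstruction persistently fails to converge early.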
In step (1), each image-group observation vector comprises L frame observation vectors (L ≥ 2), and each frame observation vector contains M measured values.
In step (2), for the 1st image-group observation vector GMV_1, the reconstruction end recovers the 1st reconstructed image group G_1 = {F_1, F_2, ..., F_L} frame by frame with an image reconstruction algorithm, and then uses (G_1, GMV_1) as a reference data pair to train the parameter set §_1 of the LSTM network, obtaining the current parameter set of the LSTM network §* = §_1.
In step (3), the LSTM-network-based multi-frame joint iterative reconstruction initializes the i-th frame residual vector R_n,j(:,i) one by one from the frame observation vector GMV_n(:,i) and uses the initialized residual vectors R_n,j(:,i) as the input of the LSTM network. A transformation matrix U converts the LSTM network output o_n,j(:,i) for the i-th frame image in the j-th iteration into a base vector z_n,j(:,i), where ncell is the number of LSTM network neurons. The base vector z_n,j(:,i) is then fed into the softmax layer, which yields the non-zero probability of each element of the i-th frame sparse vector; the element with the highest probability is selected and added to the support set of the frame sparse vector. Finally, the frame sparse vectors {S_n,j(:,i)}_{i=1,2,...,L} of the j-th iteration are found one by one by least-squares estimation.
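The support-set growth and least-squares estimation described above follow an OMP-like pattern. In the sketch below, the non-zero probabilities come from residual correlations pushed through softmax, a stand-in for the patent's LSTM + transformation matrix + softmax pipeline; only the selection and least-squares mechanics are faithful, the learned part is mocked.

```python
import numpy as np

def softmax(v):
    e = np.exp(v - v.max())
    return e / e.sum()

def support_growing_step(A, y, support, residual):
    """One iteration of the support-set growth in step (3).

    The patent derives non-zero probabilities from an LSTM + softmax layer;
    here |A^T r| through softmax is a surrogate for that learned output.
    """
    probs = softmax(np.abs(A.T @ residual))       # surrogate for LSTM-derived probabilities
    for k in support:                             # never re-select a chosen element
        probs[k] = 0.0
    support = support | {int(np.argmax(probs))}   # add the most probable element
    idx = sorted(support)
    coef, *_ = np.linalg.lstsq(A[:, idx], y, rcond=None)  # least squares on the support
    s = np.zeros(A.shape[1])
    s[idx] = coef
    return support, s, y - A @ s                  # new residual feeds the next iteration
```

With one new support element per iteration, a k-sparse frame is recovered in k steps when the selection is correct.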
In step (3), the reconstruction end weights the residual coefficients of the residual vector obtained after repeated iterations according to the probability that each residual coefficient is zero, yielding a weighted residual minimization problem, which is solved with the Split Bregman iterative algorithm.
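A minimal sketch of the residual weighting, under the simplifying assumption that the weighted problem is solved by ordinary weighted least squares rather than by the Split Bregman algorithm the patent prescribes:

```python
import numpy as np

def weighted_residual_solve(A, y, p_zero):
    """Weight each residual coefficient by its probability of being zero,
    then solve  min_s || W (y - A s) ||_2^2  with  W = diag(p_zero).

    This shows only the weighting; the patent solves the full weighted
    problem with Split Bregman iterations.
    """
    W = np.diag(p_zero)
    s, *_ = np.linalg.lstsq(W @ A, W @ y, rcond=None)
    return s
```

Residual entries believed to be near zero get large weights, so the solver is pushed to drive exactly those entries down.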
In step (3), the blind image-group quality is evaluated with the Video-BLIINDS or VIIDEO criterion.
In step (4), the reconstruction end recovers the n-th reconstructed image group G_n = {F_n, F_n+1, ..., F_n+L-1} frame by frame with an image reconstruction algorithm, uses (G_n, GMV_n) as a reference data pair to train the parameter set §_n of the LSTM network, and updates the current parameter set of the LSTM network to §* = §_n.
When the LSTM network is trained, the reconstructed image group G_n for training is sparsely coded with the LSTM network, which sparsely represents the given data to obtain a coefficient matrix; the coefficient matrix is then fixed, and each atom of the LSTM network is updated in turn so that it represents the training image group G_n more closely.
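This alternation (sparse coding with the model fixed, then atom-by-atom updates with the codes fixed) is the classical dictionary-learning pattern. The sketch below uses a plain matrix dictionary D as a stand-in for the LSTM parameter set and plain least squares as a stand-in for sparse coding; it shows only the alternation structure, not the patent's actual LSTM training.

```python
import numpy as np

def alternate_train(D, X, n_iter=3):
    """Alternating training sketch: (a) code the data with D fixed,
    (b) update each atom (column) of D in turn with the codes fixed."""
    for _ in range(n_iter):
        S, *_ = np.linalg.lstsq(D, X, rcond=None)        # (a) coding step
        for k in range(D.shape[1]):                      # (b) atom updates
            row = S[k]
            if np.dot(row, row) > 1e-12:
                # representation error with atom k removed
                E = X - D @ S + np.outer(D[:, k], row)
                D[:, k] = E @ row / np.dot(row, row)     # least-squares atom update
    return D, S
```

Each atom update is a rank-one least-squares fit of the error that atom is responsible for, which is why the alternation monotonically reduces the representation error.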
Advantageous effects
Owing to the adopted technical solution, the invention has the following advantages and positive effects over the prior art: it takes the spatio-temporal sparse characteristics of continuous multi-frame images into account and provides a video compressed sensing reconstruction method that integrates sparse prior modeling with data-memory-driven reconstruction. The method can consider a large number of frames simultaneously, needs no linearity assumption about object motion, reflects the motion information of objects comprehensively, and helps recover the structure and detail information of the multi-frame images as a whole, thereby improving the quality of the reconstructed video.
Drawings
FIG. 1 is a timing diagram of image group observation vectors versus frame observation vectors;
FIG. 2 is a general flow chart of the video compressed sensing reconstruction method;
FIG. 3 is a flow chart of the LSTM-network-based multi-frame joint iterative reconstruction.
Detailed Description
The invention will be further illustrated with reference to the following specific embodiments. It should be understood that these embodiments are for illustration only and are not intended to limit the scope of the invention. Furthermore, it should be understood that, after reading the teaching of the invention, those skilled in the art may make various changes or modifications, and such equivalents likewise fall within the scope defined by the appended claims.
In compressed sensing video communication, the measurement terminal measures each frame of the original video, or of block-partitioned sub-videos, independently frame by frame and sends out a frame observation vector code stream. The reconstruction end receives the code stream and combines it into continuous image-group observation vectors. Each image-group observation vector contains L frame observation vectors, i denotes the frame number within one image group (1 ≤ i ≤ L), and each frame observation vector contains M measured values. GMV_n (an M×L matrix) denotes the n-th image-group observation vector (n ≥ 1); its L frame observation vectors are arranged as the L columns of GMV_n, with GMV_n(:,i) denoting the i-th frame observation vector. Based on the continuous frame observation vector code stream, FIG. 1 shows the timing relationship of the image-group observation vectors for the case where each image-group observation vector spans L = 3 frames. The reconstruction end performs long short-term memory (LSTM) based multi-frame joint iterative reconstruction on each image-group observation vector and, because consecutive image groups of a video are strongly correlated, adaptively decides whether to update the current parameter set of the LSTM network according to the recent behavior of the iterative reconstruction.
In the reconstructed video, each reconstructed image group contains L reconstructed frames. G_n = {F_n, F_n+1, ..., F_n+L-1} denotes the n-th reconstructed image group, Q_n^b denotes the blind image-group quality of G_n, and R_n,j denotes the residual vector of the n-th image group in the j-th iteration. F_n denotes the n-th reconstructed frame, containing N pixels. S_n,j denotes the image-group sparse vector corresponding to the reconstructed image group G_n in the j-th iteration. Based on the LSTM network and blind estimation of the image-group quality, FIG. 2 shows the general flow of the video compressed sensing reconstruction method, which mainly comprises the following steps:
In the first step, the initialization sets the sequence number n = 1. Because the original frames cannot be obtained, the reconstruction end trains the LSTM network parameters with the reconstructed image group recovered from the 1st image-group observation vector GMV_1. For GMV_1, the reconstruction end recovers the 1st reconstructed image group G_1 = {F_1, F_2, ..., F_L} frame by frame with a total-variation-minimization image reconstruction algorithm, and then trains the LSTM network with (G_1, GMV_1) as a reference data pair to obtain its parameter set §_1 as the current parameter set of the LSTM network, §* = §_1.
In the second step, for the n-th image-group observation vector GMV_n, the reconstruction end executes LSTM-network-based multi-frame joint iterative reconstruction: in each iteration it computes a residual, weights the residual coefficients according to the probability that each coefficient is zero to form a weighted residual minimization problem, and solves that problem with the Split Bregman iterative algorithm. The stopping condition of the multi-frame joint iterative reconstruction is that the iteration count reaches the maximum K, or the residual l2-norm ||R_n,j||_2 falls below the threshold resMin, or the blind image-group quality Q_n^b rises above the threshold qMax. The blind image-group quality is evaluated with the Video-BLIINDS or VIIDEO criterion, larger values meaning better quality. In cooperation with the newly introduced qMax, K may be chosen larger and resMin smaller. The reconstruction end completes the recovery of the image-group observation vector GMV_n, obtaining the reconstructed frames {F_n+i-1 = Ψ·S_n,j(:,i)}_{i=1,2,...,L} one by one and hence the reconstructed image group G_n = {F_n, F_n+1, ..., F_n+L-1}; it outputs the reconstructed frame F_n as the final n-th reconstructed frame, while the remaining reconstructed frames F_n+1, ..., F_n+L-1 serve as the initial state of the same time-sequence frames of the subsequent image group. Adjacent image groups have similar multi-frame joint sparsity characteristics, so the parameter set of the LSTM network generally need not be updated across the recovery of many image groups. When the recovery of α consecutive image groups has finished, if stopping at the maximum iteration count has persisted for those α image groups, continue with the third step; otherwise subsequent image-group observation vectors keep the current parameter set §* of the LSTM network, and jump to the fourth step.
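The stopping condition and the α-consecutive-groups retraining trigger of this step can be written as two small predicates. All threshold values below are illustrative, not taken from the patent.

```python
def should_stop(iters, res_norm, quality, K=50, res_min=1e-3, q_max=60.0):
    """Stop when the iteration cap is hit, the residual l2-norm is small,
    or the blind image-group quality is already high enough."""
    return iters >= K or res_norm < res_min or quality > q_max

def needs_retraining(stop_reasons, alpha=4):
    """Retrain the LSTM only if the last alpha groups all stopped at the cap."""
    return len(stop_reasons) >= alpha and all(
        r == "max_iters" for r in stop_reasons[-alpha:]
    )
```

Keeping the quality test in the disjunction is what lets K be set larger and resMin smaller without wasting iterations on already-good groups.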
In the multi-frame joint iterative reconstruction of this step, the L frame sparse vectors of one image-group observation vector are recovered jointly, and L reconstructed frames are output in frame order at the frame rate to form the reconstructed image group. In the video reconstruction, Φ (an M×N matrix) is a Gaussian random matrix, Ψ (an N×N matrix) is a dual-tree wavelet transform basis, and the observation matrix is A = Φ·Ψ. For any image-group observation vector, FIG. 3 shows the flow of the LSTM-network-based multi-frame joint iterative reconstruction. The i-th frame residual vector of the image-group residual vector R_n,j is R_n,j(:,i) = GMV_n(:,i) − A_j·S_n,j(:,i), with frame number i = 1,2,…,L, where A_j is the matrix containing only the columns of A corresponding to the support-set elements of S_n,j, and S_n,j(:,i) is the i-th frame sparse vector of the image-group sparse vector S_n,j. L is the total number of frames in one image-group observation vector, i.e. the number of columns of S_n,j. The joint sparse dependency among the multiple frame observation vectors evolves gradually and must be obtained dynamically by computing the conditional probabilities of the residuals. The method obtains this dependency by computing the conditional probability of each vector, using a data-driven LSTM network to infer these probabilities, and completes the least-squares estimation from GMV_n(:,i) and A_j. It is assumed that the columns of S_n,j are jointly sparse, i.e. the non-zero elements of each vector appear at the same positions as in the other vectors, meaning the frame sparse vectors share the same support set. The method initializes the i-th frame residual vector R_n,j(:,i) one by one from the i-th frame observation vector GMV_n(:,i) and uses these residual vectors as the input of the LSTM network; the LSTM output o_n,j(:,i) for the i-th frame in the j-th iteration is then converted with the transformation matrix U into a base vector z_n,j(:,i); z_n,j(:,i) is input to the softmax layer, whose output is interpreted as conditional probabilities, yielding the non-zero probability of each element of the i-th frame sparse vector; the element with the highest probability is selected and added to the support set of the frame sparse vectors; the i-th frame sparse vector is then found by least-squares estimation, and at the same time the new i-th frame residual vector is computed as the LSTM input for the next iteration.
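A small sketch of how the observation matrix A = Φ·Ψ might be assembled. An orthonormal DCT basis is used here as a stand-in for the dual-tree wavelet basis the patent specifies, only because it is compact to construct in a few lines.

```python
import numpy as np

def dct_basis(N):
    """Orthonormal DCT-II basis, columns = atoms; a stand-in for the
    patent's dual-tree wavelet transform basis Psi (so that x = Psi @ s)."""
    k, n = np.meshgrid(np.arange(N), np.arange(N), indexing="ij")
    C = np.sqrt(2.0 / N) * np.cos(np.pi * (2 * n + 1) * k / (2 * N))
    C[0] /= np.sqrt(2.0)     # DC row normalization
    return C.T

rng = np.random.default_rng(3)
N, M = 64, 32
Phi = rng.standard_normal((M, N)) / np.sqrt(M)   # Gaussian random measurement matrix
Psi = dct_basis(N)                               # sparsifying basis (stand-in)
A = Phi @ Psi                                    # observation matrix acting on sparse s
```

Because Ψ is orthonormal, recovering the sparse coefficients s is equivalent to recovering the pixels x = Ψ·s.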
In the third step, the reconstruction end trains the LSTM network parameters with the reconstructed image group of the n-th image-group observation vector GMV_n. The reconstruction end recovers the n-th reconstructed image group G_n = {F_n, F_n+1, ..., F_n+L-1} frame by frame with the total-variation-minimization image reconstruction algorithm, retrains the parameter set §_n of the LSTM network with (G_n, GMV_n) as a reference data pair, and updates the current parameter set of the LSTM network to §* = §_n.
This third step, i.e. training and updating the LSTM network parameters, is started only if stopping at the maximum iteration count has persisted for α image groups. The reconstructed image group G_n and its corresponding image-group observation vector GMV_n form the reference data pair (G_n, GMV_n). Solving for the LSTM network parameters requires minimizing the cross-entropy cost function between the conditional probabilities given by the LSTM network and the known probabilities of the reference data pair. The alternating training process first sparsely codes the training image group G_n with the LSTM network, i.e. it fixes the LSTM network and uses it to sparsely represent the given data, representing the image-group observation vector GMV_n as closely as possible with as few coefficients as possible, which yields a coefficient matrix; it then fixes the coefficient matrix and updates each atom of the LSTM network (each of its columns) in turn so that it represents the training image group G_n more closely. The training of the LSTM network parameters is usually repeated only at longer time-sequence intervals: the smaller the value of α, the more stable the quality of the reconstructed video, but the more computing resources are consumed. The updated LSTM network parameter set is used to recover the subsequent image groups.
In the fourth step, if image-group observation vectors to be reconstructed remain at the reconstruction end, set n = n + 1, jump back to the second step, and repeat the process to recover the subsequent image groups; otherwise the reconstruction end outputs the remaining reconstructed frames F_n+1, ..., F_n+L-1 of G_n as the final (n+1)-th, ..., (n+L-1)-th reconstructed frames, completing the video reconstruction.
In summary, the invention provides a video compressed sensing reconstruction method that integrates sparse prior modeling with data-memory-driven reconstruction, taking the spatio-temporal sparse characteristics of continuous multi-frame images into account, and thereby improves the quality of the reconstructed video.

Claims (7)

1.一种基于LSTM网络与图像组质量盲评估的视频压缩感知重构方法,其特征在于,包括以下步骤:1. a video compressive sensing reconstruction method based on LSTM network and image group quality blind assessment, is characterized in that, comprises the following steps: (1)重建端接收帧观测矢量码流,组合形成连续的图像组观测矢量;(1) The reconstruction end receives the frame observation vector code stream, and combines to form a continuous image group observation vector; (2)利用第1个图像组观测矢量GMV1的重建图像组训练LSTM网络的参数集合;(2) Using the reconstructed image group of the first image group observation vector GMV 1 to train the parameter set of the LSTM network; (3)对于第n个图像组观测矢量GMVn,执行基于LSTM网络的多帧联合迭代重构,其中,n≥2,停止条件是当迭代次数达到最大值K或残差l2范数||Rn,j||2小于阈值resMin或图像组盲质量Qb n高于阈值qMax,从而完成第n个重建图像组的恢复,将其中的重建帧Fn作为最终的第n个重建帧;当完成连续α个图像组的恢复后,若每个图像组的最终恢复均是迭代次数达到最大值K才停止的情况,则进入步骤(4);否则,后续的多帧联合迭代重构仍采用LSTM网络的当前参数集合§*,并跳转到步骤(5);其中,基于LSTM网络的多帧联合迭代重构是通过帧观测矢量GMVn(:,i)逐一初始化第i帧残差矢量Rn,j(:,i),并将初始化的残差矢量Rn,j(:,i)用作LSTM网络的输入;使用转换矩阵U将第j次迭代中第i帧图像的LSTM网络输出
Figure FDA0003103675240000011
转换为基矢量
Figure FDA0003103675240000012
ncell是LSTM网络神经元的个数;将基矢量zn,j(:,i)进一步输入到softmax层,由此得出第i帧稀疏矢量中各个元素的非零概率,选择具有最大概率的元素并将其添加到帧稀疏矢量的支撑集;最后,通过最小二乘估计法逐一找到第j次迭代中各个帧稀疏矢量{Sn,j(:,i)}i=1,2,…,L
(3) For the nth image group observation vector GMV n , perform joint iterative reconstruction of multiple frames based on LSTM network, where n ≥ 2, the stopping condition is when the number of iterations reaches the maximum value K or the residual l 2 norm| |R n,j || 2 is smaller than the threshold resMin or the image group blind quality Q b n is higher than the threshold qMax, thus completing the restoration of the nth reconstructed image group, and taking the reconstructed frame F n as the final nth reconstructed frame ; When the recovery of consecutive α picture groups is completed, if the final recovery of each picture group is the case that the number of iterations reaches the maximum value K before stopping, then enter step (4); otherwise, the subsequent multi-frame joint iterative reconstruction Still using the current parameter set of the LSTM network § * , and jumping to step (5); wherein, the multi-frame joint iterative reconstruction based on the LSTM network is to initialize the ith frame residual one by one through the frame observation vector GMV n (:, i). The difference vector R n,j (:,i), and the initialized residual vector R n,j (:,i) is used as the input of the LSTM network; use the transformation matrix U to convert the image of the i-th frame in the j-th iteration LSTM network output
Figure FDA0003103675240000011
Convert to base vector
Figure FDA0003103675240000012
ncell is the number of neurons in the LSTM network; the base vector z n,j (:,i) is further input into the softmax layer, and the non-zero probability of each element in the ith frame sparse vector is obtained, and the one with the largest probability is selected. elements and add them to the support set of the frame sparse vector; finally, find each frame sparse vector {S n,j (:,i)} i=1,2,… ,L ;
(4)重建端利用第n个图像组观测矢量GMVn的重建图像组Gn训练LSTM网络;(4) The reconstruction end uses the reconstructed image group G n of the nth image group observation vector GMV n to train the LSTM network; (5)如果仍然存在待重构的图像组观测矢量,则返回步骤(3),继续逐个恢复图像组;否则,输出剩余的重建帧Fn+1、…、Fn+L-1,作为最终的第n+1、…、n+L-1个重建帧,完成视频重构。(5) If there is still the observation vector of the image group to be reconstructed, then return to step (3), and continue to restore the image groups one by one; otherwise, output the remaining reconstructed frames F n+1 , . . . , F n+L-1 , as The final n+1, ..., n+L-1 reconstructed frames complete the video reconstruction.
2. The video compressed sensing reconstruction method based on an LSTM network and blind evaluation of image group quality according to claim 1, wherein in step (1) each image group observation vector comprises L frame observation vectors, where L ≥ 2, and each frame observation vector contains M measurement values. 3. The video compressed sensing reconstruction method based on an LSTM network and blind evaluation of image group quality according to claim 1, wherein in step (2), for the first image group observation vector GMV1, the reconstruction end applies an image reconstruction algorithm frame by frame to restore the first reconstructed image group G1 = {F1, F2, ..., FL}, then takes (G1, GMV1) as a reference data pair to train the parameter set §1 of the LSTM network, thereby obtaining the current parameter set of the LSTM network §* = §1. 4. The video compressed sensing reconstruction method based on an LSTM network and blind evaluation of image group quality according to claim 1, wherein in step (3) the reconstruction end computes the residual vector after multiple iterations, weights each residual coefficient by its probability of being zero to obtain a weighted residual minimization problem, and solves that problem with the Split Bregman iterative algorithm. 5. The video compressed sensing reconstruction method based on an LSTM network and blind evaluation of image group quality according to claim 1, wherein in step (3) the blind image group quality is evaluated using the Video-BLIINDS or VIIDEO criterion. 6. The video compressed sensing reconstruction method based on an LSTM network and blind evaluation of image group quality according to claim 1, wherein in step (4) the reconstruction end applies an image reconstruction algorithm frame by frame to restore the nth reconstructed image group Gn = {Fn, Fn+1, ..., Fn+L-1}, takes (Gn, GMVn) as a reference data pair to train the parameter set §n of the LSTM network, and updates the current parameter set of the LSTM network to §* = §n. 7. The video compressed sensing reconstruction method based on an LSTM network and blind evaluation of image group quality according to claim 1, wherein when training the LSTM network, the LSTM network is used to sparsely encode the reconstructed image group Gn used for training and to obtain a sparse representation of the given data, yielding a coefficient matrix; the coefficient matrix is then fixed, and each atom of the LSTM network is updated in turn so that it more closely represents the reconstructed image group Gn used for training.
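Claim 4 describes weighting each residual coefficient by its probability of being zero and solving the resulting weighted residual minimization with the Split Bregman iteration. The following is a minimal numerical sketch of such a weighted-ℓ1 Split Bregman solver; the problem sizes, weights, and parameters (`lam`, `mu`, iteration count) are illustrative assumptions, not values from the patent:

```python
import numpy as np

def split_bregman_weighted_l1(A, y, w, lam=0.05, mu=1.0, n_iter=200):
    """Solve  min_x  lam * sum_i w_i * |x_i|  +  0.5 * ||A x - y||^2
    with the Split Bregman iteration, using the splitting d = x."""
    n = A.shape[1]
    M = A.T @ A + mu * np.eye(n)   # system matrix of the x-subproblem
    Aty = A.T @ y
    x = np.zeros(n)
    d = np.zeros(n)
    b = np.zeros(n)
    for _ in range(n_iter):
        # x-update: quadratic subproblem (A^T A + mu I) x = A^T y + mu (d - b)
        x = np.linalg.solve(M, Aty + mu * (d - b))
        # d-update: coefficient-wise soft thresholding; a larger weight w_i
        # (a higher probability that the coefficient is zero) shrinks harder
        v = x + b
        t = (lam / mu) * w
        d = np.sign(v) * np.maximum(np.abs(v) - t, 0.0)
        # Bregman (dual) variable update
        b = b + x - d
    return d

# Toy recovery: a 3-sparse vector from 30 Gaussian measurements in R^50
rng = np.random.default_rng(0)
n_dim, n_meas = 50, 30
A = rng.standard_normal((n_meas, n_dim)) / np.sqrt(n_meas)
x_true = np.zeros(n_dim)
x_true[[5, 17, 33]] = [1.0, -1.0, 0.8]
y = A @ x_true
x_hat = split_bregman_weighted_l1(A, y, np.ones(n_dim))
print(np.linalg.norm(x_hat - x_true) / np.linalg.norm(x_true))
```

The sketch uses uniform weights; per the claim, nonuniform weights derived from the zero-probability of each residual coefficient would be passed as `w`, so coefficients judged likely to be zero are thresholded more aggressively.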
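Claim 7's training procedure resembles a dictionary-learning update: sparse-code the reconstructed image group to get a coefficient matrix, then, with that matrix held fixed, refine each atom in turn so the representation fits the training data more closely. A toy sketch with an explicit linear dictionary standing in for the network's learned representation (the sizes, the top-k coder, and the random data are illustrative assumptions):

```python
import numpy as np

def sparse_code(D, Y, k=3):
    """Crude sparse coder: keep, per sample, the k largest-magnitude
    correlations with the atoms (a stand-in for the network's sparse coding)."""
    C = D.T @ Y
    for j in range(C.shape[1]):
        weak = np.argsort(np.abs(C[:, j]))[:-k]
        C[weak, j] = 0.0
    return C

def update_atoms(D, Y, C):
    """With the coefficient matrix C held fixed, update each atom in turn to
    the least-squares minimizer of ||Y - D C||_F, so the dictionary more
    closely represents the training data, as described in claim 7."""
    D = D.copy()
    for i in range(D.shape[1]):
        c_i = C[i, :]
        if not np.any(c_i):
            continue  # atom unused by the current coding
        # Residual with atom i's contribution added back
        E = Y - D @ C + np.outer(D[:, i], c_i)
        # Exact minimizer of ||E - a c_i^T||_F over the atom a
        D[:, i] = E @ c_i / (c_i @ c_i)
    return D

# Toy data standing in for a reconstructed image group G_n
rng = np.random.default_rng(1)
Y = rng.standard_normal((20, 40))   # 40 training samples in R^20
D = rng.standard_normal((20, 10))   # 10 atoms
D /= np.linalg.norm(D, axis=0)

C = sparse_code(D, Y)
err_before = np.linalg.norm(Y - D @ C)
D = update_atoms(D, Y, C)
err_after = np.linalg.norm(Y - D @ C)
print(err_before, err_after)
```

Because each atom update is the exact least-squares minimizer for a fixed coefficient matrix, the representation error cannot increase within a sweep; a K-SVD-style implementation would additionally renormalize each atom and rescale the corresponding coefficient row, which is omitted here for simplicity.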
CN201910610758.XA 2019-07-08 2019-07-08 Video Compressed Sensing Reconstruction Method Based on LSTM Network and Blind Evaluation of Image Group Quality Active CN110381313B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910610758.XA CN110381313B (en) 2019-07-08 2019-07-08 Video Compressed Sensing Reconstruction Method Based on LSTM Network and Blind Evaluation of Image Group Quality


Publications (2)

Publication Number Publication Date
CN110381313A (en) 2019-10-25
CN110381313B (en) 2021-08-31

Family

ID=68252277

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910610758.XA Active CN110381313B (en) 2019-07-08 2019-07-08 Video Compressed Sensing Reconstruction Method Based on LSTM Network and Blind Evaluation of Image Group Quality

Country Status (1)

Country Link
CN (1) CN110381313B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112085102B (en) * 2020-09-10 2023-03-10 西安电子科技大学 No-reference video quality assessment method based on 3D spatio-temporal feature decomposition

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102630011A (en) * 2012-03-31 2012-08-08 浙江师范大学 Compressive perceptual coding and decoding method and system in video sensor network
CN107154064A (en) * 2017-05-04 2017-09-12 西安电子科技大学 Natural image compressed sensing method for reconstructing based on depth sparse coding
CN107317583A (en) * 2017-05-18 2017-11-03 湖北工业大学 Variable step size distributed compression based on Recognition with Recurrent Neural Network perceives method for reconstructing
WO2019081937A1 (en) * 2017-10-26 2019-05-02 Gb Gas Holdings Limited Determining operating state from complex sensor data


Also Published As

Publication number Publication date
CN110381313A (en) 2019-10-25

Similar Documents

Publication Publication Date Title
CN109903228B (en) Image super-resolution reconstruction method based on convolutional neural network
CN111667442B (en) High-quality high-frame-rate image reconstruction method based on event camera
WO2022267641A1 (en) Image defogging method and system based on cyclic generative adversarial network
WO2020037965A1 (en) Method for multi-motion flow deep convolutional network model for video prediction
CN113177882B (en) Single-frame image super-resolution processing method based on diffusion model
CN111539879A (en) Video blind denoising method and device based on deep learning
CN113673307A (en) A Lightweight Video Action Recognition Method
CN111696027B (en) Multi-modal image style migration method based on adaptive attention mechanism
CN106973293A (en) The light field image coding method predicted based on parallax
CN109636721B (en) Video super-resolution method based on countermeasure learning and attention mechanism
CN111127325B (en) Satellite video super-resolution reconstruction method and system based on cyclic neural network
CN110475118A (en) A kind of old film flicker removal method based on attention mechanism deep-cycle network
CN114170088A (en) Relational reinforcement learning system and method based on graph structure data
CN114463218B (en) Video deblurring method based on event data driving
CN114202459A (en) Blind image super-resolution method based on depth prior
CN113902647B (en) Image deblurring method based on double closed-loop network
CN112270691B (en) Monocular video structure and motion prediction method based on dynamic filter network
CN110363068A (en) A high-resolution pedestrian image generation method based on multi-scale recurrent generative adversarial network
CN112767283A (en) Non-uniform image defogging method based on multi-image block division
CN113793261A (en) Spectrum reconstruction method based on 3D attention mechanism full-channel fusion network
CN113992920A (en) Video compressed sensing reconstruction method based on deep expansion network
CN110381313B (en) Video Compressed Sensing Reconstruction Method Based on LSTM Network and Blind Evaluation of Image Group Quality
CN116109537A (en) Distorted image reconstruction method and related device based on deep learning
CN110505479B (en) Video compressed sensing reconstruction method with same measurement rate frame by frame under time delay constraint
CN104053006B (en) Video image compression sensing reconstruction method based on frame difference background image

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant