CN111083501A - Video frame reconstruction method and device and terminal equipment

Info

Publication number: CN111083501A
Application number: CN201911420286.8A
Authority: CN
Inventors: 郭烈强
Original assignee: Hefei Tucodec Information Technology Co ltd
Current assignee: Hefei Tucodec Information Technology Co ltd
Priority date: 2019-12-31
Filing date: 2019-12-31
Publication date: 2020-04-28

Abstract

The invention relates to the technical field of video compression and provides a video frame reconstruction method, a device and terminal equipment. The method comprises the following steps: acquiring optical flow information and a warp difference between the current frame and its previous frame; inputting the optical flow information, the warp difference and the current frame into an encoding network to obtain first feature information; inputting the related information of the reference frame into an encoding network to obtain second feature information; obtaining final feature information based on the first feature information and the second feature information, and encoding and storing the final feature information; and inputting the encoded final feature information and the second feature information into a decoding network to obtain a reconstructed frame. Before decoding, the invention applies a network transformation to the reference frame to extract features, and feeds the extracted features together with the features of the current frame into the decoder to restore the reconstructed image of the current frame, thereby making full use of the information in the reference frame and improving the reconstruction quality of the current frame. Moreover, because the features extracted from the reference frame do not need to be quantized, the loss caused by quantization is reduced, transmission bandwidth is saved, and the compression rate is improved.

Description

Video frame reconstruction method and device and terminal equipment

Technical Field

The invention belongs to the technical field of video compression, and particularly relates to a video frame reconstruction method, a video frame reconstruction device and terminal equipment.

Background

The prior art does not fully utilize all information of the reference frame and needs to quantize the characteristics of the reference frame, so that the compression quality and the compression rate are not high.

Therefore, a new technical solution is needed to solve the above problems.

Disclosure of Invention

In view of this, embodiments of the present invention provide a method and an apparatus for reconstructing a video frame, so as to solve the problem in the prior art that compression quality and compression ratio are not high.

A first aspect of an embodiment of the present invention provides a method for reconstructing a video frame, including:

acquiring optical flow information and a warp difference between the current frame and its previous frame;

inputting the optical flow information, the warp difference and the current frame into an encoding network to obtain first feature information;

inputting the related information of the reference frame into an encoding network to obtain second feature information;

obtaining final feature information based on the first feature information and the second feature information, and encoding and storing the final feature information;

and inputting the encoded final feature information and the second feature information into a decoding network to obtain a reconstructed frame.

A second aspect of an embodiment of the present invention provides a video frame reconstruction apparatus, including:

an acquisition module, configured to acquire optical flow information and a warp difference between the current frame and its previous frame;

a first feature module, configured to input the optical flow information, the warp difference and the current frame into an encoding network to obtain first feature information;

a second feature module, configured to input the related information of the reference frame into the encoding network to obtain second feature information;

a final feature module, configured to obtain final feature information based on the first feature information and the second feature information, and to encode and store the final feature information;

and a reconstructed frame module, configured to input the encoded final feature information and the second feature information into a decoding network to obtain a reconstructed frame.

A third aspect of embodiments of the present invention provides a video frame reconstruction terminal device, which includes a memory, a processor, and a computer program stored in the memory and executable on the processor, where the processor implements the steps of the method provided in the first aspect when executing the computer program.

A fourth aspect of embodiments of the present invention provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the steps of the method as provided in the first aspect above.

Compared with the prior art, the embodiment of the invention has the following beneficial effects:

Before decoding, the invention applies a network transformation to the reference frame to extract features, and feeds the extracted features together with the features of the current frame into the decoder to restore the reconstructed image of the current frame, thereby making full use of the information in the reference frame and improving the reconstruction quality of the current frame. Moreover, because the features extracted from the reference frame do not need to be quantized, the loss caused by quantization is reduced, transmission bandwidth is saved, and the compression rate is improved.

Drawings

In order to illustrate the technical solutions in the embodiments of the present invention more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. The drawings described below show only some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without inventive effort.

Fig. 1 is a schematic flow chart illustrating an implementation of a video frame reconstruction method according to an embodiment of the present invention;

Fig. 2 is a schematic diagram of a video frame reconstruction apparatus according to an embodiment of the present invention;

Fig. 3 is a schematic diagram of a video frame reconstruction terminal device according to an embodiment of the present invention.

Detailed Description

In the following description, for purposes of explanation and not limitation, specific details are set forth, such as particular system structures, techniques, etc. in order to provide a thorough understanding of the embodiments of the invention. It will be apparent, however, to one skilled in the art that the present invention may be practiced in other embodiments that depart from these specific details. In other instances, detailed descriptions of well-known systems, devices, circuits, and methods are omitted so as not to obscure the description of the present invention with unnecessary detail.

In order to explain the technical solution of the present invention, the following description will be given by way of specific examples.

Example one

Fig. 1 shows the implementation flow of a video frame reconstruction method according to an embodiment of the present invention. The method may be executed by a terminal device and is detailed as follows:

Step S101, acquiring optical flow information and a warp difference between the current frame and its previous frame.

Optionally, the optical flow information is obtained by calculating the spatial position mapping relationship between the pixels of the current frame image and the pixels of the previous frame image. Specifically, optical flow uses the temporal change of pixels in the image sequence and the correlation between adjacent frames to find the correspondence between two adjacent frames, and thereby calculates the motion information of objects between them: the current frame and its previous frame are input into a preset optical flow network to obtain the optical flow information. Further, the optical flow network includes two network structures: FlowNetS (FlowNetSimple) and FlowNetC (FlowNetCorr). FlowNetS stacks the two input images along the channel dimension and takes them as a single input, and its network structure contains only convolution layers; FlowNetC first extracts features from the two input images separately and then calculates the correlation of those features, that is, the features of the two images are convolved with each other in the spatial dimension.
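The two input strategies described above can be illustrated with a short NumPy sketch. It is not the FlowNet implementation: the function names, the `max_disp` parameter, the patch size of 1 and the normalisation by channel count are assumptions chosen to keep the example small.

```python
import numpy as np

def flownet_s_input(img1, img2):
    """FlowNetS-style input: stack two (C, H, W) images along the channel axis."""
    return np.concatenate([img1, img2], axis=0)

def correlation(f1, f2, max_disp=1):
    """Simplified FlowNetC-style correlation of two (C, H, W) feature maps.

    For every displacement (dy, dx) with |dy|, |dx| <= max_disp, the output
    holds the channel-averaged dot product between f1 at (y, x) and f2 at
    (y + dy, x + dx). Positions outside f2 are treated as zeros.
    """
    c, h, w = f1.shape
    d = 2 * max_disp + 1
    f2_padded = np.pad(f2, ((0, 0), (max_disp, max_disp), (max_disp, max_disp)))
    out = np.zeros((d * d, h, w), dtype=np.float32)
    idx = 0
    for dy in range(d):
        for dx in range(d):
            shifted = f2_padded[:, dy:dy + h, dx:dx + w]
            out[idx] = (f1 * shifted).mean(axis=0)
            idx += 1
    return out
```

With `max_disp=1` the output has 9 channels, and channel 4 corresponds to zero displacement.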

Furthermore, a warp operation is performed on the previous frame of the current frame based on the optical flow information, and the difference between the warped result and the current frame is taken as the warp difference. Specifically, the reference frame is warped (a geometric transformation of the image, such as an affine transformation) to the positions specified by the optical flow information to obtain a warp frame, and the warp difference is obtained by subtracting the warp frame from the current frame.
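The warp-and-subtract step can be sketched as follows. This is a minimal NumPy illustration rather than the embodiment's implementation: the function names, the `(dx, dy)` layout of the flow field, the nearest-neighbour sampling and the sign convention (current frame minus warp frame) are assumptions made for the example.

```python
import numpy as np

def warp_with_flow(prev_frame, flow):
    """Backward-warp prev_frame with a dense flow field.

    prev_frame: (H, W) or (H, W, C) array.
    flow: (H, W, 2) array; flow[y, x] = (dx, dy) is assumed to point from a
          pixel of the current frame to its source position in prev_frame.
    Sampling is nearest-neighbour, and positions outside the image are clamped.
    """
    h, w = flow.shape[:2]
    ys, xs = np.mgrid[0:h, 0:w]
    src_x = np.clip(np.rint(xs + flow[..., 0]).astype(int), 0, w - 1)
    src_y = np.clip(np.rint(ys + flow[..., 1]).astype(int), 0, h - 1)
    return prev_frame[src_y, src_x]

def warp_difference(cur_frame, prev_frame, flow):
    """Warp difference, taken here as current frame minus warp frame."""
    warp_frame = warp_with_flow(prev_frame, flow)
    return cur_frame.astype(np.float32) - warp_frame.astype(np.float32)
```

If the flow describes the motion well, the warp difference is close to zero almost everywhere, which is what makes it cheaper to encode than the raw frame.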

Step S102, inputting the optical flow information, the warp difference and the current frame into an encoding network to obtain first feature information.

Optionally, the encoding network is a downsampling network; the optical flow information, the warp difference and the current frame are input into the encoding network, and the first feature information is obtained after downsampling and convolution operations.
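A downsampling encoder of this kind can be sketched with stride-2 convolutions. The sketch below is an assumption-laden illustration: the channel-wise concatenation of the three inputs, the 3x3 kernels, the ReLU activation, the layer count and the names `conv2d_stride` and `encode` are not specified in the description, and the weights would normally be learned rather than passed in.

```python
import numpy as np

def conv2d_stride(x, weight, stride=2):
    """Zero-padded 2-D convolution with stride.

    x: (C_in, H, W); weight: (C_out, C_in, k, k) with odd k.
    """
    c_out, c_in, k, _ = weight.shape
    pad = k // 2
    xp = np.pad(x, ((0, 0), (pad, pad), (pad, pad)))
    h_out = (x.shape[1] + 2 * pad - k) // stride + 1
    w_out = (x.shape[2] + 2 * pad - k) // stride + 1
    out = np.zeros((c_out, h_out, w_out), dtype=np.float32)
    for i in range(h_out):
        for j in range(w_out):
            patch = xp[:, i * stride:i * stride + k, j * stride:j * stride + k]
            out[:, i, j] = np.tensordot(weight, patch, axes=([1, 2, 3], [0, 1, 2]))
    return out

def encode(flow, warp_diff, cur_frame, weights):
    """Concatenate the inputs on the channel axis, then halve the resolution per layer."""
    x = np.concatenate([flow, warp_diff, cur_frame], axis=0)
    for w in weights:
        x = np.maximum(conv2d_stride(x, w, stride=2), 0.0)  # ReLU is an assumption
    return x
```

For a 2-channel flow field and 3-channel warp difference and frame, the first layer sees 8 input channels; two layers reduce a 16x16 input to 4x4.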

Step S103, inputting the related information of the reference frame into the encoding network to obtain second feature information.

Optionally, the encoding network is a downsampling network; the related information of the reference frame is input into the encoding network, and the second feature information is obtained after downsampling and convolution operations. The related information of the reference frame comprises: the reference frame, the feature information extracted from the reference frame by ResNet50 (a residual network), and the reference optical flow, where the reference optical flow is the optical flow between the reference frame and the frame preceding it.

Step S104, obtaining final feature information based on the first feature information and the second feature information, and encoding and storing the final feature information.

Optionally, the first feature information is input into a downsampling network, and first reconstruction feature information is obtained after downsampling and convolution operations;

further, the second feature information is input into a downsampling network, and second reconstruction feature information is obtained after downsampling and convolution operations;

further, the difference between the first reconstruction feature information and the second reconstruction feature information is taken as the final feature information.
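The three sub-steps above can be sketched as follows. Several choices here are assumptions for illustration only: the downsampling network is stood in for by 2x2 average pooling followed by a 1x1 convolution, the subtraction order is first minus second, and `quantize` is a hypothetical rounding step shown as one possible precursor to encoding and storing the final features.

```python
import numpy as np

def downsample(feat, mix):
    """2x2 average pooling followed by a 1x1 convolution (channel mixing).

    feat: (C_in, H, W) with even H and W; mix: (C_out, C_in).
    """
    c, h, w = feat.shape
    pooled = feat.reshape(c, h // 2, 2, w // 2, 2).mean(axis=(2, 4))
    return np.einsum("oc,chw->ohw", mix, pooled)

def final_features(first_feat, second_feat, mix_first, mix_second):
    """Difference of the two reconstruction features (first minus second)."""
    first_rec = downsample(first_feat, mix_first)
    second_rec = downsample(second_feat, mix_second)
    return first_rec - second_rec

def quantize(feat):
    """Hypothetical quantization: round to the nearest integer before entropy coding."""
    return np.rint(feat).astype(np.int32)
```

Only the difference is quantized and stored in this sketch; the second feature information can be recomputed from the reference frame at the decoder, which is consistent with the statement that the reference features need no quantization.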

Step S105, inputting the encoded final feature information and the second feature information into a decoding network to obtain a reconstructed frame.

Optionally, the final feature information and the second feature information are spliced (concatenated) and then input into the decoding network, where upsampling and convolution operations are performed to obtain the reconstructed frame.
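The splice-then-upsample decoding step can be sketched as below. The description does not state how the spatial sizes of the two inputs are matched, so this sketch simply requires them to be equal; nearest-neighbour upsampling, 1x1 convolutions and the names `upsample_nearest` and `decode` are likewise assumptions.

```python
import numpy as np

def upsample_nearest(x, factor=2):
    """Nearest-neighbour upsampling of a (C, H, W) array."""
    return x.repeat(factor, axis=1).repeat(factor, axis=2)

def decode(final_feat, second_feat, mixes):
    """Concatenate the two feature maps on the channel axis, then upsample.

    final_feat: (C1, H, W); second_feat: (C2, H, W);
    mixes: list of (C_out, C_in) matrices, one 1x1 convolution per stage.
    """
    if final_feat.shape[1:] != second_feat.shape[1:]:
        raise ValueError("spatial sizes must match before splicing")
    x = np.concatenate([final_feat, second_feat], axis=0)
    for mix in mixes:
        x = upsample_nearest(x, 2)
        x = np.einsum("oc,chw->ohw", mix, x)
    return x
```

Each stage doubles the resolution, so two stages map 4x4 features to a 16x16 output; the last matrix sets the number of output channels (3 for an RGB frame).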

In this embodiment, a network transformation is applied to the reference frame before decoding to extract features, and the extracted features are fed into the decoder together with the features of the current frame to restore the reconstructed image of the current frame, so that the information in the reference frame is fully utilized and the reconstruction quality of the current frame is improved. Moreover, because the features extracted from the reference frame do not need to be quantized, the loss caused by quantization is reduced, transmission bandwidth is saved, and the compression rate is improved.

It should be understood that, the sequence numbers of the steps in the foregoing embodiments do not imply an execution sequence, and the execution sequence of each process should be determined by its function and inherent logic, and should not constitute any limitation to the implementation process of the embodiments of the present invention.

Example two

Fig. 2 is a block diagram showing a configuration of a video frame reconstruction apparatus according to an embodiment of the present invention, and only a portion related to the embodiment of the present invention is shown for convenience of explanation. The video frame reconstruction apparatus 2 includes: an acquisition module 21, a first feature module 22, a second feature module 23, a final feature module 24, and a reconstructed frame module 25.

an acquisition module 21, configured to acquire optical flow information and a warp difference between the current frame and its previous frame;

a first feature module 22, configured to input the optical flow information, the warp difference and the current frame into an encoding network to obtain first feature information;

a second feature module 23, configured to input the related information of the reference frame into the encoding network to obtain second feature information;

a final feature module 24, configured to obtain final feature information based on the first feature information and the second feature information, and to encode and store the final feature information;

and a reconstructed frame module 25, configured to input the encoded final feature information and the second feature information into a decoding network to obtain a reconstructed frame.

Optionally, the acquisition module 21 includes:

an optical flow unit, configured to calculate a spatial position mapping relationship between pixels of the current frame image and pixels of a previous frame image of the current frame to obtain optical flow information;

and the warp difference unit is used for performing warp operation on the previous frame of the current frame based on the optical flow information and then subtracting the current frame to obtain a warp difference.

Optionally, the final feature module 24 comprises:

a first down-sampling unit, configured to input the first feature information into a down-sampling network to obtain first reconstructed feature information;

a second down-sampling unit, configured to input the second feature information into a down-sampling network to obtain second reconstruction feature information;

and the characteristic calculating unit is used for subtracting the first reconstruction characteristic information and the second reconstruction characteristic information to obtain final characteristic information.

Optionally, the reconstructed frame module 25 includes:

a decoding unit, configured to splice (concatenate) the final feature information and the second feature information and input the spliced result into a decoding network to obtain a reconstructed frame.

Example three

Fig. 3 is a schematic diagram of a video frame reconstruction terminal device according to an embodiment of the present invention. As shown in Fig. 3, the video frame reconstruction terminal device 3 of this embodiment includes: a processor 30, a memory 31 and a computer program 32, such as a video frame reconstruction program, stored in the memory 31 and executable on the processor 30. The processor 30, when executing the computer program 32, implements the steps of the video frame reconstruction method embodiments described above, such as steps S101 to S105 shown in Fig. 1. Alternatively, the processor 30, when executing the computer program 32, implements the functions of the modules/units in the device embodiments, such as the modules 21 to 25 shown in Fig. 2.

Illustratively, the computer program 32 may be divided into one or more modules/units, which are stored in the memory 31 and executed by the processor 30 to implement the present invention. The one or more modules/units may be a series of computer program instruction segments capable of performing specific functions, which are used to describe the execution process of the computer program 32 in the video frame reconstruction terminal device 3. For example, the computer program 32 may be divided into an acquisition module, a first feature module, a second feature module, a final feature module, and a reconstructed frame module, whose specific functions correspond to those of modules 21 to 25 described in Example two.

The video frame reconstruction terminal device 3 may be a desktop computer, a notebook computer, a palmtop computer, a cloud server, or another computing device. The video frame reconstruction terminal device may include, but is not limited to, a processor 30 and a memory 31. Those skilled in the art will appreciate that Fig. 3 is only an example of the video frame reconstruction terminal device 3 and does not constitute a limitation on it; the device may include more or fewer components than those shown, combine some components, or use different components. For example, the video frame reconstruction terminal device may further include an input-output device, a network access device, a bus, etc.

The processor 30 may be a Central Processing Unit (CPU), another general-purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA) or other programmable logic device, discrete gate or transistor logic, discrete hardware components, etc. A general-purpose processor may be a microprocessor or any conventional processor or the like.

The memory 31 may be an internal storage unit of the video frame reconstruction terminal device 3, such as a hard disk or a memory of the video frame reconstruction terminal device 3. The memory 31 may also be an external storage device of the video frame reconstruction terminal device 3, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash Card (Flash Card), or the like provided on the video frame reconstruction terminal device 3. Further, the memory 31 may include both an internal storage unit of the video frame reconstruction terminal device 3 and an external storage device. The memory 31 is used to store the computer program and other programs and data required by the video frame reconstruction terminal device. The above-mentioned memory 31 may also be used to temporarily store data that has been output or is to be output.

As can be seen from the above, in the embodiment, the reference frame is subjected to network transformation before decoding to extract features, and the extracted features and the features of the current frame are sent to the decoder together, so that the reconstructed image of the current frame is restored, and the information of the reference frame is fully utilized, so that the reconstruction quality of the current frame is better. And because the extracted features of the reference frame do not need quantization, the loss caused by quantization is reduced, the transmission bandwidth is saved, and the compression rate is improved.

It will be apparent to those skilled in the art that, for convenience and brevity of description, only the above-mentioned division of the functional units and modules is illustrated, and in practical applications, the above-mentioned functions may be distributed as different functional units and modules according to needs, that is, the internal structure of the apparatus may be divided into different functional units or modules to implement all or part of the above-mentioned functions. Each functional unit and module in the embodiments may be integrated in one processing unit, or each unit may exist alone physically, or two or more units are integrated in one unit, and the integrated unit may be implemented in a form of hardware, or in a form of software functional unit. In addition, specific names of the functional units and modules are only for convenience of distinguishing from each other, and are not used for limiting the protection scope of the present application. The specific working processes of the units and modules in the system may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.

In the above embodiments, the descriptions of the respective embodiments have respective emphasis, and reference may be made to the related descriptions of other embodiments for parts that are not described or illustrated in a certain embodiment.

Those of ordinary skill in the art will appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.

In the embodiments provided in the present invention, it should be understood that the disclosed apparatus/terminal device and method may be implemented in other ways. For example, the above-described embodiments of the apparatus/terminal device are merely illustrative: the division of the modules or units is only one logical division, and other divisions are possible in an actual implementation; for instance, a plurality of units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the mutual coupling, direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, devices or units, and may be in an electrical, mechanical or other form.

The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.

In addition, functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.

The integrated modules/units, if implemented in the form of software functional units and sold or used as separate products, may be stored in a computer-readable storage medium. Based on such understanding, all or part of the flow of the method according to the embodiments of the present invention may also be implemented by a computer program, which may be stored in a computer-readable storage medium, and when the computer program is executed by a processor, the steps of the method embodiments may be implemented. The computer program comprises computer program code, which may be in the form of source code, object code, an executable file or some intermediate form, etc. The computer-readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disk, a computer memory, a Read-Only Memory (ROM), a Random Access Memory (RAM), an electrical carrier signal, a telecommunications signal, a software distribution medium, and the like. It should be noted that the content of the computer-readable medium may be suitably increased or decreased as required by legislation and patent practice in the relevant jurisdiction; for example, in some jurisdictions, in accordance with legislation and patent practice, computer-readable media do not include electrical carrier signals and telecommunications signals.

The above-mentioned embodiments are only used for illustrating the technical solutions of the present invention, and not for limiting the same; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; such modifications and substitutions do not substantially depart from the spirit and scope of the embodiments of the present invention, and are intended to be included within the scope of the present invention.

Claims

1. A method for video frame reconstruction, comprising:

2. The video frame reconstruction method of claim 1, wherein said acquiring optical flow information and a warp difference between the current frame and its previous frame comprises:

calculating the spatial position mapping relation between the pixels of the current frame image and the pixels of the previous frame image of the current frame to obtain optical flow information;

and after carrying out warp operation on the previous frame of the current frame based on the optical flow information, subtracting the current frame to obtain a warp difference.

3. The video frame reconstruction method of claim 1, wherein

the related information of the reference frame comprises: the reference frame, feature information extracted from the reference frame by ResNet50 (a residual network), and a reference optical flow.

4. The method of claim 1, wherein the deriving final feature information based on the first feature information and the second feature information comprises:

inputting the first characteristic information into a down-sampling network to obtain first reconstruction characteristic information;

inputting the second feature information into a down-sampling network to obtain second reconstruction feature information;

and subtracting the first reconstruction characteristic information and the second reconstruction characteristic information to obtain final characteristic information.

5. The method of claim 1, wherein the inputting the encoded final feature information and the second feature information into a decoding network to obtain a reconstructed frame comprises:

splicing (concatenating) the final feature information and the second feature information and inputting the spliced result into a decoding network to obtain a reconstructed frame.

6. A video frame reconstruction apparatus, comprising:

7. The video frame reconstruction apparatus of claim 6, wherein the acquisition module comprises:

the optical flow unit is used for calculating the spatial position mapping relation between the pixels of the current frame image and the pixels of the previous frame image of the current frame to obtain optical flow information;

and the warp difference unit is used for carrying out warp operation on the previous frame of the current frame based on the optical flow information and then subtracting the current frame to obtain a warp difference.

8. The video frame reconstruction apparatus of claim 6, wherein the final feature module comprises:

the first down-sampling unit is used for inputting the first characteristic information into a down-sampling network to obtain first reconstruction characteristic information;

the second downsampling unit is used for inputting the second feature information into a downsampling network to obtain second reconstruction feature information;

9. A video frame reconstruction terminal device comprising a memory, a processor and a computer program stored in said memory and executable on said processor, characterized in that said processor implements the steps of the method according to any of claims 1 to 5 when executing said computer program.

10. A computer-readable storage medium, in which a computer program is stored which, when being executed by a processor, carries out the steps of the method according to any one of claims 1 to 5.