
CN114745545B - Video frame interpolation method, apparatus, device, and medium - Google Patents

Video frame interpolation method, apparatus, device, and medium

Info

Publication number
CN114745545B
CN114745545B (application CN202210375212.2A)
Authority
CN
China
Prior art keywords
frame
downsampling
optical flow
preset
sampling
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202210375212.2A
Other languages
Chinese (zh)
Other versions
CN114745545A (en)
Inventor
王庆
徐天春
赵世杰
陈嘉麟
张清源
李军林
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing ByteDance Network Technology Co Ltd
Original Assignee
Beijing ByteDance Network Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing ByteDance Network Technology Co Ltd filed Critical Beijing ByteDance Network Technology Co Ltd
Priority to CN202210375212.2A priority Critical patent/CN114745545B/en
Publication of CN114745545A publication Critical patent/CN114745545A/en
Application granted granted Critical
Publication of CN114745545B publication Critical patent/CN114745545B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/132Sampling, masking or truncation of coding units, e.g. adaptive resampling, frame skipping, frame interpolation or high-frequency transform coefficient masking
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/587Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal sub-sampling or interpolation, e.g. decimation or subsequent interpolation of pictures in a video sequence

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Television Systems (AREA)

Abstract

Embodiments of the present disclosure disclose a video frame interpolation method, apparatus, device, and medium. The method includes: acquiring two adjacent frames in a target video; downsampling a first frame and a second frame of the two adjacent frames respectively to obtain a first downsampled frame and a second downsampled frame; determining a first downsampled optical flow map, a second downsampled optical flow map, and downsampling correction information based on a preset optical flow correction model; upsampling the first downsampled optical flow map, the second downsampled optical flow map, and the downsampling correction information respectively to obtain a first upsampled optical flow map, a second upsampled optical flow map, and upsampling correction information; and, based on a preset conversion integration model, determining an intermediate frame between the two adjacent frames from the first frame, the second frame, the first upsampled optical flow map, the second upsampled optical flow map, and the upsampling correction information, and determining a target interpolated frame corresponding to the two adjacent frames based on the intermediate frame. Video frame interpolation efficiency can thereby be improved while the interpolation quality is preserved.

Description

Video frame interpolation method, apparatus, device, and medium
Technical Field
Embodiments of the present disclosure relate to computer technology, and in particular to a video frame interpolation method, apparatus, device, and medium.
Background
With the rapid development of computer technology, the frame rate and fluency of a video can be improved by interpolating frames into it, meeting users' demand for high-quality video. At present, video frame interpolation is usually performed based on optical flow, determining an interpolated video frame between two adjacent frames. However, existing interpolation methods operate directly on video frames at the original resolution, which often takes a long time and thus reduces interpolation efficiency.
Disclosure of Invention
Embodiments of the present disclosure provide a video frame interpolation method, apparatus, device, and medium, so as to improve video frame interpolation efficiency while preserving the interpolation quality.
In a first aspect, an embodiment of the present disclosure provides a video frame interpolation method, including:
acquiring two adjacent frames in a target video;
downsampling, based on a preset sampling multiple, a first frame and a second frame of the two adjacent frames respectively to obtain a first downsampled frame and a second downsampled frame;
determining, based on a preset optical flow correction model, a first downsampled optical flow map corresponding to the first downsampled frame, a second downsampled optical flow map corresponding to the second downsampled frame, and downsampling correction information;
upsampling, based on the preset sampling multiple, the first downsampled optical flow map, the second downsampled optical flow map, and the downsampling correction information respectively to obtain a first upsampled optical flow map, a second upsampled optical flow map, and upsampling correction information;
determining, based on a preset conversion integration model, an intermediate frame between the two adjacent frames from the first frame, the second frame, the first upsampled optical flow map, the second upsampled optical flow map, and the upsampling correction information, and determining a target interpolated frame corresponding to the two adjacent frames based on the intermediate frame.
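The first-aspect steps can be sketched end to end in NumPy. The two learned models are replaced here by trivial stubs (hypothetical, for illustration only, since the real ones are trained networks), so only the data flow and the resolution changes match the method:

```python
import numpy as np

K = 2  # preset sampling multiple (2x is the example the text gives)

def downsample(frame):
    """Average-pool by K -- one simple choice of downsampling."""
    h, w = frame.shape
    return frame.reshape(h // K, K, w // K, K).mean(axis=(1, 3))

def upsample(x, scale_values=False):
    """Nearest-neighbour upsampling by K. Optical-flow values are also
    multiplied by K, since displacements grow with resolution."""
    up = x.repeat(K, axis=0).repeat(K, axis=1)
    return up * K if scale_values else up

# Stubs standing in for the two trained models (hypothetical: they only
# keep the sketch runnable and shape-correct).
def optical_flow_correction_model(d1, d2):
    return np.zeros_like(d1), np.zeros_like(d2), (d1 + d2) / 2

def conversion_integration_model(f1, f2, flow1, flow2, corr):
    return (f1 + f2) / 2  # with zero flow, warping is the identity

frame1 = np.random.rand(8, 8)   # first frame of the adjacent pair
frame2 = np.random.rand(8, 8)   # second frame

d1, d2 = downsample(frame1), downsample(frame2)                # step 2
dflow1, dflow2, dcorr = optical_flow_correction_model(d1, d2)  # step 3
flow1 = upsample(dflow1, scale_values=True)                    # step 4
flow2 = upsample(dflow2, scale_values=True)
corr = upsample(dcorr)
mid = conversion_integration_model(frame1, frame2, flow1, flow2, corr)  # step 5
```

The point of the structure is that the expensive model inference in step 3 runs at the reduced resolution, while the final frame in step 5 is synthesized at the original resolution.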
In a second aspect, an embodiment of the present disclosure further provides a video frame interpolation apparatus, including:
an adjacent-frame acquisition module, configured to acquire two adjacent frames in a target video;
a downsampling module, configured to downsample, based on a preset sampling multiple, a first frame and a second frame of the two adjacent frames respectively to obtain a first downsampled frame and a second downsampled frame;
a downsampling information determination module, configured to determine, based on a preset optical flow correction model, a first downsampled optical flow map corresponding to the first downsampled frame, a second downsampled optical flow map corresponding to the second downsampled frame, and downsampling correction information;
an upsampling module, configured to upsample, based on the preset sampling multiple, the first downsampled optical flow map, the second downsampled optical flow map, and the downsampling correction information respectively to obtain a first upsampled optical flow map, a second upsampled optical flow map, and upsampling correction information;
a target interpolated frame determination module, configured to determine, based on a preset conversion integration model, an intermediate frame between the two adjacent frames from the first frame, the second frame, the first upsampled optical flow map, the second upsampled optical flow map, and the upsampling correction information, and to determine a target interpolated frame corresponding to the two adjacent frames based on the intermediate frame.
In a third aspect, embodiments of the present disclosure further provide an electronic device, including:
one or more processors;
a memory for storing one or more programs;
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the video frame interpolation method provided by any embodiment of the present disclosure.
In a fourth aspect, embodiments of the present disclosure further provide a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the video frame interpolation method provided by any embodiment of the present disclosure.
In the embodiments of the present disclosure, the first frame and the second frame of two adjacent frames in the target video are each downsampled based on the preset sampling multiple to obtain a first downsampled frame and a second downsampled frame with reduced resolution, so that the first downsampled optical flow map corresponding to the first downsampled frame, the second downsampled optical flow map corresponding to the second downsampled frame, and the downsampling correction information can be determined more quickly based on the preset optical flow correction model, which improves interpolation efficiency. The first downsampled optical flow map, the second downsampled optical flow map, and the downsampling correction information are each upsampled based on the preset sampling multiple to obtain the first upsampled optical flow map, the second upsampled optical flow map, and the upsampling correction information at the original resolution, so that, based on the preset conversion integration model, an intermediate frame between the two adjacent frames can be determined accurately from the first frame, the second frame, the first upsampled optical flow map, the second upsampled optical flow map, and the upsampling correction information, and a target interpolated frame corresponding to the two adjacent frames can be determined based on the intermediate frame. Video frame interpolation efficiency is thereby improved while the interpolation quality is preserved.
Drawings
The above and other features, advantages, and aspects of embodiments of the present disclosure will become more apparent by reference to the following detailed description when taken in conjunction with the accompanying drawings. The same or similar reference numbers will be used throughout the drawings to refer to the same or like elements. It should be understood that the figures are schematic and that elements and components are not necessarily drawn to scale.
Fig. 1 is a flowchart of a video frame interpolation method according to a first embodiment of the present disclosure;
Fig. 2 is an example of a video frame interpolation process according to the first embodiment of the present disclosure;
Fig. 3 is a flowchart of a video frame interpolation method according to a second embodiment of the present disclosure;
Fig. 4 is an example of thread-parallel processing according to the second embodiment of the present disclosure;
Fig. 5 is a schematic structural diagram of a video frame interpolation apparatus according to a third embodiment of the present disclosure;
Fig. 6 is a schematic structural diagram of an electronic device according to a fourth embodiment of the present disclosure.
Detailed Description
Embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. While certain embodiments of the present disclosure are shown in the accompanying drawings, it is to be understood that the present disclosure may be embodied in various forms and should not be construed as limited to the embodiments set forth herein, which are instead provided for a more thorough and complete understanding of the present disclosure. It should be understood that the drawings and embodiments of the present disclosure are for illustration purposes only and are not intended to limit the scope of the present disclosure.
It should be understood that the various steps recited in the method embodiments of the present disclosure may be performed in a different order and/or performed in parallel. Furthermore, method embodiments may include additional steps and/or omit performing the illustrated steps. The scope of the present disclosure is not limited in this respect.
The term "including" and variations thereof as used herein are open-ended, i.e., "including, but not limited to". The term "based on" means "based at least in part on". The term "one embodiment" means "at least one embodiment"; the term "another embodiment" means "at least one additional embodiment"; the term "some embodiments" means "at least some embodiments". Related definitions of other terms will be given in the description below.
It should be noted that the terms "first," "second," and the like in this disclosure are merely used to distinguish between different devices, modules, or units and are not used to define an order or interdependence of functions performed by the devices, modules, or units.
It should be noted that references to "a" and "an" in this disclosure are intended to be illustrative rather than limiting, and those of ordinary skill in the art will appreciate that they should be understood as "one or more" unless the context clearly indicates otherwise.
The names of messages or information interacted between the various devices in the embodiments of the present disclosure are for illustrative purposes only and are not intended to limit the scope of such messages or information.
Example 1
Fig. 1 is a flowchart of a video frame interpolation method according to a first embodiment of the present disclosure. This embodiment is applicable to interpolating a frame between every two adjacent frames in a video. The method may be performed by a video frame interpolation apparatus, which may be implemented in software and/or hardware and integrated into an electronic device. As shown in fig. 1, the method specifically includes the following steps:
S110, acquiring two adjacent frames in the target video.
The target video may refer to a video into which frames need to be inserted to improve its fluency. For example, the target video may be a slow-motion video or the like. The two adjacent frames may be any two adjacent video frames in the target video, i.e., a first frame and a second frame, between which a frame interpolation operation is to be performed.
S120, based on a preset sampling multiple, downsampling a first frame and a second frame of the two adjacent frames respectively to obtain a first downsampled frame and a second downsampled frame.
The preset sampling multiple may be a sampling multiple set in advance based on the service scenario and requirements. For example, the preset sampling multiple may be, but is not limited to, 2.
Specifically, fig. 2 gives an example of the video frame interpolation process. As shown in fig. 2, the first frame I1 is downsampled by the preset sampling multiple (for example, 2x) to obtain a first downsampled frame dI1. The second frame I2 is likewise downsampled by the preset sampling multiple to obtain a second downsampled frame dI2, so that a first downsampled frame and a second downsampled frame with reduced resolution are obtained.
S130, determining, based on a preset optical flow correction model, a first downsampled optical flow map corresponding to the first downsampled frame, a second downsampled optical flow map corresponding to the second downsampled frame, and downsampling correction information.
The preset optical flow correction model may be a pre-configured network model used for optical flow estimation, warp conversion, and correction processing of video frames. The first downsampled optical flow map may refer to the optical flow map dFt1 from the preset interpolation time t to the first frame time t1, and correspondingly the second downsampled optical flow map may refer to the optical flow map dFt2 from the preset interpolation time t to the second frame time t2. The first downsampled optical flow map may also refer to the optical flow map dF1t from the first frame time t1 to the preset interpolation time t, and correspondingly the second downsampled optical flow map may refer to the optical flow map dF2t from the second frame time t2 to the preset interpolation time t. The first frame time t1 is the video time of the first frame I1; the second frame time t2 is the video time of the second frame I2. The preset interpolation time t is the time at which a frame needs to be inserted between t1 and t2. The downsampling correction information may include downsampled residual information dRt and downsampled occlusion information dOt. The preset optical flow correction model may be trained in advance on sample data.
Specifically, the low-resolution first downsampled frame dI1 and second downsampled frame dI2 may be input into the trained preset optical flow correction model for processing, and based on its output, the first downsampled optical flow map dFt1 (or dF1t), the second downsampled optical flow map dFt2 (or dF2t), and the downsampling correction information (such as the downsampled residual information dRt and downsampled occlusion information dOt) are obtained.
Illustratively, as shown in fig. 2, the preset optical flow correction model may include: an optical flow generation sub-model, a first conversion sub-model, and a correction sub-model. The optical flow generation sub-model may be a network model for generating the optical flow maps corresponding to two time instants. The first conversion sub-model may be a warp model for converting the first downsampled frame or the second downsampled frame into a converted frame at the preset interpolation time. The correction sub-model may be a network model for performing correction processing on the converted frames.
Illustratively, S130 may include: inputting the first downsampled frame and the second downsampled frame into the optical flow generation sub-model respectively to obtain the first downsampled optical flow map corresponding to the first downsampled frame and the second downsampled optical flow map corresponding to the second downsampled frame; inputting the first downsampled frame and the first downsampled optical flow map into the first conversion sub-model to obtain a first downsampled converted frame corresponding to the preset interpolation time; inputting the second downsampled frame and the second downsampled optical flow map into the first conversion sub-model to obtain a second downsampled converted frame corresponding to the preset interpolation time; and inputting the first downsampled converted frame and the second downsampled converted frame into the correction sub-model to obtain the downsampling correction information.
Specifically, the first downsampled frame dI1 may be input into the optical flow generation sub-model and the first downsampled optical flow map dFt1 obtained from its output; likewise, the second downsampled frame dI2 may be input into the optical flow generation sub-model and the second downsampled optical flow map dFt2 obtained from its output. The first downsampled frame dI1 and the first downsampled optical flow map dFt1 are input into the first conversion sub-model, and based on its output, a first downsampled converted frame dIt1 corresponding to the preset interpolation time t is obtained. The second downsampled frame dI2 and the second downsampled optical flow map dFt2 are input into the first conversion sub-model, and based on its output, a second downsampled converted frame dIt2 corresponding to the preset interpolation time t is obtained. The first downsampled converted frame dIt1 and the second downsampled converted frame dIt2 are input into the correction sub-model for correction processing, and the downsampling correction information is obtained based on its output.
Illustratively, the preset optical flow correction model may further include a semantic extraction sub-model, i.e., a network model for extracting semantic information from video frames. Accordingly, "inputting the first downsampled converted frame and the second downsampled converted frame into the correction sub-model to obtain the downsampling correction information" may include: inputting the first downsampled frame and the second downsampled frame into the semantic extraction sub-model respectively to obtain first downsampled semantic information corresponding to the first downsampled frame and second downsampled semantic information corresponding to the second downsampled frame; and inputting the first downsampled converted frame, the second downsampled converted frame, the first downsampled semantic information, and the second downsampled semantic information into the correction sub-model to obtain the downsampling correction information.
Specifically, when the preset optical flow correction model further includes a semantic extraction sub-model, the first downsampled frame dI1 may be input into the semantic extraction sub-model and the first downsampled semantic information obtained from its output. Similarly, the second downsampled frame dI2 is input into the semantic extraction sub-model to obtain the second downsampled semantic information. The first downsampled converted frame dIt1, the second downsampled converted frame dIt2, the first downsampled semantic information, and the second downsampled semantic information are input into the correction sub-model for correction processing, yielding downsampling correction information that incorporates the semantic information. This further improves the accuracy of the correction information, and hence the interpolation quality.
S140, based on the preset sampling multiple, upsampling the first downsampled optical flow map, the second downsampled optical flow map, and the downsampling correction information respectively to obtain a first upsampled optical flow map, a second upsampled optical flow map, and upsampling correction information.
Specifically, as shown in fig. 2, the first downsampled optical flow map dFt1 is upsampled by the preset sampling multiple (for example, 2x) to quickly obtain the first upsampled optical flow map Ft1 at the original resolution. Similarly, by upsampling the second downsampled optical flow map dFt2 and the downsampling correction information (such as the downsampled residual information dRt and downsampled occlusion information dOt), the second upsampled optical flow map Ft2 and the upsampling correction information (such as the upsampled residual information Rt and upsampled occlusion information Ot) at the original resolution are quickly obtained.
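One practical detail of S140 is worth noting. The text only says "upsampling processing", so the value scaling below is an assumption based on common optical flow practice: when a flow map is spatially upsampled, the flow vectors themselves must also be multiplied by the sampling multiple, because displacements are measured in pixels of the current resolution. A minimal sketch:

```python
import numpy as np

def upsample_flow(flow, k):
    """Spatially upsample an (H, W, 2) flow map by k with nearest-neighbour
    repetition, and scale the vectors by k: a 1-pixel displacement at the
    downsampled resolution spans k pixels at the original resolution."""
    return flow.repeat(k, axis=0).repeat(k, axis=1) * k

d_flow = np.ones((4, 4, 2))       # low-resolution flow: 1 px everywhere
flow = upsample_flow(d_flow, 2)   # original resolution: 2 px everywhere
```

The residual and occlusion maps (dRt, dOt), by contrast, hold per-pixel values rather than displacements, so they would be upsampled spatially without this value scaling.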
S150, based on a preset conversion integration model, determining an intermediate frame between the two adjacent frames from the first frame, the second frame, the first upsampled optical flow map, the second upsampled optical flow map, and the upsampling correction information, and determining a target interpolated frame corresponding to the two adjacent frames based on the intermediate frame.
The preset conversion integration model may be a pre-configured network model used for warping video frames and integrating the results. The intermediate frame may refer to a candidate video frame at the preset interpolation time. The target interpolated frame may refer to the final video frame inserted between the first frame and the second frame. The preset conversion integration model in this embodiment may be trained in advance on sample data.
Specifically, as shown in fig. 2, the first frame I1, the second frame I2, the first upsampled optical flow map Ft1, the second upsampled optical flow map Ft2, and the upsampling correction information (such as Rt and Ot) at the original resolution may be input into the pre-trained preset conversion integration model for processing, and based on its output, the intermediate frame between the two adjacent frames I1 and I2 is accurately obtained. This embodiment may directly take the output intermediate frame as the target interpolated frame corresponding to the two adjacent frames; alternatively, it may detect whether the intermediate frame meets a preset interpolation condition and, if so, take the intermediate frame as the target interpolated frame corresponding to the two adjacent frames.
Illustratively, as shown in fig. 2, the preset conversion integration model may include: a second conversion sub-model and an integration sub-model. The second conversion sub-model may be a warp model for converting the first frame or the second frame into a converted frame at the preset interpolation time. The first conversion sub-model and the second conversion sub-model in this embodiment may be the same warp model, or two warp models with the same structure but different weights.
Illustratively, S150 may include: inputting the first frame and the first upsampled optical flow map into the second conversion sub-model to obtain a first converted frame corresponding to the preset interpolation time; inputting the second frame and the second upsampled optical flow map into the second conversion sub-model to obtain a second converted frame corresponding to the preset interpolation time; and inputting the first converted frame, the second converted frame, and the upsampling correction information into the integration sub-model to obtain the intermediate frame between the two adjacent frames.
Specifically, as shown in fig. 2, the first frame I1 and the first upsampled optical flow map Ft1 (upsampled to the original resolution) may be input into the second conversion sub-model, and based on its output, a first converted frame It1 corresponding to the preset interpolation time t is obtained. Similarly, the second frame I2 and the second upsampled optical flow map Ft2 are input into the second conversion sub-model, and a second converted frame It2 corresponding to the preset interpolation time t is obtained. The first converted frame It1, the second converted frame It2, and the upsampling correction information (such as Rt and Ot) are input into the integration sub-model for data integration, and based on its output, the intermediate frame between the two adjacent frames I1 and I2 is obtained. It should be noted that, in this embodiment, the first converted frame and the second converted frame at the original resolution are regenerated from the first frame, the second frame, the first upsampled optical flow map, and the second upsampled optical flow map, rather than obtained by upsampling the first and second downsampled converted frames. This avoids distortion in the intermediate frame, preserves its sharpness, and further improves the interpolation quality.
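The patent leaves the integration sub-model as a learned network. A common hand-written integration rule from flow-based interpolation (an assumption for illustration, not the patent's formula) blends the two converted frames with the occlusion map Ot as per-pixel weights and then adds the residual Rt:

```python
import numpy as np

def integrate(converted1, converted2, occlusion, residual):
    """Blend the two warped frames, then add the residual correction.
    occlusion is a per-pixel weight in [0, 1] favouring converted1."""
    blended = occlusion * converted1 + (1.0 - occlusion) * converted2
    return np.clip(blended + residual, 0.0, 1.0)

it1 = np.full((4, 4), 0.2)   # first converted frame  It1
it2 = np.full((4, 4), 0.8)   # second converted frame It2
ot = np.full((4, 4), 0.5)    # occlusion map Ot
rt = np.zeros((4, 4))        # residual Rt
mid_frame = integrate(it1, it2, ot, rt)
```

Under this rule, the occlusion map lets the intermediate frame draw each pixel from whichever converted frame actually sees it, while the residual repairs detail that warping alone cannot recover.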
In the above technical solution, the first frame and the second frame of two adjacent frames in the target video are each downsampled based on the preset sampling multiple to obtain a first downsampled frame and a second downsampled frame with reduced resolution, so that the first downsampled optical flow map corresponding to the first downsampled frame, the second downsampled optical flow map corresponding to the second downsampled frame, and the downsampling correction information can be determined more quickly based on the preset optical flow correction model, improving interpolation efficiency. The first downsampled optical flow map, the second downsampled optical flow map, and the downsampling correction information are each upsampled based on the preset sampling multiple to obtain the first upsampled optical flow map, the second upsampled optical flow map, and the upsampling correction information at the original resolution, so that, based on the preset conversion integration model, an intermediate frame between the two adjacent frames can be determined accurately from the first frame, the second frame, the first upsampled optical flow map, the second upsampled optical flow map, and the upsampling correction information, and a target interpolated frame corresponding to the two adjacent frames can be determined based on the intermediate frame. Video frame interpolation efficiency is thereby improved while the interpolation quality is preserved.
Based on the technical scheme, the method can further comprise the following steps: on the image processor loaded with the preset conversion plug-in, an operation of determining an intermediate frame between two adjacent frames based on the preset optical flow correction model and the preset conversion integration model is performed.
The preset conversion plug-in may be a plug-in running on the image processor GPU (Graphics Processing Unit) for implementing the warp conversion process. For example, the preset transition plug-in may be, but is not limited to, tensorRT plug-in.
Specifically, the existing warp operation is that position points in the horizontal and vertical directions are generated first, a position coordinate grid is constructed, then an optical flow is added to obtain coordinates of pixel points in an image, and grid_sample functions are called to obtain output. Because the generation and copying of the grid to the GPU take time, and the grid data needs to be additionally subjected to storage format conversion, namely, the conversion from B2HW to BHW2, so that the correctness of the grid_sample function output result is ensured. It can be seen that existing warp operations require a switch to the GPU involving the cpu CPU (Central Processing Unit) to complete the entire warp operation directly on the GPU. In this embodiment, after the preset conversion plug-in is loaded on the GPU, a complete warp operation process may be run on the GPU. For example, the warp operation procedure is: the current position point coordinates are directly added to the optical flow value of each position to obtain the actual accurate floating point position of the output pixel point; obtaining the fixed point positions of four neighbor points according to the actual positions of the pixel points; and calculating the areas of the floating point positions of the pixel points and the four neighbor points of the floating point positions of the pixel points, taking the areas as weights of linear interpolation, and obtaining pixel values of the output pixel positions according to the contribution degree of the weights. 
Because the warp operation process runs entirely on the GPU, no switching between the CPU and the GPU is required, so all model processing operations including the warp operation (namely, the operation of determining an intermediate frame between two adjacent frames based on the preset optical flow correction model and the preset conversion integration model) can run directly on the GPU. This reduces the time consumed by the warp, and the plug-in engine can be utilized for deep acceleration, thereby further improving the video frame inserting efficiency.
Example two
Fig. 3 is a flowchart of a video frame inserting method according to a second embodiment of the present disclosure. In this embodiment, the step of determining, based on the intermediate frame, the target insertion frame corresponding to the two adjacent frames is further optimized on the basis of the above embodiment. Explanations of terms that are the same as or correspond to those of the above embodiments are not repeated herein.
Referring to fig. 3, the video frame inserting method provided in this embodiment specifically includes the following steps:
s310, acquiring two adjacent frames in the target video.
S320, based on a preset sampling multiple, respectively performing downsampling processing on a first frame and a second frame in two adjacent frames to obtain a first downsampled frame and a second downsampled frame.
S330, determining a first downsampled light flow graph corresponding to the first downsampled frame, a second downsampled light flow graph corresponding to the second downsampled frame and downsampling correction information based on a preset light flow correction model.
S340, based on a preset sampling multiple, up-sampling processing is respectively carried out on the first down-sampling optical flow chart, the second down-sampling optical flow chart and the down-sampling correction information, so as to obtain the first up-sampling optical flow chart, the second up-sampling optical flow chart and the up-sampling correction information.
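The up-sampling in S340 resizes the downsampled optical flow maps back to the original resolution. A conventional detail, assumed here since the patent text does not spell it out, is that the flow vectors themselves must also be multiplied by the scale factor, because they are displacements in pixel units. A minimal nearest-neighbour sketch:

```python
def upsample_flow(flow, scale):
    """Nearest-neighbour up-sampling of a flow field (list of rows of
    (fx, fy) tuples). The flow values are scaled together with the
    spatial resolution -- an assumed but conventional detail."""
    h, w = len(flow), len(flow[0])
    out = []
    for y in range(h * scale):
        row = []
        for x in range(w * scale):
            fx, fy = flow[y // scale][x // scale]
            row.append((fx * scale, fy * scale))
        out.append(row)
    return out
```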
S350, based on a preset conversion integration model, determining an intermediate frame between two adjacent frames according to the first frame, the second frame, the first upsampled light flow chart, the second upsampled light flow chart and the upsampled correction information.
S360, determining the frame inserting confidence corresponding to the intermediate frame based on a preset optical flow correction model or a preset conversion integration model.
Wherein the frame inserting confidence may be used to characterize the difference between the output intermediate frame and the actual intermediate frame: the higher the confidence, the smaller the difference and the more accurate the inserted frame. In this embodiment, the preset optical flow correction model or the preset conversion integration model may predict the frame inserting confidence corresponding to the intermediate frame and output it in addition to the original information.
Specifically, after the preset optical flow correction model and the preset conversion integration model have been trained based on sample data, the preset optical flow correction model or the preset conversion integration model may be further trained based on a preset error function, for example the Charbonnier error function, according to the sample frame inserting confidence and the actual frame inserting confidence corresponding to each sample intermediate frame, so that the model can predict the frame inserting confidence corresponding to an intermediate frame. For example, the actual frame inserting confidence St_gt may be determined by the formula St_gt = exp(-abs(It - It_gt)), where It is the intermediate frame predicted based on the preset optical flow correction model and the preset conversion integration model, and It_gt is the actual intermediate frame.
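The supervision target St_gt = exp(-abs(It - It_gt)) can be computed per pixel; a small sketch, assuming intensity maps normalized to [0, 1]:

```python
import math

def interp_confidence(pred, actual):
    """Per-pixel actual frame inserting confidence exp(-|It - It_gt|).

    pred/actual are same-sized 2D intensity maps; identical pixels yield
    confidence 1.0, and confidence decays as the difference grows.
    """
    return [[math.exp(-abs(p - a)) for p, a in zip(pr, ar)]
            for pr, ar in zip(pred, actual)]
```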
It should be noted that the frame inserting confidence output by the preset optical flow correction model or the preset conversion integration model does not participate in the frame inserting process itself, such as the data integration process; accordingly, the frame inserting confidence corresponding to the intermediate frame output by the conversion sub-model in the preset optical flow correction model or in the preset conversion integration model may also be used.
S370, determining a target interpolation region corresponding to the intermediate frame based on the interpolation frame confidence.
Specifically, a target interpolation region in the intermediate frame may be determined based on the interpolation confidence corresponding to the intermediate frame, so as to determine whether a larger artifact exists in the intermediate frame based on the size of the target interpolation region.
Illustratively, S370 may include: based on a preset confidence threshold, performing binarization processing on the interpolation frame confidence, and determining a binarization image corresponding to the intermediate frame; and performing opening operation on the binarized image, obtaining a residual area after the opening operation, and taking the residual area as a target interpolation area corresponding to the intermediate frame.
The preset confidence threshold may refer to the minimum acceptable confidence. Specifically, the frame inserting confidence corresponding to each pixel point in the intermediate frame is compared with the preset confidence threshold, and binarization is performed based on the comparison result: for example, the pixel value of a pixel point whose frame inserting confidence is greater than or equal to the preset confidence threshold is set to 255 (i.e., white), and the pixel value of a pixel point whose frame inserting confidence is less than the preset confidence threshold is set to 0 (i.e., black), thereby obtaining the binarized image (i.e., a black-and-white image) corresponding to the intermediate frame. An opening operation (erosion followed by dilation) is then performed on the binarized image to remove smaller areas; the area remaining after the opening operation, namely the area where the frame insertion is relatively poor, is taken as the target interpolation area corresponding to the intermediate frame.
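A minimal sketch of the binarization-plus-opening step in pure Python. Treating the low-confidence region as the foreground of the opening is an assumption here; the patent only states that the opening removes smaller areas from the binarized image.

```python
def _morph(img, reduce_fn):
    """Apply min (erosion) or max (dilation) over each 3x3 neighbourhood."""
    h, w = len(img), len(img[0])
    return [[reduce_fn(img[yy][xx]
                       for yy in range(max(0, y - 1), min(h, y + 2))
                       for xx in range(max(0, x - 1), min(w, x + 2)))
             for x in range(w)] for y in range(h)]

def target_region_size(conf, thr):
    """Binarize the confidence map (low confidence = suspect foreground)
    and count the suspect pixels surviving an opening (erosion, dilation).
    Isolated low-confidence specks are erased; larger regions remain."""
    bad = [[255 if c < thr else 0 for c in row] for row in conf]
    opened = _morph(_morph(bad, min), max)
    return sum(v == 255 for row in opened for v in row)
```

The returned size would then be compared against the preset interpolation region threshold of S380.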
S380, detecting whether the target interpolation region is smaller than or equal to a preset interpolation region threshold, if yes, executing step S390, otherwise, executing step S391.
Specifically, whether the intermediate frame obtained by frame interpolation has larger artifacts or not is determined by detecting whether the target interpolation area is smaller than or equal to a preset interpolation area threshold value.
S390, determining the intermediate frame as a target insertion frame corresponding to two adjacent frames.
Specifically, when the target interpolation region is smaller than or equal to a preset interpolation region threshold, it indicates that the interpolation region in the intermediate frame is smaller, and within an acceptable range, the intermediate frame can be used as a target interpolation frame corresponding to two adjacent frames.
S391, the first frame or the second frame is determined to be the target insertion frame corresponding to the two adjacent frames.
Specifically, when the target interpolation region is larger than the preset interpolation region threshold, the interpolation region in the intermediate frame is large, indicating that noticeable artifacts exist that would affect the user experience. In this case, the first frame or the second frame can be used as the target insertion frame corresponding to the two adjacent frames, so that frame inserting artifacts are avoided, the frame inserting effect is further improved, and the user viewing experience is improved.
According to the above technical solution, the frame inserting confidence corresponding to the intermediate frame is determined based on the preset optical flow correction model or the preset conversion integration model, the target interpolation region corresponding to the intermediate frame is determined based on the frame inserting confidence, and if the target interpolation region is smaller than or equal to the preset interpolation region threshold, the intermediate frame is determined to be the target insertion frame corresponding to the two adjacent frames; if the target interpolation region is larger than the preset interpolation region threshold, the first frame or the second frame is determined to be the target insertion frame. In this way, artifacts caused by direct frame insertion when the low-confidence region is too large can be avoided, the frame inserting effect is further improved, and the user viewing experience is improved.
Based on the above technical solution, S320 may include: determining the image difference degree between a first frame and a second frame in two adjacent frames; if the image difference is smaller than a preset difference threshold, respectively performing downsampling processing on a first frame and a second frame in two adjacent frames based on a preset sampling multiple to obtain a first downsampled frame and a second downsampled frame.
Specifically, after the two adjacent frames are obtained, the image difference degree between the first frame and the second frame may be determined based on perceptual hashing, and it is detected whether the image difference degree is smaller than the preset difference degree threshold. If so, the image content difference between the first frame and the second frame is small and the subsequent frame inserting operation may be executed; otherwise, the first frame or the second frame may be directly used as the target insertion frame corresponding to the two adjacent frames, so that frame inserting artifacts caused by an excessive image content difference are avoided, the frame inserting effect is further improved, and the user viewing experience is improved.
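Perceptual hashing can take several forms; as a simplified stand-in (a full pHash would first downsize the image and apply a DCT, which is omitted here), an average hash compares each pixel against the image mean and measures the difference as a Hamming distance:

```python
def average_hash(img):
    """Tiny average-hash: bit i is 1 if pixel i is above the image mean."""
    flat = [v for row in img for v in row]
    mean = sum(flat) / len(flat)
    return [1 if v > mean else 0 for v in flat]

def hash_difference(img1, img2):
    """Image difference degree as the Hamming distance of the two hashes."""
    return sum(a != b for a, b in zip(average_hash(img1), average_hash(img2)))
```

The resulting distance would be compared against the preset difference degree threshold.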
It should be noted that, in this embodiment, the image difference degree between the first frame and the second frame in the two adjacent frames may also be determined after the intermediate frame is determined in S350: step S360 is executed when the image difference degree is smaller than the preset difference degree threshold; otherwise step S391 is executed. The artifact judgment step thus serves as a post-processing operation, enabling the threads to run in parallel.
Based on the above technical solution, S320 may further include: determining a first light flow graph from a first frame to a second frame and a second light flow graph from the second frame to the first frame in two adjacent frames; determining a first large optical flow area corresponding to the first optical flow map and a second large optical flow area corresponding to the second optical flow map; if the first large optical flow area and the second large optical flow area are smaller than the preset optical flow area threshold, respectively performing downsampling processing on a first frame and a second frame in two adjacent frames based on a preset sampling multiple to obtain a first downsampled frame and a second downsampled frame.
The preset optical flow area threshold value flow_area_thr may be determined based on the image area img_area and the optical flow area ratio flow_area_ratio, that is, flow_area_thr=img_area×flow_area_ratio. The image area img_area may be obtained based on the height img_h and the width img_w of the image, i.e., img_area=img_h×img_w.
Specifically, a first optical flow map F12 from the first frame to the second frame and a second optical flow map F21 from the second frame to the first frame may be determined based on the optical flow generation model, and a first large optical flow area in the first optical flow map F12 and a second large optical flow area in the second optical flow map F21 may be determined. If both the first large optical flow area and the second large optical flow area are smaller than the preset optical flow area threshold, the motion between the first frame and the second frame is small, and the subsequent frame inserting operation may be executed. If the first large optical flow area and/or the second large optical flow area is larger than the preset optical flow area threshold, the motion between the first frame and the second frame is large, and direct frame insertion is likely to produce a bad frame; in this case, the first frame or the second frame can be used as the target insertion frame corresponding to the two adjacent frames, so that frame inserting artifacts caused by excessive motion between the two adjacent frames are avoided, the frame inserting effect is further improved, and the user viewing experience is improved.
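The decision above reduces to a threshold check against flow_area_thr = img_area × flow_area_ratio. A sketch, where the function name and the example ratio value are assumptions for illustration:

```python
def should_interpolate(area1, area2, img_h, img_w, flow_area_ratio=0.1):
    """Interpolate only if both large optical flow areas are below
    flow_area_thr = (img_h * img_w) * flow_area_ratio; otherwise the
    caller falls back to copying a source frame."""
    flow_area_thr = img_h * img_w * flow_area_ratio
    return area1 < flow_area_thr and area2 < flow_area_thr
```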
Illustratively, determining a first large optical flow region corresponding to the first optical flow map may include: determining an optical flow value corresponding to each pixel point based on the first optical flow graph; and determining a pixel area formed by all pixel points with optical flow values larger than a preset optical flow threshold value as a first large optical flow area corresponding to the first optical flow diagram.
The preset optical flow threshold flow_motion_thr may be determined based on the image hypotenuse img_hypotenuse and the optical flow motion ratio flow_motion_ratio, that is, flow_motion_thr = img_hypotenuse × flow_motion_ratio. The image hypotenuse img_hypotenuse may be obtained from the height img_h and the width img_w of the image, i.e., img_hypotenuse = sqrt(img_h^2 + img_w^2).
Specifically, the optical flow value flow_magnitude corresponding to each pixel point in the first optical flow map may be determined from the optical flow value f12_x in the X direction and the optical flow value f12_y in the Y direction of that pixel point, that is, flow_magnitude = sqrt(f12_x^2 + f12_y^2). Whether the optical flow value flow_magnitude corresponding to each pixel point is larger than the preset optical flow threshold flow_motion_thr is then detected, and the pixel area formed by all pixel points exceeding the preset optical flow threshold is taken as the first large optical flow area corresponding to the first optical flow map. Similarly, the second large optical flow area corresponding to the second optical flow map may be determined.
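Putting the two formulas together, the large optical flow area can be measured by counting the pixels whose flow magnitude exceeds flow_motion_thr. A sketch; the example motion ratio value is an assumption:

```python
import math

def large_flow_area(flow, img_h, img_w, flow_motion_ratio=0.05):
    """Count pixels whose flow magnitude sqrt(fx^2 + fy^2) exceeds
    flow_motion_thr = sqrt(img_h^2 + img_w^2) * flow_motion_ratio."""
    img_hypotenuse = math.sqrt(img_h ** 2 + img_w ** 2)
    flow_motion_thr = img_hypotenuse * flow_motion_ratio
    return sum(1 for row in flow for (fx, fy) in row
               if math.sqrt(fx * fx + fy * fy) > flow_motion_thr)
```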
It should be noted that, in this embodiment, after the intermediate frame is determined in S350, a first optical flow map from the first frame to the second frame and a second optical flow map from the second frame to the first frame may also be determined, together with the first large optical flow area corresponding to the first optical flow map and the second large optical flow area corresponding to the second optical flow map. Step S360 is executed when both large optical flow areas are smaller than the preset optical flow area threshold; otherwise step S391 is executed. The artifact judgment step thus serves as a post-processing operation, enabling the threads to run in parallel.
Based on the above technical solutions, the method may further include: executing, by a first thread on the central processor, the preprocessing operation for the first adjacent two frames in a first queue, and storing the preprocessing result in a second queue; executing, by a second thread on the image processor, the intermediate frame determining operation corresponding to the second adjacent two frames based on the preprocessing result corresponding to the second adjacent two frames in the second queue, and storing the determined intermediate frame in the second queue; and executing, by a third thread on the central processor, the post-processing operation corresponding to the third adjacent two frames based on the intermediate frame corresponding to the third adjacent two frames in the second queue, and storing the determined target insertion frame corresponding to the third adjacent two frames in a third queue.
Wherein the second adjacent two frames are two adjacent frames preceding the first adjacent two frames, and the third adjacent two frames are two adjacent frames preceding the second adjacent two frames. For example, if the first adjacent two frames refer to the 4th and 5th frames in the target video, the second adjacent two frames may refer to two adjacent frames located before the 5th frame in the target video, such as the 3rd and 4th frames or the 2nd and 3rd frames. If the second adjacent two frames refer to the 3rd and 4th frames in the target video, the third adjacent two frames may refer to two adjacent frames located before the 4th frame, such as the 2nd and 3rd frames or the 1st and 2nd frames. The first queue may store each pair of adjacent frames in frame inserting order. The second queue may store the preprocessing result corresponding to each pair of adjacent frames in the same order. The third queue may store the target frame inserting result corresponding to each pair of adjacent frames in the same order.
Specifically, for every two adjacent frames, this embodiment may divide the entire frame inserting process into three phases: a preprocessing operation, an intermediate frame determining operation, and a post-processing operation. The preprocessing operation may be, but is not limited to, performing color transformation on each of the two adjacent frames and determining whether a scene change exists, so as to avoid excessively severe scene changes. The intermediate frame determining operation may refer to the process of determining an intermediate frame between two adjacent frames based on the preset optical flow correction model and the preset conversion integration model. The post-processing operation may be the process of determining, based on the frame inserting confidence, the image content difference degree, and/or the large optical flow area size, whether the frame inserting condition is met, and thereby the final target insertion frame between the two adjacent frames. Fig. 4 gives an example of parallel thread processing. As shown in fig. 4, in this embodiment the preprocessing and post-processing operations may run on the CPU and the intermediate frame determining operation on the GPU, with each operation handled by a separate thread. Each operation has its own input queue and output queue, and the output queue of the current operation is the input queue of the next operation, so the frame inserting operations for successive pairs of adjacent frames can be processed in parallel in a pipelined manner. The CPU and the GPU thus execute simultaneously, which improves their resource utilization, increases the throughput of the system, and further improves the frame inserting efficiency.
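The queue-chained pipeline can be sketched with standard threads and FIFO queues. One queue per stage edge is used here for clarity (the patent reuses the second queue for both the preprocessing results and the intermediate frames), and the stage callables stand in for the pre-processing, GPU intermediate-frame, and post-processing operations; all names are illustrative.

```python
import queue
import threading

def run_pipeline(pairs, pre, mid, post):
    """Run three pipelined stages over a stream of frame pairs.

    Each stage reads from its input queue and writes to its output queue,
    so consecutive pairs are processed concurrently by different stages.
    """
    q_in, q_pre, q_mid, q_out = (queue.Queue() for _ in range(4))
    for p in pairs:
        q_in.put(p)
    q_in.put(None)                      # end-of-stream sentinel

    def stage(src, dst, fn):
        while True:
            item = src.get()
            if item is None:            # propagate sentinel and stop
                dst.put(None)
                return
            dst.put(fn(item))

    stages = [(q_in, q_pre, pre), (q_pre, q_mid, mid), (q_mid, q_out, post)]
    threads = [threading.Thread(target=stage, args=s) for s in stages]
    for t in threads:
        t.start()
    results = []
    while True:
        r = q_out.get()
        if r is None:
            break
        results.append(r)
    for t in threads:
        t.join()
    return results
```

Because each stage is a single thread draining FIFO queues, output order matches input order while the three stages overlap in time.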
The following is an embodiment of a video frame inserting apparatus provided by an embodiment of the present disclosure, where the apparatus and the video frame inserting method of the foregoing embodiment belong to the same inventive concept, and details of the embodiment of the video frame inserting apparatus are not described in detail, and reference may be made to the video frame inserting method of the foregoing embodiment.
Example III
Fig. 5 is a schematic structural diagram of a video frame inserting apparatus according to a third embodiment of the present disclosure, where the present embodiment is applicable to a case of inserting frames into every two adjacent frames in a video. As shown in fig. 5, the apparatus specifically includes: two adjacent frames acquisition module 510, downsampling processing module 520, downsampling information determination module 530, upsampling processing module 540, and target frame insertion determination module 550.
The adjacent two-frame acquisition module 510 is configured to acquire two adjacent frames in the target video; the downsampling processing module 520 is configured to perform downsampling processing on a first frame and a second frame in two adjacent frames based on a preset sampling multiple, so as to obtain a first downsampled frame and a second downsampled frame; the downsampling information determination module 530 is configured to determine a first downsampled light flow map corresponding to the first downsampled frame, a second downsampled light flow map corresponding to the second downsampled frame, and downsampling correction information based on the preset optical flow correction model; the up-sampling processing module 540 is configured to perform up-sampling processing on the first downsampled optical flowchart, the second downsampled optical flowchart, and the downsampled correction information based on a preset sampling multiple, to obtain a first upsampled optical flowchart, a second upsampled optical flowchart, and upsampled correction information; the target frame inserting determining module 550 is configured to determine an intermediate frame between two adjacent frames according to the first frame, the second frame, the first upsampled light flow chart, the second upsampled light flow chart, and the upsampled correction information based on the preset conversion integration model, and determine a target frame inserting corresponding to the two adjacent frames based on the intermediate frame.
According to the technical solution of this embodiment, the first frame and the second frame of two adjacent frames in the target video are each downsampled based on a preset sampling multiple to obtain a first downsampled frame and a second downsampled frame of reduced resolution, so that the first downsampled optical flow map corresponding to the first downsampled frame, the second downsampled optical flow map corresponding to the second downsampled frame, and the downsampling correction information can be determined more quickly based on the preset optical flow correction model, improving the frame inserting efficiency. The first downsampled optical flow map, the second downsampled optical flow map, and the downsampling correction information are each upsampled based on the preset sampling multiple to obtain a first upsampled optical flow map, a second upsampled optical flow map, and upsampling correction information at the original resolution. Based on the preset conversion integration model, an intermediate frame between the two adjacent frames can then be accurately determined from the first frame, the second frame, the first upsampled optical flow map, the second upsampled optical flow map, and the upsampling correction information, and the target insertion frame corresponding to the two adjacent frames is determined based on the intermediate frame, thereby improving the video frame inserting efficiency while guaranteeing the frame inserting effect.
On the basis of the technical scheme, the preset optical flow correction model comprises the following steps: an optical flow generation sub-model, a first conversion sub-model, and a correction sub-model; the downsampling information determination module 530 is specifically configured to:
Respectively inputting the first downsampling frame and the second downsampling frame into an optical flow generation submodel to obtain a first downsampling optical flow diagram corresponding to the first downsampling frame and a second downsampling optical flow diagram corresponding to the second downsampling frame; inputting the first downsampling frame and the first downsampling light flow graph into a first conversion sub-model to obtain a first downsampling conversion frame corresponding to a preset frame inserting moment; inputting the second downsampling frame and the second downsampling light flow graph into the first conversion sub-model to obtain a second downsampling conversion frame corresponding to the preset frame inserting moment; and inputting the first downsampled converted frame and the second downsampled converted frame into the correction submodel to obtain downsampled correction information.
On the basis of the above technical solutions, the preset optical flow correction model further includes: extracting a sub-model of the semantic meaning; the downsampling information determination module 530 is further specifically configured to:
Respectively inputting the first downsampling frame and the second downsampling frame into a semantic extraction submodel to obtain first downsampling semantic information corresponding to the first downsampling frame and second downsampling semantic information corresponding to the second downsampling frame; and inputting the first downsampling conversion frame, the second downsampling conversion frame, the first downsampling semantic information and the second downsampling semantic information into the correction submodel to obtain downsampling correction information.
On the basis of the above technical solutions, the preset conversion integration model includes: a second conversion sub-model and an integration sub-model;
the target frame insertion determination module 550 includes: the intermediate frame determining unit is specifically configured to:
Inputting the first frame and the first up-sampling light flow graph into a second conversion sub-model to obtain a first conversion frame corresponding to a preset frame inserting moment; inputting the second frame and the second up-sampling light flow graph into a second conversion sub-model to obtain a second conversion frame corresponding to the preset frame inserting moment; and inputting the first conversion frame, the second conversion frame and the up-sampling correction information into the integration submodel to obtain an intermediate frame between two adjacent frames.
Based on the above technical solutions, the target frame inserting determining module 550 includes: a target frame insertion determination unit; the target frame inserting determining unit specifically includes:
The frame inserting confidence determining subunit is used for determining the frame inserting confidence corresponding to the intermediate frame based on a preset optical flow correction model or a preset conversion integration model;
The target interpolation region determining subunit is used for determining a target interpolation region corresponding to the intermediate frame based on the interpolation frame confidence;
the first determining unit is used for determining the intermediate frame as a target interpolation frame corresponding to two adjacent frames if the target interpolation region is smaller than or equal to a preset interpolation region threshold value;
And the second determining unit is used for determining the first frame or the second frame as a target interpolation frame corresponding to two adjacent frames if the target interpolation region is larger than a preset interpolation region threshold value.
Based on the above technical solutions, the target interpolation region determining subunit is specifically configured to:
Based on a preset confidence threshold, performing binarization processing on the interpolation frame confidence, and determining a binarization image corresponding to the intermediate frame; and performing opening operation on the binarized image, obtaining a residual area after the opening operation, and taking the residual area as a target interpolation area corresponding to the intermediate frame.
Based on the above technical solutions, the downsampling processing module 520 is specifically configured to:
Determining the image difference degree between a first frame and a second frame in two adjacent frames; if the image difference is smaller than a preset difference threshold, respectively performing downsampling processing on a first frame and a second frame in two adjacent frames based on a preset sampling multiple to obtain a first downsampled frame and a second downsampled frame.
Based on the above technical solutions, the downsampling processing module 520 is further specifically configured to:
Determining a first light flow graph from a first frame to a second frame and a second light flow graph from the second frame to the first frame in two adjacent frames; determining a first large optical flow area corresponding to the first optical flow map and a second large optical flow area corresponding to the second optical flow map; if the first large optical flow area and the second large optical flow area are smaller than the preset optical flow area threshold, respectively performing downsampling processing on a first frame and a second frame in two adjacent frames based on a preset sampling multiple to obtain a first downsampled frame and a second downsampled frame.
Based on the above technical solutions, the downsampling processing module 520 is further specifically configured to:
Determining an optical flow value corresponding to each pixel point based on the first optical flow graph; and determining a pixel area formed by all pixel points with optical flow values larger than a preset optical flow threshold value as a first large optical flow area corresponding to the first optical flow diagram.
On the basis of the technical schemes, the device further comprises:
And the image processor executing module is used for executing the operation of determining an intermediate frame between two adjacent frames based on the preset optical flow correction model and the preset conversion integration model on the image processor loaded with the preset conversion plugin.
On the basis of the technical schemes, the device further comprises: the parallel processing module is specifically used for:
executing preprocessing operation of first two adjacent frames in a first queue by using a first thread on a central processing unit, and storing a preprocessing result into a second queue;
Executing an intermediate frame determining operation corresponding to a second adjacent two frames based on a preprocessing result corresponding to the second adjacent two frames in the second queue by using a second thread on the image processor, and storing the determined intermediate frame into the second queue, wherein the second adjacent two frames are two adjacent frames positioned before the first adjacent two frames;
And executing, by a third thread on the central processor, the post-processing operation corresponding to the third adjacent two frames based on the intermediate frames corresponding to the third adjacent two frames in the second queue, and storing the determined target insertion frames corresponding to the third adjacent two frames in the third queue, wherein the third adjacent two frames are two adjacent frames located before the second adjacent two frames.
The video frame inserting device provided by the embodiment of the disclosure can execute the video frame inserting method provided by any embodiment of the disclosure, and has the corresponding functional modules and beneficial effects of executing the video frame inserting method.
It should be noted that, in the embodiment of the video frame inserting apparatus, each unit and module included are only divided according to the functional logic, but not limited to the above-mentioned division, so long as the corresponding functions can be implemented; in addition, the specific names of the functional units are also only for distinguishing from each other, and are not used to limit the protection scope of the present invention.
Example IV
Referring now to fig. 6, a schematic diagram of an electronic device 900 suitable for use in implementing embodiments of the present disclosure is shown. The electronic device shown in fig. 6 is merely an example and should not be construed to limit the functionality and scope of use of the disclosed embodiments.
As shown in fig. 6, the electronic device 900 may include a processing means (e.g., a central processor, a graphics processor, etc.) 901, which may perform various appropriate actions and processes according to a program stored in a Read Only Memory (ROM) 902 or a program loaded from a storage means 908 into a Random Access Memory (RAM) 903. In the RAM 903, various programs and data necessary for the operation of the electronic device 900 are also stored. The processing device 901, the ROM 902, and the RAM 903 are connected to each other through a bus 904. An input/output (I/O) interface 905 is also connected to the bus 904.
In general, the following devices may be connected to the I/O interface 905: input devices 906 including, for example, a touch screen, touchpad, keyboard, mouse, camera, microphone, accelerometer, gyroscope, and the like; an output device 907 including, for example, a Liquid Crystal Display (LCD), a speaker, a vibrator, and the like; storage 908 including, for example, magnetic tape, hard disk, etc.; and a communication device 909. The communication means 909 may allow the electronic device 900 to communicate wirelessly or by wire with other devices to exchange data. While fig. 6 shows an electronic device 900 having various means, it is to be understood that not all of the illustrated means are required to be implemented or provided. More or fewer devices may be implemented or provided instead.
In particular, according to embodiments of the present disclosure, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program carried on a non-transitory computer-readable medium, the computer program containing program code for performing the method shown in the flowchart. In such an embodiment, the computer program may be downloaded and installed from a network via the communication device 909, or installed from the storage device 908, or installed from the ROM 902. When the computer program is executed by the processing device 901, the above-described functions defined in the methods of the embodiments of the present disclosure are performed.
The electronic device provided by the embodiment of the present disclosure and the video frame inserting method provided by the above embodiment belong to the same inventive concept, and technical details not described in detail in the embodiment of the present disclosure may be referred to the above embodiment, and the embodiment of the present disclosure has the same beneficial effects as the above embodiment.
Example V
The embodiment of the present disclosure provides a computer storage medium having stored thereon a computer program which, when executed by a processor, implements the video frame inserting method provided by the above embodiments.
It should be noted that the computer readable medium described in the present disclosure may be a computer readable signal medium or a computer readable storage medium, or any combination of the two. The computer readable storage medium can be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or a combination of any of the foregoing. More specific examples of the computer-readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this disclosure, a computer-readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In the present disclosure, however, the computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, with the computer-readable program code embodied therein. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination of the foregoing. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. 
Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: electrical wires, fiber optic cables, RF (radio frequency), and the like, or any suitable combination of the foregoing.
In some embodiments, the clients and servers may communicate using any currently known or future developed network protocol, such as HTTP (HyperText Transfer Protocol), and may be interconnected with any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include a local area network ("LAN"), a wide area network ("WAN"), an internetwork (e.g., the Internet), and peer-to-peer networks (e.g., ad hoc peer-to-peer networks), as well as any currently known or future developed networks.
The computer readable medium may be contained in the server; or may exist alone without being assembled into the server.
The computer readable medium carries one or more programs which, when executed by the server, cause the server to: acquire two adjacent frames in a target video; based on a preset sampling multiple, respectively perform downsampling processing on a first frame and a second frame in the two adjacent frames to obtain a first downsampled frame and a second downsampled frame; determine, based on a preset optical flow correction model, a first downsampled optical flow map corresponding to the first downsampled frame, a second downsampled optical flow map corresponding to the second downsampled frame, and downsampling correction information; based on the preset sampling multiple, respectively perform upsampling processing on the first downsampled optical flow map, the second downsampled optical flow map, and the downsampling correction information to obtain a first upsampled optical flow map, a second upsampled optical flow map, and upsampling correction information; determine, based on a preset conversion integration model, an intermediate frame between the two adjacent frames according to the first frame, the second frame, the first upsampled optical flow map, the second upsampled optical flow map, and the upsampling correction information; and determine a target interpolation frame corresponding to the two adjacent frames based on the intermediate frame.
Computer program code for carrying out operations of the present disclosure may be written in one or more programming languages or combinations thereof, including but not limited to object-oriented programming languages such as Java, Smalltalk, and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computer (for example, through the Internet using an Internet service provider).
The flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The units involved in the embodiments of the present disclosure may be implemented by means of software, or may be implemented by means of hardware. Wherein the name of the unit does not constitute a limitation of the unit itself in some cases, for example, the editable content display unit may also be described as an "editing unit".
The functions described above herein may be performed, at least in part, by one or more hardware logic components. For example, without limitation, exemplary types of hardware logic components that may be used include: a Field Programmable Gate Array (FPGA), an Application Specific Integrated Circuit (ASIC), an Application Specific Standard Product (ASSP), a system on a chip (SOC), a Complex Programmable Logic Device (CPLD), and the like.
In the context of this disclosure, a machine-readable medium may be a tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. The machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
According to one or more embodiments of the present disclosure, there is provided a video frame inserting method, including:
Acquiring two adjacent frames in a target video;
based on a preset sampling multiple, respectively performing downsampling processing on a first frame and a second frame in the two adjacent frames to obtain a first downsampled frame and a second downsampled frame;
determining, based on a preset optical flow correction model, a first downsampled optical flow map corresponding to the first downsampled frame, a second downsampled optical flow map corresponding to the second downsampled frame, and downsampling correction information;
based on the preset sampling multiple, respectively performing upsampling processing on the first downsampled optical flow map, the second downsampled optical flow map, and the downsampling correction information to obtain a first upsampled optical flow map, a second upsampled optical flow map, and upsampling correction information;
determining, based on a preset conversion integration model, an intermediate frame between the two adjacent frames according to the first frame, the second frame, the first upsampled optical flow map, the second upsampled optical flow map, and the upsampling correction information, and determining a target interpolation frame corresponding to the two adjacent frames based on the intermediate frame.
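The downsample-then-upsample structure above can be illustrated with a short sketch. The function names, the nearest-neighbour resampling, and the dummy flow values are all assumptions for illustration, not the disclosed implementation; the one substantive detail shown, which is standard practice when resizing flow fields, is that flow vectors expressed in pixels must be rescaled by the same factor as the grid when upsampled.

```python
import numpy as np

def downsample(frame: np.ndarray, factor: int) -> np.ndarray:
    """Nearest-neighbour downsampling by an integer factor (illustrative only;
    the disclosure does not fix the resampling kernel)."""
    return frame[::factor, ::factor]

def upsample_flow(flow: np.ndarray, factor: int) -> np.ndarray:
    """Upsample a low-resolution optical flow map back to full resolution.

    Because flow vectors are measured in pixels, the displacement values
    must also be multiplied by the sampling factor, not just the grid size.
    """
    up = np.repeat(np.repeat(flow, factor, axis=0), factor, axis=1)
    return up * factor

factor = 2                               # stand-in "preset sampling multiple"
first_frame = np.random.rand(8, 8, 3)
first_down = downsample(first_frame, factor)   # low-resolution model input
flow_down = np.ones((4, 4, 2))                 # placeholder for model output
flow_up = upsample_flow(flow_down, factor)     # full-resolution, rescaled flow
```

Running the optical flow model only on the downsampled frames and upsampling its outputs is what makes the method cheaper than computing flow at full resolution.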
According to one or more embodiments of the present disclosure, there is provided a video frame inserting method, further comprising:
Optionally, the preset optical flow correction model includes: an optical flow generation sub-model, a first conversion sub-model, and a correction sub-model;
The determining, based on a preset optical flow correction model, a first downsampled optical flow map corresponding to the first downsampled frame, a second downsampled optical flow map corresponding to the second downsampled frame, and downsampling correction information includes:
respectively inputting the first downsampled frame and the second downsampled frame into the optical flow generation sub-model to obtain a first downsampled optical flow map corresponding to the first downsampled frame and a second downsampled optical flow map corresponding to the second downsampled frame;
inputting the first downsampled frame and the first downsampled optical flow map into the first conversion sub-model to obtain a first downsampled converted frame corresponding to a preset interpolation moment;
inputting the second downsampled frame and the second downsampled optical flow map into the first conversion sub-model to obtain a second downsampled converted frame corresponding to the preset interpolation moment;
and inputting the first downsampled converted frame and the second downsampled converted frame into the correction sub-model to obtain the downsampling correction information.
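The disclosure does not specify how the conversion sub-model maps a frame and its flow map to the preset interpolation moment. A common approach in flow-based interpolation, shown here purely as an assumed sketch, is to warp the frame by the flow scaled by the time fraction t in [0, 1]; nearest-neighbour sampling and border clamping keep the example short.

```python
import numpy as np

def warp_to_time(frame: np.ndarray, flow: np.ndarray, t: float) -> np.ndarray:
    """Warp `frame` toward interpolation instant `t` using per-pixel flow.

    Illustrative stand-in for a "conversion sub-model": each output pixel
    samples the input at its position displaced by t * flow (nearest
    neighbour, clamped at the image border).
    """
    h, w = frame.shape[:2]
    ys, xs = np.mgrid[0:h, 0:w]
    src_x = np.clip(np.round(xs + t * flow[..., 0]).astype(int), 0, w - 1)
    src_y = np.clip(np.round(ys + t * flow[..., 1]).astype(int), 0, h - 1)
    return frame[src_y, src_x]

frame = np.arange(16, dtype=float).reshape(4, 4)
flow = np.zeros((4, 4, 2))
flow[..., 0] = 2.0                       # uniform 2-pixel horizontal motion
half = warp_to_time(frame, flow, 0.5)    # samples one pixel to the right
```

In a real model the two warped frames would then be fused (here, by the correction sub-model and later the integration sub-model) rather than used directly.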
According to one or more embodiments of the present disclosure, there is provided a video frame inserting method, further comprising:
Optionally, the preset optical flow correction model further includes: a semantic extraction sub-model;
The step of inputting the first downsampled converted frame and the second downsampled converted frame into the correction sub-model to obtain the downsampling correction information includes:
respectively inputting the first downsampled frame and the second downsampled frame into the semantic extraction sub-model to obtain first downsampled semantic information corresponding to the first downsampled frame and second downsampled semantic information corresponding to the second downsampled frame;
and inputting the first downsampled converted frame, the second downsampled converted frame, the first downsampled semantic information, and the second downsampled semantic information into the correction sub-model to obtain the downsampling correction information.
According to one or more embodiments of the present disclosure, there is provided a video frame inserting method [Example Four], further comprising:
optionally, the preset conversion integration model includes: a second conversion sub-model and an integration sub-model;
The determining, based on a preset conversion integration model, an intermediate frame between the two adjacent frames according to the first frame, the second frame, the first upsampled optical flow map, the second upsampled optical flow map, and the upsampling correction information includes:
inputting the first frame and the first upsampled optical flow map into the second conversion sub-model to obtain a first converted frame corresponding to a preset interpolation moment;
inputting the second frame and the second upsampled optical flow map into the second conversion sub-model to obtain a second converted frame corresponding to the preset interpolation moment;
and inputting the first converted frame, the second converted frame, and the upsampling correction information into the integration sub-model to obtain the intermediate frame between the two adjacent frames.
According to one or more embodiments of the present disclosure, there is provided a video frame inserting method [Example Five], further comprising:
Optionally, the determining, based on the intermediate frame, a target interpolation frame corresponding to the two adjacent frames includes:
determining an interpolation confidence corresponding to the intermediate frame based on the preset optical flow correction model or the preset conversion integration model;
determining a target interpolation region corresponding to the intermediate frame based on the interpolation confidence;
If the target interpolation region is smaller than or equal to a preset interpolation region threshold, determining the intermediate frame as a target interpolation frame corresponding to the two adjacent frames;
And if the target interpolation region is larger than a preset interpolation region threshold, determining the first frame or the second frame as a target interpolation frame corresponding to the two adjacent frames.
According to one or more embodiments of the present disclosure, there is provided a video frame inserting method [Example Six], further comprising:
Optionally, the determining, based on the interpolation confidence, a target interpolation region corresponding to the intermediate frame includes:
performing binarization processing on the interpolation confidence based on a preset confidence threshold, and determining a binarized image corresponding to the intermediate frame;
and performing a morphological opening operation on the binarized image, and taking the region remaining after the opening operation as the target interpolation region corresponding to the intermediate frame.
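The binarize-then-open step, and the subsequent keep-or-fall-back decision, can be sketched in a few lines. Everything here is an assumption made for illustration: the polarity (low confidence marks the suspect region), the 3x3 structuring element, and the function names are not fixed by the disclosure; a real implementation would likely use a library routine such as OpenCV's morphological opening instead of this hand-rolled version.

```python
import numpy as np

def binarize(confidence: np.ndarray, thresh: float) -> np.ndarray:
    """Mark pixels whose interpolation confidence falls below `thresh`
    (assumed polarity: low confidence = suspect region)."""
    return (confidence < thresh).astype(np.uint8)

def _erode(mask: np.ndarray) -> np.ndarray:
    """3x3 binary erosion: a pixel survives only if its whole 3x3
    neighbourhood is set."""
    p = np.pad(mask, 1, constant_values=0)
    h, w = mask.shape
    out = np.ones_like(mask)
    for dy in range(3):
        for dx in range(3):
            out &= p[dy:dy + h, dx:dx + w]
    return out

def _dilate(mask: np.ndarray) -> np.ndarray:
    """3x3 binary dilation (the dual of erosion)."""
    p = np.pad(mask, 1, constant_values=0)
    h, w = mask.shape
    out = np.zeros_like(mask)
    for dy in range(3):
        for dx in range(3):
            out |= p[dy:dy + h, dx:dx + w]
    return out

def target_region_area(confidence: np.ndarray, conf_thresh: float) -> int:
    """Opening (erosion then dilation) removes isolated noisy pixels; the
    area of what remains is taken as the target interpolation region."""
    opened = _dilate(_erode(binarize(confidence, conf_thresh)))
    return int(opened.sum())

def choose_frame(intermediate, fallback, confidence, conf_thresh, area_thresh):
    """Keep the intermediate frame only when the suspect region is small
    enough; otherwise fall back to repeating an input frame."""
    if target_region_area(confidence, conf_thresh) <= area_thresh:
        return intermediate
    return fallback
```

The opening operation is what makes the test robust: a single low-confidence pixel is discarded as noise, while a contiguous low-confidence block survives and can trigger the fallback.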
According to one or more embodiments of the present disclosure, there is provided a video frame inserting method, further comprising:
Optionally, the performing downsampling processing on the first frame and the second frame in the two adjacent frames based on a preset sampling multiple to obtain a first downsampled frame and a second downsampled frame includes:
determining the image difference degree between a first frame and a second frame in the two adjacent frames;
and if the image difference degree is smaller than a preset difference threshold, respectively performing downsampling processing on the first frame and the second frame in the two adjacent frames based on the preset sampling multiple to obtain the first downsampled frame and the second downsampled frame.
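The difference gate above can be sketched as follows. The disclosure does not define how the "image difference degree" is computed, so mean absolute pixel difference is used here purely as a plausible stand-in; the function name and threshold are illustrative.

```python
import numpy as np

def should_downsample(frame_a: np.ndarray, frame_b: np.ndarray,
                      diff_thresh: float) -> bool:
    """Gate the low-resolution (fast) path on how different the frames are.

    Mean absolute pixel difference stands in for the unspecified "image
    difference degree"; nearly identical frames are safe to process at
    reduced resolution.
    """
    diff = float(np.mean(np.abs(frame_a.astype(float) - frame_b.astype(float))))
    return diff < diff_thresh
```

When the gate fails, the method would fall back to full-resolution processing for that pair of frames.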
According to one or more embodiments of the present disclosure, there is provided a video frame inserting method, further comprising:
Optionally, the performing downsampling processing on the first frame and the second frame in the two adjacent frames based on a preset sampling multiple to obtain a first downsampled frame and a second downsampled frame, further includes:
determining a first optical flow map from the first frame to the second frame and a second optical flow map from the second frame to the first frame in the two adjacent frames;
determining a first large optical flow area corresponding to the first optical flow map and a second large optical flow area corresponding to the second optical flow map;
and if the first large optical flow area and the second large optical flow area are both smaller than a preset optical flow area threshold, respectively performing downsampling processing on the first frame and the second frame in the two adjacent frames based on the preset sampling multiple to obtain the first downsampled frame and the second downsampled frame.
According to one or more embodiments of the present disclosure, there is provided a video frame inserting method, further comprising:
optionally, the determining the first large optical flow area corresponding to the first optical flow map includes:
determining an optical flow value corresponding to each pixel point based on the first optical flow map;
and determining a pixel region formed by the pixel points whose optical flow values are larger than a preset optical flow threshold as the first large optical flow area corresponding to the first optical flow map.
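The large-optical-flow gate described above reduces to a magnitude threshold followed by a pixel count. The sketch below assumes an (H, W, 2) flow array of per-pixel (dx, dy) displacements and Euclidean magnitude as the "optical flow value"; the disclosure does not pin down either convention.

```python
import numpy as np

def large_flow_area(flow: np.ndarray, flow_thresh: float) -> int:
    """Pixel count of the region whose optical flow magnitude exceeds
    `flow_thresh`; `flow` is an (H, W, 2) array of per-pixel (dx, dy)."""
    magnitude = np.sqrt(flow[..., 0] ** 2 + flow[..., 1] ** 2)
    return int(np.count_nonzero(magnitude > flow_thresh))

def downsampling_allowed(flow_ab: np.ndarray, flow_ba: np.ndarray,
                         flow_thresh: float, area_thresh: int) -> bool:
    """Permit the downsampled path only when neither direction of flow
    contains too large a fast-motion region."""
    return (large_flow_area(flow_ab, flow_thresh) < area_thresh and
            large_flow_area(flow_ba, flow_thresh) < area_thresh)
```

The rationale is that large motion is exactly where low-resolution flow estimation degrades, so such pairs are routed to the full-resolution path.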
According to one or more embodiments of the present disclosure, there is provided a video frame inserting method, further comprising:
optionally, the method further comprises:
executing, on an image processor loaded with a preset conversion plugin, the operation of determining the intermediate frame between the two adjacent frames based on the preset optical flow correction model and the preset conversion integration model.
According to one or more embodiments of the present disclosure, there is provided a video frame inserting method, further comprising:
optionally, the method further comprises:
executing, by a first thread on a central processing unit, a preprocessing operation on a first pair of adjacent frames in a first queue, and storing the preprocessing result in a second queue;
executing, by a second thread on the image processor, an intermediate-frame determining operation for a second pair of adjacent frames based on the preprocessing result corresponding to the second pair of adjacent frames in the second queue, and storing the determined intermediate frame in the second queue, wherein the second pair of adjacent frames is a pair of adjacent frames located before the first pair of adjacent frames;
and executing, by a third thread on the central processing unit, a post-processing operation for a third pair of adjacent frames based on the intermediate frame corresponding to the third pair of adjacent frames in the second queue, and storing the determined target interpolation frame corresponding to the third pair of adjacent frames in a third queue, wherein the third pair of adjacent frames is a pair of adjacent frames located before the second pair of adjacent frames.
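The three-thread pipeline above can be sketched with standard thread-safe queues. Two simplifications are assumptions of this sketch, not the disclosure: each hand-off gets its own queue (the disclosure routes both preprocessing results and intermediate frames through the "second queue"), and the GPU inference stage is replaced by a trivial averaging function so the example runs anywhere. The point illustrated is the overlap: while the GPU works on one frame pair, the CPU threads pre-process the next pair and post-process the previous one.

```python
import queue
import threading

SENTINEL = object()  # end-of-stream marker forwarded through every stage

def preprocess(pair):          # first thread (CPU): e.g. decode / normalise
    return pair

def infer_intermediate(pair):  # second thread (GPU in the real system)
    a, b = pair
    return (a + b) / 2.0       # stand-in for the model's intermediate frame

def postprocess(mid):          # third thread (CPU): e.g. colour convert / encode
    return mid

def stage(src, dst, fn):
    """Pull items from src, apply fn, push results to dst, forward the sentinel."""
    while (item := src.get()) is not SENTINEL:
        dst.put(fn(item))
    dst.put(SENTINEL)

q1, q2, q3, q_out = (queue.Queue() for _ in range(4))
threads = [
    threading.Thread(target=stage, args=(q1, q2, preprocess)),
    threading.Thread(target=stage, args=(q2, q3, infer_intermediate)),
    threading.Thread(target=stage, args=(q3, q_out, postprocess)),
]
for t in threads:
    t.start()
for pair in [(0.0, 2.0), (2.0, 4.0), (4.0, 6.0)]:  # adjacent-frame pairs
    q1.put(pair)
q1.put(SENTINEL)
for t in threads:
    t.join()

results = []
while (item := q_out.get()) is not SENTINEL:
    results.append(item)
```

Because each stage runs in its own thread and blocks only on its queues, the stages naturally process different frame pairs concurrently, which is the throughput benefit the disclosure describes.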
According to one or more embodiments of the present disclosure, there is provided a video frame inserting apparatus, comprising:
the adjacent two-frame acquisition module is used for acquiring adjacent two frames in the target video;
The downsampling processing module is used for respectively performing downsampling processing on a first frame and a second frame in the two adjacent frames based on a preset sampling multiple to obtain a first downsampled frame and a second downsampled frame;
The downsampling information determining module is used for determining, based on a preset optical flow correction model, a first downsampled optical flow map corresponding to the first downsampled frame, a second downsampled optical flow map corresponding to the second downsampled frame, and downsampling correction information;
The upsampling processing module is used for respectively performing upsampling processing on the first downsampled optical flow map, the second downsampled optical flow map, and the downsampling correction information based on the preset sampling multiple to obtain a first upsampled optical flow map, a second upsampled optical flow map, and upsampling correction information;
The target interpolation frame determining module is configured to determine an intermediate frame between the two adjacent frames according to the first frame, the second frame, the first upsampled optical flow map, the second upsampled optical flow map, and the upsampling correction information based on a preset conversion integration model, and determine a target interpolation frame corresponding to the two adjacent frames based on the intermediate frame.
The foregoing description is only of the preferred embodiments of the present disclosure and an explanation of the technical principles employed. It will be appreciated by persons skilled in the art that the scope of the disclosure referred to herein is not limited to the specific combinations of the features described above, but also covers other technical solutions formed by any combination of the above features or their equivalents without departing from the concept of the disclosure, for example, technical solutions formed by substituting the above features with (but not limited to) technical features having similar functions disclosed in the present disclosure.
Moreover, although operations are depicted in a particular order, this should not be understood as requiring that such operations be performed in the particular order shown or in sequential order. In certain circumstances, multitasking and parallel processing may be advantageous. Likewise, while several specific implementation details are included in the above discussion, these should not be construed as limiting the scope of the present disclosure. Certain features that are described in the context of separate embodiments can also be implemented in combination in a single embodiment. Conversely, various features that are described in the context of a single embodiment can also be implemented in multiple embodiments separately or in any suitable subcombination.
Although the subject matter has been described in language specific to structural features and/or methodological acts, it is to be understood that the subject matter defined in the appended claims is not necessarily limited to the specific features or acts described above. Rather, the specific features and acts described above are example forms of implementing the claims.

Claims (14)

1. A video frame inserting method, comprising:
Acquiring two adjacent frames in a target video;
based on a preset sampling multiple, respectively performing downsampling processing on a first frame and a second frame in the two adjacent frames to obtain a first downsampled frame and a second downsampled frame;
determining, based on a preset optical flow correction model, a first downsampled optical flow map corresponding to the first downsampled frame, a second downsampled optical flow map corresponding to the second downsampled frame, and downsampling correction information;
based on the preset sampling multiple, respectively performing upsampling processing on the first downsampled optical flow map, the second downsampled optical flow map, and the downsampling correction information to obtain a first upsampled optical flow map, a second upsampled optical flow map, and upsampling correction information;
determining, based on a preset conversion integration model, an intermediate frame between the two adjacent frames according to the first frame, the second frame, the first upsampled optical flow map, the second upsampled optical flow map, and the upsampling correction information, and determining an interpolation confidence corresponding to the intermediate frame based on the preset optical flow correction model or the preset conversion integration model; determining a target interpolation region corresponding to the intermediate frame based on the interpolation confidence; and if the target interpolation region is smaller than or equal to a preset interpolation region threshold, determining the intermediate frame as a target interpolation frame corresponding to the two adjacent frames.
2. The method of claim 1, wherein the preset optical flow correction model comprises: an optical flow generation sub-model, a first conversion sub-model, and a correction sub-model;
The determining, based on a preset optical flow correction model, a first downsampled optical flow map corresponding to the first downsampled frame, a second downsampled optical flow map corresponding to the second downsampled frame, and downsampling correction information includes:
respectively inputting the first downsampled frame and the second downsampled frame into the optical flow generation sub-model to obtain a first downsampled optical flow map corresponding to the first downsampled frame and a second downsampled optical flow map corresponding to the second downsampled frame;
inputting the first downsampled frame and the first downsampled optical flow map into the first conversion sub-model to obtain a first downsampled converted frame corresponding to a preset interpolation moment;
inputting the second downsampled frame and the second downsampled optical flow map into the first conversion sub-model to obtain a second downsampled converted frame corresponding to the preset interpolation moment;
and inputting the first downsampled converted frame and the second downsampled converted frame into the correction sub-model to obtain the downsampling correction information.
3. The method of claim 2, wherein the preset optical flow correction model further comprises: a semantic extraction sub-model;
The step of inputting the first downsampled converted frame and the second downsampled converted frame into the correction sub-model to obtain the downsampling correction information comprises:
respectively inputting the first downsampled frame and the second downsampled frame into the semantic extraction sub-model to obtain first downsampled semantic information corresponding to the first downsampled frame and second downsampled semantic information corresponding to the second downsampled frame;
and inputting the first downsampled converted frame, the second downsampled converted frame, the first downsampled semantic information, and the second downsampled semantic information into the correction sub-model to obtain the downsampling correction information.
4. The method of claim 1, wherein the preset transition integration model comprises: a second conversion sub-model and an integration sub-model;
The determining, based on a preset conversion integration model, an intermediate frame between the two adjacent frames according to the first frame, the second frame, the first upsampled optical flow map, the second upsampled optical flow map, and the upsampling correction information comprises:
inputting the first frame and the first upsampled optical flow map into the second conversion sub-model to obtain a first converted frame corresponding to a preset interpolation moment;
inputting the second frame and the second upsampled optical flow map into the second conversion sub-model to obtain a second converted frame corresponding to the preset interpolation moment;
and inputting the first converted frame, the second converted frame, and the upsampling correction information into the integration sub-model to obtain the intermediate frame between the two adjacent frames.
5. The method according to claim 1, wherein the method further comprises:
And if the target interpolation region is larger than a preset interpolation region threshold, determining the first frame or the second frame as a target interpolation frame corresponding to the two adjacent frames.
6. The method of claim 1, wherein determining the target interpolation region corresponding to the intermediate frame based on the interpolation confidence comprises:
performing binarization processing on the interpolation confidence based on a preset confidence threshold, and determining a binarized image corresponding to the intermediate frame;
and performing a morphological opening operation on the binarized image, and taking the region remaining after the opening operation as the target interpolation region corresponding to the intermediate frame.
7. The method according to claim 1, wherein the downsampling the first frame and the second frame of the two adjacent frames based on a preset sampling multiple to obtain a first downsampled frame and a second downsampled frame includes:
determining the image difference degree between a first frame and a second frame in the two adjacent frames;
and if the image difference degree is smaller than a preset difference threshold, respectively performing downsampling processing on the first frame and the second frame in the two adjacent frames based on the preset sampling multiple to obtain the first downsampled frame and the second downsampled frame.
8. The method according to claim 1, wherein the downsampling of the first frame and the second frame in the two adjacent frames based on a preset sampling multiple to obtain a first downsampled frame and a second downsampled frame further comprises:
determining a first optical flow map from the first frame to the second frame and a second optical flow map from the second frame to the first frame in the two adjacent frames;
determining a first large optical flow region corresponding to the first optical flow map and a second large optical flow region corresponding to the second optical flow map;
and if both the first large optical flow region and the second large optical flow region are smaller than a preset optical flow region threshold, respectively performing downsampling processing on the first frame and the second frame based on the preset sampling multiple to obtain the first downsampled frame and the second downsampled frame.
9. The method of claim 8, wherein the determining of the first large optical flow region corresponding to the first optical flow map comprises:
determining an optical flow value corresponding to each pixel point based on the first optical flow map;
and determining the pixel region formed by the pixel points whose optical flow values are larger than a preset optical flow threshold as the first large optical flow region corresponding to the first optical flow map.
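The large-flow gate of claims 8 and 9 can be sketched as follows, taking the per-pixel flow magnitude as the "optical flow value" and the pixel count above threshold as the region size; both thresholds are illustrative assumptions.

```python
import numpy as np

def large_flow_area(flow, flow_threshold=8.0):
    """Claim 9 sketch: per-pixel optical flow magnitude, then the number
    of pixels whose magnitude exceeds the preset optical flow threshold.
    `flow` has shape (H, W, 2) holding (dx, dy) per pixel."""
    magnitude = np.sqrt((flow ** 2).sum(axis=-1))
    return int((magnitude > flow_threshold).sum())

def downsampling_allowed(flow_fwd, flow_bwd, area_threshold=100):
    """Claim 8 sketch: take the downsampled path only when both
    large-flow regions are below the preset region threshold."""
    return (large_flow_area(flow_fwd) < area_threshold and
            large_flow_area(flow_bwd) < area_threshold)
```

The rationale is that large motion is poorly preserved under downsampling, so frame pairs with extensive large-flow regions should be processed at full resolution.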
10. The method according to any one of claims 1-9, wherein the method further comprises:
executing the operation of determining the intermediate frame between the two adjacent frames based on the preset optical flow correction model and the preset conversion integration model on a graphics processor loaded with the preset conversion plugin.
11. The method according to claim 10, wherein the method further comprises:
executing, by a first thread on a central processing unit, a preprocessing operation on a first pair of adjacent frames in a first queue, and storing the preprocessing result into a second queue;
executing, by a second thread on the graphics processor, an intermediate frame determining operation corresponding to a second pair of adjacent frames based on the preprocessing result corresponding to the second pair in the second queue, and storing the determined intermediate frame into the second queue, wherein the second pair of adjacent frames is a pair of adjacent frames located before the first pair;
and executing, by a third thread on the central processing unit, a post-processing operation corresponding to a third pair of adjacent frames based on the intermediate frame corresponding to the third pair in the second queue, and storing the determined target interpolation frame corresponding to the third pair into a third queue, wherein the third pair of adjacent frames is a pair of adjacent frames located before the second pair.
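The three-thread producer-consumer arrangement of this claim can be sketched with Python's stdlib `threading` and `queue` modules. The model calls are replaced by placeholder arithmetic (averaging a frame pair stands in for the GPU-side intermediate-frame computation), and for clarity each stage here hands off through its own queue rather than sharing the literal "second queue" of the claim.

```python
import queue
import threading

SENTINEL = None  # end-of-stream marker propagated down the pipeline

def stage(src, dst, work):
    """Generic pipeline stage: consume from src, apply work, produce to dst."""
    while True:
        item = src.get()
        if item is SENTINEL:
            dst.put(SENTINEL)
            break
        dst.put(work(item))

q1, q2a, q2b, q3 = (queue.Queue() for _ in range(4))

threads = [
    # CPU thread 1: preprocessing (identity here).
    threading.Thread(target=stage, args=(q1, q2a, lambda p: (p[0], p[1]))),
    # Thread 2 (GPU in the claim): intermediate-frame computation.
    threading.Thread(target=stage, args=(q2a, q2b, lambda p: (p[0] + p[1]) / 2)),
    # CPU thread 3: post-processing into target interpolation frames.
    threading.Thread(target=stage, args=(q2b, q3, lambda x: round(x, 3))),
]
for t in threads:
    t.start()
for pair in [(0.0, 1.0), (1.0, 2.0), (2.0, 3.0)]:
    q1.put(pair)
q1.put(SENTINEL)
for t in threads:
    t.join()

results = []
while True:
    item = q3.get()
    if item is SENTINEL:
        break
    results.append(item)
```

Because each stage runs concurrently on its own thread, preprocessing of one frame pair overlaps with inference on an earlier pair and post-processing of a still earlier pair, which is the pipelining the claim describes.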
12. A video frame interpolation apparatus, comprising:
an adjacent-frame acquisition module, configured to acquire two adjacent frames in a target video;
a downsampling processing module, configured to respectively downsample a first frame and a second frame in the two adjacent frames based on a preset sampling multiple to obtain a first downsampled frame and a second downsampled frame;
a downsampling information determining module, configured to determine, based on a preset optical flow correction model, a first downsampling optical flow map corresponding to the first downsampled frame, a second downsampling optical flow map corresponding to the second downsampled frame, and downsampling correction information;
an up-sampling processing module, configured to respectively upsample the first downsampling optical flow map, the second downsampling optical flow map and the downsampling correction information based on the preset sampling multiple to obtain a first up-sampling optical flow map, a second up-sampling optical flow map and up-sampling correction information;
a target interpolation frame determining module, configured to determine an intermediate frame between the two adjacent frames according to the first frame, the second frame, the first up-sampling optical flow map, the second up-sampling optical flow map and the up-sampling correction information based on a preset conversion integration model; determine an interpolation confidence corresponding to the intermediate frame based on the preset optical flow correction model or the preset conversion integration model; determine a target interpolation region corresponding to the intermediate frame based on the interpolation confidence; and if the target interpolation region is smaller than or equal to a preset interpolation region threshold, determine the intermediate frame as the target interpolation frame corresponding to the two adjacent frames.
13. An electronic device, comprising:
one or more processors;
a memory for storing one or more programs;
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the video frame interpolation method according to any one of claims 1-11.
14. A computer-readable storage medium having a computer program stored thereon, wherein the computer program, when executed by a processor, implements the video frame interpolation method according to any one of claims 1-11.
CN202210375212.2A 2022-04-11 2022-04-11 Video frame inserting method, device, equipment and medium Active CN114745545B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210375212.2A CN114745545B (en) 2022-04-11 2022-04-11 Video frame inserting method, device, equipment and medium


Publications (2)

Publication Number Publication Date
CN114745545A CN114745545A (en) 2022-07-12
CN114745545B true CN114745545B (en) 2024-07-09

Family

ID=82281187

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210375212.2A Active CN114745545B (en) 2022-04-11 2022-04-11 Video frame inserting method, device, equipment and medium

Country Status (1)

Country Link
CN (1) CN114745545B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115883762B (en) * 2022-12-05 2025-07-18 中国电影器材有限责任公司 Method for inserting frames at any moment, transmitting card and playing system
CN117115210B (en) * 2023-10-23 2024-01-26 黑龙江省农业科学院农业遥感与信息研究所 Intelligent agricultural monitoring and adjusting method based on Internet of things

Citations (1)

Publication number Priority date Publication date Assignee Title
CN109151474A (en) * 2018-08-23 2019-01-04 复旦大学 A method of generating new video frame

Family Cites Families (9)

Publication number Priority date Publication date Assignee Title
GB2475739A (en) * 2009-11-30 2011-06-01 Nokia Corp Video decoding with error concealment dependent upon video scene change.
JP5634755B2 (en) * 2010-06-08 2014-12-03 富士フイルム株式会社 Electronic endoscope system, processor device for electronic endoscope, and method for operating electronic endoscope system
US10531093B2 (en) * 2015-05-25 2020-01-07 Peking University Shenzhen Graduate School Method and system for video frame interpolation based on optical flow method
KR102119300B1 (en) * 2017-09-15 2020-06-04 서울과학기술대학교 산학협력단 Apparatus and method for encording 360-degree video, recording medium for performing the method
CN111641835B (en) * 2020-05-19 2023-06-02 Oppo广东移动通信有限公司 Video processing method, video processing device and electronic equipment
CN113727141B (en) * 2020-05-20 2023-05-12 富士通株式会社 Interpolation device and method for video frames
CN111372087B (en) * 2020-05-26 2020-08-28 深圳看到科技有限公司 Panoramic video frame insertion method and device and corresponding storage medium
WO2021237743A1 (en) * 2020-05-29 2021-12-02 京东方科技集团股份有限公司 Video frame interpolation method and apparatus, and computer-readable storage medium
CN113365110B (en) * 2021-07-14 2023-01-31 北京百度网讯科技有限公司 Model training method, video frame interpolation method, device, equipment and storage medium




Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant