CN101930614B

CN101930614B - Drawing rendering method based on video sub-layer

Info

Publication number: CN101930614B
Application number: CN2010102500634A
Authority: CN
Inventors: 黄华; 张磊; 付田楠
Original assignee: Xian Jiaotong University
Current assignee: Xian Jiaotong University
Priority date: 2010-08-10
Filing date: 2010-08-10
Publication date: 2012-11-28
Anticipated expiration: 2030-08-10
Also published as: CN101930614A

Abstract

The invention provides a painting rendering method based on video layering. This method uses the video layering method in the field of computer vision to decompose the input video sequence into corresponding layered representations according to parameters such as color and motion, and then perform stylized painting rendering on each layer. Different from the traditional method of directly arranging painting brushes on each frame of the video, this invention proposes a new method of arranging corresponding drawing brushes on layers, which can be used according to the parameters such as colors and motions corresponding to each layer. The entire video sequence is optimized for brush propagation layout, which can greatly reduce video flicker and generate stylized painting videos with better continuity between frames. At the same time, by arranging brushes with preset styles on different layers, the present invention can conveniently generate videos with various painting styles and create stylized rendering results with more artistic effects.

Description

Painting rendering method based on video layering

技术领域 technical field

本发明是涉及一种基于视频分层的绘画渲染方法，具体涉及一种基于勾画的视频分层以及风格化绘画渲染方法。The invention relates to a video layering-based painting rendering method, in particular to a sketch-based video layering and stylized painting rendering method.

背景技术 Background technique

随着计算机技术的发展，多媒体以及数字娱乐受到大众越来越多的青睐，而计算机风格化渲染技术也逐渐成为研究热点。视频作为一种常见的多媒体形式，具有信息量大、表现力强等优点，因此视频的风格化渲染受到广泛的重视。单幅图像的风格化绘画渲染已经有了成熟的方法，但如果简单的采用单幅图像的渲染方法去绘制视频的每一帧，会造成严重的视觉闪烁。因此，如何减少视频闪烁，提高帧间连续性是视频风格化绘制的关键。基于视频分层的风格化绘制方法有效的解决了视频闪烁的问题，它通过在视频不同的分层上布置笔刷进行绘制，从而取得一致的渲染效果，有效的增加了帧间连续性。With the development of computer technology, multimedia and digital entertainment are more and more favored by the public, and computer stylized rendering technology has gradually become a research hotspot. As a common form of multimedia, video has the advantages of large amount of information and strong expressiveness, so the stylized rendering of video has been widely valued. There are already mature methods for stylized painting rendering of a single image, but if a single image rendering method is simply used to draw each frame of the video, it will cause serious visual flicker. Therefore, how to reduce video flicker and improve frame-to-frame continuity is the key to video stylized rendering. The stylized drawing method based on video layering effectively solves the problem of video flickering. It draws by arranging brushes on different layers of the video, so as to achieve a consistent rendering effect and effectively increase the continuity between frames.

传统的计算机视觉中的分层技术对于视频场景中的运动模型有严格的约束限制，通常处理简单的如仿射、投影等运动，而对于日常生活中的复杂的运动缺乏有效的处理，而基于勾画的分层方法不受具体运动的影响，能够处理更多类型的视频。The layering technology in traditional computer vision has strict constraints on the motion model in the video scene. It usually deals with simple motions such as affine and projection, but lacks effective processing for complex motions in daily life. Based on The delineated layered approach is independent of specific motion and can handle more types of videos.

传统的视频风格化绘制通常直接在每一帧上布置笔刷，缺少笔刷的连续过度，从而产生严重的视觉闪烁。基于视频分层的绘制方法，则是将笔刷在分层上根据其相应的运动进行布置，使得笔刷参数能够适应于视频场景内物体的运动变换，生成一致的绘制效果，减少视频闪烁。Traditional video stylized drawing usually directly arranges brushes on each frame, lacking continuous transition of brushes, resulting in severe visual flickering. The drawing method based on video layering is to arrange the brushes on the layer according to their corresponding motions, so that the brush parameters can adapt to the motion transformation of objects in the video scene, generate consistent drawing effects, and reduce video flickering.

传统的计算机视觉领域的视频分层对于运动模型有严格的约束，因此不太适合一般的风格化绘制处理的视频。为了处理更多的视频，需要对运动约束进行必要的放宽，而这通常需要添加额外的一些用户交互指导分层。The video layering in the traditional computer vision field has strict constraints on the motion model, so it is not suitable for general stylized rendering processing of video. In order to handle more videos, the necessary relaxation of the motion constraints is required, which usually requires adding some additional layers of user interaction guidance.

传统的视频风格化绘制往往直接在每一帧上布置笔刷，使得相邻帧之间缺乏必要的过渡。这样在不同帧绘制视频场景中的同一物体时可能使用不同的笔刷参数，造成视觉上的强烈变化，引起视频闪烁。另外，传统的视频绘制在同一帧上使用统一的笔刷模型参数，缺乏针对内容的笔刷变化。为了艺术性的绘制视频场景，应该根据视频场景包含的具体内容进行笔刷布置，进行多风格的绘制渲染。Traditional video stylized drawing often directly arranges brushes on each frame, so that there is no necessary transition between adjacent frames. In this way, different brush parameters may be used when drawing the same object in the video scene in different frames, resulting in strong visual changes and causing video flickering. In addition, traditional video drawing uses uniform brush model parameters on the same frame, and lacks content-specific brush changes. In order to draw the video scene artistically, the brushes should be arranged according to the specific content contained in the video scene, and multi-style rendering should be performed.

发明内容 Contents of the invention

本发明的目的在于提供一种适合于风格化绘制的基于视频分层的绘画渲染方法，并且在获取准确的分层后，能够在不同的分层上传播笔刷参数、布置相应的笔刷进行绘制，生成帧间连续性更优的风格化绘画渲染视频。The purpose of the present invention is to provide a painting rendering method based on video layering that is suitable for stylized drawing, and after obtaining accurate layers, it can propagate brush parameters on different layers and arrange corresponding brushes for further rendering. Draw to generate a stylized painterly rendered video with better continuity between frames.

为达到上述目的，本发明采用的技术方案是：In order to achieve the above object, the technical scheme adopted in the present invention is:

1）根据输入的视频选取关键帧，用户交互的在关键帧的不同区域进行勾画，并指定视频场景的层数以及各个分层上的种子区域；1) Select key frames according to the input video, and the user interactively sketches different areas of the key frames, and specifies the number of layers of the video scene and the seed area on each layer;

2）利用基于勾画的分层方法，借助光流将关键帧上的种子区域依次传播到其余各帧，采用高斯混合模型对传播的种子区域进行可靠性分析，保留可靠性高的区域作为该帧用来视频分层的种子区域；2) Using the layered method based on sketching, the seed area on the key frame is propagated to the remaining frames in turn with the help of optical flow, and the Gaussian mixture model is used to analyze the reliability of the propagated seed area, and the area with high reliability is reserved as the frame Seed area for video layering;

3）根据各帧上获得的种子区域，利用图割优化方法对每一帧进行分层，进而在各帧之间得到一致的分层区域及位于前景层的笔刷布置；3) According to the seed area obtained on each frame, use the graph cut optimization method to layer each frame, and then obtain a consistent layered area and brush layout in the foreground layer between each frame;

4）在获得关键帧上位于前景层的笔刷布置后，根据相邻帧对应层的薄板样条变换进行传递，生成所有视频序列的前景的渲染结果；4) After obtaining the brush layout on the foreground layer on the key frame, transfer it according to the thin-plate spline transformation of the corresponding layer of the adjacent frame to generate the rendering results of the foreground of all video sequences;

5）将背景层通过变换拼接为全景图，在全景图上进行笔刷布置并绘制全景图，然后利用逆变换反求每一帧背景层的渲染结果；5) Stitch the background layer into a panorama through transformation, arrange the brushes on the panorama and draw the panorama, and then use the inverse transformation to inversely calculate the rendering result of the background layer of each frame;

6）在每一帧上依次将绘制的背景层和前景层进行融合，得到整个视频的风格化绘画渲染结果。6) Merge the drawn background layer and foreground layer sequentially on each frame to obtain the stylized painting rendering result of the entire video.

其具体步骤如下：The specific steps are as follows:

步骤1：根据给定输入的视频，选取视频序列中包含视频场景中的颜色以及出现的物体最多的做为关键帧，如果视频序列太长，则将视频序列分解为几个片段，每一个片段选取各自的关键帧；Step 1: According to the given input video, select the video sequence that contains the most colors and objects in the video scene as the key frame. If the video sequence is too long, decompose the video sequence into several segments, each segment Select the respective keyframes;

步骤2：在关键帧上采用勾画方式指定分层的种子区域：根据关键帧包含物体的颜色、运动信息，在关键帧相对应的区域进行勾画，通过勾画的不同的灰度值指定分层索引，从而获得关键帧上分层的种子区域

Step 2: Specify the layered seed area by drawing on the key frame: According to the color and motion information of the object contained in the key frame, draw in the area corresponding to the key frame, and specify the layer index by different gray values drawn , so as to obtain the layered seed region on the keyframe

对于灰度值为c的勾画，其覆盖区域作为c/40分层的种子区域，对于每一个勾画，利用高斯混合模型分别计算各个区域的颜色分布：For the outline with a gray value of c, its coverage area is used as the seed area of the c/40 layer. For each outline, the Gaussian mixture model is used to calculate the color distribution of each area:

$Pr PR ((c c | | I I)) = = {Σ Σ}_{j j = = 11}^{M m} p p ((c c | | j j)) P P ((j j))$

其中P(j)是对应于每个分支的权因子,取为1/3,p(c|j)是高斯混合模型中每个分支的概率Where P(j) is the weight factor corresponding to each branch, which is taken as 1/3, and p(c|j) is the probability of each branch in the Gaussian mixture model

$p p ((c c | | j j)) = = \frac{11}{22 π π {| | {Σ Σ}_{j j} | |}^{\frac{11}{22}}} {exp exp}^{- - \frac{11}{22} {((ξ ξ - - {μ μ}_{j j}))}^{T T} {Σ Σ}_{j j}^{- - 11} ((ξ ξ - - {μ μ}_{j j}))}$

ξ是RGB三个通道的颜色值，μ是每个勾画的颜色均值，∑是协方差矩阵,π是圆周率常值；ξ is the color value of the three channels of RGB, μ is the mean value of each drawn color, ∑ is the covariance matrix, and π is the constant value of pi;

步骤3：采用对偶基方法计算相邻两帧之间的光流场：对于每一帧上的每个像素p_z，对应一个光流向量v_z；对于带有噪音的视频，在得到光流场后，采用高斯滤波对光流场进行光顺处理，获得更加稳定的光流场；Step 3: Use the dual basis method to calculate the optical flow field between two adjacent frames: for each pixel p _z on each frame, there is an optical flow vector v _z corresponding to it; for a video with noise, after obtaining the optical flow After the field, Gaussian filtering is used to smooth the optical flow field to obtain a more stable optical flow field;

步骤4：在视频序列之间传播勾画：对于每一个勾画，计算一组窗口{W_m}将勾画的边界覆盖，同时使得相邻的窗口彼此相互覆盖，对每个窗口W_m计算其内部的平均光流向量

做为该点处勾画位移向量，那么在下一帧，勾画将传播至

处；Step 4: Propagate sketches between video sequences: For each sketch, compute a set of windows {W _m } to cover the borders of the sketch, while making adjacent windows cover each other, and for each window W _m compute its interior mean optical flow vector

As the displacement vector of the sketch at this point, then in the next frame, the sketch will be propagated to

place;

步骤5：计算传播过程中种子区域的可信度：对于每个勾画

中每个像素i，其可信度定义为RGB颜色相对于前一帧对应勾画的分布概率，即Pr(i)，如果Pr(i)小于0.2，认为该像素可信度比较低，不再适合做该层的种子，利用可信度对勾画进行修正，通过优化下面的能量函数进行操作：Step 5: Calculate the confidence of the seed region during propagation: for each sketched

For each pixel i in , its credibility is defined as the distribution probability of the RGB color relative to the corresponding drawing of the previous frame, that is, Pr(i). If Pr(i) is less than 0.2, the pixel’s credibility is considered to be relatively low, and no longer It is suitable to be the seed of this layer, use the reliability to correct the outline, and operate by optimizing the following energy function:

[式1][Formula 1]

$E E. ((l l)) = = \underset{i i}{Σ Σ} {R R}_{i i} (({l l}_{i i})) + + λ λ \underset{< > &Element; &Element; N N}{Σ Σ} {V V}_{< >} (({l l}_{p p},, {l l}_{q q}))$

其中λ是权因子，控制可信区域的大小，取值为0.3；R_i(l_i)是高斯混合模型定义的颜色概率，具体定义为Among them, λ is a weight factor, which controls the size of the credible region, and the value is 0.3; R _i (l _i ) is the color probability defined by the Gaussian mixture model, specifically defined as

R_i(l_i)=-ln(Pr(C_i|l_i))R _i (l _i )=-ln(Pr(C _i |l _i ))

$Pr PR (({C C}_{i i} | | {l l}_{i i})) = = \frac{11}{22 π π {| | {Σ Σ}_{j j} | |}^{\frac{11}{22}}} {exp exp}^{- - \frac{11}{22} {((ξ ξ - - {μ μ}_{j j}))}^{T T} {Σ Σ}_{j j}^{- - 11} ((ξ ξ - - {μ μ}_{j j}))}$

V_<p，q>是定义相邻像素的光滑性，使得勾画仍然能够保持相对的紧凑完整性：V _{<p, q>} is to define the smoothness of adjacent pixels, so that the outline can still maintain a relatively compact integrity:

${V V}_{< >} (({l l}_{p p},, {l l}_{q q})) = = \frac{11}{11 + + | | | | {C C}_{p p} - - {C C}_{q q} | | | |} | | {l l}_{p p} - - {l l}_{q q} | |$

其中C是像素的颜色,l_p是像素所在的层数,[式1]采用图割算法进行有效优化求解，将可信度低于30%的区域的勾画一分为二，保留可信度高的部分区域做为该分层上的种子区域，进而得到满足分层要求的种子区域；Among them, C is the color of the pixel, l _p is the number of layers where the pixel is located. [Formula 1] uses the graph cut algorithm to effectively optimize the solution, divide the outline of the area with a reliability lower than 30% into two, and retain the reliability The high part of the area is used as the seed area on the layer, and then the seed area that meets the layering requirements is obtained;

步骤6：对每一帧上根据勾画的种子区域进行分层：分层的结果是将每一帧分割为互不相交的平面区域，每一块平面区域具有和种子区域相近的颜色、运动，这里颜色的相近性采用[式1]中高斯混合模型定义的颜色概率描述，运动的相似性则采用位移后对应像素的颜色差异来描述，具体的运动差异定义为：Step 6: Stratify each frame according to the sketched seed area: the result of stratification is to divide each frame into disjoint planar areas, each planar area has a color and movement similar to the seed area, here The similarity of color is described by the color probability defined by the Gaussian mixture model in [Formula 1], and the similarity of motion is described by the color difference of the corresponding pixel after displacement. The specific motion difference is defined as:

${M m}_{i i} (({l l}_{i i})) = = arctan arctan (({| | | | {I I}_{t t} ((i i)) - - {I I}_{t t + + 11} (({i i}^{' '})) | | | |}^{22} - - τ τ)) + + \frac{π π}{22}$

其中τ是常值取值60，为了增加相邻两帧分层的一致性，定义如下的时间一致性能量：Among them, τ is a constant value of 60. In order to increase the consistency of the layers of two adjacent frames, the time consistency energy is defined as follows:

综合考虑以上因素，在每一帧上计算分层通过优化如下能量函数得到：Considering the above factors comprehensively, the calculation layer on each frame is obtained by optimizing the following energy function:

[式2][Formula 2]

$E E. ((l l)) = = \underset{i i}{Σ Σ} (({R R}_{i i} (({l l}_{i i})) + + {M m}_{i i} (({l l}_{i i})) + + {T T}_{i i} (({l l}_{i i})))) + + λ λ \underset{< > &Element; &Element; N N}{Σ Σ} {V V}_{< >} (({l l}_{p p},, {l l}_{q q}))$

其中权因子λ取值0.3，该函数衡量了将每个像素赋予某一层时的能量，通过极小化该能量函数可以得到分层的视频，将视频的每一帧依次表示为不同分层区域的组合；Among them, the weight factor λ takes a value of 0.3. This function measures the energy of assigning each pixel to a certain layer. By minimizing the energy function, a layered video can be obtained, and each frame of the video is represented as a different layer in turn. combination of regions;

步骤7：对于视频的前景层，首先在关键帧的每一个前景层上采用各向异性的模型布置笔刷，进而生成风格化绘制的前景层；Step 7: For the foreground layer of the video, first use an anisotropic model to arrange brushes on each foreground layer of the key frame, and then generate a stylized foreground layer;

步骤8：为了使得绘制视频各帧的每个分层时尽可能一致的布置笔刷，将关键帧上的笔刷逐帧的传播到其余各帧，为了尽可能光滑的传递笔刷，采用薄板样条函数定义的如下变换来进行笔刷的传递：Step 8: In order to make the arrangement of brushes as consistent as possible when drawing each layer of each frame of the video, the brush on the key frame is propagated to the remaining frames frame by frame. In order to transfer the brush as smoothly as possible, a thin plate is used The following transformation defined by the spline function is used to transfer the brush:

[式3][Formula 3]

T(x,y)=(f₁(x,y),f₂(x,y))T(x,y)=(f ₁ (x,y),f ₂ (x,y))

$f f ((x x,, y the y)) = = {c c}_{00} + + {c c}_{11} x x + + {c c}_{22} y the y + + {Σ Σ}_{i i = = 11}^{n no} {w w}_{i i} Ψ Ψ ((| | | | ((x x,, y the y)) - - (({x x}_{a a},, {y the y}_{a a})) | | | |))$

其中Ψ(r)=r²logr²是核函数，薄板样条系数可以通过求解下面的线性方程组得到：Where Ψ(r)=r ² logr ² is the kernel function, and the thin plate spline coefficients can be obtained by solving the following linear equations:

$(\begin{matrix} K K & P P \\ {P P}^{T T} & 00 \end{matrix}) (\begin{matrix} w w \\ c c \end{matrix}) = = (\begin{matrix} {p p}^{k k + + 11} \\ 00 \end{matrix})$

其中K的第a列第b行K_ab=Ψ(‖(x_a,y_a)-(x_b,y_b)‖)，P^T的第a列为(1,x_a，y_a)^T，p^k+1是该帧上对应特征点，通过前景层的笔刷传播，可以生成视频序列所有前景的风格化绘制；Among them, column a and row b of K K _ab =Ψ(‖(x _a , y _a )-(x _b ,y _b )‖), and column a of P ^T is (1, x _a , y _a ) ^T , p ^k+1 is the corresponding feature point on the frame, through the brush propagation of the foreground layer, the stylized drawing of all the foreground of the video sequence can be generated;

步骤9：将视频序列所有的背景层在同一个坐标系下拼接成一个全景图：为了得到精确的全景图重构，需要计算视频序列各帧之间位于每个分层上的对应特征点，假设位于第k帧l层上的特征点为

与之对应的第k+1帧上的特征点为

那么寻找最优的变换H_k使得变换后的特征点误差最小，从而得到精确的全景图重构，变换H_k通过下面优化函数求解：Step 9: Stitch all the background layers of the video sequence into a panorama in the same coordinate system: in order to obtain accurate panorama reconstruction, it is necessary to calculate the corresponding feature points on each layer between the frames of the video sequence, Assume that the feature points on the l layer of the kth frame are

The corresponding feature points on the k+1th frame are

Then find the optimal transformation H _k to minimize the error of the transformed feature points, so as to obtain an accurate panorama reconstruction. The transformation H _k is solved by the following optimization function:

${H h}_{k k} = = \underset{T T}{arg arg min min} \underset{l l}{Σ Σ} {| | | | {p p}_{l l}^{k k} - - T T \cdot &Center Dot; {p p}_{l l}^{k k + + 11} | | | |}^{22}$

在对于每一帧的背景层求解得变换后，可以将所有的背景层在同一个坐标系下拼接生成全景图；After solving the transformation for the background layer of each frame, all the background layers can be stitched together in the same coordinate system to generate a panorama;

步骤10：对于拼接得到的背景层的全景图，采用各向异性的模型布置笔刷，进而生成风格化绘制的背景层，然后，利用拼接变换的逆变换

将风格化绘制后的全景图中相应的部分映射回每一帧，从而得到每一帧上具有绘画风格的背景；Step 10: For the stitched panorama of the background layer, use an anisotropic model to arrange brushes to generate a stylized background layer, and then use the inverse transformation of the stitching transformation

Map the corresponding part of the stylized panorama back to each frame, so as to obtain a painting-style background on each frame;

步骤11：对于视频序列中的每一帧，按照从后往前的顺序将绘制后的各个分层融合起来，生成最终的风格化绘制的视频序列，即这里

对应第e帧第r分层，

是融合系数，根据各个分层相应的面积比例计算。Step 11: For each frame in the video sequence, merge the drawn layers in order from back to front to generate the final stylized drawn video sequence, namely here

Corresponding to the r-th layer of the e-th frame,

is the fusion coefficient, which is calculated according to the corresponding area ratio of each stratum.

本发明针对现有技术中所存在的问题首先从视频的运动模型出发提出一种基于勾画的视频分层方法；然后提出了针对不同分层的笔刷布置方法；最后在绘制时，本发明提出一种不同分层之间的融合方法，从而能够生成更加连续的绘画风格化视频序列。该方法首先根据频场景中包含的运动模型，将关键帧采用勾画指定的一些种子区域在各帧之间进行传播。然后根据前一帧的种子区域颜色分布计算下一帧上勾画种子区域的可靠性，获得更加准确的种子区域。利用图割优化算法（Boykov Y,Veksler O,Zabih R(2001)Fast approximateenergy minimization via graph cuts.IEEE Transactions on Pattern Analysisand Machine Intelligence 2001,pp 509–522.）对每一帧进行分层，获得视频序列的分层表示，将视频表示成一个背景层的全景图以及若干前景层。最后分别在背景层的全景图以及前景层上布置笔刷进行绘制渲染，通过图像融合将渲染后的各个分层混合生成最终的具有绘画风格的视频。Aiming at the problems existing in the prior art, the present invention first proposes a video layering method based on sketching from the motion model of the video; then proposes a brush layout method for different layers; A fusion method between different layers, enabling the generation of more continuous painterly stylized video sequences. In this method, according to the motion model contained in the video scene, key frames are propagated between frames using some seed regions specified by sketching. Then, according to the color distribution of the seed area in the previous frame, the reliability of the outline of the seed area in the next frame is calculated to obtain a more accurate seed area. Use the graph cut optimization algorithm (Boykov Y, Veksler O, Zabih R (2001) Fast approximate energy minimization via graph cuts. IEEE Transactions on Pattern Analysis and Machine Intelligence 2001, pp 509–522.) to layer each frame to obtain a video sequence A layered representation of , which represents the video as a panorama with a background layer and several foreground layers. Finally, brushes are arranged on the panorama of the background layer and the foreground layer for drawing and rendering, and the rendered layers are mixed through image fusion to generate the final painting-style video.

附图说明 Description of drawings

图1是本发明基于视频分层的绘画渲染算法的流程图；Fig. 1 is the flowchart of the painting rendering algorithm based on video layering in the present invention;

图2是显示勾画在每一帧的传播方式；Figure 2 shows how the outline is propagated in each frame;

图3是每一帧上勾画可信度的计算以及采用可信度高的区域做为种子区域的分层结果；Figure 3 is the calculation of the reliability of the outline on each frame and the hierarchical results of using the region with high reliability as the seed region;

图4显示基于勾画的视频分层结果。Figure 4 shows the delineation-based video layering results.

图5显示采用本发明创作出的风格化绘制视频的若干帧。Figure 5 shows several frames of a stylized rendering video created using the present invention.

具体实施方式 Detailed ways

下面将根据附图对本发明进行详细说明。The present invention will be described in detail below with reference to the accompanying drawings.

图1是本发明的流程图。如图一所示，本发明主要分为12个步骤：Fig. 1 is a flow chart of the present invention. As shown in Figure 1, the present invention is mainly divided into 12 steps:

步骤1：给定输入的视频，选取视频序列中某一帧做为关键帧。该帧需要尽可能多的包含视频场景中的颜色以及出现的物体。如果视频序列太长，需要将视频序列分解为几个片段，每一个片段选取各自的关键帧。Step 1: Given the input video, select a frame in the video sequence as a key frame. The frame needs to contain as many colors and objects as possible in the video scene. If the video sequence is too long, the video sequence needs to be decomposed into several segments, and each segment selects its own key frame.

步骤2：在关键帧上采用勾画方式指定分层的种子区域。根据关键帧包含物体的颜色、运动等信息，在相应的区域进行勾画，通过勾画的不同的灰度值指定分层索引，从而获得关键帧上分层的种子区域 Step 2: Designate the layered seed area by sketching on the key frame. According to the color, motion and other information of the object contained in the key frame, the corresponding area is drawn, and the layered index is specified by different gray values drawn, so as to obtain the layered seed area on the key frame

ξ是RGB三个通道的颜色值，μ是每个勾画的颜色均值，∑是协方差矩阵，π是圆周率常值；ξ is the color value of the three channels of RGB, μ is the mean value of each drawn color, ∑ is the covariance matrix, and π is the constant value of pi;

步骤3：采用对偶基方法（Zach C,Pock T,Bischof H(2007)A dual basedapproach for realtime TV-L1 optical flow.Proceedings of the 29^th DAGMSymposium on Pattern Recognition 2007.）计算相邻两帧之间的光流场。对于每一帧上的每个像素p_z，对应一个光流向量v_z.对于带有噪音的视频，在得到光流场后，采用高斯滤波对光流场进行光顺处理，获得更加稳定的光流场。Step 3: Use the dual basis method (Zach C, Pock T, Bischof H (2007) A dual based approach for realtime TV-L1 optical flow. Proceedings of the 29 ^th DAGMSymposium on Pattern Recognition 2007.) to calculate the distance between two adjacent frames Optical flow field. For each pixel p _z on each frame, it corresponds to an optical flow vector v _z . For a video with noise, after obtaining the optical flow field, Gaussian filtering is used to smooth the optical flow field to obtain a more stable Optical flow field.

步骤4：在视频序列之间传播勾画。对于每一个勾画，计算一组窗口{W_n}将勾画的边界覆盖，同时使得相邻的窗口彼此相互覆盖（如图2所示）。对每个窗口W_m计算其内部的平均光流向量

做为该点处勾画位移向量，那么在下一帧，勾画将传播至处。Step 4: Propagate sketches between video sequences. For each sketch, calculate a set of windows {W _n } to cover the border of the sketch, while making adjacent windows cover each other (as shown in Figure 2). Calculate the average optical flow vector inside each window W _m

As the displacement vector of the sketch at this point, then in the next frame, the sketch will be propagated to place.

步骤5：计算传播过程中种子区域的可信度。由于光流计算有时候不稳定，简单的将传播过来的勾画做为种子区域会产生错误的分层结果，如图3所示。因此，在对勾画传播后，应该对其在该帧上的可信度进行评估，保留可信度高的区域做为种子区域分层。Step 5: Calculate the credibility of the seed region during propagation. Because the optical flow calculation is sometimes unstable, simply using the propagated outline as the seed area will produce wrong layering results, as shown in Figure 3. Therefore, after the outline is propagated, its reliability on the frame should be evaluated, and the region with high reliability should be reserved as the seed region layer.

对于每个勾画

中每个像素i，其可信度定义为RGB颜色相对于前一阵对应勾画的分布概率，即Pr(i)。如果Pr(i)小于0.2，认为该像素可信度比较低，不再适合做该层的种子。利用可信度对勾画进行修正，可以通过优化下面的能量函数进行操作：for each sketch

For each pixel i in , its reliability is defined as the distribution probability of the RGB color relative to the corresponding drawing of the previous one, that is, Pr(i). If Pr(i) is less than 0.2, it is considered that the reliability of the pixel is relatively low, and it is no longer suitable to be the seed of this layer. Correcting the delineation with confidence can be done by optimizing the following energy function:

[式1][Formula 1]

R_i(l_i)=-ln(Pr(C_i|l_i))R _i (l _i )=-ln(Pr(C _i |l _i ))

其中C是像素的颜色,l_p是像素所在的层数,[式1]可以采用图割算法（]Boykov Y,Veksler O,Zabih R(2001)Fast approximate energy minimization via graphcuts.IEEE Transactions on Pattern Analysis and Machine Intelligence 2001,pp 509–522.）进行有效优化求解，将可信度低于30%的区域的勾画一分为二，保留可信度高(大于70%)的部分区域做为该分层上的种子区域，进而得到满足分层要求的种子区域（如图3所示）。Where C is the color of the pixel, l _p is the number of layers where the pixel is located, [Formula 1] can use the graph cut algorithm (] Boykov Y, Veksler O, Zabih R (2001) Fast approximate energy minimization via graphcuts.IEEE Transactions on Pattern Analysis and Machine Intelligence 2001, pp 509–522.) to effectively optimize the solution, divide the outline of the area with a reliability lower than 30% into two parts, and retain the part of the area with a high reliability (greater than 70%) as the divided area. The seed area on the layer, and then obtain the seed area that meets the layering requirements (as shown in Figure 3).

步骤6：对每一帧上根据勾画的种子区域进行分层。分层的结果是将每一帧分割为互不相交的平面区域，每一块平面区域具有和种子区域相近的颜色、运动等。这里颜色的相近性采用[式1]中高斯混合模型定义的颜色概率描述，运动的相似性则采用位移后对应像素的颜色差异来描述，具体的运动差异可以定义为：Step 6: Layer the sketched seed regions on each frame. The result of layering is to divide each frame into disjoint planar regions, and each planar region has a color, motion, etc. similar to the seed region. Here, the similarity of color is described by the color probability defined by the Gaussian mixture model in [Formula 1], and the similarity of motion is described by the color difference of the corresponding pixel after displacement. The specific motion difference can be defined as:

其中τ是常值，在本方法中取值60。为了增加相邻两帧分层的一致性，定义如下的时间一致性能量：Among them, τ is a constant value, which is 60 in this method. In order to increase the consistency of the layering of two adjacent frames, the temporal consistency energy is defined as follows:

综合考虑以上因素，在每一帧上计算分层可以通过优化如下能量函数得到：Considering the above factors, calculating the layering on each frame can be obtained by optimizing the following energy function:

[式2][Formula 2]

其中权因子λ取值0.3。该函数衡量了将每个像素赋予某一层时的能量，通过极小化该能量函数可以得到分层的视频，将视频的每一帧依次表示为不同分层区域的组合。通常这些分层包含一个背景层和若干前景物体层：背景层上包含的运动主要是摄像机的平移、选择等产生的，其运动模型相对简单；前景物体层的运动则比较复杂。图4显示了对于某一视频序列分层的结果，其中不同灰度值区域代表不同的分层。Among them, the weight factor λ takes a value of 0.3. This function measures the energy of assigning each pixel to a certain layer. By minimizing the energy function, a layered video can be obtained, and each frame of the video is represented as a combination of different layered regions in turn. Usually these layers include a background layer and several foreground object layers: the motion contained in the background layer is mainly generated by camera translation, selection, etc., and its motion model is relatively simple; the motion of the foreground object layer is more complicated. Figure 4 shows the results of stratification for a certain video sequence, where different gray value areas represent different stratifications.

步骤7：对于视频的前景层，首先在关键帧的每一个前景层上采用各向异性的模型（Huang H,Fu T N,Li C F(2010)Anisotropic brush for painterlyrendering.In:Computer Graphics International 2010.）布置笔刷，进而生成风格化绘制的前景层。Step 7: For the foreground layer of the video, first use an anisotropic model on each foreground layer of the key frame (Huang H, Fu T N, Li C F (2010) Anisotropic brush for painterly rendering. In: Computer Graphics International 2010 .) Arrange the brushes to generate the foreground layer for stylized drawing.

步骤8：为了使得绘制视频各帧的每个分层时尽可能一致的布置笔刷，将关键帧上的笔刷逐帧的传播到其余各帧。为了尽可能光滑的传递笔刷，采用薄板样条函数定义的如下变换来进行笔刷的传递：Step 8: In order to make the arrangement of brushes as consistent as possible when drawing each layer of each frame of the video, the brush on the key frame is propagated to the remaining frames frame by frame. In order to transfer the brush as smoothly as possible, the following transformation defined by the thin plate spline function is used to transfer the brush:

[式3][Formula 3]

T(x,y)=(f₁(x,y),f₂(x,y))T(x,y)=(f ₁ (x,y),f ₂ (x,y))

其中K的第a列第b行K_ab=Ψ(‖(x_a,y_a)-(x_b,y_b)‖)，P^T的第a列为(1,x_a，y_a)^T，,x，y是像素的几何坐标,p^k+1是该帧上对应特征点。通过前景层的笔刷传播，可以生成视频序列所有前景的风格化绘制。Among them, column a and row b of K K _ab =Ψ(‖(x _a , y _a )-(x _b ,y _b )‖), and column a of P ^T is (1, x _a , y _a ) ^T ,, x, y are the geometric coordinates of the pixel, and p ^k+1 is the corresponding feature point on the frame. Through the brush propagation of the foreground layer, a stylized rendering of all the foreground of the video sequence can be generated.

步骤9：将视频序列所有的背景层在同一个坐标系下拼接成一个全景图。为了得到精确的全景图重构，需要计算视频序列各帧之间位于每个分层上的对应特征点。假设位于第k帧l层上的特征点为

与之对应的第k+1帧上的特征点为那么寻找最优的变换H_k使得变换后的特征点误差最小，从而得到精确的全景图重构。变换H_k通过下面优化函数求解：Step 9: Stitch all the background layers of the video sequence into a panorama in the same coordinate system. In order to obtain accurate panorama reconstruction, it is necessary to calculate the corresponding feature points on each layer between frames of the video sequence. Assume that the feature points on the l layer of the kth frame are

The corresponding feature points on the k+1th frame are Then find the optimal transformation _Hk to minimize the error of the transformed feature points, so as to obtain accurate reconstruction of the panorama. The transformation H _k is solved by the following optimization function:

在对于每一帧的背景层求解得变换后，可以将所有的背景层在同一个坐标系下拼接生成全景图。After solving the transformation for the background layer of each frame, all the background layers can be stitched together in the same coordinate system to generate a panorama.

步骤10：对于拼接得到的背景层的全景图，采用各向异性的模型（Huang H,Fu T N,Li C F(2010)Anisotropic brush for painterly rendering.In:Computer Graphics International 2010.）布置笔刷，进而生成风格化绘制的背景层。然后，利用拼接变换的逆变换

将风格化绘制后的全景图中相应的部分映射回每一帧，从而得到每一帧上具有绘画风格的背景。Step 10: For the stitched panorama of the background layer, use the anisotropic model (Huang H, Fu T N, Li C F (2010) Anisotropic brush for painterly rendering. In: Computer Graphics International 2010.) to arrange the brushes, and then Generates a background layer for stylized drawing. Then, using the inverse of the stitching transform

The corresponding part in the stylized drawn panorama is mapped back to each frame, so as to obtain a background with a painting style on each frame.

步骤11：对于视频序列中的每一帧，按照从后往前的顺序将绘制后的各个分层融合起来，生成最终的风格化绘制的视频序列，即

这里

对应第e帧第r分层，是融合系数，根据各个分层相应的面积比例计算。Step 11: For each frame in the video sequence, merge the drawn layers in order from back to front to generate the final stylized drawn video sequence, namely

here

Corresponding to the r-th layer of the e-th frame, is the fusion coefficient, which is calculated according to the corresponding area ratio of each stratum.

图5是风格化绘制的视频中的几帧，显示了采用视频分层的风格化绘制效果。可以看出，本发明可以生成具有特定艺术效果的绘画风格的视频。Figure 5 is a few frames from the stylized video, showing the effect of stylized rendering with video layering. It can be seen that the present invention can generate a painting-style video with a specific artistic effect.

如上所述，本发明提出了一种基于视频分层的绘画渲染方法，它借助于计算机视觉中的视频分层，获得视频序列的层次表示，然后通过在各个分层上布置笔刷绘制，有效减少视觉闪烁，提高帧间连续性，生成更具艺术效果的风格化绘制视频。As mentioned above, the present invention proposes a painting rendering method based on video layering, which obtains a layered representation of a video sequence by means of video layering in computer vision, and then draws effectively by arranging brushes on each layer. Reduce visual flicker, improve frame-to-frame continuity, and produce more artistically rendered stylized videos.

尽管已经参考附图对本发明进行了解释和描述，专业技术人员应该理解，在不脱离本发明精神和范围的情况下，可以在其中或对其进行各种其他改变，While the present invention has been illustrated and described with reference to the accompanying drawings, it will be understood by those skilled in the art that various other changes may be made therein or thereto without departing from the spirit and scope of the invention,

Claims

1. A painting rendering method based on video layering, characterized in that it comprises the following steps:

1) Select key frames according to the input video, and the user interactively sketches different areas of the key frames, and specifies the number of layers of the video scene and the seed area on each layer;

2) Using the layered method based on sketching, the seed area on the key frame is propagated to the remaining frames in turn with the help of optical flow, and the Gaussian mixture model is used to analyze the reliability of the propagated seed area, and the area with high reliability is reserved as the frame Seed area for video layering;

3) According to the seed area obtained on each frame, use the graph cut optimization method to layer each frame, and then obtain a consistent layered area and brush layout in the foreground layer between each frame;

4) After obtaining the brush layout on the foreground layer on the key frame, transfer it according to the thin-plate spline transformation of the corresponding layer of the adjacent frame to generate the rendering results of the foreground of all video sequences;

5) Stitch the background layer into a panorama through transformation, arrange the brushes on the panorama and draw the panorama, and then use the inverse transformation to inversely calculate the rendering result of the background layer of each frame;

6) Merge the drawn background layer and foreground layer sequentially on each frame to obtain the stylized painting rendering result of the entire video.

2. the painting rendering method based on video layering as claimed in claim 1, its concrete steps are as follows:

Step 1: According to the given input video, select the video sequence that contains the most colors and objects in the video scene as the key frame. If the video sequence is too long, decompose the video sequence into several segments, each segment Select the respective keyframes;

For the outline with a gray value of c, its coverage area is used as the seed area of the c/40 layer. For each outline, the Gaussian mixture model is used to calculate the color distribution of each area:

Pr PR ((c c | | I I)) = = {Σ Σ}_{j j = = 11}^{M m} p p ((c c | | j j)) P P ((j j))

Where P(j) is the weight factor corresponding to each branch, which is taken as 1/3, and p(c|j) is the probability of each branch in the Gaussian mixture model

p p ((c c | | j j)) = = \frac{11}{22 π π {| | {Σ Σ}_{j j} | |}^{\frac{11}{22}}} {exp exp}^{- - \frac{11}{22} {((ξ ξ - - {μ μ}_{j j}))}^{T T} {Σ Σ}_{j j}^{- - 11} ((ξ ξ - - {μ μ}_{j j}))}

ξ is the color value of the three channels of RGB, μ is the mean value of each drawn color, ∑ is the covariance matrix, and π is the constant value of pi;

Step 3: Use the dual basis method to calculate the optical flow field between two adjacent frames: for each pixel p _z on each frame, there is an optical flow vector v _z corresponding to it; for a video with noise, after obtaining the optical flow After the field, Gaussian filtering is used to smooth the optical flow field to obtain a more stable optical flow field;

Step 4: Propagate sketches between video sequences: For each sketch, compute a set of windows {W _m } to cover the borders of the sketch, while making adjacent windows cover each other, and for each window W _m compute its interior mean optical flow vector As the displacement vector of the sketch at this point, then in the next frame, the sketch will be propagated to

place;

Step 5: Calculate the confidence of the seed region during propagation: for each sketched

[Formula 1]

E E. ((l l)) = = \underset{i i}{Σ Σ} {R R}_{i i} (({l l}_{i i})) + + λ λ \underset{< < p p,, q q > > &Element; &Element; N N}{Σ Σ} {V V}_{< < p p,, q q > >} (({l l}_{p p},, {l l}_{q q}))

Among them, λ is a weight factor, which controls the size of the credible region, and the value is 0.3; R _i (l _i ) is the color probability defined by the Gaussian mixture model, specifically defined as

R _i (l _i )=-ln(Pr(C _i |l _i ))

Pr PR (({C C}_{i i} | | {l l}_{i i})) = = \frac{11}{22 π π {| | {Σ Σ}_{j j} | |}^{\frac{11}{22}}} {exp exp}^{- - \frac{11}{22} {((ξ ξ - - {μ μ}_{j j}))}^{T T} {Σ Σ}_{j j}^{- - 11} ((ξ ξ - - {μ μ}_{j j}))}

V _{<p, q>} is to define the smoothness of adjacent pixels, so that the outline can still maintain a relatively compact integrity:

{V V}_{< < p p,, q q > >} (({l l}_{p p},, {l l}_{q q})) = = \frac{11}{11 + + | | | | {C C}_{p p} - - {C C}_{q q} | | | |} | | {l l}_{p p} - - {l l}_{q q} | |

Among them, C is the color of the pixel, l _p is the number of layers where the pixel is located. [Formula 1] uses the graph cut algorithm to effectively optimize the solution, divide the outline of the area with a reliability lower than 30% into two, and retain the reliability The high part of the area is used as the seed area on the layer, and then the seed area that meets the layering requirements is obtained;

Step 6: Stratify each frame according to the sketched seed area: the result of stratification is to divide each frame into disjoint planar areas, each planar area has a color and movement similar to the seed area, here The similarity of color is described by the color probability defined by the Gaussian mixture model in [Formula 1], and the similarity of motion is described by the color difference of the corresponding pixel after displacement. The specific motion difference is defined as:

{M m}_{i i} (({l l}_{i i})) = = arctan arctan (({| | | | {I I}_{t t} ((i i)) - - {I I}_{t t + + 11} (({i i}^{' '})) | | | |}^{22} - - τ τ)) + + \frac{π π}{22}

Among them, τ is a constant value of 60. In order to increase the consistency of the layers of two adjacent frames, the time consistency energy is defined as follows:

Considering the above factors comprehensively, the calculation layer on each frame is obtained by optimizing the following energy function:

[Formula 2]

E E. ((l l)) = = \underset{i i}{Σ Σ} (({R R}_{i i} (({l l}_{i i})) + + {M m}_{i i} (({l l}_{i i})) + + {T T}_{i i} (({l l}_{i i})))) + + λ λ \underset{< < p p,, q q > > &Element; &Element; N N}{Σ Σ} {V V}_{< < p p,, q q > >} (({l l}_{p p},, {l l}_{q q}))

Among them, the weight factor λ takes a value of 0.3. This function measures the energy of assigning each pixel to a certain layer. By minimizing the energy function, a layered video can be obtained, and each frame of the video is represented as a different layer in turn. combination of regions;

Step 7: For the foreground layer of the video, first use an anisotropic model to arrange brushes on each foreground layer of the key frame, and then generate a stylized foreground layer;

Step 8: In order to make the arrangement of brushes as consistent as possible when drawing each layer of each frame of the video, the brush on the key frame is propagated to the remaining frames frame by frame. In order to transfer the brush as smoothly as possible, a thin plate is used The following transformation defined by the spline function is used to transfer the brush:

[Formula 3]

T(x,y)=(f ₁ (x,y),f ₂ (x,y))

f f ((x x,, y the y)) = = {c c}_{00} + + {c c}_{11} x x + + {c c}_{22} y the y + + {Σ Σ}_{i i = = 11}^{n no} {w w}_{i i} Ψ Ψ ((| | | | ((x x,, y the y)) - - (({x x}_{a a},, {y the y}_{a a})) | | | |))

Where Ψ(r)=r ² logr ² is the kernel function, and the thin plate spline coefficients can be obtained by solving the following linear equations:

(\begin{matrix} K K & P P \\ {P P}^{T T} & 00 \end{matrix}) (\begin{matrix} w w \\ c c \end{matrix}) = = (\begin{matrix} {p p}^{k k + + 11} \\ 00 \end{matrix})

Among them, column a and row b of K K _ab =Ψ(‖(x _a , y _a )-(x _b ,y _b )‖), and column a of P ^T is (1, x _a , y _a ) ^T , p ^k+1 is the corresponding feature point on the frame, through the brush propagation of the foreground layer, the stylized drawing of all the foreground of the video sequence can be generated;

Step 9: Stitch all the background layers of the video sequence into a panorama in the same coordinate system: in order to obtain accurate panorama reconstruction, it is necessary to calculate the corresponding feature points on each layer between the frames of the video sequence, Assume that the feature points on the l layer of the kth frame are

The corresponding feature points on the k+1th frame are Then find the optimal transformation H _k to minimize the error of the transformed feature points, so as to obtain an accurate panorama reconstruction. The transformation H _k is solved by the following optimization function:

{H h}_{k k} = = \underset{T T}{arg arg min min} \underset{l l}{Σ Σ} {| | | | {p p}_{l l}^{k k} - - T T \cdot \cdot {p p}_{l l}^{k k + + 11} | | | |}^{22}

After solving the transformation for the background layer of each frame, all the background layers can be stitched together in the same coordinate system to generate a panorama;

Step 10: For the stitched panorama of the background layer, use an anisotropic model to arrange brushes to generate a stylized background layer, and then use the inverse transformation of the stitching transformation

Step 11: For each frame in the video sequence, merge the drawn layers in order from back to front to generate the final stylized drawn video sequence, namely

here

Corresponding to the r-th layer of the e-th frame,