
CN104010180B - Three-dimensional video filtering method and device - Google Patents

Three-dimensional video filtering method and device

Info

Publication number
CN104010180B
CN104010180B · CN201410265360.4A · CN104010180A
Authority
CN
China
Prior art keywords
pixel
filtered
value
depth
values
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201410265360.4A
Other languages
Chinese (zh)
Other versions
CN104010180A
Inventor
朱策
王昕
郑建铧
张玉花
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huawei Technologies Co Ltd
Original Assignee
Huawei Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co Ltd filed Critical Huawei Technologies Co Ltd
Priority to CN201410265360.4A priority Critical patent/CN104010180B/en
Publication of CN104010180A publication Critical patent/CN104010180A/en
Priority to PCT/CN2015/077707 priority patent/WO2015188666A1/en
Application granted granted Critical
Publication of CN104010180B publication Critical patent/CN104010180B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N13/00 Stereoscopic video systems; Multi-view video systems; Details thereof

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Testing, Inspecting, Measuring Of Stereoscopic Televisions And Televisions (AREA)
  • Image Processing (AREA)
  • Image Generation (AREA)

Abstract

An embodiment of the invention provides a three-dimensional video filtering method and device. The method comprises the following steps: projecting pixels in an image plane into three-dimensional space; calculating the spatial proximity of a pixel to be filtered and a reference pixel in the three-dimensional space according to their coordinate values in the three-dimensional space; calculating the similarity of the texture pixel values of the pixel to be filtered and the reference pixel according to their texture pixel values; calculating the consistency of the motion features of the pixel to be filtered and the reference pixel according to the texture pixel values of the pixel to be filtered, the reference pixel, and the pixels at the same positions in the previous frame of image; and determining filtering weights according to the spatial proximity, the texture pixel value similarity, and the motion feature consistency, and performing a weighted average of the pixel values of the reference pixels in the reference pixel set to obtain a filtering result for the pixel to be filtered. The embodiment of the invention improves the filtering accuracy of three-dimensional video.

Description

Three-dimensional video filtering method and device

Technical Field

Embodiments of the present invention relate to image processing technologies, and in particular, to a three-dimensional video filtering method and device.

Background Art

With the continuous development of visual multimedia technology, three-dimensional (3D) video, with its distinctive depth effect, has gradually entered people's lives and is now applied in many fields such as education, the military, entertainment, and medical care. Current 3D video falls into two main categories according to its content: pure color 3D video and depth-based 3D video. Pure color 3D video presents multiple channels of color video directly to the user; its viewpoint positions and parallax are fixed, which imposes certain limitations on viewing. Compared with pure color 3D video, depth-based 3D video, thanks to the introduction of depth maps, can synthesize a virtual image of an arbitrary viewpoint through depth-image-based rendering, so viewers can choose the viewpoint and adjust the parallax according to personal preference and thus enjoy 3D video more fully. This freedom and flexibility has made depth-based 3D video a widely accepted 3D video format.

Depth-based 3D video content consists of a texture map sequence and a depth map sequence: the texture map directly presents the texture features of object surfaces, and the depth map reflects the distance between the objects and the camera. A specified virtual-viewpoint texture image can be synthesized from this video content using depth-image-based rendering. However, a large amount of noise is introduced into the depth maps and texture maps during acquisition, encoding, and transmission. In the view synthesis stage, noise in the depth map and in the texture map causes geometric distortion and texture distortion of the synthesized image respectively, which seriously degrades the viewing experience. Filtering can effectively remove this noise and thereby improve 3D video quality.

In the prior art, the main denoising method for texture maps is the bilateral filter, which takes the pixels surrounding the pixel to be filtered as references and computes their weighted average to obtain the filtering result. The weight calculation mainly considers the proximity of pixel positions in the image and the similarity of pixel values. This filtering method assumes that the closer two pixels are on the image plane, the stronger their correlation, and that the more similar their pixel values, the stronger their correlation.

FIG. 1 is a schematic diagram of how a prior-art bilateral filter computes proximity. The problem with the prior art is that, although the pixels in an image are reproductions on the two-dimensional image plane of points in real three-dimensional space, the bilateral filter does not start from the real three-dimensional scene when evaluating pixel proximity, so its results are inaccurate. As shown in FIG. 1, A', B' and C' are three points in the real scene whose positions in the image plane, after capture by the camera, are A, B and C, and the distance between A and C on the plane equals the distance between B and C on the plane. If C is to be filtered, A and B are reference pixels. As FIG. 1 clearly shows, B is closer to C in three-dimensional space, whereas the bilateral filter treats A and B as equally close to C, so the accuracy of the filtering result is low.

Summary of the Invention

Embodiments of the present invention provide a three-dimensional video filtering method and device to overcome the low accuracy of filtering results in the prior art.

In a first aspect, an embodiment of the present invention provides a three-dimensional video filtering method, including:

projecting pixels in an image plane into three-dimensional space, the pixels including a pixel to be filtered and a reference pixel set;

calculating the spatial proximity of the pixel to be filtered and a reference pixel in the reference pixel set in the three-dimensional space according to their coordinate values in the three-dimensional space, where the reference pixel set and the pixel to be filtered are in the same frame of image;

calculating the texture pixel value similarity of the pixel to be filtered and the reference pixel according to the texture pixel values of the pixel to be filtered and the reference pixel in the reference pixel set;

calculating the motion feature consistency of the pixel to be filtered and the reference pixel according to the texture pixel values of the pixel to be filtered, the reference pixel in the reference pixel set, and the pixels at the same positions in the frame preceding the frame in which the pixel to be filtered is located; and

determining filtering weights according to the spatial proximity, the texture pixel value similarity, and the motion feature consistency, and performing a weighted average of the pixel values of the reference pixels in the reference pixel set to obtain a filtering result for the pixel to be filtered.

With reference to the first aspect, in a first implementation of the first aspect, when a depth image is filtered, the filtering weights are determined according to the spatial proximity, the similarity of the texture pixel values corresponding to the depth pixels, and the motion feature consistency, and a weighted average of the depth pixel values of the reference pixels in the reference pixel set is computed to obtain a filtering result for the depth pixel value of the pixel to be filtered in the depth image; or,

when a texture image is filtered, the filtering weights are determined according to the spatial proximity, the texture pixel similarity, and the motion feature consistency, and a weighted average of the texture pixel values of the reference pixels in the reference pixel set is computed to obtain a filtering result for the texture pixel value of the pixel to be filtered in the texture image.

With reference to the first aspect or the first implementation of the first aspect, in a second implementation of the first aspect, projecting the pixels in the image plane into three-dimensional space includes:

projecting the pixels from the image plane into three-dimensional space using the depth image information, viewpoint position information, and reference camera parameter information provided by the three-dimensional video, the depth image information including the depth pixel values of the pixels.

With reference to the second implementation of the first aspect, in a third implementation of the first aspect, projecting the pixels from the image plane into three-dimensional space using the depth image information, viewpoint position information, and reference camera parameter information provided by the three-dimensional video includes:

calculating the coordinate of a pixel after projection into three-dimensional space according to the formula $P = R^{-1}(dA^{-1}p - t)$;

where R and t are the rotation matrix and translation vector of the reference camera, A is the reference camera parameter matrix

$$A = \begin{bmatrix} f_x & r & o_x \\ 0 & f_y & o_y \\ 0 & 0 & 1 \end{bmatrix},$$

$p = \begin{bmatrix} u & v & 1 \end{bmatrix}^T$ is the coordinate of the pixel in the image plane, $P = \begin{bmatrix} x & y & z \end{bmatrix}^T$ is the coordinate of the pixel in three-dimensional space, and d is the depth pixel value of the pixel; $f_x$ and $f_y$ are the normalized focal lengths in the horizontal and vertical directions respectively, r is the radial distortion coefficient, and $(o_x, o_y)$ is the coordinate of the reference point on the image plane, the reference point being the intersection of the optical axis of the reference camera and the image plane.

With reference to the first aspect or any one of the first to third implementations of the first aspect, in a fourth implementation of the first aspect, the spatial proximity is calculated by taking the distance between the pixel to be filtered and the reference pixel in three-dimensional space as the input value of a function whose output value increases as the input value decreases;

the texture pixel value similarity is calculated by taking the difference between the texture pixel values of the pixel to be filtered and the reference pixel as the input value of a function whose output value increases as the input value decreases; and

the motion feature consistency is obtained by determining whether the motion features of the pixel to be filtered and the reference pixel are consistent, including:

when the difference between the texture pixel value of the pixel to be filtered and that of the pixel at the corresponding position in the previous frame, and the difference between the texture pixel value of the reference pixel and that of the pixel at the corresponding position in the previous frame, are both greater than or both less than a preset threshold, determining that the motion states of the pixel to be filtered and the reference pixel are consistent; otherwise, determining that the motion states of the pixel to be filtered and the reference pixel are inconsistent.

With reference to the first aspect or any one of the first to fourth implementations of the first aspect, in a fifth implementation of the first aspect, determining the filtering weights according to the spatial proximity, the texture pixel value similarity, and the motion feature consistency, and performing a weighted average of the pixel values of the reference pixels in the reference pixel set to obtain the filtering result of the pixel to be filtered, includes:

calculating the filtering result of the depth pixel value of the pixel to be filtered according to formula (1):

$$D_p' = \frac{\sum_{q \in K} f_S(P, Q)\, f_T(T_p, T_q)\, f_M(M_p, M_q)\, D_q}{\sum_{q \in K} f_S(P, Q)\, f_T(T_p, T_q)\, f_M(M_p, M_q)} \qquad (1)$$

or,

calculating the filtering result of the texture pixel value of the pixel to be filtered according to formula (2):

$$T_p' = \frac{\sum_{q \in K} f_S(P, Q)\, f_T(T_p, T_q)\, f_M(M_p, M_q)\, T_q}{\sum_{q \in K} f_S(P, Q)\, f_T(T_p, T_q)\, f_M(M_p, M_q)} \qquad (2)$$

where $f_S(P, Q) = f_S(\|P - Q\|)$ is used to calculate the spatial proximity of the pixel to be filtered and the reference pixel;

$f_T(T_p, T_q) = f_T(\|T_p - T_q\|)$ is used to calculate the texture pixel value similarity of the pixel to be filtered and the reference pixel;

$$f_M(M_p, M_q) = f_M(\|M_p - M_q\|) = \begin{cases} 1, & \text{if } \left(|T_p - T_{p'}| < th\right) \oplus \left(|T_q - T_{q'}| < th\right) = 0 \\ 0, & \text{otherwise} \end{cases}$$

is used to calculate the motion feature consistency of the pixel to be filtered and the reference pixel;

where p is the pixel to be filtered, q is a reference pixel, K is the reference pixel set, $D_p'$ is the filtered depth pixel value of p, $D_q$ is the depth pixel value of q, P and Q are the coordinates of p and q in three-dimensional space, $T_p$ and $T_q$ are the texture pixel values of p and q, $T_{p'}$ and $T_{q'}$ are the texture pixel values at the same positions as p and q in the previous frame, $T_p'$ in formula (2) is the filtered texture pixel value of p, and th is a preset texture pixel difference threshold.
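As an illustration of formulas (1) and (2), the following Python sketch computes the filtered depth or texture value of one pixel from its reference set K. It is only a minimal sketch of the weighting scheme described above: the Gaussian forms chosen for f_S and f_T, the parameter names sigma_s, sigma_t and th, and the dictionary layout of the pixel records are assumptions for illustration, not details taken from the patent.

```python
import numpy as np

def filter_pixel(p, refs, sigma_s=1.0, sigma_t=10.0, th=15.0, key='D'):
    """Weighted average over the reference set K following formulas (1)/(2).

    p    : dict with 'P' (3D coordinate, numpy array of shape (3,)), 'T'
           (texture value), 'T_prev' (texture value at the same position in
           the previous frame) and 'D' (depth value) of the pixel to be filtered
    refs : list of dicts with the same keys, one per reference pixel q in K
    key  : 'D' filters the depth value (formula (1)), 'T' the texture value (formula (2))
    """
    p_static = abs(float(p['T']) - float(p['T_prev'])) < th   # motion state of p
    num = den = 0.0
    for q in refs:
        # f_S: spatial proximity, grows as the 3D distance shrinks (Gaussian assumed)
        f_s = np.exp(-np.linalg.norm(p['P'] - q['P']) ** 2 / (2 * sigma_s ** 2))
        # f_T: texture similarity, grows as the texel difference shrinks (Gaussian assumed)
        f_t = np.exp(-(float(p['T']) - float(q['T'])) ** 2 / (2 * sigma_t ** 2))
        # f_M: 1 when p and q have the same motion state (the XOR in f_M equals 0), else 0
        q_static = abs(float(q['T']) - float(q['T_prev'])) < th
        f_m = 1.0 if p_static == q_static else 0.0
        w = f_s * f_t * f_m
        num += w * q[key]
        den += w
    # p itself is normally part of K, so den stays positive; fall back to p otherwise
    return num / den if den > 0 else p[key]
```

Calling `filter_pixel(p, K, key='D')` would correspond to formula (1) for a depth map and `key='T'` to formula (2) for a texture map, under the assumptions stated above.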

In a second aspect, an embodiment of the present invention provides a three-dimensional video filtering method, including:

projecting pixels in an image plane into three-dimensional space, the pixels including a pixel to be filtered and a reference pixel set;

calculating the spatial proximity of the pixel to be filtered and a reference pixel in the reference pixel set in the three-dimensional space according to their coordinate values in the three-dimensional space, where the reference pixel set lies in the same frame of image as the pixel to be filtered and in multiple adjacent frames of image;

calculating the texture pixel value similarity of the pixel to be filtered and the reference pixel according to the texture pixel values of the pixel to be filtered and the reference pixel in the reference pixel set;

calculating the temporal proximity of the pixel to be filtered and the reference pixel according to the time interval between the frame in which the pixel to be filtered is located and the frame in which the reference pixel in the reference pixel set is located; and

determining filtering weights according to the spatial proximity, the texture pixel value similarity, and the temporal proximity, and performing a weighted average of the pixel values of the reference pixels in the reference pixel set to obtain a filtering result for the pixel to be filtered.

With reference to the second aspect, in a first implementation of the second aspect, when a depth image is filtered, the filtering weights are determined according to the spatial proximity, the similarity of the texture pixel values corresponding to the depth pixels, and the temporal proximity, and a weighted average of the depth pixel values of the reference pixels in the reference pixel set is computed to obtain a filtering result for the depth pixel value of the pixel to be filtered in the depth image; or,

when a texture image is filtered, the filtering weights are determined according to the spatial proximity, the texture pixel similarity, and the temporal proximity, and a weighted average of the texture pixel values of the reference pixels in the reference pixel set is computed to obtain a filtering result for the texture pixel value of the pixel to be filtered in the texture image.

With reference to the second aspect or the first implementation of the second aspect, in a second implementation of the second aspect, projecting the pixels in the image plane into three-dimensional space includes:

projecting the pixels from the image plane into three-dimensional space using the depth image information, viewpoint position information, and reference camera parameter information provided by the three-dimensional video, the depth image information including the depth pixel values of the pixels.

With reference to the second implementation of the second aspect, in a third implementation of the second aspect, projecting the pixels from the image plane into three-dimensional space using the depth image information, viewpoint position information, and reference camera parameter information provided by the three-dimensional video includes:

calculating the coordinate of a pixel after projection into three-dimensional space according to the formula $P = R^{-1}(dA^{-1}p - t)$;

where R and t are the rotation matrix and translation vector of the reference camera, A is the reference camera parameter matrix

$$A = \begin{bmatrix} f_x & r & o_x \\ 0 & f_y & o_y \\ 0 & 0 & 1 \end{bmatrix},$$

$p = \begin{bmatrix} u & v & 1 \end{bmatrix}^T$ is the coordinate of the pixel in the image plane, $P = \begin{bmatrix} x & y & z \end{bmatrix}^T$ is the coordinate of the pixel in three-dimensional space, and d is the depth pixel value of the pixel; $f_x$ and $f_y$ are the normalized focal lengths in the horizontal and vertical directions respectively, r is the radial distortion coefficient, and $(o_x, o_y)$ is the coordinate of the reference point on the image plane, the reference point being the intersection of the optical axis of the reference camera and the image plane.

With reference to the second aspect or any one of the first to third implementations of the second aspect, in a fourth implementation of the second aspect, the spatial proximity is calculated by taking the distance between the pixel to be filtered and the reference pixel in three-dimensional space as the input value of a function whose output value increases as the input value decreases;

the texture pixel value similarity is calculated by taking the difference between the texture pixel values of the pixel to be filtered and the reference pixel as the input value of a function whose output value increases as the input value decreases; and

the temporal proximity is calculated by taking the time interval between the frames in which the pixel to be filtered and the reference pixel are located as the input value of a function whose output value increases as the input value decreases.

With reference to the second aspect or any one of the first to fourth implementations of the second aspect, in a fifth implementation of the second aspect, determining the filtering weights according to the spatial proximity, the texture pixel value similarity, and the temporal proximity, and performing a weighted average of the pixel values of the reference pixels in the reference pixel set to obtain the filtering result of the pixel to be filtered, includes:

calculating the filtering result of the depth pixel value of the pixel to be filtered according to formula (3):

$$D_p' = \frac{\sum_{i=N-m}^{N+n} \left( \sum_{q_i \in K_i} f_S(P, Q_i)\, f_T(T_p, T_{q_i})\, f_{tem}(i, N)\, D_{q_i} \right)}{\sum_{i=N-m}^{N+n} \left( \sum_{q_i \in K_i} f_S(P, Q_i)\, f_T(T_p, T_{q_i})\, f_{tem}(i, N) \right)} \qquad (3)$$

or,

calculating the filtering result of the texture pixel value of the pixel to be filtered according to formula (4):

$$T_p' = \frac{\sum_{i=N-m}^{N+n} \left( \sum_{q_i \in K_i} f_S(P, Q_i)\, f_T(T_p, T_{q_i})\, f_{tem}(i, N)\, T_{q_i} \right)}{\sum_{i=N-m}^{N+n} \left( \sum_{q_i \in K_i} f_S(P, Q_i)\, f_T(T_p, T_{q_i})\, f_{tem}(i, N) \right)} \qquad (4)$$

where $f_S(P, Q_i) = f_S(\|P - Q_i\|)$ is used to calculate the spatial proximity of the pixel to be filtered and the reference pixel;

$f_T(T_p, T_{q_i}) = f_T(\|T_p - T_{q_i}\|)$ is used to calculate the texture pixel value similarity of the pixel to be filtered and the reference pixel;

$f_{tem}(i, N) = f_{tem}(\|i - N\|)$ is used to calculate the temporal proximity of the pixel to be filtered and the reference pixel;

where N is the number of the frame in which the pixel to be filtered is located, i is the number of the frame in which the reference pixel is located, i takes integer values in the interval [N-m, N+n], m and n are the numbers of reference frames before and after the frame of the pixel to be filtered respectively, m and n are non-negative integers, p is the pixel to be filtered, $q_i$ is a reference pixel in the i-th frame, $K_i$ is the reference pixel set in the i-th frame, $D_p'$ is the filtered depth pixel value of p, $D_{q_i}$ is the depth pixel value of $q_i$ in the i-th frame, P and $Q_i$ are the coordinates of p and $q_i$ in three-dimensional space, $T_p$ and $T_{q_i}$ are the texture pixel values of p and $q_i$, and $T_p'$ is the filtered texture pixel value of p.
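Formulas (3) and (4) extend the same weighting to reference pixels drawn from frames N-m through N+n, with f_tem decaying as the frame interval |i - N| grows. The sketch below again assumes Gaussian forms for f_S, f_T and f_tem and illustrative parameter names (sigma_s, sigma_t, sigma_f); it is a sketch of the idea, not the patent's reference implementation.

```python
import numpy as np

def filter_pixel_temporal(p, frame_refs, N, sigma_s=1.0, sigma_t=10.0,
                          sigma_f=2.0, key='D'):
    """Weighted average over frames N-m..N+n following formulas (3)/(4).

    p          : dict with 'P' (3D coordinate, numpy array of shape (3,)),
                 'T' (texture value) and 'D' (depth value) of the pixel to be filtered
    frame_refs : dict mapping frame number i -> list of reference-pixel dicts,
                 each with 'P' (3D coordinate), 'T' (texture) and 'D' (depth)
    N          : number of the frame in which the pixel to be filtered is located
    key        : 'D' gives formula (3), 'T' gives formula (4)
    """
    num = den = 0.0
    for i, refs in frame_refs.items():
        # f_tem: temporal proximity, grows as the frame interval |i - N| shrinks
        f_tem = np.exp(-(i - N) ** 2 / (2 * sigma_f ** 2))
        for q in refs:
            # f_S and f_T as in the single-frame case (Gaussian forms assumed)
            f_s = np.exp(-np.linalg.norm(p['P'] - q['P']) ** 2 / (2 * sigma_s ** 2))
            f_t = np.exp(-(float(p['T']) - float(q['T'])) ** 2 / (2 * sigma_t ** 2))
            w = f_s * f_t * f_tem
            num += w * q[key]
            den += w
    return num / den if den > 0 else p[key]
```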

In a third aspect, an embodiment of the present invention provides a three-dimensional video filtering device, including:

a projection module, configured to project pixels in an image plane into three-dimensional space, the pixels including a pixel to be filtered and a reference pixel set;

a calculation module, configured to calculate the spatial proximity of the pixel to be filtered and a reference pixel in the reference pixel set in the three-dimensional space according to their coordinate values in the three-dimensional space, where the reference pixel set and the pixel to be filtered are in the same frame of image;

the calculation module being further configured to calculate the texture pixel value similarity of the pixel to be filtered and the reference pixel according to the texture pixel values of the pixel to be filtered and the reference pixel in the reference pixel set;

the calculation module being further configured to calculate the motion feature consistency of the pixel to be filtered and the reference pixel according to the texture pixel values of the pixel to be filtered, the reference pixel in the reference pixel set, and the pixels at the same positions in the frame preceding the frame in which the pixel to be filtered is located; and

a filtering module, configured to determine filtering weights according to the spatial proximity, the texture pixel value similarity, and the motion feature consistency, and to perform a weighted average of the pixel values of the reference pixels in the reference pixel set to obtain a filtering result for the pixel to be filtered.

With reference to the third aspect, in a first implementation of the third aspect, the filtering module is specifically configured to:

when a depth image is filtered, determine the filtering weights according to the spatial proximity, the similarity of the texture pixel values corresponding to the depth pixels, and the motion feature consistency, and perform a weighted average of the depth pixel values of the reference pixels in the reference pixel set to obtain a filtering result for the depth pixel value of the pixel to be filtered in the depth image; or,

when a texture image is filtered, determine the filtering weights according to the spatial proximity, the texture pixel similarity, and the motion feature consistency, and perform a weighted average of the texture pixel values of the reference pixels in the reference pixel set to obtain a filtering result for the texture pixel value of the pixel to be filtered in the texture image.

With reference to the third aspect or the first implementation of the third aspect, in a second implementation of the third aspect, the projection module is specifically configured to:

project the pixels from the image plane into three-dimensional space using the depth image information, viewpoint position information, and reference camera parameter information provided by the three-dimensional video, the depth image information including the depth pixel values of the pixels.

With reference to the second implementation of the third aspect, in a third implementation of the third aspect, the projection module is specifically configured to:

calculate the coordinate of a pixel after projection into three-dimensional space according to the formula $P = R^{-1}(dA^{-1}p - t)$;

where R and t are the rotation matrix and translation vector of the reference camera, A is the reference camera parameter matrix

$$A = \begin{bmatrix} f_x & r & o_x \\ 0 & f_y & o_y \\ 0 & 0 & 1 \end{bmatrix},$$

$p = \begin{bmatrix} u & v & 1 \end{bmatrix}^T$ is the coordinate of the pixel in the image plane, $P = \begin{bmatrix} x & y & z \end{bmatrix}^T$ is the coordinate of the pixel in three-dimensional space, and d is the depth pixel value of the pixel; $f_x$ and $f_y$ are the normalized focal lengths in the horizontal and vertical directions respectively, r is the radial distortion coefficient, and $(o_x, o_y)$ is the coordinate of the reference point on the image plane, the reference point being the intersection of the optical axis of the reference camera and the image plane.

With reference to the third aspect or any one of the first to third implementations of the third aspect, in a fourth implementation of the third aspect, the spatial proximity is calculated by taking the distance between the pixel to be filtered and the reference pixel in three-dimensional space as the input value of a function whose output value increases as the input value decreases;

the texture pixel value similarity is calculated by taking the difference between the texture pixel values of the pixel to be filtered and the reference pixel as the input value of a function whose output value increases as the input value decreases; and

the motion feature consistency is obtained by determining whether the motion features of the pixel to be filtered and the reference pixel are consistent, including:

when the difference between the texture pixel value of the pixel to be filtered and that of the pixel at the corresponding position in the previous frame, and the difference between the texture pixel value of the reference pixel and that of the pixel at the corresponding position in the previous frame, are both greater than or both less than a preset threshold, determining that the motion states of the pixel to be filtered and the reference pixel are consistent; otherwise, determining that the motion states of the pixel to be filtered and the reference pixel are inconsistent.

With reference to the third aspect or any one of the first to fourth implementations of the third aspect, in a fifth implementation of the third aspect, the filtering module is specifically configured to:

calculate the filtering result of the depth pixel value of the pixel to be filtered according to formula (1):

$$D_p' = \frac{\sum_{q \in K} f_S(P, Q)\, f_T(T_p, T_q)\, f_M(M_p, M_q)\, D_q}{\sum_{q \in K} f_S(P, Q)\, f_T(T_p, T_q)\, f_M(M_p, M_q)} \qquad (1)$$

or,

calculate the filtering result of the texture pixel value of the pixel to be filtered according to formula (2):

$$T_p' = \frac{\sum_{q \in K} f_S(P, Q)\, f_T(T_p, T_q)\, f_M(M_p, M_q)\, T_q}{\sum_{q \in K} f_S(P, Q)\, f_T(T_p, T_q)\, f_M(M_p, M_q)} \qquad (2)$$

where $f_S(P, Q) = f_S(\|P - Q\|)$ is used to calculate the spatial proximity of the pixel to be filtered and the reference pixel;

$f_T(T_p, T_q) = f_T(\|T_p - T_q\|)$ is used to calculate the texture pixel value similarity of the pixel to be filtered and the reference pixel;

$$f_M(M_p, M_q) = f_M(\|M_p - M_q\|) = \begin{cases} 1, & \text{if } \left(|T_p - T_{p'}| < th\right) \oplus \left(|T_q - T_{q'}| < th\right) = 0 \\ 0, & \text{otherwise} \end{cases}$$

is used to calculate the motion feature consistency of the pixel to be filtered and the reference pixel;

where p is the pixel to be filtered, q is a reference pixel, K is the reference pixel set, $D_p'$ is the filtered depth pixel value of p, $D_q$ is the depth pixel value of q, P and Q are the coordinates of p and q in three-dimensional space, $T_p$ and $T_q$ are the texture pixel values of p and q, $T_{p'}$ and $T_{q'}$ are the texture pixel values at the same positions as p and q in the previous frame, $T_p'$ in formula (2) is the filtered texture pixel value of p, and th is a preset texture pixel difference threshold.

In a fourth aspect, an embodiment of the present invention provides a three-dimensional video filtering device, including:

a projection module, configured to project pixels in an image plane into three-dimensional space, the pixels including a pixel to be filtered and a reference pixel set;

a calculation module, configured to calculate the spatial proximity of the pixel to be filtered and a reference pixel in the reference pixel set in the three-dimensional space according to their coordinate values in the three-dimensional space, where the reference pixel set lies in the same frame of image as the pixel to be filtered and in multiple adjacent frames of image;

the calculation module being further configured to calculate the texture pixel value similarity of the pixel to be filtered and the reference pixel according to the texture pixel values of the pixel to be filtered and the reference pixel in the reference pixel set;

the calculation module being further configured to calculate the temporal proximity of the pixel to be filtered and the reference pixel according to the time interval between the frame in which the pixel to be filtered is located and the frame in which the reference pixel in the reference pixel set is located; and

a filtering module, configured to determine filtering weights according to the spatial proximity, the texture pixel value similarity, and the temporal proximity, and to perform a weighted average of the pixel values of the reference pixels in the reference pixel set to obtain a filtering result for the pixel to be filtered.

With reference to the fourth aspect, in a first implementation of the fourth aspect, the filtering module is specifically configured to:

when a depth image is filtered, determine the filtering weights according to the spatial proximity, the similarity of the texture pixel values corresponding to the depth pixels, and the temporal proximity, and perform a weighted average of the depth pixel values of the reference pixels in the reference pixel set to obtain a filtering result for the depth pixel value of the pixel to be filtered in the depth image; or,

when a texture image is filtered, determine the filtering weights according to the spatial proximity, the texture pixel similarity, and the temporal proximity, and perform a weighted average of the texture pixel values of the reference pixels in the reference pixel set to obtain a filtering result for the texture pixel value of the pixel to be filtered in the texture image.

With reference to the fourth aspect or the first implementation of the fourth aspect, in a second implementation of the fourth aspect, the projection module is specifically configured to:

project the pixels from the image plane into three-dimensional space using the depth image information, viewpoint position information, and reference camera parameter information provided by the three-dimensional video, the depth image information including the depth pixel values of the pixels.

With reference to the second implementation of the fourth aspect, in a third implementation of the fourth aspect, the projection module is specifically configured to:

calculate the coordinate of a pixel after projection into three-dimensional space according to the formula $P = R^{-1}(dA^{-1}p - t)$;

where R and t are the rotation matrix and translation vector of the reference camera, A is the reference camera parameter matrix

$$A = \begin{bmatrix} f_x & r & o_x \\ 0 & f_y & o_y \\ 0 & 0 & 1 \end{bmatrix},$$

$p = \begin{bmatrix} u & v & 1 \end{bmatrix}^T$ is the coordinate of the pixel in the image plane, $P = \begin{bmatrix} x & y & z \end{bmatrix}^T$ is the coordinate of the pixel in three-dimensional space, and d is the depth pixel value of the pixel; $f_x$ and $f_y$ are the normalized focal lengths in the horizontal and vertical directions respectively, r is the radial distortion coefficient, and $(o_x, o_y)$ is the coordinate of the reference point on the image plane, the reference point being the intersection of the optical axis of the reference camera and the image plane.

With reference to the fourth aspect or any one of the first to third implementations of the fourth aspect, in a fourth implementation of the fourth aspect, the spatial proximity is calculated by taking the distance between the pixel to be filtered and the reference pixel in three-dimensional space as the input value of a function whose output value increases as the input value decreases;

the texture pixel value similarity is calculated by taking the difference between the texture pixel values of the pixel to be filtered and the reference pixel as the input value of a function whose output value increases as the input value decreases; and

the temporal proximity is calculated by taking the time interval between the frames in which the pixel to be filtered and the reference pixel are located as the input value of a function whose output value increases as the input value decreases.

With reference to the fourth aspect or any one of the first to fourth implementations of the fourth aspect, in a fifth implementation of the fourth aspect, the filtering module is specifically configured to:

calculate the filtering result of the depth pixel value of the pixel to be filtered according to formula (3):

$$D_p' = \frac{\sum_{i=N-m}^{N+n} \left( \sum_{q_i \in K_i} f_S(P, Q_i)\, f_T(T_p, T_{q_i})\, f_{tem}(i, N)\, D_{q_i} \right)}{\sum_{i=N-m}^{N+n} \left( \sum_{q_i \in K_i} f_S(P, Q_i)\, f_T(T_p, T_{q_i})\, f_{tem}(i, N) \right)} \qquad (3)$$

or,

calculate the filtering result of the texture pixel value of the pixel to be filtered according to formula (4):

$$T_p' = \frac{\sum_{i=N-m}^{N+n} \left( \sum_{q_i \in K_i} f_S(P, Q_i)\, f_T(T_p, T_{q_i})\, f_{tem}(i, N)\, T_{q_i} \right)}{\sum_{i=N-m}^{N+n} \left( \sum_{q_i \in K_i} f_S(P, Q_i)\, f_T(T_p, T_{q_i})\, f_{tem}(i, N) \right)} \qquad (4)$$

where $f_S(P, Q_i) = f_S(\|P - Q_i\|)$ is used to calculate the spatial proximity of the pixel to be filtered and the reference pixel;

$f_T(T_p, T_{q_i}) = f_T(\|T_p - T_{q_i}\|)$ is used to calculate the texture pixel value similarity of the pixel to be filtered and the reference pixel;

$f_{tem}(i, N) = f_{tem}(\|i - N\|)$ is used to calculate the temporal proximity of the pixel to be filtered and the reference pixel;

where N is the number of the frame in which the pixel to be filtered is located, i is the number of the frame in which the reference pixel is located, i takes integer values in the interval [N-m, N+n], m and n are the numbers of reference frames before and after the frame of the pixel to be filtered respectively, m and n are non-negative integers, p is the pixel to be filtered, $q_i$ is a reference pixel in the i-th frame, $K_i$ is the reference pixel set in the i-th frame, $D_p'$ is the filtered depth pixel value of p, $D_{q_i}$ is the depth pixel value of $q_i$ in the i-th frame, P and $Q_i$ are the coordinates of p and $q_i$ in three-dimensional space, $T_p$ and $T_{q_i}$ are the texture pixel values of p and $q_i$, and $T_p'$ is the filtered texture pixel value of p.

In the three-dimensional video filtering method and device of the embodiments of the present invention, the relationship between the pixel to be filtered and the reference pixels in real three-dimensional space is used to calculate the spatial proximity of the pixel to be filtered and the reference pixels in the three-dimensional space, their texture pixel value similarity, their motion feature consistency, and their temporal proximity. Filtering weights are determined according to the spatial proximity, texture pixel value similarity, and motion feature consistency, or according to the spatial proximity, texture pixel value similarity, and temporal proximity, and a weighted average of the pixel values of the reference pixels in the reference pixel set is computed to obtain the filtering result of the pixel to be filtered. Because the spatial proximity is calculated from positions in real three-dimensional space, and because a three-dimensional video consists of a series of images captured at different times so that pixels in different frames are also correlated, weights that account for temporal proximity give stronger frame-to-frame continuity after filtering, and additionally accounting for the consistency of motion features between pixels improves the accuracy of the filtering result, thereby solving the problem of low filtering accuracy in the prior art.

Brief Description of the Drawings

To describe the technical solutions in the embodiments of the present invention or in the prior art more clearly, the accompanying drawings needed for describing the embodiments or the prior art are briefly introduced below. Obviously, the accompanying drawings in the following description show only some embodiments of the present invention, and persons of ordinary skill in the art may derive other drawings from them without creative effort.

FIG. 1 is a schematic diagram of how a prior-art bilateral filter computes proximity;

FIG. 2 is a flowchart of Embodiment 1 of the three-dimensional video filtering method of the present invention;

FIG. 3 is a schematic diagram of pixel projection in Embodiment 1 of the method of the present invention;

FIG. 4 is a flowchart of Embodiment 2 of the three-dimensional video filtering method of the present invention;

FIG. 5 is a schematic diagram of reference pixel selection in Embodiment 2 of the method of the present invention;

FIG. 6 is a schematic structural diagram of an embodiment of the three-dimensional video filtering device of the present invention;

FIG. 7 is a schematic structural diagram of an embodiment of the three-dimensional video filtering apparatus of the present invention.

Detailed Description of the Embodiments

To make the objectives, technical solutions, and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings in the embodiments. Obviously, the described embodiments are only some rather than all of the embodiments of the present invention. All other embodiments obtained by persons of ordinary skill in the art based on the embodiments of the present invention without creative effort shall fall within the protection scope of the present invention.

FIG. 2 is a flowchart of Embodiment 1 of the three-dimensional video filtering method of the present invention, and FIG. 3 is a schematic diagram of pixel projection in Embodiment 1 of the method of the present invention. As shown in FIG. 2, the method of this embodiment may include:

Step 201: Project pixels in an image plane into three-dimensional space, the pixels including a pixel to be filtered and a reference pixel set.

Optionally, projecting the pixels in the image plane into three-dimensional space includes:

projecting the pixels from the image plane into three-dimensional space using the depth image information, viewpoint position information, and reference camera parameter information provided by the three-dimensional video, the depth image information including the depth pixel values of the pixels.

Optionally, projecting the pixels from the image plane into three-dimensional space using the depth image information, viewpoint position information, and reference camera parameter information provided by the three-dimensional video includes:

calculating the coordinate of a pixel after projection into three-dimensional space according to the formula $P = R^{-1}(dA^{-1}p - t)$;

where R and t are the rotation matrix and translation vector of the reference camera, A is the reference camera parameter matrix

$$A = \begin{bmatrix} f_x & r & o_x \\ 0 & f_y & o_y \\ 0 & 0 & 1 \end{bmatrix},$$

$p = \begin{bmatrix} u & v & 1 \end{bmatrix}^T$ is the coordinate of the pixel in the image plane, $P = \begin{bmatrix} x & y & z \end{bmatrix}^T$ is the coordinate of the pixel in three-dimensional space, and d is the depth pixel value of the pixel; $f_x$ and $f_y$ are the normalized focal lengths in the horizontal and vertical directions respectively, r is the radial distortion coefficient, and $(o_x, o_y)$ is the coordinate of the reference point on the image plane, the reference point being the intersection of the optical axis of the reference camera and the image plane.

As shown in FIG. 3, the plane in which the uv coordinates lie is the image plane, and pixel positions in three-dimensional space are expressed in the world coordinate system. p is a pixel in the image plane whose coordinate in the image plane is $p = \begin{bmatrix} u & v & 1 \end{bmatrix}^T$. Using three-dimensional projection, the pixel is projected to point P in the world coordinate system; its coordinate in the world coordinate system, $P = \begin{bmatrix} x & y & z \end{bmatrix}^T$, can be calculated by the formula $P = R^{-1}(dA^{-1}p - t)$, where R and t are the rotation matrix and translation vector of the reference camera, d is the depth pixel value of the pixel, which can be obtained from the depth map information provided by the three-dimensional video, and A is the reference camera parameter matrix

$$A = \begin{bmatrix} f_x & r & o_x \\ 0 & f_y & o_y \\ 0 & 0 & 1 \end{bmatrix}.$$

Here $f_x$ and $f_y$ are the normalized focal lengths in the horizontal and vertical directions respectively, r is the radial distortion coefficient, and $(o_x, o_y)$ is the coordinate of the reference point on the image plane, the reference point being the intersection of the optical axis of the reference camera and the image plane.
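A minimal numpy sketch of the back-projection P = R^{-1}(dA^{-1}p - t) described above is given below; the example intrinsic values (focal lengths, principal point) and the identity pose are placeholders for illustration, not parameters from the patent.

```python
import numpy as np

def project_to_world(u, v, d, A, R, t):
    """Back-project image-plane pixel (u, v) with depth d into world coordinates
    using P = R^{-1} (d * A^{-1} * p - t)."""
    p = np.array([u, v, 1.0])                     # homogeneous image-plane coordinate
    return np.linalg.inv(R) @ (d * np.linalg.inv(A) @ p - t)

# Example with placeholder camera parameters
A = np.array([[1000.0,    0.0, 320.0],            # f_x, r (here 0), o_x
              [   0.0, 1000.0, 240.0],            # f_y, o_y
              [   0.0,    0.0,   1.0]])
R = np.eye(3)                                     # identity rotation
t = np.zeros(3)                                   # zero translation
P = project_to_world(100, 80, d=2.5, A=A, R=R, t=t)   # 3D point for pixel (100, 80)
```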

Step 202: Calculate the spatial proximity of the pixel to be filtered and a reference pixel in the reference pixel set in the three-dimensional space according to their coordinate values in the three-dimensional space, where the reference pixel set and the pixel to be filtered are in the same frame of image.

Optionally, the spatial proximity is calculated by taking the distance between the pixel to be filtered and the reference pixel in three-dimensional space as the input value of a function whose output value increases as the input value decreases.

Specifically, the spatial distance between two points reflects their spatial proximity: the closer the distance, the stronger the correlation and the greater the spatial proximity. In other words, the spatial distance is computed from the coordinate values of the pixel to be filtered and the reference pixel in the reference pixel set in three-dimensional space, and this distance is then used as the input value of, for example, a Gaussian function to obtain the spatial proximity. Other functions may also be used, provided that the output value of the function increases as the input value decreases. In this embodiment, the reference pixel set is in the same frame as the pixel to be filtered.
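For example, a Gaussian weight over the three-dimensional distance satisfies this description; the kernel choice and the parameter sigma_s in the sketch below are assumptions for illustration.

```python
import numpy as np

def spatial_proximity(P, Q, sigma_s=1.0):
    """f_S: larger when the two pixels are closer in real 3D space."""
    dist = np.linalg.norm(np.asarray(P, dtype=float) - np.asarray(Q, dtype=float))
    return np.exp(-dist ** 2 / (2 * sigma_s ** 2))   # output grows as dist shrinks
```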

Step 203: Calculate the texel value similarity of the pixel to be filtered and the reference pixel according to the texel values of the pixel to be filtered and the reference pixel in the reference pixel set.

Optionally, the texel value similarity is calculated with the difference between the texel values of the pixel to be filtered and the reference pixel as the input value of a function; the output value of the function increases as the input value decreases.

Specifically, the degree of difference between the texture features of two points reflects their similarity: the more similar the textures, the stronger the correlation and the greater the texel value similarity. That is, the difference between the texel values of the pixel to be filtered and the reference pixel is calculated, and the texel value similarity is then calculated with this difference as the input value, for example through a Gaussian function. Other functions may also be used to calculate the texel value similarity, as long as the output value of the function increases as the input value decreases.
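A corresponding sketch for the texel-value similarity weight, again assuming a Gaussian form; the function name and sigma_t are illustrative assumptions.

```python
import numpy as np

def texture_similarity(T_p, T_q, sigma_t=10.0):
    """Gaussian choice for f_T(T_p, T_q) = f_T(||T_p - T_q||): the weight grows
    as the texel-value difference shrinks."""
    return np.exp(-(float(T_p) - float(T_q)) ** 2 / (2.0 * sigma_t ** 2))
```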

Step 204: Calculate the motion feature consistency of the pixel to be filtered and the reference pixel according to the texel values of the pixel to be filtered, of the reference pixel in the reference pixel set, and of the pixels at the same positions in the frame preceding the frame in which the pixel to be filtered is located.

Optionally, the motion feature consistency is obtained by calculating whether the motion features of the pixel to be filtered and the reference pixel are consistent, including:

when the difference between the texel value of the pixel to be filtered and that of the pixel at the corresponding position in the previous frame, and the difference between the texel value of the reference pixel and that of the pixel at the corresponding position in the previous frame, are both greater than or both less than a preset threshold, determining that the motion features of the pixel to be filtered and the reference pixel are consistent; otherwise, determining that the motion features of the pixel to be filtered and the reference pixel are inconsistent.

Specifically, the relative motion relationship between two points also reflects their motion similarity: the more similar the motion, the stronger the correlation. Since it is difficult to obtain the motion information of a pixel from a three-dimensional video sequence, the embodiment of the present invention uses the difference between the texel values at the same image-plane position in two consecutive frames to judge whether a pixel is moving: when the difference is greater than a preset threshold, the motion feature of the pixel is considered to be "moving"; otherwise, it is considered to be "not moving". Further, the differences between the texel values of the pixel to be filtered and of the reference pixel at the same image-plane positions in the two consecutive frames are used to judge whether the motion features of the pixel to be filtered and the reference pixel are consistent: when the two differences are both greater than or both less than the preset threshold, the motion features of the pixel to be filtered and the reference pixel are considered consistent; otherwise, they are inconsistent. If the motion features of the two pixels are consistent, they are considered correlated; otherwise, they are considered uncorrelated.
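A minimal sketch of this motion-feature consistency test; the function name and the default threshold are illustrative assumptions (the description only states that the threshold is preset and suggests a range of roughly 6 to 20).

```python
def motion_consistency(T_p, T_p_prev, T_q, T_q_prev, th=10):
    """f_M: returns 1 when the pixel to be filtered and the reference pixel are
    both 'moving' or both 'not moving' relative to the previous frame, else 0."""
    p_moving = abs(T_p - T_p_prev) >= th
    q_moving = abs(T_q - T_q_prev) >= th
    return 1.0 if p_moving == q_moving else 0.0
```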

Step 205: Determine the filtering weights according to the spatial proximity, the texel value similarity and the motion feature consistency, and perform a weighted average of the pixel values of the reference pixels in the reference pixel set to obtain the filtering result of the pixel to be filtered.

Optionally, when filtering a depth image, the filtering weights are determined according to the spatial proximity, the texel value similarity corresponding to the depth pixels and the motion feature consistency, and a weighted average of the depth pixel values of the reference pixels in the reference pixel set is performed to obtain the filtering result of the depth pixel value of the pixel to be filtered of the depth image; or,

when filtering a texture image, the filtering weights are determined according to the spatial proximity, the texel similarity and the motion feature consistency, and a weighted average of the texel values of the reference pixels in the reference pixel set is performed to obtain the filtering result of the texel value of the pixel to be filtered of the texture image.

Optionally, determining the filtering weights according to the spatial proximity, the texel value similarity and the motion feature consistency, and performing a weighted average of the pixel values of the reference pixels in the reference pixel set to obtain the filtering result of the pixel to be filtered, includes:

calculating the filtering result of the depth pixel value of the pixel to be filtered according to formula (1):

$$D_p' = \frac{\sum_{q \in K} f_S(P,Q)\, f_T(T_p,T_q)\, f_M(M_p,M_q)\, D_q}{\sum_{q \in K} f_S(P,Q)\, f_T(T_p,T_q)\, f_M(M_p,M_q)};\qquad(1)$$

or calculating the filtering result of the texel value of the pixel to be filtered according to formula (2):

$$T_p' = \frac{\sum_{q \in K} f_S(P,Q)\, f_T(T_p,T_q)\, f_M(M_p,M_q)\, T_q}{\sum_{q \in K} f_S(P,Q)\, f_T(T_p,T_q)\, f_M(M_p,M_q)};\qquad(2)$$

where $f_S(P,Q) = f_S(\lVert P-Q\rVert)$ is used to calculate the spatial proximity of the pixel to be filtered and the reference pixel;

$f_T(T_p,T_q) = f_T(\lVert T_p-T_q\rVert)$ is used to calculate the texel value similarity of the pixel to be filtered and the reference pixel;

$$f_M(M_p,M_q) = f_M(\lVert M_p-M_q\rVert) = \begin{cases}1, & \text{if } \big(|T_p-T_{p'}|<th\big) \oplus \big(|T_q-T_{q'}|<th\big) = 0\\ 0, & \text{else}\end{cases}$$

is used to calculate the motion feature consistency of the pixel to be filtered and the reference pixel ($\oplus$ denotes exclusive OR, so the value is 1 when the two conditions are both satisfied or both not satisfied);

where p is the pixel to be filtered, q is a reference pixel, K is the reference pixel set, $D_p'$ is the filtered depth pixel value of p, $D_q$ is the depth pixel value of q, P and Q are the coordinate values of p and q in the three-dimensional space, $T_p$ and $T_q$ are the texel values of p and q, $T_{p'}$ and $T_{q'}$ are the texel values at the same positions as p and q in the previous frame, $T_p'$ is the filtered texel value of p, and th is the preset texel difference threshold.

Specifically, the filtering result of the depth pixel value of the pixel to be filtered can be obtained by performing a weighted average of the reference pixels in the reference pixel set according to formula (1) above, and the filtering result of the texel value of the pixel to be filtered can be obtained according to formula (2) above;

where $f_S(P,Q) = f_S(\lVert P-Q\rVert)$ is used to calculate the spatial proximity of the pixel to be filtered and the reference pixel; the input value of this function is the spatial distance between the pixel to be filtered and the reference pixel, and the output value of the function increases as the input value decreases;

$f_T(T_p,T_q) = f_T(\lVert T_p-T_q\rVert)$ is used to calculate the texel value similarity of the pixel to be filtered and the reference pixel; the input value of this function is the difference between the texel values of the pixel to be filtered and the reference pixel, and the output value of the function increases as the input value decreases;

$f_M(M_p,M_q)$, as defined above, is used to calculate the motion feature consistency of the pixel to be filtered and the reference pixel; that is, when the difference between the texel value of the pixel to be filtered and that of the pixel at the corresponding position in the previous frame, and the difference between the texel value of the reference pixel and that of the pixel at the corresponding position in the previous frame, are both greater than or both less than the preset threshold, the motion features of the pixel to be filtered and the reference pixel are determined to be consistent; otherwise, they are determined to be inconsistent.

Here p is the pixel to be filtered, q is a reference pixel, and K is the reference pixel set; this set is usually taken as a square region centered on the pixel to be filtered, of size 5×5 or 7×7. $D_p'$ is the filtered depth pixel value of p, $D_q$ is the depth pixel value of q, P and Q are the coordinate values of p and q in the three-dimensional space, $T_p$ and $T_q$ are the texel values of p and q, $T_{p'}$ and $T_{q'}$ are the texel values at the same positions as p and q in the previous frame (the same position here means the corresponding position in the image plane), $T_p'$ is the filtered texel value of p, and th is the preset texel difference threshold. th is the threshold for judging whether the motion features of the pixels are consistent; it can be chosen according to the content of the three-dimensional video sequence and is generally 6 to 20. When chosen appropriately, it can better distinguish the boundaries of moving objects, making object boundaries more distinct after filtering.
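Putting the three weights together, the following sketch illustrates formula (1) for one depth pixel. It assumes Gaussian forms for f_S and f_T, a precomputed array of 3D coordinates obtained from the projection step, and illustrative parameter values; none of these names or defaults come from the patent itself.

```python
import numpy as np

def filter_pixel_single_frame(p_idx, depth, texture, texture_prev, points3d,
                              window=5, sigma_s=1.0, sigma_t=10.0, th=10):
    """Weighted average of formula (1): filtered depth of pixel p over a square
    window K of reference pixels q, with weights f_S * f_T * f_M."""
    y0, x0 = p_idx
    h, w = depth.shape
    half = window // 2
    P = points3d[y0, x0]                              # 3D coordinate of p
    T_p, T_p_prev = float(texture[y0, x0]), float(texture_prev[y0, x0])
    p_moving = abs(T_p - T_p_prev) >= th

    num, den = 0.0, 0.0
    for y in range(max(0, y0 - half), min(h, y0 + half + 1)):
        for x in range(max(0, x0 - half), min(w, x0 + half + 1)):
            Q = points3d[y, x]
            T_q, T_q_prev = float(texture[y, x]), float(texture_prev[y, x])
            f_s = np.exp(-np.sum((P - Q) ** 2) / (2 * sigma_s ** 2))
            f_t = np.exp(-(T_p - T_q) ** 2 / (2 * sigma_t ** 2))
            f_m = 1.0 if (abs(T_q - T_q_prev) >= th) == p_moving else 0.0
            wgt = f_s * f_t * f_m
            num += wgt * depth[y, x]
            den += wgt
    return num / den if den > 0 else depth[y0, x0]
```

Replacing `depth[y, x]` with `texture[y, x]` in the accumulation gives the texture-image case of formula (2).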

It should be noted that, in this embodiment, step 202, step 203 and step 204 may be performed in any order.

In this embodiment, the relationship between the pixel to be filtered and the reference pixels in the real three-dimensional space is used to calculate the spatial proximity of the pixel to be filtered and the reference pixel in the three-dimensional space, the texel value similarity, and the motion feature consistency; the filtering weights are determined according to the spatial proximity, the texel value similarity and the motion feature consistency, and a weighted average of the pixel values of the reference pixels in the reference pixel set is performed to obtain the filtering result of the pixel to be filtered. The spatial proximity, texel value similarity and motion feature consistency are jointly considered when calculating the weights; since the spatial proximity is calculated from positions in the real three-dimensional space, and the motion feature consistency between pixels is additionally taken into account, the accuracy of the filtering result is improved, solving the problem in the prior art that the accuracy of the filtering result is not high.

Fig. 4 is a flowchart of Embodiment 2 of the three-dimensional video filtering method of the present invention, and Fig. 5 is a schematic diagram of reference pixel selection in Embodiment 2. As shown in Fig. 4, the method of this embodiment may include:

Step 401: Project the pixels in the image plane into three-dimensional space; the pixels include the pixel to be filtered and the reference pixel set.

Optionally, projecting the pixels in the image plane into the three-dimensional space includes:

projecting the pixels from the image plane into the three-dimensional space using the depth image information, viewpoint position information and reference camera parameter information provided by the three-dimensional video; the depth image information includes the depth pixel values of the pixels.

Optionally, projecting the pixels from the image plane into the three-dimensional space using the depth image information, viewpoint position information and reference camera parameter information provided by the three-dimensional video includes:

calculating the coordinate value of the pixel after projection into the three-dimensional space according to the formula $P = R^{-1}(dA^{-1}p - t)$;

where R and t are the rotation matrix and translation vector of the reference camera, A is the reference camera parameter matrix

$$A = \begin{bmatrix} f_x & r & o_x \\ 0 & f_y & o_y \\ 0 & 0 & 1 \end{bmatrix},$$

$p = (u, v, 1)^T$ is the coordinate value of the pixel in the image plane, $P = (x, y, z)^T$ is the coordinate value of the pixel in the three-dimensional space, and d is the depth pixel value of the pixel; $f_x$ and $f_y$ are the normalized focal lengths in the horizontal and vertical directions respectively, r is the radial distortion coefficient, and $(o_x, o_y)$ is the coordinate value of the reference point on the image plane; the reference point is the intersection of the optical axis of the reference camera and the image plane.

Step 402: Calculate the spatial proximity, in the three-dimensional space, of the pixel to be filtered and each reference pixel in the reference pixel set according to their coordinate values in the three-dimensional space; the reference pixel set is located in the same frame of image as the pixel to be filtered and in adjacent frames of images.

Optionally, the spatial proximity is calculated with the distance between the pixel to be filtered and the reference pixel in the three-dimensional space as the input value of a function; the output value of the function increases as the input value decreases.

Step 403: Calculate the texel value similarity of the pixel to be filtered and the reference pixel according to the texel values of the pixel to be filtered and the reference pixel in the reference pixel set.

Optionally, the texel value similarity is calculated with the difference between the texel values of the pixel to be filtered and the reference pixel as the input value of a function; the output value of the function increases as the input value decreases.

The implementation principles of step 401, step 402 and step 403 are similar to those in Embodiment 1 and are not repeated here.

Step 404: Calculate the temporal proximity of the pixel to be filtered and the reference pixel according to the time interval between the frame in which the pixel to be filtered is located and the frame in which the reference pixel in the reference pixel set is located.

Optionally, the temporal proximity is calculated with the time interval between the frames in which the pixel to be filtered and the reference pixel are located as the input value of a function; the output value of the function increases as the input value decreases.

Specifically, since the pixels of different frames jointly reflect the state of an object over a period of time, a certain correlation exists between them, and this correlation can also be used in the filtering method for images in three-dimensional video. Therefore, on the basis of the weight calculation method of Embodiment 1 above, the present invention extends the selection range of the reference pixels from the frame in which the current pixel to be filtered is located to its adjacent frames (filtering reference frames), so as to increase the continuity between frames after filtering. As shown in Fig. 5, in each filtering reference frame the selection range of the reference pixels is consistent with that in the frame to be filtered, where the N-th frame is the frame in which the current pixel to be filtered is located, the m frames before it and the n frames after it are selected as filtering reference frames, and the reference pixel window of each frame has the same coordinates and size as the reference pixel window of the N-th frame (that is, the same coordinates and size in the image plane). $K_i$ is the reference pixel set in the i-th frame, i takes integer values in the interval [N-m, N+n], and m and n are non-negative integers; when n = 0, reference pixels are obtained only from frames that have already been encoded or decoded before the frame to be filtered.

The distance between two points in the time domain reflects their temporal proximity: the shorter the temporal distance, the stronger the correlation and the greater the temporal proximity. That is, the time interval can be calculated from the frames in which the pixel to be filtered and the reference pixel are located, and the temporal proximity is then calculated with this time interval as the input value, for example through a Gaussian function. Other functions may also be used to calculate the temporal proximity, as long as the output value of the function increases as the input value decreases.
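A minimal sketch of one possible temporal-proximity weight, assuming the Gaussian form mentioned above; the function name and sigma_tem are illustrative assumptions.

```python
import numpy as np

def temporal_proximity(i, N, sigma_tem=1.0):
    """Gaussian choice for f_tem(i, N) = f_tem(||i - N||), where N is the frame
    index of the pixel to be filtered and i the frame index of the reference pixel."""
    return np.exp(-((i - N) ** 2) / (2.0 * sigma_tem ** 2))
```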

Step 405: Determine the filtering weights according to the spatial proximity, the texel value similarity and the temporal proximity, and perform a weighted average of the pixel values of the reference pixels in the reference pixel set to obtain the filtering result of the pixel to be filtered.

Optionally, when filtering a depth image, the filtering weights are determined according to the spatial proximity, the texel value similarity corresponding to the depth pixels and the temporal proximity, and a weighted average of the depth pixel values of the reference pixels in the reference pixel set is performed to obtain the filtering result of the depth pixel value of the pixel to be filtered of the depth image; or,

when filtering a texture image, the filtering weights are determined according to the spatial proximity, the texel similarity and the temporal proximity, and a weighted average of the texel values of the reference pixels in the reference pixel set is performed to obtain the filtering result of the texel value of the pixel to be filtered of the texture image.

Optionally, determining the filtering weights according to the spatial proximity, the texel value similarity and the temporal proximity, and performing a weighted average of the pixel values of the reference pixels in the reference pixel set to obtain the filtering result of the pixel to be filtered, includes:

calculating the filtering result of the depth pixel value of the pixel to be filtered according to formula (3):

$$D_p' = \frac{\sum_{i=N-m}^{N+n}\Big(\sum_{q_i \in K_i} f_S(P,Q_i)\, f_T(T_p,T_{q_i})\, f_{tem}(i,N)\, D_{q_i}\Big)}{\sum_{i=N-m}^{N+n}\Big(\sum_{q_i \in K_i} f_S(P,Q_i)\, f_T(T_p,T_{q_i})\, f_{tem}(i,N)\Big)};\qquad(3)$$

or calculating the filtering result of the texel value of the pixel to be filtered according to formula (4):

$$T_p' = \frac{\sum_{i=N-m}^{N+n}\Big(\sum_{q_i \in K_i} f_S(P,Q_i)\, f_T(T_p,T_{q_i})\, f_{tem}(i,N)\, T_{q_i}\Big)}{\sum_{i=N-m}^{N+n}\Big(\sum_{q_i \in K_i} f_S(P,Q_i)\, f_T(T_p,T_{q_i})\, f_{tem}(i,N)\Big)};\qquad(4)$$

where $f_S(P,Q_i) = f_S(\lVert P-Q_i\rVert)$ is used to calculate the spatial proximity of the pixel to be filtered and the reference pixel;

$f_T(T_p,T_{q_i}) = f_T(\lVert T_p-T_{q_i}\rVert)$ is used to calculate the texel value similarity of the pixel to be filtered and the reference pixel;

$f_{tem}(i,N) = f_{tem}(\lVert i-N\rVert)$ is used to calculate the temporal proximity of the pixel to be filtered and the reference pixel;

where N is the frame number of the frame in which the pixel to be filtered is located, i is the frame number of the frame in which the reference pixel is located and takes integer values in the interval [N-m, N+n], m and n are the numbers of reference frames before and after the frame in which the pixel to be filtered is located respectively, m and n are non-negative integers, p is the pixel to be filtered, $q_i$ is a reference pixel in the i-th frame, $K_i$ is the reference pixel set in the i-th frame, $D_p'$ is the filtered depth pixel value of p, $D_{q_i}$ is the depth pixel value of $q_i$ in the i-th frame, P and $Q_i$ are the coordinate values of p and of $q_i$ in the i-th frame in the three-dimensional space, $T_p$ and $T_{q_i}$ are the texel values of p and of $q_i$ in the i-th frame respectively, and $T_p'$ is the filtered texel value of p.

Specifically, the filtering result of the depth pixel value of the pixel to be filtered can be obtained by performing a weighted average of the reference pixels in the reference pixel set according to formula (3) above, and the filtering result of the texel value of the pixel to be filtered can be obtained according to formula (4) above;

where $f_S(P,Q_i) = f_S(\lVert P-Q_i\rVert)$ is used to calculate the spatial proximity of the pixel to be filtered and the reference pixel; the input value of this function is the spatial distance between the pixel to be filtered and the reference pixel, and the output value of the function increases as the input value decreases;

$f_T(T_p,T_{q_i}) = f_T(\lVert T_p-T_{q_i}\rVert)$ is used to calculate the texel value similarity of the pixel to be filtered and the reference pixel; the input value of this function is the difference between the texel values of the pixel to be filtered and the reference pixel, and the output value of the function increases as the input value decreases;

$f_{tem}(i,N) = f_{tem}(\lVert i-N\rVert)$ is used to calculate the temporal proximity of the pixel to be filtered and the reference pixel; the input value of this function is the time interval between the frames in which the pixel to be filtered and the reference pixel are located, and the output value of the function increases as the input value decreases;

where N is the frame number of the frame in which the pixel to be filtered is located, i is the frame number of the frame in which the reference pixel is located, and m and n are the numbers of reference frames before and after the frame in which the pixel to be filtered is located respectively. Usually m and n may be 1 to 3, because as the time interval increases the correlation between frames becomes very small and can be ignored. p is the pixel to be filtered, $q_i$ is a reference pixel in the i-th frame, and $K_i$ is the reference pixel set in the i-th frame; this set is usually taken as a square region centered on the pixel to be filtered, of size 5×5 or 7×7. $D_p'$ is the filtered depth pixel value of p, $D_{q_i}$ is the depth pixel value of $q_i$ in the i-th frame, P and $Q_i$ are the coordinate values of p and of $q_i$ in the i-th frame in the three-dimensional space, $T_p$ and $T_{q_i}$ are the texel values of p and of $q_i$ in the i-th frame respectively, and $T_p'$ is the filtered texel value of p.
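The following sketch illustrates formula (3) for one depth pixel over the frames N-m to N+n, under the same assumptions as the single-frame sketch above: Gaussian choices for f_S, f_T and f_tem, per-frame lists of depth, texture and 3D-coordinate arrays, and illustrative parameter values.

```python
import numpy as np

def filter_pixel_multi_frame(p_idx, N, depths, textures, points3d,
                             m=1, n=1, window=5,
                             sigma_s=1.0, sigma_t=10.0, sigma_tem=1.0):
    """Weighted average of formula (3): reference pixels are taken from the
    same window in frames N-m .. N+n, each weighted by f_S * f_T * f_tem."""
    y0, x0 = p_idx
    half = window // 2
    h, w = depths[N].shape
    P = points3d[N][y0, x0]                           # 3D coordinate of p
    T_p = float(textures[N][y0, x0])

    num, den = 0.0, 0.0
    for i in range(max(0, N - m), min(len(depths), N + n + 1)):
        f_tem = np.exp(-((i - N) ** 2) / (2 * sigma_tem ** 2))
        for y in range(max(0, y0 - half), min(h, y0 + half + 1)):
            for x in range(max(0, x0 - half), min(w, x0 + half + 1)):
                Q_i = points3d[i][y, x]
                T_qi = float(textures[i][y, x])
                f_s = np.exp(-np.sum((P - Q_i) ** 2) / (2 * sigma_s ** 2))
                f_t = np.exp(-(T_p - T_qi) ** 2 / (2 * sigma_t ** 2))
                wgt = f_s * f_t * f_tem
                num += wgt * depths[i][y, x]
                den += wgt
    return num / den if den > 0 else depths[N][y0, x0]
```

Using the texel values instead of the depth values in the accumulation gives formula (4).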

It should be noted that, in this embodiment, step 402, step 403 and step 404 may be performed in any order.

In this embodiment, the relationship between the pixel to be filtered and the reference pixels in the real three-dimensional space is used to calculate the spatial proximity of the pixel to be filtered and the reference pixel in the three-dimensional space, the texel value similarity and the temporal proximity; the filtering weights are determined according to the spatial proximity, the texel value similarity and the temporal proximity, and a weighted average of the pixel values of the reference pixels in the reference pixel set is performed to obtain the filtering result of the pixel to be filtered. The spatial proximity, texel value similarity and temporal proximity are jointly considered when calculating the weights. Since the spatial proximity is calculated from positions in the real three-dimensional space, and since a three-dimensional video consists of a series of images captured at different times so that correlation also exists between pixels of different frames, taking the temporal proximity into account in the weights gives stronger continuity between frames after filtering, improves the accuracy of the filtering result, and solves the problem in the prior art that the accuracy of the filtering result is not high.

Fig. 6 is a schematic structural diagram of an embodiment of the three-dimensional video filtering apparatus of the present invention. As shown in Fig. 6, the three-dimensional video filtering apparatus 60 of this embodiment may include a projection module 601, a calculation module 602 and a filtering module 603;

the projection module 601 is configured to project the pixels in the image plane into three-dimensional space; the pixels include the pixel to be filtered and the reference pixel set;

the calculation module 602 is configured to calculate the spatial proximity, in the three-dimensional space, of the pixel to be filtered and the reference pixel in the reference pixel set according to their coordinate values in the three-dimensional space; the reference pixel set is in the same frame of image as the pixel to be filtered;

the calculation module 602 is further configured to calculate the texel value similarity of the pixel to be filtered and the reference pixel according to the texel values of the pixel to be filtered and the reference pixel in the reference pixel set;

the calculation module 602 is further configured to calculate the motion feature consistency of the pixel to be filtered and the reference pixel according to the texel values of the pixel to be filtered, of the reference pixel in the reference pixel set, and of the pixels at the same positions in the frame preceding the frame in which the pixel to be filtered is located;

the filtering module 603 is configured to determine the filtering weights according to the spatial proximity, the texel value similarity and the motion feature consistency, and to perform a weighted average of the pixel values of the reference pixels in the reference pixel set to obtain the filtering result of the pixel to be filtered.

Optionally, the filtering module 603 is specifically configured to:

when filtering a depth image, determine the filtering weights according to the spatial proximity, the texel value similarity corresponding to the depth pixels and the motion feature consistency, and perform a weighted average of the depth pixel values of the reference pixels in the reference pixel set to obtain the filtering result of the depth pixel value of the pixel to be filtered of the depth image; or,

when filtering a texture image, determine the filtering weights according to the spatial proximity, the texel similarity and the motion feature consistency, and perform a weighted average of the texel values of the reference pixels in the reference pixel set to obtain the filtering result of the texel value of the pixel to be filtered of the texture image.

Optionally, the projection module 601 is specifically configured to:

project the pixels from the image plane into the three-dimensional space using the depth image information, viewpoint position information and reference camera parameter information provided by the three-dimensional video; the depth image information includes the depth pixel values of the pixels.

Optionally, the projection module 601 is specifically configured to:

calculate the coordinate value of the pixel after projection into the three-dimensional space according to the formula $P = R^{-1}(dA^{-1}p - t)$;

where R and t are the rotation matrix and translation vector of the reference camera, A is the reference camera parameter matrix

$$A = \begin{bmatrix} f_x & r & o_x \\ 0 & f_y & o_y \\ 0 & 0 & 1 \end{bmatrix},$$

$p = (u, v, 1)^T$ is the coordinate value of the pixel in the image plane, $P = (x, y, z)^T$ is the coordinate value of the pixel in the three-dimensional space, and d is the depth pixel value of the pixel; $f_x$ and $f_y$ are the normalized focal lengths in the horizontal and vertical directions respectively, r is the radial distortion coefficient, and $(o_x, o_y)$ is the coordinate value of the reference point on the image plane; the reference point is the intersection of the optical axis of the reference camera and the image plane.

Optionally, the spatial proximity is calculated with the distance between the pixel to be filtered and the reference pixel in the three-dimensional space as the input value of a function; the output value of the function increases as the input value decreases;

the texel value similarity is calculated with the difference between the texel values of the pixel to be filtered and the reference pixel as the input value of a function; the output value of the function increases as the input value decreases;

the motion feature consistency is obtained by calculating whether the motion features of the pixel to be filtered and the reference pixel are consistent, including:

when the difference between the texel value of the pixel to be filtered and that of the pixel at the corresponding position in the previous frame, and the difference between the texel value of the reference pixel and that of the pixel at the corresponding position in the previous frame, are both greater than or both less than a preset threshold, determining that the motion states of the pixel to be filtered and the reference pixel are consistent; otherwise, determining that the motion states of the pixel to be filtered and the reference pixel are inconsistent.

Optionally, the filtering module 603 is specifically configured to:

calculate the filtering result of the depth pixel value of the pixel to be filtered according to formula (1) above; or,

calculate the filtering result of the texel value of the pixel to be filtered according to formula (2) above;

where $f_S(P,Q) = f_S(\lVert P-Q\rVert)$ is used to calculate the spatial proximity of the pixel to be filtered and the reference pixel;

$f_T(T_p,T_q) = f_T(\lVert T_p-T_q\rVert)$ is used to calculate the texel value similarity of the pixel to be filtered and the reference pixel;

$$f_M(M_p,M_q) = f_M(\lVert M_p-M_q\rVert) = \begin{cases}1, & \text{if } \big(|T_p-T_{p'}|<th\big) \oplus \big(|T_q-T_{q'}|<th\big) = 0\\ 0, & \text{else}\end{cases}$$

is used to calculate the motion feature consistency of the pixel to be filtered and the reference pixel;

where p is the pixel to be filtered, q is a reference pixel, K is the reference pixel set, $D_p'$ is the filtered depth pixel value of p, $D_q$ is the depth pixel value of q, P and Q are the coordinate values of p and q in the three-dimensional space, $T_p$ and $T_q$ are the texel values of p and q, $T_{p'}$ and $T_{q'}$ are the texel values at the same positions as p and q in the previous frame, $T_p'$ is the filtered texel value of p, and th is the preset texel difference threshold.

The apparatus of this embodiment may be used to carry out the technical solution of the method embodiment shown in Fig. 2; its implementation principle and technical effect are similar and are not repeated here.

In Embodiment 2 of the three-dimensional video filtering apparatus of the present invention, the apparatus is based on the structure shown in Fig. 6. Further, in the three-dimensional video filtering apparatus 60 of this embodiment, the projection module 601 is configured to project the pixels in the image plane into three-dimensional space; the pixels include the pixel to be filtered and the reference pixel set;

the calculation module 602 is configured to calculate the spatial proximity, in the three-dimensional space, of the pixel to be filtered and the reference pixel in the reference pixel set according to their coordinate values in the three-dimensional space; the reference pixel set is located in the same frame of image as the pixel to be filtered and in adjacent frames of images;

the calculation module 602 is further configured to calculate the texel value similarity of the pixel to be filtered and the reference pixel according to the texel values of the pixel to be filtered and the reference pixel in the reference pixel set;

the calculation module 602 is further configured to calculate the temporal proximity of the pixel to be filtered and the reference pixel according to the time interval between the frames in which the pixel to be filtered and the reference pixel in the reference pixel set are located;

the filtering module 603 is configured to determine the filtering weights according to the spatial proximity, the texel value similarity and the temporal proximity, and to perform a weighted average of the pixel values of the reference pixels in the reference pixel set to obtain the filtering result of the pixel to be filtered.

Optionally, the filtering module 603 is specifically configured to:

when filtering a depth image, determine the filtering weights according to the spatial proximity, the texel value similarity corresponding to the depth pixels and the temporal proximity, and perform a weighted average of the depth pixel values of the reference pixels in the reference pixel set to obtain the filtering result of the depth pixel value of the pixel to be filtered of the depth image; or,

when filtering a texture image, determine the filtering weights according to the spatial proximity, the texel similarity and the temporal proximity, and perform a weighted average of the texel values of the reference pixels in the reference pixel set to obtain the filtering result of the texel value of the pixel to be filtered of the texture image.

Optionally, the projection module 601 is specifically configured to:

project the pixels from the image plane into the three-dimensional space using the depth image information, viewpoint position information and reference camera parameter information provided by the three-dimensional video; the depth image information includes the depth pixel values of the pixels.

Optionally, the projection module 601 is specifically configured to:

calculate the coordinate value of the pixel after projection into the three-dimensional space according to the formula $P = R^{-1}(dA^{-1}p - t)$;

where R and t are the rotation matrix and translation vector of the reference camera, A is the reference camera parameter matrix

$$A = \begin{bmatrix} f_x & r & o_x \\ 0 & f_y & o_y \\ 0 & 0 & 1 \end{bmatrix},$$

$p = (u, v, 1)^T$ is the coordinate value of the pixel in the image plane, $P = (x, y, z)^T$ is the coordinate value of the pixel in the three-dimensional space, and d is the depth pixel value of the pixel; $f_x$ and $f_y$ are the normalized focal lengths in the horizontal and vertical directions respectively, r is the radial distortion coefficient, and $(o_x, o_y)$ is the coordinate value of the reference point on the image plane; the reference point is the intersection of the optical axis of the reference camera and the image plane.

Optionally, the spatial proximity is calculated with the distance between the pixel to be filtered and the reference pixel in the three-dimensional space as the input value of a function; the output value of the function increases as the input value decreases;

the texel value similarity is calculated with the difference between the texel values of the pixel to be filtered and the reference pixel as the input value of a function; the output value of the function increases as the input value decreases;

the temporal proximity is calculated with the time interval between the frames in which the pixel to be filtered and the reference pixel are located as the input value of a function; the output value of the function increases as the input value decreases.

Optionally, the filtering module 603 is specifically configured to:

calculate the filtering result of the depth pixel value of the pixel to be filtered according to formula (3) above; or,

calculate the filtering result of the texel value of the pixel to be filtered according to formula (4) above;

where $f_S(P,Q_i) = f_S(\lVert P-Q_i\rVert)$ is used to calculate the spatial proximity of the pixel to be filtered and the reference pixel;

$f_T(T_p,T_{q_i}) = f_T(\lVert T_p-T_{q_i}\rVert)$ is used to calculate the texel value similarity of the pixel to be filtered and the reference pixel;

$f_{tem}(i,N) = f_{tem}(\lVert i-N\rVert)$ is used to calculate the temporal proximity of the pixel to be filtered and the reference pixel;

where N is the frame number of the frame in which the pixel to be filtered is located, i is the frame number of the frame in which the reference pixel is located and takes integer values in the interval [N-m, N+n], m and n are the numbers of reference frames before and after the frame in which the pixel to be filtered is located respectively, m and n are non-negative integers, p is the pixel to be filtered, $q_i$ is a reference pixel in the i-th frame, $K_i$ is the reference pixel set in the i-th frame, $D_p'$ is the filtered depth pixel value of p, $D_{q_i}$ is the depth pixel value of $q_i$ in the i-th frame, P and $Q_i$ are the coordinate values of p and of $q_i$ in the i-th frame in the three-dimensional space, $T_p$ and $T_{q_i}$ are the texel values of p and of $q_i$ in the i-th frame respectively, and $T_p'$ is the filtered texel value of p.

The apparatus of this embodiment may be used to carry out the technical solution of the method embodiment shown in Fig. 4; its implementation principle and technical effect are similar and are not repeated here.

Fig. 7 is a schematic structural diagram of an embodiment of the three-dimensional video filtering device of the present invention. As shown in Fig. 7, the three-dimensional video filtering device 70 provided in this embodiment includes a processor 701 and a memory 702. The memory 702 is used to store execution instructions. When the three-dimensional video filtering device 70 runs, the processor 701 communicates with the memory 702, and the processor 701 invokes the execution instructions in the memory 702 to carry out the technical solution of any of the above method embodiments; its implementation principle and technical effect are similar and are not repeated here.

In the several embodiments provided in this application, it should be understood that the disclosed devices and methods may be implemented in other ways. For example, the device embodiments described above are merely illustrative; the division into units or modules is only a logical functional division, and there may be other divisions in actual implementation: multiple units or modules may be combined or integrated into another system, or some features may be omitted or not implemented. In addition, the mutual coupling, direct coupling or communication connection shown or discussed may be indirect coupling or communication connection through interfaces, devices or modules, and may be electrical, mechanical or in other forms.

The modules described as separate components may or may not be physically separated, and the components shown as modules may or may not be physical modules; that is, they may be located in one place or distributed over multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment.

Those of ordinary skill in the art will understand that all or part of the steps of the above method embodiments may be carried out by hardware related to program instructions. The aforementioned program may be stored in a computer-readable storage medium; when executed, the program performs the steps of the above method embodiments. The aforementioned storage medium includes various media that can store program code, such as ROM, RAM, a magnetic disk or an optical disk.

Finally, it should be noted that the above embodiments are only used to illustrate the technical solutions of the present invention, not to limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that they may still modify the technical solutions described in the foregoing embodiments, or make equivalent replacements of some or all of the technical features therein, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the scope of the technical solutions of the embodiments of the present invention.

Claims (20)

1. A method for filtering three-dimensional video, comprising:
projecting pixels in an image plane into three-dimensional space, the pixels comprising a pixel to be filtered and a reference pixel set;
calculating the spatial proximity of the pixel to be filtered and the reference pixel in the three-dimensional space according to the coordinate values of the pixel to be filtered and the reference pixel in the reference pixel set in the three-dimensional space, wherein the reference pixel set and the pixel to be filtered are in the same frame of image;
calculating the similarity of the texture pixel values of the pixel to be filtered and the reference pixel according to the texture pixel values of the pixel to be filtered and the reference pixel in the reference pixel set;
calculating the consistency of the motion characteristics of the pixel to be filtered and the reference pixel according to the texture pixel values of the pixel to be filtered, of the reference pixel in the reference pixel set, and of the pixels at the same positions in the frame preceding the frame in which the pixel to be filtered is located; and
when filtering a depth image, determining filtering weight values according to the spatial proximity, the similarity of the texture pixel values corresponding to the depth pixels and the motion characteristic consistency, and performing a weighted average of the depth pixel values of the reference pixels in the reference pixel set to obtain a filtering result of the depth pixel value of the pixel to be filtered of the depth image.
2. The method of claim 1, wherein projecting pixels in an image plane into three-dimensional space comprises:
projecting the pixels from the image plane into the three-dimensional space by using depth image information, viewpoint position information and reference camera parameter information provided by the three-dimensional video, the depth image information comprising the depth pixel values of the pixels.
3. The method of claim 2, wherein projecting the pixels from the image plane into the three-dimensional space by using the depth image information, the viewpoint position information and the reference camera parameter information provided by the three-dimensional video comprises:
calculating the coordinate value of the pixel after projection into the three-dimensional space according to the formula $P = R^{-1}(dA^{-1}p - t)$;
wherein R and t are the rotation matrix and translation vector of the reference camera, A is the reference camera parameter matrix
$$A = \begin{bmatrix} f_x & r & o_x \\ 0 & f_y & o_y \\ 0 & 0 & 1 \end{bmatrix},$$
$p = (u, v, 1)^T$ is the coordinate value of the pixel in the image plane, $P = (x, y, z)^T$ is the coordinate value of the pixel in the three-dimensional space, and d is the depth pixel value of the pixel; $f_x$ and $f_y$ are the normalized focal lengths in the horizontal and vertical directions respectively, r is the radial distortion coefficient, and $(o_x, o_y)$ is the coordinate value of the reference point on the image plane; the reference point is the intersection of the optical axis of the reference camera and the image plane.
4. A method according to any one of claims 1 to 3, characterized in that said spatial proximity is calculated from the distance of said pixel to be filtered and said reference pixel in three-dimensional space as an input value of a function; the output value of the function increases with decreasing input value;
the texture pixel value similarity is obtained by calculating the difference value of the texture pixel values of the pixel to be filtered and the reference pixel as the input value of a function; the output value of the function increases with decreasing input value;
the motion characteristic consistency is obtained by calculating whether the motion characteristics of the pixel to be filtered and the reference pixel are consistent, and the motion characteristic consistency comprises the following steps:
when the difference between the texture pixel value of the pixel to be filtered and that of the pixel at the corresponding position in the previous frame, and the difference between the texture pixel value of the reference pixel and that of the pixel at the corresponding position in the previous frame, are both greater than a preset threshold value or both less than the preset threshold value, determining that the motion states of the pixel to be filtered and the reference pixel are consistent; otherwise, determining that the motion states of the pixel to be filtered and the reference pixel are inconsistent.
5. The method of claim 1, wherein determining a filtering weight value according to the spatial proximity, the texture pixel value similarity and the motion characteristic consistency, and respectively performing weighted averaging on the pixel values of the reference pixels in the reference pixel set to obtain the filtering result of the pixel to be filtered, comprises:
according to formula (1):
D_p' = ( Σ_{q∈K} f_P(P, Q) · f_T(T_p, T_q) · f_M(T_p, T_q, T_p', T_q') · D_q ) / ( Σ_{q∈K} f_P(P, Q) · f_T(T_p, T_q) · f_M(T_p, T_q, T_p', T_q') )    (1)
calculating the filtering result of the depth pixel value of the pixel to be filtered;
wherein f_P(P, Q) = f_P(||P - Q||) is used for calculating the spatial proximity of the pixel to be filtered and the reference pixel;
f_T(T_p, T_q) = f_T(||T_p - T_q||) is used for calculating the similarity of the texture pixel values of the pixel to be filtered and the reference pixel;
f_M(T_p, T_q, T_p', T_q') is used for calculating the motion characteristic consistency of the pixel to be filtered and the reference pixel;
wherein p is the pixel to be filtered, q is a reference pixel, K is the reference pixel set, D_p' is the filtered depth pixel value of p, D_q is the depth pixel value of q, P and Q are the coordinate values of p and q in three-dimensional space, T_p and T_q are the texture pixel values of p and q, T_p' and T_q' are the texture pixel values of the pixels at the same positions as p and q in the previous frame, and th is a preset texture pixel difference threshold value.
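The following is a minimal sketch of the weighted average stated in formula (1), assuming Gaussian kernels for f_P and f_T and a simple binary-with-penalty form for the motion-consistency term f_M; the kernel shapes, sigma values, the penalty value and the variable names are illustrative assumptions rather than definitions taken from the claims.

import numpy as np

def f_P(dist, sigma_s=1.0):
    # Spatial proximity: grows as the 3-D distance between p and q shrinks.
    return np.exp(-(dist ** 2) / (2.0 * sigma_s ** 2))

def f_T(diff, sigma_t=10.0):
    # Texture similarity: grows as the texture pixel difference shrinks.
    return np.exp(-(diff ** 2) / (2.0 * sigma_t ** 2))

def f_M(T_p, T_q, T_p_prev, T_q_prev, th=5.0, penalty=0.1):
    # Motion-feature consistency per claim 4: consistent when the two pixels'
    # frame-to-frame texel changes fall on the same side of the threshold th.
    moving_p = abs(T_p - T_p_prev) > th
    moving_q = abs(T_q - T_q_prev) > th
    return 1.0 if moving_p == moving_q else penalty  # the penalty value is an assumption

def filter_depth(p, refs):
    # p: dict with keys "P" (3-D coordinate), "T" (texel value), "T_prev" (co-located
    # texel in the previous frame); refs: list of reference-pixel dicts with keys
    # "Q" (3-D coordinate), "T", "T_prev" and "D" (depth pixel value).
    num = den = 0.0
    for q in refs:
        w = (f_P(np.linalg.norm(p["P"] - q["Q"]))
             * f_T(abs(p["T"] - q["T"]))
             * f_M(p["T"], q["T"], p["T_prev"], q["T_prev"]))
        num += w * q["D"]
        den += w
    return num / den if den > 0.0 else None  # filtered depth value D_p'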
6. A method for filtering three-dimensional video, comprising:
projecting pixels in an image plane into three-dimensional space; the pixels comprise pixels to be filtered and reference pixel sets;
calculating the spatial proximity of the pixel to be filtered and the reference pixel in the three-dimensional space according to the coordinate values of the pixel to be filtered and the reference pixel in the reference pixel set in the three-dimensional space; wherein the reference pixel set is located in the same frame image as the pixel to be filtered and in a plurality of adjacent frame images;
calculating the similarity of the texture pixel values of the pixel to be filtered and the reference pixel according to the texture pixel values of the pixel to be filtered and the reference pixel in the reference pixel set;
calculating the time domain proximity of the pixel to be filtered and the reference pixel according to the time interval of the frame where the pixel to be filtered and the reference pixel in the reference pixel set are located;
when the depth image is filtered, determining a filtering weight value according to the spatial proximity, the similarity of the texture pixel values corresponding to the depth pixels and the time domain proximity, and respectively performing weighted average on the depth pixel values of the reference pixels in the reference pixel set to obtain a filtering result of the depth pixel values of the pixels to be filtered of the depth image.
7. The method of claim 6, wherein projecting pixels in an image plane into a three-dimensional space comprises:
projecting the pixels from an image plane to a three-dimensional space by using depth image information, viewpoint position information and reference camera parameter information provided by a three-dimensional video; the depth image information includes a depth pixel value of the pixel.
8. The method of claim 7, wherein projecting the pixel from the image plane into the three-dimensional space using the depth image information, the viewpoint position information, and the reference camera parameter information provided by the three-dimensional video comprises:
according to the formula P = R^(-1)(dA^(-1)p - t), calculating the coordinate value of the pixel projected into the three-dimensional space;
wherein R and t are the rotation matrix and the translation vector of the reference camera, A is the reference camera parameter matrix, p is the coordinate value of the pixel in the image plane, P is the coordinate value of the pixel in the three-dimensional space, and d is the depth pixel value of the pixel; f_x and f_y are the normalized focal lengths in the horizontal and vertical directions respectively, r is the radial distortion coefficient, and (o_x, o_y) is the coordinate value of the reference point on the image plane; the reference point is the intersection of the optical axis of the reference camera and the image plane.
9. The method according to any one of claims 6 to 8, wherein the spatial proximity is obtained by calculating a function that takes the distance between the pixel to be filtered and the reference pixel in three-dimensional space as its input value; the output value of the function increases with decreasing input value;
the texture pixel value similarity is obtained by calculating the difference value of the texture pixel values of the pixel to be filtered and the reference pixel as the input value of a function; the output value of the function increases with decreasing input value;
the time domain proximity is obtained by calculating the time interval of the frame where the pixel to be filtered and the reference pixel are located as the input value of a function; the output value of the function increases with decreasing input value.
10. The method of claim 6, wherein determining a filtering weight value according to the spatial proximity, the texture pixel value similarity and the temporal proximity, and respectively performing weighted averaging on the pixel values of the reference pixels in the reference pixel set to obtain the filtering result of the pixel to be filtered, comprises:
according to formula (3):
D_p' = ( Σ_{i=N-m}^{N+n} Σ_{q_i∈K_i} f_P(P, Q_i) · f_T(T_p, T_{q_i}) · f_tem(i, N) · D_{q_i} ) / ( Σ_{i=N-m}^{N+n} Σ_{q_i∈K_i} f_P(P, Q_i) · f_T(T_p, T_{q_i}) · f_tem(i, N) )    (3)
calculating the filtering result of the depth pixel value of the pixel to be filtered;
wherein f_P(P, Q_i) = f_P(||P - Q_i||) is used for calculating the spatial proximity of the pixel to be filtered and the reference pixel;
f_T(T_p, T_{q_i}) = f_T(||T_p - T_{q_i}||) is used for calculating the similarity of the texture pixel values of the pixel to be filtered and the reference pixel;
f_tem(i, N) = f_tem(|i - N|) is used for calculating the temporal proximity of the pixel to be filtered and the reference pixel;
wherein N is the frame number of the frame where the pixel to be filtered is located, i is the frame number of the frame where the reference pixel is located, i takes integer values in the interval [N - m, N + n], m and n are respectively the numbers of reference frames before and after the frame where the pixel to be filtered is located, m and n are non-negative integers, p is the pixel to be filtered, q_i is a reference pixel in the i-th frame, K_i is the set of reference pixels in the i-th frame, D_p' is the filtered depth pixel value of p, D_{q_i} is the depth pixel value of q_i in the i-th frame, P and Q_i are the coordinate values of p and q_i in three-dimensional space, and T_p and T_{q_i} are respectively the texture pixel values of p and of q_i in the i-th frame.
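A comparable sketch for the spatio-temporal weighted average of formula (3), where the motion-consistency term is replaced by a temporal-proximity kernel over the frames i in [N - m, N + n]; the exponential kernels and their parameters are again assumptions made only for illustration.

import numpy as np

def f_P(dist, sigma_s=1.0):
    # Spatial proximity kernel (assumed Gaussian).
    return np.exp(-(dist ** 2) / (2.0 * sigma_s ** 2))

def f_T(diff, sigma_t=10.0):
    # Texture similarity kernel (assumed Gaussian).
    return np.exp(-(diff ** 2) / (2.0 * sigma_t ** 2))

def f_tem(i, N, sigma_f=2.0):
    # Temporal proximity: grows as the frame interval |i - N| shrinks.
    return np.exp(-abs(i - N) / sigma_f)

def filter_depth_temporal(p, refs_by_frame, N):
    # p: dict with keys "P" (3-D coordinate) and "T" (texel value).
    # refs_by_frame: {frame index i: list of reference-pixel dicts with keys
    # "Q" (3-D coordinate), "T" and "D"}, covering the frames in [N - m, N + n].
    num = den = 0.0
    for i, refs in refs_by_frame.items():
        for q in refs:
            w = f_P(np.linalg.norm(p["P"] - q["Q"])) * f_T(abs(p["T"] - q["T"])) * f_tem(i, N)
            num += w * q["D"]
            den += w
    return num / den if den > 0.0 else None  # filtered depth value D_p'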
11. A three-dimensional video filtering apparatus, comprising:
a projection module for projecting pixels in an image plane into a three-dimensional space; the pixels comprise pixels to be filtered and reference pixel sets;
the calculation module is used for calculating the spatial proximity of the pixel to be filtered and the reference pixel in the three-dimensional space according to the coordinate values of the pixel to be filtered and the reference pixel in the reference pixel set in the three-dimensional space; wherein the reference pixel set and the pixel to be filtered are in the same frame of image;
the calculating module is further configured to calculate the similarity of the texture pixel values of the pixel to be filtered and the reference pixel according to the texture pixel values of the pixel to be filtered and the reference pixel in the reference pixel set;
the calculating module is further configured to calculate the motion characteristic consistency of the pixel to be filtered and the reference pixel according to the texture pixel values of the pixel to be filtered, the reference pixel in the reference pixel set, and the pixels at the same positions in the previous frame image of the frame where the pixel to be filtered is located;
and the filtering module is used for determining a filtering weight value according to the spatial proximity, the similarity of texture pixel values corresponding to the depth pixels and the motion characteristic consistency when filtering the depth image, and respectively carrying out weighted average on the depth pixel values of the reference pixels in the reference pixel set to obtain a filtering result of the depth pixel values of the pixels to be filtered of the depth image.
12. The apparatus of claim 11, wherein the projection module is specifically configured to:
projecting the pixels from an image plane to a three-dimensional space by using depth image information, viewpoint position information and reference camera parameter information provided by a three-dimensional video; the depth image information includes a depth pixel value of the pixel.
13. The apparatus of claim 12, wherein the projection module is specifically configured to:
according to the formula P = R^(-1)(dA^(-1)p - t), calculating the coordinate value of the pixel projected into the three-dimensional space;
wherein R and t are the rotation matrix and the translation vector of the reference camera, A is the reference camera parameter matrix, p is the coordinate value of the pixel in the image plane, P is the coordinate value of the pixel in the three-dimensional space, and d is the depth pixel value of the pixel; f_x and f_y are the normalized focal lengths in the horizontal and vertical directions respectively, r is the radial distortion coefficient, and (o_x, o_y) is the coordinate value of the reference point on the image plane; the reference point is the intersection of the optical axis of the reference camera and the image plane.
14. The apparatus according to any one of claims 11-13, wherein the spatial proximity is obtained by calculating a function that takes the distance between the pixel to be filtered and the reference pixel in three-dimensional space as its input value; the output value of the function increases with decreasing input value;
the texture pixel value similarity is obtained by calculating the difference value of the texture pixel values of the pixel to be filtered and the reference pixel as the input value of a function; the output value of the function increases with decreasing input value;
the motion characteristic consistency is obtained by calculating whether the motion characteristics of the pixel to be filtered and the reference pixel are consistent, and the motion characteristic consistency comprises the following steps:
when the difference between the texture pixel value of the pixel to be filtered and that of the pixel at the corresponding position in the previous frame, and the difference between the texture pixel value of the reference pixel and that of the pixel at the corresponding position in the previous frame, are both greater than a preset threshold value or both less than the preset threshold value, determining that the motion states of the pixel to be filtered and the reference pixel are consistent; otherwise, determining that the motion states of the pixel to be filtered and the reference pixel are inconsistent.
15. The apparatus of claim 11, wherein the filtering module is specifically configured to:
according to formula (1):
D_p' = ( Σ_{q∈K} f_P(P, Q) · f_T(T_p, T_q) · f_M(T_p, T_q, T_p', T_q') · D_q ) / ( Σ_{q∈K} f_P(P, Q) · f_T(T_p, T_q) · f_M(T_p, T_q, T_p', T_q') )    (1)
calculating the filtering result of the depth pixel value of the pixel to be filtered;
wherein f_P(P, Q) = f_P(||P - Q||) is used for calculating the spatial proximity of the pixel to be filtered and the reference pixel;
f_T(T_p, T_q) = f_T(||T_p - T_q||) is used for calculating the similarity of the texture pixel values of the pixel to be filtered and the reference pixel;
f_M(T_p, T_q, T_p', T_q') is used for calculating the motion characteristic consistency of the pixel to be filtered and the reference pixel;
wherein p is the pixel to be filtered, q is a reference pixel, K is the reference pixel set, D_p' is the filtered depth pixel value of p, D_q is the depth pixel value of q, P and Q are the coordinate values of p and q in three-dimensional space, T_p and T_q are the texture pixel values of p and q, T_p' and T_q' are the texture pixel values of the pixels at the same positions as p and q in the previous frame, and th is a preset texture pixel difference threshold value.
16. A three-dimensional video filtering apparatus, comprising:
a projection module for projecting pixels in an image plane into a three-dimensional space; the pixels comprise pixels to be filtered and reference pixel sets;
the calculation module is used for calculating the spatial proximity of the pixel to be filtered and the reference pixel in the three-dimensional space according to the coordinate values of the pixel to be filtered and the reference pixel in the reference pixel set in the three-dimensional space; wherein the reference pixel set is located in the same frame image as the pixel to be filtered and in a plurality of adjacent frame images;
the calculating module is further configured to calculate the similarity of the texture pixel values of the pixel to be filtered and the reference pixel according to the texture pixel values of the pixel to be filtered and the reference pixel in the reference pixel set;
the calculating module is further configured to calculate time domain proximity of the pixel to be filtered and the reference pixel according to a time interval of a frame where the pixel to be filtered and the reference pixel in the reference pixel set are located;
and the filtering module is used for determining a filtering weight value according to the spatial proximity, the similarity of texture pixel values corresponding to the depth pixels and the time domain proximity when filtering the depth image, and respectively carrying out weighted average on the depth pixel values of the reference pixels in the reference pixel set to obtain a filtering result of the depth pixel values of the pixels to be filtered of the depth image.
17. The apparatus of claim 16, wherein the projection module is specifically configured to:
projecting the pixels from an image plane to a three-dimensional space by using depth image information, viewpoint position information and reference camera parameter information provided by a three-dimensional video; the depth image information includes a depth pixel value of the pixel.
18. The apparatus of claim 17, wherein the projection module is specifically configured to:
according to the formula P = R^(-1)(dA^(-1)p - t), calculating the coordinate value of the pixel projected into the three-dimensional space;
wherein R and t are the rotation matrix and the translation vector of the reference camera, A is the reference camera parameter matrix, p is the coordinate value of the pixel in the image plane, P is the coordinate value of the pixel in the three-dimensional space, and d is the depth pixel value of the pixel; f_x and f_y are the normalized focal lengths in the horizontal and vertical directions respectively, r is the radial distortion coefficient, and (o_x, o_y) is the coordinate value of the reference point on the image plane; the reference point is the intersection of the optical axis of the reference camera and the image plane.
19. The apparatus according to any one of claims 16-18, wherein the spatial proximity is obtained by calculating a function that takes the distance between the pixel to be filtered and the reference pixel in three-dimensional space as its input value; the output value of the function increases with decreasing input value;
the texture pixel value similarity is obtained by calculating the difference value of the texture pixel values of the pixel to be filtered and the reference pixel as the input value of a function; the output value of the function increases with decreasing input value;
the time domain proximity is obtained by calculating the time interval of the frame where the pixel to be filtered and the reference pixel are located as the input value of a function; the output value of the function increases with decreasing input value.
20. The apparatus of claim 16, wherein the filtering module is specifically configured to:
according to formula (3):
D_p' = ( Σ_{i=N-m}^{N+n} Σ_{q_i∈K_i} f_P(P, Q_i) · f_T(T_p, T_{q_i}) · f_tem(i, N) · D_{q_i} ) / ( Σ_{i=N-m}^{N+n} Σ_{q_i∈K_i} f_P(P, Q_i) · f_T(T_p, T_{q_i}) · f_tem(i, N) )    (3)
calculating the filtering result of the depth pixel value of the pixel to be filtered;
wherein f_P(P, Q_i) = f_P(||P - Q_i||) is used for calculating the spatial proximity of the pixel to be filtered and the reference pixel;
f_T(T_p, T_{q_i}) = f_T(||T_p - T_{q_i}||) is used for calculating the similarity of the texture pixel values of the pixel to be filtered and the reference pixel;
f_tem(i, N) = f_tem(|i - N|) is used for calculating the temporal proximity of the pixel to be filtered and the reference pixel;
wherein N is the frame number of the frame where the pixel to be filtered is located, i is the frame number of the frame where the reference pixel is located, i takes integer values in the interval [N - m, N + n], m and n are respectively the numbers of reference frames before and after the frame where the pixel to be filtered is located, m and n are non-negative integers, p is the pixel to be filtered, q_i is a reference pixel in the i-th frame, K_i is the set of reference pixels in the i-th frame, D_p' is the filtered depth pixel value of p, D_{q_i} is the depth pixel value of q_i in the i-th frame, P and Q_i are the coordinate values of p and q_i in three-dimensional space, and T_p and T_{q_i} are respectively the texture pixel values of p and of q_i in the i-th frame.
CN201410265360.4A 2014-06-13 2014-06-13 Three-dimensional video filtering method and device Active CN104010180B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201410265360.4A CN104010180B (en) 2014-06-13 2014-06-13 Three-dimensional video filtering method and device
PCT/CN2015/077707 WO2015188666A1 (en) 2014-06-13 2015-04-28 Three-dimensional video filtering method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201410265360.4A CN104010180B (en) 2014-06-13 2014-06-13 Three-dimensional video filtering method and device

Publications (2)

Publication Number Publication Date
CN104010180A CN104010180A (en) 2014-08-27
CN104010180B true CN104010180B (en) 2017-01-25

Family

ID=51370655

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410265360.4A Active CN104010180B (en) 2014-06-13 2014-06-13 Three-dimensional video filtering method and device

Country Status (2)

Country Link
CN (1) CN104010180B (en)
WO (1) WO2015188666A1 (en)

Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104010180B (en) * 2014-06-13 2017-01-25 华为技术有限公司 Three-dimensional video filtering method and device
CN104683783B (en) * 2015-01-08 2017-03-15 电子科技大学 Self-adaptive depth map filtering method
CN105959663B (en) * 2016-05-24 2018-09-21 厦门美图之家科技有限公司 The successional optimized treatment method of video interframe signal, system and camera terminal
CN107959855B (en) * 2016-10-16 2020-02-14 华为技术有限公司 Motion compensated prediction method and apparatus
CN108111851B (en) * 2016-11-25 2020-12-22 华为技术有限公司 Deblocking filtering method and terminal
CN108833879A (en) * 2018-06-29 2018-11-16 东南大学 A Method of Virtual Viewpoint Synthesis with Spatiotemporal Continuity
CN109191506B (en) * 2018-08-06 2021-01-29 深圳看到科技有限公司 Depth map processing method, system and computer readable storage medium
CN115187491B (en) * 2022-09-08 2023-02-17 阿里巴巴(中国)有限公司 Image denoising processing method, image filtering processing method and device

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1474358A (en) * 2002-08-08 2004-02-11 GEҽ��ϵͳ���������޹�˾ Three dimension space filter device and method
CN1836634A (en) * 2004-12-27 2006-09-27 Ge医疗系统环球技术有限公司 Four-dimensional labeling apparatus, n-dimensional labeling apparatus, four-dimensional spatial filter apparatus, and n-dimensional spatial filter apparatus
CN102238316A (en) * 2010-04-29 2011-11-09 北京科迪讯通科技有限公司 Self-adaptive real-time denoising scheme for 3D digital video image
CN103369209A (en) * 2013-07-31 2013-10-23 上海通途半导体科技有限公司 Video noise reduction device and video noise reduction method

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101651772B (en) * 2009-09-11 2011-03-16 宁波大学 Method for extracting video interested region based on visual attention
CN102271262B (en) * 2010-06-04 2015-05-13 三星电子株式会社 Multithread-based video processing method for 3D (Three-Dimensional) display
TWI439119B (en) * 2010-09-20 2014-05-21 Nat Univ Chung Cheng A method depth information processing and its application device
JP2013059016A (en) * 2011-08-12 2013-03-28 Sony Corp Image processing device, method, and program
CN104010180B (en) * 2014-06-13 2017-01-25 华为技术有限公司 Three-dimensional video filtering method and device

Also Published As

Publication number Publication date
CN104010180A (en) 2014-08-27
WO2015188666A1 (en) 2015-12-17

Similar Documents

Publication Publication Date Title
CN104010180B (en) Three-dimensional video filtering method and device
CN104504671B (en) Method for generating virtual-real fusion image for stereo display
US7573475B2 (en) 2D to 3D image conversion
US20180329485A1 (en) Generation of virtual reality with 6 degrees of freedom from limited viewer data
CN109660783B (en) Virtual reality parallax correction
CN107798704B (en) Real-time image superposition method and device for augmented reality
WO2019041351A1 (en) Real-time aliasing rendering method for 3d vr video and virtual three-dimensional scene
CN101873509B (en) Method for eliminating background and edge shake of depth map sequence
US11830148B2 (en) Reconstruction of essential visual cues in mixed reality applications
US11670045B2 (en) Method and apparatus for constructing a 3D geometry
AU2022231680B2 (en) Techniques for re-aging faces in images and video frames
CN114648482A (en) Quality evaluation method and system for three-dimensional panoramic image
CN112789660A (en) System and method for extrapolating two-dimensional images using depth information
CN116828165A (en) Image processing methods and devices, storage media, electronic equipment
CN114049442B (en) Three-dimensional face sight line calculation method
Angot et al. A 2D to 3D video and image conversion technique based on a bilateral filter
CN107798703B (en) Real-time image superposition method and device for augmented reality
CN102819662B (en) Computing method of video fluid height
CN113132706A (en) Controllable position virtual viewpoint generation method and device based on reverse mapping
US12081722B2 (en) Stereo image generation method and electronic apparatus using the same
CN104320649B (en) A kind of multi-view depth figure Enhancement Method based on total probability model
CN107103620A (en) The depth extraction method of many pumped FIR laser cameras of spatial sampling under a kind of visual angle based on individual camera
CN111243099A (en) Method and device for processing image and method and device for displaying image in AR (augmented reality) device
CN115187491B (en) Image denoising processing method, image filtering processing method and device
US20240078726A1 (en) Multi-camera face swapping

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant