CN101840574B - Depth estimation method based on edge pixel characteristics - Google Patents
- Publication number: CN101840574B (application CN201010149504A)
- Authority: CN (China)
- Prior art keywords: pixel, edge, parallax, depth, image
- Prior art date: 2010-04-16
- Legal status: Expired - Fee Related
- Classifications: Image Analysis; Image Processing
Description
Technical Field
The invention belongs to the field of communications and relates to depth estimation in three-dimensional stereoscopic video. Specifically, it is a depth estimation method capable of producing a high-precision depth map, so that the synthesized virtual view contains fewer noise points along object edges, effectively improving both the subjective and objective quality of the synthesized virtual view. The method can be applied in free-viewpoint television (FTV) systems.
Background Art
In a traditional television system, the user can observe only a limited view of the three-dimensional world, and the viewpoint and viewing angle are determined by the spatial position and orientation of the camera; the user therefore cannot freely choose where to look from. A free-viewpoint television (FTV) system allows the user to observe a real three-dimensional scene from different viewpoints, providing a new, more vivid, and more realistic three-dimensional audio-visual experience. Because FTV offers users a more realistic, interactive viewing experience, it can be widely applied in broadcast communication, entertainment, education, medical treatment, video surveillance, and other types of video systems.
Figure 1 shows the main functional modules of an FTV system. The data generated at the sending end comprises the video captured by a camera array at multiple viewpoints together with the corresponding scene depth data; the receiving end uses various depth-based virtual view synthesis techniques to generate the video for any viewpoint the user requests. The acquisition of high-quality depth data is therefore one of the key technologies for realizing an FTV system. In current FTV systems, a disparity value d is first obtained by disparity estimation and then converted into the corresponding depth value according to the following formula:
where I is the camera spacing, f is the focal length of the camera lens, and Z_near and Z_far are the depths of the nearest and farthest planes, respectively, of scene objects relative to the camera.
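Formula (1) itself is not reproduced in the text above. A conventional form consistent with these definitions, following the parallel-camera relation and MPEG-style 8-bit depth quantization (an assumption about the exact formula, not a verbatim reconstruction of the patent's equation), is:

$$
Z = \frac{f \cdot I}{d}, \qquad v = 255 \cdot \frac{1/Z - 1/Z_{\mathrm{far}}}{1/Z_{\mathrm{near}} - 1/Z_{\mathrm{far}}} \qquad (1)
$$

where v is the 8-bit value stored in the depth map, so that Z = Z_near maps to 255 and Z = Z_far maps to 0.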
At present, disparity is mainly estimated by graph-cut-based global optimization: an energy function representing pixel luminance inconsistency and disparity inconsistency is first constructed, and this energy function is then minimized by global optimization.
Energy minimization seeks the optimal disparity assignment f that minimizes the disparity-inconsistency energy function E, which consists of a luminance-inconsistency term for the pixels matched under the assigned disparities and a disparity-inconsistency term for neighboring pixels, as shown in equation (2):

E(f) = E_data(f) + E_smooth(f)    (2)

where E_data(f) = Σ_{p∈P} D_p(f_p) and E_smooth(f) = Σ_{{p,q}∈N} V_{{p,q}∈N}(f_p, f_q). P denotes the set of pixels in the current view, and D_p measures the luminance inconsistency between pixel p in the current view and pixel (p + f_p) in the reference view under disparity f_p, typically as the luminance difference D_p(f_p) = |I_p - I'_{p+f_p}|, where I_p is the luminance of pixel p in the current view and I'_{p+f_p} is the luminance of pixel (p + f_p) in the reference view. N denotes the set of neighboring pixel pairs, and V_{{p,q}∈N}(f_p, f_q) measures the disparity inconsistency between the current pixel p with disparity f_p and its neighbor q with disparity f_q. If:
V_{{p,q}∈N}(f_p, f_q) = λ|f_p - f_q|    (3)
then for a small |f_p - f_q| the corresponding disparity-inconsistency cost is small, while for a large |f_p - f_q| the cost increases accordingly, so in the resulting optimal disparity map the disparity differences between neighboring pixels are usually small. In practice, however, stereo matching must handle disparity discontinuities, especially at object edges, where the disparity of an edge pixel usually differs considerably from that of the surrounding pixels. Equation (3) therefore over-smooths object edges, and this over-smoothing blurs and aliases the edges in the synthesized view, seriously degrading the visual result.
For pixels along the same edge the disparities are essentially equal, and any variation is continuous and slow, so equation (3) is well suited there. However, an edge separates the projections of different objects in the image, and even adjacent pixels on opposite sides of it do not have consistent disparities. The applicability of equation (3) must therefore be examined before a truly effective depth estimation method can be produced.
The depth estimation method currently used in FTV systems minimizes the disparity-inconsistency energy function E by choosing the optimal disparity f_p, finds a suitable disparity value for every pixel, and then converts the disparity value d into the corresponding depth value through formula (1). The energy function E consists of a luminance-inconsistency term for corresponding pixels and a disparity-inconsistency term for neighboring pixels. The luminance-inconsistency term describes how well pixel p in the current view matches its corresponding pixel, found in the reference view through the assigned disparity f_p; the disparity-inconsistency term describes how much the disparity of pixel p differs from those of its surrounding neighbors: if p and a neighboring pixel q have different disparities, a corresponding inconsistency cost is added to the energy function.
Noting the special behavior of disparity at edge points, an existing depth estimation method treats edge pixels as pixels whose disparity jumps in a step relative to the surrounding pixels. To handle the depth of edge pixels, the method supplies edge side information for the scene images; the edge pixels indicated by this side information are assumed to carry a step disparity.
As shown in Figure 2, the existing depth estimation method first determines the state of the current pixel and its surrounding neighbors, that is, whether each pixel lies on an object edge in the image, and then defines the two disparity-inconsistency cost functions V_j and V_c of equation (4):

where E denotes the edge map.    (4)

That is, if neither pixel p nor pixel q lies on an edge, the disparity between them should change gently, so V_c takes an infinite value whenever a large disparity step occurs between them; if pixel p or pixel q, or both, lies on an edge, the disparity between them should exhibit a large step, so V_j likewise takes an infinite value whenever the disparity between them changes only gently.
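The piecewise definition referenced as equation (4) did not survive extraction; a plausible form, consistent with the behavior just described (the exact thresholds and finite costs are assumptions), is:

$$
V_{\{p,q\}\in N}(f_p, f_q) =
\begin{cases}
V_j(f_p, f_q), & E(p)=1 \ \text{or} \ E(q)=1,\\
V_c(f_p, f_q), & E(p)=0 \ \text{and} \ E(q)=0,
\end{cases}
$$

with V_j(f_p, f_q) = ∞ when |f_p - f_q| is small (a gentle change across an edge), V_c(f_p, f_q) = ∞ when |f_p - f_q| is large (a step away from any edge), and finite λ-weighted costs otherwise.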
Next, the pixel luminance inconsistency is computed, the disparity inconsistency is computed from the designed disparity-inconsistency functions, and the disparity is estimated with the energy minimization function.
Finally, the estimated disparity values are converted into the corresponding depth values through the disparity-to-depth conversion formula, completing the depth estimation.
The existing depth estimation method is based on the fact that the disparity of points along the same edge should vary gently, while asserting that the disparity of pixels on an image edge differs considerably from that of pixels off the edge. In practice, if an edge pixel and its neighbor belong to the same depth object, their disparity values should vary gently; only if they belong to different depth objects may the disparity difference be large. By indiscriminately assuming a large disparity difference between every edge pixel and its neighbors, the existing method produces erroneous disparity estimates at image edges, degrading the accuracy of the depth values there and lowering the quality of the synthesized virtual view.
Summary of the Invention
The object of the present invention is to address the problems of the existing depth estimation methods in FTV systems by proposing a depth estimation method based on the depth characteristics of edge pixels, improving the depth estimation results at object edges and thereby guaranteeing the quality of the virtual views synthesized at the receiving end of the FTV system.
The technical solution of the present invention is as follows: according to the position of the current pixel, the pixels of the image are divided into three classes; according to the depth characteristics of pixels at image edges, a corresponding disparity-inconsistency function is proposed for each class; disparity is estimated with the resulting disparity-inconsistency functions; and the disparity values are converted into the corresponding depth values. The specific steps include:
A. Classify the pixels in the image into three classes: the first class comprises pixels on an edge, the second class pixels beside an edge, and the third class non-edge pixels;
B. Based on the characteristic that a pixel lying on an object edge in the image has the same depth as the nearer object, design a corresponding disparity-inconsistency function for each pixel class:
B1) Disparity-inconsistency function for the first class of pixels:

If the neighboring pixel of the current pixel also lies on an object edge in the image, the corresponding disparity-inconsistency function is:

V_{{p,q}∈N}(f_p, f_q) = λ|f_p - f_q|

where f_p is the disparity of the current pixel p, f_q is the disparity of its neighboring pixel q, and λ is a smoothing factor, set to 1;
If the neighboring pixel of the current pixel lies beside an object edge in the image, determine whether that neighbor is within the nearer object. If it is, the corresponding disparity-inconsistency function is:

V_{{p,q}∈N}(f_p, f_q) = λ|f_p - f_q|;

If the neighboring pixel is within the farther object, the disparity-inconsistency function is:

V_{{p,q}∈N}(f_p, f_q) = ∞;
The disparity-inconsistency function for the first class of pixels is therefore:

V_{{p,q}∈N}(f_p, f_q) = λ|f_p - f_q|, if E(q) = 1, or if E(q) = 0 and q lies within the nearer object;
V_{{p,q}∈N}(f_p, f_q) = ∞, if E(q) = 0 and q lies within the farther object,

where E(·) is the edge side information: E(·) = 1 indicates that the pixel lies on an object edge in the image, and E(·) = 0 indicates that it does not;
B2) Disparity-inconsistency function for the second class of pixels:

If the current pixel and its neighboring pixel both lie beside an object edge in the image, the corresponding disparity-inconsistency function is:

V_{{p,q}∈N}(f_p, f_q) = λ|f_p - f_q|;
If the neighboring pixel of the current pixel lies on an object edge in the image, determine whether the current pixel is within the nearer object. If it is, the corresponding disparity-inconsistency function is:

V_{{p,q}∈N}(f_p, f_q) = λ|f_p - f_q|;

If the current pixel is within the farther object, the disparity-inconsistency function is:

V_{{p,q}∈N}(f_p, f_q) = ∞;
The disparity-inconsistency function for the second class of pixels is therefore:

V_{{p,q}∈N}(f_p, f_q) = λ|f_p - f_q|, if E(q) = 0, or if E(q) = 1 and p lies within the nearer object;
V_{{p,q}∈N}(f_p, f_q) = ∞, if E(q) = 1 and p lies within the farther object;
B3) Disparity-inconsistency function for the third class of pixels:

If neither the current pixel nor any of its surrounding neighbors lies on an object edge in the image, the disparity-inconsistency function for such pixels is:

V_{{p,q}∈N}(f_p, f_q) = λ|f_p - f_q|, if E(p) = 0 and E(q) = 0;
C. Using the luminance-inconsistency function and the disparity-inconsistency functions obtained for the three pixel classes, compute the pixel luminance inconsistency and the pixel disparity inconsistency, respectively, and perform the corresponding disparity estimation with the energy minimization function;
D. Using the estimated disparity values and the disparity-to-depth conversion function, convert the disparity values into the corresponding depth values to complete the depth estimation.
Compared with the prior art, the present invention has the following advantages:
Because the invention fully exploits the depth characteristics of object-edge pixels in the image, and designs the disparity-inconsistency functions of the three pixel classes according to whether a pixel beside an object edge lies within the nearer object, the resulting depth map is more accurate at object edges than that of the existing depth estimation method in FTV systems. The synthesized virtual view consequently contains fewer noise points at object edges, effectively improving its subjective and objective quality.
Brief Description of the Drawings
Figure 1 is a block diagram of the functional modules of an FTV system;

Figure 2 is a flowchart of the existing depth estimation method;

Figure 3 is a flowchart of the depth estimation method of the present invention;

Figure 4 shows the depth maps obtained with the existing depth estimation method and with the depth estimation method of the present invention;

Figure 5 shows the synthesized virtual views obtained with the existing depth estimation method and with the depth estimation method of the present invention.
Detailed Description
Referring to Figure 3, the depth estimation method of the present invention comprises the following steps:
Step 1: classify the pixels in the image according to the position of the current pixel.
1A) If the current pixel lies on an object edge in the image, assign it to the first class: on-edge pixels;

1B) If the current pixel lies beside an object edge in the image and at least one of its surrounding neighbors lies on the edge, assign it to the second class: beside-edge pixels;

1C) If the current pixel does not lie on an object edge in the image and none of its surrounding neighbors lies on an edge either, assign it to the third class: non-edge pixels.
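As a concrete illustration of step 1, here is a minimal sketch in Python, assuming a binary edge map E (1 marks a pixel on an object edge) and a 4-neighborhood; the NumPy representation and function name are illustrative, not from the patent:

```python
import numpy as np

ON_EDGE, BESIDE_EDGE, NON_EDGE = 1, 2, 3  # the three pixel classes

def classify_pixels(edge_map: np.ndarray) -> np.ndarray:
    """Assign each pixel to one of the three classes of step 1.

    edge_map: binary array, 1 where the pixel lies on an object edge.
    """
    classes = np.full(edge_map.shape, NON_EDGE, dtype=np.uint8)
    classes[edge_map == 1] = ON_EDGE  # case 1A: pixel on an edge
    # Case 1B: off-edge pixel with at least one 4-neighbor on an edge.
    padded = np.pad(edge_map, 1, mode="constant")
    neighbor_on_edge = (
        (padded[:-2, 1:-1] == 1) | (padded[2:, 1:-1] == 1) |
        (padded[1:-1, :-2] == 1) | (padded[1:-1, 2:] == 1)
    )
    classes[(edge_map == 0) & neighbor_on_edge] = BESIDE_EDGE
    return classes  # remaining pixels are case 1C: non-edge
```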
Step 2: design the disparity-inconsistency functions for the three pixel classes according to the depth characteristics of object-edge pixels in the image.
The depth characteristics of object-edge pixels in an image are as follows:
(a) A scene consists of a space filled with objects. Each object in the scene has a contour, which appears in the image as an edge line. An edge line is the dividing line between the projections of different objects in the image, and the two sides of an edge are usually the projections of two different objects. The depth of an edge point therefore generally differs from the depths of its neighboring points, although it does not necessarily do so;

(b) Along the viewing direction, objects overlap one another from near to far, out to infinity. Overlapping objects form edges, and an edge point belongs to the nearer of the two overlapping objects; that is, the depth of an edge point is the same as, or close to, the depth of the nearer object;

(c) Although true edges are boundaries separating objects, images also contain false edges: edge-like lines inside a single object. The depth relationship between points on such lines and their neighbors is the same as that between ordinary neighboring points within the same object.
The disparity-inconsistency functions for the three pixel classes are as follows:
2A) Disparity-inconsistency function for the first class of pixels

For the first class, the current pixel lies on an object edge in the image. If its neighboring pixel also lies on the edge, the disparity of pixels along the same edge should vary slowly, so the corresponding disparity-inconsistency function is:

V_{{p,q}∈N}(f_p, f_q) = λ|f_p - f_q|

where f_p is the disparity of the current pixel p, f_q is the disparity of its neighboring pixel q, and λ is a smoothing factor, set to 1;
If the neighboring pixel of the current pixel lies beside an object edge in the image, it must be determined whether that neighbor is within the nearer object. Let the current pixel be p and let its left neighbor q lie beside the edge. Compare the disparity difference between neighbor q and the current pixel p with the disparity difference between neighbor q and pixel m, the neighbor to the left of q: if |f_q - f_p| ≤ |f_q - f_m|, pixel q is within the nearer object; otherwise, pixel q is within the farther object, where f_m is the disparity of pixel m;
If the neighboring pixel is within the nearer object then, since the current (edge) pixel itself belongs to the nearer object, the two lie within the same object. By depth characteristics (b) and (c), pixel p and its neighbor q belong to the same depth object and the disparity between them should vary slowly, so the corresponding disparity-inconsistency function is:

V_{{p,q}∈N}(f_p, f_q) = λ|f_p - f_q|;
If pixel q is within the farther object then, again by depth characteristics (b) and (c), it does not belong to the same depth object as pixel p on the object edge, and the disparity difference between them is large, so the corresponding disparity-inconsistency function is:

V_{{p,q}∈N}(f_p, f_q) = ∞;
The disparity-inconsistency function for the first class of pixels is therefore:

V_{{p,q}∈N}(f_p, f_q) = λ|f_p - f_q|, if E(q) = 1, or if E(q) = 0 and q lies within the nearer object;
V_{{p,q}∈N}(f_p, f_q) = ∞, if E(q) = 0 and q lies within the farther object,

where E(·) is the edge side information: E(·) = 1 indicates that the pixel lies on an object edge in the image, and E(·) = 0 indicates that it lies beside an edge;
2B) Disparity-inconsistency function for the second class of pixels

For the second class, the current pixel lies beside an object edge in the image. If its neighboring pixel also lies beside the edge, the disparity between the two should vary slowly, so the corresponding disparity-inconsistency function is:

V_{{p,q}∈N}(f_p, f_q) = λ|f_p - f_q|;
If the neighboring pixel of the current pixel lies on an object edge in the image, it must be determined whether the current pixel is within the nearer object. Let the current pixel be p and let its left neighbor q lie on the object edge; compare the disparity difference between neighbor q and the current pixel p with the disparity difference between neighbor q and pixel m, the neighbor to the left of q. If |f_q - f_p| ≤ |f_q - f_m|, pixel p is within the nearer object; otherwise, pixel p is within the farther object;
If the current pixel is within the nearer object then, since the neighboring (edge) pixel belongs to the nearer object, the two lie within the same object. Again by depth characteristics (b) and (c), pixel p and its neighbor q belong to the same depth object and the disparity between them should vary slowly, so the corresponding disparity-inconsistency function is:

V_{{p,q}∈N}(f_p, f_q) = λ|f_p - f_q|;
If pixel p is within the farther object then, again by depth characteristics (b) and (c), it does not belong to the same depth object as pixel q on the object edge, and the disparity difference between them is large, so the corresponding disparity-inconsistency function is:

V_{{p,q}∈N}(f_p, f_q) = ∞;
The disparity-inconsistency function for the second class of pixels is therefore:

V_{{p,q}∈N}(f_p, f_q) = λ|f_p - f_q|, if E(q) = 0, or if E(q) = 1 and p lies within the nearer object;
V_{{p,q}∈N}(f_p, f_q) = ∞, if E(q) = 1 and p lies within the farther object;
2C) Disparity-inconsistency function for the third class of pixels

For the third class, neither the current pixel nor its neighbors lie on an object edge in the image, so the disparity between them should vary slowly; the disparity-inconsistency function for such pixels is:

V_{{p,q}∈N}(f_p, f_q) = λ|f_p - f_q|, if E(p) = 0 and E(q) = 0.
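A sketch of the resulting class-dependent cost for one neighbor pair follows, folding in the nearer-object test |f_q - f_p| ≤ |f_q - f_m| described in 2A and 2B; λ = 1 as in the patent, while the function signatures and boundary handling are assumptions:

```python
INF = float("inf")
LAMBDA = 1.0  # smoothing factor λ (the patent sets it to 1)

def nearer_test(fp: int, fq: int, fm: int) -> bool:
    """The patent's nearer-object test: true when |f_q - f_p| <= |f_q - f_m|,
    where m is the neighbor one step beyond q, away from p."""
    return abs(fq - fp) <= abs(fq - fm)

def smoothness_cost(fp: int, fq: int, fm: int, ep: int, eq: int) -> float:
    """Class-dependent disparity-inconsistency cost V_{p,q}(f_p, f_q).

    fp, fq, fm: disparities of the current pixel p, its neighbor q, and
    the pixel m one step beyond q; ep, eq: edge flags E(p), E(q).
    A sketch for a single (p, q) pair; image-border cases are omitted.
    """
    if ep == 1 and eq == 1:      # class 1: neighbor lies on the same edge
        return LAMBDA * abs(fp - fq)
    if ep == 1:                  # class 1: q beside the edge, test q's side
        return LAMBDA * abs(fp - fq) if nearer_test(fp, fq, fm) else INF
    if eq == 1:                  # class 2: q on the edge, test p's side
        return LAMBDA * abs(fp - fq) if nearer_test(fp, fq, fm) else INF
    return LAMBDA * abs(fp - fq)  # classes 2 and 3: both pixels off-edge
```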
Step 3: estimate the disparity with the energy minimization function.
3A) Pixel luminance inconsistency computation
The pixel luminance inconsistency measures the luminance mismatch between pixel p in the current view and pixel (p + f_p) in the reference view under disparity f_p; it is computed, for example, as the absolute luminance difference D_p(f_p) = |I_p - I'_{p+f_p}|, where I_p is the luminance of pixel p in the current view and I'_{p+f_p} is the luminance of pixel (p + f_p) in the reference view;
3B) Pixel disparity inconsistency computation

The pixel disparity inconsistency measures the inconsistency between the disparity f_p of pixel p in the current view and the disparity f_q of its neighbor q; it is computed with the class-dependent inconsistency functions V_{{p,q}∈N}(f_p, f_q) designed in step 2;
3C) Disparity estimation

Using the computed pixel luminance inconsistency and pixel disparity inconsistency, the disparity is estimated according to the energy minimization function.
The energy minimization function defines the optimal disparity assignment f that minimizes the energy function E, as shown below:

E(f) = E_data(f) + E_smooth(f)

where E_data(f) = Σ_{p∈P} D_p(f_p), with P the set of pixels in the current view, and E_smooth(f) = Σ_{{p,q}∈N} V_{{p,q}∈N}(f_p, f_q), with N the set of neighboring pixel pairs.
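A sketch that evaluates E(f) for a candidate disparity map f is given below; the data term is assumed to be the absolute luminance difference of step 3A, and the smoothness term is passed in as a callable (for example, the smoothness_cost sketch above). In the patent the minimization itself is performed by graph-cut-based global optimization; this sketch only scores a candidate assignment:

```python
import numpy as np

def total_energy(cur: np.ndarray, ref: np.ndarray, f: np.ndarray, v_cost) -> float:
    """E(f) = E_data(f) + E_smooth(f) for a candidate disparity map.

    cur, ref: grayscale current and reference views (H x W arrays);
    f: integer disparity map (H x W); v_cost(p, q): class-dependent cost
    V_{p,q}(f_p, f_q) for the neighbor pair (p, q), given as coordinates.
    Only horizontal neighbor pairs are scored, for brevity.
    """
    h, w = cur.shape
    e_data = e_smooth = 0.0
    for y in range(h):
        for x in range(w):
            xr = x + int(f[y, x])
            if 0 <= xr < w:   # D_p: luminance mismatch at the matched pixel
                e_data += abs(float(cur[y, x]) - float(ref[y, xr]))
            else:
                e_data += 255.0  # assumed penalty when the match leaves the view
            if x + 1 < w:     # V_{p,q} for the right-hand neighbor
                e_smooth += v_cost((y, x), (y, x + 1))
    return e_data + e_smooth
```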
Step 4: compute the depth values from the disparity values.
Using the disparity values estimated in step 3 and the disparity-to-depth conversion function, convert the disparity values into the corresponding depth values to complete the depth estimation.

The disparity-to-depth conversion function is the one given as formula (1) above, where d is the estimated disparity value, Z is the depth value, I is the camera spacing, f is the focal length of the camera lens, and Z_near and Z_far are the depths of the nearest and farthest planes, respectively, of scene objects relative to the camera.
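A sketch of this conversion, under the same assumption as for formula (1) above (parallel cameras, Z = f·I/d, and 8-bit linear quantization of 1/Z between Z_near and Z_far):

```python
def disparity_to_depth(d: float, focal: float, baseline: float,
                       z_near: float, z_far: float) -> tuple[float, int]:
    """Convert a disparity d (in pixels) into a metric depth Z and the
    corresponding 8-bit depth-map value; an assumed MPEG-style formula."""
    z = focal * baseline / max(d, 1e-6)   # Z = f*I/d; guard non-positive d
    z = min(max(z, z_near), z_far)        # clamp to the valid depth range
    v = 255.0 * (1.0 / z - 1.0 / z_far) / (1.0 / z_near - 1.0 / z_far)
    return z, int(round(v))
```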
The effect of the present invention can be further demonstrated by the following experiments.
1. Experimental conditions
The experiments use the depth estimation reference software and the virtual view synthesis reference software provided by the Moving Picture Experts Group (MPEG) as the platform; the edge-pixel-based depth estimation method of the present invention and the existing depth estimation method of the FTV system were each integrated into the depth estimation reference software. Three multi-view video sequences provided by MPEG were used: champagne_tower, book_arrival, and newspaper. To verify the effectiveness of the proposed method, the quality of the depth maps produced by the two methods is compared, along with the subjective and objective quality of the virtual views synthesized at the receiving end of the FTV system.
2. Experimental content
First, depth data were estimated with the existing depth estimation method and with the edge-pixel-based depth estimation method of the present invention; the resulting depth maps for the champagne_tower sequence are shown in Figure 4, where Figure 4(a) is the depth map produced by the existing method and Figure 4(b) is the depth map produced by the method of the present invention.
Next, virtual views were synthesized from the captured multi-view images and the depth maps estimated by the two methods; the synthesized virtual views for the champagne_tower sequence are shown in Figure 5, where Figure 5(a) is the virtual view synthesized using the existing method and Figure 5(b) is the virtual view synthesized using the method of the present invention.
Finally, the peak signal-to-noise ratio (PSNR) of the virtual views obtained with the two depth estimation methods was computed; the results are listed in Table 1.
Table 1. PSNR of the synthesized virtual views obtained with the two methods
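For reference, the PSNR used here is the standard measure between a synthesized virtual view and the view actually captured at the same camera position; a minimal sketch:

```python
import numpy as np

def psnr(synth: np.ndarray, truth: np.ndarray, peak: float = 255.0) -> float:
    """Peak signal-to-noise ratio in dB between two equally sized images."""
    diff = synth.astype(np.float64) - truth.astype(np.float64)
    mse = np.mean(diff ** 2)
    return float("inf") if mse == 0.0 else 10.0 * np.log10(peak ** 2 / mse)
```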
3. Experimental analysis
As can be seen from Figure 4, because the depth characteristics of object-edge pixels in the image are fully exploited, the depth map obtained with the depth estimation method of the present invention is more accurate at object edges, and image noise is greatly reduced.
As can be seen from Figure 5, because the depth estimation method of the present invention produces a more accurate depth map, the subjective quality at the edges of the synthesized virtual view is greatly improved.
As can be seen from Table 1, compared with the existing depth estimation method, the objective quality of the virtual views obtained with the method of the present invention improves in every case, by 0.32 dB on average.
Claims (3)
Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
---|---|---|---
CN2010101495041A | 2010-04-16 | 2010-04-16 | Depth estimation method based on edge pixel characteristics
Publications (2)

Publication Number | Publication Date
---|---
CN101840574A | 2010-09-22
CN101840574B | 2012-05-23