Abstract
In this work, we propose a non-local \(L_{0}\) gradient minimization filter. The nonlocal idea is to restore an unknown pixel using other similar pixels, and the nonlocal gradient model has been shown to preserve features and structures. We introduce the nonlocal idea into the \(L_{0}\) gradient minimization approach, which is effective at preserving major edges while eliminating low-amplitude structures, and we design an optimization framework to solve the resulting problem. Many optimization-based filters lack the joint filtering property, so they cannot be used for problems such as joint denoising and joint upsampling. The proposed filter inherits the advantages of the \(L_{0}\) gradient minimization filter and also supports joint filtering, so it can be applied to joint super resolution. Guided by a high-resolution image, we upsample the low-resolution depth image with the proposed filter. Experimental results demonstrate the effectiveness of our method, both qualitatively and quantitatively, compared with state-of-the-art methods.
H. Yang thanks the National Science Foundation of China for its support under Grant No. 61401425.
1 Introduction
Edge-aware image processing techniques are broadly studied for smoothing images without destroying different levels of structure, and they are widely applied in the computer graphics community. Edge-preserving filters can be divided into two broad categories: average-based approaches and optimization-based approaches.
Methods of the first class smooth images by taking a weighted average of nearby pixels, where the weights depend on the intensity or color difference. Average-based filters include the bilateral filter [17], the nonlocal means filter [1], the guided image filter [8] and the rolling guidance filter [23]. They often use a guidance image to define the similarity between pixels. The main drawback of these filters is that they produce halo effects near edges.
The total variation (TV) model [15], the \(L_{0}\) gradient minimization filter (\(L_{0}\) filter) [20], weighted least squares (WLS) [3] and the curvature filter [7] belong to the optimization-based methods. These approaches smooth images by optimizing objective functions containing terms defined in the \(L_{p}\) norm (\(p = 0,1,2\)). Although optimization-based methods avoid the halo effect along salient edges and often generate high-quality results, they do not support joint filtering with a reference image, and this shortcoming limits their applications.
Recently, the nonlocal framework has been extensively studied as a regularization term to overcome the staircase effect and obtain better performance. Gilboa and Osher defined variational functionals based on nonlocal TV operators [6]. Zhang et al. [24] proposed a fast split Bregman iteration for this nonlocal TV minimization. Later, nonlocal regularization was extended to more general inverse problems in [12]. However, these methods penalize large nonlocal gradient magnitudes, which may reduce contrast during smoothing.
In summary, most image smoothing models aim to preserve edges while removing noise and textures, and each of them has its limitations. In this work, we present a new edge-preserving filter based on an optimization framework, which incorporates the nonlocal strategy into the \(L_{0}\) gradient minimization model and takes advantage of both variational models and spatial filters. This notion leads to an unconventional global optimization process involving discrete metrics, whose solution is able to manipulate the edges in a variety of ways depending on their saliency.
The proposed framework is general and can be used for several applications. Different from other optimization based methods, the proposed algorithm can use the reference image for joint filtering.
Depth images captured by 3D scanning devices such as ToF or Kinect cameras may be highly degraded, with limited resolution and low quality. As a result, it is hard to recover a high-quality depth map from a single depth image. Fortunately, the depth map is often coupled with a high-resolution (HR) color image of the same scene, and the two have strong structural similarities [4, 19, 22]. Recently, deep-learning-based depth upsampling methods [5, 9, 11, 16] have achieved good results. These methods build end-to-end upsampling networks, which learn high-resolution features from the intensity image and use them to supplement the low-resolution structures in the depth map.
This paper therefore applies the proposed filter to depth image super resolution and treats the natural image as the reference image. Guided by the high-resolution RGB image, the proposed algorithm is well suited for upsampling the low-resolution depth image: it not only reduces noise but also preserves sharp edges during super resolution. In simulations, the experimental results demonstrate that the proposed approach is promising and that it significantly improves the visual quality of the low-resolution depth image compared with existing upsampling methods.
2 Non-local \(L_{0}\) Gradient Minimization
Different from the usual definition of the gradient, the nonlocal gradient \(\nabla _{\omega } S_{p}\) of each pixel p of the image S is defined as follows:
\[ \nabla _{\omega } S_{p} = \big( S_{p}(q) \big)_{q \in \varOmega _{p}}, \tag{1} \]
where
\[ S_{p}(q) = (S_{q} - S_{p}) \sqrt{\omega (p, q)}, \tag{2} \]
and \(S_{p}(q)\) is the vector element corresponding to q. The weight function \(\omega (p, q)\) is assumed to be nonnegative and symmetric; it measures the similarity between the two patches (of size \(m\times m\)) centered at the pixels p and q. \(\varOmega _{p}\) is a search window (of size \(n\times n\)) centered at the pixel p [12, 24].
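The weight function \(\omega (p,q)\) in \(\varOmega _{p}\) has the form:
\[ \omega (p,q) = \frac{1}{C_{p}} \exp \left\{ -\frac{\big( G_{a} * \mid J(p+\cdot ) - J(q+\cdot ) \mid ^{2} \big)(0)}{h^{2}} \right\}, \tag{3} \]
and the normalizing factor \(C_{p}\) is
\[ C_{p} = \sum _{q \in \varOmega _{p}} \exp \left\{ -\frac{\big( G_{a} * \mid J(p+\cdot ) - J(q+\cdot ) \mid ^{2} \big)(0)}{h^{2}} \right\}, \tag{4} \]
where \(G_{a}\) is the Gaussian kernel with standard deviation a, h is a smoothing parameter, and J is a reference image which can be chosen according to the application.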
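A minimal sketch of how such weights could be computed for one pixel, assuming a nonlocal-means style weight: the exponential of the negative Gaussian-weighted squared patch distance divided by \(h^{2}\), normalized over the search window. The function names `gaussian_kernel` and `nonlocal_weights`, the reflect padding at the border and the grayscale reference image are assumptions made for illustration; the default values of m, n, a and h follow the settings reported in the experiments.

```python
import numpy as np

def gaussian_kernel(m, a):
    """m x m Gaussian kernel with standard deviation a, normalized to sum to 1."""
    r = np.arange(m) - m // 2
    g = np.exp(-(r[:, None] ** 2 + r[None, :] ** 2) / (2.0 * a ** 2))
    return g / g.sum()

def nonlocal_weights(J, p, m=5, n=7, a=0.5, h=0.25):
    """Weights omega(p, q) for every q in the n x n search window around pixel p.

    J is a 2-D float reference image and p is a (row, col) tuple.
    Returns an n x n array that sums to 1 (the division by C_p).
    """
    rm, rn = m // 2, n // 2
    pad = rm + rn
    Jp = np.pad(J, pad, mode="reflect")  # assumption: reflect at the image border
    G = gaussian_kernel(m, a)
    i, j = p[0] + pad, p[1] + pad
    patch_p = Jp[i - rm:i + rm + 1, j - rm:j + rm + 1]
    w = np.empty((n, n))
    for di in range(-rn, rn + 1):
        for dj in range(-rn, rn + 1):
            qi, qj = i + di, j + dj
            patch_q = Jp[qi - rm:qi + rm + 1, qj - rm:qj + rm + 1]
            # Gaussian-weighted squared distance between the two patches
            dist = np.sum(G * (patch_p - patch_q) ** 2)
            w[di + rn, dj + rn] = np.exp(-dist / h ** 2)
    return w / w.sum()
```

Because the weights depend only on J, they can be computed once from the reference image and reused in every iteration of the filter.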
In this work, we denote the input image and filtered image as I and S, respectively. Our nonlocal gradient measure is written as
\[ C(S) = \sharp \{p \mid \parallel \nabla _{\omega } S_{p} \parallel _{1} \ne 0 \}. \tag{5} \]
It counts the pixels p whose magnitude \(\parallel \nabla _{\omega } S_{p} \parallel _{1}\) is not zero. Based on this definition, we can estimate S by solving:
\[ \min _{S} \Big\{ \sum _{p} (S_{p} - I_{p})^{2} + \lambda \, C(S) \Big\}. \tag{6} \]
The first term constrains image structure similarity, and the second term is the discrete counting metric, weighted by the smoothing parameter \(\lambda \).
Because these two terms describe the pixel-wise difference and the global discontinuity respectively, Eq. (6) is commonly regarded as computationally intractable. In this work, we introduce an auxiliary variable based on the half-quadratic splitting method, which expands the original terms so that they can be updated iteratively. This approach leads to an alternating optimization strategy.
Due to the discrete nature of the problem, our method contains new subproblems and differs from other \(L_{0}\)-norm regularized optimization problems. Although the proposed method can only approximate the solution of Eq. (6), it makes the original problem easier to handle and inherits the ability to maintain salient structures [20].
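We introduce auxiliary variables \(\mathbf d _{p}\) corresponding to \(\nabla _{\omega } S_{p}\) and rewrite the cost function as
\[ \min _{S, \mathbf d } \Big\{ \sum _{p} (S_{p} - I_{p})^{2} + \lambda \, C(\mathbf d ) + \beta \sum _{p} \parallel \nabla _{\omega } S_{p} - \mathbf d _{p} \parallel ^{2}_{2} \Big\}, \tag{7} \]
where \(C(\mathbf d ) = \sharp \{p \mid \parallel \mathbf d _{p} \parallel _{1} \ne 0 \}\), and \(\beta \) is an automatically adapting control parameter.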
This variable splitting motivates an iterative method. In practice, a good result can be obtained by solving the following two subproblems alternately.
Subproblem 1: computing S
Now, the subproblem for S consists in solving the linear equations
which provides
Here, \(div_{\omega }{} \mathbf d \) is defined as the divergence of \(\mathbf d \), and its discretization at p can be written as
The non-local Laplacian \(\varDelta _{\omega }\) is defined as
Since the non-local Laplacian is negative semidefinite, the operator \(1-\varDelta _{\omega }\) is diagonally dominant. Therefore, we can solve for S with a Gauss-Seidel algorithm.
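As a sketch of where such a linear system comes from, the following short derivation assumes the conventions of [6] with a symmetric weight: the adjoint relation \(\langle \nabla _{\omega } S, \mathbf d \rangle = -\langle S, div_{\omega }{} \mathbf d \rangle \) and \(\varDelta _{\omega } = \frac{1}{2} div_{\omega } \nabla _{\omega }\), with \(\varDelta _{\omega } S_{p} = \sum _{q \in \varOmega _{p}} (S_{q} - S_{p})\, \omega (p,q)\). The constants in the paper's own equations may be absorbed differently, so this is an illustration under those assumptions rather than a restatement.

\[
\begin{aligned}
&\min _{S} \sum _{p} (S_{p} - I_{p})^{2} + \beta \parallel \nabla _{\omega } S_{p} - \mathbf d _{p} \parallel ^{2}_{2} \\
&\Rightarrow \; (S - I) - \beta \, div_{\omega }(\nabla _{\omega } S - \mathbf d ) = 0 \\
&\Rightarrow \; (1 - 2\beta \varDelta _{\omega })\, S = I - \beta \, div_{\omega }{} \mathbf d \\
&\Rightarrow \; S_{p} \leftarrow \frac{I_{p} - \beta \, div_{\omega }{} \mathbf d _{p} + 2\beta \sum _{q \in \varOmega _{p}} \omega (p,q)\, S_{q}}{1 + 2\beta \sum _{q \in \varOmega _{p}} \omega (p,q)}
\end{aligned}
\]

The last line is one Gauss-Seidel sweep: each pixel is updated in turn from the most recent values of its neighbors, and the positive denominator reflects the diagonal dominance of the system.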
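Subproblem 2: computing \(\mathbf d \)
With S fixed, \(\mathbf d \) is estimated by solving
\[ \min _{\mathbf d } \Big\{ \sum _{p} \parallel \mathbf d _{p} - \nabla _{\omega } S_{p} \parallel ^{2}_{2} + \frac{\lambda }{\beta }\, C(\mathbf d ) \Big\}. \tag{13} \]
This subproblem can be solved efficiently because Eq. (13) can be spatially decomposed so that each \(\mathbf d _{p}\) is estimated individually. This is the main benefit of the proposed scheme, which makes the altered problem empirically solvable. Equation (13) is accordingly decomposed into:
\[ \sum _{p} \min _{\mathbf d _{p}} E_{p}, \qquad E_{p} = \parallel \mathbf d _{p} - \nabla _{\omega } S_{p} \parallel ^{2}_{2} + \frac{\lambda }{\beta }\, H(\mathbf d _{p}), \tag{14} \]
where \(H(\mathbf d _{p})\) is a binary function returning 1 if \(\parallel \mathbf d _{p} \parallel _{1} \ne 0\) and 0 otherwise.
Equation (14) reaches its minimum \(E^{*}_{p}\) under the condition
\[ \mathbf d _{p} = \begin{cases} \mathbf 0 , & \parallel \nabla _{\omega } S_{p} \parallel ^{2}_{2} \le \lambda /\beta , \\ \nabla _{\omega } S_{p}, & \text{otherwise}. \end{cases} \tag{15} \]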
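Proof:
(1) When \(\lambda /\beta \ge \parallel \nabla _{\omega } S_{p} \parallel ^{2}_{2}\), a non-zero \(\mathbf d _{p}\) yields
\[ E_{p} = \parallel \mathbf d _{p} - \nabla _{\omega } S_{p} \parallel ^{2}_{2} + \lambda /\beta \ge \lambda /\beta \ge \parallel \nabla _{\omega } S_{p} \parallel ^{2}_{2}. \]
Note that \(\mathbf d _{p} = \mathbf 0 \) leads to
\[ E_{p} = \parallel \nabla _{\omega } S_{p} \parallel ^{2}_{2}. \]
Comparing these two cases, the minimum energy \(E_{p} = \parallel \nabla _{\omega } S_{p} \parallel ^{2}_{2} \) is produced when \(\mathbf d _{p} = \mathbf 0 \).
(2) When \(\lambda /\beta < \parallel \nabla _{\omega } S_{p} \parallel ^{2}_{2}\), the energy for \(\mathbf d _{p} = \mathbf 0 \) is unchanged. But when \( \mathbf d _{p} = \nabla _{\omega } S_{p}\), \(E_{p}\) has its minimum value \( \lambda /\beta \) over all non-zero \(\mathbf d _{p}\). Comparing these two values, the minimum energy \(E_{p}\) is produced when \( \mathbf d _{p} = \nabla _{\omega } S_{p}\).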
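The closed-form rule proved above can be checked numerically. The sketch below, with the hypothetical helper names `energy` and `best_d`, evaluates the per-pixel energy \(E_{p}\) for a candidate \(\mathbf d _{p}\) and returns the thresholded minimizer; `g_p` stands for the nonlocal gradient vector \(\nabla _{\omega } S_{p}\) at one pixel.

```python
import numpy as np

def energy(d_p, g_p, lam, beta):
    """E_p = ||d_p - g_p||_2^2 + (lam / beta) * H(d_p), with g_p the nonlocal gradient at p."""
    H = 1.0 if np.abs(d_p).sum() != 0 else 0.0
    return float(np.sum((d_p - g_p) ** 2) + (lam / beta) * H)

def best_d(g_p, lam, beta):
    """Closed-form minimizer: zero below the threshold, the gradient itself above it."""
    if np.sum(g_p ** 2) <= lam / beta:
        return np.zeros_like(g_p)
    return g_p.copy()
```

For any candidate vector, `energy(best_d(g_p, lam, beta), g_p, lam, beta)` should not exceed the candidate's energy, which mirrors the two cases of the proof.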
Parameter \(\beta \) is automatically adapted across iterations: starting from a small value, it is multiplied by 2 each time. This scheme is effective in speeding up convergence [20].
The continuous nonlocal gradient \(L_{1}\) norm is enforced in nonlocal total variation (NLTV) [24] smoothing to suppress noise. With that approach, strong smoothing inevitably curtails originally salient edges, because their magnitudes are penalized. In our framework, large nonlocal gradient magnitudes are allowed by nature, thanks to the discrete counting measure.
In Fig. 1, we show a natural image smoothing example and compare it with other competitive algorithms. The \(L_{0}\) filter [20] (\(\lambda = 0.035\)) generates a sharp but not completely smooth image, shown in Fig. 1(b). Many details, such as the flower diameter and the butterfly, are still retained after filtering, which is not good enough for applications. The result obtained by NLTV [24] (\(\lambda = 0.05\)) is shown in Fig. 1(c). With overall nonlocal gradients of small energy, the edges are not sharp, which makes them difficult to distinguish from the low-contrast details around them. In Fig. 1(d), our result (\(\lambda = 0.05\)) contains the most significant structures, which become slightly sharper as the nonlocal gradient energy increases.
Our alternating minimization method is described in Algorithm 1.
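The alternating scheme can be sketched end to end as follows. This is an illustration under several stated assumptions, not the authors' implementation: the search window is reduced to the four nearest neighbors, borders are periodic, the starting value \(\beta = 2\lambda \) is borrowed from the \(L_{0}\) filter [20], and the linear system of Subproblem 1 is solved with conjugate gradients instead of Gauss-Seidel to keep the code short. All function names are hypothetical.

```python
import numpy as np

OFFSETS = [(0, 1), (0, -1), (1, 0), (-1, 0)]  # assumption: 4 neighbours stand in for the search window

def grad_nl(S, w):
    """Nonlocal gradient: one slice (S_q - S_p) * sqrt(w) per offset, periodic borders."""
    return np.stack([(np.roll(S, (-di, -dj), (0, 1)) - S) * np.sqrt(w[k])
                     for k, (di, dj) in enumerate(OFFSETS)])

def grad_nl_T(v, w):
    """Adjoint of grad_nl, so that <grad_nl(S), v> equals <S, grad_nl_T(v)>."""
    out = np.zeros(v.shape[1:])
    for k, (di, dj) in enumerate(OFFSETS):
        u = np.sqrt(w[k]) * v[k]
        out += np.roll(u, (di, dj), (0, 1)) - u
    return out

def solve_S(I, d, w, beta, n_cg=200, tol=1e-16):
    """Subproblem 1: solve (Id + beta * G^T G) S = I + beta * G^T d by conjugate gradients."""
    A = lambda x: x + beta * grad_nl_T(grad_nl(x, w), w)
    b = I + beta * grad_nl_T(d, w)
    x = I.copy()
    r = b - A(x)
    p = r.copy()
    rs = np.sum(r * r)
    for _ in range(n_cg):
        if rs < tol:  # squared residual small enough
            break
        Ap = A(p)
        alpha = rs / np.sum(p * Ap)
        x += alpha * p
        r -= alpha * Ap
        rs_new = np.sum(r * r)
        p = r + (rs_new / rs) * p
        rs = rs_new
    return x

def nonlocal_l0_smooth(I, w, lam=0.05, n_iter=10):
    """Alternate between the two subproblems while doubling beta each iteration."""
    I = np.asarray(I, dtype=float)
    S = I.copy()
    beta = 2.0 * lam  # assumption: initial value as in the L0 filter
    for _ in range(n_iter):
        g = grad_nl(S, w)
        keep = np.sum(g ** 2, axis=0) > lam / beta  # subproblem 2: hard threshold
        d = g * keep
        S = solve_S(I, d, w, beta)                  # subproblem 1: linear solve
        beta *= 2.0
    return S
```

Here `w` has shape `(4, H, W)` and holds the weight of each offset at each pixel; for joint filtering it would be computed from the reference image rather than from the input.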

In [14], Petschnigg et al. proposed denoising a no-flash image using its flash version as the reference image. In Fig. 2, we compare the joint bilateral filter (JBF) [14], NLTV [24] and our method. Although JBF works well, Fig. 2(b) shows that gradient inversion artifacts are significant near some edges, and NLTV does not obtain a satisfactory result. Our result, shown in Fig. 2(d), is sharper and contains little noise.
3 Depth Image Upsampling
In this application, we upscale a single depth image d (of size \(m \times n\)) guided by a high-resolution natural image T (of size \(M \times N\)). Depth images are textureless compared with natural images and have quite sparse gradients. However, according to the statistics of depth image gradients [21], the sparse gradient assumption is not accurate enough: most gradient values of a depth image are not exactly 0 but rather very small.
The proposed nonlocal \(L_{0}\) gradient regularization can reduce the penalty for small elements, because we treat the nonlocal gradient of the image as a whole and take into account the energy sum of the multi-directional weighted gradients, which avoids an overly smooth result.
In the first step, we upsample the depth image d to the size of \(M \times N\) with nearest neighbor interpolation, and obtain an initial image D. In the second step, we compute the weights \(\omega (p,q)\) with the high-resolution natural image T in Eqs. (3) and (4), that is to say, T is used as the reference image. In the last step, we use D as the input image and solve the minimization problem Eq. (7). The result of Eq. (7) is the final joint upsampling image.
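The first step of this procedure can be sketched as below, assuming an integer scaling factor so that \(M\) and \(N\) are exact multiples of \(m\) and \(n\). The function name `nearest_upsample` is hypothetical.

```python
import numpy as np

def nearest_upsample(d, factor):
    """Nearest-neighbour upsampling of a 2-D depth map by an integer factor."""
    return np.repeat(np.repeat(d, factor, axis=0), factor, axis=1)
```

The resulting image D has the same size as the reference image T, so the weights computed from T can be applied to it pixel by pixel in the remaining two steps.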
We present experimental evaluations of our algorithm against competitive methods for depth image upsampling. We work on 3 depth images from the Middlebury 2007 datasets [4] with scaling factors of 4 and 8. To simulate the acquisition process, Gaussian noise is added to these depth images [13].
The numerical results for this experiment are shown in Table 1 in terms of the Peak Signal-to-Noise Ratio (PSNR) and in Table 2 in terms of Structural Similarity (SSIM) [18]. Table 1 shows that our method clearly outperforms the other four methods in most cases. In Table 2, the proposed method achieves significant SSIM improvements over other leading methods: on average, our algorithm outperforms the other methods by 0.05 in SSIM.
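For reference, PSNR is commonly computed from the mean squared error between a ground-truth image and an estimate. The sketch below assumes 8-bit depth maps with a peak value of 255; the peak value and the function name `psnr` are assumptions, since the evaluation settings are not spelled out here.

```python
import numpy as np

def psnr(reference, estimate, peak=255.0):
    """Peak signal-to-noise ratio in dB between two images of equal shape."""
    ref = np.asarray(reference, dtype=float)
    est = np.asarray(estimate, dtype=float)
    mse = np.mean((ref - est) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(peak ** 2 / mse)
```

A higher value means the upsampled depth map is closer to the ground truth.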
To show the visual comparison clearly, we show some results of experiments in Fig. 3. One can find that our method can enhance edges and reduce noise better, whereas other algorithms suffer from edge blurring or noise. From Table 1 and Figs. 3, 4 and 5 one can observe that the proposed approach is effective for noisy complex scenes and can obtain clearer high resolution depth images.
To show the stability of the proposed algorithm, we give the convergence curves of the alternating optimization in Fig. 6, where we plot the history of the relative error \(\mid S^{k+1} - S^{k} \mid \). Three depth images (Art, Book and Moebius) are used with a scaling factor of 8. The curves indicate that the proposed method is stable.
In order to reduce computation time and storage, we only compute the “best” neighbors: for each pixel p, we include only the 10 best neighbors in the \(7 \times 7\) search window centered at p, and the patch size is \(5 \times 5\). The parameters a and h are empirically set to 0.5 and 0.25, respectively. Our algorithm generally performs 7 to 10 iterations.
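Selecting the best neighbors of one pixel can be sketched as follows, taking "best" to mean the largest weights in the search window. Renormalizing the surviving weights so that they sum to 1 is an assumption, as is the function name `keep_best_neighbors`.

```python
import numpy as np

def keep_best_neighbors(w, k=10):
    """Zero all but the k largest weights of one search window and renormalize."""
    flat = w.ravel()
    idx = np.argpartition(flat, -k)[-k:]  # indices of the k largest weights
    out = np.zeros_like(flat)
    out[idx] = flat[idx]
    return (out / out.sum()).reshape(w.shape)
```

With a \(7 \times 7\) window this keeps 10 of the 49 candidate weights per pixel, which reduces both the storage for the weights and the cost of each iteration.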
In terms of computation time, on a computer running 64-bit Windows 7 with an Intel Core i5 CPU and 8 GB of RAM, the proposed approach takes about 2.3 s in Matlab 2010b to construct the weight function for a \(256 \times 256\) image. Once the weights are constructed, each iteration of our method is comparable in speed to ROF [15]. The total time depends on the number of iterations; in general, 10 iterations take around 3.5 s.
4 Conclusion
In this work, we propose a solution for nonlocal \(L_{0}\) gradient minimization and show its application to depth image upsampling. The smoothing approach is based on minimizing a discrete count of nonlocal spatial changes. Unlike many optimization-based filters, the proposed method supports joint filtering, so it can be used in many applications. In particular, it achieves good performance in depth image super resolution: treating the high-resolution RGB image as a reference image, the proposed algorithm is well suited for upsampling the low-resolution depth image. The experimental results demonstrate that the proposed approach is promising and that it has better objective performance than existing upsampling methods.
References
Buades, A., Coll, B., Morel, J.M.: A non-local algorithm for image denoising. In: Conference on Computer Vision and Pattern Recognition, vol. 2, pp. 60–65. IEEE (2005)
Chan, D., Buisman, H., Theobalt, C., Thrun, S.: A noise-aware filter for real-time depth upsampling. In: The Workshop on Multi-camera Multi-modal Sensor Fusion Algorithms Applications (2008)
Farbman, Z., Fattal, R., Lischinski, D., Szeliski, R.: Edge-preserving decompositions for multi-scale tone and detail manipulation. ACM Trans. Graph. 27, 67 (2008)
Ferstl, D., Reinbacher, C., Ranftl, R., Ruether, M., Bischof, H.: Guided depth upsampling using anisotropic total generalized variation. In: IEEE International Conference on Computer Vision, pp. 993–1000. IEEE (2013)
Ferstl, D., Ruther, M., Bischof, H.: Variational depth superresolution using example-based edge representations. In: IEEE International Conference on Computer Vision (ICCV), pp. 513–521. IEEE (2015)
Gilboa, G., Osher, S.: Nonlocal operators with applications to image processing. Multiscale Model. Simul. 7, 1005–1028 (2008)
Gong, Y.: Bernstein filter: a new solver for mean curvature regularized models. In: IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1701–1705. IEEE (2016)
He, K., Sun, J., Tang, X.: Guided image filtering. IEEE Trans. Pattern Anal. Mach. Intell. 35, 1397–1409 (2013)
Hornacek, M., Rhemann, C., Gelautz, M., Rother, C.: Depth super resolution by rigid body self-similarity in 3D. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 1123–1130. IEEE (2013)
Huang, J.B., Singh, A., Ahuja, N.: Single image super-resolution from transformed self-exemplars. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 5197–5206. IEEE (2015)
Hui, T.-W., Loy, C.C., Tang, X.: Depth map super-resolution by deep multi-scale guidance. In: Leibe, B., Matas, J., Sebe, N., Welling, M. (eds.) ECCV 2016. LNCS, vol. 9907, pp. 353–369. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-46487-9_22
Lou, Y., Zhang, X., Osher, S.: Image recovery via nonlocal operators. J. Sci. Comput. 42, 185–197 (2010)
Park, J., Kim, H., Tai, Y.W., Brown, M.S., Kweon, I.: High quality depth map upsampling for 3D-TOF cameras. In: IEEE International Conference on Computer Vision (ICCV), pp. 1623–1630. IEEE (2011)
Petschnigg, G., Szeliski, R., Agrawala, M.: Digital photography with flash and no-flash image pairs. ACM Trans. Graph. (TOG) 23, 664–672 (2004)
Rudin, L., Osher, S., Fatemi, E.: Nonlinear total variation based noise removal algorithms. Physica D: Nonlinear Phenom. 60, 259–268 (1992)
Song, X., Dai, Y., Qin, X.: Deep depth super-resolution: learning depth super-resolution using deep convolutional neural network. arXiv preprint arXiv:1607.01977 (2016)
Tomasi, C., Manduchi, R.: Bilateral filtering for gray and color images. In: International Conference on Computer Vision, pp. 839–846. IEEE (1998)
Wang, Z., Bovik, A.C., Sheikh, H.R.: Image quality assessment: from error visibility to structural similarity. IEEE Trans. Image Process. 13, 600–612 (2004)
Xie, J., Feris, R.S., Sun, M.T.: Edge-guided single depth image super resolution. IEEE Trans. Image Process. 25, 428–438 (2016)
Xu, L., Lu, C., Xu, Y., Jia, J.: Image smoothing via \(L_{0}\) gradient minimization. ACM Trans. Graph. 30, 174 (2011)
Xue, H., Zhang, S., Cai, D.: Depth image inpainting: improving low rank matrix completion with low gradient regularization. arXiv preprint arXiv:1604.05817 (2016)
Yang, J., Ye, X., Li, K., Hou, C., Wang, Y.: Color-guided depth recovery from RGB-D data using an adaptive autoregressive model. IEEE Trans. Image Process. 23, 3443–3458 (2014)
Zhang, Q., Shen, X., Xu, L., Jia, J.: Rolling guidance filter. In: Fleet, D., Pajdla, T., Schiele, B., Tuytelaars, T. (eds.) ECCV 2014. LNCS, vol. 8691, pp. 815–830. Springer, Cham (2014). https://doi.org/10.1007/978-3-319-10578-9_53
Zhang, X., Burger, M., Bresson, X.: Bregmanized nonlocal regularization for deconvolution and sparse reconstruction. SIAM J. Imaging Sci. 3, 253–276 (2010)
Copyright information
© 2017 Springer International Publishing AG
Cite this paper
Yang, H., Sun, X., Zhu, M., Wu, K. (2017). Non-local \(L_{0}\) Gradient Minimization Filter and Its Applications for Depth Image Upsampling. In: Zhao, Y., Kong, X., Taubman, D. (eds) Image and Graphics. ICIG 2017. Lecture Notes in Computer Science(), vol 10666. Springer, Cham. https://doi.org/10.1007/978-3-319-71607-7_8
DOI: https://doi.org/10.1007/978-3-319-71607-7_8
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-71606-0
Online ISBN: 978-3-319-71607-7