Boosting monocular depth estimation models to high-resolution via content-adaptive multi-resolution merging

SMH Miangoleh, S Dille, L Mai… - Proceedings of the …, 2021 - openaccess.thecvf.com
Proceedings of the IEEE/CVF Conference on Computer Vision and …, 2021 - openaccess.thecvf.com
Abstract
Neural networks have shown great abilities in estimating depth from a single image. However, the inferred depth maps are well below one-megapixel resolution and often lack fine-grained details, which limits their practicality. Our method builds on our analysis of how the input resolution and the scene structure affect depth estimation performance. We demonstrate that there is a trade-off between a consistent scene structure and the high-frequency details, and merge low- and high-resolution estimations to take advantage of this duality using a simple depth merging network. We present a double estimation method that improves the whole-image depth estimation and a patch selection method that adds local details to the final result. We demonstrate that by merging estimations at different resolutions with changing context, we can generate multi-megapixel depth maps with a high level of detail using a pre-trained model.
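The double estimation idea in the abstract can be illustrated with a rough sketch: run the same pre-trained estimator at a low and a high input resolution, then combine the consistent structure of the first with the fine detail of the second. The sketch below is not the authors' method. The paper merges with a learned depth merging network, whereas this example substitutes a hand-written frequency blend purely for illustration. The names `estimate_depth`, `double_estimate`, `resize_bilinear`, `box_blur`, and the `radius` parameter are all hypothetical, and the input is assumed to be a 2D grayscale array.

```python
import numpy as np


def resize_bilinear(img, out_h, out_w):
    """Resize a 2D array to (out_h, out_w) with bilinear interpolation."""
    in_h, in_w = img.shape
    ys = np.linspace(0, in_h - 1, out_h)
    xs = np.linspace(0, in_w - 1, out_w)
    y0 = np.floor(ys).astype(int)
    x0 = np.floor(xs).astype(int)
    y1 = np.minimum(y0 + 1, in_h - 1)
    x1 = np.minimum(x0 + 1, in_w - 1)
    wy = (ys - y0)[:, None]  # vertical interpolation weights
    wx = (xs - x0)[None, :]  # horizontal interpolation weights
    top = img[y0][:, x0] * (1 - wx) + img[y0][:, x1] * wx
    bottom = img[y1][:, x0] * (1 - wx) + img[y1][:, x1] * wx
    return top * (1 - wy) + bottom * wy


def box_blur(img, radius):
    """Separable box filter with edge padding; acts as a crude low-pass."""
    kernel = np.ones(2 * radius + 1) / (2 * radius + 1)
    padded = np.pad(img, radius, mode="edge")
    rows = np.apply_along_axis(
        lambda r: np.convolve(r, kernel, mode="valid"), 1, padded
    )
    return np.apply_along_axis(
        lambda c: np.convolve(c, kernel, mode="valid"), 0, rows
    )


def double_estimate(image, estimate_depth, low_res, high_res, radius=4):
    """Combine a low- and a high-resolution depth estimate of one image.

    ASSUMPTION: `estimate_depth` is any callable mapping a 2D array to a
    depth map of the same shape (a stand-in for a pre-trained model).
    ASSUMPTION: the merge is a simple frequency blend, NOT the learned
    merging network described in the paper.
    """
    low = estimate_depth(resize_bilinear(image, low_res, low_res))
    high = estimate_depth(resize_bilinear(image, high_res, high_res))
    # Upsample the low-resolution estimate, which carries the scene structure.
    low_up = resize_bilinear(low, high_res, high_res)
    # Keep only the high-frequency residual of the high-resolution estimate.
    detail = high - box_blur(high, radius)
    return low_up + detail
```

With a real model, `estimate_depth` would wrap the network's forward pass. The choice of `low_res` and `high_res` is left to the caller here, while the paper selects resolutions adaptively based on image content.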