Computational Visual Media


Keywords: depth inpainting, self-supervised learning, reference-guided learning


Depth information can benefit various computer vision tasks on both images and videos. However, depth maps often contain invalid values at many pixels as well as large holes. To repair such data, we propose a joint self-supervised and reference-guided learning approach for depth inpainting. For the self-supervised learning strategy, we introduce an improved spatial convolutional sparse coding module in which total variation regularization is employed to enhance structural information while preserving edges. This module alternately learns a convolutional dictionary and sparse codes from a corrupted depth map. The learned dictionary and sparse codes are then convolved to yield an initial depth map, which is effectively smoothed using local contextual information. The reference-guided learning part is inspired by the observation that adjacent pixels with similar colors in the RGB image tend to have similar depth values. We thus construct a hierarchical joint bilateral filter module that uses the corresponding color image to fill in large holes. In summary, our approach integrates a convolutional sparse coding module that preserves local contextual information and a hierarchical joint bilateral filter module that fills holes using color-guided information from adjacent pixels. Experimental results show that the proposed approach works well for both invalid value restoration and large hole inpainting.
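To make the reference-guided idea concrete, the following is a minimal single-level sketch of joint bilateral hole filling, in which each invalid depth pixel is replaced by an average of valid neighbors weighted by spatial distance and by color similarity in the aligned RGB image. It is not the hierarchical module proposed here. The function name `joint_bilateral_fill`, the convention that a depth of 0 marks an invalid pixel, the Gaussian weights, and the parameters `radius`, `sigma_s`, and `sigma_c` are all assumptions made for illustration.

```python
import numpy as np


def joint_bilateral_fill(depth, color, radius=2, sigma_s=2.0, sigma_c=0.1):
    """Fill invalid depth pixels with a color-guided weighted average.

    depth : (H, W) float array; 0 marks an invalid pixel (assumed convention).
    color : (H, W, 3) float array in [0, 1], pixel-aligned with `depth`.
    All parameter names and defaults are illustrative assumptions.
    """
    h, w = depth.shape
    filled = depth.astype(np.float64).copy()

    # Visit every invalid pixel; weights are always read from the original
    # depth map, so the result does not depend on the visiting order.
    for y, x in zip(*np.nonzero(depth == 0)):
        y0, y1 = max(y - radius, 0), min(y + radius + 1, h)
        x0, x1 = max(x - radius, 0), min(x + radius + 1, w)
        patch_d = depth[y0:y1, x0:x1]
        patch_c = color[y0:y1, x0:x1]

        # Spatial weight: nearby pixels count more.
        yy, xx = np.mgrid[y0:y1, x0:x1]
        w_s = np.exp(-((yy - y) ** 2 + (xx - x) ** 2) / (2 * sigma_s ** 2))

        # Range weight from the color image: similar colors count more.
        diff = patch_c - color[y, x]
        w_c = np.exp(-np.sum(diff ** 2, axis=-1) / (2 * sigma_c ** 2))

        # Only valid depth samples contribute.
        weights = w_s * w_c * (patch_d > 0)
        total = weights.sum()
        if total > 0:
            filled[y, x] = (weights * patch_d).sum() / total
        # Otherwise the pixel stays 0: no valid neighbor lies in the window.

    return filled
```

In this single-level form, a hole wider than the window leaves its interior pixels unfilled, because no valid depth sample falls inside the window.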