Researchers from Google AI have proposed a new state-of-the-art method for unsupervised depth estimation from images that works without any additional input.
The method builds on previous successful approaches such as vid2depth and struct2depth and introduces several significant improvements. According to the paper, it can learn monocular depth even from videos captured by unknown cameras, going as far as estimating the camera intrinsic parameters themselves.
Like its predecessors, the proposed method is based on applying differentiable warping to frames and comparing the results to adjacent frames. Unlike other existing methods, however, it introduces several novelties: a randomized layer normalization, a new regularizer, and a way of handling occlusions that is both geometric and differentiable.
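To make the randomized layer normalization idea concrete, the sketch below perturbs the normalization statistics with multiplicative Gaussian noise during training. This is only an illustration of the concept: the noise scale, the clamping, and the function name are assumptions here, not the paper's exact recipe.

```python
import numpy as np

def randomized_layer_norm(x, is_training, noise_stddev=0.5, eps=1e-3, rng=None):
    """Layer normalization whose statistics are multiplied by Gaussian noise
    during training. A sketch of the 'randomized layer normalization' idea;
    the noise scale and clamping are illustrative assumptions."""
    rng = np.random.default_rng(0) if rng is None else rng
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    if is_training:
        # Multiplicative N(1, stddev^2) noise on the statistics acts as a
        # regularizer; clamp the variance factor so it stays positive.
        mean = mean * rng.normal(1.0, noise_stddev, size=mean.shape)
        var = var * np.maximum(rng.normal(1.0, noise_stddev, size=var.shape), 0.1)
    return (x - mean) / np.sqrt(var + eps)
```

At evaluation time the noise is disabled and the function reduces to ordinary layer normalization.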
This is the first method to learn camera intrinsic parameters from video in a completely unsupervised manner. The researchers evaluated both depth estimation and ego-motion estimation on videos in the wild using several datasets (Cityscapes, KITTI, and EuRoC) and show that the method outperforms existing state-of-the-art methods in both odometry and depth estimation. The implementation has been open-sourced and is available on GitHub.
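To see why intrinsics can be learned without supervision, consider the reprojection chain used in differentiable warping: pixels are backprojected with the estimated depth, moved by the estimated camera motion, and projected back into the adjacent frame. The sketch below (NumPy, with a hypothetical `reproject` helper; the paper's exact parameterization may differ) shows that the intrinsic matrix K enters this chain differentiably, so a training framework can treat its entries as learnable parameters driven by the same photometric loss.

```python
import numpy as np

def reproject(uv, depth, K, R, t):
    """Backproject pixels `uv` (N, 2) with per-pixel `depth` (N,) through
    intrinsics K, apply the rigid motion (R, t), and project into the
    adjacent frame. Every step is differentiable in K, depth, R, and t,
    which is what allows intrinsics to be estimated from video alone."""
    ones = np.ones((uv.shape[0], 1))
    pix = np.hstack([uv, ones])          # homogeneous pixel coordinates (N, 3)
    rays = (np.linalg.inv(K) @ pix.T).T  # camera-frame viewing rays
    pts = rays * depth[:, None]          # 3D points in the source camera frame
    pts2 = (R @ pts.T).T + t             # rigid motion into the target frame
    proj = (K @ pts2.T).T                # project with the same intrinsics
    return proj[:, :2] / proj[:, 2:3]    # dehomogenize to pixel coordinates
```

With identity motion the reprojected pixels land exactly on the originals, which is a convenient sanity check for the geometry.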
More details about the new method, as well as the evaluation and comparison with other methods, can be found in the official paper.