Multi-view stereo by temporal nonparametric fusion

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Scientific › peer-review

35 Citations (Scopus)

Abstract

We propose a novel idea for depth estimation from multi-view image-pose pairs, where the model is able to leverage information from previous latent-space encodings of the scene. The model takes pairs of images and poses, which are passed through an encoder-decoder network for disparity estimation. The novelty lies in soft-constraining the bottleneck layer by a nonparametric Gaussian process prior. We propose a pose-kernel structure that encourages similar poses to have resembling latent spaces. The flexibility of the Gaussian process (GP) prior provides an adapting memory for fusing information from nearby views. We train the encoder-decoder and the GP hyperparameters jointly end-to-end. In addition to a batch method, we derive a lightweight estimation scheme that circumvents standard pitfalls in scaling Gaussian process inference, and we demonstrate how our scheme can run in real time on smart devices.
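The pose-kernel idea can be illustrated with a small sketch. The snippet below is plain NumPy and is not the authors' implementation; the squared-exponential kernel form, the length-scale and noise values, and the function names are assumptions made for illustration. It builds a covariance between camera poses from their translation and rotation distances and uses standard GP regression to fuse bottleneck encodings from nearby views.

```python
# Minimal sketch (not the paper's code) of a pose kernel and GP fusion of
# latent encodings. Kernel form and hyperparameters are illustrative assumptions.
import numpy as np

def pose_kernel(t1, R1, t2, R2, ell_t=1.0, ell_r=0.5, sigma2=1.0):
    """Covariance between two camera poses (translation t, rotation matrix R)."""
    d_t = np.linalg.norm(t1 - t2)                      # translation distance
    cos_angle = (np.trace(R1.T @ R2) - 1.0) / 2.0      # rotation geodesic angle
    d_r = np.arccos(np.clip(cos_angle, -1.0, 1.0))
    return sigma2 * np.exp(-0.5 * (d_t / ell_t) ** 2 - 0.5 * (d_r / ell_r) ** 2)

def fuse_latents(poses, latents, query_pose, noise=1e-2):
    """GP-regression fusion: predict a latent code at query_pose from past frames.

    poses   : list of (t, R) pairs for previous frames
    latents : (N, D) array of bottleneck encodings for those frames
    """
    N = len(poses)
    K = np.array([[pose_kernel(*poses[i], *poses[j]) for j in range(N)]
                  for i in range(N)]) + noise * np.eye(N)
    k_star = np.array([pose_kernel(*query_pose, *poses[i]) for i in range(N)])
    weights = np.linalg.solve(K, k_star)               # GP predictive weights, (N,)
    return weights @ latents                           # fused latent code, shape (D,)

# Toy usage: three nearby poses with random latent codes.
rng = np.random.default_rng(0)
poses = [(rng.normal(size=3), np.eye(3)) for _ in range(3)]
latents = rng.normal(size=(3, 8))
fused = fuse_latents(poses, latents, query_pose=(np.zeros(3), np.eye(3)))
print(fused.shape)  # (8,)
```

In this sketch, poses that are close in translation and rotation receive large covariance, so their encodings dominate the fused latent code; the lightweight real-time scheme described in the abstract would replace the batch solve above with a recursive update, which is not shown here.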

Original language: English
Title of host publication: 2019 IEEE/CVF International Conference on Computer Vision (ICCV 2019)
Publisher: IEEE
Pages: 2651–2660
Number of pages: 10
ISBN (Electronic): 9781728148038
DOIs
Publication status: Published - 2019
MoE publication type: A4 Article in a conference publication
Event: IEEE International Conference on Computer Vision - Seoul, Korea, Republic of
Duration: 27 Oct 2019 – 2 Nov 2019
http://iccv2019.thecvf.com/

Publication series

Name: Proceedings of the IEEE International Conference on Computer Vision
Volume: 2019-October
ISSN (Electronic): 1550-5499

Conference

Conference: IEEE International Conference on Computer Vision
Abbreviated title: ICCV
Country/Territory: Korea, Republic of
City: Seoul
Period: 27/10/2019 – 02/11/2019
Internet address: http://iccv2019.thecvf.com/
