Full-Frame Scene Coordinate Regression for Image-Based Localization

Xiaotian Li, Juha Ylioinas, Juho Kannala

Research output: Chapter in Book/Report/Conference proceeding › Conference article in proceedings › Scientific › peer-review

Abstract

Image-based localization, or camera relocalization, is a fundamental problem in computer vision and robotics: estimating the camera pose from an image. Recent state-of-the-art approaches use learning-based methods, such as Random Forests (RFs) and Convolutional Neural Networks (CNNs), to regress, for each pixel in the image, its corresponding position in the scene's world coordinate frame, and then solve for the final pose via a RANSAC-based optimization scheme using the predicted correspondences. In this paper, we propose to perform the scene coordinate regression in a full-frame manner rather than a patch-based one, which makes the computation efficient at test time and, more importantly, adds more global context to the regression process to improve robustness. To do so, we adopt a fully convolutional encoder-decoder neural network architecture which accepts a whole image as input and produces scene coordinate predictions for all pixels in the image. However, using more global context is prone to overfitting. To alleviate this issue, we propose to use data augmentation to generate more data for training. In addition to data augmentation in 2D image space, we also augment the data in 3D space. We evaluate our approach on the publicly available 7-Scenes dataset, and experiments show that it produces better scene coordinate predictions and achieves state-of-the-art localization results, with improved robustness on the hardest frames (e.g., frames with repeated structures).
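
The pose-solving step mentioned in the abstract can be illustrated with a minimal NumPy sketch. It assumes a dense scene coordinate map of shape (H, W, 3), which pairs every pixel with a predicted 3D world point, and a known intrinsic matrix `K`. The function names, the linear DLT solver, the 6-point sample size and the synthetic data are all illustrative assumptions, not the solver or settings used in the paper; the synthetic map merely stands in for network predictions.

```python
import numpy as np

def pose_from_dlt(norm_xy, pts3d):
    """World-to-camera (R, t) from >= 6 2D-3D matches via a linear DLT solve."""
    n = len(pts3d)
    Xh = np.hstack([pts3d, np.ones((n, 1))])            # homogeneous 3D points
    x, y = norm_xy[:, :1], norm_xy[:, 1:2]
    zero = np.zeros_like(Xh)
    A = np.vstack([np.hstack([Xh, zero, -x * Xh]),
                   np.hstack([zero, Xh, -y * Xh])])     # (2n, 12) system A p = 0
    P = np.linalg.svd(A, full_matrices=False)[2][-1].reshape(3, 4)
    U, S, Vt = np.linalg.svd(P[:, :3])                  # P = s [R | t], s unknown
    R, scale = U @ Vt, S.mean()
    if np.linalg.det(R) < 0:                            # s was negative
        R, scale = -R, -scale
    return R, P[:, 3] / scale

def reprojection_error(R, t, K, pix, pts3d):
    """Per-point reprojection error in pixels; inf for points behind the camera."""
    Xc = pts3d @ R.T + t
    valid = Xc[:, 2] > 1e-9
    proj = Xc @ K.T
    err = np.full(len(pts3d), np.inf)
    err[valid] = np.linalg.norm(
        proj[valid, :2] / proj[valid, 2:3] - pix[valid], axis=1)
    return err

def ransac_pose(coord_map, K, n_iters=200, thresh_px=1.0, seed=0):
    """Hypothesize poses from random 6-pixel samples, keep the best, refit."""
    h, w, _ = coord_map.shape
    rng = np.random.default_rng(seed)
    v, u = np.mgrid[0:h, 0:w]
    pix = np.stack([u.ravel(), v.ravel()], axis=1).astype(float)
    pts3d = coord_map.reshape(-1, 3)                    # same row-major order
    pix_h = np.hstack([pix, np.ones((len(pix), 1))])
    norm_xy = (pix_h @ np.linalg.inv(K).T)[:, :2]       # normalized image coords
    best = np.zeros(len(pix), dtype=bool)
    for _ in range(n_iters):
        idx = rng.choice(len(pix), size=6, replace=False)
        R, t = pose_from_dlt(norm_xy[idx], pts3d[idx])
        inliers = reprojection_error(R, t, K, pix, pts3d) < thresh_px
        if inliers.sum() > best.sum():
            best = inliers
    if best.sum() < 6:
        raise ValueError("too few inliers to estimate a pose")
    R, t = pose_from_dlt(norm_xy[best], pts3d[best])    # refit on all inliers
    return R, t, best

def make_synthetic_map(h=24, w=32, outlier_frac=0.3, seed=1):
    """Synthetic stand-in for predictions: exact coordinates plus gross outliers."""
    rng = np.random.default_rng(seed)
    K = np.array([[40.0, 0.0, w / 2], [0.0, 40.0, h / 2], [0.0, 0.0, 1.0]])
    Q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
    R = Q * np.sign(np.linalg.det(Q))                   # proper rotation
    t = rng.normal(size=3)
    v, u = np.mgrid[0:h, 0:w]
    pix_h = np.stack([u.ravel(), v.ravel(), np.ones(h * w)], axis=1).astype(float)
    depth = rng.uniform(1.0, 4.0, size=(h * w, 1))
    Xc = depth * (pix_h @ np.linalg.inv(K).T)           # back-project each pixel
    Xw = (Xc - t) @ R                                   # invert Xc = R Xw + t
    bad = rng.random(h * w) < outlier_frac
    Xw[bad] = rng.uniform(-5.0, 5.0, size=(int(bad.sum()), 3))
    return Xw.reshape(h, w, 3), K, R, t, ~bad

coord_map, K, R_true, t_true, good = make_synthetic_map()
R_est, t_est, inliers = ransac_pose(coord_map, K)
print("inliers found:", int(inliers.sum()), "of", int(good.sum()), "clean pixels")
```

The sketch scores each hypothesis by counting pixels whose predicted 3D point reprojects within `thresh_px` of its own pixel location, which is the sense in which wrong scene coordinate predictions are rejected as outliers. A production pipeline would more likely use a minimal PnP solver and a nonlinear refinement instead of the DLT fit used here.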
Original language: English
Title of host publication: Robotics: Science and Systems XIV
Publisher: University of Queensland
Number of pages: 9
ISBN (Electronic): 978-0-9923747-4-7
DOIs
Publication status: Published - 2018
MoE publication type: A4 Conference publication
Event: Robotics: Science and Systems Conference - Pittsburgh, United States
Duration: 26 Jun 2018 - 30 Jun 2018
Conference number: 14

Publication series

Name: Robotics: Science and Systems
ISSN (Electronic): 2330-765X

Conference

Conference: Robotics: Science and Systems Conference
Country/Territory: United States
City: Pittsburgh
Period: 26/06/2018 - 30/06/2018
