Digging Into Self-Supervised Learning of Feature Descriptors

Iaroslav Melekhov, Zakaria Laskar, Xiaotian Li, Shuzhe Wang, Juho Kannala

Research output: Chapter in Book/Report/Conference proceeding › Conference article in proceedings › Scientific › peer-review

3 Citations (Scopus)
66 Downloads (Pure)


Fully-supervised CNN-based approaches for learning local image descriptors have shown remarkable results in a wide range of geometric tasks. However, most of them require per-pixel ground-truth keypoint correspondence data, which is difficult to acquire at scale. To address this challenge, recent weakly- and self-supervised methods can learn feature descriptors from relative camera poses or using only synthetic rigid transformations such as homographies. In this work, we focus on understanding the limitations of existing self-supervised approaches and propose a set of improvements that, combined, lead to powerful feature descriptors. We show that increasing the search space from in-pair to in-batch for hard negative mining brings consistent improvement. To enhance the discriminativeness of feature descriptors, we propose a coarse-to-fine method for mining local hard negatives from a wider search space by using global visual image descriptors. We demonstrate that a combination of synthetic homography transformation, color augmentation, and photorealistic image stylization produces useful representations that are viewpoint and illumination invariant. The feature descriptors learned by the proposed approach perform competitively and surpass their fully- and weakly-supervised counterparts on various geometric benchmarks such as image-based localization, sparse feature matching, and image retrieval.
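
The difference between in-pair and in-batch hard negative mining mentioned in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the triplet margin loss, the function name `hardest_negative_triplet_loss`, the `pair_ids` argument, and the margin value are all assumptions chosen for illustration. The only point it shows is that the in-batch search considers a superset of the in-pair candidates, so the hardest negative it finds is never farther away.

```python
import numpy as np

def hardest_negative_triplet_loss(anchors, positives, pair_ids,
                                  in_batch=True, margin=1.0):
    """Triplet margin loss with hardest-negative mining (illustrative only).

    anchors, positives: (N, D) L2-normalized descriptors; row i of each
        array is a matching keypoint pair.
    pair_ids: (N,) id of the image pair each correspondence belongs to
        (hypothetical bookkeeping, not taken from the paper).
    in_batch: if True, negatives are searched over the whole batch;
        if False, only within the same image pair.
    """
    # Euclidean distance between unit vectors: d^2 = 2 - 2 * cosine similarity.
    sim = anchors @ positives.T
    dist = np.sqrt(np.maximum(2.0 - 2.0 * sim, 0.0))
    pos_dist = np.diag(dist).copy()

    # Every non-matching descriptor is a candidate negative.
    candidates = ~np.eye(len(anchors), dtype=bool)
    if not in_batch:
        # Restrict the search to descriptors from the same image pair.
        candidates &= pair_ids[:, None] == pair_ids[None, :]

    # Hardest negative = closest non-matching descriptor.
    hardest = np.where(candidates, dist, np.inf).min(axis=1)
    loss = np.maximum(margin + pos_dist - hardest, 0.0)
    return loss.mean(), hardest

# Usage: two image pairs with four correspondences each (random toy data).
rng = np.random.default_rng(0)
a = rng.normal(size=(8, 16))
a /= np.linalg.norm(a, axis=1, keepdims=True)
p = a + 0.1 * rng.normal(size=(8, 16))
p /= np.linalg.norm(p, axis=1, keepdims=True)
ids = np.array([0, 0, 0, 0, 1, 1, 1, 1])

loss_pair, hard_pair = hardest_negative_triplet_loss(a, p, ids, in_batch=False)
loss_batch, hard_batch = hardest_negative_triplet_loss(a, p, ids, in_batch=True)
```

Under these assumptions, `hard_batch` is elementwise no larger than `hard_pair`, so the in-batch loss is at least as large as the in-pair loss. This is one way to read the abstract's claim that a wider search space yields harder negatives.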
Original language: English
Title of host publication: Proceedings - 2021 International Conference on 3D Vision, 3DV 2021
Number of pages: 12
ISBN (Electronic): 978-1-6654-2688-6
ISBN (Print): 978-1-6654-2689-3
Publication status: Published - Jan 2022
MoE publication type: A4 Conference publication
Event: International Conference on 3D Vision - Virtual, Online, United Kingdom
Duration: 1 Dec 2021 to 3 Dec 2021
Conference number: 9

Publication series

Name: International Conference on 3D Vision proceedings
ISSN (Electronic): 2475-7888


Conference: International Conference on 3D Vision
Abbreviated title: 3DV
Country/Territory: United Kingdom
City: Virtual, Online

