Deep Learning Methods for Semantic Matching, Image Retrieval and Camera Relocalization

Zakaria Laskar

Research output: Thesis › Doctoral Thesis › Collection of Articles


Image matching is a central component in many computer vision applications. The field has progressed significantly with the advancement of deep learning models such as convolutional neural networks (CNNs). This thesis makes several contributions that advance the performance of existing CNN-based approaches in three closely related problem areas of image matching: semantic matching, image retrieval and image-based localization. First, the problem of data and ground-truth labelling efficiency for training CNN models is studied in the context of semantic matching. A weakly supervised method is presented to address the problem of learning from small training datasets. The method first generates additional training samples from existing data and then uses a novel loss function based on cyclic consistency to regularize the training process. Results show that the proposed method can learn from weakly labelled data without pixel-level correspondence information. In the next part of the thesis, we study the application of both global and local image matching to the problem of image retrieval. For particular landmark retrieval, the thesis studies the role of contextual information in the global query image representation, which existing approaches generally discard in order to remove noisy background information. An attention model is proposed that uses bottom-up saliency to modulate contextual information in intermediate CNN representations in a top-down manner. On the other hand, to address the challenges due to local variations in city-scale retrieval, the thesis proposes a geometric verification method using CNN-based image matching. In addition, it proposes a method for improving the accuracy and efficiency of the image matching method. Lastly, the thesis demonstrates methods that use the key concepts from image matching and image retrieval to address problems in the field of image-based localization. In contrast to existing approaches, the proposed method can be applied to novel scenes not seen during training and scales favourably with the size of the environment. In addition, a challenging indoor localization dataset is made publicly available to address the limitations of existing datasets.
Translated title of the contribution: Deep Learning Methods for Semantic Matching, Image Retrieval and Camera Relocalization
Original language: English
Qualification: Doctor's degree
Awarding Institution: Aalto University
Supervising Professor: Kannala, Juho
Print ISBNs: 978-952-64-0145-4
Electronic ISBNs: 978-952-64-0146-1
Publication status: Published - 2020
MoE publication type: G5 Doctoral dissertation (article)


Keywords
  • computer vision
  • machine learning
  • deep learning
  • camera relocalization
  • image retrieval
  • image matching

