Using semantic features to improve large-scale visual concept detection

Mats Sjöberg, Jorma Laaksonen

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Scientific › peer-review

    3 Citations (Scopus)


    Many multimedia benchmarks and databases are currently available with a predefined set of concepts for which detectors can be trained or are even already available. One can use these background concepts to form a semantic concept vector for each image or video in the database by concatenating the concept prediction outputs. In this paper we investigate the use of such semantic concept features for detecting novel concepts in two large-scale experiments: the TRECVID 2012 evaluation with 800 hours of video data, and MIRFLICKR with 1 million images. We show that detection performance can improve significantly over using visual features alone. In some applications, computationally expensive kernel classifiers cannot be used in the detection phase, and our experiments show a consistent and significant improvement with fast linear classifiers when the visual features are replaced by the semantic concept feature. We also propose a Self-Organising Map-based method which affords fast training-free detection and intuitive visualisation properties.
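    The abstract's core idea of concatenating the outputs of background-concept detectors into a semantic concept vector can be sketched as follows. This is a minimal illustration, not the paper's implementation: the concept names and scores are hypothetical, and a fixed concept ordering is assumed so that vectors are comparable across images.

    ```python
    import numpy as np

    # Hypothetical per-detector prediction scores for one image: each
    # background-concept detector outputs a confidence in [0, 1].
    # Concept names and values are illustrative, not from the paper.
    detector_outputs = {
        "sky":    0.91,
        "person": 0.12,
        "car":    0.05,
        "indoor": 0.33,
    }

    def semantic_concept_vector(outputs, concept_order):
        """Concatenate concept prediction outputs into one feature
        vector, following a fixed concept ordering so that vectors
        from different images are directly comparable."""
        return np.array([outputs[c] for c in concept_order])

    concepts = sorted(detector_outputs)  # fixed, reproducible ordering
    vec = semantic_concept_vector(detector_outputs, concepts)
    # vec can then replace (or complement) low-level visual features as
    # input to a fast linear classifier for a novel target concept.
    ```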

    Original language: English
    Title of host publication: 2014 12th International Workshop on Content-Based Multimedia Indexing, CBMI 2014
    Number of pages: 6
    Publication status: Published - 2014
    MoE publication type: A4 Article in a conference publication
    Event: International Workshop on Content-Based Multimedia Indexing - Klagenfurt, Austria
    Duration: 18 Jun 2014 - 20 Jun 2014
    Conference number: 12

    Publication series

    Name: International Workshop on Content-Based Multimedia Indexing
    ISSN (Print): 1949-3983
    ISSN (Electronic): 1949-3991


    Workshop: International Workshop on Content-Based Multimedia Indexing
    Abbreviated title: CBMI


