Terabyte-scale image similarity search: Experience and best practice

Diana Moise, Denis Shestakov, Gylfi Gudmundsson, Laurent Amsaleg

    Research output: Chapter in Book/Report/Conference proceeding, Conference article in proceedings, Scientific, peer-review

    20 Citations (Scopus)

    Abstract

    While the past decade has witnessed unprecedented growth in the data generated and collected all over the world, existing data management approaches lack the ability to address the challenges of Big Data. One of the most promising tools for Big Data processing is the MapReduce paradigm. Although it has its limitations, the MapReduce programming model has laid the foundations for answering some of the Big Data challenges. In this paper, we focus on Hadoop, the open-source implementation of the MapReduce paradigm. Using a Hadoop-based application, image similarity search, as a case study, we present our experiences with the Hadoop framework when processing terabytes of data. The scale of the data and the application workload allowed us to test the limits of Hadoop and the efficiency of the tools it provides. We present a wide collection of experiments and the practical lessons we have drawn from our experience with the Hadoop environment. Our findings can be shared as best practices and recommendations for Big Data researchers and practitioners.

    Original language: English
    Title of host publication: Proceedings - 2013 IEEE International Conference on Big Data, Big Data 2013
    Pages: 674-682
    Number of pages: 9
    ISBN (Electronic): 978-1-4799-1293-3
    Publication status: Published - 2013
    MoE publication type: A4 Conference publication
    Event: IEEE International Conference on Big Data - Santa Clara, United States
    Duration: 6 Oct 2013 to 9 Oct 2013

    Conference

    Conference: IEEE International Conference on Big Data
    Abbreviated title: Big Data
    Country/Territory: United States
    City: Santa Clara
    Period: 06/10/2013 to 09/10/2013
