Multi-stream Convolutional Networks for Indoor Scene Recognition

Rao Muhammad Anwer*, Fahad Shahbaz Khan, Jorma Laaksonen, Nazar Zaki

*Corresponding author for this work

Research output: Article in book/conference proceedings › Conference contribution › Scientific › peer-reviewed

Abstract

Convolutional neural networks (CNNs) have recently achieved outstanding results for various vision tasks, including indoor scene understanding. The de facto practice employed by state-of-the-art indoor scene recognition approaches is to use RGB pixel values as input to CNN models that are trained on large amounts of labeled data (ImageNet or Places). Here, we investigate CNN architectures by augmenting RGB images with estimated depth and texture information, as multiple streams, for monocular indoor scene recognition. First, we exploit the recent advancements in the field of depth estimation from monocular images and use the estimated depth information to train a CNN model for learning deep depth features. Second, we train a CNN model to exploit the successful Local Binary Patterns (LBP) by using mapped coded images with explicit LBP encoding to capture texture information available in indoor scenes. We further investigate different fusion strategies to combine the learned deep depth and texture streams with the traditional RGB stream. Comprehensive experiments are performed on three indoor scene classification benchmarks: MIT-67, OCIS and SUN-397. The proposed multi-stream network significantly outperforms the standard RGB network, achieving absolute gains of 9.3%, 4.7% and 7.3% on the MIT-67, OCIS and SUN-397 datasets, respectively.
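
For readers unfamiliar with Local Binary Patterns, the following is a minimal sketch of the basic 8-neighbour LBP code that the texture stream builds on. It is not the authors' pipeline: the abstract mentions "mapped coded images", and that mapping step is not reproduced here. The function name `lbp_codes`, the clockwise bit order, and the `>=` comparison are assumptions made for illustration.

```python
import numpy as np


def lbp_codes(gray: np.ndarray) -> np.ndarray:
    """Return basic 3x3 LBP codes for the interior pixels of a 2-D grayscale image.

    Illustrative sketch only; bit order and tie handling are assumptions.
    """
    gray = np.asarray(gray, dtype=np.float64)
    h, w = gray.shape
    center = gray[1:-1, 1:-1]
    # Eight neighbour offsets (dy, dx), clockwise from top-left.
    # The position in this list decides which bit each neighbour sets.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    codes = np.zeros(center.shape, dtype=np.uint8)
    for bit, (dy, dx) in enumerate(offsets):
        # Shifted view of the image aligned with the interior pixels.
        neighbor = gray[1 + dy:h - 1 + dy, 1 + dx:w - 1 + dx]
        # Set this bit wherever the neighbour is at least as bright as the centre.
        codes |= (neighbor >= center).astype(np.uint8) * np.uint8(1 << bit)
    return codes
```

The resulting code image has one value in 0..255 per interior pixel; it can be summarised as a histogram or, as the abstract describes, encoded into an image that a CNN takes as input.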

Original language: English
Title of host publication: Computer Analysis of Images and Patterns - 18th International Conference, CAIP 2019, Proceedings
Editors: Mario Vento, Gennaro Percannella
Pages: 196-208
Number of pages: 13
DOI (permanent link): https://doi.org/10.1007/978-3-030-29888-3_16
Publication status: Published - 1 January 2019
MoE publication type: A4 Article in conference proceedings
Event: International Conference on Computer Analysis of Images and Patterns - Salerno, Italy
Duration: 3 September 2019 to 5 September 2019
Conference number: 18

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Publisher: Springer
Volume: 11678 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: International Conference on Computer Analysis of Images and Patterns
Abbreviated title: CAIP
Country: Italy
City: Salerno
Period: 03/09/2019 to 05/09/2019


  • Projects

    MeMAD Laaksonen

    Laaksonen, J., Sjöberg, M., Laria Mantecon, H. & Pehlivan Tort, S.

    01/01/2018 to 31/12/2020

    Project: EU: Framework programmes funding

  • Equipment

    Science-IT

    Mikko Hakala (Manager)

    School of Science

    Equipment/facility: Facility

  • Cite this

    Anwer, R. M., Khan, F. S., Laaksonen, J., & Zaki, N. (2019). Multi-stream Convolutional Networks for Indoor Scene Recognition. In M. Vento, & G. Percannella (Eds.), Computer Analysis of Images and Patterns - 18th International Conference, CAIP 2019, Proceedings (pp. 196-208). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 11678 LNCS). https://doi.org/10.1007/978-3-030-29888-3_16