Abstract
Convolutional neural networks (CNNs) have recently achieved outstanding results for various vision tasks, including indoor scene understanding. The de facto practice employed by state-of-the-art indoor scene recognition approaches is to use RGB pixel values as input to CNN models that are trained on large amounts of labeled data (ImageNet or Places). Here, we investigate CNN architectures that augment RGB images with estimated depth and texture information, as multiple streams, for monocular indoor scene recognition. First, we exploit recent advances in depth estimation from monocular images and use the estimated depth information to train a CNN model for learning deep depth features. Second, we train a CNN model to exploit the successful Local Binary Patterns (LBP) by using mapped coded images with explicit LBP encoding to capture the texture information available in indoor scenes. We further investigate different fusion strategies to combine the learned deep depth and texture streams with the traditional RGB stream. Comprehensive experiments are performed on three indoor scene classification benchmarks: MIT-67, OCIS, and SUN-397. The proposed multi-stream network significantly outperforms the standard RGB network, achieving absolute gains of 9.3%, 4.7%, and 7.3% on the MIT-67, OCIS, and SUN-397 datasets, respectively.
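To make the two ingredients named in the abstract more concrete, here is a minimal sketch in plain Python. It is not the authors' implementation. `lbp_codes` computes the basic 3x3 LBP code per pixel, which is only the starting point and not the "mapped coded images" the paper trains on. `late_fusion` averages per-stream class probabilities, which is one simple fusion rule assumed here for illustration and not necessarily one of the strategies the paper evaluates. All function names, the neighbour ordering, and the example logits are hypothetical.

```python
import math

# Offsets of the 8 neighbours, clockwise from top-left.
# The ordering is an assumption; any fixed order gives a valid LBP code.
NEIGHBOURS = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
              (1, 1), (1, 0), (1, -1), (0, -1)]


def lbp_codes(gray):
    """Basic 3x3 LBP codes for the interior pixels of a 2-D grayscale image.

    `gray` is a list of equal-length rows of numbers. Border pixels are
    skipped, so the result is (h - 2) x (w - 2).
    """
    h, w = len(gray), len(gray[0])
    codes = []
    for r in range(1, h - 1):
        row = []
        for c in range(1, w - 1):
            centre = gray[r][c]
            code = 0
            for bit, (dr, dc) in enumerate(NEIGHBOURS):
                # Set the bit when the neighbour is at least as bright.
                if gray[r + dr][c + dc] >= centre:
                    code |= 1 << bit
            row.append(code)
        codes.append(row)
    return codes


def softmax(logits):
    """Numerically stable softmax over a list of class scores."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def late_fusion(stream_logits, weights=None):
    """Weighted average of per-stream class probabilities.

    `stream_logits` holds one list of class scores per stream, for example
    RGB, depth, and texture. Equal weights are used when none are given.
    """
    n_streams = len(stream_logits)
    if weights is None:
        weights = [1.0 / n_streams] * n_streams
    probs = [softmax(scores) for scores in stream_logits]
    n_classes = len(probs[0])
    return [sum(w * p[k] for w, p in zip(weights, probs))
            for k in range(n_classes)]


if __name__ == "__main__":
    patch = [[9, 0, 0],
             [0, 5, 0],
             [0, 0, 0]]
    print(lbp_codes(patch))

    # Hypothetical class scores from three streams for a 3-class problem.
    rgb, depth, texture = [2.0, 0.0, 0.0], [0.0, 1.0, 0.0], [1.5, 0.0, 0.0]
    print(late_fusion([rgb, depth, texture]))
```

In a real multi-stream network the class scores would come from separately trained CNNs, and fusion could equally happen earlier, on feature vectors rather than on output probabilities.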
Original language | English |
---|---|
Title of host publication | Computer Analysis of Images and Patterns - 18th International Conference, CAIP 2019, Proceedings |
Editors | Mario Vento, Gennaro Percannella |
Publisher | Springer |
Pages | 196-208 |
Number of pages | 13 |
ISBN (Print) | 9783030298876 |
Publication status | Published - 1 Jan 2019 |
MoE publication type | A4 Conference publication |
Event | International Conference on Computer Analysis of Images and Patterns, Salerno, Italy; Duration: 3 Sept 2019 → 5 Sept 2019; Conference number: 18 |
Publication series
Name | Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) |
---|---|
Publisher | Springer |
Volume | 11678 LNCS |
ISSN (Print) | 0302-9743 |
ISSN (Electronic) | 1611-3349 |
Conference
Conference | International Conference on Computer Analysis of Images and Patterns |
---|---|
Abbreviated title | CAIP |
Country/Territory | Italy |
City | Salerno |
Period | 03/09/2019 → 05/09/2019 |
Keywords
- Depth features
- Scene recognition
- Texture features
Fingerprint
Research topics of 'Multi-stream Convolutional Networks for Indoor Scene Recognition'.
Projects
2 Finished
- MeMAD Laaksonen
  Laaksonen, J. (Principal investigator), Sjöberg, M. (Project Member), Pehlivan Tort, S. (Project Member) & Laria Mantecon, H. (Project Member)
  01/01/2018 → 31/03/2021
  Project: EU: Framework programmes funding
- Deep neural networks in scene graph generation for perception of visual multimedia semantics
  Laaksonen, J. (Principal investigator), Anwer, R. (Project Member), Sjöberg, M. (Project Member), Pehlivan Tort, S. (Project Member) & Wang, T.-J. (Project Member)
  01/01/2018 → 31/12/2019
  Project: Academy of Finland: Other research funding