Two-layer contractive encodings for learning stable nonlinear features

Hannes Schulz*, Kyunghyun Cho, Tapani Raiko, Sven Behnke

*Corresponding author for this work

Research output: Contribution to journal › Article › Scientific › peer-review

17 Citations (Scopus)

Abstract

Unsupervised learning of feature hierarchies is often a good strategy to initialize deep architectures for supervised learning. Most existing deep learning methods build these feature hierarchies layer by layer in a greedy fashion, using either auto-encoders or restricted Boltzmann machines. Both yield encoders that compute a linear projection of the input followed by a smooth thresholding function. In this work, we demonstrate that such encoders fail to find stable features when the required computation is in the exclusive-or class. To overcome this limitation, we propose a two-layer encoder which is less restricted in the type of features it can learn. The proposed encoder is regularized by an extension of previous work on contractive regularization. The resulting two-layer contractive encoder poses a potentially more difficult optimization problem, and we further propose linearly transforming the hidden neurons of the encoder to make learning easier. We demonstrate the advantages of the two-layer encoders qualitatively on artificially constructed datasets as well as on commonly used benchmark datasets. We also conduct experiments on a semi-supervised learning task and show the benefits of the proposed two-layer encoders trained with the linear transformation of perceptrons.
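To make the central quantity concrete, the following is a minimal sketch of the two-layer contractive penalty described in the abstract, assuming sigmoid activations and NumPy; the function names and shapes are illustrative and not taken from the paper's implementation. For an encoder h2 = s(W2 s(W1 x + b1) + b2), the chain rule gives the Jacobian J = dh2/dx = D2 W2 D1 W1 with Dk = diag(s'(ak)), and the contractive regularizer is the squared Frobenius norm ||J||_F^2.

    import numpy as np

    def sigmoid(a):
        return 1.0 / (1.0 + np.exp(-a))

    def two_layer_encoder(x, W1, b1, W2, b2):
        # h1 = s(W1 x + b1), h2 = s(W2 h1 + b2)
        h1 = sigmoid(W1 @ x + b1)
        h2 = sigmoid(W2 @ h1 + b2)
        return h1, h2

    def contractive_penalty(x, W1, b1, W2, b2):
        # ||dh2/dx||_F^2 for the two-layer encoder above.
        # With sigmoid units, s'(a) = h * (1 - h), so the Jacobian is
        # J = diag(h2*(1-h2)) @ W2 @ diag(h1*(1-h1)) @ W1.
        h1, h2 = two_layer_encoder(x, W1, b1, W2, b2)
        J = ((h2 * (1.0 - h2))[:, None] * W2) @ ((h1 * (1.0 - h1))[:, None] * W1)
        return np.sum(J ** 2)

    # Illustrative usage with random weights (shapes are arbitrary).
    rng = np.random.default_rng(0)
    W1, b1 = rng.normal(scale=0.1, size=(8, 4)), np.zeros(8)
    W2, b2 = rng.normal(scale=0.1, size=(3, 8)), np.zeros(3)
    x = rng.normal(size=4)
    penalty = contractive_penalty(x, W1, b1, W2, b2)

In training, this penalty would be added with a weighting coefficient to the reconstruction loss. The linear transformation of perceptrons that the paper uses to ease optimization of the deeper encoder is omitted from this sketch.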

Original language: English
Pages (from-to): 4-11
Number of pages: 8
Journal: Neural Networks
Volume: 64
DOIs
Publication status: Published - 1 Apr 2015
MoE publication type: A1 Journal article-refereed

Keywords

  • Deep learning
  • Linear transformation
  • Multi-layer perceptron
  • Pretraining
  • Semi-supervised learning
  • Two-layer contractive encoding
