Abstract
Hyperparameter selection generally relies on running multiple full
training trials, with the choice of hyperparameters based on validation
set performance. We propose a gradient-based approach for locally
adjusting hyperparameters on the fly, in which the hyperparameters are
updated so as to make the model parameter gradients, and hence the
parameter updates, more advantageous for the validation cost. We explore
the approach for tuning regularization hyperparameters and find that, in
experiments on MNIST, the resulting regularization levels fall within the
optimal regions. The method is less computationally demanding than
similar gradient-based approaches to hyperparameter selection, requires
only a few trials, and consistently finds solid hyperparameter values,
which makes it a useful tool for training neural network models.
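As a rough illustration of the idea, the sketch below tunes the weight of an L2 penalty on a toy least-squares problem by differentiating the validation cost through one unrolled parameter update. It is a minimal sketch, not the paper's implementation: the functions and names (`train_loss`, `val_loss`, `lam`, the learning rates) are illustrative assumptions.

```python
# Minimal sketch of on-the-fly regularization tuning via a hypergradient,
# assuming a simple L2-regularized least-squares model (illustrative only).
import jax
import jax.numpy as jnp

def train_loss(w, lam, X, y):
    # Training cost: data fit plus an L2 penalty weighted by exp(lam);
    # parameterizing in log-space keeps the penalty positive.
    resid = X @ w - y
    return jnp.mean(resid ** 2) + jnp.exp(lam) * jnp.sum(w ** 2)

def val_loss(w, Xv, yv):
    # Validation cost: data fit only, no regularization term.
    return jnp.mean((Xv @ w - yv) ** 2)

def step(w, lam, X, y, Xv, yv, lr=1e-2, hlr=1e-2):
    # One gradient step on the parameters, written as a function of lam
    # so we can differentiate the *updated* parameters w.r.t. lam.
    def updated_w(lam_):
        return w - lr * jax.grad(train_loss)(w, lam_, X, y)

    # Hypergradient: how the post-update validation cost depends on lam.
    hypergrad = jax.grad(lambda lam_: val_loss(updated_w(lam_), Xv, yv))(lam)
    return updated_w(lam), lam - hlr * hypergrad

# Toy data to exercise the loop.
X = jax.random.normal(jax.random.PRNGKey(0), (64, 5))
w_true = jnp.arange(1.0, 6.0)
y = X @ w_true
Xv = jax.random.normal(jax.random.PRNGKey(1), (32, 5))
yv = Xv @ w_true

w, lam = jnp.zeros(5), jnp.array(0.0)
for _ in range(500):
    w, lam = step(w, lam, X, y, Xv, yv)
print("tuned log L2 weight:", lam)
```

In this toy setting the hyperparameter update rides along with ordinary training, so no separate full trials are needed to search over the penalty weight.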
| Field | Value |
| --- | --- |
| Original language | English |
| Publication status | Published - 2015 |
| MoE publication type | D4 Published development or research report or study |
Keywords
- Computer Science - Learning