L4: Practical loss-based stepsize adaptation for deep learning

NIPS 2018

Authors: Michal Rolinek, Georg Martius

Is Adam really the best we can do? A simple enough update rule can dramatically outperform Adam on some datasets. The optimizer turned out not to be very robust, but it had its moments, such as actually driving the training loss on MNIST to 0.0 in 20 epochs.

Links: Arxiv Github

Tired of tuning parameters of SGD or Adam for #DeepLearning? Our new optimizer (https://t.co/90hi80ghna) works much better than the best constant learning rates. Try it out: #Tensorflow code included, see https://t.co/k4YVzeqJrF pic.twitter.com/qmBxsYWYgA
— Georg Martius (@GMartius) February 19, 2018


Michal Rolínek
