
ON EMPIRICAL COMPARISONS OF OPTIMIZERS FOR DEEP LEARNING

Comments

Roger Grosse: In deep learning research, the sky turns out to be blue, but only if you measure it very carefully. Interesting meta-scientific paper on evaluating neural net optimizers, by Choi et al. https://arxiv.org/pdf/1910.05446.pdf

2 replies, 203 likes


Sebastian Raschka: "On Empirical Comparisons of Optimizers for Deep Learning" => "As tuning effort grows without bound, more general optimizers should never underperform the ones they can approximate" https://arxiv.org/abs/1910.05446 https://t.co/hUIGMyshkC

2 replies, 183 likes
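
The quoted claim is the paper's inclusion argument: an optimizer that contains another as a special case can, with enough tuning, imitate it. Below is a minimal numerical sketch of one such inclusion, assuming the standard Adam update and an EMA-style momentum-SGD update (this is illustrative code, not code from the paper): with a very large eps and the learning rate rescaled by eps, an Adam step is nearly indistinguishable from a momentum step.

import numpy as np

def adam_step(theta, m, v, g, t, lr, beta1=0.9, beta2=0.999, eps=1e-8):
    # One Adam update (Kingma & Ba, 2015) with bias correction.
    m = beta1 * m + (1 - beta1) * g
    v = beta2 * v + (1 - beta2) * g**2
    m_hat = m / (1 - beta1**t)
    v_hat = v / (1 - beta2**t)
    theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v

def momentum_step(theta, m, g, t, lr, beta1=0.9):
    # SGD with an exponential-moving-average momentum buffer,
    # bias-corrected the same way Adam corrects its first moment.
    m = beta1 * m + (1 - beta1) * g
    theta = theta - lr * m / (1 - beta1**t)
    return theta, m

rng = np.random.default_rng(0)
theta0 = rng.normal(size=5)
theta_a, theta_m = theta0.copy(), theta0.copy()
m_a, v_a, m_m = np.zeros(5), np.zeros(5), np.zeros(5)
base_lr, big_eps = 0.1, 1e6   # eps so large that sqrt(v_hat) is negligible

for t in range(1, 6):
    g = rng.normal(size=5)    # stand-in for a minibatch gradient
    theta_a, m_a, v_a = adam_step(theta_a, m_a, v_a, g, t,
                                  lr=base_lr * big_eps, eps=big_eps)
    theta_m, m_m = momentum_step(theta_m, m_m, g, t, lr=base_lr)
    print(t, np.max(np.abs(theta_a - theta_m)))   # stays tiny (order 1e-7 here)

With beta1 = 0 the same limit recovers plain SGD, which is the sense in which Adam "contains" both SGD and momentum in the paper's taxonomy.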


Dmytro Mishkin: Tl;dr: Adam >> SGD, if you tune eps, momentum, and the lr schedule for it

0 replies, 14 likes


Daisuke Okanohara: In NN optimization, often-ignored metaparameters are actually important. In particular, "eps" is usually left at its default of 1e-8, but the optimal value can be anywhere from 1 to 10^4. With a full metaparameter search, Adam and NAdam outperform SGD and momentum. https://arxiv.org/abs/1910.05446

0 replies, 13 likes
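
For concreteness, here is an illustrative sketch of the kind of search that comment describes: a log-uniform random search over Adam's metaparameters, with eps allowed to range far above its 1e-8 default. The ranges and the train_and_eval hook are assumptions made up for illustration, not the paper's actual search spaces or protocol.

import random

def sample_adam_config(rng):
    # Log-uniform draws; every range here is an assumption for illustration.
    return {
        "lr":    10 ** rng.uniform(-5, 0),       # learning rate
        "beta1": 1 - 10 ** rng.uniform(-3, 0),   # first-moment / momentum coefficient
        "beta2": 1 - 10 ** rng.uniform(-4, 0),   # second-moment coefficient
        "eps":   10 ** rng.uniform(-10, 4),      # search far beyond the 1e-8 default
    }

def random_search(train_and_eval, n_trials=50, seed=0):
    # train_and_eval(config) -> validation error; a hypothetical hook the
    # caller supplies to train a model with the sampled config and score it.
    rng = random.Random(seed)
    best_cfg, best_err = None, float("inf")
    for _ in range(n_trials):
        cfg = sample_adam_config(rng)
        err = train_and_eval(cfg)
        if err < best_err:
            best_cfg, best_err = cfg, err
    return best_cfg, best_err

The design point, per the comment and the paper, is that leaving eps and the momentum terms at their defaults quietly shrinks the search space, which can be enough to change which optimizer appears to win.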


arxiv: On Empirical Comparisons of Optimizers for Deep Learning. http://arxiv.org/abs/1910.05446 https://t.co/QVxeQCpAOL

0 replies, 10 likes


Delip Rao: 2. "Did you optimize your hyperparameters?" With compute costs coming down, it is becoming more affordable to run hyperparameter optimization, as long as you stay away from Sesame Street. It would be interesting to condition this on where the authors are coming from. https://t.co/9g9PgepxFO

1 replies, 7 likes


Content

Found on Oct 15 2019 at https://arxiv.org/pdf/1910.05446.pdf

PDF content of a computer science paper: ON EMPIRICAL COMPARISONS OF OPTIMIZERS FOR DEEP LEARNING