hardmaru: These two figures basically list all of the methods people have tried so far to make stochastic gradient descent work
“Descending through a Crowded Valley — Benchmarking Deep Learning Optimizers”
11 replies, 1058 likes
Brandon Rohrer: One thing I get from this study is that if you had to choose a reliable, robust, one-size-fits-all solution, out-of-the-box Adam is pretty dang good.
2 replies, 62 likes
Christian S. Perone: "Descending through a Crowded Valley" (https://arxiv.org/abs/2007.01547), an extremely useful take on optimization for Deep Learning: "Perhaps the most important takeaway from our study is hidden in plain sight: the field is in danger of being drowned by noise.", figures from the paper. https://t.co/TZBSGsotM4
1 replies, 26 likes
Grid AI: For most applied problems, the optimizer choice is largely irrelevant. Use good defaults like ADAM or SGD.
The most important thing is to tune the learning rate - most gains come from here.
If you have limited budget and don’t know what to tune, learning rate is it!
0 replies, 19 likes
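The learning-rate advice in the tweet above can be sketched in a few lines. This is a minimal illustration, not code from the paper: it sweeps a small log-spaced grid of learning rates for plain gradient descent on a toy quadratic loss, and keeps the one with the lowest final loss. The loss function, grid values, and step count are all illustrative assumptions.

```python
# Toy learning-rate sweep (illustrative only, not from the benchmark paper).
# Loss: f(w) = (w - 3)^2, minimized by plain gradient descent.

def gradient_descent(lr, steps=100, w0=0.0):
    """Run gradient descent on f(w) = (w - 3)^2; return the final loss."""
    w = w0
    for _ in range(steps):
        grad = 2.0 * (w - 3.0)  # df/dw
        w -= lr * grad
    return (w - 3.0) ** 2

# Sweep a small log-spaced grid of learning rates and keep the best.
candidates = [1e-3, 1e-2, 1e-1, 0.5]
results = {lr: gradient_descent(lr) for lr in candidates}
best_lr = min(results, key=results.get)
```

The same pattern scales up with any optimizer (Adam, SGD, etc.): fix the optimizer to its defaults, sweep only the learning rate, and pick by validation loss.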
Rubén Arce Santolaya: Benchmarking #DeepLearning #MachineLearning
2 replies, 18 likes
Carl Carrie (@🏠): Analyzing various optimizer characteristics and their performance for deep learning is the subject of this paper
0 replies, 13 likes
Loreto Parisi: A comprehensive guide of #NeuralNetwork optimizers with benchmarking and #python code to run your Gradient Descent tests! 👌 https://github.com/SirRob1997/Crowded-Valley---Results
0 replies, 8 likes
Frank Schneider: Wow, it is great to see our work (@schmidtr97 & @PhilippHennig5) and this topic being discussed!
0 replies, 4 likes
Wálé Akínfadérìn: Really interesting idea. I’m a huge fan of rigorous empirical testing.
DESCENDING THROUGH A CROWDED VALLEY — BENCHMARKING DEEP LEARNING OPTIMIZERS
0 replies, 3 likes
DrHB: @rasbt Nice! There is also this nice article that summarises everything with experimental evidence https://arxiv.org/abs/2007.01547
0 replies, 2 likes
Federico Andres Lois: The kind of papers I like. Meta-analysis, systematic, and practical applicability.
1 replies, 1 likes
Leo Boytsov: 👇"we perform an extensive, standardized benchmark of more than a dozen particularly popular deep learning optimizers while giving a concise overview of the wide range of possible choices. Analyzing almost 35 000 individual runs, we contribute the following three points:"
1 replies, 0 likes
Found on Oct 10 2020 at https://arxiv.org/pdf/2007.01547.pdf