Jascha Sohl-Dickstein: Modern deep learning is a story of learned features outperforming (then replacing!) hand-designed algorithms. But we still use hand-designed loss functions and optimizers. Here is a big step towards learned optimizers outperforming existing optimizers: http://arxiv.org/abs/2009.11243 https://t.co/Pg4ehwoEEg
7 replies, 1066 likes
Luke Metz: We have a new paper on learned optimizers! We used thousands of tasks (and a lot of compute 😬) to train general purpose learned optimizers that perform well on never-before-seen tasks, and can even train new versions of themselves.
16 replies, 1021 likes
Chip Huyen: One direction in AutoML I’m really excited about is learned optimizers: training optimizers to replace hand-designed optimizers (e.g., Adam, SGD).
With 6k tasks and A LOT of compute, the authors found a learned optimizer that can train itself to be better 🤯
1 reply, 259 likes
Jeff Dean (@🏡): Nice thread by @Luke_Metz summarizing new work on learned optimizers.
2 replies, 45 likes
Jack Clark: Neat to see a mention of AGI in the 'broader impacts' section of @GoogleAI's paper on Learned Optimizers.
Writing up the paper for Import AI - learning to learn has become learning how to learn tools that learn how to learn efficient training. https://arxiv.org/abs/2009.11243 https://t.co/l0lcRbYXMH
2 replies, 42 likes
Aran Komatsuzaki: Tasks, stability, architecture, and compute: Training more effective learned optimizers, and using them to train themselves
A learned optimizer that generalizes better to unseen tasks and enables automatic regularization.
0 replies, 18 likes
Parag Agrawal: Awesome work, fascinating results, great summary thread 💯
1 reply, 16 likes
Sam Greydanus: "Optimizers all the way down" 🐢
0 replies, 11 likes
Sterling Crispin 🕊️: Things like this feel like the future of machine learning: meta-learning, and doing away with hand-tuned hyperparameters. Totally awesome work
0 replies, 11 likes
Alexander Kruel: This is IMPORTANT.
0 replies, 9 likes
Daisuke Okanohara: They train an optimizer on thousands of tasks, optimizing it with ES (evolution strategies) to minimize each task's test loss. It acquires weight decay-like regularization. The learned optimizer can even optimize the task of training a learned optimizer, showing its ability to generalize across tasks. https://arxiv.org/abs/2009.11243
0 replies, 8 likes
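[Editor's note: for readers who want the mechanics behind Okanohara's summary, here is a minimal NumPy sketch of antithetic-ES meta-training: perturb the learned optimizer's meta-parameters, run inner training on a sampled task, and descend the estimated gradient of the test loss. `inner_train_test_loss` and the flat `theta` vector are hypothetical stand-ins for illustration, not the paper's actual interface.]

import numpy as np

def inner_train_test_loss(theta, task):
    # Stand-in: in the paper this trains a model on `task` using the
    # learned optimizer defined by `theta`, then returns the task's
    # test loss. Here we fake the whole inner loop with a quadratic.
    return float(np.sum((theta - task) ** 2))

def es_meta_step(theta, tasks, rng, sigma=0.01, lr=0.001, n_pairs=64):
    # One antithetic-ES update of the learned optimizer's meta-parameters.
    grad_est = np.zeros_like(theta)
    for _ in range(n_pairs):
        eps = rng.standard_normal(theta.shape)
        task = tasks[rng.integers(len(tasks))]  # sample a meta-training task
        loss_pos = inner_train_test_loss(theta + sigma * eps, task)
        loss_neg = inner_train_test_loss(theta - sigma * eps, task)
        # ES estimate of d(test loss)/d(theta) from the antithetic pair.
        grad_est += (loss_pos - loss_neg) / (2.0 * sigma) * eps
    grad_est /= n_pairs
    return theta - lr * grad_est  # descend the estimated meta-gradient

# Toy usage: meta-train on a population of random "tasks".
rng = np.random.default_rng(0)
theta = rng.standard_normal(16)                        # meta-parameters
tasks = [rng.standard_normal(16) for _ in range(100)]  # toy task set
for _ in range(200):
    theta = es_meta_step(theta, tasks, rng)

Because ES only needs the scalar test loss of each perturbed inner run, it sidesteps backpropagating through thousands of inner optimization steps, which is one reason the paper uses it.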
hardmaru: Tasks, stability, architecture, and compute: Training more effective learned optimizers, and using them to train themselves
In place of hand-engineered optimizers like Adam, they look at RNNs trained (on thousands of tasks) to output parameter update rules.
0 replies, 4 likes
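[Editor's note: to make hardmaru's description concrete, below is a simplified per-parameter learned optimizer in Python. The paper's model is a hierarchical RNN; this sketch flattens it into a tiny MLP over per-parameter features (the gradient plus momentum at several timescales) with a direction-times-exponentiated-step-size output, a common parameterization in this line of work. All names and constants here are illustrative assumptions.]

import numpy as np

class LearnedOptimizer:
    def __init__(self, theta, betas=(0.9, 0.99, 0.999)):
        self.w1, self.b1, self.w2, self.b2 = theta  # meta-learned weights
        self.betas = betas
        self.moms = None  # momentum state, one accumulator per beta

    def step(self, params, grads):
        if self.moms is None:
            self.moms = [np.zeros_like(params) for _ in self.betas]
        # Update momentum accumulators at several timescales.
        for i, b in enumerate(self.betas):
            self.moms[i] = b * self.moms[i] + (1 - b) * grads
        # Per-parameter feature vector: [grad, mom_1, ..., mom_k].
        feats = np.stack([grads] + self.moms, axis=-1)
        hidden = np.tanh(feats @ self.w1 + self.b1)
        out = hidden @ self.w2 + self.b2  # two outputs per parameter
        direction, log_step = out[..., 0], out[..., 1]
        # Small multipliers keep the initial updates near zero.
        return params - 0.001 * np.exp(0.001 * log_step) * direction

# Toy usage: one update step on a quadratic.
rng = np.random.default_rng(0)
n_feats, hidden = 4, 8  # grad + 3 momenta -> small hidden layer
theta = (0.1 * rng.standard_normal((n_feats, hidden)), np.zeros(hidden),
         0.1 * rng.standard_normal((hidden, 2)), np.zeros(2))
opt = LearnedOptimizer(theta)
params = rng.standard_normal(10)
params = opt.step(params, grads=2 * params)

The meta-parameters `theta` of an update rule like this are exactly what the ES loop sketched above is tuning.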
elvis: Interesting work on using thousands of tasks to train optimizers that generalize better to unseen tasks.
The authors claim that these learned optimizers outperform existing optimizers, are adaptive, and are potentially useful for out-of-distribution tasks.
0 replies, 3 likes
David: #DeepLearning #MachineLearning New learned optimizers, an LSTM-based approach
0 replies, 1 like
Found on Sep 24 2020 at https://arxiv.org/pdf/2009.11243.pdf