
Tasks, stability, architecture, and compute: Training more effective learned optimizers, and using them to train themselves

Comments

Jascha Sohl-Dickstein: Modern deep learning is a story of learned features outperforming (then replacing!) hand-designed algorithms. But we still use hand-designed loss functions and optimizers. Here is a big step towards learned optimizers outperforming existing optimizers: http://arxiv.org/abs/2009.11243 https://t.co/Pg4ehwoEEg

7 replies, 1066 likes


Luke Metz: We have a new paper on learned optimizers! We used thousands of tasks (and a lot of compute 😬) to train general-purpose learned optimizers that perform well on never-before-seen tasks, and can even train new versions of themselves. https://arxiv.org/abs/2009.11243 1/8

16 replies, 1021 likes


Chip Huyen: One direction in AutoML I'm really excited about is learned optimizers: training optimizers to replace hand-designed optimizers (e.g. Adam, SGD). With 6k tasks and A LOT of compute, the authors found a learned optimizer that can train itself to be better 🤯

1 reply, 259 likes
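
To make the idea in the tweet above concrete, here is a minimal sketch, not the paper's method, of the difference between a hand-designed rule (SGD) and a learned rule in which a small network with hypothetical weights W1, b1, W2, b2 maps per-parameter features (gradient, momentum) to an update. All names, shapes, and constants are illustrative.

import numpy as np

def sgd_update(params, grads, lr=0.01):
    # Hand-designed rule: step against the gradient with a fixed learning rate.
    return params - lr * grads

def learned_update(params, grads, momentum, W1, b1, W2, b2):
    # Learned rule: per-parameter features go through a tiny MLP that outputs the step.
    feats = np.stack([grads, momentum], axis=-1)   # shape (n_params, 2)
    hidden = np.tanh(feats @ W1 + b1)              # shape (n_params, hidden)
    step = hidden @ W2 + b2                        # shape (n_params, 1)
    return params - step.squeeze(-1)

# Toy usage on the quadratic loss 0.5 * ||p||^2, whose gradient is p itself.
rng = np.random.default_rng(0)
p = rng.normal(size=5)
m = np.zeros_like(p)
W1, b1 = rng.normal(scale=0.1, size=(2, 8)), np.zeros(8)
W2, b2 = rng.normal(scale=0.1, size=(8, 1)), np.zeros(1)
for _ in range(3):
    g = p                  # gradient of the toy loss
    m = 0.9 * m + 0.1 * g  # running momentum feature fed to the learned rule
    p = learned_update(p, g, m, W1, b1, W2, b2)

In a learned optimizer, the weights of that small network are themselves trained (on many tasks) rather than hand-specified as above.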


Jeff Dean: Nice thread by @Luke_Metz summarizing new work on learned optimizers.

2 replies, 45 likes


Jack Clark: Neat to see a mention of AGI in the 'broader impacts' section of @GoogleAI 's paper on Learned Optimizers. Writing up paper for Import AI - learning to learn has become learning how to learn tools that learn how to learn efficient training. https://arxiv.org/abs/2009.11243 https://t.co/l0lcRbYXMH

2 replies, 42 likes


Aran Komatsuzaki: Tasks, stability, architecture, and compute: Training more effective learned optimizers, and using them to train themselves Learned optimizer that generalizes better to unseen tasks and enables automatic regularization. https://arxiv.org/abs/2009.11243 https://t.co/V6sjLwzCKc

0 replies, 18 likes


Parag Agrawal: Awesome work, fascinating results, great summary thread 💯

1 reply, 16 likes


Sam Greydanus: "Optimizers all the way down" 🐢

0 replies, 11 likes


Sterling Crispin 🕊️: Things like this feel like the future of machine learning: meta-learning and doing away with hand-tuned hyperparameters. Totally awesome work.

0 replies, 11 likes


Alexander Kruel: This is IMPORTANT.

0 replies, 9 likes


Daisuke Okanohara: They train an optimizer on thousands of tasks, using ES to minimize each task's test loss. The optimizer acquires weight-decay-like regularization. The learned optimizer can even optimize the task of training a learned optimizer, showing its ability to generalize across tasks. https://arxiv.org/abs/2009.11243

0 replies, 8 likes
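
A rough sketch of the outer loop described in the tweet above: evolution strategies (ES) perturb the learned optimizer's meta-parameters theta, run an inner training episode per perturbation, and descend an estimate of the gradient of the post-training test/validation loss. The inner_train stand-in and all constants (population 16, sigma 0.01, etc.) are placeholders, not the paper's configuration.

import numpy as np

def inner_train(theta, task_seed):
    # Placeholder for "apply the learned optimizer (parameterized by theta) to one
    # task and return the validation loss it reaches"; a toy quadratic stands in here.
    rng = np.random.default_rng(task_seed)
    target = rng.normal(size=theta.shape)
    return float(np.mean((theta - target) ** 2))

def es_meta_step(theta, tasks, rng, sigma=0.01, meta_lr=0.1, pop=16):
    # Antithetic ES: estimate the gradient of the expected post-training loss
    # with respect to the meta-parameters theta from random perturbations.
    grad_est = np.zeros_like(theta)
    for _ in range(pop):
        eps = rng.normal(size=theta.shape)
        task = tasks[rng.integers(len(tasks))]
        f_pos = inner_train(theta + sigma * eps, task)
        f_neg = inner_train(theta - sigma * eps, task)
        grad_est += (f_pos - f_neg) / (2 * sigma) * eps
    grad_est /= pop
    return theta - meta_lr * grad_est  # descend the estimated meta-gradient

rng = np.random.default_rng(0)
theta = np.zeros(32)            # toy meta-parameters of the learned optimizer
tasks = list(range(1000))       # stand-in for the paper's thousands of tasks
for step in range(10):
    theta = es_meta_step(theta, tasks, rng)

Because ES only needs the final loss of each inner run, it sidesteps backpropagating through long unrolled training trajectories, which is one reason it is a common choice for this kind of meta-training.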


hardmaru: Tasks, stability, architecture, and compute: Training more effective learned optimizers, and using them to train themselves In place of hand-engineered optimizers like Adam, they look at RNNs trained (on thousands of tasks) to output update rules for SGD https://arxiv.org/abs/2009.11243

0 replies, 4 likes
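
The framing in the tweet above, a recurrent network that emits the parameter update at each step in place of a hand-engineered rule, might look roughly like the coordinate-wise sketch below. The weights Wx, Wh, Wo and the toy quadratic task are hypothetical; the paper's actual optimizer uses a more elaborate hierarchical architecture.

import numpy as np

class TinyRecurrentOptimizer:
    # Coordinate-wise recurrent rule: one shared cell applied to every parameter,
    # carrying a hidden state per parameter from step to step.
    def __init__(self, hidden=8, seed=0):
        rng = np.random.default_rng(seed)
        self.Wx = rng.normal(scale=0.1, size=(1, hidden))    # input: the gradient
        self.Wh = rng.normal(scale=0.1, size=(hidden, hidden))
        self.Wo = rng.normal(scale=0.1, size=(hidden, 1))    # output: the update
        self.h = None                                        # per-parameter hidden state

    def step(self, params, grads):
        if self.h is None:
            self.h = np.zeros((params.size, self.Wh.shape[0]))
        x = grads.reshape(-1, 1)
        self.h = np.tanh(x @ self.Wx + self.h @ self.Wh)     # recurrent state update
        update = (self.h @ self.Wo).reshape(params.shape)
        return params - update

# Toy usage: minimize 0.5 * ||p||^2, whose gradient is p itself.
opt = TinyRecurrentOptimizer()
p = np.random.default_rng(1).normal(size=10)
for _ in range(5):
    p = opt.step(p, p)

The hidden state plays the role that accumulators like momentum and second-moment estimates play in hand-designed optimizers, except that what to accumulate and how to use it is learned.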


elvis: Interesting work on using thousands of tasks to train optimizers that generalize better to unseen tasks. The authors claim that these learned optimizers outperform existing optimizers, are adaptive, and are potentially useful for out-of-distribution tasks.

0 replies, 3 likes


David: #DeepLearning #MachineLearning new learned optimizers, LSTM approach

0 replies, 1 like


Content

Found on Sep 24 2020 at https://arxiv.org/pdf/2009.11243.pdf

PDF content of a computer science paper: Tasks, stability, architecture, and compute: Training more effective learned optimizers, and using them to train themselves