Oriol Vinyals: Rapid Learning or Feature Reuse? Meta-learning algorithms on standard benchmarks have much more feature reuse than rapid learning! This also gives us a way to simplify MAML -- (Almost) No Inner Loop (A)NIL. https://arxiv.org/abs/1909.09157 With Aniruddh Raghu @maithra_raghu Samy Bengio. https://t.co/7T6SzMYfiY
9 replies, 633 likes
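A rough sketch of the "(Almost) No Inner Loop" idea mentioned above: per the linked paper, ANIL keeps task-specific inner-loop updates only for the final layer (the head), while the body's features are reused unchanged. The function names, the linear softmax head, the learning rate, and the toy data below are all illustrative assumptions, not the authors' code, and the outer (meta-training) loop is omitted.

```python
import numpy as np

def softmax(z):
    # Numerically stable row-wise softmax.
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def cross_entropy(features, labels, head_w):
    # Mean cross-entropy of a linear softmax head on fixed features.
    probs = softmax(features @ head_w)
    return -np.mean(np.log(probs[np.arange(len(labels)), labels]))

def anil_inner_loop(features, labels, head_w, lr=0.5, steps=5):
    """Adapt ONLY the head on a task's support set (hypothetical helper).

    `features` are the outputs of the frozen body; the body itself gets
    no inner-loop update, which is the ANIL simplification.
    """
    one_hot = np.eye(head_w.shape[1])[labels]
    w = head_w.copy()  # leave the meta-learned head untouched
    for _ in range(steps):
        probs = softmax(features @ w)
        grad = features.T @ (probs - one_hot) / len(labels)
        w -= lr * grad  # plain gradient step on the head weights
    return w

# Toy support set: 4 examples, 2-dimensional body features, 2 classes.
features = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]])
labels = np.array([0, 0, 1, 1])
head = np.zeros((2, 2))

adapted = anil_inner_loop(features, labels, head)
loss_before = cross_entropy(features, labels, head)
loss_after = cross_entropy(features, labels, adapted)
```

In this toy setting the support-set loss should drop after a few head-only steps, even though the features themselves never change.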
Maithra Raghu: Rapid Learning or Feature Reuse?
New paper: https://arxiv.org/abs/1909.09157
We analyze MAML (and meta-learning more broadly), finding that feature reuse is the critical component in the efficient learning of new tasks -- leading to some algorithmic simplifications!
2 replies, 161 likes
Alex Nichol: This is something I always wanted to investigate, but never had time to. It would seem that MAML (and presumably Reptile as well) learn useful features in the outer loop, rather than learning how to learn useful features in the inner loop.
2 replies, 104 likes
Maithra Raghu: Presenting this at @iclr_conf *today*!
Talk and Slides: https://iclr.cc/virtual/poster_rkgMkCEtPB.html
Poster Sessions: (i) 10am - 12 Pacific Time, (ii) 1pm - 3pm Pacific Time
Thanks to the organizers for a *fantastic* virtual conference, hope to see you there!
1 reply, 89 likes
Andrey Kurenkov 🤖: Had thoughts about this paper today.
I've often wondered when ImageNet-style pretraining will come to RL (NLP's "ImageNet moment" having arrived https://thegradient.pub/nlp-imagenet/), and this paper shows it kind of has? Features are 'pre-trained' on tasks, and then used on new ones. Neat!
3 replies, 17 likes
arxiv: Rapid Learning or Feature Reuse? Towards Understanding the Effectiveness of MAML. http://arxiv.org/abs/1909.09157 https://t.co/EQImHy7sPA
0 replies, 15 likes
Carlos E. Perez: Indeed, learning is made efficient because of 'reuse'. https://arxiv.org/abs/1909.09157
0 replies, 2 likes
Michael Zhang: Neat paper! Shows that inner loop adaptation is not necessary at meta-test time for MAML. Removing the final layer and computing cosine similarities (similar to prototypical nets) is sufficient.
1 reply, 1 like
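The test-time procedure described in the tweet above (drop the final layer, then compare features by cosine similarity) might look roughly like the following. This is a minimal sketch under assumptions: the features are taken to come from the network's penultimate layer, and `nil_predict`, the per-class summing of similarities, and the toy data are all illustrative choices rather than the paper's exact method.

```python
import numpy as np

def cosine_similarity(a, b, eps=1e-12):
    # Pairwise cosine similarity between rows of a and rows of b.
    a_n = a / (np.linalg.norm(a, axis=1, keepdims=True) + eps)
    b_n = b / (np.linalg.norm(b, axis=1, keepdims=True) + eps)
    return a_n @ b_n.T

def nil_predict(support_feats, support_labels, query_feats, num_classes):
    """Classify queries with no inner-loop adaptation (hypothetical helper).

    Each query is scored against every labeled support example; the
    similarities are summed per class and the best class is returned.
    """
    sims = cosine_similarity(query_feats, support_feats)   # (Q, S)
    one_hot = np.eye(num_classes)[support_labels]          # (S, C)
    class_scores = sims @ one_hot                          # (Q, C)
    return class_scores.argmax(axis=1)

# Toy 2-way, 2-shot task with 2-dimensional features.
support_feats = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]])
support_labels = np.array([0, 0, 1, 1])
query_feats = np.array([[0.8, 0.2], [0.2, 0.8]])

predictions = nil_predict(support_feats, support_labels, query_feats, num_classes=2)
```

No gradient step is taken at test time here; all task-specific information enters through the labeled support features.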
Found on Sep 23 2019 at https://arxiv.org/pdf/1909.09157.pdf