Utku: End-to-end training of sparse deep neural networks with little-to-no performance loss. Check out our new paper: “Rigging the Lottery: Making All Tickets Winners” (RigL👇) !
with @Tgale96 @jacobmenick @pcastr and @erich_elsen https://t.co/LmR18hK4LV
1 replies, 380 likes
hardmaru: Everyone is a winner 🔥
1 replies, 263 likes
DeepMind: We also introduce a technique [https://arxiv.org/abs/1911.11134] for training neural networks that are sparse throughout training from a random initialization - no luck required, all initialization “tickets” are winners. https://t.co/fA7VmXrj20
0 replies, 127 likes
Pablo Samuel Castro: Come by tomorrow and chat with us about our #ICML2020 paper “Rigging the Lottery: Making All Tickets Winners”!
- Thu Jul 16 9 a.m. EDT
- Thu Jul 16 8 p.m. EDT
1 replies, 54 likes
Delip Rao: Great paper title, with results to match. “MobileNets are efficient networks and difficult to sparsify. With RigL we can train 75% sparse MobileNets with almost no drop in accuracy.” 😱😱
1 replies, 46 likes
Sara Hooker: What differs in this paper is how the connections are grown after pruning for the most important weights. I think this is part of a very interesting direction of research, amplifying the role of weights estimated to be important (in addition to removing the “weakest” links).
0 replies, 24 likes
Pablo Samuel Castro: 🎟️🎟️make everyone a lottery winner🎟️🎟️
train sparse networks (with a randomly initialized topology) end-to-end without sacrificing (much) accuracy!
joint work with @utkuevci @Tgale96 @jacobmenick and @erich_elsen
0 replies, 16 likes
Jacob Menick: New work by Utku Evci et al. on sparse training. My contribution was helping with the RNN experiments. Fun collaborating with @utkuevci and getting involved in sparse man @erich_elsen's sweeping sparsity research programme.
0 replies, 11 likes
Pablo Samuel Castro: @RobertTLange @jefrankle @mcarbin Nice writeup!
*Shameless plug*: you may want to check out our paper, soon to be presented at @icmlconf :
0 replies, 10 likes
Daisuke Okanohara: RigL trains sparse NNs from scratch; regularly drops the edges with the smallest magnitude, computes the gradients wrt virtual dense edges, and introduces new edges with the largest gradient. Escaping bad local minima by making a new descending direction. https://arxiv.org/abs/1911.11134
0 replies, 7 likes
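The drop-and-grow cycle summarized in the tweet above can be sketched for a single layer in a few lines of NumPy. This is only an illustrative sketch, not the authors' implementation: the function name `drop_and_grow`, the `fraction` parameter, choosing grow candidates from connections that were inactive before the drop, and initializing new connections to zero are all assumptions made for the example.

```python
import numpy as np

def drop_and_grow(weights, mask, dense_grad, fraction=0.3):
    """One illustrative connectivity update for a single sparse layer.

    weights, mask, dense_grad: arrays of identical shape; mask marks active edges.
    fraction: share of active edges to replace (hypothetical parameter).
    Returns updated (weights, mask) with the same number of active edges.
    """
    shape = weights.shape
    w = weights.ravel().copy()
    m = mask.ravel().astype(bool)      # astype returns a fresh copy
    g = dense_grad.ravel()

    active = np.flatnonzero(m)
    inactive = np.flatnonzero(~m)
    # Never grow more edges than there are inactive slots.
    k = min(int(fraction * active.size), inactive.size)
    if k == 0:
        return w.reshape(shape), m.reshape(shape)

    # Drop: the k active edges with the smallest weight magnitude.
    drop = active[np.argsort(np.abs(w[active]))[:k]]
    # Grow: the k inactive edges with the largest dense-gradient magnitude.
    # (Assumption: candidates are the edges inactive before the drop.)
    grow = inactive[np.argsort(-np.abs(g[inactive]))[:k]]

    m[drop] = False
    w[drop] = 0.0
    m[grow] = True
    w[grow] = 0.0                      # assumption: new edges start at zero
    return w.reshape(shape), m.reshape(shape)
```

In a training loop, a step like this would run only every so often, with ordinary masked gradient updates in between, so the number of active connections stays constant throughout training.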
Jesse Engel: Sparsity is a clear inductive bias for neural nets, but end to end training and efficient inference have always been a challenge. I know @erich_elsen has been thinking about this for a long time, and seems like they've made some real progress!
0 replies, 7 likes
Brundage Bot: Rigging the Lottery: Making All Tickets Winners. Utku Evci, Trevor Gale, Jacob Menick, Pablo Samuel Castro, and Erich Elsen http://arxiv.org/abs/1911.11134
1 replies, 4 likes
email@example.com: @SuryaGanguli @jm_alexia @Hidenori8Tanaka @dyamins Nice work! Will have to go through it more carefully to better understand your method.
You might want to check out our paper on our method (RigL) that was recently accepted to @icmlconf , as it's tackling the same problem.
0 replies, 4 likes
Instead of searching for good initial values as the "Lottery Ticket Hypothesis" does, they propose RigL to train a sparse and accurate network from any initial value. The learning time does not increase greatly, and inference speed is improved with the same accuracy. https://t.co/b8BJbEiaWE
0 replies, 2 likes
Carles R. Riera: Well, we are back to the 2000s with the return of constructive-deconstructive methods. Glad to see this.
Instead of finding the correct initialization they add and remove units according to the gradient.
1 replies, 1 likes
Mitchell Gordon: Really cool improvements on Tim Dettmers' work; now sparse networks really can be trained from scratch using less GPU memory!
0 replies, 1 likes
Leandro von Werra: Rigging the Lottery (RigL) by @utkuevci et al.: Instead of looking for sparse lottery tickets in large dense networks, effectively train sparse networks from scratch. The method continuously adds and deletes connections, beating pruning on many tasks.
1 replies, 1 likes
Fabien Da Silva: @owulveryck @arxiv - https://arxiv.org/abs/1911.11134 Rigging the Lottery: Making All Tickets Winners
- https://arxiv.org/abs/1911.04252 Self-training with Noisy Student improves ImageNet classification
- https://arxiv.org/abs/1910.08435 Using Local Knowledge Graph Construction to Scale Seq2Seq Models to Multi-Document Input
1 replies, 0 likes
Found on Nov 26 2019 at https://arxiv.org/pdf/1911.11134.pdf