
Rigging the Lottery: Making All Tickets Winners


Utku: End-to-end training of sparse deep neural networks with little-to-no performance loss. Check out our new paper: “Rigging the Lottery: Making All Tickets Winners” (RigL👇) ! 📃 📁 with @Tgale96 @jacobmenick @pcastr and @erich_elsen

1 replies, 359 likes

hardmaru: Everyone is a winner 🔥

1 replies, 263 likes

DeepMind: We also introduce a technique for training neural networks that are sparse throughout training from a random initialization - no luck required, all initialization “tickets” are winners.

0 replies, 127 likes

Delip Rao: Great paper title, with results to match. “MobileNets are efficient networks and difficult to sparsify. With RigL we can train 75% sparse MobileNets with almost no drop in accuracy.” 😱😱

1 replies, 46 likes

Sara Hooker: What differs in this paper is how the connections are grown after pruning for the most important weights. I think this is part of a very interesting direction of research, amplifying the role of weights estimated to be important (in addition to removing the “weakest” links).

0 replies, 24 likes

Pablo Samuel Castro: 🎟️🎟️make everyone a lottery winner🎟️🎟️ train sparse networks (with a randomly initialized topology) end-to-end without sacrificing (much) accuracy! joint work with @utkuevci @Tgale96 @jacobmenick and @erich_elsen 👇🏾🎟️👇🏾🎟️👇🏾

0 replies, 16 likes

Jacob Menick: New work by Utku Evci et al. on sparse training. My contribution was helping with the RNN experiments. Fun collaborating with @utkuevci and getting involved in sparse man @erich_elsen's sweeping sparsity research programme.

0 replies, 11 likes

Jesse Engel: Sparsity is a clear inductive bias for neural nets, but end to end training and efficient inference have always been a challenge. I know @erich_elsen has been thinking about this for a long time, and seems like they've made some real progress!

0 replies, 7 likes

Daisuke Okanohara: RigL trains sparse NNs from scratch; regularly drops the edges with the smallest magnitude, computes the gradients wrt virtual dense edges, and introduces new edges with the largest gradient. Escaping bad local minima by making a new descending direction.

0 replies, 7 likes
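The drop-and-grow step summarized in the tweet above can be sketched roughly as follows. This is a minimal illustration on flat Python lists, not the authors' implementation: the function name `rigl_update`, the `drop_fraction` parameter, the choice to start grown connections at zero, and the choice to pick grow candidates only from connections that were inactive before the step are all assumptions made for the example.

```python
def rigl_update(weights, mask, dense_grads, drop_fraction):
    """One hypothetical drop-and-grow step for a single layer.

    weights      -- flat list of floats (inactive entries are expected to be 0.0)
    mask         -- flat list of bools, True where a connection is active
    dense_grads  -- gradient of the loss w.r.t. every connection, active or not
    drop_fraction -- share of active connections to replace in this step
    Returns (new_weights, new_mask); the inputs are left unmodified.
    """
    active = [i for i, m in enumerate(mask) if m]
    inactive = [i for i, m in enumerate(mask) if not m]

    # Replace the same number of connections that we remove,
    # so the overall sparsity of the layer stays constant.
    k = min(int(drop_fraction * len(active)), len(inactive))

    # Drop: the k active connections with the smallest weight magnitude.
    dropped = sorted(active, key=lambda i: abs(weights[i]))[:k]

    # Grow: the k inactive connections with the largest gradient magnitude.
    # Assumption: candidates are the connections inactive before this step.
    grown = sorted(inactive, key=lambda i: abs(dense_grads[i]), reverse=True)[:k]

    new_weights = list(weights)
    new_mask = list(mask)
    for i in dropped:
        new_mask[i] = False
        new_weights[i] = 0.0
    for i in grown:
        new_mask[i] = True
        new_weights[i] = 0.0  # assumption: new connections start at zero
    return new_weights, new_mask


# Toy example: 3 of 6 connections active, replace half of them (rounded down).
weights = [0.5, -0.1, 0.0, 0.0, 0.9, 0.0]
mask = [True, True, False, False, True, False]
dense_grads = [0.1, 0.2, -0.7, 0.05, 0.3, 0.4]
new_weights, new_mask = rigl_update(weights, mask, dense_grads, 0.5)
```

In this toy example the weakest active connection (index 1) is dropped and the inactive connection with the largest gradient magnitude (index 2) is grown, so the number of active connections is unchanged.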

Brundage Bot: Rigging the Lottery: Making All Tickets Winners. Utku Evci, Trevor Gale, Jacob Menick, Pablo Samuel Castro, and Erich Elsen

1 replies, 4 likes

akira: Instead of choosing good initial values in favor of "Lottery Theory", they propose RigL to train a sparse and accurate network from any initial value. The learning time does not increase greatly, inference speed is improved with the same accuracy.

0 replies, 2 likes

Mitchell Gordon: Really cool improvements on Tim Dettmers' work; now sparse networks really can be trained from scratch using less GPU memory!

0 replies, 1 likes

Carles R. Riera: Well, we are back to the 2000s with the return of constructive-deconstructive methods. Glad to see this. Instead of finding the correct initialization they add and remove units according to the gradient.

1 replies, 1 likes

Fabien Da Silva: @owulveryck @arxiv - Rigging the Lottery: Making All Tickets Winners - Self-training with Noisy Student improves ImageNet classification - Using Local Knowledge Graph Construction to Scale Seq2Seq Models to Multi-Document Input

1 replies, 0 likes


Found on Nov 26 2019 at

PDF content of a computer science paper: Rigging the Lottery: Making All Tickets Winners