Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift


Sep 11 2019 David Page

The paper that introduced Batch Norm combines clear intuition with compelling experiments (14x speedup on ImageNet!!) So why has 'internal covariate shift' remained controversial to this day? Thread 👇
Sep 11 2019 Jeremy Howard

This is the best distillation of recent (and old!) research on batchnorm I've seen. There is so much to learn about training mechanics by studying this thread and the links it contains.
Sep 12 2019 Roger Grosse

Excellent overview of the (widely misunderstood) mechanism hypothesized in the original batch norm paper, as well as recent empirical evidence that strongly supports it.
Sep 12 2019 Yann LeCun

Nice tweethread on the effect of variable centering in deep nets (like batch norm).
Sep 12 2019 Yann LeCun

Sep 12 2019 Frank Dellaert

Sep 11 2019 Andrei Bursuc

Nice summary of the knowledge and understanding we have over BatchNorm from both recent and classic perspectives at the time of its publication
Sep 13 2019 Brandon Rohrer

Sep 12 2019 Lavanya

Sep 11 2019 Thread Reader App

Sep 12 2019 (((ل()(ل() 'yoav))))

Sep 12 2019 Aakash Kumar Nain 🔎

May 04 2019 Rob Flickenger ⚡

TIL that some ML researchers had an idea (batch normalization) that worked, then tacked on a post-hoc mathematical analysis to claim that it reduced "internal covariate shift". The technique works, but not for that reason.
Oct 19 2019 Brandon Rohrer

@dcpage3 offers keen insights into how batch norm works. I particularly resonated with his meta-conclusions: 1) When studying neural networks, use simple examples. You can iterate faster and see what's going on more easily. 2) VISUALIZE. h/t @jeremyphoward
