Papers of the day

Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift

Comments

David Page: The paper that introduced Batch Norm http://arxiv.org/abs/1502.03167 combines clear intuition with compelling experiments (14x speedup on ImageNet!!) So why has 'internal covariate shift' remained controversial to this day? Thread 👇 https://t.co/L0BBmo0q4t

14 replies, 1174 likes


Jeremy Howard: This is the best distillation of recent (and old!) research on batchnorm I've seen. There is so much to learn about training mechanics by studying this thread and the links it contains.

3 replies, 695 likes


Roger Grosse: Excellent overview of the (widely misunderstood) mechanism hypothesized in the original batch norm paper, as well as recent empirical evidence that strongly supports it.

2 replies, 141 likes


Yann LeCun: Nice tweethread on the effect of variable centering in deep nets (like batch norm). https://twitter.com/dcpage3/status/1171867587417952260?s=19

0 replies, 100 likes


Yann LeCun: Nice thread on the effects of variable normalization in neural nets (e.g. with batch norm).

0 replies, 46 likes


Frank Dellaert: Wondering about batch norm? Read this amazing thread.

0 replies, 18 likes


Andrei Bursuc: Nice summary of the knowledge and understanding we have over BatchNorm from both recent and classic perspectives at the time of its publication

1 reply, 15 likes


Lavanya: Cool thread on the intuition behind why BatchNorm works so well in practice! 🤷🏼‍♀️

0 replies, 10 likes


Brandon Rohrer: A fun deep dive on “Why Batch Norm?”

0 replies, 10 likes


Thread Reader App: @soldni Hola there is your unroll: Thread by @dcpage3: "The paper that introduced Batch Norm http://arxiv.org/abs/1502.03167 combines clear intuition with compelling experiments (14x […]" https://threadreaderapp.com/thread/1171867587417952260.html Talk to you soon. 🤖

0 replies, 7 likes


Aakash Kumar Nain 🔎: Nice thread

0 replies, 4 likes


(((ل()(ل() 'yoav)))): (another!) great thread re batch norm.

0 replies, 4 likes


Brandon Rohrer: @dcpage3 offers keen insights into how batch norm works. I particularly resonated with his meta-conclusions: 1) When studying neural networks, use simple examples. You can iterate faster and see what’s going on more easily. 2) VISUALIZE. h/t @jeremyphoward

0 replies, 2 likes


Rob Flickenger ⚡: TIL that some ML researchers had an idea (batch normalization) that worked, then tacked on a post-hoc mathematical analysis to claim that it reduced "internal covariate shift". The technique works, but not for that reason. https://arxiv.org/abs/1502.03167 https://arxiv.org/abs/1805.11604

1 reply, 2 likes


Content

Found on Sep 11 2019 at https://arxiv.org/pdf/1502.03167.pdf

PDF content of a computer science paper: Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift
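
For context, the transform the paper introduces normalizes each activation over the mini-batch and then applies a learned scale and shift: y = γ · (x − μ_B) / √(σ²_B + ε) + β, computed per feature. Below is a minimal NumPy sketch of the training-time computation; the function name and shapes are illustrative, not taken from the paper's code.

import numpy as np

def batch_norm_train(x, gamma, beta, eps=1e-5):
    # Per-feature mean and variance over the mini-batch (axis 0).
    mu = x.mean(axis=0)
    var = x.var(axis=0)
    # Normalize to roughly zero mean / unit variance, then scale and
    # shift with the learnable parameters gamma and beta.
    x_hat = (x - mu) / np.sqrt(var + eps)
    return gamma * x_hat + beta

# Example: a mini-batch of 32 examples with 4 features.
x = np.random.randn(32, 4)
y = batch_norm_train(x, gamma=np.ones(4), beta=np.zeros(4))
print(y.mean(axis=0), y.std(axis=0))  # roughly 0 and 1 per feature

At inference time the paper replaces the mini-batch statistics with population estimates (e.g. running averages accumulated during training), so the output depends only on the input example.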