Papers of the day

Tensor Programs I: Wide Feedforward or Recurrent Neural Networks of Any Architecture are Gaussian Processes


Mar 13 2020 Google AI

Announcing Neural Tangents, a new easy-to-use, open-source neural network library that enables researchers to build finite- and infinite-width versions of neural networks simultaneously. Grab the code and try it for yourself at
13 replies, 2027 likes

Dec 05 2019 Greg Yang

1/ Why do wide, random neural networks form Gaussian processes, *regardless of architecture*? Let me give an overview in case you are too lazy to check out the paper or the code. The proof has two parts…
8 replies, 1020 likes
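As a concrete illustration of the claim in this thread (a sketch of mine, not code from the paper): a numpy Monte Carlo check that the outputs of a wide, randomly initialized one-hidden-layer ReLU network at two fixed inputs have the covariance predicted by the corresponding Gaussian process, which for ReLU is the order-1 arc-cosine kernel (Cho & Saul, 2009). Widths, variances, and input values are arbitrary choices for the demo.

```python
import numpy as np

# Monte Carlo sketch (illustrative, not from the thread): sample many
# independent random networks f(x) = v . relu(W x) / sqrt(width), with
# W_ij ~ N(0, 1/d) and v_j ~ N(0, 1), and compare the empirical output
# covariance at two inputs against the analytical NNGP kernel.
rng = np.random.default_rng(0)
d, width, n_nets = 3, 1000, 20000

x1 = np.array([1.0, 0.5, -0.2])
x2 = np.array([0.3, -1.0, 0.8])
X = np.stack([x1, x2], axis=1)          # shape (d, 2)

outs = np.empty((n_nets, 2))
for i in range(n_nets):
    W = rng.normal(0.0, np.sqrt(1.0 / d), size=(width, d))
    v = rng.normal(0.0, 1.0, size=width)
    outs[i] = v @ np.maximum(W @ X, 0.0) / np.sqrt(width)

emp_cov = np.cov(outs.T)[0, 1]          # empirical Cov(f(x1), f(x2))

# Analytical GP covariance: E[relu(u) relu(w)] for (u, w) jointly Gaussian
# with Var u = |x1|^2/d, Var w = |x2|^2/d, and theta the angle between
# x1 and x2 -- the order-1 arc-cosine kernel.
s1 = np.linalg.norm(x1) / np.sqrt(d)
s2 = np.linalg.norm(x2) / np.sqrt(d)
cos_t = x1 @ x2 / (np.linalg.norm(x1) * np.linalg.norm(x2))
theta = np.arccos(np.clip(cos_t, -1.0, 1.0))
nngp_cov = s1 * s2 / (2 * np.pi) * (np.sin(theta) + (np.pi - theta) * np.cos(theta))

print(emp_cov, nngp_cov)  # the two agree to Monte Carlo accuracy
```

At width 1000 with 20000 sampled networks, the empirical and analytical covariances typically match to within a few percent; the residual gap shrinks as both width and sample count grow, which is the content of the architecture-independent GP limit.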

Oct 29 2019 Greg Yang

1/ I can't teach you how to dougie but I can teach you how to compute the Gaussian Process corresponding to an infinite-width neural network of ANY architecture, feedforward or recurrent, e.g. resnet, GRU, transformers, etc ... RT plz💪
4 replies, 398 likes

Mar 13 2020 Greg Yang

RNNs and batchnorm will be coming soon, but you can already play with them here. The general theory for this is based on tensor programs. Give Neural Tangents a try and let us know what you think!
1 replies, 250 likes

Dec 19 2019 Greg Yang

1/ Neural networks are Gaussian Processes --- the Poster Edition from #NeurIPS2019 last week. In case you missed it, here’s a Twitter version of the poster presentation, following the format of @colinraffel; and here’s the previous tweet thread
1 replies, 209 likes

Oct 31 2019 Microsoft Research

Explore the open-source implementations of the Gaussian Process kernels of simple RNN, GRU, transformer, and batchnorm+ReLU network on GitHub:
0 replies, 47 likes
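The repository's actual implementations cover the RNN, GRU, transformer, and batchnorm+ReLU kernels; as a minimal sketch of the same compositional idea in the simplest (fully connected feedforward) case — function names and the He weight-variance choice are my own assumptions, not taken from the repo:

```python
import numpy as np

def relu_layer_kernel(K):
    """Push a covariance matrix K through one infinite-width ReLU layer
    with weight variance 2 (He scaling): K' = 2 * E[relu(u) relu(w)],
    the order-1 arc-cosine kernel (Cho & Saul, 2009)."""
    std = np.sqrt(np.diag(K))
    outer = np.outer(std, std)
    theta = np.arccos(np.clip(K / outer, -1.0, 1.0))
    return outer / np.pi * (np.sin(theta) + (np.pi - theta) * np.cos(theta))

def nngp_kernel(X, depth):
    """GP kernel of a depth-`depth` fully connected ReLU network on the
    rows of X: start from the input Gram matrix, iterate the layer map."""
    K = X @ X.T / X.shape[1]
    for _ in range(depth):
        K = relu_layer_kernel(K)
    return K
```

A usage note: `nngp_kernel(X, depth=3)` returns a symmetric PSD matrix, and with the He scaling the diagonal (each input's output variance) is preserved across layers. Recurrent and attention architectures follow the same recipe — propagate covariances instead of activations — but track extra state across time steps or tokens.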

Dec 09 2019 Greg Yang

Hit me up @NeurIPSConf if you wanna learn more about wide neural networks, and come to my poster session on Wednesday 5pm to 7pm, East Exhibition Hall B+C, poster #242
0 replies, 28 likes

Dec 12 2019 Andrey Kurenkov 🤖 @ Neurips

This Twitter thread by @TheGregYang, as well as the associated poster (which I stopped by today, hope you don't mind the not-so-great pic 😅), is a great example of communicating tricky math with both depth and accessible, concise clarity! We should all strive for this! :)
2 replies, 25 likes

Oct 31 2019 Greg Yang

Pairs best with the paper and previous discussion
1 replies, 5 likes

Feb 21 2020 Greg Yang

@andrewgwils 1/2 This prior for DNNs has been studied recently (extending Neal's work) in the limit of infinite width; in particular, it turns out this prior is a GP for *any* DNN architecture
1 replies, 4 likes

Feb 11 2020 Greg Yang

@sschoenholz @stormtroper1721 @alanyttian Thanks for ping, Sam! Here is, for example, a thread on why all NNs look like Gaussian Processes at initialization.
0 replies, 3 likes

Dec 05 2019 Nicole Radziwill

this is super cool. thanks @BruceTedesco for RTing it
0 replies, 3 likes

Dec 06 2019 Matios Berhe

I’m not skilled enough to know why this makes me nervous cc:@paulportesi
0 replies, 1 likes

Dec 05 2019 Kevin Yang 楊凱筌

Another poster I'm really excited to see. I'm basically a sucker for anything that has GPs and NNs together.
0 replies, 1 likes

Nov 27 2019 Hacker News

Wide Neural Networks of Any Architecture Are Gaussian Processes: Comments:
0 replies, 1 likes

Oct 29 2019 Sham Kakade

cool stuff from @TheGregYang: Tensors, Neural Nets, GPs, and kernels! looks like we can derive a corresponding kernel/GP in a fairly general sense. very curious about broader empirical comparisons to neural nets, which (potentially) draw strength from the non-linear regime!
0 replies, 1 likes