
Tensor Programs I: Wide Feedforward or Recurrent Neural Networks of Any Architecture are Gaussian Processes


Dec 05 2019 Greg Yang

1/ Why do wide, random neural networks form Gaussian processes, *regardless of architecture*? Let me give an overview in case you are too lazy to check out the paper https://arxiv.org/abs/1910.12478 or the code https://github.com/thegregyang/GP4A. The proof has two parts… https://t.co/cKtfpRGMQd
8 replies, 1008 likes
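
To make the thread's claim concrete, below is a minimal sketch of the kind of kernel computation it describes, specialized to the classical case of a deep ReLU MLP. The closed-form Gaussian expectation is Cho & Saul's arc-cosine formula; the function names and the sw/fan_in variance convention are illustrative assumptions, not the paper's general tensor-program algorithm (which is what extends this style of recursion to arbitrary architectures).

    # A minimal, hedged sketch (hypothetical helper names): the classic
    # NNGP kernel recursion for a deep ReLU MLP, using the closed-form
    # arc-cosine expectation of Cho & Saul (2009).
    import numpy as np

    def relu_gauss_expectation(k11, k12, k22):
        """E[relu(u) relu(v)] for (u, v) ~ N(0, [[k11, k12], [k12, k22]])."""
        c = np.clip(k12 / np.sqrt(k11 * k22), -1.0, 1.0)
        theta = np.arccos(c)
        return np.sqrt(k11 * k22) * (np.sin(theta) + (np.pi - theta) * c) / (2 * np.pi)

    def nngp_kernel(x1, x2, depth, sw=2.0):
        """K(x1, x2) of a depth-`depth` infinite-width ReLU MLP whose
        weights have variance sw / fan_in (He-style initialization)."""
        x1, x2 = np.asarray(x1, float), np.asarray(x2, float)
        d = len(x1)
        # Covariance of the first layer's pre-activations.
        k11, k12, k22 = sw * x1 @ x1 / d, sw * x1 @ x2 / d, sw * x2 @ x2 / d
        # Propagate the covariance through the remaining layers.
        for _ in range(depth - 1):
            k11, k12, k22 = (
                sw * relu_gauss_expectation(k11, k11, k11),
                sw * relu_gauss_expectation(k11, k12, k22),
                sw * relu_gauss_expectation(k22, k22, k22),
            )
        return k12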


Oct 29 2019 Greg Yang

1/ I can't teach you how to dougie, but I can teach you how to compute the Gaussian Process corresponding to an infinite-width neural network of ANY architecture, feedforward or recurrent, e.g. resnet, GRU, transformers, etc ... RT plz 💪 http://arxiv.org/abs/1910.12478 https://t.co/TgCBmf1OcA
4 replies, 398 likes
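
The "wide random network = GP" claim is also easy to sanity-check numerically: draw many wide, randomly initialized ReLU MLPs, evaluate each on two fixed inputs, and compare the empirical output covariance with the analytic kernel. This sketch reuses the hypothetical `nngp_kernel` from the sketch above; it is an illustration under the same assumptions, not code from the paper's repo.

    # Monte Carlo sanity check: the same random weights must be used for
    # both inputs within each sampled network, since we want a covariance.
    import numpy as np

    def random_relu_mlp_outputs(xs, depth, width, rng, sw=2.0):
        """Scalar outputs of ONE randomly initialized ReLU MLP on each input in xs."""
        hs = [np.asarray(x, float) for x in xs]
        for _ in range(depth - 1):
            fan_in = hs[0].shape[0]
            W = rng.normal(0.0, np.sqrt(sw / fan_in), size=(width, fan_in))
            hs = [np.maximum(W @ h, 0.0) for h in hs]
        w = rng.normal(0.0, np.sqrt(sw / width), size=width)
        return [w @ h for h in hs]

    rng = np.random.default_rng(0)
    x1, x2 = rng.normal(size=10), rng.normal(size=10)
    # Increase width / sample count for tighter agreement.
    samples = np.array([random_relu_mlp_outputs((x1, x2), depth=3, width=500, rng=rng)
                        for _ in range(500)])
    print("empirical E[f(x1) f(x2)]:", samples[:, 0] @ samples[:, 1] / len(samples))
    print("analytic NNGP kernel:    ", nngp_kernel(x1, x2, depth=3))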


Dec 19 2019 Greg Yang

1/ Neural networks are Gaussian Processes --- the Poster Edition from #NeurIPS2019 last week. In case you missed it, here’s a twitter version of the poster presentation, following the format of @colinraffel; and here’s the previous tweet thread https://twitter.com/TheGregYang/status/1202608248534077440?s=20 https://t.co/lHJgH43gqa
1 reply, 209 likes


Oct 31 2019 Microsoft Research

Explore the open-source implementations of the Gaussian Process kernels of simple RNN, GRU, transformer, and batchnorm+ReLU network on GitHub: https://github.com/thegregyang/GP4A
0 replies, 47 likes
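
For a flavor of the recurrent case, here is a hedged sketch (made-up function names; not the GP4A repo's actual code) of the time-step kernel recursion for a simple RNN s_t = erf(W s_{t-1} + U x_t). The erf nonlinearity admits Williams' closed-form Gaussian expectation, so the state covariance can be rolled forward exactly along the two input sequences.

    # Hedged sketch of the simple-RNN GP kernel recursion with erf
    # nonlinearity, using Williams' (1998) closed-form expectation.
    import numpy as np

    def erf_gauss_expectation(k11, k12, k22):
        """E[erf(u) erf(v)] for (u, v) ~ N(0, [[k11, k12], [k12, k22]])."""
        return (2 / np.pi) * np.arcsin(2 * k12 / np.sqrt((1 + 2 * k11) * (1 + 2 * k22)))

    def rnn_gp_kernel(seq_a, seq_b, sw=1.0, su=1.0):
        """Covariance of the infinite-width RNN's final pre-activations on
        two equal-length input sequences (lists of vectors)."""
        k11 = k12 = k22 = 0.0  # the initial state s_0 = 0 has zero covariance
        for xa, xb in zip(seq_a, seq_b):
            xa, xb = np.asarray(xa, float), np.asarray(xb, float)
            d = len(xa)
            k11, k12, k22 = (
                sw * erf_gauss_expectation(k11, k11, k11) + su * xa @ xa / d,
                sw * erf_gauss_expectation(k11, k12, k22) + su * xa @ xb / d,
                sw * erf_gauss_expectation(k22, k22, k22) + su * xb @ xb / d,
            )
        return k12

    seq_a = [np.ones(4), 0.5 * np.ones(4)]
    seq_b = [np.ones(4), -np.ones(4)]
    print(rnn_gp_kernel(seq_a, seq_b))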


Dec 09 2019 Greg Yang

Hit me up @NeurIPSConf if you wanna learn more about wide neural networks and come to my poster session on Wednesday 5pm to 7pm, east exhibition hall B+C, poster #242 https://whova.com/webapp/event/program/839448/ https://t.co/YUXBuYMU2N
0 replies, 28 likes


Dec 12 2019 Andrey Kurenkov 🤖 @ Neurips

This Twitter thread by @TheGregYang, as well as the associated poster (which I stopped by today; hope you don't mind the not-so-great pic 😅), is a great example of communicating tricky math with both depth and accessible, concise clarity! We should all strive for this! :) https://t.co/ZJ1J8Hqdvb
2 replies, 25 likes


Oct 31 2019 Greg Yang

Pairs best with the paper https://arxiv.org/abs/1910.12478 and previous discussion https://twitter.com/TheGregYang/status/1189174848611745792?s=20
1 reply, 5 likes


Feb 11 2020 Greg Yang

@sschoenholz @stormtroper1721 @alanyttian Thanks for the ping, Sam! Here, for example, is a thread on why all NNs look like Gaussian Processes at initialization. https://twitter.com/TheGregYang/status/1202608248534077440?s=19
0 replies, 3 likes


Dec 05 2019 Nicole Radziwill

this is super cool. thanks @BruceTedesco for RTing it
0 replies, 3 likes


Dec 05 2019 Kevin Yang 楊凱筌

Another poster I'm really excited to see. I'm basically a sucker for anything that has GPs and NNs together.
0 replies, 1 likes


Nov 27 2019 Hacker News

Wide Neural Networks of Any Architecture Are Gaussian Processes: https://arxiv.org/abs/1910.12478 Comments: https://news.ycombinator.com/item?id=21651113
0 replies, 1 likes


Oct 29 2019 Sham Kakade

cool stuff from @TheGregYang: Tensors, Neural Nets, GPs, and kernels! looks like we can derive a corresponding kernel/GP in a fairly general sense. very curious about broader empirical comparisons to neural nets, which (potentially) draw strength from the non-linear regime!
0 replies, 1 likes


Dec 06 2019 Matios Berhe

I’m not skilled enough to know why this makes me nervous cc: @paulportesi
0 replies, 1 likes

