Papers of the day

Tensor Programs I: Wide Feedforward or Recurrent Neural Networks of Any Architecture are Gaussian Processes

Comments

Google AI: Announcing Neural Tangents, a new easy-to-use, open-source neural network library that enables researchers to build finite- and infinite-width versions of neural networks simultaneously. Grab the code and try it for yourself at https://goo.gle/33eErSu https://t.co/bL6nQL2PoR
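To make the "finite- and infinite-width simultaneously" point concrete, here is a minimal sketch using the Neural Tangents stax API (based on the library's public examples; exact signatures may vary by version, so treat this as an illustration rather than the announced code):

    import jax.numpy as jnp
    from jax import random
    from neural_tangents import stax

    # Define the architecture once; stax returns three functions:
    #   init_fn/apply_fn for the finite-width network,
    #   kernel_fn for the corresponding infinite-width GP kernels.
    init_fn, apply_fn, kernel_fn = stax.serial(
        stax.Dense(512), stax.Relu(),
        stax.Dense(512), stax.Relu(),
        stax.Dense(1)
    )

    key = random.PRNGKey(0)
    x_train = random.normal(key, (20, 10))
    x_test = random.normal(random.PRNGKey(1), (5, 10))

    # Finite-width forward pass.
    _, params = init_fn(key, x_train.shape)
    y_finite = apply_fn(params, x_test)

    # Infinite-width limit of the same architecture: NNGP and NTK kernels.
    k_nngp = kernel_fn(x_test, x_train, 'nngp')
    k_ntk = kernel_fn(x_test, x_train, 'ntk')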

13 replies, 2027 likes


Greg Yang: 1/ Why do wide, random neural networks form Gaussian processes, *regardless of architecture*? Let me give an overview in case you are too lazy to check out the paper https://arxiv.org/abs/1910.12478 or the code https://github.com/thegregyang/GP4A. The proof has two parts… https://t.co/cKtfpRGMQd
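For intuition behind that claim (this is the classic one-hidden-layer argument going back to Neal, which the paper generalizes; the notation below is ours, not the paper's), consider

    f(x) = b + \frac{1}{\sqrt{n}} \sum_{i=1}^{n} v_i \, \phi(w_i^\top x),
    \qquad v_i \sim \mathcal{N}(0, \sigma_v^2), \quad w_i \sim \mathcal{N}(0, \sigma_w^2 I), \quad b \sim \mathcal{N}(0, \sigma_b^2).

The n summands are i.i.d. with finite variance, so by the central limit theorem the outputs (f(x^{(1)}), ..., f(x^{(k)})) on any finite set of inputs become jointly Gaussian as n -> infinity, with covariance

    K(x, x') = \sigma_b^2 + \sigma_v^2 \, \mathbb{E}_{w}\big[\phi(w^\top x) \, \phi(w^\top x')\big].

Deeper feedforward networks iterate this recursion layer by layer; the tensor programs machinery in the paper is what extends the same conclusion to weight sharing and recurrence (resnets, GRUs, transformers, batchnorm, etc.).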

8 replies, 1027 likes


Greg Yang: 1/ I can't teach you how to dougie but I can teach you how to compute the Gaussian Process corresponding to infinite-width neural networks of ANY architecture, feedforward or recurrent, e.g. resnet, GRU, transformers, etc. RT plz 💪 http://arxiv.org/abs/1910.12478 https://t.co/TgCBmf1OcA
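As a hypothetical sanity check (not code from the paper or the GP4A repo), one can sample many wide random one-hidden-layer ReLU networks and compare the empirical output covariance against the analytic arc-cosine NNGP kernel it should converge to:

    import numpy as np

    # Sample many wide, random one-hidden-layer ReLU networks and compare the
    # empirical covariance of their outputs to the analytic NNGP kernel.
    rng = np.random.default_rng(0)
    d, n, trials = 3, 2048, 20000        # input dim, width, Monte Carlo samples
    x1 = np.array([1.0, 0.5, -0.2])
    x2 = np.array([0.3, -1.0, 0.8])

    outs = np.empty((trials, 2))
    for t in range(trials):
        W = rng.normal(size=(n, d))      # hidden weights ~ N(0, 1)
        v = rng.normal(size=n)           # readout weights ~ N(0, 1)
        h = np.maximum(0.0, np.stack([x1, x2]) @ W.T)   # ReLU features, shape (2, n)
        outs[t] = h @ v / np.sqrt(n)     # 1/sqrt(n) scaling keeps outputs O(1)

    emp_cov = np.cov(outs.T, bias=True)[0, 1]

    # Analytic arc-cosine kernel of degree 1 (Cho & Saul):
    # E_w[ReLU(w.x) ReLU(w.x')] = ||x|| ||x'|| (sin t + (pi - t) cos t) / (2 pi)
    theta = np.arccos(x1 @ x2 / (np.linalg.norm(x1) * np.linalg.norm(x2)))
    analytic = (np.linalg.norm(x1) * np.linalg.norm(x2)
                * (np.sin(theta) + (np.pi - theta) * np.cos(theta)) / (2 * np.pi))

    print(emp_cov, analytic)             # should roughly agree

The same kind of Monte Carlo check is what the paper's kernels for GRUs, transformers, and batchnorm networks can be compared against, just with those architectures in place of the toy ReLU layer above.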

4 replies, 398 likes


Greg Yang: RNNs and batchnorm will be coming soon, but you can already play with them here https://github.com/thegregyang/GP4A The general theory for this is based on tensor programs https://arxiv.org/abs/1902.04760 https://arxiv.org/abs/1910.12478 Give Neural Tangents a try and let us know what you think!

1 reply, 250 likes


Greg Yang: 1/ Neural networks are Gaussian Processes --- the Poster Edition from #NeurIPS2019 last week. In case you missed it, here’s a twitter version of the poster presentation, following the format of @colinraffel; and here’s the previous tweet thread https://twitter.com/TheGregYang/status/1202608248534077440?s=20 https://t.co/lHJgH43gqa

1 reply, 209 likes


Microsoft Research: Explore the open-source implementations of the Gaussian Process kernels of simple RNN, GRU, transformer, and batchnorm+ReLU network on GitHub: https://github.com/thegregyang/GP4A

0 replies, 47 likes


Greg Yang: Hit me up @NeurIPSConf if you wanna learn more about wide neural networks and come to my poster session on Wednesday 5pm to 7pm, east exhibition hall B+C, poster #242 https://whova.com/webapp/event/program/839448/ https://t.co/YUXBuYMU2N

0 replies, 28 likes


Andrey Kurenkov 🤖 @ Neurips: This Twitter thread by @TheGregYang, as well as the associated poster (which I stopped by today, hope you don't mind the not-so-great pic 😅), is a great example of communicating tricky math stuff with both depth and accessible & concise clarity! We should all strive for this! :) https://t.co/ZJ1J8Hqdvb

2 replies, 25 likes


Greg Yang: Pairs best with the paper https://arxiv.org/abs/1910.12478 and previous discussion https://twitter.com/TheGregYang/status/1189174848611745792?s=20

1 reply, 5 likes


Greg Yang: @andrewgwils 1/2 This prior for DNNs has been studied recently (extending Neal's work) in the limit of infinite width https://arxiv.org/abs/1711.00165 https://arxiv.org/abs/1810.05148 https://arxiv.org/abs/1804.11271 in particular https://arxiv.org/abs/1910.12478 shows this prior is a GP for *any* DNN architecture

1 reply, 4 likes


Nicole Radziwill: this is super cool. thanks @BruceTedesco for RTing it

0 replies, 3 likes


Greg Yang: @sschoenholz @stormtroper1721 @alanyttian Thanks for ping, Sam! Here is, for example, a thread on why all NNs look like Gaussian Processes at initialization. https://twitter.com/TheGregYang/status/1202608248534077440?s=19

0 replies, 3 likes


Kevin Yang 楊凱筌: Another poster I'm really excited to see. I'm basically a sucker for anything that has GPs and NNs together.

0 replies, 1 like


Matios Berhe: I’m not skilled enough to know why this makes me nervous cc:@paulportesi

0 replies, 1 like


Sham Kakade: cool stuff from @TheGregYang: Tensors, Neural Nets, GPs, and kernels! looks like we can derive a corresponding kernel/GP in a fairly general sense. very curious on broader empirical comparisons to neural nets, which (potentially) draw strength from the non-linear regime!

0 replies, 1 like


Hacker News: Wide Neural Networks of Any Architecture Are Gaussian Processes: https://arxiv.org/abs/1910.12478 Comments: https://news.ycombinator.com/item?id=21651113

0 replies, 1 like


Content

Found on Mar 13 2020 at https://arxiv.org/pdf/1910.12478.pdf

PDF content of a computer science paper: Tensor Programs I: Wide Feedforward or Recurrent Neural Networks of Any Architecture are Gaussian Processes