Jean-Baptiste Cordonnier: Very happy to share our latest work accepted at #ICLR2020: we prove that a Self-Attention layer can express any CNN layer. 1/5
🍿Interactive website: https://epfml.github.io/attention-cnn/
📝Blog: http://jbcordonnier.com/posts/attention-cnn/ https://t.co/X1rNS1JvPt
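The gist of the construction (as described in the paper): with one head per kernel position, each head attends deterministically to a single relative pixel offset, and the output projection mixes the heads with the kernel weights. Below is a minimal numpy sketch of that idea for a 1-channel 3×3 "same" convolution; all names are illustrative and this is not the authors' released code.

```python
import numpy as np

# Emulate a 3x3 convolution with 9 "attention heads", each head copying
# the pixel at one fixed relative offset (hypothetical toy construction).
H = W = 5
K = 3
rng = np.random.default_rng(0)
img = rng.standard_normal((H, W))      # 1-channel input feature map
kernel = rng.standard_normal((K, K))   # conv kernel we want to reproduce

pad = np.pad(img, 1)                   # zero padding for a 'same' conv
offsets = [(dy, dx) for dy in (-1, 0, 1) for dx in (-1, 0, 1)]

# "Attention" path: head h's value is the input shifted by (dy, dx);
# the output projection weights each head by the matching kernel entry.
out_attn = np.zeros((H, W))
for dy, dx in offsets:
    shifted = pad[1 + dy : 1 + dy + H, 1 + dx : 1 + dx + W]
    out_attn += kernel[dy + 1, dx + 1] * shifted

# Reference path: direct 3x3 cross-correlation with zero padding.
out_conv = np.zeros((H, W))
for i in range(H):
    for j in range(W):
        out_conv[i, j] = np.sum(kernel * pad[i : i + K, j : j + K])

assert np.allclose(out_attn, out_conv)
```

The hard one-hot attention pattern here corresponds to the limiting case of the paper's relative positional encoding; learned softmax attention can approximate it arbitrarily well.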
5 replies, 1156 likes
hardmaru: Self-attention is proving to be a really good unified prior for both image and sequence processing.
It can prob also learn useful representations for images that are difficult for conv layers to learn.
See author’s thread for blog post and web demo: https://twitter.com/jb_cordonnier/status/1215581826187743232?s=21
1 replies, 139 likes
Torsten Scholak: After reading @chrmanning et al.'s paper on what BERT looks at, https://arxiv.org/abs/1906.04341, this makes intuitive sense to me
1 replies, 75 likes
(not Hugging for the moment) Face: On the Relationship between Self-Attention and Convolutional Layers
@jb_cordonnier, @loukasa_tweet, & Jaggi show that multi-head attention layers can and often do perform convolution, linking two of the most popular operations in modern neural networks.
2 replies, 50 likes
Maks Sorokin 🦾: @hardmaru @iclr_conf @jb_cordonnier Author's tweet: https://twitter.com/jb_cordonnier/status/1215581826187743232?s=21
0 replies, 6 likes
Chaitanya Joshi: @ChrSzegedy Similar results for computer vision:
1 replies, 0 likes
Found on Jan 10 2020 at https://openreview.net/pdf?id=HJlnC1rKPB