AN IMAGE IS WORTH 16X16 WORDS: TRANSFORMERS FOR IMAGE RECOGNITION AT SCALE

Comments

Oriol Vinyals: Recent conversation with a friend: @ilyasut: what's your take on https://openreview.net/pdf?id=YicbFdNTTy? @OriolVinyalsML: my take is: farewell convolutions : ) https://t.co/9PEvxmWvO4

11 replies, 904 likes


Gabriel Ilharco: I've been seeing a lot of talk around the recent Vision Transformer (ViT) paper, so I thought I'd highlight some of my favorite previous work on self-attention and transformers in computer vision! Link to ViT: https://openreview.net/pdf?id=YicbFdNTTy (thread 👇)

2 replies, 251 likes


Xavier Bresson: Actually it can be shown that convolution and transformer/attention are (almost) equivalent for graphs. Architectures like Transformers and ConvNets are slowly but happily converging.

3 replies, 233 likes


Ilya Sutskever: attention is all you need, as anonymous mathematical proof: https://openreview.net/pdf?id=YicbFdNTTy

3 replies, 213 likes


Tim Dettmers: A precious summary of some uses of attention in vision! Attention has been used in vision for a long time, and in light of the recent success, it is insightful to look back at what worked, what did not, and why.

0 replies, 32 likes


Vladimir Haltakov: Another paper review, but a little different this time... 🤷‍♂️ The paper is not published yet, but is submitted for review at ICLR 2021. It is getting a lot of attention from the CV/ML community, though, and many speculate that it is the end of CNNs... 👇 https://twitter.com/OriolVinyalsML/status/1312404990871375873?s=20 https://t.co/dZGBYB8A5U

3 replies, 11 likes


Reza Zadeh: Another notable attempt: https://openreview.net/pdf?id=YicbFdNTTy

0 replies, 8 likes


Volodymyr Kuleshov: Do you find the experimental results impressive? (Honest question)

1 reply, 8 likes


Pierre Vandergheynst: Semi non-local image processing is back via the transformer door in computer vision

0 replies, 4 likes


Trending Papers: [1/10] 📈 - An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale - 331 ⭐ - 📄 https://openreview.net/pdf?id=YicbFdNTTy - 🔗 https://github.com/lucidrains/vit-pytorch

0 replies, 2 likes


School of AI Algiers: Only a few hours remain until the start of a new SOAI Reading Session, where we will talk about "An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale" 😍 https://openreview.net/pdf?id=YicbFdNTTy Join us today at 7:00 pm through this link: 👉 https://meet.google.com/tbw-pyfj-tyh https://t.co/ViybsmugRY

0 replies, 2 likes


Ankur Handa: This is a nice list of papers on self-attention used in image based tasks.

0 replies, 2 likes


Content

Found on Oct 03 2020 at https://openreview.net/pdf?id=YicbFdNTTy

PDF content of a computer science paper: AN IMAGE IS WORTH 16X16 WORDS: TRANSFORMERS FOR IMAGE RECOGNITION AT SCALE