Stand-Alone Self-Attention in Vision Models

Comments

niki parmar: New Paper: Stand-Alone Self-Attention in Vision Models https://arxiv.org/abs/1906.05909 Can attention work as a stand-alone primitive for vision models? We develop a pure self-attention model by replacing the spatial convolutions in a ResNet by a simple, local self-attention layer. https://t.co/A2NDeivGmt

15 replies, 463 likes


niki parmar: Our paper got accepted to #Neurips!! Code release coming soon, keep an eye out :)

2 replies, 206 likes


William Fedus: Additional evidence of the transformer/self-attention as a useful computational primitive in vision tasks such as ImageNet classification and COCO detection. Future work is exciting: "...we hope to unify convolution and self-attention to best combine their unique advantages"

0 replies, 23 likes


Jean-Baptiste Cordonnier: Our work explains the recent success of Transformer architecture applied to vision: Attention Augmented Convolutional Networks. @IrwanBello et al., 2019. https://arxiv.org/abs/1904.09925 Stand-Alone Self-Attention in Vision Models. Ramachandran et al., 2019. https://arxiv.org/abs/1906.05909 3/5

1 reply, 17 likes


Ashish Vaswani: Pure content-based interactions are competitive for vision models. Lots of exciting work to be done in this research area.

2 replies, 14 likes


Daisuke Okanohara: For image recognition tasks, they showed that local self-attention is competitive with or superior to convolution in higher layers, and that a full-attention model can achieve performance similar to a ConvNet. Better absolute/relative position encoding is required. https://arxiv.org/abs/1906.05909

0 replies, 8 likes


HotComputerScience: Most popular computer science paper of the day: "Stand-Alone Self-Attention in Vision Models" https://hotcomputerscience.com/paper/stand-alone-self-attention-in-vision-models https://twitter.com/nikiparmar09/status/1140649942505013249

0 replies, 4 likes


Content

Found on Jun 17 2019 at https://arxiv.org/pdf/1906.05909.pdf

PDF content of a computer science paper: Stand-Alone Self-Attention in Vision Models