niki parmar: New Paper:
Stand-Alone Self-Attention in Vision Models
Can attention work as a stand-alone primitive for vision models?
We develop a pure self-attention model by replacing the spatial convolutions in a ResNet with a simple, local self-attention layer. https://t.co/A2NDeivGmt
15 replies, 463 likes
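For readers unfamiliar with the idea, here is a minimal sketch of what a local self-attention layer over a feature map could look like. It is not the authors' released code. It assumes a single head, content-only attention (no relative position term), and skips out-of-bounds neighbors at the borders. The names `matvec` and `local_self_attention` are hypothetical, and plain Python lists are used only to keep the example dependency-free.

```python
import math

def matvec(W, x):
    """Multiply matrix W (a list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def local_self_attention(x, Wq, Wk, Wv, k=3):
    """Content-only local self-attention over an H x W grid (illustrative sketch).

    x[i][j] is the feature vector at pixel (i, j); k is the odd side length
    of the square neighborhood. Out-of-bounds neighbors are skipped, which
    is an assumption of this sketch (zero padding is another option).
    """
    H, W = len(x), len(x[0])
    r = k // 2
    out = [[None] * W for _ in range(H)]
    for i in range(H):
        for j in range(W):
            q = matvec(Wq, x[i][j])  # query from the center pixel
            logits, values = [], []
            for a in range(max(0, i - r), min(H, i + r + 1)):
                for b in range(max(0, j - r), min(W, j + r + 1)):
                    key = matvec(Wk, x[a][b])
                    logits.append(sum(qi * ki for qi, ki in zip(q, key)))
                    values.append(matvec(Wv, x[a][b]))
            # Numerically stable softmax over the neighborhood.
            m = max(logits)
            weights = [math.exp(l - m) for l in logits]
            total = sum(weights)
            # Output is the attention-weighted sum of neighborhood values.
            out[i][j] = [
                sum(w * v[c] for w, v in zip(weights, values)) / total
                for c in range(len(values[0]))
            ]
    return out
```

Unlike a convolution, the weights applied to each neighbor here depend on the content of the pixels themselves rather than only on their relative position.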
niki parmar: Our paper got accepted to #Neurips!!
Code release coming soon, keep an eye out :)
2 replies, 206 likes
William Fedus: Additional evidence of the transformer/self-attention as a useful computational primitive in vision tasks such as ImageNet classification and COCO detection.
Future work is exciting: "...we hope to unify convolution and self-attention to best combine their unique advantages"
0 replies, 23 likes
Jean-Baptiste Cordonnier: Our work explains the recent success of the Transformer architecture applied to vision:
Attention Augmented Convolutional Networks. @IrwanBello et al., 2019. https://arxiv.org/abs/1904.09925
Stand-Alone Self-Attention in Vision Models. Ramachandran et al., 2019. https://arxiv.org/abs/1906.05909
1 reply, 17 likes
Ashish Vaswani: Pure content-based interactions are competitive for vision models. Lots of exciting work to be done in this research area.
2 replies, 14 likes
Daisuke Okanohara: For image recognition tasks, they showed that local self-attention is competitive with or superior to convolution in higher layers, and that a full-attention model can achieve performance similar to a ConvNet. Better absolute/relative position encoding is required. https://arxiv.org/abs/1906.05909
0 replies, 8 likes
HotComputerScience: Most popular computer science paper of the day:
"Stand-Alone Self-Attention in Vision Models"
0 replies, 4 likes
Found on Jun 17 2019 at https://arxiv.org/pdf/1906.05909.pdf