VISUALBERT: A SIMPLE AND PERFORMANT BASELINE FOR VISION AND LANGUAGE

Comments

Aug 12 2019 Mark Yatskar

hot off the press -- VisualBERT: A simple and performant baseline for vision and language. Language + image region proposals -> stack of Transformers + pretrain on captions = SOTA or near on 4 V&L problems. https://arxiv.org/abs/1908.03557 @LiLiunian +Cho-Jui Hsieh +Da Yin @kaiwei_chang
3 replies, 108 likes


Aug 13 2019 Paul Liang

some exciting recent work in self-supervised multimodal learning including VideoBERT (https://arxiv.org/abs/1904.01766), ViLBERT (https://arxiv.org/abs/1908.02265), and VisualBERT (https://arxiv.org/abs/1908.03557). for more papers in multimodal representation learning, check out https://github.com/pliang279/awesome-multimodal-ml https://t.co/8tCQ0Gg5Qo
0 replies, 84 likes


Aug 12 2019 Thomas Lahore

VisualBERT: A Simple and Performant Baseline for Vision and Language "VisualBERT...is even sensitive to syntactic relationships, tracking, for example, associations between verbs and image regions corresponding to their arguments" https://arxiv.org/abs/1908.03557 https://t.co/HCfV8QOBtA
0 replies, 53 likes


Aug 12 2019 Yoav Artzi

New SOTA on NLVR2. Very impressive progress! 👏 Will be interesting to see NLVR2 attention examples. Still a lot of room for human performance. http://lil.nlp.cornell.edu/nlvr/ #NLProc https://t.co/ur6kM4Achy
1 reply, 36 likes


Aug 27 2019 William Wang

https://t.co/Cng1KgTMV0
1 reply, 12 likes


Aug 12 2019 arXiv CS-CL

VisualBERT: A Simple and Performant Baseline for Vision and Language http://arxiv.org/abs/1908.03557
0 replies, 9 likes


Aug 12 2019 arXiv CS-CV

VisualBERT: A Simple and Performant Baseline for Vision and Language http://arxiv.org/abs/1908.03557
0 replies, 8 likes


Aug 27 2019 Rogue 🌻. Bigham

https://t.co/OwqpzS1Yjx
0 replies, 2 likes


Content