
RoBERTa: A Robustly Optimized BERT Pretraining Approach

Comments

Jul 29 2019 Yann LeCun

The paper on RoBERTa, the current holder of pole position on the GLUE leaderboard (https://gluebenchmark.com/leaderboard/): https://arxiv.org/abs/1907.11692
3 replies, 311 likes


Jul 29 2019 Ves Stoyanov

Our paper describing RoBERTa is out: https://arxiv.org/abs/1907.11692. State-of-the-art results on GLUE, SQuAD, and RACE. https://twitter.com/sleepinyourhat/status/1151940994688016384?s=19
2 replies, 305 likes


Jul 18 2019 Sam Bowman

The new RoBERTa model from FAIR just edged out XLNet on the http://gluebenchmark.com nine-task leaderboard. Here's the information we have so far: https://t.co/B9wPtR7bY5
6 replies, 252 likes


Jul 30 2019 Russ Salakhutdinov

I can see an XLNetTron paper coming out very soon, studying various objectives and training parameters of XLNet :)
0 replies, 70 likes


Aug 23 2019 Daisuke Okanohara

RoBERTa improves BERT w/o architecture changes: 1) dynamic masking, 2) removing next-sentence prediction, 3) larger batch sizes, 4) more training iterations, 5) byte-level BPE. It achieves a new SoTA on many tasks, and its SuperGLUE score approaches human performance. https://arxiv.org/abs/1907.11692 https://super.gluebenchmark.com/leaderboard
0 replies, 16 likes
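
Dynamic masking, the first change listed above, means sampling a fresh mask every time a sequence is fed to the model rather than fixing the masked positions once at preprocessing time, as BERT did. A minimal sketch of the idea in Python (simplified for illustration; the actual implementation works on token IDs and also applies the 80/10/10 mask/replace/keep scheme):

    import random

    MASK_TOKEN = "<mask>"  # illustrative placeholder for the tokenizer's mask token

    def dynamic_mask(tokens, mask_prob=0.15):
        """Sample a fresh random mask on every call, so each pass over the data
        sees different masked positions for the same sequence."""
        return [MASK_TOKEN if random.random() < mask_prob else tok for tok in tokens]

    tokens = "the quick brown fox jumps over the lazy dog".split()
    print(dynamic_mask(tokens))  # masked positions differ between calls
    print(dynamic_mask(tokens))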


Sep 22 2019 Victoria X Lin

Interesting modeling detail from the RoBERTa paper (https://arxiv.org/pdf/1907.11692.pdf). I wonder what the motivation is for adding an extra separator token and how much effect it has 🧐 https://t.co/Ri76amPiS5
1 reply, 15 likes
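
For reference, the released RoBERTa code encodes a sentence pair with two separator tokens between the segments, i.e. <s> A </s></s> B </s>. A minimal sketch of inspecting this with the Hugging Face tokenizer (the specific library calls are an assumption for illustration, not something cited in the thread):

    from transformers import RobertaTokenizer

    tokenizer = RobertaTokenizer.from_pretrained("roberta-base")

    # encode() with a text pair inserts the special tokens automatically
    ids = tokenizer.encode("first sentence", "second sentence", add_special_tokens=True)
    print(tokenizer.convert_ids_to_tokens(ids))
    # expected pattern: ['<s>', ..., '</s>', '</s>', ..., '</s>']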


Jul 29 2019 Daniel Whitenack

Here we go again. And just when I was about to try XLNet. The question is how long it will take before @huggingface releases it in pytorch-transformers. #ai #NLProc
0 replies, 10 likes
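
RoBERTa was added to the Hugging Face library shortly afterwards; a minimal sketch of loading it, assuming the roberta-base checkpoint name (illustrative only, not part of the tweet):

    import torch
    from transformers import RobertaModel, RobertaTokenizer

    tokenizer = RobertaTokenizer.from_pretrained("roberta-base")
    model = RobertaModel.from_pretrained("roberta-base")
    model.eval()

    # encode a single sentence and run a forward pass without gradients
    inputs = tokenizer.encode("RoBERTa is a robustly optimized BERT.", return_tensors="pt")
    with torch.no_grad():
        outputs = model(inputs)
    print(outputs[0].shape)  # last hidden states: (batch, seq_len, hidden_size)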


Aug 14 2019 Stanford NLP Group

“Nvidia was able to train BERT-Large using optimized PyTorch software and a DGX-SuperPOD of more than 1,000 GPUs that is able to train BERT in 53 minutes.” – ⁦⁦@kharijohnson⁩, ⁦@VentureBeat⁩ https://venturebeat.com/2019/08/13/nvidia-trains-worlds-largest-transformer-based-language-model/
0 replies, 9 likes


Oct 05 2019 AUEB NLP Group

Next AUEB NLP Group meeting, Tue Oct 8, 17:15-19:00, *IPLab* (http://nlp.cs.aueb.gr/contact.html): Discussion of RoBERTa (https://arxiv.org/abs/1907.11692) and ALBERT (https://arxiv.org/abs/1909.11942). Coordinator: Ilias Chalkidis @KiddoThe2B. Study the papers before the meeting. All welcome.
0 replies, 4 likes


Oct 13 2019 AUEB NLP Group

Next AUEB NLP Group meeting, Tue Oct 15, 17:15-19:00, *IPLab* (http://nlp.cs.aueb.gr/contact.html): Part II of discussion of RoBERTa (https://arxiv.org/abs/1907.11692) and ALBERT (https://arxiv.org/abs/1909.11942). Coordinator: Ilias Chalkidis. Study the papers before the meeting. All welcome.
0 replies, 2 likes


Jul 29 2019 arXiv CS-CL

RoBERTa: A Robustly Optimized BERT Pretraining Approach http://arxiv.org/abs/1907.11692
0 replies, 1 likes


Jul 30 2019 Stephen Pimentel

RoBERTa: A Robustly Optimized BERT Pretraining Approach https://arxiv.org/abs/1907.11692
0 replies, 1 likes


Sep 15 2019 rajarshee mitra

Too incremental? 1. Facebook's RoBERTa: BERT + more data + longer training. https://arxiv.org/abs/1907.11692 2. Salesforce's conditional transformer: prepends control codes to the input. https://einstein.ai/presentations/ctrl.pdf
1 reply, 0 likes

