
Unsupervised Cross-lingual Representation Learning at Scale

Comments

Nov 07 2019 Alexis Conneau

Our new paper: Unsupervised Cross-lingual Representation Learning at Scale https://arxiv.org/pdf/1911.02116.pdf We release XLM-R, a Transformer MLM trained in 100 langs on 2.5 TB of text data. Double digit gains on XLU benchmarks + strong per-language performance (~XLNet on GLUE). [1/6] https://t.co/0RX1ljGuri
5 replies, 423 likes


Nov 09 2019 Yann LeCun

XLM-R: Amazing results on XLU and GLUE benchmarks from Facebook AI: large transformer network trained on 2.5TB of text from 100 languages.
1 reply, 204 likes


Nov 07 2019 Guillaume Lample

XLM-R, the large scale version of XLM. Super impressive results. A single model trained on 2.5TB of data handles 100 languages, and outperforms mBERT by more than 10% on several classification benchmarks, with up to 21% accuracy on low-resource languages like Swahili and Urdu.
1 reply, 133 likes


Nov 07 2019 Thomas Wolf

Nice work by @alex_conneau @kakemeister and co. on pretraining multilingual language models to overcome the curse of multilinguality. Pretty impressive to see the resulting 100-languages model challenge strong English-only models like XLNet & RoBERTa 👇 https://twitter.com/alex_conneau/status/1192490719031656448 https://t.co/VPJ5QIbPUK
1 reply, 64 likes


Nov 07 2019 Kartikay Khandelwal

Really excited to share new work! XLM-R: A multilingual model in 100 languages, trained on 2TB of data! SOTA on cross-lingual benchmarks AND competitive with monolingual models on GLUE! We also explore how to effectively train these models! My first first author NLP paper! :)
1 reply, 64 likes


Nov 08 2019 Ves Stoyanov

We released XLM-R (XLM-RoBERTa). It achieves new state-of-the-art results on cross-lingual NLI, QA and NER. I am particularly excited about the huge improvement on low-resource languages.
0 replies, 63 likes


Nov 08 2019 Roee Aharoni

Very happy to see more massively-multilingual work coming out. The world needs more non-English NLP!
0 replies, 17 likes


Nov 19 2019 CeShine 😷

They found that applying a Sentence Piece model on raw text data for all languages is enough. No need for extra tokenization steps.
0 replies, 5 likes


Nov 07 2019 Myle Ott

Now available in fairseq: https://github.com/pytorch/fairseq/tree/master/examples/xlmr
0 replies, 5 likes


Nov 07 2019 roadrunner01

Unsupervised Cross-lingual Representation Learning at Scale pdf: https://arxiv.org/pdf/1911.02116.pdf abs: https://arxiv.org/abs/1911.02116 https://t.co/l5XiXJrBsZ
0 replies, 4 likes


Nov 06 2019 Stefan

XLM-RoBERTa is out 😍 Thanks to the fairseq-team 🤗 #nlp https://github.com/pytorch/fairseq/tree/master/examples/xlmr
0 replies, 3 likes


Nov 07 2019 Stefan

"Unsupervised Cross-lingual Representation Learning at Scale" is out now: https://arxiv.org/abs/1911.02116
0 replies, 2 likes


Nov 07 2019 Kartikay Khandelwal

You can find the paper here: https://arxiv.org/pdf/1911.02116.pdf
0 replies, 1 like

