XTREME: A Massively Multilingual Multi-task Benchmark for Evaluating Cross-lingual Generalization

Comments

Sebastian Ruder: I'm excited to announce XTREME, a new benchmark that covers 9 tasks and 40 typologically diverse languages. Paper: https://arxiv.org/abs/2003.11080 Blog post: https://ai.googleblog.com/2020/04/xtreme-massively-multilingual-multi.htm Code: https://github.com/google-research/xtreme/ https://t.co/YVo0T9gT63

3 replies, 799 likes


Google AI: Announcing XTREME, a new #NaturalLanguageProcessing benchmark for cross-lingual generalization, which covers 40 typologically diverse languages using nine tasks that collectively require reasoning about different levels of syntax or semantics. Learn more ↓ https://goo.gle/2xh8rlp

5 replies, 670 likes


Sam Bowman: I had heard mumblings of this earlier, and I'm excited that it has come out: XTREME, Google/DeepMind/CMU's new benchmark for cross-lingual transfer in NLU. https://arxiv.org/pdf/2003.11080.pdf

7 replies, 222 likes


Noah Smith: I must be missing something. If an aggregate-of-benchmarks doesn't inspire/catalyze *new*/better dataset creation (which this one doesn't appear to), what's the effect, other than to divert credit/citations from the hard-working researchers who actually build datasets?

4 replies, 50 likes


Junjie Hu: I am happy to share my recent work on benchmarking crosslingual generalization of popular contextual models. Joint work w/ @seb_ruder @orf_bnw @gneubig Melvin Aditya Code: https://github.com/google-research/xtreme Paper: https://arxiv.org/pdf/2003.11080.pdf Website would be up soon.

0 replies, 28 likes


Sam Bowman: It looks like MSR has been working on something very similar in parallel, though with generation tasks included: https://arxiv.org/pdf/2004.01401.pdf

0 replies, 26 likes


Roee Aharoni: Better NLP for the world’s languages requires massively multilingual benchmarks! Awesome work by @JunjieHu12 @seb_ruder @gneubig @orf_bnw and Melvin Johnson

1 reply, 11 likes


Thomas Scialom: Really cool work on #NLProc to establish a general multilingual benchmark. XTREME: A Massively Multilingual Multi-task Benchmark for Evaluating Cross-lingual Generalization https://arxiv.org/abs/2003.11080 by Junjie Hu @seb_ruder et al. 1/3

1 reply, 7 likes


arxiv: XTREME: A Massively Multilingual Multi-task Benchmark for Evaluating Cross-lingual Genera... http://arxiv.org/abs/2003.11080 https://t.co/fPFYVNwpcj

0 replies, 4 likes


roadrunner01: XTREME: A Massively Multilingual Multi-task Benchmark for Evaluating Cross-lingual Generalization pdf: https://arxiv.org/pdf/2003.11080.pdf abs: https://arxiv.org/abs/2003.11080 https://t.co/BeNPtmkz5q

0 replies, 3 likes


HotComputerScience: Most popular computer science paper of the day: "XTREME: A Massively Multilingual Multi-task Benchmark for Evaluating Cross-lingual Generalization" https://hotcomputerscience.com/paper/xtreme-a-massively-multilingual-multi-task-benchmark-for-evaluating-cross-lingual-generalization https://twitter.com/seb_ruder/status/1249779748961767425

0 replies, 2 likes


David Aronchick: Models are great, but benchmarks like this drive the industry forward to new heights. Really cool new release!

0 replies, 2 likes


ML and Data Projects To Know: XTREME: A Massively Multilingual Multi-task Benchmark for Evaluating Cross-lingual Generalization by @JunjieHu12, @seb_ruder, @asiddhant1, @gneubig, @orf_bnw, Melvin Johnson Paper Link: https://arxiv.org/abs/2003.11080 https://t.co/ZXOEOj8qTW

0 replies, 1 like


Content

Found on Apr 13 2020 at https://arxiv.org/pdf/2003.11080.pdf

PDF content of a computer science paper: XTREME: A Massively Multilingual Multi-task Benchmark for Evaluating Cross-lingual Generalization