
Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer

Comments

Oct 24 2019 Colin Raffel

New paper! We perform a systematic study of transfer learning for NLP using a unified text-to-text model, then push the limits to achieve SoTA on GLUE, SuperGLUE, CNN/DM, and SQuAD. Paper: https://arxiv.org/abs/1910.10683 Code/models/data/etc: https://git.io/Je0cZ Summary ⬇️ (1/14) https://t.co/VP1nkkHefB
9 replies, 1150 likes
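Concretely, the "unified text-to-text" framing means every task is posed as feeding the model text and training it to produce text, with a short task prefix selecting the behavior. Below is a minimal sketch of that idea, assuming the Hugging Face `transformers` port of the released checkpoints (an assumption for convenience; the official code is in the linked google-research repo, and the prefixes shown come from the paper's appendix):

```python
# Sketch only: loads a small released checkpoint via the `transformers` port
# and runs a few tasks through the same model, loss-free, at inference time.
from transformers import T5ForConditionalGeneration, T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")

# Every task is plain text in, plain text out; a task prefix picks the task.
prompts = [
    "translate English to German: The house is wonderful.",
    "summarize: state authorities dispatched emergency crews tuesday to ...",
    "cola sentence: The course is jumping well.",
]

for prompt in prompts:
    input_ids = tokenizer(prompt, return_tensors="pt").input_ids
    output_ids = model.generate(input_ids, max_length=50)
    print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```

Because inputs and outputs are always strings, the same architecture, objective, and decoding procedure cover translation, summarization, and classification alike, which is what makes the paper's large-scale comparisons possible.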


Oct 24 2019 Sebastian Ruder

The new study by @colinraffel et al. provides a great overview of best practices in the current transfer learning landscape in NLP. Check out page 33 of the paper or below for the main takeaways. https://arxiv.org/abs/1910.10683 https://t.co/8p2oZ3q8uf
2 replies, 340 likes


Nov 12 2019 Colin Raffel

I'm starting a professorship in the CS department at UNC in fall 2020 (!!) and am hiring students! If you're interested in doing a PhD @unccs please get in touch. More info here: https://cs.unc.edu/admissions/graduate/graduate-programs/
28 replies, 226 likes


Oct 24 2019 Sam Bowman

Major progress on our SuperGLUE benchmark from Brain, plus a really extensive ablation study!
1 replies, 85 likes


Oct 24 2019 Delip Rao

🚨🚨Big #nlproc claim from Google: "we .. [introduce] a unified framework that converts every language problem into a text-to-text format." https://arxiv.org/abs/1910.10683 https://t.co/pBZrdompF9
5 replies, 62 likes


Oct 24 2019 Katherine Lee

Curious about the state of NLP? We explore how different pre-training objectives, datasets, training strategies, and more affect downstream task performance, and how well we can do when we combine these insights & scale. It was amazing to collaborate with this team!
0 replies, 51 likes


Oct 24 2019 Jack Hessel

Table 15 from T5 might be the most computationally expensive table ever constructed in the history of natural language processing 🙀 https://arxiv.org/pdf/1910.10683.pdf https://t.co/wv40rj1MPs
0 replies, 38 likes


Nov 02 2019 Ryan Chesler

https://arxiv.org/abs/1910.10683 Read through the new T5 paper from Google. Major conclusions: Does pretraining data matter? A little. Does pretraining task matter? A little. Does model architecture matter? A little. Does model size matter? A lot. Training an 11B model gave SOTA.
2 replies, 34 likes


Nov 12 2019 Mohit Bansal

Welcome again, @ColinRaffel! Looking fwd to having u here soon 😀; & NLP folks applying for PhD this year, definitely apply to @UNCCS! In addn to Colin doing exciting NLP (e.g., see recent T5 paper: https://arxiv.org/abs/1910.10683), the awesome @snigdhac25 +@shsriva have also joined us!
0 replies, 24 likes


Nov 11 2019 Adam Roberts

A few of you at #ismir2019 asked me what I've been up to. Well, I took a sorta-but-not-really-hiatus from Magenta to work on NLP! The result was T5, and it's been a very rewarding experience. Now, I'm looking forward to bringing some new insights back to music generation.
1 replies, 22 likes


Oct 24 2019 niki parmar

Great work! A text-to-text Transformer with a masking loss does better than other transfer learning techniques.
0 replies, 21 likes
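For context, the "masking loss" referred to here is the paper's span-corruption objective: contiguous spans of the input are dropped and replaced by sentinel tokens, and the target is the sequence of sentinels followed by the dropped spans. A toy sketch of that preprocessing step (simplified; the actual implementation samples span positions and lengths randomly and operates on SentencePiece token ids):

```python
# Toy illustration of span corruption: drop chosen spans from the input,
# mark each with a sentinel, and build the target that restores them.
SENTINELS = ["<X>", "<Y>", "<Z>"]

def span_corrupt(tokens, spans, sentinels=SENTINELS):
    """`spans` is a list of (start, length) pairs to drop, in order."""
    inputs, targets = [], []
    cursor = 0
    for sentinel, (start, length) in zip(sentinels, spans):
        inputs.extend(tokens[cursor:start])            # keep uncorrupted text
        inputs.append(sentinel)                        # mark the dropped span
        targets.append(sentinel)                       # target echoes the sentinel...
        targets.extend(tokens[start:start + length])   # ...then the dropped tokens
        cursor = start + length
    inputs.extend(tokens[cursor:])
    targets.append(sentinels[len(spans)])              # final sentinel ends the target
    return " ".join(inputs), " ".join(targets)

tokens = "Thank you for inviting me to your party last week".split()
inp, tgt = span_corrupt(tokens, spans=[(2, 2), (8, 1)])
print(inp)  # Thank you <X> me to your party <Y> week
print(tgt)  # <X> for inviting <Y> last <Z>
```

The model is then trained to map the corrupted input to the target, which the paper finds works better than plain language-model or deshuffling objectives.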


Oct 26 2019 小猫遊りょう(たかにゃし・りょう)

T5 : Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer https://arxiv.org/abs/1910.10683 https://t.co/1b00tIjxqE
1 replies, 12 likes


Oct 24 2019 roadrunner01

Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer pdf: https://arxiv.org/pdf/1910.10683.pdf abs: https://arxiv.org/abs/1910.10683 github: https://github.com/google-research/text-to-text-transfer-transformer https://t.co/zbYUbPFYII
0 replies, 10 likes


Oct 26 2019 arXiv CS-CL

Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer http://arxiv.org/abs/1910.10683
0 replies, 9 likes


Oct 24 2019 Jeff Dalton

Great summary of encoder-decoder text-to-text models by @colinraffel and authors. Also, a new dataset based on a cleaned Common Crawl (C4). Particularly interesting reflections, as well as new SOTA on several tasks.
0 replies, 7 likes


Oct 25 2019 arXiv CS-CL

Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer http://arxiv.org/abs/1910.10683
0 replies, 5 likes


Oct 24 2019 Douglas Eck

Great work by my colleagues on the Google Research Brain Team.
0 replies, 4 likes


Oct 25 2019 tung

Google's T5 (Text-To-Text Transfer Transformer) language model sets a new record and gets very close to human performance on the SuperGLUE benchmark. https://super.gluebenchmark.com/leaderboard Paper: https://arxiv.org/abs/1910.10683 Code: https://github.com/google-research/text-to-text-transfer-transformer https://t.co/8SRJmoiaw6
0 replies, 3 likes


Oct 26 2019 HotComputerScience

Most popular computer science paper of the day: "Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer" https://hotcomputerscience.com/paper/exploring-the-limits-of-transfer-learning-with-a-unified-text-to-text-transformer https://twitter.com/colinraffel/status/1187161460033458177
0 replies, 3 likes


Oct 27 2019 arXiv CS-CL

Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer http://arxiv.org/abs/1910.10683
0 replies, 1 likes


Oct 22 2018 Gary Marcus

@jeffrschneider @bendee983 SWAG was already defeated, but there has been no progress on Winograd Schemas; you should check the leaderboards @allen_ai
1 replies, 0 likes

