
Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer

Comments

Colin Raffel: New paper! We perform a systematic study of transfer learning for NLP using a unified text-to-text model, then push the limits to achieve SoTA on GLUE, SuperGLUE, CNN/DM, and SQuAD. Paper: https://arxiv.org/abs/1910.10683 Code/models/data/etc: https://git.io/Je0cZ Summary ⬇️ (1/14) https://t.co/VP1nkkHefB

9 replies, 1192 likes


Sebastian Ruder: The new study by @colinraffel et al. provides a great overview of best practices in the current transfer learning landscape in NLP. Check out page 33 of the paper or below for the main takeaways. https://arxiv.org/abs/1910.10683 https://t.co/8p2oZ3q8uf

2 replies, 340 likes


Colin Raffel: I will be at @NeurIPSConf next week to present MixMatch [1] and give a T5 demo [2]! Please get in touch if you want to discuss research, eat vegan food, and/or go bouldering. [1] https://arxiv.org/abs/1905.02249 [2] https://arxiv.org/abs/1910.10683

6 replies, 130 likes


Sam Bowman: Major progress on our SuperGLUE benchmark from Brain, plus a really extensive ablation study!

1 replies, 85 likes


Thomas Wolf: Notebook with demo: https://github.com/huggingface/transformers/blob/master/notebooks/03-pipelines.ipynb Release notes for Transformers 2.7.0: https://github.com/huggingface/transformers/releases/tag/v2.7.0 T5 paper on "Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer": http://arxiv.org/abs/1910.10683

0 replies, 85 likes
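For readers who want to try the model after the Transformers release mentioned above, a minimal sketch along these lines should work. It assumes a recent transformers version that ships T5ForConditionalGeneration, so exact class names and arguments may differ from the 2.7.0 API linked in the tweet.

```python
# Minimal sketch of running T5 via Hugging Face Transformers.
# Assumes a recent transformers release with T5ForConditionalGeneration;
# the 2.7.0 API referenced in the tweet above may differ slightly.
from transformers import T5ForConditionalGeneration, T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")

# T5 selects the task with a plain-text prefix on the input.
inputs = tokenizer(
    "translate English to German: That is good.", return_tensors="pt"
)
output_ids = model.generate(**inputs, max_length=40)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```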


Mohit Bansal: Welcome again, @ColinRaffel! Looking forward to having you here soon πŸ˜€; & NLP folks applying for PhD this year, definitely apply to @UNCCS! In addition to Colin doing exciting NLP (e.g., see recent T5 paper: https://arxiv.org/abs/1910.10683), the awesome @snigdhac25 + @shsriva have also joined us!

1 replies, 69 likes


Delip Rao: 🚨🚨Big #nlproc claim from Google: "we .. [introduce] a unified framework that converts every language problem into a text-to-text format." https://arxiv.org/abs/1910.10683 https://t.co/pBZrdompF9

5 replies, 62 likes
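The quoted claim is easiest to see with task prefixes: every task is cast as plain input text mapped to plain target text, with the task named in the prefix. The sketch below reproduces examples in the spirit of the paper's Figure 1; treat the exact prefix strings and targets as illustrative rather than a complete specification.

```python
# Illustrative sketch of the text-to-text framing: every task becomes
# "input string -> target string", with the task named by a text prefix.
# Prefixes follow examples from the T5 paper; the exact strings are
# illustrative, not an exhaustive specification.
examples = [
    # translation
    ("translate English to German: That is good.", "Das ist gut."),
    # CoLA: linguistic acceptability, answered with a label word
    ("cola sentence: The course is jumping well.", "not acceptable"),
    # STS-B: a regression task, with the score emitted as a string
    ("stsb sentence1: The rhino grazed on the grass. "
     "sentence2: A rhino is grazing in a field.", "3.8"),
    # summarization
    ("summarize: state authorities dispatched emergency crews tuesday ...",
     "six people hospitalized after a storm in attala county."),
]

for source, target in examples:
    print(f"{source!r}  ->  {target!r}")
```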


Katherine Lee: Curious about the state of NLP? We explore how different pre-training objectives, datasets, training strategies, and more affect downstream task performance, and how well we can do when we combine these insights and scale up. It was amazing to collaborate with this team!

0 replies, 51 likes


Jack Hessel: Table 15 from T5 might be the most computationally expensive table ever constructed in the history of natural language processing πŸ™€ https://arxiv.org/pdf/1910.10683.pdf https://t.co/wv40rj1MPs

0 replies, 38 likes


Ryan Chesler: https://arxiv.org/abs/1910.10683 Read through the new T5 paper from Google. Major conclusions: Does pretraining data matter? A little. Does pretraining task matter? A little. Does model architecture matter? A little. Does model size matter? A lot. Training an 11B-parameter model gave SOTA.

2 replies, 34 likes


Tom Brown: Wanted to give credit to @colinraffel for this excellent summary thread for T5. I really appreciate having an overview before diving into the nitty gritty of a paper, and I used this as inspiration to do my own summary thread yesterday.

1 replies, 33 likes


Adam Roberts: A few of you at #ismir2019 asked me what I've been up to. Well, I took a sorta-but-not-really-hiatus from Magenta to work on NLP! The result was T5, and working on it has been a very rewarding experience. Now I'm looking forward to bringing some new insights back to music generation.

2 replies, 31 likes


niki parmar: Great work! A text-to-text Transformer with a masking loss does better than other transfer learning techniques.

0 replies, 21 likes
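The "masking loss" referred to here is the paper's span-corruption objective: contiguous spans of the input are replaced by sentinel tokens, and the target reconstructs only the dropped spans. Below is a toy sketch under simplified span sampling, with sentinel names following the Hugging Face vocabulary (<extra_id_0>, ...); it is not the paper's exact procedure.

```python
import random

def span_corrupt(tokens, corruption_rate=0.15, mean_span_len=3, seed=0):
    """Toy sketch of T5-style span corruption (not the paper's exact
    sampling procedure): replace random contiguous spans with sentinel
    tokens and build a target containing only the dropped spans."""
    rng = random.Random(seed)
    n_to_drop = max(1, round(len(tokens) * corruption_rate))
    n_spans = max(1, round(n_to_drop / mean_span_len))

    # Pick non-overlapping spans (simplistic: resample on overlap).
    dropped = set()
    spans = []
    while len(spans) < n_spans:
        start = rng.randrange(len(tokens))
        end = min(len(tokens), start + mean_span_len)
        if any(i in dropped for i in range(start, end)):
            continue
        spans.append((start, end))
        dropped.update(range(start, end))
    spans.sort()

    inputs, targets = [], []
    cursor = 0
    for k, (start, end) in enumerate(spans):
        sentinel = f"<extra_id_{k}>"       # sentinel naming as in the HF vocab
        inputs.extend(tokens[cursor:start])
        inputs.append(sentinel)
        targets.append(sentinel)
        targets.extend(tokens[start:end])
        cursor = end
    inputs.extend(tokens[cursor:])
    targets.append(f"<extra_id_{len(spans)}>")  # final sentinel marks the end
    return inputs, targets

tokens = "Thank you for inviting me to your party last week .".split()
inp, tgt = span_corrupt(tokens)
print("input :", " ".join(inp))
print("target:", " ".join(tgt))
```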


Bryan McCann: Fun behind-the-scenes fact about decaNLP (https://arxiv.org/abs/1806.08730). Had we used the relaxed definition of multitask learning the T5 paper uses (https://arxiv.org/abs/1910.10683 -- different checkpoints for each task), our multitask models would have beaten single-task even back then

1 replies, 19 likes


Daisuke Okanohara: Many NLP tasks can be represented as a uniform text-to-text problem (the task specification is just a prefix of the input), so many techniques and ideas can be compared directly. Combining the findings, they achieved new SOTA on many tasks. https://arxiv.org/abs/1910.10683

0 replies, 16 likes


Amirhossein Tebbifakhr: T5 by Google explores the field of transfer learning in NLP. A very good systematic study of how to pretrain and transfer Transformer models for downstream tasks: https://arxiv.org/pdf/1910.10683.pdf cc @fbk_mt https://t.co/J0QpUOa64U

0 replies, 15 likes


RΓ©mi Louf πŸ‘ΎπŸ›Έβœ¨: You can read the paper here πŸ‘‰ https://arxiv.org/abs/1910.10683

0 replies, 13 likes


ε°ηŒ«ιŠγ‚Šγ‚‡γ†οΌˆγŸγ‹γ«γ‚ƒγ—γƒ»γ‚Šγ‚‡γ†οΌ‰: T5 : Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer https://arxiv.org/abs/1910.10683 https://t.co/1b00tIjxqE

1 replies, 12 likes


tung: Google's T5 (Text-To-Text Transfer Transformer) language model sets a new record and gets very close to human performance on the SuperGLUE benchmark. https://super.gluebenchmark.com/leaderboard Paper: https://arxiv.org/abs/1910.10683 Code: https://github.com/google-research/text-to-text-transfer-transformer https://t.co/8SRJmoiaw6

0 replies, 12 likes


roadrunner01: Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer pdf: https://arxiv.org/pdf/1910.10683.pdf abs: https://arxiv.org/abs/1910.10683 github: https://github.com/google-research/text-to-text-transfer-transformer https://t.co/zbYUbPFYII

0 replies, 10 likes


Sebastian Ruder: @zaidalyafeai @omarsar0 Seconding what @omarsar0 said. I've included what I was aware of in my PhD thesis (https://ruder.io/thesis/neural_transfer_learning_for_nlp.pdf#page=60), which includes multi-task, sequential, cross-lingual and domain transfer. The T5 paper (https://arxiv.org/abs/1910.10683 ) is a good overview of sequential transfer best practices.

0 replies, 9 likes


arXiv CS-CL: Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer http://arxiv.org/abs/1910.10683

0 replies, 9 likes


Jeff Dalton: Great summary of encoder-decoder text-to-text models by @colinraffel and authors. Also, a new dataset based on a cleaned common crawl (C4). Particularly interesting reflection as well as new SOTA on several tasks.

0 replies, 7 likes


MT Group at FBK: Our pick of the week: Raffel et al. paper on "Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer". By @at_amir #nlproc #deeplearning @colinraffel @ada_rob @katherine1ee @sharan0909 @zhouyanqi30 @kongkonglli @peterjliu

0 replies, 5 likes


arXiv CS-CL: Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer http://arxiv.org/abs/1910.10683

0 replies, 5 likes


Douglas Eck: Great work by my colleagues on the Google Research Brain Team.

0 replies, 4 likes


HotComputerScience: Most popular computer science paper of the day: "Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer" https://hotcomputerscience.com/paper/exploring-the-limits-of-transfer-learning-with-a-unified-text-to-text-transformer https://twitter.com/colinraffel/status/1187161460033458177

0 replies, 3 likes


Tim Finin: Google's T5 looks very interesting as does the corpus used to train it, the Colossal Clean Crawled Corpus (C4). Overview post at https://ai.googleblog.com/2020/02/exploring-transfer-learning-with-t5.html, details on arXiv at https://arxiv.org/abs/1910.10683 https://t.co/kozupqIPSi

0 replies, 2 likes


AUEB NLP Group: Next AUEB NLP Group meeting, Tue *May 5*, *17:00-18:30*: "Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer (T5)", Raffel et al. (https://arxiv.org/abs/1910.10683). Study the paper before the meeting. All welcome (but max capacity 100).

1 replies, 2 likes


akira: https://arxiv.org/abs/1910.10683 The authors propose a model named T5 for classification/translation/question answering/summarization, along with a large-scale English dataset, C4. They comprehensively investigate pre-training strategies, the format of the masking, etc. It achieved SOTA on various tasks. https://t.co/vRzo4w9U0S

0 replies, 1 likes


arXiv CS-CL: Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer http://arxiv.org/abs/1910.10683

0 replies, 1 likes


Content

Found on Oct 24 2019 at https://arxiv.org/pdf/1910.10683.pdf

PDF content of a computer science paper: Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer