
Rethinking Pre-training and Self-training

Comments

Quoc Le: We researchers love pre-training. Our new paper shows that pre-training is unhelpful when we have a lot of labeled data. In contrast, self-training works well even when we have a lot of labeled data. SOTA on PASCAL segmentation & COCO detection. Link: http://arxiv.org/abs/2006.06882 https://t.co/XxUth1LJc4

5 replies, 1043 likes
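
For a concrete picture of the self-training being compared here: a teacher is fit on the labeled data, it pseudo-labels an unlabeled pool, and a student is then trained on both. Below is a minimal, generic sketch using scikit-learn's SelfTrainingClassifier on toy data; it only illustrates the pseudo-labeling loop and is not the paper's Noisy Student detection/segmentation pipeline.

```python
# Minimal sketch of the self-training idea (pseudo-labeling), illustrated with
# scikit-learn on toy data. This is NOT the paper's detection/segmentation
# setup; it only shows the teacher -> pseudo-label -> retrain loop.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.semi_supervised import SelfTrainingClassifier

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)

# Pretend 90% of the labels are unknown (-1 marks an unlabeled example).
rng = np.random.RandomState(0)
y_semi = y.copy()
y_semi[rng.rand(len(y)) < 0.9] = -1

# The base classifier is first fit on the labeled 10%, then iteratively
# absorbs confidently pseudo-labeled examples from the unlabeled pool.
model = SelfTrainingClassifier(LogisticRegression(max_iter=1000), threshold=0.8)
model.fit(X, y_semi)

print("accuracy on the held-back labels:", accuracy_score(y, model.predict(X)))
```

In the paper, the teacher is a detection or segmentation model and the pseudo-labels are boxes or masks on additional unlabeled images, but the loop has the same shape.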


Sayak Paul: Here's a list of my favorite recent papers on transfer learning for vision: - BigTransfer: https://arxiv.org/abs/1912.11370 - VirTex: https://arxiv.org/abs/2006.06666 - SimCLRv2: https://arxiv.org/abs/2006.10029 - Self-training: https://arxiv.org/abs/2006.06882 Would love to see a T5-like paper for vision.

2 replies, 217 likes


Barret Zoph: Models and checkpoints are now open sourced for my recent work: "Rethinking Pre-training and Self-training". Paper link: https://arxiv.org/abs/2006.06882 Code Link: https://bit.ly/3j5sVAn. On COCO we achieve 54.3 AP and on Pascal Segmentation 90.5 mIOU!

1 replies, 114 likes


Thang Luong: The success of self-training extends to object detection and semantic segmentation! The key to the SOTA results on PASCAL semantic segmentation is the use of the #NoisyStudent EfficientNet-L2 checkpoints :) https://github.com/google-research/noisystudent

0 replies, 60 likes


Joan Serrà: Insightful paper comparing pre-trained (transfer learning) with self-trained models: https://arxiv.org/abs/2006.06882 TLDR: self-training >> pre-training (including self-supervised pre-training). Encouraging!

0 replies, 32 likes


Mingxing Tan: Excited to see self-training obtain SOTA accuracy on COCO detection and Pascal segmentation. What if you also need efficiency? Try out our updated EfficientDet (53.7 AP, with 55M params and 122ms latency): https://arxiv.org/abs/1911.09070. Enjoy :)

0 replies, 24 likes


Hossein Mobahi: I see rapidly growing success of "self-training" and "self-distillation" type methods recently. There is a lot of opportunity there for theoretical understanding and explanation, with huge practical impact, as these methods are now at the core of some SOTA models.

1 replies, 20 likes
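
For what "self-distillation" typically means here: a student, often the same size as the teacher, is trained to match the teacher's softened predictions. A minimal PyTorch sketch of such a distillation loss (the temperature and scaling are illustrative choices, not values from any particular paper):

```python
# Minimal sketch of a soft-label distillation loss of the kind used in
# self-distillation / Noisy Student style training. Temperature is illustrative.
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL divergence between softened teacher and student distributions."""
    t = temperature
    log_p_student = F.log_softmax(student_logits / t, dim=-1)
    p_teacher = F.softmax(teacher_logits / t, dim=-1)
    # Scale by t^2 so gradients are comparable to a hard-label loss at t=1.
    return F.kl_div(log_p_student, p_teacher, reduction="batchmean") * (t * t)

# Toy check with random logits for a batch of 4 examples, 10 classes.
student = torch.randn(4, 10)
teacher = torch.randn(4, 10)
print(distillation_loss(student, teacher).item())
```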


午後のarXiv: "Rethinking Pre-training and Self-training", Barret Zoph, Golnaz Ghiasi, Tsung-Yi Lin, Yin Cui, Hanxiao Liu, Ekin D… https://arxiv.org/abs/2006.06882

0 replies, 13 likes


Leo Boytsov: If self-supervised and supervised pre-training both have somewhat limited value in CV, is there hope for NLP? Do large self-supervised Transformers work because most NLP tasks are low-data-regime tasks (and NLP might need more data compared to vision)?

1 replies, 11 likes


arXiv CS-CV: Rethinking Pre-training and Self-training http://arxiv.org/abs/2006.06882

0 replies, 11 likes


Daisuke Okanohara: Pre-training cannot improve (and can even hurt) performance when stronger data augmentation and large amounts of labeled data are available. On the other hand, self-training is always helpful, in both low-data and large-data regimes, even with stronger data augmentation. https://arxiv.org/abs/2006.06882

0 replies, 10 likes
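
The "stronger data augmentation" referred to above is in the AutoAugment/RandAugment family combined with aggressive random cropping. A rough torchvision sketch of such a pipeline (parameter values are illustrative, not the paper's exact augmentation schedule):

```python
# Rough sketch of a "strong" augmentation pipeline; the crop scale, number of
# ops, and magnitude here are illustrative, not the paper's settings.
from PIL import Image
from torchvision import transforms

strong_aug = transforms.Compose([
    transforms.RandomResizedCrop(224, scale=(0.1, 1.0)),  # aggressive random crops
    transforms.RandomHorizontalFlip(),
    transforms.RandAugment(num_ops=2, magnitude=9),        # random color/geometry ops
    transforms.ToTensor(),
])

# Toy usage on a dummy gray image.
img = Image.new("RGB", (256, 256), color=(127, 127, 127))
print(strong_aug(img).shape)  # torch.Size([3, 224, 224])
```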


arXiv in review: #NeurIPS2020 Rethinking Pre-training and Self-training. (arXiv:2006.06882v1 [cs.CV]) http://arxiv.org/abs/2006.06882

0 replies, 3 likes


Connor Shorten: Rethinking Pre-training and Self-training 📚 "Our results suggest that both supervised and self-supervised pre-training methods fail to scale as the labeled dataset size grows, while self-training is still useful." https://arxiv.org/pdf/2006.06882.pdf

1 replies, 1 likes


akira: https://arxiv.org/abs/2006.06882 When object detection is trained on all of the labeled data, or when strong data augmentation is used, ImageNet pre-training can actually degrade the model. But they found that with self-training (Noisy Student), accuracy improves even in those cases.

1 replies, 1 likes


Content

Found on Jun 15 2020 at https://arxiv.org/pdf/2006.06882.pdf

PDF content of a computer science paper: Rethinking Pre-training and Self-training