
Self-training with Noisy Student improves ImageNet classification


Nov 12 2019 Quoc Le

Want to improve accuracy and robustness of your model? Use unlabeled data! Our new work uses self-training on unlabeled data to achieve 87.4% top-1 on ImageNet, 1% better than SOTA. Huge gains are seen on harder benchmarks (ImageNet-A, C and P). Link:
22 replies, 1540 likes

Nov 12 2019 Jeff Dean

Nice new results from @GoogleAI researchers on improving the state-of-the-art on ImageNet! "We...train a...model on...ImageNet...& use it as a teacher to generate pseudo labels on 300M unlabeled images. We then train a larger...model on the...labeled & pseudo labeled images."
6 replies, 537 likes

Nov 12 2019 hiroto

"Self-training with Noisy Student improves ImageNet classification" achieves 87.4% top-1 accuracy. 1 Train a model on ImageNet 2 Generate pseudo labels on unlabeled extra dataset 3 Train a student model using all the data and make it a new teacher ->2
4 replies, 330 likes

Nov 12 2019 Ilya Sutskever

Amazing unsupervised learning results:
3 replies, 203 likes

Nov 12 2019 Thang Luong

Another view of Noisy Student: semi-supervised learning is great even when labeled data is plentiful! 130M unlabeled images yield a 1% gain over the previous ImageNet SOTA that uses 3.5B weakly labeled examples! Joint work w/ @QizheXie, Ed Hovy, @quocleix
0 replies, 83 likes

Nov 19 2019 Eric Jang 🇺🇸🇹🇼

Self-training with Noisy Student: A semi-supervised approach by Google/CMU that outperforms Facebook's "weakly labeled 3.5B Instagram" method on ImageNet.
1 replies, 73 likes

Nov 13 2019 Bindu Reddy 🔥❤️

You can train more accurate models by combining unlabelled data with labelled data. Google's latest paper uses a clever trick to take advantage of loads of unlabelled data that most organizations have. One more step in truly democratizing AI -
1 replies, 30 likes

Nov 12 2019 Daniel Situnayake

This seems like an intriguing approach when you have a ton of unlabelled data: 1) Train a classifier on a labeled set of data 2) Use it to pseudo-label a much larger unlabelled dataset 3) Train a larger classifier on the combined sets 4) Iterate the process, adding noise
3 replies, 29 likes
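
The four steps listed in the post above can be sketched in a few lines of plain Python. This is a toy illustration under loud assumptions, not the paper's implementation: the model is a nearest-centroid classifier instead of an EfficientNet, the "noise" is Gaussian input jitter standing in for whatever noise the student actually receives, and the student is the same size as the teacher rather than larger. All function names (`fit_centroids`, `predict`, `jitter`, `noisy_student`, `make_blobs`) are hypothetical helpers invented for this sketch.

```python
import random

def fit_centroids(points, labels):
    """Train a nearest-centroid classifier: one mean vector per class."""
    sums, counts = {}, {}
    for p, y in zip(points, labels):
        acc = sums.setdefault(y, [0.0] * len(p))
        for i, v in enumerate(p):
            acc[i] += v
        counts[y] = counts.get(y, 0) + 1
    return {y: [s / counts[y] for s in acc] for y, acc in sums.items()}

def predict(model, p):
    """Return the class whose centroid is closest to point p."""
    return min(model, key=lambda y: sum((a - b) ** 2 for a, b in zip(p, model[y])))

def jitter(points, std, rng):
    """Gaussian input noise (an assumed stand-in for the student's noise)."""
    return [[v + rng.gauss(0.0, std) for v in p] for p in points]

def noisy_student(x_lab, y_lab, x_unlab, rounds=3, noise_std=0.3, seed=0):
    rng = random.Random(seed)
    teacher = fit_centroids(x_lab, y_lab)                # 1) train on labeled data
    for _ in range(rounds):
        pseudo = [predict(teacher, p) for p in x_unlab]  # 2) pseudo-label clean inputs
        x_all = x_lab + x_unlab
        y_all = y_lab + pseudo
        # 3) train the student on labeled + pseudo-labeled data, with noise
        student = fit_centroids(jitter(x_all, noise_std, rng), y_all)
        teacher = student                                # 4) student becomes the teacher
    return teacher

def make_blobs(n, rng):
    """Toy data: two Gaussian blobs centred at (-2, 0) and (2, 0)."""
    xs, ys = [], []
    for _ in range(n):
        y = rng.randrange(2)
        cx = -2.0 if y == 0 else 2.0
        xs.append([rng.gauss(cx, 1.0), rng.gauss(0.0, 1.0)])
        ys.append(y)
    return xs, ys

rng = random.Random(42)
x_lab, y_lab = make_blobs(20, rng)     # small labeled set
x_unlab, _ = make_blobs(500, rng)      # larger unlabeled set (labels discarded)
x_test, y_test = make_blobs(500, rng)
model = noisy_student(x_lab, y_lab, x_unlab)
accuracy = sum(predict(model, p) == y for p, y in zip(x_test, y_test)) / len(y_test)
print(f"test accuracy: {accuracy:.3f}")
```

Note that the teacher labels the unlabeled points without any noise, while the student trains on jittered inputs; that asymmetry is the part of the recipe the posts here single out as the key addition to ordinary self-training.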

Nov 13 2019 Stanisław Jastrzębski

So do deep networks 'interpolate' or do they 'extrapolate'? :) For context see or @GaryMarcus critique of deep learning; I think most people would classify ImageNet-A as 'extrapolation', but it is also unclear how much the unlabeled dataset overlaps with ImageNet-A
0 replies, 15 likes

Nov 13 2019 Rajat Monga

Love the simplicity.
0 replies, 11 likes

Nov 12 2019 Andrey Kurenkov 🤖

wow neat trick. So simple, so effective! Kind of surprising this works so well, you'd think semi-supervised learning without injecting noisy labels would work better... seems unsupervised learning is just tough compared to supervised? Looking forward to theory :)
0 replies, 10 likes

Nov 12 2019 mat kelcey

the adding noise result is a great idea but the most surprising thing about this result is the responses from people who didn't know self training was a thing!
2 replies, 7 likes

Nov 13 2019 Daisuke Okanohara

Self-training (training a student using an unlabeled dataset with labels estimated by a teacher) benefits from using a larger model for the student and injecting noise during student training. Achieved new SOTA on ImageNet and challenging ImageNet-A (17%->74%)
0 replies, 6 likes

Nov 13 2019 Aakash Kumar Nain

Another really good paper from @quocleix
2 replies, 5 likes

Nov 12 2019 Moez Baccouche

Very interesting work by Google Brain on « Self-training » : 1. Train a model on ImageNet 2. Infer labels on unlabeled dataset 3. Train a student model using all the data and make it a new teacher 4. Go to 2. This leads to new SOTA on ImageNet with 87.4% top-1 accuracy.
0 replies, 4 likes

Nov 14 2019 George Seif

Very cool idea to get state of the art on ImageNet by @GoogleAI #DeepLearning
0 replies, 3 likes

Nov 12 2019 David Luan

Amazing progress using clever ideas that are also simple to explain.
0 replies, 3 likes

Nov 13 2019 eSteve almirall

Image recognition with Deep Learning is improving and solving fundamental problems of labeled data with self-training!!! Kudos to @GoogleAI @XavierFerras @oalcoba @ganyet @ProfVives @albertcuesta
0 replies, 3 likes

Nov 13 2019 Shital Shah

This shall go down as one of the great abstracts. Did they just say they improved SOTA on adversarial ImageNet from 16.6% to 74.2%, daug? You bet they did!
0 replies, 2 likes

Nov 13 2019 Andrew Lavin

Self-training leads EfficientNet to a new state-of-the-art in ImageNet classification accuracy. But the exciting result is really the vast improvement to classification robustness.
0 replies, 2 likes

Nov 12 2019 Somshubra Majumdar

A semi-simple method that I will probably try soon.
1 replies, 2 likes

Nov 14 2019 Tobias Sterbak

Pseudo labeling with noise is such an elegant (and effective) idea! Great work by Quoc V. Le and team! #deeplearning #neuralnetworks #computervision
0 replies, 1 likes

Nov 13 2019 Christian Szegedy

A cool semi-supervised training trick.
0 replies, 1 likes

Nov 19 2019 akira

Create a more accurate model by repeating the process: “Adding noise to the pseudo-labeled data, which is created with a model that learned ImageNet, then distilling with a larger model using this data and the labeled data”. Robustness is also improved.
0 replies, 1 likes

Nov 14 2019 Piotr Czapla

Brilliant idea for making repeated teacher-student learning work even when both are the same architecture. It seems generic enough to work for text; can’t wait to give it a try on multifit zeroshot.
0 replies, 1 likes

Nov 12 2019 Brundage Bot

Self-training with Noisy Student improves ImageNet classification. Qizhe Xie, Eduard Hovy, Minh-Thang Luong, and Quoc V. Le
1 replies, 0 likes