
Unsupervised Data Augmentation


Quoc Le: Data augmentation is often associated with supervised learning. We find *unsupervised* data augmentation works better. It combines well with transfer learning (e.g. BERT) and improves everything when datasets have a small number of labeled examples. Link:

3 replies, 668 likes

Thang Luong: Introducing UDA, our new work on "Unsupervised data augmentation" for semi-supervised learning (SSL) with Qizhe Xie, Zihang Dai, Eduard Hovy, & @quocleix. SOTA results on IMDB (with just 20 labeled examples!), SSL Cifar10 & SVHN (30% error reduction)!

3 replies, 444 likes

Thang Luong: Nice recent tutorial on semi-supervised learning, covering our recent works on UDA and #NoisyStudent. It also highlights VAT, Pi-Model, MeanTeacher, and MixMatch. Slides:

1 replies, 134 likes

Thang Luong: Nice additional gains achieved by MPL (Meta Pseudo Labels) on top of UDA (Unsupervised Data Augmentation) on low-data regimes!

0 replies, 98 likes

Quoc Le: Links to the mentioned papers. MixMatch: Unsupervised Data Augmentation:

1 replies, 82 likes

Quoc Le: To add to Vincent's point above, new findings also include: 1. The method is general (works well for images & texts). 2. The method works well on top of transfer learning (e.g., BERT). You can find these results in Unsupervised Data Augmentation paper:

0 replies, 67 likes

Quoc Le: This work continues our efforts on semi-supervised learning such as UDA: MixMatch: FixMatch: Noisy Student: etc. Joint work with @hieupham789 @QizheXie @ZihangDai

1 replies, 64 likes

Mihail Eric: Yum! Unsupervised data augmentation that works from @GoogleAI @QizheXie @quocleix. New state-of-the-art on various language and vision tasks:

0 replies, 54 likes

Thang Luong: Our UDA work proposes the use of strong augmentation (RandAugment) which subsequent works (FixMatch, NoisyStudent) follow. UDA uses soft pseudo-labels whereas FixMatch uses hard ones after "weak" augmentation in consistency training.

1 replies, 38 likes
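The soft-vs-hard pseudo-label distinction above can be sketched as two consistency losses. This is a minimal NumPy illustration of the general idea, not the papers' implementations; the 0.95 confidence threshold is the value FixMatch popularized, used here only as an example default.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def uda_consistency_loss(logits_orig, logits_aug):
    """UDA-style: KL divergence between the full (soft) predicted
    distribution on the original input and on its strong augmentation."""
    p = softmax(logits_orig)  # soft pseudo-label (no argmax)
    q = softmax(logits_aug)
    return float((p * (np.log(p) - np.log(q))).sum(axis=-1).mean())

def fixmatch_consistency_loss(logits_weak, logits_strong, threshold=0.95):
    """FixMatch-style: take the argmax (hard) pseudo-label from the weakly
    augmented view, keep only confident examples, apply cross-entropy on
    the strongly augmented view."""
    p = softmax(logits_weak)
    q = softmax(logits_strong)
    labels = p.argmax(axis=-1)
    mask = p.max(axis=-1) >= threshold
    if not mask.any():
        return 0.0
    ce = -np.log(q[np.arange(len(labels)), labels])
    return float((ce * mask).sum() / mask.sum())
```

With identical predictions on both views the UDA loss is zero, while the FixMatch loss is the (small) cross-entropy of the confident hard label.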

Sayak Paul: Unsupervised Data Augmentation for Consistency Training, also known as UDA, is such an important paper in the area of semi-supervised learning. It systematically studies how stronger data augmentation ops benefit a model in learning good representations.

0 replies, 30 likes

Arjun (Raj) Manrai: Wow: "on IMDb, UDA with 20 labeled examples outperforms the state-of-the-art model trained on 1250x more labeled data"

0 replies, 8 likes

Thang Luong: These plots (also included in the updated version of our UDA paper with a lot more results & details) illustrate very well Vincent's article on the quiet revolution of semi-supervised learning!

2 replies, 8 likes

arXiv CS-CV: Unsupervised Data Augmentation for Consistency Training

0 replies, 7 likes

Daisuke Okanohara: In semi-supervised learning, VAT adds adversarial noise to unsupervised data and makes its prediction distribution match the original distribution. UDA instead applies data augmentation methods and gradually increases the training signal from the supervised data.

0 replies, 5 likes
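The "gradually increases the training signal" part refers to Training Signal Annealing (TSA) in the UDA paper: a labeled example is dropped from the supervised loss once the model's probability on its ground-truth class exceeds a threshold η(t) that rises over training. A small sketch of the three schedules described in the paper, for K classes:

```python
import math

def tsa_threshold(step, total_steps, num_classes, schedule="linear"):
    """Training Signal Annealing threshold eta(t). Labeled examples whose
    ground-truth probability already exceeds eta(t) are masked out of the
    supervised loss, so the supervised signal is released gradually."""
    t = step / total_steps
    if schedule == "linear":
        alpha = t
    elif schedule == "log":
        alpha = 1 - math.exp(-5 * t)   # releases signal fast early on
    elif schedule == "exp":
        alpha = math.exp(5 * (t - 1))  # holds signal back until late
    else:
        raise ValueError(f"unknown schedule: {schedule}")
    # map alpha in [0, 1] onto [1/K, 1]
    return alpha * (1 - 1 / num_classes) + 1 / num_classes
```

At step 0 the threshold equals chance level 1/K (almost all labeled examples are masked), and it reaches 1 at the end of training, when every labeled example contributes.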

arXiv CS-CL: Unsupervised Data Augmentation for Consistency Training

0 replies, 4 likes

arXiv in review: #ICLR2020 Unsupervised Data Augmentation for Consistency Training. (arXiv:1904.12848v5 [cs.LG] UPDATED)

0 replies, 4 likes

Quoc Le: @ivan_bezdomny In NLP, there is backtranslation method that works quite well as a data augmentation method. You can check out its use in UDA: Link to code:

1 replies, 3 likes
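Back-translation, mentioned above as UDA's text augmentation, paraphrases a sentence by translating it to a pivot language and back. A toy sketch of the idea; the `translate` function and its tiny lookup table are stand-ins for a real machine-translation model, which UDA uses in practice:

```python
# Toy translation table standing in for a neural MT system (hypothetical data).
TOY_TABLE = {
    ("en", "fr"): {"the movie was great": "le film était génial"},
    ("fr", "en"): {"le film était génial": "the film was wonderful"},
}

def translate(text, src, tgt):
    """Stand-in for an MT model; unknown inputs pass through unchanged."""
    return TOY_TABLE[(src, tgt)].get(text, text)

def backtranslate(text, pivot="fr"):
    """Augment text by a round trip through a pivot language, yielding a
    paraphrase with (ideally) the same sentiment/label."""
    return translate(translate(text, "en", pivot), pivot, "en")
```

The round trip produces a differently worded sentence with the same meaning, which is exactly what consistency training wants: a label-preserving perturbation of the input.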

BioDecoded: Advancing Semi-supervised Learning with Unsupervised Data Augmentation | Google AI Blog #MachineLearning

0 replies, 3 likes

AK: #UDA, or unsupervised data augmentation: a new technique from @google to generate synthetic data for #neuralnetworks #AI #machinelearning

0 replies, 1 likes

Saleh Elmohamed: Really nice work by Quoc Le and colleagues at Google & CMU on unsupervised data augmentation. Highly recommend checking out their latest paper at the arXiv.

0 replies, 1 likes

Cherrypick: Trying to use UNSUPERVISED DATA AUGMENTATION (UDA) for stock market sentiment analysis (PyTorch and BERT)

1 replies, 0 likes


Found on Apr 30 2019

PDF content of a computer science paper: Unsupervised Data Augmentation