Papers of the day

Self-training and Pre-training are Complementary for Speech Recognition

Comments

Yann LeCun: Awesome work from FAIR on self-supervised pre-training for speech recognition. 10 minutes of labeled training data yields the same accuracy as the best system from last year trained on 960 hours of labeled data.

10 replies, 653 likes


Michael Auli: Great progress in speech recognition: wav2vec 2.0 pre-training + self-training with just 10 minutes of labeled data rivals the best published systems trained on 960 hours of labeled data from just a year ago. Paper: https://arxiv.org/abs/2010.11430 Models: https://github.com/pytorch/fairseq/tree/master/examples/wav2vec https://t.co/AjEWdna6J1

7 replies, 573 likes
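For readers following the models link in the tweet above, here is a minimal sketch of how one might run inference with a wav2vec 2.0 model fine-tuned on 10 minutes of labeled data. It uses torchaudio's published bundle (WAV2VEC2_ASR_BASE_10M) and a naive greedy CTC decode; the bundle choice, the input file name, and the decoding scheme are illustrative assumptions, not the exact self-training recipe from the paper.

```python
# Minimal sketch: transcribe an audio file with a wav2vec 2.0 model
# fine-tuned on only 10 minutes of labeled speech.
# Assumes torchaudio >= 0.10; uses torchaudio's published bundle,
# which may differ from the exact checkpoints in the fairseq repo.
import torch
import torchaudio

bundle = torchaudio.pipelines.WAV2VEC2_ASR_BASE_10M
model = bundle.get_model().eval()
labels = bundle.get_labels()  # CTC label set; index 0 is the blank token

waveform, sample_rate = torchaudio.load("sample.wav")  # hypothetical input file
if sample_rate != bundle.sample_rate:
    waveform = torchaudio.functional.resample(waveform, sample_rate, bundle.sample_rate)

with torch.inference_mode():
    emissions, _ = model(waveform)  # (batch, frames, num_labels) frame-wise logits

# Greedy CTC decode: best label per frame, collapse repeats, drop blanks.
indices = emissions[0].argmax(dim=-1).tolist()
tokens, prev = [], None
for i in indices:
    if i != prev and i != 0:
        tokens.append(labels[i])
    prev = i
print("".join(tokens).replace("|", " "))  # '|' marks word boundaries
```

A beam-search decoder with a language model would give lower error rates than this greedy pass; the sketch only shows the shape of the API.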


MIT CSAIL: A major leap in speech recognition: a system trained on just 10 minutes of labeled data has been shown to rival existing systems trained on 960 *hours* of labeled data. Paper: http://arxiv.org/abs/2010.11430 Code: http://github.com/pytorch/fairse… (v/@facebookai) #ML https://t.co/XFG7qZ1wFm

1 reply, 299 likes


MIT CSAIL: A major leap in speech recognition: a system trained on just 10 minutes of labeled data performs at similar levels as existing systems trained on 960 *hours* of labeled data. Paper: https://arxiv.org/abs/2010.11430 Code: https://github.com/pytorch/fairseq/tree/master/examples/wav2vec (v/@facebookai) #ML

0 replies, 99 likes


Mike Schroepfer: 5,760x reduction in the training data needed to achieve results similar to state-of-the-art systems from just 1 year ago. Or 960 hours down to 10 minutes. Or train on the larger dataset for better results. AI is often overhyped, but this is a truly impressive result.

0 replies, 30 likes
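As a quick sanity check on the 5,760x figure quoted above, converting the 960 hours of labels to minutes and dividing by the 10-minute budget gives:

```latex
\[
\frac{960\ \text{h}}{10\ \text{min}}
  = \frac{960 \times 60\ \text{min}}{10\ \text{min}}
  = \frac{57{,}600}{10}
  = 5{,}760
\]
```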


Leo Boytsov: It's a WWW (wild-wild-wild) world! Google and Facebook report nearly simultaneously on success in using self-supervision and self-training in speech recognition. Ouch. 1. https://arxiv.org/abs/2010.10504 2. https://arxiv.org/abs/2010.11430

0 replies, 16 likes


Harold Sinnott 📲 😷: A major leap in speech recognition: a system trained on just 10 minutes of labeled data has been shown to rival existing systems trained on 960 *hours* of labeled data. Paper: http://arxiv.org/abs/2010.11430 Code: http://github.com/pytorch/fairse… v/ @facebookai @MIT_CSAIL #ML #AI https://t.co/K6HcHBaqe1

0 replies, 6 likes


Thomas Reutterer: 10 mins of labeled audio data are enough for "learning" a language? Impressive work...

0 replies, 6 likes


Alexis Conneau: New paper: "Self-training and Pre-training are Complementary for Speech Recognition" TL;DR: Combination of self-training and unsupervised pre-training is very powerful for ASR; leads to 5.2% WER on LS with only 10-min of labeled data! By @MichaelAuli @ZloiAlexei @syhw et al.

1 reply, 5 likes


Pranav Rajpurkar: Cool!

0 replies, 5 likes


Anand Kumar H: GOOD info!

0 replies, 1 like


Popular ML resources: The most popular ArXiv tweet in the last 24h: https://twitter.com/MichaelAuli/status/1320755019432427520

0 replies, 1 like


Content

Found on Oct 26 2020 at https://arxiv.org/pdf/2010.11430.pdf

PDF content of a computer science paper: Self-training and Pre-training are Complementary for Speech Recognition