wav2vec 2.0: A Framework for Self-Supervised Learning of Speech Representations

Comments

Yann LeCun: Self-Supervised Learning making strides in speech recognition. Wav2Vec 2.0 from FAIR uses a kind of contrastive SSL for pre-training. This is the first time an SSL system reaches the very best results on a number of different ASR tasks. https://arxiv.org/abs/2006.11477 1/N

15 replies, 733 likes


Joan Serrà: Well... It has been making strides for quite some time now. (I don't usually do this but, TBH, I think it is a bit shameful that they forget to cite our work PASE https://www.isca-speech.org/archive/Interspeech_2019/abstracts/2605.html w/ @santty128, @mirco_ravanelli, Bonafonte, & Bengio)

1 reply, 46 likes


arXiv CS-CL: wav2vec 2.0: A Framework for Self-Supervised Learning of Speech Representations http://arxiv.org/abs/2006.11477

0 replies, 34 likes


Alexis Conneau: Self-supervised pretraining for speech enables automatic speech recognition with only *10 minutes* 🕐 of transcribed speech data. Works very well cross-lingually too. Wav2vec 2.0: https://arxiv.org/abs/2006.11477 XLSR: https://arxiv.org/abs/2006.13979 By @ZloiAlexei and colleagues from Facebook

0 replies, 32 likes


Eugene: Insanely good results on self-supervised learning of speech representations: 53k hours of unlabeled data + 10min of labelled data = 5.7/10.1 WER noisy/clean test of Librispeech. Baevski et al, wav2vec 2.0 https://arxiv.org/abs/2006.11477

1 reply, 10 likes


Jonathan Lim: Automatic speech recognition just got better

0 replies, 2 likes


Alexis Conneau: We build on wav2vec 2.0 (https://arxiv.org/abs/2006.11477), a self-supervised model which is trained by solving a contrastive task over masked latent speech representations. In this paper, we jointly learn a quantization of the latents shared across languages and call our approach XLSR.

2 replies, 1 like
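
Editor's note on the "contrastive task over masked latent speech representations" mentioned in the comment above: a minimal sketch of what such a loss can look like for a single masked time step, assuming cosine similarity and a temperature-scaled softmax over one true target and a few distractors. The names `cosine`, `contrastive_loss`, and the default `temperature` are illustrative assumptions, not the authors' code, and the sketch omits the masking, the quantization, and any auxiliary loss terms.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def contrastive_loss(context, target, distractors, temperature=0.1):
    """Contrastive (InfoNCE-style) loss for one masked position.

    context:     context vector at the masked position (hypothetical input)
    target:      the true latent for that position
    distractors: latents taken from other positions
    temperature: softmax temperature (assumed default for illustration)
    """
    candidates = [target] + list(distractors)
    logits = [cosine(context, q) / temperature for q in candidates]
    # Log-sum-exp with max subtraction for numerical stability.
    top = max(logits)
    log_denominator = top + math.log(sum(math.exp(l - top) for l in logits))
    # Negative log-probability of picking the true target (index 0).
    return log_denominator - logits[0]


# The loss is small when the context points at the target
# and large when it points at a distractor instead.
aligned = contrastive_loss([1.0, 0.0], [1.0, 0.0], [[0.0, 1.0], [-1.0, 0.0]])
misaligned = contrastive_loss([1.0, 0.0], [0.0, 1.0], [[1.0, 0.0], [-1.0, 0.0]])
```

Minimizing this quantity pushes the context vector toward the true latent and away from the distractors, which is the general idea behind a contrastive objective.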


Content

Found on Jun 24 2020 at https://arxiv.org/pdf/2006.11477.pdf

PDF content of a computer science paper: wav2vec 2.0: A Framework for Self-Supervised Learning of Speech Representations