Common Voice: A Massively-Multilingual Speech Corpus

Comments

Dec 17 2019 Josh Meyer

It's finally out! (submitted to LREC) Common Voice: A Massively-Multilingual Speech Corpus https://arxiv.org/abs/1912.06670
11 replies, 508 likes


Dec 17 2019 Jeremy Howard

2500 hours of speech! "Character Error Rate improvement of 5.99 +/- 5.48 for twelve target languages (German, French, Italian, Turkish, Catalan, Slovenian, Welsh, Irish, Breton, Tatar, Chuvash, and Kabyle). For most of these languages, these are the first ever published results"
1 reply, 143 likes


Dec 18 2019 jenny (phire) zhang

*whispers* hey it’s the thing I work on now
2 replies, 35 likes


Dec 17 2019 Robert (Munro) Monarch

38 languages & 50k speakers in the latest Common Voice release: https://arxiv.org/abs/1912.06670 Congrats @rosanardila @KellyJayDavis @mikehenrty @KohlerSolutions @_josh_meyer_ @ezesanlasai @gr__or & co! Language support is AI's biggest bias & most languages are not written
1 reply, 26 likes


Dec 18 2019 Peter Skomoroch

New dataset: Common Voice Corpus - Over 50,000 individuals & 2,500 hours of collected audio, largest audio corpus in the public domain for speech recognition by number of hours and languages
1 reply, 15 likes


Dec 18 2019 Rosana Ardila

A paper on Common Voice is out! Mozilla alongside a huge community is building a massive multilingual speech corpus to make speech recognition available for all languages. Proud to work with such an amazing team!
0 replies, 9 likes


Dec 17 2019 e-Katerina Vylomova

Common Voice (A Massively-Multilingual Speech Corpus): 💫38 languages (Nov, 2019) 💫 over 50,000 individuals who have participated 💫2,500 hours of collected audio https://t.co/VqSzJPp3eC
0 replies, 8 likes


Dec 20 2019 Armando Kirwin

We’re fortunate to have @_josh_meyer_ on our team at Artie. 😎
0 replies, 5 likes


Dec 18 2019 Michael Kohler

Woohoo, first paper published. Thanks to @_josh_meyer_ for starting this!
1 reply, 4 likes


Dec 18 2019 George Roter

Congrats to all my great colleagues (staff and contributors) @mozilla who I had the honour to work alongside on this project over the past 2.5 years!! It's great to see their names in lights :)
0 replies, 4 likes


Dec 17 2019 Michael Henretty

Yaaaaay, first time I’m an author for an @arxiv paper (always wanted to be one, but didn’t think I was smart enough). So honored to have worked on this project with these folks
0 replies, 2 likes


Dec 18 2019 IoT Watcher

Josh Meyer on Twitter: "It's finally out! (submitted to LREC) Common Voice: A Massively-Multilingual Speech Corpus https://arxiv.org/abs/1912.06670" https://twitter.com/_josh_meyer_/status/1207018840661454848, see more http://tweetedtimes.com/v/2412?s=tnp
0 replies, 1 like


Dec 17 2019 Badr Abdullah 🐪 🇾🇪

Massive effort 👏
0 replies, 1 like


Dec 18 2019 Łukasz Augustyniak

Great, @Adrian_Szymczak @niedakh @PiotrZelasko check this out
0 replies, 1 like


Dec 17 2019 Djamé

That's really great !!!
0 replies, 1 like


Dec 18 2019 Odette Scharenborg

Amazing! So totally cool! Thanks a lot to all the people involved! #speech #data #corpora @SFeng9 please check this out!
0 replies, 1 like


Dec 18 2019 AgTuíteáil

This is so fucking cool.
0 replies, 1 like


Dec 18 2019 Ramon Sanabria

great work friend! @_josh_meyer_
0 replies, 1 like


Dec 18 2019 amyfou🐕🐕🐕

!!!!
0 replies, 1 like


Dec 17 2019 Oleksii Kuchaiev

Thank you! We love this dataset and already pre-trained and released some models with it!
0 replies, 1 like


Dec 18 2019 C. Scott Ananian

Fantastic! Not just for the data, but for the crowd-sourcing infrastructure that enables it. Hopefully we can figure out how to plug this together with @Wikimedia language work.
0 replies, 1 like


Dec 20 2019 HotComputerScience

Most popular computer science paper of the day: "Common Voice: A Massively-Multilingual Speech Corpus" https://hotcomputerscience.com/paper/common-voice-a-massively-multilingual-speech-corpus https://twitter.com/_josh_meyer_/status/1207018840661454848
0 replies, 1 like


Dec 18 2019 Epsilon Guanlin Lee

data #resource, great work and contribution
0 replies, 1 like


Dec 18 2019 Peter Friot

Josh Meyer on Twitter: "It's finally out! (submitted to LREC) Common Voice: A Massively-Multilingual Speech Corpus https://arxiv.org/abs/1912.06670" https://twitter.com/_josh_meyer_/status/1207018840661454848, see more http://tweetedtimes.com/v/2412?s=tnp
0 replies, 1 like


Dec 18 2019 Finn Årup Nielsen

Interesting, but no Danish :(
2 replies, 0 likes

