Common Voice: A Massively-Multilingual Speech Corpus


Josh Meyer: It's finally out! (submitted to LREC) Common Voice: A Massively-Multilingual Speech Corpus

12 replies, 523 likes

Jeremy Howard: 2500 hours of speech! "Character Error Rate improvement of 5.99 +/- 5.48 for twelve target languages (German, French, Italian, Turkish, Catalan, Slovenian, Welsh, Irish, Breton, Tatar, Chuvash, and Kabyle). For most of these languages, these are the first ever published results"

1 reply, 143 likes

Josh Meyer: Accepted to LREC!

3 replies, 44 likes

jenny (phire) zhang: *whispers* hey it’s the thing I work on now

2 replies, 35 likes

Robert (Munro) Monarch: 38 languages & 50k speakers in the latest Common Voice release: Congrats @rosanardila @KellyJayDavis @mikehenrty @KohlerSolutions @_josh_meyer_ @ezesanlasai @gr__or & co! Language support is AI's biggest bias & most languages are not written

1 reply, 26 likes

Peter Skomoroch: New dataset: Common Voice Corpus - Over 50,000 individuals & 2,500 hours of collected audio, largest audio corpus in the public domain for speech recognition by number of hours and languages

1 reply, 15 likes

Rosana Ardila: A paper on Common Voice is out! Mozilla alongside a huge community is building a massive multilingual speech corpus to make speech recognition available for all languages. Proud to work with such an amazing team!

0 replies, 9 likes

e-Katerina Vylomova: Common Voice (A Massively-Multilingual Speech Corpus): 💫38 languages (Nov, 2019) 💫 over 50,000 individuals who have participated 💫2,500 hours of collected audio

0 replies, 8 likes

Armando Kirwin: We’re fortunate to have @_josh_meyer_ on our team at Artie. 😎

0 replies, 5 likes

Michael Kohler: Woohoo, first paper published. Thanks to @_josh_meyer_ for starting this!

1 reply, 4 likes

George Roter: Congrats to all my great colleagues (staff and contributors) @mozilla who I had the honour to work alongside on this project over the past 2.5 years!! It's great to see their names in lights :)

0 replies, 4 likes

Michael Henretty: Yaaaaay, first time I’m an author for an @arxiv paper (always wanted to be one, but didn’t think I was smart enough). So honored to have worked on this project with these folks

0 replies, 2 likes

IoT Watcher: Josh Meyer on Twitter: "It's finally out! (submitted to LREC) Common Voice: A Massively-Multilingual Speech Corpus"

0 replies, 1 like

Odette Scharenborg: Amazing! So totally cool! Thanks a lot to all the people involved! #speech #data #corpora @SFeng9 please check this out!

0 replies, 1 like

HotComputerScience: Most popular computer science paper of the day: "Common Voice: A Massively-Multilingual Speech Corpus"

0 replies, 1 like

Badr Abdullah 🐪 🇾🇪: Massive effort 👏

0 replies, 1 like

Oleksii Kuchaiev: Thank you! We love this dataset and already pre-trained and released some models with it!

0 replies, 1 like

Peter Friot: Josh Meyer on Twitter: "It's finally out! (submitted to LREC) Common Voice: A Massively-Multilingual Speech Corpus"

0 replies, 1 like

Łukasz Augustyniak: Great, @Adrian_Szymczak @niedakh @PiotrZelasko check this out

0 replies, 1 like

Djamé: That's really great !!!

0 replies, 1 like

Ramon Sanabria: great work friend! @_josh_meyer_

0 replies, 1 like

AgTuíteáil: This is so fucking cool.

0 replies, 1 like

C. Scott Ananian: Fantastic! Not just for the data, but for the crowd-sourcing infrastructure that enables it. Hopefully we can figure out how to plug this together with @Wikimedia language work.

0 replies, 1 like

amyfou🐕🐕🐕: !!!!

0 replies, 1 like

Epsilon Guanlin Lee: data #resource, great work and contribution

0 replies, 1 like

Finn Årup Nielsen: Interesting, but no Danish :(

2 replies, 0 likes


Found on Dec 17 2019 at

PDF content of a computer science paper: Common Voice: A Massively-Multilingual Speech Corpus