Common Voice: A Massively-Multilingual Speech Corpus

Comments

Josh Meyer: It's finally out! (submitted to LREC) Common Voice: A Massively-Multilingual Speech Corpus https://arxiv.org/abs/1912.06670

12 replies, 523 likes


Jeremy Howard: 2500 hours of speech! "Character Error Rate improvement of 5.99 +/- 5.48 for twelve target languages (German, French, Italian, Turkish, Catalan, Slovenian, Welsh, Irish, Breton, Tatar, Chuvash, and Kabyle). For most of these languages, these are the first ever published results"

1 reply, 143 likes


Josh Meyer: Accepted to LREC!

3 replies, 44 likes


jenny (phire) zhang: *whispers* hey it’s the thing I work on now

2 replies, 35 likes


Robert (Munro) Monarch: 38 languages & 50k speakers in the latest Common Voice release: https://arxiv.org/abs/1912.06670 Congrats @rosanardila @KellyJayDavis @mikehenrty @KohlerSolutions @_josh_meyer_ @ezesanlasai @gr__or & co! Language support is AI's biggest bias & most languages are not written

1 reply, 26 likes


Peter Skomoroch: New dataset: Common Voice Corpus - Over 50,000 individuals & 2,500 hours of collected audio, largest audio corpus in the public domain for speech recognition by number of hours and languages

1 reply, 15 likes


Rosana Ardila: A paper on Common Voice is out! Mozilla alongside a huge community is building a massive multilingual speech corpus to make speech recognition available for all languages. Proud to work with such an amazing team!

0 replies, 9 likes


e-Katerina Vylomova: Common Voice (A Massively-Multilingual Speech Corpus): 💫38 languages (Nov, 2019) 💫 over 50,000 individuals who have participated 💫2,500 hours of collected audio https://t.co/VqSzJPp3eC

0 replies, 8 likes


Armando Kirwin: We’re fortunate to have @_josh_meyer_ on our team at Artie. 😎

0 replies, 5 likes


Michael Kohler: Woohoo, first paper published. Thanks to @_josh_meyer_ for starting this!

1 reply, 4 likes


George Roter: Congrats to all my great colleagues (staff and contributors) @mozilla who I had the honour to work alongside on this project over the past 2.5 years!! It's great to see their names in lights :)

0 replies, 4 likes


Michael Henretty: Yaaaaay, first time I’m an author for an @arxiv paper (always wanted to be one, but didn’t think I was smart enough). So honored to have worked on this project with these folks

0 replies, 2 likes


IoT Watcher: Josh Meyer on Twitter: "It's finally out! (submitted to LREC) Common Voice: A Massively-Multilingual Speech Corpus https://arxiv.org/abs/1912.06670" https://twitter.com/_josh_meyer_/status/1207018840661454848, see more http://tweetedtimes.com/v/2412?s=tnp

0 replies, 1 like


Odette Scharenborg: Amazing! So totally cool! Thanks a lot to all the people involved! #speech #data #corpora @SFeng9 please check this out!

0 replies, 1 like


HotComputerScience: Most popular computer science paper of the day: "Common Voice: A Massively-Multilingual Speech Corpus" https://hotcomputerscience.com/paper/common-voice-a-massively-multilingual-speech-corpus https://twitter.com/_josh_meyer_/status/1207018840661454848

0 replies, 1 like


Badr Abdullah 🐪 🇾🇪: Massive effort 👏

0 replies, 1 like


Oleksii Kuchaiev: Thank you! We love this dataset and already pre-trained and released some models with it!

0 replies, 1 like


Peter Friot: Josh Meyer on Twitter: "It's finally out! (submitted to LREC) Common Voice: A Massively-Multilingual Speech Corpus https://arxiv.org/abs/1912.06670" https://twitter.com/_josh_meyer_/status/1207018840661454848, see more http://tweetedtimes.com/v/2412?s=tnp

0 replies, 1 like


Łukasz Augustyniak: Great, @Adrian_Szymczak @niedakh @PiotrZelasko check this out

0 replies, 1 like


Djamé: That's really great !!!

0 replies, 1 like


Ramon Sanabria: great work friend! @_josh_meyer_

0 replies, 1 like


AgTuíteáil: This is so fucking cool.

0 replies, 1 like


C. Scott Ananian: Fantastic! Not just for the data, but for the crowd-sourcing infrastructure that enables it. Hopefully we can figure out how to plug this together with @Wikimedia language work.

0 replies, 1 like


amyfou🐕🐕🐕: !!!!

0 replies, 1 like


Epsilon Guanlin Lee: data #resource, great work and contribution

0 replies, 1 like


Finn Årup Nielsen: Interesting, but no Danish :(

2 replies, 0 likes


Content

Found on Dec 17 2019 at https://arxiv.org/pdf/1912.06670.pdf

PDF content of a computer science paper: Common Voice: A Massively-Multilingual Speech Corpus