Articles tagged “data sets”
-
Common Voice dataset tops 20,000 hours
The latest Common Voice dataset, released today, has achieved a major milestone: More than 20,000 hours of open-source speech data that anyone, anywhere can use. The dataset has nearly doubled in the past year. Mozilla’s Common Voice seeks to change the language technology ecosystem by supporting communities to collect voice data for the creation of voice-enabled applications for their own languages.