Mozilla releases dataset and model to lower voice-recognition barriers

Mozilla has released its Common Voice collection, which contains almost 400,000 recordings from 20,000 people and is claimed to be the second-largest publicly available voice dataset. The samples were gathered through Mozilla's Common Voice project, which let people donate recordings of their speech through a website or an iOS app. The hope is that a large public dataset will lead to better voice-enabled applications.

Alongside the dataset, Mozilla released its open-source Project DeepSpeech voice-recognition model, which is based on work done by Chinese internet giant Baidu. With a 6.5 percent error rate on the LibriSpeech dataset, DeepSpeech is claimed to be approaching human levels of recognition.

In August, Microsoft said it had reached a voice-recognition error rate of 5.1 percent on the Switchboard corpus, the same level as professional human transcribers. Earlier in the year, Google said it had a 4.9 percent error rate in its speech-recognition software.

See http://zd.net/2k8wX0S

from Danie van der Merwe - Google+ Posts http://ift.tt/2it5kiO