Speech Articles
-
DeepSpeech 0.6: Mozilla’s Speech-to-Text Engine Gets Fast, Lean, and Ubiquitous
The Machine Learning team at Mozilla continues work on DeepSpeech, an automatic speech recognition (ASR) engine that aims to make speech recognition technology and trained models openly available to developers. In this overview of recent improvements, we'll show how DeepSpeech can transform your applications by enabling client-side, low-latency, privacy-preserving speech recognition. Find out how you can participate.
-
A Real-Time Wideband Neural Vocoder at 1.6 kb/s Using LPCNet
This is an update on the LPCNet project, an efficient neural speech synthesizer from Mozilla’s Emerging Technologies group. LPCNet combines signal processing and deep learning to improve the efficiency of neural speech synthesis. Our recent work turns LPCNet into a very low-bitrate neural speech codec that is usable on current hardware, including phones.
-
LPCNet: DSP-Boosted Neural Speech Synthesis
LPCNet is a new project out of Mozilla’s Emerging Technologies group: an efficient neural speech synthesizer with lower complexity than some of its predecessors. Neural speech synthesis models have already demonstrated impressive output quality, but their computational complexity has made them hard to use in real time, especially on phones. Our solution with LPCNet uses a combination of deep learning and digital signal processing (DSP) techniques.