Last Updated on March 6, 2023
In Operation
The quickest way to start using STT is with its model manager. This provides a convenient unified interface to connect your microphone to a Coqui Speech-to-Text model, manage your installed models and install new ones from the Coqui Model Zoo. The Coqui Model Zoo is the central hub for finding STT models created by its community as well as official Coqui models.
Start the model manager with the command:
$ stt-model-manager
This launches the system’s default web browser at http://127.0.0.1:38450/
Install a model from the Coqui Model Zoo to get started. There are many pre-trained STT models available.
We installed the English STT huge-vocab model. Its acoustic model was trained on American English data with synthetic noise augmentation. The training data came from Common Voice 7.0 English (custom Coqui train/dev/test splits), LibriSpeech, and Multilingual LibriSpeech: approximately 47,000 hours of data in total.
The model is stored at ~/.local/share/coqui/models/English STT v1.0.0-huge-vocab
total 979M
-rw-rw-r-- 1 sde sde 934M Feb 20 19:44 huge-vocabulary.scorer
-rw-rw-r-- 1 sde sde  46M Feb 20 19:41 model.tflite
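If you want to use the downloaded files outside the model manager, the first step is locating them. The following is a minimal Python sketch, assuming a model directory laid out like the listing above (one .tflite acoustic model plus an optional .scorer file). The function name `find_model_files` is our own and is not part of Coqui STT.

```python
from pathlib import Path


def find_model_files(model_dir):
    """Return (acoustic_model, scorer) paths found in a model directory.

    The scorer is None when the directory holds no *.scorer file.
    Assumption: the directory contains at most one model of interest,
    so the first match in sorted order is returned.
    """
    model_dir = Path(model_dir).expanduser()
    models = sorted(model_dir.glob("*.tflite"))
    if not models:
        raise FileNotFoundError(f"no .tflite acoustic model in {model_dir}")
    scorers = sorted(model_dir.glob("*.scorer"))
    return models[0], (scorers[0] if scorers else None)
```

Pass it the model directory named above. A leading ~ is expanded for you.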
We can test the model by clicking the Run model button. In the image below, the model has accurately transcribed our spoken words. For best results, you should ensure you’re using the software in a low-noise environment with a good microphone.
The software has an efficient training pipeline with multi-GPU support. Streaming and real-time inference are both supported.
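Streaming inference means audio is fed to the decoder in chunks as it arrives, rather than as one finished recording. The following is a minimal Python sketch, assuming a `Model` object from the `stt` Python package with its `createStream()` method, and a stream offering `feedAudioContent()`, `intermediateDecode()` and `finishStream()`. The names `transcribe_stream` and `on_partial` are made up for illustration.

```python
def transcribe_stream(model, chunks, on_partial=None):
    """Feed audio chunks to a streaming decoder and return the final text.

    model      -- an object exposing createStream(), e.g. stt.Model
    chunks     -- iterable of audio buffers; with the real library these
                  would be NumPy int16 arrays at the model's sample rate
    on_partial -- optional callback that receives the transcript so far
                  after each chunk (hypothetical helper, not an STT API)
    """
    stream = model.createStream()
    for chunk in chunks:
        stream.feedAudioContent(chunk)
        if on_partial is not None:
            # Decoding the partial result costs time, so only do it on request.
            on_partial(stream.intermediateDecode())
    # finishStream() runs the final decode and releases the stream.
    return stream.finishStream()
```

Passing `print` as `on_partial` would show the transcript growing while you speak, which is roughly what a live captioning tool needs.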
Summary
STT gets our firm recommendation. It’s very impressive software with high-quality pre-trained models available.
Language models are trained from text, and the more similar that text is to the speech your STT system encounters at run-time, the better STT performs. For more accurate transcription you’ll want to use a custom language model.
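In Coqui STT the language model ships as a scorer file, so switching to a custom one amounts to pointing the model at a different scorer. The following is a minimal Python sketch, assuming a `Model` object from the `stt` Python package and its `enableExternalScorer()` and `setScorerAlphaBeta()` methods. The function name `use_language_model` is hypothetical, and building the scorer file itself is not shown.

```python
def use_language_model(model, scorer_path, alpha=None, beta=None):
    """Attach a scorer (packaged language model) to an STT model.

    alpha -- language model weight; beta -- word insertion bonus.
    Both are optional: when either is omitted, the values stored in
    the scorer package are left untouched.
    """
    model.enableExternalScorer(str(scorer_path))
    if alpha is not None and beta is not None:
        model.setScorerAlphaBeta(alpha, beta)
    return model
```

Suitable alpha and beta values depend on your data, so treat any numbers you pass in as something to tune rather than as defaults.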
There are bindings for various programming languages.
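As an illustration of the Python binding, here is a minimal sketch that transcribes a WAV file. It assumes the `stt` package and NumPy are installed, and that the audio is 16-bit mono PCM at the model's sample rate. `read_pcm16_mono` and `transcribe` are our own helper names. `Model`, `enableExternalScorer()`, `sampleRate()` and `stt()` come from the library.

```python
import wave


def read_pcm16_mono(wav_path, expected_rate):
    """Return raw 16-bit mono PCM bytes, rejecting files the model cannot use."""
    with wave.open(str(wav_path), "rb") as wav:
        if wav.getnchannels() != 1:
            raise ValueError("expected mono audio")
        if wav.getsampwidth() != 2:
            raise ValueError("expected 16-bit samples")
        if wav.getframerate() != expected_rate:
            raise ValueError(
                f"expected {expected_rate} Hz, got {wav.getframerate()} Hz"
            )
        return wav.readframes(wav.getnframes())


def transcribe(model_path, scorer_path, wav_path):
    """Load a model (and optional scorer) and transcribe one WAV file."""
    # Imported lazily so the WAV helper above works without these packages.
    import numpy as np
    from stt import Model

    model = Model(str(model_path))
    if scorer_path is not None:
        model.enableExternalScorer(str(scorer_path))
    pcm = read_pcm16_mono(wav_path, model.sampleRate())
    # The binding expects a NumPy array of int16 samples.
    return model.stt(np.frombuffer(pcm, dtype=np.int16))
```

Audio in another format or sample rate would need converting first, for example with a tool such as SoX or ffmpeg.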
Website: coqui.ai
Support: GitHub Code Repository
Developer: Coqui STT developers
License: Mozilla Public License 2.0
Coqui STT is written in C++ and Python.