In Operation
The project lets us synthesize speech on the command line. We can either use a model from the provided list or train our own. If no model is specified, TTS uses an LJSpeech-based English model.
When you try a model for the first time, it’s automatically downloaded. The models are stored at ~/.local/share/tts. They consume a fair chunk of hard disk space. For example, the jenny model uses 1.8GB of disk space.
Here’s sample output with the default model.
$ tts --text "This is a deep learning toolkit for Text-to-Speech written in Python and published under an open source license" --out_path /home/sde/results/speech-default-model.wav
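Because every model is downloaded on first use, it can be handy to check how much space the model store has grown to. The following is a minimal sketch rather than anything shipped with TTS: the helper names `directory_size_bytes` and `format_size` are our own, and we assume the store lives at .local/share/tts under your home directory, as it did on our test machine.

```python
from pathlib import Path


def directory_size_bytes(root: Path) -> int:
    """Sum the sizes of all regular files below root (0 if root is missing)."""
    if not root.is_dir():
        return 0
    return sum(p.stat().st_size for p in root.rglob("*") if p.is_file())


def format_size(num_bytes: int) -> str:
    """Render a byte count in decimal units (1 kB = 1000 B)."""
    size = float(num_bytes)
    for unit in ("B", "kB", "MB", "GB"):
        if size < 1000:
            return f"{size:.1f} {unit}"
        size /= 1000
    return f"{size:.1f} TB"


if __name__ == "__main__":
    # Assumed location of the downloaded models; adjust if yours differs.
    store = Path.home() / ".local" / "share" / "tts"
    if store.is_dir():
        # Report each top-level entry (typically one directory per model).
        for entry in sorted(store.iterdir()):
            if entry.is_dir():
                print(f"{format_size(directory_size_bytes(entry)):>10}  {entry.name}")
    print(f"{format_size(directory_size_bytes(store)):>10}  total")
```

If the directory doesn’t exist yet, the script simply reports a total of 0.0 B.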
Let’s try the tts_models/en/jenny/jenny model.
$ tts --text "We hope you enjoy our reviews. We cover both software and hardware from a Linux perspective. We love receiving your thoughts on our site, so please share in the comments section below" --model_name tts_models/en/jenny/jenny --out_path /home/sde/results/speech-jenny.wav
And here’s example output from the tts_models/en/ljspeech/glow-tts model:
$ tts --text "Thanks to everyone that has donated to our site. We really appreciate your support" --model_name tts_models/en/ljspeech/glow-tts --out_path /home/sde/results/speech-glow-tts.wav
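If you want to compare several models on the same sentence, typing each command by hand gets tedious. Here’s a small sketch that builds the same kind of `tts` invocations shown above, using only the `--text`, `--model_name` and `--out_path` options. The `build_tts_command` helper, the output file names and the `--run` switch are our own inventions for illustration. By default the script only prints the commands; it executes them only when called with `--run`.

```python
import shlex
import subprocess
import sys
from typing import List, Optional


def build_tts_command(text: str, out_path: str,
                      model_name: Optional[str] = None) -> List[str]:
    """Build the argument list for one tts call.

    Omitting model_name leaves the choice to TTS, i.e. the default model.
    """
    cmd = ["tts", "--text", text, "--out_path", out_path]
    if model_name is not None:
        cmd += ["--model_name", model_name]
    return cmd


# (model name, output file) pairs; None means "use the default model".
# The output file names are placeholders chosen for this example.
JOBS = [
    (None, "speech-default-model.wav"),
    ("tts_models/en/jenny/jenny", "speech-jenny.wav"),
    ("tts_models/en/ljspeech/glow-tts", "speech-glow-tts.wav"),
]

if __name__ == "__main__":
    text = "Thanks to everyone that has donated to our site."
    really_run = "--run" in sys.argv[1:]
    for model_name, out_path in JOBS:
        cmd = build_tts_command(text, out_path, model_name)
        # Always show the command, quoted so it can be pasted into a shell.
        print("$ " + shlex.join(cmd))
        if really_run:
            # check=True stops the batch at the first failing synthesis.
            subprocess.run(cmd, check=True)
```

Passing the arguments as a list, rather than as one shell string, means the text needs no extra quoting even when it contains apostrophes or quotation marks.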
There are also multi-speaker TTS models available, as well as voice conversion models.
The real power of TTS comes from its ability to train new models and fine-tune existing models in any language. In our tests, training was fast.
Summary
TTS is impressive software, and we’ve only scratched the surface of what it can do. With a huge arsenal of pretrained models and realistic prosody and intonation, it’s one of the best TTS libraries we’ve tested.
With its array of deep learning models, a good trainer API and lots of tools to curate datasets, TTS gets our firm recommendation.
There are some features we’d love to see added. For example, it would be good if the TTS library could play the audio directly, with support for a streaming mode.
Website: github.com/coqui-ai/TTS
Support:
Developer: Coqui and contributors
License: Mozilla Public License Version 2.0
For other useful open source apps that use machine learning/deep learning, we’ve compiled this roundup.
TTS is written in Python. Learn Python with our recommended free books and free tutorials.
This looks interesting, but I wish developers would provide distro-specific packages.
This project has been shut down.
That’s not true.
The main developer actually said that he doesn’t plan to continue maintaining the code at the moment, but that may change in the future.
The last commit was only 5 days ago.
Plus, the project is open source, so anyone can fork it and carry on development in the meantime.