Yoshimi is a sophisticated, algorithmic MIDI software synthesizer for Linux, a fork of ZynAddSubFX.
VCV Rack – modular synthesizer
VCV Rack is a Eurorack modular synthesizer simulator. It is free and open source software, although proprietary plugins are also available.
FluidSynth – a SoundFont synthesizer
FluidSynth is a console-based, real-time software synthesizer built on the SoundFont 2 specification.
ZynAddSubFX – fully featured open source software synthesizer
ZynAddSubFX is a powerful real-time software synthesizer with many features, including polyphonic, multi-timbral, and microtonal capabilities.
Simon – frontend for the Simon speech recognition solution
Simon is open source speech recognition software which aims to be flexible and highly customizable.
Eesen – End-to-End Speech Recognition using Deep RNN Models and WFST-based Decoding
Eesen aims to simplify the existing complicated, expertise-intensive ASR pipeline into a straightforward sequence learning problem.
CMUSphinx – Open Source Speech Recognition System for Mobile and Server Applications
CMUSphinx (Sphinx) is a collective term to describe a group of speech recognition systems developed at Carnegie Mellon University.
Julius – large vocabulary continuous speech recognition decoder software
Julius is a high-performance, two-pass large vocabulary continuous speech recognition (LVCSR) engine. It supports N-gram-based dictation.
OpenSeq2Seq – TensorFlow-based toolkit for sequence-to-sequence models
OpenSeq2Seq is a toolkit for distributed and mixed precision training of sequence-to-sequence models.
DeepSpeech – TensorFlow implementation of Baidu’s DeepSpeech architecture
DeepSpeech is an open source Speech-To-Text engine, using a model trained by machine learning techniques.
deepspeech.pytorch – Implementation of DeepSpeech2 using Baidu Warp-CTC
deepspeech.pytorch is an implementation of DeepSpeech2 using Baidu Warp-CTC. It creates a network based on the DeepSpeech2 architecture.
ESPnet – end-to-end speech processing toolkit
ESPnet is an end-to-end speech processing toolkit that focuses mainly on end-to-end speech recognition and end-to-end text-to-speech.
Kaldi Speech Recognition Toolkit – Designed for Speech Recognition Researchers
Kaldi is a state-of-the-art speech recognition toolkit written in C++. It’s intended to be used mainly for acoustic modelling research.
SpeechBrain – conversational AI toolkit
SpeechBrain is an all-in-one conversational AI toolkit based on PyTorch. It is free and open source software written in Python.
Flashlight – C++ standalone library for machine learning
Flashlight is a fast, flexible machine learning library written entirely in C++. It provides apps for research across multiple domains.