This series looks at practical applications of Machine Learning from a Linux perspective. We only feature free and open source software in this series (except where stated).
Let’s clear up one potential source of confusion at the outset. What’s the difference between Machine Learning and Deep Learning? The two terms mean different things.
In essence, Machine Learning is the practice of using algorithms to parse data, learn insights from that data, and then make a determination or prediction. The machine is ‘trained’ using huge amounts of data.
Deep Learning is a subset of Machine Learning that uses multi-layered artificial neural networks to deliver state-of-the-art accuracy in tasks such as object detection, speech recognition, language translation and others. Think of Machine Learning as cutting-edge, and Deep Learning as the cutting-edge of the cutting-edge.
Both Machine Learning and Deep Learning are changing the world. Deep Learning is trending.
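To make the "train, then predict" idea concrete, here is a minimal sketch in plain Python. It is not taken from any of the apps listed below. The function names (`fit_line`, `predict`) and the tiny data set are made up for illustration, and real projects work with far more data and typically rely on a library rather than hand-written maths.

```python
# Illustrative sketch only: "learn" a straight line from example data,
# then use it to predict an unseen value. All names and numbers here
# are hypothetical.

def fit_line(xs, ys):
    """Learn a slope and intercept from the data by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = cov_xy / var_x
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Apply the learned line to a new input."""
    slope, intercept = model
    return slope * x + intercept


# Made-up training data that follows y = 2x + 1.
hours = [1.0, 2.0, 3.0, 4.0]
scores = [3.0, 5.0, 7.0, 9.0]

model = fit_line(hours, scores)    # the "training" step
prediction = predict(model, 5.0)   # the "prediction" step
print(model, prediction)
```

The same two steps, fitting a model to data and then asking it about inputs it has not seen, underlie both Machine Learning and Deep Learning. Deep Learning replaces the single line with a neural network made up of many layers.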
The apps are self-hosted so you don’t need to pay any hosting/cloud fees to use them. We’ve written short reviews for each app. And there are many more reviews currently under preparation.
Audio | |
---|---|
![]() | Audiocraft - Python-based software which provides the code and models for MusicGen, a simple and controllable model for music generation. The models generate short music extracts based on the text description you provide. The models can generate up to 30 seconds of audio in one pass. |
![]() | Bark - Transformer-based text-to-audio model. The software can generate realistic multilingual speech as well as other audio – including music, background noise and simple sound effects, from text. |
![]() | Coqui STT - a deep-learning toolkit for training and deploying speech-to-text models. There are bindings for various programming languages. |
![]() | Demucs - billed as “a state-of-the-art music source separation model, currently capable of separating drums, bass, and vocals from the rest of the accompaniment”. |
![]() | DiffRhythm - billed as a blazingly fast and embarrassingly simple end-to-end full-length song generation with latent diffusion. |
![]() | Piper - fast, local neural text to speech system written in C++ and Python that runs well even on single board computers. |
![]() | Speech Note - GUI frontend for various processing engines. For Speech to Text it uses Coqui STT, Vosk, and Whisper. For Text to Speech, Speech Note uses espeak-ng, MBROLA, Piper, RHVoice, and Coqui TTS. And machine translation is handled by Bergamot Translator. |
![]() | Spleeter - Command-line source separation library with pre-trained models. It's designed to help the research community in Music Information Retrieval (MIR) leverage the power of a state-of-the-art source separation algorithm. |
![]() | StemRoller - GUI software which lets you separate vocal and instrumental stems from any song with a single click. |
![]() | Tortoise TTS - Multi-voice text-to-speech system trained with an emphasis on quality. It seeks to provide strong multi-voice capabilities, and highly realistic prosody and intonation. |
![]() | TTS - Library for advanced Text-to-Speech generation. It offers pretrained models in more than 1,100 different languages, together with tools for training new models and improving existing models. There are also utilities for dataset analysis. |
![]() | Ultimate Vocal Remover - GUI that lets you isolate stems from music. It offers convenient access to a wide range of different models. |
![]() | Whisper - an automatic speech recognition (ASR) system trained on 680,000 hours of multilingual and multitask supervised data collected from the web. Whisper is a natural language processing system that’s built on PyTorch. |
Chat | |
---|---|
![]() | Alpaca - chat with a wide range of local AI models. There's also support for image recognition, code highlighting, and more. |
![]() | Bavarder - GTK4/libadwaita based app that offers an easy way to experiment with ChatGPT. |
![]() | ChatGPT (by lencx) - a desktop application wrapper for the ChatGPT website. The chatbot generates human-like text in a conversational style and can be used for a variety of natural language processing tasks. |
![]() | chatGPT-shell-cli is a simple script to use OpenAI’s chatGPT and DALL-E from the terminal without needing to install either Python or Node.js. |
![]() | Dalai - bills itself as “the simplest way to run LLaMA on your local machine”. Large Language Models trained on massive amounts of text can perform new tasks from textual instructions. |
![]() | GodMode - a dedicated chat browser giving instant access to the full webapps of ChatGPT, Bard, Claude 2, Perplexity, Bing, Quora Poe and other AI services all accessible with a single keyboard shortcut. |
![]() | GPT4All - GUI and CLI locally-running AI chat application powered by the GPT4All-J Apache 2 Licensed chatbot. |
![]() | Jan - a ChatGPT-alternative that runs 100% offline on your desktop. The aim of the project is to make it easy for anyone to run LLMs and use AI yet retain full control and privacy. |
![]() | Ollama - run and chat with Llama 2 and other models with the ability to customize models by creating your own Modelfile. |
![]() | Reor - a private AI personal knowledge management tool. Think of it as a notes program on steroids. Each note is saved as a Markdown file to a “vault” directory on your machine. |
![]() | Serge - chat interface crafted with LLaMA for running GGUF models. It's very simple to install and use. |
![]() | Simplexity - a simple desktop app that accesses Perplexity. It's written in JavaScript. |
![]() | Terminal GPT - a command-line interface (CLI) tool that allows you to use ChatGPT 3.5 in your terminal without needing API keys. |
![]() | Text generation web UI - offers a web user interface for a variety of large language models such as LLaMA, llama.cpp, GPT-J, OPT, and GALACTICA. |
Graphics | |
---|---|
![]() | BackgroundRemover - a command line tool to remove the background from images and videos using AI. The AI is performed courtesy of U2Net, a machine learning model that allows you to crop objects in a single shot. |
![]() | CodeFormer - command-line software which offers blind face restoration. This aims at recovering high-quality faces from the low-quality counterparts suffering from unknown degradation. This is freeware. |
![]() | DeOldify - A modern way to colorize black and white images using deep learning technology. The software provides pre-trained weights which allow you to colorize images and video without needing to train your own models. |
![]() | Easy Diffusion - web interface to Stable Diffusion designed to be as easy-to-use as possible. |
![]() | FBCNN - Flexible Blind Convolutional Neural Network is software which seeks to remove artifacts from JPEGs while preserving the integrity of the images. |
![]() | Final2x - GUI software that uses sophisticated AI models to enhance your images by guessing what the details could be. |
![]() | Fooocus - image generating software that allows users to focus on text prompts to generate images. It’s built entirely on the Stable Diffusion XL architecture. |
![]() | GFPGAN - perform real-world face restoration. This software can radically improve the quality of photos. |
![]() | ImaginAIry - Python-based software for generating Stable Diffusion images. It’s primarily designed for the command-line but there’s a web frontend in development. |
![]() | Imaginer - extremely easy-to-use GTK4 software which lets you generate pictures using AI. |
![]() | InvokeAI - a Stable Diffusion toolkit. Generate highly detailed images based on text descriptions, or from images/drawings. |
![]() | Lama Cleaner - Fully self-hostable inpainting tool powered by state-of-the-art AI models. |
![]() | Old Photo Restoration - use deep learning to restore old photos via deep latent space translation. |
![]() | PhotoPrism - AI-powered photos app for the decentralized web. It uses modern technologies to tag and find pictures. The software can be run at home, on a private server, or in the cloud. |
![]() | Real-ESRGAN - create practical algorithms for general image/video restoration. |
![]() | Rembg - remove backgrounds from images. The tool relies on the U2Net model, a machine learning model that performs object cropping in a single shot. |
![]() | RMBG-2-Studio - an enhanced background remove and replace app built around BRIA-RMBG-2.0. |
![]() | Stable Diffusion web UI - web interface to Stable Diffusion, a deep learning text-to-image diffusion model capable of generating photo-realistic images given any text input. |
![]() | Upscaler - GTK4 software that uses sophisticated AI models to enhance your images by guessing what the details could be. It's a frontend for Real-ESRGAN. |
![]() | Upscayl - GUI software that uses sophisticated AI models to enhance your images by guessing what the details could be. Like Upscaler, it's a frontend for Real-ESRGAN. |
Science | |
---|---|
![]() | Argos Translate is state of the art neural machine translation software. Argos Translate can be used as either a Python library, command-line, or GUI application. It uses OpenNMT for translations. |
![]() | astroML - a Python module which offers statistical data analysis in astronomy and astrophysics. |
![]() | EasyOCR - General OCR that can read both natural scene text and dense text in documents. The software supports more than 80 languages. |
![]() | LibreTranslate is a machine translation API which is entirely self-hosted. This software lets you use open source machine translation in your projects. It uses Argos Translate for its translation engine. It sports a great web frontend. |
![]() | ocrs - Rust library and CLI tool for extracting text from images, also known as OCR (Optical Character Recognition). The software uses neural network models written in PyTorch. |
![]() | scikit-learn - a machine learning library built on top of SciPy that supports supervised and unsupervised learning. It also provides various tools for model fitting, data preprocessing, model selection, model evaluation, and many other utilities. |
App Installers | |
---|---|
![]() | Pinokio describes itself as a browser that lets you install, run, and manage any server application, on your local machine. These applications are AI software. Pinokio is not a browser in the traditional sense. |
![]() | Stability Matrix is a multi-platform package manager and inference UI for AI image generation. It works with Stable Diffusion and Flux. |
If you have recommendations for other good free and open source machine learning software for Linux, please comment below.
Machine learning science apps please!
There are so many interesting projects here to try. The Stable Diffusion ones are particularly useful.
One thing I hate is that so many of these programs take up so much hard disk space. I’m not talking about the size of their models but rather the virtual environments with tons of Python libraries. Python is really a mess.
You should make it clear that the apps are self-hosted. That’s a big virtue and is worth stressing to readers, don’t ya think?
Good point, article has been updated to include a reference that the apps are self-hosted.
Hello! Thanks for this great collection. I think that pinokio ai browser should also be listed. Here or somewhere else in this site, but it should be listed. It is an open source tool to manage many LLMs locally with a matter of one click.
Another one I’d like to suggest is Open-WebUI. Both are great FOSS tools so deserve to be mentioned here. Thank you.
Thanks, I’ve never heard of Pinokio. I’ll try out the software.
Update: I had an old version of the software installed, so I have tried it before. I’ve tried installing quite a few of the apps with the latest version; most fail to install.
I edited out the URLs (see our Comment FAQ). URLs can be provided on our Discord server.
Oh, I see, sorry for the inconvenience. I didn’t notice the Comment FAQ at first, but I’ve read it now.
I’m actually running an instance on my gaming laptop and it runs OK. I was experimenting with Fooocus, which I think is a front end for the Stable Diffusion image generation AI. Does your machine have a graphics card, by the way?
So should I suggest it using the forms?
No need to apologise.
There’s no need now to complete our form for Pinokio. I will publish an article on it when I’ve done more testing.
Yes, my test machine has an NVIDIA GeForce RTX 3060 Ti which is sufficient to test most of the software available from Pinokio; some apps, like YuE, need more VRAM than the card has though, so I won’t try them.
I’ve had a bit more success getting software installed with Pinokio. Things like Hallucinator fail to install. Does that install for you?
Ok, thank you!
Mine has an RTX 4060 with 8GB of VRAM. I haven’t tried to install Hallucinator yet. I installed Fooocus and another one for image manipulation with effects, I think; sorry, I don’t remember its name now.
But I will give the Hallucinator a try tonight and let you know the result.
Thanks; interesting to see how you get on with Hallucinator.
Your graphics card has the same VRAM as mine, so you should be able to run the vast majority of the apps available in Pinokio. There’s some really interesting AI software that needs more than my card’s VRAM though.
Hi Steve!
I tried deploying Hallucinator on my machine. Installation is successful but it fails with the following error when I launch it:
Yes, I get exactly the same error. Others I’ve tried also fail to run/install even though I can get the app running fine with a manual installation.