In Operation
The models available are:
- Vocals (singing voice) / accompaniment separation (2 stems).
- Vocals / drums / bass / other separation (4 stems).
- Vocals / drums / bass / piano / other separation (5 stems).
Spleeter is a fairly complex engine, yet it's easy to use. The actual separation needs only a single command line.
Usage: spleeter [OPTIONS] COMMAND [ARGS]...

Options:
  --version  Return Spleeter version
  --help     Show this message and exit.

Commands:
  evaluate  Evaluate a model on the musDB test dataset
  separate  Separate audio file(s)
  train     Train a source separation model
Here are a few examples:
By default, spleeter creates 2 stems. Perfect for karaoke!
$ spleeter separate test-music-file.flac -o /output/path
This command creates a folder called test-music-file with 2 stems: vocals.wav and accompaniment.wav.
Let’s say we want 4 stems (vocals, drums, bass and other). Issue the command
$ spleeter separate test-music-file.flac -p spleeter:4stems -o /output/path
Let’s say we want 5 stems (vocals, drums, bass, piano and other). Issue the command
$ spleeter separate test-music-file.flac -p spleeter:5stems -o /output/path
The first time a model is used, the software will automatically download it before performing the separation.
The software can output wav, mp3, ogg, m4a, wma, and flac formats (use the -c flag). It supports both tensorflow and librosa backends; librosa is faster than tensorflow on CPU and uses less memory, so it is used by default when GPU acceleration is not available.
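For example, a command along these lines should produce MP3 stems instead of WAV (the filename is our test file, and the -c flag selects the codec):

$ spleeter separate test-music-file.flac -c mp3 -o /output/path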
The released models were trained on spectrograms up to 11kHz, but there are several ways of performing separation up to 16kHz or even 22kHz. For example:
$ spleeter separate test-music-file.flac -p spleeter:4stems-16kHz -o /output/path
When you use the CLI, each run of the spleeter command reloads the model, which adds overhead. To avoid this, it's best to separate multiple files with a single call to the CLI utility.
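For instance, passing several files to one invocation (the filenames below are just placeholders) means the model is loaded only once:

$ spleeter separate first-track.flac second-track.flac third-track.flac -o /output/path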
Summary
Spleeter is designed to help the research community in Music Information Retrieval (MIR) leverage the power of a state-of-the-art source separation algorithm.
Spleeter makes it easy to train source separation models using a dataset of isolated sources. The project also supplies pre-trained, state-of-the-art models for performing various types of separation.
Try as we might, we couldn't coax Spleeter into using our GPU under Ubuntu 22.10 or 23.04. According to the project, you need a fully working CUDA installation. Other machine learning projects we've evaluated had no issues whatsoever with our CUDA installation, so it's not clear what's wrong. We even tried a fresh installation of Ubuntu 22.04 and used our best endeavours to ensure our CUDA setup was flawless, but again there was no GPU usage. However, this didn't stop us testing the software, albeit more slowly as processing was bound to the CPU.
Website: research.deezer.com
Support: GitHub Code Repository
Developer: Deezer SA.
License: MIT License
Spleeter is written in Python. Learn Python with our recommended free books and free tutorials.
For other useful open source apps that use machine learning/deep learning, we’ve compiled this roundup.
Pages in this article:
Page 1 – Introduction and Installation
Page 2 – In Operation and Summary