Machine Learning in Linux: Bark – Text-Prompted Generative Audio

In Operation

We can run the Bark models with a simple command such as this:

$ python -m bark --text "Hello everyone, my name is Steve. Let's have some fun!" --output_filename "bark-my-name-is.wav"

Here’s an example of the generated audio with the text prompt using the smaller models.

The clip is vaguely reminiscent of the voice of Stephen Mangan, an English actor, comedian, presenter and writer. Each time you run this command, you’ll get different output. Bark generates audio from scratch. It’s not meant to create only high-fidelity, studio-quality speech. Sometimes the generated audio is garbage.
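As a rough sketch of how the smaller models can be selected from Python: this assumes the SUNO_USE_SMALL_MODELS environment variable described in the Bark README, which needs to be set before bark is imported. The helper names use_small_models and load_models are ours, for illustration only.

```python
import os


def use_small_models(enabled=True):
    """Ask Bark to load its smaller checkpoints.

    Assumption: Bark reads SUNO_USE_SMALL_MODELS when it is first imported,
    so this has to run before any `import bark`.
    """
    os.environ["SUNO_USE_SMALL_MODELS"] = "True" if enabled else "False"


def load_models(small=True):
    """Set the model size, then import Bark and load the models."""
    use_small_models(small)
    # Imported only after the variable is set; needs Bark installed.
    from bark import preload_models

    preload_models()  # downloads the models on first run, then loads them
```

Calling load_models() once at the start of a script should be enough; the setting applies to every clip generated afterwards in that process.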

Bark will occasionally add music to the text of its own accord. To encourage it, place the ♪ symbol around the text or use the [music] tag. We created the next two clips using the Python file shown on Page 3 of this article.
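The two music hints can be applied with a couple of small string helpers. This is only a sketch: the function names as_song and with_music_tag are ours, and putting the [music] tag at the start of the prompt is an assumption rather than a documented requirement.

```python
def as_song(lyrics):
    """Wrap lyrics in music notes to nudge Bark towards singing them."""
    return f"♪ {lyrics.strip()} ♪"


def with_music_tag(text):
    """Prefix the prompt with the [music] tag (placement is an assumption)."""
    return f"[music] {text.strip()}"
```

The resulting string is then used as the text prompt in place of the plain text. As with any Bark prompt, the hint makes music more likely but doesn't guarantee it.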

To illustrate how each generation differs, here’s a second version using the same text prompt.

What’s more impressive is the variety of speaker presets. There are more than 100 available for a wide range of languages. The next clip uses a female voice, which we specified with audio_array = generate_audio(text_prompt, history_prompt="v2/en_speaker_9").
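Here is a minimal sketch of picking a preset and saving the result. It is not the file from Page 3. The preset naming scheme (thirteen language codes with speakers 0 to 9 each) is an assumption based on Bark's README, and the helpers speaker_preset and speak are hypothetical names of our own. The bark and scipy calls are the ones shown in the project's README.

```python
# Assumption: the language codes listed in Bark's README.
LANGUAGES = ("de", "en", "es", "fr", "hi", "it", "ja",
             "ko", "pl", "pt", "ru", "tr", "zh")


def speaker_preset(language="en", index=9):
    """Build a preset name such as "v2/en_speaker_9"."""
    if language not in LANGUAGES:
        raise ValueError(f"unsupported language code: {language!r}")
    # Assumption: each language has speakers numbered 0 to 9.
    if not isinstance(index, int) or not 0 <= index <= 9:
        raise ValueError(f"speaker index must be 0-9, got {index!r}")
    return f"v2/{language}_speaker_{index}"


def speak(text_prompt, preset, filename):
    """Generate one clip with the given preset and save it as a WAV file."""
    # Heavy imports stay inside the function; they need Bark and SciPy.
    from bark import SAMPLE_RATE, generate_audio, preload_models
    from scipy.io.wavfile import write as write_wav

    preload_models()
    audio_array = generate_audio(text_prompt, history_prompt=preset)
    write_wav(filename, SAMPLE_RATE, audio_array)
```

For example, speak("Hello everyone", speaker_preset("en", 9), "female-voice.wav") would request the same preset as the clip above, although the audio itself differs on every run.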

Bark also supports various languages out-of-the-box and automatically determines language from input text.

Summary

Bark is a really interesting project and great fun to boot. You’re not limited to speech, as Bark can generate music lyrics, sound effects or other non-speech sounds.

With a GeForce RTX 3060 Ti graphics card, processing is fast. A 14-second audio file takes around 13 seconds to generate. That’s important, as you’ll often need to run the software multiple times to get useful output.
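To put those figures in context, this small sketch works out the ratio of generation time to audio length and the total wait for several attempts. The numbers are the ones quoted above for our card; the function names are ours, and five attempts is just an example.

```python
AUDIO_SECONDS = 14       # length of the generated clip
GENERATION_SECONDS = 13  # time taken to generate it on our card


def real_time_factor(generation_seconds, audio_seconds):
    """Seconds of processing per second of audio (below 1 is faster than playback)."""
    return generation_seconds / audio_seconds


def total_wait(attempts, generation_seconds=GENERATION_SECONDS):
    """Total generation time if the same prompt is run several times."""
    return attempts * generation_seconds


print(round(real_time_factor(GENERATION_SECONDS, AUDIO_SECONDS), 2))  # 0.93
print(total_wait(5))  # 65
```

So generation runs slightly faster than playback, and five tries at the same prompt cost a little over a minute.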

We’d love to try the larger models but we don’t have a graphics card with at least 12GB of VRAM. Maybe NVIDIA or AMD will donate a suitable graphics card to LinuxLinks?

Bark creates audio files with a maximum duration of about 13 seconds, but it’s possible to create much longer audio files by splitting longer text into sentences using nltk and generating the sentences one by one.
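The sentence-by-sentence approach can be sketched roughly as follows. This assumes nltk's sent_tokenize (which needs its tokenizer data downloaded) plus Bark's generate_audio and SAMPLE_RATE. The regex fallback, the quarter-second pause and the function names are our own choices for illustration.

```python
import re


def split_sentences(text):
    """Split text into sentences, using nltk when it is available."""
    try:
        import nltk

        return nltk.sent_tokenize(text)
    except (ImportError, LookupError):
        # Fallback of our own (not part of Bark or nltk): a naive split after
        # ., ! or ? for when nltk or its tokenizer data is not installed.
        return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def generate_long_audio(text, speaker="v2/en_speaker_9", pause_seconds=0.25):
    """Generate each sentence separately and join the clips with short pauses."""
    # Heavy imports stay inside the function; they need NumPy and Bark.
    import numpy as np
    from bark import SAMPLE_RATE, generate_audio

    silence = np.zeros(int(pause_seconds * SAMPLE_RATE))
    pieces = []
    for sentence in split_sentences(text):
        pieces.append(generate_audio(sentence, history_prompt=speaker))
        pieces.append(silence.copy())
    if not pieces:
        raise ValueError("no sentences found in the text")
    return np.concatenate(pieces), SAMPLE_RATE
```

Passing the same speaker preset for every sentence helps keep the voice consistent across the joined clips. A single sentence that runs past the 13-second limit would still be cut short, so very long sentences need splitting further.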

Bark has amassed a whopping 22k GitHub stars.

Website: github.com/suno-ai/bark
Developer: Suno, Inc
License: MIT License

Bark is written in Python. Learn Python with our recommended free books and free tutorials.

For other useful open source apps that use machine learning/deep learning, we’ve compiled this roundup.

Pages in this article:
Page 1 – Introduction and Installation
Page 2 – In Operation and Summary
Page 3 – Example Python File

6 Comments
Carlos
1 year ago

Never heard of Bark before. It looks kinda interesting. I’ll give it a whirl under Ubuntu.

James
1 year ago
Reply to  Carlos

I’m using Debian so I should be able to get it working.

Neil
1 year ago
Reply to  James

do what?

Mel
1 year ago

Can you run Bark without a dedicated graphics card? I’ve got a 5th generation Intel machine with 8GB of RAM.