High Quality TTS in Linux

This guide uses Piper with Ryan’s voice, which we created at Dream Face Technologies for our companion robot, Ryan. You can hear samples for all available voices here.

Download piper:

mkdir -p ~/Applications
cd ~/Applications
wget https://github.com/rhasspy/piper/releases/download/v1.2.0/piper_amd64.tar.gz
tar xvfz piper_amd64.tar.gz
rm piper_amd64.tar.gz
cd piper
mkdir models

Download the voices you want from here. For each voice you need both the .onnx and the .onnx.json file. Several voices and languages are available.

Here I download Ryan’s voice.

cd ~/Applications/piper/models
wget https://huggingface.co/rhasspy/piper-voices/resolve/v1.0.0/en/en_US/ryan/high/en_US-ryan-high.onnx
wget https://huggingface.co/rhasspy/piper-voices/resolve/v1.0.0/en/en_US/ryan/high/en_US-ryan-high.onnx.json
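
A download from Hugging Face can silently save an HTML web page instead of the model when the URL points at the file viewer rather than the raw file. The sketch below is one way to catch that before configuring anything. The function names is_html_page and check_models are made up for this example, and the check is only a heuristic, not a validation of the ONNX file itself.

```shell
# Hypothetical helper: succeed if the file looks like an HTML page.
is_html_page() {
    # Inspect only the first 512 bytes; strip NUL bytes so grep treats
    # binary model data like ordinary text.
    head -c 512 "$1" | LC_ALL=C tr -d '\000' | grep -qiE '<!doctype html|<html'
}

# Hypothetical helper: report every .onnx and .onnx.json file in a folder.
check_models() {
    dir="$1"
    rc=0
    for f in "$dir"/*.onnx "$dir"/*.onnx.json; do
        [ -e "$f" ] || continue        # glob matched nothing
        if [ ! -s "$f" ]; then
            echo "EMPTY $f"; rc=1
        elif is_html_page "$f"; then
            echo "HTML $f"; rc=1       # a web page was saved, not the model
        else
            echo "OK $f"
        fi
    done
    return $rc
}
```

Running check_models ~/Applications/piper/models should print one line per file. Any line starting with HTML or EMPTY means that file needs to be downloaded again.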

Now set up speech-dispatcher:

mkdir -p ~/.config/speech-dispatcher/modules

Put this in ~/.config/speech-dispatcher/modules/piper.conf (replace /home/hojjat with your own home directory):

AddVoice "en-US" "MALE1" "en_US-ryan-high"

DefaultVoice "en_US-ryan-high"

GenericExecuteSynth "echo \'$DATA\' | /home/hojjat/Applications/piper/piper --model /home/hojjat/Applications/piper/models/en_US-ryan-high.onnx --output_raw | aplay -r 22050 -f S16_LE -t raw - "

You can copy more models into the models folder and add them here with AddVoice.

Put this in ~/.config/speech-dispatcher/speechd.conf, adjusting the home directory in the path:

AddModule "piper" "sd_generic" "/home/hojjat/.config/speech-dispatcher/modules/piper.conf"
DefaultModule piper
AudioOutputMethod "pulse"
AudioPulseDevice "default"
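
Both files contain absolute paths, so copying them verbatim only works for a user named hojjat. As a rough sketch, the same two files can be generated with paths built from $HOME. The function name write_speechd_piper_conf and the PIPER_DIR and SPEECHD_CONF_DIR variables are invented for this example. The sketch overwrites any existing speechd.conf, and it keeps the MALE1 voice type and the 22050 Hz rate from the configuration above, which may not suit other voices.

```shell
# Hypothetical helper: write piper.conf and speechd.conf for the current user.
write_speechd_piper_conf() {
    voice="${1:-en_US-ryan-high}"
    piper_dir="${PIPER_DIR:-$HOME/Applications/piper}"
    conf_dir="${SPEECHD_CONF_DIR:-$HOME/.config/speech-dispatcher}"

    # Derive the language tag from the model name: en_US-ryan-high -> en-US.
    lang=$(printf '%s' "$voice" | cut -d- -f1 | tr '_' '-')

    mkdir -p "$conf_dir/modules"

    # \$DATA is escaped so the placeholder is written literally;
    # speech-dispatcher fills in the text to speak at run time.
    cat > "$conf_dir/modules/piper.conf" <<EOF
AddVoice "$lang" "MALE1" "$voice"

DefaultVoice "$voice"

GenericExecuteSynth "echo \'\$DATA\' | $piper_dir/piper --model $piper_dir/models/$voice.onnx --output_raw | aplay -r 22050 -f S16_LE -t raw - "
EOF

    cat > "$conf_dir/speechd.conf" <<EOF
AddModule "piper" "sd_generic" "$conf_dir/modules/piper.conf"
DefaultModule piper
AudioOutputMethod "pulse"
AudioPulseDevice "default"
EOF
}
```

Calling write_speechd_piper_conf with no arguments reproduces the configuration shown above for the current user. Passing another model name as the first argument switches the voice, provided the matching files are in the models folder.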

Now enable and start speech-dispatcher:

systemctl --user enable speech-dispatcher
systemctl --user start speech-dispatcher

Now reboot for good measure.

You can test the audio with spd-say "hello world". You can also use ReadAloud in Firefox.
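
If spd-say stays silent, it can help to take speech-dispatcher out of the picture and run the pipeline from the GenericExecuteSynth line by hand. This is a minimal sketch that assumes piper lives in ~/Applications/piper as above and that aplay is installed. The names speak_direct, PIPER_DIR and VOICE are invented for this example.

```shell
# Hypothetical helper: speak a sentence through piper and aplay directly,
# bypassing speech-dispatcher.
speak_direct() {
    piper_dir="${PIPER_DIR:-$HOME/Applications/piper}"
    voice="${VOICE:-en_US-ryan-high}"
    # Same pipeline as in piper.conf: text in, raw 16-bit audio out,
    # played back at 22050 Hz.
    printf '%s\n' "$1" |
        "$piper_dir/piper" --model "$piper_dir/models/$voice.onnx" --output_raw |
        aplay -r 22050 -f S16_LE -t raw -
}
```

If speak_direct "hello world" produces sound but spd-say does not, the problem is more likely in the speech-dispatcher configuration than in piper or the model files.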

Converting large texts may take a while on a CPU. Check out Piper’s GitHub page to find out how to use GPU acceleration with Piper.

This article was updated on September 27, 2023