High Quality TTS in Linux
This guide uses piper with Ryan’s voice, which we created at Dream Face Technologies for our companion robot, Ryan. You can hear samples for all available voices here.
Download piper:
mkdir -p ~/Applications && cd ~/Applications
wget https://github.com/rhasspy/piper/releases/download/v1.2.0/piper_amd64.tar.gz
tar xvfz piper_amd64.tar.gz
rm piper_amd64.tar.gz
cd piper
mkdir models
Download the voices you want from here. You need both the .onnx and .onnx.json files for each voice. There are several voices and languages to choose from.
Here I download Ryan’s voice.
cd ~/Applications/piper/models
wget https://huggingface.co/rhasspy/piper-voices/resolve/v1.0.0/en/en_US/ryan/high/en_US-ryan-high.onnx
wget https://huggingface.co/rhasspy/piper-voices/resolve/v1.0.0/en/en_US/ryan/high/en_US-ryan-high.onnx.json
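A download can quietly save a web page instead of the voice file, for example when the URL points at a Hugging Face file viewer page rather than the raw file. Here is a rough sanity check you could run after downloading. The function name check_voice_files is my own, and the "starts with <" test is only a heuristic for spotting HTML, not a real validation of the model.

```shell
# Return 0 if both files for a voice look like real downloads, 1 otherwise.
# $1: models directory, $2: voice name (for example en_US-ryan-high)
# check_voice_files is a hypothetical helper, not part of piper.
check_voice_files() {
    models_dir=$1
    voice=$2
    for f in "$models_dir/$voice.onnx" "$models_dir/$voice.onnx.json"; do
        # Missing or empty file: the download did not complete.
        if [ ! -s "$f" ]; then
            echo "missing or empty: $f" >&2
            return 1
        fi
        # Heuristic: a file whose first byte is '<' is most likely an HTML page.
        if [ "$(head -c 1 "$f")" = "<" ]; then
            echo "looks like HTML, not a voice file: $f" >&2
            return 1
        fi
    done
    return 0
}
```

For the voice in this guide you would call it as check_voice_files ~/Applications/piper/models en_US-ryan-high and download again if it complains.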
Now set up speech-dispatcher:
mkdir -p ~/.config/speech-dispatcher/modules
Put this in ~/.config/speech-dispatcher/modules/piper.conf, replacing /home/hojjat with your own home directory:
AddVoice "en-US" "MALE1" "en_US-ryan-high"
DefaultVoice "en_US-ryan-high"
GenericExecuteSynth "echo \'$DATA\' | /home/hojjat/Applications/piper/piper --model /home/hojjat/Applications/piper/models/en_US-ryan-high.onnx --output_raw | aplay -r 22050 -f S16_LE -t raw - "
You can copy more models into the models folder and add them here with AddVoice.
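One thing to watch when you switch voices: the aplay rate (22050 in the command above) has to match the voice's sample rate, and I believe some lower-quality voices use a lower rate. The sketch below reads the rate from the voice's .onnx.json file. It assumes the file stores it as "sample_rate": followed by an integer, and the function name voice_sample_rate is just something I made up for illustration.

```shell
# Print the sample rate recorded in a Piper voice config.
# $1: path to the voice's .onnx.json file
# Assumption: the rate appears as "sample_rate": <integer> in that file.
# voice_sample_rate is a hypothetical helper, not part of piper.
voice_sample_rate() {
    # Keep only the digits after "sample_rate": and print the first match.
    sed -n 's/.*"sample_rate"[[:space:]]*:[[:space:]]*\([0-9][0-9]*\).*/\1/p' "$1" | head -n 1
}
```

Running voice_sample_rate ~/Applications/piper/models/en_US-ryan-high.onnx.json should print the number to use after aplay -r. If it prints nothing, the file does not have the layout assumed here.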
Put this in ~/.config/speech-dispatcher/speechd.conf (replace /home/hojjat with your own home directory):
AddModule "piper" "sd_generic" "/home/hojjat/.config/speech-dispatcher/modules/piper.conf"
DefaultModule piper
AudioOutputMethod "pulse"
AudioPulseDevice "default"
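Since the path in AddModule is specific to my machine, here is a small sketch that prints the same four lines with your own home directory filled in. The function name render_speechd_conf is hypothetical; the config lines themselves are exactly the ones above.

```shell
# Print the speechd.conf lines from this guide for a given home directory.
# $1: home directory to use (defaults to $HOME)
# render_speechd_conf is a hypothetical helper, not part of speech-dispatcher.
render_speechd_conf() {
    home_dir=${1:-$HOME}
    printf 'AddModule "piper" "sd_generic" "%s/.config/speech-dispatcher/modules/piper.conf"\n' "$home_dir"
    printf 'DefaultModule piper\n'
    printf 'AudioOutputMethod "pulse"\n'
    printf 'AudioPulseDevice "default"\n'
}
```

You could then write the file with render_speechd_conf > ~/.config/speech-dispatcher/speechd.conf. Note that this overwrites any existing speechd.conf, so back yours up first if you have one.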
Now let’s make sure speech-dispatcher is enabled:
systemctl --user enable speech-dispatcher
systemctl --user start speech-dispatcher
Now reboot for good measure.
You can test the audio with spd-say "hello world". You can also use ReadAloud on Firefox.
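If the test is silent or you hear a different voice, it helps to check that speech-dispatcher actually sees the piper module. Below is a minimal sketch of such a check. It assumes the module is registered under the name "piper" as in the speechd.conf above, and that your spd-say supports -O (list output modules) and -o (pick a module). The function name spd_test_piper is made up.

```shell
# Speak a test phrase through the piper module, or report what is missing.
# $1: text to speak (defaults to "hello world")
# Assumptions: the module is named "piper", and spd-say accepts -O and -o.
# spd_test_piper is a hypothetical helper, not part of speech-dispatcher.
spd_test_piper() {
    text=${1:-hello world}
    if ! command -v spd-say >/dev/null 2>&1; then
        echo "spd-say not found; is speech-dispatcher installed?" >&2
        return 127
    fi
    # Look for the module name anywhere in the list of output modules.
    if ! spd-say -O | grep -q piper; then
        echo "speech-dispatcher does not list a piper module" >&2
        return 1
    fi
    spd-say -o piper "$text"
}
```

Calling spd_test_piper "hello world" either speaks through piper or tells you which piece is missing, which narrows down whether the problem is in speechd.conf or in piper.conf.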
Converting large texts may take a while on CPU. Check out Piper’s GitHub page to find out how to use GPU acceleration with Piper.