After handling the voice command, it is common to give the user a spoken response. Rhasspy uses a text to speech (TTS) system to generate a voice from text. We tested two TTS systems, Mozilla TTS and NanoTTS.
To make the TTS say something, publish an MQTT message on the topic hermes/tts/say with the attributes text and siteId. The text attribute must contain the text you want the TTS to speak. After you send the message, the TTS generates a WAV file and plays it on the speaker of your Rhasspy device. When Rhasspy has finished playing the WAV file, it publishes the MQTT message hermes/tts/sayFinished with the same siteId.
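The request and its completion message can be sketched in Python as follows. This is a minimal sketch, not a definitive implementation: the helper names (build_say_payload, is_finished_for_site, say) are hypothetical, the broker host and port are assumptions that depend on your MQTT setup, and publishing assumes the paho-mqtt package is installed.

```python
import json

SAY_TOPIC = "hermes/tts/say"
SAY_FINISHED_TOPIC = "hermes/tts/sayFinished"


def build_say_payload(text, site_id="default"):
    """Return the JSON payload for a hermes/tts/say message."""
    return json.dumps({"text": text, "siteId": site_id})


def is_finished_for_site(payload, site_id):
    """Check whether a hermes/tts/sayFinished payload carries the given siteId."""
    return json.loads(payload).get("siteId") == site_id


def say(text, site_id="default", host="localhost", port=1883):
    """Publish a say request. Host and port are assumptions; adjust to your broker."""
    # Imported here so the helpers above also work without paho-mqtt installed.
    import paho.mqtt.publish as publish

    publish.single(
        SAY_TOPIC,
        build_say_payload(text, site_id),
        hostname=host,
        port=port,
    )
```

Calling say("The light is on", site_id="default") would send the request. A subscriber on hermes/tts/sayFinished could pass each incoming payload to is_finished_for_site to learn when playback for its site has ended.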
Rhasspy supports several text to speech systems. The list below shows some of the available options:
eSpeak: the default text to speech system, with the widest language support.
Flite: sounds better than eSpeak but supports fewer languages. Uses FestVox’s Flite for speech.
PicoTTS: uses SVOX’s PicoTTS for text to speech.
NanoTTS: uses an improved fork of SVOX’s PicoTTS for text to speech.
MaryTTS: uses a remote MaryTTS web server. A MaryTTS Docker image with many voices included is available.
OpenTTS: uses a remote OpenTTS server and supports many different text to speech systems, such as Mozilla TTS.
Google WaveNet: requires a Google account and an internet connection to function.