How-To Series · Episode 19 / 59 · Module 4: Eyes, Ears, Voice
Hermes · Voice Mode
Hold a key, talk to Hermes, hear it talk back. CLI, Telegram, Discord, and Discord VC.
After this video · You can now have a real spoken conversation with Hermes.
Voice mode lets you talk to Hermes on the CLI (push-to-talk with Ctrl+B, VAD silence detection), on Telegram and Discord (automatic voice replies alongside text), and in Discord voice channels (the bot joins, listens, and speaks back). There are three STT options: faster-whisper (local, no key, the default), Groq Whisper (free tier, fast), and OpenAI Whisper (paid). There are four TTS options: Tool Gateway (Portal, no keys), ElevenLabs (premium), Edge TTS (free), and NeuTTS (local). One caveat: voice mode does not work on Android via Termux, because ctranslate2, which faster-whisper depends on, has no wheel for Android.
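As a rough sketch of how the platform notes above fit with the install commands listed further down this page, the helper below only prints the commands for a given platform and does not run anything. The function name `voice_setup_cmds` and the platform labels (`macos`, `debian`, `termux`) are hypothetical and chosen for illustration; they are not part of Hermes. The sketch assumes the brew command is meant for macOS and the apt command for Debian or Ubuntu.

```shell
#!/bin/sh
# Dry-run helper: prints the voice-mode install commands for one platform.
# Hypothetical names throughout; nothing here is a Hermes command or API.
voice_setup_cmds() {
    platform="$1"
    case "$platform" in
        macos)
            # System audio libraries via Homebrew (command taken from this page).
            echo 'brew install portaudio ffmpeg opus'
            ;;
        debian)
            # System audio libraries via apt (command taken from this page).
            echo 'sudo apt install portaudio19-dev ffmpeg libopus0'
            ;;
        termux)
            # Caveat from this page: voice mode does not work on Android via Termux.
            echo 'voice mode is not supported on Android via Termux' >&2
            return 1
            ;;
        *)
            echo "unknown platform: $platform" >&2
            return 2
            ;;
    esac
    # The Python extra is the same on every supported platform.
    echo 'pip install "hermes-agent[voice]"'
}

# Example: show the commands for a Debian or Ubuntu machine.
voice_setup_cmds debian
```

Printing instead of executing keeps the sketch safe to paste and lets you review each command first. The page's remaining command, `/voice on`, is not a shell command and is presumably typed inside a Hermes session once the install has finished.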
About these resources. Every command in this video comes from the Voice Mode user-guide doc; the practical guide is referenced for setup walkthrough patterns.
Sources · What this video distills
2 docs pages · every command below traces to one of them
Commands shown · Copy and paste
each shows the source doc it came from
pip install "hermes-agent[voice]"
brew install portaudio ffmpeg opus
sudo apt install portaudio19-dev ffmpeg libopus0
pip install "hermes-agent[tts-premium]"
/voice on
Going deeper · Related Hermes docs
further reading · not sources of facts shown above
Next in the series · Episodes that build on this
E20 · Browser Automation
E24 · Hermes on Telegram
E25 · Hermes on Discord