How-To Series · Episode 22 / 59 · Module 4: Eyes, Ears, Voice
Hermes · Vision & Paste
Paste a screenshot. Ask. The agent reads it.
After this videoYou can now hand the agent any image and have it work from there.
Copy any image to your clipboard, hit /paste (or Ctrl/Cmd+V), type your question, send. The image goes to the model as a base64 vision content block. Multiple attachments work (Ctrl+C clears them). Three attach modes: /paste (most reliable), Cmd/Ctrl+V (layered), and Mac screenshot/file:// auto-attach. OS clipboard helpers: Mac needs nothing, Linux X11 needs xclip, Wayland needs wl-clipboard, WSL2 works via PowerShell, SSH does not work. /terminal-setup fixes IDE-terminal key intercepts in VS Code, Cursor, and Windsurf. Every pasted image saves to ~/.hermes/images/ for later re-attach with @.
About these resources. Every command in this video comes from the Vision & Image Paste doc.
Sources · What this video distills
1 docs page · every command below traces to one of themCommands shown · Copy and paste
each shows the source doc it came from/paste/terminal-setupsudo apt install xclipsudo apt install wl-clipboardbrew install pngpasteGoing deeper · Related Hermes docs
further reading · not sources of facts shown aboveNext in the series · Episodes that build on this
E15
Drop Files With @
E20
Browser Automation
E23
Image Generation