How-To Series · Episode 20 / 59 · Module 4: Eyes, Ears, Voice
Hermes · Browser Automation
Real browser. Real navigation. Real form fills. Six backends to pick from.
After this video
You can now let Hermes drive a real browser for you.
Pages render as accessibility trees with ref IDs (@e1, @e2), and the agent clicks, types, scrolls, and takes screenshots like a person would. There are six backends: Browserbase, Browser Use, Firecrawl (cloud), Camofox (local anti-detection), Local CDP (attach to your Chrome/Brave/Edge via /browser connect), and agent-browser (clean local Chromium). The easiest path is the Portal Tool Gateway, which needs no keys. To use your own browser session, run /browser connect. For a pure cloud setup, a single Browserbase key is enough. Sessions are isolated and cleaned up automatically. Pair browser automation with vision so the agent can interpret screenshots.
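To make the ref-ID idea concrete, here is a minimal toy sketch in Python. It is not Hermes's implementation or API: the `Node` class, the `snapshot` function, the `INTERACTIVE` role set, and the rule that only interactive elements get a ref are all assumptions made for illustration. It only shows the general pattern of flattening a page into text lines while tagging actionable elements with IDs such as @e1 that an agent can refer to later.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Node:
    """Hypothetical accessibility-tree node (illustration only)."""
    role: str
    name: str = ""
    children: List["Node"] = field(default_factory=list)


# Assumption: only these roles are treated as actionable and receive a ref ID.
INTERACTIVE = {"link", "button", "textbox"}


def snapshot(root: Node) -> Tuple[List[str], Dict[str, Node]]:
    """Flatten the tree into indented text lines plus a ref -> node map."""
    lines: List[str] = []
    refs: Dict[str, Node] = {}

    def walk(node: Node, depth: int) -> None:
        label = f'{node.role} "{node.name}"' if node.name else node.role
        if node.role in INTERACTIVE:
            ref = f"@e{len(refs) + 1}"  # refs are numbered in document order
            refs[ref] = node
            label += f" [{ref}]"
        lines.append("  " * depth + label)
        for child in node.children:
            walk(child, depth + 1)

    walk(root, 0)
    return lines, refs


# A made-up sign-in page used as sample input.
page = Node("document", "Sign in", [
    Node("heading", "Welcome back"),
    Node("textbox", "Email"),
    Node("textbox", "Password"),
    Node("button", "Sign in"),
])

lines, refs = snapshot(page)
print("\n".join(lines))
```

In this sketch an instruction like "type into @e1" resolves through `refs["@e1"]` to the Email textbox, which is the reason a text snapshot with stable IDs is convenient for an agent: it can name a target without relying on pixel coordinates.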
About these resources. Every command in this video comes from the Browser Automation doc. The Tool Gateway doc is cited for the Portal-managed path.
Sources · What this video distills
2 docs pages · every command below traces to one of them
Commands shown · Copy and paste
each shows the source doc it came from
hermes setup --portal && hermes model
/browser connect
echo "BROWSERBASE_API_KEY=***" >> ~/.hermes/.env
Going deeper · Related Hermes docs
further reading · not sources of facts shown above
Next in the series · Episodes that build on this
E22 · Vision · Images
E11 · Tools & Toolsets
E53 · Tool Gateway