How-To Series · Episode 20 / 59 · Module 4: Eyes, Ears, Voice

Hermes · Browser Automation

Real browser. Real navigation. Real form fills. Six backends to pick from.

After this video · You can now let Hermes drive a real browser for you.

Pages render as accessibility trees whose elements carry ref IDs (@e1, @e2). The agent clicks, types, scrolls, and takes screenshots the way a person would.

Six backends: Browserbase, Browser Use, and Firecrawl (all cloud); Camofox (local, anti-detection); Local CDP (attaches to your own Chrome, Brave, or Edge via /browser connect); and agent-browser (clean local Chromium).

Three ways in: the easiest path is the Portal Tool Gateway (no keys); to reuse your own browser session, use /browser connect; for pure cloud, set one Browserbase key. Session isolation and auto-cleanup happen automatically. Pair with vision for screenshot understanding.
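To make the ref ID idea concrete, here is a minimal shell sketch that pulls ref IDs out of a snapshot. The snapshot text and the helper names list_refs and ref_for_label are hypothetical, made up for illustration; the layout Hermes actually prints may differ. Only the @e1, @e2 ref format comes from the description above.

```shell
# Hypothetical snapshot text (an assumption, not real Hermes output).
snapshot='- heading "Sign in" [@e1]
- textbox "Email" [@e2]
- textbox "Password" [@e3]
- button "Continue" [@e4]'

# List every ref ID that appears in a snapshot, one per line.
list_refs() {
  printf '%s\n' "$1" | grep -oE '@e[0-9]+'
}

# Print the ref ID on the first line whose text contains the given label.
ref_for_label() {
  printf '%s\n' "$1" | grep -F -- "$2" | grep -oE '@e[0-9]+' | head -n 1
}

list_refs "$snapshot"
ref_for_label "$snapshot" '"Email"'
```

The point of the sketch is that an action such as a click or a keystroke can target a short, stable handle like @e2 rather than pixel coordinates or a CSS selector.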

About these resources. Every command in this video comes from the Browser Automation doc. The Tool Gateway doc is cited for the Portal-managed path.

Sources · What this video distills

2 docs pages · every command below traces to one of them
Primary · six backends, accessibility tree, session isolation, auto cleanup
Browser Automation
Portal-managed browser path with no API keys
Tool Gateway

Commands shown · Copy and paste

each shows the source doc it came from
Portal path · from source
hermes setup --portal && hermes model
Attach to your browser · from source
/browser connect
Cloud key (Browserbase) · from source
echo "BROWSERBASE_API_KEY=***" >> ~/.hermes/.env
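One caveat with the append form above: each run of echo ... >> adds another line, so repeating it leaves duplicate entries in the file. A minimal sketch of a repeat-safe alternative follows. The helper name set_env_var is hypothetical and is not a Hermes command; only the file path and variable name come from the command above.

```shell
# Set KEY=VALUE in an env file, replacing any existing line for KEY
# so that repeated runs do not stack duplicate entries.
# set_env_var is a hypothetical helper, not part of Hermes.
set_env_var() {
  file=$1 key=$2 value=$3
  mkdir -p "$(dirname "$file")"
  touch "$file"
  tmp=$(mktemp)
  # Keep every line that does not define KEY, then append the new value.
  grep -v "^${key}=" "$file" > "$tmp" || true
  printf '%s=%s\n' "$key" "$value" >> "$tmp"
  mv "$tmp" "$file"
}

# Usage (the value is a placeholder for your own key):
# set_env_var "$HOME/.hermes/.env" BROWSERBASE_API_KEY "your-key-here"
```

Writing to a temporary file and moving it into place means the original is only replaced once the new contents are complete.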

Going deeper · Related Hermes docs

further reading · not sources of facts shown above

Next in the series · Episodes that build on this

E22: Vision · Images
E11: Tools & Toolsets
E53: Tool Gateway