agent-browser: Electron App Automation via CDP
Control Electron applications automatically using Chrome DevTools Protocol through agent-browser, enabling UI automation for desktop apps like Slack, VS Code, and Discord.
Automate Any Electron App with agent-browser and Chrome DevTools Protocol
If you've ever wished you could automate Slack, VS Code, Discord, or Notion the same way you automate a webpage — without wrestling with fragile official APIs or complex native UI frameworks — you're about to have a very good day.
A powerful but underappreciated insight is quietly changing how developers think about desktop automation: almost every Electron application can expose a Chrome DevTools Protocol (CDP) port when it is launched with the right flag, and tools like agent-browser can connect to it directly. The result is programmatic control over most Electron-based desktop apps, using the exact same mechanism you'd use to drive a browser.
This post breaks down how it works, why it matters, and how you can start automating desktop apps today.
Why Electron Apps Are Actually Just Chrome in Disguise
Here's the mental model shift that makes everything click: Electron is Chromium. Not metaphorically, literally. Electron bundles Chromium (the open-source engine behind Chrome) and Node.js together, wrapping web-based UIs inside a desktop application shell. When you open Slack or VS Code, you're essentially launching a dedicated Chromium instance running a specific web app.
This means everything Chrome exposes for debugging and automation — the Chrome DevTools Protocol — is available in Electron apps too. CDP is a low-level protocol that lets you inspect DOM elements, execute JavaScript, take screenshots, simulate clicks and keyboard input, and much more. Browser automation tools like Puppeteer and Playwright are built on top of it.
The key insight: you don't need a special Electron automation framework. You just need to unlock CDP access, which Electron supports natively via a startup flag:
--remote-debugging-port=9222
Once that port is open, any CDP-compatible client — including agent-browser — can connect and treat the Electron app exactly like a browser tab.
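Before pointing any client at the port, it can help to confirm the app is actually listening. Here is a minimal sketch, assuming curl is installed and the app was started with the flag above. The helper names cdp_endpoint and cdp_is_up are made up for illustration; /json/version and /json/list are the HTTP discovery endpoints Chromium serves on the debugging port.

```shell
# Hypothetical helpers for checking a CDP debugging port (names are illustrative).

# Build the URL of a CDP HTTP discovery endpoint ("version" by default, or "list").
cdp_endpoint() {
  printf 'http://127.0.0.1:%s/json/%s\n' "$1" "${2:-version}"
}

# Succeed (exit 0) only if something answers on the discovery endpoint.
cdp_is_up() {
  curl --silent --fail --max-time 2 "$(cdp_endpoint "$1" version)" >/dev/null 2>&1
}

# Example usage (commented out; run it once the app is up with the flag):
# cdp_is_up 9222 && curl --silent "$(cdp_endpoint 9222 list)"
```

If cdp_is_up fails, the usual culprit is that the app was already running when the flag was passed, so it never picked it up.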
Setting Up agent-browser for Electron Automation
agent-browser is an open-source tool from Vercel Labs (available at agent-browser.dev and GitHub) designed specifically for AI-driven browser and desktop automation. It ships with a growing library of skills — structured instruction sets that teach an AI agent how to interact with specific applications. Two particularly useful ones are the Electron Skill and the Slack Skill.
Step 1: Launch Your Electron App with CDP Enabled
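On macOS, pass the debugging port flag at startup using open. Quit the app first if it is already running; otherwise open just brings the existing instance to the front and the flag never reaches it: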
# Launch Slack with CDP on port 9222
open -a "Slack" --args --remote-debugging-port=9222
# Launch VS Code with CDP on port 9223
open -a "Visual Studio Code" --args --remote-debugging-port=9223
# Launch Discord
open -a "Discord" --args --remote-debugging-port=9224
On Linux and Windows, you can pass the flag directly to the application binary:
# Linux
/usr/bin/slack --remote-debugging-port=9222
# Windows (PowerShell)
& "C:\Users\you\AppData\Local\slack\slack.exe" --remote-debugging-port=9222
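Desktop apps often take a few seconds to bind the debugging port after launch, so a script that connects immediately may fail. One way to handle that is to poll the port first. This is only a sketch: wait_for_cdp is a hypothetical helper, and it assumes curl is available.

```shell
# Hypothetical helper: poll the CDP discovery endpoint until it answers.
# $1 = port, $2 = maximum number of attempts (default 15, about one per second).
wait_for_cdp() {
  port="$1"
  attempts="${2:-15}"
  while [ "$attempts" -gt 0 ]; do
    if curl --silent --fail --max-time 1 "http://127.0.0.1:${port}/json/version" >/dev/null 2>&1; then
      return 0
    fi
    sleep 1
    attempts=$((attempts - 1))
  done
  return 1
}

# Example usage on macOS (commented out so the snippet has no side effects):
# open -a "Slack" --args --remote-debugging-port=9222
# wait_for_cdp 9222 20 && agent-browser connect 9222
```

The attempt limit keeps a script from hanging forever when the app ignores the flag or fails to start.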
Step 2: Connect with agent-browser
Once the app is running with the debugging port open, connecting is a single command:
agent-browser connect 9222
From here, you have full access to the application's UI. You can take a snapshot to get a structured representation of all visible UI elements, interact with them by reference, and capture screenshots at any point:
# Inspect the current UI state
agent-browser snapshot -i
# Click a specific element (by its ref from the snapshot)
agent-browser click @e5
# Type into an input field (also addressed by its ref)
agent-browser type @e5 "Hello from automation"
# Take a screenshot
agent-browser screenshot output.png
The snapshot -i command is particularly powerful for AI-driven workflows — it returns a structured, machine-readable map of every interactive element on screen, which a language model can reason about to determine next steps.
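For scripted use without a language model in the loop, the same snapshot can be searched for a label to recover the ref to act on. A rough sketch follows. It assumes snapshot lines look something like `- button "Send" [ref=e7]`, which you should check against what your installed version prints, and find_ref is a hypothetical helper rather than part of agent-browser.

```shell
# Hypothetical helper: read snapshot text on stdin and print the ref (as @eN)
# of the first line that contains the given label.
# ASSUMPTION: snapshot lines carry refs in the form [ref=eN].
find_ref() {
  grep -F -- "$1" | sed -n 's/.*\[ref=\(e[0-9][0-9]*\)].*/@\1/p' | head -n 1
}

# Example usage against a live app (commented out; needs a connected session):
# ref=$(agent-browser snapshot -i | find_ref 'Send')
# [ -n "$ref" ] && agent-browser click "$ref"
```

Matching on visible labels is fragile when an app update renames a button, so treat this as a convenience for quick scripts rather than a stable selector strategy.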
Real-World Automation Use Cases
The combination of CDP access and agent-browser's skill system opens up automation scenarios that would otherwise require official API access, OAuth tokens, or brittle screen-scraping hacks. Here are some of the most compelling applications:
Customer Support Automation via WhatsApp Desktop
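WhatsApp Desktop was long distributed as an Electron app, though newer Windows and macOS releases have moved away from Electron, so confirm which build you have installed. Where the Electron build is in use, you can connect to it via CDP and automate message reading and sending without the WhatsApp Business API. For small businesses or teams running customer support through WhatsApp, this unlocks: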
- Auto-routing incoming messages to the right team member
- Sending templated replies to common queries
- Logging conversations to a CRM in real time
- Monitoring for keywords or urgent requests
Since the automation runs inside an already-authenticated session, there's no need to deal with API keys, webhooks, or approval processes.
Slack Workflow Automation
The dedicated Slack Skill in agent-browser provides structured guidance for automating common Slack tasks:
- Sending scheduled messages to channels or DMs
- Monitoring channels for specific keywords and triggering downstream actions
- Extracting message threads for summarization or archiving
- Automating standup updates or status reports
This approach preserves your existing login session and works within Slack's actual UI — meaning it can do anything a human user can do, including interactions that the Slack API explicitly doesn't support.
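As one illustration of the keyword-monitoring idea, a script can poll the snapshot and filter it for terms of interest. This is a sketch under assumptions: match_keywords is a hypothetical helper, and it presumes that visible message text appears in the snapshot output, which may not hold for every app or view.

```shell
# Hypothetical helper: read text on stdin and print the lines that mention
# any of the given keywords (case-insensitive, fixed strings, not regexes).
# Exits non-zero when nothing matches, like grep itself.
match_keywords() {
  # grep treats a newline-separated pattern list as "match any of these".
  grep -i -F -e "$(printf '%s\n' "$@")"
}

# Example polling loop (commented out; needs a connected session).
# ASSUMPTION: visible message text shows up in the snapshot output.
# while sleep 30; do
#   agent-browser snapshot | match_keywords "urgent" "outage" && echo "keyword seen"
# done
```

A loop like this reports the same message on every poll for as long as it stays on screen; remembering what has already been seen is left out to keep the sketch short.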
Code Editor and Development Workflow Automation
VS Code's CDP exposure is particularly interesting for development automation:
- Triggering build tasks, terminal commands, or test runs
- Navigating between files and injecting code snippets
- Capturing editor state for AI-assisted code review pipelines
- Automating repetitive refactoring tasks across a codebase
Pair this with a Claude Code integration (or any LLM that can reason about UI snapshots), and you have a foundation for genuinely autonomous coding agents that operate inside a real developer environment.
Knowledge Management with Notion and Obsidian
Both Notion and Obsidian have Electron-based desktop apps, making them fully automatable via this approach:
- Auto-creating and populating note templates
- Syncing content between tools
- Extracting structured data from notes for downstream processing
- Bulk editing or tagging existing content
Why This Beats Official APIs for Many Use Cases
There are three distinct advantages to CDP-based Electron automation that make it worth understanding deeply:
1. No API required. Many desktop apps — WhatsApp, Discord in certain configurations, older Slack features — either don't have APIs, have severely rate-limited ones, or restrict access behind expensive enterprise tiers. CDP bypasses all of that.
2. Session persistence. Because you're connecting to a running, authenticated app, you inherit the user's existing login state. No OAuth flows. No token management. No re-authentication headaches.
3. Complete UI parity. An API can only do what its designers chose to expose. CDP-based automation can do literally anything a human user can do in the UI — including interacting with features the app doesn't officially surface via API.
The tradeoff is that this approach is best suited for local or semi-automated workflows rather than large-scale production systems. It requires the desktop app to be running, and it's inherently tied to the specific UI version installed. But for developer tooling, AI-assisted workflows, and personal productivity automation, those constraints are rarely a problem.
Getting Started: Your Next Steps
Here's a practical starting point for exploring Electron app automation with agent-browser:
# Clone the agent-browser repository
git clone https://github.com/vercel-labs/agent-browser
cd agent-browser
# Review the Electron skill documentation
cat skills/electron/SKILL.md
# Review the Slack-specific skill
cat skills/slack/SKILL.md
# Install the CLI globally so the agent-browser command is on your PATH
npm install -g agent-browser
From there, pick one Electron app you use daily — Slack, Notion, or VS Code are great starting points — launch it with --remote-debugging-port, connect with agent-browser, and run snapshot -i to see what's exposed.
The surface area for automation is much larger than most developers realize. If it runs on Electron, it's automatable — and agent-browser gives you a clean, AI-friendly interface to make it happen.
Interested in building OpenClaw skills around Electron automation? ClawList.io is tracking the best emerging patterns in AI-driven desktop automation. Stay tuned for follow-up posts on WhatsApp Desktop automation, VS Code agent workflows, and integrating these skills with Claude Code.