agent-browser: Electron App Automation via CDP
Control Electron applications automatically using Chrome DevTools Protocol through agent-browser, enabling UI automation for desktop apps like Slack, VS Code, and Discord.
Automate Any Electron App with agent-browser and Chrome DevTools Protocol
If you've ever wished you could automate Slack, VS Code, Discord, or Notion the same way you automate a webpage — without wrestling with fragile official APIs or complex native UI frameworks — you're about to have a very good day.
An underappreciated fact is changing how developers think about desktop automation: almost every Electron application can expose a Chrome DevTools Protocol (CDP) port when it is launched with the right flag, and tools like agent-browser can connect to it directly. The result is programmatic control over the app's UI, using the same mechanism you'd use to drive a browser.
This post breaks down how it works, why it matters, and how you can start automating desktop apps today.
Why Electron Apps Are Actually Just Chrome in Disguise
Here's the mental model shift that makes everything click: under the hood, an Electron app is Chromium. Electron bundles Chromium and Node.js together, wrapping web-based UIs inside a desktop application shell. When you open Slack or VS Code, you're essentially launching a dedicated Chromium instance running a specific web app.
This means everything Chrome exposes for debugging and automation — the Chrome DevTools Protocol — is available in Electron apps too. CDP is a low-level protocol that lets you inspect DOM elements, execute JavaScript, take screenshots, simulate clicks and keyboard input, and much more. Browser automation tools like Puppeteer and Playwright are built on top of it.
The key insight: you don't need a special Electron automation framework. You just need to unlock CDP access, which Electron supports natively via a startup flag:
--remote-debugging-port=9222
Once that port is open, any CDP-compatible client — including agent-browser — can connect and treat the Electron app exactly like a browser tab.
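A quick way to confirm the port really is open, before involving any automation tool, is to ask the debugging server for its metadata over plain HTTP. The sketch below assumes curl is installed and that the app listens on localhost; cdp_endpoint and cdp_check are hypothetical helper names used for illustration, not part of agent-browser.

```shell
#!/usr/bin/env bash
# Build the URL of a CDP HTTP discovery endpoint on localhost.
# cdp_endpoint is a hypothetical helper name, not part of agent-browser.
cdp_endpoint() {
  local port="$1" path="${2:-version}"
  printf 'http://127.0.0.1:%s/json/%s\n' "$port" "$path"
}

# Print the browser metadata if something answers on the port,
# or a hint on stderr if nothing does.
cdp_check() {
  local port="$1"
  if curl --silent --fail --max-time 2 "$(cdp_endpoint "$port")"; then
    echo
  else
    echo "No CDP endpoint on port $port (is the app running with the flag?)" >&2
    return 1
  fi
}
```

Running cdp_check 9222 against a correctly launched app should print a small JSON document; cdp_endpoint 9222 list gives the URL that enumerates the app's debuggable windows.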
Setting Up agent-browser for Electron Automation
agent-browser is an open-source tool from Vercel Labs (available at agent-browser.dev and GitHub) designed specifically for AI-driven browser and desktop automation. It ships with a growing library of skills — structured instruction sets that teach an AI agent how to interact with specific applications. Two particularly useful ones are the Electron Skill and the Slack Skill.
Step 1: Launch Your Electron App with CDP Enabled
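On macOS, pass the debugging port flag at startup using open. Quit the app first if it's already running; otherwise open just brings the existing instance to the front and the flag is ignored: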
# Launch Slack with CDP on port 9222
open -a "Slack" --args --remote-debugging-port=9222
# Launch VS Code with CDP on port 9223
open -a "Visual Studio Code" --args --remote-debugging-port=9223
# Launch Discord
open -a "Discord" --args --remote-debugging-port=9224
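If you debug several apps at once, it can help to keep the port assignments in one place so two apps never compete for the same port. This is a minimal sketch using the three apps and ports from the commands above; port_for_app and launch_command are hypothetical helper names, and the function prints the macOS command rather than running it so you can review it first.

```shell
#!/usr/bin/env bash
# Map an app name to the debugging port used in the examples above.
# port_for_app and launch_command are hypothetical helper names.
port_for_app() {
  case "$1" in
    "Slack")              echo 9222 ;;
    "Visual Studio Code") echo 9223 ;;
    "Discord")            echo 9224 ;;
    *) echo "no port assigned for: $1" >&2; return 1 ;;
  esac
}

# Print (rather than run) the macOS launch command for an app.
launch_command() {
  local app="$1" port
  port="$(port_for_app "$app")" || return 1
  printf 'open -a "%s" --args --remote-debugging-port=%s\n' "$app" "$port"
}
```

To actually launch, pipe the output to a shell after checking it, for example launch_command "Slack" | sh.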
On Linux and Windows, you can pass the flag directly to the application binary:
# Linux
/usr/bin/slack --remote-debugging-port=9222
# Windows (PowerShell)
& "C:\Users\you\AppData\Local\slack\slack.exe" --remote-debugging-port=9222
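Whichever platform you launch on, the app may need a few seconds before the debugging port starts answering, so a script that connects immediately can fail intermittently. A small retry loop is one way to handle that. The sketch below assumes curl is available; retry and cdp_ready are hypothetical helper names introduced here.

```shell
#!/usr/bin/env bash
# Run a command repeatedly until it succeeds or the attempts run out.
# Usage: retry <attempts> <delay-seconds> <command...>
retry() {
  local attempts="$1" delay="$2"
  shift 2
  local i=1
  while [ "$i" -le "$attempts" ]; do
    if "$@"; then
      return 0
    fi
    # Only sleep if another attempt is still to come.
    if [ "$i" -lt "$attempts" ]; then
      sleep "$delay"
    fi
    i=$((i + 1))
  done
  return 1
}

# Probe the CDP HTTP endpoint on a local port; succeeds once the app is listening.
cdp_ready() {
  curl --silent --fail --max-time 2 "http://127.0.0.1:$1/json/version" >/dev/null
}
```

A typical use would be retry 10 1 cdp_ready 9222, followed by the connect step only if it succeeds.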
Step 2: Connect with agent-browser
Once the app is running with the debugging port open, connecting is a single command:
agent-browser connect 9222
From here, you have full access to the application's UI. You can take a snapshot to get a structured representation of all visible UI elements, interact with them by reference, and capture screenshots at any point:
# Inspect the current UI state
agent-browser snapshot -i
# Click a specific element (by its ref from the snapshot)
agent-browser click @e5
# Type into that element
agent-browser type @e5 "Hello from automation"
# Take a screenshot
agent-browser screenshot output.png
The snapshot -i command is particularly powerful for AI-driven workflows — it returns a structured, machine-readable map of every interactive element on screen, which a language model can reason about to determine next steps.
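As a rough sketch of how a plain script (rather than a language model) might use that map, the helper below pulls out the ref for the first element whose label matches. The line format it expects, such as - button "Send" [ref=e5], and the name ref_for_label are assumptions for illustration; check them against the snapshot output of the version you have installed.

```shell
#!/usr/bin/env bash
# Read snapshot text on stdin and print the ref (as @eN) of the first
# line containing the given quoted label.
# Assumes lines look like: - button "Send" [ref=e5]
ref_for_label() {
  local label="$1"
  grep -F -- "\"$label\"" \
    | sed -n 's/.*\[ref=\([^]]*\)].*/@\1/p' \
    | head -n 1
}
```

With that in place, something like ref="$(agent-browser snapshot -i | ref_for_label "Send")" followed by agent-browser click "$ref" avoids hard-coding a ref that may change between snapshots.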
Real-World Automation Use Cases
The combination of CDP access and agent-browser's skill system opens up automation scenarios that would otherwise require official API access, OAuth tokens, or brittle screen-scraping hacks. Here are some of the most compelling applications:
Customer Support Automation via WhatsApp Desktop
WhatsApp Desktop has shipped as an Electron app (newer builds on some platforms are native, so check which one you have installed). With an Electron build, you can connect to it via CDP and automate message reading and sending without the WhatsApp Business API. For small businesses or teams running customer support through WhatsApp, this unlocks:
- Auto-routing incoming messages to the right team member
- Sending templated replies to common queries
- Logging conversations to a CRM in real time
- Monitoring for keywords or urgent requests
Since the automation runs inside an already-authenticated session, there's no need to deal with API keys, webhooks, or approval processes.
Slack Workflow Automation
The dedicated Slack Skill in agent-browser provides structured guidance for automating common Slack tasks:
- Sending scheduled messages to channels or DMs
- Monitoring channels for specific keywords and triggering downstream actions
- Extracting message threads for summarization or archiving
- Automating standup updates or status reports
This approach preserves your existing login session and works within Slack's actual UI — meaning it can do anything a human user can do, including interactions that the Slack API explicitly doesn't support.
Code Editor and Development Workflow Automation
VS Code's CDP exposure is particularly interesting for development automation:
- Triggering build tasks, terminal commands, or test runs
- Navigating between files and injecting code snippets
- Capturing editor state for AI-assisted code review pipelines
- Automating repetitive refactoring tasks across a codebase
Pair this with a Claude Code integration (or any LLM that can reason about UI snapshots), and you have a foundation for genuinely autonomous coding agents that operate inside a real developer environment.
Knowledge Management with Notion and Obsidian
Both Notion and Obsidian have Electron-based desktop apps, making them fully automatable via this approach:
- Auto-creating and populating note templates
- Syncing content between tools
- Extracting structured data from notes for downstream processing
- Bulk editing or tagging existing content
Why This Beats Official APIs for Many Use Cases
There are three distinct advantages to CDP-based Electron automation that make it worth understanding deeply:
1. No API required. Many desktop apps — WhatsApp, Discord in certain configurations, older Slack features — either don't have APIs, have severely rate-limited ones, or restrict access behind expensive enterprise tiers. CDP bypasses all of that.
2. Session persistence. Because you're connecting to a running, authenticated app, you inherit the user's existing login state. No OAuth flows. No token management. No re-authentication headaches.
3. Complete UI parity. An API can only do what its designers chose to expose. CDP-based automation can do literally anything a human user can do in the UI — including interacting with features the app doesn't officially surface via API.
The tradeoff is that this approach is best suited for local or semi-automated workflows rather than large-scale production systems. It requires the desktop app to be running, and it's inherently tied to the specific UI version installed. But for developer tooling, AI-assisted workflows, and personal productivity automation, those constraints are rarely a problem.
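Because the automation is tied to whatever build is installed, one defensive pattern is to record a fingerprint of the app when a script is written and refuse to run if it later changes. The sketch below reads the JSON returned by the debugging server's /json/version endpoint and compares its User-Agent field. Whether that field tracks UI changes closely enough for a given app is an assumption to verify, and json_field and fingerprint_matches are hypothetical helper names.

```shell
#!/usr/bin/env bash
# Extract a top-level string field from JSON on stdin.
# A deliberately small sed-based parser; use jq for anything more complex.
json_field() {
  local key="$1"
  sed -n "s/.*\"$key\"[[:space:]]*:[[:space:]]*\"\([^\"]*\)\".*/\1/p" | head -n 1
}

# Succeed only if the User-Agent in the JSON on stdin equals the expected value.
fingerprint_matches() {
  local expected="$1" actual
  actual="$(json_field "User-Agent")"
  [ -n "$actual" ] && [ "$actual" = "$expected" ]
}
```

In practice the input would come from curl --silent http://127.0.0.1:9222/json/version, with the expected string saved from a run where the automation was known to work.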
Getting Started: Your Next Steps
Here's a practical starting point for exploring Electron app automation with agent-browser:
# Clone the agent-browser repository
git clone https://github.com/vercel-labs/agent-browser
cd agent-browser
# Review the Electron skill documentation
cat skills/electron/SKILL.md
# Review the Slack-specific skill
cat skills/slack/SKILL.md
# Install dependencies and start experimenting
npm install
From there, pick one Electron app you use daily — Slack, Notion, or VS Code are great starting points — launch it with --remote-debugging-port, connect with agent-browser, and run snapshot -i to see what's exposed.
The surface area for automation is much larger than most developers realize. If an app runs on Electron and honors the debugging flag, it's very likely automatable, and agent-browser gives you a clean, AI-friendly interface to make it happen.
Interested in building OpenClaw skills around Electron automation? ClawList.io is tracking the best emerging patterns in AI-driven desktop automation. Stay tuned for follow-up posts on WhatsApp Desktop automation, VS Code agent workflows, and integrating these skills with Claude Code.