desktop-mcp — the same discipline, past the API layer
Python · 121 passing tests · 4 tool groups · input off by default · rate-capped
The problem
Not every integration is an HTTP API. Sometimes the thing an agent needs to drive is the desktop itself — read a window, capture the screen, record a flow, or click through an app that has no API at all. That surface is far more dangerous to expose than a REST endpoint: an injected keystroke or mouse move touches the live machine. The question a client has is whether you can hand an agent that reach and keep it bounded.
What I built
desktop-mcp is a Windows desktop-control MCP server — 18 tools across four capability groups, each independently gated, built to the same standard as the API servers:
- observe (always on): screenshot, list/inspect windows. Read-only, safe by construction. Source: desktop_mcp/groups/observe.py
- window / record (env-gated): focus/move/resize windows; ffmpeg screen recording with an auto-stop cap. Off unless explicitly enabled. Source: groups/window.py, groups/record.py
- input (OFF by default): mouse and keyboard injection, refused with a structured error unless DESKTOP_MCP_ENABLE_INPUT=1, and rate-capped even when on. The most dangerous group is the one you have to deliberately turn on. Source: groups/input.py
- honest limits: the README names what it can't do (elevated windows via Windows UIPI, the UAC secure desktop, single-machine scope) instead of pretending they don't exist.
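The gating described in the list above can be sketched roughly as follows. This is a minimal illustration, not the repository's code: only DESKTOP_MCP_ENABLE_INPUT is named in the text, so the window and record variable names, the helper names (`group_enabled`, `call_tool`), and the shape of the refusal payload are all assumptions.

```python
import os

# Map each capability group to the env var that enables it.
# Only DESKTOP_MCP_ENABLE_INPUT comes from the text above; the window and
# record variable names are hypothetical placeholders.
GROUP_FLAGS = {
    "observe": None,                        # always on, read-only
    "window": "DESKTOP_MCP_ENABLE_WINDOW",  # hypothetical name
    "record": "DESKTOP_MCP_ENABLE_RECORD",  # hypothetical name
    "input": "DESKTOP_MCP_ENABLE_INPUT",
}


def group_enabled(group, env=None):
    """Return True only if the group exists and is always on or set to '1'."""
    env = os.environ if env is None else env
    if group not in GROUP_FLAGS:
        return False
    flag = GROUP_FLAGS[group]
    return flag is None or env.get(flag) == "1"


def policy_refusal(group, tool):
    """Build a structured refusal (assumed shape) instead of failing silently."""
    flag = GROUP_FLAGS.get(group)
    hint = f"set {flag}=1 to enable" if flag else "unknown group"
    return {"ok": False, "error": "policy_refusal",
            "group": group, "tool": tool, "hint": hint}


def call_tool(group, tool, handler, env=None):
    """Run the handler only when its group is enabled; otherwise refuse."""
    if not group_enabled(group, env):
        return policy_refusal(group, tool)
    return {"ok": True, "result": handler()}
```

With an empty environment, `call_tool("input", "click", ...)` returns the refusal dictionary without running the handler, while an observe tool runs unconditionally. Passing `env` explicitly keeps the check testable without touching the real process environment.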
Evidence you can check yourself
- Repo: github.com/jaimenbell/desktop-mcp
- Tests: python -m pytest → 121 passed, 2 skipped, 0 failed (the two skips are live smokes that produce a real screenshot PNG and a real screen-recording mp4, gated behind an env flag)
- Safety: every disabled group returns a structured policy_refusal, never a silent no-op, never a crash. Input actions are additionally capped at 60/min by default.
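The text gives the default cap (60 input actions per minute) but not the mechanism behind it. One plausible way to enforce such a cap is a sliding window over recent timestamps. The sketch below assumes that approach; the class name `RateCap` and its parameters are hypothetical and not taken from the repository.

```python
import time
from collections import deque


class RateCap:
    """Sliding-window limiter: at most max_actions per window_s seconds."""

    def __init__(self, max_actions=60, window_s=60.0, clock=time.monotonic):
        self.max_actions = max_actions
        self.window_s = window_s
        self._clock = clock      # injectable so tests need not sleep
        self._stamps = deque()   # timestamps of recently allowed actions

    def try_acquire(self):
        """Record and allow the action if under the cap, else return False."""
        now = self._clock()
        # Drop timestamps that have aged out of the window.
        while self._stamps and now - self._stamps[0] >= self.window_s:
            self._stamps.popleft()
        if len(self._stamps) >= self.max_actions:
            return False
        self._stamps.append(now)
        return True
```

A caller would check `try_acquire()` before each injected click or keystroke and return a structured refusal when it comes back False. Using a monotonic clock avoids surprises if the wall clock is adjusted while the server is running.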
Built and shipped in a single day with a full test suite and an adversarial review pass before it went public. The same gate-the-dangerous-surface-by-default pattern as the API servers — the safe default is the one you opt out of, not into.
What it shows about how I work
Range plus restraint. The MCP work isn't limited to wrapping REST APIs — it extends to OS-level control when a client needs it — and the safety discipline is identical whichever surface it is: capability groups, off-by-default on anything that writes or acts, structured refusals, tests that run on a clean checkout.