desktop-mcp — the same discipline, past the API layer

// public repo · 18 tools · 121 passing tests · input off by default

Python · 121 passing tests · 4 tool groups · input off by default · rate-capped

The problem

Not every integration is an HTTP API. Sometimes the thing an agent needs to drive is the desktop itself — read a window, capture the screen, record a flow, or click through an app that has no API at all. That surface is far more dangerous to expose than a REST endpoint: an injected keystroke or mouse move touches the live machine. The question a client has is whether you can hand an agent that reach and keep it bounded.

What I built

desktop-mcp is a Windows desktop-control MCP server — 18 tools across four capability groups, each independently gated, built to the same standard as the API servers:

observe (always on): screenshot, list/inspect windows. Read-only, safe by construction. Source: desktop_mcp/groups/observe.py
window / record (env-gated): focus/move/resize windows; ffmpeg screen recording with an auto-stop cap. Off unless explicitly enabled. Source: groups/window.py, groups/record.py
input (OFF by default): mouse and keyboard injection, refused with a structured error unless DESKTOP_MCP_ENABLE_INPUT=1, and rate-capped even when on. The most dangerous group is the one you have to deliberately turn on. Source: groups/input.py
honest limits: the README names what it can't do (elevated windows via Windows UIPI, the UAC secure desktop, single-machine scope) instead of pretending those limits don't exist.
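
The gating described in the list above can be sketched roughly as follows. This is a minimal illustration, not the repo's actual code: only DESKTOP_MCP_ENABLE_INPUT is named in the text, so the window and record variable names, the call_tool helper, and the shape of the refusal payload are all assumptions made for the example.

```python
import os
from typing import Callable, Mapping, Optional

# Groups that need no flag. "observe" is read-only per the description above.
ALWAYS_ON = {"observe"}

# Gated groups and the env var that enables each one.
# Only DESKTOP_MCP_ENABLE_INPUT comes from the text; the other two are assumed names.
GROUP_ENV_FLAGS = {
    "window": "DESKTOP_MCP_ENABLE_WINDOW",  # assumed name
    "record": "DESKTOP_MCP_ENABLE_RECORD",  # assumed name
    "input": "DESKTOP_MCP_ENABLE_INPUT",
}


def group_enabled(group: str, env: Mapping[str, str]) -> bool:
    """Return True only for always-on groups or groups whose flag is set to "1"."""
    if group in ALWAYS_ON:
        return True
    flag = GROUP_ENV_FLAGS.get(group)
    # Unknown groups fall through to "disabled" rather than "enabled".
    return flag is not None and env.get(flag) == "1"


def policy_refusal(group: str, tool: str) -> dict:
    """Build a structured refusal (hypothetical payload shape)."""
    flag = GROUP_ENV_FLAGS.get(group)
    return {
        "ok": False,
        "error": "policy_refusal",
        "group": group,
        "tool": tool,
        "hint": f"set {flag}=1 to enable" if flag else "unknown group",
    }


def call_tool(
    group: str,
    tool: str,
    handler: Callable[[], object],
    env: Optional[Mapping[str, str]] = None,
) -> dict:
    """Run the handler only if its group is enabled; otherwise refuse explicitly."""
    env = os.environ if env is None else env
    if not group_enabled(group, env):
        return policy_refusal(group, tool)
    return {"ok": True, "result": handler()}
```

With an empty environment, a call such as call_tool("input", "click", handler, env={}) returns the refusal dictionary and never runs the handler. That is the point of the pattern: the caller gets a reason and a remedy instead of silence.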

Evidence you can check yourself

Repo: github.com/jaimenbell/desktop-mcp
Tests: python -m pytest → 121 passed, 2 skipped, 0 failed. The two skips are live smoke tests that produce a real screenshot PNG and a real screen-recording mp4; they run only when an env flag is set.
Safety: every disabled group returns a structured policy_refusal — never a silent no-op, never a crash. Input actions are additionally capped at 60/min by default.
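
One way to enforce a per-minute cap like the one mentioned above is a sliding-window counter. The sketch below is an assumption about how such a cap could work, not a description of the repo's implementation: the RateCap class, its parameter names, and the choice of a sliding window are all hypothetical. The text also does not say what happens to an action over the cap, so the sketch simply reports whether the action is allowed.

```python
import time
from collections import deque
from typing import Callable, Deque


class RateCap:
    """Sliding-window cap: at most max_actions within any window_s seconds."""

    def __init__(
        self,
        max_actions: int = 60,      # matches the 60/min default named above
        window_s: float = 60.0,
        clock: Callable[[], float] = time.monotonic,  # injectable for tests
    ) -> None:
        self.max_actions = max_actions
        self.window_s = window_s
        self._clock = clock
        self._stamps: Deque[float] = deque()

    def try_acquire(self) -> bool:
        """Record one action if under the cap; return False if the cap is hit."""
        now = self._clock()
        # Drop timestamps that have aged out of the window.
        while self._stamps and now - self._stamps[0] >= self.window_s:
            self._stamps.popleft()
        if len(self._stamps) >= self.max_actions:
            return False
        self._stamps.append(now)
        return True
```

A caller would check try_acquire() before each injected input action and return a structured refusal when it comes back False. Using time.monotonic avoids miscounting if the wall clock is adjusted mid-session.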

Built and shipped in a single day with a full test suite and an adversarial review pass before it went public. It follows the same gate-the-dangerous-surface-by-default pattern as the API servers: the safe configuration is the default, and anything dangerous is something you opt into.

What it shows about how I work

Range plus restraint. The MCP work isn't limited to wrapping REST APIs; it extends to OS-level control when a client needs it. The safety discipline is identical on either surface: capability groups, off-by-default on anything that writes or acts, structured refusals, tests that run on a clean checkout.
