MCP Integration

OpenFactory supports the Model Context Protocol (MCP), an open standard that lets AI assistants such as Claude, Cursor, and other MCP-compatible clients interact directly with OpenFactory. You can build OS images, manage VMs, run tests, and browse recipes, all through natural language.

Quick Start

Claude Desktop

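Add this to your Claude Desktop configuration file, claude_desktop_config.json (on macOS: ~/Library/Application Support/Claude/claude_desktop_config.json; on Windows: %APPDATA%\Claude\claude_desktop_config.json):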


{
  "mcpServers": {
    "openfactory": {
      "url": "https://console.openfactory.tech/api/mcp/sse"
    }
  }
}

Cursor

Add this to your Cursor MCP settings (.cursor/mcp.json):


{
  "mcpServers": {
    "openfactory": {
      "url": "https://console.openfactory.tech/api/mcp/sse"
    }
  }
}

Claude Code


claude mcp add --transport sse openfactory \
  "https://console.openfactory.tech/api/mcp/sse"

Claude.ai (Browser)

You can also connect OpenFactory directly in the Claude.ai web app:

1. Open claude.ai and go to Settings
2. Navigate to Customize → Connectors
3. Click the + button to add a new connector
4. Enter the SSE endpoint: https://console.openfactory.tech/api/mcp/sse
5. Name it OpenFactory and save

Once connected, you’ll see OpenFactory’s tools listed under the connector. Set tool permissions to Needs approval or Always allow based on your preference.

[Screenshot: OpenFactory MCP connector configured in Claude.ai settings, showing the available tools]

For desktop and command-line clients, restart the client after adding the server; OpenFactory’s tools should then be available.

Connection

Property	Value
SSE Endpoint	`https://console.openfactory.tech/api/mcp/sse`
Messages Endpoint	`https://console.openfactory.tech/api/mcp/messages`
Transport	SSE (Server-Sent Events)

The MCP server uses the SSE transport. Your MCP client connects to the SSE endpoint for server-to-client messages and posts to the messages endpoint for client-to-server messages. Most MCP clients handle this automatically.
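For readers wiring up a client by hand, the following is a minimal sketch of how the event stream from the SSE endpoint could be parsed. It assumes standard Server-Sent Events framing (`event:` and `data:` fields, with a blank line ending each event) and, as in the MCP SSE transport, that the server first sends an `endpoint` event naming the URL to post messages to. The `session_id` value in the sample stream is made up for illustration.

```python
from typing import Iterable, Iterator


def parse_sse(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one {"event", "data"} dict per SSE event found in `lines`."""
    event, data = "message", []  # "message" is the SSE default event type
    for raw in lines:
        line = raw.rstrip("\r\n")
        if not line:  # a blank line terminates the current event
            if data:
                yield {"event": event, "data": "\n".join(data)}
            event, data = "message", []
        elif line.startswith(":"):  # comment / keep-alive line
            continue
        else:
            field, _, value = line.partition(":")
            value = value[1:] if value.startswith(" ") else value
            if field == "event":
                event = value
            elif field == "data":
                data.append(value)


# Hypothetical stream: the session_id value is illustrative only.
sample = [
    "event: endpoint\n",
    "data: /api/mcp/messages?session_id=example\n",
    "\n",
    ": keep-alive\n",
    'data: {"jsonrpc": "2.0", "id": 1, "result": {}}\n',
    "\n",
]
events = list(parse_sse(sample))
```

In practice most MCP clients do this for you; the sketch is only meant to make the two-endpoint flow concrete.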

Your session is tied to your IP address, so your builds, VMs, and test results persist across requests.

Available Tools

OpenFactory exposes 29 tools across 5 categories.

Build Tools

Tool	Description	Key Parameters
`list_builds`	List your builds with status and features	`status_filter`, `limit`
`get_build`	Get full details of a build including its recipe	`build_id`
`get_build_status`	Get current status with stage information	`build_id`
`create_build`	Create a new build from a recipe	`recipe` (name, base_image, features, packages, services, networking, security)
`get_iso_download_url`	Get the download URL for a completed build	`build_id`
`retry_build`	Retry a failed build with the same recipe	`build_id`
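
Under the hood, an MCP client invokes these tools with a JSON-RPC `tools/call` request. The sketch below builds such a request for `create_build`. The envelope follows the MCP specification; the recipe values, and the assumption that `base_image` is a string and `packages` a list, are illustrative and not documented OpenFactory values.

```python
import json
from itertools import count

_ids = count(1)  # JSON-RPC request IDs should be unique within a session


def tool_call(name: str, arguments: dict) -> dict:
    """Build an MCP `tools/call` JSON-RPC request for the named tool."""
    return {
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


# Hypothetical recipe: the field names come from the table above,
# but the values and their types are assumptions.
recipe = {
    "name": "docker-server",
    "base_image": "debian-bookworm",
    "packages": ["docker.io", "openssh-server"],
}
request = tool_call("create_build", {"recipe": recipe})
payload = json.dumps(request)  # body a client would POST to the messages endpoint
```

Running `validate_recipe` with the same `recipe` argument first is a cheap way to catch mistakes before a build starts.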

Recipe Tools

Tool	Description	Key Parameters
`list_recipes`	Browse pre-built recipe templates	`category`, `search`
`list_recipe_categories`	List all recipe categories with descriptions	—
`get_recipe`	Get full recipe details including packages and services	`recipe_id`
`validate_recipe`	Validate a recipe without starting a build	`recipe`
`create_recipe_from_template`	Customize an existing recipe template	`template_id`, `name`, `modifications`

VM Tools

Tool	Description	Key Parameters
`list_vms`	List your running VMs	`build_id`
`create_vm`	Create and start a VM from a build	`build_id`, `name`, `memory_mb`, `vcpus`
`start_vm`	Start a stopped VM	`vm_name`
`stop_vm`	Stop a running VM (graceful or forced)	`vm_name`, `force`
`delete_vm`	Delete a VM and its resources	`vm_name`
`screenshot_vm`	Capture a PNG screenshot of the VM display	`vm_name`
`interact_vm`	Click, type, scroll, drag, and screenshot the VM GUI	`vm_name`, `actions` (JSON array), `screenshot_after`
`get_vm_console_url`	Get the VNC console URL for a VM	`vm_name`

Desktop Tools (Agent Desktop Interaction)

These tools are designed for AI agents running inside OpenFactory’s Direktor agent containers. Each agent has an assigned desktop VM (XFCE + Firefox) that it can control. Every tool combines the VM Interaction API with OmniParser — returning both a screenshot and a structured list of all detected UI elements (buttons, text, icons) with pixel coordinates. No need to call screenshot and parse separately.

Tool	Description	Key Parameters
`desktop_screenshot`	Screenshot + parse all UI elements. Call first to orient yourself.	`vm_name`
`desktop_click`	Click at coordinates, return screenshot + elements	`vm_name`, `x`, `y`, `button`, `double`
`desktop_type`	Type text into focused field, return screenshot + elements	`vm_name`, `text`, `press_enter`
`desktop_key`	Press key combo (e.g. `ctrl+l`, `Return`), return screenshot + elements	`vm_name`, `key`
`desktop_open_url`	Open URL in Firefox (Ctrl+L → type → Enter → wait → parse)	`vm_name`, `url`, `wait_seconds`
`desktop_interact`	Batch actions (click, type, key, wait), return screenshot + elements	`vm_name`, `actions` (JSON array)

Each tool returns:

A screenshot image (PNG) of the desktop after the action
A parsed elements list with `center` ([x, y]), `bbox`, `content`, `type`, and `interactivity` for each detected element

Test Tools

Tool	Description	Key Parameters
`run_tests`	Run tests on a built ISO by creating a test VM	`build_id`, `tests`, `memory_mb`, `vcpus`, `timeout_seconds`
`get_test_results`	Get detailed results of a test run	`run_id`
`list_test_runs`	List test runs for your builds	`build_id`, `limit`
`stop_test_run`	Stop a running test and clean up VMs	`run_id`

Examples

Once connected, you can interact with OpenFactory using natural language:

Building an image:

“Build me a Debian Bookworm server with SSH, Docker, and monitoring enabled”

Checking build status:

“Show my recent builds” or “What’s the status of my latest build?”

Running tests:

“Run tests on build abc123” or “Show the test results for my last build”

Working with recipes:

“Show me the available healthcare recipes” or “Create a build from the GxP MedTech template with extra packages: vim, htop”

Managing VMs:

“Create a VM from my latest build” or “List my running VMs”

Interacting with a VM GUI:

“Take a screenshot of my VM” or “Click on the login button at coordinates 500, 300”

GUI Interaction with `interact_vm`

The `interact_vm` tool lets you control a VM’s graphical display via VNC. Pass a JSON array of actions; they execute sequentially:


[
  {"type": "click", "x": 500, "y": 300},
  {"type": "wait", "duration": 1},
  {"type": "type", "text": "hello world"},
  {"type": "key", "key": "Return"},
  {"type": "screenshot"}
]

Supported action types:

Action	Fields	Description
`click`	`x`, `y`, `button?` (0=left, 2=right), `double?`	Click at coordinates
`double_click`	`x`, `y`	Double-click
`right_click`	`x`, `y`	Right-click
`move`	`x`, `y`	Move mouse cursor
`scroll`	`x`, `y`, `direction` (up/down), `amount`	Scroll wheel
`drag`	`x`, `y`, `end_x`, `end_y`	Click-drag
`type`	`text`	Type text string
`key`	`key`	Send key (e.g. `Return`, `Tab`, `ctrl+c`, `alt+F4`)
`key_combo`	`keys` (array)	Key combination (e.g. `["ctrl", "shift", "t"]`)
`wait`	`duration` (seconds)	Pause between actions
`screenshot`	—	Capture current display

By default, `screenshot_after` is `true`, so a screenshot image is returned after all actions complete. Use `screenshot_vm` for a standalone screenshot without any actions.
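
Because a malformed action can waste a round trip, it may help to check a batch locally before sending it. The sketch below is one possible validator, with the required fields transcribed from the action table above; fields marked `?` there are treated as optional, and treating `amount` as required for `scroll` is an assumption. The function name `validate_actions` is hypothetical and not part of OpenFactory.

```python
# Required fields per action type, transcribed from the action table.
# Assumption: unmarked fields (including scroll's `amount`) are required.
REQUIRED_FIELDS = {
    "click": {"x", "y"},
    "double_click": {"x", "y"},
    "right_click": {"x", "y"},
    "move": {"x", "y"},
    "scroll": {"x", "y", "direction", "amount"},
    "drag": {"x", "y", "end_x", "end_y"},
    "type": {"text"},
    "key": {"key"},
    "key_combo": {"keys"},
    "wait": {"duration"},
    "screenshot": set(),
}


def validate_actions(actions: list[dict]) -> list[str]:
    """Return a list of problems; an empty list means the batch looks valid."""
    problems = []
    for i, action in enumerate(actions):
        kind = action.get("type")
        if kind not in REQUIRED_FIELDS:
            problems.append(f"action {i}: unknown type {kind!r}")
            continue
        missing = REQUIRED_FIELDS[kind] - set(action)
        if missing:
            problems.append(f"action {i} ({kind}): missing {sorted(missing)}")
    return problems


batch = [
    {"type": "click", "x": 500, "y": 300},
    {"type": "wait", "duration": 1},
    {"type": "type", "text": "hello world"},
    {"type": "key", "key": "Return"},
    {"type": "screenshot"},
]
problems = validate_actions(batch)  # empty list for this batch
```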

Desktop Interaction with `desktop_*` Tools

The desktop tools are designed for Direktor agents with an assigned desktop VM. Unlike `interact_vm`, which returns raw screenshots, the desktop tools automatically run OmniParser on every screenshot to detect all UI elements and their coordinates.

Typical workflow:


1. desktop_screenshot(vm_name="of-desktop-abc-myorg-def123")
   → Returns screenshot + elements: [{"content": "Firefox", "center": [512, 400], "type": "icon"}, ...]

2. desktop_click(vm_name="of-desktop-abc-myorg-def123", x=512, y=400)
   → Clicks Firefox icon, returns new screenshot + updated elements

3. desktop_open_url(vm_name="of-desktop-abc-myorg-def123", url="https://example.com")
   → Opens URL in Firefox, waits for load, returns screenshot + page elements

Opening a URL directly:


desktop_open_url(vm_name="of-desktop-...", url="https://github.com")

This focuses the address bar with Ctrl+L, types the URL, presses Enter, waits 5 seconds, and then takes and parses a screenshot. It returns both the visual screenshot and all detected page elements with coordinates.
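
For comparison, the same sequence could be expressed as an explicit batch for `desktop_interact`. This is only a sketch: it assumes `desktop_interact` accepts the same action field names as `interact_vm` (`type`, `key`, `text`, `duration`), which is not confirmed here, and the helper name `open_url_actions` is hypothetical.

```python
def open_url_actions(url: str, wait_seconds: float = 5) -> list[dict]:
    """Build a batch mirroring the documented desktop_open_url sequence.

    Assumption: desktop_interact uses the same action schema as interact_vm.
    """
    return [
        {"type": "key", "key": "ctrl+l"},            # focus the address bar
        {"type": "type", "text": url},               # type the URL
        {"type": "key", "key": "Return"},            # submit
        {"type": "wait", "duration": wait_seconds},  # give the page time to load
    ]


actions = open_url_actions("https://github.com")
```

Spelling the steps out this way is mainly useful when a page needs a longer wait or extra key presses after loading; otherwise `desktop_open_url` is the simpler call.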

Element response format:


{
  "elements": [
    {
      "center": [512, 300],
      "bbox": [480, 280, 544, 320],
      "content": "Sign In",
      "type": "button",
      "interactivity": true
    }
  ],
  "total": 47,
  "interactive_count": 12,
  "summary": "Found 47 elements (12 interactive):\n  * [button] \"Sign In\" at (512, 300)\n  ..."
}
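
An agent typically uses this response to decide where to click next. The sketch below shows one way to pick the center of an interactive element by its text, using the field names from the response format above. The helper `find_clickable` and the sample data are hypothetical.

```python
import json
from typing import Optional


def find_clickable(response: dict, label: str) -> Optional[tuple[int, int]]:
    """Return the center of the first interactive element whose content
    contains `label` (case-insensitive), or None if nothing matches."""
    needle = label.lower()
    for element in response.get("elements", []):
        content = element.get("content") or ""
        if element.get("interactivity") and needle in content.lower():
            x, y = element["center"]
            return x, y
    return None


# Illustrative response: a non-interactive heading and a real button
# that share the same text.
response = json.loads("""
{
  "elements": [
    {"center": [100, 40], "bbox": [60, 30, 140, 50],
     "content": "Sign In", "type": "text", "interactivity": false},
    {"center": [512, 300], "bbox": [480, 280, 544, 320],
     "content": "Sign In", "type": "button", "interactivity": true}
  ],
  "total": 2,
  "interactive_count": 1
}
""")
target = find_clickable(response, "sign in")  # coordinates to pass to desktop_click
```

Filtering on `interactivity` avoids clicking static text that happens to match the label.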

If you create an account at console.openfactory.tech, you can link your MCP client to it with an API key. This lets you access builds from the web console and keeps your work consistent across different networks and devices.

Creating an API Key

1. Sign in at console.openfactory.tech
2. Go to Settings and generate an MCP API key
3. Add the key to your client config:

Claude Desktop / Cursor:


{
  "mcpServers": {
    "openfactory": {
      "url": "https://console.openfactory.tech/api/mcp/sse",
      "headers": {
        "Authorization": "Bearer of_mcp_your_key_here"
      }
    }
  }
}
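
If the config file already lists other MCP servers, the OpenFactory entry should be merged in rather than pasted over them. The following is a minimal sketch of such a merge, assuming the config has already been loaded into a dict. The helper name `add_openfactory` and the `other` server entry are hypothetical; the `of_mcp_` prefix check reflects the key format OpenFactory uses.

```python
import json
from typing import Optional

SSE_URL = "https://console.openfactory.tech/api/mcp/sse"


def add_openfactory(config: dict, api_key: Optional[str] = None) -> dict:
    """Return a copy of `config` with an `openfactory` server entry added.

    Existing servers are preserved; the input dict is not modified.
    """
    entry = {"url": SSE_URL}
    if api_key is not None:
        if not api_key.startswith("of_mcp_"):
            raise ValueError("OpenFactory MCP keys start with 'of_mcp_'")
        entry["headers"] = {"Authorization": f"Bearer {api_key}"}
    servers = {**config.get("mcpServers", {}), "openfactory": entry}
    return {**config, "mcpServers": servers}


# Hypothetical existing config with one unrelated server.
existing = {"mcpServers": {"other": {"command": "other-server"}}}
updated = add_openfactory(existing, api_key="of_mcp_your_key_here")
print(json.dumps(updated, indent=2))
```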

Claude Code:


claude mcp remove openfactory
claude mcp add --transport sse openfactory \
  "https://console.openfactory.tech/api/mcp/sse" \
  --header "Authorization: Bearer of_mcp_your_key_here"

Managing API Keys

You can also manage keys via the API:


# Create a key
POST /api/mcp-keys
Authorization: Bearer <your_jwt_token>
Content-Type: application/json
 
{"name": "Claude Desktop"}


# List keys
GET /api/mcp-keys
Authorization: Bearer <your_jwt_token>


# Revoke a key
DELETE /api/mcp-keys/{key_id}
Authorization: Bearer <your_jwt_token>

API keys are prefixed with of_mcp_ and shown only once when created. Revoking a key immediately disconnects any clients using it.
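
As a rough illustration, the create and revoke requests above could be constructed in Python with the standard library. This sketch only builds the request objects and does not send them. It assumes the key endpoints live on the same host as the console (`https://console.openfactory.tech`), which the relative paths above do not state explicitly, and it does not parse the response because its shape is not documented here.

```python
import json
import urllib.request

# Assumption: /api/mcp-keys is served from the console host.
BASE_URL = "https://console.openfactory.tech"


def create_key_request(jwt_token: str, name: str) -> urllib.request.Request:
    """Build (but do not send) the POST /api/mcp-keys request."""
    return urllib.request.Request(
        f"{BASE_URL}/api/mcp-keys",
        data=json.dumps({"name": name}).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {jwt_token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def revoke_key_request(jwt_token: str, key_id: str) -> urllib.request.Request:
    """Build (but do not send) the DELETE /api/mcp-keys/{key_id} request."""
    return urllib.request.Request(
        f"{BASE_URL}/api/mcp-keys/{key_id}",
        headers={"Authorization": f"Bearer {jwt_token}"},
        method="DELETE",
    )


req = create_key_request("<your_jwt_token>", "Claude Desktop")
# Sending is left to the caller, e.g. urllib.request.urlopen(req).
```

Since a key is shown only once, whatever code sends the create request should store the returned key immediately.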

Troubleshooting

Connection refused or timeout

Test the connection with:


curl -H "Accept: text/event-stream" \
  https://console.openfactory.tech/api/mcp/sse

You should see SSE events streaming back. If not, the service may be temporarily unavailable.

Tools not appearing

Restart your MCP client after adding the server configuration. Claude Desktop and Cursor may need a full restart to pick up new MCP servers.

“Invalid API key” error

If you’re using an API key, verify it hasn’t been revoked. Remove the Authorization header to fall back to your default session, or generate a new key from the console.

Build or VM operations failing

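Check that the operation is valid for the current state of the build or VM. For example, `get_iso_download_url` only works for completed builds, so check the build with `get_build_status` first.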