
Self-host Overview

You can run all of openma — the API, the Console, the integrations gateway, and the agent runtime — on your own Cloudflare account. The same code that powers openma.dev is the code in the public repo; nothing is held back.

This page describes what you’re getting into. The next page, Deploy, is the step-by-step.

openma deploys as four Workers plus a set of bindings:

  • apps/main — REST API + Console static assets. The front door.
  • apps/agent — Session Durable Objects + sandbox Containers. Where agents actually run.
  • apps/integrations — OAuth callback + webhook gateway for Linear / GitHub / Slack.
  • apps/docs — This site. (Optional; only if you want your own docs.)

They talk to each other via service bindings (Worker-to-Worker calls, no public hop). State lives in:

  • D1 — auth (users, tenants, sessions), agent configs, skill metadata, integration installs.
  • KV — fast-path caches: agent/environment/credential lookups.
  • R2 — file storage: skill files, session files, workspace.
  • Durable Objects — SESSION_DO (per-session event log in SQLite), SANDBOX (per-session container).
  • Vectorize — semantic memory index.
  • Workers AI — embedding generation.
  • Cloudflare Containers — sandbox runtime (the per-session container the bash tool runs in).
  • Email Service — transactional email for auth.
  • Analytics Engine — observability for silent catches and structured logs.
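As a rough sketch of how this wires together, the bindings for apps/main might look like the following wrangler.toml fragment. Binding and resource names here are illustrative, not the repo's actual values; check the checked-in config before copying anything.

```toml
# Illustrative wrangler.toml fragment for apps/main — names are placeholders.
name = "openma-main"
main = "src/index.ts"

# Service binding: Worker-to-Worker call to apps/agent, no public hop.
[[services]]
binding = "AGENT"
service = "openma-agent"

# D1: auth, agent configs, skill metadata, integration installs.
[[d1_databases]]
binding = "DB"
database_name = "openma"
database_id = "..."

# KV: fast-path caches for agent/environment/credential lookups.
[[kv_namespaces]]
binding = "CACHE"
id = "..."

# R2: skill files, session files, workspace.
[[r2_buckets]]
binding = "FILES"
bucket_name = "openma-files"

# Vectorize: semantic memory index.
[[vectorize]]
binding = "MEMORY"
index_name = "openma-memory"

# Workers AI: embedding generation.
[ai]
binding = "AI"
```

The Durable Object classes (SESSION_DO, SANDBOX) and the sandbox Containers are declared in apps/agent's own config, since that Worker hosts them.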

You'll need a few things up front.

Cloudflare Workers Paid plan

Required for Durable Objects and Containers. The free plan won’t work.

A domain

You’ll point routes at it: openma.dev in our deployment, whatever domain you own in yours.

An LLM provider

Anthropic, OpenAI, MiniMax, or any OpenAI-compatible endpoint. You pay them directly.

OAuth apps

You’ll register your own GitHub App, Linear OAuth, and Slack app. Walked through in OAuth Apps.

You don't need:

  • A separate database server — D1 is managed.
  • A queue or task runner — Durable Object alarms cover scheduling.
  • A vector database service — Vectorize is managed.
  • Container infrastructure — CF Containers handle it.
  • Manual scaling — Workers + DOs scale on demand.

What you'll pay for, in rough order of cost (highest first):

  1. LLM API calls — your Anthropic/OpenAI/MiniMax bill. The biggest variable.
  2. Containers — per-session runtime. Pay for active container time.
  3. Workers — requests + CPU time. Cheap.
  4. Storage — D1 / R2 / KV / Vectorize. Generally negligible until you have lots of long-lived sessions.
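A back-of-envelope estimate makes the ordering concrete. Every number below is a placeholder, not a real Cloudflare or LLM-provider rate; plug in your own usage and your providers' current pricing.

```typescript
// Hypothetical monthly estimate — all rates are placeholders, not real prices.
const sessionsPerMonth = 500;
const llmCostPerSession = 0.25;     // $ per session; dominated by model + context size
const containerHoursPerSession = 0.5;
const containerCostPerHour = 0.02;  // $ per active container hour (placeholder)
const workersAndStorage = 10;       // $ flat guess: requests + D1/R2/KV/Vectorize

const monthly =
  sessionsPerMonth * llmCostPerSession +                          // 125 — LLM dominates
  sessionsPerMonth * containerHoursPerSession * containerCostPerHour + // 5
  workersAndStorage;                                              // 10

console.log(`~$${monthly.toFixed(2)}/month`); // ~$140.00/month
```

Even with generous container usage, the LLM line item dwarfs the rest, which is why it sits first in the list above.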

There are no openma platform fees because there is no openma platform — it’s all your Cloudflare account.

A single self-hosted deployment supports multiple tenants out of the box (the same code path the hosted product uses). On first sign-up, the databaseHooks.user.create.after hook creates a tenant row idempotently. You don’t need to seed anything.
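The idempotency matters because auth hooks can fire more than once for the same user (retries, replays). A minimal sketch of the shape, assuming hypothetical names — the real hook lives in the repo's auth config, and a Map stands in for D1 here:

```typescript
// Sketch of first-sign-up tenant bootstrap. Names and row shape are
// illustrative, not the repo's actual schema. A Map stands in for D1.
type Tenant = { id: string; ownerUserId: string };
const tenants = new Map<string, Tenant>(); // keyed by owner user id

// Idempotent: a retried hook invocation for the same user is a no-op.
function ensureTenant(userId: string): Tenant {
  const existing = tenants.get(userId);
  if (existing) return existing;
  const tenant = { id: `tenant_${userId}`, ownerUserId: userId };
  tenants.set(userId, tenant);
  return tenant;
}

// Wired into the auth layer along the lines of:
// databaseHooks: { user: { create: { after: (user) => ensureTenant(user.id) } } }

const first = ensureTenant("u_1");
const retry = ensureTenant("u_1"); // duplicate delivery — same row, no second insert
console.log(first.id === retry.id, tenants.size); // true 1
```

Because the hook converges to the same row on repeat calls, no seed script or manual tenant setup is needed.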

If you want stricter isolation (per-tenant D1 databases), set PER_TENANT_DB_ENABLED=true and configure STORE_BACKENDS. See Operations.