
Self-host Overview

You can run all of openma — the API, the Console, the integrations gateway, and the agent runtime — on your own Cloudflare account. The same code that powers openma.dev is the code in the public repo; nothing is held back.

This page describes what you’re getting into. The next page, Deploy, is the step-by-step.

openma deploys as four Workers plus a set of bindings:

  • apps/main — REST API + Console static assets. The front door.
  • apps/agent — Session Durable Objects + sandbox Containers. Where agents actually run.
  • apps/integrations — OAuth callback + webhook gateway for Linear / GitHub / Slack.
  • apps/docs — This site. (Optional; only if you want your own docs.)

They talk to each other via service bindings (Worker-to-Worker calls, no public hop). State lives in:

  • D1 — auth (users, tenants, sessions), agent configs, skill metadata, integration installs.
  • KV — fast-path caches: agent/environment/credential lookups.
  • R2 — file storage: skill files, session files, workspace.
  • Durable Objects — SESSION_DO (per-session event log in SQLite), SANDBOX (per-session container).
  • Vectorize — semantic memory index.
  • Workers AI — embedding generation.
  • Cloudflare Containers — sandbox runtime (the per-session container the bash tool runs in).
  • Email Service — transactional email for auth.
  • Analytics Engine — observability for silent catches and structured logs.
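As a rough sketch of how this wires together, the bindings for apps/main might look like the following wrangler.toml fragment. Binding and resource names here are illustrative, not the repo's actual values; check the checked-in config before copying anything.

```toml
# Illustrative wrangler.toml fragment for apps/main — names are placeholders.
name = "openma-main"
main = "src/index.ts"

# Service binding: Worker-to-Worker call to apps/agent, no public hop.
[[services]]
binding = "AGENT"
service = "openma-agent"

# D1: auth, agent configs, skill metadata, integration installs.
[[d1_databases]]
binding = "DB"
database_name = "openma"
database_id = "..."

# KV: fast-path caches for agent/environment/credential lookups.
[[kv_namespaces]]
binding = "CACHE"
id = "..."

# R2: skill files, session files, workspace.
[[r2_buckets]]
binding = "FILES"
bucket_name = "openma-files"

# Vectorize: semantic memory index.
[[vectorize]]
binding = "MEMORY"
index_name = "openma-memory"

# Workers AI: embedding generation.
[ai]
binding = "AI"
```

The Durable Object classes (SESSION_DO, SANDBOX) and the sandbox Containers are declared in apps/agent's own config, since that Worker hosts them.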

You'll need a few things up front.

Cloudflare Workers Paid plan

Required for Durable Objects and Containers. The free plan won’t work.

A domain

You’ll point routes at it: openma.dev in our deployment, whatever domain you own in yours.

An LLM provider

Anthropic, OpenAI, MiniMax, or any OpenAI-compatible endpoint. You pay them directly.

OAuth apps

You’ll register your own GitHub App, Linear OAuth, and Slack app. Walked through in OAuth Apps.

You don't need:

  • A separate database server — D1 is managed.
  • A queue or task runner — Durable Object alarms cover scheduling.
  • A vector database service — Vectorize is managed.
  • Container infrastructure — CF Containers handle it.
  • Manual scaling — Workers + DOs scale on demand.

What you'll pay for, in rough order of cost (highest first):

  1. LLM API calls — your Anthropic/OpenAI/MiniMax bill. The biggest variable.
  2. Containers — per-session runtime. Pay for active container time.
  3. Workers — requests + CPU time. Cheap.
  4. Storage — D1 / R2 / KV / Vectorize. Generally negligible until you have lots of long-lived sessions.
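A back-of-envelope estimate makes the ordering concrete. Every number below is a placeholder, not a real Cloudflare or LLM-provider rate; plug in your own usage and your providers' current pricing.

```typescript
// Hypothetical monthly estimate — all rates are placeholders, not real prices.
const sessionsPerMonth = 500;
const llmCostPerSession = 0.25;     // $ per session; dominated by model + context size
const containerHoursPerSession = 0.5;
const containerCostPerHour = 0.02;  // $ per active container hour (placeholder)
const workersAndStorage = 10;       // $ flat guess: requests + D1/R2/KV/Vectorize

const monthly =
  sessionsPerMonth * llmCostPerSession +                          // 125 — LLM dominates
  sessionsPerMonth * containerHoursPerSession * containerCostPerHour + // 5
  workersAndStorage;                                              // 10

console.log(`~$${monthly.toFixed(2)}/month`); // ~$140.00/month
```

Even with generous container usage, the LLM line item dwarfs the rest, which is why it sits first in the list above.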

There are no openma platform fees because there is no openma platform — it’s all your Cloudflare account.

A single self-hosted deployment supports multiple tenants out of the box (the same code path the hosted product uses). On first sign-up, the databaseHooks.user.create.after hook creates a tenant row idempotently. You don’t need to seed anything.
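The idempotency matters because auth hooks can fire more than once for the same user (retries, replays). A minimal sketch of the shape, assuming hypothetical names — the real hook lives in the repo's auth config, and a Map stands in for D1 here:

```typescript
// Sketch of first-sign-up tenant bootstrap. Names and row shape are
// illustrative, not the repo's actual schema. A Map stands in for D1.
type Tenant = { id: string; ownerUserId: string };
const tenants = new Map<string, Tenant>(); // keyed by owner user id

// Idempotent: a retried hook invocation for the same user is a no-op.
function ensureTenant(userId: string): Tenant {
  const existing = tenants.get(userId);
  if (existing) return existing;
  const tenant = { id: `tenant_${userId}`, ownerUserId: userId };
  tenants.set(userId, tenant);
  return tenant;
}

// Wired into the auth layer along the lines of:
// databaseHooks: { user: { create: { after: (user) => ensureTenant(user.id) } } }

const first = ensureTenant("u_1");
const retry = ensureTenant("u_1"); // duplicate delivery — same row, no second insert
console.log(first.id === retry.id, tenants.size); // true 1
```

Because the hook converges to the same row on repeat calls, no seed script or manual tenant setup is needed.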

If you want stricter isolation (per-tenant D1 databases), set PER_TENANT_DB_ENABLED=true and configure STORE_BACKENDS. See Operations.