Security Model


Onumia lets agents do real work on a real WordPress site without handing them direct, unbounded access to production. It achieves this not through a single gate but through a sequence of independent layers, each of which assumes the others might fail. An agent must pass through all of them, in order, before anything it does can affect the live site.

A Layered Model

It helps to picture the path a request travels. WordPress authentication establishes who is making the request. The role capability policy decides whether that user is allowed to perform the action at all. A separate set of command capabilities then distinguishes reads from writes and from eval, so that being allowed to run commands does not imply being allowed to change anything. Whatever the agent does happens inside an isolated sandbox rather than against live data. Reviewers see that work only through preview links that cannot execute commands or promote. And finally, nothing reaches production until someone with the right promotion capability acts deliberately. The rest of this document walks through those layers in turn.
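
The ordering described above can be pictured as a chain of checks in which the first failing layer ends the request. The following is a minimal sketch under stated assumptions, not Onumia's implementation: the `Request` class, its fields, and the rejection messages are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Optional, Set


@dataclass
class Request:
    # All field names here are hypothetical, for illustration only.
    user: Optional[str]                        # None models an anonymous caller
    capabilities: Set[str] = field(default_factory=set)
    needed: str = "execute_read"               # capability the action requires
    owns_sandbox: bool = True
    sandbox_active: bool = True


def evaluate(request: Request) -> str:
    """Walk the layers in order; the first failing layer rejects the request."""
    if request.user is None:
        return "rejected: not authenticated"
    if request.needed not in request.capabilities:
        return "rejected: missing " + request.needed
    if not request.sandbox_active:
        return "rejected: sandbox inactive"
    if not request.owns_sandbox and "manage_all_sandboxes" not in request.capabilities:
        return "rejected: not the sandbox owner"
    # Only now does work happen, and it happens in the sandbox, not on the live site.
    return "allowed: runs inside the sandbox"
```

Promotion is deliberately absent from this chain: it is a separate action with its own capability check, so passing every layer here still changes nothing on the live site.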

Authentication Comes First

Every request starts with a WordPress identity. The Onumia MCP transport refuses anonymous callers outright; if there is no authenticated WordPress user behind a request, the transport does not respond. Agents authenticate the same way a person or a script would, using a WordPress Application Password tied to a specific WordPress user. That user, and the roles attached to it, becomes the identity Onumia reasons about for the rest of the request. Because the agent can never be more than the user it authenticates as, there is no way for it to act outside that user’s roles.
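
WordPress Application Passwords are normally presented over HTTP Basic authentication. Assuming the Onumia MCP transport accepts them the same way (this page does not spell out the wire format), a client might build its header roughly as follows. The username and password shown are placeholders.

```python
import base64


def basic_auth_header(username: str, application_password: str) -> dict:
    """Build an HTTP Basic Authorization header from a WordPress
    username and Application Password."""
    raw = f"{username}:{application_password}".encode("utf-8")
    token = base64.b64encode(raw).decode("ascii")
    return {"Authorization": "Basic " + token}


# Placeholder credentials; a real agent would load these from secure configuration.
headers = basic_auth_header("agent-bot", "abcd efgh ijkl mnop")
```

Giving the agent its own dedicated WordPress user means the roles on that user describe exactly what the agent may do.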

Role Capabilities Gate Every Action

Authentication only proves identity; it grants nothing on its own. Immediately after a user is identified, Onumia consults the capabilities assigned to that user’s WordPress roles to decide what is permitted. Those capabilities cover the full range of meaningful actions: creating sandboxes, running read commands, running write commands, running eval commands, promoting code, promoting database changes, and managing sandboxes owned by other users. Administrators configure the mapping from roles to capabilities in Settings, and the same policy is enforced uniformly whether a request arrives through the admin app or over MCP.
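
As an illustration of a role-to-capability policy, the sketch below resolves a user's roles into a capability set and answers a single permission question. The role assignments are invented for the example, and `create_sandbox` is a hypothetical capability name; the other capability names are the ones used elsewhere in this document.

```python
# Illustrative policy only; real mappings are configured by administrators in Settings.
ROLE_CAPABILITIES = {
    "administrator": {
        "create_sandbox",          # hypothetical name for the sandbox-creation capability
        "execute_read", "execute_write", "execute_eval",
        "promote_code", "promote_database", "manage_all_sandboxes",
    },
    "editor": {"create_sandbox", "execute_read", "execute_write"},
    "subscriber": set(),
}


def user_capabilities(roles):
    """Union of the capabilities granted by each of the user's roles."""
    caps = set()
    for role in roles:
        caps |= ROLE_CAPABILITIES.get(role, set())   # unknown roles grant nothing
    return caps


def can(roles, capability):
    """True if any of the user's roles grants the capability."""
    return capability in user_capabilities(roles)
```

Because the check depends only on roles and the requested capability, the same function can sit behind both the admin app and MCP, which is one way to get the uniform enforcement described above.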

Sandbox Ownership

Capabilities decide what you can do; ownership decides where you can do it. A user works within their own active sandboxes by default, and reaching sandboxes created by other users requires the manage_all_sandboxes capability. Inactive sandboxes are off the table entirely. A sandbox that has been discarded or otherwise deactivated cannot be selected as an execution target, which prevents stale or abandoned work from being revived and acted upon by mistake.
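
The ownership rule can be sketched as a small predicate. The `Sandbox` record and its fields are hypothetical; only `manage_all_sandboxes` comes from this document.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sandbox:
    # Hypothetical record shape, for illustration only.
    id: str
    owner: str
    active: bool


def can_target(user: str, capabilities: set, sandbox: Sandbox) -> bool:
    """Decide whether `user` may select `sandbox` as an execution target."""
    if not sandbox.active:
        return False                     # inactive sandboxes are never valid targets
    if sandbox.owner == user:
        return True                      # a user's own active sandbox
    return "manage_all_sandboxes" in capabilities
```

Note that the inactive check comes first, so even a user holding manage_all_sandboxes cannot revive a discarded sandbox through this path.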

No Raw Host Shell

The execute tool is the agent’s command surface, and it is intentionally not a shell into the host. Commands run through Onumia’s PHP-native bash runtime, which provides process-free shell utilities together with sandbox-aware wp and on commands. There is no path to arbitrary process execution on the server, and raw Git commands are not exposed through MCP. The agent operates a curated, capability-aware command vocabulary rather than the underlying operating system.

The Sandbox Filesystem Boundary

File operations are confined to the selected sandbox root. Onumia’s path handling resolves and validates targets so that nothing outside that root can be read or written, Git metadata paths remain inaccessible, and symlinks must resolve back inside an allowed sandbox root before any file operation will follow them. A read-only user, lacking write capability, sees the sandbox filesystem as read-only. The effect is that an agent’s reach into the filesystem ends at the boundary of the sandbox it is working in.
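
A common way to enforce this kind of boundary is to resolve the requested path, symlinks included, and then verify that the result still lies under the sandbox root. The sketch below shows that general technique in Python; it is not Onumia's PHP path handling, and the function name is hypothetical.

```python
import os


def resolve_in_sandbox(sandbox_root: str, requested: str) -> str:
    """Resolve `requested` against the sandbox root and refuse anything
    that lands outside it or inside Git metadata."""
    root = os.path.realpath(sandbox_root)
    # realpath follows symlinks, so a link pointing outside the root is
    # judged by where it actually leads, not by where it appears to sit.
    candidate = os.path.realpath(os.path.join(root, requested))
    if os.path.commonpath([root, candidate]) != root:
        raise PermissionError("path escapes the sandbox root")
    relative = os.path.relpath(candidate, root)
    if ".git" in relative.split(os.sep):
        raise PermissionError("Git metadata is not accessible")
    return candidate
```

Traversal attempts such as `../outside.txt`, absolute paths such as `/etc/passwd`, and symlinks that point out of the sandbox all fail the same containment check, because each is reduced to its real location before the comparison.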

WordPress Command Policy

The sandbox-aware wp command keeps WP-CLI usable while removing the parts that could break out of the current context. Flags that switch which site a command targets are rejected before the command runs, specifically --url, --blog, --network, and --network-wide, so a command cannot quietly redirect itself at another site or apply network-wide. Within the allowed commands, the same read and write distinction applies as everywhere else: read-only WP-CLI commands require execute_read, and any mutating command requires execute_write. A user without write capability is held to a known set of read subcommands and cannot run anything that would change state.
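
To make the policy concrete, here is a minimal sketch that rejects the site-switching flags and then maps a command to the capability it requires. The four blocked flags and the capability names come from this document; the list of read subcommands is a made-up sample, and treating every unlisted command as a write is an assumption of the sketch.

```python
BLOCKED_FLAGS = ("--url", "--blog", "--network", "--network-wide")

# Hypothetical sample; the actual set of read subcommands is defined by Onumia.
READ_SUBCOMMANDS = {
    ("option", "get"),
    ("post", "list"),
    ("plugin", "list"),
}


def required_capability(argv):
    """Return the capability needed to run `wp <argv...>`.

    Raises ValueError if a site-switching flag is present.
    """
    for arg in argv:
        flag = arg.split("=", 1)[0]          # handles both --url and --url=example.org
        if flag in BLOCKED_FLAGS:
            raise ValueError("flag " + flag + " is not allowed")
    if argv and argv[0] in ("eval", "eval-file"):
        return "execute_eval"
    if tuple(argv[:2]) in READ_SUBCOMMANDS:
        return "execute_read"
    return "execute_write"                   # assumption: unknown commands count as writes
```

Defaulting to execute_write for anything not known to be a read keeps the sketch conservative: a new or unrecognized subcommand cannot slip through on read permission alone.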

The Eval Boundary

wp eval and wp eval-file are the most powerful surface Onumia exposes, so they sit behind their own execute_eval capability and behind additional code-level restrictions. Even a user who holds execute_eval cannot use it to escape the sandbox. Onumia blocks high-risk PHP behavior in evaluated code, including shell execution, process execution, dynamic function calls, external includes, request termination, unsafe filesystem access, and multisite switching or site creation and deletion calls. What remains is ordinary WordPress programming: because eval runs in the selected sandbox context, calls such as update_option(...) write to the sandbox’s database state rather than the live site, so legitimate work behaves normally while the dangerous escape hatches stay closed.
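
The sketch below gives a rough feel for what code-level restrictions on evaluated PHP might look at. It is a deliberately naive text scan and says nothing about how Onumia actually inspects code; a real check would need to parse PHP rather than pattern-match it. The blocked names are real PHP and WordPress functions, but the list itself is an illustrative sample.

```python
import re

# Illustrative sample of high-risk calls; not Onumia's actual list.
BLOCKED_CALLS = {
    "exec", "shell_exec", "system", "passthru", "proc_open", "popen",
    "switch_to_blog", "wp_insert_site", "wp_delete_site",
}

CALL = re.compile(r"\b([A-Za-z_][A-Za-z0-9_]*)\s*\(")
DYNAMIC_CALL = re.compile(r"\$[A-Za-z_][A-Za-z0-9_]*\s*\(")   # e.g. $fn(...)
TERMINATION = re.compile(r"\b(exit|die)\b")


def violations(php_code: str) -> list:
    """Return a list of reasons the snippet would be refused (empty if none).

    Naive by design: it can misfire on method names or string contents.
    """
    found = []
    for match in CALL.finditer(php_code):
        name = match.group(1).lower()
        if name in BLOCKED_CALLS and name not in found:
            found.append(name)
    if DYNAMIC_CALL.search(php_code):
        found.append("dynamic function call")
    if TERMINATION.search(php_code):
        found.append("request termination")
    return found
```

Ordinary WordPress calls such as `update_option('blogname', 'Draft');` produce no violations in this sketch, which mirrors the intent described above: normal programming passes while the escape hatches are refused.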

Database Isolation

The database is where the live-site boundary matters most, and Onumia enforces it at the prefix level. Sandbox database work uses sandbox table prefixes, and a managed database drop-in activates the correct sandbox prefix for the preview requests and command contexts that belong to a sandbox. On top of that, a database write guard rejects mutating queries that would target live prefixed tables or unscoped tables while a sandbox is active. This guard is the core protection for WordPress table writes: it is what ensures that an agent preparing changes in a sandbox cannot accidentally or deliberately write through to production rows.
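
The write guard's rule can be approximated as: reads pass, and a mutating query passes only if its target table carries the active sandbox prefix. The sketch below illustrates that rule with simple regular expressions; the prefix `wp_sbx7_` is a hypothetical example, and real SQL would call for proper parsing rather than pattern matching.

```python
import re

MUTATING_VERB = re.compile(
    r"^\s*(INSERT|REPLACE|UPDATE|DELETE|ALTER|DROP|TRUNCATE|CREATE)\b",
    re.IGNORECASE,
)
TARGET_TABLE = re.compile(
    r"^\s*(?:INSERT\s+INTO|REPLACE\s+INTO|UPDATE|DELETE\s+FROM|"
    r"(?:ALTER|DROP|TRUNCATE|CREATE)\s+TABLE)\s+`?([A-Za-z0-9_]+)`?",
    re.IGNORECASE,
)


def is_write_allowed(query: str, sandbox_prefix: str) -> bool:
    """Allow reads; allow writes only to tables under the sandbox prefix."""
    if MUTATING_VERB.match(query) is None:
        return True                      # not a mutating statement
    match = TARGET_TABLE.match(query)
    if match is None:
        return False                     # mutating, but target unclear: refuse
    return match.group(1).startswith(sandbox_prefix)
```

Both live prefixed tables (such as `wp_options`) and unscoped tables (such as `orders`) fail the prefix test, and a mutating statement whose target cannot be identified is refused rather than waved through.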

Control Plane Writes

Onumia has to persist its own state, and those writes are explicitly distinct from sandbox content writes. Storing sandbox records, updating role capability settings, writing audit logs, and recording database merge rollback artifacts all go to Onumia’s control tables, which the write guard treats as legitimate control-plane operations rather than sandbox content. Drawing this line keeps Onumia’s bookkeeping working without weakening the boundary that protects live WordPress data, and it keeps control data out of the tables a sandbox can shadow.

Preview Privacy

A preview link is a way to look, not a way to act. It renders the selected sandbox in the browser and does nothing more: it grants no MCP access, no command execution, no settings access, and no promotion rights. Even so, the link does reveal proposed work, so it should be treated as a private review link and shared only with the people who are meant to evaluate the change. The important guarantee is that handing someone a preview URL never hands them any of Onumia’s privileged surfaces.

Promotion Is Always Explicit

Nothing a sandbox contains becomes live on its own. Promotion is a separate, deliberate action that requires both an explicit tool or UI call and the correct promotion capability for what is being promoted, with promote_code and promote_database checked independently. Database promotion in particular is built to be careful rather than blunt: the merge path works from staged selections, runs preflight checks, applies guarded row preconditions, and records rollback artifacts so an apply can be reversed. The result is that the live site advances only when a sufficiently privileged user decides it should, and even then through a reviewable, recoverable path.
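
One way to read "guarded row preconditions" and "rollback artifacts" is as an optimistic apply: each staged change states what it expects the live row to contain, nothing is written unless every expectation holds, and the previous values are recorded so the apply can be undone. The sketch below illustrates that interpretation with a plain dictionary standing in for live rows. The data shapes and function names are hypothetical, and it covers only updates to existing rows.

```python
def apply_staged(live: dict, staged: list) -> list:
    """Apply staged changes to `live` and return rollback artifacts.

    Each staged change is {"key": ..., "expected": ..., "new": ...}.
    """
    # Preflight: verify every precondition before writing anything.
    for change in staged:
        key = change["key"]
        if key not in live or live[key] != change["expected"]:
            raise RuntimeError("precondition failed for " + key)
    artifacts = []
    for change in staged:
        key = change["key"]
        artifacts.append((key, live[key]))   # remember the previous value
        live[key] = change["new"]
    return artifacts


def roll_back(live: dict, artifacts: list) -> None:
    """Restore previous values in reverse order of application."""
    for key, previous in reversed(artifacts):
        live[key] = previous
```

If the live row changed after the selection was staged, the preflight fails and the live data is left untouched, so a stale sandbox cannot silently overwrite newer production values.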

Audit Logging

Onumia records tool inputs and outputs wherever it can, which gives operators the accountability and rollback context they need to understand what an agent did. That completeness has a corollary: the logs themselves are sensitive, because they may contain whatever data passed through commands. Treat Onumia logs as confidential, and avoid sending secrets through commands unless you accept that those values may be captured in the log. Where possible, prefer configuration and credentials that live outside the command stream.

External Side Effects

Onumia’s isolation covers WordPress sandbox files and WordPress table writes; it does not extend to the wider world that a plugin might reach into. Code running in a sandbox can still trigger genuine external effects such as remote API calls, email delivery, payment provider calls, third-party queues, writes to non-WordPress databases, and outbound webhooks. These actions happen for real because they leave WordPress, and with it Onumia’s isolation, entirely. When a task touches external systems, review it with that in mind, and prefer test credentials or disabled integrations while work is still being prepared in a sandbox.