Security

The architecture that makes the agent safe to use.

Midlight gives an AI agent the ability to read, write, and execute bash inside a workspace of your most personal docs. Doing that responsibly means designing for it from the foundation. This page describes what we actually do — and what we don't.

Constrained tool surface

The agent never has access to its provider's built-in Read, Write, Bash, WebFetch, or WebSearch tools. We disable them at the SDK configuration layer. The agent's complete tool list is the small set of Midlight-owned MCP tools we ship — every file mutation, every shell command, every subagent spawn flows through that surface and gets logged.

Verified. Attack-prompt tests on both runtimes confirm that the model cannot reach a built-in Write or Bash tool, even when a prompt explicitly tries to invoke one. The SDK version is pinned, and the tests re-run on every upgrade.
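
To make the idea of a single logged tool surface concrete, here is a minimal sketch of an allowlisting gateway. It is an illustration under stated assumptions, not Midlight's actual code: the tool names (midlight_read and friends), the ToolGateway class, and the in-memory audit list are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Any, Callable

# Hypothetical tool names; the real tool list is not given on this page.
ALLOWED_TOOLS = {"midlight_read", "midlight_write", "midlight_shell"}


class ToolNotAllowed(Exception):
    """Raised when a requested tool is outside the allowlist."""


@dataclass
class ToolGateway:
    handlers: dict[str, Callable[..., Any]]
    audit_log: list[dict[str, Any]] = field(default_factory=list)

    def call(self, name: str, **args: Any) -> Any:
        allowed = name in ALLOWED_TOOLS and name in self.handlers
        # Record every attempt, including rejected ones, before doing any work.
        self.audit_log.append({"tool": name, "args": args, "allowed": allowed})
        if not allowed:
            raise ToolNotAllowed(name)
        return self.handlers[name](**args)


gateway = ToolGateway(handlers={"midlight_read": lambda path: f"contents of {path}"})
```

In this sketch a request for a provider built-in such as "Bash" is rejected and still leaves an audit entry, which is the property an attack-prompt test would check.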

Memory isolation

Your .memory/ directories — the place the agent keeps notes about you — are hidden from the agent's shell by the kernel. When the agent runs bash, it sees a workspace where every .memory/ folder appears empty. Memory contents are only reachable through a dedicated retrieval tool that logs every access in an audit table you can read.

Mechanism. Linux mount namespaces, with a read-only tmpfs masking each .memory/ directory inside the sandbox. The retrieval subagent runs outside the masked namespace with scoped read-only tools.
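
As a rough illustration of the masking step, the sketch below walks a workspace and builds one mount command per .memory/ directory. It only constructs the argument lists; actually running them would require root inside a private mount namespace (for example, one created with unshare --mount). The function name and the exact mount options are assumptions, not taken from Midlight's implementation.

```python
import os
from pathlib import Path


def memory_mask_commands(workspace: Path) -> list[list[str]]:
    """Return one `mount` argv per .memory/ directory found under `workspace`."""
    commands = []
    for root, dirs, _files in os.walk(workspace):
        if ".memory" in dirs:
            target = Path(root) / ".memory"
            # An empty, read-only tmpfs mounted over the directory hides its contents.
            commands.append(
                ["mount", "-t", "tmpfs", "-o", "ro,nosuid,nodev,noexec", "tmpfs", str(target)]
            )
            dirs.remove(".memory")  # do not descend into the directory being masked
    return sorted(commands)
```

Because the mounts live in the sandbox's own namespace, a process outside it (such as the retrieval subagent described above) still sees the real directory contents.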

Hardened sandbox

The agent runs in a per-session Docker container with no inbound network, an egress allowlist scoped to the model API endpoints, a read-only root filesystem, dropped capabilities, an unprivileged user, and PID/memory/CPU limits. Container escape remains an unsolved problem on every PaaS; we mitigate it with defense in depth, and v2 moves to true microVMs as a precondition for open signup.

Hardening profile. --cap-drop=ALL, --security-opt no-new-privileges, --read-only, user namespace remapping, host-side iptables egress allowlist on a per-session network.
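
The profile above maps onto docker run flags roughly as follows. This is a sketch, not the production launch command: the function name, image name, network name, UID, and the specific limit values are placeholders chosen for illustration. User namespace remapping is configured on the Docker daemon (userns-remap) and the egress allowlist lives in host-side iptables, so neither appears as a flag here.

```python
def docker_run_args(image: str, network: str, uid: int = 10001) -> list[str]:
    """Build a hardened `docker run` argv. All values are illustrative."""
    return [
        "docker", "run", "--rm",
        "--cap-drop=ALL",                        # drop every Linux capability
        "--security-opt", "no-new-privileges",   # block setuid-style escalation
        "--read-only",                           # read-only root filesystem
        "--user", f"{uid}:{uid}",                # unprivileged user and group
        "--network", network,                    # per-session network
        "--pids-limit", "256",                   # placeholder PID limit
        "--memory", "1g",                        # placeholder memory limit
        "--cpus", "1.0",                         # placeholder CPU limit
        image,
    ]


args = docker_run_args("midlight/agent:example", "session-net-example")
```

Keeping the flags in one function makes the profile easy to assert on in tests, so a dropped flag shows up as a failing check rather than a silent regression.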

Your keys, encrypted

Anthropic and OpenAI API keys you paste in are envelope-encrypted at rest. Each key gets its own data key, wrapped by a master key we rotate. Decryption happens only in memory in the control plane, never on the sandbox's filesystem. The agent itself never sees raw credentials — inference calls are proxied through the control plane and authenticated there.

Encryption. Authenticated AES-256-GCM with per-record data encryption keys (DEKs). Tampered ciphertext fails decryption with an explicit error rather than silently yielding corrupted plaintext.
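
A minimal sketch of envelope encryption with per-record DEKs, using the third-party cryptography package (an assumption; this page does not say which library Midlight uses). The seal and open_sealed names, the record layout, and the choice to bind a record ID as associated data are illustrative, and master-key rotation is left out.

```python
import os

from cryptography.exceptions import InvalidTag
from cryptography.hazmat.primitives.ciphers.aead import AESGCM


def seal(master_key: bytes, plaintext: bytes, record_id: bytes) -> dict[str, bytes]:
    """Encrypt `plaintext` under a fresh DEK, then wrap the DEK with the master key."""
    dek = AESGCM.generate_key(bit_length=256)
    data_nonce, wrap_nonce = os.urandom(12), os.urandom(12)
    return {
        # record_id is authenticated but not encrypted (associated data).
        "ciphertext": AESGCM(dek).encrypt(data_nonce, plaintext, record_id),
        "data_nonce": data_nonce,
        "wrapped_dek": AESGCM(master_key).encrypt(wrap_nonce, dek, record_id),
        "wrap_nonce": wrap_nonce,
    }


def open_sealed(master_key: bytes, record: dict[str, bytes], record_id: bytes) -> bytes:
    """Unwrap the DEK and decrypt; raises InvalidTag if anything was modified."""
    dek = AESGCM(master_key).decrypt(record["wrap_nonce"], record["wrapped_dek"], record_id)
    return AESGCM(dek).decrypt(record["data_nonce"], record["ciphertext"], record_id)


master = AESGCM.generate_key(bit_length=256)
record = seal(master, b"sk-example-not-a-real-key", b"user-1")
```

Flipping a single byte of the stored ciphertext, or presenting it under a different record ID, makes open_sealed raise InvalidTag instead of returning garbage.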

What we don't do

Equally important: the things we deliberately rule out.

No ambient memory. The agent does not get your full context pre-loaded. It explores memory on demand, retrieves only when relevant, and surfaces personal context to you with source attribution.
No silent writes. Every memory mutation is a first-class tool call you see in the chat. Every file edit is a commit you can revert.
No data lock-in. Workspaces export as a folder of files. You can take everything and leave at any time.
No third-party telemetry inside your workspace. Operational logs and error reporting are aggregate and scrubbed; we don't ship your chat or file content to analytics platforms.
No public-facing services from your sandbox. The container has no inbound networking, no SSH, no scheduled jobs that survive a session. This is a workspace, not a hosting platform.

What the security model doesn't cover

Being architecturally honest means saying what's not protected. The agent sends workspace content to your configured model providers as part of inference — that's the entire point, and the privacy of those calls is bounded by who you trust to host the model. Anything the agent can read, it can show you in chat (so secrets in workspace files can surface in conversation). Provider-side breaches of Anthropic, OpenAI, etc. are outside our control. v1 is invite-only on hardened Docker; v2 moves to true microVMs before we open signup.

Curious about implementation details? The architecture is open. [email protected] for anything specific.