Sandboxing agentic coding tools is a networking problem

Command allowlisting for agentic tools is hard to get right. Sandboxes are an alternative, and Simon Willison's "lethal trifecta" is a useful lens for reasoning about them:

Diagram showing sandbox architecture for agentic coding tools

Sandboxes let us reason about which parts of the lethal trifecta an agent can still reach:

The lethal trifecta diagram showing untrusted content, external communication, and sensitive data

Anthropic provides several sandboxing tools:

Cursor and OpenAI's Codex CLI offer similar features. Custom sandboxes using gVisor or Firecracker VMs apply comparable network isolation principles.

What is the worst a sandboxed Claude Code can do?

A sufficiently sandboxed Claude Code resembles a separate host. Key considerations:

Nearly every Claude Code instance has access to an Anthropic API key. Claude Code inherits all environment variables from your terminal session and can read files in the directory where you run claude.

Software often needs secrets to run, and development credentials are still sensitive. Dotenv files in your working directory are readable by Claude Code even when they are properly listed in .gitignore, which creates an exfiltration risk.
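To make that exposure concrete, here is a minimal sketch of an audit you could run from the same terminal and directory before starting claude. The name pattern and both helper functions are illustrative assumptions, not part of Claude Code or any sandbox tooling, and a name-based heuristic will miss secrets stored under unusual names.

```python
import os
import re
from pathlib import Path

# Hypothetical heuristic: variable names that usually hold credentials.
SECRET_NAME = re.compile(r"(KEY|TOKEN|SECRET|PASSWORD|CREDENTIAL)", re.IGNORECASE)


def exposed_env_vars(environ):
    """Return the names of environment variables that look like secrets."""
    return sorted(name for name in environ if SECRET_NAME.search(name))


def exposed_dotenv_files(workdir):
    """Return dotenv-style files under workdir, whether or not git ignores them."""
    root = Path(workdir)
    return sorted(str(p.relative_to(root)) for p in root.rglob(".env*") if p.is_file())


if __name__ == "__main__":
    # Only names and paths are printed, never the secret values themselves.
    print("env vars:", exposed_env_vars(os.environ))
    print("dotenv files:", exposed_dotenv_files("."))
```

Anything this prints is something a process started from that shell can read, regardless of what .gitignore says.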

Unpacking the devcontainer firewall

The devcontainer template includes an init-firewall.sh script permitting connections to:

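The firewall operates at the IP layer using iptables, so it can only decide which addresses and ports are reachable. It cannot see what a request does once it reaches an allowed host, which leaves application-layer attacks open. A single domain can serve many different actions, and techniques such as domain fronting can hide the real destination behind an allowed one. Even a tight allowlist can therefore enable credential exfiltration, for example by publishing an npm package or creating a GitHub gist that contains the secret.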

Application-layer inspection becomes necessary for effective restriction.
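The difference between the two layers can be sketched in a few lines. This is a toy model, not how iptables or any proxy is implemented: the Request type, both allowlists, the hostname, and the address (taken from the TEST-NET-1 documentation range) are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Request:
    dest_ip: str
    host: str
    method: str
    path: str


# Hypothetical allowlists for the example.
ALLOWED_IPS = {"192.0.2.10"}
ALLOWED_ACTIONS = {("api.example.com", "GET", "/packages")}


def ip_layer_allows(req):
    """What an IP-level rule can see: only the destination address."""
    return req.dest_ip in ALLOWED_IPS


def app_layer_allows(req):
    """What an HTTP-aware proxy can see: host, method, and path."""
    return (req.host, req.method, req.path) in ALLOWED_ACTIONS


# Two requests to the same server: one reads, one uploads data.
read = Request("192.0.2.10", "api.example.com", "GET", "/packages")
upload = Request("192.0.2.10", "api.example.com", "POST", "/upload")
```

Both requests pass `ip_layer_allows` because they share a destination address, while `app_layer_allows` accepts only the read. That gap is what an HTTP-aware proxy closes.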

Using network proxies to prevent secrets exfiltration

Claude Code supports two proxy configurations:

These configurations operate independently.

You can point Claude Code at an HTTP proxy with the following configuration in settings.json:

{
  "env": {
    "HTTP_PROXY": "http://localhost:8080",
    "HTTPS_PROXY": "http://localhost:8080"
  },
  "sandbox": {
    "httpProxyPort": 8080
  }
}

mitmproxy is a great tool for running these HTTP proxies. You could pass an invalid Anthropic API key to Claude Code, and write a mitmproxy addon (saved here as inject_api_key.py) that intercepts requests to api.anthropic.com and overwrites the X-API-Key header with the actual credentials:

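from mitmproxy import http

# The real key lives only in the proxy process, never in the sandbox.
REAL_API_KEY = "sk-ant-api03-real-key-here"

class InjectApiKey:
    def request(self, flow: http.HTTPFlow) -> None:
        # Only touch requests bound for the Anthropic API.
        if flow.request.pretty_host == "api.anthropic.com":
            # Overwrite the dummy key that Claude Code sent.
            flow.request.headers["x-api-key"] = REAL_API_KEY

# mitmproxy discovers addons through this module-level list.
addons = [InjectApiKey()]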

You could then run mitmweb with this addon loaded, so that the proxy holds the real API key:

mitmweb -s inject_api_key.py --set ssl_insecure=true

Afterwards, run claude with the ANTHROPIC_API_KEY environment variable set to an invalid API key:

ANTHROPIC_API_KEY=sk-ant-dummy claude

From the perspective of Claude Code, requests to api.anthropic.com succeed as usual, but neither Claude Code nor the sandbox it runs in ever holds the real credentials.

Note: Claude Code requires OAuth sign-in before checking ANTHROPIC_API_KEY, so obtain the API key first, close the session, then restart with invalid credentials.

This technique extends beyond Anthropic keys: dummy credentials combined with mitmproxy injection work for any HTTP API.
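As a rough sketch of that generalization, the addon below injects a different credential per upstream host. The mapping, the REAL_* environment variable names, and the class name are assumptions made for this example. The real secrets are read from the proxy's own environment, which the sandbox never sees.

```python
import os

# Hypothetical mapping: upstream host -> (header to set, env var that holds the
# real secret in the proxy's environment, template for the header value).
INJECTIONS = {
    "api.anthropic.com": ("x-api-key", "REAL_ANTHROPIC_API_KEY", "{secret}"),
    "api.github.com": ("authorization", "REAL_GITHUB_TOKEN", "Bearer {secret}"),
}


class InjectCredentials:
    def __init__(self, injections=INJECTIONS, environ=os.environ):
        self.injections = injections
        self.environ = environ

    def request(self, flow):
        rule = self.injections.get(flow.request.pretty_host)
        if rule is None:
            return  # not a host we hold credentials for; forward unchanged
        header, env_name, template = rule
        secret = self.environ.get(env_name)
        if secret:
            # Replace whatever dummy value the sandboxed client sent.
            flow.request.headers[header] = template.format(secret=secret)
        # If the secret is not configured, the dummy value is forwarded as-is
        # and the upstream API will reject it.


addons = [InjectCredentials()]
```

The class only relies on `flow.request.pretty_host` and `flow.request.headers`, the same attributes the addon above uses, so it can be loaded with `-s` in the same way.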

Tying a developer's permissions to their Claude Code permissions

Formal enables least privilege for both human and machine identities. Current Anthropic Admin API Keys inherit full user permissions without fine-grained restrictions. API keys generated for Claude Code appear limited but lack clear documentation.

The optimal approach prevents Claude Code from accessing credentials directly. Using Formal Connectors, Resources, and Native Users ensures Claude Code cannot leak API keys. Claude Code makes requests with Formal-specific credentials while the Connector injects actual secrets upstream.

When hostnames and headers are hard to edit: mitmproxy add-ons

For hostnames and headers that are hard to tweak, use mitmproxy add-ons to route the HTTP requests for these domains to the corresponding listener. From Claude Code's perspective, the default hostnames and ports are unchanged.

from mitmproxy import http

# Map original hostnames to Formal Connector listeners
REROUTE_MAP = {
    "api.anthropic.com": ("localhost", 4004),
    "api.openai.com": ("localhost", 4005),
    "api.github.com": ("localhost", 4006),
}

class RerouteHosts:
    def request(self, flow: http.HTTPFlow) -> None:
        host = flow.request.pretty_host
        if host in REROUTE_MAP:
            target_host, target_port = REROUTE_MAP[host]
            flow.request.host = target_host
            flow.request.port = target_port
            flow.request.headers["X-Original-Host"] = host

addons = [RerouteHosts()]

You can then pass this add-on via mitmproxy -s reroute_hosts.py.

Applying fine-grained least privilege policies

We could then create a policy much like the one we created for the local GitHub MCP server use case. For example, allow access only to specific API endpoints:

{
  "type": "allow",
  "description": "Allow Claude Code to access Anthropic completions",
  "condition": {
    "match": {
      "host": "api.anthropic.com",
      "method": "POST",
      "path": "/v1/complete"
    }
  }
}

If we change the policy type to "block" and the path to "/v1/messages", we can confirm that requests to that endpoint are blocked:

{
  "type": "block",
  "description": "Block direct access to messages API",
  "condition": {
    "match": {
      "host": "api.anthropic.com",
      "method": "POST",
      "path": "/v1/messages"
    }
  },
  "action": "deny"
}
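To show how match conditions of this shape behave, here is a minimal evaluator sketch. It is not Formal's policy engine: the precedence rule (block wins over allow) and the default-deny fallback are assumptions chosen for the example, and the policies are trimmed to the fields the sketch reads.

```python
def matches(policy, request):
    """True when every field in the policy's match block equals the request's."""
    wanted = policy["condition"]["match"]
    return all(request.get(field) == value for field, value in wanted.items())


def decide(policies, request):
    """Assumed semantics: block rules win, and unmatched requests are denied."""
    hits = [p["type"] for p in policies if matches(p, request)]
    if "block" in hits:
        return "deny"
    return "allow" if "allow" in hits else "deny"


POLICIES = [
    {"type": "allow", "condition": {"match": {
        "host": "api.anthropic.com", "method": "POST", "path": "/v1/complete"}}},
    {"type": "block", "condition": {"match": {
        "host": "api.anthropic.com", "method": "POST", "path": "/v1/messages"}}},
]
```

With these two policies, a POST to /v1/complete is allowed, a POST to /v1/messages is denied by the block rule, and a request to any other host or path is denied by default.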

We also get visibility into every request being made to the Anthropic API across our organization!

Formal dashboard showing visibility into API requests made by Claude Code across the organization

Of course, this technique is not specific to safeguarding Anthropic API keys. Proxies enhance two dimensions of the lethal trifecta: