Shadow Warden AI
Enterprise Documentation
Enterprise Integration & Deployment Guide
Version 5.2 · June 2026 · Confidential
Complete deployment playbook covering the 15-layer filter pipeline, Agentic SOC (SOVA + MasterAgent),
Post-Quantum Cryptography, Shadow AI Governance, Explainable AI 2.0, Sovereign Cloud routing,
SEP Communities, Semantic Layer (Headless BI), Settings Hub, API integration, compliance, and high-availability configuration.
01 Executive Summary & Business Case
Shadow Warden AI is a zero-trust AI security gateway deployed inline between your application and every LLM call.
It processes requests in under 8 ms P99, blocks 97.3% of adversarial prompts, strips secrets and PII before they
reach third-party models, and self-improves through an automated Evolution Engine — all without sending sensitive
data to external services.
$2.3M · Average annual breach cost avoided (IBM 2024)
340% · 3-year ROI for mid-market deployments
< 2 ms · Topological gatekeeper P50 latency
v7.1 Key Additions
📊 AI Analytics Hub · 9 built-in Semantic Layer models, Redis query cache (600s TTL), SOVA tools: semantic_query + list_semantic_models + check_commerce_budget
💰 Commerce Budget Guardian · check_budget() runs in every AP2 payment and queries the ai_spend Semantic Layer model for real MTD spend; Slack alert when the budget is exceeded
🏪 M2M Commerce Store · Full seller-side: catalog, inventory, store agent, security layer; UECIID-linked listings; AP2/UCP/MCP protocol support
🗂️ Self-Service Catalog · bootstrap_tenant_models() on startup; register_tenant_model() persists to SQLite + hot-loads into SemanticEngine singleton at /semantic-layer/models/catalog
02 15-Layer Defense Architecture
Every request passes through a deterministic, ordered pipeline. Each layer fails open on non-fatal errors
except the auth guard (fail-closed). Latency per layer is tracked via OpenTelemetry spans.
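The fail-open versus fail-closed behaviour described above can be sketched as follows. This is a minimal illustration, not the gateway's actual code: `Verdict`, `run_pipeline`, and the layer callables are hypothetical names, and the sketch treats every exception raised by a non-auth layer as a non-fatal error.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple

# A layer inspects the request and returns a block reason, or None to pass.
Layer = Callable[[dict], Optional[str]]


@dataclass
class Verdict:
    blocked: bool = False
    reasons: List[str] = field(default_factory=list)
    skipped: List[str] = field(default_factory=list)  # layers that errored


def run_pipeline(request: dict, auth: Layer,
                 layers: List[Tuple[str, Layer]]) -> Verdict:
    verdict = Verdict()

    # Auth guard is fail-closed: a rejection OR an internal error blocks.
    try:
        reason = auth(request)
    except Exception:
        reason = "auth_error"
    if reason is not None:
        verdict.blocked = True
        verdict.reasons.append(reason)
        return verdict

    # Every other layer is fail-open: an error is recorded and the
    # request continues to the next layer in order.
    for name, layer in layers:
        try:
            reason = layer(request)
        except Exception:
            verdict.skipped.append(name)
            continue
        if reason is not None:
            verdict.blocked = True
            verdict.reasons.append(f"{name}:{reason}")
            break
    return verdict
```

Recording skipped layers in the verdict keeps a fail-open error visible to monitoring instead of silently widening the attack surface.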
| Layer | Name | Latency | Description |
| L1 | Auth Guard | < 0.1 ms | Per-tenant API keys (SHA-256 hash, constant-time compare). Fail-closed: a missing key returns 401. |
| L2 | Topological Gatekeeper | < 2 ms | n-gram point cloud → β₀/β₁ Betti numbers via TDA. Rejects structurally anomalous prompts. |
| L3 | Obfuscation Decoder | < 1 ms | Depth-3 recursive decode: base64, hex, ROT13, Caesar, word-split, UUencode, unicode homoglyphs. |
| L4 | Secret Redactor | < 0.5 ms | 15 regex patterns (API keys, JWTs, SSH, credit cards, SSN) + Shannon entropy scan for unknown secrets. |
| L5 | Semantic Rule Engine | < 1 ms | Regex ruleset + compound risk escalation (3+ MEDIUM → HIGH). Hot-reload via dynamic_rules.json. |
| L6 | ML Brain (MiniLM) | 3–6 ms | all-MiniLM-L6-v2 cosine similarity + Poincaré ball hyperbolic distance blend (70/30). Adversarial suffix stripping. |
| L7 | Causal Arbiter | < 0.5 ms | Bayesian DAG: 5 nodes, Pearl do-calculus, backdoor correction. Outputs P(HIGH_RISK given evidence). |
| L8 | Phishing & SE Guard | < 0.5 ms | URL phishing detection + social engineering classification (PhishGuard + SE-Arbiter). |
| L9 | ERS & Shadow Ban | < 0.3 ms | Redis sliding-window rate limiter. Shadow-ban at score ≥ 0.75 (gaslight / delay / standard modes). |
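Three of the mechanisms named in the layer descriptions (hashed keys with constant-time comparison, the Shannon entropy scan, and the 3+ MEDIUM → HIGH escalation) are small enough to sketch with the standard library. The function names are hypothetical, and the source does not state the entropy cutoff used in production, so none is assumed here.

```python
import hashlib
import hmac
import math
from collections import Counter
from typing import Iterable, List


def verify_api_key(presented_key: str, stored_sha256_hex: str) -> bool:
    """Hash the presented key, then compare digests in constant time."""
    digest = hashlib.sha256(presented_key.encode("utf-8")).hexdigest()
    return hmac.compare_digest(digest, stored_sha256_hex)


def shannon_entropy(token: str) -> float:
    """Bits per character; random-looking secrets score higher than prose."""
    if not token:
        return 0.0
    total = len(token)
    counts = Counter(token)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def escalate(severities: Iterable[str]) -> str:
    """Compound risk: three or more MEDIUM findings are treated as HIGH."""
    found: List[str] = list(severities)
    if "HIGH" in found or found.count("MEDIUM") >= 3:
        return "HIGH"
    if "MEDIUM" in found:
        return "MEDIUM"
    return "LOW"
```

`hmac.compare_digest` avoids the early-exit timing signal of `==`, which is the point of the "constant-time compare" noted for the auth guard.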
03 Client Segment Playbooks
Six deployment profiles map organization type to recommended tier, configuration preset, and compliance posture.
A · FinTech / Banking · Enterprise
- PQC signing on all AI transactions
- Sovereign routing: PHI stays US/EU only
- SOC 2 Type II evidence vault
- GDPR Art. 35 DPIA pre-filled
B · Healthcare / Life Sciences · Enterprise
- PHI data pod jurisdiction lock
- HIPAA audit chain (STIX 2.1)
- Secret redactor covers HL7/FHIR tokens
- Causal Transfer Guard blocks exfiltration
- Per-tenant API keys + rate limits
- MasterAgent SOC coordinator
- Shadow AI Discovery for rogue models
- XAI audit reports per request
D · Government / Public Sector · Enterprise
- MASQUE H3 sovereign tunnels
- 8-jurisdiction transfer matrix
- Post-quantum key encapsulation (ML-KEM-768)
- Air-gapped Evolution Engine mode
E · SMB / Startup · Community Business
- Single Docker Compose deploy
- SMB Governance Suite add-on
- AI Budget Dashboard + Vendor Registry
- Obsidian Plugin for note security
F · AI-Native / LLM Platform · Pro + Add-ons
- OpenAI-compatible proxy endpoint
- LangChain WardenCallback
- SOVA scheduled threat sync (every 6h)
- Evolution Engine with ArXiv auto-synthesis
04 Integration Patterns
Pattern A — Direct REST Filter
Send every AI request through POST /filter before passing to your LLM. Synchronous, <8 ms P99.
curl -X POST https://api.shadow-warden-ai.com/filter \
-H "X-API-Key: $WARDEN_KEY" \
-H "Content-Type: application/json" \
-d '{"content":"<user prompt>","tenant_id":"acme"}'
# → {"blocked":false,"risk_score":0.12,"redacted_content":"...","processing_ms":4.7}
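The same call can be made from application code. The sketch below uses only the Python standard library and the request and response fields shown in the curl example; the helper names (`build_filter_request`, `safe_prompt`, `filter_prompt`) and the 2-second timeout are illustrative assumptions, as is the choice to treat a response without a `blocked` field as blocked.

```python
import json
import urllib.request
from typing import Optional

FILTER_URL = "https://api.shadow-warden-ai.com/filter"  # from the curl example


def build_filter_request(content: str, tenant_id: str,
                         api_key: str) -> urllib.request.Request:
    """Build the POST /filter request without sending it."""
    body = json.dumps({"content": content, "tenant_id": tenant_id})
    return urllib.request.Request(
        FILTER_URL,
        data=body.encode("utf-8"),
        method="POST",
        headers={"X-API-Key": api_key, "Content-Type": "application/json"},
    )


def safe_prompt(verdict: dict) -> Optional[str]:
    """Return the redacted prompt to forward to the LLM, or None if blocked."""
    # Assumption: a malformed response (no "blocked" field) counts as blocked.
    if verdict.get("blocked", True):
        return None
    return verdict["redacted_content"]


def filter_prompt(content: str, tenant_id: str, api_key: str,
                  timeout: float = 2.0) -> Optional[str]:
    """Send the prompt through the gateway and return what may be forwarded."""
    request = build_filter_request(content, tenant_id, api_key)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return safe_prompt(json.load(response))
```

Forwarding `redacted_content` rather than the original prompt is what keeps stripped secrets and PII away from the downstream model.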
Pattern B — OpenAI-Compatible Proxy
Drop-in replacement for the OpenAI base URL: point the client at the gateway and leave the rest of your code unchanged. Streaming is supported with a 400-character fast-scan buffer.
from openai import OpenAI

client = OpenAI(
    base_url="https://api.shadow-warden-ai.com/v1",  # route calls through the gateway
    api_key="your-openai-key",
)
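The guide does not describe how the fast-scan buffer works internally. One plausible reading, sketched below purely as an assumption, is that the proxy holds back the first 400 characters of a stream, scans them once, and only then starts relaying chunks. `guarded_stream` and the `is_safe` callback are hypothetical names.

```python
from typing import Callable, Iterable, Iterator

SCAN_WINDOW = 400  # characters, the buffer size quoted in the guide


def guarded_stream(chunks: Iterable[str],
                   is_safe: Callable[[str], bool],
                   window: int = SCAN_WINDOW) -> Iterator[str]:
    """Hold back the first `window` characters, scan once, then relay."""
    buffer = ""
    released = False
    for chunk in chunks:
        if released:
            yield chunk  # scan already passed: relay without delay
            continue
        buffer += chunk
        if len(buffer) >= window:
            if not is_safe(buffer):
                return  # nothing has been emitted yet, so drop the stream
            released = True
            yield buffer
    # Stream ended before the window filled: scan whatever was collected.
    if not released and buffer and is_safe(buffer):
        yield buffer
```

Under this reading the only added latency is the time to accumulate the first window; later chunks pass through untouched.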
Pattern C — LangChain Callback
Duck-typed callback handler. Attach to any LangChain chain or agent — no changes to existing code.
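from langchain.chains import LLMChain
from warden.integrations.langchain_callback import WardenCallback

# `chat` (the chat model) and `prompt` (a prompt template) are defined elsewhere in your application.
chain = LLMChain(
    llm=chat,
    prompt=prompt,
    callbacks=[WardenCallback(api_key="...", tenant_id="acme")],
)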
Pattern D — Agentic SOC API (v4.0+)
Trigger MasterAgent SOC investigation. Sub-agents run in parallel; high-impact actions require Slack approval.
# Trigger MasterAgent
curl -X POST https://api.shadow-warden-ai.com/agent/master \
  -H "X-API-Key: $WARDEN_KEY" \
  -H "Content-Type: application/json" \
  -d '{"query":"Investigate bypass spike last 2h"}'
# Approve a pending high-impact action (replace {token} with the approval token)
curl -X POST "https://api.shadow-warden-ai.com/agent/approve/{token}?action=approve" \
  -H "X-API-Key: $WARDEN_KEY"
05 Deployment Architecture
Shadow Warden runs as 11 Docker services behind a Caddy v2 reverse proxy with QUIC/HTTP3 support.
The application services are stateless; durable data lives in Redis, PostgreSQL, and the MinIO object store.
| Service | Port | Role |
| proxy (Caddy) | 80 / 443 | TLS termination, HTTP/3 (QUIC), vhost routing |
| warden | 8001 | Core FastAPI gateway — all /filter, /agent, /sep endpoints |
| app | 8000 | Streamlit analytics dashboard |
| analytics | 8002 | Analytics REST adapter |
| dashboard | 3002 | Next.js 14 SOC Dashboard |
| postgres | 5432 | Persistent event store (optional) |
| redis | 6379 | ERS, shadow-ban, cache, SOVA memory |
| prometheus | 9090 | Metrics scrape |
| grafana | 3000 | Dashboards + SLO alerts |
| minio | 9000 / 9001 | S3-compatible object store — evidence vault, patrol video |
| minio-init | — | One-shot bucket provisioning init container |
Quick Start
git clone https://github.com/shadow-warden-ai/shadow-warden.git
cd shadow-warden
cp .env.example .env && nano .env
docker compose up --build -d
curl http://localhost:8001/health
# → {"status":"ok","version":"5.2.0","model_loaded":true}
06 Dollar Impact Calculator
Financial impact is calculated across five independent cost models using IBM 2024 benchmarks.
Run python scripts/impact_analysis.py --live --industry fintech for real-time figures.
| Cost model | Basis | Estimated impact |
| Breach Prevention | Blocks average AI-enabled breach ($2.3M IBM median). Multiplier 1.6× for regulated industries. | $2.3M – $3.7M |
| Secret Leak Prevention | 15-pattern + entropy scan catches API keys, JWTs, PII before reaching third-party LLMs. | $180K – $450K |
| Compliance Penalty Avoidance | GDPR Art. 83 max €20M or 4% revenue. SOC 2 enables $500K–$2M contract eligibility. | $20K – $2M |
| Productivity Recovery | Eliminates manual prompt review. ~$1,200/employee/year at 1 FTE per 500 users. | $120K – $600K |
| Shadow AI Cost Control | Detects unauthorized AI spend. Median $47K/yr untracked LLM costs per 200-person org. | $47K – $190K |
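The Breach Prevention range follows directly from the two figures quoted for it: the $2.3M baseline and the 1.6× regulated-industry multiplier. The short check below reproduces that arithmetic; the constant and function names are illustrative and are not taken from scripts/impact_analysis.py.

```python
BASE_BREACH_COST = 2_300_000   # USD, the IBM figure quoted for Breach Prevention
REGULATED_MULTIPLIER = 1.6     # uplift quoted for regulated industries


def breach_prevention_range(base: int = BASE_BREACH_COST,
                            multiplier: float = REGULATED_MULTIPLIER):
    """Return (low, high): the baseline and the regulated-industry figure."""
    return base, round(base * multiplier)


low, high = breach_prevention_range()
print(f"${low / 1e6:.1f}M – ${high / 1e6:.1f}M")  # prints "$2.3M – $3.7M"
```

The upper bound is $3.68M before rounding, which is where the $3.7M in the range comes from.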
| Segment | Headline metric | Scope |
| Starter / SMB | 5-year NPV $380K | Shadow AI + breach prevention |
| Mid-Market Pro | 3-year ROI 340% | Full 5-model stack |
| Enterprise | Payback < 4 months | Regulated + PQC + Sovereign |
07 Agentic SOC — SOVA & MasterAgent
Shadow Warden AI v7.1 ships two autonomous AI operators running on Claude Opus 4.6 with 30+ tools. SOVA handles
scheduled threat intelligence, health monitoring, and visual patrol. MasterAgent decomposes complex SOC
investigations into parallel sub-agent workstreams with HMAC task tokens and human approval gates.
SOVA Agent
Claude Opus 4.6 · 30 tools · ARQ cron scheduler
✓ Morning Brief — 08:00 UTC daily — threat summary + health report
✓ Threat Sync — Every 6h — CVE triage + ArXiv paper analysis
✓ Visual Patrol — 03:00 UTC — screenshot + Claude Vision review
✓ Corpus Watchdog — Every 30 min — WardenHealer anomaly detection
✓ SLA Report — Monday 09:00 UTC — P99 + availability metrics
✓ Key Rotation — 02:00 UTC daily — credential lifecycle check
MasterAgent SOC Coordinator
v4.0 · 4 sub-agents · Human-in-the-loop approval
SOVAOperator: health, stats, billing, key rotation
ThreatHunter: CVE triage, ArXiv intel, adversarial analysis
ForensicsAgent: activity logs, GDPR Art. 30, Evidence Vault
ComplianceAgent: SLA monitors, SOC 2 controls, ROI proposals
HMAC-SHA256 task tokens prevent cross-agent injection. High-impact actions post to Slack and wait for approval before execution.
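A task token of this kind can be sketched with the standard library. The payload layout below (agent name plus task, serialised as canonical JSON) is an assumption for illustration; the guide does not document the real token format, and `sign_task` and `verify_task` are hypothetical names.

```python
import hashlib
import hmac
import json


def sign_task(secret: bytes, agent: str, task: dict) -> str:
    """Bind a task to one sub-agent with an HMAC-SHA256 tag (hex)."""
    # Canonical JSON so the same task always produces the same bytes.
    payload = json.dumps({"agent": agent, "task": task},
                         sort_keys=True, separators=(",", ":"))
    return hmac.new(secret, payload.encode("utf-8"), hashlib.sha256).hexdigest()


def verify_task(secret: bytes, agent: str, task: dict, token: str) -> bool:
    """Recompute the tag and compare in constant time."""
    return hmac.compare_digest(sign_task(secret, agent, task), token)
```

Because the agent name is part of the signed payload, a token issued to one sub-agent does not verify if another sub-agent replays it, which is the cross-agent injection case the tokens are meant to stop.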
Explainable AI 2.0 — Causal Chain Reports
Every blocked or flagged request generates a 9-stage causal chain: which layer fired, the primary cause,
and counterfactual remediations. Available as HTML or PDF via GET /xai/report/{id}/pdf.
topology → obfuscation → secrets → semantic_rules → brain → causal → phish → ers → decision
08 Deployment Sizing Guide
| Tier | Requests/day | CPU | RAM | Disk | Notes |
| Dev / Staging | < 5K | 2 vCPU | 4 GB | 20 GB | Single compose, no MinIO |
| Small Production | 5K – 50K | 4 vCPU | 8 GB | 60 GB | Add Redis persistence |
| Mid-Market | 50K – 500K | 8 vCPU | 16 GB | 200 GB | Horizontal warden scaling |
| Enterprise | 500K+ | 16+ vCPU | 32 GB | 500 GB+ | K8s Helm chart, HA Redis Cluster |
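For capacity planning scripts, the table can be encoded as a simple lookup. This sketch assumes each upper bound is exclusive (so exactly 50K requests/day maps to Mid-Market), which the table leaves ambiguous; `recommend_tier` is a hypothetical helper.

```python
from bisect import bisect_right

# Tier boundaries in requests/day, copied from the sizing table.
_BOUNDS = [5_000, 50_000, 500_000]
_TIERS = ["Dev / Staging", "Small Production", "Mid-Market", "Enterprise"]


def recommend_tier(requests_per_day: int) -> str:
    """Map an expected daily request volume to a sizing tier."""
    if requests_per_day < 0:
        raise ValueError("requests_per_day must be non-negative")
    # bisect_right counts how many boundaries the volume has reached.
    return _TIERS[bisect_right(_BOUNDS, requests_per_day)]
```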
Model Loading
MiniLM loads once via @lru_cache. Pre-warmed in FastAPI lifespan. ~80 MB RAM. Docker volume warden-models persists ONNX across rebuilds.
Redis
Required for ERS, shadow-ban, and SOVA memory. Use memory:// for tests. Redis Sentinel recommended for HA production.
Evolution Engine
Optional. Requires ANTHROPIC_API_KEY. Air-gap mode: all detection works, Evolution disabled. Set ANTHROPIC_API_KEY="" to disable.
09 GDPR & Compliance Checklist
GDPR
✓ Art. 5(1)(c) — data minimisation: metadata only, never content
✓ Art. 17 — right to erasure: DELETE /gdpr/purge/{request_id}
✓ Art. 35 — DPIA pre-filled template in docs/dpia.md
✓ Art. 30 — ForensicsAgent generates processing record automatically
✓ Art. 83 — audit chain timestamps defensible in proceedings
SOC 2 Type II
✓ CC6.1 — per-tenant API key isolation, constant-time compare
✓ CC6.7 — all traffic TLS 1.3+, Caddy HSTS, optional PQC layer
✓ CC7.2 — SOVA scheduled health monitoring, Grafana SLO alerts
✓ CC9.2 — Evidence Vault: MinIO WORM-compatible storage
✓ A1.2 — ScreencastRecorder video evidence → MinIO SOC 2 trail
Post-Quantum (v4.1+)
✓ ML-DSA-65 (FIPS 204) hybrid signatures on AI transactions
✓ ML-KEM-768 (FIPS 203) key encapsulation for sovereign tunnels
✓ liboqs fail-open — classical Ed25519 fallback if library missing
✓ Enterprise-tier only (PQC gate in TIER_LIMITS)
✓ Hybrid key ID convention: v1-hybrid suffix on PQC keypairs
Analytics & Commerce (v7.1)
✓ AI Analytics Hub: 9 built-in Semantic Layer models, Redis query cache
✓ Self-Service Catalog: tenant model registry, SQLite persistence, hot-reload
✓ Commerce Budget Guardian: real MTD spend from ai_spend model, Slack alerts
✓ M2M Commerce Store: seller-side catalog + inventory + UECIID provenance
✓ STIX 2.1 tamper-evident audit chain links all AI incidents and transfers
10 Configuration Reference
All configuration is via environment variables. Copy .env.example to .env and override as needed.
Core Gateway
| Variable | Type / Default | Description |
| WARDEN_API_KEY | string | Primary API key. Fail-closed if unset and ALLOW_UNAUTHENTICATED != true. |
| WARDEN_API_KEYS_PATH | path | JSON file with per-tenant key list (multi-tenant mode). |
| SEMANTIC_THRESHOLD | 0.72 | MiniLM cosine similarity cutoff (0.0–1.0). Lower = stricter. |
| STRICT_MODE | false | true = block on any non-PASS stage. false = block only on explicit BLOCK verdict. |
| REDIS_URL | redis://redis:6379 | Redis connection string. Use memory:// for tests. |
| LOGS_PATH | /data/logs.json | NDJSON event log. Content never written — metadata only (GDPR). |
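The semantics in this table can be made concrete with a small loader. This is a sketch under stated assumptions: "fail-closed" is modelled here as refusing to start, the verdict labels `PASS` and `BLOCK` are taken from the STRICT_MODE description, and `GatewayConfig`, `load_config`, and `should_block` are hypothetical names rather than the gateway's real API.

```python
import os
from dataclasses import dataclass
from typing import List, Mapping, Optional


@dataclass(frozen=True)
class GatewayConfig:
    api_key: Optional[str]
    semantic_threshold: float
    strict_mode: bool
    redis_url: str
    logs_path: str


def _as_bool(value: str) -> bool:
    return value.strip().lower() == "true"


def load_config(env: Mapping[str, str] = os.environ) -> GatewayConfig:
    api_key = env.get("WARDEN_API_KEY") or None
    allow_unauth = _as_bool(env.get("ALLOW_UNAUTHENTICATED", "false"))
    # Fail-closed: no key and no explicit opt-out means we refuse to run.
    if api_key is None and not allow_unauth:
        raise RuntimeError("WARDEN_API_KEY is unset and "
                           "ALLOW_UNAUTHENTICATED is not true")
    threshold = float(env.get("SEMANTIC_THRESHOLD", "0.72"))
    if not 0.0 <= threshold <= 1.0:
        raise ValueError("SEMANTIC_THRESHOLD must be within 0.0-1.0")
    return GatewayConfig(
        api_key=api_key,
        semantic_threshold=threshold,
        strict_mode=_as_bool(env.get("STRICT_MODE", "false")),
        redis_url=env.get("REDIS_URL", "redis://redis:6379"),
        logs_path=env.get("LOGS_PATH", "/data/logs.json"),
    )


def should_block(stage_verdicts: List[str], strict: bool) -> bool:
    """STRICT_MODE=true blocks on any non-PASS stage; otherwise only on BLOCK."""
    if strict:
        return any(v != "PASS" for v in stage_verdicts)
    return "BLOCK" in stage_verdicts
```

Passing the environment as a mapping keeps the loader testable without mutating `os.environ`.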
Agentic SOC
| Variable | Type / Default | Description |
| ANTHROPIC_API_KEY | string | Required for SOVA, MasterAgent, Evolution Engine, WardenHealer. Omit to disable all agents. |
| ADMIN_KEY | string | X-Admin-Key header for billing admin endpoints (grant/revoke add-ons). |
| SLACK_WEBHOOK_URL | https://... | Slack incoming webhook for SOVA alerts and MasterAgent approval requests. |
| PATROL_URLS | comma-separated | Extra URLs for sova_visual_patrol in addition to defaults. |
Security & Crypto
| Variable | Type / Default | Description |
| VAULT_MASTER_KEY | Fernet key | Base64 32-byte Fernet key. Required for community keypairs and sovereign pod encryption. |
| SOVEREIGN_ATTEST_KEY | string | HMAC key for sovereignty attestations. Falls back to VAULT_MASTER_KEY. |
| COMMUNITY_VAULT_KEY | string | Key for encrypting sovereign data pod secret keys. |
| TRANSFER_RISK_THRESHOLD | 0.70 | CausalArbiter P(exfiltration) threshold for SEP transfer blocking. |
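A value for VAULT_MASTER_KEY can be produced with the standard library alone, since a Fernet key is 32 random bytes in URL-safe base64 (the same format `cryptography.fernet.Fernet.generate_key()` returns). The helper names below are illustrative.

```python
import base64
import os


def generate_fernet_key() -> str:
    """Return 32 random bytes encoded as URL-safe base64 (44 characters)."""
    return base64.urlsafe_b64encode(os.urandom(32)).decode("ascii")


def is_valid_fernet_key(key: str) -> bool:
    """Check that a key decodes to exactly 32 bytes."""
    try:
        return len(base64.urlsafe_b64decode(key.encode("ascii"))) == 32
    except (ValueError, UnicodeEncodeError):
        # binascii.Error (bad padding) is a ValueError subclass.
        return False


if __name__ == "__main__":
    print(f"VAULT_MASTER_KEY={generate_fernet_key()}")
```

Validating the key at startup gives a clearer error than a decryption failure deep inside the vault code.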
Shadow AI & Storage
| Variable | Type / Default | Description |
| SHADOW_AI_CONCURRENCY | 50 | Max concurrent probes per subnet scan (max prefix /24). |
| SHADOW_AI_PROBE_TIMEOUT | 3 | Per-host timeout in seconds. |
| S3_ENABLED | false | true = enable MinIO/S3 evidence vault and patrol video upload. |
| S3_ENDPOINT_URL | http://minio:9000 | S3-compatible endpoint URL. |
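The two Shadow AI settings map naturally onto a semaphore and a per-host timeout. The sketch below is illustrative only: `hosts_for_scan`, `scan`, and the `probe` coroutine are hypothetical, and "max prefix /24" is read as "reject any network larger than a /24".

```python
import asyncio
import ipaddress
from typing import Awaitable, Callable, List

MAX_PREFIX = 24  # networks larger than /24 are rejected (assumed reading)


def hosts_for_scan(cidr: str) -> List[str]:
    """Expand an IPv4 CIDR into host addresses, enforcing the /24 cap."""
    network = ipaddress.IPv4Network(cidr, strict=False)
    if network.prefixlen < MAX_PREFIX:
        raise ValueError(f"{cidr} is larger than /{MAX_PREFIX}")
    return [str(host) for host in network.hosts()]


async def scan(cidr: str,
               probe: Callable[[str], Awaitable[bool]],
               concurrency: int = 50,      # SHADOW_AI_CONCURRENCY
               timeout: float = 3.0,       # SHADOW_AI_PROBE_TIMEOUT
               ) -> List[str]:
    """Probe every host, at most `concurrency` at a time."""
    semaphore = asyncio.Semaphore(concurrency)

    async def guarded(host: str) -> bool:
        async with semaphore:
            try:
                return await asyncio.wait_for(probe(host), timeout)
            except asyncio.TimeoutError:
                return False  # an unresponsive host counts as "nothing found"

    hosts = hosts_for_scan(cidr)
    results = await asyncio.gather(*(guarded(h) for h in hosts))
    return [h for h, found in zip(hosts, results) if found]
```

With the defaults, a full /24 (254 hosts) needs at most six rounds of 50 probes, so the worst case is roughly 18 seconds when every host times out.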
Observability
| Variable | Type / Default | Description |
| OTEL_ENABLED | false | Enable OpenTelemetry distributed tracing. Zero overhead when false. |
| OTEL_EXPORTER_OTLP_ENDPOINT | http://otel-collector:4317 | gRPC collector endpoint. |
| INTEL_OPS_ENABLED | false | Enable ArXiv + OSV threat intel background sync. |
| INTEL_BRIDGE_INTERVAL_HRS | 6 | Hours between ArXiv → Evolution Engine sync cycles. |
AI Analytics Hub (v7.1)
| Variable | Type / Default | Description |
| SEMANTIC_DB_PATH | /tmp/warden_semantic.db | SQLite path for Self-Service Catalog tenant model registry. |
| SEMANTIC_CACHE_TTL | 600 | Redis query cache TTL in seconds for SemanticEngine.generate(). |
| COMMERCE_BUDGET_ENABLED | true | Enable Budget Guardian check on every AP2 payment flow. |
| M2M_STORE_DB_PATH | /tmp/warden_m2m.db | SQLite path for M2M Commerce Store catalog + inventory. |