Self-Hosted Deployment
Run the smoltbot gateway on your own infrastructure for full data residency control. All traces, integrity checkpoints, and agent data stay within your network. The self-hosted gateway is a Node.js adapter that runs the same code as the managed Cloudflare Workers service: identical behavior, your infrastructure.

Self-hosted deployment requires an Enterprise license. Contact us to obtain a license key. Enterprise includes hybrid analysis mode, SSO/SAML integration, and dedicated support.
Deployment options
| | Managed (Cloud) | Docker Compose | Kubernetes (Helm) |
|---|---|---|---|
| Best for | Most teams | Small teams, eval, dev | Production at scale |
| Infrastructure | None (Mnemom hosts) | Single VM or server | K8s cluster |
| Setup time | Minutes | ~10 minutes | ~30 minutes |
| Scaling | Automatic | Manual | HPA auto-scaling |
| Data residency | Mnemom cloud | Your infrastructure | Your infrastructure |
| High availability | Built-in | Single node | Multi-replica, PDB |
| Monitoring | Dashboard | Prometheus + logs | Prometheus + ServiceMonitor |
Prerequisites
- An Enterprise license JWT from mnemom.ai/dashboard
- An Anthropic API key (required for AIP integrity analysis)
- Optional: OpenAI and Gemini API keys for multi-provider tracing
Quick Start: Docker Compose
The fastest way to get a self-hosted gateway running. Includes PostgreSQL, Redis, and automatic database migrations.
Requirements
- Docker 24+ and Docker Compose v2+
- 2 GB RAM minimum, 4 GB recommended
- 10 GB disk space
Configure environment
Copy the example environment file and fill in your credentials. Edit `.env` and set the required values.
Start the stack
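Before bringing the stack up, the environment step above can be sketched as follows. The variable names come from the Configuration Reference below; every value here is a placeholder to replace with your own credentials:

```shell
# Write a starter .env; all values below are placeholders.
cat > .env <<'EOF'
# Required (see Configuration Reference)
SUPABASE_URL=https://your-project.supabase.co
SUPABASE_KEY=your-service-role-key
MNEMOM_LICENSE_JWT=your-enterprise-license-jwt
ANTHROPIC_API_KEY=your-anthropic-api-key

# Optional infrastructure
REDIS_URL=redis://redis:6379
PORT=8787
LOG_LEVEL=info
EOF
```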
- PostgreSQL — database with health check
- Redis — caching layer with persistence
- Migrate — applies database schema (runs once, then exits)
- Gateway — HTTP proxy on port 8787
- Observer — background scheduler for trace processing
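With `.env` in place, the stack can be brought up and spot-checked roughly like this (service names and the port as described above):

```shell
# Start all five services in the background
docker compose up -d

# The migrate service runs once and exits; everything else should report healthy
docker compose ps

# The gateway listens on port 8787; readiness checks Redis, PostgreSQL, and the license
curl -fsS http://localhost:8787/health/ready
```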
Production: Kubernetes with Helm
For production deployments with auto-scaling, high availability, and monitoring.
Requirements
- Kubernetes 1.27+
- Helm 3.12+
- `kubectl` configured for your cluster
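A sketch of an install, assuming a chart named `smoltbot` in a `mnemom` Helm repository and a Secret holding the required variables; the repo URL, chart name, and Secret layout are all assumptions, so substitute your actual chart source:

```shell
# Repo URL and chart name are assumptions; substitute your actual chart source
helm repo add mnemom https://charts.mnemom.ai
helm repo update

# Secret name and keys are assumptions; see the Configuration Reference
kubectl create namespace smoltbot
kubectl -n smoltbot create secret generic smoltbot-env \
  --from-literal=SUPABASE_URL="$SUPABASE_URL" \
  --from-literal=SUPABASE_KEY="$SUPABASE_KEY" \
  --from-literal=MNEMOM_LICENSE_JWT="$MNEMOM_LICENSE_JWT" \
  --from-literal=ANTHROPIC_API_KEY="$ANTHROPIC_API_KEY"

helm install smoltbot mnemom/smoltbot -n smoltbot
```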
What the chart deploys
- Gateway Deployment (2 replicas by default) — HTTP proxy with liveness, readiness, and startup probes
- Observer Deployment (1 replica) — background scheduler for trace processing
- Migration Job — Helm pre-install/pre-upgrade hook that applies database migrations
- Service — ClusterIP on port 8787
- NetworkPolicy — deny-all default with explicit allows for ingress, Redis, PostgreSQL, and upstream LLM APIs
- PodDisruptionBudget — ensures at least 1 replica during rolling updates
- Optional: Ingress with TLS, HPA, ServiceMonitor for Prometheus
Scaling
Enable the HorizontalPodAutoscaler for automatic scaling.
Architecture
In self-hosted mode, a Node.js adapter layer replaces Cloudflare-specific APIs while running the exact same gateway code:

| Cloudflare API | Self-Hosted Replacement |
|---|---|
| KV Namespace | Redis (with in-memory fallback) |
| `ctx.waitUntil()` | Promise collection with drain after response |
| AI Gateway URL routing | Fetch interceptor rewriting to upstream APIs |
| `ExecutionContext` | Node.js shim with fire-and-forget semantics |
Configuration Reference
Required
| Variable | Description |
|---|---|
| `SUPABASE_URL` | Supabase project URL or PostgreSQL REST endpoint |
| `SUPABASE_KEY` | Supabase service-role key |
| `MNEMOM_LICENSE_JWT` | Enterprise license JWT from mnemom.ai/dashboard |
| `ANTHROPIC_API_KEY` | Anthropic API key (required for AIP analysis) |
Optional: Providers
| Variable | Default | Description |
|---|---|---|
| `OPENAI_API_KEY` | — | OpenAI API key for multi-provider routing |
| `GEMINI_API_KEY` | — | Google Gemini API key for multi-provider routing |
Optional: Hybrid Analysis
| Variable | Default | Description |
|---|---|---|
| `MNEMOM_ANALYZE_URL` | — | Delegate AIP analysis to Mnemom cloud (https://api.mnemom.ai/v1/analyze) |
| `MNEMOM_API_KEY` | — | Mnemom API key with analyze scope (required when `MNEMOM_ANALYZE_URL` is set) |
Optional: Infrastructure
| Variable | Default | Description |
|---|---|---|
| `REDIS_URL` | — | Redis connection URL. Without Redis, an in-memory KV adapter is used (single-node only). |
| `PORT` | 8787 | HTTP listen port |
| `HOST` | 0.0.0.0 | HTTP bind address |
| `SMOLTBOT_ROLE` | all | `gateway` (HTTP only), `scheduler` (cron only), or `all` (both) |
| `LOG_LEVEL` | info | `debug`, `info`, `warn`, or `error`. Structured JSON to stdout. |
Health Endpoints
Three Kubernetes-standard probes:

| Endpoint | Purpose | Behavior |
|---|---|---|
| `/health/live` | Liveness probe | Always 200 unless deadlocked |
| `/health/ready` | Readiness probe | Checks Redis, PostgreSQL, and license validity |
| `/health/startup` | Startup probe | Returns 503 until initialization complete |
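Port-forward (or curl from inside the cluster) to spot-check the probes; the endpoint paths and behaviors are from the table above:

```shell
# Liveness: expect 200 whenever the process is responsive
curl -i http://localhost:8787/health/live

# Readiness: 200 only when Redis, PostgreSQL, and the license all check out
curl -i http://localhost:8787/health/ready

# Startup: 503 until initialization completes, then 200
curl -i http://localhost:8787/health/startup
```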
Prometheus Metrics
The gateway exposes a `/metrics` endpoint with:
- `gateway_requests_total{provider,status}` — request counter
- `gateway_request_duration_seconds{provider}` — latency histogram
- `gateway_aip_checks_total{verdict}` — integrity check counter
- `gateway_cache_operations_total{operation,result}` — cache hit/miss counter
- Standard `process_*` and `nodejs_*` metrics
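A quick local check of the endpoint and the counters listed above:

```shell
# Metric names as listed above; the port defaults to 8787
curl -s http://localhost:8787/metrics | grep -E '^gateway_(requests|aip_checks)_total'
```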
To scrape these with Prometheus, enable the ServiceMonitor in values.yaml.
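A sketch of the relevant values, assuming the chart exposes a serviceMonitor toggle (key names are assumptions; check the chart's values schema):

```yaml
# values.yaml fragment; key names are assumed
serviceMonitor:
  enabled: true
  interval: 30s
```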
Upgrading
Docker Compose
Pull the new images and restart; schema migrations are applied automatically by the `migrate` service.
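For a Compose deployment the upgrade is typically:

```shell
# Fetch new images, then recreate containers; the migrate service
# re-runs schema migrations and exits
docker compose pull
docker compose up -d
```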
Helm
Upgrade the release; the pre-upgrade migration Job applies database migrations before the new pods roll out.
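For Helm (release and chart names here are assumptions):

```shell
helm repo update
# The pre-upgrade hook Job applies migrations before new pods roll out
helm upgrade smoltbot mnemom/smoltbot -n smoltbot
```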
Troubleshooting
Gateway won't start — EnvValidationError
A required environment variable is missing. Check the error message for which variable, then verify your `.env` file or Kubernetes Secret.
Redis connection refused
- Docker Compose: ensure the `redis` service is healthy (`docker compose ps`)
- Kubernetes: verify `REDIS_URL` in your Secret points to a reachable Redis instance
- Without Redis, the gateway falls back to in-memory KV (single-node only)
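A quick connectivity check, using the `redis` service name from the Compose stack:

```shell
# Should print PONG when Redis is reachable
docker compose exec redis redis-cli ping
```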
License validation failed
- Verify `MNEMOM_LICENSE_JWT` is set and not expired
- Check `/health/ready` for the specific license error
- Contact support@mnemom.ai for license reissuance
Upstream LLM API errors (401/403)
- Verify your API keys are correct and have sufficient credits
- The gateway proxies directly to provider APIs — ensure outbound HTTPS (port 443) is allowed
- In Kubernetes, check the NetworkPolicy allows egress to `0.0.0.0/0:443`
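One way to confirm egress from inside the cluster (the pod name and image are arbitrary; `curlimages/curl` runs curl as its entrypoint):

```shell
# Any HTTP status code means egress works; a hang or timeout means it is blocked
kubectl run egress-test --rm -i --restart=Never --image=curlimages/curl -- \
  -sI https://api.anthropic.com -o /dev/null -w '%{http_code}\n'
```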
High memory / OOMKilled
- Increase container memory limits (512Mi minimum, 1Gi recommended for high traffic)
- If using in-memory KV, switch to Redis to reduce memory pressure
- Set `NODE_OPTIONS=--max-old-space-size=768` for fine-grained heap control
Next steps
- Smoltbot overview — architecture and components
- Enforcement modes — observe, nudge, and enforce
- Observability guide — dashboards and alerting
- Security model — trust boundaries and threat model