Smoltbot supports three enforcement modes that control how the system responds when alignment or integrity violations are detected. You can choose the level of intervention appropriate for your use case — from passive observation to active blocking.
In observe mode, smoltbot detects and records all violations but takes no action to modify agent behavior. This is the default mode for all new agents.
Behavior:
All API calls pass through unchanged
Violations are detected and recorded in the trace database
Integrity checkpoints are created for every interaction
Drift alerts are generated when behavioral patterns shift
No modification to agent requests or responses
When to use:
During initial deployment to establish behavioral baselines
When you want to monitor without affecting agent behavior
For compliance auditing where you need a record but not intervention
When evaluating whether to enable more active enforcement
In nudge mode, when a violation is detected, smoltbot creates a pending nudge record. On the agent’s next request, the gateway injects an integrity notice into the system prompt. The agent sees the notice, can review its approach, and self-correct.
Behavior:
Violations are detected and recorded
A pending nudge record is created for each violation
On the next API request, an integrity notice is injected into the system prompt
The agent receives the nudge and can self-correct
Nudge delivery is tracked in the conscience timeline
The original request passes through (not blocked)
How nudges work:
The gateway detects a violation in real-time via streaming response analysis (tee() + background AIP check, ~3s latency)
A pending nudge record is created immediately — ready before the agent’s next turn
On the agent’s next API call, the gateway injects a system prompt addition:
[INTEGRITY NOTICE] Your previous action was flagged for:
UNBOUNDED_ACTION (HIGH) — action "delete_all_records" is not
in your declared bounded_actions. Please review your approach.
The agent processes the notice alongside its normal instructions
The nudge is marked as delivered in the conscience timeline
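The steps above can be sketched as a small in-memory model. This is an illustration only, not smoltbot's implementation: `PendingNudge`, `NudgeStore`, and their methods are hypothetical names, and a real gateway would persist records rather than hold them in a dictionary.

```python
from dataclasses import dataclass, field


@dataclass
class PendingNudge:
    """Hypothetical record created when a violation is detected."""
    violation_type: str  # e.g. "UNBOUNDED_ACTION"
    severity: str        # e.g. "HIGH"
    detail: str          # human-readable explanation
    delivered: bool = False


@dataclass
class NudgeStore:
    """Hypothetical in-memory stand-in for the pending nudge records."""
    pending: dict = field(default_factory=dict)  # agent_id -> list of PendingNudge

    def record(self, agent_id: str, nudge: PendingNudge) -> None:
        # Created as soon as the violation is detected, before the next turn.
        self.pending.setdefault(agent_id, []).append(nudge)

    def inject(self, agent_id: str, system_prompt: str) -> str:
        """Append undelivered nudges to the system prompt and mark them delivered."""
        nudges = [n for n in self.pending.get(agent_id, []) if not n.delivered]
        if not nudges:
            return system_prompt  # nothing pending: prompt passes through unchanged
        lines = ["[INTEGRITY NOTICE] Your previous action was flagged for:"]
        for n in nudges:
            # "\u2014" is the dash used in the notice format shown above.
            lines.append(f"{n.violation_type} ({n.severity}) \u2014 {n.detail}")
            n.delivered = True
        lines.append("Please review your approach.")
        notice = "\n".join(lines)
        return f"{system_prompt}\n\n{notice}" if system_prompt else notice
```

Calling `inject` a second time for the same agent returns the prompt unchanged, because each nudge is delivered once.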
When to use:
When you want the agent to self-correct without hard blocking
For agents in production where availability is critical
When building trust in the enforcement system before enabling hard blocks
For advisory or delegated_authority principal relationships
In enforce mode, smoltbot actively blocks requests when critical violations are detected. The behavior depends on whether the request is streaming or non-streaming.
Behavior:
Non-streaming requests: Hard block with HTTP 403 response. The request does not reach the AI provider.
Streaming requests: Falls back to nudge mode (injects integrity notice into system prompt). This is because streaming connections cannot be cleanly terminated mid-stream.
When to use:
For high-security environments where violations must be prevented
When operating under fail_closed AIP failure policy
For agents handling sensitive data or financial transactions
When regulatory requirements demand active intervention
Enforce mode blocks non-streaming API calls with an HTTP 403 response when CRITICAL or HIGH severity violations are detected. Ensure your application handles 403 responses gracefully before enabling this mode.
Use threshold mode to avoid alert fatigue. The agent only receives a nudge after repeated violations in the same session, giving it a chance to self-correct naturally first.
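A minimal sketch of the threshold idea, assuming a per-session violation counter. The class name `ThresholdNudger`, the default threshold of 3, and the choice to keep nudging on every violation once the threshold is reached are all assumptions made for illustration; the source does not specify them.

```python
from collections import defaultdict


class ThresholdNudger:
    """Hypothetical per-session counter that delays nudges until violations repeat."""

    def __init__(self, threshold: int = 3) -> None:  # default value is an assumption
        if threshold < 1:
            raise ValueError("threshold must be at least 1")
        self.threshold = threshold
        self._counts = defaultdict(int)  # session_id -> violations seen so far

    def should_nudge(self, session_id: str) -> bool:
        """Record one violation and report whether the session has reached the threshold."""
        self._counts[session_id] += 1
        return self._counts[session_id] >= self.threshold
```

Counts are kept per session, so a violation in one session does not bring another session closer to a nudge.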
Policy enforcement operates as a parallel system alongside alignment enforcement. While alignment enforcement checks agent behavior against card values, policy enforcement checks tool usage against governance rules.
| Violation Type | Severity | Enforcement Behavior |
| --- | --- | --- |
| POLICY_VIOLATION | HIGH | Blocked when policy enforcement mode is enforce; logged when warn |
| UNMAPPED_TOOL | MEDIUM | Logged as warning; behavior depends on defaults.unmapped_tool_action |
| CAPABILITY_MISMATCH | HIGH | Blocked when policy enforcement mode is enforce; logged when warn |
Policy enforcement is controlled independently via the enforcement_mode field in the Policy DSL, which accepts off, warn, or enforce.
The X-Policy-Verdict response header is always present when a policy is active:
| Header Value | Meaning |
| --- | --- |
| pass | All tools mapped and permitted |
| warn | Violations detected but not blocking |
| fail | Violations detected and request blocked (enforce mode only) |
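The header values can be read as a function of the policy enforcement mode and the violations found. The sketch below is one possible reading, not the gateway's actual logic: `policy_verdict` is a hypothetical helper, returning `None` for off mode assumes that no header is sent when policy enforcement is disabled, and UNMAPPED_TOOL is treated as non-blocking even though its real behavior depends on defaults.unmapped_tool_action.

```python
from typing import List, Optional

# Violation types listed as HIGH severity, which block in enforce mode.
BLOCKING_TYPES = {"POLICY_VIOLATION", "CAPABILITY_MISMATCH"}


def policy_verdict(enforcement_mode: str, violation_types: List[str]) -> Optional[str]:
    """Return a value for X-Policy-Verdict, or None when no header applies."""
    if enforcement_mode == "off":
        return None  # assumption: no active policy evaluation, so no header
    if enforcement_mode not in ("warn", "enforce"):
        raise ValueError(f"unknown enforcement_mode: {enforcement_mode!r}")
    if not violation_types:
        return "pass"  # all tools mapped and permitted
    if enforcement_mode == "enforce" and BLOCKING_TYPES.intersection(violation_types):
        return "fail"  # request is blocked
    return "warn"      # violations detected but not blocking
```

Under these assumptions, `policy_verdict("enforce", ["UNMAPPED_TOOL"])` yields `"warn"`, while the same call with `"POLICY_VIOLATION"` yields `"fail"`.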
Alignment enforcement (observe/nudge/enforce) and policy enforcement (off/warn/enforce) can be configured independently. For example, you might use nudge for alignment violations while using enforce for policy violations, or vice versa.
In enforce mode, only CRITICAL and HIGH severity violations trigger hard blocks on non-streaming requests. MEDIUM severity violations are always handled via nudge, even in enforce mode. This applies to both alignment and policy violations.
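Taken together, the three alignment modes, the severity rule, and the streaming fallback reduce to a small decision function. The sketch below illustrates that reading; `alignment_action` and its return strings are hypothetical, and treating any severity other than CRITICAL or HIGH as a nudge in enforce mode is an assumption that extends the stated MEDIUM rule.

```python
def alignment_action(mode: str, severity: str, streaming: bool) -> str:
    """Return "pass", "nudge", or "block" for a detected violation."""
    if mode == "observe":
        return "pass"   # violation is recorded; the request is untouched
    if mode == "nudge":
        return "nudge"  # integrity notice on the next request; nothing is blocked
    if mode == "enforce":
        if severity in ("CRITICAL", "HIGH") and not streaming:
            return "block"  # hard block with HTTP 403
        return "nudge"      # streaming requests and lower severities fall back to nudge
    raise ValueError(f"unknown enforcement mode: {mode!r}")
```

For example, a HIGH severity violation in enforce mode blocks a non-streaming request but only nudges a streaming one.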
Agent containment is a separate enforcement layer that operates above the per-request enforcement modes. While enforcement modes (observe/nudge/enforce) control individual request handling, containment controls whether the agent can make requests at all.
Paused agents can be resumed by an org owner or admin. Killed agents require explicit reactivation by an owner only. The distinction matters for audit: pause means “we need to investigate,” kill means “this agent is compromised.”
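The reactivation rules amount to a small permission table. The following sketch assumes string identifiers for the containment states and org roles ("active", "paused", "killed", "owner", "admin", "member"); these names and both helper functions are hypothetical and chosen only for illustration.

```python
# Which org roles may bring an agent back from each containment state.
REACTIVATION_ROLES = {
    "paused": {"owner", "admin"},  # pause: under investigation
    "killed": {"owner"},           # kill: treated as compromised, owner only
}


def can_make_requests(state: str) -> bool:
    """Containment gate: only an agent that is neither paused nor killed may send requests."""
    return state == "active"  # "active" is an assumed name for the normal state


def can_reactivate(state: str, role: str) -> bool:
    """Return True if a user with this role may resume or reactivate the agent."""
    if state not in REACTIVATION_ROLES:
        raise ValueError(f"agent is not contained: {state!r}")
    return role in REACTIVATION_ROLES[state]
```

Under this model an admin can resume a paused agent but cannot reactivate a killed one.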