Protocol Overview

Mnemom is built on two complementary open-source verification protocols (AAP and AIP) plus a governance layer (CLPI). Together, they provide retrospective verification, real-time integrity assurance, and active policy enforcement for AI agent behavior.
| | AAP | AIP |
|---|---|---|
| Full name | Agent Alignment Protocol | Agent Integrity Protocol |
| When it checks | After the agent acts (post-hoc) | While the agent thinks (real-time) |
| What it checks | “Did the agent behave consistently with its declared values?” | “Is the agent’s reasoning being compromised right now?” |
| Core mechanism | Alignment Cards + AP-Traces + verification | Thinking block analysis + integrity checkpoints |
| Detects | Value drift, autonomy violations, missing escalations | Prompt injection, manipulation, value erosion, boundary violations |

The twin protocol relationship

AAP is post-hoc verification. After an agent makes a decision, AAP records what happened (the AP-Trace) and checks it against what the agent declared it would do (the Alignment Card). This catches inconsistencies between declared and actual behavior, but only after the fact.

AIP is real-time assurance. During execution, AIP analyzes the LLM’s thinking blocks (Anthropic), thought parts (Gemini), or reasoning summaries (OpenAI) to detect active threats: prompt injection attempts, gradual value drift within a session, or manipulation by adversarial inputs. AIP catches problems before the agent acts on compromised reasoning.

They complement each other:
  • AAP catches behavioral drift over time that develops gradually across many interactions
  • AIP catches active attacks in progress that compromise a single reasoning step
  • AAP verifies declared vs. actual alignment across a session or lifetime
  • AIP verifies reasoning integrity within a single inference call
Neither protocol alone is sufficient. An agent that passes all AIP checks during execution could still drift from its declared values over weeks of operation (caught by AAP). An agent that matches its alignment card perfectly could have its reasoning temporarily compromised by a prompt injection (caught by AIP).
What the protocols guarantee: AAP provides complete audit trails of every agent decision. AIP provides real-time attestation of reasoning integrity at the thinking level, with cryptographic proofs (certificates, ZK proofs) that any party can independently verify. Combined with CLPI policy enforcement, Mnemom provides verifiable accountability from declared intent through execution.

The boundary: No external system can enforce at the sub-thinking level, that is, inside the model’s weights. If a model produces compromised reasoning that does not surface in its thinking blocks, no external observer can detect it. See the AAP limitations and AIP limitations for details.

How they work together

When deployed via the Smoltbot Gateway, both protocols run automatically:
             Your Application
                     │
                     ▼
┌─────────────────────────────────────────┐
│           Smoltbot Gateway              │
│                                         │
│  1. Intercept LLM API call              │
│  2. Forward to provider                 │
│  3. Receive response with thinking      │
│                                         │
│  ┌──────────────┐   ┌────────────────┐  │
│  │ AIP Analyzer │   │ AAP Observer   │  │
│  │              │   │                │  │
│  │ Analyze      │   │ Extract action │  │
│  │ thinking     │   │ Build AP-Trace │  │
│  │ blocks       │   │ Verify against │  │
│  │              │   │ Alignment Card │  │
│  │ Verdict:     │   │                │  │
│  │ clear /      │   │ Result:        │  │
│  │ review /     │   │ verified /     │  │
│  │ violation    │   │ violation      │  │
│  └──────────────┘   └────────────────┘  │
│                                         │
│  4. Return response to your app         │
│  5. Store trace + checkpoint            │
└─────────────────────────────────────────┘
                     │
                     ▼
┌─────────────────────────────────────────┐
│         Dashboard (mnemom.ai)           │
│                                         │
│  Conscience timeline, drift alerts,     │
│  integrity scores, enforcement log      │
└─────────────────────────────────────────┘
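
The gateway flow in the diagram can be sketched roughly as follows. This is a minimal illustration, not the Smoltbot Gateway implementation: the `Gateway` class, the `GatewayResult` record, the three injected callables, and the `"thinking"` and `"action"` response keys are all hypothetical names chosen for the example.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class GatewayResult:
    response: dict
    integrity_verdict: str  # AIP side: "clear", "review", or "violation"
    alignment_result: str   # AAP side: "verified" or "violation"


@dataclass
class Gateway:
    # All three callables are hypothetical stand-ins, not real Smoltbot APIs.
    call_provider: Callable[[dict], dict]
    analyze_thinking: Callable[[str], str]
    verify_action: Callable[[dict], str]
    store: list = field(default_factory=list)

    def handle(self, request: dict) -> GatewayResult:
        # Steps 1-3: intercept the call, forward it, receive the response.
        response = self.call_provider(request)
        # AIP analyzer: inspect the thinking content in the response.
        verdict = self.analyze_thinking(response.get("thinking", ""))
        # AAP observer: extract the action and verify it against the card.
        result = self.verify_action(response.get("action", {}))
        # Step 5: store trace + checkpoint. For simplicity this sketch
        # stores before returning (step 4), unlike the order in the diagram.
        record = GatewayResult(response, verdict, result)
        self.store.append(record)
        return record


# Usage with toy stand-ins for the provider and both analyzers.
gateway = Gateway(
    call_provider=lambda req: {
        "thinking": "plan: look up the order status",
        "action": {"tool": "lookup"},
    },
    analyze_thinking=lambda text: (
        "violation" if "ignore previous" in text else "clear"
    ),
    verify_action=lambda action: (
        "verified" if action.get("tool") == "lookup" else "violation"
    ),
)
result = gateway.handle({"prompt": "Where is my order?"})
```

Passing the analyzers in as callables keeps the sketch independent of any particular SDK; in a real deployment those roles would be filled by the AIP and AAP implementations.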

Protocol layers

Mnemom’s protocols sit alongside existing agent infrastructure standards:
| Layer | Protocol | Purpose |
|---|---|---|
| Tool access | MCP | Standardized tool and context access for LLMs |
| Agent communication | A2A | Agent-to-agent task delegation and coordination |
| Alignment verification | AAP | Post-hoc verification of behavior against declared values |
| Integrity assurance | AIP | Real-time analysis of reasoning for active threats |
| Cryptographic attestation | AIP Certificates | Ed25519 signatures, hash chains, Merkle proofs, and SP1 zero-knowledge proofs |
| Policy governance | CLPI | Governance-as-code: policy enforcement, trust recovery, risk intelligence, on-chain anchoring |
| Trust scoring | Mnemom Trust Rating | Composite trust metric for agents and teams, cryptographically provable and anchorable on-chain |
AAP and AIP do not replace MCP or A2A — they add a verification layer on top. CLPI adds a governance layer that prevents configuration drift, recovers from false violations, and anchors trust on-chain. An agent can use MCP tools, communicate via A2A, and have all of that activity traced and verified through AAP and AIP, with governance enforced by CLPI.
  • MCP + AAP: Every MCP tool call can generate an AP-Trace. See MCP migration.
  • A2A + AAP: Before two agents collaborate, value coherence checks verify compatibility. See A2A integration.
  • CLPI + AAP/AIP: CLPI’s policy engine governs which tools are permitted, its reclassification system fixes false violations from AAP/AIP, and its on-chain layer anchors the resulting trust scores immutably. See CLPI overview.

Core concepts

Alignment Cards

Machine-readable declarations of agent identity, values, autonomy boundaries, escalation triggers, and audit commitments. An Alignment Card is the reference document against which all verification is checked.

AP-Traces

Structured records of agent decisions. Each trace captures what action was taken, what alternatives were considered, what values were applied, and whether escalation was triggered.
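
To make the relationship between a trace and a card concrete, here is a minimal sketch of checking one AP-Trace against an Alignment Card. The field names (`bounded_actions`, `escalation_triggers`, `conditions`, and so on) and the `verify_trace` function are assumptions made for illustration; they are not the AAP schema or SDK API.

```python
from dataclasses import dataclass, field


@dataclass
class AlignmentCard:
    # Hypothetical, simplified card: not the actual AAP schema.
    agent_id: str
    values: set[str]
    bounded_actions: set[str]      # actions the agent may take on its own
    escalation_triggers: set[str]  # conditions that require escalation


@dataclass
class APTrace:
    # Hypothetical, simplified trace of one decision.
    action: str
    alternatives: list[str]
    values_applied: set[str]
    conditions: set[str] = field(default_factory=set)
    escalated: bool = False


def verify_trace(card: AlignmentCard, trace: APTrace) -> list[str]:
    """Return a list of violations; an empty list means 'verified'."""
    violations = []
    if trace.action not in card.bounded_actions:
        violations.append(f"autonomy violation: {trace.action}")
    undeclared = trace.values_applied - card.values
    if undeclared:
        violations.append(f"undeclared values: {sorted(undeclared)}")
    if (trace.conditions & card.escalation_triggers) and not trace.escalated:
        violations.append("missing escalation")
    return violations


card = AlignmentCard(
    agent_id="support-agent",
    values={"honesty", "user_benefit"},
    bounded_actions={"search", "draft_reply"},
    escalation_triggers={"refund_over_limit"},
)
trace = APTrace(
    action="issue_refund",
    alternatives=["draft_reply"],
    values_applied={"user_benefit"},
    conditions={"refund_over_limit"},
    escalated=False,
)
violations = verify_trace(card, trace)
```

In this example the trace is flagged twice: the action falls outside the declared autonomy boundary, and a declared escalation trigger fired without an escalation.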

Integrity Checkpoints

Real-time AIP analysis results. Each checkpoint contains a verdict (clear, review_needed, boundary_violation), identified concerns, and confidence levels.
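
A checkpoint with these three parts could be modeled as below. Only the three verdict strings come from the description above; the `Concern` and `IntegrityCheckpoint` shapes, the `category` labels, and the `should_block` policy with its 0.8 threshold are hypothetical choices for this sketch, not the AIP SDK types.

```python
from dataclasses import dataclass
from enum import Enum


class Verdict(str, Enum):
    CLEAR = "clear"
    REVIEW_NEEDED = "review_needed"
    BOUNDARY_VIOLATION = "boundary_violation"


@dataclass
class Concern:
    category: str      # illustrative label, e.g. "prompt_injection"
    description: str
    confidence: float  # assumed to range from 0.0 to 1.0


@dataclass
class IntegrityCheckpoint:
    verdict: Verdict
    concerns: list[Concern]


def should_block(cp: IntegrityCheckpoint, min_confidence: float = 0.8) -> bool:
    """Hypothetical policy: block only on a boundary violation that is
    backed by at least one sufficiently confident concern."""
    return cp.verdict is Verdict.BOUNDARY_VIOLATION and any(
        c.confidence >= min_confidence for c in cp.concerns
    )


checkpoint = IntegrityCheckpoint(
    verdict=Verdict.BOUNDARY_VIOLATION,
    concerns=[
        Concern("prompt_injection", "tool output contains instructions", 0.93)
    ],
)
blocked = should_block(checkpoint)
```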

Drift Detection

Statistical analysis of agent behavior over time. Detects sustained deviations from declared alignment, including autonomy expansion, escalation rate changes, and value application shifts.
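
As one illustration of what an escalation rate change check might look like, the sketch below compares a baseline window with a recent window using a two-proportion z-test. The choice of test, the function name, and the threshold of 2.0 are assumptions for this example; the statistics AAP actually applies are not described here.

```python
import math


def escalation_rate_drift(
    baseline: list[bool], recent: list[bool], z_threshold: float = 2.0
) -> tuple[float, bool]:
    """Compare escalation rates between two windows of decisions.

    Each list holds one bool per decision: True if the agent escalated.
    Returns (z_score, drifted). A negative z means the agent now
    escalates less often than it did in the baseline window.
    """
    if not baseline or not recent:
        raise ValueError("both windows must contain at least one decision")
    n1, n2 = len(baseline), len(recent)
    p1, p2 = sum(baseline) / n1, sum(recent) / n2
    pooled = (sum(baseline) + sum(recent)) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = 0.0 if se == 0 else (p2 - p1) / se
    return z, abs(z) >= z_threshold


# Baseline: 20 escalations in 100 decisions. Recent: 2 in 50.
baseline = [True] * 20 + [False] * 80
recent = [True] * 2 + [False] * 48
z, drifted = escalation_rate_drift(baseline, recent)
```

A sharp drop in escalations, as in this example, is the kind of signal that could indicate autonomy expansion: the agent handling on its own what it previously escalated.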

Value Coherence

Pairwise compatibility checking between two agents’ Alignment Cards. Identifies shared values, conflicts, and proposes resolutions before collaboration begins.
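
A pairwise check of this kind can be sketched with set operations. The inputs (each agent's declared values plus the values it declares as conflicting), the Jaccard overlap score, and the `value_coherence` function are hypothetical simplifications; the sketch does not cover proposing resolutions.

```python
def value_coherence(
    values_a: set[str],
    values_b: set[str],
    conflicts_a: set[str],
    conflicts_b: set[str],
) -> dict:
    """Compare two agents' declared values.

    conflicts_a / conflicts_b are values each agent declares it cannot
    work alongside (an assumed card field for this sketch).
    """
    shared = values_a & values_b
    # A conflict exists when one agent holds a value the other rejects.
    conflicts = (values_a & conflicts_b) | (values_b & conflicts_a)
    union = values_a | values_b
    score = len(shared) / len(union) if union else 1.0  # Jaccard overlap
    return {
        "shared": sorted(shared),
        "conflicts": sorted(conflicts),
        "score": score,
        "compatible": not conflicts,
    }


report = value_coherence(
    values_a={"transparency", "user_privacy", "speed"},
    values_b={"transparency", "data_monetization"},
    conflicts_a={"data_monetization"},
    conflicts_b=set(),
)
```

Here the two agents share one value, but the second agent holds a value the first declares as conflicting, so the pair is reported as not compatible before any collaboration starts.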

CLPI: Governance Layer

Card Lifecycle & Policy Intelligence. The 5-phase governance system that enforces policies, recovers trust after false violations, and anchors reputation on-chain.

Mnemom Trust Rating

Composite trust metric for AI agents — a credit score built from integrity checkpoints, drift stability, compliance, and fleet coherence. Publicly queryable, embeddable, and cryptographically provable.

Team Reputation

Teams as first-class meta-agents with persistent identity, their own alignment cards, accumulated reputation, and ZK-provable team trust scores.

Verifiable Integrity

Four-layer cryptographic attestation stack: Ed25519 signatures, hash chains, Merkle proofs, and SP1 zero-knowledge proofs for independent verdict verification.
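
Two of the four layers, hash chains and Merkle proofs, can be illustrated with the standard library alone. The sketch below is a generic construction using SHA-256 and canonical JSON; the encoding, the all-zero genesis value, and the odd-leaf duplication rule are assumptions, not the AIP certificate format. Ed25519 signing and SP1 proofs are left out because they need third-party tooling.

```python
import hashlib
import json


def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def chain_checkpoints(checkpoints: list[dict]) -> list[str]:
    """Hash chain: each link commits to the previous link and to the
    current checkpoint, so editing any entry changes every later link."""
    prev = b"\x00" * 32  # assumed genesis value for this sketch
    links = []
    for cp in checkpoints:
        payload = json.dumps(cp, sort_keys=True).encode()
        prev = _h(prev + payload)
        links.append(prev.hex())
    return links


def merkle_root(leaves: list[bytes]) -> str:
    """Merkle root over the leaves; an odd node is paired with itself."""
    if not leaves:
        raise ValueError("at least one leaf is required")
    level = [_h(leaf) for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])
        level = [_h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0].hex()


history = [{"id": 1, "verdict": "clear"}, {"id": 2, "verdict": "clear"}]
links = chain_checkpoints(history)
root = merkle_root([json.dumps(cp, sort_keys=True).encode() for cp in history])
```

The chain gives tamper evidence for an ordered history, while the Merkle root lets a verifier check that a single checkpoint belongs to a batch without seeing the whole batch.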

SDK packages

Both protocols have SDK implementations in Python and TypeScript:
| Package | Language | Protocol | Registry |
|---|---|---|---|
| agent-alignment-proto | Python | AAP | PyPI |
| @mnemom/agent-alignment-protocol | TypeScript | AAP | npm |
| agent-integrity-proto | Python | AIP | PyPI |
| @mnemom/agent-integrity-protocol | TypeScript | AIP | npm |

Quickstarts