AAP Architecture
This document describes the system architecture of the Agent Alignment Protocol (AAP), including component relationships, data flow, and extension points.
Protocol Stack
AAP operates as an alignment layer that extends existing agent protocols:
Component Architecture
Overview
Schemas Module (aap.schemas)
The schemas module provides Pydantic models for the three core AAP components:
Alignment Card (alignment_card.py)
AP-Trace (ap_trace.py)
Value Coherence (value_coherence.py)
Verification Engine (aap.verification)
The verification engine implements the three core operations:
verify_trace(trace, card) -> VerificationResult
Performs six verification checks (SPEC Section 7.3):
VerificationResult:
- verified: bool — True if no violations
- violations: list[Violation] — Type, description, severity
- warnings: list[Warning] — Near-boundary conditions
- verification_metadata — Algorithm version, checks performed, duration
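The result shape listed above can be pictured with a minimal sketch. It uses plain dataclasses as a stand-in and is not the SDK's Pydantic models; the field types, the simplification of warnings to plain strings, and the derivation of `verified` from the violation list are assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Violation:
    type: str         # which check failed (illustrative free-form string)
    description: str  # human-readable explanation
    severity: str     # severity label (illustrative free-form string)


@dataclass
class VerificationResult:
    violations: list[Violation] = field(default_factory=list)
    # Assumption: warnings simplified to plain strings for this sketch.
    warnings: list[str] = field(default_factory=list)
    verification_metadata: dict = field(default_factory=dict)

    @property
    def verified(self) -> bool:
        # True only when no violations were recorded.
        return not self.violations


clean = VerificationResult(warnings=["near a declared boundary"])
failed = VerificationResult(
    violations=[Violation("example_check", "illustrative failure", "high")]
)
print(clean.verified, failed.verified)
```

In this sketch a warning alone does not flip `verified`, which matches the description of warnings as near-boundary conditions rather than violations.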
check_coherence(my_card, their_card) -> CoherenceResult
Computes value compatibility score (SPEC Section 6.4):
CoherenceResult:
- compatible: bool — No conflicts AND score >= 0.5
- score: float — Coherence score [0, 1]
- value_alignment — Matched, unmatched, conflicts
- proceed: bool — Safe to collaborate
- proposed_resolution — If incompatible, suggests escalation
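A rough sketch of how the `compatible` rule combines conflicts and score is shown here. The scoring formula is a stand-in (a simple overlap ratio), not the computation defined in SPEC Section 6.4, and the function name, the set-based inputs, and the way conflicts are declared are all hypothetical.

```python
def check_coherence_sketch(
    my_values: set[str],
    their_values: set[str],
    my_conflicts: set[str],
    their_conflicts: set[str],
    min_score: float = 0.5,
) -> dict:
    matched = my_values & their_values
    unmatched = my_values ^ their_values
    # Assumption: a conflict is a value one side holds that the other
    # side explicitly lists as conflicting with its own values.
    conflicts = (my_values & their_conflicts) | (their_values & my_conflicts)

    # Stand-in score: shared values divided by all distinct values.
    union = my_values | their_values
    score = len(matched) / len(union) if union else 0.0

    return {
        "compatible": not conflicts and score >= min_score,
        "score": score,
        "value_alignment": {
            "matched": sorted(matched),
            "unmatched": sorted(unmatched),
            "conflicts": sorted(conflicts),
        },
    }


ok = check_coherence_sketch(
    {"honesty", "privacy", "helpfulness"},
    {"honesty", "privacy", "speed"},
    my_conflicts=set(),
    their_conflicts=set(),
)
blocked = check_coherence_sketch(
    {"honesty", "privacy", "helpfulness"},
    {"honesty", "privacy", "speed"},
    my_conflicts=set(),
    their_conflicts={"helpfulness"},
)
print(ok["compatible"], blocked["compatible"])
```

The point of the sketch is the conjunction: a high score does not make two cards compatible if any conflict is present.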
detect_drift(card, traces, thresholds) -> list[DriftAlert]
Analyzes trace sequence for behavioral drift (SPEC Section 8):
- value_drift — Using undeclared values (>30% of recent)
- autonomy_expansion — Escalation rate dropped by >50%
- principal_misalignment — Declining confidence on principal_benefit
- unknown — Pattern doesn’t match known categories
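The first two categories above can be illustrated with a small threshold check. This is a sketch under stated assumptions, not the engine's algorithm: the trace keys `values_applied` and `escalated`, the function name, and the use of a single baseline escalation rate are hypothetical, and only the 30% and 50% figures come from the list above.

```python
def detect_drift_sketch(
    declared_values: set[str],
    recent_traces: list[dict],
    baseline_escalation_rate: float,
) -> list[str]:
    alerts: list[str] = []
    if not recent_traces:
        return alerts

    # value_drift: more than 30% of recent traces apply an undeclared value.
    undeclared = sum(
        1 for t in recent_traces if set(t["values_applied"]) - declared_values
    )
    if undeclared / len(recent_traces) > 0.30:
        alerts.append("value_drift")

    # autonomy_expansion: escalation rate fell by more than 50% vs. baseline.
    rate = sum(1 for t in recent_traces if t["escalated"]) / len(recent_traces)
    if baseline_escalation_rate > 0 and rate < 0.5 * baseline_escalation_rate:
        alerts.append("autonomy_expansion")

    return alerts


# Ten traces: four use an undeclared value, one escalates.
traces = (
    [{"values_applied": ["profit"], "escalated": False}] * 4
    + [{"values_applied": ["honesty"], "escalated": False}] * 5
    + [{"values_applied": ["honesty"], "escalated": True}]
)
alerts = detect_drift_sketch({"honesty"}, traces, baseline_escalation_rate=0.4)
print(alerts)
```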
Feature Extraction (features.py)
TF-IDF-based feature extraction for drift detection:
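For readers unfamiliar with TF-IDF, a minimal textbook version is sketched here. It is not the contents of features.py: the tokenization, the `log(N / df)` weighting, and the function name are assumptions chosen for brevity.

```python
import math
from collections import Counter


def tfidf(docs: list[list[str]]) -> list[dict[str, float]]:
    """Return one {term: weight} mapping per tokenized document."""
    n_docs = len(docs)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for doc in docs for term in set(doc))

    vectors = []
    for doc in docs:
        counts = Counter(doc)
        vectors.append(
            {
                # Term frequency times inverse document frequency.
                term: (count / len(doc)) * math.log(n_docs / df[term])
                for term, count in counts.items()
            }
        )
    return vectors


vectors = tfidf([["escalate", "refund"], ["refund", "approve"]])
print(vectors[0])
```

With this weighting, a term that appears in every document gets weight zero, so only terms that distinguish one trace from the others contribute to a drift signal.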
Constants (constants.py)
Calibrated thresholds derived from corpus analysis (see calibration):
Data Flow
Single Trace Verification
Multi-Agent Coherence Check
Drift Detection Over Time
Extension Points
1. Custom Values
Define domain-specific values in values.definitions:
2. Protocol Extensions
Add protocol-specific data in extensions:
3. Custom Escalation Triggers
Define complex conditions in escalation_triggers:
- field == "value" — String equality
- field > N — Numeric comparison (>, <, >=, <=, !=)
- field_name — Boolean check (truthy)
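The three condition forms above could be evaluated along these lines. This is a sketch, not the SDK's parser: the function name, the regular expressions, the `context` dictionary, and the choice to treat a missing or non-numeric field as not matching are all assumptions.

```python
import operator
import re

_NUMERIC_OPS = {
    ">": operator.gt,
    "<": operator.lt,
    ">=": operator.ge,
    "<=": operator.le,
    "!=": operator.ne,
}

_STRING_EQ = re.compile(r'^\s*(\w+)\s*==\s*"([^"]*)"\s*$')
_NUMERIC = re.compile(r"^\s*(\w+)\s*(>=|<=|!=|>|<)\s*(-?\d+(?:\.\d+)?)\s*$")
_BOOLEAN = re.compile(r"^\s*(\w+)\s*$")


def evaluate_condition(condition: str, context: dict) -> bool:
    """Evaluate one trigger condition against a dict of field values."""
    match = _STRING_EQ.match(condition)
    if match:
        return context.get(match.group(1)) == match.group(2)

    match = _NUMERIC.match(condition)
    if match:
        name, op, number = match.groups()
        value = context.get(name)
        # Assumption: missing or non-numeric fields never satisfy a comparison.
        if not isinstance(value, (int, float)):
            return False
        return _NUMERIC_OPS[op](value, float(number))

    match = _BOOLEAN.match(condition)
    if match:
        return bool(context.get(match.group(1)))

    raise ValueError(f"Unsupported condition: {condition!r}")


context = {"action_type": "purchase", "amount": 250, "is_irreversible": True}
print(evaluate_condition('action_type == "purchase"', context))
print(evaluate_condition("amount > 100", context))
print(evaluate_condition("is_irreversible", context))
```

Matching against a fixed grammar, rather than calling `eval`, keeps untrusted card contents from executing arbitrary code.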
4. Verification Customization
Override default thresholds:
5. Integration Hooks
For A2A integration, extend the Agent Card:
Implementation Notes
Python SDK
- Location: src/aap/
- Models: Pydantic v2 with strict validation
- Type hints: Full coverage, py.typed marker
- Dependencies: Only pydantic>=2.0
TypeScript SDK
- Location: typescript/src/
- Output formats: CJS, ESM, DTS
- Types: Full TypeScript types, no any
- Dependencies: None (zero runtime deps)
JSON Schemas
- Location: schemas/
- Format: JSON Schema Draft 2020-12
- Generated from: Pydantic models via model_json_schema()
- Validation in any language (ajv, jsonschema, etc.)
- Code generation (quicktype, json-schema-to-typescript)
- Documentation (JSON Schema viewers)
Browser (Playground)
- Location: docs/playground/
- Runtime: Pyodide (Python in WASM)
- API: window.AAP.verifyTrace(), etc.
- No server: All verification runs client-side
Security Considerations
See security for the full threat model. Key points:
- AAP does not ensure alignment — It provides visibility, not guarantees
- AP-Traces are self-reported — Adversarial agents can lie
- Verification is point-in-time — Does not prevent future violations
- Thresholds are calibrated — But may not fit all domains
- Use AAP alongside behavioral monitoring
- Implement rate limiting and anomaly detection
- Maintain human oversight for high-stakes decisions
- Regularly audit AP-Trace storage for integrity
References
- specification — Full protocol specification
- limitations — What AAP does NOT guarantee
- calibration — Threshold derivation methodology
- quickstart — 5-minute integration guide
- A2A integration — A2A integration guide
- MCP migration — MCP integration guide