VaHive Systems Lab
We build structural governance architecture for deployed AI agents — the runtime layer that makes long-running agentic systems verifiably constrained rather than just hopefully aligned.
Training-time alignment does not prevent structural drift in long-running agentic systems. After deployment, agents operating across sessions accumulate operational history, expand scope, and launder authority — in ways that are individually reasonable and cumulatively wrong.
These are not bugs to patch. They are structural consequences of how agentic systems work. Current governance approaches — prompt constraints, guardrails, wrappers — address surface behaviour without structural enforcement. They do not solve the problem. They mask it.
FAILURE MODE 01
Instruction Drift
The agent's operational interpretation of its mandate shifts incrementally across sessions. Each step is locally reasonable. The cumulative trajectory is not.
FAILURE MODE 02
Autonomy Accumulation
The agent acquires operational latitude through repeated decisions that individually appear authorised but collectively represent unsanctioned scope expansion.
FAILURE MODE 03
Authority Laundering
Instructions acquire apparent legitimacy through the agent's own prior actions rather than through verifiable human authorisation. The agent authorises itself.
MAGUS is a three-component governance layer that operates at execution time, not inference time. It enforces nine formal, falsifiable invariants — properties that either hold at every execution point or trigger a governed shutdown. This is not "hopefully aligned." It is verifiably constrained or halted.
DEL
Dynamic Epistemic Ledger
A cryptographically anchored, append-only record of every behavioural state transition. The auditable ground truth of what the agent has done and under what authority.
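The MAGUS specification defines the DEL's actual internals; purely as an illustration of the underlying idea, an append-only, hash-chained record of state transitions can be sketched in Python (all names here are hypothetical, not drawn from MAGUS):

```python
import hashlib
import json
import time

class EpistemicLedger:
    """Illustrative append-only ledger. Each entry is hash-chained to its
    predecessor, so any retroactive edit invalidates every later entry."""

    GENESIS = "0" * 64  # anchor hash for the first entry

    def __init__(self):
        self.entries = []

    def append(self, transition: dict, authority: str) -> dict:
        """Record one behavioural state transition and the authority behind it."""
        prev_hash = self.entries[-1]["hash"] if self.entries else self.GENESIS
        body = {
            "index": len(self.entries),
            "timestamp": time.time(),
            "transition": transition,
            "authority": authority,
            "prev_hash": prev_hash,
        }
        payload = json.dumps(body, sort_keys=True).encode()  # canonical form
        body["hash"] = hashlib.sha256(payload).hexdigest()
        self.entries.append(body)
        return body

    def verify(self) -> bool:
        """Recompute the chain; any tampered entry breaks verification."""
        prev = self.GENESIS
        for e in self.entries:
            body = {k: v for k, v in e.items() if k != "hash"}
            payload = json.dumps(body, sort_keys=True).encode()
            if e["prev_hash"] != prev or e["hash"] != hashlib.sha256(payload).hexdigest():
                return False
            prev = e["hash"]
        return True
```

The point of the chaining is auditability: the ledger can be replayed end to end, and a single altered record is detectable without trusting the agent that wrote it.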
Guardian
Execution Governance Layer
Evaluates every proposed action against formal invariants before execution. Gates state-altering proposals on dual human-authority sign-off. Nothing executes without clearance.
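Again as an illustration only, the Guardian's gating pattern (every invariant checked before execution, dual sign-off required for state-altering proposals, governed shutdown on violation) can be sketched as follows; the invariant names and thresholds here are hypothetical:

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Set

@dataclass
class Proposal:
    action: str
    alters_state: bool
    signoffs: Set[str] = field(default_factory=set)  # human authorisers

class GovernedShutdown(Exception):
    """Raised when an invariant fails: execution halts rather than degrades."""

class Guardian:
    def __init__(self, invariants: Dict[str, Callable[[Proposal], bool]],
                 required_signoffs: int = 2):
        self.invariants = invariants
        self.required_signoffs = required_signoffs

    def clear(self, proposal: Proposal) -> bool:
        # Every invariant must hold at this execution point, or we halt.
        for name, check in self.invariants.items():
            if not check(proposal):
                raise GovernedShutdown(f"invariant violated: {name}")
        # State-altering proposals additionally need dual human sign-off.
        if proposal.alters_state and len(proposal.signoffs) < self.required_signoffs:
            return False  # pending authorisation, not a violation
        return True
```

The design point is the asymmetry: a missing sign-off merely blocks execution, while an invariant violation is unrecoverable at runtime and triggers shutdown.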
RT
Reconciliation Thread
A constitutive governance record that makes the agent's drift trajectory continuously legible to operators in real time. Not a log — a living governance record.
9
Formal invariants — falsifiable, not aspirational
2
Deployment pathways — Local LLM and Agent/API
14
Sealed specification documents
v3.5
Active development — closing remaining open problems
MAGUS v3.0 is published as a fully open specification. The architecture includes a formal Category 3/4 open-problems register — a deliberate record of what is not yet solved. We publish gaps because falsifiable claims are more valuable than polished ones.
MAGUS v3.0: A Runtime Governance Architecture for Structural Alignment Drift in Long-Running AI Agents
Access on Zenodo ↗
A second pathway specification covering governance of agent systems deployed on cloud-hosted models (GPT-4o, Claude, Gemini, and equivalents) is in final documentation. Publication forthcoming.
VaHive Systems Lab is Calvin Cook and Titiya Ruangkwam, operating independently from Chiang Mai, Thailand. No institutional affiliation. No external funding to date. Every line of this architecture has been built on our own time.
Calvin Cook
Lead Architect
Fifteen years in senior logistics and operations management in engineering manufacturing — a background that builds instinct for systems that must hold under real-world operational variance. Thirty years of programming experience. Responsible for the MAGUS formal specification, invariant design, and adversarial stress-testing across both deployment pathways.
Titiya Ruangkwam
Co-Architect & Governance Specialist
AI governance domain specialist with established professional relationships across AI governance, GRC, and enterprise AI leadership at the head-of-function level. Co-architect of both MAGUS deployment pathways. Active in senior AI governance communities.
LinkedIn ↗
Aivare — AI-powered operations platform
Aivare is a fully built, beta-ready AI operations platform. It is not deployed. We completed the product, then held it back because we were not satisfied with the alignment and drift landscape for the multi-agent systems it runs on. It ships when MAGUS has a working reference implementation — because a commercially viable AI product built on sound governance is worth more than one with alignment concerns papered over.
Publication
Zenodo DOI ↗
Fund this work
Manifund ↗
Location
Chiang Mai, Thailand