The workflow
How three agents stay in sync without talking to each other.
The ring coordinates through shared state, not chatter. A single database is the brain; routing flags are the nervous system; human sign-off is the gate on anything that leaves the building. Here's the whole machine.
Shared state
One database, one source of truth
All three peers share a single Postgres database exposed through a data-API layer. Each entity has a clear owner — the agent allowed to write it — and the others are read-only or blind to it entirely. This is how the ring stays coherent across days, weeks, and phase boundaries without ever holding everything in one context window.
| Entity | What it holds | Owner (writes) |
|---|---|---|
| signals | Raw inbound from Haleon / Microsoft / backlog / scout observations | CoS + 5 scouts (tier_0_reader) |
| briefs | The 5-3-2-1 morning & end-of-day brief rows | CoS |
| comms_drafts | Outbound queue, gated on Charlie sign-off | CoS |
| quiet_watch_state | Silent-stakeholder tracking | CoS |
| meeting_preps · steerco_preps | Pre-meeting prep docs | CoS |
| adrs | Architecture decision records + DB-backed review_stage axis | SA |
| compliance_state | Per-use-case regulatory posture | SA |
| reuse_catalog | Cross-pod reusable patterns | SA |
| decisions | Narrative log; peer routing is queryable data: decision_class, owning_peer, adr_id, narrative (ISS-14) | CoS / SA |
| agent_runs | Per-turn audit spine. Class enum (7): {turn, rotation, snapshot, ado_drain, audit, correction, scout_sweep}. Multi-writer; gap-scan key (session_id, session_turn_seq) | multi-writer (every role writes its own); Steward audits |
| rotation_log · agent_snapshots | Rotation handover records + Historian snapshots; cross-referenced from agent_runs via detail_ref (detail-row-first ordering, HS-4) | Steward (Operator for Steward self-rotation) |
| cost_telemetry | Whole-UTC-day cost cells; derived projection of agent_runs tokens via sp_rollup_cost_telemetry | derived (rollup SP) |
| ado_writeback_queue | Pending ADO ops; gated default pending_approval; drained by ado-scribe through CAS-guarded sp_update_ado_writeback | CoS |
| tick_lease · scout_lease · drainer_lease | Three independent single-flight mutexes (re-entrant same-holder CAS, TTL-expiry steal) | CoS (via shim entities) |
| scout_watermark · scout_enable_flags | Per-source scan watermark; 3-greens DB gate (cost_breaker_live / drain_proven / scout_proven) read by sp_check_scout_enabled() | scouts + Charlie |
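The lease rows in the table describe a single-flight mutex with two escape hatches: the current holder may re-acquire (re-entrant), and anyone may take over once the TTL has lapsed (steal). The sketch below is an in-memory illustration of that rule only. The class name `LeaseTable`, its method names, and the holder ids are hypothetical; the real leases live in Postgres behind shim entities, where the check-and-set would presumably be a single conditional statement rather than Python code.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Lease:
    holder: str
    expires_at: float


class LeaseTable:
    """Hypothetical in-memory stand-in for a lease table such as tick_lease."""

    def __init__(self) -> None:
        self._rows: Dict[str, Lease] = {}

    def try_acquire(self, name: str, holder: str, now: float, ttl: float) -> bool:
        """Compare-and-set: succeed only if the lease is free, ours, or expired."""
        current: Optional[Lease] = self._rows.get(name)
        free = current is None
        same_holder = current is not None and current.holder == holder  # re-entrant
        expired = current is not None and current.expires_at <= now     # TTL steal
        if free or same_holder or expired:
            self._rows[name] = Lease(holder=holder, expires_at=now + ttl)
            return True
        return False

    def release(self, name: str, holder: str) -> bool:
        """Only the current holder may release."""
        current = self._rows.get(name)
        if current is not None and current.holder == holder:
            del self._rows[name]
            return True
        return False


leases = LeaseTable()
first = leases.try_acquire("tick_lease", "cos-a", now=0.0, ttl=60.0)     # free
blocked = leases.try_acquire("tick_lease", "cos-b", now=10.0, ttl=60.0)  # held
renewed = leases.try_acquire("tick_lease", "cos-a", now=20.0, ttl=60.0)  # re-entrant
stolen = leases.try_acquire("tick_lease", "cos-b", now=81.0, ttl=60.0)   # expired
```

Because each lease is keyed by name, a stuck tick cannot block the scout or drainer lease, which matches the "three independent" wording in the table.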
The shared brain spans ~29 base tables (22 originals + the Batch 3–7
substrate: rotation_log, agent_snapshots, the three leases, scout_watermark,
scout_enable_flags) exposed through 52 DAB entities (base reads + *In
shim views + action-views like CostRollupDayIn / CostBreakerRollingIn /
ScoutEnabledCheckIn). A handful of cross-cutting entities (pod intel, customer health, risks,
recipient registry) are read by more than one peer but still have a single writer. The rule never changes:
one writer per row, everyone else reads or stays blind.
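One way to picture the single-writer rule is as a lookup that every write passes through. The sketch below is illustrative only: the `OWNERS` map copies a few rows from the table above, while `can_write`, `guarded_write`, and the `NotOwner` exception are hypothetical names, not part of the data-API layer.

```python
from typing import Dict, FrozenSet

# A few entity -> writer mappings, copied from the ownership table.
OWNERS: Dict[str, FrozenSet[str]] = {
    "briefs": frozenset({"CoS"}),
    "comms_drafts": frozenset({"CoS"}),
    "adrs": frozenset({"SA"}),
    "compliance_state": frozenset({"SA"}),
    "reuse_catalog": frozenset({"SA"}),
    "decisions": frozenset({"CoS", "SA"}),  # each row still has one owning peer
}


class NotOwner(PermissionError):
    """Raised when a peer tries to write an entity it does not own."""


def can_write(entity: str, peer: str) -> bool:
    """True only if the peer is listed as a writer for the entity."""
    return peer in OWNERS.get(entity, frozenset())


def guarded_write(entity: str, peer: str, row: dict) -> dict:
    """Hypothetical write path: refuse the write unless the peer owns the entity."""
    if not can_write(entity, peer):
        raise NotOwner(f"{peer} may not write {entity}")
    return {"entity": entity, "written_by": peer, **row}
```

Unknown entities default to "nobody may write", so a peer that is blind to an entity cannot write it by accident.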
Who can talk to whom
Communication topology
The peers never call each other directly. The Chief of Staff and Solution Architect
coordinate by writing routing flags onto decisions rows. The
Steward is one-way: it reads health metadata and only ever "speaks" by spawning a
replacement. Charlie is the only node everyone talks to directly.
Diagram: communication topology. The decisions table is the only peer channel. (Peer routing is now persisted as data: decision_class, owning_peer, adr_id, narrative on decisions (ISS-14); the per-handoff flag values shown are illustrative shorthand over those columns.)
Grey dashes = escalations to Charlie. Teal dashes = the Steward reading metadata and spawning replacements.
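Coordination through the decisions table can be sketched as a write on one side and a filtered read on the other, with no direct call between peers. The column names below (decision_class, owning_peer, adr_id, narrative) come from the text; everything else is an assumption made for illustration, including the `DecisionsTable` class, the reading of owning_peer as "the peer that should act next", and the use of the shorthand flag values as decision_class values.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class DecisionRow:
    decision_class: str
    owning_peer: str          # assumed meaning: the peer expected to act next
    narrative: str
    adr_id: Optional[int] = None


class DecisionsTable:
    """Hypothetical in-memory stand-in for the shared decisions table."""

    def __init__(self) -> None:
        self._rows: List[DecisionRow] = []

    def write(self, row: DecisionRow) -> None:
        self._rows.append(row)

    def routed_to(self, peer: str) -> List[DecisionRow]:
        """What a peer sees when it polls its own queue."""
        return [r for r in self._rows if r.owning_peer == peer]


table = DecisionsTable()

# CoS routes a technical concern to SA by writing a row, not by calling SA.
table.write(DecisionRow("needs_sa_review", "SA", "Data-residency concern raised"))

# Later, SA hands back to CoS the same way, now with an ADR attached.
table.write(DecisionRow("needs_narrative_for_steerco", "CoS", "Residency ADR ratified", adr_id=1))

sa_queue = table.routed_to("SA")
cos_queue = table.routed_to("CoS")
```

Since routing is ordinary row data, either queue can be inspected or audited later with a plain query.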
End to end
A day in the life of the ring
Follow one signal — a Haleon stakeholder raising a data-residency concern — as it moves
through the ring from inbound ping to ratified decision and customer-ready reply.
(Today the ring is on-demand only. The timeline below shows the intended cadence; the recurring
daily-brief cron and the 15-min ring-tick are not yet enabled — see
Roadmap.)
- 07:00 · CoS · Morning brief. Overnight signals are metabolised into a 5-3-2-1 brief and posted to Charlie via Teams. The residency concern lands as one of the top-5 risks.
- 07:12 · CoS · Routing. The concern is decision-shaped and technical, so CoS writes a decisions row with routing=needs_sa_review, and purges the technical detail from its own working memory.
- 09:30 · SA · Picks up the queue. SA reads the routed row, checks the reuse catalog for an existing residency pattern, finds none, and opens a new ADR in draft.
- 10:05 · SA · Adversarial review. SA spawns the Reviewer skill. It pushes back on the first draft; SA revises. On pass, the ADR moves to reviewed.
- 10:40 · SA · Compliance gate. The decision touches a consumer-health surface, so the Compliance Checker runs (mandatory). It returns needs-review and a compliance_state row is written.
- 11:15 · Steward · Health tick. A routine metadata scan: both peers green, turn counts nominal, cost within band. An agent_runs audit row is written. Silent: no Teams.
- 14:00 · 👤 Charlie ratifies. SA surfaces the reviewed, compliance-checked ADR. Charlie acks; the ADR becomes ratified and SA writes a hand-back row with routing=needs_narrative_for_steerco.
- 14:20 · CoS · Drafts the reply. CoS picks up the narrative, drafts a reply to the Haleon stakeholder, runs it through the Outbound Voice skill, and queues it at awaiting_signoff.
- 14:25 · 👤 Charlie sends. He reviews the draft, tweaks one line, and approves. Only now does anything leave the building. (human gate)
- 17:00 · CoS · End-of-day brief. The residency thread is closed out in the brief, the ratified ADR noted, and it's added to the Steerco prep accumulation for the week.
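The ADR's path through the timeline (draft, then reviewed, then ratified) behaves like a small state machine with a gate on each step. The sketch below illustrates that reading and is not the actual review_stage implementation. The stage names come from the timeline; the `Adr` class, the function names, and the rule that ratification is refused until a required compliance check has run are assumptions.

```python
from dataclasses import dataclass


@dataclass
class Adr:
    title: str
    review_stage: str = "draft"
    compliance_required: bool = False
    compliance_checked: bool = False


def mark_reviewed(adr: Adr, reviewer_passed: bool) -> str:
    """draft -> reviewed, only when the adversarial review passes."""
    if adr.review_stage != "draft":
        raise ValueError(f"cannot review an ADR in stage {adr.review_stage!r}")
    if reviewer_passed:
        adr.review_stage = "reviewed"
    return adr.review_stage


def ratify(adr: Adr, charlie_ack: bool) -> str:
    """reviewed -> ratified, only with a human ack (and compliance, if required)."""
    if adr.review_stage != "reviewed":
        raise ValueError(f"cannot ratify an ADR in stage {adr.review_stage!r}")
    if adr.compliance_required and not adr.compliance_checked:
        raise ValueError("compliance check is mandatory for this ADR")
    if charlie_ack:
        adr.review_stage = "ratified"
    return adr.review_stage


adr = Adr("Data residency for consumer-health surface", compliance_required=True)
mark_reviewed(adr, reviewer_passed=False)   # first draft is pushed back
mark_reviewed(adr, reviewer_passed=True)    # revised draft passes
adr.compliance_checked = True               # Compliance Checker has run
final_stage = ratify(adr, charlie_ack=True)
```

No code path reaches ratified without `charlie_ack=True`, which mirrors the human gate in the 14:00 step.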
Staying healthy over the long haul
The rotation lifecycle
A long-lived agent's context degrades over time — it re-asks questions, contradicts earlier decisions, loses the thread. The Steward exists to catch that and swap in a fresh successor without losing institutional memory. There are two ceremonies, and they're deliberately kept distinct.
① Context-degradation rotation
Any-time, threshold-driven, low-ceremony. Triggered by the two-signal
rule on health metadata. The dying peer emits an ## Open state block; a
successor is spawned, adopts that state, and is smoke-tested before the old one is archived.
② Phase-boundary rotation
Heavier, calendar-driven at the Pilot→Scale and Scale→Transform markers. Includes a fuller Historian snapshot and a phase-carryover document. Both delivery peers rotate together — a clean break for the new phase — and the Steward rotates last, with Charlie's ack.
The ceremony, step by step
The executable ceremony lives in the generic rotate-role
skill (~/.copilot/m-skills/rotate-role/SKILL.md), driven by
~/.copilot/m-workflows/haleon/workflow.json. The old loom-specific
rotate-clawpilot-role is a deprecated forwarding shim.
- Classify. The dying peer is marked RED via the two-signal rule (or a hard self-flag / Charlie request).
- Freeze & request state. Confirm the peer is idle, then ask it for its ## Open state block plus swap-protocol additions. Address by sessionId, not name: the name is about to be reused (D8).
- Snapshot. Historian for narrative roles (SA / Architect) → structured snapshot child; mechanical roles (CoS / Steward) → inline summary from the freeze response. Snapshot persisted to the dedicated agent_snapshots table (no longer the runtime ledger).
- Spawn the successor. Rotator spawns from the same role brief (resolved from workflow.json), handed the predecessor's open state to adopt.
- Await boot ack (D3: no boot-turn race). Wait for the successor's ## Context health green ack before releasing the SDK handle. Release with keepPendingTurn:false.
- Status-probe re-adopt (D4: no orphan-turn race). A lightweight ping cleanly re-adopts the successor handle before the heavy readiness smoke-test.
- Smoke-test. If it fails, both predecessor and failed successor stay alive; the successor is archived (delete:false), never deleted, and Charlie is pinged.
- Archive. On a clean smoke-test, the predecessor is archived (delete:false, never deleted) and Charlie gets one Teams notification. The ZOMBIE branch (D5) defaults to cold-spawn + archive; transcript-synthesis or delete:true requires an explicit Charlie ack quoted in rotation_log.notes.
- Broadcast. The successor announces its new sessionId to the surviving peers so the ring re-links; for a Steward rotation, only after Charlie's ack.
- Record (Step 9b: detail-row-first, HS-4). Write rotation_log and agent_snapshots rows FIRST, capture their ids, then INSERT the agent_runs spine row with class='rotation' and detail_ref already populated. The runtime ledger (plan.md) is updated as a human-readable companion.
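The detail-row-first ordering in the Record step can be illustrated with a small SQLite sketch: the detail rows are inserted, their ids captured, and only then is the spine row written with detail_ref filled in. The table names and class='rotation' come from the text. The column layouts, the `detail_ref` string format, the function names, and the choice to wrap all three writes in one transaction are assumptions; the real store is Postgres behind the data-API layer.

```python
import sqlite3


def create_schema(conn: sqlite3.Connection) -> None:
    """Hypothetical minimal schema; real columns are not given in the text."""
    conn.executescript(
        """
        CREATE TABLE rotation_log (id INTEGER PRIMARY KEY, notes TEXT);
        CREATE TABLE agent_snapshots (id INTEGER PRIMARY KEY, body TEXT);
        CREATE TABLE agent_runs (
            id INTEGER PRIMARY KEY,
            session_id TEXT NOT NULL,
            session_turn_seq INTEGER NOT NULL,
            "class" TEXT NOT NULL,
            detail_ref TEXT NOT NULL,  -- spine row cannot exist without its detail
            UNIQUE (session_id, session_turn_seq)
        );
        """
    )


def record_rotation(conn: sqlite3.Connection, session_id: str, turn_seq: int,
                    notes: str, snapshot_body: str) -> int:
    """Write the detail rows first, then the spine row that points at them."""
    with conn:  # commit on success, roll back on error
        rotation_id = conn.execute(
            "INSERT INTO rotation_log (notes) VALUES (?)", (notes,)
        ).lastrowid
        snapshot_id = conn.execute(
            "INSERT INTO agent_snapshots (body) VALUES (?)", (snapshot_body,)
        ).lastrowid
        # Assumed reference format; the text only says detail_ref is populated.
        detail_ref = f"rotation_log:{rotation_id};agent_snapshots:{snapshot_id}"
        run_id = conn.execute(
            'INSERT INTO agent_runs (session_id, session_turn_seq, "class", detail_ref) '
            "VALUES (?, ?, 'rotation', ?)",
            (session_id, turn_seq, detail_ref),
        ).lastrowid
    return run_id


conn = sqlite3.connect(":memory:")
create_schema(conn)
run_id = record_rotation(conn, "sess-1", 42, "context-degradation rotation", "open state ...")
spine = conn.execute(
    'SELECT "class", detail_ref FROM agent_runs WHERE id = ?', (run_id,)
).fetchone()
```

Writing in this order means a reader of agent_runs never finds a rotation row whose detail_ref points at nothing.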
selfRotationRotator:"operator",
Operator (Charlie's hands-on session) is the rotator-of-record for Steward self-rotation —
the outgoing Steward NEVER spawns its own successor. That's single-point-of-failure protection for the ring.