The workflow

How three agents stay in sync without talking to each other.

The ring coordinates through shared state, not chatter. A single database is the brain; routing flags are the nervous system; human sign-off is the gate on anything that leaves the building. Here's the whole machine.


Shared state

One database, one source of truth

All three peers share a single Postgres database exposed through a data-API layer. Each entity has a clear owner — the agent allowed to write it — and the others are read-only or blind to it entirely. This is how the ring stays coherent across days, weeks, and phase boundaries without ever holding everything in one context window.

Entity	What it holds	Owner (writes)
`signals`	Raw inbound from Haleon / Microsoft / backlog / scout observations	CoS + 5 scouts (tier_0_reader)
`briefs`	The 5-3-2-1 morning & end-of-day brief rows	CoS
`comms_drafts`	Outbound queue, gated on Charlie sign-off	CoS
`quiet_watch_state`	Silent-stakeholder tracking	CoS
`meeting_preps` · `steerco_preps`	Pre-meeting prep docs	CoS
`adrs`	Architecture decision records + DB-backed `review_stage` axis	SA
`compliance_state`	Per-use-case regulatory posture	SA
`reuse_catalog`	Cross-pod reusable patterns	SA
`decisions`	Narrative log; peer routing is queryable data — `decision_class`, `owning_peer`, `adr_id`, `narrative` (ISS-14)	CoS / SA
`agent_runs`	Per-turn audit spine. Class enum (7): `{turn, rotation, snapshot, ado_drain, audit, correction, scout_sweep}`. Multi-writer; gap-scan key `(session_id, session_turn_seq)`	multi-writer (every role writes its own); Steward audits
`rotation_log` · `agent_snapshots`	Rotation handover records + Historian snapshots; cross-referenced from `agent_runs` via `detail_ref` (detail-row-first ordering, HS-4)	Steward (Operator for Steward self-rotation)
`cost_telemetry`	Whole-UTC-day cost cells; derived projection of `agent_runs` tokens via `sp_rollup_cost_telemetry`	derived (rollup SP)
`ado_writeback_queue`	Pending ADO ops; gated default `pending_approval`; drained by `ado-scribe` through CAS-guarded `sp_update_ado_writeback`	CoS
`tick_lease` · `scout_lease` · `drainer_lease`	Three independent single-flight mutexes (re-entrant same-holder CAS, TTL-expiry steal)	CoS (via shim entities)
`scout_watermark` · `scout_enable_flags`	Per-source scan watermark; 3-greens DB gate (`cost_breaker_live` / `drain_proven` / `scout_proven`) read by `sp_check_scout_enabled()`	scouts + Charlie
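The lease semantics named in the table (re-entrant same-holder CAS, TTL-expiry steal) can be sketched in memory as follows. This is an illustration only: the `Lease` class, its field names and the 60-second TTL are assumptions, and the real leases live in Postgres behind shim entities, where the compare-and-set would presumably be a single conditional update.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Lease:
    """Hypothetical in-memory stand-in for one lease row (e.g. tick_lease)."""

    ttl_seconds: float = 60.0   # assumed TTL, not taken from the source
    holder: Optional[str] = None
    expires_at: float = 0.0

    def try_acquire(self, who: str, now: float) -> bool:
        """Compare-and-set style acquire; True means `who` now holds the lease."""
        free = self.holder is None
        same_holder = self.holder == who   # re-entrant: the holder may renew
        expired = now >= self.expires_at   # TTL-expiry steal
        if free or same_holder or expired:
            self.holder = who
            self.expires_at = now + self.ttl_seconds
            return True
        return False

    def release(self, who: str) -> bool:
        """Only the current holder may release the lease."""
        if self.holder != who:
            return False
        self.holder = None
        self.expires_at = 0.0
        return True
```

A second caller is refused while the lease is live, the holder can renew at any time, and anyone can take over once the TTL has lapsed. That gives single-flight behaviour without a stuck lease blocking the ring forever.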

The shared brain spans ~29 base tables (22 originals + the Batch 3–7 substrate: rotation_log, agent_snapshots, the three leases, scout_watermark, scout_enable_flags) exposed through 52 DAB entities (base reads + *In shim views + action-views like CostRollupDayIn / CostBreakerRollingIn / ScoutEnabledCheckIn). A handful of cross-cutting entities (pod intel, customer health, risks, recipient registry) are read by more than one peer but still have a single writer. The rule never changes: one writer per row, everyone else reads or stays blind.
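As a rough sketch of the ownership rule, the check below refuses a write from any peer that is not a listed owner. The owner map is transcribed from a few rows of the table above; the `assert_can_write` helper and the `NotOwnerError` type are hypothetical, and the source does not say how the data-API layer actually enforces ownership.

```python
# Owners copied from a subset of the entity table; everything else here
# is an illustrative assumption, not the real enforcement mechanism.
OWNERS = {
    "briefs": {"CoS"},
    "comms_drafts": {"CoS"},
    "quiet_watch_state": {"CoS"},
    "adrs": {"SA"},
    "compliance_state": {"SA"},
    "reuse_catalog": {"SA"},
    "decisions": {"CoS", "SA"},
}


class NotOwnerError(PermissionError):
    """Raised when a peer tries to write an entity it does not own."""


def assert_can_write(peer: str, entity: str) -> None:
    """Raise unless `peer` is a listed owner of `entity`."""
    owners = OWNERS.get(entity)
    if owners is None:
        raise KeyError(f"unknown entity: {entity}")
    if peer not in owners:
        raise NotOwnerError(f"{peer} may read {entity} at most, never write it")


assert_can_write("SA", "adrs")          # allowed: SA owns adrs
assert_can_write("CoS", "decisions")    # allowed: decisions has two owners
```

Calling `assert_can_write("CoS", "adrs")` raises `NotOwnerError`, which is the behaviour the "everyone else reads or stays blind" rule implies.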

Who can talk to whom

Communication topology

The peers never call each other directly. The Chief of Staff and Solution Architect coordinate by writing routing flags onto decisions rows. The Steward is one-way: it reads health metadata and only ever "speaks" by spawning a replacement. Charlie is the only node everyone talks to directly.

Diagram legend: rose arcs are routing flags through the decisions table, the only peer channel. Grey dashes are escalations to Charlie. Teal dashes are the Steward reading metadata and spawning replacements. Peer routing is now persisted as data in the decision_class, owning_peer, adr_id and narrative columns on decisions (ISS-14); the per-handoff flag values shown are illustrative shorthand over those columns.
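To make "routing as queryable data" concrete, here is a minimal sketch of a peer reading its own queue. The four field names come from the decisions columns listed above; the `Decision` dataclass, the `queue_for` helper and the choice to carry the shorthand flag in `decision_class` are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Decision:
    """Hypothetical in-memory view of one decisions row."""

    decision_class: str      # assumed to carry the routing shorthand
    owning_peer: str         # the peer expected to act next
    adr_id: Optional[int]    # None until an ADR exists
    narrative: str


def queue_for(peer: str, rows: List[Decision]) -> List[Decision]:
    """Return the rows a peer should pick up; no peer-to-peer message needed."""
    return [row for row in rows if row.owning_peer == peer]


rows = [
    Decision("needs_sa_review", "SA", None, "Haleon data-residency concern"),
    Decision("needs_narrative_for_steerco", "CoS", 7, "Ratified residency ADR"),
]
sa_queue = queue_for("SA", rows)
```

Each peer polls the shared table for rows addressed to it, so the handoff is a write followed by a read rather than a call.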

End to end

A day in the life of the ring

Follow one signal — a Haleon stakeholder raising a data-residency concern — as it moves through the ring from inbound ping to ratified decision and customer-ready reply. (Today the ring is on-demand only. The timeline below shows the intended cadence; the recurring daily-brief cron and the 15-min ring-tick are not yet enabled — see Roadmap.)

07:00
CoS Morning brief. Overnight signals are metabolised into a 5-3-2-1 brief and posted to Charlie via Teams. The residency concern lands as one of the top-5 risks.
07:12
CoS Routing. The concern is decision-shaped and technical, so CoS writes a decisions row with routing=needs_sa_review — and purges the technical detail from its own working memory.
09:30
SA Picks up the queue. SA reads the routed row, checks the reuse catalog for an existing residency pattern, finds none, and opens a new ADR in draft.
10:05
SA Adversarial review. SA spawns the Reviewer skill. It pushes back on the first draft; SA revises. On pass, the ADR moves to reviewed.
10:40
SA Compliance gate. The decision touches a consumer-health surface, so the Compliance Checker runs (mandatory). It returns needs-review and a compliance_state row is written.
11:15
Steward Health tick. A routine metadata scan: both peers green, turn counts nominal, cost within band. An agent_runs audit row is written. Silent — no Teams.
14:00
👤 Charlie ratifies. SA surfaces the reviewed, compliance-checked ADR. Charlie acks; the ADR becomes ratified and SA writes a hand-back row with routing=needs_narrative_for_steerco.
14:20
CoS Drafts the reply. CoS picks up the narrative, drafts a reply to the Haleon stakeholder, runs it through the Outbound Voice skill, and queues it at awaiting_signoff.
14:25
👤 Charlie sends. He reviews the draft, tweaks one line, and approves. Only now does anything leave the building: this is the human gate.
17:00
CoS End-of-day brief. The residency thread is closed out in the brief, the ratified ADR noted, and it's added to the Steerco prep accumulation for the week.

Notice what never happened: the two agents never sent each other a message, no agent ever made an architecture call outside the Reviewer + Compliance gates, and nothing reached the customer without Charlie's explicit approval.
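The gates in the timeline can be read as a small state machine over an ADR's review stage. The sketch below uses the three stage names from the walkthrough (draft, reviewed, ratified); the `Adr` dataclass, its boolean fields and both functions are hypothetical. Note that `compliance_checked` only records that the Compliance Checker ran, since in the walkthrough it returned needs-review and Charlie still ratified.

```python
from dataclasses import dataclass


@dataclass
class Adr:
    """Hypothetical ADR record; only review_stage mirrors the source."""

    review_stage: str = "draft"
    reviewer_passed: bool = False
    needs_compliance: bool = False     # e.g. touches a consumer-health surface
    compliance_checked: bool = False   # the check ran, whatever it returned


def mark_reviewed(adr: Adr) -> None:
    """draft -> reviewed, only after the Reviewer skill has passed it."""
    if adr.review_stage != "draft":
        raise ValueError(f"cannot review from stage {adr.review_stage!r}")
    if not adr.reviewer_passed:
        raise ValueError("Reviewer has not passed this ADR")
    adr.review_stage = "reviewed"


def ratify(adr: Adr, charlie_ack: bool) -> None:
    """reviewed -> ratified, only with compliance run and Charlie's ack."""
    if adr.review_stage != "reviewed":
        raise ValueError(f"cannot ratify from stage {adr.review_stage!r}")
    if adr.needs_compliance and not adr.compliance_checked:
        raise ValueError("mandatory compliance check has not run")
    if not charlie_ack:
        raise PermissionError("ratification requires Charlie's ack")
    adr.review_stage = "ratified"
```

Every transition is guarded, so skipping the Reviewer, the compliance run or the human ack fails loudly instead of silently advancing the stage.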

Staying healthy over the long haul

The rotation lifecycle

A long-lived agent's context degrades over time — it re-asks questions, contradicts earlier decisions, loses the thread. The Steward exists to catch that and swap in a fresh successor without losing institutional memory. There are two ceremonies, and they're deliberately kept distinct.

① Context-degradation rotation

Any-time, threshold-driven, low-ceremony. Triggered by the two-signal rule on health metadata. The dying peer emits an ## Open state block; a successor is spawned, adopts that state, and is smoke-tested before the old one is archived.

② Phase-boundary rotation

Heavier, calendar-driven at the Pilot→Scale and Scale→Transform markers. Includes a fuller Historian snapshot and a phase-carryover document. Both delivery peers rotate together — a clean break for the new phase — and the Steward rotates last, with Charlie's ack.

The ceremony, step by step

The executable ceremony lives in the generic rotate-role skill (~/.copilot/m-skills/rotate-role/SKILL.md), driven by ~/.copilot/m-workflows/haleon/workflow.json. The old loom-specific rotate-clawpilot-role is a deprecated forwarding shim.

Classify. The dying peer is marked RED via the two-signal rule (or a hard self-flag / Charlie request).
Freeze & request state. Confirm the peer is idle, then ask it for its ## Open state block plus swap-protocol additions. Address by sessionId, not name — the name is about to be reused (D8).
Snapshot. Narrative roles (SA / Architect) get a structured snapshot from a Historian child; mechanical roles (CoS / Steward) get an inline summary taken from the freeze response. The snapshot is persisted to the dedicated agent_snapshots table (no longer the runtime ledger).
Spawn the successor. Rotator spawns from the same role brief (resolved from workflow.json), handed the predecessor's open state to adopt.
Await boot ack (D3 — no boot-turn race). Wait for the successor's ## Context health green ack before releasing the SDK handle. Release with keepPendingTurn:false.
Status-probe re-adopt (D4 — no orphan-turn race). A lightweight ping cleanly re-adopts the successor handle before the heavy readiness smoke-test.
Smoke-test. If it fails, both predecessor and failed successor stay alive — the successor is archived (delete:false), never deleted — and Charlie is pinged.
Archive. On a clean smoke-test, the predecessor is archived (delete:false, never deleted) and Charlie gets one Teams notification. The ZOMBIE branch (D5) defaults to cold-spawn + archive; transcript-synthesis or delete:true requires an explicit Charlie ack quoted in rotation_log.notes.
Broadcast. The successor announces its new sessionId to the surviving peers so the ring re-links — for a Steward rotation, only after Charlie's ack.
Record (Step 9b — detail-row-first, HS-4). Write rotation_log and agent_snapshots rows FIRST, capture their ids, then INSERT the agent_runs spine row with class='rotation' and detail_ref already populated. The runtime ledger (plan.md) is updated as a human-readable companion.
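The detail-row-first ordering of the Record step can be sketched as below, using SQLite as a stand-in for the shared Postgres database. The three table names and the `class` and `detail_ref` columns come from the text; every other column, the `record_rotation` helper and the decision to point `detail_ref` at the rotation_log id are assumptions.

```python
import sqlite3

# Assumed minimal schema; the real tables certainly carry more columns.
SCHEMA = """
CREATE TABLE rotation_log (id INTEGER PRIMARY KEY, notes TEXT);
CREATE TABLE agent_snapshots (
    id INTEGER PRIMARY KEY, rotation_id INTEGER NOT NULL, body TEXT);
CREATE TABLE agent_runs (
    id INTEGER PRIMARY KEY,
    session_id TEXT NOT NULL,
    session_turn_seq INTEGER NOT NULL,
    "class" TEXT NOT NULL,
    detail_ref INTEGER NOT NULL);
"""


def record_rotation(conn, session_id, turn_seq, notes, snapshot):
    """Write the detail rows first, then the spine row that points at them."""
    with conn:  # one transaction: commit on success, roll back on error
        cur = conn.execute(
            "INSERT INTO rotation_log (notes) VALUES (?)", (notes,))
        rotation_id = cur.lastrowid
        conn.execute(
            "INSERT INTO agent_snapshots (rotation_id, body) VALUES (?, ?)",
            (rotation_id, snapshot))
        # detail_ref is already known here, so the spine row is never
        # written with a dangling or empty reference.
        cur = conn.execute(
            'INSERT INTO agent_runs '
            '(session_id, session_turn_seq, "class", detail_ref) '
            "VALUES (?, ?, 'rotation', ?)",
            (session_id, turn_seq, rotation_id))
        return cur.lastrowid


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
run_id = record_rotation(conn, "sess-1", 42, "clean rotation", "open state")
```

Because the spine row is inserted last, an auditor scanning agent_runs should not find a rotation entry whose detail rows are missing.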

The cardinal rule of rotation: the rotator spawns the successor — the role being rotated never spawns itself. And the Steward never rotates itself without Charlie's explicit acknowledgement; per workflow.json's selfRotationRotator:"operator", Operator (Charlie's hands-on session) is the rotator-of-record for Steward self-rotation — the outgoing Steward NEVER spawns its own successor. That's single-point-of-failure protection for the ring.
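The cardinal rule reduces to a small lookup, sketched here under assumptions. The `selfRotationRotator` key and its `"operator"` value come from workflow.json as quoted above; the lowercase role names, the `resolve_rotator` helper and the idea that the Steward is the rotator for every other role are illustrative guesses at how the rule might be encoded.

```python
def resolve_rotator(role: str, workflow: dict, charlie_ack: bool = False) -> str:
    """Return who performs the rotation; never the role being rotated."""
    if role == "steward":
        # Steward self-rotation needs Charlie's explicit acknowledgement
        # and is carried out by the configured rotator-of-record.
        if not charlie_ack:
            raise PermissionError("Steward rotation requires Charlie's ack")
        rotator = workflow["selfRotationRotator"]
    else:
        rotator = "steward"  # assumption: the Steward rotates the other peers
    if rotator == role:
        raise ValueError("a role must never spawn its own successor")
    return rotator


workflow = {"selfRotationRotator": "operator"}  # only this key is from the source
cos_rotator = resolve_rotator("cos", workflow)
steward_rotator = resolve_rotator("steward", workflow, charlie_ack=True)
```

The final equality check is redundant for a sane config, but it turns a misconfigured workflow.json (for example `selfRotationRotator` set to `"steward"`) into an immediate error.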

