Single-pane attention router for interactive AI coding agents: stuck sessions surface in one prioritized queue (most-stuck first) via Claude Code hooks, and your reply is delivered back into the exact session. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
12 KiB
Trail Boss — design plan
The complete design: the problem, the capabilities required, the architecture, the
implementation phases, and the open questions. See
../research/claude-code-mechanics.md for the
underlying Claude Code primitives and ../research/related-work.md
for prior art.
Problem
You run long-form, human-in-the-loop agentic coding across many concurrent agent sessions, one per terminal window. Each session periodically stalls waiting on you: a permission prompt, a plan to approve, a clarifying question, or a finished turn awaiting the next instruction. Today you find those stalls by manually cycling windows. That polling is the bottleneck: most of your time goes to finding the session that needs you, not answering it, and blocked sessions burn wall-clock while otherwise-parallel work waits.
Invert the loop: stuck sessions raise their hand, and a single pane presents them as a prioritized queue of pending decisions, most-stuck first. You answer or skip; the reply is delivered back into the exact session, which unblocks and resumes. You never go hunting again.
What is needed
Five capabilities. The hard parts are #1 (a reliable "I'm blocked" signal) and #5 (getting the reply back into the exact session). The rest is plumbing.
1. A blocked-state signal from every session — Claude Code hooks
Each session emits an event the instant it starts waiting on a human. Claude Code's hook
system is the native mechanism; configure these in ~/.claude/settings.json (or a project
.claude/settings.json) to POST their stdin JSON to the collector:
| Hook | Fires when | Use |
|---|---|---|
Stop |
the turn finished; session is idle, waiting for the next prompt | enqueue ready for next — the reliable idle signal |
PermissionRequest |
a tool needs approval | enqueue an allow/deny decision |
Notification |
Claude needs attention (permission prompt, long idle) * | supplementary signal |
UserPromptSubmit |
input was submitted | dequeue (the block is resolved) |
SessionStart / SessionEnd |
session opens / closes | register / retire the session |
Stop is the backbone — it fires every time a turn completes and the session sits waiting, so
"idle, wants the next instruction" needs no polling. PermissionRequest covers hard blocks.
* Notification's exact trigger conditions are less documented; treat it as supplementary.
Every payload carries session_id, transcript_path, and cwd — the correlation keys.
2. A central collector + live state store
A small always-on service that ingests hook POSTs and maintains the authoritative set of
sessions and their status: {session_id, project, status: running|blocked|idle, reason, blocked_since, context_ref}. SQLite is enough; it broadcasts changes to the UI over
WebSocket/SSE.
3. Context extraction — what is each session asking?
A "blocked" status is useless without the question. Two sources:
- Transcript JSONL (
transcript_path): append-only, one message per line, tailable in real time. The last assistant / tool-use entry holds the permission request, the plan text, or the question body. - tmux
capture-pane -pas a literal fallback: scrape the visible prompt verbatim (useful when the transcript lags the on-screen menu).
4. The Trail Boss pane — the attention queue
A single-pane queue/inbox sorted by policy (most-stuck-first), rendering each item's context,
with a reply box and decision keys, keyboard-driven for speed. Can be a TUI or a thin web
page (better for rich plan/diff rendering and remote access). skip = advance the cursor only.
5. The input-delivery path — reply → the right session
The crux; it forks on substrate:
- Substrate A — tmux
send-keys(overlays a terminal workflow, no rewrite). ASessionStarthook records the session's$TMUX_PANEinto the registry, mappingsession_id → pane. To answer, the dispatcher runstmux send-keys -t <pane> '<reply>' Enter, or sends the menu choice for a permission prompt. Sessions stay interactive; the pane is just driven remotely. Lowest-friction MVP. - Substrate B — Agent SDK (cleaner, but a rewrite). Run sessions under the Python/TS Agent
SDK. The
canUseToolcallback turns every permission prompt into a programmatic await that can pend indefinitely while it waits for your remote allow/deny (and can return modified tool input). Streaming input delivers follow-up turns over a long-lived channel. Note: headlessclaude -pis not a path — it's one-shot and cannot stream input into a running session; only the SDK can.
Recommendation: build the MVP on Substrate A (hooks for detection + tmux for delivery) — zero rewrite. Treat Substrate B as v2 once the queue/UX is proven.
Architecture (Substrate A / MVP)
window 1 ─┐ agent session (hooks → curl)
window 2 ─┤ │ PermissionRequest / Stop / SessionStart
window N ─┘ ▼
┌──────────────────────┐ tail transcript JSONL
│ Collector + store │◀─── + tmux capture-pane (context)
│ (SQLite, WS/SSE) │
│ session_id → pane │
└──────────┬───────────┘
│ WS/SSE: queue state
▼
┌──────────────────────┐
│ Trail Boss pane │ most-stuck first
│ read · reply · skip │
└──────────┬───────────┘
│ reply / decision
▼
tmux send-keys -t <pane> → session unblocks, resumes
Components
- Emitter — hook scripts in
~/.claude/settings.json, present in every session:trailboss-emit.sh— passes stdin JSON through forPermissionRequest,Notification,Stop,UserPromptSubmit,SessionEnd.trailboss-register.sh— onSessionStart, enriches the payload with$TMUX_PANE, the tmux window name, and the git repo atcwd, then POSTs. This is what makes reply-delivery possible.
- Collector — ingests events, upserts session state, holds the
session_id → {pane, project, transcript_path}registry, tails transcripts for context, broadcasts queue state. - Trail Boss pane — subscribes to the collector, renders the queue, captures
replies/decisions, POSTs them to
/reply. - Dispatcher — resolves the target pane (Substrate A) or the pending
canUseToolfuture (Substrate B) and delivers the input.
Session state machine
SessionStart
│
▼
┌─────────┐ PreToolUse / activity
│ RUNNING │◀──────────────────────────┐
└────┬────┘ │
PermissionRequest │ Stop │ UserPromptSubmit
▼ │ (block resolved)
┌─────────┐ │
│ BLOCKED │─────────────────────────────┘
└────┬────┘ reply delivered → input lands → session resumes
│ SessionEnd
▼
┌────────┐
│ RETIRED│
└────────┘
BLOCKED carries a reason (permission | plan | question | idle) and a
blocked_since timestamp. reason drives both card layout and queue priority.
Queue & skip semantics
- Membership: every
BLOCKEDsession is a queue item;RUNNING/RETIREDare not. - Ordering (default): permission blocks (
permission|plan|question, time-sensitive) outrankidle(fromStop, opportunistic); within a tier, oldestblocked_sincefirst. - Focus cursor: starts at the head.
enterfocuses (expands context);tab/skipadvances without answering — the item staysBLOCKEDand re-surfaces (optionally with a small penalty so a just-skipped item doesn't bounce straight back to the top). - Dequeue: the
UserPromptSubmithook fires when input lands — whether the reply came from the pane or because you answered directly in the terminal. Either way the item leaves the queue, so it never lies. - Staleness: if a session retires (
SessionEnd) while queued, drop the item.
Session → pane registry (Substrate A)
The one piece Claude Code does not hand you. Built by trailboss-register.sh on SessionStart:
#!/usr/bin/env bash
# trailboss-register.sh — enrich SessionStart with tmux + repo, then POST
payload="$(cat)" # event JSON on stdin
pane="${TMUX_PANE:-}" # e.g. %12
win="$(tmux display-message -p -t "$pane" '#S:#I.#P' 2>/dev/null || true)"
repo="$(git -C "$(printf '%s' "$payload" | jq -r .cwd)" rev-parse --show-toplevel 2>/dev/null | xargs -r basename)"
printf '%s' "$payload" \
| jq --arg pane "$pane" --arg win "$win" --arg repo "$repo" \
'. + {tmux_pane:$pane, tmux_window:$win, repo:$repo}' \
| curl -s -X POST http://localhost:4000/event --data-binary @-
Delivery resolves session_id → tmux_pane and runs send-keys. A session with no pane
(launched outside tmux, or under the SDK) falls back to Substrate B.
Only deliver human-authored input. The dispatcher must
send-keysonly content the human explicitly typed in the pane — never synthesize or auto-answer a session.
Substrate B variant (Agent SDK)
When sessions run under the Agent SDK, the collector holds pending decision objects directly
instead of a pane map: on canUseTool, the SDK process registers a pending future
(session_id, tool, input) and awaits; the UI renders it; your allow/deny/edit resolves the
future via /reply; the callback returns {behavior, updatedInput?} and the session proceeds.
Removes the tmux scraping/send-keys fragility and gives input-modification for free, at the
cost of moving sessions off the terminal. Recommended as v2.
Implementation phases (Substrate A)
- Probe
Notification/Stopfiring behavior to lock the detection model. - Emitter — the two hook scripts +
settings.jsonwiring. - Collector — ingest endpoint, SQLite state, registry, WS broadcast.
- Context — transcript tailer +
capture-panefallback. - Trail Boss pane — minimal TUI or web page: queue list, focused-card context, reply box, skip.
- Dispatcher —
send-keysdelivery +UserPromptSubmitdequeue. Close the loop. - Iterate on ordering policy and re-surface penalty.
Open questions
- Idle vs. blocked priority —
Stopreliably signals "finished its turn, wants the next instruction," so idle detection is solved; the open question is ranking it below time-sensitive permission blocks. Confirm the policy. Notificationvalue-add — withStop+PermissionRequestcovering idle and hard blocks, doesNotificationsurface anything they miss (e.g., a long-idle nudge)? If not, drop it from the emitter.- Reply round-trip fidelity — does
send-keyssurvive multi-line input, paste, or a menu selection intact? Test per prompt type. - Stale items — a skipped item you answer directly in the terminal should dequeue via the
UserPromptSubmithook; confirm that closes the loop. - Ordering policy — most-stuck-first vs. project-priority vs. cheapest-to-answer-first.
- Remote access — local TUI vs. a web page reachable from a phone; web wins for plan/diff rendering and mobility.
- Concurrency ceiling — at what session count does the human saturate regardless of routing? The router buys throughput, not infinite capacity.