claude-print Plan
Overview
Single Rust binary that is a drop-in replacement for claude -p. It drives the Claude Code interactive TUI via PTY, extracts the response via the Stop hook and JSONL transcript, and emits claude -p-compatible output — all while billing against the subscription (cc_entrypoint=cli) rather than the Agent SDK credit pool.
Background
Starting June 15, 2026, Anthropic moves claude -p (headless) usage into a separate monthly credit pool. Only the interactive TUI (cc_entrypoint=cli) continues drawing from the unlimited subscription. claude-print wraps the TUI in a PTY so callers get claude -p wire-compatible output while billing against the subscription.
The billing classification is determined by isatty(stdout) inside the claude binary at startup:
- PTY slave as stdout → isatty() returns true → TUI mode → cc_entrypoint=cli → subscription
- Pipe as stdout → isatty() returns false → print mode → cc_entrypoint=sdk-cli → credit pool
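The rule above can be sketched from the outside. This is only an illustration, assuming the check inside claude behaves as described; the `Entrypoint` enum and both functions are hypothetical names, and `std::io::IsTerminal` stands in for the isatty(stdout) call.

```rust
use std::io::{self, IsTerminal};

/// Billing pool implied by the stdout type (hypothetical type for illustration).
#[derive(Debug, PartialEq, Eq)]
pub enum Entrypoint {
    Cli,    // cc_entrypoint=cli: subscription
    SdkCli, // cc_entrypoint=sdk-cli: credit pool
}

/// Maps the result of an isatty(stdout)-style check to the entrypoint value.
pub fn classify(stdout_is_tty: bool) -> Entrypoint {
    if stdout_is_tty {
        Entrypoint::Cli
    } else {
        Entrypoint::SdkCli
    }
}

/// Applies the same mapping to the current process's real stdout.
pub fn classify_current_process() -> Entrypoint {
    classify(io::stdout().is_terminal())
}
```

Run under a pipe, `classify_current_process()` would yield `SdkCli`; run with a PTY slave as stdout it would yield `Cli`, which is the case claude-print arranges for the child.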
Glossary
| Term | Definition |
|---|---|
| PTY | Pseudoterminal: a master/slave fd pair where isatty() returns true on the slave. Allows a parent process to control a child process's terminal I/O through the kernel line discipline. |
| cc_entrypoint | Anthropic billing header field. cli = subscription pool; sdk-cli = Agent SDK credit pool. Determined at Claude Code startup by isatty(stdout). |
| Stop hook | A Claude Code hook event fired when the AI completes a turn. Payload includes session_id, transcript_path, and last_assistant_message. Used as the IPC signal between the inner claude process and claude-print. (Note: in claude -p-style single-turn sessions, Stop fires once at session end. With --max-turns > 1 and tool use, Stop behavior is unverified — add to OQ-1 resolution checklist. The Stop Poller assumes single-fire per session; if multi-fire is observed, the poller must be updated to match on the JSONL Result event before acting.) |
| FIFO | POSIX named pipe (mkfifo). The Stop hook writes to it; the parent poll loop reads from it. Per-run, per-pid — prevents cross-invocation contamination. |
| Bracketed paste | Terminal feature that wraps pasted text in ESC[200~ … ESC[201~ markers. Prevents embedded newlines from triggering premature Enter in Ink's REPL. |
| Ink | The React/Yoga-based TUI framework used by Claude Code. Sends DEC terminal probes (DA1, DA2, DSR, XTVERSION, window-size) at startup and hangs indefinitely if unanswered. |
| login_tty | libc function (login_tty(3), provided by both glibc and musl): setsid() + ioctl(TIOCSCTTY) + dup2(slave, 0/1/2) + close(slave). Makes the PTY slave the controlling terminal for the child process. |
| JSONL transcript | Newline-delimited JSON at ~/.claude/projects/<cwd-slug>/<session-id>.jsonl. Claude Code appends one event per line as the session progresses. The <cwd-slug> is derived by stripping the leading / and replacing remaining / with -. (Note: paths containing hyphens in directory names produce ambiguous slugs; session_id resolves the file within the directory.) |
| usage-fingerprint | Tuple of (input_tokens, output_tokens, cache_creation_input_tokens, cache_read_input_tokens) used to deduplicate streaming JSONL events from the same API call when message.id is absent. |
| stream-json | Output format where each transcript event line is forwarded to stdout as Claude Code writes it, providing real-time streaming compatible with claude -p --output-format stream-json. |
| mock_claude | Compiled Rust binary (test-fixtures/mock-claude/) simulating Claude Code's PTY and JSONL behavior. Controlled via env vars — not a shell script. |
| NEEDLE | LLM fleet runner that dispatches AI agents to code workspaces. claude-print.yaml configures NEEDLE to use claude-print instead of claude -p. |
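As an illustration of the usage-fingerprint entry, the following sketch deduplicates events by message.id when present and by the four usage counters otherwise. The `AssistantEvent` struct and `dedup_events` function are hypothetical simplifications, not the transcript schema itself.

```rust
use std::collections::HashSet;

/// The four usage counters that form the fingerprint (field names from the glossary).
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub struct UsageFingerprint {
    pub input_tokens: u64,
    pub output_tokens: u64,
    pub cache_creation_input_tokens: u64,
    pub cache_read_input_tokens: u64,
}

/// Hypothetical, simplified view of one assistant event from the transcript.
#[derive(Debug, Clone)]
pub struct AssistantEvent {
    pub message_id: Option<String>,
    pub usage: UsageFingerprint,
}

/// Keeps the first event per API call: keyed by message.id when present,
/// otherwise by the usage fingerprint.
pub fn dedup_events(events: &[AssistantEvent]) -> Vec<AssistantEvent> {
    let mut seen_ids: HashSet<String> = HashSet::new();
    let mut seen_usage: HashSet<UsageFingerprint> = HashSet::new();
    let mut unique = Vec::new();
    for ev in events {
        // HashSet::insert returns true only for values not seen before.
        let is_new = match &ev.message_id {
            Some(id) => seen_ids.insert(id.clone()),
            None => seen_usage.insert(ev.usage),
        };
        if is_new {
            unique.push(ev.clone());
        }
    }
    unique
}
```

One limitation of this sketch: two distinct API calls that happen to report identical counters would collapse into one entry when message.id is absent.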
Non-Goals
The following are explicitly out of scope with rationale:
| Non-Goal | Rationale |
|---|---|
| Windows support | PTY (openpty, login_tty) is POSIX-only. The target platform is x86_64 Linux (musl). Adding Windows requires ConPTY — a fundamentally different approach not needed for the server/NEEDLE use case. |
| macOS / ARM Linux | Initial target is x86_64-unknown-linux-musl. Can be added in a future release if needed. |
| Response caching | Caching belongs at a higher layer (e.g., the NEEDLE dispatcher). Adding it here would complicate billing accounting and break the stateless design. |
| Multi-turn interactive sessions | claude-print handles one prompt → one response per invocation, mirroring claude -p semantics. Session management is the caller's responsibility. |
| GUI or web interface | Output format is stdin/stdout. No web server, no gRPC, no REST. |
| Rate-limit retry | Rate limits surface as exit 1. Retry logic belongs in the caller or NEEDLE. |
| Streaming response reassembly | stream-json forwards raw JSONL lines as-is. No custom streaming protocol or chunk reassembly. |
| Model-name validation | --model is forwarded verbatim to claude. If the model name is invalid, claude rejects it. |
Hard Requirements
These MUST hold. Any design that violates them is invalid.
- MUST produce a single statically-linked binary — no shared library dependencies, no Python, no Node, no scripts at runtime.
- MUST set cc_entrypoint=cli — every invocation MUST bill against the subscription pool. This is the core correctness invariant.
- MUST be a drop-in replacement for claude -p — positional prompt, stdin, --input-file, --output-format text/json/stream-json, --model, --max-turns, and all five exit codes MUST be compatible.
- MUST NOT redirect CLAUDE_CONFIG_DIR — transcripts MUST land in ~/.claude/projects/ exactly as claude -p writes them.
- MUST NOT break user hooks in default mode — all hooks in ~/.claude/settings.json MUST fire alongside the relay hook.
- MUST survive Claude Code version updates — unknown JSONL fields, event types, and escape sequences MUST be silently tolerated without a binary rebuild.
- MUST clean up temp dir on all exit paths — no leftover claude-print-* directories in $TMPDIR after normal exit, timeout, SIGINT, or panic.
- MUST forward SIGINT to child — Ctrl-C MUST reach the inner claude process.
What It Is Not
- Not a general-purpose PTY wrapper (not script(1) or tmux).
- Not a Claude Code plugin — it runs claude as a subprocess.
- Not a billing bypass — it uses the interactive TUI as designed; it does not spoof headers.
- Not a session manager — no state persists between invocations.
- Not aware of multi-turn conversation history — each invocation is independent.
- Not a streaming proxy — stream-json forwards raw JSONL, not a custom protocol.
Scope Lock
Any feature not listed in the Components section is out of scope for v1.0. To add a feature it MUST (1) solve a documented problem that claude -p compatibility cannot address, (2) not require changes to the PTY event loop's core state machine, and (3) not add a runtime dependency. Features violating the musl static binary requirement are permanently out of scope.
Normative Language
This document uses RFC-2119 conventions: MUST = required, MUST NOT = prohibited, SHOULD = recommended, MAY = optional.
Delivery
Single statically-linked binary. No Python, no runtime dependencies, no pip packages.
claude-print # the binary (musl static)
mock_claude # test fixture binary (musl static, installed by install.sh)
claude-print.yaml # NEEDLE agent config
install.sh # installs all of the above to ~/.local/bin/ and ~/.needle/agents/
Built with:
cargo build --release --target x86_64-unknown-linux-musl # fully static, no libc dep
Distribution: GitHub Release artifact via claude-print-ci Argo WorkflowTemplate (same pattern as NEEDLE, SIGIL, ARMOR).
Acceptance Scenarios
Named scenarios that define correct system behavior. Pass/fail criteria are testable without credentials unless noted.
AS-1: Shell Script Caller (Happy Path)
Action: echo "What is 2+2?" | claude-print
Pass: exit 0; stdout contains a non-empty text response; ~/.claude/projects/ gains a new JSONL file.
Fail: any non-zero exit, empty stdout, or stdout contains JSON syntax.
AS-2: JSON Consumer
Action: claude-print --output-format json "What is the capital of France?"
Pass: exit 0; stdout is a single valid JSON object with type=result, is_error=false, result non-empty, usage.input_tokens > 0, claude_version present.
Fail: invalid JSON, missing required field, is_error=true.
AS-3: NEEDLE Worker
Action: NEEDLE dispatches a bead with claude-print.yaml agent.
Pass: exit 0; JSON output contains a valid UUID session_id; transcript appears in ~/.claude/projects/<workspace-slug>/; --no-inherit-hooks suppresses user hooks.
Fail: NEEDLE cannot parse output; session_id absent; exit non-zero.
AS-4: Billing Classification
Action: Any invocation, followed by inspection of the most recent JSONL in ~/.claude/projects/.
Pass: The file contains a line with "entrypoint": "cli".
Fail: entrypoint is "sdk-cli" or absent.
(Credential-required; run manually and before each release.)
AS-5: Error Surface — claude Not Found
Action: PATH= claude-print "hello" (or --claude-binary /nonexistent).
Pass: exit 2; stderr contains a human-readable error naming the missing binary; --output-format json output has is_error=true, subtype=internal_error.
Fail: exit 0 or process hangs.
AS-6: Degraded Path — Transcript Race
Action: Integration test with mock_claude MOCK_DELAY_JSONL=150.
Pass: retry loop fires (visible in --verbose); response extracted correctly; exit 0.
Fail: exit non-zero or empty response.
Success Metrics
Functionality: AS-1 through AS-6 all pass on every commit; AS-4 passes before every release; all mock integration scenarios (at minimum, the scenarios listed in the integration test table) exit with expected codes.
Performance: claude-print overhead (invocation to prompt injection) < 5 s on a cold start; transcript reader produces output within 2 s of Stop hook firing; binary size < 10 MB.
Adoption: NEEDLE workers using claude-print.yaml produce zero billing-classification failures; claude --version changes do not require a claude-print rebuild within 30 days of a Claude Code release.
Architecture
caller
  │  prompt (stdin, arg, or --input-file)
  ▼
claude-print (single Rust binary)
  ├── CLI parser      flags forwarded to claude subprocess (clap)
  ├── Hook installer  per-run temp dir: settings.json + hook.sh + stop.fifo
  ├── PTY spawner     nix::pty::openpty() + fork() + login_tty()
  ├── Event loop      poll() on master_fd; dispatches to:
  │     ├── Terminal emu  responds to DA1/DA2/DSR/XTVERSION/window-size probes
  │     ├── Startup seq   phase 1: trust dismiss; phase 2: bracketed-paste inject
  │     └── FIFO poller   blocks on stop.fifo until Stop hook fires
  ├── Transcript rdr  JSONL parse → final text + token counts (retry loop)
  ├── Emitter         text / json / stream-json to stdout
  └── Cleanup         FIFO, temp dir, master_fd, waitpid
Module Layout
claude-print/
├── Cargo.toml                 # workspace root; declares `test-fixtures/mock-claude` as a workspace member so `cargo build` compiles `mock_claude`
├── Cargo.lock
├── install.sh
├── claude-print.yaml          # NEEDLE agent config
├── src/
│   ├── main.rs                # entry point: parse args, orchestrate
│   ├── cli.rs                 # clap CLI struct + validation
│   ├── config.rs              # ~/.config/claude-print/config.toml loader
│   ├── hook.rs                # HookInstaller: temp dir, settings.json, hook.sh, mkfifo
│   ├── pty.rs                 # PTY spawner: openpty, fork, login_tty, winsize
│   ├── event_loop.rs          # poll() loop: dispatch to terminal/startup/fifo
│   ├── terminal.rs            # TerminalEmu: probe scanner, response table, dedup bitmask
│   ├── startup.rs             # StartupSeq: trust dismiss, bracketed paste injection
│   ├── transcript.rs          # JSONL parser, usage dedup, text extraction, retry loop
│   ├── emitter.rs             # Output formatter: text/json/stream-json
│   └── error.rs               # ClaudePrintError enum, exit code mapping
├── tests/
│   ├── cli.rs
│   ├── terminal.rs
│   ├── transcript.rs
│   ├── hook.rs
│   ├── emitter.rs
│   ├── startup.rs
│   ├── version_compat.rs
│   ├── integration/
│   │   ├── mod.rs
│   │   └── scenarios.rs       # 20+ mock PTY integration tests
│   ├── hooks.rs               # hook inheritance tests
│   └── fixtures/
│       └── transcript_v2.1.168.jsonl
└── test-fixtures/
    └── mock-claude/
        ├── Cargo.toml
        └── src/
            └── main.rs
State Machine
Two orthogonal state machines run inside the event loop.
StartupSeq States
WAITING
│ trust keywords found in PTY line
│ OR (bytes_received ≥ 200 AND PTY idle ≥ 0.8 s)
▼
TRUST_DISMISSED ← CR sent
│ PTY idle ≥ 2.0 s after CR write
▼
PROMPT_INJECTED ← bracketed paste sent; FIFO read-end opened
│ FIFO becomes readable (Stop hook fired)
▼
DONE
From any state:
wall-clock timeout → SIGTERM child → exit 124
child exits unexpectedly → exit 2
SIGINT → SIGINT child (per HR-8) → exit 130
Stop fires before PROMPT_INJECTED → error: emit is_error=true, exit 2 (see EC-7: a response to an unsent prompt indicates a session identity leak; EC-11 prevents this in normal operation)
Guard conditions:
- WAITING → TRUST_DISMISSED: either trust keywords OR the idle/byte threshold. Not both required. One-shot: once the WAITING → TRUST_DISMISSED transition occurs for any reason (keyword or idle), the idle fallback is deactivated.
- TRUST_DISMISSED → PROMPT_INJECTED: idle gap measured from the CR write timestamp, not from last PTY output — avoids re-triggering on buffered output that arrives after CR.
- FIFO read end opened at the TRUST_DISMISSED → PROMPT_INJECTED transition, before the bracketed paste is written (EC-3).
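The transitions and guards above can be sketched as a pure function. This is a simplified model, not the planned startup.rs: the `Observed` struct and `next_phase` function are hypothetical, and the "from any state" exits (timeout, child exit, SIGINT, early Stop) are left out.

```rust
use std::time::Duration;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Phase {
    Waiting,
    TrustDismissed,
    PromptInjected,
    Done,
}

/// Observations the event loop would gather before each step (hypothetical struct).
#[derive(Debug, Clone, Copy, Default)]
pub struct Observed {
    pub trust_keywords_seen: bool,
    pub bytes_received: usize,
    pub pty_idle: Duration,       // time since the last PTY output
    pub since_cr_write: Duration, // time since CR was written
    pub fifo_readable: bool,
}

/// Returns the phase that follows `phase` given the current observations.
pub fn next_phase(phase: Phase, o: &Observed) -> Phase {
    match phase {
        Phase::Waiting => {
            // Either trigger is sufficient. Leaving Waiting makes the fallback one-shot,
            // because this arm is never evaluated again.
            let idle_fallback =
                o.bytes_received >= 200 && o.pty_idle >= Duration::from_millis(800);
            if o.trust_keywords_seen || idle_fallback {
                Phase::TrustDismissed
            } else {
                Phase::Waiting
            }
        }
        // Measured from the CR write, not from the last PTY output.
        Phase::TrustDismissed if o.since_cr_write >= Duration::from_secs(2) => {
            Phase::PromptInjected
        }
        Phase::PromptInjected if o.fifo_readable => Phase::Done,
        other => other,
    }
}
```

Keeping the transition function free of I/O makes each guard testable with plain values; the event loop would perform the side effects (send CR, open the FIFO, write the paste) whenever the returned phase differs from the current one.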
FIFO Poller States
UNOPENED
│ opened O_NONBLOCK at TRUST_DISMISSED → PROMPT_INJECTED transition
▼
OPEN_WAITING
│ FIFO becomes readable (Stop hook wrote payload)
▼
PAYLOAD_READ → DONE
FIFO open mechanics: Opening a FIFO O_RDONLY|O_NONBLOCK succeeds immediately even when no writer exists; it is the O_WRONLY|O_NONBLOCK open that fails with ENXIO when no reader holds the FIFO. claude-print therefore opens the read end first, then opens a "keeper" write-end fd O_WRONLY|O_NONBLOCK on the same FIFO and holds it until Stop fires. While the keeper is held, the read end never observes end-of-file (read() returning 0, or POLLHUP from poll()) when a transient writer closes, so the poll loop wakes only on real payload data. The hook.sh write (cat > '<fifo>') opens a second write end and writes the payload; both write ends are valid simultaneously. When Stop fires and the payload is read, the keeper write-end fd is closed. On all other exit paths (SIGINT, timeout, child-exit-before-Stop), the keeper write-end fd and the read-end fd MUST both be explicitly closed before waitpid. With no reader left, a hook.sh cat that already has the FIFO open receives EPIPE on write and exits, which prevents a hang in claude's hook runner; a cat still blocked in open() is bounded by the relay hook's 10 s timeout.
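The open order can be sketched with the standard library alone, relying on POSIX FIFO semantics: a non-blocking read-only open returns immediately, while a non-blocking write-only open fails with ENXIO until some reader holds the FIFO. Several things here are assumptions made to keep the sketch std-only: the hard-coded Linux value of O_NONBLOCK, shelling out to mkfifo(1) instead of calling nix::unistd::mkfifo(), and all function names.

```rust
use std::fs::{File, OpenOptions};
use std::io::{self, Read, Write};
use std::os::unix::fs::OpenOptionsExt;
use std::path::Path;
use std::process::Command;

/// O_NONBLOCK as defined on x86_64 Linux (assumption; the libc crate exports it portably).
const O_NONBLOCK: i32 = 0o4000;

/// Creates the FIFO by shelling out to mkfifo(1) so the sketch stays std-only.
pub fn make_fifo(path: &Path) -> io::Result<()> {
    let status = Command::new("mkfifo").arg("-m").arg("600").arg(path).status()?;
    if status.success() {
        Ok(())
    } else {
        Err(io::Error::new(io::ErrorKind::Other, "mkfifo failed"))
    }
}

/// Non-blocking write-only open; fails with ENXIO while no reader holds the FIFO.
pub fn open_writer_nonblocking(path: &Path) -> io::Result<File> {
    OpenOptions::new().write(true).custom_flags(O_NONBLOCK).open(path)
}

/// Opens the read end first, then the keeper write end.
pub fn open_fifo_pair(path: &Path) -> io::Result<(File, File)> {
    let reader = OpenOptions::new().read(true).custom_flags(O_NONBLOCK).open(path)?;
    let keeper = open_writer_nonblocking(path)?;
    Ok((reader, keeper))
}

/// Stands in for hook.sh: opens a second write end and writes the payload.
pub fn write_payload(path: &Path, payload: &[u8]) -> io::Result<()> {
    OpenOptions::new().write(true).open(path)?.write_all(payload)
}

/// Returns Ok(None) when no data is pending (EAGAIN), otherwise the bytes read.
pub fn read_available(reader: &mut File) -> io::Result<Option<Vec<u8>>> {
    let mut buf = [0u8; 4096];
    match reader.read(&mut buf) {
        Ok(n) => Ok(Some(buf[..n].to_vec())),
        Err(e) if e.kind() == io::ErrorKind::WouldBlock => Ok(None),
        Err(e) => Err(e),
    }
}
```

With the keeper held, an idle FIFO reads as "no data yet" rather than end-of-file, so the poll loop can tell a pending Stop from a writer that has come and gone.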
Concurrency Model
claude-print is single-threaded except for stream-json mode.
Default and json mode
All work runs on the main thread: fork(), poll() event loop, transcript reading, output. No shared mutable state. No locks.
stream-json mode
A reader thread is spawned at PROMPT_INJECTED:
Main thread Reader thread
───────────────────────────────── ──────────────────────────────────
poll() loop (master_fd, stop_fifo) tail transcript from prompt_injected_at
│ byte offset — captured as file.seek(End)
│ on the transcript file at the moment the
│ bracketed paste is written. The reader
│ thread reads from this byte offset forward,
│ so pre-injection events (SessionStart,
│ system messages) are not forwarded to stdout.
│ If the transcript file does not exist at
│ prompt injection time (claude has not yet
│ written the first event), the reader thread
│ MUST retry the file open in a loop with 50ms
│ sleeps until the file appears or a 5-second
│ timeout expires. If the 5-second timeout
│ expires, the reader thread MUST send the
│ drain signal on the mpsc channel (same as
│ normal Stop) before returning, so the main
│ thread's `Receiver::recv()` returns promptly.
│ The main thread then emits an error result
│ (`is_error: true`, `subtype: 'internal_error'`,
│ `error_message: 'transcript file did not appear
│ within 5s'`) and exits 2. This is the same race
│ condition handled by the normal transcript
│ reader's retry loop, applied here to the
│ file-open step rather than the content-read
│ step.
│ write each new line → stdout
Stop fires via mpsc::channel unbounded sender
│
mpsc drain_signal sent drain remaining lines, thread exits
│
join reader thread
│
emit exit code
Synchronization: one-shot std::sync::mpsc::channel. Reader owns the transcript file handle (no sharing). Reader thread MUST be joined before main() returns on all exit paths — including timeout and SIGINT paths (the SIGINT handler sets a flag that breaks the poll loop, which then joins the thread before calling process::exit).
Non-Stop exit paths (SIGINT, timeout): The reader thread MUST also exit on these paths. Mechanism: the reader thread holds the mpsc Receiver; the main thread holds the Sender. On SIGINT or timeout, the main thread drops the Sender (without sending a value). The receiver's recv() or try_recv() then returns Err(RecvError), which the reader thread treats as a shutdown signal — it exits its tail loop and returns. This means join() returns promptly on all exit paths. The reader thread drain logic: on Ok(()) from recv = drain_signal; on Err = immediate exit without draining.
The reader thread handle is stored as Option<JoinHandle<()>>, initialized to None. The Option is set to Some(handle) only at the PROMPT_INJECTED transition when the thread is spawned. On any exit path — including early exits before PROMPT_INJECTED — the join is conditional: if let Some(h) = reader_handle { h.join().ok(); }
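The two shutdown paths can be sketched with std::sync::mpsc. This is a reduced model under stated assumptions: the reader blocks in recv() instead of tailing a file with try_recv(), and it returns a hypothetical `ReaderExit` value (the plan stores a `JoinHandle<()>`) so the outcome is observable.

```rust
use std::sync::mpsc::{self, Receiver};
use std::thread::{self, JoinHandle};

/// How the reader thread ended (hypothetical type for illustration).
#[derive(Debug, PartialEq, Eq)]
pub enum ReaderExit {
    Drained, // Stop path: the drain signal was received
    Aborted, // SIGINT/timeout path: the Sender was dropped without a value
}

/// Spawns a reader that waits for the drain signal or for the channel to close.
pub fn spawn_reader(rx: Receiver<()>) -> JoinHandle<ReaderExit> {
    thread::spawn(move || match rx.recv() {
        Ok(()) => ReaderExit::Drained, // a real reader would drain remaining lines here
        Err(_) => ReaderExit::Aborted, // exit immediately without draining
    })
}

/// Joins the reader only if it was ever spawned.
pub fn join_reader(handle: Option<JoinHandle<ReaderExit>>) -> Option<ReaderExit> {
    handle.and_then(|h| h.join().ok())
}

/// Stop path: send the drain signal, then join.
pub fn stop_path() -> Option<ReaderExit> {
    let (tx, rx) = mpsc::channel::<()>();
    let handle = Some(spawn_reader(rx));
    tx.send(()).ok();
    join_reader(handle)
}

/// SIGINT/timeout path: drop the Sender without sending, then join.
pub fn interrupt_path() -> Option<ReaderExit> {
    let (tx, rx) = mpsc::channel::<()>();
    let handle = Some(spawn_reader(rx));
    drop(tx);
    join_reader(handle)
}
```

Because dropping the Sender is itself the shutdown signal, the main thread needs no extra flag for the reader, and `join_reader(None)` covers exits that occur before the thread exists.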
Cross-Cutting Concerns
Error Propagation
error.rs defines ClaudePrintError with an exit code per variant. All errors route through the Emitter, so --output-format json callers always receive a structured error object, never bare stderr.
pub enum ClaudePrintError {
    Setup(String),          // exit 2
    Timeout,                // exit 124
    Interrupted,            // exit 130
    AssistantError(String), // exit 1
}
Variant-to-JSON mapping:
| Variant | JSON subtype | Exit code |
|---|---|---|
| Setup(_) | "internal_error" | 2 |
| Timeout | "timeout" | 124 |
| Interrupted | "interrupted" | 130 |
| AssistantError(_) | "assistant_error" | 1 |
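The mapping table translates directly into two accessor methods. The method names `exit_code` and `subtype` are assumptions for illustration; the values come from the table.

```rust
#[derive(Debug)]
pub enum ClaudePrintError {
    Setup(String),
    Timeout,
    Interrupted,
    AssistantError(String),
}

impl ClaudePrintError {
    /// Process exit code for this variant.
    pub fn exit_code(&self) -> i32 {
        match self {
            ClaudePrintError::Setup(_) => 2,
            ClaudePrintError::Timeout => 124,
            ClaudePrintError::Interrupted => 130,
            ClaudePrintError::AssistantError(_) => 1,
        }
    }

    /// Value of the JSON `subtype` field for this variant.
    pub fn subtype(&self) -> &'static str {
        match self {
            ClaudePrintError::Setup(_) => "internal_error",
            ClaudePrintError::Timeout => "timeout",
            ClaudePrintError::Interrupted => "interrupted",
            ClaudePrintError::AssistantError(_) => "assistant_error",
        }
    }
}
```

Exhaustive matches without a wildcard arm mean that adding a variant later fails to compile until both mappings are extended.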
--verbose Trace Points
Written to stderr, timestamped [claude-print <ms>ms] <message>. Never to stdout. Trace points (in order): temp dir created, PTY opened, child forked (pid), phase transitions, FIFO opened, prompt injected, Stop received (session_id), retry count, cleanup reason.
Signal Handling
| Signal | Handler | Action |
|---|---|---|
| SIGINT | installed before fork | SIGINT child (forwarding the signal as required by HR-8); set interrupted flag; poll loop breaks; join reader thread (if any); emit exit 130 |
| SIGTERM | installed before fork; mirrors the SIGINT handler | SIGTERM child (mirroring HR-8); set interrupted flag; write to self-pipe; poll loop breaks; join reader thread (if any); exit 130 via the Interrupted variant, same as SIGINT. Normal cleanup and TempDir drop run before exit, so SIGTERM is not a dirty kill and INV-1 and INV-2 hold on SIGTERM. |
| SIGPIPE | ignored | stdout pipe may close early in stream-json mode |
Signal handler safety: The interrupted flag MUST be std::sync::atomic::AtomicBool with store(true, Ordering::SeqCst). Calling kill(2) from a signal handler is async-signal-safe on Linux. The AtomicBool::store is also safe from signal handlers. To wake a blocked poll() call, use a self-pipe: before fork(), create a pipe(2) pair; add the read-end to the pollfd array; the SIGINT/SIGTERM handler writes one byte to the write-end. The poll() loop checks the self-pipe read-end and the AtomicBool on each wake.
Temp Dir Cleanup
tempfile::TempDir is stored in main() scope (not nested in a struct). Drop on any exit path — including panics — calls remove_dir_all. The SIGINT handler does not directly clean up; it breaks the poll loop which returns control to main() where TempDir drops normally.
Log Boundary
claude-print writes NO files to ~/.claude/. All artifacts there are written by the inner claude process. claude-print only reads ~/.claude/projects/<slug>/<session-id>.jsonl after Stop fires.
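A sketch of how the transcript path might be computed, using the slug rule stated in the Glossary (strip the leading `/`, replace the remaining `/` with `-`). The function names are hypothetical, and the rule itself is taken from this document rather than verified against Claude Code.

```rust
use std::path::PathBuf;

/// Derives the project slug from an absolute working directory.
pub fn cwd_slug(cwd: &str) -> String {
    cwd.strip_prefix('/').unwrap_or(cwd).replace('/', "-")
}

/// Builds <home>/.claude/projects/<cwd-slug>/<session-id>.jsonl.
pub fn transcript_path(home: &str, cwd: &str, session_id: &str) -> PathBuf {
    PathBuf::from(home)
        .join(".claude")
        .join("projects")
        .join(cwd_slug(cwd))
        .join(format!("{session_id}.jsonl"))
}
```

Note that `/a/b-c` and `/a/b/c` map to the same slug, which is the ambiguity the Glossary mentions; the session ID disambiguates the file inside the shared directory.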
Hook Inheritance and Log Placement
Default: Inherit User Hooks
By default claude-print does not redirect CLAUDE_CONFIG_DIR. The inner claude process:
- Writes its transcript to ~/.claude/projects/<cwd-slug>/<session-id>.jsonl directly — the same place claude -p writes it
- Writes its session entry to ~/.claude/sessions/<pid>.json (ccdash sees it as a normal session)
- Appends to ~/.claude/history.jsonl
- Fires all hooks in ~/.claude/settings.json (SessionStart, Stop, PreToolUse, trail-boss, ccdash, etc.)
claude-print adds its own Stop hook by passing --settings <temp>/settings.json with the per-run relay hook. Claude Code merges --settings with the user's settings file — all existing hooks continue to fire alongside the relay hook (merge behavior per OQ-1, unverified; see Hook Installer §2 schema note and PO-1 for fallback if merge fails).
This matches exactly what claude -p does. Transcripts, token counts, and usage stats land in ~/.claude/ with no special handling.
--no-inherit-hooks (Isolation Mode)
When --no-inherit-hooks is passed:
- --setting-sources= is forwarded to claude (empty value = load no standard settings sources)
- Only --settings <temp>/settings.json is loaded, which contains solely the Stop relay hook
- User's ~/.claude/settings.json hooks do not fire (ccdash, trail-boss, etc.)
- CLAUDE_CONFIG_DIR is not set even in isolation mode — transcripts still land in ~/.claude/projects/
Use this when running as a NEEDLE worker to prevent hook noise, or when the user's hooks have side effects (e.g., trail-boss POSTs to a collector that doesn't expect headless sessions).
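The difference between the two modes comes down to one extra argument on the inner claude command line. A minimal sketch, assuming the `--setting-sources=` behavior holds (it is unverified per OQ-2); both function names are hypothetical.

```rust
/// Combines the CLI flag and the config value: the CLI flag wins when present.
pub fn effective_inherit_hooks(cli_no_inherit_hooks: bool, config_inherit_hooks: bool) -> bool {
    !cli_no_inherit_hooks && config_inherit_hooks
}

/// Builds the settings-related arguments forwarded to the inner claude process.
pub fn settings_args(settings_path: &str, inherit_hooks: bool) -> Vec<String> {
    // The relay hook settings file is passed in both modes.
    let mut args = vec!["--settings".to_string(), settings_path.to_string()];
    if !inherit_hooks {
        // Empty value: load no standard settings sources (unverified per OQ-2).
        args.push("--setting-sources=".to_string());
    }
    args
}
```

CLAUDE_CONFIG_DIR does not appear in either branch, which matches the requirement that transcripts land in ~/.claude/projects/ in both modes.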
Configuration File
$XDG_CONFIG_HOME/claude-print/config.toml if $XDG_CONFIG_HOME is set, otherwise ~/.config/claude-print/config.toml. Created with defaults on first run.
[defaults]
inherit_hooks = true # do not pass --setting-sources; let claude use its default source loading
model = "claude-sonnet-4-6"
max_turns = 30
timeout_secs = 3600
CLI flags override config file values. inherit_hooks defaults to true. Setting it to false is equivalent to passing --no-inherit-hooks on the command line: --setting-sources= (per OQ-2, unverified) is forwarded to the inner claude process, suppressing user hook inheritance. CLI --no-inherit-hooks takes precedence over the config file value.
Where Logs and Token Counts Land
In both modes:
| Artifact | Location | Same as claude -p? |
|---|---|---|
| Transcript JSONL | ~/.claude/projects/<cwd-slug>/<session-id>.jsonl | Yes |
| Session registry | ~/.claude/sessions/<pid>.json | Yes |
| History entry | ~/.claude/history.jsonl | Yes |
| Stats cache | ~/.claude/stats-cache.json (rebuilt on next interactive start) | Yes |
| Token counts | Inside the transcript JSONL message.usage fields | Yes |
The temp dir holds only the relay infrastructure (hook script + FIFO). It is not part of the log path.
Crate Dependencies
| Crate | Purpose | Rationale |
|---|---|---|
| clap (derive) | CLI argument parsing | Derive macros generate type-safe flag structs with no boilerplate; dominates Rust CLI tooling; well-maintained. argh considered but lacks completions/subcommands for future extensibility. |
| nix | openpty, fork, login_tty, setsid, ioctl, poll, mkfifo, signal | Safe Rust wrappers over the exact POSIX syscalls needed. Using the libc crate directly would require more unsafe blocks with no benefit. |
| serde + serde_json | JSONL parsing with schema-tolerant deserialization | Standard choice; #[serde(default)] + #[serde(other)] give schema tolerance with no extra code. |
| uuid | Reserved for future use (e.g., pre-assigning a session ID before spawning claude). Not required in v1.0 — the session_id is derived from the Stop payload or transcript filename. May be removed if unused after implementation. | Listed in Cargo.toml but not yet called; session_id is derived at runtime from Stop payload or transcript basename, not generated. |
| tempfile | Per-run temp directory with guaranteed cleanup | TempDir drop cleans up even on panic — manual mktemp + cleanup would require careful unwinding. |
No async runtime: the PTY event loop is a tight poll() on 2–3 fds; tokio would add binary size, compile time, and conceptual overhead for no throughput benefit. stream-json uses a single reader thread — no runtime needed.
No regex crate: probe matching uses a byte-by-byte state machine because probe bytes can straddle chunk boundaries; regex on a raw chunk would miss split sequences.
Components
1. CLI Interface
Drop-in for claude -p:
| Flag | Description |
|---|---|
| prompt (positional) | Prompt string; mutually exclusive with --input-file and stdin |
| --input-file FILE | Read prompt from file |
| --model MODEL | Forwarded to claude (default: claude-sonnet-4-6) |
| --max-turns N | Forwarded to claude (default: 30) |
| --output-format FORMAT | text (default), json, stream-json |
| --allowedTools LIST | Comma-separated, forwarded |
| --disallowedTools LIST | Forwarded |
| --dangerously-skip-permissions | Forwarded |
| --timeout SECS | Wall-clock timeout (default: 3600) |
| --claude-binary PATH | Override claude binary path (default: resolves claude from PATH) |
| --no-inherit-hooks | Disable user hook inheritance; passes --setting-sources= to claude (unverified per OQ-2) |
| --version | Print claude-print <version> (wrapping claude <version>) and exit. The claude version is obtained by running the binary at --claude-binary (or the PATH-resolved claude if not specified). If claude is not found, print claude-print <version> (wrapping claude: not found) and exit 0. |
| --verbose | Write timing traces to stderr |
| --check | Run installation self-test: verify openpty, mkfifo, optional PTY round-trip with mock_claude. Exits 0 on all checks passed, 2 on any failure. |
Stdin accepted as prompt when not a TTY and no positional/--input-file given.
Model precedence: CLI --model flag > config.toml defaults.model > compiled-in default (claude-sonnet-4-6). The NEEDLE claude-print.yaml model: field is passed by NEEDLE as the {model} template variable, which is forwarded via --model — so NEEDLE YAML's model is equivalent to passing --model on the command line.
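The precedence chain reduces to a pair of `Option` fallbacks. A minimal sketch; `resolve_model` and `DEFAULT_MODEL` are hypothetical names, and the default string is the one given above.

```rust
/// Compiled-in default from this plan.
pub const DEFAULT_MODEL: &str = "claude-sonnet-4-6";

/// CLI --model wins over config.toml defaults.model, which wins over the default.
pub fn resolve_model(cli: Option<&str>, config: Option<&str>) -> String {
    cli.or(config).unwrap_or(DEFAULT_MODEL).to_string()
}
```

For example, `resolve_model(None, Some("from-config"))` picks the config value, while passing a model on the command line (or through NEEDLE's `{model}` variable) overrides it.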
Exit codes:
- 0 — success
- 1 — assistant error (is_error: true in transcript)
- 2 — internal error (PTY spawn, hook setup, parse failure)
- 124 — timeout exceeded
- 130 — interrupted (SIGINT)
2. Hook Installer
Creates $TMPDIR/claude-print-<pid>-<rand>/ via tempfile::Builder, created with mode 0700 (via tempfile::Builder::new().mode(0o700)) — world-readable temp dirs would allow other local users to read the Stop hook payload (T-1). The temp dir path is validated at creation time: if the path returned by tempfile contains a single-quote character, abort with exit 2 (see T-4). In practice this cannot happen with standard tempfile crate output, but the check is required by the security threat model.
<temp>/
├── settings.json ← per-run Stop relay hook (merged with user settings via --settings)
├── hook.sh ← executed by Claude Code on Stop
└── stop.fifo ← POSIX named pipe for hook→parent IPC
settings.json — contains only the per-run Stop relay hook:
{
  "hooks": {
    "Stop": [{
      "hooks": [{"type": "command", "command": "<temp>/hook.sh", "timeout": 10}]
    }]
  }
}
Passed to claude via --settings <temp>/settings.json. Claude Code merges this with all other loaded settings sources. The user's ~/.claude/settings.json Stop hooks (if any) also fire, plus this relay hook.
Schema note: This double-nested hooks.Stop[{hooks:[...]}] structure matches the Claude Code settings format observed in v2.x. Add schema verification to OQ-1's resolution checklist: confirm the settings JSON schema by inspecting a real ~/.claude/settings.json from the target Claude Code version. If the schema changes, this template must be updated.
Hook merge ordering: Claude Code runs merged hooks sequentially in the order they appear in the merged settings. The relay hook's "timeout": 10 applies only to the relay hook itself — it does not affect the user's hooks. The user's Stop hooks likely run first (settings.json is merged before --settings), but this ordering is unverified (per OQ-1).
hook.sh (executed by Claude Code on Stop):
#!/bin/sh
cat > '<temp>/stop.fifo' 2>/dev/null || true
Receives the Stop JSON payload on stdin and writes it to the FIFO. Claude Code does not wait for the hook to complete beyond the 10 s timeout.
stop.fifo — POSIX named pipe created with nix::unistd::mkfifo().
In --no-inherit-hooks mode, also forward --setting-sources= to claude (empty = no standard sources loaded) (per OQ-2, unverified; see PO-2 for fallback). Only --settings <temp>/settings.json is active. This prevents the user's SessionStart/Stop/PreToolUse hooks from firing.
tempfile::TempDir handles cleanup on any drop path.
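The two generated files can be sketched as string templates, including the single-quote check required by T-4. The function names are hypothetical, and the JSON is assembled with format! only to keep the sketch dependency-free; it assumes the temp path contains no characters that need JSON escaping, whereas a real implementation would more likely serialize with serde_json.

```rust
/// Renders hook.sh, refusing paths that would break the single-quoted FIFO path.
pub fn render_hook_sh(temp_dir: &str) -> Result<String, String> {
    if temp_dir.contains('\'') {
        return Err(format!("temp dir path contains a single quote: {temp_dir}"));
    }
    Ok(format!(
        "#!/bin/sh\ncat > '{temp_dir}/stop.fifo' 2>/dev/null || true\n"
    ))
}

/// Renders the per-run settings.json holding only the Stop relay hook.
/// Assumes `temp_dir` has no quotes or backslashes (no JSON escaping is done).
pub fn render_settings_json(temp_dir: &str) -> String {
    format!(
        r#"{{"hooks":{{"Stop":[{{"hooks":[{{"type":"command","command":"{temp_dir}/hook.sh","timeout":10}}]}}]}}}}"#
    )
}
```

Rejecting the path up front keeps the shell script free of any quoting logic: the FIFO path is always a single-quoted literal.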
3. PTY Spawner
use nix::pty::{openpty, OpenptyResult};
use nix::sys::signal::{signal, SigHandler, Signal};
use nix::unistd::{execvp, fork, ForkResult, login_tty};

let OpenptyResult { master, slave } = openpty(None, None)?;
// Set window size on master before fork
set_winsize(master, rows, cols);
match unsafe { fork()? } {
    ForkResult::Child => {
        drop(master);
        login_tty(slave)?; // setsid + TIOCSCTTY + dup2(slave, 0/1/2)
        // Reset inherited signal handlers to default before exec
        // (nix's signal() is an unsafe fn, hence the block)
        unsafe {
            signal(Signal::SIGINT, SigHandler::SigDfl)?;
            signal(Signal::SIGTERM, SigHandler::SigDfl)?;
        }
        execvp("claude", &args)?;
        unreachable!()
    }
    ForkResult::Parent { child } => {
        drop(slave);
        // After the prompt is read from stdin and the fork is complete, the parent
        // closes STDIN_FILENO (nix::unistd::close(0)) to release the caller's pipe.
        // The child's fd 0 is already replaced by login_tty's dup2(slave, 0) regardless.
        run_event_loop(master, child, ...)
    }
}
Signal handlers MUST be reset to SIG_DFL in the child before execvp — the child inherits the parent's SIGINT/SIGTERM handlers from fork(), which would interfere with claude's own signal handling.
login_tty(slave) follows login_tty(3), which musl provides as well as glibc: setsid() → TIOCSCTTY → dup2(slave, 0/1/2) → close(slave).
Window size probe order: (1) TIOCGWINSZ on STDOUT_FILENO, (2) TIOCGWINSZ on STDIN_FILENO, (3) open /dev/tty and TIOCGWINSZ, (4) fallback 220 × 50. In headless/NEEDLE mode, steps 1–3 all fail and the fallback is always used — this is the expected behavior.
Cleanup on any exit path: SIGTERM → 2 s → SIGKILL → waitpid. (Note: the 2-second grace period means actual process exit may be up to 2s after the specified --timeout. Callers should account for this when setting their own outer timeout budget. The grace period exists to allow claude to save any in-progress state before being killed.)
4. Event Loop
Single poll() call on master_fd and self_pipe_read (2 fds always present). At PROMPT_INJECTED, stop_fifo read-end is added as a third fd. Deadline tracking is separate:
master_fd POLLIN → read PTY output, dispatch to TerminalEmu + StartupSeq
stop_fifo POLLIN → Stop hook fired; read payload, begin transcript extraction (added at PROMPT_INJECTED)
[timeout] — → tracked via Instant; sets poll() timeout_ms, not a physical fd
Timer mechanism: There is no separate timer fd. Timeouts (startup 45s, wall-clock --timeout) are tracked via Instant::now() captured at the relevant phase transition. On each poll() call, the timeout argument is set to the minimum remaining ms across all active timers. poll() returns at or before the soonest deadline. The initial poll set is 2 fds (master_fd, self_pipe_read); the FIFO fd is pushed at PROMPT_INJECTED. The [timeout] entry in the list above is a logical representation of deadline tracking, not a physical fd.
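The deadline computation can be sketched as a small pure function. `poll_timeout_ms` is a hypothetical helper; it assumes deadlines are stored as `Instant` values and that the result feeds poll()'s millisecond timeout argument, where -1 means "block indefinitely".

```rust
use std::time::Instant;

/// Returns the poll() timeout in milliseconds: the smallest remaining time across
/// all active deadlines, 0 if one has already passed, or -1 if none are active.
pub fn poll_timeout_ms(now: Instant, deadlines: &[Instant]) -> i32 {
    deadlines
        .iter()
        // Deadlines in the past saturate to a zero duration instead of panicking.
        .map(|d| d.saturating_duration_since(now))
        .min()
        .map(|remaining| {
            // Round up so poll() does not wake just short of the deadline.
            let ms = (remaining.as_nanos() + 999_999) / 1_000_000;
            i32::try_from(ms).unwrap_or(i32::MAX)
        })
        .unwrap_or(-1)
}
```

After each poll() return the loop would compare `Instant::now()` against every deadline to decide which timer, if any, actually expired.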
Dynamic fd registration: The event loop initially polls master_fd and self_pipe_read (2 fds). At the TRUST_DISMISSED → PROMPT_INJECTED transition, the FIFO read-end fd is added to the poll() set. Subsequent poll() iterations include all three fds. The simplest implementation: represent the pollfd array as a Vec<pollfd> and push the FIFO fd at transition time.
TerminalEmu runs on every chunk of PTY output, scanning for escape sequences and queueing responses. Responses are written to master_fd on the next writable poll.
StartupSeq tracks phase (Waiting / TrustDismissed / PromptInjected / Done) and transitions based on heuristics (see §6).
FifoPoller opens stop.fifo for reading with O_RDONLY|O_NONBLOCK and waits for data via the same poll() call.
5. Terminal Emulator (Ink probe responder)
Ink sends DEC terminal queries at startup and hangs if unanswered. The emulator scans raw bytes for known probe patterns:
| Probe bytes | Response bytes | Notes |
|---|---|---|
| ESC [ c or ESC [ 0 c | ESC [ ? 6 c | DA1 |
| ESC [ > c or ESC [ > 0 c | ESC [ > 0 ; 0 ; 0 c | DA2 |
| ESC [ 6 n | ESC [ 1 ; 1 R | DSR cursor position |
| ESC [ > q or ESC [ > 0 q | `\x1bP>\|claude-print\x1b\\` | XTVERSION |
| ESC [ 1 8 t | ESC [ 8 ; <rows> ; <cols> t | Window size |
Version-resilience rule: Unknown escape sequences (ESC [ ... <letter> not in the table above) are silently discarded — never treated as an error. If Ink adds new probe types in future versions, they are ignored and the session proceeds via the startup sequencer timeout.
Each probe type is acknowledged at most once per session (dedup bitmask).
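A byte-by-byte scanner with a dedup bitmask, as described in this section, might look like the sketch below. It is an illustration under stated assumptions: the type and constant names (`ProbeScanner`, `DA1`, and so on) are hypothetical, only plain CSI sequences are tracked, and parser state is kept across `feed` calls so a probe split over two reads is still matched.

```rust
// One bit per probe type for the once-per-session dedup bitmask.
const DA1: u8 = 1 << 0;
const DA2: u8 = 1 << 1;
const DSR: u8 = 1 << 2;
const XTVERSION: u8 = 1 << 3;
const WINSIZE: u8 = 1 << 4;

#[derive(Clone, Copy)]
enum State { Ground, Esc, Csi }

struct ProbeScanner {
    state: State,
    params: Vec<u8>, // bytes seen between "ESC [" and the final byte
    answered: u8,    // dedup bitmask
    rows: u16,
    cols: u16,
}

impl ProbeScanner {
    fn new(rows: u16, cols: u16) -> Self {
        Self { state: State::Ground, params: Vec::new(), answered: 0, rows, cols }
    }

    /// Feeds one chunk of PTY output; returns the responses to queue.
    fn feed(&mut self, chunk: &[u8]) -> Vec<Vec<u8>> {
        let mut out = Vec::new();
        for &b in chunk {
            match self.state {
                State::Ground => {
                    if b == 0x1b { self.state = State::Esc; }
                }
                State::Esc => {
                    if b == b'[' {
                        self.params.clear();
                        self.state = State::Csi;
                    } else if b != 0x1b {
                        self.state = State::Ground; // not a CSI sequence
                    }
                }
                State::Csi => {
                    if (0x40..=0x7e).contains(&b) {
                        // Final byte: answer known probes, discard the rest.
                        if let Some(resp) = self.respond(b) { out.push(resp); }
                        self.state = State::Ground;
                    } else if b == 0x1b {
                        self.state = State::Esc; // sequence aborted by a new ESC
                    } else if self.params.len() < 16 {
                        self.params.push(b);
                    }
                }
            }
        }
        out
    }

    fn respond(&mut self, final_byte: u8) -> Option<Vec<u8>> {
        let (bit, response): (u8, Vec<u8>) = match (self.params.as_slice(), final_byte) {
            (b"" | b"0", b'c') => (DA1, b"\x1b[?6c".to_vec()),
            (b">" | b">0", b'c') => (DA2, b"\x1b[>0;0;0c".to_vec()),
            (b"6", b'n') => (DSR, b"\x1b[1;1R".to_vec()),
            (b">" | b">0", b'q') => (XTVERSION, b"\x1bP>|claude-print\x1b\\".to_vec()),
            (b"18", b't') => (
                WINSIZE,
                format!("\x1b[8;{};{}t", self.rows, self.cols).into_bytes(),
            ),
            _ => return None, // unknown sequence: silently discarded
        };
        if self.answered & bit != 0 {
            return None; // already acknowledged once this session
        }
        self.answered |= bit;
        Some(response)
    }
}
```

Keeping the state machine in a struct rather than matching on each chunk in isolation is what makes the split-chunk case work; the dedup check sits after pattern matching so unknown sequences never consume a bit.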
6. Startup Sequencer
Phase 1 — Trust/welcome dismiss:
The trust dialog asks the user to confirm before allowing tool use. Detection uses keyword scanning, not exact string match, to survive UI text changes across Claude Code versions:
- If any output line contains two or more of: trust, Allow, continue, folder, permission, proceed → send \r immediately
- Fallback: after 0.8 s with no new PTY bytes and ≥ 200 bytes received total → send \r (covers any welcome/confirmation prompt)
- Hard timeout: if the process has been in WAITING state for 45 s and fewer than 200 bytes have been received → exit 2 (binary not found or hung, or partial-output hang)
The idle/byte fallback is a one-shot: once any trigger (keyword or idle) fires and transitions to TRUST_DISMISSED, the fallback timer is deactivated and cannot re-fire.
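The two Phase 1 triggers can be sketched as pure functions. This is a hedged illustration: the function names are hypothetical, and matching keywords case-insensitively by substring is an assumption of this sketch (the rule above lists the keywords with mixed capitalization but does not say how case is handled).

```rust
use std::time::Duration;

/// Keywords from the Phase 1 rule, lowercased for comparison.
/// Case-insensitive substring matching is an assumption of this sketch.
const TRUST_KEYWORDS: [&str; 6] =
    ["trust", "allow", "continue", "folder", "permission", "proceed"];

/// True when a single output line contains at least two distinct keywords.
fn line_triggers_dismiss(line: &str) -> bool {
    let lower = line.to_lowercase();
    TRUST_KEYWORDS.iter().filter(|kw| lower.contains(**kw)).count() >= 2
}

/// Idle fallback: 0.8 s without new PTY bytes and at least 200 bytes received.
fn idle_fallback_ready(idle: Duration, bytes_received: usize) -> bool {
    idle >= Duration::from_millis(800) && bytes_received >= 200
}
```

Because each keyword is counted at most once, a line that repeats a single keyword does not trigger the dismiss on its own.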
Phase 2 — Prompt injection:
- After the Phase 1 CR, wait until the PTY is idle for 2.0 s (REPL re-renders). If the PTY never goes idle for 2.0 s (e.g., claude streams continuous progress output), the wall-clock --timeout is the only exit path. This is expected behavior; the phase has no dedicated sub-timeout. --verbose logs a warning if TRUST_DISMISSED persists > 10 s.
- Send via bracketed paste: \x1b[200~<prompt>\x1b[201~\r
- Bracketed paste treats embedded \n as literals (no premature Enter)
- Prompts > 32 KB: write to $TMPDIR/claude-print-.../prompt.txt; send /read <path>\r. (/read is a built-in slash command, not an MCP tool. Prompt file written as UTF-8 with no BOM. After sending /read <path>\r, the startup sequencer re-enters the idle-wait loop (same as after trust dismiss, 2.0s idle threshold). Claude Code reads the file contents and begins processing; no system acknowledgment is emitted before the response. The response extraction path is identical to inline injection: Stop hook fires after the response, transcript JSONL is read normally. See EC-5 for sandboxing note.)
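The byte framing for Phase 2 can be sketched as a single function. The names `injection_bytes` and `INLINE_LIMIT` are hypothetical, reading "32 KB" as 32 × 1024 bytes is an assumption, and the sketch assumes the caller has already written the prompt to the relay file when the large-prompt branch is taken.

```rust
use std::path::Path;

/// Inline threshold. Reading "32 KB" as 32 * 1024 bytes is an assumption.
const INLINE_LIMIT: usize = 32 * 1024;

/// Bytes to write to the PTY master for prompt injection.
fn injection_bytes(prompt: &str, relay_file: &Path) -> Vec<u8> {
    if prompt.len() > INLINE_LIMIT {
        // Large prompt: relay through the file the caller already wrote.
        format!("/read {}\r", relay_file.display()).into_bytes()
    } else {
        // Bracketed paste: embedded newlines arrive as literal text,
        // and the trailing CR submits the prompt.
        let mut out = Vec::with_capacity(prompt.len() + 13);
        out.extend_from_slice(b"\x1b[200~");
        out.extend_from_slice(prompt.as_bytes());
        out.extend_from_slice(b"\x1b[201~\r");
        out
    }
}
```

The sketch does not guard against a prompt that itself contains the paste-end marker `\x1b[201~`; whether that needs handling is left open here.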
7. Stop Poller
Assumption: Stop fires once per session, not once per turn. This matches observed claude -p behavior for single-turn sessions. Verify for multi-turn --max-turns > 1 sessions during OQ-1 verification.
Reads from stop.fifo (non-blocking open; polled via the main poll() loop). On data available:
- Read one line → parse JSON with lenient schema (all fields Option<T>)
- Extract session_id and transcript_path (either direct or derived from session_id + cwd). If both transcript_path and cwd are absent from the Stop payload: skip path derivation entirely; proceed directly to the retry loop using last_assistant_message as the only fallback. If last_assistant_message is also absent: emit is_error=true, exit 1.
- Signal the event loop to exit
- Send /exit\r to the PTY child. (Bracketed paste is not used here: at this point the REPL has returned to idle after completing the response, so a plain CR-terminated command is accepted. /exit is a Claude Code built-in slash command that initiates graceful shutdown.) After sending /exit\r, wait up to 5s for the child to exit, detected by polling master_fd with a 5-second deadline: when EIO is returned, the child process has exited. waitpid(WNOHANG) MAY be used as a supplementary check on each poll iteration. No SIGCHLD handler is required for this path. If the child has not exited after 5s, proceed directly to SIGTERM → 2s → SIGKILL cleanup.
If Stop never fires within --timeout seconds: emit timeout result, SIGTERM child, exit 124.
8. Transcript Reader
On Stop receipt:
1. Open transcript_path (derived if not in payload)
Path derivation algorithm (observed from Claude Code v2.x): strip the leading `/` from
`cwd`, replace all remaining `/` characters with `-`.
Example: `/home/coding/myproject` → `home-coding-myproject`.
This algorithm can produce ambiguous slugs for paths where directory names contain hyphens
(e.g., `/home/user/a-b` and `/home/user-a/b` both produce `home-user-a-b`). In practice,
`session_id` uniquely identifies the JSONL file within the directory, so slug ambiguity only
causes a problem if the slug-derived *directory* is wrong. If path derivation fails (directory
not found), fall back to `last_assistant_message`.
Add a unit test in `tests/transcript.rs` asserting this mapping for 3–4 representative
cwd values (e.g. `/home/coding/myproject`, `/root/foo/bar`, `/home/user/a-b` [note: same
slug as `/home/user-a/b` — ambiguity documented above], `/tmp/x`).
2. Scan for unique API turns (usage-fingerprint dedup)
3. Collect final turn's text blocks
4. Sum token counts across all unique turns
5. Retry loop if final_text is empty (race window): 40 × 50 ms
6. Fallback to last_assistant_message from Stop payload if retries exhausted
7. If both empty: is_error=true, exit 1
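The slug rule from step 1 (strip the leading `/`, replace the remaining `/` with `-`) is small enough to sketch directly. The function name `slug_from_cwd` is hypothetical; the mapping itself is the one stated above, including the documented ambiguity.

```rust
/// Derives the project-directory slug from `cwd`: strip the leading '/',
/// then replace every remaining '/' with '-'.
fn slug_from_cwd(cwd: &str) -> String {
    cwd.strip_prefix('/').unwrap_or(cwd).replace('/', "-")
}
```

Both `/home/user/a-b` and `/home/user-a/b` map to `home-user-a-b`, which is why the fallback to last_assistant_message is needed when the derived directory is not found.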
Token aggregation (usage dedup):
Multiple consecutive assistant events sharing the same API call carry identical message.usage objects (streaming chunks). Use two complementary dedup strategies, with message.id as the primary key:
```rust
use std::collections::HashSet;

let mut seen_ids: HashSet<String> = HashSet::new();
let mut prev_usage_key: Option<UsageKey> = None;
let mut turns: Vec<Usage> = vec![];

for event in parse_events(path) {
    if let Event::Assistant { message } = event {
        // Primary dedup: message.id (each API call has a unique id)
        let is_new_turn = if let Some(id) = &message.id {
            seen_ids.insert(id.clone()) // returns true if newly inserted
        } else {
            // Fallback for versions that omit message.id: usage-fingerprint dedup
            let key = UsageKey::from(&message.usage);
            let new = Some(&key) != prev_usage_key.as_ref();
            prev_usage_key = Some(key);
            new
        };
        if is_new_turn {
            turns.push(message.usage.clone());
        }
        // accumulate text blocks from current chunk regardless
    }
}
```
message.id is present in observed transcripts. Usage-fingerprint fallback handles older Claude Code versions that may not include it.
Known limitation of fingerprint fallback: Two consecutive turns with identical (input_tokens, output_tokens, cache_creation_input_tokens, cache_read_input_tokens) are incorrectly collapsed into one turn. This is a known false-negative. message.id is the required path in production — fingerprint fallback is only for Claude Code versions that omit message.id, which is not observed in any current version. If fingerprint dedup is triggered and produces wrong results, the indication is a lower-than-expected num_turns count in the JSON output.
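The limitation can be reproduced with a self-contained reduction of the dedup logic. This is a sketch only: `AssistantEvent`, `UsageKey`, and `count_turns` are simplified stand-ins defined here for illustration, not the types used in transcript.rs.

```rust
use std::collections::HashSet;

/// Simplified usage fingerprint (hypothetical stand-in for illustration).
#[derive(Clone, Copy, PartialEq, Eq)]
struct UsageKey {
    input: u64,
    output: u64,
    cache_creation: u64,
    cache_read: u64,
}

/// Simplified assistant event: optional message id plus its usage.
struct AssistantEvent {
    id: Option<&'static str>,
    usage: UsageKey,
}

/// Counts unique turns: message id first, usage fingerprint as fallback.
fn count_turns(events: &[AssistantEvent]) -> usize {
    let mut seen_ids = HashSet::new();
    let mut prev_key: Option<UsageKey> = None;
    let mut turns = 0;
    for ev in events {
        let is_new = match ev.id {
            Some(id) => seen_ids.insert(id), // true if newly inserted
            None => {
                let new = prev_key != Some(ev.usage);
                prev_key = Some(ev.usage);
                new
            }
        };
        if is_new {
            turns += 1;
        }
    }
    turns
}
```

Two events with distinct ids and identical usage count as two turns; the same two events without ids collapse to one, which is the undercount described above.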
Schema tolerance (serde config for all JSONL structs):
```rust
use serde::Deserialize;

#[derive(Deserialize, Default, Clone)]
#[serde(default)] // missing fields → Default::default()
pub struct Usage {
    pub input_tokens: Option<u64>,
    pub output_tokens: Option<u64>,
    pub cache_creation_input_tokens: Option<u64>,
    pub cache_read_input_tokens: Option<u64>,
    // Unknown fields are silently ignored (no deny_unknown_fields)
}

#[derive(Deserialize)]
#[serde(tag = "type", rename_all = "kebab-case")]
pub enum Event {
    Assistant { message: AssistantMessage },
    User { message: UserMessage },
    Result(ResultEvent),
    #[serde(other)] // any unknown type → skip, no error
    Unknown,
}

#[derive(Deserialize)]
// snake_case so that ToolUse matches the "tool_use" block type;
// kebab-case would expect "tool-use" and route tool_use blocks to Unknown
#[serde(tag = "type", rename_all = "snake_case")]
pub enum ContentBlock {
    Text { text: String },
    ToolUse { name: String },
    Thinking { thinking: String },
    #[serde(other)]
    Unknown,
}
```
9. Emitter
text (default): {response_text}\n
json:
```json
{
  "type": "result",
  "subtype": "success",
  "is_error": false,
  "result": "<response text>",
  "session_id": "<uuid>",
  "num_turns": 3,
  "duration_ms": 4200,
  "cost_usd": 0,
  "claude_version": "2.1.168",
  "usage": {
    "input_tokens": 6224,
    "output_tokens": 43079,
    "cache_creation_input_tokens": 107205,
    "cache_read_input_tokens": 4066110
  }
}
```
duration_ms: wall-clock milliseconds from std::time::Instant::now() captured at main() entry to the moment the emitter writes its final output. This includes all overhead AND model latency — it is the total time a caller waited for a response.
stream-json: Spawns a reader thread that tails the transcript JSONL from the byte offset captured at prompt injection time, forwarding each new raw event line to stdout as it is written by Claude Code. After Stop fires, drains remaining lines. Output is raw JSONL (one JSON object per line), compatible with claude -p --output-format stream-json. The reader thread forwards ALL raw JSONL lines (no dedup) — this matches claude -p --output-format stream-json behavior, which also emits one line per chunk. The dedup logic in §8 Transcript Reader applies only to the json and text output formats where a single aggregated response is needed. Callers of stream-json MUST handle duplicate streaming chunks (same message.id, identical usage) as they would with claude -p. On normal completion, the final {"type":"result", "is_error": false, ...} line in the output is Claude Code's own Result event forwarded verbatim; claude-print does NOT synthesize an additional result line on success. claude_version is NOT injected into the forwarded Result event. On error (no Claude Code result), claude-print synthesizes the final result line and injects claude_version.
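One polling step of the tail reader might look like the sketch below. It is an illustration under assumptions: `read_new_lines` is a hypothetical helper, it would be called repeatedly from the reader thread, and holding back a trailing partial line (no newline yet) is a design choice of this sketch so that only whole JSONL lines are forwarded.

```rust
use std::fs::File;
use std::io::{self, Read, Seek, SeekFrom};
use std::path::Path;

/// Reads the complete lines appended to `path` since `offset`.
/// Returns the lines and the offset to resume from. A trailing partial
/// line is left unread so it can be forwarded whole on a later call.
fn read_new_lines(path: &Path, offset: u64) -> io::Result<(Vec<String>, u64)> {
    let mut file = File::open(path)?;
    file.seek(SeekFrom::Start(offset))?;
    let mut buf = Vec::new();
    file.read_to_end(&mut buf)?;

    // Consume only up to and including the last newline.
    let complete = match buf.iter().rposition(|&b| b == b'\n') {
        Some(idx) => idx + 1,
        None => 0,
    };
    let lines = buf[..complete]
        .split(|&b| b == b'\n')
        .filter(|line| !line.is_empty())
        .map(|line| String::from_utf8_lossy(line).into_owned())
        .collect();
    Ok((lines, offset + complete as u64))
}
```

The initial `offset` would be the transcript length captured at prompt injection time; the drain after Stop is the same call repeated until it returns no new lines.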
session_id in output: taken directly from the Stop payload if present. If absent from the payload, derive from the transcript file basename (filename without .jsonl). If neither is available (no transcript), emit null.
Known limitation: cost_usd is always 0. Claude Code does not expose per-session cost data via the transcript JSONL. Callers should not use this field for billing purposes. It is included for wire compatibility with claude -p --output-format json which also emits 0 for this field.
claude_version field (new, not in claude -p wire format): included in json output and in the final error result line of stream-json output. It does not appear in text output (no JSON envelope in text mode). Callers that parse strictly by field name are unaffected by the extra field.
claude_version runtime value: run claude --version (or the binary at --claude-binary) once at process startup, before fork(). Parse the output with the same permissive regex used by --version flag handling. Cache the result and pass it to the emitter. On parse failure, use "unknown".
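Permissive version extraction could be sketched as below. Note the substitution: the plan calls for a permissive regex, while this sketch uses a hand-rolled scan over the standard library only, so it runs without extra crates. The function name and the sample outputs in the usage note are hypothetical.

```rust
/// Returns the first dotted numeric token (at least two components, e.g.
/// "2.1.168") found in `output`, or "unknown" when there is none.
/// Hand-rolled stand-in for the permissive regex named in the plan.
fn parse_claude_version(output: &str) -> String {
    output
        // Break the text at anything that is not a digit or a dot.
        .split(|c: char| !(c.is_ascii_digit() || c == '.'))
        // Drop stray leading/trailing dots such as a sentence-ending period.
        .map(|tok| tok.trim_matches('.'))
        .find(|tok| {
            let parts: Vec<&str> = tok.split('.').collect();
            parts.len() >= 2 && parts.iter().all(|p| !p.is_empty())
        })
        .unwrap_or("unknown")
        .to_string()
}
```

Given a hypothetical output such as `2.1.168 (Claude Code)` the function returns `2.1.168`; text with no dotted number yields `unknown`, matching the parse-failure rule above.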
Error result:
```json
{"type": "result", "subtype": "timeout|interrupted|internal_error|assistant_error",
 "is_error": true, "error_message": "...", "claude_version": "..."}
```
Error output by format:
- text mode: on error, nothing is written to stdout; the error message is written to stderr. Exit code is the signal to callers.
- json mode: the error JSON object is written to stdout (as specified above). Nothing to stderr unless --verbose.
- stream-json mode: if an error occurs after prompt injection, a final JSON error line is emitted to stdout ({"type": "result", "is_error": true, "subtype": "...", "error_message": "...", "claude_version": "..."}); if an error occurs before prompt injection, same as text mode (nothing to stdout, stderr message).
10. NEEDLE Agent Config
claude-print.yaml → ~/.needle/agents/:
name: claude-print
description: Claude Code interactive mode — subscription billing (cc_entrypoint=cli)
agent_cli: claude-print
version_command: "claude-print --version"
input_method:
method: stdin
invoke_template: "cd {workspace} && claude-print --model {model} --max-turns 30 --output-format json --dangerously-skip-permissions --no-inherit-hooks"
timeout_secs: 3600
provider: anthropic
# Note: --max-turns 30 and --no-inherit-hooks are hardcoded in the template above.
# --max-turns 30 takes precedence over config.toml's max_turns setting for NEEDLE-dispatched
# jobs. To change the turn limit for NEEDLE workers, edit the invoke_template directly.
# NEEDLE workers run in isolation mode by default (--no-inherit-hooks is included in the
# template). To enable user hook inheritance for NEEDLE jobs, remove --no-inherit-hooks
# from the invoke_template.
model: claude-sonnet-4-6
output_transform: needle-transform-claude
cost:
type: use_or_lose
needle-transform-claude is the built-in NEEDLE output transform for Claude Code's --output-format json output. It extracts the result field (the assistant's response text) from the JSON object and passes it to the NEEDLE worker as the agent's response. This transform is already defined in NEEDLE's built-in transform registry — no new implementation is required in Phase 9.
With input_method: stdin, NEEDLE pipes the bead prompt text to claude-print's stdin. Since claude-print is invoked non-interactively (its stdin is a pipe, not a TTY), the CLI reads stdin as the prompt source (see §1: "Stdin accepted as prompt when not a TTY and no positional/--input-file given").
11. Install Script
install.sh:
1. Detect arch (uname -m) and select binary from release assets
2. Verify claude is on $PATH
3. If ~/.local/bin/claude-print already exists, move it to ~/.local/bin/claude-print.prev (enables one-step rollback)
4. Install binary to ~/.local/bin/claude-print (mode 755)
5. Install mock_claude to ~/.local/bin/mock_claude (mode 755), unless SKIP_MOCK_CLAUDE=1 is set in the install environment (e.g., for users who prefer not to add test fixtures to their PATH)
6. Install claude-print.yaml to ~/.needle/agents/ (mode 644, skipped if NEEDLE not installed)
7. Run claude-print --check to verify installation (full PTY round-trip self-test using mock_claude; skips PTY round-trip if SKIP_MOCK_CLAUDE=1 was set in step 5)
8. Print claude-print --version for confirmation
Data Models
Stop Hook Payload (received from Claude Code — all fields optional)
```json
{
  "hook_event_name": "Stop",
  "session_id": "abc123",
  "transcript_path": "/home/coding/.claude/projects/.../abc123.jsonl",
  "last_assistant_message": "...",
  "cwd": "/home/coding/..."
}
```
transcript_path absent → derive from session_id + cwd.
last_assistant_message absent → retry loop only (no string fallback).
JSONL Transcript — Full Usage Object (as observed v2.1.168)
```json
{
  "input_tokens": 6178,
  "output_tokens": 295,
  "cache_creation_input_tokens": 825,
  "cache_read_input_tokens": 26442,
  "server_tool_use": {"web_search_requests": 0, "web_fetch_requests": 0},
  "service_tier": "standard",
  "cache_creation": {"ephemeral_5m_input_tokens": 0, "ephemeral_1h_input_tokens": 825},
  "inference_geo": "",
  "iterations": [{"input_tokens": 6178, "output_tokens": 295, ...}],
  "speed": "standard"
}
```
Only input_tokens, output_tokens, cache_creation_input_tokens, cache_read_input_tokens are aggregated. All other fields ignored.
Emitted Result (--output-format json)
```json
{
  "type": "result",
  "subtype": "success",
  "is_error": false,
  "result": "response text",
  "session_id": "abc123",
  "num_turns": 1,
  "duration_ms": 4200,
  "cost_usd": 0,
  "claude_version": "2.1.168",
  "usage": {
    "input_tokens": 1240,
    "output_tokens": 380,
    "cache_creation_input_tokens": 0,
    "cache_read_input_tokens": 900
  }
}
```
duration_ms: wall-clock milliseconds from std::time::Instant::now() captured at main() entry to the moment the emitter writes its final output. This includes all overhead AND model latency — it is the total time a caller waited for a response.
Error Handling
| Condition | Detection | Action | Exit |
|---|---|---|---|
| claude binary not found | PATH lookup fails at startup | emit error | 2 |
| PTY open fails | openpty() returns Err | emit error | 2 |
| Hook installer fails | temp dir / mkfifo / write error | emit error | 2 |
| WAITING state persists for 45 s and bytes_received < 200 | startup timer | kill child, emit error | 2 |
| Child exits before Stop | waitpid returns | emit error with child exit code | 2 |
| Wall-clock timeout | poll timer | SIGTERM child, emit timeout | 124 |
| Stop hook never fires | FIFO timeout | SIGTERM child, emit timeout | 124 |
| SIGINT | signal handler | SIGINT child (per HR-8); set interrupted flag, emit interrupt result | 130 |
| SIGTERM received | signal handler | SIGTERM child, emit interrupt result | 130 |
| Stop payload has no transcript_path and no cwd | payload parse | skip to last_assistant_message fallback; if also absent, emit error | 1 |
| Transcript empty + fallback empty | retry exhausted | emit error | 1 |
| is_error: true in transcript | result event or error block | emit error result | 1 |
| Rate limit / API error | error content in transcript | emit error result | 1 |
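The table above maps naturally onto one enum and one match, which also gives the exit-code tests a single function to assert against. The sketch below is illustrative only; the `Failure` variant names are hypothetical labels for the table rows, not identifiers from error.rs.

```rust
/// One variant per row of the Error Handling table (hypothetical names).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Failure {
    BinaryNotFound,
    PtyOpenFailed,
    HookInstallFailed,
    StartupStalled,
    ChildExitedBeforeStop,
    WallClockTimeout,
    StopNeverFired,
    Sigint,
    Sigterm,
    NoTranscriptSource,
    EmptyResponse,
    TranscriptError,
    ApiError,
}

/// Process exit code for each failure, following the table.
fn exit_code(failure: Failure) -> i32 {
    use Failure::*;
    match failure {
        BinaryNotFound | PtyOpenFailed | HookInstallFailed | StartupStalled
        | ChildExitedBeforeStop => 2,
        WallClockTimeout | StopNeverFired => 124,
        Sigint | Sigterm => 130,
        NoTranscriptSource | EmptyResponse | TranscriptError | ApiError => 1,
    }
}
```

Because the match is exhaustive, adding a new failure variant without assigning it an exit code becomes a compile error rather than a silent default.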
Edge Case Catalog
| # | Edge Case | Resolution |
|---|---|---|
| EC-1 | Two claude-print instances on the same cwd concurrently | Each has its own session_id and JSONL file. FIFO paths are per-pid, so there is no cross-contamination. |
| EC-2 | ~/.claude/projects/ does not exist | The inner claude creates it (standard behavior). If still absent after Stop, path derivation returns an error; fallback to last_assistant_message. |
| EC-3 | FIFO write blocks (Stop fires before read-end is open) | Read-end opened O_NONBLOCK at TRUST_DISMISSED → PROMPT_INJECTED transition, before prompt is injected. Stop cannot fire before prompt is sent. |
| EC-4 | Prompt contains null bytes | Rejected at CLI validation time with exit 2. claude -p itself does not support null bytes. |
| EC-5 | Prompt > 32 KB | Written to $TMPDIR/<session>/prompt.txt; /read <path>\r sent instead. File cleaned up with temp dir. Requires PO-6 to hold. See Startup Sequencer §6 for the full /read relay specification including encoding and response flow. |
| EC-6 | claude --version output format changes | Version parsing uses a permissive regex. If parsing fails, claude_version: "unknown" in output; --version still exits 0. |
| EC-7 | Stop hook fires before trust dismiss (no dialog shown) | EC-11 unsets CLAUDE_CODE_SESSION_ID/CLAUDE_CODE_SESSION_KIND before execvp, which should prevent this in normal operation. If Stop fires before prompt injection despite EC-11, treat it as an error: emit is_error=true and exit 2, rather than silently accepting an empty-prompt response. |
| EC-8 | WAITING state persists for 45 s with fewer than 200 bytes received (covers both the zero-byte case and partial-output hang; detects binary-not-found, hung startup, or process emitting <200 bytes then stalling) | Hard timeout: SIGTERM → 2 s → SIGKILL → waitpid → exit 2. |
| EC-9 | last_assistant_message contains ANSI escape sequences | Strip ANSI before emitting in text and json formats (simple regex on the fallback string only). In stream-json mode, if the last_assistant_message fallback is used (retry loop exhausted), ANSI sequences MUST also be stripped before the synthesized fallback result event is emitted. |
| EC-10 | Truncated final JSONL line | Malformed line skipped by lenient parser. If no complete assistant events remain, retry loop fires. |
| EC-11 | CLAUDE_CODE_SESSION_ID / CLAUDE_CODE_SESSION_KIND inherited from parent | Unset both in child env before execvp to prevent session identity confusion. (See Open Questions #6.) |
| EC-12 | Stdin is a TTY (interactive call with no prompt) | Require a prompt source. If stdin is a TTY and no positional/--input-file given, exit 2 with usage error. Do NOT drop into an interactive session. |
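For EC-9, ANSI stripping of the fallback string could be sketched as below. Note the substitution: the table specifies a simple regex, while this sketch uses a hand-rolled scan so it needs only the standard library. The function name is hypothetical, and the sketch covers CSI sequences and two-byte escapes only; OSC sequences are not handled.

```rust
/// Removes CSI sequences (ESC [ ... final byte) and two-byte escapes from
/// `input`. Hand-rolled stand-in for the simple regex named in EC-9.
fn strip_ansi(input: &str) -> String {
    let mut out = String::with_capacity(input.len());
    let mut chars = input.chars().peekable();
    while let Some(c) = chars.next() {
        if c != '\x1b' {
            out.push(c);
            continue;
        }
        if chars.peek() == Some(&'[') {
            chars.next(); // consume '['
            // Skip parameter and intermediate bytes up to and including
            // the final byte, which lies in 0x40..=0x7e.
            for p in chars.by_ref() {
                if ('\x40'..='\x7e').contains(&p) {
                    break;
                }
            }
        } else {
            // Non-CSI escape: drop ESC and the single byte that follows.
            chars.next();
        }
    }
    out
}
```

Applied to a colored string such as `\x1b[1;31mError\x1b[0m: done`, only the visible text `Error: done` remains; text without escapes passes through unchanged.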
Anti-Patterns
Approaches considered and rejected. Document why so they are not re-proposed.
| Anti-Pattern | Why Rejected |
|---|---|
| Use CLAUDE_CONFIG_DIR to sandbox all claude I/O | Over-engineering: requires credential symlinking, settings duplication, and transcript forwarding. --settings merge achieves the relay hook without redirecting any I/O. |
| Parse Ink probes with regex on raw chunks | Probe bytes can straddle chunk boundaries. A regex on a single chunk misses split sequences. Use a byte-by-byte state machine. |
| Use tokio async runtime for the event loop | Tight poll() on 2–3 fds; no throughput benefit. Adds compile time, binary size, and complexity. |
| Open FIFO read-end after prompt injection | Creates a race: Stop hook may write before the read-end is open, causing hook's cat > fifo to block until timeout. |
| Use last_assistant_message from Stop payload as primary text | May be truncated or differently formatted than transcript content blocks. JSONL transcript is canonical; Stop payload is fallback only. |
| Scrape PTY screen buffer with pyte as primary path | Screen holds only what fits in terminal height. Long responses truncated. JSONL is complete. pyte is last-resort only. |
| One global relay settings.json in ~/.claude/ | Multiple concurrent invocations would race on the same file. Per-run temp dir + per-invocation file avoids all concurrency issues. |
| shell=true for hook.sh | Shell injection risk if temp dir path contains special characters. hook.sh is exec'd directly by Claude Code, not through a shell. |
Invariants
Named invariants that MUST hold on all exit paths. Each is testable.
| # | Invariant | Test |
|---|---|---|
| INV-1 | Temp dir cleaned up on every exit path | After each integration test assert $TMPDIR/claude-print-* is absent |
| INV-2 | Child process always waited on before main() returns | Zombie check in cleanup integration test |
| INV-3 | FIFO read-end opened before prompt injection | --verbose trace: "fifo opened" timestamp precedes "prompt injected" |
| INV-4 | master_fd closed before waitpid | lsof in integration test: no master fd open after child exits |
| INV-5 | No write-opens to ~/.claude/ by the claude-print process itself | strace -e openat shows no writes; verified in hook inheritance tests |
| INV-6 | cc_entrypoint=cli in every generated transcript | AS-4 scenario; run before every release |
| INV-7 | Exit code matches the Error Handling table | Each error condition tested with mock_claude; exit code asserted |
| INV-8 | Reader thread (stream-json) joined before process exit | Join coverage in stream-json integration test |
Proof Obligations
Assumptions that must hold for the design to work. Each has a named recovery if false.
| # | Assumption | If False | Recovery |
|---|---|---|---|
| PO-1 | --settings <file> merges hooks rather than replacing | User hooks silently stop firing | Read ~/.claude/settings.json, merge hook arrays in-process, write combined file to temp dir, pass combined via --settings |
| PO-2 | --setting-sources= (empty) suppresses all standard sources | --no-inherit-hooks still loads user hooks | Try --setting-sources=none; if unsupported, enumerate only relay hook source explicitly |
| PO-3 | login_tty compiles under x86_64-unknown-linux-musl | Phase 2 fails to build | Inline as setsid() + ioctl(slave, TIOCSCTTY, 0) + dup2(slave, 0/1/2) + close(slave), all four of which musl always provides |
| PO-4 | Ink probes are DA1/DA2/DSR/XTVERSION/window-size only | Session hangs on unrecognized probe | Unknown probes ignored; session falls through to idle timeout for trust dismiss. Add new probes to table as discovered. |
| PO-5 | Stop hook fires after final JSONL flush | Transcript empty on first attempt | 40×50 ms retry loop (2 s budget). If Stop fires >2 s ahead of JSONL flush, increase retry budget or fall back to last_assistant_message. |
| PO-6 | /read <path> accepts absolute paths for prompts >32 KB | Large prompt relay fails | Truncate at 32 KB with appended notice [prompt truncated at 32KB]. |
Implementation Phases
Status
| Item | State |
|---|---|
| Phases 1–11 module implementation | COMPLETE: all module-level deliverables committed |
| main() session orchestration | IN PROGRESS (bf-40i) |
| Binary-level E2E tests (AS-1, AS-2, AS-5) | IN PROGRESS (bf-52c) |
| AS-4 billing classification | PENDING manual verification (requires live credentials) |
| CI release binary | PENDING: claude-print-ci WorkflowTemplate synced to ArgoCD; no release tag cut yet (blocked on main() completion) |
Phase ordering is sequential. Each phase MUST NOT begin until the prior phase's completion criterion is met.
Phase 1: Crate Scaffold (~150 LOC)
Entry: None.
- Cargo.toml workspace with pinned deps, src/main.rs, cli.rs (clap), error.rs, config.rs
- --version prints claude-print 0.1.0 (wrapping claude X.Y.Z)
- Add claude-print-ci.yaml stub to jedarden/declarative-config (verify step only; build-musl and github-release steps added in Phase 11)
Complete when: cargo build --target x86_64-unknown-linux-musl succeeds; claude-print --version prints expected format; cargo test --lib passes; claude-print-ci.yaml stub exists in declarative-config and ArgoCD syncs it to argo-workflows-ns-iad-ci.
Phase 2: Hook Installer + PTY Spawner (~200 LOC)
Entry: Phase 1 complete. PO-3 verified (attempt login_tty under musl; if absent, inline implementation ready before starting). PO-1 verified (confirm --settings merges hooks rather than replacing; if false, see PO-1 recovery before writing the hook installer). PO-1 can be verified with a simple test: run claude --settings /tmp/test_settings.json echo test where test_settings.json contains a dummy hook, alongside a user hook in ~/.claude/settings.json, and confirm both fire. OQ-5 (login_tty availability in musl) verified or PO-3 inline fallback ready; OQ-6 (CLAUDE_CODE_SESSION_ID inheritance) resolved.
- hook.rs: temp dir (tempfile::TempDir), write settings.json and hook.sh, mkfifo
- pty.rs: openpty, fork, window-size probe, login_tty, execvp, SIGTERM/SIGKILL/waitpid
- --no-inherit-hooks forwards --setting-sources= to child (unverified per OQ-2)
- Build mock_claude fixture binary (test-fixtures/mock-claude/) as part of the workspace; required for PTY integration tests starting this phase
Complete when: Integration test test_pty_spawns_tty passes (child observes isatty(stdout)=true); temp dir absent after test; --setting-sources= in child argv when --no-inherit-hooks set.
Phase 3: Event Loop (~150 LOC)
Entry: Phase 2 complete.
- event_loop.rs: poll() on master_fd + self_pipe_read (initial 2-fd set); Vec<pollfd> for dynamic stop_fifo registration at PROMPT_INJECTED; read buffer; EIO detection (child exit)
Complete when: test_event_loop_reads_pty_output passes; test_event_loop_detects_child_exit (EIO → exit 2) passes.
Phase 4: Terminal Emulator (~100 LOC)
Entry: Phase 3 complete. PO-4 noted (unknown Ink probes are ignored by design; no pre-phase verification required beyond confirming the design choice is implemented correctly).
- terminal.rs: probe scanner, response table, dedup bitmask, unknown-probe passthrough
Complete when: All terminal unit tests pass (all 5 probes answered, unknown probe ignored, split-chunk probe handled, dedup works).
Phase 5: Startup Sequencer (~120 LOC)
Entry: Phase 4 complete. OQ-3b must be resolved (verify /read accepts absolute paths; if false, commit to PO-6 truncation fallback before implementing the large-prompt relay).
- startup.rs: keyword trust dismiss, idle-gap timing, bracketed paste injection, large-prompt file relay
Complete when: All startup unit tests pass; integration test test_trust_dialog_standard_wording and test_trust_dialog_alternate_wording pass.
Phase 6: Stop Poller (~80 LOC)
Entry: Phase 5 complete. OQ-2 must be resolved (verify --setting-sources= suppresses standard sources; see PO-2 for fallback). OQ-4 (FIFO open race) validated by test.
- Open FIFO read-end O_NONBLOCK, integrate into poll() loop, parse Stop payload, derive transcript path, signal event loop exit
Complete when: Integration test test_stop_hook_fires passes; test_missing_transcript_path_derived passes.
Phase 7: Transcript Reader (~180 LOC)
Entry: Phase 6 complete. PO-5 acknowledged: retry loop (40×50ms) is the mitigation for Stop-before-JSONL races. Verify retry timing is sufficient by running test_transcript_race with MOCK_DELAY_JSONL=100 and confirming exit 0.
- transcript.rs: JSONL parse with lenient serde, message.id dedup + fingerprint fallback, text extraction, retry loop, Stop-payload fallback, path derivation
Complete when: All transcript unit tests pass; test_streaming_dedup_40_retries passes; AS-6 (race scenario) passes.
Phase 8: Emitter (~120 LOC)
Entry: Phase 7 complete.
- emitter.rs: text/json/stream-json, claude_version, error result objects, exit code mapping; stream-json reader thread + mpsc channel
Complete when: All emitter unit tests pass; AS-1 (text), AS-2 (json), stream-json output parses as valid JSONL.
Phase 9: NEEDLE Integration (~50 LOC + config)
Entry: Phase 8 complete.
- claude-print.yaml, install.sh, claude-print-ci WorkflowTemplate in declarative-config
- Implement --check doctor subcommand (openpty probe, mkfifo probe, optional mock_claude PTY round-trip)
Complete when: install.sh is written and syntactically valid (bash -n install.sh passes); manually copying the locally-built binary to ~/.local/bin/claude-print and running claude-print --check succeeds. Full install.sh end-to-end test (downloading from GitHub Release) is reserved for Phase 11. NEEDLE dispatches a test bead using claude-print.yaml; AS-3 passes; README flags table matches claude-print --help output (verified manually).
Phase 10: Tests (~500 LOC)
Entry: Phase 8 complete (can run in parallel with Phase 9).
- Phase 10 completes the test suite by adding any tests not already written as part of Phases 2–9's completion criteria. Each phase's completion criterion already specifies and runs its own targeted integration tests; Phase 10 adds the remaining cross-phase and corner-case tests: the version-resilience suite, hook inheritance suite, all MEDIUM/LOW mock scenarios not covered by earlier phases, and the conformance harness.
Complete when: cargo test passes with zero failures.
Phase 11: CI (~YAML only)
Entry: Phase 10 complete.
- claude-print-ci Argo WorkflowTemplate: fmt + clippy + test + musl release binary + artifact upload. (Note: the claude-print-ci WorkflowTemplate is committed to jedarden/declarative-config and confirmed Synced in ArgoCD. The WorkflowTemplate covers verify + build-musl + github-release steps. No release tag has been cut yet; the install.sh end-to-end download test is blocked on a release binary existing, which requires main() session orchestration to be complete first.)
- CI also builds mock_claude binary (musl) and uploads it as a release artifact alongside claude-print
- Confirm cargo audit runs on every push (either via rust-verify or as an explicit CI step)
- Run install.sh end-to-end download test: download release artifact from GitHub Release URL and verify install.sh exits 0 and claude-print --check passes. (Deferred: blocked on a release binary existing. Will unblock once main() is complete and a release tag is cut.)
Complete when: CI run on main branch produces release binary; last-claude-version.txt artifact present; binary passes claude-print --check (credential-free) via install.sh; install.sh end-to-end download test (deferred from Phase 9) passes; full AS-1 is verified manually before each release tag is pushed.
Testing
Unit Tests (src/ inline + tests/)
Terminal probe responder (tests/terminal.rs):
- DA1 bytes in → ESC[?6c response bytes out
- DA2 bytes in → ESC[>0;0;0c out
- DSR bytes in → ESC[1;1R out
- XTVERSION bytes in → correct DCS string out
- Window-size query → ESC[8;50;220t with actual configured dimensions
- Multiple probes in one chunk → all answered in order
- Probe dedup: send DA1 twice → response emitted only once
- Unknown escape sequence (ESC[99t) → ignored, no response, no panic
- Partial probe at chunk boundary (probe split across two reads) → matched and answered on second read
JSONL parser (tests/transcript.rs):
- Single assistant turn, single text block → correct text
- Multi-block content: text + tool_use + thinking + text → text blocks concatenated, others skipped
- Multi-turn: 3 unique usage keys → 3 unique turns, last turn's text returned
- Streaming duplicate dedup: 5 consecutive events with identical usage → counted as 1 turn
- Token aggregation: 45 unique turns → correct sum across all 4 token fields
- Missing `cache_creation_input_tokens` in usage → defaults to 0, no panic
- `input_tokens: null` in usage → treated as 0
- Unknown event type (`"type": "new-future-event"`) → silently skipped, parse continues
- Unknown content block type (`"type": "image"`) → silently skipped, text blocks still extracted
- Unknown fields in `usage` object → silently ignored, known fields still parsed
- Malformed JSONL line (truncated JSON) → line skipped, subsequent lines parsed
- Empty file → returns empty text, zero token counts (no panic)
Stop hook parser (tests/hook.rs):
- Full payload → all fields extracted
- Missing `transcript_path` → fallback path derived from `session_id` + `cwd`
- Missing `last_assistant_message` → `None` (retry-only fallback)
- Unknown top-level fields in payload → silently ignored
- Malformed JSON → `Err`, triggers exit 2
Emitter (tests/emitter.rs):
- `text`: correct string, trailing newline, no extra whitespace
- `json`: valid JSON, all required fields present, `claude_version` included
- `json`: `usage` fields are integers not strings
- `stream-json`: each line parses as independent JSON object
- Error result: `is_error: true`, correct `subtype` string, non-zero exit
- Zero token counts when fallback path taken: `usage` present with all-zero values
Startup sequencer (tests/startup.rs):
- Trust keywords `trust` + `Allow` in same line → CR sent immediately
- Trust keywords in different lines of same chunk → CR sent
- Alternative wording `continue` + `folder` → CR sent (keyword union logic)
- Arbitrary unknown welcome text (no keywords) → fallback: CR after 0.8 s idle
- WAITING state persists for 45 s with fewer than 200 bytes received → error returned (covers zero-byte case and partial-output hang; if ≥ 200 bytes arrive before 45s, the idle fallback at 0.8s fires first)
- 199 bytes received then idle 0.8 s → no CR yet (minimum 200 bytes enforced)
- 200 bytes received then idle 0.8 s → CR sent
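The keyword and idle-fallback rules above can be expressed as two pure functions, which keeps them easy to unit test. This is a sketch under assumptions: `should_dismiss` is a name used elsewhere in this plan, but its signature, the case-insensitive matching, and the reading of "keyword union" as "either keyword pair fully present" are guesses; `idle_fallback_due` is a hypothetical helper.

```rust
use std::time::Duration;

/// Keyword pairs taken from the test list; either pair triggers a dismiss.
const KEYWORD_PAIRS: [(&str, &str); 2] = [("trust", "allow"), ("continue", "folder")];
/// Minimum startup output before the idle fallback may fire.
const MIN_BYTES: usize = 200;
/// Idle period after which CR is sent when no keywords were seen.
const IDLE_FALLBACK: Duration = Duration::from_millis(800);

/// Returns true when the text (a line or a whole chunk) looks like a trust dialog.
/// Matching is case-insensitive (assumption) and ignores line boundaries, so
/// keywords on different lines of the same chunk still match.
pub fn should_dismiss(text: &str) -> bool {
    let lower = text.to_lowercase();
    KEYWORD_PAIRS
        .iter()
        .any(|&(a, b)| lower.contains(a) && lower.contains(b))
}

/// Returns true when the keyword-free fallback should send CR:
/// at least 200 bytes received and at least 0.8 s of silence.
pub fn idle_fallback_due(bytes_received: usize, idle: Duration) -> bool {
    bytes_received >= MIN_BYTES && idle >= IDLE_FALLBACK
}
```

The 45 s WAITING timeout is deliberately left out of this sketch; it would sit in the caller that tracks elapsed time in the WAITING state.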
CLI (tests/cli.rs):
- Positional prompt → forwarded correctly
- `--input-file` overrides stdin
- Stdin used when not a TTY and no other prompt source
- Conflicting prompt sources → error with clear message
- `--timeout 0` → error (must be positive)
- `--output-format invalid` → error listing valid values
- `--claude-binary /custom/path` → spawns that binary, not PATH lookup
- `--version` output parses as `"claude-print X.Y.Z (wrapping claude A.B.C)"`
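For the `--version` case, the test needs some way to check the output shape. One possible helper is sketched below; `parse_version_line` and `is_dotted_triple` are hypothetical names, and treating each version as exactly three numeric components is an assumption drawn from the `X.Y.Z` / `A.B.C` placeholders.

```rust
/// Parses "claude-print X.Y.Z (wrapping claude A.B.C)" into (own, wrapped).
/// Returns None when the line does not have that exact shape.
pub fn parse_version_line(line: &str) -> Option<(String, String)> {
    let rest = line.trim_end().strip_prefix("claude-print ")?;
    let (own, tail) = rest.split_once(" (wrapping claude ")?;
    let wrapped = tail.strip_suffix(')')?;
    if is_dotted_triple(own) && is_dotted_triple(wrapped) {
        Some((own.to_string(), wrapped.to_string()))
    } else {
        None
    }
}

/// True for strings of the form N.N.N where each N is one or more ASCII digits.
fn is_dotted_triple(v: &str) -> bool {
    let parts: Vec<&str> = v.split('.').collect();
    parts.len() == 3
        && parts
            .iter()
            .all(|p| !p.is_empty() && p.bytes().all(|b| b.is_ascii_digit()))
}
```

A test in `tests/cli.rs` could then assert that the captured stdout parses to `Some(..)` without pinning a specific version.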
Mock PTY Integration Tests (tests/integration/)
All integration tests invoke claude-print --claude-binary <path-to-mock_claude>. The path is resolved in tests/integration/mod.rs using env!("CARGO_MANIFEST_DIR") plus the known target/debug/mock_claude output path from the test-fixtures/mock-claude workspace member. Mock behavior is set via env vars passed to the mock_claude process.
A mock_claude binary (compiled as a test fixture, not a shell script) simulates Claude Code's startup behavior. Built in a separate Cargo workspace member test-fixtures/mock-claude/ so it compiles to a native binary with controlled behavior. Controlled via env vars:
| Env var | Effect |
|---|---|
| `MOCK_TRUST_DIALOG=1` | Emit trust dialog text before REPL |
| `MOCK_TRUST_WORDING=alternate` | Use different trust wording (Continue instead of Allow) |
| `MOCK_OMIT_TRANSCRIPT_PATH=1` | Omit `transcript_path` from Stop payload |
| `MOCK_OMIT_LAST_MESSAGE=1` | Omit `last_assistant_message` from Stop payload |
| `MOCK_DELAY_JSONL=<ms>` | Write final JSONL event after N ms delay (race simulation) |
| `MOCK_UNKNOWN_PROBE=1` | Emit unknown ESC sequence before DA1 |
| `MOCK_UNKNOWN_EVENT_TYPE=1` | Write unknown event type to transcript JSONL |
| `MOCK_UNKNOWN_USAGE_FIELDS=1` | Add extra fields to `usage` object |
| `MOCK_RESPONSE=<text>` | Response text to write into transcript |
| `MOCK_TURNS=<n>` | Number of assistant turns to simulate |
| `MOCK_EXIT_BEFORE_STOP=1` | Exit without firing Stop hook |
| `MOCK_DELAY_STOP=<ms>` | Fire Stop after delay |
| `MOCK_IS_ERROR=1` | Write `is_error: true` to transcript result event |
| `MOCK_STOP_BEFORE_INJECT=1` | Fire Stop hook immediately, before trust dismiss |
| `MOCK_SILENT=1` | Emit no startup output; never fire Stop hook; block indefinitely (used to test timeout paths). |
All env vars listed above are exercised by at least one scenario in the integration test table. MOCK_DELAY_STOP is used in the SIGINT and "Stop hook never fires" scenarios.
Integration test scenarios:
| Scenario | Mock config | Assertion |
|---|---|---|
| Happy path | defaults | exit 0, correct response text, non-zero token counts |
| Trust dialog (standard wording) | `MOCK_TRUST_DIALOG=1` | exit 0 |
| Trust dialog (alternate wording) | `MOCK_TRUST_DIALOG=1 MOCK_TRUST_WORDING=alternate` | exit 0 (resilience) |
| No startup output | `MOCK_SILENT=1` | exit 2 after timeout |
| Child exits before Stop | `MOCK_EXIT_BEFORE_STOP=1` | exit 2 |
| Stop hook never fires | `MOCK_DELAY_STOP=99999` | exit 124 |
| Transcript race | `MOCK_DELAY_JSONL=100` | retry loop fires, exit 0 |
| Missing `transcript_path` | `MOCK_OMIT_TRANSCRIPT_PATH=1` | path derived, exit 0 |
| Missing `last_assistant_message` | `MOCK_OMIT_LAST_MESSAGE=1` | retry-only path, exit 0 |
| Both omitted + delayed JSONL | `MOCK_OMIT_LAST_MESSAGE=1 MOCK_DELAY_JSONL=200` | retries suffice, exit 0 |
| Error in transcript | `MOCK_IS_ERROR=1` | exit 1, `is_error: true` in output |
| SIGINT | `MOCK_DELAY_STOP=5000` + send SIGINT at 1 s | exit 130, child killed |
| Multi-turn | `MOCK_TURNS=3` | last turn text returned, 3 turns in token sum |
| Large prompt (>32KB) | (no mock env var needed; test harness sends a 33 000-byte string as stdin; mock_claude reads stdin verbatim and reflects it in the transcript JSONL) | file relay used, exit 0 |
| Unknown probe emitted | `MOCK_UNKNOWN_PROBE=1` | probe ignored, session completes |
| Unknown event type in JSONL | `MOCK_UNKNOWN_EVENT_TYPE=1` | parse succeeds, text extracted |
| Unknown usage fields | `MOCK_UNKNOWN_USAGE_FIELDS=1` | ignored, token counts correct |
| Custom response text | `MOCK_RESPONSE=hello` | `response` field in json output equals 'hello' |
| `--no-inherit-hooks` | `--no-inherit-hooks` flag set | appropriate `--setting-sources` arg in child argv (either `=` or `=none` per OQ-2 resolution), exit 0 |
| Output format json | defaults | output parses as valid JSON |
| Output format stream-json | defaults | each output line parses as valid JSON |
| Stop fires before PROMPT_INJECTED | `MOCK_STOP_BEFORE_INJECT=1` | exit 2, `is_error: true` in output (EC-7 path) |
Hook Inheritance Tests (tests/hooks.rs)
These tests verify that the relay hook passed via `--settings` merges correctly with user hooks and that `--no-inherit-hooks` suppresses user hooks.
Settings merge (default mode):
- Verify `--settings <temp>/settings.json` is always passed to mock_claude
- Verify the relay hook fires (Stop payload arrives on FIFO)
- With `mock_claude` simulating additional hooks in user settings: both user hook + relay hook fire
- `--settings` flag is present in the child process argv (visible via `/proc/<pid>/cmdline`)
--no-inherit-hooks flag:
- The appropriate `--setting-sources` argument is present in child argv when flag is set: either `--setting-sources=` (empty value, per OQ-2 primary) or `--setting-sources=none` (per PO-2 fallback). The test MUST be parameterized over both valid forms and accept whichever is generated by the current implementation. The specific form used MUST match what was verified in OQ-2 resolution.
- `--setting-sources` is absent from child argv when flag is not set
- Mock that tracks whether a "user hook" fires: with `--no-inherit-hooks`, user hook does not fire; without, it does
Temp dir lifecycle:
- After a successful run, `$TMPDIR` contains no leftover `claude-print-*` directories
- After a panicked/early-exit run (simulated), TempDir drop cleans up
- `hook.sh` and `stop.fifo` paths are within the temp dir (not in user-visible locations)
Hook script correctness:
- `hook.sh` writes exactly the stdin payload to the FIFO (no modification, no extra newline)
- `hook.sh` exits 0 even if FIFO write fails (fire-and-forget)
--verbose trace:
- With `--verbose`, stderr includes: temp dir path, `--settings` path, `--no-inherit-hooks` status
Version-Resilience Test Suite (tests/version_compat.rs)
A dedicated test module that verifies the binary survives schema changes across Claude Code versions. These tests run in CI on every push as part of the standard claude-print-ci WorkflowTemplate.
Schema migration tests (property-based, using serde_json::Value to construct arbitrary payloads):
- Stop payload with 50 unknown extra fields → parsed without error
- Usage object with 20 new numeric fields → all ignored, 4 known fields correct
- Content block with new required field → `#[serde(other)]` catches it as Unknown
- JSONL with events in a new order (e.g., `summary` before `user`) → no assumption on ordering
claude --version compatibility tracker:
use std::process::Command;

#[test]
fn test_claude_version_recorded() {
    let output = Command::new("claude").arg("--version").output().unwrap();
    let version_str = String::from_utf8_lossy(&output.stdout);
    // Verify output is parseable (not checking the specific version)
    assert!(version_str.contains("Claude Code"), "unexpected claude --version format: {}", version_str);
    // Write to test artifact for CI diff tracking (write errors are ignored)
    std::fs::write("target/last-claude-version.txt", version_str.as_bytes()).ok();
}
CI stores last-claude-version.txt as a build artifact. On the next run, if the version changed, a warning is printed and the full integration suite re-runs.
Startup heuristic stability test:
- Generate 20 different trust dialog phrasings (varied keyword combinations)
- For each: verify `should_dismiss(line)` returns true
- Generate 10 non-dialog lines (ANSI art, progress bars, empty lines)
- For each: verify `should_dismiss(line)` returns false
Token count regression test:
- Fixture: `tests/fixtures/transcript_v2.1.168.jsonl` (a real captured transcript)
- Assert: token sum matches hardcoded expected values
- When a new Claude version produces transcripts with a different schema, add a new fixture and assert on the new values. Both old and new fixtures must pass simultaneously (the parser handles both)
Conformance Harness
The test_output_format_wire_compat test verifies claude-print JSON output is structurally identical to claude -p --output-format json. It runs against mock_claude (no credentials needed):
- Run `claude-print --output-format json <prompt>` with `mock_claude`
- Assert all fields present in the `claude -p` wire format are present
- Assert `is_error=false`, `type=result`, `usage` object has all four token fields as integers
- The extra `claude_version` field MUST NOT cause a parse failure in a strict JSON parser (tested with `serde_json` `deny_unknown_fields` on a `claude -p`-shaped struct)
For billing conformance (AS-4, credential-required), the scripts/check-billing.sh script inspects the most recent JSONL and asserts entrypoint: cli. Run before every release.
Definition of Done
A phase or PR is done when ALL of the following hold:
- `cargo fmt --check` passes
- `cargo clippy -- -D warnings` passes
- `cargo test` passes with zero failures (all mocked tests, no credentials needed)
- No `unsafe` blocks added without a comment explaining why
- No new `unwrap()` calls in non-test code
- Integration tests cover the new phase's completion criterion
- INV-1 (temp dir cleanup) verified for any new exit path
All-gates policy: every commit that reaches the CI step MUST pass all gates simultaneously. No "fix tests separately" commits.
End-to-End Tests (credential-required, excluded from CI, run manually)
# Basic
echo "Say hello" | claude-print
claude-print --output-format json "What is 2+2?"
claude-print --output-format stream-json "List 5 animals"
# Tool use
claude-print --allowedTools Bash --dangerously-skip-permissions "Run: echo hello"
# Billing verification
# After running: check transcript entrypoint field
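python3 -c "
import glob, json, os
paths = glob.glob('/home/coding/.claude/projects/**/*.jsonl', recursive=True)
# Pick the most recently modified transcript (a plain sort would order by path name)
for path in sorted(paths, key=os.path.getmtime)[-1:]:
    for line in open(path):
        obj = json.loads(line)
        if ep := obj.get('entrypoint'):
            print('entrypoint:', ep)
            break
"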
# Expected: entrypoint: cli (not sdk-cli)
# NEEDLE integration
needle run --agent claude-print --workspace /home/coding/some-project
Security
Threat Model
| # | Threat | Attacker | Surface | Impact | Mitigation |
|---|---|---|---|---|---|
| T-1 | FIFO hijack | Local user on same machine | `$TMPDIR` world-readable by default | Attacker reads the Stop payload (session_id, prompt text) | Create temp dir with mode 0700 via `tempfile::Builder::new().mode(0o700)`. |
| T-2 | Prompt injection via `--input-file` | Any caller | `--input-file` path argument | Read arbitrary file contents as the prompt | `--input-file` is resolved to an absolute path and size-checked before use. Null bytes rejected. |
| T-3 | Environment variable leakage | None (ambient) | Inherited env of parent process | `CLAUDE_CODE_SESSION_ID` / `CLAUDE_CODE_SESSION_KIND` confuse child session identity | Unset both before `execvp` (EC-11). |
| T-4 | Temp dir path with shell metacharacters | Filesystem | hook.sh path interpolation | Command injection if hook.sh uses shell expansion | `hook.sh` uses `cat > <literal-path>` with the FIFO path embedded at write time, so no variable expansion happens at hook execution time. The FIFO path is written as a shell single-quoted string: `cat > '<path>'`. Single quotes prevent all shell interpretation. If the path contains a single quote character (extremely unlikely in `$TMPDIR` output from `tempfile`), reject it at temp-dir creation time. |
| T-5 | PTY escape sequence injection from response | Malicious assistant response | ANSI sequences in prompt/response | Terminal control of caller's terminal | claude-print does not forward raw PTY output to its stdout. Output is extracted from JSONL as plain text. |
| T-6 | PATH hijack | Local attacker with PATH control | PATH lookup of `claude` binary | Malicious binary intercepts all sessions; billing classification undetectable | Users can set `claude-binary` to an absolute path in `config.toml` as hardening. Signature verification is out of scope for v1.0. |
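The T-4 mitigation amounts to rendering the hook script once, with the FIFO path baked in as a single-quoted literal. A minimal sketch of that rendering step follows; `render_hook_script` is a hypothetical name, and the exact script body (including the trailing `exit 0` for fire-and-forget behavior) is an assumption rather than the project's actual `hook.sh`.

```rust
use std::path::Path;

/// Renders the relay hook script with the FIFO path embedded as a
/// single-quoted shell literal. Paths containing a single quote are
/// rejected, since a quote would terminate the literal.
pub fn render_hook_script(fifo: &Path) -> Result<String, String> {
    let path = fifo
        .to_str()
        .ok_or_else(|| "FIFO path is not valid UTF-8".to_string())?;
    if path.contains('\'') {
        return Err(format!("FIFO path contains a single quote: {path}"));
    }
    // Inside single quotes the shell performs no expansion, so characters
    // such as `$`, backticks, and spaces in the path are inert.
    Ok(format!("#!/bin/sh\ncat > '{path}'\nexit 0\n"))
}
```

Rejecting at render time, before the temp dir is handed to the child, keeps the failure loud and early; a path with `$(...)` in it is accepted because it cannot expand inside the quotes.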
Untrusted Input Policy
- Prompts (positional, stdin, `--input-file`): content is forwarded verbatim to claude via bracketed paste. Null bytes rejected. Size capped at 32KB before file relay.
- Stop hook payload: parsed with lenient serde (`Option<T>` for all fields). Malformed JSON → exit 2. Path values from payload are validated before use as filesystem paths.
- JSONL transcript: parsed with lenient serde. Malformed lines skipped. No eval or dynamic dispatch on transcript content.
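The prompt rules in the first bullet (reject null bytes, switch to file relay above 32KB) can be sketched as a single routing function. The names `route_prompt`, `PromptRoute`, and `PromptError` are hypothetical, and treating the limit as "strictly greater than 32 * 1024 bytes goes to file relay" is an assumption; the plan does not say whether the boundary is inclusive.

```rust
/// Assumed paste limit: prompts larger than this use the file relay.
pub const PASTE_LIMIT_BYTES: usize = 32 * 1024;

#[derive(Debug, PartialEq, Eq)]
pub enum PromptRoute {
    /// Forward verbatim via bracketed paste.
    BracketedPaste,
    /// Write to a file and relay it instead of pasting.
    FileRelay,
}

#[derive(Debug, PartialEq, Eq)]
pub enum PromptError {
    /// The prompt contains a NUL byte at the given offset.
    ContainsNul { offset: usize },
}

/// Validates a prompt and decides how it should be delivered.
pub fn route_prompt(prompt: &[u8]) -> Result<PromptRoute, PromptError> {
    if let Some(offset) = prompt.iter().position(|&b| b == 0) {
        return Err(PromptError::ContainsNul { offset });
    }
    if prompt.len() > PASTE_LIMIT_BYTES {
        Ok(PromptRoute::FileRelay)
    } else {
        Ok(PromptRoute::BracketedPaste)
    }
}
```

Under these assumptions the 33 000-byte prompt from the integration table routes to the file relay, matching the "Large prompt" scenario.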
Supply Chain
- All dependencies pinned in `Cargo.lock`.
- `cargo audit` run in CI on every push.
- The `claude` binary being spawned is resolved from PATH (or `--claude-binary`). `claude-print` does not verify the binary's signature; this is out of scope for v1.0.
Performance
Budgets
| Metric | Target | How Measured |
|---|---|---|
| Startup overhead (invocation → prompt injection) | < 5 s | --verbose trace timestamps |
| Transcript-to-output latency after Stop | < 2 s | Retry loop bound: 40 × 50 ms |
| Binary size (musl static) | < 10 MB | ls -lh target/x86_64-unknown-linux-musl/release/claude-print |
| Memory (RSS at steady state) | < 50 MB | /proc/<pid>/status VmRSS during integration test |
| PTY read-to-write round-trip (probe response) | < 1 ms | Not CI-gated; verified by Ink not hanging |
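The transcript-latency budget follows from the retry bound: 40 attempts spaced 50 ms apart give a worst case of about 2 s. A generic bounded-retry helper illustrating that arithmetic is sketched below; `retry_bounded` is a hypothetical name, and skipping the sleep after the final attempt is a design choice of this sketch, not something the plan specifies.

```rust
use std::thread::sleep;
use std::time::Duration;

/// Calls `try_once` up to `attempts` times, sleeping `delay` between
/// attempts, and returns the first `Some` value. Returns `None` if every
/// attempt fails. Worst-case wait is (attempts - 1) * delay.
pub fn retry_bounded<T>(
    attempts: u32,
    delay: Duration,
    mut try_once: impl FnMut() -> Option<T>,
) -> Option<T> {
    for attempt in 0..attempts {
        if let Some(value) = try_once() {
            return Some(value);
        }
        // No point sleeping after the last attempt.
        if attempt + 1 < attempts {
            sleep(delay);
        }
    }
    None
}
```

With `attempts = 40` and `delay = Duration::from_millis(50)` the helper sleeps at most 39 × 50 ms = 1.95 s, which stays inside the 2 s budget in the table.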
Benchmark Contract
Overhead is measured as wall-clock time from process start to the bracketed paste write timestamp (logged at PROMPT_INJECTED transition in --verbose mode). This excludes model latency, which is outside claude-print's control.
CI-Gated Benchmarks
Binary size is checked in CI: after the musl release build, ls -lh the binary and fail if > 10 MB. No runtime performance benchmarks in CI (they require credentials or complex mock setup). Performance is validated manually against the budgets above before each release.
Scalability Limits
claude-print is designed for at most ~20 concurrent invocations on the same machine (matching NEEDLE fleet size). Each instance holds one PTY fd pair and one temp dir. No per-instance memory scaling concerns. Maximum transcript size: bounded by disk; the reader loads one line at a time, not the whole file.
Operations
Migration Plan
Users currently calling claude -p in scripts, Makefiles, or NEEDLE configs:
- Install `claude-print` via `install.sh`
- Replace `claude -p` with `claude-print` (all other flags identical)
- Replace `claude -p --output-format json` with `claude-print --output-format json` (output is a superset: adds `claude_version` field; strict parsers unaffected if using field-name access)
- NEEDLE: swap agent YAML from `claude-anthropic-sonnet.yaml` to `claude-print.yaml`
No data migration required. Transcripts from before the switch remain in ~/.claude/projects/ and are unaffected.
Backward Compatibility Stance
claude-print follows semver for its own output format:
- Patch (0.1.x): bug fixes; output format unchanged.
- Minor (0.x.0): new optional output fields (additive); new flags. Existing callers unaffected.
- Major (x.0.0): breaking output format change or flag removal. Requires caller update.
The claude_version field is additive (minor) and will not be removed in a major release — it is needed for version-regression debugging.
Rollout / Rollback Criteria
- Promote to stable: AS-1 through AS-6 pass; AS-4 (billing) verified manually; no open P0 bugs.
- Roll back: If AS-4 fails (entrypoint is `sdk-cli`), immediately pull the release from the CI artifact store and revert the install. The previous binary is always preserved as `claude-print.prev` by `install.sh`.
Monitoring and Alerting
claude-print emits no metrics itself. Billing-classification failures are detected by:
- Manually running `scripts/check-billing.sh` after each release (asserts `entrypoint: cli`)
- Reviewing NEEDLE worker session transcripts for unexpected `entrypoint: sdk-cli` lines
No automated alerting in v1.0. If billing classification fails silently in production, it is an incident (see Risk Register R-1).
Doctor Command (--check)
claude-print --check runs a self-test with no credentials needed:
1. Verify `claude` binary found on PATH (or `--claude-binary`)
2. Verify `openpty()` succeeds and returns two valid fds
3. Verify `mkfifo` works in `$TMPDIR`
4. Spawn `mock_claude` (installed alongside the main binary by `install.sh`) and verify a basic PTY round-trip. `mock_claude` is resolved from the same directory as `claude-print` itself, not hardcoded to `~/.local/bin/`. If `claude-print` is at `~/.local/bin/claude-print`, `mock_claude` is expected at `~/.local/bin/mock_claude`. If `mock_claude` is not found at the expected path (e.g., because `SKIP_MOCK_CLAUDE=1` was used during install), step 4 emits a warning `mock_claude not found — skipping PTY round-trip test` and proceeds. The `--check` run exits 0 with steps 1–3 verified.
5. Scan `$TMPDIR` for leftover `claude-print-*` directories older than 1 hour and report them as warnings (does not fail the check). Example message: `WARNING: found orphaned temp dir /tmp/claude-print-12345-abc (1.2h old) — run rm -rf to clean up`.
6. Print `OK` or a specific failure message per step
install.sh runs --check after installation. --check exits 0 on success, 2 on failure.
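The orphan scan described above needs only the standard library. A possible shape is sketched here; `find_orphaned_dirs` is a hypothetical name, using the directory's modification time as its "age" is an assumption, and `now` is passed in as a parameter purely so the function can be tested without waiting an hour.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};
use std::time::{Duration, SystemTime};

/// Returns `claude-print-*` directories under `tmp` whose modification
/// time is more than `min_age` before `now`, sorted by path.
pub fn find_orphaned_dirs(
    tmp: &Path,
    min_age: Duration,
    now: SystemTime,
) -> io::Result<Vec<PathBuf>> {
    let mut orphans = Vec::new();
    for entry in fs::read_dir(tmp)? {
        let entry = entry?;
        if !entry.file_name().to_string_lossy().starts_with("claude-print-") {
            continue;
        }
        let meta = entry.metadata()?;
        if !meta.is_dir() {
            continue;
        }
        // A modification time in the future counts as age zero.
        let age = now
            .duration_since(meta.modified()?)
            .unwrap_or(Duration::ZERO);
        if age > min_age {
            orphans.push(entry.path());
        }
    }
    orphans.sort();
    Ok(orphans)
}
```

Because the scan only reports and never deletes, a failure here can be downgraded to a warning by the caller, consistent with the step not failing the overall check.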
Risk Register
| # | Risk | Likelihood | Impact | Mitigation |
|---|---|---|---|---|
| R-1 | Claude Code update changes `isatty()` detection logic; `cc_entrypoint` silently becomes `sdk-cli` | Low | Critical (billing regression, all sessions misclassified) | AS-4 check before every release; `--verbose` shows PTY slave assigned; `--check` verifies PTY opens |
| R-2 | `--settings` merge behavior changes in a Claude Code update; user hooks stop firing | Medium | Medium (user hooks silently broken) | PO-1 verified before Phase 2; version-compat tests track `claude --version`; CI alert on version change |
| R-3 | Ink adds a new mandatory terminal probe; session hangs indefinitely | Low | High (complete outage for new Claude Code versions) | Unknown probes are ignored; session falls through to idle timeout; MOCK_UNKNOWN_PROBE integration test verifies resilience |
| R-4 | `login_tty` absent in musl-libc | Low | High (binary fails to build) | Inline implementation (PO-3 recovery) is 4 syscalls; verified before Phase 2 |
| R-5 | FIFO race: Stop hook fires before read-end open | Low | Medium (payload lost; exit 2) | FIFO opened before prompt injection (EC-3, INV-3); integration test test_fast_stop_hook validates timing |
| R-6 | JSONL schema changes break transcript parsing | Medium | High (empty response, exit 1 for all sessions) | #[serde(default)] + #[serde(other)] on all structs; property-based schema tests; version-compat fixture suite |
| R-7 | Temp dir cleanup fails on panic; disk fills over time | Low | Low (disk leak, recoverable with `rm -rf /tmp/claude-print-*`) | `tempfile::TempDir` drop on panic; INV-1 integration test; `--check` can scan for orphaned dirs |
ADRs
ADR-001: No CLAUDE_CONFIG_DIR Redirect
Decision: Do not set CLAUDE_CONFIG_DIR in the child environment.
Context: An early design redirected all claude I/O to a per-run sandbox directory using CLAUDE_CONFIG_DIR, then forwarded transcripts to ~/.claude/. This was replaced.
Rationale: The --settings overlay achieves the only goal that required redirection (injecting the relay hook). Redirecting CLAUDE_CONFIG_DIR requires symlinking credentials, duplicating settings, and forwarding transcripts — all complexity with no benefit. Transcripts land in ~/.claude/projects/ natively, which is exactly what we want.
Consequences: Transcripts always land in ~/.claude/projects/. User hooks always fire (unless --no-inherit-hooks). No transcript forwarding logic needed.
ADR-002: Synchronous poll() Over Async Runtime
Decision: Use nix::poll::poll() synchronously; no tokio or async-std.
Context: The event loop monitors at most 3 file descriptors: master_fd (always), self_pipe_read (always), and stop_fifo (added at PROMPT_INJECTED). A reader thread handles stream-json output.
Rationale: Async runtimes add binary size (~2 MB), compile time, and conceptual complexity. The workload is I/O-bound on 2–3 fds with no parallelism benefit. A single poll() call + one reader thread is the simplest correct model.
Consequences: stream-json mode uses std::sync::mpsc. All new I/O (if added in future versions) must be registered with the poll() call or pushed to a thread.
ADR-003: message.id Primary Dedup with Fingerprint Fallback
Decision: Deduplicate streaming JSONL events by message.id (primary) with usage-fingerprint fallback.
Context: Claude Code writes multiple assistant events per API call when streaming. Events from the same call share identical message.usage and the same message.id, which is unique to that API call. Token counts must be summed once per API call, not once per event.
Rationale: message.id is stable across Claude Code versions and is the authoritative dedup key. The fingerprint fallback handles older versions that may omit message.id. Using fingerprint alone risks false dedup if two consecutive API calls have identical usage (unlikely but possible). Using message.id alone risks double-counting on older versions.
Consequences: Both seen_ids: HashSet<String> and prev_usage_key: Option<UsageKey> are maintained. Memory cost is O(unique API calls) per session — negligible.
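The two-level dedup in ADR-003 can be sketched as a small aggregator. The field names `seen_ids` and `prev_usage_key` come from the ADR; everything else is an assumption for illustration: the `Usage`, `AssistantEvent`, and `TokenAggregator` types, using the whole `Usage` value as the fingerprint key, and the four token field names (two are mentioned in this plan; `output_tokens` and `cache_read_input_tokens` are assumed from the usual usage object shape).

```rust
use std::collections::HashSet;

/// Token usage for one API call; doubles as the fingerprint key here.
#[derive(Clone, Copy, Debug, Default, PartialEq, Eq)]
pub struct Usage {
    pub input_tokens: u64,
    pub output_tokens: u64,
    pub cache_creation_input_tokens: u64,
    pub cache_read_input_tokens: u64,
}

/// The parts of an assistant transcript event that matter for dedup.
pub struct AssistantEvent {
    pub message_id: Option<String>,
    pub usage: Usage,
}

#[derive(Default)]
pub struct TokenAggregator {
    seen_ids: HashSet<String>,
    prev_usage_key: Option<Usage>,
    pub turns: u32,
    pub total: Usage,
}

impl TokenAggregator {
    /// Counts the event's usage once per API call.
    pub fn observe(&mut self, ev: &AssistantEvent) {
        let is_new = match &ev.message_id {
            // Primary key: a message id is new only the first time it is seen.
            Some(id) => self.seen_ids.insert(id.clone()),
            // Fallback: consecutive events with identical usage are one call.
            None => self.prev_usage_key != Some(ev.usage),
        };
        self.prev_usage_key = Some(ev.usage);
        if !is_new {
            return;
        }
        self.turns += 1;
        self.total.input_tokens += ev.usage.input_tokens;
        self.total.output_tokens += ev.usage.output_tokens;
        self.total.cache_creation_input_tokens += ev.usage.cache_creation_input_tokens;
        self.total.cache_read_input_tokens += ev.usage.cache_read_input_tokens;
    }
}
```

Note how two different message ids with identical usage are still counted as two calls, which is the false-dedup case the ADR says a fingerprint-only scheme would get wrong.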
ADR-004: NEEDLE Workers Must Use Configured Agent — No Silent Escalation
Decision: A NEEDLE worker dispatching beads from this workspace MUST use its configured agent adapter (e.g., claude-code-glm-47). No strand may silently escalate to a different model (e.g., claude-sonnet) based on bead complexity or adapter availability.
Context: During Phase 6–11 completion, the claude-print-bravo worker's configured adapter (claude-code-glm-47) was not found in the dispatcher at runtime. NEEDLE's resolve_adapter() silently fell back to the claude-sonnet built-in. claude-sonnet uses unbuffer -p claude which allocates a PTY for stdout; needle-transform-claude then receives escape sequences instead of stream-json, causing transform.failed with exit -1. The bead was dispatched at sequence 97882, timed out after exactly 600 s (exit 124), failed to release (bead.release.failed), and was lost when the mend cycle rebuilt the DB. This blocked the main() wiring bead indefinitely.
Rationale: claude-print exists specifically because the PTY-vs-pipe distinction determines billing classification. A worker running in this workspace that silently switches to a PTY-based agent inverts the very invariant the project enforces. Beyond billing, the transform failure silently destroys progress: the bead times out, can't release, and disappears from the DB. Silent degradation is worse than loud failure.
Consequences:
- NEEDLE `resolve_adapter()` must fail loudly if the configured adapter is not found (NEEDLE beads bf-14w, bf-2wi track this fix).
- All implementation beads in this workspace carry `--label atomic` to suppress the mitosis strand's forced-split behavior, which can also destroy beads when combined with release failures.
- When launching workers for this workspace, always verify the agent adapter file is present before dispatch: `ls ~/claude-config/agents/claude-code-glm-47/` or equivalent.
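To make "fail loudly" concrete, the sketch below shows the shape of a lookup that returns an error instead of substituting a built-in adapter. It is purely illustrative: NEEDLE's real `resolve_adapter()` signature, language, and registry type are not shown in this plan, so the `HashMap` registry, the `AdapterNotFound` error, and the command strings are all assumptions.

```rust
use std::collections::HashMap;

/// Error returned when the configured adapter is missing. Listing what
/// is available makes the failure actionable in logs.
#[derive(Debug, PartialEq)]
pub struct AdapterNotFound {
    pub requested: String,
    pub available: Vec<String>,
}

/// Looks up the configured adapter by name. There is deliberately no
/// fallback branch: a missing adapter is an error, never a substitution.
pub fn resolve_adapter<'a>(
    registry: &'a HashMap<String, String>,
    configured: &str,
) -> Result<&'a str, AdapterNotFound> {
    match registry.get(configured) {
        Some(command) => Ok(command.as_str()),
        None => {
            let mut available: Vec<String> = registry.keys().cloned().collect();
            available.sort();
            Err(AdapterNotFound {
                requested: configured.to_string(),
                available,
            })
        }
    }
}
```

With this shape, the incident described in the Context paragraph would surface as an `AdapterNotFound` error at dispatch time rather than as a 600 s timeout followed by a lost bead.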
Open Questions
Unresolved questions are mapped to the phase they block. Each MUST be resolved before that phase begins.
| # | Question | Blocks | Resolution / Fallback |
|---|---|---|---|
| OQ-1 | Does `--settings <file>` merge hooks with `~/.claude/settings.json` or replace them? | Phase 2 | Verify by running claude with `--settings` containing a test hook alongside a real user hook and checking both fire. If merge fails: PO-1 fallback (merge in-process). Also verify hook firing order: confirm user hooks run before or after the relay hook. If relay fires first, confirm this does not cause a read race with user Stop hooks that post-process the JSONL (e.g., ccdash). |
| OQ-2 | Does `--setting-sources=` (empty string) suppress all standard sources? | Phase 6 | Verify by running `claude --setting-sources= --settings <relay-only-file>` and checking user hooks do not fire. If not accepted: try `--setting-sources=none`; if neither works, enumerate relay source explicitly. |
| OQ-3a | Is `/read` a built-in slash command (always available) vs. a tool invocation (requires allowedTools)? | n/a | Resolved. Confirmed built-in slash command; does not require `Read` in `--allowedTools`. |
| OQ-3b | Does `/read` accept absolute paths for prompts >32 KB? | Phase 5 | End-to-end test with a 33 KB prompt file at an absolute path. If not: PO-6 fallback (truncate at 32 KB). |
| OQ-4 | FIFO open race: will O_NONBLOCK open-before-inject reliably prevent timing issues? | Phase 6 | Validated by test_fast_stop_hook integration test (MOCK_DELAY_STOP=0). If race occurs in practice, add a pre-prompt-inject poll() to confirm FIFO open. |
| OQ-5 | Is `login_tty` available in `x86_64-unknown-linux-musl`? | Phase 2 | Attempt compilation before Phase 2 begins. If absent: inline 4-syscall implementation (PO-3 recovery). Resolve before writing Phase 2 code. |
| OQ-6 | Do `CLAUDE_CODE_SESSION_ID` / `CLAUDE_CODE_SESSION_KIND` from a parent session confuse the child? | Phase 2 | Unset both in child env before `execvp` as a precaution. Test by running `claude-print` from inside an active claude session and verifying the child gets its own session identity. |
CI/CD
Overview
claude-print ships as a static musl binary. All CI/CD runs on Argo Workflows in the iad-ci cluster. GitHub Actions are disabled — never re-enable them.
WorkflowTemplate location: jedarden/declarative-config → k8s/iad-ci/argo-workflows/claude-print-ci.yaml
ArgoCD app argo-workflows-ns-iad-ci auto-syncs on push to declarative-config.
WorkflowTemplate: claude-print-ci
Two trigger paths:
- PR / branch push: verify only (fmt + clippy + test); no release.
- Release tag (`v*`): verify, then build musl binary, then create GitHub release.
Template structure (conceptual — final YAML lives in declarative-config):
entrypoint: main
arguments:
  parameters:
    - name: repo       # git.ardenone.com/jedarden/claude-print
    - name: revision   # branch name or tag name
    - name: tag        # set by caller; empty on branch push
steps:
  - [verify]           # rust-verify WorkflowTemplate ref (fmt + clippy + test)
  - [build-musl]       # only if tag is non-empty
  - [github-release]   # only if tag is non-empty
Step: verify
Delegates to the existing rust-verify WorkflowTemplate (fmt + clippy + test). No duplication. If rust-verify is not yet parameterized for arbitrary repos, add a repo parameter — do not inline the verify steps. Note: if rust-verify does not already include cargo audit, add it as an explicit step in claude-print-ci between verify and build-musl. The Phase 11 checklist MUST include cargo audit verification either way.
Step: build-musl
container:
  image: ghcr.io/jedarden/rust-musl-builder:latest  # or equivalent
  command: [sh, -c, "git clone {{inputs.parameters.repo}} /workspace &&
    git -C /workspace checkout {{inputs.parameters.revision}} &&
    cd /workspace &&
    cargo build --release --target x86_64-unknown-linux-musl &&
    mv /workspace/target/x86_64-unknown-linux-musl/release/claude-print /workspace/claude-print-linux-amd64 &&
    mv /workspace/target/x86_64-unknown-linux-musl/release/mock_claude /workspace/mock-claude-linux-amd64"]
  env:
    - name: CARGO_TERM_COLOR
      value: never
outputs:
  artifacts:
    - name: binary
      path: /workspace/claude-print-linux-amd64
    - name: mock-binary
      path: /workspace/mock-claude-linux-amd64
The cargo build step also builds mock_claude from the test-fixtures/mock-claude/ workspace member (it is declared as a workspace member in the root Cargo.toml, so a single cargo build --release compiles both). After the build, both binaries are renamed for upload: claude-print → claude-print-linux-amd64, mock_claude → mock-claude-linux-amd64.
Both binaries MUST be statically linked and self-contained. Verify with file <binary> — must say "statically linked".
Step: github-release
Uses gh release create with the artifacts from build-musl:
gh release create "${TAG}" \
  --repo jedarden/claude-print \
  --title "${TAG}" \
  --notes "Release ${TAG}" \
  claude-print-linux-amd64 \
  mock-claude-linux-amd64
Asset naming convention: claude-print-linux-amd64 and mock-claude-linux-amd64 (no version in filenames — the release tag provides the version). This simplifies install scripts that pin to a known URL pattern.
Release Tag Convention
Tags follow semver: v<MAJOR>.<MINOR>.<PATCH>. Tags are pushed manually (git tag v0.1.0 && git push origin v0.1.0). The workflow is submitted manually or via Argo Events webhook on tag push (out of scope for v1.0; manual workflow submission is sufficient for initial releases).
Submitting CI Manually
kubectl --kubeconfig=/home/coding/.kube/iad-ci.kubeconfig create -f - <<EOF
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: claude-print-ci-manual-
  namespace: argo-workflows
spec:
  workflowTemplateRef:
    name: claude-print-ci
  arguments:
    parameters:
      - name: repo
        value: "git.ardenone.com/jedarden/claude-print"
      - name: revision
        value: main
      - name: tag
        value: ""   # empty = verify only; set to "v0.1.0" for release
EOF
Implementation Placement
- Phase 1: Add `claude-print-ci.yaml` stub to declarative-config (verify step only; no release). Create `jedarden/claude-print` repo on GitHub if not already done.
- Phase 11 (CI): Add build-musl and github-release steps to the template, matching the phase completion criterion in the Implementation Phases section.
- CI also builds `mock_claude` as a musl binary and uploads it as a release artifact alongside `claude-print`.
Documentation
README.md
The repository README targets two audiences: (a) a human who wants to install and use claude-print, and (b) an AI agent that needs to invoke it programmatically.
Required sections (in order):
- One-line description: "Drop-in replacement for `claude -p` that drives the interactive TUI via PTY, preserving subscription billing after the June 15, 2026 Agent SDK split."
- Installation: `curl`-based one-liner pulling the latest GitHub release asset:

      curl -fsSL https://github.com/jedarden/claude-print/releases/latest/download/claude-print-linux-amd64 \
        -o ~/.local/bin/claude-print && chmod +x ~/.local/bin/claude-print

  And the `install.sh` variant (from the repo) for NEEDLE agent YAML setup.
- Requirements: `claude` (Claude Code) must be on PATH; Linux x86-64 only; `TMPDIR` must support `mkfifo`.
- Quick start: three examples:

      # Simple prompt
      echo "What is 2+2?" | claude-print
      # Structured JSON output
      echo "Summarize this" | claude-print --output-format json
      # Streaming (NEEDLE-style)
      echo "Write a Rust function to..." | claude-print --output-format stream-json --max-turns 10

- Output formats: brief prose description of `text`, `json`, `stream-json` with a sample of each.
- All flags: reference the CLI table from §1 of this plan verbatim or as a derived table; keep in sync with `claude-print --help` output.
- Exit codes: table with 0 = success, 1 = assistant error, 2 = internal error, 124 = timeout, 130 = interrupted.
- NEEDLE Integration: one paragraph explaining the YAML agent config + install step. Link to `~/.needle/agents/claude-print.yaml` or include its contents as a code block.
- Self-test: `claude-print --check` and what each check does.
- Troubleshooting: the two most common failure modes:
  - "PTY open failed" → likely in a container without `/dev/ptmx`; run on a real host.
  - "Session never completes" → check `--timeout`; `--verbose` shows state transitions.
README must NOT contain: implementation internals, PTY mechanics, JSONL schema, or billing internals — those live in docs/.
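The exit-code table required in the README (0, 1, 2, 124, 130) could be mirrored in code so that the binary and its documentation share one source of truth. The sketch below is one possible shape, not the plan's actual data model; the type name ExitKind and its method names are hypothetical.

```rust
/// Exit statuses listed in the README exit-code table.
/// Hypothetical enum for illustration; the numeric values come from the plan.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ExitKind {
    Success,
    AssistantError,
    InternalError,
    Timeout,
    Interrupted,
}

impl ExitKind {
    /// Numeric process exit code for this status.
    fn code(self) -> i32 {
        match self {
            ExitKind::Success => 0,
            ExitKind::AssistantError => 1,
            ExitKind::InternalError => 2,
            ExitKind::Timeout => 124,
            ExitKind::Interrupted => 130,
        }
    }

    /// Map a numeric exit code back to a status; `None` for undocumented codes.
    fn from_code(code: i32) -> Option<ExitKind> {
        match code {
            0 => Some(ExitKind::Success),
            1 => Some(ExitKind::AssistantError),
            2 => Some(ExitKind::InternalError),
            124 => Some(ExitKind::Timeout),
            130 => Some(ExitKind::Interrupted),
            _ => None,
        }
    }
}
```

A caller would typically end with std::process::exit(kind.code()), and a wrapper script or agent can use from_code to turn an observed status back into a named outcome.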
AGENTS.md
AGENTS.md lives at the repo root. Its purpose is to give AI agents invoking claude-print everything they need in one file, without requiring the agent to read the full plan.
Required sections (in order):
- Purpose: one paragraph covering what claude-print does, why it exists, and why an agent should prefer it over claude -p.
- Invocation: the canonical single-turn invocation, followed by the equivalent NEEDLE template form for agents running in NEEDLE context.

      echo "<prompt>" | claude-print \
        --model claude-sonnet-4-6 \
        --max-turns 30 \
        --output-format stream-json \
        --dangerously-skip-permissions \
        --no-inherit-hooks

- Input: Prompt is read from stdin. Max ~32 KB before /read fallback kicks in (OQ-3b). Must be plain UTF-8 text; no shell escaping needed when piped.
- Output, for each --output-format:
  - text: the assistant's response, verbatim, on stdout. Nothing else.
  - json: a JSON object on stdout; list every field (see Emitter §9 and Data Models for the full field list).
  - stream-json: a sequence of JSONL lines forwarded verbatim from the Claude Code transcript. On success, the final line is Claude Code's own {"type":"result", "is_error": false, ...} event (forwarded as-is; no claude_version field). On error, the final line is a synthesized result event: {"type":"result", "is_error": true, "subtype": "...", "error_message": "...", "claude_version": "..."}. List the result line fields.
- Exit codes: Same table as README, plus: "On exit ≠ 0, check stderr for a human-readable error message."
- Do not: a short bulleted list of anti-patterns.
  - Do not pass --dangerously-skip-permissions in interactive (human-supervised) contexts.
  - Do not read or parse mid-session JSONL files directly; wait for claude-print to exit.
  - Do not retry on exit 130 (interrupted); investigate the cause.
  - Do not set CLAUDE_CODE_SESSION_ID in the environment before invoking claude-print.
- Self-test: claude-print --check exits 0 if the environment can run it.
- Version compatibility: claude-print captures the output of claude --version at startup; pass --verbose to see it. The claude_version field is present in json output and in the synthesized error result line of stream-json output. In the stream-json success path, the final result line is forwarded verbatim from Claude Code and does not contain claude_version.
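For agents that spawn claude-print from code rather than a shell, the canonical invocation and the CLAUDE_CODE_SESSION_ID rule above can be sketched with std::process::Command. This is an illustration under stated assumptions, not the plan's implementation: the function names build_command and run_claude_print are hypothetical, and the sketch assumes claude-print is on PATH and reads the whole prompt from stdin before producing large amounts of output.

```rust
use std::io::Write;
use std::process::{Command, Output, Stdio};

/// Build the canonical single-turn invocation (flags taken from the plan).
/// Hypothetical helper; the function name is an assumption.
fn build_command() -> Command {
    let mut cmd = Command::new("claude-print");
    cmd.args([
        "--model", "claude-sonnet-4-6",
        "--max-turns", "30",
        "--output-format", "stream-json",
        "--dangerously-skip-permissions",
        "--no-inherit-hooks",
    ])
    // The plan says this variable must not be set when invoking claude-print,
    // so drop it even if the parent process inherited it.
    .env_remove("CLAUDE_CODE_SESSION_ID")
    .stdin(Stdio::piped())
    .stdout(Stdio::piped())
    .stderr(Stdio::piped());
    cmd
}

/// Pipe `prompt` to claude-print on stdin and wait for it to exit.
/// The caller inspects `output.status.code()` and, on a non-zero exit, stderr.
fn run_claude_print(prompt: &str) -> std::io::Result<Output> {
    let mut child = build_command().spawn()?;
    {
        // Write the prompt, then drop the handle so the child sees EOF.
        let mut stdin = child.stdin.take().expect("stdin was configured as piped");
        stdin.write_all(prompt.as_bytes())?;
    }
    // Only read output after the process has exited, in line with the
    // "wait for claude-print to exit" rule.
    child.wait_with_output()
}
```

Using env_remove instead of relying on a clean parent environment makes the rule hold even when the caller itself runs inside a Claude Code session.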
Docs Organization
docs/notes/ hosts short decision notes:
- billing-context.md: why PTY preserves subscription billing (already exists)
- hook-design.md: relay hook mechanics, FIFO protocol, keeper fd pattern
- terminal-probes.md: Ink startup probe table and response bytes
docs/research/ hosts external reference material:
- claude-code-internals.md: Claude Code TUI behavior observations (already exists)
- pty-mechanics.md: PTY system call reference (already exists)
docs/plan/plan.md — the implementation plan (this file).
Implementation Placement
- Phase 1: Stub README.md with description, requirements, and placeholder sections.
- Phase 9 (NEEDLE Integration): Complete README.md (all sections) + write AGENTS.md.
- Phase 9 acceptance criterion: claude-print --help output matches the README flags table exactly. Any divergence is treated as a CI failure, although the check itself is performed manually before release.
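The Phase 9 criterion is checked manually, but the flag-name part of it is easy to automate. The sketch below compares the set of long flags found in two pieces of text, for example the captured --help output and the README flags table. It only covers flag names, not descriptions or defaults, so it approximates "matches exactly"; the function names extract_flags and flag_divergence are hypothetical.

```rust
use std::collections::BTreeSet;

/// Collect every long flag (`--name`, lowercase first letter) found in `text`.
/// Runs of dashes such as Markdown table separators (`|---|`) are ignored.
fn extract_flags(text: &str) -> BTreeSet<String> {
    let bytes = text.as_bytes();
    let mut flags = BTreeSet::new();
    let mut i = 0;
    while i + 2 < bytes.len() {
        let starts_flag = bytes[i] == b'-'
            && bytes[i + 1] == b'-'
            && bytes[i + 2].is_ascii_lowercase()
            // Must not be the tail of a longer dash run or of a word.
            && (i == 0 || !(bytes[i - 1] == b'-' || bytes[i - 1].is_ascii_alphanumeric()));
        if starts_flag {
            let mut end = i + 2;
            while end < bytes.len() && (bytes[end].is_ascii_alphanumeric() || bytes[end] == b'-') {
                end += 1;
            }
            // `i` and `end` sit on ASCII bytes, so slicing is UTF-8 safe.
            flags.insert(text[i..end].trim_end_matches('-').to_string());
            i = end;
        } else {
            i += 1;
        }
    }
    flags
}

/// Return (flags in help but not README, flags in README but not help).
fn flag_divergence(help: &str, readme: &str) -> (Vec<String>, Vec<String>) {
    let in_help = extract_flags(help);
    let in_readme = extract_flags(readme);
    let missing_in_readme = in_help.difference(&in_readme).cloned().collect();
    let missing_in_help = in_readme.difference(&in_help).cloned().collect();
    (missing_in_readme, missing_in_help)
}
```

Both returned lists being empty means the flag names agree; anything else names the flags to reconcile before release.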