claude-print Plan
Overview
Single Rust binary that is a drop-in replacement for claude -p. It drives the Claude Code interactive TUI via PTY, extracts the response via the Stop hook and JSONL transcript, and emits claude -p-compatible output — all while billing against the subscription (cc_entrypoint=cli) rather than the Agent SDK credit pool.
Background
Starting June 15, 2026, Anthropic moves claude -p (headless) usage into a separate monthly credit pool. Only the interactive TUI (cc_entrypoint=cli) continues to draw from the unlimited subscription. claude-print wraps the TUI in a PTY so callers get claude -p wire-compatible output while billing against the subscription.
The billing classification is determined by isatty(stdout) inside the claude binary at startup:
- PTY slave as stdout → isatty() returns true → TUI mode → cc_entrypoint=cli → subscription
- Pipe as stdout → isatty() returns false → print mode → cc_entrypoint=sdk-cli → credit pool
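The classification rule above can be sketched in a few lines of Rust. This is only an illustration of the mapping as this plan describes it: the real decision happens inside the claude binary, and the function names `entrypoint_for` and `current_entrypoint` are hypothetical.

```rust
use std::io::IsTerminal;

/// Entrypoint label implied by the isatty(stdout) rule described above.
/// The two label strings are taken from this plan.
pub fn entrypoint_for(stdout_is_tty: bool) -> &'static str {
    if stdout_is_tty {
        "cli" // TUI mode, billed against the subscription
    } else {
        "sdk-cli" // print mode, billed against the credit pool
    }
}

/// What the current process would observe for its own stdout.
pub fn current_entrypoint() -> &'static str {
    entrypoint_for(std::io::stdout().is_terminal())
}
```

Run under a pipe, `current_entrypoint()` reports the print-mode label; run with a PTY slave as stdout, it reports the TUI label. That difference is what claude-print relies on.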
Delivery
Single statically-linked binary. No Python, no runtime dependencies, no pip packages.
claude-print # the binary
install.sh # copies binary to ~/.local/bin/, installs NEEDLE agent config
Built with:
cargo build --release --target x86_64-unknown-linux-musl # fully static, no libc dep
Distribution: GitHub Release artifact via claude-print-ci Argo WorkflowTemplate (same pattern as NEEDLE, SIGIL, ARMOR).
Architecture
caller
│ prompt (stdin, arg, or --input-file)
▼
claude-print (single Rust binary)
├── CLI parser flags forwarded to claude subprocess (clap)
├── Hook installer per-run temp dir: settings.json + hook.sh + stop.fifo
├── PTY spawner nix::pty::openpty() + fork() + login_tty()
├── Event loop poll() on master_fd; dispatches to:
│ ├── Terminal emu responds to DA1/DA2/DSR/XTVERSION/window-size probes
│ ├── Startup seq phase 1: trust dismiss phase 2: bracketed-paste inject
│ └── FIFO poller blocks on stop.fifo until Stop hook fires
├── Transcript rdr JSONL parse → final text + token counts (retry loop)
├── Emitter text / json / stream-json to stdout
└── Cleanup FIFO, temp dir, master_fd, waitpid
Sandbox Isolation
The inner claude process must not:
- Register itself in the live session registry (~/.claude/sessions/) where ccdash and trail-boss can see it
- Fire the user's global hooks (ccdash session tracking, trail-boss telemetry emitter) on Start/Stop/PermissionRequest
- Pollute ~/.claude/history.jsonl with headless prompts
But its output (transcript JSONL + token counts) must be forwarded to ~/.claude/projects/ so the normal stats pipeline can aggregate usage.
Mechanism: CLAUDE_CONFIG_DIR
Confirmed present in the Claude Code binary. When set, Claude Code uses that directory instead of ~/.claude for all file I/O:
CLAUDE_CONFIG_DIR → sessions/, projects/, history.jsonl, settings.json, stats-cache.json, etc.
claude-print sets CLAUDE_CONFIG_DIR to a subdirectory inside its per-run temp dir before execvp:
$TMPDIR/claude-print-<pid>-<rand>/ ← tempfile::TempDir root
├── claude-home/ ← CLAUDE_CONFIG_DIR value
│ ├── .credentials.json → ~/.claude/.credentials.json (symlink)
│ ├── settings.json ← Stop hook only
│ ├── sessions/ ← subprocess session files (isolated)
│ └── projects/
│ └── <cwd-slug>/
│ └── <session-id>.jsonl ← subprocess transcript
├── hook.sh
└── stop.fifo
The credentials symlink gives the child access to OAuth auth without copying secrets into the temp dir.
What the Inner Process Writes (Sandbox)
| File | Written by child | Disposition after session |
|---|---|---|
| sessions/<pid>.json | Yes | discarded (in temp dir, cleaned up) |
| projects/<slug>/<id>.jsonl | Yes | copied to ~/.claude/projects/<slug>/<id>.jsonl |
| history.jsonl | Yes | discarded (headless prompts not in interactive history) |
| stats-cache.json | Yes | discarded (rebuilt from projects/) |
Transcript Forwarding
After the Stop hook fires and the transcript is read:
- Ensure ~/.claude/projects/<cwd-slug>/ exists (create if absent)
- Copy $CLAUDE_CONFIG_DIR/projects/<cwd-slug>/<session-id>.jsonl to ~/.claude/projects/<cwd-slug>/<session-id>.jsonl
- The stats cache rebuilds naturally on next interactive Claude Code startup; the transcript appears as a normal past session
This makes claude-print sessions visible in /status usage stats, preserves the billing audit trail, and lets the user see past prompts via /resume <session-id>.
Hooks Not Inherited
CLAUDE_CONFIG_DIR/settings.json contains only the per-run Stop hook. The user's ~/.claude/settings.json is not read. Therefore:
- ccdash session tracking does not fire
- trail-boss does not receive these session events
- No PermissionRequest hook fires (the REPL trust dialog is dismissed via PTY instead)
Crate Dependencies
| Crate | Purpose |
|---|---|
| clap (derive) | CLI argument parsing |
| nix | openpty, fork, login_tty, setsid, ioctl, poll, mkfifo, signal |
| serde + serde_json | JSONL parsing with schema-tolerant deserialization |
| uuid | Generate session IDs (for --session-id pre-assignment) |
| tempfile | Per-run temp directory with guaranteed cleanup |
No async runtime. The PTY event loop uses nix::poll::poll() synchronously. stream-json output uses a separate thread tailing the transcript file.
Components
1. CLI Interface
Drop-in for claude -p:
| Flag | Description |
|---|---|
| prompt (positional) | Prompt string; mutually exclusive with --input-file and stdin |
| --input-file FILE | Read prompt from file |
| --model MODEL | Forwarded to claude (default: claude-sonnet-4-6) |
| --max-turns N | Forwarded to claude (default: 30) |
| --output-format FORMAT | text (default), json, stream-json |
| --allowedTools LIST | Comma-separated, forwarded |
| --disallowedTools LIST | Forwarded |
| --dangerously-skip-permissions | Forwarded |
| --timeout SECS | Wall-clock timeout (default: 3600) |
| --claude-binary PATH | Override claude binary path (default: resolves claude from PATH) |
| --version | Print claude-print <version> (wrapping claude <version>) and exit |
| --verbose | Write timing traces to stderr |
Stdin accepted as prompt when not a TTY and no positional/--input-file given.
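A minimal sketch of the prompt-source rule follows. The type `PromptSource` and the function `resolve_prompt_source` are hypothetical names, and the precedence of a positional prompt over piped stdin is an assumption; the plan only states that --input-file overrides stdin and that conflicting sources are an error.

```rust
use std::path::PathBuf;

/// Where the prompt text comes from (hypothetical type for illustration).
#[derive(Debug, PartialEq, Eq)]
pub enum PromptSource {
    Positional(String),
    InputFile(PathBuf),
    Stdin,
}

/// Pick exactly one prompt source, or explain why none can be chosen.
/// Assumption: an explicit positional prompt or --input-file wins over
/// piped stdin, and stdin is only read when it is not a TTY.
pub fn resolve_prompt_source(
    positional: Option<String>,
    input_file: Option<PathBuf>,
    stdin_is_tty: bool,
) -> Result<PromptSource, String> {
    match (positional, input_file) {
        (Some(_), Some(_)) => {
            Err("positional prompt and --input-file are mutually exclusive".to_string())
        }
        (Some(prompt), None) => Ok(PromptSource::Positional(prompt)),
        (None, Some(path)) => Ok(PromptSource::InputFile(path)),
        (None, None) if !stdin_is_tty => Ok(PromptSource::Stdin),
        (None, None) => Err("no prompt: pass an argument, --input-file, or pipe stdin".to_string()),
    }
}
```

Keeping this decision in one pure function makes the CLI test cases (conflicting sources, stdin fallback) easy to cover without spawning a process.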
Exit codes:
- 0: success
- 1: assistant error (is_error: true in transcript)
- 2: internal error (PTY spawn, hook setup, parse failure)
- 124: timeout exceeded
- 130: interrupted (SIGINT)
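One way to keep exit codes and result subtypes in sync is a single enum. The sketch below assumes a hypothetical `Outcome` type; the numeric codes come from the list above and the subtype strings from the error result format in the Emitter section.

```rust
/// Terminal outcome of one claude-print run (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Outcome {
    Success,
    AssistantError,
    InternalError,
    Timeout,
    Interrupted,
}

impl Outcome {
    /// Process exit code, following the exit code list above.
    pub fn exit_code(self) -> i32 {
        match self {
            Outcome::Success => 0,
            Outcome::AssistantError => 1,
            Outcome::InternalError => 2,
            Outcome::Timeout => 124,
            Outcome::Interrupted => 130,
        }
    }

    /// Value for the "subtype" field of the emitted result object.
    pub fn subtype(self) -> &'static str {
        match self {
            Outcome::Success => "success",
            Outcome::AssistantError => "assistant_error",
            Outcome::InternalError => "internal_error",
            Outcome::Timeout => "timeout",
            Outcome::Interrupted => "interrupted",
        }
    }
}
```

With this shape, the emitter and the final `std::process::exit` call both read from the same value, so the two cannot drift apart.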
2. Hook Installer / Sandbox Builder
Creates $TMPDIR/claude-print-<pid>-<rand>/ via tempfile::Builder with this layout:
<temp>/
├── claude-home/ ← CLAUDE_CONFIG_DIR (set in child env)
│ ├── .credentials.json ← symlink → ~/.claude/.credentials.json
│ └── settings.json ← Stop hook only (no user hooks)
├── hook.sh ← executed by Claude Code on Stop
└── stop.fifo ← POSIX named pipe for hook→parent IPC
claude-home/settings.json — the only settings file the child reads:
{
"hooks": {
"Stop": [{
"hooks": [{"type": "command", "command": "<temp>/hook.sh", "timeout": 10}]
}]
}
}
hook.sh (executed by Claude Code on Stop; receives payload on stdin):
#!/bin/sh
cat > <temp>/stop.fifo
stop.fifo — POSIX named pipe created with nix::unistd::mkfifo().
Child process environment additions:
CLAUDE_CONFIG_DIR=<temp>/claude-home
CLAUDE_CONFIG_DIR is set in the child's env via the fork/exec path — it is not set in the parent process. This ensures the parent's own Claude Code session (if any) is unaffected.
tempfile::TempDir handles cleanup on any drop path (panic, early return, or normal exit). Transcript copying (see Sandbox Isolation §) runs before the temp dir is dropped.
The user's ~/.claude/settings.json is never touched.
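The layout above could be built roughly as follows, using only the standard library. This is a sketch under stated assumptions: `Sandbox` and `build_sandbox` are hypothetical names, the caller supplies the root directory (the plan uses tempfile::TempDir for that), the FIFO itself is not created here (the plan uses mkfifo from nix), and paths are assumed to contain no quotes or backslashes, since no JSON or shell escaping is done.

```rust
use std::fs;
use std::io;
use std::os::unix::fs::{symlink, PermissionsExt};
use std::path::{Path, PathBuf};

/// Paths inside one per-run sandbox (hypothetical struct).
pub struct Sandbox {
    pub claude_home: PathBuf,
    pub hook_script: PathBuf,
    pub fifo_path: PathBuf,
}

/// Create claude-home/, the credentials symlink, settings.json and hook.sh
/// under `root`. The FIFO path is only computed, not created.
pub fn build_sandbox(root: &Path, real_credentials: &Path) -> io::Result<Sandbox> {
    let claude_home = root.join("claude-home");
    fs::create_dir_all(&claude_home)?;

    // Symlink instead of copy, so no secret bytes land in the temp dir.
    symlink(real_credentials, claude_home.join(".credentials.json"))?;

    let hook_script = root.join("hook.sh");
    let fifo_path = root.join("stop.fifo");

    // hook.sh forwards the Stop payload from stdin into the FIFO.
    fs::write(
        &hook_script,
        format!("#!/bin/sh\ncat > '{}'\n", fifo_path.display()),
    )?;
    fs::set_permissions(&hook_script, fs::Permissions::from_mode(0o755))?;

    // settings.json carries only the per-run Stop hook.
    let settings = format!(
        r#"{{"hooks":{{"Stop":[{{"hooks":[{{"type":"command","command":"{}","timeout":10}}]}}]}}}}"#,
        hook_script.display()
    );
    fs::write(claude_home.join("settings.json"), settings)?;

    Ok(Sandbox { claude_home, hook_script, fifo_path })
}
```

The returned `claude_home` is the value to export as CLAUDE_CONFIG_DIR in the child's environment only.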
3. PTY Spawner
use nix::pty::{openpty, OpenptyResult};
use nix::unistd::{fork, ForkResult, login_tty};
let OpenptyResult { master, slave } = openpty(None, None)?;
// Set window size on master before fork
set_winsize(master, rows, cols);
match unsafe { fork()? } {
ForkResult::Child => {
drop(master);
login_tty(slave)?; // setsid + TIOCSCTTY + dup2(slave, 0/1/2)
execvp("claude", &args)?;
unreachable!()
}
ForkResult::Parent { child } => {
drop(slave);
run_event_loop(master, child, ...)
}
}
login_tty(slave) is glibc's login_tty(3): setsid() → TIOCSCTTY → dup2(slave, 0/1/2) → close(slave).
Window size read from /dev/tty via TIOCGWINSZ; falls back to 220 × 50.
Cleanup on any exit path: SIGTERM → 2 s → SIGKILL → waitpid.
4. Event Loop
Single poll() call on two fds, with the poll timeout acting as the timer:
master_fd POLLIN → read PTY output, dispatch to TerminalEmu + StartupSeq
stop_fifo POLLIN → Stop hook fired; read payload, begin transcript extraction
timer (poll timeout) → check wall-clock timeout
TerminalEmu runs on every chunk of PTY output, scanning for escape sequences and queueing responses. Responses written to master_fd on the next writable poll.
StartupSeq tracks phase (Waiting / TrustDismiss / PromptInjected) and transitions based on heuristics (see §5).
FifoPoller opens stop.fifo for reading with O_NONBLOCK and watches it for data via the same poll() call.
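The timer side of the loop can be reduced to computing a poll() timeout from the wall-clock deadline. The helper below is a sketch with a hypothetical name (`poll_timeout_ms`); the cap parameter is an assumption, added so the loop wakes often enough to run the idle-gap checks of the startup sequencer.

```rust
use std::time::{Duration, Instant};

/// Milliseconds to hand to poll() so the loop wakes no later than `deadline`.
/// Returns 0 once the deadline has passed. The result is capped at
/// `max_tick_ms` so periodic idle checks still run.
pub fn poll_timeout_ms(now: Instant, deadline: Instant, max_tick_ms: u64) -> i32 {
    let remaining: Duration = deadline.saturating_duration_since(now);
    remaining
        .as_millis()
        .min(max_tick_ms as u128)
        .min(i32::MAX as u128) as i32
}

/// True once the wall-clock budget (--timeout) is used up.
pub fn timed_out(now: Instant, deadline: Instant) -> bool {
    now >= deadline
}
```

Each loop iteration would call `poll_timeout_ms(Instant::now(), deadline, tick)` before polling and `timed_out` after it returns.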
5. Terminal Emulator (Ink probe responder)
Ink sends DEC terminal queries at startup and hangs if unanswered. The emulator scans raw bytes for known probe patterns:
| Probe bytes | Response bytes | Notes |
|---|---|---|
| ESC [ c or ESC [ 0 c | ESC [ ? 6 c | DA1 |
| ESC [ > c or ESC [ > 0 c | ESC [ > 0 ; 0 ; 0 c | DA2 |
| ESC [ 6 n | ESC [ 1 ; 1 R | DSR cursor position |
| ESC [ > q | ESC P > \| claude-print ESC \ | XTVERSION (DCS string) |
| ESC [ 1 8 t | ESC [ 8 ; <rows> ; <cols> t | Window size |
Version-resilience rule: Unknown escape sequences (ESC [ ... <letter> not in the table above) are silently discarded — never treated as an error. If Ink adds new probe types in future versions, they are ignored and the session proceeds via the startup sequencer timeout.
Each probe type is acknowledged at most once per session (dedup bitmask).
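The probe table, the unknown-sequence rule and the dedup bitmask can be combined in a small scanner. This is a minimal sketch: `ProbeResponder` and `feed` are hypothetical names, and the 64-byte cap on carried-over bytes is an assumption that keeps a malformed sequence from growing the buffer without bound.

```rust
/// Scans PTY output for the probes in the table above and produces replies.
pub struct ProbeResponder {
    rows: u16,
    cols: u16,
    answered: u8,   // dedup bitmask, one bit per probe kind
    carry: Vec<u8>, // unfinished sequence left over from the previous chunk
}

impl ProbeResponder {
    pub fn new(rows: u16, cols: u16) -> Self {
        Self { rows, cols, answered: 0, carry: Vec::new() }
    }

    /// Feed one chunk of PTY output; returns the bytes to write back.
    pub fn feed(&mut self, chunk: &[u8]) -> Vec<u8> {
        let mut buf = std::mem::take(&mut self.carry);
        buf.extend_from_slice(chunk);
        let mut out = Vec::new();
        let mut i = 0;
        while i < buf.len() {
            if buf[i] != 0x1b {
                i += 1;
                continue;
            }
            if i + 1 >= buf.len() {
                break; // lone ESC at the end of the chunk: wait for more bytes
            }
            if buf[i + 1] != b'[' {
                i += 1; // not a CSI sequence, keep scanning
                continue;
            }
            // CSI parameter/intermediate bytes are 0x20..=0x3F.
            let mut j = i + 2;
            while j < buf.len() && (0x20..=0x3f).contains(&buf[j]) {
                j += 1;
            }
            if j >= buf.len() {
                break; // sequence split across reads: carry it over
            }
            let fin = buf[j];
            if (0x40..=0x7e).contains(&fin) {
                self.respond(&buf[i + 2..j], fin, &mut out);
                i = j + 1;
            } else {
                i += 1; // malformed sequence: skip the ESC
            }
        }
        self.carry = buf[i..].to_vec();
        if self.carry.len() > 64 {
            self.carry.clear(); // assumption: no real probe is this long
        }
        out
    }

    fn respond(&mut self, params: &[u8], fin: u8, out: &mut Vec<u8>) {
        let (bit, reply): (u8, Vec<u8>) = match (params, fin) {
            (b"" | b"0", b'c') => (1 << 0, b"\x1b[?6c".to_vec()),
            (b">" | b">0", b'c') => (1 << 1, b"\x1b[>0;0;0c".to_vec()),
            (b"6", b'n') => (1 << 2, b"\x1b[1;1R".to_vec()),
            (b">", b'q') => (1 << 3, b"\x1bP>|claude-print\x1b\\".to_vec()),
            (b"18", b't') => (
                1 << 4,
                format!("\x1b[8;{};{}t", self.rows, self.cols).into_bytes(),
            ),
            _ => return, // unknown sequence: silently discarded
        };
        if self.answered & bit == 0 {
            self.answered |= bit;
            out.extend_from_slice(&reply);
        }
    }
}
```

Because unknown sequences fall through to the `_ => return` arm, a new probe type in a future Ink version produces no reply and no error, matching the version-resilience rule.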
6. Startup Sequencer
Phase 1 — Trust/welcome dismiss:
The trust dialog asks the user to confirm before allowing tool use. Detection uses keyword scanning, not exact string match, to survive UI text changes across Claude Code versions:
- If any output line contains two or more of: trust, Allow, continue, folder, permission, proceed → send \r immediately
- Fallback: after 0.8 s with no new PTY bytes and ≥ 200 bytes received total → send \r (covers any welcome/confirmation prompt)
- Hard timeout 45 s with zero bytes → exit 2 (binary not found or hung)
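The two Phase 1 triggers can be expressed as pure functions, which is also what the startup heuristic tests exercise. The sketch below assumes case-insensitive matching (the keyword list mixes cases, so the intended rule is not certain) and does not strip ANSI escapes from the line; `fallback_due` is a hypothetical helper name.

```rust
use std::time::Duration;

/// Keywords from the Phase 1 rule, lower-cased for comparison.
const TRUST_KEYWORDS: [&str; 6] =
    ["trust", "allow", "continue", "folder", "permission", "proceed"];

/// True when a line contains at least two distinct trust keywords.
/// Assumption: matching is case-insensitive substring matching.
pub fn should_dismiss(line: &str) -> bool {
    let lower = line.to_lowercase();
    TRUST_KEYWORDS
        .iter()
        .filter(|kw| lower.contains(**kw))
        .count()
        >= 2
}

/// Fallback trigger: enough output seen, then 0.8 s of silence.
pub fn fallback_due(total_bytes: usize, idle: Duration) -> bool {
    total_bytes >= 200 && idle >= Duration::from_millis(800)
}
```

A line such as "Do you trust the files in this folder?" matches two keywords and is dismissed at once, while a bare progress line falls through to the idle fallback.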
Phase 2 — Prompt injection:
- After Phase 1 CR, wait until PTY is idle for 2.0 s (REPL re-renders)
- Send via bracketed paste: \x1b[200~<prompt>\x1b[201~\r
- Bracketed paste treats embedded \n as literals (no premature Enter)
- Prompts > 32 KB: write to $TMPDIR/claude-print-.../prompt.txt; send /read <path>\r
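The bytes written to the PTY in Phase 2 can be sketched as below. The function names are hypothetical, and reading "32 KB" as 32 × 1024 bytes is an assumption. Whether /read accepts the relay path is still listed under Open Questions, so the relay helper only shows the keystrokes the plan describes.

```rust
use std::path::Path;

/// Inline prompts above this size go through the file relay.
/// Assumption: "32 KB" in the plan means 32 * 1024 bytes.
const MAX_INLINE_PROMPT: usize = 32 * 1024;

/// Wrap a prompt in bracketed-paste markers and submit it with CR.
/// Embedded newlines stay literal because they arrive inside the paste.
pub fn paste_bytes(prompt: &str) -> Vec<u8> {
    let mut out = Vec::with_capacity(prompt.len() + 13);
    out.extend_from_slice(b"\x1b[200~");
    out.extend_from_slice(prompt.as_bytes());
    out.extend_from_slice(b"\x1b[201~");
    out.push(b'\r');
    out
}

/// True when the prompt is too large to paste inline.
pub fn needs_file_relay(prompt: &str) -> bool {
    prompt.len() > MAX_INLINE_PROMPT
}

/// Keystrokes for the large-prompt path: ask the REPL to read the file.
pub fn relay_keystrokes(prompt_file: &Path) -> Vec<u8> {
    format!("/read {}\r", prompt_file.display()).into_bytes()
}
```

The caller would write the prompt to the relay file first, then send `relay_keystrokes` instead of `paste_bytes`.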
7. Stop Poller
Reads from stop.fifo (non-blocking open; polled via the main poll() loop). On data available:
- Read one line → parse JSON with lenient schema (all fields Option<T>)
- Extract session_id and transcript_path (either direct or derived from session_id + cwd)
- Signal the event loop to exit
- Send \x1b[201~\r/exit\r to PTY child to trigger graceful shutdown
If Stop never fires within --timeout seconds: emit timeout result, SIGTERM child, exit 124.
8. Transcript Reader
On Stop receipt:
1. Open transcript_path (derived if not in payload)
2. Scan for unique API turns (usage-fingerprint dedup)
3. Collect final turn's text blocks
4. Sum token counts across all unique turns
5. Retry loop if final_text is empty (race window): 40 × 50 ms
6. Fallback to last_assistant_message from Stop payload if retries exhausted
7. If both empty: is_error=true, exit 1
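Steps 5 to 7 amount to a bounded retry with a fallback. A minimal sketch, with the hypothetical name `resolve_final_text` and the transcript read abstracted as a closure so the logic can be tested without a file:

```rust
use std::thread::sleep;
use std::time::Duration;

/// Re-read the transcript until it yields non-empty final text, up to
/// `attempts` times with `delay` between tries, then fall back to the
/// Stop payload's last_assistant_message. None means both were empty.
pub fn resolve_final_text<F>(
    attempts: u32,
    delay: Duration,
    mut read_final_text: F,
    last_assistant_message: Option<String>,
) -> Option<String>
where
    F: FnMut() -> String,
{
    for attempt in 0..attempts {
        let text = read_final_text();
        if !text.is_empty() {
            return Some(text);
        }
        // No sleep after the last attempt.
        if attempt + 1 < attempts {
            sleep(delay);
        }
    }
    last_assistant_message.filter(|m| !m.is_empty())
}
```

With the plan's numbers the call would be `resolve_final_text(40, Duration::from_millis(50), ...)`, and a `None` result maps to is_error=true with exit code 1.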
Token aggregation (usage dedup):
Multiple consecutive assistant events share identical message.usage objects (streaming chunks). Count a new turn only when (input_tokens, output_tokens, cache_creation_input_tokens, cache_read_input_tokens) changes:
let mut prev_key: Option<UsageKey> = None;
let mut turns: Vec<Usage> = vec![];
for event in parse_events(path) {
if let Event::Assistant { message } = event {
let key = UsageKey::from(&message.usage);
if Some(&key) != prev_key.as_ref() {
turns.push(message.usage.clone());
prev_key = Some(key);
}
// accumulate text blocks from current chunk
}
}
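For reference, here is a self-contained version of the same dedup that also produces the token sums, using plain integers instead of the serde structs. The struct mirrors the four aggregated fields; `aggregate` is a hypothetical name.

```rust
/// The four aggregated counters, with missing values already mapped to 0.
#[derive(Debug, Clone, Copy, Default, PartialEq, Eq)]
pub struct Usage {
    pub input_tokens: u64,
    pub output_tokens: u64,
    pub cache_creation_input_tokens: u64,
    pub cache_read_input_tokens: u64,
}

/// Collapse runs of consecutive identical usage objects into one turn each,
/// then sum the counters. Returns (number of turns, totals).
pub fn aggregate(events: &[Usage]) -> (usize, Usage) {
    let mut prev: Option<Usage> = None;
    let mut turns = 0;
    let mut total = Usage::default();
    for usage in events {
        if prev.as_ref() == Some(usage) {
            continue; // same fingerprint as the previous event: same turn
        }
        prev = Some(*usage);
        turns += 1;
        total.input_tokens += usage.input_tokens;
        total.output_tokens += usage.output_tokens;
        total.cache_creation_input_tokens += usage.cache_creation_input_tokens;
        total.cache_read_input_tokens += usage.cache_read_input_tokens;
    }
    (turns, total)
}
```

One limitation of fingerprint dedup worth keeping in mind: two genuinely separate turns that happen to report identical usage back to back would be counted once.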
Schema tolerance (serde config for all JSONL structs):
#[derive(Clone, Deserialize, Default)] // Clone: the dedup loop calls message.usage.clone()
#[serde(default)] // missing fields → Default::default()
pub struct Usage {
pub input_tokens: Option<u64>,
pub output_tokens: Option<u64>,
pub cache_creation_input_tokens: Option<u64>,
pub cache_read_input_tokens: Option<u64>,
// Unknown fields are silently ignored (no deny_unknown_fields)
}
#[derive(Deserialize)]
#[serde(tag = "type", rename_all = "kebab-case")]
pub enum Event {
Assistant { message: AssistantMessage },
User { message: UserMessage },
Result(ResultEvent),
#[serde(other)] // any unknown type → skip, no error
Unknown,
}
#[derive(Deserialize)]
// snake_case here so that ToolUse matches the "tool_use" block type
#[serde(tag = "type", rename_all = "snake_case")]
pub enum ContentBlock {
Text { text: String },
ToolUse { name: String },
Thinking { thinking: String },
#[serde(other)]
Unknown,
}
8b. Transcript Forwarding
After extraction completes (regardless of success or failure):
let src = sandbox_claude_home
.join("projects")
.join(&cwd_slug)
.join(format!("{}.jsonl", session_id));
let dst_dir = real_claude_dir.join("projects").join(&cwd_slug);
std::fs::create_dir_all(&dst_dir)?;
let dst = dst_dir.join(format!("{}.jsonl", session_id));
std::fs::copy(&src, &dst)?;
real_claude_dir is $HOME/.claude (not CLAUDE_CONFIG_DIR, which is the sandbox). The copy runs before the TempDir is dropped.
After the copy, the session appears in ~/.claude/projects/ exactly like any other Claude Code session. It is visible in /status usage stats and resumable via claude --resume <session-id>.
If the copy fails (disk full, permissions): log a warning to stderr but do not change the exit code. Response extraction already succeeded; forwarding is best-effort.
9. Emitter
text (default): {response_text}\n
json:
{
"type": "result",
"subtype": "success",
"is_error": false,
"result": "<response text>",
"session_id": "<uuid>",
"num_turns": 3,
"duration_ms": 4200,
"cost_usd": 0,
"claude_version": "2.1.168",
"usage": {
"input_tokens": 6224,
"output_tokens": 43079,
"cache_creation_input_tokens": 107205,
"cache_read_input_tokens": 4066110
}
}
stream-json: Spawns a reader thread that tails the transcript JSONL from prompt_injected_at timestamp, forwarding each new raw event line to stdout as it is written by Claude Code. After Stop fires, drains remaining lines. Output is raw JSONL (one JSON object per line), compatible with claude -p --output-format stream-json.
claude_version field (new, not in claude -p wire format): included in the json result object for version-change debugging. The text format carries only the response text, and stream-json forwards raw transcript lines. Callers that parse strictly by field name are unaffected by the extra field.
Error result:
{"type": "result", "subtype": "timeout|interrupted|internal_error|assistant_error",
"is_error": true, "error_message": "..."}
10. NEEDLE Agent Config
claude-print.yaml → ~/.needle/agents/:
name: claude-print
description: Claude Code interactive mode — subscription billing (cc_entrypoint=cli)
agent_cli: claude-print
version_command: "claude-print --version"
input_method:
method: stdin
invoke_template: "cd {workspace} && claude-print --model {model} --max-turns 30 --dangerously-skip-permissions"
timeout_secs: 3600
provider: anthropic
model: claude-sonnet-4-6
output_transform: needle-transform-claude
cost:
type: use_or_lose
11. Install Script
install.sh:
- Detect arch (uname -m) and select binary from release assets
- Verify claude is on $PATH
- Install binary to ~/.local/bin/claude-print (mode 755)
- Install claude-print.yaml to ~/.needle/agents/ (mode 644, skipped if NEEDLE not installed)
- Run claude-print --version to confirm
- Print detected claude version for version-compat record
Data Models
Stop Hook Payload (received from Claude Code — all fields optional)
{
"hook_event_name": "Stop",
"session_id": "abc123",
"transcript_path": "/home/coding/.claude/projects/.../abc123.jsonl",
"last_assistant_message": "...",
"cwd": "/home/coding/..."
}
transcript_path absent → derive from session_id + cwd.
last_assistant_message absent → retry loop only (no string fallback).
JSONL Transcript — Full Usage Object (as observed v2.1.168)
{
"input_tokens": 6178,
"output_tokens": 295,
"cache_creation_input_tokens": 825,
"cache_read_input_tokens": 26442,
"server_tool_use": {"web_search_requests": 0, "web_fetch_requests": 0},
"service_tier": "standard",
"cache_creation": {"ephemeral_5m_input_tokens": 0, "ephemeral_1h_input_tokens": 825},
"inference_geo": "",
"iterations": [{"input_tokens": 6178, "output_tokens": 295, ...}],
"speed": "standard"
}
Only input_tokens, output_tokens, cache_creation_input_tokens, cache_read_input_tokens are aggregated. All other fields ignored.
Emitted Result (--output-format json)
{
"type": "result",
"subtype": "success",
"is_error": false,
"result": "response text",
"session_id": "abc123",
"num_turns": 1,
"duration_ms": 4200,
"cost_usd": 0,
"claude_version": "2.1.168",
"usage": {
"input_tokens": 1240,
"output_tokens": 380,
"cache_creation_input_tokens": 0,
"cache_read_input_tokens": 900
}
}
Error Handling
| Condition | Detection | Action | Exit |
|---|---|---|---|
| claude binary not found | PATH lookup fails at startup | emit error | 2 |
| Credentials file missing | symlink target absent | emit error | 2 |
| PTY open fails | openpty() returns Err | emit error | 2 |
| Sandbox build fails | temp dir / mkfifo / symlink error | emit error | 2 |
| Transcript copy fails | I/O error on forwarding | warning to stderr, continue | - |
| No PTY output within 45 s | startup timer | kill child, emit error | 2 |
| Child exits before Stop | waitpid returns | emit error with child exit code | 2 |
| Wall-clock timeout | poll timer | SIGTERM child, emit timeout | 124 |
| Stop hook never fires | FIFO timeout | SIGTERM child, emit timeout | 124 |
| SIGINT | signal handler | SIGTERM child, emit interrupt result | 130 |
| Transcript empty + fallback empty | retry exhausted | emit error | 1 |
| is_error: true in transcript | result event or error block | emit error result | 1 |
| Rate limit / API error | error content in transcript | emit error result | 1 |
Implementation Phases
- Phase 1: Crate scaffold. Cargo.toml with pinned deps, src/main.rs with CLI parsing (clap), --version output including detected claude --version
- Phase 2: Sandbox builder + PTY spawner. Temp dir, CLAUDE_CONFIG_DIR subdirectory, credentials symlink, sandboxed settings.json, hook.sh, mkfifo, then nix fork/exec with CLAUDE_CONFIG_DIR in child env, window-size probe, login_tty, SIGTERM/SIGKILL cleanup, waitpid
- Phase 3: Event loop. poll() on master_fd + FIFO fd + timeout; read buffer; EIO detection
- Phase 4: Terminal emulator. Probe scanner, response table, dedup bitmask; unknown-probe passthrough
- Phase 5: Startup sequencer. Keyword-based trust dismiss, idle-gap timing, bracketed paste injection, large-prompt file relay
- Phase 6: Hook installer. tempfile::TempDir, write settings.json and hook.sh, mkfifo, FIFO polling
- Phase 7: Transcript reader. JSONL parse with lenient serde, usage dedup, text extraction, retry loop, Stop-payload fallback, path derivation
- Phase 8: Emitter. text/json/stream-json formats, claude_version field, error result objects, exit code mapping
- Phase 9: NEEDLE integration. claude-print.yaml, install.sh, claude-print-ci WorkflowTemplate in declarative-config
- Phase 10: Tests. Unit + mock PTY + version-resilience (see Testing section)
- Phase 11: CI. claude-print-ci Argo WorkflowTemplate: fmt + clippy + test + release binary
Testing
Unit Tests (src/ inline + tests/)
Terminal probe responder (tests/terminal.rs):
- DA1 bytes in → ESC[?6c response bytes out
- DA2 bytes in → ESC[>0;0;0c out
- DSR bytes in → ESC[1;1R out
- XTVERSION bytes in → correct DCS string out
- Window-size query → ESC[8;50;220t with actual configured dimensions
- Multiple probes in one chunk → all answered in order
- Probe dedup: send DA1 twice → response emitted only once
- Unknown escape sequence (ESC[99t) → ignored, no response, no panic
- Partial probe at chunk boundary (probe split across two reads) → matched and answered on second read
JSONL parser (tests/transcript.rs):
- Single assistant turn, single text block → correct text
- Multi-block content: text + tool_use + thinking + text → text blocks concatenated, others skipped
- Multi-turn: 3 unique usage keys → 3 unique turns, last turn's text returned
- Streaming duplicate dedup: 5 consecutive events with identical usage → counted as 1 turn
- Token aggregation: 45 unique turns → correct sum across all 4 token fields
- Missing cache_creation_input_tokens in usage → defaults to 0, no panic
- input_tokens: null in usage → treated as 0
- Unknown event type ("type": "new-future-event") → silently skipped, parse continues
- Unknown content block type ("type": "image") → silently skipped, text blocks still extracted
- Unknown fields in usage object → silently ignored, known fields still parsed
- Malformed JSONL line (truncated JSON) → line skipped, subsequent lines parsed
- Empty file → returns empty text, zero token counts (no panic)
Stop hook parser (tests/hook.rs):
- Full payload → all fields extracted
- Missing transcript_path → fallback path derived from session_id + cwd
- Missing last_assistant_message → None (retry-only fallback)
- Unknown top-level fields in payload → silently ignored
- Malformed JSON → Err, triggers exit 2
Emitter (tests/emitter.rs):
- text: correct string, trailing newline, no extra whitespace
- json: valid JSON, all required fields present, claude_version included
- json: usage fields are integers not strings
- stream-json: each line parses as independent JSON object
- Error result: is_error: true, correct subtype string, non-zero exit
- Zero token counts when fallback path taken: usage present with all-zero values
Startup sequencer (tests/startup.rs):
- Trust keywords trust + Allow in same line → CR sent immediately
- Trust keywords in different lines of same chunk → CR sent
- Alternative wording continue + folder → CR sent (keyword union logic)
- Arbitrary unknown welcome text (no keywords) → fallback: CR after 0.8 s idle
- No output for 45 s → error returned
- 199 bytes received then idle 0.8 s → no CR yet (minimum 200 bytes enforced)
- 200 bytes received then idle 0.8 s → CR sent
CLI (tests/cli.rs):
- Positional prompt → forwarded correctly
- --input-file overrides stdin
- Stdin used when not a TTY and no other prompt source
- Conflicting prompt sources → error with clear message
- --timeout 0 → error (must be positive)
- --output-format invalid → error listing valid values
- --claude-binary /custom/path → spawns that binary, not PATH lookup
- --version output parses as "claude-print X.Y.Z (wrapping claude A.B.C)"
Mock PTY Integration Tests (tests/integration/)
A mock_claude binary (compiled as a test fixture, not a shell script) simulates Claude Code's startup behavior. Built in a separate Cargo workspace member test-fixtures/mock-claude/ so it compiles to a native binary with controlled behavior. Controlled via env vars:
| Env var | Effect |
|---|---|
| MOCK_TRUST_DIALOG=1 | Emit trust dialog text before REPL |
| MOCK_TRUST_WORDING=alternate | Use different trust wording (Continue instead of Allow) |
| MOCK_OMIT_TRANSCRIPT_PATH=1 | Omit transcript_path from Stop payload |
| MOCK_OMIT_LAST_MESSAGE=1 | Omit last_assistant_message from Stop payload |
| MOCK_DELAY_JSONL=<ms> | Write final JSONL event after N ms delay (race simulation) |
| MOCK_UNKNOWN_PROBE=1 | Emit unknown ESC sequence before DA1 |
| MOCK_UNKNOWN_EVENT_TYPE=1 | Write unknown event type to transcript JSONL |
| MOCK_UNKNOWN_USAGE_FIELDS=1 | Add extra fields to usage object |
| MOCK_RESPONSE=<text> | Response text to write into transcript |
| MOCK_TURNS=<n> | Number of assistant turns to simulate |
| MOCK_EXIT_BEFORE_STOP=1 | Exit without firing Stop hook |
| MOCK_DELAY_STOP=<ms> | Fire Stop after delay |
| MOCK_IS_ERROR=1 | Write is_error: true to transcript result event |
Integration test scenarios:
| Scenario | Mock config | Assertion |
|---|---|---|
| Happy path | defaults | exit 0, correct response text, non-zero token counts |
| Trust dialog (standard wording) | TRUST_DIALOG=1 | exit 0 |
| Trust dialog (alternate wording) | TRUST_DIALOG=1 TRUST_WORDING=alternate | exit 0 (resilience) |
| No startup output | emit nothing | exit 2 after timeout |
| Child exits before Stop | EXIT_BEFORE_STOP=1 | exit 2 |
| Stop hook never fires | DELAY_STOP=99999 | exit 124 |
| Transcript race | DELAY_JSONL=100 | retry loop fires, exit 0 |
| Missing transcript_path | OMIT_TRANSCRIPT_PATH=1 | path derived, exit 0 |
| Missing last_assistant_message | OMIT_LAST_MESSAGE=1 | retry-only path, exit 0 |
| Both omitted + delayed JSONL | OMIT_LAST_MESSAGE=1 DELAY_JSONL=200 | retries suffice, exit 0 |
| Error in transcript | IS_ERROR=1 | exit 1, is_error: true in output |
| SIGINT | DELAY_STOP=5000 + send SIGINT at 1 s | exit 130, child killed |
| Multi-turn | TURNS=3 | last turn text returned, 3 turns in token sum |
| Large prompt (>32KB) | 33000-byte prompt | file relay used, exit 0 |
| Unknown probe emitted | UNKNOWN_PROBE=1 | probe ignored, session completes |
| Unknown event type in JSONL | UNKNOWN_EVENT_TYPE=1 | parse succeeds, text extracted |
| Unknown usage fields | UNKNOWN_USAGE_FIELDS=1 | ignored, token counts correct |
| Output format json | defaults | output parses as valid JSON |
| Output format stream-json | defaults | each output line parses as valid JSON |
Sandbox Isolation Tests (tests/sandbox.rs)
These tests verify that the inner claude process is contained and that transcripts are forwarded correctly to ~/.claude/projects/.
CLAUDE_CONFIG_DIR isolation:
- Spawn mock_claude with a controlled CLAUDE_CONFIG_DIR; verify the child writes its session file inside that dir, not in ~/.claude/sessions/
- Spawn with CLAUDE_CONFIG_DIR set; verify real ~/.claude/sessions/ contains no new entry after the run
- Verify real ~/.claude/settings.json hooks (read the file before and after a mock run) are not modified
Credentials symlink:
- Verify sandbox dir contains .credentials.json as a symlink pointing to real credentials file
- Verify the symlink resolves to the real file (not a copy)
- Run with credentials symlink absent: expect graceful error, not hang
Transcript forwarding:
- After a successful mock run, verify ~/.claude/projects/<cwd-slug>/<session-id>.jsonl was created
- Verify its contents match the sandbox transcript byte-for-byte
- Verify the temp dir is cleaned up after the run (no leftover files in $TMPDIR)
- Run with ~/.claude/projects/ unwritable: verify warning to stderr but exit 0 (forwarding is best-effort)
Hooks not inherited:
- Write a test hook script to a temp file; point real ~/.claude/settings.json at it via CLAUDE_CONFIG_DIR trick inside the test; verify the test hook does NOT fire during a subprocess run (because the subprocess reads only its sandboxed settings.json)
--verbose sandbox trace:
- With --verbose, verify stderr includes lines for: temp dir path, CLAUDE_CONFIG_DIR value, transcript copy src→dst
Version-Resilience Test Suite (tests/version_compat.rs)
A dedicated test module that verifies the binary survives schema changes across Claude Code versions. These tests are run in CI on every push and also on a weekly schedule.
Schema migration tests (property-based, using serde_json::Value to construct arbitrary payloads):
- Stop payload with 50 unknown extra fields → parsed without error
- Usage object with 20 new numeric fields → all ignored, 4 known fields correct
- Content block with new required field → #[serde(other)] catches it as Unknown
- JSONL with events in a new order (e.g., summary before user) → no assumption on ordering
claude --version compatibility tracker:
use std::process::Command;

#[test]
fn test_claude_version_recorded() {
    // Requires claude on PATH; unwrap() fails the test if it cannot be spawned.
    let output = Command::new("claude").arg("--version").output().unwrap();
    let version_str = String::from_utf8_lossy(&output.stdout);
    // Verify output is parseable (not checking the specific version)
    assert!(version_str.contains("Claude Code"), "unexpected claude --version format: {}", version_str);
    // Write to test artifact for CI diff tracking
    std::fs::write("target/last-claude-version.txt", version_str.as_bytes()).ok();
}
CI stores last-claude-version.txt as a build artifact. On the next run, if the version changed, a warning is printed and the full integration suite re-runs.
Startup heuristic stability test:
- Generate 20 different trust dialog phrasings (varied keyword combinations)
- For each: verify should_dismiss(line) returns true
- Generate 10 non-dialog lines (ANSI art, progress bars, empty lines)
- For each: verify should_dismiss(line) returns false
Token count regression test:
- Fixture: tests/fixtures/transcript_v2.1.168.jsonl (a real captured transcript)
- Assert: token sum matches hardcoded expected values
- When a new Claude version produces transcripts with a different schema, add a new fixture and assert on the new values. Both old and new fixtures must pass simultaneously (the parser handles both)
End-to-End Tests (credential-required, excluded from CI, run manually)
# Basic
echo "Say hello" | claude-print
claude-print --output-format json "What is 2+2?"
claude-print --output-format stream-json "List 5 animals"
# Tool use
claude-print --allowedTools Bash --dangerously-skip-permissions "Run: echo hello"
# Billing verification
# After running: check transcript entrypoint field
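python3 -c "
import glob, json, os
paths = glob.glob('/home/coding/.claude/projects/**/*.jsonl', recursive=True)
# Pick the most recently modified transcript, not the lexicographically last path
for path in sorted(paths, key=os.path.getmtime)[-1:]:
    for line in open(path):
        obj = json.loads(line)
        if ep := obj.get('entrypoint'):
            print('entrypoint:', ep)
            break
"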
# Expected: entrypoint: cli (not sdk-cli)
# NEEDLE integration
needle run --agent claude-print --workspace /home/coding/some-project
Open Questions
- --settings merge behavior: Does Claude Code merge multiple --settings files, or does the last one win? If merge, per-run hooks layer cleanly on user hooks. If last-wins, the user's hooks are shadowed. Needs verification; may require reading user settings and merging in-process rather than relying on Claude Code's merge.
- Multiline prompt > 32 KB: Does the /read <path> slash command accept absolute paths? Does it block tool use (--allowedTools)? Needs end-to-end verification.
- FIFO open race: hook.sh opens the FIFO for writing; the parent opens it for reading. Both sides block until the other end connects. The parent must open the read end before the Stop hook fires. If the Stop hook fires before the FIFO read end is open, the write blocks and eventually times out. Mitigation: open the read end before injecting the prompt (before Stop could fire). Verify timing.
- musl vs glibc: openpty and login_tty are glibc extensions. Musl provides openpty in its PTY headers, but login_tty may not be available. May need to inline the login_tty implementation (setsid + TIOCSCTTY ioctl + dup2).
- Credentials lookup with CLAUDE_CONFIG_DIR: Confirmed CLAUDE_CONFIG_DIR overrides all file I/O. The child reads .credentials.json from $CLAUDE_CONFIG_DIR/.credentials.json. Symlink to the real file is the right approach: it avoids copying secrets and stays current if the token is refreshed. Verify the child follows symlinks (it should; it uses normal file open).
- Other CLAUDE_* env vars: The binary reads many env vars. Confirm none of them cause the child to bypass CLAUDE_CONFIG_DIR for session or history I/O. In particular, CLAUDE_CODE_SESSION_ID, CLAUDE_CODE_SESSION_KIND, and CLAUDE_JOB_DIR may need to be unset/overridden in the child env to avoid inheriting the parent session's identity.