commit 134af23bbfd088e5d7d479211aa1660f09608f79 Author: jedarden Date: Sun Jun 7 10:30:31 2026 -0400 Initial scaffold: claude-print PTY wrapper for subscription billing diff --git a/README.md b/README.md new file mode 100644 index 0000000..764397d --- /dev/null +++ b/README.md @@ -0,0 +1,13 @@ +# claude-print + +Drop-in replacement for `claude -p` (print/headless mode) that drives the Claude Code interactive TUI via PTY — preserving subscription billing after the June 15, 2026 Agent SDK credit split. + +## Why this exists + +Starting June 15, 2026, Anthropic separates `claude -p` (headless) into a separate Agent SDK credit pool ($100–$200/month on Max plans). Only the interactive TUI (`cc_entrypoint=cli`) continues drawing from the unlimited subscription. `claude-print` wraps the interactive TUI in a PTY so callers get `claude -p` wire-compatible output while billing against the subscription. + +## Structure + +- `docs/notes/` — design decisions, constraints, integration details +- `docs/research/` — prior art, billing change references, PTY/hook patterns +- `docs/plan/plan.md` — complete implementation plan diff --git a/docs/notes/billing-context.md b/docs/notes/billing-context.md new file mode 100644 index 0000000..6fee9dc --- /dev/null +++ b/docs/notes/billing-context.md @@ -0,0 +1,36 @@ +# Billing Context + +## The June 15, 2026 Split + +Anthropic's billing header field `cc_entrypoint` determines which pool a request draws from: + +- `cc_entrypoint=cli` → interactive TUI → unlimited subscription +- `cc_entrypoint=sdk-cli` → `claude -p` / Agent SDK → monthly credit pool + +Credit pool sizes: Pro $20/mo, Max 5x $100/mo, Max 20x $200/mo. No rollover. + +`claude -p` is currently misclassified as `sdk-cli` even for subscription users (GitHub issue #59105 — acknowledged by Anthropic, not fixed). The June 15 change formalizes this split rather than fixing the classification. + +## Why PTY Preserves `cli` Billing + +Running `claude` under a real PTY (via `forkpty`) produces `cc_entrypoint=cli` because: +1. `claude` detects it has a real TTY on stdout +2. It enters interactive/TUI mode +3. The billing header is set at startup based on the entrypoint mode + +Any wrapper that provides a PTY inherits the `cli` classification. Screen-scraping and hook-based approaches extract the response without changing the billing header. + +## Prior Art Repos + +| Repo | Approach | Billing | +|------|----------|---------| +| `smithersai/claude-p` | Zig + zmux PTY + Stop hook | cli ✓ | +| `hristo2612/jinn` | Node.js + node-pty + hook relay | cli ✓ | +| `halfwhey/claudraband` | Shell + tmux persistent sessions | cli ✓ | +| `npow/claude-relay` | Wraps `claude -p` | sdk-cli ✗ | + +## NEEDLE Integration Context + +The `jedarden/NEEDLE` repo has `plugins/claude-interactive/` — a Python PTY wrapper added 2026-05-16. It uses idle-timeout completion detection and `pyte` screen parsing. `claude-print` is the productionized, standalone version of that plugin with Stop hook completion and real token counting. + +The NEEDLE agent config (`claude-print.yaml`) will replace `claude-anthropic-sonnet.yaml` for workers that should bill against the subscription. diff --git a/docs/plan/plan.md b/docs/plan/plan.md new file mode 100644 index 0000000..bdcd40f --- /dev/null +++ b/docs/plan/plan.md @@ -0,0 +1,148 @@ +# claude-print Plan + +## Overview + +Drop-in replacement for `claude -p` that drives the Claude Code interactive TUI via PTY, emitting wire-compatible output while billing against the subscription (`cc_entrypoint=cli`) rather than the Agent SDK credit pool. + +## Background + +Anthropic's June 15, 2026 billing split: +- `claude -p` / Agent SDK → separate monthly credit pool ($100/mo Max 5x, $200/mo Max 20x) +- Interactive TUI (`claude`) → continues consuming unlimited subscription + +The key mechanism: the interactive TUI sends `cc_entrypoint=cli` in the billing header. Any wrapper that gives `claude` a real PTY inherits that classification. Screen-scraping or hook-based completion detection extracts the response and emits it in `claude -p` wire format. + +## Prior Art + +### `jedarden/NEEDLE` — `plugins/claude-interactive` +Python script (~300 lines). PTY via `pty.openpty()` + `os.fork()`. Completion detection via 30s idle timeout after seeing `●` bullet. Response extraction via `pyte` virtual terminal screen parse. Token counts always zero. Single commit (2026-05-16). **Source for initial implementation.** + +### `smithersai/claude-p` +Zig binary, zmux PTY library. Uses `SessionStart` + `Stop` hooks injected via `--settings` for authoritative completion detection. Reads token counts from JSONL transcript. Wire-compatible with `claude -p`. ~50–200ms overhead. **Source for Stop hook pattern.** + +### `hristo2612/jinn` +Node.js, node-pty. Hook relay via HTTP loopback with shared secret. Intercepts SSE stream via `ANTHROPIC_BASE_URL` proxy. More moving parts; better for persistent session keep-alive use cases. + +## Architecture + +``` +caller + │ prompt (stdin or arg) + ▼ +claude-print + ├── PTY spawner forkpty() → claude --dangerously-skip-permissions + ├── Terminal emu responds to DA1/DA2/DSR/XTVERSION probes (Ink requirement) + ├── Startup seq wait for burst → CR (trust dismiss) → bracketed-paste prompt + ├── Stop hook per-run ~/.config/claude-print//settings.json overlay + │ hook writes {session_id, transcript_path} to named pipe + ├── Transcript reads ~/.claude/projects/**/.jsonl for token counts + └── Emitter emits stream-json / json / text to stdout (claude -p compat) +``` + +## Components + +### 1. PTY Spawner +- `pty.openpty()` + `os.fork()` + `os.execvp('claude', [...])` +- Forwards `--model`, `--max-turns`, `--allowedTools`, `--dangerously-skip-permissions` +- Sets PTY window size from `/dev/tty` or defaults (220×50) + +### 2. Terminal Emulator (Ink probe responder) +- DA1 (`ESC[c`) → `ESC[?6c` +- DA2 (`ESC[>0c`) → `ESC[>0;0;0c` +- DSR (`ESC[6n`) → `ESC[1;1R` +- XTVERSION (`ESC[>q`) → `ESC P>|claude-print ESC \` +- Window size (`ESC[18t`) → `ESC[8;50;220t` +- Without these, Ink hangs indefinitely at startup + +### 3. Startup Sequencer +- Phase 1: accumulate startup bytes; after 0.8s idle gap (or 45s timeout), send CR to dismiss any trust dialog +- Phase 2: after 2.0s gap post-CR, send prompt via bracketed paste (`ESC[200~...ESC[201~` + CR) +- Detects `trust` + `folder` in PTY output and sends CR immediately + +### 4. Stop Hook (completion signal) +- Before spawning: write per-run settings overlay to `~/.config/claude-print//settings.json` + ```json + {"hooks": {"Stop": [{"hooks": [{"type": "command", "command": "/path/to/hook.sh"}]}]}} + ``` +- Invoke claude with `--settings ~/.config/claude-print//settings.json` +- Hook script reads stdin JSON, writes `{session_id, transcript_path, timestamp}` to named FIFO +- Parent polls FIFO; on receipt, breaks from PTY read loop +- Cleanup: remove settings overlay + FIFO on exit (defer) +- **This replaces the fragile idle-timeout approach in the NEEDLE plugin** + +### 5. Transcript Reader +- On Stop hook receipt, extract `transcript_path` from hook payload +- Read JSONL, find last `type: "assistant"` event, extract `content[].text` +- Aggregate `usage` blocks across all assistant events for real token counts +- Retry loop (40 × 50ms) for Stop→JSONL flush race +- Fallback: extract `last_assistant_message` from Stop payload directly + +### 6. Emitter +- `--output-format text` (default): print response text to stdout +- `--output-format json`: emit single JSON object with `result`, `session_id`, `usage`, `duration_ms`, `is_error` +- `--output-format stream-json`: tail transcript JSONL and emit lines as they appear (per-message streaming) +- All formats match `claude -p` wire format + +## Data Models + +### Stop Hook Payload (from Claude Code) +```json +{ + "session_id": "abc123", + "transcript_path": "/home/coding/.claude/projects/.../abc123.jsonl", + "last_assistant_message": "...", + "hook_event_name": "Stop" +} +``` + +### Emitted JSON (--output-format json) +```json +{ + "type": "result", + "subtype": "success", + "is_error": false, + "result": "assistant response text", + "session_id": "abc123", + "num_turns": 1, + "duration_ms": 4200, + "cost_usd": 0, + "usage": { + "input_tokens": 1240, + "output_tokens": 380, + "cache_creation_input_tokens": 0, + "cache_read_input_tokens": 900 + } +} +``` + +### NEEDLE Agent Config (`claude-print.yaml`) +```yaml +name: claude-print +description: Claude Code interactive mode — subscription billing (cc_entrypoint=cli) +agent_cli: claude-print +input_method: + method: stdin +invoke_template: "cd {workspace} && claude-print --model {model} --max-turns 30 --dangerously-skip-permissions" +timeout_secs: 3600 +provider: anthropic +model: claude-sonnet-4-6 +output_transform: needle-transform-claude +``` + +## Implementation Phases + +- [ ] Phase 1: Core PTY wrapper — spawner, terminal probe responder, startup sequencer, idle-timeout completion (port from NEEDLE plugin, working baseline) +- [ ] Phase 2: Stop hook completion — per-run settings overlay, named FIFO, hook script, poll loop (replaces idle timeout) +- [ ] Phase 3: Transcript reader — JSONL parse, token extraction, retry loop, fallback to Stop payload +- [ ] Phase 4: Emitter — text/json/stream-json output formats, wire-compat with `claude -p` +- [ ] Phase 5: NEEDLE integration — `claude-print.yaml` agent config, `install.sh`, test with NEEDLE worker +- [ ] Phase 6: Tests and CI — unit tests for transcript parsing, mock PTY scenarios, Argo Workflows CI + +## Open Questions + +- **Language**: Python (port from NEEDLE plugin, fast iteration) or Rust (native NEEDLE integration)? Python has `pyte` and `pty` ready; Rust would need a PTY crate. +- **`--settings` overlay vs project-level hooks**: Per-run `--settings` file (smithersai approach) avoids mutating `~/.claude/settings.json` and is self-contained. Project-level `.claude/settings.json` per workspace is an alternative but affects all sessions in that workspace. +- **pyte for response extraction**: Still needed for fallback when Stop payload `last_assistant_message` is absent (older Claude Code versions). Keep or drop? +- **Multiline prompts**: NEEDLE sends prompt via stdin pipe. Bracketed paste handles embedded newlines correctly; verify with very long prompts (>32KB). +- **Rate limit 429 handling**: Claude emits an error event in the transcript; `is_error: true` and exit 1. No explicit retry — callers handle retry. +- **Dedicated GitHub repo vs NEEDLE plugin**: Standalone repo (`jedarden/claude-print`) allows independent versioning, pre-built releases, and use outside NEEDLE. NEEDLE plugin stays as thin wrapper pointing at the binary.