pdftract/docs/notes
jedarden 57df42f478 docs(pdftract-3b1x): finalize sdk-invocation.md with subprocess contract and TH-07 compliance
Add comprehensive "Subprocess Contract" section documenting:
- argv layout with canonical form
- stdin discipline (password ingress, PDF bytes from stdin)
- stdout/stderr discipline (what goes where, what never gets logged)
- Exit code taxonomy (0, 64-78) with TH-03 (exit 78) and TH-07 (exit 64) refs
- Environment variable pass-through (PDFTRACT_PASSWORD, PDFTRACT_MCP_TOKEN, etc.)
- --progress-json event schema (ndjson format, all event types)
- --capture-diagnostics archive layout (zip/tar, contained files, scrubbing rules)

Update all language examples (Python, Node.js, Go, Ruby, Java, Rust) with
TH-07-compliant password handling:
- Pass password via PDFTRACT_PASSWORD env var (subprocess)
- Pass password via multipart form field (HTTP)
- Never use --password VALUE flag (rejected unless opt-in)

Add progress JSON parsing examples for Python, Node.js, and Rust showing
real-world event-driven progress tracking.

File grows from 1100 to 1837 lines (+737 lines, ~67%).

Closes: pdftract-3b1x
2026-05-24 07:48:09 -04:00
..
.gitkeep Initial repo scaffold with README and docs structure 2026-05-16 14:26:16 -04:00
ocr-language-packs.md feat(pdftract-3zhf): add unified TableDetector::detect entry point 2026-05-24 00:51:59 -04:00
pdftract-3c4i.md fix(pdftract-3c4i): export detect_merged_cells from table module 2026-05-24 00:23:14 -04:00
sdk-architecture.md docs(pdftract-32y9): finalize SDK architecture note with workspace layout, cross-compile matrix, and KU-12 alignment 2026-05-24 06:38:23 -04:00
sdk-conformance-runner.md feat(pdftract-5omc): implement per-language conformance test runner pattern 2026-05-18 01:32:24 -04:00
sdk-contract.md docs(pdftract-147a): author SDK contract specification 2026-05-17 23:13:55 -04:00
sdk-invocation.md docs(pdftract-3b1x): finalize sdk-invocation.md with subprocess contract and TH-07 compliance 2026-05-24 07:48:09 -04:00