Bead pdftract-zltqd implements bearer-token authentication for the
MCP HTTP+SSE transport. The implementation was already complete.
This verification note confirms all acceptance criteria are met.
Verification summary:
- Non-loopback binds without token abort with exit code 78
- Env var and token-file auth sources work correctly
- Insecure CLI token requires PDFTRACT_INSECURE_CLI_TOKEN=1
- /health endpoint is auth-exempt (returns 200 without credentials)
- POST requests require valid Authorization: Bearer header
- Constant-time token comparison using subtle crate
- IPv4 and IPv6 loopback addresses are exempt from token requirement
All unit tests pass (90 MCP tests). Manual testing confirms
the plan critical test: "--bind 0.0.0.0:8080 without token
aborts startup; with token, valid requests succeed and
missing tokens get 401"
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
The --root DIR flag was already fully implemented in the codebase.
All 25 tests pass (12 unit + 13 integration tests).
Acceptance criteria verified:
- Path traversal rejected with -32602
- Absolute paths rejected when --root is set
- HTTPS URLs bypass the check
- Symlink escapes detected via canonicalize
- Startup validation for root directory
Co-Authored-By: Claude Code <noreply@anthropic.com>
Per ADR-006: stdio and HTTP transports are mutually exclusive because they
have opposite stdout discipline (stdio: JSON-RPC sink; HTTP: log channel).
Changes:
- Add clap ArgGroup with multiple(false) to enforce --stdio XOR --bind
- Default to stdio mode when neither flag is specified
- Change --bind from required String to Option<String>
- Add ADR-006 reference to help text and doc comments
- Add unit tests for CLI argument validation
Acceptance criteria:
- pdftract mcp → launches in stdio mode (default)
- pdftract mcp --stdio → launches in stdio mode
- pdftract mcp --bind ADDR → launches in HTTP+SSE mode
- pdftract mcp --stdio --bind ADDR → exits 2 with clap conflict error
- pdftract mcp --help shows mutual exclusivity note
- Unit test verifies ArgGroup conflict on dual-transport invocation
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Implements the HTTP+SSE transport for the MCP server per bead pdftract-g0ro2.
All acceptance criteria PASS.
Routes:
- POST /: JSON-RPC requests (single or batch)
- GET /sse: Server-Sent Events for notifications
- GET /health: Health check (auth-exempt)
Key features:
- Reuses axum/tokio/tower-http from Phase 6.4 (no new deps)
- Bearer token auth (from sibling bead 6.7.7)
- Request body limit (256 MB default, configurable via --max-upload-mb)
- SSE keepalive every 30 seconds
- Broadcast channel for fan-out notifications
- Backpressure handling (drops lagged clients with WARN log)
- 100-client SSE limit (MAX_SSE_CLIENTS)
- Custom 413 Payload Too Large JSON response
- Batch request support per JSON-RPC 2.0 spec
All 10 integration tests pass:
- test_post_tools_list: POST / returns tool catalog
- test_get_sse_stream: GET /sse opens SSE stream with keepalive
- test_50_concurrent_clients: 50 concurrent clients succeed
- test_health_during_load: GET /health returns 200 under load
- test_post_batch_request: Batch requests return batch responses
- test_post_payload_too_large: POST / over limit returns 413 with JSON body
- test_auth_required_for_non_loopback: Bearer auth returns 401 with WWW-Authenticate
- test_post_single_request_returns_single_response: Single request returns single response
- test_unknown_method: Unknown method returns method_not_found error
- test_get_health: GET /health returns 200 with version info
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Implements the stdio transport for the MCP server, enabling communication
with local agents (Claude Desktop, Claude Code, Continue, Cursor) over
standard input/output with Content-Length framing.
Core features:
- LSP-style Content-Length framing with \r\n terminators
- JSON-RPC 2.0 message parsing and serialization
- INV-9 compliance: stdout contains only JSON-RPC frames
- Panic hook redirects panics to stderr
- SIGTERM handler for graceful shutdown
- Parse errors return -32700 with id: null, then continue
Acceptance criteria:
- ✅ Piping tools/list with framing produces expected response < 50ms
- ✅ EOF on stdin → clean exit within 100ms
- ✅ Malformed JSON → -32700 error, subsequent requests work
- ✅ No println!/log output to stdout (INV-9 enforced)
- ✅ Panics go to stderr, no partial JSON on stdout
- ✅ SIGTERM → exit 0, SIGINT → immediate non-zero exit
Tests added:
- crates/pdftract-cli/tests/mcp-stdio.rs (8 integration tests, all pass)
- All 49 existing unit tests continue to pass
Refs: pdftract-67tm8, plan Phase 6.7.2
Add verification note documenting JSON-RPC 2.0 framing implementation
with all acceptance criteria PASS.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Add compute-sha256sums step to pdftract-ci publish-if-tag that produces
an aggregate SHA256SUMS file covering all distributed artifacts: binary
archives, Python wheels, sdist, and CycloneDX SBOM.
Key changes:
- Glob-based artifact collection (tar.gz, zip, whl, cdx.json)
- Deterministic sorting with LC_ALL=C sort -k 2 for reproducibility
- Local verification via sha256sum --check before publishing
- Dynamic artifact upload array instead of hardcoded EXPECTED_ARTIFACTS
- SBOM added as optional input artifact
The SHA256SUMS file format matches GNU coreutils sha256sum output,
enabling one-command verification with cosign verify-blob.
References:
- Plan line 3369: SHA256SUMS aggregate
- Plan line 3419: sign-blob of SHA256SUMS
- Plan line 3460: one cosign verify-blob umbrella
Co-Authored-By: Claude Code <noreply@anthropic.com>
- Wire generate-provenance and verify-provenance steps into workflow DAG
- Update publish-if-tag to upload multiple.intoto.jsonl to GitHub Release
- Fix provenance reproducibility by using SOURCE_DATE_EPOCH from git commit
- Docker images already have cosign attest --type slsaprovenance
Acceptance criteria:
- PASS: generate-provenance step wired into DAG
- PASS: provenance uploaded to GitHub Release
- PASS: Docker image cosign attest already implemented
- WARN: Full slsa-verifier verification requires OIDC issuer registration
- PASS: Provenance is reproducible using git commit timestamp
- PASS: Automated smoke test validates JSON structure
Refs: pdftract-3gk5, plan line 3415 (Signing and Provenance)
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Add verification note confirming that per-page Resource dictionary
inheritance is complete and all acceptance criteria are met.
The implementation in resources.rs and pages.rs provides:
- Per-namespace merging (Font, XObject, ExtGState, ColorSpace, etc.)
- Per-key last-write-wins semantics
- Arc sharing for memory efficiency when pages lack /Resources
- Support for inline ColorSpace arrays
All 10 resource-related tests pass, including:
- 3-level inheritance test
- Per-key override test
- Arc sharing test
- ColorSpace inline array test
- Empty root /Resources test
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Enhanced the `detect_linearization` function to avoid false matches when
extracting keys from the linearization dictionary. Previous implementation
could incorrectly match "/L" within "/Linearized" or "/H" within other keys.
Changes:
- Added loop-based search in extract_number helper to skip substring matches
- Added similar substring-aware logic for /H (hint stream) parsing
- Added new diagnostic codes for /Prev chain error handling
- Added comprehensive verification note
Acceptance criteria PASS:
- Non-linearized files return None
- Valid linearized dict detected correctly
- File size mismatch (incremental update) invalidates linearization
- No /H entry returns None for hint_stream_offset
- Random bytes never panic (proptest)
- Forward scan disabled for linearized files
- INV-8 maintained (no panics on arbitrary input)
Co-Authored-By: Claude Code <noreply@anthropic.com>
The 512 MiB DEFAULT_MAX_DECOMPRESS_BYTES change was implemented in
commit e94f2ab (fix(bf-49wmw)). This note documents the verification.
Co-Authored-By: Claude Code <noreply@anthropic.com>
The hybrid xref handler (merge_hybrid) was already implemented. This adds
a property-based test to verify it handles random combinations of traditional
and stream entries without panicking.
Changes:
- Added proptest_merge_hybrid_no_panic to proptest_tests module
- Tests random entry sets using prop::collection::hash_map
- Covers all entry types (InUse, Free, Compressed)
- Verification note confirms all acceptance criteria PASS
Test results: 9/9 merge_hybrid tests pass
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Fixed compilation error in xref.rs where u64 literal 0x5DEECE66D was used
with u32 state, causing overflow. Changed state to u64 for proper Java
Random algorithm behavior.
The OCG /OCProperties parsing implementation was already complete and
all tests pass. See notes/pdftract-2a6rk.md for verification.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Implements merge_hybrid() and is_hybrid_trailer() for hybrid PDF files.
Hybrid files have both a traditional xref table at startxref and a
supplementary xref stream pointed to by /XRefStm in the trailer.
Per PDF spec, the traditional table is authoritative for objects it
covers; the stream's type-2 entries fill gaps not covered by the
traditional table.
Key behaviors:
- Traditional entries override stream entries for same object numbers
- Stream-only type-2 entries are added as gap fill
- Free/InUse conflicts emit STRUCT_HYBRID_CONFLICT diagnostic
- Merged trailer has /XRefStm key removed
- Result XrefSection has is_hybrid: true set
Acceptance criteria:
- Critical test: traditional entries override stream entries (PASS)
- Gap fill: stream-only type-2 entries added (PASS)
- Free/InUse conflict: diagnostic emitted (PASS)
- Non-hybrid trailer: is_hybrid_trailer returns false (PASS)
- proptest: no panics with random combinations (PASS)
- INV-8 maintained: no panics in library code (PASS)
Co-Authored-By: Claude Code <noreply@anthropic.com>
Verifies that the per-page Resource dictionary inheritance implementation
is complete and correct. All acceptance criteria are met:
- 3-level resource inheritance test passes
- Per-key override test passes
- /Resources missing on page inherits parent's
- Arc<ResourceDict> sharing verified with Arc::ptr_eq
- ColorSpace inline-array test passes
- Empty root /Resources propagates correctly
- INV-8 maintained (all fuzz tests pass)
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Add documentation for the fix that removed diagnostic emission for
unknown keywords, complementing the earlier keyword fallback fix.
Co-Authored-By: Claude Code <noreply@anthropic.com>
Fixed incorrect fallback behavior in keyword lexer functions. Four
functions (lex_e_keyword, lex_o_keyword, lex_r_keyword, lex_n_keyword)
were incorrectly calling lex_name() instead of lex_keyword() when
keywords didn't match.
When a PDF contains an unrecognized word starting with e/o/n/R
(e.g., "endob" instead of "endobj"), the lexer should fall back to
generic keyword parsing (Token::Keyword(bytes)), not name parsing.
Names always start with /, so calling lex_name() on input without
a leading / would incorrectly skip the first byte.
References:
- Bead: pdftract-5upi
- Notes: notes/pdftract-5upi.md
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Documents that CycloneDX SBOM generation is fully implemented
in the Argo Workflows (declarative-config). The workflows:
- Generate pdftract-vX.Y.Z.cdx.json using cargo-cyclonedx
- Validate schema with cyclonedx-cli validate
- Attest to Docker images via cosign attest --type cyclonedx
- Attach to GitHub Release as an asset
- Include in SHA256SUMS aggregate
Acceptance criteria: 5 PASS, 1 WARN (grype test requires release)
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Documents the completed implementation of pdftract-release-cascade
WorkflowTemplate and pdftract-tag-trigger Argo Events Sensor.
Acceptance criteria:
- PASS: All infrastructure files committed in declarative-config
- WARN: Runtime verification deferred (kubectl not available in env)
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Documents the enhancements made to cosign keyless signing:
- Projected service account token with sigstore audience
- Explicit OIDC issuer URL configuration
- Improved digest extraction with fallback strategies
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Make diagnostics module visible to fingerprint module and fix
hash_page_geometry signature to match usage.
Changes:
- Add `pub mod diagnostics;` to lib.rs for module visibility
- Modify hash_page_geometry to create diagnostics internally
The canonicalize module already has complete implementation:
- canonicalize_f64: banker's rounding to 4dp for geometry
- normalize_content_stream: whitespace normalization via lexer
- serialize_dict_canonical: sorted-key dict serialization
- hash_resource_dict_canonical: order-independent resource hashing
Verification: notes/pdftract-154mz.md
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Documents the WorkflowTemplate creation for mdBook → Cloudflare Pages CI.
Template committed to declarative-config 4fe4947.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Documents the implementation of the pdftract-github-release
WorkflowTemplate, including artifact taxonomy, release notes
generation, and acceptance criteria status.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Add per-PR property tests and nightly fuzz job infrastructure:
CI Changes (declarative-config):
- pdftract-ci.yaml: Add proptest step to test-matrix
- New test-proptest template with configurable case count
- Sets PROPTEST_SEED for reproducibility
- Runs 10,000 cases per module within 1 CPU-hour budget
- pdftract-nightly-fuzz.yaml: Sync fuzz workflow
- CronWorkflow runs daily at 0400 UTC
- 5 fuzz targets with address sanitizer
- Seed corpus from malformed fixtures
Existing Infrastructure (Already in Place):
- Proptest suites for lexer, object_parser, xref, stream, cmap_parser
- Fuzz targets for all 5 modules
- proptest-regressions/ with README
- Seed corpus in fuzz/corpus/
Verification:
- Added tests/proptest-panic-verification.rs
- Proptest infrastructure correctly structured
- Will catch deliberate panics within budget
Closes: pdftract-33v
Documents the implementation of the pdftract-crates-publish WorkflowTemplate
in jedarden/declarative-config.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
The OCG implementation was already complete in ocg.rs. All 20 tests pass:
- BaseState parsing (ON/OFF/Unchanged)
- /ON and /OFF array override handling
- OCMD policy preservation (AllOn, AnyOn, AllOff, AnyOff)
- INV-8 compliance verified via proptests
Phase 3 will consume OcProperties via is_visible() to suppress
glyphs in /OC /OCGRef BDC blocks when the referenced OCG is OFF.
Co-Authored-By: Claude Code <noreply@anthropic.com>
Add details about the BytesSource cleanup bug fix and clarify that the
contract defines 7 error kinds, not 8 as initially stated in the task.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Add quality-matrix implementation to pdftract-ci with msrv-check step
using rust:1.78-slim to detect usage of newer Rust features.
Changes:
- .ci/argo-workflows/pdftract-ci.yaml: Implement quality-matrix DAG with
msrv-check, clippy-fmt, and cargo-audit templates
- CHANGELOG.md: New file documenting MSRV bump policy (MINOR version
event, warning period, update checklist)
The MSRV gate prevents silent drift that would break downstream consumers
on older toolchains. Any Rust 1.79+ feature (e.g., let-else, core::error::Error)
will fail the msrv-check step, triggering a policy review.
See notes/pdftract-2w02.md for acceptance criteria verification.
Co-Authored-By: Claude Code <noreply@anthropic.com>
Add MSRV (Minimum Supported Rust Version) pinning to 1.78 for
pdftract-core and pdftract-cli. The MSRV gate prevents silent
absorption of newer Rust features that would break downstream
consumers on older toolchains.
Changes:
- CI: Add quality-matrix DAG with msrv-check step (rust:1.78-slim)
- CI: Add clippy-check, fmt-check, cargo-audit, cargo-deny templates
- README: Add MSRV badge (shields.io)
- clippy.toml: Enable msrv=1.78 for MSRV-aware lints
- CONTRIBUTING.md: Document MSRV bump policy (MINOR version event)
The rust-version was already declared in workspace Cargo.toml;
this bead adds the CI enforcement and documentation.
Refs: pdftract-2w02
Add comprehensive verification note for forward_scan_xref implementation.
The function was already implemented in xref.rs; this note documents
verification of all bead requirements.
Also fix duplicate ObjRef import in parser/mod.rs (ObjRef is defined in
diagnostics module and re-exported).
Bead: pdftract-46lw