Add 4 new tests to verify PNG and TIFF predictor functions use row-by-row
processing with bounded peak memory (2x stride), never pre-allocating full
output buffers inside tests.
- test_png_predictor_budget_enforcement_small_fixture: 200-byte fixture,
100-byte budget, verifies truncation at row boundary
- test_tiff_predictor_2_budget_enforcement_small_fixture: 160-byte fixture,
80-byte budget, verifies row-by-row processing for grayscale
- test_png_predictor_multiple_selectors_budget_per_row: 25-byte fixture
with all PNG selector types, verifies per-row budget checking
- test_tiff_predictor_2_rgb_budget_enforcement: 45-byte RGB fixture,
verifies multi-byte pixel handling with budget enforcement
All fixtures are under 250 bytes, no full-buffer pre-allocation, tests
mirror the row-by-row discipline from bf-49wmw production fix.
Closes bf-21hw8
- Fix test_bomb_limit_flate to actually test early abort behavior
- Use 200-byte pattern (not large buffers) that compresses to ~50 bytes
- Set bomb_limit to 50 bytes to force truncation
- Assert output.len() < pattern.len() to verify truncation occurred
- Add documentation explaining the minimal input approach
Per bf-4xk2v: "Decompression-bomb and max_decompress_bytes tests must
trigger the STREAM_BOMB abort WITHOUT building the multi-GB decoded output
in memory. Use minimal crafted inputs and assert the byte-budget limit fires
early. Never pre-size a Vec to the claimed or decompressed length."
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Changes from Phase 6.7 child beads that were not committed earlier:
- Add subtle dependency for constant-time token comparison
- Add root directory for path-traversal protection in HTTP+SSE transport
- Update MCP server state to support --root flag
- Minor fixes and improvements across MCP modules
These changes support the 7 closed child beads:
- pdftract-5xq16: JSON-RPC 2.0 framing layer
- pdftract-67tm8: stdio transport
- pdftract-g0ro2: HTTP+SSE transport
- pdftract-24kut: transport mutual exclusion enforcement
- pdftract-1rami: tool catalog (10 tools)
- pdftract-6696g: path-traversal protection
- pdftract-zltqd: bearer-token auth
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
This commit implements the Cargo.lock policy for reproducible builds
across all workspace members (pdftract-core, pdftract-cli, pdftract-py).
Changes:
- Add CONTRIBUTING.md with lockfile-update workflow documentation
- Add .renovaterc.json for weekly lockfile-only PRs (human-gated)
- Add crates/pdftract-core/README.md with rationale for checked-in lockfiles
- Add notes/pdftract-49f8.md with verification note
The Argo workflow updates (pdftract-ci.yaml) are committed separately
in the declarative-config repo.
Acceptance criteria:
- PASS: Cargo.lock tracked by git, not in .gitignore
- PASS: Argo workflow templates document --locked/--frozen requirements
- WARN: Enforcement to be completed when placeholder templates are implemented
- WARN: Binary reproducibility verification deferred to pdftract-build-binaries implementation
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Implement the document catalog parser (/Root traversal) for PDF documents.
The catalog parser extracts all key entries from the document catalog
including Pages, Outlines, MarkInfo, StructTreeRoot, AcroForm, Names,
Metadata, PageLabels, OCProperties, OpenAction, AA, and Version.
Key structures:
- MarkInfo: parses /MarkInfo dictionary with is_tagged, user_properties, suspects
- PageLabelStyle: enum for all label styles (D, R, r, A, a)
- PageLabel: single page label with style, prefix, and start value
- PageLabelsTree: number tree parser for /PageLabels with /Nums and /Kids support
- OcProperties: stub for OCG implementation (delegated to dedicated bead)
- Catalog: main catalog struct with all required and optional fields
Number tree implementation:
- Parses /Nums arrays (leaf nodes with alternating key-value pairs)
- Supports /Kids arrays (internal nodes for recursive tree traversal)
- Provides get_label_with_start() and get_label() methods for lookup
- Correctly formats roman numerals (uppercase/lowercase) and letter sequences
All 27 tests pass including proptests for fuzzing robustness (INV-8).
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>