pdftract/scripts/README.md
jedarden 61babb0991 test(bf-5dnh1): add memory ceiling enforcement for proptests
Add scripts/run-proptest-with-limits.sh to run property tests under
cgroup MemoryMax, ensuring pathological cases fail fast with allocation
errors instead of OOMing the host.

Coordinated with bf-1g1fd (CI memory-ceiling gate) to provide local
development parity with CI enforcement.

Changes:
- Add scripts/run-proptest-with-limits.sh (cgroup v2/v1 wrapper)
- Add scripts/README.md documenting memory ceiling enforcement

Memory limits:
- Proptests: 2048 MB cgroup MemoryMax (local)
- Fuzz tests: 1536 MB cgroup + 1024 MB libfuzzer RSS (existing)

Proptest input size caps (already in place):
- Lexer/object parser: up to 10 KB inputs
- Xref/stream parsers: up to 100 KB inputs
- Nested structures: depth-limited

Refs: bf-5dnh1, bf-1g1fd

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-23 13:39:04 -04:00

Scripts

This directory contains utility scripts for pdftract development and testing.

Memory Ceiling Enforcement

Fuzz Tests (run-fuzz-with-limits.sh)

Runs cargo-fuzz targets with memory limits to ensure pathological inputs fail fast:

scripts/run-fuzz-with-limits.sh [target]

Memory limits:

  • Cgroup MemoryMax: 1536 MB (hard ceiling)
  • libFuzzer RSS limit: 1024 MB (per-execution)
  • libFuzzer malloc limit: 1024 MB (largest single allocation)

Environment:

  • FUZZ_TIME_SECONDS: Fuzzing time per target in seconds (default: 60)
  • MEMORY_MAX_MB: Cgroup limit in MB (default: 1536)
  • RSS_LIMIT_MB: libFuzzer RSS limit in MB (default: 1024)
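
As a rough illustration of how these variables could map onto a command line, the sketch below prints (rather than runs) one possible invocation. It assumes the limit is applied with systemd-run and that the libFuzzer flags are passed after `--`; the helper name fuzz_command and the target name example_target are hypothetical, and the real script may be structured differently.

```shell
# Defaults mirror the documented values; each can be overridden from the environment.
FUZZ_TIME_SECONDS="${FUZZ_TIME_SECONDS:-60}"
MEMORY_MAX_MB="${MEMORY_MAX_MB:-1536}"
RSS_LIMIT_MB="${RSS_LIMIT_MB:-1024}"

# fuzz_command TARGET (hypothetical helper): print the command for one target.
# Assumption: systemd-run applies the cgroup limit; the actual script may not use it.
fuzz_command() {
  echo "systemd-run --user --scope -p MemoryMax=${MEMORY_MAX_MB}M" \
       "cargo fuzz run $1 --" \
       "-rss_limit_mb=${RSS_LIMIT_MB}" \
       "-malloc_limit_mb=1024" \
       "-max_total_time=${FUZZ_TIME_SECONDS}"
}

# example_target is a placeholder, not a real pdftract fuzz target.
fuzz_command example_target
```

Printing the command first makes it easy to check which limits are in effect before starting a long fuzzing run.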

Implementation: Uses cgroup v2 MemoryMax (preferred), falling back to cgroup v1 memory.limit_in_bytes with the OOM killer disabled to keep the failure mode clean.
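
The v2/v1 choice can be made by probing the cgroup filesystem. The following is a minimal sketch of one way to do that, not necessarily how the script does it: it assumes the conventional mount point /sys/fs/cgroup, and detect_cgroup_mode is a hypothetical helper name. (MemoryMax is the systemd property that corresponds to the cgroup v2 memory.max limit.)

```shell
# detect_cgroup_mode [ROOT] prints "v2", "v1", or "none".
# ROOT defaults to /sys/fs/cgroup; it is a parameter so the function can be
# exercised against a fake directory tree.
detect_cgroup_mode() {
  root="${1:-/sys/fs/cgroup}"
  if [ -f "$root/cgroup.controllers" ]; then
    # The unified (v2) hierarchy exposes cgroup.controllers at its root.
    echo v2
  elif [ -f "$root/memory/memory.limit_in_bytes" ]; then
    # The v1 memory controller is conventionally mounted under memory/.
    echo v1
  else
    echo none
  fi
}

detect_cgroup_mode
```

A wrapper could branch on the printed value to pick the v2 path, the v1 fallback, or to refuse to run without a limit.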

Property Tests (run-proptest-with-limits.sh)

Runs proptest modules with memory limits:

scripts/run-proptest-with-limits.sh [test_name]

Memory limits:

  • Cgroup MemoryMax: 2048 MB (hard ceiling)

Environment:

  • PROPTEST_CASES: Test cases per module (default: 1000)
  • MEMORY_MAX_MB: Cgroup limit in MB (default: 2048)
  • PROPTEST_SEED: Proptest seed (default: random)
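
On the cgroup v1 fallback path, memory.limit_in_bytes takes a byte count, so a value such as MEMORY_MAX_MB has to be converted first. A small sketch of that arithmetic, assuming "MB" here means MiB (1024 × 1024 bytes); mb_to_bytes is a hypothetical helper name:

```shell
# mb_to_bytes MB (hypothetical helper): convert a MiB count to bytes.
# Assumption: "MB" in this README means MiB, i.e. 1024 * 1024 bytes.
mb_to_bytes() {
  echo $(( $1 * 1024 * 1024 ))
}

mb_to_bytes "${MEMORY_MAX_MB:-2048}"   # prints 2147483648 when MEMORY_MAX_MB is unset
```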

Proptest modules: lexer, object_parser, xref, stream, cmap_parser

Input size caps (all proptest strategies are bounded):

  • Lexer/object parser: up to 10 KB inputs
  • Xref/stream parsers: up to 100 KB inputs
  • Nested structures: depth-limited (e.g., 500 for parser depth checks)

These bounds ensure tests complete quickly while still exercising edge cases.

Why Memory Ceilings?

Per bf-1g1fd and the Quality Targets (plan.md Phase 0.4), adversarial inputs must not OOM the host. Memory ceilings enforce:

  1. Clean failure mode - Allocation errors instead of host OOM
  2. Fast failure - Pathological cases abort immediately at the limit
  3. Regressions as test failures - Memory growth is caught in CI

CI enforces these limits via cgroup MemoryMax in .ci/argo-workflows/pdftract-ci.yaml (proptests) and .ci/argo-workflows/pdftract-nightly-fuzz.yaml (fuzz).

Other Scripts

generate-minimal-pdf.sh

Generates minimal valid PDF documents for testing.

check-provenance.sh

Verifies binary provenance and SBOM signatures.

check-secrets.sh

Scans for accidental secrets in committed code.

generate_test_corpus.py

Generates a synthetic PDF test corpus.