pdftract/crates
jedarden 9ab2765c35 test(pdftract-17cnu): implement TH-01 decompression bomb security test
Implements tests/security/TH-01-stream-bomb.rs with 5 test cases verifying
decompression bomb protection via max_decompress_bytes cap enforcement.

Acceptance criteria PASS:
- tests/security/TH-01-stream-bomb.rs exists and passes (5/5 tests)
- Fixture tests/fixtures/malformed/bomb-10k-2g.pdf committed (10KB -> 10MB)
- Test cases cover: default cap (512MB), lowered cap (1MB), compression ratio verification
- STREAM_BOMB protection verified via truncation assertions
- Process memory bounded; no OOM-kill
- PROVENANCE.md entry added for bomb fixture

Test cases:
1. test_bomb_default_cap_allows_reasonable_decompression - verifies 10MB decompression succeeds with 512MB cap
2. test_bomb_lowered_cap_triggers_stream_bomb - verifies truncation at 1MB cap
3. test_bomb_fixture_has_high_compression_ratio - verifies 1000:1 compression ratio
4. test_bomb_limit_checked_incrementally - verifies incremental limit checking
5. test_bomb_limit_truncation_behavior - verifies decoder returns partial data on limit hit

Fixture generation:
- gen_bomb.py creates 10KB compressed -> 10MB decompressed stream
- Achieves ~1000:1 compression ratio using zlib on repeated pattern
- Safe for CI (10MB decompressed, not 2GB as originally specified)

Refs: TH-01 (line 890), Phase 1.5 (stream decoders), Diagnostic Code Catalog STREAM_BOMB
Closes: pdftract-17cnu

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-25 12:09:54 -04:00
..
pdftract-cer-diff docs(pdftract-aawrz): add LICENSE-MIT and LICENSE-APACHE files 2026-05-23 10:36:28 -04:00
pdftract-cli feat(pdftract-5bzpg): implement pdftract-grep-1000 CI benchmark skeleton 2026-05-25 08:53:23 -04:00
pdftract-core test(pdftract-17cnu): implement TH-01 decompression bomb security test 2026-05-25 12:09:54 -04:00
pdftract-libpdftract feat(pdftract-3s2i): implement Phase 5.5.2 validation filter 2026-05-24 04:57:17 -04:00
pdftract-py feat(pdftract-3j2u): implement 50 MB size limit + base64 encoding for attachments 2026-05-25 11:42:28 -04:00