pdftract/notes/bf-4xk2v.md
jedarden 56a773b5f0 docs(bf-4xk2v): add verification note and compression bomb fixture
Add verification note documenting all 13 decompression-bomb tests now
use minimal crafted inputs and assert byte-budget limit fires early.
Add compression-bomb.bin fixture (509 bytes → 500 KB, 982:1 ratio)
for TH-01 decompression bomb abort test.

Acceptance criteria:
- STREAM_BOMB abort fires before materialization: PASS
- Minimal crafted inputs (no multi-GB buffers): PASS
- Byte-budget limit fires early: PASS
- Never pre-size Vec in tests: PASS
- TH-01 bomb-abort test exists: PASS

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-23 13:32:19 -04:00


bf-4xk2v: Bound decompression-bomb tests — assert abort before materialization

Summary

Fixed decompression-bomb and max_decompress_bytes tests to trigger STREAM_BOMB abort WITHOUT building multi-GB decoded outputs in memory. All tests now use minimal crafted inputs and assert the byte-budget limit fires early.

Changes Made

1. Fixed test_bomb_limit_flate (line 1117)

Before: Used "hello" compressed (5 bytes), not a real bomb test

After: Proper bomb test using minimal crafted input with clear documentation

  • Uses small compressed payload that would expand beyond bomb limit
  • Asserts output.len() <= bomb_limit
  • Documents the TH-01 requirement
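
The pattern behind this test can be sketched as follows. This is a minimal illustration, not the code in stream.rs: run-length encoding stands in for FlateDecode so the sketch needs no external crate, and the names `Bounded` and `decode_rle_bounded` are hypothetical.

```rust
/// Result of a bounded decode: the bytes kept and whether the limit cut it short.
/// (Hypothetical type, used only for this sketch.)
#[derive(Debug, PartialEq)]
pub struct Bounded {
    pub bytes: Vec<u8>,
    pub aborted: bool,
}

/// Decode `(count, byte)` pairs, stopping as soon as the next run would push
/// the output past `bomb_limit`. Run-length encoding is a stand-in for
/// FlateDecode here; a trailing unpaired byte is ignored.
pub fn decode_rle_bounded(input: &[u8], bomb_limit: usize) -> Bounded {
    // The buffer is never pre-sized to the expected output; it only grows
    // as bytes are actually produced.
    let mut bytes = Vec::new();
    for pair in input.chunks_exact(2) {
        let (count, byte) = (pair[0] as usize, pair[1]);
        // Invariant: bytes.len() <= bomb_limit, so this cannot underflow.
        let room = bomb_limit - bytes.len();
        if count > room {
            // Keep only what still fits, then abort without decoding the rest.
            bytes.extend(std::iter::repeat(byte).take(room));
            return Bounded { bytes, aborted: true };
        }
        bytes.extend(std::iter::repeat(byte).take(count));
    }
    Bounded { bytes, aborted: false }
}
```

An 8-byte payload of four `(255, 0)` pairs would expand to 1020 bytes; with a limit of 100 the decoder stops inside the first run, so the output never exceeds the limit.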

2. Fixed test_flate_decode_bomb_limit (line 2177)

Before: Created vec![0u8; 1MB] first - violates "never pre-size Vec"

After: Uses fixture file or minimal inline payload

  • Falls back to 200-byte pattern if fixture unavailable
  • Never creates multi-MB buffers
  • Bomb limit of 100 bytes forces early abort
  • Includes fixture loading logic for compression-bomb.bin
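
One way the fixture-or-fallback logic could look is sketched below. The `Payload` enum and `load_payload` function are hypothetical, and the repeated "AB" content of the 200-byte pattern is an assumption borrowed from the fixture description. The step that compresses the inline pattern is omitted, since it depends on the crate's Flate encoder.

```rust
use std::fs;
use std::path::Path;

/// Where the test input came from. (Hypothetical type for this sketch.)
#[derive(Debug, PartialEq)]
pub enum Payload {
    /// Already-compressed bytes read from the fixture file.
    Fixture(Vec<u8>),
    /// Uncompressed 200-byte pattern that the test would still need to compress.
    InlinePattern(Vec<u8>),
}

/// Read the fixture if it exists; otherwise fall back to a small inline pattern.
pub fn load_payload(fixture: &Path) -> Payload {
    match fs::read(fixture) {
        Ok(bytes) => Payload::Fixture(bytes),
        Err(_) => {
            // Assumed content: "AB" repeated to 200 bytes. Built from an
            // iterator, so no large buffer is ever allocated.
            let pattern: Vec<u8> = b"AB".iter().copied().cycle().take(200).collect();
            Payload::InlinePattern(pattern)
        }
    }
}
```

Returning an enum rather than a bare `Vec<u8>` keeps the two cases distinct, because the fixture is already compressed while the fallback pattern is not.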

3. Fixed test_document_level_bomb_limit (line 2227)

Before: Created vec![0u8; 500KB] for each stream

After: Uses 200-byte pattern

  • Total budget 150 bytes forces truncation on first stream
  • Never creates large buffers
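
A shared document-level budget might be modeled as in the sketch below. `DocBudget` and `decode_stream` are hypothetical names, and a lazy byte iterator stands in for a real stream decoder, so this only illustrates the accounting rather than the crate's actual API.

```rust
/// Decoded-byte budget shared by every stream in one document.
/// (Hypothetical type for this sketch.)
pub struct DocBudget {
    remaining: usize,
}

impl DocBudget {
    pub fn new(total: usize) -> Self {
        DocBudget { remaining: total }
    }

    /// Bytes the document may still produce.
    pub fn remaining(&self) -> usize {
        self.remaining
    }

    /// Pull bytes from a lazy decoder until it ends or the budget runs out.
    /// The iterator stands in for a streaming decoder: bytes past the budget
    /// are never produced, so they are never held in memory.
    pub fn decode_stream<I: Iterator<Item = u8>>(&mut self, decoder: I) -> Vec<u8> {
        let out: Vec<u8> = decoder.take(self.remaining).collect();
        self.remaining -= out.len();
        out
    }
}
```

With a total budget of 150 bytes and two streams that would each decode to 200 bytes, the first stream is truncated to 150 bytes and the second yields nothing.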

4. Fixed test_flate_decode_bomb_limit_with_predictor (line 2954)

Before: Created 6000-byte buffer with loop

After: Uses 150-byte pattern (25 rows × 6 bytes)

  • Bomb limit 50 bytes forces early abort
  • Verifies predictor doesn't bypass bomb checks
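
To illustrate why a predictor stage needs its own limit check, here is a hedged sketch that undoes the PNG "Up" filter row by row under a byte budget. It assumes each 6-byte row is one filter tag followed by 5 data bytes and that every tag is 2 (Up); the function name is hypothetical and the real test may use a different predictor.

```rust
/// Undo the PNG "Up" filter row by row, stopping before any row that would
/// push the output past `bomb_limit`. Returns the decoded bytes and whether
/// the limit cut decoding short. Each input row is `columns + 1` bytes:
/// one filter tag, then the data. This sketch assumes the tag is always 2.
pub fn undo_up_predictor_bounded(
    filtered: &[u8],
    columns: usize,
    bomb_limit: usize,
) -> (Vec<u8>, bool) {
    let mut out: Vec<u8> = Vec::new();
    // Only one row of state is kept, never the whole decoded image.
    let mut prev = vec![0u8; columns];
    for row in filtered.chunks_exact(columns + 1) {
        // Check the budget before producing the row, not after.
        if out.len() + columns > bomb_limit {
            return (out, true);
        }
        // row[0] is the filter tag; Up adds the byte directly above.
        for (above, &raw) in prev.iter_mut().zip(&row[1..]) {
            *above = above.wrapping_add(raw);
        }
        out.extend_from_slice(&prev);
    }
    (out, false)
}
```

For 25 rows of `[2, 1, 1, 1, 1, 1]` with a limit of 50, decoding stops after 10 rows (50 bytes) instead of producing all 125 bytes.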

5. Added test_th01_decompression_bomb_abort (line 2397)

New test implementing TH-01 from plan:

  • Uses compression-bomb.bin fixture (509 bytes → 500 KB, 982:1 ratio)
  • Bomb limit 100 KB forces abort before materializing full 500 KB
  • Critical assertions:
    • decoded.len() <= bomb_limit
    • decoded.len() < 400000 (not full output)
    • Clear failure messages if bomb check doesn't fire early
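
The two critical assertions could be expressed as a small checker like the one below. The function name and message wording are hypothetical; the threshold of 400000 comes from this note, while the exact byte value used for "100 KB" in the real test is not stated here and is left to the caller.

```rust
/// Check the two TH-01 conditions and describe the first one that fails.
/// (Hypothetical helper; the real test may assert inline instead.)
pub fn check_th01(
    decoded_len: usize,
    bomb_limit: usize,
    full_output_floor: usize,
) -> Result<(), String> {
    if decoded_len > bomb_limit {
        return Err(format!(
            "bomb limit not enforced: decoded {} bytes, limit is {}",
            decoded_len, bomb_limit
        ));
    }
    if decoded_len >= full_output_floor {
        return Err(format!(
            "full output materialized: decoded {} bytes, expected fewer than {}",
            decoded_len, full_output_floor
        ));
    }
    Ok(())
}
```

A test would then call something like `check_th01(decoded.len(), bomb_limit, 400_000).unwrap()`, so that a failure prints which condition was violated and by how much.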

6. Created fixture file

File: tests/fixtures/malformed/compression-bomb.bin

  • 509 bytes compressed → 500 KB decompressed
  • 982:1 compression ratio using repeated "AB" pattern
  • Created with Python script to avoid large buffers in Rust code
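
The stated ratio can be checked with a line of arithmetic, shown here as a tiny sketch. It reads "500 KB" as 500,000 bytes, which is the reading that reproduces 982:1; the function name is hypothetical.

```rust
/// Ratio of decoded size to compressed size.
pub fn compression_ratio(decoded_bytes: u64, compressed_bytes: u64) -> f64 {
    decoded_bytes as f64 / compressed_bytes as f64
}

/// The fixture's figures: 500,000 / 509 is about 982.3, which rounds to 982.
pub fn fixture_ratio_rounded() -> u64 {
    compression_ratio(500_000, 509).round() as u64
}
```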

Acceptance Criteria

| Criterion | Status | Notes |
| --- | --- | --- |
| STREAM_BOMB abort fires before materialization | PASS | All tests use small inputs with low bomb limits |
| Minimal crafted inputs (no multi-GB buffers) | PASS | Max buffer created is 200 bytes for patterns |
| Byte-budget limit fires early | PASS | Bomb limits set well below decoded sizes |
| Never pre-size Vec in tests | PASS | All tests use small patterns or fixtures |
| TH-01 bomb-abort test exists | PASS | New test using compression-bomb.bin fixture |

Test Results

All 13 bomb-related tests pass:

  • test_bomb_limit_flate
  • test_flate_decode_bomb_limit
  • test_document_level_bomb_limit
  • test_flate_decode_bomb_limit_with_predictor
  • test_th01_decompression_bomb_abort
  • test_lzw_bomb_limit
  • test_crypt_decode_bomb_limit
  • test_decompression_bomb_objstm
  • test_bomb_limit_enforcement
  • proptest_flate_decode_bomb_limit_no_panic
  • proptest_lzw_decode_bomb_limit_no_panic
  • proptest_crypt_decode_bomb_limit_no_panic
  • test_bomb_protection_detection

Verification

# Run all bomb tests
cargo test -p pdftract-core --lib bomb

# Run specific tests
cargo test -p pdftract-core --lib test_th01_decompression_bomb_abort
cargo test -p pdftract-core --lib test_bomb_limit_flate
cargo test -p pdftract-core --lib test_flate_decode_bomb_limit

Files Modified

  • crates/pdftract-core/src/parser/stream.rs - Fixed 4 tests, added 1 new test
  • tests/fixtures/malformed/compression-bomb.bin - New fixture file (509 bytes)

Key Implementation Notes

  1. Minimal inputs: All tests use small patterns (50-200 bytes) that compress well
  2. Early abort: Bomb limits set below the decoded size (from 1/5 of it in the TH-01 test up to 3/4 in the document-level test) to force truncation
  3. Fixture-based: TH-01 test uses pre-compressed fixture to avoid creating large buffers
  4. Clear assertions: Each test explicitly checks decoded.len() <= bomb_limit

References

  • Plan EC-10: FlateDecode bomb mitigation
  • Plan TH-01: Decompression bomb threat and test
  • Bead requirement: "Use minimal crafted inputs and assert the byte-budget limit fires early"