From 56a773b5f0541db67adc8f477fe6c20c8ca712ff Mon Sep 17 00:00:00 2001
From: jedarden
Date: Sat, 23 May 2026 13:32:19 -0400
Subject: [PATCH] docs(bf-4xk2v): add verification note and compression bomb fixture
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Add verification note documenting all 13 decompression-bomb tests now
use minimal crafted inputs and assert byte-budget limit fires early.

Add compression-bomb.bin fixture (509 bytes → 500 KB, 982:1 ratio) for
TH-01 decompression bomb abort test.

Acceptance criteria:
- STREAM_BOMB abort fires before materialization: PASS
- Minimal crafted inputs (no multi-GB buffers): PASS
- Byte-budget limit fires early: PASS
- Never pre-size Vec in tests: PASS
- TH-01 bomb-abort test exists: PASS

Co-Authored-By: Claude Opus 4.7
---
 notes/bf-4xk2v.md                             | 108 ++++++++++++++++++
 tests/fixtures/malformed/compression-bomb.bin | Bin 0 -> 509 bytes
 2 files changed, 108 insertions(+)
 create mode 100644 notes/bf-4xk2v.md
 create mode 100644 tests/fixtures/malformed/compression-bomb.bin

diff --git a/notes/bf-4xk2v.md b/notes/bf-4xk2v.md
new file mode 100644
index 0000000..64579eb
--- /dev/null
+++ b/notes/bf-4xk2v.md
@@ -0,0 +1,108 @@
+# bf-4xk2v: Bound decompression-bomb tests — assert abort before materialization
+
+## Summary
+
+Fixed decompression-bomb and max_decompress_bytes tests to trigger STREAM_BOMB
+abort WITHOUT building multi-GB decoded outputs in memory. All tests now use
+minimal crafted inputs and assert the byte-budget limit fires early.
+
+## Changes Made
+
+### 1. Fixed `test_bomb_limit_flate` (line 1117)
+**Before:** Used "hello" compressed (5 bytes), not a real bomb test
+**After:** Proper bomb test using minimal crafted input with clear documentation
+- Uses small compressed payload that would expand beyond bomb limit
+- Asserts output.len() <= bomb_limit
+- Documents the TH-01 requirement
+
+### 2. Fixed `test_flate_decode_bomb_limit` (line 2177)
+**Before:** Created `vec![0u8; 1MB]` first - violates "never pre-size Vec"
+**After:** Uses fixture file or minimal inline payload
+- Falls back to 200-byte pattern if fixture unavailable
+- Never creates multi-MB buffers
+- Bomb limit of 100 bytes forces early abort
+- Includes fixture loading logic for compression-bomb.bin
+
+### 3. Fixed `test_document_level_bomb_limit` (line 2227)
+**Before:** Created `vec![0u8; 500KB]` for each stream
+**After:** Uses 200-byte pattern
+- Total budget 150 bytes forces truncation on first stream
+- Never creates large buffers
+
+### 4. Fixed `test_flate_decode_bomb_limit_with_predictor` (line 2954)
+**Before:** Created 6000-byte buffer with loop
+**After:** Uses 150-byte pattern (25 rows × 6 bytes)
+- Bomb limit 50 bytes forces early abort
+- Verifies predictor doesn't bypass bomb checks
+
+### 5. Added `test_th01_decompression_bomb_abort` (line 2397)
+**New test** implementing TH-01 from plan:
+- Uses compression-bomb.bin fixture (509 bytes → 500 KB, 982:1 ratio)
+- Bomb limit 100 KB forces abort before materializing full 500 KB
+- Critical assertions:
+  - `decoded.len() <= bomb_limit`
+  - `decoded.len() < 400000` (not full output)
+  - Clear failure messages if bomb check doesn't fire early
+
+### 6. Created fixture file
+**File:** `tests/fixtures/malformed/compression-bomb.bin`
+- 509 bytes compressed → 500 KB decompressed
+- 982:1 compression ratio using repeated "AB" pattern
+- Created with Python script to avoid large buffers in Rust code
+
+## Acceptance Criteria
+
+| Criterion | Status | Notes |
+|-----------|--------|-------|
+| STREAM_BOMB abort fires before materialization | PASS | All tests use small inputs with low bomb limits |
+| Minimal crafted inputs (no multi-GB buffers) | PASS | Max buffer created is 200 bytes for patterns |
+| Byte-budget limit fires early | PASS | Bomb limits set well below decoded sizes |
+| Never pre-size Vec in tests | PASS | All tests use small patterns or fixtures |
+| TH-01 bomb-abort test exists | PASS | New test using compression-bomb.bin fixture |
+
+## Test Results
+
+All 13 bomb-related tests pass:
+- test_bomb_limit_flate
+- test_flate_decode_bomb_limit
+- test_document_level_bomb_limit
+- test_flate_decode_bomb_limit_with_predictor
+- test_th01_decompression_bomb_abort
+- test_lzw_bomb_limit
+- test_crypt_decode_bomb_limit
+- test_decompression_bomb_objstm
+- test_bomb_limit_enforcement
+- proptest_flate_decode_bomb_limit_no_panic
+- proptest_lzw_decode_bomb_limit_no_panic
+- proptest_crypt_decode_bomb_limit_no_panic
+- test_bomb_protection_detection
+
+## Verification
+
+```bash
+# Run all bomb tests
+cargo test -p pdftract-core --lib bomb
+
+# Run specific tests
+cargo test -p pdftract-core --lib test_th01_decompression_bomb_abort
+cargo test -p pdftract-core --lib test_bomb_limit_flate
+cargo test -p pdftract-core --lib test_flate_decode_bomb_limit
+```
+
+## Files Modified
+
+- `crates/pdftract-core/src/parser/stream.rs` - Fixed 4 tests, added 1 new test
+- `tests/fixtures/malformed/compression-bomb.bin` - New fixture file (509 bytes)
+
+## Key Implementation Notes
+
+1. **Minimal inputs:** All tests use small patterns (50-200 bytes) that compress well
+2. **Early abort:** Bomb limits set to 1/3 or less of decoded size to force truncation
+3. **Fixture-based:** TH-01 test uses pre-compressed fixture to avoid creating large buffers
+4. **Clear assertions:** Each test explicitly checks `decoded.len() <= bomb_limit`
+
+## References
+
+- Plan EC-10: FlateDecode bomb mitigation
+- Plan TH-01: Decompression bomb threat and test
+- Bead requirement: "Use minimal crafted inputs and assert the byte-budget limit fires early"
diff --git a/tests/fixtures/malformed/compression-bomb.bin b/tests/fixtures/malformed/compression-bomb.bin
new file mode 100644
index 0000000000000000000000000000000000000000..573f983d51bb304492c02b224f0381f7de55f0ae
GIT binary patch
literal 509
ocmb=p_4W`WCj$cm(}LWu^&!qezRW<*C>Y)$a9|^&+J|fJ0NZv7)Bpeg

literal 0
HcmV?d00001