pdftract/crates
jedarden 4ac8479ad9 test(pdftract-1sxpa): complete inline image header parser implementation
- Implement recover_to_next_key function with byte-by-byte scanning
  for '/' and 'ID' keywords to enable error recovery in malformed headers
- Fix test assertion: StructInvalidDictValue -> StructInvalidType
- Fix ID whitespace validation test input (IDEI -> ID)
- Fix markdown.rs test calls to include tables parameter
- Add book_chapter fixture provenance entries
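
The byte-by-byte recovery scan mentioned for recover_to_next_key can be sketched roughly as follows. This is an illustration, not the code in pdftract-core: the signature, the `Resume` enum, and the `is_pdf_whitespace` helper are assumptions, and the rule that `ID` must start a token is a simplification chosen for the sketch.

```rust
/// Where parsing can resume after a malformed entry (hypothetical type).
#[derive(Debug, PartialEq)]
enum Resume {
    /// Offset of the next `/`, i.e. the start of the next name key.
    Key(usize),
    /// Offset of the `ID` keyword that ends the header.
    Id(usize),
}

/// PDF whitespace bytes: NUL, TAB, LF, FF, CR, SPACE.
fn is_pdf_whitespace(b: u8) -> bool {
    matches!(b, 0x00 | 0x09 | 0x0A | 0x0C | 0x0D | 0x20)
}

/// Scan forward one byte at a time from `start` until a `/` or a
/// token-initial `ID` is found. Returns None if neither appears.
/// Assumed signature; the real function may differ.
fn recover_to_next_key(buf: &[u8], start: usize) -> Option<Resume> {
    for i in start..buf.len() {
        if buf[i] == b'/' {
            return Some(Resume::Key(i));
        }
        // Only accept `ID` at the start of a token, so that an `ID`
        // embedded in a longer word is not mistaken for the keyword.
        let starts_token = i == 0 || is_pdf_whitespace(buf[i - 1]);
        if starts_token && buf[i..].starts_with(b"ID") {
            return Some(Resume::Id(i));
        }
    }
    None
}
```

With this sketch, a header such as `BI /W /H 10 ID ` (missing the value for /W) resumes at the `/` of /H when scanning starts just after /W.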

All 14 inline_image tests pass, covering:
- Basic header parsing with shorthand key expansion
- Array filter chains
- ID whitespace validation
- Malformed header recovery

Acceptance criteria:
- PASS: BI /W 10 /H 10 /CS /DeviceGray /BPC 8 /F /ASCIIHexDecode ID parses
- PASS: Shorthand expansion (/W -> /Width) yields width == 10
- PASS: Array filter /F [/ASCII85Decode /FlateDecode] parses
- PASS: ID without trailing whitespace emits diagnostic
- PASS: Malformed header (missing value) emits diagnostic and recovers
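
For reference, the shorthand expansion exercised by these criteria can be sketched as a simple lookup. The abbreviations are the inline image dictionary keys defined in the PDF specification; the function name `expand_key` is hypothetical and is not taken from pdftract-core.

```rust
/// Expand an abbreviated inline image dictionary key (without the
/// leading `/`) to its full name. Unknown keys are returned unchanged.
/// Hypothetical helper for illustration only.
fn expand_key(key: &str) -> &str {
    match key {
        "BPC" => "BitsPerComponent",
        "CS" => "ColorSpace",
        "D" => "Decode",
        "DP" => "DecodeParms",
        "F" => "Filter",
        "H" => "Height",
        "IM" => "ImageMask",
        "I" => "Interpolate",
        "W" => "Width",
        // Already a full name, or not a recognized abbreviation.
        other => other,
    }
}
```

Under this sketch, the header `BI /W 10 /H 10 /CS /DeviceGray /BPC 8 /F /ASCIIHexDecode ID` would yield the keys Width, Height, ColorSpace, BitsPerComponent, and Filter.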

Co-Authored-By: Claude Code <noreply@anthropic.com>
2026-05-27 22:18:09 -04:00
pdftract-cer-diff docs(pdftract-aawrz): add LICENSE-MIT and LICENSE-APACHE files 2026-05-23 10:36:28 -04:00
pdftract-cli feat(pdftract-260a3): implement legal_filing profile with fixtures and tests 2026-05-27 21:44:49 -04:00
pdftract-core test(pdftract-1sxpa): complete inline image header parser implementation 2026-05-27 22:18:09 -04:00
pdftract-libpdftract feat(pdftract-3s2i): implement Phase 5.5.2 validation filter 2026-05-24 04:57:17 -04:00
pdftract-py feat(pdftract-1tswa): implement GIL release with py.allow_threads on extraction entry points 2026-05-26 21:23:00 -04:00