pdftract/crates
jedarden a11b24459a feat(pdftract-1g578): implement image-source dispatch for binarization selection
- Add ImageSource enum (PhysicalScan, DigitalOrigin, Jbig2)
- Add BinarizerKind enum (Sauvola, Otsu, Skip)
- Implement image_source_from_filters(): maps PDF filter chain to ImageSource
- Implement select_binarizer(): maps ImageSource to BinarizerKind
- Dispatch policy: DCTDecode → Sauvola, FlateDecode → Otsu, JBIG2Decode → Skip
- Unknown filter chains default to PhysicalScan (conservative)
- Pure functions, no I/O, fully unit-tested
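
A minimal sketch of the dispatch described above, not the code from this commit. The enum and function names come from the commit message. The signatures, the `&[&str]` representation of the filter chain, the intermediate mapping (DCTDecode → PhysicalScan, FlateDecode → DigitalOrigin), and the precedence when a chain holds several filters are all assumptions made for illustration.

```rust
/// Likely origin of an embedded page image, inferred from its PDF filter chain.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ImageSource {
    PhysicalScan,
    DigitalOrigin,
    Jbig2,
}

/// Binarization strategy to apply before OCR.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum BinarizerKind {
    Sauvola,
    Otsu,
    Skip,
}

/// Maps a PDF filter chain to an image source.
/// Assumption: filters are given as plain name strings, and when several
/// are present the order of precedence is JBIG2Decode, DCTDecode, FlateDecode.
pub fn image_source_from_filters(filters: &[&str]) -> ImageSource {
    if filters.contains(&"JBIG2Decode") {
        ImageSource::Jbig2
    } else if filters.contains(&"DCTDecode") {
        ImageSource::PhysicalScan
    } else if filters.contains(&"FlateDecode") {
        ImageSource::DigitalOrigin
    } else {
        // Unknown or empty chains fall back to the conservative default.
        ImageSource::PhysicalScan
    }
}

/// Maps an image source to the binarizer to run on it.
pub fn select_binarizer(source: ImageSource) -> BinarizerKind {
    match source {
        ImageSource::PhysicalScan => BinarizerKind::Sauvola,
        ImageSource::DigitalOrigin => BinarizerKind::Otsu,
        ImageSource::Jbig2 => BinarizerKind::Skip,
    }
}
```

Keeping the two steps separate means the filter-chain heuristics can change without touching the binarizer policy, and both functions stay testable without any PDF I/O.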

Acceptance criteria:
- DCTDecode → Sauvola 
- FlateDecode → Otsu 
- JBIG2Decode → Skip 
- Unknown → PhysicalScan (default) 
- Pure dispatch, fully tested 
- Wired into preprocessing coordinator 
2026-05-31 23:54:26 -04:00
pdftract-cer-diff docs(pdftract-aawrz): add LICENSE-MIT and LICENSE-APACHE files 2026-05-23 10:36:28 -04:00
pdftract-cli feat(pdftract-21wci): integrate OCR regions renderer into inspector API 2026-05-31 23:54:14 -04:00
pdftract-core feat(pdftract-1g578): implement image-source dispatch for binarization selection 2026-05-31 23:54:26 -04:00
pdftract-libpdftract feat(pdftract-3s2i): implement Phase 5.5.2 validation filter 2026-05-24 04:57:17 -04:00
pdftract-py fix(pdftract-2uk9z): wrap native module results in typed Python objects 2026-05-28 21:18:38 -04:00