- Add OcrFallback variant to SpanSource enum for fallback spans
- Add page_seg_mode field to TessOpts for PSM_SPARSE_TEXT support
- Add ASSISTED_OCR_KEEP_THRESH (0.7) and ASSISTED_OCR_FALLBACK_THRESH (0.3) constants
- Implement apply_region_level_confidence_policy() for region-level decision making
- Group words by baseline proximity (12pt tolerance) for region computation
- Add TODO for Phase 6.1 confidence_source enum to include "ocr-fallback"

Closes: pdftract-29gu

||
|---|---|---|
| pdftract-cer-diff | ||
| pdftract-cli | ||
| pdftract-core | ||
| pdftract-libpdftract | ||
| pdftract-py | ||