Commit graph

2 commits

Author SHA1 Message Date
jedarden
49859e176f docs(pdftract-1f8we): verify ConfidenceSource enum and mapping implementation
Verified that ConfidenceSource enum and map_confidence_source function
are already fully implemented in crates/pdftract-core/src/confidence.rs.

All acceptance criteria PASS:
- Single-glyph to_unicode → Native
- Single-glyph shape_match → Heuristic
- Mixed-glyph (agl + shape_match) → Heuristic (worst)
- 4.7 correction on all-agl → Heuristic (override)
- OCR-produced span → Ocr
- JSON serialization lowercase

No code changes required - implementation was already complete.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-28 01:10:16 -04:00
jedarden
5a7c25ead4 feat(pdftract-1f8we): add map_confidence_source to public API, remove duplicate from span module
- Add map_confidence_source to confidence module re-exports in lib.rs
- Remove duplicate map_confidence_source function from span/mod.rs
- Add Ocr case to map_unicode_source_to_confidence helper
- Add comprehensive tests for map_confidence_source in span module

The ConfidenceSource enum and map_confidence_source function were already
implemented in the confidence module from bead pdftract-2etcd. This change
completes the public API exposure and removes the duplicate implementation.

Acceptance criteria (all PASS):
- Single-glyph to_unicode span: confidence_source == Native
- Single-glyph shape_match span: confidence_source == Heuristic
- Mixed-glyph span (agl + shape_match): confidence_source == Heuristic
- 4.7 correction applied: Native -> Heuristic override
- OCR span: confidence_source == Ocr
- JSON serialization: lowercase strings
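
The criteria above can be illustrated with a minimal sketch. It is not the code in crates/pdftract-core/src/confidence.rs: the `UnicodeSource` enum, its variant names, the `glyph_confidence` helper, the `correction_applied` flag, the `as_json_str` method, the function signature, the ranking of Ocr as the worst source, and the result for an empty span are all assumptions made for illustration. Only the outcomes listed in the criteria come from the commit message.

```rust
// Hypothetical per-glyph origin of a Unicode mapping (names assumed).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum UnicodeSource {
    ToUnicode,
    Agl,
    ShapeMatch,
    Ocr,
}

// Variants are declared best-to-worst so the derived `Ord` lets `max()`
// pick the worst source in a span. Ranking Ocr last is an assumption.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum ConfidenceSource {
    Native,
    Heuristic,
    Ocr,
}

impl ConfidenceSource {
    // Lowercase name, matching the "JSON serialization: lowercase strings"
    // criterion. The real crate may derive this differently.
    fn as_json_str(self) -> &'static str {
        match self {
            ConfidenceSource::Native => "native",
            ConfidenceSource::Heuristic => "heuristic",
            ConfidenceSource::Ocr => "ocr",
        }
    }
}

// Hypothetical helper: confidence of a single glyph.
fn glyph_confidence(source: UnicodeSource) -> ConfidenceSource {
    match source {
        UnicodeSource::ToUnicode | UnicodeSource::Agl => ConfidenceSource::Native,
        UnicodeSource::ShapeMatch => ConfidenceSource::Heuristic,
        UnicodeSource::Ocr => ConfidenceSource::Ocr,
    }
}

// Sketch of the span-level mapping: take the worst glyph source, then
// downgrade Native to Heuristic if a correction was applied.
// An empty span yields Native here, which is an assumption.
fn map_confidence_source(glyphs: &[UnicodeSource], correction_applied: bool) -> ConfidenceSource {
    let worst = glyphs
        .iter()
        .map(|g| glyph_confidence(*g))
        .max()
        .unwrap_or(ConfidenceSource::Native);

    if correction_applied && worst == ConfidenceSource::Native {
        ConfidenceSource::Heuristic
    } else {
        worst
    }
}
```

Under these assumptions, `map_confidence_source(&[UnicodeSource::Agl, UnicodeSource::ShapeMatch], false)` yields `ConfidenceSource::Heuristic`, and an all-agl span with `correction_applied` set to `true` is downgraded from Native to Heuristic.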

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-28 01:06:02 -04:00