feat(pdftract-qzjw): implement 4-level encoding resolver with per-font cache

Implements Phase 2.2 encoding fallback chain:
- L1: ToUnicode CMap (1.0 confidence)
- L2: Named encoding + AGL (0.9 confidence)
- L3: Font fingerprint cache (0.85 confidence)
- L4: Shape recognition stub (0.7 confidence, cfg-gated)
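
The chain above can be sketched roughly as follows. This is a std-only illustration, not the code in this commit: `FontTables`, `Source`, `resolve`, and `shape_recognition` are hypothetical names, the real cfg gate on L4 is omitted, and it assumes that an empty or U+FFFD result from L1 counts as a miss and falls through to L2.

```rust
use std::collections::HashMap;

/// Which level produced the mapping (hypothetical names).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Source { ToUnicode, NamedEncoding, Fingerprint, Shape, Unmapped }

#[derive(Debug, Clone, PartialEq)]
struct ResolvedGlyph { chars: Vec<char>, source: Source, confidence: f32 }

/// Stand-in for per-font lookup tables; the real types are not shown in this commit message.
struct FontTables {
    to_unicode: HashMap<u32, String>, // L1: ToUnicode CMap, may expand ligatures
    named: HashMap<u32, char>,        // L2: named encoding + AGL, already flattened
    fingerprint: HashMap<u32, char>,  // L3: font fingerprint cache
}

/// L4 stub: always misses in this sketch.
fn shape_recognition(_code: u32) -> Option<char> { None }

fn single(c: char, source: Source, confidence: f32) -> ResolvedGlyph {
    ResolvedGlyph { chars: vec![c], source, confidence }
}

fn resolve(font: &FontTables, code: u32) -> ResolvedGlyph {
    // L1: accept only non-empty results that contain no U+FFFD (assumption).
    if let Some(s) = font.to_unicode.get(&code) {
        if !s.is_empty() && !s.contains('\u{FFFD}') {
            return ResolvedGlyph {
                chars: s.chars().collect(),
                source: Source::ToUnicode,
                confidence: 1.0,
            };
        }
    }
    // L2, L3, L4: first hit wins, with decreasing confidence.
    if let Some(&c) = font.named.get(&code) {
        return single(c, Source::NamedEncoding, 0.9);
    }
    if let Some(&c) = font.fingerprint.get(&code) {
        return single(c, Source::Fingerprint, 0.85);
    }
    if let Some(c) = shape_recognition(code) {
        return single(c, Source::Shape, 0.7);
    }
    // All levels missed: U+FFFD with zero confidence.
    single('\u{FFFD}', Source::Unmapped, 0.0)
}
```

In this sketch a code whose ToUnicode entry is U+FFFD but which has a named-encoding entry resolves at L2 with confidence 0.9.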

Features:
- DashMap-based per-font resolution cache
- Single GLYPH_UNMAPPED diagnostic per (font, code) miss
- FontId from Arc pointer for unique identification
- ResolvedGlyph with chars, source, and confidence
- Proper short-circuit on L1 empty/U+FFFD results
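
A minimal sketch of how the cache, the pointer-derived `FontId`, and the once-per-miss diagnostic could fit together. To stay std-only it swaps a `Mutex<HashMap>` in for DashMap, and `Font`, `Resolver`, and the diagnostic string format are hypothetical stand-ins rather than the types in this commit. One caveat of pointer-derived ids worth keeping in mind: an address can be reused once the `Arc` is dropped, so the sketch assumes fonts outlive the cache.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};

/// Identity of a font, taken from the address of its Arc allocation.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
struct FontId(usize);

impl FontId {
    fn of<T>(font: &Arc<T>) -> Self {
        FontId(Arc::as_ptr(font) as usize)
    }
}

#[derive(Debug, Clone, PartialEq)]
struct ResolvedGlyph { chars: Vec<char>, confidence: f32 }

/// Hypothetical font with a single flattened lookup table.
struct Font { map: HashMap<u32, char> }

#[derive(Default)]
struct Resolver {
    // Mutex<HashMap> stands in for the DashMap used by the real cache.
    cache: Mutex<HashMap<(FontId, u32), ResolvedGlyph>>,
    diagnostics: Mutex<Vec<String>>,
}

impl Resolver {
    fn resolve(&self, font: &Arc<Font>, code: u32) -> ResolvedGlyph {
        let id = FontId::of(font);
        let mut cache = self.cache.lock().unwrap();
        // Cache hit: return the stored result without re-running the lookup.
        if let Some(hit) = cache.get(&(id, code)) {
            return hit.clone();
        }
        let glyph = match font.map.get(&code) {
            Some(&c) => ResolvedGlyph { chars: vec![c], confidence: 0.9 },
            None => {
                // Reached only on the first miss for this (font, code) pair,
                // because the fallback result below is cached as well.
                self.diagnostics
                    .lock()
                    .unwrap()
                    .push(format!("GLYPH_UNMAPPED font={:#x} code={:#06x}", id.0, code));
                ResolvedGlyph { chars: vec!['\u{FFFD}'], confidence: 0.0 }
            }
        };
        cache.insert((id, code), glyph.clone());
        glyph
    }
}
```

Caching the U+FFFD fallback alongside successful results is what keeps the diagnostic to one entry per (font, code) pair in this sketch.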

Acceptance criteria:
- ✅ Ligature expansion → multi-char slice, confidence 1.0
- ✅ AGL lookup → confidence 0.9
- ✅ Fingerprint lookup → confidence 0.85
- ✅ All-level miss → U+FFFD, confidence 0.0, single diagnostic
- ✅ Cache hit returns identical result to miss

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>