pdftract/crates/pdftract-core/src/font
jedarden 1dfaf73aa4
feat(pdftract-3g6ne): implement CMap codespace range parser
This commit adds the codespace range parser for CMap streams. The parser
extracts the begincodespacerange / endcodespacerange blocks that define
legal byte-width boundaries for character codes in a CMap.

## Implementation

- CodespaceRange: Single range with lo/hi bounds (stored as [u8; 4]) and width (1-4 bytes)
- CodespaceRanges: Collection with SmallVec<[CodespaceRange; 8]>
- CodespaceParser: PostScript-style tokenizer for begincodespacerange blocks
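
The range type described above can be sketched roughly as follows. This is an illustration only, not the code in codespace.rs: the field names, the `new` constructor, and the `width` and `contains` methods are hypothetical, and the per-byte containment check reflects the usual reading of CMap codespace ranges (each byte of a code must fall within the corresponding lo/hi byte), which the commit message does not spell out.

```rust
/// Hypothetical sketch of a single codespace range.
/// Bounds are stored left-aligned in fixed 4-byte buffers; `width` says
/// how many of those bytes are significant (1 to 4).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct CodespaceRange {
    lo: [u8; 4],
    hi: [u8; 4],
    width: u8,
}

impl CodespaceRange {
    /// Build a range from decoded bounds. Returns `None` when the widths
    /// disagree or fall outside 1..=4 bytes.
    pub fn new(lo: &[u8], hi: &[u8]) -> Option<Self> {
        let width = lo.len();
        if width == 0 || width > 4 || hi.len() != width {
            return None;
        }
        let mut lo_buf = [0u8; 4];
        let mut hi_buf = [0u8; 4];
        lo_buf[..width].copy_from_slice(lo);
        hi_buf[..width].copy_from_slice(hi);
        Some(Self { lo: lo_buf, hi: hi_buf, width: width as u8 })
    }

    /// Number of bytes in every code belonging to this range.
    pub fn width(&self) -> usize {
        self.width as usize
    }

    /// Assumed matching rule: the code has exactly `width` bytes and each
    /// byte lies between the corresponding bytes of `lo` and `hi`.
    pub fn contains(&self, code: &[u8]) -> bool {
        let w = self.width as usize;
        code.len() == w
            && code
                .iter()
                .zip(self.lo[..w].iter().zip(&self.hi[..w]))
                .all(|(b, (lo, hi))| lo <= b && b <= hi)
    }
}
```

With this sketch, a range built from `[0x81, 0x40]` and `[0xFE, 0xFE]` has width 2, contains `[0x81, 0x40]`, and rejects `[0x81, 0x3F]` because the second byte is below its lower bound. The collection type would wrap a list of these; the commit uses SmallVec for that, which is omitted here to keep the sketch free of external crates.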

## Acceptance Criteria (all PASS)

- Parse <00> <7F> → 1 range, width=1
- Parse <00> <7F> <8000> <FFFF> in one block → 2 ranges
- Width inference: 2-char hex → width=1; 4-char hex → width=2
- Case-insensitive hex (<C0> and <c0> are equivalent)
- Malformed range (lo/hi width mismatch) → diagnostic emitted, range skipped
- Empty CMap → empty ranges
- JIS range <8140> <FEFE> → 2-byte CJK
- 3-byte and 4-byte range support
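
The criteria above can be exercised with a minimal parsing sketch like the one below. It is not the CodespaceParser from this commit: `decode_hex` and `parse_codespace_ranges` are hypothetical names, ranges are returned as plain byte-vector pairs, diagnostics are plain strings, and the scanning is deliberately simplified (no PostScript comments, no whitespace inside hex strings).

```rust
/// Decode the contents of a hex string token such as "8140" into bytes.
/// Returns `None` for empty, odd-length, over-long (more than 4 bytes),
/// or non-hex input. Hex digits are accepted in either case.
fn decode_hex(s: &str) -> Option<Vec<u8>> {
    if s.is_empty()
        || s.len() % 2 != 0
        || s.len() > 8
        || !s.bytes().all(|b| b.is_ascii_hexdigit())
    {
        return None;
    }
    (0..s.len())
        .step_by(2)
        .map(|i| u8::from_str_radix(&s[i..i + 2], 16).ok())
        .collect()
}

/// Hypothetical entry point: scan every begincodespacerange block and
/// return the (lo, hi) pairs found, plus one diagnostic per skipped pair.
pub fn parse_codespace_ranges(src: &str) -> (Vec<(Vec<u8>, Vec<u8>)>, Vec<String>) {
    let mut ranges = Vec::new();
    let mut diags = Vec::new();
    let mut rest = src;
    while let Some(start) = rest.find("begincodespacerange") {
        let body = &rest[start + "begincodespacerange".len()..];
        // An unterminated block runs to the end of the input.
        let end = body.find("endcodespacerange").unwrap_or(body.len());
        let block = &body[..end];

        // Collect the text between each '<' and the next '>'.
        let mut tokens = Vec::new();
        let mut cursor = block;
        while let Some(open) = cursor.find('<') {
            let after = &cursor[open + 1..];
            match after.find('>') {
                Some(close) => {
                    tokens.push(&after[..close]);
                    cursor = &after[close + 1..];
                }
                None => break,
            }
        }

        // Tokens come in lo/hi pairs; the byte width is half the hex length.
        for pair in tokens.chunks(2) {
            match pair {
                [lo, hi] => match (decode_hex(lo), decode_hex(hi)) {
                    (Some(lo), Some(hi)) if lo.len() == hi.len() => ranges.push((lo, hi)),
                    _ => diags.push(format!("skipping malformed range <{}> <{}>", lo, hi)),
                },
                _ => diags.push("skipping dangling codespace bound".into()),
            }
        }
        rest = &body[end..];
    }
    (ranges, diags)
}
```

Under these assumptions, an input of `begincodespacerange <00> <7F> <8000> <FFFF> endcodespacerange` yields two ranges of widths 1 and 2 with no diagnostics, while `<00> <FFFF>` yields no ranges and a single diagnostic. Returning diagnostics alongside the ranges, rather than failing outright, mirrors the "diagnostic + skip" behaviour listed in the criteria.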

Also adds encrypted fixture provenance entries to PROVENANCE.md.

Co-Authored-By: Claude Code <noreply@anthropic.com>
2026-05-28 05:47:07 -04:00
agl.rs feat(pdftract-3s2i): implement Phase 5.5.2 validation filter 2026-05-24 04:57:17 -04:00
cjk_encoding.rs feat(pdftract-3s2i): implement Phase 5.5.2 validation filter 2026-05-24 04:57:17 -04:00
cmap.rs feat(pdftract-3s2i): implement Phase 5.5.2 validation filter 2026-05-24 04:57:17 -04:00
codespace.rs feat(pdftract-3g6ne): implement CMap codespace range parser 2026-05-28 05:47:07 -04:00
embedded.rs feat(pdftract-3s2i): implement Phase 5.5.2 validation filter 2026-05-24 04:57:17 -04:00
encoding.rs feat(pdftract-3s2i): implement Phase 5.5.2 validation filter 2026-05-24 04:57:17 -04:00
fingerprint.rs feat(pdftract-3s2i): implement Phase 5.5.2 validation filter 2026-05-24 04:57:17 -04:00
mod.rs feat(pdftract-1uj5): implement Type 3 font encoding resolution 2026-05-24 04:28:11 -04:00
predefined_cmap.rs feat(pdftract-3s2i): implement Phase 5.5.2 validation filter 2026-05-24 04:57:17 -04:00
resolver.rs fix(pdftract-63ka2): AES-128 test buffer allocation for PKCS#7 padding 2026-05-28 01:30:33 -04:00
shape.rs feat(pdftract-2iur): implement nearest-neighbor scanner with Hamming distance and frequency tie-break 2026-05-24 06:57:27 -04:00
std14.rs feat(pdftract-3dwu): implement named encoding tables 2026-05-23 18:00:05 -04:00
type0.rs feat(pdftract-3s2i): implement Phase 5.5.2 validation filter 2026-05-24 04:57:17 -04:00
type3.rs feat(pdftract-1uj5): implement Type 3 font encoding resolution 2026-05-24 04:28:11 -04:00
type3_rasterizer.rs feat(pdftract-p7yll): implement cm operator diagnostics 2026-05-24 04:13:16 -04:00