pdftract/crates/pdftract-core/src
jedarden 566cac2aea feat(pdftract-28m6): implement AGL compile-time phf::Map
Add Adobe Glyph List (AGL) 1.4 and AGLFN 1.7 compile-time lookup using phf::Map.

- Add generate_agl.py to parse AGL source files and generate agl.json
- Add aglfn.txt (AGLFN 1.7, ~770 entries) and glyphlist.txt (AGL 1.4, ~4400 entries)
- Add build.rs function to generate two phf::Map structures:
  - AGL: 4,200 single-codepoint entries
  - AGL_MULTI: 81 multi-codepoint entries (Hebrew/Arabic)
- Add src/font/agl.rs with public API:
  - unicode_for_glyph_name() - handles algorithmic patterns (uniXXXX, uXXXXXX), variant stripping, AGL lookup
  - unicode_for_glyph_name_multi() - for multi-codepoint ligatures
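
The lookup order described for unicode_for_glyph_name() can be sketched roughly as below. This is an illustration only, not the code in src/font/agl.rs: the function names `unicode_for_glyph_name_sketch`, `agl_lookup`, and `parse_hex_scalar` are hypothetical, the four-entry `match` stands in for the generated phf::Map, and the ordering (strip the variant suffix, try the table, then try `uniXXXX` and `uXXXXXX`) is assumed from the Adobe Glyph List specification rather than confirmed by this commit. Multi-codepoint names and underscore-joined ligature names are left out.

```rust
/// Hypothetical stand-in for the generated AGL table.
/// Assumption: the real table is a phf::Map emitted by build.rs.
fn agl_lookup(name: &str) -> Option<char> {
    match name {
        "A" => Some('A'),
        "space" => Some(' '),
        "eacute" => Some('\u{00E9}'),
        "Euro" => Some('\u{20AC}'),
        _ => None,
    }
}

/// Parse uppercase hex digits into a Unicode scalar value.
/// Lowercase digits are rejected, following the AGL specification.
fn parse_hex_scalar(hex: &str) -> Option<char> {
    if hex.is_empty() || !hex.bytes().all(|b| matches!(b, b'0'..=b'9' | b'A'..=b'F')) {
        return None;
    }
    let value = u32::from_str_radix(hex, 16).ok()?;
    // char::from_u32 returns None for surrogates and values above 0x10FFFF.
    char::from_u32(value)
}

/// Sketch of a single-codepoint glyph-name lookup.
pub fn unicode_for_glyph_name_sketch(name: &str) -> Option<char> {
    // Variant stripping: keep only the part before the first '.'.
    let base = name.split('.').next().unwrap_or(name);
    if base.is_empty() {
        return None; // e.g. ".notdef"
    }

    // 1. Direct table lookup.
    if let Some(c) = agl_lookup(base) {
        return Some(c);
    }

    // 2. "uni" followed by exactly four hex digits.
    if let Some(hex) = base.strip_prefix("uni") {
        if hex.len() == 4 {
            return parse_hex_scalar(hex);
        }
    }

    // 3. "u" followed by four to six hex digits.
    if let Some(hex) = base.strip_prefix('u') {
        if (4..=6).contains(&hex.len()) {
            return parse_hex_scalar(hex);
        }
    }

    None
}
```

With this sketch, a name such as `eacute.sc` resolves through the table after the `.sc` suffix is dropped, while `uni20AC` and `u1F600` resolve algorithmically without touching the table.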

All 21 acceptance criteria tests pass.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-23 18:44:47 -04:00
cache feat(pdftract-2i6rt): implement cache CLI subcommand and HTTP integration 2026-05-23 06:33:43 -04:00
fingerprint feat(pdftract-154mz): fix canonicalization module compilation 2026-05-20 19:24:38 -04:00
font feat(pdftract-28m6): implement AGL compile-time phf::Map 2026-05-23 18:44:47 -04:00
parser test(pdftract-57o4): add ParentTree integration tests for annotation and sparse arrays 2026-05-23 18:36:09 -04:00
receipts feat(pdftract-36wlt): implement verify-receipt subcommand + verifier protocol 2026-05-23 04:00:15 -04:00
render feat(pdftract-4my): implement pdfium-render path behind full-render feature 2026-05-23 16:28:08 -04:00
schema feat(pdftract-sg6): implement DPI selection logic for OCR rendering 2026-05-23 17:37:40 -04:00
classify.rs feat(pdftract-2zw): page classification fixtures + integration tests + reproducibility gate 2026-05-23 15:04:05 -04:00
diagnostics.rs feat(pdftract-5nbp): implement /Differences overlay handler for font encodings 2026-05-23 18:09:46 -04:00
document.rs feat(pdftract-bf-2y2rp): implement lazy stream decoding for PDF extraction 2026-05-23 12:30:26 -04:00
dpi.rs feat(pdftract-sg6): implement DPI selection logic for OCR rendering 2026-05-23 17:37:40 -04:00
extract.rs feat(pdftract-bf-2y2rp): implement lazy stream decoding for PDF extraction 2026-05-23 12:30:26 -04:00
graphics_state.rs feat(pdftract-byq): implement direct image compositing path (Phase 5.2.1) 2026-05-23 15:46:38 -04:00
hybrid.rs feat(pdftract-4y9l): implement hybrid page routing with bbox merge rule 2026-05-23 17:48:00 -04:00
lib.rs feat(pdftract-sg6): implement DPI selection logic for OCR rendering 2026-05-23 17:37:40 -04:00
options.rs feat(pdftract-sg6): implement DPI selection logic for OCR rendering 2026-05-23 17:37:40 -04:00
render.rs feat(pdftract-4my): implement pdfium-render path behind full-render feature 2026-05-23 16:28:08 -04:00
semaphore.rs fix(pdftract-bf-5mry9): fix compilation bugs in rayon parallel extraction 2026-05-23 12:02:54 -04:00