Two new research documents covering the glyph-to-span-to-block assembly
pipeline (inter-operator merging, adaptive word gap threshold, column
detection, ligature bbox splitting, multi-granularity output) and
Unicode post-processing (NFC normalization, selective NFKC decomposition
for ligatures, PUA preservation, soft hyphen resolution, ZWJ/ZWNJ
handling, combining character reordering).
Also adds docs/plan/implementation-plan.md: the full 7-phase Rust
implementation roadmap covering core parser, font/encoding pipeline,
content stream processing, text assembly, OCR integration, API surface,
and advanced features — with crate selections, complexity ratings,
test strategy, and v0.1–v1.0 release milestones.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>