pdftract/crates
jedarden d41d47de66 feat(pdftract-1x2): implement StructTree depth-first walker with RoleMap resolution
Implements the StructTree parser (Phase 7.1.1) with:
- Depth-first walker over /StructTreeRoot via /K array
- Support for all four /K entry types: StructElem, MCID, MCR, OBJR
- /RoleMap resolution with chain handling and cycle detection
- /Lang inheritance through the structure tree
- /ActualText inheritance (applies to all descendant content)
- Public API: StructureType, StructElemNode, StructTreeRoot, RoleMap, Kid
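
The /RoleMap chain handling listed above can be sketched roughly as follows. This is a minimal illustration, not the crate's actual code: the `resolve_role` function, the `STANDARD_TYPES` subset, and the string-based representation are all assumptions made for the example. It also assumes that a name which is already a standard type is not remapped further.

```rust
use std::collections::{HashMap, HashSet};

// Hypothetical subset of the standard structure types (PDF 1.7, 14.8.4);
// the real crate presumably models these with its StructureType type.
const STANDARD_TYPES: &[&str] = &[
    "Document", "Sect", "P", "H1", "H2", "H3", "L", "LI", "Table", "Span", "Figure", "NonStruct",
];

/// Follow /RoleMap entries from `name` until a standard type is reached.
/// Returns "NonStruct" if the chain loops or ends at an unmapped custom type.
fn resolve_role(role_map: &HashMap<String, String>, name: &str) -> String {
    let mut seen: HashSet<&str> = HashSet::new();
    let mut current: &str = name;
    loop {
        if STANDARD_TYPES.contains(&current) {
            return current.to_string();
        }
        // `insert` returns false when the name was already visited: a cycle.
        if !seen.insert(current) {
            return "NonStruct".to_string();
        }
        match role_map.get(current) {
            Some(next) => current = next.as_str(),
            None => return "NonStruct".to_string(),
        }
    }
}
```

Tracking visited names in a set bounds the loop by the number of distinct names in the map, so a malformed file with a self-referential /RoleMap cannot hang the walker.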

Acceptance criteria:
- PASS: All four /K element kinds handled without crashing
- PASS: /RoleMap chains resolve to standard type or NonStruct
- PASS: /Lang and /ActualText inherit correctly down tree
- PASS: Unit tests for Word RoleMap (Heading1 -> H1)
- PASS: Unit tests for nested /Lang and /ActualText scope
- PASS: Public type StructElemNode documented in core crate
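
The /Lang and /ActualText scoping checked by the criteria above could look something like the sketch below. The `Elem`, `Kid`, and `Resolved` types and the `walk` function are hypothetical stand-ins, not the documented `StructElemNode` or `Kid` API, and only two of the four /K entry kinds are modelled to keep the example short.

```rust
// Hypothetical, simplified model of a structure element and its /K entries.
#[derive(Debug, Default)]
struct Elem {
    lang: Option<String>,
    actual_text: Option<String>,
    kids: Vec<Kid>,
}

#[derive(Debug)]
enum Kid {
    Elem(Elem),
    Mcid(u32),
}

// Marked content with the attributes it inherits from its ancestors.
#[derive(Debug, PartialEq)]
struct Resolved {
    mcid: u32,
    lang: Option<String>,
    actual_text: Option<String>,
}

/// Depth-first walk in /K order. A value set on an element overrides the
/// inherited one for that element and all of its descendants.
fn walk(elem: &Elem, lang: Option<&str>, actual_text: Option<&str>, out: &mut Vec<Resolved>) {
    let lang = elem.lang.as_deref().or(lang);
    let actual_text = elem.actual_text.as_deref().or(actual_text);
    for kid in &elem.kids {
        match kid {
            Kid::Mcid(mcid) => out.push(Resolved {
                mcid: *mcid,
                lang: lang.map(str::to_owned),
                actual_text: actual_text.map(str::to_owned),
            }),
            Kid::Elem(child) => walk(child, lang, actual_text, out),
        }
    }
}
```

Passing the inherited values down as parameters, rather than storing them on the nodes, means a sibling that follows a nested override automatically falls back to the parent's scope.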

References:
- Plan section 7.1 StructTree Exploitation (lines 2547-2549, 2552-2553)
- PDF 1.7 spec 14.7.4 (Structure Tree) and 14.8.4 (Standard Structure Types)

Co-Authored-By: Claude Code <noreply@anthropic.com>
2026-05-23 16:43:22 -04:00
pdftract-cer-diff docs(pdftract-aawrz): add LICENSE-MIT and LICENSE-APACHE files 2026-05-23 10:36:28 -04:00
pdftract-cli feat(pdftract-4my): implement serve mode integration for full-render feature 2026-05-23 16:28:08 -04:00
pdftract-core feat(pdftract-1x2): implement StructTree depth-first walker with RoleMap resolution 2026-05-23 16:43:22 -04:00
pdftract-libpdftract feat(pdftract-juc): implement Standard 14 font metrics registry 2026-05-23 14:04:02 -04:00
pdftract-py docs(pdftract-aawrz): add LICENSE-MIT and LICENSE-APACHE files 2026-05-23 10:36:28 -04:00