Implemented the full SVG page renderer for the inspector debug viewer (Phase 7.9.4). The renderer generates complete SVG documents with multiple layers for visual debugging of PDF extraction results.

Changes:
- Implemented render_page_svg() with 10 layers (background, selection, 8 overlays)
- Added selection layer with invisible <text> elements for browser text selection
- Integrated all 8 overlay layer renderers (spans, blocks, columns, reading_order, confidence_heatmap, ocr, mcid, anchors)
- Added arrowhead marker definition for reading order arrows
- Implemented helper functions: render_selection_layer(), render_ocr_layer(), extract_columns_from_spans(), escape_xml_text()
- Added comprehensive unit tests for all functions

Acceptance criteria:
- ✅ Per-page SVG structure with proper viewBox and namespace
- ✅ 8 toggleable overlay layers with correct class names
- ✅ Color coding by confidence (spans) and kind (blocks)
- ✅ Coordinate system flip (PDF y-up to SVG y-down)
- ✅ Invisible <text> elements for browser text selection
- ✅ SVG determinism (same input produces identical output)

Deferred:
- Glyph paths via ttf-parser (requires font data not in JSON)
- Performance testing (requires full inspector integration)
- MCID layer (MCID tracking not in schema yet)

Closes: pdftract-4ct3y
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
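
A minimal sketch of two pieces named above: the PDF y-up to SVG y-down flip and an invisible, selectable `<text>` element. It assumes a page height in PDF points and one baseline position per span. The names `flip_y` and `selection_text` are hypothetical, and this `escape_xml_text` is an illustrative stand-in, not the code from this commit, whose signatures are not shown here.

```rust
/// Escape the five XML special characters so the text is safe inside an SVG element.
/// Illustrative stand-in for the commit's escape_xml_text(); the real signature is assumed.
fn escape_xml_text(input: &str) -> String {
    let mut out = String::with_capacity(input.len());
    for ch in input.chars() {
        match ch {
            '&' => out.push_str("&amp;"),
            '<' => out.push_str("&lt;"),
            '>' => out.push_str("&gt;"),
            '"' => out.push_str("&quot;"),
            '\'' => out.push_str("&apos;"),
            _ => out.push(ch),
        }
    }
    out
}

/// Convert a PDF y coordinate (origin bottom-left, y grows upward)
/// to an SVG y coordinate (origin top-left, y grows downward).
/// Hypothetical helper; page_height is assumed to be in the same units as pdf_y.
fn flip_y(pdf_y: f64, page_height: f64) -> f64 {
    page_height - pdf_y
}

/// Emit one <text> element that is invisible (fill-opacity 0) but still present
/// in the document for text selection. Hypothetical helper.
/// Fixed two-decimal formatting keeps the output identical for identical input.
fn selection_text(x: f64, pdf_baseline_y: f64, page_height: f64, text: &str) -> String {
    format!(
        "<text x=\"{:.2}\" y=\"{:.2}\" fill-opacity=\"0\">{}</text>",
        x,
        flip_y(pdf_baseline_y, page_height),
        escape_xml_text(text)
    )
}

fn main() {
    // A span whose baseline sits 700pt above the bottom of a 792pt-tall page.
    println!("{}", selection_text(72.0, 700.0, 792.0, "a < b & c"));
}
```

Formatting coordinates with a fixed precision is one way to get byte-identical output across runs; whether the actual renderer does this is not stated in the commit message.
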
||
|---|---|---|
| benches | ||
| src | ||
| tests | ||
| build.rs | ||
| Cargo.toml | ||
| pdftract-cli.cdx.json | ||