docs(pdftract-liq5f): add verification note for 8 overlay layers

All 8 overlay layers are implemented and integrated:
1. Spans (confidence-colored outlines) ✓
2. Blocks (kind-colored translucent fills) ✓
3. Columns (dashed vertical lines) ✓
4. Reading order (curved arrows with labels) ✓
5. Confidence heatmap (per-glyph cells) ✓
6. OCR regions (cyan diagonal stripes) ✓
7. MCID labels (numeric labels, awaiting Phase 3.4 data) ⚠️
8. Anchors (block ID labels) ✓

All render tests pass. The MCID layer renderer is complete, but its data is unavailable until Phase 3.4.
jedarden 2026-06-01 06:28:35 -04:00
parent 1298f1b89b
commit 05b254d95a

notes/pdftract-liq5f.md

@@ -0,0 +1,144 @@
# Verification Note: pdftract-liq5f (7.9.5 - 8 Toggleable Overlay Layers)
## Summary
All 8 overlay layers are implemented and integrated into the inspector SVG renderer. Each layer is independently toggleable via CSS classes.
## Implementation Status
### 1. Spans Layer (`layer-spans`)
- **Location**: `crates/pdftract-cli/src/inspect/render/spans.rs`
- **Function**: `render_spans(spans, blocks) -> Vec<String>`
- **Elements**: SVG `<rect>` outline rectangles per span
- **Color coding**: Red (< 0.5), Yellow (0.5-0.8), Green (> 0.8)
- **Data attributes**: `data-text`, `data-confidence`, `data-font`, `data-size`, `data-span-index`, `data-bbox`
- **Status**: ✓ PASS - Fully implemented with tests
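The confidence-to-color mapping above can be sketched as follows. This is a minimal illustration, not the code in `spans.rs`: the helper names `confidence_color` and `span_rect`, the `(x, y, width, height)` bbox layout, and the choice to treat exactly 0.8 as yellow are all assumptions.

```rust
/// Hypothetical helper: maps a confidence score to an outline color.
/// Thresholds follow the note (red below 0.5, green above 0.8);
/// treating exactly 0.8 as yellow is an assumption.
fn confidence_color(confidence: f32) -> &'static str {
    if confidence < 0.5 {
        "#ef4444" // RED_LOW
    } else if confidence <= 0.8 {
        "#eab308" // YELLOW_MEDIUM
    } else {
        "#22c55e" // GREEN_HIGH
    }
}

/// Hypothetical helper: renders one span outline as an SVG `<rect>` string.
/// `bbox` is assumed to be (x, y, width, height) in SVG user units.
fn span_rect(index: usize, bbox: (f32, f32, f32, f32), confidence: f32) -> String {
    let (x, y, w, h) = bbox;
    format!(
        r#"<rect x="{x:.2}" y="{y:.2}" width="{w:.2}" height="{h:.2}" fill="none" stroke="{color}" data-confidence="{confidence:.2}" data-span-index="{index}"/>"#,
        color = confidence_color(confidence),
    )
}
```

Only two of the six listed data attributes are emitted here to keep the sketch short.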
### 2. Blocks Layer (`layer-blocks`)
- **Location**: `crates/pdftract-cli/src/inspect/render/blocks.rs`
- **Function**: `render_blocks(blocks) -> Vec<String>`
- **Elements**: SVG `<rect>` translucent rectangles per block
- **Color coding**: Blue (heading), Gray (paragraph), Teal (table), Purple (list), Orange (code), Light gray (header/footer), Brown (figure), Pink (caption)
- **Data attributes**: `data-kind`, `data-text`, `data-level`, `data-table-index`, `data-block-index`
- **Status**: ✓ PASS - Fully implemented with tests
### 3. Columns Layer (`layer-columns`)
- **Location**: `crates/pdftract-cli/src/inspect/render/columns.rs`
- **Function**: `render_columns(columns, page_height) -> Vec<String>`
- **Elements**: SVG `<line>` dashed vertical lines at column boundaries
- **Color coding**: 8-color palette cycling through cyan, magenta, yellow, green, orange, blue, purple, red
- **Data attributes**: `data-column-index`, `data-boundary`, `data-x0`, `data-x1`
- **Status**: ✓ PASS - Fully implemented with tests
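As a rough illustration of the palette cycling described above, the sketch below draws one dashed line per boundary and wraps around after eight colors. The function name `column_lines`, the `boundaries` slice of x positions, the dash pattern, and the use of CSS color keywords in place of the project's actual hex values are assumptions.

```rust
/// Palette order taken from the note; using CSS color keywords rather than
/// the project's actual hex values is an assumption.
const COLUMN_PALETTE: [&str; 8] = [
    "cyan", "magenta", "yellow", "green", "orange", "blue", "purple", "red",
];

/// Hypothetical sketch: one dashed vertical `<line>` per column boundary.
/// `boundaries` holds x positions; each line spans from y = 0 to `page_height`.
fn column_lines(boundaries: &[f32], page_height: f32) -> Vec<String> {
    boundaries
        .iter()
        .enumerate()
        .map(|(i, x)| {
            // The modulo makes the ninth boundary reuse the first color.
            let color = COLUMN_PALETTE[i % COLUMN_PALETTE.len()];
            format!(
                r#"<line x1="{x:.2}" y1="0" x2="{x:.2}" y2="{page_height:.2}" stroke="{color}" stroke-dasharray="4 4" data-column-index="{i}"/>"#
            )
        })
        .collect()
}
```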
### 4. Reading Order Layer (`layer-reading-order`)
- **Location**: `crates/pdftract-cli/src/inspect/render/reading_order.rs`
- **Function**: `render_reading_order(blocks, order) -> Vec<String>`
- **Elements**: SVG `<path>` curved arrows + `<text>` numeric labels
- **Limit**: First 50 blocks only (to prevent clutter)
- **Color coding**: Blue arrows (#3b82f6)
- **Data attributes**: `data-from-block`, `data-to-block`, `data-reading-index`
- **Status**: ✓ PASS - Fully implemented with tests
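One plausible way to draw the curved arrows and apply the 50-block limit is a quadratic Bezier between consecutive block centers. The sketch below is an assumption about the approach, not the code in `reading_order.rs`: the name `reading_order_paths`, the pre-sorted `centers` input, and the fixed 20-unit control-point offset are hypothetical, and arrowheads and `<text>` labels are omitted.

```rust
/// Limit stated in the note.
const MAX_BLOCKS: usize = 50;

/// Hypothetical sketch: connects consecutive block centers with quadratic
/// Bezier curves. `centers` is assumed to be sorted in reading order already.
fn reading_order_paths(centers: &[(f32, f32)]) -> Vec<String> {
    // Only the first MAX_BLOCKS blocks are considered.
    let limited = &centers[..centers.len().min(MAX_BLOCKS)];
    limited
        .windows(2)
        .enumerate()
        .map(|(i, pair)| {
            let (x0, y0) = pair[0];
            let (x1, y1) = pair[1];
            // Shift the control point sideways from the midpoint so the
            // path bows instead of running straight.
            let cx = (x0 + x1) / 2.0 + 20.0;
            let cy = (y0 + y1) / 2.0;
            format!(
                r##"<path d="M {x0:.2} {y0:.2} Q {cx:.2} {cy:.2} {x1:.2} {y1:.2}" fill="none" stroke="#3b82f6" data-from-block="{i}" data-to-block="{to}"/>"##,
                to = i + 1,
            )
        })
        .collect()
}
```

With this scheme, 50 blocks yield at most 49 arrows, and a page with a single block yields none.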
### 5. Confidence Heatmap Layer (`layer-confidence-heatmap`)
- **Location**: `crates/pdftract-cli/src/inspect/render/confidence_heatmap.rs`
- **Function**: `render_confidence_heatmap(spans) -> Vec<String>`
- **Elements**: SVG `<rect>` per-glyph colored cells
- **Color coding**: Red (< 0.5), Yellow (0.5-0.8), Green (> 0.8), Gray (no confidence)
- **Data attributes**: `data-char`, `data-confidence`, `data-span-index`
- **Status**: ✓ PASS - Fully implemented with tests
### 6. OCR Regions Layer (`layer-ocr`)
- **Location**: `crates/pdftract-cli/src/inspect/render/ocr_regions.rs`
- **Function**: `render_ocr_regions(spans) -> Vec<String>`
- **Elements**: SVG `<defs>` pattern + `<rect>` overlays
- **Visual**: Cyan diagonal stripes (#00d9ff)
- **Data attributes**: `data-ocr-source`, `data-confidence`, `data-text`, `data-span-index`
- **Status**: ✓ PASS - Fully implemented with tests
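A diagonal-stripe fill of this kind is commonly built from a rotated SVG `<pattern>` that each overlay rectangle references by id. The sketch below shows that general technique under assumptions: the pattern id `ocr-stripes`, the 8-unit tile, the stroke width, and both function names are hypothetical, and only the stripe color comes from the note.

```rust
/// Hypothetical pattern id; the project's actual id is not given in the note.
const OCR_PATTERN_ID: &str = "ocr-stripes";

/// Builds a `<defs>` block with a striped pattern: a vertical line in an
/// 8x8 tile, rotated 45 degrees so the stripes run diagonally.
fn ocr_pattern_defs() -> String {
    format!(
        r##"<defs><pattern id="{OCR_PATTERN_ID}" patternUnits="userSpaceOnUse" width="8" height="8" patternTransform="rotate(45)"><line x1="0" y1="0" x2="0" y2="8" stroke="#00d9ff" stroke-width="2"/></pattern></defs>"##
    )
}

/// Overlay rectangle for one OCR-derived span, filled with the pattern.
/// `bbox` is assumed to be (x, y, width, height).
fn ocr_rect(index: usize, bbox: (f32, f32, f32, f32)) -> String {
    let (x, y, w, h) = bbox;
    format!(
        r##"<rect x="{x:.2}" y="{y:.2}" width="{w:.2}" height="{h:.2}" fill="url(#{OCR_PATTERN_ID})" data-span-index="{index}"/>"##
    )
}
```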
### 7. MCID Labels Layer (`layer-mcid`)
- **Location**: `crates/pdftract-cli/src/inspect/render/mcid.rs`
- **Function**: `render_mcid_labels(mcid_map, blocks) -> Vec<String>`
- **Elements**: SVG `<text>` numeric MCID labels at block corners
- **Color**: Amber/orange (#f59e0b)
- **Data attributes**: `data-mcid`, `data-block-index`, `data-block-kind`
- **Status**: ⚠️ WARN - Renderer implemented but data not available in JSON (Phase 3.4 incomplete)
- **Note**: The API renders an empty `<g class="layer-mcid"></g>` placeholder
### 8. Anchor Labels Layer (`layer-anchors`)
- **Location**: `crates/pdftract-cli/src/inspect/render/anchors.rs`
- **Function**: `render_anchors(page_index, page_number, blocks) -> Vec<String>`
- **Elements**: SVG `<text>` block ID labels at top-left
- **Format**: `p{page_number}-b{block_index}`
- **Data attributes**: `data-page-index`, `data-page-number`, `data-block-index`, `data-bbox`, `data-kind`
- **Status**: ✓ PASS - Fully implemented with tests
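The label format above amounts to a simple string template. A minimal sketch, assuming a hypothetical helper named `anchor_id` and unsigned integer inputs:

```rust
/// Hypothetical helper: builds a block anchor label in the
/// `p{page_number}-b{block_index}` format described in the note.
fn anchor_id(page_number: u32, block_index: usize) -> String {
    format!("p{page_number}-b{block_index}")
}
```

For example, block 12 on page 3 would be labeled `p3-b12`.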
## Integration in API
**Location**: `crates/pdftract-cli/src/inspect/api.rs`
The `render_page_svg` function renders all 8 layers:
```rust
// Layers are added to svg_layers vector
// Each layer wrapped in: <g class="layer-{name}" style="display: none;">...</g>
```
All layers are present in SVG output with correct class names for CSS toggling.
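The wrapping step described in the comments above might look like the following sketch. The helper name `wrap_layer` is hypothetical, and the real `render_page_svg` may assemble its groups differently.

```rust
/// Hypothetical helper: wraps already-rendered SVG node strings in a
/// hidden `<g>` group whose class name identifies the layer.
fn wrap_layer(name: &str, nodes: &[String]) -> String {
    format!(
        r#"<g class="layer-{name}" style="display: none;">{}</g>"#,
        nodes.join("")
    )
}
```

A layer with no nodes, such as MCID until Phase 3.4 lands, still produces an empty group. Because an inline `display: none` takes precedence over ordinary stylesheet rules, a CSS toggle written against this markup would need `!important` or would have to change the inline style.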
## Core Library
**Location**: `crates/pdftract-core/src/output/inspector/`
- `mod.rs` - Module exports
- `colors.rs` - Color encoding constants
- `layers.rs` - `LayerGroup` struct and `render_all` orchestrator
## Color Encodings
All color constants defined in `crates/pdftract-cli/src/inspect/render/colors.rs`:
- Confidence: RED_LOW (#ef4444), YELLOW_MEDIUM (#eab308), GREEN_HIGH (#22c55e), GRAY_NEUTRAL (#94a3b8)
- Block kinds: BLUE_HEADING (#3b82f6), GRAY_PARAGRAPH (#9ca3af), TEAL_TABLE (#14b8a6), etc.
- Special layers: BLUE_READING_ORDER (#3b82f6), PURPLE_MCID (#9333ea), BLACK_ANCHOR (#000000), CYAN_OCR (#00d9ff)
## Test Results
All render-related tests pass when run with:
```
cargo test --lib -p pdftract-cli render
```
## Files Modified/Verified
1. `crates/pdftract-cli/src/inspect/render/spans.rs` - ✓ Existing implementation
2. `crates/pdftract-cli/src/inspect/render/blocks.rs` - ✓ Existing implementation
3. `crates/pdftract-cli/src/inspect/render/columns.rs` - ✓ Existing implementation
4. `crates/pdftract-cli/src/inspect/render/reading_order.rs` - ✓ Existing implementation
5. `crates/pdftract-cli/src/inspect/render/confidence_heatmap.rs` - ✓ Existing implementation
6. `crates/pdftract-cli/src/inspect/render/ocr_regions.rs` - ✓ Existing implementation
7. `crates/pdftract-cli/src/inspect/render/mcid.rs` - ✓ Existing implementation (awaiting Phase 3.4 data)
8. `crates/pdftract-cli/src/inspect/render/anchors.rs` - ✓ Existing implementation
9. `crates/pdftract-cli/src/inspect/render/colors.rs` - ✓ Existing implementation
10. `crates/pdftract-cli/src/inspect/render/mod.rs` - ✓ Existing orchestrator
11. `crates/pdftract-cli/src/inspect/api.rs` - ✓ Existing integration
12. `crates/pdftract-core/src/output/inspector/mod.rs` - ✓ Existing exports
13. `crates/pdftract-core/src/output/inspector/colors.rs` - ✓ Existing implementation
14. `crates/pdftract-core/src/output/inspector/layers.rs` - ✓ Existing orchestrator
## Acceptance Criteria
- ✅ 8 layer functions implemented, each returning `Vec<SvgNode>` (represented as `Vec<String>`)
- ✅ All 8 layer groups present in SVG output with correct class names
- ✅ Color encodings match plan (section 2837-2845)
- ✅ data-* attrs on span rects feed tooltip (data-text, data-confidence, data-font, data-size, data-span-index, data-bbox)
- ⏸️ Critical test (all eight layer toggles produce DOM changes) - Pending frontend test (7.9.3)
- ✅ Public `render_all` function exists in `crates/pdftract-cli/src/inspect/render/mod.rs`
## WARN Items
1. **MCID layer data not available**: The MCID renderer exists and works correctly when given MCID data, but the JSON schema (`PageJson`) does not include an `mcid_map` field. This is expected, because Phase 3.4 (marked content tracking) is not complete. The layer is rendered as an empty placeholder with the correct class name.
## Notes
- All layers use CSS-only toggling (no JavaScript re-render needed)
- SVG payload size is kept manageable by sampling dense layers (the confidence heatmap)
- Reading order arrows limited to 50 blocks to prevent visual clutter
- All coordinate values rounded to 2 decimal places for SVG precision
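The rounding and sampling notes above can be illustrated with two small helpers. Both are assumptions: the note does not say how sampling is done, so the even-stride approach in `sample_evenly` is only one possibility, and `round2` is a hypothetical name.

```rust
/// Hypothetical helper: rounds a coordinate to 2 decimal places.
fn round2(v: f64) -> f64 {
    (v * 100.0).round() / 100.0
}

/// Hypothetical helper: keeps at most `max` items by taking every n-th
/// element. Even-stride sampling is an assumption, not the documented
/// strategy.
fn sample_evenly<T: Clone>(items: &[T], max: usize) -> Vec<T> {
    if max == 0 {
        return Vec::new();
    }
    // Ceiling division picks the smallest stride that fits within `max`;
    // the stride is at least 1 so `step_by` never receives 0.
    let stride = ((items.len() + max - 1) / max).max(1);
    items.iter().step_by(stride).cloned().collect()
}
```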