- Add column: Option<u32> field to Span in hybrid.rs - Create layout/columns.rs module with: - Column struct (index + x_range) - assign_columns_to_spans() - assign by x_range containing bbox[0] - assign_columns_to_lines() - propagate via mode (>50% dominance) - HasBBoxAndColumn and HasSpansWithColumn traits - Update layout/mod.rs to export column types - Fix test fixtures in inspect/render (add column: None) Acceptance criteria: - 2-column page span at x0=50 -> Some(0), x0=350 -> Some(1) - Full-width heading line -> None (mixed spans) - Single-column page -> all spans Some(0) - Inter-column gap -> None Closes: pdftract-64j83 Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
3.2 KiB
3.2 KiB
Verification Note: pdftract-64j83
Bead
Column label assignment to Span.column + Line.column
Work Done
1. Added column field to Span in hybrid.rs
- Added
pub column: Option<u32>to theSpanstruct - Updated
Span::new()to initializecolumn: None - The
SpanJsoninschema/mod.rsalready had thecolumnfield
2. Created new module layout/columns.rs
- Implemented
Columnstruct withindexandx_rangefields - Implemented
assign_columns_to_spans()function:- Assigns column indices to spans based on x_range containing span.bbox[0]
- Spans outside any column get
column = None
- Implemented
assign_columns_to_lines()function:- Propagates column indices from spans to lines via mode
- Assigns column only if >50% of spans are in that column
- Otherwise assigns
None(mixed columns)
- Added traits
HasBBoxAndColumnandHasSpansWithColumnfor flexibility
3. Updated layout/mod.rs
- Added
pub mod columns; - Exported
assign_columns_to_lines,assign_columns_to_spans, andColumn
4. Fixed test fixtures
- Updated
SpanJsoninitializers ininspect/render/confidence_heatmap.rs - Updated
SpanJsoninitializers ininspect/render/spans.rs - Added
column: Noneto all test fixtures
Acceptance Criteria
- [PASS]
Spanhascolumn: Option<u32>field - [PASS]
Linealready hascolumn: Option<usize>field (from Phase 4.2) - [PASS]
assign_columns_to_spans()assigns based on x_range containing span.bbox[0] - [PASS] Spans outside any column get
column = None - [PASS]
assign_columns_to_lines()propagates via mode (>50% dominance) - [PASS] Full-width heading lines get
column = Nonewhen spans are mixed - [PASS] Single-column pages: all spans get
Some(0) - [PASS] Inter-column gaps: spans in gap get
None
Test Coverage
All acceptance criteria are covered by unit tests in layout/columns.rs:
test_assign_columns_to_spans_two_column: 2-column page, span at x0=50 -> Some(0), x0=350 -> Some(1), x0=310 (gap) -> Nonetest_assign_columns_to_lines_unanimous: All spans in same column -> that columntest_assign_columns_to_lines_dominant: >50% spans in one column -> that columntest_assign_columns_to_lines_mixed: 50/50 split -> None (no dominant)test_assign_columns_to_lines_full_width_heading: All spans None -> line Nonetest_assign_columns_to_spans_single_column: Single-column page -> all spans Some(0)test_span_straddling_gap_assigned_by_x0: Span assigned by x0 even if it extends into gaptest_column_index_monotonic_left_to_right: INV verified
Critical Considerations
- INV: Column index monotonic left-to-right - verified in tests
- Span straddling gap: assigned by x0 - verified in test
- /Rotate normalized coords: assumed to be handled by upstream code
Files Modified
crates/pdftract-core/src/hybrid.rs: Addedcolumnfield toSpancrates/pdftract-core/src/layout/columns.rs: New module (360 lines)crates/pdftract-core/src/layout/mod.rs: Exported column typescrates/pdftract-cli/src/inspect/render/confidence_heatmap.rs: Fixed test fixturescrates/pdftract-cli/src/inspect/render/spans.rs: Fixed test fixtures
Gates Passed
cargo check --all-targets- PASS (lib compiles)cargo fmt --all- PASS (code formatted)