- Add build_x0_histogram() function for 1pt-resolution x0 histogram
- Add HasBBox trait for generic bbox access
- Implement for [f32; 4] and [f64; 4] types
- Clamp out-of-bounds x0 values with diagnostics
- Add 7 tests covering single/multiple spans, clamping, rounding, A4 pages

Acceptance criteria PASS:
- Single span at x0=100: hist[100] == 1
- Multiple spans: hist[100]==2, hist[200]==2, hist[300]==1
- Negative x0 clamped to hist[0] with diagnostic
- Empty spans returns zero Vec

Closes: pdftract-56vwd
pdftract-56vwd: x0 histogram builder
Summary
Implemented the `build_x0_histogram(spans: &[S], page_width: f32) -> Vec<u32>` function for column detection (Phase 4.3).
Changes Made
crates/pdftract-core/src/layout/columns.rs
- Added `build_x0_histogram()` function that builds a 1pt-resolution histogram of span x0 coordinates
- Added `HasBBox` trait for generic bbox access (returns `[f32; 4]`)
- Implemented `HasBBox` for `[f32; 4]` and `[f64; 4]` array types
- Function clamps x0 values to the valid histogram range and logs diagnostics for out-of-bounds values
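
A minimal sketch of what the trait and its two array implementations might look like. The method name `bbox`, the `[x0, y0, x1, y1]` component order, and the `left_edge` helper are assumptions made for illustration; the actual definitions in `columns.rs` may differ.

```rust
/// Hypothetical reconstruction: anything that can report its bounding box
/// as four f32 values. The order [x0, y0, x1, y1] is assumed here.
pub trait HasBBox {
    fn bbox(&self) -> [f32; 4];
}

impl HasBBox for [f32; 4] {
    fn bbox(&self) -> [f32; 4] {
        // Already in the target representation; return a copy.
        *self
    }
}

impl HasBBox for [f64; 4] {
    fn bbox(&self) -> [f32; 4] {
        // Narrowing cast: precision beyond f32 is dropped.
        self.map(|v| v as f32)
    }
}

/// Illustrative helper (not from the source): read the left edge of any
/// span-like value through the trait.
pub fn left_edge<S: HasBBox>(span: &S) -> f32 {
    span.bbox()[0]
}
```

With a bound like this, a function generic over `S: HasBBox` can accept either array type, or a richer span struct that implements the trait, without conversion at the call site.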
crates/pdftract-core/src/layout/mod.rs
- Exported `build_x0_histogram` function
Acceptance Criteria Status
| Criterion | Status |
|---|---|
| 1 span at x0=100, page_width=612: hist[100] == 1 | PASS |
| 5 spans at x0=100,100,200,200,300: hist[100]==2, hist[200]==2, hist[300]==1 | PASS |
| Span at x0=-5: clamped to hist[0], diagnostic | PASS |
| Empty spans: returns Vec of zeros | PASS |
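
The criteria in the table can be illustrated with a simplified, dependency-free sketch. It is not the implementation in `columns.rs`: the name `x0_histogram_sketch` is hypothetical, it takes already-extracted x0 values instead of generic spans, it returns a clamp counter in place of the `tracing::warn!` diagnostics, and sizing the histogram as `ceil(page_width)` buckets is an assumption chosen to match the 612 and 595 bucket counts noted for Letter and A4 pages.

```rust
/// Sketch: build a 1pt-resolution histogram of left-edge (x0) values.
/// Returns the histogram plus the number of values that had to be clamped
/// (a stand-in for the diagnostics the real function logs).
pub fn x0_histogram_sketch(x0_values: &[f32], page_width: f32) -> (Vec<u32>, usize) {
    // Assumption: one bucket per point of page width, rounded up.
    let buckets = page_width.ceil().max(0.0) as usize;
    let mut hist = vec![0u32; buckets];
    let mut clamped = 0usize;
    if buckets == 0 {
        return (hist, clamped);
    }
    let last = (buckets - 1) as f32;

    for &x0 in x0_values {
        // Round to the nearest point: 100.4 -> 100, 100.6 -> 101.
        let rounded = x0.round();
        // Force the index into [0, last].
        let idx = rounded.clamp(0.0, last);
        if idx != rounded {
            // Out-of-range value. NaN also lands here because it never
            // compares equal to itself; `as usize` then maps it to bucket 0.
            clamped += 1;
        }
        // Each value contributes exactly one increment.
        hist[idx as usize] += 1;
    }
    (hist, clamped)
}
```

Under these assumptions, a single value of `100.0` on a 612pt page yields `hist[100] == 1`, and a value of `-5.0` is counted in `hist[0]` with the clamp counter at 1.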
Test Results
All 20 tests in layout::columns module pass, including 7 new tests for build_x0_histogram:
- `test_build_x0_histogram_single_span` - Single span histogram
- `test_build_x0_histogram_multiple_spans` - Multiple spans at different x0 positions
- `test_build_x0_histogram_clamp_negative_x0` - Negative x0 clamping with diagnostic
- `test_build_x0_histogram_clamp_overflow_x0` - Overflow x0 clamping with diagnostic
- `test_build_x0_histogram_empty_spans` - Empty span handling
- `test_build_x0_histogram_rounding` - Rounding behavior (x0.4 -> x0, x0.6 -> x0+1)
- `test_build_x0_histogram_a4_page` - A4 page width (595pt)
Notes
- Function signature uses a generic `S: HasBBox` bound for flexibility with different span representations
- 1pt resolution per plan: 612 buckets for a 612pt Letter page; 595 buckets for a 595pt A4 page
- Only x0 (LEFT edge) is histogrammed; x1 is not used
- Each span contributes exactly one bucket increment
- Diagnostics use `tracing::warn!` for out-of-bounds x0 values