The detect_column_gaps function was already implemented in columns.rs with full test coverage. All acceptance criteria verified:

- 8 zeros < threshold: no gap
- 20 zeros middle: 1 gap detected
- Leading zeros >= threshold: gap emitted
- All-zero histogram: 0 gaps

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

# pdftract-14w0w: Gap detection verification

## Summary

The `detect_column_gaps` function was already implemented in `crates/pdftract-core/src/layout/columns.rs` (lines 156-201). All acceptance criteria tests pass.

## Implementation details

The function:
- Takes a histogram slice and `page_width`
- Calculates the gap threshold: `(page_width * 0.03).ceil() as usize`
- Returns a `Vec<ColumnGap>`, where each `ColumnGap` carries the fields `lo` and `hi`

Key behaviors:
- Handles leading zeros (left margin) - emits a gap if the run length is >= threshold
- Handles trailing zeros (right margin) - emits a gap if the run length is >= threshold
- Handles all-zero histogram (empty page) - returns no gaps
- Handles empty histogram - returns no gaps

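
The behaviors listed above can be illustrated with a minimal sketch. This is not the code in `columns.rs`: the name `detect_column_gaps_sketch`, the `u32` bin type, the `f64` type of `page_width`, and the half-open reading of `lo..hi` are all assumptions made for illustration. Only the threshold formula and the four listed behaviors come from this note.

```rust
/// Hypothetical stand-in for the real `ColumnGap`.
/// Assumption: the gap covers the half-open bin range `lo..hi`.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct ColumnGap {
    pub lo: usize,
    pub hi: usize,
}

/// Illustrative sketch only; the real `detect_column_gaps` may differ.
/// Assumption: `histogram` holds per-column counts as `u32`.
pub fn detect_column_gaps_sketch(histogram: &[u32], page_width: f64) -> Vec<ColumnGap> {
    // An empty or all-zero histogram (empty page) yields no gaps.
    if histogram.iter().all(|&count| count == 0) {
        return Vec::new();
    }

    // Documented threshold: 3% of the page width, rounded up.
    let threshold = (page_width * 0.03).ceil() as usize;
    let mut gaps = Vec::new();
    let mut run_start: Option<usize> = None;

    for (i, &count) in histogram.iter().enumerate() {
        match (count == 0, run_start) {
            // A zero bin opens a new run (this also covers leading zeros).
            (true, None) => run_start = Some(i),
            // A non-zero bin closes the run; keep it if it is long enough.
            (false, Some(start)) => {
                if i - start >= threshold {
                    gaps.push(ColumnGap { lo: start, hi: i });
                }
                run_start = None;
            }
            _ => {}
        }
    }

    // A run still open at the end is the trailing (right margin) case.
    if let Some(start) = run_start {
        if histogram.len() - start >= threshold {
            gaps.push(ColumnGap { lo: start, hi: histogram.len() });
        }
    }

    gaps
}
```

Checking for the all-zero case up front keeps the scan loop simple: without it, an empty page would be reported as one page-wide trailing gap.
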
## Acceptance criteria verification

| Criterion | Test | Status |
|-----------|------|--------|
| 8 zeros, page_width=600: NO gap | `test_detect_column_gaps_short_zeros_no_gap` | PASS |
| 20 zeros middle, page_width=600: 1 gap | `test_detect_column_gaps_middle_gap` | PASS |
| Leading zeros >= threshold: 1 gap | `test_detect_column_gaps_leading_gap` | PASS |
| All-zero histogram: 0 gaps | `test_detect_column_gaps_all_zeros_no_gaps` | PASS |

Additional tests:
- `test_detect_column_gaps_trailing_gap` - trailing margin gap
- `test_detect_column_gaps_multiple_gaps` - multiple separated gaps
- `test_detect_column_gaps_threshold_exact` - gap at exact threshold
- `test_detect_column_gaps_threshold_minus_one` - gap just below threshold
- `test_detect_column_gaps_empty_histogram` - empty input
- `test_detect_column_gaps_no_zeros` - no gaps in histogram
- `test_detect_column_gaps_small_page` - small page width
- `test_detect_column_gaps_leading_and_trailing` - both margins

All 36 column tests PASS (including 13 `detect_column_gaps` tests).

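
The two `page_width=600` criteria follow from the threshold formula: 3% of 600 is 18, so an 8-column zero run falls short while a 20-column run qualifies. The sketch below works through that arithmetic. `gap_threshold` and `is_gap` are hypothetical helper names, and `page_width` is assumed to be an `f64`; neither is confirmed by this note.

```rust
/// Hypothetical helper mirroring the documented formula
/// `(page_width * 0.03).ceil() as usize`.
/// Assumption: `page_width` is an `f64`; the real parameter type may differ.
fn gap_threshold(page_width: f64) -> usize {
    (page_width * 0.03).ceil() as usize
}

/// Hypothetical helper: a zero run counts as a gap once its length
/// reaches the threshold (the `>=` comparison described in this note).
fn is_gap(zero_run_len: usize, page_width: f64) -> bool {
    zero_run_len >= gap_threshold(page_width)
}
```

With `page_width = 600.0` the threshold is 18, so `is_gap(8, 600.0)` is false and `is_gap(20, 600.0)` is true. A run of exactly 18 also qualifies and a run of 17 does not, which is the boundary the `threshold_exact` and `threshold_minus_one` tests are named after.
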
## Files verified

- `crates/pdftract-core/src/layout/columns.rs` - implementation (lines 89-201)
- `crates/pdftract-core/src/layout/mod.rs` - exports `ColumnGap`, `detect_column_gaps`