pdftract/crates/pdftract-core/src/layout
jedarden fda17d4d77 feat(pdftract-2rkc1): implement column confirmation with >= 3 line threshold
Implement the confirm_columns function, which partitions the page into
candidate columns (the regions between consecutive gaps, plus the regions
before the first gap and after the last), counts the unique lines whose
first span's x0 falls within each candidate's x-range, and promotes
candidates with line_count >= 3 to confirmed columns.
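
The confirmation step described above can be sketched roughly as follows.
This is an illustration only, not the code in columns.rs: the Line type,
the page_x parameter, the MIN_LINES constant, the half-open range test and
the function signature are all assumptions made for the example. Only the
names confirm_columns, ColumnGap, CandidateColumn, x_range and line_count
come from the commit message.

```rust
/// Hypothetical stand-in for a text line: only the left edge (x0) of its
/// first span matters for column confirmation.
#[derive(Debug, Clone, Copy)]
struct Line {
    first_span_x0: f32,
}

/// A horizontal region with no text coverage, bounded by lo and hi.
#[derive(Debug, Clone, Copy)]
struct ColumnGap {
    lo: f32,
    hi: f32,
}

/// A region between gaps, with the number of lines that start inside it.
#[derive(Debug, Clone, Copy, PartialEq)]
struct CandidateColumn {
    x_range: (f32, f32),
    line_count: usize,
}

/// Threshold from the commit message: a candidate needs >= 3 lines.
const MIN_LINES: usize = 3;

/// Assumed signature: page_x is the page's (left, right) extent.
fn confirm_columns(
    page_x: (f32, f32),
    gaps: &[ColumnGap],
    lines: &[Line],
) -> Vec<CandidateColumn> {
    // Process gaps left to right so consecutive gaps bound one candidate.
    let mut gaps = gaps.to_vec();
    gaps.sort_by(|a, b| a.lo.total_cmp(&b.lo));

    // Candidates: before the first gap, between gaps, after the last gap.
    let mut candidates = Vec::with_capacity(gaps.len() + 1);
    let mut left = page_x.0;
    for gap in &gaps {
        candidates.push(CandidateColumn { x_range: (left, gap.lo), line_count: 0 });
        left = gap.hi;
    }
    candidates.push(CandidateColumn { x_range: (left, page_x.1), line_count: 0 });

    // Each line is counted at most once, in the first candidate whose
    // half-open range [lo, hi) contains the x0 of its first span.
    for line in lines {
        let x0 = line.first_span_x0;
        if let Some(c) = candidates
            .iter_mut()
            .find(|c| x0 >= c.x_range.0 && x0 < c.x_range.1)
        {
            c.line_count += 1;
        }
    }

    // Promote only candidates that reach the line threshold.
    candidates.retain(|c| c.line_count >= MIN_LINES);
    candidates
}
```

With one gap at 290..310 on a page spanning 0..600, three lines starting at
x0 = 50 and two starting at x0 = 310 yield a single confirmed column with
x_range (0, 290); the right-hand candidate is dropped until a third line
starts inside it.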

Supporting code:
- ColumnGap struct with lo/hi bounds, width(), midpoint()
- detect_column_gaps function for zero-coverage region detection
- HasFirstSpan trait for first span bbox access
- CandidateColumn struct for tracking x_range and line_count
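
As a rough sketch of the gap-detection helper listed above: the version
below sorts span x-intervals, sweeps them left to right, and reports every
uncovered stretch at least min_width wide. The input type (plain (x0, x1)
tuples), the min_width parameter and the signature are assumptions for
illustration; the actual detect_column_gaps in columns.rs may differ. Only
the names ColumnGap, lo, hi, width, midpoint and detect_column_gaps come
from the commit message.

```rust
/// A zero-coverage region between lo and hi on the x axis.
#[derive(Debug, Clone, Copy, PartialEq)]
struct ColumnGap {
    lo: f32,
    hi: f32,
}

impl ColumnGap {
    /// Horizontal extent of the gap.
    fn width(&self) -> f32 {
        self.hi - self.lo
    }

    /// Center of the gap, usable as a split position.
    fn midpoint(&self) -> f32 {
        (self.lo + self.hi) / 2.0
    }
}

/// Assumed signature: spans are (x0, x1) intervals; min_width filters out
/// narrow gaps such as ordinary word spacing.
fn detect_column_gaps(spans: &[(f32, f32)], min_width: f32) -> Vec<ColumnGap> {
    let mut sorted = spans.to_vec();
    sorted.sort_by(|a, b| a.0.total_cmp(&b.0));

    let mut gaps = Vec::new();
    let mut iter = sorted.into_iter();

    // No spans means no coverage information, so report no gaps.
    let Some((_, mut covered_to)) = iter.next() else {
        return gaps;
    };

    for (x0, x1) in iter {
        // Anything between the covered prefix and the next span is empty.
        if x0 - covered_to >= min_width {
            gaps.push(ColumnGap { lo: covered_to, hi: x0 });
        }
        // Extend the covered prefix; overlapping spans never shrink it.
        covered_to = covered_to.max(x1);
    }
    gaps
}
```

For spans (50, 280), (60, 290), (310, 550) and (320, 540) with a min_width
of 10, this sketch reports one gap with lo = 290 and hi = 310, so width()
is 20 and midpoint() is 300.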

All 49 column tests pass, including all acceptance criteria.

Bead: pdftract-2rkc1

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-27 23:09:01 -04:00
caption.rs feat(pdftract-3s2i): implement Phase 5.5.2 validation filter 2026-05-24 04:57:17 -04:00
code.rs feat(pdftract-8n270): implement code block detection 2026-05-24 10:04:22 -04:00
columns.rs feat(pdftract-2rkc1): implement column confirmation with >= 3 line threshold 2026-05-27 23:09:01 -04:00
correction.rs feat(pdftract-1vrxg): implement word-break normalization 2026-05-27 22:55:57 -04:00
line.rs feat(pdftract-6bwq4): implement baseline clustering algorithm 2026-05-24 10:39:01 -04:00
mod.rs feat(pdftract-4md5z): implement XY-cut recursive reading order algorithm 2026-05-26 18:37:31 -04:00
readability.rs fix(pdftract-tuky): fix color clamping test and verify Phase 3.1 coordinator 2026-05-26 16:36:01 -04:00
reading_order.rs feat(pdftract-4md5z): implement XY-cut recursive reading order algorithm 2026-05-26 18:37:31 -04:00
wordlist.rs fix: resolve compilation errors across codebase 2026-05-25 08:38:04 -04:00