Implement header row detection for tables using two signals:
1. Bold font detection (fully implemented)
2. StructTree TH detection (stub pending MCID tracking)
Bold detection:
- is_bold_font(): detects bold fonts from PostScript name patterns
- is_cell_bold(): checks if all non-whitespace content in a cell is bold
- is_bold_header_row(): validates rows with >=2 bold cells
- count_header_rows(): counts contiguous bold header rows, starting from the top row
- Cell::mark_header_rows(): sets is_header_row flag on cells
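
A minimal sketch of how the bold helpers listed above could fit together; it is not the code in this commit. The `Span` and `Cell` shapes, the `cell` constructor, the case-insensitive substring match on "bold", and the rule that empty cells are skipped when validating a row are all assumptions made for illustration.

```rust
/// Hypothetical text run: the PostScript font name plus the text drawn with it.
struct Span {
    font_name: String,
    text: String,
}

/// Hypothetical cell: just a list of text runs.
struct Cell {
    spans: Vec<Span>,
}

/// Convenience constructor for a one-span cell (illustration only).
fn cell(font_name: &str, text: &str) -> Cell {
    Cell {
        spans: vec![Span {
            font_name: font_name.to_string(),
            text: text.to_string(),
        }],
    }
}

/// Assumed pattern: the name, minus any subset prefix such as "ABCDEF+",
/// contains "bold" in any letter case. Note this also matches "Semibold".
fn is_bold_font(postscript_name: &str) -> bool {
    let base = postscript_name.rsplit('+').next().unwrap_or(postscript_name);
    base.to_ascii_lowercase().contains("bold")
}

/// True when the cell has some non-whitespace text and all of it is bold.
fn is_cell_bold(cell: &Cell) -> bool {
    let mut saw_text = false;
    for span in &cell.spans {
        if span.text.trim().is_empty() {
            continue; // whitespace-only runs do not count either way
        }
        if !is_bold_font(&span.font_name) {
            return false;
        }
        saw_text = true;
    }
    saw_text
}

/// Assumed rule: empty cells are ignored, every remaining cell must be bold,
/// and at least two bold cells are required (so single-cell rows are excluded).
fn is_bold_header_row(row: &[Cell]) -> bool {
    let mut bold_cells = 0;
    for cell in row {
        let is_empty = cell.spans.iter().all(|s| s.text.trim().is_empty());
        if is_empty {
            continue;
        }
        if !is_cell_bold(cell) {
            return false;
        }
        bold_cells += 1;
    }
    bold_cells >= 2
}

/// Counts header rows from the top and stops at the first non-header row.
fn count_header_rows(rows: &[Vec<Cell>]) -> usize {
    rows.iter()
        .take_while(|row| is_bold_header_row(row.as_slice()))
        .count()
}
```

Under these assumptions, a bold row that appears below a body row (for example a bold "Total" row) is not counted, because counting stops at the first non-header row.
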
TH detection (stub):
- is_th_header_row(): placeholder for StructTree TH detection
  - Requires MCID tracking on TableSpan (future work)
  - Will use ParentTree to map MCIDs to StructElems
  - Will verify TR > TH chain structure
Combined detection:
- is_header_row(): combines bold and TH signals
- When the two signals disagree, the bold result wins, per the body data design principle
Documentation:
- Updated table-structure-reconstruction.md with full header detection spec
- Documented implemented vs pending signals
- Added implementation notes for TH detection
Tests:
- 45 tests covering all bold detection scenarios
- Tests for multi-row headers (contiguous from top)
- Tests for single-cell row exclusion
- Tests for empty/whitespace cell handling
- Placeholder tests for TH detection
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>