test(pdftract-1q4ku): add acceptance criteria tests for score_span_readability
The score_span_readability function was already fully implemented
in readability.rs. This commit adds comprehensive tests for the
acceptance criteria of bead pdftract-1q4ku:
- AC1: All-printable English text with high dict coverage -> score > 0.9
- AC2: All-U+FFFD span -> score significantly reduced (< 0.7)
- AC3: All-whitespace -> whitespace_score=0 (binary penalty)
- AC4: Low confidence -> scaled by confidence_floor
- AC5: Non-English -> dict_coverage forced to 1.0
- AC6: Ligature split -> ligature_integrity = 0 lowers score
Also adds tests verifying:
- Empty span returns 0.0
- Confidence threshold (conf >= 0.6 -> confidence_floor = 1.0)
- Whitespace bounds [0.05, 0.40]
- Printable fraction calculation
- Dict coverage enabled/disabled behavior
- Non-English lang tag handling (en, en-US, zh, None)
All tests pass. The implementation computes the score as a weighted sum of:
- 0.35 * printable_fraction
- 0.30 * dict_coverage (disabled for non-English)
- 0.15 * whitespace_score (binary in/out bounds)
- 0.10 * ligature_integrity (binary split detection)
- 0.10 * confidence_floor (min(1.0, conf/0.6))
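For reviewers, the weighted terms listed above can be sketched roughly as
follows. This is an illustrative sketch, not the code in readability.rs:
the Components struct, the helper names, the use of f64, and the inclusive
whitespace bounds are all assumptions made for the example.

```rust
/// Hypothetical container for the five component scores, each in [0.0, 1.0].
/// The real types in readability.rs may differ.
#[derive(Debug, Clone, Copy)]
struct Components {
    printable_fraction: f64,
    dict_coverage: f64,
    whitespace_score: f64,
    ligature_integrity: f64,
    confidence_floor: f64,
}

/// min(1.0, conf / 0.6): any confidence at or above 0.6 saturates at 1.0.
fn confidence_floor(conf: f64) -> f64 {
    (conf / 0.6).min(1.0)
}

/// Binary whitespace score: 1.0 when the whitespace fraction lies inside
/// [0.05, 0.40], otherwise 0.0. Inclusive bounds are an assumption.
fn whitespace_score(ws_fraction: f64) -> f64 {
    if (0.05..=0.40).contains(&ws_fraction) {
        1.0
    } else {
        0.0
    }
}

/// Weighted sum using the weights from the commit message (they total 1.0).
fn weighted_score(c: Components) -> f64 {
    0.35 * c.printable_fraction
        + 0.30 * c.dict_coverage
        + 0.15 * c.whitespace_score
        + 0.10 * c.ligature_integrity
        + 0.10 * c.confidence_floor
}
```

Under this sketch, a span with printable_fraction = 0 can score at most
0.30 + 0.15 + 0.10 + 0.10 = 0.65, which would be consistent with the
AC2 bound of < 0.7 if U+FFFD is counted as non-printable (an assumption).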
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>