Added three new tests to verify the deskew acceptance criteria:
- test_deskew_2_degree_skew: Verifies 2-degree skew is deskewed within 0.1 deg
- test_deskew_0_2_degree_skew_skipped: Verifies 0.2-degree skew is skipped
- test_deskew_20_degree_skew_out_of_range: Verifies out-of-range diagnostic

Helper function create_skewed_text_lines() creates synthetic test images with known skew angles using small-angle trigonometric approximations.

Note: Tests compile but cannot run without leptonica library (NixOS limitation).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
pdftract-3wku: Deskew via pixDeskew (Hough transform)
Summary
Implemented the deskew preprocessing step using leptonica's pixFindSkewAndDeskew function. The implementation detects the dominant text angle using a Hough line transform and rotates the image if the angle is >= 0.3 degrees.
Changes Made
1. Added leptonica-plumbing dependency
- File: crates/pdftract-core/Cargo.toml
- Change: Added leptonica-plumbing = { version = "1.4", optional = true }
- Feature gate: Added to ocr feature: ocr = ["dep:image", "dep:leptonica-plumbing"]
2. Created preprocess module
- File: crates/pdftract-core/src/preprocess.rs (new)
- Functions:
  - deskew(image: &GrayImage) -> Result<(GrayImage, f64, Vec<Diagnostic>)>: Main deskew function
  - grayimage_to_pix(image: &GrayImage) -> Result<*mut Pix>: Convert GrayImage to leptonica Pix
  - pix_to_grayimage(pix: *mut Pix) -> Result<GrayImage>: Convert leptonica Pix to GrayImage
- Constants:
  - DESKEW_THRESHOLD_DEG: f64 = 0.3: Minimum angle for deskewing
  - DESKEW_MAX_RANGE_DEG: f64 = 15.0: Maximum detection range
3. Added diagnostic code
- File: crates/pdftract-core/src/diagnostics.rs
- Code: ImgDeskewOutOfRange
- Usage: Emitted when detected skew angle exceeds +/- 15 degrees
4. Exposed module
- File: crates/pdftract-core/src/lib.rs
- Change: Added #[cfg(feature = "ocr")] pub mod preprocess;
5. Added acceptance criteria tests (2026-05-23)
- File: crates/pdftract-core/src/preprocess.rs (test module)
- New tests:
  - test_deskew_2_degree_skew: Verifies 2-degree skew is deskewed within 0.1 deg
  - test_deskew_0_2_degree_skew_skipped: Verifies 0.2-degree skew is skipped (unchanged)
  - test_deskew_20_degree_skew_out_of_range: Verifies 20-degree skew emits IMG_DESKEW_OUT_OF_RANGE diagnostic
- Helper functions:
  - create_skewed_text_lines(): Creates synthetic test images with known skew angles
  - verify_deskewed(): Verifies an image is properly deskewed via double-pass check
Implementation Details
The deskew() function:
- Converts the input GrayImage to a leptonica Pix (8-bit grayscale)
- Calls pixFindSkewAndDeskew to detect and correct skew in one operation
- Returns the original image unchanged if angle < 0.3 degrees (negligible skew)
- Emits IMG_DESKEW_OUT_OF_RANGE diagnostic if angle > 15 degrees (out of detection range)
- Returns tuple of (deskewed_image, detected_angle_deg, diagnostics)
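
The threshold and range checks listed above can be sketched as a pure function with no leptonica dependency. This is a minimal illustration, not the code in preprocess.rs: the SkewAction enum and classify_skew name are hypothetical, and comparing the absolute value of the angle is an assumption based on the "+/- 15 degrees" wording. The note does not say whether an out-of-range image is still rotated, so the sketch only reports the category.

```rust
/// Thresholds as named in this note.
const DESKEW_THRESHOLD_DEG: f64 = 0.3;
const DESKEW_MAX_RANGE_DEG: f64 = 15.0;

/// Hypothetical outcome type, introduced only for this illustration.
#[derive(Debug, PartialEq)]
enum SkewAction {
    /// Magnitude below 0.3 deg: return the input unchanged.
    Skip,
    /// Magnitude from 0.3 deg up to 15 deg: keep the rotated image.
    Rotate,
    /// Magnitude above 15 deg: emit IMG_DESKEW_OUT_OF_RANGE.
    OutOfRange,
}

/// Classify a detected skew angle (degrees, either sign).
/// Assumption: the real function compares the magnitude of the angle.
fn classify_skew(angle_deg: f64) -> SkewAction {
    let magnitude = angle_deg.abs();
    if magnitude > DESKEW_MAX_RANGE_DEG {
        SkewAction::OutOfRange
    } else if magnitude >= DESKEW_THRESHOLD_DEG {
        SkewAction::Rotate
    } else {
        SkewAction::Skip
    }
}
```

Keeping the decision separate from the FFI call would let the three acceptance angles (2, 0.2, and 20 degrees) be checked even on a machine without leptonica.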
The function uses pixFindSkewAndDeskew instead of separate pixFindSkew + pixRotate because:
- It's more efficient (one FFI call instead of two)
- It returns both the deskewed image and the detected angle
- The angle is needed for quality tracking/debugging
Acceptance Criteria
| Criterion | Status | Notes |
|---|---|---|
| 2-deg synthetic skewed fixture: deskewed within 0.1 deg | TEST ADDED | test_deskew_2_degree_skew creates synthetic 2° skewed image, verifies deskewing produces < 0.1° residual skew |
| 0.2-deg skewed fixture: untouched | TEST ADDED | test_deskew_0_2_degree_skew_skipped verifies sub-threshold angles return original unchanged |
| 20-deg skewed fixture: IMG_DESKEW_OUT_OF_RANGE diagnostic | TEST ADDED | test_deskew_20_degree_skew_out_of_range verifies diagnostic emitted for out-of-range angles |
| WER (word error rate) on standard deskew fixture: deskew + OCR < deskew-disabled + OCR | WARN | Requires OCR integration and test fixtures; deferred to a later phase |
Infrastructure Notes
WARN: Tests cannot run on this machine due to missing leptonica library. The system is NixOS-based and leptonica is not available in the current environment. This is a known infrastructure limitation documented in CLAUDE.md.
The tests have not been executed, so correctness rests on code review alone. The review confirmed the following:
- Uses leptonica-plumbing's pixFindSkewAndDeskew as specified
- Implements the 0.3 deg threshold correctly
- Emits the required diagnostic for out-of-range angles
- Returns the detected angle for quality tracking
- Properly manages leptonica Pix memory (pixDestroy on drop)
- Tests compile and are ready to run once leptonica is available
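
The "pixDestroy on drop" point can be illustrated with a small ownership wrapper. The sketch below uses mock functions in place of leptonica so it runs without the library: FakePix, fake_pix_create, fake_pix_destroy, and OwnedPix are all hypothetical names, and only the shape of the destroy call (it takes a pointer to the handle and nulls it, as pixDestroy does in C) mirrors the real API. How preprocess.rs actually wraps the pointer is not shown in this note.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Stand-in for leptonica's opaque Pix struct (hypothetical).
struct FakePix {
    _data: Vec<u8>,
}

/// Counts destroy calls so the behaviour can be observed.
static DESTROYED: AtomicUsize = AtomicUsize::new(0);

/// Mock allocator standing in for a leptonica constructor.
fn fake_pix_create(len: usize) -> *mut FakePix {
    Box::into_raw(Box::new(FakePix { _data: vec![0; len] }))
}

/// Mock with the same shape as pixDestroy: frees the object and nulls the handle.
fn fake_pix_destroy(handle: &mut *mut FakePix) {
    if !(*handle).is_null() {
        // SAFETY: the pointer came from Box::into_raw in fake_pix_create
        // and is nulled right after, so it is freed at most once.
        unsafe { drop(Box::from_raw(*handle)) };
        *handle = std::ptr::null_mut();
        DESTROYED.fetch_add(1, Ordering::SeqCst);
    }
}

/// Owning wrapper: the raw handle is released once, when the value is dropped.
struct OwnedPix {
    raw: *mut FakePix,
}

impl OwnedPix {
    fn new(len: usize) -> Self {
        OwnedPix { raw: fake_pix_create(len) }
    }
}

impl Drop for OwnedPix {
    fn drop(&mut self) {
        fake_pix_destroy(&mut self.raw);
    }
}
```

With this pattern, early returns through `?` in a function like deskew() still release the image, because the wrapper is dropped on every exit path.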
Test Implementation Details
The new tests use synthetic test images created programmatically:
- create_skewed_text_lines() draws horizontal text-like lines at a specified angle
- Uses small-angle trigonometric approximations to avoid external math library dependencies
- The 2-degree test verifies deskewing by running deskew twice and checking the second pass detects near-zero skew
- The 0.2-degree test verifies the skip branch by checking the angle is exactly 0.0 (returned unchanged)
- The 20-degree test verifies the out-of-range diagnostic is emitted
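
As a rough illustration of the small-angle approach, the sketch below draws dark bars on a white page and shifts each column vertically by x times the angle in radians, using tan(theta) ≈ theta. It is not the real create_skewed_text_lines(): the function name, the raw Vec<u8> buffer (instead of GrayImage), the line spacing, the stroke thickness, and the sign convention are all assumptions made for this example.

```rust
/// Render dark "text lines" on a white page, tilted by `skew_deg`.
/// Returns a row-major 8-bit grayscale buffer of `width * height` bytes.
/// Hypothetical stand-in for create_skewed_text_lines().
fn skewed_lines(width: usize, height: usize, skew_deg: f64) -> Vec<u8> {
    let mut pixels = vec![255u8; width * height];
    // Small-angle approximation: tan(theta) ~= theta, with theta in radians.
    let slope = skew_deg * std::f64::consts::PI / 180.0;
    let line_spacing = 20; // assumed distance between baselines, in pixels
    let line_thickness = 3; // assumed stroke height, in pixels
    let cx = width as f64 / 2.0;

    let mut baseline = line_spacing;
    while baseline < height {
        for x in 0..width {
            // Vertical offset grows linearly with distance from the page centre.
            let dy = ((x as f64 - cx) * slope).round() as isize;
            for t in 0..line_thickness {
                let y = baseline as isize + dy + t as isize;
                if y >= 0 && (y as usize) < height {
                    pixels[y as usize * width + x] = 0;
                }
            }
        }
        baseline += line_spacing;
    }
    pixels
}
```

The approximation is very close at 2 degrees but underestimates the slope by about 4% at 20 degrees (tan 20° ≈ 0.364 versus 0.349 rad), which is harmless for a test that only needs the angle to land outside the 15-degree range.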
Future Work
- Per-page quality tracking: The deskew angle is returned but not yet recorded in extraction_quality.deskew_angle_deg. This requires adding a per-page quality struct to the extraction pipeline.
- WER benchmark: Compare OCR accuracy with/without deskewing once the OCR pipeline is integrated.
- Leptonica test environment: Set up a CI environment with leptonica available to run these tests automatically.
Commits
- Hash: 5ef9ef7 - Initial implementation
- Hash: pending - Added acceptance criteria tests