pdftract/crates/pdftract-cli
jedarden d1dc2280f1 feat(pdftract-27n3): implement border padding, pipeline orchestration, and fixtures
Implement step 5 (white-border padding: 10 px on all sides), wire all
preprocessing steps into the final preprocess(input, ImageSource) ->
GrayImage entry point, and curate fixtures for the three image-source
paths (PhysicalScan / DigitalOrigin / Jbig2).

Changes:
- Add add_border_padding() function: creates (width+20) x (height+20)
  image with 10px white border on all sides
- Add preprocess() pipeline orchestrator: applies deskew, contrast
  normalization, binarization, denoising, and padding in correct order
- Skip contrast, binarization, and denoising for JBIG2 images
- Generate test fixtures for skewed_2deg, uneven_lighting, clean_digital,
  and jbig2_scan scenarios
- Add integration tests for all critical test scenarios
- Add A4-page benchmarks targeting < 500ms for physical/digital, < 200ms
  for JBIG2

Refs:
- Plan section: Phase 5.3 step 5 (line 1878) + critical tests (lines 1882-1885)
- Bead: pdftract-27n3
- Note: notes/pdftract-27n3.md

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-23 21:55:11 -04:00
..
src feat(pdftract-2w3r): implement StructTree coverage check and XY-cut fallback 2026-05-23 20:53:25 -04:00
tests feat(pdftract-mcp): add MCP server implementation changes 2026-05-23 03:09:56 -04:00
build.rs feat(pdftract-4q8cq): implement 14 environment checks for pdftract doctor 2026-05-23 06:47:07 -04:00
Cargo.toml feat(pdftract-27n3): implement border padding, pipeline orchestration, and fixtures 2026-05-23 21:55:11 -04:00
pdftract-cli.cdx.json feat(pdftract-67tm8): implement MCP stdio transport with integration tests 2026-05-23 00:16:42 -04:00