pdftract/crates/pdftract-core/src/ocr/preprocessing/mod.rs
jedarden ff82fdce90 feat(pdftract-5xyjv): implement 3x3 median-filter denoising for OCR preprocessing
- Add median_denoise() function using imageproc::filter::median_filter
- 3x3 kernel (radius 1,1) removes salt-and-pepper noise while preserving edges
- Comprehensive tests: noise removal, edge preservation, binary output
- Export median_denoise from ocr::preprocessing module

Closes: pdftract-5xyjv
2026-05-24 16:09:08 -04:00

//! OCR preprocessing operations (Phase 5.3).
//!
//! This module provides image preprocessing functions that prepare scanned
//! pages for OCR. Operations include contrast normalization, binarization,
//! and noise reduction.
pub mod contrast;
pub mod denoise;
pub use contrast::{histogram_stretch, histogram_stretch_if_needed, PreprocError};
pub use denoise::median_denoise;
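
The commit notes above say `median_denoise` wraps `imageproc::filter::median_filter` with a 3x3 kernel (radius 1 in each direction). As a rough illustration of what such a filter computes, here is a minimal standard-library-only sketch. The function name `median3x3`, the row-major `&[u8]` grayscale layout, and the clamp-to-edge border handling are all assumptions made for this example; it is not the code in `denoise.rs`, and `imageproc` may treat borders differently.

```rust
/// Hypothetical stand-in for a 3x3 median filter (not the pdftract or
/// imageproc implementation). `pixels` is assumed to be a row-major,
/// 8-bit grayscale buffer of size `width * height`.
fn median3x3(pixels: &[u8], width: usize, height: usize) -> Vec<u8> {
    assert_eq!(pixels.len(), width * height, "buffer size must match dimensions");
    let mut out = vec![0u8; pixels.len()];

    for y in 0..height {
        for x in 0..width {
            // Gather the 3x3 neighbourhood around (x, y).
            let mut window = [0u8; 9];
            let mut i = 0;
            for dy in -1i64..=1 {
                for dx in -1i64..=1 {
                    // Assumption: clamp coordinates so border pixels reuse
                    // their nearest in-bounds neighbour.
                    let sy = (y as i64 + dy).clamp(0, height as i64 - 1) as usize;
                    let sx = (x as i64 + dx).clamp(0, width as i64 - 1) as usize;
                    window[i] = pixels[sy * width + sx];
                    i += 1;
                }
            }
            // The median of nine values is the fifth once sorted.
            window.sort_unstable();
            out[y * width + x] = window[4];
        }
    }
    out
}
```

Because an isolated outlier can never be the middle of nine sorted values, a single black or white speck is replaced by its surroundings, while a straight step edge keeps a majority of same-side pixels in every window and so stays in place. For a binary (0/255) input the median is always one of the input values, so the output remains binary.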