Commit graph

1 commit

Author SHA1 Message Date
jedarden
ca1582a839 feat(pdftract-47vu): implement pHash for glyph shape recognition
Implement phash_glyph(bitmap: &[u8; 1024]) -> u64 that computes
a 64-bit perceptual hash for 32×32 grayscale glyph bitmaps.

Algorithm:
1. Normalize pixel values to [-1.0, +1.0]
2. Apply 32×32 2D DCT-II (hand-rolled, precomputed basis)
3. Extract 64 low-frequency AC coefficients (8×8 block, DC excluded)
4. Threshold against median to produce 64-bit hash

Key features:
- Special case for uniform bitmaps (returns 0 deterministically)
- Deterministic across platforms (no NaN, stable float ordering)
- hamming_distance helper for hash comparison

Closes: pdftract-47vu
2026-05-24 04:20:55 -04:00