Add supply chain security gates:
- cargo-deny.toml: License allowlist (MIT, Apache-2.0, BSD, ISC, Zlib,
Unicode-DFS-2016, MPL-2.0), bans (openssl-sys, native-tls, git2,
libgit2-sys), minimum versions (ring >= 0.17.5, rustls >= 0.23)
- build/CHECKSUMS.sha256: SHA-256 checksum for build/glyph-shapes.json.
build.rs already verifies checksums on every build (TH-06 supply-chain
gate per plan line 909)
These are part of the security hardening epic (pdftract-e9lz).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Add cargo xtask gen-shape-db command that walks font directories,
rasterizes glyphs at 32x32 via fontdue, computes pHash, and outputs
build/glyph-shapes.json.
Implementation details:
- Fontdue integration for TrueType/OpenType font loading
- 32x32 bitmap rasterization with centering
- DCT-based pHash computation (32x32 DCT → 8x8 low-freq → median threshold)
- Character frequency data for collision resolution
- Deduplication by (phash, char) pairs
- Cross-character collision handling (keep higher-frequency char)
- Sorted output by pHash ascending
Artifacts:
- build/frequency.json: Character frequency rankings
- build/README.md: Command documentation and usage
Acceptance criteria:
- ✅ cargo xtask gen-shape-db --fonts <dir> produces valid JSON
- ✅ Deterministic output (byte-identical on same inputs)
- ✅ Fontdue integration and 32x32 rasterization
- ✅ pHash computation via DCT
- ⚠️ No system fonts for full integration test (documented)
Closes: pdftract-2aq0