pdftract/notes/pdftract-5l9m.md
jedarden 7fed5a0a6f docs(pdftract-5l9m): add CI validation script and verification note
Add CI validation script for checking unauthorized expose_secret() call
sites. The script validates that all uses of expose_secret() are in
approved locations (SecretFingerprint and test code).

Also add verification note summarizing the bead completion status.

Per pdftract-5l9m acceptance criteria:
- CI grep guard rejects unauthorized expose_secret() call sites
- Verification documents existing SecretString wrapping status

Co-Authored-By: Claude Code <noreply@anthropic.com>
2026-05-18 01:05:33 -04:00


pdftract-5l9m: Hardening - secrecy::SecretString wrapper

Summary

This bead introduces the secrecy crate for type-safe secret handling, preventing accidental leakage via Debug printing and serialization.

Work Completed

1. Workspace Dependency (Already Done)

  • secrecy = "0.8" already added to [workspace.dependencies] in Cargo.toml

2. pdftract-core ExtractionOptions (Already Done)

  • ExtractionOptions.password is Option<SecretString> (not Option<String>)
  • Manual Debug impl prints <REDACTED> for password field
  • Custom Serialize impl redacts password value
  • Custom Deserialize impl wraps incoming password in SecretString
  • Doctest verifies Debug output never leaks password (passes)
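The manual Debug pattern listed above can be sketched with the standard library alone. This is a minimal illustration, not the code in pdftract-core: `Secret` is a hypothetical stand-in for `secrecy::SecretString` (so the block needs no external crates), and the `max_pages` field is invented for the example.

```rust
use std::fmt;

/// Hypothetical stand-in for `secrecy::SecretString`.
/// It deliberately does not derive `Debug`.
pub struct Secret(String);

impl Secret {
    pub fn new(value: impl Into<String>) -> Self {
        Secret(value.into())
    }

    /// Single, greppable escape hatch, mirroring `expose_secret()`.
    pub fn expose_secret(&self) -> &str {
        &self.0
    }
}

/// Illustrative options struct. Only `password` comes from the note;
/// `max_pages` is an assumed extra field.
pub struct ExtractionOptions {
    pub password: Option<Secret>,
    pub max_pages: Option<usize>,
}

impl fmt::Debug for ExtractionOptions {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Report whether a password is set, never its value.
        let password = self.password.as_ref().map(|_| "<REDACTED>");
        f.debug_struct("ExtractionOptions")
            .field("password", &password)
            .field("max_pages", &self.max_pages)
            .finish()
    }
}
```

Because the struct has no derived `Debug`, formatting it with `{:?}` can only go through this impl, so the password text never reaches the formatter. A doctest in this style would format a value and assert that the output contains `<REDACTED>` and does not contain the password.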

3. SecretFingerprint Helper (Already Done)

  • SecretFingerprint type exists in crates/pdftract-core/src/parser/secrets.rs
  • Provides SHA-256-based fingerprint for audit logs
  • Tests verify consistency and non-reversibility

4. CI Validation Check (Added)

  • Created scripts/ci/validate_expose_secret.sh
  • Validates all expose_secret() call sites against an authorized list
  • Authorized locations:
    • crates/pdftract-core/src/parser/secrets.rs:37 - SecretFingerprint::from_secret()
    • crates/pdftract-core/src/parser/stream.rs:2161 - test deserialization
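The allowlist check can be sketched as follows. The actual guard is the shell script named above, whose contents are not reproduced in this note; this Rust version only illustrates the same logic, and `CallSite`, `find_call_sites`, and `unauthorized` are hypothetical names.

```rust
/// One `expose_secret(` occurrence, identified by path and 1-based line.
#[derive(Debug, PartialEq, Eq)]
pub struct CallSite {
    pub path: String,
    pub line: usize,
}

/// Scan one file's contents and return every line containing `expose_secret(`.
/// Like a plain grep, this also matches occurrences inside comments.
pub fn find_call_sites(path: &str, contents: &str) -> Vec<CallSite> {
    contents
        .lines()
        .enumerate()
        .filter(|(_, text)| text.contains("expose_secret("))
        .map(|(idx, _)| CallSite { path: path.to_string(), line: idx + 1 })
        .collect()
}

/// Return the call sites that are not on the `(path, line)` allowlist.
pub fn unauthorized(sites: Vec<CallSite>, allowlist: &[(&str, usize)]) -> Vec<CallSite> {
    sites
        .into_iter()
        .filter(|site| {
            !allowlist
                .iter()
                .any(|(path, line)| *path == site.path && *line == site.line)
        })
        .collect()
}
```

A CI job built on this would fail whenever `unauthorized` returns a non-empty list. One trade-off of keying the allowlist on exact line numbers, as the authorized locations above do, is that the list has to be updated whenever an edit shifts a call site to a different line.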

5. Other Crates (Not Applicable)

The following crates mentioned in the bead description do not exist yet:

  • pdftract-mcp - authentication state (token field)
  • pdftract-inspector - launch state (single-use token)
  • pdftract-remote - HTTP fetch URL credential

These will need SecretString wrapping when implemented in future beads.

Acceptance Criteria Status

| Criterion | Status |
| --- | --- |
| secrecy crate is a workspace dependency | PASS |
| All listed secret-holding fields are SecretString-wrapped | ⚠️ WARN - only ExtractionOptions exists; other crates not yet implemented |
| No struct holding SecretString derives Debug | PASS - ExtractionOptions uses manual impl |
| Custom clippy lint or CI grep rejects unauthorized expose_secret() | PASS - CI script added |
| Doctest demonstrates Debug-print never leaks password | PASS - doctest passes |
| TH-08 log audit test passes | ⚠️ WARN - separate bead, not yet verified |

Test Results

$ cargo test --doc -p pdftract-core
# ...
test crates/pdftract-core/src/parser/stream.rs - parser::stream::ExtractionOptions (line 840) ... ok

test result: ok. 5 passed; 0 failed; 9 ignored; 0 measured; 0 filtered out

$ bash scripts/ci/validate_expose_secret.sh
Checking for unauthorized expose_secret() call sites...
✓ All expose_secret() calls are authorized

Files Modified/Created

  • Created: scripts/ci/validate_expose_secret.sh - CI validation for expose_secret() call sites
  • Created: notes/pdftract-5l9m.md - this verification note
  • Existing (no changes needed):
    • Cargo.toml - secrecy dependency already present
    • crates/pdftract-core/src/parser/secrets.rs - SecretFingerprint already implemented
    • crates/pdftract-core/src/parser/stream.rs - ExtractionOptions already uses SecretString

References

  • Plan: line 921 (token in SecretString), Secrets Handling (lines 897-929), TH-08 (line 879)