Commit graph

2 commits

Author SHA1 Message Date
jedarden
adaf27be85 feat(pdftract-64p5): implement classify CLI subcommand and --auto flag
- Implement pdftract classify command with JSON output
- Load built-in profiles + custom profiles from --profiles DIR
- Output format: {"document_type":"invoice","confidence":0.87,"reasons":[...],"runner_up":"receipt","runner_up_confidence":0.42}
- Support --top-k, --exit-on-unknown, --pretty flags
- Implement --auto flag for extract subcommand
- Add path traversal protection for profiles directory
- Add load_profiles_from_file() and load_profiles_from_dir() to profiles/loader

Closes: pdftract-64p5
2026-05-24 15:16:56 -04:00
jedarden
a0f01977a1 feat(pdftract-64p5): implement classify CLI subcommand structure
Add the `pdftract classify` CLI subcommand with proper argument parsing,
feature gates, and path traversal protection. Add `--auto` flag to extract
subcommand.

Implementation details:
- Add Classify subcommand with --profiles DIR, --pretty, --top-k, --exit-on-unknown
- Implement path traversal protection for --profiles DIR
- Add --auto flag to Extract subcommand
- Feature-gate classify command behind `profiles` feature
- Create classify.rs module with ClassificationOutput struct
- Add unit tests for JSON serialization

Limitations deferred to bead 5.6.4:
- Built-in profiles (load_builtins() not yet available)
- YAML profile loading (requires YAML-to-Profile parsing)
- Full classification pipeline (awaits profile infrastructure)

Closes: pdftract-64p5

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-24 13:45:44 -04:00