pdftract/crates/pdftract-cli
jedarden 85acaa9b56 feat(pdftract-4a3je): implement multipart parsing with PDF magic-byte validation
- Add field-typing helpers (parse_bool, parse_float, parse_int, parse_comma_list)
- Add validate_pdf_magic_bytes() to check for %PDF- header
- Update ExtractParams to support: ocr_language, ocr_dpi, markdown_anchors
- Update receive_pdf() to use type-aware parsing and validate PDF bytes
- Update build_options() to map form fields to ExtractionOptions
- Add comprehensive unit tests for form helpers and build_options
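
For illustration, the magic-byte check named above could look roughly like the sketch below. Only the function name validate_pdf_magic_bytes and the %PDF- header come from this commit; the signature, the String error type, the error wording, and the strict offset-0 match are assumptions, not the crate's actual code.

```rust
/// The five bytes a PDF file is expected to start with.
const PDF_MAGIC: &[u8] = b"%PDF-";

/// Hypothetical signature: returns Ok(()) when `bytes` begins with `%PDF-`,
/// otherwise a message that a handler could place in a 400 response body.
pub fn validate_pdf_magic_bytes(bytes: &[u8]) -> Result<(), String> {
    if bytes.starts_with(PDF_MAGIC) {
        Ok(())
    } else {
        // Echo at most the first five bytes so the hint stays short even
        // for large or binary uploads.
        let head = &bytes[..bytes.len().min(PDF_MAGIC.len())];
        Err(format!(
            "upload does not look like a PDF: expected header {:?}, got {:?}",
            "%PDF-",
            String::from_utf8_lossy(head)
        ))
    }
}
```

This sketch rejects any upload whose first byte is not `%`. Some PDF readers tolerate leading junk before the header; whether receive_pdf() does is not stated here.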

Per plan section 2127-2137, implements optional form field parsing with:
- Forward-compatibility for unknown fields (warning logs, ignored)
- Clear 400 errors with hints on parse failure
- Typed coercion (bool from "true"/"1"; comma-list to Vec<String>)
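
A minimal sketch of the typed coercion described above. The helper names and the field names ocr_dpi and markdown_anchors appear in this commit; the signatures, the String error type, the acceptance of "false"/"0", the u32 integer type, and the trimming rules are assumptions made for the example.

```rust
/// Coerce a form value to bool. "true"/"1" are named in the commit message;
/// accepting "false"/"0" and ignoring case are assumptions of this sketch.
pub fn parse_bool(field: &str, raw: &str) -> Result<bool, String> {
    match raw.trim().to_ascii_lowercase().as_str() {
        "true" | "1" => Ok(true),
        "false" | "0" => Ok(false),
        other => Err(format!(
            "field `{field}`: expected true/false/1/0, got {other:?}"
        )),
    }
}

/// Coerce a form value to a non-negative integer (u32 is an assumed type).
pub fn parse_int(field: &str, raw: &str) -> Result<u32, String> {
    raw.trim().parse::<u32>().map_err(|e| {
        format!("field `{field}`: expected a non-negative integer, got {raw:?} ({e})")
    })
}

/// Split a comma-separated value into trimmed, non-empty items.
pub fn parse_comma_list(raw: &str) -> Vec<String> {
    raw.split(',')
        .map(str::trim)
        .filter(|item| !item.is_empty())
        .map(str::to_owned)
        .collect()
}
```

Each error string names the offending field and the accepted values, which is one way to provide the hint that a 400 response is meant to carry.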

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-27 20:19:10 -04:00
benches feat(pdftract-3h9xo): implement threads JSON output + schema integration 2026-05-25 13:40:15 -04:00
src feat(pdftract-4a3je): implement multipart parsing with PDF magic-byte validation 2026-05-27 20:19:10 -04:00
tests feat(pdftract-3b1mk): implement TH-09 inspector XSS test with CSP headers 2026-05-26 20:38:21 -04:00
build.rs docs(pdftract-32y9): finalize SDK architecture note with workspace layout, cross-compile matrix, and KU-12 alignment 2026-05-24 06:38:23 -04:00
Cargo.toml feat(pdftract-3b1mk): implement TH-09 inspector XSS test with CSP headers 2026-05-26 20:38:21 -04:00
pdftract-cli.cdx.json feat(pdftract-67tm8): implement MCP stdio transport with integration tests 2026-05-23 00:16:42 -04:00