pdftract/crates/pdftract-cli
jedarden 47df769e4b feat(pdftract-5ls35): implement JSON-Lines output sink for grep
Implement the --json output sink for pdftract grep with JSON-Lines
format (one match per line). Includes MatchEvent, FileOnlyEvent,
CountEvent structs and JsonSink line-buffered writer.

Key features:
- MatchEvent with all fields (path, page_index, bbox, match_text,
  span_text, span_confidence, pdf_fingerprint, crosses_spans)
- crosses_spans omitted when false via skip_serializing_if
- NaN/Infinity in span_confidence replaced with null
- page_index is 0-based (machine convention)
- FileOnlyEvent for -l mode, CountEvent for -c mode
- Line-buffered writes with immediate flush
- JSON schema at docs/schema/v1.0/grep-jsonl.schema.json

Closes: pdftract-5ls35
2026-05-25 02:05:17 -04:00
..
src feat(pdftract-5ls35): implement JSON-Lines output sink for grep 2026-05-25 02:05:17 -04:00
tests feat(pdftract-dtpwa): implement contract profile per Phase 7.10 schema 2026-05-24 07:10:32 -04:00
build.rs docs(pdftract-32y9): finalize SDK architecture note with workspace layout, cross-compile matrix, and KU-12 alignment 2026-05-24 06:38:23 -04:00
Cargo.toml feat(pdftract-1s2uj): add xref test fixture corpus and integration test runner 2026-05-24 08:20:04 -04:00
pdftract-cli.cdx.json feat(pdftract-67tm8): implement MCP stdio transport with integration tests 2026-05-23 00:16:42 -04:00