pdftract/docs/schema/v1.0
jedarden 47df769e4b feat(pdftract-5ls35): implement JSON-Lines output sink for grep
Implement the --json output sink for pdftract grep, emitting JSON Lines
(one match per line). Adds the MatchEvent, FileOnlyEvent, and CountEvent
structs and the line-buffered JsonSink writer.

Key features:
- MatchEvent with all fields (path, page_index, bbox, match_text,
  span_text, span_confidence, pdf_fingerprint, crosses_spans)
- crosses_spans omitted when false via skip_serializing_if
- NaN/Infinity in span_confidence replaced with null
- page_index is 0-based (machine convention)
- FileOnlyEvent for -l mode, CountEvent for -c mode
- Line-buffered writes: each event is written as one line and flushed immediately
- JSON schema at docs/schema/v1.0/grep-jsonl.schema.json
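
The serialization rules listed above can be illustrated with a minimal,
dependency-free sketch. It is not the code from this commit: the commit
mentions skip_serializing_if, which suggests serde, whereas this sketch
formats JSON by hand so it builds with the standard library alone. The
field types, the bbox layout, and the method names to_json_line, emit,
and into_inner are assumptions made for illustration.

```rust
use std::io::{self, Write};

/// Hypothetical mirror of MatchEvent. Field names come from the commit
/// message; every field type here is an assumption.
pub struct MatchEvent {
    pub path: String,
    pub page_index: usize, // 0-based, per the commit message
    pub bbox: [f64; 4],    // assumed layout: four coordinates
    pub match_text: String,
    pub span_text: String,
    pub span_confidence: f64,
    pub pdf_fingerprint: String,
    pub crosses_spans: bool,
}

/// Quote and escape a string as a JSON string literal.
fn json_string(s: &str) -> String {
    let mut out = String::with_capacity(s.len() + 2);
    out.push('"');
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out.push('"');
    out
}

/// JSON has no NaN or Infinity, so non-finite values become null.
fn json_number(x: f64) -> String {
    if x.is_finite() {
        format!("{}", x)
    } else {
        "null".to_string()
    }
}

impl MatchEvent {
    /// Render the event as one JSON object without a trailing newline.
    pub fn to_json_line(&self) -> String {
        let bbox: Vec<String> = self.bbox.iter().map(|v| json_number(*v)).collect();
        let mut line = format!(
            "{{\"path\":{},\"page_index\":{},\"bbox\":[{}],\"match_text\":{},\"span_text\":{},\"span_confidence\":{},\"pdf_fingerprint\":{}",
            json_string(&self.path),
            self.page_index,
            bbox.join(","),
            json_string(&self.match_text),
            json_string(&self.span_text),
            json_number(self.span_confidence),
            json_string(&self.pdf_fingerprint),
        );
        // crosses_spans is emitted only when true, mimicking skip_serializing_if.
        if self.crosses_spans {
            line.push_str(",\"crosses_spans\":true");
        }
        line.push('}');
        line
    }
}

/// Hypothetical sink: writes one event per line and flushes after each.
pub struct JsonSink<W: Write> {
    out: W,
}

impl<W: Write> JsonSink<W> {
    pub fn new(out: W) -> Self {
        Self { out }
    }

    pub fn emit(&mut self, ev: &MatchEvent) -> io::Result<()> {
        self.out.write_all(ev.to_json_line().as_bytes())?;
        self.out.write_all(b"\n")?;
        self.out.flush()
    }

    /// Hand back the underlying writer (useful for inspecting output).
    pub fn into_inner(self) -> W {
        self.out
    }
}
```

Under these assumptions, wrapping std::io::stdout() in JsonSink::new and
calling emit once per match would give a downstream consumer one
complete JSON object per line as soon as each match is found.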

Closes: pdftract-5ls35
2026-05-25 02:05:17 -04:00
grep-jsonl.schema.json feat(pdftract-5ls35): implement JSON-Lines output sink for grep 2026-05-25 02:05:17 -04:00
pdftract.schema.json feat(pdftract-5nv9h): implement xtask gen-schema with stable ordering and proper metadata 2026-05-24 17:31:16 -04:00