pdftract/docs
jedarden 47df769e4b feat(pdftract-5ls35): implement JSON-Lines output sink for grep
Implement the --json output sink for pdftract grep with JSON-Lines
format (one match per line). Includes MatchEvent, FileOnlyEvent,
CountEvent structs and JsonSink line-buffered writer.

Key features:
- MatchEvent with all fields (path, page_index, bbox, match_text,
  span_text, span_confidence, pdf_fingerprint, crosses_spans)
- crosses_spans omitted when false via skip_serializing_if
- NaN/Infinity in span_confidence replaced with null
- page_index is 0-based (machine convention)
- FileOnlyEvent for -l mode, CountEvent for -c mode
- Line-buffered writes with immediate flush
- JSON schema at docs/schema/v1.0/grep-jsonl.schema.json

Closes: pdftract-5ls35
2026-05-25 02:05:17 -04:00
..
adr feat(pdftract-bf-2y2rp): implement lazy stream decoding for PDF extraction 2026-05-23 12:30:26 -04:00
conformance feat(pdftract-5omc): implement SDK conformance test runner pattern 2026-05-18 01:22:23 -04:00
integrations docs(pdftract-3om3): add MCP client configuration guide 2026-05-24 13:10:33 -04:00
notes docs(pdftract-3wrx): add release signing strategy note 2026-05-24 11:12:56 -04:00
operations docs(pdftract-653ah): add runbook integration for pdftract doctor 2026-05-24 13:26:31 -04:00
plan feat(pdftract-3zhf): add unified TableDetector::detect entry point 2026-05-24 00:51:59 -04:00
research docs(pdftract-1tjn): finalize OpenType MATH and formula extraction research note v1.0 2026-05-24 10:41:39 -04:00
schema/v1.0 feat(pdftract-5ls35): implement JSON-Lines output sink for grep 2026-05-25 02:05:17 -04:00
security docs(pdftract-58kz): add security policy documentation 2026-05-20 19:39:24 -04:00
user-docs docs(pdftract-5nare): add comprehensive FAQ with 24 questions 2026-05-25 00:22:48 -04:00
research-index.md Add parallel extraction research and comprehensive research index 2026-05-16 16:30:35 -04:00