pdftract/docs
jedarden 84b4448648 feat(pdftract-5qca): implement form_fields JSON output + schema integration
Phase 7.4.5 implementation: Wire combined Vec<(String, FormFieldValue)> from
combiner into document-level /form_fields JSON output with tagged union schema.

- Add FormFieldJson, FormFieldTypeJson, FormFieldValueJson, ChoiceValueJson to schema
- Add form_fields: Vec<FormFieldJson> to ExtractionResult (always emitted, empty when none)
- Implement acro_field_to_value() converter for Phase 7.4.2 type-specific extraction
- Wire form field extraction in extract_pdf(): walk AcroForm, extract XFA, combine with XFA-wins
- Add convert_form_field_to_json() helper for FormFieldValue → FormFieldJson conversion
- Update docs/schema/v1.0/pdftract.schema.json with form_fields $defs and required field
- Add form_fields_to_markdown() to markdown module for Form Fields footer table

Schema shape: /form_fields is array of {name, type, value, default?, page_index?, rect?,
required, read_only, multiline?, max_length?, options?, multi_select?, selected?,
state_name?, pushbutton?, radio?}. Type field is tagged enum: "text", "button", "choice",
"signature". Value field varies by type (string|boolean|string|array|uint|null).

Closes: pdftract-5qca

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-24 14:36:03 -04:00
..
adr feat(pdftract-bf-2y2rp): implement lazy stream decoding for PDF extraction 2026-05-23 12:30:26 -04:00
conformance feat(pdftract-5omc): implement SDK conformance test runner pattern 2026-05-18 01:22:23 -04:00
integrations docs(pdftract-3om3): add MCP client configuration guide 2026-05-24 13:10:33 -04:00
notes docs(pdftract-3wrx): add release signing strategy note 2026-05-24 11:12:56 -04:00
operations docs(pdftract-653ah): add runbook integration for pdftract doctor 2026-05-24 13:26:31 -04:00
plan feat(pdftract-3zhf): add unified TableDetector::detect entry point 2026-05-24 00:51:59 -04:00
research docs(pdftract-1tjn): finalize OpenType MATH and formula extraction research note v1.0 2026-05-24 10:41:39 -04:00
schema/v1.0 feat(pdftract-5qca): implement form_fields JSON output + schema integration 2026-05-24 14:36:03 -04:00
security docs(pdftract-58kz): add security policy documentation 2026-05-20 19:39:24 -04:00
user-docs docs(pdftract-653ah): add runbook integration for pdftract doctor 2026-05-24 13:26:31 -04:00
research-index.md Add parallel extraction research and comprehensive research index 2026-05-16 16:30:35 -04:00