jedarden
|
6b96d8d637
|
Add research: error handling, PDF/A guarantees, output schema, generator quirks

Four new extraction research documents covering:

- permissive error handling with extraction quality signaling (five
  error classes, circular reference detection, memory limits)
- PDF/A conformance level guarantees and fast-path optimization
  (Level A skips OCR and layout heuristics)
- the complete extraction output schema (span/block/table/NDJSON
  streaming/versioning)
- per-generator extraction quirks (Word/LibreOffice/InDesign/LaTeX/
  Chrome/Ghostscript/scanners)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
2026-05-16 16:07:13 -04:00 |
|