Add three extraction research documents covering: subset-font Unicode
recovery; pdfLaTeX/XeLaTeX encoding tables and two-column layout; and
proper vs. improper redaction detection, with an output schema.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>