jedarden/pdftract

Commit graph

Author	SHA1	Message	Date
jedarden	516ca154aa	Add research: page labels, government forms, book publishing, filter decoding. Four new extraction research documents covering page label/PageLabels number tree and outline/bookmark tree extraction, government form PDF patterns (IRS, USCIS, court filings, classification markings), book and publishing PDF structure (running heads, footnotes, index extraction), and PDF stream filter pipeline (FlateDecode/LZW predictors, JBIG2 global segments, CCITTFax, JPX, error boundaries). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-16 15:55:08 -04:00

1 commit