pdftract

jedarden 922c34611b	2026-05-25 04:06:44 -04:00
feat(pdftract-4exg): implement classifier corpus test infrastructure

Add classifier corpus test harness for 200-document labeled corpus:
- Move test from tests/ to crates/pdftract-core/tests/classifier_corpus.rs
- Implement classify_document() using pdftract_core::profiles
- Add robust path resolution for workspace and crate test directories
- Fix PdfObject number extraction in threads module (compilation error)

Corpus infrastructure is complete but PDF generation needs fix:
- Generated PDFs have non-standard trailer structure
- ReportLab embeds comment inside trailer dictionary
- Causes pdftract parser to fail with "/Root is not a dictionary"
- Test harness ready to run once PDFs are regenerated

Closes: pdftract-4exg (partial - infrastructure complete, PDF generation blocked)

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
pdftract-cer-diff	docs(pdftract-aawrz): add LICENSE-MIT and LICENSE-APACHE files	2026-05-23 10:36:28 -04:00
pdftract-cli	feat(pdftract-5edjj): implement render_anchors inspector layer renderer	2026-05-25 03:16:07 -04:00
pdftract-core	feat(pdftract-4exg): implement classifier corpus test infrastructure	2026-05-25 04:06:44 -04:00
pdftract-libpdftract	feat(pdftract-3s2i): implement Phase 5.5.2 validation filter	2026-05-24 04:57:17 -04:00
pdftract-py	feat(pdftract-2nu0s): implement Python SDK contract conformance	2026-05-24 08:55:11 -04:00