pdftract/crates/pdftract-core/tests
jedarden 922c34611b feat(pdftract-4exg): implement classifier corpus test infrastructure
Add classifier corpus test harness for 200-document labeled corpus:
- Move test from tests/ to crates/pdftract-core/tests/classifier_corpus.rs
- Implement classify_document() using pdftract_core::profiles
- Add robust path resolution for workspace and crate test directories
- Fix PdfObject number extraction in threads module (compilation error)
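
The path-resolution item above can be sketched as follows. This is a minimal sketch and not the code from this commit: the helper names `corpus_candidates` and `first_existing_dir`, and the relative path `tests/corpus`, are assumptions made for illustration. The idea is to try the crate directory first, then walk up two levels towards the workspace root, and take the first location that exists.

```rust
use std::path::{Path, PathBuf};

/// Build candidate corpus locations for a crate rooted at `manifest_dir`.
/// `rel` is a hypothetical corpus path such as "tests/corpus".
/// (Illustrative helper; not taken from the pdftract sources.)
fn corpus_candidates(manifest_dir: &Path, rel: &str) -> Vec<PathBuf> {
    // First choice: the corpus lives inside the crate itself.
    let mut out = vec![manifest_dir.join(rel)];
    // Then walk up two levels, e.g. crates/<name> -> crates -> workspace root.
    for ancestor in manifest_dir.ancestors().skip(1).take(2) {
        out.push(ancestor.join(rel));
    }
    out
}

/// Return the first candidate that exists as a directory, if any.
/// (Illustrative helper; not taken from the pdftract sources.)
fn first_existing_dir(candidates: &[PathBuf]) -> Option<PathBuf> {
    candidates.iter().find(|p| p.is_dir()).cloned()
}
```

In an integration test, `manifest_dir` would typically come from `Path::new(env!("CARGO_MANIFEST_DIR"))`, which Cargo sets to the crate directory at compile time. Returning `Option` lets the test skip or fail with a clear message when no corpus directory is found.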

Corpus infrastructure is complete, but PDF generation still needs a fix:
- Generated PDFs have a non-standard trailer structure
- ReportLab embeds a comment inside the trailer dictionary
- This causes the pdftract parser to fail with "/Root is not a dictionary"
- Test harness is ready to run once the PDFs are regenerated

Closes: pdftract-4exg (partial - infrastructure complete, PDF generation blocked)

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-25 04:06:44 -04:00
classifier_corpus.rs feat(pdftract-4exg): implement classifier corpus test infrastructure 2026-05-25 04:06:44 -04:00
conformance.rs feat(pdftract-3s2i): implement Phase 5.5.2 validation filter 2026-05-24 04:57:17 -04:00
memory_guard.rs feat(bf-2ervu): implement mmap-backed PdfSource via memmap2 2026-05-24 08:40:11 -04:00
memory_guard_tests.rs feat(bf-2ervu): implement mmap-backed PdfSource via memmap2 2026-05-24 08:40:11 -04:00
ocr_integration.rs feat(pdftract-3s2i): implement Phase 5.5.2 validation filter 2026-05-24 04:57:17 -04:00
page_classification.rs feat(pdftract-3s2i): implement Phase 5.5.2 validation filter 2026-05-24 04:57:17 -04:00
struct_tree_coverage.rs feat(pdftract-3s2i): implement Phase 5.5.2 validation filter 2026-05-24 04:57:17 -04:00
test_xref_debug.rs feat(pdftract-3s2i): implement Phase 5.5.2 validation filter 2026-05-24 04:57:17 -04:00
TH-03-mcp-no-auth.rs test(pdftract-5m3hp): implement TH-03 MCP no-auth bind security tests 2026-05-24 18:43:52 -04:00
TH-07-ps-leak.rs test(pdftract-43jxa): implement TH-07 ps leak security test 2026-05-25 00:45:57 -04:00
th_05_ssrf_block.rs feat(pdftract-3s2i): implement Phase 5.5.2 validation filter 2026-05-24 04:57:17 -04:00
xref_helpers.rs feat(bf-2ervu): implement mmap-backed PdfSource via memmap2 2026-05-24 08:40:11 -04:00
xref_integration_test.rs feat(bf-2ervu): implement mmap-backed PdfSource via memmap2 2026-05-24 08:40:11 -04:00