pdftract/tests/fixtures
jedarden 02c8843e2a docs(pdftract-3a310): add Phase 7.10 coordinator verification note
Coordinator bead closing as all 4 blocking child beads are now CLOSED:
- pdftract-1lp2 (Profile Authoring epic)
- pdftract-3zhf (Phase 7.2 Table Detection)
- pdftract-6d5w (Phase 7.3 Digital Signature)
- pdftract-2mw6 (Phase 7.4 AcroForm/XFA)

Profile system infrastructure is COMPLETE and FUNCTIONAL:
- Core profile modules (types, extraction, loader, engine, signals, evaluator)
- 9 built-in classification + extraction profiles
- CLI profiles subcommand (list, show, export, install, validate)
- --auto and --profile flags on extract
- 72 PDF fixtures, PROVENANCE.md, 200-doc classifier corpus

Known gaps documented (regression tests, critical acceptance tests,
serve hot-reload implementation) - tracked in child bead close reasons.

Acceptance criterion met: All Phase 7.10 child task beads closed.

Also fix PROVENANCE.md entries for json_schema and fixtures root:
- Update sample.pdf to json_schema/sample.pdf
- Add EC-04-rc4-encrypted.pdf entry
- Add EC-05-aes128-encrypted.pdf entry
- Add valid-minimal.pdf entry
- Re-add sample.pdf entry (fixtures root)
2026-06-01 04:23:20 -04:00
classifier feat(pdftract-59zz): implement MCP bearer token ingress channels and TH-03 enforcement 2026-05-18 02:47:54 -04:00
fonts feat(pdftract-5u8bp): implement SVG clip generator 2026-05-23 03:43:19 -04:00
grep-corpus feat(pdftract-5bzpg): implement pdftract-grep-1000 CI benchmark skeleton 2026-05-25 08:53:23 -04:00
json_schema docs(pdftract-35byi): update verification note with current fixture count 2026-06-01 02:38:31 -04:00
malformed test(pdftract-17cnu): implement TH-01 decompression bomb security test 2026-05-25 12:09:54 -04:00
ocr feat(pdftract-48ea): implement BrokenVector fixtures + WER delta CI gate 2026-05-24 10:52:41 -04:00
page_class feat(pdftract-2zw): page classification fixtures + integration tests + reproducibility gate 2026-05-23 15:04:05 -04:00
perf feat(bf-1g1fd): implement CI memory-ceiling gate with cgroup MemoryMax enforcement 2026-05-23 13:22:55 -04:00
preprocess feat(pdftract-27n3): implement border padding, pipeline orchestration, and fixtures 2026-05-23 21:55:11 -04:00
profiles docs(pdftract-3a310): add Phase 7.10 coordinator verification note 2026-06-01 04:23:20 -04:00
security wip: intermediate state from previous work 2026-05-29 08:25:23 -04:00
EC-04-rc4-encrypted.pdf feat(pdftract-57np8): add DCTDecode SOI/EOI diagnostic emission at call site 2026-05-28 06:36:35 -04:00
EC-05-aes128-encrypted.pdf feat(pdftract-57np8): add DCTDecode SOI/EOI diagnostic emission at call site 2026-05-28 06:36:35 -04:00
EC-06-aes256-encrypted.pdf feat(pdftract-57np8): add DCTDecode SOI/EOI diagnostic emission at call site 2026-05-28 06:36:35 -04:00
EC-empty-password.pdf feat(pdftract-57np8): add DCTDecode SOI/EOI diagnostic emission at call site 2026-05-28 06:36:35 -04:00
gen_fixtures feat(pdftract-2w3r): implement StructTree coverage check and XY-cut fallback 2026-05-23 20:53:25 -04:00
gen_ocr_fixtures feat(pdftract-3s2i): implement Phase 5.5.2 validation filter 2026-05-24 04:57:17 -04:00
gen_suspects feat(pdftract-2w3r): implement StructTree coverage check and XY-cut fallback 2026-05-23 20:53:25 -04:00
gen_suspects.rs feat(pdftract-2w3r): implement StructTree coverage check and XY-cut fallback 2026-05-23 20:53:25 -04:00
gen_suspects_simple feat(pdftract-2w3r): implement StructTree coverage check and XY-cut fallback 2026-05-23 20:53:25 -04:00
gen_suspects_simple.rs feat(pdftract-2w3r): implement StructTree coverage check and XY-cut fallback 2026-05-23 20:53:25 -04:00
gen_suspects_simple_local feat(pdftract-2w3r): implement StructTree coverage check and XY-cut fallback 2026-05-23 20:53:25 -04:00
gen_suspects_simple_local.rs feat(pdftract-2w3r): implement StructTree coverage check and XY-cut fallback 2026-05-23 20:53:25 -04:00
gen_suspects_v2.rs feat(pdftract-2w3r): implement StructTree coverage check and XY-cut fallback 2026-05-23 20:53:25 -04:00
gen_suspects_v3 feat(pdftract-2w3r): implement StructTree coverage check and XY-cut fallback 2026-05-23 20:53:25 -04:00
gen_suspects_v3.rs feat(pdftract-2w3r): implement StructTree coverage check and XY-cut fallback 2026-05-23 20:53:25 -04:00
gen_suspects_v4.rs feat(pdftract-2w3r): implement StructTree coverage check and XY-cut fallback 2026-05-23 20:53:25 -04:00
gen_suspects_v6 feat(pdftract-2w3r): implement StructTree coverage check and XY-cut fallback 2026-05-23 20:53:25 -04:00
gen_suspects_v6.rs feat(pdftract-2w3r): implement StructTree coverage check and XY-cut fallback 2026-05-23 20:53:25 -04:00
gen_suspects_v7 feat(pdftract-2w3r): implement StructTree coverage check and XY-cut fallback 2026-05-23 20:53:25 -04:00
gen_suspects_v7.rs feat(pdftract-2w3r): implement StructTree coverage check and XY-cut fallback 2026-05-23 20:53:25 -04:00
gen_suspects_v8 feat(pdftract-2w3r): implement StructTree coverage check and XY-cut fallback 2026-05-23 20:53:25 -04:00
gen_suspects_v8.rs feat(pdftract-2w3r): implement StructTree coverage check and XY-cut fallback 2026-05-23 20:53:25 -04:00
generate_book_chapter_fixtures.rs fix(pdftract-2f7oi): fix test fixture compilation bug and verify error handling 2026-05-27 22:12:25 -04:00
generate_encrypted_fixtures.py chore(pdftract-36glh): remove unused JpxDecoder import and add verification note 2026-05-28 05:23:13 -04:00
generate_encrypted_fixtures.rs chore(pdftract-36glh): remove unused JpxDecoder import and add verification note 2026-05-28 05:23:13 -04:00
generate_large_remote_fixture.rs wip: AcroForm improvements, debug tooling, test corpus, and fixture updates 2026-05-30 09:48:14 -04:00
generate_legal_filing_fixtures.rs feat(pdftract-260a3): implement legal_filing profile with fixtures and tests 2026-05-27 21:44:49 -04:00
generate_lzw_fixtures.rs.disabled fix(pdftract-2uk9z): wrap native module results in typed Python objects 2026-05-28 21:18:38 -04:00
generate_lzw_fixtures_main.rs feat(pdftract-3s2i): implement Phase 5.5.2 validation filter 2026-05-24 04:57:17 -04:00
generate_ocr_fixtures.rs feat(pdftract-315s): implement WER CI gate and OCR CLI flags 2026-05-24 02:07:27 -04:00
generate_page_class_fixtures.rs feat(pdftract-2zw): page classification fixtures + integration tests + reproducibility gate 2026-05-23 15:04:05 -04:00
generate_scientific_paper_fixtures.rs feat(pdftract-2vajs): implement slide_deck profile with fixtures and tests 2026-05-27 21:12:24 -04:00
generate_slide_deck_fixtures.rs feat(pdftract-2vajs): implement slide_deck profile with fixtures and tests 2026-05-27 21:12:24 -04:00
generate_suspects_fixture feat(pdftract-2w3r): implement StructTree coverage check and XY-cut fallback 2026-05-23 20:53:25 -04:00
generate_suspects_fixture.rs feat(pdftract-2w3r): implement StructTree coverage check and XY-cut fallback 2026-05-23 20:53:25 -04:00
generate_suspects_fixtures feat(pdftract-2w3r): implement StructTree coverage check and XY-cut fallback 2026-05-23 20:53:25 -04:00
generate_suspects_fixtures.py feat(pdftract-2w3r): implement StructTree coverage check and XY-cut fallback 2026-05-23 20:53:25 -04:00
generate_suspects_fixtures.rs feat(pdftract-2w3r): implement StructTree coverage check and XY-cut fallback 2026-05-23 20:53:25 -04:00
generate_suspects_fixtures_v5.rs feat(pdftract-2w3r): implement StructTree coverage check and XY-cut fallback 2026-05-23 20:53:25 -04:00
lzw_incremental_early.bin feat(pdftract-3uu6v): implement LZWDecode with /EarlyChange parameter 2026-05-22 22:38:31 -04:00
lzw_incremental_late.bin feat(pdftract-3uu6v): implement LZWDecode with /EarlyChange parameter 2026-05-22 22:38:31 -04:00
lzw_incremental_orig.bin feat(pdftract-3uu6v): implement LZWDecode with /EarlyChange parameter 2026-05-22 22:38:31 -04:00
lzw_mixed_early.bin feat(pdftract-3uu6v): implement LZWDecode with /EarlyChange parameter 2026-05-22 22:38:31 -04:00
lzw_mixed_late.bin feat(pdftract-3uu6v): implement LZWDecode with /EarlyChange parameter 2026-05-22 22:38:31 -04:00
lzw_mixed_orig.bin feat(pdftract-3uu6v): implement LZWDecode with /EarlyChange parameter 2026-05-22 22:38:31 -04:00
lzw_predictor_encoded.bin feat(pdftract-3uu6v): implement LZWDecode with /EarlyChange parameter 2026-05-22 22:38:31 -04:00
lzw_predictor_orig.bin feat(pdftract-3uu6v): implement LZWDecode with /EarlyChange parameter 2026-05-22 22:38:31 -04:00
lzw_repeated_early.bin feat(pdftract-3uu6v): implement LZWDecode with /EarlyChange parameter 2026-05-22 22:38:31 -04:00
lzw_repeated_late.bin feat(pdftract-3uu6v): implement LZWDecode with /EarlyChange parameter 2026-05-22 22:38:31 -04:00
lzw_repeated_orig.bin feat(pdftract-3uu6v): implement LZWDecode with /EarlyChange parameter 2026-05-22 22:38:31 -04:00
lzw_simple_early.bin feat(pdftract-3uu6v): implement LZWDecode with /EarlyChange parameter 2026-05-22 22:38:31 -04:00
lzw_simple_late.bin feat(pdftract-3uu6v): implement LZWDecode with /EarlyChange parameter 2026-05-22 22:38:31 -04:00
lzw_simple_orig.bin feat(pdftract-3uu6v): implement LZWDecode with /EarlyChange parameter 2026-05-22 22:38:31 -04:00
lzw_truncated.bin feat(pdftract-3uu6v): implement LZWDecode with /EarlyChange parameter 2026-05-22 22:38:31 -04:00
PROVENANCE.md docs(pdftract-35byi): update verification note with current fixture count 2026-06-01 02:38:31 -04:00
remote_100page.pdf wip: AcroForm improvements, debug tooling, test corpus, and fixture updates 2026-05-30 09:48:14 -04:00
sample.pdf docs(pdftract-145s8): update SDK docs with correct API 2026-05-31 23:43:05 -04:00
tagged-suspects-false.pdf feat(pdftract-2w3r): implement StructTree coverage check and XY-cut fallback 2026-05-23 20:53:25 -04:00
tagged-suspects-true-high-coverage.pdf feat(pdftract-2w3r): implement StructTree coverage check and XY-cut fallback 2026-05-23 20:53:25 -04:00
tagged-suspects-true.pdf feat(pdftract-2w3r): implement StructTree coverage check and XY-cut fallback 2026-05-23 20:53:25 -04:00
test-minimal.pdf feat(pdftract-bf-2y2rp): implement lazy stream decoding for PDF extraction 2026-05-23 12:30:26 -04:00
valid-minimal.pdf test(pdftract-1eaxm): add distribution templates and C conformance tests 2026-05-23 09:20:22 -04:00