pdftract/tests/fixtures
jedarden 05be70d36f feat(pdftract-48ea): implement BrokenVector fixtures + WER delta CI gate
Add two PDF/A fixtures for testing assisted-OCR (BrokenVector path):
- Aligned fixture with correctly-positioned invisible text layer
- Misaligned fixture with text layer offset by (10pt, 5pt)

Extend ci/wer-gate.sh with WER validation for BrokenVector fixtures.

Acceptance criteria:
- Two BrokenVector fixtures committed (both 1.5 KB, well under the 200 KB limit)
- ci/wer-gate.sh extended with new fixture invocations
- WER delta tests skip gracefully when the OCR environment is unavailable

Closes: pdftract-48ea

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-24 10:52:41 -04:00
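
For readers unfamiliar with the metric behind the gate mentioned in the commit above: word error rate (WER) is conventionally the word-level edit distance between a reference transcript and the extracted text, divided by the number of reference words. The following is a minimal sketch of that definition only. It does not show the contents of `ci/wer-gate.sh` or any pdftract API; the function names (`word_edit_distance`, `wer`, `wer_delta_within`) and the reading of "WER delta" as candidate WER minus baseline WER are assumptions made for illustration.

```rust
/// Word-level Levenshtein distance between two token sequences.
/// (Hypothetical helper, not taken from the pdftract sources.)
fn word_edit_distance(reference: &[&str], hypothesis: &[&str]) -> usize {
    // prev[j] = distance between the reference words seen so far
    // (excluding the current one) and the first j hypothesis words.
    let mut prev: Vec<usize> = (0..=hypothesis.len()).collect();
    for (i, r) in reference.iter().enumerate() {
        // curr[0] = i + 1: deleting every reference word seen so far.
        let mut curr = vec![i + 1; hypothesis.len() + 1];
        for (j, h) in hypothesis.iter().enumerate() {
            let substitution = prev[j] + usize::from(r != h);
            let deletion = prev[j + 1] + 1;
            let insertion = curr[j] + 1;
            curr[j + 1] = substitution.min(deletion).min(insertion);
        }
        prev = curr;
    }
    prev[hypothesis.len()]
}

/// WER = edit distance / number of reference words.
/// Returns None for an empty reference, where the ratio is undefined.
fn wer(reference: &str, hypothesis: &str) -> Option<f64> {
    let r: Vec<&str> = reference.split_whitespace().collect();
    let h: Vec<&str> = hypothesis.split_whitespace().collect();
    if r.is_empty() {
        return None;
    }
    Some(word_edit_distance(&r, &h) as f64 / r.len() as f64)
}

/// Assumed gate rule: the candidate's WER may exceed the baseline's WER
/// by at most `max_delta`. The real gate's rule and threshold are unknown.
fn wer_delta_within(
    reference: &str,
    baseline: &str,
    candidate: &str,
    max_delta: f64,
) -> Option<bool> {
    let delta = wer(reference, candidate)? - wer(reference, baseline)?;
    Some(delta <= max_delta)
}
```

Under these assumptions, a call such as `wer_delta_within(ground_truth, aligned_text, misaligned_text, 0.3)` would report whether the second extraction degrades WER by no more than 0.3 relative to the first; whitespace tokenization is the simplest choice here, and a real gate might normalize case or punctuation first.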
..
classifier feat(pdftract-59zz): implement MCP bearer token ingress channels and TH-03 enforcement 2026-05-18 02:47:54 -04:00
fonts feat(pdftract-5u8bp): implement SVG clip generator 2026-05-23 03:43:19 -04:00
malformed docs(bf-4xk2v): add verification note and compression bomb fixture 2026-05-23 13:32:19 -04:00
ocr feat(pdftract-48ea): implement BrokenVector fixtures + WER delta CI gate 2026-05-24 10:52:41 -04:00
page_class feat(pdftract-2zw): page classification fixtures + integration tests + reproducibility gate 2026-05-23 15:04:05 -04:00
perf feat(bf-1g1fd): implement CI memory-ceiling gate with cgroup MemoryMax enforcement 2026-05-23 13:22:55 -04:00
preprocess feat(pdftract-27n3): implement border padding, pipeline orchestration, and fixtures 2026-05-23 21:55:11 -04:00
profiles feat(pdftract-48ea): implement BrokenVector fixtures + WER delta CI gate 2026-05-24 10:52:41 -04:00
gen_fixtures feat(pdftract-2w3r): implement StructTree coverage check and XY-cut fallback 2026-05-23 20:53:25 -04:00
gen_ocr_fixtures feat(pdftract-3s2i): implement Phase 5.5.2 validation filter 2026-05-24 04:57:17 -04:00
gen_suspects feat(pdftract-2w3r): implement StructTree coverage check and XY-cut fallback 2026-05-23 20:53:25 -04:00
gen_suspects.rs feat(pdftract-2w3r): implement StructTree coverage check and XY-cut fallback 2026-05-23 20:53:25 -04:00
gen_suspects_simple feat(pdftract-2w3r): implement StructTree coverage check and XY-cut fallback 2026-05-23 20:53:25 -04:00
gen_suspects_simple.rs feat(pdftract-2w3r): implement StructTree coverage check and XY-cut fallback 2026-05-23 20:53:25 -04:00
gen_suspects_simple_local feat(pdftract-2w3r): implement StructTree coverage check and XY-cut fallback 2026-05-23 20:53:25 -04:00
gen_suspects_simple_local.rs feat(pdftract-2w3r): implement StructTree coverage check and XY-cut fallback 2026-05-23 20:53:25 -04:00
gen_suspects_v2.rs feat(pdftract-2w3r): implement StructTree coverage check and XY-cut fallback 2026-05-23 20:53:25 -04:00
gen_suspects_v3 feat(pdftract-2w3r): implement StructTree coverage check and XY-cut fallback 2026-05-23 20:53:25 -04:00
gen_suspects_v3.rs feat(pdftract-2w3r): implement StructTree coverage check and XY-cut fallback 2026-05-23 20:53:25 -04:00
gen_suspects_v4.rs feat(pdftract-2w3r): implement StructTree coverage check and XY-cut fallback 2026-05-23 20:53:25 -04:00
gen_suspects_v6 feat(pdftract-2w3r): implement StructTree coverage check and XY-cut fallback 2026-05-23 20:53:25 -04:00
gen_suspects_v6.rs feat(pdftract-2w3r): implement StructTree coverage check and XY-cut fallback 2026-05-23 20:53:25 -04:00
gen_suspects_v7 feat(pdftract-2w3r): implement StructTree coverage check and XY-cut fallback 2026-05-23 20:53:25 -04:00
gen_suspects_v7.rs feat(pdftract-2w3r): implement StructTree coverage check and XY-cut fallback 2026-05-23 20:53:25 -04:00
gen_suspects_v8 feat(pdftract-2w3r): implement StructTree coverage check and XY-cut fallback 2026-05-23 20:53:25 -04:00
gen_suspects_v8.rs feat(pdftract-2w3r): implement StructTree coverage check and XY-cut fallback 2026-05-23 20:53:25 -04:00
generate_lzw_fixtures.rs feat(pdftract-3uu6v): implement LZWDecode with /EarlyChange parameter 2026-05-22 22:38:31 -04:00
generate_lzw_fixtures_main.rs feat(pdftract-3s2i): implement Phase 5.5.2 validation filter 2026-05-24 04:57:17 -04:00
generate_ocr_fixtures.rs feat(pdftract-315s): implement WER CI gate and OCR CLI flags 2026-05-24 02:07:27 -04:00
generate_page_class_fixtures.rs feat(pdftract-2zw): page classification fixtures + integration tests + reproducibility gate 2026-05-23 15:04:05 -04:00
generate_suspects_fixture feat(pdftract-2w3r): implement StructTree coverage check and XY-cut fallback 2026-05-23 20:53:25 -04:00
generate_suspects_fixture.rs feat(pdftract-2w3r): implement StructTree coverage check and XY-cut fallback 2026-05-23 20:53:25 -04:00
generate_suspects_fixtures feat(pdftract-2w3r): implement StructTree coverage check and XY-cut fallback 2026-05-23 20:53:25 -04:00
generate_suspects_fixtures.py feat(pdftract-2w3r): implement StructTree coverage check and XY-cut fallback 2026-05-23 20:53:25 -04:00
generate_suspects_fixtures.rs feat(pdftract-2w3r): implement StructTree coverage check and XY-cut fallback 2026-05-23 20:53:25 -04:00
generate_suspects_fixtures_v5.rs feat(pdftract-2w3r): implement StructTree coverage check and XY-cut fallback 2026-05-23 20:53:25 -04:00
lzw_incremental_early.bin feat(pdftract-3uu6v): implement LZWDecode with /EarlyChange parameter 2026-05-22 22:38:31 -04:00
lzw_incremental_late.bin feat(pdftract-3uu6v): implement LZWDecode with /EarlyChange parameter 2026-05-22 22:38:31 -04:00
lzw_incremental_orig.bin feat(pdftract-3uu6v): implement LZWDecode with /EarlyChange parameter 2026-05-22 22:38:31 -04:00
lzw_mixed_early.bin feat(pdftract-3uu6v): implement LZWDecode with /EarlyChange parameter 2026-05-22 22:38:31 -04:00
lzw_mixed_late.bin feat(pdftract-3uu6v): implement LZWDecode with /EarlyChange parameter 2026-05-22 22:38:31 -04:00
lzw_mixed_orig.bin feat(pdftract-3uu6v): implement LZWDecode with /EarlyChange parameter 2026-05-22 22:38:31 -04:00
lzw_predictor_encoded.bin feat(pdftract-3uu6v): implement LZWDecode with /EarlyChange parameter 2026-05-22 22:38:31 -04:00
lzw_predictor_orig.bin feat(pdftract-3uu6v): implement LZWDecode with /EarlyChange parameter 2026-05-22 22:38:31 -04:00
lzw_repeated_early.bin feat(pdftract-3uu6v): implement LZWDecode with /EarlyChange parameter 2026-05-22 22:38:31 -04:00
lzw_repeated_late.bin feat(pdftract-3uu6v): implement LZWDecode with /EarlyChange parameter 2026-05-22 22:38:31 -04:00
lzw_repeated_orig.bin feat(pdftract-3uu6v): implement LZWDecode with /EarlyChange parameter 2026-05-22 22:38:31 -04:00
lzw_simple_early.bin feat(pdftract-3uu6v): implement LZWDecode with /EarlyChange parameter 2026-05-22 22:38:31 -04:00
lzw_simple_late.bin feat(pdftract-3uu6v): implement LZWDecode with /EarlyChange parameter 2026-05-22 22:38:31 -04:00
lzw_simple_orig.bin feat(pdftract-3uu6v): implement LZWDecode with /EarlyChange parameter 2026-05-22 22:38:31 -04:00
lzw_truncated.bin feat(pdftract-3uu6v): implement LZWDecode with /EarlyChange parameter 2026-05-22 22:38:31 -04:00
tagged-suspects-false.pdf feat(pdftract-2w3r): implement StructTree coverage check and XY-cut fallback 2026-05-23 20:53:25 -04:00
tagged-suspects-true-high-coverage.pdf feat(pdftract-2w3r): implement StructTree coverage check and XY-cut fallback 2026-05-23 20:53:25 -04:00
tagged-suspects-true.pdf feat(pdftract-2w3r): implement StructTree coverage check and XY-cut fallback 2026-05-23 20:53:25 -04:00
test-minimal.pdf feat(pdftract-bf-2y2rp): implement lazy stream decoding for PDF extraction 2026-05-23 12:30:26 -04:00
valid-minimal.pdf test(pdftract-1eaxm): add distribution templates and C conformance tests 2026-05-23 09:20:22 -04:00