pdftract/tests/document_model/fixtures
jedarden 225f96c241 fix(pyo3): correct extract_text_fn call in extract_markdown stub
The extract_markdown stub was calling extract_text instead of
extract_text_fn, causing a compilation error. This fixes the
function name to match the exported function from extract_text.rs.

This completes the extract_text PyO3 entry point implementation,
which was already present in extract_text.rs and lib.rs.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-28 20:28:25 -04:00
expected_backup fix(pyo3): correct extract_text_fn call in extract_markdown stub 2026-05-28 20:28:25 -04:00
base_hello.pdf feat(pdftract-91e1i): HTTP fetch sequence implementation 2026-05-28 13:17:00 -04:00
encrypted_aes128_test.expected.json fix(pyo3): correct extract_text_fn call in extract_markdown stub 2026-05-28 20:28:25 -04:00
encrypted_aes128_test.pdf fix(pyo3): correct extract_text_fn call in extract_markdown stub 2026-05-28 20:28:25 -04:00
encrypted_aes256_test.expected.json fix(pyo3): correct extract_text_fn call in extract_markdown stub 2026-05-28 20:28:25 -04:00
encrypted_aes256_test.pdf fix(pyo3): correct extract_text_fn call in extract_markdown stub 2026-05-28 20:28:25 -04:00
encrypted_empty_password.expected.json fix(pyo3): correct extract_text_fn call in extract_markdown stub 2026-05-28 20:28:25 -04:00
encrypted_empty_password.pdf fix(pyo3): correct extract_text_fn call in extract_markdown stub 2026-05-28 20:28:25 -04:00
encrypted_rc4_test.expected.json fix(pyo3): correct extract_text_fn call in extract_markdown stub 2026-05-28 20:28:25 -04:00
encrypted_rc4_test.pdf fix(pyo3): correct extract_text_fn call in extract_markdown stub 2026-05-28 20:28:25 -04:00
encrypted_unknown_handler.expected.json fix(pyo3): correct extract_text_fn call in extract_markdown stub 2026-05-28 20:28:25 -04:00
encrypted_unknown_handler.pdf fix(pyo3): correct extract_text_fn call in extract_markdown stub 2026-05-28 20:28:25 -04:00
generate_fixtures fix(pyo3): correct extract_text_fn call in extract_markdown stub 2026-05-28 20:28:25 -04:00
generate_fixtures.rs feat(pdftract-91e1i): HTTP fetch sequence implementation 2026-05-28 13:17:00 -04:00
inheritance_grandparent_mediabox.expected.json fix(pyo3): correct extract_text_fn call in extract_markdown stub 2026-05-28 20:28:25 -04:00
inheritance_grandparent_mediabox.pdf fix(pyo3): correct extract_text_fn call in extract_markdown stub 2026-05-28 20:28:25 -04:00
js_in_openaction.expected.json fix(pyo3): correct extract_text_fn call in extract_markdown stub 2026-05-28 20:28:25 -04:00
js_in_openaction.pdf fix(pyo3): correct extract_text_fn call in extract_markdown stub 2026-05-28 20:28:25 -04:00
missing_mediabox.expected.json fix(pyo3): correct extract_text_fn call in extract_markdown stub 2026-05-28 20:28:25 -04:00
missing_mediabox.pdf fix(pyo3): correct extract_text_fn call in extract_markdown stub 2026-05-28 20:28:25 -04:00
multi_revision_3.expected.json fix(pyo3): correct extract_text_fn call in extract_markdown stub 2026-05-28 20:28:25 -04:00
multi_revision_3.pdf fix(pyo3): correct extract_text_fn call in extract_markdown stub 2026-05-28 20:28:25 -04:00
ocg_default_off.expected.json fix(pyo3): correct extract_text_fn call in extract_markdown stub 2026-05-28 20:28:25 -04:00
ocg_default_off.pdf fix(pyo3): correct extract_text_fn call in extract_markdown stub 2026-05-28 20:28:25 -04:00
page_labels_roman_arabic.expected.json fix(pyo3): correct extract_text_fn call in extract_markdown stub 2026-05-28 20:28:25 -04:00
page_labels_roman_arabic.pdf fix(pyo3): correct extract_text_fn call in extract_markdown stub 2026-05-28 20:28:25 -04:00
partial_resource_override.expected.json fix(pyo3): correct extract_text_fn call in extract_markdown stub 2026-05-28 20:28:25 -04:00
partial_resource_override.pdf fix(pyo3): correct extract_text_fn call in extract_markdown stub 2026-05-28 20:28:25 -04:00
pdfa_1b_conformance.expected.json fix(pyo3): correct extract_text_fn call in extract_markdown stub 2026-05-28 20:28:25 -04:00
pdfa_1b_conformance.pdf fix(pyo3): correct extract_text_fn call in extract_markdown stub 2026-05-28 20:28:25 -04:00
README.md feat(pdftract-91e1i): HTTP fetch sequence implementation 2026-05-28 13:17:00 -04:00
tagged_3_level_outline.expected.json fix(pyo3): correct extract_text_fn call in extract_markdown stub 2026-05-28 20:28:25 -04:00
tagged_3_level_outline.pdf fix(pyo3): correct extract_text_fn call in extract_markdown stub 2026-05-28 20:28:25 -04:00
xfa_form.expected.json fix(pyo3): correct extract_text_fn call in extract_markdown stub 2026-05-28 20:28:25 -04:00
xfa_form.pdf fix(pyo3): correct extract_text_fn call in extract_markdown stub 2026-05-28 20:28:25 -04:00

Document Model Test Fixtures

This directory contains curated PDF fixtures for testing the document model integration.

Fixture Passwords

IMPORTANT: The passwords for the encrypted fixtures are NOT secret. They exist only for testing:

  • encrypted_rc4_test.pdf: RC4-40, password "test"
  • encrypted_aes128_test.pdf: AES-128, password "test"
  • encrypted_aes256_test.pdf: AES-256 (PDF 2.0), password "test"
  • encrypted_empty_password.pdf: RC4-40, empty password
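
A test harness could keep these passwords in one place. The following is a minimal sketch; `fixture_password` is a hypothetical helper name and is not taken from the pdftract code base. It only encodes the four entries listed above.

```rust
/// Returns the documented password for an encrypted fixture, or `None`
/// for any file not in the password list above.
/// NOTE: hypothetical helper for illustration, not part of pdftract.
pub fn fixture_password(file_name: &str) -> Option<&'static str> {
    match file_name {
        "encrypted_rc4_test.pdf"
        | "encrypted_aes128_test.pdf"
        | "encrypted_aes256_test.pdf" => Some("test"),
        // The empty password is a real value, distinct from "no password known".
        "encrypted_empty_password.pdf" => Some(""),
        _ => None,
    }
}
```

Returning `Option<&str>` keeps the empty-password fixture distinguishable from fixtures that have no documented password, such as encrypted_unknown_handler.pdf.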

Fixture List

Encrypted Files (EC-04, EC-05, EC-06)

  • encrypted_rc4_test.pdf — RC4-40, user password "test" (EC-04)
  • encrypted_aes128_test.pdf — AES-128, password "test" (EC-05)
  • encrypted_aes256_test.pdf — AES-256 (PDF 2.0), password "test" (EC-06)
  • encrypted_empty_password.pdf — RC4-40, empty owner password
  • encrypted_unknown_handler.pdf — Non-standard security handler (Adobe public-key, /Filter /Adobe.PubSec)

Tagged PDFs

  • tagged_3_level_outline.pdf — 3 levels of bookmarks with mixed UTF-16BE/PDFDocEncoded titles

Optional Content (EC-16)

  • ocg_default_off.pdf — Single OCG with /D /BaseState /OFF (EC-16)

Multi-Revision

  • multi_revision_3.pdf — 3 incremental revisions, page count differs across revisions

Page Tree Inheritance (EC-09)

  • inheritance_grandparent_mediabox.pdf — page 0 has no MediaBox; inherits from grandparent /Pages node
  • missing_mediabox.pdf — page with no MediaBox anywhere (EC-09)
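
The lookup these two fixtures exercise can be sketched as a walk up the /Parent chain. This is an illustration under simplifying assumptions: the `Node` type, the arena-of-indices layout, and `resolve_media_box` are all hypothetical and do not reflect pdftract's actual document model.

```rust
/// Simplified page-tree node holding only what this sketch needs.
/// NOTE: hypothetical type, not pdftract's real representation.
#[derive(Debug, Clone)]
pub struct Node {
    /// Index of the parent /Pages node in the arena, `None` at the root.
    pub parent: Option<usize>,
    /// MediaBox declared directly on this node, if any.
    pub media_box: Option<[f64; 4]>,
}

/// Returns the nearest MediaBox found on `start` or one of its ancestors.
/// Returns `None` when no node on the chain declares one.
pub fn resolve_media_box(nodes: &[Node], start: usize) -> Option<[f64; 4]> {
    let mut current = Some(start);
    // Bound the walk so a cyclic /Parent chain in a malformed file terminates.
    for _ in 0..=nodes.len() {
        let idx = current?;
        let node = nodes.get(idx)?;
        if let Some(media_box) = node.media_box {
            return Some(media_box);
        }
        current = node.parent;
    }
    None
}
```

For inheritance_grandparent_mediabox.pdf the walk would succeed two levels up; for missing_mediabox.pdf it would return `None`. How the caller handles that case (an error or a default box) is not stated here, so the sketch leaves it open.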

Resource Merging

  • partial_resource_override.pdf — page overrides /Resources /Font partially; merged result expected
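
One reading of "merged result expected" is an entry-level merge in which page-level fonts replace inherited fonts of the same name and all other inherited fonts remain visible. The sketch below assumes that reading; `FontMap` and `merge_fonts` are hypothetical names, and the string values stand in for object references.

```rust
use std::collections::BTreeMap;

/// Font resource name (e.g. "F1") mapped to a stand-in for its object reference.
/// NOTE: hypothetical alias for illustration only.
pub type FontMap = BTreeMap<String, String>;

/// Merges inherited and page-level font entries.
/// Page-level entries win when both maps define the same name.
pub fn merge_fonts(inherited: &FontMap, page: &FontMap) -> FontMap {
    let mut merged = inherited.clone();
    // `extend` overwrites existing keys, which gives the page priority.
    merged.extend(page.iter().map(|(name, obj)| (name.clone(), obj.clone())));
    merged
}
```

A `BTreeMap` is used so the merged output iterates in a stable order, which makes comparison against an expected JSON file deterministic.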

JavaScript Detection

  • js_in_openaction.pdf — /OpenAction /S /JavaScript

XFA Forms

  • xfa_form.pdf — /AcroForm /XFA present

Conformance Detection

  • pdfa_1b_conformance.pdf — XMP metadata declaring PDF/A-1b conformance

Page Labels

  • page_labels_roman_arabic.pdf — pages 0..3 roman, pages 4..end arabic
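
The labels this fixture should produce can be sketched as follows. Several details are assumptions not stated above: that "0..3" is inclusive (pages 0 through 3), that the roman numerals are lowercase, and that arabic numbering restarts at 1 on page 4. `to_lower_roman` and `expected_label` are hypothetical helper names.

```rust
/// Converts a positive integer to a lowercase roman numeral.
pub fn to_lower_roman(mut n: u32) -> String {
    const TABLE: [(u32, &str); 13] = [
        (1000, "m"), (900, "cm"), (500, "d"), (400, "cd"),
        (100, "c"), (90, "xc"), (50, "l"), (40, "xl"),
        (10, "x"), (9, "ix"), (5, "v"), (4, "iv"), (1, "i"),
    ];
    let mut out = String::new();
    for &(value, symbol) in TABLE.iter() {
        // Greedily subtract the largest value that still fits.
        while n >= value {
            out.push_str(symbol);
            n -= value;
        }
    }
    out
}

/// Label for a zero-based page index under the assumptions named above:
/// pages 0 through 3 are roman (i..iv), later pages are arabic starting at 1.
pub fn expected_label(page_index: u32) -> String {
    if page_index <= 3 {
        to_lower_roman(page_index + 1)
    } else {
        (page_index - 3).to_string()
    }
}
```

If the fixture instead uses uppercase numerals or continues arabic numbering from 5, only `expected_label` needs to change.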

Fixture Generation

Fixtures are generated using qpdf and hand-crafted PDF construction.

See scripts/generate_document_model_fixtures.sh for the generation script.