pdftract/tests/stream_decoder/fixtures
jedarden bb7146cffe fix(pdftract-2uk9z): wrap native module results in typed Python objects
The native PyO3 module returns raw dicts via pythonize, but the Python SDK
API expects typed dataclass objects (Document, Page, Metadata, etc.) to be
consistent with the subprocess fallback and test expectations.

Updated wrapper functions in __init__.py to convert native results:
- extract(): wraps dict in Document.from_dict()
- extract_stream(): wraps yielded page dicts in Page.from_dict()
- get_metadata(): wraps dict in Metadata()
- hash(): wraps string in Fingerprint.from_string()
- classify(): wraps dict in Classification()
- search(): wraps yielded match dicts in Match

The native PyO3 entry points (extract, extract_text, extract_stream) were
already implemented with:
- extract: uses extract_pdf + pythonize for PyDict conversion
- extract_text: uses extract_text for plain String return
- extract_stream: uses extract_pdf_streaming with custom StreamIterator

All kwargs parsing with strict validation (unknown kwargs raise TypeError)
was already in place.

Acceptance criteria:
- pdftract.extract() returns Document object with pages/metadata
- pdftract.extract_text() returns plain text string
- pdftract.extract_stream() yields Page objects
- Unknown kwarg raises TypeError
2026-05-28 21:18:38 -04:00
..
__pycache__ fix(pyo3): correct extract_text_fn call in extract_markdown stub 2026-05-28 20:28:25 -04:00
ascii85_terminator.bin fix(pyo3): correct extract_text_fn call in extract_markdown stub 2026-05-28 20:28:25 -04:00
ascii85_terminator.expected fix(pdftract-25igv): fix emit! macro usage in codespace parser 2026-05-28 07:29:33 -04:00
ascii85_terminator.meta fix(pyo3): correct extract_text_fn call in extract_markdown stub 2026-05-28 20:28:25 -04:00
ascii85_z_shortcut.bin fix(pyo3): correct extract_text_fn call in extract_markdown stub 2026-05-28 20:28:25 -04:00
ascii85_z_shortcut.expected fix(pyo3): correct extract_text_fn call in extract_markdown stub 2026-05-28 20:28:25 -04:00
ascii85_z_shortcut.meta fix(pdftract-25igv): fix emit! macro usage in codespace parser 2026-05-28 07:29:33 -04:00
asciihex_odd_length.bin fix(pdftract-25igv): fix emit! macro usage in codespace parser 2026-05-28 07:29:33 -04:00
asciihex_odd_length.expected fix(pdftract-25igv): fix emit! macro usage in codespace parser 2026-05-28 07:29:33 -04:00
asciihex_odd_length.meta fix(pyo3): correct extract_text_fn call in extract_markdown stub 2026-05-28 20:28:25 -04:00
crypt_identity.bin fix(pyo3): correct extract_text_fn call in extract_markdown stub 2026-05-28 20:28:25 -04:00
crypt_identity.expected fix(pyo3): correct extract_text_fn call in extract_markdown stub 2026-05-28 20:28:25 -04:00
crypt_identity.meta fix(pyo3): correct extract_text_fn call in extract_markdown stub 2026-05-28 20:28:25 -04:00
dct_missing_eoi.bin fix(pyo3): correct extract_text_fn call in extract_markdown stub 2026-05-28 20:28:25 -04:00
dct_missing_eoi.expected fix(pyo3): correct extract_text_fn call in extract_markdown stub 2026-05-28 20:28:25 -04:00
dct_missing_eoi.meta fix(pyo3): correct extract_text_fn call in extract_markdown stub 2026-05-28 20:28:25 -04:00
dct_valid_jpeg.bin fix(pyo3): correct extract_text_fn call in extract_markdown stub 2026-05-28 20:28:25 -04:00
dct_valid_jpeg.expected fix(pyo3): correct extract_text_fn call in extract_markdown stub 2026-05-28 20:28:25 -04:00
dct_valid_jpeg.meta fix(pyo3): correct extract_text_fn call in extract_markdown stub 2026-05-28 20:28:25 -04:00
filter_array_a85_then_flate.bin fix(pyo3): correct extract_text_fn call in extract_markdown stub 2026-05-28 20:28:25 -04:00
filter_array_a85_then_flate.expected fix(pdftract-25igv): fix emit! macro usage in codespace parser 2026-05-28 07:29:33 -04:00
filter_array_a85_then_flate.meta fix(pyo3): correct extract_text_fn call in extract_markdown stub 2026-05-28 20:28:25 -04:00
flate_bomb_3gb.bin fix(pyo3): correct extract_text_fn call in extract_markdown stub 2026-05-28 20:28:25 -04:00
flate_bomb_3gb.expected fix(pdftract-25igv): fix emit! macro usage in codespace parser 2026-05-28 07:29:33 -04:00
flate_bomb_3gb.meta fix(pyo3): correct extract_text_fn call in extract_markdown stub 2026-05-28 20:28:25 -04:00
flate_png_pred15_all_six.bin fix(pyo3): correct extract_text_fn call in extract_markdown stub 2026-05-28 20:28:25 -04:00
flate_png_pred15_all_six.expected fix(pdftract-25igv): fix emit! macro usage in codespace parser 2026-05-28 07:29:33 -04:00
flate_png_pred15_all_six.meta fix(pyo3): correct extract_text_fn call in extract_markdown stub 2026-05-28 20:28:25 -04:00
flate_simple.bin fix(pyo3): correct extract_text_fn call in extract_markdown stub 2026-05-28 20:28:25 -04:00
flate_simple.expected fix(pdftract-25igv): fix emit! macro usage in codespace parser 2026-05-28 07:29:33 -04:00
flate_simple.meta fix(pdftract-25igv): fix emit! macro usage in codespace parser 2026-05-28 07:29:33 -04:00
flate_tiff_pred2.bin fix(pyo3): correct extract_text_fn call in extract_markdown stub 2026-05-28 20:28:25 -04:00
flate_tiff_pred2.expected fix(pdftract-25igv): fix emit! macro usage in codespace parser 2026-05-28 07:29:33 -04:00
flate_tiff_pred2.meta fix(pyo3): correct extract_text_fn call in extract_markdown stub 2026-05-28 20:28:25 -04:00
flate_truncated.bin fix(pyo3): correct extract_text_fn call in extract_markdown stub 2026-05-28 20:28:25 -04:00
flate_truncated.expected fix(pyo3): correct extract_text_fn call in extract_markdown stub 2026-05-28 20:28:25 -04:00
flate_truncated.meta fix(pyo3): correct extract_text_fn call in extract_markdown stub 2026-05-28 20:28:25 -04:00
gen_bomb_fixture.py fix(pdftract-4pnmd): build.rs doc comment format string parsing 2026-05-28 14:36:45 -04:00
gen_bomb_simple.py fix(pdftract-4pnmd): build.rs doc comment format string parsing 2026-05-28 14:36:45 -04:00
gen_bomb_zlib.py fix(pyo3): correct extract_text_fn call in extract_markdown stub 2026-05-28 20:28:25 -04:00
gen_fixtures.py fix(pdftract-25igv): fix emit! macro usage in codespace parser 2026-05-28 07:29:33 -04:00
gen_fixtures_corrected.py feat(pdftract-91e1i): HTTP fetch sequence implementation 2026-05-28 13:17:00 -04:00
gen_lzw.rs fix(pdftract-25igv): fix emit! macro usage in codespace parser 2026-05-28 07:29:33 -04:00
gen_lzw_fixtures.py fix(pyo3): correct extract_text_fn call in extract_markdown stub 2026-05-28 20:28:25 -04:00
gen_stream_lzw.rs fix(pyo3): correct extract_text_fn call in extract_markdown stub 2026-05-28 20:28:25 -04:00
generate_lzw_fixtures.rs fix(pyo3): correct extract_text_fn call in extract_markdown stub 2026-05-28 20:28:25 -04:00
jbig2_passthrough.bin fix(pyo3): correct extract_text_fn call in extract_markdown stub 2026-05-28 20:28:25 -04:00
jbig2_passthrough.expected fix(pyo3): correct extract_text_fn call in extract_markdown stub 2026-05-28 20:28:25 -04:00
jbig2_passthrough.meta fix(pyo3): correct extract_text_fn call in extract_markdown stub 2026-05-28 20:28:25 -04:00
lzw_early_change_0.bin fix(pdftract-2uk9z): wrap native module results in typed Python objects 2026-05-28 21:18:38 -04:00
lzw_early_change_0.expected fix(pdftract-25igv): fix emit! macro usage in codespace parser 2026-05-28 07:29:33 -04:00
lzw_early_change_0.meta fix(pyo3): correct extract_text_fn call in extract_markdown stub 2026-05-28 20:28:25 -04:00
lzw_early_change_1.bin fix(pdftract-2uk9z): wrap native module results in typed Python objects 2026-05-28 21:18:38 -04:00
lzw_early_change_1.expected fix(pdftract-25igv): fix emit! macro usage in codespace parser 2026-05-28 07:29:33 -04:00
lzw_early_change_1.meta fix(pyo3): correct extract_text_fn call in extract_markdown stub 2026-05-28 20:28:25 -04:00
regen_fixtures.py fix(pyo3): correct extract_text_fn call in extract_markdown stub 2026-05-28 20:28:25 -04:00
regen_lzw_fixtures.rs fix(pyo3): correct extract_text_fn call in extract_markdown stub 2026-05-28 20:28:25 -04:00
runlength_basic.bin fix(pyo3): correct extract_text_fn call in extract_markdown stub 2026-05-28 20:28:25 -04:00
runlength_basic.expected fix(pyo3): correct extract_text_fn call in extract_markdown stub 2026-05-28 20:28:25 -04:00
runlength_basic.meta fix(pyo3): correct extract_text_fn call in extract_markdown stub 2026-05-28 20:28:25 -04:00
unknown_filter.bin fix(pyo3): correct extract_text_fn call in extract_markdown stub 2026-05-28 20:28:25 -04:00
unknown_filter.expected fix(pyo3): correct extract_text_fn call in extract_markdown stub 2026-05-28 20:28:25 -04:00
unknown_filter.meta fix(pyo3): correct extract_text_fn call in extract_markdown stub 2026-05-28 20:28:25 -04:00