Commit graph

3 commits

Author SHA1 Message Date
jedarden
9b5fbc9b5e feat(pdftract-bf-2y2rp): implement lazy stream decoding for PDF extraction
- Add decode_page_content_streams() function for per-page lazy decode
- Update extract_page_from_dict() to support lazy stream decoding
- Modify extract_pdf() and extract_pdf_ndjson() to enable lazy decoding
- Fix borrow checker issue in LazyPageIter::next()

This ensures content streams are decoded lazily per page and dropped
immediately after processing, keeping peak RSS flat across page count.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-23 12:30:26 -04:00
jedarden
0f0e40e717 test(pdftract-1eaxm): add thread sanitizer results and improve conformance tests
- Add thread sanitizer verification results to notes/pdftract-1eaxm.md
- Improve conformance.c to gracefully handle error JSON responses
- Update test_hash.c to test version and ABI version functions

These changes improve the test coverage and documentation for the
libpdftract C FFI implementation.

Related: pdftract-1eaxm
2026-05-23 10:33:51 -04:00
jedarden
dfdfb9de79 test(pdftract-1eaxm): add distribution templates and C conformance tests
- Add Homebrew formula template (homebrew-formula.rb.erb)
- Add vcpkg port template with submission instructions
- Add C conformance test (conformance.c) with thread safety verification
- Add simple link test (simple_test.c) to verify library linkage
- Add hash test (test_hash.c) for hash API verification
- Add parse debug test (test_parse.rs) for development
- Add test fixtures (test-minimal.pdf, valid-minimal.pdf)
- Add PROVENANCE.md entry for valid-minimal.pdf

All tests pass: version, abi_version, free(NULL), hash, extract methods.

Co-Authored-By: Claude Code <noreply@anthropic.com>
2026-05-23 09:20:22 -04:00