pdftract/test_trailer_parsing.rs
jedarden 9b5fbc9b5e feat(pdftract-bf-2y2rp): implement lazy stream decoding for PDF extraction
- Add decode_page_content_streams() function for per-page lazy decode
- Update extract_page_from_dict() to support lazy stream decoding
- Modify extract_pdf() and extract_pdf_ndjson() to enable lazy decoding
- Fix borrow checker issue in LazyPageIter::next()

This ensures content streams are decoded lazily per page and dropped
immediately after processing, keeping peak RSS flat across page count.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-23 12:30:26 -04:00

20 lines
580 B
Rust

use pdftract_core::document::parse_pdf_file;
use std::path::Path;
fn main() {
let pdf_path = Path::new("/tmp/valid_test.pdf");
match parse_pdf_file(pdf_path) {
Ok((fingerprint, catalog, pages, resolver)) => {
println!("Success!");
println!("Fingerprint: {}", fingerprint);
println!("Pages: {}", pages.len());
}
Err(e) => {
println!("Error: {}", e);
println!("Error chain:");
for cause in e.chain() {
println!(" - {}", cause);
}
}
}
}