This commit fixes a compilation error in the javascript tests that were using PageDict::default(). The JBIG2 decoder module was already fully implemented; this change only enables the tests to compile and run.

Changes:
- Add Default impl for PageDict in parser/pages.rs
- Verify all 11 JBIG2-related tests pass

The JBIG2Decode passthrough filter implementation is complete:
- Passthrough of raw JBIG2 bytes
- /JBIG2Globals reference recording for downstream consumers
- OCR_JBIG2_UNSUPPORTED diagnostic emission when full-render disabled

Co-Authored-By: Claude Code <noreply@anthropic.com>
pdftract-2sswr: JBIG2Decode passthrough + /JBIG2Globals reference recording + OCR_JBIG2_UNSUPPORTED diagnostic
Summary
Verified that the JBIG2Decode passthrough filter implementation is complete and functional. The JBIG2 decoder module (crates/pdftract-core/src/decoder/jbig2.rs) was already implemented with all required functionality.
Acceptance Criteria Status
PASS
- JBIG2 stream with full-render feature → pass-through, no diagnostic (stream.rs:3542-3548)
- JBIG2 stream WITHOUT full-render → OCR_JBIG2_UNSUPPORTED diagnostic; pass-through anyway (stream.rs:3542-3548)
- /JBIG2Globals reference recorded on StreamMeta (stream.rs:3550-3556)
- Self-contained JBIG2 (no globals): StreamMeta.jbig2_globals_ref is None (field defaults to None)
WARN
- Round-trip test with reference JBIG2 fixture: Unit tests in stream.rs (test_jbig2_passthrough, test_jbig2_extract_globals_ref, etc.) verify the passthrough and globals extraction functionality with mock data. No actual JBIG2 PDF fixture exists in the test suite.
Changes Made
Fixed compilation error in parser/pages.rs
- Added Default implementation for PageDict struct to fix compilation errors in javascript.rs tests
- The PageDict::default() method is used in javascript detection tests
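As a rough illustration of the kind of change described above, here is a minimal sketch of a Default implementation. The real PageDict fields are not shown in these notes, so the three fields below (media_box, rotate, has_additional_actions) and the page_with_actions helper are hypothetical stand-ins.

```rust
/// Hypothetical stand-in for PageDict. The actual struct in
/// parser/pages.rs is not shown in these notes; these fields are assumed.
#[derive(Debug, Clone, PartialEq)]
pub struct PageDict {
    pub media_box: Option<[f64; 4]>,
    pub rotate: i32,
    pub has_additional_actions: bool,
}

impl Default for PageDict {
    /// An "empty" page: no media box, no rotation, no actions.
    fn default() -> Self {
        Self {
            media_box: None,
            rotate: 0,
            has_additional_actions: false,
        }
    }
}

/// Test-style helper (hypothetical): override one field and take the
/// rest from Default using struct update syntax.
pub fn page_with_actions() -> PageDict {
    PageDict {
        has_additional_actions: true,
        ..PageDict::default()
    }
}
```

With a Default impl in place, tests can build a page with only the fields they care about. If every field type already implements Default, #[derive(Default)] would likely achieve the same result with less code.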
Verified existing implementation
The following components were already implemented and verified working:
crates/pdftract-core/src/decoder/jbig2.rs (225 lines):
- Jbig2GlobalsRef struct - captures ObjRef to globals stream
- Jbig2Decoder struct - handles passthrough and diagnostic emission
- extract_globals_ref() - extracts /JBIG2Globals reference from stream dict
- emit_unsupported_diagnostic() - emits OCR_JBIG2_UNSUPPORTED when full-render not available
- has_full_render() - checks cfg!(feature = "full-render") at compile time
- Read trait implementation for passthrough compatibility
- 6 unit tests (all passing)
crates/pdftract-core/src/parser/stream.rs (integration):
- Lines 3542-3548: Emit OCR_JBIG2_UNSUPPORTED diagnostic when full-render disabled
- Lines 3550-3556: Extract /JBIG2Globals reference and store in stream_meta
- Lines 5742-5831: 5 integration tests for JBIG2 passthrough (all passing)
crates/pdftract-core/src/diagnostics.rs:
- DiagCode::OcrJbig2Unsupported defined at line 633
- Diagnostic info at lines 1951-1955 (Warning severity, recoverable)
Test Results
All 11 JBIG2-related tests pass:
test decoder::jbig2::tests::test_emit_unsupported_diagnostic_when_feature_disabled ... ok
test decoder::jbig2::tests::test_extract_globals_ref_with_valid_ref ... ok
test decoder::jbig2::tests::test_extract_globals_ref_with_invalid_type ... ok
test decoder::jbig2::tests::test_extract_globals_ref_without_globals ... ok
test decoder::jbig2::tests::test_jbig2_decoder_const ... ok
test decoder::jbig2::tests::test_jbig2_globals_ref_const ... ok
test parser::stream::source_tests::test_jbig2_bomb_limit ... ok
test parser::stream::source_tests::test_jbig2_extract_globals_ref ... ok
test parser::stream::source_tests::test_jbig2_extract_globals_ref_invalid_type ... ok
test parser::stream::source_tests::test_jbig2_extract_globals_ref_missing ... ok
test parser::stream::source_tests::test_jbig2_passthrough ... ok
Implementation Details
Per PDF spec 7.4.7:
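- JBIG2Decode is the filter for bitonal (1 bit per pixel) image data encoded with JBIG2, which supports both lossless and lossy compression
- /JBIG2Globals is an indirect reference to a stream holding the JBIG2 global segments (for example, symbol dictionaries shared across images)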
- Without globals, the stream is self-contained (still decodable)
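The globals lookup described above can be sketched as follows. This is a simplified model rather than the code in decoder/jbig2.rs: the ObjRef, Object, and Dict types are hypothetical stand-ins for the crate's own PDF object types, and the dictionary is keyed by the name without its leading slash.

```rust
use std::collections::HashMap;

/// Hypothetical indirect-object reference (object number + generation).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct ObjRef {
    pub num: u32,
    pub generation: u16,
}

/// Hypothetical, heavily reduced PDF object model.
#[derive(Debug, Clone, PartialEq)]
pub enum Object {
    Integer(i64),
    Name(String),
    Reference(ObjRef),
}

pub type Dict = HashMap<String, Object>;

/// Returns the /JBIG2Globals reference if the entry is present and is an
/// indirect reference. A missing entry (self-contained stream) and an
/// entry of the wrong type both yield None.
pub fn extract_globals_ref(dict: &Dict) -> Option<ObjRef> {
    match dict.get("JBIG2Globals") {
        Some(Object::Reference(r)) => Some(*r),
        _ => None,
    }
}
```

The three outcomes (valid reference, wrong type, missing entry) line up with the extract_globals_ref test names listed in the test results. Treating a wrong-typed entry as None is an assumption of this sketch.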
Passthrough behavior (EC-11):
- With full-render feature: Passthrough only, no diagnostic
- Without full-render: Emit OCR_JBIG2_UNSUPPORTED diagnostic, still passthrough
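The two cases above can be sketched as a thin Read wrapper. This is an assumption-laden illustration, not the crate's Jbig2Decoder: the Jbig2Passthrough type, the passthrough_bytes helper, and the local DiagCode enum are hypothetical, and the full-render check is passed in as a runtime bool so both branches can be exercised, whereas the notes say the real check uses cfg!(feature = "full-render") at compile time.

```rust
use std::io::{self, Read};

/// Hypothetical local copy of the diagnostic code named in these notes.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum DiagCode {
    OcrJbig2Unsupported,
}

/// Hypothetical passthrough wrapper: forwards raw JBIG2 bytes unchanged.
pub struct Jbig2Passthrough<R> {
    inner: R,
}

impl<R: Read> Jbig2Passthrough<R> {
    /// Records the diagnostic when full rendering is unavailable, then
    /// wraps the reader either way (the bytes still pass through).
    pub fn new(inner: R, full_render: bool, diags: &mut Vec<DiagCode>) -> Self {
        if !full_render {
            diags.push(DiagCode::OcrJbig2Unsupported);
        }
        Self { inner }
    }
}

impl<R: Read> Read for Jbig2Passthrough<R> {
    /// No decoding happens here: reads are delegated to the inner reader.
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        self.inner.read(buf)
    }
}

/// Convenience helper (hypothetical): run a byte slice through the
/// wrapper and return the output together with any diagnostics.
pub fn passthrough_bytes(raw: &[u8], full_render: bool) -> io::Result<(Vec<u8>, Vec<DiagCode>)> {
    let mut diags = Vec::new();
    let mut reader = Jbig2Passthrough::new(raw, full_render, &mut diags);
    let mut out = Vec::new();
    reader.read_to_end(&mut out)?;
    Ok((out, diags))
}
```

In both modes the output bytes equal the input. Only the diagnostics list differs, which matches the "emit diagnostic, still passthrough" behavior stated above.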
Files Modified
- crates/pdftract-core/src/parser/pages.rs - Added Default impl for PageDict
Files Verified (no changes needed)
- crates/pdftract-core/src/decoder/jbig2.rs - Complete implementation
- crates/pdftract-core/src/decoder/mod.rs - Module exports
- crates/pdftract-core/src/parser/stream.rs - Integration and diagnostics
- crates/pdftract-core/src/diagnostics.rs - Diagnostic code definition
- crates/pdftract-core/src/lib.rs - Public module export