The encrypt_padded_mut API requires the buffer to be large enough to
hold the padded ciphertext. The tests were using plaintext.to_vec() which
only allocated plaintext.len() bytes, insufficient for padding.
Changed pattern:
- Before: plaintext.to_vec() (insufficient space)
- After: vec![0u8; plaintext.len() + 16] with copy_from_slice
Also fixed incorrect usage: encrypt_padded_mut returns Result<(), Error>,
not a length. Use data_copy.len() directly for ciphertext length.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
- Fixed missing fields in BlockJson, SpanJson, ExtractionOptions initializations
- Added feature gates to ocr_integration tests for conditional compilation
- Fixed McpServerState::new calls to include audit writer argument
- Fixed CCITTFaxDecoder::decode calls to use instance method
- Fixed type casts for ObjRef::new calls
- Fixed serde_json::Value method calls (is_some -> !is_null)
- Fixed ProfileType test feature gates
- Worked around lifetime issues in schema roundtrip tests
These changes fix numerous compilation errors that were blocking the
codebase from building. The main library and tests now compile successfully.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Implement per-word validation filter for assisted-OCR BrokenVector path.
Changes:
- Add SpanSource::OcrAssisted variant to hybrid.rs
- Add Span::ocr_assisted() helper method
- Implement validate_ocr_with_position_hints() in ocr.rs
- 5pt distance threshold for position validation
- 0.4 confidence cap for rejected words
- Linear scan for nearest-neighbor lookup
- Add unit tests for validation filter
Closes: pdftract-3s2i
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>