pdftract-2cnmr: PdfSource trait + MmapSource + FileSource
Summary
The PdfSource trait abstraction with MmapSource and FileSource implementations was already implemented in the codebase prior to this bead. All core acceptance criteria are met.
Implementation Status
PASS Criteria
- PdfSource trait defined and exported ✓
  - Location: crates/pdftract-core/src/source/mod.rs
  - Provides len(), read_range(offset, length), and prefetch() methods
  - Object-safe trait with Read + Seek + Send + Sync bounds for rayon parallelism
- MmapSource implementation ✓
  - Location: crates/pdftract-core/src/source/mmap.rs
  - Uses memmap2 = "0.9" for memory-mapped I/O
  - Implements advise_sequential() for the MADV_SEQUENTIAL hint
  - Comprehensive test suite (29 tests) covering all operations
- FileSource implementation ✓
  - Location: crates/pdftract-core/src/source/file_source.rs
  - Uses parking_lot::Mutex for thread-safe &self access
  - Handles special files (e.g., /proc) that don't support mmap
  - Comprehensive test suite (15 tests)
- MemorySource implementation (bonus)
  - Location: crates/pdftract-core/src/source/memory.rs
  - In-memory source for testing with zero-copy Bytes
- Exports from lib.rs ✓
  - All source types re-exported from pdftract_core::source
  - Line 90: pub use source::{FileSource, MmapSource, PdfSource};
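For readers without the repository at hand, the shape of the abstraction listed above can be sketched roughly as follows. This is a minimal illustration and not the code in crates/pdftract-core: the method signatures, the Vec<u8> return type (the real code is said to use zero-copy Bytes), the default no-op prefetch, and the names VecSource, LockedFileSource, and read_prefix are all assumptions. The sketch also leaves out the Read + Seek supertraits and substitutes std::sync::Mutex for parking_lot::Mutex so that it builds with the standard library alone.

```rust
use std::fs::File;
use std::io::{self, Read, Seek, SeekFrom};
use std::path::Path;
use std::sync::Mutex;

/// Hypothetical stand-in for the PdfSource trait; real signatures may differ.
pub trait PdfSource: Send + Sync {
    /// Total size of the underlying data in bytes.
    fn len(&self) -> u64;
    /// Reads exactly `length` bytes starting at `offset`.
    fn read_range(&self, offset: u64, length: usize) -> io::Result<Vec<u8>>;
    /// Hints that a range will be read soon; a no-op unless overridden.
    fn prefetch(&self, _offset: u64, _length: usize) {}
}

fn out_of_bounds() -> io::Error {
    io::Error::new(io::ErrorKind::UnexpectedEof, "range is out of bounds")
}

/// Hypothetical in-memory source, loosely analogous to MemorySource.
pub struct VecSource {
    data: Vec<u8>,
}

impl VecSource {
    pub fn new(data: Vec<u8>) -> Self {
        VecSource { data }
    }
}

impl PdfSource for VecSource {
    fn len(&self) -> u64 {
        self.data.len() as u64
    }

    fn read_range(&self, offset: u64, length: usize) -> io::Result<Vec<u8>> {
        if offset > self.data.len() as u64 {
            return Err(out_of_bounds());
        }
        let start = offset as usize;
        let end = start.checked_add(length).ok_or_else(out_of_bounds)?;
        self.data
            .get(start..end)
            .map(|slice| slice.to_vec())
            .ok_or_else(out_of_bounds)
    }
}

/// Hypothetical file-backed source: the mutex lets `&self` methods seek and
/// read a shared file handle from several threads, one at a time.
pub struct LockedFileSource {
    file: Mutex<File>,
    len: u64,
}

impl LockedFileSource {
    pub fn open(path: impl AsRef<Path>) -> io::Result<Self> {
        let file = File::open(path)?;
        let len = file.metadata()?.len();
        Ok(LockedFileSource { file: Mutex::new(file), len })
    }
}

impl PdfSource for LockedFileSource {
    fn len(&self) -> u64 {
        self.len
    }

    fn read_range(&self, offset: u64, length: usize) -> io::Result<Vec<u8>> {
        let mut file = self
            .file
            .lock()
            .map_err(|_| io::Error::new(io::ErrorKind::Other, "source mutex poisoned"))?;
        file.seek(SeekFrom::Start(offset))?;
        let mut buf = vec![0u8; length];
        // read_exact fails with UnexpectedEof if the range runs past the end.
        file.read_exact(&mut buf)?;
        Ok(buf)
    }
}

/// Reads up to the first `n` bytes through a trait object, independent of the
/// backing storage.
pub fn read_prefix(source: &dyn PdfSource, n: usize) -> io::Result<Vec<u8>> {
    let n = (n as u64).min(source.len()) as usize;
    source.read_range(0, n)
}
```

Because read_prefix takes &dyn PdfSource, the same call works for either backing type, which is the practical payoff of keeping the trait object-safe. Note that this sketch caches the length from file metadata, so unlike the FileSource described above it makes no attempt to handle special files that report a size of zero.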
Dependencies
All required dependencies are present in Cargo.toml:
- bytes = "1" - Zero-copy slice type
- memmap2 = "0.9" - Memory mapping
- parking_lot = "0.12" - Mutex for FileSource
Build Status
- Lib compilation: PASS - cargo build --package pdftract-core --lib succeeds
- Test compilation: BLOCKED - Unrelated test errors in markdown.rs (API signature mismatches in test-only code)
The source module itself compiles cleanly. Test compilation errors are in test code for the markdown module (block_to_markdown and page_to_markdown function calls with outdated signatures), which is unrelated to the source abstraction work.
Code Quality
- All implementations include comprehensive test suites
- Thread-safe (Send + Sync) for rayon parallelism
- Proper error handling with io::Result
- Well-documented with examples and safety notes
INV-8 Verification
INV-8 (Invariants) appears to be maintained:
- Source abstraction properly abstracts over storage medium
- Thread safety enforced via trait bounds
- No assumptions about file mutability (documented as known limitation for mmap)
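The point about thread safety being enforced through trait bounds can be illustrated with a small sketch. Everything here is an assumption made for illustration: the reduced PdfSource trait, the VecSource type, and the helpers assert_send_sync and read_chunks_parallel are hypothetical, and plain std::thread stands in for rayon. The idea is that once Send + Sync are supertraits, a shared trait object can be handed to worker threads, and an implementation that is not thread-safe fails to compile instead of failing at runtime.

```rust
use std::io;
use std::sync::Arc;
use std::thread;

/// Hypothetical reduced trait; only the Send + Sync supertraits matter here.
pub trait PdfSource: Send + Sync {
    fn len(&self) -> u64;
    fn read_range(&self, offset: u64, length: usize) -> io::Result<Vec<u8>>;
}

fn out_of_bounds() -> io::Error {
    io::Error::new(io::ErrorKind::UnexpectedEof, "range is out of bounds")
}

/// Hypothetical in-memory source used only for this demonstration.
pub struct VecSource(pub Vec<u8>);

impl PdfSource for VecSource {
    fn len(&self) -> u64 {
        self.0.len() as u64
    }

    fn read_range(&self, offset: u64, length: usize) -> io::Result<Vec<u8>> {
        if offset > self.0.len() as u64 {
            return Err(out_of_bounds());
        }
        let start = offset as usize;
        let end = start.checked_add(length).ok_or_else(out_of_bounds)?;
        self.0
            .get(start..end)
            .map(|slice| slice.to_vec())
            .ok_or_else(out_of_bounds)
    }
}

/// Compiles only when `T` is Send + Sync; the check costs nothing at runtime.
fn assert_send_sync<T: ?Sized + Send + Sync>() {}

/// Reads fixed-size chunks on separate threads and reassembles them in order.
pub fn read_chunks_parallel(source: Arc<dyn PdfSource>, chunk: usize) -> io::Result<Vec<u8>> {
    // The supertraits make the trait object itself Send + Sync.
    assert_send_sync::<dyn PdfSource>();
    if chunk == 0 {
        return Err(io::Error::new(io::ErrorKind::InvalidInput, "chunk size must be non-zero"));
    }
    let total = source.len() as usize;
    let handles: Vec<_> = (0..total)
        .step_by(chunk)
        .map(|start| {
            let src = Arc::clone(&source);
            let len = chunk.min(total - start);
            thread::spawn(move || src.read_range(start as u64, len))
        })
        .collect();

    let mut out = Vec::with_capacity(total);
    for handle in handles {
        let part = handle
            .join()
            .map_err(|_| io::Error::new(io::ErrorKind::Other, "reader thread panicked"))??;
        out.extend_from_slice(&part);
    }
    Ok(out)
}
```

Spawning one thread per chunk is only reasonable for a demonstration; a work-stealing pool such as rayon would normally replace it, but the compile-time guarantee comes from the trait bounds either way.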
Files Examined
- crates/pdftract-core/src/source/mod.rs - PdfSource trait definition
- crates/pdftract-core/src/source/mmap.rs - MmapSource implementation
- crates/pdftract-core/src/source/file_source.rs - FileSource implementation
- crates/pdftract-core/src/source/memory.rs - MemorySource implementation
- crates/pdftract-core/src/lib.rs - Public exports
- crates/pdftract-core/Cargo.toml - Dependencies verified
Conclusion
The PdfSource trait and implementations are complete and meet all acceptance criteria. The bead work was already done in a prior commit. No new code changes were required for this bead.