jedarden
|
8c9a940159
|
feat(pdftract-15pz8): implement multi-process safe cache operations
Implements Phase 6.9.5: atomic file writes and concurrent access safety
for multiple pdftract processes sharing the same cache directory.
## Changes
- Add `multi_process.rs` module with atomic write/read primitives
- Atomic write protocol: temp file + fsync + rename
- Reader protocol with corruption handling (deletes corrupt entries)
- Startup cleanup of stale temp files (> 1 hour old)
- fsync control via PDFTRACT_CACHE_NO_FSYNC env var
- No distributed locks - tolerates duplicated work on first-miss races
## Module structure
- `Writer`: Atomic cache entry writes via temp + rename
- `Reader`: Safe reads with decompression and corruption detection
- `cleanup_stale_temp_files()`: Startup cleanup for crash-recovered temp files
## Acceptance criteria met
- [x] Concurrent extractors on same fingerprint: both succeed; no deadlock
- [x] Reader sees fully-decompressable entry always (never torn write)
- [x] 8 concurrent writers writing 8 different keys: all materialize correctly
- [x] Corrupt entry on disk: treated as miss; entry deleted
- [x] Stale temp file > 1 hour old: cleaned up at startup
- [x] Stress test: 4 processes × 100 iterations → no errors
## Tests
- 18 tests in `multi_process.rs`
- 92 total cache module tests pass
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
|
2026-05-23 05:31:11 -04:00 |
|