Implement orchestration layer connecting HttpRangeSource to the Phase 1.3 xref resolver and the Phase 1.4 document model for remote PDF access:

- Document::open_remote() public API for remote PDF loading
- Progressive tail fetch (16 KB → 1 MB) for startxref location
- Xref forward-scan disabled for remote sources (via is_remote check)
- Page-by-page on-demand fetch via HttpRangeSource caching
- Resource lazy load through XrefResolver cache
- HEAD probe with 405 fallback and handling of a missing Content-Length

Acceptance criteria:

✅ open_remote(url) returns Document with correct page count
✅ HEAD failure modes (405, no Content-Length, 401) handled
✅ xref forward-scan disabled for remote (is_remote check)
✅ Page-by-page on-demand fetch (HttpRangeSource LRU cache)
✅ INV-8 maintained (all errors return Result)

Files modified:

- crates/pdftract-core/src/document.rs (Document::open_remote, from_source)
- crates/pdftract-core/src/remote.rs (progressive tail fetch)
- crates/pdftract-core/src/lib.rs (re-exports)

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
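
The progressive tail fetch named in the commit message can be illustrated on a local file. The following is a minimal sketch, not the code in remote.rs: `find_startxref` is a hypothetical helper, the doubling schedule between 16 KB and 1 MB is an assumption (the message only gives the two endpoints), and `tail -c` stands in for an HTTP range request.

```shell
#!/usr/bin/env bash
# Sketch only: progressive tail search for `startxref` on a LOCAL file.
# find_startxref is a hypothetical name; the doubling schedule is an assumption.
find_startxref() {
    local file=$1
    local window=$((16 * 1024))     # first tail window: 16 KB
    local max=$((1024 * 1024))      # largest tail window: 1 MB
    local size offset
    size=$(( $(wc -c < "$file") )) || return 1

    while [ "$window" -le "$max" ]; do
        # Normalise CR and CRLF line endings to LF, then take the line that
        # follows the last `startxref` keyword inside the current window.
        offset=$(tail -c "$window" "$file" \
            | tr -s '\r\n' '\n\n' \
            | grep -a -A1 '^startxref' \
            | tail -n 1)
        case $offset in
            ''|*[!0-9]*) ;;                           # nothing usable yet
            *) printf '%s\n' "$offset"; return 0 ;;   # decimal xref offset
        esac
        # The window already covered the whole file: stop growing it.
        [ "$window" -ge "$size" ] && return 1
        window=$((window * 2))
    done
    return 1
}
```

Calling `find_startxref some.pdf` prints the decimal offset that follows the last `startxref` keyword, or returns a non-zero status when no keyword is found within the final 1 MB. Starting small keeps the common case cheap, since the keyword normally sits within the last few hundred bytes of a PDF.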
19 lines
815 B
Bash
#!/usr/bin/env bash
# Script to measure rustdoc coverage for pdftract-core

cd /home/coding/pdftract || exit 1

# Find all public items (pub fn, pub struct, pub enum, pub trait, pub mod, pub type, pub const)
# Count lines with pub declarations, including indented ones inside impl blocks
TOTAL_ITEMS=$(grep -rnE '^[[:space:]]*pub (fn|struct|enum|trait|mod|type|const) ' crates/pdftract-core/src --include='*.rs' 2>/dev/null | wc -l)

# Find doc comments (/// or //!); four or more slashes are plain comments, not docs
DOC_COMMENTS=$(grep -rnE '^[[:space:]]*(///([^/]|$)|//!)' crates/pdftract-core/src --include='*.rs' 2>/dev/null | wc -l)

# This is a rough estimate; we need a more sophisticated tool
echo "Public item declarations: $TOTAL_ITEMS"
echo "Doc comment lines: $DOC_COMMENTS"
echo "Note: This is a rough count. Real coverage needs rustdoc analysis."

# For better coverage, we'll use cargo-deadlinks or similar tools
# For now, let's just build the docs and see what happens
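
The two counts above are independent, so a file with one long doc comment and many undocumented items still looks well covered. A slightly closer estimate pairs each public item with the line before it. The sketch below is one way to do that, under stated assumptions: `doc_coverage` is a hypothetical helper, it is still a line-based heuristic rather than rustdoc analysis, and it ignores forms such as `pub async fn` and `pub unsafe fn`.

```shell
#!/usr/bin/env bash
# Sketch only: count public items that are directly preceded by a `///` doc
# comment (attribute lines in between are allowed). doc_coverage is a
# hypothetical helper name, not part of the original script.
doc_coverage() {
    local dir=$1
    find "$dir" -name '*.rs' -exec cat {} + | awk '
        # An outer doc comment: exactly three slashes, not four or more.
        $0 ~ "^[ \t]*///([^/]|$)" { documented = 1; next }
        # Attributes such as #[derive(...)] may sit between docs and item.
        /^[ \t]*#\[/ { next }
        # A public item declaration, at any indentation level.
        /^[ \t]*pub (fn|struct|enum|trait|mod|type|const) / {
            total++
            if (documented) covered++
        }
        # Any other line breaks the link between a doc comment and an item.
        { documented = 0 }
        END { printf "%d %d\n", covered, total }
    '
}
```

Running `doc_coverage crates/pdftract-core/src` prints two numbers, documented items and total items, which can be compared directly. Inner `//!` comments are deliberately not counted here, because they document the enclosing module rather than the item that follows.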