pdftract/scripts/doc_coverage.sh
jedarden f85e5149dd feat(pdftract-91e1i): HTTP fetch sequence implementation
Implement orchestration layer connecting HttpRangeSource to Phase 1.3
xref resolver and Phase 1.4 document model for remote PDF access:

- Document::open_remote() public API for remote PDF loading
- Progressive tail fetch (16 KB → 1 MB) for startxref location
- Xref forward-scan disabled for remote sources (via is_remote check)
- Page-by-page on-demand fetch via HttpRangeSource caching
- Resource lazy load through XrefResolver cache
- HEAD probe with 405 fallback, no Content-Length handling

Acceptance criteria:
 open_remote(url) returns Document with correct page count
 HEAD failure modes (405, no Content-Length, 401) handled
 xref forward-scan disabled for remote (is_remote check)
 Page-by-page on-demand fetch (HttpRangeSource LRU cache)
 INV-8 maintained (all errors return Result)

Files modified:
- crates/pdftract-core/src/document.rs (Document::open_remote, from_source)
- crates/pdftract-core/src/remote.rs (progressive tail fetch)
- crates/pdftract-core/src/lib.rs (re-exports)

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-28 13:17:00 -04:00

19 lines
815 B
Bash

#!/usr/bin/env bash
# Script to measure rustdoc coverage for pdftract-core
cd /home/coding/pdftract || exit 1
# Find all public items (pub fn, pub struct, pub enum, pub trait, pub mod, pub type, pub const)
# Count lines with pub declarations
TOTAL_ITEMS=$(grep -rn '^pub ' crates/pdftract-core/src --include='*.rs' 2>/dev/null | wc -l)
# Find doc comments (/// or //!)
DOC_COMMENTS=$(grep -rn '^////' crates/pdftract-core/src --include='*.rs' 2>/dev/null | wc -l)
# This is a rough estimate; we need a more sophisticated tool
echo "Public item declarations: $TOTAL_ITEMS"
echo "Doc comment lines: $DOC_COMMENTS"
echo "Note: This is a rough count. Real coverage needs rustdoc analysis."
# For better coverage, we'll use cargo-deadlinks or similar tools
# For now, let's just build the docs and see what happens