pdftract/check_content.py
jedarden 1c6f26ecaa fix(bf-4mkhv): clean up unused imports in hash.rs
The bead description mentioned compile errors in hash.rs from API drift,
but those errors were either already fixed or misattributed. The API usage
was already correct:
- compute_fingerprint already takes 3 arguments with source
- len() already propagates Result with ?
- read_at method already used correctly
- Catalog fields accessed via trailer correctly

Only cleanup: removed unused std::fs::File and std::io imports.

Verification: notes/bf-4mkhv.md
2026-06-01 09:43:48 -04:00

21 lines
671 B
Python

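#!/usr/bin/env python3
"""Print the first 200 bytes of the first page content stream in each fixture."""
import sys

try:
    import pikepdf
except ImportError:
    sys.exit("pikepdf not available")


def extract_text(path):
    """Print a prefix of the first /Contents entry found in the PDF at `path`.

    Despite the name, this dumps content-stream bytes, not extracted text.
    """
    with pikepdf.open(path) as pdf:
        for page in pdf.pages:
            if "/Contents" in page:
                contents = page["/Contents"]
                if hasattr(contents, "read_bytes"):
                    data = contents.read_bytes()
                else:
                    data = bytes(contents)
                print(f"{path}: {data[:200]}")
                # Stop after the first page that has a content stream.
                break


extract_text("tests/fingerprint/fixtures/content_edit_one_glyph/v1.pdf")
extract_text("tests/fingerprint/fixtures/content_edit_one_glyph/v2.pdf")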
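
The script above prints a prefix of each fixture's content stream and leaves the comparison to the reader. Since the fixture directory is named content_edit_one_glyph, a small byte-level diff may be a useful companion. The sketch below is an assumption about how one might locate the edit; `first_difference` and `context` are hypothetical helpers, not part of the repository.

```python
from typing import Optional


def first_difference(a: bytes, b: bytes) -> Optional[int]:
    """Return the index of the first differing byte, or None if equal.

    Hypothetical helper for illustration; not part of the repository.
    """
    for i, (x, y) in enumerate(zip(a, b)):
        if x != y:
            return i
    # One input is a prefix of the other: they differ where the shorter ends.
    if len(a) != len(b):
        return min(len(a), len(b))
    return None


def context(data: bytes, index: int, width: int = 16) -> bytes:
    """Return up to `width` bytes on either side of `index`."""
    start = max(0, index - width)
    return data[start:index + width]
```

One possible use: collect the bytes that `extract_text` reads for v1.pdf and v2.pdf, pass both to `first_difference`, and print `context(...)` around the returned offset for each file. This only makes sense on decoded stream data; comparing compressed bytes would likely report a difference far from the actual glyph edit.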