fix(pdftract-5z5d8): add pre-commit hook for provenance validation

Add pre-commit hook that runs check-provenance.sh before each commit
to ensure fixture files always have valid provenance entries. Update
PROVENANCE.md with validation section documenting the hook usage.

Acceptance criteria:
- PROVENANCE.md exists with one row per fixture file ✓
- Every fixture file enumerated; no orphans ✓
- License column populated; only approved licenses ✓
- SHA256 column populated; matches actual content ✓
- check-provenance.sh validates manifest; CI gate green ✓
- Synthetic fixtures point at generation scripts ✓

Refs: pdftract-5z5d8

Co-Authored-By: Claude Code <noreply@anthropic.com>
jedarden 2026-05-17 23:50:28 -04:00
parent b535638104
commit b4fac0932f
2 changed files with 34 additions and 0 deletions

.git-hooks/pre-commit Executable file

@@ -0,0 +1,14 @@
#!/usr/bin/env bash
# Pre-commit hook: Validate fixture provenance before allowing commits.
# This ensures every fixture file has a corresponding PROVENANCE.md entry.
#
# To install this hook:
# ln -s ../../.git-hooks/pre-commit .git/hooks/pre-commit
# Or run: make install-hooks (if Makefile exists)
set -e
# Run the provenance validation script
bash scripts/check-provenance.sh
exit 0
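
The header comment installs the hook with a relative symlink. The following is a minimal sketch of what an install helper could look like; `install_hook` is a hypothetical name, not something this commit adds, and it assumes an ordinary clone where `.git` is a directory (not a linked worktree or submodule).

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of this commit): link the tracked hook
# into a clone's .git/hooks directory.
# Usage: install_hook REPO_ROOT
install_hook() {
  local root="$1"
  local src="$root/.git-hooks/pre-commit"
  local link="$root/.git/hooks/pre-commit"

  # Refuse to create a dangling link if the tracked hook is missing.
  [ -f "$src" ] || { echo "hook script not found: $src" >&2; return 1; }

  mkdir -p "$root/.git/hooks"
  chmod +x "$src"

  # A relative symlink target is resolved from the link's own directory
  # (.git/hooks), so ../../ climbs back to the repository root.
  # Note: -f replaces any pre-commit hook that is already installed.
  ln -sf ../../.git-hooks/pre-commit "$link"
}
```

Calling `install_hook "$(git rev-parse --show-toplevel)"` from anywhere inside the working tree would link the hook for that clone. Because the link is relative, it keeps working if the clone is moved.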


@@ -2,6 +2,26 @@
This manifest tracks the origin and licensing of every fixture file in `tests/fixtures/`.
## Validation
A pre-commit hook automatically validates this manifest before each commit:
```bash
# Install the hook (one-time setup)
ln -s ../../.git-hooks/pre-commit .git/hooks/pre-commit
```
The hook runs `scripts/check-provenance.sh` to ensure:
- Every fixture file has a corresponding entry in this manifest
- SHA256 hashes match the actual file content
- All licenses are from the approved list
To manually validate the manifest:
```bash
bash scripts/check-provenance.sh
```
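The three checks listed above can be sketched roughly as follows. This is an illustration only, not the contents of `scripts/check-provenance.sh`, which this commit does not show. The function names, the `APPROVED_LICENSES` list, and the assumption that the Path column is relative to the fixture directory are all hypothetical. It also assumes GNU `sha256sum` is available.

```bash
#!/usr/bin/env bash
# Hypothetical sketch of the manifest checks; NOT the real check-provenance.sh.
# Assumes rows shaped like: | Path | Source URL | License | Date | SHA256 | Notes |

# Assumed allow-list; the project's actual approved licenses are not shown here.
APPROVED_LICENSES="MIT Apache-2.0 CC0-1.0"

# Strip leading and trailing whitespace from a table cell.
trim() {
  local s="$1"
  s="${s#"${s%%[![:space:]]*}"}"
  s="${s%"${s##*[![:space:]]}"}"
  printf '%s' "$s"
}

# Usage: check_manifest MANIFEST FIXTURE_DIR
# Prints one line per problem; returns non-zero if any problem was found.
check_manifest() {
  local manifest="$1" dir="$2" status=0 listed=""
  local lead path url license date sha notes actual file rel

  while IFS='|' read -r lead path url license date sha notes; do
    path="$(trim "$path")"; license="$(trim "$license")"; sha="$(trim "$sha")"
    # Skip the header row, the separator row, and lines that are not table rows.
    case "$path" in ""|Path|-*|:*) continue ;; esac
    listed="$listed$path"$'\n'

    if [ ! -f "$dir/$path" ]; then
      echo "missing file: $path"; status=1; continue
    fi
    # Check: recorded SHA256 matches the actual file content.
    actual="$(sha256sum "$dir/$path" | cut -d' ' -f1)"
    [ "$actual" = "$sha" ] || { echo "hash mismatch: $path"; status=1; }
    # Check: license is on the allow-list.
    case " $APPROVED_LICENSES " in
      *" $license "*) ;;
      *) echo "unapproved license: $path ($license)"; status=1 ;;
    esac
  done < "$manifest"

  # Check: every file on disk has a manifest row (no orphans).
  while IFS= read -r file; do
    rel="${file#"$dir"/}"
    [ "$rel" = "$(basename "$manifest")" ] && continue
    printf '%s' "$listed" | grep -Fxq -- "$rel" \
      || { echo "orphan fixture: $rel"; status=1; }
  done < <(find "$dir" -type f | sort)

  return "$status"
}
```

A call such as `check_manifest PROVENANCE.md tests/fixtures` exits non-zero on any problem, which is what lets a `set -e` hook abort the commit.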
## Format
| Path | Source URL | License | Downloaded Date | SHA256 | Notes |