Commit graph

8 commits

Author SHA1 Message Date
jedarden
8b5dd4febb docs(pdftract-4iier): add per-profile README documentation for all 9 built-in profiles
This commit creates user-facing documentation for each built-in profile:

- Profile YAML files defining match criteria, priority, and extracted fields
- Per-profile READMEs with match criteria summary, extracted fields table,
  known limitations, sample input pointers, and configuration tips
- xtask skeleton generator for automated README generation

Profiles documented:
- invoice: Commercial invoices with line items, vendor/customer, totals
- receipt: POS receipts with items, payment method
- contract: Legal contracts with parties, effective date, term, signatures
- scientific_paper: Academic papers with title, authors, abstract, DOI, references
- slide_deck: Presentation slides with title, presenter, date, slide titles
- form: Fillable forms (degenerate case: uses Phase 7.4 form_fields)
- bank_statement: Bank statements with account info, period, balances, transactions
- legal_filing: Court filings with case number, court, parties, filing date, docket
- book_chapter: Book chapters with title, chapter number, author, section headings

Closes: pdftract-4iier
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-17 23:19:00 -04:00
jedarden
5e66846288 docs(pdftract-147a): author SDK contract specification
Add comprehensive SDK contract specification at docs/notes/sdk-contract.md.
This document serves as the constitutional specification for all pdftract
SDK implementations across all languages.

The contract defines:
- Method surface (9 methods mirroring CLI/MCP tools)
- Error mapping (CLI exit codes → native exceptions)
- Versioning compatibility rules (MAJOR lock, MINOR flexibility)
- Option-naming conventions (CLI flag → language-native case)
- Native type-mapping requirements (Document, Page, Span, Block, Match, Fingerprint, Classification)
- Async conventions per language
- Conformance enforcement (100% pass required)
- Change policy (ADR required for contract changes)

Verification note: notes/pdftract-147a.md

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-17 23:13:55 -04:00
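
Illustrative note on commit 5e66846288: the message names two contract rules, "CLI flag → language-native case" and "MAJOR lock, MINOR flexibility", without spelling them out. The sketch below shows one plausible reading of each. It is not taken from docs/notes/sdk-contract.md: the function names, the `--max-pages` flag, and the interpretation that only the MAJOR component must match are all assumptions made for illustration.

```python
def flag_to_option(flag: str, style: str = "snake") -> str:
    """Convert a CLI flag such as '--max-pages' into a language-native option name.

    Hypothetical helper: the real contract may define different rules.
    """
    words = [w for w in flag.lstrip("-").split("-") if w]
    if not words:
        raise ValueError(f"not a CLI flag: {flag!r}")
    if style == "snake":   # e.g. Python, Rust
        return "_".join(words)
    if style == "camel":   # e.g. TypeScript, Java
        return words[0] + "".join(w.capitalize() for w in words[1:])
    raise ValueError(f"unknown style: {style!r}")


def is_compatible(sdk_version: str, cli_version: str) -> bool:
    """Assumed reading of 'MAJOR lock, MINOR flexibility':
    MAJOR components must be equal, MINOR and PATCH may differ."""
    sdk_major = int(sdk_version.split(".")[0])
    cli_major = int(cli_version.split(".")[0])
    return sdk_major == cli_major


if __name__ == "__main__":
    print(flag_to_option("--max-pages"))           # snake_case form
    print(flag_to_option("--max-pages", "camel"))  # camelCase form
    print(is_compatible("1.4.0", "1.2.3"))
    print(is_compatible("2.0.0", "1.9.0"))
```

A stricter reading (for example, requiring the SDK MINOR to be no newer than the CLI MINOR) would only change the body of `is_compatible`; the contract document itself is the authority on which rule applies.
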
jedarden
1747812323 docs(pdftract-1wqec): verify CI scaffolding acceptance criteria
- Confirm pdftract-ci.yaml exists in declarative-config
- Verify WorkflowTemplate deployed to argo-workflows namespace
- Document all scaffold templates are present with placeholders
- Note: ArgoCD sync will reconcile minor version drift

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-17 07:12:16 -04:00
jedarden
891718319e docs(pdftract-1wqec): verify manual workflow execution succeeds
Updated verification notes with successful manual workflow test results.
All DAG steps completed successfully; publish-if-tag correctly skipped.
2026-05-17 07:06:38 -04:00
jedarden
5a6449a8cf docs(phase-0.1): verify pdftract-ci scaffolding complete
The pdftract-ci WorkflowTemplate was already created in declarative-config
in a previous session. This commit adds verification notes confirming all
acceptance criteria are met:

- WorkflowTemplate exists in k8s/iad-ci/argo-workflows/pdftract-ci.yaml
- Template synced to iad-ci cluster (argo-workflows namespace)
- DAG structure: setup -> [build-matrix, test-matrix, quality-matrix,
  bench-matrix] -> publish-if-tag
- All required configuration present (parameters, securityContext,
  volumeClaimTemplates, podGC, TTL)
- Webhook payload schema documented in YAML comments
- Empty step skeletons ready for Phase 0 sibling beads

Manual workflow test attempted but encountered transient Rackspace Spot
CSI storage attachment issue (infrastructure, not template defect).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-17 06:58:21 -04:00
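
Illustrative note on commit 5a6449a8cf: the DAG described in the message (setup -> [build-matrix, test-matrix, quality-matrix, bench-matrix] -> publish-if-tag) can be modeled as a dependency map to see which steps may run in parallel. This is a minimal sketch using Python's standard `graphlib`, not the Argo WorkflowTemplate itself. The step names come from the commit message, while the `execution_waves` helper is a hypothetical name introduced here.

```python
from graphlib import TopologicalSorter

# Each step maps to the set of steps it depends on.
DEPENDENCIES = {
    "setup": set(),
    "build-matrix": {"setup"},
    "test-matrix": {"setup"},
    "quality-matrix": {"setup"},
    "bench-matrix": {"setup"},
    "publish-if-tag": {
        "build-matrix",
        "test-matrix",
        "quality-matrix",
        "bench-matrix",
    },
}


def execution_waves(deps):
    """Group steps into waves; every step in a wave can run concurrently."""
    sorter = TopologicalSorter(deps)
    sorter.prepare()  # raises graphlib.CycleError if the graph has a cycle
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # sorted only for stable output
        waves.append(ready)
        sorter.done(*ready)
    return waves


if __name__ == "__main__":
    for index, wave in enumerate(execution_waves(DEPENDENCIES), start=1):
        print(f"wave {index}: {', '.join(wave)}")
```

The three resulting waves match the shape in the commit message: setup alone, then the four matrix legs together, then publish-if-tag once all four have finished.
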
jedarden
9f27d16f25 docs(phase-0.1): verify pdftract-ci scaffolding complete
Verified the pdftract-ci WorkflowTemplate exists in declarative-config
and is correctly synced to the iad-ci cluster. All scaffolding
requirements met for Phase 0.1.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-17 03:24:36 -04:00
jedarden
f2bf29b0c8 docs(Phase 0.1): document pdftract-ci scaffold status
Verify completion of Phase 0.1 scaffolding bead. The WorkflowTemplate
was already implemented in declarative-config with all required elements:
- DAG structure with empty step skeletons
- VolumeClaimTemplates for cargo cache
- Exit handler, security context, imagePullSecrets
- Webhook payload schema documentation

Subsequent Phase 0 beads can now develop each DAG leg in parallel.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-17 02:43:48 -04:00
jedarden
427c353fbc docs(Phase 0.1): document pdftract-ci scaffold status
The pdftract-ci.yaml WorkflowTemplate scaffold already exists in
declarative-config (commit 8248a1f). This notes file documents the
current state and pending ArgoCD sync.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-17 01:52:42 -04:00