pdftract/.cargo/config.toml
jedarden 016c738188 feat(pdftract-5nv9h): implement xtask gen-schema with stable ordering and proper metadata
Implement the xtask gen-schema binary at xtask/src/bin/gen_schema.rs that
derives JSON Schema Draft 2020-12 from the Rust ExtractionResult type via
the schemars crate.

Changes:
- Add stable key sorting (sort_keys_recursive) for byte-identical output
- Set $id to stable URL: https://pdftract.com/schema/v1.0/pdftract.schema.json
- Set title to "pdftract Output v1.0"
- Add cargo alias `gen-schema` for convenient invocation
- Emit schema to docs/schema/v1.0/pdftract.schema.json
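
The stable key sorting above can be sketched as follows. This is a minimal
illustration, not the code in xtask/src/bin/gen_schema.rs: the `Json` enum and
the `to_string` helper are hypothetical stand-ins so the example needs no
external crates (the real binary presumably operates on the JSON value type
produced by schemars, which is an assumption here). Only the function name
sort_keys_recursive comes from the commit.

```rust
// Hypothetical stand-in for a JSON value tree (assumption: the real xtask
// works on a library-provided value type rather than this enum).
#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq)]
enum Json {
    Null,
    Bool(bool),
    Num(f64),
    Str(String),
    Array(Vec<Json>),
    // Object members are stored in insertion order, like an
    // order-preserving map, so sorting them is a visible change.
    Object(Vec<(String, Json)>),
}

/// Sort object keys at every nesting level so that serializing the same
/// logical document always yields the same bytes.
fn sort_keys_recursive(value: &mut Json) {
    match value {
        Json::Object(members) => {
            members.sort_by(|a, b| a.0.cmp(&b.0));
            for (_, child) in members.iter_mut() {
                sort_keys_recursive(child);
            }
        }
        Json::Array(items) => {
            // Array order is meaningful in JSON, so only recurse into items.
            for item in items.iter_mut() {
                sort_keys_recursive(item);
            }
        }
        // Scalars have no keys to sort.
        _ => {}
    }
}

/// Minimal compact serializer, just enough to compare outputs byte for byte.
/// It relies on Rust's Debug escaping for strings, which matches JSON only
/// for simple ASCII text; a real serializer would escape properly.
fn to_string(value: &Json) -> String {
    match value {
        Json::Null => "null".to_string(),
        Json::Bool(b) => b.to_string(),
        Json::Num(n) => n.to_string(),
        Json::Str(s) => format!("{:?}", s),
        Json::Array(items) => {
            let parts: Vec<String> = items.iter().map(to_string).collect();
            format!("[{}]", parts.join(","))
        }
        Json::Object(members) => {
            let parts: Vec<String> = members
                .iter()
                .map(|(k, v)| format!("{:?}:{}", k, to_string(v)))
                .collect();
            format!("{{{}}}", parts.join(","))
        }
    }
}
```

Two objects that hold the same members in different insertion order should
serialize identically once sort_keys_recursive has run on both, which is the
property the byte-identical acceptance criterion depends on.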

The schema is generated from the Rust types via their schemars derives, so
rerunning the generator brings the JSON Schema back in sync with the source types.

Acceptance criteria:
- cargo gen-schema regenerates docs/schema/v1.0/pdftract.schema.json
- Generated schema validates against JSON Schema Draft 2020-12
- Schema $id is the stable URL
- Title is "pdftract Output v1.0"
- Stable ordering: regenerating twice produces byte-identical output
- All expected types appear in $defs (BlockJson, SpanJson, PageResult, etc.)

Note: page_type and confidence_source enums are not yet implemented in the
Rust types (marked as TODO in schema/mod.rs). These will be added by sibling
beads pdftract-1ob and pdftract-1f8we respectively.

Closes: pdftract-5nv9h
2026-05-24 17:31:16 -04:00

# Cargo config for pdftract workspace
# Build aliases used by CI workflows and local development
[alias]
# CI-compatible aliases
bench-ci = "bench --features benchmark"
test-ci = "test --workspace"
# Development conveniences
b = "build"
br = "build --release"
c = "check"
cr = "check --release"
t = "test"
tr = "test --release"
# xtask aliases (invoke via --manifest-path to avoid workspace issues)
gen-schema = "run --manifest-path=xtask/Cargo.toml --bin gen_schema"
# Profile for CI property tests (nextest with proptest)
[profile.ci-proptest]
inherits = "release"
opt-level = 2 # Faster builds than full release, still fast execution
debug = false
strip = "none"
lto = "off"
codegen-units = 256 # High codegen parallelism: faster compiles, less optimized code