jedarden 857f928732 feat(pdftract-5omc): implement SDK conformance test runner pattern

Implement the conformance test runner pattern that every SDK will
implement to validate against the shared test suite.

- Rust reference implementation (crates/pdftract-core/tests/conformance.rs)
  * Full test suite loader and executor
  * Comparison engine with min/max, string constraints, tolerances
  * Skip logic for unsupported features and schema versions
  * Report generation in JSON format

- CLI compare subcommand (crates/pdftract-cli/src/main.rs)
  * pdftract compare - Compare actual vs expected with tolerances
  * Cross-language comparison tool to avoid reimplementations

- Documentation (docs/conformance/sdk-contract.md)
  * Complete pattern specification with pseudocode
  * Per-language runner locations
  * CI integration requirements

- Python reference stub (tests/python-conformance/test_conformance.py)
  * Full pytest-based implementation following the pattern

Closes: pdftract-5omc

2026-05-18 01:22:23 -04:00

7.3 KiB


SDK Conformance Test Runner Pattern

This document describes the pattern that every pdftract SDK must implement for conformance testing.

Overview

Every SDK ships a pdftract-sdk-conformance test runner that:

Loads tests/sdk-conformance/cases.json (the shared test suite)
Iterates through test cases
Invokes the SDK's native method with the case's options
Compares the result against expected with tolerances
Reports per-case pass/fail/skip/error status
Emits conformance-report.json

The runner is a TEST target, not production code. It lives in the SDK's test tree.
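
The steps above can be sketched as a small driver loop. This is a minimal illustration, not the reference runner: `invoke`, `compare`, and `should_skip` are hypothetical callables that each SDK would supply, and the assumption that cases.json is either a bare list or an object with a "cases" key is not confirmed by this document.

```python
import json
import time
from pathlib import Path


def load_cases(path):
    """Load the shared suite from disk."""
    loaded = json.loads(Path(path).read_text(encoding="utf-8"))
    # Assumption: the file is a bare list of cases or {"cases": [...]}.
    return loaded["cases"] if isinstance(loaded, dict) else loaded


def run_cases(cases, invoke, compare, should_skip):
    """Run each case and return a list of per-case result dicts.

    invoke(method, fixture, options) calls the SDK (hypothetical signature).
    compare(actual, expected, tolerances) returns None or a failure message.
    should_skip(case) returns None or a skip reason.
    """
    results = []
    for case in cases:
        reason = should_skip(case)
        if reason is not None:
            results.append({"id": case["id"], "status": "skip",
                            "error": reason, "duration_ms": 0})
            continue
        entry = {"id": case["id"], "expected": case["expected"]}
        start = time.perf_counter()
        try:
            actual = invoke(case["method"], case["fixture"], case.get("options", {}))
            failure = compare(actual, case["expected"], case.get("tolerances", {}))
            entry["actual"] = actual
            entry["status"] = "pass" if failure is None else "fail"
            if failure is not None:
                entry["error"] = failure
        except Exception as exc:
            # A crash inside the SDK is reported as "error", not "fail".
            entry["status"] = "error"
            entry["error"] = str(exc)
        entry["duration_ms"] = int((time.perf_counter() - start) * 1000)
        results.append(entry)
    return results
```

Separating loading from execution keeps the loop testable with in-memory cases.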

Test Case Structure

Each test case in cases.json has:

{
  "id": "extract-vector-scientific-paper",
  "fixture": "scientific_paper/01.pdf",
  "method": "extract",
  "options": {
    "ocr_language": "eng",
    "ocr_threshold": 0.7,
    "preserve_layout": false,
    "extract_images": false
  },
  "expected": {
    "schema_version": "1.0",
    "metadata.page_count": 1,
    "pages.length": 1,
    "pages[0].page_index": 0,
    "pages[0].width": {"min": 500, "max": 700},
    "pages[0].height": {"min": 700, "max": 900},
    "pages[0].rotation": 0,
    "pages[0].spans.length": {"min": 1},
    "pages[0].blocks.length": {"min": 1},
    "pages[0].blocks[0].kind": "heading",
    "errors.length": 0
  },
  "tolerances": {
    "pages[*].blocks[*].bbox": {"abs": 0.5},
    "pages[*].spans[*].bbox": {"abs": 0.5}
  },
  "feature": "vector",
  "min_schema_version": "1.0"
}
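
The keys of expected are paths into the SDK's result, such as "pages[0].blocks[0].kind" or "pages.length". One way a runner might resolve them is sketched below. `resolve_path` is a hypothetical helper, and treating a trailing ".length" on a list or string as its element count is an assumption inferred from the example above.

```python
import re

# One path token: either a field name or a bracketed array index.
_TOKEN = re.compile(r"([A-Za-z_][A-Za-z0-9_]*)|\[(\d+)\]")


def resolve_path(document, path):
    """Look up a path such as "pages[0].blocks[0].kind" in parsed JSON."""
    current = document
    pos = 0
    while pos < len(path):
        if path[pos] == ".":
            pos += 1
            continue
        match = _TOKEN.match(path, pos)
        if match is None:
            raise ValueError(f"cannot parse path {path!r} at offset {pos}")
        name, index = match.group(1), match.group(2)
        if index is not None:
            current = current[int(index)]
        elif name == "length" and isinstance(current, (list, str)):
            # Assumption: ".length" means the number of elements or characters.
            current = len(current)
        else:
            current = current[name]
        pos = match.end()
    return current


result = {"metadata": {"page_count": 1},
          "pages": [{"page_index": 0, "blocks": [{"kind": "heading"}]}]}
print(resolve_path(result, "pages[0].blocks[0].kind"))
```

A missing field raises KeyError or IndexError here; a runner would likely turn that into a "fail" for the affected path.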

Expected Value Constraints

The expected field supports several constraint types:

Exact Value Match

{"pages[0].rotation": 0}

Min/Max Ranges

{"pages[0].width": {"min": 500, "max": 700}}

Minimum Length (arrays/strings)

{"pages[0].spans.length": {"min": 1}}
{"value": {"min_length": 50}}

Contains (strings)

{"value": {"contains": ["Abstract", "Introduction"]}}

Boolean/Null Checks

{"metadata.is_encrypted": true}
{"metadata.title": null}
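
As a rough illustration of how these constraint types could be told apart, the sketch below treats any object whose keys are all constraint keywords as a constraint and everything else as an exact value. `check_constraint` and its failure messages are hypothetical; tolerances are deliberately left out.

```python
CONSTRAINT_KEYS = {"min", "max", "min_length", "contains"}


def check_constraint(actual, expected):
    """Return None on success, or a short failure message."""
    is_constraint = (isinstance(expected, dict) and bool(expected)
                     and set(expected) <= CONSTRAINT_KEYS)
    if is_constraint:
        # Arrays are measured by their length, numbers by their value.
        measured = len(actual) if isinstance(actual, list) else actual
        if "min" in expected and measured < expected["min"]:
            return "value below minimum"
        if "max" in expected and measured > expected["max"]:
            return "value above maximum"
        if "min_length" in expected and len(actual) < expected["min_length"]:
            return "value shorter than min_length"
        for needle in expected.get("contains", []):
            if needle not in actual:
                return f"missing required substring {needle!r}"
        return None
    # Exact match. Python treats True == 1, so keep booleans and numbers apart.
    if isinstance(actual, bool) != isinstance(expected, bool):
        return "value mismatch"
    return None if actual == expected else "value mismatch"


print(check_constraint(612, {"min": 500, "max": 700}))
print(check_constraint("short", {"min_length": 50}))
```

Every keyword is optional, so `{"min": 1}` checks only the lower bound.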

Tolerances

Tolerances allow for numeric imprecision in comparisons:

{
  "tolerances": {
    "pages[*].blocks[*].bbox": {"abs": 0.5},
    "pages[*].spans[*].confidence": {"abs": 0.2, "rel": 0.1}
  }
}

abs: Absolute tolerance - values pass if |actual - expected| <= abs
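rel: Relative tolerance - values pass if |actual - expected| / average <= rel, where average = (actual + expected) / 2

When both abs and rel are given, a value passes if it satisfies either one.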

Wildcard patterns (*) in tolerance paths match any array index or field name.
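
The tolerance rules can be sketched in Python as follows. The helper names are illustrative. Letting each `*` match exactly one index or field name (and not a longer stretch of the path) is an assumption, as is the handling of a zero average.

```python
import re


def find_tolerance(tolerances, path):
    """Return the tolerance entry for a concrete path, or None."""
    if path in tolerances:
        return tolerances[path]
    for key, tolerance in tolerances.items():
        if "*" in key:
            # Escape "[", "]" and "." so they match literally, then let each
            # "*" stand for a single index or field name.
            pattern = re.escape(key).replace(r"\*", r"[^.\[\]]+")
            if re.fullmatch(pattern, path):
                return tolerance
    return None


def within_tolerance(actual, expected, tolerance):
    """True if the values agree under the abs or the rel tolerance."""
    diff = abs(actual - expected)
    if "abs" in tolerance and diff <= tolerance["abs"]:
        return True
    if "rel" in tolerance:
        average = abs(actual + expected) / 2
        # Guard against division by zero when the average is zero.
        if average > 0 and diff / average <= tolerance["rel"]:
            return True
    return diff == 0


tolerances = {"pages[*].blocks[*].bbox": {"abs": 0.5}}
print(find_tolerance(tolerances, "pages[0].blocks[3].bbox"))
```

Escaping matters because a path like `pages[*]` would otherwise be read as a regular-expression character class.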

Skip Conditions

A test case should be skipped (status: "skip") if:

Feature unavailable: The SDK doesn't support the required feature
- Check: case.feature is not in the SDK's available features
- Example: C SDK without OCR support skips all feature: "ocr" tests
Schema version too old: The SDK's binary schema version is older than required
- Check: sdk.schema_version < case.min_schema_version
- Example: SDK with schema 1.0 skips tests requiring 1.1
Explicit skip: The case has skip_reason set
- Check: case.skip_reason is not null
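
A sketch of the three checks follows. It assumes schema versions are dot-separated integers, so they are compared as integer tuples, not as strings (a string comparison would rank "1.10" below "1.9"). `skip_reason`, `parse_version`, and the order of the checks are illustrative choices.

```python
def parse_version(text):
    """Turn "1.10" into (1, 10) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def skip_reason(case, sdk_features, sdk_schema_version):
    """Return a human-readable skip reason, or None if the case should run."""
    if case.get("skip_reason") is not None:
        return case["skip_reason"]
    feature = case.get("feature")
    if feature is not None and feature not in sdk_features:
        return f"Feature '{feature}' not supported by this SDK"
    required = case.get("min_schema_version")
    if required is not None and parse_version(sdk_schema_version) < parse_version(required):
        return f"Schema version {sdk_schema_version} is older than required {required}"
    return None


case = {"id": "extract-scanned-receipt", "feature": "ocr", "min_schema_version": "1.0"}
print(skip_reason(case, {"vector"}, "1.0"))
```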

Report Format

The runner must emit conformance-report.json:

{
  "sdk": "pdftract-python",
  "sdk_version": "1.0.0",
  "suite_version": "1.0.0",
  "timestamp": "2026-05-18T12:00:00Z",
  "results": [
    {
      "id": "extract-vector-scientific-paper",
      "status": "pass",
      "actual": {...},
      "expected": {...},
      "duration_ms": 150
    },
    {
      "id": "extract-scanned-receipt",
      "status": "fail",
      "actual": {...},
      "expected": {...},
      "error": "pages[0].page_type: expected 'scanned', got 'vector'",
      "duration_ms": 200
    },
    {
      "id": "extract-remote-pdf",
      "status": "skip",
      "error": "Feature 'remote' not supported by this SDK",
      "duration_ms": 0
    }
  ],
  "summary": {
    "total": 32,
    "passed": 28,
    "failed": 1,
    "skipped": 3,
    "errors": 0
  }
}

Status values: "pass", "fail", "skip", "error"
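
The summary block can be derived from the per-case results without keeping separate counters. The sketch below shows one way to assemble the report; `summarize` and `build_report` are hypothetical names, and the field names follow the example above.

```python
from collections import Counter
from datetime import datetime, timezone


def summarize(results):
    """Count results by status using the summary field names shown above."""
    counts = Counter(result["status"] for result in results)
    return {
        "total": len(results),
        "passed": counts["pass"],
        "failed": counts["fail"],
        "skipped": counts["skip"],
        "errors": counts["error"],
    }


def build_report(sdk, sdk_version, suite_version, results):
    """Assemble the dict to be serialized as conformance-report.json."""
    return {
        "sdk": sdk,
        "sdk_version": sdk_version,
        "suite_version": suite_version,
        "timestamp": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
        "results": results,
        "summary": summarize(results),
    }


results = [{"id": "a", "status": "pass"}, {"id": "b", "status": "skip"}]
print(build_report("pdftract-python", "1.0.0", "1.0.0", results)["summary"])
```

Writing the file is then a single `json.dump(report, handle, indent=2)` call.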

Exit Codes

The runner must exit with:

0 if all non-skip tests passed
1 if any test failed or had an error
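
Expressed in code, the rule reduces to a check on the summary counts. `exit_code` is a hypothetical helper. Under this reading, a run in which every case is skipped still exits with 0.

```python
def exit_code(summary):
    """Return 1 if any case failed or errored, otherwise 0."""
    return 1 if summary["failed"] > 0 or summary["errors"] > 0 else 0


summary = {"total": 32, "passed": 28, "failed": 1, "skipped": 3, "errors": 0}
print(exit_code(summary))
```

A runner would typically finish with `sys.exit(exit_code(summary))` after writing the report.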

Comparison Logic (Pseudocode)

function compare(actual, expected, tolerances, path):
    match (actual, expected):
        case (Number, Object with min/max):
            // min and max are each optional, e.g. {"min": 1}
            if expected.min exists and actual < expected.min: return FAIL("value below minimum")
            if expected.max exists and actual > expected.max: return FAIL("value above maximum")
            if expected.value exists:
                return compare_with_tolerance(actual, expected.value, tolerances, path)
            return PASS

        case (String, Object with constraints):
            if expected.min_length exists and actual.length < expected.min_length:
                return FAIL("string too short")
            if expected.contains exists:
                for substring in expected.contains:
                    if substring not in actual: return FAIL("missing required substring")
            return PASS

        case (Array, Object with min/max):
            if expected.min exists and actual.length < expected.min: return FAIL("array too short")
            if expected.max exists and actual.length > expected.max: return FAIL("array too long")
            return PASS

        case (_, _):
            if actual == expected: return PASS
            return FAIL("value mismatch")

function compare_with_tolerance(actual, expected, tolerances, path):
    tolerance = find_tolerance(tolerances, path)
    if tolerance == null:
        // No tolerance configured for this path: require exact equality
        if actual == expected: return PASS
        return FAIL("value mismatch")

    diff = abs(actual - expected)
    if tolerance.abs exists and diff <= tolerance.abs:
        return PASS
    if tolerance.rel exists:
        avg = abs(actual + expected) / 2
        // Skip the relative check when avg is zero to avoid dividing by zero
        if avg > 0 and diff / avg <= tolerance.rel:
            return PASS
    return FAIL("numeric mismatch")

function find_tolerance(tolerances, path):
    // Try exact match first
    if tolerances[path] exists: return tolerances[path]

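    // Try wildcard patterns
    for key in tolerances:
        if key contains '*':
            // Escape regex metacharacters ('[', ']', '.') before expanding '*',
            // and anchor the pattern so it must match the whole path
            pattern = '^' + regex_escape(key).replace('\*', '.*') + '$'
            if path matches pattern: return tolerances[key]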

    return null

Using the CLI Compare Subcommand

For SDKs that prefer not to reimplement the comparison logic, the pdftract CLI provides a compare subcommand:

pdftract compare actual.json expected.json --tolerances tolerances.json --format json

This outputs a JSON report of pass/fail for each expected field, with detailed failure reasons.
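
An SDK runner written in another language could shell out to this subcommand. The Python sketch below shows the idea using only the arguments given above. It assumes the report is printed to stdout as JSON and that `pdftract` is on the PATH. Neither the report's schema nor the command's exit codes are specified here, so both are passed back to the caller uninterpreted.

```python
import json
import subprocess


def build_compare_command(actual_path, expected_path, tolerances_path):
    """Build the argument list for the compare invocation shown above."""
    return ["pdftract", "compare", str(actual_path), str(expected_path),
            "--tolerances", str(tolerances_path), "--format", "json"]


def run_compare(actual_path, expected_path, tolerances_path):
    """Run the CLI and return (exit status, parsed report).

    Assumption: the JSON report is written to stdout.
    """
    completed = subprocess.run(
        build_compare_command(actual_path, expected_path, tolerances_path),
        capture_output=True, text=True, check=False)
    return completed.returncode, json.loads(completed.stdout)


print(" ".join(build_compare_command("actual.json", "expected.json", "tolerances.json")))
```

Passing the arguments as a list avoids shell quoting problems with fixture paths that contain spaces.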

Per-Language Runner Locations

SDK	Runner Path	Test Framework
Python	`tests/test_conformance.py`	pytest
Rust	`crates/pdftract-core/tests/conformance.rs`	cargo test
Node.js	`test/conformance.test.ts`	vitest
Go	`conformance_test.go`	go test
Java	`src/test/java/.../ConformanceTest.java`	JUnit 5
.NET	`tests/Pdftract.Tests/ConformanceTests.cs`	xUnit
C	`tests/conformance.c`	standalone binary
Ruby	`test/conformance_test.rb`	minitest
PHP	`tests/ConformanceTest.php`	PHPUnit
Swift	`Tests/PdftractTests/ConformanceTests.swift`	XCTest

CI Integration

Each SDK's Argo publish workflow must:

Run the conformance runner
Parse the report JSON
Fail the workflow if summary.failed > 0 or summary.errors > 0
Upload the report as an Argo artifact
Link the artifact from the SDK's README "Conformance" section

Milestone Gates

Before publishing any SDK milestone tag:

100% of applicable (non-skip) tests must pass
The conformance report must be included in the release notes
The README must link to the published report artifact
