pdftract/tests/fixtures/profiles/form/PROVENANCE.md

Form Profile Fixture Provenance

This manifest tracks the origin and licensing of form fixture files.

Format

| Path | Source URL | License | Downloaded Date | SHA256 | Notes |
| --- | --- | --- | --- | --- | --- |
| irs_1040.pdf | TBD | TBD | TBD | TBD | IRS Form 1040 sample - placeholder, to be replaced with public domain source |
| w2.pdf | TBD | TBD | TBD | TBD | W-2 Wage and Tax Statement sample - placeholder, to be replaced with public domain source |
| i9.pdf | TBD | TBD | TBD | TBD | Form I-9 Employment Eligibility Verification sample - placeholder, to be replaced with public domain source |
| expense_report.pdf | TBD | TBD | TBD | TBD | Simple expense report sample - placeholder, to be replaced with public domain source |
| intake_form.pdf | TBD | TBD | TBD | TBD | Multi-page intake form sample - placeholder, to be replaced with public domain source |
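
Once a placeholder is replaced with a real file, the TBD cells need actual values. The following is a minimal sketch of how the SHA256 and Downloaded Date cells could be filled in, using only the Python standard library. The helper names `sha256_of` and `manifest_row` are hypothetical, and the pipe-delimited row layout is an assumption about how this manifest is stored.

```python
import hashlib
from datetime import date
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        # Read 64 KiB at a time so large PDFs are not loaded into memory at once.
        for chunk in iter(lambda: fh.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def manifest_row(path: Path, source_url: str, license_name: str,
                 downloaded: date, notes: str) -> str:
    """Format one manifest row in the six-column order used above.

    Assumption: rows are written as pipe-delimited Markdown table rows.
    """
    cells = [
        path.name,
        source_url,
        license_name,
        downloaded.isoformat(),  # e.g. 2026-05-25
        sha256_of(path),
        notes,
    ]
    return "| " + " | ".join(cells) + " |"
```

Calling `manifest_row(Path("irs_1040.pdf"), url, "Public domain", date.today(), "IRS Form 1040 sample")` with the real source URL would produce a row ready to paste over the placeholder entry.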

Notes

  • Form fixtures should be sourced from official government forms (public domain) or created synthetically
  • IRS forms are generally in the public domain as U.S. government works
  • Never use real, filled-in forms that contain personally identifiable information (PII)
  • Synthetic forms can be generated using reportlab or similar PDF generation tools
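
As an illustration of the synthetic route mentioned in the notes above, here is a minimal sketch that writes a one-page fillable form with reportlab. It assumes reportlab is installed (`pip install reportlab`). The function name `make_synthetic_form`, the field names, and the layout coordinates are all hypothetical, and the form contains no real data.

```python
from pathlib import Path


def make_synthetic_form(out_path: Path) -> None:
    """Write a one-page PDF with three fillable text fields (no real data)."""
    # Imported lazily so this module still loads where reportlab
    # (a third-party package, assumed installed) is unavailable.
    from reportlab.lib.pagesizes import letter
    from reportlab.pdfgen import canvas

    c = canvas.Canvas(str(out_path), pagesize=letter)
    c.setTitle("Synthetic intake form (test fixture)")
    c.setFont("Helvetica-Bold", 14)
    c.drawString(72, 740, "Synthetic Intake Form")

    c.setFont("Helvetica", 11)
    # Hypothetical (label, field name) pairs, laid out top to bottom.
    fields = [
        ("Full name", "full_name"),
        ("Date", "visit_date"),
        ("Reason", "reason"),
    ]
    y = 700
    for label, name in fields:
        c.drawString(72, y + 6, label)
        # AcroForm text field; left empty so the fixture carries no PII.
        c.acroForm.textfield(name=name, tooltip=label,
                             x=180, y=y, width=300, height=20)
        y -= 40

    c.showPage()
    c.save()
```

A fixture generated this way has no upstream source, so its manifest row would presumably record something like "synthetic" under Source URL along with the license the project applies to its own test data.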