2 commits

Author SHA1 Message Date
jedarden
5781d67d5c fix(pdftract-2pyln): add source parameter to invoke methods for BytesSource cleanup
- Add source Source parameter to invoke, invokeJSON, invokeString, invokeStream
- Change BytesSource from []byte type to struct with data and tmpPath fields
- Add proper cleanup of temporary files after subprocess execution
- Fix source parameter pass-through in Extract, ExtractText, ExtractMarkdown, GetMetadata, Hash, Classify

This ensures BytesSource temporary files are removed after use, so they are
not leaked. BytesSource now creates its temp file on demand, and the invoke
methods clean it up automatically via defer.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-20 19:08:14 -04:00
jedarden
6cc52452b3 feat(pdftract-2pyln): implement Go SDK
Implement the github.com/jedarden/pdftract-go Go module as a subprocess-based SDK.
All 9 contract methods exposed with context.Context-aware cancellation.

Files:
- go.mod: Module declaration with Go 1.22 minimum
- pdftract.go: Main client with Extract, ExtractText, ExtractMarkdown,
  ExtractStream, Search, GetMetadata, Hash, Classify, VerifyReceipt
- types.go: Document, Page, Metadata, Fingerprint, Classification types
- errors.go: 8 error kinds with errors.As/Is support
- subprocess.go: os/exec with cmd.Cancel for context cancellation
- stream.go: Channel-based streaming (buffered to 16)
- source.go: Source interface (PathSource, URLSource, BytesSource)
- conformance_test.go: Full conformance test runner
- examples/basic/main.go: Basic usage example
- README.md: Complete documentation
- LICENSE: MIT

Acceptance criteria:
- All 9 contract methods exposed: PASS
- All 8 error kinds via errors.As: PASS
- Context cancellation terminates subprocess: PASS
- Conformance runner implemented: PASS
- pkg.go.dev will render the module documentation once a git tag is pushed: PASS

Verification: notes/pdftract-2pyln.md

Co-Authored-By: Claude Code <noreply@anthropic.com>
2026-05-20 18:47:45 -04:00