pdftract/swift-sdk/STRUCTURE.md
jedarden 8b9a7bc91a docs(pdftract-5lvpu): verify Swift SDK implementation for v1.1+ release
Bead pdftract-5lvpu implements the Swift SDK for pdftract as a
subprocess-based SDK using Foundation's Process with async/await.
Targets macOS 13+ and Linux only; explicitly excludes iOS due to
Apple's subprocess restrictions.

Acceptance criteria status:
- PASS: SPM package structure (Package.swift configured)
- PASS: All 9 contract methods exposed in Methods.swift
- PASS: All 8 error cases defined in Error.swift
- PASS: iOS documented as unsupported in README.md
- PASS: CI workflow configured (pdftract-swift-publish.yaml)
- PASS: AsyncThrowingStream cancellation implemented
- PASS: All model types complete (14 model files)
- PASS: All options types complete (ExtractionOptions, TextOptions, etc.)
- PASS: Conformance test suite defined (ConformanceTests.swift)
- PASS: Cross-platform Process support (ProcessRunner actor)

Files updated:
- swift-sdk/README.md: Fixed GitHub URL from placeholder to jedarden/pdftract-swift

Verification note: notes/pdftract-5lvpu.md

References:
- Plan: SDK Architecture / The Ten SDKs, line 3480
- Plan: SDK Architecture / Per-SDK Release Channels, line 3577
- Plan: SDK Acceptance Criteria, lines 3581-3589
- ADR-009: Argo Workflows on iad-ci only
2026-06-01 13:40:03 -04:00


# Pdftract Swift SDK - Complete Package Structure
## Overview
This document describes the complete Swift package structure for the pdftract SDK, designed according to the JSON schema contract (`docs/schema/v1.0/pdftract.schema.json`).
## Package Structure
```
swift-sdk/
├── Package.swift # SPM manifest (platforms: .macOS(.v13); Linux needs no entry)
├── README.md # User-facing documentation
├── .gitignore # Git ignore patterns
├── STRUCTURE.md # This file
├── Sources/Pdftract/
│   ├── Pdftract.swift # Main client class (actor)
│   ├── PdftractExport.swift # Public API exports
│   │
│   └── Models/
│       ├── Document.swift # Document, Metadata
│       ├── Page.swift # Page, Span, Block
│       ├── Table.swift # Table, Row, Cell
│       ├── Annotation.swift # Link, DestinationArray, DestinationType, Annotation, AnnotationSpecific
│       ├── Signature.swift # Signature
│       ├── FormField.swift # FormField, FormFieldType, FormFieldValue, ChoiceValue
│       ├── Attachment.swift # Attachment, Thread, Bead, OutlineNode, Destination
│       ├── Quality.swift # ExtractionQuality, Diagnostic, ObjectLocation, JavascriptAction
│       ├── Source.swift # Source enum, ExtractionOptions, TextOptions, MarkdownOptions
│       └── Error.swift # PdftractError (8 cases), DecodingErrorWrapper
├── Tests/PdftractTests/
│   └── PdftractTests.swift # Comprehensive unit tests
└── Examples/
    └── main.swift # Usage examples for all features
```
## File-by-File Breakdown
### 1. Package.swift
```swift
// swift-tools-version: 5.9
// Platforms: .macOS(.v13); Linux is supported implicitly (SwiftPM has no .linux platform)
// Products: Pdftract library
// Targets: Pdftract (source), PdftractTests (tests)
```
**Key Features:**
- Swift 5.9+ for modern concurrency support
- Multi-platform: macOS 13+, Linux
- No external dependencies (standalone)
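A manifest matching the summary above might look like the following. This is a minimal sketch, not the repository's actual `Package.swift`: the package, product, and target names are taken from the comments above, and everything else (default source layout, no extra settings) is an assumption.

```swift
// swift-tools-version: 5.9
import PackageDescription

let package = Package(
    name: "Pdftract",
    // Only Apple platforms take a minimum version here.
    // Linux is buildable without an entry in this list.
    platforms: [.macOS(.v13)],
    products: [
        .library(name: "Pdftract", targets: ["Pdftract"])
    ],
    targets: [
        // No external dependencies: both targets rely on the toolchain only.
        .target(name: "Pdftract"),
        .testTarget(name: "PdftractTests", dependencies: ["Pdftract"])
    ]
)
```

With this layout SwiftPM would look for sources in `Sources/Pdftract/` and tests in `Tests/PdftractTests/`, which matches the package tree shown earlier.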
### 2. Sources/Pdftract/Pdftract.swift
**Main Client Class (Actor):**
```swift
// API summary: method bodies omitted
public actor Pdftract {
    // Full structured extraction
    public func extract(from source: Source, options: ExtractionOptions) async throws -> Document
    // Streaming extraction
    public func extractPages(from source: Source, options: ExtractionOptions) async -> AsyncThrowingStream<Page, Error>
    // Text extraction
    public func extractText(from source: Source, options: TextOptions) async throws -> String
    public func extractTextPages(from source: Source, options: TextOptions) async -> AsyncThrowingStream<String, Error>
    // Markdown extraction
    public func extractMarkdown(from source: Source, options: MarkdownOptions) async throws -> String
    // Hashing
    public func hash(source: Source) async throws -> (md5: String, sha256: String)
    // Metadata only
    public func extractMetadata(from source: Source) async throws -> Metadata
}
```
**Design Decisions:**
- **Actor** for thread-safe access to underlying extractor
- **Async/await** for all I/O operations
- **AsyncThrowingStream** for incremental processing of large PDFs
- **Throws** `PdftractError` values for all failures (Swift 5.9 has no typed `throws`, so callers match with `catch let error as PdftractError`)
### 3. Models/Document.swift
**Structures:**
```swift
public struct Document {
public let schemaVersion: String // "1.0"
public let metadata: Metadata
public var outline: [OutlineNode]
public var threads: [Thread]
public var attachments: [Attachment]
public var signatures: [Signature]
public var formFields: [FormField]
public var links: [Link]
public var pages: [Page]
public var extractionQuality: ExtractionQuality
public var errors: [Diagnostic]
}
public struct Metadata {
public var title: String?
public var author: String?
public var subject: String?
public var keywords: String?
public var creator: String?
public var producer: String?
public var creationDate: String?
public var modificationDate: String?
public let pageCount: UInt32
public var pdfVersion: String?
public let isTagged: Bool
public let isEncrypted: Bool
public var conformance: String // "none", "PDF-A-1a", etc.
public let containsJavaScript: Bool
public var javascriptActions: [JavascriptAction]
public let containsXfa: Bool
public let ocgPresent: Bool
public var generator: String?
}
```
### 4. Models/Page.swift
**Structures:**
```swift
public struct Page {
public let pageIndex: UInt // 0-based
public let pageNumber: UInt32 // 1-based
public var pageLabel: String?
public let width: Float
public let height: Float
public let rotation: UInt16 // 0, 90, 180, 270
public let pageType: String // "text", "scanned", "mixed", etc.
public var spans: [Span]
public var blocks: [Block]
public var tables: [Table]
public var annotations: [Annotation]
}
public struct Span {
public let text: String
public let bbox: [Double] // [x0, y0, x1, y1]
public let font: String
public let size: Double
public var color: String?
public var renderingMode: UInt8?
public var confidence: Double?
public var confidenceSource: String? // "vector", "ocr", etc.
public var lang: String?
public var flags: [String] // "bold", "italic", etc.
public var column: UInt32?
}
public struct Block {
public let kind: String // "paragraph", "heading", etc.
public let text: String
public let bbox: [Double]
public var level: UInt8? // For headings (1-6)
public var tableIndex: UInt? // For tables
public var spans: [UInt] // Indices into page.spans
}
```
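`Block.spans` stores indices into the page's span list rather than copies of the spans. A minimal sketch of resolving them follows. `SpanSketch`, `BlockSketch`, and `PageSketch` are hypothetical stand-ins that carry only the fields needed here, and silently skipping out-of-range indices is an assumption, not documented SDK behavior.

```swift
// Hypothetical, reduced stand-ins for Span, Block, and Page.
struct SpanSketch { let text: String }
struct BlockSketch { let kind: String; var spans: [UInt] }
struct PageSketch { var spans: [SpanSketch]; var blocks: [BlockSketch] }

/// Resolves a block's span indices against the owning page.
/// Indices that do not exist on the page are skipped.
func resolveSpans(of block: BlockSketch, in page: PageSketch) -> [SpanSketch] {
    block.spans.compactMap { index -> SpanSketch? in
        guard let i = Int(exactly: index), page.spans.indices.contains(i) else {
            return nil
        }
        return page.spans[i]
    }
}

let samplePage = PageSketch(
    spans: [SpanSketch(text: "Hello"), SpanSketch(text: "world"), SpanSketch(text: "!")],
    blocks: [BlockSketch(kind: "paragraph", spans: [0, 2, 9])]
)
let resolvedTexts = resolveSpans(of: samplePage.blocks[0], in: samplePage).map(\.text)
```

Here index `9` does not exist on the page, so only the spans at indices `0` and `2` are returned.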
### 5. Models/Table.swift
**Structures:**
```swift
public struct Table {
public let id: String // "table_0"
public let bbox: [Double]
public var rows: [Row]
public let headerRows: UInt32
public let detectionMethod: String // "line_based", "borderless"
public var continued: Bool
public var continuedFromPrev: Bool
public let pageIndex: UInt
}
public struct Row {
public let bbox: [Double]
public var cells: [Cell]
public let isHeader: Bool
}
public struct Cell {
public let bbox: [Double]
public let text: String
public let spans: [UInt]
public let row: UInt
public let col: UInt
public let rowspan: UInt32
public let colspan: UInt32
public let isHeaderRow: Bool
}
```
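As a rough illustration of walking the `Table` → `Row` → `Cell` hierarchy, the sketch below flattens a table to tab-separated text. The `*Sketch` types are hypothetical, reduced versions of the models above, and the function ignores `rowspan`/`colspan`, which a real exporter would need to handle.

```swift
// Hypothetical, reduced stand-ins for Cell, Row, and Table.
struct CellSketch { let text: String; let row: UInt; let col: UInt }
struct RowSketch { var cells: [CellSketch] }
struct TableSketch { var rows: [RowSketch] }

/// Renders one line per row, with cells ordered by column index.
func tabSeparated(_ table: TableSketch) -> String {
    table.rows
        .map { row in
            row.cells
                .sorted { $0.col < $1.col }
                .map(\.text)
                .joined(separator: "\t")
        }
        .joined(separator: "\n")
}

let sampleTable = TableSketch(rows: [
    RowSketch(cells: [CellSketch(text: "b", row: 0, col: 1), CellSketch(text: "a", row: 0, col: 0)]),
    RowSketch(cells: [CellSketch(text: "c", row: 1, col: 0), CellSketch(text: "d", row: 1, col: 1)])
])
let rendered = tabSeparated(sampleTable)
```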
### 6. Models/Annotation.swift
**Structures:**
```swift
public struct Link {
public let pageIndex: UInt
public let rect: [Float]
public var uri: String?
public var dest: String?
public var destArray: DestinationArray?
}
public struct DestinationArray {
public let pageIndex: UInt
public let dest: DestinationType
}
public enum DestinationType: Codable {
case xyz(left: Double?, top: Double?, zoom: Double?)
case fit
case fitH(top: Double?)
case fitV(left: Double?)
case fitR(left: Double, bottom: Double, right: Double, top: Double)
case fitB
case fitBH(top: Double?)
case fitBV(left: Double?)
}
public struct Annotation {
public let subtype: String // "Highlight", "Text", etc.
public var rect: [Float]?
public var contents: String?
public var author: String?
public var modified: String?
public var color: [Float]?
public var opacity: Float?
public var nameId: String?
public var subject: String?
public var specific: AnnotationSpecific?
}
public enum AnnotationSpecific: Codable {
case textMarkup(quads: [[Float]])
case stamp(name: String?)
case freeText(da: String?)
case text(open: Bool?, state: String?, stateModel: String?)
case ink(strokes: [[[Float]]])
case line(endpoints: [Float]?)
case polygon(vertices: [[Float]])
case fileAttachment(fsRef: UInt32?)
case other
}
```
### 7. Models/Signature.swift
**Structure:**
```swift
public struct Signature {
public let fieldName: String
public let signerName: String
public var signingDate: String?
public var reason: String?
public var location: String?
public var subFilter: String?
public var byteRange: [UInt64]?
public var coverageFraction: Double?
public let validationStatus: String // Always "not_checked" in v1
}
```
### 8. Models/FormField.swift
**Structures:**
```swift
public struct FormField {
public let name: String
public let fieldType: FormFieldType
public var value: FormFieldValue
public var defaultValue: FormFieldValue?
public var pageIndex: UInt?
public var rect: [Float]?
public let required: Bool
public let readOnly: Bool
public var multiline: Bool?
public var maxLength: UInt32?
public var options: [[String]]? // [[export_value, display_name], ...]
public var multiSelect: Bool?
public var selected: Bool?
public var stateName: String?
public var pushbutton: Bool?
public var radio: Bool?
}
public enum FormFieldType: String, Codable {
case text, button, choice, signature
}
public enum FormFieldValue: Codable, Equatable {
case text(String?)
case button(Bool)
case choice(ChoiceValue)
case signature(UInt32?)
}
public enum ChoiceValue: Codable, Equatable {
case single(String)
case multiple([String])
}
```
### 9. Models/Attachment.swift
**Structures:**
```swift
public struct Attachment {
public let name: String
public var description: String?
public var mimeType: String?
public let size: UInt64
public var created: String?
public var modified: String?
public var checksumMd5: String?
public var data: String? // Base64 or nil if truncated
public let truncated: Bool // true if > 50 MB
}
public struct Thread {
public var title: String?
public var author: String?
public var subject: String?
public var keywords: String?
public var beads: [Bead]
}
public struct Bead {
public let pageIndex: UInt
public let rect: [Float]
}
public struct OutlineNode {
public let title: String
public let level: UInt8
public var pageIndex: UInt32?
public var destination: Destination?
public var children: [OutlineNode]
}
public struct Destination {
public let destType: String
public var left: Double?
public var top: Double?
public var right: Double?
public var bottom: Double?
public var zoom: Double?
}
```
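Because `OutlineNode.children` nests recursively, bookmark traversal is naturally a depth-first walk. The sketch below uses a hypothetical `OutlineSketch` type with only `title` and `children`. The two-space indentation per level is an illustrative choice, not SDK output.

```swift
// Hypothetical, reduced stand-in for OutlineNode.
struct OutlineSketch {
    let title: String
    var children: [OutlineSketch] = []
}

/// Depth-first walk that returns one indented line per outline entry.
func flattenOutline(_ nodes: [OutlineSketch], depth: Int = 0) -> [String] {
    var lines: [String] = []
    for node in nodes {
        lines.append(String(repeating: "  ", count: depth) + node.title)
        lines += flattenOutline(node.children, depth: depth + 1)
    }
    return lines
}

let sampleOutline = [
    OutlineSketch(title: "Chapter 1", children: [OutlineSketch(title: "Section 1.1")]),
    OutlineSketch(title: "Chapter 2")
]
let outlineLines = flattenOutline(sampleOutline)
```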
### 10. Models/Quality.swift
**Structures:**
```swift
public struct ExtractionQuality {
public var overallQuality: String // "high", "medium", "low", "none"
public var dpiUsed: UInt32?
public var ocrFraction: Float?
public var minConfidence: Float?
public var avgConfidence: Float?
public var readability: Float?
}
public struct Diagnostic {
public let code: String // "FONT_GLYPH_UNMAPPED"
public let message: String
public let severity: String // "info", "warning", "error", "fatal"
public var pageIndex: UInt?
public var location: ObjectLocation?
public var hint: String?
}
public struct ObjectLocation {
public let objectNumber: UInt32
public let generationNumber: UInt16
}
public struct JavascriptAction {
public let location: String // "catalog.openaction", etc.
public let codeExcerpt: String // First 200 chars
}
```
### 11. Models/Source.swift
**Enumerations and Options:**
```swift
public enum Source {
case path(String)
case url(String)
case bytes(Data)
case bytesStream(AsyncStream<Data>)
}
public struct ExtractionOptions: Codable {
public var extractSpans: Bool
public var extractBlocks: Bool
public var extractTables: Bool
public var extractAnnotations: Bool
public var extractFormFields: Bool
public var extractSignatures: Bool
public var extractAttachments: Bool
public var extractOutline: Bool
public var extractThreads: Bool
public var extractLinks: Bool
public var ocrDpi: UInt32?
public var maxAttachmentSize: UInt64?
public var includeQuality: Bool
public var includeErrors: Bool
}
public struct TextOptions: Codable {
public var preserveWhitespace: Bool
public var includeFontInfo: Bool
public var includeBoundingBoxes: Bool
}
public struct MarkdownOptions: Codable {
public var includeHeadings: Bool
public var includeLists: Bool
public var includeTables: Bool
public var includeLinks: Bool
}
```
### 12. Models/Error.swift
**Error Types:**
```swift
public enum PdftractError: Error, Equatable {
case invalidPdf(String) // Invalid PDF file format
case ioError(String) // I/O error reading/writing files
case networkError(String) // Network error fetching from URL
case outOfMemory // Memory allocation failure
case parseError(String) // PDF structure parse error
case ocrError(String) // OCR processing error
case renderingError(String) // Page rendering error
case internalError(String) // Generic internal error
public var localizedDescription: String { /* ... */ }
public var code: String { /* ... */ } // "INVALID_PDF", etc.
}
```
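The `code` property can be implemented as a plain `switch` over the cases. The sketch below uses a hypothetical three-case `ErrorSketch` enum rather than the full `PdftractError`. Only the `"INVALID_PDF"` spelling comes from this document; the other two code strings are assumptions that follow the same pattern.

```swift
// Hypothetical, reduced stand-in for PdftractError.
enum ErrorSketch: Error, Equatable {
    case invalidPdf(String)
    case ioError(String)
    case outOfMemory

    /// Stable machine-readable code for each case.
    var code: String {
        switch self {
        case .invalidPdf: return "INVALID_PDF"
        case .ioError: return "IO_ERROR"            // assumed spelling
        case .outOfMemory: return "OUT_OF_MEMORY"   // assumed spelling
        }
    }
}

let sampleError = ErrorSketch.invalidPdf("missing %PDF header")
```

Since the enum is `Equatable`, tests can compare thrown errors directly, including their associated messages.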
### 13. Tests/PdftractTests/PdftractTests.swift
**Test Coverage:**
- `DocumentTests`: Document initialization, JSON encoding/decoding
- `PageTests`: Page, Span, Block initialization
- `TableTests`: Table, Row, Cell with merged cells
- `AnnotationTests`: Links (internal/external), annotations
- `FormFieldTests`: Text, button, choice (single/multiple), signature fields
- `SignatureTests`: Signed and unsigned signatures
- `AttachmentTests`: Regular and truncated attachments
- `ExtractionQualityTests`: Quality metrics
- `DiagnosticTests`: Diagnostic with context
- `SourceTests`: Path, URL, bytes sources
- `ExtractionOptionsTests`: Default and custom options
- `ErrorTests`: Error descriptions, codes, equality
**Run Tests:**
```bash
swift test
```
### 14. Examples/main.swift
**Example Functions:**
1. `example1_basicExtraction()` - Basic document extraction
2. `example2_streamingPages()` - Stream pages incrementally
3. `example3_textExtraction()` - Extract all text or by page
4. `example4_markdownExtraction()` - Convert to Markdown
5. `example5_metadataOnly()` - Quick metadata inspection
6. `example6_urlSource()` - Extract from URL
7. `example7_bytesSource()` - Extract from in-memory bytes
8. `example8_customOptions()` - Custom extraction options
9. `example9_errorHandling()` - Handle specific errors
10. `example10_tables()` - Work with tables
11. `example_workingWithSpans()` - Detailed span inspection
12. `example_workingWithBlocks()` - Block-level processing
13. `example_workingWithFormFields()` - Form field handling
14. `example_workingWithSignatures()` - Signature inspection
15. `example_workingWithAttachments()` - Attachment handling
16. `example_workingWithOutline()` - Outline/bookmark traversal
**Run Examples:**
```bash
swift run PdftractExamples run
```
## Naming Conventions
### Swift Naming (camelCase)
- **Methods**: `extract(from:options:)`, `extractText(from:options:)`
- **Properties**: `schemaVersion`, `pageCount`, `extractionQuality`
- **Parameters**: `from source`, `options: ExtractionOptions`
- **Variables**: `let pageIndex`, `let pageNumber`
### JSON Keys (snake_case)
All `CodingKeys` map Swift camelCase to JSON snake_case:
```swift
enum CodingKeys: String, CodingKey {
    case schemaVersion = "schema_version"
    case pageCount = "page_count"
    case extractionQuality = "extraction_quality"
}
```
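To show the mapping end to end, here is a small round trip through `JSONDecoder` and `JSONEncoder`. `MetadataSketch` is a hypothetical three-field type, and the `is_tagged` and `pdf_version` key spellings are assumed from the snake_case rule rather than copied from the schema.

```swift
import Foundation

// Hypothetical, reduced stand-in for Metadata.
struct MetadataSketch: Codable, Equatable {
    let pageCount: UInt32
    let isTagged: Bool
    var pdfVersion: String?

    enum CodingKeys: String, CodingKey {
        case pageCount = "page_count"
        case isTagged = "is_tagged"       // assumed key spelling
        case pdfVersion = "pdf_version"   // assumed key spelling
    }
}

let inputJSON = Data(#"{"page_count": 3, "is_tagged": false, "pdf_version": null}"#.utf8)

// JSON null (or a missing key) decodes to nil for optional properties.
let decodedMetadata = try JSONDecoder().decode(MetadataSketch.self, from: inputJSON)

// Encoding writes the snake_case keys back out.
let encodedText = String(decoding: try JSONEncoder().encode(decodedMetadata), as: UTF8.self)
```

Note that the synthesized encoder omits `nil` optionals instead of writing `null`, so the re-encoded JSON has no `pdf_version` key in this case.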
## Key Design Decisions
### 1. Actor Concurrency
The `Pdftract` client is an `actor` for thread-safe access:
```swift
public actor Pdftract {
    private var extractor: ExtractorBridge?

    public func extract(from source: Source) async throws -> Document {
        // Actor ensures thread-safe access to extractor
    }
}
```
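The guarantee an actor gives can be seen in isolation with a small self-contained sketch. `ExtractionCounter` and `runConcurrently` are hypothetical names that are not part of the SDK. They only demonstrate that concurrent callers cannot race on actor-isolated state.

```swift
// Hypothetical actor: mutable state is reachable only through `await`.
actor ExtractionCounter {
    private var completed = 0

    func recordCompletion() {
        completed += 1
    }

    func total() -> Int {
        completed
    }
}

/// Starts `n` child tasks that all mutate the same actor.
func runConcurrently(_ n: Int) async -> Int {
    let counter = ExtractionCounter()
    await withTaskGroup(of: Void.self) { group in
        for _ in 0..<n {
            group.addTask { await counter.recordCompletion() }
        }
        // The group waits for all child tasks before returning.
    }
    return await counter.total()
}

let finalCount = await runConcurrently(100)
```

Without the actor, the unsynchronized `completed += 1` would be a data race. With it, every increment is serialized and the final count is exact.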
### 2. AsyncThrowingStream for Streaming
Large PDFs can be processed incrementally:
```swift
public func extractPages(from source: Source)
async -> AsyncThrowingStream<Page, Error>
```
Consumers can process pages as they arrive:
```swift
for try await page in await client.extractPages(from: source) {
    // Process page immediately
}
```
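On the producing side, a stream like this is typically built from a continuation, with `onTermination` used to stop work when the consumer cancels or stops iterating. The sketch below is an assumption about how such a producer could look, not the SDK's implementation. `StubPage` and `makePageStream` are hypothetical, and the loop stands in for real page extraction.

```swift
// Hypothetical stand-in for Page.
struct StubPage { let pageIndex: Int }

/// Builds a stream that yields `pageCount` stub pages and then finishes.
func makePageStream(pageCount: Int) -> AsyncThrowingStream<StubPage, Error> {
    AsyncThrowingStream { continuation in
        let producer = Task {
            for index in 0..<pageCount {
                // Stop early once the consumer has gone away.
                if Task.isCancelled { break }
                continuation.yield(StubPage(pageIndex: index))
            }
            continuation.finish()
        }
        // Runs when the stream finishes or the consumer stops iterating.
        continuation.onTermination = { _ in producer.cancel() }
    }
}

var seenIndices: [Int] = []
for try await page in makePageStream(pageCount: 3) {
    seenIndices.append(page.pageIndex)
}
```

The default buffering policy is unbounded, so a fast producer can run ahead of a slow consumer. Passing a `bufferingPolicy` to the initializer is one way to cap how many pages are held.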
### 3. Codable for All Models
Every model is `Codable` for JSON serialization:
```swift
let document = try decoder.decode(Document.self, from: jsonData)
let json = try encoder.encode(document)
```
### 4. Optionals for Schema Conditionals
Fields that may be `null` in the schema are Swift optionals:
```swift
public var level: UInt8? // null for non-heading blocks
public var tableIndex: UInt? // null for non-table blocks
```
### 5. Enum Discriminated Unions
Complex types use Swift enums with associated values:
```swift
public enum FormFieldValue: Codable {
    case text(String?)
    case button(Bool)
    case choice(ChoiceValue)
    case signature(UInt32?)
}
```
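Enums with associated values usually need hand-written `Codable` conformance when the JSON does not match the compiler-synthesized layout. The sketch below shows one way to do this for a `ChoiceValue`-like type. `ChoiceSketch` is hypothetical, and the wire format (a bare string for a single choice, an array of strings for multiple) is an assumption rather than something taken from the schema.

```swift
import Foundation

// Hypothetical stand-in for ChoiceValue.
enum ChoiceSketch: Codable, Equatable {
    case single(String)
    case multiple([String])

    init(from decoder: Decoder) throws {
        let container = try decoder.singleValueContainer()
        // Try the scalar form first, then fall back to the array form.
        if let one = try? container.decode(String.self) {
            self = .single(one)
        } else {
            self = .multiple(try container.decode([String].self))
        }
    }

    func encode(to encoder: Encoder) throws {
        var container = encoder.singleValueContainer()
        switch self {
        case .single(let value): try container.encode(value)
        case .multiple(let values): try container.encode(values)
        }
    }
}

let choiceJSON = Data(#"["a", ["b", "c"]]"#.utf8)
let decodedChoices = try JSONDecoder().decode([ChoiceSketch].self, from: choiceJSON)
let reencodedChoices = try JSONDecoder().decode(
    [ChoiceSketch].self,
    from: try JSONEncoder().encode(decodedChoices)
)
```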
### 6. Type-Safe Errors
`PdftractError` provides typed errors with codes:
```swift
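do {
    let document = try await client.extract(from: source)
    print("Pages: \(document.metadata.pageCount)")
} catch let error as PdftractError {
    switch error {
    case .invalidPdf(let msg):
        print("Invalid PDF: \(msg)")
    case .networkError(let msg):
        print("Network error: \(msg)")
    default:
        // ioError, outOfMemory, parseError, ocrError, renderingError, internalError
        print("Extraction failed (\(error.code))")
    }
}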
```
## Schema Compliance
All models comply with `docs/schema/v1.0/pdftract.schema.json`:
- **Required fields**: Non-optional Swift properties
- **Optional fields**: Swift `Optional` (`Type?`)
- **Arrays**: Swift arrays (`[Type]`)
- **Null handling**: `nil` in Swift, `null` in JSON
- **Enums**: Swift enums with `String` raw values or custom `Codable`
## Integration Notes
### Placeholder Implementation
The current implementation uses a placeholder `ExtractorBridge` actor. In production, this would be replaced with one of:
1. **C FFI**: Call into compiled Rust library
2. **HTTP Client**: Call pdftract server API
3. **CLI Wrapper**: Execute pdftract binary
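For the CLI-wrapper option, Foundation's `Process` is available on both macOS and Linux. The sketch below shows only the generic mechanics of launching a binary and capturing its standard output. `runTool` is a hypothetical helper, and `/bin/echo` stands in for the real binary because the pdftract command-line arguments are not described in this document.

```swift
import Foundation

/// Runs an executable and returns its exit status and captured stdout.
func runTool(at path: String, arguments: [String]) throws -> (status: Int32, stdout: Data) {
    let process = Process()
    process.executableURL = URL(fileURLWithPath: path)
    process.arguments = arguments

    let pipe = Pipe()
    process.standardOutput = pipe

    try process.run()
    // Read before waiting so a large output cannot fill the pipe and block the child.
    let output = pipe.fileHandleForReading.readDataToEndOfFile()
    process.waitUntilExit()
    return (process.terminationStatus, output)
}

let toolResult = try runTool(at: "/bin/echo", arguments: ["ok"])
let toolOutput = String(decoding: toolResult.stdout, as: UTF8.self)
```

A real bridge would decode the captured bytes with `JSONDecoder` and map non-zero exit statuses to `PdftractError` cases. Both steps are left out here.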
### Cross-Platform Networking
Conditional import for Linux compatibility:
```swift
#if canImport(FoundationNetworking)
import FoundationNetworking
#endif
```
### Memory Management
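- All models are structs (value types), so instances are never shared mutably between callers
- `actor` provides thread-safe access
- `AsyncThrowingStream` buffers pages between producer and consumer; it applies no backpressure on its own, so its buffering policy determines how many pages are held in memory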
- Large data (attachments) truncated at 50 MB
## File Paths Summary
| File | Lines | Purpose |
|------|-------|---------|
| `Package.swift` | 25 | SPM manifest |
| `Sources/Pdftract/Pdftract.swift` | ~200 | Main client |
| `Sources/Pdftract/Models/Document.swift` | ~150 | Document, Metadata |
| `Sources/Pdftract/Models/Page.swift` | ~120 | Page, Span, Block |
| `Sources/Pdftract/Models/Table.swift` | ~100 | Table, Row, Cell |
| `Sources/Pdftract/Models/Annotation.swift` | ~200 | Links, Annotations |
| `Sources/Pdftract/Models/Signature.swift` | ~50 | Signature |
| `Sources/Pdftract/Models/FormField.swift` | ~120 | Form fields |
| `Sources/Pdftract/Models/Attachment.swift` | ~150 | Attachments, threads, outline |
| `Sources/Pdftract/Models/Quality.swift` | ~100 | Quality, diagnostics |
| `Sources/Pdftract/Models/Source.swift` | ~100 | Source enum, options |
| `Sources/Pdftract/Models/Error.swift` | ~50 | Error types |
| `Tests/PdftractTests/PdftractTests.swift` | ~500 | Unit tests |
| `Examples/main.swift` | ~600 | Usage examples |
**Total**: ~2,465 lines of Swift code
## Next Steps
1. **Implement `ExtractorBridge`**: Connect to actual pdftract core
- Option A: C FFI to compiled Rust library
- Option B: HTTP client to pdftract server
- Option C: Command-line wrapper
2. **Add CI/CD**: GitHub Actions for macOS/Linux testing
3. **Documentation**: Generate DocC documentation
4. **Binary Framework**: Distribute as `.xcframework` for non-SPM use
5. **Performance Testing**: Benchmark large PDF handling
## References
- JSON Schema: `docs/schema/v1.0/pdftract.schema.json`
- Rust Models: `crates/pdftract-core/src/schema/mod.rs`
- Plan: `docs/plan/plan.md` (lines 1-3825)