Fixed compilation errors in Span constructors by adding the missing `column: None` field.

Verified that the existing multi-output CLI parsing implementation meets all acceptance criteria for bead pdftract-37qim.

Changes:
- crates/pdftract-core/src/span/mod.rs: Add column field to new() and empty() constructors

Verification:
- All 23 output::tests pass
- CLI parsing validated for duplicate format detection, ndjson exclusivity, stdout uniqueness
- Format auto-naming (--format with -o) works correctly
- Default behavior (no flags -> JSON to stdout) confirmed

See notes/pdftract-37qim.md for detailed verification results.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Verification Note: pdftract-37qim
Task: CLI parsing + validation (multi-format flags, --ndjson exclusivity, stdout uniqueness)
Summary
The CLI parsing + validation for multi-output was already implemented in crates/pdftract-cli/src/output.rs. This verification confirms that the implementation meets all acceptance criteria.
Pre-existing Work
The implementation was already present in the codebase. This task primarily verified that:
- The `OutputConfig` struct and `build_specs()` method correctly validate output configurations
- All validation rules from the plan (lines 2261-2265) are enforced
- The CLI integration in `main.rs` uses the output configuration correctly
Fixes Made
- Fixed compilation errors in `crates/pdftract-core/src/span/mod.rs` by adding the missing `column: None` field to two constructors (`new()` and `empty()`)
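For context, a Rust struct literal must name every field, so adding a field to a struct breaks each constructor that builds it until the new field is initialized (compiler error E0063). The sketch below illustrates that kind of fix. Only the `column` field name and its `None` default come from this note; the `text` field, the `u32` type, and the constructor signatures are assumptions made for illustration and may not match the real `Span`.

```rust
// Hypothetical shape of Span. Only `column` and its `None` default are taken
// from the note; `text` and the `u32` type are illustrative assumptions.
#[derive(Debug, Clone, PartialEq)]
pub struct Span {
    pub text: String,
    pub column: Option<u32>,
}

impl Span {
    /// Builds a span with no column assigned yet.
    pub fn new(text: impl Into<String>) -> Self {
        Span {
            text: text.into(),
            // Omitting this line would fail with E0063 (missing field `column`).
            column: None,
        }
    }

    /// Builds an empty span; it needs the same field initialized.
    pub fn empty() -> Self {
        Span {
            text: String::new(),
            column: None,
        }
    }
}
```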
Verification Results
Acceptance Criteria - ALL PASS
- `--json a.json --md b.md -> 2 OutputSpecs built` - PASS
  - Test: `test_multiple_format_flags`
  - Verified: `cargo nextest run -p pdftract-cli --lib output::tests::test_multiple_format_flags`
- `--json a.json --json b.json -> CLI error "duplicate format"` - PASS
  - CLI test: `./target/debug/pdftract extract --json a.json --json b.json tests/fixtures/empty.pdf`
  - Output: `Error: duplicate format: --json and --json both specify json output`
- `--ndjson --md b.md -> CLI error "--ndjson cannot be combined"` - PASS (critical test line 2302)
  - CLI test: `./target/debug/pdftract extract --ndjson --md b.md tests/fixtures/empty.pdf`
  - Output: `error: the argument '--ndjson' cannot be used with '--md <PATH>'`
  - Note: clap's `conflicts_with_all` catches this at parse time
- `--md - --json out.json -> 2 specs, MD=Stdout, JSON=File` - PASS
  - Test: `test_stdout_with_file`
  - Verified: MD goes to stdout, JSON goes to file
- `--md - --json - -> CLI error "at most one stdout"` - PASS
  - CLI test: `./target/debug/pdftract extract --md - --json - tests/fixtures/empty.pdf`
  - Output: `Error: at most one output may be stdout (-); multiple formats cannot all write to stdout`
- `--format json,md -o out -> 2 specs, out.json + out.md` - PASS
  - Test: `test_format_with_base`
  - CLI test: `./target/debug/pdftract extract --format json,md -o out tests/fixtures/empty.pdf`
  - Output: `Producing 2 outputs: json -> out.json, markdown -> out.md`
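The auto-naming behavior in the last criterion (`--format json,md -o out` producing `out.json` and `out.md`) can be sketched roughly as follows. This is a minimal illustration, not the code in `output.rs`: the function names `extension_for` and `auto_named_outputs` are hypothetical, and the name-to-extension mapping covers only the two formats whose output names appear in this note.

```rust
use std::path::{Path, PathBuf};

/// Maps a format name to a file extension. Hypothetical helper: the mapping
/// is inferred from the CLI output above (json -> out.json, markdown -> out.md).
fn extension_for(format: &str) -> Option<&'static str> {
    match format {
        "json" => Some("json"),
        "md" | "markdown" => Some("md"),
        _ => None,
    }
}

/// Expands a comma-separated format list plus a base path into output paths.
pub fn auto_named_outputs(formats: &str, base: &Path) -> Result<Vec<PathBuf>, String> {
    formats
        .split(',')
        .map(|name| -> Result<PathBuf, String> {
            let name = name.trim();
            let ext = extension_for(name)
                .ok_or_else(|| format!("unknown format: {}", name))?;
            // Append the extension instead of replacing one, so a base such as
            // "report.v2" keeps its existing suffix.
            let mut file = base.as_os_str().to_os_string();
            file.push(".");
            file.push(ext);
            Ok(PathBuf::from(file))
        })
        .collect()
}
```

Calling `auto_named_outputs("json,md", Path::new("out"))` yields the paths `out.json` and `out.md`, while an unrecognized name in the list returns an error.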
Additional Verification
- Default behavior (no output flags) - PASS
  - Per line 2242-2243: Single output to stdout (default)
  - `test_output_config_default` confirms JSON to stdout when no flags specified
- `--format` without `-o` error - PASS
  - CLI test: `./target/debug/pdftract extract --format json tests/fixtures/empty.pdf`
  - Output: `Error: --format requires -o (output base path)`
- Cross-format duplication detection - PASS
  - Tests: `test_duplicate_format_json_flag_and_format_list`, `test_duplicate_format_md_flag_and_format_list`, `test_duplicate_format_text_flag_and_format_list`
  - Validates that `--json` and `--format json` cannot both specify JSON output
Implementation Details
OutputConfig Structure
Located in crates/pdftract-cli/src/output.rs:
- `OutputConfig` struct stores parsed CLI flags
- `build_specs()` method validates and builds `Vec<OutputSpec>`
- Validation rules:
  - Each format can appear at most once
  - At most one output can be stdout
  - `--ndjson` cannot be combined with other formats
  - `--format` requires `-o`
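The first two validation rules can be sketched as below. This is a simplified illustration under stated assumptions, not the actual `build_specs()` method: the `Format`, `Destination`, and `OutputSpec` definitions are hypothetical stand-ins, the function takes plain `(format, path)` pairs rather than an `OutputConfig`, and the error strings are abbreviated. The `--ndjson` and `--format` rules are left out.

```rust
use std::collections::HashSet;
use std::path::PathBuf;

// Hypothetical stand-ins; the real types live in
// crates/pdftract-cli/src/output.rs and may differ.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub enum Format {
    Json,
    Markdown,
    Text,
}

#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Destination {
    Stdout,
    File(PathBuf),
}

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct OutputSpec {
    pub format: Format,
    pub dest: Destination,
}

/// Validates (format, path) requests, where the path "-" means stdout.
pub fn build_specs(requests: &[(Format, &str)]) -> Result<Vec<OutputSpec>, String> {
    let mut seen = HashSet::new();
    let mut stdout_used = false;
    let mut specs = Vec::new();

    for &(format, path) in requests {
        // Rule: each format can appear at most once.
        if !seen.insert(format) {
            return Err(format!("duplicate format: {:?}", format));
        }
        let dest = if path == "-" {
            // Rule: at most one output can be stdout.
            if stdout_used {
                return Err("at most one output may be stdout (-)".to_string());
            }
            stdout_used = true;
            Destination::Stdout
        } else {
            Destination::File(PathBuf::from(path))
        };
        specs.push(OutputSpec { format, dest });
    }
    Ok(specs)
}
```

With this sketch, `[(Format::Markdown, "-"), (Format::Json, "out.json")]` builds two specs, while two `Format::Json` entries or two `"-"` paths return an error, mirroring the acceptance criteria above.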
CLI Integration
Located in crates/pdftract-cli/src/main.rs:
- `cmd_extract()` creates `OutputConfig` from CLI args
- Calls `build_specs()` and reports errors with `exit(2)`
- Iterates over output specs and writes each to its destination
- Uses `AtomicFileWriter` for file outputs (atomic writes)
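As background on the atomic writes mentioned above, one common technique is to write to a temporary sibling file and then rename it over the destination. The sketch below shows that technique using only the standard library. It is not the project's `AtomicFileWriter`, whose internals this note does not describe; the function name `write_atomically` and the `.tmp` suffix are assumptions.

```rust
use std::fs::{self, File};
use std::io::{self, Write};
use std::path::{Path, PathBuf};

/// Writes `contents` to a temporary sibling file, then renames it over `dest`.
/// Hypothetical helper, not the project's AtomicFileWriter.
pub fn write_atomically(dest: &Path, contents: &[u8]) -> io::Result<()> {
    // Keep the temporary file next to the destination so the rename stays
    // on the same filesystem.
    let mut tmp_name = dest.as_os_str().to_os_string();
    tmp_name.push(".tmp");
    let tmp_path = PathBuf::from(tmp_name);

    let mut tmp = File::create(&tmp_path)?;
    tmp.write_all(contents)?;
    // Push the data to disk before the rename makes the file visible.
    tmp.sync_all()?;
    drop(tmp);

    // Replaces any existing file at `dest` with the fully written one.
    fs::rename(&tmp_path, dest)
}
```

The point of this design is that a reader of `dest` sees either the old contents or the complete new contents, never a partially written file.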
Test Coverage
All 23 tests in output::tests pass:
- Format parsing (`test_format_from_str`)
- Extension mapping (`test_format_extension`)
- Destination handling (`test_destination_from_path`)
- Single format flags (`test_single_format_flag_json`, `test_single_format_flag_md`, `test_single_format_flag_text`)
- Multiple format flags (`test_multiple_format_flags`)
- Stdout handling (`test_stdout_with_file`, `test_multiple_stdout_rejected`)
- NDJSON exclusivity (`test_ndjson_exclusive_with_json`, `test_ndjson_exclusive_with_md`, `test_ndjson_exclusive_with_text`)
- Format auto-naming (`test_format_with_base`, `test_format_with_all_formats`, `test_output_spec_auto_named`)
- Duplicate detection (`test_duplicate_format_json_flag_and_format_list`, etc.)
References
- Plan section: Phase 6.6 CLI design + validation rules (lines 2221-2247, 2261-2303)
- Critical test: Line 2302 - `--ndjson --md b.md` → rejected at CLI parse time
PASS/WARN/FAIL Summary
- PASS: All 6 acceptance criteria
- WARN: None
- FAIL: None
Files Modified
- `crates/pdftract-core/src/span/mod.rs` - Fixed compilation errors (added `column: None` to constructors)
Files Verified
- `crates/pdftract-cli/src/output.rs` - Core validation logic
- `crates/pdftract-cli/src/main.rs` - CLI integration