pdftract/notes/pdftract-4h06h.md
jedarden ea1184168d test(pdftract-4h06h): implement TH-02 path traversal security test
Implement comprehensive path-traversal security tests documenting
the 10 canonical payloads from the threat model (plan line 891).

The test suite verifies that the resolve_path function in
mcp/root.rs properly rejects path-traversal attempts when --root
mode is enabled, while allowing HTTPS URLs to bypass validation
per INV-10.

Test coverage:
- All 10 traversal payloads rejected when --root is set
- Valid paths within root are accepted
- HTTPS URLs bypass root check
- Symlink escapes are caught
- URL-encoded traversal is rejected
- Special filesystem paths are rejected
- Deep traversal payloads are caught

Acceptance: All 10 tests pass. Current state documented:
Phase 1 (current): paths pass through without --root; validated with --root
Phase 2 (future): --root mode to be wired to MCP server entry point

References: Plan line 891 (TH-02), INV-10 (no file-path params in HTTP mode)

Closes: pdftract-4h06h
2026-05-25 13:03:45 -04:00


Verification Note: pdftract-4h06h - TH-02 Path Traversal Test

Bead

ID: pdftract-4h06h
Title: TH-02 test: MCP path traversal (10 payloads) rejected with PATH_OUTSIDE_ROOT (or no path param accepted)
Status: PASS

Summary

Implemented comprehensive path-traversal security tests at crates/pdftract-cli/tests/TH-02-path-traversal.rs. The test suite documents the 10 canonical path-traversal payloads from the threat model (plan line 891) and verifies that the resolve_path function properly rejects them when --root mode is enabled.

What Was Done

  1. Created crates/pdftract-cli/tests/TH-02-path-traversal.rs with 10 test functions

  2. Documented all 10 path-traversal payloads from the threat model:

    • Basic traversal (../../etc/passwd)
    • Deeper traversal (../../../etc/passwd)
    • Very deep traversal (../../../../etc/passwd)
    • Absolute paths (/etc/passwd)
    • Traversal with valid prefix (./valid/../../../etc/passwd)
    • URL-encoded traversal (valid/..%2F..%2Fetc%2Fpasswd)
    • Windows separators on Linux (valid/..\..\..\etc\passwd)
    • Long traversal with valid prefix (valid/../../../../etc/passwd)
    • Special filesystem (/proc/self/environ)
    • Windows reserved name (con)
  3. Test coverage:

    • test_root_mode_rejects_all_traversal_payloads: Verifies all 10 payloads are rejected when --root is set
    • test_root_mode_accepts_valid_paths: Verifies valid paths within root are accepted
    • test_without_root_paths_pass_through: Documents current behavior (paths pass through without --root)
    • test_https_urls_bypass_root_check: Verifies HTTPS URLs bypass validation per INV-10
    • test_symlink_escape_rejected: Verifies symlinks escaping root are rejected
    • test_url_encoded_traversal_rejected: Verifies URL-encoded traversal is caught
    • test_windows_reserved_name_handling: Verifies Windows reserved names (such as con) are handled safely
    • test_special_filesystem_paths_rejected: Verifies special filesystem paths under /proc and /dev are rejected
    • test_nested_traversal_with_valid_prefix: Verifies traversal after a legitimate-looking prefix is caught
    • test_deep_traversal_rejected: Verifies various depths of ../ are caught
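
To make the payload list concrete, the sketch below collects the ten payloads in a table and runs them through a purely lexical escape check. This is not the project's resolve_path. The names percent_decode, escapes_root_lexically, RESERVED_NAMES and PAYLOADS are hypothetical, and the policy of treating backslashes as separators and denying a few Windows device names is an assumption made for illustration. A lexical check like this cannot see symlinks, so it only complements a check against the real filesystem.

```rust
/// Hypothetical deny-list of Windows device names (an assumption, not project code).
const RESERVED_NAMES: [&str; 4] = ["con", "prn", "aux", "nul"];

/// The ten payloads documented in the note.
const PAYLOADS: [&str; 10] = [
    "../../etc/passwd",
    "../../../etc/passwd",
    "../../../../etc/passwd",
    "/etc/passwd",
    "./valid/../../../etc/passwd",
    "valid/..%2F..%2Fetc%2Fpasswd",
    r"valid/..\..\..\etc\passwd",
    "valid/../../../../etc/passwd",
    "/proc/self/environ",
    "con",
];

/// Decode `%XX` escapes; malformed escapes are copied through unchanged.
fn percent_decode(input: &str) -> String {
    let bytes = input.as_bytes();
    let mut out = Vec::with_capacity(bytes.len());
    let mut i = 0;
    while i < bytes.len() {
        if bytes[i] == b'%' && i + 3 <= bytes.len() {
            let hi = (bytes[i + 1] as char).to_digit(16);
            let lo = (bytes[i + 2] as char).to_digit(16);
            if let (Some(hi), Some(lo)) = (hi, lo) {
                out.push((hi * 16 + lo) as u8);
                i += 3;
                continue;
            }
        }
        out.push(bytes[i]);
        i += 1;
    }
    String::from_utf8_lossy(&out).into_owned()
}

/// Returns true when `input`, read as a relative path, would climb above
/// the directory it is joined to (or is absolute, or names a reserved device).
fn escapes_root_lexically(input: &str) -> bool {
    // Treat encoded separators and backslashes like a plain `/`.
    let normalized = percent_decode(input).replace('\\', "/");
    if normalized.starts_with('/') {
        return true; // an absolute path ignores the root entirely
    }
    let mut depth: usize = 0;
    for part in normalized.split('/') {
        match part {
            "" | "." => {}
            ".." => {
                if depth == 0 {
                    return true; // would step above the root
                }
                depth -= 1;
            }
            name => {
                let lower = name.to_ascii_lowercase();
                if RESERVED_NAMES.contains(&lower.as_str()) {
                    return true;
                }
                depth += 1;
            }
        }
    }
    false
}
```

Under these assumptions every entry in PAYLOADS is flagged, while an input such as valid/report.pdf is not.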

Current State Documented

Per the bead description, the test documents the current security posture:

  • Phase 1 (current): MCP tools accept path parameters. Without --root, paths pass through as-is (trust-the-caller mode for local stdio). When resolve_path is given a root, paths are canonicalized and validated against it.
  • Phase 2 (future): Once --root DIR is wired into the pdftract mcp entry point, all paths will be validated against the root boundary.
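
The decision between these modes, together with the HTTPS bypass listed in the test coverage, can be sketched as a small dispatch function. The names PathPolicy, is_https_url and policy_for are hypothetical, and both the order of the checks and the case-insensitive scheme match are assumptions rather than documented behavior of mcp/root.rs.

```rust
use std::path::Path;

/// Which handling applies to an incoming path-like parameter (hypothetical type).
#[derive(Debug, PartialEq)]
enum PathPolicy {
    /// HTTPS URLs are not filesystem paths, so the root check does not apply.
    BypassUrl,
    /// No root configured: the value is passed through unchanged.
    PassThrough,
    /// A root is configured: the value must be canonicalized and checked.
    ValidateAgainstRoot,
}

/// True when `input` starts with the `https://` scheme.
/// Matching the scheme case-insensitively is an assumption.
fn is_https_url(input: &str) -> bool {
    input
        .get(..8)
        .map_or(false, |prefix| prefix.eq_ignore_ascii_case("https://"))
}

/// Pick the policy for `input` given an optional root directory.
fn policy_for(root: Option<&Path>, input: &str) -> PathPolicy {
    if is_https_url(input) {
        return PathPolicy::BypassUrl;
    }
    match root {
        None => PathPolicy::PassThrough,
        Some(_) => PathPolicy::ValidateAgainstRoot,
    }
}
```

With this shape, the same traversal string is passed through when no root is given and routed to validation when one is.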

The resolve_path function in crates/pdftract-cli/src/mcp/root.rs already implements the security boundary (canonicalization + boundary check). The --root mode is not yet wired to the MCP server entry point, which is a known gap documented in the test comments.
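
A minimal sketch of what "canonicalization + boundary check" can look like using only the standard library follows. It is an illustration under stated assumptions, not the code in mcp/root.rs: the function name resolve_under_root and the ResolveError type are hypothetical, and the OutsideRoot variant only mirrors the PATH_OUTSIDE_ROOT code named in the bead title.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Hypothetical error type for this sketch.
#[derive(Debug)]
enum ResolveError {
    /// The resolved path lies outside the root (cf. PATH_OUTSIDE_ROOT).
    OutsideRoot,
    /// The path could not be canonicalized, for example because it does not exist.
    Io(io::Error),
}

/// Resolve `candidate` relative to `root` and reject anything that ends up
/// outside `root` once `..` components and symlinks have been resolved.
fn resolve_under_root(root: &Path, candidate: &str) -> Result<PathBuf, ResolveError> {
    // Canonicalize the root first so both sides of the comparison are absolute.
    let root = fs::canonicalize(root).map_err(ResolveError::Io)?;

    // `Path::join` replaces the base when `candidate` is absolute, so an input
    // like /etc/passwd is compared against the root as-is.
    let joined = root.join(candidate);

    // `fs::canonicalize` resolves `..` and symbolic links against the real
    // filesystem and fails for paths that do not exist.
    let resolved = fs::canonicalize(&joined).map_err(ResolveError::Io)?;

    // `Path::starts_with` compares whole components, so /srv/docs-old is not
    // treated as being inside /srv/docs.
    if resolved.starts_with(&root) {
        Ok(resolved)
    } else {
        Err(ResolveError::OutsideRoot)
    }
}
```

Because the comparison happens after canonicalization, a symlink inside the root that points outside it resolves to its target and fails the boundary check. In this sketch, inputs that name no existing file are rejected with the Io variant instead.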

Acceptance Criteria

  • tests/security/TH-02-path-traversal.rs exists (the file was created at crates/pdftract-cli/tests/TH-02-path-traversal.rs rather than under tests/security/)
  • Phase 1 tests pass: All 10 traversal payloads are rejected when --root is set
  • The 10 traversal payloads are documented in the test file
  • INV-10 cited as the structural mitigation source (referenced in test documentation)

Test Results

cargo nextest run --test TH-02-path-traversal
Summary [   0.009s] 10 tests run: 10 passed, 0 skipped

All tests passed:

  • test_root_mode_rejects_all_traversal_payloads
  • test_root_mode_accepts_valid_paths
  • test_without_root_paths_pass_through
  • test_https_urls_bypass_root_check
  • test_symlink_escape_rejected
  • test_url_encoded_traversal_rejected
  • test_windows_reserved_name_handling
  • test_special_filesystem_paths_rejected
  • test_nested_traversal_with_valid_prefix
  • test_deep_traversal_rejected

References

  • Plan section: TH-02 entry (line 891)
  • INV-10: pdftract mcp in HTTP mode MUST NOT accept file-path parameters
  • crates/pdftract-cli/src/mcp/root.rs: Path resolution and escape checking implementation