pdftract/crates/pdftract-cli
jedarden 28c31ba0a1 feat(pdftract-vk0gc): implement markdown anchors with parser regex
Add --md-anchors flag that emits HTML comment markers before each block
in Markdown output, allowing downstream tools to map excerpts back to
precise PDF locations.

Changes:
- Add markdown module with Anchor struct and parse_anchors() function
- Regex: <!-- pdftract: page=(\d+) block=(\d+) bbox=[([\d.,]+)] kind=(\w+) -->
- Add markdown_anchors: bool to ExtractionOptions
- Add --md-anchors CLI flag
- Implement block_to_markdown() and page_to_markdown() functions
- Add comprehensive documentation in docs/integrations/markdown-anchors.md
- 16 unit tests pass, including roundtrip test

Closes: pdftract-vk0gc
2026-05-24 02:49:16 -04:00
..
src feat(pdftract-vk0gc): implement markdown anchors with parser regex 2026-05-24 02:49:16 -04:00
tests feat(pdftract-mcp): add MCP server implementation changes 2026-05-23 03:09:56 -04:00
build.rs feat(pdftract-4q8cq): implement 14 environment checks for pdftract doctor 2026-05-23 06:47:07 -04:00
Cargo.toml test(pdftract-sy8x): implement lexer proptest harness and curated corpus 2026-05-24 02:36:37 -04:00
pdftract-cli.cdx.json feat(pdftract-67tm8): implement MCP stdio transport with integration tests 2026-05-23 00:16:42 -04:00