pdftract/crates/pdftract-cli
jedarden 7a70bb82b8 feat(pdftract-ixzbg): implement regex engine wiring for grep subcommand
Implement bead 7.8.2: Build the per-search matcher from GrepArgs.
Compile PATTERN into either a literal Aho-Corasick automaton (-F mode,
default) or a regex::Regex (-E mode). Apply -i (case-insensitive) and
-w (word-boundary) wrappers. Provide a uniform Matcher::find_iter(text)
-> Iter<MatchRange> API used by the per-span matcher.

Key changes:
- Add aho-corasick dependency for fast literal matching
- Create grep/matcher.rs with MatchRange and Matcher enum
- Reorganize grep.rs -> grep/mod.rs for proper module structure
- Implement literal mode with Aho-Corasick automaton
- Implement regex mode with regex::Regex
- Support case-insensitive matching in both modes
- Support word-boundary matching (\b anchors for regex, post-match check for literal)
- Comprehensive unit tests for all modes and edge cases

Closes: pdftract-ixzbg
2026-05-24 06:30:02 -04:00
..
src feat(pdftract-ixzbg): implement regex engine wiring for grep subcommand 2026-05-24 06:30:02 -04:00
tests feat(pdftract-3s2i): implement Phase 5.5.2 validation filter 2026-05-24 04:57:17 -04:00
build.rs feat(pdftract-3s2i): implement Phase 5.5.2 validation filter 2026-05-24 04:57:17 -04:00
Cargo.toml feat(pdftract-ixzbg): implement regex engine wiring for grep subcommand 2026-05-24 06:30:02 -04:00
pdftract-cli.cdx.json feat(pdftract-67tm8): implement MCP stdio transport with integration tests 2026-05-23 00:16:42 -04:00