pdftract/docs/notes
jedarden 96b548ea18 docs(pdftract-19oy): add verification note for codespace parser + tokenizer
Implementation is complete. The codespace range parser and multi-byte
tokenizer exist in crates/pdftract-core/src/cmap/:
- codespace.rs: CodespaceParser for begincodespacerange blocks
- tokenize.rs: tokenize_cjk_bytes with widest-first matching
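
For readers unfamiliar with the technique, widest-first matching against codespace ranges can be sketched as follows. This is a minimal illustration and not the code in crates/pdftract-core/src/cmap/: the names CodespaceRange and tokenize_widest_first are hypothetical, and the fallback of consuming one byte when no range matches is an assumption.

```rust
/// Hypothetical stand-in for one entry of a `begincodespacerange` block:
/// a code of `lo.len()` bytes matches when every byte lies between the
/// corresponding bytes of `lo` and `hi` (inclusive).
#[derive(Debug, Clone)]
struct CodespaceRange {
    lo: Vec<u8>,
    hi: Vec<u8>,
}

impl CodespaceRange {
    /// Byte-wise range check; codes of a different width never match.
    fn matches(&self, code: &[u8]) -> bool {
        code.len() == self.lo.len()
            && code
                .iter()
                .zip(self.lo.iter().zip(self.hi.iter()))
                .all(|(b, (lo, hi))| lo <= b && b <= hi)
    }
}

/// Split `input` into character codes, trying the widest candidate first
/// at each position. If no range matches, one byte is consumed so the
/// loop always makes progress (an assumed fallback for this sketch).
fn tokenize_widest_first<'a>(input: &'a [u8], ranges: &[CodespaceRange]) -> Vec<&'a [u8]> {
    let max_width = ranges.iter().map(|r| r.lo.len()).max().unwrap_or(1);
    let mut tokens = Vec::new();
    let mut pos = 0;
    while pos < input.len() {
        let remaining = input.len() - pos;
        let width = (1..=max_width.min(remaining))
            .rev() // widest candidate first
            .find(|&w| ranges.iter().any(|r| r.matches(&input[pos..pos + w])))
            .unwrap_or(1);
        tokens.push(&input[pos..pos + width]);
        pos += width;
    }
    tokens
}
```

With a one-byte range <00>..<80> and a two-byte range <8140>..<9FFC>, the bytes 41 81 40 42 would split into three codes: 41, 81 40, and 42.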

All acceptance criteria PASS. Compilation is blocked by unrelated missing_docs
errors in parser/struct_tree.rs and other modules.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-28 12:26:25 -04:00
.gitkeep Initial repo scaffold with README and docs structure 2026-05-16 14:26:16 -04:00
ocr-language-packs.md feat(pdftract-3zhf): add unified TableDetector::detect entry point 2026-05-24 00:51:59 -04:00
pdftract-2bfgc.md docs(pdftract-2bfgc): add sample nginx and Traefik reverse-proxy configs 2026-05-28 00:37:34 -04:00
pdftract-3c4i.md fix(pdftract-3c4i): export detect_merged_cells from table module 2026-05-24 00:23:14 -04:00
pdftract-19oy.md docs(pdftract-19oy): add verification note for codespace parser + tokenizer 2026-05-28 12:26:25 -04:00
release-signing.md docs(pdftract-3wrx): add release signing strategy note 2026-05-24 11:12:56 -04:00
sdk-architecture.md docs(pdftract-32y9): finalize SDK architecture note with workspace layout, cross-compile matrix, and KU-12 alignment 2026-05-24 06:38:23 -04:00
sdk-conformance-runner.md feat(pdftract-5omc): implement per-language conformance test runner pattern 2026-05-18 01:32:24 -04:00
sdk-contract.md docs(pdftract-147a): author SDK contract specification 2026-05-17 23:13:55 -04:00
sdk-invocation.md docs(pdftract-3b1x): finalize sdk-invocation.md with subprocess contract and TH-07 compliance 2026-05-24 07:48:09 -04:00