pdftract/crates
jedarden 01d7442c0f fix(correction): add Ligature::Ff to skip pattern and improve mojibake tests
- Add Ligature::Ff to the skip_next pattern in repair_split_ligatures
- Update mojibake test patterns to use readable Unicode escape sequences
- Fix NBSP test to use correct UTF-8 byte sequences
- Simplify multiple mojibake test to focus on accented character repair
- Update ligature test with more realistic scenario and complete glyph sequence

This fixes the handling of 'ff' ligatures that appear as f<U+FFFD>f in
split ligature scenarios, ensuring the second 'f' is properly skipped
during reconstruction.
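
The skip described above can be sketched as follows. This is a minimal illustration, not the crate's code: only the names `Ligature::Ff`, `skip_next`, and `repair_split_ligatures` come from the commit message. The `Fi` and `Fl` variants, the `from_pair` helper, and the string-based signature are assumptions made for the example.

```rust
/// U+FFFD, the replacement character left where the ligature glyph was.
const REPLACEMENT: char = '\u{FFFD}';

/// Hypothetical stand-in for the crate's ligature enum.
/// Only `Ff` is named in the commit; `Fi` and `Fl` are assumed.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Ligature {
    Ff,
    Fi,
    Fl,
}

impl Ligature {
    /// Hypothetical helper: identify the ligature from the characters
    /// on either side of a replacement character.
    fn from_pair(first: char, second: char) -> Option<Self> {
        match (first, second) {
            ('f', 'f') => Some(Ligature::Ff),
            ('f', 'i') => Some(Ligature::Fi),
            ('f', 'l') => Some(Ligature::Fl),
            _ => None,
        }
    }
}

/// Rebuilds text such as "of<U+FFFD>fice" into "office".
fn repair_split_ligatures(input: &str) -> String {
    let chars: Vec<char> = input.chars().collect();
    let mut out = String::with_capacity(input.len());
    let mut skip_next = false;

    for (i, &c) in chars.iter().enumerate() {
        if skip_next {
            // This character was already emitted as the ligature's second half.
            skip_next = false;
            continue;
        }
        if c == REPLACEMENT && i > 0 && i + 1 < chars.len() {
            if let Some(lig) = Ligature::from_pair(chars[i - 1], chars[i + 1]) {
                // The first half was pushed on the previous iteration;
                // emit the second half in place of U+FFFD.
                out.push(chars[i + 1]);
                // A variant missing from this pattern would leave the
                // second half in the stream and duplicate it ("offfice").
                skip_next = matches!(lig, Ligature::Ff | Ligature::Fi | Ligature::Fl);
                continue;
            }
        }
        out.push(c);
    }
    out
}
```

Under these assumptions, `repair_split_ligatures("of\u{FFFD}fice")` yields `"office"`, and a replacement character with no recognizable neighbors is left in place.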

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-08 10:34:06 -04:00
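
For context on the NBSP and accented-character tests mentioned in the commit: U+00A0 is encoded in UTF-8 as the bytes C2 A0, and "é" as C3 A9. When such bytes are mis-decoded as Latin-1, each byte becomes its own character. The sketch below reverses that one specific failure mode. It assumes Latin-1 mis-decoding only, and `repair_latin1_mojibake` is a hypothetical name, not a function confirmed to exist in pdftract-core.

```rust
/// Attempts to undo UTF-8 text that was mis-decoded as Latin-1.
///
/// Returns `None` when the input contains a character above U+00FF
/// (so it cannot be Latin-1 output) or when the recovered bytes are
/// not valid UTF-8.
fn repair_latin1_mojibake(garbled: &str) -> Option<String> {
    // Latin-1 maps each byte 0x00..=0xFF to the code point of the same
    // value, so the original bytes are just the code points narrowed to u8.
    let bytes: Option<Vec<u8>> = garbled
        .chars()
        .map(|c| u8::try_from(u32::from(c)).ok())
        .collect();

    // Re-decode the recovered bytes as UTF-8.
    String::from_utf8(bytes?).ok()
}
```

With this sketch, the two-character sequence `"\u{00C2}\u{00A0}"` is recovered as a single NBSP, and `"caf\u{00C3}\u{00A9}"` as `"café"`. Pure ASCII input passes through unchanged, so a caller would likely want to apply the repair only when the result differs from the input.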
pdftract-cer-diff docs(pdftract-aawrz): add LICENSE-MIT and LICENSE-APACHE files 2026-05-23 10:36:28 -04:00
pdftract-cli feat(pdftract-4k1x4): complete Phase 4 Text Assembly and Layout 2026-06-08 09:09:37 -04:00
pdftract-core fix(correction): add Ligature::Ff to skip pattern and improve mojibake tests 2026-06-08 10:34:06 -04:00
pdftract-inspector-ui fix(bf-4mkhv): clean up unused imports in hash.rs 2026-06-01 09:43:48 -04:00
pdftract-libpdftract feat(pdftract-3s2i): implement Phase 5.5.2 validation filter 2026-05-24 04:57:17 -04:00
pdftract-py fix(pdftract-2uk9z): wrap native module results in typed Python objects 2026-05-28 21:18:38 -04:00
pdftract-schema-migrate feat(bf-4w2rt): scaffold pdftract-schema-migrate crate 2026-06-01 10:00:37 -04:00