pdftract/crates/pdftract-core/src/parser
jedarden e176fa68ad fix(pdftract-2hm4): fix hex string lexer invalid char handling and whitespace/comment skipping
Two fixes:

1. The hex string lexer now flushes a dangling nibble when it encounters
   an invalid character, padding the low nibble with zero. In `<4X8Y>`,
   X and Y are invalid, so nibble 4 is flushed as 0x40 and nibble 8 as
   0x80, producing `\x40\x80`.

2. skip_whitespace_and_comments() now handles whitespace that follows a
   comment. The previous loop only continued if the next byte was `%`,
   so whitespace after a comment was left unconsumed.
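
The nibble-flushing behaviour from item 1 can be sketched roughly as follows. This is an illustration only, not the code in this commit: `decode_hex_body` is a hypothetical helper that takes the bytes between `<` and `>`, and it assumes whitespace inside a hex string is ignored rather than treated as an invalid character.

```rust
/// Hypothetical helper (not the crate's API): decode the bytes found
/// between `<` and `>` of a PDF hex string.
fn decode_hex_body(body: &[u8]) -> Vec<u8> {
    let mut out = Vec::with_capacity(body.len() / 2 + 1);
    // High nibble that is still waiting for its low partner.
    let mut pending: Option<u8> = None;

    for &b in body {
        let digit = match b {
            b'0'..=b'9' => Some(b - b'0'),
            b'a'..=b'f' => Some(b - b'a' + 10),
            b'A'..=b'F' => Some(b - b'A' + 10),
            _ => None,
        };
        match digit {
            // A hex digit either completes a byte or becomes the pending nibble.
            Some(d) => match pending.take() {
                Some(hi) => out.push((hi << 4) | d),
                None => pending = Some(d),
            },
            // Assumption: whitespace is skipped and does not flush anything.
            None if matches!(b, b' ' | b'\t' | b'\r' | b'\n' | 0x0C | 0x00) => {}
            // Invalid character: flush the dangling nibble with a zero low half.
            None => {
                if let Some(hi) = pending.take() {
                    out.push(hi << 4);
                }
            }
        }
    }
    // An odd number of digits at the end is also padded with zero.
    if let Some(hi) = pending {
        out.push(hi << 4);
    }
    out
}
```

Under these assumptions, `decode_hex_body(b"4X8Y")` yields the two bytes 0x40 and 0x80, matching the example in the commit message.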
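
One way to get the looping behaviour described in item 2 is to repeat "skip whitespace, then skip one comment" until a pass makes no progress. The sketch below is a hypothetical free-function version working on a byte slice and a position; the real lexer method's signature and state are not shown in this commit message, so the parameters and the `is_pdf_whitespace` helper are assumptions.

```rust
/// The six PDF whitespace bytes: NUL, TAB, LF, FF, CR, SPACE.
fn is_pdf_whitespace(b: u8) -> bool {
    matches!(b, 0x00 | b'\t' | b'\n' | 0x0C | b'\r' | b' ')
}

/// Hypothetical stand-in for the lexer's `skip_whitespace_and_comments`:
/// returns the index of the first byte at or after `pos` that is neither
/// whitespace nor part of a `%` comment.
fn skip_whitespace_and_comments(input: &[u8], mut pos: usize) -> usize {
    loop {
        let start = pos;

        // Consume any run of whitespace.
        while pos < input.len() && is_pdf_whitespace(input[pos]) {
            pos += 1;
        }

        // Consume one comment up to (not including) its end-of-line marker;
        // the marker is whitespace and is eaten by the next pass.
        if pos < input.len() && input[pos] == b'%' {
            while pos < input.len() && input[pos] != b'\n' && input[pos] != b'\r' {
                pos += 1;
            }
        }

        // No progress means the next byte is a real token (or end of input).
        if pos == start {
            return pos;
        }
    }
}
```

Because the loop condition is "did this pass advance" rather than "is the next byte `%`", whitespace after a comment, and further comments after that whitespace, are all consumed.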

All 52 lexer tests pass.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-18 01:47:17 -04:00
lexer fix(pdftract-2hm4): fix hex string lexer invalid char handling and whitespace/comment skipping 2026-05-18 01:47:17 -04:00
object feat(pdftract-7nav): add PdfStream helper methods and consolidate stream types 2026-05-17 23:55:47 -04:00
catalog.rs fix(pdftract-2bsfc): fix stream tests and catalog parser error handling 2026-05-17 23:56:10 -04:00
diagnostic.rs feat(pdftract-2bsfc): implement document catalog parser with PageLabels number tree 2026-05-17 23:45:45 -04:00
mod.rs feat(pdftract-7nav): add PdfStream helper methods and consolidate stream types 2026-05-17 23:55:47 -04:00
pages.rs test(pdftract-5tmcg): add cycle detection test for page tree flattener 2026-05-18 00:38:44 -04:00
stream.rs test(pdftract-2bpf6): add FlateDecode predictor tests and proptests 2026-05-18 01:08:21 -04:00
xref.rs fix(pdftract-1yad): enable proptest tests and update verification note 2026-05-18 00:15:00 -04:00