The PDF name object lexer was already fully implemented with all acceptance criteria passing. Added verification note documenting test results.
Co-Authored-By: Claude Code <noreply@anthropic.com>
Bead-Id: pdftract-5dng
pdftract-5dng: PDF Name Object Lexer Implementation
Summary
The PDF name object lexer was already fully implemented in crates/pdftract-core/src/parser/lexer/mod.rs (lines 658-767). All acceptance criteria pass with comprehensive test coverage.
Acceptance Criteria Status
PASS
All critical tests from the plan pass:
- /Foo -> Token::Name(b"Foo") - name_simple test
- /Foo#20Bar -> Token::Name(b"Foo Bar") - name_with_hex_escape_space test (#20 = space)
- /Foo#00Bar -> Token::Name(b"Foo") + STRUCT_INVALID_NAME - name_nul_byte_rejected test
- /AAA...AAA (128 A's) -> Token::Name(b"AAA...AAA") (127 A's) + STRUCT_INVALID_NAME - name_length_limit_127_bytes test
- / (alone) -> Token::Name(b"") (no diagnostic) - name_empty test
- /#23#23 -> Token::Name(b"##") - name_hex_escape_decodes_to_hash test (#23 = #)
- /Foo#GZ -> Token::Name(b"Foo#GZ") + STRUCT_INVALID_NAME - name_invalid_hex_escape_keeps_hash_literal test
Proptests
- name_proptest_never_panics_on_random_bytes - PASS
- name_proptest_always_produces_valid_token - PASS
INV-8
The implementation maintains INV-8 (lexer invariant). The name lexer properly:
- Tracks raw byte consumption (raw_consumed)
- Enforces 127-byte raw length limit before hex expansion
- Rejects NUL bytes (0x00) with STRUCT_INVALID_NAME diagnostic
- Truncates at 127 raw bytes, avoiding half-decoded escapes
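The raw-byte counting rule above can be sketched as a small helper. This is an illustration only, not the code in lexer/mod.rs: the name raw_cut_point and the constant MAX_RAW_NAME_LEN are hypothetical, the limit is assumed to count bytes after the leading /, and the helper ignores delimiters and NUL handling so that it shows only how a cut point can be chosen without splitting a #XX escape.

```rust
/// Assumed limit from the note: 127 raw bytes, counted before hex expansion.
const MAX_RAW_NAME_LEN: usize = 127;

/// Hypothetical helper (not from the real lexer): given the raw bytes that
/// follow '/', return the index where the name would be cut so that at most
/// 127 raw bytes are kept and no `#XX` escape is split in half.
fn raw_cut_point(raw: &[u8]) -> usize {
    let mut pos = 0;
    while pos < raw.len() {
        // A well-formed escape occupies three raw bytes: '#', hex, hex.
        let is_escape = raw[pos] == b'#'
            && raw.len() - pos >= 3
            && raw[pos + 1].is_ascii_hexdigit()
            && raw[pos + 2].is_ascii_hexdigit();
        let width = if is_escape { 3 } else { 1 };

        // Stop before a unit that would cross the raw-byte budget.
        if pos + width > MAX_RAW_NAME_LEN {
            break;
        }
        pos += width;
    }
    pos
}
```

Under these assumptions, 128 plain bytes are cut at 127, while 126 plain bytes followed by #20 are cut at 126, because the escape would otherwise end at raw byte 129.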
Implementation Details
The lex_name() function:
- Entry: Position immediately after the leading /
- Hex escapes (#XX): Decodes to single byte, checking for valid hex digits
- NUL rejection: Detects both literal NUL and #00 escape, emits diagnostic, truncates at NUL
- Length limit: 127 raw bytes (before hex expansion), truncates cleanly before incomplete #XX sequences
- Termination: Stops at whitespace or any PDF delimiter
- Empty name: / followed by delimiter/EOF produces Token::Name(b"") with no diagnostic
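The behavior listed above can be sketched as a single function. This is a minimal illustration under stated assumptions, not the actual lex_name() in lexer/mod.rs: the Token and Diagnostic types, the return shape (token, bytes consumed, diagnostics), and the helper names are all hypothetical, and the sketch simply stops consuming at a truncation point, since the note does not say what the real lexer does with the remaining bytes.

```rust
/// Hypothetical types for illustration; the real crate's definitions may differ.
#[derive(Debug, PartialEq)]
enum Token {
    Name(Vec<u8>),
}

#[derive(Debug, PartialEq)]
enum Diagnostic {
    StructInvalidName,
}

/// Assumed: the limit counts raw bytes after the leading '/'.
const MAX_RAW_NAME_LEN: usize = 127;

/// PDF whitespace except NUL, which this sketch rejects instead of skipping.
fn is_whitespace(b: u8) -> bool {
    matches!(b, b'\t' | b'\n' | 0x0C | b'\r' | b' ')
}

fn is_delimiter(b: u8) -> bool {
    matches!(b, b'(' | b')' | b'<' | b'>' | b'[' | b']' | b'{' | b'}' | b'/' | b'%')
}

fn hex_val(b: u8) -> Option<u8> {
    match b {
        b'0'..=b'9' => Some(b - b'0'),
        b'a'..=b'f' => Some(b - b'a' + 10),
        b'A'..=b'F' => Some(b - b'A' + 10),
        _ => None,
    }
}

/// Lexes a name body; `input` starts immediately after the leading '/'.
/// Returns the token, the number of raw bytes consumed, and any diagnostics.
fn lex_name(input: &[u8]) -> (Token, usize, Vec<Diagnostic>) {
    let mut out = Vec::new();
    let mut diags = Vec::new();
    let mut pos = 0;

    while pos < input.len() {
        let b = input[pos];
        if is_whitespace(b) || is_delimiter(b) {
            break; // Termination: the terminator is not consumed.
        }
        if b == 0 {
            diags.push(Diagnostic::StructInvalidName); // Literal NUL.
            break;
        }

        // Work out the decoded byte and how many raw bytes it occupies.
        let (decoded, width, bad_escape) = if b == b'#' {
            let hi = input.get(pos + 1).copied().and_then(hex_val);
            let lo = input.get(pos + 2).copied().and_then(hex_val);
            match (hi, lo) {
                (Some(hi), Some(lo)) => ((hi << 4) | lo, 3, false),
                _ => (b'#', 1, true), // Invalid escape: keep '#' literally.
            }
        } else {
            (b, 1, false)
        };

        // Length limit on raw bytes; never keep part of a `#XX` escape.
        if pos + width > MAX_RAW_NAME_LEN {
            diags.push(Diagnostic::StructInvalidName);
            break;
        }
        if bad_escape {
            diags.push(Diagnostic::StructInvalidName);
        }
        if decoded == 0 {
            diags.push(Diagnostic::StructInvalidName); // `#00` escape.
            break;
        }

        out.push(decoded);
        pos += width;
    }

    (Token::Name(out), pos, diags)
}
```

With these assumptions, lex_name(b"Foo#20Bar") yields Token::Name(b"Foo Bar") with no diagnostics, and lex_name(b"Foo#00Bar") yields Token::Name(b"Foo") with one StructInvalidName diagnostic. The sketch may emit more than one diagnostic for a single name (for example, an invalid escape followed by an overlong tail); whether the real lexer deduplicates them is not stated in this note.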
Test Results
running 23 tests
test parser::lexer::tests::name_case_sensitive ... ok
test parser::lexer::tests::name_hex_escape_decodes_to_hash ... ok
test parser::lexer::tests::name_empty ... ok
test parser::lexer::tests::name_hex_escape_zero_zero_is_nul ... ok
test parser::lexer::tests::name_invalid_hex_escape_keeps_hash_literal ... ok
test parser::lexer::tests::name_invalid_hex_escape_single_digit ... ok
test parser::lexer::tests::name_empty_followed_by_delimiter ... ok
test parser::lexer::tests::name_length_limit_127_bytes ... ok
test parser::lexer::tests::name_length_limit_exact_127_bytes_valid ... ok
test parser::lexer::tests::name_hex_escape_mixed_case ... ok
test parser::lexer::tests::name_simple ... ok
test parser::lexer::tests::name_nul_byte_rejected ... ok
test parser::lexer::tests::name_length_limit_counts_raw_bytes_before_expansion ... ok
test parser::lexer::tests::name_multiple_invalid_hex_escapes ... ok
test parser::lexer::tests::name_literal_nul_byte_rejected ... ok
test parser::lexer::tests::name_truncation_before_incomplete_escape ... ok
test parser::lexer::tests::name_with_bytes_preserved ... ok
test parser::lexer::tests::name_with_all_delimiters ... ok
test parser::lexer::tests::name_with_hex_escape_space ... ok
test parser::lexer::tests::name_with_slash_delimiter ... ok
test parser::lexer::tests::name_zero_byte_not_confused_with_nul ... ok
test parser::lexer::tests::name_proptest_never_panics_on_random_bytes ... ok
test parser::lexer::tests::name_proptest_always_produces_valid_token ... ok
test result: ok. 23 passed; 0 failed; 0 ignored
Full lexer suite: 77 tests passed.
Files Modified
No changes required - implementation was already complete.