Add Phase 4 caption classification for detecting figure captions.

Implements classify_caption() which identifies blocks as captions when:
- Small font size (median < page body median)
- Follows Figure block within 2 line heights
- Same column as Figure

Module: crates/pdftract-core/src/layout/caption.rs

Acceptance criteria:
- Block immediately below Figure, small font, same column → kind: Caption
- Block 5 lines below Figure → NOT Caption (gap too large)
- Block with body-size font below Figure → NOT Caption (font not smaller)
- Block in different column from Figure → NOT Caption

Tests: 9/9 passed covering all acceptance criteria plus edge cases.

Closes: pdftract-xzfkt

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
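The three caption rules in the commit message above can be sketched roughly as follows. This is an illustration only, not the code in crates/pdftract-core/src/layout/caption.rs: the `Block` and `BlockKind` types, their fields, the `MAX_GAP_LINES` constant, and the function signature are all hypothetical, and the sketch assumes that "2 line heights" is measured with the candidate block's own line height and that y coordinates grow downward.

```rust
/// Hypothetical block kinds; the real crate's enum may differ.
#[derive(Debug, Clone, Copy, PartialEq)]
enum BlockKind {
    Body,
    Figure,
    Caption,
}

/// Hypothetical layout block. Units are points; y grows downward (assumption).
#[derive(Debug, Clone, Copy)]
struct Block {
    kind: BlockKind,
    column: usize,
    top: f32,
    bottom: f32,
    median_font_size: f32,
    line_height: f32,
}

/// Largest allowed gap between a Figure and its caption, in line heights.
const MAX_GAP_LINES: f32 = 2.0;

/// Returns `BlockKind::Caption` when all three rules hold,
/// otherwise returns the block's existing kind unchanged.
fn classify_caption(block: &Block, figure: &Block, page_body_median: f32) -> BlockKind {
    // Vertical distance from the bottom of the figure to the top of the block.
    let gap = block.top - figure.bottom;

    let is_caption = figure.kind == BlockKind::Figure
        // Rule 1: font is strictly smaller than the page body median.
        && block.median_font_size < page_body_median
        // Rule 2: block sits below the figure, within 2 line heights.
        && gap >= 0.0
        && gap <= MAX_GAP_LINES * block.line_height
        // Rule 3: block is in the same column as the figure.
        && block.column == figure.column;

    if is_caption {
        BlockKind::Caption
    } else {
        block.kind
    }
}
```

Under these assumptions, each of the four acceptance criteria maps to one condition failing or all of them holding: a block five lines below the figure fails the gap check, a body-size font fails the strict `<` comparison, and a different column fails the equality check.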
23 lines
712 B
TOML
# Clippy configuration for pdftract
#
# This file configures clippy lints for the pdftract workspace.

# Minimum Supported Rust Version
# Enables MSRV-aware lints that warn about using APIs newer than our MSRV
msrv = "1.78"

# Warn on every wildcard import, including preludes and `use super::*` in tests
warn-on-all-wildcard-imports = true

# Cognitive complexity threshold - helps keep code simple
cognitive-complexity-threshold = 30

# Type complexity threshold
type-complexity-threshold = 250

# Literal representation threshold: lower bound for linting decimal literals
literal-representation-threshold = 10

# Enforce documentation for public items
# Note: missing-docs-in-private-items is not a valid clippy.toml option
# Documentation is enforced via other means