- Add axum feature flag to miroir-core with IntoResponse impl for MeilisearchError
- Refactor auth middleware to use MeilisearchError::new() + MiroirCode instead of
manual JSON construction, ensuring consistent error shape across all auth errors
- Add proxy error.rs re-export alias for ApiError
- Implement full telemetry middleware with Prometheus metrics (request duration,
in-flight gauge, scatter counters, node health)
- Reorder middleware layers: auth before telemetry so 401s are also instrumented
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Implement the API error response format from plan §5:
- ErrorType enum: invalid_request, auth, internal, system
- MiroirCode enum with all 10 miroir_* codes and their HTTP status mappings
- MeilisearchError struct with Meilisearch-compatible JSON shape
- Forwarding support for Meilisearch-native node errors (verbatim passthrough)
- Doc links pointing to docs/errors.md#<code>
- 21 unit tests covering every code's JSON shape, HTTP status, and forwarding
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Implements the Elasticsearch dfs_query_then_fetch pattern as the
global-IDF preflight phase (OP#4). This solves the cross-shard score
comparability problem that caused both RRF (τ=0.14) and score-based
merge (τ=0.79) to fail the τ≥0.95 quality threshold.
Core changes:
- New DfsPhase in scatter-gather pipeline (scatter.rs):
- PreflightRequest/PreflightResponse for term statistics collection
- GlobalIdf for coordinator-side IDF aggregation
- execute_preflight() for phase 1 of DFS
- dfs_query_then_fetch_search() for full two-phase execution
- ScoreMergeStrategy in merger.rs for global-IDF scoring
- HttpClient with preflight_node() support (client.rs)
- Search route integration using dfs_query_then_fetch_search()
- Integration test with skewed corpus demonstrating the fix
The preflight phase adds ~15µs of aggregation overhead at 64 shards
(O(shards * terms)) with O(1) per-shard parallelization. Network
latency adds one round-trip before the actual search query.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Research complete: both score-based and RRF merge fail 0.95 threshold.
Updated research doc with full RRF validation results and confidence intervals.
Added benchmark result reports and helper tests. Follow-up bead miroir-n6v
created for global-IDF preflight (dfs_query_then_fetch pattern).
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Research validated that both score-based (τ=0.79) and RRF (τ=0.14) merging
fail the 0.95 Kendall tau threshold with skewed shard distributions. Created
follow-up bead miroir-n6v for global-IDF preflight implementation.
Also: add __pycache__/ and tarpaulin-report.json to .gitignore, fix
task_pruner gauge test race.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Wire scatter (fan-out) directly into the RRF merger via scatter_gather_search(),
completing the full read path: plan → scatter → RRF merge. Add RRF simulation
mode to score-comparability benchmark for measuring rank correlation against
global BM25 ground truth.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
- Add NodeClient trait for HTTP calls to Meilisearch nodes (seam between pure miroir-core and networked miroir-proxy)
- Add ScatterPlan struct containing chosen_group, target_shards, shard_to_node mapping, deadline_ms, hedging_eligible
- Implement plan_search_scatter() pure function that constructs the covering set without I/O
- Implement execute_scatter() async function that fans out to nodes with partial-failure handling
- Add MockNodeClient for testing with pre-programmed responses/errors
- Add unit tests for plan construction, query group rotation, shard-to-node mapping, hedging eligibility, and scatter execution
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Background pruner batch-deletes terminal tasks (succeeded/failed/canceled)
older than task_registry.ttl_seconds (default 7d). Runs every prune_interval_s
(default 300s) with batch_size=10000. Uses advisory lock via leader_lease table
to prevent concurrent pruning in single-pod deployments. Exposes
miroir_task_registry_size gauge updated after each cycle.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Extends SqliteTaskStore with full CRUD operations for:
- Table 8: canaries (upsert, get, list, delete)
- Table 9: canary_runs (insert with auto-prune to run_history_limit)
- Table 10: cdc_cursors (upsert, get, list by sink)
- Table 11: tenant_map (insert, get by BLOB key, delete)
- Table 12: rollover_policies (upsert, get, list, delete)
- Table 13: search_ui_config (upsert, get, delete)
- Table 14: admin_sessions (insert, get, revoke, delete_expired)
Key implementation details:
- prune_tasks uses subquery for LIMIT support (SQLite doesn't support LIMIT in DELETE)
- canary_runs auto-prune keeps only N most recent runs per canary_id
- tenant_map.api_key_hash is a 32-byte BLOB (raw sha256)
- admin_sessions has expires_at index for lazy eviction
- All bool fields stored as INTEGER (0/1) with conversion on read/write
Adds 12 comprehensive unit tests covering:
- CRUD round-trip for each table
- Auto-prune logic for canary_runs
- Nullable fields (tenant_map.group_id, admin_sessions.user_agent/source_ip)
- Composite PK behavior (cdc_cursors, canary_runs)
- prune_tasks batch deletion with status filter
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Implement a first-class schema version system with the following components:
- schema_versions table (SQLite) tracking applied migrations
- Numbered migration files (001_initial.sql, 002_feature_tables.sql)
- MigrationRegistry for version validation and pending migration detection
- Startup: read current version → apply pending migrations → record latest
- Refuse to start if DB version > binary version (SchemaVersionAhead error)
Acceptance criteria met:
- First run creates schema at version 001
- Second run is a no-op (single SELECT for version check)
- Store version > binary version fails with SchemaVersionAhead error
- Migration metadata structure is backend-agnostic (shared by SQLite/Redis)
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Research doc updated with precise 95% CIs per query type. compare.py
now computes and reports confidence intervals. Kendall τ = 0.79
(95% CI [0.7873, 0.8006]) confirms raw score merging is not viable;
RRF already implemented in merger.rs as mitigation. Follow-up bead
created (miroir-zfo) for RRF quality validation.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Define the TaskStore trait in miroir-core with SQLite backend for the
seven always-present tables from plan §4: tasks, node_settings_version,
aliases, sessions, idempotency_cache, jobs, leader_lease.
Key design choices:
- serde_json Value columns for tasks.node_tasks and aliases JSON fields
- BLOB (32 raw bytes) for idempotency_cache.body_sha256
- CAS operations for job claims and leader lease acquisition
- Idempotent migrations via schema_versions table with version gating
- WAL mode + busy_timeout=5000 for concurrent write safety
- Mutex<Connection> for thread-safe single-process access
15 tests covering CRUD round-trips, alias flipping with history
retention, session/idempotency expiry, job claim/renew, leader lease
acquire/renew/steal, migration idempotency, WAL mode, and concurrent
writes without deadlock.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Expand NodeStatus with Active/Degraded/Removed variants and implement
state-machine transitions covering the full plan §2 lifecycle (Joining→Active→
Draining→Removed, failure/recovery paths). Add is_write_eligible_for() for
shard-aware write eligibility during drain. Restructure Topology with
shards/replica_groups/rf fields, Vec<Node> storage, and custom serde that
auto-rebuilds the group index on deserialization. Add Group::healthy_nodes()
helper. Rename Node.url→address to match plan §4 YAML schema.
13 new tests: YAML round-trip, group iteration, all legal/illegal state
transitions, write-eligibility correctness table, healthy_nodes filtering.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
- Add Default impls for TaskStateMachine and RaftTaskRegistry (clippy::new_without_default)
- Remove openraft dep that fails on stable Rust 1.87 (validit uses let_chains)
- Silence dead_code warnings in raft_proto benchmark module
- Add autobenches = false to miroir-core Cargo.toml
- Update Cargo.lock
All Phase 0 DoD criteria pass: build, test (73), clippy, fmt, musl release.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
cargo fmt reformats dump.rs match arms; credentials.rs needs #[allow(dead_code)]
on an unused public helper.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
- Add resharding load simulation model with real router hash functions
- Benchmark confirms storage amplification is exactly 2.0× and dual-write
amplification is exactly 2.0× across all test matrix scenarios (1KB/10GB,
10KB/100GB, 1MB/1TB), with hash distribution CV < 5% in all cases
- CLI window guard: resharding.allowed_windows config restricts resharding
to named time windows (e.g. "02:00-06:00 UTC"), CLI refuses outside
windows without --force
- Integration tests confirm rejection outside window, --force override,
no-restriction mode, and disabled config handling
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
14 chaos tests validate shard migration write safety at every cutover
boundary. Key findings:
- AE on + delta pass: 0/1M loss (production default)
- AE off + delta pass: 0/50K loss (delta pass is sufficient alone)
- AE off + delta skipped: ~2% loss → hard refusal at config validation
- 3-node cluster cutover: 0 loss with delta pass
Hard-coded policy: MigrationCoordinator refuses migrations when both
anti-entropy is disabled and delta pass is skipped. Warning logged when
AE is disabled but delta pass remains active.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
The dev_config() helper referenced CdcConfig/CdcBufferConfig/
SearchUiConfig/RateLimitConfig without the advanced:: module prefix.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Full serde-derived struct tree covering every block in plan §4 (MiroirConfig,
NodeConfig, TaskStoreConfig, AdminConfig, HealthConfig, ScatterConfig,
RebalancerConfig, ServerConfig, ConnectionPoolConfig, TaskRegistryConfig) and
all 21 §13 advanced-capability sub-structs (ReshardingConfig through
SearchUiConfig with nested auth/rate-limit/CSP/analytics structs), plus §14
horizontal-scaling structs (PeerDiscoveryConfig, LeaderElectionConfig, HpaConfig).
Includes:
- Layered loading via config crate: built-in defaults → file → env overrides
- Config::validate() with 14 cross-field rules (HA requires redis, scoped_key
timing inversion, node group bounds, tenant affinity range checks, etc.)
- 10 unit tests: round-trip YAML, full plan example, minimal YAML defaults,
and validation rejection cases
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Enumerates dump variants that streaming mode can/can't handle.
- Added docs/dump-import/compatibility-matrix.md with comprehensive
compatibility matrix covering Meilisearch versions, dump variants,
and workarounds
- Added docs/dump-import/README.md as entry point
- Updated miroir-ctl dump command to reference matrix with helpful
error messages for unimplemented subcommands (import, export, analyze)
Addresses Open Problem #5: identifies what "can't reconstruct" means
in concrete terms, giving operators clear guidance on when broadcast
fallback is needed and what alternatives exist.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Add miroir-ctl management CLI with:
- clap root CLI with admin-key loading (env → credentials file → flag)
- All 15 subcommand stubs from plan §4
- Unit tests for credential loader priority order
- Clear "not yet implemented" messages pointing to tracking bead
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
- Update workspace Cargo.toml: explicit members list, edition 2021, MIT license, rust-version 1.87
- Simplify workspace.dependencies to core shared deps
- Update member crates to use explicit dependency versions where workspace inheritance was removed
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
- LICENSE: MIT (per plan §12)
- CHANGELOG.md: Keep a Changelog 1.1.0 skeleton with [Unreleased]
and [0.1.0] sections matching the awk extractor from plan §7
- .gitignore: Rust target/, editor junk; Cargo.lock kept in VCS
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>