jedarden 4622dc503a P3: Verify Phase 3 Task Registry + Persistence completion
Phase 3 — Task Registry + Persistence (SQLite schema, Redis mirror) has been
completed and verified. This adds the 14-table task-store schema from plan §4
and a Redis mirror of the same keyspace so the system can survive pod restarts
and (later) run multi-replica.

## Verification Summary

### 1. SQLite Backend (SqliteTaskStore)
- All 14 tables defined in migrations (001_initial.sql, 002_feature_tables.sql)
- Idempotent migration system with schema version tracking
- Full TaskStore trait implementation (all 14 tables)
- WAL mode + busy_timeout configuration
- 36 passing tests including:
  - CRUD round-trips for all tables
  - Property tests (proptest)
  - Restart resilience (task_survives_store_reopen, all_tables_survive_store_reopen)
  - Concurrent write safety
  - Schema version validation

### 2. Redis Backend (RedisTaskStore)
- Full TaskStore trait implementation mirroring SQLite
- All 14 tables mapped to Redis keyspace
- Index sets for O(cardinality) iteration (no SCAN)
- Rate limiting helpers (search_ui, admin_login with backoff)
- Pub/Sub session revocation support
- CDC overflow buffer with byte-budget trimming
- Scoped key rotation coordination
- testcontainers-based integration tests

### 3. Schema Migrations
- 001_initial.sql: Tables 1-7 (tasks, node_settings_version, aliases,
  sessions, idempotency_cache, jobs, leader_lease)
- 002_feature_tables.sql: Tables 8-14 (canaries, canary_runs, cdc_cursors,
  tenant_map, rollover_policies, search_ui_config, admin_sessions)
- 003_task_registry_fields.sql: No-op (fields already in 001)
- Version tracking with SchemaVersionAhead error
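
The version-tracking behaviour described above can be sketched as follows. This is a minimal illustration only, not the code in schema_migrations.rs: the `Migration` struct, the `MIGRATIONS` table, the `pending` function, and the fields on `SchemaVersionAhead` are all assumptions made for the example. Only the three file names and the error name come from the notes above.

```rust
use std::fmt;

/// One migration step (hypothetical shape; the real registry may differ).
#[derive(Debug, Clone, Copy)]
pub struct Migration {
    pub version: u32,
    pub name: &'static str,
}

/// Illustrative registry listing the three files named above.
pub const MIGRATIONS: &[Migration] = &[
    Migration { version: 1, name: "001_initial.sql" },
    Migration { version: 2, name: "002_feature_tables.sql" },
    Migration { version: 3, name: "003_task_registry_fields.sql" },
];

#[derive(Debug, PartialEq, Eq)]
pub enum MigrationError {
    /// The database reports a schema version newer than this binary knows.
    /// The field names are assumptions for this sketch.
    SchemaVersionAhead { found: u32, supported: u32 },
}

impl fmt::Display for MigrationError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            MigrationError::SchemaVersionAhead { found, supported } => write!(
                f,
                "schema version {found} is ahead of supported version {supported}"
            ),
        }
    }
}

impl std::error::Error for MigrationError {}

/// Returns the migrations still to apply for a database at version `current`.
/// Calling it again after they are applied returns an empty list, which is
/// what makes the process idempotent.
pub fn pending(current: u32) -> Result<Vec<Migration>, MigrationError> {
    let supported = MIGRATIONS.last().map_or(0, |m| m.version);
    if current > supported {
        return Err(MigrationError::SchemaVersionAhead { found: current, supported });
    }
    Ok(MIGRATIONS
        .iter()
        .copied()
        .filter(|m| m.version > current)
        .collect())
}
```

Under these assumptions, a fresh database (version 0) gets all three steps, a database already at version 3 gets none, and a database at version 4 is rejected instead of being silently downgraded.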

### 4. Helm Schema Validation
- values.schema.json Rule 1: miroir.replicas > 1 requires taskStore.backend: redis
- values.schema.json Rule 2: hpa.enabled requires replicas >= 2 AND redis
- values.schema.json Rules 3-4: rate_limit.backend must be redis when replicas > 1
- Verified with helm lint (rejects replicas=3 + backend=sqlite)
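
For readers who prefer code to JSON Schema, the same constraints can be written as a plain check. This sketch is not part of the chart: the `Values` struct, both enums, and the `Local` rate-limit variant are hypothetical, and Rules 3-4 are collapsed into one check because the notes above do not spell out how they differ.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum TaskStoreBackend {
    Sqlite,
    Redis,
}

/// `Local` is an assumed name for the non-Redis rate-limit option.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum RateLimitBackend {
    Local,
    Redis,
}

/// Hypothetical flattened view of the relevant Helm values.
#[derive(Debug, Clone, Copy)]
pub struct Values {
    pub replicas: u32,
    pub task_store_backend: TaskStoreBackend,
    pub rate_limit_backend: RateLimitBackend,
    pub hpa_enabled: bool,
}

/// Returns one message per violated rule; an empty list means the values pass.
pub fn violations(v: &Values) -> Vec<&'static str> {
    let mut out = Vec::new();
    let redis_store = v.task_store_backend == TaskStoreBackend::Redis;

    // Rule 1: more than one replica needs the shared Redis task store.
    if v.replicas > 1 && !redis_store {
        out.push("replicas > 1 requires taskStore.backend: redis");
    }
    // Rule 2: autoscaling needs at least two replicas and Redis.
    if v.hpa_enabled && !(v.replicas >= 2 && redis_store) {
        out.push("hpa.enabled requires replicas >= 2 and taskStore.backend: redis");
    }
    // Rules 3-4 (merged here): rate limiting must be shared across replicas.
    if v.replicas > 1 && v.rate_limit_backend != RateLimitBackend::Redis {
        out.push("replicas > 1 requires rate_limit.backend: redis");
    }
    out
}
```

The combination that helm lint rejects above (replicas=3 with backend=sqlite) trips the first check in this sketch as well.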

### 5. Memory Accounting (Plan §14.7)
- test_redis_memory_budget: 10k tasks + 1k idempotency entries + 1k sessions
- Target: < 2 MB RSS for representative workload
- CDC overflow buffer enforces per-sink byte budget
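
One way to picture byte-budget trimming is an in-memory queue, one per sink. This is a hedged sketch, not the Redis-backed implementation in redis.rs: the `OverflowBuffer` type and its methods are hypothetical, and the policy of dropping the oldest events first is an assumption, since the notes above only say that a per-sink byte budget is enforced.

```rust
use std::collections::VecDeque;

/// Hypothetical per-sink buffer that never holds more than `budget_bytes`.
pub struct OverflowBuffer {
    budget_bytes: usize,
    used_bytes: usize,
    events: VecDeque<Vec<u8>>,
}

impl OverflowBuffer {
    pub fn new(budget_bytes: usize) -> Self {
        Self { budget_bytes, used_bytes: 0, events: VecDeque::new() }
    }

    /// Appends an event, then drops the oldest events until the buffer fits
    /// the budget again. Returns how many events were dropped. An event that
    /// is larger than the whole budget ends up dropped itself.
    pub fn push(&mut self, event: Vec<u8>) -> usize {
        self.used_bytes += event.len();
        self.events.push_back(event);

        let mut dropped = 0;
        while self.used_bytes > self.budget_bytes {
            match self.events.pop_front() {
                Some(old) => {
                    self.used_bytes -= old.len();
                    dropped += 1;
                }
                None => break,
            }
        }
        dropped
    }

    pub fn len(&self) -> usize {
        self.events.len()
    }

    pub fn is_empty(&self) -> bool {
        self.events.is_empty()
    }

    pub fn used_bytes(&self) -> usize {
        self.used_bytes
    }
}
```

Tracking `used_bytes` incrementally keeps each push proportional to the number of events it evicts, so the buffer never has to be rescanned to enforce the budget.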

## Files Verified
- crates/miroir-core/src/task_store/mod.rs: TaskStore trait + row types
- crates/miroir-core/src/task_store/sqlite.rs: SQLite implementation
- crates/miroir-core/src/task_store/redis.rs: Redis implementation
- crates/miroir-core/src/schema_migrations.rs: Migration registry
- crates/miroir-core/src/migrations/*.sql: Schema migrations
- charts/miroir/values.schema.json: Helm validation rules

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-02 17:33:24 -04:00
.beads P3: Verify Phase 3 Task Registry + Persistence completion 2026-05-02 17:33:24 -04:00
.cargo Multi-stage Dockerfile with musl cross-compilation and .dockerignore 2026-04-19 13:47:45 -04:00
.github P8.6: Release mechanics — bump script, release-ready check, PR template, Argo CIs 2026-04-19 09:54:26 -04:00
benches P12.OP4: Implement dfs_query_then_fetch for cross-shard comparability 2026-04-19 03:43:10 -04:00
charts/miroir P3: Complete Phase 3 — Task Registry + Persistence (SQLite + Redis) 2026-05-02 17:14:29 -04:00
crates P3: Complete Phase 3 Task Registry + Persistence 2026-05-02 17:27:48 -04:00
dashboards P7.3: Add §13.1 resharding row to Grafana dashboard, fix y-coordinate overlaps 2026-04-19 13:18:13 -04:00
docs P3: Complete Phase 3 — Task Registry + Persistence (SQLite + Redis) 2026-05-02 16:52:25 -04:00
k8s P12: close Phase 12 epic — all 6 open problems triaged and documented 2026-04-24 19:14:23 -04:00
notes P3: Verify Phase 3 Task Registry + Persistence completion 2026-05-02 17:30:09 -04:00
scripts P8.6: Release mechanics — bump script, release-ready check, PR template, Argo CIs 2026-04-19 09:54:26 -04:00
tests/benches/score-comparability P2.2: Implement write path with primary key validation, shard injection, and two-rule quorum 2026-04-19 06:48:30 -04:00
.dockerignore Multi-stage Dockerfile with musl cross-compilation and .dockerignore 2026-04-19 13:47:45 -04:00
.editorconfig Add repo hygiene: LICENSE, CHANGELOG, .gitignore 2026-04-18 20:47:36 -04:00
.gitignore P8: Add optional OpenTelemetry tracing deps, fix subscriber init, clean up .gitignore 2026-04-19 13:24:24 -04:00
.needle-predispatch-sha P3: Verify Phase 3 Task Registry + Persistence completion 2026-05-02 17:33:24 -04:00
1 P7.5.a: Request ID middleware + X-Request-Id response header 2026-04-21 08:01:30 -04:00
Cargo.lock P3: Complete Phase 3 — Task Registry + Persistence (SQLite + Redis) 2026-05-02 16:52:25 -04:00
Cargo.toml P12.OP4: Implement dfs_query_then_fetch for cross-shard comparability 2026-04-19 03:43:10 -04:00
CHANGELOG.md P8: Finalize CI/CD templates, prod ArgoCD app, and CHANGELOG for v0.1.0 2026-04-19 15:09:14 -04:00
clippy.toml Add repo hygiene: LICENSE, CHANGELOG, .gitignore 2026-04-18 20:47:36 -04:00
Dockerfile Multi-stage Dockerfile with musl cross-compilation and .dockerignore 2026-04-19 13:47:45 -04:00
LICENSE Add repo hygiene: LICENSE, CHANGELOG, .gitignore 2026-04-18 20:47:36 -04:00
miroir.yaml P3.3.d: Fix compilation - add missing local_search_ui_rate_limiter field 2026-04-26 19:30:10 -04:00
README.md Add repo hygiene: LICENSE, CHANGELOG, .gitignore 2026-04-18 20:47:36 -04:00
rust-toolchain.toml Add repo hygiene: LICENSE, CHANGELOG, .gitignore 2026-04-18 20:47:36 -04:00
rustfmt.toml Add repo hygiene: LICENSE, CHANGELOG, .gitignore 2026-04-18 20:47:36 -04:00

Miroir

Multi-node Index Replication Orchestrator, Integrated Rebalancing

Miroir is a RAID-like orchestration layer for Meilisearch. It stripes a large index across a fleet of small-RAM Meilisearch nodes with a configurable replication factor, fans out search queries across all shards, and rebalances shard assignments when nodes are added or removed — all using the Meilisearch Community Edition.

The Problem

Meilisearch loads its entire index into memory-mapped LMDB files. A large index that exceeds a single server's available RAM cannot run on that server. The Enterprise Edition's native sharding is gated behind a commercial license. Miroir provides sharding on the Community Edition, without that license.

How It Works

Client
  │
  ▼
Miroir Orchestrator
  ├── Write path: hash(doc_id) → assign to shard → write to R replicas
  ├── Read path:  scatter query to all shards → gather → merge ranked results
  └── Rebalance: on node add/remove → recompute assignments → migrate minimum shards

Meilisearch Nodes (N instances, each holding a subset of shards)
  node-0   node-1   node-2   ...   node-N
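
The write path in the diagram (hash the document ID, pick a shard, write to R replicas) can be sketched with rendezvous (HRW) hashing, the scheme named under Key Components. Everything concrete here is an assumption for illustration: the function names, the modulo step that maps a document to a shard, the `shard-N` key format in the usage note, and the use of the standard library's `DefaultHasher`.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hashes a (key, node) pair. DefaultHasher's algorithm is not guaranteed to
/// stay the same across Rust releases, so a real router would need a fixed,
/// versioned hash function instead.
fn score(key: &str, node: &str) -> u64 {
    let mut h = DefaultHasher::new();
    key.hash(&mut h);
    node.hash(&mut h);
    h.finish()
}

/// Maps a document ID to a shard number (assumed scheme: hash modulo count).
pub fn shard_for(doc_id: &str, shard_count: u32) -> u32 {
    assert!(shard_count > 0, "shard_count must be non-zero");
    let mut h = DefaultHasher::new();
    doc_id.hash(&mut h);
    (h.finish() % u64::from(shard_count)) as u32
}

/// Rendezvous hashing: every node gets a score for the key, and the `rf`
/// highest-scoring nodes hold the replicas.
pub fn replicas_for<'a>(key: &str, nodes: &[&'a str], rf: usize) -> Vec<&'a str> {
    let mut scored: Vec<(u64, &'a str)> =
        nodes.iter().map(|&n| (score(key, n), n)).collect();
    // Highest score first; ties fall back to node name so output is deterministic.
    scored.sort_by(|a, b| b.0.cmp(&a.0).then(a.1.cmp(b.1)));
    scored.into_iter().take(rf).map(|(_, n)| n).collect()
}
```

For example, `replicas_for("shard-7", &["node-0", "node-1", "node-2"], 2)` returns two distinct node names. Because each node's score depends only on the key and that node, removing a node changes the assignment only for keys that had it among their top `rf`, which is the minimal-migration property the rebalancer relies on.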

Replication Factor

Analogous to software RAID — configurable per deployment:

| RF | Redundancy | Node failures tolerated | Capacity |
|----|------------|-------------------------|----------|
| 1 | None (stripe only) | 0 | 100% of fleet |
| 2 | One replica | 1 per shard group | 50% of fleet |
| 3 | Two replicas | 2 per shard group | 33% of fleet |
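
The capacity column follows from storing every shard RF times: usable capacity is the raw fleet capacity divided by RF, and each shard group tolerates RF - 1 node failures. A small sketch of that arithmetic, with hypothetical function names and GiB chosen arbitrarily as the unit:

```rust
/// Usable capacity in GiB for a fleet of identical nodes.
/// Returns None for rf == 0, which is not a meaningful replication factor.
/// Integer division rounds down, so the result is a conservative estimate.
pub fn usable_capacity_gib(node_count: u64, per_node_gib: u64, rf: u64) -> Option<u64> {
    if rf == 0 {
        return None;
    }
    Some(node_count * per_node_gib / rf)
}

/// Node failures each shard group can absorb without losing data.
pub fn tolerated_failures(rf: u64) -> Option<u64> {
    rf.checked_sub(1)
}
```

With six 8 GiB nodes (48 GiB raw), this gives 48 GiB at RF 1, 24 GiB at RF 2, and 16 GiB at RF 3.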

Key Components

  • Orchestrator — proxy that handles shard routing, scatter-gather, result merging, and topology management
  • Shard router — consistent hash function (Rendezvous/HRW) mapping document IDs to node assignments; minimal reshuffling on topology change
  • Rebalancer — on node add/remove, recomputes assignments and migrates only the shards that changed owners; surviving replicas serve reads during rebuild
  • Result merger — normalizes and merges ranked result sets from multiple shards into a single coherent response
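
As a rough illustration of the gather-and-merge step, the sketch below combines per-shard hit lists into one global top-k. The `Hit` type and `merge_top_k` function are hypothetical. The sketch also assumes that scores from different shards are already comparable and that each shard was queried on exactly one replica, so it does no normalization or de-duplication, both of which a real merger would have to handle.

```rust
/// A single search hit as returned by one shard (hypothetical shape).
#[derive(Debug, Clone, PartialEq)]
pub struct Hit {
    pub doc_id: String,
    pub score: f64,
}

/// Merges per-shard result lists into the `k` best hits overall.
pub fn merge_top_k(shard_results: Vec<Vec<Hit>>, k: usize) -> Vec<Hit> {
    let mut all: Vec<Hit> = shard_results.into_iter().flatten().collect();
    // Highest score first; ties are broken by document ID so the output is
    // stable regardless of which shard answered first.
    all.sort_by(|a, b| {
        b.score
            .total_cmp(&a.score)
            .then_with(|| a.doc_id.cmp(&b.doc_id))
    });
    all.truncate(k);
    all
}
```

Each shard only needs to return its own top k hits for the merged top k to be correct, since a document outside a shard's local top k cannot be in the global top k.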

Status

Design phase. See docs/ for architecture detail.