jedarden ebc300355c	P2.3: Implement scatter-gather search with group fallback	2026-04-19 06:40:04 -04:00

Implement the search read path with scatter-gather + merge + group selection:

1. Group-unavailability fallback: When a shard has no available replica in the primary group, the Fallback policy tries other replica groups before failing. This provides full results (not degraded) when an alternate group is healthy.
2. X-Miroir-Degraded header: Now includes actual shard IDs in the format "X-Miroir-Degraded: shards=3,7,11" instead of just "partial".
3. Acceptance tests for P2.3:
   - Unique-keyword search deduplicates correctly (RRF)
   - Facet counts sum across shards
   - Paging with no dupes/gaps
   - Node down with RF=2 still covers all shards
   - Group down falls back to other group (not degraded)
   - Degraded header includes actual shard IDs

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

.beads	Fix clippy warnings, improve test robustness, and clean up proxy code	2026-04-19 04:53:45 -04:00
.cargo	P1.5: Implement scatter module with covering-set construction + dispatch trait	2026-04-19 00:20:29 -04:00
benches	P12.OP4: Implement dfs_query_then_fetch for cross-shard comparability	2026-04-19 03:43:10 -04:00
charts/miroir	P3.5: Add values.schema.json constraint for replicas>1 requires Redis	2026-04-18 23:44:15 -04:00
crates	P2.3: Implement scatter-gather search with group fallback	2026-04-19 06:40:04 -04:00
docs	P12.OP4.1: Validate dfs_query_then_fetch benchmark (τ=0.9817) and document latency	2026-04-19 05:31:13 -04:00
tests/benches/score-comparability	P12.OP4: Validate RRF merge quality — τ=0.14 confirms DFS preflight is required	2026-04-19 05:43:42 -04:00
.editorconfig	Add repo hygiene: LICENSE, CHANGELOG, .gitignore	2026-04-18 20:47:36 -04:00
.gitignore	P12.OP4: Finalize score normalization validation — RRF τ=0.14, score τ=0.79	2026-04-19 02:40:54 -04:00
.needle-predispatch-sha	Fix clippy warnings, improve test robustness, and clean up proxy code	2026-04-19 04:53:45 -04:00
Cargo.lock	Integrate MeilisearchError into proxy (IntoResponse, auth middleware) + telemetry	2026-04-19 05:21:09 -04:00
Cargo.toml	P12.OP4: Implement dfs_query_then_fetch for cross-shard comparability	2026-04-19 03:43:10 -04:00
CHANGELOG.md	Add repo hygiene: LICENSE, CHANGELOG, .gitignore	2026-04-18 20:47:36 -04:00
clippy.toml	Add repo hygiene: LICENSE, CHANGELOG, .gitignore	2026-04-18 20:47:36 -04:00
LICENSE	Add repo hygiene: LICENSE, CHANGELOG, .gitignore	2026-04-18 20:47:36 -04:00
miroir.yaml	P2.1: Implement axum server skeleton with health/version/ready/topology/shards/metrics endpoints	2026-04-19 06:12:05 -04:00
README.md	Add repo hygiene: LICENSE, CHANGELOG, .gitignore	2026-04-18 20:47:36 -04:00
rust-toolchain.toml	Add repo hygiene: LICENSE, CHANGELOG, .gitignore	2026-04-18 20:47:36 -04:00
rustfmt.toml	Add repo hygiene: LICENSE, CHANGELOG, .gitignore	2026-04-18 20:47:36 -04:00

README.md

Miroir

Multi-node Index Replication Orchestrator, Integrated Rebalancing

Miroir is a RAID-like orchestration layer for Meilisearch. It stripes a large index across a fleet of small-RAM Meilisearch nodes with a configurable replication factor, fans out search queries across all shards, and rebalances shard assignments when nodes are added or removed — all using the Meilisearch Community Edition.

The Problem

Meilisearch stores its entire index in memory-mapped LMDB files. A large index that exceeds a single server's available RAM cannot run on that server. The Enterprise Edition's native sharding is gated behind a commercial license. Miroir solves the problem without requiring that license.

How It Works

Client
  │
  ▼
Miroir Orchestrator
  ├── Write path: hash(doc_id) → assign to shard → write to R replicas
  ├── Read path:  scatter query to all shards → gather → merge ranked results
  └── Rebalance: on node add/remove → recompute assignments → migrate minimum shards

Meilisearch Nodes (N instances, each holding a subset of shards)
  node-0   node-1   node-2   ...   node-(N-1)
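
The "gather → merge ranked results" step of the read path can be illustrated with a small sketch. The README does not name the merge algorithm; the commit log mentions RRF, so this example assumes reciprocal rank fusion. The function name `rrf_merge`, the constant `k`, and the use of plain document IDs as hits are all hypothetical and are not taken from the Miroir codebase.

```rust
use std::collections::HashMap;

/// Merge per-shard ranked lists with reciprocal rank fusion (RRF).
///
/// Each inner Vec holds document IDs from one shard, ordered best-first.
/// A document's fused score is the sum of 1 / (k + rank) over every list
/// it appears in, with rank starting at 1. A document returned by more
/// than one list (for example by two replicas) is therefore deduplicated
/// into a single entry with an accumulated score.
pub fn rrf_merge(shard_results: &[Vec<String>], k: f64, limit: usize) -> Vec<(String, f64)> {
    let mut scores: HashMap<&str, f64> = HashMap::new();
    for hits in shard_results {
        for (idx, doc_id) in hits.iter().enumerate() {
            let rank = (idx as f64) + 1.0;
            *scores.entry(doc_id.as_str()).or_insert(0.0) += 1.0 / (k + rank);
        }
    }

    let mut merged: Vec<(String, f64)> = scores
        .into_iter()
        .map(|(id, score)| (id.to_string(), score))
        .collect();

    // Highest score first; ties broken by document ID so paging is deterministic.
    merged.sort_by(|a, b| b.1.total_cmp(&a.1).then_with(|| a.0.cmp(&b.0)));
    merged.truncate(limit);
    merged
}
```

Because RRF uses only rank positions, it ignores the raw per-shard scores; whether that is adequate for a given workload is a design question this sketch does not settle.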

Replication Factor

The replication factor (RF) is analogous to a software RAID level and is configurable per deployment:

RF	Redundancy	Node failures tolerated	Usable capacity
1	None (stripe only)	0	100% of fleet
2	One replica	1 per shard group	50% of fleet
3	Two replicas	2 per shard group	33% of fleet
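
The arithmetic behind the table can be written out as a short sketch: each shard is stored RF times, so the usable share of the fleet is 1/RF and each shard can lose RF − 1 copies before it becomes unavailable. The function names are hypothetical and chosen only for illustration.

```rust
/// Fraction of total fleet storage that holds unique data when every
/// shard is stored `rf` times (1.0 for RF=1, 0.5 for RF=2, and so on).
pub fn usable_capacity_fraction(rf: u32) -> f64 {
    assert!(rf >= 1, "replication factor must be at least 1");
    1.0 / f64::from(rf)
}

/// Number of replica losses a single shard can absorb while still
/// keeping at least one live copy.
pub fn failures_tolerated_per_shard(rf: u32) -> u32 {
    assert!(rf >= 1, "replication factor must be at least 1");
    rf - 1
}
```

For example, a fleet of six nodes with 8 GB each at RF=2 would hold roughly 24 GB of unique index data under this model, ignoring any per-node overhead.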

Key Components

Orchestrator — proxy that handles shard routing, scatter-gather, result merging, and topology management
Shard router — consistent hash function (Rendezvous/HRW) mapping document IDs to node assignments; minimal reshuffling on topology change
Rebalancer — on node add/remove, recomputes assignments and migrates only the shards that changed owners; surviving replicas serve reads during rebuild
Result merger — normalizes and merges ranked result sets from multiple shards into a single coherent response
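
To make the shard router's "minimal reshuffling" property concrete, here is a minimal sketch of Rendezvous (HRW) placement. It assumes a two-step mapping (document ID to shard, then shard to its top-RF nodes), which matches the write path described above but is not confirmed as Miroir's actual implementation. The names `shard_for_doc` and `replicas_for_shard` are hypothetical, and `DefaultHasher` stands in for whatever stable hash a real deployment would need.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hash any hashable value to a u64.
/// Assumption: DefaultHasher is used for brevity only; its output is not
/// guaranteed stable across Rust releases, so a real router would pin a
/// specific hash function.
fn hash_of<T: Hash>(value: &T) -> u64 {
    let mut hasher = DefaultHasher::new();
    value.hash(&mut hasher);
    hasher.finish()
}

/// Map a document ID to one of `shard_count` shards.
pub fn shard_for_doc(doc_id: &str, shard_count: u32) -> u32 {
    (hash_of(&doc_id) % u64::from(shard_count)) as u32
}

/// Rendezvous hashing: score every node against the shard and keep the
/// `rf` highest-scoring nodes as that shard's replicas.
pub fn replicas_for_shard<'a>(shard: u32, nodes: &[&'a str], rf: usize) -> Vec<&'a str> {
    let mut scored: Vec<(u64, &'a str)> = nodes
        .iter()
        .map(|&node| (hash_of(&(shard, node)), node))
        .collect();
    // Highest score wins; node name breaks the (unlikely) score tie.
    scored.sort_by(|a, b| b.0.cmp(&a.0).then_with(|| a.1.cmp(b.1)));
    scored.into_iter().take(rf).map(|(_, node)| node).collect()
}
```

Because each node's score for a shard is independent of the other nodes, removing a node only affects the shards that node was serving: the surviving replicas keep their place and the next-highest scorer is promoted.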

Status

Design phase. See docs/ for architecture detail.