jedarden 67b44611c4 fix(ci): enable kafka-sink feature in CI build and Dockerfile
The kafka-sink Cargo feature existed but was not enabled in production builds,
causing all Kafka CDC events to be silently dropped at runtime.

Changes:
- Add --features miroir-core/kafka-sink to cargo-build in miroir-ci.yaml
- Update Dockerfile comments to reflect the expected build commands
- Add kafka_sink_feature.rs integration test with #[cfg(feature = "kafka-sink")]

The test verifies:
- Feature is enabled (compile-time check)
- CdcManager publish works with Kafka config
- Kafka sink config parses correctly

Fixes plan-gap: kafka-sink feature not enabled in CI build and Dockerfile
2026-05-31 12:07:48 -04:00
.beads feat(ttl): implement actual TTL sweep logic with NodeClient integration 2026-05-26 13:21:33 -04:00
.cargo Multi-stage Dockerfile with musl cross-compilation and .dockerignore 2026-04-19 13:47:45 -04:00
.config fix(marathon): enforce nextest + command timeouts to prevent loop stalls 2026-05-25 19:45:40 -04:00
.github docs(pr): improve PR templates for CHANGELOG discipline 2026-05-24 21:00:14 -04:00
.marathon fix(marathon): enforce nextest + command timeouts to prevent loop stalls 2026-05-25 19:45:40 -04:00
benches P12.OP4: Implement dfs_query_then_fetch for cross-shard comparability 2026-04-19 03:43:10 -04:00
charts/miroir fix(helm): enhance Helm connection test with comprehensive endpoint checks 2026-05-25 08:27:48 -04:00
coverage Phase 1 (miroir-cdo): Close bead - Core Routing complete 2026-05-09 11:38:45 -04:00
crates fix(ci): enable kafka-sink feature in CI build and Dockerfile 2026-05-31 12:07:48 -04:00
dashboards fix(dashboard): flatten panels structure for Grafana v10 compatibility 2026-05-24 19:22:53 -04:00
docs feat(search-ui): add i18n locales field to SearchUiIndexConfig (plan §13.21) 2026-05-31 12:02:07 -04:00
examples docs: add troubleshooting cross-links to production and examples guides 2026-05-25 02:55:06 -04:00
fuzz feat(tests): add property tests and fuzz targets for router, config, and parsers (plan §8, P9.6) 2026-05-24 11:41:48 -04:00
k8s fix(ci): enable kafka-sink feature in CI build and Dockerfile 2026-05-31 12:07:48 -04:00
notes Merge remote-tracking branch 'origin/master' 2026-05-24 05:21:32 -04:00
scripts feat(bench): add performance benchmarks and regression gate (P9.5) 2026-05-25 00:44:33 -04:00
tests test(integration): implement 7 Docker Compose end-to-end scenarios 2026-05-26 19:17:07 -04:00
.dockerignore Multi-stage Dockerfile with musl cross-compilation and .dockerignore 2026-04-19 13:47:45 -04:00
.editorconfig Add repo hygiene: LICENSE, CHANGELOG, .gitignore 2026-04-18 20:47:36 -04:00
.gitignore P8: Add optional OpenTelemetry tracing deps, fix subscriber init, clean up .gitignore 2026-04-19 13:24:24 -04:00
.needle-predispatch-sha Phase 5 — Advanced Capabilities: Mode A coordination and HPA custom metrics 2026-05-24 00:07:37 -04:00
.proptest P1.6: Verify property tests and benchmarks for router/merger 2026-05-23 13:03:54 -04:00
.tarpaulin.toml feat(multi-search): implement timeout enforcement and acceptance tests (§13.11) 2026-05-24 01:54:20 -04:00
1 P7.5.a: Request ID middleware + X-Request-Id response header 2026-04-21 08:01:30 -04:00
Cargo.lock fix(tests): fix syntax error in p10_5_scoped_key_rotation.rs 2026-05-26 14:09:07 -04:00
Cargo.toml Merge remote-tracking branch 'origin/master' 2026-05-24 05:21:32 -04:00
CHANGELOG.md P11.9 v1.0 versioning-commitments policy doc (§12) 2026-05-20 06:41:27 -04:00
clippy.toml Add repo hygiene: LICENSE, CHANGELOG, .gitignore 2026-04-18 20:47:36 -04:00
CONTRIBUTING.md docs: add CONTRIBUTING.md for development workflow and code submission 2026-05-25 07:43:40 -04:00
Dockerfile fix(ci): enable kafka-sink feature in CI build and Dockerfile 2026-05-31 12:07:48 -04:00
lcov.info Phase 1 (miroir-cdo): Core Routing - Final verification 2026-05-09 11:50:04 -04:00
librust_out.rlib P2.4 Index lifecycle endpoints: implementation verification 2026-05-23 22:28:33 -04:00
LICENSE Add repo hygiene: LICENSE, CHANGELOG, .gitignore 2026-04-18 20:47:36 -04:00
Makefile feat(multi-search): implement timeout enforcement and acceptance tests (§13.11) 2026-05-24 01:54:20 -04:00
miroir.yaml P2.1: Fix session_pinning blocking read and verify acceptance criteria 2026-05-23 12:19:10 -04:00
proptest.toml P1.6: Add proptest.toml for 1024 test cases 2026-05-20 08:07:00 -04:00
README.md docs(readme): add SDK configuration section (Phase 11 §11) 2026-05-25 07:26:08 -04:00
rust-toolchain.toml Phase 0 (miroir-qon): Rust 1.88 upgrade + test infrastructure 2026-05-09 02:05:44 -04:00
rustfmt.toml Add repo hygiene: LICENSE, CHANGELOG, .gitignore 2026-04-18 20:47:36 -04:00

Miroir

Multi-node Index Replication Orchestrator, Integrated Rebalancing

Miroir is a RAID-like orchestration layer for Meilisearch. It stripes a large index across a fleet of small-RAM Meilisearch nodes with a configurable replication factor, fans out search queries across all shards, and rebalances shard assignments when nodes are added or removed — all using the Meilisearch Community Edition.

The Problem

Meilisearch loads its entire index into memory-mapped LMDB files, so a large index that exceeds a single server's available RAM cannot run on that server. The Enterprise Edition's native sharding is gated behind a commercial license. Miroir provides sharding on the Community Edition, without that license.

How It Works

Client
  │
  ▼
Miroir Orchestrator
  ├── Write path: hash(doc_id) → assign to shard → write to R replicas
  ├── Read path:  scatter query to all shards → gather → merge ranked results
  └── Rebalance: on node add/remove → recompute assignments → migrate minimum shards

Meilisearch Nodes (N instances, each holding a subset of shards)
  node-0   node-1   node-2   ...   node-N
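
The read path in the diagram above (scatter, gather, merge) can be sketched roughly as follows. This is a minimal illustration, not Miroir's implementation: the in-memory `SHARDS` table, the `score` field, and the function names are all assumptions made for the example, and a real orchestrator would query Meilisearch nodes over HTTP.

```python
from concurrent.futures import ThreadPoolExecutor
import heapq

# Hypothetical in-memory stand-ins for shards. Each hit carries an assumed
# "score" field that is comparable across shards.
SHARDS = {
    "shard-0": [
        {"id": 1, "title": "Inception", "score": 0.97},
        {"id": 7, "title": "Insomnia", "score": 0.41},
    ],
    "shard-1": [{"id": 2, "title": "Interstellar", "score": 0.63}],
    "shard-2": [],
}


def search_shard(shard_id, limit):
    """Return one shard's top hits, sorted by descending score."""
    hits = sorted(SHARDS[shard_id], key=lambda h: h["score"], reverse=True)
    return hits[:limit]


def scatter_gather(limit):
    """Query every shard concurrently, then merge into one ranked list."""
    with ThreadPoolExecutor(max_workers=len(SHARDS)) as pool:
        per_shard = list(pool.map(lambda s: search_shard(s, limit), SHARDS))
    # heapq.merge keeps the combined stream in descending score order,
    # provided each input list is already sorted the same way.
    merged = heapq.merge(*per_shard, key=lambda h: h["score"], reverse=True)
    return list(merged)[:limit]


top = scatter_gather(limit=2)
print([hit["title"] for hit in top])
```

Each shard returns its own top `limit` hits, which is enough to guarantee the global top `limit` after merging, as long as scores from different shards are comparable.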

Replication Factor

The replication factor (RF) is configurable per deployment and works much like a software RAID level:

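| RF | Redundancy         | Node failures tolerated | Usable capacity |
|----|--------------------|-------------------------|-----------------|
| 1  | None (stripe only) | 0                       | 100% of fleet   |
| 2  | One replica        | 1 per shard group       | 50% of fleet    |
| 3  | Two replicas       | 2 per shard group       | 33% of fleet    |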
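
The arithmetic behind the table can be written down as a small helper. This is a sketch under the assumption that every node contributes the same amount of RAM; the function names and the 6-node, 8 GB figures are illustrative only.

```python
def usable_capacity_gb(node_count, ram_per_node_gb, rf):
    """Fleet RAM available for unique data when each shard is stored rf times."""
    if rf < 1 or rf > node_count:
        raise ValueError("rf must be between 1 and node_count")
    return node_count * ram_per_node_gb / rf


def failures_tolerated(rf):
    """Node failures a shard group survives while keeping one live copy."""
    return rf - 1


for rf in (1, 2, 3):
    print(rf, usable_capacity_gb(6, 8, rf), failures_tolerated(rf))
```

For a hypothetical fleet of six 8 GB nodes, RF 2 leaves 24 GB for unique data and RF 3 leaves 16 GB, matching the 50% and 33% rows.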

Key Components

  • Orchestrator — proxy that handles shard routing, scatter-gather, result merging, and topology management
  • Shard router — consistent hash function (Rendezvous/HRW) mapping document IDs to node assignments; minimal reshuffling on topology change
  • Rebalancer — on node add/remove, recomputes assignments and migrates only the shards that changed owners; surviving replicas serve reads during rebuild
  • Result merger — normalizes and merges ranked result sets from multiple shards into a single coherent response
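
To make the shard router's "minimal reshuffling" property concrete, here is a minimal sketch of Rendezvous (HRW) hashing. The SHA-256 scoring function, the `owners` helper, and the 4-node, 64-shard setup are assumptions chosen for the example and are not taken from Miroir's source.

```python
import hashlib


def hrw_score(node, key):
    """Deterministic pseudo-random weight for a (node, key) pair."""
    digest = hashlib.sha256(f"{node}:{key}".encode()).digest()
    return int.from_bytes(digest[:8], "big")


def owners(key, nodes, rf):
    """Pick the rf highest-scoring nodes for this key."""
    ranked = sorted(nodes, key=lambda n: hrw_score(n, key), reverse=True)
    return ranked[:rf]


nodes = [f"node-{i}" for i in range(4)]
shards = [f"shard-{i}" for i in range(64)]

before = {s: owners(s, nodes, rf=2) for s in shards}
after = {s: owners(s, nodes + ["node-4"], rf=2) for s in shards}

# A shard's owner set changes only if the new node ranks into its top rf.
moved = [s for s in shards if before[s] != after[s]]
print(f"{len(moved)} of {len(shards)} shards changed owners")
```

Because a node's score for a key does not depend on which other nodes exist, adding `node-4` never reorders the existing nodes relative to each other. Every shard that moves gains `node-4` as an owner, and all other assignments stay put.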

Feature Matrix

Miroir implements 21 advanced capabilities (plan §13) that sit entirely within the orchestrator layer. Every Meilisearch node runs unmodified Community Edition — no patches, no forks, no custom builds.

| Capability | Description | Default |
|------------|-------------|---------|
| §13.1 Online resharding | Change shard count without reindex via shadow index | on |
| §13.2 Hedged requests | Tail-latency mitigation via duplicate requests to alternate replicas | on |
| §13.3 Adaptive replica selection | EWMA-based routing to lowest-latency nodes | on |
| §13.4 Shard-aware query planner | Narrow fan-out for PK-constrained searches | on |
| §13.5 Two-phase settings broadcast | Atomic settings changes with verification | on |
| §13.6 Read-your-writes | Session pinning for immediate consistency | on |
| §13.7 Atomic index aliases | Blue-green reindexing and multi-target aliases | on |
| §13.8 Anti-entropy reconciler | Continuous shard repair and drift detection | on |
| §13.9 Streaming dump import | Route documents during import (no broadcast) | on |
| §13.10 Idempotency keys | Request deduplication and query coalescing | on |
| §13.11 Multi-search | Batch API for multiple queries in one round-trip | on |
| §13.12 Vector + hybrid search | Over-fetch with RRF/convex merging for correct global ranking | on |
| §13.13 CDC stream | Change data capture to webhook/NATS/Kafka/internal queue | on |
| §13.14 Document TTL | Automatic expiration with background sweeper | on |
| §13.15 Tenant affinity | Route tenant queries to dedicated replica groups | on |
| §13.16 Traffic shadow | Async request tee to shadow cluster with diff analysis | on |
| §13.17 ILM | Rolling time-series indexes with rollover policies | on |
| §13.18 Canary queries | Synthetic queries with golden assertions for relevance testing | on |
| §13.19 Admin Web UI | Embedded SPA for topology, config, query debugging, operations | on |
| §13.20 Query explain | Debug routing decisions and warnings without executing | on |
| §13.21 End-user Search UI | Embedded instant-search SPA with facets, keyboard nav, i18n | on |

See docs/plan/plan.md#13-advanced-capabilities for detailed design of each capability.
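
As one illustration of the table above, the RRF merging mentioned under §13.12 can be sketched with the standard Reciprocal Rank Fusion formula, where each document scores the sum of 1 / (k + rank) over the lists it appears in. The constant `k = 60`, the function name, and the sample document IDs are assumptions for this example; how Miroir applies RRF internally is described in the plan document, not here.

```python
def rrf_merge(ranked_lists, k=60, limit=10):
    """Fuse several ranked ID lists with Reciprocal Rank Fusion."""
    scores = {}
    for ranking in ranked_lists:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID for determinism.
    return sorted(scores, key=lambda d: (-scores[d], str(d)))[:limit]


keyword_hits = ["a", "b", "c"]  # hypothetical keyword ranking
vector_hits = ["b", "d", "a"]   # hypothetical vector ranking

fused = rrf_merge([keyword_hits, vector_hits], limit=3)
print(fused)
```

RRF uses only rank positions, so it does not need keyword and vector scores to share a scale. That is one reason it suits merging results from independent sources.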

Stability

Miroir is currently in development (v0.x). Starting with v1.0, the project will provide backward-compatibility commitments for the Meilisearch API layer, the miroir-ctl CLI, the config file schema, and the Helm chart values.

See docs/versioning-policy.md for the full versioning policy, including what constitutes a breaking change and the deprecation process.

Documentation

Quick Start

Get Miroir running locally in 5 minutes with Docker Compose:

# Clone the repository
git clone https://github.com/jedarden/miroir.git
cd miroir

# Start the development stack (3 Meilisearch nodes + 1 Miroir orchestrator)
docker compose -f examples/docker-compose-dev.yml up -d

# Verify health
curl http://localhost:7700/health
# Expected: {"status":"available"}

# Index documents (Meilisearch-compatible API)
curl -X POST http://localhost:7700/indexes/movies/documents \
  -H "Authorization: Bearer dev-key" \
  -H "Content-Type: application/json" \
  -d '[{"id": 1, "title": "Inception"}, {"id": 2, "title": "Interstellar"}]'

# Search
curl -X POST http://localhost:7700/indexes/movies/search \
  -H "Authorization: Bearer dev-key" \
  -H "Content-Type: application/json" \
  -d '{"q": "inception"}'

# Teardown (removes containers and volumes)
docker compose -f examples/docker-compose-dev.yml down -v

See examples/README.md for more details on the development stack, configuration options, and troubleshooting.
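
The same index and search calls can be prepared from Python using only the standard library. This is a sketch that mirrors the curl commands above; the `build_request` and `send` helpers are hypothetical names, and the base URL and `dev-key` token are the ones from the Quick Start.

```python
import json
import urllib.request

BASE_URL = "http://localhost:7700"  # orchestrator address from the Quick Start
API_KEY = "dev-key"                 # development key from the Quick Start


def build_request(path, payload):
    """Build a JSON POST request with the same headers as the curl examples."""
    return urllib.request.Request(
        url=f"{BASE_URL}{path}",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {API_KEY}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request):
    """Send a prepared request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.loads(response.read())


index_req = build_request(
    "/indexes/movies/documents",
    [{"id": 1, "title": "Inception"}, {"id": 2, "title": "Interstellar"}],
)
search_req = build_request("/indexes/movies/search", {"q": "inception"})
```

With the development stack running, `send(index_req)` followed by `send(search_req)` issues the two requests. Nothing is sent until `send` is called.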

SDK Configuration

Migrating your existing Meilisearch SDK code to Miroir requires changing only the endpoint URL and the API key. All other SDK code (index operations, document CRUD, search queries) remains unchanged.

Python

import meilisearch

# Before — single-node Meilisearch
client = meilisearch.Client('https://old-meili.example.com', 'api-key')

# After — Miroir
client = meilisearch.Client('https://search.example.com', 'miroir-master-key')

TypeScript / JavaScript

import { MeiliSearch } from 'meilisearch'

// Before — single-node Meilisearch
const client = new MeiliSearch({
  host: 'https://old-meili.example.com',
  apiKey: 'api-key'
})

// After — Miroir
const client = new MeiliSearch({
  host: 'https://search.example.com',
  apiKey: 'miroir-master-key'
})

Go

import "github.com/meilisearch/meilisearch-go"

// Before — single-node Meilisearch
client := meilisearch.NewClient(meilisearch.ClientConfig{
    Host:   "https://old-meili.example.com",
    APIKey: "api-key",
})

// After — Miroir
client := meilisearch.NewClient(meilisearch.ClientConfig{
    Host:   "https://search.example.com",
    APIKey: "miroir-master-key",
})

That's it — no other code changes required. Miroir presents the same Meilisearch-compatible API surface to all official SDKs.

Production deployment

For production deployments, see the Deployment Sizing Guide to determine orchestrator pod count and task store configuration based on your corpus size and query throughput.

When to use

  • Multi-pod with Redis — Recommended for production. Horizontal scaling with 2+ orchestrator pods delivers fault tolerance (zero-downtime rollouts, pod-loss survival) and scales query throughput via HPA. See Deployment Sizing Guide.

  • Single oversized pod — Supported for dev clusters, very small deployments, or constrained environments. A single pod at 4 vCPU / 8 GB is validated but loses HA benefits (no zero-downtime rollouts, no pod-loss survival). See Single-Pod Mode.

  • Large index sharding — When a single Meilisearch node cannot fit your corpus in RAM, Miroir stripes it across multiple nodes with configurable replication factor.

Community

  • Issues — Bug reports and feature requests
  • Discussions — Q&A and design discussions
  • Contributing — Development workflow and code submission guidelines

License

MIT License — see LICENSE for details.