# Deployment Sizing Guide

This guide provides a sizing matrix for provisioning Miroir orchestrator pods based on corpus size and query throughput. Meilisearch nodes are sized separately, following the guidance in [plan.md §6](../../plan/plan.md).

For per-feature scaling behavior (which capabilities need Redis, work queues, or nothing), see [Per-Feature Scaling Behavior](per-feature.md).

## Sizing matrix

| Corpus | Peak QPS | Orchestrator pods | Task store |
|--------|----------|-------------------|------------|
| ≤ 10 GB | ≤ 500 | 2 (HA) | Redis (or SQLite if replicas=1) |
| ≤ 50 GB | ≤ 2 k | 2–4 (HPA) | Redis |
| ≤ 200 GB | ≤ 5 k | 4–8 (HPA) | Redis |
| ≤ 1 TB | ≤ 20 k | 8–12 (HPA) | Redis |
| ≤ 5 TB | ≤ 100 k | 12–24 (HPA) | Redis (clustered or Sentinel) |

**Orchestrator pod specification:** 2 vCPU / 3.75 GB per pod.

**Key insight:** Orchestrator count scales with *query throughput*, while Meilisearch node count scales with *corpus size* and replication factor. These are orthogonal axes.

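
One way to read the matrix is as a lookup: take the first row whose corpus and QPS ceilings both cover the workload. The Python sketch below assumes that reading and treats 1 TB as 1,000 GB. The `Tier` type and `pick_tier` function are hypothetical names used for illustration only; they are not part of Miroir.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Tier:
    max_corpus_gb: int  # corpus ceiling for the row (assumes 1 TB = 1,000 GB)
    max_qps: int        # peak-QPS ceiling for the row
    min_pods: int       # lower bound of the orchestrator pod range
    max_pods: int       # upper bound of the orchestrator pod range
    task_store: str


# Rows copied from the sizing matrix, smallest tier first.
TIERS = [
    Tier(10, 500, 2, 2, "Redis (or SQLite if replicas=1)"),
    Tier(50, 2_000, 2, 4, "Redis"),
    Tier(200, 5_000, 4, 8, "Redis"),
    Tier(1_000, 20_000, 8, 12, "Redis"),
    Tier(5_000, 100_000, 12, 24, "Redis (clustered or Sentinel)"),
]


def pick_tier(corpus_gb: float, peak_qps: float) -> Optional[Tier]:
    """Return the smallest tier covering both axes, or None if off the matrix."""
    for tier in TIERS:
        if corpus_gb <= tier.max_corpus_gb and peak_qps <= tier.max_qps:
            return tier
    return None
```

Under these assumptions, `pick_tier(5, 3_000)` lands on the 4–8 pod row even though the corpus is small, which reflects the point that orchestrator count follows throughput rather than corpus size.
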
## Task-store memory accounting

When Redis is the task store, it backs shared state for multiple subsystems:

- **Idempotency replay keys:** request deduplication cache
- **Session pinning rows:** per-session sticky routing state
- **Alias cache:** atomic index alias mappings
- **Background job queue:** chunked dump import, reshard backfill jobs
- **Leader lease:** HA coordinator election state
- **CDC overflow buffer:** change-data-capture spill-over (when configured)
- **Search UI rate-limit buckets:** IP-based rate limiter state

Add **~20 MB for search UI rate-limit buckets per 10k active IPs** on top of the task-store baseline when `search_ui.rate_limit.backend: redis`. Bucket rows auto-expire per `redis_ttl_s`, keeping the footprint bounded even under IP-scan or spray attacks.

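
The rate-limit allowance can be turned into a quick estimate. This is a rough sketch that assumes the footprint grows linearly with the number of active IPs, which the guidance above implies but does not state outright. The function name `rate_limit_overhead_mb` is hypothetical.

```python
def rate_limit_overhead_mb(active_ips: int, mb_per_10k_ips: float = 20.0) -> float:
    """Estimate extra Redis memory (MB) for search UI rate-limit buckets.

    Assumes linear scaling from the ~20 MB per 10k active IPs figure.
    """
    if active_ips < 0:
        raise ValueError("active_ips must be non-negative")
    return active_ips / 10_000 * mb_per_10k_ips
```

For instance, 50,000 active IPs would add roughly 100 MB on top of the task-store baseline under this assumption.
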
## Worked example: ≤ 200 GB / ≤ 5 k QPS

**From the table:** 4–8 orchestrator pods (HPA-scaled between these limits), Redis task store.

### Memory budget validation

Per [plan.md §14.2](../../plan/plan.md#142-per-pod-memory-budget), steady-state per-pod memory at default config:

| Component | Budget |
|-----------|--------|
| Baseline (runtime + pools) | ~330 MB |
| Request/response buffers (p99 concurrent) | 200 MB |
| Task registry + idempotency + sessions | 250 MB |
| Feature overhead (§13.11–21 features) | ~200 MB |
| Allocator headroom | ~200 MB |
| **Steady-state total (idle background)** | **~1.2 GB** |
| **With one heavy background job active** | **~1.7 GB** |

At 4–8 pods, cluster-wide steady-state orchestrator memory is 4.8–9.6 GB (~1.2 GB per pod), and each pod sits well below its 3.75 GB limit. The shared task-store state at this tier fits within a single Redis instance (allocate 4–8 GB for Redis). Heavy background jobs run only on the pod that claimed them from the shared queue, not on all pods simultaneously, so the per-pod budget stays within the envelope.

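
The cluster-wide figures follow directly from the per-pod totals. The Python sketch below reproduces that arithmetic using only the two totals and the pod specification quoted in this guide. The constant and function names are hypothetical.

```python
POD_LIMIT_GB = 3.75     # per-pod memory from the orchestrator pod specification
STEADY_STATE_GB = 1.2   # steady-state total with idle background work
HEAVY_JOB_GB = 1.7      # total with one heavy background job active


def cluster_steady_state_gb(pods: int) -> float:
    """Cluster-wide steady-state orchestrator memory for a given pod count."""
    return round(pods * STEADY_STATE_GB, 2)


def heavy_job_headroom_gb() -> float:
    """Memory left on a pod that is running one heavy background job."""
    return round(POD_LIMIT_GB - HEAVY_JOB_GB, 2)
```

With 4 and 8 pods this gives 4.8 GB and 9.6 GB, and a pod running a heavy job still has about 2 GB of headroom against the 3.75 GB specification.
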
### Query throughput validation

Per [plan.md §14.3](../../plan/plan.md#143-per-pod-cpu-budget):

- **Small searches** (1 KB responses): ~3 kQPS per pod at 70% CPU
- **Large searches** (10 KB responses): ~1 kQPS per pod at 70% CPU

At 5 kQPS peak with 4–8 pods:

- Each pod handles 625 QPS (at 8 pods) to 1,250 QPS (at 4 pods)
- This is within the 1–3 kQPS per-pod envelope for small and mixed responses; a workload dominated by large responses (~1 kQPS per pod) needs at least 5 pods at peak
- HPA will scale up toward 8 pods during sustained peak

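
The per-pod load and the minimum pod count can be checked with a few lines of arithmetic. This Python sketch uses the per-pod capacities quoted above (~3 kQPS for small responses, ~1 kQPS for large ones). The helper names are hypothetical.

```python
import math


def per_pod_qps(peak_qps: float, pods: int) -> float:
    """Load each pod carries if traffic is spread evenly."""
    return peak_qps / pods


def min_pods(peak_qps: float, per_pod_capacity_qps: float) -> int:
    """Smallest pod count that keeps each pod at or below its capacity."""
    return math.ceil(peak_qps / per_pod_capacity_qps)
```

At 5 kQPS, `per_pod_qps` gives 1,250 QPS at 4 pods and 625 QPS at 8 pods, while `min_pods` gives 2 pods for small responses and 5 pods for large ones. Both results sit inside the 4–8 pod HPA range.
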
### Result

The ≤ 200 GB / ≤ 5 k QPS row is validated: 4–8 orchestrator pods with Redis task store can handle the workload within the 2 vCPU / 3.75 GB per-pod resource envelope.

## When to escalate

If your workload exceeds the ≤ 5 TB / ≤ 100 k QPS tier, or if you're hitting resource limits despite following this matrix, consider:

1. **Vertical scaling escape valve:** For constrained environments, a single larger pod (e.g., 4 vCPU / 8 GB) is supported but not recommended for production. See [plan.md §14.10](../../plan/plan.md#1410-vertical-scaling-escape-valve).

2. **Redis clustering:** At very high scale, consider Redis Cluster or Sentinel for the task store. Redis Cluster shards keys across nodes to distribute memory pressure; Sentinel adds automatic failover to improve availability.

3. **Tune feature flags:** Disable unused capabilities (e.g., multi-search, vector search, shadow tee) to reclaim memory. Every feature in [plan.md §13](../../plan/plan.md#13-feature-scaling-modes) has an `enabled` flag that can be set to `false`.

4. **Consult the metrics:** Use the resource-pressure metrics in [plan.md §14.9](../../plan/plan.md#149-resource-pressure-metrics-and-alerts) to identify the bottleneck (CPU, memory, request queue depth, or background job backlog).