# Deployment Sizing Guide
This guide provides a sizing matrix for provisioning Miroir orchestrator pods based on corpus size and peak query throughput. Meilisearch nodes are sized separately, following the guidance in [plan.md §6](../../plan/plan.md).
For per-feature scaling behavior (which capabilities need Redis, work queues, or nothing), see [Per-Feature Scaling Behavior](per-feature.md).
## Sizing matrix
| Corpus | Peak QPS | Orchestrator pods | Task store |
|--------|----------|-------------------|------------|
| ≤ 10 GB | ≤ 500 | 2 (HA) | Redis (or SQLite if replicas=1) |
| ≤ 50 GB | ≤ 2 k | 2–4 (HPA) | Redis |
| ≤ 200 GB | ≤ 5 k | 4–8 (HPA) | Redis |
| ≤ 1 TB | ≤ 20 k | 8–12 (HPA) | Redis |
| ≤ 5 TB | ≤ 100 k | 12–24 (HPA) | Redis (clustered or Sentinel) |

**Orchestrator pod specification:** 2 vCPU / 3.75 GB per pod.

**Key insight:** Orchestrator count scales with *query throughput*, while Meilisearch node count scales with *corpus size* and replication factor. These are orthogonal axes.
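
The matrix can be read as a simple lookup. The sketch below is illustrative only and not part of Miroir: the `Tier` and `pick_tier` names are hypothetical, 1 TB is treated as 1,000 GB, and it assumes the intended reading is "pick the smallest row whose corpus *and* QPS limits both cover the workload".

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Tier:
    max_corpus_gb: int
    max_peak_qps: int
    min_pods: int
    max_pods: int
    task_store: str


# Rows transcribed from the sizing matrix (1 TB taken as 1,000 GB, an assumption).
TIERS = [
    Tier(10, 500, 2, 2, "Redis (or SQLite if replicas=1)"),
    Tier(50, 2_000, 2, 4, "Redis"),
    Tier(200, 5_000, 4, 8, "Redis"),
    Tier(1_000, 20_000, 8, 12, "Redis"),
    Tier(5_000, 100_000, 12, 24, "Redis (clustered or Sentinel)"),
]


def pick_tier(corpus_gb: float, peak_qps: float) -> Optional[Tier]:
    """Return the smallest row whose corpus and QPS limits both cover the workload."""
    for tier in TIERS:
        if corpus_gb <= tier.max_corpus_gb and peak_qps <= tier.max_peak_qps:
            return tier
    return None  # beyond the largest tier; see "When to escalate"


tier = pick_tier(corpus_gb=150, peak_qps=4_000)
if tier is not None:
    print(f"{tier.min_pods}-{tier.max_pods} pods, task store: {tier.task_store}")
```

Because the two axes are independent, a small corpus with high traffic (for example 5 GB at 3 k QPS) lands on the row chosen by its QPS, not by its corpus size.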
## Task-store memory accounting
When Redis is the task store, it backs shared state for multiple subsystems:
- **Idempotency replay keys** — request deduplication cache
- **Session pinning rows** — per-session sticky routing state
- **Alias cache** — atomic index alias mappings
- **Background job queue** — chunked dump import, reshard backfill jobs
- **Leader lease** — HA coordinator election state
- **CDC overflow buffer** — change-data-capture spill-over (when configured)
- **Search UI rate-limit buckets** — IP-based rate limiter state
Add **~20 MB for search UI rate-limit buckets per 10k active IPs** on top of the task-store baseline when `search_ui.rate_limit.backend: redis`. Bucket rows auto-expire per `redis_ttl_s`, keeping the footprint bounded even under IP-scan or spray attacks.
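
As a rough aid for the rate-limit figure above, the following sketch converts an active-IP count into an approximate Redis overhead. The function name is hypothetical, and linear scaling between and beyond the quoted 10k-IP data point is an assumption, not something this guide states.

```python
# Approximate figure quoted in this guide: ~20 MB per 10k active IPs.
RATE_LIMIT_MB_PER_10K_IPS = 20


def rate_limit_overhead_mb(active_ips: int) -> float:
    """Estimate extra Redis memory (MB) for rate-limit buckets.

    Assumes the footprint grows linearly with the number of active IPs.
    """
    return active_ips / 10_000 * RATE_LIMIT_MB_PER_10K_IPS


print(rate_limit_overhead_mb(50_000))  # 100.0
```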
## Worked example: ≤ 200 GB / ≤ 5 k QPS
**From the table:** 4–8 orchestrator pods (HPA-scaled between these limits), Redis task store.
### Memory budget validation
Per [plan.md §14.2](../../plan/plan.md#142-per-pod-memory-budget), steady-state per-pod memory at default config:
| Component | Budget |
|-----------|--------|
| Baseline (runtime + pools) | ~330 MB |
| Request/response buffers (p99 concurrent) | 200 MB |
| Task registry + idempotency + sessions | 250 MB |
| Feature overhead (§13.11–21 features) | ~200 MB |
| Allocator headroom | ~800 MB |
| **Steady-state total (idle background)** | **~1.2 GB** |
| **With one heavy background job active** | **~1.7 GB** |

At 4–8 pods, cluster-wide steady-state memory is 4.8–9.6 GB, which fits within a single Redis instance (allocate 4–8 GB for Redis at this tier). Heavy background jobs run only on the pod that claimed them from the shared queue, not on all pods simultaneously, so the per-pod budget stays within the envelope.
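
The arithmetic behind these figures can be reproduced with a few lines. This is a minimal sketch using the totals quoted in the table; the helper names are hypothetical, and treating the 3.75 GB pod specification as the per-pod memory ceiling is an assumption.

```python
POD_MEMORY_GB = 3.75    # per-pod specification from this guide
STEADY_STATE_GB = 1.2   # steady-state total, idle background
HEAVY_JOB_GB = 1.7      # with one heavy background job active


def cluster_steady_state_gb(pods: int) -> float:
    """Cluster-wide steady-state memory for a given pod count."""
    return pods * STEADY_STATE_GB


def pod_headroom_gb(heavy_job: bool = False) -> float:
    """Memory left on one pod under the assumed 3.75 GB ceiling."""
    used = HEAVY_JOB_GB if heavy_job else STEADY_STATE_GB
    return POD_MEMORY_GB - used


for pods in (4, 8):
    print(pods, "pods:", round(cluster_steady_state_gb(pods), 1), "GB")
print("headroom with heavy job:", round(pod_headroom_gb(heavy_job=True), 2), "GB")
```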
### Query throughput validation
Per [plan.md §14.3](../../plan/plan.md#143-per-pod-cpu-budget):
- **Small searches** (1 KB responses): ~3 kQPS per pod at 70% CPU
- **Large searches** (10 KB responses): ~1 kQPS per pod at 70% CPU
At 5 kQPS peak with 4–8 pods:
- Each pod handles 625–1250 QPS
- This is within the 1–3 kQPS per-pod envelope
- HPA will scale up toward 8 pods during sustained peak
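
The per-pod load figures follow from dividing peak QPS by the pod count. The sketch below reproduces that division and also derives the minimum pod count for each response class; the function names are hypothetical, and it assumes load is spread evenly across pods.

```python
import math

# Per-pod capacity at 70% CPU, as quoted from plan.md §14.3 in this guide.
SMALL_QPS_PER_POD = 3_000  # ~1 KB responses
LARGE_QPS_PER_POD = 1_000  # ~10 KB responses


def per_pod_qps(peak_qps: float, pods: int) -> float:
    """Load on each pod, assuming an even spread across pods."""
    return peak_qps / pods


def min_pods(peak_qps: float, qps_per_pod: float) -> int:
    """Smallest pod count that keeps every pod at or below its capacity."""
    return math.ceil(peak_qps / qps_per_pod)


print(per_pod_qps(5_000, 8), per_pod_qps(5_000, 4))  # 625.0 1250.0
print(min_pods(5_000, SMALL_QPS_PER_POD))            # 2
print(min_pods(5_000, LARGE_QPS_PER_POD))            # 5
```

Under these assumptions, a workload made up entirely of large responses would need at least 5 pods at peak, which is consistent with the HPA scaling above the 4-pod floor during sustained load.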
### Result
The ≤ 200 GB / ≤ 5 k QPS row is validated: 4–8 orchestrator pods with a Redis task store can handle the workload within the 2 vCPU / 3.75 GB per-pod resource envelope.
## When to escalate
If your workload exceeds the ≤ 5 TB / ≤ 100 k QPS tier, or if you're hitting resource limits despite following this matrix, consider:
1. **Vertical scaling escape valve** — For constrained environments, a single larger pod (e.g., 4 vCPU / 8 GB) is supported but not recommended for production. See [plan.md §14.10](../../plan/plan.md#1410-vertical-scaling-escape-valve).
2. **Redis clustering** — At very high scale, consider Redis Cluster or Sentinel for the task store to improve availability and distribute memory pressure.
3. **Tune feature flags** — Disable unused capabilities (e.g., multi-search, vector search, shadow tee) to reclaim memory. Every feature in [plan.md §13](../../plan/plan.md#13-feature-scaling-modes) has an `enabled: true` knob that can be flipped off.
4. **Consult the metrics** — Use the resource-pressure metrics in [plan.md §14.9](../../plan/plan.md#149-resource-pressure-metrics-and-alerts) to identify the bottleneck (CPU, memory, request queue depth, or background job backlog).