Deployment Sizing Guide

This guide provides a sizing matrix for provisioning Miroir orchestrator pods based on corpus size and query throughput. Meilisearch node sizing follows the guidance in plan.md §6 independently.

For per-feature scaling behavior (which capabilities need Redis, work queues, or nothing), see Per-Feature Scaling Behavior.

Sizing matrix

| Corpus | Peak QPS | Orchestrator pods | Task store |
|---|---|---|---|
| ≤ 10 GB | ≤ 500 | 2 (HA) | Redis (or SQLite if replicas=1) |
| ≤ 50 GB | ≤ 2 k | 2–4 (HPA) | Redis |
| ≤ 200 GB | ≤ 5 k | 4–8 (HPA) | Redis |
| ≤ 1 TB | ≤ 20 k | 8–12 (HPA) | Redis |
| ≤ 5 TB | ≤ 100 k | 12–24 (HPA) | Redis (clustered or Sentinel) |

Orchestrator pod specification: 2 vCPU / 3.75 GB per pod.

Key insight: Orchestrator count scales with query throughput, while Meilisearch node count scales with corpus size and replication factor. These are orthogonal axes.
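As a rough illustration, the matrix can be encoded as a lookup table. The sketch below is not part of Miroir: `Tier`, `TIERS`, and `pick_tier` are hypothetical names, the pod counts are read as ranges (2–4, 4–8, 8–12, 12–24), 1 TB is taken as 1000 GB, and picking the first row that satisfies both limits is an assumption about how the table is meant to be read.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Tier:
    max_corpus_gb: float   # upper bound on corpus size for this row
    max_peak_qps: int      # upper bound on peak query throughput
    min_pods: int          # lower HPA bound for orchestrator pods
    max_pods: int          # upper HPA bound for orchestrator pods
    task_store: str


# Rows transcribed from the sizing matrix, smallest tier first.
TIERS = [
    Tier(10, 500, 2, 2, "Redis (or SQLite if replicas=1)"),
    Tier(50, 2_000, 2, 4, "Redis"),
    Tier(200, 5_000, 4, 8, "Redis"),
    Tier(1_000, 20_000, 8, 12, "Redis"),
    Tier(5_000, 100_000, 12, 24, "Redis (clustered or Sentinel)"),
]


def pick_tier(corpus_gb: float, peak_qps: int) -> Optional[Tier]:
    """Return the smallest tier covering both corpus size and peak QPS."""
    for tier in TIERS:
        if corpus_gb <= tier.max_corpus_gb and peak_qps <= tier.max_peak_qps:
            return tier
    return None  # beyond the largest row: see "When to escalate"
```

Because both limits must hold, a small corpus with high throughput (for example 5 GB at 3 k QPS) lands in the 4–8 pod row rather than the first one, which matches the point that orchestrator count follows query throughput.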

Task-store memory accounting

When Redis is the task store, it backs shared state for multiple subsystems:

  • Idempotency replay keys — request deduplication cache
  • Session pinning rows — per-session sticky routing state
  • Alias cache — atomic index alias mappings
  • Background job queue — chunked dump import, reshard backfill jobs
  • Leader lease — HA coordinator election state
  • CDC overflow buffer — change-data-capture spill-over (when configured)
  • Search UI rate-limit buckets — IP-based rate limiter state

Add ~20 MB for search UI rate-limit buckets per 10k active IPs on top of the task-store baseline when search_ui.rate_limit.backend: redis. Bucket rows auto-expire per redis_ttl_s, keeping the footprint bounded even under IP-scan or spray attacks.
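The rate-limit allowance is simple arithmetic. The following sketch assumes the ~20 MB per 10k active IPs figure scales linearly, which the guide does not state explicitly; `rate_limit_redis_mb` is a hypothetical helper, not a Miroir function.

```python
def rate_limit_redis_mb(active_ips: int, mb_per_10k_ips: float = 20.0) -> float:
    """Estimate extra Redis memory (MB) for search UI rate-limit buckets.

    Assumes linear scaling from the ~20 MB per 10k active IPs rule of thumb.
    """
    return active_ips / 10_000 * mb_per_10k_ips


# Example: 50k concurrently active IPs on top of the task-store baseline.
extra_mb = rate_limit_redis_mb(50_000)
print(f"Extra Redis memory for rate limiting: ~{extra_mb:.0f} MB")
```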

Worked example: ≤ 200 GB / ≤ 5 k QPS

From the table: 4–8 orchestrator pods (HPA-scaled between these limits), Redis task store.

Memory budget validation

Per plan.md §14.2, steady-state per-pod memory at default config:

| Component | Budget |
|---|---|
| Baseline (runtime + pools) | ~330 MB |
| Request/response buffers (p99 concurrent) | 200 MB |
| Task registry + idempotency + sessions | 250 MB |
| Feature overhead (§13.11–21 features) | ~200 MB |
| Allocator headroom | ~800 MB |
| Steady-state total (idle background) | ~1.2 GB |
| With one heavy background job active | ~1.7 GB |

At 4–8 pods, cluster-wide steady-state memory is 4.8–9.6 GB, which fits within a single Redis instance (allocate 4–8 GB for Redis at this tier). Heavy background jobs run only on the pod that claimed them from the shared queue, not on all pods simultaneously, so the per-pod budget stays within the envelope.
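The cluster-wide figures follow from multiplying the per-pod steady-state total by the pod count. A minimal sketch of that arithmetic, using the ~1.2 GB and ~1.7 GB totals and the 3.75 GB pod size from this guide; the constant and function names are hypothetical, and treating 1 GB as 1000 MB is an assumption.

```python
POD_LIMIT_MB = 3750       # 2 vCPU / 3.75 GB pod, assuming 1 GB = 1000 MB
STEADY_STATE_MB = 1200    # per-pod steady-state total (idle background)
HEAVY_JOB_MB = 1700       # per-pod total with one heavy background job


def cluster_steady_state_mb(pods: int) -> int:
    """Cluster-wide steady-state memory when no heavy job is running."""
    return pods * STEADY_STATE_MB


def pod_headroom_mb(pod_usage_mb: int) -> int:
    """Memory left under the pod limit for a given per-pod usage."""
    return POD_LIMIT_MB - pod_usage_mb


for pods in (4, 8):
    print(f"{pods} pods: {cluster_steady_state_mb(pods) / 1000:.1f} GB steady state")

print(f"Headroom on a pod running a heavy job: {pod_headroom_mb(HEAVY_JOB_MB)} MB")
```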

Query throughput validation

Per plan.md §14.3:

  • Small searches (1 KB responses): ~3 kQPS per pod at 70% CPU
  • Large searches (10 KB responses): ~1 kQPS per pod at 70% CPU

At 5 kQPS peak with 4–8 pods:

  • Each pod handles 625–1250 QPS
  • This is within the 1–3 kQPS per-pod envelope
  • HPA will scale up toward 8 pods during sustained peak
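The per-pod load in the bullets above is peak QPS divided by pod count. The sketch below reproduces that division and, as an additional assumption, estimates the minimum pod count if every query were a small or a large search; the helper names are hypothetical and the per-pod capacities are the §14.3 figures quoted above.

```python
import math

SMALL_SEARCH_QPS_PER_POD = 3_000   # ~1 KB responses at 70% CPU
LARGE_SEARCH_QPS_PER_POD = 1_000   # ~10 KB responses at 70% CPU


def per_pod_qps(peak_qps: int, pods: int) -> float:
    """Average load per pod, assuming traffic is spread evenly."""
    return peak_qps / pods


def pods_needed(peak_qps: int, qps_per_pod: int) -> int:
    """Smallest pod count that keeps each pod at or below qps_per_pod."""
    return math.ceil(peak_qps / qps_per_pod)


peak = 5_000
for pods in (4, 8):
    print(f"{pods} pods -> {per_pod_qps(peak, pods):.0f} QPS per pod")

print("All-small workload needs", pods_needed(peak, SMALL_SEARCH_QPS_PER_POD), "pods")
print("All-large workload needs", pods_needed(peak, LARGE_SEARCH_QPS_PER_POD), "pods")
```

Under these assumptions, a workload made up entirely of large searches would exceed the ~1 kQPS per-pod figure at 4 pods, which is consistent with the HPA scaling toward the upper bound during sustained peak.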

Result

The ≤ 200 GB / ≤ 5 k QPS row is validated: 4–8 orchestrator pods with Redis task store can handle the workload within the 2 vCPU / 3.75 GB per-pod resource envelope.

When to escalate

If your workload exceeds the ≤ 5 TB / ≤ 100 k QPS tier, or if you're hitting resource limits despite following this matrix, consider:

  1. Vertical scaling escape valve — For constrained environments, a single larger pod (e.g., 4 vCPU / 8 GB) is supported but not recommended for production. See plan.md §14.10.

  2. Redis clustering — At very high scale, consider Redis Cluster or Sentinel for the task store to improve availability and distribute memory pressure.

  3. Tune feature flags — Disable unused capabilities (e.g., multi-search, vector search, shadow tee) to reclaim memory. Every feature in plan.md §13 has an enabled: true knob that can be flipped off.

  4. Consult the metrics — Use the resource-pressure metrics in plan.md §14.9 to identify the bottleneck (CPU, memory, request queue depth, or background job backlog).