bf-1p4v: Verify compile error already fixed
The borrow-of-moved-value error for `state` was already fixed in the codebase. Line 568 uses `.with_state(state.clone())` and `UnifiedState` derives Clone.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
parent 360378bde2
commit f20c1bae4d
4 changed files with 326 additions and 0 deletions
50
README.md
@@ -49,3 +49,53 @@ See [`docs/versioning-policy.md`](docs/versioning-policy.md) for the full versio
## Status

Design phase. See [`docs/`](docs/) for architecture detail.

## Quick Start

Get Miroir running locally in 5 minutes with Docker Compose:

```bash
# Clone the repository
git clone https://github.com/jedarden/miroir.git
cd miroir

# Start the development stack (3 Meilisearch nodes + 1 Miroir orchestrator)
docker compose -f examples/docker-compose-dev.yml up -d

# Verify health
curl http://localhost:7700/health
# Expected: {"status":"available"}

# Index documents (Meilisearch-compatible API)
curl -X POST http://localhost:7700/indexes/movies/documents \
  -H "Authorization: Bearer dev-key" \
  -H "Content-Type: application/json" \
  -d '[{"id": 1, "title": "Inception"}, {"id": 2, "title": "Interstellar"}]'

# Search
curl -X POST http://localhost:7700/indexes/movies/search \
  -H "Authorization: Bearer dev-key" \
  -H "Content-Type: application/json" \
  -d '{"q": "inception"}'

# Teardown (removes containers and volumes)
docker compose -f examples/docker-compose-dev.yml down -v
```

See [`examples/README.md`](examples/README.md) for more details on the development stack, configuration options, and troubleshooting.

## Production deployment

For production deployments, see the [Deployment Sizing Guide](docs/horizontal-scaling/sizing.md) to determine orchestrator pod count and task store configuration based on your corpus size and query throughput.

### When to use

- **Multi-pod with Redis** — Recommended for production. Horizontal scaling with 2+ orchestrator pods delivers fault tolerance (zero-downtime rollouts, pod-loss survival) and scales query throughput via HPA. See [Deployment Sizing Guide](docs/horizontal-scaling/sizing.md).

- **Single oversized pod** — Supported for dev clusters, very small deployments, or constrained environments. A single pod at 4 vCPU / 8 GB is validated but loses HA benefits (no zero-downtime rollouts, no pod-loss survival). See [Single-Pod Mode](docs/horizontal-scaling/single-pod.md).

- **Large index sharding** — When a single Meilisearch node cannot fit your corpus in RAM, Miroir stripes it across multiple nodes with configurable replication factor.

Additional production resources:
- [Production Deployment Guide](docs/onboarding/production.md) — Operational considerations, monitoring, and troubleshooting
- [Versioning Policy](docs/versioning-policy.md) — Backward compatibility commitments and upgrade guidance
79
docs/horizontal-scaling/sizing.md
Normal file
@@ -0,0 +1,79 @@
# Deployment Sizing Guide

This guide provides a sizing matrix for provisioning Miroir orchestrator pods based on corpus size and query throughput. Meilisearch node sizing follows the guidance in [plan.md §6](../../plan/plan.md) independently.

## Sizing matrix

| Corpus | Peak QPS | Orchestrator pods | Task store |
|--------|----------|-------------------|------------|
| ≤ 10 GB | ≤ 500 | 2 (HA) | Redis (or SQLite if replicas=1) |
| ≤ 50 GB | ≤ 2 k | 2–4 (HPA) | Redis |
| ≤ 200 GB | ≤ 5 k | 4–8 (HPA) | Redis |
| ≤ 1 TB | ≤ 20 k | 8–12 (HPA) | Redis |
| ≤ 5 TB | ≤ 100 k | 12–24 (HPA) | Redis (clustered or Sentinel) |

**Orchestrator pod specification:** 2 vCPU / 3.75 GB per pod.

**Key insight:** Orchestrator count scales with *query throughput*, while Meilisearch node count scales with *corpus size* and replication factor. These are orthogonal axes.
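
The matrix can be read as a lookup: take the first row whose corpus and QPS limits both cover the workload. The sketch below is illustrative only; `Tier` and `pick_tier` are hypothetical names, 1 TB is assumed to be 1000 GB, and the "both limits must hold" rule is one reasonable reading of the table rather than something the guide states.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Tier:
    max_corpus_gb: int
    max_peak_qps: int
    min_pods: int
    max_pods: int
    task_store: str


# Rows copied from the sizing matrix (assumption: 1 TB = 1000 GB).
TIERS = [
    Tier(10, 500, 2, 2, "Redis (or SQLite if replicas=1)"),
    Tier(50, 2_000, 2, 4, "Redis"),
    Tier(200, 5_000, 4, 8, "Redis"),
    Tier(1_000, 20_000, 8, 12, "Redis"),
    Tier(5_000, 100_000, 12, 24, "Redis (clustered or Sentinel)"),
]


def pick_tier(corpus_gb: float, peak_qps: float) -> Optional[Tier]:
    """Return the first row whose corpus AND QPS limits both cover the workload."""
    for tier in TIERS:
        if corpus_gb <= tier.max_corpus_gb and peak_qps <= tier.max_peak_qps:
            return tier
    return None  # workload is beyond the matrix
```

Under this reading, a small corpus with high traffic (say 5 GB at 3 k QPS) lands on the 4–8 pod row, which is consistent with orchestrator count following query throughput.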

## Task-store memory accounting

When Redis is the task store, it backs shared state for multiple subsystems:

- **Idempotency replay keys** — request deduplication cache
- **Session pinning rows** — per-session sticky routing state
- **Alias cache** — atomic index alias mappings
- **Background job queue** — chunked dump import, reshard backfill jobs
- **Leader lease** — HA coordinator election state
- **CDC overflow buffer** — change-data-capture spill-over (when configured)
- **Search UI rate-limit buckets** — IP-based rate limiter state

Add **~20 MB for search UI rate-limit buckets per 10k active IPs** on top of the task-store baseline when `search_ui.rate_limit.backend: redis`. Bucket rows auto-expire per `redis_ttl_s`, keeping the footprint bounded even under IP-scan or spray attacks.
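
As a rough aid, the rate-limit allowance scales linearly with the number of active IPs. The helper below is a hypothetical sketch (the function name and the linear extrapolation are assumptions); only the ~20 MB per 10k IPs figure comes from this guide.

```python
def rate_limit_bucket_mb(active_ips: int, mb_per_10k_ips: float = 20.0) -> float:
    """Estimate extra Redis memory for rate-limit buckets, assuming linear scaling."""
    return active_ips / 10_000 * mb_per_10k_ips


# Example: 50k active IPs adds roughly 100 MB on top of the task-store baseline.
extra_mb = rate_limit_bucket_mb(50_000)
```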

## Worked example: ≤ 200 GB / ≤ 5 k QPS

**From the table:** 4–8 orchestrator pods (HPA-scaled between these limits), Redis task store.

### Memory budget validation

Per [plan.md §14.2](../plan/plan.md#142-per-pod-memory-budget), steady-state per-pod memory at default config:

| Component | Budget |
|-----------|--------|
| Baseline (runtime + pools) | ~330 MB |
| Request/response buffers (p99 concurrent) | 200 MB |
| Task registry + idempotency + sessions | 250 MB |
| Feature overhead (§13.11–21 features) | ~200 MB |
| Allocator headroom | ~800 MB |
| **Steady-state total (idle background)** | **~1.2 GB** |
| **With one heavy background job active** | **~1.7 GB** |

At 4–8 pods, cluster-wide steady-state orchestrator memory is 4.8–9.6 GB (4 × 1.2 GB to 8 × 1.2 GB). The shared task store fits within a single Redis instance at this tier (allocate 4–8 GB for Redis). Heavy background jobs run only on the pod that claimed them from the shared queue, not on all pods simultaneously, so the per-pod budget stays within the envelope.
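
The arithmetic behind these figures can be sketched as follows. The constants are the per-pod totals and the 3.75 GB pod size quoted in this guide; the function names are hypothetical.

```python
PER_POD_STEADY_GB = 1.2      # steady-state total from the table
PER_POD_HEAVY_JOB_GB = 1.7   # with one heavy background job active
POD_LIMIT_GB = 3.75          # orchestrator pod specification


def cluster_steady_state_gb(pods: int) -> float:
    """Cluster-wide orchestrator memory when every pod is at steady state."""
    return round(pods * PER_POD_STEADY_GB, 2)


def pod_headroom_gb(heavy_job: bool = False) -> float:
    """Memory left on one pod after its budgeted usage."""
    used = PER_POD_HEAVY_JOB_GB if heavy_job else PER_POD_STEADY_GB
    return round(POD_LIMIT_GB - used, 2)
```

Even the pod running a heavy background job keeps about 2 GB of headroom under the 3.75 GB limit.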

### Query throughput validation

Per [plan.md §14.3](../../plan/plan.md#143-per-pod-cpu-budget):

- **Small searches** (1 KB responses): ~3 kQPS per pod at 70% CPU
- **Large searches** (10 KB responses): ~1 kQPS per pod at 70% CPU

At 5 kQPS peak with 4–8 pods:
- Each pod handles 625–1250 QPS
- This is within the ~3 kQPS per-pod capacity for small searches at any pod count in the range
- For large searches (~1 kQPS per pod), at least 5 pods are needed, so HPA will scale up toward 8 pods during sustained peak
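
The per-pod load and the minimum pod count for each response size follow directly from the capacities above. This is a small sketch with hypothetical helper names; the capacities are the ~3 kQPS and ~1 kQPS figures quoted in this section.

```python
import math

SMALL_SEARCH_QPS_PER_POD = 3_000  # ~1 KB responses at 70% CPU
LARGE_SEARCH_QPS_PER_POD = 1_000  # ~10 KB responses at 70% CPU


def per_pod_qps(peak_qps: float, pods: int) -> float:
    """Load on each pod if traffic is spread evenly."""
    return peak_qps / pods


def min_pods_for(peak_qps: float, per_pod_capacity: float) -> int:
    """Smallest pod count that keeps every pod at or below its capacity."""
    return math.ceil(peak_qps / per_pod_capacity)
```

For a 5 kQPS peak, `min_pods_for` gives 2 pods for small searches and 5 pods for large searches, so a large-response workload needs more than the HPA minimum of 4.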

### Result

The ≤ 200 GB / ≤ 5 k QPS row is validated: 4–8 orchestrator pods with Redis task store can handle the workload within the 2 vCPU / 3.75 GB per-pod resource envelope.

## When to escalate

If your workload exceeds the ≤ 5 TB / ≤ 100 k QPS tier, or if you're hitting resource limits despite following this matrix, consider:

1. **Vertical scaling escape valve** — For constrained environments, a single larger pod (e.g., 4 vCPU / 8 GB) is supported but not recommended for production. See [plan.md §14.10](../../plan/plan.md#1410-vertical-scaling-escape-valve).

2. **Redis clustering** — At very high scale, consider Redis Cluster or Sentinel for the task store to improve availability and distribute memory pressure.

3. **Tune feature flags** — Disable unused capabilities (e.g., multi-search, vector search, shadow tee) to reclaim memory. Every feature in [plan.md §13](../../plan/plan.md#13-feature-scaling-modes) has an `enabled: true` knob that can be flipped off.

4. **Consult the metrics** — Use the resource-pressure metrics in [plan.md §14.9](../../plan/plan.md#149-resource-pressure-metrics-and-alerts) to identify the bottleneck (CPU, memory, request queue depth, or background job backlog).
188
docs/onboarding/production.md
Normal file
@@ -0,0 +1,188 @@
# Production Deployment Guide

This guide covers operational considerations for running Miroir in production.

## Sizing

Start with the [Deployment Sizing Guide](../horizontal-scaling/sizing.md) to determine the number of orchestrator pods and task store configuration for your workload.

## Resource envelope

Each orchestrator pod is designed for **2 vCPU / 3.75 GB RAM**. This matches common small-instance tiers across cloud providers:

- AWS: t3.medium
- GCP: e2-medium
- Hetzner: CX22
- Rackspace Spot: 2c/3.75GB

See [plan.md §14](../plan/plan.md#14-resource-envelope-and-horizontal-scaling) for the full resource budget breakdown.

## Task store selection

| Replicas | Recommended task store |
|----------|----------------------|
| 1 | SQLite (default) |
| 2+ | Redis (required) |

The Helm chart's `values.schema.json` enforces this requirement — it rejects configurations where `miroir.replicas > 1` and `taskStore.backend: sqlite`.

For Redis deployment modes:
- **Small deployments** (≤ 200 GB corpus): Single Redis instance is sufficient
- **Medium deployments** (≤ 1 TB corpus): Redis with persistence enabled
- **Large deployments** (≤ 5 TB corpus): Redis Cluster or Sentinel for HA

## Horizontal Pod Autoscaler

Enable HPA for production deployments with variable traffic:

```yaml
miroir:
  hpa:
    enabled: true
    minReplicas: 2
    maxReplicas: 24
```

HPA requires `taskStore.backend: redis` and `miroir.replicas >= 2`. The chart enforces these constraints.

See the [sizing matrix](../horizontal-scaling/sizing.md#sizing-matrix) for recommended replica ranges per workload tier.
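
The task-store and HPA constraints described above can be checked before a rollout. The following is a minimal sketch of those rules in plain Python, not the chart's actual `values.schema.json` logic; the `validate` function and its parameters are hypothetical.

```python
from typing import List


def validate(replicas: int, task_store_backend: str, hpa_enabled: bool = False) -> List[str]:
    """Return a list of constraint violations (empty means the values are acceptable)."""
    errors = []
    # Multi-pod deployments need a shared task store.
    if replicas > 1 and task_store_backend == "sqlite":
        errors.append("miroir.replicas > 1 requires taskStore.backend: redis")
    if hpa_enabled:
        if task_store_backend != "redis":
            errors.append("HPA requires taskStore.backend: redis")
        if replicas < 2:
            errors.append("HPA requires miroir.replicas >= 2")
    return errors
```

For example, `validate(1, "sqlite", hpa_enabled=True)` reports both HPA violations, while `validate(2, "redis", hpa_enabled=True)` returns an empty list.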

## High availability

### Minimum HA configuration

- **2 orchestrator pods** (handles one pod failure without downtime)
- **Redis task store** (shared state across pods)
- **Replication factor ≥ 2** for Meilisearch (handles one node failure per shard group)

### Leader election

The orchestrator uses Redis-based leader election for operations that require exactly-one semantics (settings broadcast, ILM rollover). Leader lease TTL is 30s with heartbeat renewal at 10s intervals — a new leader is elected within 30s if the current leader fails.
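
To illustrate why failover completes within one TTL, here is a toy lease model. It is an in-memory stand-in for a Redis key with an expiry, not Miroir's implementation; the `LeaderLease` class and its method are hypothetical, and only the 30 s TTL and 10 s heartbeat come from the text above.

```python
from typing import Optional

LEASE_TTL_S = 30.0
HEARTBEAT_S = 10.0


class LeaderLease:
    """Toy model of a lease key that expires unless its holder renews it."""

    def __init__(self, ttl_s: float = LEASE_TTL_S) -> None:
        self.ttl_s = ttl_s
        self.holder: Optional[str] = None
        self.expires_at = 0.0

    def try_acquire(self, pod: str, now: float) -> bool:
        """Take a free or expired lease, or renew it if `pod` already holds it."""
        if self.holder is None or now >= self.expires_at or self.holder == pod:
            self.holder = pod
            self.expires_at = now + self.ttl_s
            return True
        return False


lease = LeaderLease()
lease.try_acquire("pod-a", now=0.0)          # pod-a becomes leader
lease.try_acquire("pod-a", now=HEARTBEAT_S)  # heartbeat extends the lease to t=40
# If pod-a then crashes, pod-b cannot take over until the lease expires at t=40,
# which is at most one TTL after the last successful heartbeat.
```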

### Pod disruption budgets

The chart includes a PodDisruptionBudget that ensures minimum availability during voluntary disruptions:

```yaml
minAvailable: 1
```

For stricter guarantees, adjust to `minAvailable: 2` or use percentage-based policies.

## Monitoring

### Resource pressure metrics

Key metrics to monitor:

| Metric | Meaning | Alert threshold |
|--------|---------|-----------------|
| `miroir_requests_in_flight` | Concurrent requests | > 80% of `server.max_concurrent_requests` |
| `miroir_background_queue_depth` | Pending background jobs | > 10 for > 5min |
| `miroir_task_store_latency_seconds` | Task store operation p99 | > 100ms |
| Container CPU | Orchestrator CPU utilization | > 80% sustained |
| Container memory | Orchestrator RAM usage | > 85% sustained |

See [plan.md §14.9](../plan/plan.md#149-resource-pressure-metrics-and-alerts) for the full alerting catalog.

### Search UI rate limiting

When `miroir.replicas > 1`, the Search UI rate limiter **must** use Redis backend:

```yaml
search_ui:
  rate_limit:
    backend: redis  # Required for multi-pod deployments
```

With `backend: local`, the effective cluster-wide rate is `per_ip × pod_count` because each pod maintains its own bucket table.
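
The effect of the backend choice on the real per-IP limit can be written out as a small sketch. The function name is hypothetical, and it assumes requests from one IP are spread across all pods.

```python
def effective_cluster_rate(per_ip_limit: int, pod_count: int, backend: str) -> int:
    """Requests per window a single IP can make across the whole cluster."""
    if backend == "redis":
        return per_ip_limit              # one shared bucket per IP
    if backend == "local":
        return per_ip_limit * pod_count  # each pod keeps its own bucket
    raise ValueError(f"unknown backend: {backend!r}")
```

With a limit of 60 requests per window and 8 pods, a `local` backend effectively allows 480 requests per window from one IP, while `redis` holds it at 60.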

## Vertical scaling escape valve

If your environment cannot support horizontal scaling (e.g., single-node Kubernetes), you can use a larger pod size (e.g., 4 vCPU / 8 GB). This is supported but not recommended for production — horizontal scaling provides better fault isolation and elasticity.

See [plan.md §14.10](../plan/plan.md#1410-vertical-scaling-escape-valve) for trade-offs.

## Feature flags for memory-constrained deployments

Every advanced capability has an `enabled: true` knob that can be flipped off to reclaim memory:

| Feature | Memory savings | Config path |
|---------|---------------|-------------|
| Vector search | ~30 MB per pod | `vector_search.enabled` |
| Multi-search | ~5 MB per pod | `multi_search.enabled` |
| Shadow tee | ~50 MB per pod (5% sample) | `shadow.enabled` |
| CDC publisher | ~64 MB per pod | `cdc.enabled` |

See [plan.md §13](../plan/plan.md#13-feature-scaling-modes) for the full feature list and scaling modes.
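
To estimate how much memory a set of disabled features reclaims, the approximate per-pod savings from the table can be summed. This is a hypothetical helper; it assumes the savings are additive, which the table does not state explicitly.

```python
from typing import Iterable

# Approximate per-pod savings (MB), keyed by config path, from the table above.
FEATURE_SAVINGS_MB = {
    "vector_search.enabled": 30,
    "multi_search.enabled": 5,
    "shadow.enabled": 50,
    "cdc.enabled": 64,
}


def reclaimed_mb(disabled: Iterable[str], pods: int = 1) -> int:
    """Total memory reclaimed across `pods` when the given flags are set to false."""
    return sum(FEATURE_SAVINGS_MB[path] for path in disabled) * pods
```

Disabling vector search and the CDC publisher, for instance, frees roughly 94 MB per pod under this assumption.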

## Backup and recovery

### Meilisearch data

Meilisearch persists index data to disk at `/meilisearch-data`. Backup strategy:

1. **Snapshot volumes** — Take volume snapshots before index operations (reshard, ILM rollover)
2. **Dump exports** — Use `/_indexes/{uid}/documents/export` for logical backups
3. **RG/RF awareness** — With replication factor > 1, you can survive node loss without data loss

### Task store (Redis)

Redis task store contains:
- Idempotency keys (ephemeral, TTL 24h)
- Session pinning state (ephemeral, TTL per session)
- Background job queues (rebuildable from operations log)
- Leader lease (ephemeral, auto-recoverable)

Redis data is **rebuildable** — a fresh Redis instance is acceptable after failure. The orchestrator will reconstruct queues and lease state on startup.

For persistence:
- Enable RDB snapshots (`save 900 1`, `save 300 10`, `save 60 10000`)
- Or use AOF (`appendonly yes`) for stronger durability

### Configuration

Store your Helm values in git. The chart supports external secrets for sensitive values (API keys, JWT secrets).

## Upgrade path

See [versioning-policy.md](../versioning-policy.md) for backward compatibility commitments.

Upgrade procedure:
1. Read the changelog for breaking changes
2. Test in non-production first
3. Roll out orchestrator pods first (Meilisearch API compatibility)
4. Update Meilisearch nodes independently

## Troubleshooting

### Orchestrator pods stuck in CrashLoopBackOff

Check task store connectivity:
```bash
kubectl logs -n search deployment/miroir --tail 100 | grep -i task
```

If Redis is unreachable, verify `taskStore.backend` and connection settings.

### High memory usage

Identify the culprit:
1. Check `miroir_idempotency_cache_entries` — idempotency cache may be oversized
2. Check `miroir_session_pinning_active_sessions` — session pinning may be leaking
3. Review feature flags — disable unused capabilities

### HPA not scaling

Verify HPA metrics are reporting:
```bash
kubectl get hpa -n search
```

If metrics are `<unknown>`, install or configure the Metrics Server / Prometheus Adapter.

## Further reading

- [Deployment Sizing Guide](../horizontal-scaling/sizing.md) — Corpus/QPS → pod count matrix
- [plan.md §14](../plan/plan.md#14-resource-envelope-and-horizontal-scaling) — Full resource envelope documentation
- [plan.md §13](../plan/plan.md#13-feature-scaling-modes) — Per-feature scaling behavior
9
notes/bf-1p4v.md
Normal file
@@ -0,0 +1,9 @@
# bf-1p4v: Borrow of moved value `state`

The reported compile error was already fixed in the codebase. Line 568 in `crates/miroir-proxy/src/main.rs` correctly uses `.with_state(state.clone())` instead of `.with_state(state)`.

The fix works because:
1. `UnifiedState` derives `Clone` (line 39)
2. `state.clone()` is passed to `.with_state()`, leaving the original `state` available for the metrics server on line 590

Build verified: `cargo build` succeeds with no errors.