jedarden e5902bb47f P3: Complete Phase 3 — Task Registry + Persistence (SQLite + Redis)

Implements the 14-table task-store schema from plan §4 with both SQLite
and Redis backends. Every §13 advanced capability and §14 HA mode consumes
one or more of these tables, so settling the schema now prevents per-feature
bespoke persistence.

## SQLite Backend (rusqlite)

- All 14 tables created idempotently at startup via migrations
- Schema version tracking with validation (rejects a store whose schema version is ahead of the binary)
- WAL mode + 5s busy_timeout for concurrent access
- Full TaskStore trait implementation with comprehensive tests
- Property tests for (insert, get) round-trip and (upsert, list) semantics
- Restart resilience test: tasks survive pod restart simulation

## Redis Backend (async via tokio)

- Mirrors the same 14-table API as SQLite (TaskStore trait)
- Keyspace mapping per plan §4 "Redis mode (HA)"
- Uses _index secondary sets for O(cardinality) list-wide queries (no SCAN)
- TTL-based auto-expiration for sessions, idempotency, rate-limits
- Leader election via SET NX EX with heartbeat renewal
- Pub/Sub for instant admin session revocation propagation
- CDC overflow buffer bounded by byte budget with auto-trim
- Rate limiting for search UI and admin login with exponential backoff
- Search UI scoped-key rotation coordination

## Schema Migrations

- 001_initial.sql: Tables 1-7 (tasks, node_settings_version, aliases,
  sessions, idempotency_cache, jobs, leader_lease)
- 002_feature_tables.sql: Tables 8-14 (canaries, canary_runs, cdc_cursors,
  tenant_map, rollover_policies, search_ui_config, admin_sessions)
- 003_task_registry_fields.sql: No-op (node_errors already present)

## Tests

- SQLite: 36 tests passing (unit + property + restart resilience)
- Redis: Integration tests using testcontainers (25+ async tests)
- Helm schema validation: enforces replicas > 1 + taskStore.backend: redis

## Definition of Done

✓ rusqlite-backed store with idempotent migrations
✓ Redis-backed store mirroring the same API (trait TaskStore)
✓ Migrations/versioning with schema version validation
✓ Property tests on SQLite backend (7 proptests passing)
✓ Integration test: task survives restart (task_survives_store_reopen)
✓ Redis-backend integration tests (testcontainers)
✓ miroir:tasks:_index-style iteration (no SCAN)
✓ Helm values.schema.json enforces replicas > 1 + redis requirement
✓ Redis memory accounting documented in plan §14.7

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

2026-05-02 16:52:25 -04:00


Redis Memory Usage and Capacity Planning

This document describes Redis memory usage patterns for the Miroir task store and provides guidance for capacity planning (plan §14.7).

Overview

Miroir uses Redis as an optional task store backend for multi-replica deployments. The keyspace is organized into 14 table-like structures plus auxiliary keys for rate limiting, CDC overflow buffering, and Pub/Sub.

Redis Keyspace Organization

All keys use the miroir: prefix to avoid collisions with other applications using the same Redis instance.

Table 1: `tasks` (Miroir task registry)

Key pattern: miroir:tasks:<miroir_id> (hash)
Index: miroir:tasks:_index (set)

Field	Type	Example Size	Notes
miroir_id	string	~40 bytes	UUIDv4 with "mtask-" prefix
created_at	string	~13 bytes	Millisecond timestamp as string
status	string	~10 bytes	"enqueued", "processing", "succeeded", "failed", "canceled"
node_tasks	string	~50 bytes	JSON: `{"node-0":123}` (varies by node count)
node_errors	string	~10 bytes	JSON object, often empty `{}`
error	string	0-100 bytes	Optional error message
started_at	string	0-10 bytes	Optional timestamp
finished_at	string	0-10 bytes	Optional timestamp
index_uid	string	0-50 bytes	Optional index identifier
task_type	string	0-50 bytes	Optional task type identifier

Estimated per-task memory: ~200-300 bytes (including Redis hash overhead)

Index overhead: ~40 bytes per task in the _index set
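
As a minimal sketch of how the key pattern and the _index set fit together, the helpers below build the task key and derive the keys to fetch from the index members. The function names are illustrative, and the sketch assumes the index set stores bare miroir_id values rather than full keys, which this document does not state.

```rust
/// Key prefix shared by every Miroir key, as described in the keyspace section.
const PREFIX: &str = "miroir";

/// Builds the hash key for one task: `miroir:tasks:<miroir_id>`.
fn task_key(miroir_id: &str) -> String {
    format!("{}:tasks:{}", PREFIX, miroir_id)
}

/// Name of the secondary index set: `miroir:tasks:_index`.
fn tasks_index_key() -> String {
    format!("{}:tasks:_index", PREFIX)
}

/// Given the members of the index set (assumed to be bare task ids),
/// derives the hash keys to read. No SCAN over the keyspace is involved.
fn keys_from_index(members: &[&str]) -> Vec<String> {
    members.iter().map(|id| task_key(id)).collect()
}
```

Listing all tasks then costs one read of the index set plus one hash read per member, which is where the O(cardinality) behaviour comes from.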

Table 2: `node_settings_version`

Key pattern: miroir:node_settings_version:<index_uid>:<node_id> (hash)
Index: miroir:node_settings_version:_index (set)

Field	Type	Example Size
index_uid	string	~20 bytes
node_id	string	~20 bytes
version	string	~10 bytes
updated_at	string	~10 bytes

Estimated per-entry memory: ~100-150 bytes

Table 3: `aliases`

Key pattern: miroir:aliases:<name> (hash)
Index: miroir:aliases:_index (set)

Field	Type	Example Size
name	string	~30 bytes
kind	string	~10 bytes
current_uid	string	0-40 bytes
target_uids	string	0-100 bytes
version	string	~10 bytes
created_at	string	~10 bytes
history	string	~50 bytes

Estimated per-entry memory: ~200-300 bytes

Table 4: `sessions`

Key pattern: miroir:session:<session_id> (hash with EXPIRE)

Field	Type	Example Size
session_id	string	~40 bytes
last_write_mtask_id	string	0-40 bytes
last_write_at	string	0-10 bytes
pinned_group	string	0-10 bytes
min_settings_version	string	~10 bytes
ttl	string	~10 bytes

Estimated per-entry memory: ~150-200 bytes

Note: Sessions have TTL set via Redis EXPIRE and are automatically garbage-collected.

Table 5: `idempotency_cache`

Key pattern: miroir:idemp:<key> (hash with EXPIRE)

Field	Type	Example Size
key	string	~50 bytes
body_sha256	string	~64 bytes
miroir_task_id	string	~40 bytes
expires_at	string	~10 bytes

Estimated per-entry memory: ~200-250 bytes

Note: Entries have TTL set via Redis EXPIRE and are automatically garbage-collected.
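
The fields above suggest a lookup keyed by the idempotency key and compared on body_sha256. The following is a sketch under assumptions, not the store's actual logic: the `Idempotency` enum, the `check` function, and in particular the "same key, different body is a conflict" rule are hypothetical, and a plain HashMap stands in for Redis.

```rust
use std::collections::HashMap;

/// Outcome of an idempotency lookup (hypothetical type for illustration).
#[derive(Debug, PartialEq)]
enum Idempotency {
    /// Key not seen, or its entry has expired: proceed and record a new task.
    Fresh,
    /// Same key and same body hash: reuse the task id recorded earlier.
    Replay(String),
    /// Same key with a different body hash (assumed to be rejected).
    Conflict,
}

/// Mirrors the hash fields listed for `miroir:idemp:<key>`.
struct Entry {
    body_sha256: String,
    miroir_task_id: String,
    expires_at: u64,
}

/// Looks up `key` and compares body hashes; `now` uses the same unit as `expires_at`.
fn check(cache: &HashMap<String, Entry>, key: &str, body_sha256: &str, now: u64) -> Idempotency {
    match cache.get(key) {
        Some(e) if e.expires_at > now => {
            if e.body_sha256 == body_sha256 {
                Idempotency::Replay(e.miroir_task_id.clone())
            } else {
                Idempotency::Conflict
            }
        }
        _ => Idempotency::Fresh,
    }
}
```

In the Redis backend the expiry check would be implicit, since EXPIRE removes the hash once the TTL elapses.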

Table 6: `jobs`

Key pattern: miroir:jobs:<id> (hash)
Index: miroir:jobs:_index (set)
Queued: miroir:jobs:_queued (set)

Field	Type	Example Size
id	string	~40 bytes
type	string	~30 bytes
params	string	~100 bytes
state	string	~20 bytes
claimed_by	string	0-20 bytes
claim_expires_at	string	0-10 bytes
progress	string	~50 bytes

Estimated per-entry memory: ~300-400 bytes

Table 7: `leader_lease`

Key pattern: miroir:lease:<scope> (string with EXPIRE)

Estimated per-entry memory: ~50-100 bytes (simple key-value with TTL)

Note: Leases use Redis SET NX EX for distributed coordination.
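
To illustrate the SET NX EX semantics with heartbeat renewal, here is an in-memory sketch. `LeaseTable`, `try_acquire`, and `renew` are hypothetical names, time is passed in explicitly so the behaviour is deterministic, and a HashMap stands in for Redis.

```rust
use std::collections::HashMap;

/// In-memory stand-in for the lease keys; values are (holder, expiry in ms).
struct LeaseTable {
    leases: HashMap<String, (String, u64)>,
}

impl LeaseTable {
    fn new() -> Self {
        LeaseTable { leases: HashMap::new() }
    }

    /// Mimics `SET key holder NX EX ttl`: succeeds only if no live lease exists.
    fn try_acquire(&mut self, scope: &str, holder: &str, now_ms: u64, ttl_ms: u64) -> bool {
        let held = matches!(self.leases.get(scope), Some((_, exp)) if *exp > now_ms);
        if held {
            return false;
        }
        self.leases
            .insert(scope.to_string(), (holder.to_string(), now_ms + ttl_ms));
        true
    }

    /// Heartbeat renewal: extends the lease only if `holder` still owns it.
    fn renew(&mut self, scope: &str, holder: &str, now_ms: u64, ttl_ms: u64) -> bool {
        if let Some((owner, expires_at)) = self.leases.get_mut(scope) {
            if owner.as_str() == holder && *expires_at > now_ms {
                *expires_at = now_ms + ttl_ms;
                return true;
            }
        }
        false
    }
}
```

Against a real Redis server the ownership check and the expiry extension in `renew` would need to happen atomically (for example in a single script), otherwise a lease could be extended after another pod has taken it over.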

Table 8: `canaries`

Key pattern: miroir:canary:<id> (hash)
Index: miroir:canary:_index (set)

Field	Type	Example Size
id	string	~30 bytes
name	string	~40 bytes
index_uid	string	~30 bytes
interval_s	string	~10 bytes
query_json	string	~50 bytes
assertions_json	string	~50 bytes
enabled	string	~5 bytes
created_at	string	~10 bytes

Estimated per-entry memory: ~250-350 bytes

Table 9: `canary_runs`

Key pattern: miroir:canary_runs:<canary_id> (sorted set, ZADD with score=ran_at)

Value: JSON serialization of run data (~100 bytes)
Score: ran_at timestamp

Estimated per-run memory: ~150-200 bytes (including ZSET overhead)

Auto-pruning: Sorted set is trimmed to run_history_per_canary (default 100) on each insert.
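
The trim-on-insert behaviour can be sketched with an ordered set standing in for the Redis sorted set. `RunHistory` is a hypothetical type, and the sketch assumes that the oldest runs (lowest ran_at) are the ones dropped.

```rust
use std::collections::BTreeSet;

/// In-memory stand-in for one `miroir:canary_runs:<canary_id>` sorted set.
/// Entries are ordered by (ran_at score, serialized run).
struct RunHistory {
    runs: BTreeSet<(u64, String)>,
    max_runs: usize,
}

impl RunHistory {
    /// `max_runs` plays the role of run_history_per_canary (default 100).
    fn new(max_runs: usize) -> Self {
        RunHistory { runs: BTreeSet::new(), max_runs }
    }

    /// Adds a run, then drops the oldest entries beyond `max_runs`.
    fn insert(&mut self, ran_at: u64, run_json: &str) {
        self.runs.insert((ran_at, run_json.to_string()));
        while self.runs.len() > self.max_runs {
            self.runs.pop_first();
        }
    }
}
```

In Redis terms this corresponds to a ZADD followed by a rank-based trim such as ZREMRANGEBYRANK, which keeps per-canary memory bounded at roughly max_runs times the per-run size.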

Table 10: `cdc_cursors`

Key pattern: miroir:cdc_cursor:<sink_name>:<index_uid> (hash)
Index: miroir:cdc_cursor:_index:<sink_name> (set)

Field	Type	Example Size
sink_name	string	~30 bytes
index_uid	string	~30 bytes
last_event_seq	string	~10 bytes
updated_at	string	~10 bytes

Estimated per-entry memory: ~120-150 bytes

Table 11: `tenant_map`

Key pattern: miroir:tenant_map:<hex_encoded_api_key_hash> (hash)

Field	Type	Example Size
tenant_id	string	~40 bytes
group_id	string	0-10 bytes

Estimated per-entry memory: ~80-120 bytes

Table 12: `rollover_policies`

Key pattern: miroir:rollover:<name> (hash)
Index: miroir:rollover:_index (set)

Field	Type	Example Size
name	string	~30 bytes
write_alias	string	~30 bytes
read_alias	string	~30 bytes
pattern	string	~30 bytes
triggers_json	string	~100 bytes
retention_json	string	~100 bytes
template_json	string	~200 bytes
enabled	string	~5 bytes

Estimated per-entry memory: ~400-600 bytes

Table 13: `search_ui_config`

Key pattern: miroir:search_ui_config:<index_uid> (hash)

Field	Type	Example Size
index_uid	string	~30 bytes
config_json	string	~200 bytes
updated_at	string	~10 bytes

Estimated per-entry memory: ~250-300 bytes

Table 14: `admin_sessions`

Key pattern: miroir:admin_session:<session_id> (hash with EXPIRE)

Field	Type	Example Size
session_id	string	~40 bytes
csrf_token	string	~40 bytes
admin_key_hash	string	~64 bytes
created_at	string	~10 bytes
expires_at	string	~10 bytes
revoked	string	~5 bytes
user_agent	string	0-100 bytes
source_ip	string	0-20 bytes

Estimated per-entry memory: ~200-300 bytes

Note: Sessions have TTL set via Redis EXPIRE and are automatically garbage-collected.

Auxiliary Keys

Rate Limiting: Search UI

Key pattern: miroir:ratelimit:searchui:<ip> (string with EXPIRE)

Estimated per-entry memory: ~30-50 bytes (simple counter)

Rate Limiting: Admin Login

Key pattern: miroir:ratelimit:adminlogin:<ip> (string with EXPIRE)
Backoff pattern: miroir:ratelimit:adminlogin:backoff:<ip> (hash with EXPIRE)

Estimated per-entry memory: ~30-100 bytes
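
Admin login rate limiting uses exponential backoff. A sketch of how such a delay could be computed follows; the function name and the `base_s` and `cap_s` parameters are hypothetical, since this document does not give the actual base delay or ceiling.

```rust
/// Lockout delay in seconds after `failures` consecutive failed logins.
/// `base_s` and `cap_s` are illustrative parameters, not values from the plan.
fn backoff_seconds(failures: u32, base_s: u64, cap_s: u64) -> u64 {
    if failures == 0 {
        return 0;
    }
    // Double the delay for each failure after the first; clamp the shift
    // so the multiplication cannot overflow.
    let exp = (failures - 1).min(63);
    base_s.saturating_mul(1u64 << exp).min(cap_s)
}
```

With a base of 1 s and a cap of 300 s, the fourth consecutive failure would yield an 8 s delay, and any long run of failures settles at the cap.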

CDC Overflow Buffer

Key pattern: miroir:cdc:overflow:<sink_name> (list)
Byte counter: miroir:cdc:overflow_bytes:<sink_name> (string)

Memory budget: Configurable per sink (default 1 GiB)
Elements: Variable-size JSON blobs
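
The list plus byte-counter pair can be modelled as a queue that tracks its own size. The sketch below is illustrative only: `OverflowBuffer` is a hypothetical type, and it assumes trimming removes the oldest events first, which this document does not specify.

```rust
use std::collections::VecDeque;

/// In-memory model of one sink's overflow list and its byte counter.
struct OverflowBuffer {
    events: VecDeque<Vec<u8>>,
    bytes: usize,
    budget_bytes: usize,
}

impl OverflowBuffer {
    /// `budget_bytes` is the per-sink budget (the default would be 1 << 30).
    fn new(budget_bytes: usize) -> Self {
        OverflowBuffer { events: VecDeque::new(), bytes: 0, budget_bytes }
    }

    /// Appends an event, then trims from the front until the buffer fits
    /// the budget again. Returns how many events were dropped.
    fn push(&mut self, event: Vec<u8>) -> usize {
        self.bytes += event.len();
        self.events.push_back(event);
        let mut dropped = 0;
        while self.bytes > self.budget_bytes {
            match self.events.pop_front() {
                Some(old) => {
                    self.bytes -= old.len();
                    dropped += 1;
                }
                None => break,
            }
        }
        dropped
    }
}
```

Because the bound is on bytes rather than element count, the worst-case memory held per sink is the configured budget regardless of how large individual events are.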

Search UI Scoped Keys

Key pattern: miroir:search_ui_scoped_key:<index_uid> (hash)
Observation: miroir:search_ui_scoped_key_observed:<pod_id>:<index_uid> (hash with EXPIRE, TTL 60s)

Estimated per-entry memory: ~200-300 bytes

Live Pod Registry

Key pattern: miroir:live_pods (sorted set, ZADD with score=timestamp)

Estimated per-pod memory: ~50 bytes
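
Since the registry scores each pod by timestamp, liveness can be read as a range query on the score. The sketch below assumes a pod counts as live when its last timestamp falls within a recent window; the `live_pods` function and the `window` parameter are hypothetical, and a HashMap stands in for the sorted set.

```rust
use std::collections::HashMap;

/// Returns the pods whose last heartbeat is within `window` of `now`,
/// sorted by name. `heartbeats` maps pod id to its last timestamp (the score).
fn live_pods(heartbeats: &HashMap<String, u64>, now: u64, window: u64) -> Vec<String> {
    let cutoff = now.saturating_sub(window);
    let mut pods: Vec<String> = heartbeats
        .iter()
        .filter(|(_, ts)| **ts >= cutoff)
        .map(|(pod, _)| pod.clone())
        .collect();
    pods.sort();
    pods
}
```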

Pub/Sub: Session Revocation

Channel: miroir:admin_session:revoked

Memory overhead: Negligible (Pub/Sub is not persisted)

Capacity Planning

Memory Budget Estimation

For a typical production deployment with the following characteristics:

10,000 active tasks (in-flight or recently completed)
1,000 concurrent sessions (search UI + admin)
1,000 idempotency cache entries (recent deduplication)
100 background jobs (queued/in-progress)
10 leader leases (coordinating reshard/rollover operations)
5 canaries with 100-run history each
50 CDC cursors (per-sink, per-index)
10 rollover policies
20 search UI configs
20 search UI scoped keys
2,000 rate-limit counters

Estimated memory usage:

Component	Count	Size per Item	Subtotal
Tasks	10,000	250 bytes	~2.5 MB
Tasks index	10,000	40 bytes	~400 KB
Sessions	1,000	175 bytes	~175 KB
Idempotency	1,000	225 bytes	~225 KB
Jobs	100	350 bytes	~35 KB
Leases	10	75 bytes	~1 KB
Canaries	5	300 bytes	~1.5 KB
Canary runs	500	175 bytes	~88 KB
CDC cursors	50	135 bytes	~7 KB
Rollover policies	10	500 bytes	~5 KB
Search UI configs	20	275 bytes	~5.5 KB
Scoped keys	20	250 bytes	~5 KB
Rate limiting	2,000	40 bytes	~80 KB
Total			~3.5 MB
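
The total can be re-derived from the counts and per-item sizes in the table. The sketch below copies those figures; `WORKLOAD` and `raw_bytes` are illustrative names, and decimal units (1 MB = 1,000,000 bytes) are assumed.

```rust
/// (component, count, bytes per item), copied from the estimate table.
const WORKLOAD: [(&str, u64, u64); 13] = [
    ("Tasks", 10_000, 250),
    ("Tasks index", 10_000, 40),
    ("Sessions", 1_000, 175),
    ("Idempotency", 1_000, 225),
    ("Jobs", 100, 350),
    ("Leases", 10, 75),
    ("Canaries", 5, 300),
    ("Canary runs", 500, 175),
    ("CDC cursors", 50, 135),
    ("Rollover policies", 10, 500),
    ("Search UI configs", 20, 275),
    ("Scoped keys", 20, 250),
    ("Rate limiting", 2_000, 40),
];

/// Sums count × size over all rows to get the raw data size in bytes.
fn raw_bytes(rows: &[(&str, u64, u64)]) -> u64 {
    rows.iter().map(|(_, count, size)| count * size).sum()
}
```

For the rows above this yields 3,527,000 bytes, which matches the ~3.5 MB total. Tasks and their index account for about 82% of it, so task retention is the dominant sizing lever.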

Redis Memory Overhead

Redis adds memory overhead for:

Hash table overhead: ~20-30% of raw data size
Per-entry pointers: Each key/value pair carries pointer and metadata overhead
Memory allocator fragmentation: Varies by allocator

Conservative estimate: Multiply the raw data size by 1.5 to account for overhead.

Recommended minimum for the above workload: ~6 MB (3.5 MB × 1.5 ≈ 5.3 MB, rounded up)

Per-Pod Memory Growth

In multi-replica deployments:

Live pod registry: ~50 bytes per pod
Scoped key observations: ~250 bytes per pod per index with scoped keys

For 10 replicas with 20 scoped-key indexes: ~50 KB
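
The per-pod figure follows from the two per-entry sizes listed above. A small sketch of the arithmetic, with hypothetical constant and function names:

```rust
/// Approximate size of one entry in the live pod registry.
const LIVE_POD_ENTRY_BYTES: u64 = 50;
/// Approximate size of one scoped-key observation hash.
const SCOPED_KEY_OBSERVATION_BYTES: u64 = 250;

/// Memory that scales with replica count: one registry entry per pod,
/// plus one observation per pod per index with scoped keys.
fn per_pod_growth_bytes(replicas: u64, scoped_key_indexes: u64) -> u64 {
    replicas * LIVE_POD_ENTRY_BYTES
        + replicas * scoped_key_indexes * SCOPED_KEY_OBSERVATION_BYTES
}
```

For 10 replicas and 20 indexes this gives 500 + 50,000 = 50,500 bytes, so the observations dominate and the registry itself is negligible.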

Monitoring

Monitor miroir_cdc_redis_memory_bytes, a Prometheus metric exported by Miroir that tracks the used_memory value reported by the Redis INFO command.

Alert thresholds (plan §14.7):

Warning: > 500 MB
Critical: > 1 GB
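
As an illustration of how the two thresholds partition the metric, the sketch below classifies a used_memory reading. The `Alert` enum and `classify` function are hypothetical, and decimal units are assumed (500 MB = 500,000,000 bytes, 1 GB = 1,000,000,000 bytes), since this document does not say whether binary units are meant.

```rust
/// Alert level for a `used_memory` reading (hypothetical type).
#[derive(Debug, PartialEq)]
enum Alert {
    Normal,
    Warning,
    Critical,
}

const WARNING_BYTES: u64 = 500_000_000; // > 500 MB
const CRITICAL_BYTES: u64 = 1_000_000_000; // > 1 GB

/// Maps a reading in bytes to an alert level; both thresholds are strict.
fn classify(used_memory_bytes: u64) -> Alert {
    if used_memory_bytes > CRITICAL_BYTES {
        Alert::Critical
    } else if used_memory_bytes > WARNING_BYTES {
        Alert::Warning
    } else {
        Alert::Normal
    }
}
```

The ~6 MB recommended minimum for the example workload sits far below the warning threshold, so an alert most likely points at unbounded growth such as a CDC overflow buffer filling up.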

If memory usage grows beyond thresholds:

Increase Redis memory limit
Review task pruning policy (reduce retention period)
Reduce idempotency cache TTL
Check for CDC overflow buffer growth (may indicate sink is down)

Redis Configuration Recommendations

maxmemory-policy

Recommended: allkeys-lru (evict least-recently-used keys when memory limit is reached)

This is safe for Miroir because:

Tasks are eventually pruned to a retention window
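Sessions and idempotency entries carry a TTL and expire on their own, independently of eviction
Critical data (leader leases) is refreshed frequently, so it is unlikely to be selected for eviction; Redis LRU is approximate, so this is not a hard guarantee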

Persistence

For production deployments:

RDB snapshots: Every 5-10 minutes is sufficient (tasks are source-of-truth in Meilisearch)
AOF: Not required (acceptable to lose last few seconds of task updates on failover)

Connection Pooling

Miroir uses redis-rs with the connection-manager feature, which shares a multiplexed connection across tasks and reconnects automatically. No additional configuration is needed.

High Availability

For production multi-replica deployments:

Use Redis Sentinel or Redis Cluster for HA
Configure taskStore.url with Sentinel master name or Cluster endpoints
Miroir's connection-manager handles failover automatically

Testing

Run the integration test suite to verify memory usage under load:

cargo test -p miroir-core --features redis-store test_redis_memory_budget -- --test-threads=1

This test inserts 10k tasks, 1k idempotency entries, and 1k sessions, verifying that the workload can be created successfully. In production, monitor actual RSS via docker stats or Kubernetes metrics.
