miroir/tests/chaos/runbooks
jedarden 304879d32a feat(tests): add chaos test scenarios and runbooks (plan §8, P9.4)
Add comprehensive chaos testing infrastructure for Miroir failure scenarios:

- **TestCluster** harness with chaos helpers:
  - `kill_meili()` / `restart_meili()` for node failure simulation
  - `apply_netem()` / `remove_netem()` for network delay injection
  - `kill_miroir()` / `restart_miroir()` for orchestrator failure
  - Docker-compose stack lifecycle management
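As a rough illustration of what helpers like these could wrap, the sketch below builds the `docker` command lines for killing or restarting a container and for adding or removing a `tc netem` delay. The function names, the container name `meili-1`, and the interface `eth0` are assumptions made for the example; the actual `TestCluster` API may look different.

```rust
use std::process::Command;

/// Arguments for `docker kill <container>` (simulates a hard node failure).
fn kill_args(container: &str) -> Vec<String> {
    vec!["kill".to_string(), container.to_string()]
}

/// Arguments for `docker start <container>` (brings a killed node back).
fn start_args(container: &str) -> Vec<String> {
    vec!["start".to_string(), container.to_string()]
}

/// Arguments for `docker exec <container> tc qdisc add dev <iface> root netem delay <N>ms`.
/// The container needs the NET_ADMIN capability for this to succeed.
fn netem_add_args(container: &str, iface: &str, delay_ms: u32) -> Vec<String> {
    let mut args: Vec<String> = ["exec", container, "tc", "qdisc", "add", "dev", iface]
        .iter()
        .map(|s| s.to_string())
        .collect();
    args.extend(["root", "netem", "delay"].iter().map(|s| s.to_string()));
    args.push(format!("{}ms", delay_ms));
    args
}

/// Arguments for `docker exec <container> tc qdisc del dev <iface> root netem`.
fn netem_del_args(container: &str, iface: &str) -> Vec<String> {
    ["exec", container, "tc", "qdisc", "del", "dev", iface, "root", "netem"]
        .iter()
        .map(|s| s.to_string())
        .collect()
}

/// Runs `docker <args>` and reports whether it exited successfully.
#[allow(dead_code)]
fn docker(args: &[String]) -> std::io::Result<bool> {
    Ok(Command::new("docker").args(args).status()?.success())
}
```

Keeping argument construction separate from execution means the command lines can be checked without a running Docker daemon. For example, `docker(&netem_add_args("meili-1", "eth0", 500))` would inject the delay.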

- **6 chaos test scenarios** (all marked `#[ignore]`):
  1. Kill 1 of 3 nodes (RF=2) → continuous search, no degraded header
  2. Kill 2 of 3 nodes (RF=2) → 503 or partial results with degraded header
  3. Kill 1 of 2 Miroir replicas → zero client-visible downtime
  4. `tc netem` 500 ms delay → searches slow down but still succeed, no errors
  5. Restart killed node → Miroir detects recovery within health check interval
  6. Kill node mid-rebalance → rebalancer pauses, resumes on recovery
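The different expectations for scenarios 1 and 2 follow from replica coverage. With RF=2 on 3 nodes, every shard keeps at least one live replica after a single node failure, but not after two. The sketch below checks this under an assumed round-robin placement (shard `s` on nodes `s`, `s+1`, ... modulo the node count). Miroir's actual placement strategy is not described here, so treat the placement function as hypothetical.

```rust
use std::collections::HashSet;

/// Hypothetical round-robin placement: shard `s` is stored on `rf`
/// consecutive nodes starting at `s % nodes`.
fn replicas(shard: usize, nodes: usize, rf: usize) -> Vec<usize> {
    (0..rf).map(|k| (shard + k) % nodes).collect()
}

/// Returns the shards that have no live replica when the `dead` nodes are down.
fn uncovered_shards(shards: usize, nodes: usize, rf: usize, dead: &HashSet<usize>) -> Vec<usize> {
    (0..shards)
        .filter(|&s| replicas(s, nodes, rf).iter().all(|n| dead.contains(n)))
        .collect()
}
```

With 3 shards, 3 nodes, and RF=2, killing node 0 leaves no shard uncovered, so searches can be served in full. Killing nodes 0 and 1 leaves shard 0 with no live replica, which corresponds to the 503 or degraded partial-result case.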

- **Runbooks** in `tests/chaos/runbooks/scenario*.md`:
  - Manual reproduction steps
  - Expected observables (metrics, headers, errors)
  - Recovery procedures
  - HA vs single-instance differences
  - Operator notes and common causes

- **Updated docker-compose files**:
  - Add `CAP_NET_ADMIN` to all Meilisearch containers so `tc netem` can run inside them

Tests are slow (30+ seconds each) and require docker-compose. Run with:
  cargo test --test chaos -- --ignored --test-threads=1
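For orientation, a test in this style might combine `#[ignore]` with a polling helper that waits for a condition such as "node reported healthy again". The sketch below is an assumption about the shape of such a test, not code from this commit. `wait_until` and the stand-in probe are hypothetical, and a real test would query Miroir instead of counting polls.

```rust
use std::thread;
use std::time::{Duration, Instant};

/// Polls `probe` every `interval` until it returns true or `timeout` elapses.
/// Returns whether the condition was observed in time.
fn wait_until(timeout: Duration, interval: Duration, mut probe: impl FnMut() -> bool) -> bool {
    let deadline = Instant::now() + timeout;
    loop {
        if probe() {
            return true;
        }
        if Instant::now() >= deadline {
            return false;
        }
        thread::sleep(interval);
    }
}

#[test]
#[ignore] // skipped by default; runs only when `--ignored` is passed
fn node_recovery_is_detected() {
    // Stand-in probe: reports "healthy" on the third poll.
    let mut polls = 0;
    let recovered = wait_until(Duration::from_secs(1), Duration::from_millis(10), || {
        polls += 1;
        polls >= 3
    });
    assert!(recovered);
}
```

Running with `--test-threads=1` keeps scenarios from interfering with each other, which matters when they share one docker-compose stack.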

Closes: miroir-89x.4
2026-05-24 10:23:24 -04:00
| File | Last commit | Date |
|------|-------------|------|
| scenario1.md | feat(tests): add chaos test scenarios and runbooks (plan §8, P9.4) | 2026-05-24 10:23:24 -04:00 |
| scenario2.md | feat(tests): add chaos test scenarios and runbooks (plan §8, P9.4) | 2026-05-24 10:23:24 -04:00 |
| scenario3.md | feat(tests): add chaos test scenarios and runbooks (plan §8, P9.4) | 2026-05-24 10:23:24 -04:00 |
| scenario4.md | feat(tests): add chaos test scenarios and runbooks (plan §8, P9.4) | 2026-05-24 10:23:24 -04:00 |
| scenario5.md | feat(tests): add chaos test scenarios and runbooks (plan §8, P9.4) | 2026-05-24 10:23:24 -04:00 |
| scenario6.md | feat(tests): add chaos test scenarios and runbooks (plan §8, P9.4) | 2026-05-24 10:23:24 -04:00 |