miroir/tests/chaos
jedarden 158752fe7b feat(multi-search): implement timeout enforcement and acceptance tests (§13.11)
- Add per-query and total timeout enforcement to MultiSearchExecutor
- Improve SearchResult with helper methods (ok, err, timeout, is_success)
- Fix ModeACoordinator feature-gate compilation issues
- Add comprehensive acceptance tests for multi-search:
  - 5-query batch completion
  - Slow query doesn't block fast queries (parallel execution)
  - Partial failure handling
  - Per-query timeout
  - Total timeout
  - 100-query batch completion

Closes: miroir-uhj.11
2026-05-24 01:54:20 -04:00
01_kill_one_node_rf2.md
02_kill_two_nodes_rf2.md
03_kill_miroir_replica.md
04_network_delay.md
05_restart_node.md
06_kill_during_rebalance.md
README.md
run_all_chaos_tests.sh

Miroir Chaos Tests

This directory contains chaos engineering tests for Miroir. These tests verify system behavior under failure conditions and ensure the system meets its availability and consistency guarantees.

Prerequisites

Start the test environment:

cd /home/coding/miroir/examples
docker-compose -f docker-compose-dev.yml up -d

Wait for all services to be healthy:

docker-compose -f docker-compose-dev.yml ps
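
Checking container status by hand can be replaced with a polling loop. The following is a minimal sketch, not part of the repository: `wait_for_health` is a hypothetical helper, the default URL is the health endpoint listed under Monitoring, and the timeout and interval defaults are illustrative assumptions.

```shell
# Hypothetical helper: poll a health endpoint until it answers with a
# successful HTTP status or the deadline passes.
# Usage: wait_for_health [url] [timeout_seconds] [interval_seconds]
wait_for_health() {
  local url="${1:-http://localhost:7700/health}"   # endpoint from the Monitoring section
  local timeout_s="${2:-60}"                       # assumed default
  local interval_s="${3:-2}"                       # assumed default
  local waited=0

  # curl -f turns HTTP status >= 400 into a non-zero exit code.
  until curl -fsS -o /dev/null "$url" 2>/dev/null; do
    if [ "$waited" -ge "$timeout_s" ]; then
      echo "timed out after ${waited}s waiting for $url" >&2
      return 1
    fi
    sleep "$interval_s"
    waited=$((waited + interval_s))
  done
  echo "healthy: $url"
}
```

Calling `wait_for_health` with no arguments before starting a scenario would block until the endpoint responds, or fail after roughly 60 seconds.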

Scenarios

Each scenario has its own runbook with detailed steps:

  1. Kill 1 of 3 nodes (RF=2): search continues uninterrupted; degraded writes are flagged via a warning header
  2. Kill 2 of 3 nodes (RF=2): shard loss; 503 or partial results, per policy
  3. Kill 1 of 2 Miroir replicas: zero client-visible downtime
  4. Network delay 500ms: search slows, no errors
  5. Restart killed node: Miroir detects the restarted node within the health-check interval
  6. Kill node mid-rebalance: rebalance pauses and resumes; no data loss
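
Most of these scenarios share one shape: inject a fault, then watch what clients see. The sketch below illustrates that shape for a node kill and is not taken from the runbooks, which remain the authoritative steps. `kill_and_probe` is a hypothetical helper; the container name and URL defaults are borrowed from the Monitoring section, and the probe count and interval are assumptions.

```shell
# Hypothetical helper: kill one container, then count failed probes.
# Usage: kill_and_probe [container] [url] [probes] [interval_seconds]
kill_and_probe() {
  local container="${1:-miroir-meili-0}"           # name from the Monitoring section
  local url="${2:-http://localhost:7700/health}"   # endpoint from the Monitoring section
  local probes="${3:-30}"                          # assumed default
  local interval_s="${4:-1}"                       # assumed default
  local i=0 failures=0

  # docker kill sends SIGKILL to the container's main process by default.
  docker kill "$container" >/dev/null || return 2

  while [ "$i" -lt "$probes" ]; do
    # -f makes HTTP status >= 400 a failure; --max-time bounds each probe.
    curl -fsS -o /dev/null --max-time 2 "$url" 2>/dev/null || failures=$((failures + 1))
    i=$((i + 1))
    sleep "$interval_s"
  done

  echo "failures=$failures probes=$probes"
  # Succeed only if no probe failed.
  [ "$failures" -eq 0 ]
}
```

For a scenario whose expectation is "no errors", the function's exit status can serve directly as the pass/fail signal. A scenario that expects 503s would need the opposite check.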

Running Tests

Automated

cd /home/coding/miroir/tests/chaos
./run_all_chaos_tests.sh

Manual

Follow the steps in each scenario's runbook.

Cleanup

Stop the test environment:

cd /home/coding/miroir/examples
docker-compose -f docker-compose-dev.yml down -v

Expected Behaviors

RF=2 Configuration

  • 1 node down: Continued reads, writes degrade with warning header
  • 2 nodes down: Some shards become unavailable; requests return 503 errors or partial results, depending on policy

Miroir Replica Resilience

  • 1 replica down: Zero client-visible downtime (load balancer fails over)

Rebalance Safety

  • Node killed during rebalance: Pauses, resumes on restart, no data loss

Monitoring

During chaos tests, monitor:

  • Miroir logs: docker logs miroir-orchestrator -f
  • Meilisearch logs: docker logs miroir-meili-0 -f
  • Health status: curl http://localhost:7700/health
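
A single `curl` call shows only one moment in time. To line up health changes with the moment a fault is injected, a timestamped probe log may help. This is a minimal sketch: `probe_log` is a hypothetical helper, and the count and interval defaults are assumptions.

```shell
# Hypothetical helper: print one line per probe with a UTC timestamp
# and the HTTP status code.
# Usage: probe_log [url] [count] [interval_seconds]
probe_log() {
  local url="${1:-http://localhost:7700/health}"
  local count="${2:-10}"        # assumed default
  local interval_s="${3:-1}"    # assumed default
  local i=0 status

  while [ "$i" -lt "$count" ]; do
    # -w '%{http_code}' prints the status code; fall back to 000 if curl fails.
    status="$(curl -s -o /dev/null -w '%{http_code}' --max-time 2 "$url")" || status="000"
    printf '%s %s\n' "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$status"
    i=$((i + 1))
    sleep "$interval_s"
  done
}
```

Redirecting the output to a file (for example `probe_log > health.log &`) leaves a record that can be compared against the container logs afterwards.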