Miroir Chaos Tests
This directory contains chaos engineering tests for Miroir. These tests verify system behavior under failure conditions and ensure the system meets its availability and consistency guarantees.
Prerequisites
Start the test environment:
cd /home/coding/miroir/examples
docker-compose -f docker-compose-dev.yml up -d
Wait for all services to be healthy:
docker-compose -f docker-compose-dev.yml ps
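Checking `ps` by hand works, but a script usually needs to block until the environment is ready. The following is a minimal sketch of a polling helper, not part of this repository: `wait_until` is a hypothetical name, and it assumes whole-second timeouts and that the health endpoint listed under Monitoring is a reasonable readiness signal.

```shell
# wait_until TIMEOUT_SECS INTERVAL_SECS COMMAND [ARGS...]
# Hypothetical helper: re-runs COMMAND until it succeeds or roughly
# TIMEOUT_SECS have elapsed. Returns 0 on success, 1 on timeout.
# Only the sleep time is counted, so a slow COMMAND extends the real wait.
wait_until() {
    timeout_secs=$1
    interval_secs=$2
    shift 2
    elapsed=0
    while ! "$@" >/dev/null 2>&1; do
        if [ "$elapsed" -ge "$timeout_secs" ]; then
            return 1
        fi
        sleep "$interval_secs"
        elapsed=$((elapsed + interval_secs))
    done
    return 0
}
```

For example, `wait_until 60 2 curl -fsS http://localhost:7700/health` would retry every 2 seconds for up to about a minute.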
Scenarios
Each scenario has its own runbook with detailed steps:
- [Kill 1 of 3 nodes (RF=2)](./01_kill_one_node_rf2.md): Continuous search; degraded writes warn via header
- [Kill 2 of 3 nodes (RF=2)](./02_kill_two_nodes_rf2.md): Shard loss; 503 or partial results, per policy
- [Kill 1 of 2 Miroir replicas](./03_kill_miroir_replica.md): Zero client-visible downtime
- [Network delay 500ms](./04_network_delay.md): Search slows, no errors
- [Restart killed node](./05_restart_node.md): Miroir detects the node within the health interval
- [Kill node mid-rebalance](./06_kill_during_rebalance.md): Pause + resume; no data loss
Running Tests
Automated
cd /home/coding/miroir/tests/chaos
./run_all_chaos_tests.sh
Manual
Follow the steps in each scenario's runbook.
Cleanup
Stop the test environment:
cd /home/coding/miroir/examples
docker-compose -f docker-compose-dev.yml down -v
Expected Behaviors
RF=2 Configuration
- 1 node down: Continued reads, writes degrade with warning header
- 2 nodes down: Shard unavailability, 503 errors or partial results
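The difference between the two cases follows from replica placement: with RF=2 on 3 nodes, every shard keeps at least one live replica when a single node fails, while losing two nodes can remove both replicas of a shard. The sketch below illustrates this with a made-up placement. `PLACEMENT`, the shard and node names, and `lost_shards` are all hypothetical; Miroir's real placement logic is not shown here and may differ.

```shell
# Hypothetical placement: 3 shards, each stored on 2 of 3 nodes (RF=2).
# Format: <shard>:<node>,<node>
PLACEMENT="s0:n0,n1
s1:n1,n2
s2:n2,n0"

# lost_shards "DOWN_NODES"
# Prints every shard that has no replica left on a live node.
# DOWN_NODES is a space-separated list of node names.
lost_shards() {
    down=" $1 "
    printf '%s\n' "$PLACEMENT" | while IFS=: read -r shard replicas; do
        alive=0
        old_ifs=$IFS
        IFS=,
        for node in $replicas; do
            case "$down" in
                *" $node "*) ;;                 # replica is on a downed node
                *) alive=$((alive + 1)) ;;      # replica is still reachable
            esac
        done
        IFS=$old_ifs
        if [ "$alive" -eq 0 ]; then
            printf '%s\n' "$shard"
        fi
    done
}
```

With this placement, `lost_shards "n0"` prints nothing, and `lost_shards "n0 n1"` prints `s0`, the one shard whose replicas both lived on the killed nodes.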
Miroir Replica Resilience
- 1 replica down: Zero client-visible downtime (load balancer fails over)
Rebalance Safety
- Node killed during rebalance: Pauses, resumes on restart, no data loss
Monitoring
During chaos tests, monitor:
- Miroir logs:
docker logs miroir-orchestrator -f
- Meilisearch logs:
docker logs miroir-meili-0 -f
- Health status:
curl http://localhost:7700/health
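A single `curl` only shows the state at one instant. To check a claim such as "zero client-visible downtime", it can help to probe repeatedly while a scenario runs and count failures. This is a minimal sketch under that assumption: `probe_loop` is a hypothetical helper, not something shipped in this directory.

```shell
# probe_loop COUNT INTERVAL_SECS COMMAND [ARGS...]
# Hypothetical helper: runs COMMAND COUNT times, pausing INTERVAL_SECS
# after each run. Logs each failure to stderr with a UTC timestamp and
# prints a one-line summary "ok=<n> failed=<m>" to stdout.
probe_loop() {
    count=$1
    interval_secs=$2
    shift 2
    ok=0
    failed=0
    i=0
    while [ "$i" -lt "$count" ]; do
        if "$@" >/dev/null 2>&1; then
            ok=$((ok + 1))
        else
            failed=$((failed + 1))
            printf '%s probe %d failed\n' "$(date -u +%H:%M:%SZ)" "$i" >&2
        fi
        i=$((i + 1))
        sleep "$interval_secs"
    done
    printf 'ok=%d failed=%d\n' "$ok" "$failed"
}
```

Running `probe_loop 60 1 curl -fsS http://localhost:7700/health` in a second terminal while killing a replica would give a rough count of failed requests during the failover window.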