miroir/docs
jedarden 8f91d6998f P12.OP1: Shard migration write safety - chaos testing
Extended chaos test coverage from 14 to 19 tests and created
comprehensive documentation for safe shard migrations.

New Chaos Tests:
- cutover_chaos_network_partition_new_node: Network partition during cutover
- cutover_chaos_drain_timeout_boundary: Drain timeout boundary conditions
- cutover_chaos_concurrent_migrations: Multiple simultaneous migrations
- cutover_chaos_partial_shard_failure: Varying failure rates per shard
- cutover_chaos_coordinator_crash_recovery: Coordinator crash and restart

Documentation:
- docs/chaos_testing_report.md: Test coverage, findings, recommendations
- docs/migration_runbook.md: Operational procedures, rollback, troubleshooting
- notes/bf-4d9a.md: Task summary and completion report

Key Findings:
- Delta pass provides zero-loss cutover (validated across 19 tests)
- Anti-entropy (AE) on + delta on: 0.000% loss (recommended)
- AE off + delta on: 0.000% loss (safe, but no defense-in-depth)
- AE off + delta skipped: ~2% loss (this configuration is blocked by the coordinator)
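
The configuration matrix above can be read as a small decision table. The following is a minimal sketch of such a guard, not the coordinator's actual code: `CutoverConfig`, `CutoverDecision`, and `evaluate_cutover` are hypothetical names introduced for illustration, and the "AE on + delta skipped" case is not covered by the findings, so it is reported as such rather than guessed.

```rust
/// Hypothetical cutover settings; these names are illustrative and are
/// not taken from the miroir codebase.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct CutoverConfig {
    /// Whether anti-entropy (AE) is enabled during the migration.
    pub anti_entropy: bool,
    /// Whether the delta pass runs before cutover.
    pub delta_pass: bool,
}

/// Outcome of checking a configuration against the reported findings.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum CutoverDecision {
    /// AE on + delta on: 0.000% loss, the recommended setup.
    Recommended,
    /// AE off + delta on: 0.000% loss, but no defense-in-depth.
    AllowedNoDefenseInDepth,
    /// AE off + delta skipped: ~2% loss, so the cutover is refused.
    Blocked,
    /// AE on + delta skipped: not listed in the findings (assumption:
    /// treat it as unknown instead of inventing a result).
    NotCovered,
}

/// Maps a configuration onto the findings listed in the commit message.
pub fn evaluate_cutover(cfg: CutoverConfig) -> CutoverDecision {
    match (cfg.anti_entropy, cfg.delta_pass) {
        (true, true) => CutoverDecision::Recommended,
        (false, true) => CutoverDecision::AllowedNoDefenseInDepth,
        (false, false) => CutoverDecision::Blocked,
        (true, false) => CutoverDecision::NotCovered,
    }
}
```

A caller would refuse to start the cutover whenever the result is `Blocked`, and would likely want to treat `NotCovered` conservatively until that combination has been tested.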

All success criteria met:
- Cutover boundary chaos tests pass with anti-entropy enabled
- Data loss windows without anti-entropy documented and bounded
- Release notes include clear guidance on anti-entropy during migrations

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-08 15:29:48 -04:00
benchmarks P12.OP3: Validate 2× transient load caveat and add CLI schedule window guard 2026-04-18 22:00:57 -04:00
dump-import P12.OP5: Add dump import compatibility matrix 2026-04-18 21:06:46 -04:00
notes Add repo hygiene: LICENSE, CHANGELOG, .gitignore 2026-04-18 20:47:36 -04:00
plan Add repo hygiene: LICENSE, CHANGELOG, .gitignore 2026-04-18 20:47:36 -04:00
research P12.OP2: Finalize Raft research — correct openraft version, update benchmarks, suppress warnings 2026-04-18 22:37:20 -04:00
arm64-support.md OP#6: Add ARM64 (aarch64-unknown-linux-musl) target support 2026-05-08 15:25:12 -04:00
chaos_testing_report.md P12.OP1: Shard migration write safety - chaos testing 2026-05-08 15:29:48 -04:00
migration_runbook.md P12.OP1: Shard migration write safety - chaos testing 2026-05-08 15:29:48 -04:00
trade-offs.md OP#3: Document S-change (resharding) vs N-change (node scaling) trade-offs 2026-05-08 15:25:53 -04:00