OP#3: Add completion notes for resharding vs scaling documentation

Add notes/bf-5xs1.md documenting the completion of OP#3 work.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
jedarden 2026-05-08 15:26:37 -04:00
parent 4a3c05473e
commit 3491f9e7da

notes/bf-5xs1.md

@@ -0,0 +1,60 @@
# OP#3: Resharding (S-change) vs. Node Scaling (N-change) - COMPLETED
## Task Summary
Completed the remaining work for Open Problem #3 by documenting the trade-offs between S-change (resharding) and N-change (node scaling).
## What Already Existed
The codebase already had the implementation and its validation in place:
1. **Benchmark resharding with realistic corpora** - `docs/benchmarks/resharding-load.md`
- Comprehensive test matrix covering small/medium/large documents
- Empirically validates the 2× transient storage and write load caveat
- All scenarios confirm exactly 2.0× storage and dual-write amplification
2. **CLI schedule guidance for off-peak windows** - `crates/miroir-ctl/src/commands/reshard.rs`
- Schedule window guard with time-based restrictions
- `--schedule-window` flag to specify allowed windows
- `--force` override with warnings
- Dry run support
3. **Core resharding implementation** - `crates/miroir-core/src/reshard.rs`
- Six-phase shadow-index operation (§13.1)
- Configuration management
- Load simulation and validation
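The schedule window guard in item 2 can be pictured with a minimal sketch. This is not the code in `crates/miroir-ctl/src/commands/reshard.rs`: the `ScheduleWindow` type, the whole-hour UTC granularity, the `GuardDecision` enum, and `check_window` are all hypothetical names chosen for illustration, and the real `--schedule-window` value format is not shown in these notes.

```rust
/// Hypothetical daily window in whole UTC hours, half-open: [start_hour, end_hour).
/// (Assumption: the real flag format is not documented in this note.)
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct ScheduleWindow {
    start_hour: u8,
    end_hour: u8,
}

impl ScheduleWindow {
    /// True if `hour` (0..=23) falls inside the window. A window whose start
    /// is later than its end wraps past midnight (for example 22 -> 4).
    /// A window with start == end is treated as empty.
    fn contains(&self, hour: u8) -> bool {
        if self.start_hour <= self.end_hour {
            self.start_hour <= hour && hour < self.end_hour
        } else {
            hour >= self.start_hour || hour < self.end_hour
        }
    }
}

/// Outcome of the guard, mirroring the "refuse unless forced" behaviour
/// described above.
#[derive(Debug, PartialEq, Eq)]
enum GuardDecision {
    Proceed,
    ProceedWithWarning,
    Refuse,
}

/// Inside the window: proceed. Outside it: refuse, unless `force` is set,
/// in which case proceed but surface a warning to the operator.
fn check_window(window: ScheduleWindow, hour: u8, force: bool) -> GuardDecision {
    if window.contains(hour) {
        GuardDecision::Proceed
    } else if force {
        GuardDecision::ProceedWithWarning
    } else {
        GuardDecision::Refuse
    }
}
```

With a window of 22 to 4, `check_window` at hour 12 refuses without `force` and proceeds with a warning when `force` is set. The half-open interval keeps adjacent windows from overlapping at the boundary hour.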
## What Was Added
Added a "Resharding (S-Change) vs Node Scaling (N-Change)" section to `docs/trade-offs.md`:
- **Core distinction table**: Compares N-change (node scaling) vs S-change (resharding)
- **Node scaling explanation**: Lightweight operation using rendezvous hashing, moves only ~1/N of documents
- **Resharding explanation**: Heavy operation with 2× transient storage/write amplification
- **Decision matrix**: Helps operators choose the right approach based on symptoms
- **Capacity planning guidance**: the formula `S = max_nodes_per_group_ever × 8`, with rationale
- **Operator guidance**: Steps to follow if resharding is necessary
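The capacity formula and the 2× transient storage caveat both reduce to simple arithmetic, sketched below. The function names are hypothetical and do not come from the miroir crates. Only the factor 8 and the 2× multiplier are taken from these notes.

```rust
/// Headroom factor from the formula S = max_nodes_per_group_ever × 8.
const SHARDS_PER_NODE: u32 = 8;

/// Shard count to choose at index creation, given the largest node count
/// a group is ever expected to reach. Returns None for zero nodes or on
/// arithmetic overflow.
fn recommended_shard_count(max_nodes_per_group_ever: u32) -> Option<u32> {
    if max_nodes_per_group_ever == 0 {
        return None;
    }
    max_nodes_per_group_ever.checked_mul(SHARDS_PER_NODE)
}

/// Peak storage during a reshard, using the 2× transient multiplier:
/// the existing index and its shadow copy coexist until cutover.
fn peak_reshard_bytes(steady_state_bytes: u64) -> Option<u64> {
    steady_state_bytes.checked_mul(2)
}

/// True if the cluster has room for the transient shadow copy.
fn reshard_fits(steady_state_bytes: u64, capacity_bytes: u64) -> bool {
    matches!(peak_reshard_bytes(steady_state_bytes), Some(peak) if peak <= capacity_bytes)
}
```

For example, a group expected to top out at 12 nodes gives `recommended_shard_count(12) == Some(96)`, and an index holding 600 units on a cluster with 1000 units of capacity fails `reshard_fits`, because the transient peak would be 1200.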
## Key Insights
The main insight for operators: **Node scaling is lightweight; resharding is heavy.**
- Adding a node: only ~1/N of documents move (those whose top-ranked node changes), where N is the node count after the addition
- Resharding: every document's shard assignment changes, requiring full dual-hash dual-write
- Prefer N-change over S-change whenever possible
- Choose S generously at index creation to avoid ever needing to reshard
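The "only ~1/N of documents move" property of rendezvous hashing can be checked with a small simulation. This is a sketch under stated assumptions, not miroir's placement code: it hashes a `(key, node)` pair with the standard library's `DefaultHasher`, uses integer document keys and node IDs, and all function names are hypothetical.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hash any hashable value to a u64 with the standard library's default hasher.
/// (Assumption: the real system may use a different hash function.)
fn hash_of<T: Hash>(value: &T) -> u64 {
    let mut hasher = DefaultHasher::new();
    value.hash(&mut hasher);
    hasher.finish()
}

/// Rendezvous (highest-random-weight) placement: score every node for this
/// key and pick the node with the highest score.
fn rendezvous_owner(key: u64, nodes: u32) -> u32 {
    (0..nodes)
        .max_by_key(|node| hash_of(&(key, *node)))
        .expect("at least one node")
}

/// Fraction of `keys` documents whose owner changes when the node count
/// goes from `old_nodes` to `new_nodes`.
fn moved_fraction(keys: u64, old_nodes: u32, new_nodes: u32) -> f64 {
    let moved = (0..keys)
        .filter(|&k| rendezvous_owner(k, old_nodes) != rendezvous_owner(k, new_nodes))
        .count();
    moved as f64 / keys as f64
}
```

Calling `moved_fraction(10_000, 9, 10)` should land close to 0.1, since a document moves only when the new node outscores its current owner. Every document that moves goes to the new node, so existing nodes never exchange data with each other.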
## Commit
```
commit 1fa5187
OP#3: Document S-change (resharding) vs N-change (node scaling) trade-offs
```
## Status
**COMPLETED** - All OP#3 requirements addressed:
- ✅ Benchmark resharding operations with realistic document distributions
- ✅ Validate transient storage and write load multiplier assumptions
- ✅ Add CLI schedule guidance for off-peak reshard windows
- ✅ Document trade-offs between S-change (resharding) and N-change (node scaling)