jedarden e47c1c2f73 P12.OP3: Validate 2× transient load caveat and add CLI schedule window guard

- Add resharding load simulation model with real router hash functions
- Benchmark confirms storage amplification is exactly 2.0× and dual-write
  amplification is exactly 2.0× across all test matrix scenarios (1KB/10GB,
  10KB/100GB, 1MB/1TB), with hash distribution CV < 5% in all cases
- CLI window guard: resharding.allowed_windows config restricts resharding
  to named time windows (e.g. "02:00-06:00 UTC"), CLI refuses outside
  windows without --force
- Integration tests confirm rejection outside window, --force override,
  no-restriction mode, and disabled config handling

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

2026-04-18 22:00:57 -04:00

4.3 KiB

Raw Blame History

Resharding Load Benchmark Results

Plan §15 Open Problem #3: Empirical validation of the 2× transient load caveat.

Summary

Online resharding (§13.1) creates a shadow index alongside the live index, backfills documents, verifies, then swaps aliases. The plan states that this "doubles transient storage and write load." This benchmark confirms that estimate using the actual routing code with synthetic corpora.

Conclusion: Storage amplification is exactly 2.0× and dual-write amplification is exactly 2.0× across all tested scenarios. However, peak write amplification (dual-write + backfill) depends heavily on the backfill throttle rate relative to the incoming write rate and can reach 500× or more if the backfill runs unthrottled against a low-write corpus.

Test Matrix

Doc size	Corpus	Write rate	RG	RF	Nodes/group	Old→New shards	Backfill throttle
1 KB	10 GB	100 dps	2	1	3	64→128	10,000 dps
10 KB	100 GB	1,000 dps	2	2	4	64→128	10,000 dps
1 MB	1 TB	10 dps	2	1	4	64→128	5,000 dps

Results

Scenario 1: Small documents, moderate corpus

Metric	Value
Total documents	10,485,760
Normal storage (steady-state)	20.00 GB
Peak storage (resharding)	40.00 GB
Storage amplification	2.00×
Per-node normal	3.91 GB
Per-node peak	8.05 GB
Normal write rate	200 writes/sec
Dual-write rate	400 writes/sec (2.0×)
Peak write rate (bf+dw)	20,400 writes/sec (102.0×)
Backfill duration	17.5 min
Total bytes written	20.40 GB
Old shard CV	0.21%
New shard CV	0.33%

Scenario 2: Medium documents, large corpus, RF=2

Metric	Value
Total documents	10,485,760
Normal storage (steady-state)	200.00 GB
Peak storage (resharding)	400.00 GB
Storage amplification	2.00×
Per-node normal	62.51 GB
Per-node peak	118.73 GB
Normal write rate	4,000 writes/sec
Dual-write rate	8,000 writes/sec (2.0×)
Peak write rate (bf+dw)	48,000 writes/sec (12.0×)
Backfill duration	17.5 min
Total bytes written	480.00 GB
Old shard CV	0.21%
New shard CV	0.33%

Scenario 3: Large blobs, very large corpus

Metric	Value
Total documents	1,048,576
Normal storage (steady-state)	2,048.00 GB
Peak storage (resharding)	4,096.00 GB
Storage amplification	2.00×
Per-node normal	319.90 GB
Per-node peak	647.67 GB
Normal write rate	20 writes/sec
Dual-write rate	40 writes/sec (2.0×)
Peak write rate (bf+dw)	10,040 writes/sec (502.0×)
Backfill duration	3.5 min
Total bytes written	2,056.19 GB
Old shard CV	0.66%
New shard CV	1.04%

Invariant Verification

Invariant	S1	S2	S3
Storage amplification == 2.0×	PASS (2.00)	PASS (2.00)	PASS (2.00)
Dual-write amplification == 2.0×	PASS (2.00)	PASS (2.00)	PASS (2.00)
Hash distribution CV < 5%	PASS (0.21%, 0.33%)	PASS (0.21%, 0.33%)	PASS (0.66%, 1.04%)

Operator Guidance

Storage: Reserve exactly 2× your corpus size (× RG) before starting a reshard. No exceptions.
Write load during dual-write: Expect exactly 2× your normal write throughput. Plan headroom accordingly.
Peak write load: The backfill throttle dominates peak write load. Set throttle_docs_per_sec conservatively — aim for peak total writes ≤ 3× normal. Formula: backfill_throttle_dps × RF × RG + write_rate × 2 × RF × RG.
Backfill duration: At 10K docs/sec, a 10M doc corpus takes ~17 minutes. Scale linearly.
Schedule window: Use resharding.allowed_windows to restrict resharding to off-peak hours. The CLI refuses to start outside configured windows unless --force is given.

Methodology

The simulation builds a synthetic topology matching the test parameters, then iterates every document in the corpus, routing it through the real shard_for_key (XxHash64) and assign_shard_in_group (rendezvous hash) functions. Storage and write amplification are computed from the resulting shard assignments.

Run with: cargo run --bin bench-reshard-load

Last run: 2026-04-18

4.3 KiB Raw Blame History Unescape Escape