## Retrospective

- **What worked:** The state machine approach with clear phase transitions (Initializing → Syncing → SyncComplete → Active) made the flow easy to understand and test. Separating the coordinator from the sync worker allowed for clean testing.
- **What didn't:** Initial implementation had the sync worker running in a tight loop; needed to add configurable intervals and proper timeout handling.
- **Surprise:** The query routing already filtered by group state, so the 'queries NOT routed to initializing groups' requirement was already satisfied by existing logic.
- **Reusable pattern:** For future multi-phase operations, use a Coordinator + Worker pattern where the coordinator manages state/progress and the worker performs the actual work with periodic checkpoints.
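The Coordinator + Worker pattern, together with the interval and timeout fix mentioned under "What didn't", can be sketched roughly as follows. This is a minimal illustration, not the code in this bead: `SyncConfig`, `Progress`, `SyncOutcome`, and `run_worker` are hypothetical names, and a blocking `std::thread::sleep` stands in for whatever scheduling the real background task uses.

```rust
use std::time::{Duration, Instant};

/// Hypothetical worker settings; field names are illustrative only.
pub struct SyncConfig {
    /// Pause between batches, so the worker never spins in a tight loop.
    pub poll_interval: Duration,
    /// Upper bound on the total run time of one sync attempt.
    pub timeout: Duration,
}

/// Coordinator-side progress record (the "checkpoint").
#[derive(Debug, Default)]
pub struct Progress {
    pub batches_done: u32,
}

#[derive(Debug, PartialEq, Eq)]
pub enum SyncOutcome {
    Completed { batches: u32 },
    TimedOut { batches: u32 },
}

/// Worker loop: calls `step` once per batch, records a checkpoint after
/// each batch, and sleeps `poll_interval` between batches.
/// `step` receives the number of batches already done and returns
/// `true` while more work remains.
pub fn run_worker<F>(cfg: &SyncConfig, progress: &mut Progress, mut step: F) -> SyncOutcome
where
    F: FnMut(u32) -> bool,
{
    let started = Instant::now();
    loop {
        // Timeout is checked before each batch.
        if started.elapsed() >= cfg.timeout {
            return SyncOutcome::TimedOut { batches: progress.batches_done };
        }
        let more = step(progress.batches_done);
        progress.batches_done += 1; // periodic checkpoint
        if !more {
            return SyncOutcome::Completed { batches: progress.batches_done };
        }
        std::thread::sleep(cfg.poll_interval);
    }
}
```

Because the progress record lives outside the worker, a test can drive `run_worker` with a closure and inspect `Progress` afterwards without any real I/O.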
# P4.4 Replica Group Addition: Initializing → Active

## Summary
This bead implements the "Adding a new replica group" flow from plan §2, enabling horizontal query scaling by adding new replica groups without interrupting existing queries.
## Implementation

### Core Components

- **`GroupAdditionCoordinator`** (`crates/miroir-core/src/group_addition.rs`)
  - State machine for group addition: Initializing → Syncing → SyncComplete → Active
  - Per-shard sync state tracking with round-robin source group selection
  - Progress tracking and timeout handling
- **`GroupSyncWorker`** (`crates/miroir-core/src/group_sync_worker.rs`)
  - Background worker that copies documents from existing groups to new group
  - Paginated sync using `filter=_miroir_shard={id}`
  - Handles source unavailability gracefully (pauses and resumes)
- **Admin API** (`crates/miroir-proxy/src/routes/admin_endpoints.rs`)
  - `POST /_miroir/replica_groups` - Add new replica group
  - `GET /_miroir/replica_groups/{id}/status` - Check sync progress
  - `POST /_miroir/replica_groups/{id}/activate` - Mark group as active
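To make the coordinator's phase progression concrete, here is a minimal sketch of a strictly linear state machine over the four phases, plus one plausible reading of round-robin source selection (shard `i` reads from source `i mod n`). The names `GroupPhase`, `InvalidTransition`, and `pick_source` are hypothetical, and the actual types in `group_addition.rs` may be shaped differently.

```rust
/// Hypothetical phase enum mirroring the four states named above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum GroupPhase {
    Initializing,
    Syncing,
    SyncComplete,
    Active,
}

/// Returned when a transition would skip or reverse a phase.
#[derive(Debug, PartialEq, Eq)]
pub struct InvalidTransition {
    pub from: GroupPhase,
    pub to: GroupPhase,
}

impl GroupPhase {
    /// The only phase that may follow `self`; `None` once the group is Active.
    pub fn next(self) -> Option<GroupPhase> {
        match self {
            GroupPhase::Initializing => Some(GroupPhase::Syncing),
            GroupPhase::Syncing => Some(GroupPhase::SyncComplete),
            GroupPhase::SyncComplete => Some(GroupPhase::Active),
            GroupPhase::Active => None,
        }
    }

    /// Move to `to` only if it is the immediate successor of `self`.
    pub fn transition(self, to: GroupPhase) -> Result<GroupPhase, InvalidTransition> {
        if self.next() == Some(to) {
            Ok(to)
        } else {
            Err(InvalidTransition { from: self, to })
        }
    }
}

/// Assumed round-robin rule: shard `shard_id` reads from
/// `source_groups[shard_id % source_groups.len()]`.
/// Returns `None` when there are no source groups.
pub fn pick_source(shard_id: usize, source_groups: &[u32]) -> Option<u32> {
    if source_groups.is_empty() {
        None
    } else {
        Some(source_groups[shard_id % source_groups.len()])
    }
}
```

Rejecting every transition except the immediate successor keeps a group from being marked Active before its sync has completed.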
### Key Behaviors

- **Query routing:** During sync, queries only route to `Active` groups (not `Initializing`)
- **Write fan-out:** New writes immediately fan out to all groups including the new one
- **Zero read interruption:** Existing groups continue serving queries throughout
- **Round-robin source selection:** Spreads read load during sync across source groups
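The asymmetry between reads and writes can be illustrated with a short sketch: reads consider only `Active` groups, while writes go to every group regardless of state. `GroupState`, `ReplicaGroup`, and the three functions below are hypothetical stand-ins, and the request counter is an assumed way of rotating between eligible groups.

```rust
/// Hypothetical group state, mirroring the phases used in this bead.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum GroupState {
    Initializing,
    Syncing,
    SyncComplete,
    Active,
}

/// Hypothetical replica group record.
#[derive(Debug, Clone)]
pub struct ReplicaGroup {
    pub id: u32,
    pub state: GroupState,
}

/// Reads: only groups in the Active state are eligible.
pub fn query_targets(groups: &[ReplicaGroup]) -> Vec<u32> {
    groups
        .iter()
        .filter(|g| g.state == GroupState::Active)
        .map(|g| g.id)
        .collect()
}

/// Writes: fan out to every group, including one that is still syncing.
pub fn write_targets(groups: &[ReplicaGroup]) -> Vec<u32> {
    groups.iter().map(|g| g.id).collect()
}

/// Round-robin over the eligible read groups, keyed by a request counter.
/// Returns `None` when no group is Active.
pub fn route_query(groups: &[ReplicaGroup], request_counter: usize) -> Option<u32> {
    let eligible = query_targets(groups);
    if eligible.is_empty() {
        None
    } else {
        Some(eligible[request_counter % eligible.len()])
    }
}
```

With one `Active` and one `Syncing` group, every query lands on the first group while writes reach both. Once the second group becomes `Active`, consecutive counter values alternate between the two.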
## Acceptance Tests

All 4 acceptance tests pass:

- ✓ During sync, query throughput on original group unchanged
- ✓ After `active`, queries distribute round-robin between groups
- ✓ Mid-sync writes present on both groups after sync
- ✓ Failed sync pauses and resumes when source returns
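The pause-and-resume behaviour in the last test can be sketched with a per-shard cursor that survives a source failure. This is an assumption-laden illustration rather than the contents of `p44_replica_group_addition.rs`: `ShardSource`, `ShardSync`, `FlakySource`, and `shard_filter` are hypothetical, documents are reduced to `u64` ids, and offset-based paging is assumed.

```rust
/// Hypothetical error for an unreachable source group.
#[derive(Debug, PartialEq, Eq)]
pub struct SourceUnavailable;

/// Hypothetical source abstraction: one page of document ids for a shard.
pub trait ShardSource {
    fn fetch_page(&mut self, shard_id: u32, offset: usize, limit: usize)
        -> Result<Vec<u64>, SourceUnavailable>;
}

#[derive(Debug, PartialEq, Eq)]
pub enum ShardSyncStatus {
    Done,
    Paused,
}

/// Per-shard cursor; it keeps its offset across pauses.
#[derive(Debug, Default)]
pub struct ShardSync {
    pub offset: usize,
    pub copied: Vec<u64>,
}

impl ShardSync {
    /// Copy pages until a short (or empty) page is seen, or pause on failure.
    pub fn run<S: ShardSource>(&mut self, source: &mut S, shard_id: u32, page_size: usize) -> ShardSyncStatus {
        loop {
            match source.fetch_page(shard_id, self.offset, page_size) {
                // Keep the cursor where it is so a later call resumes here.
                Err(SourceUnavailable) => return ShardSyncStatus::Paused,
                Ok(page) => {
                    let n = page.len();
                    self.copied.extend(page);
                    self.offset += n;
                    if n == 0 || n < page_size {
                        return ShardSyncStatus::Done;
                    }
                }
            }
        }
    }
}

/// Builds the shard filter expression in the form quoted in this document.
pub fn shard_filter(shard_id: u32) -> String {
    format!("_miroir_shard={}", shard_id)
}

/// In-memory test double that fails once at a chosen offset.
pub struct FlakySource {
    pub docs: Vec<u64>,
    pub fail_on_offset: Option<usize>,
}

impl ShardSource for FlakySource {
    fn fetch_page(&mut self, _shard_id: u32, offset: usize, limit: usize)
        -> Result<Vec<u64>, SourceUnavailable> {
        if self.fail_on_offset == Some(offset) {
            self.fail_on_offset = None; // the source "returns" on the next call
            return Err(SourceUnavailable);
        }
        Ok(self.docs.iter().skip(offset).take(limit).copied().collect())
    }
}
```

Calling `run` a second time after a `Paused` result continues from the saved offset, so pages copied before the failure are not fetched again.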
## Files Modified

- `crates/miroir-core/src/group_addition.rs` - New file
- `crates/miroir-core/src/group_sync_worker.rs` - New file
- `crates/miroir-core/tests/p44_replica_group_addition.rs` - New test file
- `crates/miroir-proxy/src/routes/admin_endpoints.rs` - Added group addition endpoints
- `crates/miroir-proxy/src/routes/admin.rs` - Added routes
- `crates/miroir-proxy/src/main.rs` - Added sync worker background task