Verified that all acceptance criteria are met:
- Fingerprint → diff → repair pipeline implemented
- TTL interaction for expired documents
- CDC suppression via origin tag
- Mode A scaling with rendezvous-owned shards
- All 9 acceptance tests passing
- Prometheus metrics and alert defined

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Bead-Id: miroir-uhj.8
P5.8 §13.8 Anti-entropy shard reconciler - Verification Summary
Bead: miroir-uhj.8
Implementation Status: COMPLETE ✓
The anti-entropy shard reconciler (plan §13.8) is fully implemented and tested.
Core Components
- crates/miroir-core/src/anti_entropy.rs - Core reconciler
  - AntiEntropyReconciler with fingerprint → diff → repair pipeline
  - ShardFingerprint for Merkle tree fingerprints with bucket hashes
  - ReplicaDiff for divergence detection
  - RepairAction and RepairReason for repair tracking
  - TTL interaction: expired documents are deleted from all replicas
  - Mode A scaling: each pod scans rendezvous-owned shards
  - Metrics callbacks for Prometheus integration
- crates/miroir-core/src/rebalancer_worker/anti_entropy_worker.rs - Background worker
  - AntiEntropyWorker with leader election via advisory lock
  - HttpNodeClient for node communication
  - Schedule parsing ("every 6h" format)
  - Leader lease management with renewal
  - Metrics integration
- crates/miroir-core/src/config/advanced.rs - Configuration
  - AntiEntropyConfig with all required fields
  - Defaults: enabled: true, schedule: "every 6h", auto_repair: true
  - updated_at_field and expires_at_field for TTL interaction
- crates/miroir-proxy/src/routes/documents.rs - Write path integration
  - _miroir_updated_at stamping when anti_entropy.enabled: true
  - Reserved field rejection when enabled
- crates/miroir-proxy/src/middleware.rs - Prometheus metrics
  - miroir_antientropy_shards_scanned_total
  - miroir_antientropy_mismatches_found_total
  - miroir_antientropy_docs_repaired_total
  - miroir_antientropy_last_scan_completed_seconds
- charts/miroir/templates/miroir-prometheusrule.yaml - Alert
  - MiroirAntientropyMismatch: fires when increase(miroir_antientropy_mismatches_found_total[18h]) > 0
  - The 18h window covers 3 consecutive passes at the default 6h schedule
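The fingerprint → diff step listed for anti_entropy.rs can be illustrated with a minimal sketch. It assumes documents are assigned to a fixed number of buckets by hashing the document ID, that each bucket hash is the XOR of its per-document hashes, and that replicas are compared bucket by bucket so only mismatching buckets need a document-level diff. The names hash_of, bucket_of, fingerprint, and diverging_buckets, and the bucket count, are hypothetical and not taken from the crate; the actual ShardFingerprint may build a deeper Merkle tree.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical bucket count; the real reconciler may choose differently.
const BUCKETS: usize = 8;

fn hash_of<T: Hash>(value: &T) -> u64 {
    let mut h = DefaultHasher::new();
    value.hash(&mut h);
    h.finish()
}

/// Assign a document to a bucket by hashing its ID.
fn bucket_of(doc_id: &str) -> usize {
    (hash_of(&doc_id) % BUCKETS as u64) as usize
}

/// Build per-bucket hashes for one replica. XOR makes the result
/// independent of the order in which documents are visited.
fn fingerprint(docs: &[(&str, &str)]) -> [u64; BUCKETS] {
    let mut buckets = [0u64; BUCKETS];
    for (id, body) in docs {
        buckets[bucket_of(id)] ^= hash_of(&(id, body));
    }
    buckets
}

/// Return the indices of buckets whose hashes differ between two replicas.
fn diverging_buckets(a: &[u64; BUCKETS], b: &[u64; BUCKETS]) -> Vec<usize> {
    (0..BUCKETS).filter(|&i| a[i] != b[i]).collect()
}

fn main() {
    let replica_a = [("doc-1", "v1"), ("doc-2", "v1")];
    let replica_b = [("doc-2", "v1"), ("doc-1", "v2")]; // doc-1 diverged
    let diff = diverging_buckets(&fingerprint(&replica_a), &fingerprint(&replica_b));
    println!("buckets needing a document-level diff: {:?}", diff);
}
```

Only the bucket containing doc-1 is reported, so a repair pass under this scheme would fetch and compare documents from that bucket alone rather than the whole shard.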
Acceptance Criteria Status
- Induce divergence on 1 shard; reconciler detects within schedule interval and repairs
  - Test: test_acceptance_1_detect_and_repair_divergence ✓
- Expired-doc test: stale write with older updated_at does NOT resurrect the expired document
  - Test: test_acceptance_2_expired_doc_no_resurrection ✓
- CDC subscribers do NOT see anti-entropy writes (filtered by _miroir_origin)
  - Test: test_acceptance_3_cdc_suppression ✓
- Mode A: 3 pods, each owns ~1/3 of shards; the reconciler runs exactly once per shard cluster-wide
  - Test: test_acceptance_4_mode_a_shard_partitioning ✓
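The Mode A criterion depends on every pod agreeing on shard ownership without coordination. A minimal sketch of rendezvous (highest-random-weight) hashing follows: each pod scores every (pod, shard) pair and the highest score wins, so all pods with the same membership list reach the same answer. The function names, pod names, and shard count are hypothetical, and the hash function used by the crate is not known from this summary.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Score a (pod, shard) pair; the pod with the highest score owns the shard.
fn score(pod: &str, shard: u32) -> u64 {
    let mut h = DefaultHasher::new();
    (pod, shard).hash(&mut h);
    h.finish()
}

/// Pick the owner of a shard from the current pod list.
/// Ties on score are broken by pod name, so list order does not matter.
fn owner<'a>(pods: &[&'a str], shard: u32) -> Option<&'a str> {
    pods.iter()
        .copied()
        .max_by_key(|pod| (score(pod, shard), *pod))
}

/// Shards that `me` should scan in this pass.
fn owned_shards(me: &str, pods: &[&str], shard_count: u32) -> Vec<u32> {
    (0..shard_count)
        .filter(|&s| owner(pods, s) == Some(me))
        .collect()
}

fn main() {
    let pods = ["miroir-0", "miroir-1", "miroir-2"];
    for pod in pods {
        let mine = owned_shards(pod, &pods, 64);
        println!("{} scans {} of 64 shards", pod, mine.len());
    }
}
```

Because each shard has exactly one highest-scoring pod, the per-pod lists are disjoint and together cover every shard, which is the "exactly once per shard cluster-wide" property. The split is only approximately even.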
Test Results
All 9 tests pass:
running 9 tests
test test_acceptance_1_detect_and_repair_divergence ... ok
test test_acceptance_2_expired_doc_no_resurrection ... ok
test test_acceptance_3_cdc_suppression ... ok
test test_acceptance_4_mode_a_shard_partitioning ... ok
test test_authoritative_doc_selection ... ok
test test_authoritative_doc_selection_tiebreak ... ok
test test_bucket_isolation ... ok
test test_content_hash_excludes_internal_fields ... ok
test test_metrics_tracking ... ok
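The test name test_content_hash_excludes_internal_fields suggests replicas are compared on user-visible content only. A minimal sketch of that idea follows, assuming reserved fields share the _miroir_ prefix (as _miroir_updated_at and _miroir_origin do) and modelling a document as a flat string map. The helper names and the document representation are hypothetical simplifications, not the crate's types.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::BTreeMap;
use std::hash::{Hash, Hasher};

/// Assumed prefix for reserved fields, based on `_miroir_updated_at`
/// and `_miroir_origin`.
const INTERNAL_PREFIX: &str = "_miroir_";

/// Hash only user-visible fields. BTreeMap iterates in key order,
/// so the hash does not depend on insertion order.
fn content_hash(doc: &BTreeMap<String, String>) -> u64 {
    let mut h = DefaultHasher::new();
    for (key, value) in doc {
        if key.starts_with(INTERNAL_PREFIX) {
            continue; // skip bookkeeping fields stamped by the proxy
        }
        key.hash(&mut h);
        value.hash(&mut h);
    }
    h.finish()
}

/// Hypothetical helper to build a document from string pairs.
fn doc(fields: &[(&str, &str)]) -> BTreeMap<String, String> {
    fields
        .iter()
        .map(|(k, v)| (k.to_string(), v.to_string()))
        .collect()
}

fn main() {
    let a = doc(&[("title", "hello"), ("_miroir_updated_at", "1700000000")]);
    let b = doc(&[("title", "hello"), ("_miroir_updated_at", "1700000600")]);
    println!("same content: {}", content_hash(&a) == content_hash(&b));
}
```

Both documents hash identically here because only the title field reaches the hasher. Under this assumption, replicas that differ only in stamping time are not flagged as mismatches.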
Verification Date
2025-05-23