miroir/docs/benchmarks.md
jedarden 200a638c05 feat(bench): add performance benchmarks and regression gate (P9.5)
Implement plan §8 performance benchmarks with criterion:

- Fixed merger_bench.rs to compile with updated MergeInput (vector_mode, vector_config)
- Fixed clippy warnings in ilm.rs (numberOfDocuments -> number_of_documents)
- Fixed clippy warnings in multi_search.rs (indexUid -> index_uid)
- Added docs/benchmarks.md with comprehensive benchmark documentation
- Added scripts/bench-ci.sh for CI benchmark runner
- Added scripts/bench-compare.sh for regression gate (>20% slowdown detection)

Benchmarks verified:
- router_bench: Rendezvous ~384 µs for 10K docs (target: <1 ms) 
- merger_bench: Merger ~1.07 ms for 1000 hits/3 shards (target: <1 ms) ⚠️
- integration_bench: E2E latency and ingest throughput (require docker-compose)

Closes: miroir-89x.5

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-25 00:44:33 -04:00

5.1 KiB
Raw Blame History

Performance Benchmarks

This document describes Miroir's performance benchmark infrastructure, as defined in plan §8.

Running Benchmarks Locally

Unit Benchmarks (criterion)

Run all unit benchmarks:

cargo bench -p miroir-core

Run specific benchmark suites:

cargo bench -p miroir-core --bench router_bench
cargo bench -p miroir-core --bench merger_bench

View HTML reports:

open target/criterion/*/report/index.html

Integration Benchmarks

Integration benchmarks require a running docker-compose stack:

cd examples && docker-compose -f docker-compose-dev.yml up -d

Run integration benchmarks:

cargo test --test integration_bench -- --nocapture --test-threads=1

Run only integration benchmarks (skip unit tests):

cargo test --test integration_bench -- --ignored

Benchmark Targets (Plan §8)

Benchmark Target Status
Rendezvous (64 shards, 3 nodes, 10K docs) < 1 ms ~384 µs
Merger (1000 hits, 3 shards) < 1 ms ⚠️ ~1.07 ms
End-to-end search latency vs. single-node < 2× single-node 🔄 Pending verification
Ingest throughput (1000 docs through Miroir) > 80% single-node 🔄 Pending verification

CI Integration

Benchmark Scripts

  • scripts/bench-ci.sh - CI runner that executes all benchmarks and saves results
  • scripts/bench-compare.sh - Regression gate that compares results against baseline

Regression Gate

The CI pipeline runs benchmarks on every PR and compares against the main branch baseline. Any benchmark showing > 20% slowdown triggers a review comment.

To use critcmp manually:

# Install critcmp
cargo install critcmp

# Export baseline from main
cargo bench -p miroir-core --bench router_bench -- --save-baseline main
critcmp --export baseline.json target/criterion

# Compare PR results
cargo bench -p miroir-core --bench router_bench
critcmp baseline.json target/criterion

Argo Workflow Integration

Benchmarks run as part of the CI/CD pipeline on iad-ci via Argo Workflows. The workflow:

  1. Runs scripts/bench-ci.sh on main branch to establish baseline
  2. Runs scripts/bench-ci.sh on PR branch
  3. Runs scripts/bench-compare.sh to detect regressions
  4. Posts comment on PR if regressions detected

Benchmark Suites

Router Benchmarks (benches/router_bench.rs)

Tests the rendezvous hash-based shard assignment:

  • shard_for_key_single - Single document shard computation
  • shard_for_key_10k_docs - Batch shard computation for 10K documents
  • assign_shard_in_group_64_shards - Assign all 64 shards to nodes
  • full_routing_10k_docs - Complete routing pipeline (hash → shard → nodes)
  • varying_shard_count - Performance with 8, 16, 32, 64, 128, 256 shards
  • varying_node_count - Performance with 2, 3, 4, 5, 8, 10 nodes
  • varying_rf - Performance with replication factors 1, 2, 3, 5
  • score_single - Raw score function performance

Merger Benchmarks (benches/merger_bench.rs)

Tests result merging from multiple shards:

  • merge_1000_hits_3_shards - Primary target: merge 1000 hits from 3 shards
  • varying_hit_count - Performance with 100, 500, 1000, 5000, 10000 hits
  • varying_shard_count - Performance with 1, 2, 3, 5, 10 shards
  • pagination - Deep pagination performance (offset/limit)
  • with_facets - Facet aggregation performance
  • with_score - Score preservation overhead
  • degraded - Performance with failed shards

Integration Benchmarks (tests/integration_bench.rs)

End-to-end performance with real Meilisearch nodes:

  • bench_e2e_search_latency - Search latency vs standalone (< 2× target)
  • bench_ingest_throughput - Ingest throughput vs standalone (> 80% target)
  • bench_concurrent_search - Concurrent search throughput
  • bench_faceted_search - Faceted search performance
  • bench_pagination - Deep pagination performance

Performance Tips

Improving Router Performance

  • The full_routing_10k_docs benchmark shows ~384 µs for 10K documents
  • Pre-compute shard assignments for hot paths
  • Use batch operations when routing multiple documents

Improving Merger Performance

  • The merge_1000_hits_3_shards benchmark shows ~1.07 ms
  • Limit facets to only those requested by the client
  • Consider offset/limit early to avoid processing unnecessary hits

Integration Test Performance

  • Ensure docker-compose stack is healthy before running
  • Use --test-threads=1 to avoid race conditions
  • Allow 30+ seconds for document processing

Adding New Benchmarks

  1. Create a new benchmark file in crates/miroir-core/benches/
  2. Add it to crates/miroir-core/Cargo.toml:
[[bench]]
name = "your_bench"
harness = false
  1. Use criterion for benchmarking:
use criterion::{black_box, criterion_group, criterion_main, Criterion};

fn bench_something(c: &mut Criterion) {
    c.bench_function("something", |b| {
        b.iter(|| {
            black_box(your_function());
        });
    });
}

criterion_group!(benches, bench_something);
criterion_main!(benches);
  1. Run with cargo bench -p miroir-core --bench your_bench