jedarden 200a638c05 feat(bench): add performance benchmarks and regression gate (P9.5)

Implement plan §8 performance benchmarks with criterion:

- Fixed merger_bench.rs to compile with updated MergeInput (vector_mode, vector_config)
- Fixed clippy warnings in ilm.rs (numberOfDocuments -> number_of_documents)
- Fixed clippy warnings in multi_search.rs (indexUid -> index_uid)
- Added docs/benchmarks.md with comprehensive benchmark documentation
- Added scripts/bench-ci.sh for CI benchmark runner
- Added scripts/bench-compare.sh for regression gate (>20% slowdown detection)

Benchmarks verified:
- router_bench: Rendezvous ~384 µs for 10K docs (target: <1 ms) ✅
- merger_bench: Merger ~1.07 ms for 1000 hits/3 shards (target: <1 ms) ⚠️
- integration_bench: E2E latency and ingest throughput (require docker-compose)

Closes: miroir-89x.5

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

2026-05-25 00:44:33 -04:00

5.1 KiB


Performance Benchmarks

This document describes Miroir's performance benchmark infrastructure, as defined in plan §8.

Running Benchmarks Locally

Unit Benchmarks (criterion)

Run all unit benchmarks:

cargo bench -p miroir-core

Run specific benchmark suites:

cargo bench -p miroir-core --bench router_bench
cargo bench -p miroir-core --bench merger_bench

View HTML reports:

open target/criterion/*/report/index.html

Integration Benchmarks

Integration benchmarks require a running docker-compose stack:

cd examples && docker-compose -f docker-compose-dev.yml up -d

Run integration benchmarks:

cargo test --test integration_bench -- --nocapture --test-threads=1

Run only the benchmarks marked #[ignore] (the --ignored flag skips every other test in the file):

cargo test --test integration_bench -- --ignored

Benchmark Targets (Plan §8)

Benchmark	Target	Status
Rendezvous (64 shards, 3 nodes, 10K docs)	< 1 ms	✅ ~384 µs
Merger (1000 hits, 3 shards)	< 1 ms	⚠️ ~1.07 ms
End-to-end search latency vs. single-node	< 2× single-node	🔄 Pending verification
Ingest throughput (1000 docs through Miroir)	> 80% single-node	🔄 Pending verification

CI Integration

Benchmark Scripts

scripts/bench-ci.sh - CI runner that executes all benchmarks and saves results
scripts/bench-compare.sh - Regression gate that compares results against baseline

Regression Gate

The CI pipeline runs benchmarks on every PR and compares against the main branch baseline. Any benchmark showing > 20% slowdown triggers a review comment.
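The 20% rule can be sketched in a few lines of Rust. This is a minimal illustration of the threshold arithmetic, not the contents of scripts/bench-compare.sh; the function names and the tuple layout are assumptions made for the example.

```rust
/// Percent change of the candidate time relative to the baseline.
/// A positive value means the candidate is slower.
fn percent_change(baseline_ns: f64, candidate_ns: f64) -> f64 {
    (candidate_ns - baseline_ns) / baseline_ns * 100.0
}

/// Returns the names of benchmarks whose slowdown exceeds `threshold_pct`.
/// Each entry is (name, baseline time, candidate time), times in nanoseconds.
/// Hypothetical layout: the real scripts may read criterion output differently.
fn regressions<'a>(results: &[(&'a str, f64, f64)], threshold_pct: f64) -> Vec<&'a str> {
    results
        .iter()
        .filter(|(_, base, cand)| percent_change(*base, *cand) > threshold_pct)
        .map(|(name, _, _)| *name)
        .collect()
}
```

For example, a benchmark that moves from 1.00 ms to 1.25 ms is a 25% slowdown and would be flagged, while a move from 384 µs to 400 µs (about 4%) would pass.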

To use critcmp manually:

# Install critcmp
cargo install critcmp

# Save a baseline on main and export it
cargo bench -p miroir-core --bench router_bench -- --save-baseline main
critcmp --export main > baseline.json

# Save a baseline on the PR branch and compare the two
cargo bench -p miroir-core --bench router_bench -- --save-baseline pr
critcmp --export pr > pr.json
critcmp baseline.json pr.json

Argo Workflow Integration

Benchmarks run as part of the CI/CD pipeline on iad-ci via Argo Workflows. The workflow:

Runs scripts/bench-ci.sh on main branch to establish baseline
Runs scripts/bench-ci.sh on PR branch
Runs scripts/bench-compare.sh to detect regressions
Posts comment on PR if regressions detected

Benchmark Suites

Router Benchmarks (`benches/router_bench.rs`)

Tests the rendezvous hash-based shard assignment:

shard_for_key_single - Single document shard computation
shard_for_key_10k_docs - Batch shard computation for 10K documents
assign_shard_in_group_64_shards - Assign all 64 shards to nodes
full_routing_10k_docs - Complete routing pipeline (hash → shard → nodes)
varying_shard_count - Performance with 8, 16, 32, 64, 128, 256 shards
varying_node_count - Performance with 2, 3, 4, 5, 8, 10 nodes
varying_rf - Performance with replication factors 1, 2, 3, 5
score_single - Raw score function performance
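For readers unfamiliar with rendezvous (highest-random-weight) hashing, the sketch below shows the general technique these benchmarks exercise. It is not Miroir's implementation: the signatures, the use of DefaultHasher, and the tie-breaking rule are all assumptions made for illustration.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Score a (node, shard) pair. DefaultHasher is a stand-in chosen for
/// illustration; its output is not guaranteed stable across Rust releases.
fn score(node: &str, shard: u32) -> u64 {
    let mut h = DefaultHasher::new();
    node.hash(&mut h);
    shard.hash(&mut h);
    h.finish()
}

/// Map a document key to one of `shard_count` shards.
fn shard_for_key(key: &str, shard_count: u32) -> u32 {
    let mut h = DefaultHasher::new();
    key.hash(&mut h);
    (h.finish() % u64::from(shard_count)) as u32
}

/// Pick the `rf` highest-scoring nodes for a shard (rendezvous hashing).
fn nodes_for_shard<'a>(shard: u32, nodes: &[&'a str], rf: usize) -> Vec<&'a str> {
    let mut scored: Vec<(u64, &'a str)> =
        nodes.iter().map(|n| (score(n, shard), *n)).collect();
    // Highest score first; ties are broken by node name for determinism.
    scored.sort_by(|a, b| b.0.cmp(&a.0).then(a.1.cmp(b.1)));
    scored.into_iter().take(rf).map(|(_, n)| n).collect()
}
```

Because each node's score for a shard is independent of the other nodes, removing a node that was not selected leaves the assignment unchanged, which is the property that makes this scheme attractive for shard placement.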

Merger Benchmarks (`benches/merger_bench.rs`)

Tests result merging from multiple shards:

merge_1000_hits_3_shards - Primary target: merge 1000 hits from 3 shards
varying_hit_count - Performance with 100, 500, 1000, 5000, 10000 hits
varying_shard_count - Performance with 1, 2, 3, 5, 10 shards
pagination - Deep pagination performance (offset/limit)
with_facets - Facet aggregation performance
with_score - Score preservation overhead
degraded - Performance with failed shards
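As a rough picture of what the merger benchmarks measure, the following sketch merges per-shard hit lists into a single page. It is a simplified stand-in, not the miroir-core merger: the Hit struct and merge_hits function are hypothetical, and it assumes each shard returns its hits already sorted by descending score.

```rust
#[derive(Debug, Clone, PartialEq)]
struct Hit {
    id: String,
    score: f64,
}

/// Merge per-shard hit lists into one page.
/// Assumes each shard's list is already sorted by descending score.
fn merge_hits(shards: &[Vec<Hit>], offset: usize, limit: usize) -> Vec<Hit> {
    let needed = offset + limit;
    // Only the first `needed` hits of any shard can reach the requested page,
    // so truncate each list before sorting.
    let mut candidates: Vec<Hit> = shards
        .iter()
        .flat_map(|hits| hits.iter().take(needed).cloned())
        .collect();
    // Descending score; ties are broken by id so the order is deterministic.
    candidates.sort_by(|a, b| b.score.total_cmp(&a.score).then_with(|| a.id.cmp(&b.id)));
    candidates.into_iter().skip(offset).take(limit).collect()
}
```

In this sketch a failed shard is modelled simply as a missing or empty list, so the degraded case needs no special handling.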

Integration Benchmarks (`tests/integration_bench.rs`)

End-to-end performance with real Meilisearch nodes:

bench_e2e_search_latency - Search latency vs standalone (< 2× target)
bench_ingest_throughput - Ingest throughput vs standalone (> 80% target)
bench_concurrent_search - Concurrent search throughput
bench_faceted_search - Faceted search performance
bench_pagination - Deep pagination performance
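The two relative targets above (< 2× latency, > 80% throughput) reduce to simple comparisons. The helpers below are a hypothetical sketch of that arithmetic and are not taken from tests/integration_bench.rs.

```rust
use std::time::Duration;

/// Search latency target: the cluster must stay under 2x the single-node latency.
fn meets_latency_target(cluster: Duration, single_node: Duration) -> bool {
    cluster < single_node * 2
}

/// Ingest target: cluster throughput must exceed 80% of single-node throughput.
fn meets_ingest_target(cluster_docs_per_sec: f64, single_docs_per_sec: f64) -> bool {
    cluster_docs_per_sec > 0.8 * single_docs_per_sec
}

/// Throughput in documents per second for `docs` documents ingested in `elapsed`.
fn docs_per_sec(docs: u64, elapsed: Duration) -> f64 {
    docs as f64 / elapsed.as_secs_f64()
}
```

With made-up numbers: if a single node ingests 1000 documents in 2 s (500 docs/s), a cluster that takes 2.4 s (about 417 docs/s) meets the ingest target, while one that takes 4 s (250 docs/s) does not.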

Performance Tips

Improving Router Performance

The full_routing_10k_docs benchmark shows ~384 µs for 10K documents, well within the < 1 ms target
Pre-compute shard assignments for hot paths
Use batch operations when routing multiple documents

Improving Merger Performance

The merge_1000_hits_3_shards benchmark shows ~1.07 ms, slightly above the < 1 ms target
Limit facets to only those requested by the client
Consider offset/limit early to avoid processing unnecessary hits

Integration Test Performance

Ensure docker-compose stack is healthy before running
Use --test-threads=1 so the tests run serially and do not race on the shared stack
Allow 30+ seconds for document processing
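Rather than sleeping for a fixed 30 seconds, a test can poll until the stack is ready and give up after a deadline. The helper below is a generic sketch; wait_until is a hypothetical name, and the probe closure is where a health check or task-status query would go.

```rust
use std::thread::sleep;
use std::time::{Duration, Instant};

/// Poll `probe` every `interval` until it returns true or `timeout` elapses.
/// Returns true if the probe succeeded in time.
fn wait_until<F: FnMut() -> bool>(timeout: Duration, interval: Duration, mut probe: F) -> bool {
    let deadline = Instant::now() + timeout;
    loop {
        if probe() {
            return true;
        }
        if Instant::now() >= deadline {
            return false;
        }
        sleep(interval);
    }
}
```

A call such as wait_until(Duration::from_secs(30), Duration::from_millis(500), probe) returns as soon as the probe succeeds, so a healthy stack does not pay the full 30 seconds.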

Adding New Benchmarks

Create a new benchmark file in crates/miroir-core/benches/
Add it to crates/miroir-core/Cargo.toml:

[[bench]]
name = "your_bench"
harness = false

Use criterion for benchmarking:

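use criterion::{criterion_group, criterion_main, Criterion};
// std::hint::black_box works regardless of the criterion version in use.
use std::hint::black_box;

// Placeholder for the code under test; replace it with a real miroir-core call.
fn your_function() -> u64 {
    (0..1_000u64).sum()
}

fn bench_something(c: &mut Criterion) {
    c.bench_function("something", |b| {
        // black_box keeps the optimizer from discarding the result.
        b.iter(|| black_box(your_function()));
    });
}

// Register the benchmark and generate the main function criterion expects.
criterion_group!(benches, bench_something);
criterion_main!(benches);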

Run with cargo bench -p miroir-core --bench your_bench
