miroir/docs/research
jedarden 72f9a197b5 P12.OP4: Score normalization at scale — research & benchmark infrastructure
Completed Plan §15 Open Problem #4 research on cross-shard score comparability.

## Key Finding
Average Kendall tau: 0.79 vs. 0.95 threshold — FAIL

Cross-shard score comparability is a significant issue:
- Common-term queries: τ = 0.15 (catastrophic)
- Local IDF statistics cause score inflation on small shards
- Documents from 10-doc shards outrank 93K-doc shard results

## Recommendation
Implement Reciprocal Rank Fusion (RRF) for result merging.
Follow-up bead: miroir-nsu

## Artifacts Added
- Benchmark infrastructure: tests/benches/score-comparability/
  - Corpus generator with extreme shard skew (100× variance)
  - Query generator (10K random queries across 5 types)
  - BM25-based simulation with global vs local IDF
  - Kendall tau comparison tool
  - Full experimental results (τ = 0.79 ± 0.01, 95% CI)
- Research writeup: docs/research/score-normalization-at-scale.md

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-04-18 23:58:08 -04:00
..
consistent-hashing.md Add repo hygiene: LICENSE, CHANGELOG, .gitignore 2026-04-18 20:47:36 -04:00
distributed-search-patterns.md Add repo hygiene: LICENSE, CHANGELOG, .gitignore 2026-04-18 20:47:36 -04:00
ha-approaches.md Add repo hygiene: LICENSE, CHANGELOG, .gitignore 2026-04-18 20:47:36 -04:00
raft-task-store.md P12.OP2: Update Raft vs Redis research with web survey findings 2026-04-18 22:03:29 -04:00
score-normalization-at-scale.md P12.OP4: Score normalization at scale — research & benchmark infrastructure 2026-04-18 23:58:08 -04:00