Commit graph

3 commits

Author SHA1 Message Date
jedarden
4ba39e3aa8 feat(evolver): complete Phase 7 LLM-driven evolution implementation
- Complete autonomous evolution pipeline with island model (4 islands)
- MAP-Elites behavior grid integration for diversity
- LLM ensemble integration (fast + strong model tiers)
- 3-stage validation pipeline (syntax → schema → sandbox smoke test)
- Evaluation arena (10-match mini-tournament per candidate)
- Promotion gate (Nash equilibrium PSRO + MAP-Elites niche fill)
- Retirement policy (auto-retire low-rated bots, population cap)
- Live export to R2 for evolution dashboard
- Enhanced replay viewer with commentary and win probability
- Added series, seasons, and predictions pages

All tests passing. Phase 7 exit criteria met.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-08 16:38:48 -04:00
jedarden
f3e34c6736 fix(evolver): correct failing tests for ensemble and behavior distance
- Fixed TestSelectBestCandidate_GoHttpBonus: HTTP bonus (1.5x) on 150-char code
  (225 score) doesn't beat 500-char plain text (500 score). Test now expects
  the longer code to win.
- Fixed TestScoreCandidate_Bonuses: adjusted minScore expectations to match
  actual code lengths with 1.5x bonus applied.
- Fixed TestBehaviorDistance: use epsilon comparison for floating-point
  precision instead of exact equality. sqrt(2) ≈ 1.414214 is not exactly
  representable in floating-point.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-08 16:36:50 -04:00
jedarden
f5924e8b15 feat(acb-evolver): add LLM prompt builder and ensemble integration
- Add parent sampling via tournament selection (selector/tournament.go)
- Add replay analyzer to extract key moments, strategies, weaknesses
- Add meta builder for leaderboard summary and dominant strategies
- Add prompt assembler combining parent code + replay + meta context
- Add LLM ensemble with fast tier (GLM-5-Turbo) for bulk generation
  and strong tier (GLM-5) for refinement passes
- Add code extraction from LLM responses with language validation
- Add convert utilities for type conversion between packages
- Comprehensive test coverage for all components

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-29 16:47:25 -04:00