The WinnerID field is a player-slot integer as string (e.g. "2"), not a bot_id.
The SQL query already computes the correct winner status in p.Won field.
Fixed in 3 functions:
- matchToSummary: Changed Won: p.BotID == m.WinnerID to Won: p.Won
- buildPlaylistMatch: Changed Won: p.BotID == m.WinnerID to Won: p.Won
- ratingUpsetMagnitude: Use p.Won to identify winner instead of comparing with m.WinnerID
- maxScoreDiff: Use p.Won to identify winner instead of comparing with m.WinnerID
- isEvolutionBreakthrough: Find winner using p.Won before checking if evolved
This fixes the issue where 984/1000 prod matches had winner_id set but all participants showed won: false.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Fix TestBuildNarrativePrompt_Comeback to check for current ELO
instead of old rating (comeback arc shows bottom 25%→top 25%)
- Fix TestDetectRivalryArcs to use 10+ matches (grudge match spec)
instead of only 5 matches
Story arc detection (per §3.7 chronicles):
✓ Comeback bots: recovered from bottom 25% to top 25%
✓ Grudge matches: same pair meets 10+ times
✓ Underdog victories: bottom-10 beats top-10
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Evolver writes live.json to R2 every cycle. Observatory page polls and
renders live feed + lineage tree + meta shift chart.
- Added ACB_R2_UPLOAD_ENABLED env var to enable automatic R2 upload during run loop
- CycleState tracks real-time evolution cycle status (generation, phase, candidate, validation, evaluation)
- Export() now includes cycle info when cycleState is provided
- runCycle() integrated with live observatory exports at each phase transition
- exportLiveQuiet() for mid-cycle status updates without verbose logging
- Fixed function signature mismatches for exportLiveQuiet calls
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Added tests for:
- TestNextScheduledTime: verifies correct calculation of next scheduled
run time across various scenarios (same-day future, same-day past,
different weekdays, edge cases around midnight)
- TestWeeklyScheduleEnvParsing: validates environment variable parsing
for the WEEKDAY:HH:MM format, including valid and invalid inputs
These tests ensure the weekly automated map evolution ticker (§14.6)
correctly schedules evolution runs at the configured time.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Wire up the acb-map-evolver to run automatically on a weekly schedule
(Sunday 03:00 UTC by default) from the evolver deployment.
The map evolution ticker:
- Waits until the next scheduled time (weekday:hour:minute UTC)
- Runs acb-map-evolver --once to evolve maps for all player counts
- Repeats every 7 days
The schedule can be configured via ACB_MAP_EVOLUTION_SCHEDULE env var
(format: WEEKDAY:HH:MM, e.g., "0:03:00" for Sunday 03:00 UTC).
Enable via ACB_MAP_EVOLUTION_ENABLED=true or --enable-map-evolution flag.
Per plan §14.6: the weekly map evolution loads engagement scores,
runs MAP-Elites evolution, promotes high-scoring variants, and updates
the active map pool in the database.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
- Implement runWeeklyLoop() function that waits for scheduled time and
runs evolution for all player counts (2, 3, 4, 6) weekly
- Add --weekly flag to enable weekly mode (default: Sunday 03:00 UTC)
- Add --weekly-schedule flag for custom schedule (WEEKDAY:HH:MM format)
- Add ACB_WEEKLY_SCHEDULE env var for configuration
feat(acb-evolver): add weekly map evolution ticker
- Add MapEvolutionEnabled and MapEvolutionSchedule to RunConfig
- Add --enable-map-evolution flag to acb-evolver run subcommand
- Add startMapEvolutionTicker() goroutine that runs weekly
- Ticker executes acb-map-evolver --once to trigger map breeding
- Configurable via ACB_MAP_EVOLUTION_ENABLED and ACB_MAP_EVOLUTION_SCHEDULE
This integrates map evolution into the bot evolver's deployment,
allowing weekly automated map evolution based on engagement scores
as specified in plan §14.6.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
The --once mode was implemented but the command-line flag was not being
parsed. This commit adds the flag parsing and help text for --once, which
enables the weekly automated map evolution run from the evolver.
The evolver's weekly ticker (run.go) calls acb-map-evolver --once to
trigger map evolution on Sundays at 03:00 UTC as specified in plan §14.6.
Per plan §13.3, implements user-requested AI replay commentary with:
- HMAC bot authentication via shared_secret
- Rate limiting: 5 requests/day per bot
- Match validation (exists and completed)
- Idempotency via enrichment_requested_at column
- Enqueues to Valkey for acb-enrichment service
- Returns 202 Accepted with estimated wait time
Also adds:
- AllowN() method to ratelimit package for multi-token checks
- enrichment_requested_at column to matches table (idempotency)
- enrichLtr rate limiter (5/day per bot)
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
- Add 'featured' boolean column to series table for weekly featured series
- Add tickFeaturedSeries ticker that runs Friday 20:00 UTC to create bo5 featured series
- Featured series: query top 20 bots by rating, select 4 rivalry pairs by ELO proximity
- Best-of-7 championship bracket already implemented via createChampionshipBracket
- Add FeaturedSchedSecs config (default: 3600s check interval)
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Update the map engagement scoring formula to match plan §14.6:
- score = win_prob_crossings * 3.0 + critical_moments * 2.0 +
resource_contest_turns * 1.5 + survival_turns * 0.5
New metrics computed from replay data:
- resource_contest_turns: turns where energy is contested by multiple players
- survival_turns: turns where all players have at least one bot alive
The old formula used map_coverage_pct, closeness, and turn_pct which
did not match the specification.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
- Change filter from 'idea'/'mistake' to 'insight'/'idea' (mapping to 'hint'/'strategy' from plan §13.6)
- Increase upvote threshold from 3 to 10 for higher quality signals
- The evolver consumes community_hints.json for LLM prompt context
The rating recovery CLI mode (-mode=recalc-ratings) was using
glicko2Tau (0.5) instead of glicko2DefaultSigma (0.06) for the
default sigma value when resetting ratings. This caused the reset
sigma to be ~8x higher than the schema-defined default.
Added glicko2DefaultSigma constant (0.06) and updated ResetAllRatings
and recalcRatings to use it correctly.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Implements the rating recovery procedure specified in plan §12.3.
Running 'go run ./cmd/acb-worker -mode=recalc-ratings' will:
1. Reset all bot ratings to Glicko-2 defaults (mu=1500, phi=350, sigma=0.06)
2. Fetch all completed matches from the database in chronological order
3. Replay each match to recompute Glicko-2 ratings from scratch
4. Update the bots table with the recalculated ratings
This is needed for disaster recovery when ratings are corrupted or lost.
Database functions added:
- ResetAllRatings: resets all bot ratings to defaults
- GetAllCompletedMatches: fetches completed matches chronologically with participants
- UpdateAllRatings: bulk updates all bot ratings in a single transaction
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Add sitemap.xml generation as a final pass in the index builder. The
sitemap covers all public pages: home, leaderboard, bots list, bot
profiles, matches list, featured replays, seasons, rivalries,
predictions, and docs.
- Add SiteURL config field (ACB_SITE_URL env var, defaults to
https://aicodebattle.com)
- Add generateSitemap() function with proper XML encoding
- Add SitemapURL and Sitemap types for XML marshaling
- Call generateSitemap() at the end of generateAllIndexes()
- Write sitemap.xml to output directory alongside leaderboard.json
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
- Add engine.CalculateMapEngagement() to compute map engagement scores from replay data (win_prob_crossings, critical_moments, map_coverage_pct, closeness, turn_pct)
- Add DBClient.UpdateMapEngagement() to update map engagement using rolling average
- Worker now calculates and writes map engagement scores after each match
- Add test to verify win_prob array is non-empty in produced replays
This implements the win probability Monte Carlo array storage in replay JSON
feature. The engine already called ComputeWinProbability() in MatchRunner.Run(),
so this commit adds the missing map engagement tracking.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Energy node placement now uses a tiered radius distribution: 30% in the
contested central zone (0.05-0.20 from center), 40% in the mid-zone
(0.20-0.40), and 30% in the home zone (0.40-0.60). Previously nodes were
placed uniformly at 0.20-0.70, letting bots farm their home quadrant
indefinitely without crossing the midline.
After cellular automata wall generation, a 3-wide corridor is carved from
each core straight to the map center, plus a 5x5 open arena at the center
tile. This creates lanes that funnel bots into contact — replicating the key
mechanic that drove frequent fights in the original AI Challenge Ants game,
where symmetric food spawning near the midfield forced both colonies to
expand outward and collide.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Adds combat_turns metric (distinct turns where ≥1 bot died from enemy
focus-fire, excluding self-collisions). Worker computes it after each
match; index builder sorts matches/index.json and the new most-combat
playlist descending by it, and bumps interest score for combat-heavy
matches so they surface in highlights.
Also switches homepage featured replay default view from influence to
standard so the actual bot-on-bot combat is visible.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
map_json generated by acb-map-evolver lacks a 'spawns' key; scanning
map_json->>'spawns' into a non-nullable string causes "converting NULL
to string is unsupported". Use COALESCE for walls/spawns/cores.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Add seedIfEmpty: idempotent startup seeding (20 maps per player count,
ON CONFLICT DO NOTHING) using cellular-automata generation + validate()
- Add continuous evolution loop across all player counts (2/3/4/6)
- ACB_MIN_SEED_COUNT and ACB_EVOLUTION_PERIOD configurable via env vars
- Add Dockerfile (lean Alpine build, no language runtimes)
- Add acb-map-evolver to acb-build.yml CI pipeline
- Add staging K8s Deployment manifest
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The encryption key stored in OpenBao/K8s secrets is base64-encoded but
the API and worker crypto functions expected hex. Add parseAESKey() that
accepts both formats (tries hex first, falls back to base64).
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
TestThreeMonthAgeCheck used 89*24h as "3 months minus 1 day", but
89 calendar days == exactly 3 months on dates like May 1 (Feb+Mar+Apr=
28+31+30=89). The equality case makes the >3-month eligibility check
return true instead of false. Replace with AddDate-relative anchors
so the test stays correct regardless of current date.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Two root causes prevented bots from making any moves:
1. SignRequest signing string included timestamp ({match_id}.{turn}.{timestamp}.{hash})
but all bots implement verifySignature without timestamp ({match_id}.{turn}.{hash}).
Fixed by dropping timestamp from the signing string; X-ACB-Timestamp header is still
sent for clock-skew checks but not in the HMAC.
2. The API stores bot secrets AES-GCM encrypted (184 hex chars) in the DB. The worker
was passing the ciphertext directly as the HMAC key, while bots use their plaintext
k8s secret (64 hex chars). Fixed by decrypting in the worker using ACB_ENCRYPTION_KEY.
Also tightens the home page winner filter to exclude winner_id="0" stalemates.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
_worker.js static file approach fails — Cloudflare rejects it when uploaded
as a static asset. Instead, copy web/functions/ into the image and set
wrangler CWD to /app/web/ so it discovers functions/ and uploads the Pages
Functions bundle correctly on every deploy cycle.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Every index-builder deploy was overwriting the production Pages deployment
without functions (wrangler ran from /tmp, no functions/ dir visible).
Compiling functions/ to dist/_worker.js during the Docker web-builder stage
means the worker is always included in every Pages deploy, regardless of CWD.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
AWS SDK Go v2 s3 v1.100.0 defaults to RequestChecksumCalculationWhenSupported,
which causes PutObject to send STREAMING-UNSIGNED-PAYLOAD-TRAILER — a chunked
transfer mode R2 doesn't support. Setting WhenRequired makes the SDK send a
standard signed payload instead, resolving the 403 SignatureDoesNotMatch.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Adds R2 (Cloudflare) as a direct upload target alongside B2 (cold archive).
When ACB_R2_* credentials are configured, the worker uploads replays and
thumbnails to R2 immediately after each match, bypassing the index-builder's
B2→R2 promotion cycle.
This is necessary because ARMOR's B2 app key is write-only; reads via the
direct S3 path return 403. The Cloudflare CDN read path (armor-hub-b2.ardenone.com)
is dead post-hub-decommission. Direct R2 upload ensures replays are available
without waiting for a working B2 read path.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
EndpointResolverV2 with a custom static URI does not honor UsePathStyle —
the resolver provides the final endpoint and the SDK does not re-apply
path-style bucket addressing on top of it. This means the bucket name was
dropped from the request path even with UsePathStyle=true, sending PUTs
to /replays/... instead of /armor-apexalgo/replays/...
BaseEndpoint is the SDK's documented approach for S3-compatible custom
endpoints; it sets the base URL and then correctly applies path-style
addressing to produce /bucket/key URLs.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Two fixes:
1. Add UsePathStyle=true to B2 S3 client. Without it the SDK uses
virtual-hosted addressing, dropping the bucket name from the request
path. Uploads hit /replays/... instead of /armor-apexalgo/replays/...
causing NoSuchBucket errors on every replay/thumbnail PutObject.
2. Don't update crash_strikes for normal game endings (stalemate, turns).
In snake-style games every bot eventually crashes into a wall/snake —
that is the expected end condition, not an HTTP error. The old code
treated all Crashed[] entries from the engine as errors, causing all
6 bots to accumulate strikes after every single match and hitting the
30-min cooldown threshold after just 3 matches.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The replay_feedback table was missing its foreign key constraint to
matches(match_id). This happened because CREATE TABLE IF NOT EXISTS
doesn't add FKs to existing tables.
Added an idempotent migration that checks for the constraint's existence
before adding it, ensuring it's safe to run on both fresh installs and
existing databases.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
The BaseEndpoint approach with older aws-sdk-go-v2 causes
"Invalid region: region was not a valid DNS name" errors when
uploading to ARMOR's S3-compatible endpoint.
Switching to EndpointResolverV2 bypasses the SDK's endpoint
rule validation entirely, resolving the issue.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Fixes 'Invalid region: region was not a valid DNS name' error when
uploading replays to B2 via ARMOR proxy.
The error was caused by a known bug in aws-sdk-go-v2 v1.41.4 where
the endpoint resolver would validate the region as a DNS name even
when using a custom BaseEndpoint with UsePathStyle=true.
Upgraded SDK versions:
- github.com/aws/aws-sdk-go-v2 v1.41.4 -> v1.41.6
- github.com/aws/aws-sdk-go-v2/config v1.32.12 -> v1.32.16
- github.com/aws/aws-sdk-go-v2/service/s3 v1.97.2 -> v1.100.0
- github.com/aws/smithy-go v1.24.2 -> v1.25.1
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
- Verified /watch/replays shows real completed matches (not just demo)
- Match cards display bot names, turn count, winner badges, map ID
- 'Watch Replay' links point to real match IDs (m_test_*)
- Curated playlists render with real data (featured, comebacks, upsets, etc.)
- Pagination/infinite scroll works via IntersectionObserver
- Mobile testing on Pixel 6 via ADB: layout responsive, touch targets usable
- Created MATCH_LIST_TEST_RESULTS.md with full verification details
- Thumbnails not implemented (clean UI without broken images due to R2 issues)
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
The AWS SDK requires a valid AWS region name even when using custom
S3-compatible endpoints (ARMOR/B2). Using "auto" as the region causes
an error: "Invalid region: region was not a valid DNS name."
This fixes the replay upload pipeline which was failing with the
invalid region error. Replays should now upload successfully to B2
via the ARMOR proxy.
Related to ai-code-battle-o43: Replay viewer verification task.
Verification results:
1. ✅ /data/blog/index.json exists and has 1 post (meta-week-13-season-1)
2. ✅ Individual post pages load correctly at /blog/{slug}
3. ✅ Blog post JSON structure matches frontend expectations (content_md field)
4. ✅ Tags and filters implemented in UI (All, Meta Reports, Chronicles buttons)
5. ✅ Blog page builds successfully (blog-D4QMd11d.js included in build)
Current state: Blog infrastructure is fully implemented with:
- LLM-powered narrative generation (blog.go, narrative.go)
- Story arc detection (rise, fall, rivalry, upset, evolution milestones)
- Weekly meta report generation with ELO movers, strategy analysis
- Chronicles for story arcs (rivalry, upset, rise/fall, evolution)
- Tag-based filtering and search
Note: Current blog content is placeholder/template-based. Meaningful
match commentary will be generated when:
- ACB_LLM_BASE_URL and ACB_LLM_API_KEY are configured in index-builder
- Real match data exists in PostgreSQL database
- Story arcs are detected from rating history and match results
Remove custom endpoint resolver and use AWS SDK's standard approach
for S3-compatible endpoints:
- Use config.WithRegion("auto") for custom endpoints
- Set BaseEndpoint directly via s3.NewFromConfig options
- Add UsePathStyle for B2 compatibility
This fixes the 'Invalid region: region was not a valid DNS name' error
that was preventing replay uploads. The deployment manifest already
sets ACB_B2_REGION to empty string to avoid conflicts.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
The main leaderboard SPA is now served at / (index.html) and the
standalone replay viewer lives at /replay.html. This removes the
_redirects workaround in index-builder that patched over the inverted
entry points.
- Rename web/app.html → web/index.html (main SPA)
- Rename web/index.html → web/replay.html (standalone viewer)
- Update vite.config.ts: main→index.html, replay→replay.html
- Remove _redirects injection from deploy.go verifyMergedOutput
- Update pages.json routes and README dev URL
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The replay viewer was baked into index.html (served at /) while the
leaderboard app was at /app.html. Add a _redirects file so visitors
landing on / get redirected to the main leaderboard app.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Passing time.Duration (int64 nanoseconds) as $2 in NOW() + $2 caused
PostgreSQL to interpret the nanosecond value as seconds, setting
cooldown_until to year ~59066 instead of +30 minutes.
Fix: pre-compute time.Now().Add(CrashCooldownDuration) and pass the
resulting time.Time — pq encodes it as a proper timestamptz.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The seasons table was recreated with id BIGSERIAL (not season_id VARCHAR).
The ClaimJob query was still referencing s.season_id (stale column name).
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Raw game scores (capture points) are tied in most matches since the
winner is determined by an energy/bots-alive tiebreaker. This caused
Glicko-2 delta=0, leaving rating_mu frozen at 1500 for all bots.
Now winner gets 1.0, non-winners 0.0, draws 0.5 — correct pairwise
win/loss signal for Glicko-2 convergence.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Jobs remain in 'claimed' status until completed — the reaper was
querying 'running' (which is the match status, not job status) so
stale claimed jobs were never recycled.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
b.bot_id was selected without being in the GROUP BY clause or wrapped
in an aggregate, causing a Postgres error on live export. Replaced with
a correlated subquery that finds the highest-rated bot per island.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The metrics package is a local module dependency imported by all services
but was missing from every Dockerfile's build context.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>