- Update Config struct to use individual postgres connection components (ACB_POSTGRES_HOST, ACB_POSTGRES_PORT, etc.) instead of ACB_DATABASE_URL
- Add DatabaseURL() method to build connection string from components
- This matches the pattern used by acb-index-builder and other services
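A minimal sketch of how such a DatabaseURL() method could assemble the connection string, assuming a postgres:// URL form. The struct field names and the user, password, database and sslmode components are illustrative assumptions; only ACB_POSTGRES_HOST and ACB_POSTGRES_PORT are named above.

```go
package main

import (
	"fmt"
	"net"
	"net/url"
)

// Config holds the individual connection components. All field names
// here are assumptions made for illustration.
type Config struct {
	PostgresHost     string // from ACB_POSTGRES_HOST
	PostgresPort     string // from ACB_POSTGRES_PORT
	PostgresUser     string // assumed component
	PostgresPassword string // assumed component
	PostgresDB       string // assumed component
	PostgresSSLMode  string // assumed component, optional
}

// DatabaseURL builds a postgres:// URL from the components. Going through
// net/url keeps special characters in the credentials correctly escaped.
func (c Config) DatabaseURL() string {
	u := url.URL{
		Scheme: "postgres",
		User:   url.UserPassword(c.PostgresUser, c.PostgresPassword),
		Host:   net.JoinHostPort(c.PostgresHost, c.PostgresPort),
		Path:   "/" + c.PostgresDB,
	}
	if c.PostgresSSLMode != "" {
		q := url.Values{}
		q.Set("sslmode", c.PostgresSSLMode)
		u.RawQuery = q.Encode()
	}
	return u.String()
}

func main() {
	cfg := Config{
		PostgresHost:     "localhost",
		PostgresPort:     "5432",
		PostgresUser:     "acb",
		PostgresPassword: "secret",
		PostgresDB:       "acb",
		PostgresSSLMode:  "disable",
	}
	fmt.Println(cfg.DatabaseURL())
}
```

Building the URL with net/url rather than fmt.Sprintf avoids broken connection strings when a password contains characters such as "@" or "/".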
Closes: bf-1ghm
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
The evolver arena was using DefaultConfig() which has attack_radius2=12
for all matches. Per plan §3.4, 2-player matches should have
attack_radius2=36 (6 tiles) to achieve 65-80% combat density.
This bug caused evolved bots to learn energy-farming strategies, since
enemies were rarely in attack range on 40x40 maps with only a 3.5-tile
radius. With the correct 6-tile radius, bots will experience actual
combat during evolution and should develop fighting behaviors.
Closes: bf-3lt3
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
The worker was hardcoding AttackRadius2=5 in executeMatch, but
engine.ConfigForPlayers sets AttackRadius2=12 for both 2-player and
3+ player matches. This mismatch meant matches ran with the old attack
radius instead of the improved value that supports better combat density.
Now uses ConfigForPlayers which provides:
- AttackRadius2: 12 (3.5 tiles) for all player counts
- Proper zone parameters scaled by player count
- Correct max turns scaling
Grid dimensions are overridden from the pre-generated map, and
SeasonID/RulesVersion are preserved from the match.
Closes: bf-576s
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
The acb-local tool was panicking when a match ended in a draw
(Winner = -1) because it tried to use -1 as an array index into
botNames[]. Fixed by checking if Winner >= 0 before accessing
the array, and printing "Result: Draw" for draws.
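The guard can be sketched roughly as follows. The function name resultLine and the "Winner: ..." wording are assumptions; only the Winner >= 0 check and the "Result: Draw" text come from the description above. The upper-bound check is an extra safety measure that is not mentioned in the fix.

```go
package main

import "fmt"

// resultLine formats the match outcome. winner is a player-slot index,
// or -1 for a draw. Any index outside botNames is also reported as a
// draw instead of panicking (an added precaution in this sketch).
func resultLine(winner int, botNames []string) string {
	if winner >= 0 && winner < len(botNames) {
		return "Winner: " + botNames[winner]
	}
	return "Result: Draw"
}

func main() {
	names := []string{"alpha", "beta"}
	fmt.Println(resultLine(1, names))
	fmt.Println(resultLine(-1, names))
}
```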
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Implement center-weighted energy distribution as a forcing function
to pull players into contested midfield, increasing combat density.
Changes:
- engine/match.go: Update placeEnergyNodes to use tiered radius
distribution (30% central 0.05-0.20, 40% mid 0.20-0.40, 30% outer
0.40-0.60) instead of uniform 0.3-0.7
- engine/integration_test.go: Add TestIntegration_CenterWeightedEnergy
to verify ~25% of energy nodes spawn in central zone
- cmd/acb-mapgen: Already had tiered distribution (unchanged, just
comments updated)
- cmd/acb-mapgen/mapgen_test.go: Add TestGenerateMap_CenterWeightedEnergy
This uses the existing economic incentive (energy collection) as a
forcing function without changing combat resolution or scoring.
Closes: bf-648i
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
- Fix computeCombatTurns to count EventCombatDeath events instead of
EventBotDied with reason="combat" (which was never emitted, causing
CombatTurns to always be 0)
- Add CombatDeaths field to MapEngagementScore to track focus-fire kills
- Update engagement formula to weight combat deaths at 3.0 (same as
win_prob_crossings) to bias map evolution toward combat-dense maps
- Add countCombatDeaths helper function to count EventCombatDeath events
- Update log output to include combat_deaths metric
This implements bf-4nxs: the combat-density metric is now measured and
weighted in map engagement, which gates map curation/selection. Maps
with zero combat will have low engagement scores and be filtered out.
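For illustration, the two counters could look roughly like this. The Event struct and the string values of the event constants are hypothetical stand-ins for the engine's replay types; only the names EventCombatDeath, EventBotDied, countCombatDeaths and computeCombatTurns come from the change description.

```go
package main

import "fmt"

// EventType and its string values are assumed for this sketch.
type EventType string

const (
	EventCombatDeath EventType = "combat_death"
	EventBotDied     EventType = "bot_died"
)

// Event is a simplified replay event: the turn it occurred on and its kind.
type Event struct {
	Turn int
	Type EventType
}

// countCombatDeaths counts every EventCombatDeath event.
func countCombatDeaths(events []Event) int {
	n := 0
	for _, e := range events {
		if e.Type == EventCombatDeath {
			n++
		}
	}
	return n
}

// computeCombatTurns counts distinct turns with at least one combat death.
func computeCombatTurns(events []Event) int {
	seen := make(map[int]struct{})
	for _, e := range events {
		if e.Type == EventCombatDeath {
			seen[e.Turn] = struct{}{}
		}
	}
	return len(seen)
}

func main() {
	events := []Event{
		{Turn: 4, Type: EventCombatDeath},
		{Turn: 4, Type: EventCombatDeath},
		{Turn: 9, Type: EventBotDied},
		{Turn: 12, Type: EventCombatDeath},
	}
	fmt.Println(countCombatDeaths(events), computeCombatTurns(events)) // 3 2
}
```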
Closes: bf-4nxs
Reduce default 2-player map size from 60x60 to 40x40 (from 3600 to 1600
tiles) to increase encounter frequency and combat density. Add -skirmish
flag to acb-mapgen for generating even smaller dense maps (32x32, 0.20
wall density, 15 energy nodes) with "skirmish_" ID prefix.
Changes:
- engine/types.go: DefaultConfig() returns 40x40, ConfigForPlayers()
uses 800 tiles/player for 2-player (40x40) and 2000 tiles/player for
3+ players
- cmd/acb-matchmaker/tickers.go: gridForPlayers() returns 40x40 for 2
players
- cmd/acb-map-evolver/main.go: gridForPlayers() returns 40x40 for 2
players
- cmd/acb-mapgen/main.go: defaults to 40x40, adds -skirmish flag for
32x32 high-density maps
- cmd/acb-matchmaker/tickers_test.go: update test expectations for new
40x40 default
Closes: bf-39wt
- Add /opt to nsjail bindmounts so Rust toolchain (/opt/rust) is accessible
during sandboxed validation of Rust bots
- Explicitly enable Alpine community repository in Dockerfile to ensure
nsjail package can be installed (nsjail lives in community, not main)
- nsjail integration was already optional (falls back to plain exec if
unavailable), but these changes ensure it actually works when enabled
This addresses bead bf-3f29: nsjail was listed in apk add but /opt wasn't
bindmounted, causing Rust validation to fail when UseNsjail=true.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
The evolver's validation pipeline supports Rust and Java bots, but the
container image was missing the rustc and javac toolchains. Additionally,
nsjail was documented as part of the sandbox stage but not installed.
Changes:
- Add nsjail package (from Alpine community repo) for sandbox isolation
- Add openjdk-17-jdk for Java bot validation
- Install Rust toolchain (rustc) via rustup to /opt/rust for shared access
- Set PATH to include Rust binaries for the acb user
The validator already had graceful fallback when nsjail wasn't found in PATH,
but with nsjail installed, the sandbox stage now provides proper CPU/memory
resource limits during smoke testing.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
The WinnerID field is a player-slot integer as string (e.g. "2"), not a bot_id.
The SQL query already computes the correct winner status in p.Won field.
Fixed in 5 functions:
- matchToSummary: Changed Won: p.BotID == m.WinnerID to Won: p.Won
- buildPlaylistMatch: Changed Won: p.BotID == m.WinnerID to Won: p.Won
- ratingUpsetMagnitude: Use p.Won to identify winner instead of comparing with m.WinnerID
- maxScoreDiff: Use p.Won to identify winner instead of comparing with m.WinnerID
- isEvolutionBreakthrough: Find winner using p.Won before checking if evolved
This fixes the issue where 984/1000 prod matches had winner_id set but all participants showed won: false.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Fix TestBuildNarrativePrompt_Comeback to check for current ELO
instead of old rating (comeback arc shows bottom 25%→top 25%)
- Fix TestDetectRivalryArcs to use 10+ matches (grudge match spec)
instead of only 5 matches
Story arc detection (per §3.7 chronicles):
✓ Comeback bots: recovered from bottom 25% to top 25%
✓ Grudge matches: same pair meets 10+ times
✓ Underdog victories: bottom-10 beats top-10
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Evolver writes live.json to R2 every cycle. Observatory page polls and
renders live feed + lineage tree + meta shift chart.
- Added ACB_R2_UPLOAD_ENABLED env var to enable automatic R2 upload during run loop
- CycleState tracks real-time evolution cycle status (generation, phase, candidate, validation, evaluation)
- Export() now includes cycle info when cycleState is provided
- runCycle() integrated with live observatory exports at each phase transition
- exportLiveQuiet() for mid-cycle status updates without verbose logging
- Fixed function signature mismatches for exportLiveQuiet calls
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Added tests for:
- TestNextScheduledTime: verifies correct calculation of next scheduled
run time across various scenarios (same-day future, same-day past,
different weekdays, edge cases around midnight)
- TestWeeklyScheduleEnvParsing: validates environment variable parsing
for the WEEKDAY:HH:MM format, including valid and invalid inputs
These tests ensure the weekly automated map evolution ticker (§14.6)
correctly schedules evolution runs at the configured time.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Wire up the acb-map-evolver to run automatically on a weekly schedule
(Sunday 03:00 UTC by default) from the evolver deployment.
The map evolution ticker:
- Waits until the next scheduled time (weekday:hour:minute UTC)
- Runs acb-map-evolver --once to evolve maps for all player counts
- Repeats every 7 days
The schedule can be configured via ACB_MAP_EVOLUTION_SCHEDULE env var
(format: WEEKDAY:HH:MM, e.g., "0:03:00" for Sunday 03:00 UTC).
Enable via ACB_MAP_EVOLUTION_ENABLED=true or --enable-map-evolution flag.
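A minimal sketch of parsing the WEEKDAY:HH:MM format and computing the next run, assuming weekday 0 means Sunday (as in the example above) and that a run scheduled for exactly "now" rolls over to the following week. The names weeklySchedule, parseWeeklySchedule, nextScheduledTime and nextRunAfter are illustrative; the actual helpers may be named and structured differently.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
	"time"
)

type weeklySchedule struct {
	Weekday      time.Weekday
	Hour, Minute int
}

// parseWeeklySchedule parses "WEEKDAY:HH:MM", e.g. "0:03:00".
func parseWeeklySchedule(s string) (weeklySchedule, error) {
	parts := strings.Split(s, ":")
	if len(parts) != 3 {
		return weeklySchedule{}, fmt.Errorf("schedule %q: want WEEKDAY:HH:MM", s)
	}
	limits := [3]int{6, 23, 59} // max weekday, hour, minute
	var v [3]int
	for i, p := range parts {
		n, err := strconv.Atoi(p)
		if err != nil || n < 0 || n > limits[i] {
			return weeklySchedule{}, fmt.Errorf("schedule %q: invalid field %d", s, i+1)
		}
		v[i] = n
	}
	return weeklySchedule{time.Weekday(v[0]), v[1], v[2]}, nil
}

// nextScheduledTime returns the first matching instant strictly after now (UTC).
func nextScheduledTime(now time.Time, ws weeklySchedule) time.Time {
	now = now.UTC()
	days := (int(ws.Weekday) - int(now.Weekday()) + 7) % 7
	next := time.Date(now.Year(), now.Month(), now.Day()+days,
		ws.Hour, ws.Minute, 0, 0, time.UTC)
	if !next.After(now) {
		next = next.AddDate(0, 0, 7)
	}
	return next
}

// nextRunAfter is a convenience wrapper that works on RFC 3339 strings.
func nextRunAfter(schedule, nowRFC3339 string) (string, error) {
	ws, err := parseWeeklySchedule(schedule)
	if err != nil {
		return "", err
	}
	now, err := time.Parse(time.RFC3339, nowRFC3339)
	if err != nil {
		return "", err
	}
	return nextScheduledTime(now, ws).Format(time.RFC3339), nil
}

func main() {
	next, err := nextRunAfter("0:03:00", "2024-01-03T12:00:00Z")
	fmt.Println(next, err)
}
```

A ticker built on this would sleep until the returned time, run the job, and then ask for the next time again, which also covers the "repeats every 7 days" behaviour.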
Per plan §14.6: the weekly map evolution loads engagement scores,
runs MAP-Elites evolution, promotes high-scoring variants, and updates
the active map pool in the database.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
- Implement runWeeklyLoop() function that waits for scheduled time and
runs evolution for all player counts (2, 3, 4, 6) weekly
- Add --weekly flag to enable weekly mode (default: Sunday 03:00 UTC)
- Add --weekly-schedule flag for custom schedule (WEEKDAY:HH:MM format)
- Add ACB_WEEKLY_SCHEDULE env var for configuration
feat(acb-evolver): add weekly map evolution ticker
- Add MapEvolutionEnabled and MapEvolutionSchedule to RunConfig
- Add --enable-map-evolution flag to acb-evolver run subcommand
- Add startMapEvolutionTicker() goroutine that runs weekly
- Ticker executes acb-map-evolver --once to trigger map breeding
- Configurable via ACB_MAP_EVOLUTION_ENABLED and ACB_MAP_EVOLUTION_SCHEDULE
This integrates map evolution into the bot evolver's deployment,
allowing weekly automated map evolution based on engagement scores
as specified in plan §14.6.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
The --once mode was implemented but the command-line flag was not being
parsed. This commit adds the flag parsing and help text for --once, which
enables the weekly automated map evolution run from the evolver.
The evolver's weekly ticker (run.go) calls acb-map-evolver --once to
trigger map evolution on Sundays at 03:00 UTC as specified in plan §14.6.
Per plan §13.3, implements user-requested AI replay commentary with:
- HMAC bot authentication via shared_secret
- Rate limiting: 5 requests/day per bot
- Match validation (exists and completed)
- Idempotency via enrichment_requested_at column
- Enqueues to Valkey for acb-enrichment service
- Returns 202 Accepted with estimated wait time
Also adds:
- AllowN() method to ratelimit package for multi-token checks
- enrichment_requested_at column to matches table (idempotency)
- enrichLtr rate limiter (5/day per bot)
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
- Add 'featured' boolean column to series table for weekly featured series
- Add tickFeaturedSeries ticker that runs Friday 20:00 UTC to create bo5 featured series
- Featured series: query top 20 bots by rating, select 4 rivalry pairs by ELO proximity
- Best-of-7 championship bracket already implemented via createChampionshipBracket
- Add FeaturedSchedSecs config (default: 3600s check interval)
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Update the map engagement scoring formula to match plan §14.6:
- score = win_prob_crossings * 3.0 + critical_moments * 2.0 +
resource_contest_turns * 1.5 + survival_turns * 0.5
New metrics computed from replay data:
- resource_contest_turns: turns where energy is contested by multiple players
- survival_turns: turns where all players have at least one bot alive
The old formula used map_coverage_pct, closeness, and turn_pct which
did not match the specification.
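The weighted sum can be written out as a small function. The struct and function names are assumptions, and treating each metric as a raw integer count (rather than a normalized value) is also an assumption of this sketch.

```go
package main

import "fmt"

// EngagementMetrics holds the four inputs of the formula. Field names
// are hypothetical; the metric names follow the formula above.
type EngagementMetrics struct {
	WinProbCrossings     int
	CriticalMoments      int
	ResourceContestTurns int
	SurvivalTurns        int
}

// engagementScore applies the weights 3.0, 2.0, 1.5 and 0.5.
func engagementScore(m EngagementMetrics) float64 {
	return float64(m.WinProbCrossings)*3.0 +
		float64(m.CriticalMoments)*2.0 +
		float64(m.ResourceContestTurns)*1.5 +
		float64(m.SurvivalTurns)*0.5
}

func main() {
	m := EngagementMetrics{
		WinProbCrossings:     2,
		CriticalMoments:      1,
		ResourceContestTurns: 4,
		SurvivalTurns:        10,
	}
	// 2*3.0 + 1*2.0 + 4*1.5 + 10*0.5 = 19
	fmt.Println(engagementScore(m))
}
```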
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
- Change filter from 'idea'/'mistake' to 'insight'/'idea' (mapping to 'hint'/'strategy' from plan §13.6)
- Increase upvote threshold from 3 to 10 for higher quality signals
- The evolver consumes community_hints.json for LLM prompt context
The rating recovery CLI mode (-mode=recalc-ratings) was using
glicko2Tau (0.5) instead of glicko2DefaultSigma (0.06) for the
default sigma value when resetting ratings. This caused the reset
sigma to be ~8x higher than the schema-defined default.
Added glicko2DefaultSigma constant (0.06) and updated ResetAllRatings
and recalcRatings to use it correctly.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Implements the rating recovery procedure specified in plan §12.3.
Running 'go run ./cmd/acb-worker -mode=recalc-ratings' will:
1. Reset all bot ratings to Glicko-2 defaults (mu=1500, phi=350, sigma=0.06)
2. Fetch all completed matches from the database in chronological order
3. Replay each match to recompute Glicko-2 ratings from scratch
4. Update the bots table with the recalculated ratings
This is needed for disaster recovery when ratings are corrupted or lost.
Database functions added:
- ResetAllRatings: resets all bot ratings to defaults
- GetAllCompletedMatches: fetches completed matches chronologically with participants
- UpdateAllRatings: bulk updates all bot ratings in a single transaction
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Add sitemap.xml generation as a final pass in the index builder. The
sitemap covers all public pages: home, leaderboard, bots list, bot
profiles, matches list, featured replays, seasons, rivalries,
predictions, and docs.
- Add SiteURL config field (ACB_SITE_URL env var, defaults to
https://aicodebattle.com)
- Add generateSitemap() function with proper XML encoding
- Add SitemapURL and Sitemap types for XML marshaling
- Call generateSitemap() at the end of generateAllIndexes()
- Write sitemap.xml to output directory alongside leaderboard.json
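A simplified sketch of the XML marshaling, assuming the standard sitemaps.org urlset layout with only a loc element per URL. The real SitemapURL type may carry more fields (for example lastmod), and the generateSitemap signature shown here is an assumption.

```go
package main

import (
	"encoding/xml"
	"fmt"
)

// SitemapURL is one <url> entry; only <loc> is modeled in this sketch.
type SitemapURL struct {
	Loc string `xml:"loc"`
}

// Sitemap is the <urlset> root element.
type Sitemap struct {
	XMLName xml.Name     `xml:"urlset"`
	Xmlns   string       `xml:"xmlns,attr"`
	URLs    []SitemapURL `xml:"url"`
}

// generateSitemap joins siteURL with each path and marshals the result.
// encoding/xml escapes characters such as "&" in the URLs.
func generateSitemap(siteURL string, paths []string) ([]byte, error) {
	sm := Sitemap{Xmlns: "http://www.sitemaps.org/schemas/sitemap/0.9"}
	for _, p := range paths {
		sm.URLs = append(sm.URLs, SitemapURL{Loc: siteURL + p})
	}
	body, err := xml.MarshalIndent(sm, "", "  ")
	if err != nil {
		return nil, err
	}
	return append([]byte(xml.Header), body...), nil
}

func main() {
	out, err := generateSitemap("https://aicodebattle.com", []string{"/", "/leaderboard"})
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```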
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
- Add engine.CalculateMapEngagement() to compute map engagement scores from replay data (win_prob_crossings, critical_moments, map_coverage_pct, closeness, turn_pct)
- Add DBClient.UpdateMapEngagement() to update map engagement using rolling average
- Worker now calculates and writes map engagement scores after each match
- Add test to verify win_prob array is non-empty in produced replays
This completes the feature that stores the win probability Monte Carlo
array in replay JSON. The engine already called ComputeWinProbability() in
MatchRunner.Run(), so this commit adds the missing map engagement tracking.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Energy node placement now uses a tiered radius distribution: 30% in the
contested central zone (0.05-0.20 from center), 40% in the mid-zone
(0.20-0.40), and 30% in the home zone (0.40-0.60). Previously nodes were
placed uniformly at 0.20-0.70, letting bots farm their home quadrant
indefinitely without crossing the midline.
After cellular automata wall generation, a 3-wide corridor is carved from
each core straight to the map center, plus a 5x5 open arena at the center
tile. This creates lanes that funnel bots into contact — replicating the key
mechanic that drove frequent fights in the original AI Challenge Ants game,
where symmetric food spawning near the midfield forced both colonies to
expand outward and collide.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Adds combat_turns metric (distinct turns where ≥1 bot died from enemy
focus-fire, excluding self-collisions). Worker computes it after each
match; index builder sorts matches/index.json and the new most-combat
playlist descending by it, and bumps interest score for combat-heavy
matches so they surface in highlights.
Also switches homepage featured replay default view from influence to
standard so the actual bot-on-bot combat is visible.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
map_json generated by acb-map-evolver lacks a 'spawns' key; scanning
map_json->>'spawns' into a non-nullable string causes "converting NULL
to string is unsupported". Use COALESCE for walls/spawns/cores.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Add seedIfEmpty: idempotent startup seeding (20 maps per player count,
ON CONFLICT DO NOTHING) using cellular-automata generation + validate()
- Add continuous evolution loop across all player counts (2/3/4/6)
- ACB_MIN_SEED_COUNT and ACB_EVOLUTION_PERIOD configurable via env vars
- Add Dockerfile (lean Alpine build, no language runtimes)
- Add acb-map-evolver to acb-build.yml CI pipeline
- Add staging K8s Deployment manifest
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The encryption key stored in OpenBao/K8s secrets is base64-encoded but
the API and worker crypto functions expected hex. Add parseAESKey() that
accepts both formats (tries hex first, falls back to base64).
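One way the hex-then-base64 fallback could be written is sketched below. The key-length check (16, 24 or 32 bytes) and the use of standard padded base64 are assumptions of this sketch. The length check also guards against a base64 string that happens to contain only hex characters being accepted as a hex key of the wrong size.

```go
package main

import (
	"encoding/base64"
	"encoding/hex"
	"fmt"
)

// validAESKeyLen reports whether n is a legal AES key size in bytes.
func validAESKeyLen(n int) bool {
	return n == 16 || n == 24 || n == 32
}

// parseAESKey tries hex first, then falls back to standard base64.
func parseAESKey(s string) ([]byte, error) {
	if key, err := hex.DecodeString(s); err == nil && validAESKeyLen(len(key)) {
		return key, nil
	}
	key, err := base64.StdEncoding.DecodeString(s)
	if err != nil {
		return nil, fmt.Errorf("key is neither valid hex nor base64: %w", err)
	}
	if !validAESKeyLen(len(key)) {
		return nil, fmt.Errorf("decoded key is %d bytes; want 16, 24 or 32", len(key))
	}
	return key, nil
}

func main() {
	raw := make([]byte, 32) // all-zero demo key, never use in production
	for _, enc := range []string{
		hex.EncodeToString(raw),
		base64.StdEncoding.EncodeToString(raw),
	} {
		key, err := parseAESKey(enc)
		fmt.Println(len(key), err)
	}
}
```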
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
TestThreeMonthAgeCheck used 89*24h as "3 months minus 1 day", but
89 calendar days == exactly 3 months on dates like May 1 (Feb+Mar+Apr=
28+31+30=89). The equality case makes the >3-month eligibility check
return true instead of false. Replace with AddDate-relative anchors
so the test stays correct regardless of current date.
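The date coincidence can be reproduced directly. The helper names below are hypothetical, and the sketch uses UTC noon to stay clear of DST effects.

```go
package main

import (
	"fmt"
	"time"
)

// eightyNineDaysIsThreeMonths reports whether "now minus 89 days" is the
// same instant as "now minus 3 calendar months" for the given date.
func eightyNineDaysIsThreeMonths(year, month, day int) bool {
	now := time.Date(year, time.Month(month), day, 12, 0, 0, 0, time.UTC)
	fixed := now.Add(-89 * 24 * time.Hour)
	calendar := now.AddDate(0, -3, 0)
	return fixed.Equal(calendar)
}

// justInsideWindow returns an anchor one day newer than the 3-month
// cutoff, derived from now so it cannot coincide with the boundary.
func justInsideWindow(now time.Time) time.Time {
	return now.AddDate(0, -3, 1)
}

func main() {
	fmt.Println(eightyNineDaysIsThreeMonths(2023, 5, 1)) // true: Feb+Mar+Apr = 89 days
	fmt.Println(eightyNineDaysIsThreeMonths(2023, 6, 1)) // false: Mar+Apr+May = 92 days
	now := time.Date(2023, 5, 1, 12, 0, 0, 0, time.UTC)
	fmt.Println(justInsideWindow(now).After(now.AddDate(0, -3, 0))) // true
}
```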
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Two root causes prevented bots from making any moves:
1. SignRequest signing string included timestamp ({match_id}.{turn}.{timestamp}.{hash})
but all bots implement verifySignature without timestamp ({match_id}.{turn}.{hash}).
Fixed by dropping timestamp from the signing string; X-ACB-Timestamp header is still
sent for clock-skew checks but not in the HMAC.
2. The API stores bot secrets AES-GCM encrypted (184 hex chars) in the DB. The worker
was passing the ciphertext directly as the HMAC key, while bots use their plaintext
k8s secret (64 hex chars). Fixed by decrypting in the worker using ACB_ENCRYPTION_KEY.
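A sketch of the corrected signing scheme, assuming HMAC-SHA256 with a hex-encoded output and that {hash} is a digest of the request body; neither detail is stated above. The function signatures are illustrative, and whether the plaintext secret is used as raw bytes or hex-decoded first is left open here.

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// signRequest signs "{match_id}.{turn}.{hash}" (no timestamp) with the
// bot's plaintext secret. HMAC-SHA256 and hex output are assumptions.
func signRequest(secret []byte, matchID string, turn int, bodyHash string) string {
	msg := fmt.Sprintf("%s.%d.%s", matchID, turn, bodyHash)
	mac := hmac.New(sha256.New, secret)
	mac.Write([]byte(msg))
	return hex.EncodeToString(mac.Sum(nil))
}

// verifySignature recomputes the signature and compares in constant time.
func verifySignature(secret []byte, matchID string, turn int, bodyHash, sig string) bool {
	want := signRequest(secret, matchID, turn, bodyHash)
	return hmac.Equal([]byte(want), []byte(sig))
}

func main() {
	secret := []byte("plaintext-bot-secret") // the decrypted secret, not the ciphertext
	sig := signRequest(secret, "m_123", 7, "abc123")
	fmt.Println(verifySignature(secret, "m_123", 7, "abc123", sig))
	fmt.Println(verifySignature([]byte("ciphertext-as-key"), "m_123", 7, "abc123", sig))
}
```

The second call mirrors the original bug: signing with the ciphertext (or any other key) on one side and the plaintext secret on the other can never verify.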
Also tightens the home page winner filter to exclude winner_id="0" stalemates.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
_worker.js static file approach fails — Cloudflare rejects it when uploaded
as a static asset. Instead, copy web/functions/ into the image and set
wrangler CWD to /app/web/ so it discovers functions/ and uploads the Pages
Functions bundle correctly on every deploy cycle.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Every index-builder deploy was overwriting the production Pages deployment
without functions (wrangler ran from /tmp, no functions/ dir visible).
Compiling functions/ to dist/_worker.js during the Docker web-builder stage
means the worker is always included in every Pages deploy, regardless of CWD.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
AWS SDK Go v2 s3 v1.100.0 defaults to RequestChecksumCalculationWhenSupported,
which causes PutObject to send STREAMING-UNSIGNED-PAYLOAD-TRAILER — a chunked
transfer mode R2 doesn't support. Setting WhenRequired makes the SDK send a
standard signed payload instead, resolving the 403 SignatureDoesNotMatch.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Adds R2 (Cloudflare) as a direct upload target alongside B2 (cold archive).
When ACB_R2_* credentials are configured, the worker uploads replays and
thumbnails to R2 immediately after each match, bypassing the index-builder's
B2→R2 promotion cycle.
This is necessary because ARMOR's B2 app key is write-only; reads via the
direct S3 path return 403. The Cloudflare CDN read path (armor-hub-b2.ardenone.com)
is dead post-hub-decommission. Direct R2 upload ensures replays are available
without waiting for a working B2 read path.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
EndpointResolverV2 with a custom static URI does not honor UsePathStyle —
the resolver provides the final endpoint and the SDK does not re-apply
path-style bucket addressing on top of it. This means the bucket name was
dropped from the request path even with UsePathStyle=true, sending PUTs
to /replays/... instead of /armor-apexalgo/replays/...
BaseEndpoint is the SDK's documented approach for S3-compatible custom
endpoints; it sets the base URL and then correctly applies path-style
addressing to produce /bucket/key URLs.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Two fixes:
1. Add UsePathStyle=true to B2 S3 client. Without it the SDK uses
virtual-hosted addressing, dropping the bucket name from the request
path. Uploads hit /replays/... instead of /armor-apexalgo/replays/...
causing NoSuchBucket errors on every replay/thumbnail PutObject.
2. Don't update crash_strikes for normal game endings (stalemate, turns).
In snake-style games every bot eventually crashes into a wall/snake —
that is the expected end condition, not an HTTP error. The old code
treated all Crashed[] entries from the engine as errors, causing all
6 bots to accumulate strikes after every single match and hitting the
30-min cooldown threshold after just 3 matches.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The replay_feedback table was missing its foreign key constraint to
matches(match_id). This happened because CREATE TABLE IF NOT EXISTS
doesn't add FKs to existing tables.
Added an idempotent migration that checks for the constraint's existence
before adding it, ensuring it's safe to run on both fresh installs and
existing databases.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
The BaseEndpoint approach with older aws-sdk-go-v2 causes
"Invalid region: region was not a valid DNS name" errors when
uploading to ARMOR's S3-compatible endpoint.
Switching to EndpointResolverV2 bypasses the SDK's endpoint
rule validation entirely, resolving the issue.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Fixes 'Invalid region: region was not a valid DNS name' error when
uploading replays to B2 via ARMOR proxy.
The error was caused by a known bug in aws-sdk-go-v2 v1.41.4 where
the endpoint resolver would validate the region as a DNS name even
when using a custom BaseEndpoint with UsePathStyle=true.
Upgraded SDK versions:
- github.com/aws/aws-sdk-go-v2 v1.41.4 -> v1.41.6
- github.com/aws/aws-sdk-go-v2/config v1.32.12 -> v1.32.16
- github.com/aws/aws-sdk-go-v2/service/s3 v1.97.2 -> v1.100.0
- github.com/aws/smithy-go v1.24.2 -> v1.25.1
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
- Verified /watch/replays shows real completed matches (not just demo)
- Match cards display bot names, turn count, winner badges, map ID
- 'Watch Replay' links point to real match IDs (m_test_*)
- Curated playlists render with real data (featured, comebacks, upsets, etc.)
- Pagination/infinite scroll works via IntersectionObserver
- Mobile testing on Pixel 6 via ADB: layout responsive, touch targets usable
- Created MATCH_LIST_TEST_RESULTS.md with full verification details
- Thumbnails not implemented (clean UI without broken images due to R2 issues)
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
The AWS SDK requires a valid AWS region name even when using custom
S3-compatible endpoints (ARMOR/B2). Using "auto" as the region causes
an error: "Invalid region: region was not a valid DNS name."
This fixes the replay upload pipeline which was failing with the
invalid region error. Replays should now upload successfully to B2
via the ARMOR proxy.
Related to ai-code-battle-o43: Replay viewer verification task.
Verification results:
1. ✅ /data/blog/index.json exists and has 1 post (meta-week-13-season-1)
2. ✅ Individual post pages load correctly at /blog/{slug}
3. ✅ Blog post JSON structure matches frontend expectations (content_md field)
4. ✅ Tags and filters implemented in UI (All, Meta Reports, Chronicles buttons)
5. ✅ Blog page builds successfully (blog-D4QMd11d.js included in build)
Current state: Blog infrastructure is fully implemented with:
- LLM-powered narrative generation (blog.go, narrative.go)
- Story arc detection (rise, fall, rivalry, upset, evolution milestones)
- Weekly meta report generation with ELO movers, strategy analysis
- Chronicles for story arcs (rivalry, upset, rise/fall, evolution)
- Tag-based filtering and search
Note: Current blog content is placeholder/template-based. Meaningful
match commentary will be generated when:
- ACB_LLM_BASE_URL and ACB_LLM_API_KEY are configured in index-builder
- Real match data exists in PostgreSQL database
- Story arcs are detected from rating history and match results