jedarden/ai-code-battle

Author	SHA1	Message	Date
jedarden	00069b1870	feat(acb-api): implement bot registration, job coordination, and replay endpoints per plan §12 Phase 4 - POST /api/register: bot registration with URL + shared secret validation - GET /api/job: worker polls for next pending match job (authenticated) - POST /api/job/:id/result: worker submits match result (winner, replay JSON) - GET /api/replay/:id: serve replay JSON from R2 warm cache (falls back to B2) - GET /api/bot/:id: bot profile JSON (rating, elo, record, metadata) - GET /api/bots: leaderboard snapshot with pagination - POST /api/ui-feedback: accept Agentation UI feedback Authentication via Bearer token (worker API key). Shared secrets encrypted with AES-256-GCM using ACB_ENCRYPTION_KEY.	2026-04-21 08:58:42 -04:00
jedarden	fb0ae2b603	docs(phase6): add deployment checklist and make scripts executable - Add comprehensive Phase 6 deployment checklist (docs/phase6-deployment-checklist.md) - Make all deployment scripts executable (chmod +x scripts/*.sh) - Document remaining Cloudflare setup steps requiring account access - Include verification commands and expected URLs - Document data flow architecture Phase 6 code is complete. Remaining infrastructure setup requires Cloudflare account access for: - Cloudflare Pages project creation - R2 bucket creation and custom domain - DNS configuration All deployment scripts are ready to run once Cloudflare access is available.	2026-04-08 17:29:02 -04:00
jedarden	f5d7553f98	Add Phase 7-9 features: evolution dashboard, WASM sandbox, enhanced replay Phase 7 Evolution: - Add live-export subcommand to acb-evolver for dashboard JSON generation - Export programs, stats, and generation log to live.json Phase 8 Enhanced Features: - Add WASM game engine build (cmd/acb-wasm/) with JS bindings - Add in-browser sandbox page with Monaco editor (web/src/pages/sandbox.ts) - Add win probability computation (web/src/win-probability.ts) - Add replay commentary generator (web/src/commentary.ts) - Add clip maker for GIF/MP4 export (web/src/pages/clip-maker.ts) - Add rivalry detection and pages (web/src/pages/rivalries.ts) - Add replay feedback system (web/src/pages/feedback.ts) - Add evolution dashboard page (web/src/pages/evolution.ts) Phase 9 Platform Depth: - Add predictions API (cmd/acb-api/predictions.go) - Add series management API (cmd/acb-api/series.go) - Add seasons API (cmd/acb-api/seasons.go) - Add narrative generator for rivalries (cmd/acb-indexer/src/narrative.ts) Engine Updates: - Add debug field to move response schema - Add match event timeline extraction - Add replay enrichment fields Web Updates: - Update app.html navigation for new pages - Add API client methods for predictions, series, seasons - Export engine types for browser use Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-29 01:13:23 -04:00
jedarden	930f0263c2	Restore hybrid architecture: Cloudflare Pages + R2 for web, K8s for compute Corrected the full-K8s migration — the static site belongs on Cloudflare Pages, not Nginx in K8s. The architecture is now: - Cloudflare Pages: SPA + pre-computed JSON indexes (deployed by K8s index builder via wrangler every ~90 min) - Cloudflare R2: replays, per-match metadata, evolution live.json, thumbnails, bot cards (uploaded by K8s workers via S3 API) - apexalgo-iad K8s: Go API (Traefik ingress), PostgreSQL (CNPG), Valkey (job queue), match workers, strategy bots, evolver, index builder Browser loads SPA from Pages, replays from R2, calls Go API on K8s. No Nginx pod, no PersistentVolume for web data. R2 is the data bus between K8s agents and the Cloudflare-served website. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-24 23:24:22 -04:00
jedarden	8ee9944f2a	Replace CronJobs with Deployments using sleep-loop + periodic exit All periodic workloads now run as Deployments with internal sleep loops that exit after a configurable lifetime (default 4h) for K8s to restart. No CronJobs or Jobs used. - Index builder: 15-min sleep loop, writes JSON to PV, exits after 4h - Replay pruning: folded into index builder (weekly cycle check), acb-replay-pruner container removed - Evolver: already used this pattern (4h exit) - Match workers: standard long-running Deployment (no periodic exit needed) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-24 22:55:05 -04:00
jedarden	1c857801fe	Migrate deployment from Cloudflare+Rackspace to apexalgo-iad K8s cluster Major architecture change: everything now deploys to the apexalgo-iad Kubernetes cluster via ArgoCD GitOps. Replaced: - Cloudflare Pages → Nginx container in Web Pod - Cloudflare Workers → Go API container in Web Pod (same pod, shared PVC) - Cloudflare D1 → CNPG PostgreSQL (cnpg-apexalgo cluster) - Cloudflare R2 → SATA (Cinder CSI) PersistentVolume - Rackspace Spot → K8s Deployments in ai-code-battle namespace New infrastructure: - Web Pod: 2-container pod (Nginx + Go API) sharing an RWO SATA PVC - Match workers: K8s Deployment with configurable replicas - Strategy bots: individual Deployments + Services - Job queue: Valkey (Redis-compatible, existing cluster resource) - Cron jobs: K8s CronJobs for index building, replay pruning, health checks - CI/CD: Argo Workflows + Events (GitHub webhook triggers) - Secrets: SealedSecrets encrypted in git - TLS: Traefik IngressRoutes + cert-manager - CDN: Cloudflare free plan as reverse proxy only (no CF services) - D1 SQLite schema → PostgreSQL (proper types, FKs, indexes, JSONB) Storage: SATA Cinder CSI (RWO). Web Pod and Index Builder CronJob share the PVC via node affinity (both scheduled to same node). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-24 22:50:57 -04:00
jedarden	caf97b4535	Add artifact inventory (§11), renumber sections 12-16 New section 11 (Artifact Inventory) documents every deliverable: - Monorepo structure with full directory tree (engine/, cmd/, worker-api/, web/, wasm/, bots/, starters/, docs/) - 4 Cloudflare artifacts (Pages, Worker, D1, R2) - 12 Rackspace container images (worker, evolver, index-builder, replay-pruner, 6 strategy bots, evolved bots) - 7 WASM artifacts (engine + 6 sandbox bots) - 6 starter kit template repos - 2 CLI tools (acb-local, acb-mapgen) - Build & deploy pipeline (GitHub Actions → CF + container registry) - Shared library map showing code reuse across artifacts Sections renumbered: old 11→12 (Phases), 12→13 (Enhanced), 13→14 (Depth), 14→15 (Ecosystem), 15→16 (UX). All §-references updated. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-24 00:36:50 -04:00
jedarden	00e444295b	Fix 6 minor review issues: R2 budget math, section refs, data paths - R2 Class A writes corrected from 44K to 101K/month (60 matches/hr × 2 writes × 720h + 14.4K evolution live.json). Still 90% headroom. - R2 Class B reads reconciled to 50K/month (was 30K in one place, 20K in another) - Fixed §12 evolution meta reference → §10.2 MAP-Elites behavior grid - Added community_hints.json to documented data paths in §14.2 - Normalized all SS notation to § throughout document (5 occurrences) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-24 00:25:45 -04:00
jedarden	e241aed88d	Fix 13 review issues: R2 budget, schema consolidation, protocol spec, secret storage, evolution throughput, predictions, enrichment, fairness, lineage, crash ratings, cron model, core capture, replay pruning Critical fixes: - R2 write budget: replaced Worker cron index rebuilder (was ~1.6M writes/mo, over 1M limit) with Rackspace index builder that deploys to Pages every ~90 min (500 builds/mo). R2 now only for replays, match metadata, and evolution live status (~44K writes/mo). - D1 schema consolidated: all 13 tables in one place (§8.3), including predictions, map_votes, replay_feedback, series, series_games, seasons - Protocol schema examples updated with notes about future additive fields (season_id, terrain, debug) per backward compatibility rules High fixes: - Shared secret storage: removed self-contradicting draft note, clean statement of AES-256-GCM approach - Predictions: changed predicted_winner INTEGER to predicted_bot_id TEXT (tied to bot identity, not random player slot) Medium fixes: - Evolution throughput: configurable ladder/evolution ratio (70/30 default), container exits after 4h for Kubernetes redeploy - Test harnesses added: game engine, bot protocol, evolution validation - Enrichment: coding agent on Rackspace generates markdown play-by-play - Map fairness: sample increased from 20 to 80 matches (~2% false positive vs ~15%) - Bot lineage: parent_ids TEXT column added to bots table - Crash/timeout matches explicitly affect Glicko-2 ratings - "Undefended core" defined at Phase CAPTURE - Replay pruning: age-based 90-day, weekly Rackspace job, exemptions for playlists/rivalries/series/seasons, acb-replay-pruner container Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-24 00:16:06 -04:00
jedarden	b5454d35b3	Add UX polish: micro-animations, Director Mode, view morphing, follow camera, PiP, performance trifecta, progressive disclosure, mobile swipe playlists, theater mode, ambient awareness 10 UX/UI features in sections 15.9-15.18: - Replay micro-animations: 60fps interpolation between game ticks, particle deaths, energy starbursts, combat lines, core shockwaves - Director Mode: adaptive auto-speed based on action density per turn, cinematic tempo that slows for combat and speeds through exploration - View mode morphing: 300ms cross-fade between dots/territory/influence using dual off-screen Canvas buffers - Follow Bot camera: viewport lerps to track one player's bounding box, pairs with fog-of-war for first-person documentary feel - Picture-in-picture: replay minimizes to 200x200 floating player on navigation, Canvas reparented (not recreated), GPU-animated transitions - Performance trifecta: hover preload (100ms delay), layout-matched skeleton screens with shimmer, 5-page back-nav cache (0ms restore) - Progressive disclosure: viewer XP tracks engagement, reveals controls at levels 0/2/5/10/20, golden pulse + tooltip on new features - Mobile swipe playlists: full-screen cards, auto-play Director Mode, swipe-up to advance, background preload of next replay - Theater mode: fullscreen chrome-free canvas, vignette + critical moment pulse, auto-hiding controls, landscape trigger on mobile - Ambient awareness: favicon badges, tab title updates, haptic on critical moments, seasonal background hue shift, live match counter Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-23 23:35:29 -04:00
jedarden	8f7dea26ee	Add UX design: audiences, navigation, mobile, first-visit flow New section 15 covers: - Three audiences (spectator, participant, visitor) with distinct needs - Information architecture: /watch, /compete, /leaderboard as primary hubs - Homepage design: auto-playing featured replay, two CTAs, compact leaderboard + playlists + season status + evolution mini - Desktop top bar + mobile bottom tab bar navigation - Responsive breakpoints (phone/tablet/desktop) with mobile-first spectating, desktop-first building - Mobile replay viewer: full-width canvas, pinch-to-zoom, tap play/pause, swipe-to-scrub timeline, slide-up panels - Sandbox is desktop-only (clear mobile message + "send to desktop" link) - First-time visitor funnel: see → watch → engage → build - Performance budget: <2s LCP, <200KB gzipped JS, code-split per route, lazy WASM loading, stale-while-revalidate data fetching - Design language: dark theme, minimal chrome, functional animation, mono+sans-serif typography - 11 reusable components covering all pages and breakpoints Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-23 23:19:21 -04:00
jedarden	dec57cc78d	Clarify Pages vs R2 data split in architecture Pages serves the SPA shell only (~500-1000 files): HTML, JS, CSS, WASM, docs. Changes only on code deploys. R2 serves all dynamic data via custom domain: replays (~130K files at 90d retention), leaderboard, bot profiles, matches, evolution status, blog posts, thumbnails, cards. R2 is also the data bus for Rackspace agents — same files browsers read are what workers write. Added detailed file layout for both, data loading pattern with cache headers, and updated architecture diagram showing the three-way data flow (Worker materializes D1→R2, Rackspace writes to R2, browser reads from R2). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-23 23:17:20 -04:00
jedarden	34887204ee	Add ecosystem features: meta reports, public data, accessibility, evolution observatory, narrative engine New section 14 + Phase 10: - Weekly meta report as auto-generated blog posts on /blog with LLM-enhanced narrative sections (~$0.05/week) - Public match data as documented static JSON file paths in R2, no Worker API needed; versioned replay format spec - Accessibility suite: Tol color-blind palette, shape-per-player, keyboard shortcuts, high contrast, reduced motion, screen reader transcript, focus indicators - Live evolution observatory: evolver writes live.json to R2 every cycle (~20 writes/hr), page polls every 10s, renders island status + candidate pipeline + activity feed + lineage tree + meta chart - Narrative engine: weekly story arc detection (rise, fall, rivalry, upset, evolution, comeback), LLM-generated 200-word chronicles published as blog posts (~$1.30/month) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-23 23:15:02 -04:00
jedarden	282a3aed94	Add platform depth features: debug telemetry, territory heatmaps, embeds, playlists, predictions, map evolution, series, event timeline, seasons, bot cards New section 13 (10 features) + Phase 9: - Bot debug telemetry: optional structured debug field in move response, rendered as targets/heatmaps/reasoning in the replay viewer - Three replay view modes: dots, Voronoi territory, influence gradient - Embeddable replay widget (~50KB iframe, OG tags, query params) - Auto-curated playlists: Closest Finishes, Upsets, Comebacks, etc. - Prediction system for non-coders (within CF free tier: ~864 writes/day) - Map evolution with breeding, engagement scoring, positional fairness monitoring, and user voting (upvote/downvote maps) - Multi-game series (bo3/bo5/bo7) across different maps with spoiler toggle - Match event timeline with clickable icon ribbon - Seasonal rotations with backward-compatible rule versioning (additive changes only: new optional fields, param tuning, new tile types that old bots can ignore), championship brackets, season archives - Bot profile cards as shareable PNGs with OG tags Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-23 22:56:54 -04:00
jedarden	46f5e2ac4a	Sandbox: WASM-per-bot model, drop what-if branching Reworked sandbox to use separate WASM modules per bot instead of JS function callbacks. Each bot compiles to its own WASM with a standard init/compute_moves/free_result interface. Supports Go, Rust, TS natively; PHP/Java bots reimplemented in Go for WASM. Memory budget 30-105MB depending on player count. Two user modes: TS quick-start in Monaco or upload pre-compiled .wasm file. Dropped what-if replay branching — good in theory but the split-pane counterfactual UI is a power-user feature that would confuse casual visitors. Not worth the complexity. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-23 22:38:00 -04:00
jedarden	66fe718835	Add enhanced features: WASM sandbox, win probability, what-if branching, AI commentary, clip maker, rival detection, community feedback New section 12 covers seven features with Phase 8 for implementation: - In-browser WASM sandbox (1 WASM engine + JS bots, ~30-50MB, <2s) - Win probability graph via Monte Carlo rollout + critical moments - What-if replay branching (single WASM, state injection, split-pane) - Selective AI commentary for featured/rivalry/milestone replays - Clip maker with GIF/MP4 export for Twitter, TikTok, IG, Discord - Automatic rival detection with template-generated narratives - Community replay feedback tagged by turn, feeding evolution hints Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-23 22:35:34 -04:00
jedarden	e41597f65b	Move web tier to Cloudflare free plan, compute to Rackspace Spot Architecture split: Cloudflare Pages (static site), Worker (API + cron scheduling), D1 (SQLite database), R2 (replays + JSON indexes, zero egress) — all within the free tier with 95%+ headroom on every quota. Rackspace Spot handles match workers, bot containers, and the evolution pipeline — all stateless and interruptible. Includes D1 schema, Worker cron design (matchmaker, indexer, health checker, reaper), R2 bucket layout, free tier usage math, and graceful degradation model. Drops infrastructure cost from ~$65-110/mo to ~$35-70/mo. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-23 22:16:56 -04:00
jedarden	512dfc201d	Replace Minio with plain filesystem + Nginx All platform data is now JSON files on disk served directly by Nginx. Workers submit results to the scheduler over Tailscale HTTP, scheduler writes to the data directory. No object store, no S3 API. Scheduler coordinates jobs via HTTP endpoints backed by SQLite. Data directory is a persistent volume on the stable instance. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-23 22:05:46 -04:00
jedarden	85a2a3915e	Rework architecture for static site + Rackspace Spot practicalities Major changes: - Website is now a static site loading JSON from object store, no app server - Removed PostgreSQL, Redis, PgBouncer, read replicas — replaced with Minio (S3-compatible) for all data and SQLite on the scheduler for bookkeeping - Registration is a minimal 3-endpoint API, no user accounts needed - Match job coordination via file moves in Minio (no message broker) - All user-facing data is pre-built JSON fetched client-side - Single stable instance runs everything except match workers - Spot workers are fully stateless — site stays up when all are reclaimed - Added cost model (~$65-110/mo excluding LLM API) - Simplified monitoring to match the architecture Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-23 21:59:17 -04:00
jedarden	63212f02ab	Add LLM-driven bot evolution system to plan New section 10 covers the full evolution pipeline: FunSearch-style island model for population diversity, LLM ensemble (fast+strong) for candidate generation, multi-stage validation (syntax, schema, nsjail sandbox), evaluation arena, Nash equilibrium promotion gate (LLM-PSRO), automated container build/deploy, retirement policy, and evolution dashboard with lineage viewer and meta tracker. Phase 7 added to implementation plan. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-23 21:51:24 -04:00
jedarden	decae849c7	Add research on LLM-driven bot evolution systems Covers FunSearch, AlphaEvolve (+ open-source clones OpenEvolve, ShinkaEvolve), ELM/OpenELM, AlphaCode, Voyager, LLM-PSRO, CATArena, and sandboxing options. Recommends FunSearch island model + LLM-PSRO Nash equilibrium selection for the evolution pipeline. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-23 21:46:23 -04:00
jedarden	d7cf4625e2	Strategy bots: one per language with starter kits Each of the six built-in strategy bots is now implemented in a different language (Python, Go, Rust, PHP, TypeScript, Java) to demonstrate that the HTTP protocol is truly language-agnostic. Added per-language container templates, resource specs, and forkable starter kit repos for participants. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-23 21:32:22 -04:00
jedarden	11f91db338	Add comprehensive implementation plan Covers game mechanics (toroidal grid, focus-fire combat, energy economy, fog of war), HTTP communication protocol with HMAC authentication, six built-in strategy bots as containers, Glicko-2 tournament system, compact JSON replay format with Canvas-based browser viewer, web platform with bot registration, and Rackspace Spot deployment architecture. Organized into six implementation phases from core engine through production. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-23 21:24:17 -04:00
jedarden	584c715e2c	Add aichallenge ants research and initial requirements Research covers game mechanics, bot protocol, tournament/ranking, replay system, infrastructure, and successor platforms. Requirements document captures the HTTP-based bot communication model, 3-second timeout, strict schema enforcement, and replay visualization needs. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-23 21:08:11 -04:00
jedarden	81394e8705	Add docs directory structure with research and plan subdirectories Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-23 20:47:58 -04:00

25 commits
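
The R2 write-budget correction in commit 00e444295b (60 matches/hr × 2 writes × 720h + 14.4K evolution live.json) can be checked with a few lines of Go. This is only a sketch of the arithmetic, not code from the repository: the function and constant names are hypothetical, and the 1M/month ceiling is taken from the "over 1M limit" wording in commit e241aed88d.

```go
package main

import "fmt"

// Figures quoted in the commit messages. All identifiers here are
// illustrative assumptions, not names from the ai-code-battle codebase.
const (
	matchesPerHour     = 60
	writesPerMatch     = 2
	hoursPerMonth      = 720
	evolutionWrites    = 14_400    // "14.4K evolution live.json" per month
	classAMonthlyLimit = 1_000_000 // "1M limit" from commit e241aed88d
)

// classAWrites estimates monthly R2 Class A writes: per-match writes
// accumulated over the month, plus the evolver's live.json updates.
func classAWrites(perHour, perMatch, hours, evolution int) int {
	return perHour*perMatch*hours + evolution
}

// headroomPercent returns the unused share of the monthly limit, in percent.
func headroomPercent(used, limit int) float64 {
	return 100 * float64(limit-used) / float64(limit)
}

func main() {
	used := classAWrites(matchesPerHour, writesPerMatch, hoursPerMonth, evolutionWrites)
	fmt.Printf("Class A writes/month: %d\n", used)
	fmt.Printf("Headroom: %.1f%%\n", headroomPercent(used, classAMonthlyLimit))
}
```

With the quoted inputs the total is 100,800 writes per month, which rounds to the 101K in the commit message and leaves just under 90% of the assumed limit unused. The 14.4K evolution figure also matches the ~20 writes/hr stated in commit 34887204ee (20 × 720).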