ai-code-battle/PROGRESS.md
jedarden 259b9de86a Add Kubernetes manifests for ArgoCD GitOps deployment
Create deploy/k8s/ directory with complete K8s manifests for the
ai-code-battle namespace on apexalgo-iad cluster:

- Namespace and ArgoCD Application (auto-sync, prune, self-heal)
- Match worker Deployment (2 replicas, metrics on :9090)
- Index builder Deployment (Recreate strategy)
- 6 strategy bot Deployments (random, gatherer, rusher, guardian, swarm, hunter)
- ClusterIP Services for all bots (cluster DNS resolution)
- SealedSecret templates (API key, R2 creds, bot secrets, Cloudflare token)
- All containers from forgejo.ardenone.com/ai-code-battle/ registry
- Health/readiness probes and resource limits on all deployments

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-26 00:57:48 -04:00

12 KiB

AI Code Battle - Implementation Progress

Current Phase: Phase 6 - Deployment & Production

Status: 🔄 In Progress

Last Updated: 2026-03-26

Recent Changes (2026-03-26)

  • Added Kubernetes manifests for GitOps deployment via ArgoCD (deploy/k8s/)
    • Namespace, ArgoCD Application with auto-sync and self-heal
    • Deployments: match worker (2 replicas), index builder, 6 strategy bots
    • ClusterIP Services for all 6 bots (cluster DNS: acb-strategy-*.ai-code-battle.svc:8080)
    • SealedSecret templates: API key, R2 credentials, bot HMAC secrets, Cloudflare API token
    • All manifests validated (20 files, valid YAML with correct apiVersion/kind)
    • Container images from forgejo.ardenone.com/ai-code-battle/ registry
    • Health/readiness probes on all deployments
    • Resource requests/limits on all containers
  • All tests pass (engine + worker)

Previous Changes (2026-03-26)

  • Added Prometheus-compatible metrics endpoint to match worker (cmd/acb-worker/metrics.go)
    • Counters: matches_total, match_errors_total, jobs_claimed/failed, replays_uploaded, poll_cycles, heartbeats
    • Histograms: match_duration_seconds, replay_upload_duration_seconds, replay_size_bytes
    • Worker info gauge with worker_id label
    • /health and /ready endpoints on metrics HTTP server (default :9090)
    • Configurable via ACB_METRICS_ADDR environment variable
  • Instrumented worker execution flow with metrics recording
  • Added comprehensive tests (cmd/acb-worker/metrics_test.go)
    • Health/ready endpoint tests, counter accuracy, histogram bucket correctness
    • Concurrency safety test (10 goroutines x 100 operations)
  • All tests pass (engine + worker)

Previous Changes (2026-03-24)

  • Added GitHub Actions CI workflow (.github/workflows/ci.yml)
  • Added README.md with project overview and quick start guide
  • Added .gitignore and package-lock.json files

Phase 6 Progress

  • Match worker container (cmd/acb-worker/Dockerfile)
    • Multi-stage Go build
    • Non-root user for security
    • Environment variable configuration
  • Bot-host deployment (docker-compose.bots.yml)
    • Orchestrates all 6 strategy bots
    • Health checks for each bot
    • Environment-based secret configuration
  • Worker deployment (docker-compose.workers.yml)
    • Match worker with scaling support
    • Index builder for periodic runs
    • R2 and API configuration
  • Environment configuration (.env.example)
    • Documented all required environment variables
  • Deployment documentation (DEPLOYMENT.md)
    • Architecture overview
    • Cloudflare setup instructions
    • Container deployment commands
    • Troubleshooting guide
  • D1 database schema and migrations
    • Complete schema.sql with all tables from plan
    • Added: predictions, predictor_stats, map_votes, replay_feedback, series, series_games, seasons
    • Added evolution fields to bots table (evolved, island, generation, parent_ids)
    • Created migrations/0001_initial.sql for D1 migrations
    • Updated wrangler.toml with migrations_dir config
  • Monitoring endpoints
    • /health - Liveness probe (always returns 200)
    • /ready - Readiness probe (checks database connectivity, returns 503 if unavailable)
    • Documented in DEPLOYMENT.md
  • Prometheus metrics endpoint (cmd/acb-worker/metrics.go)
    • Counters: matches, errors, jobs, replays, polls, heartbeats
    • Histograms: match duration, replay upload duration, replay size
    • Worker info gauge with labels
    • Separate HTTP server on configurable port (default :9090)
    • Integrated into worker execution flow with full instrumentation
  • GitHub Actions CI workflow
    • .github/workflows/ci.yml for automated testing
    • Go tests with race detector
    • TypeScript tests for worker-api and indexer
    • Web build verification
    • Go binary builds
  • Kubernetes manifests for ArgoCD GitOps (deploy/k8s/)
    • namespace.yaml - Dedicated ai-code-battle namespace
    • argocd-application.yaml - Auto-sync with prune and self-heal
    • deployments/acb-worker.yaml - Match worker (2 replicas, metrics on :9090)
    • deployments/acb-index-builder.yaml - Index builder (1 replica, Recreate strategy)
    • deployments/acb-strategy-{random,gatherer,rusher,guardian,swarm,hunter}.yaml - 6 strategy bots
    • services/acb-strategy-*.yaml - ClusterIP services for bot DNS resolution
    • sealed-secrets/ - Templates for API key, R2 creds, bot secrets, Cloudflare token
    • All containers from forgejo.ardenone.com/ai-code-battle/ registry
    • Health/readiness probes and resource limits on all deployments

Remaining Phase 6 Work (requires Cloudflare account access)

  • Cloudflare Pages project creation and deployment
  • R2 bucket creation and custom domain
  • Worker API deployment via Wrangler (wrangler deploy)
  • DNS configuration

Phase 5 Completed

  • SPA application shell (web/app.html)
    • Navigation header with links to all sections
    • Dark theme with CSS custom properties
    • Responsive layout
  • Hash-based router (web/src/router.ts)
    • Pattern matching with parameter extraction
    • Navigation and history support
  • Page components (web/src/pages/)
    • Home page with hero, features, quick links
    • Leaderboard with ranking table
    • Match history with match cards
    • Bot directory with bot cards
    • Bot profile with stats, rating chart, recent matches
    • Registration form with API key display
    • Replay viewer (integrated from Phase 3)
    • Docs/Getting Started page
  • API client (web/src/api-types.ts)
    • fetchLeaderboard()
    • fetchBotDirectory()
    • fetchBotProfile()
    • fetchMatchIndex()
    • registerBot()
    • rotateApiKey()
  • Cloudflare Pages deployment configuration
    • web/pages.json - Project configuration
    • web/public/_headers - Cache control headers
    • web/public/robots.txt - SEO
    • web/public/data/ - Placeholder index file structure
  • R2 bucket custom domain documentation
    • Documented in web/pages.json data_paths section

Phase 4 Completed

Phase 3 Completed

Phase 2 Completed

Phase 5 Exit Criteria

Criterion Status
SPA with navigation (leaderboard, matches, bots, register) Complete
Home page with getting started info Complete
Registration form with API key display Complete
Bot profiles with rating history chart Complete
Match history page Complete
Leaderboard with rankings Complete
Getting started / docs page Complete
Cloudflare Pages deployment config Complete
R2 bucket custom domain for replays Documented

Phase 1 Completed

File Structure

ai-code-battle/
├── go.mod
├── go.sum
├── .env.example              # Environment configuration template
├── DEPLOYMENT.md             # Deployment guide
├── docker-compose.bots.yml   # Bot-host orchestration
├── docker-compose.workers.yml # Worker orchestration
├── .github/
│   └── workflows/
│       └── ci.yml            # GitHub Actions CI workflow
├── engine/
│   ├── types.go        # Core data types
│   ├── grid.go         # Toroidal grid implementation
│   ├── game.go         # Game state management
│   ├── turn.go         # Turn execution phases
│   ├── replay.go       # Replay recording
│   ├── match.go        # Match runner
│   ├── bot_local.go    # Local bot interface
│   ├── bot_http.go     # HTTP bot client
│   ├── auth.go         # HMAC authentication
│   └── *_test.go       # Test files
├── cmd/
│   ├── acb-local/      # CLI match runner
│   ├── acb-mapgen/     # Map generator
│   ├── acb-worker/     # Match execution worker
│   │   ├── main.go      # Worker entry point
│   │   ├── api.go       # Worker API client
│   │   ├── api_test.go  # API client tests
│   │   ├── r2.go        # R2 upload client
│   │   └── Dockerfile   # Worker container
│   └── acb-indexer/    # Index builder
│       ├── package.json
│       ├── Dockerfile
│       └── src/
│           ├── index.ts       # Entry point
│           ├── api.ts         # Worker API client
│           ├── generator.ts   # Index file generator
│           ├── writer.ts      # File system writer
│           └── generator.test.ts
├── worker-api/
│   ├── package.json    # npm dependencies
│   ├── wrangler.toml   # Cloudflare Worker config
│   ├── schema.sql      # Complete D1 schema (all tables)
│   ├── migrations/     # D1 migration files
│   │   └── 0001_initial.sql
│   └── src/
│       ├── index.ts        # Router + cron dispatcher
│       ├── types.ts        # TypeScript types
│       ├── glicko2.ts      # Glicko-2 rating system
│       ├── glicko2.test.ts # Rating system tests
│       ├── jobs.ts         # Job coordination endpoints
│       ├── bots.ts         # Bot management endpoints
│       ├── export.ts       # Data export endpoint
│       └── cron.ts         # Cron handlers
├── web/
│   ├── package.json    # npm dependencies
│   ├── tsconfig.json   # TypeScript config
│   ├── vite.config.ts  # Vite bundler config
│   ├── pages.json      # Cloudflare Pages project config
│   ├── index.html      # Standalone replay viewer
│   ├── app.html        # SPA shell with navigation
│   ├── public/         # Static assets (copied to dist/)
│   │   ├── _headers    # Cloudflare cache headers
│   │   ├── robots.txt  # SEO
│   │   └── data/       # Placeholder index files
│   │       ├── leaderboard.json
│   │       ├── bots/index.json
│   │       └── matches/index.json
│   └── src/
│       ├── types.ts        # Replay type definitions
│       ├── api-types.ts    # API client and types
│       ├── router.ts       # Hash-based SPA router
│       ├── replay-viewer.ts # Canvas viewer class
│       ├── main.ts         # Standalone replay viewer
│       ├── app.ts          # SPA entry point
│       └── pages/          # SPA page components
│           ├── home.ts
│           ├── leaderboard.ts
│           ├── matches.ts
│           ├── bots.ts
│           ├── bot-profile.ts
│           └── register.ts
├── bots/
│   ├── random/         # Python - RandomBot
│   ├── gatherer/       # Go - GathererBot
│   ├── rusher/         # Rust - RusherBot
│   ├── guardian/       # PHP - GuardianBot
│   ├── swarm/          # TypeScript - SwarmBot
│   └── hunter/         # Java - HunterBot
├── deploy/
│   └── k8s/               # Kubernetes manifests (ArgoCD GitOps)
│       ├── namespace.yaml
│       ├── argocd-application.yaml
│       ├── deployments/   # Worker, index builder, 6 strategy bots
│       ├── services/      # ClusterIP services for bots
│       └── sealed-secrets/ # Secret templates
└── docs/
    └── plan/
        └── plan.md     # Full implementation plan

Strategy Bot Summary

Bot Language Strategy Expected Rank
RandomBot Python Random valid moves 6th (floor)
GathererBot Go Energy collection, avoid combat 4th-5th
RusherBot Rust Rush enemy cores aggressively 4th-5th
GuardianBot PHP Defend cores, cautious expansion 3rd-4th
SwarmBot TypeScript Formation cohesion, group advance 1st-2nd
HunterBot Java Target isolated enemies 1st-2nd

Running Tests

# Go engine tests
go test ./engine/... -v

# Web build verification
cd web && npm run build

Building CLI Tools

go build ./cmd/acb-local
go build ./cmd/acb-mapgen

Running a Match

./acb-local -seed 42 -max-turns 100 -output replay.json -verbose

Viewing a Replay

cd web
npm run dev
# Standalone viewer: http://localhost:3000/index.html
# Full SPA: http://localhost:3000/app.html (then go to #/replay)