Add enhanced features: WASM sandbox, win probability, what-if branching,

AI commentary, clip maker, rival detection, community feedback

New section 12 covers seven features with Phase 8 for implementation:
- In-browser WASM sandbox (1 WASM engine + JS bots, ~30-50MB, <2s)
- Win probability graph via Monte Carlo rollout + critical moments
- What-if replay branching (single WASM, state injection, split-pane)
- Selective AI commentary for featured/rivalry/milestone replays
- Clip maker with GIF/MP4 export for Twitter, TikTok, IG, Discord
- Automatic rival detection with template-generated narratives
- Community replay feedback tagged by turn, feeding evolution hints

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
This commit is contained in:
jedarden 2026-03-23 22:35:34 -04:00
parent e41597f65b
commit 66fe718835

View file

@ -1918,3 +1918,487 @@ participants can register and play.
validates, evaluates, promotes, deploys, and retires bots without human
intervention. At least one evolved bot reaches the top 50% of the ladder
within the first week of operation.
### Phase 8: Enhanced Features
**Deliverables:**
- WASM game engine build (`GOOS=js GOARCH=wasm`) with `loadState()` and
`simulate()` API for browser use
- In-browser sandbox: Monaco editor + WASM engine + JS bot execution + replay
viewer integration
- JS reimplementations of the 6 built-in strategy bots for sandbox opponents
- What-if replay branching: fork UI, state injection, split-pane comparison
- Win probability computation in the match worker (Monte Carlo rollout) +
critical moments detector + replay viewer sparkline graph
- Replay enrichment pipeline: selective AI commentary for featured matches
- Clip maker: GIF + MP4 export with social media format presets
- Rival detection query + rivalry pages
- Community replay feedback system: tagged annotations guiding evolution
- D1 schema additions: `replay_feedback`, `rivalries` tables
**Exit criteria:** users can write and test bots in the browser without
deploying anything, fork replays to explore alternate outcomes, watch
enriched replays with commentary and win probability, export clips for
social sharing, and submit tactical feedback that influences the evolution
pipeline.
---
## 12. Enhanced Features
### 12.1 In-Browser WASM Game Sandbox
The game engine compiles to WebAssembly, enabling users to write bots and
play matches entirely in the browser — zero deployment, zero server setup.
**Architecture:**
```
Browser
├── Game Engine (Go → WASM, single instance, ~15 MB)
│ ├── loadState(json) → set engine to a specific turn state
│ ├── simulate(botMoves[]) → advance one turn
│ └── runMatch(botFunctions[]) → run full match, return replay
├── User Bot (JavaScript function)
│ └── computeMoves(gameState) → returns moves array
├── Opponent Bots (JavaScript, built-in)
│ ├── randomBot(gameState) → random valid moves
│ ├── gathererBot(gameState) → BFS to nearest energy
│ ├── rusherBot(gameState) → BFS to nearest enemy core
│ ├── guardianBot(gameState) → perimeter defense
│ ├── swarmBot(gameState) → formation advance
│ └── hunterBot(gameState) → target isolated enemies
├── Monaco Editor (code editing)
└── Replay Viewer (Canvas, renders result)
```
**Why 1 WASM, not 8:**
The game engine is the only WASM module. Bots run as plain JavaScript
functions called synchronously by the engine via JS interop. For a 6-player
match: 1 WASM instance + 6 JS function calls per turn. Memory: ~3050 MB
total. A 500-turn, 6-player match simulates in <2 seconds on modern hardware.
The built-in strategy bots (originally Python, Go, Rust, PHP, TypeScript,
Java) are **reimplemented in JavaScript** for the sandbox. They only need to
be behaviorally equivalent — same BFS, same combat logic, same decision
heuristics — not identical code. These JS versions are ~200500 lines each.
**WASM engine API:**
```typescript
interface WASMEngine {
// Run a full match, calling botFn[i](state) for each player each turn
runMatch(config: MatchConfig, map: MapData, bots: BotFunction[]): Replay
// Load a specific game state (for what-if branching)
loadState(state: GameState): void
// Simulate one turn with given moves
step(moves: PlayerMoves[]): TurnResult
// Simulate forward from current state using bot functions
simulateForward(bots: BotFunction[], remainingTurns: number): Replay
}
type BotFunction = (state: GameState) => Move[]
```
**User flow:**
1. User visits `/sandbox`
2. Monaco editor pre-loaded with a starter bot (hold-all-units stub)
3. User selects opponent (dropdown: RandomBot, GathererBot, etc.) and map
4. Clicks "Run Match"
5. WASM engine executes the match locally (~12 seconds)
6. Replay viewer renders the result inline
7. User modifies code, re-runs — instant feedback loop
8. When ready for the real ladder, user follows a link to the starter kit
docs for deploying an HTTP server
**Why this matters:** The current onboarding path is "set up a server, write
HTTP handlers, deploy, register." The sandbox makes it "open a page, write
20 lines of JS, click play." The difference between 50 users and 5,000.
### 12.2 Win Probability Graph + Critical Moments
Every match replay includes a **win probability curve** — a per-turn estimate
of each player's chance of winning — and a set of **critical moments** where
the game's outcome shifted decisively.
**Win probability computation:**
After each match, the worker computes win probability using Monte Carlo
rollout:
```
for each turn T in the match:
state = game_state_at_turn_T
wins = [0, 0, ..., 0] // per player
for i in 1..100:
result = simulate_random_play(state, remaining_turns)
wins[result.winner] += 1
win_prob[T] = wins / 100
```
`simulate_random_play` runs the game engine with random valid moves for all
players from the given state to completion. 100 rollouts × 500 turns is
~50,000 engine steps — the Go engine handles this in <1 second.
The result is stored in the replay JSON as a `win_prob` array:
```json
"win_prob": [
[0.50, 0.50],
[0.51, 0.49],
[0.48, 0.52],
...
]
```
Size: ~4 KB for a 500-turn, 2-player match. Negligible.
**Critical moments:**
A critical moment is any turn where `|Δwin_prob|` exceeds 0.15 (15%) for
any player. Typically 35 per match. Stored in the replay JSON:
```json
"critical_moments": [
{ "turn": 87, "delta": 0.22, "description": "SwarmBot loses 6 units in eastern engagement" },
{ "turn": 203, "delta": -0.31, "description": "GathererBot's core captured" }
]
```
The `description` is auto-generated from the turn's events (deaths, captures,
large position changes). No LLM needed — template-based.
**Replay viewer integration:**
- **Sparkline graph** below the main canvas: one line per player, color-coded
- Horizontal axis: turns. Vertical axis: 0%100% win probability
- **Critical moment markers**: vertical dashed lines on the graph with labels
- **Click to jump**: clicking any point on the graph scrubs to that turn
- **Quick nav buttons**: "Next critical moment" / "Previous critical moment"
to skip between turning points
This transforms replay viewing from "press play and wait 5 minutes" to
"click the 3 interesting moments and watch 30 seconds of decisive action."
### 12.3 What-If Replay Branching
Pause a replay at any turn, modify one player's moves, and simulate an
alternate timeline. Compare the real outcome with the counterfactual.
**How it works (no WASM swapping):**
The WASM engine (same instance used by the sandbox) supports state injection.
The replay viewer already reconstructs the full game state at every turn.
```
1. User pauses replay at turn N
2. UI shows the current player's units with editable move arrows
3. User drags arrows to change moves for their chosen player
4. Click "What If?"
5. Viewer serializes turn-N game state → engine.loadState(stateN)
6. Engine applies:
- Modified moves for the user's chosen player
- Original replay moves for all other players
7. Engine simulates forward from turn N+1:
- Other players continue using their original replay moves
- The modified player uses a selectable strategy (hold all, random,
gatherer, or "repeat my change")
8. Result renders in split-pane view:
Left: original replay from turn N onward
Right: alternate timeline from turn N onward
Both share the same scrub bar and play/pause controls
```
**Key design decision:** Other players replay their *original moves* from
the replay data, not re-computed moves. This means the what-if shows "what
would have happened if I changed my move but my opponent didn't react
differently." This is a simplification — a true counterfactual would need
the opponent's bot to recompute — but it's useful for answering tactical
questions ("would moving east instead of north have avoided that ambush?")
and requires no access to opponent bot code.
**UI:**
```
┌────────────────────┬────────────────────┐
│ Original (real) │ What-If (branch) │
│ │ │
│ [canvas] │ [canvas] │
│ │ │
├────────────────────┴────────────────────┤
│ [win prob graph: solid=real, dashed=branch] │
│ ◄ ▶ ⏸ Turn 203/500 Speed: 4x │
└─────────────────────────────────────────┘
```
The win probability graph shows both timelines overlaid — the real outcome
as a solid line and the branched outcome as a dashed line. The divergence
point is marked with a vertical line at turn N.
### 12.4 Replay Enrichment (Selective AI Commentary)
Select replays receive AI-generated natural language commentary — a
play-by-play narration that makes matches accessible to casual viewers.
**Not all replays are enriched.** Commentary is generated selectively for:
- **Featured matches**: matches flagged by the system as particularly
interesting (high win probability variance, close finishes, upsets where
a lower-rated bot wins)
- **Rivalry matches**: matches between detected rivals (§12.6)
- **Evolution milestones**: first match of a newly promoted evolved bot,
or matches where an evolved bot breaks into the top 10
- **User-requested**: participants can request enrichment for their own
matches (rate-limited: 5 per day per bot)
**Selection criteria (automatic):**
```
enrich if:
- win_prob crossed 0.5 at least 3 times (back-and-forth match)
- final score difference ≤ 2
- winner's rating was ≥100 lower than loser's (upset)
- match involved a newly promoted evolved bot
- match is between detected rivals
```
At ~60 matches/hour, roughly 1015% qualify — about 69 enriched replays
per hour.
**Commentary generation:**
Run a fast, cheap LLM (Haiku-class) over the replay data at match
completion. This happens as an optional post-processing step on the match
worker.
**Input prompt:**
```
Narrate this bot battle. Provide commentary for key moments only
(not every turn). Write 1-2 sentences per key moment. Be specific
about positions, unit counts, and tactical decisions.
Match: {players, ratings, map_size}
Win probability curve: {win_prob array, sampled every 10 turns}
Critical moments: {critical_moments array}
Key events by turn: {deaths, captures, large movements, energy collected}
```
**Output:** array of `{turn, text}` entries stored in the replay JSON:
```json
"commentary": [
{ "turn": 1, "text": "Both bots spawn at opposite corners of a 60x60 grid with heavy wall cover in the center. SwarmBot immediately sends all units east in a tight cluster." },
{ "turn": 42, "text": "First contact. GathererBot's scout stumbles into SwarmBot's formation near the central energy cluster. The scout is outnumbered 8-to-1 and eliminated instantly." },
{ "turn": 87, "text": "The turning point. SwarmBot pushes through the eastern corridor but GathererBot has quietly amassed 14 units behind the western wall line — a force SwarmBot doesn't know exists." }
]
```
**Cost:** ~$0.010.03 per enriched match at Haiku pricing. At 9
enriched matches/hour: ~$26/day, ~$60180/month. Reasonable.
**Replay viewer integration:**
- Commentary appears as subtitles below the canvas, synchronized to turn
playback
- Toggle on/off via a "Commentary" button
- Enriched replays are badged on the match list ("Featured" / "Narrated")
### 12.5 Shareable Replay Clips
One-click export of a replay segment as a GIF or video, formatted for
major social media platforms. This is the viral growth engine.
**Export formats:**
| Preset | Resolution | Aspect | Format | Target |
|--------|-----------|--------|--------|--------|
| Landscape | 1920×1080 | 16:9 | MP4 | YouTube, Twitter, Discord |
| Square | 1080×1080 | 1:1 | MP4 | Twitter, Instagram feed |
| Portrait | 1080×1920 | 9:16 | MP4 | TikTok, YouTube Shorts, IG Stories |
| GIF (compact) | 640×360 | 16:9 | GIF | Discord embeds, forums |
| GIF (square) | 480×480 | 1:1 | GIF | Twitter, Slack |
**User flow:**
1. While watching a replay, click "Clip" (scissors icon)
2. Drag handles on the turn scrubber to select a segment (default: 20 turns
centered on the current turn, or the nearest critical moment)
3. Select format preset from dropdown
4. Optional: toggle overlays (score, win probability, commentary subtitles)
5. Click "Export"
6. Browser records the Canvas replay segment using `OffscreenCanvas` +
`MediaRecorder` API (MP4/WebM) or gif.js (GIF)
7. Processing happens entirely client-side — no server cost
8. Download button appears, plus "Share" buttons:
- **Twitter/X**: opens compose with the clip attached + auto-generated
text ("SwarmBot pulls off a comeback against HunterBot! 🎮
aicodebattle.com/replay/{id}")
- **Reddit**: copies a markdown link with embedded video
- **Discord**: downloads the file (under Discord's 25MB upload limit)
- **Copy link**: shareable URL to the replay at the specific turn range
**Clip overlay:** the exported clip includes:
- Player names + colors in a header bar
- Score overlay (bottom-left)
- Win probability mini-graph (bottom strip, if enabled)
- "aicodebattle.com" watermark (small, bottom-right)
**GIF optimization:** GIFs are limited to 256 colors and can be large.
The clip maker uses:
- Reduced frame rate (10 fps for GIF vs 30 fps for MP4)
- Color quantization optimized for the grid art style
- Max 10-second duration for GIFs (longer clips → MP4 only)
- Target size: <5 MB for GIFs, <15 MB for MP4
**Implementation:** ~200 lines of TypeScript. `MediaRecorder` for MP4,
`gif.js` for GIF, `OffscreenCanvas` for headless rendering. All runs in
the browser. The share buttons use Web Share API where available, fallback
to window.open() with pre-composed URLs.
### 12.6 Automatic Rival Detection
The platform automatically identifies **rivalries** — pairs of bots that
frequently play each other with close results — and surfaces them as
narrative-driven content.
**Detection algorithm:**
```sql
-- Run by the index rebuilder cron
SELECT
a.bot_id AS bot_a,
b.bot_id AS bot_b,
COUNT(*) AS matches,
SUM(CASE WHEN winner = a.player_slot THEN 1 ELSE 0 END) AS a_wins,
SUM(CASE WHEN winner = b.player_slot THEN 1 ELSE 0 END) AS b_wins
FROM match_participants a
JOIN match_participants b ON a.match_id = b.match_id AND a.bot_id < b.bot_id
JOIN matches m ON m.match_id = a.match_id
WHERE m.status = 'completed'
GROUP BY a.bot_id, b.bot_id
HAVING COUNT(*) >= 10
ORDER BY COUNT(*) * (1.0 - ABS(CAST(a_wins - b_wins AS REAL) / COUNT(*))) DESC
LIMIT 20
```
The ranking formula: `matches_played × (1 - |win_rate_imbalance|)`.
High-scoring pairs have many matches with near-50/50 results — the
definition of a rivalry.
**Rivalry page** (`/rivalry/{bot_a_id}/{bot_b_id}`):
```json
{
"bot_a": { "id": "b_4e8c1d2f", "name": "SwarmBot", "owner": "alice" },
"bot_b": { "id": "b_9a1b3c4d", "name": "HunterBot", "owner": "bob" },
"matches": 23,
"record": { "a_wins": 11, "b_wins": 11, "draws": 1 },
"closest_match": "m_abc123",
"longest_streak": { "holder": "b_4e8c1d2f", "length": 4 },
"recent_matches": ["m_abc123", "m_def456", ...],
"narrative": "SwarmBot and HunterBot have met 23 times — the series is dead even at 11-11-1. SwarmBot held a 4-match winning streak from Mar 15-18, but HunterBot answered with 3 straight victories. Their last match was decided by a single point."
}
```
The narrative is template-generated from the stats (no LLM needed):
```
"{bot_a} and {bot_b} have met {n} times — {record_description}.
{streak_description}. {recent_trend_description}."
```
**Platform integration:**
- Rivalry widget on the landing page: "Top Rivalries" with head-to-head
records and links to key matches
- Bot profile pages show "Rivals" section listing detected rivalries
- Rivalry matches are auto-flagged for replay enrichment (§12.4)
- Leaderboard can show "rivalry mode" — filter to matches between two
specific bots
### 12.7 Community Replay Feedback
Users can leave **tagged feedback** on specific moments in replays.
Feedback is anchored to a `(replay_id, turn)` pair and is visible to
other viewers. High-signal feedback is fed into the evolution pipeline
as strategic hints.
**Feedback types:**
| Type | Icon | Purpose |
|------|------|---------|
| Tactical insight | 💡 | "This flanking move was brilliant because..." |
| Mistake spotted | ⚠️ | "Bot should have retreated here — outnumbered 3:1" |
| Strategy idea | 🧪 | "What if a bot used this wall corridor as a chokepoint?" |
| Highlight | ⭐ | "Amazing play" (lightweight, like a star/upvote) |
**D1 schema:**
```sql
CREATE TABLE replay_feedback (
feedback_id TEXT PRIMARY KEY,
match_id TEXT NOT NULL,
turn INTEGER NOT NULL,
type TEXT NOT NULL, -- 'insight', 'mistake', 'idea', 'highlight'
body TEXT NOT NULL,
author TEXT NOT NULL, -- free text (no accounts, like registration)
upvotes INTEGER NOT NULL DEFAULT 0,
created_at TEXT NOT NULL
);
CREATE INDEX idx_feedback_match ON replay_feedback(match_id, turn);
```
**Replay viewer integration:**
- Small markers appear on the turn scrubber at turns with feedback
- Hovering shows a preview count: "3 comments at turn 87"
- Clicking opens a side panel showing all feedback for that turn
- Users can add their own feedback via a form in the side panel
- Upvote button on each feedback item (1 per visitor via localStorage)
**Feeding into evolution:**
The evolution pipeline's prompt builder (§10.3) consumes community
feedback as an additional signal:
1. Index rebuilder aggregates high-upvote feedback of type `idea` and
`mistake` into `data/evolution/community_hints.json`
2. The evolver reads this file and includes the top-voted recent hints
in the prompt:
```
## Community Tactical Insights (from replay annotations)
Replay m_abc123, Turn 87 (12 upvotes):
"The bot should have used the narrow wall corridor at (30,42)-(30,48)
as a chokepoint instead of engaging in the open. A defensive line of
3 units there could have held off the 8-unit swarm."
Replay m_def456, Turn 203 (8 upvotes):
"When outnumbered 3:1, retreating toward the nearest energy cluster
and spawning reinforcements is better than fighting — the focus combat
system guarantees you lose the 3v1."
```
3. If a resulting evolved bot performs well, the feedback items that
contributed to its prompt are credited on the evolution dashboard:
"Feedback from user 'tactician42' on replay m_abc123 contributed to
evo-py-g42-7f3a (rating: 1720)"
**Moderation:**
- Feedback is plain text, max 500 characters
- No accounts means no banning — but feedback is public and upvote-ranked,
so low-quality content sinks
- A simple word filter catches obvious spam
- The evolution pipeline only consumes feedback with ≥3 upvotes, filtering
noise automatically
- Admin can delete feedback via a D1 query (no UI needed initially)
**Why this matters:** It creates a human-AI collaboration loop. Spectators
contribute strategic insight, the AI translates it into code, the platform
evaluates it, and successful feedback is credited. This gives non-coders a
way to participate meaningfully in the competition.