Research covers game mechanics, bot protocol, tournament/ranking, replay system, infrastructure, and successor platforms. Requirements document captures the HTTP-based bot communication model, 3-second timeout, strict schema enforcement, and replay visualization needs. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
25 KiB
Ants AI Challenge (aichallenge.org) — Comprehensive Research
1. History and Context
The Ants AI Challenge was the Fall 2011 edition of the AI Challenge, a series of programming competitions organized by the University of Waterloo Computer Science Club with sponsorship from Google. The competition was codenamed "Epsilon" internally.
Timeline
- Fall 2010: Planet Wars (based on Galcon) — the predecessor competition, a 1v1 game
- October 2011: Ants announced and launched
- December 2011: Competition concluded; results published
- After 2011: The AI Challenge project went dormant; no further competitions were run
Scale
- Approximately 7,900 submissions during the competition
- Top 100 contestants recognized
- Winner: Mathis Lichtenberger ("xathis") from University of Lubeck, Germany
- Runner-up: Evgeniy Voronyuk ("GreenTea") from Dnipropetrovsk National University, Ukraine
- Third place: "protocolocon" from Spain
GitHub Repository
github.com/aichallenge/aichallenge— 640 stars, 170 forks, 2,843 commits- Components: game engine (
ants/), tournament manager (manager/), workers (worker/), website, SQL schemas, integration testing
2. Game Mechanics
Map / Grid
- Toroidal grid: rectangular, wraps both horizontally and vertically (no edges)
- Characters:
.= land,%= water (impassable),*= food,a-j= player ants,A-J= ants on own hill,0-9= hill locations - Dimensions: up to 200x200, with 900-5000 area per player, total area capped at 25,000
- Symmetry: all maps were symmetric — every player's starting position mirrors every other
- Map generation constraints:
- 2-10 player support
- Hills must be 20-150 steps apart, with Euclidean distance minimum of 6
- Must have a path between all hills traversable by a 3x3 block (no narrow chokepoints that block army movement)
- No islands allowed
Turn Structure
Each turn follows a strict sequence:
- Send game state to all players (filtered by fog of war)
- Receive orders from all players (within time limit)
- Execute six phases in order:
- Move — execute player movement orders
- Attack — resolve combat between adjacent enemy ants
- Raze Hills — enemy ant on a hill location destroys it
- Spawn Ants — new ants emerge from hills (if hive has food)
- Gather Food — ants within spawn radius collect food
- Spawn Food — new food appears on the map
- Check endgame conditions
Fog of War
- Each player has individual vision — only squares within
viewradius2(default: 77, i.e., ~8.77 tiles Euclidean) of any owned ant are visible - Vision is updated incrementally as ants move, spawn, or die
- Water is only transmitted to a bot the first turn it becomes visible (permanent terrain)
- Enemy ants are only visible within the player's vision radius
- Player identifiers are remapped: you always see yourself as player 0, enemies numbered sequentially by discovery order
- Bots do NOT know the total number of players
Food Mechanics
Spawning:
- Each game has a hidden food rate:
food_extra += (food_rate * num_players) / food_turnper turn food_rate: 5-11 total food;food_turn: 19-37 turns per cycle- Initial food: 2-5 items placed within each player's starting vision, plus distributed food based on
food_start(75-175 per land area ratio) - Four spawning algorithms: none, random, sections (BFS-based region splitting), symmetric (maintains rotational/reflective symmetry — the one used in competition)
- Food is distributed across "symmetric square sets" that shuffle randomly; every set spawns at least once before repeating
- Equidistant squares (from multiple players) form smaller sets
- Strategic implication: the best way to gather the most food is to explore and cover the most area
Harvesting:
- Occurs after combat resolution each turn
- If all ants within
spawnradius2(default: 1) of food belong to one player, that player harvests it into their "hive" - If ants from multiple players are within spawn radius, the food is destroyed — benefits nobody
- Each harvested food produces exactly 1 ant
Ant Spawning:
- Each food in the hive spawns 1 ant at a hill
- Hill must be unrazed and unoccupied
- One ant spawns per hill per turn maximum
- Priority: hills longest without activity get spawns first (ties broken randomly)
- Players can block spawning by keeping an ant on a hill
Combat Resolution
The competition offered four selectable combat algorithms. The one used in the actual competition was "focus" (also called "occupied"):
Focus Algorithm:
for every ant:
for each enemy in range of ant (using attackradius2):
if enemies_in_range(ant) >= enemies_in_range(enemy):
ant is marked dead
attackradius2default: 5 (Euclidean distance ~2.24 tiles)- An ant dies when its "focus" (number of enemies it faces) is greater than or equal to any opponent's focus
- All deaths are resolved simultaneously — no mid-turn recalculation
- Key properties:
- Superior numbers win without casualties — 2v1 kills the lone ant with no friendly losses
- Defensive formations are powerful — a small concentrated force can inflict heavy losses (the "Sparta" effect)
- Multi-faction battles create chaos — third parties can benefit from letting opponents weaken each other
- Locally deterministic — only requires knowledge of immediate surroundings
Other algorithms available (not used in main competition):
- Support: ant survives if friendly count >= enemy count in range
- Closest: iterative BFS elimination; ants in groups >1 all die (discourages concentration)
- Damage: each ant deals
1/enemy_countdamage; ants receiving >= 1 total damage die
Hill Mechanics
- Each player starts with 1+ hills; initial ant spawned at each
- Razing: a hill is destroyed when an enemy ant occupies its location after the attack phase (no defending ant present)
- Razing awards +2 points to the razer, -1 point to the hill owner
- Razed hills stop spawning but the player continues playing with remaining ants
- Even with all hills razed, ants can still move, attack, gather food, and raze enemy hills
Winning Conditions / Endgame
Five ways a game can end:
- Lone Survivor: only one player has living ants. Bonus: +2 points per unrazed enemy hill, -1 to their owners
- No Players Remain: all eliminated simultaneously
- Food Not Gathered (Cutoff): food sits at >= 90% of combined food+ant count for 150 turns — targets inactive bots
- Dominance Without Hill Razing (Cutoff): one player has >= 85% of total food+ants for 150 turns without razing — assumes winner can't lose
- Rank Stabilized: no remaining player can mathematically improve their ranking
- Turn Limit: default 1000 turns (calibrated so 90-95% of matches end via this rule)
Scoring
- Each player starts with 1 point per hill owned
- Razing an enemy hill: +2 points
- Losing a hill: -1 point
- If you never attack and lose all hills: 0 points (incentivizes aggression)
- Lone survivor bonus: +2 per remaining unrazed enemy hill
- Crashed/timed-out bots: surviving opponents still receive hill-destruction bonus points
Collision Rules
- Two friendly ants ordered to the same square: both die
- Your ant ordered onto an enemy ant's square: both die (before attack radius applies)
- Moves into water or food: ignored (ant stays in place)
3. Bot Communication Protocol
Transport
Bots communicate via stdin/stdout. The game engine launches each bot as a subprocess and pipes data through standard streams. Stderr is captured for logging but not processed.
Initialization (Turn 0)
The engine sends:
turn 0
loadtime <ms>
turntime <ms>
rows <height>
cols <width>
turns <max_turns>
viewradius2 <squared_distance>
attackradius2 <squared_distance>
spawnradius2 <squared_distance>
player_seed <seed>
ready
The bot performs initialization and responds with:
go
Per-Turn Input
Each subsequent turn, the engine sends:
turn <turn_number>
w <row> <col> # water (only first time visible)
f <row> <col> # food location
a <row> <col> <owner> # live ant (owner 0 = self)
h <row> <col> <owner> # hill
d <row> <col> <owner> # dead ant (visible or own)
go
All coordinates are 0-indexed. Only information within the player's fog of war is transmitted.
Bot Output (Orders)
Bots issue movement commands:
o <row> <col> <direction>
go
Where direction is one of: N, E, S, W. Each ant can receive at most one order. Ants without orders remain stationary.
Game End
When the game ends, bots receive:
end
<final state data>
players <num_players>
score <p1_score> <p2_score> ... <pN_score>
go
Timing
- Load time: default 3000ms — time allowed for initialization after receiving setup data
- Turn time: default 1000ms — time allowed per turn to compute and send orders
- Timeout behavior: if a bot exceeds the time limit, it is killed. The bot's status is recorded as "timeout" in the replay. Its ants remain on the map but receive no further orders (effectively becoming stationary targets)
- Crash handling: similar to timeout — bot removed from active play, ants become inert
- Timing measurement: the platform discussed both CPU time and wall-clock time. The recommended approach was a hybrid: base timeout on CPU time with a 10x multiplied wall-clock backup limit. Server-side monitoring checked ~50 times per turn
Distance Calculation
All distance checks (vision, attack, spawn) use squared Euclidean distance with wrapping:
dr = min(abs(a.row - b.row), rows - abs(a.row - b.row))
dc = min(abs(a.col - b.col), cols - abs(a.col - b.col))
distance_squared = dr*dr + dc*dc
4. Tournament / Ranking System
Rating Algorithm: TrueSkill
The platform used Microsoft's TrueSkill rating system via the JSkills 0.9.0 Java library:
- Each player has mu (mean skill estimate) and sigma (uncertainty)
- Display rating: mu - 3*sigma (conservative estimate)
- After each game, mu and sigma are updated based on finish order
- Multi-player games supported natively by TrueSkill
- The manager ran
TSUpdateas a Java subprocess, passing player data and receiving updated ratings
Matchmaking Algorithm
- Seed selection: pick the bot that hasn't played in the longest time (tiebreak by user_id)
- Match size: determined by the player count the seed has played in least
- Opponent selection: uses a Pareto distribution to select skill-rank distance — 80% of matches are within 16 ranks of the seed
- Fairness filters: exclude bots with excessive recent games to keep game counts even
- Pairing priority (in order):
- Least recent pairing with selected opponent (24-hour window)
- Fewest total games played recently
- Best TrueSkill match quality (computed as a 0-1 score from mu, sigma, and beta)
- Map selection: least played by all opponents
- Position assignment: random player slot assignment
Game Size
- Maps supported 2-10 players per game
- The matchmaker selected game size based on what the seed player had least experience with
Match Frequency
Games ran continuously via the worker pool. As soon as a worker completed a game, it requested a new task. With the worker fleet running, new ratings appeared "within minutes" of code submission.
Leaderboard
- Updated periodically via
generate_leaderboardstored procedure - Configurable refresh interval
- Displayed mu-3*sigma as the public rating
5. Replay System
Two Formats
- Streaming format: used for real-time viewing, ~10x larger than storage format
- Storage format: JSON-based, used for downloads and archival
Storage Format (JSON) Structure
{
"challenge": "ants",
"replayformat": "json",
"playernames": ["bot1", "bot2", ...],
"playerstatus": ["survived", "eliminated", "timeout", "crash"],
"submitids": [123, 456],
"user_ids": [1, 2],
"user_url": "http://aichallenge.org/profile.php?user=~",
"game_url": "http://aichallenge.org/game.php?game=~",
"date": "2011-12-01T12:00:00Z",
"game_id": 12345,
"playercolors": ["#ff0000", "#0000ff"],
"replaydata": {
"revision": 2,
"players": 2,
"turns": 1000,
"loadtime": 3000,
"turntime": 1000,
"viewradius2": 77,
"attackradius2": 5,
"spawnradius2": 1,
"map": {
"rows": 100,
"cols": 100,
"data": [".%%.....a...", "...%%..b....", ...]
},
"ants": [
[row, col, start_turn, conversion_turn, end_turn, player_id, "moves_string"]
],
"scores": [[1.0, 1.0], [1.0, 3.0], ...],
"bonus": [0, 2]
}
}
antsarray: each element tracks an ant's lifecycle — position, birth turn, conversion turn, death turn, owner, and a string of movement commands (-= stay,n/e/s/w= direction)scores: 2D array of floating-point values per player per turn- Map characters:
.= land,%= water,*= food,a-z= starting ant positions
Visualizer
- Template-based HTML generation:
replay.html.templateproduced self-contained replay pages - JavaScript rendering with UglifyJS for minification
- Rhino (server-side JS engine) for processing
- Java components for core visualization logic
- Local development:
visualize_locally.pyfor testing - Visualizers verified
challenge == "ants"andreplayformat == "json"before parsing
6. Infrastructure
Architecture
Four core components:
- Website (PHP/Apache): user registration, code submission, leaderboard display, replay viewing
- Auto-compile system: detected language by file extension, compiled submissions, generated
run.shexecution scripts - Tournament manager (Python): matchmaking, game scheduling, TrueSkill updates, leaderboard generation
- Workers (distributed): standalone game executors with compilation and sandboxing
Worker System
- OS: Ubuntu 11.04
- Deployment: automated via shell script —
curl http://server/api_server_setup.php?api_create_key=KEY | sh— approximately 25 minutes to install - Communication: JSON over HTTP (GET for task assignment, POST for results)
- Task flow:
- Worker requests task via
api_get_task.php - Server assigns compilation job or game matchup
- Worker downloads submissions (with hash verification) and maps
- Worker compiles/runs the game
- Worker posts results via
api_compile_result.phporapi_record_game.php
- Worker requests task via
- Idempotency: unique
post_idper request prevents duplicate processing - Error recovery: failed games reassigned to different workers
- Database: MySQL
Sandboxing
- Primary approach:
schroot+jailguard.py— chroot-based isolation under/srv/chrootwith dedicatedjailuseraccounts - Process control: signal-based (
KILL,STOP,CONT) via jailguard - Earlier approach: Systrace (ptrace-based syscall interception) — used in the 2010 Planet Wars competition
- Problem: severe performance penalty for Java due to syscall interception overhead ("every syscall intercepted requires alternation between Systrace and the untrusted program")
- Java bots struggled to stay within per-move time limits
- Evaluated but not deployed: SMACK (Simplified Mandatory Access Control Kernel) — kernel-side decisions, no performance penalty, but required custom kernel compilation
- Fallback: "House" class — unsandboxed subprocess execution for development
Supported Languages (26+)
Compiled: Ada, C, C++, C++11, D, Go, Haskell, Java, Lisp, OCaml, Pascal, Scala Interpreted: Clojure, CoffeeScript, Dart, Erlang, Groovy, JavaScript, Lua, Perl, PHP, Python, Python3, PyPy, Racket, Ruby, Scheme, Tcl Managed: C#, VB.NET
Auto-detection by file extension. Memory limit default: 1500MB. Java heap configured via -Xmx.
Submission Pipeline
Status codes tracked the entire lifecycle:
- 10: record created
- 15: archive placed in temp directory
- 20: ready for compilation
- 24: compiling
- 27: compiled, awaiting tests
- 40: active — compiled and passed test cases
- 30/50/60/70: various errors (upload, unzip, file problem, compilation)
- 80: compiled but failed test cases
- 90-99: suspended
- 100: retired
Known Infrastructure Issues
- Disk space nearly exhausted during the Winter 2010 contest (monitored afterward)
- Workers did not gzip results (bandwidth inefficiency)
- Sandbox lacked proper timestamp logging
- Planned Django migration never completed
- No proper licensing across the codebase
- The project became unmaintainable after the organizers graduated/moved on
- Deployment documentation was sparse — by 2019, it was unclear if the project could still be deployed
Health Monitoring
Automated scripts checked: account creation, code submission, leaderboard updates, entry suspension, disk space usage.
7. Game Balance and Dominant Strategies
Design Principles
The game was designed around explicit criteria:
- Easy — low barrier to entry; 5 minutes to submit a starter bot
- Familiar — ant colony concept universally understood
- Interesting — appeals to programmers at first glance
- Technically feasible — turn-based, bounded computation
- Non-trivial — no discoverable "perfect solution"
- Fun to watch — ant swarms make compelling visualizations
Strategic Layers
Exploration: covering the most map area directly correlated with food income. BFS-based territory partitioning was the standard approach.
Food Gathering: BFS from food squares to find closest ant was the baseline. Advanced bots used simultaneous multi-source BFS across all food to optimally assign ants.
Combat: the focus algorithm rewarded tight formations. Key tactics:
- 2v1 superiority: two ants adjacent to one enemy kill it with zero losses
- Defensive lines: concentrated forces could hold chokepoints (the "Sparta" strategy)
- Surrounding: spreading ants around an enemy force could create local numerical advantages everywhere
- Third-party exploitation: in multi-player games, letting opponents fight and cleaning up survivors
Hill Offense/Defense: razing hills was the primary scoring mechanism (+2 points). This created tension between:
- Defending your own hills (keeping ants near home)
- Sending expeditionary forces to destroy enemy hills
- The optimal balance shifted through the competition as stronger bots emerged
Pathfinding: BFS was the standard. A* was recommended for known-target navigation. Multi-source BFS from all ant positions partitioned the map into influence zones.
Balance Observations
- Symmetric maps eliminated positional advantage — pure strategy and code quality determined outcomes
- Fog of war prevented perfect-information exploitation — bots needed exploration and scouting
- The focus combat system rewarded formation play — raw ant count wasn't enough; positioning mattered
- Food competition (contested food destroyed) prevented simple rushing strategies
- Multi-player games (2-10 players) prevented degenerate 1v1 metagames from dominating
- Cutoff rules prevented boring games where dominant bots couldn't finish (stalemate prevention)
- The 85% dominance cutoff acknowledged that some bots were smart enough to win but not smart enough to execute the final hill-razing — this was a practical concession
Known Balance Issues
- The competition was heavily explored — by the end, top bots converged on similar strategies
- Java bots had a systematic disadvantage due to Systrace sandbox overhead eating into their turn time budget
- Language performance disparities existed: C++ bots could compute much more within the 1000ms turn limit than Python bots
- Map generation symmetry was essential but also predictable — top bots could infer enemy positions from their own
8. Successor Platforms
Halite (Two Sigma) — 2016-2019
Created by: Ben Spector and Michael Truell during their summer 2016 internship at Two Sigma. Additional contributors from Two Sigma and Cornell Tech.
Three seasons:
- Halite I (2016): grid-based territory control. Players moved pieces up/down/left/right on a rectangular grid. Website:
2016.halite.io - Halite II (2017): continuous 2D space with ships mining planets and building fleets. "Bots battle for control of a virtual continuous universe." Created by David Li, Jaques Clapauch, Harikrishna Menon, Julia Kastner.
- Halite III (2018): resource management — bots navigate the sea collecting halite on a square grid. 9,528 commits in the repo.
Scale: "developing a flourishing community of bot builders from around the globe, representing 35+ universities and 20+ organizations"
Key improvements over AI Challenge:
- Professional backing (Two Sigma funded and hosted)
- New game each season to prevent stale metagames
- Modern web infrastructure
- WebAssembly-based visualizer (Halite III was 81.2% WebAssembly)
Demise: halite.io now redirects to twosigma.com. The competition was discontinued after Halite III (2018-2019). Forums also shut down.
Battlecode (MIT) — 2001-present
Longest-running AI programming competition. Organized by MIT students, running annually since 2001.
Key characteristics:
- New game theme every year (e.g., "Chromatic Conflict" 2025, "Breadwars" 2024, "Tempest" 2023)
- Java-based game engine
- Bytecode counting for fairness — measures computation in bytecodes rather than wall-clock time (eliminates language performance disparity)
- Code instrumentation for isolated, deterministic execution
- Emphasizes "artificial intelligence, pathfinding, distributed algorithms, and communications"
- Tournament finals recorded and published on YouTube
- Post-mortems from top teams published
2025 winner: "Just Woke Up" (Tim Gubskiy, Andy Nguyen) — "Chromatic Conflict" (robot bunnies paint 70% of the map)
Key difference from Ants: Battlecode uses bytecode limits instead of wall-clock time, making it language-agnostic in terms of performance. It also changes the game every year, preventing long-term metagame stagnation.
Lux AI Challenge (Kaggle) — 2021-present
Season 1 (2021):
- 1v1 resource management with day/night cycles
- Cities that fail to produce enough light are consumed by darkness
- 22,000+ submissions from 1,100+ teams
- TrueSkill ranking (same as Ants)
- Multiple languages supported (Python, JS, Rust, C++, Java, TypeScript, Kotlin)
- $10,000 prize pool
Season 2 (2023, NeurIPS track):
- Asymmetric maps
- Larger scale gameplay
- JAX-based GPU/TPU acceleration for simulation
- Episode dataset from hundreds of human-written agents for training
- Academic focus through NeurIPS competition track
Key improvements: GPU-accelerated environments enable ML-based agents to train at scale. Hosted on Kaggle's infrastructure (no custom deployment needed). Docker-based consistent execution.
Kaggle Simulation Environments — 2019-present
Kaggle built a general-purpose kaggle-environments library for hosting AI competitions:
- Configurable game lifecycles
- Multi-agent orchestration
- Episode serialization and replay
- HTTP server interface for distributed execution
- OpenAI Gym-style training interface
- Docker support for consistency
This standardized the infrastructure that Ants, Halite, and others each built from scratch.
Other Notable Successors
- CodinGame — persistent platform with multiple AI games, monthly competitions
- Terminal.live (Correlation One) — tower-defense-style AI competition
- AIIDE StarCraft AI Competition — academic competition using StarCraft: Brood War
- microRTS — academic RTS AI competition framework
9. What Went Wrong / Lessons Learned
Infrastructure
- The platform was built by university volunteers on a shoestring budget
- No professional DevOps — deployment was manual shell scripts
- Worker scaling was "add more Ubuntu boxes" with no auto-scaling
- MySQL as the sole datastore with no documented backup strategy
- The project died when the original organizers (University of Waterloo CS Club) graduated
Sandboxing
- Systrace caused measurable unfairness for Java bots
- The ideal solution (SMACK) was too complex to deploy on standard Ubuntu
- The chroot/jailguard approach was functional but fragile
Game Design Tension
- Cutoff rules were necessary but imperfect — some genuinely close games were cut short
- The focus combat algorithm was elegant but led to convergent strategies at the top level
- 10-player games on large maps could be chaotic and hard to reason about
Community
- The competition generated enormous enthusiasm (7,900 submissions)
- But sustaining it required ongoing volunteer effort that couldn't survive beyond the academic cycle
- Google's sponsorship provided visibility but not long-term infrastructure commitment
Legacy
The Ants AI Challenge demonstrated the viability and appeal of multi-player AI programming competitions. Its open-source codebase directly influenced Halite's architecture. The TrueSkill matchmaking, worker-based game execution, and JSON replay format became standard patterns. The project's ultimate failure was organizational sustainability, not technical quality.