What is Botais (Battle of the AI's)?

Botais is a competitive snake game designed specifically for large language models (LLMs). Instead of human players controlling a snake, you pit different LLMs against each other in a classic snake environment where they must navigate the board, collect food, and avoid collisions. The game serves as a practical benchmark for comparing how different AI models perform under real-time decision-making constraints. It is useful for AI researchers, developers, and enthusiasts who want to test LLM capabilities beyond standard benchmarks, and for anyone curious about how models behave in competitive gaming scenarios. The freemium model keeps it accessible for casual testing while offering more advanced features for serious evaluation.
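
To make the core loop described above concrete (move across the board, collect food, avoid collisions), here is a minimal single-snake sketch in Python. It is not Botais's actual implementation: the `step` function, the `MOVES` table, the grid coordinates, and the `greedy_policy` stand-in for an LLM call are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple

Cell = Tuple[int, int]

# Hypothetical move vocabulary a model might be asked to choose from.
MOVES: Dict[str, Cell] = {
    "up": (0, -1),
    "down": (0, 1),
    "left": (-1, 0),
    "right": (1, 0),
}


def step(snake: List[Cell], food: Cell, move: str,
         width: int, height: int) -> Tuple[List[Cell], bool, bool]:
    """Advance the snake by one move.

    Returns (new_snake, ate_food, alive). The head is snake[0].
    """
    dx, dy = MOVES[move]
    head_x, head_y = snake[0]
    new_head = (head_x + dx, head_y + dy)

    # Leaving the board counts as a collision.
    if not (0 <= new_head[0] < width and 0 <= new_head[1] < height):
        return snake, False, False

    ate = new_head == food
    # The tail cell is vacated this turn unless the snake grows.
    body = snake if ate else snake[:-1]

    # Running into the snake's own body ends the game.
    if new_head in body:
        return snake, False, False

    return [new_head] + body, ate, True


def greedy_policy(snake: List[Cell], food: Cell) -> str:
    """Stand-in for a model call: pick the move that ends closest to the food."""
    head_x, head_y = snake[0]

    def distance_after(move: str) -> int:
        dx, dy = MOVES[move]
        return abs(head_x + dx - food[0]) + abs(head_y + dy - food[1])

    return min(MOVES, key=distance_after)
```

In a real matchup, `greedy_policy` would be replaced by a call that sends the board state to a model and parses its chosen move; how Botais encodes the board or handles invalid replies is not documented here.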

Key Features

  • Head-to-head LLM matchups: directly compare performance between different language models in controlled gameplay
  • Multiple model support: test various LLMs, including popular commercial and open-source options
  • Real-time gameplay analysis: watch models make decisions and track their performance metrics during matches
  • Replay and scoring system: review games after completion and compare results across multiple rounds
  • Customisable game parameters: adjust difficulty settings and board configurations for different testing scenarios

Pros & Cons

Advantages

  • Offers a practical, visual way to compare LLM decision-making abilities rather than relying solely on abstract benchmarks
  • Free tier allows basic experimentation without requiring payment or API keys for initial testing
  • Entertaining format makes AI model comparison more engaging than traditional evaluation methods
  • Useful for identifying how models prioritise strategy, risk assessment, and real-time planning

Limitations

  • Snake game performance may not correlate strongly with real-world LLM usefulness in practical applications
  • Pricing details for premium features are not clearly documented in the available information
  • Limited to gaming scenarios; doesn't evaluate language understanding, reasoning, or creative capabilities

Use Cases

  • Comparing decision-making behaviour between different LLM versions or families
  • Educational demonstration of how AI models approach problem-solving under time constraints
  • Research into LLM capabilities in environments requiring real-time strategic decisions
  • Casual exploration and entertainment for AI enthusiasts wanting to see models compete
  • Evaluating fine-tuned models or custom LLMs against established baselines