AI Gateway Hackathon

Build your own model eval game.

Join the challenge to create autonomous simulations, human evaluation games, or innovative experiments using AI Gateway and the AI SDK.

The Goal

Create a competitive model-evaluation game where multiple AI models face off against each other.

The Rules

Create a game that supports either automated head-to-head comparisons or user-driven judging.
Users can press Start to launch a round and watch the competition play out.
Deliver a final ranked list of models based on their performance.
Deploy it on Vercel with AI Gateway.
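
As a rough illustration of the "final ranked list" rule, here is a minimal TypeScript sketch that turns per-round results into a ranking. The `RoundResult` shape, the "fastest solver wins the round" rule, and the tie-break on total time are all assumptions made for this example, not requirements of the hackathon; any scoring scheme that yields an ordered list would satisfy the rule.

```typescript
// Hypothetical shape for one model's result in a single round.
interface RoundResult {
  model: string;   // any identifier you use for the model
  solved: boolean; // did the model complete the task?
  seconds: number; // wall-clock time taken in this round
}

interface Standing {
  model: string;
  wins: number;
  totalSeconds: number;
}

// Assumed rule: the round goes to the fastest model that solved the task.
// Returns null when no model solved it.
function roundWinner(results: RoundResult[]): string | null {
  const solved = results.filter((r) => r.solved);
  if (solved.length === 0) return null;
  return solved.reduce((best, r) => (r.seconds < best.seconds ? r : best)).model;
}

// Aggregates all rounds: most wins first, ties broken by lower total time.
function rankModels(rounds: RoundResult[][]): Standing[] {
  const table: Record<string, Standing> = {};
  for (const round of rounds) {
    const winner = roundWinner(round);
    for (const r of round) {
      if (!(r.model in table)) {
        table[r.model] = { model: r.model, wins: 0, totalSeconds: 0 };
      }
      table[r.model].totalSeconds += r.seconds;
      if (r.model === winner) table[r.model].wins += 1;
    }
  }
  return Object.keys(table)
    .map((name) => table[name])
    .sort((a, b) => b.wins - a.wins || a.totalSeconds - b.totalSeconds);
}
```

Calling `rankModels` with one array of results per round returns the standings in display order, so the first element is the overall leader.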
Powered by AI Gateway
One API to access 200+ AI models from OpenAI, Anthropic, Google & more
Submissions Close
December 12, 2025 at 11:59 PM PST

Types of games you can create

Simulation

Let the models play. Create autonomous simulations where multiple AI models run in parallel, racing or collaborating toward a goal.

Examples
  • Wordle Battle (Models vs. Game)
  • Speedrun QA & Coding Sprints
  • Multi-Agent Ecosystems

Human Evals

Let humans decide. Build interactive evaluation games where users judge AI outputs without knowing which model produced them.

Examples
  • Prompt Fight / Model Arena
  • AI Debate Club
  • Style Showdown & Creative Writing
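
One possible way to run a blind comparison like the ones above is to show two outputs under neutral labels and fold each vote into a rating. The sketch below uses the standard Elo update, which is a choice made for this example rather than anything the hackathon prescribes. The `Entry` and `BlindPair` types, the K-factor of 32, and the injectable `rng` parameter are all hypothetical.

```typescript
// Expected score of A against B under the standard Elo model.
function expectedScore(ratingA: number, ratingB: number): number {
  return 1 / (1 + Math.pow(10, (ratingB - ratingA) / 400));
}

// Returns updated [ratingA, ratingB] after one vote.
// outcome: 1 = A preferred, 0 = B preferred, 0.5 = tie.
function updateElo(
  ratingA: number,
  ratingB: number,
  outcome: 0 | 0.5 | 1,
  k: number = 32, // assumed K-factor; tune to taste
): [number, number] {
  const delta = k * (outcome - expectedScore(ratingA, ratingB));
  return [ratingA + delta, ratingB - delta];
}

// Hypothetical types for this sketch.
interface Entry {
  model: string;
  text: string;
}

interface BlindPair {
  shown: { label: "A" | "B"; text: string }[]; // safe to send to the voter
  key: { A: string; B: string };               // keep private until the vote is cast
}

// Randomly assigns the two outputs to labels "A" and "B" so the voter
// cannot infer the model from its position.
function blindPair(first: Entry, second: Entry, rng: () => number = Math.random): BlindPair {
  const swap = rng() < 0.5;
  const a = swap ? second : first;
  const b = swap ? first : second;
  return {
    shown: [
      { label: "A", text: a.text },
      { label: "B", text: b.text },
    ],
    key: { A: a.model, B: b.model },
  };
}
```

After a vote, look up the two models in `key`, pass their current ratings to `updateElo`, and sort by rating to produce the leaderboard.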

Open Category

Break the rules. Create a hybrid of simulation and human interaction, or build something completely unexpected.

Examples
  • Join as a player and race against the AI
  • Interactive Hybrid Games
  • Experimental Interfaces

Prizes

1st Place
$1,000
in AI Gateway Credits
2nd Place
$500
in AI Gateway Credits
3rd Place
$250
in AI Gateway Credits
Simulation Example
Inspired by George Jefferson's game AI Model Races.
Live Demo

Wordle Battle

6 models race to solve today's Wordle

GPT-4o
Claude Sonnet
Gemini Pro
Llama 3
Mistral
Grok
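
A Wordle-style race needs a referee that scores each model's guess. The following is a minimal sketch of the usual two-pass feedback rule (exact matches first, then misplaced letters, each answer letter used at most once). The `Mark` type and function names are made up for this example, and how guesses are obtained from each model is left out.

```typescript
type Mark = "correct" | "present" | "absent";

// Scores one guess against the answer, handling repeated letters.
function scoreGuess(guess: string, answer: string): Mark[] {
  if (guess.length !== answer.length) {
    throw new Error("guess and answer must have the same length");
  }
  const g = guess.toLowerCase().split("");
  const a = answer.toLowerCase().split("");
  const marks: Mark[] = g.map((): Mark => "absent");
  const remaining: Record<string, number> = {};

  // Pass 1: mark exact matches and count the answer letters left unmatched.
  for (let i = 0; i < a.length; i++) {
    if (g[i] === a[i]) {
      marks[i] = "correct";
    } else {
      remaining[a[i]] = (remaining[a[i]] || 0) + 1;
    }
  }

  // Pass 2: a non-exact letter is "present" only while unmatched copies remain.
  for (let i = 0; i < g.length; i++) {
    if (marks[i] === "correct") continue;
    if ((remaining[g[i]] || 0) > 0) {
      marks[i] = "present";
      remaining[g[i]] -= 1;
    }
  }
  return marks;
}

// A model has finished the puzzle once every position is "correct".
function isSolved(marks: Mark[]): boolean {
  return marks.every((m) => m === "correct");
}
```

Feeding the returned marks back into each model's next prompt, and recording the time at which `isSolved` first returns true, is one way to drive the race.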
Simulation Example

Code Golf

Models compete for the shortest solution

Challenge
FizzBuzz

Print the numbers 1 to 100. For multiples of 3 print 'Fizz', for multiples of 5 print 'Buzz', and for multiples of both print 'FizzBuzz'.

GPT-4o
Claude
Gemini
Grok
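
A code-golf round could be scored by first checking that each submission's output is correct and then ranking by source length. The sketch below assumes each model's program has already been run elsewhere (for instance in a sandbox) and its output captured. The `GolfEntry` shape and the one-value-per-line output format are assumptions for illustration.

```typescript
// Reference output: one value per line for 1..100 (assumed format).
function expectedFizzBuzz(): string {
  const lines: string[] = [];
  for (let n = 1; n <= 100; n++) {
    if (n % 15 === 0) lines.push("FizzBuzz");
    else if (n % 3 === 0) lines.push("Fizz");
    else if (n % 5 === 0) lines.push("Buzz");
    else lines.push(String(n));
  }
  return lines.join("\n");
}

// Hypothetical record of one model's submission and its captured output.
interface GolfEntry {
  model: string;
  source: string; // the code the model wrote
  output: string; // what that code printed when run in a sandbox
}

interface GolfScore {
  model: string;
  chars: number;
}

// Keeps only correct submissions and orders them shortest first.
// Length is counted in UTF-16 code units, as String.length reports it.
function rankGolf(entries: GolfEntry[]): GolfScore[] {
  const expected = expectedFizzBuzz();
  return entries
    .filter((e) => e.output.replace(/\r\n/g, "\n").trim() === expected)
    .map((e) => ({ model: e.model, chars: e.source.length }))
    .sort((a, b) => a.chars - b.chars);
}
```

Submissions with incorrect output are dropped rather than ranked last; whether to show them as disqualified is a design choice for the game.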
Simulation Example

Speed Math

Models race through 5 math problems

#1
17 × 23
#2
√1764
#3
15% of 840
#4
2^10
#5
999 + 888 + 777
GPT-4o
Claude
Gemini
Llama 3
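
To grade a round like this automatically, the answers to the five problems above can be computed directly and compared with each model's reply. The sketch below is one possible grader. Taking the last number that appears in a free-text reply is an assumption about how models will answer, and the function names are made up for this example.

```typescript
// Answer key for the five problems, computed rather than hard-coded.
const ANSWER_KEY: number[] = [
  17 * 23,           // #1
  Math.sqrt(1764),   // #2
  (15 * 840) / 100,  // #3: 15% of 840
  Math.pow(2, 10),   // #4
  999 + 888 + 777,   // #5
];

// Pulls the last number out of a free-text reply, ignoring thousands separators.
// Returns null if the reply contains no number.
function lastNumber(reply: string): number | null {
  const matches = reply.replace(/,/g, "").match(/-?\d+(?:\.\d+)?/g);
  if (matches === null) return null;
  return Number(matches[matches.length - 1]);
}

// Counts how many of a model's replies match the key (one reply per problem, in order).
function gradeReplies(replies: string[]): number {
  let correct = 0;
  for (let i = 0; i < ANSWER_KEY.length; i++) {
    const value = i < replies.length ? lastNumber(replies[i]) : null;
    if (value !== null && Math.abs(value - ANSWER_KEY[i]) < 1e-9) correct += 1;
  }
  return correct;
}
```

Combining the number of correct answers with each model's elapsed time gives a simple basis for the final ranking.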
Human Evals Example

Logo Arena

Pick the better AI-generated logo

Prompt: "A minimal logo for a coffee shop called 'Dawn'"

Human Evals Example

Turing Test

Find the human among the AI responses

Question

"What's the most underrated city to visit?"

Human Evals Example

Emoji Translator

Which model captures the vibe best?

Translate this

"I'm running late to the airport and my phone is dying"