AI Game Arena
Visualizing LLM Coding Capabilities through Classic Games
A visual benchmark platform comparing how different LLMs generate classic HTML games.

Background (Why)
I started this project at the turn of the year, driven by a trend I saw on social media. There were countless short videos showing “AI making Mario” or “AI making Flappy Bird” that were racking up massive view counts. I wanted to replicate that success.
My hypothesis was simple: people love seeing AIs compete. If I built a “colosseum” where users could see the same game (like Tetris or Snake) built by different models using the exact same prompt, it would be naturally viral content.
What I Built (What)
I built a web platform deployed at webutilitykit.com/arena that hosted these side-by-side comparisons.
- The Arena: A grid layout displaying the generated games.
- The Tests: I managed to implement two benchmark cases: Tetris and Snake.
- The Workflow: I manually fed the same prompt to various models, collected the HTML/JS output, and hosted them on the site for users to play and compare.
- Tech Stack: Simple HTML/CSS/JS wrapper hosted on a static site.
- Duration: It was a quick sprint over the New Year holiday (Dec 31, 2025 – Jan 2, 2026).
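The workflow above was done by hand, but the grid-assembly step can be sketched as a small build script. This is only an illustration under stated assumptions, not the code behind the site: the model labels, file paths, and the function names `render_card` and `render_arena` are hypothetical, and Python is used here for the generator even though the actual wrapper was plain HTML/CSS/JS.

```python
from html import escape

# Hypothetical manifest: (game, model label, path to the collected HTML file).
# These labels and paths are placeholders, not the ones used on the real site.
ENTRIES = [
    ("Tetris", "model-a", "games/tetris/model-a.html"),
    ("Tetris", "model-b", "games/tetris/model-b.html"),
    ("Snake", "model-a", "games/snake/model-a.html"),
    ("Snake", "model-b", "games/snake/model-b.html"),
]

# Minimal grid styling, kept outside f-strings so the braces need no escaping.
STYLE = (
    "<style>"
    ".grid{display:grid;grid-template-columns:repeat(auto-fit,minmax(320px,1fr));gap:16px}"
    ".card iframe{width:100%;height:480px;border:1px solid #ccc}"
    "</style>"
)


def render_card(game, model, path):
    """Return one grid cell: a sandboxed iframe plus a caption naming the model."""
    title = escape(f"{game} by {model}")
    # sandbox="allow-scripts" lets the generated game run its JavaScript
    # while keeping it isolated from the host page.
    return (
        '<figure class="card">'
        f'<iframe src="{escape(path)}" sandbox="allow-scripts" title="{title}"></iframe>'
        f"<figcaption>{escape(model)}</figcaption>"
        "</figure>"
    )


def render_arena(entries):
    """Group cards by game and return a complete static HTML page."""
    by_game = {}
    for game, model, path in entries:
        by_game.setdefault(game, []).append(render_card(game, model, path))
    sections = "\n".join(
        f'<section><h2>{escape(game)}</h2><div class="grid">{"".join(cards)}</div></section>'
        for game, cards in by_game.items()
    )
    return (
        "<!DOCTYPE html>\n<html><head><meta charset=\"utf-8\">"
        "<title>AI Game Arena</title>" + STYLE + "</head>\n"
        "<body>\n" + sections + "\n</body></html>"
    )


if __name__ == "__main__":
    # Print the page; redirect to index.html to publish it on a static host.
    print(render_arena(ENTRIES))
```

Loading each game in its own iframe keeps one model's broken script or global variables from interfering with the others, which matters when the outputs are pasted in unmodified.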
Outcome (Result)
I produced a demo video and posted it on Douyin (the Chinese version of TikTok) to launch the project.
The result was… silence.
The video got virtually no views. No one visited the site. It turns out that while “AI writes code” is a cool novelty, a static page of barely functioning Tetris clones isn’t actually entertaining to play or watch. I abandoned the project immediately after the launch failed to gain traction.
Key Takeaways
- Content Creation != Engineering: I realized that what I was trying to do was essentially content marketing, not software engineering. Success depended on video editing, pacing, and “hooking” the audience, not on the technical implementation of the arena.
- The “AI Gimmick” Fatigue: Just slapping “AI” on something isn’t a magic ticket to traffic anymore. The content needs to be inherently fun or useful.
- Misaligned Goals: I was chasing traffic rather than building something I actually wanted to use. When the traffic didn’t come, I had no motivation to continue.
Current Status
- Status: Failed
- Recommendation: Not recommended. It was a short-lived experiment in viral marketing that didn’t go viral.
- Future: Permanently abandoned.