AI Models Build Multiplayer Shooter Benchmark

In late 2025, Stepan Parunashvili tasked GPT-5.1 Codex Max, Gemini 3 Pro and Claude Opus 4.5 to build a browser-based, 3D multiplayer Counter-Strike–style game entirely from model-generated code. The experiment compared frontend, backend and debugging behaviors, showing Claude excelled at art and UX, Gemini at networking and persistence, and Codex at steady long-session debugging.
Key Points
- 1Demonstrates models autonomously generate near-working 3D multiplayer FPS games with no human code patches
- 2Reveals differing strengths: Claude for frontend/visuals, Gemini for backend/sync, Codex for steady debugging
- 3Implies engineers must match model selection to task: aesthetic, systems, or long-session maintenance workflows
Scoring Rationale
Practical demonstration highlights meaningful model behavior differences, but single-person informal experiment limits generalizability and rigorous benchmarking.
Sources
Public references used for this report.
Practice with real Logistics & Shipping data
90 SQL & Python problems · 15 industry datasets
250 free problems · No credit card
See all Logistics & Shipping problems