
GoTour

Four takes on the same Go tutorial.

One spec, four model families. Pick one to compare prose, pacing, the runner UI, and how each handles the parts the Playground sandbox can't actually run.

By the numbers

Wall-clock time between each agent's first phase commit and its Phase 19 polish commit, plus a few git-derivable side-channels:

                              go-claude             go-gemini                               go-glm             go-minimax
Implementation time           24m 32s               36m 18s                                 1h 13m 18s         48m 15s
Commits (Phase 01–19)         19                    19                                      15 (some bundled)  19
Lines added (excl. lockfile)  4,997                 4,744                                   5,494              5,331
Files created                 125                   54                                      47                 73
Subscription                  Claude Max ($100/mo)  Google AI Pro ($20/mo, via Google One)  Ollama ($20/mo)    Ollama ($20/mo)

A few honest caveats:

  • "Implementation time" is wall-clock between commits, not active model think-time. It includes any pauses, retries, or tool waits the agent didn't actively bill against.
  • All four started from the same 458d83c baseline (the spec commit). The clock starts at each agent's first phase commit so kickoff lag isn't counted.
  • go-glm's extra hour is mid-run rework, not after-the-fact debugging: clean cadence through Phase 12, then ~41 minutes across two bundled "Phase 12–14" / "Phase 14–18" commits that revisit earlier work. Post-Phase-19 commits (a directory restructure and a dark-mode fix) are not included.
  • go-minimax ran clean — one commit per phase, all 19. Those per-phase commits live on the codex/full-implementation branch; they were folded into a single commit when the app was merged to main, so the granular history is preserved on the branch.
  • go-claude's file count is an architecture choice — one TypeScript module per lesson — not 2× the work. The others leaned on fewer, registry-style modules.
  • Token usage and model think-time aren't shown; git can't see them. Subscription cost is the flat monthly rate, not the marginal cost of this run.
  • All four runs were driven from the same workstation — an M4 Max MacBook Pro (36 GB) — but inference happened in each provider's cloud, so the laptop spec didn't influence the comparison.

What this is

Four coding agents — Claude Code, Google Antigravity, the Pi coding agent running GLM-5.1, and OpenAI's Codex CLI running MiniMax (via Ollama cloud) — were each handed the same brief and the same set of GitHub issues, then turned loose to build an interactive Go tutorial autonomously.

The spec is the brief that defines the product. The prompt is the meta-instruction that told each agent how to work. The issues are the phased breakdown all four followed.

This site is the read-only docs surface. The four apps themselves are linked above.

Built as an experiment in four-model parity.