Garry Tan Built an AI Software Factory That Ships Like a Team of Twenty

Key Takeaways

01 Garry Tan (YC CEO) built gstack — 15 AI specialist agents that act as CEO, Designer, Eng Manager, QA Lead, and Release Engineer
02 He claims 600,000 lines of production code in 60 days, averaging 10,000-20,000 usable lines per day
03 gstack runs 10-15 parallel sprints simultaneously — each agent knows its role and when to stop
04 Multi-AI review: Claude Code reviews with /review, Codex (OpenAI) reviews with /codex for cross-model analysis
05 All open source, MIT licensed — fork it and make it your own

Introduction

Last week, I was digging through GitHub trending and stumbled upon something that made me pause. A repo called gstack had blown past 33,000 stars in days. The description read: “Use Garry Tan’s exact Claude Code setup: 15 opinionated tools that serve as CEO, Designer, Eng Manager, Release Manager, Doc Engineer, and QA.”

Garry Tan. As in, the President and CEO of Y Combinator.

That is not a typo. The person running one of the most influential startup accelerators in the world just open-sourced his entire AI-powered software factory — and it turns Claude Code into a virtual engineering team that you actually manage.

I had to try it.

The Problem No One Talks About

You have probably felt this before: you have a feature idea, you ping your AI assistant, and it writes some code. Then what? Who challenges the product thinking? Who reviews the architecture? Who runs the tests, opens the browser, clicks through the flow, finds the bug, fixes it, and writes a regression test?

If you are doing all of that yourself, you have a copilot. Not a team.

If you are letting the AI run wild without any process, you have chaos. Not automation.

Garry Tan describes it this way: “We are at the dawn of something real — one person shipping at a scale that used to require a team of twenty.”

What is gstack Exactly?

gstack is a collection of 15 slash-command skills and 6 power tools that plug into Claude Code (and now Codex, Gemini CLI, and Cursor). Each skill plays a specialist role. The table below lists ten of them:

| Skill | Specialist | What They Do |
| --- | --- | --- |
| `/office-hours` | YC Office Hours | Reframe your product before you write code. Six forcing questions. |
| `/plan-ceo-review` | CEO / Founder | Rethink the problem. Find the 10-star product hiding inside your request. |
| `/plan-eng-review` | Eng Manager | Lock architecture, draw ASCII diagrams, map edge cases. |
| `/plan-design-review` | Senior Designer | Rate each design dimension 0-10, detect AI slop. |
| `/design-consultation` | Design Partner | Build a complete design system from scratch. |
| `/review` | Staff Engineer | Find bugs that pass CI but blow up in production. Auto-fix obvious ones. |
| `/investigate` | Debugger | Systematic root-cause debugging. Iron Law: no fixes without investigation. |
| `/qa` | QA Lead | Open a real browser, click through flows, find bugs, write regression tests. |
| `/ship` | Release Engineer | Sync main, run tests, push, open PR. One command. |
| `/document-release` | Technical Writer | Update every doc file to match what you just shipped. |

The power tools include /codex (second opinion from OpenAI’s Codex CLI), /careful (safety guardrails), /freeze (lock edits to one directory), and /gstack-upgrade (self-updater).

My First Run

I installed gstack on a side project I had been dragging along — a React dashboard that needed love.

First, I ran /office-hours and described what I was building. The response surprised me:

“You said ‘dashboard.’ But what you actually described is a command center with real-time metrics. The narrowest wedge that delivers value is a daily summary view — not the full dashboard. That is a 3-month project. Start with one view that actually gets used.”

It pushed back on my framing. It listened to my pain (stale data, slow loading), not my feature request.

Then I ran /plan-ceo-review. It generated a design doc with three implementation approaches and effort estimates. That doc automatically fed into /plan-eng-review, which produced ASCII diagrams for data flow and a test matrix.

Eight commands. Forty-five minutes. With a team, that would have taken a week of back-and-forth.

The Parallel Sprint Revolution

Here is what really got me: gstack is not just about one sprint. It is about running 10-15 of them simultaneously.

Garry Tan uses Conductor to run multiple Claude Code sessions in parallel. Each session runs in its own isolated workspace — one doing /office-hours on a new idea, another implementing a feature, a third running /review on a PR, a fourth doing /qa on staging.

You manage them the way a CEO manages a team: check in on the decisions that matter, let the rest run.

“Without a process, ten agents is ten sources of chaos. With a process — think, plan, build, review, test, ship — each agent knows exactly what to do and when to stop.”
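The idea in that quote, a fixed stage order applied to many independent sprints at once, can be sketched in a few lines of Python. This is only an illustration and not how gstack or Conductor is implemented: the stage names come from the quote, while `run_sprint`, `run_parallel`, and the sprint names are hypothetical, and each stage merely records a log entry where a real session would invoke an agent.

```python
from concurrent.futures import ThreadPoolExecutor

# Stage order taken from the quote above; everything else here is hypothetical.
STAGES = ["think", "plan", "build", "review", "test", "ship"]


def run_sprint(name: str) -> list[str]:
    """Walk one sprint through every stage in order, then stop."""
    log = []
    for stage in STAGES:
        # A real session would hand off to an agent here; this only records the step.
        log.append(f"{name}:{stage}")
    return log


def run_parallel(names: list[str], max_workers: int = 10) -> dict[str, list[str]]:
    """Run independent sprints concurrently and collect each sprint's log."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map yields results in the same order as the input names.
        return dict(zip(names, pool.map(run_sprint, names)))


if __name__ == "__main__":
    logs = run_parallel(["dashboard", "billing", "onboarding"])
    for name, log in logs.items():
        print(name, "->", log[-1])
```

Because every sprint follows the same stage list and shares no state with the others, adding an eleventh sprint changes nothing about the first ten, which is the property the quote is pointing at.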

Multi-AI Review: When Claude and Codex Disagree

One of the coolest features is /codex — an independent code review from OpenAI’s Codex CLI.

Three modes:

Review: Pass/fail gate on your changes
Adversarial: Actively tries to break your code
Consultation: Open-ended discussion

When both /review (Claude) and /codex (OpenAI) have reviewed the same branch, gstack generates a cross-model analysis showing which findings overlap and which are unique to each AI.

That is powerful. If two completely different AI systems flag the same issue, you listen. If they disagree, you get the full picture.
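A cross-model comparison like this boils down to set operations over two lists of findings. The sketch below is an assumption about the shape of the idea, not gstack's actual code: `Finding`, `cross_model_analysis`, and the sample findings are all made up for illustration, and it matches findings by exact file and issue text, where a real tool would need fuzzier matching.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    """One review finding (hypothetical shape: a file plus a short issue label)."""
    file: str
    issue: str


def cross_model_analysis(
    claude: set[Finding], codex: set[Finding]
) -> dict[str, set[Finding]]:
    """Split findings into those both reviewers raised and those unique to each."""
    return {
        "both": claude & codex,         # flagged by both models
        "claude_only": claude - codex,  # raised only by the first reviewer
        "codex_only": codex - claude,   # raised only by the second reviewer
    }


if __name__ == "__main__":
    claude = {Finding("api.py", "missing auth check"), Finding("db.py", "N+1 query")}
    codex = {Finding("api.py", "missing auth check"), Finding("ui.tsx", "stale state")}
    report = cross_model_analysis(claude, codex)
    for bucket, findings in report.items():
        print(bucket, sorted(f.file for f in findings))
```

The "both" bucket is the one to act on first. The two "only" buckets are where a human has to judge which reviewer is right.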

Real Numbers

Garry Tan claims he wrote over 600,000 lines of production code in 60 days, with 35% tests. His last /retro (weekly stats) showed:

140,751 lines added
362 commits
~115k net LOC added

All while being the CEO of Y Combinator.
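To put those figures in perspective, here is the arithmetic spelled out. The inputs are the claimed numbers quoted above. The derived values are my own back-of-the-envelope calculations, and the "implied lines removed" figure assumes net LOC means lines added minus lines removed, which the stats do not state explicitly.

```python
# Claimed figures from the article.
TOTAL_LINES = 600_000
DAYS = 60
TEST_PERCENT = 35
WEEK_LINES_ADDED = 140_751
WEEK_COMMITS = 362
WEEK_NET_LOC = 115_000  # reported only approximately (~115k)

# Derived figures (integer math keeps the results exact).
lines_per_day = TOTAL_LINES // DAYS                    # 10000
test_lines = TOTAL_LINES * TEST_PERCENT // 100         # 210000
lines_per_commit = round(WEEK_LINES_ADDED / WEEK_COMMITS)  # 389
# Assumes net = added - removed; approximate because net is rounded.
implied_lines_removed = WEEK_LINES_ADDED - WEEK_NET_LOC    # 25751

if __name__ == "__main__":
    print(f"average lines per day:      {lines_per_day}")
    print(f"lines that are tests:       {test_lines}")
    print(f"lines added per commit:     {lines_per_commit}")
    print(f"implied lines removed/week: ~{implied_lines_removed}")
```

So the 60-day claim works out to about 10,000 lines a day, and the weekly stats average just under 400 added lines per commit.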

You might be thinking: this is Garry Tan. He is brilliant. He has access to everything.

And you are right. But here is the thing — he is not keeping this to himself. He open-sourced gstack under the MIT license. Every tool, every skill, every workflow is there for you to fork and make your own.

When to Use This

gstack excels when:

You are a solo founder who wants to ship like a team
You are a technical CEO who still wants to code
You need structured review gates before shipping
You want parallel AI workers handling different branches

It might not be for you if:

You prefer a blank canvas over opinionated structure
You need very lightweight, simple automation
You do not use Claude Code (though Codex and Gemini CLI support is growing)

Common Mistakes I Saw

Skipping /office-hours. People jump straight to building. That defeats the purpose — the design doc feeds into everything downstream. Without it, you lose the structure.
Not running /ship. Some folks forget that /ship bootstraps a test framework if you do not have one. It is an incredible safety net.
Forgetting parallel sprints. One sprint is powerful. Ten running at once is transformational. The process structure makes parallelism work.
Ignoring /document-release. I love this one — it keeps your docs current automatically. No more stale READMEs.

Conclusion

Garry Tan ends his README with this: “Same tools, different outcome — because gstack gives you structured roles and review gates, not generic agent chaos.”

That is the key insight. It is not about having an AI write code. It is about having a structured team of AI specialists, each with a clear role, each feeding into the next. That governance is the difference between shipping fast and shipping reckless.

The best part? It is all free. No premium tier. No waitlist. Just a repo you can clone and a process that transforms how you build software.

Welcome to the era of one-person engineering teams that ship like they have twenty people behind them.

Installation takes 30 seconds. Open Claude Code and run:

git clone https://github.com/garrytan/gstack.git ~/.claude/skills/gstack && cd ~/.claude/skills/gstack && ./setup

Bittalks

Developer and tech enthusiast exploring the intersection of open source, AI, and modern software development.
