Key Takeaways
- 01 A GitHub repo with 132K stars leaked system prompts from 34 AI coding tools — including Claude Code, Cursor, Windsurf, and Manus.
- 02 Each tool has a distinct 'personality': Claude Code is ruthlessly concise, Cursor worships semantic search, Windsurf obsesses over memory.
- 03 The key differentiator in 2026 isn't the model — it's how each tool balances proactivity vs. caution, and verbosity vs. silence.
- 04 The research shows these AI assistants share a common fear: appearing too expensive or wasteful with tokens.
- 05 The leaked prompts reveal a hidden arms race over planning, memory, and agentic autonomy — the next frontier of AI coding.
- 06 Most tools forbid the same things: creating docs proactively, using emojis, surprising the user with actions they didn't ask for.
Last week, a repo dropped that made every AI researcher do a double-take. 132,401 stars and climbing. Forked 33,512 times. The repo? A collection of system prompts from 34 different AI coding tools — Claude Code, Cursor, Windsurf, Manus, v0, Devin, Xcode (Apple’s new AI), and dozens more.
Most of these prompts were never meant to see the light of day. They’re the “secret sauce,” the hidden instructions that tell an AI how to think, not just what to think. The moment I saw this repo hit the trending page, I had to dig in. And what I found? These AI coding assistants are nothing alike under the hood.
What’s a System Prompt (And Why Should You Care)?
Before we dive in: a system prompt is the hidden instruction set that shapes every response an AI model gives. It tells the model who it is, what it should prioritize, what it must avoid, and how it should behave.
When you use Claude Code and say “fix this bug,” Claude Code doesn’t just know how to code — it’s been told exactly how to approach that bug. System prompts define things like:
- How verbose should responses be?
- Should it proactively create files, or ask first?
- Is it allowed to write documentation on its own?
- What makes an operation “too expensive”?
These aren’t minor details. They’re the difference between an AI that’s helpful and one that’s absolutely infuriating.
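To make this concrete, here’s a minimal sketch of how a system prompt is passed in a chat-completions-style API request. The prompt text and model name are illustrative placeholders, not any vendor’s actual prompt:

```python
# Sketch: how a system prompt shapes a chat-style API request.
# The prompt text below is illustrative, not any tool's real prompt.

def build_request(user_message: str) -> dict:
    system_prompt = (
        "You are a coding assistant. "
        "Answer in under 4 lines unless the task is complex. "
        "Never create documentation files unless asked."
    )
    return {
        "model": "example-model",  # placeholder model name
        "messages": [
            # The hidden instructions the user never sees:
            {"role": "system", "content": system_prompt},
            # What the user actually typed:
            {"role": "user", "content": user_message},
        ],
    }

req = build_request("fix this bug")
print(req["messages"][0]["role"])  # → system
```

The point: the user only ever writes the second message. The first one — the system turn — is the product.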
The 132K-Star Goldmine
The repo — x1xhlol/system-prompts-and-models-of-ai-tools — collects prompts from:
Claude Code, Cursor, Windsurf, Manus, Augment Code, Cluely, CodeBuddy, Comet, Devin AI, Junie, Kiro, Leap.new, Lovable, NotionAI, Orchids.app, Perplexity, Poke, Qoder, Replit, Same.dev, Trae, Traycer AI, VSCode Agent, Warp.dev, Xcode, Z.ai Code, Dia, v0 (Vercel), and more.
That’s over 30,000 lines of internal instructions. Someone literally dumped the brain scans of nearly every major AI coding tool on the market. And here’s the wild part: they’re all very different.
The Big Five: How They Differ
I read through the system prompts from the biggest players. Here’s what makes each one unique:
1. Claude Code — The Minimalist
Claude Code’s prompt is arguably the most strict. Its key directives:
- Responses must be under 4 lines unless the task is genuinely complex
- No preamble or postamble — never say “Here’s what I did” or “Based on my analysis”
- Only use emojis if explicitly requested — they’re optional, not default
- Defensive security only — refuse to help with anything that could be used maliciously
- NEVER proactively create documentation files — only create them if the user asks
“If you can answer in 1-3 sentences or a short paragraph, please do so. You MUST avoid extra preamble before/after your response.”
The philosophy here is crystal clear: efficiency above all else. Claude Code treats every token like it’s billing per character. And this approach works — developers love that they don’t get paragraphs of explanation when they just need a one-liner.
2. Cursor — The Semantic Search Evangelist
Cursor’s prompt is fascinating because it builds an entire tool philosophy around search:
- Heavily emphasizes `codebase_search` — semantic search that “finds code by meaning, not exact text”
- Provides detailed guidelines for when to use grep vs. semantic search vs. reading files
- Mandates that search queries be natural language questions, not keyword fragments
- Requires each tool call to include an `explanation` field — why is this tool being called?
The key pattern? Cursor treats the codebase as a search-first environment. Before you write a single line, Cursor wants to understand the architecture through search. It’s the tool equivalent of “measure twice, cut once.”
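Based on the prompt’s description, a tool call in this style might look like the sketch below. The field names and the question-mark check are my assumptions to illustrate the two rules — natural-language queries plus a mandatory explanation — not Cursor’s actual schema:

```python
# Hypothetical shape of a Cursor-style tool call. Field names are
# assumed from the leaked prompt's description, not an official schema.

def make_tool_call(query: str, explanation: str) -> dict:
    # The prompt mandates natural-language questions, not keyword
    # fragments — a crude proxy check for illustration:
    assert query.endswith("?"), "queries should be phrased as questions"
    return {
        "tool": "codebase_search",
        "query": query,              # e.g. "Where is the auth token validated?"
        "explanation": explanation,  # why this tool is being called
    }

call = make_tool_call(
    "Where is the auth token validated?",
    "Need to locate the validation logic before editing it.",
)
print(call["tool"])  # → codebase_search
```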
3. Windsurf — The Memory Obsessive
Windsurf (by Codeium) is unique because it’s the only tool that explicitly markets itself around “AI Flow”:
- Built around a persistent memory system — explicitly instructed to save context about the user’s task
- Has a planning tool that maintains a plan of action, updated dynamically
- Introduces the concept of “AI Flow paradigm” — working “both independently and collaboratively with a user”
- Strong warnings about not calling tools “unless absolutely necessary” — they’re expensive
“Remember that you have a limited context window and ALL CONVERSATION CONTEXT, INCLUDING checkpoint summaries, will be deleted. Therefore, you should create memories liberally to preserve key context.”
The standout insight? Windsurf treats memory as a survival mechanism. The moment context gets cleared, everything’s lost — so it proactively archives everything.
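The “create memories liberally” pattern boils down to: write key context somewhere durable before the context window wipes it. Here’s a minimal sketch of that idea — the file name and record fields are my assumptions, not Windsurf’s real implementation:

```python
# Sketch of a "save memories liberally" pattern: persist key context
# to disk so it survives context-window resets. The file name and
# record shape are assumptions, not Windsurf's actual mechanism.
import json
from pathlib import Path

MEMORY_FILE = Path("memories.json")

def save_memory(label: str, content: str) -> None:
    # Load existing memories (if any), append, and write back.
    memories = (
        json.loads(MEMORY_FILE.read_text()) if MEMORY_FILE.exists() else []
    )
    memories.append({"label": label, "content": content})
    MEMORY_FILE.write_text(json.dumps(memories, indent=2))

save_memory("task", "User is refactoring the payments module")
```

Everything the agent considers important gets archived eagerly, because the conversation itself is disposable.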
4. Manus — The Generalist
Manus’s prompt stands out for being the most broad and flexible:
- Designed for “wide range of tasks” — not just coding
- Emphasizes research through web searches and data analysis
- Detailed breakdown of browser capabilities, file operations, deployment
- Explicitly lists framework support across React, Vue, Angular, Django, Flask, etc.
Unlike Claude Code’s laser focus or Cursor’s search-first approach, Manus is the jack-of-all-trades. It doesn’t obsess over minimal output — it focuses on capability breadth.
5. v0 (Vercel) — The Frontend Specialist
Vercel’s v0 has the most specific tooling instructions:
- Deep integration with Vercel’s ecosystem (image generation, deployment)
- Specific rules about Python with `uv` — uses `uv init --bare`, not `pip install`
- Strong image-handling protocols: save images locally, use `/images/` paths by default
- Mandatory use of `console.log("[v0] ...")` for debugging

“By default, reference images in code using the local file path (e.g., `/images/dashboard.png`) rather than a blob URL or external URL, unless the user explicitly asks otherwise.”
v0 is opinionated about frontend workflows. It’s not trying to be a general coding assistant — it’s trying to be the best possible wrapper around Vercel’s deployment pipeline.
The Common Threads
Despite their differences, some patterns appear across every single prompt:
1. Token Paranoia
Every tool without exception instructs the model to:
- Minimize output tokens
- Avoid redundant tool calls
- Batch operations when possible
- Never create files “unless absolutely necessary”
This tells us something important: the economics of AI coding are still raw. Companies are extremely aware of the cost differential between a 10-line response and a 100-line one.
2. Proactivity Limits
Three rules show up everywhere:
- NEVER proactively create documentation (unless asked)
- NEVER surprise the user with actions they didn’t request
- ALWAYS ask before running commands that could be destructive
The industry has learned: developers hate it when AI “helpfully” creates a README.md they’ll never read.
3. Security Fencing
Every single prompt includes defensive security instructions:
- Refuse to help with credential harvesting
- Don’t assist with bulk crawling for keys/cookies
- Allow security analysis and defensive tool creation
Attacking is out. Defending is in. This is likely a response to real-world abuse concerns.
The Surprising Bits
A few findings caught me completely off guard:
Code References Must Include Line Numbers
Claude Code explicitly demands this format:
`file_path:line_number`
Every reference to code must include the line number so users can jump there directly. It’s a tiny detail that massively improves the developer experience.
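Producing that format is trivial, which is part of why the rule works — a one-line helper covers it (this snippet is my illustration of the convention, not Claude Code’s code):

```python
# Sketch: emitting code references in the file_path:line_number format
# so editors and terminals can jump straight to the line.

def code_ref(path: str, line: int) -> str:
    return f"{path}:{line}"

print(code_ref("src/auth.py", 42))  # → src/auth.py:42
```

Most terminals and editors already linkify this format, so the user gets click-to-jump navigation for free.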
Memory Systems Are the New Arms Race
Windsurf’s memory system isn’t unique — most 2026 tools now have some form of context preservation. But the implementations vary wildly:
- Some use explicit “memory tools” to save state
- Others rely on the model’s context window (risky with small windows)
- Windsurf goes furthest: explicitly teaches the AI to “create memories liberally”
The next frontier isn’t better language models — it’s better memory.
“Planning” Tools Are Exploding
Windsurf, Manus, and several other tools explicitly call out a “planning” workflow:
- Maintain a plan of action
- Update it when instructions change
- Check the plan before major actions
I’m seeing the emergence of meta-cognition — the AI isn’t just executing, it’s thinking about what it’ll do next.
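The three planning rules above can be sketched as a tiny data structure: keep an ordered list of steps, track progress, and revise the remainder when instructions change. This is my minimal illustration of the pattern, not any tool’s real planner:

```python
# Sketch of a minimal planning loop: maintain a plan, revise it when
# instructions change, check it before acting. The structure is an
# assumption illustrating the pattern, not any tool's implementation.
from __future__ import annotations
from dataclasses import dataclass, field

@dataclass
class Plan:
    steps: list[str] = field(default_factory=list)
    done: int = 0  # how many steps are already completed

    def revise(self, new_steps: list[str]) -> None:
        # Instructions changed: replace the remaining steps,
        # but keep the record of completed work.
        self.steps = self.steps[: self.done] + new_steps

    def next_step(self) -> str | None:
        # "Check the plan before major actions."
        return self.steps[self.done] if self.done < len(self.steps) else None

plan = Plan(steps=["search codebase", "edit function", "run tests"])
plan.done = 1  # finished searching
plan.revise(["edit function", "add regression test", "run tests"])
print(plan.next_step())  # → edit function
```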
What This Tells Us About the Future
Reading through 30,000 lines of AI instructions, a few things become clear:
- The “personality” of an AI tool is 100% artificial. These aren’t emergent behaviors — they’re carefully engineered constraints.
- The model matters less than the wrapper. GPT-4.1 and Claude Sonnet 4.6 are both capable. What makes Cursor feel different from Claude Code is the system prompt, not the underlying model.
- We’re entering an era of prompt-driven differentiation. Two tools using the same model can feel completely different just by changing how they instruct it to behave.
- Memory and planning are the new battlegrounds. The moment one tool figures out persistent, useful memory across sessions, it’ll have a massive advantage.
My Take
After spending hours in these system prompts, I came away with a weird realization: these AIs are more constrained than I thought.
They’re not reasoning freely. They’re following very specific scripts, with very specific limits. And that’s the point — the best AI coding tools aren’t the ones that can do the most; they’re the ones that know exactly when to stop.
The next time someone asks “which AI coding tool should I use?” — my answer will be: look at the system prompt. That’s the real product.
Have you tried one of these tools and noticed something specific about how it behaves? Drop a comment below — I’m curious if you’ve spotted these patterns in action.