Why Your AI Agent Wastes 40K Tokens Before Writing a Single Line
Watch any AI coding agent work on a real codebase and you'll see the same pattern: read file, read file, read file, read file. By the time it starts writing code, it has burned through 30,000–50,000 tokens just to orient itself. That's not a bug. It's the default behavior — and it's incredibly wasteful.

The file-reading problem

When Claude Code or Cursor starts a task, it doesn't know your codebase. It needs to discover:

  • What entities exist and how they relate
  • Which files handle the relevant logic
  • What conventions the codebase follows
  • What the current schema looks like

Without structured context, the only way to learn this is reading files one by one. Each file costs 500–2,000 tokens. Reading 20–30 files to understand the architecture means 20,000–40,000 tokens spent before a single line of code is generated.

The math

Let's break down a typical "add a feature" task:

Without structured context

  • Read package.json — 800 tokens (understand tech stack)
  • Read schema/models — 5,000 tokens (4–5 model files)
  • Read related routes — 4,000 tokens (3–4 route files)
  • Read related services — 6,000 tokens (3–4 service files)
  • Read test examples — 3,000 tokens (understand test patterns)
  • Read config files — 2,000 tokens (understand conventions)
  • Orientation subtotal: ~20,800 tokens in this example (20,000–40,000 is typical)
  • Actual code generation — 3,000–5,000 tokens

With structured context via MCP

  • start_ticket() — 1,500 tokens (ticket + entity context + conventions + file paths)
  • Actual code generation — 3,000–5,000 tokens

That's a 10–20x reduction in context-gathering tokens. Same output quality, fraction of the cost.
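The breakdown above can be sanity-checked with a few lines of arithmetic. The numbers are the article's estimates, not measurements from any particular run:

```typescript
// Token estimates from the "without structured context" breakdown above.
const orientationReads: Record<string, number> = {
  "package.json": 800,
  "schema/models": 5000,
  routes: 4000,
  services: 6000,
  tests: 3000,
  config: 2000,
};

// Total tokens spent orienting before any code is written.
const orientation = Object.values(orientationReads).reduce((a, b) => a + b, 0);

// A single structured-context call replaces all of those reads.
const structured = 1500;

console.log(orientation); // 20800
console.log(orientation / structured); // ~13.9x fewer orientation tokens
```

The ratio lands near 14x for this mid-range example; bigger codebases with more files to read push it toward the top of the 10–20x range.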

Why this matters beyond cost

Token waste isn't just a billing problem. It has compounding effects:

  • Context window pollution — those 40K tokens of raw file contents push out space for reasoning, planning, and code generation
  • Speed — reading 30 files sequentially takes time, even at API speeds
  • Accuracy — raw files don't highlight what matters. The model has to figure out which parts of each file are relevant, and it often gets this wrong
  • Consistency — different runs read files in different orders, leading to inconsistent understanding and non-deterministic results

Structured context: the fix

The solution is to pre-analyze the codebase and deliver structured context on demand. Instead of reading schema.prisma (2,000 tokens of raw text), the agent receives:

Entity: User
Fields: id (UUID, PK), email (String, unique), name (String)
Relations: has_many Orders, has_one Profile
Files: src/models/user.ts, src/routes/users.ts

That's ~100 tokens instead of 2,000 — and it's more useful because it highlights the relationships and file paths the agent actually needs.
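As a sketch of what produces that summary, here is one possible shape for a pre-analyzed entity record and a formatter that renders it. The field names are illustrative assumptions, not a published Scope schema:

```typescript
// Hypothetical shape for a pre-analyzed entity summary.
interface EntitySummary {
  name: string;
  fields: string[]; // e.g. "id (UUID, PK)"
  relations: string[]; // e.g. "has_many Orders"
  files: string[]; // paths the agent would touch
}

// Render the compact, ~100-token form shown above.
function renderSummary(e: EntitySummary): string {
  return [
    `Entity: ${e.name}`,
    `Fields: ${e.fields.join(", ")}`,
    `Relations: ${e.relations.join(", ")}`,
    `Files: ${e.files.join(", ")}`,
  ].join("\n");
}

const user: EntitySummary = {
  name: "User",
  fields: ["id (UUID, PK)", "email (String, unique)", "name (String)"],
  relations: ["has_many Orders", "has_one Profile"],
  files: ["src/models/user.ts", "src/routes/users.ts"],
};

console.log(renderSummary(user));
```

The point of the structure is that it is lossy on purpose: raw Prisma syntax, comments, and unrelated models are dropped, while the relationships and file paths the agent needs survive.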

How MCP enables this

The Model Context Protocol (MCP) lets AI agents request structured data from external tools. With Scope connected via MCP:

  • start_ticket() returns the ticket, relevant entities, conventions, and file paths in one call
  • get_context(scope: "entities:User+Order") returns just the entities needed for the current task
  • search(query: "payment processing") finds relevant patterns without reading every file

The agent never reads a file it doesn't need to modify. It already knows the architecture.
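The orientation sequence with these tools can be sketched as follows. The `callTool` wrapper and the ticket ID are stand-ins for illustration; a real agent would issue these calls over an MCP connection rather than a local stub:

```typescript
// Stub MCP call: echoes the request so the call sequence is visible.
// A real client would send this over an MCP transport and get structured
// context back; only the tool names here come from the section above.
function callTool(name: string, args: Record<string, string>): string {
  const rendered = Object.entries(args)
    .map(([k, v]) => `${k}: "${v}"`)
    .join(", ");
  return `${name}(${rendered})`;
}

// Three calls replace the 20–30 file reads from the earlier breakdown.
const calls = [
  callTool("start_ticket", { ticket: "SCOPE-42" }), // hypothetical ticket ID
  callTool("get_context", { scope: "entities:User+Order" }),
  callTool("search", { query: "payment processing" }),
];

console.log(calls.join("\n"));
```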

And context improves over time. When the agent completes a ticket, it saves learnings — patterns discovered, gotchas hit, conventions confirmed. Those learnings become context for the next ticket, reducing orientation cost even further.
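A saved learning might look something like the following. The article doesn't specify a schema, so both the fields and the example entries are assumptions about how "patterns, gotchas, conventions" could be stored:

```typescript
// Hypothetical record for a learning saved at the end of a ticket.
interface Learning {
  kind: "pattern" | "gotcha" | "convention";
  summary: string;
  files?: string[]; // optional: where the learning was observed
}

// Invented examples of what an agent might record.
const learnings: Learning[] = [
  {
    kind: "convention",
    summary: "Route handlers live in src/routes, one file per entity",
  },
  {
    kind: "gotcha",
    summary: "Order totals are stored in integer cents, not dollars",
  },
];

// Serialized, this is tens of tokens of context for the next ticket,
// instead of thousands of tokens of re-read files.
console.log(JSON.stringify(learnings).length);
```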

Try it yourself

Connect your GitHub repo to Scope (or sync any local codebase via scope_sync over MCP), set up the MCP integration, and watch the difference. Your AI agent will spend tokens writing code instead of reading files.