How to Keep Coding Agents From Changing Too Much Code

A practical way to set scope, review AI-generated diffs, and stop small requests from turning into large edits.

Jun 16, 2026

I keep coming back to one kind of AI coding failure because I see the same pattern again and again.

Ask an agent to add bounded retry logic to one client. It returns a diff across 14 files. The retry code is there. So are changes to dependency injection wiring, a new interface nobody asked for, relocated tests, and edits to bootstrap code that were never part of the request.

That run does not look absurd. It looks plausible. That is the problem. The agent solved the local task by increasing the scope of the change until the local problem disappeared.

I think we still talk about these runs too gently. We call them prompt problems, context problems, or tool problems. Sometimes they are. Many times the simpler explanation is the right one: the agent had no firm boundary around the work.

That is why I keep landing on the same conclusion. Coding agents need a change budget.

A prompt is not a boundary

I keep seeing teams try to solve this with better instructions.

Add a stronger system prompt. Add repository rules. Add an AGENTS.md file. Add a sentence that says “touch as few files as possible.” Add another sentence that says “do not change public APIs.” Add a third sentence that says “ask before making broader architectural changes.”

That is useful guidance. It is still guidance.

If the same model defines the scope, executes the edits, explains the result, and justifies the extra changes, then the process has no independent control point where it needs one most.

GitHub’s own guidance on Copilot inline suggestions says users are responsible for reviewing and validating suggestions before accepting them. It also says the system may miss larger design or architectural issues. I like that because it is plain. If the model can miss the larger structure, I do not want it deciding whether the change stayed inside the intended structure.

The phrase “human in the loop” also makes people relax too quickly. A tired reviewer reading a 40-file diff is not a control plane. End-of-process approval does not solve the earlier problem, which is that the system already allowed a one-class task to become much larger.

I care less about whether the model had good reasons. I care about who can stop scope growth before it reaches review.

Code got cheaper. Review did not.

The appealing part of AI coding is real. Starting work gets easier. The first pass is cheaper. We can try more things in less time.

Confidence still takes review time.

Someone still has to read the diff and decide whether the system deserves trust afterward. That means checking hidden contracts, failure paths, dependency choices, configuration fallout, tests, and the small edits that look fine until they reach production. If the change touches security, infrastructure, or public behavior, review gets expensive very quickly.

The review burden now shows up in survey data. Sonar’s 2026 State of Code survey says 96 percent of developers do not fully trust AI-generated code, but only 48 percent say they always check it before committing. The same survey says 38 percent find AI-generated code harder to review than code written by colleagues. That sounds right to me. Faster drafting helps. The review work remains.

This is the part that demos usually skip. AI is good at producing code quickly. Review still moves at human speed. That can still be a good trade, but the real unit of cost is reviewable change.

Large diffs are where this gets expensive. In a status update they look productive. Files changed. CI passed. The task is marked done. Then a reviewer has to spend real time sorting out whether the diff contains one change or four.

GitHub makes the same point in the language of product documentation. Its Copilot review guidance says Copilot pull requests deserve the same thorough review as any contribution, and that if a repository requires approvals, another reviewer still has to approve the pull request before merge. Good. I trust that instinct much more than I trust claims of autonomous productivity.

I am not arguing for a ban on agents. I use these tools too. I am arguing for systems that admit the obvious cost instead of hiding it behind generation speed.

Give the agent a change budget

When I say change budget, I mean something very simple.

Before the agent starts work, decide how much change you are willing to accept for this task before it has to stop, escalate, or get rejected.

We already use budgets in distributed systems because they make limits explicit. Error budgets, retry budgets, timeouts, memory limits, and connection caps all exist for the same reason. Uncontrolled growth gets expensive. I want the same habit applied to change scope.

A change budget can be small and boring:

Which paths the agent may touch
How many production files it may change
Whether it may add dependencies
Whether it may change public APIs
Which areas are protected outright
Which checks must pass before a human even reads the diff
Which conditions force escalation
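
To make a few of the items above concrete, here is a minimal sketch of a budget expressed as data plus a check that runs outside the agent. Everything in it is an assumption for illustration: the `ChangeBudget` fields, the glob patterns, the `src/test/*` convention for test files, and the idea of treating any edit to `pom.xml` or `build.gradle` as a possible dependency change. The list of changed files could come from something like `git diff --name-only`. It covers paths, file count, protected areas, and build files only, not required checks or escalation rules.

```python
from dataclasses import dataclass
from fnmatch import fnmatch
from typing import List, Sequence, Tuple


@dataclass(frozen=True)
class ChangeBudget:
    """Hypothetical per-task budget; field names are illustrative only."""
    allowed_paths: Tuple[str, ...]                       # globs the agent may touch
    protected_paths: Tuple[str, ...] = ()                # globs that are off limits outright
    test_paths: Tuple[str, ...] = ("src/test/*",)        # assumed test layout
    build_files: Tuple[str, ...] = ("pom.xml", "build.gradle")
    max_production_files: int = 3
    allow_dependency_changes: bool = False


def _matches(path: str, patterns: Sequence[str]) -> bool:
    # fnmatch's "*" also matches "/" so "src/test/*" covers nested files.
    return any(fnmatch(path, pattern) for pattern in patterns)


def budget_violations(budget: ChangeBudget, changed_files: Sequence[str]) -> List[str]:
    """Return a human-readable list of violations; an empty list means in budget."""
    violations: List[str] = []
    for path in changed_files:
        if _matches(path, budget.protected_paths):
            violations.append(f"protected path touched: {path}")
        elif not _matches(path, budget.allowed_paths):
            violations.append(f"outside allowed paths: {path}")
        # A build file edit is treated as a *possible* dependency change.
        file_name = path.rsplit("/", 1)[-1]
        if not budget.allow_dependency_changes and file_name in budget.build_files:
            violations.append(f"build file changed: {path}")
    # Anything that is not under a test path counts as a production file.
    production = [p for p in changed_files if not _matches(p, budget.test_paths)]
    if len(production) > budget.max_production_files:
        violations.append(
            f"{len(production)} production files changed, "
            f"budget is {budget.max_production_files}"
        )
    return violations
```

The point of returning a list instead of a boolean is that the same output can reject the run, trigger escalation, or go into the review summary, without the agent writing any of it.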

That changes the question in a useful way.

Instead of asking whether the agent completed the task, we ask whether it stayed inside the agreed budget while completing the task.

I trust that question much more because it sounds like normal engineering.

The boundary has to live outside the model

I only trust this when the policy lives outside the agent.

If the budget is another paragraph in the prompt, the model can ignore it, reinterpret it, or violate it and then write a summary that sounds reasonable. I want a clear separation between guidance and enforcement.

The shape I trust looks more like this:

task contract
  -> isolated workspace
  -> deterministic checks
  -> independent review
  -> human approval

The model can help inside that flow. It should not own the whole flow.
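
One way to read that flow in code, as a sketch and not a finished design: deterministic gates run first and can reject the run on their own, a reviewer model is consulted only after they pass, and the best possible outcome is "waiting for a human". The names `GateResult`, `Decision`, `run_pipeline`, and `max_files_gate` are hypothetical, and the reviewer is modeled as a plain function that returns notes.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class GateResult:
    name: str
    passed: bool
    detail: str = ""


@dataclass
class Decision:
    status: str                          # "rejected" or "awaiting_human_approval"
    gate_results: List[GateResult]
    reviewer_notes: Optional[str] = None


Gate = Callable[[List[str]], GateResult]
Reviewer = Callable[[List[str]], str]


def run_pipeline(
    changed_files: List[str],
    gates: List[Gate],
    reviewer: Optional[Reviewer] = None,
) -> Decision:
    """Run deterministic gates in order; stop at the first failure."""
    results: List[GateResult] = []
    for gate in gates:
        result = gate(changed_files)
        results.append(result)
        if not result.passed:
            # A failed gate ends the run. The reviewer is never asked to excuse it.
            return Decision("rejected", results)
    # The reviewer's notes are advisory; they do not change the status.
    notes = reviewer(changed_files) if reviewer else None
    return Decision("awaiting_human_approval", results, notes)


def max_files_gate(limit: int) -> Gate:
    """Example gate: reject diffs that touch more files than the limit."""
    def gate(changed_files: List[str]) -> GateResult:
        count = len(changed_files)
        return GateResult("max_files", count <= limit, f"{count} files changed, limit {limit}")
    return gate
```

Note that nothing in this sketch can return "approved". That status belongs to the human step at the end of the flow.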

The Model Context Protocol architecture docs are useful here because they stay within their scope. MCP focuses on the protocol for context exchange. It does not tell an AI application how to manage the context it receives. The same docs split the system into a data layer and a transport layer, with authorization living in transport. That is important plumbing. It is still plumbing.

The MCP security guidance points in the same direction. It talks about confused deputy problems, scope minimization, and local server compromise. I read that as a reminder that a tool protocol can move capabilities around cleanly while still leaving the trust decisions to something outside the protocol.

I also think Java teams are in a better position here than the public AI conversation suggests. Many of them already own the enforcement layer. ArchUnit can catch boundary drift. Revapi or japicmp can flag API breakage. Maven Enforcer and dependency analysis can reject changes the agent had no business making. Test suites, format checks, SBOM comparison, container policy checks, and build rules already exist in many teams. Those tools keep their value when the code arrives through a model.

In a healthy setup, deterministic gates stay authoritative. A reviewer model can summarize or critique the diff. It should not be the thing that excuses structural violations.

The workspace is the transaction

There is a second boundary that matters just as much: where the change happens.

For coding agents, I treat the workspace as the transaction boundary.

Give the agent an isolated worktree, branch, container, or ephemeral development environment. Let it make local changes, run the allowed checks, and produce something we can inspect. If the run is bad, discard it. If the run is good, continue.

Git already gives us a simple version of this with worktrees. The official docs describe them as multiple working trees attached to the same repository. That is exactly what I want here: a separate place to do the work without turning the main checkout into another review problem.
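
As a rough sketch of that idea, the helpers below create a worktree on a fresh branch for one agent run and throw both away if the run is rejected. The function names, the branch naming, and the directory layout are assumptions; the git subcommands themselves (`git worktree add -b`, `git worktree remove --force`, `git branch -D`) are standard.

```python
import subprocess
from typing import List


def add_worktree_cmd(path: str, branch: str, base: str = "HEAD") -> List[str]:
    """Command that creates a new branch from `base` and checks it out at `path`."""
    return ["git", "worktree", "add", "-b", branch, path, base]


def discard_worktree_cmds(path: str, branch: str) -> List[List[str]]:
    """Commands that remove the worktree directory and delete its branch."""
    return [
        ["git", "worktree", "remove", "--force", path],
        ["git", "branch", "-D", branch],
    ]


def create_worktree(repo: str, path: str, branch: str, base: str = "HEAD") -> None:
    # Runs inside the main checkout; the agent then works only under `path`.
    subprocess.run(add_worktree_cmd(path, branch, base), cwd=repo, check=True)


def discard_worktree(repo: str, path: str, branch: str) -> None:
    # Called when the run is rejected; nothing in the main checkout changes.
    for cmd in discard_worktree_cmds(path, branch):
        subprocess.run(cmd, cwd=repo, check=True)
```

A run might call `create_worktree(".", "../agent-run-1", "agent/retry-logic")`, point the agent at `../agent-run-1`, and call `discard_worktree` with the same arguments if the checks fail.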

The value is simple. Local mutations are reversible.

File edits are reversible. Test artifacts are reversible. A branch commit is reversible. A bad local refactor is still annoying, but it is recoverable.

External side effects are different. Once the agent opens a pull request, posts to GitHub, sends a message, publishes a package, changes a cloud resource, or writes to a real database, the change has moved beyond the local workspace. At that point isolation is only one part of the story.

This is where many agent tool stacks still look unfinished to me. Tool protocols can tell you how to call something. They usually do not decide who approved the call, which identity it used, whether it can be replayed safely, or whether the action should have been available to the task at all.

So yes, isolate the workspace. I want that. I also want a clear line between reversible local work and external side effects.
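
That line can be drawn explicitly. The sketch below is one possible shape, with made-up action names: every tool action is classified as local or external, unknown actions are treated as external, and an external action is allowed only if it appears in an approval set that something outside the model produced for this task.

```python
from enum import Enum
from typing import Dict, FrozenSet


class Effect(Enum):
    LOCAL = "local"          # reversible inside the isolated workspace
    EXTERNAL = "external"    # visible outside it, hard or impossible to undo


# Hypothetical action names; a real tool stack would have its own.
ACTION_EFFECTS: Dict[str, Effect] = {
    "edit_file": Effect.LOCAL,
    "run_tests": Effect.LOCAL,
    "commit_to_branch": Effect.LOCAL,
    "open_pull_request": Effect.EXTERNAL,
    "send_message": Effect.EXTERNAL,
    "publish_package": EFFECT_EXTERNAL if False else Effect.EXTERNAL,
    "write_database": Effect.EXTERNAL,
}


def classify(action: str) -> Effect:
    # Anything not explicitly listed is assumed to have external effects.
    return ACTION_EFFECTS.get(action, Effect.EXTERNAL)


def is_allowed(action: str, approved_external: FrozenSet[str] = frozenset()) -> bool:
    """Local actions are always allowed; external ones need prior approval."""
    if classify(action) is Effect.LOCAL:
        return True
    return action in approved_external
```

Defaulting unknown actions to external is the design choice that matters here: a new tool has to be classified before the agent can use it freely.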

Measure review cost, not generated output

If I had to remove one success metric from the current conversation, I would remove generated volume.

I do not care how many files the agent touched if half the runs get discarded. I do not care how quickly the draft appeared if it still takes a senior engineer 45 minutes to decide whether the diff contains unrelated changes. I do not care how many suggestions were accepted if the team learns to distrust every large diff.

The metrics I would actually watch are much less glamorous:

Files changed per accepted task
Percentage of runs discarded
Human review time per accepted change
Rate of budget violations
Frequency of protected-area touches
How often a “small change” needed escalation
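
None of these need special tooling. As a sketch, assuming each agent run is logged as a small record (the `RunRecord` fields below are hypothetical), the whole list reduces to a few averages and ratios. Review time here is averaged over accepted runs only, which is one reading of the metric; counting time spent on discarded runs as well would be a stricter one.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Dict, List, Optional


@dataclass
class RunRecord:
    """One agent run; field names are illustrative."""
    files_changed: int
    accepted: bool
    review_minutes: float
    budget_violations: int
    protected_touches: int
    needed_escalation: bool


def review_metrics(runs: List[RunRecord]) -> Dict[str, Optional[float]]:
    if not runs:
        raise ValueError("no runs recorded")
    total = len(runs)
    accepted = [r for r in runs if r.accepted]
    return {
        # Averages over accepted runs are None when nothing was accepted.
        "files_changed_per_accepted_task":
            mean(r.files_changed for r in accepted) if accepted else None,
        "review_minutes_per_accepted_change":
            mean(r.review_minutes for r in accepted) if accepted else None,
        "percent_runs_discarded": 100.0 * (total - len(accepted)) / total,
        # Rates below are the fraction of all runs with at least one occurrence.
        "budget_violation_rate": sum(r.budget_violations > 0 for r in runs) / total,
        "protected_touch_rate": sum(r.protected_touches > 0 for r in runs) / total,
        "escalation_rate": sum(r.needed_escalation for r in runs) / total,
    }
```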

Those numbers tell me whether the system is getting easier to review with confidence.

That is the bar I care about. I care less about how much code the agent can generate than about whether a busy team can understand the accepted change without extra friction.

This is not about obedient agents

I am not trying to make coding agents perfectly obedient. I am trying to make their mistakes cheap.

That goal feels realistic to me, and it is enough.

We need systems where agent mistakes stay small, visible, and reversible.

That means the agent does not decide its own change scope. The task has a contract. The workspace is isolated. The structural checks are real. The reviewer is separate from the author. The human sees a bounded diff that can be understood in one pass.

This is not a dramatic vision of AI-assisted development. It is simply the first version I trust.

The most useful sentence I can give a team right now is this: if the agent is allowed to change real systems, give it a change budget before it starts.

Otherwise the team learns the limits of the system during review, which is an expensive way to learn.
