Why Senior Java Developers Are Using AI Coding Tools Wrong
How Compounding Engineering turns AI from autocomplete into a junior developer you can actually trust
At NDC Manchester 2025, Aleksander Stensby gave one of the more honest talks I’ve seen about AI-assisted coding. Not a demo reel. Not a hype session. A practical breakdown of why tools like Claude Code, Cursor, and GitHub Copilot often disappoint experienced developers – and what to do about it.
The core idea is simple, but uncomfortable:
If AI produces bad code for you, that’s often a workflow problem, not a model problem.
Stensby calls the fix Compounding Engineering. The name matters. This is not about prompts. It’s about teaching your tools over time, the same way you onboard a junior developer and make them better week by week.
Let’s quickly walk through it all.
Tip 0: Treat the AI Like a Junior Intern
This is the mindset shift everything else depends on.
If a junior developer submits messy code, you don’t fire them. You review it. You explain why it’s wrong. You show what “good” looks like. Over time, they improve.
Most people don’t do this with AI. They accept the output, complain about “AI slop,” and move on.
That’s a choice.
You can be a passive user, or you can be a collaborator. If you want better results, you need to actively guide the model, correct it, and push back when it makes bad decisions. Your value does not disappear just because the code compiles.
Context Is King (and Scarce)
Large language models do not have infinite attention. They have context windows, and those windows fill up fast.
Too little context and the model hallucinates.
Too much context and it loses focus.
The mistake many developers make is trusting auto-compaction and long-running chats. Over time, the model forgets why certain decisions were made and starts improvising.
The fix is boring but effective:
Clear chats aggressively.
Store state explicitly.
Use markdown files as memory. Architecture notes, constraints, decisions, rejected approaches. When you start a new session, you re-load exactly the context that matters. No more. No less.
This feels manual because it is. But it keeps the model sharp and predictable.
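To make "re-load exactly the context that matters" concrete, here is a minimal Java sketch. It is an illustration, not something shown in the talk: the `ContextPack` class, the note names, and the character budget are all assumptions. It stitches selected markdown notes into one preamble and stops before a fixed budget is exceeded.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: builds a session preamble from named markdown notes.
final class ContextPack {
    private final Map<String, String> notes = new LinkedHashMap<>();
    private final int maxChars; // assumed budget, measured in characters for simplicity

    ContextPack(int maxChars) {
        this.maxChars = maxChars;
    }

    // Register a note under a label such as "decisions.md" (an assumed name).
    ContextPack add(String label, String markdown) {
        notes.put(label, markdown.strip());
        return this;
    }

    // Concatenate notes in insertion order and stop at the first note that
    // would exceed the budget. Notes are never cut off mid-text.
    String render() {
        StringBuilder out = new StringBuilder();
        for (Map.Entry<String, String> note : notes.entrySet()) {
            String section = "## " + note.getKey() + "\n" + note.getValue() + "\n\n";
            if (out.length() + section.length() > maxChars) {
                break;
            }
            out.append(section);
        }
        return out.toString();
    }
}
```

In a real project the note text would probably come from `Files.readString(path)`, and the most important notes would be added first so they are the ones that survive the budget.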
Rule Files Are Your Long-Term Memory
Files like CLAUDE.md or .cursorrules are not just configuration. They are the closest thing you have to training: instructions the tool re-reads at the start of every session.
Most people treat them as static style guides. Stensby treats them as a living knowledge base.
When you fix a subtle bug, define a pattern, or explain why something is off-limits in your codebase, you don’t just correct the AI once. You update the rule file and ask the AI to learn from it.
Now the mistake is far less likely to come back.
Over time, this compounds. The AI becomes more aligned with how you write code, how your system works, and what your standards are. This is where real productivity gains show up.
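The habit of turning each correction into a permanent rule can be sketched in a few lines of Java. This is only one possible shape, under assumptions: the `RuleFile` class and the bullet format are hypothetical, and the method works on the file's text rather than on any tool-specific API. It appends a lesson as a bullet and does nothing if that bullet is already present.

```java
// Hypothetical helper for keeping a rule file (for example CLAUDE.md) up to date.
final class RuleFile {
    // Returns the rule file content with the lesson appended as a bullet,
    // or the content unchanged if that exact bullet is already present.
    static String withLesson(String content, String lesson) {
        String bullet = "- " + lesson.strip();
        boolean alreadyThere = content.lines()
                .anyMatch(line -> line.strip().equals(bullet));
        if (alreadyThere) {
            return content;
        }
        // Start the bullet on its own line even if the file lacks a trailing newline.
        String separator = content.isEmpty() || content.endsWith("\n") ? "" : "\n";
        return content + separator + bullet + "\n";
    }
}
```

A caller would read the file with `Files.readString`, pass it through `withLesson`, and write the result back. In practice you may prefer to ask the AI itself to phrase and place the rule, then review the diff.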
Always Plan Before You Code
If you let an AI jump straight into implementation, it will make decisions for you. Some of them will be fine. Some of them will be quietly terrible.
Plan mode fixes this.
You ask the model to outline the steps first. To explain assumptions. To ask clarifying questions. For complex work, you explicitly enable high-reasoning modes and slow it down.
This does two things.
It forces the model to think.
It forces you to spot bad ideas early.
Planning is cheap. Refactoring bad architecture is not.
Your Job Is Taste and Judgment
AI can produce working code very fast. It cannot reliably produce good code.
That’s where you come in.
Your value is taste. Architecture. Knowing when something feels wrong even if the tests pass. Knowing which trade-offs are acceptable and which ones will hurt you six months later.
Don’t just issue commands. Have a conversation. Ask the model why it chose a design. Challenge it. Let it challenge you. This back-and-forth is where quality emerges.
Break Work into Atomic Tasks
Large features overwhelm models the same way they overwhelm humans.
The fix is simple: make tasks small and explicit.
Use GitHub issues. Use markdown task files. One feature. One responsibility. One outcome.
If the AI has an unrelated idea mid-task, don’t let it derail the flow. Tell it to capture the idea somewhere else and move on. Focus matters.
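One way to picture "one feature, one outcome, and a place to capture stray ideas" is a small task object with a parking lot. The sketch below is an assumption-laden illustration: `AtomicTask`, its fields, and the markdown layout are invented for this example and do not come from the talk.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of one atomic task plus a "parking lot" for unrelated ideas.
final class AtomicTask {
    private final String title;
    private final String outcome;
    private final List<String> parkedIdeas = new ArrayList<>();

    AtomicTask(String title, String outcome) {
        this.title = title;
        this.outcome = outcome;
    }

    // Record an off-topic idea without changing the task's scope.
    void park(String idea) {
        parkedIdeas.add(idea.strip());
    }

    // Render the task as a small markdown file that can be handed to the AI.
    String toMarkdown() {
        StringBuilder md = new StringBuilder();
        md.append("# Task: ").append(title).append("\n");
        md.append("Outcome: ").append(outcome).append("\n");
        if (!parkedIdeas.isEmpty()) {
            md.append("\nParked for later:\n");
            for (String idea : parkedIdeas) {
                md.append("- ").append(idea).append("\n");
            }
        }
        return md.toString();
    }
}
```

The point of the design is that a parked idea is written down but never alters the title or outcome of the task in progress.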
Pick the Right Model on Purpose
Not every task needs a big thinking model.
Fast, cheap models are great for refactors, formatting, and boilerplate. Slower, more expensive models are for architecture, tricky logic, and design decisions.
Blindly using one model for everything is wasteful and often counter-productive. Active model selection is part of engineering now.
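Active model selection can be as simple as an explicit mapping from task type to model tier. The following Java sketch assumes two hypothetical tiers and a hypothetical set of task kinds taken from the examples above. It names no real models, since which ones fit each tier changes quickly.

```java
// Hypothetical task categories, mirroring the examples in the text.
enum TaskKind { REFACTOR, FORMATTING, BOILERPLATE, ARCHITECTURE, TRICKY_LOGIC, DESIGN_DECISION }

// Hypothetical tiers; map these to whatever models your tool offers.
enum ModelTier { FAST, REASONING }

final class ModelRouter {
    // Routine work goes to the fast tier; everything else gets the slower,
    // more expensive reasoning tier.
    static ModelTier tierFor(TaskKind kind) {
        switch (kind) {
            case REFACTOR:
            case FORMATTING:
            case BOILERPLATE:
                return ModelTier.FAST;
            default:
                return ModelTier.REASONING;
        }
    }
}
```

Defaulting unknown work to the reasoning tier is a deliberate choice in this sketch: overspending on a trivial task costs money, while underpowering an architectural decision costs far more later.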
Slash Commands and Sub-Agents
Once your workflow stabilizes, you can automate parts of it.
Custom slash commands handle repetitive tasks like writing unit tests or reviewing pull requests. Sub-agents run in parallel with their own context windows, working on documentation or QA without polluting your main thread.
This is how you scale your attention, not just the model’s.
MCP: Giving the AI Arms and Legs
With the Model Context Protocol (MCP), the AI stops being a text box.
It can search documentation. Inspect browser state. Look at DevTools logs. Interact with real systems instead of guessing.
This is where agentic workflows become useful, not magical. The model still needs supervision, but it’s finally grounded in reality.
Autonomous Agents (Carefully)
Yes, you can let agents work in the background. On GitHub issues. In sandboxes. Submitting PRs for review.
The key word is review.
You don’t merge blindly. You treat these agents like very fast juniors who never get tired and absolutely need oversight.
Always Have a Safety Net
AI will break your code. Not sometimes. Always.
Use Git. Use checkpoints. Make it trivial to rewind. If rollback is painful, you’ll hesitate to experiment, and that kills the upside of AI-assisted development.
One of Stensby’s favorite prompts near the end of the talk was simple:
“Are you sure?”
It sounds trivial. It isn’t.
The Real Takeaway
AI coding tools are not autocomplete anymore. They are closer to collaborators. But collaborators need onboarding, feedback, and boundaries.
If you invest in that, results compound.
If you don’t, you get faster mediocrity.
And faster mediocrity is still mediocrity.


