The Real Problem With AI-Assisted Java Content Is Drift
Not every bad technical article is obvious. This is how I now ground AI output before it turns plausible mistakes into shared knowledge.
I recently took down one of my Quarkus posts. Not because the whole thing was garbage. Not because every example was broken. But because a few parts were off in the way that matters most in technical writing: they sounded plausible, but they were wrong enough to mislead readers.
That is worse than an obvious mistake.
An obvious mistake gets caught quickly. A plausible mistake gets repeated. Someone reads it before an interview. Someone copies it into a demo. Someone turns it into a team explanation. And suddenly the problem is not one bad sentence anymore. The problem is that bad information now looks like accepted knowledge.
I was called out on that, and rightly so. The feedback was direct. Some of it was uncomfortable. But it was fair.
The main point was not that AI should never be used. The point was that a high-speed, AI-assisted publishing workflow creates a very specific risk. Code can compile. The narrative can sound smooth. The article can feel finished. And still, a couple of details can drift just far enough away from reality to become harmful. In a fast-moving ecosystem like Quarkus, that is not a small issue. That is exactly where trust starts to break.
There is another side to this that also matters to me. The Main Thread was never only meant to be a publishing machine for Java content. It also became my laboratory for figuring out how far I can push AI to teach, and how far I can push myself to work with AI without losing control of the outcome. That experiment is still very much alive. It is also exactly why I need to be honest when it fails.
Another point hit just as hard. Review is expensive. Engineering time is expensive. A review process should not be there to rescue content that was never grounded properly in the first place. It should be there as a final safety layer. Not as cleanup for a workflow that moves too fast.
That landed with me.
So I promised to keep the original post down, and I still think that was the right call. Not because I want to perform some public self-punishment ritual, but because replacing a flawed article with a better one is more useful than quietly pretending nothing happened.
Rather than trying to rescue the old article, I decided to write this one.
It is not another list post. It is not a patched version of interview questions. It is the thing underneath the problem: how I think about keeping AI-infused IDEs, coding assistants, and agent workflows grounded enough that they stay useful without drifting into confident nonsense.
Because that, for me, is the real issue now. AI is not going away. IDEs with copilots, agents, MCP servers, retrieval layers, and doc-aware tooling are not going away. The question is not whether we use them. The question is whether we build workflows around them that hold up under technical scrutiny.
The Real Problem Is Not “AI Slop.” It Is Drift.
“AI slop” is a catchy phrase, but I do not think it is precise enough.
The real problem is drift.
A model starts with a roughly correct understanding. Then it fills in one missing detail from training data and statistics. Then another. It picks a term that used to be right. It explains a pattern that technically works but is not the right Quarkus way to do it. It mixes old and new vocabulary. It invents a connection between two things that sound related. None of these mistakes are dramatic on their own. But together they produce content that feels sharp and complete while quietly losing contact with the source material.
That is the dangerous part.
And this is exactly why “the code compiles” is not enough. I learned that one the hard way. A generated example can compile and still teach the wrong habit. It can compile and still overcomplicate the solution. It can compile and still present a pattern that no experienced Quarkus engineer would recommend. Technical correctness is more than syntactic success.
There is also a second kind of drift that gets less attention. Tone drift.
When I rely too much on model-first drafting, the writing starts to flatten. Every sentence gets punchy. Every paragraph sounds polished in the same way. The article reads like it was assembled by a machine trained on five thousand “developer content” headlines and then sprayed with confidence. Even when the facts are right, that tone damages trust. Readers can feel it.
So when I say I want to keep AI grounded, I mean both things. I want the facts grounded in current sources and runnable reality. And I want the writing grounded in a voice that still sounds like me.
What Grounding Means for Me
Grounding, for me, is simple in principle.
The model does not get to answer from vibes.
I do not trust a general-purpose model to “just know” Quarkus. Not for version-sensitive details. Not for renamed extensions. Not for testing changes. Not for migration nuances. Not for what is technically possible versus what is idiomatic. That is where drift shows up first.
So I try to force the workflow away from memory and back toward sources.
The first layer is current documentation. If I write about Quarkus, I want the model to work from current guides, migration notes, release material, and actual code. Not from stale training memory. That sounds obvious, but it changes the whole character of the output. The model stops behaving like an oracle and starts behaving more like an assistant reading over your shoulder.
The second layer is targeted retrieval. I do not want broad prompts like “tell me about Quarkus testing.” I want narrower, version-aware context. Show me the current guide. Show me what changed. Show me the names that are valid now. Show me the artifact or config that matches the current platform line. Broad prompts invite generic answers. Narrow prompts force contact with specifics.
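To make the contrast concrete, here is a minimal sketch of what I mean by a narrow, version-aware retrieval request. Everything in it is illustrative: the class name, the field names, and the example topic are mine, not part of any real tool. The point is only the shape of the request: the release line is pinned, and the sources the answer must come from are listed explicitly.

```java
import java.util.List;

// Illustrative only: a narrow, version-aware retrieval request
// instead of a broad "tell me about X" prompt.
public class DocQuery {

    final String topic;                  // e.g. "continuous testing"
    final String platformVersion;        // pin the release line explicitly
    final List<String> requiredSources;  // guides the answer must be grounded in

    DocQuery(String topic, String platformVersion, List<String> requiredSources) {
        this.topic = topic;
        this.platformVersion = platformVersion;
        this.requiredSources = requiredSources;
    }

    // Render the query so the assistant must answer from the listed
    // sources, not from stale training memory.
    String render() {
        return "Using ONLY these sources for Quarkus " + platformVersion + ": "
                + String.join(", ", requiredSources)
                + " -- explain: " + topic
                + ". Flag anything the sources do not cover.";
    }
}
```

A broad prompt leaves the model free to answer from memory. A request shaped like this gives it something concrete to fail against, which is exactly what I want.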
The third layer is contradiction hunting. This is one of the least glamorous parts of the process, but it matters a lot. I look for stale tokens. Old names. Old guide references. Old vocabulary. Old explanations that used to be true in one release line and are not true anymore. This is where a lot of plausible nonsense hides. Not in wild hallucinations. In leftovers.
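The leftover hunt can be partially mechanized. Below is a small sketch of the idea: keep a map of renamed things, built from your own migration notes, and scan a draft for the old names. The map entries in the test are placeholders; the real value of this comes entirely from how carefully the rename list is maintained.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative sketch of a "leftover hunt": scan a draft for names that
// were valid in an older release line. The rename map is a placeholder --
// fill it from your own migration notes, not from this example.
public class StaleTokenScanner {

    // Returns every stale token found in the draft, mapped to its
    // current replacement, in the order the renames were registered.
    static Map<String, String> findStale(String draft, Map<String, String> renames) {
        Map<String, String> hits = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : renames.entrySet()) {
            if (draft.contains(e.getKey())) {
                hits.put(e.getKey(), e.getValue()); // old name -> current name
            }
        }
        return hits;
    }
}
```

A scanner like this does not replace reading the current docs. It just makes one class of plausible nonsense, the leftovers, cheap to catch before a human reviewer has to.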
The fourth layer is runnable code. I want code that builds, starts, and behaves the way the article says it behaves. I want the failure path to be real. I want the endpoint response to be real. I want the config to do something visible. If I make a claim, I want some proof behind it. That does not mean every article becomes a giant test suite. But it does mean that “looks right” is not enough.
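What "the endpoint response must be real" looks like in practice: start something, hit it, and assert the observed response against the claim in the article. The sketch below uses the JDK's built-in `HttpServer` as a stand-in for the running application, so it is self-contained; in the real workflow this would be the actual Quarkus application or test, not a toy server.

```java
import com.sun.net.httpserver.HttpServer;
import java.net.InetSocketAddress;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

// Stand-in for the real check: before an article claims
// "GET /hello returns X", actually observe the response.
public class ClaimCheck {

    public static String fetchHello() throws Exception {
        // A JDK HttpServer stands in for the running application here.
        HttpServer server = HttpServer.create(new InetSocketAddress(0), 0);
        server.createContext("/hello", exchange -> {
            byte[] body = "hello".getBytes(StandardCharsets.UTF_8);
            exchange.sendResponseHeaders(200, body.length);
            exchange.getResponseBody().write(body);
            exchange.close();
        });
        server.start();
        try {
            int port = server.getAddress().getPort();
            HttpResponse<String> resp = HttpClient.newHttpClient().send(
                    HttpRequest.newBuilder(
                            URI.create("http://localhost:" + port + "/hello")).build(),
                    HttpResponse.BodyHandlers.ofString());
            // The article's claim must match the observed response, not a guess.
            if (resp.statusCode() != 200) {
                throw new IllegalStateException("unexpected status: " + resp.statusCode());
            }
            return resp.body();
        } finally {
            server.stop(0);
        }
    }
}
```

The discipline, not the tooling, is the point: if the article says the endpoint returns something, that exact string came out of a socket at some point before publishing.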
The fifth layer is human judgment. I still use AI heavily. I am not moving backward on that. But there is a big difference between using AI to accelerate exploration and letting AI define technical truth. The model can help me think faster, compare options, rewrite, structure, and pressure-test. It should not be the final source of authority on framework behavior.
That distinction matters more and more.
How I Actually Use AI in the Workflow
My workflow is not “generate article, publish article.”
That would be irresponsible, and it would also not produce the kind of content I want to put my name on.
I use AI at several stages, but not for the same job in every stage.
I use it to help me explore a topic faster. I use it to challenge assumptions. I use it to find likely weak points. I use it to shape structure when the material is messy. I use it to rewrite drafts into something that has a cleaner arc. I use it to pressure-test whether an explanation makes sense to someone who was not already in my head.
But the closer I get to the final text, the less I want “free generation” and the more I want constrained generation. I want source-linked docs. I want current framework material. I want rules for tone and structure. I want code I can verify. I want a process that reduces randomness.
That is the part many AI debates skip over. The tool is not one thing. “Using AI” can mean lazy autopilot. It can also mean a carefully constrained system where the model is only one part of a larger workflow. Those are not remotely the same.
And this is where grounding tools start to matter.
The Tooling Part: Why Context Matters More Than Cleverness
One of the biggest mistakes with AI IDEs is expecting the model to carry too much of the truth inside itself.
That works for generic coding tasks. It breaks down fast for active frameworks, product lines, release-specific guidance, and fast-moving ecosystems. Quarkus changes. Tooling changes. Names change. Recommended approaches change. A model that only answers from memory will always lag behind that reality.
So I use context injection and documentation retrieval wherever I can.
That includes working with documentation-oriented tooling that can pull current, source-linked material into the prompt instead of leaving the model alone with its own memory. It also includes using MCP-based doc access so the assistant can retrieve the right project material at the moment I ask the question. This is not glamorous, but it is a huge part of what makes the output less fragile.
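The shape of that context injection can be sketched in a few lines. Everything here is hypothetical: whatever actually fetches the material (an MCP doc server, a retrieval layer) sits outside this sketch. The only idea it shows is that the model never sees the question alone; it sees the question together with source-linked excerpts and an instruction to stay inside them.

```java
import java.util.List;

// Hypothetical shape of "context injection": the question is always
// packaged with source-linked excerpts, never sent on its own.
public class GroundedPrompt {

    record DocExcerpt(String sourceUrl, String text) {}

    static String assemble(String question, List<DocExcerpt> excerpts) {
        StringBuilder sb = new StringBuilder(
                "Answer strictly from the excerpts below. "
                + "Cite the source URL for every claim; say 'not covered' otherwise.\n\n");
        for (DocExcerpt e : excerpts) {
            sb.append("[source: ").append(e.sourceUrl).append("]\n")
              .append(e.text).append("\n\n");
        }
        return sb.append("Question: ").append(question).toString();
    }
}
```

The retrieval quality still decides everything. But even this crude framing changes the failure mode: instead of inventing an answer, the assistant has a sanctioned way to say the sources do not cover it.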
I also think there is a broader lesson here for framework teams. If we want AI tools to produce better outcomes, we need better ways to expose authoritative project knowledge to them. Not by replacing documentation, but by making documentation easier for these systems to consume correctly. Good docs are still the foundation. Better retrieval just gives them a stronger path into the workflow.
The Skills Layer I Use
Grounding is one part of the story. The other part is skills.
In my setup, skills are curated playbooks. They are not code, and they are not some magic hidden training layer. They are explicit instructions for how certain kinds of work should be done. They help reduce one of the biggest practical problems in AI-assisted writing and coding: inconsistency.
Without that layer, every draft starts from scratch in the worst possible way. One day the model writes a clean technical walkthrough. The next day it over-explains. Another day it changes tone halfway through. Another day it forgets the article structure, skips verification, or slips into that too-polished “developer content” voice that nobody really trusts.
Skills give me a way to tighten that up.
For writing, I mainly rely on three kinds of guardrails. One defines the structure of a proper Main Thread tutorial or article. One keeps the voice closer to how I actually speak and write. One acts as a stricter review pass that checks whether the result is technically solid, teachable, and ready to ship.
That combination helps a lot. Structure keeps the article stable. Voice keeps it human. Review keeps it honest.
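Mechanically, that stack of guardrails is simple. Here is an illustrative sketch of skills as ordered playbooks that get prepended to every drafting request; the class and the skill texts are placeholders for the real structure, voice, and review documents, not an actual tool's API.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative only: skills as explicit, ordered playbooks that are
// prepended to every drafting request. The skill texts are placeholders.
public class SkillStack {

    // Insertion order is preserved so guardrails apply in a stable order.
    final Map<String, String> skills = new LinkedHashMap<>();

    SkillStack add(String name, String instructions) {
        skills.put(name, instructions);
        return this;
    }

    // Every draft starts from the same guardrails instead of from scratch.
    String systemPrompt() {
        StringBuilder sb = new StringBuilder();
        skills.forEach((name, text) ->
                sb.append("## Skill: ").append(name).append("\n")
                  .append(text).append("\n"));
        return sb.toString();
    }
}
```

The value is not in the code; it is in the consistency. The same structure, voice, and review instructions ride along with every request, which is what keeps one day's draft from sounding like a different publication than the last.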
Beyond that, I am also testing work from the Quarkus project around technical guardrails expressed as skills. I think that is one of the more promising directions in this space. Not because skills replace expertise. They do not. But because they can encode project-specific expectations in a form that an assistant can actually follow. That means fewer random detours, fewer invented patterns, and a better chance that the output reflects how the framework really wants to be used.
That part is still evolving, and I am learning along the way. But I like the direction very much. It moves the workflow away from “trust the model” and closer to “constrain the model with the project’s own rules.”
And that is exactly where I want to be.
What Changed for Me After 400+ Posts
Publishing at a high cadence taught me a lot. Some of it was good. Some of it was painful.
The good part is obvious. I learned faster. I explored more topics. I found patterns in what readers care about. I got better at turning technical material into readable stories. I also got a very practical education in what these tools are good at and where they fail.
And this is where The Main Thread ended up meaning more to me than “just” a publication. It became a working lab. A place where I could test not only what AI can produce, but how AI changes the way I research, structure, verify, explain, and ship technical content. That is also why I am speaking about this process publicly in my JCON Europe 2026 session, “Chasing the Main Thread - Adventures in AI Assisted Coding.”
The painful part is also clear now. Speed hides weaknesses until it does not. A workflow can feel productive for months and still contain a flaw that only becomes fully visible when trust is on the line. In my case, that flaw was not “too much AI” in some abstract moral sense. It was not enough hard grounding around the parts that matter most: current facts, framework idioms, and final accountability.
That is why I do not think the answer is to stop using AI. I think the answer is to stop pretending that generation alone is a workflow.
Generation is one step. Grounding, retrieval, contradiction checks, runnable code, editorial constraint, and final accountability are the workflow.
That is the difference.
What I Am Trying to Do Now
I still believe these tools matter. I still believe learning them aggressively is the right move. I still think the future of technical work involves more agentic tooling, more IDE assistance, more retrieval, and more model-driven exploration.
But I also think there is a responsibility that comes with publishing technical content in public, especially around a project like Quarkus where people use articles as a shortcut to understanding.
I do not want to create review debt for busy engineers. I do not want to publish things that sound official just because they circulate widely. I do not want to produce content that looks polished while eroding trust underneath. And I definitely do not want to feed wrong explanations back into the broader machine that will repeat them again later.
So the goal now is not just more output. The goal is better constraints.
Better sources. Better retrieval. Better guardrails. Better code verification. Better editorial discipline. Better use of AI where it helps, and less trust where it does not deserve trust.
Conclusion
I took down a Quarkus article because a few answers were wrong. That was the immediate reason. The deeper reason is that it exposed something more useful: if I want AI-infused IDEs and writing workflows to be worth anything, they need tighter contact with reality than “looks plausible.” For me, that means current docs, targeted retrieval, contradiction checks, runnable code, explicit skills, and a workflow where the model helps, but does not get to decide what is true.
That is the version of this experiment I want to keep doing: less faith in generation, more discipline around grounding, and a clearer understanding that The Main Thread is both a publication and a laboratory. Thank you for joining me on this experiment. And thank you for your feedback.


