Your AI Agent Needs Live Cluster State: Build a Kubernetes MCP Server in Java
A hands-on Quarkus tutorial using Fabric8 informers, isWatching(), and Minikube to stop agents from reasoning over stale Kubernetes data.
There is a category of failure that human operators catch instinctively and AI agents do not: stale data.
When an SRE opens a terminal and sees a deployment stuck in Pending, they reach for kubectl get pods before making any decisions. They check the live state because they know their mental model might already be outdated. An AI agent querying a Kubernetes MCP server has no such instinct. If the informer backing your tool has silently lost its watch connection, the agent can reason confidently over a cache that stopped updating minutes ago and recommend actions that make things worse.
This tutorial is about preventing exactly that.
We are going to build a production-aware Kubernetes MCP server using Quarkus and the Fabric8 Kubernetes Client. The centerpiece is Informer.isWatching(), which Fabric8 exposed to reflect whether the informer is actively watching live cluster state, not merely still “running.” That sounds small, but it is load-bearing for an MCP server that exposes Kubernetes data to an AI agent. Fabric8’s 7.4 release notes explicitly call out the change as “Allow Informer.isWatching to see underlying Watch state.” Quarkus MCP Server also gives us a natural fit for this kind of tooling with CDI-based tool definitions, multiple transports, and a Dev UI for local testing.
By the end, you will have a working Quarkus MCP server that backs its tools with informer caches, refuses to answer from stale state, exposes health checks for Kubernetes, and runs locally on Minikube with Podman.
Why informers, not per-request API calls
The naive way to expose Kubernetes state over MCP is to call the API server on every tool invocation.
@Tool(description = "List running pods in a namespace")
ToolResponse listPods(@ToolArg(description = "The namespace to query") String namespace) {
List<Pod> pods = client.pods().inNamespace(namespace).list().getItems();
// format and return
}This works in a demo. In production, it fails in two ways.
First, it is expensive. Every tool call becomes an API server round-trip, and agents call tools repeatedly in a reasoning chain. A single user question can turn into many reads. That cost adds up quickly.
Second, it makes your MCP server a direct load source on the API server. Cluster operators do not want an AI assistant hammering the control plane because it asks the same question five times with slightly different phrasing.
The informer pattern solves both problems. An informer maintains a local cache backed by a watch connection to the API server. Reads are served from memory. That makes them cheap and fast. The API server sees one watch connection instead of a burst of repeated list calls.
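To make the pattern concrete, here is a self-contained toy model of what an informer does. This is not the Fabric8 API (the class and method names here are invented for illustration): watch events mutate an in-memory map, and reads never leave the process.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Toy model of the informer pattern: watch events update a local cache,
// reads are served from memory. Names are illustrative, not Fabric8's.
class ToyInformer<T> {
    private final Map<String, T> cache = new ConcurrentHashMap<>();

    // Called by the (simulated) watch stream.
    void onAddOrUpdate(String key, T obj) { cache.put(key, obj); }
    void onDelete(String key)             { cache.remove(key); }

    // Tool reads hit the cache, never the API server.
    List<T> list() { return List.copyOf(cache.values()); }
}

public class ToyInformerDemo {
    public static void main(String[] args) {
        ToyInformer<String> pods = new ToyInformer<>();
        pods.onAddOrUpdate("default/api-1", "Running");
        pods.onAddOrUpdate("default/api-2", "Pending");
        pods.onDelete("default/api-2");
        System.out.println(pods.list().size()); // one pod left in the cache
    }
}
```

The real SharedIndexInformer layers indexing, periodic resync, and reconnect logic on top of this idea, but the read path is the same: a lookup in local memory.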
The trade-off is that you now have to reason about informer lifecycle and cache freshness.
That is where isWatching() matters.
The isRunning() versus isWatching() distinction
For a long time, SharedIndexInformer users mostly looked at isRunning() and hasSynced().
Those are useful, but they do not answer the one question that matters most for an operational MCP server: is the informer actively watching live cluster state right now?
An informer can be in a state where:
- isRunning() is true
- hasSynced() is true
- isWatching() is false
That is the dangerous state.
It means the informer thread exists. It means the cache did complete an initial sync at some point. But it also means the live watch is currently down, so the cache is frozen until the watch reconnect succeeds.
For a human operator, this is annoying but survivable. They notice weirdness, refresh, and investigate.
For an AI agent, this is a correctness problem. The agent sees structured data and assumes it reflects the current state of the cluster.
So our design principle is simple: if the informer is not actively watching, the tool must refuse to answer from cache.
Not warn. Not answer with a footnote. Refuse.
What we are building
We will build a Quarkus application with three core pieces.
The first is an informer registry. This is an @ApplicationScoped CDI bean that owns the Fabric8 client and informer lifecycle. It starts the pod and deployment informers once and exposes them to the rest of the application.
The second is a health check. This reports isRunning(), hasSynced(), and isWatching() for each informer so that both humans and Kubernetes can see whether the cache is actually trustworthy.
The third is a set of MCP tools. These tools query informer-backed caches, not the API server. Before they return anything, they verify that the informer has synced and is still watching. If not, they return ToolResponse.error() so the model knows it should retry rather than reason over stale state.
We will also package and deploy the application on Minikube so the whole setup runs end to end inside a local Kubernetes cluster.
Prerequisites
You need Java 21, Maven 3.9 or newer, kubectl, and a local Kubernetes cluster. We will use Minikube because it is still one of the easiest ways to run a local Kubernetes environment, and the current docs list Podman as a supported driver. Deployment uses the Quarkus Kubernetes extension: it generates manifests from configuration and builds the container image with Jib, so you do not need a Dockerfile or an external registry.
You should also have a basic understanding of CDI and Quarkus dev mode. We are not starting from zero on Quarkus here.
Project setup
Create a new Quarkus project or start directly from my GitHub project:
mvn io.quarkus:quarkus-maven-plugin:create \
-DprojectGroupId=com.mainthread \
-DprojectArtifactId=kubernetes-mcp-server \
-Dextensions='quarkus-smallrye-health,io.quarkiverse.mcp:quarkus-mcp-server-http:1.10.0,quarkus-kubernetes-client,quarkus-kubernetes,quarkus-container-image-jib,quarkus-minikube' \
-DnoCode
cd kubernetes-mcp-server

The Quarkus Kubernetes Client extension (quarkus-kubernetes-client) provides an injectable Fabric8 KubernetesClient bean and handles kubeconfig and in-cluster configuration for you.
quarkus-mcp-server-http gives us an MCP server over HTTP. The default transport is Streamable HTTP (at /mcp); the legacy SSE endpoint remains at /mcp/sse but is deprecated. The Kubernetes Client extension gives us Fabric8, including the informer APIs. quarkus-smallrye-health exposes the health endpoints we will later wire into Kubernetes probes. The Quarkus Kubernetes extension generates Deployment, Service, and RBAC manifests from configuration; the Minikube extension tailors them for local Minikube (e.g. imagePullPolicy: IfNotPresent). Jib builds the container image without a Dockerfile. See the Quarkus Kubernetes guide for details.
Now create the package structure we need:
mkdir -p src/main/java/com/mainthread/k8s

Building the informer registry
Create src/main/java/com/mainthread/k8s/ClusterInformerRegistry.java:
package com.mainthread.k8s;
import java.util.concurrent.TimeUnit;
import org.eclipse.microprofile.config.inject.ConfigProperty;
import org.jboss.logging.Logger;
import io.fabric8.kubernetes.api.model.Pod;
import io.fabric8.kubernetes.api.model.apps.Deployment;
import io.fabric8.kubernetes.client.KubernetesClient;
import io.fabric8.kubernetes.client.informers.SharedIndexInformer;
import io.fabric8.kubernetes.client.informers.SharedInformerFactory;
import jakarta.annotation.PostConstruct;
import jakarta.annotation.PreDestroy;
import jakarta.enterprise.context.ApplicationScoped;
import jakarta.inject.Inject;
@ApplicationScoped
public class ClusterInformerRegistry {
private static final Logger log = Logger.getLogger(ClusterInformerRegistry.class);
@Inject
KubernetesClient client;
@ConfigProperty(name = "quarkus.kubernetes.namespace", defaultValue = "mcp-demo")
String watchNamespace;
private SharedInformerFactory factory;
private SharedIndexInformer<Pod> podInformer;
private SharedIndexInformer<Deployment> deploymentInformer;
@PostConstruct
void start() {
factory = client.informers();
// Namespace-scoped informers so a namespaced Role (list/watch in mcp-demo) is
// enough.
// inNamespace() is deprecated in Fabric8 7.x but remains the supported way to
// get namespaced informers in this API.
long resyncMs = TimeUnit.MINUTES.toMillis(5);
@SuppressWarnings("deprecation")
var namespacedFactory = factory.inNamespace(watchNamespace);
podInformer = namespacedFactory.sharedIndexInformerFor(Pod.class, resyncMs);
deploymentInformer = namespacedFactory.sharedIndexInformerFor(Deployment.class, resyncMs);
factory.startAllRegisteredInformers();
log.infof("Started pod and deployment informers for namespace %s", watchNamespace);
}
@PreDestroy
void stop() {
if (factory != null) {
factory.stopAllRegisteredInformers();
}
}
public SharedIndexInformer<Pod> pods() {
return podInformer;
}
public SharedIndexInformer<Deployment> deployments() {
return deploymentInformer;
}
public boolean allInformersWatching() {
return podInformer != null
&& deploymentInformer != null
&& podInformer.isWatching()
&& deploymentInformer.isWatching();
}
public boolean allInformersSynced() {
return podInformer != null
&& deploymentInformer != null
&& podInformer.hasSynced()
&& deploymentInformer.hasSynced();
}
}

This bean owns the client and all informer lifecycle. That is an important design choice. It keeps the tool code thin and makes lifecycle easy to reason about. The tools should not know how the informers were created. They should only ask for data and freshness state.
There are three details worth calling out here.
First, we use informer caches for pods and deployments because those are the resources most agents start with when they inspect workload health. If you extend this later for services, events, replica sets, or nodes, you add them here.
Second, we scope the informers to a single namespace via factory.inNamespace(watchNamespace). That way the deployment can use a namespaced Role (list/watch pods and deployments in mcp-demo) instead of a cluster-wide ClusterRole. The API server will reject cluster-scoped list/watch with “pods is forbidden … at the cluster scope” if the service account only has a namespaced Role.
Third, we start the informers once in @PostConstruct and stop them in @PreDestroy. That is the correct shape for a CDI-managed long-lived cache.
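If you also want a log line the moment a watch drops, a small transition monitor works alongside the readiness check we add next. The sketch below is self-contained; the class name and wiring are my own invention, and in the registry you would hand it something like podInformer::isWatching:

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.BooleanSupplier;

// Sketch: poll an informer's watch flag and report state transitions.
// BooleanSupplier stands in for podInformer::isWatching; nothing here is
// Fabric8 or Quarkus API.
public class WatchStateMonitor {
    private final BooleanSupplier isWatching;
    private boolean lastSeen = true;

    public WatchStateMonitor(BooleanSupplier isWatching) { this.isWatching = isWatching; }

    // Returns a message when the state flips, null otherwise.
    public String poll() {
        boolean now = isWatching.getAsBoolean();
        if (now == lastSeen) return null;
        lastSeen = now;
        return now ? "watch reconnected" : "watch lost: cache is frozen until reconnect";
    }

    // Background polling, e.g. started from @PostConstruct.
    public ScheduledExecutorService start() {
        ScheduledExecutorService ses = Executors.newSingleThreadScheduledExecutor();
        ses.scheduleAtFixedRate(() -> {
            String msg = poll();
            if (msg != null) System.out.println(msg);
        }, 5, 5, TimeUnit.SECONDS);
        return ses;
    }

    public static void main(String[] args) {
        boolean[] state = { true };
        WatchStateMonitor m = new WatchStateMonitor(() -> state[0]);
        state[0] = false;
        System.out.println(m.poll());
        state[0] = true;
        System.out.println(m.poll());
    }
}
```

This does not replace the health check; it just gives you a timestamped log entry for when the watch went down, which is useful when correlating incidents later.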
Exposing informer health
Create src/main/java/com/mainthread/k8s/InformerHealthCheck.java:
package com.mainthread.k8s;
import org.eclipse.microprofile.health.HealthCheck;
import org.eclipse.microprofile.health.HealthCheckResponse;
import org.eclipse.microprofile.health.Readiness;
import jakarta.enterprise.context.ApplicationScoped;
import jakarta.inject.Inject;
@Readiness
@ApplicationScoped
public class InformerHealthCheck implements HealthCheck {
@Inject
ClusterInformerRegistry registry;
@Override
public HealthCheckResponse call() {
boolean podsRunning = registry.pods().isRunning();
boolean podsWatching = registry.pods().isWatching();
boolean podsSynced = registry.pods().hasSynced();
boolean deploymentsRunning = registry.deployments().isRunning();
boolean deploymentsWatching = registry.deployments().isWatching();
boolean deploymentsSynced = registry.deployments().hasSynced();
boolean healthy = podsWatching && deploymentsWatching;
return HealthCheckResponse.named("kubernetes-informers")
.status(healthy)
.withData("pods.isRunning", podsRunning)
.withData("pods.isWatching", podsWatching)
.withData("pods.hasSynced", podsSynced)
.withData("deployments.isRunning", deploymentsRunning)
.withData("deployments.isWatching", deploymentsWatching)
.withData("deployments.hasSynced", deploymentsSynced)
.build();
}
}

The reason this check includes all three flags is simple. They tell different stories.
- isRunning() means the informer machinery is alive.
- hasSynced() means it successfully completed an initial sync at least once.
- isWatching() means the informer is currently attached to a live watch.
If you only surface one of them, you hide the exact failure mode we care about. The interesting and dangerous combination is hasSynced=true and isWatching=false. That means the cache contains plausible data, but it is no longer live.
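One way to keep this straight is to collapse the flags into the three states a tool actually has to act on. This is a self-contained sketch with invented names, not a Fabric8 API; isRunning() is deliberately absent because it does not change the decision:

```java
// Sketch: the informer flags collapse into three states a tool cares about.
// Enum and method names are mine, not part of any library.
public class InformerFreshness {
    enum State { NOT_SYNCED, STALE, LIVE }

    static State classify(boolean hasSynced, boolean isWatching) {
        if (!hasSynced) return State.NOT_SYNCED; // cache never filled: refuse
        if (!isWatching) return State.STALE;     // plausible but frozen data: refuse
        return State.LIVE;                       // safe to answer from cache
    }

    public static void main(String[] args) {
        System.out.println(classify(false, false)); // startup, before initial sync
        System.out.println(classify(true, false));  // the dangerous combination
        System.out.println(classify(true, true));   // normal operation
    }
}
```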
Writing the MCP tools
Create src/main/java/com/mainthread/k8s/KubernetesTools.java:
package com.mainthread.k8s;
import java.util.List;
import java.util.stream.Collectors;
import io.fabric8.kubernetes.api.model.Pod;
import io.fabric8.kubernetes.api.model.apps.Deployment;
import io.quarkiverse.mcp.server.Tool;
import io.quarkiverse.mcp.server.ToolArg;
import io.quarkiverse.mcp.server.ToolResponse;
import jakarta.enterprise.context.ApplicationScoped;
import jakarta.inject.Inject;
@ApplicationScoped
public class KubernetesTools {
@Inject
ClusterInformerRegistry registry;
@Tool(description = """
List all pods in a namespace. Returns pod names, phases, and the node assignment.
Use this when you need a quick workload view before drilling into one specific pod.
""")
public ToolResponse listPods(
@ToolArg(description = "The Kubernetes namespace to query") String namespace) {
if (!registry.pods().hasSynced()) {
return ToolResponse.error(
"The pod informer has not completed its initial sync yet. Retry in a few seconds.");
}
if (!registry.pods().isWatching()) {
return ToolResponse.error(
"The pod informer is not currently watching live cluster state. Retry in a few seconds instead of acting on cached data.");
}
List<Pod> pods = registry.pods().getStore().list().stream()
.filter(p -> namespace.equals(p.getMetadata().getNamespace()))
.toList();
if (pods.isEmpty()) {
return ToolResponse.success("No pods found in namespace '" + namespace + "'.");
}
String content = pods.stream()
.map(p -> String.format(
"%s phase=%s node=%s",
p.getMetadata().getName(),
p.getStatus() != null && p.getStatus().getPhase() != null ? p.getStatus().getPhase()
: "<unknown>",
p.getSpec() != null && p.getSpec().getNodeName() != null ? p.getSpec().getNodeName()
: "<unscheduled>"))
.collect(Collectors.joining("\n"));
return ToolResponse.success("Pods in namespace '" + namespace + "':\n" + content);
}
@Tool(description = """
List all deployments in a namespace with desired, ready, and available replica counts.
Use this for a quick namespace-level rollout view.
""")
public ToolResponse listDeployments(
@ToolArg(description = "The Kubernetes namespace to query") String namespace) {
if (!registry.deployments().hasSynced()) {
return ToolResponse.error(
"The deployment informer has not completed its initial sync yet. Retry in a few seconds.");
}
if (!registry.deployments().isWatching()) {
return ToolResponse.error(
"The deployment informer is not currently watching live cluster state. Retry shortly.");
}
List<Deployment> deployments = registry.deployments().getStore().list().stream()
.filter(d -> namespace.equals(d.getMetadata().getNamespace()))
.toList();
if (deployments.isEmpty()) {
return ToolResponse.success("No deployments found in namespace '" + namespace + "'.");
}
String content = deployments.stream()
.map(d -> String.format(
"%s desired=%d ready=%d available=%d",
d.getMetadata().getName(),
d.getSpec() != null && d.getSpec().getReplicas() != null ? d.getSpec().getReplicas() : 0,
d.getStatus() != null && d.getStatus().getReadyReplicas() != null
? d.getStatus().getReadyReplicas()
: 0,
d.getStatus() != null && d.getStatus().getAvailableReplicas() != null
? d.getStatus().getAvailableReplicas()
: 0))
.collect(Collectors.joining("\n"));
return ToolResponse.success("Deployments in namespace '" + namespace + "':\n" + content);
}
@Tool(description = """
Get the status of a specific deployment. Returns desired, ready, available, and updated replica counts.
Use this when you need to know whether a rollout has finished or is degraded.
""")
public ToolResponse getDeploymentStatus(
@ToolArg(description = "The Kubernetes namespace") String namespace,
@ToolArg(description = "The deployment name") String name) {
if (!registry.deployments().hasSynced()) {
return ToolResponse.error(
"The deployment informer has not completed its initial sync yet. Retry in a few seconds.");
}
if (!registry.deployments().isWatching()) {
return ToolResponse.error(
"The deployment informer is not currently watching live cluster state. Retry in a few seconds rather than acting on cached data.");
}
Deployment deployment = registry.deployments().getStore().list().stream()
.filter(d -> namespace.equals(d.getMetadata().getNamespace())
&& name.equals(d.getMetadata().getName()))
.findFirst()
.orElse(null);
if (deployment == null) {
return ToolResponse.error(
"Deployment '" + name + "' was not found in namespace '" + namespace + "'.");
}
int desired = deployment.getSpec() != null && deployment.getSpec().getReplicas() != null
? deployment.getSpec().getReplicas()
: 0;
int ready = deployment.getStatus() != null && deployment.getStatus().getReadyReplicas() != null
? deployment.getStatus().getReadyReplicas()
: 0;
int available = deployment.getStatus() != null && deployment.getStatus().getAvailableReplicas() != null
? deployment.getStatus().getAvailableReplicas()
: 0;
int updated = deployment.getStatus() != null && deployment.getStatus().getUpdatedReplicas() != null
? deployment.getStatus().getUpdatedReplicas()
: 0;
StringBuilder response = new StringBuilder();
response.append("Deployment: ").append(namespace).append("/").append(name).append("\n");
response.append("Desired replicas: ").append(desired).append("\n");
response.append("Ready replicas: ").append(ready).append("\n");
response.append("Available replicas: ").append(available).append("\n");
response.append("Updated replicas: ").append(updated);
if (deployment.getStatus() != null && deployment.getStatus().getConditions() != null) {
String conditions = deployment.getStatus().getConditions().stream()
.map(c -> String.format(
"%s status=%s reason=%s message=%s",
c.getType(),
c.getStatus(),
c.getReason() != null ? c.getReason() : "<none>",
c.getMessage() != null ? c.getMessage() : "<none>"))
.collect(Collectors.joining("\n"));
if (!conditions.isBlank()) {
response.append("\nConditions:\n").append(conditions);
}
}
return ToolResponse.success(response.toString());
}
}

The pattern here is the whole point of the tutorial.
We check hasSynced() first so we do not answer during initial startup before the cache is ready.
We check isWatching() second so we do not answer from stale cache when the watch is down.
Only then do we read from the store.
The server does not quietly return old data and hope the model interprets a warning correctly. It fails at the tool level in a way the model can understand and act on. That is the safer behavior.
The tool descriptions matter too. Agents choose tools based on those descriptions. If the description is vague, the model guesses. If the description clearly says what the tool returns and when to use it, the model has a better chance of choosing correctly.
Application configuration
Create src/main/resources/application.properties:
quarkus.http.cors.enabled=true
quarkus.mcp.server.server-info.name=kubernetes-mcp-server
quarkus.mcp.server.server-info.version=1.0.0
quarkus.smallrye-health.ui.always-include=true

This keeps the local developer experience simple. The server runs on port 8080 and the health UI is always visible (even in prod!).
Running locally in dev mode
Before we deploy to Minikube, run the application locally once:
./mvnw quarkus:dev

Open the health endpoint:
curl -s http://localhost:8080/q/health | jq

As we are not running in a cluster yet, you will see an unhealthy status for now. But the endpoint itself works:
{
"status": "DOWN",
"checks": [
{
"name": "kubernetes-informers",
"status": "DOWN",
"data": {
"pods.isRunning": true,
"pods.isWatching": false,
"pods.hasSynced": false,
"deployments.isRunning": true,
"deployments.isWatching": false,
"deployments.hasSynced": false
}
}
]
}
You can also use the Quarkus Dev UI to inspect the MCP server and invoke tools interactively while developing.
Deploying locally on Minikube
Now let us run the whole thing where it belongs: inside Kubernetes.
Start Minikube with Podman:
minikube start --driver=podman --container-runtime=cri-o

Minikube’s current docs list Podman as a supported driver, and its image tooling supports building directly into the cluster runtime with minikube image build, which is perfect for local Quarkus workflows.
Verify your context:
kubectl config current-context
kubectl get nodes

You should see the minikube context and one Ready node.
Create the namespace we will use for both the sample workload and the MCP server:
kubectl create namespace mcp-demo

Now create a simple workload for the agent to inspect:
kubectl -n mcp-demo create deployment api-gateway --image=docker.io/library/nginx:1.27
kubectl -n mcp-demo scale deployment api-gateway --replicas=3
kubectl -n mcp-demo get deployments,pods

Use the full image name `docker.io/library/nginx:1.27` (not just `nginx:1.27`). CRI-O often fails with `ImageInspectError` when the registry prefix is omitted. If you still see image errors, load the image into Minikube: `podman pull docker.io/library/nginx:1.27`, then `minikube image load docker.io/library/nginx:1.27`, and restart the deployment.
At this point, the cluster contains something real. That matters. The tools are much more interesting when they query actual state instead of an empty namespace.
Configuring the Kubernetes deployment
The Quarkus Kubernetes and Minikube extensions generate manifests from `application.properties`. Add the following so the generated Deployment, Service, and RBAC match what the MCP server needs:
# Kubernetes Client (Fabric8 via Quarkus extension) and deployment
quarkus.kubernetes-client.namespace=mcp-demo
quarkus.kubernetes.namespace=mcp-demo
quarkus.kubernetes.name=kubernetes-mcp-server
# docker.io/localhost/... so minikube image load finds it (minikube resolves to docker.io/... when loading)
quarkus.container-image.registry=docker.io
quarkus.container-image.group=localhost
quarkus.container-image.name=kubernetes-mcp-server
quarkus.container-image.tag=1.0.0
quarkus.kubernetes.image-pull-policy=IfNotPresent
# Service port 8080 so port-forward 8080:8080 works (default is 80)
quarkus.kubernetes.ports.http.host-port=8080
# RBAC: read-only access to pods and deployments in the namespace (for informers)
quarkus.kubernetes.rbac.roles.kubernetes-mcp-server-reader.policy-rules.0.api-groups=
quarkus.kubernetes.rbac.roles.kubernetes-mcp-server-reader.policy-rules.0.resources=pods
quarkus.kubernetes.rbac.roles.kubernetes-mcp-server-reader.policy-rules.0.verbs=get,list,watch
quarkus.kubernetes.rbac.roles.kubernetes-mcp-server-reader.policy-rules.1.api-groups=apps
quarkus.kubernetes.rbac.roles.kubernetes-mcp-server-reader.policy-rules.1.resources=deployments
quarkus.kubernetes.rbac.roles.kubernetes-mcp-server-reader.policy-rules.1.verbs=get,list,watch

Namespace, registry, and image name/tag control where and how the app is deployed. Using docker.io and group=localhost produces the image name docker.io/localhost/kubernetes-mcp-server:1.0.0, which is what Minikube expects when loading from the host daemon (it resolves names under docker.io/...), so no remote registry is needed. The RBAC role limits the service account to get, list, and watch on pods and deployments in that namespace: least privilege and read-only, exactly what the informers require. The Minikube extension sets the deployment target to Minikube and, together with image-pull-policy: IfNotPresent, ensures Kubernetes uses a locally built image instead of pulling from a registry. See the Quarkus Kubernetes guide for more options.
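For reference, those RBAC properties should yield a namespaced Role roughly like the following (the generated file may differ in labels and ordering, and Quarkus also generates a RoleBinding to the application's service account):

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: kubernetes-mcp-server-reader
  namespace: mcp-demo
rules:
  # Core API group ("") for pods
  - apiGroups: [""]
    resources: ["pods"]
    verbs: ["get", "list", "watch"]
  # apps group for deployments
  - apiGroups: ["apps"]
    resources: ["deployments"]
    verbs: ["get", "list", "watch"]
```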
Packaging and building the image
Build the Quarkus application and container image in one step. Jib builds the image without a Dockerfile; the Kubernetes extension generates the manifests into target/kubernetes/:
./mvnw clean package -Dquarkus.container-image.build=true

You will get target/kubernetes/minikube.yml and target/kubernetes/minikube.json (plus the vanilla kubernetes.yml / kubernetes.json). The image is built into your local container runtime (Docker or Podman). With registry=docker.io and group=localhost set above, the image name is docker.io/localhost/kubernetes-mcp-server:1.0.0, which Minikube can load from the host without a remote registry.
Loading the image into Minikube (Podman / CRI-O)
With Minikube on Podman and CRI-O, the cluster does not share the host’s image store. Load the image you just built so the cluster can use it. Minikube looks up the image on the host by the name in the manifest (docker.io/localhost/kubernetes-mcp-server:1.0.0). Podman often stores the image only as localhost/kubernetes-mcp-server:1.0.0; if podman images does not show the docker.io/localhost/... tag, add it so Minikube can find it:
podman tag localhost/kubernetes-mcp-server:1.0.0 docker.io/localhost/kubernetes-mcp-server:1.0.0
minikube image load docker.io/localhost/kubernetes-mcp-server:1.0.0

If the image already has the docker.io/localhost/... tag, the load step alone is enough.
On Minikube with the Docker driver you can instead point the build at Minikube’s daemon: eval $(minikube -p minikube docker-env) then run the same ./mvnw clean package -Dquarkus.container-image.build=true; no separate load step is needed.
Deploying to Minikube
Apply the generated Minikube manifests:
kubectl apply -f target/kubernetes/minikube.yml

Watch the rollout:
kubectl -n mcp-demo rollout status deployment/kubernetes-mcp-server
kubectl -n mcp-demo get pods
kubectl -n mcp-demo get svc

If the pod does not start, inspect the logs:
kubectl -n mcp-demo logs deployment/kubernetes-mcp-server

If Kubernetes cannot find the image, ensure you loaded it with minikube image load docker.io/localhost/kubernetes-mcp-server:1.0.0. If load reports “image was not found” or “image not known”, the host daemon needs that exact tag: run podman tag localhost/kubernetes-mcp-server:1.0.0 docker.io/localhost/kubernetes-mcp-server:1.0.0, then try the load again.
Accessing the service from your machine
Because the service is ClusterIP, the simplest access pattern is port-forwarding. The config sets quarkus.kubernetes.ports.http.host-port=8080 so the Service port matches the container port; then you can use 8080:8080:
kubectl -n mcp-demo port-forward svc/kubernetes-mcp-server 8080:8080

If your Service was generated with the default port 80, use 8080:80 instead (local port 8080 → Service port 80).
In another terminal, verify the health endpoint:
curl -s http://localhost:8080/q/health | jq

Once the informers are healthy, you should see the same check structure you saw earlier in dev mode, but now with status UP and every flag true.
Now your local machine can reach the in-cluster MCP server on http://localhost:8080.
Connecting an MCP client
Use the Streamable HTTP endpoint (the default; the legacy SSE endpoint at `/mcp/sse` is deprecated). Example MCP client configuration:
{
"mcpServers": {
"kubernetes": {
"url": "http://localhost:8080/mcp"
}
}
}(Some clients register the server under a name like user-kubernetes; use the same URL.)
Available tools
The server exposes three read-only tools. All require the informer cache to be synced and watching; otherwise they return an error and ask the client to retry.
| Tool | Arguments | Description |
| --- | --- | --- |
| listPods | namespace | List pods in a namespace (name, phase, node). |
| listDeployments | namespace | List deployments with desired/ready/available replica counts. |
| getDeploymentStatus | namespace, name | Full status of one deployment, including conditions. |
The informers are scoped to the namespace configured in the server (quarkus.kubernetes.namespace, e.g. mcp-demo). Tools only return data for that namespace.
Usage examples
Example 1: List pods in the demo namespace
User prompt:
List all pods in namespace mcp-demo.
The agent can call listPods with namespace: "mcp-demo". Example response:
Pods in namespace 'mcp-demo':
api-gateway-b47f4d764-cb5vp phase=Running node=minikube
api-gateway-b47f4d764-d5mws phase=Running node=minikube
api-gateway-b47f4d764-mh96b phase=Running node=minikube
kubernetes-mcp-server-694b6f8447-4x7ww phase=Running node=minikube

Example 2: List deployments
User prompt:
What deployments are running in mcp-demo?
The agent can call listDeployments with namespace: "mcp-demo". Example response:
Deployments in namespace 'mcp-demo':
api-gateway desired=3 ready=3 available=3
kubernetes-mcp-server desired=1 ready=1 available=1

Example 3: Check whether a deployment is healthy
User prompt:
Check whether the api-gateway deployment in namespace mcp-demo is healthy.
The agent can call getDeploymentStatus with namespace: "mcp-demo" and name: "api-gateway". If the deployment informer is healthy, it gets a direct answer. If the watch is down, it receives a tool-level error telling it to retry instead of reasoning from stale data. That is the behavior we wanted all along.
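On the client side, the natural way to consume that contract is a bounded retry. The sketch below is generic Java with invented names (ToolResult, callWithRetry), not an MCP client API; it just shows the shape of the handling:

```java
import java.util.Optional;
import java.util.function.Supplier;

// Sketch of client-side handling for the "informer not watching" error:
// retry a bounded number of times instead of reasoning over the failure.
public class RetryOnStale {
    record ToolResult(boolean isError, String text) {}

    static Optional<String> callWithRetry(Supplier<ToolResult> tool, int maxAttempts) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            ToolResult r = tool.get();
            if (!r.isError()) return Optional.of(r.text());
            // A real client would sleep with backoff here before retrying.
        }
        return Optional.empty(); // still stale: surface the failure, do not guess
    }

    public static void main(String[] args) {
        int[] calls = { 0 };
        // Simulated tool: stale twice, then the watch reconnects.
        Supplier<ToolResult> tool = () -> ++calls[0] < 3
                ? new ToolResult(true, "informer not watching, retry")
                : new ToolResult(false, "api-gateway desired=3 ready=3 available=3");
        System.out.println(callWithRetry(tool, 5).orElse("gave up"));
    }
}
```

If the retries are exhausted, the agent should report that live cluster state is unavailable, not fall back to whatever it remembers from an earlier call.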
What the agent sees
For a prompt like “Check whether the api-gateway deployment in namespace mcp-demo is healthy,” the agent calls getDeploymentStatus. A healthy (informer watching) answer looks like:
Deployment: mcp-demo/api-gateway
Desired replicas: 3
Ready replicas: 3
Available replicas: 3
Updated replicas: 3
Conditions:
Available status=True reason=MinimumReplicasAvailable message=Deployment has minimum availability.
Progressing status=True reason=NewReplicaSetAvailable message=ReplicaSet "api-gateway-..." has successfully progressed.

If the informer is not watching (e.g. the watch connection dropped), the tool returns an error instead of cached data:
The deployment informer is not currently watching live cluster state. Retry in a few seconds rather than acting on cached data.

That second response is not a failure of the design. It is the design working correctly: the server is telling the model that it cannot safely answer right now. In operational tooling, that is much better than giving a plausible but stale answer.
Why this matters in production
This tutorial uses a small local cluster, but the failure mode is the same in real environments.
The wrong design produces a tool that always answers, even when its cache is stale. That makes the server look reliable right up until the moment the model takes action based on information that stopped updating.
The better design is stricter. It forces a distinction between “I have data” and “I have current data.” That sounds subtle, but it is exactly the sort of distinction that keeps agent-assisted operations from drifting into unsafe automation.
The MCP server should not only expose state. It should also expose confidence in that state.
That is what isWatching() gives us.
Where to go next
This tutorial intentionally stays on the read side of the line.
The next logical step is write tools such as scaleDeployment, restartDeployment, or annotateDeployment. That is where the security and governance problems become much more interesting. Once a tool can mutate a cluster, you need confirmation patterns, stronger authentication, audit logs, and carefully scoped permissions.
That is also where Quarkus becomes even more useful, because you can layer OIDC, authorization, rate limiting, and structured logging directly into the same application.
For now, though, this is the right foundation: informer-backed reads, freshness-aware tools, health-driven operations, and a full local Minikube deployment that proves the pattern end to end.