Building a Real-Time AI Progress Tracker with Quarkus, LangChain4j, Ollama, and Granite 4
Turn your LLM pipeline into a transparent, step-by-step experience that shows retrieval, prompt creation, and model calls in real time. Powered by Quarkus, LangChain4j, and Granite 4 on Ollama.
I’ve always liked progress bars in software. They are simple, but they do something important. They show you the computer is working and tell you, “Don’t worry, I will be finished soon.” You can see that something is happening.
But with Large Language Models (LLMs), this is a problem I see a lot. You ask a question, and then... nothing. You just wait in silence. Then, suddenly, the full answer appears.
For business software, this is not good enough. Users want to know what the system is doing. They want to see the steps, like: “Okay, now I’m finding the documents,” then “Now I am filtering the information,” and “Now I am calling the LLM to get the final answer.”
So, in this post, I want to show you how to build exactly that. We will make a real-time progress tracker for an LLM application. We will use Quarkus with LangChain4j, and run a model on our own computer with Ollama.
The idea is very simple. For each step in our code, we will emit a small ProgressEvent. The backend streams these events live to the frontend, which uses them to populate a smooth progress bar for the user.
When we are finished, you will have a complete project you can run. Your LLM application will no longer be a “black box.” Instead, it will be a transparent system that shows your users exactly what it is doing.
Prerequisites
Java 21
Ollama installed and running with local models (if you don’t want the Dev Service to spin up a model container for you)
Example models: llama3.1:8b (chat) or granite4:tiny-h
Pull them once:
ollama pull granite4:tiny-h
Quarkus CLI or Maven
Bootstrap
quarkus create app com.example:ai-progress-tracker -x=rest-jackson,langchain4j-ollama,quarkus-langchain4j-easy-rag
cd ai-progress-tracker
Docs for the extensions we use:
quarkus-rest-jackson for JSON REST. Latest artifact is io.quarkus:quarkus-rest-jackson. (Quarkus)
quarkus-langchain4j-ollama for local LLM + embeddings. (Quarkus)
quarkus-langchain4j-easy-rag to ingest documents with one property (quarkus.langchain4j.easy-rag.path) and auto-wire a default retrieval augmentor. (Quarkiverse Docs)
And as usual, find the full project in my GitHub repository.
Configuration
We do need a little bit of configuration for Quarkus in src/main/resources/application.properties.
quarkus.langchain4j.easy-rag.path=easy-rag-catalog
quarkus.langchain4j.easy-rag.reuse-embeddings.enabled=true
quarkus.langchain4j.timeout=60s
# Ollama chat model (local)
quarkus.langchain4j.ollama.chat-model.model-id=granite4:tiny-h
quarkus.langchain4j.ollama.chat-model.temperature=0
quarkus.langchain4j.ollama.log-requests=true
Easy RAG will scan the easy-rag-catalog directory and auto-ingest on startup, wiring a default RetrievalAugmentor to an in-memory store.
Sample docs
Use the provided sample docs or add more to /easy-rag-catalog with a few lines of content your RAG step can retrieve.
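For example, a small plain-text file is enough for the retriever to find something. The file name and content below are made up for illustration:
easy-rag-catalog/service-overview.txt
The AI Progress Tracker answers questions over a small document catalog. It retrieves relevant snippets, builds a prompt, and calls a local Granite 4 model served by Ollama. Progress is streamed to the browser over Server-Sent Events.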
Progress model
A simple JSON object for streaming updates:
src/main/java/com/example/progress/ProgressEvent.java
package com.example.progress;
public class ProgressEvent {

    public String step;
    public int percent;
    public String detail; // optional small text payload

    public ProgressEvent() {
    }

    public ProgressEvent(String step, int percent, String detail) {
        this.step = step;
        this.percent = percent;
        this.detail = detail;
    }
}

The AI service and pipeline
Use LangChain4j’s declarative style to define our assistant:
src/main/java/com/example/ai/Assistant.java
package com.example.ai;
import dev.langchain4j.service.SystemMessage;
import dev.langchain4j.service.UserMessage;
import io.quarkiverse.langchain4j.RegisterAiService;
import jakarta.enterprise.context.ApplicationScoped;
@RegisterAiService
@ApplicationScoped
public interface Assistant {

    @SystemMessage("""
            You are a helpful assistant. Use retrieved context when available.
            Keep answers concise and cite short bullet points from context when relevant.""")
    String answer(@UserMessage String question);
}

With Easy RAG on the classpath and configured, a default Retrieval Augmentor is provided and wired to an in-memory store, so your service benefits from retrieval automatically.
Pipeline with Progress Callbacks
The core AI pipeline (AiPipeline) breaks down complex operations into discrete, trackable steps:
Step 1: Query Normalization (20%) - Clean and validate user input
Step 2: Document Retrieval (40%) - Fetch relevant context using RAG
Step 3: Prompt Construction (60%) - Build the final LLM prompt
Step 4: LLM Processing (80%) - Execute the language model call
Step 5: Response Composition (100%) - Format and return the answer
Each step triggers a progress callback with structured metadata about the current operation.
src/main/java/com/example/ai/AiPipeline.java
package com.example.ai;
import com.example.progress.ProgressEvent;
import jakarta.enterprise.context.ApplicationScoped;
import jakarta.inject.Inject;
@ApplicationScoped
public class AiPipeline {

    @Inject
    Assistant assistant;

    public interface ProgressCallback {
        void onProgress(ProgressEvent event);
    }

    public String processWithProgress(String question, ProgressCallback callback) {
        try {
            // Step 1: Normalize query
            callback.onProgress(new ProgressEvent("Normalize query", 20, "Query normalized"));
            String q = normalize(question);

            // Step 2: Retrieve documents
            callback.onProgress(new ProgressEvent("Retrieve documents", 40, "Documents retrieved"));
            String preview = retrievePreview(q);

            // Step 3: Build final prompt
            callback.onProgress(new ProgressEvent("Build final prompt", 60, "Prompt built"));
            String finalPrompt = buildPrompt(q, preview);

            // Step 4: Call LLM
            callback.onProgress(new ProgressEvent("Call LLM", 80, "LLM called"));
            String raw = callLlm(finalPrompt);
            String answer = compose(raw);

            // Step 5: Complete
            callback.onProgress(new ProgressEvent("Answer", 100, answer));
            return answer;
        } catch (Exception e) {
            callback.onProgress(new ProgressEvent("Error", 0, e.getMessage()));
            throw e;
        }
    }

    public String normalize(String q) {
        return q == null ? "" : q.trim().replaceAll("\\s+", " ");
    }

    public String retrievePreview(String q) {
        // Ask the model to surface only the top snippets (forces retrieval path)
        return assistant.answer("List 3 short bullets relevant to: " + q);
    }

    public String buildPrompt(String q, String preview) {
        return """
                Using the following context, answer clearly:
                Question: %s
                Context:
                %s
                """.formatted(q, preview);
    }

    public String callLlm(String finalPrompt) {
        return assistant.answer(finalPrompt);
    }

    public String compose(String raw) {
        // Could format markdown, add citations, etc.
        return raw;
    }
}

The functional callback interface decouples the business logic from the progress reporting. This pattern allows the same pipeline to be used in different contexts (web UI, batch processing, testing) with different progress reporting mechanisms.
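Because ProgressCallback is a functional interface, swapping the reporting mechanism is a one-liner. As a sketch (this class and its logger are illustrative, not part of the project above), a batch job or test could log events instead of streaming them:

package com.example.ai;

import org.jboss.logging.Logger;

public class BatchRunner {

    private static final Logger LOG = Logger.getLogger(BatchRunner.class);

    // Same pipeline, different callback: each ProgressEvent goes to the log
    public static String runQuietly(AiPipeline pipeline, String question) {
        return pipeline.processWithProgress(question,
                event -> LOG.infof("[%d%%] %s - %s", event.percent, event.step, event.detail));
    }
}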
SSE resource that streams progress and the final answer
The REST API (AskResource) uses Server-Sent Events to stream progress updates to the frontend:
Endpoint: GET /ai/ask/stream?q={question}
Content-Type: text/event-stream
Data Format: JSON-serialized ProgressEvent objects
Error Handling: Graceful fallback with manual JSON construction
src/main/java/com/example/api/AskResource.java
package com.example.api;
import com.example.ai.AiPipeline;
import com.example.progress.ProgressEvent;
import com.fasterxml.jackson.databind.ObjectMapper;
import jakarta.inject.Inject;
import jakarta.ws.rs.GET;
import jakarta.ws.rs.Path;
import jakarta.ws.rs.Produces;
import jakarta.ws.rs.QueryParam;
import jakarta.ws.rs.core.MediaType;
import jakarta.ws.rs.sse.Sse;
import jakarta.ws.rs.sse.SseEventSink;
@Path("/ai")
public class AskResource {

    @Inject
    AiPipeline pipeline;

    @Inject
    Sse sse;

    @Inject
    ObjectMapper objectMapper;

    @GET
    @Path("/ask/stream")
    @Produces(MediaType.SERVER_SENT_EVENTS)
    public void ask(@QueryParam("q") String question, SseEventSink eventSink) {
        try {
            pipeline.processWithProgress(question, event -> sendEvent(event, eventSink));
            eventSink.close();
        } catch (Exception e) {
            sendErrorEvent(e.getMessage(), eventSink);
            eventSink.close();
        }
    }

    private void sendEvent(ProgressEvent event, SseEventSink eventSink) {
        try {
            String jsonData = objectMapper.writeValueAsString(event);
            eventSink.send(sse.newEventBuilder()
                    .data(jsonData)
                    .mediaType(MediaType.APPLICATION_JSON_TYPE)
                    .build());
        } catch (Exception e) {
            // Fallback to manual JSON construction
            sendEventManually(event, eventSink);
        }
    }

    private void sendErrorEvent(String errorMessage, SseEventSink eventSink) {
        ProgressEvent errorEvent = new ProgressEvent("Error", 0, errorMessage);
        sendEvent(errorEvent, eventSink);
    }

    private void sendEventManually(ProgressEvent event, SseEventSink eventSink) {
        try {
            String jsonData = String.format("{\"step\":\"%s\",\"percent\":%d,\"detail\":\"%s\"}",
                    event.step, event.percent, event.detail);
            eventSink.send(sse.newEventBuilder()
                    .data(jsonData)
                    .mediaType(MediaType.APPLICATION_JSON_TYPE)
                    .build());
        } catch (Exception e) {
            // If even manual construction fails, close the sink
            eventSink.close();
        }
    }
}

Minimal UI
Let’s just create a quick and minimal UI to display the functionality. Find the CSS files in the project folder in my GitHub repository.
src/main/resources/META-INF/resources/index.html
<!doctype html>
<html data-theme="light">
<head>
  <meta charset="utf-8" />
  <meta name="viewport" content="width=device-width, initial-scale=1">
  <title>LLM Progress Tracker</title>
  <link rel="stylesheet" href="https://cdn.jsdelivr.net/npm/@picocss/pico@2/css/pico.min.css">
  <link rel="stylesheet" href="style.css">
</head>
<body>
  <div class="container">
    <header class="header">
      <h1>🤖 AI Progress Tracker</h1>
      <p>Ask questions and watch the AI work through each step in real-time</p>
    </header>
    <main>
      <div class="form-container">
        <form id="f">
          <div class="grid">
            <div>
              <input id="q" type="text" placeholder="e.g. What does this service do? How does it work?"
                required />
            </div>
            <div>
              <button type="submit" role="button">Ask Question</button>
            </div>
          </div>
        </form>
      </div>
      <div class="progress-container hidden" id="progressContainer">
        <div class="progress-header">
          <h3>Progress</h3>
          <span id="pct">0%</span>
        </div>
        <div class="progress-bar">
          <div class="progress-fill" id="fill"></div>
        </div>
      </div>
      <div class="steps-container hidden" id="stepsContainer">
        <h3>Steps</h3>
        <ul class="steps-list" id="steps"></ul>
      </div>
      <div class="answer-container hidden" id="answerContainer">
        <h3>Answer</h3>
        <div class="answer-content" id="answer"></div>
      </div>
    </main>
  </div>
  <script>
    const f = document.getElementById('f');
    const q = document.getElementById('q');
    const fill = document.getElementById('fill');
    const pct = document.getElementById('pct');
    const steps = document.getElementById('steps');
    const answer = document.getElementById('answer');
    const progressContainer = document.getElementById('progressContainer');
    const stepsContainer = document.getElementById('stepsContainer');
    const answerContainer = document.getElementById('answerContainer');

    function showElement(element) {
      element.classList.remove('hidden');
      element.classList.add('fade-in');
    }

    function hideElement(element) {
      element.classList.add('hidden');
      element.classList.remove('fade-in');
    }

    function resetUI() {
      steps.innerHTML = '';
      answer.textContent = '';
      fill.style.width = '0%';
      pct.textContent = '0%';
      hideElement(progressContainer);
      hideElement(stepsContainer);
      hideElement(answerContainer);
    }

    f.onsubmit = (e) => {
      e.preventDefault();
      resetUI();
      showElement(progressContainer);
      showElement(stepsContainer);
      const es = new EventSource(`/ai/ask/stream?q=${encodeURIComponent(q.value)}`);
      es.onmessage = (event) => {
        try {
          const data = JSON.parse(event.data);
          // Add step to list
          const stepItem = document.createElement('li');
          stepItem.innerHTML = `
            <strong>${data.step}</strong>
            <div class="step-detail">${data.detail}</div>
          `;
          steps.appendChild(stepItem);
          // Update progress
          fill.style.width = data.percent + '%';
          pct.textContent = data.percent + '%';
          // Show answer when complete
          if (data.step === 'Answer' && data.detail) {
            answer.textContent = data.detail;
            showElement(answerContainer);
            es.close();
          }
        } catch (err) {
          console.error('Error parsing JSON:', err, 'Data:', event.data);
          const errorItem = document.createElement('li');
          errorItem.innerHTML = `<div class="error-message">Error parsing response: ${err.message}</div>`;
          steps.appendChild(errorItem);
        }
      };
      es.onerror = () => {
        const errorItem = document.createElement('li');
        errorItem.innerHTML = `<div class="error-message">Connection error occurred</div>`;
        steps.appendChild(errorItem);
        es.close();
      };
    };
  </script>
</body>
</html>

Uses PicoCSS for a clean, modern baseline design without writing heavy custom CSS.
Content is wrapped in a centered .container, making it feel balanced and readable.
Light theme (<html data-theme="light">).
Run it
quarkus dev
# open http://localhost:8080
Try a question that hits your easy-rag-catalog so that retrieval is visible in the “Retrieve documents” step.
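You can also watch the raw event stream without the UI, for example with curl (the output lines below are illustrative):

curl -N "http://localhost:8080/ai/ask/stream?q=What%20does%20this%20service%20do"
# data: {"step":"Normalize query","percent":20,"detail":"Query normalized"}
# data: {"step":"Retrieve documents","percent":40,"detail":"Documents retrieved"}
# ...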
Production notes
Backpressure: SSE handles bursts well, but for high load consider WebSockets.
RAG scaling: Easy RAG uses an in-memory store by default. Switch to Postgres or Redis for persistence (see the config sketch after this list).
Weighted steps: Right now, each step contributes equally. Add a weight field if some steps take longer.
Security: Don’t leak internal prompts. Scrub detail before streaming to users.
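For the persistence note above, a minimal config sketch, assuming you add the quarkus-langchain4j-pgvector extension and a local Postgres instance (property names can vary between versions, so verify against the Quarkiverse docs):

# Sketch: replace the in-memory store with pgvector (assumes quarkus-langchain4j-pgvector)
quarkus.datasource.db-kind=postgresql
quarkus.datasource.username=rag
quarkus.datasource.password=rag
quarkus.datasource.jdbc.url=jdbc:postgresql://localhost:5432/rag
# Must match the dimension of your embedding model
quarkus.langchain4j.pgvector.dimension=384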
What this unlocks
You can reuse the callback approach for any multi-step workflow: CSV importers, video transcoders, CI log streams.
For more control, add a weight to each step and compute weighted percentages (a sketch follows below).
Add a jobId and a ProgressBus if you want multi-tab or multi-user subscriptions.
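Here is one way the weighted variant could look; the step names and weights below are illustrative, not part of the project above:

// Sketch: weighted progress; the LLM call dominates latency, so it gets the largest weight
import java.util.LinkedHashMap;
import java.util.Map;

public class WeightedProgress {

    private static final Map<String, Integer> WEIGHTS = new LinkedHashMap<>();
    static {
        WEIGHTS.put("Normalize query", 1);
        WEIGHTS.put("Retrieve documents", 3);
        WEIGHTS.put("Build final prompt", 1);
        WEIGHTS.put("Call LLM", 4);
        WEIGHTS.put("Compose answer", 1);
    }

    // Percent complete once every step up to and including `step` is done
    public static int percentAfter(String step) {
        int total = WEIGHTS.values().stream().mapToInt(Integer::intValue).sum();
        int done = 0;
        for (Map.Entry<String, Integer> e : WEIGHTS.entrySet()) {
            done += e.getValue();
            if (e.getKey().equals(step)) {
                break;
            }
        }
        return Math.round(100f * done / total);
    }
}

With these weights, finishing “Call LLM” reports 90% instead of the flat 80% above.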
Progress bars may seem like small UX details, but in AI-driven applications they serve a much deeper purpose. They transform opaque model calls into a transparent process, helping users build trust and giving developers a way to surface exactly what happens behind the scenes. With Quarkus, LangChain4j, Ollama, and Granite 4, you can build pipelines that aren’t just powerful, but explainable and auditable.
This tutorial showed you how to stream progress step by step, from retrieval to the final answer, with a clean UI and real-time updates. The same pattern applies to many other workflows: batch jobs, data pipelines, or long-running transactions. Once you start making progress visible, you’ll find users expect it everywhere. And in today’s enterprise landscape, visibility is as important as speed.