Quarkus Graceful Shutdown That Holds Up During Rolling Deploys
A practical shutdown recipe for teams that want fewer dropped requests, cleaner Kubernetes rollouts, and less guesswork around readiness and termination timing.
Rolling deploys are when I find out a service was never really ready to go away. The pod gets a SIGTERM, the load balancer still has the instance in rotation for another probe interval, and a payment handoff that was fine a second ago comes back as 503 Service Unavailable or just drops the connection. The app looked healthy in unit tests. Production was doing a choreographed shutdown and nobody told the JVM.
Quarkus has had graceful shutdown settings for a while. In Quarkus 3.32 the HTTP stack got a meaningful upgrade (PR #50975): during shutdown it tries to answer requests instead of spraying 503 at everything. That helps. It does not replace the deploy protocol you still need: fail readiness, stop new traffic, let in-flight HTTP finish, then exit.
We build OrderBridge, a tiny payment handoff API, then prove shutdown behavior with a script that keeps a long request in flight while the JVM receives SIGTERM. The sample uses Quarkus 3.35.2.
What we build
OrderBridge is a small service that:
exposes GET /orders/{id} for a quick status check;
exposes POST /orders/handoff that simulates a five-second payment gateway call;
exposes SmallRye Health liveness and readiness on /q/health/live and /q/health/ready;
logs startup, shutdown delay, and shutdown events;
ships two Quarkus profiles: naive (defaults) and graceful (timeout + delay);
includes a bash script that terminates the packaged app mid-handoff so you can see the difference.
What you need
I assume you have shipped Jakarta REST apps and have seen Kubernetes readiness probes in the wild. This is not a hello-world.
JDK 21
Quarkus CLI or Maven
curl and bash for the shutdown script
About 40 minutes
Project setup
Create the project:
quarkus create app com.orderbridge:orderbridge-graceful-shutdown \
    --extension='rest-jackson,smallrye-health' \
    --java=21 \
    --no-code
cd orderbridge-graceful-shutdown
Extensions:
rest-jackson: JSON endpoints and the HTTP stack that participates in graceful shutdown
smallrye-health: readiness and liveness probes
Use package com.orderbridge for application code.
Add the Maven wrapper from the Quarkus getting started guide if your tree does not already include ./mvnw, or clone the finished sample linked at the end.
Order status and payment handoff
Payment handoffs are the interesting case: they run for seconds while deploys happen for seconds. A fast status endpoint gives us something cheap to hit while the slow one is in flight.
DTO records
package com.orderbridge;

public record HandoffRequest(String orderId, long amountCents) {
}

package com.orderbridge;

public record HandoffResult(String orderId, String status, long elapsedMs) {
}

package com.orderbridge;

public record OrderStatus(String orderId, String status) {
}
OrderService
The handoff sleeps to mimic an external gateway. Production code should use timeouts, cancellation, and idempotency keys — not Thread.sleep. For teaching shutdown, sleep is honest about “long request in flight.”
package com.orderbridge;

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

import org.eclipse.microprofile.config.inject.ConfigProperty;
import org.jboss.logging.Logger;

import jakarta.enterprise.context.ApplicationScoped;

@ApplicationScoped
public class OrderService {

    private static final Logger LOG = Logger.getLogger(OrderService.class);

    private final Map<String, String> statuses = new ConcurrentHashMap<>();

    @ConfigProperty(name = "orderbridge.handoff.delay-ms", defaultValue = "5000")
    long handoffDelayMs;

    public OrderStatus status(String orderId) {
        String status = statuses.getOrDefault(orderId, "CREATED");
        return new OrderStatus(orderId, status);
    }

    public HandoffResult handoff(HandoffRequest request) {
        long started = System.currentTimeMillis();
        LOG.infof("Starting payment handoff for order %s", request.orderId());
        statuses.put(request.orderId(), "HANDOFF_IN_PROGRESS");
        try {
            Thread.sleep(handoffDelayMs);
        } catch (InterruptedException interrupted) {
            Thread.currentThread().interrupt();
            statuses.put(request.orderId(), "HANDOFF_INTERRUPTED");
            throw new IllegalStateException("Payment handoff interrupted during shutdown", interrupted);
        }
        statuses.put(request.orderId(), "HANDOFF_COMPLETE");
        long elapsedMs = System.currentTimeMillis() - started;
        LOG.infof("Payment handoff complete for order %s in %d ms", request.orderId(), elapsedMs);
        return new HandoffResult(request.orderId(), "HANDOFF_COMPLETE", elapsedMs);
    }
}
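The note above about preferring timeouts to Thread.sleep can be made concrete with a small JDK-only sketch. It is not part of the OrderBridge sample: BoundedHandoff, callWithTimeout, slowGateway, and the status strings HANDOFF_TIMED_OUT and HANDOFF_FAILED are hypothetical names chosen for illustration, and a real gateway client would normally bring its own timeout settings.

```java
import java.time.Duration;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.function.Supplier;

// Hypothetical helper, not part of the OrderBridge sample.
final class BoundedHandoff {

    // Runs the gateway call on another thread and waits at most `limit` for its result.
    static String callWithTimeout(Supplier<String> gatewayCall, Duration limit) {
        CompletableFuture<String> future = CompletableFuture.supplyAsync(gatewayCall);
        try {
            return future.get(limit.toMillis(), TimeUnit.MILLISECONDS);
        } catch (TimeoutException timedOut) {
            // Marks the future as cancelled; CompletableFuture does not interrupt the running task.
            future.cancel(true);
            return "HANDOFF_TIMED_OUT";
        } catch (InterruptedException interrupted) {
            Thread.currentThread().interrupt();
            return "HANDOFF_INTERRUPTED";
        } catch (ExecutionException failed) {
            return "HANDOFF_FAILED";
        }
    }

    // Stand-in for a remote gateway that takes `millis` to answer.
    static String slowGateway(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return "HANDOFF_COMPLETE";
    }

    public static void main(String[] args) {
        // Fast call finishes inside the limit; slow call is abandoned after 100 ms.
        System.out.println(callWithTimeout(() -> slowGateway(50), Duration.ofMillis(500)));
        System.out.println(callWithTimeout(() -> slowGateway(2000), Duration.ofMillis(100)));
    }
}
```

The point of the design is that the caller's wait is bounded even when the gateway is not, which is what makes a shutdown timeout predictable. Idempotency keys are still needed, because a timed-out call may have succeeded on the gateway side.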
OrderResource
package com.orderbridge;

import jakarta.ws.rs.Consumes;
import jakarta.ws.rs.GET;
import jakarta.ws.rs.POST;
import jakarta.ws.rs.Path;
import jakarta.ws.rs.PathParam;
import jakarta.ws.rs.Produces;
import jakarta.ws.rs.core.MediaType;

@Path("/orders")
@Produces(MediaType.APPLICATION_JSON)
@Consumes(MediaType.APPLICATION_JSON)
public class OrderResource {

    private final OrderService orderService;

    public OrderResource(OrderService orderService) {
        this.orderService = orderService;
    }

    @GET
    @Path("/{id}")
    public OrderStatus get(@PathParam("id") String orderId) {
        return orderService.status(orderId);
    }

    @POST
    @Path("/handoff")
    public HandoffResult handoff(HandoffRequest request) {
        return orderService.handoff(request);
    }
}
Base configuration
In src/main/resources/application.properties:
orderbridge.handoff.delay-ms=5000
%script.quarkus.http.port=18080
%script.quarkus.log.console.level=INFO
%graceful.quarkus.shutdown.timeout=15s
%graceful.quarkus.shutdown.delay-enabled=true
%graceful.quarkus.shutdown.delay=5s
%test.orderbridge.handoff.delay-ms=100
%test.quarkus.shutdown.timeout=30s
%test.quarkus.shutdown.delay-enabled=true
%test.quarkus.shutdown.delay=2s
The script profile pins port 18080 so our shutdown script does not fight dev mode on 8080. The graceful profile turns on the shutdown recipe we want in production. Tests use a 100 ms handoff so ./mvnw test stays snappy.
Run dev mode and try the endpoints:
./mvnw quarkus:dev
curl -s http://localhost:8080/orders/ORD-1
curl -s -X POST http://localhost:8080/orders/handoff \
    -H 'Content-Type: application/json' \
    -d '{"orderId":"ORD-1","amountCents":2500}'
The handoff blocks for about five seconds, then returns "status":"HANDOFF_COMPLETE".
Readiness vs liveness during shutdown
Liveness answers: is the process alive? If this fails, Kubernetes restarts the pod.
Readiness answers: should traffic be sent here? If this fails, the pod stays running but is removed from Service endpoints.
During a rolling update, you want readiness to flip DOWN while the instance can still finish work it already accepted. That is exactly what Quarkus shutdown delay is for. The lifecycle guide describes a delay window where HTTP still runs, but readiness reports down so orchestrators and load balancers stop sending new connections.
Check probes manually:
curl -s http://localhost:8080/q/health/live
curl -s http://localhost:8080/q/health/ready
While the app is running, both return "status":"UP".
See naive shutdown hurt the client
With no quarkus.shutdown.timeout and no delay, Quarkus does not wait for in-flight requests such as your handoff before it shuts down. The server log may still show “handoff complete,” while the client sees a dead connection.
The module ships scripts/demonstrate-shutdown.sh. It packages the app, starts the JVM, fires a long POST /orders/handoff, sends SIGTERM, and polls readiness.
Run the naive profile (script port only — no graceful settings):
./scripts/demonstrate-shutdown.sh naive
On my machine the handoff line looked like:
handoff HTTP 000 (total 0.521690s)
HTTP 000 means curl never got a response — connection gone. The log often still shows the handoff finishing a few seconds later, which is the frustrating split-brain: server thought it was fine, client already gave up.
That is the bug we are fixing: shutdown was never part of the contract we tested.
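A Java client sees the same failure as an exception rather than a status code. The sketch below is an illustration only: HandoffProbe and statusOrZero are hypothetical names, the script itself uses curl, and the mapping of "no response" to 0 simply mirrors what curl prints as 000. The URL in main assumes the script profile port 18080 from the configuration above.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;

// Hypothetical client-side probe, not part of the OrderBridge sample.
final class HandoffProbe {

    // Returns the HTTP status code, or 0 when no response arrived at all
    // (connection refused, dropped mid-request, or timed out).
    static int statusOrZero(URI uri, Duration timeout) {
        HttpClient client = HttpClient.newBuilder().connectTimeout(timeout).build();
        HttpRequest request = HttpRequest.newBuilder(uri)
                .timeout(timeout)
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString("{\"orderId\":\"ORD-1\",\"amountCents\":2500}"))
                .build();
        try {
            return client.send(request, HttpResponse.BodyHandlers.ofString()).statusCode();
        } catch (IOException noResponse) {
            // Covers ConnectException and HttpTimeoutException as well.
            return 0;
        } catch (InterruptedException interrupted) {
            Thread.currentThread().interrupt();
            return 0;
        }
    }

    public static void main(String[] args) {
        URI handoff = URI.create("http://localhost:18080/orders/handoff");
        System.out.println("handoff HTTP " + statusOrZero(handoff, Duration.ofSeconds(10)));
    }
}
```

Treating 0 and 503 as different outcomes matters in a retry policy: a 503 is an answer from the server, while 0 tells the client nothing about whether the handoff ran.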
Add quarkus.shutdown.timeout
quarkus.shutdown.timeout is a runtime setting. When set, Quarkus waits for active HTTP requests to complete before tearing down, up to the limit. The lifecycle guide documents it; it is off by default.
Add only the timeout first (you can use a throwaway profile or add it to %graceful later):
quarkus.shutdown.timeout=15s
Too short — in-flight handoffs get cut off; clients see resets or errors even though the pod had time to drain.
Too long — deploys stall because Kubernetes will eventually send SIGKILL when terminationGracePeriodSeconds runs out.
Fifteen seconds is generous for our five-second fake gateway, but it leaves headroom for real network jitter.
Enable shutdown delay (readiness first)
Timeout alone does not tell the load balancer to stop new traffic early. For that you need delay:
quarkus.shutdown.delay-enabled=true: build time. You must package the app with this set; flipping it only at runtime is not enough.
quarkus.shutdown.delay: runtime duration of the pre-shutdown phase.
Our %graceful profile sets:
%graceful.quarkus.shutdown.timeout=15s
%graceful.quarkus.shutdown.delay-enabled=true
%graceful.quarkus.shutdown.delay=5s
Package with the graceful profile baked in:
./mvnw package -DskipTests -Dquarkus.profile=graceful,script
What happens on SIGTERM with delay enabled:
Delay phase starts. Readiness goes DOWN (SmallRye reports a Graceful Shutdown check).
Existing connections can still complete work.
After the delay, Quarkus moves toward full shutdown, honoring quarkus.shutdown.timeout for active HTTP.
Process exits.
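The steps above can be sketched as a toy model in plain Java. This is not how Quarkus implements shutdown; DrainModel and its methods are hypothetical, and refusing requests once the delay has ended is an assumption of the model, not a statement about what Quarkus does with late arrivals.

```java
import java.time.Duration;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

// Toy model of the delay-then-drain sequence. Not Quarkus internals.
final class DrainModel {

    private final AtomicBoolean ready = new AtomicBoolean(true);
    private final AtomicBoolean accepting = new AtomicBoolean(true);
    private final AtomicInteger inFlight = new AtomicInteger();

    // What a readiness probe would see.
    boolean isReady() {
        return ready.get();
    }

    // Simulates one request. Returns false if it arrived after the delay phase ended.
    boolean handle(long workMillis) {
        inFlight.incrementAndGet(); // register first so shutdown() cannot miss this request
        try {
            if (!accepting.get()) {
                return false;
            }
            pause(workMillis);
            return true;
        } finally {
            inFlight.decrementAndGet();
        }
    }

    // Returns true if every in-flight request finished within the timeout.
    boolean shutdown(Duration delay, Duration timeout) {
        ready.set(false);        // step 1: readiness goes DOWN
        pause(delay.toMillis()); // step 2: delay phase, requests are still served
        accepting.set(false);    // step 3: stop taking new work (assumption of this model)
        long deadline = System.nanoTime() + timeout.toNanos();
        while (inFlight.get() > 0) {
            if (System.nanoTime() - deadline >= 0) {
                return false;    // timeout hit with work still running
            }
            pause(10);
        }
        return true;             // step 4: safe to exit
    }

    static void pause(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        DrainModel model = new DrainModel();
        new Thread(() -> System.out.println("request served: " + model.handle(300))).start();
        pause(50); // let the request get in flight before the simulated SIGTERM
        boolean drained = model.shutdown(Duration.ofMillis(100), Duration.ofMillis(1000));
        System.out.println("ready=" + model.isReady() + " drained=" + drained);
    }
}
```

The model makes one thing easy to see: the delay only helps if something outside the process reacts to readiness going DOWN while it lasts, and the timeout only helps requests that were accepted before it started.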
Run the script in graceful mode:
./scripts/demonstrate-shutdown.sh graceful
Excerpt from a successful run:
-- Readiness before shutdown: UP --
ready HTTP 200
-- Polling readiness during shutdown --
[15:24:23] readiness HTTP 503
[15:24:24] readiness HTTP 503
...
-- Handoff result --
{"orderId":"ORD-SHUTDOWN","status":"HANDOFF_COMPLETE","elapsedMs":5003}
handoff HTTP 200 (total 5.101526s)
Readiness flips to 503 while the handoff still returns 200. That is the protocol you want: orchestrator sees “not ready,” client already in flight gets an answer.
Lifecycle hooks for visibility
Observers make the sequence visible in logs — useful when someone swears “readiness failed too early.”
package com.orderbridge;

import org.jboss.logging.Logger;

import io.quarkus.runtime.ShutdownEvent;
import io.quarkus.runtime.StartupEvent;
import io.quarkus.runtime.ShutdownDelayInitiatedEvent;
import jakarta.enterprise.context.ApplicationScoped;
import jakarta.enterprise.event.Observes;

@ApplicationScoped
public class ShutdownLifecycle {

    private static final Logger LOG = Logger.getLogger(ShutdownLifecycle.class);

    void onStart(@Observes StartupEvent event) {
        LOG.info("OrderBridge is ready to accept traffic");
    }

    void onShutdownDelay(@Observes ShutdownDelayInitiatedEvent event) {
        LOG.info("Shutdown delay started — readiness should fail while HTTP still winds down");
    }

    void onStop(@Observes ShutdownEvent event) {
        LOG.info("OrderBridge shutdown event fired");
    }
}
You can use @ShutdownDelayInitiated on a method instead of the event observer; behavior is the same. Methods marked that way run when delay starts — keep them fast, because they participate in the shutdown path.
In graceful mode you should see Shutdown delay started before OrderBridge shutdown event fired, and SmallRye log lines like Reporting health down status with a Graceful Shutdown check.
Kubernetes timing that actually matches Quarkus
Quarkus only controls what happens inside the pod after SIGTERM. Your platform still needs enough time and sensible probes.
terminationGracePeriodSeconds — must be at least delay + timeout + buffer. With our example (5s delay + 15s timeout), I would not go below 25s; 30s is a comfortable default.
Readiness probe periodSeconds and failureThreshold — control how quickly the Service removes the pod from endpoints. If the probe period is 10s and threshold is 3, it can take ~30s before Kubernetes stops routing even after readiness is DOWN. Align that with how fast your ingress or service mesh drains.
preStop hooks — optional sleep can help legacy load balancers that ignore readiness. Prefer fixing probe and delay alignment first; arbitrary sleep 15 in preStop hides misconfiguration.
Liveness during shutdown — do not point liveness at something that fails during graceful drain unless you want the kubelet to restart a pod that is intentionally winding down.
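The timing arithmetic in this section is simple enough to check in code. The sketch below is a hypothetical helper, not part of the sample: ShutdownBudget and its method names are made up, and the 5s buffer is an assumption inferred from the 25s floor mentioned above.

```java
import java.time.Duration;

// Hypothetical helper for sanity-checking rollout timing; names are illustrative.
final class ShutdownBudget {

    // Lower bound for terminationGracePeriodSeconds: delay + timeout + buffer.
    static Duration minGracePeriod(Duration delay, Duration timeout, Duration buffer) {
        return delay.plus(timeout).plus(buffer);
    }

    // Worst-case time for probe failures alone to mark the pod not ready.
    static Duration probeDetection(Duration period, int failureThreshold) {
        return period.multipliedBy(failureThreshold);
    }

    public static void main(String[] args) {
        Duration delay = Duration.ofSeconds(5);
        Duration timeout = Duration.ofSeconds(15);
        Duration buffer = Duration.ofSeconds(5); // assumed buffer, see lead-in

        Duration grace = minGracePeriod(delay, timeout, buffer);
        Duration detection = probeDetection(Duration.ofSeconds(10), 3);

        System.out.println("min grace period: " + grace.toSeconds() + "s");     // 25s
        System.out.println("probe detection:  " + detection.toSeconds() + "s"); // 30s
        // false here: a 5s delay ends long before a 10s x 3 probe notices readiness is DOWN
        System.out.println("delay covers probe detection: " + (delay.compareTo(detection) >= 0));
    }
}
```

With the example numbers the last line prints false, which is the misalignment the readiness probe item warns about: either the probe has to fail faster or the delay has to be longer.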
What graceful shutdown does not cover
The lifecycle guide is explicit: only extensions that opt in participate. Today the documented graceful path is HTTP. Kafka consumers, scheduled jobs, and background queues need their own drain story.
Long business work still needs application-level timeouts. quarkus.shutdown.timeout bounds how long Quarkus waits on HTTP; it does not make an unbounded database migration safe.
Native image and dev mode follow the same configuration ideas, but always re-run your shutdown script on the artifact you actually deploy.
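For background work that sits outside the HTTP path, a drain routine has to be written by hand. The following is a minimal JDK-only sketch under stated assumptions: WorkerDrain, drain, and job are hypothetical names, and the idea of calling drain from a ShutdownEvent observer like the one in the lifecycle hooks section is a suggestion, not something the sample does.

```java
import java.time.Duration;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical drain helper for background work; not part of the OrderBridge sample.
final class WorkerDrain {

    // Stops intake, waits up to `limit` for submitted tasks, then interrupts what is left.
    // Returns how many queued tasks never started.
    static int drain(ExecutorService workers, Duration limit) {
        workers.shutdown(); // no new tasks; already-submitted ones keep running
        try {
            if (workers.awaitTermination(limit.toMillis(), TimeUnit.MILLISECONDS)) {
                return 0; // everything finished inside the budget
            }
        } catch (InterruptedException interrupted) {
            Thread.currentThread().interrupt();
        }
        // Interrupt running tasks and report the ones that were still queued.
        return workers.shutdownNow().size();
    }

    // Stand-in for a background job that takes `millis` and stops early when interrupted.
    static Runnable job(long millis) {
        return () -> {
            try {
                Thread.sleep(millis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
    }

    public static void main(String[] args) {
        ExecutorService workers = Executors.newSingleThreadExecutor();
        workers.execute(job(50));
        System.out.println("abandoned: " + drain(workers, Duration.ofSeconds(2)));
    }
}
```

Whatever limit is passed to drain has to fit inside terminationGracePeriodSeconds together with the HTTP delay and timeout; a non-zero return value is a signal that queued work needs to be persisted or retried elsewhere.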
Prove it
Unit and resource tests:
./mvnw test
Integration test against the packaged runner:
./mvnw verify
Shutdown demonstrations:
./scripts/demonstrate-shutdown.sh naive
./scripts/demonstrate-shutdown.sh graceful
Expect naive mode to show readiness dropping only when the process is already gone, and the handoff client to fail often. Expect graceful mode to show readiness 503 for several seconds and the handoff to finish with HTTP 200.
Closing
Graceful shutdown in Quarkus is a small set of properties, but the real work is the deploy protocol: readiness fails first, new traffic stops, in-flight HTTP gets a bounded chance to finish, then the process exits. The 3.32 HTTP improvements reduce surprise 503 responses during that window; they do not remove the need for delay and timeout.
When you wire this into a real service, copy the recipe: enable delay at build time, set delay and timeout for your slowest acceptable request, align Kubernetes termination and probes, and keep a script like ours that proves behavior under SIGTERM, not just in a happy-path integration test.
Source for the full sample lives in the orderbridge-graceful-shutdown repository.


