Stop Flying Blind: Add Robust Health Checks to Your Quarkus Apps
Learn the why and how of MicroProfile Health, implementing Liveness and Readiness checks for stable, manageable services.
In modern distributed systems and microservices architectures, knowing the state of your application instances is an essential requirement. Is the service running? Is it ready to accept traffic? Answering these questions reliably enables automated deployment, scaling, and recovery strategies, and leads to more resilient systems.
Quarkus, leveraging the MicroProfile Health specification, provides a straightforward mechanism for exposing application health information. This guide shows you how to implement and use health checks in your Quarkus applications. Let’s get started:
Why Health Checks Matter
Health checks are essential because they drive automation within orchestration platforms like Kubernetes, which use them to manage application lifecycles through actions like restarts and traffic routing. They also provide observability by offering a standard way to gauge a service's operational status. Crucially, by accurately reporting their state, applications enable the infrastructure to take corrective measures, which improves resilience and overall system stability.
Quarkus implements MicroProfile Health via the quarkus-smallrye-health extension. This extension automatically provides health check endpoints and integrates seamlessly with Quarkus's CDI-based programming model.
Getting Started: Adding the Health Extension
To enable health checks, add the quarkus-smallrye-health extension to your project.
For Maven:
<dependency>
    <groupId>io.quarkus</groupId>
    <artifactId>quarkus-smallrye-health</artifactId>
</dependency>
Or use the Quarkus CLI:
quarkus ext add quarkus-smallrye-health
Once added, Quarkus automatically exposes default health endpoints. Even without custom checks, you can access these:
/q/health/live: Reports basic liveness (is the application process running?).
/q/health/ready: Reports readiness (is the application ready to serve requests?).
/q/health/started: Reports startup (has the application finished starting?).
/q/health: An aggregation of all registered health checks.
By default, if no custom checks are defined, these endpoints report the application as "UP". The overall status of an endpoint is computed as a logical AND of all the health check procedures registered for it.
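The logical AND rule can be modeled with a small, framework-free Java sketch. The HealthAggregation class and its overallStatus method are hypothetical names used only for illustration; this is a model of the rule, not the code SmallRye Health runs internally.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration of the aggregation rule; not SmallRye Health code.
final class HealthAggregation {

    // Returns "UP" only if every check is up; an empty map also yields "UP".
    static String overallStatus(Map<String, Boolean> checks) {
        boolean allUp = checks.values().stream().allMatch(Boolean::booleanValue);
        return allUp ? "UP" : "DOWN";
    }

    public static void main(String[] args) {
        Map<String, Boolean> checks = new LinkedHashMap<>();
        System.out.println(overallStatus(checks)); // prints UP (no checks registered)

        checks.put("check-a", true);
        checks.put("check-b", false);
        System.out.println(overallStatus(checks)); // prints DOWN (one check is down)
    }
}
```

A single failing procedure is therefore enough to flip the whole endpoint to DOWN, which is why each check should only report on something that truly matters for that endpoint.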
Implementing Custom Health Checks
Real-world applications often require more sophisticated checks. Is the database connection valid? Can the service reach a critical downstream system dependency? MicroProfile Health defines two primary types of checks:
Liveness Checks: Indicate if the application instance is fundamentally functional. A failed liveness check suggests the instance is in an unrecoverable state and should be terminated and potentially replaced (e.g., restarted by Kubernetes). These checks should be fast and lightweight.
Readiness Checks: Indicate if the application instance is ready to process requests. A failed readiness check suggests the instance is temporarily unable to serve traffic (e.g., waiting for a dependency, warming up a cache) but might recover. The orchestrator should stop sending traffic to this instance but not terminate it.
You implement these checks by creating CDI beans that implement the org.eclipse.microprofile.health.HealthCheck interface and annotating them with either @Liveness or @Readiness.
It’s recommended to annotate the health check class with the @ApplicationScoped or @Singleton scope so that a single bean instance is used for all health check requests.
1. Basic Liveness Check:
Let's create a simple liveness check. This could represent a fundamental internal state check.
import jakarta.enterprise.context.ApplicationScoped;
import org.eclipse.microprofile.health.HealthCheck;
import org.eclipse.microprofile.health.HealthCheckResponse;
import org.eclipse.microprofile.health.Liveness;

@Liveness // Marks this as a Liveness check
@ApplicationScoped // Makes it a CDI bean discoverable by Quarkus
public class SimpleLivenessCheck implements HealthCheck {

    @Override
    public HealthCheckResponse call() {
        // In a real scenario, perform a quick check here.
        // For this example, we'll just report UP.
        return HealthCheckResponse.named("SimpleLivenessCheck") // Name of the check
                .up() // Report status as UP
                .build();
    }
}
2. Basic Readiness Check:
Now, let's create a readiness check. Imagine this check verifies if a necessary configuration has been loaded.
import jakarta.enterprise.context.ApplicationScoped;
import org.eclipse.microprofile.health.HealthCheck;
import org.eclipse.microprofile.health.HealthCheckResponse;
import org.eclipse.microprofile.health.Readiness;

@Readiness // Marks this as a Readiness check
@ApplicationScoped
public class ConfigReadinessCheck implements HealthCheck {

    // Simulate config state; volatile because health requests may read it
    // on a different thread than the one that updates it
    private volatile boolean configLoaded = true;

    @Override
    public HealthCheckResponse call() {
        HealthCheckResponse response;
        if (configLoaded) {
            response = HealthCheckResponse.named("ConfigReadinessCheck")
                    .up()
                    .withData("config.status", "loaded") // Optional: Add contextual data
                    .build();
        } else {
            response = HealthCheckResponse.named("ConfigReadinessCheck")
                    .down()
                    .withData("config.error", "not loaded")
                    .build();
        }
        return response;
    }

    // Method to simulate changing the config state (for testing)
    public void setConfigLoaded(boolean loaded) {
        this.configLoaded = loaded;
    }
}
Understanding the HealthCheckResponse:
The call() method must return a HealthCheckResponse. You build this using HealthCheckResponse.named("check-name"), which returns a builder offering the following methods:
.up(): Sets the status to UP.
.down(): Sets the status to DOWN.
.withData(key, value): Adds a key-value pair (the value can be a string, a long, or a boolean) to provide context about the check's status. This is useful for debugging.
Accessing the Endpoints:
With these checks implemented, start your Quarkus application (e.g., mvn quarkus:dev). Now you can access the health endpoints:
Liveness:
curl http://localhost:8080/q/health/live
{
    "status": "UP",
    "checks": [
        {
            "name": "SimpleLivenessCheck",
            "status": "UP"
        }
        // Potentially other built-in liveness checks
    ]
}
Readiness:
curl http://localhost:8080/q/health/ready
{
    "status": "UP",
    "checks": [
        {
            "name": "ConfigReadinessCheck",
            "status": "UP",
            "data": {
                "config.status": "loaded"
            }
        }
        // Potentially other built-in readiness checks
    ]
}
Aggregated:
curl http://localhost:8080/q/health
{
    "status": "UP", // Overall status is UP only if ALL checks are UP
    "checks": [
        {
            "name": "SimpleLivenessCheck",
            "status": "UP"
        },
        {
            "name": "ConfigReadinessCheck",
            "status": "UP",
            "data": {
                "config.status": "loaded"
            }
        }
        // ... other checks
    ]
}
If any single check within /q/health/live reports DOWN, the overall status of that endpoint becomes DOWN. The same logic applies to /q/health/ready. The aggregated /q/health endpoint reports DOWN if any registered check (liveness or readiness) reports DOWN.
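The overall status is also reflected in the HTTP status code: MicroProfile Health answers with 200 when the endpoint is UP and 503 when it is DOWN, so a client does not need to parse the JSON body. Here is a minimal sketch of such a client using the JDK's built-in HTTP client. The HealthProbe class, its method names, and the two-second timeout are assumptions made for illustration.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;

// Hypothetical client-side probe; class and method names are illustrative.
final class HealthProbe {

    // A 2xx status code is treated as healthy (UP yields 200, DOWN yields 503).
    static boolean isHealthy(int statusCode) {
        return statusCode >= 200 && statusCode < 300;
    }

    // Calls a health endpoint and returns true only for a 2xx answer within the timeout.
    static boolean probe(URI endpoint, Duration timeout) {
        HttpClient client = HttpClient.newBuilder().connectTimeout(timeout).build();
        HttpRequest request = HttpRequest.newBuilder(endpoint).timeout(timeout).GET().build();
        try {
            HttpResponse<String> response =
                    client.send(request, HttpResponse.BodyHandlers.ofString());
            return isHealthy(response.statusCode());
        } catch (IOException e) {
            return false; // connection refused, timeout, or another I/O failure
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        // Assumes the application from this guide is running locally on port 8080.
        URI ready = URI.create("http://localhost:8080/q/health/ready");
        System.out.println(probe(ready, Duration.ofSeconds(2)));
    }
}
```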
Simulating Dependency Checks
Readiness checks often involve verifying connections to external dependencies like databases, messaging queues, or other microservices. Let's simulate a database connection check.
import jakarta.enterprise.context.ApplicationScoped;
import org.eclipse.microprofile.health.HealthCheck;
import org.eclipse.microprofile.health.HealthCheckResponse;
import org.eclipse.microprofile.health.Readiness;
import java.util.Random;

@Readiness // Database availability affects readiness
@ApplicationScoped
public class DatabaseConnectionCheck implements HealthCheck {

    private final Random random = new Random();

    @Override
    public HealthCheckResponse call() {
        // Simulate checking the database connection
        // In a real app, you'd inject a DataSource or client
        // and perform a lightweight query (e.g., 'SELECT 1') or connection validation.
        boolean isDbAvailable = checkDatabaseConnection();
        if (isDbAvailable) {
            return HealthCheckResponse.named("DatabaseConnection")
                    .up()
                    .build();
        } else {
            return HealthCheckResponse.named("DatabaseConnection")
                    .down()
                    .withData("error", "Database connection failed")
                    .build();
        }
    }

    private boolean checkDatabaseConnection() {
        // Simulate intermittent failures for demonstration
        // Replace with actual database interaction
        return random.nextInt(10) < 8; // 80% chance of success
    }
}
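To replace the simulated check with a real one, the validation could look roughly like the sketch below. It uses only the standard JDBC API (Connection.isValid). The DatabaseProbe class and its isReachable method are hypothetical names, and how the DataSource is obtained (typically by injection in a Quarkus application) is left out as an assumption.

```java
import java.sql.Connection;
import java.sql.SQLException;
import javax.sql.DataSource;

// Hypothetical helper; in a Quarkus app the DataSource would typically be injected.
final class DatabaseProbe {

    // Borrows a connection and asks the driver to validate it within the given timeout.
    static boolean isReachable(DataSource dataSource, int timeoutSeconds) {
        try (Connection connection = dataSource.getConnection()) {
            return connection.isValid(timeoutSeconds);
        } catch (SQLException e) {
            return false; // pool exhausted, network failure, bad credentials, ...
        }
    }
}
```

Inside call(), the boolean result could then decide between .up() and .down() in place of checkDatabaseConnection().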
Important Considerations for Dependency Checks:
Keep them lightweight: Health checks, especially liveness checks, are often called frequently. Avoid heavyweight operations. For database checks, use the connection pool's validation mechanism or execute a minimal query (e.g., SELECT 1).
Timeouts: Health checks should complete quickly. Implement timeouts within your check logic if the underlying operation could hang. Platforms like Kubernetes also configure timeouts for probes.
Error Handling: Gracefully handle exceptions within your call() method. An uncaught exception typically results in the check being reported as DOWN, but providing specific error data via .withData() is more informative.
Use Appropriate Type: Checking a database connection is usually a Readiness concern. The application might be live (running) even if the database is temporarily down, but it's not ready to serve requests requiring the database.
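The timeout and error-handling advice can be combined in one small wrapper. The following is a minimal sketch using only the JDK. BoundedCheck and runWithTimeout are hypothetical names, and running the check on the common fork-join pool via supplyAsync is a simplification chosen for brevity rather than a recommendation.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.function.Supplier;

// Hypothetical helper that bounds how long a dependency check may take.
final class BoundedCheck {

    // Runs the check on another thread; a timeout or exception counts as "not available".
    static boolean runWithTimeout(Supplier<Boolean> check, long timeoutMillis) {
        CompletableFuture<Boolean> future = CompletableFuture.supplyAsync(check);
        try {
            return Boolean.TRUE.equals(future.get(timeoutMillis, TimeUnit.MILLISECONDS));
        } catch (TimeoutException e) {
            future.cancel(true); // stop waiting; the underlying task may still finish later
            return false;
        } catch (ExecutionException e) {
            return false; // the check itself threw an exception
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

A call() implementation could wrap its dependency lookup in runWithTimeout and report DOWN with a descriptive withData entry whenever the result is false.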
Configuration
The root path for health endpoints can be configured in the main Quarkus application.properties:
# Changes the endpoints to /healthz/live, /healthz/ready, /healthz
quarkus.smallrye-health.root-path=/healthz
# Disable the health checks that other Quarkus extensions register automatically
# (this does not disable the aggregated /q/health endpoint)
# quarkus.smallrye-health.extensions.enabled=false
# Note: Depending on your Quarkus version, this property may instead be named
# quarkus.health.extensions.enabled; verify it against the configuration reference.
# Customize the path of an individual endpoint, relative to the root path (default: live)
# quarkus.smallrye-health.liveness-path=alive
Take a look at the Quarkus SmallRye Health configuration guide for more options.
Integration with Kubernetes
Kubernetes is a prime consumer of health endpoints through its Liveness Probes and Readiness Probes.
Liveness Probe: Kubernetes periodically calls this endpoint. If it fails (returns a status code outside the 200-399 range or times out) repeatedly according to the probe's configuration (failureThreshold), Kubernetes assumes the container is dead and restarts it. Map this to your Quarkus @Liveness endpoint (/q/health/live by default).
Readiness Probe: Kubernetes calls this endpoint to determine if the container is ready to receive traffic. If it fails, Kubernetes removes the Pod's IP address from the corresponding Service's endpoints, effectively stopping traffic flow to that instance. It continues probing, and once the probe succeeds, the Pod is added back. Map this to your Quarkus @Readiness endpoint (/q/health/ready by default).
Here's an example snippet for a Kubernetes Deployment manifest:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-quarkus-app
spec:
  replicas: 3
  selector:
    matchLabels:
      app: my-quarkus-app
  template:
    metadata:
      labels:
        app: my-quarkus-app
    spec:
      containers:
        - name: my-quarkus-container
          image: your-repo/my-quarkus-app:latest
          ports:
            - containerPort: 8080
          livenessProbe:
            httpGet:
              path: /q/health/live # Matches the @Liveness endpoint
              port: 8080
            initialDelaySeconds: 15 # Wait before first probe
            periodSeconds: 20 # How often to probe
            failureThreshold: 3 # How many failures before restarting
          readinessProbe:
            httpGet:
              path: /q/health/ready # Matches the @Readiness endpoint
              port: 8080
            initialDelaySeconds: 20 # Wait longer for readiness (e.g., cache warmup)
            periodSeconds: 10 # Probe readiness more often
            failureThreshold: 3 # How many failures before stopping traffic
Key Kubernetes Integration Points:
Match Probe Type: Use livenessProbe for /q/health/live and readinessProbe for /q/health/ready. Mixing them up can lead to incorrect behavior (e.g., restarting an instance that is just temporarily busy).
Configure Paths: Ensure the path in the probe configuration matches the actual endpoint path exposed by Quarkus (including any customization via quarkus.smallrye-health.root-path).
Tune Parameters: Adjust initialDelaySeconds, periodSeconds, timeoutSeconds, successThreshold, and failureThreshold based on your application's startup time, check execution time, and desired sensitivity.
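When tuning these parameters, it helps to estimate how long a broken instance can go unnoticed. As a rough approximation, Kubernetes needs failureThreshold consecutive failed probes, spaced periodSeconds apart, and the last probe may run into its timeout. The sketch below encodes that estimate. ProbeTiming and maxDetectionSeconds are hypothetical names, and the formula is a simplification rather than an exact description of kubelet scheduling.

```java
// Hypothetical helper for reasoning about probe settings; not part of any Kubernetes API.
final class ProbeTiming {

    // Rough upper bound, in seconds, between an instance breaking and the probe
    // reaching failureThreshold consecutive failures.
    static int maxDetectionSeconds(int periodSeconds, int failureThreshold, int timeoutSeconds) {
        return periodSeconds * failureThreshold + timeoutSeconds;
    }

    public static void main(String[] args) {
        // Values from the manifest above; timeoutSeconds is not set there, so the
        // Kubernetes default of 1 second is assumed.
        System.out.println(maxDetectionSeconds(20, 3, 1)); // liveness: prints 61
        System.out.println(maxDetectionSeconds(10, 3, 1)); // readiness: prints 31
    }
}
```

With the example manifest, a hung instance could therefore keep running for about a minute before being restarted, while traffic would be withdrawn after roughly half that time.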
Best Practices for Quarkus Health Checks
Keep Liveness checks extremely fast and simple. They must only verify the fundamental operational status of the application process itself and should fail solely if the instance is irrecoverably broken and requires a restart. Critically, avoid checks on external dependencies within liveness probes, as temporary external issues shouldn't trigger a potentially disruptive restart.
Readiness checks, conversely, are designed to confirm the application is fully prepared to handle requests. This is the correct place to verify connections to essential external dependencies, such as databases or downstream microservices, and to check internal states like cache readiness or the completion of necessary warmup procedures.
Evaluate the security requirements for your health endpoints. While often accessed internally within a trusted network (like a Kubernetes cluster) and left unsecured, apply appropriate protection if they are exposed externally or if security policies demand it. Options range from infrastructure-level network policies to integrating application security, though the latter can complicate the checks.
Remember to monitor the health checks themselves. These endpoints are critical signals but can also fail or become slow, providing misleading information. Include the latency and success/failure rates of your /q/health/* endpoints in your overall observability dashboards to ensure their reliability.
Finally, make effective use of the withData() method when building HealthCheckResponse objects, particularly for down() statuses. Including relevant key-value context (like error codes or specific connection details) directly in the health check response significantly accelerates troubleshooting when failures occur.
Conclusion
Quarkus, through the quarkus-smallrye-health extension and MicroProfile Health, offers a robust and developer-friendly way to implement essential health checks. By distinguishing between liveness and readiness, and integrating correctly with orchestrators like Kubernetes, you can significantly improve the resilience and manageability of your Java applications. Implementing thoughtful health checks is not just a feature; it's a foundational practice for building reliable cloud-native systems. Adopt them early in your development lifecycle.