How to Benchmark Quarkus Like an Engineer — Not a Marketer
Forget pretty graphs. Benchmarking Quarkus is about context, not competition. Learn how to separate signal from noise when testing JVM and Native performance on macOS and Linux.
Every developer has done it: build a Quarkus app, flip the -Pnative flag, and start timing startup and throughput. Seconds later, you’re ready to declare a winner. But when you test Quarkus Native against the JVM on macOS, those numbers are not just meaningless, they’re dangerously misleading.
macOS, especially on Apple Silicon, is not a neutral testing ground. Its performance and efficiency cores behave unpredictably under load. It silently throttles CPU power, mixes architectures, and sometimes runs benchmarking tools under Rosetta emulation. What looks like a fair fight between JVM and Native is actually a distorted contest shaped by the operating system’s energy model.
As Francesco Nigro rightly pointed out, we should never load test on notebooks or artificial environments. Your laptop is not a production server. It has thermal limits, variable clocks, and no real control over CPU quotas or memory isolation. The result may look like a pretty graph but it’s a false conclusion.
Benchmarking Is Not About Numbers
Benchmarks are seductive. They promise precision, clear winners, and engineering confidence. But benchmarking, especially in Java, is a craft. Even small oversights (a missing warm-up, background processes, or inconsistent thread affinity) can invert results.
The JVM and Quarkus Native operate on fundamentally different models:
JVM: Just-In-Time (JIT) compilation adapts at runtime, optimizing hot paths dynamically.
Native: Ahead-Of-Time (AOT) compilation fixes decisions at build time.
Comparing them requires more than a stopwatch. You need warm-up phases, multiple runs, controlled concurrency, and careful interpretation to separate signal from noise. Otherwise, you’re just benchmarking startup luck, not performance.
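Concretely, a fair measurement loop discards warm-up runs and reports a robust statistic over the kept runs. Here is a minimal POSIX-shell sketch; `measure_once` is a placeholder probe (in a real benchmark it would wrap a wrk run or a timed request), and `date +%s%N` assumes GNU date on Linux:

```shell
#!/bin/sh
# Sketch: N measured runs after a warm-up phase, reporting the median.
# measure_once is a stand-in probe; swap in a real wrk run or timed request.
measure_once() {
  start=$(date +%s%N)   # GNU date (Linux); on macOS install coreutils and use gdate
  i=0; while [ "$i" -lt 10000 ]; do i=$((i + 1)); done   # placeholder workload
  end=$(date +%s%N)
  echo $(( (end - start) / 1000000 ))   # elapsed milliseconds
}

WARMUP=2   # discarded: lets caches, JIT, and the OS scheduler settle
RUNS=5     # kept for the statistic

n=0; while [ "$n" -lt "$WARMUP" ]; do measure_once > /dev/null; n=$((n + 1)); done

results=""
n=0
while [ "$n" -lt "$RUNS" ]; do
  results="$results$(measure_once)
"
  n=$((n + 1))
done

# The median resists a single noisy run better than the mean does
median=$(printf '%s' "$results" | sort -n | awk '{a[NR]=$1} END{print a[int((NR+1)/2)]}')
echo "median over $RUNS runs: ${median}ms"
```

The same structure applies whether the probe measures latency, throughput, or startup: warm up, repeat, and summarize with a statistic that tolerates outliers.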
Why macOS Distorts Results
Apple’s unified memory and scheduler design make macOS a great developer machine but a poor benchmark host. Without Linux’s cgroups or predictable CPU pinning, workloads drift across efficiency and performance cores. Container virtualization adds another abstraction layer.
The JVM’s adaptive runtime manages these quirks better. It continuously optimizes itself. Quarkus Native binaries depend on static AOT compilation. That’s why many developers see the JVM outperform Native builds locally, even though the opposite is often true inside Linux containers.
What You’ll Actually Learn Here
This tutorial is not about declaring a winner. It’s about understanding why results differ and how to benchmark responsibly. You’ll build a simple Quarkus REST API and run it in both modes under identical load. Along the way, you’ll learn how to:
design fair, repeatable performance tests
interpret results in context
decide when to use each runtime in production
On macOS, treat the numbers as relative, not absolute. Benchmarking is about developing intuition, not collecting trophies.
Building the Project
Quarkus is a Java framework designed for cloud efficiency: fast startup, high throughput, and low resource use. You’ll test those claims by running the same REST API in two modes: on the JVM and as a native executable. By tracking CPU and memory consumption under identical conditions, you’ll see how results differ across platforms and learn how to design fair experiments that hold up beyond your laptop.
Grab the source code and benchmarking script from my GitHub repository.
Prerequisites
Java 21+
GraalVM 21+
Maven 3.8+
wrk for load generation
Basic system monitoring tools (top, htop, or ps)
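A quick way to catch a missing tool before wasting a benchmark run is a simple PATH check. A minimal sketch (extend the tool list to match your setup):

```shell
#!/bin/sh
# Sketch: verify the prerequisite tools are on PATH before benchmarking.
checked=0
missing=0
for tool in java mvn wrk; do
  checked=$((checked + 1))
  if command -v "$tool" > /dev/null 2>&1; then
    echo "$tool: found"
  else
    echo "$tool: MISSING"
    missing=$((missing + 1))
  fi
done
echo "$missing of $checked tools missing"
```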
Bootstrap the Project
mvn io.quarkus:quarkus-maven-plugin:create \
-DprojectGroupId=com.example \
-DprojectArtifactId=quarkus-benchmark \
-Dextensions="rest-jackson"
cd quarkus-benchmark
Implement REST Endpoints
Replace the default GreetingResource.java with:
package com.example;

import java.util.*;
import java.util.stream.*;

import io.smallrye.common.annotation.NonBlocking;
import jakarta.ws.rs.*;
import jakarta.ws.rs.core.MediaType;

@Path("/api")
public class GreetingResource {

    @GET
    @Path("/hello")
    @Produces(MediaType.TEXT_PLAIN)
    @NonBlocking
    public String hello() {
        return "Hello from Quarkus!";
    }

    @GET
    @Path("/compute")
    @Produces(MediaType.APPLICATION_JSON)
    @NonBlocking
    public ComputeResult compute(@QueryParam("iterations") Integer iterations) {
        int iter = iterations != null ? iterations : 1000;
        // Square every even number in [0, iter) and keep the results
        List<Integer> processed = IntStream.range(0, iter)
                .filter(n -> n % 2 == 0)
                .map(n -> n * n)
                .boxed()
                .collect(Collectors.toList());
        long sum = processed.stream().mapToLong(Integer::longValue).sum();
        return new ComputeResult(processed.size(), sum);
    }

    public static class ComputeResult {
        public int count;
        public long sum;

        public ComputeResult(int count, long sum) {
            this.count = count;
            this.sum = sum;
        }
    }
}

The /compute endpoint is meant to simulate CPU-intensive load. It takes a number N, squares every even number from 0 to N, sums the squares, and returns {count, sum} as JSON. It’s a typical example of a microservice endpoint used to demonstrate reactive, non-blocking REST APIs.
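Before comparing throughput, it’s worth pinning down what /compute should return, so you can confirm the JVM and native builds produce identical answers. This small awk sketch mirrors the Java stream pipeline for the default 1000 iterations:

```shell
#!/bin/sh
# Mirror the /compute pipeline: square even numbers in [0, iterations),
# count them, and sum the squares. Both builds should report these values.
expected=$(awk -v n=1000 'BEGIN {
  for (i = 0; i < n; i++)
    if (i % 2 == 0) { count++; sum += i * i }
  printf "count=%d sum=%d", count, sum
}')
echo "$expected"
```

Once the app is running, the same numbers should appear in the JSON from `curl "http://localhost:8080/api/compute?iterations=1000"`.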
In application.properties, enable reflection-free Jackson serialization:
quarkus.rest.jackson.optimization.enable-reflection-free-serializers=true
This improves serialization performance, especially in Native mode.
Building JVM and Native Versions
./mvnw clean package # JVM
./mvnw package -Pnative # Native
The native build may take a few minutes. Once done, verify the runner file exists in target/.
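Startup time is typically measured from process launch until the app logs that it is listening. A hedged sketch, using a stand-in process in place of the real runner (substitute `java -jar target/quarkus-app/quarkus-run.jar` or the native `*-runner` binary, and match on Quarkus’s actual startup log line):

```shell
#!/bin/sh
# Measure wall-clock time from launch until the process prints its "ready" line.
# The sh -c command below is a stand-in for the real server process.
start_ns=$(date +%s%N)   # GNU date; on macOS use gdate from coreutils

sh -c 'sleep 0.2; echo "Listening on: http://0.0.0.0:8080"' \
  | grep -m1 "Listening on" > /dev/null

end_ns=$(date +%s%N)
startup_ms=$(( (end_ns - start_ns) / 1000000 ))
echo "startup: ${startup_ms}ms"
```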
Benchmark Methodology
A custom benchmark.sh script automates startup, memory, and throughput tests for both modes. It ensures consistency across multiple runs and provides realistic data on:
startup performance
memory footprint
throughput and latency
CPU and memory utilization
Key Features
Mode-aware configuration: Runs either JVM (java -jar) or the native binary.
Consistent CPU limits: Same number of cores for both modes.
Warm-up for JVM: JIT compiles hot paths before measurement.
Resource monitoring: Samples CPU and memory in real time.
Error resilience and cleanup: Ensures reliable runs.
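Resource monitoring doesn’t need heavy tooling; sampling resident memory with ps while the server runs is often enough. A sketch using a short-lived stand-in process in place of the actual Quarkus PID:

```shell
#!/bin/sh
# Sample resident memory (RSS, in KB) of a process and report the peak.
# "sleep 2" stands in for the server under test; use your Quarkus PID instead.
sleep 2 &
PID=$!

peak_kb=0
for i in 1 2 3 4 5; do
  rss=$(ps -o rss= -p "$PID" 2>/dev/null | tr -d ' ')
  if [ -n "$rss" ] && [ "$rss" -gt "$peak_kb" ]; then
    peak_kb=$rss
  fi
  sleep 0.2
done
kill "$PID" 2>/dev/null
echo "peak RSS: ${peak_kb} KB"
```

Sampling at a fixed interval and keeping the peak is a crude but repeatable proxy for memory footprint under load.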
Why Limit CPU Cores
The script includes:
SYSTEM_CPUS=$(echo "4")
This caps both modes at four logical CPUs. On a laptop, that’s crucial. You likely have browser tabs, telemetry, and the load generator (wrk) running in parallel. Limiting CPUs creates a consistent, repeatable envelope and minimizes interference.
You’re not measuring maximum speed. You’re measuring consistent behavior under controlled conditions.
taskset and taskpolicy
On Linux, taskset pins a process to specific CPU cores, reducing variability. On macOS, this doesn’t exist. Instead, the script uses the taskpolicy -c utility command to assign a lower-priority QoS class, but it cannot enforce CPU affinity. The scheduler may still move threads between performance and efficiency cores mid-test, which is another reason macOS benchmarks are unreliable.
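That platform difference can be captured in a small launcher helper that picks taskset on Linux and falls back to taskpolicy’s QoS hint on macOS. A best-effort sketch (the 0-3 core range matches the four-CPU cap above):

```shell
#!/bin/sh
# Choose a CPU-control prefix per platform:
#  - Linux: taskset pins the process to cores 0-3 (real affinity)
#  - macOS: taskpolicy only hints a QoS class; it cannot pin cores
LAUNCH_PREFIX=""
case "$(uname -s)" in
  Linux)
    command -v taskset >/dev/null 2>&1 && LAUNCH_PREFIX="taskset -c 0-3" ;;
  Darwin)
    command -v taskpolicy >/dev/null 2>&1 && LAUNCH_PREFIX="taskpolicy -c utility" ;;
esac
echo "launch prefix: '${LAUNCH_PREFIX}'"
# usage (runner path depends on your build):
#   $LAUNCH_PREFIX java -jar target/quarkus-app/quarkus-run.jar
```

An empty prefix means no CPU control is available, which is itself worth recording alongside your results.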
Comparing JVM and Native Performance on macOS
Running the benchmark on an M4 Pro (four logical cores) produced an inversion of expected results:
/api/hello (I/O-bound): JVM ≈ 120 000 rps vs. Native ≈ 110 000 rps
/api/compute (CPU-bound): JVM ≈ 50 000 rps vs. Native ≈ 31 000 rps
The JVM benefits from adaptive JIT optimizations and runs with higher scheduling priority. The native binary, launched via taskpolicy, runs at lower QoS and lacks runtime feedback, leading to slower execution.
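One way to keep macOS numbers honest is to report relative gaps rather than raw requests per second. Using the figures above:

```shell
#!/bin/sh
# Express the JVM-vs-native gap as a relative delta rather than raw rps.
delta() { awk -v a="$1" -v b="$2" 'BEGIN { printf "%.1f", (a - b) / a * 100 }'; }

hello_gap=$(delta 120000 110000)    # /api/hello
compute_gap=$(delta 50000 31000)    # /api/compute
echo "/api/hello:   native is ${hello_gap}% behind JVM"
echo "/api/compute: native is ${compute_gap}% behind JVM"
```

A single-digit gap on an I/O-bound path is within laptop noise; a near-40% gap on a CPU-bound path is a real signal worth re-testing on Linux.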
Where Native Still Wins
Instant startup
Smaller memory footprint
Better fit for short-lived functions or dense container workloads
More competitive in Linux environments with cgroup control
Where JVM Excels
Sustained throughput
Adaptive optimization over time
Predictable scaling under load
Best match for long-running APIs and compute-heavy services
Strategic Recommendations
Use Quarkus Native when:
You run many small services
Memory, cold starts, or density matter more than compute
You deploy to serverless or container platforms
You want cost savings from smaller footprints
Use JVM mode when:
You run fewer but high-traffic services
CPU efficiency or cost per transaction is critical
You handle analytics or heavy computation
Peak throughput matters more than startup latency
Closing Thoughts
Benchmarks are not about numbers. They’re about understanding. Every graph is a snapshot of behavior under specific constraints. On macOS, those constraints are blurry. On Linux, they’re clearer. But your job as an engineer is the same: interpret results, don’t chase them.
When you compare Quarkus on the JVM and as a native image, you’re not testing who’s faster. You’re testing trade-offs: startup time versus throughput, adaptability versus predictability. The right answer depends on your workload and environment.
Never trust numbers you don’t understand. Never trust results you can’t reproduce. And never benchmark to prove a point: benchmark to learn!
If you measure with intent, interpret with care, and optimize where it matters, you’ve already won.