Quarkus Test Data: Teach Bob Datafaker and Instancio
Replace hardcoded agent fixtures with synthetic data, a Bob skill, seeded boundary tests, and REST proof in a NebulaTrack Quarkus walkthrough.
I asked IBM Bob to scaffold tests for a new NebulaTrack service. I got the usual first pass: one happy-path test, a handful of string literals, coordinates stuck at 0.0, and IDs that never change between runs.
That matches what a 2026 MSR study found across more than 1.2 million commits from 2025. Coding agents modify tests more often than humans do, and they add mocks more often too. 36% of agent test commits added mocks, versus 26% for non-agents. When an agent has no data vocabulary, it takes the easy shortcut: hardcoded "testUser", Mockito stubs, and "NOMINAL" everywhere.
Mocks still belong in some tests. They just do not replace varied input. What I keep reaching for is synthetic data: realistic field values from Datafaker, object graphs from Instancio, and a Bob skill that tells the agent what good NebulaTrack fixtures look like.
In this walkthrough, Bob scaffolds a Quarkus service first. We see where the fixtures fall short, add Datafaker and Instancio, teach Bob through .bob/skills/, and push past the happy path with seeded, reproducible parameterized tests.
What we build
We extend NebulaTrack with a small REST slice that ingests satellite telemetry events:
SatelliteEvent domain record with coordinates, altitude, telemetry state, and payload
SatelliteEventService that validates domain rules (ID format, coordinate bounds, altitude floor, required payload on anomaly states)
@QuarkusTest coverage that starts with naive Bob fixtures, then moves to Datafaker + Instancio factories
a Bob skill under .bob/skills/synthetic-test-data/ plus a slash command entry point
parameterized boundary tests with Instancio models and a fixed seed for reproducible failures
REST Assured proof for the HTTP path
When you finish, you have a pattern you can drop into any Quarkus module where Bob writes tests: put the data rules in project context, generate fixtures with libraries, and prove edge cases with seeds.
What you need
You need a current JDK, Maven, basic Quarkus REST comfort, and IBM Bob in Advanced mode. Bob skills load only there.
JDK 25
Maven 3.9+ or the Quarkus CLI
IBM Bob with skills enabled
Familiarity with JUnit 5 and @QuarkusTest
About ☕️☕️
This article targets Quarkus 3.36.1, Datafaker 2.5.4, and Instancio 5.6.0.
Project setup
Create a REST Quarkus app without codestarts.
quarkus create app dev.quarkex:nebulatrack-testdata \
    --extension='rest-jackson' \
    --java=25 \
    --no-code
Under src/main/java, use package dev.quarkex.nebulatrack.testdata with subpackages model, service, and resource.
Add test-scoped dependencies for AssertJ, Datafaker, Instancio, and REST Assured:
<dependency>
    <groupId>org.assertj</groupId>
    <artifactId>assertj-core</artifactId>
    <version>3.27.3</version>
    <scope>test</scope>
</dependency>
<dependency>
    <groupId>net.datafaker</groupId>
    <artifactId>datafaker</artifactId>
    <version>2.5.4</version>
    <scope>test</scope>
</dependency>
<dependency>
    <groupId>org.instancio</groupId>
    <artifactId>instancio-junit</artifactId>
    <version>5.6.0</version>
    <scope>test</scope>
</dependency>
<dependency>
    <groupId>io.rest-assured</groupId>
    <artifactId>rest-assured</artifactId>
    <scope>test</scope>
</dependency>
instancio-junit pulls in instancio-core and the JUnit 5 extension. Keep Datafaker and Instancio in test scope. Production code should never depend on fake data generators.
Verify
From the project root:
./mvnw test
An empty project should compile and pass before we add domain code.
Let Bob scaffold the service
Prompt Bob with something like the following, ideally with the Quarkus Agent MCP in use.
Add a SatelliteEventService and REST resource for ingesting satellite telemetry. Include unit tests.
If you are following along without Bob, paste the code below. I care about the shape of the output here, not whether Bob created it.
Domain model
Create src/main/java/dev/quarkex/nebulatrack/testdata/model/TelemetryState.java:
package dev.quarkex.nebulatrack.testdata.model;

public enum TelemetryState {
    NOMINAL,
    DEGRADED,
    ANOMALY,
    OFFLINE
}
Create SatelliteEvent.java:
package dev.quarkex.nebulatrack.testdata.model;

import java.time.Instant;

public record SatelliteEvent(
        String eventId,
        String satelliteId,
        double latitude,
        double longitude,
        double altitudeKm,
        TelemetryState state,
        Instant observedAt,
        String payloadJson) {
}
Create ValidationResult.java:
package dev.quarkex.nebulatrack.testdata.model;

public record ValidationResult(boolean valid, String reason) {

    public static ValidationResult ok() {
        return new ValidationResult(true, null);
    }

    public static ValidationResult reject(String reason) {
        return new ValidationResult(false, reason);
    }
}
Service
Create src/main/java/dev/quarkex/nebulatrack/testdata/service/SatelliteEventService.java:
package dev.quarkex.nebulatrack.testdata.service;

import dev.quarkex.nebulatrack.testdata.model.SatelliteEvent;
import dev.quarkex.nebulatrack.testdata.model.TelemetryState;
import dev.quarkex.nebulatrack.testdata.model.ValidationResult;
import jakarta.enterprise.context.ApplicationScoped;
import java.util.regex.Pattern;

@ApplicationScoped
public class SatelliteEventService {

    static final Pattern SATELLITE_ID = Pattern.compile("SAT-[A-Z]{2}-\\d{4}");

    public ValidationResult validate(SatelliteEvent event) {
        if (event == null) {
            return ValidationResult.reject("event is required");
        }
        if (event.eventId() == null || event.eventId().isBlank()) {
            return ValidationResult.reject("eventId is required");
        }
        if (event.satelliteId() == null || !SATELLITE_ID.matcher(event.satelliteId()).matches()) {
            return ValidationResult.reject("satelliteId must match SAT-XX-0000");
        }
        if (event.latitude() < -90.0 || event.latitude() > 90.0) {
            return ValidationResult.reject("latitude out of range");
        }
        if (event.longitude() < -180.0 || event.longitude() > 180.0) {
            return ValidationResult.reject("longitude out of range");
        }
        if (event.altitudeKm() < 0.0) {
            return ValidationResult.reject("altitude cannot be negative");
        }
        if (event.state() == TelemetryState.ANOMALY
                && (event.payloadJson() == null || event.payloadJson().isBlank())) {
            return ValidationResult.reject("anomaly events require payload");
        }
        return ValidationResult.ok();
    }
}
REST resource
Create src/main/java/dev/quarkex/nebulatrack/testdata/resource/SatelliteEventResource.java:
package dev.quarkex.nebulatrack.testdata.resource;

import dev.quarkex.nebulatrack.testdata.model.SatelliteEvent;
import dev.quarkex.nebulatrack.testdata.model.ValidationResult;
import dev.quarkex.nebulatrack.testdata.service.SatelliteEventService;
import jakarta.ws.rs.Consumes;
import jakarta.ws.rs.POST;
import jakarta.ws.rs.Path;
import jakarta.ws.rs.Produces;
import jakarta.ws.rs.core.MediaType;
import jakarta.ws.rs.core.Response;

@Path("/events")
@Consumes(MediaType.APPLICATION_JSON)
@Produces(MediaType.APPLICATION_JSON)
public class SatelliteEventResource {

    private final SatelliteEventService service;

    public SatelliteEventResource(SatelliteEventService service) {
        this.service = service;
    }

    @POST
    public Response ingest(SatelliteEvent event) {
        ValidationResult result = service.validate(event);
        if (!result.valid()) {
            return Response.status(Response.Status.BAD_REQUEST)
                    .entity(result)
                    .build();
        }
        return Response.accepted(event).build();
    }
}
The naive Bob test
Here is the test shape I usually get back on the first pass:
package dev.quarkex.nebulatrack.testdata;

import static org.assertj.core.api.Assertions.assertThat;

import dev.quarkex.nebulatrack.testdata.model.SatelliteEvent;
import dev.quarkex.nebulatrack.testdata.model.TelemetryState;
import dev.quarkex.nebulatrack.testdata.model.ValidationResult;
import dev.quarkex.nebulatrack.testdata.service.SatelliteEventService;
import io.quarkus.test.junit.QuarkusTest;
import jakarta.inject.Inject;
import java.time.Instant;
import org.junit.jupiter.api.Test;

@QuarkusTest
class NaiveAgentSatelliteEventTest {

    @Inject
    SatelliteEventService service;

    @Test
    void acceptsNominalEvent() {
        SatelliteEvent event = new SatelliteEvent(
                "EVT-001",
                "SAT-NE-0001",
                0.0,
                0.0,
                400.0,
                TelemetryState.NOMINAL,
                Instant.parse("2024-01-01T00:00:00Z"),
                "{}");
        ValidationResult result = service.validate(event);
        assertThat(result.valid()).isTrue();
    }
}
Run it:
./mvnw test -Dtest=NaiveAgentSatelliteEventTest
It passes. That is the trap.
The IDs satisfy this validator, but they never change. Coordinates sit at (0, 0) every time. Altitude is always 400.0. Bob did not mock the service. It hardcoded a fixture just plausible enough to compile and pass once.
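To see how little that single fixture reaches, here is a stdlib-only sketch. It is not part of the project: RuleCoverageSketch, its trimmed-down Event record, and the two fixture lists are hypothetical names, and the validate method is a simplified copy of the service rules (no eventId or timestamp, state as a plain string). It collects the distinct outcomes a set of fixtures produces.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Pattern;

// Stdlib-only sketch: shows which validation outcomes a fixture set reaches.
final class RuleCoverageSketch {

    // Hypothetical, simplified stand-in for SatelliteEvent.
    record Event(String satelliteId, double latitude, double longitude,
                 double altitudeKm, String state, String payloadJson) {}

    static final Pattern SATELLITE_ID = Pattern.compile("SAT-[A-Z]{2}-\\d{4}");

    // Returns "ok" or a rejection reason, in the same order of checks as the service.
    static String validate(Event e) {
        if (e.satelliteId() == null || !SATELLITE_ID.matcher(e.satelliteId()).matches()) {
            return "satelliteId must match SAT-XX-0000";
        }
        if (e.latitude() < -90.0 || e.latitude() > 90.0) return "latitude out of range";
        if (e.longitude() < -180.0 || e.longitude() > 180.0) return "longitude out of range";
        if (e.altitudeKm() < 0.0) return "altitude cannot be negative";
        if ("ANOMALY".equals(e.state())
                && (e.payloadJson() == null || e.payloadJson().isBlank())) {
            return "anomaly events require payload";
        }
        return "ok";
    }

    // Collects the distinct outcomes produced by a list of fixtures.
    static Set<String> outcomes(List<Event> fixtures) {
        Set<String> seen = new LinkedHashSet<>();
        for (Event e : fixtures) {
            seen.add(validate(e));
        }
        return seen;
    }

    // The single hardcoded fixture from the naive test, minus the dropped fields.
    static List<Event> naiveFixtures() {
        return List.of(new Event("SAT-NE-0001", 0.0, 0.0, 400.0, "NOMINAL", "{}"));
    }

    // The same baseline with exactly one field changed per case.
    static List<Event> variedFixtures() {
        return List.of(
                new Event("SAT-NE-0001", 0.0, 0.0, 400.0, "NOMINAL", "{}"),
                new Event("SAT-001", 0.0, 0.0, 400.0, "NOMINAL", "{}"),
                new Event("SAT-NE-0001", 90.1, 0.0, 400.0, "NOMINAL", "{}"),
                new Event("SAT-NE-0001", 0.0, -180.1, 400.0, "NOMINAL", "{}"),
                new Event("SAT-NE-0001", 0.0, 0.0, -1.0, "NOMINAL", "{}"),
                new Event("SAT-NE-0001", 0.0, 0.0, 400.0, "ANOMALY", " "));
    }
}
```

The naive list reaches only the "ok" outcome. The varied list reaches all six outcomes of this simplified validator, which is the gap the rest of the walkthrough closes with generated data instead of hand-written rows.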
Why hardcoded fixtures fail quietly
Hardcoded fixtures look safe and easy, but they exercise very little of the validation logic.
Every test reuses the same values, so you can ship green builds and still miss rules you never exercised. Anomaly payloads are a good example. Production finds that gap fast.
The next Bob prompt makes it worse. Ask for “error case tests” and you often get more literals or Mockito stubs instead of wider input. The MSR authors call for mocking guidance in agent configuration files. I read that more broadly: agents need a data generation vocabulary, not just Mockito rules.
Synthetic data breaks that loop. Datafaker fills individual fields with realistic values. Instancio assembles whole object graphs with type-correct randomness. Together they replace "testUser@test.com" and "EVT-001" with inputs that actually exercise validation.
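As a rough picture of what a rule-shaped field value means, the sketch below hand-rolls a generator for the one satellite ID pattern used in this article, using only the JDK. It is an illustration under stated assumptions, not how Datafaker works internally: SatelliteIdSketch and randomSatelliteId(Random) are hypothetical names, and the pattern is copied from the service.

```java
import java.util.Random;
import java.util.regex.Pattern;

// Stdlib-only sketch of a generator whose output always satisfies the ID rule.
final class SatelliteIdSketch {

    // Same pattern the service validates against.
    static final Pattern SATELLITE_ID = Pattern.compile("SAT-[A-Z]{2}-\\d{4}");

    // Builds "SAT-" + two uppercase letters + "-" + four digits from the given Random.
    static String randomSatelliteId(Random random) {
        StringBuilder id = new StringBuilder("SAT-");
        for (int i = 0; i < 2; i++) {
            id.append((char) ('A' + random.nextInt(26)));
        }
        id.append('-');
        for (int i = 0; i < 4; i++) {
            id.append((char) ('0' + random.nextInt(10)));
        }
        return id.toString();
    }
}
```

The generator encodes the production rule, so every value it emits is valid by construction, and each call produces a different ID. A library call does the same job with far less code and for arbitrary patterns.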
Add Datafaker and Instancio
Create a test helper at src/test/java/dev/quarkex/nebulatrack/testdata/support/SatelliteEventModels.java:
package dev.quarkex.nebulatrack.testdata.support;

import static org.instancio.Select.field;

import dev.quarkex.nebulatrack.testdata.model.SatelliteEvent;
import dev.quarkex.nebulatrack.testdata.model.TelemetryState;
import net.datafaker.Faker;
import org.instancio.Instancio;
import org.instancio.Model;

public final class SatelliteEventModels {

    private static final Faker FAKER = new Faker();

    private SatelliteEventModels() {
    }

    public static Model<SatelliteEvent> validEvent() {
        return Instancio.of(SatelliteEvent.class)
                .generate(field(SatelliteEvent::eventId),
                        gen -> gen.string().prefix("EVT-").length(12))
                .supply(field(SatelliteEvent::satelliteId),
                        () -> FAKER.regexify("SAT-[A-Z]{2}-[0-9]{4}"))
                .generate(field(SatelliteEvent::latitude),
                        gen -> gen.doubles().range(-90.0, 90.0))
                .generate(field(SatelliteEvent::longitude),
                        gen -> gen.doubles().range(-180.0, 180.0))
                .generate(field(SatelliteEvent::altitudeKm),
                        gen -> gen.doubles().range(200.0, 35_786.0))
                .generate(field(SatelliteEvent::state),
                        gen -> gen.oneOf(TelemetryState.NOMINAL, TelemetryState.DEGRADED))
                .generate(field(SatelliteEvent::observedAt),
                        gen -> gen.temporal().instant().past())
                .generate(field(SatelliteEvent::payloadJson),
                        gen -> gen.oneOf("{}", "{\"signal\":\"ok\"}"))
                .toModel();
    }

    public static SatelliteEvent anyValidEvent() {
        return Instancio.create(validEvent());
    }

    public static SatelliteEvent anomalyMissingPayload() {
        return Instancio.of(validEvent())
                .set(field(SatelliteEvent::state), TelemetryState.ANOMALY)
                .set(field(SatelliteEvent::payloadJson), " ")
                .create();
    }

    public static SatelliteEvent withNegativeAltitude() {
        return Instancio.of(validEvent())
                .set(field(SatelliteEvent::altitudeKm), -1.0)
                .create();
    }

    public static String randomSatelliteId() {
        return FAKER.regexify("SAT-[A-Z]{2}-[0-9]{4}");
    }
}
Three choices I made here:
Instancio owns the object graph. One Model<SatelliteEvent> is reusable across tests and parameterized runs.
Datafaker handles regex-shaped IDs. gen.text().pattern() treats the pattern as a literal template, not a regular expression. For SAT-[A-Z]{2}-[0-9]{4}, use .supply(field(...), () -> FAKER.regexify(...)) instead.
Domain rules live in generators. Altitude ranges, telemetry states, and ID patterns mirror production constraints instead of "foo".
Replace the naive test with src/test/java/dev/quarkex/nebulatrack/testdata/SatelliteEventServiceTest.java:
package dev.quarkex.nebulatrack.testdata;

import static dev.quarkex.nebulatrack.testdata.support.SatelliteEventModels.anomalyMissingPayload;
import static dev.quarkex.nebulatrack.testdata.support.SatelliteEventModels.anyValidEvent;
import static dev.quarkex.nebulatrack.testdata.support.SatelliteEventModels.withNegativeAltitude;
import static org.assertj.core.api.Assertions.assertThat;

import dev.quarkex.nebulatrack.testdata.model.ValidationResult;
import dev.quarkex.nebulatrack.testdata.service.SatelliteEventService;
import io.quarkus.test.junit.QuarkusTest;
import jakarta.inject.Inject;
import org.junit.jupiter.api.RepeatedTest;
import org.junit.jupiter.api.Test;

@QuarkusTest
class SatelliteEventServiceTest {

    @Inject
    SatelliteEventService service;

    @RepeatedTest(20)
    void acceptsValidSyntheticEvents() {
        ValidationResult result = service.validate(anyValidEvent());
        assertThat(result.valid()).isTrue();
    }

    @Test
    void rejectsAnomalyWithoutPayload() {
        ValidationResult result = service.validate(anomalyMissingPayload());
        assertThat(result.valid()).isFalse();
        assertThat(result.reason()).contains("payload");
    }

    @Test
    void rejectsNegativeAltitude() {
        ValidationResult result = service.validate(withNegativeAltitude());
        assertThat(result.valid()).isFalse();
        assertThat(result.reason()).contains("altitude");
    }
}
Run:
./mvnw test -Dtest=SatelliteEventServiceTest
Twenty repetitions of acceptsValidSyntheticEvents would have been painful with hand-rolled data. With Instancio, repetition is cheap. A random longitude at exactly 180.0 already caught a boundary comparison I had wrong once.
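A random draw will rarely land exactly on a limit, so it can help to pin the inclusive bounds down explicitly. The sketch below is stdlib-only and hypothetical (InclusiveBoundSketch and its methods are not in the project). It reuses the comparison shape from the latitude and longitude checks and uses Math.nextUp and Math.nextDown to find the closest doubles just outside a range.

```java
// Stdlib-only sketch: exact inclusive-boundary checks for a [min, max] range.
final class InclusiveBoundSketch {

    // Same comparison shape as the service: reject only if below min or above max.
    static boolean inRange(double value, double min, double max) {
        return !(value < min || value > max);
    }

    // Closest representable double below an inclusive lower bound.
    static double justBelow(double min) {
        return Math.nextDown(min);
    }

    // Closest representable double above an inclusive upper bound.
    static double justAbove(double max) {
        return Math.nextUp(max);
    }
}
```

With this shape, 180.0 and -180.0 are accepted and their immediate neighbours outside the range are rejected. One edge this comparison does not catch is Double.NaN: both comparisons evaluate to false, so inRange returns true. Whether that matters depends on how events reach the service, but it is the kind of case a boundary test can make visible.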
Teach Bob with a skill
Most Datafaker tutorials stop at the library setup. I care about the part after that.
Bob only uses what it can load. Put the rules in a project skill:
.bob/
  commands/
    add-event-tests.md
  skills/
    synthetic-test-data/
      SKILL.md
      references/
        domain-formats.md
        forbidden-literals.md
Put the workflow in .bob/skills/synthetic-test-data/SKILL.md:
---
name: synthetic-test-data
description: Generate Quarkus test data for NebulaTrack satellite events using Instancio models, Datafaker, and domain-specific formats instead of hardcoded literals
---
Use this skill when writing or modifying tests for `SatelliteEventService`, `SatelliteEventResource`, or related NebulaTrack telemetry code in this module.
Read [domain formats](references/domain-formats.md) and [forbidden literals](references/forbidden-literals.md) before generating fixtures.
Working rules:
1. Reuse factories in `src/test/java/dev/quarkex/nebulatrack/testdata/support/SatelliteEventModels.java` before inventing new object setup.
2. Use **Instancio** for domain objects and reusable `Model<T>` templates.
3. Use **Datafaker** for standalone formatted strings when Instancio field generators are not enough.
4. Prefer `@RepeatedTest` or `@ParameterizedTest` with Instancio models over duplicating near-identical constructors.
5. Do **not** add Mockito stubs when real domain objects can exercise validation logic.
6. When a random test fails, reproduce it with `Instancio.of(model).withSeed(<seed>).create()` and leave the seed in a comment or `@Seed` annotation.
Push the domain detail into references/domain-formats.md: the satellite ID regex, coordinate ranges, the anomaly payload rule, and the Instancio .supply() + Datafaker regexify() note.
Add a thin slash command at .bob/commands/add-event-tests.md:
---
description: Add or extend SatelliteEvent tests using synthetic data rules
argument-hint: <service-or-resource-and-goal>
---
Use the `synthetic-test-data` skill for: $ARGUMENTS
Reuse `SatelliteEventModels` before writing new fixtures.
Do not hardcode placeholder IDs like `SAT-001` or `EVT-001`.
Run `./mvnw test` when you finish.
After the skill is in the repo, give Bob a follow-up prompt in a fresh session:
Add tests for invalid satellite IDs and coordinate edge cases for SatelliteEventService.
The output changes. Instead of another "EVT-001" negative test, Bob imports SatelliteEventModels, uses Instancio.of(validEvent()).set(...), and adds parameterized cases.
You can put the same contract in repo-root AGENTS.md if your team also uses Cursor or Claude Code. Here, Bob is the system under test.
Parameterized boundary tests
Repetition on the happy path helps. It does not replace explicit edge cases.
Create src/test/java/dev/quarkex/nebulatrack/testdata/SatelliteEventBoundaryTest.java:
package dev.quarkex.nebulatrack.testdata;

import static dev.quarkex.nebulatrack.testdata.support.SatelliteEventModels.validEvent;
import static org.assertj.core.api.Assertions.assertThat;
import static org.instancio.Select.field;

import dev.quarkex.nebulatrack.testdata.model.SatelliteEvent;
import dev.quarkex.nebulatrack.testdata.model.ValidationResult;
import dev.quarkex.nebulatrack.testdata.service.SatelliteEventService;
import io.quarkus.test.junit.QuarkusTest;
import jakarta.inject.Inject;
import org.instancio.Instancio;
import org.instancio.junit.InstancioExtension;
import org.junit.jupiter.api.extension.ExtendWith;
import org.junit.jupiter.params.ParameterizedTest;
import org.junit.jupiter.params.provider.ValueSource;

@QuarkusTest
@ExtendWith(InstancioExtension.class)
class SatelliteEventBoundaryTest {

    @Inject
    SatelliteEventService service;

    @ParameterizedTest
    @ValueSource(doubles = {-90.1, 90.1})
    void rejectsOutOfRangeLatitude(double latitude) {
        SatelliteEvent event = Instancio.of(validEvent())
                .set(field(SatelliteEvent::latitude), latitude)
                .create();
        ValidationResult result = service.validate(event);
        assertThat(result.valid()).isFalse();
        assertThat(result.reason()).contains("latitude");
    }

    @ParameterizedTest
    @ValueSource(doubles = {-180.1, 180.1})
    void rejectsOutOfRangeLongitude(double longitude) {
        SatelliteEvent event = Instancio.of(validEvent())
                .set(field(SatelliteEvent::longitude), longitude)
                .create();
        ValidationResult result = service.validate(event);
        assertThat(result.valid()).isFalse();
        assertThat(result.reason()).contains("longitude");
    }

    @ParameterizedTest
    @ValueSource(strings = {"SAT-001", "SAT-NE-42", "NE-1042", ""})
    void rejectsMalformedSatelliteIds(String satelliteId) {
        SatelliteEvent event = Instancio.of(validEvent())
                .set(field(SatelliteEvent::satelliteId), satelliteId)
                .create();
        ValidationResult result = service.validate(event);
        assertThat(result.valid()).isFalse();
        assertThat(result.reason()).contains("satelliteId");
    }
}
Instancio gives you a valid baseline in one line. Each parameterized case overrides exactly the field under test, so you avoid copy-pasting twelve nearly identical constructors. That copy-paste pattern is the agent output I see most often.
Run:
./mvnw test -Dtest=SatelliteEventBoundaryTest
Reproduce random failures with seeds
Randomized tests help until one fails on CI at 2 a.m. Instancio logs the seed it used. You can also pin one up front:
SatelliteEvent event = Instancio.of(validEvent())
        .withSeed(948_221_337L)
        .create();
When @RepeatedTest or a large parameterized run fails, copy the seed from the failure output, fix the bug, and leave the seeded test in place as a regression guard:
package dev.quarkex.nebulatrack.testdata;

import static dev.quarkex.nebulatrack.testdata.support.SatelliteEventModels.validEvent;
import static org.assertj.core.api.Assertions.assertThat;

import dev.quarkex.nebulatrack.testdata.model.SatelliteEvent;
import dev.quarkex.nebulatrack.testdata.model.ValidationResult;
import dev.quarkex.nebulatrack.testdata.service.SatelliteEventService;
import io.quarkus.test.junit.QuarkusTest;
import jakarta.inject.Inject;
import org.instancio.Instancio;
import org.junit.jupiter.api.Test;

@QuarkusTest
class SeededRegressionTest {

    private static final long KNOWN_GOOD_SEED = 948_221_337L;

    @Inject
    SatelliteEventService service;

    @Test
    void seededEventPassesValidation() {
        SatelliteEvent event = Instancio.of(validEvent())
                .withSeed(KNOWN_GOOD_SEED)
                .create();
        ValidationResult result = service.validate(event);
        assertThat(result.valid()).isTrue();
    }
}
With the JUnit extension, @Seed(948221337) on a test method works too. Once randomized data is in the suite, I keep a pinned seed around. It turns a random failure into something you can rerun on Monday morning.
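The reason a seed makes a failure replayable can be shown with the JDK alone. The sketch below is a hypothetical stand-in (SeedReplaySketch and its Sample record are not in the project, and it is not how Instancio generates values internally). It draws three fields, in the ranges used by the factory, from a single java.util.Random.

```java
import java.util.Random;

// Stdlib-only sketch: one seeded Random fully determines the generated sample.
final class SeedReplaySketch {

    // Hypothetical subset of the event fields.
    record Sample(double latitude, double longitude, double altitudeKm) {}

    // Every field comes from the one Random passed in, in a fixed order.
    static Sample sample(Random random) {
        double latitude = -90.0 + random.nextDouble() * 180.0;
        double longitude = -180.0 + random.nextDouble() * 360.0;
        double altitudeKm = 200.0 + random.nextDouble() * (35_786.0 - 200.0);
        return new Sample(latitude, longitude, altitudeKm);
    }

    // Rebuilds the exact sample that a given seed produced.
    static Sample replay(long seed) {
        return sample(new Random(seed));
    }
}
```

Two calls to replay with the same seed return equal samples, which is what makes a failing input recoverable. The guarantee only covers values drawn from the seeded source. One caveat worth checking against the Instancio and Datafaker documentation: as far as I understand, a plain Supplier passed to .supply() and a static new Faker() use their own randomness rather than the Instancio seed, so the satelliteId from the factory may differ between two runs with the same seed even though the other fields repeat.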
Run:
./mvnw test -Dtest=SeededRegressionTest
Prove the REST path
Service-level tests cover validation logic. They do not prove HTTP status codes, JSON bodies, or serialization quirks. Add src/test/java/dev/quarkex/nebulatrack/testdata/SatelliteEventResourceTest.java:
package dev.quarkex.nebulatrack.testdata;

import static dev.quarkex.nebulatrack.testdata.support.SatelliteEventModels.anomalyMissingPayload;
import static dev.quarkex.nebulatrack.testdata.support.SatelliteEventModels.anyValidEvent;
import static io.restassured.RestAssured.given;
import static org.hamcrest.CoreMatchers.is;

import dev.quarkex.nebulatrack.testdata.model.SatelliteEvent;
import io.quarkus.test.junit.QuarkusTest;
import org.junit.jupiter.api.Test;

@QuarkusTest
class SatelliteEventResourceTest {

    @Test
    void acceptsValidEvent() {
        SatelliteEvent event = anyValidEvent();
        given()
                .contentType("application/json")
                .body(event)
        .when()
                .post("/events")
        .then()
                .statusCode(202)
                .body("satelliteId", is(event.satelliteId()));
    }

    @Test
    void rejectsAnomalyWithoutPayload() {
        SatelliteEvent event = anomalyMissingPayload();
        given()
                .contentType("application/json")
                .body(event)
        .when()
                .post("/events")
        .then()
                .statusCode(400)
                .body("valid", is(false))
                .body("reason", is("anomaly events require payload"));
    }
}
Run:
./mvnw test -Dtest=SatelliteEventResourceTest
Then run the full suite:
./mvnw test
All tests should pass.
What changed
We started with Bob output that looked fine and tested almost nothing interesting. We ended with reusable Instancio models, Datafaker for regex-shaped IDs, a Bob skill so the next scaffold does not regress, parameterized boundary tests, seeded reproducers, and REST Assured proof that validation survives JSON serialization.
The libraries matter. The Bob skill matters more. Agents will keep generating "EVT-001" until you tell them, in repo-native context, what good NebulaTrack data looks like and where the factories live.
That is the workflow I want: let Bob move fast, encode the data rules once, and make every follow-up test more useful than the last one.
The sample code for this article lives in my GitHub repository for you to explore and play around with.


