Quarkus Repository Guardrails for Agent-Generated Java Code
Use Semgrep, safer query defaults, local hooks, and CI to reduce low-cost security mistakes before they reach code review.
Better models help, but they still ship familiar mistakes.
Agentic coding tools are good at producing a diff that compiles, passes the happy path, and looks finished in review. They are less reliable with the dull rules that keep Java code out of incident review. A native query built with string concatenation can still land in the diff. So can a hardcoded token in a helper class.
For that kind of drift, I still want deterministic guardrails. So let’s build a small Quarkus service called VaultFlow, run PostgreSQL through Dev Services, add a Semgrep rule that blocks one bad query shape, and wire that rule into the same loop the agent uses: local hooks, repo instructions, and CI.
What we build
VaultFlow stores document metadata for a compliance team. It has three endpoints:
POST /documents
GET /documents/{externalId}
GET /documents/search?ownerEmail=...
The app uses Quarkus 3.36.2, Java 25, Hibernate ORM with Panache, and PostgreSQL Dev Services. For the guardrail side, we only need a few pieces:
a custom Semgrep rule in .semgrep/vaultflow-rules.yaml
a local pre-commit configuration
a GitHub Actions workflow
repo-native instructions in AGENTS.md
By the end we have a working Quarkus app, tests that prove the behavior, and a guardrail for one of the easiest agent mistakes to ship.
Prerequisites
You need a current JDK, a container runtime for Dev Services, and Python if you want to run pre-commit locally. We add Semgrep later.
JDK 25
Docker or Podman
Maven 3.9+ or the generated Maven wrapper
Python 3.10+ for pre-commit
Create the project
Use Quarkus 3.36.2 and Java 25. Create the project or follow along from my GitHub repository:
quarkus create app dev.morling.mainthread:vaultflow \
    --extension='rest-jackson,hibernate-orm-panache,jdbc-postgresql' \
    --java=25 \
    --no-code
Use these extensions:
quarkus-rest-jackson exposes JSON REST endpoints
quarkus-hibernate-orm-panache gives us an idiomatic repository layer without a pile of boilerplate
quarkus-jdbc-postgresql enables JDBC and starts PostgreSQL automatically in dev and test mode through Dev Services
Configure PostgreSQL Dev Services
Open src/main/resources/application.properties:
quarkus.datasource.db-kind=postgresql
quarkus.hibernate-orm.schema-management.strategy=drop-and-create
db-kind=postgresql tells Quarkus which Dev Service to start. drop-and-create recreates the schema on each dev or test boot. That keeps this example short. In production it would drop the schema at startup, so you would replace it with real migrations.
Use import.sql for the seed data. Quarkus loads it automatically in dev and test mode, which is enough here. We do not need a startup observer or repository calls just to insert three rows.
StoredDocument extends PanacheEntity, so Hibernate generates the id. The seed rows need to read IDs from the generated sequence instead of leaving id null.
Create src/main/resources/import.sql:
insert into stored_documents (id, external_id, owner_email, title, storage_key, checksum)
select nextval(sequence_name::regclass), 'DOC-1000', 'legal@parchment.example', 'Export declaration for batch 1000', 'docs/2026/06/DOC-1000.pdf', 'sha256:0b74fd3a6f4f7002'
from information_schema.sequences
where sequence_schema = current_schema()
order by sequence_name
fetch first 1 row only;
insert into stored_documents (id, external_id, owner_email, title, storage_key, checksum)
select nextval(sequence_name::regclass), 'DOC-1001', 'legal@parchment.example', 'Supplier invoice for route correction', 'docs/2026/06/DOC-1001.pdf', 'sha256:4c4cd9314b32300c'
from information_schema.sequences
where sequence_schema = current_schema()
order by sequence_name
fetch first 1 row only;
insert into stored_documents (id, external_id, owner_email, title, storage_key, checksum)
select nextval(sequence_name::regclass), 'DOC-2000', 'compliance@parchment.example', 'Retention hold notice for shipment 2000', 'docs/2026/06/DOC-2000.pdf', 'sha256:8d9f4401c0c9dd11'
from information_schema.sequences
where sequence_schema = current_schema()
order by sequence_name
fetch first 1 row only;
I prefer this for demo data. The file is smaller, easier to read, and easier to replace later with real migrations.
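The seed statements call nextval directly, so the IDs they produce depend on how the sequence is defined. The sketch below is plain Java that imitates a sequence with a start value and an increment. The increment of 50 is my assumption, based on Hibernate's default allocation size for generated sequences; the article does not state it, although the ids 1 and 51 in the sample output later are consistent with it. `SequenceSketch` is a hypothetical name and not part of VaultFlow.

```java
// Hypothetical illustration only: imitates how a database sequence hands out values.
final class SequenceSketch {

    private final long increment;
    private long next;

    SequenceSketch(long start, long increment) {
        this.next = start;
        this.increment = increment;
    }

    // Returns the current value and advances by the increment, like nextval().
    long nextval() {
        long value = next;
        next += increment;
        return value;
    }

    public static void main(String[] args) {
        // Assumed definition: start with 1, increment by 50.
        SequenceSketch sequence = new SequenceSketch(1, 50);
        System.out.println(sequence.nextval()); // first seed row
        System.out.println(sequence.nextval()); // second seed row
        System.out.println(sequence.nextval()); // third seed row
    }
}
```

If the assumption holds, the three seed rows get widely spaced ids instead of 1, 2, 3. That is harmless for demo data, but it is worth knowing before you write a test that asserts on a specific id.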
Model the document record
We need one entity, two small DTOs, and one repository. I use the repository pattern here because the article is about where query code lives.
Create src/main/java/dev/morling/mainthread/vaultflow/StoredDocument.java:
package dev.morling.mainthread.vaultflow;

import io.quarkus.hibernate.orm.panache.PanacheEntity;
import jakarta.persistence.Column;
import jakarta.persistence.Entity;
import jakarta.persistence.Table;

@Entity
@Table(name = "stored_documents")
public class StoredDocument extends PanacheEntity {

    @Column(name = "external_id", nullable = false, unique = true, length = 64)
    public String externalId;

    @Column(name = "owner_email", nullable = false, length = 256)
    public String ownerEmail;

    @Column(nullable = false, length = 256)
    public String title;

    @Column(name = "storage_key", nullable = false, length = 512)
    public String storageKey;

    @Column(nullable = false, length = 128)
    public String checksum;
}
Create CreateDocumentRequest.java and DocumentResponse.java in the same package:
package dev.morling.mainthread.vaultflow;

public record CreateDocumentRequest(
        String externalId,
        String ownerEmail,
        String title,
        String storageKey,
        String checksum) {
}
package dev.morling.mainthread.vaultflow;

public record DocumentResponse(
        Long id,
        String externalId,
        String ownerEmail,
        String title,
        String storageKey,
        String checksum) {
}
Now the repository:
package dev.morling.mainthread.vaultflow;

import java.util.List;
import java.util.Optional;

import io.quarkus.hibernate.orm.panache.PanacheRepository;
import jakarta.enterprise.context.ApplicationScoped;

@ApplicationScoped
public class StoredDocumentRepository implements PanacheRepository<StoredDocument> {

    public Optional<StoredDocument> findByExternalId(String externalId) {
        return find("externalId", externalId).firstResultOptional();
    }

    public List<StoredDocument> findByOwnerEmail(String ownerEmail) {
        return find("ownerEmail", ownerEmail).list();
    }
}
Keep that method simple for now. We come back to it when we look at the unsafe version an agent can generate under pressure.
Add the service boundary
The write path belongs behind a transaction. The REST resource should not decide how persistence works.
Create DocumentService.java:
package dev.morling.mainthread.vaultflow;

import java.util.List;

import jakarta.enterprise.context.ApplicationScoped;
import jakarta.inject.Inject;
import jakarta.transaction.Transactional;

@ApplicationScoped
public class DocumentService {

    @Inject
    StoredDocumentRepository repository;

    @Transactional
    public DocumentResponse create(CreateDocumentRequest request) {
        repository.findByExternalId(request.externalId())
                .ifPresent(existing -> {
                    throw new DuplicateDocumentException(request.externalId());
                });

        StoredDocument document = new StoredDocument();
        document.externalId = request.externalId();
        document.ownerEmail = request.ownerEmail();
        document.title = request.title();
        document.storageKey = request.storageKey();
        document.checksum = request.checksum();
        repository.persist(document);
        return toResponse(document);
    }

    public DocumentResponse getByExternalId(String externalId) {
        StoredDocument document = repository.findByExternalId(externalId)
                .orElseThrow(() -> new DocumentNotFoundException(externalId));
        return toResponse(document);
    }

    public List<DocumentResponse> searchByOwnerEmail(String ownerEmail) {
        return repository.findByOwnerEmail(ownerEmail).stream()
                .map(DocumentService::toResponse)
                .toList();
    }

    private static DocumentResponse toResponse(StoredDocument document) {
        return new DocumentResponse(
                document.id,
                document.externalId,
                document.ownerEmail,
                document.title,
                document.storageKey,
                document.checksum);
    }
}
This service does not use EntityManager, build SQL strings, or put query logic in the REST layer. Query code lives only in the repository, which keeps the easy mistakes in one place.
Add the two exception types:
package dev.morling.mainthread.vaultflow;

public class DuplicateDocumentException extends RuntimeException {

    public DuplicateDocumentException(String externalId) {
        super("Document " + externalId + " already exists");
    }
}
package dev.morling.mainthread.vaultflow;

public class DocumentNotFoundException extends RuntimeException {

    public DocumentNotFoundException(String externalId) {
        super("Document " + externalId + " was not found");
    }
}
Expose the REST API
Now add the three endpoints. Quarkus REST keeps the resource class small, which is all we need here. What matters is the data boundary and the guardrail.
Create DocumentResource.java:
package dev.morling.mainthread.vaultflow;

import java.net.URI;
import java.util.List;
import java.util.Map;

import jakarta.inject.Inject;
import jakarta.ws.rs.Consumes;
import jakarta.ws.rs.GET;
import jakarta.ws.rs.POST;
import jakarta.ws.rs.Path;
import jakarta.ws.rs.PathParam;
import jakarta.ws.rs.Produces;
import jakarta.ws.rs.QueryParam;
import jakarta.ws.rs.core.MediaType;
import jakarta.ws.rs.core.Response;

import org.jboss.resteasy.reactive.RestResponse;
import org.jboss.resteasy.reactive.server.ServerExceptionMapper;

@Path("/documents")
@Consumes(MediaType.APPLICATION_JSON)
@Produces(MediaType.APPLICATION_JSON)
public class DocumentResource {

    @Inject
    DocumentService service;

    @POST
    public Response create(CreateDocumentRequest request) {
        DocumentResponse response = service.create(request);
        return Response.created(URI.create("/documents/" + response.externalId()))
                .entity(response)
                .build();
    }

    @GET
    @Path("/{externalId}")
    public DocumentResponse getByExternalId(@PathParam("externalId") String externalId) {
        return service.getByExternalId(externalId);
    }

    @GET
    @Path("/search")
    public List<DocumentResponse> searchByOwnerEmail(@QueryParam("ownerEmail") String ownerEmail) {
        return service.searchByOwnerEmail(ownerEmail);
    }

    @ServerExceptionMapper
    public RestResponse<Map<String, String>> mapDuplicate(DuplicateDocumentException e) {
        return RestResponse.status(Response.Status.CONFLICT, Map.of("error", e.getMessage()));
    }

    @ServerExceptionMapper
    public RestResponse<Map<String, String>> mapNotFound(DocumentNotFoundException e) {
        return RestResponse.status(Response.Status.NOT_FOUND, Map.of("error", e.getMessage()));
    }
}
We can now create, read, and search documents. The search path is still safe because it stays inside a Panache finder.
Start the app and prove the baseline
Run dev mode:
./mvnw quarkus:dev
The seeded data has two legal documents and one compliance document, so the search should return two rows.
Test it:
curl -s "http://localhost:8080/documents/search?ownerEmail=legal@parchment.example" | jq
Expected output:
[
  {
    "id": 1,
    "externalId": "DOC-1000",
    "ownerEmail": "legal@parchment.example",
    "title": "Export declaration for batch 1000",
    "storageKey": "docs/2026/06/DOC-1000.pdf",
    "checksum": "sha256:0b74fd3a6f4f7002"
  },
  {
    "id": 51,
    "externalId": "DOC-1001",
    "ownerEmail": "legal@parchment.example",
    "title": "Supplier invoice for route correction",
    "storageKey": "docs/2026/06/DOC-1001.pdf",
    "checksum": "sha256:4c4cd9314b32300c"
  }
]
Create one more document:
curl -i \
  -H "Content-Type: application/json" \
  -d '{
    "externalId": "DOC-3000",
    "ownerEmail": "ops@parchment.example",
    "title": "Late customs memo for route 3000",
    "storageKey": "docs/2026/06/DOC-3000.pdf",
    "checksum": "sha256:33bbca6ab4b22000"
  }' \
  http://localhost:8080/documents
Expected output starts like this:
HTTP/1.1 201 Created
Content-Type: application/json;charset=UTF-8
content-length: 192
Location: http://localhost:8080/documents/DOC-3000
The app is real now. The security rule we add next attaches to code that already does useful work.
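If you would rather call the search endpoint from Java than from curl, a minimal sketch with the JDK's built-in HTTP client could look like this. `SearchRequestDemo` and `searchByOwner` are hypothetical names, and the base URL assumes the dev-mode default shown in the curl commands. The sketch only builds the request; it does not send it.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;

// Hypothetical client-side helper, not part of the VaultFlow service.
final class SearchRequestDemo {

    // Builds a GET request for /documents/search with an encoded ownerEmail value.
    static HttpRequest searchByOwner(String baseUrl, String ownerEmail) {
        // Encode the value so characters such as '@' or '&' cannot alter the query string.
        String encoded = URLEncoder.encode(ownerEmail, StandardCharsets.UTF_8);
        return HttpRequest.newBuilder(URI.create(baseUrl + "/documents/search?ownerEmail=" + encoded))
                .header("Accept", "application/json")
                .GET()
                .build();
    }

    public static void main(String[] args) {
        HttpRequest request = searchByOwner("http://localhost:8080", "legal@parchment.example");
        System.out.println(request.method() + " " + request.uri());
    }
}
```

With the app running in dev mode, you could send the request with HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString()) and compare the body with the jq output above.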
Add the Semgrep rule that blocks the bad query shape
If you ask an agent for “a quick email search”, this is the sort of code it may generate in StoredDocumentRepository:
public List<StoredDocument> findByOwnerEmail(String ownerEmail) {
    return entityManager.createNativeQuery(
            "select * from stored_documents where owner_email = '" + ownerEmail + "'",
            StoredDocument.class)
            .getResultList();
}
It looks normal in a diff. It is still a policy violation, and because ownerEmail is spliced straight into the SQL text, it is also a textbook SQL injection risk.
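To make the problem concrete, here is a small sketch that only builds strings and never touches a database. It shows what the concatenated query text becomes for a hostile input, next to a parameterized shape where the SQL text stays constant and the value travels separately. `SqlShapeDemo`, `BoundQuery`, and the `:ownerEmail` placeholder name are hypothetical and not part of VaultFlow.

```java
import java.util.Map;

// Hypothetical demo class: string building only, no JPA and no database.
final class SqlShapeDemo {

    // SQL text plus separately carried values, as a stand-in for a parameterized query.
    record BoundQuery(String sql, Map<String, Object> parameters) {}

    // Same shape as the unsafe repository method: the input becomes part of the SQL text.
    static String concatenated(String ownerEmail) {
        return "select * from stored_documents where owner_email = '" + ownerEmail + "'";
    }

    // Parameterized shape: the SQL text never changes, whatever the input is.
    static BoundQuery parameterized(String ownerEmail) {
        return new BoundQuery(
                "select * from stored_documents where owner_email = :ownerEmail",
                Map.of("ownerEmail", ownerEmail));
    }

    public static void main(String[] args) {
        String hostile = "x' or '1'='1";
        // The where clause now contains an extra condition that is always true.
        System.out.println(concatenated(hostile));
        // The SQL text is unchanged; the hostile string is just a value.
        System.out.println(parameterized(hostile).sql());
    }
}
```

With the hostile input, the concatenated text ends in owner_email = 'x' or '1'='1', which would match every row. The parameterized text is identical for every caller, which is the property the guardrail is trying to protect.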
Create .semgrep/vaultflow-rules.yaml:
rules:
  - id: vaultflow-native-query-concatenation
    patterns:
      - pattern: $EM.createNativeQuery($QUERY + ..., ...)
      - pattern-not: $EM.createNativeQuery("...", ...)
    message: >
      Native query built with string concatenation in createNativeQuery().
      Use a Panache finder or a parameterized query instead.
    languages: [java]
    severity: ERROR
    metadata:
      category: security
      technology: [quarkus, hibernate]
  - id: vaultflow-hardcoded-secret
    patterns:
      - pattern: String $NAME = $VALUE;
      - metavariable-regex:
          metavariable: $NAME
          regex: (?i).*(secret|token|password|apikey|api_key).*
    message: >
      Possible hardcoded secret in $NAME. Move it to configuration before it lands in git history.
    languages: [java]
    severity: ERROR
    metadata:
      category: security
The first rule is the one we care about here. It does not try to prove exploitability. It encodes a repository rule: no concatenated native queries.
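The second rule is blunter, because it decides purely on the variable name. The sketch below runs the same regex through java.util.regex to show which names it would flag. Semgrep evaluates the regex with its own engine, so treat this as an approximation for this simple pattern; `SecretNameCheck` and the sample names are hypothetical.

```java
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper for reasoning about the rule; it is not how Semgrep itself runs.
final class SecretNameCheck {

    // Regex copied from the metavariable-regex of vaultflow-hardcoded-secret.
    static final Pattern SECRET_NAME =
            Pattern.compile("(?i).*(secret|token|password|apikey|api_key).*");

    // True when the whole variable name matches the rule's regex.
    static boolean flagged(String variableName) {
        return SECRET_NAME.matcher(variableName).matches();
    }

    public static void main(String[] args) {
        for (String name : List.of("apiToken", "DB_PASSWORD", "tokenizerName", "ownerEmail")) {
            System.out.println(name + " -> " + flagged(name));
        }
    }
}
```

Names such as apiToken and DB_PASSWORD match, ownerEmail does not, and tokenizerName matches even though it is probably harmless. Since the pattern String $NAME = $VALUE; also accepts any value, not only string literals, expect some false positives and tune the regex to your codebase.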
The fix is already in our real code:
public List<StoredDocument> findByOwnerEmail(String ownerEmail) {
    return find("ownerEmail", ownerEmail).list();
}
That split is the whole point. Semgrep blocks the bad shape. Quarkus and Panache already give us a safer path.
Put Semgrep in the local loop
pre-commit is not part of Quarkus and not required to run the app. It is just a small CLI that can manage local Git hooks.
If you want that local hook on macOS, install the pre-commit CLI with Homebrew:
brew install pre-commit
This command installs the pre-commit executable. It does not install the Git hook in this repository yet.
Create .pre-commit-config.yaml:
repos:
  - repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v5.0.0
    hooks:
      - id: detect-private-key
      - id: check-merge-conflict
      - id: no-commit-to-branch
        args: ["--branch", "main"]
  - repo: https://github.com/semgrep/pre-commit
    rev: v1.166.0
    hooks:
      - id: semgrep
        name: semgrep-vaultflow
        args:
          - "--config"
          - "p/java"
          - "--config"
          - "p/secrets"
          - "--config"
          - ".semgrep/vaultflow-rules.yaml"
          - "--error"
          - "--skip-unknown-extensions"
          - "--quiet"
          - "--disable-version-check"
Then install the Git hook for this repository:
pre-commit install
This command writes the repo’s .git/hooks/pre-commit hook so Git runs the configured checks before each commit.
The agent and the human developer now hit the same boundary. A git commit with the unsafe native query should fail before the diff goes any further.
Put the same rule in the agent instructions
Open AGENTS.md and add a project-specific section:
## Project Security Guardrails
- Run Semgrep before committing changes in this project.
- Treat `ERROR` findings from Semgrep as blockers.
- Prefer Panache finders or parameterized queries. Do not build `createNativeQuery()` calls with string concatenation.
- Do not add `nosemgrep` suppressions (`// nosemgrep` in Java) without a short security rationale next to the suppression.
AGENTS.md does not enforce anything by itself. It tells the agent what the local hook and CI already enforce. That is still useful because clear repo rules save review time.
Add the CI check
Create .github/workflows/semgrep.yml:
name: Semgrep Security Scan
on:
  pull_request:
  push:
    branches: [main]
jobs:
  semgrep:
    runs-on: ubuntu-latest
    container:
      image: semgrep/semgrep
    steps:
      - uses: actions/checkout@v6.0.3
        with:
          fetch-depth: 0
      - name: Run Semgrep
        run: |
          semgrep scan \
            --config p/java \
            --config p/secrets \
            --config .semgrep/vaultflow-rules.yaml \
            --error \
            src/main/java
CI is the merge boundary. The local hook gives fast feedback. CI decides whether a skipped hook becomes a merge problem.
Prove the Quarkus side with tests
Add src/test/java/dev/morling/mainthread/vaultflow/DocumentResourceTest.java:
package dev.morling.mainthread.vaultflow;

import static io.restassured.RestAssured.given;
import static org.hamcrest.Matchers.endsWith;
import static org.hamcrest.Matchers.equalTo;
import static org.hamcrest.Matchers.hasSize;

import io.quarkus.test.common.http.TestHTTPEndpoint;
import io.quarkus.test.junit.QuarkusTest;
import io.restassured.http.ContentType;
import org.junit.jupiter.api.Test;

@QuarkusTest
@TestHTTPEndpoint(DocumentResource.class)
class DocumentResourceTest {

    @Test
    void shouldReturnSeededDocumentsForOwner() {
        given()
                .queryParam("ownerEmail", "legal@parchment.example")
                .when()
                .get("/search")
                .then()
                .statusCode(200)
                .body("$", hasSize(2))
                .body("[0].ownerEmail", equalTo("legal@parchment.example"));
    }

    @Test
    void shouldCreateAndReadDocument() {
        String payload = """
                {
                  "externalId": "DOC-3000",
                  "ownerEmail": "ops@parchment.example",
                  "title": "Late customs memo for route 3000",
                  "storageKey": "docs/2026/06/DOC-3000.pdf",
                  "checksum": "sha256:33bbca6ab4b22000"
                }
                """;

        given()
                .contentType(ContentType.JSON)
                .body(payload)
                .when()
                .post()
                .then()
                .statusCode(201)
                .header("Location", endsWith("/documents/DOC-3000"))
                .body("externalId", equalTo("DOC-3000"))
                .body("ownerEmail", equalTo("ops@parchment.example"));

        given()
                .when()
                .get("/DOC-3000")
                .then()
                .statusCode(200)
                .body("title", equalTo("Late customs memo for route 3000"))
                .body("storageKey", equalTo("docs/2026/06/DOC-3000.pdf"));
    }

    @Test
    void shouldRejectDuplicateExternalId() {
        String payload = """
                {
                  "externalId": "DOC-1000",
                  "ownerEmail": "legal@parchment.example",
                  "title": "Duplicate seed document",
                  "storageKey": "docs/2026/06/DOC-1000-duplicate.pdf",
                  "checksum": "sha256:9c406998f735af15"
                }
                """;

        given()
                .contentType(ContentType.JSON)
                .body(payload)
                .when()
                .post()
                .then()
                .statusCode(409)
                .body("error", equalTo("Document DOC-1000 already exists"));
    }
}
Run the tests:
./mvnw test
Expected result:
[INFO] Tests run: 3, Failures: 0, Errors: 0, Skipped: 0
These tests are not about Semgrep. They prove the Quarkus app works before and after the guardrail discussion.
What survives production and what does not
This app path is small because I only want to show where Semgrep fits in. The repository uses Panache, the write boundary is transactional, and Dev Services keeps local setup cheap.
None of this replaces review. Semgrep does not understand business intent. The agent instructions do not enforce policy. A coding agent can still generate a bad change, and a developer can still bypass a local hook. CI exists because of that.
Use this approach:
Quarkus gives you a good default path
Semgrep blocks specific bad shapes
pre-commit makes the feedback immediate
CI turns the same rule into merge policy
AGENTS.md keeps the agent aimed at the safe path
That is enough to catch cheap mistakes before they turn into expensive ones.
Conclusion
We built a small Quarkus document service, proved the API with tests, and added one deterministic guardrail for a common agent failure: concatenated native queries. Semgrep is not smarter than the agent. It is just faster, cheaper, and more stubborn about one rule we care about here. Use this as a baseline to make your agent's output more reliable and easier to trust.


