Quarkus Repository Guardrails for Agent-Generated Java Code

Use Semgrep, safer query defaults, local hooks, and CI to reduce low-cost security mistakes before they reach code review.

Jun 24, 2026

Better models help, but they still ship familiar mistakes.

Agentic coding tools are good at producing a diff that compiles, passes the happy path, and looks finished in review. They are less reliable with the dull rules that keep Java code out of incident review. A native query built with string concatenation can still land in the diff. So can a hardcoded token in a helper class.

For that kind of drift, I still want deterministic guardrails. So let’s build a small Quarkus service called VaultFlow, run PostgreSQL through Dev Services, add a Semgrep rule that blocks one bad query shape, and wire that rule into the same loop the agent uses: local hooks, repo instructions, and CI.

What we build

VaultFlow stores document metadata for a compliance team. It has three endpoints:

POST /documents
GET /documents/{externalId}
GET /documents/search?ownerEmail=...

The app uses Quarkus 3.36.2, Java 25, Hibernate ORM with Panache, and PostgreSQL Dev Services. For the guardrail side, we only need a few pieces:

a custom Semgrep rule in .semgrep/vaultflow-rules.yaml
a local pre-commit configuration
a GitHub Actions workflow
repo-native instructions in AGENTS.md

By the end we have a working Quarkus app, tests that prove the behavior, and a guardrail for one of the easiest agent mistakes to ship.

Prerequisites

You need a current JDK, a container runtime for Dev Services, and Python if you want to run pre-commit locally. We add Semgrep later.

JDK 25
Docker or Podman
Maven 3.9+ or the generated Maven wrapper
Python 3.10+ for pre-commit
About ☕️☕️

Create the project

Use Quarkus 3.36.2 and Java 25. Create the project or follow along from my GitHub repository:

quarkus create app dev.morling.mainthread:vaultflow \
  --extension='rest-jackson,hibernate-orm-panache,jdbc-postgresql' \
  --java=25 \
  --no-code

The command pulls in three extensions:

quarkus-rest-jackson exposes JSON REST endpoints
quarkus-hibernate-orm-panache gives us an idiomatic repository layer without a pile of boilerplate
quarkus-jdbc-postgresql enables JDBC and starts PostgreSQL automatically in dev and test mode through Dev Services

Configure PostgreSQL Dev Services

Open src/main/resources/application.properties:

quarkus.datasource.db-kind=postgresql
quarkus.hibernate-orm.schema-management.strategy=drop-and-create

db-kind=postgresql tells Quarkus which Dev Service to start. drop-and-create recreates the schema on each dev or test boot. That keeps this example short. In production it would drop the schema at startup, so you would replace it with real migrations.

Use import.sql for the seed data. Quarkus loads it automatically in dev and test mode, which is enough here. We do not need a startup observer or repository calls just to insert three rows.

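StoredDocument extends PanacheEntity, so Hibernate generates the id from a database sequence when it persists an entity. The seed rows bypass Hibernate, so they have to draw their ids from that same sequence instead of leaving id null. The insert statements below look the sequence up in information_schema rather than hardcoding its name. That works here because the schema contains only one sequence.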

Create src/main/resources/import.sql:

insert into stored_documents (id, external_id, owner_email, title, storage_key, checksum)
select nextval(sequence_name::regclass), 'DOC-1000', 'legal@parchment.example', 'Export declaration for batch 1000', 'docs/2026/06/DOC-1000.pdf', 'sha256:0b74fd3a6f4f7002'
from information_schema.sequences
where sequence_schema = current_schema()
order by sequence_name
fetch first 1 row only;

insert into stored_documents (id, external_id, owner_email, title, storage_key, checksum)
select nextval(sequence_name::regclass), 'DOC-1001', 'legal@parchment.example', 'Supplier invoice for route correction', 'docs/2026/06/DOC-1001.pdf', 'sha256:4c4cd9314b32300c'
from information_schema.sequences
where sequence_schema = current_schema()
order by sequence_name
fetch first 1 row only;

insert into stored_documents (id, external_id, owner_email, title, storage_key, checksum)
select nextval(sequence_name::regclass), 'DOC-2000', 'compliance@parchment.example', 'Retention hold notice for shipment 2000', 'docs/2026/06/DOC-2000.pdf', 'sha256:8d9f4401c0c9dd11'
from information_schema.sequences
where sequence_schema = current_schema()
order by sequence_name
fetch first 1 row only;

I prefer this over a startup observer for demo data. The file is smaller, easier to read, and easier to replace later with real migrations.

Model the document record

We need one entity, two small DTOs, and one repository. I use the repository pattern here because the article is about where query code lives.

Create src/main/java/dev/morling/mainthread/vaultflow/StoredDocument.java:

package dev.morling.mainthread.vaultflow;

import io.quarkus.hibernate.orm.panache.PanacheEntity;
import jakarta.persistence.Column;
import jakarta.persistence.Entity;
import jakarta.persistence.Table;

@Entity
@Table(name = "stored_documents")
public class StoredDocument extends PanacheEntity {

    @Column(name = "external_id", nullable = false, unique = true, length = 64)
    public String externalId;

    @Column(name = "owner_email", nullable = false, length = 256)
    public String ownerEmail;

    @Column(nullable = false, length = 256)
    public String title;

    @Column(name = "storage_key", nullable = false, length = 512)
    public String storageKey;

    @Column(nullable = false, length = 128)
    public String checksum;
}

Create CreateDocumentRequest.java and DocumentResponse.java in the same package:

package dev.morling.mainthread.vaultflow;

public record CreateDocumentRequest(
        String externalId,
        String ownerEmail,
        String title,
        String storageKey,
        String checksum) {
}

package dev.morling.mainthread.vaultflow;

public record DocumentResponse(
        Long id,
        String externalId,
        String ownerEmail,
        String title,
        String storageKey,
        String checksum) {
}

Now the repository, in StoredDocumentRepository.java:

package dev.morling.mainthread.vaultflow;

import java.util.List;
import java.util.Optional;

import io.quarkus.hibernate.orm.panache.PanacheRepository;
import jakarta.enterprise.context.ApplicationScoped;

@ApplicationScoped
public class StoredDocumentRepository implements PanacheRepository<StoredDocument> {

    public Optional<StoredDocument> findByExternalId(String externalId) {
        return find("externalId", externalId).firstResultOptional();
    }

    public List<StoredDocument> findByOwnerEmail(String ownerEmail) {
        return find("ownerEmail", ownerEmail).list();
    }
}

Keep findByOwnerEmail simple for now. We come back to it when we look at the unsafe version an agent can generate under pressure.

Add the service boundary

The write path belongs behind a transaction. The REST resource should not decide how persistence works.

Create DocumentService.java:

package dev.morling.mainthread.vaultflow;

import java.util.List;

import jakarta.enterprise.context.ApplicationScoped;
import jakarta.inject.Inject;
import jakarta.transaction.Transactional;

@ApplicationScoped
public class DocumentService {

    @Inject
    StoredDocumentRepository repository;

    @Transactional
    public DocumentResponse create(CreateDocumentRequest request) {
        repository.findByExternalId(request.externalId())
                .ifPresent(existing -> {
                    throw new DuplicateDocumentException(request.externalId());
                });

        StoredDocument document = new StoredDocument();
        document.externalId = request.externalId();
        document.ownerEmail = request.ownerEmail();
        document.title = request.title();
        document.storageKey = request.storageKey();
        document.checksum = request.checksum();
        repository.persist(document);

        return toResponse(document);
    }

    public DocumentResponse getByExternalId(String externalId) {
        StoredDocument document = repository.findByExternalId(externalId)
                .orElseThrow(() -> new DocumentNotFoundException(externalId));
        return toResponse(document);
    }

    public List<DocumentResponse> searchByOwnerEmail(String ownerEmail) {
        return repository.findByOwnerEmail(ownerEmail).stream()
                .map(DocumentService::toResponse)
                .toList();
    }

    private static DocumentResponse toResponse(StoredDocument document) {
        return new DocumentResponse(
                document.id,
                document.externalId,
                document.ownerEmail,
                document.title,
                document.storageKey,
                document.checksum);
    }
}

This service does not use EntityManager, build SQL strings, or put query logic in the REST layer. Query code lives only in the repository, so that is the one place where the easy mistakes can appear.

Add the two exception types:

package dev.morling.mainthread.vaultflow;

public class DuplicateDocumentException extends RuntimeException {

    public DuplicateDocumentException(String externalId) {
        super("Document " + externalId + " already exists");
    }
}

package dev.morling.mainthread.vaultflow;

public class DocumentNotFoundException extends RuntimeException {

    public DocumentNotFoundException(String externalId) {
        super("Document " + externalId + " was not found");
    }
}

Expose the REST API

Now add the three endpoints. Quarkus REST keeps the resource class small, which is all we need here. What matters is the data boundary and the guardrail.

Create DocumentResource.java:

package dev.morling.mainthread.vaultflow;

import java.net.URI;
import java.util.List;
import java.util.Map;

import jakarta.inject.Inject;
import jakarta.ws.rs.Consumes;
import jakarta.ws.rs.GET;
import jakarta.ws.rs.POST;
import jakarta.ws.rs.Path;
import jakarta.ws.rs.PathParam;
import jakarta.ws.rs.Produces;
import jakarta.ws.rs.QueryParam;
import jakarta.ws.rs.core.MediaType;
import jakarta.ws.rs.core.Response;
import org.jboss.resteasy.reactive.RestResponse;
import org.jboss.resteasy.reactive.server.ServerExceptionMapper;

@Path("/documents")
@Consumes(MediaType.APPLICATION_JSON)
@Produces(MediaType.APPLICATION_JSON)
public class DocumentResource {

    @Inject
    DocumentService service;

    @POST
    public Response create(CreateDocumentRequest request) {
        DocumentResponse response = service.create(request);
        return Response.created(URI.create("/documents/" + response.externalId()))
                .entity(response)
                .build();
    }

    @GET
    @Path("/{externalId}")
    public DocumentResponse getByExternalId(@PathParam("externalId") String externalId) {
        return service.getByExternalId(externalId);
    }

    @GET
    @Path("/search")
    public List<DocumentResponse> searchByOwnerEmail(@QueryParam("ownerEmail") String ownerEmail) {
        return service.searchByOwnerEmail(ownerEmail);
    }

    @ServerExceptionMapper
    public RestResponse<Map<String, String>> mapDuplicate(DuplicateDocumentException e) {
        return RestResponse.status(Response.Status.CONFLICT, Map.of("error", e.getMessage()));
    }

    @ServerExceptionMapper
    public RestResponse<Map<String, String>> mapNotFound(DocumentNotFoundException e) {
        return RestResponse.status(Response.Status.NOT_FOUND, Map.of("error", e.getMessage()));
    }
}

We can now create, read, and search documents. The search path is still safe because it stays inside a Panache finder.

Start the app and prove the baseline

Run dev mode:

./mvnw quarkus:dev

The seeded data has two legal documents and one compliance document, so the search should return two rows.

Test it:

curl -s "http://localhost:8080/documents/search?ownerEmail=legal@parchment.example" | jq

Expected output:

[
  {
    "id": 1,
    "externalId": "DOC-1000",
    "ownerEmail": "legal@parchment.example",
    "title": "Export declaration for batch 1000",
    "storageKey": "docs/2026/06/DOC-1000.pdf",
    "checksum": "sha256:0b74fd3a6f4f7002"
  },
  {
    "id": 51,
    "externalId": "DOC-1001",
    "ownerEmail": "legal@parchment.example",
    "title": "Supplier invoice for route correction",
    "storageKey": "docs/2026/06/DOC-1001.pdf",
    "checksum": "sha256:4c4cd9314b32300c"
  }
]
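The jump from id 1 to id 51 is not a typo. Each seed row calls nextval once, and the output is consistent with a sequence that advances by 50 per call, which I believe is Hibernate's default allocation size for generated sequences. The sketch below imitates that behaviour in plain Java. SequenceSketch and its increment of 50 are illustrative assumptions, not Quarkus or PostgreSQL APIs.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a database sequence; not a Quarkus or PostgreSQL API.
class SequenceSketch {

    private final long increment;
    private long next;

    SequenceSketch(long start, long increment) {
        this.next = start;
        this.increment = increment;
    }

    // Mirrors nextval(): hand out the current value, then advance by the increment.
    long nextval() {
        long value = next;
        next += increment;
        return value;
    }

    // Assigns one id per seed row, the way each insert in import.sql calls nextval once.
    static List<Long> seedIds(int rows) {
        SequenceSketch sequence = new SequenceSketch(1, 50); // assumed: start 1, increment 50
        List<Long> ids = new ArrayList<>();
        for (int i = 0; i < rows; i++) {
            ids.add(sequence.nextval());
        }
        return ids;
    }

    public static void main(String[] args) {
        System.out.println(seedIds(3)); // prints [1, 51, 101]
    }
}
```

Under that assumption the third seed row, DOC-2000, ends up with id 101.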

Create one more document:

curl -i \
  -H "Content-Type: application/json" \
  -d '{
    "externalId": "DOC-3000",
    "ownerEmail": "ops@parchment.example",
    "title": "Late customs memo for route 3000",
    "storageKey": "docs/2026/06/DOC-3000.pdf",
    "checksum": "sha256:33bbca6ab4b22000"
  }' \
  http://localhost:8080/documents

Expected output starts like this:

HTTP/1.1 201 Created
Content-Type: application/json;charset=UTF-8
content-length: 192
Location: http://localhost:8080/documents/DOC-3000

The app is real now. The security rule is attached to code that already does useful work.

Add the Semgrep rule that blocks the bad query shape

If you ask an agent for “a quick email search”, this is the sort of code it may generate in StoredDocumentRepository:

public List<StoredDocument> findByOwnerEmail(String ownerEmail) {
    return entityManager.createNativeQuery(
                    "select * from stored_documents where owner_email = '" + ownerEmail + "'",
                    StoredDocument.class)
            .getResultList();
}

It looks normal in a diff. It is still a policy violation, and because ownerEmail arrives straight from a query parameter, it is also a SQL injection.
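To see why that shape is banned, it helps to look at the SQL text it produces. The following plain-Java sketch builds the same string with a hostile ownerEmail value and contrasts it with a statement whose text never changes. QuerySketch, BoundQuery, and the :ownerEmail placeholder are illustrative names, and nothing here talks to a database.

```java
import java.util.Map;

// Plain-Java illustration only; nothing here executes SQL.
class QuerySketch {

    // Hypothetical holder: constant SQL text plus values that travel separately.
    record BoundQuery(String sql, Map<String, String> parameters) {}

    // The banned shape: user input becomes part of the SQL text.
    static String concatenated(String ownerEmail) {
        return "select * from stored_documents where owner_email = '" + ownerEmail + "'";
    }

    // The safer shape: the SQL text is fixed and the value is carried as data.
    static BoundQuery parameterized(String ownerEmail) {
        return new BoundQuery(
                "select * from stored_documents where owner_email = :ownerEmail",
                Map.of("ownerEmail", ownerEmail));
    }

    public static void main(String[] args) {
        String hostile = "x' or '1'='1";
        System.out.println(concatenated(hostile));
        System.out.println(parameterized(hostile).sql());
    }
}
```

With the hostile value, the concatenated statement ends in owner_email = 'x' or '1'='1', a condition that is true for every row. The parameterized text is identical for every input, which is the property the Semgrep rule below tries to protect.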

Create .semgrep/vaultflow-rules.yaml:

rules:
  - id: vaultflow-native-query-concatenation
    patterns:
      - pattern: $EM.createNativeQuery($QUERY + ..., ...)
      - pattern-not: $EM.createNativeQuery("...", ...)
    message: >
      Native query built with string concatenation in createNativeQuery().
      Use a Panache finder or a parameterized query instead.
    languages: [java]
    severity: ERROR
    metadata:
      category: security
      technology: [quarkus, hibernate]

  - id: vaultflow-hardcoded-secret
    patterns:
      - pattern: String $NAME = $VALUE;
      - metavariable-regex:
          metavariable: $NAME
          regex: (?i).*(secret|token|password|apikey|api_key).*
    message: >
      Possible hardcoded secret in $NAME. Move it to configuration before it lands in git history.
    languages: [java]
    severity: ERROR
    metadata:
      category: security

The first rule is the one we care about here. It does not try to prove exploitability. It encodes a repository rule: no concatenated native queries.
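The second rule is broader than it looks, because it keys on the variable name rather than on the value. One way to get a feel for that is to run the same regular expression over a few names in plain Java. Treat this as an approximation: Semgrep uses its own regex engine rather than java.util.regex, and the sample names below are made up.

```java
import java.util.List;
import java.util.regex.Pattern;

// Approximates the metavariable-regex from vaultflow-hardcoded-secret with java.util.regex.
class SecretNameSketch {

    // Same expression as in the rule file.
    static final Pattern SECRET_NAME =
            Pattern.compile("(?i).*(secret|token|password|apikey|api_key).*");

    static boolean looksLikeSecretName(String variableName) {
        return SECRET_NAME.matcher(variableName).matches();
    }

    public static void main(String[] args) {
        // Hypothetical variable names: two hits, one miss, one likely false positive.
        for (String name : List.of("apiToken", "DB_PASSWORD", "ownerEmail", "tokenHeaderName")) {
            System.out.println(name + " -> " + looksLikeSecretName(name));
        }
    }
}
```

tokenHeaderName matches even though a header name is not a secret. Findings like that need a rename or a justified suppression, which is a fair price for a rule this cheap.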

The fix is already in our real code:

public List<StoredDocument> findByOwnerEmail(String ownerEmail) {
    return find("ownerEmail", ownerEmail).list();
}

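That split is the whole point. Semgrep blocks the bad shape. Quarkus and Panache already give us a safer path. If a query really has to be native, keep the SQL text constant and bind the value with setParameter.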

Put Semgrep in the local loop

pre-commit is not part of Quarkus and not required to run the app. It is just a small CLI that can manage local Git hooks.

If you want that local hook on macOS, install the pre-commit CLI with Homebrew:

brew install pre-commit

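This command installs the pre-commit executable. It does not install the Git hook in this repository yet. On other platforms, pip install pre-commit does the same job, which is why Python is listed in the prerequisites.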

Create .pre-commit-config.yaml:

repos:
  - repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v5.0.0
    hooks:
      - id: detect-private-key
      - id: check-merge-conflict
      - id: no-commit-to-branch
        args: ["--branch", "main"]

  - repo: https://github.com/semgrep/pre-commit
    rev: v1.166.0
    hooks:
      - id: semgrep
        name: semgrep-vaultflow
        args:
          - "--config"
          - "p/java"
          - "--config"
          - "p/secrets"
          - "--config"
          - ".semgrep/vaultflow-rules.yaml"
          - "--error"
          - "--skip-unknown-extensions"
          - "--quiet"
          - "--disable-version-check"

Then install the Git hook for this repository:

pre-commit install

This command writes the repo’s .git/hooks/pre-commit hook so Git runs the configured checks before each commit.

The agent and the human developer now hit the same boundary. A git commit with the unsafe native query should fail before the diff goes any further.

Put the same rule in the agent instructions

Open AGENTS.md and add a project-specific section:

## Project Security Guardrails

- Run Semgrep before committing changes in this project.
- Treat `ERROR` findings from Semgrep as blockers.
- Prefer Panache finders or parameterized queries. Do not build `createNativeQuery()` calls with string concatenation.
- Do not add `// nosemgrep` suppressions without a short security rationale next to the suppression.

AGENTS.md does not enforce anything by itself. It tells the agent what the local hook and CI already enforce. That is still useful because clear repo rules save review time.
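For the last bullet, a suppression with its rationale might look like the sketch below. In Java the marker goes in a // comment on the flagged line. The class and the header name are hypothetical, and the rule id after nosemgrep: is an assumption as well: Semgrep may report local rules under a longer, path-prefixed id, so copy the id from the actual finding.

```java
// Hypothetical helper class used only to illustrate a justified suppression.
class AuthHeaders {

    // Rationale: this is the name of an HTTP header, not a credential.
    // The rule matches on the variable name, so it flags the line anyway.
    static String tokenHeaderName() {
        String tokenHeaderName = "X-Auth-Token"; // nosemgrep: vaultflow-hardcoded-secret
        return tokenHeaderName;
    }

    public static void main(String[] args) {
        System.out.println(tokenHeaderName());
    }
}
```

The rationale sits directly above the suppressed line, so a reviewer sees the reason and the exception in the same hunk.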

Add the CI check

Create .github/workflows/semgrep.yml:

name: Semgrep Security Scan

on:
  pull_request:
  push:
    branches: [main]

jobs:
  semgrep:
    runs-on: ubuntu-latest
    container:
      image: semgrep/semgrep
    steps:
      - uses: actions/checkout@v6.0.3
        with:
          fetch-depth: 0

      - name: Run Semgrep
        run: |
          semgrep scan \
            --config p/java \
            --config p/secrets \
            --config .semgrep/vaultflow-rules.yaml \
            --error \
            src/main/java

CI is the merge boundary. The local hook gives fast feedback. CI decides whether a skipped hook becomes a merge problem.

Prove the Quarkus side with tests

Add src/test/java/dev/morling/mainthread/vaultflow/DocumentResourceTest.java:

package dev.morling.mainthread.vaultflow;

import static io.restassured.RestAssured.given;
import static org.hamcrest.Matchers.endsWith;
import static org.hamcrest.Matchers.equalTo;
import static org.hamcrest.Matchers.hasSize;

import io.quarkus.test.common.http.TestHTTPEndpoint;
import io.quarkus.test.junit.QuarkusTest;
import io.restassured.http.ContentType;
import org.junit.jupiter.api.Test;

@QuarkusTest
@TestHTTPEndpoint(DocumentResource.class)
class DocumentResourceTest {

    @Test
    void shouldReturnSeededDocumentsForOwner() {
        given()
                .queryParam("ownerEmail", "legal@parchment.example")
                .when()
                .get("/search")
                .then()
                .statusCode(200)
                .body("$", hasSize(2))
                .body("[0].ownerEmail", equalTo("legal@parchment.example"));
    }

    @Test
    void shouldCreateAndReadDocument() {
        String payload = """
                {
                  "externalId": "DOC-3000",
                  "ownerEmail": "ops@parchment.example",
                  "title": "Late customs memo for route 3000",
                  "storageKey": "docs/2026/06/DOC-3000.pdf",
                  "checksum": "sha256:33bbca6ab4b22000"
                }
                """;

        given()
                .contentType(ContentType.JSON)
                .body(payload)
                .when()
                .post()
                .then()
                .statusCode(201)
                .header("Location", endsWith("/documents/DOC-3000"))
                .body("externalId", equalTo("DOC-3000"))
                .body("ownerEmail", equalTo("ops@parchment.example"));

        given()
                .when()
                .get("/DOC-3000")
                .then()
                .statusCode(200)
                .body("title", equalTo("Late customs memo for route 3000"))
                .body("storageKey", equalTo("docs/2026/06/DOC-3000.pdf"));
    }

    @Test
    void shouldRejectDuplicateExternalId() {
        String payload = """
                {
                  "externalId": "DOC-1000",
                  "ownerEmail": "legal@parchment.example",
                  "title": "Duplicate seed document",
                  "storageKey": "docs/2026/06/DOC-1000-duplicate.pdf",
                  "checksum": "sha256:9c406998f735af15"
                }
                """;

        given()
                .contentType(ContentType.JSON)
                .body(payload)
                .when()
                .post()
                .then()
                .statusCode(409)
                .body("error", equalTo("Document DOC-1000 already exists"));
    }
}

Run the tests:

./mvnw test

Expected result:

[INFO] Tests run: 3, Failures: 0, Errors: 0, Skipped: 0

These tests are not about Semgrep. They prove the Quarkus app works before and after the guardrail discussion.

What survives production and what does not

The app is deliberately small because the goal is to show where Semgrep fits. The repository uses Panache, the write boundary is transactional, and Dev Services keeps local setup cheap.

None of this replaces review. Semgrep does not understand business intent. The agent instructions do not enforce policy. A coding agent can still generate a bad change, and a developer can still bypass a local hook. CI exists because of that.

Use this approach:

Quarkus gives you a good default path
Semgrep blocks specific bad shapes
pre-commit makes the feedback immediate
CI turns the same rule into merge policy
AGENTS.md keeps the agent aimed at the safe path

That is enough to catch cheap mistakes before they turn into expensive ones.

Conclusion

We built a small Quarkus document service, proved the API with tests, and added one deterministic guardrail for a common agent failure: concatenated native queries. Semgrep is not smarter than the agent. It is just faster, cheaper, and more stubborn about one rule we care about here. Use this as the baseline to make your agent output more reliable and easier to trust.
