Thanks for this tutorial. I got a couple of dependency management errors, but managed to solve them with the following pom:
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 https://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>
    <groupId>com.example</groupId>
    <artifactId>pdf-processing-pipeline</artifactId>
    <version>1.0.0-SNAPSHOT</version>
    <properties>
        <compiler-plugin.version>3.14.0</compiler-plugin.version>
        <maven.compiler.release>21</maven.compiler.release>
        <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
        <project.reporting.outputEncoding>UTF-8</project.reporting.outputEncoding>
        <quarkus.platform.artifact-id>quarkus-bom</quarkus.platform.artifact-id>
        <quarkus.platform.group-id>io.quarkus.platform</quarkus.platform.group-id>
        <quarkus.platform.version>3.29.0</quarkus.platform.version>
        <skipITs>true</skipITs>
        <surefire-plugin.version>3.5.2</surefire-plugin.version>
    </properties>
    <dependencyManagement>
        <dependencies>
            <dependency>
                <groupId>${quarkus.platform.group-id}</groupId>
                <artifactId>${quarkus.platform.artifact-id}</artifactId>
                <version>${quarkus.platform.version}</version>
                <type>pom</type>
                <scope>import</scope>
            </dependency>
            <dependency>
                <groupId>${quarkus.platform.group-id}</groupId>
                <artifactId>quarkus-camel-bom</artifactId>
                <version>${quarkus.platform.version}</version>
                <type>pom</type>
                <scope>import</scope>
            </dependency>
        </dependencies>
    </dependencyManagement>
    <dependencies>
        <dependency>
            <groupId>io.quarkus</groupId>
            <artifactId>quarkus-rest-jackson</artifactId>
        </dependency>
        <dependency>
            <groupId>org.apache.camel.quarkus</groupId>
            <artifactId>camel-quarkus-direct</artifactId>
        </dependency>
        <dependency>
            <groupId>org.apache.camel.quarkus</groupId>
            <artifactId>camel-quarkus-pdf</artifactId>
        </dependency>
        <dependency>
            <groupId>org.apache.camel.quarkus</groupId>
            <artifactId>camel-quarkus-langchain4j</artifactId>
            <version>3.26.0</version>
        </dependency>
        <dependency>
            <groupId>org.apache.camel.quarkus</groupId>
            <artifactId>camel-quarkus-langchain4j-chat</artifactId>
            <version>3.26.0</version>
        </dependency>
        <dependency>
            <groupId>io.quarkiverse.antivirus</groupId>
            <artifactId>quarkus-antivirus</artifactId>
            <version>1.3.0</version>
        </dependency>
        <dependency>
            <groupId>io.quarkiverse.langchain4j</groupId>
            <artifactId>quarkus-langchain4j-ollama</artifactId>
            <version>1.4.0.CR2</version>
        </dependency>
        <dependency>
            <groupId>io.quarkus</groupId>
            <artifactId>quarkus-arc</artifactId>
        </dependency>
        <dependency>
            <groupId>io.quarkus</groupId>
            <artifactId>quarkus-rest</artifactId>
        </dependency>
        <dependency>
            <groupId>io.quarkus</groupId>
            <artifactId>quarkus-junit5</artifactId>
            <scope>test</scope>
        </dependency>
        <dependency>
            <groupId>io.rest-assured</groupId>
            <artifactId>rest-assured</artifactId>
            <scope>test</scope>
        </dependency>
    </dependencies>
    <build>
        <plugins>
            <plugin>
                <groupId>${quarkus.platform.group-id}</groupId>
                <artifactId>quarkus-maven-plugin</artifactId>
                <version>${quarkus.platform.version}</version>
                <extensions>true</extensions>
                <executions>
                    <execution>
                        <goals>
                            <goal>build</goal>
                            <goal>generate-code</goal>
                            <goal>generate-code-tests</goal>
                            <goal>native-image-agent</goal>
                        </goals>
                    </execution>
                </executions>
            </plugin>
            <plugin>
                <artifactId>maven-compiler-plugin</artifactId>
                <version>${compiler-plugin.version}</version>
                <configuration>
                    <parameters>true</parameters>
                </configuration>
            </plugin>
            <plugin>
                <artifactId>maven-surefire-plugin</artifactId>
                <version>${surefire-plugin.version}</version>
                <configuration>
                    <argLine>--add-opens java.base/java.lang=ALL-UNNAMED</argLine>
                    <systemPropertyVariables>
                        <java.util.logging.manager>org.jboss.logmanager.LogManager</java.util.logging.manager>
                        <maven.home>${maven.home}</maven.home>
                    </systemPropertyVariables>
                </configuration>
            </plugin>
            <plugin>
                <artifactId>maven-failsafe-plugin</artifactId>
                <version>${surefire-plugin.version}</version>
                <executions>
                    <execution>
                        <goals>
                            <goal>integration-test</goal>
                            <goal>verify</goal>
                        </goals>
                    </execution>
                </executions>
                <configuration>
                    <argLine>--add-opens java.base/java.lang=ALL-UNNAMED</argLine>
                    <systemPropertyVariables>
                        <native.image.path>${project.build.directory}/${project.build.finalName}-runner</native.image.path>
                        <java.util.logging.manager>org.jboss.logmanager.LogManager</java.util.logging.manager>
                        <maven.home>${maven.home}</maven.home>
                    </systemPropertyVariables>
                </configuration>
            </plugin>
        </plugins>
    </build>
    <profiles>
        <profile>
            <id>native</id>
            <activation>
                <property>
                    <name>native</name>
                </property>
            </activation>
            <properties>
                <quarkus.package.jar.enabled>false</quarkus.package.jar.enabled>
                <skipITs>false</skipITs>
                <quarkus.native.enabled>true</quarkus.native.enabled>
            </properties>
        </profile>
    </profiles>
</project>
In the above, I set quarkus.platform.version to 3.29.0 and the camel-quarkus-langchain4j libs to version 3.26.0.
Also, I had to use the llama3.2:latest model.
Hi Markus, thank you so much for the article. I have a PDF with a client's information in several sections (ID document, policy letter, policy, contract, and others), and they have different image and text scan formats. I need to split it into separate PDFs for each section. What process would you recommend, or how would you approach this?
Hi Juanes, that’s a great question. I’d probably look at Docling https://www.the-main-thread.com/p/quarkus-docling-data-preparation-for-ai
Thanks for this article. I tried to build and run it. The upload via the REST API starts without error and the virus check is OK, but afterwards I get a timeout message from the LLM:
2025-10-11 01:01:48,049 INFO [PdfProcessingRoute:32] (executor-thread-1) ? Generating summary via LLM...
2025-10-11 01:01:58,055 WARN [dev.lan.int.RetryUtils] (executor-thread-1) A retriable exception occurred. Remaining retries: 2 of 2: jakarta.ws.rs.ProcessingException: The timeout period of 10000ms has been exceeded while executing POST /api/chat for server null
at org.jboss.resteasy.reactive.client.impl.InvocationBuilderImpl.unwrap(InvocationBuilderImpl.java:212)
at org.jboss.resteasy.reactive.client.impl.InvocationBuilderImpl.post(InvocationBuilderImpl.java:243)
at io.quarkiverse.langchain4j.jaxrsclient.JaxRsHttpClient.execute(JaxRsHttpClient.java:117)
at dev.langchain4j.model.ollama.OllamaClient.chat(OllamaClient.java:93)
at dev.langchain4j.model.ollama.OllamaChatModel.lambda$doChat$0(OllamaChatModel.java:42)
at dev.langchain4j.internal.ExceptionMapper.withExceptionMapper(ExceptionMapper.java:29)
at dev.langchain4j.internal.RetryUtils.lambda$withRetryMappingExceptions$2(RetryUtils.java:324)
at dev.langchain4j.internal.RetryUtils$RetryPolicy.withRetry(RetryUtils.java:211)
Do you know where I can fix this timeout?
Many thanks
Rainer
Hey. gpt-oss is a large model. Try adding an explicit timeout in application.properties, or switch to a smaller model like llama3.2.
quarkus.langchain4j.timeout=60S
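For reference, both options could look roughly like this in application.properties. This is only a sketch: it assumes the quarkus-langchain4j-ollama extension is the chat provider, and the model-id line and its value are an assumption for illustration, not taken from the article. The 10000ms in the log above corresponds to the 10 second default that the timeout setting overrides.

```properties
# Option 1: give the LLM call more time than the 10 s seen in the log
quarkus.langchain4j.timeout=60S

# Option 2 (assumed key and value): point the Ollama chat model at a smaller model
quarkus.langchain4j.ollama.chat-model.model-id=llama3.2
```

Either setting alone may be enough; a smaller model usually answers faster, while a longer timeout keeps the larger model usable on slower hardware.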
Hey, that works fine. Great.