Building a Real-Time Bluesky Feed with Quarkus and Java
How to stream, filter, and analyze #Java posts from Bluesky’s firehose using modern Java and Quarkus.
Custom Bluesky feeds (feed generators) are external services that Bluesky can query to fetch a skeleton list of post URIs, implementing custom timelines. In this tutorial, we’ll build a local feed generator in Java using Quarkus – filtering Bluesky’s global feed for #Java posts (focusing on tech content, not travel), collecting metadata, and exposing the required XRPC endpoint for Bluesky. This step-by-step guide assumes you’re an intermediate Java developer and covers:
Project Setup: Creating a new Quarkus project (with the Quarkus CLI) with extensions for WebSockets, REST/JSON, PostgreSQL (Panache ORM), and Flyway.
Bluesky Firehose Connection: Subscribing to Bluesky’s firehose (via Jetstream WebSocket) and processing incoming posts.
Post Filtering: Detecting #Java posts and distinguishing technology-related posts from travel-related posts.
Metadata Extraction: Parsing anonymized metadata from posts – frameworks mentioned, post creation time, hashtags, links, language detection.
Persistence: Storing filtered posts and metadata in PostgreSQL using Quarkus Panache and Flyway.
XRPC Feed Endpoint: Implementing the app.bsky.feed.getFeedSkeleton endpoint.
Testing Locally: Running the app and verifying results with curl.
As this tutorial contains more code than usual, you are highly encouraged to grab the project from my GitHub repository and start from there.
Setting Up the Quarkus Project and Extensions
Create a new Quarkus project with the needed extensions:
quarkus create app com.example:bsky-javafeed-generator \
-x vertx,rest-jackson,hibernate-orm-panache,jdbc-postgresql,flyway
cd bsky-javafeed-generator
Extensions we’ll use:
quarkus-vertx (access to the Vert.x API for WebSocket connections)
quarkus-rest-jackson (REST API + JSON)
quarkus-hibernate-orm-panache (ORM with convenience API)
quarkus-jdbc-postgresql (JDBC driver)
quarkus-flyway (migrations)
Configure the database and migrations in src/main/resources/application.properties:
quarkus.hibernate-orm.database.generation=none
quarkus.flyway.migrate-at-start=true
Quarkus will use these properties to initialize the datasource (in dev mode, Quarkus Dev Services even starts a throwaway PostgreSQL container for you, since no explicit connection settings are configured). We set generation=none because we’ll create the schema via Flyway rather than letting Quarkus auto-create tables. We also enable migrate-at-start so Flyway applies our SQL migrations on launch.
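For a packaged deployment outside dev mode, you would add explicit datasource properties. The property keys below are standard Quarkus configuration; the URL, username, and password are placeholders for your own environment:

```properties
# Explicit datasource config for non-dev environments (values are placeholders)
quarkus.datasource.db-kind=postgresql
quarkus.datasource.username=feed
quarkus.datasource.password=changeme
quarkus.datasource.jdbc.url=jdbc:postgresql://localhost:5432/bsky-javafeed
```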
Create an initial Flyway migration: src/main/resources/db/migration/V1__Create_posts_table.sql
CREATE TABLE post (
id BIGSERIAL PRIMARY KEY,
uri TEXT NOT NULL, -- AT Protocol URI of the post
text TEXT NOT NULL, -- Post content text
createdat TIMESTAMP WITH TIME ZONE NOT NULL,
hourofday INT NOT NULL, -- Hour (0-23) the post was created (UTC)
hashtags TEXT, -- Comma-separated hashtags in the post
links TEXT, -- Comma-separated links in the post
frameworks TEXT, -- Comma-separated tech libraries mentioned
language VARCHAR(8), -- Detected language code (e.g. 'en')
indexedat TIMESTAMP WITH TIME ZONE DEFAULT now() -- (indexedat is when we saved the post; could help with pagination or TTL policies)
);
CREATE SEQUENCE Post_SEQ START WITH 1 INCREMENT BY 50;
This creates a post table to store each filtered post’s URI, content, metadata, and timestamps. We include an indexedat column with a default of the current time. This is not strictly required, but it is useful if we later implement post expiry (e.g. removing posts older than 48 hours, as Bluesky suggests) or need a unique cursor.
At this point, our project is set up with all required extensions and an initial database schema. We’re ready to start coding the feed generator logic.
Connecting to the Bluesky Firehose (Jetstream)
Bluesky provides a “firehose” stream of all events (posts, likes, follows, etc.) across the network. Feed generators are expected to subscribe to this firehose (the com.atproto.sync.subscribeRepos stream) and index whatever data they need. Rather than consuming the raw firehose directly, we’ll use Jetstream, a Bluesky-operated service that re-publishes the firehose as plain JSON over WebSocket. Jetstream lets us subscribe to just the data we care about (in our case, posts) with far less bandwidth. We’ll connect to this endpoint:
wss://jetstream2.us-east.bsky.network/subscribe?wantedCollections=app.bsky.feed.post
This query subscribes us to only post events (filtering out likes, follows, etc.). We don’t need authentication for this stream – it’s open for reading public posts.
Let’s set up a Quarkus bean to connect to this WebSocket on application startup. We’ll use Vert.x’s WebSocket client and handle messages.
Create a class BlueskySubscriber (e.g. in src/main/java/com/example/service/BlueskySubscriber.java):
package com.example.service;
import java.time.OffsetDateTime;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import org.jboss.logging.Logger;
import com.example.model.PostEntity;
import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;
import io.quarkus.runtime.StartupEvent;
import io.vertx.core.Vertx;
import io.vertx.core.http.WebSocketClient;
import io.vertx.core.http.WebSocketClientOptions;
import io.vertx.core.http.WebSocketConnectOptions;
import java.net.URI;
import jakarta.enterprise.context.ApplicationScoped;
import jakarta.enterprise.event.Observes;
import jakarta.inject.Inject;
import jakarta.transaction.Transactional;
@ApplicationScoped
public class BlueskySubscriber {
private static final Logger LOG = Logger.getLogger(BlueskySubscriber.class);
@Inject
Vertx vertx;
private WebSocketClient wsClient;
void onStart(@Observes StartupEvent ev) {
// Based on official Jetstream documentation:
// https://github.com/bluesky-social/jetstream
String firehoseUrl = "wss://jetstream2.us-east.bsky.network/subscribe?wantedCollections=app.bsky.feed.post";
String description = "Jetstream2 US-East (Official)";
// Initialize WebSocket client with SSL options
WebSocketClientOptions options = new WebSocketClientOptions()
.setSsl(true)
.setVerifyHost(false)
.setTrustAll(true); // For testing - in production use proper SSL verification
wsClient = vertx.createWebSocketClient(options);
connectToFirehose(firehoseUrl, description);
}
private void connectToFirehose(String url, String description) {
LOG.infof("Attempting to connect to %s: %s", description, url);
try {
URI uri = URI.create(url);
WebSocketConnectOptions connectOptions = new WebSocketConnectOptions()
.setHost(uri.getHost())
.setPort(uri.getPort() == -1 ? 443 : uri.getPort())
.setURI(uri.getPath() + (uri.getQuery() != null ? "?" + uri.getQuery() : ""))
.setSsl(true);
wsClient.connect(connectOptions)
.onSuccess(ws -> {
LOG.infof("Successfully connected to %s: %s", description, url);
// Set up message handlers - dispatch to same I/O thread for transaction support
ws.textMessageHandler(message -> {
vertx.executeBlocking(() -> {
handleTextMessage(message);
return null;
}, false);
});
// Set up error and close handlers
ws.exceptionHandler(error -> {
LOG.errorf("WebSocket error on %s: %s", description, error.getMessage());
// Try to reconnect after a delay
vertx.setTimer(5000, timer -> connectToFirehose(url, description));
});
ws.closeHandler(closeReason -> {
LOG.warnf("WebSocket closed on %s", description);
// Try to reconnect after a delay
vertx.setTimer(5000, timer -> connectToFirehose(url, description));
});
LOG.infof("Listening for real-time Bluesky posts...");
})
.onFailure(error -> {
LOG.errorf("Failed to connect to %s (%s): %s", description, url, error.getMessage());
// Try to reconnect after a delay
vertx.setTimer(10000, timer -> connectToFirehose(url, description));
});
} catch (Exception e) {
LOG.errorf("Error parsing URL %s: %s", url, e.getMessage());
// Try to reconnect after a delay
vertx.setTimer(10000, timer -> connectToFirehose(url, description));
}
}
private void handleTextMessage(String message) {
// This method is invoked for each incoming JSON message from Jetstream
try {
processJetstreamEvent(message);
} catch (Exception ex) {
LOG.error("Error processing Jetstream event: ", ex);
}
}
//...
}
A few notes on this setup:
We inject the managed Vertx instance and create a WebSocketClient from it in the onStart observer, which fires once the application has started.
Incoming frames arrive on a Vert.x event-loop thread, so the textMessageHandler dispatches each message via vertx.executeBlocking. Our handler does blocking work (JSON parsing plus a transactional DB write), which must never run directly on the event loop.
We register exception and close handlers; in both cases we schedule a reconnect with vertx.setTimer after a short delay, giving us basic resilience against dropped connections.
setTrustAll(true) and setVerifyHost(false) are only acceptable for local testing; in production, rely on proper TLS certificate verification.
Now, parsing and processing the incoming messages. Jetstream events are JSON objects. We’re interested in commit events where operation == "create" and collection == "app.bsky.feed.post". These indicate a new post was created. The record contains the post content (text) and timestamp (createdAt). We’ll extract those, then apply our filters and metadata extraction.
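For reference, a single Jetstream post-creation event is shaped roughly like this (abridged; the DID, rkey, CID, and timestamps shown here are illustrative values, not real data):

```json
{
  "did": "did:plc:abc123exampledid",
  "time_us": 1725911162329308,
  "kind": "commit",
  "commit": {
    "rev": "3l3qo2vutsw2b",
    "operation": "create",
    "collection": "app.bsky.feed.post",
    "rkey": "3l3qo2vuowo2b",
    "record": {
      "$type": "app.bsky.feed.post",
      "createdAt": "2024-09-09T19:46:02.102Z",
      "langs": ["en"],
      "text": "Trying out #Java virtual threads today"
    },
    "cid": "bafyreia..."
  }
}
```

The fields our handler reads are kind, commit.operation, commit.collection, commit.record.text, commit.record.createdAt, did, and commit.rkey.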
Process Events
In BlueskySubscriber, add a method processJetstreamEvent(String json) to handle one event message:
@Transactional
void processJetstreamEvent(String json) throws Exception {
// Parse JSON text into a tree for inspection
JsonNode root = new ObjectMapper().readTree(json); // (reuse a shared ObjectMapper in production)
// Jetstream message format: {"kind": "commit", "commit": {...}, "did": "..."}
if (!"commit".equals(root.path("kind").asText())) {
return; // ignore non-commit events (e.g. 'identity' or 'account' updates)
}
JsonNode commit = root.path("commit");
if (!"create".equals(commit.path("operation").asText()) ||
!"app.bsky.feed.post".equals(commit.path("collection").asText())) {
return; // not a new post creation, ignore (could be likes, follows, etc.)
}
// Extract post text and creation timestamp
String text = commit.path("record").path("text").asText("");
String createdAtStr = commit.path("record").path("createdAt").asText("");
if (text.isEmpty() || !text.contains("#Java")) {
return; // Skip if no text or does not contain #Java hashtag
}
// Distinguish tech vs. travel context for "#Java"
if (!isTechRelatedPost(text)) {
// It's a #Java mention likely about Java (island/coffee), skip indexing
return;
}
// Extract metadata
OffsetDateTime createdAt = OffsetDateTime.parse(createdAtStr);
int hour = createdAt.getHour();
String frameworks = findFrameworks(text);
String hashtags = extractHashtags(text);
String links = extractLinks(text);
String language = detectLanguage(text);
// Construct the AT URI for the post: "at://{did}/app.bsky.feed.post/{rkey}"
String userDid = root.path("did").asText();
String rkey = commit.path("rkey").asText();
String atUri = "at://" + userDid + "/app.bsky.feed.post/" + rkey;
// Persist to database via Panache entity
PostEntity post = new PostEntity();
post.uri = atUri;
post.text = text;
post.createdAt = createdAt;
post.hourOfDay = hour;
post.frameworks = frameworks;
post.hashtags = hashtags;
post.links = links;
post.language = language;
post.persist(); // Panache will insert the record (within the @Transactional context)
LOG.infof("Indexed post %s (hour %d, frameworks: %s)", atUri, hour, frameworks);
}
This method uses Jackson to parse the JSON and then applies our filtering and extraction logic:
Filtering #Java tech posts: We check if the post text contains the hashtag #Java. If not, we ignore the event. If yes, we call isTechRelatedPost(text) to determine whether it’s about Java the programming language (tech) or something else (e.g. the Indonesian island or coffee). We implement this as simple rule-based logic (layered checks), which we’ll show next.
Extracting metadata: For a qualifying post, we gather:
Time of day: parse createdAt into an OffsetDateTime and get the hour of day (0–23).
Frameworks/Libraries: scan the text for known tech keywords.
Hashtags: find all hashtags in the text.
Links: find any URLs in the text.
Language: (basic approach) determine the likely language of the post.
Constructing the post URI: The Bluesky AT Protocol URI for a post is composed of the author’s DID, the collection (app.bsky.feed.post), and the record’s key. We retrieve the did (author’s DID) from the top-level JSON and rkey from the commit, then format at://did:.../app.bsky.feed.post/rkey. This URI is what we will later return in our feed skeleton.
Persisting to PostgreSQL: We create a PostEntity (a Panache entity class for the post table), set its fields, then call persist(). We annotated processJetstreamEvent with @Transactional so that this DB operation occurs in a transaction (Panache will handle the insert). We’ll define PostEntity shortly in the next section.
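The URI construction step is plain string concatenation. As a tiny standalone sketch (the DID and rkey below are made-up values):

```java
public class AtUriDemo {
    // Builds "at://{did}/app.bsky.feed.post/{rkey}" as described above
    static String postUri(String did, String rkey) {
        return "at://" + did + "/app.bsky.feed.post/" + rkey;
    }

    public static void main(String[] args) {
        // Prints: at://did:plc:abc123/app.bsky.feed.post/3lwxyz
        System.out.println(postUri("did:plc:abc123", "3lwxyz"));
    }
}
```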
Filtering and Extraction
Let’s implement the helper methods used above for filtering and extraction:
// Determine if a #Java post is tech-related or not
private boolean isTechRelatedPost(String text) {
String textLower = text.toLowerCase();
// Keywords indicating a travel/coffee context for "Java"
if (textLower.contains("indonesia") || textLower.contains("jakarta")
|| textLower.contains("coffee") || textLower.contains("island")) {
return false; // mentions of Indonesia/coffee likely mean Java the place or coffee
}
// Keywords that strongly indicate tech context
if (textLower.contains("spring") || textLower.contains("quarkus")
|| textLower.contains("jdk") || textLower.contains("programming")) {
return true;
}
// (Basic language hint: if contains typical Indonesian words, you could flag as
// non-tech too)
if (textLower.matches(".*\\bsebuah\\b.*") || textLower.matches(".*\\bpulau\\b.*")) {
// e.g. Indonesian words "sebuah" (a/an), "pulau" (island)
return false;
}
// Default: assume tech if none of the travel indicators were present
return true;
}
// Find known Java-related frameworks or libraries mentioned in text
private String findFrameworks(String text) {
String[] techTerms = { "Spring", "Quarkus", "Jakarta", "Hibernate", "JDK", "JVM" };
StringBuilder found = new StringBuilder();
for (String term : techTerms) {
if (text.contains(term)) {
if (found.length() > 0)
found.append(",");
found.append(term);
}
}
return found.toString();
}
// Extract all hashtags (e.g. #Java, #Quarkus) from text
private String extractHashtags(String text) {
Matcher m = Pattern.compile("#\\w+").matcher(text);
StringBuilder tags = new StringBuilder();
while (m.find()) {
if (tags.length() > 0)
tags.append(",");
tags.append(m.group());
}
return tags.toString();
}
// Extract all URLs from text (simple regex for http/https links)
private String extractLinks(String text) {
Matcher m = Pattern.compile("(https?://\\S+)").matcher(text);
StringBuilder links = new StringBuilder();
while (m.find()) {
if (links.length() > 0)
links.append(",");
links.append(m.group());
}
return links.toString();
}
// Very basic language detection (placeholder for a real NLP library)
private String detectLanguage(String text) {
// For demo: if contains likely English stopwords vs. Indonesian words, etc.
// Here we'll just default to "en" for simplicity.
return "en";
}
The isTechRelatedPost function is our layered filter: it first checks for obvious travel context keywords (if present, we classify the post as non-tech and skip it), then checks for programming context keywords (if present, definitely tech). We also included a simple check for a couple of Indonesian words as a heuristic. In a real scenario, you might use a natural language processing library or a more nuanced model to disambiguate “Java,” but these rules illustrate the approach.
The other methods gather metadata using regex searches:
Hashtags: find all words starting with #.
Links: find all http:// or https:// URLs.
Frameworks: we use a fixed list of tech keywords to look for (e.g., if a post mentions “Spring” or “JDK”, we capture those).
Language: here we stub a simple solution that always returns "en" (assuming most tech posts are in English). You could integrate a library (like OpenNLP or a language detector) to improve this, or even use the presence of certain characters/words to guess the language.
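If you want something slightly better than a hard-coded "en" without pulling in a full NLP library, counting stopword hits is a reasonable middle ground. This is a sketch, not the article’s implementation; the word lists are tiny and purely illustrative:

```java
import java.util.List;

public class NaiveLangDetect {
    // Minimal stopword lists for English and Indonesian (illustrative only)
    private static final List<String> EN = List.of(" the ", " and ", " is ", " with ");
    private static final List<String> ID = List.of(" yang ", " dan ", " di ", " sebuah ");

    // Returns "id" if more Indonesian stopwords match than English ones, else "en"
    static String detectLanguage(String text) {
        String padded = " " + text.toLowerCase() + " ";
        long en = EN.stream().filter(padded::contains).count();
        long id = ID.stream().filter(padded::contains).count();
        return id > en ? "id" : "en";
    }

    public static void main(String[] args) {
        System.out.println(detectLanguage("Debugging the JVM is fun"));                  // en
        System.out.println(detectLanguage("Liburan di pulau Jawa, sebuah pengalaman"));  // id
    }
}
```

A real deployment would swap this for a proper detector, but the shape of the API (text in, language code out) stays the same.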
With the BlueskySubscriber
in place, our application on startup will connect to Bluesky’s Jetstream and continuously process incoming posts. It will filter and save any posts that match #Java (tech) to our database.
As seen above, the core filter is checking for the hashtag #Java
and then disambiguating its meaning. This step is crucial because we only want technology-related Java posts in our feed (e.g. Java programming, frameworks, JVM topics) – not posts about traveling to Java or enjoying a cup of Java coffee.
Our approach used simple keyword rules as a proxy for basic NLP:
We assume if the post text contains certain place names, tourism keywords, or “coffee”, it’s talking about the island or coffee (e.g. “Java trip in Indonesia”, “Java coffee is great”) – those are excluded from the feed.
If the text contains programming terms (libraries like Spring, Quarkus, or words like “JDK”, “programming”), we treat it as a tech post and include it.
We also included a crude language check – if we detect Indonesian words in the content, it’s likely not a programming discussion (since many Indonesian users might use “Java” to refer to the place). In practice, a dedicated language detector could be used for better accuracy.
This layered logic ensures we don’t blindly include every #Java mention. For example: a post that says “Exploring coffee plantations in #Java” would be skipped (contains “coffee” → not tech), whereas “Debugging memory issues in #Java on the JVM” would pass the filter (contains “JVM” → tech context).
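To make those two example decisions concrete, here is a self-contained restatement of the layered rules with the examples run through it. Note one deliberate tweak: “jvm” is added to the tech keywords so the second example matches on a positive rule rather than the default-true fallback:

```java
public class JavaFilterDemo {
    // Standalone copy of the layered rules, for illustration
    static boolean isTechRelatedPost(String text) {
        String t = text.toLowerCase();
        // Layer 1: travel/coffee context wins first
        if (t.contains("indonesia") || t.contains("jakarta")
                || t.contains("coffee") || t.contains("island")) {
            return false;
        }
        // Layer 2: explicit programming context ("jvm" added here)
        if (t.contains("spring") || t.contains("quarkus")
                || t.contains("jdk") || t.contains("jvm") || t.contains("programming")) {
            return true;
        }
        // Default: assume tech
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isTechRelatedPost("Exploring coffee plantations in #Java"));       // false
        System.out.println(isTechRelatedPost("Debugging memory issues in #Java on the JVM")); // true
    }
}
```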
Extracting Anonymized Metadata
Beyond filtering, our feed generator collects various metadata from each post. This data can be used for analytics or future enhancements (though it won’t be directly exposed in the feed skeleton). Here’s what we extract and how:
Libraries/Frameworks Mentioned: We scan the post text for keywords like “Spring”, “Quarkus”, “Jakarta”, etc. The findFrameworks() function simply checks for the presence of these substrings. We store any matches as a comma-separated list (e.g. "Spring,Quarkus" if both are mentioned). This gives a sense of which Java technologies are being discussed in trending posts.
Time of Day: From the createdAt timestamp of the post, we take the hour (0–23 in UTC) and store it as hourofday. This could reveal when Java discussions are most active (for example, if many posts happen around 9–10 AM).
Hashtags: We extract all hashtags in the post (using the regex #\w+). This will include “#Java” (by definition of our filter) and possibly other tags (e.g. “#Quarkus” or “#programming”). Storing hashtags can help identify related topics or trends in the posts.
Links: We pull out any URLs present in the text (regex for http:// or https://). Analyzing links could tell us whether posts often reference documentation, blogs, StackOverflow, etc., though for our prototype we just store them as text.
Language Detection: We include a placeholder for language. In our demo code, we default to "en" or do a naive check. In a real scenario, integrating a library like Tika or the Optimaize Language Detector could automatically detect whether a post is in English, Indonesian, Spanish, etc. This might further help filter content or label it. For example, if a post is entirely in Indonesian, it’s likely not about Java programming (and our filter might catch that). We keep the language field so that later we could plug in an actual detector.
All this metadata is stored in the post table alongside the original post text and URI. We avoid storing any personal user info – note that we use the user’s DID only as part of the uri (which is an opaque identifier like did:plc:abcdef1234...), and we do not store usernames or anything private. This keeps our data anonymized to an extent, focusing only on content patterns. (If needed, we could also skip storing the full text to be extra cautious, but since the text comes from public posts and is useful for debugging the feed, we keep it in this prototype.)
We’ve set up the database and used Panache to persist our PostEntity. Now, let’s define the PostEntity class that maps to our post table. Panache allows a very straightforward entity definition. Create a file PostEntity.java in the com.example.model package:
package com.example.model;
import java.time.OffsetDateTime;
import io.quarkus.hibernate.orm.panache.PanacheEntity;
import jakarta.persistence.Entity;
import jakarta.persistence.Table;
@Entity
@Table(name = "post")
public class PostEntity extends PanacheEntity {
public String uri;
public String text;
public OffsetDateTime createdAt;
public int hourOfDay;
public String hashtags;
public String links;
public String frameworks;
public String language;
public OffsetDateTime indexedAt;
}
Because we extend PanacheEntity, an id field is automatically provided (with type and generation strategy per Panache defaults). We’ve named the table explicitly as post to match our SQL migration. The field names correspond to the column names: Hibernate emits them unquoted, and PostgreSQL folds unquoted identifiers to lowercase, so createdAt maps to the createdat column, hourOfDay to hourofday, and so on.
With this entity in place, our earlier call to post.persist() will handle inserting a new row. Panache also gives us convenient query methods we’ll use soon for retrieving data (e.g. PostEntity.findAll()).
We already wrote V1__Create_posts_table.sql for the table. If you compile and run the application (mvn quarkus:dev), Quarkus + Flyway will execute that migration on startup. You should see Flyway logs indicating the migration was applied.
Notice we annotated processJetstreamEvent with @Transactional. Quarkus will start a transaction for that method call (since it’s a CDI bean method), so post.persist() runs inside a transaction and commits automatically. Alternatively, we could have opened a transaction explicitly or used Panache’s active-record pattern differently, but this approach is straightforward: each WebSocket message is handled and committed independently. For a higher-throughput design, one might batch commits or use reactive approaches, but our focus is clarity.
At this point, every qualifying post from the Bluesky firehose is being stored in our local database, with rich metadata. Over time, our post table will contain the recent #Java tech posts. We might want to implement a retention policy (for example, deleting entries older than 48 hours, since feed generators typically don’t need to keep content indefinitely), which could be a simple periodic cleanup job.
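Such a cleanup job could be as small as one SQL statement run periodically (from cron, or from a Quarkus @Scheduled method if you add the scheduler extension). The 48-hour window below is the suggestion mentioned above, not a hard requirement:

```sql
-- Purge posts we indexed more than 48 hours ago
DELETE FROM post WHERE indexedat < now() - INTERVAL '48 hours';
```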
Serving the Custom Feed via XRPC
Now for the key piece: the feed generator must expose an HTTP GET endpoint at the path /xrpc/app.bsky.feed.getFeedSkeleton. When a user’s Bluesky app requests our custom feed, it calls this endpoint. According to Bluesky’s spec, the request includes a query param feed (the AT URI of the feed definition record) and possibly a cursor for pagination. Our server should respond with a JSON object containing an array of feed items (each item is a post URI, plus optional reason data) and a cursor string if there’s more content.
In our case, the feed array will contain up to N posts (N defined by the limit parameter, typically defaulting to 30 or 50) sorted by recency. The cursor can be used to paginate to older posts if the feed is long. For simplicity, we’ll implement a single-page feed with a stub cursor.
Let’s create the resource. Make a class FeedResource.java in com.example.api:
package com.example.api;
import java.util.List;
import com.example.model.PostEntity;
import jakarta.ws.rs.DefaultValue;
import jakarta.ws.rs.GET;
import jakarta.ws.rs.Path;
import jakarta.ws.rs.Produces;
import jakarta.ws.rs.QueryParam;
import jakarta.ws.rs.core.MediaType;
@Path("/xrpc/app.bsky.feed.getFeedSkeleton")
@Produces(MediaType.APPLICATION_JSON)
public class FeedResource {
@GET
public FeedResponse getFeedSkeleton(@QueryParam("feed") String feedUri,
@QueryParam("cursor") String cursor,
@QueryParam("limit") @DefaultValue("50") int limit) {
// Query recent posts from the database (sorted by newest first)
List<PostEntity> posts;
if (cursor != null && !cursor.isEmpty()) {
// TODO: optional pagination logic (not fully implemented in this prototype)
posts = PostEntity.findAll().list(); // (for now, ignore cursor in prototype)
} else {
posts = PostEntity.findAll().page(0, limit).list();
// Alternatively: PostEntity.findAll(Sort.by("createdAt").descending()).range(0,
// limit-1).list();
}
// Build the feed list
FeedResponse response = new FeedResponse();
response.feed = posts.stream().map(
p -> {
FeedItem item = new FeedItem();
item.post = p.uri;
return item;
}).toList();
// For simplicity, we won’t implement real cursor pagination here.
response.cursor = "";
return response;
}
// DTO classes for JSON serialization
public static class FeedResponse {
public List<FeedItem> feed;
public String cursor;
}
public static class FeedItem {
public String post;
// (optionally could include "reason" or other fields if needed)
}
}
A few important points about the endpoint:
It’s registered under the exact XRPC path /xrpc/app.bsky.feed.getFeedSkeleton. We use this full path so that Bluesky can find it. The method is a plain @GET, since feed skeleton retrieval is an HTTP GET.
Query parameters:
feed – the AT URI of the feed being requested. In Bluesky, one feed generator service can host multiple feeds (different algorithms); the feed param tells us which specific feed the user wants. In our case, we might have only one feed (e.g. at://.../app.bsky.feed.generator/javafeed). We aren’t using this value to differentiate logic (since there’s just one feed), but we accept it to conform to the interface.
cursor – a token for pagination. If the client received a cursor last time, it sends it back to get the next page. Our implementation above doesn’t fully handle it; it just ignores or resets it. For a real feed, we’d implement cursor-based pagination (explained below).
limit – Bluesky’s app can request a certain limit (often 30 by default). We use a default of 50 if not provided, and fetch that many posts from our DB.
Fetching from DB: We use Panache to get the recent posts. The code above uses a simple findAll().page(0, limit), which by default orders by primary key (which correlates with insertion order). To be safer, you might want to sort by createdAt descending. In Panache, we could do:
posts = PostEntity.findAll(Sort.by("createdAt").descending()).range(0, limit - 1).list();
This retrieves the latest limit posts by timestamp. Ensure your createdAt column is indexed, or use id if insertion order is guaranteed to follow creation time. For our prototype scale, this is fine. The result is a list of PostEntity objects.
Building the response: We map each PostEntity to a FeedItem DTO with just the post URI field. Bluesky’s spec allows an optional "reason" field (e.g. if the feed recommends a post because someone you follow liked it), but our feed has no such concept – it’s purely topical, so we omit the reason. We then wrap that list in a FeedResponse object containing the feed list and a cursor string.
Cursor logic: In production, we would implement a cursor to allow pagination. Typically, the cursor could be something like <timestamp>::<postCid> of the last item in the current page; the feed generator would use it to fetch older posts on the next request. For simplicity, we always set cursor = "" (empty), indicating either end-of-feed or that we aren’t supporting pagination in this prototype. The Bluesky app interprets an empty cursor as no further results, which is acceptable for a demo (the feed just won’t paginate beyond the first batch). A full implementation would return a cursor whenever the feed has more items than limit, and honor an incoming cursor by querying older posts (e.g. “createdAt < lastSeenCreatedAt” in the DB query).
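As a sketch of that cursor scheme, here is a tiny encoder/decoder for a <timestamp>::<cid> token. The format and helper names are assumptions for illustration, not part of the Bluesky spec:

```java
import java.time.Instant;

public class CursorCodec {
    // Encode the last item of a page as "<epochMillis>::<cid>"
    static String encode(Instant createdAt, String cid) {
        return createdAt.toEpochMilli() + "::" + cid;
    }

    // Recover the timestamp to use in a "createdAt < ?" query for the next page
    static Instant decodeTimestamp(String cursor) {
        String[] parts = cursor.split("::", 2);
        return Instant.ofEpochMilli(Long.parseLong(parts[0]));
    }

    public static void main(String[] args) {
        Instant t = Instant.parse("2025-08-10T12:00:00Z");
        String cursor = encode(t, "bafyexamplecid");
        // Round-trips: the decoded timestamp equals the original
        System.out.println(decodeTimestamp(cursor).equals(t)); // true
    }
}
```

On an incoming request with a cursor, you would decode the timestamp and query posts with createdAt strictly older than it.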
Now our feed service should produce the expected output. When Bluesky (or any client) calls GET /xrpc/app.bsky.feed.getFeedSkeleton?..., our FeedResource will return JSON like:
{
"cursor": "",
"feed": [
{ "post": "at://did:plc:.../app.bsky.feed.post/abcdef..." },
{ "post": "at://did:plc:.../app.bsky.feed.post/ghijkl..." },
...
]
}
This matches the schema defined in Bluesky’s lexicon. The Bluesky app (AppView) will take these URIs and hydrate them – meaning it will fetch the actual posts (author info, content, etc.) from the network and then display them in the user’s feed. Our service’s job is only to provide the list of relevant post URIs in the right order.
Before moving on, double-check that the resource path is correct and the application is configured to listen on the default port (Quarkus dev mode runs on port 8080 by default). Also, ensure CORS or other settings are open if you plan to test from a browser or other environment – for simple curl tests, it’s not an issue.
Testing Locally
With everything in place, it’s time to run our prototype and test it out.
quarkus dev
If you don’t want to wait for live #Java posts, insert a test row:
INSERT INTO post(uri, text, createdat, hourofday)
VALUES ('at://did:plc:test/app.bsky.feed.post/1', 'Hello #Java from Quarkus', '2025-08-10T12:00:00Z', 12);
Then query the feed:
curl "http://localhost:8080/xrpc/app.bsky.feed.getFeedSkeleton?feed=at://did:example/app.bsky.feed.generator/java-feed" | jq .
Expected output (live posts that matched the filter may appear alongside the test row):
{
"feed": [
{
"post": "at://did:plc:test/app.bsky.feed.post/1"
}
],
"cursor": ""
}
Finally, to integrate this feed into your Bluesky app (for real use), you would need to do two things not covered in depth here:
Obtain a DID for your feed service (either did:web or did:plc) and host a .well-known/did.json. In development you might skip this, but publishing a feed requires it.
Create a feed record in your Bluesky account’s repository (using Bluesky’s AT Protocol library or API) that points to your feed generator’s DID and gives your feed a name/description. For instance, you’d create an app.bsky.feed.generator record like “java-feed”.
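For reference, the published record is an app.bsky.feed.generator record shaped roughly like this (the did and text values are placeholders for your own service):

```json
{
  "$type": "app.bsky.feed.generator",
  "did": "did:web:feeds.example.com",
  "displayName": "Java Feed",
  "description": "Tech posts tagged #Java, filtered for programming content",
  "createdAt": "2025-08-10T12:00:00Z"
}
```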
Wrap-Up
Congratulations – you have a working Bluesky custom feed generator prototype!
You set up a Quarkus service that streams Bluesky posts in real-time, filters for #Java tech content, enriches it with metadata, stores it locally, and serves a feed endpoint that Bluesky can query.
This provides the foundation for a powerful custom feed. You can extend this by refining the NLP filters, adding pagination, deploying it online, and registering the feed so others can follow it. Happy coding, and enjoy your custom Java feed on Bluesky!
The team reached out with comments around two things:
- I should use WebSockets Next, and
- I should make it more resilient.
So, I started a reactive version, if you’re interested:
https://github.com/myfear/ejq_substack_articles/tree/main/bsky-javafeed-generator-reactive
I will get it feature-complete and push out a separate post soon.