Building a Real-Time Bluesky Feed with Quarkus and Java
How to stream, filter, and analyze #Java posts from Bluesky’s firehose using modern Java and Quarkus.
Custom Bluesky feeds (feed generators) are external services that Bluesky can query to fetch a skeleton list of post URIs, implementing custom timelines. In this tutorial, we’ll build a local feed generator in Java using Quarkus – filtering Bluesky’s global feed for #Java posts (focusing on tech content, not travel), collecting metadata, and exposing the required XRPC endpoint for Bluesky. This step-by-step guide assumes you’re an intermediate Java developer and covers:
Project Setup: Creating a new Quarkus project (with the Quarkus CLI) with extensions for WebSockets, REST/JSON, PostgreSQL (Panache ORM), and Flyway.
Bluesky Firehose Connection: Subscribing to Bluesky’s firehose (via Jetstream WebSocket) and processing incoming posts.
Post Filtering: Detecting #Java posts and distinguishing technology-related posts from travel-related posts.
Metadata Extraction: Parsing anonymized metadata from posts – frameworks mentioned, post creation time, hashtags, links, language detection.
Persistence: Storing filtered posts and metadata in PostgreSQL using Quarkus Panache and Flyway.
XRPC Feed Endpoint: Implementing the app.bsky.feed.getFeedSkeleton endpoint.
Testing Locally: Running the app and verifying results with curl.
As this tutorial contains more code than usual, you are highly encouraged to grab the project from my GitHub repository and start from there.
Setting Up the Quarkus Project and Extensions
Create a new Quarkus project with the needed extensions:
quarkus create app com.example:bsky-javafeed-generator \
-x vertx,rest-jackson,hibernate-orm-panache,jdbc-postgresql,flyway
cd bsky-javafeed-generator
Extensions we’ll use:
quarkus-vertx (access to the Vert.x API for WebSocket connections)
quarkus-rest-jackson (REST API + JSON)
quarkus-hibernate-orm-panache (ORM with convenience API)
quarkus-jdbc-postgresql (JDBC driver)
quarkus-flyway (migrations)
Configure the database and migrations in src/main/resources/application.properties:
quarkus.hibernate-orm.database.generation=none
quarkus.flyway.migrate-at-start=true
Quarkus will use these properties to initialize the datasource (in dev mode, Quarkus Dev Services even starts a throwaway PostgreSQL container for you, since no explicit connection settings are configured). We set generation=none because we’ll create the schema via Flyway rather than letting Quarkus auto-create tables. We also enable migrate-at-start so Flyway applies our SQL migrations on launch.
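For a packaged deployment outside dev mode, you would add explicit datasource properties. The property keys below are standard Quarkus configuration; the URL, username, and password are placeholders for your own environment:

```properties
# Explicit datasource config for non-dev environments (values are placeholders)
quarkus.datasource.db-kind=postgresql
quarkus.datasource.username=feed
quarkus.datasource.password=changeme
quarkus.datasource.jdbc.url=jdbc:postgresql://localhost:5432/bsky-javafeed
```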
Create an initial Flyway migration: src/main/resources/db/migration/V1__Create_posts_table.sql
CREATE TABLE post (
id BIGSERIAL PRIMARY KEY,
uri TEXT NOT NULL, -- AT Protocol URI of the post
text TEXT NOT NULL, -- Post content text
createdat TIMESTAMP WITH TIME ZONE NOT NULL,
hourofday INT NOT NULL, -- Hour (0-23) the post was created (UTC)
hashtags TEXT, -- Comma-separated hashtags in the post
links TEXT, -- Comma-separated links in the post
frameworks TEXT, -- Comma-separated tech libraries mentioned
language VARCHAR(8), -- Detected language code (e.g. 'en')
indexedat TIMESTAMP WITH TIME ZONE DEFAULT now() -- (indexedat is when we saved the post; could help with pagination or TTL policies)
);
CREATE SEQUENCE Post_SEQ START WITH 1 INCREMENT BY 50;
This creates a post table to store each filtered post’s URI, content, metadata, and timestamps. We include an indexedat column with a default of the current time. This is not strictly required, but it is useful if we later implement post expiry (e.g. removing posts older than 48 hours, as Bluesky suggests) or need a unique cursor.
At this point, our project is set up with all required extensions and an initial database schema. We’re ready to start coding the feed generator logic.
Connecting to the Bluesky Firehose (Jetstream)
Bluesky provides a “firehose” stream of all events (posts, likes, follows, etc.) across the network. Feed generators are expected to subscribe to this firehose (the com.atproto.sync.subscribeRepos stream) and index whatever data they need. Rather than consuming the raw firehose directly, we’ll use Jetstream, a Bluesky-operated service that re-publishes the firehose as plain JSON over WebSocket. Jetstream lets us subscribe to just the data we care about (in our case, posts) with far less bandwidth. We’ll connect to this endpoint:
wss://jetstream2.us-east.bsky.network/subscribe?wantedCollections=app.bsky.feed.post
This query subscribes us to only post events (filtering out likes, follows, etc.). We don’t need authentication for this stream – it’s open for reading public posts.
Let’s set up a Quarkus bean to connect to this WebSocket on application startup. We’ll use Vert.x’s WebSocket client and handle messages.
Create a class BlueskySubscriber (e.g. in src/main/java/com/example/service/BlueskySubscriber.java):
package com.example.service;
import java.time.OffsetDateTime;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import org.jboss.logging.Logger;
import com.example.model.PostEntity;
import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;
import io.quarkus.runtime.StartupEvent;
import io.vertx.core.Vertx;
import io.vertx.core.http.WebSocketClient;
import io.vertx.core.http.WebSocketClientOptions;
import io.vertx.core.http.WebSocketConnectOptions;
import java.net.URI;
import jakarta.enterprise.context.ApplicationScoped;
import jakarta.enterprise.event.Observes;
import jakarta.inject.Inject;
import jakarta.transaction.Transactional;
@ApplicationScoped
public class BlueskySubscriber {
private static final Logger LOG = Logger.getLogger(BlueskySubscriber.class);
@Inject
Vertx vertx;
private WebSocketClient wsClient;
void onStart(@Observes StartupEvent ev) {
// Based on official Jetstream documentation:
// https://github.com/bluesky-social/jetstream
String firehoseUrl = "wss://jetstream2.us-east.bsky.network/subscribe?wantedCollections=app.bsky.feed.post";
String description = "Jetstream2 US-East (Official)";
// Initialize WebSocket client with SSL options
WebSocketClientOptions options = new WebSocketClientOptions()
.setSsl(true)
.setVerifyHost(false)
.setTrustAll(true); // For testing - in production use proper SSL verification
wsClient = vertx.createWebSocketClient(options);
connectToFirehose(firehoseUrl, description);
}
private void connectToFirehose(String url, String description) {
LOG.infof("Attempting to connect to %s: %s", description, url);
try {
URI uri = URI.create(url);
WebSocketConnectOptions connectOptions = new WebSocketConnectOptions()
.setHost(uri.getHost())
.setPort(uri.getPort() == -1 ? 443 : uri.getPort())
.setURI(uri.getPath() + (uri.getQuery() != null ? "?" + uri.getQuery() : ""))
.setSsl(true);
wsClient.connect(connectOptions)
.onSuccess(ws -> {
LOG.infof("Successfully connected to %s: %s", description, url);
// Set up message handlers - dispatch to same I/O thread for transaction support
ws.textMessageHandler(message -> {
vertx.executeBlocking(() -> {
handleTextMessage(message);
return null;
}, false);
});
// Set up error and close handlers
ws.exceptionHandler(error -> {
LOG.errorf("WebSocket error on %s: %s", description, error.getMessage());
// Try to reconnect after a delay
vertx.setTimer(5000, timer -> connectToFirehose(url, description));
});
ws.closeHandler(closeReason -> {
LOG.warnf("WebSocket closed on %s", description);
// Try to reconnect after a delay
vertx.setTimer(5000, timer -> connectToFirehose(url, description));
});
LOG.infof("Listening for real-time Bluesky posts...");
})
.onFailure(error -> {
LOG.errorf("Failed to connect to %s (%s): %s", description, url, error.getMessage());
// Try to reconnect after a delay
vertx.setTimer(10000, timer -> connectToFirehose(url, description));
});
} catch (Exception e) {
LOG.errorf("Error parsing URL %s: %s", url, e.getMessage());
// Try to reconnect after a delay
vertx.setTimer(10000, timer -> connectToFirehose(url, description));
}
}
private void handleTextMessage(String message) {
// This method is invoked for each incoming JSON message from Jetstream
try {
processJetstreamEvent(message);
} catch (Exception ex) {
LOG.error("Error processing Jetstream event: ", ex);
}
}
//...
}
A few notes on this setup:
We inject the managed Vertx instance and create a WebSocketClient from it in the onStart observer, which fires once the application has started.
Incoming frames arrive on a Vert.x event-loop thread, so the textMessageHandler dispatches each message via vertx.executeBlocking. Our handler does blocking work (JSON parsing plus a transactional DB write), which must never run directly on the event loop.
We register exception and close handlers; in both cases we schedule a reconnect with vertx.setTimer after a short delay, giving us basic resilience against dropped connections.
setTrustAll(true) and setVerifyHost(false) are only acceptable for local testing; in production, rely on proper TLS certificate verification.
Now, parsing and processing the incoming messages. Jetstream events are JSON objects. We’re interested in commit events where operation == "create" and collection == "app.bsky.feed.post". These indicate a new post was created. The record contains the post content (text) and timestamp (createdAt). We’ll extract those, then apply our filters and metadata extraction.
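For reference, a single Jetstream post-creation event is shaped roughly like this (abridged; the DID, rkey, CID, and timestamps shown here are illustrative values, not real data):

```json
{
  "did": "did:plc:abc123exampledid",
  "time_us": 1725911162329308,
  "kind": "commit",
  "commit": {
    "rev": "3l3qo2vutsw2b",
    "operation": "create",
    "collection": "app.bsky.feed.post",
    "rkey": "3l3qo2vuowo2b",
    "record": {
      "$type": "app.bsky.feed.post",
      "createdAt": "2024-09-09T19:46:02.102Z",
      "langs": ["en"],
      "text": "Trying out #Java virtual threads today"
    },
    "cid": "bafyreia..."
  }
}
```

The fields our handler reads are kind, commit.operation, commit.collection, commit.record.text, commit.record.createdAt, did, and commit.rkey.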
Process Events
In BlueskySubscriber, add a method processJetstreamEvent(String json) to handle one event message:
@Transactional
void processJetstreamEvent(String json) throws Exception {
// Parse JSON text into a tree for inspection
JsonNode root = new ObjectMapper().readTree(json); // (reuse a shared ObjectMapper in production)
// Jetstream message format: {"kind": "commit", "commit": {...}, "did": "..."}
if (!"commit".equals(root.path("kind").asText())) {
return; // ignore non-commit events (e.g. 'identity' or 'account' updates)
}
JsonNode commit = root.path("commit");
if (!"create".equals(commit.path("operation").asText()) ||
!"app.bsky.feed.post".equals(commit.path("collection").asText())) {
return; // not a new post creation, ignore (could be likes, follows, etc.)
}
// Extract post text and creation timestamp
String text = commit.path("record").path("text").asText("");
String createdAtStr = commit.path("record").path("createdAt").asText("");
if (text.isEmpty() || !text.contains("#Java")) {
return; // Skip if no text or does not contain #Java hashtag
}
// Distinguish tech vs. travel context for "#Java"
if (!isTechRelatedPost(text)) {
// It's a #Java mention likely about Java (island/coffee), skip indexing
return;
}
// Extract metadata
OffsetDateTime createdAt = OffsetDateTime.parse(createdAtStr);
int hour = createdAt.getHour();
String frameworks = findFrameworks(text);
String hashtags = extractHashtags(text);
String links = extractLinks(text);
String language = detectLanguage(text);
// Construct the AT URI for the post: "at://{did}/app.bsky.feed.post/{rkey}"
String userDid = root.path("did").asText();
String rkey = commit.path("rkey").asText();
String atUri = "at://" + userDid + "/app.bsky.feed.post/" + rkey;
// Persist to database via Panache entity
PostEntity post = new PostEntity();
post.uri = atUri;
post.text = text;
post.createdAt = createdAt;
post.hourOfDay = hour;
post.frameworks = frameworks;
post.hashtags = hashtags;
post.links = links;
post.language = language;
post.persist(); // Panache will insert the record (within the @Transactional context)
LOG.infof("Indexed post %s (hour %d, frameworks: %s)", atUri, hour, frameworks);
}
This method uses Jackson to parse the JSON and then applies our filtering and extraction logic:
Filtering #Java tech posts: We check if the post text contains the hashtag #Java. If not, we ignore the event. If yes, we call isTechRelatedPost(text) to determine whether it’s about Java the programming language (tech) or something else (e.g. the Indonesian island or coffee). We implement this as simple rule-based logic (layered checks), which we’ll show next.
Extracting metadata: For a qualifying post, we gather:
Time of day: parse createdAt into an OffsetDateTime and get the hour of day (0–23).
Frameworks/Libraries: scan the text for known tech keywords.
Hashtags: find all hashtags in the text.
Links: find any URLs in the text.
Language: (basic approach) determine the likely language of the post.
Constructing the post URI: The Bluesky AT Protocol URI for a post is composed of the author’s DID, the collection (app.bsky.feed.post), and the record’s key. We retrieve the did (author’s DID) from the top-level JSON and rkey from the commit, then format at://did:.../app.bsky.feed.post/rkey. This URI is what we will later return in our feed skeleton.
Persisting to PostgreSQL: We create a PostEntity (a Panache entity class for the post table), set its fields, then call persist(). We annotated processJetstreamEvent with @Transactional so that this DB operation occurs in a transaction (Panache will handle the insert). We’ll define PostEntity shortly in the next section.
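The URI construction step is plain string concatenation. As a tiny standalone sketch (the DID and rkey below are made-up values):

```java
public class AtUriDemo {
    // Builds "at://{did}/app.bsky.feed.post/{rkey}" as described above
    static String postUri(String did, String rkey) {
        return "at://" + did + "/app.bsky.feed.post/" + rkey;
    }

    public static void main(String[] args) {
        // Prints: at://did:plc:abc123/app.bsky.feed.post/3lwxyz
        System.out.println(postUri("did:plc:abc123", "3lwxyz"));
    }
}
```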
Filtering and Extraction
Let’s implement the helper methods used above for filtering and extraction:
// Determine if a #Java post is tech-related or not
private boolean isTechRelatedPost(String text) {
String textLower = text.toLowerCase();
// Keywords indicating a travel/coffee context for "Java"
if (textLower.contains("indonesia") || textLower.contains("jakarta")
|| textLower.contains("coffee") || textLower.contains("island")) {
return false; // mentions of Indonesia/coffee likely mean Java the place or coffee
}
// Keywords that strongly indicate tech context
if (textLower.contains("spring") || textLower.contains("quarkus")
|| textLower.contains("jdk") || textLower.contains("programming")) {
return true;
}
// (Basic language hint: if contains typical Indonesian words, you could flag as
// non-tech too)
if (textLower.matches(".*\\bsebuah\\b.*") || textLower.matches(".*\\bpulau\\b.*")) {
// e.g. Indonesian words "sebuah" (a/an), "pulau" (island)
return false;
}
// Default: assume tech if none of the travel indicators were present
return true;
}
// Find known Java-related frameworks or libraries mentioned in text
private String findFrameworks(String text) {
String[] techTerms = { "Spring", "Quarkus", "Jakarta", "Hibernate", "JDK", "JVM" };
StringBuilder found = new StringBuilder();
for (String term : techTerms) {
if (text.contains(term)) {
if (found.length() > 0)
found.append(",");
found.append(term);
}
}
return found.toString();
}
// Extract all hashtags (e.g. #Java, #Quarkus) from text
private String extractHashtags(String text) {
Matcher m = Pattern.compile("#\\w+").matcher(text);
StringBuilder tags = new StringBuilder();
while (m.find()) {
if (tags.length() > 0)
tags.append(",");
tags.append(m.group());
}
return tags.toString();
}
// Extract all URLs from text (simple regex for http/https links)
private String extractLinks(String text) {
Matcher m = Pattern.compile("(https?://\\S+)").matcher(text);
StringBuilder links = new StringBuilder();
while (m.find()) {
if (links.length() > 0)
links.append(",");
links.append(m.group());
}
return links.toString();
}
// Very basic language detection (placeholder for a real NLP library)
private String detectLanguage(String text) {
// For demo: if contains likely English stopwords vs. Indonesian words, etc.
// Here we'll just default to "en" for simplicity.
return "en";
}
The isTechRelatedPost function is our layered filter: it first checks for obvious travel context keywords (if present, we classify the post as non-tech and skip it), then checks for programming context keywords (if present, definitely tech). We also included a simple check for a couple of Indonesian words as a heuristic. In a real scenario, you might use a natural language processing library or a more nuanced model to disambiguate “Java,” but these rules illustrate the approach.
The other methods gather metadata using regex searches:
Hashtags: find all words starting with #.
Links: find all http:// or https:// URLs.
Frameworks: we use a fixed list of tech keywords to look for (e.g., if a post mentions “Spring” or “JDK”, we capture those).
Language: here we stub a simple solution that always returns "en" (assuming most tech posts are in English). You could integrate a library (like OpenNLP or a language detector) to improve this, or even use the presence of certain characters/words to guess the language.
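If you want something slightly better than a hard-coded "en" without pulling in a full NLP library, counting stopword hits is a reasonable middle ground. This is a sketch, not the article’s implementation; the word lists are tiny and purely illustrative:

```java
import java.util.List;

public class NaiveLangDetect {
    // Minimal stopword lists for English and Indonesian (illustrative only)
    private static final List<String> EN = List.of(" the ", " and ", " is ", " with ");
    private static final List<String> ID = List.of(" yang ", " dan ", " di ", " sebuah ");

    // Returns "id" if more Indonesian stopwords match than English ones, else "en"
    static String detectLanguage(String text) {
        String padded = " " + text.toLowerCase() + " ";
        long en = EN.stream().filter(padded::contains).count();
        long id = ID.stream().filter(padded::contains).count();
        return id > en ? "id" : "en";
    }

    public static void main(String[] args) {
        System.out.println(detectLanguage("Debugging the JVM is fun"));                  // en
        System.out.println(detectLanguage("Liburan di pulau Jawa, sebuah pengalaman"));  // id
    }
}
```

A real deployment would swap this for a proper detector, but the shape of the API (text in, language code out) stays the same.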
With the BlueskySubscriber
in place, our application on startup will connect to Bluesky’s Jetstream and continuously process incoming posts. It will filter and save any posts that match #Java (tech) to our database.
As seen above, the core filter is checking for the hashtag #Java
and then disambiguating its meaning. This step is crucial because we only want technology-related Java posts in our feed (e.g. Java programming, frameworks, JVM topics) – not posts about traveling to Java or enjoying a cup of Java coffee.
Our approach used simple keyword rules as a proxy for basic NLP:
We assume if the post text contains certain place names, tourism keywords, or “coffee”, it’s talking about the island or coffee (e.g. “Java trip in Indonesia”, “Java coffee is great”) – those are excluded from the feed.
If the text contains programming terms (libraries like Spring, Quarkus, or words like “JDK”, “programming”), we treat it as a tech post and include it.
We also included a crude language check – if we detect Indonesian words in the content, it’s likely not a programming discussion (since many Indonesian users might use “Java” to refer to the place). In practice, a dedicated language detector could be used for better accuracy.
This layered logic ensures we don’t blindly include every #Java mention. For example: a post that says “Exploring coffee plantations in #Java” would be skipped (contains “coffee” → not tech), whereas “Debugging memory issues in #Java on the JVM” would pass the filter (contains “JVM” → tech context).
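To make those two example decisions concrete, here is a self-contained restatement of the layered rules with the examples run through it. Note one deliberate tweak: “jvm” is added to the tech keywords so the second example matches on a positive rule rather than the default-true fallback:

```java
public class JavaFilterDemo {
    // Standalone copy of the layered rules, for illustration
    static boolean isTechRelatedPost(String text) {
        String t = text.toLowerCase();
        // Layer 1: travel/coffee context wins first
        if (t.contains("indonesia") || t.contains("jakarta")
                || t.contains("coffee") || t.contains("island")) {
            return false;
        }
        // Layer 2: explicit programming context ("jvm" added here)
        if (t.contains("spring") || t.contains("quarkus")
                || t.contains("jdk") || t.contains("jvm") || t.contains("programming")) {
            return true;
        }
        // Default: assume tech
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isTechRelatedPost("Exploring coffee plantations in #Java"));       // false
        System.out.println(isTechRelatedPost("Debugging memory issues in #Java on the JVM")); // true
    }
}
```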
Extracting Anonymized Metadata
Beyond filtering, our feed generator collects various metadata from each post. This data can be used for analytics or future enhancements (though it won’t be directly exposed in the feed skeleton). Here’s what we extract and how:
Libraries/Frameworks Mentioned: We scan the post text for keywords like “Spring”, “Quarkus”, “Jakarta”, etc. The findFrameworks() function simply checks for the presence of these substrings. We store any matches as a comma-separated list (e.g. "Spring,Quarkus" if both are mentioned). This gives a sense of which Java technologies are being discussed in trending posts.
Time of Day: From the createdAt timestamp of the post, we take the hour (0–23 in UTC) and store it as hourofday. This could reveal when Java discussions are most active (for example, if many posts happen around 9–10 AM).
Hashtags: We extract all hashtags in the post (using the regex #\w+). This will include “#Java” (by definition of our filter) and possibly other tags (e.g. “#Quarkus” or “#programming”). Storing hashtags can help identify related topics or trends in the posts.
Links: We pull out any URLs present in the text (regex for http:// or https://). Analyzing links could tell us whether posts often reference documentation, blogs, StackOverflow, etc., though for our prototype we just store them as text.
Language Detection: We include a placeholder for language. In our demo code, we default to "en" or do a naive check. In a real scenario, integrating a library like Tika or the Optimaize Language Detector could automatically detect whether a post is in English, Indonesian, Spanish, etc. This might further help filter content or label it. For example, if a post is entirely in Indonesian, it’s likely not about Java programming (and our filter might catch that). We keep the language field so that later we could plug in an actual detector.
All this metadata is stored in the post table alongside the original post text and URI. We avoid storing any personal user info – note that we use the user’s DID only as part of the uri (which is an opaque identifier like did:plc:abcdef1234...), and we do not store usernames or anything private. This keeps our data anonymized to an extent, focusing only on content patterns. (If needed, we could also skip storing the full text to be extra cautious, but since the text comes from public posts and is useful for debugging the feed, we keep it in this prototype.)
We’ve set up the database and used Panache to persist our PostEntity. Now, let’s define the PostEntity class that maps to our post table. Panache allows a very straightforward entity definition. Create a file PostEntity.java in the com.example.model package:
package com.example.model;
import java.time.OffsetDateTime;
import io.quarkus.hibernate.orm.panache.PanacheEntity;
import jakarta.persistence.Entity;
import jakarta.persistence.Table;
@Entity
@Table(name = "post")
public class PostEntity extends PanacheEntity {
public String uri;
public String text;
public OffsetDateTime createdAt;
public int hourOfDay;
public String hashtags;
public String links;
public String frameworks;
public String language;
public OffsetDateTime indexedAt;
}
Because we extend PanacheEntity, an id field is automatically provided (with type and generation strategy per Panache defaults). We’ve named the table explicitly as post to match our SQL migration. The field names correspond to the column names: Hibernate emits them unquoted, and PostgreSQL folds unquoted identifiers to lowercase, so createdAt maps to the createdat column, hourOfDay to hourofday, and so on.
With this entity in place, our earlier call to post.persist() will handle inserting a new row. Panache also gives us convenient query methods we’ll use soon for retrieving data (e.g. PostEntity.findAll()).
We already wrote V1__Create_posts_table.sql for the table. If you compile and run the application (mvn quarkus:dev), Quarkus + Flyway will execute that migration on startup. You should see Flyway logs indicating the migration was applied.
Notice we annotated processJetstreamEvent with @Transactional. Quarkus will start a transaction for that method call (since it’s a CDI bean method), so post.persist() runs inside a transaction and commits automatically. Alternatively, we could have opened a transaction explicitly or used Panache’s active-record pattern differently, but this approach is straightforward: each WebSocket message is handled and committed independently. For a higher-throughput design, one might batch commits or use reactive approaches, but our focus is clarity.
At this point, every qualifying post from the Bluesky firehose is being stored in our local database, with rich metadata. Over time, our post table will contain the recent #Java tech posts. We might want to implement a retention policy (for example, deleting entries older than 48 hours, since feed generators typically don’t need to keep content indefinitely), which could be a simple periodic cleanup job.
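Such a cleanup job could be as small as one SQL statement run periodically (from cron, or from a Quarkus @Scheduled method if you add the scheduler extension). The 48-hour window below is the suggestion mentioned above, not a hard requirement:

```sql
-- Purge posts we indexed more than 48 hours ago
DELETE FROM post WHERE indexedat < now() - INTERVAL '48 hours';
```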
Serving the Custom Feed via XRPC
Now for the key piece: the feed generator must expose an HTTP GET endpoint at the path /xrpc/app.bsky.feed.getFeedSkeleton. When a user’s Bluesky app requests our custom feed, it calls this endpoint. According to Bluesky’s spec, the request includes a query param feed (the AT URI of the feed definition record) and possibly a cursor for pagination. Our server should respond with a JSON object containing an array of feed items (each item is a post URI, plus optional reason data) and a cursor string if there’s more content.
In our case, the feed array will contain up to N posts (N defined by the limit parameter, typically defaulting to 30 or 50) sorted by recency. The cursor can be used to paginate to older posts if the feed is long. For simplicity, we’ll implement a single-page feed with a stub cursor.
Let’s create the resource. Make a class FeedResource.java in com.example.api:
package com.example.api;
import java.util.List;
import com.example.model.PostEntity;
import jakarta.ws.rs.DefaultValue;
import jakarta.ws.rs.GET;
import jakarta.ws.rs.Path;
import jakarta.ws.rs.Produces;
import jakarta.ws.rs.QueryParam;
import jakarta.ws.rs.core.MediaType;
@Path("/xrpc/app.bsky.feed.getFeedSkeleton")
@Produces(MediaType.APPLICATION_JSON)
public class FeedResource {
@GET
public FeedResponse getFeedSkeleton(@QueryParam("feed") String feedUri,
@QueryParam("cursor") String cursor,
@QueryParam("limit") @DefaultValue("50") int limit) {
// Query recent posts from the database (sorted by newest first)
List<PostEntity> posts;
if (cursor != null && !cursor.isEmpty()) {
// TODO: optional pagination logic (not fully implemented in this prototype)
posts = PostEntity.findAll().list(); // (for now, ignore cursor in prototype)
} else {
posts = PostEntity.findAll().page(0, limit).list();
// Alternatively: PostEntity.findAll(Sort.by("createdAt").descending()).range(0,
// limit-1).list();
}
// Build the feed list
FeedResponse response = new FeedResponse();
response.feed = posts.stream().map(
p -> {
FeedItem item = new FeedItem();
item.post = p.uri;
return item;
}).toList();
// For simplicity, we won’t implement real cursor pagination here.
response.cursor = "";
return response;
}
// DTO classes for JSON serialization
public static class FeedResponse {
public List<FeedItem> feed;
public String cursor;
}
public static class FeedItem {
public String post;
// (optionally could include "reason" or other fields if needed)
}
}
A few important points about the endpoint:
It’s registered under the exact XRPC path /xrpc/app.bsky.feed.getFeedSkeleton. We use this full path so that Bluesky can find it. The method is a plain @GET, since feed skeleton retrieval is an HTTP GET.
Query parameters:
feed – the AT URI of the feed being requested. In Bluesky, one feed generator service can host multiple feeds (different algorithms); the feed param tells us which specific feed the user wants. In our case, we might have only one feed (e.g. at://.../app.bsky.feed.generator/javafeed). We aren’t using this value to differentiate logic (since there’s just one feed), but we accept it to conform to the interface.
cursor – a token for pagination. If the client received a cursor last time, it sends it back to get the next page. Our implementation above doesn’t fully handle it; it just ignores or resets it. For a real feed, we’d implement cursor-based pagination (explained below).
limit – Bluesky’s app can request a certain limit (often 30 by default). We use a default of 50 if not provided, and fetch that many posts from our DB.
Fetching from DB: We use Panache to get the recent posts. The code above uses a simple findAll().page(0, limit), which by default orders by primary key (which correlates with insertion order). To be safer, you might want to sort by createdAt descending. In Panache, we could do:
posts = PostEntity.findAll(Sort.by("createdAt").descending()).range(0, limit - 1).list();
This retrieves the latest limit posts by timestamp. Ensure your createdAt column is indexed, or use id if insertion order is guaranteed to follow creation time. For our prototype scale, this is fine. The result is a list of PostEntity objects.
Building the response: We map each PostEntity to a FeedItem DTO with just the post URI field. Bluesky’s spec allows an optional "reason" field (e.g. if the feed recommends a post because someone you follow liked it), but our feed has no such concept – it’s purely topical, so we omit the reason. We then wrap that list in a FeedResponse object containing the feed list and a cursor string.
Cursor logic: In production, we would implement a cursor to allow pagination. Typically, the cursor could be something like <timestamp>::<postCid> of the last item in the current page; the feed generator would use it to fetch older posts on the next request. For simplicity, we always set cursor = "" (empty), indicating either end-of-feed or that we aren’t supporting pagination in this prototype. The Bluesky app interprets an empty cursor as no further results, which is acceptable for a demo (the feed just won’t paginate beyond the first batch). A full implementation would return a cursor whenever the feed has more items than limit, and honor an incoming cursor by querying older posts (e.g. “createdAt < lastSeenCreatedAt” in the DB query).
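As a sketch of that cursor scheme, here is a tiny encoder/decoder for a <timestamp>::<cid> token. The format and helper names are assumptions for illustration, not part of the Bluesky spec:

```java
import java.time.Instant;

public class CursorCodec {
    // Encode the last item of a page as "<epochMillis>::<cid>"
    static String encode(Instant createdAt, String cid) {
        return createdAt.toEpochMilli() + "::" + cid;
    }

    // Recover the timestamp to use in a "createdAt < ?" query for the next page
    static Instant decodeTimestamp(String cursor) {
        String[] parts = cursor.split("::", 2);
        return Instant.ofEpochMilli(Long.parseLong(parts[0]));
    }

    public static void main(String[] args) {
        Instant t = Instant.parse("2025-08-10T12:00:00Z");
        String cursor = encode(t, "bafyexamplecid");
        // Round-trips: the decoded timestamp equals the original
        System.out.println(decodeTimestamp(cursor).equals(t)); // true
    }
}
```

On an incoming request with a cursor, you would decode the timestamp and query posts with createdAt strictly older than it.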
Now our feed service should produce the expected output. When Bluesky (or any client) calls GET /xrpc/app.bsky.feed.getFeedSkeleton?..., our FeedResource will return JSON like:
{
"cursor": "",
"feed": [
{ "post": "at://did:plc:.../app.bsky.feed.post/abcdef..." },
{ "post": "at://did:plc:.../app.bsky.feed.post/ghijkl..." },
...
]
}
This matches the schema defined in Bluesky’s lexicon. The Bluesky app (AppView) will take these URIs and hydrate them – meaning it will fetch the actual posts (author info, content, etc.) from the network and then display them in the user’s feed. Our service’s job is only to provide the list of relevant post URIs in the right order.
Before moving on, double-check that the resource path is correct and the application is configured to listen on the default port (Quarkus dev mode runs on port 8080 by default). Also, ensure CORS or other settings are open if you plan to test from a browser or other environment – for simple curl tests, it’s not an issue.
Testing Locally
With everything in place, it’s time to run our prototype and test it out.
quarkus dev
If you don’t want to wait for live #Java posts, insert a test row:
INSERT INTO post(uri, text, createdat, hourofday)
VALUES ('at://did:plc:test/app.bsky.feed.post/1', 'Hello #Java from Quarkus', '2025-08-10T12:00:00Z', 12);
Then query the feed:
curl "http://localhost:8080/xrpc/app.bsky.feed.getFeedSkeleton?feed=at://did:example/app.bsky.feed.generator/java-feed" | jq .
Expected output (live posts that matched the filter may appear alongside the test row):
{
"feed": [
{
"post": "at://did:plc:test/app.bsky.feed.post/1"
}
],
"cursor": ""
}
Finally, to integrate this feed into your Bluesky app (for real use), you would need to do two things not covered in depth here:
Obtain a DID for your feed service (either did:web or did:plc) and host a .well-known/did.json. In development you might skip this, but publishing a feed requires it.
Create a feed record in your Bluesky account’s repository (using Bluesky’s AT Protocol library or API) that points to your feed generator’s DID and gives your feed a name/description. For instance, you’d create an app.bsky.feed.generator record like “java-feed”.
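For reference, the published record is an app.bsky.feed.generator record shaped roughly like this (the did and text values are placeholders for your own service):

```json
{
  "$type": "app.bsky.feed.generator",
  "did": "did:web:feeds.example.com",
  "displayName": "Java Feed",
  "description": "Tech posts tagged #Java, filtered for programming content",
  "createdAt": "2025-08-10T12:00:00Z"
}
```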
Wrap-Up
Congratulations – you have a working Bluesky custom feed generator prototype!
You set up a Quarkus service that streams Bluesky posts in real-time, filters for #Java tech content, enriches it with metadata, stores it locally, and serves a feed endpoint that Bluesky can query.
This provides the foundation for a powerful custom feed. You can extend this by refining the NLP filters, adding pagination, deploying it online, and registering the feed so others can follow it. Happy coding, and enjoy your custom Java feed on Bluesky!
The team reached out with comments around two things:
- I should use WebSockets Next, and
- I should make it more resilient.
So, I started a reactive version, if you’re interested:
https://github.com/myfear/ejq_substack_articles/tree/main/bsky-javafeed-generator-reactive
I will get it feature-complete and push out a separate post soon.