Personal project · Full-Stack · 2025
BuzzConnect
Twitter-scale feed delivery at small scale: CDC pattern, Redis sorted sets for O(log N) retrieval, eventual consistency between writes and reads — by design.
Summary
A Twitter-like social platform built as 5 independent Go microservices to explore distributed systems patterns at small scale: Change Data Capture, eventual consistency, sorted set feeds, WebSocket notification delivery. Each service owns its own database with no shared state.
The core challenge was feed delivery. A user with 10,000 followers posts a tweet. 10,000 timeline records need to be updated. Doing this synchronously in the write path makes the post endpoint slow and fragile — if any timeline update fails, the entire post fails. The CDC pattern solves this: PostgreSQL writes are the source of truth; RabbitMQ events propagate to the Timeline service asynchronously; Redis sorted sets materialise the feed. The post returns immediately; feeds update in the background.
Services: User, Tweet, Timeline, Relations, Notifications.
Architecture Decisions
Why CDC instead of synchronous fan-out
The options considered: Synchronous fan-out (write to all follower timelines in the post request), async fan-out via direct service calls, Change Data Capture pattern via message queue.
The constraint: A synchronous write to 10,000 timeline records in a single request is not acceptable — it makes post latency proportional to follower count and couples the success of the post to every individual timeline write.
The decision: CDC pattern. The Tweet service writes to its own PostgreSQL database. That write emits an event to RabbitMQ. The Timeline service consumes the event and updates Redis sorted sets for each follower's feed. The post endpoint returns as soon as the PostgreSQL write succeeds — feed delivery happens asynchronously.
The trade-off: Eventual consistency between writes and reads. A follower's timeline may lag behind the actual post by a small window. This is acceptable — Twitter exhibits the same behaviour.
What I'd change: Nothing on the core pattern. The trade-off is explicit and appropriate for this use case.
Why Redis sorted sets for feed storage
The options considered: PostgreSQL query joining tweets and follow relationships on every feed load, a materialised view in PostgreSQL, Redis sorted sets.
The constraint: Feed retrieval must be fast regardless of how many people a user follows. A SQL join across follows and tweets gets expensive as the follow graph grows.
The decision: Redis sorted sets with Unix timestamps as scores. Each user's timeline is a sorted set of tweet IDs. Retrieval is ZREVRANGE — O(log N + M), where N is the timeline size and M is the number of tweets returned — regardless of how many accounts the user follows.
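The two operations map onto Redis commands directly. The sketch below models their semantics in memory, with the corresponding ZADD/ZREVRANGE calls noted in comments; a real sorted set is a skiplist with O(log N) inserts, whereas this toy version sorts at read time purely for brevity. Type and method names are illustrative.

```go
package main

import (
	"fmt"
	"sort"
)

// entry mirrors a sorted-set member: tweet ID scored by Unix timestamp.
type entry struct {
	TweetID string
	Score   int64
}

// Feed models one user's timeline key (e.g. "timeline:<userID>").
type Feed struct{ entries []entry }

// Add mirrors: ZADD timeline:<user> <unix-ts> <tweetID>
func (f *Feed) Add(tweetID string, ts int64) {
	f.entries = append(f.entries, entry{tweetID, ts})
}

// Latest mirrors: ZREVRANGE timeline:<user> 0 <m-1>
// It returns the m newest tweet IDs, newest first.
func (f *Feed) Latest(m int) []string {
	sort.Slice(f.entries, func(i, j int) bool {
		return f.entries[i].Score > f.entries[j].Score
	})
	if m > len(f.entries) {
		m = len(f.entries)
	}
	ids := make([]string, m)
	for i := 0; i < m; i++ {
		ids[i] = f.entries[i].TweetID
	}
	return ids
}

func main() {
	var f Feed
	f.Add("t1", 1700000000)
	f.Add("t2", 1700000100)
	f.Add("t3", 1700000050)
	fmt.Println(f.Latest(2)) // [t2 t3]
}
```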
The trade-off: Feed data is duplicated — a tweet by a user with 10,000 followers exists in 10,000 Redis sorted sets. Storage cost scales with follower count, not tweet count.
What I'd change: Add a fan-out-on-read strategy for high-follower accounts. Celebrities with 1M+ followers should not trigger 1M Redis writes on post — their feed entries should be fetched and merged at read time.
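The read-time merge that change implies could look like the following. Both inputs arrive newest-first (the order ZREVRANGE returns them), so a standard two-way merge recovers the combined timeline. This is a hypothetical sketch of the proposed change, not code from the project.

```go
package main

import "fmt"

// Tweet carries a Unix-timestamp score, as in the Redis sorted sets.
type Tweet struct {
	ID    string
	Score int64
}

// MergeFeeds merges two lists that are already sorted newest-first:
// the user's materialised feed and the tweets of followed high-follower
// accounts fetched at read time. It keeps the newest m overall.
func MergeFeeds(materialised, celebrity []Tweet, m int) []Tweet {
	out := make([]Tweet, 0, m)
	i, j := 0, 0
	for len(out) < m && (i < len(materialised) || j < len(celebrity)) {
		switch {
		case i == len(materialised):
			out = append(out, celebrity[j])
			j++
		case j == len(celebrity):
			out = append(out, materialised[i])
			i++
		case materialised[i].Score >= celebrity[j].Score:
			out = append(out, materialised[i])
			i++
		default:
			out = append(out, celebrity[j])
			j++
		}
	}
	return out
}

func main() {
	regular := []Tweet{{"r2", 300}, {"r1", 100}} // from the user's own sorted set
	celeb := []Tweet{{"c1", 200}}                // fetched at read time
	for _, t := range MergeFeeds(regular, celeb, 3) {
		fmt.Print(t.ID, " ")
	}
	// r2 c1 r1
}
```

The merge is O(M) per page load, which is the price paid to avoid 1M Redis writes on every celebrity post.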
Why WebSocket for notifications instead of polling
The options considered: Client polling on an interval, server-sent events, WebSocket.
The constraint: Notification delivery (someone liked your tweet, a new follower) should feel immediate. Polling every N seconds either feels laggy or wastes requests.
The decision: The Notifications service maintains persistent WebSocket connections per authenticated user. When a notification event is consumed from RabbitMQ, it is broadcast directly to the connected client.
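A minimal sketch of the connection registry this implies, with the WebSocket behind an interface so the shape is visible without a network stack; the real service would wrap an actual WebSocket connection (e.g. gorilla/websocket). The names and the deliver-or-report-miss convention are assumptions, not the project's API.

```go
package main

import (
	"fmt"
	"sync"
)

// Conn abstracts a per-user WebSocket connection.
type Conn interface {
	Send(msg string) error
}

// Hub maps authenticated user IDs to their live connections.
type Hub struct {
	mu    sync.RWMutex
	conns map[string]Conn
}

func NewHub() *Hub { return &Hub{conns: map[string]Conn{}} }

func (h *Hub) Register(userID string, c Conn) {
	h.mu.Lock()
	defer h.mu.Unlock()
	h.conns[userID] = c
}

func (h *Hub) Unregister(userID string) {
	h.mu.Lock()
	defer h.mu.Unlock()
	delete(h.conns, userID)
}

// Notify is called once per event consumed from RabbitMQ. It returns
// false when the user is not connected or the write fails, so the
// caller can decide what to do with the undelivered notification.
func (h *Hub) Notify(userID, msg string) bool {
	h.mu.RLock()
	c, ok := h.conns[userID]
	h.mu.RUnlock()
	if !ok {
		return false
	}
	if err := c.Send(msg); err != nil {
		h.Unregister(userID) // dropped connection: clean up the registry
		return false
	}
	return true
}

// printConn is a stand-in connection for the sketch.
type printConn struct{}

func (printConn) Send(msg string) error { fmt.Println("delivered:", msg); return nil }

func main() {
	h := NewHub()
	h.Register("bob", printConn{})
	h.Notify("bob", "alice liked your tweet")      // delivered: alice liked your tweet
	fmt.Println(h.Notify("carol", "new follower")) // false: carol is not connected
}
```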
The trade-off: Persistent connections per user consume server memory. The Notifications service must handle connection drops and reconnects gracefully.
What I'd change: Add Redis pub/sub as a fan-out layer behind the WebSocket broadcast. As it stands, if the Notifications service were scaled to multiple instances, a notification event consumed on instance A could not reach a user whose WebSocket is held by instance B.