How India's largest ecom store stays consistent with 100M+ events per month

Good morning busy engineers! Flipkart is India’s largest e-commerce store, and with 100M+ orders a month, it’s safe to assume their engineering department stays busy :)

Flipkart’s event-driven system

Between order placement and delivery, a well-rehearsed dance of multiple services (shown above) takes place to ensure the order is efficiently processed, fulfilled, and delivered. So, what is their problem?

“Consistency is key”

A common problem with this architecture is that the system can become inconsistent! Let’s take a look at an example.

Adam orders yellow socks but something goes wrong :(

Here, Adam’s order doesn’t get persisted to the database (the transaction fails), but the event still gets published to the warehouse service. Repeated over millions of orders, this kind of inconsistency can lead to major problems!
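
To make the failure mode concrete, here is a minimal Python sketch of the dual-write hazard. The function names and the print-based broker are illustrative, not Flipkart’s actual code:

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, item TEXT)")

def publish(event):
    # Stand-in for a real message broker client (e.g. a Kafka producer).
    print("published:", event)

def place_order_naive(order_id, item):
    # Hazard: the broker publish is NOT covered by the database transaction.
    # If the insert/commit below fails, the warehouse service has already
    # received an event for an order that was never persisted.
    publish({"order_id": order_id, "item": item})
    db.execute("INSERT INTO orders VALUES (?, ?)", (order_id, item))
    db.commit()  # a failure here leaves the system inconsistent

place_order_naive(1, "yellow socks")
```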

Let’s see how Flipkart cleverly manages to achieve atomicity in their event-driven system!

Initial solution: Transactional Outbox Pattern 

Transactional Outbox Pattern

As shown above, this pattern has two simple components:

  • Outbox table: Service 1 performs a transaction that writes Order information to the order table and Event information to the outbox table. Upon failure, all operations within the transaction are rolled back from both tables (atomicity).

  • Relayer: A poller process that reads newly added rows from the outbox table, marks them as “read”, and publishes them through the message broker.

Since everything happens within a single database and transaction, rolling back in case of failure is very easy.
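
Here is a minimal sketch of the pattern in Python, with SQLite standing in for the primary database and a callback standing in for the message broker. Table and function names are illustrative:

```python
import json
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE orders (id INTEGER PRIMARY KEY, item TEXT);
CREATE TABLE outbox (id INTEGER PRIMARY KEY AUTOINCREMENT,
                     payload TEXT, relayed INTEGER DEFAULT 0);
""")

def place_order(order_id, item):
    # Both writes happen in one transaction on one database:
    # they either both commit or both roll back (atomicity).
    with db:
        db.execute("INSERT INTO orders VALUES (?, ?)", (order_id, item))
        db.execute("INSERT INTO outbox (payload) VALUES (?)",
                   (json.dumps({"order_id": order_id, "item": item}),))

def relay_once(publish):
    # Relayer: poll newly added rows, publish them, mark them as read.
    rows = db.execute(
        "SELECT id, payload FROM outbox WHERE relayed = 0").fetchall()
    with db:
        for row_id, payload in rows:
            publish(json.loads(payload))
            db.execute("UPDATE outbox SET relayed = 1 WHERE id = ?", (row_id,))

place_order(1, "yellow socks")
relay_once(lambda event: print("published:", event))
```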

Then, reality set in…

When the solution was scaled, the engineers at Flipkart noticed three main problems:

  1. The primary database started consuming much more disk space because of the outbox table.

  2. High availability was compromised since the Relayer was a single point of failure (a Consistency vs. Availability tradeoff, per the CAP theorem).

  3. Read and write throughput requirements increased: two additional writes on the outbox table and one costly read for all unread events need to be performed. The read queries pulled large amounts of data from the database, and due to network bandwidth limitations, they clogged the primary db’s network and increased latency.

The superior solution: Turbo Relayer

Let’s see how Flipkart addressed some of the problems above.

Solving the Disk & Network Bandwidth Problem

Problem: The massive payload for each event made the outbox table explode in size and take up a lot of primary db disk space.

Solution: To overcome this challenge, the payload and metadata of each event are stored in a separate database, and the message identifier (message_id) is then committed to the primary database within the same transaction. The Relayer publishes the events only after the transaction is committed to the primary database.

Flipkart’s Turbo Relayer

Now, the primary database has freed up disk space and network bandwidth. The throughput requirements on the primary database are also significantly reduced since the Relayer performs a single query (read) on the message_id!
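
A rough sketch of this split, again in Python with two SQLite connections standing in for the primary and outbound databases. The exact schema and ordering of writes here are assumptions based on the description above:

```python
import json
import sqlite3
import uuid

primary = sqlite3.connect(":memory:")   # stand-in for the primary db
outbound = sqlite3.connect(":memory:")  # stand-in for the messages db

primary.executescript("""
CREATE TABLE orders (id INTEGER PRIMARY KEY, item TEXT);
CREATE TABLE outbox (message_id TEXT PRIMARY KEY);  -- id only, no payload
""")
outbound.execute(
    "CREATE TABLE messages (message_id TEXT PRIMARY KEY, payload TEXT)")

def place_order(order_id, item):
    message_id = str(uuid.uuid4())
    # 1. Park the heavy payload in the outbound store first.
    with outbound:
        outbound.execute(
            "INSERT INTO messages VALUES (?, ?)",
            (message_id, json.dumps({"order_id": order_id, "item": item})))
    # 2. Commit the order and the tiny message_id atomically on the primary.
    #    If this transaction fails, no message_id exists on the primary, so
    #    the orphaned payload is simply never relayed.
    with primary:
        primary.execute("INSERT INTO orders VALUES (?, ?)", (order_id, item))
        primary.execute("INSERT INTO outbox VALUES (?)", (message_id,))

def relay_once(publish):
    # The Relayer issues one cheap read of message_ids on the primary db,
    # then fetches the payloads from the outbound store before publishing.
    for (message_id,) in primary.execute("SELECT message_id FROM outbox"):
        (payload,) = outbound.execute(
            "SELECT payload FROM messages WHERE message_id = ?",
            (message_id,)).fetchone()
        publish(json.loads(payload))

place_order(1, "yellow socks")
relay_once(lambda event: print("published:", event))
```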

Reducing Queries on Outbound DB 

Problem: Before publishing an event, the Relayer polled the db for new events, published them, and marked each event as “read” or “relayed”. This led to excessive and inefficient queries (scanning the entire table for unrelayed messages + updating the outbox table on every event publish).

Solution: Instead of using a boolean to mark an event as “read”, track progress using auto-increment IDs. The Relayer reads messages in batches over specific ID ranges (e.g., 1 to 1000, then 1001 to 2000), which eliminates the extra write query. A separate table stores checkpoints representing all the processed ranges, so that in case of failure, the Relayer remembers its past progress.
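
A simplified sketch of range-based relaying with a checkpoint table. The schema, batch size, and single-relayer setup are illustrative assumptions:

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE outbox (id INTEGER PRIMARY KEY AUTOINCREMENT, message_id TEXT);
CREATE TABLE checkpoints (range_start INTEGER, range_end INTEGER);
""")

BATCH = 1000  # e.g. ids 1 to 1000, then 1001 to 2000, ...

def relay_next_batch(publish):
    # Resume from the highest checkpointed range after a crash or restart.
    (done,) = db.execute(
        "SELECT COALESCE(MAX(range_end), 0) FROM checkpoints").fetchone()
    # Range scan on the auto-increment primary key: no "relayed" flag means
    # no per-event UPDATE on the outbox table.
    rows = db.execute(
        "SELECT id, message_id FROM outbox WHERE id > ? AND id <= ? ORDER BY id",
        (done, done + BATCH)).fetchall()
    for _, message_id in rows:
        publish(message_id)
    if rows:
        # One checkpoint row per batch replaces one write per event.
        with db:
            db.execute("INSERT INTO checkpoints VALUES (?, ?)",
                       (done + 1, rows[-1][0]))

for n in range(5):
    db.execute("INSERT INTO outbox (message_id) VALUES (?)", (f"msg-{n}",))
db.commit()
relay_next_batch(lambda m: print("published:", m))
```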

Handling Data Growth

Problem: The event data has large payloads that take up a lot of disk space and lead to an ever-growing database of append-only data.

Naive Solution: Once the events are read and relayed, their data is not needed anymore. While deleting each event row upon relay/publish is a possible solution, it adds an unnecessary write operation per event.

Image credit: Invoca Engineering blog

Ideal Solution: The ideal solution uses MySQL Range Partitions based on the auto-increment IDs. Once all ranges of data within a partition have been relayed, that whole partition can be dropped!
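
For illustration, the DDL for such a partitioned outbox might look like the following. The table, column, and partition names are hypothetical; the statements would be run with any MySQL client:

```python
# Illustrative MySQL DDL held as Python strings; execute with any MySQL driver.

CREATE_OUTBOX = """
CREATE TABLE outbox (
    id BIGINT AUTO_INCREMENT PRIMARY KEY,
    message_id VARCHAR(64) NOT NULL
)
PARTITION BY RANGE (id) (
    PARTITION p0 VALUES LESS THAN (1000001),
    PARTITION p1 VALUES LESS THAN (2000001),
    PARTITION p2 VALUES LESS THAN (3000001)
);
"""

# Once the checkpoint table shows every id in p0 (1 to 1,000,000) has been
# relayed, dropping the whole partition reclaims its disk space at once,
# which is far cheaper than deleting rows one by one.
DROP_RELAYED_PARTITION = "ALTER TABLE outbox DROP PARTITION p0;"
```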

Check out the full article here! Their article is Part 1 of an ongoing series; watch out for another newsletter on their solution to the high availability of the Relayer!

Thank you for your time! Share this with your friends if you learned something new!