- The Blueprint
- Posts
- Scaling Discord's message datastore to 1,000,000,000,000 messages
Scaling Discord's message datastore to 1,000,000,000,000 messages
GM Busy Engineers! We covered Discord’s MongoDB → Cassandra migration in a previous article. But guess what? After scaling to a trillion messages, their massive 177-node Cassandra cluster gave up. Let’s jump right in!
Credit: ScyllaDB Website
Problems With Storing 1 Trillion Messages
After scaling their Cassandra nodes from 12 to 177 in five years, they faced some major challenges that limited them in terms of scale.
Hot Partitions
As discussed in a prev. article, Discord decided to partition based on channel_id and time-based buckets. Let’s say a channel has a large amount of active users, it will likely cause many concurrent reads on a partition, which leads to a “hot partition” (similar to a Cache Stampede).
According to Discord, when a node has a hot partition, the latency in the node would increase as the node tries harder to serve traffic.
Cluster Maintenance
Some cluster maintenance ops were becoming too expensive to run. In Cassandra, a compaction is the process of merging and reorganizing on-disk data (stored in SSTables) to improve read performance.
When compactions are not performed regularly, reads are performed on unoptimized storage which translates to slower reads.
🗑️ Garbage Collection
Since Cassandra is JVM-based, it needs regular garbage collection runs. The garbage collector temporarily pauses the execution of the program to perform its memory management tasks. These pauses result in latency spikes which Discord spent a lot of time trying to reduce through tuning.
Why ScyllaDB
Cassandra is a popular choice when it comes to choosing a highly-scalable database. So how does ScyllaDB even beat Cassandra?
Better Performance: ScyllaDB surpasses Cassandra in terms of performance, delivering faster read and write operations, lower latencies, and improved throughput.
No garbage collection: Written in C++, it does not rely on garbage collection for memory management.
Shard-per-core architecture: Each CPU core handles a specific shard independently, leading to improved performance and scalability.
Credit: ScyllaDB Website
Addressing the Hot Partition Problem
Since ScyllaDB is also prone to Hot Partitions, Discord decided to build their own Data Service layer that sits between their API and DB cluster. They implemented one gRPC endpoint per db query for optimal performance. Let’s see how this layer solves Hot Partitions.
Request Coalescing
Request coalescing refers to the process of combining or merging multiple similar requests into a single request. From the image below, the worker task makes a single query to the database on behalf of multiple requests and then returns the result back to them.
Credit: Discord Blog
Consistent Hash-Based Routing
Request coalescing is only effective when many similar requests hit the same message-service together. To ensure this, Discord engineers added a consistent hash-based routing key to every request (channel_id for messages).
Thus, all read/write requests from one channel hits the same message-service (refer to image below) to allow for coalesced requests!
Credit: Discord Blog
Migration Headaches
Understanding their migration process requires a full article on its own, so stay tuned!
Happy Ending
ScyllaDB worked really well for Discord’s use case and posed significantly less problems. To celebrate this happy ending, I will end on a fun note!
Credit: Discord Blog
The above graph shows Discord’s system activity during the France V.S Argentina World Cup match. Every major spike was caused by Discord users frantically texting their friends about a major event within the match (Ex. Messi scoring, France’s comeback, etc.)