[Recap] How Grab optimized the storage of segment membership for 270 million users by using Roaring Bitmaps
Streamlining Grab's Segmentation Platform with faster creation and lower latency
Background
Launched in 2019, the Segmentation Platform is Grab's central system for user segmentation across all business verticals, enabling teams to create and retrieve user segments with attributes from our data ecosystem.
Based on their interaction patterns with the Grab superapp, Grab clusters users into a number of segments.
For example:
“New segment”: Users who recently signed up to the Grab app but haven’t taken any rides yet.
“Active segment”: Users who took rides in the past month.
Using these segments, Grab determined optimal daily and weekly communication frequency limits for each user segment, and the platform ensures a user's communications don't exceed their segment's thresholds.
Architecture
The Segmentation Platform comprises two major subsystems: Segment Creation and Segment Serving.
Segment creation is powered by Spark jobs. When a Grab team creates a segment, a Spark job starts to retrieve data from our data lake. After the data is retrieved, cleaned, and validated, the Spark job calls the serving sub-system to populate the segment with users.
Segment serving operates via a suite of Go services, with ScyllaDB chosen as the primary storage layer for its horizontal scalability and ability to meet <80ms p99 SLA. Users within a segment are stored as rows indexed by user ID, ensuring segment data is evenly spread across ScyllaDB clusters.
Segmentation Platform handles up to 12K read and 36K write QPS, with a p99 latency of 40ms.
Problems
The creation of more and larger segments caused a bottleneck in write QPS, leading to increased latency.
Grab services requested even lower latency for membership checks.
With a huge customer base of over 270 million users, storing user segment membership information has to be cost- and memory-efficient. Querying the segments to which a user belongs should also have minimal latency.
Solution
Segments as bitmaps
As a segment is stored across multiple rows in ScyllaDB, creating a large segment incurs a huge number of writes to the database. What we needed was a better way to store a large set of user IDs. Since user IDs are represented as integers in Grab's system, a natural solution to storing a set of integers was a bitmap.
For example, a segment containing the user IDs 1, 4, 24, 89, and 90 could be represented with a bitmap in which the bits at those indices are set to 1 and all other bits are 0.
To perform a membership check, a bitwise operation checks whether the bit at the user ID's index is 0 or 1, in O(1) time.
The bitmap needs 91 bits to store these 5 elements, because its size is determined by the largest ID.
However, if a segment contains just 2 user IDs, 100 and 200,000,000, it requires a bitmap of 200 million bits (25 MB) in which all but 2 of the bits are 0. The team therefore needed an encoding that handles sparse segments more efficiently.
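To make this concrete, here is a minimal sketch in Go (not Grab's actual code) of such a naive bitmap, showing the O(1) membership check and the memory cost of the sparse case above:

```go
package main

import "fmt"

// naiveBitmap stores a set of user IDs with one bit per possible ID.
type naiveBitmap []uint64

// newNaiveBitmap allocates enough 64-bit words to cover IDs 0..maxID.
func newNaiveBitmap(maxID uint32) naiveBitmap {
	return make(naiveBitmap, maxID/64+1)
}

// add sets the bit at index id.
func (b naiveBitmap) add(id uint32) {
	b[id/64] |= 1 << (id % 64)
}

// contains checks the bit at index id in O(1).
func (b naiveBitmap) contains(id uint32) bool {
	return b[id/64]&(1<<(id%64)) != 0
}

func main() {
	// Segment {1, 4, 24, 89, 90}: the largest ID (90) dictates the bitmap size.
	seg := newNaiveBitmap(90)
	for _, id := range []uint32{1, 4, 24, 89, 90} {
		seg.add(id)
	}
	fmt.Println(seg.contains(24), seg.contains(25)) // true false

	// Sparse segment {100, 200000000}: still needs ~200 million bits.
	sparse := newNaiveBitmap(200_000_000)
	sparse.add(100)
	sparse.add(200_000_000)
	fmt.Printf("sparse bitmap uses %d bytes for just 2 members\n", len(sparse)*8)
}
```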
As a bitmap, the segment can also be stored as a single Blob in object storage instead of inside ScyllaDB.
Roaring Bitmaps
We chose Roaring Bitmaps, compressed uint32 bitmaps, for better storage efficiency. They can store a segment with 1 million members in less than 1 MB, as opposed to 4 MB with naive encoding. Roaring Bitmaps optimize compression by dividing the set into fixed-size integer chunks and employing three container data structures based on the data distribution within each chunk.
Example
Every element inserted into a Roaring Bitmap is first converted to its 32-bit binary representation. For example:
70000 => 0000 0000 0000 0001 0001 0001 0111 0000
Key: 0000 0000 0000 0001 = 1
Value: 0001 0001 0111 0000 = 4464 (70000 = 1 × 65536 + 4464)
The 16 Most Significant Bits (MSB) are used as the key that selects a chunk, and the remaining 16 Least Significant Bits (LSB) are stored as the value inside that chunk's container.
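As a small illustration (a sketch, not library code), the split can be written in Go as:

```go
package main

import "fmt"

// splitRoaringKey splits a 32-bit value into the 16-bit chunk key (MSB)
// and the 16-bit value stored inside that chunk's container (LSB).
func splitRoaringKey(x uint32) (key, value uint16) {
	return uint16(x >> 16), uint16(x & 0xFFFF)
}

func main() {
	key, value := splitRoaringKey(70000)
	fmt.Println(key, value) // 1 4464, i.e. 70000 = 1*65536 + 4464
}
```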
There are three types of container:
Array container: used when a chunk holds at most 4,096 values; they are stored as a sorted array of 16-bit integers, giving O(log N) access via binary search.
Bitmap container: used when a chunk is dense (more than 4,096 values); it is a 2^16-bit bitmap, giving O(1) access.
Run Length Encoding (RLE) container: used when a chunk contains long runs of consecutive values.
Example: the set [0, 1, 2, 3, 4, ..., 90], which would have required 91 bits, can be compressed into a run container by storing only the start (0) and the run length (90).
This is an example of using a Roaring Bitmap with the set [5, 8, 13, 100, 70000...80000, 500320...504320, 505000...507000]: the scattered values (5, 8, 13, 100) fall into an array container, while the long consecutive ranges can be stored compactly as run containers.
By using different containers, Roaring Bitmaps are able to achieve good compression across various data distributions, while maintaining excellent lookup performance.
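As an illustration, the example set above could be built and queried with the open-source RoaringBitmap Go library (github.com/RoaringBitmap/roaring); this is a sketch for illustration, not Grab's actual code:

```go
package main

import (
	"fmt"

	"github.com/RoaringBitmap/roaring"
)

func main() {
	// Build the example set: a few scattered values plus long consecutive ranges.
	segment := roaring.BitmapOf(5, 8, 13, 100)
	segment.AddRange(70000, 80001)   // 70000...80000 (AddRange end is exclusive)
	segment.AddRange(500320, 504321) // 500320...504320
	segment.AddRange(505000, 507001) // 505000...507000

	// Convert dense, consecutive chunks to run containers where beneficial.
	segment.RunOptimize()

	// Fast membership checks.
	fmt.Println(segment.Contains(13))    // true
	fmt.Println(segment.Contains(75000)) // true
	fmt.Println(segment.Contains(14))    // false

	// The whole segment can be serialized into a single blob for object storage.
	fmt.Printf("cardinality=%d, serialized size=%d bytes\n",
		segment.GetCardinality(), segment.GetSerializedSizeInBytes())
}
```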
Service teams are able to perform set operations (union, intersection, difference, etc.) on segments on the fly instead of re-materialising the combined segment into the database.
Example: quickly finding the users present in both of two segments by using a bitwise AND, as shown in the sketch below.
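A minimal sketch with the same Go library, using two hypothetical segments:

```go
package main

import (
	"fmt"

	"github.com/RoaringBitmap/roaring"
)

func main() {
	// Two hypothetical segments already loaded as Roaring Bitmaps.
	activeUsers := roaring.BitmapOf(1, 4, 24, 89, 90)
	newDevice := roaring.BitmapOf(4, 90, 200)

	// Users present in both segments: a bitwise AND computed in memory,
	// with no need to re-materialise the combined segment in the database.
	both := roaring.And(activeUsers, newDevice)
	fmt.Println(both.ToArray()) // [4 90]
}
```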
Caching with an SDK
The SDK manages segment retrieval, decoding, caching, and watching for updates.
The SDK provides a cache with a least-recently-used eviction policy to ensure that hot segments are kept in the cache.
The SDK can watch for updates on a segment and automatically refreshes the cached copy when the segment changes.
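The post doesn't show SDK code, but a minimal sketch of the caching idea might look like the following, using hashicorp/golang-lru for LRU eviction; the type, function names, and cache size here are hypothetical, not Grab's actual API:

```go
package main

import (
	"fmt"

	"github.com/RoaringBitmap/roaring"
	lru "github.com/hashicorp/golang-lru"
)

// segmentCache keeps hot segments in memory with LRU eviction (hypothetical sketch).
type segmentCache struct {
	cache *lru.Cache
	fetch func(name string) (*roaring.Bitmap, error) // loads a segment blob, e.g. from object storage
}

func newSegmentCache(size int, fetch func(string) (*roaring.Bitmap, error)) (*segmentCache, error) {
	c, err := lru.New(size)
	if err != nil {
		return nil, err
	}
	return &segmentCache{cache: c, fetch: fetch}, nil
}

// IsMember checks whether userID belongs to the named segment,
// fetching and caching the segment bitmap on a cache miss.
func (s *segmentCache) IsMember(segment string, userID uint32) (bool, error) {
	if v, ok := s.cache.Get(segment); ok {
		return v.(*roaring.Bitmap).Contains(userID), nil
	}
	bm, err := s.fetch(segment)
	if err != nil {
		return false, err
	}
	s.cache.Add(segment, bm)
	return bm.Contains(userID), nil
}

// Invalidate would be called when the SDK's watcher sees a segment update,
// so the next lookup re-fetches the fresh bitmap.
func (s *segmentCache) Invalidate(segment string) {
	s.cache.Remove(segment)
}

func main() {
	sc, _ := newSegmentCache(128, func(name string) (*roaring.Bitmap, error) {
		// Hypothetical loader; in practice this would read the segment blob from storage.
		return roaring.BitmapOf(1, 4, 24, 89, 90), nil
	})
	ok, _ := sc.IsMember("active-users", 24)
	fmt.Println(ok) // true
}
```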
Conclusion
This post covered the scaling challenges faced by the Segmentation Platform and our solutions: alternative storage and encoding techniques that enable faster segment creation and low-latency reads.
The SDK simplifies segment use for our teams, managing caching, eviction, and updates. Future plans include further scalability work and optimization, as some use cases still depend on ScyllaDB-backed segments as opposed to Roaring Bitmaps.