I still remember the day the Mercury project kicked off. Clevertap was costing us millions, and the plan was to replace it entirely with an in-house system. Ambitious? Definitely. But it became one of the most defining chapters of my time at Gojek. From writing ingestion pipelines in Go to building segmentation from scratch and re-architecting analytics platforms, I got my hands dirty and learned more than I ever expected.
Chapter 0: The start of the Mercury Project
Gojek used Clevertap for campaign management. Clevertap captured clickstream events, and that data powered targeted campaigns and clickstream analytics. It also cost the company X millions of dollars annually, which is what set the Mercury project in motion: a project to replace Clevertap with an in-house solution. The engineering work on Mercury is where I was involved during this part of my journey at Gojek.
Chapter 1: Writing the ingestion system
At the beginning of the Mercury project, we needed an ingestion system for clickstream data, and it had to exist before the other parts of the system could be developed. My role was to write a Go service that accepts WebSocket connections and forwards the data to Kafka. My manager had already written the technical spec as an RFC; my job was to implement it. This work sharpened my expertise in Go. At the time of writing, the system is ingesting petabytes of data daily per data partition. It has also been extended to accept protocols other than WebSocket.
To align with the team initiative at the time, we also open-sourced this ingestion system: https://github.com/goto/raccoon
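As a rough illustration of the "accept messages, forward them to Kafka" loop, here is a minimal sketch. It is not the actual Raccoon code: the `MessageReader` and `Publisher` interfaces, the `Forward` function, and the in-memory fakes are all hypothetical names made up for this example. They stand in for a WebSocket connection and a Kafka producer so the sketch runs with the standard library only.

```go
package main

import (
	"errors"
	"fmt"
	"io"
)

// MessageReader is a hypothetical stand-in for a WebSocket connection.
type MessageReader interface {
	ReadMessage() ([]byte, error)
}

// Publisher is a hypothetical stand-in for a Kafka producer.
type Publisher interface {
	Publish(topic string, payload []byte) error
}

// Forward reads messages from conn until it reports io.EOF and publishes
// each one to topic. It returns the number of messages forwarded.
func Forward(conn MessageReader, pub Publisher, topic string) (int, error) {
	n := 0
	for {
		msg, err := conn.ReadMessage()
		if errors.Is(err, io.EOF) {
			return n, nil // connection closed cleanly
		}
		if err != nil {
			return n, fmt.Errorf("read: %w", err)
		}
		if err := pub.Publish(topic, msg); err != nil {
			return n, fmt.Errorf("publish: %w", err)
		}
		n++
	}
}

// sliceReader is an in-memory fake connection that replays fixed messages.
type sliceReader struct{ msgs [][]byte }

func (r *sliceReader) ReadMessage() ([]byte, error) {
	if len(r.msgs) == 0 {
		return nil, io.EOF
	}
	m := r.msgs[0]
	r.msgs = r.msgs[1:]
	return m, nil
}

// memPublisher is an in-memory fake producer that records payloads per topic.
type memPublisher struct{ got map[string][][]byte }

func (p *memPublisher) Publish(topic string, payload []byte) error {
	p.got[topic] = append(p.got[topic], payload)
	return nil
}

func main() {
	conn := &sliceReader{msgs: [][]byte{[]byte("click"), []byte("view")}}
	pub := &memPublisher{got: map[string][][]byte{}}
	n, err := Forward(conn, pub, "clickstream")
	fmt.Println(n, err) // prints: 2 <nil>
}
```

Keeping the connection and the producer behind small interfaces is one way to make room for additional protocols later, since only the reader side has to change.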
Chapter 2: Developing a rule-based segmentation system from scratch
After the ingestion system was done, we moved on to building the campaign management platform. It’s a big platform that covers campaign management, comms channel integrations, real-time campaigns, and personalization. The development was broken down into 3 sub-systems. I worked on the segmentation sub-system.
Segmentation is a rule-based segment creation system. It lets marketers get a list of users who meet a set of criteria. The criteria are encoded in a rule that consists of user behaviors (past actions) and user properties, and the system prepares the list of users accordingly. It uses BigQuery to prepare the user list: segmentation translates the rule into BigQuery job executions and manages those executions reliably. For this project, I led a team of 3, including myself. Segmentation has since turned into a core system that serves more and more use cases. Its extensibility and reliability enabled it to grow beyond the initial Mercury use case.
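To give a feel for what "translating a rule into a BigQuery job" can look like, here is a minimal sketch that renders a rule as a SQL string. It is an illustration under assumptions, not the real segmentation code: the `Rule`, `Behavior`, and `Property` types, the `BuildQuery` function, the AND semantics between conditions, and the `users` / `events` tables with their column names are all hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// Behavior is a hypothetical condition: the user performed Event at least
// MinCount times within the last WithinDays days.
type Behavior struct {
	Event      string
	MinCount   int
	WithinDays int
}

// Property is a hypothetical condition: a user attribute equals Value.
type Property struct {
	Field string
	Value string
}

// Rule combines property and behavior conditions. All conditions must hold
// (AND semantics are an assumption of this sketch).
type Rule struct {
	Properties []Property
	Behaviors  []Behavior
}

// BuildQuery renders a Rule as a SQL string. Table and column names are
// illustrative. Values are interpolated directly to keep the sketch short;
// a real system should validate identifiers and use query parameters.
func BuildQuery(r Rule) string {
	var conds []string
	for _, p := range r.Properties {
		conds = append(conds, fmt.Sprintf("u.%s = '%s'", p.Field, p.Value))
	}
	for _, b := range r.Behaviors {
		conds = append(conds, fmt.Sprintf(
			"(SELECT COUNT(*) FROM events e WHERE e.user_id = u.user_id "+
				"AND e.event_name = '%s' "+
				"AND e.event_timestamp >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL %d DAY)) >= %d",
			b.Event, b.WithinDays, b.MinCount))
	}
	q := "SELECT u.user_id FROM users u"
	if len(conds) > 0 {
		q += " WHERE " + strings.Join(conds, " AND ")
	}
	return q
}

func main() {
	// "Users in Jakarta who completed at least 2 orders in the last 30 days."
	rule := Rule{
		Properties: []Property{{Field: "city", Value: "Jakarta"}},
		Behaviors:  []Behavior{{Event: "order_completed", MinCount: 2, WithinDays: 30}},
	}
	fmt.Println(BuildQuery(rule))
}
```

The resulting string would then be submitted as a BigQuery job. Tracking, retrying, and recovering those jobs is the "manages those executions reliably" part, which this sketch leaves out.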
Chapter 3: Fixing the reliability of the analytics product
Another Clevertap capability that was rebuilt in-house was analytics. The analytics development was offloaded to a different team. When Clevertap was decommissioned, ownership of the analytics product moved to my team. The product offers self-serve analytics on top of clickstream data. However, it was in a bad state: almost none of the features were reliable, and the support channel was full of people reporting issues. I re-architected the backend of the analytics product, applying advanced data engineering techniques that I had learned at work. Complaints about the product dropped to zero.
Chapter 4: Experiment analytics reverse engineering
Gojek had been using Eppo to automate experiment result calculations. I worked with the experimentation platform team to build an in-house experiment platform. We reverse-engineered the existing Eppo product and designed our own architecture. My involvement revolved around designing the system and overseeing the project to completion.