How Swiggy Manages the Chaos of 1.5 Million Alerts Every Day
Analytics India Magazine (Ankush Das)
Food-ordering platforms are popular across much of India, and the technology behind them keeps evolving as businesses try to keep up with customer demand.
Whether it is 10-minute delivery or conventional food delivery, companies keep upgrading their platforms as they scale, as with Zepto’s shift to MongoDB and the move to rely on a combination of technologies.
Klaxon is Swiggy’s in-house alerting platform, designed not just to monitor but also to respond. It sends over 1.5 million alerts each day, helping teams identify, understand, and act on issues that would otherwise disrupt operations or annoy customers.
A platform like Klaxon helps make sense of the massive volume of signals the company has to track throughout the day, such as a delayed Instamart order.
Monitoring The Interactions With Klaxon
Swiggy described the platform in detail on its tech blog. Klaxon didn’t start as a generic alerting tool; it was purpose-built to reflect Swiggy’s real-time needs. Built on complex event processing (CEP), it pulls in data from platforms like Hive, Snowflake, and Kafka. These streams are filtered, correlated, and transformed into actionable alerts, sometimes in real time and sometimes in batches, depending on the nature of the anomaly.
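To make the filter, correlate, and transform steps concrete, here is a minimal Python sketch of one CEP-style rule: raise an alert when several delay events for the same store arrive within a short window. The event fields, the threshold, the window length, and the `DelayCorrelator` class are all hypothetical illustrations and not Swiggy's implementation. The sketch also assumes events arrive in timestamp order.

```python
from collections import defaultdict, deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    store_id: str     # hypothetical field: store the order belongs to
    kind: str         # hypothetical field: e.g. "order_delayed"
    timestamp: float  # seconds since an arbitrary epoch


class DelayCorrelator:
    """Fires when `threshold` delay events hit one store within `window_s` seconds."""

    def __init__(self, threshold: int = 3, window_s: float = 300.0):
        self.threshold = threshold
        self.window_s = window_s
        self._recent = defaultdict(deque)  # store_id -> timestamps of recent delays

    def process(self, event: Event):
        # Filter: ignore everything that is not a delay event.
        if event.kind != "order_delayed":
            return None
        times = self._recent[event.store_id]
        times.append(event.timestamp)
        # Correlate: drop timestamps that have fallen out of the sliding window.
        while times and event.timestamp - times[0] > self.window_s:
            times.popleft()
        # Transform: turn the correlated pattern into an alert message.
        if len(times) >= self.threshold:
            times.clear()  # reset so one burst does not alert repeatedly
            return (
                f"Store {event.store_id}: {self.threshold} delayed orders "
                f"in {int(self.window_s)}s"
            )
        return None
```

Feeding the correlator one event at a time mirrors the real-time path; a batch variant could run the same rule over rows pulled from a warehouse table.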
“On a regular day, Klaxon sends out more than 1.5 million alerts, playing a critical role in keeping teams informed of important metrics and potential issues in real-time,” the blog read.
Users across teams can create alerts with custom trigger conditions, define recipients, test logic before launch, and track delivery performance. Alerts are routed through a multi-channel setup: email, Slack, SMS, app notifications, and even direct Freshdesk ticket creation for the customer care team.
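As a rough illustration of what a user-defined alert might look like, the sketch below models an alert as a trigger condition plus recipients and channels, with a dry run that checks the logic against sample events without sending anything. The `AlertDefinition` class, its field names, and the sample data are assumptions made for this example; the blog does not describe Klaxon's actual configuration schema.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class AlertDefinition:
    name: str
    condition: Callable[[Dict], bool]  # custom trigger condition
    recipients: List[str]
    channels: List[str]                # e.g. "slack", "email", "sms"

    def dry_run(self, sample_events: List[Dict]) -> List[Dict]:
        """Return the sample events that would trigger, without sending anything."""
        return [e for e in sample_events if self.condition(e)]


# Hypothetical alert: Instamart orders delayed by more than 10 minutes.
late_instamart = AlertDefinition(
    name="instamart_order_delay",
    condition=lambda e: e.get("vertical") == "instamart" and e.get("delay_min", 0) > 10,
    recipients=["city-ops@example.com"],
    channels=["slack", "email"],
)

samples = [
    {"vertical": "instamart", "delay_min": 15},
    {"vertical": "instamart", "delay_min": 4},
    {"vertical": "food", "delay_min": 30},
]
matched = late_instamart.dry_run(samples)  # only the first sample matches
```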
When an event occurs, the Klaxon service picks up the signal, pushes it into Kafka, and hands it over to the Armstrong database pipeline. A streaming job listens, translates the event into an alert, and then routes it based on user configuration. Whether it’s a Slack ping to a city ops manager or an SMS to a delivery executive mid-route, the alert is always contextual and timed for impact.
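The flow described above (pick up an event, queue it, translate it into an alert, route it by configuration) can be sketched in a few lines of Python. This is a simplified stand-in under stated assumptions: an in-memory `queue.Queue` plays the role of the Kafka topic, and the `ROUTING` table, event fields, and function names are hypothetical rather than taken from Swiggy's code.

```python
import queue

# Hypothetical per-alert routing configuration, as a user might define it.
ROUTING = {
    "order_delayed": [("slack", "#city-ops-blr")],
    "fragile_item": [("sms", "delivery-exec-123"), ("push", "delivery-exec-123")],
}


def translate(event: dict):
    """Turn a raw event into an alert dict, or None if no routing is configured."""
    if event["type"] not in ROUTING:
        return None
    return {"type": event["type"], "text": f"{event['type']} for order {event['order_id']}"}


def route(alert: dict, send) -> int:
    """Call `send(channel, target, text)` for each configured route; return the count."""
    routes = ROUTING[alert["type"]]
    for channel, target in routes:
        send(channel, target, alert["text"])
    return len(routes)


def run_once(topic: queue.Queue, send) -> int:
    """Drain the stand-in topic, translating and routing each event."""
    sent = 0
    while not topic.empty():
        alert = translate(topic.get())
        if alert is not None:
            sent += route(alert, send)
    return sent


topic = queue.Queue()  # in-memory stand-in for a Kafka topic
topic.put({"type": "order_delayed", "order_id": "A1"})
topic.put({"type": "fragile_item", "order_id": "B2"})
topic.put({"type": "heartbeat", "order_id": "-"})  # no route configured, so dropped

outbox = []
total = run_once(topic, lambda ch, to, text: outbox.append((ch, to, text)))
```

Passing `send` in as a function keeps the routing logic independent of any particular channel client, which makes it easy to test with a plain list as the outbox.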
Swiggy trimmed down Kafka jobs from 30 to eight and introduced a lean, on-demand Databricks setup for batch alerts. This reduced operational costs by 50–60% while maintaining responsiveness and precision.
Klaxon’s Role in Boosting the Team’s Efficiency
Real-time alerting is only effective if it results in real action. That, Swiggy claims, is where Klaxon excels. For delivery executives, push notifications now help avoid damage to fragile items mid-delivery. SMS alerts for cake orders have drastically reduced mishandling complaints, proving that even seemingly small nudges can shift outcomes at scale.
For operations teams, Klaxon provides live insights into order delays, allowing them to reassign delivery partners or reroute orders in real time. Delay alerts from Instamart have helped tighten fulfilment windows and directly impacted customer satisfaction scores. For premium users, Klaxon ensures that every poorly rated order gets attention from support teams within minutes.
The platform also taps into social signals. When customers raise complaints, Klaxon filters for key terms and reroutes complaints to specialist agents, resulting in faster responses and fewer escalations.
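A keyword-based router of this kind could look roughly like the Python sketch below. The keywords, queue names, and first-match priority order are invented for illustration; the source does not say which terms Klaxon filters for or how conflicts are resolved.

```python
import re

# Hypothetical keyword-to-queue mapping, checked in priority order.
KEYWORD_QUEUES = [
    ("refund", "payments_team"),
    ("spilled", "food_quality_team"),
    ("damaged", "food_quality_team"),
    ("rude", "partner_conduct_team"),
]


def route_complaint(text: str, default: str = "general_support") -> str:
    """Return the first specialist queue whose keyword appears as a whole word."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    for keyword, team in KEYWORD_QUEUES:
        if keyword in words:
            return team
    return default
```

Matching on whole words rather than substrings avoids false hits such as "rude" inside "intruder", at the cost of missing inflected forms like "refunds".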
Klaxon’s user profile functionality allows teams to manage all their alerts in one place. Email alerts now support acknowledgement tracking, with open and click-through rates visible for every message. Each alert comes with a performance dashboard, enabling continuous iteration and improvement.
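For readers unfamiliar with the metrics, the sketch below shows one common way to compute open and click-through rates per alert. It assumes both rates are measured against delivered messages; some teams instead measure clicks against opens, and the source does not say which definition Klaxon uses. The `EmailAlertStats` class and its field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class EmailAlertStats:
    alert_name: str
    delivered: int
    opened: int
    clicked: int

    @property
    def open_rate(self) -> float:
        # Assumption: opens divided by delivered messages.
        return self.opened / self.delivered if self.delivered else 0.0

    @property
    def click_through_rate(self) -> float:
        # Assumption: clicks divided by delivered messages (not by opens).
        return self.clicked / self.delivered if self.delivered else 0.0


stats = EmailAlertStats("instamart_order_delay", delivered=200, opened=50, clicked=10)
```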
“We’ve made Klaxon more efficient by reducing the complexity of our deployment infrastructure. By deprecating two redundant deployment units (Swiss-klaxon-ui and swiss-klaxon-preprod-ui), we’ve streamlined operations, leading to faster development cycles and a more maintainable platform,” the blog post stated.
Swiggy’s Monitoring Platform Can Make or Break The Entire Platform
Klaxon isn’t a tool; it’s an operating layer. It allows Swiggy to move fast, course-correct even faster and ensure that no anomaly goes unnoticed. With over 300 live alerts and counting, Klaxon has become deeply embedded across the company, from tech teams to city operations, from customer care to business leadership.
Looking ahead, Swiggy plans to simplify Klaxon’s architecture further by unifying environments and building lightweight mechanisms to measure alert impact. Teams will soon be able to identify and disable low-utility alerts, focusing only on what truly matters.
For now, Klaxon’s job is simple: detect, inform, and guide action. In a business where minutes matter and customer expectations are sky-high, this monitoring tool ensures that Swiggy stays ahead of chaos. Because sometimes, managing millions of moving parts starts with just one timely alert.
The company may consider using AI in some form in the near future, but this is unclear at the moment.
The post How Swiggy Manages the Chaos of 1.5 Million Alerts Every Day appeared first on Analytics India Magazine.