Shaadi.com
The SRE team struggled with a daily barrage of over 400 alerts from disparate infrastructure and business units, creating severe alert fatigue in which critical incidents were frequently obscured by low-priority noise. Manual monitoring and uncoordinated routing caused avoidable delays, as alerts were often broadcast to anyone available rather than to specific experts.
By adopting Xurrent IMR, Shaadi.com centralized their monitoring, aggregating metrics from AWS and business sources into a single pane of glass. The platform’s intelligent routing and automated escalation policies now ensure that critical alerts are instantly directed to the correct subject matter experts, bypassing general channels.
Shaadi.com Improves Mean Time to Acknowledge (MTTA) by 80% with Xurrent IMR
People Interactive operates Shaadi.com, the world's largest matrimonial service, playing a critical role in the lives of over 35 million users globally. In the sensitive business of finding love and lifelong partnership, platform reliability is not just a technical requirement; it is a promise of trust. Users expect a seamless, uninterrupted experience, meaning the underlying infrastructure must operate with near-perfect uptime.
However, maintaining this level of reliability became increasingly difficult as the platform scaled. The Site Reliability Engineering (SRE) team found themselves battling significant "alert chaos," struggling to manage an onslaught of over 400 alerts generated daily. These alerts flooded in from various business units and complex AWS infrastructure components, creating a wall of noise that made it nearly impossible to distinguish critical incidents from routine notifications.
The operational workflow suffered from severe bottlenecks due to manual monitoring and uncoordinated routing. Alerts were often broadcast to "anyone available" rather than specific subject matter experts, leading to confusion and slower response times. Critical issues were frequently buried under low-priority noise, and without a centralized system, the team lacked the visibility needed to track incident progress or perform effective root cause analysis.
To combat this alert fatigue and ensure top-tier reliability, Shaadi.com turned to Xurrent IMR to centralize and automate their incident management. The implementation transformed their reactive operations into a proactive, data-driven environment. By aggregating alerts from all sources into a single pane of glass, the team finally achieved the visibility required to manage their complex infrastructure.
Key operational improvements driven by Xurrent IMR included:
- Intelligent Alert Routing: Moving away from "broadcast" notifications, the system now ensures critical alerts reach the specific subject matter experts immediately, eliminating unnecessary distractions for the rest of the team.
- Automated Escalation Policies: To prevent burnout and missed alerts, the team implemented rules that automatically escalate issues to senior personnel if an incident exceeds set time thresholds.
- Data-Driven Post-Mortems: The dashboard provides granular insights into incident trends, allowing the team to identify recurring issues and implement long-term preventive measures rather than just quick fixes.
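The routing and escalation ideas in the list above can be illustrated with a minimal sketch. This is not Xurrent IMR's API or Shaadi.com's actual configuration; the source names, team names, and time thresholds below are all hypothetical, chosen only to show how source-based routing and time-based escalation might fit together.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

# Hypothetical mapping from alert source to the owning expert group.
ROUTES = {
    "aws-rds": "database-oncall",
    "aws-ec2": "infrastructure-oncall",
    "payments": "payments-oncall",
}
DEFAULT_TEAM = "sre-triage"  # hypothetical fallback for unmapped sources

# Hypothetical escalation chain: (time unacknowledged, role to notify).
ESCALATION_STEPS = [
    (timedelta(minutes=0), "primary"),
    (timedelta(minutes=10), "secondary"),
    (timedelta(minutes=30), "engineering-manager"),
]


@dataclass
class Alert:
    source: str
    severity: str
    triggered_at: datetime
    acknowledged_at: Optional[datetime] = None


def route(alert: Alert) -> str:
    """Send the alert to its owning team instead of broadcasting it."""
    return ROUTES.get(alert.source, DEFAULT_TEAM)


def escalation_level(alert: Alert, now: datetime) -> Optional[str]:
    """Return the most senior role due a notification, or None once acknowledged."""
    if alert.acknowledged_at is not None:
        return None
    elapsed = now - alert.triggered_at
    level = None
    # Steps are ordered by threshold, so the last match is the most senior.
    for threshold, role in ESCALATION_STEPS:
        if elapsed >= threshold:
            level = role
    return level
```

In this sketch, an unacknowledged alert moves up the chain as time passes, and acknowledging it stops further escalation.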
The impact of this transformation was immediate and measurable. By streamlining the workflow and ensuring the right alerts reach the right people at the right time, Shaadi.com achieved a massive 80% improvement in Mean Time to Acknowledge (MTTA).
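For readers unfamiliar with the metric, MTTA is commonly computed as the average time between an alert firing and a responder acknowledging it. The sketch below shows that calculation, assuming "improvement" means a reduction in that average. The function names and the sample durations are hypothetical illustrations, not Shaadi.com's measured values.

```python
from datetime import datetime, timedelta
from statistics import mean


def mtta_seconds(incidents):
    """Mean time to acknowledge, in seconds.

    incidents: iterable of (triggered_at, acknowledged_at) datetime pairs.
    """
    return mean((ack - trig).total_seconds() for trig, ack in incidents)


def improvement_pct(before, after):
    """Percentage reduction from a baseline MTTA to a new MTTA."""
    return (before - after) * 100 / before


# Hypothetical example: two alerts acknowledged after 8 and 12 minutes.
t0 = datetime(2024, 1, 1, 9, 0)
baseline = mtta_seconds([
    (t0, t0 + timedelta(minutes=8)),
    (t0, t0 + timedelta(minutes=12)),
])
```

Under these made-up numbers the baseline is 600 seconds, so an 80% improvement would correspond to a new MTTA of 120 seconds.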
Today, the SRE team at Shaadi.com operates with a focus on strategic reliability rather than firefighting. The shift to Xurrent IMR has not only improved technical metrics but has also enhanced the overall user experience for millions of people worldwide, ensuring that the search for a life partner is never interrupted by avoidable downtime.

