Swiggy Mukesh Kabra May 23, 2023

How We Improved Testing Processes Using Ephemeral Environment

Article Summary

Swiggy was missing production deadlines because their UAT environments were constantly unstable or unavailable. Sound familiar?

The Swiggy engineering team solved their testing bottleneck by ditching static UAT environments for ephemeral ones. They built EaaS (Ephemeral Environment as a Service) to spin up isolated, on-demand testing environments that live only as long as needed.

Key Takeaways

Critical Insight

Swiggy eliminated testing environment bottlenecks and reduced costs by replacing shared static UAT environments with automated, isolated ephemeral environments.

The article hints at two upcoming tools (Shuttle and QGP) that layer on top of EaaS to further streamline their infrastructure and quality control.

About This Article

Problem

Swiggy's UAT environments suffered from data corruption, and teams competed for the same shared resources. Environments that should have been temporary kept running indefinitely because maximum TTL (time-to-live) policies weren't enforced, which made it impossible to run feature tests in parallel.

Solution

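Mukesh Kabra's team introduced Ephemeral Environment Units, defined as EEU = Max(1, TotalService/30, (Cost/hour)/10), to distribute resources fairly across teams, and enforced TTL limits so environments wouldn't pile up. Under this formula every environment counts as at least one unit, and it counts for more as its service count or hourly cost grows.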
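To make the EEU formula concrete, here is a minimal Python sketch of the calculation. The function name, parameter names, and sample inputs are hypothetical and chosen only for illustration. The summary does not state the currency for cost per hour or whether the result is rounded, so the sketch leaves the value unrounded and treats cost as a plain number.

```python
def ephemeral_environment_units(total_services: int, cost_per_hour: float) -> float:
    """Compute EEU = max(1, total_services / 30, cost_per_hour / 10).

    Hypothetical helper: the names are illustrative, and the cost unit is
    whatever unit the formula's "(Cost/hour)" term uses (not stated in the source).
    """
    if total_services < 0 or cost_per_hour < 0:
        raise ValueError("total_services and cost_per_hour must be non-negative")
    # The floor of 1 means even a tiny environment consumes one unit.
    return max(1.0, total_services / 30, cost_per_hour / 10)


if __name__ == "__main__":
    # Small environment: both ratios are below 1, so the floor applies.
    print(ephemeral_environment_units(12, 4.0))   # 1.0
    # Many services: the service-count term dominates (90 / 30).
    print(ephemeral_environment_units(90, 5.0))   # 3.0
    # Expensive environment: the cost term dominates (45 / 10).
    print(ephemeral_environment_units(30, 45.0))  # 4.5
```

Taking the maximum of the three terms means an environment is charged for whichever dimension is largest, its service count or its hourly cost, and never for less than one unit.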

Impact

The on-demand provisioning model worked as expected, with consistent daily environment creation. Weekend infrastructure costs dropped noticeably compared to weekdays because teams only paid for what they actually used.