Hey everyone, we’re Arjun and Anirudh, founders of Signadot (https://www.signadot.com/). Signadot is a Kubernetes-based solution that enables lightweight environments, called Sandboxes, to test microservices early in the development lifecycle.
Before founding Signadot, I managed engineering teams building microservices at different companies. As the number of services and external dependencies (e.g., databases, message queues, third-party APIs, etc.) increased, testing became challenging. At a smaller scale, we could stand up our “application in a box”, but as the complexity increased, we relied on a pre-production (staging) environment to do a lot of our testing. However, since the staging environment was shared by many teams for testing, it became a bottleneck. When issues were discovered on staging, the root-cause analysis took a long time.
We talked to many companies about how they were testing their microservices, especially once they started to grow beyond ~20 engineers. We encountered various solutions, ranging from setting up multiple (expensive) staging environments to having each team take turns “locking” the staging environment. At larger companies, like Uber and Lyft, we learned that they had built their own highly scalable (but bespoke) solutions for testing microservices based on dynamic traffic routing. With this kind of system, environments can spin up quickly and are cost-effective at scale. We wanted to build a similar system that was more generalized and make it available to everyone running on Kubernetes.
The intuition behind Sandboxes is that each test environment only has a few microservices under test, and all other dependencies can be fulfilled from a shared pool of microservices running the latest version, called the *baseline*. Starting with a staging Kubernetes cluster running up-to-date stable versions of microservices, sandbox environments are described in terms of what is modified with respect to the baseline. This is similar to the copy-on-write model of resource management. Once a sandbox environment is set up, tests can be run against it and requests get routed to the versions of microservices under test.
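To make the copy-on-write idea concrete, here is a minimal Python sketch. It is only an illustration of the model described above, not Signadot's actual implementation or API: the `Sandbox` class, the `resolve` method, and the service names and image tags are all hypothetical.

```python
from dataclasses import dataclass, field

# Hypothetical baseline: service name -> version running in the staging cluster.
BASELINE = {
    "frontend": "frontend:v12",
    "orders": "orders:v7",
    "payments": "payments:v3",
}


@dataclass
class Sandbox:
    """A sandbox stores only what differs from the baseline (hypothetical model)."""

    name: str
    overrides: dict = field(default_factory=dict)

    def resolve(self, service: str) -> str:
        # Use the sandboxed version if this service is under test;
        # otherwise fall back to the shared baseline version.
        return self.overrides.get(service, BASELINE[service])


# A sandbox that modifies a single service; everything else is shared.
sandbox = Sandbox("pr-123", overrides={"orders": "orders:pr-123"})
```

In this sketch, `sandbox.resolve("orders")` picks the version under test, while `sandbox.resolve("payments")` falls through to the baseline, so creating a sandbox costs only as much as the services it changes.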
An important consideration is isolation between different sandbox environments. For isolating test requests, we use traffic labeling and request routing. With traffic labeling, a tenant ID is set as an L7 header on each HTTP/gRPC request, and based on the value of this header, requests follow different paths through the microservices. Request routing is realized by working with a service mesh (like Istio) if one is already installed, or by using our sidecar proxies. For tests that require data isolation, we built a pluggable framework called Resources (https://docs.signadot.com/docs/sandbox-resources) that can set up an ephemeral stateful resource (Kafka topic, database schema, etc.) on the fly and tie it to the sandbox lifecycle. Resources can also be used to test when there is async communication across services.
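The header-based routing described above can be sketched roughly as follows, again in Python. This is a simplified illustration under stated assumptions, not how Istio or our sidecars are actually configured: the header name `x-tenant-id`, the endpoint addresses, and the `route` and `propagate` helpers are hypothetical. It also assumes each service copies the tenant header onto its outgoing calls so the label survives across hops.

```python
from typing import Optional

TENANT_HEADER = "x-tenant-id"  # hypothetical header name

# Hypothetical routing tables: baseline endpoints plus per-tenant overrides.
BASELINE_ENDPOINTS = {
    "orders": "orders.staging.svc:8080",
    "payments": "payments.staging.svc:8080",
}
SANDBOX_ENDPOINTS = {
    ("pr-123", "orders"): "orders-pr-123.staging.svc:8080",
}


def tenant_of(headers: dict) -> Optional[str]:
    # HTTP header names are case-insensitive, so compare in lowercase.
    for key, value in headers.items():
        if key.lower() == TENANT_HEADER:
            return value
    return None


def route(service: str, headers: dict) -> str:
    """Pick the sandboxed endpoint for this tenant, else the baseline one."""
    tenant = tenant_of(headers)
    return SANDBOX_ENDPOINTS.get((tenant, service), BASELINE_ENDPOINTS[service])


def propagate(incoming: dict, outgoing: dict) -> dict:
    """Copy the tenant label onto a downstream request (assumed behavior)."""
    result = dict(outgoing)
    tenant = tenant_of(incoming)
    if tenant is not None:
        result[TENANT_HEADER] = tenant
    return result
```

A request labeled `pr-123` reaches the sandboxed `orders` service but still uses the baseline `payments` service, and unlabeled traffic never sees a sandbox at all.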
We worked with a few enterprise companies for over a year to come up with an architecture that can support complex microservice environments. We built a Kubernetes Operator that installs into our users’ Kubernetes clusters and connects to a control plane that we host. Our control plane acts as both an API to create sandbox environments and as a proxy layer that can route traffic to sandboxes.
We are launching with support for integration & end-to-end testing by providing high-fidelity environments that can be set up via the CI pipeline. Next on our roadmap is making it easy to write custom resource plugins and enabling feature testing using sandboxes. With feature testing, developers working on different microservices can see how their changes interact with each other’s services before merging.
If you’d like to try it out, we have a free tier. Our pricing is based on the number of unique services in sandboxes.
Thanks for reading this post. We welcome your feedback and comments!
Either way, this notion of slices of environments being deployed for testing, with "baseline" or fallback environments being used otherwise, is the future of software development. It's a real boon for developers when rolled out effectively, and I've seen it scale massively at GoodRx.
Congrats to the team. Wish you all tons of success!