Utilization or isolation? K8s leaves it up to you
Kubernetes multi-tenancy is a tricky topic, because Kubernetes is a single-tenant orchestrator by design.
By default, Kubernetes does not enforce API, network, or resource isolation. Any workload in any pod can talk to any other or invoke the Kubernetes API, and the scheduler may place workloads from different tenants side by side on the same node.
Instead, Kubernetes provides various constructs that operators can use to implement multi-tenancy, offloading this choice (read: burden) to administrators.
This presents a huge challenge to infrastructure teams, who face ever-growing demand for Kubernetes resources to host their applications.
In fact, the 2023 State of Production Kubernetes research found that 86% of respondents expect the number of new containerized applications built for Kubernetes to grow. 78% see the number of development teams deploying to Kubernetes growing. 77% plan to migrate existing applications to Kubernetes.
Firing up more and more clusters to serve all these applications and use cases is not the only answer: there’s multi-tenancy, too.
The question then becomes: “how do we implement multi-tenancy in Kubernetes in a way that achieves the isolation we need while balancing resource utilization and management effort?” In other words, how do we ensure security through isolation while still achieving cost efficiency through shared resources?
The facetious response to all of these questions is, and always will be, “it depends”. But in this article, we’ll lay out the pros and cons of three popular approaches to implementing multi-tenant architectures in Kubernetes, along with some real-world considerations to help you answer this question yourself.
What is multi-tenancy?
At a high level, multi-tenant solutions are all about sharing. Multi-tenancy means housing multiple ‘tenants’ on shared resources, whether the tenant is an application, user, customer, or team. It’s a fundamental principle behind public cloud computing, for example, because sharing pooled resources is generally more efficient. It’s also important to note that each tenant should have a degree of isolation from the others.
In Kubernetes, multi-tenancy is important in various scenarios, for example:
- You're an MSP, an ISV or another service provider, and you have customers that you want to give access to your infrastructure (for example, the SaaS application you sell) without maintaining separate infrastructure for each customer, which would reduce your margin. Of course, while customers share computing resources and perhaps even a single database, you need strong isolation between customers for security and privacy.
- You're an internal operations team running an infrastructure platform that you want to share across multiple users from different teams and business units. Despite them all being internal, you want to keep a degree of separation between these internal customers and how their applications run, yet you also want to have the efficiency of a single infrastructure to manage.
We also need to define exactly how much sharing is being done: tenants may share all, some, or none of the infrastructure underlying their Kubernetes clusters.
In other words, at one end of the spectrum we have the single-tenant model, where each cluster hosts a single set of applications for a dedicated tenant (i.e., customer, team, organization).
At the other end, we can have a fully multi-tenant environment: one giant Kubernetes cluster that runs various tenants' workloads.
As mentioned before, Kubernetes does not come with multi-tenancy support out of the box; it leaves that choice and implementation detail to the administrator. Kubernetes does have constructs and controls that can implement soft multi-tenancy within a single cluster, and the open-source community has also created higher-level abstractions that make this easier.
So without further ado, let’s dive into the different models.
Approach 1: one dedicated cluster per tenant
The simplest multi-tenancy model is to take Kubernetes’ single-tenancy design and assign one tenant per cluster, whether that tenant is a customer’s deployment, a business unit, a development team or a solitary developer and their application.
The obvious upside here is that no further work is needed to segregate workloads within Kubernetes. All of the isolation guarantees come from the underlying infrastructure provider, and the boundaries are drawn at the cluster level. This is a ‘hard’ multi-tenancy approach.
The downside to this model is poor resource utilization and the cost that comes with it: not just the direct costs of running more virtual machines, VPCs, load balancers and so on, but also the indirect costs of managing a large number of clusters. These indirect costs can become a significant burden, given the effort it takes to maintain even a single production-ready cluster.
While running a cluster per tenant sounds extreme, it is actually very popular, depending on how you define ‘tenant’. Some large organizations do provision a cluster per use case or even per engineer, but the more common practice is to define tenancy around teams, environments, or product lines.
For small companies, the simplest version of this might be a prod and a non-prod cluster. Mid-sized companies may divide clusters by environment plus team. Finally, some consulting or PaaS companies let end users provision an isolated environment as a separate cluster.
Since operating multiple clusters at scale can be a logistical challenge, there are some tools that help with management such as:
…and this one called Palette that you might have heard of!
Approach 2: multiple tenants within a single cluster via namespaces
On the other end of the spectrum, we have the soft multi-tenancy model where multiple tenants are housed within a single cluster. Isolation is then enforced by various Kubernetes constructs.
The key concept here is Kubernetes namespaces, which allow logical separation of resources within the same cluster. By default, however, namespaces alone do not enforce isolation: workloads in different namespaces can freely discover and communicate with each other, and no limits are enforced on their use of shared resources like CPU, memory, or the Kubernetes API.
To successfully isolate each tenant via namespaces, the following is required:
- API isolation via roles and role-bindings
- Network isolation via network policies
- Guardrails against cluster resources via resource quotas and limit ranges
Let’s walk through each one in more detail.
API isolation via roles and role-bindings
Kubernetes uses the role-based access control (RBAC) model to implement authorization policies. This means the cluster admin is responsible for creating the right roles and policies to limit what different Kubernetes users or service accounts can access. Each role is scoped to a single namespace, so the admin can carefully control which resources the role can access (e.g., secrets, volumes, etc.).
However, Kubernetes also provides cluster roles that work across namespaces. While this is convenient for things that are meant to be shared across namespaces, like ingress controllers or webhooks, a badly designed or over-scoped cluster role can easily break the security and isolation model.
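As a minimal sketch of namespace-scoped RBAC (the `team-a` namespace and group names here are hypothetical), a Role plus RoleBinding might grant a tenant read-only access to workloads in its own namespace:

```yaml
# Sketch: a Role scoped to a hypothetical 'team-a' namespace, granting
# read-only access to pods and services, bound to that team's group.
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: team-a-viewer
  namespace: team-a
rules:
  - apiGroups: [""]
    resources: ["pods", "services"]
    verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: team-a-viewer-binding
  namespace: team-a
subjects:
  - kind: Group
    name: team-a            # hypothetical group from your identity provider
    apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: Role
  name: team-a-viewer
  apiGroup: rbac.authorization.k8s.io
```

Because the binding references a Role rather than a ClusterRole, nothing here grants access outside `team-a`.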
Network isolation via network policies
Since one of the goals of Kubernetes was to make it easy for pods to discover one another and communicate, there is no network isolation by default: any pod can talk to any other pod in the same cluster, with no restrictions. In a multi-tenancy model, we often need to limit this, whether to enforce security or simply to reduce noisy or spammy behavior.
There are several approaches to network isolation.
One approach happens at the Container Network Interface (CNI) layer, where network policies can be enforced. Popular open-source CNIs include Calico, Cilium, Flannel, and Weave. Cloud providers may also have their own CNI flavors in which network policies can be enforced to disallow communication.
The other approach is to address this at the service mesh layer. If you are running Istio or Linkerd, you can set policies that limit communication via the proxy sidecars.
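A common starting point, sketched below for a hypothetical `team-a` namespace, is a default-deny ingress policy paired with an explicit allow for traffic from within the same namespace. Note this only takes effect if your CNI enforces NetworkPolicy:

```yaml
# Sketch: deny all ingress into a tenant's namespace, then explicitly
# allow traffic from pods in that same namespace.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-ingress
  namespace: team-a          # hypothetical tenant namespace
spec:
  podSelector: {}            # selects every pod in the namespace
  policyTypes:
    - Ingress
---
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-same-namespace
  namespace: team-a
spec:
  podSelector: {}
  ingress:
    - from:
        - podSelector: {}    # any pod in this same namespace
```

With these two policies applied, pods in other tenants' namespaces can no longer reach `team-a` workloads, while traffic inside the namespace flows as before.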
Guardrails for cluster resources
Since multiple tenants all share the same cluster-level resources, achieving isolation also requires guardrails against both intentional abuse and unintentional exhaustion of shared resources.
First, remember that nodes are not namespace-scoped resources: workloads in different namespaces may be physically running on the same node. This presents a security risk if a malicious actor gains access to the underlying node, via privilege escalation or other means, and can then affect workloads beyond the compromised namespace.
To mitigate this, we can use node taints to segregate critical workloads onto different nodes, and use security contexts and policy engines to prevent privileged access.
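As a sketch of how these two controls fit together (node, namespace, and image names are illustrative), you might taint a node for one tenant and run that tenant's pods with a restrictive security context:

```yaml
# First, taint a node so only workloads that explicitly tolerate the taint
# can be scheduled there, e.g.:
#   kubectl taint nodes node-1 tenant=team-a:NoSchedule
#
# Then give the tenant's pods a matching toleration and a security context
# that blocks privilege escalation.
apiVersion: v1
kind: Pod
metadata:
  name: team-a-app
  namespace: team-a
spec:
  tolerations:
    - key: "tenant"
      operator: "Equal"
      value: "team-a"
      effect: "NoSchedule"
  containers:
    - name: app
      image: registry.example.com/team-a/app:1.0   # hypothetical image
      securityContext:
        allowPrivilegeEscalation: false
        runAsNonRoot: true
```

Taints only steer scheduling; the security context is what actually limits what a compromised container can do on the node.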
Next, we also need to make sure one tenant does not hog all the underlying resources. We can accomplish this by setting resource quotas and limit ranges on each namespace so that CPU and memory usage stays within bounds.
This is not only a good cost-saving measure, but a practical way to prevent some misconfigured job left overnight from affecting more tenants.
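A ResourceQuota caps a namespace's total consumption, while a LimitRange fills in defaults for containers that don't specify requests or limits. The numbers below are illustrative, not recommendations:

```yaml
# Sketch: cap a tenant namespace's aggregate CPU/memory and pod count,
# and give containers sensible per-container defaults.
apiVersion: v1
kind: ResourceQuota
metadata:
  name: team-a-quota
  namespace: team-a          # hypothetical tenant namespace
spec:
  hard:
    requests.cpu: "10"
    requests.memory: 20Gi
    limits.cpu: "20"
    limits.memory: 40Gi
    pods: "50"
---
apiVersion: v1
kind: LimitRange
metadata:
  name: team-a-limits
  namespace: team-a
spec:
  limits:
    - type: Container
      default:               # applied when a container sets no limits
        cpu: 500m
        memory: 512Mi
      defaultRequest:        # applied when a container sets no requests
        cpu: 250m
        memory: 256Mi
```

The LimitRange matters in practice because a ResourceQuota on requests/limits rejects any pod that omits them; the defaults keep tenant workloads schedulable without per-pod configuration.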
Best practices for using namespaces
While this ‘soft’ tenancy model significantly increases resource utilization compared to the single-tenancy model, it places a great deal of responsibility on the cluster admin to enforce isolation correctly and limit the blast radius.
Some controls are simple to implement, such as setting resource quotas and limit ranges. But as the user base grows, managing different roles and network policies becomes a challenge.
For this reason, this model works best when all of the tenants are trusted entities (e.g., different teams from the same organization) or if you control the entire cluster, like in a hosted SaaS model where the end users do not have direct access to the cluster. This way the various controls mentioned above can be frequently tweaked and refined behind the scenes.
In practice, some mix of single tenancy and multi-tenancy is used. For example, a SaaS company may have multi-tenant clusters divided by environment (i.e., prod vs. non-prod) or even by product line. Others might host a developer sandbox on a multi-tenant cluster and use cluster segregation for higher environments.
Limitations of using namespaces
It is important to note that even with API isolation and network isolation in place, namespaces do not provide the same level of security and isolation as separate clusters, because some core components are still shared across namespaces, such as CoreDNS and the Kubernetes API server. A single malicious actor abusing these components could impact the entire cluster.
The other limitation involves tenants that span multiple namespaces. For example, we might want a hierarchical structure that mirrors an organization spanning multiple teams. A Kubernetes special interest group is working on this problem via the Hierarchical Namespace Controller (HNC), but the features are still relatively new.
Approach 3: using virtual control planes
Given the limitations of namespaces as a means to implement multi-tenancy, and the operational burden that comes with them, the open-source community has designed a new model using virtual control planes.
The key idea here is to extend the Kubernetes API with custom resource definitions (CRDs) and create a virtual control plane for each tenant. Each virtual control plane runs its own API server, controller manager, and etcd, decoupling it from the host Kubernetes cluster. This solves the shared-API-server problem that namespace isolation alone cannot address.
Some popular implementations of virtual cluster mechanisms include:
With the rise of platform engineering in recent years, implementing multi-tenancy via virtual control planes has been a popular option. It is an elegant solution that balances the needs of tenant isolation with high resource utilization and cost savings.
Virtual clusters also abstract away a lot of the complexities of implementing controls at the namespace level, which also reduces the operational complexity.
The reality? Multiple options will thrive
As Kubernetes usage grows, every organization will face the challenge of implementing multi-tenancy to some degree. The “best” choice between hard, cluster-based segregation and soft isolation via namespaces or virtual clusters will depend on your team and the requirements of your workloads.
If you need to prioritize isolation and security, a cluster-based approach is likely better. If strong isolation isn’t a concern and cost optimization is (e.g., lower environments, a fully hosted SaaS model), then multi-tenancy within a single cluster can be more effective.
Also, the choice doesn’t always have to be binary. A mix of these models can be tailored for each use case. For example, a multi-tenant cluster can be used for developer environments, with production workloads segregated by cluster.
Finally, as demand grows for applications with multi-region support, multi-tenancy must span multiple clusters by necessity, since most managed cloud offerings tie a cluster to a specific region.
As more enterprise workloads inevitably shift over to Kubernetes, the boundaries between hard and soft isolation will likely be abstracted away into higher-level components.
How Palette can help
In this article we’ve focused on your architectural decisions about how you allocate resources and achieve isolation in your Kubernetes clusters when dealing with multiple tenants, whatever they may be.
In reality, you’ll probably adopt multiple models, depending on the scenario and the constraints of your business. And that’s why at Spectro Cloud we have a number of capabilities that you can take advantage of:
First, we’re a multi-tenant SaaS provider ourselves — and we know that you may prefer full isolation. So check here to find out about our dedicated deployment options for the Palette platform.
Second, we’ve made it our core mission to simplify managing multiple clusters, even large numbers of them, so that if you need that kind of isolation you’re not burning out your team. Our repeatable Cluster Profiles and scalable decentralized architecture are key here, as is our use of Cluster API to declaratively manage clusters across many different environments.
Third, we’ve enthusiastically embraced the potential of virtual clusters and the vCluster project for providing a middle ground between namespaces and fully isolated clusters. Our Palette Virtual Clusters feature is the result.
The best way to see these features in action is in a live 1:1 demo — give us a shout and we can set one up!