August 18, 2023

The complete guide to Kubernetes cost management in 2023

Nicolas Vermandé
Senior DevRel Manager

What is Kubernetes costing you?

Kubernetes has become the most popular solution for orchestrating containerized workloads. It helps organizations simplify and scale the management of their cloud-native applications and associated resources. 

But the power of Kubernetes comes at a price. Literally. Kubernetes can be costly, wherever you run it — and those costs can be difficult to predict.

If you want an accurate view of total cost of ownership (TCO) for running Kubernetes, you need to factor in:

  • Cloud spend, whether on a managed service like EKS, or generic cloud instances like EC2
  • Data center compute resources if you’re running on-premises, including servers and facilities
  • Storage and networking
  • Container compute resources
  • Software licenses, vendor support costs and managed services
  • Time spent by internal teams and stakeholders such as developers (because after all, time is money)
Cloud cost

Kubernetes already has a reputation for complexity. You’ll probably face challenges maintaining cost control as your clusters grow in number and spread across different environments.

In this article, we’ll explore various approaches for Kubernetes cost optimization. 

We’ll look at best practices you can follow for configuring K8s and scaling its resources efficiently. 

We’ll also review some of the Kubernetes cost monitoring software that has sprung up to help you.

Optimize Kubernetes for better workload utilization

Start with Kubernetes native auto-scaling features

When you ‘rightsize’ your clusters to your workloads, you are no longer paying for unused capacity through your cloud bills or data center expenses. Thankfully, Kubernetes has tooling to help you trim excess resources from almost all variable workloads.

Kubernetes Autoscaling

Kubernetes offers multiple scaling techniques, each with a distinct purpose. 

  • Vertical Pod Autoscaler (VPA). It adjusts the CPU and memory requests of individual pods based on observed usage, ensuring each pod is allocated the resources it actually needs. 
  • Horizontal Pod Autoscaler (HPA). It adds or removes pod replicas to keep a target metric, such as average CPU utilization, at the desired level. 
  • Cluster Autoscaler (CA). It automatically adds or removes nodes from the cluster in response to pending pods and underutilized capacity. 
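As a minimal sketch of the HPA in action, here is a manifest that keeps a hypothetical `api` Deployment between 2 and 10 replicas, targeting 70% average CPU utilization (the Deployment name and thresholds are illustrative):

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: api-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: api               # hypothetical Deployment to scale
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70   # add replicas when average CPU exceeds 70%
```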

If you need to incorporate more advanced metrics and events, KEDA (the Kubernetes Event-Driven Autoscaler) is an excellent choice. It allows you to define custom scaling rules and policies based on external metrics like message queues or webhooks.
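For instance, a KEDA ScaledObject can scale a queue consumer on queue depth rather than CPU. This sketch assumes a hypothetical `worker` Deployment and an `orders` RabbitMQ queue, with the broker URL supplied via an environment variable:

```yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: worker-scaler
spec:
  scaleTargetRef:
    name: worker            # hypothetical Deployment consuming the queue
  minReplicaCount: 0        # scale to zero when the queue is empty
  maxReplicaCount: 20
  triggers:
    - type: rabbitmq
      metadata:
        queueName: orders   # hypothetical queue name
        mode: QueueLength
        value: "50"         # target ~50 messages per replica
        hostFromEnv: RABBITMQ_URL
```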

However, it can be complex to adjust auto-scaling parameters when application requirements change, leading to misconfigurations and inefficiencies. And before you can accurately configure resource quotas, you need a thorough understanding of the platform's usage.

This is where software comes in. As we’ll discuss later, dedicated software can provide intelligence and automation. It can help you drive Kubernetes' native auto-scaling capabilities to adapt your resources to real-time demand. This should both streamline costs and maximize application performance.

Take advantage of spot instances for heavy cloud discounts

Scaling is key to controlling your K8s costs, but placement can be just as important. This is where spot instances can be really powerful. 

Spot instances are excess compute capacity offered by cloud providers at discounted rates. 

By placing your workloads on these instances, you can enjoy substantial savings, often up to 90%, compared to regular compute capacity.

However, using spot instances may introduce challenges when running stateful workloads. When the provider terminates the spot instance, your data will vanish into the abyss. 

In such cases, it becomes crucial to select a Container Storage Interface (CSI) driver that can efficiently replicate your data and keep it available in real time.

Spot discounts

To make the most of these discounted resources, it's important to follow certain infrastructure patterns. 
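For example, on EKS, nodes in spot-backed managed node groups carry the label eks.amazonaws.com/capacityType: SPOT (other clouds use different labels). An interruption-tolerant workload can prefer those nodes via node affinity; the Deployment name and image here are illustrative:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: batch-worker        # hypothetical interruption-tolerant workload
spec:
  replicas: 4
  selector:
    matchLabels: { app: batch-worker }
  template:
    metadata:
      labels: { app: batch-worker }
    spec:
      affinity:
        nodeAffinity:
          # Prefer spot nodes, but still schedule on on-demand if none exist
          preferredDuringSchedulingIgnoredDuringExecution:
            - weight: 100
              preference:
                matchExpressions:
                  - key: eks.amazonaws.com/capacityType
                    operator: In
                    values: ["SPOT"]
      containers:
        - name: worker
          image: registry.example.com/batch-worker:latest  # hypothetical image
```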

Set resource limits and watch out for inter-region costs

The configuration of your clusters can make a big difference to your costs. What do we mean by configuration? Things like:

  • Setting appropriate resource limits on individual workloads to allow for dynamic scaling of resources in use. 
  • Optimizing underlying infrastructure, for example by selecting the right instance types, deploying workloads in cost-effective regions, and taking advantage of affordable storage options.
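In practice, setting those limits means declaring requests (what the scheduler reserves) and limits (hard caps) on each container. A minimal sketch, with illustrative values:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: web
spec:
  containers:
    - name: web
      image: nginx:1.25
      resources:
        requests:          # what the scheduler reserves for this container
          cpu: "250m"
          memory: "256Mi"
        limits:            # hard caps: CPU is throttled, memory is OOM-killed
          cpu: "500m"
          memory: "512Mi"
```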

In a multi-cluster Kubernetes setup, where microservices communicate across different regions and availability zones (AZ), it’s particularly important to watch out for egress/ingress costs. 

Communication within the same region and AZ is usually free or at a reduced cost, but across regions, the costs can be huge, particularly if there is a high volume of cross-region communication.

Caching, compression, and region-aware routing can help you cut down on these expenses. But it’s imperative to regularly monitor and analyze your network traffic and consider how you configure things like load balancers. 
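One low-effort lever for zone-aware routing is Kubernetes' built-in topology-aware routing: annotating a Service asks kube-proxy to prefer endpoints in the client's own zone, cutting cross-AZ traffic. This sketch assumes Kubernetes 1.27+ (older versions used the `service.kubernetes.io/topology-aware-hints` annotation instead), and the service name is hypothetical:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: checkout            # hypothetical service
  annotations:
    # Best-effort: prefer same-zone endpoints, fall back to all endpoints
    service.kubernetes.io/topology-mode: Auto
spec:
  selector:
    app: checkout
  ports:
    - port: 80
      targetPort: 8080
```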

Oh, and one more thing: keep a close eye on log growth. Logs can quickly gobble up your storage space if you set the wrong retention policy. If you have to retain logs and don’t need them for analysis, make sure you store them on the cheapest storage tiers.

Share resources with multitenancy and vClusters

By embracing multi-tenant architectures, you can actually save costs by efficiently sharing resources across applications and teams. 

Virtual clusters (vClusters), from the open source vcluster project by Loft, are one example of a multi-tenant approach that can help you achieve this.

vClusters are Kubernetes clusters that run on top of other Kubernetes clusters, also known as the "host" Kubernetes cluster. 

They provide isolation and configuration independence, while sharing the nodes from the underlying host cluster. 

vClusters not only cut resource overhead, they provide dedicated environments for rapid prototyping and testing. The best part? You won't have to worry about impacting the underlying host cluster or other vClusters.

One key feature of vClusters is the "pause" capability. This lets you temporarily suspend your clusters when they're not in use, cutting costs.
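With the vcluster CLI, the lifecycle looks roughly like this (the cluster and namespace names are illustrative):

```shell
# Create a virtual cluster in the "team-a" namespace of the host cluster
vcluster create dev-env --namespace team-a

# Pause it outside working hours: workloads scale down, state is kept
vcluster pause dev-env --namespace team-a

# Resume it when the team needs it again
vcluster resume dev-env --namespace team-a
```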

There’s one more vCluster trick you need to know about: oversubscription.

Maximize utilization by oversubscribing with virtual clusters

As we’ve seen in our section above on autoscaling, cost optimization is all about sizing your infrastructure to maximize utilization. 

Generally you don’t want to over-provision resources, because it wastes compute capacity. That's why setting resource limits is crucial. 

And usually you don’t want to under-provision resources either, because then you end up with potentially severe application performance issues.

Finding the right balance can be tricky.

Virtual clusters allow for oversubscription, meaning you can safely run more containers than your infrastructure typically allows. 

Overprovisioning K8s clusters

Why? Well, not all containers are fully utilized all the time. So, by leveraging this concept, you can optimize your resource utilization. 

By the way: we have implemented vClusters in our Palette management platform, as Palette Virtual Clusters. You get all the power of vClusters with simple centralized declarative management and enterprise policy controls.

Adopt a dedicated cost-management platform

A quick Google for ‘Kubernetes cost management’ (which may have brought you here) will turn up many dedicated software or cloud solutions. 

Kubernetes cost management tools are an offshoot of cloud cost management solutions. 

These have become extremely popular in response to cloud “bill shock”. The majority of Kubernetes workloads run on public cloud services, and left unchecked, cloud bills can easily tick up. 

Cost management starts with cost visibility. So these platforms offer a holistic perspective of your Kubernetes deployments, enabling you to identify areas of resource waste and potential cost reduction opportunities. They also enable automated ways to tackle costs.

Some of these tools are standalone and run directly inside your cluster: for example, Kubecost, kube-green, and OpenCost. Others are cloud-based tools that you can subscribe to. 

Depending on whether you look at pure Kubernetes cost management tools, overall cloud cost management, or observability tools that have cost modules, there are plenty out there: Stormforge, Yotascale, Vantage, Anodot, Loft, Harness, Datadog, and others. 

You’ll find cost optimization capabilities baked into platforms like OpenShift, and natively into Kubernetes services like GKE. We even have some multicluster cost visibility tools baked into Palette’s dashboards, as we explored in this blog.

Which is the best? Well, that’s a subject for another article. But let’s take a look at three of the most prominent.


Kubecost

Kubecost addresses the need for developers to gain insights into their Kubernetes spending.

Co-founders Webb Brown and Ajay Tripathy, who previously held roles in infrastructure monitoring at Google, started Kubecost as an open-source tool in 2019. 

Kubecost is an easily installable, containerized application that offers a strong free version, along with commercial add-ons like single sign-on (SSO) and combined reports.

To install and configure Kubecost, you can follow these simple steps (remember that Prometheus is a prerequisite):

  1. Install the Helm chart
helm upgrade --install kubecost cost-analyzer \
--repo https://kubecost.github.io/cost-analyzer/ \
--namespace kubecost --create-namespace

  2. Enable port-forwarding
kubectl port-forward --namespace kubecost deployment/kubecost-cost-analyzer 9090

  3. With your browser, navigate to http://localhost:9090

Kubecost provides flexible configuration options to enhance your experience. This includes cloud billing integration and the Cluster Controller.

Cloud billing integration gives you detailed insights into your Kubernetes spending. It integrates with leading cloud providers like AWS, GCP, and Azure, providing a total view of your cloud costs. By interacting with your cloud provider's APIs, it gathers cost data and resource usage information.

The Cluster Controller automates tasks like cluster right-sizing and turndown. It evaluates your cluster's resource usage and provides optimal sizing recommendations. It also actively monitors the cluster for underutilized nodes and scales down as needed.
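Once Kubecost is running, its Allocation API exposes the same data programmatically. For example, with the port-forward from step 2 active, you can pull a week of cost aggregated per namespace:

```shell
# Query 7 days of cost data, aggregated by namespace,
# from Kubecost's Allocation API behind the port-forward
curl -s "http://localhost:9090/model/allocation?window=7d&aggregate=namespace"
```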


Kube-green

Kube-green is a Kubernetes operator that comes at the efficiency issue from a different angle: environmental sustainability. Its mission is to minimize the CO2 emissions in your clusters. 

Kube-green automatically suspends idle pods, conserving energy and reducing your carbon footprint.

Here's how it works: Kube-green acts as a watchdog, intercepting pod lifecycle events through a webhook. 

When Kubernetes creates a pod, kube-green checks if it is scheduled to run during off-hours or when it is underutilized. If it meets these criteria, kube-green will suspend the pod. Once the pod is scheduled to run again, kube-green will resume its execution. 

Currently, kube-green supports Kubernetes Deployments and CronJobs.

Setting up Kube-green is a breeze, requiring just a single step:

kubectl apply -f

The kube-green configuration options must be specified in a CRD called SleepInfo. Here is an example of a SleepInfo resource:

apiVersion: kube-green.com/v1alpha1
kind: SleepInfo
metadata:
  name: working-hours
  namespace: working-hours
spec:
  weekdays: "1-5"
  sleepAt: "20:00"
  wakeUpAt: "08:00"
  timeZone: "Europe/Rome"
  suspendCronJobs: true
  excludeRef:
    - apiVersion: "apps/v1"
      kind: Deployment
      name: api-gateway
This resource specifies that the pods in the working-hours namespace should be suspended at 20:00 (UTC+1) on weekdays (Monday through Friday) and resumed at 08:00 (UTC+1) on weekdays. Cronjobs should also be suspended, and the pod named api-gateway should not be suspended.

Whether it's for development environments or non-essential workloads, Kube-green provides a simple yet powerful solution to maintain lean, clean, and green Kubernetes clusters.


OpenCost

OpenCost is a promising CNCF project currently in incubation.

It gathers data from both your Kubernetes cluster and cloud provider about the resources utilized by your pods, the associated costs, and the duration of the pod runtime. 

Using this data, OpenCost calculates the expenses incurred by your Kubernetes workloads.

Kubecost powers the cost allocation engine in the OpenCost implementation. 

To deploy OpenCost, execute a single command:

kubectl apply --namespace opencost -f

Then, enable port-forwarding:

kubectl port-forward --namespace opencost service/opencost 9003 9090

You can also install the `kubectl-cost` plugin to get CLI access to the OpenCost API. To install it with Helm, just run:

helm repo add kubecost https://kubecost.github.io/cost-analyzer/

helm upgrade -i --create-namespace kubecost kubecost/cost-analyzer --namespace kubecost --set kubecostToken="a3ViZWN0bEBrdWJlY29zdC5jb20=xm343yadf98"

You can then use kubectl cost within your Kubernetes cluster to view cost and performance metrics of your workloads.

For example, you can view the cost breakdown by CPU and memory for a specific namespace:

kubectl cost namespace --historical --namespace=my-namespace --breakdown

What's more exciting is that you can predict the cost of a deployment that is defined in a YAML specification.

kubectl cost predict -f 'k8s-deployment.yaml'

default deployment       +9 CPU cores        23.27 USD     +209.47 USD
                         +6 RAM GiB           3.12 USD      +18.72 USD

How do Kubernetes cost-management tools work?

These tools integrate with popular cloud providers like AWS, GCP, and Azure to fetch real-time pricing information. They take into account various factors such as instance types, regions, and other cloud-specific parameters to accurately calculate the cost of the underlying infrastructure. Even for on-premises clusters, you have the flexibility to provide custom pricing.

What sets these tools apart is their ability to leverage both declarative resource definitions and real-time metrics like CPU and memory within the cluster. By mapping these metrics to the cost of the underlying infrastructure, they provide a comprehensive view of resource consumption and its associated cost. This means you can easily understand how your containers are utilizing resources and how much they are costing you.
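The core idea, mapping per-workload resource usage onto infrastructure unit prices, can be sketched in a few lines of Python. The prices and usage figures below are made up for illustration; real tools use live pricing APIs and per-minute metrics:

```python
# Toy cost-allocation model: charge each workload for the CPU and memory
# it requests, at hourly unit prices derived from the underlying instances.
CPU_PRICE_PER_CORE_HOUR = 0.031   # illustrative on-demand price, USD
MEM_PRICE_PER_GIB_HOUR = 0.004    # illustrative price, USD

def workload_cost(cpu_cores: float, mem_gib: float, hours: float) -> float:
    """Cost of one workload over a time window, in USD."""
    return hours * (cpu_cores * CPU_PRICE_PER_CORE_HOUR
                    + mem_gib * MEM_PRICE_PER_GIB_HOUR)

# Allocate a month (~730 h) of cost across two hypothetical namespaces
usage = {
    "checkout": {"cpu": 2.0, "mem": 4.0},
    "batch":    {"cpu": 6.0, "mem": 8.0},
}
costs = {ns: round(workload_cost(u["cpu"], u["mem"], 730), 2)
         for ns, u in usage.items()}
print(costs)
```

Real platforms refine this with actual usage versus requests, amortized discounts, and shared-cost distribution, but the mapping from metrics to dollars is the same.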

Moreover, these tools continuously monitor resource utilization and costs, enabling them to identify any anomalies or spending spikes. This proactive approach helps you uncover misconfigurations or inefficiencies that may be causing unnecessary expenses.

Cloud-based cost management platforms

Another way to measure and optimize costs for Kubernetes is to use SaaS services. Solutions such as Replex or CloudZero provide comprehensive insights into resource utilization, cost allocation, and efficiency metrics. 

These platforms offer real-time monitoring, intelligent analytics, and automated recommendations to ensure that your Kubernetes clusters are running optimally.

For instance, Replex (acquired by Cisco under the AppDynamics brand) uses an AI engine to analyze usage data in real-time and provide insights into your spending patterns. It also allows you to create custom cloud cost policies that are enforced automatically. You can even configure alerts to be sent when resource utilization or costs exceed certain thresholds.

CloudZero offers a similar set of features with the added benefit of automatically suspending resources when they are not being used. This helps reduce costs by ensuring that the cluster is only running what's necessary.

By using cloud-based platforms, you can quickly and easily identify opportunities to reduce costs. You'll also gain valuable insights into your resource utilization that will help you make more informed decisions about the use of Kubernetes in your organization.

Don’t forget the hidden cost: time

We started this blog by asking: what is Kubernetes costing you? 

It’s easy to jump straight to the tangible costs. For many of the tools we’ve looked at, that’s primarily the cloud bills you receive every month with dollar amounts visible on dashboards. 

Go a little further and you’ll start to calculate the per-machine license for hypervisor or other software, your MSP’s monthly bill, or even the purchase price of hardware for on-prem deployments.

Although they may not come out of your IT budget in the same way, it’s just as important to look at the less tangible costs. 

This includes things like developer productivity, infrastructure downtime, and the hours that your ops and platform teams are spending managing your Kubernetes infrastructure.

This is particularly important if your organization makes choices that, in effect, swap dollars for hours. For example:

  • Instead of buying commercial software, you ‘roll your own’ from upstream OSS projects and do your own integration work.
  • Instead of paying for vendor support contracts or an MSP, you support everything in house. 

In these cases, the dollar spend goes down — but there’s no such thing as free. The spend is now just hidden in the payroll budget (and likely your team’s evenings and weekends).

The average salary of a Kubernetes engineer in the US is around $150,000, and most IT teams feel they’re struggling with a shortage of resources and expertise. 

If you have a whole team of K8s ninjas repeating manual updates and patches across multiple clusters, triaging Jira tickets, scrambling to troubleshoot issues — is that a good use of their time and skills? What could you automate? 

Equally, if you have 50 expensive software engineers waiting around for access to clusters to test their code, how is that affecting your feature velocity and time to market?

If you have outages or application performance issues caused by misconfigurations, configuration drift, snowflake clusters, or patches and upgrades that your team hasn’t got around to yet — how much are those outages costing you? 

Obviously this figure will vary wildly depending on the length and timing of the outage, and which business applications are affected — but you get the idea.

The more you can put a number on these soft factors, the more you can start to measure the actual monetary impact of investments in things like internal developer platforms, self-service Kubernetes, and highly automated enterprise cluster fleet management tools (yes, like Palette).

Cost optimization is a continuous effort

If you’re trying to keep Kubernetes costs under control, you have a wealth of techniques at your disposal.

Essentially, you want to scale your clusters precisely to the needs of your workloads, and Kubernetes has a number of techniques, such as autoscaling, to make that happen, if you set sensible policies for resource limits. 

More broadly, multi-tenancy and resource sharing tools like Virtual Clusters can help you tighten up, and even benefit from capacity oversubscription.

If you’re running K8s in the cloud, then you need to look at placement, traffic flows, storage policies and spot instances. The hyperscalers may only charge you cents per hour for a service, but at scale, a single misconfiguration can turn into thousands of dollars.

In complex environments, you may not be able to do all this manually. Tools like Kubecost and a range of SaaS products have sprung up offering analytics, insights and automation to help you achieve cost savings. Many are easy to use, and some are open source and free. 

As you work to optimize, don’t forget those costs that are harder to measure, but potentially even bigger. Operator time. Developer productivity. Outages and security vulnerabilities.

Ideally you should look for a Kubernetes management platform that enables you to:

  • Centralize consistent policy and configurations 
  • Declaratively manage multiple clusters throughout their lifecycle
  • Provide automation and self-service throughout CI/CD and DevOps workflows

Of course, that’s what we do at Spectro Cloud with our Palette platform. You can find out more on our website, or check out this report from the analysts at ESG to see how they attempted to quantify some of the operational time savings that a modern K8s management platform can deliver. 
