Engineering

Kubernetes Cost Optimization: Strategies & Tools to Deploy Cost-Effective Kubernetes

July 23, 2024
Christopher Fellowes
Sakshi Jain

Kubernetes has become the gold standard for orchestration due to its reliable, scalable, and extensible design. It’s battle-tested, proven at scale, and supported by a thriving community. Yet all these benefits come at some cost, and often that cost is quite literal. This is why having a clear approach to Kubernetes cost optimization is so important.

This cost concern has led many smaller companies to shy away from Kubernetes altogether, fearing that the time and money required to set it up would be better spent elsewhere. As time passes, this sometimes results in a home-grown solution: various cloud offerings cobbled together into a confusing amalgamation that becomes a nightmare to maintain. What may have initially been a keen cost-saving strategy ends up costing more than adopting Kubernetes in the first place.

But is this simply the reality with Kubernetes?  Or is there an alternative approach that would allow companies of all sizes to reap the benefits of Kubernetes without breaking the bank?  

The truth is Kubernetes can be extremely cost-effective, and it may be simpler to get started than you think. Roughly a decade of public use has led to a proliferation of best practices and tools that can help guide you on this journey. In this article, we’ll provide a brief overview of the current landscape of Kubernetes cost optimization, as well as some suggestions on immediate steps you can take today.

What is Kubernetes Cost Optimization?

Kubernetes cost optimization involves the balancing act of minimizing the amount of compute you pay for while maximizing the amount it gets used. Provision too little and you risk your workloads crashing or not running in the first place. Provision too much and you have a fleet of virtual machines happily idling your money away. There truly is no silver bullet, but through careful consideration of processes and tools it is possible to use Kubernetes to its fullest without breaking records every month on your cloud bill.

Which factors influence Kubernetes cost?

To narrow the scope of this article, I’ll be assuming the cluster’s control plane is a cloud-managed offering like Amazon’s EKS or Google’s GKE. This means that almost all the cost from your cluster will be from virtual machines (VMs) that run your workloads and data transfer costs between them.  

Compute:

Your workloads will need something to run on, and as they scale your resource demands will scale too. Every pod you run will eat up some amount of CPU and memory, requiring additional nodes in your cluster. Although it may be technically possible to run a small cluster with only one or two nodes, this type of setup is most likely better suited for a non-Kubernetes approach. Kubernetes, by design, works best for highly dynamic workloads so it’s important that your cluster’s compute power can accommodate shifting demands.

Data Transfer:

Cloud providers typically allow data to flow freely within a single availability zone. However, if you want to run a highly resilient cluster that’s optimized for local performance, you will likely expand across availability zones or regions. This will incur additional cost on all data transferred between them. What may appear as pennies at first (~$0.01/GiB) can lead to sticker shock as your cluster scales up.

Kubernetes Cost Optimization Challenges

Optimizing cost for Kubernetes clusters is a daunting problem for a few main reasons.

Dynamic Workloads:  

The “perfectly optimized” cluster has 100% resource utilization at all times: no more, no less. In this dream world every single pod operates at maximum efficiency, with any uptick in resource consumption addressed instantaneously. Of course, this isn’t possible, and chasing it acts as an exciting black hole to throw all your precious time into. Trying to maintain highly optimized resource utilization in the face of rapidly scaling workloads is even more difficult, since now you have a moving target that shifts constantly.

Bin Packing:  

Resources are not added to your cluster in isolation; your CPU cores and memory come in the form of virtual machines. This means that if you suddenly need more resources, you cannot simply tack on another CPU core; you need to launch an entirely new node and add it to your cluster.

After the dust has settled and pods have stabilized, the question becomes whether the new node was the best fit for the job, and whether its presence means older nodes can be consolidated. For example, if the new node could fit all the workloads running on an older node, it would make sense to simply shift them over and retire the old one.

Doing this optimally at scale is difficult. It requires tooling that knows your cloud provider’s offerings, and workloads configured so they can safely be shifted around. Failure on either side results in cost spikes or service outages.

Dev-Infra Harmony:  

Although the rise of DevOps culture has largely brought the realm of infrastructure directly to your developers’ “doorstep”, it can be tempting for devs to simply throw application builds over the wall for the platform/ops team to handle deploying.

This disconnect makes everyone’s lives harder, as each side lacks critical information from the other. Ops teams trying to optimize cost need to deeply understand the workloads they are deploying, while Devs need to learn Kubernetes concepts to optimize their workloads to reap the full benefit.

7 Tips to Improve Cost-Effective Kubernetes Deployment

Tip #1: Observability

Trying to optimize Kubernetes cost without any observability set up is like trying to drive a racecar blindfolded. The sheer number of metrics, tools and configurations can be mind-boggling. And the volatility of the workloads you run can mean slight adjustments lead you miles off course. It's essential to go in with a game plan and some way to measure it.

Luckily, there are plenty of options out there, from open-source solutions like the Kubernetes Prometheus Stack to enterprise offerings like New Relic or DataDog. Ultimately these will all address the core issues of observability in slightly different ways, so it’s up to you to determine which best fits your use case. Starting with an open-source offering is typically a safe bet, but recognize that as you scale, this will be one more service for you to manage.

Tip #2: Focus on Request Rightsizing from Day 1

The Kubernetes resource management model operates on limits and requests. A limit defines how much a pod can consume, while a request denotes how much it is guaranteed. Early on, both concepts can be nebulous. You may have a rough approximation of how resource-intensive a process is under various loads, but most of these assumptions are best-effort and can vary wildly from reality. As a result, teams sometimes opt for an overly generous initial request or skip requests altogether.

The consequences of an overly generous request are self-evident: you’re paying for resources your pods aren’t using. If you’ve seen your pod spike up to 1 GB of memory usage and set its request accordingly, a large percentage of that will likely sit completely unused throughout non-peak hours.

The dangers of omitting a request altogether are far more nuanced. Kubernetes offers three levels of Quality of Service (QoS) guarantees, in ascending order: Best Effort, Burstable, and Guaranteed. A pod with no resource requests at all is classified as “Best Effort” and is liable to be killed at a moment's notice if a node runs low on available resources. This means that if anything without high fault tolerance is scheduled as “Best Effort,” its stability can be jeopardized by high resource consumption elsewhere on the node.
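As a rough sketch, requests and limits live on each container in the pod spec; the names and values below are illustrative placeholders, not recommendations:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: example-app                  # hypothetical name
spec:
  containers:
    - name: app
      image: example.com/app:latest  # placeholder image
      resources:
        requests:
          cpu: "250m"                # the guaranteed share used for scheduling
          memory: "256Mi"
        limits:
          cpu: "500m"                # hard ceiling on consumption
          memory: "512Mi"
```

Because the requests and limits differ, this pod is classed as Burstable; setting requests equal to limits for every container makes it Guaranteed, and omitting both makes it Best Effort.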

By setting up resource requests from day 1 and continuously reevaluating them over time, you can both improve the durability of your services and save on costs at the same time.

Tip #3: Get Developers Involved

The closer your developers are to your live environments, the more informed any decision about them will be. If your developers aren’t familiar with Kubernetes concepts like resource requests and limits, how will your Platform/DevOps teams be able to reverse-engineer these configurations effectively?  

While you could set up observability for resource consumption, this only tells part of the story. Metrics alone can’t tell you how fault-tolerant a workload is.

This crucial detail explains the rise of Internal Developer Platforms, where devs can self-serve on everything from building applications to deploying and monitoring them on live environments. The more empowered developers are to fully own the lifecycle of code they write, the easier it becomes to fully take advantage of all Kubernetes has to offer.  

Tip #4: Cloud Discounts

In an earlier tip, we discussed how properly setting resource requests can help prevent your services from getting terminated abruptly. However, if your service can tolerate this type of disruption, you might be in the perfect position to utilize spot instances and receive significant discounts.  

“Spot” instances, in contrast to “On-Demand” instances, are virtual machines that cloud providers offer at a significant discount but with lower quality-of-service guarantees. While on-demand instances typically offer fixed prices and guaranteed availability, spot instances can vary based on current demand and can be reclaimed by the cloud provider at any time. While this can certainly cause service outages if you aren’t careful, a well-architected system can withstand this without any downtime. It is typically a good idea to run a mix of spot and on-demand instances in your cluster, allowing workloads to be scheduled according to their fault tolerance using concepts like Node Affinities.
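As a sketch of scheduling by fault tolerance, a disruption-tolerant deployment can express a soft preference for spot nodes via node affinity. The label key and values depend on how your nodes are provisioned; the convention below matches Karpenter's and is an assumption for your setup (EKS managed node groups, for instance, use a different label):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: batch-worker                 # hypothetical fault-tolerant workload
spec:
  replicas: 3
  selector:
    matchLabels: { app: batch-worker }
  template:
    metadata:
      labels: { app: batch-worker }
    spec:
      affinity:
        nodeAffinity:
          # "preferred" (not "required") lets pods fall back to
          # on-demand nodes when no spot capacity is available
          preferredDuringSchedulingIgnoredDuringExecution:
            - weight: 100
              preference:
                matchExpressions:
                  - key: karpenter.sh/capacity-type
                    operator: In
                    values: ["spot"]
      containers:
        - name: worker
          image: example.com/worker:latest  # placeholder image
```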

In addition to spot instances, all major cloud providers offer discounts on compute resources, typically in the form of a long-term contract. This can easily be one of your largest sources of savings, getting deals up to 70% off in some cases. In AWS, you can utilize Reserved Instances, GCP offers Committed Use Discounts, and Azure has Savings Plans.  

If the concept of locking yourself into a long contract seems scary, don’t turn away immediately. Each of these cloud providers offers a sliding scale of flexibility versus savings. If you want to be more cautious, you can choose a less restrictive plan and still get significant discounts without the fear of lock-in.

Tip #5: Automate, automate, automate

If these tips so far have you more apprehensive than excited, don’t worry! It should be clear by now that the scope of “cost optimization” encompasses many different things, and it’s entirely unrealistic to cover everything in a brief article. Thankfully, there’s a strong community with extensive experience managing Kubernetes clusters, so there are countless tools to help automate these processes. Some notable ones that are part of the Kubernetes project are:

  • Vertical Pod Autoscaler: helps right-size your resource requests by monitoring resource consumption and automatically adjusting the values.
  • Horizontal Pod Autoscaler: helps ensure pods are scheduled efficiently when they are needed, based on metrics like CPU usage.
  • Cluster Autoscaler: helps maintain the number of nodes in your cluster by scaling up and down to accommodate pods.
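As a sketch of the second item, a Horizontal Pod Autoscaler targeting average CPU utilization might look like the following; the deployment name and thresholds are illustrative:

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: app-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: example-app        # hypothetical deployment to scale
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70  # add replicas when average CPU usage
                                  # exceeds 70% of the pods' CPU requests
```

Note that utilization targets are computed against resource requests, which is another reason request rightsizing (Tip #2) pays off.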

No matter what tooling you end up adopting, it’s critical that your cluster can “self-manage” as much as possible. The most useful tools are those that let you define high-level objectives and attempt to realize them automatically. Whether this involves forgoing cost efficiency for higher uptime, or optimizing for latency at all costs, the more of this process that happens entirely automatically, the clearer your cost optimization journey will become.

Tip #6: Learn your Cloud Provider’s Networking Fees

Unfortunately, data transfer costs are unavoidable on major cloud providers. You will likely want to run a highly reliable system that has fault tolerance at the region/availability zone level, which means you will be transferring data between regions.  

Each cloud provider has slightly different terminology and concepts for their networking models, but there are a few general tips that are relevant.  

  1. Region-aware services: even if your cluster is multi-region, there is a good chance not all of your services’ communication needs to be. You can likely cut data transfer costs by having services prefer endpoints in their local region or zone whenever possible.
  2. Private network communication: for endpoints that are publicly available, there’s typically an option to route traffic to them over your private network instead. This saves data transfer costs by minimizing the number of billable hops your traffic passes through.
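As a sketch of the first point, Kubernetes itself offers Topology Aware Routing, which asks kube-proxy to prefer same-zone endpoints. The annotation below is the Kubernetes 1.27+ form (older versions used `service.kubernetes.io/topology-aware-hints`), so verify it against your cluster version; the service and selector names are placeholders:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: example-svc                          # hypothetical service
  annotations:
    # hint the control plane to keep traffic within the client's zone
    service.kubernetes.io/topology-mode: Auto
spec:
  selector:
    app: example-app
  ports:
    - port: 80
      targetPort: 8080
```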

Tip #7: Use Sidecars with Consideration

Sidecars are a popular model for attaching extra functionality into a pod without needing to modify your service’s container. For example, you could configure your service to join a service mesh by adding a sidecar to its PodSpec, and it would automatically gain all of the benefits without any other changes.  

Although this convenience may be tempting, sidecars add up quickly. Even if a single sidecar container only uses 0.1 CPU cores and 100 MB of memory, if it needs to be added to every pod you run, your total resource consumption will skyrocket.
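To make the arithmetic concrete, here is a hypothetical pod with a proxy sidecar; at a fleet of 200 pods, the sidecar's requests alone reserve 20 CPU cores and roughly 20 GiB of memory. All names and images are placeholders:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: app-with-proxy                  # hypothetical pod
spec:
  containers:
    - name: app
      image: example.com/app:latest     # placeholder application image
    - name: proxy-sidecar               # e.g. a service-mesh proxy
      image: example.com/proxy:latest   # placeholder proxy image
      resources:
        requests:
          cpu: "100m"     # 0.1 core per pod -> 20 cores across 200 pods
          memory: "100Mi" # ~20 GiB across 200 pods
```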

There is no general rule here. The benefits of a sidecar may be worth the cost it incurs, but it’s important that this decision is made deliberately. Additionally, the recent rise in popularity of Extended Berkeley Packet Filter (eBPF) offers promising solutions to the sidecar dilemma, allowing “sidecar” functionality to live at the node level rather than the pod level, significantly reducing cost. One such example is Istio’s ambient mode.

5 Kubernetes Cost Optimization Tools

Now that we’ve set the context for Kubernetes cost optimization, let’s explore some tools that can help.

#1: Kapstan

Overview:

Kapstan is an Internal Developer Platform that allows devs to quickly spin up infrastructure, build CI/CD pipelines, and monitor their live environments all in a few clicks.

Features:

  • Deploy applications on Kubernetes with a few clicks
  • Monitoring and observability out of the box
  • Zero-downtime deployment upgrades
  • Audit history for all changes
  • Multi-cloud support
  • Built in intelligent autoscaling using Karpenter and KEDA
  • Manages other cost optimization tools and provides simple interfaces

Does Kapstan offer a free version?

Kapstan offers a free plan as well as a 30-day free trial for the Premium plan.

#2: Karpenter

Overview:

Karpenter is a node autoscaler for Kubernetes clusters running on AWS.

Features:

  • Fast scale up and scale down
  • Extremely efficient bin packing due to native design for AWS
  • Configured via Custom Resource Definitions (CRDs) in your cluster
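As a sketch of that CRD-based configuration, an abridged Karpenter NodePool might look like the following. Karpenter's API group/version and some field names have changed across releases (the `v1beta1` form is shown, with fields like `nodeClassRef` omitted), so check the version your Helm chart installs:

```yaml
apiVersion: karpenter.sh/v1beta1
kind: NodePool
metadata:
  name: default
spec:
  template:
    spec:
      requirements:
        # let Karpenter choose between spot and on-demand capacity
        - key: karpenter.sh/capacity-type
          operator: In
          values: ["spot", "on-demand"]
  disruption:
    # consolidate underutilized nodes to improve bin packing
    consolidationPolicy: WhenUnderutilized
```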

Does Karpenter offer a free version?

Karpenter is an open-source tool you can get started with by installing its Helm chart on your cluster.

#3: Kubecost

Overview:

Kubecost is a price monitoring tool that attributes resource costs to various components within a Kubernetes cluster.

Features:

  • Provides detailed breakdowns of Kubernetes costs into groupings like deployments, secrets, and namespaces
  • Works on all major public clouds, with custom options to support on-prem environments
  • Offers price monitoring and alerting

Does Kubecost offer a free version?

Kubecost offers a free tier as well as an enterprise tier with a 30-day free trial.

#4: KEDA

Overview:

Kubernetes Event-Driven Autoscaler (KEDA) is a horizontal autoscaler that offers a wide variety of scalers to scale your workloads based on events from different data sources. This allows you to scale your application using triggers beyond resource utilization, such as Kafka queue length.
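As a sketch of the Kafka example, a KEDA ScaledObject can scale a consumer deployment on consumer-group lag. The broker address, topic, consumer group, and deployment name below are hypothetical placeholders:

```yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: consumer-scaler
spec:
  scaleTargetRef:
    name: example-consumer          # hypothetical Deployment to scale
  minReplicaCount: 0                # scale to zero when the queue is empty
  maxReplicaCount: 20
  triggers:
    - type: kafka
      metadata:
        bootstrapServers: kafka.example.com:9092  # placeholder broker
        consumerGroup: example-group
        topic: orders
        lagThreshold: "50"          # target lag per replica
```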

Features:

  • Scale your workloads on metrics other than resource consumption
  • Reduce idle costs by scaling to zero when no events are detected
  • Simple visibility into scaling decisions by exposing metrics for configured scalers

Does KEDA offer a free version?

KEDA is an open-source tool you can get started with by installing its Helm chart onto your cluster.  

#5: Densify

Overview:

Densify is a Cloud Resource and Kubernetes optimization tool that acts as a central panel for resource control, monitoring, and optimization. It provides detailed dashboards to get insights into your entire cloud cost and tools to optimize your environments.

Features:

  • Numerous dashboards for visibility into cloud costs
  • Automatic Kubernetes optimization for resource requests and limits, as well as node type optimization
  • Alerting support for cost monitoring

Does Densify offer a free version?

Densify doesn’t offer a free plan, but they do offer a 30-day free trial on their Enterprise plan.

Conclusion: Using Kapstan for Kubernetes Cost Optimization

We hope this crash course on Kubernetes cost optimization has been helpful! While adopting Kubernetes can seem daunting at first, there are ways to leverage the benefits of Kubernetes with less pain and in a cost-optimized way.

Kapstan provides one of the simplest ways to get started on your Kubernetes cost optimization journey. We provide all the infrastructure you will need in a few clicks and empower developers to be directly connected to the live environments they are deploying to. You can easily view your full cost journey in one simple UI with cloud costs by resource, Kubernetes cost by application, and historical metrics to analyze resource consumption all in one place. Stop juggling a dozen different tools, like Karpenter and KEDA, and rest assured knowing they are configured optimally for your use case, with automatic upgrades and fine-tuning.  

Interested in learning how Kapstan can help you with Kubernetes cost optimization? Reach out to schedule a demo with one of our founding engineers!

Christopher Fellowes
Software Engineer @ Kapstan. Chris is passionate about all things Kubernetes, with an emphasis on observability and security. In his free time, he is an avid rock climber.

Simplify your DevEx with a single platform

Schedule a demo