Kubernetes: how to get comfortable with complexity

Is Kubernetes as hard as people think it is?

Kubernetes has a reputation for being a complex, even difficult, technology.

Our 2022 State of Production Kubernetes survey found that four in five people who already run Kubernetes in production say it is more complex than the other technologies they use.

The complexity of Kubernetes has even spawned its own meme accounts.

Take a tour of the r/kubernetes subreddit on Reddit and you’ll quickly find posts from beginners struggling to get their heads around the concepts, sharing memes, and asking for help. A quick Google search will turn up plenty of beginner’s guides, too.

It’s not just newbies who admit to finding Kubernetes challenging. Experienced IT people who have spent their careers working with other technologies have a lot of adjusting to do, because K8s differs from what they're used to even in its fundamental principles.

And this is the crux of the matter. Many people approach Kubernetes for the first time without fully appreciating what it was built for: running sophisticated containerized application workloads at scale. And with its great power comes a degree of unavoidable complexity, both in terms of initial setup and operations. If you’re trying to use it to host very simple workloads, of course Kubernetes looks like a lot of work and overhead for little gain. It’s like using an Artemis rocket for a trip to the store.

Our argument is this: Kubernetes is not as complex as people think. More importantly, there are ways to get value from Kubernetes (even in production, at scale) before everyone on your team fully understands all of its complexities. It is entirely possible to get up to speed to the point where teams are comfortable with basic tasks and can avoid the common operational pitfalls.

Understand the fundamental K8s differences with the hypervisor analogy

To explain why you don't need to be an expert to run Kubernetes at scale, I like to use an analogy. It describes the familiar world of Virtual Machines as discrete detached houses, and the Kubernetes platform as an apartment complex.

An apartment building is a much more complex structure than a house. Rather than being a fully self-contained unit, each apartment uses many more shared resources, such as the basement, the elevator, electrical grid, fire extinguishers, parking lot, even the stairs. In fact, while as a homeowner you probably know how to operate and maintain all the appliances and systems in your house, as an apartment dweller you may know very little of how the overall building functions.

Does that prevent you from living the dream life in your lovely penthouse? Certainly not! You can still fix most of the common problems that arise, such as a pipe leak or having no hot water in winter. You know where to find the tools when you need them. When things get too complicated to fix, you can call the property management staff, or in the worst case, you can call a handyman to help you.

Let's follow the same approach to compare the Kubernetes admin experience to a type 1 hypervisor. For the last 15 years, VM admins didn't have to know how every single component worked to operate their virtual platform efficiently. Schedulers, resource management, QoS, and runtime engines are nothing new, but not that many VM admins understand how they work in detail. That didn't prevent them from architecting and operating their environment following the best standards.

While x86 virtualization shares some common architecture patterns with Kubernetes, the boundary between the infrastructure components and the applications and middleware sitting on top of them is much more blurred in Kubernetes. These components run as separate entities, and the user has to choose the right recipe to combine all the ingredients. Virtualization was much more like going to a restaurant and ordering the set lunch menu!

So this is the first task that platform admins have to clear: learn these Kubernetes building blocks and how to troubleshoot them. The CNCF offers several exams and programs (such as the CKA) to help reach this goal, so the only real barriers to acquiring that knowledge are time and budget.

Divide and conquer with team responsibilities

Kubernetes is complex, in many different ways, across a very broad set of platform elements, integration pipelines and application stacks. Its software ecosystem is breathtaking and expanding every day.

But, vitally, in an enterprise deployment it’s very unlikely that the same person, or even the same role, will be responsible for all of it. In other words, the complexity of Kubernetes can be distributed across several teams and personas. Divide and conquer!

Take a CI/CD tool, for example. The DevOps team may be responsible for it. They don't have to know how the Kubernetes scheduler makes decisions, but they must be very familiar with how applications are updated on Kubernetes, with dependency management, and with how other cloud-native paradigms are implemented.

Start small and grow with quick wins

Another way of gradually improving Kubernetes proficiency while not being overwhelmed by production issues is to start small and focus on quick wins that bring value to the business.

Maybe start by running automation tools as containers in Kubernetes, leveraging ingress capabilities, and keeping services external to the cluster. The developers can start refactoring applications once the ops team is comfortable with the new paradigms. Meanwhile, the platform team can focus on the network and security aspects to improve the current infrastructure posture.

Find what will make the business more agile to build a competitive advantage. Don't deploy all features and ecosystem tools at once. That is the shortest path to failure!

Plan for sustainable scale for an easier future

One obvious tip: the teams involved in Kubernetes management should start eating their own dog food. Using Kubernetes to manage its own lifecycle and extending its API allows you to operate cloud-native and mission-critical applications more efficiently. Such initiatives include Cluster API (CAPI), database operators, and other open-source projects that extend the Kubernetes API with custom resources and controllers.
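To make the idea of "custom resources and controllers" a little more concrete, here is a minimal Python sketch of the reconcile pattern that controllers follow: compare the declared state with the observed state and work out what needs to change. It is an illustration only. The function name `reconcile` and the workload names are hypothetical, and no real Kubernetes API is called.

```python
from __future__ import annotations


def reconcile(desired: dict[str, int], observed: dict[str, int]) -> list[str]:
    """Return the actions needed to move observed replica counts to the desired ones."""
    actions = []
    for name, want in sorted(desired.items()):
        have = observed.get(name, 0)
        if have < want:
            actions.append(f"scale up {name}: {have} -> {want}")
        elif have > want:
            actions.append(f"scale down {name}: {have} -> {want}")
    # Anything still running that is no longer declared gets removed.
    for name in sorted(set(observed) - set(desired)):
        actions.append(f"delete {name}")
    return actions


if __name__ == "__main__":
    # Hypothetical workloads, used only to exercise the function.
    desired = {"web": 3, "db": 1}
    observed = {"web": 1, "db": 1, "legacy": 2}
    for action in reconcile(desired, observed):
        print(action)
```

A real controller runs this kind of comparison continuously against the cluster API rather than once against two dictionaries, but the principle of declaring the end state and letting the platform close the gap is the same.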

When scaling becomes a necessity, standardization and repeatability are essential. These qualities enable better operations because they make the platform predictable. They also mean that you can apply good architectural practices consistently and capture trends in resource consumption and associated costs. It then becomes much easier to run analytical tools that help optimize costs, infrastructure uptime, and SLOs (e.g., tools like Kubecost or Cast AI). From there, you can iterate on architectural and operational decisions in short cycles based on your findings and tool feedback.

Kubernetes brings some prominent allies here: node autoscaling, which scales your worker nodes up or down automatically (with projects such as Karpenter), and the Kubernetes HPA (Horizontal Pod Autoscaler), which scales your application.
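As a rough illustration of how horizontal pod autoscaling decides on a replica count, the Kubernetes documentation describes the core calculation as the current replica count multiplied by the ratio of the current metric value to the target value, rounded up. The Python sketch below reproduces only that arithmetic, plus clamping to minimum and maximum replica bounds. The function and parameter names are hypothetical, and the real controller also applies a tolerance and stabilization behavior that are left out here.

```python
import math


def desired_replicas(current_replicas, current_metric, target_metric,
                     min_replicas=1, max_replicas=10):
    """Core HPA-style calculation: scale in proportion to metric / target."""
    raw = math.ceil(current_replicas * current_metric / target_metric)
    # Keep the result inside the configured replica bounds.
    return max(min_replicas, min(max_replicas, raw))


if __name__ == "__main__":
    # 4 pods at 90% average CPU against a 60% target.
    print(desired_replicas(4, 90, 60))
    # 4 pods at 30% average CPU against a 60% target.
    print(desired_replicas(4, 30, 60))
```

With four pods running at 90% average CPU against a 60% target, the calculation asks for six pods; at 30% it asks for two.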

Use case evolution at the edge

Kubernetes provides many benefits as the next-generation data center Operating System (OS). But other use cases are emerging, such as edge computing. Performing decentralized operations closer to the data source has proven to improve many aspects of enterprise product quality, cost optimization, and user experience. It is also crucial to innovation, with applications such as autonomous vehicles, Virtual Reality (VR) and Augmented Reality (AR).

Some key characteristics of edge architectures are limited resources, smaller form factors, and connection quality that varies with time and location. These constraints impose strict requirements on any system that supports edge computing:

Autonomous configuration and reconciliation.
A higher degree of workload consolidation and resiliency.
An architecture that can fit into a repeatable framework.

Kubernetes is a prime candidate for implementing edge computing because it provides a highly available distributed OS capable of managing infrastructure and application stacks in a composable and fully customizable fashion. It can be tailored to any use case while keeping the control-plane standard.

But other constraints require functions that are not natively baked into Kubernetes. For example, system disruption must be limited when performing upgrades and maintenance operations. When dozens of nodes are available, that is not an issue. It is challenging, however, in edge environments with only one or two nodes. In that case, additional care must be taken when choosing a solution for Kubernetes edge use cases. For example, Spectro Cloud created Palette Edge to meet these specific edge requirements. It helps deliver a consistent solution that is highly customizable and easily manageable using proven technologies and open-source projects.
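The gap between "dozens of nodes" and "one or two nodes" can be put in numbers. The back-of-the-envelope Python sketch below assumes equally sized nodes and an upgrade that takes a fixed number of nodes out of service at a time. Both are simplifying assumptions, and the function name is hypothetical.

```python
def capacity_during_upgrade(node_count, nodes_at_a_time=1):
    """Fraction of cluster capacity still available while some nodes are being upgraded."""
    if node_count < 1 or nodes_at_a_time < 1:
        raise ValueError("node_count and nodes_at_a_time must be at least 1")
    # Never take more nodes out of service than the cluster has.
    unavailable = min(nodes_at_a_time, node_count)
    return (node_count - unavailable) / node_count


if __name__ == "__main__":
    for nodes in (24, 2, 1):
        share = capacity_during_upgrade(nodes)
        print(f"{nodes} node(s): {share:.0%} of capacity remains during the upgrade")
```

A 24-node cluster keeps about 96% of its capacity while one node is upgraded, a two-node site drops to half, and a single-node site has nothing left to run workloads on. That is why upgrade behavior deserves extra scrutiny at the edge.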

Is Kubernetes the right choice?

Most of Kubernetes’ perceived complexities are not due to the platform's inherent complexity but stem from how it is used and exposed to its consumers: application developers.

With the advent of paradigms such as "shifting infrastructure left," developers rely more on composable, software-driven, and on-demand infrastructure components than ever. The ‘on-demand’ part is crucial to Kubernetes adoption. In an ideal Kubernetes environment, application developers continue to use their preferred tools and are shielded from Kubernetes’ low-level details. Kubernetes objects are implicitly deployed when needed, and the DevOps team only exposes the minimum required elements.

With all this said, although you shouldn’t be scared of Kubernetes, that doesn’t mean it’s always the right choice for your project, either. If you just need to run containers and be able to scale them, you're probably safer with a hyperscaler's managed service (e.g., Amazon ECS or AWS Fargate). Such a service abstracts the container runtime and provides the tools needed for easy application deployment and accessible day-2 operations. But it won't be the right fit if you need an abstraction layer that provides independent management of microservices, built-in automation, and total control of the integration and delivery pipelines. In that case, Kubernetes will be the right choice in the long term. Just make sure you spread the burden of knowledge between teams with different profiles and responsibilities!
