2020-05-14

You’re not using Cluster API for K8s Infrastructure Management? Why not?

Best of breed technology for Kubernetes cluster provisioning and management

Photo by Steven Christenson via StarCircleAcademy

As Kubernetes continues to mature there has been a proliferation of choice for Kubernetes infrastructure tools, installers, and services. Practitioners have a pretty big menu of options, specifically around cluster provisioning and management. As these Kubernetes infrastructure management products continue to innovate and introduce new features and capabilities, are there places where these products should share a common toolset?

In this post, we’ll explore why Cluster API should be embedded in many of these infrastructure management products. If you’re not familiar with Cluster API and its capabilities, we highly recommend you read the Cluster API Book.

K8s Infrastructure Management Products

Kubernetes container orchestration technology has swept the enterprise world, with 75% of organizations expected to run containers in production soon (most orchestrated by Kubernetes). Kubernetes delivers a cloud-native platform for application delivery, runtime, and scaling. While these are huge benefits for scale-out applications, they come at the cost of a more complex platform. These complexities can be categorized as:

  • Complex and heavyweight lifecycle management: significant time and resources needed to provision, upgrade, and maintain the control planes of clusters. Each cloud has a different set of IaaS primitives that need to be cobbled together to make a “cluster”.
  • Intricate and involved integrations for basic components such as storage, networking, security, and others.
  • Difficult and not always consistent multi-cluster management and operations.

To address these issues, the first generation of Kubernetes infrastructure products was born: think Kubespray, Kops, Kubicorn, KQueen, and Bootkube, along with managed Kubernetes services from the public cloud providers such as GKE, EKS, AKS, and DigitalOcean. Simplifying and automating deployment and, sometimes, management was part of the goal of these projects and offerings.

The number of installers, cluster managers, and managed Kubernetes platforms that exist today is nearing triple digits! Many of the Kubernetes installers are designed to provision dev/test clusters on developer workstations and laptops, but the vast majority of the cluster managers and managed Kubernetes platforms run and manage production Kubernetes clusters.

Providing basic cluster provisioning and management is table stakes for a Kubernetes infrastructure management product, but most provide significantly more value than just infrastructure and cluster lifecycle management, including backup, upgrades, audit, observability, and much more. Over time, these infrastructure products will continue to innovate and build new capabilities to best serve their maturing customer bases.

What is Cluster API?

Cluster API is a project to bring Kubernetes-style APIs and a declarative approach to Kubernetes cluster lifecycle management (creation, configuration, upgrade, destruction). It’s designed to work across various cloud environments and provide a reusable set of ecosystem components.
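
To make the declarative approach concrete, here is a minimal sketch of what a cluster description might look like, built as a plain Python dictionary and printed as JSON. The `apiVersion` and `kind` values follow the v1alpha3 API group naming. The cluster name, the CIDR block, and the choice of AWS as the infrastructure provider are assumptions made purely for illustration, and a real manifest would carry more fields.

```python
import json


def cluster_manifest(name: str, pod_cidr: str) -> dict:
    """Build a Cluster object as plain data (illustrative, not exhaustive)."""
    return {
        "apiVersion": "cluster.x-k8s.io/v1alpha3",
        "kind": "Cluster",
        "metadata": {"name": name},
        "spec": {
            "clusterNetwork": {"pods": {"cidrBlocks": [pod_cidr]}},
            # References to provider-specific objects; the object names
            # used here are hypothetical.
            "infrastructureRef": {
                "apiVersion": "infrastructure.cluster.x-k8s.io/v1alpha3",
                "kind": "AWSCluster",
                "name": name,
            },
            "controlPlaneRef": {
                "apiVersion": "controlplane.cluster.x-k8s.io/v1alpha3",
                "kind": "KubeadmControlPlane",
                "name": f"{name}-control-plane",
            },
        },
    }


# Describe the cluster you want; a controller is responsible for making it so.
manifest = cluster_manifest("demo-cluster", "192.168.0.0/16")
print(json.dumps(manifest, indent=2))
```

The point of the sketch is that the cluster is described as data: nothing in it says how to create virtual machines or networks, only what the end state should be.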

Some of the capabilities gained through Cluster API include:

  • Provisioning of multi-master Kubernetes-conformant clusters
  • Provisioning and maintenance of all required cluster primitives (compute, storage, networking, security, …)
  • Implementation of security best practices (security groups, isolated subnets, bastion hosts, etc.)
  • Upgrades of the control plane and workers on a rolling basis (control plane upgrades are part of v1alpha3)
  • Support for thirteen cloud providers (public, private, and bare-metal)
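
As a rough illustration of what "on a rolling basis" means, the sketch below replaces outdated machines one at a time, bringing up each replacement before removing the old machine so capacity never drops. This is a toy model under stated assumptions, not Cluster API's actual implementation: the `Machine` class, the machine names, and the version strings are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Machine:
    name: str
    version: str


def rolling_upgrade(machines: List[Machine], target: str) -> Tuple[List[Machine], List[str]]:
    """Replace outdated machines one by one; return the new fleet and a step log."""
    current = list(machines)
    steps = []
    for old in machines:
        if old.version == target:
            continue  # already on the target version, leave it alone
        new = Machine(name=f"{old.name}-upgraded", version=target)
        current.append(new)   # surge: add the replacement first
        current.remove(old)   # then retire the outdated machine
        steps.append(f"{old.name} ({old.version}) -> {new.name} ({new.version})")
    return current, steps


workers = [
    Machine("worker-0", "v1.17.3"),
    Machine("worker-1", "v1.17.3"),
    Machine("worker-2", "v1.18.2"),
]
upgraded, steps = rolling_upgrade(workers, "v1.18.2")
for step in steps:
    print(step)
```

Machines are treated as disposable here: an upgrade is a replacement rather than an in-place change.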

The key observation behind Cluster API is that managing Kubernetes clusters is itself complex enough to require an orchestration system, so you can rely on Kubernetes as the orchestrator of other Kubernetes clusters. Recall that Kubernetes, in addition to container orchestration, provides first-class features for the application workloads running on it:

  • Workload rolling upgrades
  • Dynamic scaling
  • Resiliency
  • Logging / Monitoring

Especially if you are using the operator pattern, the Kubernetes framework provides significant additional capabilities to your apps: declarative state management, reconciliation, error retries, and much more. Modern applications need all of these, so instead of building each capability yourself, you can leverage Kubernetes to orchestrate the platform.

This is truly where Cluster API shines. It’s built on the same building blocks as Kubernetes and provides the functionality to provision and manage cluster primitives and operations.
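
The reconciliation idea can be sketched in a few lines. The example below is a toy model, not code from Cluster API or any Kubernetes library: `FlakyCloud` is a hypothetical stand-in for an IaaS API whose first call fails, and the loop keeps comparing desired state with observed state until they match, retrying after errors.

```python
class FlakyCloud:
    """Stand-in for an IaaS API that fails on its first create call (hypothetical)."""

    def __init__(self):
        self.machines = set()
        self.calls = 0

    def create(self, name):
        self.calls += 1
        if self.calls == 1:
            raise RuntimeError("transient cloud error")
        self.machines.add(name)

    def delete(self, name):
        self.machines.discard(name)


def reconcile_once(desired_replicas, cloud):
    """One pass: act on the difference between desired and observed state."""
    observed = len(cloud.machines)
    if observed < desired_replicas:
        cloud.create(f"machine-{observed}")
    elif observed > desired_replicas:
        cloud.delete(sorted(cloud.machines)[-1])
    return len(cloud.machines) == desired_replicas


def reconcile(desired_replicas, cloud, max_attempts=10):
    """Repeat passes until converged; return how many passes it took."""
    for attempt in range(1, max_attempts + 1):
        try:
            if reconcile_once(desired_replicas, cloud):
                return attempt
        except RuntimeError:
            # Error retry: the next pass re-reads state and tries again.
            continue
    raise TimeoutError("did not converge")


cloud = FlakyCloud()
print(reconcile(3, cloud), sorted(cloud.machines))  # scale up, surviving one failure
print(reconcile(1, cloud), sorted(cloud.machines))  # scale down by changing desired state
```

Scaling up, scaling down, and recovering from a failed call all go through the same loop. Only the desired state changes, which is the property that makes the operator pattern attractive for managing clusters.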

Why Cluster API as an infrastructure management building block

A subset of the new capabilities that Kubernetes infrastructure management products will add relate to infrastructure provisioning and cluster management: provisioning to new types of compute infrastructure, multi-master rolling upgrades, or even support for a new public or private cloud.

Instead of Kubernetes infrastructure management products reimplementing similar logic to cover these kinds of cases, why not leverage a shared component across these solutions? A shared component is similar to a shared library where bug fixes, enhancements, and new capabilities are provided with little or no effort. That’s where something like Cluster API could come in. By leveraging Cluster API, either as an embedded component or a required dependency, Kubernetes infrastructure products would increase their rate of innovation, decrease their time to market, and ultimately provide better experiences to their customers.

If you believe that infrastructure management products should utilize a shared component to provide cluster infrastructure provisioning and management, the next question is why Cluster API? Why not leverage a more mature and developed tool, like Kops?

Well, the answer really boils down to the Kubernetes architecture and functionality itself. Because Cluster API builds on Kubernetes and inherits its application resiliency, scaling, logging, and more, it becomes a natural choice for the job.

There would be an advantage for all Kubernetes infrastructure management tools if they didn’t all have to implement and reimplement similar capabilities to launch, configure, and manage IaaS infrastructure for a cluster. Think about everything that goes into making a “cluster”: virtual machines, storage volumes, IP addresses, security groups, network isolation, load balancers, and so on. These are all areas of repeated, non-differentiated investment. Many of the Kubernetes infrastructure management tools are innovating in different areas but cannot leverage or build on top of the work done by others. If tomorrow your public cloud provider of choice offers some fancy new application load balancer, or a networking plugin that enforces network security policies in hardware, every one of these management tools would have to expend significant effort to support it.
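
One way to picture the shared component is as provisioning logic written once against a small provider interface, with each cloud supplying its own implementation. The sketch below is only an analogy in Python: `InfrastructureProvider`, `FakeCloud`, and `provision_cluster` are hypothetical names, not part of Cluster API, and the "clouds" are in-memory stand-ins.

```python
from typing import List, Protocol


class InfrastructureProvider(Protocol):
    """What the shared layer expects from any cloud (hypothetical interface)."""

    def create_machine(self, name: str) -> str: ...


class FakeCloud:
    """In-memory stand-in for a real cloud API; the prefix marks which cloud."""

    def __init__(self, prefix: str) -> None:
        self.prefix = prefix
        self.machines: List[str] = []

    def create_machine(self, name: str) -> str:
        machine_id = f"{self.prefix}-{name}"
        self.machines.append(machine_id)
        return machine_id


def provision_cluster(provider: InfrastructureProvider, cluster: str,
                      control_plane: int, workers: int) -> List[str]:
    """Cloud-agnostic provisioning logic, written once against the interface."""
    ids = [provider.create_machine(f"{cluster}-cp-{i}") for i in range(control_plane)]
    ids += [provider.create_machine(f"{cluster}-worker-{i}") for i in range(workers)]
    return ids


# The same logic runs unchanged against two different "clouds".
for cloud in (FakeCloud("aws"), FakeCloud("vsphere")):
    print(provision_cluster(cloud, "demo", control_plane=1, workers=2))
```

Supporting a new cloud in this model means adding one provider implementation that every product built on the shared layer can then use.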

If infrastructure tools and products leveraged Cluster API internally, they would be free to focus on additional value-add capabilities such as:

  • Namespace and resource management
  • Container security management
  • Policy, Audit, Compliance
  • Service Mesh, Observability, and Logging
  • CI/CD and Deployment
  • Archive, Backup, Restore

The Road Ahead

The first major release of Cluster API was on March 29, 2019. In a little over a year, Cluster API and its providers have seen a little over 1000 commits from 100+ contributors across VMware, Google, Microsoft, and more! We’re proud to share that Spectro Cloud is also contributing to the vibrant and growing Cluster API project and its providers. The main development repository is healthy and continues to forge forward.

The Cluster API v1alpha3 release brought in lots of new capabilities, including control plane upgrades, multi-master support for other clouds, and node health monitoring/recovery. The community is building its next release, v1alpha3+, slated for release in the middle of this year with the following improvements:

  • Cluster auto-scaling
  • Support for other load balancers
  • Spot instance support

A more up-to-date roadmap for v1alpha3+ and beyond is available here: https://cluster-api.sigs.k8s.io/roadmap.html

Cluster API is a Kubernetes SIG Cluster Lifecycle subproject, and while it has a growing community, it needs involvement from more organizations and members. Most of the top contributors to the main Cluster API project are currently employed at VMware. One of the critical hallmarks of a successful open-source project is support and contributions from a diverse set of individuals and organizations. It would be great to have other small and large organizations join the project as it continues to mature.

We’re already seeing Kubicorn, Tanzu, OpenShift, Airship, and NKE adopting the Cluster API project. It’s time to help this project along! This is a great example of an area where a concerted open-source push just makes a lot of sense.

Saad Malik
CTO & Co-Founder @ Spectro Cloud
Saad is passionate about building products in the areas of cloud, virtualization, containers, and distributed systems. In his fifteen years of experience, Saad has shipped multiple new products in enterprise, service provider, and consumer technologies. He is a hardcore Trekkie and enjoys building autonomous drone tracking software.