2021-11-15

The Subtle (but Important) Difference Between the Management and Control Planes in Kubernetes

The “planes” concept originated from the need to logically separate a network’s operation, and it first surfaced in the 1980s as part of the ISDN architecture. In that model there are three logical planes: the data plane (also known as the forwarding plane, user plane, carrier plane, or bearer plane), the control plane, and the management plane. Each plane can be thought of as a different area of operations and carries a different type of traffic; conceptually, and often in reality, each is effectively an overlay network. This blog post discusses the nuances of these concepts in a Kubernetes architecture.

Logical Planes in Kubernetes architecture

The logical planes concept can be easily mapped to the Kubernetes architecture and associated management platforms:

  • The data plane carries user traffic. This is mission-critical traffic: if the plane becomes unavailable, users are impacted immediately. In Kubernetes, the network between the worker nodes, the ingress controller, and the load balancer are all part of the data plane.

  • The control plane carries signaling traffic. It comprises the master (control-plane) nodes and the traffic from the controllers on those nodes to each worker node’s kubelet. This is also mission-critical traffic: if it goes down, the data plane may not be impacted immediately, but it is left without control, and application pods can no longer be deployed, updated, or relaunched.

  • The management plane carries administrative traffic and is considered less mission-critical. It can be a local dashboard of the Kubernetes cluster, or an external management platform that manages the operations of multiple Kubernetes clusters.


Management Plane vs Control Plane and why you should “keep em separated”

When dealing with a single-cluster environment, the line between the management plane and the control plane can be a bit blurry. That is fine, since there is no traffic to isolate between clusters. However, when managing multiple clusters, each with its own individual control plane, the two layers need to stay clearly separated so that anything cluster-specific is not exposed to the other clusters. Introducing unwanted latency is another risk.

By design, the control plane is meant to enforce the policies that were “decided” in the management plane. A good architectural approach based on this principle is to let the control plane take care of the interactions with its local cluster and data plane on its own, without error-prone human involvement. Equally, treat the management plane as the lightweight plane that presents human operators with consolidated options for one or more clusters and their administrative tasks, or use it to integrate with external APIs and automate.

To ensure the management plane remains non-mission-critical (as opposed to the control plane), the key design point is to keep it as passive as possible. It can be a dashboard that displays health status across all clusters and generates alerts if needed. It can be used to define policy and push it to the more intelligent control plane for enforcement. It can initiate certain operations by sending a command to the control plane and checking the execution status, but it should not play a central role in orchestrating the operation.
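This passive-versus-active split can be sketched in a few lines of Python. The class and method names below (`ManagementPlane`, `ControlPlaneAgent`, `sync`) are hypothetical, not Palette or Kubernetes APIs; the point is only that the management plane does nothing but store versioned policy, while the agent pulls it and enforces it locally:

```python
from dataclasses import dataclass, field

@dataclass
class ManagementPlane:
    """Passive store of versioned policy; it never orchestrates anything."""
    version: int = 0
    policy: dict = field(default_factory=dict)

    def push_policy(self, policy: dict) -> int:
        # An operator "decides" here; enforcement happens elsewhere.
        self.version += 1
        self.policy = dict(policy)
        return self.version

class ControlPlaneAgent:
    """Pulls the latest policy and keeps enforcing it locally."""
    def __init__(self):
        self.applied_version = 0
        self.local_policy: dict = {}

    def sync(self, mp: ManagementPlane) -> bool:
        # Fetch only if something changed; enforcement stays local.
        if mp.version > self.applied_version:
            self.local_policy = dict(mp.policy)
            self.applied_version = mp.version
            return True
        return False

mp = ManagementPlane()
agent = ControlPlaneAgent()
mp.push_policy({"max_replicas": 10})
assert agent.sync(mp) is True   # new policy picked up
assert agent.sync(mp) is False  # nothing changed, nothing to do
```

If the management plane disappears after the first `sync`, the agent still holds a complete local copy of the policy, which is exactly the property the paragraph above argues for.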

A common mistake involves auto-scaling: if all cluster metrics are sent to the management plane so it can make scaling decisions, the management plane becomes mission-critical. A sounder approach is to have the control plane collect the metrics, make the decisions, and act, based on the policies that were selected in the clearly separated management plane.
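As a concrete sketch of that division of labor: the scaling decision below runs entirely in the control plane, while the management plane only supplied the policy knobs (`target` utilization and `max_replicas`). The formula mirrors the ratio used by the Kubernetes Horizontal Pod Autoscaler, desired = ceil(current × usage / target); the function itself is illustrative, not an actual Kubernetes API:

```python
import math

def desired_replicas(current: int, cpu_utilization: float,
                     target: float = 0.5, max_replicas: int = 10) -> int:
    """Local control-plane decision; `target` and `max_replicas` come
    from policy selected in the management plane."""
    desired = math.ceil(current * cpu_utilization / target)
    return max(1, min(desired, max_replicas))

assert desired_replicas(4, 0.9) == 8   # 4 * 0.9 / 0.5 = 7.2 -> scale up to 8
assert desired_replicas(4, 0.2) == 2   # underutilized -> scale down
assert desired_replicas(4, 5.0) == 10  # spike capped by policy ceiling
```

Note that a management-plane outage here costs nothing: the metrics, the formula, and the policy copy all live with the cluster.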

Putting theory into practice with Palette

Palette strictly follows these design principles. For each Kubernetes cluster under management, a management agent acts as part of the control plane and interacts with the local Kubernetes API server and controllers. This agent takes the policy or desired-state definitions (for example, a cluster profile definition) from the management plane, stores them locally, and keeps enforcing them locally, which gives the cluster auto-reconciliation and self-healing capabilities. The agent periodically sends cluster health status events to the management plane, and whenever a policy changes in the management plane, the agent is notified to fetch the latest version.
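The heart of such an agent is a reconciliation loop in the usual Kubernetes controller style: diff desired state against observed state and emit corrective actions, with no call back to the management plane. This is a minimal, hypothetical sketch (the component names in the example data are made up, and a real agent would apply changes through the local API server rather than return strings):

```python
def reconcile(desired: dict, observed: dict) -> list[str]:
    """Compare desired vs. observed state and list corrective actions.
    Runs entirely inside the cluster; no management-plane dependency."""
    actions = []
    for name, spec in desired.items():
        if name not in observed:
            actions.append(f"create {name}")      # missing entirely
        elif observed[name] != spec:
            actions.append(f"update {name}")      # drifted from desired
    for name in observed:
        if name not in desired:
            actions.append(f"delete {name}")      # no longer in policy
    return actions

desired = {"ingress": {"replicas": 2}, "metrics": {"replicas": 1}}
observed = {"ingress": {"replicas": 1}, "legacy": {"replicas": 1}}
assert reconcile(desired, observed) == [
    "update ingress", "create metrics", "delete legacy"
]
```

Running this loop periodically is what turns a stored cluster profile into the self-healing behavior described above: drift is detected and corrected locally, every cycle.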

This keeps the traffic between the management plane and the control plane agent to a minimum, and the system can tolerate occasional network outages between the two: events accumulate on the agent side to be resent later, and the agent is notified to fetch any updated policy once the network is restored. In the meantime, the entire cluster is fully functional, and all management policies continue to be enforced locally without involving the central management system. This is how Palette can scale to manage thousands of clusters and handle edge deployments with sometimes disconnected or unreliable networks. Some older-generation cluster management tools that rely on a proprietary orchestrator lack this level of sophistication in their control plane agents, and can therefore only manage in a “fire-and-forget” fashion, without local self-healing.
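The accumulate-and-resend behavior is essentially a local outbox with at-least-once delivery. A minimal sketch, assuming a hypothetical `EventBuffer` (not a Palette API) and using `ConnectionError` to stand in for a network outage:

```python
from collections import deque

class EventBuffer:
    """Local outbox: health events queue up during an outage and are
    flushed in order once the management plane is reachable again."""
    def __init__(self):
        self.pending = deque()

    def record(self, event: str):
        self.pending.append(event)

    def flush(self, send) -> int:
        """Deliver as many events as possible; keep the rest on failure."""
        sent = 0
        while self.pending:
            try:
                send(self.pending[0])
            except ConnectionError:
                break  # network still down; retry on the next flush
            self.pending.popleft()  # remove only after successful send
            sent += 1
        return sent

buf = EventBuffer()
buf.record("node-1: healthy")
buf.record("node-2: degraded")

def down(event):  # simulated management-plane outage
    raise ConnectionError

assert buf.flush(down) == 0 and len(buf.pending) == 2  # nothing lost

delivered = []
assert buf.flush(delivered.append) == 2                # link restored
assert delivered == ["node-1: healthy", "node-2: degraded"]
```

Popping an event only after `send` succeeds is the design choice that makes the outage invisible to the management plane's view of history, at the cost of possible duplicates, which the receiver must tolerate.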


Non-trivial design for infinite scale

The benefits of a separated management plane and a more intelligent control plane agent are clear: the management plane provides centralized management functionality, while the actual policy enforcement and control operations are distributed to the individual clusters. This lets enterprises scale to tens, hundreds, or even thousands of clusters without the management plane becoming a bottleneck and therefore a potential risk. The architecture is also future-proof, whether you run a single cluster, multiple clusters, or clusters from multiple vendors across multiple environments. Keep this in mind when evaluating Kubernetes management platforms or architecting your own DIY platform.

Stay tuned for more content on Kubernetes and feel free to send any questions for us to answer — we are always up for a bit of a Kubernetes challenge! Oh, and don’t forget to register for the big live unveiling of Palette 2.0 in December.