All long-lived components run in containers, with the exception of:
Kubelet
CRI-O
All Kubernetes components critical to the infrastructure are created
inside the kube-system namespace. As a user, you can create as many
namespaces as necessary for your operations and enforce different
permissions using RBAC rules.
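As an illustration, a minimal sketch (using a hypothetical namespace "team-a" and user "jane") of creating a namespace and restricting access to it with RBAC could look like this:
kubectl create namespace team-a
# Allow the hypothetical user "jane" to read pods in the new namespace only
kubectl apply -f - <<EOF
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: pod-reader
  namespace: team-a
rules:
- apiGroups: [""]
  resources: ["pods"]
  verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: pod-reader-binding
  namespace: team-a
subjects:
- kind: User
  name: jane
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: Role
  name: pod-reader
  apiGroup: rbac.authorization.k8s.io
EOF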
kubeadm is what drives the deployment of new machines. This component runs on demand, uncontainerized, during the bootstrap or join procedures.
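For reference, a rough sketch of bootstrapping and joining nodes with skuba (which runs kubeadm on the target machine) is shown below; the node names and address placeholders are illustrative:
# Bootstrap the first control plane node
skuba node bootstrap --user sles --sudo --target <IP_ADDRESS/FQDN> master0
# Join an additional worker node
skuba node join --role worker --user sles --sudo --target <IP_ADDRESS/FQDN> worker0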
SLE 15 SP2
Kernel: kernel-default 4.12.14 or greater
Filesystem: XFS or BTRFS
SUSE CaaS Platform is distributed as a dedicated repository. All packages required to deploy a node to the cluster are installable via a pattern. This pattern is installed automatically by skuba when bootstrapping or joining a node.
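If you need to inspect or install the pattern manually (for example on a prepared image), a zypper-based sketch could look like this; the exact pattern name depends on the product release and is shown here as a placeholder:
# List the available SUSE CaaS Platform patterns
zypper search -t pattern caasp
# Install the node pattern found above (placeholder name)
zypper install -t pattern <CAASP_NODE_PATTERN>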
Extension to base OS image
Software distribution channels
Packages
Container ecosystem
Helm
SUSE container registry
By default, SUSE CaaS Platform clusters automatically apply all patches that are marked as non-interactive. These patches are safe to apply since they should not cause any side effects.
However, some patches require the nodes to be rebooted in order to become active. This is the case, for example, with kernel updates or certain package updates (such as glibc).
The nodes of the cluster have some metadata that is kept up to date by SUSE CaaS Platform. This metadata can be used by the cluster administrator to answer the following questions:
Does the node need to be rebooted to make some updates active?
Does the node have non-interactive updates pending?
When was this check last performed?
Cluster administrators can get a quick overview of each cluster node by using one of the following methods.
UI mode:
Open a standard Kubernetes UI (e.g. the Kubernetes Dashboard).
Click on the node.
Look at the annotations associated with the node.
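The annotations themselves can also be dumped directly with kubectl, for example for the node worker0:
# Print all annotations of a node; the update-related entries maintained by
# SUSE CaaS Platform appear among them
kubectl get node worker0 -o jsonpath='{.metadata.annotations}'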
Text mode:
Go to a machine with a working kubectl (that is, one from which you can connect to the cluster).
Ensure you have the caasp kubectl plugin installed. This is a simple statically linked binary
that SUSE distributes alongside skuba.
Execute the kubectl caasp cluster status command (or skuba cluster status).
The output of the command will look like this:
NAME OS-IMAGE KERNEL-VERSION KUBELET-VERSION CONTAINER-RUNTIME HAS-UPDATES HAS-DISRUPTIVE-UPDATES CAASP-RELEASE-VERSION
master0 SUSE Linux Enterprise Server 15 SP2 4.12.14-197.29-default v1.18.6 cri-o://1.18.2 no no 4.5.0
master1 SUSE Linux Enterprise Server 15 SP2 4.12.14-197.29-default v1.18.6 cri-o://1.18.2 no no 4.5.0
master2 SUSE Linux Enterprise Server 15 SP2 4.12.14-197.29-default v1.18.6 cri-o://1.18.2 no no 4.5.0
worker0 SUSE Linux Enterprise Server 15 SP2 4.12.14-197.29-default v1.18.6 cri-o://1.18.2 no no 4.5.0
worker1 SUSE Linux Enterprise Server 15 SP2 4.12.14-197.29-default v1.18.6 cri-o://1.18.2 no no 4.5.0
worker2 SUSE Linux Enterprise Server 15 SP2 4.12.14-197.29-default v1.18.6 cri-o://1.18.2 no no 4.5.0
Some updates require the node to be rebooted to make them active.
The SUSE CaaS Platform cluster is configured by default to take advantage of kured. This service looks for nodes that have to be rebooted and, before doing the actual reboot, takes care of draining the node.
Kured reboots one node at a time. This ensures that the remaining worker nodes are not saturated when they adopt the workloads of the worker node being rebooted, and that etcd always stays healthy in the case of control plane nodes.
Cluster administrators can integrate kured statistics into a Prometheus instance and create alerts, charts, and other customizations.
Cluster administrators can also fine-tune the kured deployment to prevent its agent from rebooting machines where special workloads are running. For example, it is possible to prevent kured from rebooting nodes running computational workloads until their pods are done. To achieve that, instruct cluster users to add special labels to the pods they do not want interrupted by a node reboot, as sketched below.
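A minimal sketch, assuming the upstream kured --blocking-pod-selector flag and an illustrative label name (the kured daemonset name and namespace may differ in your deployment):
# Label the pods that must not be interrupted by a node reboot (illustrative label)
kubectl label pod my-computation-pod unattended-reboot=blocked
# Then add the following argument to the kured container in its daemonset
# (for example via "kubectl -n kube-system edit daemonset kured"):
#   --blocking-pod-selector=unattended-reboot=blocked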
Some updates might damage the running workloads. These interactive updates are referred to in this document as "disruptive upgrades".
Cluster administrators do not have to worry about disruptive upgrades. SUSE CaaS Platform will apply them automatically while making sure no disruption is caused to the cluster. This works because nodes with disruptive upgrades are updated one at a time, similar to when nodes are automatically rebooted. Moreover, SUSE CaaS Platform takes care of draining and cordoning the node before these updates are applied, and of uncordoning it afterwards.
Disruptive upgrades could take some time to be applied automatically due to their sensitive nature (nodes are updated one by one). Cluster operators can always see the status of all nodes by looking at the annotations of the Kubernetes nodes (see the previous section).
By looking at node annotations a cluster administrator can answer the following questions:
Does the node have pending disruptive upgrades?
Are the disruptive upgrades being applied?
It is possible to disable automatic patch application completely if the user wants to inspect every patch that will be applied to the cluster, so that they are in full control of when the cluster is patched. By default, SUSE CaaS Platform clusters have some updates applied automatically. Nodes can also be rebooted automatically under some circumstances. To prevent that from happening, it is possible to annotate the nodes that should not be rebooted automatically. Any user with the rights to annotate nodes can configure this behavior. Cluster administrators can use software like SUSE Manager to check the patching level of the underlying operating system of any node.
When rebooting nodes, it is important to take the following considerations into account:
Ensure nodes are drained (and thus cordoned) before they are rebooted, and uncordon them once they are back
Reboot master/etcd nodes one by one. Wait for the rebooted node to come back and make sure etcd is in a healthy state before moving to the next etcd node (this can be checked using etcdctl)
Do not reboot too many worker nodes at the same time, to avoid swamping the remaining ones with workloads
Cluster administrators have to follow the very same steps, sketched below, whenever a node has an interactive (that is, disruptive) upgrade pending.
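A sketch of this manual procedure for a single node (placeholder node name; the etcdctl invocation needs the endpoint and certificate options of your environment) could look like this:
# Drain (and thereby cordon) the node before rebooting it
kubectl drain worker0 --ignore-daemonsets
# ...reboot the machine and wait for it to come back...
# For master/etcd nodes, verify etcd health before moving to the next one
etcdctl endpoint health
# Make the node schedulable again
kubectl uncordon worker0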
The cluster administrator can check whether a new Kubernetes version distributed by SUSE is available and, if so, upgrade the cluster in a controlled way.
To find out whether a new Kubernetes version is available, execute the following command on a machine that has access to the cluster definition folder:
skuba cluster upgrade plan
If there’s a new version available, it will be reported in the terminal.
To start the upgrade of the cluster, all commands should be executed from a machine that contains the cluster definition folder, so that an administrative kubeconfig file is available. Running the plan command from there confirms the target version to which the cluster will be upgraded if the process is continued.
It’s necessary to upgrade all control plane nodes first, running:
skuba node upgrade apply --user sles --sudo --target <IP_ADDRESS/FQDN>
This command has to be applied on all control plane nodes, one by one.
It’s necessary to upgrade all worker nodes last, running:
skuba node upgrade apply --user sles --sudo --target <IP_ADDRESS/FQDN>
This command has to be applied on all worker nodes, one by one.
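After each node upgrade, the reported versions can be checked before moving on to the next node, for example:
# The VERSION and CONTAINER-RUNTIME columns should show the expected new versions
kubectl get nodes -o wide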
SUSE CaaS Platform 4 ships with the following components and their respective dependencies (not listed).
| Component | Version |
|---|---|
| | 1.18.10 |
| | 1.18.4 |
| | 1.7.6 |
| | 3.4.13 |
| | 1.4.3 |
At the networking level, several concepts need to be defined:
Container Network Interface (CNI)
The default plugin providing CNI in SUSE CaaS Platform is Cilium. CNI forms an overlay network, allowing pods running on different machines in the cluster to communicate with each other transparently.
Network policies
Network policies provide security in terms of routing within the cluster. They define how groups of pods are allowed to communicate with each other and with other network endpoints.
Network policies allow for fine-grained restrictions on which networks are reachable, both for ingress and egress traffic.
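As an illustration, a minimal NetworkPolicy (hypothetical namespace and labels) that only allows ingress to backend pods from frontend pods on port 8080 could look like this:
kubectl apply -f - <<EOF
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-frontend-to-backend
  namespace: team-a
spec:
  podSelector:
    matchLabels:
      app: backend
  policyTypes:
  - Ingress
  ingress:
  - from:
    - podSelector:
        matchLabels:
          app: frontend
    ports:
    - protocol: TCP
      port: 8080
EOF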
Kube proxy
The kube-proxy is a service that runs containerized on all machines of the cluster and allows for pod-to-service communication. When you expose a service and some other pod consumes that service, kube-proxy is responsible for setting up the rules on every machine so that the service is reachable from within the pod using it.
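For example (deployment, service, and pod names are illustrative), exposing a deployment as a service and consuming it from another pod relies on kube-proxy having programmed the forwarding rules on every node:
# Expose an existing deployment (illustrative name) as a ClusterIP service
kubectl expose deployment my-app --port=80 --target-port=8080
# From a throwaway pod, the service is reachable by its name; kube-proxy
# forwards the traffic to one of the backing pods
kubectl run curl-test --rm -it --restart=Never --image=curlimages/curl -- curl http://my-app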
Envoy
Envoy is a high-performance service and edge proxy that provides L7 policy support. Envoy is application-language neutral, so it can work with services written in different languages, even though its core is written in C++. Envoy supports Go extensions and WASM, which makes it easier for other applications to adopt. Envoy supports multiple protocols such as HTTP/1.1, HTTP/2, gRPC, MongoDB, DynamoDB, and more. Envoy supports L3/L4 filtering and provides advanced load balancing comparable to NGINX, as well as health checking, observability, and service discovery.
Cilium 1.6 uses Envoy for the L7 Network Policies.
Cilium KVStore free operation
Cilium can now operate entirely without a KVstore in the context of Kubernetes.
Cilium Socket-based load-balancing
Cilium’s socket-based load balancing combines the advantages of client-side and network-based load balancing to provide transparent load balancing through Kubernetes: the translation from the service IP to the endpoint IP is performed only once, during connection establishment.
Cilium Generic CNI chaining
Cilium 1.6 introduces a generic CNI chaining framework that allows Cilium to run alongside other CNI plugins such as Weave, Calico, Flannel, the AWS VPC CNI, or the Lyft CNI plugin. This makes it possible to use the eBPF-based security policy enforcement and related Cilium features while the basic networking support is provided by the CNI plugin already in use.
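A purely illustrative sketch of what such a chained CNI configuration could look like on a node is shown below; the file name, the first plugin, and its options are assumptions, and the exact chaining setup should be taken from the Cilium documentation:
# Illustrative only: append cilium-cni to an existing CNI chain on a node
cat > /etc/cni/net.d/05-chained.conflist <<EOF
{
  "cniVersion": "0.3.1",
  "name": "chained-network",
  "plugins": [
    { "type": "flannel", "delegate": { "isDefaultGateway": true } },
    { "type": "cilium-cni" }
  ]
}
EOF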
Cilium Native AWS ENI mode
Cilium 1.6 introduces a new IPAM mode, AWS ENI mode, which works together with a new operator-based design. This helps when running services in AWS: IPAM is managed from operator-defined pools while network policy enforcement is provided by Cilium.
Cilium Policy scalability improvements
Cilium 1.6 provides an improved and scalable policy system that decouples handling of policy and identity definitions while moving to an entirely incremental model.