SUSE CaaS Platform 4.5.2

Deployment Guide

This guide describes deployment for SUSE CaaS Platform 4.5.2.

Authors: Markus Napp and Nora Kořánová

Publication Date: 2021-06-23


Warning

This document is a work in progress.

The content in this document is subject to change without notice.

Note

This guide assumes a configured SUSE Linux Enterprise Server 15 SP2 environment.

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or (at your option) version 1.3; with the Invariant Section being this copyright notice and license. A copy of the license version 1.2 is included in the section entitled “GNU Free Documentation License”.

For SUSE trademarks, see http://www.suse.com/company/legal/. All other third-party trademarks are the property of their respective owners. Trademark symbols (®, ™, etc.) denote trademarks of SUSE and its affiliates. Asterisks (*) denote third-party trademarks.

All information found in this book has been compiled with utmost attention to detail. However, this does not guarantee complete accuracy. Neither SUSE LLC, its affiliates, the authors, nor the translators shall be held liable for possible errors or the consequences thereof.

About This Guide #

Required Background
Available Documentation
Feedback
Documentation Conventions

1 Required Background #

To keep the scope of these guidelines manageable, certain technical assumptions have been made. These documents are not aimed at beginners in Kubernetes usage and require that:

You have some computer experience and are familiar with common technical terms.
You are familiar with the documentation for your system and the network on which it runs.
You have a basic understanding of Linux systems.
You have an understanding of how to follow instructions aimed at experienced Linux administrators and can fill in gaps with your own research.
You understand how to plan, deploy and manage Kubernetes applications.

2 Available Documentation #

We provide HTML and PDF versions of our books in different languages. Documentation for our products is available at https://documentation.suse.com/, where you can also find the latest updates and browse or download the documentation in various formats.

The following documentation is available for this product:

Deployment Guide: The SUSE CaaS Platform Deployment Guide gives you details about installation and configuration of SUSE CaaS Platform along with a description of architecture and minimum system requirements.
Quick Start Guide: The SUSE CaaS Platform Quick Start guides you through the installation of a minimum cluster in the fastest way possible.
Admin Guide: The SUSE CaaS Platform Admin Guide discusses authorization, updating clusters and individual nodes, monitoring, logging, use of Helm, troubleshooting and integration with SUSE Enterprise Storage and SUSE Cloud Application Platform.

3 Feedback #

Several feedback channels are available:

Bugs and Enhancement Requests

For services and support options available for your product, refer to http://www.suse.com/support/.

To report bugs for a product component, go to https://scc.suse.com/support/requests, log in, and click Create New.

User Comments

We want to hear your comments about and suggestions for this manual and the other documentation included with this product. Use the User Comments feature at the bottom of each page in the online documentation or go to https://documentation.suse.com/, click Feedback at the bottom of the page and enter your comments in the Feedback Form.

Mail

For feedback on the documentation of this product, you can also send a mail to doc-team@suse.com. Make sure to include the document title, the product version and the publication date of the documentation. To report errors or suggest enhancements, provide a concise description of the problem and refer to the respective section number and page (or URL).

4 Documentation Conventions #

The following notices and typographical conventions are used in this documentation:

/etc/passwd : directory names and file names
<PLACEHOLDER>: replace <PLACEHOLDER> with the actual value
PATH: the environment variable PATH
ls, --help: commands, options, and parameters
user : users or groups
package name : name of a package
Alt, Alt–F1 : a key to press or a key combination; keys are shown in uppercase as on a keyboard
File › Save As : menu items, buttons
Dancing Penguins (Chapter Penguins, ↑Another Manual): This is a reference to a chapter in another manual.
Commands that must be run with root privileges. Often you can also prefix these commands with the sudo command to run them as a non-privileged user.
```
sudo command
```
Commands that can be run by non-privileged users.
```
command
```
Notices:
Warning
Vital information you must be aware of before proceeding. Warns you about security issues, potential loss of data, damage to hardware, or physical hazards.
Important
Important information you should be aware of before proceeding.
Note
Additional information, for example about differences in software versions.
Tip
Helpful information, like a guideline or a piece of practical advice.

1 Requirements #

1.1 Platform
1.2 Nodes
1.3 Hardware
1.4 Networking

1.1 Platform #

Currently we support the following platforms to deploy on:

SUSE OpenStack Cloud 8
VMware ESXi 6.7
KVM
Bare Metal x86_64
Amazon Web Services (technology preview)

SUSE CaaS Platform itself is based on SLE 15 SP2.

The steps for obtaining the correct installation image for each platform type are detailed in the respective platform deployment instructions.

1.2 Nodes #

SUSE CaaS Platform consists of a number of (virtual) machines that run as a cluster.

You will need at least two machines:

1 master node
1 worker node

SUSE CaaS Platform 4.5.2 supports deployments with a single or multiple master nodes. Production environments must be deployed with multiple master nodes for resilience.

All communication to the cluster is done through a load balancer talking to the respective nodes. For that reason any failure tolerant environment must provide at least two load balancers for incoming communication.

The minimal viable failure tolerant production environment configuration consists of:

Cluster nodes: #

3 master nodes
2 worker nodes

Important: Dedicated Cluster Nodes

All cluster nodes must be dedicated (virtual) machines reserved for the purpose of running SUSE CaaS Platform.

Additional systems: #

Fault tolerant load balancing solution
(for example SUSE Linux Enterprise High Availability Extension with pacemaker and haproxy)
1 management workstation

1.3 Hardware #

1.3.1 Management Workstation #

In order to deploy and control a SUSE CaaS Platform cluster you will need at least one machine capable of running skuba. This typically is a regular desktop workstation or laptop running SLE 15 SP2 or later.

The skuba CLI package is available from the SUSE CaaS Platform module. You will need a valid SUSE Linux Enterprise and SUSE CaaS Platform subscription to install this tool on the workstation.

Important: Time Synchronization

It is vital that the management workstation runs an NTP client and that time synchronization is configured to use the same NTP servers that you will later use to synchronize the cluster nodes.

1.3.2 Storage Sizing #

The storage sizes in the following lists are absolute minimum configurations.

Sizing of the storage for worker nodes depends largely on the expected amount of container images, their size and change rate. The basic operating system for all nodes might also include snapshots (when using btrfs) that can quickly fill up existing space.

We recommend provisioning a separate storage partition for container images on each (worker) node that can be adjusted in size when needed. Storage for /var/lib/containers on the worker nodes should be approximately 50GB in addition to the base OS storage.

1.3.2.1 Master Nodes #

Up to 5 worker nodes (minimum):

Storage: 50 GB+
(v)CPU: 2
RAM: 4 GB
Network: Minimum 1Gb/s (faster is preferred)

Up to 10 worker nodes:

Storage: 50 GB+
(v)CPU: 2
RAM: 8 GB
Network: Minimum 1Gb/s (faster is preferred)

Up to 100 worker nodes:

Storage: 50 GB+
(v)CPU: 4
RAM: 16 GB
Network: Minimum 1Gb/s (faster is preferred)

Up to 250 worker nodes:

Storage: 50 GB+
(v)CPU: 8
RAM: 16 GB
Network: Minimum 1Gb/s (faster is preferred)

Important

Using a minimum of 2 (v)CPUs is a hard requirement. Deploying a cluster with fewer processing units is not possible.

1.3.2.2 Worker nodes #

Important

The worker nodes must have sufficient memory, CPU and disk space for the Pods/containers/applications that are planned to be hosted on these workers.

A worker node requires the following resources:

CPU cores: 1.250
RAM: 1.2 GB

Based on these values, the minimal configuration of a worker node is:

Storage: Depending on workloads, minimum 20-30 GB to hold the base OS and required packages. Mount additional storage volumes as needed.
(v)CPU: 2
RAM: 2 GB
Network: Minimum 1Gb/s (faster is preferred)

Calculate the size of the required (v)CPU by adding up the base requirements, the estimated additional essential cluster components (logging agent, monitoring agent, configuration management, etc.) and the estimated CPU workloads:

1.250 (base requirements) + 0.250 (estimated additional cluster components) + estimated workload CPU requirements

Calculate the size of the RAM using a similar formula:

1.2 GB (base requirements) + 500 MB (estimated additional cluster components) + estimated workload RAM requirements

Note

These values are provided as a guide to work in most cases. They may vary based on the type of the running workloads.
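
The two formulas above can be turned into a small calculation. The following is a minimal sketch in POSIX shell, not part of any SUSE tooling: the function names are hypothetical, and 1.2 GB is treated as 1200 MB for simplicity, which is an assumption.

```shell
#!/bin/sh
# Sketch: worker node sizing from the formulas above.
# Function names are illustrative; they are not part of skuba or any SUSE tool.

# worker_cpu WORKLOAD_CORES
# Prints base (1.250) + additional cluster components (0.250) + workload estimate.
worker_cpu() {
    awk -v w="$1" 'BEGIN { printf "%.3f\n", 1.250 + 0.250 + w }'
}

# worker_ram_mb WORKLOAD_MB
# Prints base (1.2 GB, taken here as 1200 MB) + 500 MB + workload estimate, in MB.
worker_ram_mb() {
    awk -v w="$1" 'BEGIN { printf "%d\n", 1200 + 500 + w }'
}

# Example: workloads estimated at 2.5 cores and 4096 MB of RAM.
worker_cpu 2.5       # prints 4.000
worker_ram_mb 4096   # prints 5796
```

When provisioning, the CPU result would typically be rounded up to the next whole (v)CPU, and never below the minimal configuration listed above.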

1.3.3 Storage Performance #

For master nodes you must ensure storage performance of at least 50 to 500 sequential IOPS with disk bandwidth depending on your cluster size. It is highly recommended to use SSD.

"Typically 50 sequential IOPS (for example, a 7200 RPM disk) is required.
For heavily loaded clusters, 500 sequential IOPS (for example, a typical local SSD
or a high performance virtualized block device) is recommended."

"Typically 10MB/s will recover 100MB data within 15 seconds.
For large clusters, 100MB/s or higher is suggested for recovering 1GB data
within 15 seconds."

https://github.com/etcd-io/etcd/blob/master/Documentation/op-guide/hardware.md#disks

This is extremely important to ensure a proper functioning of the critical component etcd.

You can validate these requirements in advance by using fio. This tool simulates etcd I/O (input/output), and its output statistics show whether or not the storage is suitable.

Install the tool:
```
zypper in -y fio
```
Run the test:
```
fio --rw=write --ioengine=sync --fdatasync=1 --directory=test-etcd-dir --size=22m --bs=2300 --name=test-etcd-io
```
- Replace test-etcd-dir with a directory located on the same disk as the incoming etcd data under /var/lib/etcd

From the outputs, the interesting part is fsync/fdatasync/sync_file_range where the values are expressed in microseconds (usec). A disk is considered sufficient when the value of the 99.00th percentile is below 10000usec (10ms).

Be careful though: this benchmark covers etcd only and does not take other disk usage into consideration. A value only slightly under 10ms should therefore be treated with caution, as other workloads will have an impact on the disks.

Warning

If the storage is very slow, the values can be expressed directly in milliseconds.

Let’s see two different examples:

[...]
  fsync/fdatasync/sync_file_range:
    sync (usec): min=251, max=1894, avg=377.78, stdev=69.89
    sync percentiles (usec):
     |  1.00th=[  273],  5.00th=[  285], 10.00th=[  297], 20.00th=[  330],
     | 30.00th=[  343], 40.00th=[  355], 50.00th=[  367], 60.00th=[  379],
     | 70.00th=[  400], 80.00th=[  424], 90.00th=[  465], 95.00th=[  506],
     | 99.00th=[  594], 99.50th=[  635], 99.90th=[  725], 99.95th=[  742],
     | 99.99th=[ 1188]
[...]

Here the 99.00th percentile is 594usec (about 0.6ms), so the storage meets the requirements.

[...]
  fsync/fdatasync/sync_file_range:
    sync (msec): min=10, max=124, avg=17.62, stdev= 3.38
    sync percentiles (usec):
     |  1.00th=[11731],  5.00th=[11994], 10.00th=[12911], 20.00th=[16712],
     | 30.00th=[17695], 40.00th=[17695], 50.00th=[17695], 60.00th=[17957],
     | 70.00th=[17957], 80.00th=[17957], 90.00th=[19530], 95.00th=[22676],
     | 99.00th=[28705], 99.50th=[30016], 99.90th=[41681], 99.95th=[59507],
     | 99.99th=[89654]
[...]

Here the 99.00th percentile is 28705usec (28ms), so the storage clearly does not meet the requirements.
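
Reading the percentile by eye is easy to get wrong, especially when fio switches units. The following sketch shows one way to extract the 99.00th percentile of the fsync section and compare it with the 10000usec limit. It assumes the fio output layout shown in the examples above; the function names are hypothetical.

```shell
#!/bin/sh
# Sketch: read fio output on stdin and print the 99.00th percentile of the
# fsync/fdatasync/sync_file_range section, converted to microseconds.
fsync_p99_usec() {
    awk '
        BEGIN { unit = 1 }
        /fsync\/fdatasync\/sync_file_range/ { in_sync = 1; next }
        in_sync && /sync percentiles/ {
            # fio changes the unit of the percentiles depending on storage speed
            if ($0 ~ /\(msec\)/) unit = 1000
            if ($0 ~ /\(nsec\)/) unit = 0.001
            next
        }
        in_sync && match($0, /99\.00th=\[ *[0-9]+\]/) {
            s = substr($0, RSTART, RLENGTH)
            sub(/^99\.00th=\[ */, "", s)
            sub(/\]$/, "", s)
            printf "%d\n", s * unit
            exit
        }
    '
}

# etcd_disk_verdict P99_USEC: compare against the 10000usec (10ms) limit.
etcd_disk_verdict() {
    if [ "$1" -lt 10000 ]; then
        echo "OK: p99 fsync latency ${1}usec is below 10000usec"
    else
        echo "TOO SLOW: p99 fsync latency ${1}usec is not below 10000usec"
    fi
}

# Trimmed sample taken from the first example above.
sample='  fsync/fdatasync/sync_file_range:
    sync percentiles (usec):
     | 99.00th=[  594], 99.50th=[  635], 99.90th=[  725], 99.95th=[  742],
     | 99.99th=[ 1188]'

etcd_disk_verdict "$(printf '%s\n' "$sample" | fsync_p99_usec)"
```

On a real system, the output of the fio command shown earlier could be piped directly into fsync_p99_usec instead of using the embedded sample.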

1.4 Networking #

The management workstation needs at least the following networking permissions:

SSH access to all machines in the cluster
Access to the apiserver (the load balancer should expose it, port 6443), that will in turn talk to any master in the cluster
Access to Dex on the configured NodePort (the load balancer should expose it, port 32000) so when the OIDC token has expired, kubectl can request a new token using the refresh token

Important

It is good security practice not to expose the Kubernetes API server on the public internet. Use network firewalls that only allow access from trusted subnets.

1.4.1 Sub-Network Sizing #

Important

The service subnet and pod subnet must not overlap.

Please plan generously for workload and the expected size of the networks before bootstrapping.

The default pod subnet is 10.244.0.0/16. It allows for 65536 IP addresses overall. By default, each node is assigned a /24 CIDR (254 usable IP addresses per node).

The default node allocation of /24 means a hard cluster node limit of 256 since this is the number of /24 ranges that fit in a /16 range.

Depending on the size of the nodes that you are planning to use (in terms of resources), or on the number of nodes you are planning to have, the CIDR can be adjusted to be bigger on a per node basis but the cluster would accommodate fewer nodes overall.

If you are planning to use more or less pods per node or have a higher number of nodes, you can adjust these settings to match your requirements. Please make sure that the networks are suitably sized to adjust to future changes in the cluster.
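
The trade-off between per-node range size and node count is plain power-of-two arithmetic. The following sketch (hypothetical function names) reproduces the default figures from the text and shows the effect of assigning a /23 per node instead.

```shell
#!/bin/sh
# max_nodes CLUSTER_PREFIX NODE_PREFIX
# Number of per-node ranges that fit into the pod subnet.
max_nodes() {
    echo $(( 1 << ($2 - $1) ))
}

# usable_ips_per_node NODE_PREFIX
# Addresses in one per-node range, minus network and broadcast address.
usable_ips_per_node() {
    echo $(( (1 << (32 - $1)) - 2 ))
}

max_nodes 16 24          # prints 256 (default: /16 pod subnet, /24 per node)
usable_ips_per_node 24   # prints 254
max_nodes 16 23          # prints 128 (bigger per-node range, fewer nodes)
usable_ips_per_node 23   # prints 510
```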

You can also adjust the service subnet size. This subnet must not overlap with the pod CIDR, and it should be big enough to accommodate all services.

For more advanced network requirements please refer to: https://docs.cilium.io/en/v1.6/concepts/ipam/#address-management

1.4.2 Ports #

Node	Port	Protocol	Accessibility	Description
All nodes	22	TCP	Internal	SSH (required in public clouds)
	4240	TCP	Internal	Cilium health check
	8472	UDP	Internal	Cilium VXLAN
	10250	TCP	Internal	Kubelet (API server → kubelet communication)
	10256	TCP	Internal	kube-proxy health check
	30000 - 32767	TCP + UDP	Internal	Range of ports used by Kubernetes when allocating services of type `NodePort`
	32000	TCP	External	Dex (OIDC Connect)
	32001	TCP	External	Gangway (RBAC Authenticate)
Masters	2379	TCP	Internal	etcd (client communication)
	2380	TCP	Internal	etcd (server-to-server traffic)
	6443	TCP	Internal / External	Kubernetes API server
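
As an illustration of how the table translates into firewall rules, the following sketch prints (but does not run) firewalld commands for a node role. It assumes firewalld is the firewall in use, which this guide does not prescribe, and it does not distinguish between internal and external accessibility; restrict zones or sources according to your own security policy. The function name is hypothetical.

```shell
#!/bin/sh
# Sketch: print firewalld commands for the ports listed in the table above.
# Ports 32000 and 32001 are already covered by the NodePort range 30000-32767.
firewall_commands() {
    role=$1   # "master" or "worker"
    ports="22/tcp 4240/tcp 8472/udp 10250/tcp 10256/tcp 30000-32767/tcp 30000-32767/udp"
    if [ "$role" = "master" ]; then
        # Additional ports required on master nodes only
        ports="$ports 2379/tcp 2380/tcp 6443/tcp"
    fi
    for p in $ports; do
        echo "firewall-cmd --permanent --add-port=$p"
    done
    echo "firewall-cmd --reload"
}

firewall_commands master
```

Review the printed commands before running any of them as root on the respective node.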

1.4.3 IP Addresses #

Warning

Using IPv6 addresses is currently not supported.

All nodes must be assigned static IPv4 addresses, which must not be changed manually afterwards.

Important

Plan carefully for required IP ranges and future scenarios as it is not possible to reconfigure the IP ranges once the deployment is complete.

1.4.4 IP Forwarding #

The Kubernetes networking model requires that your nodes have IP forwarding enabled in the kernel. skuba checks this value when installing your cluster and installs a rule in /etc/sysctl.d/90-skuba-net-ipv4-ip-forward.conf to make it persistent.

Other software can potentially install rules with higher priority overriding this value and causing machines to not behave as expected after rebooting.

You can manually check if this is enabled using the following command:

# sysctl net.ipv4.ip_forward

net.ipv4.ip_forward = 1

net.ipv4.ip_forward must be set to 1. Additionally, you can check in what order persisted rules are processed by running sysctl --system -a.
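
To see whether another file overrides the rule installed by skuba, you can look for the last file that sets the value. The sketch below scans a single directory in lexical order, which is a simplification: sysctl --system reads several directories, so treat the result as a hint only. The function name is hypothetical.

```shell
#!/bin/sh
# last_ip_forward_setting DIR
# Prints "FILE: LINE" for the lexically last *.conf file in DIR that sets
# net.ipv4.ip_forward. Later files override earlier ones.
last_ip_forward_setting() {
    dir=$1
    last=""
    for f in "$dir"/*.conf; do
        [ -f "$f" ] || continue
        line=$(grep -E '^[[:space:]]*net\.ipv4\.ip_forward[[:space:]]*=' "$f" | tail -n 1)
        if [ -n "$line" ]; then
            last="$f: $line"
        fi
    done
    if [ -n "$last" ]; then
        echo "$last"
    fi
}

last_ip_forward_setting /etc/sysctl.d
```

If the printed file is not 90-skuba-net-ipv4-ip-forward.conf, or the printed value is not 1, a later rule is likely overriding the setting.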

1.4.5 Networking Whitelist #

Besides the SUSE provided packages and containers, SUSE CaaS Platform is typically used with third party provided containers and charts.

The following SUSE provided resources must be available:

URL	Name	Purpose
scc.suse.com	SUSE Customer Center	Allow registration and license activation
registry.suse.com	SUSE container registry	Provide container images
*.cloudfront.net	Cloudfront	CDN/distribution backend for `registry.suse.com`
kubernetes-charts.suse.com	SUSE helm charts repository	Provide helm charts
updates.suse.com	SUSE package update channel	Provide package updates

If you wish to use Upstream / Third-Party resources, please also allow the following:

URL	Name	Purpose
k8s.gcr.io	Google Container Registry	Provide container images
kubernetes-charts.storage.googleapis.com	Google Helm charts repository	Provide helm charts
docker.io	Docker Container Registry	Provide container images
quay.io	Red Hat Container Registry	Provide container images

Please note that not all installation scenarios will need all of these resources.

Note

If you are deploying into an air gap scenario, you must ensure that the resources required from these locations are present and available on your internal mirror server.

1.4.6 Communication #

Please make sure that all your Kubernetes components can communicate with each other. This might require the configuration of routing when using multiple network adapters per node.

Refer to: https://v1-18.docs.kubernetes.io/docs/setup/independent/install-kubeadm/#check-network-adapters.

Configure firewall and other network security to allow communication on the default ports required by Kubernetes: https://v1-18.docs.kubernetes.io/docs/setup/independent/install-kubeadm/#check-required-ports

1.4.7 Performance #

All master nodes of the cluster must have a minimum 1Gb/s network connection to fulfill the requirements for etcd.

"1GbE is sufficient for common etcd deployments. For large etcd clusters,
a 10GbE network will reduce mean time to recovery."

https://github.com/etcd-io/etcd/blob/master/Documentation/op-guide/hardware.md#network

1.4.8 Security #

Do not grant access to the kubeconfig file or any workstation configured with this configuration to unauthorized personnel. In the current state, full administrative access is granted to the cluster.

Authentication is done via the kubeconfig file generated during deployment. This file will grant full access to the cluster and all workloads. Apply best practices for access control to workstations configured to administer the SUSE CaaS Platform cluster.

SUSE CaaS Platform uses Kubernetes role-based access control (RBAC) for authorization and needs an external authentication server such as LDAP, Active Directory, or similar to validate the user’s identity and grant different user roles or cluster role permissions.

1.4.9 Replicas #

Some add-on services are intended to be highly available. These services require enough cluster nodes to run replicas of their services.

When the cluster is deployed with enough nodes for replica sizes, those service distributions will be balanced across the cluster.

For clusters deployed with a node number lower than the default replica sizes, services will still try to find a suitable node to run on. However it is likely you will see services all running on the same nodes, defeating the purpose of high availability.

You can check the deployment replica size after node bootstrap. The number of cluster nodes should be equal to or greater than the DESIRED replica size.

kubectl get rs -n kube-system

After deployment, if the number of healthy nodes falls below the number required for fulfilling the replica sizing, service replicas will show in Pending state until either the unhealthy node recovers or a new node joins the cluster.
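
The comparison between node count and DESIRED replica size can be scripted. The following sketch parses text in the column layout of kubectl get rs (NAME, DESIRED, CURRENT, READY, AGE); the sample input and the function names are made up for illustration.

```shell
#!/bin/sh
# max_desired: read `kubectl get rs` output on stdin and print the largest
# value found in the DESIRED column (second column, header line skipped).
max_desired() {
    awk 'NR > 1 && $2 + 0 > max { max = $2 + 0 } END { print max + 0 }'
}

# enough_nodes NODE_COUNT MAX_DESIRED
enough_nodes() {
    if [ "$1" -ge "$2" ]; then
        echo "OK: $1 node(s) for a maximum replica size of $2"
    else
        echo "WARNING: $1 node(s) but a replica size of $2 is requested"
    fi
}

# Made-up sample; the replica set names are hypothetical.
sample='NAME            DESIRED   CURRENT   READY   AGE
example-a-abc   3         3         3       1d
example-b-def   2         2         2       1d'

enough_nodes 2 "$(printf '%s\n' "$sample" | max_desired)"
# prints: WARNING: 2 node(s) but a replica size of 3 is requested
```

On a live cluster, the input would come from kubectl get rs -n kube-system, and the node count from kubectl get nodes --no-headers | wc -l.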

The following describes two methods for replica management if you wish to work with a cluster below the default replica size requirement.

1.4.9.1 Update replica number #

One method is to update the number of overall replicas being created by a service. Please consult the documentation of the respective service to find out the replica limits for proper high availability. If the replica number is too high for the cluster, you must increase the cluster size to provide more resources.

Update deployment replica size before node joining.
Note
You can use the same steps to increase the replica size again if more resources become available later on.
```
kubectl -n kube-system scale --replicas=<DESIRED_REPLICAS> deployment <NAME>
```
Join new nodes.

1.4.9.2 Re-distribute replicas #

When multiple replicas are running on the same node, you will want to redistribute them manually to ensure proper high availability.

Find the pod for re-distribution. Check the NAME and NODE column for duplicated pods.
```
kubectl -n kube-system get pod -o wide
```
Delete the duplicated pod. This will trigger the creation of another pod.
```
kubectl -n kube-system delete pod <POD_NAME>
```

2 Deployment Scenarios #

2.1 Default Deployment Scenario
2.2 Air gap Environment

2.1 Default Deployment Scenario #

The default scenario consists of a SUSE CaaS Platform cluster, an external load balancer, and a management workstation.

The minimum viable failure tolerant configuration for the cluster is 3 master nodes and 2 worker nodes. For more information, refer to Chapter 1, Requirements.

2.2 Air gap Environment #

For detailed instructions on how to prepare deployment in an air gapped environment, refer to: https://documentation.suse.com/suse-caasp/4.5/html/caasp-airgap/index.html.

3 Deployment Instructions #

3.1 Deployment Preparations
3.2 Deployment on SUSE OpenStack Cloud
3.3 Deployment on VMware
3.4 Deployment on Bare Metal or KVM
3.5 Deployment on Existing SLES Installation
3.6 Deployment on Amazon Web Services (AWS)

Important

If you are installing over one of the previous milestones, you must remove the RPM repository. SUSE CaaS Platform is now distributed as an extension for SUSE Linux Enterprise and no longer requires the separate repository.

If you do not remove the repository before installation, there might be conflicts with the package dependencies that could render your installation nonfunctional.

Note

Due to a naming convention conflict, all versions of SUSE CaaS Platform 4.x up to 4.5 will be released in the 4.0 module. Starting with 4.5 the product will be delivered in the 4.5 module.

3.1 Deployment Preparations #

In order to deploy SUSE CaaS Platform you need a workstation running SUSE Linux Enterprise Server 15 SP2 or a similar openSUSE equivalent. This workstation is called the "Management machine". Important files are generated and must be maintained on this machine, but it is not a member of the SUSE CaaS Platform cluster.

3.1.1 Basic SSH Key Configuration #

In order to successfully deploy SUSE CaaS Platform, you need to have SSH keys loaded into an SSH agent. This is important, because it is required in order to use the installation tools skuba and terraform.

Note

The use of ssh-agent comes with some implications for security that you should take into consideration.

The pitfalls of using ssh-agent

To avoid these risks please make sure to either use ssh-agent -t <TIMEOUT> and specify a time after which the agent will self-terminate, or terminate the agent yourself before logging out by running ssh-agent -k.

To log in to the created cluster nodes from the Management machine, you need to configure an SSH key pair. This key pair needs to be trusted by the user account you will log in with into each cluster node; that user is called "sles" by default. In order to use the installation tools terraform and skuba, this trusted keypair must be loaded into the SSH agent.

If you do not have an existing ssh keypair to use, run:
```
ssh-keygen -t ecdsa
```
The ssh-agent or a compatible program is sometimes started automatically by graphical desktop environments. If that is not your situation, run:
```
eval "$(ssh-agent)"
```
This will start the agent and set environment variables used for agent communication within the current session. This has to be the same terminal session that you run the skuba commands in. A new terminal usually requires a new ssh-agent. In some desktop environments the ssh-agent will also automatically load the SSH keys. To add an SSH key manually, use the ssh-add command:
```
ssh-add <PATH_TO_KEY>
```
Tip
If you are adding the SSH key manually, specify the full path. For example: /home/sles/.ssh/id_rsa

You can load multiple keys into your agent using the ssh-add <PATH_TO_KEY> command. Keys should be password protected as a security measure. The ssh-add command will prompt for your password, then the agent caches the decrypted key material for a configurable lifetime. The -t lifetime option to ssh-add specifies a maximum time to cache the specific key. See man ssh-add for more information.

Warning: Specify a key expiration time

The SSH key is decrypted when loaded into the key agent. Though the key itself is not accessible from the agent, anyone with access to the agent’s control socket file can use the private key contents to impersonate the key owner. By default, socket access is limited to the user who launched the agent. Nonetheless, it is good security practice to specify an expiration time for the decrypted key using the -t option. For example: ssh-add -t 1h30m $HOME/.ssh/id_ecdsa would expire the decrypted key in 1.5 hours. Alternatively, ssh-agent can also be launched with -t to specify a default timeout. For example: eval $( ssh-agent -t 120s ) would default to a two minute (120 second) timeout for keys added. If timeouts are specified for both programs, the timeout from ssh-add is used. See man ssh-agent and man ssh-add for more information.

Note: Usage of multiple identities with `ssh-agent`

Skuba will try all the identities loaded into the ssh-agent until one of them grants access to the node, or until the SSH server’s maximum authentication attempts are exhausted. This could lead to undesired messages in SSH or other security/authentication logs on your local machine.

3.1.1.1 Forwarding the Authentication Agent Connection #

It is also possible to forward the authentication agent connection from one host to another, which can be useful if you intend to run skuba on a "jump host" and don’t want to copy your private key to this node. This can be achieved using the ssh -A command. Please refer to the man page of ssh to learn about the security implications of using this feature.

3.1.2 Registration Code #

Note

The registration code for SUSE CaaS Platform 4 also contains the activation permissions for the underlying SUSE Linux Enterprise operating system. You can use your SUSE CaaS Platform registration code to activate the SUSE Linux Enterprise Server 15 SP2 subscription during installation.

You need a subscription registration code to use SUSE CaaS Platform. You can retrieve your registration code from SUSE Customer Center.

Log in to https://scc.suse.com
Navigate to MY ORGANIZATIONS → <YOUR ORG>
Select the Subscriptions tab from the menu bar at the top
Search for "CaaS Platform"
Select the version you wish to deploy (should be the highest available version)
Click on the Link in the Name column
The registration code should be displayed as the first line under "Subscription Information"

Tip

If you cannot find SUSE CaaS Platform in the list of subscriptions, please contact your local administrator responsible for software subscriptions or SUSE support.

3.1.3 Unique Machine IDs #

During deployment of the cluster nodes, each machine will be assigned a unique ID in the /etc/machine-id file by Terraform or AutoYaST. If you are using any (semi-)manual deployment methods that involve cloning machines or deploying from templates, you must make sure to delete this file before creating the template.

If two nodes are deployed with the same machine-id, they will not be correctly recognized by skuba.

Important: Regenerating Machine ID

In case you are not using Terraform or AutoYaST you must regenerate machine IDs manually.

During the template preparation you will have removed the machine ID from the template image. This ID is required for proper functionality in the cluster and must be (re-)generated on each machine.

rm /etc/machine-id
dbus-uuidgen --ensure
systemd-machine-id-setup
systemctl restart systemd-journald

This will regenerate the machine id values for DBUS (/var/lib/dbus/machine-id) and systemd (/etc/machine-id) and restart the logging service to make use of the new IDs.
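
Before bootstrapping, it can be worth confirming that no two nodes ended up with the same ID. The following sketch detects duplicates in a list of "NODE MACHINE_ID" pairs; the node names, IDs and the function name are hypothetical, and collecting the pairs from the nodes is left to you (for example, by reading /etc/machine-id over SSH).

```shell
#!/bin/sh
# duplicate_machine_ids: read "NODE MACHINE_ID" lines on stdin and print every
# machine ID that is used by more than one node, followed by those nodes.
duplicate_machine_ids() {
    awk '
        { count[$2]++; nodes[$2] = nodes[$2] " " $1 }
        END { for (id in count) if (count[id] > 1) print id ":" nodes[id] }
    '
}

# Hypothetical input; real IDs come from /etc/machine-id on each node.
printf '%s\n' \
    "node1 aaaa1111" \
    "node2 bbbb2222" \
    "node3 aaaa1111" | duplicate_machine_ids
# prints: aaaa1111: node1 node3
```

Empty output means that all listed machine IDs are unique.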

3.1.4 Installation Tools #

For any deployment type you will need skuba and Terraform. These packages are available from the SUSE CaaS Platform package sources. They are provided as an installation "pattern" that will install dependencies and other required packages in one simple step.

Access to the packages requires the SUSE CaaS Platform, Containers and Public Cloud extension modules. Enable the modules during the operating system installation or activate them using SUSE Connect.

sudo SUSEConnect -r <CAASP_REGISTRATION_CODE>
sudo SUSEConnect -p sle-module-containers/15.2/x86_64
sudo SUSEConnect -p sle-module-public-cloud/15.2/x86_64
sudo SUSEConnect -p caasp/4.5/x86_64 -r <CAASP_REGISTRATION_CODE>

The commands, in order:

1. Activate SUSE Linux Enterprise
2. Add the free `Containers` module
3. Add the free `Public Cloud` module
4. Add the SUSE CaaS Platform extension with your registration code

Install the required tools:

sudo zypper in -t pattern SUSE-CaaSP-Management

This will install the skuba command line tool and Terraform, as well as various default configurations and examples.

Note: Using a Proxy Server

Sometimes you need a proxy server to be able to connect to the SUSE Customer Center. If you have not already configured a system-wide proxy, you can temporarily do so for the duration of the current shell session like this:

Export the environment variable http_proxy:

export http_proxy=http://<PROXY_IP_FQDN>:<PROXY_PORT>

Replace <PROXY_IP_FQDN> with the IP address or a fully qualified domain name (FQDN) of the proxy server and <PROXY_PORT> with its port.
If you use a proxy server with basic authentication, create the file $HOME/.curlrc with the following content:
```
--proxy-user "<USER>:<PASSWORD>"
```
Replace <USER> and <PASSWORD> with the credentials of an allowed user for the proxy server, and consider limiting access to the file (chmod 0600).

3.1.5 Load Balancer #

Important

Setting up a load balancer is mandatory in any production environment.

SUSE CaaS Platform requires a load balancer to distribute workload between the deployed master nodes of the cluster. A failure-tolerant SUSE CaaS Platform cluster will always use more than one control plane node as well as more than one load balancer, so there isn’t a single point of failure.

There are many ways to configure a load balancer. This documentation cannot describe all possible combinations of load balancer configurations and thus does not aim to do so. Please apply your organization’s load balancing best practices.

For SUSE OpenStack Cloud, the Terraform configurations shipped with this version will automatically deploy a suitable load balancer for the cluster.

For bare metal, KVM, or VMware, you must configure a load balancer manually and allow it access to all master nodes created during Chapter 4, Bootstrapping the Cluster.

The load balancer should be configured before the actual deployment. It is needed during the cluster bootstrap, and also during upgrades. To simplify configuration, you can reserve the IPs needed for the cluster nodes and pre-configure these in the load balancer.

The load balancer needs access to port 6443 on the apiserver (all master nodes) in the cluster. It also needs access to Gangway port 32001 and Dex port 32000 on all master and worker nodes in the cluster for RBAC authentication.

We recommend performing regular HTTPS health checks on the /healthz endpoint of each master node to verify that the node is responsive. This is particularly important during upgrades, when a master node restarts the apiserver. During this short time window, all requests must go to the apiserver of another master node. The master node that is being upgraded must be marked INACTIVE in the load balancer pool at least for the duration of the apiserver restart. The default OpenStack load balancer Terraform configuration provides reasonable defaults for this.
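
As an illustration of such a health check, the following sketch probes the /healthz endpoint of each master node from the command line. It assumes that the endpoint answers unauthenticated requests with the body `ok` and skips certificate verification for brevity. The function names are illustrative.

```shell
# Minimal sketch: probe the apiserver health endpoint of one master node.
# Assumptions: /healthz answers unauthenticated requests with the body "ok";
# --insecure skips certificate verification and is used here for brevity only.
check_healthz() {
  body=$(curl --silent --insecure --max-time 5 "https://$1:6443/healthz") || return 1
  [ "$body" = "ok" ]
}

# Print one status line per master node given as an argument.
report_masters() {
  for node in "$@"; do
    if check_healthz "$node"; then
      echo "$node healthy"
    else
      echo "$node unhealthy"
    fi
  done
}

# Example: report_masters <MASTER_00_IP_ADDRESS> <MASTER_01_IP_ADDRESS>
```

A real load balancer performs the equivalent check itself; the sketch is only meant for manual troubleshooting.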

The following contains examples for possible load balancer configurations based on SUSE Linux Enterprise Server 15 SP2 and nginx or HAProxy.

3.1.5.1 Nginx TCP Load Balancer with Passive Checks #

For TCP load balancing, we can use the ngx_stream_module module (available since version 1.9.0). In this mode, nginx will just forward the TCP packets to the master nodes.

The default mechanism is round-robin so each request will be distributed to a different server.

Warning

The open source version of Nginx referred to in this guide only allows the use of passive health checks. nginx will mark a node as unresponsive only after a failed request. The original request is lost and not forwarded to an available alternative server.

This load balancer configuration is therefore only suitable for testing and proof-of-concept (POC) environments.

For production environments, we recommend the use of SUSE Linux Enterprise High Availability Extension 15.

3.1.5.1.1 Configuring the Load Balancer #

Register SLES and enable the "Server Applications" module:

SUSEConnect -r CAASP_REGISTRATION_CODE
SUSEConnect --product sle-module-server-applications/15.2/x86_64

Install Nginx:
```
zypper in nginx
```

Write the configuration in /etc/nginx/nginx.conf:

user  nginx;
worker_processes  auto;

load_module /usr/lib64/nginx/modules/ngx_stream_module.so;

error_log  /var/log/nginx/error.log;
error_log  /var/log/nginx/error.log  notice;
error_log  /var/log/nginx/error.log  info;

events {
    worker_connections  1024;
    use epoll;
}

stream {
    log_format proxy '$remote_addr [$time_local] '
                     '$protocol $status $bytes_sent $bytes_received '
                     '$session_time "$upstream_addr"';

    error_log  /var/log/nginx/k8s-masters-lb-error.log;
    access_log /var/log/nginx/k8s-masters-lb-access.log proxy;

    upstream k8s-masters {
        #hash $remote_addr consistent; # (1)
        server master00:6443 weight=1 max_fails=2 fail_timeout=5s; # (2)
        server master01:6443 weight=1 max_fails=2 fail_timeout=5s;
        server master02:6443 weight=1 max_fails=2 fail_timeout=5s;
    }
    server {
        listen 6443;
        proxy_connect_timeout 5s;
        proxy_timeout 30s;
        proxy_pass k8s-masters;
    }

    upstream dex-backends {
        #hash $remote_addr consistent; # (3)
        server master00:32000 weight=1 max_fails=2 fail_timeout=5s; # (4)
        server master01:32000 weight=1 max_fails=2 fail_timeout=5s;
        server master02:32000 weight=1 max_fails=2 fail_timeout=5s;
    }
    server {
        listen 32000;
        proxy_connect_timeout 5s;
        proxy_timeout 30s;
        proxy_pass dex-backends; # (5)
    }

    upstream gangway-backends {
        #hash $remote_addr consistent; # (6)
        server master00:32001 weight=1 max_fails=2 fail_timeout=5s; # (7)
        server master01:32001 weight=1 max_fails=2 fail_timeout=5s;
        server master02:32001 weight=1 max_fails=2 fail_timeout=5s;
    }
    server {
        listen 32001;
        proxy_connect_timeout 5s;
        proxy_timeout 30s;
        proxy_pass gangway-backends; # (8)
    }
    }
}

1 3 6	Note: To enable session persistence, uncomment the `hash` option so the same client will always be redirected to the same server except if this server is unavailable.
2 4 7	Replace the individual `masterXX` with the IP/FQDN of your actual master nodes (one entry each) in each `upstream` section.
5 8	Dex port `32000` and Gangway port `32001` must be accessible through the load balancer for RBAC authentication.
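
When the master node addresses are already known, the `upstream` sections can be generated instead of edited by hand. The following sketch prints one such section with the same server parameters as in the configuration above. The function name `render_upstream` is hypothetical.

```shell
# Hypothetical helper: print an nginx "upstream" block for one port,
# using the same server parameters as the configuration above.
render_upstream() {
  name="$1"
  port="$2"
  shift 2
  echo "    upstream ${name} {"
  for master in "$@"; do
    echo "        server ${master}:${port} weight=1 max_fails=2 fail_timeout=5s;"
  done
  echo "    }"
}

# Example: render_upstream k8s-masters 6443 10.0.0.145 10.0.0.148 10.0.0.7
```

After writing the configuration, `nginx -t` can be used to check the syntax before the service is started.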

Configure firewalld to open up ports 6443, 32000, and 32001. As root, run:

firewall-cmd --zone=public --permanent --add-port=6443/tcp
firewall-cmd --zone=public --permanent --add-port=32000/tcp
firewall-cmd --zone=public --permanent --add-port=32001/tcp
firewall-cmd --reload

Start and enable Nginx. As root, run:
```
systemctl enable --now nginx
```

3.1.5.1.2 Verifying the Load Balancer #

Important

The SUSE CaaS Platform cluster must be up and running for this to produce any useful results. This step can only be performed after Chapter 4, Bootstrapping the Cluster is completed successfully.

To verify that the load balancer works, you can run a simple command to repeatedly retrieve cluster information from the master nodes. Each request should be forwarded to a different master node.

From your workstation, run:

while true; do skuba cluster status; sleep 1; done;

The running skuba cluster status command should not be interrupted.

On the load balancer virtual machine, check the logs to validate that the requests are distributed in a round-robin fashion.

# tail -f /var/log/nginx/k8s-masters-lb-access.log
10.0.0.47 [17/May/2019:13:49:06 +0000] TCP 200 2553 1613 1.136 "10.0.0.145:6443"
10.0.0.47 [17/May/2019:13:49:08 +0000] TCP 200 2553 1613 0.981 "10.0.0.148:6443"
10.0.0.47 [17/May/2019:13:49:10 +0000] TCP 200 2553 1613 0.891 "10.0.0.7:6443"
10.0.0.47 [17/May/2019:13:49:12 +0000] TCP 200 2553 1613 0.895 "10.0.0.145:6443"
10.0.0.47 [17/May/2019:13:49:15 +0000] TCP 200 2553 1613 1.157 "10.0.0.148:6443"
10.0.0.47 [17/May/2019:13:49:17 +0000] TCP 200 2553 1613 0.897 "10.0.0.7:6443"
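
Instead of reading the log by eye, the distribution can be summarized per master node. This is a sketch that assumes the `proxy` log format defined in the configuration above, where the upstream address is the last field. The function name is illustrative.

```shell
# Sketch: count how many requests each upstream received, based on the
# "proxy" log format defined above (the upstream address is the last field).
# Reads the given log files, or standard input if no file is given.
count_upstreams() {
  awk '{ gsub(/"/, "", $NF); hits[$NF]++ }
       END { for (addr in hits) print hits[addr], addr }' "$@" | LC_ALL=C sort -k2
}

# Example: count_upstreams /var/log/nginx/k8s-masters-lb-access.log
```

With round-robin distribution, the counts for all master nodes should be roughly equal.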

3.1.5.2 HAProxy TCP Load Balancer with Active Checks #

Warning: Package Support

HAProxy is available as a supported package with a SUSE Linux Enterprise High Availability Extension 15 subscription.

Alternatively, you can install HAProxy from SUSE Package Hub but you will not receive product support for this component.

HAProxy is a very powerful load balancer application which is suitable for production environments. Unlike the open source version of nginx mentioned in the example above, HAProxy supports active health checking which is a vital function for reliable cluster health monitoring.

The version used at the time of writing is 1.8.7.

Important

The configuration of an HA cluster is out of the scope of this document.

The default mechanism is round-robin so each request will be distributed to a different server.

The health-checks are executed every two seconds. If a connection fails, the check will be retried two times with a timeout of five seconds for each request. If no connection succeeds within this interval (2x5s), the node will be marked as DOWN and no traffic will be sent until the checks succeed again.

3.1.5.2.1 Configuring the Load Balancer #

Register SLES and enable the "Server Applications" module:

SUSEConnect -r CAASP_REGISTRATION_CODE
SUSEConnect --product sle-module-server-applications/15.2/x86_64

Enable the source for the haproxy package:
- If you are using the SUSE Linux Enterprise High Availability Extension:
```
SUSEConnect --product sle-ha/15.2/x86_64 -r ADDITIONAL_REGCODE
```
- If you want the free (unsupported) package:
```
SUSEConnect --product PackageHub/15.2/x86_64
```
Configure /dev/log for HAProxy chroot (optional)
This step is only required when HAProxy is configured to run in a jail directory (chroot). This is highly recommended since it increases the security of HAProxy.
Since HAProxy is chrooted, it’s necessary to make the log socket available inside the jail directory so HAProxy can send logs to the socket.
```
mkdir -p /var/lib/haproxy/dev/ && touch /var/lib/haproxy/dev/log
```
This systemd service will take care of mounting the socket in the jail directory.
```
cat > /etc/systemd/system/bindmount-dev-log-haproxy-chroot.service <<EOF
[Unit]
Description=Mount /dev/log in HAProxy chroot
After=systemd-journald-dev-log.socket
Before=haproxy.service

[Service]
Type=oneshot
ExecStart=/bin/mount --bind /dev/log /var/lib/haproxy/dev/log

[Install]
WantedBy=multi-user.target
EOF
```
Enabling the service will make the changes persistent after a reboot.
```
systemctl enable --now bindmount-dev-log-haproxy-chroot.service
```
Install HAProxy:
```
zypper in haproxy
```

Write the configuration in /etc/haproxy/haproxy.cfg:

Note

Replace the individual <MASTER_XX_IP_ADDRESS> with the IP of your actual master nodes (one entry each) in the server lines. You can leave the name argument in the server lines (master00 etc.) as is. It only serves as a label that shows up in the HAProxy logs.

global
  log /dev/log local0 info # (1)
  chroot /var/lib/haproxy # (2)
  user haproxy
  group haproxy
  daemon

defaults
  mode       tcp
  log        global
  option     tcplog
  option     redispatch
  option     tcpka
  retries    2
  http-check     expect status 200 # (3)
  default-server check check-ssl verify none
  timeout connect 5s
  timeout client 5s
  timeout server 5s
  timeout tunnel 86400s # (4)

listen stats # (5)
  bind    *:9000
  mode    http
  stats   hide-version
  stats   uri       /stats

listen apiserver # (6)
  bind   *:6443
  option httpchk GET /healthz
  server master00 <MASTER_00_IP_ADDRESS>:6443
  server master01 <MASTER_01_IP_ADDRESS>:6443
  server master02 <MASTER_02_IP_ADDRESS>:6443

listen dex # (7)
  bind   *:32000
  option httpchk GET /healthz
  server master00 <MASTER_00_IP_ADDRESS>:32000
  server master01 <MASTER_01_IP_ADDRESS>:32000
  server master02 <MASTER_02_IP_ADDRESS>:32000

listen gangway # (8)
  bind   *:32001
  option httpchk GET /
  server master00 <MASTER_00_IP_ADDRESS>:32001
  server master01 <MASTER_01_IP_ADDRESS>:32001
  server master02 <MASTER_02_IP_ADDRESS>:32001

1	Forwards the logs to systemd-journald. The log level can be set to `debug` to increase verbosity.
2	Runs HAProxy in a `chroot` jail directory.
3	The health checks expect a `200` return code.
4	This timeout is set to `24h` to allow long-lived connections when accessing pod logs or port forwarding.
5	Exposes the `HAProxy` stats on port `9000`. They are accessible at `http://loadbalancer:9000/stats`.
6	Kubernetes apiserver listening on port `6443`. The checks are performed against `https://MASTER_XX_IP_ADDRESS:6443/healthz`.
7	Dex listening on port `32000`. It must be accessible through the load balancer for RBAC authentication. The checks are performed against `https://MASTER_XX_IP_ADDRESS:32000/healthz`.
8	Gangway listening on port `32001`. It must be accessible through the load balancer for RBAC authentication. The checks are performed against `https://MASTER_XX_IP_ADDRESS:32001/`.
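
The stats endpoint configured above can also be queried from a script to find servers that fail their health checks. The following sketch assumes that the stats page serves CSV when `;csv` is appended to the URI and that the header line names the `pxname`, `svname` and `status` columns. The function name `list_not_up` is illustrative.

```shell
# Sketch: list backend servers that are not reported as UP.
# Assumption: input is HAProxy stats CSV whose first line is a header
# ("# pxname,svname,...") that names the columns used below.
list_not_up() {
  awk -F, '
    NR == 1 { sub(/^# /, ""); for (i = 1; i <= NF; i++) col[$i] = i; next }
    {
      sv = $(col["svname"]); st = $(col["status"])
      # Skip the aggregated FRONTEND/BACKEND rows and unchecked servers.
      if (sv != "FRONTEND" && sv != "BACKEND" && st !~ /^UP/ && st != "no check")
        print $(col["pxname"]) "/" sv, st
    }'
}

# Example: curl --silent "http://loadbalancer:9000/stats;csv" | list_not_up
```

The configuration file itself can be validated with `haproxy -c -f /etc/haproxy/haproxy.cfg` before the service is started.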

Configure firewalld to open up ports 6443, 32000, and 32001. As root, run:

firewall-cmd --zone=public --permanent --add-port=6443/tcp
firewall-cmd --zone=public --permanent --add-port=32000/tcp
firewall-cmd --zone=public --permanent --add-port=32001/tcp
firewall-cmd --reload

Start and enable HAProxy. As root, run:
```
systemctl enable --now haproxy
```

3.1.5.2.2 Verifying the Load Balancer #

Important

The SUSE CaaS Platform cluster must be up and running for this to produce any useful results. This step can only be performed after Chapter 4, Bootstrapping the Cluster is completed successfully.

To verify that the load balancer works, you can run a simple command to repeatedly retrieve cluster information from the master nodes. Each request should be forwarded to a different master node.

From your workstation, run:

while true; do skuba cluster status; sleep 1; done;

The running skuba cluster status command should not be interrupted.

On the load balancer virtual machine, check the logs to validate that the requests are distributed in a round-robin fashion.

# journalctl -flu haproxy
haproxy[2525]: 10.0.0.47:59664 [30/Sep/2019:13:33:20.578] apiserver apiserver/master00 1/0/578 9727 -- 18/18/17/3/0 0/0
haproxy[2525]: 10.0.0.47:59666 [30/Sep/2019:13:33:22.476] apiserver apiserver/master01 1/0/747 9727 -- 18/18/17/7/0 0/0
haproxy[2525]: 10.0.0.47:59668 [30/Sep/2019:13:33:24.522] apiserver apiserver/master02 1/0/575 9727 -- 18/18/17/7/0 0/0
haproxy[2525]: 10.0.0.47:59670 [30/Sep/2019:13:33:26.386] apiserver apiserver/master00 1/0/567 9727 -- 18/18/17/3/0 0/0
haproxy[2525]: 10.0.0.47:59678 [30/Sep/2019:13:33:28.279] apiserver apiserver/master01 1/0/575 9727 -- 18/18/17/7/0 0/0
haproxy[2525]: 10.0.0.47:59682 [30/Sep/2019:13:33:30.174] apiserver apiserver/master02 1/0/571 9727 -- 18/18/17/7/0 0/0

3.2 Deployment on SUSE OpenStack Cloud #

Note: Preparation Required

You must have completed Section 3.1, “Deployment Preparations” to proceed.

3.2.1 Install required kernel package #

The JeOS image used to install on SUSE OpenStack Cloud comes only with the basic kernel pre-installed. This is insufficient to run Cilium. Make sure that you install the kernel-default package on all cluster nodes before proceeding.

3.2.2 Overview #

You will use Terraform to deploy the required master and worker cluster nodes (plus a load balancer) to SUSE OpenStack Cloud and then use the skuba tool to bootstrap the Kubernetes cluster on top of those.

Download the SUSE OpenStack Cloud RC file.
1. Log in to SUSE OpenStack Cloud.
2. Click on your username in the upper right hand corner to reveal the drop-down menu.
3. Click on Download OpenStack RC File v3.
4. Save the file to your workstation.
5. Load the file into your shell environment using the following command, replacing DOWNLOADED_RC_FILE with the name of your file:
```
source <DOWNLOADED_RC_FILE>.sh
```
6. Enter the password for the RC file. This should be the same password that you use to log in to SUSE OpenStack Cloud.
Get the SUSE Linux Enterprise Server 15 SP2 image.
1. Download the pre-built image of SUSE Linux Enterprise Server 15 SP2 for SUSE OpenStack Cloud from https://www.suse.com/download/sles/.
2. Upload the image to your SUSE OpenStack Cloud.
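
After sourcing the RC file as described above, it can be useful to confirm that the credentials actually reached the shell environment before running Terraform. The following is a minimal sketch. The variable names in the example are the ones commonly exported by OpenStack RC files and are an assumption here, so check your RC file for the exact set.

```shell
# Sketch: print the name of every given environment variable
# that is unset or empty in the current shell.
missing_env() {
  for name in "$@"; do
    eval "value=\${$name:-}"
    [ -n "$value" ] || echo "$name"
  done
}

# Example (variable names are assumptions; check your RC file):
# missing_env OS_AUTH_URL OS_USERNAME OS_PASSWORD
```

If the function prints nothing, all listed variables are set.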

Note: The default user is 'sles'

The SUSE Linux Enterprise Server 15 SP2 images for SUSE OpenStack Cloud come with predefined user sles, which you use to log in to the cluster nodes. This user has been configured for password-less 'sudo' and is the one recommended to be used by Terraform and skuba.

3.2.3 Deploying the Cluster Nodes #

Find the Terraform template files for SUSE OpenStack Cloud in /usr/share/caasp/terraform/openstack (which was installed as part of the management pattern - sudo zypper in -t pattern SUSE-CaaSP-Management). Copy this folder to a location of your choice as the files need adjustment.
```
mkdir -p ~/caasp/deployment/
cp -r /usr/share/caasp/terraform/openstack/ ~/caasp/deployment/
cd ~/caasp/deployment/openstack/
```
Once the files are copied, rename the terraform.tfvars.example file to terraform.tfvars:
```
mv terraform.tfvars.example terraform.tfvars
```

Edit the terraform.tfvars file and add/modify the following variables:

# Name of the image to use
image_name = "SLES15-SP2-JeOS.x86_64-15.2-OpenStack-Cloud-GM"

# Identifier to make all your resources unique and avoid clashes with other users of this Terraform project
stack_name = "caasp" # (1)

# Name of the internal network to be created
internal_net = "caasp-net" # (2)

# Name of the internal subnet to be created
# IMPORTANT: If this variable is not set or empty,
# then it will be generated following a schema like
# internal_subnet = "${var.internal_net}-subnet"
internal_subnet = "caasp-subnet"

# Name of the internal router to be created
# IMPORTANT: If this variable is not set or empty,
# then it will be generated following a schema like
# internal_router = "${var.internal_net}-router"
internal_router = "caasp-router"

# Name of the external network to be used, the one used to allocate floating IPs
external_net = "floating"

# CIDR of the subnet for the internal network
subnet_cidr = "172.28.0.0/24"

# Number of master nodes
masters = 3 # (3)

# Number of worker nodes
workers = 2 # (4)

# Size of the master nodes
master_size = "t2.large"

# Size of the worker nodes
worker_size = "t2.large"

# Attach persistent volumes to workers
workers_vol_enabled = 0

# Size of the worker volumes in GB
workers_vol_size = 5

# Name of DNS domain
dnsdomain = "caasp.example.com"

# Set DNS Entry (0 is false, 1 is true)
dnsentry = 0

# Optional: Define the repositories to use
# repositories = {
#   repository1 = "http://repo.example.com/repository1/"
#   repository2 = "http://repo.example.com/repository2/"
# }
repositories = {} # (5)

# Define required packages to be installed/removed. Do not change those.
packages = [ # (6)
  "kernel-default",
  "-kernel-default-base",
  "new-package-to-install"
]

# ssh keys to inject into all the nodes
authorized_keys = [ # (7)
  ""
]

# IMPORTANT: Replace these ntp servers with ones from your infrastructure
ntp_servers = ["0.example.ntp.org", "1.example.ntp.org", "2.example.ntp.org", "3.example.ntp.org"] # (8)

1	`stack_name`: Prefix for all machines of the cluster spawned by terraform.
2	`internal_net`: the internal network name that will be created/used for the cluster in SUSE OpenStack Cloud. Note: This string will be used to generate the human readable IDs in SUSE OpenStack Cloud. If you use a generic term, deployment is very likely to fail because the term is already in use by someone else. It’s a good idea to use your username or some other unique identifier.
3	`masters`: Number of master nodes to be deployed.
4	`workers`: Number of worker nodes to be deployed.
5	`repositories`: A list of additional repositories to be added on each machine. Leave empty if no additional packages need to be installed.
6	`packages`: Additional packages to be installed on the node. Note: Do not remove any of the pre-filled values in the `packages` section. This can render your cluster unusable. You can add more packages but do not remove any of the default packages listed.
7	`authorized_keys`: List of SSH public keys that will be injected into the cluster nodes, allowing you to log in to them via SSH as the `sles` user. Copy and paste the text from the keyname.pub file here, not the private key. At least one of the keys must match a key loaded into your `ssh-agent`.
8	`ntp_servers`: A list of `ntp` servers you would like to use with `chrony`.
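
The requirement that at least one of the `authorized_keys` entries matches a key loaded into `ssh-agent` can be checked before deploying. The following sketch compares a public key file with the output of `ssh-add -L`. The function name `key_in_list` is illustrative.

```shell
# Sketch: succeed if the public key stored in the file "$1" appears in the
# list of public keys read from standard input (for example from "ssh-add -L").
# Only key type and key data are compared, so differing comments do not matter.
key_in_list() {
  wanted=$(awk '{ print $1, $2; exit }' "$1")
  [ -n "$wanted" ] || return 1
  awk '{ print $1, $2 }' | grep -qxF -- "$wanted"
}

# Example:
# ssh-add -L | key_in_list ~/.ssh/id_rsa.pub && echo "key is loaded"
```

If the key is not loaded, it can be added to the agent with `ssh-add` followed by the path to the private key.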

Tip

You can set the timezone before deploying the nodes by modifying the following file:

~/caasp/deployment/openstack/cloud-init/common.tpl

(Optional) If you absolutely need to be able to SSH into your cluster nodes using password instead of key-based authentication, this is the best time to set it globally for all of your nodes. If you do this later, you will have to do it manually. To set this, modify the cloud-init configuration and comment out the related SSH configuration: ~/caasp/deployment/openstack/cloud-init/common.tpl
```
# Workaround for bsc#1138557 . Disable root and password SSH login
# - sed -i -e '/^PermitRootLogin/s/^.*$/PermitRootLogin no/' /etc/ssh/sshd_config
# - sed -i -e '/^#ChallengeResponseAuthentication/s/^.*$/ChallengeResponseAuthentication no/' /etc/ssh/sshd_config
# - sed -i -e '/^#PasswordAuthentication/s/^.*$/PasswordAuthentication no/' /etc/ssh/sshd_config
# - systemctl restart sshd
```
Register your nodes either with the SUSE CaaS Platform product key or against a local SUSE Repository Mirroring Server in ~/caasp/deployment/openstack/registration.auto.tfvars.
Replace <CAASP_REGISTRATION_CODE> with the code from Section 3.1.2, “Registration Code”:
```
## To register CaaSP product please use one of the following method
# - register against SUSE Customer Service, with SUSE CaaSP Product Key
# - register against local SUSE Repository Mirroring Server

# SUSE CaaSP Product Key
caasp_registry_code = "<CAASP_REGISTRATION_CODE>"

# SUSE Repository Mirroring Server Name (FQDN)
#rmt_server_name = "rmt.example.com"
```
This is required so all the deployed nodes can automatically register with SUSE Customer Center and retrieve packages.

You can also enable Cloud Provider Integration with OpenStack in ~/caasp/deployment/openstack/cpi.auto.tfvars:

# Enable CPI integration with OpenStack
cpi_enable = true

# Used to specify the name of to your custom CA file located in /etc/pki/trust/anchors/.
# Upload CUSTOM_CA_FILE to this path on nodes before joining them to your cluster.
#ca_file = "/etc/pki/trust/anchors/<CUSTOM_CA_FILE>"

Now you can deploy the nodes by running:
```
terraform init
terraform plan
terraform apply
```
Check the output for the actions to be taken. Type "yes" and confirm with Enter when ready. Terraform will now provision all the machines and network infrastructure for the cluster.
Important: Note down IP/FQDN for nodes
The IP addresses of the generated machines will be displayed in the Terraform output during the cluster node deployment. You need these IP addresses to deploy SUSE CaaS Platform to the cluster.
If you need to find an IP address later on, you can run terraform output from the directory in which you performed the deployment (~/caasp/deployment/openstack), or perform the following steps:
1. Log in to SUSE OpenStack Cloud and click on Network › Load Balancers. Find the one with the string you entered in the Terraform configuration above, for example "testing-lb".
2. Note down the "Floating IP". If you have configured an FQDN for this IP, use the host name instead.
3. Now click on Compute › Instances.
4. Switch the filter dropdown box to Instance Name and enter the string you specified for stack_name in the terraform.tfvars file.
5. Find the floating IPs on each of the nodes of your cluster.

3.2.4 Logging in to the Cluster Nodes #

Connecting to the cluster nodes is only possible via SSH key-based authentication, using the SSH public key injected earlier via Terraform. You can use the predefined sles user to log in.
If the ssh-agent is running in the background, run:
```
ssh sles@<NODE_IP_ADDRESS>
```
Without the ssh-agent running, run:
```
ssh sles@<NODE_IP_ADDRESS> -i <PATH_TO_YOUR_SSH_PRIVATE_KEY>
```
Once connected, you can execute commands using password-less sudo. In addition to that, you can also set a password if you prefer to.
To set the root password, run:
```
sudo passwd
```
To set the sles user’s password, run:
```
sudo passwd sles
```

Important: Password authentication has been disabled

Under the default settings you always need your SSH key to access the machines. Even after setting a password for either the root or the sles user, you will be unable to log in via SSH using their respective passwords. You will most likely receive a Permission denied (publickey) error. Password login has been deliberately disabled in line with security best practices. However, if this setup does not fit your workflows, you can change it at your own risk by modifying the SSH configuration in /etc/ssh/sshd_config.

To allow password SSH authentication, set:

PasswordAuthentication yes

To allow login as root via SSH, set:

PermitRootLogin yes

For the changes to take effect, you need to restart the SSH service by running:

sudo systemctl restart sshd.service
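
If this change has to be applied to several nodes, editing the file by script avoids typing errors. The following sketch sets a keyword in an sshd_config-style file, replacing an existing (possibly commented-out) line or appending a new one. It assumes GNU sed and a file without `Match` blocks. The function name `set_sshd_option` is hypothetical.

```shell
# Hypothetical helper: set "Keyword value" in an sshd_config-style file.
# An existing line for the keyword (commented out or not) is rewritten;
# otherwise the setting is appended. Assumes GNU sed (-i, -E) and no
# "Match" blocks at the end of the file.
set_sshd_option() {
  file="$1"
  keyword="$2"
  value="$3"
  if grep -Eq "^[#[:space:]]*${keyword}[[:space:]]" "$file"; then
    sed -i -E "s|^[#[:space:]]*${keyword}[[:space:]].*|${keyword} ${value}|" "$file"
  else
    printf '%s %s\n' "$keyword" "$value" >> "$file"
  fi
}

# Example (as root):
# set_sshd_option /etc/ssh/sshd_config PasswordAuthentication yes
```

Running `sshd -t` before restarting the service checks the modified file for syntax errors.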

3.2.5 Container Runtime Proxy #

Important

CRI-O proxy settings must be adjusted on all nodes before joining the cluster!

Please refer to: https://documentation.suse.com/suse-caasp/4.5/html/caasp-admin/_miscellaneous.html#_configuring_httphttps_proxy_for_cri_o

In some environments you must configure the container runtime to access the container registries through a proxy. In this case, please refer to: SUSE CaaS Platform Admin Guide: Configuring HTTP/HTTPS Proxy for CRI-O

3.3 Deployment on VMware #

Note: Preparation Required

You must have completed Section 3.1, “Deployment Preparations” to proceed.

3.3.1 Environment Description #

Note

These instructions are based on VMware ESXi 6.7.

Important: Enable Jumbo Frames

The SUSE CaaS Platform cluster components use long request headers (in excess of 1300 Bytes) that might not fit into a default configuration. You must enable "Jumbo Frames" on the switch ports connecting to your ESXi hosts. Any MTU value above 1500 is considered a "Jumbo Frame" but minor increases can still be too small to hold all headers. Please set the maximum transmission unit (MTU) to 9000. This is the maximum value for vSphere and should cover all payloads.

To enable "Jumbo Frames" refer to: https://docs.vmware.com/en/VMware-vSphere/7.0/com.vmware.vsphere.networking.doc/GUID-53F968D9-2F91-41DA-B7B2-48394D997F2A.html

Please check with your hardware vendor that the network adapters used by your cluster support "Jumbo Frames".
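
Once the MTU has been raised, the path between two nodes can be tested with a ping that forbids fragmentation. The payload must leave room for the 20-byte IPv4 header and the 8-byte ICMP header. The following sketch computes that payload size. The function name is illustrative and the `ping` options are those of the Linux iputils implementation.

```shell
# Sketch: largest ICMP payload that fits into one unfragmented IPv4 packet
# for a given MTU (20 bytes IPv4 header + 8 bytes ICMP header).
icmp_payload_for_mtu() {
  echo $(( $1 - 20 - 8 ))
}

# Example, run between two cluster nodes (NODE_IP is a placeholder):
# ping -M do -c 3 -s "$(icmp_payload_for_mtu 9000)" NODE_IP
```

If the ping fails while a smaller payload succeeds, some device on the path does not pass frames of the configured size.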

Important

VMware vSphere doesn’t offer a load-balancer solution. Please expose port 6443 for the Kubernetes api-servers on the master nodes on a local load balancer using round-robin 1:1 port forwarding.

3.3.2 Choose a Setup Type #

You can deploy SUSE CaaS Platform onto existing VMware instances using AutoYaST or create a VMware template that will also create the instances. You must choose one method to deploy the entire cluster!

Please follow the instructions for your chosen method below and ignore the instructions for the other method.

3.3.3 Setup with AutoYaST #

Note

If you choose the AutoYaST method, please ignore all the following steps for the VMware template creation.

For each VM deployment, follow the AutoYaST installation method used for deployment on bare metal machines as described in Section 3.4, “Deployment on Bare Metal or KVM”.

3.3.4 Setup Using the VMware Template #

3.3.4.1 Choose a Disk Format for the Template #

Before creating the template, it is important to select the right format of the root hard disk for the node. This format is then used by default when creating new instances with Terraform.

For the majority of cases, we recommend using Thick Provision Lazy Zeroed. This format is quick to create, provides good performance, and avoids the risk of running out of disk space due to over-provisioning.

Note

It is not possible to resize a disk when using Thick Provision Eager Zeroed. For this reason, the Terraform variables master_disk_size and worker_disk_size must be set to the exact same size as in the original template.

Official VMware documentation describes these formats as follows:

Thick Provision Lazy Zeroed: Creates a virtual disk in a default thick format. Space required for the virtual disk is allocated when the disk is created. Data remaining on the physical device is not erased during creation, but is zeroed out on demand later on first write from the virtual machine. Virtual machines do not read stale data from the physical device.
Thick Provision Eager Zeroed: A type of thick virtual disk that supports clustering features such as Fault Tolerance. Space required for the virtual disk is allocated at creation time. In contrast to the thick provision lazy zeroed format, the data remaining on the physical device is zeroed out when the virtual disk is created. It might take longer to create virtual disks in this format than to create other types of disks. Increasing the size of an Eager Zeroed Thick virtual disk causes a significant stun time for the virtual machine.
Thin Provision: Use this format to save storage space. For the thin disk, you provision as much datastore space as the disk would require based on the value that you enter for the virtual disk size. However, the thin disk starts small and at first, uses only as much datastore space as the disk needs for its initial operations. If the thin disk needs more space later, it can grow to its maximum capacity and occupy the entire datastore space provisioned to it.

Important: Select the Disk Format Thoughtfully

It is not possible to change the format in Terraform later. Once you have selected one format, you can only create instances with that format and it is not possible to switch.

3.3.4.2 VM Preparation for Creating a Template #

Upload the ISO image SLE-15-SP2-Full-x86_64-GM-Media1.iso to the desired VMware datastore.

Now you can create a new base VM for SUSE CaaS Platform within the designated resource pool through the vSphere WebUI:

Create a "New Virtual Machine".
Define a name for the virtual machine (VM).
Select the folder where the VM will be stored.
Select a Compute Resource that will run the VM.
Select the storage to be used by the VM.
Select ESXi 6.7 and later from compatibility.
Select Guest OS Family › Linux and Guest OS Version › SUSE Linux Enterprise 15 (64-bit).
Note: You will manually select the correct installation medium in the next step.
Now customize the hardware settings.
1. Select CPU › 2.
2. Select Memory › 4096 MB.
3. Select New Hard disk › 40 GB. For New Hard disk › Disk Provisioning, see Section 3.3.4.1, “Choose a Disk Format for the Template” to select the appropriate disk format. For Thick Provision Eager Zeroed, use this disk size for the Terraform variables master_disk_size and worker_disk_size.
4. Select New SCSI Controller › LSI Logic Parallel SCSI controller (default) and change it to "VMware Paravirtualized".
5. Select New Network › VM Network, New Network › Adapter Type › VMXNET3.
  ("VM Network" sets up a bridged network which provides a public IP address reachable within a company.)
6. Select New CD/DVD › Datastore ISO File.
7. Check the box New CD/DVD › Connect At Power On to be able to boot from ISO/DVD.
8. Then click on "Browse" next to the CD/DVD Media field to select the downloaded ISO image on the desired datastore.
9. Go to the VM Options tab.
10. Select Boot Options.
11. Select Firmware › BIOS.
12. Confirm the process with Next.

3.3.4.2.1 SUSE Linux Enterprise Server Installation #

Power on the newly created VM and install the system over graphical remote console:

Enter registration code for SUSE Linux Enterprise in YaST.
Confirm the update repositories prompt with "Yes".
Remove the check mark in the "Hide Development Versions" box.
Make sure the following modules are selected on the "Extension and Module Selection" screen:
- SUSE CaaS Platform 4.0 x86_64
  Note
  Due to a naming convention conflict, all versions of SUSE CaaS Platform 4.x up to 4.5 will be released in the 4.0 module. Starting with 4.5 the product will be delivered in the 4.5 module.
- Basesystem Module
- Containers Module (this will automatically be checked when you select SUSE CaaS Platform)
- Public Cloud Module
Enter the registration code to unlock the SUSE CaaS Platform extension.
Select System Role › Minimal on the "System Role" screen.
Click on "Expert Partitioner" to redesign the default partition layout.
Select "Start with current proposal".
1. Keep sda1 as BIOS partition.
2. Remove the root / partition.
  Select the device in "System View" on the left (default: /dev/sda2) and click "Delete". Confirm with "Yes".
3. Remove the /home partition.
4. Remove the swap partition.
Select the /dev/sda device in "System View" and then click Partitions › Add Partition.
Accept the default maximum size (remaining size of the hard disk defined earlier without the boot partition).
1. Confirm with "Next".
2. Select Role › Operating System
3. Confirm with "Next".
4. Accept the default settings.
  - Filesystem: BtrFS
  - Enable Snapshots
  - Mount Device
  - Mount Point /
You should be left with two partitions. Now click "Accept".
Confirm the partitioning changes.
Click "Next".
Configure your timezone and click "Next".
Create a user with the username sles and specify a password.
1. Check the box Local User › Use this password for system administrator.
Click "Next".
On the "Installation Settings" screen:
1. In the "Security" section:
  1. Disable the Firewall (click on (disable)).
  2. Enable the SSH service (click on (enable)).
2. Scroll to the kdump section of the software description and click on the title.
In the "Kdump Start-Up" screen, select Enable/Disable Kdump › Disable Kdump.
Note
Kdump needs to be disabled because it defines a certain memory limit. If you later wish to deploy the template on a machine with different memory allocation (e.g. template created for 4GB, new machine has 2GB), the results of Kdump will be useless.
You can always configure Kdump on the machine after deploying from the template.
Refer to: SUSE Linux Enterprise Server 15 SP2 System Analysis and Tuning Guide: Basic Kdump Configuration
1. Confirm with "OK".
Click "Install". Confirm the installation by clicking "Install" in the pop-up dialog.
Finish the installation and confirm system reboot with "OK".

3.3.4.3 Preparation of the VM as a Template #

In order to run SUSE CaaS Platform on the created VMs, you must configure and install some additional packages like sudo, cloud-init and open-vm-tools.

Tip: Activate extensions during SUSE Linux Enterprise installation with YaST

Steps 1-4 may be skipped if they were already performed in YaST during the SUSE Linux Enterprise installation.

Register the SUSE Linux Enterprise Server 15 SP2 system. Substitute <CAASP_REGISTRATION_CODE> for the code from Section 3.1.2, “Registration Code”.
```
SUSEConnect -r <CAASP_REGISTRATION_CODE>
```

Register the Containers module:

SUSEConnect -p sle-module-containers/15.2/x86_64

Register the Public Cloud module for basic cloud-init package (free of charge):
```
SUSEConnect -p sle-module-public-cloud/15.2/x86_64
```
Register the SUSE CaaS Platform module. Substitute <CAASP_REGISTRATION_CODE> for the code from Section 3.1.2, “Registration Code”.
```
SUSEConnect -p caasp/4.5/x86_64 -r <CAASP_REGISTRATION_CODE>
```

Install required packages. As root, run:

zypper in sudo cloud-init cloud-init-vmware-guestinfo open-vm-tools

Enable the installed cloud-init services. As root, run:

systemctl enable cloud-init cloud-init-local cloud-config cloud-final

Deregister from SUSE Customer Center (SCC):
```
SUSEConnect -d; SUSEConnect --cleanup
```

Do a cleanup of the SLE image for converting into a VMware template. As root, run:

rm /etc/machine-id /var/lib/zypp/AnonymousUniqueId \
/var/lib/systemd/random-seed /var/lib/dbus/machine-id \
/var/lib/wicked/*
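
To confirm the cleanup took effect, you can list which of the files named above still exist. The following is a minimal sketch, not part of the product tooling. The function name `template_leftovers` and its optional root-prefix argument are assumptions introduced for illustration.

```shell
# Print every unique-ID file from the cleanup step that still exists.
# The optional first argument is a root prefix (hypothetical helper
# feature, useful for checking a mounted image instead of the live system).
template_leftovers() {
    root="${1:-}"
    for f in /etc/machine-id \
             /var/lib/zypp/AnonymousUniqueId \
             /var/lib/systemd/random-seed \
             /var/lib/dbus/machine-id; do
        if [ -e "${root}${f}" ]; then
            echo "${f}"
        fi
    done
    # Any file left in the wicked state directory also counts as a leftover.
    for f in "${root}"/var/lib/wicked/*; do
        if [ -e "${f}" ]; then
            echo "${f#"${root}"}"
        fi
    done
}
```

Empty output from `template_leftovers` means none of the listed files remain.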

Clean up btrfs snapshots and create one with initial state:

snapper list
snapper delete <list_of_nums_of_unneeded_snapshots>
snapper create -d "Initial snapshot for caasp template" -t single

Power down the VM. As root, run:
```
shutdown -h now
```

3.3.4.4 Creating the VMware Template #

Now you can convert the VM into a template in VMware (or repeat this action block for each VM).

In the vSphere WebUI, right-click on the VM and select Template › Convert to Template. Name it reasonably so you can later identify the template. The template will be created.

3.3.4.5 Deploying VMs from the Template #

3.3.4.5.1 Using Terraform #

Find the Terraform template files for VMware in /usr/share/caasp/terraform/vmware, which were installed as part of the management pattern (sudo zypper in patterns-caasp-Management). Copy this folder to a location of your choice, as the files need to be adjusted.
```
mkdir -p ~/caasp/deployment/
cp -r /usr/share/caasp/terraform/vmware/ ~/caasp/deployment/
cd ~/caasp/deployment/vmware/
```
Once the files are copied, rename the terraform.tfvars.example file to terraform.tfvars:
```
mv terraform.tfvars.example terraform.tfvars
```
Edit the terraform.tfvars file and add/modify the following variables:

# datastore to use in vSphere
vsphere_datastore = "STORAGE-0" 1

# datastore cluster to use in vSphere
vsphere_datastore_cluster = "STORAGE-CLUSTER-0" 2

# datacenter to use in vSphere
vsphere_datacenter = "DATACENTER" 3

# network to use in vSphere
vsphere_network = "VM Network" 4

# resource pool the machines will be running in
vsphere_resource_pool = "esxi1/Resources" 5

# template name the machines will be copied from
template_name = "sles15-sp2-caasp" 6

# IMPORTANT: Replace by "efi" string in case your template was created by using EFI firmware
firmware = "bios"

# prefix that all of the booted machines will use
# IMPORTANT: please enter unique identifier below as value of
# stack_name variable to not interfere with other deployments
stack_name = "caasp" 7

# Number of master nodes
masters = 1 8

# Optional: Size of the root disk in GB on master node
master_disk_size = 50 9

# Number of worker nodes
workers = 2 10

# Optional: Size of the root disk in GB on worker node
worker_disk_size = 40 11

# Username for the cluster nodes. Must exist on base OS.
username = "sles" 12

# Optional: Define the repositories to use
# repositories = {
#   repository1 = "http://repo.example.com/repository1/"
#   repository2 = "http://repo.example.com/repository2/"
# }
repositories = {} 13

# Minimum required packages. Do not remove them.
# Feel free to add more packages
packages = [ 14
]

# ssh keys to inject into all the nodes
authorized_keys = [ 15
  "ssh-rsa <example_key> example@example.com"
]

# IMPORTANT: Replace these ntp servers with ones from your infrastructure
ntp_servers = ["0.example.ntp.org", "1.example.ntp.org", "2.example.ntp.org", "3.example.ntp.org"] 16

# Controls whether or not the guest network waiter waits for a routable address.
# Default is True and should not be changed unless you hit the upstream bug: https://github.com/hashicorp/terraform-provider-vsphere/issues/1127
wait_for_guest_net_routable = true 17

Important

Only one of vsphere_datastore or vsphere_datastore_cluster can be set at a time. Comment out or delete the unused one in your terraform.tfvars file.

1	`vsphere_datastore`: The datastore to use. This option is mutually exclusive with `vsphere_datastore_cluster`.
2	`vsphere_datastore_cluster`: The datastore cluster to use. This option is mutually exclusive with `vsphere_datastore`.
3	`vsphere_datacenter`: The datacenter to use.
4	`vsphere_network`: The network to use.
5	`vsphere_resource_pool`: The root resource pool or a user-created child resource pool. Refer to https://docs.vmware.com/en/VMware-vSphere/6.7/com.vmware.vsphere.resmgmt.doc/GUID-60077B40-66FF-4625-934A-641703ED7601.html for detailed information.
6	`template_name`: The name of the template created according to instructions.
7	`stack_name`: Prefix for all machines of the cluster spawned by terraform. Note: This string will be used to generate the human readable IDs in SUSE OpenStack Cloud. If you use a generic term, deployment is very likely to fail because the term is already in use by someone else. It’s a good idea to use your username or some other unique identifier.
8	`masters`: Number of master nodes to be deployed.
9	`master_disk_size`: Size of the root disk in GB. Note: The value must be at least the same size as the source template. It is only possible to increase the size of a disk.
10	`workers`: Number of worker nodes to be deployed.
11	`worker_disk_size`: Size of the root disk in GB. Note: The value must be at least the same size as the source template. It is only possible to increase the size of a disk.
12	`username`: Login username for the nodes. Note: Leave this as the default `sles`. The username must exist on the used base operating system. It will not be created.
13	`repositories`: A list of additional repositories to be added on each machine. Leave empty if no additional packages need to be installed.
14	`packages`: Additional packages to be installed on the node. Note: Do not remove any of the pre-filled values in the `packages` section. This can render your cluster unusable. You can add more packages but do not remove any of the default packages listed.
15	`authorized_keys`: List of ssh-public-keys that will be able to log in to the deployed machines.
16	`ntp_servers`: A list of `ntp` servers you would like to use with `chrony`.
17	`wait_for_guest_net_routable`: true or false to disable the routable check. Default is true. Note: This should only be changed if terraform times out while creating VMs as mentioned on the upstream bug: https://github.com/hashicorp/terraform-provider-vsphere/issues/1127
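
Because `vsphere_datastore` and `vsphere_datastore_cluster` are mutually exclusive, it can help to check the file before running Terraform. The following is a minimal sketch. The function name `check_datastore_choice` is hypothetical, and treating "neither variable set" as an error is an assumption, not something the guide states.

```shell
# Succeed only if exactly one of the two mutually exclusive datastore
# variables is set (uncommented) in the given tfvars file.
check_datastore_choice() {
    tfvars="$1"
    # Count uncommented assignments. The pattern requires "=" right after
    # the name, so vsphere_datastore does not match vsphere_datastore_cluster.
    ds=$(grep -c -E '^[[:space:]]*vsphere_datastore[[:space:]]*=' "$tfvars" || true)
    dsc=$(grep -c -E '^[[:space:]]*vsphere_datastore_cluster[[:space:]]*=' "$tfvars" || true)
    if [ "$ds" -ge 1 ] && [ "$dsc" -ge 1 ]; then
        echo "both vsphere_datastore and vsphere_datastore_cluster are set" >&2
        return 1
    fi
    if [ "$ds" -eq 0 ] && [ "$dsc" -eq 0 ]; then
        # Assumption: one of the two is needed for a deployment.
        echo "neither vsphere_datastore nor vsphere_datastore_cluster is set" >&2
        return 1
    fi
    return 0
}
```

For example, `check_datastore_choice terraform.tfvars && terraform plan` would only proceed to planning when the choice is unambiguous.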

Enter the registration code for your nodes in ~/caasp/deployment/vmware/registration.auto.tfvars:
Substitute <CAASP_REGISTRATION_CODE> for the code from Section 3.1.2, “Registration Code”.
```
# SUSE CaaSP Product Key
caasp_registry_code = "<CAASP_REGISTRATION_CODE>"
```
This is required so all the deployed nodes can automatically register with SUSE Customer Center and retrieve packages.
You can also enable Cloud Provider Integration with vSphere.
```
# Enable CPI integration with vSphere
cpi_enable = true
```
When CPI is enabled, hostnames set by the DHCP server are automatically disabled. Each vSphere virtual machine’s hostname is then set according to a naming convention ("<stack_name>-master-<index>" or "<stack_name>-worker-<index>"). This hostname can be used as the node name when bootstrapping or joining nodes with the skuba command.
Important
Each virtual machine’s hostname must match its cluster node name.
```
# Set node's hostname from DHCP server
hostname_from_dhcp = false
```
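The naming convention above can be sketched as a small helper that prints the hostnames to expect for a given deployment. The function name `expected_node_names` is hypothetical. The sketch assumes that `<index>` starts at 0, as Terraform's `count.index` does; the guide does not state the starting index.

```shell
# Print the hostnames expected for a deployment, one per line.
# Arguments: stack name, number of masters, number of workers.
# Assumption: <index> starts at 0 (not stated in the guide).
expected_node_names() {
    stack_name="$1"
    masters="$2"
    workers="$3"
    i=0
    while [ "$i" -lt "$masters" ]; do
        echo "${stack_name}-master-${i}"
        i=$((i + 1))
    done
    i=0
    while [ "$i" -lt "$workers" ]; do
        echo "${stack_name}-worker-${i}"
        i=$((i + 1))
    done
}
```

With the example values from terraform.tfvars, `expected_node_names caasp 1 2` lists one master name and two worker names that can be compared with the hostnames of the deployed machines.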
Once the files are adjusted, terraform needs to know about the vSphere server and the login details for it; these can be exported as environment variables or entered every time terraform is invoked.
Additionally, the ssh-key that is specified in the tfvars file must be added to the key agent so the machine running skuba can ssh into the machines:
```
export VSPHERE_SERVER="<server_address>"
export VSPHERE_USER="<username>"
export VSPHERE_PASSWORD="<password>"
export VSPHERE_ALLOW_UNVERIFIED_SSL=true # In case you are using custom certificate for accessing vsphere API

ssh-add <path_to_private_ssh_key_from_tfvars>
```
Warning: Specify a key expiration time
The ssh key is decrypted when loaded into the key agent. Though the key itself is not accessible, anyone with access to the agent’s control socket file can use the private key contents to impersonate the key owner. By default, socket access is limited to the user who launched the agent. Nonetheless, it is still good security practice to specify an expiration time for the decrypted key using the -t option.
For example: ssh-add -t 1h30m $HOME/.ssh/id.ecdsa would expire the decrypted key in 1.5 hours. See man ssh-agent and man ssh-add for more information.
Run Terraform to create the required machines for use with skuba:
```
terraform init
terraform plan
terraform apply
```

3.3.4.5.2 Setup by Hand #

Note

Full instructions for the manual setup and configuration are currently not in scope of this document.

Deploy the template to your created VMs. After that, boot into the node and configure the OS as needed.

Power on the newly created VMs
Generate new machine IDs on each node
You need to know the FQDN/IP for each of the created VMs during the bootstrap process
Continue with bootstrapping/joining of nodes

Tip

To manually generate the unique machine-id please refer to: Important: Regenerating Machine ID.

3.3.5 Container Runtime Proxy #

Important

CRI-O proxy settings must be adjusted on all nodes before joining the cluster!

3.4 Deployment on Bare Metal or KVM #

Note: Preparation Required

You must have completed Section 3.1, “Deployment Preparations” to proceed.

Note

If deploying on KVM virtual machines, you may use a tool such as virt-manager to configure the virtual machines and begin the SUSE Linux Enterprise Server 15 SP2 installation.

3.4.1 Environment Description #

Note

You must have a load balancer configured as described in Section 3.1.5, “Load Balancer”.

Note

The AutoYaST file found in skuba is a template. It has the base requirements. This AutoYaST file should act as a guide and should be updated with your company’s standards.

Note

To account for hardware/platform-specific setup criteria (legacy BIOS vs. (U)EFI, drive partitioning, networking, etc.), you must adjust the AutoYaST file to your needs according to the requirements.

Refer to the official AutoYaST documentation for more information: AutoYaST Guide.

3.4.1.1 Machine Configuration Prerequisites #

Deployment with AutoYaST will require a minimum disk size of 40 GB. That space is reserved for container images without any workloads (10 GB), for the root partition (30 GB) and the EFI system partition (200 MB).
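
The 40 GB minimum can be checked against a disk size in bytes before starting the installation. The following is a minimal sketch. The names `MIN_DISK_GB` and `disk_large_enough` are hypothetical, and the sketch assumes 1 GB = 1024^3 bytes.

```shell
# Minimum disk size in GB, as stated in the prerequisites above.
MIN_DISK_GB=40

# Succeed if the given size in bytes is at least MIN_DISK_GB.
# Assumption: 1 GB is counted as 1024^3 bytes.
disk_large_enough() {
    size_bytes="$1"
    min_bytes=$((MIN_DISK_GB * 1024 * 1024 * 1024))
    [ "$size_bytes" -ge "$min_bytes" ]
}

# Example usage on a running system (device name /dev/sda is an assumption):
#   disk_large_enough "$(lsblk -b -d -n -o SIZE /dev/sda)" || echo "disk too small"
```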

3.4.2 AutoYaST Preparation #

On the management machine, get an example AutoYaST file from /usr/share/caasp/autoyast/bare-metal/autoyast.xml, which was installed earlier as part of the management pattern (sudo zypper in -t pattern SUSE-CaaSP-Management).
Copy the file to a suitable location to modify it. Name the file autoyast.xml.
Modify the following places in the AutoYaST file (and any additional places as required by your specific configuration/environment):
1. <ntp-client>
  Change the pre-filled value to your organization’s NTP server. Provide multiple servers if possible by adding new <ntp_server> subentries.
2. <timezone>
  Adjust the timezone your nodes will be set to. Refer to: SUSE Linux Enterprise Server AutoYaST Guide: Country Settings
3. <username>sles</username>
  Insert your authorized key in the placeholder field.
4. <users>
  You can add additional users by creating new blocks in the configuration containing their data.
  Note
  If the users are configured without a password, as in the example, ensure the system’s sudoers file is updated. Without this update, the user will only be able to perform basic operations, which rules out many administrative tasks.
  The default AutoYaST file provides examples for a disabled root user and a sles user with authorized key SSH access.
  The password for root can be enabled by using the passwd command.
5. <suse_register>
  Insert the email address and SUSE CaaS Platform registration code in the placeholder fields. This activates SUSE Linux Enterprise Server 15 SP2.
6. <addon>
  Insert the SUSE CaaS Platform registration code in the placeholder field. This enables the SUSE CaaS Platform extension module. Update the AutoYaST file with your registration keys and your company’s best practices and hardware configurations.
  Note
  Your SUSE CaaS Platform registration key can be used to both activate SUSE Linux Enterprise Server 15 SP2 and enable the extension.
Refer to the official AutoYaST documentation for more information: AutoYaST Guide.
Host the AutoYaST files on a Web server reachable inside the network you are installing the cluster in.

3.4.2.1 Deploying with local Repository Mirroring Tool (RMT) server #

In order to use a local Repository Mirroring Tool (RMT) server for deployment of packages, you need to specify the server configuration in your AutoYaST file. To do so add the following section:

<suse_register>
<do_registration config:type="boolean">true</do_registration>
<install_updates config:type="boolean">true</install_updates>

<reg_server>https://rmt.example.org</reg_server> 1
<reg_server_cert>https://rmt.example.org/rmt.crt</reg_server_cert> 2
<reg_server_cert_fingerprint_type>SHA1</reg_server_cert_fingerprint_type>
<reg_server_cert_fingerprint>0C:A4:A1:06:AD:E2:A2:AA:D0:08:28:95:05:91:4C:07:AD:13:78:FE</reg_server_cert_fingerprint> 3
<slp_discovery config:type="boolean">false</slp_discovery>
<addons config:type="list">
  <addon>
    <name>sle-module-containers</name>
    <version>15.2</version>
    <arch>x86_64</arch>
  </addon>
  <addon>
    <name>sle-module-public-cloud</name>
    <version>15.2</version>
    <arch>x86_64</arch>
  </addon>
  <addon>
    <name>caasp</name>
    <version>4.5</version>
    <arch>x86_64</arch>
  </addon>
</addons>
</suse_register>

1	Provide FQDN of the Repository Mirroring Tool (RMT) server
2	Provide the location on the server where the certificate can be found
3	Provide the certificate fingerprint for the Repository Mirroring Tool (RMT) server
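
A mistyped fingerprint is easy to miss, so a quick format check before putting the value into the AutoYaST file can help. The following is a minimal sketch. The function name `is_sha1_fingerprint` and the certificate file name `rmt.crt` in the comment are assumptions.

```shell
# Succeed if the argument looks like a SHA1 fingerprint:
# 20 colon-separated bytes, each written as two hex digits.
is_sha1_fingerprint() {
    printf '%s\n' "$1" | grep -E -q '^([0-9A-Fa-f]{2}:){19}[0-9A-Fa-f]{2}$'
}

# To read the fingerprint from a downloaded certificate
# (the file name rmt.crt is an assumption):
#   openssl x509 -noout -fingerprint -sha1 -in rmt.crt
```

The check only validates the format. It does not confirm that the fingerprint belongs to your RMT server.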

3.4.3 Provisioning the Cluster Nodes #

Once the AutoYaST file is available in the network that the machines will be configured in, you can start deploying machines.

The default production scenario consists of 6 nodes:

1 load balancer
3 masters
2 workers

Depending on the type of load balancer you wish to use, you need to deploy at least 5 machines to serve as cluster nodes and provide a load balancer from the environment.

The load balancer must point at the machines that are assigned to be used as master nodes in the future cluster.

Tip

If you do not wish to use infrastructure load balancers, please deploy additional machines and refer to Section 3.1.5, “Load Balancer”.

Install SUSE Linux Enterprise Server 15 SP2 from your preferred medium and follow the steps for Invoking the Auto-Installation Process.

Provide autoyast=https://[webserver/path/to/autoyast.xml] during the SUSE Linux Enterprise Server 15 SP2 installation.

3.4.3.1 SUSE Linux Enterprise Server Installation #

Note

Use AutoYaST and make sure to use a staged frozen patchlevel via RMT/SUSE Manager to ensure a 100% reproducible setup. RMT Guide

Once the machines have been installed using the AutoYaST file, you are ready to proceed with Chapter 4, Bootstrapping the Cluster.

3.4.4 Container Runtime Proxy #

Important

CRI-O proxy settings must be adjusted on all nodes before joining the cluster!

Please refer to: https://documentation.suse.com/suse-caasp/4.5/html/caasp-admin/_miscellaneous.html#_configuring_httphttps_proxy_for_cri_o

3.5 Deployment on Existing SLES Installation #

If you already have a running SUSE Linux Enterprise Server 15 SP2 installation, you can add SUSE CaaS Platform to this installation using SUSE Connect. You also need to enable the "Containers" and "Public Cloud" modules because they contain some dependencies required by SUSE CaaS Platform.

3.5.1 Requirements #

Note: Preparation Required

You must have completed Section 3.1, “Deployment Preparations” to proceed.

3.5.1.1 Dedicated Cluster Nodes #

Important

Adding a machine with an existing use case (e.g. web server) as a cluster node is not supported!

SUSE CaaS Platform requires dedicated machines as cluster nodes.

The instructions in this document are meant to add SUSE CaaS Platform to an existing SUSE Linux Enterprise installation that has no other active use case.

For example: You have installed a machine with SUSE Linux Enterprise but it has not yet been commissioned to run a specific application and you decide now to make it a SUSE CaaS Platform cluster node.

3.5.1.2 Disabling Swap #

When using a pre-existing SUSE Linux Enterprise installation, swap will be enabled. You must disable swap for all cluster nodes before performing the cluster bootstrap.

On all nodes that are meant to join the cluster, run:

sudo swapoff -a

Then modify /etc/fstab on each node to remove the swap entries.

Important

It is recommended to reboot the machine to finalize these changes and prevent accidental reactivation of swap during an automated reboot of the machine later on.
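
Removing the swap entries from /etc/fstab can be sketched as follows. This version comments the entries out instead of deleting them and keeps a backup copy, which is a design choice of the sketch and not a requirement from the guide. The function names `disable_swap_entries` and `no_swap_entries` are hypothetical.

```shell
# Comment out every active swap entry in the given fstab file,
# keeping a backup copy next to it with the suffix .bak.
disable_swap_entries() {
    fstab="${1:-/etc/fstab}"
    # Match uncommented lines whose third field (filesystem type) is "swap".
    sed -i.bak -E \
        's/^([[:space:]]*[^#[:space:]][^[:space:]]*[[:space:]]+[^[:space:]]+[[:space:]]+swap([[:space:]].*)?)$/# \1/' \
        "$fstab"
}

# Succeed if the file has no active swap entry left.
no_swap_entries() {
    fstab="${1:-/etc/fstab}"
    ! awk '$1 !~ /^#/ && $3 == "swap" { found = 1 } END { exit !found }' "$fstab"
}
```

Run `disable_swap_entries` as root, then confirm the result with `no_swap_entries` before rebooting.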

3.5.2 Adding SUSE CaaS Platform repositories #

Retrieve your SUSE CaaS Platform registration code and run the following. Substitute <CAASP_REGISTRATION_CODE> for the code from Section 3.1.2, “Registration Code”.

SUSEConnect -p sle-module-containers/15.2/x86_64
SUSEConnect -p sle-module-public-cloud/15.2/x86_64

SUSEConnect -p caasp/4.5/x86_64 -r <CAASP_REGISTRATION_CODE>

Repeat all preparation steps for any cluster nodes you wish to join. You can then proceed with Chapter 4, Bootstrapping the Cluster.

3.6 Deployment on Amazon Web Services (AWS) #

Deployment on Amazon Web Services (AWS) is currently a tech preview.

Note: Preparation Required

You must have completed Section 3.1, “Deployment Preparations” to proceed.

You will use Terraform to deploy the whole infrastructure described in Section 3.6.1, “AWS Deployment”. Then you will use the skuba tool to bootstrap the Kubernetes cluster on top of it.

3.6.1 AWS Deployment #

The AWS deployment created by our Terraform template files leads to the creation of the infrastructure described in the next paragraphs.

3.6.1.1 Network #

All of the infrastructure is created inside of a user specified AWS region. The resources are currently all located inside of the same availability zone.

The Terraform template files create a dedicated Amazon Virtual Private Cloud (VPC) with two subnets: "public" and "private". Instances inside of the public subnet have Elastic IP addresses associated, hence they are reachable from the internet. Instances inside of the private subnet are not reachable from the internet. However, they can still reach external resources; for example, they can still perform operations like downloading updates and pulling container images from external container registries. Communication between the public and the private subnet is allowed. All the control plane instances are currently located inside of the public subnet. Worker instances are inside of the private subnet.

Both control plane and worker nodes have tailored Security Groups assigned to them. These are based on the networking requirements described in Section 1.4, “Networking”.

3.6.1.2 Load Balancer #

The Terraform template files take care of creating a Classic Load Balancer which exposes the Kubernetes API service deployed on the control plane nodes.

The load balancer exposes the following ports:

6443: Kubernetes API server
32000: Dex (OIDC Connect)
32001: Gangway (RBAC Authenticate)

3.6.1.3 Join Already Existing VPCs #

The Terraform template files allow the user to have the SUSE CaaS Platform VPC join one or more existing VPCs.

This is achieved by the creation of VPC peering links and dedicated Route tables.

This feature allows SUSE CaaS Platform to access and be accessed by resources defined inside of other VPCs. For example, this capability can be used to register all the SUSE CaaS Platform instances against a SUSE Manager server running inside of a private VPC.

Current limitations:

The VPCs must belong to the same AWS region.
The VPCs must be owned by the same user who is creating the SUSE CaaS Platform infrastructure via Terraform.

3.6.1.4 IAM Profiles #

The AWS Cloud Provider integration for Kubernetes requires special IAM profiles to be associated with the control plane and worker instances. Terraform can create these profiles or can leverage existing ones. It all depends on the rights of the user invoking Terraform.

The Terraform AWS provider requires your credentials. These can be obtained by following these steps:

Log in to the AWS console.
Click on your username in the upper right hand corner to reveal the drop-down menu.
Click on My Security Credentials.
Click Create Access Key on the "Security Credentials" tab.
Note down the newly created Access and Secret keys.

3.6.2 Deploying the Infrastructure #

On the management machine, find the Terraform template files for AWS in /usr/share/caasp/terraform/aws. These files have been installed as part of the management pattern (sudo zypper in -t pattern SUSE-CaaSP-Management).

Copy this folder to a location of your choice as the files need adjustment.

mkdir -p ~/caasp/deployment/
cp -r /usr/share/caasp/terraform/aws/ ~/caasp/deployment/
cd ~/caasp/deployment/aws/

Once the files are copied, copy the terraform.tfvars.example file to terraform.tfvars:

cp terraform.tfvars.example terraform.tfvars

Edit the terraform.tfvars file and add/modify the following variables:

# prefix that all of the booted machines will use
# IMPORTANT, please enter unique identifier below as value of
# stack_name variable to not interfere with other deployments
stack_name = "caasp" 1

# Number of master nodes
masters = 1 2

# Number of worker nodes
workers = 2 3

# ssh keys to inject into all the nodes
# EXAMPLE:
# authorized_keys = [
#   "ssh-rsa <key-content>"
# ]
authorized_keys = [ 4
  "ssh-rsa <example_key> example@example.com"
]

# To register the CaaSP product, use ONLY ONE of the following methods
#
# SUSE CaaSP Product Key:
#caasp_registry_code = ""  5
#
# SUSE Repository Mirroring Server Name (FQDN):
#rmt_server_name = "rmt.example.com"  6

# List of VPC IDs to join via VPC peer link
#peer_vpc_ids = ["vpc-id1", "vpc-id2"] 7

# Name of the IAM profile to associate to control plane nodes
# Leave empty to have terraform create one.
# This is required to have AWS CPI support working properly.
#
# Note well: you must have the right set of permissions.
# iam_profile_master = "caasp-k8s-master-vm-profile" 8

# Name of the IAM profile to associate to worker nodes.
# Leave empty to have terraform create one.
# This is required to have AWS CPI support working properly.
#
# Note well: you must have the right set of permissions.
#iam_profile_worker = "caasp-k8s-worker-vm-profile" 9

1	`stack_name`: Prefix for all machines of the cluster spawned by terraform.
2	`masters`: Number of master nodes to be deployed.
3	`workers`: Number of worker nodes to be deployed.
4	`authorized_keys`: List of ssh-public-keys that will be able to log into the deployed machines.
5	`caasp_registry_code`: SUSE CaaS Platform Product Key for registering the product against SUSE Customer Center.
6	`rmt_server_name`: Register against a local SUSE Repository Mirroring Server.
7	`peer_vpc_ids`: List of already existing VPCs to join via dedicated VPC peering links.
8	`iam_profile_master`: Name of the IAM profile to associate with the control plane instance. Leave empty to have Terraform create it.
9	`iam_profile_worker`: Name of the IAM profile to associate with the worker instances. Leave empty to have Terraform create it.

Tip

You can set timezone and other parameters before deploying the nodes by modifying the cloud-init template:

~/caasp/deployment/aws/cloud-init/cloud-init.yaml.tpl

You can enter the registration code for your nodes in ~/caasp/deployment/aws/registration.auto.tfvars instead of the terraform.tfvars file.

Substitute <CAASP_REGISTRATION_CODE> for the code from Section 3.1.2, “Registration Code”.

# SUSE CaaSP Product Key
caasp_registry_code = "<CAASP_REGISTRATION_CODE>"

This is required so all the deployed nodes can automatically register with SUSE Customer Center and retrieve packages.

The last step before deploying is to provide the credentials as environment variables that Terraform will automatically retrieve:

AWS_ACCESS_KEY_ID: This is the AWS access key.
AWS_SECRET_ACCESS_KEY: This is the AWS secret key.
AWS_DEFAULT_REGION: This is the AWS region. A list of region names can be found in the official AWS documentation.

To do so, export the following variables. For security reasons, turn off bash history while doing so:

set +o history
export AWS_ACCESS_KEY_ID="AKIAIOSFODNN7EXAMPLE"
export AWS_SECRET_ACCESS_KEY="wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY"
export AWS_DEFAULT_REGION="eu-central-1"
set -o history

The credentials can also be stored in a file, for example aws-credentials:

AWS_ACCESS_KEY_ID="AKIAIOSFODNN7EXAMPLE"
AWS_SECRET_ACCESS_KEY="wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY"
AWS_DEFAULT_REGION="eu-central-1"

And sourced:

set -a; source aws-credentials; set +a
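
Before running Terraform, you can verify that the three variables are really exported into the environment. The following is a minimal sketch. The function name `require_env` is hypothetical. It reports only the names of missing variables, never their values.

```shell
# Succeed only if every named variable is exported and non-empty.
# Names of missing variables go to stderr; values are never printed.
require_env() {
    missing=0
    for name in "$@"; do
        # printenv only sees exported variables, which is what
        # a child process such as terraform will see as well.
        value=$(printenv "$name" || true)
        if [ -z "$value" ]; then
            echo "missing: ${name}" >&2
            missing=1
        fi
    done
    [ "$missing" -eq 0 ]
}
```

For example, `require_env AWS_ACCESS_KEY_ID AWS_SECRET_ACCESS_KEY AWS_DEFAULT_REGION && terraform plan` only starts planning when all three are present.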

Now you can deploy the nodes by running:

terraform init
terraform plan
terraform apply

Check the output for the actions to be taken. Type "yes" and confirm with Enter when ready. Terraform will now provision all the cluster infrastructure.

Important: Public IPs for Nodes

skuba currently cannot access nodes through a bastion host, so all the nodes in the cluster must be directly reachable from the machine where skuba is being run. skuba could be run from one of the master nodes or from a pre-existing bastion host located inside of a joined VPC as described in Section 3.6.1.3, “Join Already Existing VPCs”.

Important: Note Down IP/FQDN For the Nodes

The IP addresses and FQDNs of the generated machines will be displayed in the Terraform output during the cluster node deployment. You need this information later to deploy SUSE CaaS Platform.

This information can be obtained at any time by executing the terraform output command within the directory from which you executed Terraform.

3.6.3 Logging into the Cluster Nodes #

Connecting to the cluster nodes can be accomplished only via SSH key-based authentication thanks to the ssh-public key injection done earlier via cloud-init. You can use the predefined ec2-user user to log in.

If the ssh-agent is running in the background, run:

ssh ec2-user@<node-ip-address>

Without the ssh-agent running, run:

ssh ec2-user@<node-ip-address> -i <path-to-your-ssh-private-key>

Once connected, you can execute commands using password-less sudo.

4 Bootstrapping the Cluster #

4.1 Preparation
4.2 Cluster Deployment
4.3 Using kubectl

Bootstrapping the cluster is the initial process of starting up the cluster and defining which of the nodes are masters and which are workers. For maximum automation of this process, SUSE CaaS Platform uses the skuba package.

4.1 Preparation #

4.1.1 Install `skuba` #

First you need to install skuba on a management machine, like your local workstation:

Add the SLE15 SP2 extension containing skuba. This also requires the "containers" and the "public cloud" modules.

SUSEConnect -p sle-module-containers/15.2/x86_64
SUSEConnect -p sle-module-public-cloud/15.2/x86_64
SUSEConnect -p caasp/4.5/x86_64 -r <PRODUCT_KEY>

Install the management pattern with:

zypper in -t pattern SUSE-CaaSP-Management

Tip

Example deployment configuration files for each deployment scenario are installed under /usr/share/caasp/terraform/, or in case of the bare metal deployment: /usr/share/caasp/autoyast/.

4.1.2 Container Runtime Proxy #

Important

CRI-O proxy settings must be adjusted manually on all nodes before joining the cluster!

4.2 Cluster Deployment #

Make sure you have added the SSH identity (corresponding to the public SSH key distributed above) to the ssh-agent on your workstation. For instructions on how to add the SSH identity, refer to Section 3.1.1, “Basic SSH Key Configuration”.

This is a requirement for skuba (https://github.com/SUSE/skuba#prerequisites).

By default skuba connects to the nodes as root user. A different user can be specified by the following flags:

--sudo --user <USERNAME>

Important

You must configure sudo for the user to be able to authenticate without password. Replace <USERNAME> with the user you created during installation. As root, run:

echo "<USERNAME> ALL=(ALL) NOPASSWD: ALL" >> /etc/sudoers
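
As an alternative sketch to appending to /etc/sudoers directly, the same rule can be generated, syntax-checked, and installed as a drop-in file. This assumes the system's sudoers configuration includes /etc/sudoers.d, which is not stated in this guide. The function name `sudoers_rule` and the file name `caasp-user` are hypothetical.

```shell
# Print a sudoers rule granting password-less sudo to the given user.
sudoers_rule() {
    printf '%s ALL=(ALL) NOPASSWD: ALL\n' "$1"
}

# Example usage as root (assumes /etc/sudoers.d is included by sudoers):
#   sudoers_rule sles > /tmp/caasp-user
#   visudo -c -f /tmp/caasp-user && install -m 0440 /tmp/caasp-user /etc/sudoers.d/caasp-user
```

Checking the file with `visudo -c -f` first avoids installing a rule with a syntax error, which could otherwise break sudo for all users.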

4.2.1 Initializing the Cluster #

Warning: Secure configuration files access

The directory created during this step contains configuration files that allow full administrator access to your cluster. Apply best practices for access control to this folder.

Now you can initialize the cluster on the deployed machines. As --control-plane enter the IP/FQDN of your load balancer. If you do not use a load balancer use your first master node.

Important

If you are deploying on a cloud provider you must enable vendor specific integrations (CPI). Please refer further below to Section 4.2.3.1, “Enabling Cloud Provider Integration”.

skuba cluster init --control-plane <LB_IP/FQDN> <CLUSTER_NAME>

cluster init generates the folder named <CLUSTER_NAME> and initializes the directory that will hold the configuration (kubeconfig) for the cluster.

Important

The IP/FQDN must be reachable by every node of the cluster and therefore 127.0.0.1/localhost cannot be used.
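
The loopback restriction can be checked before calling skuba. The following is a minimal sketch. The function name `valid_control_plane` is hypothetical, and the check only rejects loopback names and addresses. It does not test whether the address is reachable from the nodes.

```shell
# Succeed if the value is usable as --control-plane, that is, it is not
# empty and not a loopback name or address that other nodes cannot reach.
valid_control_plane() {
    case "$1" in
        ""|localhost|localhost.*|127.*|::1)
            return 1 ;;
        *)
            return 0 ;;
    esac
}

# Example usage (address and cluster name are placeholders):
#   valid_control_plane "$LB" && skuba cluster init --control-plane "$LB" my-cluster
```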

4.2.1.1 Transitioning from Docker to CRI-O #

SUSE CaaS Platform 4.5.2 default configuration uses the CRI-O Container Engine in conjunction with Docker Linux capabilities. This means SUSE CaaS Platform 4.5.2 containers run on top of CRI-O with the following additional Linux capabilities: audit_write, setfcap and mknod. This measure ensures a transparent transition and seamless compatibility with workloads running on the previous SUSE CaaS Platform versions and out-of-the-box Docker compatibility.

In case you wish to use unmodified CRI-O, use the --strict-capability-defaults option during the initial setup when you run skuba cluster init, which will create the vanilla CRI-O configuration:

skuba cluster init --strict-capability-defaults

Please be aware that this might result in incompatibility with your previously running workloads, unless you explicitly define the additional Linux capabilities required on top of CRI-O defaults.

Important

After the bootstrap of the Kubernetes cluster there will be no easy way to revert this modification. Please choose wisely.

4.2.2 Configuring Kubernetes Services #

Inspect the kubeadm-init.conf file inside your cluster definition and set extra configuration settings supported by kubeadm. The latest supported version is v1beta1. Later, when you run skuba node bootstrap, kubeadm will read kubeadm-init.conf and will forcefully set certain settings to the ones required by SUSE CaaS Platform.

4.2.2.1 Network Settings #

The default network settings inside kubeadm-init.conf are viable for production clusters and adjusting them is optional. If you however wish to change the pod and service subnets, it is important that you do so before the bootstrap. The subnet ranges must be planned carefully, because the settings cannot be adjusted after deployment is complete. The default settings are the following:

networking:
  podSubnet: 10.244.0.0/16
  serviceSubnet: 10.96.0.0/12

The podSubnet IP range must be large enough to contain the IP addresses of all pods planned for the cluster. The subnet also must not conflict with services outside of the cluster, such as external databases or file services. The same holds for serviceSubnet: the IP range must not conflict with external services and must be large enough for all services planned for the cluster.
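
To judge whether a range is large enough, it can help to compute how many addresses a CIDR block contains. The following sketch uses a hypothetical helper named cidr_size and applies it to the two default ranges. The result is an upper bound only; the number of addresses actually usable for pods is typically lower, because addresses are handed out in per-node blocks.

```shell
# Number of addresses in an IPv4 CIDR block: 2^(32 - prefix length).
# cidr_size is an illustrative helper, not a skuba or kubeadm command.
cidr_size() {
  prefix="${1#*/}"
  echo $(( 1 << (32 - prefix) ))
}

echo "podSubnet     10.244.0.0/16: $(cidr_size 10.244.0.0/16) addresses"
echo "serviceSubnet 10.96.0.0/12:  $(cidr_size 10.96.0.0/12) addresses"
```

With the defaults this reports 65536 addresses for the pod subnet and 1048576 for the service subnet.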

4.2.3 Cluster Configuration #

Before bootstrapping the cluster, it is advisable to perform some additional configuration.

4.2.3.1 Enabling Cloud Provider Integration #

Enable cloud provider integration to take advantage of the underlying cloud platforms and automatically manage resources like the Load Balancer, Nodes (Instances), Network Routes and Storage services.

If you want to enable cloud provider integration with different cloud platforms, initialize the cluster with the flag --cloud-provider <CLOUD PROVIDER>. The only currently available options are openstack, aws and vsphere, but more options are planned.

Important: Cleanup

By enabling CPI providers, your Kubernetes cluster will be able to provision cloud resources on its own (for example, Load Balancers and Persistent Volumes). You must manually clean up these resources before you destroy the cluster with Terraform.

Not removing resources like Load Balancers created by the CPI will result in Terraform timing out during destroy operations.

Persistent volumes created with the retain policy will exist inside of the external cloud infrastructure even after the cluster is removed.
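
Before destroying the cluster, it can be useful to list the services that caused load balancers to be created. The sketch below filters the tabular output of kubectl for services of type LoadBalancer. The helper name lb_services and the sample rows are made up for illustration, and the filter assumes the default column order of kubectl get services --all-namespaces --no-headers (namespace, name, type).

```shell
# Hypothetical filter: reads the service listing on stdin and prints
# <namespace>/<name> for each service whose TYPE column is LoadBalancer.
lb_services() {
  awk '$3 == "LoadBalancer" { print $1 "/" $2 }'
}

# Made-up sample input so the sketch runs without a cluster.
lb_services <<'EOF'
default       kubernetes   ClusterIP      10.96.0.1      <none>        443/TCP        5d
web           frontend     LoadBalancer   10.99.12.34    192.0.2.10    80:31080/TCP   2d
kube-system   kube-dns     ClusterIP      10.96.0.10     <none>        53/UDP         5d
EOF
```

Against a running cluster, the input would come from kubectl get services --all-namespaces --no-headers piped into lb_services. Persistent volumes can be reviewed in a similar way with kubectl get pv.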

4.2.3.1.1 OpenStack CPI #

Define the cluster using the following command:

skuba cluster init --control-plane <LB_IP/FQDN> --cloud-provider openstack <CLUSTER_NAME>

Running the above command will create a directory <CLUSTER_NAME>/cloud/openstack with a README.md and an openstack.conf.template in it. Copy openstack.conf.template or create an openstack.conf file inside <CLUSTER_NAME>/cloud/openstack, according to the supported format. The supported format and content can be found in the official Kubernetes documentation:

https://v1-18.docs.kubernetes.io/docs/concepts/cluster-administration/cloud-providers/#openstack

Warning

The file <CLUSTER_NAME>/cloud/openstack/openstack.conf must not be freely accessible. Please remember to set proper file permissions for it, for example 600.
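
One way to create the file without ever exposing it to other users is to copy the template under a restrictive umask and then set the mode explicitly. The sketch below works on a temporary stand-in directory; in a real cluster definition the files live in <CLUSTER_NAME>/cloud/openstack.

```shell
# Stand-in directory and template so the sketch runs anywhere.
WORK_DIR="$(mktemp -d)"
echo "[Global]" > "$WORK_DIR/openstack.conf.template"

# Copy inside a subshell with a restrictive umask so the new file is never
# created with group or world permissions, then set the mode explicitly.
(
  umask 077
  cp "$WORK_DIR/openstack.conf.template" "$WORK_DIR/openstack.conf"
)
chmod 600 "$WORK_DIR/openstack.conf"

# The mode column of the listing should read -rw-------.
ls -l "$WORK_DIR/openstack.conf"
```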

4.2.3.2 Example OpenStack Cloud Provider Configuration #

You can find the required parameters in OpenStack RC File v3.

[Global]
auth-url=<OS_AUTH_URL> 1
username=<OS_USERNAME> 2
password=<OS_PASSWORD> 3
tenant-id=<OS_PROJECT_ID> 4
domain-name=<OS_USER_DOMAIN_NAME> 5
region=<OS_REGION_NAME> 6
ca-file="/etc/pki/trust/anchors/SUSE_Trust_Root.pem" 7
[LoadBalancer]
lb-version=v2 8
subnet-id=<PRIVATE_SUBNET_ID> 9
floating-network-id=<PUBLIC_NET_ID> 10
create-monitor=yes 11
monitor-delay=1m 12
monitor-timeout=30s 13
monitor-max-retries=3 14
[BlockStorage]
bs-version=v2 15
ignore-volume-az=true 16

1	(required) Specifies the URL of the Keystone API used to authenticate the user. This value can be found in Horizon (the OpenStack control panel) under Project > Access and Security > API Access > Credentials.
2	(required) Refers to the username of a valid user set in Keystone.
3	(required) Refers to the password of a valid user set in Keystone.
4	(required) Used to specify the ID of the project where you want to create your resources.
5	(optional) Used to specify the name of the domain your user belongs to.
6	(optional) Used to specify the identifier of the region to use when running on a multi-region OpenStack cloud. A region is a general division of an OpenStack deployment.
7	(optional) Used to specify the path to your custom CA file.
8	(optional) Used to override automatic version detection. Valid values are `v1` or `v2`. Where no value is provided, automatic detection will select the highest supported version exposed by the underlying OpenStack cloud.
9	(optional) Used to specify the ID of the subnet you want to create your load balancer on. Can be found at Network > Networks. Click on the respective network to get its subnets.
10	(optional) If specified, will create a floating IP for the load balancer.
11	(optional) Indicates whether or not to create a health monitor for the Neutron load balancer. Valid values are true and false. The default is false. When true is specified then monitor-delay, monitor-timeout, and monitor-max-retries must also be set.
12	(optional) The time between sending probes to members of the load balancer. Ensure that you specify a valid time unit.
13	(optional) Maximum time for a monitor to wait for a ping reply before it times out. The value must be less than the delay value. Ensure that you specify a valid time unit.
14	(optional) Number of permissible ping failures before changing the load balancer member’s status to INACTIVE. Must be a number between 1 and 10.
15	(optional) Used to override automatic version detection. Valid values are v1, v2, v3 and auto. When auto is specified, automatic detection will select the highest supported version exposed by the underlying OpenStack cloud.
16	(optional) Influences availability zone use when attaching Cinder volumes. When Nova and Cinder have different availability zones, this should be set to `true`.
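
Because the required parameters match the variables exported by an OpenStack RC file, the [Global] section can be rendered from the environment. The following is a sketch under that assumption: render_global_section is a hypothetical helper, and the fallback values are placeholders that only exist so the example runs without a real RC file.

```shell
# Hypothetical helper: print a [Global] section built from OS_* variables,
# as they are set after sourcing an OpenStack RC file.
render_global_section() {
  cat <<EOF
[Global]
auth-url=${OS_AUTH_URL}
username=${OS_USERNAME}
password=${OS_PASSWORD}
tenant-id=${OS_PROJECT_ID}
domain-name=${OS_USER_DOMAIN_NAME}
region=${OS_REGION_NAME}
EOF
}

# Placeholder values, used only when the variable is not already set.
: "${OS_AUTH_URL:=https://keystone.example.com:5000/v3}"
: "${OS_USERNAME:=example-user}"
: "${OS_PASSWORD:=example-password}"
: "${OS_PROJECT_ID:=example-project-id}"
: "${OS_USER_DOMAIN_NAME:=Default}"
: "${OS_REGION_NAME:=RegionOne}"

render_global_section
```

In practice the output would be redirected into openstack.conf rather than printed, since it contains the password. The [LoadBalancer] and [BlockStorage] sections still need to be added by hand.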

After setting options in the openstack.conf file, please proceed with Section 4.2.5, “Cluster Bootstrap”.

Important

When cloud provider integration is enabled, it’s very important to bootstrap and join nodes with the same node names that they have inside OpenStack, as these names will be used by the OpenStack cloud controller manager to reconcile node metadata.

4.2.3.2.1 Amazon Web Services (AWS) CPI #

Define the cluster using the following command:

skuba cluster init --control-plane <LB_IP/FQDN> --cloud-provider aws <CLUSTER_NAME>

Running the above command will create a directory <CLUSTER_NAME>/cloud/aws with a README.md file in it. No further configuration files are needed.

The supported format and content can be found in the official Kubernetes documentation.

Important

When cloud provider integration is enabled, it’s very important to bootstrap and join nodes with the same node names that they have inside AWS, as these names will be used by the AWS cloud controller manager to reconcile node metadata.

You can use the "private dns" values provided by the Terraform output.

4.2.3.2.2 vSphere CPI (VCP) #

Define the cluster using the following command:

skuba cluster init --control-plane <LB_IP/FQDN> --cloud-provider vsphere <CLUSTER_NAME>

Running the above command will create a directory <CLUSTER_NAME>/cloud/vsphere with a README.md and a vsphere.conf.template in it. Copy vsphere.conf.template or create a vsphere.conf file inside <CLUSTER_NAME>/cloud/vsphere, according to the supported format.

The supported format and content can be found in the official Kubernetes documentation.

Warning

The file <CLUSTER_NAME>/cloud/vsphere/vsphere.conf must not be freely accessible. Please remember to set proper file permissions for it, for example 600.

4.2.3.3 Example vSphere Cloud Provider Configuration #

[Global]
user = "<VC_ADMIN_USERNAME>" 1
password = "<VC_ADMIN_PASSWORD>" 2
port = "443" 3
insecure-flag = "1" 4
[VirtualCenter "<VC_IP_OR_FQDN>"] 5
datacenters = "<VC_DATACENTERS>" 6
[Workspace]
server = "<VC_IP_OR_FQDN>" 7
datacenter = "<VC_DATACENTER>" 8
default-datastore = "<VC_DATASTORE>" 9
resourcepool-path = "<VC_RESOURCEPOOL_PATH>" 10
folder = "<VC_VM_FOLDER>" 11
[Disk]
scsicontrollertype = pvscsi 12
[Network]
public-network = "VM Network" 13
[Labels] 14
region = "<VC_DATACENTER_TAG>" 15
zone = "<VC_CLUSTER_TAG>" 16

1	(required) Refers to the vCenter username for vSphere cloud provider to authenticate with.
2	(required) Refers to the vCenter password for vCenter user specified with `user`.
3	(optional) The vCenter Server Port. The default is 443 if not specified.
4	(optional) Set to 1 if vCenter uses a self-signed certificate.
5	(required) The IP address of the vCenter server.
6	(required) The datacenter name in vCenter where Kubernetes nodes reside.
7	(required) The IP address of the vCenter server for storage provisioning. Usually the same as `VirtualCenter`.
8	(required) The datacenter to provision temporary VMs for volume provisioning.
9	(required) The default datastore to provision temporary VMs for volume provisioning.
10	(required) The resource pool to provision temporary VMs for volume provisioning.
11	(required) The vCenter VM folder in which the Kubernetes nodes are located.
12	(required) Defines the SCSI controller in use on the VMs. Almost always set to `pvscsi`.
13	(optional) The network in vCenter where Kubernetes nodes should join. The default is "VM Network" if not specified.
14	(optional) The feature flag for zone and region support. Important: The zone and region tags must exist and be assigned to the datacenter and cluster before bootstrap. For instructions on tagging zones and regions, refer to: https://vmware.github.io/vsphere-storage-for-kubernetes/documentation/zones.html#tag-zones-and-regions-in-vcenter.
15	(optional) The category name of the tag assigned to the vCenter datacenter.
16	(optional) The category name of the tag assigned to the vCenter cluster.

After setting options in the vsphere.conf file, please proceed with Section 4.2.5, “Cluster Bootstrap”.

Important: Set `vSphere` virtual machine hostnames

When cloud provider integration is enabled, it’s very important to bootstrap and join nodes with node names that are identical to the hostnames of the vSphere virtual machines. These names will be used by the vSphere cloud controller manager to reconcile node metadata.

Important: Enable `disk.EnableUUID`.

Each virtual machine must have disk.EnableUUID enabled to successfully mount the virtual disks.

Clusters provisioned by following Deploying VMs from the Template with cpi_enable = true automatically have disk.EnableUUID enabled.

For clusters provisioned by any other method, ensure virtual machines are set to use disk.EnableUUID.

For more information, refer to: Configure Kubernetes Cluster Virtual Machines.

Important: Create a Folder For Your Virtual Machines.

All virtual machines must exist in a folder, and the name of that folder must be provided as the folder variable in vsphere.conf before bootstrap.

Clusters provisioned following Deploying VMs from the Template with cpi_enable = true automatically create and place all cluster node virtual machines inside a *-cluster folder.

For clusters provisioned by any other method, make sure to create and move all cluster node virtual machines to a folder.

4.2.3.4 Enable vSphere Cloud Provider #

For an existing cluster without cloud provider enabled at bootstrap, you can enable it later.

In vCenter, create a folder and move all cluster virtual machines into the folder. You can use govc to automate the task.
For installation instructions, refer to: https://github.com/vmware/govmomi/tree/master/govc.
```
DATACENTER="<VC_DATACENTER>"          # 1
CLUSTER_PREFIX="<VC_CLUSTER_PREFIX>"  # 2
govc folder.create "/$DATACENTER/vm/$CLUSTER_PREFIX-cluster"
govc object.mv "/$DATACENTER/vm/$CLUSTER_PREFIX-"\* "/$DATACENTER/vm/$CLUSTER_PREFIX-cluster"
```
1	The datacenter in which the cluster virtual machines are located.
2	Prefix for all machines of the cluster.
In vCenter, enable disk.EnableUUID for all cluster virtual machines. You can use govc to automate the task.
Warning
Setting disk.EnableUUID requires the virtual machine to be powered off. The following script configures all virtual machines in parallel, which causes cluster downtime while all machines are powered off. Modify the script, or do not use it at all, if downtime must be kept to a minimum.
```
DATACENTER="PROVO"  # 1
VMS=("caasp-master-0" "caasp-master-1" "caasp-master-2" "caasp-worker-0" "caasp-worker-1")  # 2

# Power off one virtual machine, set disk.enableUUID, and power it on again.
function setup {
  NAME=$1
  echo "[$NAME]"
  govc vm.power -dc="$DATACENTER" -off "$NAME"
  govc vm.change -dc="$DATACENTER" -vm="$NAME" -e="disk.enableUUID=1" &&\
    echo "Configured disk.enableUUID: 1"
  govc vm.power -dc="$DATACENTER" -on "$NAME"
}

# Configure all virtual machines in parallel and wait until all are done.
for vm in "${VMS[@]}"
do
  setup "$vm" &
done
wait
```
1	The datacenter in which the cluster virtual machines are located.
2	The cluster virtual machine names.

Update the provider ID for all Kubernetes nodes.

DATACENTER="<VC_DATACENTER>" 1
CLUSTER_PREFIX="<VC_CLUSTER_PREFIX>" 2
for vm in $(govc ls "/$DATACENTER/vm/$CLUSTER_PREFIX-cluster")
do
  VM_INFO=$(govc vm.info -json -dc=$DATACENTER -vm.ipath="/$vm" -e=true)
  VM_NAME=$(jq -r ' .VirtualMachines[] | .Name' <<< $VM_INFO)
  [[ $VM_NAME == *"-lb-"* ]] && continue
  VM_UUID=$( jq -r ' .VirtualMachines[] | .Config.Uuid' <<< $VM_INFO )
  echo "Patching $VM_NAME with UUID:$VM_UUID"
  kubectl patch node $VM_NAME -p "{\"spec\":{\"providerID\":\"vsphere://$VM_UUID\"}}"
done

1	The datacenter where cluster virtual machines are in.
2	Prefix for all machines of the cluster.

Create /etc/kubernetes/vsphere.conf on every master and worker node. Refer to Section 4.2.3.3, “Example vSphere Cloud Provider Configuration” for details.
On your local machine, save the kubeadm-config ConfigMap as kubeadm-config.conf.
kubectl -n kube-system get cm/kubeadm-config -o yaml > kubeadm-config.conf

Edit kubeadm-config.conf to add the cloud-provider setting and the related configuration.

data:
  ClusterConfiguration: |
    apiServer:
      extraArgs:
        cloud-config: /etc/kubernetes/vsphere.conf
        cloud-provider: vsphere
      extraVolumes:
      - hostPath: /etc/kubernetes/vsphere.conf
        mountPath: /etc/kubernetes/vsphere.conf
        name: cloud-config
        pathType: FileOrCreate
        readOnly: true
    controllerManager:
      extraArgs:
        cloud-config: /etc/kubernetes/vsphere.conf
        cloud-provider: vsphere
      extraVolumes:
      - hostPath: /etc/kubernetes/vsphere.conf
        mountPath: /etc/kubernetes/vsphere.conf
        name: cloud-config
        pathType: FileOrCreate
        readOnly: true

Apply the kubeadm-config to the cluster.
kubectl apply -f kubeadm-config.conf

On every master node, update kubelet.

sudo systemctl stop kubelet
source /var/lib/kubelet/kubeadm-flags.env
echo KUBELET_KUBEADM_ARGS='"'--cloud-config=/etc/kubernetes/vsphere.conf --cloud-provider=vsphere $KUBELET_KUBEADM_ARGS'"' > /tmp/kubeadm-flags.env
sudo mv /tmp/kubeadm-flags.env /var/lib/kubelet/kubeadm-flags.env
sudo systemctl start kubelet

On every master node, update control-plane components.
sudo kubeadm upgrade node phase control-plane --etcd-upgrade=false

On every worker node, update kubelet.

sudo systemctl stop kubelet
source /var/lib/kubelet/kubeadm-flags.env
echo KUBELET_KUBEADM_ARGS='"'--cloud-config=/etc/kubernetes/vsphere.conf --cloud-provider=vsphere $KUBELET_KUBEADM_ARGS'"' > /tmp/kubeadm-flags.env
sudo mv /tmp/kubeadm-flags.env /var/lib/kubelet/kubeadm-flags.env
sudo systemctl start kubelet

After the setup, you can proceed to use vSphere Storage in the cluster.
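
The kubelet steps above rewrite kubeadm-flags.env with fairly intricate quoting. The following sketch shows the same transformation as a function that can be tried on a copy of the file first. The helper name add_cloud_flags and the sample flag in the demonstration are assumptions for illustration; on a real node the file is /var/lib/kubelet/kubeadm-flags.env, root privileges are needed, and kubelet must be stopped and started around the change as shown in the steps.

```shell
# Hypothetical helper: prepend the cloud provider flags to
# KUBELET_KUBEADM_ARGS in a kubeadm-flags.env style file.
# Files that already carry --cloud-provider are left untouched.
add_cloud_flags() {
  env_file="$1"
  if grep -q -e '--cloud-provider=' "$env_file"; then
    echo "already configured: $env_file"
    return 0
  fi
  # Load the current value of KUBELET_KUBEADM_ARGS from the file.
  . "$env_file"
  printf 'KUBELET_KUBEADM_ARGS="%s %s"\n' \
    "--cloud-config=/etc/kubernetes/vsphere.conf --cloud-provider=vsphere" \
    "$KUBELET_KUBEADM_ARGS" > "$env_file.new"
  mv "$env_file.new" "$env_file"
}

# Demonstration on a temporary file with a made-up flag value.
DEMO_ENV="$(mktemp)"
echo 'KUBELET_KUBEADM_ARGS="--hostname-override=worker-0"' > "$DEMO_ENV"
add_cloud_flags "$DEMO_ENV"
cat "$DEMO_ENV"
```

Checking for an existing --cloud-provider flag makes the function safe to run twice, which the plain echo and mv sequence is not.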

4.2.3.5 Integrate External LDAP TLS #

Based on the manifest in <CLUSTER_NAME>/addons/dex/base/dex.yaml, provide a kustomize patch in <CLUSTER_NAME>/addons/dex/patches/custom.yaml in the form of a strategic merge patch or a JSON 6902 patch.
Adapt the ConfigMap by adding LDAP configuration to the connector section of the custom.yaml file. For detailed configurations for the LDAP connector, refer to https://github.com/dexidp/dex/blob/v2.23.0/Documentation/connectors/ldap.md.

Read https://github.com/kubernetes-sigs/kustomize/blob/master/docs/glossary.md#patchstrategicmerge and https://github.com/kubernetes-sigs/kustomize/blob/master/docs/glossary.md#patchjson6902 to get more information.

# Example LDAP connector

connectors:
- type: ldap
  id: 389ds
  name: 389ds
  config:
    host: ldap.example.org:636 1 2
    rootCAData: <BASE64_ENCODED_PEM_FILE> 3
    bindDN: cn=user-admin,ou=Users,dc=example,dc=org 4
    bindPW: <BIND_DN_PASSWORD> 5
    usernamePrompt: Email Address 6
    userSearch:
      baseDN: ou=Users,dc=example,dc=org 7
      filter: "(objectClass=person)" 8
      username: mail 9
      idAttr: DN 10
      emailAttr: mail 11
      nameAttr: cn 12

1	Host name of LDAP server reachable from the cluster.
2	The port on which to connect to the host (for example StartTLS: `389`, TLS: `636`).
3	LDAP server base64 encoded root CA certificate file (for example `cat <root-ca-pem-file> | base64 | awk '{print}' ORS='' && echo`).
4	Bind DN of user that can do user searches.
5	Password of the user.
6	Label of LDAP attribute users will enter to identify themselves (for example `username`).
7	BaseDN where users are located (for example `ou=Users,dc=example,dc=org`).
8	Filter to specify type of user objects (for example "(objectClass=person)").
9	Attribute users will enter to identify themselves (for example mail).
10	Attribute used to identify user within the system (for example DN).
11	Attribute containing the user’s email.
12	Attribute used as username within OIDC tokens.
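
The value for rootCAData must be a single line of base64 text. The sketch below shows one way to produce it and is an alternative to the awk-based command in the list above. The helper name encode_ca is an assumption, and the file content is a dummy stand-in, not a real certificate.

```shell
# Hypothetical helper: encode a file as base64 on a single line.
encode_ca() {
  base64 < "$1" | tr -d '\n'
  echo
}

# Stand-in file with dummy content; use the real root CA PEM file instead.
DEMO_PEM="$(mktemp)"
printf '%s\n' '-----BEGIN CERTIFICATE-----' 'ZHVtbXk=' '-----END CERTIFICATE-----' > "$DEMO_PEM"

ROOT_CA_DATA="$(encode_ca "$DEMO_PEM")"
echo "rootCAData: $ROOT_CA_DATA"
```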

Besides the LDAP connector you can also set up other connectors. For additional connectors, refer to the available connector configurations in the Dex repository: https://github.com/dexidp/dex/tree/v2.23.0/Documentation/connectors.

4.2.3.6 Prevent Nodes Running Special Workloads from Being Rebooted #

Some nodes might run specially treated workloads (pods).

To prevent downtime of those workloads and the respective node, you can pass the --blocking-pod-selector flag to kured with a label selector that matches those pods. Any node running a matching pod will not be rebooted by kured and needs to be rebooted manually.

Based on the manifest in <CLUSTER_NAME>/addons/kured/base/kured.yaml, provide a kustomize patch in <CLUSTER_NAME>/addons/kured/patches/custom.yaml in the form of a strategic merge patch or a JSON 6902 patch. Read https://github.com/kubernetes-sigs/kustomize/blob/master/docs/glossary.md#patchstrategicmerge and https://github.com/kubernetes-sigs/kustomize/blob/master/docs/glossary.md#patchjson6902 to get more information.

Adapt the DaemonSet by adding one of the following flags to the command section of the kured container:

---
apiVersion: apps/v1
kind: DaemonSet
...
spec:
  ...
    ...
      ...
      containers:
        ...
          command:
            - /usr/bin/kured
            - --blocking-pod-selector=name=<POD_NAME>

You can add any key/value labels to this selector:

--blocking-pod-selector=<LABEL_KEY_1>=<LABEL_VALUE_1>,<LABEL_KEY_2>=<LABEL_VALUE_2>

Alternatively, you can adapt the kured DaemonSet also later during runtime (after bootstrap) by editing <CLUSTER_NAME>/addons/kured/patches/custom.yaml and executing:

kubectl apply -k <CLUSTER_NAME>/addons/kured/

This will restart all kured pods with the additional configuration flags.

4.2.4 Prevent Nodes with Any Prometheus Alerts from Being Rebooted #

Note

By default, any Prometheus alert blocks a node from reboot. However, you can filter specific alerts to be ignored via the --alert-filter-regexp flag.

Based on the manifest in <CLUSTER_NAME>/addons/kured/base/kured.yaml, provide a kustomize patch in <CLUSTER_NAME>/addons/kured/patches/custom.yaml in the form of a strategic merge patch or a JSON 6902 patch. Read https://github.com/kubernetes-sigs/kustomize/blob/master/docs/glossary.md#patchstrategicmerge and https://github.com/kubernetes-sigs/kustomize/blob/master/docs/glossary.md#patchjson6902 to get more information.

Adapt the DaemonSet by adding one of the following flags to the command section of the kured container:

---
apiVersion: apps/v1
kind: DaemonSet
...
spec:
  ...
    ...
      ...
      containers:
        ...
          command:
            - /usr/bin/kured
            - --prometheus-url=<PROMETHEUS_SERVER_URL>
            - --alert-filter-regexp=^(RebootRequired|AnotherBenignAlert|...$

Important

The <PROMETHEUS_SERVER_URL> needs to contain the protocol (http:// or https://).
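
Both values can be sanity-checked locally before the patch is applied. In the sketch below, grep -E is used as a stand-in for kured's own matching to show which alert names would still block a reboot, and a small case statement checks the URL for a protocol. The helper names, the alert name KubeNodeNotReady and the URL are made up for illustration.

```shell
# Hypothetical helper: alert names matching the filter expression are
# ignored, all remaining names on stdin would still block a reboot.
blocking_alerts() {
  grep -E -v -e "$1"
}

# Hypothetical helper: succeed only if the URL starts with a protocol.
has_protocol() {
  case "$1" in
    http://*|https://*) return 0 ;;
    *) return 1 ;;
  esac
}

# Made-up alert names; only the non-matching one is printed.
printf '%s\n' RebootRequired AnotherBenignAlert KubeNodeNotReady |
  blocking_alerts '^(RebootRequired|AnotherBenignAlert)$'

has_protocol "http://prometheus.example.com:9090" && echo "URL looks complete"
```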

Alternatively you can adapt the kured DaemonSet also later during runtime (after bootstrap) by editing <CLUSTER_NAME>/addons/kured/patches/custom.yaml and executing:

kubectl apply -k <CLUSTER_NAME>/addons/kured/

This will restart all kured pods with the additional configuration flags.

4.2.5 Cluster Bootstrap #

Switch to the new directory.
Now bootstrap a master node. For --target enter the FQDN of your first master node. Replace <NODE_NAME> with a unique identifier, for example, "master-one".
Note: Log retention
By default skuba will only display the events of the bootstrap process in the terminal during execution. The examples in the following sections will use the tee tool to store a copy of the outputs in a file of your choosing.
For more information on the different logging approaches utilized by SUSE CaaS Platform components please refer to: SUSE CaaS Platform - Admin Guide: Logging.
Tip: Custom Trusted CA Certificate
During cluster bootstrap, skuba automatically generates CA certificates. You can however also deploy the Kubernetes cluster with your custom trusted CA certificate.
Please refer to the SUSE CaaS Platform Administration Guide for more information on how to deploy the Kubernetes cluster with a custom trusted CA certificate.
```
cd <CLUSTER_NAME>
skuba node bootstrap --user sles --sudo --target <IP/FQDN> <NODE_NAME>
```
This will bootstrap the specified node as the first master in the cluster. The process will generate authentication certificates and the admin.conf file that is used for authentication against the cluster. The files will be stored in the <CLUSTER_NAME> directory specified in step one.
Add additional master nodes to the cluster.
Replace the <IP/FQDN> with the IP for the machine. Replace <NODE_NAME> with a unique identifier, for example, "master-two".
```
skuba node join --role master --user sles --sudo --target <IP/FQDN> <NODE_NAME> | tee <NODE_NAME>-skuba-node-join.log
```
Add a worker to the cluster:
Replace the <IP/FQDN> with the IP for the machine. Replace <NODE_NAME> with a unique identifier, for example, "worker-one".
```
skuba node join --role worker --user sles --sudo --target <IP/FQDN> <NODE_NAME> | tee <NODE_NAME>-skuba-node-join.log
```

Verify that the nodes have been added:

skuba cluster status

The output should look like this:

NAME      STATUS    ROLE     OS-IMAGE                              KERNEL-VERSION           KUBELET-VERSION   CONTAINER-RUNTIME   HAS-UPDATES   HAS-DISRUPTIVE-UPDATES   CAASP-RELEASE-VERSION
master0   Ready     master   SUSE Linux Enterprise Server 15 SP2   4.12.14-197.29-default   v1.18.6           cri-o://1.18.2      no            no                       4.5.0
master1   Ready     master   SUSE Linux Enterprise Server 15 SP2   4.12.14-197.29-default   v1.18.6           cri-o://1.18.2      no            no                       4.5.0
master2   Ready     master   SUSE Linux Enterprise Server 15 SP2   4.12.14-197.29-default   v1.18.6           cri-o://1.18.2      no            no                       4.5.0
worker0   Ready     worker   SUSE Linux Enterprise Server 15 SP2   4.12.14-197.29-default   v1.18.6           cri-o://1.18.2      no            no                       4.5.0
worker1   Ready     worker   SUSE Linux Enterprise Server 15 SP2   4.12.14-197.29-default   v1.18.6           cri-o://1.18.2      no            no                       4.5.0
worker2   Ready     worker   SUSE Linux Enterprise Server 15 SP2   4.12.14-197.29-default   v1.18.6           cri-o://1.18.2      no            no                       4.5.0
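
For larger clusters, scanning this table by eye becomes tedious. The following sketch prints only the nodes whose STATUS column is not Ready. The helper name not_ready_nodes and the sample rows are assumptions; the filter relies on the column layout shown above, with the node name first and the status second.

```shell
# Hypothetical filter: read status output on stdin, skip the header line
# and blank lines, and print the name of every node that is not Ready.
not_ready_nodes() {
  awk 'NR > 1 && NF > 0 && $2 != "Ready" { print $1 }'
}

# Made-up sample input; on the management machine the input would come
# from: skuba cluster status | not_ready_nodes
not_ready_nodes <<'EOF'
NAME      STATUS     ROLE
master0   Ready      master
worker0   NotReady   worker
EOF
```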

Important

The IP/FQDN must be reachable by every node of the cluster and therefore 127.0.0.1/localhost cannot be used.

4.3 Using kubectl #

You can install and use kubectl by installing the kubernetes-client package from the SUSE CaaS Platform extension.

sudo zypper in kubernetes-client

Tip

Alternatively you can install from upstream: https://v1-18.docs.kubernetes.io/docs/tasks/tools/install-kubectl/.

To talk to your cluster, you must be in the <CLUSTER_NAME> directory when running commands so it can find the admin.conf file.

Tip: Setting up `kubeconfig`

To make usage of Kubernetes tools easier, you can store a copy of the admin.conf file as kubeconfig.

mkdir -p ~/.kube
cp admin.conf ~/.kube/config

Warning

The configuration file contains sensitive information and must be handled in a secure fashion. Copying it to a shared user directory might grant access to unwanted users.
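
If copying the file is not desirable, kubectl can instead be pointed at the cluster definition through the KUBECONFIG environment variable. The following is a sketch: use_cluster is a hypothetical helper, and the demonstration uses a temporary directory with an empty stand-in file rather than a real admin.conf.

```shell
# Hypothetical helper: export KUBECONFIG for the admin.conf found in the
# given cluster definition directory, without copying the file.
use_cluster() {
  if [ ! -f "$1/admin.conf" ]; then
    echo "no admin.conf in $1" >&2
    return 1
  fi
  KUBECONFIG="$(cd "$1" && pwd)/admin.conf"
  export KUBECONFIG
}

# Demonstration with a temporary stand-in directory and an empty file.
DEMO_CLUSTER="$(mktemp -d)"
: > "$DEMO_CLUSTER/admin.conf"
use_cluster "$DEMO_CLUSTER" && echo "KUBECONFIG=$KUBECONFIG"
```

The variable only affects the current shell session, so the credentials stay in the cluster definition directory.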

You can run commands against your cluster as usual. For example:

kubectl get nodes -o wide
or

kubectl get pods --all-namespaces

# kubectl get pods --all-namespaces

NAMESPACE     NAME                                    READY     STATUS    RESTARTS   AGE
kube-system   coredns-86c58d9df4-5zftb                1/1       Running   0          2m
kube-system   coredns-86c58d9df4-fct4m                1/1       Running   0          2m
kube-system   etcd-my-master                          1/1       Running   0          1m
kube-system   kube-apiserver-my-master                1/1       Running   0          1m
kube-system   kube-controller-manager-my-master       1/1       Running   0          1m
kube-system   cilium-operator-7d6ddddbf5-dmbhv        1/1       Running   0          51s
kube-system   cilium-qjt9h                            1/1       Running   0          53s
kube-system   cilium-szkqc                            1/1       Running   0          2m
kube-system   kube-proxy-5qxnt                        1/1       Running   0          2m
kube-system   kube-proxy-746ws                        1/1       Running   0          53s
kube-system   kube-scheduler-my-master                1/1       Running   0          1m
kube-system   kured-ztnfj                             1/1       Running   0          2m
kube-system   kured-zv696                             1/1       Running   0          2m
kube-system   oidc-dex-55fc689dc-b9bxw                1/1       Running   0          2m
kube-system   oidc-gangway-7b7fbbdbdf-ll6l8           1/1       Running   0          2m

5 Cilium Network Policy Config Examples #

The following example allows all pods in the namespace in which the policy is created to communicate with kube-dns on port 53/UDP in the kube-system namespace.

Note

Versions of SUSE CaaS Platform after 4.1 are slated to include L7 policy management which will enable policies to be enforced on items like memcached verbs, gRPC methods, and Cassandra tables.

The default behavior of Kubernetes is that all pods can communicate with all other pods within a cluster, whether those pods are hosted by the same Kubernetes node or different ones. This behavior is intentional, and aids greatly in the development process as the complexity of networking is effectively removed from both the developer and the operator.

However, when a workload is deployed in a Kubernetes cluster in production, any number of reasons may arise leading to the need to isolate some workloads from others. For example, if a Human Resources department is running workloads processing PII (Personally Identifiable Information), those workloads should not by default be accessible by any other workload in the cluster.

Network policies are the mechanism provided by Kubernetes which allow a cloud operator to isolate workloads from each other in a variety of ways. For example, a policy could be defined which allows a database server workload to be accessed only by the web servers whose pages use the data in the database. Another policy could be defined in the cluster which allows only web browsers outside the cluster to access the web server workloads in the cluster and so on.

To implement network policies, a network plugin must be correctly integrated into the cluster. SUSE CaaS Platform incorporates Cilium as its supported network policy management plugin. Cilium leverages BPF (Berkeley Packet Filter) where every bit of communication transits through a packet processing engine in the kernel. Other policy management plugins in the Kubernetes ecosystem leverage iptables.

SUSE has supported iptables since its inception in the Linux world, but believes BPF brings sufficiently compelling advantages (fine-grained control, performance) over iptables. Not only does Cilium have performance benefits brought on by BPF, it also has benefits far higher in the network stack.

The most typically used policies in Kubernetes cover L3 and L4 events in the network stack, allowing workloads to be protected by specifying IP addresses and TCP ports. To implement the earlier example of a dedicated web server accessing a critical secured database, an L3 policy would be defined allowing a web server workload running at IP address 192.168.0.1 to access a MySQL database workload running at IP address 192.168.0.2 on TCP port 3306.

apiVersion: "cilium.io/v2"
kind: CiliumNetworkPolicy
metadata:
  name: "allow-to-kubedns"
spec:
  endpointSelector:
    {}
  egress:
  - toEndpoints:
    - matchLabels:
        k8s:io.kubernetes.pod.namespace: kube-system
        k8s-app: kube-dns
    toPorts:
    - ports:
      - port: '53'
        protocol: UDP

6 Glossary #

AWS	Amazon Web Services. A broadly adopted cloud platform run by Amazon.
BPF	Berkeley Packet Filter. Technology used by Cilium to filter network traffic at the level of packet processing in the kernel.
CA	Certificate or Certification Authority. An entity that issues digital certificates.
CIDR	Classless Inter-Domain Routing. Method for allocating IP addresses and IP routing.
CNI	Container Networking Interface. Creates a generic plugin-based networking solution for containers based on spec files in JSON format.
CRD	Custom Resource Definition. Functionality to define non-default resources for Kubernetes pods.
FQDN	Fully Qualified Domain Name. The complete domain name for a specific computer, or host, on the internet, consisting of two parts: the hostname and the domain name.
GKE	Google Kubernetes Engine. Manager for container orchestration built on Kubernetes by Google. Similar for example to Amazon Elastic Kubernetes Service (Amazon EKS) and Azure Kubernetes Service (AKS).
HPA	Horizontal Pod Autoscaler. Based on CPU usage, HPA controls the number of pods in a deployment/replica or stateful set or a replication controller.
KVM	Kernel-based Virtual Machine. Linux native virtualization tool that allows the kernel to function as a hypervisor.
LDAP	Lightweight Directory Access Protocol. A client/server protocol used to access and manage directory information. It reads and edits directories over IP networks and runs directly over TCP/IP using simple string formats for data transfer.
OCI	Open Containers Initiative. A project under the Linux Foundation with the goal of creating open industry standards around container formats and runtime.
OIDC	OpenID Connect. Identity layer on top of the OAuth 2.0 protocol.
OLM	Operator Lifecycle Manager. Open Source tool for managing operators in a Kubernetes cluster.
POC	Proof of Concept. Pioneering project directed at proving the feasibility of a design concept.
PSP	Pod Security Policy. PSPs are cluster-level resources that control security-sensitive aspects of pod specification.
PVC	Persistent Volume Claim. A request for storage by a user.
RBAC	Role-based Access Control. An approach to restrict authorized user access based on defined roles.
RMT	Repository Mirroring Tool. Successor of the SMT. Helps optimize the management of SUSE Linux Enterprise software updates and subscription entitlements.
RPO	Recovery Point Objective. Defines the interval of time that can occur between two backup points before normal business can no longer be resumed.
RTO	Recovery Time Objective. Defines the time (and typically the service level from the SLA) within which backup-relevant incidents must be handled.
RSA	Rivest-Shamir-Adleman. Asymmetric encryption technique that uses two different keys as public and private keys to perform the encryption and decryption.
SLA	Service Level Agreement. A contractual clause or set of clauses that determines the guaranteed handling of support or incidents by a software vendor or supplier.
SMT	SUSE Subscription Management Tool. Helps to manage software updates, maintain corporate firewall policy and meet regulatory compliance requirements in SUSE Linux Enterprise 11 and 12. Has been replaced by the RMT and SUSE Manager in newer SUSE Linux Enterprise versions.
STS	StatefulSet. Manages the deployment and scaling of a set of Pods, and provides guarantees about the ordering and uniqueness of these Pods for a "stateful" application.
SMTP	Simple Mail Transfer Protocol. A communication protocol for electronic mail transmission.
TOML	Tom’s Obvious, Minimal Language. Configuration file format used for configuring container registries for CRI-O.
VPA	Vertical Pod Autoscaler. VPA automatically sets the values for resource requests and container limits based on usage.
VPC	Virtual Private Cloud. Division of a public cloud, which supports private cloud computing and thus offers more control over virtual networks and an isolated environment for sensitive workloads.

A Contributors #

The contents of these documents are edited by the technical writers for SUSE CaaS Platform and include original works created by its contributors.

B GNU Licenses #

B.1 GNU Free Documentation License

This appendix contains the GNU Free Documentation License version 1.2.

B.1 GNU Free Documentation License #

Copyright © 2000, 2001, 2002 Free Software Foundation, Inc. 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. Everyone is permitted to copy and distribute verbatim copies of this license document, but changing it is not allowed.

B.1.1 0. PREAMBLE #

The purpose of this License is to make a manual, textbook, or other functional and useful document "free" in the sense of freedom: to assure everyone the effective freedom to copy and redistribute it, with or without modifying it, either commercially or non-commercially. Secondarily, this License preserves for the author and publisher a way to get credit for their work, while not being considered responsible for modifications made by others.

This License is a kind of "copyleft", which means that derivative works of the document must themselves be free in the same sense. It complements the GNU General Public License, which is a copyleft license designed for free software.

We have designed this License to use it for manuals for free software, because free software needs free documentation: a free program should come with manuals providing the same freedoms that the software does. But this License is not limited to software manuals; it can be used for any textual work, regardless of subject matter or whether it is published as a printed book. We recommend this License principally for works whose purpose is instruction or reference.

B.1.2 1. APPLICABILITY AND DEFINITIONS #

This License applies to any manual or other work, in any medium, that contains a notice placed by the copyright holder saying it can be distributed under the terms of this License. Such a notice grants a world-wide, royalty-free license, unlimited in duration, to use that work under the conditions stated herein. The "Document", below, refers to any such manual or work. Any member of the public is a licensee, and is addressed as "you". You accept the license if you copy, modify or distribute the work in a way requiring permission under copyright law.

A "Modified Version" of the Document means any work containing the Document or a portion of it, either copied verbatim, or with modifications and/or translated into another language.

A "Secondary Section" is a named appendix or a front-matter section of the Document that deals exclusively with the relationship of the publishers or authors of the Document to the Document’s overall subject (or to related matters) and contains nothing that could fall directly within that overall subject. (Thus, if the Document is in part a textbook of mathematics, a Secondary Section may not explain any mathematics.) The relationship could be a matter of historical connection with the subject or with related matters, or of legal, commercial, philosophical, ethical or political position regarding them.

The "Invariant Sections" are certain Secondary Sections whose titles are designated, as being those of Invariant Sections, in the notice that says that the Document is released under this License. If a section does not fit the above definition of Secondary then it is not allowed to be designated as Invariant. The Document may contain zero Invariant Sections. If the Document does not identify any Invariant Sections then there are none.

The "Cover Texts" are certain short passages of text that are listed, as Front-Cover Texts or Back-Cover Texts, in the notice that says that the Document is released under this License. A Front-Cover Text may be at most 5 words, and a Back-Cover Text may be at most 25 words.

A "Transparent" copy of the Document means a machine-readable copy, represented in a format whose specification is available to the general public, that is suitable for revising the document straightforwardly with generic text editors or (for images composed of pixels) generic paint programs or (for drawings) some widely available drawing editor, and that is suitable for input to text formatters or for automatic translation to a variety of formats suitable for input to text formatters. A copy made in an otherwise Transparent file format whose markup, or absence of markup, has been arranged to thwart or discourage subsequent modification by readers is not Transparent. An image format is not Transparent if used for any substantial amount of text. A copy that is not "Transparent" is called "Opaque".

Examples of suitable formats for Transparent copies include plain ASCII without markup, Texinfo input format, LaTeX input format, SGML or XML using a publicly available DTD, and standard-conforming simple HTML, PostScript or PDF designed for human modification. Examples of transparent image formats include PNG, XCF and JPG. Opaque formats include proprietary formats that can be read and edited only by proprietary word processors, SGML or XML for which the DTD and/or processing tools are not generally available, and the machine-generated HTML, PostScript or PDF produced by some word processors for output purposes only.

The "Title Page" means, for a printed book, the title page itself, plus such following pages as are needed to hold, legibly, the material this License requires to appear in the title page. For works in formats which do not have any title page as such, "Title Page" means the text near the most prominent appearance of the work’s title, preceding the beginning of the body of the text.

A section "Entitled XYZ" means a named subunit of the Document whose title either is precisely XYZ or contains XYZ in parentheses following text that translates XYZ in another language. (Here XYZ stands for a specific section name mentioned below, such as "Acknowledgements", "Dedications", "Endorsements", or "History".) To "Preserve the Title" of such a section when you modify the Document means that it remains a section "Entitled XYZ" according to this definition.

The Document may include Warranty Disclaimers next to the notice which states that this License applies to the Document. These Warranty Disclaimers are considered to be included by reference in this License, but only as regards disclaiming warranties: any other implication that these Warranty Disclaimers may have is void and has no effect on the meaning of this License.

B.1.3 2. VERBATIM COPYING #

You may copy and distribute the Document in any medium, either commercially or non-commercially, provided that this License, the copyright notices, and the license notice saying this License applies to the Document are reproduced in all copies, and that you add no other conditions whatsoever to those of this License. You may not use technical measures to obstruct or control the reading or further copying of the copies you make or distribute. However, you may accept compensation in exchange for copies. If you distribute a large enough number of copies you must also follow the conditions in section 3.

You may also lend copies, under the same conditions stated above, and you may publicly display copies.

B.1.4 3. COPYING IN QUANTITY #

If you publish printed copies (or copies in media that commonly have printed covers) of the Document, numbering more than 100, and the Document’s license notice requires Cover Texts, you must enclose the copies in covers that carry, clearly and legibly, all these Cover Texts: Front-Cover Texts on the front cover, and Back-Cover Texts on the back cover. Both covers must also clearly and legibly identify you as the publisher of these copies. The front cover must present the full title with all words of the title equally prominent and visible. You may add other material on the covers in addition. Copying with changes limited to the covers, as long as they preserve the title of the Document and satisfy these conditions, can be treated as verbatim copying in other respects.

If the required texts for either cover are too voluminous to fit legibly, you should put the first ones listed (as many as fit reasonably) on the actual cover, and continue the rest onto adjacent pages.

If you publish or distribute Opaque copies of the Document numbering more than 100, you must either include a machine-readable Transparent copy along with each Opaque copy, or state in or with each Opaque copy a computer-network location from which the general network-using public has access to download using public-standard network protocols a complete Transparent copy of the Document, free of added material. If you use the latter option, you must take reasonably prudent steps, when you begin distribution of Opaque copies in quantity, to ensure that this Transparent copy will remain thus accessible at the stated location until at least one year after the last time you distribute an Opaque copy (directly or through your agents or retailers) of that edition to the public.

It is requested, but not required, that you contact the authors of the Document well before redistributing any large number of copies, to give them a chance to provide you with an updated version of the Document.

B.1.5 4. MODIFICATIONS #

You may copy and distribute a Modified Version of the Document under the conditions of sections 2 and 3 above, provided that you release the Modified Version under precisely this License, with the Modified Version filling the role of the Document, thus licensing distribution and modification of the Modified Version to whoever possesses a copy of it. In addition, you must do these things in the Modified Version:

Use in the Title Page (and on the covers, if any) a title distinct from that of the Document, and from those of previous versions (which should, if there were any, be listed in the History section of the Document). You may use the same title as a previous version if the original publisher of that version gives permission.
List on the Title Page, as authors, one or more persons or entities responsible for authorship of the modifications in the Modified Version, together with at least five of the principal authors of the Document (all of its principal authors, if it has fewer than five), unless they release you from this requirement.
State on the Title page the name of the publisher of the Modified Version, as the publisher.
Preserve all the copyright notices of the Document.
Add an appropriate copyright notice for your modifications adjacent to the other copyright notices.
Include, immediately after the copyright notices, a license notice giving the public permission to use the Modified Version under the terms of this License, in the form shown in the Addendum below.
Preserve in that license notice the full lists of Invariant Sections and required Cover Texts given in the Document’s license notice.
Include an unaltered copy of this License.
Preserve the section Entitled "History", Preserve its Title, and add to it an item stating at least the title, year, new authors, and publisher of the Modified Version as given on the Title Page. If there is no section Entitled "History" in the Document, create one stating the title, year, authors, and publisher of the Document as given on its Title Page, then add an item describing the Modified Version as stated in the previous sentence.
Preserve the network location, if any, given in the Document for public access to a Transparent copy of the Document, and likewise the network locations given in the Document for previous versions it was based on. These may be placed in the "History" section. You may omit a network location for a work that was published at least four years before the Document itself, or if the original publisher of the version it refers to gives permission.
For any section Entitled "Acknowledgements" or "Dedications", Preserve the Title of the section, and preserve in the section all the substance and tone of each of the contributor acknowledgements and/or dedications given therein.
Preserve all the Invariant Sections of the Document, unaltered in their text and in their titles. Section numbers or the equivalent are not considered part of the section titles.
Delete any section Entitled "Endorsements". Such a section may not be included in the Modified Version.
Do not retitle any existing section to be Entitled "Endorsements" or to conflict in title with any Invariant Section.
Preserve any Warranty Disclaimers.

If the Modified Version includes new front-matter sections or appendices that qualify as Secondary Sections and contain no material copied from the Document, you may at your option designate some or all of these sections as invariant. To do this, add their titles to the list of Invariant Sections in the Modified Version’s license notice. These titles must be distinct from any other section titles.

You may add a section Entitled "Endorsements", provided it contains nothing but endorsements of your Modified Version by various parties—for example, statements of peer review or that the text has been approved by an organization as the authoritative definition of a standard.

You may add a passage of up to five words as a Front-Cover Text, and a passage of up to 25 words as a Back-Cover Text, to the end of the list of Cover Texts in the Modified Version. Only one passage of Front-Cover Text and one of Back-Cover Text may be added by (or through arrangements made by) any one entity. If the Document already includes a cover text for the same cover, previously added by you or by arrangement made by the same entity you are acting on behalf of, you may not add another; but you may replace the old one, on explicit permission from the previous publisher that added the old one.

The author(s) and publisher(s) of the Document do not by this License give permission to use their names for publicity for or to assert or imply endorsement of any Modified Version.

B.1.6 5. COMBINING DOCUMENTS #

You may combine the Document with other documents released under this License, under the terms defined in section 4 above for modified versions, provided that you include in the combination all of the Invariant Sections of all of the original documents, unmodified, and list them all as Invariant Sections of your combined work in its license notice, and that you preserve all their Warranty Disclaimers.

The combined work need only contain one copy of this License, and multiple identical Invariant Sections may be replaced with a single copy. If there are multiple Invariant Sections with the same name but different contents, make the title of each such section unique by adding at the end of it, in parentheses, the name of the original author or publisher of that section if known, or else a unique number. Make the same adjustment to the section titles in the list of Invariant Sections in the license notice of the combined work.

In the combination, you must combine any sections Entitled "History" in the various original documents, forming one section Entitled "History"; likewise combine any sections Entitled "Acknowledgements", and any sections Entitled "Dedications". You must delete all sections Entitled "Endorsements".

B.1.7 6. COLLECTIONS OF DOCUMENTS #

You may make a collection consisting of the Document and other documents released under this License, and replace the individual copies of this License in the various documents with a single copy that is included in the collection, provided that you follow the rules of this License for verbatim copying of each of the documents in all other respects.

You may extract a single document from such a collection, and distribute it individually under this License, provided you insert a copy of this License into the extracted document, and follow this License in all other respects regarding verbatim copying of that document.

B.1.8 7. AGGREGATION WITH INDEPENDENT WORKS #

A compilation of the Document or its derivatives with other separate and independent documents or works, in or on a volume of a storage or distribution medium, is called an "aggregate" if the copyright resulting from the compilation is not used to limit the legal rights of the compilation’s users beyond what the individual works permit. When the Document is included in an aggregate, this License does not apply to the other works in the aggregate which are not themselves derivative works of the Document.

If the Cover Text requirement of section 3 is applicable to these copies of the Document, then if the Document is less than one half of the entire aggregate, the Document’s Cover Texts may be placed on covers that bracket the Document within the aggregate, or the electronic equivalent of covers if the Document is in electronic form. Otherwise they must appear on printed covers that bracket the whole aggregate.

B.1.9 8. TRANSLATION #

Translation is considered a kind of modification, so you may distribute translations of the Document under the terms of section 4. Replacing Invariant Sections with translations requires special permission from their copyright holders, but you may include translations of some or all Invariant Sections in addition to the original versions of these Invariant Sections. You may include a translation of this License, and all the license notices in the Document, and any Warranty Disclaimers, provided that you also include the original English version of this License and the original versions of those notices and disclaimers. In case of a disagreement between the translation and the original version of this License or a notice or disclaimer, the original version will prevail.

If a section in the Document is Entitled "Acknowledgements", "Dedications", or "History", the requirement (section 4) to Preserve its Title (section 1) will typically require changing the actual title.

B.1.10 9. TERMINATION #

You may not copy, modify, sublicense, or distribute the Document except as expressly provided for under this License. Any other attempt to copy, modify, sublicense or distribute the Document is void, and will automatically terminate your rights under this License. However, parties who have received copies, or rights, from you under this License will not have their licenses terminated so long as such parties remain in full compliance.

B.1.10.1 10. FUTURE REVISIONS OF THIS LICENSE #

The Free Software Foundation may publish new, revised versions of the GNU Free Documentation License from time to time. Such new versions will be similar in spirit to the present version, but may differ in detail to address new problems or concerns. See http://www.gnu.org/copyleft/.

Each version of the License is given a distinguishing version number. If the Document specifies that a particular numbered version of this License "or any later version" applies to it, you have the option of following the terms and conditions either of that specified version or of any later version that has been published (not as a draft) by the Free Software Foundation. If the Document does not specify a version number of this License, you may choose any version ever published (not as a draft) by the Free Software Foundation.

B.1.10.2 ADDENDUM: How to use this License for your documents #

Copyright (c) YEAR YOUR NAME.
Permission is granted to copy, distribute and/or modify this document
under the terms of the GNU Free Documentation License, Version 1.2
or any later version published by the Free Software Foundation;
with no Invariant Sections, no Front-Cover Texts, and no Back-Cover Texts.
A copy of the license is included in the section entitled "GNU
Free Documentation License".

If you have Invariant Sections, Front-Cover Texts and Back-Cover Texts, replace the “with…Texts.” line with this:

with the Invariant Sections being LIST THEIR TITLES, with the
Front-Cover Texts being LIST, and with the Back-Cover Texts being LIST.

If you have Invariant Sections without Cover Texts, or some other combination of the three, merge those two alternatives to suit the situation.

If your document contains nontrivial examples of program code, we recommend releasing these examples in parallel under your choice of free software license, such as the GNU General Public License, to permit their use in free software.


