To deploy SUSE CaaS Platform you need a workstation running SUSE Linux Enterprise Server 15 SP2 or an equivalent openSUSE release. This workstation is called the "Management machine". Important files are generated on this machine and must be maintained there, but it is not a member of the SUSE CaaS Platform cluster.
To successfully deploy SUSE CaaS Platform, you must have SSH keys loaded into an SSH agent; the installation tools skuba and terraform require this.
The use of ssh-agent comes with some security implications that you should take into consideration.
The pitfalls of using ssh-agent
To avoid these risks, either start the agent with ssh-agent -t <TIMEOUT> and specify a time after which it self-terminates, or terminate the agent yourself before logging out by running ssh-agent -k.
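For example (a minimal sketch; the one-hour timeout is only an illustration):
eval "$(ssh-agent -t 1h)"
ssh-agent -k
The first command starts an agent that self-terminates after one hour; the second terminates a running agent manually.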
To log in to the created cluster nodes from the Management machine, you need to configure an SSH key pair.
This key pair must be trusted by the user account you will use to log in to each cluster node; that user is called "sles" by default.
In order to use the installation tools terraform and skuba, this trusted keypair must be loaded into the SSH agent.
If you do not have an existing ssh keypair to use, run:
ssh-keygen -t ecdsa
The ssh-agent or a compatible program is sometimes started automatically by graphical desktop
environments. If that is not the case, run:
eval "$(ssh-agent)"
This will start the agent and set environment variables used for agent
communication within the current session. This has to be the same terminal session
that you run the skuba commands in. A new terminal usually requires a new ssh-agent.
In some desktop environments the ssh-agent will also automatically load the SSH keys.
To add an SSH key manually, use the ssh-add command:
ssh-add <PATH_TO_KEY>
If you are adding the SSH key manually, specify the full path.
For example: /home/sles/.ssh/id_rsa
You can load multiple keys into your agent using the ssh-add <PATH_TO_KEY> command.
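For example, to load two keys into the running agent (the file names below are only illustrative):
ssh-add /home/sles/.ssh/id_ecdsa
ssh-add /home/sles/.ssh/id_rsa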
Keys should be protected with a passphrase as a security measure. The ssh-add command will prompt for the passphrase, and the agent then caches the decrypted key material for a configurable lifetime. The -t lifetime option to ssh-add specifies a maximum time to cache the specific key. See man ssh-add for more information.
The SSH key is decrypted when loaded into the key agent. Though the key itself is not accessible from the agent, anyone with access to the agent's control socket file can use the private key contents to impersonate the key owner. By default, socket access is limited to the user who launched the agent. Nonetheless, it is good security practice to specify an expiration time for the decrypted key using the -t option.
For example: ssh-add -t 1h30m $HOME/.ssh/id_ecdsa would expire the decrypted key in 1.5 hours.
Alternatively, ssh-agent can also be launched with -t to specify a default timeout.
For example: eval $( ssh-agent -t 120s ) would default to a two minute (120 second)
timeout for keys added. If timeouts are specified for both programs, the timeout from
ssh-add is used.
See man ssh-agent and man ssh-add for more information.
Skuba will try all the identities loaded into the ssh-agent until one of them grants access to the node, or until the SSH server's maximum authentication attempts are exhausted.
This could lead to undesired messages in SSH or other security/authentication logs on your local machine.
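To limit the number of attempted identities, you can review what is currently loaded and remove keys you do not need (the key path below is just a placeholder):
ssh-add -l
ssh-add -d /home/sles/.ssh/unneeded_key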
It is also possible to forward the authentication agent connection from one host to another, which can be useful if you intend to run skuba on a "jump host" and do not want to copy your private key to that node.
This can be achieved using the ssh -A command. Please refer to the man page
of ssh to learn about the security implications of using this feature.
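For example, assuming a jump host reachable as <JUMP_HOST> and the default sles user, agent forwarding can be enabled for a session like this:
ssh -A sles@<JUMP_HOST>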
The registration code for SUSE CaaS Platform 4 also contains the activation permissions for the underlying SUSE Linux Enterprise operating system. You can use your SUSE CaaS Platform registration code to activate the SUSE Linux Enterprise Server 15 SP2 subscription during installation.
You need a subscription registration code to use SUSE CaaS Platform. You can retrieve your registration code from SUSE Customer Center.
Log in to https://scc.suse.com
Navigate to
Select the tab from the menu bar at the top
Search for "CaaS Platform"
Select the version you wish to deploy (should be the highest available version)
Click on the Link in the column
The registration code should be displayed as the first line under "Subscription Information"
If you cannot find SUSE CaaS Platform in the list of subscriptions, please contact your local administrator responsible for software subscriptions or SUSE support.
During deployment of the cluster nodes, each machine will be assigned a unique ID in the /etc/machine-id file by Terraform or AutoYaST.
If you are using any (semi-)manual methods of deployments that involve cloning of machines and deploying from templates,
you must make sure to delete this file before creating the template.
If two nodes are deployed with the same machine-id, they will not be correctly recognized by skuba.
If you are not using Terraform or AutoYaST, you must regenerate the machine IDs manually.
During the template preparation you will have removed the machine ID from the template image. This ID is required for proper functionality in the cluster and must be (re-)generated on each machine.
Log in to each virtual machine created from the template and run:
rm /etc/machine-id
dbus-uuidgen --ensure
systemd-machine-id-setup
systemctl restart systemd-journald
This will regenerate the machine id values for DBUS (/var/lib/dbus/machine-id) and systemd (/etc/machine-id) and restart the logging service to make use of the new IDs.
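As a quick sanity check (not part of the official procedure), you can display the regenerated IDs on each machine and confirm they are unique across your nodes:
cat /etc/machine-id /var/lib/dbus/machine-id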
For any deployment type you will need skuba and Terraform. These packages are
available from the SUSE CaaS Platform package sources. They are provided as an installation
"pattern" that will install dependencies and other required packages in one simple step.
Access to the packages requires the SUSE CaaS Platform, Containers and Public Cloud extension modules.
Enable the modules during the operating system installation or activate them using SUSE Connect.
sudo SUSEConnect -r <CAASP_REGISTRATION_CODE> 1
sudo SUSEConnect -p sle-module-containers/15.2/x86_64 2
sudo SUSEConnect -p sle-module-public-cloud/15.2/x86_64 3
sudo SUSEConnect -p caasp/4.5/x86_64 -r <CAASP_REGISTRATION_CODE> 4
1. Activate SUSE Linux Enterprise Server.
2. Add the free Containers module.
3. Add the free Public Cloud module.
4. Add the SUSE CaaS Platform extension with your registration code.
Install the required tools:
sudo zypper in -t pattern SUSE-CaaSP-Management
This will install the skuba command line tool and Terraform, as well as various default configurations and examples.
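To confirm that the tools are available on the Management machine, you can print their versions:
skuba version
terraform version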
Sometimes you need a proxy server to be able to connect to the SUSE Customer Center. If you have not already configured a system-wide proxy, you can temporarily do so for the duration of the current shell session like this:
Export the environment variable http_proxy:
export http_proxy=http://<PROXY_IP_FQDN>:<PROXY_PORT>
Replace <PROXY_IP_FQDN> with the IP address or fully qualified domain name (FQDN) of the proxy server and <PROXY_PORT> with its port.
If you use a proxy server with basic authentication, create the file $HOME/.curlrc
with the following content:
--proxy-user "<USER>:<PASSWORD>"
Replace <USER> and <PASSWORD> with the credentials of an allowed user for the proxy server, and consider limiting access to the file (chmod 0600).
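For example, the file can be created and locked down like this (a sketch; the credentials are placeholders):
cat > $HOME/.curlrc <<EOF
--proxy-user "<USER>:<PASSWORD>"
EOF
chmod 0600 $HOME/.curlrc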
Setting up a load balancer is mandatory in any production environment.
SUSE CaaS Platform requires a load balancer to distribute workload between the deployed master nodes of the cluster. A failure-tolerant SUSE CaaS Platform cluster will always use more than one control plane node as well as more than one load balancer, so there isn’t a single point of failure.
There are many ways to configure a load balancer. This documentation does not aim to describe all possible configurations; please apply your organization's load balancing best practices.
For SUSE OpenStack Cloud, the Terraform configurations shipped with this version will automatically deploy a suitable load balancer for the cluster.
For bare metal, KVM, or VMware, you must configure a load balancer manually and allow it access to all master nodes created during Section 3.5, “Bootstrapping the Cluster”.
The load balancer should be configured before the actual deployment. It is needed during the cluster bootstrap, and also during upgrades. To simplify configuration, you can reserve the IPs needed for the cluster nodes and pre-configure these in the load balancer.
The load balancer needs access to port 6443 on the apiserver (all master nodes)
in the cluster. It also needs access to Gangway port 32001 and Dex port 32000
on all master and worker nodes in the cluster for RBAC authentication.
We recommend performing regular HTTPS health checks on each master node /healthz
endpoint to verify that the node is responsive. This is particularly important during
upgrades, when a master node restarts the apiserver. During this rather short time
window, all requests have to go to another master node’s apiserver. The master node
that is being upgraded will have to be marked INACTIVE on the load balancer pool
at least during the restart of the apiserver. We provide reasonable defaults for this in our default SUSE OpenStack Cloud load balancer Terraform configuration.
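For example, a manual health check against a single master node could look like this (the IP address is a placeholder; -k skips verification of the cluster's self-signed certificate):
curl -k https://<MASTER_NODE_IP>:6443/healthz
A healthy apiserver typically responds with ok.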
The following contains examples for possible load balancer configurations based on SUSE Linux Enterprise Server 15 SP2 and nginx or HAProxy.
For TCP load balancing, we can use the ngx_stream_module (available since nginx version 1.9.0). In this mode, nginx will simply forward the TCP packets to the master nodes.
The default mechanism is round-robin so each request will be distributed to a different server.
The open source version of Nginx referred to in this guide only allows the use of
passive health checks. nginx will mark a node as unresponsive only after
a failed request. The original request is lost and not forwarded to an available
alternative server.
This load balancer configuration is therefore only suitable for testing and proof-of-concept (POC) environments.
For production environments, we recommend the use of SUSE Linux Enterprise High Availability Extension 15 and HAProxy (see the HAProxy example below).
Register SLES and enable the "Server Applications" module:
SUSEConnect -r CAASP_REGISTRATION_CODE
SUSEConnect --product sle-module-server-applications/15.2/x86_64
Install Nginx:
zypper in nginx
Write the configuration in /etc/nginx/nginx.conf:
user nginx;
worker_processes auto;
load_module /usr/lib64/nginx/modules/ngx_stream_module.so;
error_log /var/log/nginx/error.log;
error_log /var/log/nginx/error.log notice;
error_log /var/log/nginx/error.log info;
events {
worker_connections 1024;
use epoll;
}
stream {
log_format proxy '$remote_addr [$time_local] '
'$protocol $status $bytes_sent $bytes_received '
'$session_time "$upstream_addr"';
error_log /var/log/nginx/k8s-masters-lb-error.log;
access_log /var/log/nginx/k8s-masters-lb-access.log proxy;
upstream k8s-masters {
#hash $remote_addr consistent; 1
server master00:6443 weight=1 max_fails=2 fail_timeout=5s; 2
server master01:6443 weight=1 max_fails=2 fail_timeout=5s;
server master02:6443 weight=1 max_fails=2 fail_timeout=5s;
}
server {
listen 6443;
proxy_connect_timeout 5s;
proxy_timeout 30s;
proxy_pass k8s-masters;
}
upstream dex-backends {
#hash $remote_addr consistent; 3
server master00:32000 weight=1 max_fails=2 fail_timeout=5s; 4
server master01:32000 weight=1 max_fails=2 fail_timeout=5s;
server master02:32000 weight=1 max_fails=2 fail_timeout=5s;
}
server {
listen 32000;
proxy_connect_timeout 5s;
proxy_timeout 30s;
proxy_pass dex-backends; 5
}
upstream gangway-backends {
#hash $remote_addr consistent; 6
server master00:32001 weight=1 max_fails=2 fail_timeout=5s; 7
server master01:32001 weight=1 max_fails=2 fail_timeout=5s;
server master02:32001 weight=1 max_fails=2 fail_timeout=5s;
}
server {
listen 32001;
proxy_connect_timeout 5s;
proxy_timeout 30s;
proxy_pass gangway-backends; 8
}
}
1, 3, 6: To enable session persistence, uncomment the hash line so that the same client is always redirected to the same server (unless that server becomes unavailable).
2, 4, 7: Replace the individual master00, master01 and master02 entries with the host names or IP addresses of your actual master nodes (one entry per node) in the server lines.
5: Dex port 32000.
8: Gangway port 32001.
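Before starting the service, you can optionally verify the configuration syntax (nginx must already be installed). As root, run:
nginx -t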
Configure firewalld to open up ports 6443, 32000 and 32001. As root, run:
firewall-cmd --zone=public --permanent --add-port=6443/tcp
firewall-cmd --zone=public --permanent --add-port=32000/tcp
firewall-cmd --zone=public --permanent --add-port=32001/tcp
firewall-cmd --reload
Start and enable Nginx. As root, run:
systemctl enable --now nginx
The SUSE CaaS Platform cluster must be up and running for this to produce any useful results. This step can only be performed after Section 3.5, “Bootstrapping the Cluster” is completed successfully.
To verify that the load balancer works, you can run a simple command to repeatedly retrieve cluster information from the master nodes. Each request should be forwarded to a different master node.
From your workstation, run:
while true; do skuba cluster status; sleep 1; done;
There should be no interruption of the running skuba cluster status command.
On the load balancer virtual machine, check the logs to validate that each request is correctly distributed in a round robin way.
# tail -f /var/log/nginx/k8s-masters-lb-access.log
10.0.0.47 [17/May/2019:13:49:06 +0000] TCP 200 2553 1613 1.136 "10.0.0.145:6443"
10.0.0.47 [17/May/2019:13:49:08 +0000] TCP 200 2553 1613 0.981 "10.0.0.148:6443"
10.0.0.47 [17/May/2019:13:49:10 +0000] TCP 200 2553 1613 0.891 "10.0.0.7:6443"
10.0.0.47 [17/May/2019:13:49:12 +0000] TCP 200 2553 1613 0.895 "10.0.0.145:6443"
10.0.0.47 [17/May/2019:13:49:15 +0000] TCP 200 2553 1613 1.157 "10.0.0.148:6443"
10.0.0.47 [17/May/2019:13:49:17 +0000] TCP 200 2553 1613 0.897 "10.0.0.7:6443"
HAProxy is available as a supported package with a SUSE Linux Enterprise High Availability Extension 15 subscription.
Alternatively, you can install HAProxy from SUSE Package Hub but you will not receive product support for this component.
HAProxy is a very powerful load balancer application which is suitable for production environments.
Unlike the open source version of nginx mentioned in the example above, HAProxy supports active health checking, which is a vital function for reliable cluster health monitoring. The version used at the time of writing is 1.8.7.
The configuration of an HA cluster is out of the scope of this document.
The default mechanism is round-robin so each request will be distributed to a different server.
The health checks are executed every two seconds. If a connection fails, the check is retried two times with a timeout of five seconds for each request.
If no connection succeeds within this interval (2x5s), the node will be marked as DOWN and no traffic will be sent until the checks succeed again.
Register SLES and enable the "Server Applications" module:
SUSEConnect -r CAASP_REGISTRATION_CODE
SUSEConnect --product sle-module-server-applications/15.2/x86_64
Enable the source for the haproxy package:
If you are using the SUSE Linux Enterprise High Availability Extension:
SUSEConnect --product sle-ha/15.2/x86_64 -r ADDITIONAL_REGCODE
If you want the free (unsupported) package:
SUSEConnect --product PackageHub/15.2/x86_64
Configure /dev/log for HAProxy chroot (optional)
This step is only required when HAProxy is configured to run in a jail directory (chroot). This is highly recommended since it increases the security of HAProxy.
Since HAProxy is chrooted, it’s necessary to make the log socket available inside the jail directory so HAProxy can send logs to the socket.
mkdir -p /var/lib/haproxy/dev/ && touch /var/lib/haproxy/dev/log
This systemd service will take care of mounting the socket in the jail directory.
cat > /etc/systemd/system/bindmount-dev-log-haproxy-chroot.service <<EOF
[Unit]
Description=Mount /dev/log in HAProxy chroot
After=systemd-journald-dev-log.socket
Before=haproxy.service
[Service]
Type=oneshot
ExecStart=/bin/mount --bind /dev/log /var/lib/haproxy/dev/log
[Install]
WantedBy=multi-user.target
EOF
Enabling the service will make the changes persistent after a reboot.
systemctl enable --now bindmount-dev-log-haproxy-chroot.service
Install HAProxy:
zypper in haproxy
Write the configuration in /etc/haproxy/haproxy.cfg:
Replace the individual <MASTER_XX_IP_ADDRESS> entries with the IP addresses of your actual master nodes (one entry each) in the server lines.
You can leave the name argument in the server lines (master00 and so on) as is; it only serves as a label that will show up in the HAProxy logs.
global
  log /dev/log local0 info 1
  chroot /var/lib/haproxy 2
  user haproxy
  group haproxy
  daemon

defaults
  mode tcp
  log global
  option tcplog
  option redispatch
  option tcpka
  retries 2
  http-check expect status 200 3
  default-server check check-ssl verify none
  timeout connect 5s
  timeout client 5s
  timeout server 5s
  timeout tunnel 86400s 4

listen stats 5
  bind *:9000
  mode http
  stats hide-version
  stats uri /stats

listen apiserver 6
  bind *:6443
  option httpchk GET /healthz
  server master00 <MASTER_00_IP_ADDRESS>:6443
  server master01 <MASTER_01_IP_ADDRESS>:6443
  server master02 <MASTER_02_IP_ADDRESS>:6443

listen dex 7
  bind *:32000
  option httpchk GET /healthz
  server master00 <MASTER_00_IP_ADDRESS>:32000
  server master01 <MASTER_01_IP_ADDRESS>:32000
  server master02 <MASTER_02_IP_ADDRESS>:32000

listen gangway 8
  bind *:32001
  option httpchk GET /
  server master00 <MASTER_00_IP_ADDRESS>:32001
  server master01 <MASTER_01_IP_ADDRESS>:32001
  server master02 <MASTER_02_IP_ADDRESS>:32001
1. Forward the logs to systemd journald; the log level is set to info here.
2. Define that HAProxy runs in a chroot jail directory.
3. The performed health checks will expect an HTTP 200 status code response.
4. This timeout is set to 86400 seconds (one day) for long-lived (tunnel) connections.
5. URL to expose HAProxy statistics (port 9000, path /stats).
6. Kubernetes apiserver listening on port 6443.
7. Dex listening on port 32000.
8. Gangway listening on port 32001.
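Before starting the service, you can optionally validate the configuration file (HAProxy must already be installed; -c only checks the syntax). As root, run:
haproxy -c -f /etc/haproxy/haproxy.cfg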
Configure firewalld to open up ports 6443, 32000 and 32001. As root, run:
firewall-cmd --zone=public --permanent --add-port=6443/tcp
firewall-cmd --zone=public --permanent --add-port=32000/tcp
firewall-cmd --zone=public --permanent --add-port=32001/tcp
firewall-cmd --reload
Start and enable HAProxy. As root, run:
systemctl enable --now haproxy
The SUSE CaaS Platform cluster must be up and running for this to produce any useful results. This step can only be performed after Section 3.5, “Bootstrapping the Cluster” is completed successfully.
To verify that the load balancer works, you can run a simple command to repeatedly retrieve cluster information from the master nodes. Each request should be forwarded to a different master node.
From your workstation, run:
while true; do skuba cluster status; sleep 1; done;
There should be no interruption of the running skuba cluster status command.
On the load balancer virtual machine, check the logs to validate that each request is correctly distributed in a round robin way.
# journalctl -flu haproxy
haproxy[2525]: 10.0.0.47:59664 [30/Sep/2019:13:33:20.578] apiserver apiserver/master00 1/0/578 9727 -- 18/18/17/3/0 0/0
haproxy[2525]: 10.0.0.47:59666 [30/Sep/2019:13:33:22.476] apiserver apiserver/master01 1/0/747 9727 -- 18/18/17/7/0 0/0
haproxy[2525]: 10.0.0.47:59668 [30/Sep/2019:13:33:24.522] apiserver apiserver/master02 1/0/575 9727 -- 18/18/17/7/0 0/0
haproxy[2525]: 10.0.0.47:59670 [30/Sep/2019:13:33:26.386] apiserver apiserver/master00 1/0/567 9727 -- 18/18/17/3/0 0/0
haproxy[2525]: 10.0.0.47:59678 [30/Sep/2019:13:33:28.279] apiserver apiserver/master01 1/0/575 9727 -- 18/18/17/7/0 0/0
haproxy[2525]: 10.0.0.47:59682 [30/Sep/2019:13:33:30.174] apiserver apiserver/master02 1/0/571 9727 -- 18/18/17/7/0 0/0