
Applies to SUSE CaaS Platform 4.5.2

2 Deployment Preparations

In order to deploy SUSE CaaS Platform you need a workstation running SUSE Linux Enterprise Server 15 SP2 or similar openSUSE equivalent. This workstation is called the "Management machine". Important files are generated and must be maintained on this machine, but it is not a member of the SUSE CaaS Platform cluster.

2.1 Basic SSH Key Configuration

In order to successfully deploy SUSE CaaS Platform, you need to have SSH keys loaded into an SSH agent; this is required by the installation tools skuba and terraform.

Note

The use of ssh-agent comes with some security implications that you should take into consideration; see the article "The pitfalls of using ssh-agent" for details.

To avoid these risks, make sure to either run ssh-agent -t <TIMEOUT> to specify a time after which the agent will self-terminate, or terminate the agent yourself before logging out by running ssh-agent -k.

To log in to the created cluster nodes from the Management machine, you need to configure an SSH key pair. This key pair needs to be trusted by the user account you will use to log in to each cluster node; that user is called "sles" by default. In order to use the installation tools terraform and skuba, this trusted key pair must be loaded into the SSH agent.

  1. If you do not have an existing SSH key pair to use, run:

    ssh-keygen -t ecdsa
  2. The ssh-agent or a compatible program is sometimes started automatically by graphical desktop environments. If that is not the case in your environment, run:

    eval "$(ssh-agent)"

    This will start the agent and set environment variables used for agent communication within the current session. This has to be the same terminal session that you run the skuba commands in. A new terminal usually requires a new ssh-agent. In some desktop environments the ssh-agent will also automatically load the SSH keys. To add an SSH key manually, use the ssh-add command:

    ssh-add <PATH_TO_KEY>
    Tip

    If you are adding the SSH key manually, specify the full path. For example: /home/sles/.ssh/id_rsa

You can load multiple keys into your agent using the ssh-add <PATH_TO_KEY> command. Keys should be protected with a passphrase as a security measure. The ssh-add command will prompt for the passphrase, and the agent then caches the decrypted key material for a configurable lifetime. The -t lifetime option to ssh-add specifies a maximum time to cache the specific key. See man ssh-add for more information.

Warning: Specify a key expiration time

The SSH key is decrypted when loaded into the key agent. Though the key itself is not accessible from the agent, anyone with access to the agent’s control socket file can use the agent to authenticate as the key owner. By default, socket access is limited to the user who launched the agent. Nonetheless, it is good security practice to specify an expiration time for the decrypted key using the -t option. For example: ssh-add -t 1h30m $HOME/.ssh/id_ecdsa would expire the decrypted key in 1.5 hours. Alternatively, ssh-agent can also be launched with -t to specify a default timeout. For example: eval $(ssh-agent -t 120s) would default to a two minute (120 second) timeout for keys added. If timeouts are specified for both programs, the timeout from ssh-add is used. See man ssh-agent and man ssh-add for more information.

Note: Usage of multiple identities with ssh-agent

Skuba will try all the identities loaded into the ssh-agent until one of them grants access to the node, or until the SSH server’s maximum authentication attempts are exhausted. This could lead to undesired messages in SSH or other security/authentication logs on your local machine.
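
To check which identities your agent currently offers, list them with ssh-add -l; to remove all loaded identities before adding only the one you need, run ssh-add -D. A minimal sketch:

ssh-add -l
ssh-add -D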

2.1.1 Forwarding the Authentication Agent Connection

It is also possible to forward the authentication agent connection from one host to another, which can be useful if you intend to run skuba on a "jump host" and don’t want to copy your private key to this node. This can be achieved using the ssh -A command. Please refer to the man page of ssh to learn about the security implications of using this feature.
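
A minimal sketch of this workflow, assuming a hypothetical jump host reachable as <JUMP_HOST> and the default "sles" user:

ssh -A sles@<JUMP_HOST>
ssh-add -l

The second command, run on the jump host, should list your key via the forwarded agent; skuba can then be used there as usual.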

2.2 Registration Code

Note

The registration code for SUSE CaaS Platform 4 also contains the activation permissions for the underlying SUSE Linux Enterprise operating system. You can use your SUSE CaaS Platform registration code to activate the SUSE Linux Enterprise Server 15 SP2 subscription during installation.

You need a subscription registration code to use SUSE CaaS Platform. You can retrieve your registration code from SUSE Customer Center.

  • Log in to https://scc.suse.com

  • Navigate to MY ORGANIZATIONS → <YOUR ORG>

  • Select the Subscriptions tab from the menu bar at the top

  • Search for "CaaS Platform"

  • Select the version you wish to deploy (should be the highest available version)

  • Click on the Link in the Name column

  • The registration code should be displayed as the first line under "Subscription Information"

Tip

If you cannot find SUSE CaaS Platform in the list of subscriptions, please contact your local administrator responsible for software subscriptions or SUSE support.

2.3 Unique Machine IDs

During deployment of the cluster nodes, each machine will be assigned a unique ID in the /etc/machine-id file by Terraform or AutoYaST. If you are using any (semi-)manual deployment method that involves cloning machines and deploying from templates, you must make sure to delete this file before creating the template.

If two nodes are deployed with the same machine-id, they will not be correctly recognized by skuba.
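
Before bootstrapping, you can compare the IDs across the deployed machines to make sure they differ; a quick sketch, where the node addresses are placeholders and the default "sles" login user is assumed:

for node in <NODE_1_IP> <NODE_2_IP> <NODE_3_IP>; do ssh sles@"$node" cat /etc/machine-id; done

All printed values must be unique.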

Important: Regenerating Machine ID

In case you are not using Terraform or AutoYaST you must regenerate machine IDs manually.

During the template preparation you will have removed the machine ID from the template image. This ID is required for proper functionality in the cluster and must be (re-)generated on each machine.

Log in to each virtual machine created from the template and run:

rm /etc/machine-id
dbus-uuidgen --ensure
systemd-machine-id-setup
systemctl restart systemd-journald

This will regenerate the machine ID values for D-Bus (/var/lib/dbus/machine-id) and systemd (/etc/machine-id) and restart the logging service to make use of the new IDs.
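
To confirm the result, you can inspect the regenerated IDs on each machine; a quick check:

cat /etc/machine-id
cat /var/lib/dbus/machine-id

Both files should now exist and contain a valid machine ID.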

2.4 Installation Tools

For any deployment type you will need skuba and Terraform. These packages are available from the SUSE CaaS Platform package sources. They are provided as an installation "pattern" that will install dependencies and other required packages in one simple step.

Access to the packages requires the SUSE CaaS Platform, Containers and Public Cloud extension modules. Enable the modules during the operating system installation or activate them using SUSE Connect.

sudo SUSEConnect -r  <CAASP_REGISTRATION_CODE> 1
sudo SUSEConnect -p sle-module-containers/15.2/x86_64 2
sudo SUSEConnect -p sle-module-public-cloud/15.2/x86_64 3
sudo SUSEConnect -p caasp/4.5/x86_64 -r <CAASP_REGISTRATION_CODE> 4

1  Activate SUSE Linux Enterprise
2  Add the free Containers module
3  Add the free Public Cloud module
4  Add the SUSE CaaS Platform extension with your registration code

Install the required tools:

sudo zypper in -t pattern SUSE-CaaSP-Management

This will install the skuba command line tool and Terraform, as well as various default configurations and examples.
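
To confirm that the tools are available on the Management machine, you can print their versions (the exact output depends on the installed release):

skuba version
terraform version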

Note: Using a Proxy Server

Sometimes you need a proxy server to be able to connect to the SUSE Customer Center. If you have not already configured a system-wide proxy, you can temporarily do so for the duration of the current shell session like this:

  1. Expose the environment variable http_proxy:

    export http_proxy=http://<PROXY_IP_FQDN>:<PROXY_PORT>
  2. Replace <PROXY_IP_FQDN> with the IP address or the fully qualified domain name (FQDN) of the proxy server and <PROXY_PORT> with its port.

  3. If you use a proxy server with basic authentication, create the file $HOME/.curlrc with the following content:

    --proxy-user "<USER>:<PASSWORD>"

    Replace <USER> and <PASSWORD> with the credentials of an allowed user for the proxy server, and consider limiting access to the file (chmod 0600).
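
    A minimal sketch of creating the file with restricted permissions:

    echo '--proxy-user "<USER>:<PASSWORD>"' > "$HOME/.curlrc"
    chmod 0600 "$HOME/.curlrc"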

2.5 Load Balancer

Important

Setting up a load balancer is mandatory in any production environment.

SUSE CaaS Platform requires a load balancer to distribute workload between the deployed master nodes of the cluster. A failure-tolerant SUSE CaaS Platform cluster will always use more than one control plane node as well as more than one load balancer, so that there is no single point of failure.

There are many ways to configure a load balancer. This documentation does not aim to describe all possible load balancer configurations; please apply your organization’s load balancing best practices.

For SUSE OpenStack Cloud, the Terraform configurations shipped with this version will automatically deploy a suitable load balancer for the cluster.

For bare metal, KVM, or VMware, you must configure a load balancer manually and allow it access to all master nodes created during Section 3.5, “Bootstrapping the Cluster”.

The load balancer should be configured before the actual deployment. It is needed during the cluster bootstrap, and also during upgrades. To simplify configuration, you can reserve the IPs needed for the cluster nodes and pre-configure these in the load balancer.

The load balancer needs access to port 6443 on the apiserver (all master nodes) in the cluster. It also needs access to Gangway port 32001 and Dex port 32000 on all master and worker nodes in the cluster for RBAC authentication.

We recommend performing regular HTTPS health checks on each master node’s /healthz endpoint to verify that the node is responsive. This is particularly important during upgrades, when a master node restarts the apiserver. During this rather short time window, all requests have to go to another master node’s apiserver. The master node that is being upgraded will have to be marked INACTIVE in the load balancer pool at least during the restart of the apiserver. We provide reasonable defaults for this in our default OpenStack load balancer Terraform configuration.
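
You can exercise such a health check manually from the load balancer host; a quick sketch, where the master address is a placeholder and -k skips certificate verification (acceptable only for a connectivity test):

curl -k https://<MASTER_00_IP_ADDRESS>:6443/healthz

A healthy apiserver typically answers with HTTP 200 and the body "ok"; depending on the cluster’s anonymous authentication settings, the request may require credentials.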

The following contains examples for possible load balancer configurations based on SUSE Linux Enterprise Server 15 SP2 and nginx or HAProxy.

2.5.1 Nginx TCP Load Balancer with Passive Checks

For TCP load balancing, we can use the ngx_stream_module module (available since version 1.9.0). In this mode, nginx will just forward the TCP packets to the master nodes.

The default mechanism is round-robin so each request will be distributed to a different server.

Warning

The open source version of nginx referred to in this guide only allows the use of passive health checks: nginx will mark a node as unresponsive only after a failed request. The original request is lost and not forwarded to an available alternative server.

This load balancer configuration is therefore only suitable for testing and proof-of-concept (POC) environments.

For production environments, we recommend the use of SUSE Linux Enterprise High Availability Extension 15.

2.5.1.1 Configuring the Load Balancer

  1. Register SLES and enable the "Server Applications" module:

    SUSEConnect -r CAASP_REGISTRATION_CODE
    SUSEConnect --product sle-module-server-applications/15.2/x86_64
  2. Install Nginx:

    zypper in nginx
  3. Write the configuration in /etc/nginx/nginx.conf:

    user  nginx;
    worker_processes  auto;
    
    load_module /usr/lib64/nginx/modules/ngx_stream_module.so;
    
    error_log  /var/log/nginx/error.log;
    error_log  /var/log/nginx/error.log  notice;
    error_log  /var/log/nginx/error.log  info;
    
    events {
        worker_connections  1024;
        use epoll;
    }
    
    stream {
        log_format proxy '$remote_addr [$time_local] '
                         '$protocol $status $bytes_sent $bytes_received '
                         '$session_time "$upstream_addr"';
    
        error_log  /var/log/nginx/k8s-masters-lb-error.log;
        access_log /var/log/nginx/k8s-masters-lb-access.log proxy;
    
        upstream k8s-masters {
            #hash $remote_addr consistent; 1
            server master00:6443 weight=1 max_fails=2 fail_timeout=5s; 2
            server master01:6443 weight=1 max_fails=2 fail_timeout=5s;
            server master02:6443 weight=1 max_fails=2 fail_timeout=5s;
        }
        server {
            listen 6443;
            proxy_connect_timeout 5s;
            proxy_timeout 30s;
            proxy_pass k8s-masters;
        }
    
        upstream dex-backends {
            #hash $remote_addr consistent; 3
            server master00:32000 weight=1 max_fails=2 fail_timeout=5s; 4
            server master01:32000 weight=1 max_fails=2 fail_timeout=5s;
            server master02:32000 weight=1 max_fails=2 fail_timeout=5s;
        }
        server {
            listen 32000;
            proxy_connect_timeout 5s;
            proxy_timeout 30s;
            proxy_pass dex-backends; 5
        }
    
        upstream gangway-backends {
            #hash $remote_addr consistent; 6
            server master00:32001 weight=1 max_fails=2 fail_timeout=5s; 7
            server master01:32001 weight=1 max_fails=2 fail_timeout=5s;
            server master02:32001 weight=1 max_fails=2 fail_timeout=5s;
        }
        server {
            listen 32001;
            proxy_connect_timeout 5s;
            proxy_timeout 30s;
            proxy_pass gangway-backends; 8
        }
    }

    1 3 6  To enable session persistence, uncomment the hash option so the same client will always be redirected to the same server unless that server is unavailable.

    2 4 7  Replace the individual masterXX entries with the IP/FQDN of your actual master nodes (one entry each) in every upstream section.

    5 8  Dex port 32000 and Gangway port 32001 must be accessible through the load balancer for RBAC authentication.

  4. Configure firewalld to open up ports 6443, 32000, and 32001. As root, run:

    firewall-cmd --zone=public --permanent --add-port=6443/tcp
    firewall-cmd --zone=public --permanent --add-port=32000/tcp
    firewall-cmd --zone=public --permanent --add-port=32001/tcp
    firewall-cmd --reload
  5. Start and enable Nginx. As root, run:

    systemctl enable --now nginx
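
    Optionally, you can validate the configuration syntax and the service state; a quick check:

    nginx -t
    systemctl status nginx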

2.5.1.2 Verifying the Load Balancer

Important

The SUSE CaaS Platform cluster must be up and running for this to produce any useful results. This step can only be performed after Section 3.5, “Bootstrapping the Cluster” is completed successfully.

To verify that the load balancer works, you can run a simple command to repeatedly retrieve cluster information from the master nodes. Each request should be forwarded to a different master node.

From your workstation, run:

while true; do skuba cluster status; sleep 1; done;

There should be no interruption of the running skuba cluster status command.

On the load balancer virtual machine, check the logs to validate that each request is correctly distributed in a round-robin fashion.

# tail -f /var/log/nginx/k8s-masters-lb-access.log
10.0.0.47 [17/May/2019:13:49:06 +0000] TCP 200 2553 1613 1.136 "10.0.0.145:6443"
10.0.0.47 [17/May/2019:13:49:08 +0000] TCP 200 2553 1613 0.981 "10.0.0.148:6443"
10.0.0.47 [17/May/2019:13:49:10 +0000] TCP 200 2553 1613 0.891 "10.0.0.7:6443"
10.0.0.47 [17/May/2019:13:49:12 +0000] TCP 200 2553 1613 0.895 "10.0.0.145:6443"
10.0.0.47 [17/May/2019:13:49:15 +0000] TCP 200 2553 1613 1.157 "10.0.0.148:6443"
10.0.0.47 [17/May/2019:13:49:17 +0000] TCP 200 2553 1613 0.897 "10.0.0.7:6443"

2.5.2 HAProxy TCP Load Balancer with Active Checks

Warning: Package Support

HAProxy is available as a supported package with a SUSE Linux Enterprise High Availability Extension 15 subscription.

Alternatively, you can install HAProxy from SUSE Package Hub but you will not receive product support for this component.

HAProxy is a very powerful load balancer application which is suitable for production environments. Unlike the open source version of nginx mentioned in the example above, HAProxy supports active health checking, which is a vital function for reliable cluster health monitoring.

The version used at the time of writing is 1.8.7.

Important

The configuration of an HA cluster is out of the scope of this document.

The default mechanism is round-robin so each request will be distributed to a different server.

The health-checks are executed every two seconds. If a connection fails, the check will be retried two times with a timeout of five seconds for each request. If no connection succeeds within this interval (2x5s), the node will be marked as DOWN and no traffic will be sent until the checks succeed again.

2.5.2.1 Configuring the Load Balancer

  1. Register SLES and enable the "Server Applications" module:

    SUSEConnect -r CAASP_REGISTRATION_CODE
    SUSEConnect --product sle-module-server-applications/15.2/x86_64
  2. Enable the source for the haproxy package:

    • If you are using the SUSE Linux Enterprise High Availability Extension

      SUSEConnect --product sle-ha/15.2/x86_64 -r ADDITIONAL_REGCODE
    • If you want the free (unsupported) package:

      SUSEConnect --product PackageHub/15.2/x86_64
  3. Configure /dev/log for HAProxy chroot (optional)

    This step is only required when HAProxy is configured to run in a jail directory (chroot). This is highly recommended since it increases the security of HAProxy.

    Since HAProxy is chrooted, it’s necessary to make the log socket available inside the jail directory so HAProxy can send logs to the socket.

    mkdir -p /var/lib/haproxy/dev/ && touch /var/lib/haproxy/dev/log

    This systemd service will take care of mounting the socket in the jail directory.

    cat > /etc/systemd/system/bindmount-dev-log-haproxy-chroot.service <<EOF
    [Unit]
    Description=Mount /dev/log in HAProxy chroot
    After=systemd-journald-dev-log.socket
    Before=haproxy.service
    
    [Service]
    Type=oneshot
    ExecStart=/bin/mount --bind /dev/log /var/lib/haproxy/dev/log
    
    [Install]
    WantedBy=multi-user.target
    EOF

    Enabling the service will make the changes persistent after a reboot.

    systemctl enable --now bindmount-dev-log-haproxy-chroot.service
  4. Install HAProxy:

    zypper in haproxy
  5. Write the configuration in /etc/haproxy/haproxy.cfg:

    Note

    Replace the individual <MASTER_XX_IP_ADDRESS> with the IP of your actual master nodes (one entry each) in the server lines. Feel free to leave the name argument in the server lines (master00 etc.) as is; it only serves as a label that will show up in the HAProxy logs.

    global
      log /dev/log local0 info 1
      chroot /var/lib/haproxy 2
      user haproxy
      group haproxy
      daemon
    
    defaults
      mode       tcp
      log        global
      option     tcplog
      option     redispatch
      option     tcpka
      retries    2
      http-check     expect status 200 3
      default-server check check-ssl verify none
      timeout connect 5s
      timeout client 5s
      timeout server 5s
      timeout tunnel 86400s 4
    
    listen stats 5
      bind    *:9000
      mode    http
      stats   hide-version
      stats   uri       /stats
    
    listen apiserver 6
      bind   *:6443
      option httpchk GET /healthz
      server master00 <MASTER_00_IP_ADDRESS>:6443
      server master01 <MASTER_01_IP_ADDRESS>:6443
      server master02 <MASTER_02_IP_ADDRESS>:6443
    
    listen dex 7
      bind   *:32000
      option httpchk GET /healthz
      server master00 <MASTER_00_IP_ADDRESS>:32000
      server master01 <MASTER_01_IP_ADDRESS>:32000
      server master02 <MASTER_02_IP_ADDRESS>:32000
    
    listen gangway 8
      bind   *:32001
      option httpchk GET /
      server master00 <MASTER_00_IP_ADDRESS>:32001
      server master01 <MASTER_01_IP_ADDRESS>:32001
      server master02 <MASTER_02_IP_ADDRESS>:32001

    1  Forward the logs to systemd journald. The log level can be set to debug to increase verbosity.

    2  Define whether HAProxy runs in a chroot jail directory.

    3  The performed health checks will expect a 200 return code.

    4  This timeout is set to 24h in order to allow long connections when accessing pod logs or port forwarding.

    5  Expose HAProxy statistics on port 9000; they are accessible at http://loadbalancer:9000/stats

    6  Kubernetes apiserver listening on port 6443; the checks are performed against https://MASTER_XX_IP_ADDRESS:6443/healthz

    7  Dex listening on port 32000; it must be accessible through the load balancer for RBAC authentication. The checks are performed against https://MASTER_XX_IP_ADDRESS:32000/healthz

    8  Gangway listening on port 32001; it must be accessible through the load balancer for RBAC authentication. The checks are performed against https://MASTER_XX_IP_ADDRESS:32001/

  6. Configure firewalld to open up ports 6443, 32000, and 32001. As root, run:

    firewall-cmd --zone=public --permanent --add-port=6443/tcp
    firewall-cmd --zone=public --permanent --add-port=32000/tcp
    firewall-cmd --zone=public --permanent --add-port=32001/tcp
    firewall-cmd --reload
  7. Start and enable HAProxy. As root, run:

    systemctl enable --now haproxy
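
    Optionally, you can validate the configuration file and the service state; a quick check:

    haproxy -c -f /etc/haproxy/haproxy.cfg
    systemctl status haproxy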

2.5.2.2 Verifying the Load Balancer

Important

The SUSE CaaS Platform cluster must be up and running for this to produce any useful results. This step can only be performed after Section 3.5, “Bootstrapping the Cluster” is completed successfully.

To verify that the load balancer works, you can run a simple command to repeatedly retrieve cluster information from the master nodes. Each request should be forwarded to a different master node.

From your workstation, run:

while true; do skuba cluster status; sleep 1; done;

There should be no interruption of the running skuba cluster status command.

On the load balancer virtual machine, check the logs to validate that each request is correctly distributed in a round-robin fashion.

# journalctl -flu haproxy
haproxy[2525]: 10.0.0.47:59664 [30/Sep/2019:13:33:20.578] apiserver apiserver/master00 1/0/578 9727 -- 18/18/17/3/0 0/0
haproxy[2525]: 10.0.0.47:59666 [30/Sep/2019:13:33:22.476] apiserver apiserver/master01 1/0/747 9727 -- 18/18/17/7/0 0/0
haproxy[2525]: 10.0.0.47:59668 [30/Sep/2019:13:33:24.522] apiserver apiserver/master02 1/0/575 9727 -- 18/18/17/7/0 0/0
haproxy[2525]: 10.0.0.47:59670 [30/Sep/2019:13:33:26.386] apiserver apiserver/master00 1/0/567 9727 -- 18/18/17/3/0 0/0
haproxy[2525]: 10.0.0.47:59678 [30/Sep/2019:13:33:28.279] apiserver apiserver/master01 1/0/575 9727 -- 18/18/17/7/0 0/0
haproxy[2525]: 10.0.0.47:59682 [30/Sep/2019:13:33:30.174] apiserver apiserver/master02 1/0/571 9727 -- 18/18/17/7/0 0/0