The described monitoring approach in this document is a generalized example of one way of monitoring a SUSE CaaS Platform cluster.
Please apply best practices to develop your own monitoring approach using the described examples and available health checking endpoints.
This document aims to describe monitoring in a Kubernetes cluster.
The monitoring stack consists of a monitoring/trending system and a visualization platform. Additionally you can use the in-memory metrics-server to perform automatic scaling (Refer to: Section 8.3, “Horizontal Pod Autoscaler”).
Prometheus
Prometheus is an open-source monitoring and trending system with a dimensional data model, flexible query language, efficient time series database and modern alerting approach. The time series collection happens via a pull model over HTTP.
Prometheus consists of multiple components:
Prometheus server: scrapes targets and stores the data in a time series database.
Alertmanager: handles alerts sent by clients, deduplicates them and reduces noise, and routes them to configurable receivers.
Pushgateway: an intermediate service that allows you to push metrics from jobs which cannot be scraped.
Exporters: libraries and tools that export existing metrics from third-party systems as Prometheus metrics.
Grafana
Grafana is an open-source system for querying, analysing and visualizing metrics.
NGINX Ingress Controller
Please refer to Section 6.8, “NGINX Ingress Controller” on how to configure ingress in your cluster. Deploying NGINX Ingress Controller also allows us to provide TLS termination to our services and to provide basic authentication to the Prometheus Expression browser/API.
Monitoring namespace
We will deploy our monitoring stack in its own namespace and therefore create one.
kubectl create namespace monitoring

Configure Authentication
We need to create a basic-auth secret so the NGINX Ingress Controller can perform authentication.
Install apache2-utils, which contains htpasswd, on your local workstation.
zypper in apache2-utils

Create the secret file auth
htpasswd -c auth admin
New password:
Re-type new password:
Adding password for user admin

It is very important that the filename is auth.
The key inside the secret is named after the file it was created from, and the ingress controller expects a key named auth. When you access the monitoring web UI, you will be prompted for this username and password.
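Once the secret has been created in the next step, you can verify that it indeed contains a key named auth; a minimal sketch:

kubectl -n monitoring get secret prometheus-basic-auth -o jsonpath='{.data}' ; echo
# The output must list a data key named "auth"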
Create secret in Kubernetes cluster
kubectl create secret generic -n monitoring prometheus-basic-auth --from-file=auth

There are two different ways of using ingress to access the monitoring system:
Section 8.1.3.1, “Installation For Subdomains”: using subdomains to access the monitoring system, such as prometheus.example.com, prometheus-alertmanager.example.com, and grafana.example.com.
Section 8.1.3.3, “Installation For Subpaths”: using subpaths to access the monitoring system, such as example.com/prometheus, example.com/alertmanager, and example.com/grafana.
This installation example shows how to install and configure Prometheus and Grafana using subdomains such as prometheus.example.com, prometheus-alertmanager.example.com, and grafana.example.com.
In order to provide additional security by using TLS certificates, please make sure you have the Section 6.8, “NGINX Ingress Controller” installed and configured.
If you do not need TLS, you may use other methods to expose these web services, such as native LBaaS in OpenStack, an haproxy service, or Kubernetes-native methods such as port-forwarding or NodePort, but these are out of scope for this document.
In this example, we will use a master node with IP 10.86.4.158, assuming the Ingress Controller is exposed through a NodePort service.
You should configure proper DNS names in any production environment. These values are only for example purposes.
Configure the DNS server
monitoring.example.com                IN  A      10.86.4.158
prometheus.example.com                IN  CNAME  monitoring.example.com
prometheus-alertmanager.example.com   IN  CNAME  monitoring.example.com
grafana.example.com                   IN  CNAME  monitoring.example.com
Configure the management workstation /etc/hosts (optional)
10.86.4.158 prometheus.example.com prometheus-alertmanager.example.com grafana.example.com
You must configure your certificates for the components as secrets in the Kubernetes cluster. Get certificates from your certificate authority.
Individual certificate
A single-name TLS certificate protects a single sub-domain, which means each sub-domain has its own private key. From a security perspective, individual certificates are recommended. However, you then have to manage each private key and certificate rotation separately.
When you choose to secure each service with an individual certificate, you must repeat the step below for each component and adjust the name for the individual secret each time. Please note down the names of the secrets you have created.
In this example, the secret name is monitoring-tls.
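If you choose individual certificates instead, you could for example create one secret per component; the secret names below (prometheus-tls, alertmanager-tls, grafana-tls) are only illustrative and must match the secretName values you use later in the configuration files:

kubectl create -n monitoring secret tls prometheus-tls \
  --key ./prometheus.key --cert ./prometheus.crt
kubectl create -n monitoring secret tls alertmanager-tls \
  --key ./alertmanager.key --cert ./alertmanager.crt
kubectl create -n monitoring secret tls grafana-tls \
  --key ./grafana.key --cert ./grafana.crt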
Wildcard certificate
A wildcard TLS certificate allows you to secure multiple sub-domains with one certificate, which means those sub-domains share the same private key. You can add more sub-domains without having to redeploy the certificate and, moreover, save the cost of additional certificates.
Refer to Section 6.9.9.1.1, “Trusted Server Certificate” for how to sign a trusted certificate, or to Section 6.9.9.2.2, “Self-signed Server Certificate” for how to sign a self-signed certificate. For individual certificates, set DNS.1 in server.conf to prometheus.example.com, prometheus-alertmanager.example.com, and grafana.example.com respectively (one certificate each). For a wildcard certificate, set DNS.1 to *.example.com.
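As an illustration only (the full server.conf layout is described in the referenced sections, and the [ alt_names ] section name is an assumption based on common OpenSSL SAN configurations), the relevant part could look like this:

# wildcard certificate
[ alt_names ]
DNS.1 = *.example.com

# individual certificate, repeated per component (here: Prometheus)
# [ alt_names ]
# DNS.1 = prometheus.example.com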
Then, import your certificate and key pair into the Kubernetes cluster as a secret named monitoring-tls. In this example, the certificate and key are monitoring.crt and monitoring.key.
kubectl create -n monitoring secret tls monitoring-tls \
--key ./monitoring.key \
--cert ./monitoring.crt

Create a configuration file prometheus-config-values.yaml
We need to configure the storage for our deployment. Choose among the options and uncomment the line in the config file. In production environments you must configure persistent storage.
Use an existing PersistentVolumeClaim
Use a StorageClass (preferred)
# Alertmanager configuration
alertmanager:
enabled: true
ingress:
enabled: true
hosts:
- prometheus-alertmanager.example.com
annotations:
kubernetes.io/ingress.class: nginx
nginx.ingress.kubernetes.io/auth-type: basic
nginx.ingress.kubernetes.io/auth-secret: prometheus-basic-auth
nginx.ingress.kubernetes.io/auth-realm: "Authentication Required"
tls:
- hosts:
- prometheus-alertmanager.example.com
secretName: monitoring-tls
persistentVolume:
enabled: true
## Use a StorageClass
storageClass: my-storage-class
## Create a PersistentVolumeClaim of 2Gi
size: 2Gi
## Use an existing PersistentVolumeClaim (my-pvc)
#existingClaim: my-pvc
## Alertmanager is configured through alertmanager.yml. This file and any others
## listed in alertmanagerFiles will be mounted into the alertmanager pod.
## See configuration options https://prometheus.io/docs/alerting/configuration/
#alertmanagerFiles:
# alertmanager.yml:
# Create a specific service account
serviceAccounts:
nodeExporter:
name: prometheus-node-exporter
# Node tolerations for node-exporter scheduling to nodes with taints
# Allow scheduling of node-exporter on master nodes
nodeExporter:
hostNetwork: false
hostPID: false
podSecurityPolicy:
enabled: true
annotations:
apparmor.security.beta.kubernetes.io/allowedProfileNames: runtime/default
apparmor.security.beta.kubernetes.io/defaultProfileName: runtime/default
seccomp.security.alpha.kubernetes.io/allowedProfileNames: runtime/default
seccomp.security.alpha.kubernetes.io/defaultProfileName: runtime/default
tolerations:
- key: node-role.kubernetes.io/master
operator: Exists
effect: NoSchedule
# Disable Pushgateway
pushgateway:
enabled: false
# Prometheus configuration
server:
ingress:
enabled: true
hosts:
- prometheus.example.com
annotations:
kubernetes.io/ingress.class: nginx
nginx.ingress.kubernetes.io/auth-type: basic
nginx.ingress.kubernetes.io/auth-secret: prometheus-basic-auth
nginx.ingress.kubernetes.io/auth-realm: "Authentication Required"
tls:
- hosts:
- prometheus.example.com
secretName: monitoring-tls
persistentVolume:
enabled: true
## Use a StorageClass
storageClass: my-storage-class
## Create a PersistentVolumeClaim of 8Gi
size: 8Gi
## Use an existing PersistentVolumeClaim (my-pvc)
#existingClaim: my-pvc
## Prometheus is configured through prometheus.yml. This file and any others
## listed in serverFiles will be mounted into the server pod.
## See configuration options
## https://prometheus.io/docs/prometheus/latest/configuration/configuration/
#serverFiles:
# prometheus.yml:

Add SUSE helm charts repository
helm repo add suse https://kubernetes-charts.suse.com

Deploy SUSE prometheus helm chart and pass our configuration values file.
helm install prometheus suse/prometheus \
--namespace monitoring \
--values prometheus-config-values.yaml

All pods should be in state Running; there are three node-exporter pods because this example cluster has three nodes.
kubectl -n monitoring get pod | grep prometheus
NAME READY STATUS RESTARTS AGE
prometheus-alertmanager-5487596d54-kcdd6 2/2 Running 0 2m
prometheus-kube-state-metrics-566669df8c-krblx 1/1 Running 0 2m
prometheus-node-exporter-jnc5w 1/1 Running 0 2m
prometheus-node-exporter-qfwp9 1/1 Running 0 2m
prometheus-node-exporter-sc4ls 1/1 Running 0 2m
prometheus-server-6488f6c4cd-5n9w8               2/2     Running   0          2m

There should be two ingresses configured:
kubectl get ingress -n monitoring
NAME HOSTS ADDRESS PORTS AGE
prometheus-alertmanager prometheus-alertmanager.example.com 80, 443 87s
prometheus-server         prometheus.example.com                80, 443   87s

At this stage, the Prometheus Expression browser/API should be accessible, depending on your network configuration
NodePort: https://prometheus.example.com:32443
External IPs: https://prometheus.example.com
LoadBalancer: https://prometheus.example.com
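To verify access from the command line, you can query the Prometheus HTTP API through the ingress; a minimal sketch assuming the NodePort setup and the admin user created earlier:

curl -k -u admin 'https://prometheus.example.com:32443/api/v1/query?query=up'
# Returns a JSON document with the value of the "up" metric for every scrape target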
The configuration example sets one "receiver" to get notified by email when one of the following conditions is met:
Node is unschedulable: severity is critical because the node cannot accept new pods
Node runs out of disk space: severity is critical because the node cannot accept new pods
Node has memory pressure: severity is warning
Node has disk pressure: severity is warning
A certificate is going to expire in 7 days: severity is critical
A certificate is going to expire in 30 days: severity is warning
A certificate is going to expire in 3 months: severity is info
Configure alerting receiver in Alertmanager
Alertmanager handles alerts sent by the Prometheus server: it deduplicates and groups them and routes them to the correct receiver integration, such as email. It also takes care of silencing and inhibiting alerts.
Add the alertmanagerFiles section to your Prometheus configuration file prometheus-config-values.yaml.
For more information on how to configure Alertmanager, refer to Prometheus: Alerting - Configuration.
alertmanagerFiles:
alertmanager.yml:
global:
# The smarthost and SMTP sender used for mail notifications.
smtp_from: alertmanager@example.com
smtp_smarthost: smtp.example.com:587
smtp_auth_username: admin@example.com
smtp_auth_password: <PASSWORD>
smtp_require_tls: true
route:
# The labels by which incoming alerts are grouped together.
group_by: ['node']
# When a new group of alerts is created by an incoming alert, wait at
# least 'group_wait' to send the initial notification.
# This way ensures that you get multiple alerts for the same group that start
# firing shortly after another are batched together on the first
# notification.
group_wait: 30s
# When the first notification was sent, wait 'group_interval' to send a batch
# of new alerts that started firing for that group.
group_interval: 5m
# If an alert has successfully been sent, wait 'repeat_interval' to
# resend them.
repeat_interval: 3h
# A default receiver
receiver: admin-example
receivers:
- name: 'admin-example'
email_configs:
- to: 'admin@example.com'

Configure alerting rules in the Prometheus server
Replace the serverFiles section of the Prometheus configuration file prometheus-config-values.yaml.
For more information on how to configure alerts, refer to: Prometheus: Alerting - Notification Template Examples
serverFiles:
alerts: {}
rules:
groups:
- name: caasp.node.rules
rules:
- alert: NodeIsNotReady
expr: kube_node_status_condition{condition="Ready",status="false"} == 1 or kube_node_status_condition{condition="Ready",status="unknown"} == 1
for: 1m
labels:
severity: critical
annotations:
description: '{{ $labels.node }} is not ready'
- alert: NodeIsOutOfDisk
expr: kube_node_status_condition{condition="OutOfDisk",status="true"} == 1
labels:
severity: critical
annotations:
description: '{{ $labels.node }} has insufficient free disk space'
- alert: NodeHasDiskPressure
expr: kube_node_status_condition{condition="DiskPressure",status="true"} == 1
labels:
severity: warning
annotations:
description: '{{ $labels.node }} has insufficient available disk space'
- alert: NodeHasInsufficientMemory
expr: kube_node_status_condition{condition="MemoryPressure",status="true"} == 1
labels:
severity: warning
annotations:
description: '{{ $labels.node }} has insufficient available memory'
- name: caasp.certs.rules
rules:
- alert: KubernetesCertificateExpiry3Months
expr: (cert_exporter_cert_expires_in_seconds / 86400) < 90
labels:
severity: info
annotations:
description: 'The cert for {{ $labels.filename }} on {{ $labels.nodename }} node is going to expire in 3 months'
- alert: KubernetesCertificateExpiry30Days
expr: (cert_exporter_cert_expires_in_seconds / 86400) < 30
labels:
severity: warning
annotations:
description: 'The cert for {{ $labels.filename }} on {{ $labels.nodename }} node is going to expire in 30 days'
- alert: KubernetesCertificateExpiry7Days
expr: (cert_exporter_cert_expires_in_seconds / 86400) < 7
labels:
severity: critical
annotations:
description: 'The cert for {{ $labels.filename }} on {{ $labels.nodename }} node is going to expire in 7 days'
- alert: KubeconfigCertificateExpiry3Months
expr: (cert_exporter_kubeconfig_expires_in_seconds / 86400) < 90
labels:
severity: info
annotations:
description: 'The cert for {{ $labels.filename }} on {{ $labels.nodename }} node is going to expire in 3 months'
- alert: KubeconfigCertificateExpiry30Days
expr: (cert_exporter_kubeconfig_expires_in_seconds / 86400) < 30
labels:
severity: warning
annotations:
description: 'The cert for {{ $labels.filename }} on {{ $labels.nodename }} node is going to expire in 30 days'
- alert: KubeconfigCertificateExpiry7Days
expr: (cert_exporter_kubeconfig_expires_in_seconds / 86400) < 7
labels:
severity: critical
annotations:
description: 'The cert for {{ $labels.filename }} on {{ $labels.nodename }} node is going to expire in 7 days'
- alert: AddonCertificateExpiry3Months
expr: (cert_exporter_secret_expires_in_seconds / 86400) < 90
labels:
severity: info
annotations:
description: 'The cert for {{ $labels.secret_name }} is going to expire in 3 months'
- alert: AddonCertificateExpiry30Days
expr: (cert_exporter_secret_expires_in_seconds / 86400) < 30
labels:
severity: warning
annotations:
description: 'The cert for {{ $labels.secret_name }} is going to expire in 30 days'
- alert: AddonCertificateExpiry7Days
expr: (cert_exporter_secret_expires_in_seconds / 86400) < 7
labels:
severity: critical
annotations:
description: 'The cert for {{ $labels.secret_name }} is going to expire in 7 days'

To apply the changed configuration, run:
helm upgrade prometheus suse/prometheus --namespace monitoring --values prometheus-config-values.yaml
You should now be able to see your Alertmanager, depending on your network configuration
NodePort: https://prometheus-alertmanager.example.com:32443
External IPs: https://prometheus-alertmanager.example.com
LoadBalancer: https://prometheus-alertmanager.example.com
Recording rules allow you to precompute frequently needed or computationally expensive expressions and save their result as a new set of time series. Querying the precomputed result will then often be much faster than executing the original expression every time it is needed. This is especially useful for dashboards, which need to query the same expression repeatedly every time they refresh. Another common use case is federation where precomputed metrics are scraped from one Prometheus instance by another.
For more information on how to configure recording rules, refer to Prometheus: Recording Rules - Configuration.
Configuring recording rules
Add the following group of rules in the serverFiles section of the prometheus-config-values.yaml configuration file.
serverFiles:
alerts: {}
rules:
groups:
- name: node-exporter.rules
rules:
- expr: count by (instance) (count without (mode) (node_cpu_seconds_total{component="node-exporter"}))
record: instance:node_num_cpu:sum
- expr: 1 - avg by (instance) (rate(node_cpu_seconds_total{component="node-exporter",mode="idle"}[5m]))
record: instance:node_cpu_utilisation:rate5m
- expr: node_load1{component="node-exporter"} / on (instance) instance:node_num_cpu:sum
record: instance:node_load1_per_cpu:ratio
- expr: node_memory_MemAvailable_bytes / on (instance) node_memory_MemTotal_bytes
record: instance:node_memory_utilisation:ratio
- expr: rate(node_vmstat_pgmajfault{component="node-exporter"}[5m])
record: instance:node_vmstat_pgmajfault:rate5m
- expr: rate(node_disk_io_time_seconds_total{component="node-exporter", device=~"nvme.+|rbd.+|sd.+|vd.+|xvd.+|dm-.+|dasd.+"}[5m])
record: instance_device:node_disk_io_time_seconds:rate5m
- expr: rate(node_disk_io_time_weighted_seconds_total{component="node-exporter", device=~"nvme.+|rbd.+|sd.+|vd.+|xvd.+|dm-.+|dasd.+"}[5m])
record: instance_device:node_disk_io_time_weighted_seconds:rate5m
- expr: sum by (instance) (rate(node_network_receive_bytes_total{component="node-exporter", device!="lo"}[5m]))
record: instance:node_network_receive_bytes_excluding_lo:rate5m
- expr: sum by (instance) (rate(node_network_transmit_bytes_total{component="node-exporter", device!="lo"}[5m]))
record: instance:node_network_transmit_bytes_excluding_lo:rate5m
- expr: sum by (instance) (rate(node_network_receive_drop_total{component="node-exporter", device!="lo"}[5m]))
record: instance:node_network_receive_drop_excluding_lo:rate5m
- expr: sum by (instance) (rate(node_network_transmit_drop_total{component="node-exporter", device!="lo"}[5m]))
record: instance:node_network_transmit_drop_excluding_lo:rate5m

To apply the changed configuration, run:
helm upgrade prometheus suse/prometheus --namespace monitoring --values prometheus-config-values.yaml
You should now be able to see your configured rules, depending on your network configuration
NodePort: https://prometheus.example.com:32443/rules
External IPs: https://prometheus.example.com/rules
LoadBalancer: https://prometheus.example.com/rules
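Once loaded, a recorded series can be queried like any other metric; a minimal sketch using the HTTP API and the NodePort setup from this example:

curl -k -u admin \
  'https://prometheus.example.com:32443/api/v1/query?query=instance:node_cpu_utilisation:rate5m'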
Starting from Grafana 5.0, it is possible to dynamically provision the data sources and dashboards via files.
In a Kubernetes cluster, these files are provided through ConfigMaps; editing a ConfigMap modifies the configuration without having to delete and recreate the pod.
Configure Grafana provisioning
Create the default datasource configuration file grafana-datasources.yaml, which points to our Prometheus server
kind: ConfigMap
apiVersion: v1
metadata:
name: grafana-datasources
namespace: monitoring
labels:
grafana_datasource: "1"
data:
datasource.yaml: |-
apiVersion: 1
deleteDatasources:
- name: Prometheus
orgId: 1
datasources:
- name: Prometheus
type: prometheus
url: http://prometheus-server.monitoring.svc.cluster.local:80
access: proxy
orgId: 1
isDefault: true

Create the ConfigMap in Kubernetes cluster
kubectl create -f grafana-datasources.yaml

Configure storage for the deployment
Choose among the options and uncomment the line in the config file. In production environments you must configure persistent storage.
Use an existing PersistentVolumeClaim
Use a StorageClass (preferred)
Create a file grafana-config-values.yaml with the appropriate values
# Configure admin password
adminPassword: <PASSWORD>
# Ingress configuration
ingress:
enabled: true
annotations:
kubernetes.io/ingress.class: nginx
hosts:
- grafana.example.com
tls:
- hosts:
- grafana.example.com
secretName: monitoring-tls
# Configure persistent storage
persistence:
enabled: true
accessModes:
- ReadWriteOnce
## Use a StorageClass
storageClassName: my-storage-class
## Create a PersistentVolumeClaim of 10Gi
size: 10Gi
## Use an existing PersistentVolumeClaim (my-pvc)
#existingClaim: my-pvc
# Enable sidecar for provisioning
sidecar:
datasources:
enabled: true
label: grafana_datasource
dashboards:
enabled: true
label: grafana_dashboard

Add SUSE helm charts repository
helm repo add suse https://kubernetes-charts.suse.com

Deploy SUSE grafana helm chart and pass our configuration values file
helm install grafana suse/grafana \
--namespace monitoring \
--values grafana-config-values.yaml

The result should be a running Grafana pod
kubectl -n monitoring get pod | grep grafana
NAME READY STATUS RESTARTS AGE
grafana-dbf7ddb7d-fxg6d   3/3     Running   0          2m

At this stage, Grafana should be accessible, depending on your network configuration
NodePort: https://grafana.example.com:32443
External IPs: https://grafana.example.com
LoadBalancer: https://grafana.example.com
Now you can add Grafana dashboards.
There are three ways to add dashboards to Grafana:
Deploy an existing dashboard from Grafana dashboards
Open the deployed Grafana in your browser and log in.
On the home page of Grafana, hover your mouse cursor over the + button on the left sidebar and click on the Import menu item.
Select an existing dashboard for your purpose from Grafana dashboards. Copy the URL to the clipboard.
Paste the URL (for example https://grafana.com/dashboards/3131) into the first input field to import the "Kubernetes All Nodes" Grafana dashboard.
After pasting in the URL, the view will change to another form.
Now select the "Prometheus" datasource in the prometheus field and click on the import button.
The browser will redirect you to your newly created dashboard.
Use our pre-built dashboards to monitor the SUSE CaaS Platform system
# monitor SUSE CaaS Platform cluster
kubectl apply -f https://raw.githubusercontent.com/SUSE/caasp-monitoring/master/grafana-dashboards-caasp-cluster.yaml
# monitor SUSE CaaS Platform etcd cluster
kubectl apply -f https://raw.githubusercontent.com/SUSE/caasp-monitoring/master/grafana-dashboards-caasp-etcd-cluster.yaml
# monitor SUSE CaaS Platform nodes
kubectl apply -f https://raw.githubusercontent.com/SUSE/caasp-monitoring/master/grafana-dashboards-caasp-nodes.yaml
# monitor SUSE CaaS Platform namespaces
kubectl apply -f https://raw.githubusercontent.com/SUSE/caasp-monitoring/master/grafana-dashboards-caasp-namespaces.yaml
# monitor SUSE CaaS Platform pods
kubectl apply -f https://raw.githubusercontent.com/SUSE/caasp-monitoring/master/grafana-dashboards-caasp-pods.yaml
# monitor SUSE CaaS Platform certificates
kubectl apply -f https://raw.githubusercontent.com/SUSE/caasp-monitoring/master/grafana-dashboards-caasp-certificates.yaml

Build your own dashboard

Deploy your own dashboard with a configuration file containing the dashboard definition.
Create your dashboard definition file as a ConfigMap, for example grafana-dashboards-caasp-cluster.yaml.
---
apiVersion: v1
kind: ConfigMap
metadata:
name: grafana-dashboards-caasp-cluster
namespace: monitoring
labels:
grafana_dashboard: "1"
data:
caasp-cluster.json: |-
{
"__inputs": [
{
"name": "DS_PROMETHEUS",
"label": "Prometheus",
"description": "",
"type": "datasource",
"pluginId": "prometheus",
"pluginName": "Prometheus"
}
],
"__requires": [
{
"type": "grafana",
[...]
continues with definition of dashboard JSON
[...]

Apply the ConfigMap to the cluster.
kubectl apply -f grafana-dashboards-caasp-cluster.yaml

This installation example shows how to install and configure Prometheus and Grafana using subpaths such as example.com/prometheus, example.com/alertmanager, and example.com/grafana.
Instructions that overlap with the subdomain installation are omitted here; refer to Section 8.1.3.1, “Installation For Subdomains” for those steps.
In this example, we will use a master node with IP 10.86.4.158, assuming the Ingress Controller is exposed through a NodePort service.
You should configure proper DNS names in any production environment. These values are only for example purposes.
Configure the DNS server
example.com IN A 10.86.4.158
Configure the management workstation /etc/hosts (optional)
10.86.4.158 example.com
You must configure your certificates for the components as secrets in the Kubernetes cluster. Get certificates from your certificate authority.
Refer to Section 6.9.9.1.1, “Trusted Server Certificate” on how to sign the trusted certificate or refer to Section 6.9.9.2.2, “Self-signed Server Certificate” on how to sign the self-signed certificate. The server.conf for DNS.1 is example.com.
Then, import your certificate and key pair into the Kubernetes cluster as a secret named monitoring-tls. In this example, the certificate and key are monitoring.crt and monitoring.key.
kubectl create -n monitoring secret tls monitoring-tls \
--key ./monitoring.key \
--cert ./monitoring.crt

Create a configuration file prometheus-config-values.yaml
We need to configure the storage for our deployment. Choose among the options and uncomment the line in the config file. In production environments you must configure persistent storage.
Use an existing PersistentVolumeClaim
Use a StorageClass (preferred)
Set baseURL to the external URL at which the server can be accessed. The baseURL depends on your network configuration.
NodePort: https://example.com:32443/prometheus and https://example.com:32443/alertmanager
External IPs: https://example.com/prometheus and https://example.com/alertmanager
LoadBalancer: https://example.com/prometheus and https://example.com/alertmanager
# Alertmanager configuration
alertmanager:
enabled: true
baseURL: https://example.com:32443/alertmanager
prefixURL: /alertmanager
ingress:
enabled: true
annotations:
kubernetes.io/ingress.class: nginx
nginx.ingress.kubernetes.io/auth-type: basic
nginx.ingress.kubernetes.io/auth-secret: prometheus-basic-auth
nginx.ingress.kubernetes.io/auth-realm: "Authentication Required"
hosts:
- example.com/alertmanager
tls:
- secretName: monitoring-tls
hosts:
- example.com
persistentVolume:
enabled: true
## Use a StorageClass
storageClass: my-storage-class
## Create a PersistentVolumeClaim of 2Gi
size: 2Gi
## Use an existing PersistentVolumeClaim (my-pvc)
#existingClaim: my-pvc
## Alertmanager is configured through alertmanager.yml. This file and any others
## listed in alertmanagerFiles will be mounted into the alertmanager pod.
## See configuration options https://prometheus.io/docs/alerting/configuration/
#alertmanagerFiles:
# alertmanager.yml:
# Create a specific service account
serviceAccounts:
nodeExporter:
name: prometheus-node-exporter
# Node tolerations for node-exporter scheduling to nodes with taints
# Allow scheduling of node-exporter on master nodes
nodeExporter:
hostNetwork: false
hostPID: false
podSecurityPolicy:
enabled: true
annotations:
apparmor.security.beta.kubernetes.io/allowedProfileNames: runtime/default
apparmor.security.beta.kubernetes.io/defaultProfileName: runtime/default
seccomp.security.alpha.kubernetes.io/allowedProfileNames: runtime/default
seccomp.security.alpha.kubernetes.io/defaultProfileName: runtime/default
tolerations:
- key: node-role.kubernetes.io/master
operator: Exists
effect: NoSchedule
# Disable Pushgateway
pushgateway:
enabled: false
# Prometheus configuration
server:
baseURL: https://example.com:32443/prometheus
prefixURL: /prometheus
ingress:
enabled: true
annotations:
kubernetes.io/ingress.class: nginx
nginx.ingress.kubernetes.io/auth-type: basic
nginx.ingress.kubernetes.io/auth-secret: prometheus-basic-auth
nginx.ingress.kubernetes.io/auth-realm: "Authentication Required"
hosts:
- example.com/prometheus
tls:
- secretName: monitoring-tls
hosts:
- example.com
persistentVolume:
enabled: true
## Use a StorageClass
storageClass: my-storage-class
## Create a PersistentVolumeClaim of 8Gi
size: 8Gi
## Use an existing PersistentVolumeClaim (my-pvc)
#existingClaim: my-pvc
## Prometheus is configured through prometheus.yml. This file and any others
## listed in serverFiles will be mounted into the server pod.
## See configuration options
## https://prometheus.io/docs/prometheus/latest/configuration/configuration/
#serverFiles:
# prometheus.yml:

Add SUSE helm charts repository
helm repo add suse https://kubernetes-charts.suse.com

Deploy SUSE prometheus helm chart and pass our configuration values file.
helm install prometheus suse/prometheus \
--namespace monitoring \
--values prometheus-config-values.yaml

All pods should be in state Running; there are three node-exporter pods because this example cluster has three nodes.
kubectl -n monitoring get pod | grep prometheus
NAME READY STATUS RESTARTS AGE
prometheus-alertmanager-5487596d54-kcdd6 2/2 Running 0 2m
prometheus-kube-state-metrics-566669df8c-krblx 1/1 Running 0 2m
prometheus-node-exporter-jnc5w 1/1 Running 0 2m
prometheus-node-exporter-qfwp9 1/1 Running 0 2m
prometheus-node-exporter-sc4ls 1/1 Running 0 2m
prometheus-server-6488f6c4cd-5n9w8               2/2     Running   0          2m

Refer to Section 8.1.3.2.3, “Alertmanager Configuration Example”
Refer to Section 8.1.3.2.4, “Recording Rules Configuration Example”
Starting from Grafana 5.0, it is possible to dynamically provision the data sources and dashboards via files.
In a Kubernetes cluster, these files are provided through ConfigMaps; editing a ConfigMap modifies the configuration without having to delete and recreate the pod.
Configure Grafana provisioning
Create the default datasource configuration file grafana-datasources.yaml, which points to our Prometheus server
---
kind: ConfigMap
apiVersion: v1
metadata:
name: grafana-datasources
namespace: monitoring
labels:
grafana_datasource: "1"
data:
datasource.yaml: |-
apiVersion: 1
deleteDatasources:
- name: Prometheus
orgId: 1
datasources:
- name: Prometheus
type: prometheus
url: http://prometheus-server.monitoring.svc.cluster.local:80
access: proxy
orgId: 1
isDefault: true

Create the ConfigMap in Kubernetes cluster
kubectl create -f grafana-datasources.yaml

Configure storage for the deployment
Choose among the options and uncomment the line in the config file. In production environments you must configure persistent storage.
Use an existing PersistentVolumeClaim
Use a StorageClass (preferred)
Set root_url to the external URL at which the server can be accessed. The root_url depends on your network configuration.
NodePort: https://example.com:32443/grafana
External IPs: https://example.com/grafana
LoadBalancer: https://example.com/grafana
Create a file grafana-config-values.yaml with the appropriate values
# Configure admin password
adminPassword: <PASSWORD>
# Ingress configuration
ingress:
enabled: true
annotations:
kubernetes.io/ingress.class: nginx
nginx.ingress.kubernetes.io/rewrite-target: /
hosts:
- example.com
path: /grafana
tls:
- secretName: monitoring-tls
hosts:
- example.com
# subpath for grafana
grafana.ini:
server:
root_url: https://example.com:32443/grafana
# Configure persistent storage
persistence:
enabled: true
accessModes:
- ReadWriteOnce
## Use a StorageClass
storageClassName: my-storage-class
## Create a PersistentVolumeClaim of 10Gi
size: 10Gi
## Use an existing PersistentVolumeClaim (my-pvc)
#existingClaim: my-pvc
# Enable sidecar for provisioning
sidecar:
datasources:
enabled: true
label: grafana_datasource
dashboards:
enabled: true
label: grafana_dashboard

Add SUSE helm charts repository
helm repo add suse https://kubernetes-charts.suse.com

Deploy SUSE grafana helm chart and pass our configuration values file
helm install grafana suse/grafana \
--namespace monitoring \
--values grafana-config-values.yaml

The result should be a running Grafana pod
kubectl -n monitoring get pod | grep grafana
NAME READY STATUS RESTARTS AGE
grafana-dbf7ddb7d-fxg6d   3/3     Running   0          2m

Access Prometheus, Alertmanager, and Grafana
At this stage, the Prometheus Expression browser/API, Alertmanager, and Grafana should be accessible, depending on your network configuration
Prometheus Expression browser/API
NodePort: https://example.com:32443/prometheus
External IPs: https://example.com/prometheus
LoadBalancer: https://example.com/prometheus
Alertmanager
NodePort: https://example.com:32443/alertmanager
External IPs: https://example.com/alertmanager
LoadBalancer: https://example.com/alertmanager
Grafana
NodePort: https://example.com:32443/grafana
External IPs: https://example.com/grafana
LoadBalancer: https://example.com/grafana
Now you can add the Grafana dashboards.
The SUSE Prometheus helm chart includes the following predefined jobs, which discover and scrape their targets using service discovery:
prometheus: Get metrics from prometheus server
kubernetes-apiservers: Get metrics from Kubernetes apiserver
kubernetes-nodes: Get metrics from Kubernetes nodes
kubernetes-service-endpoints: Get metrics from Services which have annotation prometheus.io/scrape=true in the metadata
kubernetes-pods: Get metrics from Pods which have annotation prometheus.io/scrape=true in the metadata
If you want to monitor new pods and services, you do not need to change prometheus.yaml. Instead, add the annotations prometheus.io/scrape=true, prometheus.io/port=<TARGET_PORT>, and prometheus.io/path=<METRIC_ENDPOINT> to the metadata of your pods and services; Prometheus will then scrape the target automatically.
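For example, a Service exposing metrics on port 9100 under /metrics could be annotated as follows; the names are placeholders:

apiVersion: v1
kind: Service
metadata:
  name: my-app-metrics            # placeholder
  namespace: default
  annotations:
    prometheus.io/scrape: "true"
    prometheus.io/port: "9100"
    prometheus.io/path: "/metrics"
spec:
  selector:
    app: my-app                   # placeholder
  ports:
  - name: metrics
    port: 9100
    targetPort: 9100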
The etcd server exposes metrics on the /metrics endpoint, but the predefined Prometheus jobs do not scrape it by default. If you want to monitor the etcd cluster, extend the Prometheus configuration as described below. Since the etcd cluster is only reachable via HTTPS, we need to create a client certificate to access the endpoint.
Create a new etcd client certificate signed by etcd CA cert/key pair:
cat << EOF > <CLUSTER_NAME>/pki/etcd/openssl-monitoring-client.conf
[req]
distinguished_name = req_distinguished_name
req_extensions = v3_req
prompt = no
[v3_req]
keyUsage = digitalSignature,keyEncipherment
extendedKeyUsage = clientAuth
[req_distinguished_name]
O = system:masters
CN = kube-etcd-monitoring-client
EOF
openssl req -nodes -new -newkey rsa:2048 -config <CLUSTER_NAME>/pki/etcd/openssl-monitoring-client.conf -out <CLUSTER_NAME>/pki/etcd/monitoring-client.csr -keyout <CLUSTER_NAME>/pki/etcd/monitoring-client.key
openssl x509 -req -days 365 -CA <CLUSTER_NAME>/pki/etcd/ca.crt -CAkey <CLUSTER_NAME>/pki/etcd/ca.key -CAcreateserial -in <CLUSTER_NAME>/pki/etcd/monitoring-client.csr -out <CLUSTER_NAME>/pki/etcd/monitoring-client.crt -sha256 -extfile <CLUSTER_NAME>/pki/etcd/openssl-monitoring-client.conf -extensions v3_req

Store the etcd client certificate in a secret in the monitoring namespace:
kubectl -n monitoring create secret generic etcd-certs --from-file=<CLUSTER_NAME>/pki/etcd/ca.crt --from-file=<CLUSTER_NAME>/pki/etcd/monitoring-client.crt --from-file=<CLUSTER_NAME>/pki/etcd/monitoring-client.key

Get all etcd cluster private IP addresses:
kubectl get pods -n kube-system -l component=etcd -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
etcd-master0 1/1 Running 2 21h 192.168.0.6 master0 <none> <none>
etcd-master1   1/1     Running   2          21h   192.168.0.20   master1   <none>           <none>

Edit the configuration file prometheus-config-values.yaml: add the extraSecretMounts and extraScrapeConfigs parts shown below, adjust the extraScrapeConfigs target IP addresses to match your environment, and change the number of targets if your etcd cluster has a different number of members.
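If you only need the pod IPs for the targets list, a minimal sketch:

kubectl -n kube-system get pods -l component=etcd \
  -o jsonpath='{range .items[*]}{.status.podIP}{"\n"}{end}'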
# Prometheus configuration
server:
...
extraSecretMounts:
- name: etcd-certs
mountPath: /etc/secrets
secretName: etcd-certs
readOnly: true
extraScrapeConfigs: |
- job_name: etcd
static_configs:
- targets: ['192.168.0.32:2379','192.168.0.17:2379','192.168.0.5:2379']
scheme: https
tls_config:
ca_file: /etc/secrets/ca.crt
cert_file: /etc/secrets/monitoring-client.crt
key_file: /etc/secrets/monitoring-client.key

Upgrade prometheus helm deployment:
helm upgrade prometheus suse/prometheus \
--namespace monitoring \
--values prometheus-config-values.yaml

Although a Kubernetes cluster takes care of a lot of the traditional deployment problems on its own, it is good practice to monitor the availability and health of your services and applications in order to react to problems that go beyond the automated measures.
There are three levels of health checks.
Cluster
Node
Service / Application
The basic check of whether a cluster is working correctly is based on a few criteria:
Are all services running as expected?
Is there at least one Kubernetes master fully working? Even if the deployment is configured to be highly available, it is useful to know if kube-controller-manager is down on one of the machines.
For a deeper understanding of cluster health information, consider reading https://v1-18.docs.kubernetes.io/docs/tasks/debug-application-cluster/debug-cluster/
All components in a Kubernetes cluster expose a /healthz endpoint. The expected (healthy) HTTP response status code is 200.
The minimal services for the master to work properly are:
kube-apiserver:
The component that receives your requests from kubectl and from the rest of
the Kubernetes components. The URL is https://<CONTROL_PLANE_IP/FQDN>:6443/healthz
Local Check
curl -k -i https://localhost:6443/healthz

Remote Check
curl -k -i https://<CONTROL_PLANE_IP/FQDN>:6443/healthz

kube-controller-manager:
The component that contains the control loop, driving current state to the desired state. The URL is http://<CONTROL_PLANE_IP/FQDN>:10252/healthz
Local Check
curl -i http://localhost:10252/healthz

Remote Check
Make sure firewall allows port 10252.
curl -i http://<CONTROL_PLANE_IP/FQDN>:10252/healthz

kube-scheduler:
The component that schedules workloads to nodes. The URL is http://<CONTROL_PLANE_IP/FQDN>:10251/healthz
Local Check
curl -i http://localhost:10251/healthz

Remote Check
Make sure firewall allows port 10251.
curl -i http://<CONTROL_PLANE_IP/FQDN>:10251/healthz

In an HA environment you can monitor kube-apiserver on
https://<LOAD_BALANCER_IP/FQDN>:6443/healthz.
If any one of the master nodes is running correctly, you will receive a valid response.
This does, however, not mean that all master nodes necessarily work correctly. To ensure that all master nodes work properly, the health checks must be repeated individually for each deployed master node.
This endpoint will return a successful HTTP response if the cluster is
operational; otherwise it will fail.
It will for example check that it can access etcd.
This should not be used to infer that the overall cluster health is ideal.
It will return a successful response even when only minimal operational
cluster health exists.
To probe for full cluster health, you must perform individual health checking for all machines.
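A minimal sketch of such an individual check, iterating over the master nodes (the IP addresses below are placeholders):

for MASTER in 10.86.4.158 10.86.4.159; do
  echo "Checking ${MASTER}"
  curl -k -i "https://${MASTER}:6443/healthz"
done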
The etcd cluster exposes an endpoint /health. The expected (healthy)
HTTP response body is {"health":"true"}. The etcd cluster is accessed through
HTTPS only, so be sure to have etcd certificates.
Local Check
curl --cacert /etc/kubernetes/pki/etcd/ca.crt \
  --cert /etc/kubernetes/pki/etcd/healthcheck-client.crt \
  --key /etc/kubernetes/pki/etcd/healthcheck-client.key \
  https://localhost:2379/health

Remote Check
Make sure firewall allows port 2379.
curl --cacert <ETCD_ROOT_CA_CERT> --cert <ETCD_CLIENT_CERT> \
  --key <ETCD_CLIENT_KEY> https://<CONTROL_PLANE_IP/FQDN>:2379/health

This basic node health check consists of two parts. It checks:
The kubelet endpoint
CNI (Container Networking Interface) pod state
First, determine if kubelet is up and working on the node.
Kubelet has two ports exposed on all machines:
Port https/10250: exposes kubelet services to the entire cluster and is available from all nodes through authentication.
Port http/10248: is only available on localhost.
You can send an HTTP request to the endpoint to find out if
kubelet is healthy on that machine. The expected (healthy) HTTP response
status code is 200.
If there is an agent running on each node, this agent can simply fetch the local healthz port:
curl -i http://localhost:10248/healthz

There are two ways to fetch endpoints remotely (metrics, healthz, etc.). Both methods use HTTPS and a token.
The first method is executed against the API server and is mostly used with Prometheus and its Kubernetes service discovery (kubernetes_sd_config). It allows automatic discovery of the nodes and avoids having to define monitoring for each node individually. For more information, see the Prometheus documentation:
https://prometheus.io/docs/prometheus/latest/configuration/configuration/#kubernetes_sd_config
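As an illustration of this first method, a scrape configuration based on the standard Prometheus node-discovery example is sketched below; the SUSE Prometheus helm chart already ships a similar kubernetes-nodes job, so this is for reference only:

- job_name: 'kubernetes-nodes'
  scheme: https
  tls_config:
    ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
  bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
  kubernetes_sd_configs:
  - role: node
  relabel_configs:
  # Keep the node labels as Prometheus labels
  - action: labelmap
    regex: __meta_kubernetes_node_label_(.+)
  # Scrape through the API server proxy instead of the node directly
  - target_label: __address__
    replacement: kubernetes.default.svc:443
  - source_labels: [__meta_kubernetes_node_name]
    regex: (.+)
    target_label: __metrics_path__
    replacement: /api/v1/nodes/$1/proxy/metrics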
The second method directly talks to kubelet and can be used in more traditional monitoring where one must configure each node to be checked.
Configuration and Token retrieval:
Create a Service Account (monitoring) with an associated secondary Token
(monitoring-secret-token). The token will be used in HTTP requests to authenticate
against the API server.
This Service Account can only fetch information about nodes and pods.
Best practice is not to use the token that is created by default; using a secondary token is also easier to manage. Create a file kubelet.yaml with the following content.
---
apiVersion: v1
kind: ServiceAccount
metadata:
name: monitoring
namespace: kube-system
secrets:
- name: monitoring-secret-token
---
apiVersion: v1
kind: Secret
metadata:
name: monitoring-secret-token
namespace: kube-system
annotations:
kubernetes.io/service-account.name: monitoring
type: kubernetes.io/service-account-token
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
name: monitoring-clusterrole
namespace: kube-system
rules:
- apiGroups: [""]
resources:
- nodes/metrics
- nodes/proxy
- pods
verbs: ["get", "list"]
- nonResourceURLs: ["/metrics", "/healthz", "/healthz/*"]
verbs: ["get"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
name: monitoring-clusterrole-binding
namespace: kube-system
roleRef:
kind: ClusterRole
name: monitoring-clusterrole
apiGroup: rbac.authorization.k8s.io
subjects:
- kind: ServiceAccount
name: monitoring
namespace: kube-system

Apply the yaml file:
kubectl apply -f kubelet.yaml

Export the token to an environment variable:

TOKEN=$(kubectl -n kube-system get secrets monitoring-secret-token \
  -o jsonpath='{.data.token}' | base64 -d)

This token can now be passed through the --header argument as: "Authorization: Bearer $TOKEN".
Now export important values as environment variables:
Environment Variables Setup
Choose a Kubernetes master node or worker node. The NODE_IP_FQDN here must
be a node’s IP address or FQDN. The NODE_NAME here must be a node name in
your Kubernetes cluster. Export the variables NODE_IP_FQDN and NODE_NAME
so they can be reused.
NODE_IP_FQDN="10.86.4.158"
NODE_NAME=worker0

Retrieve the TOKEN with kubectl.
TOKEN=$(kubectl -n kube-system get secrets monitoring-secret-token \
  -o jsonpath='{.data.token}' | base64 -d)

Get the control plane <IP/FQDN> from the configuration file. You can skip this step if you only want to use the kubelet endpoint.

CONTROL_PLANE=$(kubectl config view | grep server | cut -f 2- -d ":" | tr -d " ")

Now the key information to retrieve data from the endpoints should be available in the environment and you can poll the endpoints.
Fetching Information from kubelet Endpoint
Make sure firewall allows port 10250.
Fetching metrics
curl -k https://$NODE_IP_FQDN:10250/metrics --header "Authorization: Bearer $TOKEN"

Fetching healthz
curl -k https://$NODE_IP_FQDN:10250/healthz --header "Authorization: Bearer $TOKEN"

Fetching Information from APISERVER Endpoint
Fetching metrics
curl -k $CONTROL_PLANE/api/v1/nodes/$NODE_NAME/proxy/metrics \
  --header "Authorization: Bearer $TOKEN"

Fetching healthz

curl -k $CONTROL_PLANE/api/v1/nodes/$NODE_NAME/proxy/healthz \
  --header "Authorization: Bearer $TOKEN"

You can check whether the CNI (Container Networking Interface) is working as expected by checking if the coredns service is running. If CNI has some kind of trouble, coredns will not be able to start:
kubectl get deployments -n kube-system
NAME READY UP-TO-DATE AVAILABLE AGE
cilium-operator 1/1 1 1 8d
coredns 2/2 2 2 8d
oidc-dex 1/1 1 1 8d
oidc-gangway      1/1     1            1           8d

If coredns is running and you are able to create pods then you can be certain
that CNI and your CNI plugin are working correctly.
There is also the Monitor Node Health check. This is a DaemonSet that runs on every node and reports back to the apiserver as NodeConditions and Events.
If the deployed services contain a health endpoint, or if they contain an endpoint
that can be used to determine if the service is up, you can use livenessProbes
and/or readinessProbes.
A proper health check is always preferred if designed correctly.
Despite the fact that any endpoint could potentially be used to infer if your application is up, it is better to have an endpoint specifically for health in your application. Such an endpoint will only respond affirmatively when all your setup code on the server has finished and the application is running in a desired state.
The livenessProbes and readinessProbes share configuration options and probe types.
initialDelaySeconds: Number of seconds to wait before performing the very first liveness probe.
periodSeconds: Number of seconds that the kubelet should wait between liveness probes.
successThreshold: Minimum number of consecutive successes for the probe to be considered successful (Default: 1).
failureThreshold: Number of times this probe is allowed to fail in order to assume that the service is not responding (Default: 3).
timeoutSeconds: Number of seconds after which the probe times out (Default: 1).
There are different options for the livenessProbes to check:
exec: A command executed within the container; a return code of 0 means success. All other return codes mean failure.
tcpSocket: If a TCP connection can be established, the probe is considered successful.
httpGet: Any HTTP response status code between 200 and 400 indicates success.
livenessProbes are used to detect running but misbehaving pods: a service whose process did not die but that is not responding as expected. You can find out more about livenessProbes here: https://v1-18.docs.kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-probes/
Probes are executed by each kubelet against the pods that define them and that
are running in that specific node. When a livenessProbe fails, Kubernetes will automatically
restart the pod and increase the RESTARTS count for that pod. These probes will be
executed every periodSeconds starting from initialDelaySeconds.
readinessProbes are used to wait for processes that take some time to start. Find out more about readinessProbes here: https://v1-18.docs.kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-probes/#define-readiness-probes. Even though the container is running, it might be performing time-consuming initialization operations. During this time, you do not want Kubernetes to route traffic to that specific pod, and you also do not want the container to be restarted because it appears unresponsive.
These probes will be executed every periodSeconds starting from initialDelaySeconds
until the service is ready.
Both probe types can be used at the same time. If a service is running, but misbehaving,
the livenessProbe will ensure that it’s restarted, and the readinessProbe
will ensure that Kubernetes won’t route traffic to that specific pod until it’s considered
to be fully functional and running again.
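A minimal sketch of a container spec that uses both probe types; the image, path, port, and timings are placeholders:

containers:
- name: my-app                                # placeholder
  image: registry.example.com/my-app:1.0      # placeholder
  readinessProbe:
    httpGet:
      path: /healthz
      port: 8080
    initialDelaySeconds: 10
    periodSeconds: 5
  livenessProbe:
    httpGet:
      path: /healthz
      port: 8080
    initialDelaySeconds: 30
    periodSeconds: 10
    failureThreshold: 3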
We recommend to apply other best practices from system administration to your monitoring and health checking approach. These steps are not specific to SUSE CaaS Platform and are beyond the scope of this document.
Horizontal Pod Autoscaler (HPA) is a tool that automatically increases or decreases the number of pods in a replication controller, deployment, replica set or stateful set, based on metrics collected from pods.
In order to leverage HPA, skuba now supports the metrics-server addon.
The metrics-server addon is first installed into the Kubernetes cluster. After that, HPA fetches metrics from the aggregated API metrics.k8s.io and according to the user configuration determines whether to increase or decrease the scale of a replication controller, deployment, replica set or stateful set.
The HPA metrics.target.type can be one of the following:
Utilization: the value returned from the metrics server API is calculated as the average resource utilization across all relevant pods and subsequently compared with the metrics.target.averageUtilization.
AverageValue: the value returned from the metrics server API is divided by the number of all relevant pods, then compared to the metrics.target.averageValue.
Value: the value returned from the metrics server API is directly compared to the metrics.target.value.
The metrics supported by metrics-server are the CPU and memory of a pod or node.
API versions supported by the HPA:
CPU metric: autoscaling/v1,autoscaling/v2beta2
Memory metric: autoscaling/v2beta2.
It is useful to first find out about the available resources of your cluster.
To display resource (CPU/Memory) usage for nodes, run:
$ kubectl top node

The expected output should look like the following:
NAME CPU(cores) CPU% MEMORY(bytes) MEMORY%
master000 207m 10% 1756Mi 45%
worker000    100m         10%    602Mi           31%

To display resource (CPU/Memory) usage for pods, run:
$ kubectl top pod

The expected output should look like the following:
NAME CPU(cores) MEMORY(bytes)
cilium-9fjw2 32m 216Mi
cilium-cqnq5 43m 227Mi
cilium-operator-7d6ddddbf5-2jwgr 1m 46Mi
coredns-69c4947958-2br4b 2m 11Mi
coredns-69c4947958-kb6dq 3m 11Mi
etcd-master000 21m 584Mi
kube-apiserver-master000 20m 325Mi
kube-controller-manager-master000 6m 105Mi
kube-proxy-x2965 0m 24Mi
kube-proxy-x9zlv 0m 19Mi
kube-scheduler-master000 2m 46Mi
kured-45rc2 1m 25Mi
kured-cptk4 0m 25Mi
metrics-server-79b8658cd7-gjvhs 1m 21Mi
oidc-dex-55fc689dc-f6cfg 1m 20Mi
oidc-gangway-7b7fbbdbdf-85p6t       1m           18Mi

The option flag --sort-by=cpu/--sort-by=memory has a sorting issue at the moment. It will be fixed in the future.
You can set the HPA to scale according to various metrics. These include average CPU utilization, average CPU value, average memory utilization and average memory value. The following sections show the recommended configuration for each of the aforementioned options.
The following code is an example of what this type of HPA can look like. You will have to run the commands on your admin node or your local machine. Note that you need a kubeconfig file with RBAC permissions that allow setting up autoscaling rules in your Kubernetes cluster.
# deployment
kubectl autoscale deployment <DEPLOYMENT_NAME> \
--min=<MIN_REPLICAS_NUMBER> \
--max=<MAX_REPLICAS_NUMBER> \
--cpu-percent=<PERCENT>
# replication controller
kubectl autoscale replicationcontrollers <REPLICATIONCONTROLLERS_NAME> \
--min=<MIN_REPLICAS_NUMBER> \
--max=<MAX_REPLICAS_NUMBER> \
--cpu-percent=<PERCENT>

You could for example use the following values:
kubectl autoscale deployment oidc-dex \
--name=avg-cpu-util \
--min=1 \
--max=10 \
--cpu-percent=50

The example output below shows autoscaling at work for the oidc-dex deployment: the HPA maintains at least 1 pod and scales up to 10 pods if the average CPU utilization of the pods reaches 50%. For more details about the inner workings of the scaling, refer to The Kubernetes documentation on the horizontal pod autoscale algorithm.
To check the current status of the HPA run:
kubectl get hpa
Example output:
NAME       REFERENCE             TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
oidc-dex   Deployment/oidc-dex   0%/50%    1         10        3          115s
To calculate pod CPU utilization, the HPA divides the total CPU usage of all containers by the total of the CPU requests:
POD CPU UTILIZATION = TOTAL CPU USAGE OF ALL CONTAINERS / TOTAL CPU REQUESTS
For example:
Container1 requests 0.5 CPU and uses 0 CPU.
Container2 requests 1 CPU and uses 2 CPU.
The CPU utilization will be (0 + 2) / (0.5 + 1) * 100 % ≈ 133 %.
If a replication controller, deployment, replica set or stateful set does not specify the CPU request, the TARGETS column of kubectl get hpa will show <unknown>.
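To avoid this, make sure every container of the scaled workload declares a CPU request, for example:

resources:
  requests:
    cpu: 500m    # example value; adjust to your workload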
Create a yaml manifest file hpa-avg-cpu-value.yaml with the following content:
apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: avg-cpu-value 1
  namespace: kube-system 2
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment 3
    name: example 4
  minReplicas: 1 5
  maxReplicas: 10 6
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: AverageValue
        averageValue: 500Mi 7
1  Name of the HPA.
2  Namespace of the HPA.
3  Specifies the kind of object to scale (a replication controller, deployment, replica set or stateful set).
4  Specifies the name of the object to scale.
5  Specifies the minimum number of replicas.
6  Specifies the maximum number of replicas.
7  The average value of the requested CPU that each pod uses.
Apply the yaml manifest by running:
kubectl apply -f hpa-avg-cpu-value.yaml
Check the current status of the HPA:
kubectl get hpa

NAME            REFERENCE               TARGETS    MINPODS   MAXPODS   REPLICAS   AGE
avg-cpu-value   Deployment/php-apache   1m/500Mi   1         10        1          39s
Create a yaml manifest file hpa-avg-memory-util.yaml with the following content:
apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: avg-memory-util 1
  namespace: kube-system 2
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment 3
    name: example 4
  minReplicas: 1 5
  maxReplicas: 10 6
  metrics:
  - type: Resource
    resource:
      name: memory
      target:
        type: Utilization
        averageUtilization: 50 7
1  Name of the HPA.
2  Namespace of the HPA.
3  Specifies the kind of object to scale (a replication controller, deployment, replica set or stateful set).
4  Specifies the name of the object to scale.
5  Specifies the minimum number of replicas.
6  Specifies the maximum number of replicas.
7  The average utilization of the requested memory that each pod uses.
Apply the yaml manifest by running:
kubectl apply -f hpa-avg-memory-util.yaml
Check the current status of the HPA:
kubectl get hpa

NAME              REFERENCE            TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
avg-memory-util   Deployment/example   5%/50%    1         10        1          4m54s
HPA calculates pod memory utilization as: total memory usage of all containers / total memory requests.
If a deployment or replication controller does not specify the memory request, the output of kubectl get hpa TARGETS is <unknown>.
Create a yaml manifest file hpa-avg-memory-value.yaml with the following content:
apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: avg-memory-value 1
  namespace: kube-system 2
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment 3
    name: example 4
  minReplicas: 1 5
  maxReplicas: 10 6
  metrics:
  - type: Resource
    resource:
      name: memory
      target:
        type: AverageValue
        averageValue: 500Mi 7
1  Name of the HPA.
2  Namespace of the HPA.
3  Specifies the kind of object to scale (a replication controller, deployment, replica set or stateful set).
4  Specifies the name of the object to scale.
5  Specifies the minimum number of replicas.
6  Specifies the maximum number of replicas.
7  The average value of the requested memory that each pod uses.
Apply the yaml manifest by running:
kubectl apply -f hpa-avg-memory-value.yaml
Check the current status of the HPA:
kubectl get hpa

NAME               REFERENCE            TARGETS          MINPODS   MAXPODS   REPLICAS   AGE
avg-memory-value   Deployment/example   11603968/500Mi   1         10        1          6m24s
This feature is offered as a "tech preview".
We release this as a tech-preview in order to get early feedback from our customers. Tech previews are largely untested, unsupported, and thus not ready for production use.
That said, we strongly believe this technology is useful at this stage in order to make the right improvements based on your feedback. A fully supported, production-ready release is planned for a later point in time.
If you plan to deploy SUSE Cloud Application Platform on your SUSE CaaS Platform cluster please skip this section of the documentation and refer to the official SUSE Cloud Application Platform instructions. This will include Stratos.
https://documentation.suse.com/suse-cap/1.5.2/single-html/cap-guides/#cha-cap-depl-caasp
The Stratos user interface (UI) is a modern web-based management application for Kubernetes and for Cloud Foundry distributions based on Kubernetes like SUSE Cloud Application Platform.
Stratos provides a graphical management console for both developers and system administrators.
A single Stratos instance can be used to monitor multiple Kubernetes clusters as long as it is granted access to their Kubernetes API endpoint.
This document aims to describe how to install Stratos in a SUSE CaaS Platform cluster that doesn’t plan to run any SUSE Cloud Application Platform components.
The Stratos stack is deployed using helm charts and consists of the web UI pod and a MariaDB pod that is used to store configuration values.
The deployment of Stratos is performed using a helm chart. Your remote administration machine must have Helm installed.
The MariaDB instance used by Stratos requires a persistent storage to store its data.
The cluster must have a Kubernetes Storage Class defined.
Add SUSE helm charts repository
helm repo add suse https://kubernetes-charts.suse.com

Obtain the default values.yaml file of the helm chart
helm inspect values suse/console > stratos-values.yaml

Create the stratos namespace
kubectl create namespace stratos

Admin user password

Create a secure password for your admin user and write it into the stratos-values.yaml file as the value of the console.localAdminPassword key.
This step is required to allow the installation of Stratos without having any SUSE Cloud Application Platform components deployed on the cluster.
If your cluster does not have a default storage class configured, or you want to use a different one, follow these instructions.
Open the stratos-values.yaml file and look for the storageClass entry defined at the global level. Uncomment the line and provide the name of your Storage Class.
The values file will contain something like this:
# Specify which storage class should be used for PVCs
storageClass: default

The file has other storageClass keys defined inside of some of
its resources. These can be left empty to rely on the global Storage Class that
has just been defined.
The web interface of Stratos can be exposed either via an Ingress resource or by using a Service of type LoadBalancer, or even both at the same time.
An Ingress controller must be deployed on the cluster to be able to expose the service using an Ingress resource.
The cluster must be deployed on a platform that can handle LoadBalancer
objects and must have the Cloud Provider Integration (CPI) enabled. This
can be achieved, for example, when deploying SUSE CaaS Platform on top of OpenStack.
The behavior is defined inside of the console.service stanza of the yaml file:
console:
service:
annotations: []
externalIPs: []
loadBalancerIP:
loadBalancerSourceRanges: []
servicePort: 443
# nodePort: 30000
type: ClusterIP
externalName:
ingress:
## If true, Ingress will be created
enabled: false
## Additional annotations
annotations: {}
## Additional labels
extraLabels: {}
## Host for the ingress
# Defaults to console.[env.Domain] if env.Domain is set and host is not
host:
# Name of secret containing TLS certificate
secretName:
# crt and key for TLS Certificate (this chart will create the secret based on these)
tls:
crt:
key:

The service can be exposed as a LoadBalancer by setting the value of
console.service.type to be LoadBalancer.
The LoadBalancer resource can be tuned by changing the values of the other
loadBalancer* params specified inside of the console.service stanza.
The Ingress resource can be created by setting
console.service.ingress.enabled to be true.
Stratos is exposed by the Ingress using a dedicated host rule. Hence
you must specify the FQDN of the host as a value of the
console.service.ingress.host key.
The behavior of the Ingress object can be fine tuned by using the
other keys inside of the console.service.ingress stanza.
It’s highly recommended to secure Stratos' web interface using TLS encryption.
This can be done by creating a TLS certificate for Stratos.
It’s highly recommended to secure the web interface of Stratos by using TLS encryption. This can be easily done when exposing the web interface using an Ingress resource.
Inside of the console.service.ingress stanza ensure the Ingress resource is
enabled and then specify values for console.service.ingress.tls.crt and
console.service.ingress.tls.key. These keys hold the base64 encoded TLS
certificate and key.
The TLS certificate and key can be base64 encoded by using the following command:
base64 tls.crt
base64 tls.key

The output produced by the two commands has to be copied into the stratos-values.yaml file, resulting in something like this:
console:
  service:
    ingress:
      enabled: true
      tls:
        crt: |
          <output of base64 tls.crt>
        key: |
          <output of base64 tls.key>

The helm chart provisions the MariaDB database with a default weak password.
A stronger password can be specified by altering the value of mariadb.mariadbPassword.
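For example, in stratos-values.yaml (the password value is a placeholder):

mariadb:
  mariadbPassword: <STRONG_PASSWORD>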
You can enable tech preview features of Stratos by changing the value of
console.techPreview from false to true.
Now Stratos can be deployed using helm and the values specified inside of the
stratos-values.yaml file:
helm install stratos-console suse/console \
--namespace stratos \
--values stratos-values.yaml

You can monitor the status of your Stratos deployment with the watch command:
watch --color 'kubectl get pods --namespace stratos'

When Stratos is successfully deployed, the following is observed:
For the volume-migration pod, the STATUS is Completed and the READY column is at 0/1.
All other pods have a Running STATUS and a READY value of n/n.
Press Ctrl+C to exit the watch command.
At this stage, the Stratos web UI should be accessible. You can log in using the admin user and the password you specified in your stratos-values.yaml file.
Now that Stratos is up and running you can log into it and configure it to connect to your Kubernetes cluster(s).
Please refer to the SUSE Cloud Application Platform documentation for more information.